Wolfmans Howlings

A programmers Blog about Ruby, Rails and a few other issues

a Capistrano scm module for local SVN access

Posted by Jim Morris Thu, 07 Dec 2006 05:42:00 GMT

UPDATE 2007-06-09 This method has been deprecated in Cap 2.0.

UPDATE 2007-02-21 I have updated the files to correctly update revisions.log

UPDATE I added rsync as suggested; see the end of the article for more info.

I previously published a custom deploy recipe that allowed subversion to checkout from a local repository and send to the remote server.

This was a suboptimal solution, as it wouldn't work with things like deprec.

So here is the way I should have done it in the first place. It is a modified version of the standard subversion scm module. It should be backward compatible with the built-in version, but adds a couple of features.

  • Handles the subversion repository only being accessible from the local machine
  • If the subversion repository is accessible from the remote server, allows different URLs for access from the local and remote machines.

If you use a standard deploy.rb and your subversion is accessible from the server, then it should work exactly like the built-in version (there is no need to use it in that case, but it should work).

However, if you have your subversion server behind your local firewall, you just add these three lines to your deploy.rb

require 'lib/tasks/local_subversion_rsync.rb'
set :scm, Capistrano::SCM::LocalSubversionRsync
set :repository_is_not_reachable_from_remote, true

And everything will work as before, even though the server has no idea what subversion is.

Of course, for this to work you need to download this, de-tar it, and put it in your lib/tasks directory (that's where the require line gets this extension from). There is also a unit test for the new_subversion module, extended from the one shipped with Capistrano, which passes.

Everything else in deploy.rb should be the same as before, except that you set the repository to the URL your local machine uses to access your locally accessible subversion server or repository (it should even work with a file://... URL). For instance...

set :repository, "svn://your.svnserver.host/#{application}/trunk"

The assumption is that both the local and remote machines can create files and directories in /tmp. If this is not true, then one or both of these should be set...

set :tmpdir_local, "/usr/tmp"
set :tmpdir_remote, "/home/user/tmp"

The way this module works is to export the relevant version of the project from Subversion into a temporary directory on the local machine (default is /tmp/unique_name). It then creates a gzip'd tar archive of that export in the local temp directory. The tar file is transferred to the target servers with the Capistrano put command, into a temporary directory (/tmp by default) on the server. The module then creates the target directory on the server, un-tars the file into that directory, and cleans up the various temporary files. From the perspective of the server, the end result should be identical to doing an svn export into the target directory on the server.
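The sequence above can be sketched as a list of shell commands. This is only an illustration, not the code in the downloadable module: the helper name local_export_plan, its keyword arguments, the archive name, and the example release path are all assumptions made for the sketch, and paths are not shell-quoted.

```ruby
# Hypothetical helper (not part of the module): lists the commands that the
# export / tar / upload / untar / cleanup sequence roughly boils down to.
def local_export_plan(repository:, revision:, release_path:,
                      tmpdir_local: "/tmp", tmpdir_remote: "/tmp",
                      name: "deploy_export")
  export_dir = File.join(tmpdir_local, name)
  local_tar  = "#{export_dir}.tar.gz"
  remote_tar = File.join(tmpdir_remote, "#{name}.tar.gz")

  {
    # run on the workstation: export from svn, then archive the export
    local: [
      "svn export -q -r#{revision} #{repository} #{export_dir}",
      "tar -C #{export_dir} -czf #{local_tar} ."
    ],
    # [source, destination] pair for the file transfer
    upload: [local_tar, remote_tar],
    # run on the server: create the target, unpack, remove the archive
    remote: [
      "mkdir -p #{release_path}",
      "tar -C #{release_path} -xzf #{remote_tar}",
      "rm -f #{remote_tar}"
    ],
    # run on the workstation afterwards
    cleanup: ["rm -rf #{export_dir} #{local_tar}"]
  }
end

plan = local_export_plan(repository: "svn://your.svnserver.host/myapp/trunk",
                         revision: 1234,
                         release_path: "/var/www/myapp/releases/20061207054200")
plan[:local].each   { |cmd| puts "local:  #{cmd}" }
puts "upload: #{plan[:upload].join(' -> ')}"
plan[:remote].each  { |cmd| puts "remote: #{cmd}" }
plan[:cleanup].each { |cmd| puts "local:  #{cmd}" }
```

Returning the commands as data rather than running them keeps the sketch safe to execute anywhere; the real module does the work through Capistrano.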

The other facility this SCM module provides is for when your subversion server is accessible from the remote server, which is the "normal" use case (per the Capistrano developers, but bad practice IMHO), but you need a different URL to access the SVN repository from the local machine than from the remote server, e.g.

server accesses with this URL: svn://localhost/myapp/trunk
local machine accesses with this URL: svn+ssh://myserver.com/myapp/trunk

in this case you add these lines...

require 'lib/tasks/local_subversion_rsync.rb'
set :scm, Capistrano::SCM::LocalSubversionRsync

set :local_repository_path, "svn+ssh://myserver.com/myapp/trunk"
set :repository, "svn://localhost/myapp/trunk"

In addition, if the svn binary lives in a different place on the server and on the local machine, you can add these, for instance...

set :remote_svn, "/usr/local/bin/svn"
set :local_svn, "/usr/bin/svn"

If the relevant configuration variable is not set, the standard svn configuration variable is used, falling back to just "svn" if that is not set either.
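That fallback order can be sketched in a few lines. The function name svn_binary is hypothetical, a plain Hash stands in for the Capistrano configuration, and the sketch assumes the standard variable is named :svn.

```ruby
# Hypothetical lookup mirroring the fallback described above:
# side-specific setting first, then the generic :svn setting, then "svn".
def svn_binary(config, side)
  key = side == :local ? :local_svn : :remote_svn
  config[key] || config[:svn] || "svn"
end

puts svn_binary({ local_svn: "/usr/bin/svn" }, :local)
puts svn_binary({ svn: "/opt/bin/svn" }, :remote)
puts svn_binary({}, :remote)
```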

Please beware: I have not fully tested the use cases other than the first one, where I use my local subversion server to deploy to remote servers that know nothing about subversion. However, the unit tests do pass for the other use cases.

UPDATE 2007-02-15 check this article for using the rsync option.


Allow a different local and remote subversion repository path for Capistrano

Posted by Jim Morris Wed, 15 Nov 2006 22:04:28 GMT

One of the things that bugs me about Capistrano is the requirement that access to the remote subversion repository have the same path from the local machine and the remote machine. This is never the case in my experience.

I have been getting around it by setting the path to svn://localhost/... and running ssh locally to port forward the SVN port to the remote host. This sucks as I usually forget to run ssh in another window first.

So I added an optional configuration variable, local_repository_path, to be set in deploy.rb. You set the repository path as normal, which is the path the remote server uses to access the SVN repository, and you set local_repository_path to the path that the local machine (your workstation) uses to access the same repository...

set :repository, "svn://localhost/#{application}/trunk"
set :local_repository_path, "svn+ssh://myremotehost.com/path/to/repository/#{application}/trunk"

Ideally one would patch the Capistrano distribution to achieve this, but thanks to the magic of Ruby you can patch it from your own setup, so I put the following into lib/tasks/patch_capistrano.rb


# Patch the svn scm to allow a local svn repository path as well as
# the remote one as most systems I use have a different path depending
# on whether you access svn from the local machine or remote machine

  module Capistrano

    # override the two scm methods that access svn from the local machine
    module SCM

      class Subversion
        # Return an integer identifying the last known revision in the svn
        # repository. (This integer is currently the revision number.)
        def latest_revision
          @latest_revision ||= begin
            configuration.logger.debug "querying latest revision..."
            match = svn_log(configuration[:local_repository_path]).scan(/r(\d+)/).first or
            raise "Could not determine latest revision"
            match.first
          end
        end

        # Return a string containing the diff between the two revisions. +from+
        # and +to+ may be in any format that svn recognizes as a valid revision
        # identifier. If +from+ is +nil+, it defaults to the last deployed
        # revision. If +to+ is +nil+, it defaults to HEAD.
        def diff(actor, from=nil, to=nil)
          from ||= current_revision(actor)
          to ||= "HEAD"
          `svn diff #{configuration[:local_repository_path]}@#{from} #{configuration[:local_repository_path]}@#{to}`
        end

      end
    end

  end

Then simply add this line to your config/deploy.rb...

require 'lib/tasks/patch_capistrano'

and everything works fine now.

Note there are only two scm methods that access SVN from the local machine, latest_revision and diff. I have patched both.
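To see what the patched latest_revision extracts, here is the same regular expression applied to a canned svn log entry. The log text is made up for illustration, and extract_revision is a hypothetical standalone name for the logic.

```ruby
# Sample `svn log` output, invented for this sketch.
SAMPLE_LOG = <<~LOG
  ------------------------------------------------------------------------
  r1234 | jim | 2006-11-15 14:04:28 -0800 (Wed, 15 Nov 2006) | 1 line

  Fix deploy recipe
  ------------------------------------------------------------------------
LOG

# Same extraction as in latest_revision: take the first "r<digits>" token.
def extract_revision(log_text)
  match = log_text.scan(/r(\d+)/).first or
    raise "Could not determine latest revision"
  match.first
end

puts extract_revision(SAMPLE_LOG)  # prints 1234
```

Note that the result is the revision as a String of digits ("1234"), so call to_i on it if you need to compare revisions numerically.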

I'll tidy this up, make the default for local_repository_path be repository, and submit the patch to the Capistrano folks, who will hopefully commit it to the source code, as it won't affect existing users, but will make the rest of us a little happier.
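The default mentioned above could look something like this. It is a sketch of the idea only, not the submitted patch; local_repository_for is a hypothetical name and a plain Hash stands in for the Capistrano configuration object.

```ruby
# Use :local_repository_path when it is set, otherwise fall back to
# :repository so existing deploy.rb files keep working unchanged.
def local_repository_for(configuration)
  configuration[:local_repository_path] || configuration[:repository]
end

with_override = {
  repository: "svn://localhost/myapp/trunk",
  local_repository_path: "svn+ssh://myremotehost.com/path/to/repository/myapp/trunk"
}
without_override = { repository: "svn://localhost/myapp/trunk" }

puts local_repository_for(with_override)
puts local_repository_for(without_override)
```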


Using Ruby SVN bindings to get file status

Posted by Jim Morris Mon, 04 Sep 2006 18:51:35 GMT

If you Google around for information or even some documentation on the Ruby SVN bindings, you will find plenty of comments that it simply is not documented. So when I wanted to add an SVN status check to a UI I was working on (a project browser window for Rails), I had to "Use the source, Luke". However, given that the bindings are mostly generated automatically by SWIG, and the actual details are hidden in a goo of SWIG-generated C code, even that was a challenge.

Eventually I realized that the API was almost identical to the C-level subversion API, not surprisingly, and for the most part it works the way you would expect. However, I could not find any examples of the client status call, so here it is for anyone else struggling with this issue.

require "svn/core"
require "svn/client"
require "svn/wc"
require "svn/repos"

# define Consts for all the numeric status values
NONE = 1
UNVERSIONED = 2
NORMAL = 3
ADDED = 4
MISSING = 5
DELETED = 6
REPLACED = 7
MODIFIED = 8
MERGED = 9
CONFLICTED = 10
IGNORED = 11
OBSTRUCTED = 12
EXTERNAL = 13
INCOMPLETE = 14
# I added these so I can amalgamate repo and local status
UPDATED = 15
MODIFIED_NEWER = 16
NEWFILE = 17

$modes= { # Map status to English
    NONE => "None",
    UNVERSIONED => "unversioned",
    NORMAL => "normal",
    ADDED => "file added",
    MISSING => "missing",
    DELETED => "file removed",
    REPLACED => "deleted and then re-added",
    MODIFIED => "modified",
    MERGED => "received repos mods",
    CONFLICTED => "file modified and in conflict",
    IGNORED => "ignored",
    OBSTRUCTED =>  "unversioned resource is in the way of the versioned resource",
    EXTERNAL => "unversioned path populated by an svn:external property",
    INCOMPLETE => "directory doesn't contain a complete entries list",
    UPDATED => "newer version in repository",
    MODIFIED_NEWER => "modified and newer version in repository",
    NEWFILE => "new file in repository",
    nil => "Unknown status"
}

# this amalgamates the most common status between local and repository
def compute_status(status)
  # modified in the repository and not normal locally
  return MODIFIED_NEWER if status.repos_text_status == MODIFIED && status.text_status != NORMAL
  # added in the repository and does not exist locally
  return NEWFILE if status.repos_text_status == ADDED && status.text_status == NONE
  # otherwise, if the local status is anything but normal, use it
  return status.text_status if status.text_status != NORMAL
  # normal locally but changed in the repository
  return UPDATED if status.repos_text_status == MODIFIED
  # otherwise it must be normal
  NORMAL
end

ctx = Svn::Client::Context.new()

rev = ctx.status("/home/user/project/myproject", "HEAD", true, true) do |path, status|
  astat= compute_status(status)
  puts "#{path}: #{status.text_status},#{status.repos_text_status} = #{$modes[astat]}"

  unless status.entry.nil?
    puts(".....name: #{status.entry.name}")
    puts(".....url: #{status.entry.url}")
    puts(".....repos: #{status.entry.repos }")
    puts(".....revision: #{status.entry.revision}")
    puts(".....kind: #{status.entry.kind}")
    puts(".....schedule: #{status.entry.schedule}")
    puts(".....deleted: #{status.entry.deleted}")
    puts(".....absent: #{status.entry.absent}")
    puts(".....incomplete: #{status.entry.incomplete}")
    puts(".....cmt_date: #{status.entry.cmt_date}")
    puts(".....cmt_rev: #{status.entry.cmt_rev}")
    puts(".....cmt_author: #{status.entry.cmt_author}")
    puts(".....prop_time: #{status.entry.prop_time}")
    puts(".....text_time: #{status.entry.text_time}")
  end
end

This is a sample that shows how to set up the client context, make the status call, and parse the results. I also added some sugar by defining the numeric status codes and adding a hash to translate them to English. I also compute an amalgamated status code from the local status and the status in the repository, to get the extra status codes I added above.

The status parameter yielded to the block by the status() call turns out to be a standard svn status structure containing various fields. The most interesting ones as far as status is concerned are text_status and repos_text_status, numeric fields that specify the SVN status of the file locally and in the repository. In the example above I created a bunch of constants to define each of the states, and a hash to turn them into English.

Note the compute_status() method, which looks at both the local status and the repository status to work out the state of any file relative to the repository.
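To make the decision order concrete, here is a small runnable sketch that feeds a few local/repository combinations through the same logic without needing the SVN bindings. FakeStatus and the StatusDemo module are hypothetical names for this sketch, and it assumes an item untouched in the repository reports a repos_text_status of NONE.

```ruby
module StatusDemo
  # Same numeric values as the constants defined in the sample above.
  NONE = 1
  NORMAL = 3
  ADDED = 4
  MODIFIED = 8
  UPDATED = 15
  MODIFIED_NEWER = 16
  NEWFILE = 17

  # Stand-in for the status object; only the two fields that
  # compute_status reads are modelled.
  FakeStatus = Struct.new(:text_status, :repos_text_status)

  # Same decision order as compute_status in the sample above.
  def self.compute_status(status)
    return MODIFIED_NEWER if status.repos_text_status == MODIFIED && status.text_status != NORMAL
    return NEWFILE if status.repos_text_status == ADDED && status.text_status == NONE
    return status.text_status if status.text_status != NORMAL
    return UPDATED if status.repos_text_status == MODIFIED
    NORMAL
  end
end

S = StatusDemo
[
  ["unchanged everywhere",       S::NORMAL,   S::NONE],     # => NORMAL (3)
  ["edited locally only",        S::MODIFIED, S::NONE],     # => MODIFIED (8)
  ["changed in repository only", S::NORMAL,   S::MODIFIED], # => UPDATED (15)
  ["changed on both sides",      S::MODIFIED, S::MODIFIED], # => MODIFIED_NEWER (16)
  ["new in repository",          S::NONE,     S::ADDED]     # => NEWFILE (17)
].each do |label, local, repos|
  puts "#{label}: #{S.compute_status(S::FakeStatus.new(local, repos))}"
end
```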

The other interesting field is the entry field, which contains a bunch of data as shown above, and in the structure definition below.

There are more parameters that can be passed to the status call, but the last two I show should be set to true, as that enables recursion into subdirectories and shows all files. By default recurse is true but get_all is false, which only reports "interesting" files (locally modified, out of date, or not versioned).

The developers have made some attempt to make this relatively easy to use: you can forget about the apr pools and stuff. The second parameter is usually a structure that tells it the version or revision to search for; this has been simplified so you can pass in a string describing the revision ("HEAD", "BASE", etc.) or a number denoting the revision number.

The status and entry structures have also had their types converted into convenient Ruby types, but for reference I show the C struct definitions from the various svn header files...

Here is the full status structure from svn_wc.h

typedef struct svn_wc_status2_t
{
  /** Can be @c NULL if not under version control. */
  svn_wc_entry_t *entry;

  /** The status of the entries text. */
  enum svn_wc_status_kind text_status;

  /** The status of the entries properties. */
  enum svn_wc_status_kind prop_status;

  /** a directory can be 'locked' if a working copy update was interrupted. */
  svn_boolean_t locked;

  /** a file or directory can be 'copied' if it's scheduled for
   * addition-with-history (or part of a subtree that is scheduled as such.).
   */
  svn_boolean_t copied;

  /** a file or directory can be 'switched' if the switch command has been
   * used.
   */
  svn_boolean_t switched;

  /** The entry's text status in the repository. */
  enum svn_wc_status_kind repos_text_status;

  /** The entry's property status in the repository. */
  enum svn_wc_status_kind repos_prop_status;

  /** The entry's lock in the repository, if any. */
  svn_lock_t *repos_lock;

} svn_wc_status2_t;

and the entry structure

typedef struct svn_wc_entry_t
{
  /* IMPORTANT: If you extend this structure, check svn_wc_entry_dup() to see
     if you need to extend that as well. */

  /* General Attributes */

  /** entry's name */
  const char *name;

  /** base revision */
  svn_revnum_t revision;

  /** url in repository */
  const char *url;

  /** canonical repository URL or NULL if not known */
  const char *repos;

  /** repository uuid */
  const char *uuid;

  /** node kind (file, dir, ...) */
  svn_node_kind_t kind;

  /* State information */

  /** scheduling (add, delete, replace ...) */
  svn_wc_schedule_t schedule;

  /** in a copied state */
  svn_boolean_t copied;
  /** deleted, but parent rev lags behind */
  svn_boolean_t deleted;

  /** absent -- we know an entry of this name exists, but that's all
      (usually this happens because of authz restrictions)  */
  svn_boolean_t absent;

  /** for THIS_DIR entry, implies whole entries file is incomplete */
  svn_boolean_t incomplete;

  /** copyfrom location */
  const char *copyfrom_url;

  /** copyfrom revision */
  svn_revnum_t copyfrom_rev;

  /** old version of conflicted file */
  const char *conflict_old;

  /** new version of conflicted file */
  const char *conflict_new;

  /** working version of conflicted file */
  const char *conflict_wrk;

  /** property reject file */
  const char *prejfile;

  /** last up-to-date time for text contents (0 means no information available)
   */
  apr_time_t text_time;

  /** last up-to-date time for properties (0 means no information available) */
  apr_time_t prop_time;

  /** base64-encoded checksum for the untranslated text base file,
   * can be @c NULL for backwards compatibility.
   */
  const char *checksum;

  /* "Entry props" */

  /** last revision this was changed */
  svn_revnum_t cmt_rev;

  /** last date this was changed */
  apr_time_t cmt_date;

  /** last commit author of this item */
  const char *cmt_author;

  /** lock token or NULL if path not locked in this WC
   * @since New in 1.2.
   */
  const char *lock_token;
  /** lock owner, or NULL if not locked in this WC
   * @since New in 1.2.
   */
  const char *lock_owner;
  /** lock comment or NULL if not locked in this WC or no comment
   * @since New in 1.2.
   */
  const char *lock_comment;
  /** Lock creation date or 0 if not locked in this WC
   * @since New in 1.2.
   */
  apr_time_t lock_creation_date;

  /* IMPORTANT: If you extend this structure, check svn_wc_entry_dup() to see
     if you need to extend that as well. */
} svn_wc_entry_t;

For the most part you can simply reference the fields in the structure and get a relatively understandable result.

The status call is defined in Ruby like this

def status(path, rev=nil, recurse=true, get_all=false,
           update=true, no_ignore=false,
           ignore_externals=false, &status_func)

The first parameter is the path on the local file system to get the status of. The second is the revision to get; something like "HEAD" or 2345 can be passed in. The next parameter, recurse, is whether to recurse into subdirectories. The next, get_all, tells it to return the status of all files if set to true; with the default of false it only returns "interesting" files, i.e. those with local mods and/or out of date or not versioned. If update is set to true, it will contact the repository to get more information with respect to the revision specified in the second parameter (which is ignored otherwise). I couldn't find any documentation in the C stuff about the no_ignore parameter, but ignore_externals tells it whether or not to return status on externals. See the header file svn_client.h and the call svn_client_status2 for more documentation.
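Since the positional booleans are easy to mix up, one option is a thin wrapper that names them. The sketch below is an illustration only: collect_status and FakeContext are hypothetical names, and the fake context exists purely so the example runs without the SVN bindings installed.

```ruby
# Hypothetical convenience wrapper: gives names to the first few positional
# arguments of the status call and collects the yielded results in a Hash.
def collect_status(ctx, path, rev: "HEAD", recurse: true, get_all: true, update: true)
  results = {}
  ctx.status(path, rev, recurse, get_all, update) do |item_path, status|
    results[item_path] = status
  end
  results
end

# Minimal stand-in with the same leading parameters as the signature above.
# It records what it was called with and yields one made-up entry.
class FakeContext
  attr_reader :last_args

  def status(path, rev = nil, recurse = true, get_all = false, update = true)
    @last_args = [path, rev, recurse, get_all, update]
    yield File.join(path, "README"), :fake_status
  end
end

ctx = FakeContext.new
p collect_status(ctx, "/home/user/project/myproject")
p ctx.last_args
```

With the real bindings you would presumably pass Svn::Client::Context.new in place of the fake context.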
