~ePirat

Migrating to Git

A while ago, all of the Icecast SVN stuff was migrated to Git.

Migrating this was quite a challenge, as the Icecast tree was rather complex: locations had changed several times over its history, and it had already been migrated from CVS to SVN once, which left even more messed-up things behind.

Another problem I faced was finding a good solution for the SVN externals we used, as there is no easy replacement for them in Git.

The Tools

The first thing to do was obviously finding the right tool for the task. There are quite a lot of tools out there that promise to do what you want, for example:

  • svn2git
  • git svn
  • svn2git

You might ask why I listed svn2git twice. The answer is simple: there are two different tools called svn2git. One is written in Ruby (and has some forks).

The other svn2git is somewhat harder to find, especially using Google. It is written in C++ using Qt and was created by members of the KDE Project to migrate their SVN repos to Git.

It turns out that the Ruby svn2git was of little use for us, as it could not do the complex migration steps we needed. The built-in git svn tool is very limited too, and really slow. So with two tools out, the last one remains: KDE’s svn2git.

Installing svn2git

As is somewhat common for open-source software, the tool may be very good but still has awful documentation, so installing it was a bit tricky. At least there are some hints on the repo page.

Dependencies

First, we need to make sure we have Qt. (doh…) It might seem a bit extreme to need Qt for such a tool, but we need it anyway. Another thing we need, relatively obviously, is libsvn-dev. After that, installing was quite easy. As mentioned on the repo page, it can be summed up as: check out the latest source and run qmake followed by make.
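On a Debian-like system the build might look roughly like the sketch below. The clone URL and the package names in the comment are assumptions on my side (the project has moved hosts over time and package names differ between distributions), and build_svn2git is just a made-up helper name. The binary that falls out of the build is called svn-all-fast-export.

```shell
#!/bin/sh
# Sketch: fetch and build KDE's svn2git.
# Assumed dependency packages on Debian/Ubuntu: libsvn-dev plus a Qt
# development package (for example qtbase5-dev), a compiler and make.

# Usage: build_svn2git [clone-url]
# Assumption: the sources can be cloned from the default URL below.
# The body runs in a subshell so the caller's working directory is untouched.
build_svn2git() (
    url=${1:-https://github.com/svn-all-fast-export/svn2git.git}
    git clone "$url" svn2git &&
    cd svn2git &&
    qmake &&
    make
)

# build_svn2git
```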

Doing the magic

First we need a full dump of the SVN repository we want to migrate, in order to make a local copy. This can be done with:

svnadmin dump /projects/cats/supercat > supercat.dump

(You might want to compress this, it can be very large)
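One way to compress on the fly is to pipe the dump through gzip instead of writing it out uncompressed first. This is only a minimal sketch; the function name dump_compressed and the output file name are made up for illustration.

```shell
#!/bin/sh
# Dump an SVN repository and gzip it on the fly.
# Usage: dump_compressed <repo-path> <output.gz>
dump_compressed() {
    # --quiet suppresses the per-revision progress output on stderr
    svnadmin dump --quiet "$1" | gzip > "$2"
}

# dump_compressed /projects/cats/supercat supercat.dump.gz
```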

Now we need to set up our local svn repository, so let’s do this:

svnadmin create /local/projects/supercat/

and now import the dump:

svnadmin load /local/projects/supercat < supercat.dump

(Note that you might need the --bypass-prop-validation option, I had some issues for some unknown reason)
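With that option added, loading might look like the sketch below. It also accepts a gzip-compressed dump, which is an assumption about how you stored the file; load_dump is a hypothetical helper name, and the target repository is expected to exist already.

```shell
#!/bin/sh
# Load a dump into an existing, empty SVN repository,
# skipping property validation.
# Usage: load_dump <repo-path> <dump-file | dump-file.gz>
load_dump() {
    case $2 in
        *.gz) gzip -dc "$2" | svnadmin load --bypass-prop-validation "$1" ;;
        *)    svnadmin load --bypass-prop-validation "$1" < "$2" ;;
    esac
}

# load_dump /local/projects/supercat supercat.dump
```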

Ok so now we have our local Repository. Nice. What’s next?

Fun!

Ok, sorry, just kidding. Now we need to write our rulesets. Don’t worry, it sounds more complex than it actually is. The svn2git tool requires a rules file that tells it what to do and how to handle commits.

A lot of examples can be found in the repository. Nevertheless I will provide some brief documentation here, as some things are not documented at all:

Writing the rules

A rules file should start with the definition of the repositories we are going to create. If the SVN repo contains just one project (lucky you if so), then you only need to specify this one repository. If there is more than one project inside your SVN repo, you need to specify every repo you want to have converted to Git. (Note: For some repositories it can make sense to have a separate rules file for each repository you want to convert from SVN to Git, but this should only be needed if you have to do a lot of weird things or have a very weird repo structure.)

Defining the repositories

The repos are defined as follows:

create repository supercat
end repository

There can obviously be more than one of these blocks, one for each repo.

Matching folders

Next we define the actual matching rules. svn2git works by applying the rules to every commit, finding the first rule that matches and executing what is given in the matching definition. It matches the svn path of the commit. Depending on your structure this can be very different. Let’s assume you have your main repo in /trunk in your svn repo, so you would use the following rule:

match /trunk/
    repository supercat
    branch master
end match

Important: The string used for match must always end with a /, else things will not work! Exceptions only for some regular expressions!

As you might have guessed, this matches /trunk/ and specifies that its contents belong to the master branch of your new supercat repository, defined earlier.

Note that the match string can be a regular expression, so for example to match all branches in the /branches/ folder in your SVN repo and to create them in the Git repo, you could use the following rule:

match /branches/([^/]+)/
    repository supercat
    branch \1
end match
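To get a feeling for what ends up in \1, the same kind of pattern can be tried out on a few sample paths with sed. Take this as an approximation only: svn2git uses Qt's regular expressions rather than sed's, and the sample paths and the branch_of helper are invented. For a simple group like [^/]+ (one or more characters that are not a slash) the result should be the same though.

```shell
#!/bin/sh
# Print the branch name a path under /branches/ would map to,
# or nothing at all if the path does not match.
branch_of() {
    printf '%s\n' "$1" | sed -n -E 's#^/branches/([^/]+)/.*#\1#p'
}

branch_of /branches/webm/src/main.c   # prints: webm
branch_of /trunk/src/main.c           # prints nothing
```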

Possible match block options

The possible options for the match block are:

  • repository: This specifies the name of the repository the matching commits should go to. (This can take backreferences to groups of the matching regex, such as \1, see below)
  • branch: Specifies the Git branch these commits should go to. (This can take backreferences to groups of the matching regex, for example: \1 inserts the value of the first matching group, as can be seen in the example above)
  • min revision: The minimum revision number this match block will match. (Every svn revision < min revision will not match this block)
  • max revision: The maximum revision number this match block will match. (Every svn revision > max revision will not match this block)
  • prefix: This will be prepended to the file name/path, which is useful if you want to move some things to a specific subfolder of your Git repo.

There is one special rule/option, which is the recurse rule, which can be found in this example. (I am not quite sure how this works, and have never used it, but want to mention it for completeness)

What about tags?

You might have noticed I have not shown any example for tags. That’s because SVN tags are actually branches and can’t easily be converted to tags in Git. The best option is to add a rule for them which will create branches in Git with a special name, e.g. TAG-\1 or so, and manually fix them up later. (Yeah, sucks)
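The manual fix-up can be scripted once the conversion is done. The following is a sketch under a few assumptions: tags live in /tags/ in SVN, the rule shown in the comment was used to turn them into TAG-<name> branches, and promote_tag_branches is a name I made up. It creates lightweight tags; if you want annotated ones you need to adapt it.

```shell
#!/bin/sh
# Assumed rule in the svn2git rules file (hypothetical /tags/ layout):
#
#   match /tags/([^/]+)/
#       repository supercat
#       branch TAG-\1
#   end match
#
# Turn every branch named TAG-<name> into a lightweight tag <name>
# and delete the branch afterwards.
# Run this inside the converted Git repository.
promote_tag_branches() {
    git for-each-ref --format='%(refname:short)' 'refs/heads/TAG-*' |
    while read -r branch; do
        tag=${branch#TAG-}
        # Only delete the branch if the tag was created successfully
        git tag "$tag" "$branch" && git branch -D "$branch"
    done
}

# cd supercat && promote_tag_branches
```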

One last rule!

This is quite important: All commits need to match a rule, so you might want to add a last rule to your file which does nothing:

match /
end match

Examples

Ok, this was a lot of theory, now some concrete Icecast project examples. First I want to mention our structure:

For quite some time it was structured like:

/trunk/icecast/

and was then switched to:

/icecast/trunk/icecast/

These are the rules I used to match:

#########################
# Create Git repository #
#########################

create repository icecast-raw
end repository

# Extra repo for the important branches
# needed to not mess up things significantly
# This needs manual work afterwards
# unfortunately

# ph3
create repository icecast-ph3-raw
end repository

# kh
create repository icecast-kh-raw
end repository


#############################################
# Never forget: All rules need a trailing / #
#    Else things will go horribly wrong     #
#############################################

#####################
# Include externals #
#####################

match /trunk/m4/
    max revision 6101
    repository icecast-raw
    branch master
    prefix m4/
end match

match /icecast/trunk/m4/
    min revision 6152
    repository icecast-raw
    branch master
    prefix m4/
end match

match /trunk/(avl|httpp|log|net|thread|timing)/
    max revision 6101
    repository icecast-raw
    branch master
    prefix src/\1/
end match

match /icecast/trunk/(avl|httpp|log|net|thread|timing)/
    min revision 6152
    repository icecast-raw
    branch master
    prefix src/\1/
end match

#########################
# Convert trunk commits #
#########################

match /icecast/trunk/icecast/
    min revision 6152
    repository icecast-raw
    branch master
end match

match /trunk/icecast/
    max revision 6101
    repository icecast-raw
    branch master
end match

#######################
# Convert tag commits #
#######################

# Ignore tags, they mess up more
# It's easier to re-create them later

##########################
# Convert branch commits #
##########################

match /icecast/branches/ph3/icecast/
    #min revision 6152
    repository icecast-ph3-raw
    branch master
end match

match /icecast/branches/kh/icecast/
    repository icecast-kh-raw
    branch master
end match

match /icecast/branches/kh/net/
    repository icecast-kh-raw
    branch master
    prefix src/net/
end match

match /icecast/branches/kh/thread/
    repository icecast-kh-raw
    branch master
    prefix src/thread/
end match

match /icecast/branches/icecast-webm/
    min revision 6152
    repository icecast-raw
    branch webm
end match

########################
# Ignore other commits #
########################

match /
end match

As you can see this is a bit complex ;)

There are other rules files involved, you can find them for reference here.
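For completeness, running the converter with a rules file against the local SVN copy might look like the sketch below. The file names are placeholders and convert is a made-up wrapper name. The identity map (a file mapping SVN user names to Git authors) is something I assume you have prepared. The Git repositories named in the create repository blocks should end up in the current directory.

```shell
#!/bin/sh
# Run the conversion against the local SVN copy.
# Usage: convert <rules-file> <authors-map> <svn-repo-path>
convert() {
    svn-all-fast-export \
        --rules "$1" \
        --identity-map "$2" \
        --stats \
        "$3"
}

# convert icecast.rules authors.txt /local/projects/supercat
```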