ariya.io

Extracting Parts of Git Repository and Keeping the History

At some point, a software project will grow beyond its original scope. In many cases, some portions of the project become their own mini world. For maintenance purposes, it is often beneficial to separate them into their own projects. The commit history of the extracted part should not be lost in the process. With Git, this can be achieved using git-subtree.

git-subtree can do quite a lot, but the only feature we need for this task is its splitting capability. The documentation says the following about the split feature:

Extract a new, synthetic project history from the history of the prefix subtree. The new history includes only the commits (including merges) that affected prefix, and each of those commits now has the contents of prefix at the root of the project instead of in a subdirectory. Thus, the newly created history is suitable for export as a separate git repository.

This turns out to be quite simple. In fact, there is already a Stack Overflow answer which describes the necessary step-by-step instructions. The illustration below, also dealing with a real-world repo, hopefully serves as an additional example of this use case.

First of all, make sure you have a recent version of Git:

git --version

If it reports 1.8.3, upgrade first: that release has a bug (fixed in 1.8.4) that badly pollutes the commit logs by adding “-n” everywhere.
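If you would rather not eyeball the version number, the check can be scripted. What follows is only a sketch: version_lt is a made-up helper name, and the script assumes that git --version prints the number as the third word, as in "git version 2.43.0".

```shell
#!/bin/sh
# Hypothetical helper: succeeds if version $1 is older than version $2.
# Only the first three dot-separated fields are compared, numerically.
version_lt() {
    [ "$1" = "$2" ] && return 1
    oldest=$(printf '%s\n%s\n' "$1" "$2" | sort -t. -k1,1n -k2,2n -k3,3n | head -n 1)
    [ "$oldest" = "$1" ]
}

# Compare the installed Git against 1.8.4, the release with the fix.
if command -v git >/dev/null 2>&1; then
    installed=$(git --version | awk '{print $3}')
    if version_lt "$installed" "1.8.4"; then
        echo "Git $installed is older than 1.8.4; upgrade before splitting"
    else
        echo "Git $installed is recent enough"
    fi
fi
```

The comparison relies on sort with numeric keys rather than on a string comparison, so that 1.10.0 is not mistaken for something older than 1.8.4.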

For this example, let’s say we want to extract the funny automatic name generator (for a container) from the Docker project into its own Git repository. We start by cloning the main Docker repository:

git clone https://github.com/dotcloud/docker.git
cd docker

We then split the name generator, which lives under pkg/namesgenerator, and place it into a separate branch. Here the branch is called namesgen but feel free to name it anything you like.

git subtree split --prefix=pkg/namesgenerator/ --branch=namesgen

This can take a while, depending on the size of the repository. When it completes, we can verify the result by inspecting the commit history of the new branch:

git log namesgen
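Besides reading the log, it can help to look at how many commits ended up on the branch and where its files live. The helper below is a sketch for that purpose; split_summary is a hypothetical name, and it is meant to be run from inside the checkout that contains the split branch.

```shell
#!/bin/sh
# Hypothetical helper: summarize a branch produced by the split.
# $1 = name of the branch (namesgen in this article)
split_summary() {
    branch=$1
    # Number of commits reachable from the tip of the branch.
    echo "commits: $(git rev-list --count "$branch")"
    # Files in the tip commit. After a split they should sit at the
    # root, without the pkg/namesgenerator/ prefix.
    echo "files:"
    git ls-tree -r --name-only "$branch"
}
```

Called as split_summary namesgen, it prints the commit count followed by the file list. If the listed paths still start with pkg/namesgenerator/, the branch is probably not the one created by the split.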

The next step is to prepare a place for the new repository (choose any directory you prefer). From there, all we need to do is pull the namesgen branch that was split off earlier:

cd ~
mkdir namesgen
cd namesgen
git init
git pull /path/to/docker/checkout namesgen

That’s it! Of course, normally you want to push this to some remote, e.g. a repository on GitHub or Bitbucket or your own Git endpoint:

git remote add origin git@github.com:joesixpack/namesgen.git
git push -u origin --all

The new repository will contain only the files from the pkg/namesgenerator/ directory of the Docker repository. And of course, every commit that touches that directory still appears in the history.
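To try the whole procedure without cloning a large repository, the same steps can be replayed on a throwaway project. This is a sketch under a few assumptions: demo_split, the pkg/tool directory, the tool-only branch, and the demo identity are all invented for the illustration, and the script skips itself when git-subtree (which ships in Git's contrib area and is packaged separately on some distributions) is not installed.

```shell
#!/bin/sh
# Throwaway demo in a temp directory; every name and path below is made up.
demo_split() (
    set -e
    # git-subtree is a contrib script and is not installed everywhere.
    if [ ! -x "$(git --exec-path)/git-subtree" ]; then
        echo "git-subtree not installed; skipping"
        exit 0
    fi
    work=$(mktemp -d)

    # Hypothetical identity so that commits work on an unconfigured machine.
    export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
    export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

    # 1. A small "big" project: two commits touch pkg/tool, one does not.
    git init -q "$work/big"
    cd "$work/big"
    mkdir -p pkg/tool
    echo "v1" > pkg/tool/tool.txt
    git add -A
    git commit -q -m "tool: first version"
    echo "readme" > README
    git add -A
    git commit -q -m "add README"
    echo "v2" > pkg/tool/tool.txt
    git add -A
    git commit -q -m "tool: second version"

    # 2. Split pkg/tool into its own branch, as done for namesgen.
    git subtree split --prefix=pkg/tool --branch=tool-only >/dev/null 2>&1

    # 3. Pull that branch into a brand-new repository.
    git init -q "$work/tool"
    cd "$work/tool"
    git pull -q "$work/big" tool-only

    # 4. Report what ended up in the new repository.
    echo "commits: $(git rev-list --count HEAD)"
    echo "files: $(git ls-tree -r --name-only HEAD)"
)

demo_split
```

When git-subtree is available, the new repository should report two commits and a single file, tool.txt, at its root: the README commit never touched pkg/tool, so it is left out of the extracted history.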

Mission accomplished!
