The secret of git

The secret of git’s popularity is simple: it is very logical, quick, and has awful usability. That is a perfect combination for its primary audience, the geeks, who love tinkering with a complicated, beautiful thing to make it work for them. Another lesson learned: great UX doesn’t have to be ergonomic (in the Taylorist sense).

In particular, I never thought I would enjoy working with an SCM on the command line. With TFS, I frowned every time somebody told me I had to use tf or even some third-party tool to accomplish a task, because it was obvious to me that if I see all my projects and sources in Visual Studio, that is also the proper place to perform all SCM-related tasks. I didn’t want to work with my files using two completely different UIs, let alone work with my files and change history from the command line.

But with git, working on the command line is the only option for me. First, git is too dangerous a tool to allow opaque GUI wrappers to do uncontrollable magic around it. Second, using it for embedded C development feels surprisingly harmonious: compared with high-level languages and projects, you have dramatically fewer files when working in C, so the scope stays manageable.

Nevertheless, after tinkering with it for a couple of months, trying to devise a workable development process, I want to give up on some of its areas. Perhaps some good git wizard will stumble upon this post and share his wisdom with me.

Basically, our current process looks like this. We have a central git server with several bare repositories. They are cloned onto each dev PC, as well as onto a release PC. In most repos, there is a branch called “develop”; this is the integration branch. Working on source code typically looks like this:

git checkout develop
git pull
git checkout -b maxim_bug1234
# Switch to Eclipse, code, build, test, debug, ready
git status
git add <files>  # OR
git commit -a -m "Changed abc"
# repeat until bug fixed or feature done
git checkout develop
git pull
git checkout maxim_bug1234
git rebase develop
git checkout develop
git rebase maxim_bug1234
git push

We normally don’t push our developer branches to the central server, because the commits themselves are still pushed (and thus backed up) as part of the develop branch, and we currently are afraid of cluttering the server with thousands of dev branches.

Besides develop, there is a release branch. Our software versioning format is <major>.<minor>.<release>. To prepare for a release, we do the following on the development PC:

git checkout develop
git pull
git checkout <major>.<minor> # or "git checkout -b <major>.<minor>" if the branch does not exist yet
git pull
git merge develop
git push origin <major>.<minor>
# If we want to make an internal release for debugging purposes, that's it
# Otherwise, we increment the release number and tag it
git tag -f <major>.<minor>.<release>
git push origin --tags

Now, in our process we can either release a branch (that is, its top) for debugging purposes only, or a tag. Given the <tag-or-branch> we want to release, the procedure is as follows (on release PC):

git fetch
git fetch --tags
git checkout <tag-or-branch>
git merge origin/<tag-or-branch>

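# automatically generate version string of this release
# this will give either the tag, or (in case of a branch top), <tag>-<commits since tag>-g<abbreviated hash>
# the pattern is quoted so that the shell cannot expand it against file names
echo "#define VERSION \"$(git describe --tags --match='[[:digit:]].[[:digit:]].[[:digit:]]')\"" > version.h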

make all

# Now automatically tag the released state to be able to return to it later
git tag $stamp
git push origin $stamp
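
To make the two possible shapes of the version string concrete, here is a small sketch in a scratch repository. Everything in it is made up for illustration: the temporary directory, the file main.c, the tag 1.2.3 and the commit messages. Like the pattern above, it assumes single-digit version components.

```shell
#!/bin/sh
# Sketch: what `git describe --tags` returns on a tag and on a branch tip.
# The scratch repository, file name, tag and messages are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

echo one > main.c
git add main.c
git commit -q -m "Release candidate"
git tag 1.2.3

# Same glob as in the release procedure; quoted to keep the shell out of it.
pattern='[[:digit:]].[[:digit:]].[[:digit:]]'

# HEAD sits exactly on the tag: describe prints the bare tag name.
on_tag=$(git describe --tags --match="$pattern")
echo "on the tag:    $on_tag"

# One commit later: describe appends the commit count and the short hash.
echo two >> main.c
git commit -q -a -m "Work after the release"
on_tip=$(git describe --tags --match="$pattern")
echo "on branch tip: $on_tip"

echo "#define VERSION \"$on_tip\"" > version.h
cat version.h
```

The second form still names the nearest matching tag, so a debug build from a branch can be traced back to the release it is based on.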

The issues we have with this process are:

1) It seems to be in git’s nature to be very selective about what you commit at a time. For example, before git I would regularly work on several tasks at once and then commit all files in one changeset, possibly describing all the changes in its message. Now I first stage the changes related to one task, commit them, then stage the changes for the next task, create a second commit, and so forth. Keeping the scope of a commit atomic seems to be a good idea; it should help when merging or cherry-picking. The problem is that there are SO MANY COMMITS! And the git log command, at least when used without additional parameters, is very unhelpful for keeping an overview of the work done in the last several days. It prints a lot of useless information in a verbose format, so we feel lost in this sea of commits. GUI tools like gitk and SmartGit are no better at this. There are still way too many commits to keep track of. Compared to old-style SCMs, where you typically made one or two commits a day, it is a quantum leap. We need to find an alternative to commit messages to understand what has been done lately.

2) Perhaps we’re doing something wrong with how we merge or rebase, but I expected to see the develop branch as a straight line in visualization tools like gitk or SmartGit. It is not; the branches look like a complex graph, so you cannot see visually which changes have been merged into develop by whom, because you can’t recognize a clear line for the develop branch.

3) In a traditional SCM, the server tracks all files I’m currently working on (i.e. changed but not yet committed), no matter where I work on them (in which working folder and on which PC). We work in virtual machines on several PCs; I, for example, have at least six different VMs. Each VM has a local clone of the git repository. When working on a release (often under time pressure) I tend to make changes in several VMs at the same time. Now, if I forget to push a change to the central git server and turn off the VM, I have no means of remembering that change. Typically, I forget about it, proceed with development, and at some point wonder why the change is not there (“haven’t I already fixed this thing?!”), then perhaps re-implement the change and push it to origin. Afterwards, there is a good chance I will resume the other VM, make some changes there and try to push everything, including my previous implementation. At this point, I am presented with a nasty merge conflict, and I have absolutely no history spanning my VMs that would help me understand what has just happened.

4) Fundamentally, I don’t like how merging is implemented. git can only merge into the branch that is currently checked out. In more than 50% of cases this is the wrong approach, because I typically want to “uplift” changes from my current branch into the develop branch first, and then “uplift” them from the develop branch into a release branch. Constantly switching between branches just to merge something is a waste. We would need a way to “uplift” changes without a checkout, at least for the case where the merge is a fast-forward.
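
For what it is worth, a few stock git options look relevant to the four points above. The sketch below is not a tested process, just a demonstration in a throwaway repository: the temporary directories, the `commit` helper, the branch maxim_bug5678 and all commit messages are made up for illustration. It shows a one-line-per-commit log limited to recent days (1), `--first-parent` to follow only the develop line through merges (2), a check for uncommitted and unpushed work before shutting down a VM (3), and `git fetch . <src>:<dst>` to fast-forward a branch that is not checked out (4).

```shell
#!/bin/sh
# Sketch only: a throwaway "server" plus one clone. All names are hypothetical.
set -e
work=$(mktemp -d)
git init -q --bare "$work/server.git"
git init -q "$work/vm1"
cd "$work/vm1"
git config user.name "Demo"
git config user.email "demo@example.com"
git remote add origin "$work/server.git"
git symbolic-ref HEAD refs/heads/develop   # start on a branch named develop

commit() {  # helper: append a line to file $1 and commit it with message $2
    echo "$2" >> "$1"
    git add "$1"
    git commit -q -m "$2"
}

commit a.txt "Base"
git push -q origin develop

# A feature branch that is merged back while develop has moved on.
git checkout -q -b maxim_bug1234
commit b.txt "Fix part 1"
commit b.txt "Fix part 2"
git checkout -q develop
commit a.txt "Hotfix on develop"
git merge -q --no-ff -m "Merge maxim_bug1234" maxim_bug1234

# 1) Compact overview: short hash, date, author, subject; last 3 days only.
overview=$(git log --since="3 days ago" --date=short \
    --pretty=format:'%h %ad %an %s')
printf '%s\n\n' "$overview"

# 2) Only the develop line: follow the first parent of every merge commit.
mainline=$(git log --first-parent --pretty=format:'%s' develop)
printf '%s\n\n' "$mainline"

# 3) Before shutting down a VM: uncommitted changes, and commits on local
#    branches that are not on any remote-tracking branch yet.
echo "work in progress" >> a.txt
dirty=$(git status --porcelain)
unpushed=$(git log --branches --not --remotes --pretty=format:'%s')
printf 'uncommitted:\n%s\nunpushed:\n%s\n\n' "$dirty" "$unpushed"

# 4) Fast-forward develop from the current branch without checking it out.
git checkout -q -b maxim_bug5678
commit c.txt "Another fix"
git fetch -q . maxim_bug5678:develop
echo "still on: $(git symbolic-ref --short HEAD)"
```

The last command only succeeds when the update is a fast-forward; anything that needs a real merge still requires checking out the target branch.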

Any ideas?
