Category Archives: Revision control

Another Git cheatsheet

This one is pretty cool.

Git Cheatsheet

It shows all commands relative to what portion of the conceptual workspace they modify – e.g. stash, workspace, index, local repository or remote repository.

That said, it stops at the one-line summary for a command. Looking at it, I wish you could see each command in action, in terms of how it modifies data, but still at the logical level rather than as yet another interactive Git tutorial.

Git tidbits

There’s a new Git GUI on Windows: https://github.com/kaisellgren/Git-GUI. It uses libgit2 and .NET 4.5.

GitHub for Windows is also libgit2+.NET: https://github.com/blog/1127-github-for-windows, and interoperates with the GitHub site. However, since libgit2 isn’t yet feature-complete, GitHub for Windows also uses msysGit where necessary.

Since libgit2 has pluggable backends, people have used it to store Git repositories in databases: https://deveo.com/blog/2013/06/19/your-git-repository-in-a-database-pluggable-backends-in-libgit2/. Some existing backends include Memcached, MySQL, Redis and SQLite.

Revision control – past, present and future

I’d like to write a comprehensive book on revision control at some point. This is not even the start of that, but I’m going to record a few trends.

I’m going to cover the revision control systems that sprang to life as part of the BitKeeper debacle, because those are the most interesting ones to consider.

Monotone never really existed

Monotone was the template that both Mercurial and Git copied. However, Monotone always followed a very purist model, and was dreadfully slow for many years. It caught up in speed somewhat, but has very few users.

Development is sparse: only a few commits in the past 6 months, mostly bug fixes or translations.

Bazaar is dead

There have been no significant public changes to Bazaar in the past year. It’s been declining for years, and insiders admit that the focus changed from “Bazaar as a decent general-purpose version control system” to “ensuring that Bazaar worked well for package management in the Ubuntu project”.

There have been some polite exchanges saying that many Bazaar architectural directions were superior to Git’s, but I have to disagree. The architecture of Bazaar informed the implementation, and the implementation is second-rate compared to Git or Mercurial. That said, some of the things Bazaar attempted were good ideas, and maybe those ideas can be transplanted somewhere else.

By “no significant changes”, I mean that there have been virtually no commits to Bazaar itself since mid-2012. There have been 7 commits in the past 30 days and 144 commits in the past year, and almost none of those are commits that will get released.

Mercurial is slowing down

While Mercurial is doing far better than Bazaar, its growth seems to have been slowing down markedly. It has some big advantages, though: since it’s written in Python, the Windows client is as good as the Mac or Linux client. It has local revision numbers, which are far easier to deal with cognitively than hashes. Since it’s written in Python, it can be scripted in Python very reliably. It arguably has very clean and comprehensive documentation.

But it’s written in Python. It has a relatively weak branching model (so weak that cloning was the branching model pushed for many years, and cloning is not good for merging or comparing history). It doesn’t do rebase very well. It’s getting new users at a far slower rate than Git, the emerging leader, and the rate of development of Mercurial itself has been about half that of Git over the past 12 months. It’s of course moderately hard to compare a Python project to a C project.

Still, Mercurial isn’t dying. It’s perhaps more accurate to say that it’s reached its level. Fortunately, it’s fairly easy to go between Mercurial and Git. I predict that Mercurial will eventually completely lose to Git, largely because of the decision to use a revlog approach instead of a blob approach. Blobs are more bookkeeping for moderate projects, but give more flexibility for large projects with lots of history.

One brilliant idea that was problematic in execution was Mercurial extensions. It’s super-easy to extend Mercurial with an extension. But then, you have a custom Mercurial install that you need to replicate to others if you need them to use your extension. Why not put extensions in repositories? Hmm, I should implement that change and push it upstream.

Git

Git is taking over the world, although that might be due in large part to GitHub. While Subversion still seems to be the dominant open-source revision control system, most of that is inertia. Few new projects are choosing Subversion, and none are choosing CVS.

There are alternate implementations of Git. GitHub itself uses a Ruby version, and uses libgit2 for GitHub for Windows.

I need to analyse Git’s status more accurately; I’m an admitted partisan.

Git weaknesses

  • no partial repository support (although you can permanently drop old history)
  • falls down on gigantic repositories (10+ GB repository not pleasant)
  • no good way to use multiple repositories together (git submodule is sub-par)
  • no human-understandable version number

Veracity is a dark horse

SourceGear switched direction several years back and is trying to be a player in distributed revision control with Veracity. I read a lot of the book that was released, but I haven’t yet used Veracity. It hasn’t made much of a dent in the landscape yet, but since it’s very polished, it could get a lot of enterprise customers.

The impressive part is that it’s an open-source project, which is a big step for a company to take, but probably an essential one in order to compete with Git. So, since it’s open-source, it might survive SourceGear; not that SourceGear is in any immediate danger, but it is a project that’s being run by a commercial company, and those tend to live and die with the company.

Git feature --assume-unchanged

This is a cool feature. You can mark files as “yes, I know this is tracked by Git, but I don’t want my changes committed.”

For example, there’s a config file that’s checked in. You need to make local edits to test with. However, you often accidentally commit those changes (you forget). But you could tell Git to ignore changes in this file. Let’s say we have a file config.xml that we want to edit locally and leave edited.

git update-index --assume-unchanged config.xml

After this, we can commit all we want, and Git will ignore our local changes to config.xml. If you need to commit a change to it, you can undo this with

git update-index --no-assume-unchanged config.xml

If you’ve forgotten which files you have set the “assume unchanged” bit on, you can do git ls-files -v to see.
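
As a rough sketch of that last step: git ls-files -v prefixes each path with a one-letter status tag, and the tag is lowercase when the “assume unchanged” bit is set. The helper name filter_assume_unchanged below is made up for illustration (it is not a Git command); it just keeps the lowercase-tagged lines.

```shell
#!/bin/sh
# Hypothetical helper (not part of Git): reads `git ls-files -v` output on
# stdin and prints only the paths whose status tag is lowercase, i.e. the
# files that have the "assume unchanged" bit set.
filter_assume_unchanged() {
    # Each input line looks like "H path" or "h path". Keep the lines with a
    # lowercase tag, then drop the tag and the space that follows it.
    grep '^[[:lower:]] ' | cut -c3-
}

# Typical use from inside a repository:
#   git ls-files -v | filter_assume_unchanged
```

Run from inside a repository, the pipeline in the closing comment should list config.xml once the bit is set, and stop listing it after --no-assume-unchanged.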

This is an edge case, but useful for some workflows.

Revision control systems

This is more about the theory and implementation of revision control systems, and not really about use. I’m interested in how various concepts came into being and how they evolved.

One early system (that died quite thoroughly; evidently it never got used much) was OpenCM, billed as a “secure and safe CVS replacement”. I found a copy of a user’s manual from 2002 that describes some of the concepts (I’ll see if I can mirror it locally before it too disappears into obscurity).

OpenCM User’s Guide

It hit upon the idea of giving each file a universal name, but it does so by generating “random” names based on your machine name, so it loses some of the benefit of doing content-based names (like using the SHA-1 of the file contents as Monotone, Git and Mercurial do).
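
To make the content-based naming idea concrete, here is a minimal sketch of how Git names a file’s contents. Git hashes a short header plus the raw bytes, so strictly speaking it is not the SHA-1 of the bare contents, and the other systems differ in what exactly they feed into the hash; I’m only showing Git’s scheme. The function name git_blob_id is made up for illustration, and the sketch assumes the sha1sum utility from GNU coreutils is available.

```shell
#!/bin/sh
# Sketch: derive a content-based name the way Git does for a blob.
# Assumes sha1sum (GNU coreutils) is installed; git_blob_id is a
# hypothetical helper name, not a Git command.
git_blob_id() {
    # Git hashes a header ("blob <size>" followed by a NUL byte) and then
    # the raw file contents. The file name and machine name play no part.
    size=$(wc -c < "$1" | tr -d ' ')
    { printf 'blob %s\0' "$size"; cat "$1"; } | sha1sum | cut -d' ' -f1
}

# Typical use:
#   git_blob_id config.xml
```

Two files with identical contents get the same name no matter which machine or path they come from, which is exactly the property a machine-derived “random” name gives up. Inside a repository, git hash-object on the same file should report the same value, as long as no content filters (such as line-ending conversion) are in play.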

Here’s a snapshot of what existed in version-control land back in 2007, at least as far as open source went.

Appendix A. Free Version Control Systems from Producing Open-Source Software by Karl Fogel.

Here’s a slideshow history of revision control: http://www.win.tue.nl/~aserebre/2IS55/2009-2010/stijn.pdf. Well, somewhat – it skips 90% of revision control systems and only talks about the open-source ones. This may be appropriate, since there hasn’t been a lot of cross-pollination from the closed-source revision-control systems.

Monotone was created by Graydon Hoare and first released on April 6, 2003 (according to LWN and Wikipedia). Monotone was rejected by Linus Torvalds as being too slow, and this led directly to the creation of Git.

Veracity is Eric Sink’s replacement for SourceGear. http://veracity-scm.com/

Petr Baudis’ Bachelor Thesis (2008) was on Current Concepts in Version Control Systems. He contributed a lot to Git development, starting days after Linus Torvalds’ first release by building a front end for Git (git-pasky, later Cogito, some of which was then folded into the Git core).