Changeset Evolution

Changeset What?

Mercurial is a distributed version control system, similar to git. If you have not tried it yet, you really should!

I work on Mercurial and, as you know already, I love to automate everything. If you use git or mercurial today, you know that source control is not trivial: workflows could be easier and require less manual intervention and dark magic.

Changeset evolution is a proposal to make source control less error-prone, more forgiving and more flexible. I will use “changeset evolution” and “evolve” interchangeably. Pierre-Yves David created Changeset Evolution, and you can see his talk at FOSDEM 2013.

The history of commits does not exist

Let’s start with an example. Assume that a user committed b on top of a:

Figure: Starting point (before running the amend command)

After making some changes, the user runs hg commit --amend (like git commit --amend) and decides to call the new commit b’:

Figure: After amend

Under the hood, the amend command creates a new commit. The old revision is still there, but hidden:

Figure: Under the hood, b has not disappeared yet, it is only hidden

For the user, b’ is a newer version of b. Even though the intent of amending is clear, no information about this intent is recorded in the source control system!

If the user wants to access, let’s say a week from now, what b was before the amend, he or she will have to dig through the reflog to find the hash of b.

What if we could record that b’ is the successor of b?

Defining the commit history with obsolescence markers

Changeset evolution introduces the concept of obsolescence markers to represent that a revision is the successor of another revision. I will represent the obsolescence markers with dotted lines in the following graphs. In the example above after running hg commit --amend we would have:

Figure: Recording that b' is the successor of b with an obsolescence marker. b is the precursor of b'.

And after running hg commit --amend again:

Figure: Two amends: b'' is the successor of b' and b' is the successor of b

All this happens under the hood, and the user does not see any difference in the UI. The markers are just extra recorded information that commands can use, as we will see in the next section.
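To make the idea concrete, here is a minimal sketch of a marker store in Python. It is a toy model, not Mercurial's actual data structure or API: the `ObsStore` class and its method names are made up for illustration, and revisions are plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class ObsStore:
    """Toy store of obsolescence markers (hypothetical, not Mercurial's API)."""

    # Each marker is a (precursor, successor) pair of revision names.
    markers: list = field(default_factory=list)

    def add(self, precursor, successor):
        self.markers.append((precursor, successor))

    def successors(self, rev):
        return [s for p, s in self.markers if p == rev]

    def precursors(self, rev):
        return [p for p, s in self.markers if s == rev]

    def is_obsolete(self, rev):
        # A revision is obsolete as soon as it has a successor.
        return bool(self.successors(rev))

    def latest(self, rev):
        """Follow single-successor markers to the newest version of rev."""
        seen = {rev}
        while True:
            succs = self.successors(rev)
            if len(succs) != 1 or succs[0] in seen:
                return rev
            rev = succs[0]
            seen.add(rev)


store = ObsStore()
store.add("b", "b'")    # first hg commit --amend
store.add("b'", "b''")  # second hg commit --amend
print(store.latest("b"))         # b''
print(store.precursors("b''"))   # ["b'"]
```

Nothing here touches the commits themselves: the markers are a side table that commands can consult.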

Simplify rebases, go back in time and don’t make mistakes

Let’s see how we can seamlessly use the obsolescence markers to simplify the life of the user through three examples.

1. Easily accessing a precursor:

Consider the situation discussed above:

Figure: After hg commit --amend

We can give the user some commands to access precursors of revisions to compare them or manipulate them. After running the amend, you can easily:

  • Go back to the previous version (without using the reflog)
  • Figure out what changes the amend introduced.

The reflog (git or mercurial) is a command that lists the successive locations of the head of the repository and of all its branches or bookmarks. Each line has the format “hashes command” and shows the working copy parent (i.e. the current commit) after each command. It is used to recover from mistakes and go back to a previous state.

2. Rebasing with fewer conflicts:

It is common to have a testing/continuous integration system run all the tests on a revision before pushing it to a repository. Let’s assume that you are working on a feature and committed b and c locally.

Figure: Before pushing b to the server on top of d

Satisfied with b, you send it to the CI system, which pushes it onto remote/master on the server. When you pull, you will have:

Figure: Pushing a commit can also add a marker

If you pull one hour later (assuming other people are very productive :D) you will have a situation like this:

Figure: Your colleagues have been productive and pushed many new changes since you last pulled

If you try to rebase your stack (b and c) on top of master, you may get conflicts when applying b because of another developer's work, for example if they changed the same files you changed in b. But in that case, that developer already resolved the conflicts once, when applying their work on top of newb, the version of b that landed on the server. You should not have to merge and resolve conflicts again, and obsolescence markers can help here. What if, on pull, the server could tell you that newb is the new version of b?

Figure: When rebasing the stack, the first commit can be omitted

This way when you rebase the stack, only c gets rebased, b is skipped, and you cannot get conflicts from the content in b.
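The skipping logic can be sketched in a few lines of Python. This is only an illustration of the idea, not how Mercurial's rebase is implemented: `commits_to_rebase` is a hypothetical helper, and the commits `e` and `f` stand in for your colleagues' work.

```python
def commits_to_rebase(stack, successors, destination_history):
    """Return the commits of `stack` that still need rebasing.

    stack: local commits, oldest first.
    successors: dict mapping an obsolete commit to its successor.
    destination_history: set of commits already reachable from the destination.
    """
    todo = []
    for rev in stack:
        newer = successors.get(rev)
        if newer is not None and newer in destination_history:
            # The destination already contains the new version: skip it.
            continue
        todo.append(rev)
    return todo


# Marker received on pull: newb is the successor of b.
successors = {"b": "newb"}
# e and f are hypothetical commits pushed by colleagues on top of newb.
master = {"d", "newb", "e", "f"}

print(commits_to_rebase(["b", "c"], successors, master))  # ['c']
```

Without the marker, the same call returns both b and c, which is exactly the case where you would have to resolve conflicts in b by hand.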

3. Working with other people

Let’s assume that you start from this simple state:

Figure: Starting point

You and your friend make changes to the revision b. You create a new version of b called b’ and your friend creates a new version of b called b’’.

Figure: The first developer rewrote b
Figure: The second developer rewrote b as well

Then you decide to put your work together. For example, you can do that by pulling from each other’s repositories. The obsolescence markers and revisions are exchanged and you end up with the following state:

Figure: b has two successors: b' and b'' are called divergent

In git or vanilla (no extension) mercurial, you would have to figure out that b’ and b’’ are two new versions of b and merge them. Changeset evolution detects that situation and marks b’ and b’’ as divergent. It then suggests an automatic resolution with a merge, and preserves history.

Figure: Everything gets resolved intelligently
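Detecting divergence boils down to noticing that one revision was rewritten independently more than once. Here is a rough Python sketch under simplifying assumptions: markers are plain (precursor, successor) pairs, one pair per rewrite, and splits are ignored. `find_divergent` is a made-up name, not an evolve function.

```python
from collections import defaultdict


def find_divergent(markers):
    """Map each revision rewritten more than once to its competing successors.

    markers: iterable of (precursor, successor) pairs, one pair per rewrite.
    """
    by_precursor = defaultdict(list)
    for precursor, successor in markers:
        by_precursor[precursor].append(successor)
    return {p: sorted(s) for p, s in by_precursor.items() if len(s) > 1}


mine = [("b", "b'")]      # your rewrite of b
friends = [("b", "b''")]  # your friend's rewrite of b

# After pulling from each other, both markers are known locally.
print(find_divergent(mine + friends))  # {'b': ["b'", "b''"]}
```

Each developer's own markers show no divergence; it only appears once the markers have been exchanged.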

The graph might seem overcomplicated, but once again, most things happen under the hood and the UI impact is minimal. These examples show one of the benefits of working with Changeset Evolution: it provides automatic resolution of typical source control issues.

As we will see in the next section, Changeset Evolution does much more than that and gives developers more flexibility when working with stacks of commits.

A more flexible workflow with stacks

Changeset evolution defines the concept of an unstable revision, a revision based on an obsolete revision. From the previous section:

Figure: c is unstable because it is based on b and b has a new version
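As a small sketch, instability can be checked by walking up the ancestors of a revision. The code below assumes a linear history (one parent per commit) and treats an obsolete revision as not unstable itself. Both are simplifications of mine, and `is_unstable` is a hypothetical helper.

```python
def is_unstable(rev, parents, obsolete):
    """True if rev is not obsolete itself but has an obsolete ancestor.

    parents: dict child -> parent (linear history only).
    obsolete: set of revisions that have a successor.
    """
    if rev in obsolete:
        return False
    current = parents.get(rev)
    while current is not None:
        if current in obsolete:
            return True
        current = parents.get(current)
    return False


parents = {"c": "b", "newb": "d"}
obsolete = {"b"}  # b has a new version, newb

print(is_unstable("c", parents, obsolete))     # True
print(is_unstable("newb", parents, obsolete))  # False
```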

Evolve resolves instability intelligently by rebasing unstable commits onto a stable destination, in the case above newb. But it does not force the user to resolve the instability right away, which allows more flexibility when working with stacks. Consider the following stack of commits:

Figure: The initial stack of commits

A user can amend b or c without having to rebase d.

Figure: We rewrote b and c, so c' and d are now unstable

And when everything looks good, changeset evolution can figure out the right commands to run to end up with the desired stack:

Figure: The desired stack

If the user were not using changeset evolution, he or she would have to rebase every time anything changes in the stack. The user would also have to figure out which rebase command to run, and could potentially make mistakes!
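To give a feel for what "figuring out the right commands" means, here is a toy planner in Python. It is a sketch of the idea only, not evolve's algorithm: it assumes a linear stack sitting on a base commit I call `a`, and it names the commit created by a rebase by appending `+` to the original name. `latest` and `evolve_plan` are hypothetical helpers.

```python
def latest(rev, successors):
    """Follow successor markers to the newest version of rev."""
    while rev in successors:
        rev = successors[rev]
    return rev


def evolve_plan(stack, parents, successors):
    """List the (revision, destination) rebases that stabilize a linear stack.

    stack: non-obsolete revisions, oldest first.
    parents: dict child -> parent.
    successors: dict obsolete revision -> its successor.
    """
    successors = dict(successors)  # work on a copy
    plan = []
    for rev in stack:
        destination = latest(parents[rev], successors)
        if destination != parents[rev]:
            plan.append((rev, destination))
            # Rebasing creates a new commit that succeeds rev.
            successors[rev] = rev + "+"
    return plan


# b was amended into b', then c was amended into c' (still on top of b).
parents = {"b'": "a", "c'": "b", "d": "c"}
successors = {"b": "b'", "c": "c'"}

print(evolve_plan(["b'", "c'", "d"], parents, successors))
# [("c'", "b'"), ('d', "c'+")]
```

The plan says: rebase c' onto b', then rebase d onto the rebased c'. That is the sequence you would otherwise have to work out and type by hand.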

What I didn’t cover

  • Working collaboratively with stacks
  • Markers defining multiple precursors (fold) and multiple successors (split)
  • And a lot of other things

How to install evolve and start playing with it

  1. Install mercurial
  2. Clone evolve’s repository with hg clone http://hg.netv6.net/evolve-main/
  3. Add the following configuration to your ~/.hgrc, with the correct path from the repo you just cloned:

{% highlight ini %}
[extensions]
evolve = path to/evolve.py
{% endhighlight %}

More resources