Continuous Integration
Whilst Nix’s strict control of software versions and dependencies is excellent when handling other people’s programs, it can be a little frustrating when developing one’s own software.
In particular, version control systems like Git have their own notion of immutable state (commits) and dependencies (each commit’s hash includes that of its parent, just like Nix derivations include their dependencies’ hashes). This similarity can often lead to overlaps, where I need to synchronise Nix packages with git updates, or vice versa.
I’ve tried several methods of resolving this, but have finally found one that I’m happy with. Here I’ll explain these approaches.
Manual Git Revisions
This is the most tedious approach, but it ensures you’re using a known-good configuration. For example, this is the way you’d manage an official package in nixpkgs.
Let’s say we’re maintaining the following packages, foo and bar. For the sake of argument, let’s say we’re maintaining these definitions in our ~/.nixpkgs/config.nix file:
foo = stdenv.mkDerivation {
  name = "foo";
  src  = fetchgit {
    url    = "https://example.com/foo.git";
    rev    = "f000001";
    sha256 = "f000000000000000000000000000000000000000000000000001";
  };
};

bar = stdenv.mkDerivation {
  name = "bar";
  src  = fetchgit {
    url    = "https://example.com/bar.git";
    rev    = "b000001";
    sha256 = "b000000000000000000000000000000000000000000000000001";
  };
  buildInputs = [ foo ];
};
Now let’s say we make a change to foo, which lives in a new git revision f000002. We need to go and edit our package definition:
foo = stdenv.mkDerivation {
  name = "foo";
  src  = fetchgit {
    url    = "https://example.com/foo.git";
    rev    = "f000002";
    sha256 = "f000000000000000000000000000000000000000000000000001";
  };
};
In fact, this package will not build, since the SHA256 hash will no longer match. Keep in mind that the SHA sum of the git commit has nothing to do with the SHA256 sum of the Nix derivation, so we can’t just copy it over. Instead, we must ask Nix what the new SHA sum should be, and the easiest way to do that is to attempt to build the package:
$ nix-shell -p foo
...
output path ‘/nix/store/...foo’ should have r:sha256 hash
‘f000000000000000000000000000000000000000000000000001’, instead has
‘f000000000000000000000000000000000000000000000000002’
...
This is what we were expecting, so we can now safely copy that new hash in place of the old one:
foo = stdenv.mkDerivation {
  name = "foo";
  src  = fetchgit {
    url    = "https://example.com/foo.git";
    rev    = "f000002";
    sha256 = "f000000000000000000000000000000000000000000000000002";
  };
};
Clearly, this is a tedious process. The worst part is that we need to do the same thing even for tiny changes. Let’s say that bar requires some small change to the API of foo, e.g. changing a private function into a public one (before you scoff, read on for a treatment of the software engineering aspects!).
Even if we have a nice development workflow for bar, with this setup we will still need to bump the version of foo globally, just to have the new version available to bar.
This is problematic in a couple of ways: firstly, all package definitions will now point to the new version of foo. Of course, thanks to the way Nix works this won’t affect any existing derivations (i.e. all of our installed packages will carry on working as-is), however it will affect new derivations, like those instantiated by nix-shell. Symptoms might include a load of expensive rebuilds, or subtle breakages that we don’t want to contend with right now (software engineers, read on!).
The other problem is that our change might not work! With this approach, we must go to all this effort and bump the global version of foo just to try something out!
As for the software engineering perspective, it’s certainly true that any change to a public API should lead to a version increase, and shouldn’t be considered “tiny”. It’s also true that we should try not to release new API versions if they’re known to break existing clients, at least without understanding whether the breakages are acceptable.
However, our problem shouldn’t have anything to do with releases and versions! Even at the stage of experimenting and prototyping, to see if the proposed change will even work in the first place, we are suddenly forced to deal with a release management scenario!
This is obviously a bad state of affairs, in particular because it penalises modularity: these headaches in tying packages together will subconsciously bias us against re-using existing components, or splitting up our task into smaller sub-tasks.
Ignoring Git
We can also go to the opposite extreme and ignore version control altogether! Consider the following:
foo = stdenv.mkDerivation {
  name = "foo";
  src  = ~/Programming/foo;
};

bar = stdenv.mkDerivation {
  name = "bar";
  src  = ~/Programming/bar;
  buildInputs = [ foo ];
};
Using hard-coded paths is clearly a bit of a code smell, as these package definitions are no longer portable to other machines. However, look at what we’ve gained! We no longer need to specify a git revision or an SHA256 checksum: Nix will scan the contents of the directories, and if they’ve changed from the last build it will copy them into the store and make a new derivation.
Unfortunately, there are obvious problems with this approach. In particular, we’ve still got the issue that any changes we make will have a global effect. In fact, we no longer have to commit something in order to alter our system! If we have unstaged changes in the foo or bar directories, any new derivations depending on these packages will get those changes, regardless of whether we intended them to or not.
This undermines a lot of the purpose of Nix, since we want to have confidence in our system integrity and the freedom to develop software without fear of breaking anything with half-finished changes.
Automating Updates
Another solution I tried is to automate the process of updating revisions and hashes. I wrote two scripts, called bumpCommit and bumpEverything, to aid this process. First, the packages would be defined to load their revision and hash attributes from external files:
foo = stdenv.mkDerivation {
  name = "foo";
  src  = fetchgit {
    url    = "https://example.com/foo.git";
    rev    = import ./foo.rev.nix;
    sha256 = import ./foo.sha256.nix;
  };
};

bar = stdenv.mkDerivation {
  name = "bar";
  src  = fetchgit {
    url    = "https://example.com/bar.git";
    rev    = import ./bar.rev.nix;
    sha256 = import ./bar.sha256.nix;
  };
  buildInputs = [ foo ];
};
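For illustration, each of those external files contains nothing but a Nix string; after the bump described earlier, foo’s files might look like this (the values are just the dummy ones from above):
# Contents of foo.rev.nix: a quoted git revision
"f000002"

# Contents of foo.sha256.nix: the corresponding hash
"f000000000000000000000000000000000000000000000000002"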
bumpCommit would look at the current working directory, and use a lookup table to find the corresponding rev.nix and sha256.nix files, which it would overwrite with the new versions.
Since these package definitions are themselves stored in git repositories, running bumpCommit on a project may cause changes in other projects, which requires more commits, and so on. That’s where bumpEverything came in: it would loop through all projects in dependency order, commit any outstanding changes to rev.nix or sha256.nix files, then invoke bumpCommit.
This was clearly a hack, and was very fragile in the face of changing projects, e.g. splitting a project into two parts. It also generated a large number of git commits, which would do nothing other than bump revision numbers and hashes.
latestGit
My current solution abandons the bumpCommit and bumpEverything commands. Instead, we try to combine the best features of the git approach and the no-git approach.
The aim is to treat a git URL as if it were a local directory: no need to specify a particular revision (just fetch a sensible default like HEAD, master, etc.) and no need to specify a particular hash (just like with local directories, if it’s changed then build a new derivation; of course, in practice a particular git revision will not change, and is hence safe to cache).
To do this, we abuse Nix’s idea of derivations in order to run arbitrary scripts and cache the results in the Nix store for subsequent access.
First, we write a “package” which is just a single file, containing the latest git revision of a particular repository:
with import <nixpkgs> {};
with builtins;
getHeadRev = { url, ref ? "HEAD" }:
  stdenv.mkDerivation {
    inherit url ref;
    name    = "repo-head-${hashString "sha256" url}";
    version = toString currentTime;

    # Required for SSL
    GIT_SSL_CAINFO = "${cacert}/etc/ssl/certs/ca-bundle.crt";

    buildInputs  = [ git gnused ];
    buildCommand = ''
      source $stdenv/setup
      # printf is an ugly way to avoid trailing newlines
      printf "%s" $(git ls-remote "$url" "$ref" | head -n1 | sed -e 's/\s.*//g') > "$out"
    '';
  };
The getHeadRev function takes a url parameter and, optionally, a ref (defaulting to HEAD). It then generates a package with a name derived from the given URL, and a version based on the current time; this ensures that we avoid the Nix cache.
The contents of the package, stored in the file at location $out, are generated by the buildCommand, which queries the given URL for the latest revision ID.
Next, we need a way to access this revision information from within Nix. We do this by coercing the packages generated by getHeadRev into strings, which will correspond to the $out path at which they’re stored. We use readFile to read the contents of the generated files:
rev = args: unsafeDiscardStringContext (readFile "${getHeadRev args}");
We use unsafeDiscardStringContext to ignore the “dependencies” of this string, i.e. the particular invocation of the git ls-remote command; all we care about is the revision, not the time at which it was looked up.
Now that we have a git revision, we just need to avoid the hash requirement of fetchgit, to prevent it from being a so-called “fixed-output derivation”. We do this in two steps, utilising Nix’s lazy evaluation. First, we do a regular fetchgit invocation:
fg = args: fetchgit {
  url = args.url;
  rev = rev args;

  # Dummy hash
  sha256 = hashString "sha256" args.url;
};
Finally, we override this derivation to strip out all of the hashing information:
latestGit = args: stdenv.lib.overrideDerivation (fg args) (old: {
  outputHash     = null;
  outputHashAlgo = null;
  outputHashMode = null;
  sha256         = null;
});
Without these details, Nix will use the hashes it calculates from the source. Now we can use this latestGit function to specify our package sources:
foo = stdenv.mkDerivation {
  name = "foo";
  src  = latestGit {
    url = "https://example.com/foo.git";
  };
};

bar = stdenv.mkDerivation {
  name = "bar";
  src  = latestGit {
    url = "https://example.com/bar.git";
  };
  buildInputs = [ foo ];
};
This ensures that Nix and git are always synchronised, since git revisions are now the canonical form of package versions, and Nix will ask Git if these have changed when a new derivation is being instantiated.
From a development point of view, this gives us the freedom to tinker and experiment without fear of breaking our system. When our local changes are suitable for wider use, we can do the usual git push to make them available to the world, which now includes our installation of Nix!
Our packages are also portable, as long as latestGit is made available somewhere.
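One way to do that, sketched below, is to bundle the fragments above into a single file which each machine can then import; the filename latestGit.nix and the reliance on <nixpkgs> here are illustrative assumptions rather than a canonical layout:
# latestGit.nix (hypothetical filename): a self-contained version of the
# definitions above
with import <nixpkgs> {};
with builtins;
let
  # A non-cacheable “package” whose content is the repo’s latest revision
  getHeadRev = { url, ref ? "HEAD" }:
    stdenv.mkDerivation {
      inherit url ref;
      name    = "repo-head-${hashString "sha256" url}";
      version = toString currentTime;

      # Required for SSL
      GIT_SSL_CAINFO = "${cacert}/etc/ssl/certs/ca-bundle.crt";

      buildInputs  = [ git gnused ];
      buildCommand = ''
        source $stdenv/setup
        printf "%s" $(git ls-remote "$url" "$ref" | head -n1 | sed -e 's/\s.*//g') > "$out"
      '';
    };

  # Read that revision back into Nix, discarding its dependency context
  rev = args: unsafeDiscardStringContext (readFile "${getHeadRev args}");

  # A regular fetchgit call with a dummy hash...
  fg = args: fetchgit {
    url    = args.url;
    rev    = rev args;
    sha256 = hashString "sha256" args.url;
  };

in
  # ...overridden to strip out the fixed-output hash details
  args: stdenv.lib.overrideDerivation (fg args) (old: {
    outputHash     = null;
    outputHashAlgo = null;
    outputHashMode = null;
    sha256         = null;
  })
Our config.nix could then define latestGit = import ./latestGit.nix; alongside the package definitions, assuming the file lives next to it.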
This doesn’t quite solve the problem of using experimental versions of foo from within bar, however we’re free to define extra packages, like foo-unstable, which use the local directory as their source. If we make foo a parameter of the bar package, we can override it with nix-shell to get a one-off development shell using foo-unstable, without breaking anything, without exposing our dodgy prototypes to the world, and without building a mountain of trivial git commits.
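As a rough sketch of that last idea (the names foo-unstable and barWith are hypothetical, and the local path is only an example), the definitions might look like:
# An “unstable” variant of foo, built straight from a local checkout
foo-unstable = stdenv.mkDerivation {
  name = "foo-unstable";
  src  = ~/Programming/foo;
};

# bar’s foo dependency becomes a parameter...
barWith = foo: stdenv.mkDerivation {
  name = "bar";
  src  = latestGit {
    url = "https://example.com/bar.git";
  };
  buildInputs = [ foo ];
};

# ...which defaults to the ordinary, latestGit-based foo
bar = barWith foo;
A one-off development shell could then be started with something like nix-shell -E 'with import <nixpkgs> {}; barWith foo-unstable' (assuming these are exposed through config.nix like the other packages), leaving the regular bar untouched.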