Last updated: 2019-10-26 22:17:27 +0100
Upstream URL: git clone http://chriswarbo.net/git/haskell-te.git
Contents of README follows
Contact: chriswarbo@gmail.com
Homepage: http://chriswarbo.net/git/haskell-te
Mirrors:
This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License along with this program. If not, see http://www.gnu.org/licenses/.
This repository provides commands for performing “theory
exploration”, mostly in the Haskell programming language. Theory
exploration describes the task of taking in some function definitions
and outputting conjectures about those functions. For example, given the
list functions append
and nil
, we might get
the following conjectures:
append v0 nil = v0
append nil v0 = v0
append v0 (append v1 v2) = append (append v0 v1) v2
In which case, we’d find that append
and
nil
form a monoid for lists.
Currently, we’re limited to finding equations. We rely on the excellent QuickSpec library to do most of the work, but we also use and provide many complementary tools. We use Nix to tame the menagerie of languages and systems ‘under the hood’.
We can think of these tools as “dynamic analysers” (as opposed to static analysers), since we discover properties of the given code by running it, over and over, in many combinations. Haskell is well suited to this, since its “purity” prevents functions relying on any ambient ‘context’: everything a Haskell function needs should be taken in as an argument. The type system tells us which functions we can combine together, and what arguments they should be given.
QuickSpec uses this knowledge, along with data generators for the popular test framework QuickCheck, to find combinations of the given functions which seem to behave the same way on many inputs. We conjecture that such expressions are (beta) equivalent.
Whilst QuickSpec is very useful, it has a few weaknesses which we seek to address:
Here, we use “Haskell package” to mean a directory containing a
Haskell project, with a .cabal
file, etc.
The Nix package defined in default.nix
provides the
following commands:
quickspec
takes the path to a Haskell package as an
argument, explores the code it finds inside, and writes the conjectures
it discovers to stdout.renderEqs
reads conjectures from stdin and
pretty-prints a more human-friendly syntax to stdout.reduce-equations
reads conjectures from stdin and
writes to stdout only those which aren’t “redundant” (i.e. it removes
anything which is just a special case of some other conjecture).haskellPkgToAsts
takes the path to a Haskell package as
an argument and outputs a JSON array detailing the function definitions
it finds inside.quickspecAsts
will explore the equations described by a
JSON array given on stdin, in the same format outputted by
haskellPkgToAsts
. The relevant Haskell packages should also
be given as an argument, so GHC knows where to find them.concurrentQuickspec
takes a “2D array” of JSON on
stdin, where each element of the outer array is a JSON array as per
quickspecAsts
. A QuickSpec process is launched for each,
and the conjectures from all of them are pooled.The intermediate JSON data is useful if we want to remove particular
functions from being explored, or if we want to explore functions from
multiple packages together. Tools like jq
are handy for
manipulating this JSON.
We can start a shell with these commands in scope as follows:
$ nix-shell --show-trace -p 'import ./. {}'
After much building and testing, this should drop us to a “nix shell”
with an augmented PATH
:
[nix-shell]$ command -v quickspecAsts
/nix/store/...-haskell-theory-exploration/bin/quickspecAsts
Now we need some code to explore; you’ll probably be using your own projects, but here we’ll fetch a package from Hackage:
[nix-shell]$ cabal get dlist
Downloading dlist-0.8.0.2...
Unpacking to dlist-0.8.0.2/
First let’s try exploring it, by passing the directory path to
quickspec
:
[nix-shell]$ quickspec dlist-0.8.0.2 | tee dlist.eqs
...the usual messages from nix, haddock, and cabal...
Checking module availability
Found modules
Building type-extraction command
Extracting types
Building scope-checking command
Checking scope
Outputting JSON
Finished output
Set DEBUG=1 if you want to see gory GHCi output.
This stage is tricky. Set DEBUG=1 to see the debug info.
Getting ASTs
Getting types
Getting scope
Tagging
Tagged
...more messages from nix, haddock and cabal...
[
{
"relation": "~=",
...
},
...
]
This can take a while, and you might want to keep an eye on GHC’s
memory usage. The messages in the middle are from our AST extraction
system as it finds functions definitions in the dlist
packages, along with their types, arities, etc.
The result is a bunch of conjectured equations involving functions
from the dlist
package. These are in our JSON format,
suitable for piping into other tools. Since we only want to read them,
we’ll pipe them into renderEqs
:
[nix-shell]$ renderEqs < dlist.eqs
((foldr v0) v1) empty ~= v1
(cons v0) empty ~= singleton v0
(map v0) empty ~= empty
(snoc empty) v0 ~= singleton v0
(apply empty) v0 ~= v0
head empty ~= undefined
tail empty ~= undefined
(append empty) empty ~= empty
(snoc ((replicate v0) v1)) v1 ~= (cons v1) ((replicate v0) v1)
(snoc (singleton v0)) v1 ~= (cons v0) (singleton v1)
head (singleton v0) ~= v0
tail (singleton v0) ~= empty
toList (fromList v0) ~= v0
(append ((unfoldr v0) v1)) (singleton v2) ~= (snoc ((unfoldr v0) v1)) v2
(append ((replicate v0) v1)) (singleton v2) ~= (snoc ((replicate v0) v1)) v2
(append ((replicate v0) v1)) ((replicate v2) v1) ~= (append ((replicate v2) v1)) ((replicate v0) v1)
(append (singleton v0)) ((unfoldr v1) v2) ~= (cons v0) ((unfoldr v1) v2)
(append (singleton v0)) ((replicate v1) v2) ~= (cons v0) ((replicate v1) v2)
(append empty) ((unfoldr v0) v1) ~= (unfoldr v0) v1
(append empty) ((replicate v0) v1) ~= (replicate v0) v1
(append ((unfoldr v0) v1)) empty ~= (unfoldr v0) v1
(append ((replicate v0) v1)) empty ~= (replicate v0) v1
(append (singleton v0)) (singleton v1) ~= (cons v0) (singleton v1)
(append (singleton v0)) (fromList v1) ~= (cons v0) (fromList v1)
(append (fromList v0)) (singleton v1) ~= (snoc (fromList v0)) v1
(append empty) (singleton v0) ~= singleton v0
(append empty) (fromList v0) ~= fromList v0
(append (singleton v0)) empty ~= singleton v0
(append (fromList v0)) empty ~= fromList v0
fromList (toList empty) ~= empty
(apply ((unfoldr v0) v1)) (toList empty) ~= toList ((unfoldr v0) v1)
(apply ((replicate v0) v1)) (toList empty) ~= toList ((replicate v0) v1)
(apply (singleton v0)) (toList empty) ~= toList (singleton v0)
(apply (fromList v0)) (toList empty) ~= v0
Here we’ve found that toList
is the inverse of
fromList
, that head
is the inverse of
singleton
, etc.
We use the symbol ~=
to indicate that these are not
definitions or proven theorems, they’re just conjectures based on many
empirical tests. The terms v0
, v1
, etc. are
free variables, which are numbered from left to right per equation; the
prefix contains enough v
s to prevent clashing with any
function name (e.g. if we had a function called verbose
, or
even v0
, the variables would render as vv0
,
vv1
, etc.)
Notice that there are several equations of the form
(append empty) x = x
, but that more general statement
itself wasn’t conjectured. The reason is that x
would be a
variable of type DList a
(in fact
DList Integer
, since we monomorphise), but the
dlist
package doesn’t expose a DList a
instance of QuickCheck’s Arbitrary
type class. Without
this, we can’t test equations involving such variables, so we don’t
include them and hence the general equation wasn’t found.
If we want more control over what gets explored, we can use
haskellPkgToAsts
to dump out a package’s function
definitions as JSON, without exploring them immediately:
[nix-shell]$ haskellPkgToAsts dlist-0.8.0.2 | tee dlist.asts
...output from nix, haddock, cabal, ...
[{"name":"replicate","version":"0.8.0.2",...},...]
Tools like jq
can manipulate this JSON. Here we’ll take
functions from the LazyLength
module of the
list-extras
package, and functions from dlist
which have List
in their name, and combine them into one
array:
[nix-shell]$ cabal get list-extras
[nix-shell]$ haskellPkgToAsts list-extras-0.4.1.4 > list-extras.asts
[nix-shell]$ jq -n --argfile le list-extras.asts \
--argfile dl dlist.asts \
'($dl | map(select(.name | contains("List")))) +
($le | map(select(.module | contains("LazyLength"))))' |
tee combined.asts
We can explore these together by using quickspecAsts
.
Note that we provide both package directories as arguments, so
GHC can find all of the necessary code:
[nix-shell]$ quickspecAsts dlist-0.8.0.2 \
list-extras-0.4.1.4 < combined.asts | renderEqs
(lengthCompare v0) v0 ~= (lengthCompare v1) v1
v0 ~= toList (fromList v0)
In this case we didn’t find any equations involving both
lengthCompare
and
toList
/fromList
, so there’s no need to explore
them together. Instead, we could have explored them in separate
processes using concurrentQuickspec
.
We need a different jq
command, since the one above uses
(...) + (...)
to append two arrays together, resulting
in:
[{"name":"lengthCompare",...},{"name":"toList",...},{"name":"fromList",...}]
This time we want multiple arrays, so we use
[...] + [...]
instead to get:
[[{"name":"lengthCompare",...}],
[{"name":"toList",...},{"name":"fromList",...}]]
Here’s the call we use:
[nix-shell]$ jq -n --argfile le list-extras.asts \
--argfile dl dlist.asts \
'[$dl | map(select(.name | contains("List")))] +
[$le | map(select(.module | contains("LazyLength")))]' |
tee separate.asts
We send these nested arrays to concurrentQuickspec
instead of quickspecAsts
, remembering to provide both
directory paths as before. We also send the resulting conjectures
through reduce-equations
to remove duplicates and
redundancies (quickspec
and quickspecAsts
currently do this automatically):
[nix-shell]$ concurrentQuickspec dlist-0.8.0.2 \
list-extras-0.4.1.4 < separate.asts |
reduce-equations | renderEqs
(lengthCompare v0) v0 ~= (lengthCompare v1) v1
v0 ~= toList (fromList v0)
The idea of concurrentQuickspec
is to work around
QuickSpec’s exponential worst-case complexity: turning O(2^N) into
O(k*2^(N/k)). How we should divide up one big array into multiple
smaller ones, without losing good conjectures in the process, is
currently an open question!
The design of this codebase is influenced by the following considerations:
Exploring arbitrary (Haskell) code
To be as useful as possible, we want to accept as much “normal” Haskell code as we can, rather than placing too many requirements on users. Since the vast majority of Haskell code only works with GHC, due to language extensions, dependencies, etc. we’re pretty much forced to use it.
Likewise, most code won’t build without taking specific instructions into account, like ensuring dependencies are available, preprocessors are executed, commandline flags are provided, etc. This forces us to use a build system, so we’ve opted for Cabal since it’s been around for long enough to “just work” in most situations, especially with Nix.
We’re also forced to use an “eval” mechanism, since we need to run
code supplied by the user at runtime, but GHC is an ahead-of-time
compiler. We’ve opted for nix-eval
, which spawns
runghc
in a subprocess and pipes in the given code. As the
name suggests, nix-eval
runs this subprocess in an
environment provided by Nix, which lets us fetch, build and install any
packages needed by the user’s code.
Measuring performance
We want a way to reliably measure and compare different exploration algorithms and approaches. This requires the code to be:
nix-eval
from compile
dependencies during an eval
call.We abstract over a lot of implementation details. These abstractions are leaky, so you may encounter problems.
Barring dependency clashes, the packages most likely to work are
“helper libraries” for built-in types, like list-extras
,
since they can use the Arbitrary
instances bundled with
QuickCheck (Bool
, [a]
, Maybe a
,
etc.).
Packages which define combinator libraries and simple ‘sums of
products’ datatypes should usually work, but you may need to
define/expose some Arbitrary
instances, and maybe be
selective about which functions get explored together to avoid running
out of memory.
Libraries making heavy use of type classes and records may struggle, as will those whose “data” can’t be serialised or compared (e.g. functions).
It goes without saying, but anything which invokes IO
may cause damage to your system. This should only be possible with
unsafePerformIO
and friends, but don’t blindly trust
third-party code. A welcome feature would be to enforce GHC’s “safe
haskell” mode when exploring!
Here are some known difficulties and workarounds:
Function extraction uses a GHC compiler plugin
This means we must be able to compile your code with GHC. We do this
by invoking Cabal, with specially crafted arguments in a specially
crafted environment. If your code doesn’t have an accompanying
.cabal
file we cannot extract function information
from it. We have no plans to support alternative build tools at this
time.
Note that it is still be possible to explore non-Cabal
packages (e.g. with quickspecAsts
or
concurrentQuickspec
), but haskellPkgToAsts
cannot be used to extract the function information; you’ll have to
provide it some other way.
Packages must be installable by Nix
We use Nix to set up our working environments, e.g. for compiling, for querying GHCi, for exploring, etc. If your package can’t be built by Nix then it will not work. Unlike the Cabal requirement, this cannot be worked around.
There are two ways to make a package work with Nix. The easiest is to
have a working .cabal
file, in which case we will
automatically run cabal2nix
to convert it to Nix. If this
doesn’t work, or you don’t have a .cabal
file, then you can
provide your own definition in a default.nix
file instead.
This should follow the same convention as those produced by
cabal2nix
.
Dependency hell
We use a few Haskell packages internally. Some, like
reduce-equations
, provide standalone commands which don’t
interfere with the packages being explored. Others, like
QuickSpec
and AstPlugin
, must be installed
into the same ghc-pkg
database as the packages being
explored, so there is potential for dependencies to clash.
Clashes with AstPlugin
can be worked around by avoiding
quickspec
and haskellPkgToAsts
, and instead
providing your own JSON to quickspecAsts
or
concurrentQuickspec
.
Clashes with packages used during exploration
(e.g. QuickSpec
) cannot be worked around, due to the nature
of the algorithm: functions from one must be invoked by the other, so
their dependencies must be compatible.
Overrides
We prefer to hard-code our dependency versions, to make results more deterministic and reproducible, but this may prevent some user packages from building, so we provide the following override mechanisms:
stable
which defaults to true
. If set to
false
, we will check out the latest revision of each git
repository we access, rather than the known-good revision we’ve
hard-coded. For example:
nix-shell -p 'import ./. { stable = false; }'
NIX_PATH
environment variable, e.g. most of the custom Haskell packages in
nix-support/hsOverride.nix
.HASKELL_PACKAGES
environment variable is set to
a .nix
file, that will be used as the Haskell package set
when exploring. Note that this will not affect things other
than exploration, like function extraction or setup code.nix-eval
package, but we
don’t expose its NIX_EVAL_HASKELL_PKGS
override.
That’s because we need to use it ourselves, but the
HASKELL_PACKAGES
we provide acts in a similar way.Extraction caveats
haskellPkgToAsts
looks at all of the functions defined
in the given package. In particular, it does not extract
anything which is re-exported from a different package, and it
does not include constuctors or type class methods.
If a package defines things in a “private” (non-exposed) module, those functions may be deemed unexplorable, since we can’t import the module; even if an exposed module happens to re-export them elsewhere.
We can work around these problems by defining wrapper functions (e.g. in a “helper” package). For example:
-- Wrap constructors in functions
nil = []
cons = (:)
-- Wrap methods in functions
listEq :: Eq a => [a] -> [a] -> Bool
listEq = (==)
-- Wrap unexposed definitions in functions
publicFoo = privateFoo
-- Wrap re-exported names in functions
definedInPublic = reexportedPrivateName
There is also an edge case if we try to explore a function whose module doesn’t expose the relevant types. We won’t be able to add variables for these types, since the type names won’t be in scope. Unfortunately this will die ungracefully with a GHC error.
We can work around this by re-exporting the desired types from a module which we know will get imported. For example, we could wrap the relevant functions, and also expose the types:
module ExplorationHelper (MyType1, MyType2, myFunc) where
import MyPkg.Types (MyType1, MyType2)
import qualified MyPkg.Funcs (myFunc)
myFunc = MyPkg.Funcs.myFunc
Polymorphism, higher kinds, Arbitrary, Ord, etc.
QuickSpec’s exploration algorithm relies on the use of free variables, and testing by plugging many different values into these variables to see what happens.
To do this, we need instances of QuickCheck’s Arbitrary
typeclass for each function’s inputs, so we can generate random values.
We also need the outputs to be comparable, either by having an instance
of Ord
(we could use Eq
, but it would require
O(n^2) comparisons) or an instance of Serialize
.
We use the latter if possible, since serialised data can be easily hashed to a fixed-size value, which may save time and memory as our function outputs grow.
Another consideration is that we need code to be monomorphic: we
can’t generate random inputs if we don’t know which particular type to
generate values of. For example, to explore
id :: (Arbitrary a) => a -> a
, how do we choose which
a
to use?
We solve this by ‘monomorphising’: replacing type variables with a
particular concrete type. Currently, we replace variables of kind
*
with Integer
and those of kind
* -> *
with []
(i.e. lists). If the
variable has constraints, and these aren’t satisfied by
Integer
or []
, then we use Template Haskell to
look up a valid instance in the module’s scope; if there are multiple
possibilities, one is chosen non-deterministically.
These instance-finding mechanisms are rather basic and could certainly be improved, but more sophisticated approaches (e.g. logic programming) would be better off being developed as a separate project.
It’s easy to work around most of these mechanisms’ failures, by simply writing down monomorphic versions of your functions, e.g.
myCrazyFunc :: Heavy Wizardry..
myCrazyFunc = ...
myMonoFunc :: Bool -> String
myMonoFunc = myCrazyFunc
One annoyance is that many projects use QuickCheck “internally”, and
hence their test suites define Arbitrary
instances, but
these aren’t provided by their library interface, so we can’t use
them.
Resource usage
We use QuickSpec version 1, which has a rather slow algorithm. In particular, memory usage may explode when exploring some functions and not for others. Characterising this behaviour is itself an interesting question.
To prevent GHC eating all of your resources, it can be killed after a
time limit by setting the MAX_SECS
env var, or memory limit
by setting MAX_KB
.
default.nix
benchmarks/README
.issues
and managed by Git.