SK logic in egglog: part 2, property-checking extensionality
Update: part 3 is here
This is a spiritual follow-on from part 1 of my experiments with SK logic in egglog: that post serves as the motivation for this one, but the content here is somewhat standalone. It’s the result of a rabbit hole I entered, after getting results that were so unexpected and confusing that it made me question some of the assumptions I’d been making in those experiments.
I was modelling a small programming language called SK
(combinatory) logic; and whilst I’m no stranger to being wrong about
my code, I had chosen SK since it’s very simple and
very familiar. Implementing SK is like writing “hello world”:
something I’ve done countless times in all sorts of languages and
paradigms, hence why it was an obvious choice when I wanted to learn
egglog. In fact, combinatory
logic even appears in egglog’s examples/
folder!
However, an innocuous-looking one-line extension brought the whole thing
crashing down: all structure vanished and every term collapsed into one
big blob, where everything equalled everything else.
Note for non-logicians: That’s not good!
Despite several rewrites, trying various different encodings, and much scouring of egglog’s Rust source, I kept running into the same problem. This made me seriously doubt my understanding, particularly of free variables. If I could spend years immersed in the worlds of mathematics, computer science, post-graduate academia, and software development; yet fail to grasp such a fundamental aspect in all that time; then what else might I be wrong about? Am I a complete fraud?
Thankfully I also trained as a physicist, so I know what to do when theoretical understanding goes awry: experiment! Or, as software engineers call it: test! I wanted to thoroughly test the assumptions I was making in those egglog experiments, so I went crawling back to the comfort of Haskell where I could use my favourite hammer: property-based testing. This would also let me try out the new falsify package (which I’ve blogged about before): in essence, to try and “falsify myself”, to figure out what I’d misunderstood.
This page gives the background of what I was attempting; along with
executable Haskell code, properties it should satisfy, and
falsify
checks of those properties. The latter are defined
using active code.
The setup
I’ll explain some of this jargon below, but in short: that problematic line of code which broke everything was my attempt to implement extensional equality; the idea that if two things always behave in the same way, then we can consider them equal. To specify what “always” means, we need some notion of universal quantification: that doesn’t quite fit into the formal system of egglog, so my idea was to approximate universally-quantified variables using symbolic execution. This should be a conservative approach: failing to spot some equalities, but never mistakenly claiming two things to be equal. Instead, it seemed to do the extreme opposite: always claiming that everything is equal!
Preamble boilerplate…
Here’s the initial boilerplate for the script; it just needs GHC with
falsify
in its package database. For more details on how
it’s executed, see this
page’s Markdown source.
module Main (main) where
import Control.Applicative ((<|>))
import Control.Exception (assert)
import Data.Maybe (isJust)
import GHC.Natural (Natural)
import System.Environment (getEnv)
import Test.Falsify.Generator (Gen, Tree(..))
import qualified Test.Falsify.Generator as Gen
import Test.Falsify.Predicate (satisfies)
import Test.Falsify.Property (testGen)
import Test.Falsify.Range (Range)
import qualified Test.Falsify.Range as Range
import Test.Tasty (defaultMain)
import Test.Tasty.Falsify (testProperty)
-- Useful helpers. These are available in libraries, but I want this code to be
-- self-contained (other than falsify)
-- | Like 'uncurry', but for three arguments
uncurry3 :: (a -> b -> c -> d) -> (a, b, c) -> d
= f a b c
uncurry3 f (a, b, c)
-- | Combine two results into a single tuple
tuple2 :: Applicative f => f a -> f b -> f (a, b)
= (,) <$> a <*> b
tuple2 a b
-- | Combine three results into a single tuple
tuple3 :: Applicative f => f a -> f b -> f c -> f (a, b, c)
= (,,) <$> a <*> b <*> c
tuple3 a b c
-- | Combine four results into a single tuple
tuple4 :: Applicative f => f a -> f b -> f c -> f d -> f (a, b, c, d)
= (,,,) <$> a <*> b <*> c <*> d
tuple4 a b c d
-- | Like a list, but never ends.
data Stream a = Cons { sHead :: !a, sTail :: Stream a } deriving (Functor)
instance Show a => Show (Stream a) where
show (Cons x _) = "[" ++ show x ++ ",...]"
sPrefix :: [a] -> Stream a -> Stream a
= ys
sPrefix [] ys :xs) ys = Cons x (sPrefix xs ys)
sPrefix (x
-- | Drop elements from a 'Stream' which don't satisfy the given predicate.
sFilter :: (a -> Bool) -> Stream a -> Stream a
Cons x xs) = (if p x then Cons x else id) (sFilter p xs)
sFilter p (
-- | Extract the nth element of a 'Stream'.
sAt :: Integral n => n -> Stream a -> a
Cons x xs) = if i <= 0 then x else sAt (i - 1) xs
sAt i (
-- | All prefixes of the given Stream, e.g. [], [x], [x, y], [x, y, z], ...
sInits :: Stream a -> Stream [a]
Cons x xs) = Cons [] ((x:) <$> sInits xs)
sInits (
-- | All String values, in order of length
allStrings :: Stream String
= go 0
allStrings where strs 0 = [""]
= [c:s | c <- [minBound..maxBound], s <- strs (n - 1)]
strs n = sPrefix (strs n) (go (n + 1))
go n
-- | Like a list, but only stores data at the end. Useful for representing
-- | general recursion.
data Delay a = Now a | Later (Delay a) deriving (Eq, Functor, Ord, Show)
instance Applicative Delay where
Now f) <*> x = f <$> x
(Later f) <*> x = Later (f <*> x)
(pure = Now
instance Monad Delay where
Now x) >>= f = f x
(Later x) >>= f = Later (x >>= f)
(
-- | Unwrap (up to) a given number of 'Later' wrappers from a 'Delay' value.
runDelay :: Integral n => n -> Delay a -> Delay a
| n <= 0 = x
runDelay n x @(Now _) = x
runDelay _ xLater x) = runDelay (n - 1) x
runDelay n (
-- | Try to extract a value from the given 'Delay', or use the given default
runDelayOr :: Integral n => a -> Delay a -> n -> a
= case runDelay n x of
runDelayOr def x n Now x' -> x'
Later _ -> def
SK (combinatory logic)
Combinatory logic, which I’ll shamelessly abbreviate as “SK”, is a
particularly simple programming language. It can be written entirely
with two symbols, traditionally called S
and
K
, along with parentheses for grouping symbols together.
We’ll represent SK expressions using the following Haskell datatype
called Com
(this
representation also matches my definition in egglog):
data Com = C String | App Com Com deriving (Eq, Ord)
That deriving
clause asks Haskell to automatically implement some useful
interfaces:
Eq
provides the==
function. Deriving this lets us check whether twoCom
values are identical (same structure and sameString
data).Ord
provides comparisons like<
. Deriving this gives us a lexicographic ordering, but the details aren’t important: as long as we have some implementation ofOrd
, we can use efficientSet
datastructures.
We’ll implement the Show
interface
ourselves, to render expressions in traditional SK notation:
instance Show Com where
show (C c) = c
show (App x (App y z)) = show x ++ "(" ++ show (App y z) ++ ")"
show (App x y) = show x ++ show y
We represent the “primitives” S
and K
using
the C
constructor (annoyingly, value names in Haskell must begin with a
lowercase letter; so we call these s
and k
instead):
-- | Primitive SK expressions
k :: Com
s,= C "S"
s = C "K" k
App
groups expressions together two at a time, like a pair of parentheses.
Traditional SK notation only shows parentheses “on the right”, e.g.
App s (App s k)
is written S(SK)
, but App (App s k) k
is simply SKK
. The following functions extract the left and
right children of an App
value
(returning Maybe
, in case
they’re given a primitive expression):
right :: Com -> Maybe Com
left,App x _) = Just x
left (= Nothing
left _ App _ y) = Just y
right (= Nothing right _
Anatomy of SK expressions
We can refer to various parts of an SK expression using the following terminology:
- The “root” is the top-most constructor.
- The “head” is the left-most leaf.
- The “spine” is the chain of left-nested
App
constructors between the head and the root. If the expression is a leaf, its spine is empty. - The “arguments” (or “args”) of that head are the right-hand children hanging off the spine, ordered from the head to the root. If the spine is empty, there are no arguments.
Examples can be seen in the figure below. We will often describe an expression as “applying 〈its head〉 to 〈its arguments〉”.
Running SK expressions
We can “run” SK expressions using two rewriting rules:
- An expression applying
K
to two args can be replaced by the first arg. - An expression applying
S
to three args can be replaced by an expression which applies the first arg to the third arg and «the second applied to the third».
We can implement these as Haskell functions, returning Nothing
if the
rule didn’t match:
-- | Replaces Kxy with x, otherwise 'Nothing'
stepK :: Com -> Maybe Com
App (App (C "K") x) _) = Just x
stepK (= Nothing
stepK _
-- | Replaces Sxyz with xz(yz), otherwise 'Nothing'
stepS :: Com -> Maybe Com
App (App (App (C "S") x) y) z) = Just (App (App x z) (App y z))
stepS (= Nothing stepS _
Notice that arguments can be any Com
value,
represented using the Haskell variables x
, y
and z
; but only App
, s
and k
have any effect on the output (this
will be important for symbolic execution later!).
Remarkably, these two rules make SK a universal
(Turing-complete) programming language! The following
step
function tries to apply both of these rules. It also
applies the rules recursively to sub-expressions of App
: note that
we try stepping both children at once, to prevent any reducible parts of
an expression getting “stuck” waiting for their sibling to finish. Nothing
is
returned when neither rule matched and no sub-expressions were
rewritten; we say the expression is in normal
form:
-- | Attempt to reduce a K or S combinator, or the children of an 'App'.
-- | 'Nothing' if the argument is in normal form.
step :: Com -> Maybe Com
= stepK c <|> stepS c <|> app l' r' <|> app l' r <|> app l r'
step c where app x y = App <$> x <*> y
= (left c , right c )
(l , r ) = (l >>= step, r >>= step) (l', r')
Next we’ll need to iterate the step
function until its argument
reaches normal form. We’ll distinguish normalised expressions using a
separate type called Normal
:
-- | Wraps Com values which are assumed to be in normal form
newtype Normal = N Com deriving (Eq, Ord)
instance Show Normal where
show = show . toCom
-- | Coerce a 'Normal' back to a 'Com', i.e. forget that it's in normal form.
N c) = c
toCom (
-- | 'step' the given 'Com': if it worked, returns that in 'Left'; otherwise
-- | returns the original value (now 'Normal') in 'Right'.
toNormal :: Com -> Either Com Normal
= case step c of
toNormal c Just c' -> Left c'
Nothing -> Right (N c)
Understanding Normal
…
The Normal
type
has the invariant
that it only contains Com
values
which are in normal form. More specifically, we can say step . toCom
is (extensionally) equal to const Nothing
.
Values of type Normal
act as
evidence
that the Com
they
contain is in normal form (i.e. that it will return Nothing
when
given to step
).
This invariant is not guaranteed to be the case; e.g. we can
violate it by writing code like N (App (App k s) k)
(since KSK
reduces to S
via the stepK
rule). To prevent such
violations, we avoid using the N
constructor
directly: instead always using the toNormal
function, which only returns
a Normal
value when it is found to return Nothing
from
step
(i.e. when our invariant
holds).
toNormal
is a smart
constructor: a function which only returns values of the desired
type when it’s checked they’re valid. Ideally we would enforce
usage of toNormal
, by encapsulating
the N
constructor, making the Normal
type opaque. That’s
usually achieved in Haskell using modules,
but I want the code for this post to all live in one module (to make
compilation and execution easier).
toNormal
tells us when to
stop iterating, but since SK is a universal programming language,
there’s no way to know beforehand if it ever will. We’ll account for
this by wrapping such undecidable computations in a Delay
type:
-- | Step the given 'Com' until it reaches 'Normal' form.
reduce :: Com -> Delay Normal
= case toNormal c of
reduce c Left c' -> Later (reduce c')
Right n -> Now n
Normal equivalence
The rules of SK are confluent,
meaning that every expression has at most one Normal
form,
and applying the reduction rules in any order, to any parts of an
expression, will not change its Normal
form or
the ability to reach it using those rules. As a consequence, all SK
expressions with the same Normal
form
are equivalent and interchangable.
To see this, consider two expressions X and Y with the same Normal
form
N. Whenever X occurs inside a larger expression,
confluence allows us to replace X with N without affecting the overall
result. Now imagine we instead replace X with Y: whatever that gives us, it will
be unaffected (thanks to confluence) if we reduce Y to its Normal
form.
Since the Normal
form of
Y is also N, and we already deduced that using
N is equivalent to using X, the result of using Y must also be equivalent
to using X. Hence any
expressions with the same Normal
form
are equivalent, in the sense that they can be interchanged inside any
expression without altering its Normal
form.
We’ll call this relationship “Normal
equivalence”. It’s undecidable in general, so it also gets wrapped in
Delay
:
-- | Result of comparing two values
data Compared a = Same a | Diff a a deriving (Show)
diff :: Compared a -> Bool
same,Same _) = True
same (= False
same _ Diff _ _) = True
diff (= False
diff _
-- | Uses (==) to find identical values
comparer :: Eq a => a -> a -> Compared a
= if x == y then Same x else Diff x y
comparer x y
-- | Compare the 'Normal' forms of the given 'Com' values.
normalEq :: Com -> Com -> Delay (Compared Normal)
= comparer <$> reduce x <*> reduce y normalEq x y
Property-based testing
Let’s test this normalEq
function, to see whether it actually behaves in the way our theory of SK
predicts it should. For example, every equality relation should have the
“reflexive
property”, meaning that values should be equal to themselves; so any
call like normalEq foo foo
should always result in True
.
Unfortunately, such results are buried in a Delay
which
prevents us testing them directly. Delay
values
could be a never-ending chain of Later
wrappers, like the following:
loop :: Delay a
= Later loop loop
Hence any attempt to extract a result from them must eventually give up. We’ll work around this using a parameter that counts down until we give up; an approach known as fuel:
-- | Represents a parameter for "when to give up"
type Fuel = Word
Now we can write a predicate (a function returning a boolean) to
check whether its argument is normalEq
to itself, before its Fuel
runs
out:
normalEqToItself :: (Fuel, Com) -> Bool
= runDelayOr False (same <$> normalEq x x) n normalEqToItself (n, x)
We can use this predicate to make statements, by quantifying
its argument. Tests commonly use existential
quantification, which asserts that some argument satisfies
normalEqToItself
(AKA
“example-based” testing). For example:
= assert (normalEqToItself (0, k)) (putStrLn "PASS") main
PASS
However, that isn’t really what we want to say: reflexivity doesn’t
just apply to a few hand-picked examples, it says that every
argument satisfies normalEqToItself
. Talking about “every
argument” is universal
quantification. Universally-quantified predicates are called
“properties”, so this approach is known as property-based testing.
To test a property, we give it to a “property checker” which searches for counterexamples: argument values which cause the assertion to fail. If no counterexample can be found, the test passes. Search techniques range from simple enumeration all the way up to sophisticated AI algorithms. Haskell has many property checkers, beginning with the wonderful QuickCheck package. We’ll use the state-of-the-art for 2024, which is falsify.
Data generators
falsify
searches through argument values at
random, sampling them from a given “generator” with the Haskell
type Gen a
(for
generating values of some type a
). We can build up generators using
familiar interfaces (Applicative
,
Monad
,
etc.); from primitives such as Gen.inRange
, which samples fixnums
from a Range
and is
perfect for generating Fuel
:
-- | Generates a (relatively small) amount of Fuel
= genFuelN limit
genFuel
-- | Generates up to a certain amount of Fuel
genFuelN :: Fuel -> Gen Fuel
= Gen.inRange . to
genFuelN
-- | A reasonable default for procedures requiring Fuel. Small enough to keep
-- | exponentially-growing terms from blowing up.
limit :: Fuel
= 20
limit
-- | We usually want Ranges from zero upwards
to :: Fuel -> Range Fuel
= Range.between . (0,) to
Generating recursive Com
values…
Generating Com
values is
more tricky, since they are recursive. A naïve generator that simply
calls itself recursively will make Com
values of
exponential size: either blowing up memory (if the recursive
case App
is likely to be chosen) or being limited to a handful of tiny values (if
the recursive case is unlikely to be chosen). To generate a diverse
spread of values, without risking out-of-memory conditions, we can use
the same Fuel
trick as
above; this time dividing it up (unevenly) between recursive calls.
This is such a useful approach that I’ve implemented it for
QuickCheck
, ScalaCheck
,
Hypothesis
, etc. over the years. In contrast,
falsify
provides it out-of-the-box! The Gen.tree
function generates binary
Tree
values of a given size, so we just need to transform those into the
Com
values we need:
-- | Create a Com from a binary Tree (with unit values () at the nodes)
treeToCom :: Tree () -> Com
= case t of
treeToCom t -- The smallest Trees have one and two leaves. Turn them into K, since that
-- has the simplest step rule.
Leaf -> k
Branch _ Leaf Leaf -> k
-- There are two Trees with three leaves. Turn those into S.
Branch _ Leaf (Branch _ Leaf Leaf) -> s
Branch _ (Branch _ Leaf Leaf) Leaf -> s
-- For larger Trees, we recurse on their children and combine using App
Branch _ l r -> App (treeToCom l) (treeToCom r)
-- | Generate a Com, with size bounded by the given Fuel
genComN :: Fuel -> Gen Com
= treeToCom <$> Gen.tree (to n) (pure ())
genComN n
-- | Generate (relatively small) Com values
= genComN limit genCom
falsify
integrates with Haskell’s tasty
test framework. Normally a project would declare a big test suite to run
all at once, but for this literate/active style we’ll be testing things
as we go, using the following function:
-- | Turn the given predicate into a falsify property, universally quantified
-- | over the given generator's outputs. Check it on (at least) 100 samples.
check :: Show a => (a -> Bool) -> Gen a -> IO ()
pred gen = do
check <- getEnv "NAME" -- Avoids repetition
name pred)) gen)) defaultMain (testProperty name (testGen (satisfies (name,
Included middles
The observant among you may have noticed that normalEqToItself
does
not hold for all argument values! falsify
can
generate a counterexample to show us why:
= check normalEqToItself (tuple2 genFuel genCom) main
normalEqToItself: FAIL (0.04s)
failed after 14 successful tests and 9 shrinks
not (normalEqToItself generated)
generated: (0,KKK)
Logs for failed test run:
Use --falsify-replay=01040523b283c5642978889c9d68d5d40d to replay.
1 out of 1 tests failed (0.05s)
The precise counterexample falsify
finds may vary
depending on the random seed, but they’ll all have the following in
common: the Fuel
will be
0
,
whilst the Com
expression will not be in normal form. For
example, it may produce KKK
(the Haskell value App (App k k) k
).
stepK
will reduce that to a
single K
, so whilst the value KKK
is
equal to itself (as we expect), normalEq
wraps it in a Later
constructor, which may cause normalEqToItself
to give up for
some Fuel
parameters: in particular, when given 0
Fuel
. This
disproves our claim that the property holds for any amount of
Fuel
.
We could alter our claim to existentially quantify the
amount of Fuel
: that for
any SK expression x
, there is
some value of n
where
runDelayOr False (same <$> normalEq x x) n
holds. However that is also false, since there are infinite
loops which have no Normal
form;
hence never produce a Now
value; and therefore cannot have
a result extracted regardless of how much Fuel
we use.
Even when a Normal
form
does exist, there is no way to bound the amount of Fuel
required
to find it.
The correct way to fix our claim is to universally quantify x
and n
as before, but add two negations to
our predicate: instead of claiming every expression is equal to itself
(regardless of Fuel
), we
claim that every expression is not unequal to itself
(regardless of Fuel
):
notUnnormalEqToItself :: (Fuel, Com) -> Bool
= runDelayOr True (not . diff <$> normalEq x x) n notUnnormalEqToItself (n, x)
= check notUnnormalEqToItself (tuple2 genFuel genCom) main
notUnnormalEqToItself: OK (0.04s)
100 successful tests
All 1 tests passed (0.04s)
You may have learned in school that double-negatives are redundant: that anything “not false” must be “true” (the law of the excluded middle). However, we’re not dealing with the true/false ideals of classical logic, which ignore computation and its associated undecidability. All we have are the more pragmatic falsified/unfalsified results of a property checker, where there are non-excluded middles such as “don’t know”, “timed out” and “gave up”!
Extensional equality
My problems arose when trying to extend the equality of SK expressions even further to include extensional equality. To explain how this works, we’ll need a bit more terminology…
Inputs and agreement
To understand the behaviour of a particular SK expression, we can see
what happens when it’s applied to other SK expressions. We’ll call the
latter “input values”, to emphasise that they are not part of the
original expression we’re focused on. For example, we can understand how
the expression KS
behaves by applying it to an input value
like S
: giving KSS
(which reduces to Normal
form
S
). Note that KS
contains head K
and argument S
; whilst the resulting expression
KSS
also contains the input value S
as an
extra argument.
If we apply two expressions to the same input value, and the results
reduce to the same Normal
form,
we say those two expressions “agree on” that input value. For example,
KK
and SKK
agree on the input value
K
, since KKK
reduces to K
; and
SKKK
reduces to KK(KK)
then to K
.
Note that they disagree on the input value S
,
since KKS
reduces to K
; whilst
SKKS
reduces to KS(KS)
then to S
.
Expressions can also agree on a sequence of input values, where
we apply them both to the first value, then apply those results to the
second value, and so on. We’ll represent such sequences using a Stream
, which
is like a list except there is no “nil” case, so it goes on forever:
type InputValue = Com
type InputValues = Stream InputValue
-- | Apply the given 'Com' expressions to more and more of the 'InputValues',
-- | checking whether they reach the same 'Normal' form.
agree :: Com -> Com -> InputValues -> Stream (Delay (Compared Normal))
Cons x xs) = Cons (normalEq f g) (agree (App f x) (App g x) xs) agree f g (
Everything that satisfies normalEq
should also satisfy agree
, which we can state with the
following property:
normalEqImpliesAgree :: (Fuel, Com, Com, InputValues) -> Bool
=
normalEqImpliesAgree (n, f, g, xs) case runDelay n (tuple2 (normalEq f g) (sHead (agree f g xs))) of
Now (Same _, Diff _ _) -> False
-> True
_
-- | Generate a 'Stream' of Com values, with element size bounded by 'Fuel'
genComsN :: Fuel -> Gen InputValues
= Cons <$> genComN fuel <*> genComsN fuel
genComsN fuel
-- | Generate (relatively small) lists of Com values
genComs :: Gen InputValues
= genComsN limit genComs
= check normalEqImpliesAgree (tuple4 genFuel genCom genCom genComs) main
normalEqImpliesAgree: OK (0.07s)
100 successful tests
All 1 tests passed (0.07s)
We say expressions “agree on N inputs” when they agree on
every sequence of N
input values; i.e. the input values are universally-quantified, but the
length of the sequence is not. For example, SK
and
S(K(SK))(KK)
agree on two inputs; which we can test by
asserting that they never disagree:
skNeverDisagreesWithSKSKKK :: (Fuel, InputValues) -> Bool
=
skNeverDisagreesWithSKSKKK (n, xs) True (not . diff <$> sAt (2 + n) (agree f g xs)) n
runDelayOr where f = App s k
= App (App s (App k (App s k))) (App k k) g
= check skNeverDisagreesWithSKSKKK (tuple2 genFuel genComs) main
skNeverDisagreesWithSKSKKK: OK (1.14s)
100 successful tests
All 1 tests passed (1.14s)
Note that any expressions which agree on N inputs also agree on N + 1 inputs (and so on), since the
left child of each root has the same Normal
form
(by definition of agreement on N inputs).
agreementIsMonotonic :: ((Fuel, Fuel), (Com, Com), InputValues) -> Bool
=
agreementIsMonotonic ((n, m), (f, g), xs) case runDelay n (tuple2 (sAt n (agree f g xs))
+ m) (agree f g xs))) of
(sAt (n Now (Same _, Diff _ _) -> False
-> True _
= check agreementIsMonotonic (tuple3 (tuple2 genFuel genFuel)
main
(tuple2 genCom genCom ) genComs)
agreementIsMonotonic: OK (0.85s)
100 successful tests
All 1 tests passed (0.85s)
Extensionality
Now we’ve defined inputs and agreement, extensional equality becomes
quite simple: SK expressions are extensionally equal iff they agree on
some number of inputs. We saw that SK
and
S(K(SK))(KK)
agree on some number of inputs
(namely: on two inputs), so they are extensionally equal. Expressions
which are normally equivalent are also extensionally equal, since they
agree on zero inputs.
Notice that the number of inputs is existentially-quantified, whilst the values of those inputs are universally-quantified. This makes extensional equality difficult to determine:
- If we compare two expressions on N input values and see that they agree, that doesn’t mean they’re extensionally equal; since there might be other input values of length N for which they disagree.
- If we compare two expressions on N input values and see that they disagree, that doesn’t mean they’re not extensionally equal; since they might only agree on inputs longer than N.
In order to make more definitive assertions about extensional equality, we need to borrow techniques from the world of formal methods!
Symbolic execution
The reason I like property-checking is that it can provide a lot of confidence for relatively little effort, compared to e.g. example-based testing (less confidence for a similar effort), manual testing (less confidence for more effort) or formal verification (more confidence for more effort). However, there are occasions when the scales tip in favour of other approaches, and extensional equality checking turns out to be particularly suited to a formal method called symbolic execution.
In this approach, we avoid using real, “concrete” SK expressions for
our input values; and instead use abstract
symbols. These can be copied, discarded or rearranged during program
execution; which is perfect for seeing where different inputs end up in
the resulting Normal
form.
They are otherwise inert or “uninterpreted”: causing reduction to stop
if it would depend on the particular value of an input.
Implementing symbolic execution can often be laborious, but our
definition of Com
actually
makes it trivial to graft on top: we can represent abstract symbols
using the C
constructor
applied to any String
(except "S"
or "K"
,
which we’re treating specially). Execution of the stepK
and stepS
functions will treat these as atomic
symbols, to be copied, discarded and rearranged as needed.
Once we’ve normalised a Com
value
containing symbols, we need a semantics, or abstract
interpretation, to understand what they mean. Since we’re using each
symbol to represent a distinct input, the way they appear in Normal
forms
tells us something about the overall behaviour of the expression and,
crucially, how it depends on the particular concrete Inputs
it may
be applied to. It’s undecidable whether part of an SK expression will
influence its reduction (or just get discarded by the stepK
rule), so we’ll limit our
ambitions to understanding a few simple situations.
Symbolic agreement proves extensional equality
Expressions which agree on symbolic inputs will agree on all inputs, since symbols do not reduce, and hence those inputs must have been irrelevant for the reductions which lead to the agreement. Unlike property-checking, where we rely on double-negatives to “fail to disprove agreement”, this is real positive proof that expressions will always agree, and hence we can claim definitively that they are extensionally equal.
We don’t need to implement this check specially, since we can reuse
agree
, just using symbolic
values as our inputs instead of concrete SK expressions:
-- | Synonym to indicate when we're dealing with uninterpreted symbolic values
type Symbol = String
-- | All Symbols except "S", "K" and "" (which doesn't show well)
symbols :: Stream Symbol
= sFilter keep allStrings
symbols where keep s = s /= "" && s /= "S" && s /= "K"
The following agreeSym
function applies two expressions to more and more symbolic inputs, to
see if they reduce to the same Normal
form:
-- | Checks whether two Com values agree on more and more symbolic input values.
agreeSym :: Com -> Com -> Stream (Delay (Compared Normal))
= agree f g (C <$> symbols) agreeSym f g
The return type of agreeSym
is a little awkward, since it’s “two dimensional”: travelling further
along the Stream
applies
more and more symbolic inputs; travelling further along any of those
Delay
values applies more and more steps to that expression. To get at the
results inside (assuming any of those expressions has a Normal
form)
we need a linear path which traverses this structure. We can’t use
breadth-first search since the Stream
never
ends; we also can’t use depth-first search, since a Delay
might
never end. The following race
function is a bit smarter: it acts like a round-robin scheduler, each
iteration running more expressions until one of them finishes (if
ever):
-- | Runs every 'Delay' element in an interleaved fashion until one of them
-- | produces a 'Now'. Returns that result, its index in the 'Stream', and a
-- | 'Stream' of the remaining 'Delay' elements.
race :: Stream (Delay a) -> Delay (a, Natural, Stream (Delay a))
= go [] [] s
race s where go ran run next = case (run, next) of
Cons z zs) -> go [] (reverse (z:ran)) zs
( [], Later y:ys, zs) -> Later (go (y:ran) ys zs)
(Now y:ys, zs) -> Now ( y
( fromIntegral (length ran)
, reverse ran ++ ys) zs
, sPrefix ( )
Checking race
…
The race
algorithm is a bit
tricky, so we’ll double-check it with a few property tests. First some
helper functions:
-- | Wrap the given number of 'Later' constructors around '()'
| n <= 0 = Now ()
wrapLater n = Later (wrapLater (n - 1))
wrapLater n
-- | Count the number of 'Later' wrappers around a 'Now' value. Gives up if the
-- | given 'Fuel' runs out.
countLaters :: Integral n => Fuel -> Delay a -> Maybe n
= go 0
countLaters where go !n _ ( Now _) = Just n
0 _ = Nothing
go _ !n m (Later x) = go (n + 1) (m - 1) x
go
= countLaters (n + m) (wrapLater n) == Just n countMatchesWrap (n, m)
= check countMatchesWrap (tuple2 genFuel genFuel) main
countMatchesWrap: OK
100 successful tests
All 1 tests passed (0.01s)
= case runDelay n (wrapLater n) of
runUndoesWrap n Now _ -> True
-> False _
= check runUndoesWrap genFuel main
runUndoesWrap: OK (0.01s)
100 successful tests
All 1 tests passed (0.01s)
= countLaters m (runDelay n (wrapLater (n + m)) ) == Just m runStops (n, m)
= check runStops (tuple2 genFuel genFuel) main
runStops: OK (0.02s)
100 successful tests
All 1 tests passed (0.02s)
Now on to race
itself: it
searches for a Now
value
using a triangular “flood fill” approach, illustrated below:
The numbers in the first column (i.e. which steps race
allocates to running the first
element of the Stream
) are
precisely the triangular numbers.
We can check this, by counting the number of Later
wrappers
race
produces for a Stream
whose
first element has a known number of Later
wrappers, and whose other elements are never-ending loops (and hence
don’t contribute any Now
values of
their own):
-- | Calculates the nth triangular number https://oeis.org/A000217
triangle :: (Integral n, Integral m) => n -> m
= fromIntegral . go . fromIntegral . max 0
triangle where go (n :: Natural) = (n * (n + 1)) `div` 2
loops :: Stream (Delay a)
= Cons loop loops
loops
raceFirstUsesTriangularFuel :: Fuel -> Bool
=
raceFirstUsesTriangularFuel n Cons (wrapLater n) loops)) == Just tn
countLaters tn (race (where tn = triangle n
= check raceFirstUsesTriangularFuel genFuel main
raceFirstUsesTriangularFuel: OK (0.03s)
100 successful tests
All 1 tests passed (0.04s)
At the other extreme, race
will run the first step of element n
just before it goes back to the
first element; i.e. at step triangle (n + 1) - 1
(this is OEIS A000096):
=
raceHitsNthElementBeforeNextTriangularNumber n == Just x
countLaters x (race (prog n)) where x = triangle (n + 1) - 1
0 = Cons (Now ()) loops
prog = Cons loop (prog (m - 1)) prog m
= check raceHitsNthElementBeforeNextTriangularNumber genFuel main
raceHitsNthElementBeforeNextTriangularNumber: OK (0.04s)
100 successful tests
All 1 tests passed (0.04s)
We can use race
to check
whether two expressions ever agree on symbolic inputs, and therefore
whether they’re extensionally equal:
-- | 'True' iff the given expressions ever agree for any number of inputs.
everAgree :: Com -> Com -> Delay Bool
= race (agreeSym x y) >>= go
everAgree x y where go (Same _ , _, _) = Now True
Diff _ _, _, s) = race s >>= go go (
Note that everAgree
is more
general than normalEq
, since the
latter only checks for agreement on 0
inputs; although it requires more Fuel
to
account for running it interleaved with 1 input, 2
inputs, etc. (see the fold-out section above for more details):
normalEqImpliesEverAgree :: (Fuel, Com, Com) -> Bool
=
normalEqImpliesEverAgree (n, x, y) if go (same <$> normalEq x y) n
then go ( everAgree x y) (triangle n)
else True
where go = runDelayOr False
= check normalEqImpliesEverAgree (tuple3 genFuel genCom genCom) main
normalEqImpliesEverAgree: OK (0.05s)
100 successful tests
All 1 tests passed (0.05s)
However, everAgree
is not yet
a predicate for checking extensional equality, since the Compared
values returned by agreeSym
don’t represent “equal/unequal”; only “equal/unsure”. Hence the everAgree
function claims to
return Delay Bool
,
which we can think of as “at most one boolean”. Yet its result is not
really a boolean, since it can never be False
! It
would be more accurate to return a unit value Now ()
, of
type Delay ()
. That
type is isomorphic to the
natural numbers, counting how long it took to prove equality. That
can’t be a predicate, since it begs the
question!
A useful predicate for extensional equality not only needs
to return True
for some
inputs it’s sure are equal; but also return False
for some
inputs it’s sure are unequal. When it’s unsure (which is unavoidable due
to undecidability), it can use a never-ending chain of Later
wrappers
to avoid ever returning at all!
Different symbolic heads prove disagreement
We can easily show that two expressions disagree for N inputs, by checking if the Nth element of agreeSym
results in Diff
. To be
sure they are not extensionally equal, we also need to prove
they will disagree for all inputs longer than N.
A simple way to prove this is when the head of each expression is a
different symbolic input. In that case, their Normal
forms
depend entirely on those inputs, so disagreement can be “forced” by
choosing input values which reduce to different results. For example,
say Xab
is some concrete SK expression applied to two
symbolic inputs, which happens to reduce to aS
; hence its
result depends on the first input, and is independent of the
second. If some other expression Yab
reduces those inputs
to bS
, it depends on the second input, and is
independent of the first. They can be forced to disagree by choosing
concrete input values in place of a
and b
,
which reduce to values that are known to be unequal. In this example we
could use input values KS
and KK
, which
X
reduces to S
but Y
reduces to
K
. This works even if both inputs appear in a result,
e.g. ab
can still be forced independently of
b
, by choosing a concrete expression for a
which discards its argument.
This ability to “force” the result of expressions which use an input
as their head can be used to extend disagreements from N inputs to all inputs
M > N. To see
this, note that the result of applying an expression to M > N inputs is
equivalent to first applying it to N inputs (which we can force to
result in any value we like), then applying that result to the remaining
M − N inputs. We can
choose those intermediate results to be expressions that consume the
remaining M − N
inputs (e.g. through repeated use of K
) and then
disagree with each other. ∎
The following function will spot when two expressions have different symbols in head position, and are hence provably unable to agree, even with more inputs:
-- | Return the head (left-most leaf) of the given Com
headPos :: Com -> Symbol
C c) = c
headPos (App l r) = headPos l
headPos (
-- | Whether a String represents an uninterpreted Symbol (i.e. not S or K)
isSym :: String -> Bool
= s /= "S" && s /= "K"
isSym s
-- | Whether the given Com values have distinct symbolic values in head position
distinctSymbolicHeads :: Com -> Com -> Bool
= isSym hX && isSym hY && hX /= hY
distinctSymbolicHeads x y where (hX, hY) = (headPos x, headPos y)
We can perform a few sanity checks, to confirm that our argument above holds:
-- | Generate String values to represent uninterpreted symbolic values
genSymN :: Fuel -> Gen Symbol
= (`sAt` symbols) <$> genFuelN n
genSymN n
-- | Generate (relatively small) symbolic values
genSym :: Gen Symbol
= genSymN limit
genSym
-- | Generate Com values which may also contain symbols
genSymComN :: Fuel -> Gen Com
= toSymCom <$> Gen.tree (to n) (genSymN n)
genSymComN n where toSymCom t = case t of
Leaf -> k
Branch _ Leaf Leaf -> k
Branch _ Leaf (Branch _ Leaf Leaf) -> s
Branch s (Branch _ Leaf Leaf) Leaf -> C s
Branch _ l r -> App (toSymCom l) (toSymCom r)
-- | Generate (relatively small) Com values which may contain symbols
= genSymComN limit
genSymCom
-- | Shouldn't matter which order we compare two Com values
distinctSymbolicHeadsCommutes :: (Com, Com) -> Bool
= distinctSymbolicHeads x y
distinctSymbolicHeadsCommutes (x, y) == distinctSymbolicHeads y x
= check distinctSymbolicHeadsCommutes (tuple2 genSymCom genSymCom) main
distinctSymbolicHeadsCommutes: OK (0.04s)
100 successful tests
All 1 tests passed (0.04s)
Different numbers of arguments prove disagreement
Expressions whose heads are the same symbolic input, but
applied to a different number of arguments, can also be forced to
disagree. For example, if Xab
reduces to aSb
,
and Yab
reduces to aS
, then distinctSymbolicHeads
can’t be sure if
they’re different. Yet we can still force them to disagree, by giving
them three inputs (a
, b
and
c
):
- We choose a value for
a
which consumes two arguments and reduces to the second, such asSK
. - We give
b
the valueKd
(whered
can represent any expression). - We leave
c
unconstrained.
X
and Y
will reduce these inputs as
follows:
X(SK)(Kd) c | Y(SK)(Kd)c | Xabc and Yabc
SKS(Kd) c | SKS c | Reduced per example
K (Kd)(S(Kd))c | K c(Sc) | Applied S rule to each
Kd c | c | Applied K rule to each
d | c | Applied K rule again to first
Since X
and Y
reduce to different symbols,
we can use this scheme to force their results to any value,
including unequal Normal
forms
or (as above) expressions which consume any number of subsequent inputs
then become unequal Normal
forms.
∎
The following unequalArgCount
function will spot when two expressions apply a symbol to an unequal
number of arguments, and are hence sure to be extensionally unequal:
-- | Split a Com value into its head and any arguments that's applied to
headAndArgs :: Com -> (String, [Com])
C x) = (x, [])
headAndArgs (App l r) = case headAndArgs l of
headAndArgs (-> (h, args ++ [r])
(h, args)
-- | True iff the given expressions have the same Symbol in their head, but
-- | applied to a different number of arguments.
unequalArgCount :: Com -> Com -> Bool
= isSym headX
unequalArgCount x y && headX == headY
&& length argsX /= length argsY
where (headX, argsX) = headAndArgs x
= headAndArgs y (headY, argsY)
Again, we can sanity-check this:
-- | Order of Com values shouldn't affect result
unequalArgCountCommutes :: (Com, Com) -> Bool
= unequalArgCount x y
unequalArgCountCommutes (x, y) == unequalArgCount y x
= check distinctSymbolicHeadsCommutes (tuple2 genSymCom genSymCom) main
distinctSymbolicHeadsCommutes: OK (0.03s)
100 successful tests
All 1 tests passed (0.03s)
Disagreeing arguments prove disagreement
Finally, we can force expressions to disagree by forcing a
disagreement inside them, then propagating it up to the overall
result. This is useful, since it can be achieved when each expression
has the same Symbol
at their heads (avoiding distinctSymbolicHeads
), and they apply
it to the same number of arguments (avoiding unequalArgCount
); as long as we can
force the values of those arguments to disagree.
For example, let’s say Xabc
reduces to ab
and Yabc
reduces to ac
: these have the same
symbol a
as their heads, and both apply it to a single
argument. However, the values of those arguments (b
and
c
) are provably unequal (via distinctSymbolicHeads
, in this case).
We can hence use the first input a
to propagate its unequal
argument, e.g. using the input value SKK
. Since
X(SKK)bc
reduces to b
, and
Y(SKK)bc
reduces to c
, we can force a
disagreement by choosing unequal values for b
and
c
; and this extends to more inputs by having them consume
the remainder, as before. ∎
The following function checks for the situation described above. It
checks whether argument values are unequal using a given unEq
parameter, which is more general
and useful than if we hard-coded some specific check:
-- | Whether the given Com values have matching symbols in their heads, but
-- | applied to unequal values (determined by the given unEq function)
symbolGivenUnequalArgs :: (Com -> Com -> Bool) -> Com -> Com -> Bool
= isSym headX
symbolGivenUnequalArgs unEq x y && headX == headY
&& any id (zipWith unEq' argsX argsY)
where (headX, argsX) = headAndArgs x
= headAndArgs y
(headY, argsY) = unEq a b || unEq b a -- Compare both ways round
unEq' a b
-- | The order of arguments shouldn't alter the result
symbolGivenUnequalArgsCommutes :: (Com -> Com -> Bool) -> Com -> Com -> Bool
= symbolGivenUnequalArgs f x y
symbolGivenUnequalArgsCommutes f x y == symbolGivenUnequalArgs f y x
Generating unEq
functions to
test with…
The symbolGivenUnequalArgs
function doesn’t care how arguments are deemed to be unequal,
so it lets the caller decide by passing in a function. The choice of
function shouldn’t affect commutativity, so the property symbolGivenUnequalArgsCommutes
is
universally quantified over that choice; allowing a counterexample
search to try many different functions.
This may seem surprising if you’ve not encountered it before, but we can generate a function value just as easily as a “data-like” value. For example, we can generate functions which look up their arguments in a (generated) lookup table:
genViaTable :: Eq a => Range Fuel -> Gen a -> Gen b -> Gen (a -> b)
range genA genB = do
genViaTable <- genB
def <- Gen.list range (tuple2 genA genB)
pairs pure (maybe def id . (`lookup` pairs))
However, such ordinary function values are opaque “black boxes”,
which makes them less useful for property-checking: firstly, it’s hard
to show
them (e.g. if they form part of a counterexample); and secondly it’s
hard to shrink them (to look for “simpler” alternatives).
Property-checkers avoid these problems by replacing ordinary functions
a -> b
with
their own alternative, whose constructors ‘remember’ how to show
and
shrink themselves. In falsify
this alternative is Gen.Fun a b
,
so we’ll need to convert the Com -> Com -> Bool
argument of symbolGivenUnequalArgsCommutes
into
either Fun Com (Fun Com Bool)
(curried form) or,
equivalently, Fun (Com, Com) Bool
(uncurried form). We’ll use the uncurried form, since the tuples will be
handled automatically by type class instance resolution. The following
helpers let symbolGivenUnequalArgsCommutes
take a
Fun
as argument:
-- | Transform a higher-order function to take a Fun instead
liftFun :: ((a -> b) -> c) -> Gen.Fun a b -> c
= f . Gen.applyFun
liftFun f
-- | Lift a function with binary argument to take an uncurried Fun instead
liftFun2 :: ((a -> b -> c) -> d) -> Gen.Fun (a, b) c -> d
= liftFun (f . curry) liftFun2 f
To check (the lifted version of) symbolGivenUnequalArgsCommutes
, we
need a generator of type Gen (Fun (Com, Com) Bool)
.
We can get one from Gen.fun
, but
that requires an instance of Gen.Function (Com, Com)
.
There’s an existing instance for tuples, but we still need to implement
Gen.Function Com
ourselves:
instance Gen.Function Com where
= Gen.functionMap comToPre preToCom <$> Gen.function gen function gen
This is piggybacking on an existing instance (the Gen.function gen
call at the end) by
converting our Com
values
back-and-forth to another type that already implements Gen.Function
.
The type I’ve chosen is [Maybe (Either Bool Symbol)]
,
and the conversions are implemented by comToPre
& preToCom
:
-- | Prefix notation for expressions. `Nothing` represents an 'apply' operation.
type Prefix = [Maybe (Either Bool Symbol)]
-- | Converts expressions from "applicative style" Com to "prefix style" Prefix.
comToPre :: Com -> Prefix
= case x of
comToPre x -- Treat K and S specially, so they're more likely to be generated
| x == k -> [Just (Left False)]
_ | x == s -> [Just (Left True )]
_ -- Any other symbol
C c -> [Just (Right c)]
-- Nothing acts like an apply operator for two expressions that follow it
App l r -> [Nothing] ++ comToPre l ++ comToPre r
-- | Parses "prefix style" Prefix back into an "applicative style" Com.
preToCom :: Prefix -> Com
= fst . go -- Discard any remaining suffix
preToCom where go [] = (k, []) -- Makes preToCom total, but overlaps 'Left False'
:xs) = case x of
go (xJust (Left False) -> (k , xs)
Just (Left True ) -> (s , xs)
Just (Right c ) -> (C c, xs)
Nothing -> let (l, ys) = go xs
= go ys
(r, zs) in (App l r, zs)
comToPreRoundtrips :: Com -> Bool
= preToCom (comToPre c) == c comToPreRoundtrips c
= check comToPreRoundtrips genCom main
comToPreRoundtrips: OK (0.03s)
100 successful tests
All 1 tests passed (0.03s)
preToComAlmostRoundtrips :: Prefix -> Bool
= comToPre (preToCom p') == p'
preToComAlmostRoundtrips p where p' = comToPre (preToCom p) -- Extra roundtrip to make p "correct"
genPre :: Gen Prefix
= Gen.list (to limit) genEntry
genPre where genEntry = Gen.choose (Just <$> genLeaf) (pure Nothing)
= Gen.choose (Left <$> Gen.bool False) (Right <$> genSym) genLeaf
= check preToComAlmostRoundtrips genPre main
preToComAlmostRoundtrips: OK (0.02s)
100 successful tests
All 1 tests passed (0.02s)
The Prefix
type is
a prefix
form for expressions, as opposed to the “applicative form” of Com
. They’re
equivalent, but the structure of an expression is less obvious in prefix
form, which is why it only appears in this hidden section. This is the
encoding used in binary
combinatory logic, although we’re allowing arbitrary Symbol
values
rather than just S
and K
.
The functions preToCom
and
comToPre
don’t quite
form an isomorphism. This
is essentially due to Prefix
being a
prefix(-free)
code:
- The empty list
[]
does not correspond to a particularCom
value. It a valid prefix, since appending it with anything will either form a complete expression or another valid prefix; but that’s not so useful forpreToCom
, which might have only been given a finite list and needs to return aCom
. In this case, it just returnsK
(arbitrarily). Prefix
values may contain “too many”Nothing
values. These act likeApp
, indicating that the following expression is applied to the one after. That “following expression” may itself useNothing
to consume subsequent list elements, and so on; corresponding to the arbitrary nesting whichApp
allows. However, the givenPrefix
may “run out” before specifying what those following elements should be! Again, that’s a perfectly good prefix, but not a complete value. When this happens, we will hit the same[]
case as above, and simply returnK
(potentially many times, as the applications get “popped off the stack”).- Expressions encoded as
Prefix
values are self-delimiting, meaning that the encoding itself tells us when to finish parsing: if the value we parsed wasJust _
, we have finished that expression; if the value isNothing
we can finish after parsing exactly two more expressions. Hence any elements appearing after a correctly-encoded expression will be completely ignored, resulting in the sameCom
. - Our hacky use of
K
as a fallback for emptyPrefix
values introduces some ambiguity, since an expression likeKK
could be encoded as[Nothing, Just (Left False), Just (Left False)]
(the “correct” way); or as[Nothing, Just (Left False)]
(relying on the fallback behaviour to introduce anotherK
); or indeed as[Nothing]
(relying on the fallback for bothK
expressions). Hence the fallback behaviour should not be relied upon; also, it breaks when composing expressions together (since missingK
values will only be introduced at the end of a list). - There’s also some redundancy between
Left False
/Left True
andRight "K"
/Right "S"
. This was a deliberate choice, to makefalsify
generate moreS
andK
expressions.
These aren’t a problem for our use of Prefix
in
generating values; but you may need to keep them in mind when using this
encoding for other purposes (in which case I’d recommend
embracing its nature as a prefix(-free) code, since it has all
sorts of cool applications!)
= check (uncurry3 (liftFun2 symbolGivenUnequalArgsCommutes))
main False)) genSymCom genSymCom) (tuple3 (Gen.fun (Gen.bool
symbolGivenUnequalArgsCommutes: OK (0.04s)
100 successful tests
All 1 tests passed (0.04s)
Combining disagreement provers
We can never spot all disagreements, but the simple checks
above can be combined into a single, reasonably-useful function. Not
only is this easier to use than the individual checks, but we can also
use it as the unEq
argument of
the symbolGivenUnequalArgs
function, which lets us recurse through the given expressions looking
for disagreement at any depth!
provablyDisagree :: Com -> Com -> Bool
= distinctSymbolicHeads x y
provablyDisagree x y || unequalArgCount x y
|| symbolGivenUnequalArgs provablyDisagree x y
provablyDisagreeCommutes :: (Com, Com) -> Bool
= provablyDisagree x y == provablyDisagree y x provablyDisagreeCommutes (x, y)
= check provablyDisagreeCommutes (tuple2 genSymCom genSymCom) main
provablyDisagreeCommutes: OK (0.05s)
100 successful tests
All 1 tests passed (0.05s)
An extensional equality test
Now we have ways to spot both extensional equality and
inequality (in certain situations), we can fold them over the result of
agreeSym
. We’ll begin by
extracting the numerical index provided by race
, which should tell us how many
inputs it takes for the given expressions to start agreeing:
-- | 'Just n' if the given expressions agree on 'n' inputs (they might also
-- | agree on fewer). 'Nothing' if it is determined that they won't agree; if
-- | that cannot be determined, the 'Delay' will never end.
extensionalInputs :: Com -> Com -> Delay (Maybe Natural)
= race (agreeSym x y) >>= go 0
extensionalInputs x y where go !n (Same _ , m, _) = Now (Just (n+m))
!n (Diff a b, _, s) = if provablyDisagree (toCom a) (toCom b)
go then Now Nothing
else race s >>= go (n+1)
agreeOnExtensionalInputs :: (Fuel, Com, Com, InputValues) -> Bool
=
agreeOnExtensionalInputs (n, x, y, inputs) True (extensionalInputs x y >>= check) n
runDelayOr where check Nothing = Now True
Just i) = same <$> sAt i (agree x y inputs) check (
= check agreeOnExtensionalInputs (tuple4 genFuel genCom genCom genComs) main
agreeOnExtensionalInputs: OK (0.11s)
100 successful tests
All 1 tests passed (0.11s)
This is easy to transform into a legitimate predicate for testing
extensional equality, that’s able to answer both True
and False
(at
least, some of the time):
-- | Whether the given expressions are extensionally equal, i.e. cannot be
-- | distinguished by an SK expression.
extEq :: Com -> Com -> Delay Bool
= isJust <$> extensionalInputs x y extEq x y
extEq
is a generalisation of
our previous checks:
extEqGeneralisesEqAndNormalEqAndEverAgree :: (Fuel, Com, Com) -> Bool
=
extEqGeneralisesEqAndNormalEqAndEverAgree (n, x, y) case ( go ( extEq x y)
, go ( everAgree x y)<$> normalEq x y)
, go (same ==) x y) of
, (Now False, Now True, _ , _ ) -> False
(Now False, _ , Now True, _ ) -> False
(Now False, _ , _ , True) -> False
(-> True
_ where go = runDelay n
= check extEqGeneralisesEqAndNormalEqAndEverAgree
main (tuple3 genFuel genCom genCom)
extEqGeneralisesEqAndNormalEqAndEverAgree: OK (0.16s)
100 successful tests
All 1 tests passed (0.16s)
Next time
We’ve seen how extensional equality works in theory, part 3 explores these definitions in more depth, to find out where I was going wrong in my egglog.