Ivory: Sums And Products

We tend to learn about sums (“adding numbers”) quite early, either when starting school or even before. We also learn about products (“times-ing numbers”) shortly after. We often represent each using a table, to show their patterns. For example:

\begin{matrix} + & \bar{2} & \bar{1} & 0 & 1 & 2 \\ \bar{2} & \bar{4} & \bar{3} & \bar{2} & \bar{1} & 0 \\ \bar{1} & \bar{3} & \bar{2} & \bar{1} & 0 & 1 \\ 0 & \bar{2} & \bar{1} & 0 & 1 & 2 \\ 1 & \bar{1} & 0 & 1 & 2 & 3 \\ 2 & 0 & 1 & 2 & 3 & 4 \end{matrix}

Addition table for integers

\begin{matrix} \times & \bar{2} & \bar{1} & 0 & 1 & 2 \\ \bar{2} & 4 & 2 & 0 & \bar{2} & \bar{4} \\ \bar{1} & 2 & 1 & 0 & \bar{1} & \bar{2} \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & \bar{2} & \bar{1} & 0 & 1 & 2 \\ 2 & \bar{4} & \bar{2} & 0 & 2 & 4 \end{matrix}

Multiplication table for integers

We can generalise these ideas of sum and product to much more than simple numbers. To do so we have to decide which aspects to keep, and which to ignore. In other words, what is the “essence” that makes the above tabulations a sum and a product? Which things cannot be considered as sums or products? Whilst we’re at it, what’s the actual distinction between a “sum” and a “product”?

Representing Sums And Products in Scheme

Ivory will represent sums and products as “uninterpreted function symbols”, i.e. as lists like '(+ 1 2) or '(× 3 4). This page defines various normalisation procedures, to reduce such lists into a unique normal form.

Types

We can think of addition and multiplication as operations or functions which transform “input values” into an “output value”. A fundamental aspect of these values is their type: what sort of things are we talking about?

For the addition and multiplication we learn in school, all of the values are numbers; i.e. they “have type number”. Numbers can be “literal”, like 12; or a (named) “variable” which represents some number, like x; or, crucially, they can be the output of some other numerical operation, like (+ 1 2).

Having the same type for inputs and outputs allows sums and products to be nested in arbitrary ways: with the output of one used as the input of another, forming “trees”:

    +
  ┌─┴─┐
  │   │
  ×   3
┌─┴─┐
│   │
7   ×
  ┌─┴─┐
  │   │
  2   9

A tree of nested sums and products, equivalent to the s-expression (+ (× 7 (× 2 9)) 3), or the infix expression (7 × (2 × 9)) + 3.

This is a crucial aspect of sums and products. An operation which doesn’t output the same type as its inputs, or which combines inputs of different types, is not usually considered a sum or a product.

Laws of Algebra

We can characterise operations algebraically by stating “laws”: equations which an operation always satisfies, regardless of its inputs. This latter aspect can be represented by using variables for inputs (we’ll use the names a, b and c), and stating the laws “for all” values of those variables. (Programmers may also know these as invariants or properties).

Since these laws state that different forms of sums and products are equal, and our goal with Ivory is to represent all numbers in a unique normal form, we will choose one form for each law, and rewrite the other equal forms into that one.

Associativity

   +                   +                      +
┌──┴──┐             ┌──┴──┐             ┌─────┼─────┐
│     │             │     │             │     │     │
a     +      =      +     c      =      a     b     c
   ┌──┴──┐       ┌──┴──┐
   │     │       │     │
   b     c       a     b

   ×                   ×                      ×
┌──┴──┐             ┌──┴──┐             ┌─────┼─────┐
│     │             │     │             │     │     │
a     ×      =      ×     c      =      a     b     c
   ┌──┴──┐       ┌──┴──┐
   │     │       │     │
   b     c       a     b

The law of associativity for + and ×. These equations always hold, regardless of the values of a, b and c.

This law states that sums-of-sums, or products-of-products, do not depend on how they are nested; so long as the inputs occur in the same order from left-to-right. This justifies the application of + and × to arbitrarily-many inputs, which are equivalent to nested operations (but don’t require an arbitrary choice of branching structure).

Implementing Associativity

We will normalise such nested operations by flattening them, using the following procedure:

(define (normalise:associative normalise ≤ n) (match n
  ;; Match nested operations; bind the inner operation's inputs to xs, and
  ;; append them to the outer operation's other inputs (pre and suf)

  [(cons '+ (app (split +?) (list pre (cons '+ xs) suf)))
   (changed (cons '+ (append pre xs suf)))]

  [(cons '× (app (split ×?) (list pre (cons '× xs) suf)))
   (changed (cons '× (append pre xs suf)))]

  [n (unchanged n)]))

These make use of a few trivial helper functions:

;; Predicates for spotting products and sums
(define (×? n) (and (pair? n) (equal? '× (car n))))
(define (+? n) (and (pair? n) (equal? '+ (car n))))

;; Indicates whether a value was/wasn't in normal form, and hence may need
;; further normalising.
(define (unchanged x) (values x #f))
(define (  changed x) (values x #t))

More interestingly, we use the split function to split up a list (above, we’re splitting up the inputs of a sum or product), by finding an element y that satisfies a given predicate (above, the predicates are +? and ×?, to spot a nested sum or product). The result also includes all the elements occuring before y (which we call xs) and all those following y (which we call zs). If no such element is found, we return #f (and hence the above clauses won’t match):

;; Return #f when no element of xs satisfies pred, otherwise return (list a b c)
;; where (append a (cons b c)) = xs, and (pred b)
(define ((split pred) xs)
  (match/values (splitf-at xs (negate pred))
    [(xs (cons y zs)) (list xs y zs)]
    [(_  _          ) #f]))

Identity

+
│      =      a
a

×
│      =      a
a

A sum or product containing only a single input, is equal to that input. Hence, in the case of a single argument, both sum and product act as the identity function.

In mathematics, the word “identity” refers to something that leaves a value unchanged. Here we consider two sorts of identity. Firstly, the identity function (or operation) is that which outputs its input. Since a sum with only a single input does not add anything to it, the output equals that input; and hence that sum is an identity function. Likewise, a product with only a single input does not multiply it by anything, so the output equals that input, and the product too is an identity function. These facts allow us to “unwrap” sums or products of a single value when normalising expressions.

The second sort of identity is a value, whose inclusion in the inputs of an operation does not change the output. We know from associativity that nested operations are equivalent to a single operation with all the same inputs; hence an empty operation is the identity value, since it does not affect the inputs (only the nesting). For example, (+ a (+)) is (by associativity) the same as (+ a), and hence the same as a (since, as described above, the sum of a single input leaves it unchanged). Here is the same example shown graphically (with a half-line ╵ to indicate no inputs):

   +                   +                +
┌──┴──┐      =      ┌──┴──┐      =      │      =      a
a     +             +     a             a
      ╵             ╵

Hence (+) is the identity value for sums, and a similar argument shows that (×) is the identity value for products.

Now, we know from primary school that adding 0 to a number also leaves it unchanged; so 0 must also be an identity for addition. In which case, what happens if we add 0 to an empty sum, like (+ 0 (+))? Adding 0 leaves the empty sum unchanged, so the result is (+); yet we also showed that adding (+) will leave the 0 unchanged, so the result is also 0. This only makes sense if both of these are the same value!

We can use the same approach to find that (×) and 1 must be the same value. Hence:

+      =      0
╵

×      =      1
╵

A empty sum or product (i.e. which has no inputs) is called the identity element of that operation. These are equal to the values 0 and 1, respectively.

Implementing Identity

(define (normalise:identity normalise ≤ n) (match n

Unwrapping a single-input sum or product is easy:

  ;; A singleton sum or product outputs its only input
  [(list '+ n) (changed n)]
  [(list '× n) (changed n)]

Since the identity values can be written in two ways (0 or (+); 1 or (×)) our normal form needs to choose one for each. We’ll use the empty operation, since that exposes structure that other clauses can use, e.g. to allow the existing associativity clauses to simplify multiplication by 1 and addition of 0:

  ;; Empty sums and products are their identity elements
  [0 (changed (list '+))]
  [1 (changed (list '×))]

  [n (unchanged n)]))

Distributivity

   ×                    +
┌──┴──┐             ┌───┴───┐
│     │             │       │
a     +      =      ×       ×
   ┌──┴──┐       ┌──┴──┐ ┌──┴──┐
   │     │       │     │ │     │
   b     c       a     b a     c


      ×                 +
   ┌──┴──┐          ┌───┴───┐
   │     │          │       │
   +     c   =      ×       ×
┌──┴──┐          ┌──┴──┐ ┌──┴──┐
│     │          │     │ │     │
a     b          a     c b     c

Distributivity relates addition and multiplication, with the latter “distributing over” the former.

So far all of these laws have applied equally to both sums and products. Not only does distributivity apply differently to each, but it tells us which operations act like sums, and which act like products. Specifically, it says that multiplying a sum, like (× 2 (+ 3 7 19)), is the same as multiplying each element of that sum individually, like (+ (× 2 3) (× 2 7) (× 2 19)). We say that multiplication “distributes over” addition. The same is not true if we switch the role of + and × (you may like to convince yourself of this by finding a counter-example)!

Notice that distributing a multiplication doesn’t change its order: distributing a multiplication “on the left”, like (× a ...), produces a sum of products which also multiply on the left; and vice versa “on the right”, like (× ... a). This isn’t relevant for types like integer, since their product also obeys the commutativity law, that (= (× a b) (× b a)); however, we’ll soon encounter numbers which don’t commute, so it’s important that we don’t swap things around by accident!

Absorption

   ×                ×                ×
┌──┴──┐             │             ┌──┴──┐
│     │      =      +      =      │     │
a     +             ╵             +     a
      ╵             ╵             ╵

Distributivity has an important relationship to our identity laws. When a product contains a sum, distributivity says that all of the product’s other inputs can be distributed over the inputs of that sum. If that sum has no inputs, the result will be an empty sum; we say that the empty sum has absorbed the products.

We showed that an empty sum must equal 0, so this “absorption” behaviour tells us that products with zero occuring as an input will always output zero (regardless of any other inputs).

Implementing Distributivity

Distributivity can help us normalise expressions, by replacing one of these forms with the other. We’ll choose the sum-of-products form to be normal, since it’s easy to convert into that form by distributing multiplications over sums; going the other way, to produce a product-of-sums, would require factorising, which is notoriously slow!

We need to be careful not to change the order of any multiplications; to ensure this, when a sum is multiplied by many values we’ll only distribute those which occur immediately to its left or right; this can be repeated until all of the values have been distributed into the sum. For example, given a product like (× a b (+ c d)) we will first distribute b, to get (× a (+ (× b c) (× b d))), then distribute a to get (+ (× a b c) (× a b d)).

To find a sum and its immediate neighbour in a product, we use a modified version of the above split function: split-pair will split a list at a pair of consecutive elements which satisfy the given binary predicate:

;; Check (pred (nth 0 xs) (nth 1 xs)), (pred (nth 1 xs) (nth 2 xs)), etc. until
;; we find a neighbouring pair elements a & b which pass the predicate, then
;; return (list pre a b suf), where (append pre (cons a (cons b suf))) = xs.
;; If no such pair of elements is found, return #f.
(define ((split-pair pred) xs)
  (define/match (go xs ys)
    [((cons x xs) '()        ) (go xs (list x))]
    [('()         ys         ) #f]
    [((cons x xs) (cons y ys))
     (if (pred y x)
       (list (reverse ys) y x xs)
       (go xs (cons x (cons y ys))))])

  (go xs '()))

We use split-pair to define two separate clauses. We first look through the inputs of a product, for a neighbouring pair of the form (cons '+ xs) (a sum with inputs xs) and any value y. If found, we remove both from the product’s inputs, and insert a sum that’s had the multiplication by y distributed across its inputs (i.e. replacing each element x of xs with (× x y)):

(define (normalise:distributivity normalise ≤ n) (match n
  [(cons '× (app (split-pair (lambda (x y) (+? x)))  ;; A sum followed by y
                 (list pre (cons '+ xs) y suf)))
   ;; Keep the product, in case pre/suf contain more elements
   (changed (cons '× (append
     pre
     ;; Multiply all of xs's summands by y on their right
     (list (cons '+ (map (lambda (x) (list '× x y)) xs)))
     suf
   )))]

Next we do the same when the second value is a sum:

  [(cons '× (app (split-pair (lambda (x y) (+? y)))  ;; x followed by a sum
                 (list pre x (cons '+ ys) suf)))
   ;; Keep the product, in case pre/suf contain more elements
   (changed (cons '× (append
     pre
     ;; Multiply all of ys's sumands by x
     (list (cons '+ (map (curry list '× x) ys)))
     suf
   )))]

  [n (unchanged n)]))

Notice that we do not have to implement absorption separately, since the identity clauses will replace 0 with '(+), and the two clauses above will cause it to absorb all of a product’s inputs. In particular, their results use (map ... xs), which will return an empty list when xs is empty!

Commutativity

   +                   +
┌──┴──┐             ┌──┴──┐
│     │             │     │
a     b      =      b     a

Changing the order of the inputs to a commutative operation does not affect its output.

Addition and multiplication of numbers are both commutative, meaning that the order of their inputs does not affect their output. We will require sums to be commutative, but we will not assume that products are; since that turns out to be quite restrictive. We can use commutativity to normalise sums, by sorting their inputs into a consistent order.

Implementing Commutativity

The following clause implements a form of Bubble Sort, by spotting a pair of neighbouring inputs which are in the wrong order:

(define (normalise:commutativity normalise ≤ n) (match n
  [(cons '+ (app (split-pair (lambda (x y) (not (lex≤ x y))))
                 (list pre x y suf)))
   (changed (append '(+) pre (list y x) suf))]

  [n (unchanged n)]))

Note that we cannot sort the inputs based on their numerical value, since we will encounter multi-dimensional values which don’t have a linear ordering. Instead we will compare the literal syntax of expressions, known as a lexicographical ordering:

;; Compare expressions lexicographically (i.e. via "dictionary order" of their
;; literal syntax)
(define (lex≤ . args)
  (define (to-string x)
    (let ([o (open-output-string)])
      (write x o)
      (get-output-string o)))
  (apply string<=? (map to-string args)))

Final Pieces

Our normalise functions need a few more clauses to be complete. Firstly, if we find a sum or product containing ordinary Racket numbers, then we should add/multiply them using Racket’s usual addition/multiplication operations:

(define (normalise:sums-and-products normalise ≤ n) (match n
  ;; Sums of multiple Racket numbers simplify via addition
  [(cons '+ (app (split number?)
                 (list pre x (app (split number?)
                                  (list mid y suf)))))
   (changed (append (list '+ (+ x y)) pre mid suf))]

  ;; Products of multiple Racket numbers simplify via multiplication
  [(cons '× (app (split number?)
                 (list pre x (app (split number?)
                                  (list mid y suf)))))
   (changed (append (list '× (× x y)) pre mid suf))]

If we have a sum or product which doesn’t match any of the above clauses, we should try normalising its inputs recursively:

  ;; Recurse through the elements of a sum or product to see if any change
  [(cons (and op (or '+ '×)) xs)
   (for/fold ([result      (list op)]
              [any-changed #f])
             ([(x changed) (map normalise xs)])
     (values (snoc result x) (or any-changed changed)))]

Finally, a value which doesn’t match any of the clauses defined on this page should be left unchanged:

  ;; Otherwise there's nothing left to do
  [n (unchanged n)]))

Code

Here’s a URI containing all of the Racket code generated by this post:

DOWNLOAD RACKET CODE

Test results

raco test: (submod (file "sums-and-products.rkt") test)
  ✓ property ×-matches-* passed 100 tests.
  ✓ property integer?-implies-rational? passed 100 tests.
  ✓ property rational?-implies-number? passed 100 tests.
  ✓ property natural?-implies-integer? passed 100 tests.
  ✓ property natural-is-closed passed 100 tests.
  ✓ property lex≤-is-reflexive passed 100 tests.
  ✓ property lex≤-is-total passed 100 tests.
  ✓ property lex≤-is-transitive passed 100 tests.
  ✓ property split-works passed 100 tests.
  ✓ property split-pair-works passed 100 tests.
31 tests passed