Thursday, 28 February 2008

The Co-Natural Numbers in Haskell and Coq

I was playing around with the Peano numerals in Haskell (as I often do), because I remembered once skimming a generalisation of exponentiation in a paper that I didn't have time to read properly. I decided I'd quickly look at it again now that I was in a mood to procrastinate.

> data Nat = Z | S Nat deriving Show
This data type is usually identified with the Peano numerals. In fact, as we will see, in Haskell it isn't quite the natural numbers. We take Z to mean 0, and take (S^n Z), that is (S (S (S ... Z))) with n applications of S, to mean n. It's basically a unary number system where the length of the string gives us the number.

It is pretty straightforward to write the basic arithmetic operations on the Peano numerals.

> add x Z = x
> add x (S y) = S (add x y)
> mult x Z = Z
> mult x (S y) = add x (mult x y)
> expt x Z = (S Z)
> expt x (S y) = mult x (expt x y)
Here we can see how to extend this series fairly easily. The only real question we might have is what to do with the base case. It turns out that (S Z), or 1, is a good choice: since x^1 = x, taking exptexpt x Z = S Z gives exptexpt x (S Z) = x, as we'd expect.

> exptexpt x Z = (S Z)
> exptexpt x (S y) = expt x (exptexpt x y)
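As a quick sanity check, here are a few evaluations one might try in GHCi (written as literate comments; to and from are the helper functions defined at the end of this post):

> -- from (add  (to 2) (to 3))     == 5
> -- from (mult (to 2) (to 3))     == 6
> -- from (expt (to 2) (to 3))     == 8
> -- from (exptexpt (to 2) (to 3)) == 16   -- i.e. 2^(2^2)
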
Playing with exptexpt got me thinking about the fact that I wasn't dealing with natural numbers, and I was curious what I could prove about them. In order to see that we don't actually have the natural numbers, we can define the following:

> inf :: Nat
> inf = S inf
This element is clearly not a natural number, since its defining equation is not a valid recursive equation: there is no decreasing quantity. It is, however, a valid corecursive equation. It is always productive, in the sense that its recursive reference always appears inside a constructor. We will always be able to "peel" something away from this definition in order to reason about it. This will become obvious when we start looking at equational reasoning with infinity.
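
To make "peeling" concrete, here is a small helper (takeN is my own name, nothing standard) that observes at most n constructors of any Nat; even applied to inf it terminates:

> takeN :: Int -> Nat -> Nat
> takeN 0 _     = Z                  -- out of fuel: truncate to Z
> takeN _ Z     = Z
> takeN n (S x) = S (takeN (n-1) x)  -- peel one constructor and recurse
For example, takeN 3 inf evaluates to S (S (S Z)) in finite time.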

So this leads us to the question: what does inf+x equal? I'll prove it in pseudo-Coq. Let us make the following conjecture:

∀ x, inf + x = inf

Theorem : ∀ x, inf + x = inf.
  By coinduction:

  case x = 0: 
    1a) inf + 0 = inf 
    2a) inf = inf              (by the definition of +)
    * true by the reflexive property of =
  case x = S x'
    1b) inf + S x' = inf
    2b) S (inf + x') = inf     (by the definition of +)
    3b) S (inf + x') = S inf   (unfolding inf = S inf on the right)
    4b) inf + x' = inf         (peeling S from both sides)
    * true as this is the coinductive hypothesis!
Qed. 

Now the equality that we are using here is actually not term-based equality. It is a coinductive kind of equality known as bisimulation. It must also follow the rule that we can never make the recursive call unless we are inside a constructor of the coinductive type we are producing. The recursion is the coinductive hypothesis. The constructor prior to the use of the coinductive hypothesis comes from the definition of bisimulation. Namely:

∀ x y, x = y → S x = S y.

The full definition in Coq of the bisimulation, which we denote above as "=" and below as "CoEq", is:

CoInductive CoEq : CoNat -> CoNat -> Prop := 
| coeq_base : CoEq Z Z
| coeq_next : forall x y, CoEq x y -> CoEq (S x) (S y).

The constructor coeq_next, which is used between steps 3b and 4b above, ensures that we satisfy our guardedness, or productivity, condition and can safely "call" the coinductive hypothesis. What do I mean by "call", and how is the constructor being used? We can see exactly, by printing out the term in Coq that proves inhabitation of the type for our proposition.

add_x_inf = 
cofix add_x_inf (x : CoNat) : x + inf ~= inf :=
  eq_ind_r (fun c : CoNat => x + c ~= c)
    (eq_ind_r (fun c : CoNat => c ~= S inf)
       (coeq_next (x + inf) inf (add_x_inf x)) (add_x_s x inf))
    (decomp_eql inf)
     : forall x : CoNat, x + inf ~= inf

Here we see, in gory detail, a corecursive function that shows the type is inhabited, with the corecursive call embedded in the constructor coeq_next. Sorry if it isn't very readable; it was constructed automagically from the proof script.

To see how this is all done in detail, check out the Coq proof script for the co-naturals. This might be interesting to those of you who want to see how to use coinductive reasoning on simple examples and how setoids work in Coq. You'll notice that the reasoning is slightly more cumbersome than what I've used here. I think this mostly comes down to the "cofix" tactic being unwieldy in Coq, and making automation a hassle. I'm going to think about ways to fix this since it really is unnecessarily inconvenient.

Here are some helper functions which convert between non-negative integers and our representation, so you can play with the code more easily.

> to 0 = Z
> to n = S (to (n-1))
> from Z = 0
> from (S x) = 1+(from x)
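
We can't decide bisimulation in Haskell, but a fuel-bounded approximation of the CoEq relation above makes it tangible (bisimTo is my own sketch, not part of the Coq development):

> -- Check equality up to a given depth: each step peels one S from
> -- both sides, mirroring coeq_next; exhausting the fuel without
> -- finding a mismatch counts as success.
> bisimTo :: Int -> Nat -> Nat -> Bool
> bisimTo 0 _     _     = True
> bisimTo _ Z     Z     = True
> bisimTo n (S x) (S y) = bisimTo (n-1) x y
> bisimTo _ _     _     = False
For instance, bisimTo 1000 (add inf (to 5)) inf evaluates to True, in line with the theorem above.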

Tuesday, 12 February 2008

Verifying the Monad Laws with Supercompilation

A while ago I wrote a post about how one can use Coq to make a proper monad module. I was just thinking today that it would be nice to have a tool for Haskell that would allow one to write down conjectures and discharge them automatically with a theorem prover. Supercompilation makes a nice clean theorem prover for Haskell, since one could express the equations of interest in Haskell itself. Below is an example of the list monad, and then the 3 monad laws written as conj1, conj2 and conj3. I prove the first law by manual supercompilation. The next two are left as an exercise for the interested reader.

equal xs ys = case xs of 
                [] -> case ys of 
                        [] -> True 
                        _ -> False 
                (x:xs') -> case ys of 
                             [] -> False 
                             (y:ys') -> case (eq x y) of 
                                          False -> False
                                          True ->  equal xs' ys'

bind [] f = [] 
bind (h:t) f = (f h)++(bind t f)
  
ret a = [a]

conj1 a f = equal (bind (ret a) f) (f a)
conj2 m = equal (bind m (\a -> ret a)) m
conj3 m f g = equal (bind (bind m f) g) (bind m (\x -> (bind (f x) g)))

In order to define equality on lists, we had to make reference to a different equality predicate on the elements of the list. We will make the assumption that this equality is decidable, and that supercompilation can use its reflexive property; that is, "eq x x = True" will be taken as an assumption. It is somewhat hard to imagine a case where supercompilation would have a hard time with this, because of the case substitution rule.
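
For experimenting in GHCi, one could instantiate eq concretely, say at Int (this instantiation is my own addition; the derivation below keeps eq abstract):

eq :: Int -> Int -> Bool
eq = (==)

-- With that in scope:
--   conj1 3 (\x -> [x, x+1])               evaluates to True
--   conj2 [1,2,3]                          evaluates to True
--   conj3 [1,2] (\x -> [x,x]) (\y -> [y])  evaluates to True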

Now we take conj1 and attempt to prove it by semantics-preserving transformations of the source code [1]. I've been brutally explicit in the steps used, so that people who are interested can see exactly how to do these sorts of things themselves. I've found that program transformation techniques can be extremely useful in reasoning about code. Of course, it helps to assume that all functions are total, and I'll be doing just that. I use the notation M[x:=y] to mean the substitution of y for x in M. Aside from that, everything is just Haskell.

conj1 a f = equal (bind (ret a) f) (f a)

{- unfold equal -} 
conj1 a f = case (bind (ret a) f) of 
              [] -> case (f a) of 
                      [] -> True 
                      _ -> False
              (x:xs') -> case (f a) of 
                           [] -> False 
                           (y:ys') -> case eq x y of 
                                        False -> False 
                                        True -> equal xs' ys'

{- unfold bind -}
conj1 a f = case (case (ret a) of 
                    [] -> []
                    (z:zs) -> (f z)++(bind zs f)) of 
              [] -> case (f a) of 
                      [] -> True 
                      _ -> False
              (x:xs') -> case (f a) of 
                           [] -> False 
                           (y:ys') -> case eq x y of 
                                        False -> False 
                                        True -> equal xs' ys'

{- case distribution -} 
conj1 a f = case (ret a) of 
              [] -> case (f a) of 
                      [] -> True 
                      _ -> False
              (z:zs) -> case ((f z)++(bind zs f)) of 
                          [] -> case (f a) of 
                                  [] -> True 
                                  _ -> False
                          (x:xs') -> case (f a) of 
                                       [] -> False 
                                       (y:ys') -> case eq x y of 
                                                    False -> False 
                                                    True -> equal xs' ys'

{- unfold ret -}
conj1 a f = case [a] of 
              [] -> case (f a) of 
                      [] -> True 
                      _ -> False
              (z:zs) -> case ((f z)++(bind zs f)) of 
                          [] -> case (f a) of 
                                  [] -> True 
                                  _ -> False
                          (x:xs') -> case (f a) of 
                                       [] -> False 
                                       (y:ys') -> case eq x y of 
                                                    False -> False 
                                                    True -> equal xs' ys'

{- case selection -}
conj1 a f = (case ((f z)++(bind zs f)) of 
                      [] -> case (f a) of 
                              [] -> True 
                              _ -> False
                      (x:xs') -> case (f a) of 
                                   [] -> False 
                                   (y:ys') -> case eq x y of 
                                                False -> False 
                                                True -> equal xs' ys') 
                                     [z := a,  zs := []]

{- substitution -}
conj1 a f = case ((f a)++(bind [] f)) of 
              [] -> case (f a) of 
                      [] -> True 
                      _ -> False
              (x:xs') -> case (f a) of 
                           [] -> False 
                           (y:ys') -> case eq x y of 
                                        False -> False 
                                        True -> equal xs' ys'

{- unfold bind and ++ -}
conj1 a f = case (case (f a) of
                    [] -> []
                    (z:zs') -> z:(zs'++[])) of 
              [] -> case (f a) of 
                      [] -> True 
                      _ -> False 
              (x:xs') -> case (f a) of 
                           [] -> False 
                           (y:ys') -> case eq x y of 
                                        False -> False 
                                        True -> equal xs' ys' 

{- case selection, substitution -}
conj1 a f = case (f a) of
              [] -> True                     
              (z:zs') -> case eq z z of 
                           False -> False 
                           True -> equal (zs'++[]) zs'

{- Assumption: eq z z = True -}
conj1 a f = case (f a) of
              [] -> True                     
              (z:zs') -> equal (zs'++[]) zs'

{- unfold equal and ++ -}
conj1 a f = case (f a) of
              [] -> True                     
              (z:zs') -> case (case zs' of 
                                 [] -> [] 
                                 (w:ws) -> w:(ws++[])) of 
                           [] -> case zs' of 
                                   [] -> True
                                   (y:ys) -> False
                           (x:xs) -> case zs' of 
                                       [] -> False 
                                       (y:ys) -> case (eq x y) of
                                                   False -> False 
                                                   True -> equal xs ys

{- case selection -}                                                 
conj1 a f = case (f a) of
              [] -> True                     
              (z:zs') -> case zs' of 
                           [] -> True
                           (w:ws) -> case (eq w w) of
                                       False -> False 
                                       True -> equal (ws++[]) ws
                 
{- Assumption: eq w w = True -}
conj1 a f = case (f a) of
              [] -> True                     
              (z:zs') -> case zs' of 
                           [] -> True
                           (w:ws) -> equal (ws++[]) ws
            
{- fold (we encountered a repeated instance of 'equal (ws++[]) ws') -}
conj1 a f = case (f a) of 
              [] -> True                     
              (z:zs') -> let g = \ xs -> 
                                 case xs of 
                                   [] -> True 
                                   (y:ys') -> g ys'
                         in g zs'

Now we have a function that can only return True, regardless of the value of (f a). Assuming that f is total, we can therefore replace this term with True. The proof of this is simply by induction on (f a), but the principle can be built into a checker, and is the principle used in the Poitín [3] theorem prover.

Something along these lines would be very lightweight in comparison to using a full theorem prover, and could just issue a warning if some laws couldn't be proved. The proof of this in Coq is actually more work and requires lemmas about the list type to be proven. I did in fact fudge the ((f a)++[]) situation there by replacing it with (f a), and I'm not entirely sure what supercompilation does with it if you leave it in. I'll have to try it later. UPDATE: This is fixed in the code above.

If we want to prove this result by hand we can also derive it in a much simpler way by not following the supercompilation method verbatim:

conj1 a f = equal (bind [a] f) (f a)                 -- unfold ret
conj1 a f = equal (append (f a) (bind [] f)) (f a)   -- unfold bind
conj1 a f = equal (append (f a) []) (f a)            -- bind [] f = []
conj1 a f = equal (f a) (f a)                        -- append xs [] = xs

The advantage of the former is of course that it is entirely mechanical.
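
Short of an actual supercompiler, one can at least check such conjectures mechanically by random testing; here is a QuickCheck sketch (my own addition, reusing the Int instantiation of eq from above):

import Test.QuickCheck

-- Random testing of the first law; passing tests are of course much
-- weaker evidence than the transformation-based proof above.
prop_conj1 :: Int -> Fun Int [Int] -> Bool
prop_conj1 a (Fun _ f) = conj1 a f

-- Run with: quickCheck prop_conj1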

UPDATE: I went ahead and incorporated the definition of append and supercompiled it (by hand) with this definition, and it reduces to exactly the same term. So in fact supercompilation, without any special knowledge, is capable of proving the conjecture. This is a real win over most proof assistants, which would require this append lemma to be proved separately, or incorporated into the knowledge base. For a more visual and more compact depiction of the derivation, we can draw a partial process tree with the append function definition included in the derivation:

[Figure: partial process tree for the first monad law, with append inlined]

And indeed the second monad law:

[Figure: partial process tree for the second monad law]

Is the third monad law for the list monad provable using supercompilation? I expect it is, but since I don't have a supercompiler, and it is a bit of a bear to do by hand, I'm not sure.

Incidentally, there is already a supercompiler being developed for Haskell, called Supero [2]. It probably wouldn't be too much work to extend it with theorem proving capabilities using ideas from [3].

[1] A Transformation System for Developing Recursive Programs
[2] A Supercompiler for Core Haskell
[3] Poitín: Distilling Theorems From Conjectures

Sunday, 10 February 2008

Proof by Program Transformation

Program transformation is a sort of meta-compilation where the source and target languages are the same. In this sense it can be seen as an endomorphism on the set of programs in a language.

As was pointed out by Turchin, it is sometimes possible to prove properties of programs simply by performing a program transformation. Basically, a program together with a predicate which verifies its behaviour can be replaced with a truth value, if the program transformation is sufficiently powerful. An obvious example, which can be calculated using partial evaluation, would be something like (max (< 3) (1::2::1::nil)). In fact very complex properties can be proved automatically by Distillation, a particularly powerful program transformation system.
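
Reading that ML-ish expression as a check that every element of the list is below 3 (my reading; the notation is loose), the point in Haskell terms is that a closed predicate application reduces to a literal by evaluation alone:

-- A predicate applied to entirely static data: partial evaluation
-- (here just ordinary evaluation, since nothing is dynamic) reduces
-- the whole term to the constant True.
claim :: Bool
claim = all (< 3) [1, 2, 1]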

In my research I've been working with the μ-Calculus, which is a type of temporal logic that is very powerful (LTL, CTL and CTL* can all be embedded in the μ-Calculus) and good for describing reactive systems. It uses a notion of least and greatest fixed-points in order to describe temporal properties. This led me to consider infinite proofs, bi-simulation and co-induction in the context of distillation.

I was reading a paper [1], which led me to wonder whether Distillation couldn't be used as a means of determining, in a very simple way, if two programs bi-simulate. Distillation can be viewed as a relation between programs, such that a program a and a program b are related by a D b iff a distils to b (that is, b is the result of running the distillation algorithm on a). If the bi-simulation relation is called ~ then we have trivially that:

∀ a b c ∈ PROG. a D c ∧ b D c → a~b

This is obvious since D is semantics preserving, and c must have the same semantics as both a and b. So this looks a little bit silly at first: what good can this possibly be? Can we ever expect it to happen? Well, I was a little bit surprised to find that it does work for simple examples that I tried on paper. For instance, let us take the following program.

iterate f x = x:(iterate f (f x)) 
map f (h:tl) = (f h):(map f tl)
cofix f = f (cofix f)

conjecture_1 f x = iterate f (f x)  
conjecture_2 f x = map f (iterate f x)
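
Note in passing that cofix as written above is just the ordinary fixpoint combinator, which laziness is happy to accept; for example (my own illustration):

-- cofix builds infinite structures directly, e.g. the constant stream
-- of ones:
ones = cofix (\xs -> 1 : xs)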

We will assume that the ":" constructor is for a type of lists with no "nil" terminator, that is, we are only dealing with infinite lists. Supercompilation is strictly less powerful than Distillation, but sufficient for this example, and simpler to do on paper. We will therefore supercompile the two halves of the conjecture:

∀ f x. iterate f (f x) ~ map f (iterate f x)

to yield:

conjecture_1 f x = iterate f (f x) 
-- unfold iterate
conjecture_1 f x = (f x):(iterate f (f (f x))) 
-- generalise [iterate f (f x), iterate f (f (f x))] => 
--  iterate f (f x) which is just conjecture_1
conjecture_1 f x = (f x):(conjecture_1 f (f x))
conjecture_1 = cofix (\ g f x -> (f x):(g f (f x)))

conjecture_2 f x = map f (iterate f x) 
-- unfold map 
conjecture_2 f x = case (iterate f x) of h:tl -> (f h):(map f tl)  
-- unfold iterate
conjecture_2 f x = case (x:(iterate f (f x))) of h:tl -> (f h):(map f tl) 
-- case law, [h:=x, tl:=(iterate f (f x))]
conjecture_2 f x = (f x):(map f (iterate f (f x))) 
-- generalise [map f (iterate f (f x)), map f (iterate f x)] =>  
-- map f (iterate f x), which is just conjecture_2
conjecture_2 f x = (f x):(conjecture_2 f (f x))
conjecture_2 = cofix (\ g f x -> (f x):(g f (f x)))

So we can see that conjecture_1 and conjecture_2 are syntactically identical modulo alpha-conversion. The cofix was added just so that the function names didn't obscure the equality. We could also go to a De Bruijn representation to get rid of the need for alpha-conversion equivalence, but I think you get the picture. In addition, it is trivial to verify that this is a valid co-recursive program satisfying the guard condition: the recursive call happens under a constructor of the co-inductive type which is the co-domain of the co-recursive function (look at all those "co"s!).

The point is though that we have derived a syntactically identical program, and hence proved bi-simulation, using a completely automated procedure!
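
As an empirical sanity check (not a proof, and pretending our nil-less lists are ordinary Haskell lists so that the standard take applies), one can compare finite prefixes of the two sides:

-- Agreement on any finite prefix is evidence, though never proof,
-- of bisimilarity of the two sides of the conjecture.
checkPrefix n f x = take n (iterate f (f x)) == take n (map f (iterate f x))

-- e.g. checkPrefix 100 (+1) 0 evaluates to True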

So the obvious question of course is: how often will this work? The technique relies on finding a canonical form. For non-recursive (NOREC) programs, we can probably always find a canonical form using supercompilation (the S relation, a subset of D). That is, I expect that:

∀ p,q,r,s ∈ NOREC. p ~ q ∧ p D r ∧ q D s → r = s

This is probably obvious to someone who understands normal forms in the lambda calculus better than I do, but I haven't yet proved it for the language over which distillation is defined.

For recursively defined programs, it can't be the case that we can always find a canonical form: that would contradict the undecidability of equivalence for the lambda calculus. However, I'll stick my neck out and conjecture that it will work for all programs of suitably low complexity (PRIMREC, or maybe ELEMENTARY? [3]). This doesn't tell us anything about bi-simulation however, since co-recursive programs don't have a complexity. Is there a co-complexity we could use? It would be really nice to have some class we could show is definitely canonisable.

And while I'm on a roll playing fast and loose, I'll say that I think that Distillation represents "the right way" of going about finding "proof-net" like structures for intuitionistic logic. That is, internal to the distillation algorithm we actually form a graph, and in the case of conjecture_1 and conjecture_2 the graphs are identical. The internal representation used for distillation then is actually the form in which we remove "The bureaucracy of syntax" [2] for some subset of possible proofs.

For further reading on some of these ideas:

[1] Proof Methods for Corecursive Programs
[2] Down with the bureaucracy of syntax!
[3] The Expressive Power of Higher-order Types or, Life without CONS
[4] Deforestation, program transformation, and cut-elimination

Friday, 18 January 2008

Supercompilation

I've lately been working on my dissertation, concerning a program transformation technique known as Distillation, a technique inspired by supercompilation.

Supercompilation involves mapping terms from a particular language to other terms of the same language. It can be thought of as a semantics-preserving rewriting system, or a function which maps terms to other terms that are equal modulo some semantic interpretation relation. A really straightforward description of supercompilation is given by XXX in his paper on supercompilation for Haskell. This work leverages work by Glück and Sørensen, but presents generalisation in a very clear and declarative way, which is a notable exception in the literature of this field. Generalisation is actually quite a tricky problem; it suffers from the old joke about standards: the nice thing about them is that there are so many to choose from. It takes quite a bit of effort to come up with a formulation of generalisation that is not only reasonable, but also has the characteristics you want, in particular one that doesn't over-generalise.

Supercompilation is one particular meta-compilation technique. In fact there are loads of meta-compilation techniques, and they can be quite powerful. Partial evaluation is the most well known, and one of the simplest. Deforestation is another well known technique. Geoff Hamilton described a much more sophisticated technique known as Distillation.

Turchin noted that supercompilation has the capacity to prove conjectures through the transformation of semantically equivalent programs, basically by reducing applications of predicates to their truth values. More sophisticated program transformation techniques can lead to even more sophisticated automated solutions to conjectures.

In my research I've decided to leverage the automated proof capacity of the particular meta-compilation strategy, distillation, to solve problems in the domain of reactive systems. Reactive systems are particularly important since they represent a class of systems that is eminently useful, yet much less studied. They are programs which must respond to stimuli; they basically encompass all programs which deal with actions provided by an external environment.

The particular objective is to be able to write specifications in some temporal logic (possibly LTL initially, but maybe the mu-calculus depending on circumstances) and prove correctness of the specifications using program transformation in an automated fashion. Some work on this has already been done, most notably by Leuschel and XXX (paper on crypto). There are a number of obvious sticking points that I've run into.

The representation of runs of a machine is crucial to the notion of verification in some temporal logic. For LTL you need only have a stream of possible runs. Because the specifications given to the compiler will represent programs with preconditions, acting as state transformers whenever their precondition is satisfied, it is easiest to encode each operation as a state transformer guarded by its precondition. That is, the "stream" of values becomes a "tree" of values, where each possible output is contingent on the satisfaction of some precondition predicate.
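
Concretely, one might represent such runs in Haskell along these lines (entirely my own sketch of the idea, not code from the project):

-- A plain stream of states suffices for a single run (the LTL view).
data Stream s = Cons s (Stream s)

-- With guarded operations, runs branch: each successor run is
-- contingent on a precondition holding in the current state.
data RunTree s = Node s [(s -> Bool, RunTree s)]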

This representation mirrors the kinds of proofs of temporal properties that are done over inductive predicates in a language such as Coq, and that proof representation was the inspiration here.

The tree of possible values, contingent on the state and a predicate over states, is a coinductive structure. It presents an infinite number of possible states over which proofs must function. This should present no difficulties in a system which incorporates both inductive and coinductive types.

The difficulties with this representation become clear when one tries to prove safety properties over it. Safety properties over a coinductively defined instruction sequence are necessarily coinductive. Suppose we create a coinductive list type, with a cons constructor in addition to nil, and consider the function:

from x = x:(from (S x))

If S is the successor function, we have a program that computes the infinite co-list of values counting up from x (a greatest fixed-point, rather than least fixed-point, reading of the equation). Suppose we have a predicate All : (nat -> Prop) -> co-list -> Prop, which holds whenever P holds on every element of the co-list. This naturally corresponds to a safety property. We can now ask things like:

All (x > 3) (from 4)

That statement requires an infinite proof! We need to unfold "from" an infinite number of times, and All needs unfolding an infinite number of times as well. Fortunately, we can proceed using coinduction.
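
To see the shape of the problem in Haskell, with lazy lists standing in for co-lists and (+1) read for S, the from above already produces the lazy list 4, 5, 6, ... and All might be sketched as follows (allCo is my own name):

-- On an infinite list, allCo can refute a property in finite time by
-- finding a counterexample, but it can never affirm one: that would
-- mean inspecting every element, i.e. an infinite proof.
allCo p (x:xs) = p x && allCo p xs

Evaluating allCo (\x -> x > 3) (from 4) never returns True; it just keeps unfolding productively. Coinduction is what lets a proof assistant close this infinite regress with a finite, guarded argument.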