
Supercompilation

I've lately been working on my dissertation concerning a program transformation technique known as Distillation. Distillation is a technique inspired by supercompilation.

Supercompilation involves mapping terms from a particular language to other terms of the same language. It can be thought of as a semantics-preserving rewriting system, or as a function which maps terms to other terms that are equal modulo some semantic interpretation relation. A really straightforward description of supercompilation is given by XXX in his paper on supercompilation for Haskell. This work leverages work by Glück and Sørensen, but presents generalisation in a very clear and declarative way, which is a notable exception to most of the literature in this field. Generalisation is actually quite a tricky problem. It suffers from the old joke about standards: the nice thing about standards is that there are so many to choose from. It takes quite a bit of effort to come up with not only a reasonable formulation of generalisation, but one that has the characteristics you want, in particular one that doesn't over-generalise.
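To make "generalisation" concrete, here is a minimal sketch of most specific generalisation (anti-unification) over a toy first-order term language. The `Term` type, the `msg` function, and the variable-naming scheme are all illustrative assumptions, not anyone's published formulation:

```haskell
-- Toy first-order terms: variables and applications of function symbols.
data Term = Var String | App String [Term]
  deriving (Eq, Show)

-- Most specific generalisation (anti-unification) of two terms: keep
-- matching function symbols and recurse; replace each mismatching pair
-- of subterms with a variable. Naming the variable after the pair means
-- repeated occurrences of the same mismatch share one variable, which
-- is what makes the generalisation "most specific" rather than trivial.
msg :: Term -> Term -> Term
msg s@(App f as) t@(App g bs)
  | f == g && length as == length bs = App f (zipWith msg as bs)
  | otherwise                        = mismatch s t
msg s t
  | s == t    = s
  | otherwise = mismatch s t

-- A deterministic "fresh" variable for a mismatching pair of subterms.
mismatch :: Term -> Term -> Term
mismatch s t = Var ("z<" ++ show s ++ "|" ++ show t ++ ">")
```

For example, generalising `f (g x)` against `f (h x)` keeps the common `f` and abstracts the differing argument into a single variable. Over-generalisation would be abstracting more than necessary, e.g. returning a bare variable for the whole pair.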

Supercompilation is a particular meta-compilation technique. In fact there are loads of meta-compilation techniques, and they can be quite powerful. Partial evaluation is the most well known technique and one of the simplest. Deforestation is another well-known technique. Geoff Hamilton described a much more sophisticated technique known as Distillation.
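Deforestation gives the easiest flavour of what these transformations do. The following before/after pair is a hand-worked sketch of the kind of result a deforestation pass produces, not the output of any actual tool:

```haskell
-- Before: builds an intermediate list that exists only to be consumed.
sumSquares :: [Int] -> Int
sumSquares xs = sum (map (\x -> x * x) xs)

-- After: the intermediate list has been deforested away, leaving a
-- single direct recursion over the input. This is the sort of residual
-- program the transformation derives mechanically.
sumSquares' :: [Int] -> Int
sumSquares' []       = 0
sumSquares' (x : xs) = x * x + sumSquares' xs
```

Both compute the same function; the second simply never allocates the `map` result.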

Turchin noted that supercompilation has the capacity to prove conjectures through the transformation of semantically equivalent programs, essentially by reducing applications of predicates to their truth values. More sophisticated program transformation techniques can lead to even more sophisticated automated solutions to conjectures.
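A small hand-worked illustration of "reducing a predicate to its truth value": the conjecture that doubling always yields an even number. The driving steps in the comments are done by hand to show what a supercompiler does mechanically; the final constant function is the residual program one hopes the transformation reaches:

```haskell
-- Peano naturals, so each driving step is visible.
data Nat = Z | S Nat

double :: Nat -> Nat
double Z     = Z
double (S n) = S (S (double n))

evenN :: Nat -> Bool
evenN Z         = True
evenN (S Z)     = False
evenN (S (S n)) = evenN n

-- Conjecture: evenN (double n) holds for every n. Driving by hand:
--   evenN (double Z)     ->  evenN Z                   ->  True
--   evenN (double (S m)) ->  evenN (S (S (double m)))  ->  evenN (double m)
-- The second branch folds back to the original call, so every path
-- bottoms out at True, and the residual program is the constant True.
conjecture :: Nat -> Bool
conjecture _ = True
```

The folding step is where the proof lives: recognising that the transformed term is an instance of one already seen.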

In my research I've decided to leverage the automated proof capacity of the particular metacompilation strategy "distillation" to solve problems in the domain of reactive systems. Reactive systems are particularly important since they represent a class of systems that is eminently useful, yet much less studied. They are programs which must respond to stimuli: they basically encompass all programs which deal with actions provided by an external environment.

The particular objective is to be able to make specifications in some temporal logic (possibly LTL initially, but maybe the mu-calculus depending on circumstances) and prove correctness of the specifications using program transformation in an automated fashion. Some work on this has already been done, most notably by Leuschel and XXX (paper on crypto). There are a number of obvious sticking points that I've run into.

The representation of the runs of a machine is crucial to the notion of verification in some temporal logic. For LTL you need only a stream of possible runs. Because the specifications given to the compiler will represent programs with preconditions, acting as state transformers whenever a precondition is satisfied, it is easiest to encode each operation as a state transformer guarded by its precondition. That is, the "stream" of values becomes a "tree" of values, where each possible output is contingent on the satisfaction of some precondition predicate.

This representation mirrors the kinds of proofs of temporal properties that are done over inductive predicates in a language such as Coq, and that proof representation was the inspiration here.

The tree of possible values, contingent on the state and a predicate over states, is defined by a coinductive function. It presents an infinite number of possible states over which proofs must operate. This should present no difficulties in a system which incorporates both inductive and coinductive types.
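The tree of runs described above can be sketched as follows. This is my own minimal encoding, using Haskell's laziness to stand in for coinduction; the `RunTree`, `Op`, and `runs` names are illustrative:

```haskell
-- A run tree: the current state, plus one subtree for each operation
-- whose precondition holds in that state. Laziness makes this
-- effectively coinductive: the tree may be infinite.
data RunTree s = Node s [RunTree s]

-- An operation is a precondition over states plus a state transformer.
data Op s = Op { pre :: s -> Bool, step :: s -> s }

-- Unfold the (possibly infinite) tree of runs from an initial state.
-- Each enabled operation contributes one branch.
runs :: [Op s] -> s -> RunTree s
runs ops s = Node s [ runs ops (step op s) | op <- ops, pre op s ]

-- Finite approximation of the tree, for inspection.
prune :: Int -> RunTree s -> RunTree s
prune 0 (Node s _)  = Node s []
prune n (Node s ts) = Node s (map (prune (n - 1)) ts)
```

When only one operation is ever enabled the tree degenerates to the LTL stream of states; with several, branching records which precondition was satisfied.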

The difficulties in the representation become clear when one tries to prove safety properties over it. Safety properties over a coinductively defined instruction sequence are necessarily coinductive. Suppose we create a coinductive list type with a cons constructor in addition to nil, and define the function:

from x = x : from (S x)

If S is the successor function, from x computes the infinite co-list of numbers starting from x (the greatest fixed point of this equation). Suppose we also have a predicate All : (P : Set) -> co-list -> bool, which returns true whenever P is true of every element of the co-list. This naturally corresponds to a safety property. We can now ask things like:

All (x > 3) (from 4)

That statement requires an infinite proof! We would need to unfold "from" an infinite number of times, and All needs unfolding an infinite number of times as well. Fortunately we can proceed using coinduction.
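The example above can be written as a runnable Haskell sketch, with S rendered as (+ 1). Of course Haskell cannot express the infinite proof itself; what we can compute is any finite approximation of All, which is exactly what the coinductive argument covers all at once by showing the predicate is preserved through one unfolding of from:

```haskell
-- The stream from the post, with the successor S written as (+ 1).
from :: Integer -> [Integer]
from x = x : from (x + 1)

-- "All P" over an infinite stream is not computable, but every finite
-- prefix of it is. The coinductive proof establishes what these checks
-- can only sample: P holds at the head and is preserved by unfolding,
-- hence holds along the whole stream.
allPrefix :: Int -> (a -> Bool) -> [a] -> Bool
allPrefix n p = all p . take n
```

So `allPrefix n (> 3) (from 4)` holds for every n, while the coinductive proof of `All (> 3) (from 4)` dispenses with n altogether.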
