User-defined constraint domain

This tutorial describes a technique for defining a limited form of constraint domain in Twelf, that is, a type with non-trivial equations among its terms. Unlike Twelf's built-in (but currently problematic) constraint domains of rationals, integers, and strings, these constraint domains are already part of the LF type theory and thus interact nicely with each other and with the rest of the theory. And, as we shall see, they can be used for many of the same purposes, not to mention a whole host of new ones, including a limited form of type refinement.

Although this tutorial has (user-defined) constraint domains as its main theme, it touches on a lot of other topics and techniques, and so has some value as a general tutorial on Twelf.

Motivation

We begin with some motivation (and review), which you can skip if you are interested in getting right to the examples that follow.

LF is a representational framework: given a particular formal system, we construct LF signatures that determine sets of terms (and types and kinds) that can stand as representatives of the constituents of the formal system. We do this in such a way that the syntactic and proof-theoretic structures of the formal system, for example the binding structure of terms or the inductive construction of proofs, are mirrored in the type-theoretic structure of LF. The measure of success for such a representation is what we call adequacy.

Equality of terms (and types and kinds) clearly plays a key role here, since adequacy depends on what terms are considered equal. Twelf's built-in notion of equality, which is much coarser than plain syntactic identity, is the combination of beta- and eta-equivalence. The former, along with the notion of higher-order abstract syntax, is what allows us to represent object-level substitution using meta-level function application. And the latter implies a kind of extensionality or behavioral equivalence that gets rid of some annoying duplication in our representations that would otherwise make adequacy much more difficult to achieve. Beta-eta equivalence has the very nice property that its equivalence classes have unique canonical forms, which are inductively defined; it is these canonical forms that are used to represent object-level entities. Equality also plays a fundamental role in the operational semantics of Twelf, since one of its main components, unification, is all about finding substitutions that make a set of terms equal up to beta-eta equivalence, which is a crucial but quite non-trivial endeavor.

However, despite the importance of equality, it remains an internal notion: there are no built-in "equality types" that we can use to reason about equality explicitly, for example to prove uniqueness of constructed objects or functionality of relations. Instead, we must define equality for each type where it is needed, and prove respects and uniqueness lemmas in order to make use of it. This approach has the advantage, though, that the same or closely-related techniques will work when the equality we define is not the built-in equality but one that carries more information.
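To make this concrete, here is a minimal sketch of the usual pattern (the type exp, its constructor c, and the names id, id/refl, and id-resp-c are all ours, for illustration): equality at a given type is declared with reflexivity as its only constructor, and a "respects" lemma is then proved for each constructor that must be shown to preserve it.

exp : type.
c : exp -> exp.

id : exp -> exp -> type.
id/refl : id E E.

% a typical "respects" lemma for the constructor c
id-resp-c : id E E' -> id (c E) (c E') -> type.
%mode id-resp-c +D -D'.

id-resp-c/refl : id-resp-c id/refl id/refl.

%worlds () (id-resp-c _ _).
%total D (id-resp-c D _).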

Another crucial aspect of LF from a representational standpoint is that its terms are generated freely from its signature. That is, no terms are present that are not built from the constructors in the signature and the variables in the context, and no identifications between terms are made, except those that follow from beta-eta equivalence. Without this, it would be difficult to represent aspects of our formal systems that are defined (freely) by induction. On the other hand, this freeness can also be an impediment, as when what we really want is a kind of algebra: a set endowed with certain operations that satisfy (non-trivial) equations. Think, for example, of the type bool, which is freely generated by its two constants, true and false, but which has operations and, or, not, implies, etc., that satisfy the equations of Boolean Algebra.

We are therefore led to wonder whether we can represent such "algebras" as LF types.

Example 1: Even and Odd

We borrow our first example from the tutorial on Output factoring:

nat : type.
z : nat.
s : nat -> nat.

odd : nat -> type.
even : nat -> type.

z-e : even z.
s-o : odd (s X) <- even X.
s-e : even (s X) <- odd X.

Here, even and odd are predicates on nat and are defined in terms of their relation to s. The output factoring tutorial shows you how to prove that, under this definition, every natural number is either even or odd; the statement of the theorem requires a disjoint-union type, and the proof requires a lemma to take care of an output coverage problem.
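To give the flavor of that development, the statement might be sketched as follows (the names parity, parity/even, parity/odd, and even-or-odd are ours; see the output factoring tutorial for the actual formulation):

parity : nat -> type.
parity/even : even N -> parity N.
parity/odd  : odd N -> parity N.

even-or-odd : {N:nat} parity N -> type.
%mode even-or-odd +N -P.

Here the family parity plays the role of the disjoint union of the two predicates.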

An approach using indexing

Since the subsets of even and odd natural numbers are disjoint, you might think of another approach to representing these subsets, namely, by introducing a new type of natural number sorts, and indexing nat by the sort:

nat_sort : type.
even : nat_sort.
odd : nat_sort.

nat : nat_sort -> type.
z : nat even.
s : nat even -> nat odd.    % Doesn't work:
s : nat odd -> nat even.    % redefinition of s

We'd like for s to have both of the declared types at the same time, but that is not possible in LF (though some proposed extensions of LF to include refinement types do allow something like this). Alternatively, we might try to write something like

s : nat X -> nat (f X).

for some term f of type nat_sort -> nat_sort with the property that f even = odd and f odd = even, but again this is not possible in LF, since terms are freely generated and functional terms are parametric in their arguments (i.e., there are no destructors or case-splitting on arguments). This approach seems to be stuck. Or is it?

A constraint domain of natural-number sorts

The idea is to use a higher-order "Church-encoding" for our two-element type and to use abbreviations to "hide" the implementation of this type:

t : type.

%abbrev nat_sort : type = t -> t -> t.
%abbrev even : nat_sort = [even:t] [odd:t] even.
%abbrev odd  : nat_sort = [even:t] [odd:t] odd.

%abbrev add1-mod2 : nat_sort -> nat_sort = [s:nat_sort] [even:t] [odd:t] s odd even.

nat : nat_sort -> type.
z : nat even.
s : nat X -> nat (add1-mod2 X).

Why does this work? As long as the type t does not become subordinate to any other type, the only canonical terms of type nat_sort are even and odd, and so the type looks "from the outside" just like the type nat_sort from the previous section. However, add1-mod2, rather than being a constructor on this type that freely generates new terms, is instead a "pre-existing" canonical term of type nat_sort -> nat_sort. In fact, there are four canonical terms of this type, the other three being

%abbrev const_even : nat_sort -> nat_sort = [s:nat_sort] [even:t] [odd:t] s even even.   % or: [s] even
%abbrev const_odd  : nat_sort -> nat_sort = [s:nat_sort] [even:t] [odd:t] s odd  odd.    % or: [s] odd
%abbrev identity   : nat_sort -> nat_sort = [s:nat_sort] [even:t] [odd:t] s even odd.    % or: [s] s

Similarly, there are (2²)² = 16 canonical terms of type nat_sort -> nat_sort -> nat_sort, and 2⁴ = 16 canonical terms of type (nat_sort -> nat_sort) -> nat_sort, and so on. In fact, what we have here are all lambda-definable functions in the full type-hierarchy over {even, odd}, which, since this set is finite, is the same as the full (set-theoretic) type-hierarchy over {even, odd}. And these functions are just there waiting to be used!
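Before putting them to use, you can convince yourself that these terms really do compute as claimed, by asking Twelf to accept definitions whose types depend on the expected equations holding up to beta-eta equivalence (the names check-one and check-two are ours):

%abbrev check-one : nat odd  = s z.
%abbrev check-two : nat even = s (s z).

Here s z, whose type is nat (add1-mod2 even), checks against nat odd only because add1-mod2 even normalizes to odd.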

To see one such use, let's define the predicate plus X Y Z. What should be its type? We'll give two definitions, the first with minimal type information:

plus_min : nat X -> nat Y -> nat Z -> type.
%mode plus_min +N1 +N2 -N3.

plus_min_z : plus_min N1 z N1.
plus_min_s : plus_min N1 (s N2) (s N3) <- plus_min N1 N2 N3.

%worlds () (plus_min N1 N2 N3).
%total N2 (plus_min N1 N2 N3).

This works fine, but we can do better. We know, for example, that if X and Y have the same sort, then Z will be even, and if they have different sorts, then Z will be odd. In other words, tracking the sorts of the elements involved in an addition requires being able to compute addition mod 2. But this is just a (finite) function in the type-hierarchy over {even, odd}, and so, to get a maximally precise type for plus, we can simply define the term of type nat_sort -> nat_sort -> nat_sort that represents addition mod 2, and then use it in the type:

%abbrev plus-mod2 : nat_sort -> nat_sort -> nat_sort
   = [s1:nat_sort] [s2:nat_sort] [even:t] [odd:t] s1 (s2 even odd) (s2 odd even).

plus : nat X -> nat Y -> nat (plus-mod2 X Y) -> type.
%mode plus +N1 +N2 -N3.

plus_z : plus N1 z N1.
plus_s : plus N1 (s N2) (s N3) <- plus N1 N2 N3.

%worlds () (plus N1 N2 N3).
%total N2 (plus N1 N2 N3).

See Twelf's output. You'll see in this output that all of our abbreviations have been expanded, which makes the output a bit harder to read. It would have been better to have used definitions rather than abbreviations, but (at present anyway) this is, for technical reasons, incompatible with Twelf's meta-checks.
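(For reference, a definition differs from an abbreviation only in the omission of the %abbrev keyword, so we would have written

nat_sort : type = t -> t -> t.

and Twelf would then have kept the name nat_sort unexpanded in its output.)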

Also notice that, apart from the change in name from plus_min to plus, the declarations of plus_z and plus_s are the same; it's just the type that has become more precise. As an exercise, you may like to check that if f is any of the other 15 possible canonical forms of type nat_sort -> nat_sort -> nat_sort, and you replace the declaration of plus above with

plus : nat X -> nat Y -> nat (f X Y) -> type.

then at least one of the two declarations plus_z and plus_s will either fail to type-check or generate unresolved constraints. Finally, we observe that, as compared to the definitions of even and odd in the tutorial on output factoring, our encoding via the constraint domain nat_sort makes it intrinsic that every natural number is either even or odd, and so this is not something that needs to be proved.

Example 2: Polarities

For our next example, we are going to use the same constraint domain that we used in Example 1, and for a similar purpose (although we will be renaming it to suit the current situation). Suppose that we have a language with a type system that includes a function-space operator, =>, possibly along with some others, such as products, sums, and type constants, and we have represented these as canonical forms of an LF type tp. We want to add an operator mu to represent positive inductive types. What this means is that if we have a type expression F : tp -> tp such that for any x : tp, the type F x has only positive occurrences of x (that is, every occurrence of x is on the left-hand side of an even number of => operators), then we want to be able to form mu F : tp. Furthermore, these are the only mu-types that we want to form; i.e., we want this to be an intrinsic encoding. All of these types will have introduction and elimination forms, but for the purposes of this example, we will just focus on the types themselves.

An approach via predicates

A natural approach to this problem is to try to represent the positivity condition as a predicate pos on tp -> tp and to define this predicate mutually with another predicate neg on tp -> tp, as follows:

tp : type.

pos : (tp -> tp) -> type.
neg : (tp -> tp) -> type.

un : tp.                                       % unit type
*  : tp -> tp -> tp.    %infix right 11 *.     % product type
=> : tp -> tp -> tp.    %infix right 10 =>.    % function type
mu : {F:tp -> tp} pos F -> tp.                 % inductive type

pos_id : pos ([a] a).

pos_un : pos ([a] un).
pos_*  : pos ([a] F1 a) -> pos ([a] F2 a) -> pos ([a] F1 a * F2 a).
pos_=> : neg ([a] F1 a) -> pos ([a] F2 a) -> pos ([a] F1 a => F2 a).
pos_mu : ({b:tp} pos ([a] b) -> pos ([a] F a b)) -> pos ([a] mu ([b] F a b) _).

neg_un : neg ([a] un).
neg_*  : neg ([a] F1 a) -> neg ([a] F2 a) -> neg ([a] F1 a * F2 a).
neg_=> : pos ([a] F1 a) -> neg ([a] F2 a) -> neg ([a] F1 a => F2 a).
neg_mu : ({b:tp} neg ([a] b) -> neg ([a] F a b)) -> neg ([a] mu ([b] F a b) _).

Some comments about this code are in order:

  • We have chosen to make the type constructor F an explicit parameter of mu in order to make mu-terms appear more natural, with the positivity proof relegated to second position. However, we could just as easily have left this parameter implicit, declaring mu : pos F -> tp, since we would still have access to F through unification with terms involving type ascriptions, as in mu (P : pos F).
  • We define pos F and neg F by induction on the body of F, taking advantage of the inductive characterization of LF canonical forms of type tp -> tp as the canonical forms of type tp in a context that includes an extra assumption a : tp. The cases for mu say that a type a occurs positively (or negatively) in a mu-type if it occurs positively (or negatively) in its body, and are necessary if we want to allow nested mu-types (but see below!). The proof components of the mu-type are not used, hence the underscores _ in these declarations.
  • We have chosen to define pos and neg using types involving the -> arrow rather than the <- arrow that is typical in logic-programming-style definitions, e.g.,
    pos_* : pos ([a] F1 a * F2 a)
              <- pos ([a] F1 a)
              <- pos ([a] F2 a).
There are two reasons. First, these declarations are not the same: using the earlier declaration, a term of the form pos_* P1 P2 : pos ([a] F1 a * F2 a) has P1 as the positivity proof for F1 and P2 as the positivity proof for F2, whereas in the declaration just given, it is reversed! Second, we prefer to use -> when we are representing data (i.e., making object-level definitions) and <- when we are representing programs (i.e., proving metatheorems), but this is a matter of style.

All of this appears to work, and we were able, in passing, to make some useful comments about the code, but unfortunately there is a problem here, and the problem has to do with nested mu-types. As a simple example, take the μ-type μX. μY. X * Y (which could be considered the type of infinite streams of infinite streams of infinite streams of ...). In order for the outer mu in any representation of this type to be well-formed, its body, the inner mu, must have only positive occurrences of X in its body (which it seems to have), and similarly for the positivity of Y in the body of the inner mu, which is X * Y. However, you can see in the clause for pos_* above that, for Y to be positive in X * Y, not only does it have to be positive in Y, which it is by pos_id, but it also has to be positive in X, which we have not arranged to be the case.

Now, since every occurrence of a variable is both positive and negative in an expression in which it does not occur—vacuously—we could add two catch-all cases to our declarations to fix this problem:

pos_vac : pos ([a] F).
neg_vac : neg ([a] F).

We could then represent our μ-type μX. μY. X * Y with the term

streams : tp = mu ([x] mu ([y] x * y) (pos_* pos_vac pos_id)) (pos_mu ([y][py] pos_* pos_id py)).

However, as mentioned in the tutorial on catch-all cases, this creates non-determinism, since it is now possible to have multiple proofs of positivity for the same type constructor. For example, we have:

proof1 : pos ([a] un * un) = pos_* pos_un  pos_un.
proof2 : pos ([a] un * un) = pos_* pos_vac pos_un.
proof3 : pos ([a] un * un) = pos_* pos_un  pos_vac.
proof4 : pos ([a] un * un) = pos_* pos_vac pos_vac.
proof5 : pos ([a] un * un) = pos_vac.

and so, filling in the hole labeled %{proof}% in the term

mu ([x] un * un) ( %{proof}% )

with each of the five proofs above, we would get five different canonical terms of type tp representing the same μ-type, μX. 1 * 1! This destroys the adequacy of our representation, and is why it's best to avoid catch-all cases in object representations. On the other hand, the same can't be said, in general, of metatheorem representation, since there we are usually more concerned with the existence of proofs than with their unique representation (whatever that means: if you really figure out what it means for two proofs to be "equal", let me know!) Finally, just as we mentioned in connection with Example 1 that some proposed extensions to LF to include refinement types could give a nice solution to the representation of even and odd natural numbers (among many other such "subsorts"), we should also mention here that some proposed extensions to LF to include proof-irrelevance could address the problem we have described above, by identifying all elements, or "proofs", of certain types, restoring adequacy of such representations.

See Twelf's output for the code above.

Is there an adequate intrinsic encoding?

So how can we achieve an adequate intrinsic encoding of μ-types? The situation is worse than you might think: there doesn't seem to be any hope of getting such an encoding. We've seen that adding a catch-all case destroys adequacy, yet we've also seen that, in representing the μ-type μX. μY. X * Y, we need a positivity assumption about X in order to show that Y occurs positively in X * Y. So our only hope seems to be to add it to the argument of mu, thus:

mu : {F:{a:tp} pos ([b] a) -> tp} pos F -> tp.

But this is not well-typed unless we also either

  1. change the type of pos so that it accepts arguments of the same type as F, or
  2. extract the underlying constructor of type tp -> tp from F and use that instead as the argument to pos.

But neither of these is possible: we can't do 1 because pos would then have to be declared as

pos : ({a:tp} pos ([b] a) -> tp) -> type

which is circular, and we can't do 2, because to get a tp output from F, we'd need first to construct something of type pos ([b] a), which is exactly the problem we were trying to solve in the first place. So, it seems that our search for an adequate intrinsic encoding of μ-types is doomed. Or is it?

Using the even/odd constraint domain

You may remember our original characterization of what it means for a type expression F : tp -> tp to have only positive occurrences of its bound variable, namely, that every occurrence of the variable is on the left-hand side of an even number of => operators. This suggests that, in the same way that we used the nat_sort constraint domain in Example 1 as an index to nat in order to capture the even (and odd) natural numbers, we can also use it here as an index to tp to capture occurrences on the left of an even (and odd) number of =>s. When we follow this idea to its logical conclusion, this is what we get:

t : type.

%abbrev pol : type = t -> t -> t.                              % polarities
%abbrev opp : pol -> pol = [p:pol] [pos:t] [neg:t] p neg pos.  % the opposite polarity

tp : pol -> type.

un : tp P.                                                     % unit type
*  : tp P -> tp P -> tp P.            %infix right 11 *.       % product type
=> : tp (opp P) -> tp P -> tp P.      %infix right 10 =>.      % function type
mu : (tp P -> tp P) -> tp P.                                   % inductive type

That's it! You may find it hard to believe that this works, so let's look at it more closely. The constraint domain is the same one we used before, with nat_sort renamed to pol and add1-mod2 renamed to opp. In all of the type constructors the polarity index is implicitly quantified, so each will apply to both polarities, but the individual polarities themselves have no meaning—there isn't a "positive" polarity or a "negative" polarity, for example—so we haven't given them names. The main force of the polarities is to be found in the rules for => and mu. In the case of =>, the rule says that, whatever polarity is to be assigned to the type A => B, the same polarity must be assigned to B and the opposite polarity must be assigned to A. And in the case of mu, the fact that the functional argument to mu has domain and codomain types with the same polarity means that, if we consider this functional argument to be an abstract syntax tree with instances of the bound variable at the leaves, then an even number of applications of opp must occur between the root of this tree and any occurrence of the bound variable at the leaves.

In short, you can think of the rules as giving conditions for a particular simultaneous polarity assignment to all of the subterms of a given term to be coherent, and you'll notice that if one polarity assignment is coherent, then so is its opposite, reinforcing the idea that the individual polarities are meaningless. (We might say, following Einstein, that positivity and negativity are not absolute notions, but are only meaningful relative to other instances of one another!) Thus, since each of the type constructors relates the polarity of its "output" to the polarities of its "inputs" (with, in the case of mu, an assumption about the polarity of the bound type variable), a given term representing one of our types will have, overall, either zero or exactly two polarity assignments for its subterms that can be coherent, with only the latter case corresponding to the type being well-formed. It follows that the product type {P:pol} tp P adequately represents our informal notion of positive inductive types.
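One equation that this explanation depends on is that opp is an involution up to beta-eta equivalence, i.e., opp (opp P) = P. You can have Twelf verify this by checking an identity coercion whose type equates the two sides (the name opp-invol is ours):

%abbrev opp-invol : {P:pol} tp P -> tp (opp (opp P)) = [P:pol] [A:tp P] A.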

Here are some sample encodings of μ-types, starting with the two examples we discussed before:

%abbrev pos_ind : type = {P:pol} tp P.

streams : pos_ind = [P] mu [x] mu [y] x * y.
ones    : pos_ind = [P] mu [x] un * un.

ex1 : pos_ind = [P] mu [x] un => x.
ex2 : pos_ind = [P] mu [x] (x => un) => x.
ex3 : pos_ind = [P] mu [x] (x * x => un) => x * x.
ex4 : pos_ind = [P] mu [x] (mu [y] x => y) => x.

See Twelf's output.

You might also like to check that each of these definitions fails to type-check:

ex5 : pos_ind = [P] mu [x] x => un.
ex6 : pos_ind = [P] mu [x] x => x.
ex7 : pos_ind = [P] mu [x] x * (mu [y] x => y).

Curiously, if you remove the type ascriptions and polarity abstractions from these three ill-formed mu-types, then they do type-check, but only because Twelf, during type-reconstruction, abstracts a variable X of type t from these definitions and then assigns the "constant polarity" [t1][t2] X (which is its own opposite!) to the contradictory subterms. This illustrates again the reliance of this method on the fact that the basic type t cannot have any canonical elements.
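You can check directly that such a constant polarity is its own opposite, again via an identity coercion (the names const-pol and self-opp are ours):

%abbrev const-pol : t -> pol = [x:t] [pos:t] [neg:t] x.
%abbrev self-opp : {x:t} tp (const-pol x) -> tp (opp (const-pol x)) = [x:t] [A] A.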

Example 3: Uniform syntactic categories

Looking over the examples from the previous two sections, you may have noticed a pattern in the method we used to achieve simplification in each case, a method that we might call intrinsification by indexing. That is, we were given a type A with some unary predicates P on that type, and we "intrinsified" those predicates by turning A into a type family A' indexed by a particular constraint domain whose elements represented those predicates; for any predicate P, the elements of A' P were exactly the original elements of A that satisfied P. This opened up the possibility of giving more precise types to functions defined on A that respected the predicates, since we were able to "track" their behavior on the indices by using definable functions in the type-hierarchy over the constraint domain.

Thus, in Example 1, the type A was nat and the predicates were even and odd. These predicates became elements of a constraint domain nat_sort that we used to index A, and the functions s and (via its graph) plus could be tracked by functions add1-mod2 and plus-mod2 on nat_sort. And in Example 2, the type A was tp -> tp and the predicates were pos and neg. These predicates became elements of a constraint domain pol and the various type constructors were tracked with the help of (the identity function and) the function opp on pol.

We will continue to explore this theme by using a couple of examples from the typed lambda calculus and LF itself. But first, let's look a little more carefully at the conditions under which we can expect the methods of the previous two sections to work. Clearly, if we want there to be a bijection between A and the collection of types A' P indexed by the predicates P, then the extensions of these predicates must form a partition of A (i.e., they must be pairwise disjoint and collectively exhaustive). This may seem like a severe restriction, but there are some ways around it, as we will see shortly. Another restriction of the method is that, if we want to give more informative types for functions defined on A, these must not only respect the predicates, but the induced functions on the indices must also be uniform, in the sense that they can be defined parametrically on the index. These restrictions will become clearer as we look at more examples.

Simply-typed lambda calculus with values

Our next example won't actually require a constraint domain, but it nevertheless illustrates the intrinsification-by-indexing theme we are developing, and it shows how you can sometimes get around some of the limitations of the method that were discussed at the end of the previous section. Suppose we want to represent a simply-typed call-by-value lambda-calculus with a (big-step) evaluation relation that relates arbitrary terms with their values under evaluation, and we decide to represent typed terms intrinsically and values extrinsically:

tp : type.
=> : tp -> tp -> tp.   %infix right 10 =>.

exp : tp -> type.
lam : (exp A -> exp B) -> exp (A => B).
app : exp (A => B) -> exp A -> exp B.

val : exp A -> type.
lam-val : val (lam [x] E x).

eval : exp A -> exp A -> type.
eval-lam : eval (lam [x] E x) (lam [x] E x).
eval-app : eval (app E1 E2) V
            <- eval E1 (lam [x] E1' x)
            <- eval E2 V2
            <- eval (E1' V2) V.

That is, the encoding of typed terms here is intrinsic in that we are only representing well-typed terms (along with their types), and not separate types and (not necessarily well-typed) terms that are related by a typing judgment. And our encoding of values is extrinsic in that we are representing values as a subset of terms through a separate judgment, instead of similarly building the notion into the terms. (See the tutorial on Intrinsic and extrinsic encodings, which uses some similar examples.)

See Twelf's output for the above declarations.

Now, we'd like to "intrinsify" the representation of values in this encoding, but there is an obstacle here that wasn't present in our previous examples: we have two predicates here corresponding to values and general terms, but they aren't disjoint: every value is a general term. The solution is to introduce a separate predicate for non-values which, together with values, partitions the set of terms. Fortunately, this is possible here, since non-values have a simple inductive definition: they are precisely the applications. Here is the final result:

tp : type.
=> : tp -> tp -> tp.  %infix right 10 =>.

stlc_sort : type.
val : stlc_sort.
non-val : stlc_sort.

exp : stlc_sort -> tp -> type.

lam : (exp val A -> exp S B) -> exp val (A => B).
app : exp S1 (A => B) -> exp S2 A -> exp non-val B.

eval : exp S A -> exp val A -> type.
ev-lam : eval (lam [x] E x) (lam [x] E x).
ev-app : eval (app E1 E2) V
          <- eval E1 (lam [x] E1' x)
          <- eval E2 V2
          <- eval (E1' V2) V.

Now we can see why a constraint domain wasn't required in this case: there are no special functions needed to track lam and app on the indices. Yet we still get an intrinsic encoding with more informative types for lam, app, and eval. And whereas with the extrinsic encoding we would have to prove a metatheorem that if eval E V then val V, with the intrinsic encoding this comes for free.
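For comparison, here is roughly how that metatheorem would have to be stated against the extrinsic encoding given first (the name eval-val is ours; the proof, which we omit, would proceed by induction on the evaluation derivation):

eval-val : eval E V -> val V -> type.
%mode eval-val +D -P.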

See Twelf's output.

The syntax of Canonical LF

In modern presentations of the LF type theory (see the glossary entry for canonical forms for more information), the set of syntactic objects is divided into five categories: kinds, canonical types, atomic type families, canonical terms, and atomic terms, often denoted by the letters K, A, P, M, and R. This stratification of syntax necessitates a similar stratification of the typing judgment (or, as we will call it, the classification judgment), the hereditary substitution judgment, and the statements of theorems about these judgments, for example the substitution theorem.

In this section, we will show how it is possible to define a constraint domain cat of (syntactic) categories to allow an economical presentation of the syntax, judgments, and metatheorems of LF. In particular, we will have one type family of terms, one classification judgment, one substitution judgment, and one substitution theorem statement, each of which is indexed by a syntactic category. In order to achieve this unification, we will have to make use of some functions in the type hierarchy over cat, much the way we did with add1-mod2, plus-mod2, and opp above. To simplify the presentation for this tutorial, we will not be representing signatures, (type family or term) constants, subordination, the simple-type subscripts to hereditary substitution, or the bi-directionality of the typing judgments and substitution theorem.

The constraint domain of syntactic categories

We first define the constraint domain, cat. There are two subtleties here. First, because our unified classification judgment relates a syntactic object to its classifier, it will be necessary for the purpose of uniformity to introduce a category to classify kinds; we call it "unit" (letter: U) and give it a unique element * (this element might be called a "hyperkind"). Second, the hereditary substitution operation, when applied to an atomic term R, can produce either a canonical term or an atomic term, depending on whether the substitution variable occurs at the head of R or not. Consequently, we introduce a syntactic category to represent the disjoint union of canonical and atomic terms (letter: M+R), and we introduce two injections from M and R into M+R as term constructors.

Here is the definition of the constraint domain:

t : type.

%abbrev cat : type = t -> t -> t -> t -> t -> t -> t -> t.  % seven elt type

%abbrev k   : cat = [k][a][p][m][r][m+r][u] k.
%abbrev a   : cat = [k][a][p][m][r][m+r][u] a.
%abbrev p   : cat = [k][a][p][m][r][m+r][u] p.
%abbrev m   : cat = [k][a][p][m][r][m+r][u] m.
%abbrev r   : cat = [k][a][p][m][r][m+r][u] r.
%abbrev m+r : cat = [k][a][p][m][r][m+r][u] m+r.
%abbrev u   : cat = [k][a][p][m][r][m+r][u] u.

Terms and term constructors

We can now introduce the type tm of terms, which is indexed by cat, as well as the various term constructors:

tm  : cat -> type.

typ      : tm k.                               % type (the base kind)
pi_k     : tm a -> (tm r -> tm k) -> tm k.     % kind-level product

coer_pa  : tm p -> tm a.                       % type coercion
pi_a     : tm a -> (tm r -> tm a) -> tm a.     % type-level product

app_p    : tm p -> tm m -> tm p.               % atomic family application

coer_rm  : tm r -> tm m.                       % term coercion
lam_m    : (tm r -> tm m) -> tm m.             % term-level lambda abstraction

app_r    : tm r -> tm m -> tm r.               % atomic term application

canon    : tm m -> tm m+r.                     % left injection
atom     : tm r -> tm m+r.                     % right injection

*        : tm u.                               % unit

Here, we have used the standard higher-order abstract syntax representations of binders, which in this case always abstract over atomic term variables. We have also included explicit coercions from atomic to canonical types and terms, since we really have disjoint syntactic categories here, instead of the usual overlapping ones, so some operation is needed to move from one category to another.
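As a small closed example, the representation of the canonical identity term λy. y lands in category m, with coer_rm marking the variable occurrence (the name id_tm is ours):

%abbrev id_tm : tm m = lam_m [y] coer_rm y.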

Main judgments

We come now to the main judgments, classification and substitution. We haven't yet made any special use of the constraint domain—for everything above, we could have used an ordinary type—but this is where we will need it. The reason is that, in order to state these judgments in a general enough way to include all of the usual stratified LF judgments, we will need to be able to track the function that takes a category to the category of the objects that classify it, as well as the function that takes a category to the category of the objects that result from substitutions into it (recall the subtlety with M+R). These are provided by the functions class and subst, respectively:

%abbrev class : cat -> cat = [X:cat] [k][a][p][m][r] [m+r][u]
                              X       u  u  k  a  a   a    u.

%abbrev subst : cat -> cat = [X:cat] [k][a][p][m][r] [m+r][u]
                              X       k  a  p  m  m+r m+r  u.

classj : tm X -> tm (class X) -> type.

substj : (tm r -> tm X) -> tm m -> tm (subst X) -> type.

Here, classj U V means that U is classified by (has type, has kind, has hyperkind) V, and substj ([x] E x) M E' means that E' is the result of substituting M for x in E. Also, notice that the category index of each of these judgments is implicit, so you won't see any categories mentioned in the rules that follow.
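As a sanity check on the tracking functions (the names below are ours), you can ask Twelf to accept identity coercions that hold only because class m normalizes to a (canonical terms are classified by canonical types) and subst r normalizes to m+r (substitution into an atomic term yields either a canonical or an atomic term):

%abbrev check-class-m : tm (class m) -> tm a   = [x] x.
%abbrev check-subst-r : tm (subst r) -> tm m+r = [x] x.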

Rules for classification

We now give the encodings of the rules for classification. For the purposes of this tutorial, it is not necessary that you study these rules in detail (unless they interest you!); just notice that Twelf is, in each case, able to infer the syntactic categories of all syntactic constituents of these rules from the term constructors involved, as well as from the class function used to track the category of the classifier.

canon_kind_type : classj typ *.

canon_kind_pi   : classj (pi_k A2 [y] K y) *
                   <- classj A2 *
                   <- {y} classj y A2 -> classj (K y) *.

canon_fam_pi    : classj (pi_a A2 [y] A y) *
                   <- classj A2 *
                   <- ({y} classj y A2 -> classj (A y) *).

canon_fam_atom  : classj (coer_pa P) *
                   <- classj P typ.

atom_fam_app    : classj (app_p P1 M2) K
                  <- classj P1 (pi_k A2 [y] K1 y)
                  <- classj M2 A2
                  <- substj ([x] K1 x) M2 K.

canon_term_lam  : classj (lam_m [y] M y) (pi_a A2 [y] A y)
                   <- ({y} classj y A2 -> classj (M y) (A y)).

canon_term_atom : classj (coer_rm R) (coer_pa P)
                   <- classj R (coer_pa P).

% atom_term_var not necessary

atom_term_app   : classj (app_r R1 M2) A
                   <- classj R1 (pi_a A2 [y] A1 y)
                   <- classj M2 A2
                   <- substj ([x] A1 x) M2 A.

canon_atom_can  : classj (canon M) A
                   <- classj M A.

canon_atom_atom : classj (atom R) A
                   <- classj R A.

unit_unit       : classj * *.

Rules for substitution

Similarly, we present the rules for substitution. Just notice here that categories are inferred in the same way as with classification, with the added subtlety of the disjoint union type returned by substitutions into atomic terms:

subst_k_type  : substj ([x] typ) M0 typ.

subst_k_pi    : substj ([x] pi_k (A x) [y] K x y) M0 (pi_k A' [y] K' y)
                 <- substj ([x] A x) M0 A'
                 <- {y} substj ([x] y) M0 (atom y)
                    -> substj ([x] K x y) M0 (K' y).

subst_a_p     : substj ([x] coer_pa (P x)) M0 (coer_pa P')
                 <- substj ([x] P x) M0 P'.

subst_a_pi    : substj ([x] pi_a (A2 x) [y] A x y) M0 (pi_a A2' [y] A' y)
                 <- substj ([x] A2 x) M0 A2'
                 <- {y} substj ([x] y) M0 (atom y)
                    -> substj ([x] A x y) M0 (A' y).

subst_p_app   : substj ([x] app_p (P x) (M x)) M0 (app_p P' M')
                 <- substj ([x] P x) M0 P'
                 <- substj ([x] M x) M0 M'.

subst_m_lam   : substj ([x] lam_m [y] M x y) M0 (lam_m [y] M' y)
                 <- {y} substj ([x] y) M0 (atom y)
                    -> substj ([x] M x y) M0 (M' y).

subst_m_rh  : substj ([x] coer_rm (R x)) M0 M'
               <- substj ([x] R x) M0 (canon M').

subst_m_r   : substj ([x] coer_rm (R x)) M0 (coer_rm R')
               <- substj ([x] R x) M0 (atom R').

subst_rh_var  : substj ([x] x) M0 (canon M0).

% subst_r_var not necessary

subst_rh_app : substj ([x] app_r (R1 x) (M2 x)) M0 (canon M'')
                <- substj ([x] R1 x) M0 (canon (lam_m [y] M' y))
                <- substj ([x] M2 x) M0 M2'
                <- substj ([y] M' y) M2' M''.

subst_r_app : substj ([x] app_r (R1 x) (M2 x)) M0 (atom (app_r R1' M2'))
                <- substj ([x] R1 x) M0 (atom R1')
                <- substj ([x] M2 x) M0 M2'.

subst_m+r_canon : substj ([x] canon (M x)) M0 (canon M')
                   <- substj ([x] M x) M0 M'.

subst_m+r_atom  : substj ([x] atom (P x)) M0 (atom P')
                   <- substj ([x] P x) M0 (atom P').

subst_unit_unit : substj ([x] *) M0 *.

Statement of the substitution theorem

We finish up this example with one more illustration of the unification afforded by our constraint domain: the statement of the substitution theorem for LF. The theorem statement involves an implicit quantification over the syntactic category. This means, in particular, that it isn't necessary to prove this by a mutual induction over the syntactic category, as is usually the case. (Note, however, that because of the simplifications to LF that we have used for this presentation, it won't be possible to prove the theorem in this form.)

sub_thm : {T:tm r -> tm X} classj M0 A0
           -> ({x:tm r} classj x A0 -> classj (T x) (C x))
           -> substj ([x] C x) M0 C'
           %%---------------------------------------------
           -> {T':tm (subst X)}
              substj ([x] T x) M0 T'
           -> classj T' C'
           -> type.

See Twelf's output for all the code in this section. Expansion of abbreviations is really hurting us here in this output, since cat expands to a type with seven arrows!

Example 4: Lists of length n

All of the constraint domains that we have used so far have been finite. For our last example, we will use an infinite constraint domain to implement one of the examples from the Twelf Manual that was used to illustrate the built-in constraint domain of rational numbers. The example defines a dependent type of lists of items, where the index is the length of the list. It then defines an append relation, using addition to track the index of the result. We can achieve the same result here by using a Church encoding of the natural numbers:

t : type.

%abbrev nat = (t -> t) -> t -> t.

%abbrev 0   = [s:t -> t] [z:t] z.
%abbrev s   = [n:nat] [s:t -> t] [z:t] s (n s z).
%abbrev +   = [n:nat] [m:nat] [s:t -> t] [z:t] n s (m s z).  %infix right 10 +.

item : type.
list : nat -> type.    % list N = item lists of length N

nil  : list 0.
cons : item -> list N -> list (s N).

append : list M -> list N -> list (M + N) -> type.

append_nil  : append nil L L.
append_cons : append (cons T L1) L2 (cons T L3)
               <- append L1 L2 L3.

It is interesting to note here that, in order to type-check the declarations for append_nil and append_cons, Twelf needs to prove for itself that the natural numbers defined above satisfy, respectively,

  1. 0 + X = X, for all X in nat, and
  2. X + Y = Z implies (s X) + Y = (s Z), for all X, Y, and Z in nat.

These are statements—an equation and an implication between two equations—that are universally quantified over an infinite set, and are normally the kind of statement that is proved by induction on the natural numbers. So, is Twelf secretly proving statements by induction during type-checking? No, but that would be nice! In fact, in the case of 1, if you apply the definition of + to 0 and a free variable X of type nat and then (beta-eta) normalize the result, you will see that it is equal (i.e., beta-eta equivalent) to X. Similarly, if you apply the definitions of + and s to (s X) + Y, on the one hand, and s(X + Y) on the other, for free variables X and Y of type nat, you will see that they are again equal. Thus, all Twelf needs in order to prove 1 and 2 above is beta-eta equivalence!
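In fact, you can ask Twelf to confirm both facts directly: the first as stated, and the second via the equivalent equation (s X) + Y = s (X + Y). Just check identity coercions whose types equate the two sides (the names below are ours):

%abbrev check-plus-zero : {X:nat} list X -> list (0 + X)
   = [X:nat] [L] L.
%abbrev check-plus-succ : {X:nat} {Y:nat} list ((s X) + Y) -> list (s (X + Y))
   = [X:nat] [Y:nat] [L] L.

Each of these type-checks precisely because the two list types are beta-eta equivalent.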

See Twelf's output.

Conclusion

The techniques described in this tutorial fit into the general category of using indexing of types instead of predicates to make encodings more intrinsic. Our so-called user-defined constraint domains were one variety of index type that we found to be useful in this regard, since they allow us to define "tracking functions" that would be impossible to define with a free representation.

However, there are still some issues that we have not yet examined. For example,

  • All of our domains were defined in terms of a type t, and we saw that it was necessary for the method that the type hierarchy over t (or at least the part of it that we are using) not acquire any new canonical elements beyond those that exist for every t, but we didn't talk about the best way to ensure this in our developments.
  • The examples we gave where certain ill-formed terms produced unresolved constraints when we tried to type-check them are in a kind of nether world between terms that type-check and terms that fail to type-check. We would like to believe that the set of constraints produced on these examples by Twelf's type-checker is unresolvable, but there is no part of Twelf that can check that.

Nevertheless, these examples illustrate once again the power of LF as a representational framework.