Wednesday, March 14, 2012

Proving that if an element normalizes in at most n steps then it normalizes in at most (n+1) steps.

Of course, the above is trivial IRL. However, if "at most n steps" is formalized indirectly in Coq, the proof can be nontrivial.


Module Termination.


Parameter T : Set.


Parameter R : T -> T -> Prop.


Fixpoint terminates (n : nat) (gamma : T) : Prop :=
  match n with
    | O => False
    | S n' =>
      forall gamma' : T,
        (R gamma gamma' -> terminates n' gamma')
  end.

The proposition "terminates n gamma" means that gamma reduces to a normal form in at most (n-1) steps. The "at most" part of the definition is very implicit in the way the second case is written. Proving that this is actually the case:

Theorem terminates_monotone : forall n gamma,
  terminates n gamma -> terminates (S n) gamma.

i.e. that if gamma normalizes in at most (n-1) steps then it normalizes in at most n steps, is not trivial and it shows why it pays to think about a proof before jumping into it. Here is the proof I came up with:

  induction n.
  inversion 1.
  intros.
  simpl.
  intros.
  simpl in H.
  specialize (H gamma' H0).
  specialize (IHn gamma' H).
  simpl in IHn.
  specialize (IHn gamma'0 H1).
  assumption.
Qed.

End Termination.

after some scribbling on paper. As an open question, how can we make the proof more automatic by making use of Ltac? The "intelligent" parts of the proof seem to be the choice of "n" as an induction parameter and the specializations of the different hypothesis, so these would probably need to be kept (or brute-forced).

Thursday, March 8, 2012

COQ: Proving the duality between forall and exists

I recently needed to prove the following theorem in Coq:

Module ForallExists.
Theorem forall_exists : forall D, forall P : D -> Prop,
~ (forall x : D, P x) <-> (exists x : D, ~ P x).

That is, if a property does not hold for all of the elements, then there must be an element for which the property does not hold. I don't think this is provable in Coq without classical reasoning (if I'm wrong, please correct me).

So I added classical reasoning:

Require Import Classical.

and then I went on to construct its proof:

1 subgoal
============================
forall (D : Type) (P : D -> Prop),
~ (forall x : D, P x) <-> (exists x : D, ~ P x)

The first thing I did was to split the equivalence proof into proofs of the two implications:

split.
2 subgoals
D : Type
P : D -> Prop
============================
~ (forall x : D, P x) -> exists x : D, ~ P x

subgoal 2 is:
(exists x : D, ~ P x) -> ~ (forall x : D, P x)

The first implication is the one which requires classical reasoning. After introducing the antecedent into the hypothesis:

intro.

2 subgoals
D : Type
P : D -> Prop
H : ~ (forall x : D, P x)
============================
exists x : D, ~ P x

subgoal 2 is:
(exists x : D, ~ P x) -> ~ (forall x : D, P x)

we want to continue by contradiction on whether such an x really exists. This is called Peirce's law

Check Peirce.
Peirce
: forall P : Prop, ((P -> False) -> P) -> P

and it comes with the Classical module. That is, if we can prove that the negation ~P of P (~P is shorthand for P -> False) implies P, then we can deduce P.

apply Peirce.
intros.

2 subgoals
D : Type
P : D -> Prop
H : ~ (forall x : D, P x)
H0 : (exists x : D, ~ P x) -> False
============================
exists x : D, ~ P x

subgoal 2 is:
(exists x : D, ~ P x) -> ~ (forall x : D, P x)

As H and H0 are contradictory, we will continue by proving False using H.

exfalso.
apply H.
intros.

2 subgoals
D : Type
P : D -> Prop
H : ~ (forall x : D, P x)
H0 : (exists x : D, ~ P x) -> False
x : D
============================
P x

subgoal 2 is:
(exists x : D, ~ P x) -> ~ (forall x : D, P x)

That is, it remains to be shown that P x holds (for arbitrary x). Again we reason using Peirce's law.

apply Peirce.
intros.

2 subgoals
D : Type
P : D -> Prop
H : ~ (forall x : D, P x)
H0 : (exists x : D, ~ P x) -> False
x : D
H1 : P x -> False
============================
P x

subgoal 2 is:
(exists x : D, ~ P x) -> ~ (forall x : D, P x)

As H0 and H1 are contradictory, we change the goal with False and prove the contradiction.

exfalso.
apply H0.

2 subgoals
D : Type
P : D -> Prop
H : ~ (forall x : D, P x)
H0 : (exists x : D, ~ P x) -> False
x : D
H1 : P x -> False
============================
exists x0 : D, ~ P x0

subgoal 2 is:
(exists x : D, ~ P x) -> ~ (forall x : D, P x)

The required x0 is easy to produce (it is sufficient to use x).

exists x.
assumption.

1 subgoal
D : Type
P : D -> Prop
============================
(exists x : D, ~ P x) -> ~ (forall x : D, P x)

and we are left with the second implication. This one is easier and we do not need classical reasoning for it.

intros.
unfold not.
intros.

1 subgoal
D : Type
P : D -> Prop
H : exists x : D, ~ P x
H0 : forall x : D, P x
============================
False

We first introduce everything into the hypothesis and next we produce a witness for H:

elim H.
intros.

1 subgoal
D : Type
P : D -> Prop
H : exists x : D, ~ P x
H0 : forall x : D, P x
x : D
H1 : ~ P x
============================
False

The rest is routine, as it is easy to derive the contradiction.

assert (P x).
apply H0.
contradiction.

Proof completed.

Qed.

End ForallExists.

Tuesday, February 21, 2012

The Real Reason for Lexing

All too often I see introductory texts say that parsing is really a two phase process:
  1. lexing
  2. (actual) parsing
And they say that phase 1 is done for efficiency reasons. Which is right, but only in an indirect manner. The reality of the situation is that people need(ed) to do parsing using only 1 token of lookahead. If you don't split your stream into tokens, even trivial languages cannot be parsed deterministically: e.g. seeing an "i" you don't know (yet) if what follows is an if-then-else statement or the identifier "iamalongidentifier". However, you can hack around this problem by first tokenizing/scanning/lexing the input stream. Then the lookahead is the entire string "iamalongidentifier" (or "if"), which solves the determinacy problem. There you have it: the real reason for this (historical) split-up into phases.

Wednesday, February 8, 2012

Proving the Strong Induction Principle for Natural Numbers in Coq



The following following inference rule:

        P 0         forall n : nat, (P n -> P (n + 1))
(SI)    ----------------------------------------------
                       forall n : nat, P n


is the structural induction (SI) principle for natural numbers, where P is the predicate to be proven. It says that if you can prove a property P holds for 0 and also prove the implication P n -> P (n + 1) for any natural n, then P holds of every natural number.

In contrast, consider the following inference rule:

        P 0         forall n : nat, ((forall m <= n, P m) -> P (n + 1))
(CI)    ---------------------------------------------------------------
                               forall n : nat, P n


This rule is the complete induction (CI) principle for natural numbers, also called the strong induction principle or the course-of-value induction principle. The difference is that in the inductive case, you are allowed to use the inductive hypothesis for any value which is structurally smaller than the current value. This makes CI seem stronger than SI.

In Coq, the (automatically generated) induction principle for inductive types is structural induction. We can check this by (re)defining naturals:

Section mynat.

Inductive nat : Set :=
  | O : nat
  | S : nat -> nat.

The constructor O denotes the natural number 0 and the constructor S denotes the successor of a natural number. The structural induction principle generated by Coq:

Check nat_ind.

<pre> nat_ind : forall P : nat -> Prop, P O -> (forall n : nat, P n -> P (S n)) -> forall n : nat, P n </pre>

corresponds to the (SI) rule above.

End mynat.

Fortunately, the two induction principles can be obtained one from the other. This blogpost presents the formal proof of the strong induction principle from the structural induction principle for natural numbers.

Section InductionPrinciple.

Require Export Datatypes.

We first define the proposition to be proven by induction as a parameter of the section.
Parameter P : nat -> Prop.

We also need a technical lemma to perform case analysis of less-than-or-equal-to.
Lemma le_S :
  forall n m : nat,
    n <= S m -> n <= m \/ n = S m.
Proof.
  intros.
  inversion H.
  right. reflexivity.
  left. assumption.
Qed.

We obtain the strong induction principle by first applying structural induction to prove the predicate Q(n) = (forall m <= n, P m).
Theorem strongind_aux :
  P 0 ->
  (forall n, (forall m, m <= n -> P m) -> P (S n)) ->
  forall n, (forall m, ((m <= n) -> P m)).
Proof.
  induction n as [ | n' IHn' ].
    intros m H1.
    inversion H1.
    assumption.
    intros.
    assert (m <= n' \/ m = S n'). apply le_S. assumption.
    inversion H2.
    apply IHn'; assumption.
    rewrite H3.
    apply (H0 n'); assumption.
Qed.

It is then very easy to show that Q(n) implies P(n).
Lemma weaken :
  (forall n, (forall m, ((m <= n) -> P m))) -> (forall n, P n).
Proof.
  intros H n.
  apply (H n n).
  apply le_n.
Qed.

And we obtain a proof of the strong induction principle by applying strongind_aux to show Q(n) for all n and then apply weaken to obtain P(n) from Q(n).
Theorem strongind :
  P 0 ->
  (forall n, (forall m, m <= n -> P m) -> P (S n)) ->
  forall n, P n.
Proof.
  intros.
  apply weaken.
  apply strongind_aux; assumption.
Qed.

End InductionPrinciple.
¯

Wednesday, January 11, 2012

Getting Started with Coq and Proof General

In this tutorial I will describe how to quickly get started with Coq
using the ProofGeneral plugin for Emacs. I'll also provide detailed
instructions for installing the three components on Ubuntu 11.10.

Coq is two things in one: a rather esoteric programming language and
proof assistant. The programming language is somewhat similar to
(Oca)ML and Haskell. The proof assistant allows to write formal
mathematical proofs that the computer automatically checks. More
interestingly, the proof assistant can be used to reason about
programs written in Coq itself.

Coq can be invoked either as a REPL (like the one in Python, Ocaml,
Haskell or Scheme) called coqtop or as a compiler called coqc that
turns coq source files (usually stored in .v files) into coq object
files (stored in .vo files) containing machine-checked proofs.

However, the best way to develop in Coq is by using an advanced editor
(we will be using Emacs) which can interact with the coq toplevel
(coqtop). This allows to combine the advantages of an editor (moving
source code around, moving around source code) with the advantages of
the REPL (checking if code works).

Emacs is an advanced, highly customizable and somewhat esoteric
editor. People who use it are usually very proficient at text editing
due to its streamlined non-standard interface. This tutorial does not
advocate the use of Emacs throughout, but I do think that it provides
the best interface for editing Coq developments.

Emacs itself cannot talk to the Coq REPL and it needs a plugin called
ProofGeneral which can do this. In fact ProofGeneral is a more general
plugin which enables Emacs to talk to proof assistants other than Coq,
but we will use it here only for Coq.

Under Ubuntu 11.10, to install everything you need, the single command
line:


sudo apt-get install emacs coq proofgeneral proofgeneral-doc
proofgeneral-coq


will do the trick. On Ubuntu 11.10, this will install Coq 8.3pl2,
ProofGeneral 3.7 and Emacs 23.3.1.

The proofgeneral package helpfully includes a command
/usr/bin/proofgeneral that will start Emacs with the ProofGeneral
plugin already loaded. Therefore, assuming that /usr/bin is in your
PATH variable, you can issue the command:


proofgeneral


to start up your environment (Emacs w/ the ProofGeneral plugin).

To open a file (or "visit" a file in Emacs-speak) where you save your
Coq development, type C-x C-f (this means keep the control key pressed
while typing 'x', following by the same action with 'f' instead of
'x'). Type the name of the file you want to open (let's say the
filename is "test1.v") followed by Enter. By convention, Coq files end
with the extension ".v".

Next type in the following Coq code:


Inductive bool : Set :=
| true
| false.

Theorem bool_only : forall b : bool, b = true \/ b = false.
intros.
destruct b.
constructor 1. reflexivity.
constructor 2. reflexivity.
Qed.


This Coq code defines an inductive data type "bool" much like an ADT
in Ocaml or Haskell (or in this simple case like an enum in
C/C++/Java). The data type contains (exactly) two constructors "true"
and "false". Then a trivial theorem is proven showing that any value
of this type is either "true" or "false".

To run the above code in Coq up to a certain point, you should move
the cursor up to that point (say immediately after a command ending
with a ".") and type C-c C-ENTER. This command will send the code to
coqtop, which will start up and reply in a different window (the Emacs
screen will split into two -- one part with your code and the other
part where coqtop writes the response). To send the remainder of the
code to coqtop, go to the end of the file and type C-c C-ENTER.

To save the file, type C-x C-s. To exit Emacs, type C-x C-c. If you
find yourself issuing an Emacs command by mistake and you don't know
how to continue, usually C-g (you may press it several times) will get
you out and back to editing the file. I encourage you to find more
about Emacs by typing C-h t (control + "h", following by "t" (without
control)), which will open the Emacs tutorial.

To learn Coq, I suggest taking a look at the official Coq
documentation: http://coq.inria.fr/documentation.