Helmut Schwichtenberg
Mathematisches Institut der Universität München
Wintersemester 2003/2004
Contents
Chapter 1. Logic 1
1. Formal Languages 2
2. Natural Deduction 4
3. Normalization 11
4. Normalization including Permutative Conversions 20
5. Notes 31
Chapter 2. Models 33
1. Structures for Classical Logic 33
2. Beth Structures for Minimal Logic 35
3. Completeness of Minimal and Intuitionistic Logic 39
4. Completeness of Classical Logic 42
5. Uncountable Languages 44
6. Basics of Model Theory 48
7. Notes 54
Chapter 3. Computability 55
1. Register Machines 55
2. Elementary Functions 58
3. The Normal Form Theorem 64
4. Recursive Definitions 69
Chapter 4. Gödel's Theorems 73
1. Gödel Numbers 73
2. Undefinability of the Notion of Truth 77
3. The Notion of Truth in Formal Theories 79
4. Undecidability and Incompleteness 81
5. Representability 83
6. Unprovability of Consistency 87
7. Notes 90
Chapter 5. Set Theory 91
1. Cumulative Type Structures 91
2. Axiomatic Set Theory 92
3. Recursion, Induction, Ordinals 96
4. Cardinals 116
5. The Axiom of Choice 120
6. Ordinal Arithmetic 126
7. Normal Functions 133
8. Notes 138
Chapter 6. Proof Theory 139
1. Ordinals Below ε₀ 139
2. Provability of Initial Cases of TI 141
3. Normalization with the Omega Rule 145
4. Unprovable Initial Cases of Transfinite Induction 149
Bibliography 157
Index 159
CHAPTER 1
Logic
The main subject of Mathematical Logic is mathematical proof. In this introductory chapter we deal with the basics of formalizing such proofs. The system we pick for the representation of proofs is Gentzen's natural deduction, from [8]. Our reasons for this choice are twofold. First, as the name says, this is a natural notion of formal proof, which means that the way proofs are represented corresponds very much to the way a careful mathematician writing out all details of an argument would proceed anyway. Second, formal proofs in natural deduction are closely related (via the so-called Curry-Howard correspondence) to terms in typed lambda calculus. This provides us not only with a compact notation for logical derivations (which otherwise tend to become somewhat unmanageable tree-like structures), but also opens up a route to applying the computational techniques which underpin lambda calculus.
Apart from classical logic we will also deal with more constructive logics: minimal and intuitionistic logic. This will reveal some interesting aspects of proofs, e.g. that it is possible and useful to distinguish between existential proofs that actually construct witnessing objects, and others that don't. As an example, consider the following proposition.

  There are irrational numbers a, b such that a^b is rational.

This can be proved as follows, by cases.

Case √2^√2 is rational. Choose a = √2 and b = √2. Then a, b are irrational and by assumption a^b is rational.

Case √2^√2 is irrational. Choose a = √2^√2 and b = √2. Then by assumption a, b are irrational and

  a^b = (√2^√2)^√2 = √2^(√2·√2) = √2^2 = 2

is rational.

As long as we have not decided whether √2^√2 is rational, we do not know which numbers a, b we must take. Hence we have an example of an existence proof which does not provide an instance.
An essential point for Mathematical Logic is to fix a formal language to be used. We take implication → and the universal quantifier ∀ as basic. Then the logic rules correspond to lambda calculus. The additional connectives ∨, ∧, and ∃ are defined via axiom schemes. These axiom schemes will later be seen as special cases of introduction and elimination rules for inductive definitions.
1. Formal Languages
1.1. Terms and Formulas. Let a countably infinite set { vᵢ | i ∈ ℕ } of variables be given; they will be denoted by x, y, z. A first-order language L then is determined by its signature, which is to mean the following.

• For every natural number n ≥ 0 a (possibly empty) set of n-ary relation symbols (also called predicate symbols). 0-ary relation symbols are called propositional symbols. ⊥ (read "falsum") is required as a fixed propositional symbol. The language will not, unless stated otherwise, contain = as a primitive.
• For every natural number n ≥ 0 a (possibly empty) set of n-ary function symbols. 0-ary function symbols are called constants.

We assume that all these sets of variables, relation and function symbols are disjoint.

For instance the language L_G of group theory is determined by the signature consisting of the following relation and function symbols: the group operation ∘ (a binary function symbol), the unit e (a constant), the inverse operation ⁻¹ (a unary function symbol) and finally equality = (a binary relation symbol).
L-terms are inductively defined as follows.

• Every variable is an L-term.
• Every constant of L is an L-term.
• If t₁, …, tₙ are L-terms and f is an n-ary function symbol of L with n ≥ 1, then f(t₁, …, tₙ) is an L-term.

From L-terms one constructs L-prime formulas, also called atomic formulas of L: If t₁, …, tₙ are terms and R is an n-ary relation symbol of L, then R(t₁, …, tₙ) is an L-prime formula.
L-formulas are inductively defined from L-prime formulas by

• Every L-prime formula is an L-formula.
• If A and B are L-formulas, then so are (A → B) ("if A, then B"), (A ∧ B) ("A and B") and (A ∨ B) ("A or B").
• If A is an L-formula and x is a variable, then ∀xA ("for all x, A holds") and ∃xA ("there is an x such that A") are L-formulas.

Negation, classical disjunction, and the classical existential quantifier are defined by

  ¬A := A → ⊥,
  A ∨ᶜˡ B := ¬A → ¬B → ⊥,
  ∃ᶜˡxA := ¬∀x¬A.
Usually we fix a language L, and speak of terms and formulas instead of L-terms and L-formulas. We use

  r, s, t for terms,
  x, y, z for variables,
  c for constants,
  P, Q, R for relation symbols,
  f, g, h for function symbols,
  A, B, C, D for formulas.
Definition. The depth dp(A) of a formula A is the maximum length of a branch in its construction tree. In other words, we define recursively: dp(P) = 0 for atomic P, dp(A ◦ B) = max(dp(A), dp(B)) + 1 for binary operators ◦, and dp(◦A) = dp(A) + 1 for unary operators ◦.

The size or length |A| of a formula A is the number of occurrences of logical symbols and atomic formulas (parentheses not counted) in A: |P| = 1 for P atomic, |A ◦ B| = |A| + |B| + 1 for binary operators ◦, and |◦A| = |A| + 1 for unary operators ◦.

One can show easily that |A| + 1 ≤ 2^(dp(A)+1).
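The two recursions can be run mechanically; the following sketch (the tuple representation and constructor names are mine, not the text's) computes dp and |·| and checks the stated bound on one formula.

```python
# Formulas as nested tuples: ("atom", name), ("imp"/"and"/"or", A, B),
# ("all"/"ex", x, A).  The constructor names are illustrative only.

def dp(f):
    """Depth: maximum branch length in the construction tree."""
    tag = f[0]
    if tag == "atom":
        return 0
    if tag in ("imp", "and", "or"):          # binary operators
        return max(dp(f[1]), dp(f[2])) + 1
    return dp(f[2]) + 1                      # quantifiers count as unary

def size(f):
    """Length |A|: occurrences of logical symbols and atomic formulas."""
    tag = f[0]
    if tag == "atom":
        return 1
    if tag in ("imp", "and", "or"):
        return size(f[1]) + size(f[2]) + 1
    return size(f[2]) + 1

# A = (P -> Q) /\ forall x P :  depth 2, size 6
A = ("and", ("imp", ("atom", "P"), ("atom", "Q")), ("all", "x", ("atom", "P")))
assert dp(A) == 2 and size(A) == 6
assert size(A) + 1 <= 2 ** (dp(A) + 1)       # the bound |A| + 1 <= 2^(dp(A)+1)
```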
Notation (Saving on parentheses). In writing formulas we save on parentheses by assuming that ∀, ∃, ¬ bind more strongly than ∧, ∨, and that in turn ∧, ∨ bind more strongly than →, ↔ (where A ↔ B abbreviates (A → B) ∧ (B → A)). Outermost parentheses are also usually dropped. Thus A ∧ ¬B → C is read as ((A ∧ (¬B)) → C). In the case of iterated implications we sometimes use the short notation

  A₁ → A₂ → … → Aₙ₋₁ → Aₙ   for   A₁ → (A₂ → … → (Aₙ₋₁ → Aₙ) … ).

To save parentheses in quantified formulas, we use a mild form of the dot notation: a dot immediately after ∀x or ∃x makes the scope of that quantifier as large as possible, given the parentheses around. So ∀x.A → B means ∀x(A → B), not (∀xA) → B.
We also save on parentheses by writing e.g. Rxyz, Rt₀t₁t₂ instead of R(x, y, z), R(t₀, t₁, t₂), where R is some predicate symbol. Similarly for a unary function symbol with a (typographically) simple argument, so fx for f(x), etc. In this case no confusion will arise. But readability requires that we write in full R(fx, gy, hz), instead of Rfxgyhz.

Binary function and relation symbols are usually written in infix notation, e.g. x + y instead of +(x, y), and x < y instead of <(x, y). We write t ≠ s for ¬(t = s) and t ≮ s for ¬(t < s).
1.2. Substitution, Free and Bound Variables. Expressions E, E′ which differ only in the names of bound variables will be regarded as identical. This is sometimes expressed by saying that E and E′ are α-equivalent. In other words, we are only interested in expressions modulo renaming of bound variables. There are methods of finding unique representatives for such expressions, for example the name-free terms of de Bruijn [7]. For the human reader such representations are less convenient, so we shall stick to the use of bound variables.

In the definition of "substitution of expression E′ for variable x in expression E", either one requires that no variable free in E′ becomes bound by a variable-binding operator in E, when the free occurrences of x are replaced by E′ (also expressed by saying that there must be no "clashes of variables"), "E′ is free for x in E", or the substitution operation is taken to involve a systematic renaming operation for the bound variables, avoiding clashes. Having stated that we are only interested in expressions modulo renaming of bound variables, we can without loss of generality assume that substitution is always possible.
Also, it is never a real restriction to assume that distinct quantifier occurrences are followed by distinct variables, and that the sets of bound and free variables of a formula are disjoint.

Notation. "FV" is used for the (set of) free variables of an expression; so FV(t) is the set of variables free in the term t, FV(A) the set of variables free in formula A etc.

E[x := t] denotes the result of substituting the term t for the variable x in the expression E. Similarly, E[x⃗ := t⃗] is the result of simultaneously substituting the terms t⃗ = t₁, …, tₙ for the variables x⃗ = x₁, …, xₙ, respectively.

Locally we shall adopt the following convention. In an argument, once a formula has been introduced as A(x), i.e., A with a designated variable x, we write A(t) for A[x := t], and similarly with more variables.
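For terms (which contain no binders) substitution is plain structural recursion; the sketch below (representation mine) implements the simultaneous form and illustrates why simultaneous substitution differs from performing the single substitutions one after the other.

```python
# Terms: ("var", x) or ("fun", f, [arguments]).  Representation is illustrative.

def subst(term, env):
    """Simultaneous substitution: replace each variable x by env[x], if present."""
    if term[0] == "var":
        return env.get(term[1], term)
    return ("fun", term[1], [subst(a, env) for a in term[2]])

x, y = ("var", "x"), ("var", "y")
t = ("fun", "f", [x, y])                      # f(x, y)
# simultaneous [x := y, y := x] swaps the arguments ...
assert subst(t, {"x": y, "y": x}) == ("fun", "f", [y, x])
# ... whereas iterating the two single substitutions does not:
assert subst(subst(t, {"x": y}), {"y": x}) == ("fun", "f", [x, x])
```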
1.3. Subformulas. Unless stated otherwise, the notion of subformula we use will be that of a subformula "in the sense of Gentzen".

Definition. (Gentzen) subformulas of A are defined by
(a) A is a subformula of A;
(b) if B ◦ C is a subformula of A then so are B, C, for ◦ = →, ∧, ∨;
(c) if ∀xB or ∃xB is a subformula of A, then so is B[x := t], for all t free for x in B.

If we replace the third clause by:
(c′) if ∀xB or ∃xB is a subformula of A then so is B,
we obtain the notion of literal subformula.
Definition. The notions of positive, negative, strictly positive subformula are defined in a similar style:
(a) A is a positive and a strictly positive subformula of itself;
(b) if B ∧ C or B ∨ C is a positive [negative, strictly positive] subformula of A, then so are B, C;
(c) if ∀xB or ∃xB is a positive [negative, strictly positive] subformula of A, then so is B[x := t];
(d) if B → C is a positive [negative] subformula of A, then B is a negative [positive] subformula of A, and C is a positive [negative] subformula of A;
(e) if B → C is a strictly positive subformula of A, then so is C.

A strictly positive subformula of A is also called a strictly positive part (s.p.p.) of A. Note that the set of subformulas of A is the union of the positive and negative subformulas of A. Literal positive, negative, strictly positive subformulas may be defined in the obvious way by restricting the clause for quantifiers.
Example. (P → Q) → R ∧ ∀xR′(x) has as s.p.p.'s the whole formula, R ∧ ∀xR′(x), R, ∀xR′(x), R′(t). The positive subformulas are the s.p.p.'s and in addition P; the negative subformulas are P → Q, Q.
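Restricting to literal subformulas (so the quantifier clause just drops the quantifier), the polarity clauses can be run directly; the sketch below (formula representation mine) recomputes the negative subformulas of the example.

```python
# Formulas as tuples: ("atom", s), ("imp"/"and"/"or", A, B), ("all"/"ex", x, A).

def literal_subformulas(f, pol="pos"):
    """Literal positive/negative subformulas of f, as a set of (polarity, formula)."""
    out = {(pol, f)}
    tag = f[0]
    if tag in ("and", "or"):
        out |= literal_subformulas(f[1], pol) | literal_subformulas(f[2], pol)
    elif tag == "imp":                        # the premise flips polarity
        flipped = "neg" if pol == "pos" else "pos"
        out |= literal_subformulas(f[1], flipped) | literal_subformulas(f[2], pol)
    elif tag in ("all", "ex"):                # literal clause: keep the kernel
        out |= literal_subformulas(f[2], pol)
    return out

P, Q, R = ("atom", "P"), ("atom", "Q"), ("atom", "R")
Rx = ("atom", "R'(x)")
# (P -> Q) -> R /\ forall x R'(x)
A = ("imp", ("imp", P, Q), ("and", R, ("all", "x", Rx)))
negs = {f for (p, f) in literal_subformulas(A) if p == "neg"}
assert negs == {("imp", P, Q), Q}             # matches the example
assert ("pos", P) in literal_subformulas(A)   # P is positive (doubly flipped)
```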
2. Natural Deduction
We introduce Gentzen's system of natural deduction. To allow a direct correspondence with the lambda calculus, we restrict the rules used to those for the logical connective → and the universal quantifier ∀. The rules come in pairs: we have an introduction and an elimination rule for each of these. The other logical connectives are introduced by means of axiom schemes: this is done for conjunction ∧, disjunction ∨ and the existential quantifier ∃. The resulting system is called minimal logic; it has been introduced by Johansson in 1937 [14]. Notice that no negation is present.

If we then go on and require the ex-falso-quodlibet scheme for the nullary propositional symbol ⊥ (falsum), we can embed intuitionistic logic. To obtain classical logic, we add as an axiom scheme the principle of indirect proof, also called stability. However, to obtain classical logic it suffices to restrict to the language based on →, ∧, ∀ and ⊥; we can introduce classical disjunction ∨ᶜˡ and the classical existential quantifier ∃ᶜˡ via their (classical) definitions above. For these the usual introduction and elimination properties can then be derived.
2.1. Examples of Derivations. Let us start with some examples for natural proofs. Assume that a first-order language L is given. For simplicity we only consider here proofs in pure logic, i.e. without assumptions (axioms) on the functions and relations used.

(1) (A → B → C) → (A → B) → A → C

Assume A → B → C. To show: (A → B) → A → C. So assume A → B. To show: A → C. So finally assume A. To show: C. We have A, by the last assumption. Hence also B → C, by the first assumption, and B, using the next to last assumption. From B → C and B we obtain C, as required.

(2) (∀x.A → B) → A → ∀xB, if x ∉ FV(A).

Assume ∀x.A → B. To show: A → ∀xB. So assume A. To show: ∀xB. Let x be arbitrary; note that we have not made any assumptions on x. To show: B. We have A → B, by the first assumption. Hence also B, by the second assumption.

(3) (A → ∀xB) → ∀x.A → B, if x ∉ FV(A).

Assume A → ∀xB. To show: ∀x.A → B. Let x be arbitrary; note that we have not made any assumptions on x. To show: A → B. So assume A. To show: B. We have ∀xB, by the first and second assumption. Hence also B.
A characteristic feature of these proofs is that assumptions are introduced and eliminated again. At any point in time during the proof the free or "open" assumptions are known, but as the proof progresses, free assumptions may become cancelled or "closed" because of the implies-introduction rule.

We now reserve the word proof for the informal level; a formal representation of a proof will be called a derivation.

An intuitive way to communicate derivations is to view them as labelled trees. The labels of the inner nodes are the formulas derived at those points, and the labels of the leaves are formulas or terms. The labels of the nodes immediately above a node ν are the premises of the rule application, and the formula at ν is its conclusion. At the root of the tree we have the conclusion of the whole derivation. In natural deduction systems one works with assumptions affixed to some leaves of the tree; they can be open or else closed.
Any of these assumptions carries a marker. As markers we use assumption variables, denoted by u, v, w, u₀, u₁, … . The (previous) variables will now often be called object variables, to distinguish them from assumption variables. If at a later stage (i.e. at a node below an assumption) the dependency on this assumption is removed, we record this by writing down the assumption variable. Since the same assumption can be used many times (this was the case in example (1)), the assumption marked with u (and communicated by u: A) may appear many times. However, we insist that distinct assumption formulas must have distinct markers.

An inner node of the tree is understood as the result of passing from premises to a conclusion, as described by a given rule. The label of the node then contains, in addition to the conclusion, also the name of the rule. In some cases the rule binds or closes an assumption variable u (and hence removes the dependency of all assumptions u: A marked with that u). An application of the ∀-introduction rule similarly binds an object variable x (and hence removes the dependency on x). In both cases the bound assumption or object variable is added to the label of the node.
2.2. Introduction and Elimination Rules for → and ∀. We now formulate the rules of natural deduction. First we have an assumption rule, that allows an arbitrary formula A to be put down, together with a marker u:

  u: A   (assumption)

The other rules of natural deduction split into introduction rules (I-rules for short) and elimination rules (E-rules) for the logical connectives → and ∀. For implication → there is an introduction rule →⁺u and an elimination rule →⁻:

  [u: A]
    | M                   | M         | N
    B                    A → B        A
  ------- →⁺u            --------------- →⁻
  A → B                        B

In →⁻ the left premise A → B is called the major premise (or main premise), and the right premise A the minor premise (or side premise). Note that with an application of the →⁺u-rule all assumptions above it marked with u: A are cancelled.
For the universal quantifier ∀ there is an introduction rule ∀⁺x and an elimination rule ∀⁻:

    | M                   | M
    A                    ∀xA     r
  ----- ∀⁺x              ----------- ∀⁻
  ∀xA                    A[x := r]

The rule ∀⁺x is subject to the usual variable condition: the derivation M of the premise A must not contain any open assumption having x as a free variable.
We now give derivations for the example formulas (1)-(3). Since in many cases the rule used is determined by the formula on the node, we suppress in such cases the name of the rule.

(1)
  u: A → B → C    w: A
  --------------------           v: A → B    w: A
        B → C                    ----------------
                                        B
        --------------------------------
                     C
                  ------ →⁺w
                  A → C
           ----------------- →⁺v
           (A → B) → A → C
     ------------------------------- →⁺u
     (A → B → C) → (A → B) → A → C

(2)
  u: ∀x.A → B    x
  ----------------
       A → B          v: A
       --------------------
               B
            ------ ∀⁺x
             ∀xB
          --------- →⁺v
          A → ∀xB
    ------------------------ →⁺u
    (∀x.A → B) → A → ∀xB

Note here that the variable condition is satisfied: x is not free in A (and also not free in ∀x.A → B).

(3)
  u: A → ∀xB    v: A
  ------------------
        ∀xB        x
        -------------
              B
           ------- →⁺v
           A → B
         ---------- ∀⁺x
         ∀x.A → B
    ------------------------ →⁺u
    (A → ∀xB) → ∀x.A → B

Here too the variable condition is satisfied: x is not free in A.
2.3. Axiom Schemes for Disjunction, Conjunction, Existence and Falsity. We follow the usual practice of considering all free variables in an axiom as universally quantified outside.

Disjunction. The introduction axioms are

  ∨⁺₀ : A → A ∨ B,    ∨⁺₁ : B → A ∨ B

and the elimination axiom is

  ∨⁻ : (A → C) → (B → C) → A ∨ B → C.

Conjunction. The introduction axiom is

  ∧⁺ : A → B → A ∧ B

and the elimination axiom is

  ∧⁻ : (A → B → C) → A ∧ B → C.

Existential Quantifier. The introduction axiom is

  ∃⁺ : A → ∃xA

and the elimination axiom is

  ∃⁻ : ∃xA → (∀x.A → B) → B   (x ∉ FV(B)).

Falsity. For falsum ⊥ the axiom is

  Efq : ⊥ → A.

In the literature this axiom is frequently called ex-falso-quodlibet, written Efq. It clearly is derivable from its instances ⊥ → Rx⃗, for every relation symbol R.
Equality. The introduction axiom is

  Eq⁺ : Eq(x, x)

and the elimination axiom is

  Eq⁻ : Eq(x, y) → A(x) → A(y).

A formula B is called derivable from assumptions A₁, …, Aₙ, if there is a derivation of B (without free assumptions other than A₁, …, Aₙ). We write Γ ⊢ B if B is derivable from finitely many assumptions A₁, …, Aₙ ∈ Γ, and ⊢ B if B is derivable without any free assumptions.

Theorem (Elimination of ∧). For every formula A there are ∧-free formulas A₁, …, Aₙ such that ⊢ A ↔ A₁ ∧ … ∧ Aₙ.

Proof. Induction on A.

Case R t⃗. Take n = 1 and A₁ := R t⃗.

Case A ∧ B. By induction hypothesis, we have A₁, …, Aₙ and B₁, …, Bₘ. Take A₁, …, Aₙ, B₁, …, Bₘ.

Case A → B. By induction hypothesis, we have A₁, …, Aₙ and B₁, …, Bₘ. For the sake of notational simplicity assume n = 2 and m = 3. Then

  ⊢ (A₁ ∧ A₂ → B₁ ∧ B₂ ∧ B₃) ↔ (A₁ → A₂ → B₁) ∧ (A₁ → A₂ → B₂) ∧ (A₁ → A₂ → B₃).

Case ∀xA. By induction hypothesis for A, we have A₁, …, Aₙ. Take ∀xA₁, …, ∀xAₙ, for

  ⊢ ∀x (A₁ ∧ … ∧ Aₙ) ↔ ∀xA₁ ∧ … ∧ ∀xAₙ.

This concludes the proof.
For the rest of this section, let us restrict to the language based on →, ∧, ∀ and ⊥. We obtain classical logic by adding, for every relation symbol R distinct from ⊥, the principle of indirect proof expressed as the so-called stability axiom (Stab_R):

  ∀x⃗. ¬¬Rx⃗ → Rx⃗.

Let

  Stab := { ∀x⃗. ¬¬Rx⃗ → Rx⃗ | R relation symbol distinct from ⊥ }.

We call the formula A classically derivable, and write ⊢c A, if there is a derivation of A from stability assumptions Stab_R. Similarly we define classical derivability from Γ and write Γ ⊢c A, i.e.

  Γ ⊢c A  :⇔  Γ ∪ Stab ⊢ A.

Theorem (Stability, or Principle of Indirect Proof). For every formula A (of our language based on →, ∧, ∀ and ⊥),

  ⊢c ¬¬A → A.
Proof. Induction on A. For simplicity, in the derivation to be constructed we leave out applications of →⁺ at the end.

Case R t⃗ with R distinct from ⊥. Use Stab_R.

Case ⊥. Observe that ¬¬⊥ → ⊥ = ((⊥ → ⊥) → ⊥) → ⊥. The desired derivation is

                      u: ⊥
                     ------ →⁺u
  v: (⊥ → ⊥) → ⊥     ⊥ → ⊥
  -------------------------
             ⊥

Case A → B. By induction hypothesis we have a derivation of ¬¬B → B. The desired derivation of B from u: ¬¬(A → B) and w: A is

                               u₂: A → B    w: A
                               -----------------
                      u₁: ¬B          B
                      -----------------
                               ⊥
                          ---------- →⁺u₂
         u: ¬¬(A → B)     ¬(A → B)
         --------------------------
  | IH              ⊥
  ¬¬B → B       -------- →⁺u₁
                  ¬¬B
  --------------------
           B

Case ∀xA. Clearly it suffices to show (¬¬A → A) → ¬¬∀xA → A; a derivation of A from u: ¬¬A → A and v: ¬¬∀xA is

                         u₂: ∀xA    x
                         ------------ ∀⁻
                u₁: ¬A         A
                ----------------
                         ⊥
                     -------- →⁺u₂
      v: ¬¬∀xA        ¬∀xA
      ----------------------
                 ⊥
             -------- →⁺u₁
  u: ¬¬A → A    ¬¬A
  ------------------
          A

The case A ∧ B is left to the reader.
Notice that clearly ⊢c ⊥ → A, for stability is stronger:

                 u: ⊥
                ------ →⁺v
  | M_Stab       ¬¬A
  ¬¬A → A
  -------------------
           A
       -------- →⁺u
        ⊥ → A

where M_Stab is the (classical) derivation of stability.

Notice also that even for the →, ⊥ fragment the inclusion of minimal logic in intuitionistic logic, and of the latter in classical logic, are proper. Examples are

  ⊬ ⊥ → P, but ⊢i ⊥ → P,
  ⊬i ((P → Q) → P) → P, but ⊢c ((P → Q) → P) → P,

where ⊢i denotes derivability in intuitionistic logic, i.e. from Efq instances. Non-derivability can be proved by means of countermodels, using a semantic characterization of derivability; this will be done in Chapter 2. ⊢i ⊥ → P is obvious, and the Peirce formula ((P → Q) → P) → P can be derived in minimal logic from ⊥ → Q and ¬¬P → P, hence is derivable in classical logic.
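The derivation of the Peirce formula just mentioned can be written out as a derivation term; the rendering below is my own sketch, with the two assumptions named efq : ⊥ → Q and stab : ¬¬P → P.

```latex
% Sketch: a derivation term for ((P -> Q) -> P) -> P from
% efq : \bot \to Q  and  stab : \neg\neg P \to P.
\[
  \lambda u^{(P \to Q) \to P}.\;
  \mathrm{stab}\,
  \bigl(\lambda v^{\neg P}.\;
        v\,\bigl(u\,(\lambda w^{P}.\; \mathrm{efq}\,(v\,w))\bigr)\bigr)
\]
% Read: assume u.  By stab it suffices to derive \neg\neg P, so assume
% v : \neg P.  Then \lambda w.\,efq(v\,w) : P \to Q, so u applied to it
% gives P, and v applied to that gives \bot.
```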
2.5. Negative Translation. We embed classical logic into minimal logic, via the so-called negative (or Gödel-Gentzen) translation.

A formula A is called negative, if every atomic formula of A distinct from ⊥ occurs negated, and A does not contain ∨, ∃.

Lemma. For negative A, ⊢ ¬¬A → A.

Proof. This follows from the proof of the stability theorem, using ⊢ ¬¬¬R t⃗ → ¬R t⃗.

Since ∨, ∃ do not occur in formulas of classical logic, in the rest of this section we consider the language based on →, ∧, ∀ and ⊥ only.

Definition (Negative translation ᵍ of Gödel-Gentzen).

  (R t⃗)ᵍ := ¬¬R t⃗   (R distinct from ⊥),
  ⊥ᵍ := ⊥,
  (A → B)ᵍ := Aᵍ → Bᵍ,
  (A ∧ B)ᵍ := Aᵍ ∧ Bᵍ,
  (∀xA)ᵍ := ∀xAᵍ.
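The translation is a straightforward recursion on formulas; a sketch (formula representation mine, with ¬A coded as A → ⊥):

```python
# Formulas: ("atom", s), ("bot",), ("imp"/"and", A, B), ("all", x, A).

BOT = ("bot",)

def neg(a):                      # not A  :=  A -> bot
    return ("imp", a, BOT)

def g(f):
    """Goedel-Gentzen negative translation."""
    tag = f[0]
    if tag == "bot":
        return BOT
    if tag == "atom":
        return neg(neg(f))       # (R t)^g := not not R t
    if tag in ("imp", "and"):
        return (tag, g(f[1]), g(f[2]))
    return ("all", f[1], g(f[2]))

P, Q = ("atom", "P"), ("atom", "Q")
# (P -> Q)^g = not not P -> not not Q
assert g(("imp", P, Q)) == ("imp", neg(neg(P)), neg(neg(Q)))
# atoms in a negated context come out triply negated: (not P)^g = not not not P
assert g(neg(P)) == ("imp", neg(neg(P)), BOT)
```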
Theorem. For all formulas A,
(a) ⊢c A ↔ Aᵍ,
(b) Γ ⊢c A if and only if Γᵍ ⊢ Aᵍ, where Γᵍ := { Bᵍ | B ∈ Γ }.

Proof. (a). The claim follows from the fact that ⊢c is compatible with equivalence, using ⊢c R t⃗ ↔ ¬¬R t⃗.

(b). "⇐" is obvious, by (a). "⇒": By induction on the classical derivation. For a stability assumption ¬¬R t⃗ → R t⃗ we have (¬¬R t⃗ → R t⃗)ᵍ = ¬¬¬¬R t⃗ → ¬¬R t⃗, which is derivable in minimal logic.

Case →⁺u. Assume the derivation ends with

  [u: A]
   | T
   B
  ------- →⁺u
  A → B

Then we have by induction hypothesis a derivation

  u: Aᵍ
   | Tᵍ
   Bᵍ

hence

  [u: Aᵍ]
   | Tᵍ
   Bᵍ
  --------- →⁺u
  Aᵍ → Bᵍ

Case →⁻. Assume the derivation ends with

   | T₀       | T₁
  A → B       A
  --------------- →⁻
        B

Then we have by induction hypothesis derivations

   | T₀ᵍ            | T₁ᵍ
  Aᵍ → Bᵍ    and     Aᵍ

hence

   | T₀ᵍ       | T₁ᵍ
  Aᵍ → Bᵍ      Aᵍ
  ---------------- →⁻
        Bᵍ

The other cases are treated similarly.
Corollary (Embedding of classical logic). For negative A,

  ⊢c A  ⟺  ⊢ A.

Proof. By the theorem we have ⊢c A iff ⊢ Aᵍ. Since A is negative, every atom distinct from ⊥ in A must occur negated, and hence in Aᵍ it must appear in threefold negated form (as ¬¬¬R t⃗). Since ⊢ ¬¬¬R t⃗ ↔ ¬R t⃗, we obtain ⊢ Aᵍ ↔ A, and the claim follows.

Since every formula is classically equivalent to a negative formula, we have achieved an embedding of classical logic into minimal logic.

Note that ⊬ ¬¬P → P (as we shall show in Chapter 2). The corollary therefore does not hold for all formulas A.
3. Normalization
We show in this section that every derivation can be transformed by appropriate conversion steps into a normal form. A derivation in normal form does not make "detours", or more precisely, it cannot occur that an elimination rule immediately follows an introduction rule. Derivations in normal form have many pleasant properties.

Uniqueness of normal form will be shown by means of an application of Newman's lemma; we will also introduce and discuss the related notions of confluence, weak confluence and the Church-Rosser property.

We finally show that the requirement to give a normal derivation of a derivable formula can sometimes be unrealistic. Following Statman [25] and Orevkov [19] we give examples of formulas Cₖ which are easily derivable with non-normal derivations (of size linear in k), but which require a non-elementary (in k) size in any normal derivation.

This can be seen as a theoretical explanation of the essential role played by lemmas in mathematical arguments.
3.1. Conversion. A conversion eliminates a detour in a derivation, i.e., an elimination immediately following an introduction. We consider the following conversions:

→-conversion.

  [u: A]
    | M
    B
  ------- →⁺u    | N
  A → B          A                 | N
  ------------------ →⁻    ↦       A
          B                       | M
                                   B

∀-conversion.

   | M
   A
  ----- ∀⁺x                   | M[x := r]
  ∀xA        r         ↦      A[x := r]
  -------------- ∀⁻
  A[x := r]
3.2. Derivations as Terms. It will be convenient to represent derivations as terms, where the derived formula is viewed as the "type" of the term. This representation is known under the name Curry-Howard correspondence.

We give an inductive definition of derivation terms in Table 1 below, where for clarity we have written the corresponding derivations to the left. In the table, u is an assumption variable or assumption constant, v is an assumption variable or object variable, and M, N, L are derivation terms or object terms.

A term of the form (λvM)N is called a redex; it is not normal, and converts: (λvM)N ↦ M[v := N]. Notice that in a substitution M[v := N] with M a derivation term and v an object variable, one also needs to substitute in the formulas of M.

The closure of the conversion relation ↦ is defined by
• If M ↦ M′, then M → M′.
• If M → M′, then also MN → M′N, NM → NM′, λvM → λvM′ (inner reductions).

So M → N means that M reduces in one step to N, i.e., N is obtained from M by replacement of (an occurrence of) a redex M′ of M by a conversum M″ of M′, i.e. by a single conversion.
derivation                                term

  u: A                                    u^A

  [u: A]
    | M
    B
  ------- →⁺u                             (λu^A M^B)^{A→B}
  A → B

   | M         | N
  A → B        A
  ---------------- →⁻                     (M^{A→B} N^A)^B
         B

   | M
   A
  ----- ∀⁺x (with var. cond.)             (λx M^A)^{∀xA} (with var. cond.)
  ∀xA

   | M
  ∀xA      r
  ------------ ∀⁻                         (M^{∀xA} r)^{A[x:=r]}
  A[x := r]

Table 1. Derivation terms for → and ∀
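The Curry-Howard reading of Table 1 can be made concrete as a small type checker for the → fragment: a derivation term is well formed exactly when a formula can be inferred for it. The sketch below (data representation mine) omits ∀.

```python
# Types/formulas: ("atom", s) or ("imp", A, B).
# Derivation terms: ("var", u), ("lam", u, A, M), ("app", M, N).

def infer(ctx, term):
    """Return the formula derived by `term` under assumptions `ctx`, or fail."""
    tag = term[0]
    if tag == "var":                          # assumption u: A
        return ctx[term[1]]
    if tag == "lam":                          # ->+ u : from B infer A -> B
        _, u, a, m = term
        return ("imp", a, infer({**ctx, u: a}, m))
    _, m, n = term                            # ->- : major premise A -> B
    f, arg = infer(ctx, m), infer(ctx, n)
    assert f[0] == "imp" and f[1] == arg, "elimination does not fit"
    return f[2]

A, B = ("atom", "A"), ("atom", "B")
# lambda u^A lambda v^{A->B}. v u  derives  A -> (A -> B) -> B
term = ("lam", "u", A, ("lam", "v", ("imp", A, B),
        ("app", ("var", "v"), ("var", "u"))))
assert infer({}, term) == ("imp", A, ("imp", ("imp", A, B), B))
```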
The relation →⁺ ("properly reduces to") is the transitive closure of →, and →* ("reduces to") its reflexive transitive closure.
A term M is in normal form, or M is normal, if M does not contain a redex. M has a normal form if there is a normal N such that M →* N.

A reduction sequence is a (finite or infinite) sequence M₀ → M₁ → M₂ → … such that Mᵢ → Mᵢ₊₁, for all i.

Finite reduction sequences are partially ordered under the initial part relation; the collection of finite reduction sequences starting from a term M forms a tree, the reduction tree of M. The branches of this tree may be identified with the collection of all infinite and all terminating finite reduction sequences.

A term is strongly normalizing if its reduction tree is finite.
Example.

  (λx λy λz. xz(yz)) (λu λv. u) (λu′ λv′. u′)
    → (λy λz. (λu λv. u) z (yz)) (λu′ λv′. u′)
    → (λy λz. (λv. z) (yz)) (λu′ λv′. u′)
    → (λy λz. z) (λu′ λv′. u′)
    → λz. z.
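The reduction above is the familiar combinator computation S K K′ →* I. Encoding the λ-terms as curried Python functions lets one at least confirm the extensional behaviour (a check of values, not of the reduction sequence itself).

```python
# S = lambda x y z. x z (y z)  and  K = lambda u v. u, curried.
S = lambda x: lambda y: lambda z: x(z)(y(z))
K = lambda u: lambda v: u
K2 = lambda u: lambda v: u          # stands in for  lambda u' lambda v'. u'

# S K K2 behaves like the identity, matching the normal form  lambda z. z:
for value in (0, "a", [1, 2]):
    assert S(K)(K2)(value) == value
```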
Lemma (Substitutivity of →).
(a) If M → M′, then MN → M′N.
(b) If N → N′, then MN → MN′.
(c) If M → M′, then M[v := N] → M′[v := N].
(d) If N → N′, then M[v := N] →* M[v := N′].

Proof. (a) and (c) are proved by induction on M → M′; (b) and (d) by induction on M. Notice that the reason for →* in (d) is that the variable v may have several (or no) occurrences in M.

We write sn(M, k) to express that every reduction sequence starting from M has length at most k; by definition, sn(M, k) holds iff for every M′ with M → M′ we have k > 0 and sn(M′, k − 1).

Lemma (Properties of sn).
(a) If sn(M, k), then sn(M, k + 1).
(b) If sn(MN, k), then sn(M, k).
(c) If sn(Mᵢ, kᵢ) for i = 1, …, n, then sn(uM₁ … Mₙ, k₁ + ⋯ + kₙ).
(d) If sn(M, k), then sn(λvM, k).
(e) If sn(M[v := N]L⃗, k) and sn(N, l), then sn((λvM)N L⃗, k + l + 1).
Proof. (a). Induction on k. Assume sn(M, k). We show sn(M, k + 1). So let M′ with M → M′ be given; because of sn(M, k) we must have k > 0. We have to show sn(M′, k). Because of sn(M, k) we have sn(M′, k − 1), hence by induction hypothesis sn(M′, k).

(b). Induction on k. Assume sn(MN, k). We show sn(M, k). In case k = 0 the term MN is normal, hence also M is normal and therefore sn(M, 0). So let k > 0 and M → M′; we have to show sn(M′, k − 1). From M → M′ we have MN → M′N. Because of sn(MN, k) we have by definition sn(M′N, k − 1), hence sn(M′, k − 1) by induction hypothesis.
(c). Assume sn(Mᵢ, kᵢ) for i = 1, …, n. We show sn(uM₁ … Mₙ, k) with k := k₁ + ⋯ + kₙ. Again we employ induction on k. In case k = 0 all Mᵢ are normal, hence also uM₁ … Mₙ. So let k > 0 and uM₁ … Mₙ → M′. Then M′ = uM₁ … Mᵢ′ … Mₙ with Mᵢ → Mᵢ′. We have to show sn(uM₁ … Mᵢ′ … Mₙ, k − 1). Because of Mᵢ → Mᵢ′ and sn(Mᵢ, kᵢ) we have kᵢ > 0 and sn(Mᵢ′, kᵢ − 1), hence sn(uM₁ … Mᵢ′ … Mₙ, k − 1) by induction hypothesis.

(d). Assume sn(M, k). We have to show sn(λvM, k). Use induction on k. In case k = 0, M is normal, hence λvM is normal, hence sn(λvM, 0). So let k > 0 and λvM → L. Then L has the form λvM′ with M → M′. So sn(M′, k − 1) by definition, hence sn(λvM′, k − 1) by induction hypothesis.
(e). Assume sn(M[v := N]L⃗, k) and sn(N, l). We show sn((λvM)N L⃗, k + l + 1) by induction on k + l. In case k + l = 0 the only possible step is (λvM)N L⃗ → M[v := N]L⃗ =: K, and this K is normal. So let k + l > 0 and (λvM)N L⃗ → K. We have to show sn(K, k + l).

Case K = M[v := N]L⃗. From sn(M[v := N]L⃗, k) we obtain sn(M[v := N]L⃗, k + l) by (a).

Case K = (λvM′)N L⃗ with M → M′. Then we have M[v := N]L⃗ → M′[v := N]L⃗. Now sn(M[v := N]L⃗, k) implies k > 0 and sn(M′[v := N]L⃗, k − 1). The induction hypothesis yields sn((λvM′)N L⃗, k − 1 + l + 1).

Case K = (λvM)N′ L⃗ with N → N′. Now sn(N, l) implies l > 0 and sn(N′, l − 1). The induction hypothesis yields sn((λvM)N′ L⃗, k + l − 1 + 1), since sn(M[v := N′]L⃗, k) by (a) and substitutivity (d).

Case K = (λvM)N L⃗′ with Lᵢ → Lᵢ′ for some i and Lⱼ = Lⱼ′ for j ≠ i. Then we have M[v := N]L⃗ → M[v := N]L⃗′. Now sn(M[v := N]L⃗, k) implies k > 0 and sn(M[v := N]L⃗′, k − 1). The induction hypothesis yields sn((λvM)N L⃗′, k − 1 + l + 1).
The essential idea of the strong normalization proof is to view the last three closure properties of sn from the preceding lemma (without the information on the bounds) as an inductive definition of a new set SN:

  M⃗ ∈ SN               M ∈ SN              M[v := N]L⃗ ∈ SN    N ∈ SN
  ---------- (Var)      ---------- (λ)      -------------------------- (β)
  uM⃗ ∈ SN              λvM ∈ SN            (λvM)N L⃗ ∈ SN

Corollary. For every term M ∈ SN there is a k ∈ ℕ such that sn(M, k). Hence every term M ∈ SN is strongly normalizable.

Proof. By induction on M ∈ SN, using the previous lemma.

In what follows we shall show that every term is in SN and hence is strongly normalizable. Given the definition of SN we only have to show that SN is closed under application. In order to prove this we must prove simultaneously the closure of SN under substitution.
Theorem (Properties of SN). For all formulas A, derivation terms M ∈ SN and N^A ∈ SN the following hold.
(a) M[v := N] ∈ SN.
(a′) M[x := r] ∈ SN.
(b) Suppose M derives A → B. Then MN ∈ SN.
(b′) Suppose M derives ∀xA. Then Mr ∈ SN.

Proof. By course-of-values induction on dp(A), with a side induction on M ∈ SN. Let N^A ∈ SN. We distinguish cases on the form of M.

Case uM⃗ by (Var) from M⃗ ∈ SN. (a). The SIH(a) (SIH means side induction hypothesis) yields Mᵢ[v := N] ∈ SN for all Mᵢ from M⃗. In case u ≠ v we immediately have (uM⃗)[v := N] ∈ SN. Otherwise we need N M⃗[v := N] ∈ SN. But this follows by multiple applications of IH(b), since every Mᵢ[v := N] derives a subformula of A with smaller depth. (a′). Similar, and simpler. (b), (b′). Use (Var) again.

Case λvM by (λ) from M ∈ SN. (a), (a′). Use (λ) again. (b). Our goal is (λvM)N ∈ SN. By (β) it suffices to show M[v := N] ∈ SN and N ∈ SN. The latter holds by assumption, and the former by SIH(a). (b′). Similar, and simpler.
Case (λwM)K L⃗ by (β) from M[w := K]L⃗ ∈ SN and K ∈ SN. (a). The SIH(a) yields (M[w := K]L⃗)[v := N] ∈ SN and K[v := N] ∈ SN; since ((λwM)K L⃗)[v := N] = (λwM[v := N]) K[v := N] L⃗[v := N] and (M[w := K]L⃗)[v := N] = M[v := N][w := K[v := N]] L⃗[v := N], the claim follows by (β). (a′), (b) and (b′) are similar. This completes the proof.

It follows that every derivation term is in SN, and hence strongly normalizing.

To deduce uniqueness of normal forms we use Newman's lemma. Call a relation →_R weakly confluent if whenever M →_R M₁ and M →_R M₂, there is an M₃ with M₁ →*_R M₃ and M₂ →*_R M₃; call it confluent (or Church-Rosser) if the same holds whenever M →*_R M₁ and M →*_R M₂.

Lemma (Newman). Assume →_R is strongly normalizing and weakly confluent. Then →_R is confluent.
Proof. Call M good if it satisfies the confluence property, i.e. if whenever K *← M →* L, then K →* N *← L for some N. We show that every strongly normalizing M is good, by transfinite induction on the well-founded partial order →⁺, restricted to all terms occurring in the reduction tree of M. So let M be given and assume

  every M′ with M →⁺ M′ is good.

We must show that M is good, so assume K *← M →* L. We may further assume that there are M′, M″ such that K *← M′ ← M → M″ →* L, for otherwise the claim is trivial. But then the claim follows from the assumed weak confluence and the induction hypothesis for M′ and M″, as shown in the picture below.
Table 2 (Proof of Newman's lemma). We have M → M′ →* K and M → M″ →* L. By weak confluence there is a P with M′ →* P *← M″. The induction hypothesis for M′, applied to K and P, yields an N′ with K →* N′ *← P. Since M″ →* P →* N′ and M″ →* L, the induction hypothesis for M″ yields an N with N′ →* N *← L. Hence K →* N *← L.

3.6. Uniqueness of Normal Forms. We first show that → is weakly confluent. From this and the fact that it is strongly normalizing we can easily infer (using Newman's lemma) that the normal forms are unique.

Proposition. → is weakly confluent.

Proof. Assume N₀ ← M → N₁. We show that N₀ →* N *← N₁ for some N, by induction on M. If there are two inner reductions both on the same subterm, then the claim follows from the induction hypothesis using substitutivity. If they are on distinct subterms, then the subterms do not overlap and the claim is obvious. It remains to deal with the case of a head reduction together with an inner conversion:

        (λvM)N L⃗
       ↙         ↘
  M[v := N]L⃗      (λvM′)N L⃗        (with M → M′)
       ↘*        ↙
      M′[v := N]L⃗

        (λvM)N L⃗
       ↙         ↘
  M[v := N]L⃗      (λvM)N′ L⃗        (with N → N′)
       ↘*        ↙
      M[v := N′]L⃗

        (λvM)N L⃗
       ↙         ↘
  M[v := N]L⃗      (λvM)N L⃗′        (with Lᵢ → Lᵢ′ for one i)
       ↘*        ↙
      M[v := N]L⃗′

where for the lower left arrows we have used substitutivity again.
Corollary. Every term is strongly normalizing, hence normal forms
are unique.
3.7. The Structure of Normal Derivations. Let M be a normal derivation, viewed as a proof tree. A sequence of f.o.'s (formula occurrences) A₀, …, Aₙ such that (1) A₀ is a top formula (leaf) of the proof tree, (2) for 0 ≤ i < n, Aᵢ₊₁ is immediately below Aᵢ, and (3) Aᵢ is not the minor premise of an application of →⁻, is called a track of the derivation M.

For the examples of Statman and Orevkov announced above we work with the classical existential quantifier ∃ᶜˡ defined by ¬∀x¬. In particular we need:
Lemma (Existence Introduction). ⊢ A → ∃ᶜˡxA.

Proof. λu^A λv^{∀x¬A}. v x u.

Lemma (Existence Elimination). ⊢ (¬¬B → B) → ∃ᶜˡxA → (∀x.A → B) → B, if x ∉ FV(B).

Proof. λu^{¬¬B→B} λv^{∃ᶜˡxA} λw^{∀x.A→B}. u (λu₂^{¬B}. v (λx λu₁^{A}. u₂ (w x u₁))).

Note that the stability assumption ¬¬B → B is not needed if B does not contain an atom ≠ ⊥ as a strictly positive subformula. This will be the case for the derivations below, where B will always be a classical existential formula.
Let us now fix our language. We use a ternary relation symbol R to represent the graph of the function y + 2ˣ; so R(y, x, z) is intended to mean y + 2ˣ = z. We now axiomatize R by means of Horn clauses. For simplicity we use a unary function symbol s (to be viewed as the successor function) and a constant 0; one could use logic without function symbols instead, as Orevkov does, but this makes the formulas somewhat less readable and the proofs less perspicuous. Note that Orevkov's result is an adaptation of a result of Statman [25] for languages containing function symbols.
Hyp₁: ∀y R(y, 0, s(y))
Hyp₂: ∀y, x, z, z₁. R(y, x, z) → R(z, x, z₁) → R(y, s(x), z₁)
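Read computationally, the two clauses define y + 2^x by recursion on x: Hyp₁ is the base case y + 2⁰ = s(y), and Hyp₂ computes y + 2^{s(x)} by two additions of 2^x. The following sketch (hypothetical code, for illustration only) mirrors a derivation built from the two clauses and counts the uses of Hyp₁:

```python
def derive(y, x):
    """Derive R(y, x, z) using only Hyp1 and Hyp2.

    Returns (z, uses), where `uses` counts the applications of Hyp1 in
    the derivation built this way."""
    if x == 0:
        return y + 1, 1                  # Hyp1: R(y, 0, s(y))
    z, u1 = derive(y, x - 1)             # R(y, x-1, z)
    z1, u2 = derive(z, x - 1)            # R(z, x-1, z1)
    return z1, u1 + u2                   # Hyp2: R(y, s(x-1), z1)
```

So z = y + 2^x, and a derivation assembled in this direct way uses Hyp₁ exactly 2^x times.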
The goal formula then is

C_k := ∃cl z_k, …, z₀. R(0, 0, z_k) ∧ R(0, z_k, z_{k−1}) ∧ … ∧ R(0, z₁, z₀).
To obtain the short proof of the goal formula C_k we use formulas A_i(x) with
a free parameter x.

A₀(x) := ∀y ∃cl z R(y, x, z),
A_{i+1}(x) := ∀y. A_i(y) → ∃cl z. A_i(z) ∧ R(y, x, z).
For the two lemmata to follow we give an informal argument, which can
easily be converted into a formal proof. Note that the existence elimination
lemma is used only with existential formulas as conclusions. Hence it is not
necessary to use stability axioms and we have a derivation in minimal logic.
Lemma. ⊢ Hyp₁ → Hyp₂ → A_i(0).

Proof. Case i = 0. Obvious by Hyp₁.
Case i = 1. Let x with A₀(x) be given. It is sufficient to show A₀(s(x)),
that is ∀y ∃cl z₁ R(y, s(x), z₁). So let y be given. We know

(4) A₀(x) = ∀y ∃cl z R(y, x, z).

Applying (4) to our y gives z such that R(y, x, z). Applying (4) again to
this z gives z₁ such that R(z, x, z₁). By Hyp₂ we obtain R(y, s(x), z₁).
Case i + 2. Let x with A_{i+1}(x) be given. It suffices to show A_{i+1}(s(x)),
that is ∀y. A_i(y) → ∃cl z. A_i(z) ∧ R(y, s(x), z). So let y with A_i(y) be given.
We know

(5) A_{i+1}(x) = ∀y. A_i(y) → ∃cl z₁. A_i(z₁) ∧ R(y, x, z₁).

Applying (5) to our y gives z such that A_i(z) and R(y, x, z). Applying (5)
again to this z gives z₁ such that A_i(z₁) and R(z, x, z₁). By Hyp₂ we obtain
R(y, s(x), z₁).
Note that the derivations given have a fixed length, independent of i.
Lemma. ⊢ Hyp₁ → Hyp₂ → C_k.
Proof. A_k(0) applied to 0 and A_{k−1}(0) yields z_k with A_{k−1}(z_k) such
that R(0, 0, z_k).

A_{k−1}(z_k) applied to 0 and A_{k−2}(0) yields z_{k−1} with A_{k−2}(z_{k−1}) such
that R(0, z_k, z_{k−1}).
…
A₁(z₂) applied to 0 and A₀(0) yields z₁ with A₀(z₁) such that R(0, z₂, z₁).

A₀(z₁) applied to 0 yields z₀ with R(0, z₁, z₀).
Note that the derivations given have length linear in k.
We want to compare the length of this derivation of C_k with the length
of an arbitrary normal derivation.
Proposition. Any normal derivation of C_k from Hyp₁ and Hyp₂ has at
least 2_k nodes, where 2_n denotes the iterated exponential: 2₀ := 1 and
2_{n+1} := 2^{2_n}.
Proof. Let a normal derivation M of falsity ⊥ from Hyp₁, Hyp₂ and
the additional hypothesis

u: ∀z_k, …, z₀. R(0, 0, z_k) → R(0, z_k, z_{k−1}) → … → R(0, z₁, z₀) → ⊥

be given. We may assume that M does not contain free object variables
(otherwise substitute them by 0). The main branch of M must begin with
u, and its side premises are all of the form R(0, s^n(0), s^k(0)).
Observe that any normal derivation of R(s^m(0), s^n(0), s^k(0)) from Hyp₁,
Hyp₂ and u has at least 2^n occurrences of Hyp₁ and is such that k = m + 2^n.
This can be seen easily by induction on n. Note also that such a derivation
cannot involve u.

If we apply this observation to the above derivations of the side premises
we see that they derive

R(0, 0, s^{2₀}(0)), R(0, s^{2₀}(0), s^{2₁}(0)), …, R(0, s^{2_{k−1}}(0), s^{2_k}(0)).

The last of these derivations uses Hyp₁ at least 2^{2_{k−1}} = 2_k times.
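To get a feeling for the bound, one can compare the two growth rates directly. A quick sketch (illustrative only), computing the iterated exponential 2_k with 2₀ = 1 and 2_{k+1} = 2^{2_k} as in the proof above:

```python
def tower(k):
    """2_k: the iterated exponential, 2_0 = 1 and 2_{k+1} = 2 ** 2_k."""
    n = 1
    for _ in range(k):
        n = 2 ** n
    return n
```

While the derivation of C_k from the two lemmas grows only linearly in k, already tower(5) = 2^65536 has nearly twenty thousand decimal digits.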
4. Normalization including Permutative Conversions

The elimination of detours done in Section 3 will now be extended to
the full language. However, incorporation of ∨, ∧ and ∃ leads to difficulties.
If we do this by means of axioms (or constant derivation terms, as in 2.3),
we cannot read off as much as we want from a normal derivation. If we
do it in the form of rules, we must also allow permutative conversions. The
reason for the difficulty is that in the elimination rules for ∨, ∧, ∃ the minor
premise reappears in the conclusion. This gives rise to a situation where we
first introduce a logical connective, then do not touch it (by carrying it along
in minor premises of ∨⁻, ∧⁻ or ∃⁻ applications), and only later eliminate it.

4.1. Rules for ∨, ∧ and ∃. Disjunction. The introduction rules are

      | M                  | M
      A                    B
   --------- ∨⁺₀        --------- ∨⁺₁
    A ∨ B                A ∨ B

and the elimination rule is

                 [u: A]    [v: B]
      | M         | N        | K
    A ∨ B         C          C
   --------------------------------- ∨⁻ u, v
                  C

Conjunction. The introduction rule is

      | M     | N
      A       B
   --------------- ∧⁺
       A ∧ B

and the elimination rule is

              [u: A] [v: B]
      | M         | N
    A ∧ B         C
   ----------------------- ∧⁻ u, v
              C

Existential Quantifier. The introduction rule is

          | M
     r   A[x := r]
   ----------------- ∃⁺
         ∃xA

and the elimination rule is

                [u: A]
      | M         | N
     ∃xA          B
   -------------------- ∃⁻ x, u (var.cond.)
            B

In each of the rules ∨⁻, ∧⁻ and ∃⁻ the left premise is called the
major premise (or main premise), and each of the right premises a minor
premise (or side premise).
4.2. Conversion. In addition to the →- and ∀-conversions treated in 3.1,
we consider the following conversions:

∨-conversion.

      | M
      A
   --------- ∨⁺₀     [u: A]   [v: B]
    A ∨ B             | N       | K
                      C         C
   --------------------------------- ∨⁻ u, v
                  C

converts into

      | M
      A
      | N
      C

with M substituted for the assumption u: A in N; and symmetrically with
∨⁺₁, where the conversion yields K with M substituted for v: B.

∧-conversion.

      | M     | N
      A       B          [u: A] [v: B]
   -------------- ∧⁺         | K
       A ∧ B                 C
   ------------------------------ ∧⁻ u, v
               C

converts into

      | M     | N
      A       B
         | K
         C

where in K the assumptions u: A and v: B are replaced by M and N.

∃-conversion.

          | M                [u: A]
     r   A[x := r]            | N
   ----------------- ∃⁺       B
         ∃xA
   ------------------------------ ∃⁻ x, u
              B

converts into

      | M
      A[x := r]
      | N′
      B

where N′ := N[x := r] with M substituted for the assumption u: A[x := r].
4.3. Permutative Conversion. In a permutative conversion we permute
an E-rule upwards over the minor premises of ∨⁻, ∧⁻ or ∃⁻.

∨-perm conversion.

      | M     | N     | K
     A ∨ B    C       C
    ----------------------- ∨⁻      | L
              C                     C′
    ------------------------------------ E-rule
                    D

converts into

              | N    | L            | K    | L
      | M     C      C′             C      C′
     A ∨ B   ----------- E-rule    ----------- E-rule
                 D                     D
    -------------------------------------------- ∨⁻
                         D

∧-perm conversion.

      | M     | N
     A ∧ B    C
    --------------- ∧⁻     | K
           C               C′
    ---------------------------- E-rule
                D

converts into

              | N    | K
      | M     C      C′
     A ∧ B   ----------- E-rule
                 D
    ------------------------ ∧⁻
                D

∃-perm conversion.

      | M     | N
     ∃xA      B
    -------------- ∃⁻      | K
           B               C
    --------------------------- E-rule
                D

converts into

              | N    | K
      | M     B      C
     ∃xA     ---------- E-rule
                 D
    ----------------------- ∃⁻
                D
4.4. Derivations as Terms. The term representation of derivations
has to be extended. The rules for ∨, ∧ and ∃ with the corresponding terms
are given in Table 3.

The introduction rule ∃⁺ has as its left premise the witnessing term r to
be substituted. For brevity we write ∃⁺, ∃⁻ instead
of ∃⁺_{x,A}, ∃⁻_{x,A,B}. So we consider derivation terms M, N, K of the forms

u | λvM | λyM | ∨⁺₀M | ∨⁺₁M | ⟨M, N⟩ | ∃⁺rM |
MN | Mr | M(v₀.N₀, v₁.N₁) | M(v, w.N) | M(v.N);

in these expressions the variables y, v, v₀, v₁, w get bound.

To simplify the technicalities, we restrict our treatment to the rules for
→ and ∃. It can easily be extended to the full set of rules; some details for
disjunction are given in 4.6. So we consider

u | λvM | ∃⁺rM | MN | M(v.N);

in these expressions the variable v gets bound.

We reserve the letters E, F, G for eliminations, i.e. expressions of the
form (v.N), and R, S, T for both terms and eliminations. Using this notation
we obtain a second (and clearly equivalent) inductive definition of terms:

u M⃗ | u M⃗ E | λvM | ∃⁺rM | (λvM)N R⃗ | ∃⁺rM(v.N) R⃗ | u M⃗ E R S⃗.
Table 3. Derivation terms for ∨, ∧ and ∃

∨⁺₀ (premise M^A):                              (∨⁺₀,B M^A)^{A∨B}
∨⁺₁ (premise M^B):                              (∨⁺₁,A M^B)^{A∨B}
∨⁻ (premises M^{A∨B}, N^C from [u: A],
    K^C from [v: B]):                            (M^{A∨B}(u^A.N^C, v^B.K^C))^C
∧⁺ (premises M^A, N^B):                          ⟨M^A, N^B⟩^{A∧B}
∧⁻ (premises M^{A∧B}, N^C from [u: A], [v: B]):  (M^{A∧B}(u^A, v^B.N^C))^C
∃⁺ (premises r, M^{A[x:=r]}):                    (∃⁺_{x,A} r M^{A[x:=r]})^{∃xA}
∃⁻ (premises M^{∃xA}, N^B from [u: A];
    var.cond.):                                  (M^{∃xA}(u^A.N^B))^B
Here the final three forms are not normal: (λvM)N R⃗ and ∃⁺rM(v.N) R⃗
both are β-redexes, and u M⃗ E R S⃗ is a permutative redex. The conversion
rules are

(λvM)N ↦ M[v := N]                    β-conversion,
∃⁺_{x,A}rM(v.N) ↦ N[x := r][v := M]   ∃-conversion,
M(v.N)R ↦ M(v.NR)                     permutative conversion.

These generate the one-step reduction relation →: if M ↦ M′, or M′ arises
from M by converting a subterm, then M → M′. In particular, if M → M′,
then also MR → M′R, NM → NM′, N(v.M) → N(v.M′), λvM → λvM′ and
∃⁺rM → ∃⁺rM′ (inner reductions).
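The conversions can be executed on a small term representation. In the following Python sketch (ad-hoc encoding, illustration only), `convert` implements the β- and permutative conversions at the root — the ∃-conversion is analogous but additionally substitutes the witnessing object term — and `normalize` applies them exhaustively:

```python
# Derivation terms as tuples:
#   ('var', u)          assumption variable u
#   ('lam', v, M)       lambda v. M
#   ('app', M, N)       application M N
#   ('elim', M, v, N)   elimination M(v.N)

def subst(t, x, s):
    """t[x := s]; bound variables assumed distinct from x (no capture)."""
    tag = t[0]
    if tag == 'var':
        return s if t[1] == x else t
    if tag == 'lam':
        return t if t[1] == x else ('lam', t[1], subst(t[2], x, s))
    if tag == 'app':
        return ('app', subst(t[1], x, s), subst(t[2], x, s))
    M, v, N = t[1], t[2], t[3]
    return ('elim', subst(M, x, s), v, N if v == x else subst(N, x, s))

def convert(t):
    """Apply a conversion at the root, if possible; else return None."""
    if t[0] == 'app' and t[1][0] == 'lam':       # (lam v M) N |-> M[v := N]
        return subst(t[1][2], t[1][1], t[2])
    if t[0] == 'app' and t[1][0] == 'elim':      # M(v.N) R |-> M(v.N R)
        M, v, N = t[1][1], t[1][2], t[1][3]
        return ('elim', M, v, ('app', N, t[2]))
    if t[0] == 'elim' and t[1][0] == 'elim':     # M(v.N)(w.K) |-> M(v.N(w.K))
        M, v, N = t[1][1], t[1][2], t[1][3]
        return ('elim', M, v, ('elim', N, t[2], t[3]))
    return None

def normalize(t):
    """Exhaustively convert, first in subterms, then at the root."""
    if t[0] == 'lam':
        return ('lam', t[1], normalize(t[2]))
    if t[0] == 'app':
        t = ('app', normalize(t[1]), normalize(t[2]))
    elif t[0] == 'elim':
        t = ('elim', normalize(t[1]), t[2], normalize(t[3]))
    r = convert(t)
    return normalize(r) if r is not None else t
```

For example, the permutative redex u(v.v)w normalizes to u(v.vw), exactly as the conversion M(v.N)R ↦ M(v.NR) prescribes.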
We now give the rules to inductively generate a set SN:

    M⃗ ∈ SN               M ∈ SN            M ∈ SN
   ----------- (Var₀)    ---------- (λ)    ------------ (∃)
    u M⃗ ∈ SN            λvM ∈ SN          ∃⁺rM ∈ SN

    M⃗, N ∈ SN                 u M⃗(v.NR) S⃗ ∈ SN
   ------------------ (Var)    ---------------------- (Varπ)
    u M⃗(v.N) ∈ SN             u M⃗(v.N)R S⃗ ∈ SN

    M[v := N] R⃗ ∈ SN    N ∈ SN
   -------------------------------- (β)
    (λvM)N R⃗ ∈ SN

    N[x := r][v := M] R⃗ ∈ SN    M ∈ SN
   --------------------------------------- (β∃)
    ∃⁺_{x,A}rM(v.N) R⃗ ∈ SN
Lemma. Every term in SN is strongly normalizing.

Proof. By induction on the generation of SN we show that each rule
preserves strong normalizability.

Case (Varπ). We show that strong normalizability of u M⃗(v.NR) S⃗
implies that of u M⃗(v.N)R S⃗, by induction on the reduction tree of
u M⃗(v.NR) S⃗. Consider the reducts of u M⃗(v.N)R S⃗. The critical case
arises when R = (v′.N′) and S⃗ = T S⃗′, and we have a permutative
conversion of R = (v′.N′) with T, leading to the term
u M⃗(v.N)(v′.N′T) S⃗′. But u M⃗(v.NR) S⃗ = u M⃗(v.N(v′.N′))T S⃗′
leads in two permutative steps to u M⃗(v.N(v′.N′T)) S⃗′, which is also a
permutative reduct of the term above, so the induction hypothesis applies.

Case (β). We need to consider all possible reducts of (λvM)N R⃗. In case of an outer β-
reduction use the assumption. If N is reduced, use the induction hypothesis.
Reductions in M and in R⃗, as well as permutative reductions within R⃗, are
taken care of by the side induction hypothesis.

Case (β∃). Similarly for the reducts of ∃⁺rM(v.N) R⃗; the only additional
case is a permutative conversion of (v.N) with the first element of R⃗ = S S⃗.
Apply the third induction hypothesis, since (NS)[x := r][v := M] S⃗ = N[x := r][v :=
M]S S⃗.
For later use we prove a slightly generalized form of the rule (Varπ):

Proposition. If M(v.NR) S⃗ ∈ SN, then M(v.N)R S⃗ ∈ SN.

Proof. Induction on the generation of M(v.NR) S⃗ ∈ SN. We distinguish
cases according to the form of M.

Case u T⃗(v.NR) S⃗ ∈ SN. If T⃗ consists of terms only, use (Varπ). Otherwise we have
u M⃗(v′.N′)R⃗′(v.NR) S⃗ ∈ SN, generated by (Varπ) from
u M⃗(v′.N′R⃗′(v.NR) S⃗) ∈ SN, which in turn comes by (Var) from M⃗ ∈ SN and
N′R⃗′(v.NR) S⃗ ∈ SN. The induction hypothesis yields N′R⃗′(v.N)R S⃗ ∈ SN,
hence u M⃗(v′.N′R⃗′(v.N)R S⃗) ∈ SN by (Var) and
finally u M⃗(v′.N′)R⃗′(v.N)R S⃗ ∈ SN by (Varπ).

Case ∃⁺rM T⃗(v.NR) S⃗ ∈ SN. Similarly, using (β∃) instead of (Varπ). In
detail: If T⃗ is empty, then ∃⁺rM(v.NR) S⃗ ∈ SN has been generated by (β∃) from
(NR)[x := r][v := M] S⃗ = N[x := r][v := M]R S⃗ ∈ SN and M ∈ SN, hence
∃⁺rM(v.N)R S⃗ ∈ SN again by (β∃). Otherwise we have
∃⁺rM(v′.N′)T⃗(v.NR) S⃗ ∈ SN. This
must be generated by (β∃) from N′[x := r][v′ := M]T⃗(v.NR) S⃗ ∈ SN. The
induction hypothesis yields N′[x := r][v′ := M]T⃗(v.N)R S⃗ ∈ SN, hence
∃⁺rM(v′.N′)T⃗(v.N)R S⃗ ∈ SN by (β∃).

Case (λvM)N′ R⃗′(w.NR) S⃗ ∈ SN. By (β) this has been generated from
M[v := N′] R⃗′(w.NR) S⃗ ∈ SN and N′ ∈ SN. The induction hypothesis yields
M[v := N′] R⃗′(w.N)R S⃗ ∈ SN, hence (λvM)N′ R⃗′(w.N)R S⃗ ∈ SN by (β).
In what follows we shall show that every term is in SN and hence is
strongly normalizable. Given the definition of SN we only have to show
that SN is closed under →⁻ and ∃⁻ applications and under substitution:

Theorem. For all M ∈ SN:
(a) if M derives A → B and N ∈ SN derives A, then MN ∈ SN;
(b) if M derives ∃xA and N ∈ SN, then M(v.N) ∈ SN;
(c) if N ∈ SN derives A, then M[v := N] ∈ SN, for v an assumption
variable for A.

Proof. (a). By induction on M ∈ SN; we distinguish cases according to
how M ∈ SN was generated. The crucial case is:

Case (λvM)^{A₀→A₁} ∈ SN by (λ) from M ∈ SN. Use (β); for this we
need to know M[v := N] ∈ SN. But this follows from IH(c) for M, since N
derives A₀.
(b). By induction on M ∈ SN. Let M ∈ SN and assume that M proves
A = ∃xB and N ∈ SN. The goal is M(v.N) ∈ SN. We distinguish cases
according to how M ∈ SN was generated. For (Varπ), (β) and (β∃) use
the same rule again.

Case u M⃗ ∈ SN by (Var₀) from M⃗ ∈ SN. Use (Var).

Case (∃⁺rM)^{∃xA} ∈ SN by (∃) from M ∈ SN. Use (β∃).

(c). By induction on M ∈ SN. Let N^A ∈ SN; the goal is M[v := N] ∈
SN. We distinguish cases according to how M ∈ SN was generated. For (λ),
(∃), (β) and (β∃) use the same rule again.

Case u M⃗(v′.N′)R S⃗ ∈ SN by (Varπ) from u M⃗(v′.N′R) S⃗ ∈ SN. If u ≠ v,
use (Varπ) again. If u = v, the induction hypothesis yields
(u M⃗(v′.N′R) S⃗)[v := N] ∈ SN.
Now use the proposition above.
Corollary. Every term is strongly normalizable.

Proof. Induction on the (first) inductive definition of terms M. In
the cases u, λvM and ∃⁺rM the claim follows from the definition of SN, and
in the cases MN and M(v.N) it follows from parts (a), (b) of the previous
theorem.
4.6. Disjunction. We describe the changes necessary to extend the
result above to the language with disjunction ∨.

We have the additional conversions

∨⁺ᵢM(v₀.N₀, v₁.N₁) ↦ Nᵢ[vᵢ := M]    ∨ᵢ-conversion.
The definition of SN needs to be extended by

    M ∈ SN                 M⃗, N₀, N₁ ∈ SN
   ------------ (∨ᵢ)      ---------------------------- (Var∨)
    ∨⁺ᵢM ∈ SN             u M⃗(v₀.N₀, v₁.N₁) ∈ SN

    u M⃗(v₀.N₀R, v₁.N₁R) S⃗ ∈ SN
   --------------------------------- (Varπ,∨)
    u M⃗(v₀.N₀, v₁.N₁)R S⃗ ∈ SN

    Nᵢ[vᵢ := M] R⃗ ∈ SN    N₁₋ᵢ R⃗ ∈ SN    M ∈ SN
   ------------------------------------------------- (β∨ᵢ)
    ∨⁺ᵢM(v₀.N₀, v₁.N₁) R⃗ ∈ SN

The former rules (Var) and (Varπ) are correspondingly renamed into
(Var∃) and (Varπ,∃).
The lemma above stating that every term in SN is strongly normalizable
needs to be extended by an additional clause:

Case (β∨ᵢ). We show that ∨⁺ᵢM(v₀.N₀, v₁.N₁) R⃗ is strongly normalizing,
by induction first on M, second on Nᵢ[vᵢ := M] R⃗, third on N₁₋ᵢ R⃗, and
fourth on the length of R⃗. In
case of an outer reduction use the assumption. If M is reduced, use the
first induction hypothesis. Reductions in Nᵢ and in R⃗, as well as permutative
reductions within R⃗, are taken care of by the second induction hypothesis.
Reductions in N₁₋ᵢ are taken care of by the third induction hypothesis. The
only remaining case is when R⃗ = S S⃗ and (v₀.N₀, v₁.N₁) is permuted with
S, to yield (v₀.N₀S, v₁.N₁S). Apply the fourth induction hypothesis, since
(NᵢS)[v := M] S⃗ = Nᵢ[v := M]S S⃗.
Finally the theorem above stating properties of SN needs an additional
clause:

for all M ∈ SN, if M proves A = A₀ ∨ A₁ and N₀, N₁ ∈ SN, then
M(v₀.N₀, v₁.N₁) ∈ SN.
Proof. The new clause is proved by induction on M ∈ SN. Let M ∈ SN
and assume that M proves A = A₀ ∨ A₁ and N₀, N₁ ∈ SN. The goal is
M(v₀.N₀, v₁.N₁) ∈ SN. We distinguish cases according to how M ∈ SN was
generated. For (Varπ,∃), (Varπ,∨), (β), (β∃) and (β∨ᵢ) use the same rule
again.

Case u M⃗ ∈ SN by (Var₀) from M⃗ ∈ SN. Use (Var∨).

Case (∨⁺ᵢM)^{A₀∨A₁} ∈ SN by (∨ᵢ) from M ∈ SN. Use (β∨ᵢ); for this we
need to know Nᵢ[vᵢ := M] ∈ SN and N₁₋ᵢ ∈ SN. The latter is assumed,
and the former follows from the main induction hypothesis (with Nᵢ) for the
substitution clause of the theorem, since M derives Aᵢ.
Case u M⃗(v′.N′) ∈ SN by (Var∃) from M⃗, N′ ∈ SN. For brevity let
E := (v₀.N₀, v₁.N₁). Then N′E ∈ SN by the side induction hypothesis for N′,
so u M⃗(v′.N′E) ∈ SN by (Var∃) and therefore u M⃗(v′.N′)E ∈ SN by (Varπ,∃).
Case u M⃗(v′₀.N′₀, v′₁.N′₁) ∈ SN by (Var∨) from M⃗, N′₀, N′₁ ∈ SN. Let
E := (v₀.N₀, v₁.N₁). Then N′ᵢE ∈ SN by the side induction hypothesis for N′ᵢ,
so u M⃗(v′₀.N′₀E, v′₁.N′₁E) ∈ SN by (Var∨) and therefore u M⃗(v′₀.N′₀, v′₁.N′₁)E
∈ SN by (Varπ,∨).
Clause (c) now needs additional cases, e.g.,

Case u M⃗(v₀.N₀, v₁.N₁) ∈ SN by (Var∨) from M⃗, N₀, N₁ ∈ SN. If u ≠ v,
use (Var∨) again. If u = v, we show N M⃗[v := N](v₀.N₀[v := N], v₁.N₁[v := N])
∈ SN. Note that N proves A; hence in case M⃗ is empty the claim follows from
(b), and otherwise from (a) and the induction hypothesis.
4.7. The Structure of Normal Derivations. As mentioned already,
normalizations aim at removing local maxima of complexity, i.e. formula
occurrences which are first introduced and immediately afterwards eliminated.
However, an introduced formula may be used as a minor premise of an
application of ∨⁻, ∧⁻ or ∃⁻, stay the same through a whole sequence of such
minor premises, and only then be eliminated. This motivates the following
definition.

Definition. A segment (of length n) in a derivation M is a sequence
A₁, …, A_n of occurrences of one and the same formula A such that
(a) for 1 ≤ i < n, A_i is a minor premise of an application of ∨⁻, ∧⁻ or ∃⁻,
with conclusion A_{i+1};
(b) A_n is not a minor premise of ∨⁻, ∧⁻ or ∃⁻;
(c) A₁ is not the conclusion of ∨⁻, ∧⁻ or ∃⁻.
(Note: An f.o. which is neither a minor premise nor the conclusion of an
application of ∨⁻, ∧⁻ or ∃⁻ constitutes a segment of length 1.) A segment σ
is maximal, or a cut, if the formula A in σ is first introduced and then, as a
subformula of B in σ′, eliminated. Clearly a derivation is
normal iff it does not contain a maximal segment.
The argument in 3.7 needs to be refined to also cover the rules for ∨, ∧, ∃.
The reason for the difficulty is that in the E-rules ∨⁻, ∧⁻ and ∃⁻ the subformulas
of a major premise A ∨ B, A ∧ B or ∃xA of an E-rule application do not
appear in the conclusion, but among the assumptions being discharged by
the application. This suggests the definition of track below.

The general notion of a track is designed to retain the subformula property
in case one passes through the major premise of an application of a
∨⁻, ∧⁻ or ∃⁻ rule.

Definition. A track of a derivation M is a sequence of f.o.s A₀, …, A_n
such that
(a) A₀ is a top formula of M not discharged by an application of a ∨⁻, ∧⁻
or ∃⁻ rule;
(b) A_i for i < n is not the minor premise of an instance of →⁻, and either
(i) A_i is not the major premise of an instance of a ∨⁻, ∧⁻ or ∃⁻ rule and
A_{i+1} is directly below A_i, or
(ii) A_i is the major premise of an instance of a ∨⁻, ∧⁻ or ∃⁻ rule and
A_{i+1} is an assumption discharged by this instance;
(c) A_n is either
(i) the minor premise of an instance of →⁻, or
(ii) the conclusion of M, or
(iii) the major premise of an instance of a ∨⁻, ∧⁻ or ∃⁻ rule.
Definition. A track of order 0, or main track, in a normal derivation
is a track ending either in the conclusion of the whole derivation or in the
major premise of an application of a ∨⁻, ∧⁻ or ∃⁻ rule. A track of order
n + 1 is a track ending in the minor premise of an →⁻ application, with
major premise belonging to a track of order n.

A main branch of a derivation is a branch π in the proof tree such that π
passes only through premises of I-rules and major premises of E-rules, and π
begins at a top node and ends in the conclusion.

Remark. By an obvious simplification conversion we may remove every
application of a ∨⁻, ∧⁻ or ∃⁻ rule that discharges no assumption variables;
e.g. an ∃⁻ application

      | M       [u: A]
     ∃xA         | N
                 B
    ----------------- ∃⁻ x, u
           B

converts into the derivation N of B alone, provided u does not occur in N.

Proposition. In a normal derivation every formula occurrence belongs
to some track.

Proof. By induction on derivations K. Consider, e.g., the case of a final
∃⁻ application as above: B in N belongs to a track π (induction hypothesis);
either π does not
start in u: A, and then π, B is a track in K which ends in the conclusion; or
π starts in u: A, and then there is a track π′ in M (induction hypothesis)
such that π′, π, B is a track in K ending in the conclusion. The other cases
are left to the reader.
Theorem (Subformula property). Let M be a normal derivation where
every application of a ∨⁻, ∧⁻ or ∃⁻ rule discharges at least one assumption
variable. Then every formula occurring in M is a subformula either of the
end formula or of an assumption formula.

By assumption again neither B nor C can have an existential s.p.p. Applying
the induction hypothesis to N₀ and N₁ we obtain

             [u: B]                          [v: C]
              | N₀                            | N₁
   ⋁_{i=1}^{n} A[x := rᵢ]          ⋁_{i=n+1}^{n+m} A[x := rᵢ]
   ------------------------ ∨⁺     --------------------------- ∨⁺
   ⋁_{i=1}^{n+m} A[x := rᵢ]        ⋁_{i=1}^{n+m} A[x := rᵢ]

and with the derivation M of B ∨ C an application of ∨⁻ u, v yields
⋁_{i=1}^{n+m} A[x := rᵢ].

(b). Similarly; by assumption the last rule can be neither ∨⁻ nor ∃⁻.
Remark. Rasiowa-Harrop formulas (in the literature also called Harrop
formulas) are formulas for which no s.p.p. is a disjunction or an existential
formula. For Γ consisting of Rasiowa-Harrop formulas both theorems above
hold.

5. Notes

The proof of the existence of normal forms w.r.t. permutative conversions
is originally due to Prawitz [20]. We have adapted a method developed by
Joachimski and Matthes [13], which in turn is based on van Raamsdonk's
and Severi's [28].
CHAPTER 2
Models
It is an obvious question to ask whether the logical rules we have been
considering suffice, i.e. whether we have forgotten some necessary rules. To
answer this question we first have to fix the meaning of a formula, i.e. we
have to provide a semantics.

This is rather straightforward for classical logic: we can take the usual
notion of a structure (or model, or (universal) algebra). However, for minimal
and intuitionistic logic we need a more refined notion: we shall use
so-called Beth structures here. Using this concept of a model we will prove
soundness and completeness for both minimal and intuitionistic logic. As a
corollary we will obtain completeness of classical logic w.r.t. the standard
notion of a structure.
1. Structures for Classical Logic

1.1. Structures. We define the notion of a structure (more accurately
L-structure) and define what the value of a term and the meaning of a
formula in such a structure should be.

Definition. 𝒜 = (D, I) is a prestructure (or L-prestructure), if D is
a nonempty set (the carrier set or the domain of 𝒜) and I is a map
(interpretation) assigning to every n-ary function symbol f of L a function
I(f): Dⁿ → D.

In case n = 0, I(f) is an element of D. 𝒜 = (D, I₀, I₁) is a structure (or
L-structure), if (D, I₀) is a prestructure and I₁ is a map assigning to every
n-ary relation symbol R of L an n-ary relation

I₁(R) ⊆ Dⁿ.

In case n = 0, I₁(R) is one of the truth values 1 and 0; in particular we
require I₁(⊥) = 0.

If 𝒜 = (D, I) or (D, I₀, I₁), then we often write |𝒜| for the carrier set
D of 𝒜, and f^𝒜, R^𝒜 for the interpretations I₀(f), I₁(R) of the function
and relation symbols.
An assignment (or variable assignment) η in D is a map assigning to
every variable x ∈ dom(η) a value η(x) ∈ D. Finite assignments will be
written as [x₁ := a₁, …, xₙ := aₙ] (or else as [a₁/x₁, …, aₙ/xₙ]), with
distinct x₁, …, xₙ. If η is an assignment in D and a ∈ D, let η^a_x be the
assignment in D mapping x to a and coinciding with η elsewhere, so

η^a_x(y) := η(y), if y ≠ x;  η^a_x(y) := a, if y = x.
Let a prestructure 𝒜 and an assignment η in |𝒜| be given. We define
a homomorphic extension of η (denoted by η as well) to the set of L-terms t
such that vars(t) ⊆ dom(η) by

η(c) := c^𝒜,
η(f(t₁, …, tₙ)) := f^𝒜(η(t₁), …, η(tₙ)).

Observe that the extension of η depends on 𝒜; therefore we may also write
t^𝒜[η] for η(t).
For every structure 𝒜, assignment η in |𝒜| and formula A with FV(A) ⊆
dom(η) we define 𝒜 ⊨ A[η] (read: A is valid in 𝒜 under the assignment
η) by recursion on A.

𝒜 ⊨ R(t₁, …, tₙ)[η] :⇔ (t₁^𝒜[η], …, tₙ^𝒜[η]) ∈ I₁(R), for R not 0-ary.
𝒜 ⊨ R[η] :⇔ I₁(R) = 1, for R 0-ary.
𝒜 ⊨ (A ∧ B)[η] :⇔ 𝒜 ⊨ A[η] and 𝒜 ⊨ B[η].
𝒜 ⊨ (A ∨ B)[η] :⇔ 𝒜 ⊨ A[η] or 𝒜 ⊨ B[η].
𝒜 ⊨ (A → B)[η] :⇔ if 𝒜 ⊨ A[η], then 𝒜 ⊨ B[η].
𝒜 ⊨ (∀xA)[η] :⇔ for all a ∈ |𝒜| we have 𝒜 ⊨ A[η^a_x].
𝒜 ⊨ (∃xA)[η] :⇔ there is an a ∈ |𝒜| such that 𝒜 ⊨ A[η^a_x].

Because of I₁(⊥) = 0 we have in particular 𝒜 ⊭ ⊥[η].

If Γ is a set of formulas, we write 𝒜 ⊨ Γ[η] if for all A ∈ Γ we have
𝒜 ⊨ A[η]. If 𝒜 ⊨ A[η] for all assignments η in |𝒜|, we write 𝒜 ⊨ A.
1.2. Coincidence and Substitution Lemma.

Lemma (Coincidence). Let 𝒜 be a structure, t a term, A a formula and
η, ξ assignments in |𝒜|.
(a) If η(x) = ξ(x) for all x ∈ vars(t), then η(t) = ξ(t).
(b) If η(x) = ξ(x) for all x ∈ FV(A), then 𝒜 ⊨ A[η] iff 𝒜 ⊨ A[ξ].

Proof. Induction on terms and formulas.

Lemma (Substitution). Let 𝒜 be an L-structure, t, r L-terms, A an
L-formula and η an assignment in |𝒜|. Then
(a) η(r[x := t]) = η^{η(t)}_x(r).
(b) 𝒜 ⊨ A[x := t][η] ⇔ 𝒜 ⊨ A[η^{η(t)}_x].

Proof. (a). Induction on r. (b). Induction on A. We restrict ourselves
to the cases of an atomic formula and a universal formula; the other cases
are easier.

Case R(s₁, …, sₙ). For simplicity assume n = 1. Then

𝒜 ⊨ R(s)[x := t][η] ⇔ 𝒜 ⊨ R(s[x := t])[η]
⇔ η(s[x := t]) ∈ R^𝒜
⇔ η^{η(t)}_x(s) ∈ R^𝒜   by (a)
⇔ 𝒜 ⊨ R(s)[η^{η(t)}_x].
Case ∀yA. We may assume y ≠ x and y ∉ vars(t).

𝒜 ⊨ (∀yA)[x := t][η]
⇔ 𝒜 ⊨ (∀yA[x := t])[η]
⇔ for all a ∈ |𝒜|, 𝒜 ⊨ A[x := t][η^a_y]
⇔ for all a ∈ |𝒜|, 𝒜 ⊨ A[(η^a_y)^b_x], with b := η^a_y(t) = η(t)
   (by IH and the coincidence lemma)
⇔ for all a ∈ |𝒜|, 𝒜 ⊨ A[(η^b_x)^a_y]   (because x ≠ y)
⇔ 𝒜 ⊨ (∀yA)[η^b_x].

This completes the proof.
1.3. Soundness. We prove the soundness theorem: it says that every
formula derivable in classical logic is valid in an arbitrary structure.

Theorem (Soundness). Let Γ ⊢_c B. If 𝒜 is a structure and η an
assignment in |𝒜|, then 𝒜 ⊨ Γ[η] entails 𝒜 ⊨ B[η].

Proof. Induction on derivations. The given derivation of B from Γ
can only have finitely many free assumptions; hence we may assume Γ =
{A₁, …, Aₙ}.

Case u: B. Then B ∈ Γ and the claim is obvious.

Case Stab_R: ∀x⃗. ¬¬Rx⃗ → Rx⃗. Again the claim is clear, since 𝒜 ⊨
¬¬A[η] means the same as 𝒜 ⊨ A[η].
The remaining cases are similar.

2. Beth Structures for Minimal Logic

The idea is that a formula is forced at a node k if it is
true in all worlds above k.
More formally, each Beth structure is based on a finitely branching tree
T. A node k over a set S is a finite sequence k = ⟨a₀, a₁, …, a_{n−1}⟩ of
elements of S; lh(k) is the length of k. We write k ⪯ k′ if k is an initial
segment of k′. A tree on S is a set of nodes closed under initial segments. A
tree T is finitely branching if every node in T has finitely many immediate
successors.

A tree T is unbounded if for every n ∈ ℕ there is a node k ∈ T such that
lh(k) = n. A branch of T is a linearly ordered subtree of T. A leaf is a node
without successors in T.

For the proof of the completeness theorem, a Beth structure based on
the complete binary tree (i.e. the complete tree over {0, 1}) will suffice. The
nodes are all the finite sequences of 0's and 1's, and the ordering is as
above. The root is the empty sequence, and k0 is the sequence k with the
postfix 0. Similarly for k1.
Definition. Let (T, ⪯) be a finitely branching tree. ℬ = (D, I₀, I₁) is
an L-Beth structure on T, where D is a nonempty set, and for each n-ary
function symbol f in L, I₀ assigns to f a map I₀(f): Dⁿ → D. For each n-ary
relation symbol R in L and each node k ∈ T, I₁(R, k) ⊆ Dⁿ is assigned in
such a way that monotonicity is preserved, that is,

k ⪯ k′ ⟹ I₁(R, k) ⊆ I₁(R, k′).

If n = 0, then I₁(R, k) is either true or false, and it follows by monotonicity
that if k ⪯ k′ and I₁(R, k), then I₁(R, k′).

There is no special requirement set on I₁(⊥, k). In minimal logic, falsum ⊥
plays the role of an ordinary propositional variable.
For an assignment η, t^ℬ[η] is understood classically. The classical
satisfaction relation 𝒜 ⊨ A[η] is replaced with the forcing relation ⊩ in Beth
structures. It is obvious from the definition that any T can be extended
to a complete tree T̄ without leaves, in which for each leaf k ∈ T all
sequences k0, k00, k000, … are added to T. For each node k0…0, we add
I₁(R, k0…0) := I₁(R, k).
Definition. ℬ, k ⊩ A[η] (ℬ forces A at a node k for an assignment η)
is defined inductively as follows. We write k ⊩ A[η] when it is clear from
the context what the underlying structure ℬ is, and we write ∀k′⪰ₙk A for

∀k′⪰k. lh(k′) = lh(k) + n → A.

k ⊩ R(t₁, …, t_p)[η] :⇔ ∃n ∀k′⪰ₙk (t₁^ℬ[η], …, t_p^ℬ[η]) ∈ I₁(R, k′),
   if R is not 0-ary.
k ⊩ R[η] :⇔ ∃n ∀k′⪰ₙk I₁(R, k′) = 1, if R is 0-ary.
k ⊩ (A ∨ B)[η] :⇔ ∃n ∀k′⪰ₙk. k′ ⊩ A[η] or k′ ⊩ B[η].
k ⊩ (∃xA)[η] :⇔ ∃n ∀k′⪰ₙk ∃a∈|ℬ| k′ ⊩ A[η^a_x].
k ⊩ (A → B)[η] :⇔ ∀k′⪰k. k′ ⊩ A[η] ⟹ k′ ⊩ B[η].
k ⊩ (A ∧ B)[η] :⇔ k ⊩ A[η] and k ⊩ B[η].
k ⊩ (∀xA)[η] :⇔ ∀a∈|ℬ| k ⊩ A[η^a_x].

The clauses for atoms, disjunction and the existential quantifier include a
concept of a bar in T̄.
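On a finite tree, imagined to continue constantly below its leaves as described above, forcing is decidable and the effect of the bar clauses can be observed. The following sketch (ad-hoc encoding, propositional fragment only, ⊥ treated as an ordinary atom as in minimal logic) shows that a root can force P ∨ ¬P by the bar clause even though it forces neither disjunct — something impossible for the classical (or Kripke) reading of ∨:

```python
# Tree: dict node -> list of children; atoms: dict node -> set of atoms
# true there (required to be monotone along the tree order).
# Formulas: ('at', P) | ('and', A, B) | ('or', A, B) | ('imp', A, B).

def leaf_ext(tree, k):
    """Leaves weakly above k; they form a bar through k."""
    ch = tree[k]
    return [k] if not ch else [l for c in ch for l in leaf_ext(tree, c)]

def all_ext(tree, k):
    """All nodes weakly above k."""
    return [k] + [n for c in tree[k] for n in all_ext(tree, c)]

def forces(tree, atoms, k, A):
    op = A[0]
    if op == 'at':      # bar clause; with constant continuation beyond
                        # leaves it suffices to check every leaf above k
        return all(A[1] in atoms[l] for l in leaf_ext(tree, k))
    if op == 'and':
        return forces(tree, atoms, k, A[1]) and forces(tree, atoms, k, A[2])
    if op == 'or':      # bar clause for disjunction
        return all(forces(tree, atoms, l, A[1]) or forces(tree, atoms, l, A[2])
                   for l in leaf_ext(tree, k))
    if op == 'imp':     # quantifies over all extensions of k
        return all((not forces(tree, atoms, n, A[1]))
                   or forces(tree, atoms, n, A[2])
                   for n in all_ext(tree, k))
    raise ValueError(op)
```

With the two-leaf tree where P holds only above node 0 and ⊥ nowhere, the root forces P ∨ (P → ⊥) via the bar {0, 1}, but forces neither P nor P → ⊥.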
2.2. Covering Lemma. It is easily seen (from the definition and using
monotonicity) that from k ⊩ A[η] and k ⪯ k′ we can conclude k′ ⊩ A[η].
The converse is also true:

Lemma (Covering Lemma). ∀k′⪰ₙk k′ ⊩ A[η] ⟹ k ⊩ A[η].
Proof. Induction on A. We write k ⊩ A for k ⊩ A[η].

Case R t⃗. Assume

∀k′⪰ₙk k′ ⊩ R t⃗,

hence by definition

∀k′⪰ₙk ∃m ∀k″⪰ₘk′ t⃗^ℬ[η] ∈ I₁(R, k″).

Since T is a finitely branching tree,

∃m ∀k′⪰ₘk t⃗^ℬ[η] ∈ I₁(R, k′).

Hence k ⊩ R t⃗.
The cases A ∨ B and ∃xA are handled similarly.

Case A → B. Let k′ ⊩ A → B for all k′ ⪰ k with lh(k′) = lh(k) + n.
We show

∀l⪰k. l ⊩ A ⟹ l ⊩ B.

Let l ⪰ k and l ⊩ A. We show that l ⊩ B. We apply the IH to B
and m := max(lh(k) + n, lh(l)). So assume l′ ⪰ l and lh(l′) = m. It is
sufficient to show l′ ⊩ B. If lh(l′) = lh(l), then l′ = l and we are done. If
lh(l′) = lh(k) + n > lh(l), then l′ is an extension of l as well as of k of
length lh(k) + n, and hence l′ ⊩ A → B by assumption. Moreover, l′ ⊩ A,
since l′ ⪰ l and l ⊩ A. It follows that l′ ⊩ B.

The cases A ∧ B and ∀xA are obvious.
2.3. Coincidence and Substitution. The coincidence and substitution
lemmas hold for Beth structures.

Lemma (Coincidence). Let ℬ be a Beth structure, t a term, A a formula
and η, ξ assignments in |ℬ|.
(a) If η(x) = ξ(x) for all x ∈ vars(t), then η(t) = ξ(t).
(b) If η(x) = ξ(x) for all x ∈ FV(A), then ℬ, k ⊩ A[η] ⇔ ℬ, k ⊩ A[ξ].

Proof. Induction on terms and formulas.

Lemma (Substitution). Let ℬ be a Beth structure, t, r terms, A a
formula and η an assignment in |ℬ|. Then
(a) η(r[x := t]) = η^{η(t)}_x(r).
(b) ℬ, k ⊩ A[x := t][η] ⇔ ℬ, k ⊩ A[η^{η(t)}_x].

Proof. Induction on terms and formulas.
2.4. Soundness. As usual, we proceed to prove the soundness theorem.

Theorem (Soundness). Let Γ be a set of formulas such that Γ ⊢ A.
Then, if ℬ is a Beth structure, k a node and η an assignment in |ℬ|, it follows
that ℬ, k ⊩ Γ[η] entails ℬ, k ⊩ A[η].

Proof. Induction on derivations.

We begin with the axiom schemes ∨⁺₀, ∨⁺₁, ∨⁻, ∃⁺ and ∃⁻. k ⊩ C[η] is
abbreviated k ⊩ C when η is known from the context.
Case ∨⁺₀: A → A ∨ B. We show k ⊩ A → A ∨ B. Assume for k′ ⪰ k
that k′ ⊩ A. Show: k′ ⊩ A ∨ B. This follows from the definition, since
k′ ⊩ A. The case ∨⁺₁: B → A ∨ B is symmetric.
Case ∨⁻: A ∨ B → (A → C) → (B → C) → C. We show that
k ⊩ A ∨ B → (A → C) → (B → C) → C. Assume for k′ ⪰ k that
k′ ⊩ A ∨ B, k′ ⊩ A → C and k′ ⊩ B → C (we can safely assume that k′ is
the same for all three premises). Show that k′ ⊩ C. By definition, there
is an n s.t. for all k″ ⪰ₙ k′, k″ ⊩ A or k″ ⊩ B. In both cases it follows
that k″ ⊩ C, since k′ ⊩ A → C and k′ ⊩ B → C. By the covering lemma,
k′ ⊩ C.
Case ∃⁺: A → ∃xA. Show that k ⊩ (A → ∃xA)[η]. Assume that k′ ⪰ k
and k′ ⊩ A[η]. Show that k′ ⊩ (∃xA)[η]. Since η = η^{η(x)}_x there is an a ∈ |ℬ|
(namely a := η(x)) such that k′ ⊩ A[η^a_x]. Hence k′ ⊩ (∃xA)[η].
Case ∃⁻. Similar to the case ∨⁻, again using the covering lemma.

Consider now a Beth structure over the complete binary tree, with the
atom P forced at appropriate nodes. (The figure showing the tree with its
P-labelled nodes is omitted.)

Then it is easily seen that ⟨⟩ ⊩ ¬¬P and ⟨⟩ ⊮ P.
Thus ⟨⟩ ⊮ ¬¬P → P and hence ⊬ ¬¬P → P. Since for each R and all k,
k ⊩ Efq_R, it also follows that ⊬ᵢ ¬¬P → P. The model also shows that the
Peirce formula ((P → Q) → P) → P is invalid in intuitionistic logic.
3. Completeness of Minimal and Intuitionistic Logic

Next we show the converse of the soundness theorem, for minimal as well
as intuitionistic logic.

3.1. Completeness of Minimal Logic.

Theorem (Completeness). Let Γ ∪ {A} be a set of formulas. Then the
following propositions are equivalent.
(a) Γ ⊢ A.
(b) Γ ⊩ A, i.e. for all Beth structures ℬ, nodes k and assignments η,
ℬ, k ⊩ Γ[η] ⟹ ℬ, k ⊩ A[η].

Proof. Soundness is one direction. For the other direction we employ a
technique developed by Harvey Friedman and construct a Beth structure ℬ
(over the set T01 of all finite 01-sequences k, ordered by the initial segment
relation k ⪯ k′) with the property that Γ ⊢ B is equivalent to ℬ, ⟨⟩ ⊩ B[id].
In order to define ℬ, we will need an enumeration A₀, A₁, A₂, … of L-
formulas, in which each formula occurs countably many times. We also fix
an enumeration x₀, x₁, … of variables. Let Γ = ⋃ₙ Γₙ be the union of finite
sets Γₙ such that Γₙ ⊆ Γₙ₊₁. With each node k ∈ T01, we associate a finite
set Δ_k of formulas by induction on the length of k.
Let Δ_⟨⟩ := ∅. Take a node k such that lh(k) = n and suppose that Δ_k is
already defined. Write Γ ⊢ₙ B to mean that there is a derivation of length
≤ n of B from Γ. We define Δ_{k0} and Δ_{k1} as follows:

Case 1. Γₙ, Δ_k ⊬ₙ Aₙ. Then let Δ_{k0} := Δ_k and Δ_{k1} := Δ_k ∪ {Aₙ}.

Case 2. Γₙ, Δ_k ⊢ₙ Aₙ = A′ₙ ∨ A″ₙ. Then let Δ_{k0} := Δ_k ∪ {Aₙ, A′ₙ}
and Δ_{k1} := Δ_k ∪ {Aₙ, A″ₙ}.

Case 3. Γₙ, Δ_k ⊢ₙ Aₙ = ∃xA′ₙ. Then let Δ_{k0} := Δ_{k1} := Δ_k ∪
{Aₙ, A′ₙ[x := x_i]}, where x_i is the first variable ∉ FV(Γₙ, Aₙ, Δ_k).

Case 4. Γₙ, Δ_k ⊢ₙ Aₙ, and Aₙ is neither a disjunction nor an existentially
quantified formula. Then let Δ_{k0} := Δ_{k1} := Δ_k ∪ {Aₙ}.
Obviously k ⪯ k′ implies that Δ_k ⊆ Δ_{k′}. We note that

(6) ∀k′⪰ₙk (Γ, Δ_{k′} ⊢ B) ⟹ Γ, Δ_k ⊢ B.

It is sufficient to show

Γ, Δ_{k0} ⊢ B and Γ, Δ_{k1} ⊢ B ⟹ Γ, Δ_k ⊢ B.

In cases 1 and 4, this is obvious. For cases 2 and 3, it follows immediately
from the axiom schemes ∨⁻ and ∃⁻.
Next, we show

(7) Γ, Δ_k ⊢ B ⟹ ∃n ∀k′⪰ₙk B ∈ Δ_{k′}.

We choose n ≥ lh(k) such that B = Aₙ and Γₙ, Δ_k ⊢ₙ Aₙ. For all k′ ⪰ k, if
lh(k′) = n + 1, then Aₙ ∈ Δ_{k′} (cf. the cases 2–4).
Using the sets Δ_k we can define an L-Beth structure ℬ as (Ter_L, I₀, I₁)
(where Ter_L denotes the set of terms of L), with the canonical I₀(f)t⃗ := f t⃗
and

t⃗ ∈ I₁(R, k) :⇔ R t⃗ ∈ Δ_k.

Obviously, t^ℬ[id] = t for all L-terms t.
We show that

(8) Γ, Δ_k ⊢ B ⇔ ℬ, k ⊩ B[id],

by induction on the complexity of B. For ℬ, k ⊩ B[id] we write k ⊩ B.

Case R t⃗.

Γ, Δ_k ⊢ R t⃗ ⇔ ∃n ∀k′⪰ₙk R t⃗ ∈ Δ_{k′}   by (7) and (6)
⇔ ∃n ∀k′⪰ₙk t⃗ ∈ I₁(R, k′)   by definition of ℬ
⇔ k ⊩ R t⃗   by definition of ⊩, since t^ℬ[id] = t.
Case B ∨ C. ⟹. Let Γ, Δ_k ⊢ B ∨ C. Choose an n ≥ lh(k) such that
Γₙ, Δ_k ⊢ₙ Aₙ = B ∨ C. Then, for all k′ ⪰ k s.t. lh(k′) = n it follows that

Δ_{k′0} = Δ_{k′} ∪ {B ∨ C, B} and Δ_{k′1} = Δ_{k′} ∪ {B ∨ C, C},

and by IH

k′0 ⊩ B and k′1 ⊩ C.

By definition, we have k ⊩ B ∨ C.

⟸.

k ⊩ B ∨ C ⟹ ∃n ∀k′⪰ₙk. k′ ⊩ B or k′ ⊩ C
⟹ ∃n ∀k′⪰ₙk. Γ, Δ_{k′} ⊢ B or Γ, Δ_{k′} ⊢ C   by IH
⟹ ∃n ∀k′⪰ₙk Γ, Δ_{k′} ⊢ B ∨ C
⟹ Γ, Δ_k ⊢ B ∨ C   by (6).
The case B ∧ C is evident.
Case B → C. ⟹. Let Γ, Δ_k ⊢ B → C. We must show k ⊩ B → C, i.e.

∀k′⪰k. k′ ⊩ B ⟹ k′ ⊩ C.

Let k′ ⪰ k be such that k′ ⊩ B. By IH, it follows that Γ, Δ_{k′} ⊢ B, and
Γ, Δ_{k′} ⊢ C follows by assumption. Then again by IH k′ ⊩ C.
⟸. Let k ⊩ B → C, i.e. ∀k′⪰k. k′ ⊩ B ⟹ k′ ⊩ C. We show that
Γ, Δ_k ⊢ B → C. At this point, we apply (6). Choose an n ≥ lh(k) such
that B = Aₙ. Let k′ ⪰ₘ k be such that m := n − lh(k). We show that
Γ, Δ_{k′} ⊢ B → C. If Γₙ, Δ_{k′} ⊢ₙ Aₙ, then k′ ⊩ B by IH, and k′ ⊩ C by
assumption, hence Γ, Δ_{k′} ⊢ C again by IH and thus Γ, Δ_{k′} ⊢ B → C.

If Γₙ, Δ_{k′} ⊬ₙ Aₙ, then by definition Δ_{k′1} = Δ_{k′} ∪ {B}, hence Γ, Δ_{k′1} ⊢ B,
and k′1 ⊩ B by IH. Now k′1 ⊩ C by assumption, and finally Γ, Δ_{k′1} ⊢ C by
IH. From Δ_{k′1} = Δ_{k′} ∪ {B} it follows that Γ, Δ_{k′} ⊢ B → C.
Case ∀xB. The following propositions are equivalent.

Γ, Δ_k ⊢ ∀xB
⇔ ∀t∈Ter_L Γ, Δ_k ⊢ B[x := t]
⇔ ∀t∈Ter_L k ⊩ B[x := t]   by IH
⇔ ∀t∈Ter_L k ⊩ B[id^t_x]   by the substitution lemma, since t^ℬ[id] = t
⇔ k ⊩ ∀xB   by definition of ⊩.
Case ∃xB. This case is similar to the case ∨. The proof proceeds as
follows. ⟹. Let Γ, Δ_k ⊢ ∃xB. Choose an n ≥ lh(k) such that Γₙ, Δ_k ⊢ₙ
Aₙ = ∃xB. Then, for all k′ ⪰ k such that lh(k′) = n it follows that

Δ_{k′0} = Δ_{k′1} = Δ_{k′} ∪ {∃xB, B[x := x_i]},

where x_i is not free in Δ_{k′} ∪ {∃xB}. Hence by IH

k′0 ⊩ B[x := x_i] and k′1 ⊩ B[x := x_i].

It follows by definition that k ⊩ ∃xB.
⟸.

k ⊩ ∃xB ⟹ ∃n ∀k′⪰ₙk ∃t∈Ter_L k′ ⊩ B[id^t_x]
⟹ ∃n ∀k′⪰ₙk ∃t∈Ter_L k′ ⊩ B[x := t]
⟹ ∃n ∀k′⪰ₙk ∃t∈Ter_L Γ, Δ_{k′} ⊢ B[x := t]   by IH
⟹ ∃n ∀k′⪰ₙk Γ, Δ_{k′} ⊢ ∃xB
⟹ Γ, Δ_k ⊢ ∃xB   by (6).
Now we are in a position to finalize the proof of the completeness
theorem. We apply (b) to the Beth structure ℬ constructed above from Γ, the
empty node ⟨⟩ and the assignment η = id. Then ℬ, ⟨⟩ ⊩ Γ[id] by (8), hence
ℬ, ⟨⟩ ⊩ A[id] by assumption and therefore Γ ⊢ A by (8) again.

3.2. Completeness of Intuitionistic Logic. Completeness of
intuitionistic logic follows as a corollary.

Corollary. Let Γ ∪ {A} be a set of formulas. The following propositions
are equivalent.
(a) Γ ⊢ᵢ A.
(b) Γ, Efq ⊩ A, i.e. for all Beth structures ℬ, nodes k and assignments η,
ℬ, k ⊩ Γ ∪ Efq [η] ⟹ ℬ, k ⊩ A[η].
4. Completeness of Classical Logic

We give a proof of completeness of classical logic relying on the
completeness proof for minimal logic above. Write Γ ⊨ A to mean that, for all
structures 𝒜 and assignments η,

𝒜 ⊨ Γ[η] ⟹ 𝒜 ⊨ A[η].

4.1. The Completeness Theorem.

Theorem (Completeness). Let Γ ∪ {A} be a set of formulas (in L). The
following propositions are equivalent.
(a) Γ ⊢_c A.
(b) Γ ⊨ A.

Proof. Soundness is one direction. For the other direction, we adapt
the completeness proof of minimal logic.

Evidently, it is sufficient to treat formulas without ∨, ∧ and ∃ (by
Lemma 2.4).
Let Γ ⊬_c A, i.e. Γ, Stab ⊬ A. By the completeness theorem of minimal
logic, there is a Beth structure ℬ = (Ter_L, I₀, I₁) on the complete binary
tree T01 and a node l₀ such that l₀ ⊩ Γ ∪ Stab and l₀ ⊮ A (we write k ⊩ B
for ℬ, k ⊩ B[id]).
A node k is consistent if k , , and stable if k Stab. Let k be a stable
node, and B a formula (without , and ). Then, Stab B B by
the stability lemma. Hence, k B B, and
k , B k , B
k
t
_k.k
t
consistent and k
t
B. (9)
Let α be a branch in the underlying tree T₀₁. We define
  α ⊩ A        :⟺  ∃k ∈ α:  k ⊩ A,
  α consistent :⟺  α ⊮ ⊥,
  α stable     :⟺  ∃k ∈ α:  k ⊩ Stab.
Note that
(10) from α ⊩ A and α ⊩ A → B it follows that α ⊩ B.
To see this, consider α ⊩ A. Then k ⊩ A for some k ∈ α. From α ⊩ A → B we have k′ ⊩ A → B for some k′ ∈ α; since α is linearly ordered and forcing is monotone, it follows that the larger of k, k′ forces B, i.e., α ⊩ B.
A branch α is generic (in the sense that it generates a classical model) if it is consistent and stable, if in addition for all formulas B
(11)  α ⊩ B or α ⊩ ¬B,
and for all formulas ∀y⃗B (with y⃗ not empty) where B is not universally quantified,
(12)  (∀s⃗ ∈ Ter:  α ⊩ B[y⃗ := s⃗]) ⟹ α ⊩ ∀y⃗B.
For a branch α we define a classical structure 𝒜_α = (Ter, I₀, I₁^α) by
  I₁^α(R) := ⋃_{k∈α} I₁(R, k)   for R ≠ ⊥.
We show that for every generic branch α and each formula B without ∨, ∧ and ∃,
(13)  α ⊩ B ⟺ 𝒜_α ⊨ B.
The proof is by induction on the logical complexity of B.
Case Rs⃗. By the definition of I₁^α we have α ⊩ Rs⃗ iff k ⊩ Rs⃗ for some k ∈ α, i.e., iff 𝒜_α ⊨ Rs⃗.
Case B → C. ⟹. Let α ⊩ B → C and 𝒜_α ⊨ B. Then α ⊩ B by IH, hence α ⊩ C by (10). We must show that 𝒜_α ⊨ C, which holds again by IH. ⟸. Let 𝒜_α ⊨ B → C. If 𝒜_α ⊨ B, then 𝒜_α ⊨ C, hence α ⊩ C by IH and therefore α ⊩ B → C. If 𝒜_α ⊭ B, then α ⊮ B by IH, hence α ⊩ ¬B by (11) and therefore again α ⊩ B → C.
Case ∀y⃗B (y⃗ not empty, B not universally quantified). ⟹. If α ⊩ ∀y⃗B, then α ⊩ B[y⃗ := s⃗] for all s⃗ ∈ Ter, hence 𝒜_α ⊨ B[y⃗ := s⃗] for all s⃗ by IH, and therefore 𝒜_α ⊨ ∀y⃗B. ⟸. If 𝒜_α ⊨ ∀y⃗B, then 𝒜_α ⊨ B[y⃗ := s⃗] for all s⃗, hence α ⊩ B[y⃗ := s⃗] for all s⃗ by IH, and therefore α ⊩ ∀y⃗B by (12).
We show that for each consistent stable node k there is a generic branch containing k. For the purposes of the proof, we let A₀, A₁, . . . be an enumeration of formulas. We define a sequence k = k₀ ⪯ k₁ ⪯ k₂ ⪯ · · · of consistent stable nodes inductively. Let k₀ := k. Assume that kₙ is defined. We write Aₙ in the form ∀y⃗B (y⃗ possibly empty), where B is not a universal formula. In case kₙ ⊩ ∀y⃗B let kₙ₊₁ := kₙ. Otherwise we have kₙ ⊮ B[y⃗ := s⃗] for some s⃗, and by (9) there is a consistent node k′ ⪰ kₙ such that k′ ⊩ ¬B[y⃗ := s⃗]. Let kₙ₊₁ := k′. Since kₙ ⪯ kₙ₊₁, the node kₙ₊₁ is stable.
Let α := { l | ∃n:  l ⪯ kₙ }; hence k ∈ α. We show that α is generic. Clearly α is consistent and stable. The propositions (11) and (12) can be proved simultaneously. Let C = ∀y⃗B, where B is not a universal formula, and choose n with C = Aₙ. In case kₙ ⊩ ∀y⃗B we are done. Otherwise we have kₙ ⊮ B[y⃗ := s⃗] for some s⃗, and by construction kₙ₊₁ ⊩ ¬B[y⃗ := s⃗]. For (11) we get kₙ₊₁ ⊩ ¬∀y⃗B (since ⊢ ∀y⃗B → B[y⃗ := s⃗]), and (12) follows from the consistency of α.
We are now in a position to give a proof of completeness. Since l₀ ⊮ A and l₀ is stable, (9) yields a consistent node k ⪰ l₀ such that k ⊩ ¬A. Evidently, k is stable as well. By the proof above, there is a generic branch α such that k ∈ α. Since k ⊩ ¬A it follows that α ⊩ ¬A, hence 𝒜_α ⊨ ¬A by (13). Moreover α ⊩ Γ, hence 𝒜_α ⊨ Γ by (13), contrary to our assumption that Γ ⊨ A.
Corollary (Löwenheim and Skolem). Let Γ be a set of L-formulas (we assume that L is countable). If Γ is satisfiable, then Γ is satisfiable on an L-structure with a countable carrier set.
Proof. We make use of the proof of the completeness theorem with A = ⊥. It either yields Γ ⊢_c ⊥ (which is excluded by assumption), or else a model of Γ whose carrier set is the countable set Ter.
5. Uncountable Languages
We give a second proof of the completeness theorem for classical logic, which works for uncountable languages as well. This proof makes use of the axiom of choice (in the form of Zorn's lemma).
5.1. Filters and Ultrafilters. Let M ≠ ∅ be a set. F ⊆ 𝒫(M) is called a filter on M if
(a) M ∈ F and ∅ ∉ F;
(b) if X ∈ F and X ⊆ Y ⊆ M, then Y ∈ F;
(c) X, Y ∈ F entails X ∩ Y ∈ F.
F is called an ultrafilter if for all X ∈ 𝒫(M)
  X ∈ F or M ∖ X ∈ F.
The intuition here is that the elements X of a filter F are considered to be "big". For instance, for M infinite the set F = { X ⊆ M | M ∖ X finite } is a filter.
Lemma. Suppose F is an ultrafilter and X ∪ Y ∈ F. Then X ∈ F or Y ∈ F.
Proof. If both X and Y are not in F, then M ∖ X and M ∖ Y are in F, hence also (M ∖ X) ∩ (M ∖ Y), which is M ∖ (X ∪ Y). This contradicts the assumption X ∪ Y ∈ F.
Let M ≠ ∅ be a set and S ⊆ 𝒫(M). S has the finite intersection property if X₁ ∩ · · · ∩ Xₙ ≠ ∅ for all X₁, . . . , Xₙ ∈ S and all n ∈ ℕ.
Lemma. If S has the finite intersection property, then there exists a filter F on M such that F ⊇ S.
Proof. F := { X | X ⊇ X₁ ∩ · · · ∩ Xₙ for some X₁, . . . , Xₙ ∈ S }.
Lemma. Let M ≠ ∅ be a set and F a filter on M. Then there is an ultrafilter U on M such that U ⊇ F.
Proof. By Zorn's lemma (which will be proved from the axiom of choice in Chapter 5), there is a maximal filter U with F ⊆ U. We claim that U is an ultrafilter. So let X ⊆ M and assume X ∉ U and M ∖ X ∉ U. Since U is maximal, U ∪ {X} cannot have the finite intersection property; hence there is a Y ∈ U such that Y ∩ X = ∅. Similarly we obtain Z ∈ U such that Z ∩ (M ∖ X) = ∅. But then Y ∩ Z = ∅, a contradiction.
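On a small finite carrier set the defining conditions (a)–(c), the filter generated from a family with the finite intersection property, and the lemma just proved can all be checked by brute force. The following Python sketch does this; the function names are our own, and since on a finite set every filter is principal, this only illustrates the definitions, not the interesting infinite case.

```python
from itertools import combinations

M = frozenset(range(4))

def subsets(s):
    """All subsets of s, as frozensets."""
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1)
            for c in combinations(s, r)]

def is_filter(F):
    """Conditions (a)-(c) of 5.1 for a family F of subsets of M."""
    if M not in F or frozenset() in F:
        return False
    if any(X <= Y and Y not in F for X in F for Y in subsets(M)):
        return False                       # not closed upwards
    return all(X & Y in F for X in F for Y in F)

def is_ultrafilter(F):
    return is_filter(F) and all(X in F or (M - X) in F for X in subsets(M))

def generated(S):
    """The filter generated by a family S with the finite intersection
    property, as in the proof of the second lemma above."""
    inters = {M}
    for X in S:
        inters |= {I & X for I in inters}
    return {Y for Y in subsets(M) if any(I <= Y for I in inters)}

G = generated([frozenset({0, 1})])                     # a filter, not ultra
U = generated([frozenset({0, 1}), frozenset({1, 2})])  # principal at 1: ultra
```

One can then confirm directly that U satisfies the union lemma: whenever X ∪ Y ∈ U, already X ∈ U or Y ∈ U.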
5.2. Products and Ultraproducts. Let M ≠ ∅ be a set and Aᵢ ≠ ∅ sets for i ∈ M. Let
  ∏_{i∈M} Aᵢ := { α | α is a function, dom(α) = M and α(i) ∈ Aᵢ for all i ∈ M }.
Observe that, by the axiom of choice, ∏_{i∈M} Aᵢ ≠ ∅. We write α ∈ ∏_{i∈M} Aᵢ as ⟨α(i) | i ∈ M⟩.
Now let M ≠ ∅ be a set, F a filter on M and 𝒜ᵢ structures for i ∈ M. Then the F-product structure 𝒜 = ∏^F_{i∈M} 𝒜ᵢ is defined by
(a) |𝒜| := ∏_{i∈M} |𝒜ᵢ| (notice that |𝒜| ≠ ∅);
(b) for an n-ary relation symbol R and α₁, . . . , αₙ ∈ |𝒜| let
  R^𝒜(α₁, . . . , αₙ)  :⟺  { i ∈ M | R^{𝒜ᵢ}(α₁(i), . . . , αₙ(i)) } ∈ F;
(c) for an n-ary function symbol f and α₁, . . . , αₙ ∈ |𝒜| let
  f^𝒜(α₁, . . . , αₙ) := ⟨ f^{𝒜ᵢ}(α₁(i), . . . , αₙ(i)) | i ∈ M ⟩.
For an ultrafilter U we call 𝒜 = ∏^U_{i∈M} 𝒜ᵢ the U-ultraproduct of the 𝒜ᵢ for i ∈ M.
5.3. The Fundamental Theorem on Ultraproducts. The properties of ultrafilters correspond in a certain sense to the definition of the consequence relation ⊨. For example, for an ultrafilter U we have
  𝒜 ⊨ (A ∨ B)[η] ⟺ 𝒜 ⊨ A[η] or 𝒜 ⊨ B[η],
  X ∪ Y ∈ U ⟺ X ∈ U or Y ∈ U,
and
  𝒜 ⊨ ¬A[η] ⟺ 𝒜 ⊭ A[η],
  X ∉ U ⟺ M ∖ X ∈ U.
This is the background of the following theorem.
Theorem (Fundamental Theorem on Ultraproducts, Łoś 1955). Let 𝒜 = ∏^U_{i∈M} 𝒜ᵢ be a U-ultraproduct, A a formula and η an assignment in |𝒜|. Then we have
  𝒜 ⊨ A[η] ⟺ { i ∈ M | 𝒜ᵢ ⊨ A[ηᵢ] } ∈ U,
where ηᵢ is the assignment induced by ηᵢ(x) = η(x)(i) for i ∈ M.
Proof. We first prove a similar property for terms.
(14)  t^𝒜[η] = ⟨ t^{𝒜ᵢ}[ηᵢ] | i ∈ M ⟩.
The proof is by induction on t. For a variable the claim follows from the definition. Case ft₁ . . . tₙ. For simplicity assume n = 1; so we consider ft. We obtain
  (ft)^𝒜[η] = f^𝒜(t^𝒜[η])
            = f^𝒜(⟨ t^{𝒜ᵢ}[ηᵢ] | i ∈ M ⟩)   by IH
            = ⟨ (ft)^{𝒜ᵢ}[ηᵢ] | i ∈ M ⟩.
Case Rt₁ . . . tₙ. For simplicity assume n = 1; so consider Rt. We obtain
  𝒜 ⊨ Rt[η] ⟺ R^𝒜(t^𝒜[η])
            ⟺ { i ∈ M | R^{𝒜ᵢ}(t^𝒜[η](i)) } ∈ U
            ⟺ { i ∈ M | R^{𝒜ᵢ}(t^{𝒜ᵢ}[ηᵢ]) } ∈ U   by (14)
            ⟺ { i ∈ M | 𝒜ᵢ ⊨ Rt[ηᵢ] } ∈ U.
Case A → B.
  𝒜 ⊨ (A → B)[η]
  ⟺ if 𝒜 ⊨ A[η], then 𝒜 ⊨ B[η]
  ⟺ if { i ∈ M | 𝒜ᵢ ⊨ A[ηᵢ] } ∈ U, then { i ∈ M | 𝒜ᵢ ⊨ B[ηᵢ] } ∈ U   by IH
  ⟺ { i ∈ M | 𝒜ᵢ ⊨ A[ηᵢ] } ∉ U or { i ∈ M | 𝒜ᵢ ⊨ B[ηᵢ] } ∈ U
  ⟺ { i ∈ M | 𝒜ᵢ ⊭ A[ηᵢ] } ∈ U or { i ∈ M | 𝒜ᵢ ⊨ B[ηᵢ] } ∈ U   for U is an ultrafilter
  ⟺ { i ∈ M | 𝒜ᵢ ⊨ (A → B)[ηᵢ] } ∈ U.
Case ∀xA.
  𝒜 ⊨ (∀xA)[η]
  ⟺ for all α ∈ |𝒜|, 𝒜 ⊨ A[η_x^α]
  ⟺ for all α ∈ |𝒜|, { i ∈ M | 𝒜ᵢ ⊨ A[(ηᵢ)_x^{α(i)}] } ∈ U   by IH
  ⟺ { i ∈ M | for all a ∈ |𝒜ᵢ|, 𝒜ᵢ ⊨ A[(ηᵢ)_x^a] } ∈ U       see below (15)
  ⟺ { i ∈ M | 𝒜ᵢ ⊨ (∀xA)[ηᵢ] } ∈ U.
It remains to show (15). Let
  X := { i ∈ M | for all a ∈ |𝒜ᵢ|, 𝒜ᵢ ⊨ A[(ηᵢ)_x^a] }
and
  Y_α := { i ∈ M | 𝒜ᵢ ⊨ A[(ηᵢ)_x^{α(i)}] }   for α ∈ |𝒜|.
⟸. Let α ∈ |𝒜| and X ∈ U. Clearly X ⊆ Y_α, hence also Y_α ∈ U.
⟹. Assume X ∉ U, i.e., M ∖ X ∈ U. Define α₀ ∈ |𝒜| by
  α₀(i) := some a ∈ |𝒜ᵢ| such that 𝒜ᵢ ⊭ A[(ηᵢ)_x^a],   if i ∈ M ∖ X,
  α₀(i) := an arbitrary element of |𝒜ᵢ|,                otherwise.
Then Y_{α₀} ∩ (M ∖ X) = ∅, contradicting Y_{α₀} ∈ U and M ∖ X ∈ U.
If we choose 𝒜ᵢ = ℬ constant, then 𝒜 = ∏^U_{i∈M} ℬ satisfies the same formulas as ℬ (such structures will be called elementarily equivalent in Section 6; the notation is 𝒜 ≡ ℬ). ∏^U_{i∈M} ℬ is called an ultrapower of ℬ.
5.4. General Compactness and Completeness.
Corollary (General Compactness Theorem). Every finitely satisfiable set Γ of formulas is satisfiable.
Proof. Let M := { i ⊆ Γ | i finite }. For i ∈ M let 𝒜ᵢ be a model of i under the assignment ηᵢ. For A ∈ Γ let Z_A := { i ∈ M | A ∈ i } = { i ⊆ Γ | i finite and A ∈ i }. Then F := { Z_A | A ∈ Γ } has the finite intersection property (for {A₁, . . . , Aₙ} ∈ Z_{A₁} ∩ · · · ∩ Z_{Aₙ}). By the lemmata in 5.1 there is an ultrafilter U on M such that U ⊇ F. We consider 𝒜 := ∏^U_{i∈M} 𝒜ᵢ and the product assignment η such that η(x)(i) := ηᵢ(x), and show 𝒜 ⊨ Γ[η]. So let A ∈ Γ. By the theorem it suffices to show X_A := { i ∈ M | 𝒜ᵢ ⊨ A[ηᵢ] } ∈ U. But this follows from Z_A ⊆ X_A and Z_A ∈ F ⊆ U.
An immediate consequence is that if Γ ⊨ A, then there exists a finite subset Γ′ ⊆ Γ such that Γ′ ⊨ A.
6. Basics of Model Theory
For every set Γ of formulas let L(Γ) be the set of all function and relation symbols occurring in Γ. If L is a sublanguage of L′, 𝒜 an L-structure and 𝒜′ an L′-structure, then 𝒜′ is called an expansion of 𝒜 (and 𝒜 a reduct of 𝒜′) if |𝒜| = |𝒜′|, f^𝒜 = f^{𝒜′} for all function symbols and R^𝒜 = R^{𝒜′} for all relation symbols in the language L. The (uniquely determined) L-reduct of 𝒜′ is denoted by 𝒜′↾L. If 𝒜′ is an expansion of 𝒜 and η an assignment in |𝒜|, then clearly t^𝒜[η] = t^{𝒜′}[η] for every L-term t.
As an application of the general compactness theorem we show that a set Γ of formulas all of whose finite subsets have models is satisfiable together with infinitely many inequalities. Let
  Γ′ := Γ ∪ { xᵢ ≠ xⱼ | i, j ∈ ℕ such that i < j }.
By assumption every finite subset of Γ′ is satisfiable, hence by the general compactness theorem so is Γ′. Then we have 𝒜 and η such that 𝒜 ⊨ Γ′[η], and therefore η(xᵢ) ≠^𝒜 η(xⱼ) for i < j. Hence 𝒜 is infinite.
6.3. Complete Theories, Elementary Equivalence. Let L̄ be the set of all closed L-formulas. By a theory T we mean an axiom system closed under ⊢_c, so Eq_{L(T)} ⊆ T and
  T = { A ∈ L̄(T) | T ⊢_c A }.
A theory T is called complete if for every formula A ∈ L̄(T), T ⊢_c A or T ⊢_c ¬A.
For every L-structure 𝒜 (satisfying the equality axioms) the set of all closed L-formulas A such that 𝒜 ⊨ A clearly is a theory; it is called the theory of 𝒜 and denoted by Th(𝒜).
Two L-structures 𝒜 and 𝒜′ are called elementarily equivalent (written 𝒜 ≡ 𝒜′) if Th(𝒜) = Th(𝒜′). Two L-structures 𝒜 and 𝒜′ are called isomorphic (written 𝒜 ≅ 𝒜′) if there is a map π: |𝒜| → |𝒜′| inducing a bijection between |𝒜/=^𝒜| and |𝒜′/=^{𝒜′}|, so
  ∀a, b ∈ |𝒜|.  a =^𝒜 b ⟺ π(a) =^{𝒜′} π(b),
  ∀a′ ∈ |𝒜′| ∃a ∈ |𝒜|.  π(a) =^{𝒜′} a′,
such that for all a₁, . . . , aₙ ∈ |𝒜|
  π(f^𝒜(a₁, . . . , aₙ)) =^{𝒜′} f^{𝒜′}(π(a₁), . . . , π(aₙ)),
  R^𝒜(a₁, . . . , aₙ) ⟺ R^{𝒜′}(π(a₁), . . . , π(aₙ))
for all n-ary function symbols f and relation symbols R of the language L.
We first collect some simple properties of the notions of the theory of a structure 𝒜 and of elementary equivalence.
Lemma. (a) Th(𝒜) is complete.
(b) If Γ is an axiom system such that L(Γ) ⊆ L, then
  { A ∈ L̄ | Γ ⊢_c A } = ⋂ { Th(𝒜) | 𝒜 ∈ Mod_L(Γ) }.
(c) 𝒜 ≡ 𝒜′ ⟺ 𝒜 ⊨ Th(𝒜′).
(d) If L is countable, then for every L-structure 𝒜 there is a countable L-structure 𝒜′ such that 𝒜 ≡ 𝒜′.
Proof. (a). Let 𝒜 be an L-structure and A ∈ L̄. Then 𝒜 ⊨ A or 𝒜 ⊨ ¬A, hence Th(𝒜) ⊢_c A or Th(𝒜) ⊢_c ¬A.
(b). For all A ∈ L̄ we have
  Γ ⊢_c A ⟺ Γ ⊨ A
          ⟺ for all L-structures 𝒜, (𝒜 ⊨ Γ ⟹ 𝒜 ⊨ A)
          ⟺ for all L-structures 𝒜, (𝒜 ∈ Mod_L(Γ) ⟹ A ∈ Th(𝒜))
          ⟺ A ∈ ⋂ { Th(𝒜) | 𝒜 ∈ Mod_L(Γ) }.
(c). ⟹. Assume 𝒜 ≡ 𝒜′ and A ∈ Th(𝒜′). Then 𝒜′ ⊨ A, hence 𝒜 ⊨ A.
⟸. Assume 𝒜 ⊨ Th(𝒜′). Then clearly Th(𝒜′) ⊆ Th(𝒜). For the converse inclusion let A ∈ Th(𝒜). If A ∉ Th(𝒜′), by (a) we would also have ¬A ∈ Th(𝒜′), hence 𝒜 ⊨ ¬A, contradicting A ∈ Th(𝒜).
(d). Let L be countable and 𝒜 an L-structure. Then Th(𝒜) is satisfiable and therefore, by the theorem of Löwenheim and Skolem, possesses a satisfying L-structure 𝒜′ with a countable carrier set Ter. By (c), 𝒜 ≡ 𝒜′.
Moreover, we can characterize complete theories as follows.
Theorem. Let T be a theory and L = L(T). Then the following are equivalent.
(a) T is complete.
(b) For every model 𝒜 ∈ Mod_L(T), Th(𝒜) = T.
(c) Any two models 𝒜, 𝒜′ ∈ Mod_L(T) are elementarily equivalent.
Proof. (a) ⟹ (b). Let T be complete and 𝒜 ∈ Mod_L(T). Then 𝒜 ⊨ T, hence T ⊆ Th(𝒜). For the converse assume A ∈ Th(𝒜). Then ¬A ∉ Th(𝒜), hence ¬A ∉ T and therefore A ∈ T.
(b) ⟹ (c) is clear.
(c) ⟹ (a). Let A ∈ L̄ and T ⊬_c ¬A. Then there is a model 𝒜₀ of T ∪ {A}. Now let 𝒜 ∈ Mod_L(T) be arbitrary. By (c) we have 𝒜 ≡ 𝒜₀, hence 𝒜 ⊨ A. Therefore T ⊢_c A.
6.4. Elementary Equivalence and Isomorphism.
Lemma. Let π be an isomorphism between 𝒜 and 𝒜′. Then for all terms t and formulas A and for every sufficiently big assignment η in |𝒜|
(a) π(t^𝒜[η]) =^{𝒜′} t^{𝒜′}[π ∘ η], and
(b) 𝒜 ⊨ A[η] ⟺ 𝒜′ ⊨ A[π ∘ η]. In particular,
  𝒜 ≅ 𝒜′ ⟹ 𝒜 ≡ 𝒜′.
Proof. (a). Induction on t. For simplicity we only consider the case of a unary function symbol.
  π(x^𝒜[η]) = π(η(x)) = x^{𝒜′}[π ∘ η],
  π(c^𝒜[η]) = π(c^𝒜) =^{𝒜′} c^{𝒜′},
  π((ft)^𝒜[η]) = π(f^𝒜(t^𝒜[η]))
              =^{𝒜′} f^{𝒜′}(π(t^𝒜[η]))
              =^{𝒜′} f^{𝒜′}(t^{𝒜′}[π ∘ η])
              = (ft)^{𝒜′}[π ∘ η].
(b). Induction on A. For simplicity we only consider the case of a unary relation symbol and the case ∀xA.
  𝒜 ⊨ Rt[η] ⟺ R^𝒜(t^𝒜[η])
            ⟺ R^{𝒜′}(π(t^𝒜[η]))
            ⟺ R^{𝒜′}(t^{𝒜′}[π ∘ η])
            ⟺ 𝒜′ ⊨ Rt[π ∘ η],
  𝒜 ⊨ ∀xA[η] ⟺ for all a ∈ |𝒜|, 𝒜 ⊨ A[η_x^a]
             ⟺ for all a ∈ |𝒜|, 𝒜′ ⊨ A[(π ∘ η)_x^{π(a)}]
             ⟺ for all a′ ∈ |𝒜′|, 𝒜′ ⊨ A[(π ∘ η)_x^{a′}]
             ⟺ 𝒜′ ⊨ ∀xA[π ∘ η].
This concludes the proof.
The converse, i.e. that 𝒜 ≡ 𝒜′ implies 𝒜 ≅ 𝒜′, is true for finite structures (see exercise sheet 9), but not for infinite ones:
Theorem. For every infinite structure 𝒜 there is an elementarily equivalent structure 𝒜₀ not isomorphic to 𝒜.
Proof. Let =^𝒜 be the equality on M := |𝒜|, and let 𝒫(M) denote the power set of M. For every Ξ ∈ 𝒫(M) choose a new constant c_Ξ. In the language L′ := L ∪ { c_Ξ | Ξ ∈ 𝒫(M) } consider
  Γ := Th(𝒜) ∪ { c_Ξ ≠ c_Θ | Ξ, Θ ∈ 𝒫(M) and Ξ ≠ Θ } ∪ Eq_{L′}.
Every finite subset of Γ is satisfiable by an appropriate expansion of 𝒜. Hence by the general compactness theorem also Γ is satisfiable, say by 𝒜′₀. Let 𝒜₀ := 𝒜′₀↾L. We may assume that =^{𝒜₀} is the equality on |𝒜₀|. 𝒜₀ is not isomorphic to 𝒜, for otherwise we would have an injection of 𝒫(M) into M and therefore a contradiction.
6.5. Non-Standard Models. By what we just proved it is impossible to characterize an infinite structure by a first order axiom system up to isomorphism. However, if we extend first order logic by also allowing quantification over sets X, we can formulate the following Peano axioms:
  ∀n. S(n) ≠ 0,
  ∀n, m. S(n) = S(m) → n = m,
  ∀X. 0 ∈ X → (∀n. n ∈ X → S(n) ∈ X) → ∀n. n ∈ X.
One can show easily that (ℕ, 0, S) is up to isomorphism the unique model of the Peano axioms. A structure which is elementarily equivalent, but not isomorphic, to 𝒩 := (ℕ, 0, S) is called a non-standard model of the natural numbers. In non-standard models of the natural numbers the principle of complete induction does not hold for all sets X ⊆ ℕ.
Similarly, a structure which is elementarily equivalent, but not isomorphic, to (ℝ, 0, 1, +, ·, <) is called a non-standard model of the reals. In every non-standard model of the reals the completeness axiom
  ∀X. ∅ ≠ X → X bounded → ∃y. y = sup(X)
does not hold for all sets X ⊆ ℝ.
Theorem. There are countable non-standard models of the natural numbers.
Proof. Let x be a variable and
  Γ := Th(𝒩) ∪ { x ≠ n̄ | n ∈ ℕ },
where 0̄ := 0 and n+1̄ := Sn̄. Clearly every finite subset of Γ is satisfiable, hence by compactness so is Γ. By the theorem of Löwenheim and Skolem we then have a countable or finite 𝒜 and an assignment η such that 𝒜 ⊨ Γ[η]. Because of 𝒜 ⊨ Th(𝒩) we have 𝒜 ≡ 𝒩 by 6.3; hence 𝒜 is countable. Moreover η(x) ≠^𝒜 n̄^𝒜 for all n ∈ ℕ, hence 𝒜 ≇ 𝒩.
6.6. Archimedean Ordered Fields. We now consider some easy applications to well-known axiom systems.
The axioms of field theory are (the equality axioms and)
  x + (y + z) = (x + y) + z,       x · (y · z) = (x · y) · z,
  0 + x = x,                       1 · x = x,
  (−x) + x = 0,                    x ≠ 0 → x⁻¹ · x = 1,
  x + y = y + x,                   x · y = y · x,
and also
  (x + y) · z = (x · z) + (y · z),
  1 ≠ 0.
Fields are the models of this axiom system.
In the theory of ordered fields one has in addition a binary relation symbol < and as axioms
  x ≮ x,
  x < y → y < z → x < z,
  x < y ∨_cl x = y ∨_cl y < x,
  x < y → x + z < y + z,
  0 < x → 0 < y → 0 < x · y.
Ordered fields are the models of this extended axiom system. An ordered field is called Archimedean ordered if for every element a of the field there is a natural number n such that a is less than the n-fold multiple of the 1 in the field.
Theorem. For every Archimedean ordered field there is an elementarily equivalent ordered field that is not Archimedean ordered.
Proof. Let 𝒜 be an Archimedean ordered field, x a variable and
  Γ := Th(𝒜) ∪ { n̄ < x | n ∈ ℕ },
where n̄ denotes the n-fold multiple of 1. Clearly every finite subset of Γ is satisfiable, hence by the general compactness theorem so is Γ. Therefore we have ℬ and η such that ℬ ⊨ Γ[η]. Because of ℬ ⊨ Th(𝒜) we obtain ℬ ≡ 𝒜, and hence ℬ is an ordered field. Moreover n̄^ℬ <^ℬ η(x) for all n ∈ ℕ, hence ℬ is not Archimedean ordered.
6.7. Axiomatizable Structures. A class 𝒮 of L-structures is called (finitely) axiomatizable if there is a (finite) axiom system Γ such that 𝒮 = Mod_L(Γ). Clearly 𝒮 is finitely axiomatizable iff 𝒮 = Mod_L(A) for some formula A. If for every 𝒜 ∈ 𝒮 there is an elementarily equivalent 𝒜′ ∉ 𝒮, then 𝒮 cannot possibly be axiomatizable. By the theorem above we can conclude that the class of Archimedean ordered fields is not axiomatizable. It also follows that the class of non-Archimedean ordered fields is not axiomatizable.
Lemma. Let 𝒮 be a class of L-structures and Γ an axiom system.
(a) 𝒮 is finitely axiomatizable iff 𝒮 and the complement of 𝒮 are axiomatizable.
(b) If Mod_L(Γ) is finitely axiomatizable, then there is a finite Γ₀ ⊆ Γ such that Mod_L(Γ₀) = Mod_L(Γ).
Proof. (a). Let ∁𝒮 denote the complement of 𝒮.
⟹. Let 𝒮 = Mod_L(A). Then 𝒜 ∈ ∁𝒮 ⟺ 𝒜 ⊨ ¬A, hence ∁𝒮 = Mod_L(¬A).
⟸. Let 𝒮 = Mod_L(Γ₁) and ∁𝒮 = Mod_L(Γ₂). Then Γ₁ ∪ Γ₂ is not satisfiable, hence there is a finite Δ ⊆ Γ₁ such that Δ ∪ Γ₂ is not satisfiable. One obtains
  𝒜 ∈ 𝒮 ⟹ 𝒜 ⊨ Δ ⟹ 𝒜 ⊭ Γ₂ ⟹ 𝒜 ∉ ∁𝒮 ⟹ 𝒜 ∈ 𝒮,
hence 𝒮 = Mod_L(Δ).
(b). Let Mod_L(Γ) = Mod_L(A). Then Γ ⊨ A, hence also Γ₀ ⊨ A for a finite Γ₀ ⊆ Γ. One obtains
  𝒜 ⊨ Γ ⟹ 𝒜 ⊨ Γ₀ ⟹ 𝒜 ⊨ A ⟹ 𝒜 ⊨ Γ,
hence Mod_L(Γ₀) = Mod_L(Γ).
6.8. Dense Linear Orders Without End Points. Finally we consider, as an example of a complete theory, the theory DO of dense linear orders without end points. The axioms are (the equality axioms and)
  x ≮ x,
  x < y → y < z → x < z,
  x < y ∨_cl x = y ∨_cl y < x,
  x < y → ∃_cl z. x < z ∧ z < y,
  ∃_cl y. x < y,
  ∃_cl y. y < x.
Lemma. Every countable model of DO is isomorphic to the structure (ℚ, <) of the rational numbers.
Proof. Let 𝒜 = (M, <^𝒜) be a countable model of DO; we can assume that =^𝒜 is the equality on M. Let M = { bₙ | n ∈ ℕ } and ℚ = { aₙ | n ∈ ℕ }, where we may assume aₙ ≠ aₘ and bₙ ≠ bₘ for n < m. We define recursively finite functions fₙ ⊆ ℚ × M as follows. Let f₀ := {(a₀, b₀)}. Assume we have already constructed fₙ.
Case n + 1 = 2m. Let j be minimal such that bⱼ ∉ ran(fₙ). Choose aᵢ ∉ dom(fₙ) such that for all a ∈ dom(fₙ) we have aᵢ < a ⟺ bⱼ <^𝒜 fₙ(a); such an aᵢ exists, since 𝒜 and (ℚ, <) are models of DO. Let fₙ₊₁ := fₙ ∪ {(aᵢ, bⱼ)}.
Case n + 1 = 2m + 1. This is treated similarly. Let i be minimal such that aᵢ ∉ dom(fₙ). Choose bⱼ ∉ ran(fₙ) such that for all a ∈ dom(fₙ) we have aᵢ < a ⟺ bⱼ <^𝒜 fₙ(a); such a bⱼ exists, since 𝒜 and (ℚ, <) are models of DO. Let fₙ₊₁ := fₙ ∪ {(aᵢ, bⱼ)}.
Then b₀, . . . , bₘ ∈ ran(f₂ₘ) and a₀, . . . , aₘ₊₁ ∈ dom(f₂ₘ₊₁) by construction, and f := ⋃ₙ fₙ is an isomorphism of (ℚ, <) onto 𝒜.
Theorem. The theory DO is complete, and DO = Th(ℚ, <).
Proof. Clearly (ℚ, <) is a model of DO. Hence by 6.3 it suffices to show that for every model 𝒜 of DO we have 𝒜 ≡ (ℚ, <). So let 𝒜 be a model of DO. By 6.3 there is a countable 𝒜′ such that 𝒜 ≡ 𝒜′. By the preceding lemma 𝒜′ ≅ (ℚ, <), hence 𝒜 ≡ 𝒜′ ≡ (ℚ, <).
A further example of a complete theory is the theory of algebraically closed fields. For a proof of this fact and for many more subjects of model theory we refer to the literature (e.g., the book of Chang and Keisler [6]).
7. Notes
The completeness theorem for classical logic was proved by Gödel [10] in 1930. He did it for countable languages; the general case was treated in 1936 by Malzew [17]. Löwenheim and Skolem proved their theorem even before the completeness theorem was discovered: Löwenheim in 1915 [16] and Skolem in 1920 [24].
Beth structures for intuitionistic logic were introduced by Beth in 1956 [1]; however, the completeness proofs given there were in need of correction. In 1959 Beth revised his paper in [2].
CHAPTER 3
Computability
In this chapter we develop the basics of recursive function theory, or as it is more generally known, computability theory. Its history goes back to the seminal works of Turing, Kleene and others in the 1930s.
A computable function is one defined by a program whose operational semantics tell an idealized computer what to do to its storage locations as it proceeds deterministically from input to output, without any prior restrictions on storage space or computation time. We shall be concerned with various program styles and the relationships between them, but the emphasis throughout will be on one underlying data type, namely the natural numbers, since it is there that the most basic foundational connections between proof theory and computation are to be seen in their clearest light.
The two best-known models of machine computation are the Turing Machine and the (Unlimited) Register Machine of Shepherdson and Sturgis [22]. We base our development on the latter since it affords the quickest route to the results we want to establish.
1. Register Machines
1.1. Programs. A register machine stores natural numbers in registers denoted u, v, w, x, y, z, possibly with subscripts, and it responds step by step to a program consisting of an ordered list of basic instructions:
  I₀
  I₁
  ⋮
  I_{k−1}
Each instruction has one of the following three forms, whose meanings are obvious:
  Zero:  x := 0
  Succ:  x := x + 1
  Jump:  if x = y then I_m else I_n.
The instructions are obeyed in order starting with I₀, except when a conditional jump instruction is encountered, in which case the next instruction will be either I_m or I_n according as the numerical contents of registers x and y are equal or not at that stage. The computation terminates when it runs out of instructions, that is, when the next instruction called for is I_k. Thus if a program of length k contains a jump instruction as above then it must satisfy the condition m, n ≤ k, and I_k means "halt". Notice of course that some programs do not terminate, for example the following one-liner:
  if x = x then I₀ else I₁
1.2. Program Constructs. We develop some shorthand for building up standard sorts of programs.
Transfer. x := y is the program
  x := 0
  if x = y then I₄ else I₂
  x := x + 1
  if x = x then I₁ else I₁
which copies the contents of register y into register x.
Predecessor. In a similar way one writes a program x := y ∸ 1 computing the predecessor, and from it a program computing the modified subtraction function x ∸ y.
Bounded Sum. If P(x₁, . . . , x_k, w; y) computes the (k+1)-ary function φ, then the program Q(x₁, . . . , x_k, z; x):
  x := 0 ;
  for i = 1, . . . , z do w := i ∸ 1 ; P(x₁, . . . , x_k, w; y) ; v := x ; Add(v, y; x) od
computes the function
  ψ(x₁, . . . , x_k, z) = Σ_{w<z} φ(x₁, . . . , x_k, w),
which will be undefined if for some w < z, φ(x₁, . . . , x_k, w) is undefined.
Bounded Product. Similarly, with multiplication in place of addition, one obtains a program computing
  Π_{w<z} φ(x₁, . . . , x_k, w).
Composition. If P_j(x₁, . . . , x_k; y_j) computes φ_j for each j = 1, . . . , m and if P₀(y₁, . . . , y_m; y₀) computes φ₀, then the program Q(x₁, . . . , x_k; y₀):
  P₁(x₁, . . . , x_k; y₁) ; . . . ; P_m(x₁, . . . , x_k; y_m) ; P₀(y₁, . . . , y_m; y₀)
computes the function
  ψ(x₁, . . . , x_k) = φ₀(φ₁(x₁, . . . , x_k), . . . , φ_m(x₁, . . . , x_k)),
which will be undefined if any of the subterms on the right hand side is undefined.
Unbounded Minimization. If P(x₁, . . . , x_k, y; z) computes φ, then the program Q(x₁, . . . , x_k; z):
  y := 0 ; z := 0 ; z := z + 1 ;
  while z ≠ 0 do P(x₁, . . . , x_k, y; z) ; y := y + 1 od ;
  z := y ∸ 1
computes the function
  ψ(x₁, . . . , x_k) = μy (φ(x₁, . . . , x_k, y) = 0),
that is, the least number y such that φ(x₁, . . . , x_k, y′) is defined for every y′ ≤ y and φ(x₁, . . . , x_k, y) = 0.
2. Elementary Functions
2.1. Definition and Simple Properties. The elementary functions of Kalmár (1943) are those number-theoretic functions which can be defined explicitly by compositional terms built up from variables and the constants 0, 1 by repeated applications of addition +, modified subtraction ∸, bounded sums and bounded products.
By omitting bounded products, one obtains the subelementary functions.
The examples in the previous section show that all elementary functions are computable and totally defined. Multiplication and exponentiation are elementary since
  m · n = Σ_{i<n} m   and   mⁿ = Π_{i<n} m,
and hence, by repeated composition, all exponential polynomials are elementary.
In addition the elementary functions are closed under
Definitions by Cases.
  f(n⃗) = g₀(n⃗)   if h(n⃗) = 0,
  f(n⃗) = g₁(n⃗)   otherwise,
since f can be defined from g₀, g₁ and h by
  f(n⃗) = g₀(n⃗) · (1 ∸ h(n⃗)) + g₁(n⃗) · (1 ∸ (1 ∸ h(n⃗))).
Bounded Minimization.
  f(n⃗, m) = μk<m (g(n⃗, k) = 0),
since f can be defined from g by
  f(n⃗, m) = Σ_{i<m} ( 1 ∸ Σ_{k≤i} (1 ∸ g(n⃗, k)) ).
Note: this definition gives value m if there is no k < m such that g(n⃗, k) = 0. It shows that not only the elementary, but in fact the subelementary functions are closed under bounded minimization. Furthermore, we define μk≤m (g(n⃗, k) = 0) as μk<m+1 (g(n⃗, k) = 0). Another notational convention will be that we shall often replace the brackets in μk<m (g(n⃗, k) = 0) by a dot, thus: μk<m. g(n⃗, k) = 0.
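The defining identity for bounded minimization can be checked directly. A small Python sketch (function names ours, with ∸ written as monus):

```python
def monus(a, b):
    """Modified subtraction a ∸ b."""
    return max(a - b, 0)

def bounded_mu(g, m):
    """mu k<m (g(k) = 0), computed via the subelementary identity
    sum_{i<m} (1 ∸ sum_{k<=i} (1 ∸ g(k)))."""
    return sum(monus(1, sum(monus(1, g(k)) for k in range(i + 1)))
               for i in range(m))
```

For example, bounded_mu(lambda k: monus(5, k * k), 10) returns 3, the least k with k² ≥ 5, and the value m comes out when g never vanishes below m.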
Lemma.
(a) For every elementary function f: ℕʳ → ℕ there is a number k such that for all n⃗ = n₁, . . . , n_r,
  f(n⃗) < 2_k(max(n⃗)),
where 2₀(m) = m and 2_{k+1}(m) = 2^{2_k(m)}.
(b) Hence the function n ↦ 2_n(1) is not elementary.
Proof. (a). By induction on the build-up of the compositional term defining f. The result clearly holds if f is any one of the base functions:
  f(n⃗) = 0 or 1 or nᵢ or nᵢ + nⱼ or nᵢ ∸ nⱼ.
If f is defined from g by application of bounded sum or product:
  f(n⃗, m) = Σ_{i<m} g(n⃗, i)   or   Π_{i<m} g(n⃗, i),
where g(n⃗, i) < 2_k(max(n⃗, i)), then we have
  f(n⃗, m) ≤ 2_k(max(n⃗, m))^m < 2_{k+2}(max(n⃗, m))
(using m^m < 2^{2^m}). If f is defined from g₀, g₁, . . . , g_l by composition:
  f(n⃗) = g₀(g₁(n⃗), . . . , g_l(n⃗)),
where for each j ≤ l we have g_j(·) < 2_{k_j}(max(·)), then with k = max_j k_j,
  f(n⃗) < 2_k(2_k(max(n⃗))) = 2_{2k}(max(n⃗)),
and this completes the first part.
(b). If 2_n(1) were an elementary function of n, then by (a) there would be a positive k such that for all n,
  2_n(1) < 2_k(n),
but then putting n = 2_k(1) yields 2_{2_k(1)}(1) < 2_{2k}(1), a contradiction.
2.2. Elementary Relations. A relation R on ℕᵏ is said to be elementary if its characteristic function
  c_R(n⃗) = 1 if R(n⃗), and 0 otherwise,
is elementary. In particular, the equality and less-than relations are elementary, since their characteristic functions can be defined as follows:
  c_<(m, n) = 1 ∸ (1 ∸ (n ∸ m)),   c_=(m, n) = 1 ∸ (c_<(m, n) + c_<(n, m)).
Furthermore if R is elementary then so is the function
  f(n⃗, m) = μk<m R(n⃗, k),
since R(n⃗, k) is equivalent to 1 ∸ c_R(n⃗, k) = 0.
Lemma. The elementary relations are closed under applications of propositional connectives and bounded quantifiers.
Proof. For example, the characteristic function of ¬R is
  1 ∸ c_R(n⃗).
The characteristic function of R₀ ∧ R₁ is
  c_{R₀}(n⃗) · c_{R₁}(n⃗).
The characteristic function of ∀i<m R(n⃗, i) is
  c_=(m, μi<m. c_R(n⃗, i) = 0).
Examples of elementary functions and relations are
  ⌊m/n⌋ = μk<m (m < (k + 1) · n),
  m mod n = m ∸ ⌊m/n⌋ · n,
  Prime(m) ⟺ 1 < m ∧ ∀n<m (1 < n → m mod n ≠ 0),
  pₙ = μm<2^{2ⁿ} ( Prime(m) ∧ n = Σ_{i<m} c_Prime(i) ),
so p₀, p₁, p₂, . . . gives the enumeration of the primes in increasing order. The estimate pₙ ≤ 2^{2ⁿ} for the n-th prime pₙ can be proved by induction on n: For n = 0 this is clear, and for n ≥ 1 we obtain
  pₙ ≤ p₀ · p₁ · · · p_{n−1} + 1 ≤ 2^{2⁰} · 2^{2¹} · · · 2^{2^{n−1}} + 1 = 2^{2ⁿ−1} + 1 < 2^{2ⁿ}.
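The bounded-search definition of pₙ can be run literally; the search stops long before the bound 2^{2ⁿ} is reached. A Python sketch (function names ours):

```python
def c_prime(m):
    """Characteristic function of Prime (trial division)."""
    return 1 if m > 1 and all(m % n != 0 for n in range(2, m)) else 0

def nth_prime(n):
    """p_n = mu m < 2^(2^n) (Prime(m) and n = sum_{i<m} c_Prime(i))."""
    for m in range(2 ** (2 ** n)):
        if c_prime(m) and sum(c_prime(i) for i in range(m)) == n:
            return m
    return 2 ** (2 ** n)     # value of the bounded mu when nothing is found
```

Running it for n = 0, . . . , 5 gives 2, 3, 5, 7, 11, 13, each well below the bound 2^{2ⁿ}.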
2.3. The Class ℰ.
Definition. The class ℰ consists of those number-theoretic functions which can be defined from the initial functions: constant 0, successor S, projections (onto the i-th coordinate), addition +, modified subtraction ∸, multiplication · and exponentiation 2ˣ, by applications of composition and bounded minimization.
The remarks above show immediately that the characteristic functions of the equality and less-than relations lie in ℰ, and that (by the proof of the lemma) the relations in ℰ are closed under propositional connectives and bounded quantifiers.
Furthermore the above examples show that all the functions in the class ℰ are elementary. We now prove the converse, which will be useful later.
Lemma. There are pairing functions π, π₁, π₂ in ℰ with the following properties:
(a) π maps ℕ × ℕ bijectively onto ℕ,
(b) π(a, b) < (a + b + 1)²,
(c) π₁(c), π₂(c) ≤ c,
(d) π(π₁(c), π₂(c)) = c,
(e) π₁(π(a, b)) = a,
(f) π₂(π(a, b)) = b.
Proof. Enumerate the pairs of natural numbers as follows:
   ⋮
  10
   6 ⋯
   3 7 ⋯
   1 4 8 ⋯
   0 2 5 9 ⋯
At position (0, b) we clearly have the sum of the lengths of the preceding diagonals, and on the next diagonal a + b remains constant. Let π(a, b) be the number written at position (a, b). Then we have
  π(a, b) = ( Σ_{i≤a+b} i ) + a = ½(a + b)(a + b + 1) + a.
Clearly π: ℕ × ℕ → ℕ is bijective. Moreover, a, b ≤ π(a, b), and in case π(a, b) ≠ 0 also a < π(a, b). Let
  π₁(c) := μx≤c ∃y≤c (π(x, y) = c),
  π₂(c) := μy≤c ∃x≤c (π(x, y) = c).
Then clearly πᵢ(c) ≤ c for i ∈ {1, 2}, and
  π₁(π(a, b)) = a,   π₂(π(a, b)) = b,   π(π₁(c), π₂(c)) = c.
π, π₁ and π₂ are elementary by definition.
Remark. The above proof shows that π, π₁ and π₂ are in fact subelementary.
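The pairing functions are easy to compute. In the Python sketch below (ours), pair follows the explicit formula for π, while the inverse uses an integer square root instead of the bounded searches defining π₁, π₂ in the proof; this is only an implementation choice, the values agree.

```python
from math import isqrt

def pair(a, b):
    """pi(a, b) = (a+b)(a+b+1)/2 + a, the diagonal enumeration."""
    return (a + b) * (a + b + 1) // 2 + a

def unpair(c):
    """(pi1(c), pi2(c)), inverting the formula for pi."""
    w = (isqrt(8 * c + 1) - 1) // 2    # w = a + b, the diagonal of c
    a = c - w * (w + 1) // 2
    return a, w - a
```

Properties (a)–(f) can then be checked on any initial segment of ℕ.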
Lemma (Gödel). There is in ℰ a function β with the following property: For every sequence a₀, . . . , a_{n−1} < b of numbers less than b we can find a number c ≤ 4 · 4^{n(b+n+1)⁴} such that β(c, i) = aᵢ for all i < n.
Proof. Let
  a := π(b, n)   and   d := Π_{i<n} ( 1 + π(aᵢ, i) · a! ).
From a! and d we can, for each given i < n, reconstruct the number aᵢ as the unique x < b such that
  1 + π(x, i) · a!  |  d.
For clearly aᵢ is such an x, and if some x < b were to satisfy the same condition, then because π(x, i) < a and the numbers 1 + k · a! are relatively prime for k ≤ a, we would have π(x, i) = π(a_j, j) for some j < n. Hence x = a_j and i = j, thus x = aᵢ.
We can now define the Gödel function β as
  β(c, i) := π₁( μy<c. (1 + π(π₁(y), i) · π₁(c)) · π₂(y) = π₂(c) ).
Clearly β is in ℰ. Furthermore with c := π(a!, d) we see that π(aᵢ, d/(1 + π(aᵢ, i) · a!)) is the unique such y, and therefore β(c, i) = aᵢ. It is then not difficult to estimate the given bound on c, using π(b, n) < (b + n + 1)².
Remark. The above definition of β shows that it is subelementary.
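Gödel's trick can be run on small examples. In the Python sketch below (helper names ours), encode follows the proof's construction c = π(a!, d), and beta recovers aᵢ from the divisibility condition; instead of searching for the least y directly, we search over the finitely many x < b and pick the candidate minimizing the corresponding y = π(x, d/(1 + π(x, i)·a!)), which selects the same witness.

```python
from math import factorial, isqrt

def pair(a, b):
    """Cantor pairing pi(a, b)."""
    return (a + b) * (a + b + 1) // 2 + a

def unpair(c):
    """Inverse of pair, via an integer square root."""
    w = (isqrt(8 * c + 1) - 1) // 2
    a = c - w * (w + 1) // 2
    return a, w - a

def encode(seq, b):
    """A number c with beta(c, i) = seq[i], following the proof:
    a = pair(b, n), d = prod_i (1 + pair(seq[i], i) * a!), c = pair(a!, d)."""
    assert all(x < b for x in seq)
    a = pair(b, len(seq))
    af = factorial(a)
    d = 1
    for i, ai in enumerate(seq):
        d *= 1 + pair(ai, i) * af
    return pair(af, d)

def beta(c, i):
    af, d = unpair(c)
    a, f = 0, 1
    while f < af:                    # recover a from a! = af
        a += 1
        f *= a
    b, _n = unpair(a)
    cands = [x for x in range(b) if d % (1 + pair(x, i) * af) == 0]
    # the least witness y = pair(x, d // divisor) singles out seq[i]
    return min(cands, key=lambda x: pair(x, d // (1 + pair(x, i) * af)))
```

For instance, with c = encode([3, 1, 4], 5) one gets beta(c, 0), beta(c, 1), beta(c, 2) equal to 3, 1, 4.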
2.4. Closure Properties of ℰ.
Theorem. The class ℰ is closed under limited recursion. Thus if g, h, k are given functions in ℰ and f is defined from them according to the scheme
  f(m⃗, 0) = g(m⃗),
  f(m⃗, n + 1) = h(n, f(m⃗, n), m⃗),
  f(m⃗, n) ≤ k(m⃗, n),
then f is in ℰ also.
Proof. Let f be defined from g, h and k in ℰ by limited recursion as above. Using Gödel's function β as in the last lemma, we can find for any given m⃗, n a number c such that β(c, i) = f(m⃗, i) for all i ≤ n. Let R(m⃗, n, c) be the relation
  β(c, 0) = g(m⃗) ∧ ∀i<n. β(c, i + 1) = h(i, β(c, i), m⃗),
and note by the remarks above that its characteristic function is in ℰ. It is clear, by induction, that if R(m⃗, n, c) holds then β(c, i) = f(m⃗, i) for all i ≤ n. Therefore we can define f explicitly by the equation
  f(m⃗, n) = β(μc R(m⃗, n, c), n).
f will lie in ℰ if μc can be bounded by an ℰ function. However, Lemma 2.3 gives a bound 4 · 4^{(n+1)(b+n+2)⁴}, where in this case b can be taken as the maximum of k(m⃗, i) for i ≤ n. But this can be defined in ℰ as k(m⃗, i₀), where i₀ = μi≤n. ∀j≤n. k(m⃗, j) ≤ k(m⃗, i). Hence μc can be bounded by an ℰ function.
Remark. Notice that it is in this proof only that the exponential function is required, in providing a bound for μc.
Corollary. ℰ is the class of all elementary functions.
Proof. It is sufficient merely to show that ℰ is closed under bounded sums and bounded products. Suppose, for instance, that f is defined from g in ℰ by bounded summation: f(m⃗, n) = Σ_{i<n} g(m⃗, i). Then f can be defined by limited recursion, as follows:
  f(m⃗, 0) = 0,
  f(m⃗, n + 1) = f(m⃗, n) + g(m⃗, n),
  f(m⃗, n) ≤ n · max_{i<n} g(m⃗, i),
and the functions (including the bound) from which it is defined are in ℰ. Thus f is in ℰ by the last lemma. If instead f is defined by bounded product, then proceed similarly.
2.5. Coding Finite Lists. Computation on lists is a practical necessity, so because we are basing everything here on the single data type ℕ we must develop some means of coding finite lists or sequences of natural numbers into ℕ itself. There are various ways to do this and we shall adopt one of the most traditional, based on the pairing functions π, π₁, π₂.
The empty sequence is coded by the number 0, and a sequence n₀, n₁, . . . , n_{k−1} is coded by the sequence number
  ⟨n₀, n₁, . . . , n_{k−1}⟩ = π′(. . . π′(π′(0, n₀), n₁), . . . , n_{k−1})
with π′(a, b) := π(a, b) + 1; thus recursively,
  ⟨⟩ := 0,
  ⟨n₀, n₁, . . . , n_k⟩ := π′(⟨n₀, n₁, . . . , n_{k−1}⟩, n_k).
Because of the surjectivity of π, every number a can be decoded uniquely as a sequence number a = ⟨n₀, n₁, …, n_{k−1}⟩. If a is greater than zero, hd(a) := π₂(a ∸ 1) is the head (i.e., rightmost element) and tl(a) := π₁(a ∸ 1) is the tail of the list. The kth iterate of tl is denoted tl^(k), and since tl(a) is less than or equal to a, tl^(k)(a) is elementarily definable (by limited recursion). Thus we can define elementarily the length and decoding functions:

    lh(a) := μk≤a. tl^(k)(a) = 0,
    (a)_i := hd(tl^(lh(a) ∸ (i+1))(a)).
Then if a = ⟨n₀, n₁, …, n_{k−1}⟩ it is easy to check that

    lh(a) = k and (a)_i = n_i for each i < k.

Furthermore (a)_i = 0 when i ≥ lh(a). We shall write (a)_{i,j} for ((a)_i)_j and (a)_{i,j,k} for (((a)_i)_j)_k. This elementary coding machinery will be used at various crucial points in the following.
Note that our previous remarks show that the functions lh and (a)_i are subelementary, and so is ⟨n₀, n₁, …, n_{k−1}⟩ for each fixed k.
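The coding machinery can be sketched concretely. The particular pairing π of Lemma 2.3 is not reproduced in this chunk, so the standard Cantor pairing below is a stand-in with the only properties used here (a bijection ℕ×ℕ → ℕ with elementary inverses π₁, π₂):

```python
# Sequence coding <n0, ..., n_{k-1}> built from pi'(a, b) = pi(a, b) + 1,
# with the Cantor pairing as a stand-in for the pi of Lemma 2.3.

def pi(a, b):                         # bijection N x N -> N
    return (a + b) * (a + b + 1) // 2 + a

def _diag(c):                         # largest s with s(s+1)/2 <= c
    s = 0
    while (s + 1) * (s + 2) // 2 <= c:
        s += 1
    return s

def pi1(c):                           # pi1(pi(a, b)) == a
    s = _diag(c)
    return c - s * (s + 1) // 2

def pi2(c):                           # pi2(pi(a, b)) == b
    s = _diag(c)
    return s - (c - s * (s + 1) // 2)

def seq(ns):                          # <>, then pi'-iteration
    a = 0
    for n in ns:
        a = pi(a, n) + 1
    return a

def hd(a): return pi2(a - 1)          # head = rightmost element
def tl(a): return pi1(a - 1)          # tail = code of the rest

def lh(a):                            # least k with tl^(k)(a) = 0
    k = 0
    while a != 0:
        a, k = tl(a), k + 1
    return k

def comp(a, i):                       # (a)_i = hd(tl^(lh(a)-(i+1))(a))
    for _ in range(lh(a) - (i + 1)):
        a = tl(a)
    return hd(a)
```

With these definitions the decoding identities of the text, lh(⟨n₀,…,n_{k−1}⟩) = k and (a)_i = n_i, can be checked directly.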
Concatenation of sequence numbers b ∗ a is defined thus:

    b ∗ ⟨⟩ := b,
    b ∗ ⟨n₀, n₁, …, n_k⟩ := π(b ∗ ⟨n₀, n₁, …, n_{k−1}⟩, n_k) + 1.
To check that this operation is also elementary, define h(b, a, i) by recursion on i as follows:

    h(b, a, 0) = b,
    h(b, a, i + 1) = π(h(b, a, i), (a)_i) + 1

and note that since π(h(b, a, i), (a)_i) < (h(b, a, i) + a)² it follows by induction on i that h(b, a, i) is less than or equal to (b + a + i)^(2^i). Thus h is definable by limited recursion from elementary functions and hence is itself elementary. Finally

    b ∗ a = h(b, a, lh(a)).
Lemma. The class ℰ is closed under limited course-of-values recursion. Thus if h, k are given functions in ℰ and f is defined from them according to the scheme

    f(m⃗, n) = h(n, ⟨f(m⃗, 0), …, f(m⃗, n − 1)⟩, m⃗)
    f(m⃗, n) ≤ k(m⃗, n)

then f is in ℰ also.

Proof. f̄(m⃗, n) := ⟨f(m⃗, 0), …, f(m⃗, n − 1)⟩ is definable by

    f̄(m⃗, 0) = 0,
    f̄(m⃗, n + 1) = f̄(m⃗, n) ∗ ⟨h(n, f̄(m⃗, n), m⃗)⟩,
    f̄(m⃗, n) ≤ (Σ_{i≤n} k(m⃗, i) + 1)^(2^n),

using ⟨n, …, n⟩ (with k entries) < (n + 1)^(2^k). Then f(m⃗, n) = h(n, f̄(m⃗, n), m⃗) lies in ℰ.
where

    r(e₀, e₁, i) = ⟨2, (e₁)_{i,1}, (e₁)_{i,2}, (e₁)_{i,3} + lh(e₀), (e₁)_{i,4} + lh(e₀)⟩  if (e₁)_{i,0} = 2,
    r(e₀, e₁, i) = (e₁)_i  otherwise

readdresses the jump instructions in P₁. Clearly r and hence comp are elementary functions.
Definition. Henceforth, φ_e^(r) denotes the partial function computed by the register machine program with program number e, operating on the input registers v₁, …, v_r and with output register v₀. There is no loss of generality here, since the variables in any program can always be renamed so that v₁, …, v_r become the input registers and v₀ the output. If e is not a program number, or it is but does not operate on the right variables, then we adopt the convention that φ_e^(r)(n₁, …, n_r) is undefined for all inputs n₁, …, n_r.
3.2. Normal Form.
Theorem (Kleene's Normal Form). For each arity r there is an elementary function U and an elementary relation T such that, for all e and all inputs n₁, …, n_r,

    φ_e^(r)(n₁, …, n_r) is defined ⇔ ∃s T(e, n₁, …, n_r, s)
    φ_e^(r)(n₁, …, n_r) = U(e, n₁, …, n_r, μs T(e, n₁, …, n_r, s)).
Proof. A computation of a register machine program P(v₁, …, v_r; v₀) on numerical inputs n⃗ = n₁, …, n_r proceeds deterministically, step by step, each step corresponding to the execution of one instruction. Let e be its program number, and let v₀, …, v_l be all the registers used by P, including the working registers, so r ≤ l.
The state of the computation at step s is defined to be the sequence number

    state(e, n⃗, s) = ⟨e, i, m₀, m₁, …, m_l⟩

where m₀, m₁, …, m_l are the values stored in the registers v₀, v₁, …, v_l after step s is completed, and the next instruction to be performed is the ith one; thus (e)_i is its instruction number.
The state-transition function tr : ℕ → ℕ computes the next state. So suppose that x = ⟨e, i, m₀, m₁, …, m_l⟩ is any putative state. Then in what follows, e = (x)₀, i = (x)₁, and m_j = (x)_{j+2} for each j ≤ l. The definition of tr(x) is therefore as follows:

    tr(x) = ⟨e, i′, m′₀, m′₁, …, m′_l⟩

where
• If (e)_i = ⟨0, j⟩ where j ≤ l then i′ = i + 1, m′_j = 0, and all other registers remain unchanged, i.e. m′_k = m_k for k ≠ j.
• If (e)_i = ⟨1, j⟩ where j ≤ l then i′ = i + 1, m′_j = m_j + 1, and all other registers remain unchanged.
• If (e)_i = ⟨2, j₀, j₁, i₀, i₁⟩ where j₀, j₁ ≤ l and i₀, i₁ ≤ lh(e), then i′ = i₀ or i′ = i₁ according as m_{j₀} = m_{j₁} or not, and all registers remain unchanged, i.e. m′_j = m_j for all j ≤ l.
• Otherwise, if x is not a sequence number, or if e is not a program number, or if it refers to a register v_k with l < k, or if lh(e) ≤ i, then tr(x) simply repeats the same state x, so i′ = i and m′_j = m_j for every j ≤ l.
Clearly tr is an elementary function, since it is defined by elementarily decidable cases, with (a great deal of) elementary decoding and recoding involved in each case.
Consequently, the state function state(e, n⃗, s) is also elementary, because it can be defined by iterating the transition function by limited recursion on s as follows:

    state(e, n⃗, 0) = ⟨e, 0, n₁, …, n_r, 0, …, 0⟩
    state(e, n⃗, s + 1) = tr(state(e, n⃗, s))
    state(e, n⃗, s) ≤ h(e, n⃗, s)

where for the bounding function h we can take

    h(e, n⃗, s) = ⟨e, e⟩ ∗ ⟨max(n⃗) + s, …, max(n⃗) + s⟩.

This is because the maximum value of any register at step s cannot be greater than max(n⃗) + s. Now this expression clearly is elementary, since ⟨m, …, m⟩ with i occurrences of m is definable by a limited recursion with bound (m + i)^(2^i), as is easily seen by induction on i.
Now recall that if program P has program number e then computation terminates when instruction I_{lh(e)} is encountered. Thus we can define the termination relation T(e, n⃗, s), meaning "computation terminates at step s", by

    T(e, n⃗, s) :⇔ (state(e, n⃗, s))₁ = lh(e).

Clearly T is elementary and

    φ_e^(r)(n⃗) is defined ⇔ ∃s T(e, n⃗, s).

The output on termination is the value of register v₀, so if we define the output function U(e, n⃗, s) by

    U(e, n⃗, s) = (state(e, n⃗, s))₂

then U is also elementary and

    φ_e^(r)(n⃗) = U(e, n⃗, μs T(e, n⃗, s)).

This completes the proof.
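The step-by-step simulation in this proof can be sketched in Python, with tuples standing in for sequence-number states; the instruction encoding follows the case distinction for tr (⟨0, j⟩ zero, ⟨1, j⟩ increment, ⟨2, j₀, j₁, i₀, i₁⟩ conditional jump), and the computation halts when the instruction counter reaches the program length:

```python
# Sketch of tr, T, U: Python tuples stand in for coded states.

def tr(prog, state):
    i, regs = state
    if i >= len(prog):                    # halted: repeat the same state
        return state
    ins = prog[i]
    regs = list(regs)
    if ins[0] == 0:                       # (0, j): zero register j
        regs[ins[1]] = 0
        return (i + 1, tuple(regs))
    if ins[0] == 1:                       # (1, j): increment register j
        regs[ins[1]] += 1
        return (i + 1, tuple(regs))
    _, j0, j1, i0, i1 = ins               # (2, j0, j1, i0, i1): jump
    return (i0 if regs[j0] == regs[j1] else i1, tuple(regs))

def run(prog, inputs, n_regs):
    # initial state (0, (0, n1, ..., nr, 0, ..., 0)); v0 is the output
    regs = (0,) + tuple(inputs) + (0,) * (n_regs - len(inputs) - 1)
    state = (0, regs)
    while state[0] < len(prog):           # T: counter has reached lh(e)
        state = tr(prog, state)
    return state[1][0]                    # U: contents of v0

# Sample program computing v0 := v1 + v2, with v3 as a loop counter:
prog = [
    (2, 3, 1, 4, 1),   # 0: if v3 = v1 goto 4 else goto 1
    (1, 0),            # 1: v0++
    (1, 3),            # 2: v3++
    (2, 3, 3, 0, 0),   # 3: goto 0  (v3 = v3 always holds)
    (0, 3),            # 4: v3 := 0
    (2, 3, 2, 9, 6),   # 5: if v3 = v2 goto 9 (halt) else goto 6
    (1, 0),            # 6: v0++
    (1, 3),            # 7: v3++
    (2, 3, 3, 5, 5),   # 8: goto 5
]
```

Running `run(prog, (3, 4), 4)` yields 7. This is only an executable illustration of the proof idea, not the arithmetized version: the real point of the proof is that tr, state, T and U are elementary on the coded states.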
3.3. Σ⁰₁-Definable Relations and Recursive Functions. A relation R of arity r is said to be Σ⁰₁-definable if there is an elementary relation E, say of arity r + l, such that for all n⃗ = n₁, …, n_r,

    R(n⃗) ⇔ ∃k₁ … ∃k_l E(n⃗, k₁, …, k_l).

A partial function φ is said to be Σ⁰₁-definable if its graph

    { (n⃗, m) | φ(n⃗) is defined and = m }

is Σ⁰₁-definable.
To say that a nonempty relation R is Σ⁰₁-definable is equivalent to saying that the set of all sequence numbers ⟨n⃗⟩ satisfying R can be enumerated (possibly with repetitions) by some elementary function f : ℕ → ℕ. Such relations are called elementarily enumerable. For choose any fixed sequence ⟨a₁, …, a_r⟩ satisfying R and define

    f(m) = ⟨(m)₁, …, (m)_r⟩  if E((m)₁, …, (m)_{r+l}),
    f(m) = ⟨a₁, …, a_r⟩  otherwise.

Conversely if R is elementarily enumerated by f then

    R(n⃗) ⇔ ∃m (f(m) = ⟨n⃗⟩)

is a Σ⁰₁-definition of R.
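The enumeration construction can be sketched for a concrete Σ⁰₁ relation, say R(n) ⇔ ∃k (n = k·k) with E(n, k) := (n = k·k). The Cantor pairing is used below as a stand-in for the text's elementary decoding of m into a pair, and 0 = 0·0 is the fixed witness used in the "otherwise" branch:

```python
# Enumerating the squares from their Sigma-0-1 definition.

def unpair(m):                       # inverse Cantor pairing: m -> (a, b)
    s = 0
    while (s + 1) * (s + 2) // 2 <= m:
        s += 1
    a = m - s * (s + 1) // 2
    return a, s - a

def E(n, k):                         # the elementary matrix of R
    return n == k * k

def f(m):                            # elementary enumeration of R
    n, k = unpair(m)
    return n if E(n, k) else 0       # 0 is the fixed element of R

# Conversely R(n) <=> exists m. f(m) = n: every value of f is a square,
# and the square k*k appears at the code of the pair (k*k, k).
```
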
The recursive functions are those (partial) functions which can be defined from the initial functions: constant 0, successor S, projections (onto the ith coordinate), addition +, modified subtraction ∸ and multiplication ·, by applications of composition and unbounded minimization μ. Note that it is through unbounded minimization that partial functions may arise.
Lemma. Every elementary function is recursive.
Proof. By simply removing the bounds on μ in the lemmas in 2.3 one obtains recursive definitions of the pairing functions π, π₁, π₂ and of Gödel's β-function. Then by removing all mention of bounds from the theorem in 2.4 one sees that the recursive functions are closed under (unlimited) primitive recursive definitions: f(m⃗, 0) = g(m⃗), f(m⃗, n + 1) = h(n, f(m⃗, n), m⃗). Thus one can recursively define bounded sums and bounded products, and hence all elementary functions.
3.4. Computable Functions.
Definition. The while-programs are those programs which can be built up from assignment statements x := 0, x := y, x := y + 1, x := y ∸ 1, by Conditionals, Composition, For-Loops and While-Loops as in the subsection on program constructs in Section 1.
Theorem. The following are equivalent:
(a) φ is register machine computable,
(b) φ is Σ⁰₁-definable,
(c) φ is recursive,
(d) φ is computable by a while-program.
Proof. The Normal Form Theorem shows immediately that every register machine computable function φ_e^(r) is Σ⁰₁-definable, since

    φ_e^(r)(n⃗) = m ⇔ ∃s. T(e, n⃗, s) ∧ U(e, n⃗, s) = m

and the relation T(e, n⃗, s) ∧ U(e, n⃗, s) = m is clearly elementary. If φ is Σ⁰₁-definable, say

    φ(n⃗) = m ⇔ ∃k₁ … ∃k_l E(n⃗, m, k₁, …, k_l),

then φ can be defined recursively by

    φ(n⃗) = (μm E(n⃗, (m)₀, (m)₁, …, (m)_l))₀,

using the fact (above) that elementary functions are recursive. The examples of computable functionals in Section 1 show how the definition of any recursive function translates automatically into a while-program. Finally, the subsection on program constructs in Section 1 shows how to implement any while-program on a register machine.
Henceforth "computable" means register machine computable or any of its equivalents.
Corollary. The function φ_e^(r)(n₁, …, n_r) is a computable partial function of the r + 1 variables e, n₁, …, n_r.
Proof. Immediate from the Normal Form.
Lemma. A relation R is computable if and only if both R and its complement ℕⁿ ∖ R are Σ⁰₁-definable.
Proof. We can assume that both R and ℕⁿ ∖ R are nonempty, and (for simplicity) also n = 1.
"⇒". By the theorem above every computable relation is Σ⁰₁-definable, and with R clearly its complement is computable.
"⇐". Let f, g ∈ ℰ enumerate R and ℕ ∖ R, respectively. Then

    h(n) := μi. (f(i) = n ∨ g(i) = n)

is a total recursive function, and R(n) ⇔ f(h(n)) = n.
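The "⇐" direction amounts to an interleaved search through the two enumerations, which the following sketch makes concrete on the toy instance R = even numbers, with f(i) = 2i enumerating R and g(i) = 2i + 1 its complement:

```python
# Deciding R from enumerations of R and its complement.

def f(i):                   # enumerates R = { even numbers }
    return 2 * i

def g(i):                   # enumerates the complement (odd numbers)
    return 2 * i + 1

def h(n):                   # mu i. f(i) = n or g(i) = n
    i = 0
    while f(i) != n and g(i) != n:
        i += 1
    return i                # total: one of the two searches must succeed

def in_R(n):                # R(n) <=> f(h(n)) = n
    return f(h(n)) == n
```

The search h is total precisely because every n appears in one of the two enumerations; this is where the hypothesis on the complement is used.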
3.5. Undecidability of the Halting Problem. The above corollary says that there is a single "universal" program which, given numbers e and n⃗, computes φ_e^(r)(n⃗) if it is defined. However we cannot decide in advance whether or not it will be defined. There is no program which, given e and n⃗, computes the total function

    h(e, n⃗) = 1 if φ_e^(r)(n⃗) is defined,
    h(e, n⃗) = 0 if φ_e^(r)(n⃗) is undefined.

For suppose there were such a program. Then the function

    ψ(n⃗) = μm (h(n₁, n⃗) = 0)

would be computable, say with fixed program number e₀, and therefore

    φ_{e₀}^(r)(n⃗) = 0 if h(n₁, n⃗) = 0,
    φ_{e₀}^(r)(n⃗) undefined if h(n₁, n⃗) = 1.

But then fixing n₁ = e₀ gives

    φ_{e₀}^(r)(n⃗) defined ⇔ h(e₀, n⃗) = 0 ⇔ φ_{e₀}^(r)(n⃗) undefined,

a contradiction. Hence the relation R(e, n⃗), which holds if and only if φ_e^(r)(n⃗) is defined, is not recursive. It is however Σ⁰₁-definable.
There are numerous attempts to classify total computable functions according to the complexity of their termination proofs.
4. Recursive Definitions
4.1. Least Fixed Points of Recursive Definitions. By a recursive definition of a partial function φ of arity r from given partial functions φ₁, …, φ_m of fixed but unspecified arities, we mean a defining equation of the form

    φ(n₁, …, n_r) = t(φ₁, …, φ_m, φ; n₁, …, n_r)

where t is any "compositional" term built up from the numerical variables n⃗ = n₁, …, n_r and the constant 0 by repeated applications of the successor and predecessor functions, the given functions φ₁, …, φ_m, the function φ itself, and the definition-by-cases function:

    dc(x, y, u, v) = u if x, y are both defined and equal,
    dc(x, y, u, v) = v if x, y are both defined and unequal,
    dc(x, y, u, v) undefined otherwise.
Our notion of recursive definition is essentially a reformulation of the Herbrand–Gödel–Kleene equation calculus; see Kleene [15].
There may be many partial functions φ satisfying such a recursive definition, but the one we wish to single out is the least defined one, i.e. the one whose defined values arise inevitably by lazy evaluation of the term t "from the outside in", making only those function calls which are absolutely necessary. This presupposes that each of the functions from which t is constructed already comes equipped with an evaluation strategy. In particular, if a subterm dc(t₁, t₂, t₃, t₄) is called then it is to be evaluated according to the program construct:

    x := t₁ ; y := t₂ ; if x = y then t₃ else t₄.

Some of the function calls demanded by the term t may be for further values of φ itself, and these must be evaluated by repeated unravellings of t (in other words by recursion).
This least solution φ will be referred to as the function defined by that recursive definition, or its least fixed point. Its existence and its computability are guaranteed by Kleene's Recursion Theorem below.
4.2. The Principles of Finite Support and Monotonicity, and the Effective Index Property. Suppose we are given any fixed partial functions φ₁, …, φ_m and φ, of the appropriate arities, and fixed inputs n⃗. If the term t = t(φ₁, …, φ_m, φ; n⃗) evaluates to a defined value k then the following principles are required to hold:
Finite Support Principle. Only finitely many values of φ₁, …, φ_m and φ are used in that evaluation of t.
Monotonicity Principle. The same value k will be obtained no matter how the partial functions φ₁, …, φ_m and φ are extended.
Note also that any such term t satisfies the
Effective Index Property. There is an elementary function f such that if φ₁, …, φ_m and φ are computable partial functions with program numbers e₁, …, e_m and e respectively, then according to the lazy evaluation strategy just described,

    t(φ₁, …, φ_m, φ; n⃗)

defines a computable function of n⃗ with program number f(e₁, …, e_m, e).
The proof of the Effective Index Property is by induction over the build-up of the term t. The base case is where t is just one of the constants 0, 1 or a variable n_j, in which case it defines either a constant function n⃗ ↦ 0 or n⃗ ↦ 1, or a projection function n⃗ ↦ n_j. Each of these is trivially computable with a fixed program number, and it is this program number we take as the value of f(e₁, …, e_m, e). Since in this case f is a constant function, it is clearly elementary. The induction step is where t is built up by applying one of the given functions: successor, predecessor, definition by cases, or φ (with or without a subscript) to previously constructed subterms t_i(φ₁, …, φ_m, φ; n⃗), i = 1 … l, thus:

    t = φ(t₁, …, t_l).

Inductively we can assume that for each i = 1 … l, t_i defines a partial function of n⃗ = n₁, …, n_r which is register machine computable by some program P_i with program number given by an already-constructed elementary function f_i = f_i(e₁, …, e_m, e). Therefore if φ is computed by a program Q with program number e, we can put P₁, …, P_l and Q together to construct a new program obeying the evaluation strategy for t. Furthermore, by the remark on index-constructions near the beginning of Section 3, we will be able to compute its program number f(e₁, …, e_m, e) from the given numbers f₁, …, f_l and e, by some elementary function.
4.3. Recursion Theorem.
Theorem (Kleene's Recursion Theorem). For given partial functions φ₁, …, φ_m, every recursive definition

    φ(n⃗) = t(φ₁, …, φ_m, φ; n⃗)

has a least fixed point, i.e. a least defined solution, φ. Moreover if φ₁, …, φ_m are computable, so is the least fixed point φ.
Proof. Let φ₁, …, φ_m be fixed partial functions of the appropriate arities. Let Φ be the functional from partial functions of arity r to partial functions of arity r defined by lazy evaluation of the term t as described above:

    Φ(φ)(n⃗) = t(φ₁, …, φ_m, φ; n⃗).

Let φ⁰, φ¹, φ², … be the sequence of partial functions of arity r generated by Φ thus: φ⁰ is the completely undefined function, and φ^{i+1} = Φ(φ^i) for each i. Then by induction on i, using the Monotonicity Principle above, we see that each φ^i is a subfunction of φ^{i+1}. That is, whenever φ^i(n⃗) is defined with a value k then φ^{i+1}(n⃗) is defined with that same value. Since their defined values are consistent with one another we can therefore construct the "union" φ of the φ^i's as follows:

    φ(n⃗) = k ⇔ ∃i (φ^i(n⃗) = k).
(i) This φ is then the required least fixed point of the recursive definition. To see that it is a fixed point, i.e. φ = Φ(φ), first suppose φ(n⃗) is defined with value k. Then by the definition of φ just given, there is an i > 0 such that φ^i(n⃗) is defined with value k. But φ^i = Φ(φ^{i−1}), so Φ(φ^{i−1})(n⃗) is defined with value k. Therefore by the Monotonicity Principle for Φ, since φ^{i−1} is a subfunction of φ, Φ(φ)(n⃗) is defined with value k. Hence φ is a subfunction of Φ(φ).
It remains to show the converse, that Φ(φ) is a subfunction of φ. So suppose Φ(φ)(n⃗) is defined with value k. Then by the Finite Support Principle, only finitely many defined values of φ are called for in this evaluation. By the definition of φ there must be some i such that φ^i already supplies all of these required values, and so already at stage i we have Φ(φ^i)(n⃗) = φ^{i+1}(n⃗) defined with value k. Since φ^{i+1} is a subfunction of φ it follows that φ(n⃗) is defined with value k. Hence Φ(φ) is a subfunction of φ.
To see that φ is the least such fixed point, suppose φ′ is any fixed point of Φ. Then Φ(φ′) = φ′, so by the Monotonicity Principle, since φ⁰ is a subfunction of φ′ it follows that Φ(φ⁰) = φ¹ is a subfunction of Φ(φ′) = φ′. Then again by Monotonicity, Φ(φ¹) = φ² is a subfunction of Φ(φ′) = φ′, et cetera, so that for each i, φ^i is a subfunction of φ′. Since φ is the union of the φ^i's it follows that φ itself is a subfunction of φ′. Hence φ is the least fixed point of Φ.
(ii) Finally we have to show that φ is computable if the given functions φ₁, …, φ_m are. For this we need the Effective Index Property of the term t, which supplies an elementary function f such that if φ is computable with program number e then Φ(φ) is computable with program number f(e) = f(e₁, …, e_m, e). Thus if u is any fixed program number for the completely undefined function of arity r, f(u) is a program number for φ¹ = Φ(φ⁰), f²(u) = f(f(u)) is a program number for φ² = Φ(φ¹), and in general f^i(u) is a program number for φ^i. Therefore in the notation of the Normal Form Theorem,

    φ^i(n⃗) = φ_{f^i(u)}^(r)(n⃗)

and by the second corollary to the Normal Form Theorem, this is a computable function of i and n⃗, since f^i(u) is a computable function of i, definable (informally) say by a for-loop of the form "for j = 1 … i do f od". Therefore by the earlier equivalences, φ^i(n⃗) is a Σ⁰₁-definable function of i and n⃗, and hence so is φ itself, because

    φ(n⃗) = m ⇔ ∃i (φ^i(n⃗) = m).

So φ is computable and this completes the proof.
Note. The above proof works equally well if φ is a vector-valued function; in other words if, instead of defining a single partial function φ, the recursive definition in fact defines a finite list of such functions simultaneously. For example, the individual components of the machine state of any register machine at step s are clearly defined by a simultaneous recursive definition, from zero and successor.
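The approximation sequence φ⁰, φ¹, … of the proof can be watched directly. In the sketch below, partial functions are modelled as Python dicts on a finite input window, and Φ performs one unravelling of the sample recursive definition φ(n) = dc(n, 0, 0, φ(n − 1) + n), whose least fixed point is φ(n) = 0 + 1 + ⋯ + n:

```python
# Iterating Phi from the completely undefined function.

N = 10                                # finite input window 0..N

def Phi(phi):                         # one unravelling of the term t
    new = {}
    for n in range(N + 1):
        if n == 0:
            new[n] = 0                # dc branch: n equal to 0
        elif n - 1 in phi:            # recursive call defined at this stage
            new[n] = phi[n - 1] + n
    return new

stages = [{}]                         # phi^0: completely undefined
for _ in range(N + 1):
    stages.append(Phi(stages[-1]))

phi = stages[-1]                      # the union (here: the last stage)
```

Each φ^i is a subfunction of φ^{i+1}, and φ^i is defined exactly on 0, …, i − 1: the defined values grow monotonically towards the least fixed point, just as in the proof.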
4.4. Recursive Programs and Partial Recursive Functions. A recursive program is a finite sequence of (possibly simultaneous) recursive definitions:

    φ₀(n₁, …, n_{r₀}) = t₀(φ₀; n₁, …, n_{r₀})
    φ₁(n₁, …, n_{r₁}) = t₁(φ₀, φ₁; n₁, …, n_{r₁})
    φ₂(n₁, …, n_{r₂}) = t₂(φ₀, φ₁, φ₂; n₁, …, n_{r₂})
    ⋮
    φ_k(n₁, …, n_{r_k}) = t_k(φ₀, …, φ_{k−1}, φ_k; n₁, …, n_{r_k}).
A partial function is said to be partial recursive if it is one of the functions φ defined by some recursive program as above. A partial recursive function which happens to be totally defined is called simply a recursive function.
Theorem. A function is partial recursive if and only if it is computable.
Proof. The Recursion Theorem tells us immediately that every partial recursive function is computable. For the converse we use the equivalence of computability with recursiveness already established in Section 3. Thus we need only show how to translate any recursive definition into a recursive program.
The constant-0 function is defined by the recursive program φ(n) = 0, and similarly for the constant-1 function.
The addition function φ(m, n) = m + n is defined by the recursive program

    φ(m, n) = dc(n, 0, m, φ(m, n ∸ 1) + 1)

and the subtraction function φ(m, n) = m ∸ n similarly, with ∸ 1 in place of + 1. Multiplication is defined recursively from addition in much the same way. Note that in each case the right hand side of the recursive definition is an allowed term.
The composition scheme is a recursive definition as it stands.
Finally, given a recursive program defining φ, if we add to it the recursive definition

    ψ(n⃗, m) = dc(φ(n⃗, m), 0, m, ψ(n⃗, m + 1))

followed by

    φ′(n⃗) = ψ(n⃗, 0)

then the computation of φ′(n⃗) proceeds as follows:

    φ′(n⃗) = ψ(n⃗, 0)
          = ψ(n⃗, 1) if φ(n⃗, 0) ≠ 0
          = ψ(n⃗, 2) if φ(n⃗, 1) ≠ 0
          ⋮
          = ψ(n⃗, m) if φ(n⃗, m − 1) ≠ 0
          = m if φ(n⃗, m) = 0.

Thus the recursive program for φ′ defines unbounded minimization: φ′(n⃗) = μm (φ(n⃗, m) = 0). This completes the proof.
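The final construction translates directly into code. Here dc takes its last two arguments lazily (as thunks), matching the evaluation strategy "x := t₁; y := t₂; if x = y then t₃ else t₄"; the sample φ satisfies φ(n, m) = 0 iff m·m ≥ n, so φ′(n) is the least m with m² ≥ n:

```python
# psi(n, m) = dc(phi(n, m), 0, m, psi(n, m + 1)) as a Python recursion.

def dc(x, y, u, v):                 # definition by cases, lazy in u and v
    return u() if x == y else v()

def phi(n, m):                      # sample: 0 iff m*m >= n
    return 0 if m * m >= n else 1

def psi(n, m):
    return dc(phi(n, m), 0, lambda: m, lambda: psi(n, m + 1))

def phi_prime(n):                   # phi'(n) = mu m. phi(n, m) = 0
    return psi(n, 0)
```

The laziness of dc is essential: evaluating ψ(n, m + 1) eagerly would loop forever even when the first branch applies, which is exactly why the text fixes an evaluation strategy for dc.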
CHAPTER 4
Gödel's Theorems
1. Gödel Numbers
1.1. Coding Terms and Formulas. We use the elementary sequence coding and decoding machinery developed earlier. Let L be a countable first-order language. Assume that we have injectively assigned to every n-ary relation symbol R a symbol number SN(R) of the form ⟨1, n, i⟩ and to every n-ary function symbol f a symbol number SN(f) of the form ⟨2, n, j⟩. Call L elementarily presented if the set Symb_L of all these symbol numbers is elementary. In what follows we shall always assume that the languages L considered are elementarily presented. In particular this applies to every language with finitely many relation and function symbols.
Assign numbers to the logical symbols by SN(→) := ⟨3, 1⟩, SN(∧) := ⟨3, 2⟩ and SN(∀) := ⟨3, 3⟩, and to the ith variable assign the symbol number ⟨0, i⟩.
For every L-term t we define recursively its Gödel number ⌜t⌝ by

    ⌜x⌝ := ⟨SN(x)⟩,
    ⌜c⌝ := ⟨SN(c)⟩,
    ⌜f t₁ … t_n⌝ := ⟨SN(f), ⌜t₁⌝, …, ⌜t_n⌝⟩.

Similarly we recursively define for every L-formula A its Gödel number ⌜A⌝ by

    ⌜R t₁ … t_n⌝ := ⟨SN(R), ⌜t₁⌝, …, ⌜t_n⌝⟩,
    ⌜A → B⌝ := ⟨SN(→), ⌜A⌝, ⌜B⌝⟩,
    ⌜A ∧ B⌝ := ⟨SN(∧), ⌜A⌝, ⌜B⌝⟩,
    ⌜∀x A⌝ := ⟨SN(∀), ⌜x⌝, ⌜A⌝⟩.
Let Var := { ⟨⟨0, i⟩⟩ | i ∈ ℕ }. Var clearly is elementary, and we have a ∈ Var if and only if a = ⌜x⌝ for a variable x. We define Ter ⊆ ℕ as follows, by course-of-values recursion:

    a ∈ Ter :⇔ a ∈ Var ∨
      ((a)₀ ∈ Symb_L ∧ (a)_{0,0} = 2 ∧ lh(a) = (a)_{0,1} + 1 ∧ ∀i (0 < i < lh(a) → (a)_i ∈ Ter)).

Ter is elementary, and it is easily seen that a ∈ Ter if and only if a = ⌜t⌝ for some term t. Similarly For ⊆ ℕ is defined by

    a ∈ For :⇔
      ((a)₀ ∈ Symb_L ∧ (a)_{0,0} = 1 ∧ lh(a) = (a)_{0,1} + 1 ∧ ∀i (0 < i < lh(a) → (a)_i ∈ Ter))
      ∨ (a = ⟨SN(→), (a)₁, (a)₂⟩ ∧ (a)₁ ∈ For ∧ (a)₂ ∈ For)
      ∨ (a = ⟨SN(∧), (a)₁, (a)₂⟩ ∧ (a)₁ ∈ For ∧ (a)₂ ∈ For)
      ∨ (a = ⟨SN(∀), (a)₁, (a)₂⟩ ∧ (a)₁ ∈ Var ∧ (a)₂ ∈ For).
Again For is elementary, and we have a ∈ For if and only if a = ⌜A⌝ for some formula A. For a set S of formulas let ⌜S⌝ := { ⌜A⌝ | A ∈ S }.
We could continue in this way and define Gödel numberings of various other syntactical notions, but we are mainly concerned that the reader believes that it can be done rather than sees all the (gory) details. In particular there are elementary functions msm and sub with the following properties:
• msm(⌜Γ⌝, ⌜Δ⌝) codes the result of deleting from the multiset Γ of formulas all the formulas which lie in Δ.
• sub(⌜t⌝, ⌜θ⌝) = ⌜tθ⌝ and sub(⌜A⌝, ⌜θ⌝) = ⌜Aθ⌝, where θ is a substitution (i.e. a finite assignment of terms to variables) and tθ and Aθ are the results of substituting those terms for those variables in t or A respectively.
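The Gödel numbering of terms and formulas can be prototyped with Python tuples standing in for sequence numbers (so the arithmetic coding layer is elided, but the recursive structure of Ter and For is exactly as above):

```python
# Symbol numbers as in the text: variables <0,i>, n-ary relations <1,n,i>,
# n-ary functions <2,n,j>, and -> , /\ , forall as <3,1>, <3,2>, <3,3>.

SN_IMP, SN_AND, SN_ALL = (3, 1), (3, 2), (3, 3)

def g_var(i):          return ((0, i),)
def g_fun(n, j, args): return ((2, n, j),) + tuple(args)
def g_rel(n, i, args): return ((1, n, i),) + tuple(args)
def g_imp(a, b):       return (SN_IMP, a, b)
def g_and(a, b):       return (SN_AND, a, b)
def g_all(x, a):       return (SN_ALL, x, a)

def is_var(a):
    return len(a) == 1 and a[0][0] == 0

def is_ter(a):                               # the Ter recursion
    if is_var(a):
        return True
    return a[0][0] == 2 and len(a) == a[0][1] + 1 and all(is_ter(t) for t in a[1:])

def is_for(a):                               # the For recursion
    if a[0][0] == 1:                         # atomic formula
        return len(a) == a[0][1] + 1 and all(is_ter(t) for t in a[1:])
    if a[0] in (SN_IMP, SN_AND):
        return is_for(a[1]) and is_for(a[2])
    if a[0] == SN_ALL:
        return is_var(a[1]) and is_for(a[2])
    return False

# Sample: forall x0 (R(x0) -> R(x0)), with R a unary relation symbol.
x = g_var(0)
A = g_all(x, g_imp(g_rel(1, 0, [x]), g_rel(1, 0, [x])))
```

This is only a structural sketch; in the text the codes are natural numbers via ⟨…⟩, and the recursions for Ter and For are course-of-values recursions on those numbers.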
1.2. Sequents. In our previous exposition of natural deduction one can find the assumptions free at a given node by inspecting the upper part of the proof tree. An alternative is to write the free assumptions next to each node, in the form of a multiset.
By a sequent Γ ⇒ A we mean a pair consisting of a multiset Γ = {A₁, …, A_n} of formulas and a single formula A. We define ⊢_m Γ ⇒ A inductively by the following rules. An assumption can be introduced by

    Γ ⇒ A  if A is in Γ.

For conjunction we have an introduction rule ∧I and two elimination rules ∧E_l and ∧E_r:

    Γ ⇒ A   Δ ⇒ B              Γ ⇒ A ∧ B             Γ ⇒ A ∧ B
    --------------- ∧I          ---------- ∧E_r       ---------- ∧E_l
    Γ, Δ ⇒ A ∧ B                Γ ⇒ A                 Γ ⇒ B

Here Γ, Δ denotes multiset union. For implication we have an introduction rule →I (not mentioning an assumption variable u) and an elimination rule →E:

    Γ ⇒ B                       Γ ⇒ A → B   Δ ⇒ A
    ----------- →I               ------------------- →E
    Γ′ ⇒ A → B                   Γ, Δ ⇒ B

In →I the multiset Γ′ is obtained from Γ by cancelling some occurrences of A. For the universal quantifier we have an introduction rule ∀I and an elimination rule ∀E (formulated without the term t to be substituted as additional premise):

    Γ ⇒ A                       Γ ⇒ ∀x A
    --------- ∀I                 ------------- ∀E
    Γ ⇒ ∀x A                     Γ ⇒ A[x := t]

In ∀I the variable condition needs to hold: for all B in Γ we must have x ∉ FV(B).
Lemma. (a) If ⊢_m A₁, …, A_n ⇒ A, then for all (not necessarily distinct) u₁, …, u_n such that u_i = u_j → A_i = A_j we can find a derivation term M^A[u₁^{A₁}, …, u_n^{A_n}].
(b) For every derivation term M^A[u₁^{A₁}, …, u_n^{A_n}] one can find multiplicities k₁, …, k_n ≥ 0 such that ⊢_m A₁^{k₁}, …, A_n^{k_n} ⇒ A; here A^k means a k-fold occurrence of A.
Proof. (a). Assume ⊢_m A₁, …, A_n ⇒ A. We use induction on ⊢_m.
Case assumption. Let A = A_i. Take M = u_i.
Case →I. We may assume

    A₁, …, A_n, A, …, A ⇒ B
    ------------------------ →I
    A₁, …, A_n ⇒ A → B

Let u₁, …, u_n be given such that u_i = u_j → A_i = A_j. Pick a new assumption variable u. By IH there exists an M^B such that FA(M^B) ⊆ {u₁^{A₁}, …, u_n^{A_n}, u^A}. Then (λu^A M^B)^{A→B} is a derivation term with free assumptions among u₁^{A₁}, …, u_n^{A_n}.
Case →E. Assume we have derivations of

    A₁, …, A_n ⇒ A → B  and  A_{n+1}, …, A_{n+m} ⇒ A.

Let u₁, …, u_{n+m} be given such that u_i = u_j → A_i = A_j. By IH we have derivation terms

    M^{A→B}[u₁^{A₁}, …, u_n^{A_n}]  and  N^A[u_{n+1}^{A_{n+1}}, …, u_{n+m}^{A_{n+m}}].

But then also

    (MN)^B[u₁^{A₁}, …, u_{n+m}^{A_{n+m}}]

is a derivation term.
The other cases are treated similarly.
(b). Let a derivation term M^A[u₁^{A₁}, …, u_n^{A_n}] be given. We use induction on M.
Case u^A. Then ⊢_m A ⇒ A.
Case →I, so (λu^A M^B)^{A→B}. Let FA(M^B) ⊆ {u₁^{A₁}, …, u_n^{A_n}, u^A} with u₁, …, u_n, u distinct. By IH we have ⊢_m A₁^{k₁}, …, A_n^{k_n}, A^k ⇒ B. Using the rule →I we obtain ⊢_m A₁^{k₁}, …, A_n^{k_n} ⇒ A → B.
Case →E. We are given (M^{A→B} N^A)^B[u₁^{A₁}, …, u_n^{A_n}]. By IH we have

    ⊢_m A₁^{k₁}, …, A_n^{k_n} ⇒ A → B  and  ⊢_m A₁^{l₁}, …, A_n^{l_n} ⇒ A.

Using the rule →E we obtain

    ⊢_m A₁^{k₁+l₁}, …, A_n^{k_n+l_n} ⇒ B.

The other cases are treated similarly.
1.3. Coding Derivations. We can now define the set of Gödel numbers of formal proofs (in the above formulation of minimal logic), as follows.

    Deriv(d) :⇔ ∀i<lh(d).
      (∀m<lh((d)_{i,0}). For((d)_{i,0,m}) ∧ ∃n<lh((d)_{i,0}). (d)_{i,1} = (d)_{i,0,n})   (A)
      ∨ (∃j,k<i. (d)_{i,1} = ⟨SN(∧), (d)_{j,1}, (d)_{k,1}⟩ ∧ (d)_{i,0} =ₘ (d)_{j,0} ∗ (d)_{k,0})   (∧I)
      ∨ (∃j<i. (d)_{j,1} = ⟨SN(∧), (d)_{i,1}, (d)_{j,1,2}⟩ ∧ (d)_{i,0} =ₘ (d)_{j,0})   (∧E_r)
      ∨ (∃j<i. (d)_{j,1} = ⟨SN(∧), (d)_{j,1,1}, (d)_{i,1}⟩ ∧ (d)_{i,0} =ₘ (d)_{j,0})   (∧E_l)
      ∨ (∃j<i. (d)_{i,1} = ⟨SN(→), (d)_{i,1,1}, (d)_{j,1}⟩ ∧ For((d)_{i,1,1})   (→I)
           ∧ msm((d)_{i,0}, (d)_{j,0}) = 0
           ∧ ∀n<lh(msm((d)_{j,0}, (d)_{i,0})). (msm((d)_{j,0}, (d)_{i,0}))_n = (d)_{i,1,1})
      ∨ (∃j,k<i. (d)_{j,1} = ⟨SN(→), (d)_{k,1}, (d)_{i,1}⟩ ∧ (d)_{i,0} =ₘ (d)_{j,0} ∗ (d)_{k,0})   (→E)
      ∨ (∃j<i. (d)_{i,1} = ⟨SN(∀), (d)_{i,1,1}, (d)_{j,1}⟩ ∧ Var((d)_{i,1,1})   (∀I)
           ∧ (d)_{i,0} =ₘ (d)_{j,0} ∧ ∀n<lh((d)_{i,0}). ¬FV((d)_{i,1,1}, (d)_{i,0,n}))
      ∨ (∃j<i. (d)_{j,1,0} = SN(∀) ∧ (d)_{i,0} =ₘ (d)_{j,0}   (∀E)
           ∧ ((d)_{i,1} = (d)_{j,1,2} ∨ ∃n<(d)_{i,1}. (Ter(n) ∧ (d)_{i,1} = sub((d)_{j,1,2}, ⟨⟨(d)_{j,1,1}, n⟩⟩)))),

where =ₘ denotes equality of the coded multisets.
Note that (1) this clearly defines an elementary set, and (2) if one carefully reads the definition, then it becomes clear that d is in Deriv if and only if d codes a sequence of pairs (sequents) Γ_i ⇒ A_i with Γ_i a multiset, such that this sequence constitutes a derivation in minimal logic, i.e. each sequent is either an axiom or else follows from previous sequents by a rule. Thus:
Lemma.
(a) Deriv(d) if and only if d is the Gödel number of a derivation.
(b) Deriv is elementary.
1.4. Axiomatizable Theories. A set S of formulas is called recursive (elementary, Σ⁰₁-definable) if ⌜S⌝ := { ⌜A⌝ | A ∈ S } is recursive (elementary, Σ⁰₁-definable). Clearly the sets Stabax_L of stability axioms and Eq_L of L-equality axioms are elementary.
Now let L be an elementarily presented language with = in L. A theory T with L(T) ⊆ L is called recursively (elementarily) axiomatizable if there is a recursive (elementary) set S of closed L-formulas such that T = { A ∈ L | S ∪ Eq_L ⊢_c A }.
Theorem. For theories T with L(T) ⊆ L the following are equivalent:
(a) T is recursively axiomatizable.
(b) T is elementarily axiomatizable.
(c) ⌜T⌝ is Σ⁰₁-definable.
Proof. (c) ⇒ (b). Let ⌜T⌝ be Σ⁰₁-definable. Then by Section 2.5 of Chapter 3 there exists an f ∈ ℰ such that ⌜T⌝ = ran(f), and by the argument there we can assume f(n) ≤ n for all n. Let f(n) = ⌜A_n⌝. We define an elementary function g with the property g(n) = ⌜A₀ ∧ ⋯ ∧ A_n⌝ by

    g(0) := f(0),
    g(n + 1) := g(n) ∧̇ f(n + 1),

where a ∧̇ b := ⟨SN(∧), a, b⟩. Clearly g can be bounded in ℰ. For S := { A₀ ∧ ⋯ ∧ A_n | n ∈ ℕ } we have ⌜S⌝ = ran(g), and this set is elementary because of a ∈ ran(g) ⇔ ∃n<a (a = g(n)). T is elementarily axiomatizable, since T = { A ∈ L | S ∪ Eq_L ⊢_c A }.
(b) ⇒ (a) is clear.
(a) ⇒ (c). Let T be axiomatized by S with ⌜S⌝ recursive. Then

    a ∈ ⌜T⌝ ⇔ ∃d ∃c<d. Deriv(d) ∧ (d)_{lh(d) ∸ 1} = ⟨c, a⟩ ∧ ∀i<lh(c). ((c)_i ∈ ⌜Stabax⌝ ∪ ⌜Eq⌝ ∪ ⌜S⌝).

Hence ⌜T⌝ is Σ⁰₁-definable.
A theory T in our elementarily presented language L is called axiomatized if it is given by a Σ⁰₁-definable axiom system Ax_T. By the theorem just proved we can even assume that ⌜Ax_T⌝ is elementary. For such axiomatized theories we define Prf_T ⊆ ℕ × ℕ by

    Prf_T(d, a) :⇔ Deriv(d) ∧ ∃c<d. ((d)_{lh(d) ∸ 1} = ⟨c, a⟩ ∧ ∀i<lh(c). ((c)_i ∈ ⌜Stabax⌝ ∪ ⌜Eq⌝ ∪ ⌜Ax_T⌝)).

Clearly Prf_T is elementary, and Prf_T(d, a) if and only if d is a derivation of a sequent Γ ⇒ A with Γ composed from stability axioms, equality axioms and formulas from Ax_T, and a = ⌜A⌝.
A theory T is called consistent if there is a closed formula A such that A ∉ T; otherwise T is called inconsistent.
Corollary. Every axiomatized complete theory T is recursive.
Proof. If T is inconsistent, then T is recursive. If not, then from the completeness of T we obtain

    a ∈ ℕ ∖ ⌜T⌝ ⇔ a ∉ For ∨ ∃b<a. FV(b, a) ∨ ¬̇a ∈ ⌜T⌝,

where ¬̇a := a →̇ ⌜⊥⌝ and a →̇ b := ⟨SN(→), a, b⟩. Hence with ⌜T⌝ also ℕ ∖ ⌜T⌝ is Σ⁰₁-definable and therefore ⌜T⌝ is recursive.
2. Undefinability of the Notion of Truth
Recall the convention in 1.2 of Chapter 1: once a formula has been introduced as A(x), i.e., A with a designated variable x, we write A(t) for A[x := t], and similarly with more variables.
2.1. Definable Relations. Let M be an L-structure. A relation R ⊆ |M|ⁿ is called definable in M if there is an L-formula A(x₁, …, x_n) with only the free variables shown such that

    R = { (a₁, …, a_n) ∈ |M|ⁿ | M ⊨ A[a₁, …, a_n] }.

We assume in this section that |M| = ℕ, 0 is a constant in L, and S is a unary function symbol in L with 0^M = 0 and S^M(a) = a + 1. Then for every a ∈ ℕ we can define the numeral ā ∈ Ter_L by 0̄ := 0 and (a + 1)‾ := S(ā). Observe that in this case the definability of R ⊆ ℕⁿ by A(x₁, …, x_n) is equivalent to

    R = { (a₁, …, a_n) ∈ ℕⁿ | M ⊨ A(ā₁, …, ā_n) }.

Furthermore let L be an elementarily presented language. We shall always assume in this section that every elementary relation is definable in M. A set S of formulas is called definable in M if ⌜S⌝ := { ⌜A⌝ | A ∈ S } is definable in M.
We shall show that already from these assumptions it follows that the notion of truth for M, more precisely the set Th(M) of all closed formulas valid in M, is undefinable in M. From this it will follow in turn that the notion of truth is in fact undecidable, for otherwise the set ⌜Th(M)⌝ would be recursive (by Church's Thesis), hence Σ⁰₁-definable, and hence definable, because we have assumed already that all elementary relations are definable in M.
78 4. G
ODELS THEOREMS
2.2. Fixed Points. For the proof we shall need the following lemma, which will be generalized in the next section.

Lemma (Semantical Fixed Point Lemma). If every elementary relation is definable in M, then for every L-formula B(z) with only z free we can find a closed L-formula A such that

M ⊨ A if and only if M ⊨ B[⌜A⌝].

Proof. We define an elementary function s by

s(b, k) := sub(b, ⌜z⌝, ⌜k̄⌝).

Here z is a specially given variable determined by B(z), say v₀. Then for every formula C(z) we have

s(⌜C⌝, k) = sub(⌜C⌝, ⌜z⌝, ⌜k̄⌝) = ⌜C(k̄)⌝,

hence in particular

s(⌜C⌝, ⌜C⌝) = ⌜C(⌜C⌝)⌝.

By assumption the graph G_s of s is definable in M, by A_s(x₁, x₂, x₃) say. Let

C(z) := ∃x.B(x) ∧ A_s(z, z, x),
A := C(⌜C⌝),

so

A = ∃x.B(x) ∧ A_s(⌜C⌝, ⌜C⌝, x).

Hence M ⊨ A if and only if there is an a ∈ ℕ with M ⊨ B[a] and a = ⌜C(⌜C⌝)⌝ = ⌜A⌝, so if and only if M ⊨ B[⌜A⌝].
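The diagonal step s(⌜C⌝, ⌜C⌝) = ⌜C(⌜C⌝)⌝ can be imitated at the level of strings. In the following sketch (an illustration, not from the text) strings play the role of Gödel numbers and repr plays the role of the numeral function:

```python
def sub(formula: str, var: str, code: str) -> str:
    # substitute a code (standing in for the numeral of a Goedel number)
    # for every occurrence of a variable
    return formula.replace(var, code)

def s(b: str, k: str) -> str:
    # s(b, k) = sub(b, 'z', numeral of k); here a string is its own code
    return sub(b, "z", repr(k))

# a 'formula' C(z) with sole free variable z, as in the proof:
C = "exists x (B(x) and A_s(z, z, x))"
A = s(C, C)          # A = C(<C>): C applied to its own code

# the fixed-point equation s(<C>, <C>) = <C(<C>)> at string level:
assert A == sub(C, "z", repr(C))
assert repr(C) in A  # A literally talks about the code of C
```

The quotation via repr is what keeps the substituted copy inert, just as the numeral of ⌜C⌝ is a closed term and not the formula C itself.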
2.3. Undefinability. We can now prove the undefinability of truth.

Theorem (Tarski's Undefinability Theorem). Assume that every elementary relation is definable in M. Then Th(M) is undefinable in M, hence in particular not Σ⁰₁-definable.

Proof. Assume that Th(M) is definable by B_W(z). Then for all closed formulas A,

M ⊨ A if and only if M ⊨ B_W[⌜A⌝].

Now consider the formula ¬B_W(z) and choose by the Fixed Point Lemma a closed L-formula A such that

M ⊨ A if and only if M ⊨ ¬B_W[⌜A⌝].

This contradicts the equivalence above.

We have already noticed that all Σ⁰₁-definable relations are definable in M. Hence it follows that Th(M) cannot be Σ⁰₁-definable.
3. The Notion of Truth in Formal Theories

We now want to generalize the arguments of the previous section. There we made essential use of the notion of truth in a structure M, i.e., of the relation M ⊨ A. The set of all closed formulas A such that M ⊨ A has been called the theory of M, denoted Th(M).

Now instead of Th(M) we shall start more generally from an arbitrary theory T. We shall deal with the question whether in T there is a notion of truth (in the form of a truth formula B(z)), such that B(z) means "z is true". What should this mean? We have to explain all the notions used without referring to semantical concepts at all.

• z ranges over closed formulas (or sentences) A, or more precisely over their Gödel numbers ⌜A⌝.
• "A is true" is to be replaced by T ⊢ A.
• "C is equivalent to D" is to be replaced by T ⊢ C ↔ D.

We want to study the question whether it is possible that a truth formula B(z) exists such that for all sentences A we have T ⊢ A ↔ B(⌜A⌝). The result will be that this is impossible, under rather weak assumptions on the theory T.
3.1. Representable Relations. Technically, the issue will be to replace the notion of definability by the notion of representability within a formal theory.

Let L again be an elementarily presented language with 0, S, = in L, and let T be a theory containing the equality axioms Eq_L.

Definition. A relation R ⊆ ℕⁿ is representable in T if there is a formula A(x₁, …, xₙ) such that

T ⊢ A(a₁, …, aₙ)  if (a₁, …, aₙ) ∈ R,
T ⊢ ¬A(a₁, …, aₙ)  if (a₁, …, aₙ) ∉ R.

A function f : ℕⁿ → ℕ is called representable in T if there is a formula A(x₁, …, xₙ, y) representing the graph G_f ⊆ ℕⁿ⁺¹ of f, i.e., such that

T ⊢ A(a₁, …, aₙ, f(a₁, …, aₙ)),  (16)
T ⊢ ¬A(a₁, …, aₙ, c)  if c ≠ f(a₁, …, aₙ),  (17)

and such that in addition

T ⊢ A(a₁, …, aₙ, y) → A(a₁, …, aₙ, z) → y = z  for all a₁, …, aₙ ∈ ℕ.  (18)

Notice that in case T ⊢ b ≠ c for b < c, the condition (17) follows from (16) and (18).
Lemma. If the characteristic function c_R of a relation R ⊆ ℕⁿ is representable in T, then so is the relation R itself.

Proof. For simplicity assume n = 1. Let A(x, y) be a formula representing c_R. We show that A(x, 1) represents the relation R. So assume a ∈ R. Then c_R(a) = 1, hence (a, 1) ∈ G_{c_R}, hence T ⊢ A(a, 1). Conversely, assume a ∉ R. Then c_R(a) = 0, hence (a, 1) ∉ G_{c_R}, hence T ⊢ ¬A(a, 1).
3.2. Fixed Points. We can now prove a generalized (syntactical) version of the Fixed Point Lemma above.

Lemma (Fixed Point Lemma). Assume that all elementary functions are representable in T. Then for every formula B(z) with only z free we can find a closed formula A such that

T ⊢ A ↔ B(⌜A⌝).

Proof. We start as in the proof of the Semantical Fixed Point Lemma. Let A_s(x₁, x₂, x₃) be a formula which represents the elementary function s(b, k) := sub(b, ⌜z⌝, ⌜k̄⌝). Let

C(z) := ∃x.B(x) ∧ A_s(z, z, x),
A := C(⌜C⌝),

i.e.

A = ∃x.B(x) ∧ A_s(⌜C⌝, ⌜C⌝, x).

Because of s(⌜C⌝, ⌜C⌝) = ⌜C(⌜C⌝)⌝ = ⌜A⌝ we can prove in T

A_s(⌜C⌝, ⌜C⌝, x) ↔ x = ⌜A⌝,

hence by definition of A also

A ↔ ∃x.B(x) ∧ x = ⌜A⌝

and hence

A ↔ B(⌜A⌝).
Proof. We first define Refut_T ⊆ ℕ × ℕ by

Refut_T(d, a) :⇔ Prf_T(d, ⌜¬A⌝) for a = ⌜A⌝.

So Refut_T is elementary, and we have Refut_T(d, a) if and only if d is a refutation of a in T, i.e., d is a derivation of a sequent Γ ⇒ ¬A, where A is coded by a = ⌜A⌝ and Γ is composed from stability axioms and formulas from Ax_T. Let B_{Prf_T}(x₁, x₂) and B_{Refut_T}(x₁, x₂) be formulas representing Prf_T and Refut_T, respectively. Choose by the Fixed Point Lemma in 3.2 a closed formula A such that

T ⊢ A ↔ ∀x.B_{Prf_T}(x, ⌜A⌝) → ∃y.y < x ∧ B_{Refut_T}(y, ⌜A⌝).

So A expresses its own underivability, in the form (due to Rosser): "For every proof of me there is a shorter proof of my negation."

We shall show (a) T ⊬ A and (b) T ⊬ ¬A.

Ad (a). Assume T ⊢ A. Choose a such that

Prf_T(a, ⌜A⌝).

Then we also have

not Refut_T(b, ⌜A⌝) for all b,

since T is consistent. Hence we have

T ⊢ B_{Prf_T}(a, ⌜A⌝),
T ⊢ ¬B_{Refut_T}(b, ⌜A⌝) for all b.

By (19) we can conclude

T ⊢ ¬(B_{Prf_T}(a, ⌜A⌝) → ∃y.y < a ∧ B_{Refut_T}(y, ⌜A⌝)).

Hence we have

T ⊢ ¬(∀x.B_{Prf_T}(x, ⌜A⌝) → ∃y.y < x ∧ B_{Refut_T}(y, ⌜A⌝)),
T ⊢ ¬A.

This contradicts the assumed consistency of T.

Ad (b). Assume T ⊢ ¬A. Choose a such that

Refut_T(a, ⌜A⌝).

Then we also have

not Prf_T(b, ⌜A⌝) for all b,

since T is consistent. Hence we have

T ⊢ B_{Refut_T}(a, ⌜A⌝),
T ⊢ ¬B_{Prf_T}(b, ⌜A⌝) for all b.

But this implies

T ⊢ ∀x.B_{Prf_T}(x, ⌜A⌝) → ∃y.y < x ∧ B_{Refut_T}(y, ⌜A⌝),

as can be seen easily by cases on x, using (20). Hence T ⊢ A. But this again contradicts the assumed consistency of T.
4.3. Relativized Gödel–Rosser Theorem. Finally we formulate a variant of this theorem which no longer assumes that the theory T talks about numbers only.

Theorem (Gödel–Rosser). Assume that T is an axiomatized consistent L-theory with 0, S, = in L and Eq_L ⊆ T. Furthermore assume that there are formulas N(x) and L(x, y), written Nx and x < y, such that T ⊢ N0, T ⊢ ∀x∈N N(S(x)) and

T ⊢ ∀x∈N.x < a → x = 0 ∨ ⋯ ∨ x = a−1,
T ⊢ ∀x∈N.x = 0 ∨ ⋯ ∨ x = a ∨ a < x.

Here ∀x∈N A is short for ∀x.Nx → A. Moreover assume that every elementary function is representable in T. Then one can find a closed formula A such that neither A nor ¬A is provable in T.

Proof. As before; just relativize all quantifiers to N.
5. Representability

We show in this section that already very simple theories have the property that all recursive functions are representable in them.

5.1. A Weak Arithmetic. It is here that the need for Gödel's β-function arises: recall that we used it to prove that the class of recursive functions can be generated without the primitive recursion scheme, i.e., with composition and the unbounded μ-operator as the only generating schemata.

Theorem. Assume that T is an L-theory with 0, S, = in L and Eq_L ⊆ T. Furthermore assume that there are formulas N(x) and L(x, y), written Nx and x < y, such that T ⊢ N0, T ⊢ ∀x∈N N(S(x)) and the following hold:

T ⊢ S(a) ≠ 0  for all a ∈ ℕ, (21)
T ⊢ S(a) = S(b) → a = b  for all a, b ∈ ℕ, (22)
the functions + and · are representable in T, (23)
T ⊢ ∀x∈N ¬(x < 0), (24)
T ⊢ ∀x∈N.x < S(b) → x < b ∨ x = b  for all b ∈ ℕ, (25)
T ⊢ ∀x∈N.x < b ∨ x = b ∨ b < x  for all b ∈ ℕ. (26)

Here again ∀x∈N A is short for ∀x.Nx → A. Then T fulfills the assumptions of the theorem in 4.3, i.e., of the Gödel–Rosser Theorem relativized to N. In particular we have, for all a ∈ ℕ,

T ⊢ ∀x∈N.x < a → x = 0 ∨ ⋯ ∨ x = a−1, (27)
T ⊢ ∀x∈N.x = 0 ∨ ⋯ ∨ x = a ∨ a < x, (28)

and every recursive function is representable in T.
Proof. (27) can be proved easily by induction on a. The base case follows from (24), and the step from the induction hypothesis and (25). (28) follows immediately from the trichotomy law (26), using (27).

For the representability of recursive functions, first note that the formulas x = y and x < y actually do represent in T the equality and the less-than relations, respectively. From (21) and (22) we see immediately that T ⊢ a ≠ b when a ≠ b. Assume a ≮ b; we show T ⊢ ¬(a < b) by induction on b. T ⊢ ¬(a < 0) follows from (24). In the step we have a ≮ b + 1, hence a ≮ b and a ≠ b, hence by the induction hypothesis and the representability (above) of the equality relation, T ⊢ ¬(a < b) and T ⊢ a ≠ b, hence by (25) T ⊢ ¬(a < S(b)). Now assume a < b. Then T ⊢ a ≠ b and T ⊢ ¬(b < a), hence by (26) T ⊢ a < b.
We now show, by induction on the definition of recursive functions, that every recursive function is representable in T. Observe first that the second condition (17) in the definition of representability of a function automatically follows from the other two conditions (and hence need not be checked further). This is because if c ≠ f(a₁, …, aₙ), then by contraposing the third condition (18),

T ⊢ c ≠ f(a₁, …, aₙ) → A(a₁, …, aₙ, f(a₁, …, aₙ)) → ¬A(a₁, …, aₙ, c),

and hence, using the representability of equality and the first representability condition (16), we obtain T ⊢ ¬A(a₁, …, aₙ, c).
The initial functions, the constant 0, the successor and the projection (onto the i-th coordinate), are trivially represented by the formulas 0 = y, S(x) = y and x_i = y respectively. Addition and multiplication are represented in T by assumption. Recall that the one remaining initial function of μ-recursiveness is modified subtraction, a ∸ b = μi.b + i ≥ a = μi.c_<(b + i, a) = 0. We now show that the characteristic function of < is representable in T. (It will then follow that […]
[…] so by logic T ⊢ A_f(a⃗, c). It remains to show uniqueness,

T ⊢ A_f(a⃗, z₁) → A_f(a⃗, z₂) → z₁ = z₂.

But this follows by logic from the induction hypothesis for g_i, which gives

T ⊢ A_{g_i}(a⃗, y_{1i}) → A_{g_i}(a⃗, y_{2i}) → y_{1i} = y_{2i},

and the induction hypothesis for h, which gives

T ⊢ A_h(b⃗, z₁) → A_h(b⃗, z₂) → z₁ = z₂

with b_i = g_i(a⃗).
For the μ case, suppose f is defined from g (taken here to be binary for notational convenience) by f(a) = μi (g(i, a) = 0), assuming ∀a∃i (g(i, a) = 0). By the induction hypothesis we have a formula A_g(y, x, z) representing g. In this case we represent f by the formula

A_f(x, y) := Ny ∧ A_g(y, x, 0) ∧ ∀v∈N.v < y → ∃u.u ≠ 0 ∧ A_g(v, x, u).

We first show the representability condition (16), that is, T ⊢ A_f(a, b) when f(a) = b. Because of the form of A_f this follows from the assumed representability of g together with T ⊢ v < b → v = 0 ∨ ⋯ ∨ v = b−1.

We now tackle the uniqueness condition (18). Given a, let b := f(a) (thus g(b, a) = 0 and b is the least such). It suffices to show T ⊢ A_f(a, y) → y = b, and we do this by proving T ⊢ y < b → ¬A_f(a, y) and T ⊢ b < y → ¬A_f(a, y), and then appealing to the trichotomy law.

We first show T ⊢ y < b → ¬A_f(a, y). Since, for any i < b, T ⊢ ¬A_g(i, a, 0) by the assumed representability of g, we obtain immediately T ⊢ ¬A_f(a, i). Hence because of T ⊢ y < b → y = 0 ∨ ⋯ ∨ y = b−1 the claim follows.

Secondly, T ⊢ b < y → ¬A_f(a, y) follows almost immediately from T ⊢ b < y → A_f(a, y) → ∃u.u ≠ 0 ∧ A_g(b, a, u) and the uniqueness for g, T ⊢ A_g(b, a, u) → u = 0. This now completes the proof.
5.2. Robinson's Theory Q. We conclude this section by considering a special and particularly simple arithmetical theory originally due to Robinson. Let L₁ be the language given by 0, S, +, · and =, and let Q be the theory determined by the axioms Eq_{L₁} and

S(x) ≠ 0, (29)
S(x) = S(y) → x = y, (30)
x + 0 = x, (31)
x + S(y) = S(x + y), (32)
x · 0 = 0, (33)
x · S(y) = x · y + x, (34)
∃z (x + S(z) = y) ∨ x = y ∨ ∃z (y + S(z) = x). (35)

Theorem. Every theory T ⊇ Q fulfills the assumptions of the Gödel–Rosser theorem in 4.3, w.r.t. the definition L(x, y) := ∃z (x + S(z) = y) of the <-relation. Moreover, every recursive function is representable in T.
Proof. We show that T with N(x) := (x = x) and L(x, y) := ∃z (x + S(z) = y) satisfies the conditions of the theorem in 5.1. For (21) and (22) this is clear. For (23) we can take x + y = z and x · y = z as representing formulas. For (24) we have to show ¬∃z (x + S(z) = 0); this follows from (32) and (29). For the proof of (25) we need the auxiliary proposition

(36) x = 0 ∨ ∃y (x = 0 + S(y)),

which will be attended to below. So assume x + S(z) = S(b), hence also S(x + z) = S(b) and therefore x + z = b. We now use (36) for z. In case z = 0 we obtain x = b, and in case ∃y (z = 0 + S(y)) we have ∃y′ (x + S(y′) = b), since 0 + S(y) = S(0 + y). Thus (25) is proved. (26) follows immediately from (35).

For the proof of (36) we use (35) with y = 0. It clearly suffices to exclude the first case ∃z (x + S(z) = 0). But this means S(x + z) = 0, contradicting (29).
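The recursion equations (31)-(34) already compute: every closed {0, S, +, ·}-term rewrites to a numeral using them alone, which is the content of the representability of + and · by x + y = z and x · y = z. A small executable sketch (the tuple representation of terms is an assumption made here, not notation from the text):

```python
def num(n):
    """The numeral for n: S(S(...S(0)...))."""
    return ("0",) if n == 0 else ("S", num(n - 1))

def ev(t):
    """Reduce a closed {0, S, +, *}-term to a numeral using (31)-(34)."""
    op = t[0]
    if op == "0":
        return t
    if op == "S":
        return ("S", ev(t[1]))
    x, y = ev(t[1]), ev(t[2])
    if op == "+":
        # (31): x + 0 = x;  (32): x + S(y) = S(x + y)
        return x if y[0] == "0" else ("S", ev(("+", x, y[1])))
    if op == "*":
        # (33): x * 0 = 0;  (34): x * S(y) = x * y + x
        return ("0",) if y[0] == "0" else ev(("+", ("*", x, y[1]), x))

assert ev(("+", num(2), num(1))) == num(3)
assert ev(("*", num(2), num(3))) == num(6)
```

Each rewriting step corresponds to one use of an axiom of Q, so the run of the program can be read as a template for a Q-derivation of the corresponding equation between numerals.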
Corollary (Essential Undecidability of Q). Every consistent theory T ⊇ Q is non-recursive.

Proof. By the theorems in 5.2 and 4.1.

5.3. Undecidability of First Order Logic. As a simple corollary to the (essential) undecidability of Q we even obtain the undecidability of pure logic.

Corollary (Undecidability of First Order Logic). The set of formulas derivable in classical first order logic is non-recursive.

Proof. Otherwise Q would be recursive, because a formula A is derivable in Q if and only if the implication B → A is derivable in classical first order logic, where B is the conjunction of the finitely many axioms and equality axioms of Q.

Remark. Notice that it suffices that the first order language have one binary relation symbol (for =), one constant symbol (for 0), one unary function symbol (for S) and two binary function symbols (for + and ·). The study of decidable fragments of first order logic is one of the oldest research areas of Mathematical Logic. For more information see Börger, Grädel and Gurevich [3].
5.4. Representability by Σ₁-formulas of the language L₁. By reading through the above proof of representability, one sees easily that the representing formulas used are of a restricted form, having no unbounded universal quantifiers and therefore defining Σ⁰₁ relations. This will be of crucial importance for our proof of Gödel's Second Incompleteness Theorem to follow, but in addition we need a syntactically precise definition of the class of formulas actually involved.

Definition. The Σ₁-formulas of the language L₁ are those generated inductively by the following clauses:

• Only atomic formulas of the restricted forms x = y, x ≠ y, 0 = x, S(x) = y, x + y = z and x · y = z are allowed as Σ₁-formulas.
• If A and B are Σ₁-formulas, then so are A ∧ B and A ∨ B.
• If A is a Σ₁-formula, then so is ∀x<y A, which is an abbreviation for ∀x.∃z (x + S(z) = y) → A.
• If A is a Σ₁-formula, then so is ∃x A.
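The four clauses translate directly into a syntactic check. In the following sketch (the tuple encoding and the tag names are assumptions made here) argument places are not checked to be variables; note that x + S(z) = y is not an allowed atomic shape, so x < y has to be written with an auxiliary variable:

```python
ATOMS = {"=", "!=", "0=", "S=", "+=", "*="}   # the six restricted atomic shapes

def is_sigma1(f):
    tag = f[0]
    if tag in ATOMS:
        return True
    if tag in ("and", "or"):      # closure under conjunction and disjunction
        return is_sigma1(f[1]) and is_sigma1(f[2])
    if tag == "all<":             # bounded universal quantifier: (all< x y A)
        return is_sigma1(f[3])
    if tag == "ex":               # unbounded existential quantifier: (ex x A)
        return is_sigma1(f[2])
    return False                  # in particular: no negation, no unbounded 'all'

# x < y, i.e. ex z (x + S(z) = y), rendered with an auxiliary w for S(z) = w:
lt = ("ex", "z", ("ex", "w", ("and", ("S=", "z", "w"), ("+=", "x", "w", "y"))))
assert is_sigma1(lt)
assert not is_sigma1(("not", lt))
```

The rejected last case is the whole point of the definition: negations and unbounded universal quantifiers would leave the Σ⁰₁ class.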
Corollary. Every recursive function is representable in Q by a Σ₁-formula in the language L₁.

Proof. This can be seen immediately by inspecting the proof of the theorem in 5.1. Only notice that because of the equality axioms ∃z (x + S(z) = y) is equivalent to ∃z∃w (S(z) = w ∧ x + w = y), and A(0) is equivalent to ∃x.0 = x ∧ A.
6. Unprovability of Consistency

We have seen in the Gödel–Rosser Theorem how, for every axiomatized consistent theory T satisfying certain weak assumptions, we can construct an undecidable sentence A meaning "For every proof of me there is a shorter proof of my negation". Because A is unprovable, it is clearly true.

Gödel's Second Incompleteness Theorem provides a particularly interesting alternative to A, namely a formula Con_T expressing the consistency of T. Again it turns out to be unprovable and therefore true.

We shall prove this theorem in a sharpened form due to Löb.

6.1. Σ₁-Completeness of Q. We begin with an auxiliary proposition expressing the completeness of Q with respect to Σ₁-formulas.
Lemma. Let A(x₁, …, xₙ) be a Σ₁-formula in the language L₁ determined by 0, S, +, · and =. Assume that N₁ ⊨ A[a₁, …, aₙ], where N₁ is the standard model of L₁. Then Q ⊢ A(a₁, …, aₙ).

Proof. By induction on the Σ₁-formulas of the language L₁. For atomic formulas, the cases have been dealt with either in the earlier parts of the proof of the theorem in 5.1, or (for x + y = z and x · y = z) they follow from the recursion equations (31)-(34).

Cases A ∧ B, A ∨ B. The claim follows immediately from the induction hypothesis.

Case ∀x<y A(x, y, z₁, …, zₙ); for simplicity assume n = 1. Suppose N₁ ⊨ (∀x<y A)[b, c]. Then also N₁ ⊨ A[i, b, c] for each i < b, and hence by the induction hypothesis Q ⊢ A(i, b, c). Now by the theorem in 5.2

Q ⊢ ∀x<b.x = 0 ∨ ⋯ ∨ x = b−1,

hence

Q ⊢ (∀x<y A)(b, c).

Case ∃x A(x, y₁, …, yₙ); for simplicity take n = 1. Assume N₁ ⊨ (∃x A)[b]. Then N₁ ⊨ A[a, b] for some a ∈ ℕ, hence by the induction hypothesis Q ⊢ A(a, b) and therefore Q ⊢ ∃x A(x, b).
6.2. Formalized Σ₁-Completeness.

Lemma. In an appropriate theory T of arithmetic with induction, we can formally prove, for any Σ₁-formula A,

A(x) → ∃p Prf_T(p, ⌜A(ẋ)⌝).

Here Prf_T(p, z) is a suitable Σ₁-formula which represents in Robinson's Q the recursive relation "a is the Gödel number of a proof in T of the formula with Gödel number b". Also ⌜A(ẋ)⌝ is a term which represents, in Q, the numerical function mapping a number a to the Gödel number of A(a).

Proof. We have not been precise about the theory T in which this result is to be formalized, but we shall content ourselves at this stage with merely pointing out, as we proceed, the basic properties that are required. Essentially T will be an extension of Q, together with induction formalized by the axiom schema

B(0) → (∀x.B(x) → B(S(x))) → ∀x B(x),

and it will be assumed that T has sufficiently many basic functions available to deal with the construction of appropriate Gödel numbers.
The proof goes by induction on the build-up of the Σ₁-formula A(x). We consider three atomic cases, leaving the others to the reader.

Suppose A(x) is the formula 0 = x. We show T ⊢ 0 = x → ∃p Prf_T(p, ⌜0 = ẋ⌝) by induction on x. The base case merely requires the construction of a numeral representing the Gödel number of the axiom 0 = 0, and the induction step is trivial because T ⊢ S(x) ≠ 0.

Secondly, suppose A is the formula x + y = z. We show T ⊢ ∀z.x + y = z → ∃p Prf_T(p, ⌜ẋ + ẏ = ż⌝) by induction on y. If y = 0, the assumption gives x = z and one requires only the Gödel number for the axiom ∀x (x + 0 = x) which, when applied to the Gödel number of the x-th numeral, gives ∃p Prf_T(p, ⌜ẋ + 0 = ż⌝). If y is a successor S(u), then the assumption gives z = S(v) where x + u = v, so by the induction hypothesis we already have a p such that Prf_T(p, ⌜ẋ + u̇ = v̇⌝). Applying the successor to both sides, one then easily obtains from p a p′ such that Prf_T(p′, ⌜ẋ + ẏ = ż⌝).

Thirdly, suppose A is the formula x ≠ y. We show T ⊢ ∀y.x ≠ y → ∃p Prf_T(p, ⌜ẋ ≠ ẏ⌝) by induction on x. The base case x = 0 requires a subinduction on y. If y = 0, then the claim is trivial (by ex-falso). If y = S(u), we have to produce a Gödel number p such that Prf_T(p, ⌜0 ≠ S(u̇)⌝), but this is just an axiom. Now consider the step case x = S(v). Again we need an auxiliary induction on y. Its base case is dealt with exactly as before, and when y = S(u) it uses the induction hypothesis for v ≠ u together with the injectivity of the successor.

The cases where A is built up by conjunction or disjunction are rather trivial. One only requires, for example in the conjunction case, a function which combines the Gödel numbers of the proofs of the separate conjuncts into a single Gödel number of a proof of the conjunction itself.

Now consider the case ∃y A(y, x) (with just one parameter x for simplicity). By the induction hypothesis we already have T ⊢ A(y, x) → ∃p Prf_T(p, ⌜A(ẏ, ẋ)⌝). But any Gödel number p such that Prf_T(p, ⌜A(ẏ, ẋ)⌝) can easily be transformed (by formally applying the ∃-introduction rule) into a Gödel number p′ such that Prf_T(p′, ⌜∃y A(y, ẋ)⌝). Therefore we obtain, as required,

T ⊢ ∃y A(y, x) → ∃p′ Prf_T(p′, ⌜∃y A(y, ẋ)⌝).

Finally suppose the Σ₁-formula is of the form ∀u<y A(u, x). We must show

T ⊢ ∀u<y A(u, x) → ∃p Prf_T(p, ⌜∀u<ẏ A(u, ẋ)⌝).

By the induction hypothesis

T ⊢ A(u, x) → ∃p Prf_T(p, ⌜A(u̇, ẋ)⌝),

so by logic

T ⊢ ∀u<y A(u, x) → ∀u<y ∃p Prf_T(p, ⌜A(u̇, ẋ)⌝).

The required result now follows immediately from the auxiliary lemma

T ⊢ ∀u<y ∃p Prf_T(p, ⌜A(u̇, ẋ)⌝) → ∃q Prf_T(q, ⌜∀u<ẏ A(u, ẋ)⌝).

It remains only to prove this, which we do by induction on y (inside T). In case y = 0 a proof of ∀u<0 A is trivial, by ex-falso, so the required Gödel number q is easily constructed. For the step case y = S(z): by assumption we have ∀u<z ∃p Prf_T(p, ⌜A(u̇, ẋ)⌝), hence ∃q Prf_T(q, ⌜∀u<ż A(u, ẋ)⌝) by the induction hypothesis. Also ∃p′ Prf_T(p′, ⌜A(ż, ẋ)⌝). Now we only have to combine p′ and q to obtain (by means of an appropriate simple function) a Gödel number q′ such that Prf_T(q′, ⌜∀u<ẏ A(u, ẋ)⌝).
6.3. Derivability Conditions. So now let T be an axiomatized consistent theory with T ⊇ Q, possessing enough induction to formalize Σ₁-completeness as we have just done. Define, from the associated formula Prf_T, the following L₁-formulas:

Thm_T(x) := ∃y Prf_T(y, x),
Con_T := ¬∃y Prf_T(y, ⌜⊥⌝).

So Thm_T(x) defines in N₁ the set of formulas provable in T, and we have N₁ ⊨ Con_T if and only if T is consistent. For L₁-formulas A let □A := Thm_T(⌜A⌝).

Now consider the following two derivability conditions for T (Hilbert–Bernays [12]):

T ⊢ A → □A  (A a closed Σ₁-formula of the language L₁), (37)
T ⊢ □(A → B) → □A → □B. (38)

(37) is just a special case of formalized Σ₁-completeness for closed formulas, and (38) requires only that the theory T has a term that constructs, from the Gödel number of a proof of A → B and the Gödel number of a proof of A, the Gödel number of a proof of B; furthermore this fact must be provable in T.

Theorem (Gödel's Second Incompleteness Theorem). Let T be an axiomatized consistent extension of Q, satisfying the derivability conditions (37) and (38). Then T ⊬ Con_T.

Proof. Let C := ⊥ in the theorem below, which is Löb's generalization of Gödel's original proof.
Theorem (Löb). Let T be an axiomatized consistent extension of Q satisfying the derivability conditions (37) and (38). Then for any closed L₁-formula C, if T ⊢ □C → C (that is, T ⊢ Thm_T(⌜C⌝) → C), then already T ⊢ C.

Proof. Assume T ⊢ □C → C. Choose A by the Fixed Point Lemma in 3.2 such that

(39) Q ⊢ A ↔ (□A → C).

We must show T ⊢ C. First we show T ⊢ □A → C, as follows:

T ⊢ A → □A → C  by (39),
T ⊢ □(A → □A → C)  by Σ₁-completeness,
T ⊢ □A → □(□A → C)  by (38),
T ⊢ □A → □□A → □C  again by (38),
T ⊢ □A → □C  because T ⊢ □A → □□A by (37).

Therefore from the assumption T ⊢ □C → C we obtain T ⊢ □A → C. This implies T ⊢ A by (39), and then T ⊢ □A by Σ₁-completeness. But T ⊢ □A → C as we have just shown, therefore T ⊢ C.
Remark. It follows immediately that if T is any axiomatized consistent extension of Q satisfying the derivability conditions (37) and (38), then the reflection scheme

Thm_T(⌜C⌝) → C  for closed L₁-formulas C

is not derivable in T, for by Löb's Theorem it cannot be derivable when C is underivable.

By adding to Q the induction scheme for all formulas we obtain Peano arithmetic PA, which is the most natural example of a theory T to which the results above apply. However, various weaker fragments of PA, obtained by restricting the classes of induction formulas, would serve equally well as examples of such T.
7. Notes

The undecidability of first order logic was first proved by Church; however, the basic idea of the proof was already present in Gödel's [11]. The fundamental papers on incompleteness are Gödel's [10] (from 1930) and [11] (from 1931). Gödel also discovered the β-function, which is of central importance for the representation theorem; he made use of the Fixed Point Lemma only implicitly. His First Incompleteness Theorem is based on the formula "I am not provable", a fixed point of ¬Thm_T(x). For the independence of this proposition from the underlying theory T he had to assume ω-consistency of T. Rosser (1936) proved the sharper result reproduced here, using a formula with the meaning "for every proof of me there is a shorter proof of my negation". The undefinability of the notion of truth was first proved by Tarski (1939). The arithmetical theories R and Q₀ (in Exercises 46 and 47) are due to R. Robinson (1950). R is essentially undecidable, incomplete and strong enough for Σ₁-completeness; moreover, all recursive relations are representable in R. Q₀ is a very natural theory and, in contrast to R, finite. Q₀ is minimal in the following sense: if one axiom is deleted, then the resulting theory is no longer essentially undecidable. The first essentially undecidable theory was found by Mostowski and Tarski (1939); when reading the manuscript, J. Robinson had the idea of treating recursive functions without the scheme of primitive recursion.

Important examples of undecidable theories are (in historical order): arithmetic of natural numbers (Rosser, 1936), arithmetic of integers (Tarski, Mostowski, 1949), arithmetic of rationals and the theory of ordered fields (J. Robinson, 1949), group theory and lattice theory (Tarski, 1949). This is in contrast to the following decidable theories: the theory of addition of natural numbers (Presburger, 1929), that of multiplication (Mostowski, 1952), the theory of abelian groups (Szmielew, 1949), of algebraically closed fields and of Boolean algebras (Tarski, 1949), and the theory of linearly ordered sets (Ehrenfeucht, 1959).
CHAPTER 5

Set Theory

1. Cumulative Type Structures

Set theory can be viewed as a framework within which mathematics can be given a foundation. Here we want to develop set theory as a formal theory within mathematical logic. But first it is necessary to have an intuitive picture of the notion of a set, to be described by the axioms.
1.1. Cantor's Definition. Cantor in 1895 gave the following definition:

By a set we understand every collection M of definite, well-distinguished objects m of our intuition or our thought (which are called the elements of M) into a whole.

One can try to make this definition more precise, as follows. Let V be the collection of all objects "of our intuition or our thought". Let A(x) denote properties of objects x from V. Then one can form the set { x | A(x) }, the set of all objects x of V with the property A(x). According to Cantor's definition { x | A(x) } is again an object in V.

Examples of properties: (1) x is a natural number. (2) x is a set. (3) x is a point, y is a line and x lies on y. (4) y is a set and x is an element of y; shortly: Set(y) ∧ x ∈ y.
However, Cantor's definition cannot be accepted in its original form, for it leads to contradictions. The most well known is Russell's antinomy: let

x₀ := { x | Set(x) ∧ x ∉ x }.

Then

x₀ ∈ x₀ ↔ Set(x₀) ∧ x₀ ∉ x₀ ↔ x₀ ∉ x₀,

for x₀ is a set.
1.2. Shoenfield's Principle. The root of this contradiction is the fact that in Cantor's definition we accept the concept of a finished totality of all sets. However, this is neither necessary nor does it mirror the usual practice of mathematics. It completely suffices to form a set only if all its elements are already available. This leads to the concept of a stepwise construction of sets, or more precisely to the cumulative type structure: we start with certain urelements, which form the sets of level 0. Then at an arbitrary level we can form all sets whose elements belong to earlier levels. If for instance we take the natural numbers as urelements, then {27, {5}} belongs to level 2.

The following natural questions pose themselves: (1) Which urelements should we choose? (2) How far do the levels reach?
Ad (1). For the purposes of mathematics it is completely sufficient not to assume any urelements at all; one then speaks of pure sets. This will be done in the following.

Level 0: (no sets yet)
Level 1: ∅
Level 2: ∅, {∅}
Level 3: ∅, {∅}, {{∅}}, {∅, {∅}}

and so on.
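These first levels can be generated mechanically; in the following sketch (an illustration, not from the text) frozensets model pure sets and level n+1 is taken to be the power set of level n:

```python
from itertools import combinations

def next_level(level):
    """All sets whose elements come from the given level (its power set)."""
    elems = list(level)
    return {frozenset(c) for n in range(len(elems) + 1) for c in combinations(elems, n)}

V = [set()]                       # level 0: no urelements, hence no sets yet
for _ in range(3):
    V.append(next_level(V[-1]))

# sizes 0, 1, 2, 4: the empty set appears at level 1, {emptyset} at level 2, ...
assert [len(level) for level in V] == [0, 1, 2, 4]
```

Since every subset of a level is again a subset of all later levels, the levels are cumulative, which is exactly the picture described above.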
Ad (2). In [23], Shoenfield formulated the following principle:

Consider a collection S of levels. If a situation can be conceived in which all the levels from S are constructed, then there exists a level which is past all those levels.

From this admittedly rather vague principle we shall draw exact consequences, which will be fixed as axioms.

By a set we intuitively understand an object that belongs to some level of the cumulative type structure. By a class we mean an arbitrary collection of sets.

So every set clearly is a class. Moreover there are classes that are not sets, for instance the class V of all sets.
2. Axiomatic Set Theory

In set theory, as in any axiomatic theory, we have to state explicitly all properties used, including the obvious ones.

2.1. Extensionality, Equality. The language of set theory has a single non-logical symbol, the element relation ∈. So the only atomic formulas are of the form x ∈ y ("x is an element of y"). Equality x = y is defined by

x = y := ∀z.z ∈ x ↔ z ∈ y.

To ensure compatibility of the ∈ relation with equality we need an axiom:

Axiom (Extensionality).

x = y → x ∈ z → y ∈ z.

Remark. If alternatively equality is to be used as a primitive symbol, one must require the equality axioms and in addition

(∀z.z ∈ x ↔ z ∈ y) → x = y.

As classes in our axiomatic theory we only allow definable collections of sets. By definable we mean definable by a formula in the language of set theory. More precisely: if A(x) is a formula, then

{ x | A(x) }

denotes the class of all sets x with the property A(x).

Instead of classes we could have used properties, or more precisely formulas, as well. However, classes allow a simpler and more suggestive formulation of many of the propositions we want to consider.
If A(x) is the formula x = x, then { x | A(x) } is called the all class or the (set-theoretic) universe. If A(x) is the formula x ∉ x, then { x | A(x) } is called the Russell class.

We now give some definitions that will be used all over in the following. A set b is an element of the class { x | A(x) } if A(b) holds:

b ∈ { x | A(x) } := A(b).

Two classes 𝒜, ℬ are equal if they have the same elements:

𝒜 = ℬ := ∀x.x ∈ 𝒜 ↔ x ∈ ℬ.

If 𝒜 is a class and b a set, then 𝒜 and b are called equal if they have the same elements:

𝒜 = b := ∀x.x ∈ 𝒜 ↔ x ∈ b.

In this case we identify the class 𝒜 with the set b. Instead of "𝒜 is a set" we also write 𝒜 ∈ V. A class ℬ is an element of a set a (of a class 𝒜, resp.) if ℬ is equal to an element x of a (of 𝒜, resp.):

ℬ ∈ a := ∃x.x ∈ a ∧ ℬ = x,
ℬ ∈ 𝒜 := ∃x.x ∈ 𝒜 ∧ ℬ = x.

A class 𝒜 is a proper class if 𝒜 is not a set:

𝒜 is a proper class := ∀x (x ≠ 𝒜).
Remark. Every set b is a class, since

b = { x | x ∈ b }.

The Russell class is a proper class, for if { x | x ∉ x } = x₀, we would have

x₀ ∈ x₀ ↔ x₀ ∉ x₀.

So the Russell construction is no longer an antinomy, but simply says that there are sets and (proper) classes.
Let 𝒜, ℬ be classes (proper classes or sets) and a, b, a₁, …, aₙ sets. We define

{a₁, …, aₙ} := { x | x = a₁ ∨ ⋯ ∨ x = aₙ },
∅ := { x | x ≠ x }  (empty class),
V := { x | x = x }  (all class),
𝒜 ⊆ ℬ := ∀x.x ∈ 𝒜 → x ∈ ℬ  (𝒜 is a subclass of ℬ),
𝒜 ⊊ ℬ := 𝒜 ⊆ ℬ ∧ 𝒜 ≠ ℬ  (𝒜 is a proper subclass of ℬ),
𝒜 ∩ ℬ := { x | x ∈ 𝒜 ∧ x ∈ ℬ }  (intersection),
𝒜 ∪ ℬ := { x | x ∈ 𝒜 ∨ x ∈ ℬ }  (union),
𝒜 \ ℬ := { x | x ∈ 𝒜 ∧ x ∉ ℬ }  (difference),
⋃𝒜 := { x | ∃y.y ∈ 𝒜 ∧ x ∈ y }  (big union).
[…] {a, b} and a ∪ b = ⋃{a, b}, and

Axiom (Union). ⋃x is a set.
The union axiom holds in the cumulative type structure. To see this, consider a level S where x is formed. An arbitrary element v ∈ x is then already available at an earlier level S_v. Similarly every element u ∈ v is present at a level S_{v,u} before S_v. But all these u make up ⋃x; hence ⋃x can be formed at level S as well. […] we write 𝒜 : x → y.
2.3. Separation, Power Set, Replacement Axioms.

Axiom (Separation). For every class 𝒜,

𝒜 ⊆ x → ∃y (𝒜 = y).

So the separation scheme says that every subclass 𝒜 of a set x is a set. It is valid in the cumulative type structure, since at the same level where x is formed we can also form the set y whose elements are just the elements of the class 𝒜.

Notice that the separation scheme consists of infinitely many axioms.

Axiom (Power set).

𝒫(x) is a set.

The power set axiom holds in the cumulative type structure. To see this, consider a level S where x is formed. Then also every subset y ⊆ x has been formed at level S. On the next level S′ (which exists by the Shoenfield principle) we can form 𝒫(x).

Explicitly, the power set axiom is ∀x∃y∀z.z ∈ y ↔ z ⊆ x.
Lemma 2.1. a × b is a set.

Proof. We show a × b ⊆ 𝒫(𝒫(a ∪ b)). So let x ∈ a and y ∈ b. Then

{x}, {x, y} ⊆ a ∪ b,
{x}, {x, y} ∈ 𝒫(a ∪ b),
{{x}, {x, y}} ⊆ 𝒫(a ∪ b),
(x, y) = {{x}, {x, y}} ∈ 𝒫(𝒫(a ∪ b)).

The claim now follows from the union axiom, the pairing axiom, the power set axiom and the separation scheme.
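For small finite sets the inclusion a × b ⊆ 𝒫(𝒫(a ∪ b)) can be checked concretely; in the following sketch (an illustration, not from the text) frozensets model sets and (x, y) is the Kuratowski pair:

```python
from itertools import combinations

def pair(x, y):
    """Kuratowski pair (x, y) = {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})

def powerset(s):
    elems = list(s)
    return {frozenset(c) for n in range(len(elems) + 1) for c in combinations(elems, n)}

a, b = frozenset({1, 2}), frozenset({3})
pp = powerset(powerset(a | b))

# every pair with first component in a and second in b lies in P(P(a ∪ b)):
assert all(pair(x, y) in pp for x in a for y in b)
```

The check mirrors the proof line by line: {x} and {x, y} are subsets of a ∪ b, hence elements of 𝒫(a ∪ b), so the pair is an element of 𝒫(𝒫(a ∪ b)).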
Axiom (Replacement). For every class A,
A is a function → ∀x (A[x] is a set).
Also the replacement scheme holds in the cumulative type structure; however, this requires some more thought. Consider all elements u of the set x ∩ dom(A). For every such u we know that A(u) is a set, hence it is formed at a level S_u of the cumulative type structure. Because x ∩ dom(A) is a set, we can imagine a situation where all S_u for u ∈ x ∩ dom(A) are constructed. Hence by the Shoenfield principle there must be a level S coming after all these S_u. In S we can form A[x].
Lemma 2.2. The replacement scheme implies the separation scheme.
Proof. Let A ⊆ x and B := { (u, v) | u = v ∧ u ∈ A }. Then B is a function and we have B[x] = A. □
This does not yet conclude our list of axioms of set theory: later we will require the infinity axiom, the regularity axiom and the axiom of choice.
3. Recursion, Induction, Ordinals
We want to develop a general framework for recursive definitions and inductive proofs. Both will be done by means of so-called well-founded relations. To carry this through, we introduce as an auxiliary notion that of a transitively well-founded relation; later we will see that it is equivalent to the notion of a well-founded relation. We then define the natural numbers in the framework of set theory, and will obtain induction and recursion on the natural numbers as special cases of the corresponding general theorems for transitively well-founded relations. By recursion on natural numbers we can then define the transitive closure of a set, and by means of this notion we will be able to show that the well-founded relations coincide with the transitively well-founded relations.
Then we study particular well-founded relations. We first show that arbitrary classes together with the ∈-relation are up to isomorphism the only well-founded extensional relations (Isomorphy Theorem of Mostowski). Then we consider linear well-founded orderings, called well-orderings. Since they will always be extensional, they must be isomorphic to certain classes with the ∈-relation, which will be called ordinal classes. Ordinals can then be defined as those ordinal classes that happen to be sets.
3.1. Recursion on Transitively Well-Founded Relations. Let A, B, C denote classes. For an arbitrary relation ≺ on A we define:
(a) x̂_≺ := { y | y ≺ x } is the class of ≺-predecessors of x. We shall write x̂ instead of x̂_≺, if ≺ is clear from the context.
(b) B ⊆ A is called ≺-transitive if ∀x. x ∈ B → x̂ ⊆ B. Hence B ⊆ A is ≺-transitive iff y ≺ x and x ∈ B imply y ∈ B.
(c) Let B ⊆ A. x ∈ B is a ≺-minimal element of B if x̂ ∩ B = ∅.
(d) ≺ is a transitively well-founded relation on A if
(i) every nonempty subset of A has a ≺-minimal element, i.e.,
  ∀a. a ⊆ A → a ≠ ∅ → ∃x. x ∈ a ∧ x̂ ∩ a = ∅;
(ii) for every x ∈ A there is a ≺-transitive set b ⊆ A such that x̂ ⊆ b.
We shall almost everywhere omit the prefix ≺, if ≺ is clear from the context.
Remark. Let ≺ be a relation on A. ≺ is a transitive relation on A if for all x, y, z ∈ A
  x ≺ y → y ≺ z → x ≺ z.
We have the following connection to the notion of ≺-transitivity for classes: Let ≺ be a relation on A. Then
  ≺ is a transitive relation on A ↔ for every y ∈ A, ŷ is ≺-transitive.
Proof. →. Let ≺ be a transitive relation on A, y ∈ A and x ∈ ŷ, hence x ≺ y. We must show x̂ ⊆ ŷ. So let z ≺ x. We must show z ≺ y. But this follows from the transitivity of ≺. ←. Let x, y, z ∈ A, x ≺ y and y ≺ z. We must show x ≺ z. We have x ≺ y and y ∈ ẑ. Since ẑ is ≺-transitive, we obtain x ∈ ẑ, hence x ≺ z. □
Lemma 3.1. Let ≺ be a transitively well-founded relation on A. Then
(a) every nonempty subclass B ⊆ A has a ≺-minimal element;
(b) ∀x. x ∈ A → x̂ is a set.
Proof. (a). Let B ⊆ A and z ∈ B. We may assume that z is not B-minimal, i.e., ẑ ∩ B ≠ ∅. By part (ii) of the definition of transitively well-founded relations there exists a ≺-transitive superset b ⊆ A of ẑ. Because of ẑ ∩ B ≠ ∅ we have b ∩ B ≠ ∅. By part (i) of the same definition there exists a ≺-minimal x ∈ b ∩ B, i.e., x̂ ∩ b ∩ B = ∅. Since b is ≺-transitive, from x ∈ b we obtain x̂ ⊆ b. Therefore x̂ ∩ B = ∅ and hence x is a ≺-minimal element of B.
(b). This is a consequence of the separation scheme, since by (ii) x̂ is a subclass of a set b. □
3.2. Induction and Recursion Theorems. We write ∀x∈A. … for ∀x. x ∈ A → … and similarly ∃x∈A. … for ∃x. x ∈ A ∧ ….
Theorem 3.2 (Induction Theorem). Let ≺ be a transitively well-founded relation on A and B an arbitrary class. Then
  ∀x∈A. (x̂ ⊆ B → x ∈ B)
implies A ⊆ B.
Proof. Assume A \ B ≠ ∅. Let x be a ≺-minimal element of A \ B. It suffices to show x̂ ⊆ B, for then by assumption we obtain x ∈ B, hence a contradiction. So let z ≺ x. By the choice of x we have z ∉ A \ B, hence z ∈ B (because z ∈ A holds, since ≺ is a relation on A). □
Theorem 3.3 (Recursion Theorem). Let ≺ be a transitively well-founded relation on A and G: V → V. Then there exists exactly one function F: A → V such that
  ∀x∈A ( F(x) = G(F ↾ x̂) ).
Proof. First observe that for F: A → V we have F ↾ x̂ ⊆ x̂ × F[x̂], hence F ↾ x̂ is a set.
Uniqueness. Given F_1, F_2, consider
  B := { x | x ∈ A → F_1(x) = F_2(x) }.
By the Induction Theorem it suffices to show ∀x∈A. (x̂ ⊆ B → x ∈ B). So let x ∈ A and x̂ ⊆ B. Then
  F_1 ↾ x̂ = F_2 ↾ x̂,
  G(F_1 ↾ x̂) = G(F_2 ↾ x̂),
  F_1(x) = F_2(x),
  x ∈ B.
Existence. Let
  B := { f | f is a function, dom(f) a ≺-transitive subset of A, ∀x∈dom(f) ( f(x) = G(f ↾ x̂) ) }
and
  F := ⋃B.
We first show that
  ∀f,g∈B ∀x ∈ dom(f) ∩ dom(g). f(x) = g(x).
So let f, g ∈ B. We prove the claim by induction on x, i.e., by an application of the Induction Theorem to
  { x | x ∈ dom(f) ∩ dom(g) → f(x) = g(x) }.
So let x ∈ dom(f) ∩ dom(g). Then
  x̂ ⊆ dom(f) ∩ dom(g),   for dom(f), dom(g) are ≺-transitive,
  f ↾ x̂ = g ↾ x̂   by IH,
  G(f ↾ x̂) = G(g ↾ x̂),
  f(x) = g(x).
Therefore F is a function. Now this immediately implies ∀f∈B ∀x∈dom(f). F(x) = f(x); hence we have shown
(40)   F(x) = G(F ↾ x̂)   for all x ∈ dom(F).
We now show
  dom(F) = A.
⊆ is clear. ⊇: Use the Induction Theorem. Let ŷ ⊆ dom(F). We must show y ∈ dom(F). This is proved indirectly; so assume y ∉ dom(F). Let b ⊆ A be ≺-transitive such that ŷ ⊆ b. Define
  g := F ↾ b ∪ { (y, G(F ↾ ŷ)) }.
It clearly suffices to show g ∈ B, for because of y ∈ dom(g) this implies y ∈ dom(F) and hence the desired contradiction.
g is a function: This is clear, since y ∉ dom(F) by assumption.
dom(g) is ≺-transitive: We have dom(g) = (b ∩ dom(F)) ∪ {y}. First notice that dom(F), as a union of ≺-transitive sets, is ≺-transitive itself. Moreover, since b is ≺-transitive, also b ∩ dom(F) is ≺-transitive. Now let z ≺ x and x ∈ dom(g). We must show z ∈ dom(g). In case x ∈ b ∩ dom(F) also z ∈ b ∩ dom(F) (since b ∩ dom(F) is ≺-transitive, as we just observed), hence z ∈ dom(g). In case x = y we have z ∈ ŷ, hence z ∈ b and z ∈ dom(F) by the choice of b and y, hence again z ∈ dom(g).
∀x∈dom(g) ( g(x) = G(g ↾ x̂) ): In case x ∈ b ∩ dom(F) we have
  g(x) = F(x)
       = G(F ↾ x̂)   by (40)
       = G(g ↾ x̂),   since x̂ ⊆ b ∩ dom(F), for b ∩ dom(F) is ≺-transitive.
In case x = y we have g(x) = G(F ↾ x̂) = G(g ↾ x̂), for x̂ = ŷ ⊆ b ∩ dom(F) by the choice of y. □
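On a concrete well-founded relation the defining equation F(x) = G(F ↾ x̂) can be evaluated directly, mirroring the uniqueness argument of the theorem. A sketch in Python (our own illustration; the relation is given by a predecessor function, all names are ours):

```python
def wf_recursion(preds, G):
    """Return the F with F(x) = G(F restricted to the predecessors of x).
    preds(x) lists the y with y ≺ x; the relation must be well-founded,
    which is what guarantees that the recursion terminates."""
    memo = {}
    def F(x):
        if x not in memo:
            memo[x] = G({y: F(y) for y in preds(x)})   # F ↾ x̂ as a dict
        return memo[x]
    return F

# Example: ≺ is "proper divisor of" on positive integers, and
# G(below) = 1 + max of the values below, so F(n) is the length of the
# longest chain of divisors ending in n.
rank = wf_recursion(
    lambda n: [d for d in range(1, n) if n % d == 0],
    lambda below: 1 + max(below.values(), default=0),
)
assert [rank(n) for n in (1, 2, 6, 12)] == [1, 2, 3, 4]
```

The memo table plays the role of the partial functions f ∈ B in the existence proof: it is extended one ≺-transitive step at a time.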
3.3. Natural Numbers. Zermelo defined the natural numbers within set theory as follows: 0 = ∅, 1 = {∅}, 2 = {{∅}}, 3 = {{{∅}}} and so on. A disadvantage of this definition is that it cannot be generalized to the transfinite. Later, John von Neumann proposed to represent the number n by a certain set consisting of exactly n elements, namely
  n := { 0, 1, …, n−1 }.
So 0 = ∅ and n + 1 = { 0, 1, …, n } = { 0, 1, …, n−1 } ∪ {n}. Generally we define
  0 := ∅,   x + 1 := x ∪ {x}.
In particular, 1 := 0 + 1, 2 := 1 + 1, 3 := 2 + 1 and so on.
In order to know that the class of all natural numbers constructed in this way is a set, we need another axiom:
Axiom (Infinity).
  ∃x. ∅ ∈ x ∧ ∀y. y ∈ x → y ∪ {y} ∈ x.
The infinity axiom holds in the cumulative type structure. To see this, observe that 0 = ∅, 1 := {0}, 2 := 1 ∪ {1} and so on are formed at levels S_0, S_1, S_2, …, and we can conceive a situation where all these levels are completed. By the Shoenfield principle there must be a level, call it S_ω, which is past all these levels. At S_ω we can form ω.
We call a class A inductive if
  ∅ ∈ A ∧ ∀y. y ∈ A → y ∪ {y} ∈ A.
So the infinity axiom says that there is an inductive set. Define
  ω := ⋂{ x | x is inductive }.
Clearly ω is a set, with the properties 0 ∈ ω and y ∈ ω → y + 1 ∈ ω. ω is called the set of natural numbers.
Let n, m denote natural numbers. ∀n A(n) is short for ∀x. x ∈ ω → A(x), similarly ∃n A(n) for ∃x. x ∈ ω ∧ A(x) and { n | A(n) } for { x | x ∈ ω ∧ A(x) }.
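The von Neumann coding can be made tangible for small n; a sketch (our illustration, sets as frozensets):

```python
def von_neumann(n):
    """The set n = {0, 1, ..., n-1}, built from 0 = ∅ and x + 1 = x ∪ {x}."""
    x = frozenset()
    for _ in range(n):
        x = x | frozenset({x})   # successor: x + 1 = x ∪ {x}
    return x

three = von_neumann(3)
assert len(three) == 3            # n has exactly n elements
assert von_neumann(2) in three    # 2 ∈ 3: the ∈-relation behaves like <
assert von_neumann(2) <= three    # 2 ⊆ 3: the ⊆-relation behaves like ≤
```

The last two assertions anticipate the observation below that on the natural numbers ∈ has the properties of < and ⊆ those of ≤.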
Theorem 3.4 (Induction on ω).
(a) x ⊆ ω → 0 ∈ x → (∀n. n ∈ x → n + 1 ∈ x) → x = ω.
(b) For every formula A(x),
  A(0) → (∀n. A(n) → A(n + 1)) → ∀n A(n).
Proof. (a). x is inductive, hence ω ⊆ x. (b). Let C := { n | A(n) }. Then C ⊆ ω (so C is a set), and by assumption
  0 ∈ C,
  n ∈ C → n + 1 ∈ C.
By (a), C = ω. □
We now show that for natural numbers the ∈-relation has all the properties of <, and the ⊆-relation all the properties of ≤.
A class A is called transitive if it is ∈-transitive w.r.t. the special relation ∈ := { (x, y) | x ∈ y } on V, i.e., if ∀x. x ∈ A → x̂ ⊆ A. Therefore A is transitive iff
  y ∈ x ∈ A → y ∈ A.
Lemma 3.5. (a) n is transitive. (b) ω is transitive.
Proof. (a). Induction on n. 0 is transitive. n ↦ n + 1: By IH, n is transitive. We must show that n + 1 is transitive. We argue as follows:
  y ∈ x ∈ n + 1
  → y ∈ x ∈ n ∪ {n}
  → y ∈ x ∈ n ∨ y ∈ x = n
  → y ∈ n ∨ y ∈ n
  → y ∈ n ⊆ n + 1.
(b). We show ∀x. x ∈ n → x ∈ ω, by induction on n. 0: Clear. n ↦ n + 1: By IH we have ∀x. x ∈ n → x ∈ ω. So assume x ∈ n + 1. Then x ∈ n ∨ x = n, hence x ∈ ω. □
Lemma 3.6. n ∉ n.
Proof. Induction on n. 0: Clear. n ↦ n + 1: By IH, n ∉ n. Assume
  n + 1 ∈ n + 1,
  n + 1 ∈ n ∨ n + 1 = n,
  n ∈ n + 1 ∈ n ∨ n ∈ n + 1 = n,
  n ∈ n,   for n is transitive by Lemma 3.5.
This is a contradiction to the IH. □
Lemma 3.7. (a) n ⊆ m + 1 ↔ n ⊆ m ∨ n = m + 1.
(b) n ⊆ m ↔ n ∈ m ∨ n = m.
(c) n ∈ m ∨ m ⊆ n.
(d) n ∈ m ∨ n = m ∨ m ∈ n.
Proof. (a). ← follows from m ⊆ m + 1. →: Assume n ⊆ m + 1. Case m ∈ n: We show n = m + 1. ⊆ holds by assumption. ⊇:
  p ∈ m + 1
  → p ∈ m ∨ p = m
  → p ∈ n,
using m ⊆ n (which holds since m ∈ n and n is transitive) in the first case and m ∈ n in the second. Case m ∉ n: We show n ⊆ m.
  p ∈ n
  → p ∈ m + 1
  → p ∈ m ∨ p = m,
but p = m is impossible because of m ∉ n.
(b). ← follows from the transitivity of m. →: Induction on m. 0: Clear. m ↦ m + 1:
  n ⊆ m + 1
  → n ⊆ m ∨ n = m + 1   by (a)
  → n ∈ m ∨ n = m ∨ n = m + 1   by IH
  → n ∈ m + 1 ∨ n = m + 1.
(c). Induction on n. 0: Clear. n ↦ n + 1: Case m ⊆ n: Clear. Case n ∈ m: Then
  n ⊆ m   by (b),
  {n} ∪ n ⊆ m,   i.e.   n + 1 ⊆ m,
  n + 1 ∈ m ∨ n + 1 = m   by (b),
  n + 1 ∈ m ∨ m ⊆ n + 1.
(d). Follows from (c) and (b). □
Theorem 3.8 (Peano Axioms). (a) n + 1 ≠ 0.
(b) n + 1 = m + 1 → n = m.
(c) x ⊆ ω → 0 ∈ x → (∀n. n ∈ x → n + 1 ∈ x) → x = ω.
Proof. (a). Clear. (c). This is Theorem 3.4(a). (b).
  n + 1 = m + 1,
  n ∈ m + 1 ∧ m ∈ n + 1,
  (n ∈ m ∧ m ∈ n) ∨ n = m,
  n ∈ n ∨ n = m,
  n = m,
where the first case is excluded by Lemma 3.6. This concludes the proof. □
We now treat different forms of induction.
Theorem 3.9 (Course-of-values induction on ω).
(a) x ⊆ ω → [∀n. (∀m. m ∈ n → m ∈ x) → n ∈ x] → x = ω.
(b) [∀n. (∀m. m ∈ n → A(m)) → A(n)] → ∀n A(n).
Proof. (b). Assume ∀n. (∀m. m ∈ n → A(m)) → A(n); we shall say in this case that A(n) is progressive. We show ∀m. m ∈ n → A(m), by induction on n. 0: Clear. n ↦ n + 1: By IH, ∀m. m ∈ n → A(m). So let m ∈ n + 1. Then m ∈ n ∨ m = n. In case m ∈ n we obtain A(m) by IH, and in case m = n we can infer A(n) from the progressiveness of A, using the IH.
(a). From (b), with A(y) := y ∈ x. □
Theorem 3.10 (Principle of least element for ω).
(a) ∅ ≠ x → x ⊆ ω → ∃n. n ∈ x ∧ n ∩ x = ∅.
(b) ∃n A(n) → ∃n. A(n) ∧ ∀m. m ∈ n → ¬A(m).
Proof. (b). By Theorem 3.9(b),
  [∀n. (∀m. m ∈ n → ¬A(m)) → ¬A(n)] → ∀n ¬A(n).
Contraposition gives
  ∃n A(n) → ∃n. ¬((∀m. m ∈ n → ¬A(m)) → ¬A(n)),
  ∃n A(n) → ∃n. A(n) ∧ ∀m. m ∈ n → ¬A(m).
(a). From (b), using A(y) := y ∈ x. □
We now consider recursion on natural numbers, which can be treated as a special case of the Recursion Theorem 3.3. To this end, we identify ∈ with the relation { (x, y) | x ∈ y } and prove the following lemma:
Lemma 3.11. ∈ ∩ (ω × ω) is a transitively well-founded relation on ω.
Proof. We show both conditions from the definition of transitively well-founded relations. (i). Let ∅ ≠ a ⊆ ω. We must show ∃n. n ∈ a ∧ n ∩ a = ∅. But this is the above principle of the least element. (ii). Clear, since n is transitive. □
Theorem 3.12 (Course-of-values recursion on ω). Let G: V → V. Then there is exactly one function f: ω → V such that
  ∀n ( f(n) = G(f ↾ n) ).
Proof. By the Recursion Theorem 3.3 there is a unique F: ω → V such that ∀n ( F(n) = G(F ↾ n) ). By replacement, rng(F) = F[ω] is a set. By Lemma 2.1 and separation, also F ⊆ ω × F[ω] is a set; hence we can take f := F. □
Corollary 3.13. Let G: V → V and a be a set. Then there is exactly one function f: ω → V such that
  f(0) = a,
  ∀n ( f(n + 1) = G(f(n)) ).
Proof. First observe that ⋃(n + 1) = n, because of
  x ∈ ⋃(n + 1) ↔ ∃y. x ∈ y ∈ n + 1
               ↔ ∃m. x ∈ m ∈ n + 1
               ↔ ∃m. x ∈ m ⊆ n
               ↔ x ∈ n.
For the given G we will construct a G′ such that G′(f ↾ (n + 1)) = G(f(n)). We define a function G′: V → V satisfying
  G′(x) = G(x(⋃dom(x)))   if x ≠ ∅,
  G′(x) = a               if x = ∅,
by
  G′ := { (x, y) | (x ≠ ∅ ∧ y = G(x(⋃dom(x)))) ∨ (x = ∅ ∧ y = a) }.
Then there is a unique function f: ω → V such that f(n) = G′(f ↾ n) for all n, and it satisfies
  f(n + 1) = G′(f ↾ (n + 1))
           = G( (f ↾ (n + 1))(⋃(n + 1)) )   [note ⋃(n + 1) = n]
           = G(f(n)),
  f(0) = G′(f ↾ 0) = G′(∅) = a.
This concludes the proof. □
We now define
  s_m(0) := m,   s_m(n + 1) := s_m(n) + 1.
By Corollary 3.13, for every m there is such a function, and it is uniquely determined. We define
  m + n := s_m(n).
Because of s_m(1) = s_m(0 + 1) = s_m(0) + 1 = m + 1, for n = 1 this definition is compatible with the previous terminology. Moreover, we have m + 0 = m and m + (n + 1) = (m + n) + 1.
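The iteration s_m can be executed on the frozenset coding; a sketch (our illustration, self-contained):

```python
def numeral(n):
    """Von Neumann numeral: 0 = ∅, x + 1 = x ∪ {x}."""
    x = frozenset()
    for _ in range(n):
        x = x | frozenset({x})
    return x

def succ(x):
    return x | frozenset({x})

def add(m, n):
    """m + n := s_m(n), via s_m(0) = m, s_m(k + 1) = s_m(k) + 1.
    Iterating |n| times suffices, since the numeral n has n elements."""
    result = m
    for _ in range(len(n)):
        result = succ(result)
    return result

assert add(numeral(2), numeral(3)) == numeral(5)
```

Note that equality of frozensets is extensional equality, so 2 + 3 and 5 really denote the same set.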
Lemma 3.14. (a) m + n ∈ ω.
(b) (m + n) + p = m + (n + p).
(c) m + n = n + m.
Proof. (a). Induction on n. 0: Clear. n ↦ n + 1: m + (n + 1) = (m + n) + 1, and by IH m + n ∈ ω.
(b). Induction on p. 0: Clear. p ↦ p + 1:
  (m + n) + (p + 1) = [(m + n) + p] + 1   by definition
                    = [m + (n + p)] + 1   by IH
                    = m + [(n + p) + 1]
                    = m + [n + (p + 1)].
(c). We first prove two auxiliary propositions.
(i) 0 + n = n. The proof is by induction on n. 0: Clear. n ↦ n + 1: 0 + (n + 1) = (0 + n) + 1 = n + 1.
(ii) (m + 1) + n = (m + n) + 1. Again the proof is by induction on n. 0: Clear. n ↦ n + 1:
  (m + 1) + (n + 1) = [(m + 1) + n] + 1
                    = [(m + n) + 1] + 1   by IH
                    = [m + (n + 1)] + 1.
Now the claim m + n = n + m can be proved by induction on m. 0: By (i). Step m ↦ m + 1:
  (m + 1) + n = (m + n) + 1   by (ii)
              = (n + m) + 1   by IH
              = n + (m + 1).
This concludes the proof. □
We define
  p_m(0) := 0,   p_m(n + 1) := p_m(n) + m.
By Corollary 3.13, for every m there is a unique such function. Here we need
  G: V → V,   G(x) := x + m if x ∈ ω, and G(x) := ∅ otherwise.
We finally define m · n := p_m(n). Observe that this implies m · 0 = 0 and m · (n + 1) = m · n + m.
Lemma 3.15. (a) m · n ∈ ω.
(b) m · (n + p) = m · n + m · p.
(c) (n + p) · m = n · m + p · m.
(d) (m · n) · p = m · (n · p).
(e) 0 · n = 0, 1 · n = n, m · n = n · m.
Proof. Exercise. □
Remark. Exponentiation n^m and subtraction m − n can be treated similarly; later (when we deal with ordinal arithmetic) this will be done more generally. We could now introduce the integers, rationals, reals and complex numbers in the well-known way, and prove their elementary properties.
3.4. Transitive Closure. We define the transitive closure of a set a, w.r.t. a relation ≺ with the property that the predecessors of an arbitrary element of its domain form a set.
Theorem 3.16. Let ≺ be a relation on A such that x̂_≺ (:= { y | y ≺ x }) is a set, for every x ∈ A. Then for every subset a ⊆ A there is a uniquely determined set b such that
(a) a ⊆ b ⊆ A;
(b) b is ≺-transitive;
(c) ∀c. a ⊆ c ⊆ A → c ≺-transitive → b ⊆ c.
b is called the ≺-transitive closure of a.
Proof. Uniqueness: Clear by (c). Existence: We shall define f: ω → V by recursion on ω, such that
  f(0) = a,
  f(n + 1) = { y | ∃x∈f(n) (y ≺ x) }.
In order to apply the Recursion Theorem for ω, we must define f(n + 1) in the form G(f(n)). To this end choose G: V → V with
  G(z) := ⋃{ x̂_≺ | x ∈ z };
this is a set, by the assumption on ≺, replacement and the union axiom. We now define
  b := ⋃rng(f) = ⋃{ f(n) | n ∈ ω }.
Then:
(a). a = f(0) ⊆ b ⊆ A.
(b).
  y ≺ x ∈ b
  → y ≺ x ∈ f(n)   for some n
  → y ∈ f(n + 1)
  → y ∈ b.
(c). Let a ⊆ c ⊆ A and c be ≺-transitive. We show f(n) ⊆ c by induction on n. 0: a ⊆ c. n ↦ n + 1:
  y ∈ f(n + 1)
  → y ≺ x ∈ f(n)   for some x
  → y ≺ x ∈ c
  → y ∈ c.
This concludes the proof. □
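For a relation on a finite (or at least well-founded, finitely-branching) domain the levels f(n) of this proof can be computed literally; a sketch (our illustration):

```python
def transitive_closure(a, preds):
    """Smallest ≺-transitive superset b of a, following the proof:
    f(0) = a, f(n+1) = {y | y ≺ x for some x ∈ f(n)}, b = ⋃ f(n).
    preds(x) must return the collection of ≺-predecessors of x;
    termination requires that only finitely many levels are nonempty."""
    b, frontier = set(a), set(a)
    while frontier:
        frontier = {y for x in frontier for y in preds(x)} - b
        b |= frontier
    return b

# Example with ≺ = ∈ on hereditarily finite sets coded as frozensets:
zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
assert transitive_closure({two}, lambda x: x) == {zero, one, two}
```

With preds(x) = x this computes the usual transitive closure of a set, as in the special case of the ∈-relation discussed next.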
In the special case of the ∈-relation on V, the condition that x̂ = { y | y ∈ x } is a set clearly holds for every x. Hence for every set a there is a uniquely determined ∈-transitive closure of a; it is called the transitive closure of a.
By means of the notion of the transitive closure we can now show that the transitively well-founded relations on A coincide with the well-founded relations on A.
Let ≺ be a relation on A. ≺ is a well-founded relation on A if
(a) every nonempty subset of A has a ≺-minimal element, i.e.,
  ∀a. a ⊆ A → a ≠ ∅ → ∃x∈a. x̂ ∩ a = ∅;
(b) for every x ∈ A, x̂ is a set.
Theorem 3.17. The transitively well-founded relations on A are the same as the well-founded relations on A.
Proof. Every transitively well-founded relation on A is well-founded by Lemma 3.1(b). Conversely, every well-founded relation on A is transitively well-founded, since for every x ∈ A the ≺-transitive closure of x̂ is a ≺-transitive b ⊆ A such that x̂ ⊆ b. □
Therefore the Induction Theorem 3.2 and the Recursion Theorem 3.3 also hold for well-founded relations. Moreover, by Lemma 3.1(a), for a well-founded relation ≺ on A every nonempty subclass of A has a ≺-minimal element.
Later we will require the so-called regularity axiom, which says that the ∈-relation on V is well-founded, i.e.,
  ∀a. a ≠ ∅ → ∃x∈a. x ∩ a = ∅.
This will provide us with an important example of a well-founded relation.
We now consider extensional well-founded relations. From the regularity axiom it will follow that the ∈-relation on an arbitrary class A is a well-founded extensional relation. Here we show, even without the regularity axiom, the converse, namely that every well-founded extensional relation is isomorphic to the ∈-relation on a transitive class. This is Mostowski's Isomorphy Theorem. Then we consider linear well-founded orderings, well-orderings for short. They are always extensional, and hence isomorphic to the ∈-relation on certain classes, which will be called ordinal classes. Ordinals will then be defined as ordinal sets.
A relation ≺ on A is extensional if for all x, y ∈ A
  (∀z∈A. z ≺ x ↔ z ≺ y) → x = y.
For example, for a transitive class A the relation ∈ ∩ (A × A) is extensional on A. This can be seen as follows. Let x, y ∈ A. For ≺ := ∈ ∩ (A × A) we have z ≺ x ↔ z ∈ x, since A is transitive. We obtain
  ∀z∈A. z ≺ x ↔ z ≺ y
  → ∀z. z ∈ x ↔ z ∈ y
  → x = y.
From the regularity axiom it will follow that all these relations are well-founded. But even without the regularity axiom these relations have a distinguished meaning; cf. Corollary 3.19.
Theorem 3.18 (Isomorphy Theorem of Mostowski). Let ≺ be a well-founded extensional relation on A. Then there is a unique isomorphism F of A onto a transitive class B, i.e.,
  ∃!F. F: A → rng(F) bijective ∧ rng(F) transitive ∧ ∀x,y∈A (y ≺ x ↔ F(y) ∈ F(x)).
Proof. Existence. We define by the Recursion Theorem
  F: A → V,   F(x) = rng(F ↾ x̂)   (= { F(y) | y ≺ x }).
F is injective: We show ∀x,y∈A. F(x) = F(y) → x = y, by induction on x. So let x, y ∈ A be given such that F(x) = F(y). By IH,
  ∀z∈A. z ≺ x → ∀u∈A. F(z) = F(u) → z = u.
It suffices to show z ≺ x ↔ z ≺ y, for all z ∈ A. →:
  z ≺ x
  → F(z) ∈ F(x) = F(y) = { F(u) | u ≺ y }
  → F(z) = F(u)   for some u ≺ y
  → z = u   by IH, since z ≺ x
  → z ≺ y.
←:
  z ≺ y
  → F(z) ∈ F(y) = F(x) = { F(u) | u ≺ x }
  → F(z) = F(u)   for some u ≺ x
  → z = u   by IH, since u ≺ x
  → z ≺ x.
rng(F) is transitive: Assume u ∈ v ∈ rng(F). Then v = F(x) for some x ∈ A, hence u = F(y) for some y ≺ x.
y ≺ x ↔ F(y) ∈ F(x): →: Assume y ≺ x. Then F(y) ∈ F(x) by definition of F. ←:
  F(y) ∈ F(x) = { F(z) | z ≺ x }
  → F(y) = F(z)   for some z ≺ x
  → y = z,   since F is injective
  → y ≺ x.
Uniqueness. Let F_i (i = 1, 2) be two isomorphisms as described in the theorem. We show ∀x∈A (F_1(x) = F_2(x)), by induction on x. By symmetry it suffices to prove u ∈ F_1(x) → u ∈ F_2(x).
  u ∈ F_1(x)
  → u = F_1(y)   for some y ∈ A, since rng(F_1) is transitive
  → y ≺ x,   by the isomorphy condition for F_1
  → u = F_2(y)   by IH
  → F_2(y) ∈ F_2(x),   by the isomorphy condition for F_2
  → u ∈ F_2(x).
This concludes the proof. □
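The recursion F(x) = {F(y) | y ≺ x} in this proof is directly executable on a finite well-founded extensional relation; a sketch (our illustration):

```python
def mostowski(domain, preds):
    """F(x) = {F(y) | y ≺ x}: the Mostowski collapse. For a well-founded
    extensional relation this F is injective and its range is a
    transitive class (Theorem 3.18)."""
    memo = {}
    def F(x):
        if x not in memo:
            memo[x] = frozenset(F(y) for y in preds(x))
        return memo[x]
    return {x: F(x) for x in domain}

# ≺ = < on {10, 20, 30} is well-founded and extensional; its collapse
# is the ordinal 3 = {0, 1, 2} in von Neumann coding.
coll = mostowski([10, 20, 30], lambda x: [y for y in (10, 20, 30) if y < x])
zero = frozenset()
assert coll[10] == zero
assert coll[20] == frozenset({zero})
assert coll[30] == frozenset({zero, frozenset({zero})})
```

This also previews the next step of the text: a well-ordering collapses to an ordinal.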
A relation ≺ on A is a linear ordering if for all x, y, z ∈ A
  ¬(x ≺ x)   irreflexivity,
  x ≺ y → y ≺ z → x ≺ z   transitivity,
  x ≺ y ∨ x = y ∨ y ≺ x   trichotomy (or comparability).
≺ is a well-ordering if ≺ is a well-founded linear ordering.
Remark. Every well-ordering ≺ on A is extensional. To see this, assume
  ∀z∈A. z ≺ x ↔ z ≺ y.
Then x = y by trichotomy, since from x ≺ y we obtain by assumption x ≺ x, contradicting irreflexivity, and similarly y ≺ x entails a contradiction.
Corollary 3.19. For every well-ordering ≺ on A there is a unique isomorphism F of A onto a transitive class B.
3.5. Ordinal Classes and Ordinals. We now study more closely the transitive classes that appear as images of well-orderings.
A is an ordinal class if A is transitive and ∈ ∩ (A × A) is a well-ordering on A. Ordinal classes that happen to be sets are called ordinals. Define
  On := { x | x is an ordinal }.
First we give a convenient characterization of ordinal classes. A is called connex if for all x, y ∈ A
  x ∈ y ∨ x = y ∨ y ∈ x.
For instance, ω is connex by Lemma 3.7(d). Also every n ∈ ω is connex, since by Lemma 3.5(b) ω is transitive.
A class A is well-founded if ∈ ∩ (A × A) is a well-founded relation on A, i.e., if
  ∀a. a ⊆ A → a ≠ ∅ → ∃x∈a. x ∩ a = ∅.
We now show that in well-founded classes there can be no finite ∈-cycles.
Lemma 3.20. Let A be well-founded. Then for arbitrary x_1, …, x_n ∈ A we can never have
  x_1 ∈ x_2 ∈ … ∈ x_n ∈ x_1.
Proof. Assume x_1 ∈ x_2 ∈ … ∈ x_n ∈ x_1. Consider { x_1, …, x_n }. Since A is well-founded we may assume, say, x_1 ∩ { x_1, …, x_n } = ∅. But this contradicts x_n ∈ x_1. □
Corollary 3.21. A is an ordinal class iff A is transitive, connex and well-founded.
Proof. → is clear; A is connex because of trichotomy. ←: We must show, for all x, y, z ∈ A,
  x ∉ x,
  x ∈ y → y ∈ z → x ∈ z.
Since A is connex, both propositions follow from Lemma 3.20. □
Here are some examples of ordinals. ω is transitive by Lemma 3.5(b), connex as noted above, and well-founded by the principle of least element (Theorem 3.10). So ω is an ordinal class; since by the infinity axiom ω is a set, ω is even an ordinal. Also, every n is transitive by Lemma 3.5(a), connex (see above) and well-founded; the latter follows, using the transitivity of ω, from the principle of least element (Theorem 3.10).
Let us write Ord(A) for "A is an ordinal class". We now show that ordinal classes have properties similar to those of natural numbers: the ∈-relation has the properties of <, and the ⊆-relation has the properties of ≤.
Lemma 3.22. (a) Ord(A) → Ord(B) → Ord(A ∩ B).
(b) Ord(A) → x ∈ A → Ord(x).
(c) Ord(A) → Ord(B) → (A ⊆ B ↔ A ∈ B ∨ A = B).
(d) Ord(A) → Ord(B) → (A ∈ B ∨ A = B ∨ B ∈ A).
Proof. (a). A ∩ B is transitive:
  x ∈ y ∈ A ∩ B
  → x ∈ y ∈ A and x ∈ y ∈ B
  → x ∈ A and x ∈ B
  → x ∈ A ∩ B.
A ∩ B connex, well-founded: Clear.
(b). x is transitive:
  u ∈ v ∈ x ∈ A
  → u ∈ v ∈ A
  → u ∈ A
  → u ∈ x ∨ u = x ∨ x ∈ u.
From u = x it follows that u ∈ v ∈ u, contradicting Lemma 3.20, and from x ∈ u it follows that u ∈ v ∈ x ∈ u, again contradicting Lemma 3.20. Hence u ∈ x.
x connex, well-founded: Clear, for x ⊆ A.
(c). ←: Clear, for B is transitive. →: Let A ⊆ B; wlog A ≠ B. Choose x ∈ B \ A such that x ∩ (B \ A) = ∅ (this is possible, since B is well-founded); it suffices to show that x = A.
x ⊆ A: Assume y ∈ x, hence y ∈ x ∈ B. Then y ∈ A, for x ∩ (B \ A) = ∅.
A ⊆ x: Assume y ∈ A. Then also y ∈ B. It follows that x ∈ y ∨ x = y ∨ y ∈ x. But the first two cases are impossible, for in both of them we obtain x ∈ A.
(d). Assume Ord(A) and Ord(B). Then by (a), Ord(A ∩ B). Using (c) we obtain
  [(A ∩ B ∈ A) ∨ (A ∩ B = A)] ∧ [(A ∩ B ∈ B) ∨ (A ∩ B = B)].
Distributing yields
  (A ∩ B ∈ A ∩ B) ∨ A ∈ B ∨ B ∈ A ∨ A = B.
But the first case, A ∩ B ∈ A ∩ B, is impossible by Lemma 3.20. □
Lemma 3.23. (a) Ord(On).
(b) On is not a set.
(c) On is the only proper ordinal class.
Proof. (a). On is transitive by Lemma 3.22(b), and it is connex by Lemma 3.22(d). On is well-founded: Let a ⊆ On, a ≠ ∅. Choose x ∈ a; wlog x ∩ a ≠ ∅. Since x is well-founded, there is a y ∈ x ∩ a such that y ∩ x ∩ a = ∅. It follows that y ∈ a and y ∩ a = ∅; the latter holds since y ⊆ x, because of y ∈ x and x transitive.
(b). Assume On is a set. Then On ∈ On, contradicting Lemma 3.20.
(c). Let Ord(A), A not a set. By Lemma 3.22(d),
  A ∈ On ∨ A = On ∨ On ∈ A.
The first and the last case are excluded, for then A (or On, respectively) would be a set. □
Lemma 3.24. (a) On is inductive.
(b) n ∈ On, ω ∈ On.
Proof. (a). 0 ∈ On is clear. So let x ∈ On. We must show x + 1 ∈ On, that is x ∪ {x} ∈ On.
x ∪ {x} is transitive: Assume u ∈ v ∈ x ∪ {x}, so u ∈ v ∈ x or u ∈ v = x. In both cases it follows that u ∈ x.
x ∪ {x} is connex: Assume u, v ∈ x ∪ {x}. Then
  (u, v ∈ x) ∨ (u ∈ x ∧ v = x) ∨ (u = x ∧ v ∈ x) ∨ (u = v = x),
hence u ∈ v ∨ u = v ∨ v ∈ u.
x ∪ {x} is well-founded: Let a ⊆ x ∪ {x}, a ≠ ∅. We must show ∃y∈a (y ∩ a = ∅). Case a ∩ x ≠ ∅: Then the claim follows from the well-foundedness of x. Case a ∩ x = ∅: Then a = {x}, and we have x ∩ a = ∅.
(b). This has been proved above, after Corollary 3.21. □
Lemma 3.25. x, y ∈ On → x + 1 = y + 1 → x = y.
Proof. The proof is similar to the proof of the second Peano axiom in Theorem 3.8(b).
  x + 1 = y + 1,
  x ∈ y + 1 ∧ y ∈ x + 1,
  (x ∈ y ∧ y ∈ x) ∨ x = y.
Since the first case is impossible by Lemma 3.22, we have x = y. □
Lemma 3.26. A ⊆ On → ⋃A ∈ On ∨ ⋃A = On.
Proof. It suffices to show Ord(⋃A). ⋃A is transitive: Let x ∈ y ∈ ⋃A, say y ∈ z ∈ A. Since z is transitive, x ∈ z, hence x ∈ ⋃A. Moreover ⋃A ⊆ On: So let x ∈ ⋃A, hence x ∈ y ∈ A for some y. Then x ∈ y and y ∈ On, so x ∈ On. As a transitive subclass of On, ⋃A is also connex and well-founded, hence Ord(⋃A). If ⋃A is a set, then ⋃A ∈ On; otherwise ⋃A = On by Lemma 3.23(c). □
Remark. If A ⊆ On, then for ⋃A we have
  x ∈ A → x ⊆ ⋃A,
  (∀x∈A. x ⊆ y) → ⋃A ⊆ y.
We therefore also write sup A for ⋃A.
Here are some examples of ordinals:
  0, 1 = 0 + 1, 2 = 1 + 1, …,
  ω   (a set by the infinity axiom),
  ω + 1, ω + 2, …,
  ω·2 := ⋃{ ω + n | n ∈ ω }   (with ω + n defined by recursion on ω),
  ω·2 + 1, ω·2 + 2, …,
  ω·3 := ⋃{ ω·2 + n | n ∈ ω }, …,
  ω·4, …,
  ω² := ω·ω := ⋃{ ω·n | n ∈ ω },
  ω² + 1, ω² + 2, …,
  ω² + ω, ω² + ω + 1, ω² + ω + 2, …,
  ω² + ω·2, …, ω² + ω·3, …,
  ω³, …, ω⁴, …,
  ω^ω, ω^ω + 1, …,
and so on.
In the following, α, β, γ will denote ordinals.
α is a successor number if ∃β (α = β + 1). α is a limit if α is neither 0 nor a successor number. We write
  Lim(α)   for   α ≠ 0 ∧ ¬∃β (α = β + 1).
Clearly for arbitrary α either α = 0 or α is a successor number or α is a limit.
Lemma 3.27. (a) Lim(α) ↔ α ≠ 0 ∧ ∀β. β ∈ α → β + 1 ∈ α.
(b) Lim(ω).
(c) Lim(α) → ω ⊆ α.
Proof. (a). →: Let β ∈ α. Then β + 1 ∈ α ∨ β + 1 = α ∨ α ∈ β + 1. The second case β + 1 = α is excluded by assumption. In the third case it follows that α ∈ β ∨ α = β; but because of β ∈ α both are impossible by Lemma 3.20. ←: Let α ≠ 0 and assume ∀β. β ∈ α → β + 1 ∈ α. Then if α is not a limit, we must have α = β + 1 for some β. Then we obtain β ∈ α, hence by assumption also β + 1 ∈ α and hence α ∈ α, which is impossible.
(b). Follows from (a), since ω is inductive.
(c). Assume Lim(α). We show n ∈ α by induction on n. 0: We have 0 ∈ α ∨ 0 = α ∨ α ∈ 0, where the cases two and three clearly are impossible. n ↦ n + 1: We have n ∈ α by IH, hence n + 1 ∈ α by (a). □
Lemma 3.28. (a) α = ⋃(α + 1).
(b) For limits λ we have λ = ⋃λ.
Proof. (a). ⊆: Let β ∈ α. The claim follows from β ∈ α ∈ α + 1. ⊇: Let β ∈ γ ∈ α + 1. Then γ ⊆ α, hence β ∈ α.
(b). ⊆: Let β ∈ λ. Then β ∈ β + 1 ∈ λ, by Lemma 3.27(a). ⊇: Let β ∈ γ ∈ λ. We obtain β ∈ λ, since λ is transitive. □
Theorem 3.29 (Transfinite induction on On; class form).
  (∀α. α ⊆ B → α ∈ B) → On ⊆ B.
Proof. This is a special case of the Induction Theorem 3.2. □
Corollary 3.30 (Different forms of transfinite induction on On). First form:
  A(0) → (∀α. A(α) → A(α + 1))
       → (∀α. Lim(α) → (∀β. β ∈ α → A(β)) → A(α))
       → ∀α A(α).
Second form (transfinite induction on On, using all predecessors):
  [∀α. (∀β. β ∈ α → A(β)) → A(α)] → ∀α A(α).
Third form (principle of least element for On):
  ∃α A(α) → ∃α. A(α) ∧ ∀β. β ∈ α → ¬A(β).
Proof. The third form follows from the second by contraposition. Also, the first form follows easily from the second. The second form follows from Theorem 3.29, using B := { α | A(α) }. □
Theorem 3.31 (Transfinite recursion on On). Let G: V → V. Then there is exactly one function F: On → V such that for all α
  F(α) = G(F ↾ α).
Proof. This is a special case of the Recursion Theorem 3.3. □
Corollary 3.32. Assume G: V → V, H: V → V and a is a set. Then there is a unique function F: On → V such that
  F(0) = a,
  F(α + 1) = G(F(α)),
  F(λ) = H(F ↾ λ)   for λ a limit.
Proof. First observe that ⋃(α + 1) = α, because of
  β ∈ ⋃(α + 1) ↔ ∃γ. β ∈ γ ∈ α + 1 ↔ β ∈ α.
For given a, G and H we shall find a G′ such that
  G′(∅) = a,
  G′(F ↾ (α + 1)) = G(F(α)),
  G′(F ↾ λ) = H(F ↾ λ)   for λ a limit.
We define a function G′: V → V by
  G′(x) := G(x(⋃dom(x)))   if dom(x) is a successor ordinal;
  G′(x) := H(x)            if dom(x) is a limit;
  G′(x) := a               otherwise.
Then the function F provided by Theorem 3.31 for this G′ satisfies the claim, since F ↾ (α + 1) has domain α + 1 and (F ↾ (α + 1))(⋃(α + 1)) = F(α). □
We now define the von Neumann levels V_α as follows, by transfinite recursion on On:
  V_0 = ∅,
  V_{α+1} = P(V_α),
  V_λ = ⋃_{β∈λ} V_β   for λ a limit.
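The finite levels V_n of this hierarchy can be computed explicitly; a sketch (our illustration, sets modelled as frozensets):

```python
from itertools import combinations

def powerset(s):
    """P(s) as a frozenset of frozensets."""
    items = list(s)
    return frozenset(
        frozenset(c) for r in range(len(items) + 1)
        for c in combinations(items, r)
    )

# V_0 = ∅, V_{n+1} = P(V_n): the first von Neumann levels.
V = [frozenset()]
for _ in range(4):
    V.append(powerset(V[-1]))

assert [len(v) for v in V] == [0, 1, 2, 4, 16]   # |V_{n+1}| = 2^|V_n|
assert all(V[n] <= V[n + 1] for n in range(4))   # V_n ⊆ V_{n+1}
```

The two assertions illustrate the growth of the hierarchy and parts (a) and (c) of the lemma below (transitivity and monotonicity).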
Remark. More precisely, the V_α are obtained from Corollary 3.32, applied with a := ∅, G := P and H(x) := ⋃rng(x).
Lemma 3.33. (a) V_α is transitive.
(b) α ∈ β → V_α ∈ V_β.
(c) α ⊆ β → V_α ⊆ V_β.
(d) V_α ∩ On = α.
Proof. (a). (Transfinite) induction on α. 0: ∅ is transitive. α + 1:
  x ∈ y ∈ V_{α+1} = P(V_α)
  → x ∈ y ⊆ V_α
  → x ∈ V_α
  → x ⊆ V_α   by IH
  → x ∈ V_{α+1}.
λ limit:
  x ∈ y ∈ V_λ = ⋃_{β∈λ} V_β
  → x ∈ y ∈ V_β   for some β ∈ λ
  → x ∈ V_β   by IH
  → x ∈ V_λ.
(b). Induction on β. 0: Clear. β + 1:
  α ∈ β + 1
  → α ∈ β ∨ α = β
  → V_α ∈ V_β ∨ V_α = V_β   by IH
  → V_α ⊆ V_β   by (a)
  → V_α ∈ P(V_β) = V_{β+1}.
λ limit:
  α ∈ λ
  → α + 1 ∈ λ
  → V_α ∈ V_{α+1} ⊆ ⋃_{β∈λ} V_β = V_λ.
(c). Using Lemma 3.22(c) (⊆ is ∈ or =), the claim follows from (a) and (b).
(d). Induction on α. 0: Clear. α + 1:
  V_{α+1} ∩ On = { x ∈ On | x ⊆ V_α }
               = { x ∈ On | x ⊆ V_α ∩ On }   (the elements of an ordinal are ordinals)
               = { x ∈ On | x ⊆ α }   by IH
               = α + 1   by Lemma 3.22(c).
λ limit:
  V_λ ∩ On = (⋃_{β∈λ} V_β) ∩ On
           = ⋃_{β∈λ} (V_β ∩ On)
           = ⋃_{β∈λ} β   by IH
           = λ.
This concludes the proof. □
We now show that the von Neumann levels exhaust the universe, which means that V = ⋃_{α∈On} V_α. Here the regularity axiom is used: it makes ∈ a well-founded relation on V, so that by the Recursion Theorem we can define the rank of a set,
  rn(x) := ⋃{ rn(y) + 1 | y ∈ x }.
Lemma 3.34. (a) rn(x) ∈ On.
(b) x ⊆ V_{rn(x)}.
(c) x ∈ V_α → rn(x) ⊆ α.
Proof. (a). Induction on x. We have rn(x) = ⋃{ rn(y) + 1 | y ∈ x } ∈ On, for by IH rn(y) ∈ On for every y ∈ x.
(b). Induction on x. Let y ∈ x. Then y ⊆ V_{rn(y)} by IH, hence
  y ∈ P(V_{rn(y)}) = V_{rn(y)+1} ⊆ V_{rn(x)}
because of rn(y) + 1 ⊆ rn(x).
(c). Induction on α. Let x ∈ V_α; since rn(x) = ⋃{ rn(y) + 1 | y ∈ x }, it suffices to show rn(y) + 1 ⊆ α for every y ∈ x. So let y ∈ x. Because of x ∈ V_α we have x ∈ V_{β+1} = P(V_β) for some β with β + 1 ⊆ α (take β with α = β + 1 if α is a successor, and use V_λ = ⋃_{β∈λ} V_β if α is a limit), hence y ∈ V_β. By IH it follows that rn(y) ⊆ β, whence rn(y) + 1 ⊆ α. □
Now we obtain easily the proposition formulated above as our goal.
Corollary 3.35. V = ⋃_{α∈On} V_α.
Proof. ⊇ is clear. ⊆: For every x we have x ⊆ V_{rn(x)} by Lemma 3.34(b), hence x ∈ V_{rn(x)+1}. □
Lemma 3.36. V_α = { x | rn(x) ∈ α }.
Proof. ⊇: Let rn(x) ∈ α. Then x ⊆ V_{rn(x)} implies x ∈ V_{rn(x)+1} ⊆ V_α.
⊆: Induction on α. Case 0: Clear. Case α + 1: Let x ∈ V_{α+1}. Then x ∈ P(V_α), hence x ⊆ V_α, hence by IH rn(y) ∈ α for every y ∈ x, and therefore rn(x) = ⋃{ rn(y) + 1 | y ∈ x } ⊆ α; so rn(x) ∈ α + 1 by Lemma 3.22(c).
Case λ limit: Let x ∈ V_λ. Then x ∈ V_β for some β ∈ λ, hence by IH rn(x) ∈ β ∈ λ, so rn(x) ∈ λ. □
Moreover we can show that the sets α and V_α have rank α.
Lemma 3.37. (a) rn(α) = α. (b) rn(V_α) = α.
Proof. (a). Induction on α. We have rn(α) = ⋃{ rn(β) + 1 | β ∈ α }, hence by IH rn(α) = ⋃{ β + 1 | β ∈ α } = α by Lemma 3.28(a).
(b). We have
  rn(V_α) = ⋃{ rn(x) + 1 | x ∈ V_α } = ⋃{ β + 1 | β ∈ α } = α,
using Lemma 3.36 together with part (a), and again Lemma 3.28(a). This completes the proof. □
We finally show that a class A is a set if and only if the ranks of its elements can be bounded by an ordinal.
Lemma 3.39. A is a set iff there is an α such that ∀y∈A (rn(y) ∈ α).
Proof. →: Let A = x. From Lemma 3.37(a) we obtain that rn(x) is the α we need.
←: Assume rn(y) ∈ α for all y ∈ A. Then A ⊆ { y | rn(y) ∈ α } = V_α. □
4. Cardinals
We now introduce cardinals and develop their basic properties.
4.1. Size Comparison Between Sets. Define
  |a| ≤ |b| :↔ ∃f (f: a → b ∧ f injective),
  |a| = |b| :↔ ∃f (f: a → b ∧ f bijective),
  |a| < |b| :↔ |a| ≤ |b| ∧ |a| ≠ |b|,
  ^b a := { f | f: b → a }.
Two sets a and b are equinumerous if |a| = |b|. Notice that we did not define |a|, but only the relations |a| ≤ |b|, |a| = |b| and |a| < |b|.
The following properties are clear:
  |a × b| = |b × a|;
  |^a(^b c)| = |^(a×b) c|;
  |P(a)| = |^a{0, 1}|.
Theorem 4.1 (Cantor). |a| < |P(a)|.
Proof. Clearly f: a → P(a), x ↦ {x}, is injective. Assume that we have a bijection g: a → P(a). Consider
  b := { x | x ∈ a ∧ x ∉ g(x) }.
Then b ⊆ a, hence b = g(x_0) for some x_0 ∈ a. It follows that x_0 ∈ g(x_0) ↔ x_0 ∉ g(x_0), and hence a contradiction. □
Theorem 4.2 (Cantor, Bernstein). If a ⊆ b ⊆ c and |a| = |c|, then |b| = |c|.
Proof. Let f: c → a be bijective and r := c \ b. We recursively define g: ω → V by
  g(0) = r,
  g(n + 1) = f[g(n)].
Let
  r̄ := ⋃_{n∈ω} g(n)
and define i: c → b by
  i(x) := f(x)   if x ∈ r̄,
  i(x) := x      if x ∉ r̄.
It suffices to show that (a) rng(i) = b and (b) i is injective. Ad (a): Let x ∈ b. We must show x ∈ rng(i). Wlog let x ∈ r̄. Because of x ∈ b we then have x ∉ g(0). Hence there is an n such that x ∈ g(n + 1) = f[g(n)], so x = f(y) = i(y) for some y ∈ r̄. Ad (b): Let x ≠ y. Wlog x ∈ r̄, y ∉ r̄. But then i(x) ∈ r̄, i(y) ∉ r̄, hence i(x) ≠ i(y). □
Remark. The Theorem of Cantor and Bernstein can be seen as an application of the Fixed Point Theorem of Knaster and Tarski.
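The proof is fully constructive once f is given. A sketch (our illustration) instantiating it with c = the natural numbers, b = c \ {0}, a = c \ {0, 1} and f(n) = n + 2:

```python
# Here r = c \ b = {0}, and g(0) = r, g(n+1) = f[g(n)] gives g(n) = {2n},
# so the union r̄ of the g(n) is the set of even numbers.  The map i of
# the proof is then f on r̄ and the identity elsewhere:
def i(x):
    return x + 2 if x % 2 == 0 else x

# i is a bijection from c onto b; check it on an initial segment.
image = {i(x) for x in range(100)}
assert 0 not in image                   # rng(i) misses exactly c \ b
assert image == set(range(1, 101))      # onto {1,...,100}, no collisions
```

Even numbers are shifted forward by f while odd numbers stay fixed, which is precisely how the proof absorbs the "extra" element 0.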
Corollary 4.3. |a| ≤ |b| → |b| ≤ |a| → |a| = |b|.
Proof. Let f: a → b and g: b → a be injective. Then (g∘f)[a] ⊆ g[b] ⊆ a and |(g∘f)[a]| = |a|. By the Theorem of Cantor and Bernstein, |b| = |g[b]| = |a|. □
4.2. Cardinals, Aleph Function. A cardinal is defined to be an ordinal that is not equinumerous to a smaller ordinal:
  κ is a cardinal :↔ ∀α<κ (|α| ≠ |κ|).
Here and later we write (because of Lemma 3.22) < for ∈ and ≤ for ⊆ between ordinals.
Lemma 4.4. |n| = |m| → n = m.
Proof. Induction on n. 0: Clear. n ↦ n + 1: Let f: n + 1 → m + 1 be bijective. We may assume f(n) = m. Hence f ↾ n: n → m is bijective and therefore n = m by IH, hence also n + 1 = m + 1. □
Corollary 4.5. n is a cardinal.
Lemma 4.6. |n| ≠ |ω|.
Proof. Assume |n| = |ω|. Because of n ⊆ n + 1 ⊆ ω the Theorem of Cantor and Bernstein implies |n| = |n + 1|, a contradiction. □
Corollary 4.7. ω is a cardinal.
Lemma 4.8. ω ≤ α → |α + 1| = |α|.
Proof. Define f: α → α + 1 by
  f(x) := α   if x = 0;
  f(x) := n   if x = n + 1 for some n ∈ ω;
  f(x) := x   otherwise.
Then f: α → α + 1 is bijective. □
Corollary 4.9. If ω ≤ κ and κ is a cardinal, then κ is a limit.
Proof. Assume κ = α + 1. Then ω ≤ α < κ, hence |α| = |α + 1| = |κ|, contradicting the assumption that κ is a cardinal. □
Lemma 4.10. If a is a set of cardinals, then sup(a) (:= ⋃a) is a cardinal.
Proof. Otherwise there would be an α < sup(a) such that |α| = |sup(a)|. Hence α ∈ ⋃a, and therefore α ∈ κ for some cardinal κ ∈ a. By the Theorem of Cantor and Bernstein, from α ⊆ κ ⊆ ⋃a and |α| = |⋃a| it follows that |α| = |κ|. Because of α < κ and κ a cardinal this is impossible. □
We now show that for every ordinal there is a strictly bigger cardinal. More generally, even the following holds:
Theorem 4.11 (Hartogs).
  ∀a ∃!α. ∀β<α (|β| ≤ |a|) ∧ ¬(|α| ≤ |a|).
α is the Hartogs number of a; it is denoted by H(a).
Proof. Uniqueness: Clear. Existence: Let w := { (b, r) | b ⊆ a ∧ r a well-ordering on b }, and let α_(b,r) be the uniquely determined ordinal isomorphic to (b, r). Then { α_(b,r) | (b, r) ∈ w } is a transitive subset of On, hence an ordinal α. We must show
(a) ∀β<α (|β| ≤ |a|),
(b) ¬(|α| ≤ |a|).
(a). Let β < α. Then β is isomorphic to a (b, r) with (b, r) ∈ w, hence there exists a bijective f: β → b with b ⊆ a.
(b). Assume f: α → a is injective. Then α = α_(b,r) for some b ⊆ a (namely b := rng(f), with the well-ordering induced by f), hence α ∈ α, a contradiction. □
Remark. (a). The Hartogs number of a is a cardinal. For let α be the Hartogs number of a and β < α. If |β| = |α|, we would have |α| = |β| ≤ |a|, a contradiction.
(b). The Hartogs number of α is the least cardinal κ such that κ > α.
The aleph function ℵ: On → V is defined recursively by
  ℵ_0 := ω,
  ℵ_{α+1} := H(ℵ_α),
  ℵ_λ := sup{ ℵ_β | β < λ }   for λ a limit.
Lemma 4.12. (a) ℵ_α is a cardinal.
(b) α < β → ℵ_α < ℵ_β.
(c) For every cardinal κ ≥ ω there is an α with κ = ℵ_α.
Proof. (a). Induction on α; clear, by the preceding remark and Lemma 4.10. (b). Induction on β. 0: Clear. β ↦ β + 1:
  α < β + 1
  → α < β ∨ α = β
  → ℵ_α ≤ ℵ_β
  → ℵ_α < ℵ_{β+1}.
λ limit:
  α < λ
  → α < β for some β < λ
  → ℵ_α < ℵ_β ≤ ℵ_λ.
(c). Let α be minimal such that κ ≤ ℵ_α (such an α exists, since κ ≤ ℵ_κ). We argue by cases on α. 0: Clear, for then ω ≤ κ ≤ ℵ_0 = ω.
α = α′ + 1: By the choice of α we have ℵ_{α′} < κ, hence ℵ_α = H(ℵ_{α′}) ≤ κ, since by the remark above H(ℵ_{α′}) is the least cardinal above ℵ_{α′}. Therefore κ = ℵ_α.
λ limit: By the choice of λ we have ℵ_β < κ for all β < λ, hence
  ℵ_λ = sup{ ℵ_β | β < λ } ≤ κ,
and therefore κ = ℵ_λ. □
We show that every infinite ordinal is equinumerous to a cardinal.
Lemma 4.13. ∀α ≥ ω ∃β (|α| = |ℵ_β|).
Proof. Consider κ := min{ β | β ≤ α ∧ |β| = |α| }. Clearly κ is a cardinal. Moreover ω ≤ κ, for otherwise
  κ = n,
  |n| = |α|,
  n ⊆ n + 1 ⊆ α,
  |n| = |n + 1|,
a contradiction. Hence κ = ℵ_β for some β, by Lemma 4.12(c). □
4.3. Products of Cardinals. We now show that |ℵ_α × ℵ_α| = |ℵ_α|.
On the class On × On we define a relation ≺ by
(α, β) ≺ (γ, δ) :↔ max{α, β} < max{γ, δ}
∨ (max{α, β} = max{γ, δ} ∧ α < γ)
∨ (max{α, β} = max{γ, δ} ∧ α = γ ∧ β < δ).
Lemma 4.14. ≺ is a wellordering on On × On.
Proof. Clearly ≺ is a linear ordering. To see the wellfoundedness of ≺,
consider an a ⊆ On × On such that a ≠ ∅. Then
∅ ≠ M := { ξ | ∃α∃β((α, β) ∈ a ∧ max{α, β} = ξ) } ⊆ On.
Let ξ_0 := min(M). Then
∅ ≠ M_1 := { α | ∃β((α, β) ∈ a ∧ max{α, β} = ξ_0) } ⊆ On.
Let α_0 := min(M_1). Then
∅ ≠ M_2 := { β | (α_0, β) ∈ a ∧ max{α_0, β} = ξ_0 } ⊆ On.
Let β_0 := min(M_2). Then clearly (α_0, β_0) = min_≺(a).
Theorem 4.16. For every α, the set ℵ_α × ℵ_α (ordered by ≺) is isomorphic
to ℵ_α (ordered by ∈).
Proof. Assume the contrary, and let
α_0 := min{ α | ℵ_α × ℵ_α is not isomorphic to ℵ_α }.
Clearly α_0 ≠ 0. Since ℵ_(α_0) × ℵ_(α_0) and ℵ_(α_0) are wellordered sets, one of them
must be isomorphic to a proper initial segment of the other. Therefore we
distinguish two cases.
Case (a). ℵ_(α_0) is isomorphic to the initial segment of ℵ_(α_0) × ℵ_(α_0)
determined by some (β, γ) with β, γ < ℵ_(α_0). Choose δ < ℵ_(α_0) with
ω ≤ δ and β, γ < δ. Then this initial segment is contained in δ × δ, and
|ℵ_(α_0)| ≤ |δ × δ| = |δ| = |ℵ_β| for some β < α_0,
by Lemma 4.13 and the choice of α_0,
hence a contradiction to Lemma 4.12(b).
Case (b). ℵ_(α_0) × ℵ_(α_0) is isomorphic to some δ < ℵ_(α_0). Then
|ℵ_(α_0)| ≤ |ℵ_(α_0) × ℵ_(α_0)| = |δ|,
hence a contradiction to the fact that ℵ_(α_0) is a cardinal and δ < ℵ_(α_0).
Corollary 4.17. (a) |ℵ_α × ℵ_β| = |ℵ_max{α,β}|.
(b) n ≠ 0 → |(ℵ_α)^n| = |ℵ_α|.
Proof. (a). We may assume α ≤ β. Then
|ℵ_β| ≤ |ℵ_α × ℵ_β| ≤ |ℵ_β × ℵ_β| = |ℵ_β|.
(b). This follows easily from Theorem 4.16, by induction on n.
5. The Axiom of Choice
5.1. Axiom of Choice, Well Ordering Theorem, Zorn's Lemma.
A relation ≺ on A is a partial ordering if for all x, y, z ∈ A
x ⊀ x,  (irreflexivity)
x ≺ y → y ≺ z → x ≺ z.  (transitivity)
An element x ∈ A is maximal if there is no y ∈ A such that x ≺ y. Let
B ⊆ A. An element x ∈ A is an upper bound of B if
∀y ∈ B(y ≺ x ∨ y = x).
Theorem 5.1. The following are equivalent.
(a) The axiom of choice (AC):
∀x(∅ ∉ x → ∃f(f : x → ⋃x ∧ ∀y ∈ x(f(y) ∈ y))).
(b) The well ordering theorem (WO):
∀a∃r(r is a wellordering on a).
(c) Zorn's Lemma (ZL): Let (P, <) be a nonempty partial ordering with
the property that every subset L ⊆ P linearly ordered by < has an
upper bound in P. Then P has a maximal element.
Proof. (ZL) → (WO). Let a be given, and define
P := { f | ∃α(f : α → a injective) } ⊆ P(H(a) × a).
P is partially ordered by proper inclusion ⊊. Every linearly ordered L ⊆ P
has ⋃L ∈ P as an upper bound, so by (ZL) there is a maximal f ∈ P, say
f : α → a. Then f is surjective, for otherwise f could be properly extended
by some element of a ∖ rng(f), contradicting maximality. Hence f transfers
the wellordering of α onto a.
(WO) → (AC). Let x with ∅ ∉ x be given. By (WO) there is a well-
ordering < of ⋃x. Clearly < induces a wellordering on every y ∈ x. Define
f : x → ⋃x,  y ↦ min_<(y) ∈ y.
(AC) → (ZL). Let < be a partial ordering on P ≠ ∅. Assume that every
subset L ⊆ P linearly ordered by < has an upper bound in P. By (AC)
there is a choice function f on P(P) ∖ {∅}. Let z ∉ P be arbitrary, and
define
F : On → V,
F(α) := { f({ y | y ∈ P ∖ F[α] ∧ y upper bound of F[α] }), if this set is ≠ ∅;
        { z, otherwise.
Then there is an α such that F(α) = z, for otherwise F : On → P would be
injective, contradicting our assumption that P is a set. Let α_0 := min{ α |
F(α) = z }. F[α_0] is linearly ordered, and we have F[α_0] ⊆ P. By assump-
tion there is an upper bound y_0 ∈ P of F[α_0]. We show that y_0 is a maximal
element in P. So assume y_0 < y for some y ∈ P. Then y is an upper bound
of F[α_0] and y ∉ F[α_0]. But this contradicts the definition of α_0.
From now on we will assume the axiom of choice; however, we will mark
every theorem and every definition depending on it by (AC).
(AC) clearly is equivalent to its special case where every two elements
y_1, y_2 ∈ x are disjoint. We hence note the following equivalent to the axiom
of choice:
Lemma 5.2. The following are equivalent.
(a) The axiom of choice (AC).
(b) For every surjective g : a → b there is an injective f : b → a such that
∀x ∈ b(g(f(x)) = x).
Proof. (a) → (b). Let g : a → b be surjective. By (AC) there is a well-
ordering < of a. Define f : b → a by f(x) := min_<{ y | y ∈ a ∧ g(y) = x }.
(b) → (a). We may assume x ≠ ∅ and ∀y_1, y_2 ∈ x(y_1 ≠ y_2 → y_1 ∩ y_2 = ∅). Define
g : ⋃x → x by letting g(z) be the unique y ∈ x with z ∈ y. Since ∅ ∉ x,
g is surjective. By (b) there is an f : x → ⋃x with g(f(y)) = y, that is
f(y) ∈ y for all y ∈ x.
Theorem 5.5 (AC). Let (A_i)_(i∈I) be a family of sets. Then
(a) |⋃_(i∈I) A_i| ≤ max{ |I|, sup_(i∈I)|A_i| }.
(b) If in addition ∀i ∈ I(A_i ≠ ∅) and ∀i, j ∈ I(i ≠ j → A_i ∩ A_j = ∅),
then equality holds.
Proof. (a). We may assume κ := sup_(i∈I)|A_i| ≠ 0. Choose a well-
ordering < of I and define w.r.t. this wellordering
f : ⋃_(i∈I) A_i → ⋃_(i∈I)({i} × A_i),
f(x) := (min_<{ i ∈ I | x ∈ A_i }, x).
Clearly f is injective. Hence
|⋃_(i∈I) A_i| ≤ |⋃_(i∈I)({i} × A_i)|
            ≤ |⋃_(i∈I)({i} × κ)|
            = |I × κ|
            = max{ |I|, κ }.
(b). Because of (a) it suffices to show |I|, |A_i| ≤ |⋃_(i∈I) A_i|. The se-
cond estimate is clear. For the first one choose a wellordering < of ⋃_(i∈I) A_i
and define f : I → ⋃_(i∈I) A_i by f(i) := min_<{ x | x ∈ A_i }. By our assumption
f is injective.
A set a is Dedekind-finite if a cannot be mapped bijectively onto a proper
subset b of a, otherwise Dedekind-infinite.
Theorem 5.6 (AC). A set a is Dedekind-infinite iff a is infinite.
Proof. →. Let b ⊊ a and f : a ↔ b. Assume |a| < ω, say |a| = n.
Then there is a c ⊊ n and some g : n ↔ c. We show by induction on n that
this is impossible:
∀n ∀c ⊊ n ¬∃g(g : n ↔ c).
0. Clear. n + 1. Let g : n + 1 ↔ c and c ⊊ n + 1. We may assume n ∉ rng(g↾n).
It follows that g↾n : n ↔ c ∖ {g(n)} ⊊ n, and hence a contradiction to the IH.
←. Let g : ω → a be injective, and let h : g[ω] → g[ω ∖ {0}] be defined by
h := { (g(n), g(n + 1)) | n ∈ ω }.
Define f : a ↔ (a ∖ {g(0)}) by
f(x) := { x,    if x ∈ a ∖ g[ω];
        { h(x), otherwise.
The cofinality cf(κ) of an infinite cardinal κ is the least ordinal α such
that there is an x ⊆ κ with sup(x) = κ and |x| = |α|. κ is called regular if
cf(κ) = κ, and singular otherwise.
Lemma 5.8 (AC). (a) ω is regular.
(b) ℵ_(α+1) is regular.
(c) If λ is a limit with λ < ℵ_λ, then ℵ_λ is singular.
Proof. (a). Assume ω is singular, that is cf(ω) < ω. Then there is an
x ⊆ ω such that |x| = n and sup(x) = ω. But this is impossible (proof by
induction on n).
(b). Assume ℵ_(α+1) is singular. Then cf(ℵ_(α+1)) ≤ ℵ_α. Hence there is an
x ⊆ ℵ_(α+1) such that |x| ≤ ℵ_α and sup(x) = ℵ_(α+1). But then
ℵ_(α+1) = |⋃x|
        ≤ max{ |x|, sup{ |y| | y ∈ x } }   (by Theorem 5.5(a))
        ≤ ℵ_α,
a contradiction.
(c). Let λ be a limit such that λ < ℵ_λ. Then we have
ℵ_λ = sup{ ℵ_β | β < λ }, and moreover the set x := { ℵ_β | β < λ } satisfies
|x| ≤ |λ| < ℵ_λ. Hence ℵ_λ is singular.
By definition, for every infinite cardinal κ there is a subset x ⊆ κ whose
cardinality equals cf(κ), hence which can be mapped bijectively onto cf(κ).
We now show that one can even assume that this bijection is an isomorphism.
Lemma 5.9 (AC). Let κ be an infinite cardinal. Then there exists a
subset x ⊆ κ cofinal in κ that is isomorphic to cf(κ).
Proof. Let y ⊆ κ, sup(y) = κ, |y| = cf(κ) and g : cf(κ) ↔ y. By
transfinite recursion we define
F : On → V,
F(α) := sup(F[α] ∪ g[α]) + 1.
Let f := F↾cf(κ). One can see easily:
(a) β < α < cf(κ) → f(β) < f(α) ∧ g(β) < f(α).
(b) rng(f) ⊆ κ.
(c) rng(f) is cofinal with κ.
rng(f) is the x we are looking for.
Corollary 5.10 (AC). If κ is an infinite cardinal, then cf(κ) is a re-
gular cardinal.
Proof. cf(cf(κ)) ≤ cf(κ) is clear. We must show cf(κ) ≤ cf(cf(κ)). By
the lemma above there are x, f such that x ⊆ κ, sup(x) = κ and f : cf(κ) ↔
x is an isomorphism. Moreover there is a y ⊆ cf(κ) such that sup(y) = cf(κ)
and |y| = cf(cf(κ)). One can see easily that { f(α) | α ∈ y } is cofinal with κ.
Hence
cf(κ) ≤ |{ f(α) | α ∈ y }|
      = |y|
      = cf(cf(κ)).
This concludes the proof.
Theorem 5.11 (König). Let κ be an infinite cardinal. Then κ < |^(cf(κ))κ|,
where ^(cf(κ))κ denotes the set of all functions from cf(κ) to κ.
Proof. κ = |^1κ| ≤ |^(cf(κ))κ| is clear. Hence it suffices to derive a contra-
diction from the assumption that there is a bijection f : κ ↔ ^(cf(κ))κ. Accord-
ing to Lemma 5.9 there exists x ⊆ κ such that sup(x) = κ, and moreover an
isomorphism g : cf(κ) ↔ x. For every ξ < cf(κ) we therefore have g(ξ) < κ
and hence
|{ f(β)(ξ) | β < g(ξ) }| ≤ |g(ξ)| < κ,
hence { f(β)(ξ) | β < g(ξ) } ⊊ κ. Let
h : cf(κ) → κ,
h(ξ) := min(κ ∖ { f(β)(ξ) | β < g(ξ) }).
We obtain the desired contradiction by showing that f(β) ≠ h for all β < κ.
So let β < κ. Choose ξ < cf(κ) such that β < g(ξ). Then h(ξ) ≠ f(β)(ξ)
by construction of h.
5.4. Cardinal Powers, Continuum Hypothesis. In this section we
again assume (AC). We define
ℵ_α^(ℵ_β) := |^(ℵ_β)ℵ_α|,
where ^(ℵ_β)ℵ_α is the set of all functions from ℵ_β to ℵ_α.
Later we will introduce powers of ordinals as well. It should always be clear
from the context whether we mean ordinal or cardinal power.
Theorem 5.12 (AC). (a) ℵ_β < cf(ℵ_α) → ℵ_α ≤ ℵ_α^(ℵ_β) ≤ |P(ℵ_α)|.
(b) cf(ℵ_α) ≤ ℵ_β ≤ ℵ_α → ℵ_α < ℵ_α^(ℵ_β) ≤ |P(ℵ_α)|.
(c) ℵ_α ≤ ℵ_β → ℵ_α^(ℵ_β) = |P(ℵ_β)|.
Proof. (a). The first estimate is clear. For the second,
ℵ_α^(ℵ_β) = |^(ℵ_β)ℵ_α|
         ≤ |^(ℵ_β)(^(ℵ_α){0, 1})|
         = |^(ℵ_β × ℵ_α){0, 1}|
         = |^(ℵ_α){0, 1}|   (because |ℵ_β × ℵ_α| = ℵ_α)
         = |P(ℵ_α)|.
(b).
ℵ_α < |^(cf(ℵ_α))ℵ_α|   (König's Theorem)
    ≤ |^(ℵ_β)ℵ_α|
    ≤ |P(ℵ_α)|   (as in (a)).
(c).
|P(ℵ_β)| = |^(ℵ_β){0, 1}|
         ≤ |^(ℵ_β)ℵ_α|
         ≤ |^(ℵ_β)(^(ℵ_β){0, 1})|
         = |^(ℵ_β × ℵ_β){0, 1}|
         = |^(ℵ_β){0, 1}|
         = |P(ℵ_β)|.
This concludes the proof.
One can say much more about cardinal powers if one assumes the so-
called continuum hypothesis:
|P(ℵ_0)| = ℵ_1. (CH)
An obvious generalization to all cardinals is the generalized continuum hy-
pothesis:
|P(ℵ_α)| = ℵ_(α+1). (GCH)
It is an open problem whether the continuum hypothesis holds in the cu-
mulative type structure (No. 1 in Hilbert's list of mathematical problems,
posed in a lecture at the international congress of mathematicians in Paris
1900). However, it is known that the continuum hypothesis is independent
of the other axioms of set theory. We shall always indicate use of (CH) or
(GCH).
Theorem 5.13 (GCH). (a) ℵ_β < cf(ℵ_α) → ℵ_α^(ℵ_β) = ℵ_α.
(b) cf(ℵ_α) ≤ ℵ_β ≤ ℵ_α → ℵ_α^(ℵ_β) = ℵ_(α+1).
(c) ℵ_α ≤ ℵ_β → ℵ_α^(ℵ_β) = ℵ_(β+1).
Proof. (b) and (c) follow with (GCH) from the previous theorem.
(a). Let ℵ_β < cf(ℵ_α). Then
^(ℵ_β)ℵ_α = ⋃{ ^(ℵ_β)γ | γ < ℵ_α },
because for every f : ℵ_β → ℵ_α we have |f[ℵ_β]| ≤ ℵ_β < cf(ℵ_α), hence
sup(f[ℵ_β]) < ℵ_α and therefore f ∈ ^(ℵ_β)γ for some γ < ℵ_α.
This gives
ℵ_α ≤ ℵ_α^(ℵ_β)
    = |⋃{ ^(ℵ_β)γ | γ < ℵ_α }|
    ≤ max{ ℵ_α, sup_(γ<ℵ_α)|^(ℵ_β)γ| }   (by Theorem 5.5(a)).
Hence it suffices to show that |^(ℵ_β)γ| ≤ ℵ_α for γ < ℵ_α. So let γ < ℵ_α; we
may assume γ infinite, say |γ| = |ℵ_δ| with δ < α (Lemma 4.13). Then
|^(ℵ_β)γ| ≤ |^(ℵ_β × ℵ_δ){0, 1}|   (since |γ| < |P(ℵ_δ)| = |^(ℵ_δ){0, 1}|)
        ≤ { |P(ℵ_β)|, if δ < β;
          { |P(ℵ_δ)|, if β ≤ δ
        = { ℵ_(β+1), if δ < β;
          { ℵ_(δ+1), if β ≤ δ
        ≤ ℵ_α.
This concludes the proof.
6. Ordinal Arithmetic
We define addition, multiplication and exponentiation for ordinals and
prove their basic properties. We also treat Cantor's normal form.
6.1. Ordinal Addition. Let
α + 0 := α,
α + (β + 1) := (α + β) + 1,
α + λ := sup{ α + β | β < λ }  if λ is a limit.
More precisely, define s_α : On → V by
s_α(0) := α,
s_α(β + 1) := s_α(β) + 1,
s_α(λ) := ⋃rng(s_α↾λ)  if λ is a limit,
and then let α + β := s_α(β).
Lemma 6.1 (Properties of Ordinal Addition). (a) α + β ∈ On.
(b) 0 + β = β.
(c) ∃α, β(α + β ≠ β + α).
(d) β < γ → α + β < α + γ.
(e) There are α, β, γ such that α < β, but α + γ ≮ β + γ.
(f) α ≤ β → α + γ ≤ β + γ.
(g) For α ≤ β there is a unique γ such that α + γ = β.
(h) If λ is a limit, then so is α + λ.
(i) (α + β) + γ = α + (β + γ).
Proof. (a). Induction on β. Case 0. Clear. Case β + 1. Then
α + (β + 1) = (α + β) + 1 ∈ On, for by IH α + β ∈ On. Case λ limit. Then
α + λ = sup{ α + β | β < λ } ∈ On, for by IH α + β ∈ On for all β < λ.
(b). Induction on β. Case 0. Clear. Case β + 1. Then 0 + (β + 1) =
(0 + β) + 1 = β + 1, for by IH 0 + β = β. Case λ limit. Then
0 + λ = sup{ 0 + β | β < λ }
      = sup{ β | β < λ }   (by IH)
      = ⋃λ = λ, because λ is a limit.
(c). 1 + ω = sup{ 1 + n | n < ω } = ω ≠ ω + 1.
(d). Induction on γ. Case 0. Clear. Case γ + 1. Then
β < γ + 1
→ β < γ ∨ β = γ
→ α + β ≤ α + γ   (by IH)
→ α + β < (α + γ) + 1 = α + (γ + 1).
Case λ limit. Let β < λ, hence β < γ for some γ < λ. Then α + β < α + γ
by IH, hence α + β < sup{ α + γ | γ < λ } = α + λ.
(e). 0 < 1, but 0 + ω = ω = 1 + ω.
(f). We first remark that there can be no γ such that β < γ < β + 1,
for γ < β + 1 implies γ ≤ β, contradicting β < γ. As a second preliminary
remark we note that
α ≤ β → α + 1 ≤ β + 1,
for in case β + 1 < α + 1 we would have α < β + 1 < α + 1 (using α ≤ β),
which cannot be the case (as we have just seen). We now show the claim
α ≤ β → α + γ ≤ β + γ by induction on γ. Case 0. Clear. Case γ + 1. Then
α + γ ≤ β + γ   (by IH),
(α + γ) + 1 ≤ (β + γ) + 1   (by the second preliminary remark),
α + (γ + 1) ≤ β + (γ + 1)   (by definition).
Case λ limit. Then
α + δ ≤ β + δ for all δ < λ   (by IH), hence
α + λ = sup{ α + δ | δ < λ } ≤ sup{ β + δ | δ < λ } = β + λ   (by definition).
(g). Uniqueness of γ follows from (d). Existence: Let α ≤ β. By (b)
and (f), β = 0 + β ≤ α + β. Let γ be the least ordinal such that β ≤ α + γ.
We show that β = α + γ. Case γ = 0. Then β ≤ α + 0 = α ≤ β,
hence β = α + γ. Case γ = γ′ + 1. Then α + γ′ < β, hence (α + γ′) + 1 ≤ β
by the first preliminary remark for (f), and hence β = α + γ. Case γ limit.
Then α + δ < β for all δ < γ, hence α + γ = sup{ α + δ | δ < γ } ≤ β and
hence α + γ = β.
(h). Let λ be a limit. We use the characterization of limits in Lemma 3.27(a).
α + λ ≠ 0: Because of 0 < λ we have 0 < λ = 0 + λ ≤ α + λ by (b) and (f).
β < α + λ → β + 1 < α + λ: Let β < α + λ = sup{ α + δ | δ < λ }, hence
β < α + δ for some δ < λ, hence β + 1 ≤ α + δ < α + (δ + 1) with δ + 1 < λ,
hence β + 1 < sup{ α + δ | δ < λ }.
(i). Induction on γ. Case 0. Clear. Case γ + 1. Then
(α + β) + (γ + 1) = [(α + β) + γ] + 1
                  = [α + (β + γ)] + 1   (by IH)
                  = α + [(β + γ) + 1]
                  = α + [β + (γ + 1)].
Case γ limit. By (h) also β + γ is a limit. Hence
(α + β) + γ = sup{ (α + β) + δ | δ < γ }
            = sup{ α + (β + δ) | δ < γ }   (by IH)
            = sup{ α + δ′ | δ′ < β + γ }   (see below)
            = α + (β + γ).
The equality of both suprema can be seen as follows. If δ′ < β + γ, then
δ′ < β + δ for some δ < γ (by definition of β + γ), and hence α + δ′ < α + (β + δ).
If conversely δ < γ, then β + δ < β + γ, hence α + (β + δ) = α + δ′ for some
δ′ < β + γ (take δ′ := β + δ).
6.2. Ordinal Multiplication. Ordinal multiplication is defined by
α · 0 := 0,
α · (β + 1) := (α · β) + α,
α · λ := sup{ α · β | β < λ }  if λ is a limit.
We write αβ for α · β.
Lemma 6.2 (Properties of Ordinal Multiplication). (a) αβ ∈ On.
(b) 0 · β = 0, 1 · β = β.
(c) ∃α, β(αβ ≠ βα).
(d) 0 < α → β < γ → αβ < αγ.
(e) There are α, β, γ such that 0 < γ and α < β, but αγ ≮ βγ.
(f) α ≤ β → αγ ≤ βγ.
(g) If 0 < α and λ is a limit, then so is αλ.
(h) α(β + γ) = αβ + αγ.
(i) There are α, β, γ such that (α + β)γ ≠ αγ + βγ.
(j) αβ = 0 → α = 0 ∨ β = 0.
(k) (αβ)γ = α(βγ).
(l) If 0 < α, then for every β there are unique γ, δ such that β = αγ + δ
and δ < α.
Proof. (a). Induction on β. Case 0. Clear. Case β + 1. Then
α(β + 1) = (αβ) + α ∈ On, for by IH αβ ∈ On. Case λ limit. Then
αλ = sup{ αβ | β < λ } ∈ On, for by IH αβ ∈ On for all β < λ.
(b). 0 · β = 0: Induction on β. Case 0. Clear. Case β + 1. Then
0(β + 1) = (0β) + 0 = 0 by IH. Case λ limit. 0λ = sup{ 0β | β < λ } = 0
by IH. 1 · β = β: Induction on β. Case 0. Clear. Case β + 1. Then
1(β + 1) = (1β) + 1 = β + 1 by IH. Case λ limit. 1λ = sup{ 1β | β < λ } =
sup{ β | β < λ } = λ by IH.
(c). First note that for all n with 1 ≤ n < ω we have nω = sup{ nm | m <
ω } = ω. This implies 2ω = ω, but ω2 = ω(1 + 1) = ω1 + ω = ω + ω > ω.
(d). Let 0 < α. We show β < γ → αβ < αγ by induction on γ. Case 0.
Clear. Case γ + 1. Then
β < γ + 1
→ β < γ ∨ β = γ
→ αβ ≤ αγ   (by IH)
→ αβ < (αγ) + α = α(γ + 1).
Case λ limit. Let β < λ, hence β < γ for some γ < λ. Then αβ < αγ by
IH, hence αβ < sup{ αγ | γ < λ } = αλ.
(e). We have 0 < ω and 1 < 2, but 1ω = ω = 2ω.
(f). We show the claim α ≤ β → αγ ≤ βγ by induction on γ. Case 0.
Clear. Case γ + 1. Then
αγ ≤ βγ   (by IH),
(αγ) + α ≤ (βγ) + α ≤ (βγ) + β   (by Lemma 6.1(f) and (d)),
α(γ + 1) ≤ β(γ + 1)   (by definition).
Case λ limit. Then
αδ ≤ βδ for all δ < λ   (by IH), hence
αλ = sup{ αδ | δ < λ } ≤ sup{ βδ | δ < λ } = βλ   (by definition).
(g). Let 0 < α and λ a limit. For the proof that αλ is a limit we again use
the characterization of limits in Lemma 3.27(a). αλ ≠ 0: Because of 1 ≤ α and
1 ≤ λ we have 0 < α = α · 1 ≤ αλ by (d). β < αλ → β + 1 < αλ: Let
β < αλ = sup{ αδ | δ < λ }, hence β < αδ for some δ < λ, hence β + 1 ≤
αδ < (αδ) + α = α(δ + 1) with δ + 1 < λ, hence β + 1 < sup{ αδ | δ < λ }.
(h). We must show α(β + γ) = αβ + αγ. We may assume 0 < α. We
employ induction on γ. Case 0. Clear. Case γ + 1. Then
α[β + (γ + 1)] = α[(β + γ) + 1]
              = α(β + γ) + α
              = (αβ + αγ) + α   (by IH)
              = αβ + (αγ + α)
              = αβ + α(γ + 1).
Case λ limit. By Lemma 6.1(h), β + λ is a limit, and by (g) so is αλ. We obtain
α(β + λ) = sup{ αδ′ | δ′ < β + λ }
         = sup{ α(β + δ) | δ < λ }   ({ β + δ | δ < λ } is cofinal in β + λ)
         = sup{ αβ + αδ | δ < λ }   (by IH)
         = sup{ αβ + δ′ | δ′ < αλ }   ({ αδ | δ < λ } is cofinal in αλ)
         = αβ + αλ.
(i). (1 + 1)ω = 2ω = ω, but 1ω + 1ω = ω + ω.
(j). If 0 < α, β, hence 1 ≤ α, β, then 0 < 1 = 1 · 1 ≤ αβ.
(k). Induction on γ. We may assume α, β ≠ 0. Case 0. Clear. Case
γ + 1. Then
(αβ)(γ + 1) = (αβ)γ + αβ
            = α(βγ) + αβ   (by IH)
            = α(βγ + β)   (by (h))
            = α[β(γ + 1)].
Case λ limit. By (g) βλ is a limit as well. We obtain
(αβ)λ = sup{ (αβ)δ | δ < λ }
      = sup{ α(βδ) | δ < λ }   (by IH)
      = sup{ αδ′ | δ′ < βλ }   ({ βδ | δ < λ } is cofinal in βλ)
      = α(βλ).
(l). Existence: Let 0 < α, hence 1 ≤ α and hence β = 1 · β ≤ αβ.
Let η be the least ordinal such that β ≤ αη. Case β = αη. Then β = αη + 0.
Case β < αη. If η = γ + 1, then αγ < β, hence there is a δ such
that αγ + δ = β (Lemma 6.1(g)). Moreover δ < α, because from α ≤ δ it
follows that β = αγ + δ ≥ αγ + α = α(γ + 1) = αη > β, a contradiction. If
η is a limit, then β < αη = sup{ αδ | δ < η }, hence β < αδ for some δ < η,
contradicting the minimality of η.
Uniqueness: Assume αγ_1 + δ_1 = αγ_2 + δ_2 with δ_1, δ_2 < α. If say γ_1 < γ_2,
then
αγ_1 + δ_1 < αγ_1 + α = α(γ_1 + 1) ≤ αγ_2 ≤ αγ_2 + δ_2,
hence we have a contradiction. Therefore γ_1 = γ_2, and hence δ_1 = δ_2.
Corollary 6.3. Every ordinal α can be written uniquely in the form
α = ωβ + n. Here n = 0 iff α = 0 or α is a limit.
Proof. It remains to be shown that for every β, the ordinal ωβ is either 0
or a limit. In case β = 0 this is clear. In case β = γ + 1, the ordinal
ω(γ + 1) = ωγ + ω is a limit by Lemma 6.1(h). If β is a limit, then so is ωβ
(by Lemma 6.2(g)).
6.3. Ordinal Exponentiation. Ordinal exponentiation is defined by
α^0 := { 0, if α = 0;
       { 1, otherwise,
α^(β+1) := (α^β) · α,
α^λ := sup{ α^β | β < λ }  if λ is a limit.
Lemma 6.4 (Properties of Ordinal Exponentiation). (a) α^β ∈ On.
(b) 0^β = 0, 1^β = 1.
(c) 1 < α → β < γ → α^β < α^γ.
(d) There are α, β, γ such that 1 < γ and 1 < α < β, but α^γ ≮ β^γ.
(e) α ≤ β → α^γ ≤ β^γ.
(f) If 1 < α and λ is a limit, then so is α^λ.
(g) α^(β+γ) = (α^β)(α^γ).
(h) α^(βγ) = (α^β)^γ.
(i) 1 < α → β ≤ α^β.
Proof. (a). Induction on β. Case 0. Clear. Case β + 1. Then
α^(β+1) = (α^β)α ∈ On, for by IH α^β ∈ On. Case λ limit. Then
α^λ = sup{ α^β | β < λ } ∈ On, for by IH α^β ∈ On for all β < λ.
(b). 0^β = 0: Induction on β. Case 0. 0^0 = 0 holds by definition. Case
β + 1. Then 0^(β+1) = (0^β) · 0 = 0. Case λ limit. 0^λ = sup{ 0^β | β < λ } = 0
by IH. 1^β = 1: Induction on β. Case 0. Clear. Case β + 1. Then
1^(β+1) = (1^β) · 1 = 1 by IH. Case λ limit. 1^λ = sup{ 1^β | β < λ } =
sup{ 1 | β < λ } = 1 by IH.
(c). Let 1 < α. We show β < γ → α^β < α^γ by induction on γ. Case 0.
Clear. Case γ + 1. Then
β < γ + 1
→ β < γ ∨ β = γ
→ α^β ≤ α^γ   (by IH)
→ α^β < (α^γ)α = α^(γ+1).
Case λ limit. Let β < λ, hence β < γ for some γ < λ. Then α^β < α^γ by
IH, hence α^β < sup{ α^γ | γ < λ } = α^λ.
(d). For 1 < n < ω we have n^ω = sup{ n^m | m < ω } = ω, and hence
2^ω = ω = 3^ω.
(e). We show the claim α ≤ β → α^γ ≤ β^γ by induction on γ. Case 0.
Clear. Case γ + 1. Let α ≤ β. Then
α^γ ≤ β^γ   (by IH),
α^(γ+1) = (α^γ)α ≤ (β^γ)β = β^(γ+1)   (by Lemma 6.2(d) and (f)).
Case λ limit. Let again α ≤ β. Then
α^λ = sup{ α^δ | δ < λ }
    ≤ sup{ β^δ | δ < λ }   (by IH)
    = β^λ   (by definition).
(f). Let 1 < α and λ a limit. For the proof that α^λ is a limit we again
use the characterization of limits in Lemma 3.27(a). α^λ ≠ 0: Because of
1 ≤ α we have 1 = 1^λ ≤ α^λ by (e). β < α^λ → β + 1 < α^λ: Let
β < α^λ = sup{ α^δ | δ < λ }, hence β < α^δ for some δ < λ, hence
β + 1 ≤ α^δ < (α^δ) · 2 ≤ α^(δ+1) with δ + 1 < λ, hence β + 1 < sup{ α^δ | δ < λ }.
(g). We must show α^(β+γ) = (α^β)(α^γ). We may assume 1 < α, the
cases α = 0, 1 being covered by (b). Induction on γ. Case 0. Clear. Case
γ + 1.
α^(β+(γ+1)) = α^((β+γ)+1)
            = (α^(β+γ))α
            = ((α^β)(α^γ))α   (by IH)
            = (α^β)((α^γ)α)
            = (α^β)(α^(γ+1)).
Case λ limit.
α^(β+λ) = sup{ α^(β+δ) | δ < λ }   ({ β + δ | δ < λ } is cofinal in β + λ)
        = sup{ (α^β)(α^δ) | δ < λ }   (by IH)
        = (α^β)(α^λ)   ({ α^δ | δ < λ } is cofinal in α^λ).
(h). We must show α^(βγ) = (α^β)^γ. Again we may assume 1 < α and
β ≠ 0. Induction on γ. Case 0. Clear. Case γ + 1.
α^(β(γ+1)) = α^(βγ+β)
           = (α^(βγ))(α^β)   (by (g))
           = ((α^β)^γ)(α^β)   (by IH)
           = (α^β)^(γ+1).
Case λ limit. Because of α ≠ 0, 1 and β ≠ 0, βλ is a limit as well
(Lemma 6.2(g)). We obtain
α^(βλ) = sup{ α^(βδ) | δ < λ }   ({ βδ | δ < λ } is cofinal in βλ)
       = sup{ (α^β)^δ | δ < λ }   (by IH)
       = (α^β)^λ.
(i). Let 1 < α. We show β ≤ α^β by induction on β. Case 0. Clear.
Case β + 1. Then
β ≤ α^β   (by IH), hence
β + 1 ≤ α^β + 1 ≤ α^β + α^β = (α^β) · 2 ≤ (α^β)α = α^(β+1).
Case λ limit.
λ = sup{ δ | δ < λ }
  ≤ sup{ α^δ | δ < λ }   (by IH)
  = α^λ.
This concludes the proof.
6.4. Cantor Normal Form.
Theorem 6.5 (Cantor Normal Form). Let β ≥ 2. Every α can be written
uniquely in the form
α = β^(α_1)γ_1 + ··· + β^(α_n)γ_n
where α_1 > ··· > α_n and 0 < γ_i < β (the empty sum, n = 0, representing 0).
Proof. Existence. Induction on α; for α = 0 take the empty sum. So let
α > 0, and let α_1 be minimal such that α < β^(α_1+1); such an α_1 exists since
α ≤ β^α < β^(α+1) by Lemma 6.4(i) and (c). Then β^(α_1) ≤ α.
Division with remainder (Lemma 6.2(l)) gives
α = β^(α_1)γ_1 + δ  with δ < β^(α_1).
Clearly 0 < γ_1 < β. Now if δ = 0 we are done. Otherwise we have
δ = β^(α_2)γ_2 + ··· + β^(α_n)γ_n  by IH.
We still must show α_1 > α_2. But this holds, because α_2 ≥ α_1 entails
δ ≥ β^(α_2) ≥ β^(α_1), a contradiction.
Uniqueness. Let
β^(α_1)γ_1 + ··· + β^(α_n)γ_n = β^(α′_1)γ′_1 + ··· + β^(α′_m)γ′_m
and assume that both representations are different. Since no such sum can
extend the other, we must have i ≤ n, m such that (α_i, γ_i) ≠ (α′_i, γ′_i). By
Lemma 6.1(d) we can assume i = 1. First we have
β^(α_1)γ_1 + ··· + β^(α_n)γ_n < β^(α_1)γ_1 + ··· + β^(α_(n−1))γ_(n−1) + β^(α_n)(γ_n + 1)
                             ≤ β^(α_1)γ_1 + ··· + β^(α_(n−1))(γ_(n−1) + 1),
since β^(α_n)(γ_n + 1) ≤ β^(α_n)β ≤ β^(α_(n−1)) for α_n < α_(n−1); iterating this
estimate we obtain
β^(α_1)γ_1 + ··· + β^(α_n)γ_n < β^(α_1)(γ_1 + 1).
Now if e.g. α_1 < α′_1, then we would have
β^(α_1)γ_1 + ··· + β^(α_n)γ_n < β^(α_1)(γ_1 + 1) ≤ β^(α_1+1) ≤ β^(α′_1),
which cannot be. Hence α_1 = α′_1. If e.g. γ_1 < γ′_1, then we would have
β^(α_1)γ_1 + ··· + β^(α_n)γ_n < β^(α_1)(γ_1 + 1) ≤ β^(α_1)γ′_1,
which again cannot be the case. Hence γ_1 = γ′_1, contradicting the choice
of i = 1.
Corollary 6.6 (Cantor Normal Form With Base ω). Every α can be
written uniquely in the form
α = ω^(α_1) + ··· + ω^(α_n)  with α_1 ≥ ··· ≥ α_n.
An ordinal γ is an additive principal number when γ ≠ 0 and α + β < γ
for all α, β < γ.
Corollary 6.7. The additive principal numbers are exactly the ordinals
of the form ω^α.
Proof. This follows from Cantor's normal form with base ω.
Corollary 6.8 (Cantor Normal Form With Base 2). Every α can be
written uniquely in the form
α = 2^(α_1) + ··· + 2^(α_n)  with α_1 > ··· > α_n.
Let ω_0 := 1, ω_(k+1) := ω^(ω_k) and ε_0 := sup_(k<ω) ω_k. Notice that ε_0 is the
least ordinal ε such that ω^ε = ε.
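For finite ordinals, base-2 Cantor normal form is nothing but the binary expansion. The following Python sketch (the helper name cnf_base2 is ours, not from the text) computes the strictly decreasing exponents of Corollary 6.8 in the finite case:

```python
def cnf_base2(n):
    """Decompose n >= 1 into strictly decreasing exponents g1 > ... > gk
    with n = 2^g1 + ... + 2^gk (base-2 Cantor normal form, finite case)."""
    exps = []
    g = n.bit_length() - 1
    while n > 0:
        while 2 ** g > n:       # find the largest power of 2 still <= n
            g -= 1
        exps.append(g)
        n -= 2 ** g
    return exps

# 100 = 2^6 + 2^5 + 2^2
assert cnf_base2(100) == [6, 5, 2]
```

The decomposition is unique, in accordance with the uniqueness claim of Corollary 6.8; for infinite ordinals the exponents themselves become ordinals, which is what the notation systems of Chapter 6 exploit.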
7. Normal Functions
In [29] Veblen investigated the notion of a continuous monotone func-
tion on a segment of the ordinals, and introduced a certain hierarchy of
normal functions. His goal was to generalize Cantor's theory of ε-numbers
(from [5]).
7.1. Closed Unbounded Classes. Let κ be a regular cardinal > ω,
or κ = On. An important example is κ = ℵ_1, that is the case where κ is the
set of all countable ordinals. Let α, β, γ, δ, η, λ, μ, ξ denote elements of κ. A
function φ : κ → κ is monotone if α < β implies φα < φβ. φ is continuous
if φλ = sup_(α<λ) φα for every limit λ. φ is normal if φ is monotone and
continuous.
Lemma 7.1. For every monotone function φ we have α ≤ φα.
Proof. Induction on α. Case 0. 0 ≤ φ0. Case α + 1. α ≤ φα <
φ(α + 1). Case λ limit. λ = sup_(α<λ) α ≤ sup_(α<λ) φα ≤ φλ.
A class B ⊆ κ is bounded if sup(B) < κ. A class A ⊆ κ is closed if
for every bounded subclass B ⊆ A we have sup(B) ∈ A. Closed unbounded
classes A ⊆ κ are called normal, or closed unbounded in κ (club for short).
If for instance κ = ℵ_1, then every B ⊆ κ is a set, and B is bounded iff
B is countable. If κ = On, then B is bounded iff B is a set.
By Corollary 3.19 (to the Isomorphy Theorem of Mostowski), for every
A ⊆ On we have a uniquely determined isomorphism of an ordinal class onto
A, that is an f : On → A (or f : α → A). This isomorphism is called the
ordering function of A. Notice that f is the monotone enumeration of A.
Lemma 7.2. The range of a normal function is a normal class. Con-
versely, the ordering function of a normal class is a normal function.
Proof. Let φ be a normal function. φ[κ] is unbounded, since for every
α we have α ≤ φα. We now show that φ[κ] is closed. So let B = φ[A] ⊆
φ[κ] be bounded, i.e., sup(B) < κ. Because of α ≤ φα then also A is bounded.
We must show sup(B) = φγ for some γ. If A has a maximal element we
are done. Otherwise γ := sup(A) is a limit. Then φγ = sup_(β<γ) φβ =
sup_(β∈A) φβ = sup(B). Conversely, let A be closed and unbounded. We
define a function φ : κ → A by transfinite recursion, as follows.
φα := min{ β ∈ A | ∀γ. γ < α → φγ < β }.
φ is well defined, since A is unbounded. Clearly φ is the ordering function
of A and hence monotone. It remains to be shown that φ is continuous.
So let λ be a limit. Since φ[λ] is bounded (this follows from φα < φλ
for α < λ) and A is closed, we have sup_(α<λ) φα ∈ A, hence by definition
φλ = sup_(α<λ) φα.
Lemma 7.3. The fixed points of a normal function form a normal class.
Proof. (Cf. Cantor [5, p. 242]). Let φ be a normal function. For every
ordinal α we can construct a fixed point of φ above α by
β := sup{ φ^n α | n ∈ N }.
Hence the class of fixed points of φ is unbounded. It is closed as well, since
for every bounded class B of fixed points of φ we have φ(sup(B)) = sup{ φβ |
β ∈ B } = sup{ β | β ∈ B } = sup(B), i.e., sup(B) is a fixed point of φ.
7.2. The Veblen Hierarchy of Normal Functions. The ordering
function of the class of fixed points of a normal function φ has been called
by Veblen the first derivative φ′ of φ. For example, the first derivative of
the function λξ.ω^ξ is the function λξ.ε_ξ enumerating the ε-numbers.
Lemma 7.4 (Veblen). Let (A_α)_(α<λ) with λ a limit be a decreasing sequence
of normal classes. Then the intersection ⋂_(α<λ) A_α is normal as well.
Proof. Unboundedness. Let γ be given and β_α := min{ β ∈ A_α | β >
γ }. Then (β_α)_(α<λ) is weakly monotone, since the A_α are decreasing. Let
β := sup_(α<λ) β_α. For every α < λ the class { β_δ | α ≤ δ < λ } is a bounded
subclass of A_α, hence β = sup{ β_δ | α ≤ δ < λ } ∈ A_α. Therefore
β ∈ ⋂_(α<λ) A_α and β > γ.
Closedness. Let B ⊆ ⋂_(α<λ) A_α, B bounded. Then B ⊆ A_α for every
α < λ and therefore sup(B) ∈ A_α. Hence sup(B) ∈ ⋂_(α<λ) A_α.
We now define the Veblen hierarchy of normal functions. It is based on
an arbitrary given normal function φ : κ → κ. We use transfinite recursion
to define for every α a normal function φ_α : κ → κ:
φ_0 := φ,
φ_(α+1) := (φ_α)′,
and for limits λ we let φ_λ be the ordering function of ⋂_(α<λ) φ_α[κ].
For example, for φ := λξ.1 + ξ we obtain φ_α = λξ.ω^α + ξ. If we start with
φ := λξ.ω^ξ, then φ_1 = λξ.ε_ξ, and φ_2 enumerates the critical ε-numbers, i.e.,
the ordinals α such that ε_α = α.
Lemma 7.5. Let β > 0. Then
φ_β[κ] = { ξ | ∀α < β. φ_α ξ = ξ }.
Proof. ⊆. This is proved by transfinite induction on β. In case β = α + 1
every φ_(α+1) ξ is a fixed point of φ_α, and hence by IH also of every φ_γ with
γ < α. If β is a limit, then the claim follows from φ_β[κ] = ⋂_(α<β) φ_α[κ].
⊇. Let ξ such that ∀α < β. φ_α ξ = ξ be given. If β = α + 1 is a successor,
then ξ is a fixed point of φ_α, hence ξ ∈ (φ_α)′[κ] = φ_β[κ] by definition of φ_β.
If β is a limit, then ξ ∈ ⋂_(α<β) φ_α[κ] = φ_β[κ].
It follows that φ_α(φ_β ξ) = φ_β ξ for α < β.
Lemma 7.6. If φ is a normal function with 0 < φ0, then λα.φ_α 0 is a
normal function as well.
Proof. We first show
α < β → φ_α 0 < φ_β 0,
by induction on β. So let α < β. Observe that 0 < φ_α 0, by IH or, in case
α = 0, by assumption. Hence 0 is not a fixed point of φ_α, and therefore
φ_α 0 < φ_α(φ_β 0) = φ_β 0.
We now show that λα.φ_α 0 is continuous. So let λ be a limit; we must
show sup_(α<λ) φ_α 0 = φ_λ 0. Let β := sup_(α<λ) φ_α 0. Because every φ_α[κ]
is closed, we have β ∈ φ_α[κ] for all α < λ, hence
β ∈ ⋂_(α<λ) φ_α[κ] = φ_λ[κ],
say β = φ_λ γ. On the other hand φ_λ 0 is a fixed point of every φ_α with
α < λ, so as above φ_α 0 < φ_λ 0 for all α < λ, hence β ≤ φ_λ 0. Therefore
φ_λ 0 ≤ φ_λ γ = β ≤ φ_λ 0, i.e., φ_λ 0 = β.
The fixed points of this function, i.e., the ordinals α such that φ_α 0 = α,
are called strongly critical ordinals. Observe that they depend on the given
normal function φ = φ_0. Their ordering function is usually denoted by λα.Γ_α.
Hence by definition Γ_0 is the least ordinal γ such that φ_γ 0 = γ.
7.3. Normal Form. We now generalize Cantor's normal form, using
the Veblen hierarchy instead of λξ.ω^ξ.
Lemma 7.7.
φ_(α_0)β_0 < φ_(α_1)β_1 ↔ { β_0 < φ_(α_1)β_1,  if α_0 < α_1;
                         { β_0 < β_1,        if α_0 = α_1;   (41)
                         { φ_(α_0)β_0 < β_1, if α_0 > α_1,
φ_(α_0)β_0 = φ_(α_1)β_1 ↔ { β_0 = φ_(α_1)β_1,  if α_0 < α_1;
                         { β_0 = β_1,        if α_0 = α_1;   (42)
                         { φ_(α_0)β_0 = β_1, if α_0 > α_1.
Proof. ←. (41). If α_0 < α_1 and β_0 < φ_(α_1)β_1, then φ_(α_0)β_0 <
φ_(α_0)(φ_(α_1)β_1) = φ_(α_1)β_1, since φ_(α_1)β_1 is a fixed point of φ_(α_0). If
α_0 = α_1 and β_0 < β_1, then φ_(α_0)β_0 < φ_(α_0)β_1. If α_0 > α_1 and
φ_(α_0)β_0 < β_1, then φ_(α_0)β_0 = φ_(α_1)(φ_(α_0)β_0) < φ_(α_1)β_1. For (42) one
argues similarly.
→. If the right hand side of (41) is false, we have
{ φ_(α_1)β_1 ≤ β_0,  if α_0 < α_1;
{ β_1 ≤ β_0,        if α_0 = α_1;
{ β_1 ≤ φ_(α_0)β_0, if α_0 > α_1,
hence by ← (with 0 and 1 exchanged, using (41) and (42))
φ_(α_1)β_1 < φ_(α_0)β_0 or φ_(α_1)β_1 = φ_(α_0)β_0,
hence ¬(φ_(α_0)β_0 < φ_(α_1)β_1). If the right hand side of (42) is false, we have
{ β_0 ≠ φ_(α_1)β_1,  if α_0 < α_1;
{ β_0 ≠ β_1,        if α_0 = α_1;
{ φ_(α_0)β_0 ≠ β_1, if α_0 > α_1,
and hence by ← in (41) either φ_(α_0)β_0 < φ_(α_1)β_1 or φ_(α_1)β_1 < φ_(α_0)β_0,
hence φ_(α_0)β_0 ≠ φ_(α_1)β_1.
Corollary 7.8. If α_0 ≤ α_1, then φ_(α_0)β ≤ φ_(α_1)β.
Proof. Assume α_0 < α_1. By Lemma 7.7 (for ≤) it suffices to show
β ≤ φ_(α_1)β. But this follows from Lemma 7.1.
Corollary 7.9. If φ_(α_0)β_0 = φ_(α_1)β_1, then α_0 = α_1 and β_0 = β_1, provided
β_0 < φ_(α_0)β_0 and β_1 < φ_(α_1)β_1.
Proof. Case α_0 = α_1. Then β_0 = β_1 follows from Lemma 7.7. Case
α_0 < α_1. By Lemma 7.7 we have β_0 = φ_(α_1)β_1 = φ_(α_0)β_0, contradicting our
assumption. Case α_1 < α_0. Similar.
Corollary 7.10. If φ is a normal function with 0 < φ0, then every
fixed point γ of φ = φ_0 can be written uniquely in the form γ = φ_α β with
β < γ.
Proof. Existence. By Lemmas 7.6 and 7.1 we have γ + 1 ≤ φ_(γ+1)0, and
hence γ ∉ rng(φ_(γ+1)). Moreover γ = φ_0 γ ∈ rng(φ_0), and for a limit λ with
γ ∈ rng(φ_δ) for all δ < λ we have γ ∈ ⋂_(δ<λ) rng(φ_δ) = rng(φ_λ). Hence
there is a minimal α + 1 with γ ∉ rng(φ_(α+1)), and γ = φ_α β for some β.
Using Lemma 7.1 it follows that β ≤ φ_α β = γ, and β = γ is impossible,
since otherwise γ would be a fixed point of φ_α and hence in rng((φ_α)′) =
rng(φ_(α+1)). Hence β < γ.
Uniqueness. Let in addition γ = φ_(α_1)β_1 with β_1 < γ. If α_1 < α, then
by Lemma 7.5, γ = φ_α β is a fixed point of φ_(α_1), hence φ_(α_1)β_1 = γ =
φ_(α_1)γ and therefore β_1 = γ, contradicting β_1 < γ. If α < α_1, then
γ = φ_(α_1)β_1 is a fixed point of φ_α, hence γ ∈ rng(φ_(α+1)), contradicting
the choice of α. Hence α = α_1, and therefore β = β_1.
We now show that every ordinal can be written uniquely in a certain
normal form. Here we assume that our initial normal function φ_0 = φ is the
exponential function λξ.ω^ξ with base ω.
Theorem 7.11 (φ Normal Form). Every α ≠ 0 can be written uniquely in
the form
α = φ_(α_1)β_1 + ··· + φ_(α_n)β_n
with φ_(α_1)β_1 ≥ ··· ≥ φ_(α_n)β_n and β_i < φ_(α_i)β_i for i = 1, . . . , n. If α < Γ_0,
then in addition we have α_i < φ_(α_i)β_i for i = 1, . . . , n.
Proof. Existence. First write α in Cantor normal form α = ω^(γ_1) + ··· +
ω^(γ_n) with γ_1 ≥ ··· ≥ γ_n. Every summand ω^(γ_i) with γ_i < ω^(γ_i) is left
unchanged: it equals φ_0 γ_i with γ_i < φ_0 γ_i. Every other summand satisfies
γ_i = ω^(γ_i) and hence by Corollary 7.10 can be replaced by φ_(α′)β′ where
β′ < φ_(α′)β′.
Uniqueness. Let
α = φ_(α_1)β_1 + ··· + φ_(α_n)β_n = φ_(α′_1)β′_1 + ··· + φ_(α′_m)β′_m
and assume that both representations are different. Since no such sum can
extend the other, we must have i ≤ n, m such that (α_i, β_i) ≠ (α′_i, β′_i). By
Lemma 6.1(d) we can assume i = 1. Now if say φ_(α_1)β_1 < φ_(α′_1)β′_1, then we
would have (since φ_(α′_1)β′_1 is an additive principal number and φ_(α_i)β_i ≤
φ_(α_1)β_1 for all i)
φ_(α_1)β_1 + ··· + φ_(α_n)β_n < φ_(α′_1)β′_1 ≤ φ_(α′_1)β′_1 + ··· + φ_(α′_m)β′_m,
a contradiction. Hence φ_(α_1)β_1 = φ_(α′_1)β′_1, and therefore α_1 = α′_1 and
β_1 = β′_1 by Corollary 7.9.
We must show that in case α < Γ_0 we have α_i < φ_(α_i)β_i for i = 1, . . . , n.
So assume α_i ≥ φ_(α_i)β_i for some i. Since λξ.φ_ξ 0 is a normal function
(Lemma 7.6), Lemma 7.1 yields
α_i ≤ φ_(α_i)0 ≤ φ_(α_i)β_i ≤ α_i,
hence φ_(α_i)0 = α_i. Thus α_i is strongly critical, and therefore Γ_0 ≤ α_i ≤ α
by the definition of Γ_0, contradicting α < Γ_0.
Observe that Γ_0 itself is in normal form: Γ_0 = φ_(Γ_0)0, by the definition
of Γ_0.
8. Notes
Set theory as presented in these notes is commonly called ZFC (Zermelo-
Fraenkel set theory with the axiom of choice; C for Choice). Zermelo wrote
down these axioms in 1908, with the exceptions of the Regularity Axiom (due
to von Neumann, 1925) and the replacement scheme (Fraenkel, 1922). Also
Skolem considered principles related to these additional axioms. In ZFC the
only objects are sets; classes are only a convenient way to speak of formulas.
The hierarchy of normal functions defined in Section 7 has been extended
by Veblen [29] to functions with more than one argument. Schütte in [21]
studied these functions carefully and could show that they can be used for a
constructive representation of a segment of the ordinals far bigger than Γ_0.
To this end he introduced so-called "Klammersymbole" to denote the
multiary Veblen functions.
Bachmann extended the Veblen hierarchy using the first uncountable
ordinal Ω. His approach has later been extended by means of symbols for
higher number classes, first by Pfeiffer for finite number classes and then
by Isles for transfinite number classes. However, the resulting theory was
quite complicated and difficult to work with. An idea of Feferman then
simplified the subject considerably. He introduced functions θ_α : On → On
for α ∈ On, which form again a hierarchy of normal functions and extend
the Veblen hierarchy. One usually writes θαβ instead of θ_α(β) and views θ as
a binary function. The ordinals θαβ can be defined by transfinite recursion
on α, as follows. Assume that θ_δ : On → On is defined for all δ < α. Then
θ_α is defined as the ordering function of the class of all α-critical ordinals.
Buchholz observed in [4] that the second argument in θαβ is not used
in any essential way, and that the functions λα.θ_α(Ω_v) with v = 0, 1, . . .
generate a notation system for ordinals of the same strength as the system
with the binary θ function. He then went on and defined directly functions
ψ_v with v ≤ ω that correspond to λα.θ_α(Ω_v). More precisely, he defined
ψ_v(α) for α ∈ On and v ≤ ω by transfinite recursion on α (simultaneously
for all v), as follows:
ψ_v(α) := min{ β | β ∉ C_v(α) },
where C_v(α) is the set of all ordinals that can be generated from the ordinals
< Ω_v by the function + and all ψ_u↾α (i.e., the values ψ_u(ξ) for ξ < α) with
u ≤ ω.
CHAPTER 6
Proof Theory
This chapter presents an example of the type of proof theory inspired
by Hilbert's programme and the Gödel incompleteness theorems. The prin-
cipal goal will be to offer an example of a true, mathematically meaningful
principle not derivable in first-order arithmetic.
The main tool for proving theorems in arithmetic is clearly the induction
schema
A(0) → (∀x. A(x) → A(S(x))) → ∀x A(x).
Here A(x) is an arbitrary formula. An equivalent form of this schema is
cumulative or course-of-values induction
(∀x. ∀y<x A(y) → A(x)) → ∀x A(x).
Both schemas refer to the standard ordering < of the natural numbers. Now
it is tempting to try to strengthen arithmetic by allowing more general
induction schemas, e.g. with respect to the lexicographical ordering of N × N.
More generally, we might pick an arbitrary wellordering ≺ over N and use
the schema of transfinite induction:
(∀x. ∀y≺x A(y) → A(x)) → ∀x A(x).
This can be read as follows. Suppose the property A(x) is progressive,
i.e. from the validity of A(y) for all y ≺ x we can always conclude that A(x)
holds. Then A(x) holds for all x.
One might wonder whether this schema of transfinite induction actually
strengthens arithmetic. We will prove here a classic result of Gentzen [9]
which in a sense answers this question completely. However, in order to
state the result we have to be more explicit about the wellorderings used.
This is done in the next section.
1. Ordinals Below ε_0
In order to be able to speak in arithmetical theories about ordinals, we
use a Gödelization of ordinals. This clearly is possible for countable
ordinals only. Here we restrict ourselves to a countable set of relatively
small ordinals, the ordinals below ε_0. Moreover, we equip these ordinals
with an extra structure (a kind of algebra). It is then customary to speak
of ordinal notations. These ordinal notations could be introduced without
any set theory in a purely formal, combinatorial way, based on the Cantor
normal form for ordinals. However, we take advantage of the fact that we
have just dealt with ordinals within set theory. We also introduce some
elementary relations and operations for such ordinal notations, which will
be used later. For brevity we from now on use the word "ordinal" instead
of "ordinal notation".
1.1. Comparison of Ordinals; Natural Sum.
Lemma 1.1. Let ω^(α_m) + ··· + ω^(α_0) and ω^(β_n) + ··· + ω^(β_0) be Cantor normal
forms (with m, n ≥ 0). Then
ω^(α_m) + ··· + ω^(α_0) < ω^(β_n) + ··· + ω^(β_0)
iff there is an i ≥ 0 such that
α_(m−i) < β_(n−i), α_(m−i+1) = β_(n−i+1), . . . , α_m = β_n,
or m < n and α_m = β_n, . . . , α_0 = β_(n−m).
Proof. Exercise.
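Since these notations are purely combinatorial, Lemma 1.1 can be implemented directly. In the following Python sketch (the representation and names are ours, not from the text) an ordinal below ε_0 is the non-increasing list of the exponents of its Cantor normal form, each exponent again being such a list:

```python
# 0 is [], 1 = ω^0 is [[]], ω = ω^1 is [[[]]], ω + 1 is [[[]], []].

def less(a, b):
    """Lemma 1.1: compare exponents from the largest down; if all compared
    exponents agree, the sum with fewer summands is the smaller ordinal."""
    for x, y in zip(a, b):
        if less(x, y):
            return True
        if less(y, x):
            return False
    return len(a) < len(b)

ZERO, ONE, OMEGA = [], [[]], [[[]]]
assert less(ZERO, ONE) and less(ONE, OMEGA)
assert less(OMEGA, [[[]], []])      # ω < ω + 1
assert not less(OMEGA, OMEGA)
```

The recursion terminates because exponents are structurally smaller lists; this is exactly the combinatorial wellordering of type ε_0 used below.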
We use the notations 1 for ω^0, a for ω^0 + ··· + ω^0 with a copies of ω^0,
and ω^α · a for ω^α + ··· + ω^α with a copies of ω^α.
Lemma 1.2. Let ω^(α_m) + ··· + ω^(α_0) and ω^(β_n) + ··· + ω^(β_0) be Cantor normal
forms. Then
(ω^(α_m) + ··· + ω^(α_0)) + (ω^(β_n) + ··· + ω^(β_0)) = ω^(α_m) + ··· + ω^(α_i) + ω^(β_n) + ··· + ω^(β_0),
where i is minimal such that α_i ≥ β_n; if there is no such i, let i = m + 1 (so
that the sum is ω^(β_n) + ··· + ω^(β_0)).
Proof. Exercise.
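Lemma 1.2 turns ordinal addition into a simple operation on exponent lists: every summand of the left argument below the leading summand of the right argument is absorbed. A sketch in the same hypothetical representation as above:

```python
def less(a, b):
    for x, y in zip(a, b):
        if less(x, y):
            return True
        if less(y, x):
            return False
    return len(a) < len(b)

def add(a, b):
    """Lemma 1.2: keep the exponents of a that are >= the leading exponent
    of b (a prefix of a, since a is non-increasing), then append b."""
    if not b:
        return a
    return [e for e in a if not less(e, b[0])] + b

ONE, OMEGA = [[]], [[[]]]
assert add(ONE, OMEGA) == OMEGA         # 1 + ω = ω
assert add(OMEGA, ONE) == [[[]], []]    # ω + 1 keeps both summands
```

This makes the non-commutativity of ordinal addition (Lemma 6.1(c) of the previous chapter) directly computable.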
One can also define a commutative variant of addition. This is the so-
called natural sum or Hessenberg sum of two ordinals. For Cantor normal
forms ω^(α_m) + ··· + ω^(α_0) and ω^(β_n) + ··· + ω^(β_0) it is defined by
(ω^(α_m) + ··· + ω^(α_0)) # (ω^(β_n) + ··· + ω^(β_0)) := ω^(γ_(m+n+1)) + ··· + ω^(γ_0),
where γ_(m+n+1), . . . , γ_0 is a decreasing permutation of α_m, . . . , α_0, β_n, . . . , β_0.
Lemma 1.3. # is associative, commutative and strongly monotonic in
both arguments.
Proof. Exercise.
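The natural sum, by contrast, keeps every summand of both arguments and merely rearranges the merged exponents into non-increasing order; commutativity is then immediate. A sketch in the same representation (names are ours):

```python
from functools import cmp_to_key

def less(a, b):
    for x, y in zip(a, b):
        if less(x, y):
            return True
        if less(y, x):
            return False
    return len(a) < len(b)

def natural_sum(a, b):
    """Hessenberg sum: a decreasing rearrangement of all exponents."""
    cmp = lambda x, y: -1 if less(x, y) else (1 if less(y, x) else 0)
    return sorted(a + b, key=cmp_to_key(cmp), reverse=True)

ONE, OMEGA = [[]], [[[]]]
# 1 # ω = ω # 1 = ω + 1, whereas 1 + ω = ω
assert natural_sum(ONE, OMEGA) == natural_sum(OMEGA, ONE) == [[[]], []]
```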
1.2. Enumerating Ordinals. In order to work with ordinals in a
purely arithmetical system we set up some effective bijection between our
ordinals < ε_0 and nonnegative integers (i.e., a Gödel numbering). For its
definition it is useful to refer to ordinals in the form
ω^(α_m)·k_m + ··· + ω^(α_0)·k_0  with α_m > ··· > α_0 and k_i ≠ 0 (m ≥ −1).
(By convention, m = −1 corresponds to the empty sum.)
For every ordinal α we define its Gödel number ⌜α⌝ inductively by
⌜ω^(α_m)·k_m + ··· + ω^(α_0)·k_0⌝ := (∏_(i≤m) p_(⌜α_i⌝)^(k_i)) − 1,
where p_n is the nth prime number, starting with p_0 := 2. For every non-
negative integer x we define its corresponding ordinal notation o(x) induc-
tively by
o((∏_(i≤l) p_i^(q_i)) − 1) := Σ#_(i≤l) ω^(o(i))·q_i,
where the sum is to be understood as the natural sum.
Lemma 1.4. (a) o(⌜α⌝) = α.
(b) ⌜o(x)⌝ = x.
Proof. This can be proved easily by induction.
Hence we have a simple bijection between ordinals and nonnegative
integers. Using this bijection we can transfer our relations and operations
on ordinals to computable relations and operations on nonnegative integers.
We use the following abbreviations:
x ≺ y := o(x) < o(y),
ω^x := ⌜ω^(o(x))⌝,
x + y := ⌜o(x) + o(y)⌝,
x·k := ⌜o(x)·k⌝,
ω_k := ⌜ω_k⌝,
where ω_0 := 1 and ω_(k+1) := ω^(ω_k).
We leave it to the reader to verify that ≺, λx.ω^x, λxy.x + y, λxk.x·k and
λk.ω_k are all primitive recursive.
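The numbering ⌜·⌝ and its inverse o can be transcribed directly; in the exponent-list representation used in the earlier sketches, repeated exponents play the role of the coefficients k_i. A Python sketch (naive primality testing suffices for small examples; all helper names are ours):

```python
from functools import cmp_to_key

def less(a, b):
    for x, y in zip(a, b):
        if less(x, y):
            return True
        if less(y, x):
            return False
    return len(a) < len(b)

def nth_prime(n):
    """p_0 = 2, p_1 = 3, ... by trial division."""
    count, cand = -1, 1
    while count < n:
        cand += 1
        if all(cand % d for d in range(2, cand)):
            count += 1
    return cand

def gnum(a):
    """Goedel number: (product over summands ω^e of p_{gnum(e)}) - 1."""
    x = 1
    for e in a:
        x *= nth_prime(gnum(e))
    return x - 1

def o(x):
    """Inverse of gnum: read off prime exponents, then natural-sum them."""
    x += 1
    exps, i = [], 0
    while x > 1:
        p = nth_prime(i)
        while x % p == 0:
            exps.append(o(i))
            x //= p
        i += 1
    cmp = lambda u, v: -1 if less(u, v) else (1 if less(v, u) else 0)
    return sorted(exps, key=cmp_to_key(cmp), reverse=True)

for a in ([], [[]], [[[]]], [[[]], [], []]):   # 0, 1, ω, ω + 2
    assert o(gnum(a)) == a                     # Lemma 1.4(a)
```

The other direction of Lemma 1.4, gnum(o(x)) = x, holds here as well, since multiplication of the prime factors is insensitive to the reordering performed by the natural sum.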
2. Provability of Initial Cases of TI
We now derive initial cases of the principle of transfinite induction in
arithmetic, i.e., of
(∀x. ∀y≺x Py → Px) → ∀x≺a Px
for some number a and a predicate symbol P, where ≺ is the standard order
of order type ε_0 defined in the preceding section. In a later section we will
see that our results here are optimal in the sense that for the full system of
ordinals < ε_0 the principle
(∀x. ∀y≺x Py → Px) → ∀x Px
of transfinite induction is underivable. All these results are due to Gentzen
[9].
2.1. Arithmetical Systems. By an arithmetical system Z we mean a theory based on minimal logic in the →∀ language (including equality axioms), with the following properties. The language of Z consists of a fixed (possibly countably infinite) supply of function and relation constants, which are assumed to denote fixed functions and relations on the nonnegative integers for which a computation procedure is known. Among the function constants there must be a constant S for the successor function and 0 for (the 0-place function) zero. Among the relation constants there must be a constant = for equality and ≺ for the ordering of type ε₀ of the natural numbers, as introduced in Section 1. In order to formulate the general principle of transfinite induction we also assume that a unary relation symbol P is present, which acts like a free set variable.
Terms are built up from object variables x, y, z by means of f(t_1, …, t_m), where f is a function constant. We identify closed terms which have the same value; this is a convenient way to express in our formal systems the assumption that for each function constant a computation procedure is known. Terms of the form S(S(…S(0)…)) are called numerals. We use the notation S^n 0 or n̄ or (only in this chapter) even n for them. Formulas are built up from ⊥ and atomic formulas R(t_1, …, t_m), with R a relation constant or a relation symbol, by means of A → B and ∀xA. Recall that we abbreviate A → ⊥ by ¬A.
The axioms of Z will always include the Peano axioms, i.e., the universal closures of

  S(x) = S(y) → x = y,  (43)
  S(x) = 0 → A,  (44)
  A(0) → (∀x. A(x) → A(S(x))) → ∀xA(x),  (45)

with A(x) an arbitrary formula. We express our assumption that for every relation constant R a decision procedure is known by adding the axiom Rn̄ whenever Rn̄ is true, and ¬Rn̄ whenever Rn̄ is false. Concerning ≺ we require irreflexivity and transitivity for ≺ as axioms, and also, following Schütte, the universal closures of

  x ≺ 0 → A,  (46)
  z ≺ y ⊕ ω^0 → (z ≺ y → A) → (z = y → A) → A,  (47)
  x ⊕ 0 = x,  (48)
  x ⊕ (y ⊕ z) = (x ⊕ y) ⊕ z,  (49)
  0 ⊕ x = x,  (50)
  ω^x·0 = 0,  (51)
  ω^x·S(y) = ω^x·y ⊕ ω^x,  (52)
  z ≺ y ⊕ ω^{S(x)} → z ≺ y ⊕ ω^{e(x,y,z)}·m(x,y,z),  (53)
  z ≺ y ⊕ ω^{S(x)} → e(x,y,z) ≺ S(x),  (54)

where ⊕, λxy.ω^x·y, e and m denote the appropriate function constants and A is any formula. (The reader should check that e, m can be taken to be primitive recursive.) These axioms are formal counterparts to the properties of the ordinal notations observed in the preceding section. We also allow an arbitrary supply of true formulas ∀x⃗ A with A quantifier-free and without P as axioms. Such formulas are called Π₁-formulas (in the literature also Π⁰₁-formulas).
Moreover, we may also add an ex-falso-quodlibet schema Efq or even a stability schema Stab for A:

  ∀x⃗. ⊥ → A,
  ∀x⃗. ¬¬A → A.

Addition of Efq leads to an intuitionistic arithmetical system (the →∀-fragment of Heyting arithmetic HA), and addition of Stab to a classical arithmetical system (a version of Peano arithmetic PA). Note that in our →∀-fragment of minimal logic these schemas are derivable from their instances

  ∀x⃗. ⊥ → Rx⃗,
  ∀x⃗. ¬¬Rx⃗ → Rx⃗,

with R a relation constant or the special relation symbol P. Note also that when the stability schema is present, we can replace (44), (46) and (47) by
their more familiar classical versions

  S(x) ≠ 0,  (55)
  x ⊀ 0,  (56)
  z ≺ y ⊕ ω^0 → z ≠ y → z ≺ y.  (57)
We will also consider restricted arithmetical systems Z_k. They are defined like Z, but with the induction schema (45) restricted to formulas A of level lev(A) ≤ k. The level of a formula A is defined by

  lev(R t⃗) := lev(⊥) := 0,
  lev(A → B) := max(lev(A) + 1, lev(B)),
  lev(∀xA) := max(1, lev(A)).

However, the trivial special case of induction A(0) → ∀xA(Sx) → ∀xA, which amounts to a case distinction, is allowed for arbitrary A. (This is needed in the proof of Theorem 2.2 below.)
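The level function is directly computable. Here is a small Python sketch (the tuple encoding of formulas is our own, not from the text):

```python
# Formulas as nested tuples: ('atom', name), ('bot',),
# ('imp', A, B) for A -> B, and ('all', x, A) for "forall x A".
def lev(f):
    """Level of a formula, following the clauses above."""
    tag = f[0]
    if tag in ('atom', 'bot'):
        return 0
    if tag == 'imp':
        return max(lev(f[1]) + 1, lev(f[2]))
    if tag == 'all':
        return max(1, lev(f[2]))
    raise ValueError(f'unknown formula tag: {tag}')
```

Treating the bounded quantifier ∀y≺x Py as ∀y(y≺x → Py) with atomic parts as atoms, the progressiveness formula ∀x(∀y≺x Py → Px) has level 2, and the transfinite induction formula built from it has level 3, as stated later in Section 4.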
2.2. Gentzen's Proof.
Theorem 2.1 (Provable Initial Cases of TI in Z). Transfinite induction up to ω_n, i.e., for arbitrary A(x) the formula

  (∀x. ∀y≺x A(y) → A(x)) → ∀x≺ω_n A(x),

is derivable in Z.
Proof. To every formula A(x) we assign a formula A⁺(x) (with respect to a fixed variable x) by

  A⁺(x) := ∀y. ∀z≺y A(z) → ∀z≺y⊕ω^x A(z).

We first show:

  If A(x) is progressive, then A⁺(x) is progressive,

where "B(x) is progressive" means ∀x. ∀y≺x B(y) → B(x). So assume that A(x) is progressive and

  (58) ∀y≺x A⁺(y).

We have to show A⁺(x). So assume further

  (59) ∀z≺y A(z)

and z ≺ y ⊕ ω^x. We have to show A(z).
Case x = 0. Then z ≺ y ⊕ ω^0. By (47) it suffices to derive A(z) from z ≺ y as well as from z = y. If z ≺ y, then A(z) follows from (59), and if z = y, then A(z) follows from (59) and the progressiveness of A(x).
Case Sx. From z ≺ y ⊕ ω^{Sx} we obtain z ≺ y ⊕ ω^{e(x,y,z)}·m(x,y,z) by (53) and e(x,y,z) ≺ Sx by (54). From (58) we obtain A⁺(e(x,y,z)). By the definition of A⁺(x) we get

  ∀u≺y⊕ω^{e(x,y,z)}·v A(u) → ∀u≺(y⊕ω^{e(x,y,z)}·v)⊕ω^{e(x,y,z)} A(u)

and hence, using (49) and (52),

  ∀u≺y⊕ω^{e(x,y,z)}·v A(u) → ∀u≺y⊕ω^{e(x,y,z)}·S(v) A(u).

Also from (59) and (51), (48) we obtain

  ∀u≺y⊕ω^{e(x,y,z)}·0 A(u).

Using an appropriate instance of the induction schema we can conclude

  ∀u≺y⊕ω^{e(x,y,z)}·m(x,y,z) A(u)

and hence A(z).
We now show, by induction on n, how for an arbitrary formula A(x) we can obtain a derivation of

  (∀x. ∀y≺x A(y) → A(x)) → ∀x≺ω_n A(x).

So assume the left hand side, i.e., assume that A(x) is progressive.
Case 0. Then x ≺ ω_0 = ω^0, and hence x ≺ 0 ⊕ ω^0 by (50). By (47) it suffices to derive A(x) from x ≺ 0 as well as from x = 0. Now x ≺ 0 → A(x) holds by (46), and A(0) then follows from the progressiveness of A(x).
Case n + 1. Since A(x) is progressive, by what we have shown above A⁺(x) is also progressive. Applying the IH to A⁺(x) yields ∀x≺ω_n A⁺(x), and hence A⁺(ω_n) by the progressiveness of A⁺(x). Now the definition of A⁺(x) (together with (46) and (50)) yields ∀z≺ω_{n+1} A(z).
Note that in the induction step of this proof we have derived transfinite induction up to ω_{n+1} for A(x) from transfinite induction up to ω_n for a formula of level higher than the level of A(x).
We now want to refine the preceding theorem to a corresponding result for the subsystems Z_k of Z.
Theorem 2.2 (Provable Initial Cases of TI in Z_k). Let 1 ≤ l ≤ k. Then in Z_k we can derive transfinite induction for any formula A(x) of level ≤ l up to ω_{k−l+2}[m] for arbitrary m, i.e.,

  (∀x. ∀y≺x A(y) → A(x)) → ∀x≺ω_{k−l+2}[m] A(x),

where ω_1[m] := m, ω_{i+1}[m] := ω^{ω_i[m]}.
Proof. Note first that if A(x) is a formula of level l ≥ 1, then the formula A⁺(x) constructed in the proof of the preceding theorem has level l + 1, and for the proof of

  If A(x) is progressive, then A⁺(x) is progressive,

we have used induction with an induction formula of level l.
Now let A(x) be a fixed formula of level l, and assume that A(x) is progressive. Define A⁰ := A, A^{i+1} := (A^i)⁺. Then lev(A^i) ≤ l + i, and hence in Z_k we can derive that A¹, A², …, A^{k−l+1} are all progressive. Now from the progressiveness of A^{k−l+1}(x) we obtain A^{k−l+1}(0), A^{k−l+1}(1), A^{k−l+1}(2) and generally A^{k−l+1}(m) for any m, i.e., A^{k−l+1}(ω_1[m]). But since

  A^{k−l+1}(x) = (A^{k−l})⁺(x) = ∀y. ∀z≺y A^{k−l}(z) → ∀z≺y⊕ω^x A^{k−l}(z),

we first get (with y = 0) ∀z≺ω_2[m] A^{k−l}(z) and then A^{k−l}(ω_2[m]) by the progressiveness of A^{k−l}. Repeating this argument we finally obtain

  ∀z≺ω_{k−l+2}[m] A⁰(z).

This concludes the proof.
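Unfolding the definition of the bounding ordinals (a remark of ours, not from the text):

```latex
\omega_1[m] = m, \qquad
\omega_2[m] = \omega^{m}, \qquad
\omega_3[m] = \omega^{\omega^{m}}, \qquad \ldots , \qquad
\sup_{m<\omega} \omega_i[m] = \omega_i \quad (i \ge 2).
```

So for formulas of level ≤ l the theorem provides transfinite induction cofinally below ω_{k−l+2}, although not up to ω_{k−l+2} itself in a single derivation.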
Our next aim is to prove that these bounds are sharp. More precisely, we will show that in Z (no matter how many true Π₁-formulas we have added as axioms) one cannot derive purely schematic transfinite induction up to ε₀, i.e., one cannot derive the formula

  (∀x. ∀y≺x Py → Px) → ∀x Px

with a relation symbol P, and that in Z_k one cannot derive transfinite induction up to ω_{k+1}, i.e., the formula

  (∀x. ∀y≺x Py → Px) → ∀x≺ω_{k+1} Px.

This will follow from the method of normalization applied to arithmetical systems, which we have to develop first.
3. Normalization with the Omega Rule
We will show in Theorem 4.7 that a normalization theorem does not hold for arithmetical systems Z, in the sense that it is not the case that for any formula A derivable in Z there is a derivation of the same formula A in Z which only uses formulas of a level bounded by the level of A. The reason for this failure is the presence of induction axioms, which can be of arbitrary level.
Here we remove that obstacle against normalization in a somewhat drastic way: we leave the realm of proofs as finite combinatory objects and replace the induction axiom by a rule with infinitely many premises, the so-called ω-rule (suggested by Hilbert and studied by Lorenzen, Novikov and Schütte), which allows us to conclude ∀xA(x) from A(0), A(1), A(2), …, i.e.,

  d_0      d_1            d_i
  A(0)     A(1)    …      A(i)    …
  ─────────────────────────────────── ω
               ∀xA(x)

So derivations can be viewed as labelled infinite (countably branching) trees. As in the finitary case a label consists of the derived formula and the name of the rule applied. Since we define derivations inductively, any such derivation tree must be well-founded, i.e., must not contain an infinite descending path.
Clearly this ω-rule can also be used to replace the rule ∀⁺x. As a consequence we do not need to consider free individual variables.
It is plain that every derivation in an arithmetical system Z can be translated into an infinitary derivation with the ω-rule; this will be carried out in Lemma 3.3 below. The resulting infinitary derivation has a noteworthy property: in any application of the ω-rule the cut-ranks of the infinitely many immediate subderivations d_n are bounded, and also their sets of free assumption variables are bounded by a finite set. Here the cut-rank of a derivation is as usual the least number ≥ the level of any subderivation obtained by →⁺ occurring as the main premise of an application of →⁻.
Derivations in Z^∞ and their depths are defined inductively; typical clauses are:

(→⁻) If d^{A→B} and e^A are derivations of depths α_i < α (i = 1, 2), then (d^{A→B} e^A)^B is a derivation of depth α.
(ω) For all A(x), if d_i^{A(i)} are derivations of depths α_i < α (i < ω), then (⟨d_i^{A(i)}⟩_{i<ω})^{∀xA} is a derivation of depth α.

We now turn to the embedding of the systems Z_k into Z^∞. In this embedding we refer to the number n_I(d) of nested applications of the induction schema within a Z_k-derivation d.
The nesting of applications of induction in d, n_I(d), is defined by induction on d, as follows:

  n_I(u) := n_I(Ax) := 0,
  n_I(Ind) := 1,
  n_I(Ind t d e) := max(n_I(d), n_I(e) + 1),
  n_I(d e) := max(n_I(d), n_I(e)), if d is not of the form Ind t d_0,
  n_I(λu.d) := n_I(λx.d) := n_I(d t) := n_I(d).
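As a quick illustration, the clauses for n_I can be run on a toy term representation of derivations (the encoding below, with tuples such as ('app', d, e), is ours, not from the text):

```python
# Derivation terms: ('u',) assumption, ('ax',) axiom, ('ind',) induction
# axiom, ('term', d) application of d to an object term, ('app', d, e)
# application to a derivation, ('lam', d) abstraction.
def is_ind_t_d0(d):
    """Is d of the form Ind t d0 (induction applied to an object term and
    a base-case derivation only)?"""
    return d[0] == 'app' and d[1][0] == 'term' and d[1][1][0] == 'ind'

def n_I(d):
    """Nesting of induction applications, following the clauses above."""
    tag = d[0]
    if tag in ('u', 'ax'):
        return 0
    if tag == 'ind':
        return 1
    if tag == 'app':
        if is_ind_t_d0(d[1]):                 # d = Ind t d0 e
            return max(n_I(d[1]), n_I(d[2]) + 1)
        return max(n_I(d[1]), n_I(d[2]))      # ordinary application
    return n_I(d[1])                          # 'lam' and 'term'
```

An induction whose step derivation itself uses an induction gets nesting 2, as the clauses require.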
3.2. Long Normal Form. For the next lemma we need the notion of the long normal form of a derivation. In Subsection 3.7 of Chapter 1 we have studied the form of normal derivations in minimal logic. We considered the notion of a track and observed that in every track all elimination rules precede all introduction rules, and that in a uniquely determined minimal node we encounter a minimal formula, that is, a subformula of any formula in the elimination part as well as in the introduction part of the track. In the notion of a long normal form we additionally require that every minimal formula is atomic.
For simplicity we restrict ourselves to the →-fragment of minimal propositional logic; however, our considerations are valid for the full language as well.
For terms of the typed λ-calculus we define the expansion of a variable by

  η_V(x^ρ) := λz⃗. x η_V(z⃗),

so by induction on the type of the variable. The expansion of a term can then be defined by induction on terms:

  η(λy⃗.(x M⃗)) := λy⃗, z⃗. x η(M⃗) η_V(z⃗).

Note that we always have η(x) = η_V(x). Hence clearly:
Lemma 3.2. Every term can be transformed into long normal form, by first normalizing and then expanding it.
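The expansion operation can be programmed directly for the →-fragment. In the sketch below (our own encoding, not from the text) a normal term λy⃗. x M⃗ is a triple (binders, head, args), and types are ('base',) or ('arr', σ, τ):

```python
import itertools

_fresh = (f"z{i}" for i in itertools.count())  # fresh variable names

def arg_types(ty):
    """Argument types of ty = t1 -> ... -> tn -> base, as a list."""
    out = []
    while ty[0] == 'arr':
        out.append(ty[1])
        ty = ty[2]
    return out

def expand_var(x, ty):
    """eta_V(x^rho) := lambda z-vec. x eta_V(z-vec)."""
    zs = [(next(_fresh), t) for t in arg_types(ty)]
    return (zs, x, [expand_var(z, t) for z, t in zs])

def expand(term, env):
    """Long normal form of a normal term; env maps names to types."""
    binders, head, args = term
    env = {**env, **dict(binders)}
    missing = arg_types(env[head])[len(args):]  # arguments still to supply
    zs = [(next(_fresh), t) for t in missing]
    return (binders + zs, head,
            [expand(m, env) for m in args] + [expand_var(z, t) for z, t in zs])
```

For example, a variable x of type (ι→ι)→ι expands to λz. x (λz′. z z′), and expanding the term x gives the same result as expanding the variable x, as noted above.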
3.3. Embedding of Z_k.
Lemma 3.3. Let a Z_k-derivation in long normal form be given with m nested applications of the induction schema, i.e., of

  A(0) → (∀x. A(x) → A(Sx)) → ∀xA(x),

all with lev(A) ≤ k. We consider subderivations d^B not of the form Ind t or Ind t d_0. For every such subderivation and closed substitution instance B′ of B we construct (d^∞)^{B′} in Z^∞ with |d^∞| < ω_{m+1} and cr(d^∞) ≤ k, and moreover such that d^∞ is obtained by →⁺ iff d is.
Proof. By induction on d.
Case Ind t d (λxλv.e) (the subderivations Ind t and Ind t d are not treated separately, since both are in long normal form). By IH we have d^∞ and e^∞. Write e^∞(i, f) for the result of substituting in e^∞ the numeral i for x and the derivation f for the assumption variable v, and take

  (Ind t d (λxλv.e))^∞ := ⟨d^∞, e^∞(0, d^∞), e^∞(1, e^∞(0, d^∞)), …⟩.

By IH |e^∞| ≤ ω_{m−1}·p and |d^∞| ≤ ω_m·q for some p, q < ω. By Lemma 3.1 we obtain

  |e^∞(0, d^∞)| ≤ ω_m·q + ω_{m−1}·p,
  |e^∞(1, e^∞(0, d^∞))| ≤ ω_m·q + ω_{m−1}·2p

and so on, and hence

  |(Ind t d (λxλv.e))^∞| ≤ ω_m·(q + 1).

Concerning the cut-rank we have by IH cr(d^∞), cr(e^∞) ≤ k. Therefore

  cr(e^∞(0, d^∞)) ≤ max(lev(A(0)), cr(d^∞), cr(e^∞)) ≤ k,
  cr(e^∞(1, e^∞(0, d^∞))) ≤ k,

and so on, and hence

  cr((Ind t d (λxλv.e))^∞) ≤ k.

Case λu^C.d^B. By IH we have (d^∞)^{B′} with possibly free assumptions u^{C′}. Take (λu.d)^∞ := λu^{C′}.d^∞.
Case de, with d not of the form Ind t or Ind t d_0. By IH we have d^∞ and e^∞, where d^∞ is not obtained by →⁺. Therefore (de)^∞ := d^∞ e^∞ is normal, with cr((de)^∞) = max(cr(d^∞), cr(e^∞)) ≤ k and |(de)^∞| < ω_{m+1}.
Case (λx.d)^{∀xB(x)}. By IH for every i and substitution instance B′(i) we have d^{∞,i}. Take (λx.d)^∞ := ⟨d^{∞,i}⟩_{i<ω}.
Case (dt)^{B[x:=t]}. By IH we have (d^∞)^{(∀xB)′}. Let j be the numeral with the same value as t. If d^∞ = ⟨d_i^∞⟩_{i<ω} (which can only be the case if d = Ind t′ d_0 e_0, for dt is a subderivation of a normal derivation), take (dt)^∞ := d_j^∞. Otherwise take (dt)^∞ := d^∞ j.
3.4. Normalization for Z^∞. The normalization theorem now takes the following form: for every derivation d^A in Z^∞ of depth ≤ α and cut-rank ≤ k + 1 there is a derivation (d^k)^A, with free assumption variables contained in those of d, of depth ≤ 2^α and cut-rank ≤ k.
Proof. By induction on α. The only case which requires some argument is when the derivation is of the form de with |d| ≤ α_1 < α and |e| ≤ α_2 < α, but is not a simple application. We first consider the subcase where d^k = λu.d_1(u) and lev(d) = k + 1. Then lev(e) ≤ k by the definition of level (recall that the level of a derivation was defined to be the level of the formula it derives), and hence d_1[u := e^k] has cut-rank ≤ k by Lemma 3.1. Furthermore, also by Lemma 3.1, d_1[u := e^k] has depth

  ≤ 2^{α_2} + 2^{α_1} ≤ 2^{max(α_2, α_1)+1} ≤ 2^α.

If we are not in the above subcases, we can simply take (de)^k to be d^k e^k. This derivation clearly has depth ≤ 2^α.
Iterating this construction we obtain, for every derivation d^A in Z^∞ of depth ≤ α and cut-rank ≤ k, a normal derivation (d*)^A with free assumption variables contained in those of d, which has depth ≤ 2_k(α), where

  2_0(α) := α, 2_{m+1}(α) := 2^{2_m(α)}.
As in Section 3.7 of Chapter 1 we can now analyze the structure of normal derivations in Z^∞. In particular we obtain:
Theorem 3.6 (Subformula Property for Z^∞). Every formula occurring in a normal derivation in Z^∞ is a subformula of the end formula or of a free assumption.

4. Unprovable Initial Cases of Transfinite Induction

We now work in Z^∞ + Prog(P), the extension of Z^∞ by the progression rule for P; depth and cut-rank of derivations are defined as before.
We now show that from the progression rule for P we can easily derive
the progressiveness of P.
Lemma 4.2. We have a normal derivation of ∀x.∀y≺x Py → Px in Z^∞ + Prog(P):

  i ≺ j → Pi     i ≺ j
  ───────────────────── →⁻
          Pi                … (all i ≺ j)
  ──────────────────────────────────────── Prog
                  Pj
  ──────────────────────────────────────── →⁺
          ∀y≺j Py → Pj                      … (all j)
  ──────────────────────────────────────────────────── ω
              ∀x.∀y≺x Py → Px
Lemma. Let d be a normal derivation in Z^∞ + Prog(P) of a formula ∀xA with A quantifier-free from quantifier-free assumptions. Then any track in d of positive order ends with a quantifier-free formula.
Proof. If not, then the major premise of the …

Lemma. Let d be a normal derivation in Z^∞ + Prog(P) of a formula A without ∀ to the left of → from quantifier-free assumptions. Then for any quantifier-free instance A′ of A we can find a quasi-normal derivation d′ of A′ from the same assumptions such that
(a) d′ does not use the ω-rule,
(b) d′ contains …
Proof. Case →⁻:

  A → B   A
  ─────────
      B

By the quasi-subformula property A must be quantifier-free. Let B′ be a quantifier-free instance of B. Then by definition A → B′ is a quantifier-free instance of A → B. The claim now follows from the IH.
Case →⁺:

      B
  ─────────
    A → B

Any instance of A → B has the form A → B′ with B′ an instance of B. Hence the claim follows from the IH.
Case ∀⁻:

  ∀xA(x)   i
  ───────────
     A(i)

Then any quantifier-free instance of A(i) is also a quantifier-free instance of ∀xA(x), and hence the claim follows from the IH.
Case ω:

  … A(i) …  (all i < ω)
  ─────────────────────
        ∀xA(x)

Any quantifier-free instance of ∀xA(x) has the form A(i)′ with A(i)′ a quantifier-free instance of A(i). Hence the claim again follows from the IH.
A derivation d in Z^∞ + Prog(P) is called a Pα⃗, ¬Pβ⃗-refutation if its free assumptions are among Pα_1, …, Pα_n and ¬Pβ_1, …, ¬Pβ_m. (A Pα⃗, ¬Pβ⃗-refutation shows ⋀_i Pα_i → ⋁_j Pβ_j.)
Lemma 4.6. Let d be a quasi-normal Pα⃗, ¬Pβ⃗-refutation. Then

  min(β⃗) ≤ |d| + lh(α⃗′),

where α⃗′ is the sublist of α⃗ consisting of all α_i < min(β⃗), and lh(α⃗′) denotes the length of the list α⃗′.
Proof. By induction on |d|. By the lemma above we may assume that d does not contain the ω-rule, and contains quantifier-free formulas only. Then d = f^{C→D} e^C. If C is a true quantifier-free formula without P or of the form Pγ with γ < min(β⃗), then

  min(β⃗) ≤ |f| + lh(α⃗′) + 1 ≤ |d| + lh(α⃗′).

If C is a false quantifier-free formula without P or of the form Pγ with γ ≥ min(β⃗), then

  min(β⃗) ≤ |e| + lh(α⃗′) + 1 ≤ |d| + lh(α⃗′).

It remains to consider the case when C is a quantifier-free implication involving P. Then lev(C) ≤ 1, hence lev(C → D) ≤ 2. So we have

  f^{¬Pγ → D}       [u : ¬Pγ]
                       e_0
                       ⋮

and the claim follows from the IH for e_0.
Case →⁺ cannot occur.
Case Prog(P). Then d = (⟨d_β^{Pβ}⟩_{β<γ})^{Pγ}. By assumption on d, γ is in β⃗. We may assume γ = min(β⃗), for otherwise some d_{β_i} with β_i < γ would be a quasi-normal Pα⃗, ¬Pβ⃗-refutation with dp(d_{β_i}) < dp(d); hence γ = min(β⃗) ≤ |d|.
To deal with the situation that some β_j are less than γ, we observe that there can be at most finitely many β_j immediately preceding γ; so let δ be the least ordinal such that

  δ ≤ ξ < γ implies ξ ∈ β⃗.

Then δ, δ+1, …, δ+k−1 ∈ β⃗ and δ+k = γ. We may assume that δ is either a successor or a limit. If δ = δ′ + 1, it follows by the IH, since d_{δ−1} is a quasi-normal Pα⃗, ¬Pβ⃗, ¬P(δ−1)-refutation, that

  δ − 1 ≤ dp(d_{δ−1}) + lh(α⃗′) − k,

where α⃗′ is the sequence of all α_j < δ. Hence δ ≤ |d| + lh(α⃗′) − k, and so γ ≤ |d| + lh(α⃗′).
If δ is a limit, there is a sequence ⟨δ_{f(n)}⟩_n with limit δ, and with all β_j < δ below δ_{f(0)}, and so by the IH

  δ_{f(n)} ≤ dp(d_{δ_{f(n)}}) + lh(α⃗′) − k,

and hence δ ≤ |d| + lh(α⃗′) − k, so γ ≤ |d| + lh(α⃗′).
4.4. Underivability of Transfinite Induction.
Theorem. Transfinite induction up to ε₀ is underivable in Z, i.e.,

  Z ⊬ (∀x. ∀y≺x Py → Px) → ∀x Px

with a relation symbol P, and for k ≥ 3 transfinite induction up to ω_{k+1} is underivable in Z_k, i.e.,

  Z_k ⊬ (∀x. ∀y≺x Py → Px) → ∀x≺ω_{k+1} Px.

Proof. We restrict ourselves to the second part. So assume that transfinite induction up to ω_{k+1} is derivable in Z_k. Then by the embedding of Z_k into Z^∞ and subsequent normalization we obtain a quasi-normal derivation d of ∀x≺ω_{k+1} Px in Z^∞ + Prog(P) whose depth α satisfies α + 3 ≺ ω_{k+1}, with cut-rank ≤ 1. Hence there is also a quasi-normal derivation of P(α+3) in Z^∞ + Prog(P) with depth α + 2 and cut-rank ≤ 1, of the form

  d^{∀x≺ω_{k+1} Px}
  ────────────────────────── ∀⁻            d′
  α+3 ≺ ω_{k+1} → P(α+3)            α+3 ≺ ω_{k+1}
  ──────────────────────────────────────────────── →⁻
                     P(α+3)

where d′ is a deduction of finite depth (it may even be an axiom, depending on the precise choice of axioms for Z); this contradicts the lemma just proved.
4.5. Normalization for Arithmetic is Impossible. The normalization theorem for first-order logic applied to one of our arithmetical systems Z is not particularly useful, since we may have used induction axioms of arbitrary complexity in our derivation. Hence it is tempting to first eliminate the induction schema in favour of an induction rule, allowing us to conclude ∀xA(x) from a derivation of A(0) and a derivation of A(Sx) with an additional assumption A(x) to be cancelled at this point (note that this rule is equivalent to the induction schema), and then to try to normalize the resulting derivation in the new system Z with the induction rule. We will apply Gentzen's Theorems on Underivability and Derivability of Transfinite Induction to show that even a very weak form of the normalization theorem cannot hold in Z with the induction rule.
Theorem. The following weak form of a normalization theorem for Z with the induction rule is false: For any derivation d^B with free assumption variables among u_i^{A_i} for formulas A_i, B of level ≤ l there is a derivation (d*)^B, with free assumption variables contained in those of d, which contains only formulas of level ≤ k, where k depends on l only.
Proof. Assume that such a normalization theorem holds. Consider the formula

  (∀x. ∀y≺x Py → Px) → ∀x≺ω_{n+1} Px

expressing transfinite induction up to ω_{n+1}, which is of level 3. By Gentzen's Theorem on Derivability of Transfinite Induction it is derivable in Z. Now from our assumption it follows that there exists a derivation of this formula containing only formulas of level ≤ k, for some k independent of n. Hence Z_k derives transfinite induction up to ω_{n+1} for any n. But this clearly contradicts the theorem above (Underivability of Transfinite Induction).
Bibliography
1. E.W. Beth, Semantic construction of intuitionistic logic, Mededelingen der KNAW N.S. 19 (1956), no. 11.
2. ———, The foundations of mathematics, North-Holland, Amsterdam, 1959.
3. Egon Börger, Erich Grädel, and Yuri Gurevich, The classical decision problem, Perspectives in Mathematical Logic, Springer-Verlag, Berlin, Heidelberg, New York, 1997.
4. Wilfried Buchholz, A new system of proof-theoretic ordinal functions, Annals of Pure and Applied Logic 32 (1986), no. 3, 195–207.
5. Georg Cantor, Beiträge zur Begründung der transfiniten Mengenlehre, Mathematische Annalen 49 (1897).
6. C.C. Chang and H.J. Keisler, Model theory, 3rd ed., Studies in Logic, vol. 73, North-Holland, Amsterdam, 1990.
7. Nicolaas G. de Bruijn, Lambda calculus notation with nameless dummies, a tool for automatic formula manipulation, with application to the Church-Rosser theorem, Indagationes Math. 34 (1972), 381–392.
8. Gerhard Gentzen, Untersuchungen über das logische Schließen, Mathematische Zeitschrift 39 (1934), 176–210, 405–431.
9. Gerhard Gentzen, Beweisbarkeit und Unbeweisbarkeit von Anfangsfällen der transfiniten Induktion in der reinen Zahlentheorie, Mathematische Annalen 119 (1943), 140–161.
10. Kurt Gödel, Die Vollständigkeit der Axiome des logischen Funktionenkalküls, Monatshefte für Mathematik und Physik 37 (1930), 349–360.
11. Kurt Gödel, Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I, Monatshefte für Mathematik und Physik 38 (1931), 173–198.
12. David Hilbert and Paul Bernays, Grundlagen der Mathematik II, second ed., Grundlehren der mathematischen Wissenschaften, vol. 50, Springer-Verlag, Berlin, 1970.
13. Felix Joachimski and Ralph Matthes, Short proofs of normalisation for the simply-typed λ-calculus, permutative conversions and Gödel's T, Archive for Mathematical Logic 42 (2003), 59–87.
14. Ingebrigt Johansson, Der Minimalkalkül, ein reduzierter intuitionistischer Formalismus, Compositio Mathematica 4 (1937), 119–136.
15. Stephen C. Kleene, Introduction to metamathematics, D. van Nostrand Comp., New York, 1952.
16. L. Löwenheim, Über Möglichkeiten im Relativkalkül, Mathematische Annalen 76 (1915), 447–470.
17. A. Malzew, Untersuchungen aus dem Gebiete der mathematischen Logik, Rec. Math. N.S. 1 (1936), 323–336.
18. M.H.A. Newman, On theories with a combinatorial definition of equivalence, Annals of Mathematics 43 (1942), no. 2, 223–243.
19. V.P. Orevkov, Lower bounds for increasing complexity of derivations after cut elimination, Zapiski Nauchnykh Seminarov Leningradskogo 88 (1979), 137–161.
20. Dag Prawitz, Natural deduction, Acta Universitatis Stockholmiensis. Stockholm Studies in Philosophy, vol. 3, Almqvist & Wiksell, Stockholm, 1965.
21. Kurt Schütte, Kennzeichnung von Ordinalzahlen durch rekursiv definierte Funktionen, Mathematische Annalen 127 (1954), 16–32.
22. J.C. Shepherdson and H.E. Sturgis, Computability of recursive functions, J. Ass. Computing Machinery 10 (1963), 217–255.
23. Joseph R. Shoenfield, Mathematical logic, Addison-Wesley Publ. Comp., Reading, Massachusetts, 1967.
24. T. Skolem, Logisch-kombinatorische Untersuchungen über die Erfüllbarkeit oder Beweisbarkeit mathematischer Sätze nebst einem Theorem über dichte Mengen, Skrifter utgitt av Videnskapsselskapet i Kristiania, I, Mat.-Naturv. Kl. 4 (1920), 36 pp.
25. Richard Statman, Bounds for proof-search and speed-up in the predicate calculus, Annals of Mathematical Logic 15 (1978), 225–287.
26. William W. Tait, Infinitely long terms of transfinite type I, Formal Systems and Recursive Functions (J. Crossley and M. Dummett, eds.), North-Holland, Amsterdam, 1965, pp. 176–185.
27. Anne S. Troelstra and Helmut Schwichtenberg, Basic proof theory, 2nd ed., Cambridge University Press, 2000.
28. Femke van Raamsdonk and Paula Severi, On normalisation, Computer Science Report CS-R9545, Centrum voor Wiskunde en Informatica, 1995. Forms a part of van Raamsdonk's thesis from 1996.
29. Oswald Veblen, Continuous increasing functions of finite and transfinite ordinals, Transactions AMS 9 (1908), 280–292.
Index
R-transitive closure, 104
Σ₁-formulas of the language L₁, 86
dp(A), 3
¬A, 3
FV, 4
→, 12
→⁺, →⁻, 13
∀⁺, 12
∀⁻, 13
, 13
A(t), 4, 77
E[x⃗ := t⃗], 4
E[x := t], 4
all class, 93
application
simple, 148
arithmetic
Peano, 142
arithmetical system, 141
classical, 142
intuitionistic, 142
restricted, 143
assignment, 33
assumption, 6
closed, 6
open, 6
axiom of choice, 44, 45, 47, 120, 121
axiom system, 48
bar, 36
Bethstructure, 35
branch, 36
generic, 43
main, 30
Bruijn, de, 3
Cantor
Theorem of, 116
Cantor-Bernstein
theorem of, 116
cardinal, 117
regular, 123
singular, 123
cardinality, 121
carrier set, 33
cartesian product, 94
Church-Rosser property, 16
weak, 16
class, 92
bounded, 134
closed, 134
closed unbounded, 134
inductive, 99
normal, 134
proper, 93
transitive, 99
well-founded, 107
classes
equal, 93
closure, 12, 24
coincidence lemma, 37
composition, 94
concatenation, 63
conclusion, 5, 6
cofinal, 123
cofinality, 123
confluent, 16
weakly, 16
congruence relation, 48
conjunction, 7, 21
connex, 107
consistency, 87
consistent set of formulas, 44
constant, 2
continuum hypothesis, 125
generalized, 125
conversion rule, 12, 24
countable, 48
CR, 16
critical number, 135
cumulative type structure, 91
Curry-Howard correspondence, 12
cut, 29
cut-rank, 146
decoding, 63
Dedekind-finite, 122
Dedekind-infinite, 122
definability
explicit, 31
depth (of a formula), 3
derivability conditions, 89
derivable, 8
derivation, 5
convertible, 148
normal, 29, 148
quasi-normal, 150
derivative, 135
disjunction, 7, 21
classical, 2
domain, 33, 94
dot notation, 3
E-part, 30
E-rule, 6
element, 92
maximal, 120
elementarily enumerable, 67
elementarily equivalent, 49
elementary equivalent, 47
elementary functions, 58
elimination, 23
elimination part, 30
equality, 8
equality axioms, 48
equinumerous, 116
expansion
of a term, 147
of a variable, 147
ex-falso-quodlibet, 5
ex-falso-quodlibet schema, 142
existence elimination, 18
existence introduction, 18
existential quantifier, 7, 21
classical, 2
expansion, 47
explicit denability, 31
extensionality axiom, 92
F-product structure, 45
falsity, 8
field, 52
archimedian ordered, 52
ordered, 52
fields
ordered, 52
filter, 44
finite, 122
finite intersection property, 45
finitely axiomatizable, 53
fixed point
least, 69
formula, 2
atomic, 2
negative, 10
Π₁, 142
prime, 2
Rasiowa-Harrop, 31
formula occurrence, 17
free (for a variable), 3
Friedman, 39
function, 94
bijective, 94
computable, 68
continuous, 134
elementary, 58
injective, 94
monotone, 134
recursive, 67
recursive, 72
representable, 79
subelementary, 58
surjective, 94
function symbol, 2
Gödel function, 62
Gödel number, 73
Gentzen, 1, 139, 141
Gödel-Gentzen translation ^g, 10
Gödel's function, 83
Hartogs number, 118
Hessenberg sum, 140
I-part, 30
I-rule, 6
image, 94
Incompleteness Theorem
First, 81
indirect proof, 5
Induction
transfinite on On, 112
induction
course-of-values, on ω, 101
on ω, 99
transfinite, 139
transfinite on On, different forms, 112
Induction theorem, 97
inductively defined predicate, 8
infinite, 48, 122
infinity axiom, 99
infix notation, 3
inner reductions, 12, 25
instance of a formula, 151
instruction number, 64
interpretation, 33
introduction part, 30
intuitionistic logic, 5
inverse, 94
isomorphic, 49
Klammersymbol, 138
Kleene, 55
Kuratowski pair, 94
Löwenheim, 44
language
elementarily presented, 73
leaf, 36
least number operator, 58
lemma
Newman's, 16
length, 63
length of a formula, 3
length of a segment, 29
level
of a derivation, 145
level of a formula, 143
limit, 111
logic
classical, 9
intuitionistic, 8
minimal, 8
marker, 6
maximal segment, 29
minimal formula, 147
minimal node, 147
minimum part, 29
model, 48
modus ponens, 6
monotone enumeration, 134
Mostowski
isomorphy theorem of, 106
natural numbers, 99
natural sum, 140
negation, 2
Newman, 16
Newman's lemma, 16
node
consistent, 42
stable, 42
non-standard model, 51
normal derivation, 29
normal form, 13
long, 147
Normal Form Theorem, 65
normal function, 134
normalizing
strongly, 13
notion of truth, 79
numbers
natural, 99
numeral, 77
of cardinality n, 48
order of a track, 30
ordering
linear, 107
partial, 120
ordering function, 134
ordinal, 107
strongly critical, 136
ordinal class, 107
ordinal notation, 139
Orevkov, 18
parentheses, 3
part
elimination, 30
introduction, 30
minimum, 29
strictly positive, 4
Peano arithmetic, 90
Peano axioms, 51, 142
Peano axioms, 101
Peirce formula, 39
permutative conversion, 20
power set axiom, 95
prestructure, 33
predicate symbol, 2
premise, 5, 6
major, 6, 22
minor, 6, 22
principal number
additive, 133
principle of least element, 101
principle of indirect proof, 9
progression rule, 150
progressive, 101, 143
proof, 5
propositional symbol, 2
quasi-subformulas, 151
quotient structure, 48
range, 94
rank, 114
RasiowaHarrop formula, 31
Recursion Theorem, 97
recursion theorem
rst, 70
redex, 148
β, 12, 24
permutative, 24
reduct, 47
reduction, 13
generated, 13
onestep, 12
proper, 13
reduction sequence, 13
reduction tree, 13
register machine computable, 57
register machine, 55
Regularity Axiom, 114
relation, 94
denable, 77
elementarily enumerable, 67
elementary, 60
extensional, 105
representable, 79
transitively well-founded, 96
well-founded, 105
relation symbol, 2
renaming, 3
replacement scheme, 95
representability, 79
restriction, 94
Rosser, 81
rule, 6
progression, 150
Russell class, 93
Russell's antinomy, 91
satisable set of formulas, 44
segment, 29
maximal, 29
minimum, 29
separation scheme, 95
sequence
reduction, 13
sequent, 74
set, 92
pure, 92
set of formulas
Σ⁰₁-definable, 76
definable, 77
elementary, 76
recursive, 76
Shoenfield principle, 92
signature, 2
simplification conversion, 30
size of a formula, 3
Skolem, 44
sn, 14
soundness theorem, 38
s.p.p., 4
stability, 5, 9
stability schema, 142
state
of computation, 65
Statman, 18
strictly positive part, 4
strongly normalizing, 13
structure, 33
subformula, 4
literal, 4
negative, 4
positive, 4
strictly positive, 4
subformula (segment), 29
subformula property, 30
substitution, 3, 4
substitution lemma, 37
substitutivity, 14
successor number, 111
symbol number, 73
Tait, 149
term, 2
theory, 49
axiomatized, 77
complete, 49
consistent, 77
elementarily axiomatizable, 76
inconsistent, 77
of M, 49
recursively axiomatizable, 76
track, 17, 29, 147
main, 30
transitive closure, 105
transitive relation, 96
tree
reduction, 13
unbounded, 36
truth, 77
truth formula, 79
Turing, 55
U-ultraproduct, 45
ultralter, 44
ultrapower, 47
union axiom, 94
universe, 93
upper bound, 120
urelement, 91
validity, 34
variable, 2
assumption, 6
free, 4
object, 6
variable condition, 6, 12, 21, 23
Veblen hierarchy, 135
von Neumann levels, 113
WCR, 16
well-ordering theorem, 120
well-ordering, 107
Zorn's Lemma, 120
Zorn's lemma, 44