Modal Logic

Rosja Mastop

Abstract

These course notes were written for an introduction to modal logic for students in Cognitive Artificial
Intelligence at Utrecht University. Earlier notes by Rosalie Iemhoff have been used both as a
source and as an inspiration; the chapters on completeness and decidability are based on her course

notes. Thanks to Thomas Müller for suggesting the use of the Fitch-style proof system, which has

been adopted from Garson [7]. Thanks to Jeroen Goudsmit and Antje Rumberg for comments and

corrections. Further inspiration and examples have been drawn from a variety of sources, including

the course notes Intensional Logic by F. Veltman and D. de Jongh, Basic Concepts in Modal Logic by

E. Zalta, the textbook Modal Logic by P. Blackburn, M. de Rijke, and Y. Venema [2] and Modal Logic

for Open Minds by J. van Benthem [15].

These notes are meant to present the basic facts about modal logic and so to provide a common

ground for further study. The basics of propositional logic are only briefly rehearsed here, so that the

notes are self-contained. They can be supplemented with more advanced text books, such as Dynamic

Epistemic Logic by H. van Ditmarsch, W. van der Hoek, and B. Kooi [16], with chapters from one of

the handbooks in logic, or with journal articles.

Contents

1 Propositional logic 5

1.1 Language . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

1.2 Truth values and truth tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

1.3 Proof theory: natural deduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

1.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

2 Modal logic and artificial intelligence 10

2.1 What is the role of formal logic in artificial intelligence? . . . . . . . . . . . . . . . . . . . . . . . 10

2.2 Modal logic: reasoning about necessity and possibility . . . . . . . . . . . . . . . . . . . . . . . . 10

2.3 A brief history of modal logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

2.4 Modal logic between propositional logic and first order logic . . . . . . . . . . . . . . . . . . . . . 15

3.1 The modal language . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

Examples of sentences and arguments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

Duality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

The variety of modalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

3.2 Kripke models and the semantics for the modal language . . . . . . . . . . . . . . . . . . . . . . . 18

3.3 Semantic validity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

3.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

4.1 Characterizability and the modal language . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

Different Kripke frames for different modalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

Expressive power of the modal language . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

4.2 Frame correspondence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24


4.3 Bisimulation invariance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

4.4 The limits of characterizability: three methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

Generated subframes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

Disjoint unions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

P-morphisms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

4.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

5.1 Hilbert system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

Factual premises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

5.2 Natural deduction for modal logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

5.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

5.4 Soundness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

5.5 Adding extra rules or axioms: the diversity of modal logics . . . . . . . . . . . . . . . . . . . . . . 43

5.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

6 Completeness 47

6.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

7 Decidability 52

7.1 Small models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

7.2 The finite model property . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

7.3 Decidability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

7.4 Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

8 Tense Logic 56

8.1 Basic tense logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

8.2 Additional properties: seriality, transitivity, and linearity . . . . . . . . . . . . . . . . . . . . . . . 57

8.3 Varieties of linearity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

8.4 Time and modality: the Master Argument of Diodorus . . . . . . . . . . . . . . . . . . . . . . . . 60

8.5 Aristotle on the sea battle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

8.6 Ockhamist semantics for modal tense logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

8.7 Computation tree logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65


What is formal logic?

Logic is concerned with the study of reasoning or, more specifically, the study of arguments. An ar-

gument is an act or process of drawing a conclusion from premises. We call an argument sound if the

premises are all true, and valid if the truth of the premises guarantees the truth of the conclusion. Note

that an argument can be valid without being sound: one or more premises may in fact be false, but if

they were true, then the conclusion would have also been true. Vice versa, an argument may be sound

without being valid: the premises may be true but the conclusion just doesn’t follow from them.

Formal logic is concerned with the study of validity of argument forms. For example, the argument

on the left is valid because of its form, whereas the one on the right is valid because of its content.

The door is closed or the window is open. The door is closed or the window is open.

The window is not open. The window is closed.

Therefore, the door is closed. Therefore, the door is closed.

The argument on the right is valid, but only in virtue of the meanings of the words ‘open’ and ‘closed’,

which are such that a window cannot both be open and closed. The argument on the left, however, is

valid in virtue of its form. That is, any argument of the form

A or B

not A

(Therefore) B

is valid, regardless of the sentences we use in the place of A and B. The only items that need to be fixed

are ‘or’ and ‘not’ in this case. If we replaced ‘not’ by ‘maybe’, then the argument would no longer be
valid. We call ‘or’ and ‘not’ logical constants. Together with ‘and’, ‘if . . . then’ and ‘if, and

only if’, they are the logical constants of propositional logic (see section 1).

A formal logic is a definition of valid argument forms, such as the one above. There are different

methods for doing so. Here we are concerned with two of them: the model-theoretic approach and the

proof-theoretic approach.

We first need to have a language. Propositional logic, predicate logic and modal logic all have different

languages. In all cases, what we have is a set L of sentences (or: closed formulas, or: well-formed

formulas). This is enough to say what model theory and proof theory say. So below, we simply assume

that some language L is given.

Now we define what a model is. A model is intended to give a meaning to the symbols of the language

L. Specifically, it specifies for every sentence in the language whether it is true in the model or not. An

argument is valid if in every model in which all of the premises are true, the conclusion is also true.

Definition 1 (Validity, model-theoretic). Let a method T be given for evaluating formulas ϕ ∈ L as being

true or false in a model M, notation M |=T ϕ. The conclusion ψ ∈ L can be validly drawn from a set of

premises Φ ⊆ L, notation Φ |=T ψ if, and only if, in every model in which all of the premises in Φ are

true, the conclusion ψ is also true.

When there are no premises we simply write |=T ψ, meaning that the formula ψ is true in every model.

Such a formula is also called a general validity or tautology.


A formula is satisfiable if there is a model in which the formula is true. A formula is a tautology

if it is true in every model. A formula is contradictory if it is false in every model (hence, if it is not

satisfiable). If a formula is satisfiable and its negation is also satisfiable, then we call it contingent.

Two formulas are logically equivalent if they are true in exactly the same models. Note that if ϕ and

ψ are logically equivalent, then the formula ϕ ↔ ψ is a tautology.

From the model-theoretic standpoint, we can understand what logical constants are: in propositional

logic they are the ones that are entirely truth-functional. If we know what the truth value is of A and B,

then we know what the truth value is of ‘A or B’, ‘not A’, and so on.

Proof-theoretic approach

In proof theory, we try to find a fixed set of axioms and/or inference rules. Axioms are formulas that are

considered to be self-evidently true, for which no proof is required. They may be used at the beginning

of a proof. Inference rules tell us which steps we are allowed to make in a proof.

Valid argument forms, in the sense of proof theory, are those that make use only of the inference

rules. If we also only make use of axioms as the premises of our proof, then the conclusion of the proof

is just as self-evident as the axioms. Those conclusions we call theorems of the logic.

Definition 2 (Validity, proof-theoretic). Let a set of inference rules and axioms S be given. The conclu-

sion ψ ∈ L can be validly drawn from premises Φ ⊆ L, notation Φ `S ψ if, and only if, there is a proof

starting with only the premises Φ and the axioms in S and using only the inference rules in S , that leads

to the conclusion ψ.

We write `S ψ if there is a valid inference to the conclusion ψ starting from no premises (not including

axioms).

Given these two different ways of defining validity, we can also compare them. It would be rather odd

if an argument could be shown to be valid using one method but invalid using the other. If everything

that can be proven valid using inference is also valid model-theoretically, then we say that the inference

system is sound with respect to the model-theoretic interpretation. Vice versa, if everything that is valid

model-theoretically can be proven using deduction, then we call the inference system complete with

respect to the model-theoretic interpretation.

Put differently, soundness means that the inference system does not allow us too much, whereas

completeness means that it enables us to prove everything that is valid.


1 Propositional logic

The following is a brief summary of propositional logic, intended only as a reminder to those who have

taken a course in elementary logic.

1.1 Language

Propositional logic is the logic of propositional formulas. Propositional formulas are constructed from

a set var of elementary or ‘atomic’ propositional variables p, q, and so on, with the connectives ¬

(negation, ‘not’), ∧ (conjunction, ‘and’), ∨ (disjunction, ‘or’), → (implication, ‘if . . . then’), and ↔
(equivalence, ‘if and only if’). If ϕ and ψ are formulas, then so are ¬ϕ, (ϕ ∧ ψ), (ϕ ∨ ψ), (ϕ → ψ),

and (ϕ ↔ ψ). So p is a formula, ¬(p ∧ q) and q ∨ (q ∧ ¬(r → ¬p)) are formulas, but pq is not a formula,

and neither are p ∧ q → and p¬ ∨ r. We add the simple symbol ⊥ which is called the falsum. We write

this definition in Backus-Naur Form (bnf) notation, as follows:

[Lprop ] ϕ ::= p | ⊥ | ¬ϕ | (ϕ ∧ ϕ) | (ϕ ∨ ϕ) | (ϕ → ϕ) | (ϕ ↔ ϕ)

This means that a formula ϕ can be an atom p, the falsum ⊥, or a complex expression of the other forms,

whereby its subexpressions themselves must be formulas. The language of propositional logic is called

Lprop .
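To make the recursion in the BNF concrete, here is a minimal sketch (the tuple encoding and the name is_formula are mine, not the notes’): formulas are represented as nested Python tuples, and well-formedness is checked by recursing exactly as the grammar does.

```python
# Illustrative sketch: formulas of Lprop as nested tuples. An atom is a
# string such as 'p'; the falsum is the string 'bot'; a complex formula is
# a tuple whose first element names its main connective.
UNARY = {'not'}
BINARY = {'and', 'or', 'imp', 'iff'}

def is_formula(x):
    """Recursive well-formedness check, mirroring the BNF clause by clause."""
    if isinstance(x, str):                          # atom p, q, ... or 'bot'
        return True
    if isinstance(x, tuple) and len(x) == 2 and x[0] in UNARY:
        return is_formula(x[1])
    if isinstance(x, tuple) and len(x) == 3 and x[0] in BINARY:
        return is_formula(x[1]) and is_formula(x[2])
    return False

# q ∨ (q ∧ ¬(r → ¬p)) is a formula; the bare pair ('p', 'q') is not,
# just as pq is not a formula of Lprop.
good = ('or', 'q', ('and', 'q', ('not', ('imp', 'r', ('not', 'p')))))
bad = ('p', 'q')
```

Note that the nesting of tuples plays the role of the brackets, so no ambiguity can arise in this representation.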

Brackets are important to ensure that formulas are unambiguous. The sentence p ∨ q ∧ r could be

understood to mean either (p ∨ q) ∧ r or p ∨ (q ∧ r), which are quite different insofar as their meaning is

concerned. We omit the outside brackets, so we do not write ((p ∨ q) ∧ r).

The symbols ϕ, ψ, χ, . . . are formula variables. So, if it is claimed that the formula ϕ ∨ ¬ϕ is a

tautology, it means that every propositional formula of that form is a tautology. This includes p ∨ ¬p,

(p → ¬q) ∨ ¬(p → ¬q) and any other such formula. In a similar way we formulate axiom schemata and

inference rules by means of formula variables. If ϕ → (ψ → ϕ) is an axiom scheme, then every formula

of that form is an axiom, such as (p ∧ q) → (¬q → (p ∧ q)). And an inference rule that allows us to infer

ϕ ∨ ψ from ϕ allows us to infer p ∨ (q ↔ r) from p, and so on.
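The role of formula variables can be pictured the same way, under a hypothetical tuple encoding of formulas (with 'imp' for →, 'and' for ∧, 'not' for ¬; the encoding and names are mine): an axiom scheme is then a function from formulas to formulas, and every value it returns is an axiom.

```python
# Illustrative sketch: formulas as nested tuples, e.g. ('imp', A, B) for
# A → B. An axiom scheme is then just a function from formulas to formulas.
def scheme(phi, psi):
    """The scheme ϕ → (ψ → ϕ); every returned value is an instance of it."""
    return ('imp', phi, ('imp', psi, phi))

# Instantiating with ϕ = p ∧ q and ψ = ¬q gives the instance
# (p ∧ q) → (¬q → (p ∧ q)) mentioned in the text.
instance = scheme(('and', 'p', 'q'), ('not', 'q'))
```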

In propositional logic, the models are (truth) valuations. A truth valuation determines which truth value

the atomic propositions get. It is a function v : var → {0, 1}, where var is the non-empty set of atomic

propositional variables. A proposition is true if it has the value 1, and false if it has the value 0. The truth

value of falsum is always 0, so v(⊥) = 0 for every valuation v.

Whether a complex propositional formula is true in a given model (valuation) can be calculated by

means of truth functions, that take truth values as input and give truth values as output. To each logical

connective corresponds a truth function. They are defined as follows:

f¬ (1) = 0 f¬ (0) = 1

f∧ (1, 1) = 1 f∧ (1, 0) = 0 f∧ (0, 1) = 0 f∧ (0, 0) = 0

f∨ (1, 1) = 1 f∨ (1, 0) = 1 f∨ (0, 1) = 1 f∨ (0, 0) = 0

f→ (1, 1) = 1 f→ (1, 0) = 0 f→ (0, 1) = 1 f→ (0, 0) = 1

f↔ (1, 1) = 1 f↔ (1, 0) = 0 f↔ (0, 1) = 0 f↔ (0, 0) = 1

For instance, if v(p) = 1 and v(q) = 0, then the propositional formula (p → q) ∧ ¬q is false. To see that

this is the case, first, we calculate the truth value of p → q, which is f→ (v(p), v(q)) = f→ (1, 0) = 0, so

p → q is false. Second, we calculate the truth value of ¬q, which is f¬ (v(q)) = f¬ (0) = 1, so ¬q is true.


Finally, we combine the two obtained truth values and get f∧ (0, 1) = 0, so the formula (p → q) ∧ ¬q is
false.

Doing this for all of the four possible valuations, we get the following truth table:

p q p→q ¬q (p → q) ∧ ¬q

1 1 1 0 0

1 0 0 1 0

0 1 1 0 0

0 0 1 1 1

The bold face indicates the valuation we considered above and the truth value we calculated.

By means of truth tables we can easily determine in which models complex formulas are true. The
model/truth valuation is given by the rows on the left of the double line. Below is the truth table for all the
simplest non-atomic formulas.

p q ‖ ⊥ ¬p ¬q p ∧ q p ∨ q p → q p ↔ q
1 1 ‖ 0 0 0 1 1 1 1
1 0 ‖ 0 0 1 0 1 0 0
0 1 ‖ 0 1 0 0 1 1 0
0 0 ‖ 0 1 1 0 0 1 1

An argument is semantically valid if, and only if, the conclusion is true in every truth valuation in which

the premises are true. Using the truth table above, we can see that p ∨ q is a valid consequence of p.

Therefore, p |= p ∨ q.

A formula is a tautology if it is true in every valuation, and a contradiction if it is false in every

valuation. The formula p ∨ ¬p is a tautology, so |= p ∨ ¬p. No matter what the truth valuation assigns

to the proposition p, the outcome of the truth functional calculation is always 1. Similarly, p ∧ ¬p is a

contradiction.
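The truth-functional calculation is entirely mechanical, which the following sketch makes explicit (the tuple encoding and function names are mine, not the notes’): formulas are nested tuples, a valuation is a dictionary, and the tautology test runs through all 2^n rows of the truth table.

```python
from itertools import product

# Truth functions for the connectives, as tabulated above.
FUNCS = {
    'not': lambda a: 1 - a,
    'and': lambda a, b: min(a, b),
    'or':  lambda a, b: max(a, b),
    'imp': lambda a, b: max(1 - a, b),
    'iff': lambda a, b: 1 if a == b else 0,
}

def truth_value(phi, v):
    """Evaluate a formula (nested tuple) under the valuation v: var -> {0, 1}."""
    if phi == 'bot':
        return 0                             # v(⊥) = 0 for every valuation
    if isinstance(phi, str):
        return v[phi]                        # atomic proposition
    op, *args = phi
    return FUNCS[op](*(truth_value(a, v) for a in args))

def is_tautology(phi, variables):
    """True iff phi gets value 1 in all 2^n valuations of the given variables."""
    return all(truth_value(phi, dict(zip(variables, row))) == 1
               for row in product([0, 1], repeat=len(variables)))

# (p → q) ∧ ¬q is false when v(p) = 1 and v(q) = 0, as computed above.
example = ('and', ('imp', 'p', 'q'), ('not', 'q'))
```

Running is_tautology on ('or', 'p', ('not', 'p')) confirms |= p ∨ ¬p, and the same test on ('imp', 'p', ('or', 'p', 'q')) over the variables p and q confirms the entailment p |= p ∨ q.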

One of the most familiar types of proof system is natural deduction. This system does not work with

axioms, but only with inference rules: two rules for each connective. The introduction rules regulate the

deduction of a formula with the connective, and the elimination rules govern the deduction of a formula

from a premise with that connective. As basic familiarity with natural deduction will be presupposed in

this course, in the box below the rules are merely restated.

Assumptions are written above the horizontal line, and the inferences based on those assumptions are

written below that line. The vertical line indicates that the proof is still dependent on the assumptions. In

a proof that looks like the one below, the conclusion χ has been reached dependent on the assumptions

ϕ and ψ.

ϕ

ψ

...

χ


Assumptions can be withdrawn, but only in accordance with specific inference rules: only by means of

the rules for introducing →, ↔ or ¬, or the rule for eliminating ∨. So, for instance, when we prove an

implication ϕ → ψ (Intro →), we first assume ϕ and then, using that assumption, we prove that ψ. If

we succeeded in doing so, we can withdraw the assumption by concluding that if ϕ, ψ. The vertical line

then ends just above the formula ϕ → ψ, as shown in the box below.

Intro ∧: from ϕ and ψ, conclude ϕ ∧ ψ.
Elim ∧: from ϕ ∧ ψ, conclude ϕ; likewise, conclude ψ.
Intro ∨: from ϕ, conclude ϕ ∨ ψ; likewise, conclude ψ ∨ ϕ.
Elim ∨: from ϕ ∨ ψ, a subproof of χ from the assumption ϕ, and a subproof of χ from the assumption ψ, conclude χ.
Intro →: from a subproof of ψ from the assumption ϕ, conclude ϕ → ψ.
Elim →: from ϕ → ψ and ϕ, conclude ψ.
Intro ↔: from a subproof of ψ from the assumption ϕ and a subproof of ϕ from the assumption ψ, conclude ϕ ↔ ψ.
Elim ↔: from ϕ ↔ ψ, conclude ϕ → ψ; likewise, conclude ψ → ϕ.
Intro ¬: from a subproof of ⊥ from the assumption ϕ, conclude ¬ϕ.
Elim ¬: from ϕ and ¬ϕ, conclude ⊥.
Double ¬: from ¬¬ϕ, conclude ϕ.
EFSQ: from ⊥, conclude any formula ϕ.

Some rules are very simple: if you can prove ϕ and you can prove ψ, then you can also prove their

conjunction ϕ ∧ ψ. Other rules are more complicated. For example, the only way to ‘eliminate’ the

disjunction ϕ ∨ ψ is by proving, first that ϕ ∨ ψ, and second, that some conclusion χ can be proven both

from ϕ alone and from ψ alone.

Conjunction, disjunction, and equivalence are commutative, which means that the order of the sub-

formulas is irrelevant: ϕ ∨ ψ is logically equivalent to ψ ∨ ϕ, and likewise for ϕ ∧ ψ and ϕ ↔ ψ.

Apart from the introduction and elimination rules for each connective, there are two special rules.

The double negation rule says that two negations cancel each other out. The name ‘EFSQ’ is an acronym

for Ex Falso Sequitur Quodlibet, which means literally “From the False follows whatever”. Note that

this is also semantically valid: in every valuation in which ⊥ is true (namely: none), any other formula

is also true.

Lastly, we are allowed to reiterate earlier steps, for ease of exposition of the proof, but only provided

that no assumptions were withdrawn in between. So it is allowed to reiterate a formula ϕ inside a subproof
that is still open, but it is not allowed to reiterate ϕ after the subproof in which it was derived has been
closed.

Below are two examples of natural deductions. In the first, we prove the validity of the argument
p → q ` ¬q → ¬p. First, we introduce the premise, and then we prove the conclusion. To do so,
we assume ¬q and prove ¬p, after which we can use the introduction rule for implication to draw the
conclusion. The second is the proof for p ∨ q, ¬p ` q. Since one of the premises is a disjunction, we
need to use the rule for elimination of disjunction to prove the conclusion.

1 p → q Assumption
2 ¬q Assumption
3 p Assumption
4 q Elim → 1,3
5 ¬q Iteration 2
6 ⊥ Elim ¬ 4,5
7 ¬p Intro ¬ 3,6
8 ¬q → ¬p Intro → 2,7

1 p ∨ q Assumption
2 ¬p Assumption
3 p Assumption
4 ⊥ Elim ¬ 2,3
5 q EFSQ 4
6 q Assumption
7 q Iteration 6
8 q Elim ∨ 1,5,7

Below are, first, a proof of p ∨ ¬p, which crucially involves the double negation rule and, second,
a proof of one of the ‘distribution laws’ (p ∨ q) ∧ r ` (p ∧ r) ∨ (q ∧ r). It can easily be checked
that they are also semantically valid by checking the truth tables.

1 ¬(p ∨ ¬p) Assumption
2 p Assumption
3 p ∨ ¬p Intro ∨ 2
4 ⊥ Elim ¬ 1,3
5 ¬p Intro ¬ 2,4
6 p ∨ ¬p Intro ∨ 5
7 ⊥ Elim ¬ 1,6
8 ¬¬(p ∨ ¬p) Intro ¬ 1,7
9 p ∨ ¬p Double ¬ 8

1 (p ∨ q) ∧ r Assumption
2 p ∨ q Elim ∧ 1
3 r Elim ∧ 1
4 p Assumption
5 p ∧ r Intro ∧ 3,4
6 (p ∧ r) ∨ (q ∧ r) Intro ∨ 5
7 q Assumption
8 q ∧ r Intro ∧ 3,7
9 (p ∧ r) ∨ (q ∧ r) Intro ∨ 8
10 (p ∧ r) ∨ (q ∧ r) Elim ∨ 2,6,9


1.4 Exercises

1. Explain why, if ϕ is a tautology, ¬ϕ is a contradiction. Explain why every formula that is not a

contradiction is satisfiable.

2. Write down the truth table for (a) ¬(p → (q ∨ p)), (b) (p → q) ∨ (q → p), (c) ¬(p ↔ (q ∧ ¬r)).

3. Show by means of truth tables that (a) |= p → (q → p), (b) (p ∧ r) ∨ q |= p ∨ q, (c) |= ¬(p → q) ↔

(p ∧ ¬q).

4. Give a natural deduction proof for (a) ϕ → ϕ, (b) ϕ → (ψ → ϕ), (c) (ϕ → (ψ → χ)) → ((ϕ →

ψ) → (ϕ → χ)).

5. Write down a natural deduction proof for (a) ` ¬p → (p → q), (b) p ∨ q ` (p → q) → q, (c)

p → ¬p, ¬p → p ` ⊥.

6. In this exercise you are asked to prove the Deduction Theorem.

(a) Prove that if Γ, ϕ ` ψ, then Γ ` ϕ → ψ.

(b) Prove that if Γ ` ϕ → ψ, then Γ, ϕ ` ψ.

7. Write down a formula that is logically equivalent to (a) ϕ ∧ ψ, (b) ϕ → ψ, (c) ϕ ↔ ψ, using only

the connectives ¬ and ∨ and the formulas ϕ and ψ. How can we write the formula ¬ϕ using only

→ and ⊥?

As this last exercise already points out, we can define the logical language using only ¬ and ∨, treating

the other connectives as defined abbreviations. Note that we can do this for any combination of ¬ and
one of the connectives ∧, ∨, or → (but not ↔: together with ¬ it cannot define the others).


2 Modal logic and artificial intelligence

2.1 What is the role of formal logic in artificial intelligence?

Formal logic has had a central place in the study of artificial intelligence since its first conception in the

1950s. Why is this so? Logic has not been made formal just to make it suitable for computers and robots.

Formal logic is also a tool for humans. Just as we could write, and execute, our grocery list and daily

schedules in Perl or Haskell, so we could make our daily inferences by means of deduction in formal

logic. Moreover, formal logic wasn’t even invented for machines, but to regiment and explicate human

scientific reasoning.

Aristotle’s logic of syllogisms was intended to regiment, amongst other things, the categorisation of individuals

into species and genera. Frege created predicate logic in an attempt to regiment mathematical proof, and

thereby to provide an ultimate justification of mathematical knowledge on the basis of principles of

pure reasoning, themselves defined in a mathematically precise way. Modal logic grew out of several

endeavours to regiment reasoning about possible situations: utopias and ideals, hypothetical scenarios,

the unknown future, responsibilities and ‘what might have happened’, that which is possible given our

limited knowledge of the facts, and so on. None of these formal logics were invented with artificial

agents in mind.

Formal logic for artificial intelligence can still be understood in two ways: the formal logic can

be implemented as a capacity for automated reasoning by an intelligent agent, and it can be used by

a designer in developing intelligent agents that meet certain criteria. In the latter case, formal logic is

used, again, as an instrument to regiment our own reasoning: about what the agent should choose to do in

certain situations, about what uncertainty the agent might have to cope with, about possible malfunctions

we need to consider, and so on.

The first role is in knowledge representation. Artificial agents can use formal languages for repre-

senting the knowledge they gather of the environment in which they have to act intelligently. Because the

knowledge is expressed in formal language, they can use formal methods for drawing inferences from

the knowledge. The agent may know that lightning is always followed by thunder, and know that there

was lightning just now. But in order to know that thunder will follow soon, it has to draw an inference

from its knowledge. Logic is then an instrument of the agent, in the form of a calculus or algorithm to

extend its explicit knowledge with further, inferred knowledge.

The second role of logic in artificial intelligence is agent specification. In this case, the formal

language is used to characterize the agent and its (desired or actual) intelligent behaviour. In designing

an agent, for instance, we want to make sure that it meets certain criteria of behaviour. If it knows that

it is raining and it does not want to get wet, then it should not go outside without an umbrella. Or, if

it knows that some other agent knows that it is raining, then the agent itself knows that it is raining.

Writing down these criteria in a formal language allows us to compute the consequences of those criteria.

For instance, if we design the agent using this set of criteria, will it always stay out of the rain? Will it

not bump into the walls? Will it shut down if it malfunctions?

In order to fulfill the second of these tasks, a formal language needs to be rich enough to express

many things having to do with intelligent behaviour: we need to say things about what the agent knows

or believes, what the agent intends to do, what it is allowed to do, what it can do, what it has done and

will eventually do. As you can see from the above description, it is modal logic that is most suited for

expressing such criteria and for drawing inferences from such expressions. So what is modal logic more

precisely?

A modality is a ‘mode of truth’ of a proposition: about when that proposition is true, or in what way,

or under which circumstances it would, could, or might have been true. Modal logic, accordingly, is


the study of reasoning about modalities, inferring from modal premisses that some modal conclusion is

valid. The following examples illustrate what we mean by this.

Imagine that you’re holding a pen in your hand and you release your grip on it. What will the pen

do? Presumably, the answer will be ‘It will fall down until it hits the ground and then it will be at rest.’

Yet, many will admit that this is no mere accident. The pen will be subject to the gravitational pull from

the earth, making its fall inevitable: it has to fall, it cannot happen otherwise. Here we speak of alethic

modality: distinguishing between what is necessary, possible or accidental, and impossible.

Likewise, imagine that you buy a pack of cookies. When you are outside the store, you open the pack

and take a cookie. Now, most likely no one is going to complain that you do so. You are allowed to eat a

cookie outside of the store. However, before you bought the pack, when you were still inside the store,

opening the pack and eating one of the cookies would not have been allowed, but prohibited. You may

not eat the cookie under those circumstances—legally, you cannot do so. This is called deontic modality:

that something is obligatory, permissible, or prohibited.

Third, consider the situation that you come home after attending a lecture and you see that a window

is broken, the door is open, things from the cupboard are spread across the floor and the tv and stereo

are missing. You infer that you must have been burglarized. If a police officer asks if there is anything

else that might have happened, you should rightly respond ‘no.’ There is no other possibility than that

a burglar broke into your home and stole your belongings; that must be what happened, it cannot be

otherwise. These are the epistemic modalities: that something is certain (known, verified), undecided

(consistent with what is known), or excluded (known to be false, falsified).

Further examples can be offered, including power, time, belief, and more.

In all these examples, we are confronted with something that is not merely true, but necessarily so:

necessary in virtue of the (physical) nature of things, necessary in virtue of property law, or necessary

in virtue of the evidence. If we think of the language of propositional logic, then what we add to this

language is two operators, 2 (‘box’) and 3 (‘diamond’). Given some arbitrary formula ϕ, the modal

formula 2ϕ reads: “It is necessarily the case that ϕ”. The formula 3ϕ reads: “It is possibly the case that

ϕ”. So, if p is the proposition that the pen falls to the ground, then 2p says that necessarily, the pen falls

to the ground. Similarly, if q is the proposition that you own the pack of cookies and r the proposition

that you eat one of the cookies from the pack, then q → 3r says that, if you own the pack, it is possible

(i.e., allowed) that you eat one of the cookies.

Modal logic is concerned with reasoning about necessities and possibilities such as these. For exam-

ple, it can be used to reason about what is permissible under precise circumstances given the entire penal

code: if we can rewrite the penal code in a formal language saying what is permissible and what is not

permissible, and we also write down the precise details of the circumstances, then using modal logic we

can draw conclusions about what we are allowed to do (what the penal code allows us to do) under the

circumstances.

Consider, for example, the following inference:

2(This pen falls)
Therefore, this pen falls.

This says that if the pen has to fall, then in fact it does fall.

The next example is more complicated:

2(Door open → (Forgot to close ∨ Burglarized))

2(Window broken → (Football accident ∨ Burglarized))

Window broken ∧ Door open

¬3(Forgot to close ∧ Football accident)

Therefore, 2Burglarized


This argument expresses the formal structure of the following reasoning: I know that, if the door is open,

then either I forgot to close it, or I was burglarized. Secondly, I also know that, if the window is broken,

then either there was a football accident, or I was burglarized. Now, as a matter of fact, the window is

broken and the door is open. But, it is inconceivable that I both forgot to close the door and there was a

football accident. Therefore, I know that I was burglarized.

Modal logic is concerned with the task of determining whether such an argument is valid, or not.

It is not concerned with the question whether the premisses are in fact true. Questions such as ‘how

do you know that something is necessary?’ and ‘Is this action really permissible?’ are not addressed.

An argument with such premisses is valid only if the premisses, if true, guarantee that

the conclusion is true as well. When you have completed this course, you will be able to prove that the

second argument above is indeed valid, or rather, that every argument of the same logical form is valid.
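To preview how such a verdict can be checked semantically, here is a hypothetical sketch (the toy model, atom names, and tuple encoding are all mine, anticipating the Kripke semantics of section 3): read 2ϕ as ‘ϕ holds at every accessible world’ and 3ϕ as ‘ϕ holds at some accessible world’, and confirm that in one small model where all four premises of the burglary argument hold, the conclusion holds as well. This checks a single model; it is not yet a proof that the argument form is valid.

```python
# A toy Kripke-style model: a set of worlds, an accessibility relation R,
# and a valuation V saying at which worlds each atom holds.
worlds = {0, 1}
R = {(0, 0), (0, 1)}
V = {
    'door_open':     {0, 1},
    'window_broken': {0, 1},
    'forgot':        {1},        # in world 1 I also forgot to close the door
    'accident':      set(),
    'burglarized':   {0, 1},
}

def holds(phi, w):
    """Evaluate a formula (nested tuple) at world w of the model above."""
    if isinstance(phi, str):
        return w in V[phi]
    op = phi[0]
    if op == 'not':
        return not holds(phi[1], w)
    if op == 'and':
        return holds(phi[1], w) and holds(phi[2], w)
    if op == 'or':
        return holds(phi[1], w) or holds(phi[2], w)
    if op == 'imp':
        return (not holds(phi[1], w)) or holds(phi[2], w)
    if op == 'box':      # 2ϕ: ϕ holds at every accessible world
        return all(holds(phi[1], u) for u in worlds if (w, u) in R)
    if op == 'dia':      # 3ϕ: ϕ holds at some accessible world
        return any(holds(phi[1], u) for u in worlds if (w, u) in R)

premises = [
    ('box', ('imp', 'door_open', ('or', 'forgot', 'burglarized'))),
    ('box', ('imp', 'window_broken', ('or', 'accident', 'burglarized'))),
    ('and', 'window_broken', 'door_open'),
    ('not', ('dia', ('and', 'forgot', 'accident'))),
]
assert all(holds(p, 0) for p in premises)   # all premises hold at world 0
assert holds(('box', 'burglarized'), 0)     # and so does the conclusion
```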

Since there are various kinds of modality, there are also various kinds of modal logic:

alethic logic necessary / possible / actually

tense logic always / sometimes / never / until / eventually

epistemic logic certainly / perhaps / ‘it is known that’

deontic logic obligatory / permissible / forbidden

dynamic logic makes sure / allows / enables / avoids

provability logic provable / consistent

... ...

The study of modal logic dates back to the very beginning of logic itself, in the work of Aristotle.

Although most of Aristotle’s logic is concerned with “categorical” statements such as “All horses are

four-legged” and “Some houses are not wooden”, he also considered the logical relationships between

possibility and necessity. In his Prior Analytics he discusses a logical relation between those two con-

cepts, that has come to be called the “duality” of possibility and necessity:

If it is possible that A is true, then it is not necessary that A is not true.

Using the notation above, with 2 for necessity and 3 for possibility, we can state these things as follows:

D1 2A ↔ ¬3¬A.

D2 3A ↔ ¬2¬A.

This duality is similar to the duality of ∀ and ∃ in predicate logic: ∃xPx ↔ ¬∀x¬Px.

When modern formal logic was invented at the end of the nineteenth century, there was not much

attention for ‘necessity’ and ‘possibility’ at first. Many authors at that time thought it was senseless to

talk about anything more than what is actually true or false and so they were generally skeptical of the

very idea of ‘necessity’, apart from ‘logical necessity’ (i.e., tautology).

If this conclusion is valid, the subject of modality ought to be banished from logic, since

propositions are simply true or false . . . (B. Russell [13])

A necessity for one thing to happen because another has happened does not exist. There is

only logical necessity. (L. Wittgenstein [17], 6.37)


What makes some formula a tautology could be understood by pointing out that it can be deduced

from axioms, which are the most fundamental principles of reasoning and so immediately grasped to be

tautological themselves. (Wittgenstein, who was displeased with the concept of an axiom, proposed an

alternative way of thinking about tautologies, suggesting that each tautology on its own can be grasped

to be true independently of the way the world is.) Accordingly, there was no need to express in a formal

language that something is necessary.

This attitude started to change after C.I. Lewis published his Survey of Symbolic Logic in 1918, in

which he discussed several proof systems for reasoning about possibility and necessity. Specifically, he

was displeased with the way the connectives from propositional logic were analyzed as truth functions.

He was of the opinion that we need to distinguish two meanings to disjunctions, implications, and so on:

one extensional meaning and the other an intensional meaning.

The extensional implication is what is also known as the ‘material implication’. It is the familiar

truth functional connective of propositional logic. The intensional implication is also called ‘strict im-

plication’. The statement that A strictly implies B means that B logically follows from A.

MI Material implication: A → B.

Means that either A is false, or B is true.

SI Strict implication: A J B.

Means that we can validly infer B from A.

In propositional logic we cannot express that something is a tautology, or that some proposition is a

logical consequence of some other proposition. Thus, Lewis proceeded to develop different logics for

strict implication. The five logical systems he came up with have been named S1 to S5. These logics are

all defined by means of axioms. They are proof-theoretic descriptions.

The model-theoretic approach to modal logic came later. It followed not long after the development

of model theory. The first attempt was made by R. Carnap [4, 5]. He introduced models consisting of

sets of state descriptions, whereto the truth value of propositions (or rather, sentences) is relativized. So

in one state description it is true that unicorns exist, but in another state description that proposition is

false. And if propositions A and B are true in state description S , their conjunction A ∧ B is also true

there.

Now a model-theoretic account of the meaning of necessity statements could be given. A sentence

of the form “It is necessary that A” is true if, and only if, it is true in all state descriptions. Using the 2

for “It is necessary that”, we can spell this out precisely. A state description S is simply a set of atomic

propositional variables p, q, r, and so on.

¬A is true in S if, and only if, A is not true in S

A ∧ B is true in S if, and only if, A is true in S and B is true in S

2A is true in S if, and only if, for all state descriptions T, A is true in T

Given Carnap’s semantics for necessity, there is no real point to say things like 22A or 23A. If A is

true everywhere, then it is necessary in S , but it is also necessary in T . In short, if A is true everywhere,

it is immediately also necessary everywhere. Thus, if 2A is true in S , then 2A is true everywhere; so

22A is true in S ; so 22A is true everywhere, and so on. The same point can be made for 3.

Incidentally, Carnap’s model-theoretic interpretation of “Necessarily” leads to the same as Lewis’

logic S5, if we define A J B as 2(A → B).
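Carnap’s clauses translate directly into code. The sketch below (in Python; the nested-tuple encoding of formulas and the example model are my own illustrations, not from the text) shows why iterated boxes add nothing: the clause for 2A never consults the state description S, so 2A and 22A always receive the same truth value.

```python
# A minimal sketch of Carnap's semantics: a model is a set of state
# descriptions, each a frozenset of the atomic propositions true in it.

def carnap_true(model, S, f):
    """Is formula f true in state description S of the given model?"""
    if isinstance(f, str):                       # atomic proposition
        return f in S
    op = f[0]
    if op == 'not':
        return not carnap_true(model, S, f[1])
    if op == 'and':
        return carnap_true(model, S, f[1]) and carnap_true(model, S, f[2])
    if op == 'box':                              # true in ALL state descriptions
        return all(carnap_true(model, T, f[1]) for T in model)
    raise ValueError(f'unknown formula: {f!r}')

# p holds in every state description of this model, so 2p is true in S,
# and then 22p follows automatically:
model = {frozenset({'p'}), frozenset({'p', 'q'})}
S = frozenset({'p'})
assert carnap_true(model, S, ('box', 'p'))
assert carnap_true(model, S, ('box', ('box', 'p')))
```

In a model where p fails in some state description, both 2p and 22p come out false everywhere, again in lockstep.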


The next important development in modal logic came from A.N. Prior [11]. His major concern was

with the logic of time. A statement about the future, for instance, can be thought of as a statement about

all moments in the future. These moments are a bit like Carnap’s state descriptions. In fact, as Prior

suggested, we can think of the ‘time line’ as the set of all natural numbers ω, ordered by the relation ‘smaller

than’ <. Now we can give an interpretation of the statement that ‘always in the future A will be true”.

If we use the symbol G as the operator “It is always going to be the case that . . . ”, we can define the

meaning of GA as follows:

GA is true at moment m if, and only if, for all moments m′ > m, A is true at m′.

Now it was also possible to define the same thing for the past, by modifying the definition. Let HA mean

that “It has always been the case that A”. Then it can be evaluated in the same model with the definition

given here:

HA is true at moment m if, and only if, for all moments m′ < m, A is true at m′.

Thus, with a simple change in the ordering, we can choose to say something either about the future or

about the past. In different words, we use the ordering relation to restrict the moments we are talking

about: not all moments in the domain ω, but only those that come later (or earlier) than m.
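Prior’s two clauses differ only in the direction of the ordering used to restrict the moments. A small sketch (using a finite window of moments in place of the full time line, and an invented valuation; both are assumptions of this illustration):

```python
def G(moments, val, m, p):
    """'It is always going to be the case that p' at m:
    p holds at every moment strictly later than m."""
    return all(p in val(n) for n in moments if n > m)

def H(moments, val, m, p):
    """'It has always been the case that p' at m:
    p holds at every moment strictly earlier than m."""
    return all(p in val(n) for n in moments if n < m)

moments = range(-5, 6)                         # a finite window of the time line
val = lambda n: {'rain'} if n >= 0 else set()  # rain from moment 0 onwards
assert G(moments, val, 0, 'rain')              # rain at every moment after 0
assert not H(moments, val, 0, 'rain')          # but not at the moments before 0
```

Note that the only difference between G and H is `n > m` versus `n < m`: the same model serves for statements about the future and about the past.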

This was a major innovation to Carnap’s models, in which we could only say that something is

necessary in the sense that it is true in absolutely all state descriptions. Now, using Prior’s idea, we could

begin to make sense of, for instance, the difference between 2A and 22A or the difference between

3A and 23A. Such differences are especially important if we think about the modalities of knowledge,

action, and time. The examples below illustrate the meaningfulness of such statements with multiple

modalities.

(1) It is not certain that I have enough money for a pizza, but it is certain that this is not certain.

(2) It is possible that I open the door; necessarily, if the door is open, then it is possible that I leave

the room. Therefore, it is possible that it is possible that I leave the room.

(3) If it is true now that it rains, then it will always be true in the future, that it has been true in the

past that it rained.

The meaning of such statements could be understood by restricting the state descriptions, moments, or

possibilities that are relevant to evaluate them. In other words, we need to think of these modalities, not

as absolute, i.e. for all possibilities, but as relative, i.e. limited to what is possible now, possible for us,

or possible in a given situation. Just like in Prior’s case, where we look at only those times that are in

the future of our time, so we need to look at only those situations that are achievable for me by acting,

and we need to look at only those situations that are in agreement with what I know given my actual

evidence.

Who was the first to come up with the idea of relativized possibility is not quite clear. But it was the

formulation by S. Kripke [8] that has come to be the standard way of giving a semantics for modal logic.

Kripke’s models are built out of three things:

1. Possible worlds. These are similar to the state descriptions of Carnap, but Kripke used the meta-

physical terminology of seventeenth century philosopher G.W. Leibniz, who argued that the world

God created is the best of all possible worlds, and who proposed that necessary truths are “eternal

truths. Not only will they hold as long as the world exists, but also they would have held if God

had created the world according to a different plan” (Leibniz, as quoted by Mates [10] p.107).


2. Accessibility relation. Not all possible worlds are ‘accessible’ from a given possible world w. A

sentence of the form “Possibly, A” is true in w only if there is a possible world where A is true,

that is accessible from w. Similarly, a sentence of the form “Necessarily, A” is true in w only if A

is true in all accessible possible worlds.

3. Valuation. The valuation determines for atomic propositions whether they are true or false at

a possible world. So, the valuation determines for each possible world, which of the atomic

propositional variables are true in that world and which ones are not.

These ‘Kripke models’, as they have come to be called, made it possible to understand better the meaning

of the modal axioms that had been proposed and were being debated. It also led to comparisons of

different modal logics and comparisons between modal logic and first order logic.

The invention of Kripke models led to a systematic method for studying all of these kinds of modality

in a mathematically similar way. If we want a Kripke model for the logic of epistemic modality (certainty,

knowledge), then we take the possible worlds to be different ways the world might be, and such a world

w will be accessible from a world v if, and only if, the world might be like w on the basis of the evidence

you have in v. To obtain a Kripke model for the logic of time the possible worlds become the moments

in time. Then, a world w will be future-accessible from a world v if w is in the future of v—and past-

accessible if the ordering is the other way around. We can begin to reason about combinations of time,

knowledge and action by combining the domains of those Kripke models (e.g., ways a moment in the

future might be), and combining the accessibility relations (e.g., distinguishing what I know now about

the present from what I knew in the past about the present).

The development of modal logic after this point has been rapid and very diverse. Logicians realized

that Kripke models are, from a mathematical point of view, nothing other than labelled graphs. We can

think of Kripke models as interpretations of a modal language but, vice versa, we can also think of

modal languages as tools to talk about (labelled) graphs. Modal logic then becomes an instrument for

describing properties of graphs, and for proving that a graph has certain properties. We could also use

first order logic for this purpose, but for various reasons modal logic has been a popular alternative to

first order logic.

2.4 Modal logic between propositional logic and first order logic

We could very easily construct a formal language for reasoning about necessity by means of first order

logic. To say that it is necessary that the pen falls to the ground after it is released, we could write

that “for all x, if x is a possibility given the laws of physics, and that possibility is such that the pen is

released, then that possibility is also such that the pen falls to the ground.” Or, in a more formal notation (writing Px for ‘x is a possibility given the laws of physics’, Rx for ‘the pen is released in x’, and Fx for ‘the pen falls in x’): ∀x((Px ∧ Rx) → Fx).

Instead of this, modal logic has only the box and the diamond, and no variables. We do not write

“∃x px” for “There is a possibility x, and x is such that p is true” but we write “3p”, saying that “There

is a possibility that p is true”. This second, modal logic way of expressing possibility and necessity is

more limited than the first order logical way that was suggested above.

For at least two reasons this is not the way modal reasoning has been formalized.

The first reason for preferring a ‘Box’ over a predicate logical approach to modality is that the first

order logical approach presupposes a bird’s eye point of view on the domain of quantification—as in the

above example we are quantifying over a domain of physical possibilities x. This makes sense when

we think of a domain of objects, for instance the books in one’s bookcase. If I say “everything is a

paperback” then we can easily think of this as saying that for all books b in the domain of books in my

bookcase, it is true that b is a paperback. However, when we are concerned with modal concepts this

bird’s eye view on the domain does not make equally good sense.


For one example, consider the logic of time. Here, a predicate logical approach to the temporal op-

erator ‘always’ presupposes that we can refer to all moments in time, so ‘Time’ has to be thought of as

a big domain of moments about which we can make meaningful statements. And if all future moments

indeed ‘exist’ already, does that also imply that there are facts about those future moments, statements

about the future that are already true now? This issue is an ongoing controversy in philosophy.

A similar point could be made about possibility. If we treat the statement “necessarily, all objects have

a mass” as a statement about a domain of all possibilities, then it would seem that we are presupposing

the ‘existence’ of possibilities. But what does it mean for a possibility to exist, or to be real? Are

possibilities, such as the possibility for me to stop writing after this sentence, real? Perhaps they are, or

were at some time. Again here, philosophers argue about this matter of modal realism.

These philosophical concerns are somewhat different once a model-theoretic perspective on modal

logic is accepted. In first order logic we evaluate sentences relative to a model, which includes a domain

of entities. Similarly, in modal logic we evaluate sentences relative to a model, which has a domain of

possibilities. Still, the important difference is that in models for modal logic we also want to single out

the actual situation, the present time, the real world, and so on: one possibility that is special in the sense

that it is ‘our’ possibility, the real one. Looking at a domain as a totality of possibilities, it is not so clear

what makes any of these ‘special’. So even if we adopt a model-theoretic perspective on modal logic,

it is still different from first order logic because we need to adopt a perspective in that model, select our

own possibility, in order to determine what is true there.

The second reason for dispreferring a predicate logical language is a more technical one. A major

disadvantage of predicate logic is the fact that it is undecidable. This means that it is impossible to

determine for every argument whether it is logically valid or not. Basic modal logic has the advantage

that it is decidable. For many variants of modal logic, the validity problem is in the complexity class PSPACE.

Languages with modalities (i.e., box and diamond) are less expressive than a language with universal

and existential quantifiers. But from a computational point of view this often means that they are more

‘manageable’ to work with.

Modal logics provide, as we will see, a natural way of reasoning about relational structures or graphs.

This is what many computer scientists look for in a logic. For this reason, it is often discovered that some

logic invented in a particular area in computer science turns out to be a kind of modal logic. So-called

Description Logic is a good example of this. Once it is recognized that this logic is a modal logic, all the

meta-logical techniques that have been developed for modal logic can be used to study the properties of

Description Logic. So, as a matter of fact, modal logic is also a useful instrument in computer science.

Still, the distinction between propositional logic, modal logic, and first-order logic is not as

strict as all of this suggests. In particular, logicians have developed logical systems that are a hybrid of

modal and first-order logic—appropriately called hybrid logic. Thereby, the distinction between modal

and first order logic has become that of two ends of a scale, with many intermediate logics. Furthermore,

logicians have studied the combination of modal and first-order logic, mostly called modal predicate

logic. In such a logic we could express such things as “It is possible that there exists a unicorn” and

“There exists something for which it is possible that it is a unicorn”.


3 Basic Modal Logic I: Semantics

3.1 The modal language

Modal logic has been developed in order to analyse and make precise reasoning about different possibilities

and necessities. In predicate logic we cannot draw a distinction between “All men will die” and “All men

could die”, or between “Susan is certainly the best pilot” and “Susan might be the best pilot”, or “Marie

makes sure that the light is on” and “Marie allows for the light to be on”.

In the language of modal logic, these differences are expressed by means of the operators 2 and

3. The formula 2ϕ means “It is necessary that ϕ” or, in other words, “ϕ is the case in every possible

circumstance”. The formula 3ϕ means “It is possible that ϕ” or, in other words, “ϕ is the case in at least

one possible circumstance”.

We can add modal operators to propositional logic, and we can also add them to predicate logic.

In this course we restrict ourselves to propositional modal logic. The language of propositional modal

logic consists of everything from propositional logic plus the modal operators. So, we have a set var of

atomic propositions p, q, and so on, complex propositional formulas such as ¬p ∨ (q → r) and

¬((⊥ ∧ q) ↔ q), and complex modal formulas such as 2(p ∧ ¬q) and 3(q ↔ ¬q) → ¬22(p ∧ r). In a BNF

expression:

[Lmodal ] ϕ ::= p | ⊥ | ¬ϕ | ϕ ∧ ϕ | ϕ ∨ ϕ | ϕ → ϕ | ϕ ↔ ϕ | 2ϕ | 3ϕ

The expression > is also sometimes used as an abbreviation of ¬⊥, and so it means something that is

always true. In contrast to the falsum, > is also called the verum. This language is called Lmodal .

Considering the intuitive meaning of some modal formulas helps to get a better grasp of the things that

can be expressed using the modal language.

The formula 3(p ∧ q) says that it is possible that p and q are true together, whereas the formula

3p ∧ 3q says that p is possible and q is possible, but not that it is possible that they are true together.

2ϕ → ϕ states that, if ϕ is necessary, it is in fact also true. This seems to be correct: for example, if it

is necessary that the sun will implode at some point in the future, then it is true that the sun will implode

at some point in the future. Nevertheless, as we shall see, there are also uses of modal logic in which this

is not the case.

If it is necessary that I open the door before I leave the house, then 2(p → q) is true, if the proposi-

tional variable p stands for the proposition ‘I leave the house’ and q stands for the proposition ‘I open the

door’. Note that this formula is different in meaning from 2p → 2q, which says that if it is necessary

that I leave the house, then it is necessary that I open the door.

Another question to consider is whether 2(p ∨ ¬p) should be true. If something is logically valid, or

tautological, then it would seem reasonable that it is also necessarily true. How could there be a situation

in which p ∨ ¬p is not true? Vice versa, 3⊥ can never be true: it cannot be possible that a contradiction

is true. The falsum is false by definition, so there cannot be some possible situation in which it is true.

Duality

As mentioned above, an important insight of modal logic is the duality of possibility and necessity. That

is, the formula 2ϕ means the same as ¬3¬ϕ. If it is necessary that ϕ is true, whatever ϕ may be, then it

cannot be possible that ϕ is false. Therefore, it is then not possible that not ϕ, or ¬3¬ϕ. Similarly, 3ϕ

intuitively means the same as ¬2¬ϕ. If it is possible that ϕ is true, then it cannot also be necessary that

¬ϕ is true, hence necessary that ϕ is false. Given this duality of 2 and 3, possibility can be defined in

terms of necessity:

3ϕ is an abbreviation of ¬2¬ϕ.

By standard propositional logic we can infer 2¬ϕ ↔ ¬3ϕ from 3ϕ ↔ ¬2¬ϕ, and also 2ϕ ↔ ¬3¬ϕ.


The variety of modalities

Necessity and possibility are not the only modalities. The introduction of Kripke’s models showed that

the different modalities could all be understood in the same way: the content of the 2 and 3 is different,

but their logical form is the same.

alethic logic necessarily possibly

tense logic always in the future (past) sometime in the future (past)

epistemic logic certainly / known perhaps

deontic logic obligatorily permissibly

dynamic logic decided / determined undecided / left open

In all these logics, the 2 and 3 are considered dual. So, for instance, it is permitted to enter the zoo

if, and only if, it is not obligatory to not enter (stay out of) the zoo. And it is consistent with what is

known that the sun will rise tomorrow if, and only if, it is not known that the sun will not rise tomorrow.

There are some differences between the modalities. It is natural to think that, if it is known that

Bob is in front of the door, then Bob actually is in front of the door. We would not call it knowledge if it

weren’t so. On the other hand, we would not say that if it is obligatory that personal data are kept private,

it is true that personal data are kept private. So, within epistemic logic the formula 2ϕ → ϕ would be

considered logically valid, but in deontic logic that same formula would be considered invalid. More on

this in section 5.

3.2 Kripke models and the semantics for the modal language

For propositional logic all we needed as a model was the truth valuation. That won’t do for our modal

language: to evaluate whether something is necessary, or possible, we have to look beyond what is

actually true or false. To do so, we make use of Kripke models.

The concept of a Kripke model was introduced in the previous chapter. As was pointed out there,

these models consist of three items: (1) a set of possible worlds, (2) a relation that determines whether

one possible world is accessible from another, and (3) a valuation that determines whether an atomic

proposition is true or false in a given possible world. In some cases we want to consider the Kripke model

in abstraction of the valuation. For this reason we also define Kripke frames, which consist of only the

set of possible worlds and the accessibility relation.

Definition 3 (Kripke frame and Kripke model). A Kripke frame is a tuple F = hW, Ri, such that

- W is a non-empty set of possible worlds,

- R ⊆ (W × W) is a binary relation on W; if wRv we say that v is accessible from w.

A Kripke model is a tuple M = hW, R, Vi, such that

- hW, Ri is a Kripke frame (the frame underlying M, or the frame M is based on),

- V : W → Pow(var) is a valuation for the set of atomic propositions var; proposition p is true in

world w if p ∈ V(w), and false in w if p ∉ V(w).

The Kripke models (not the frames) are dependent on the choice of propositional variables. In prac-

tice we often omit an explicit definition of the variables and assume a countably infinite set of them.


We can represent Kripke frames by means of graphs. Here are three simple Kripke frames:

[three frame diagrams, each on the set of possible worlds {w, v, u, t}; their accessibility relations are described below]

In the rightmost of these, the set of possible worlds is {w, v, u, t} and the accessibility relation is wRv,

vRv, uRt, and tRu. So, in world w the only accessible world is v, and for v the only accessible world is v

itself. The leftmost frame has the same set of possible worlds, but the relation is wRv, wRt, tRv, vRu and

uRu.

Kripke models can be represented by writing the valuation set V(w) next to world w in the graphs.

[model diagram: worlds t (p, r), w (q, r), v (p), u (); relation tRw, tRv, tRu, uRw]

In this Kripke model, the frame consists of four possible worlds and the relation is tRw, tRv, tRu and

uRw. The valuation is such that V(t) = {p, r}, so p and r are true at t and q is false there. At u all atomic

propositions are false.

Modal formulas are evaluated in a Kripke model. For all formulas in the modal language and for all

possible worlds in all models, the truth definition determines whether that formula is true in that possible

world in that model. The connectives from propositional logic are evaluated in the same way. Given the

model represented above, the propositions q and r are both true in world w, and therefore also q ∧ r is

true in w. Proposition p is true in world v, and therefore formula p ∨ ¬q is also true in v. In world u, q is

false, and therefore q → r is true.

The relation R is only relevant when it comes to evaluating 2 and 3 formulas.

Given a model M = hW, R, Vi, the formula 2ϕ is true in possible world w if, and only if, ϕ

is true in every possible world that is accessible from w.

Given a model M = hW, R, Vi, the formula 3ϕ is true in possible world w if, and only if, ϕ

is true in some possible world that is accessible from w.

Definition 4 (Truth in a model). Let M = hW, R, Vi be a model, w a possible world in W, and var a set

of atomic propositional variables. The truth value of modal formulas is inductively defined relative to M

and w, in the manner given below.

M, w |= p ⇔ p ∈ V(w) (for propositions p ∈ var)

M, w |= ⊥ ⇔ never (⊥ is false at every world)

M, w |= ¬ϕ ⇔ it is not the case that M, w |= ϕ

M, w |= ϕ ∧ ψ ⇔ M, w |= ϕ and M, w |= ψ

M, w |= ϕ ∨ ψ ⇔ M, w |= ϕ or M, w |= ψ

M, w |= ϕ → ψ ⇔ M, w |= ϕ implies that M, w |= ψ

M, w |= ϕ ↔ ψ ⇔ M, w |= ϕ if, and only if, M, w |= ψ

M, w |= 2ϕ ⇔ for all v : wRv implies that M, v |= ϕ

M, w |= 3ϕ ⇔ there is a v such that wRv and M, v |= ϕ
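These clauses can be read as a recursive evaluation procedure. The sketch below is a minimal Python rendering (the tuple encoding of formulas and the example model are my own illustrations, not part of the text); the connectives not shown are analogous.

```python
# Formulas are nested tuples; atoms are plain strings.
# Covered here: 'not', 'and', 'implies', 'box' (2), 'dia' (3).

def true_at(W, R, V, w, f):
    """M, w |= f for the Kripke model M = (W, R, V).
    R is a set of (w, v) pairs; V maps each world to its set of true atoms."""
    if isinstance(f, str):
        return f in V[w]
    op = f[0]
    if op == 'not':
        return not true_at(W, R, V, w, f[1])
    if op == 'and':
        return true_at(W, R, V, w, f[1]) and true_at(W, R, V, w, f[2])
    if op == 'implies':
        return (not true_at(W, R, V, w, f[1])) or true_at(W, R, V, w, f[2])
    if op == 'box':      # f[1] holds at every accessible world
        return all(true_at(W, R, V, v, f[1]) for v in W if (w, v) in R)
    if op == 'dia':      # f[1] holds at some accessible world
        return any(true_at(W, R, V, v, f[1]) for v in W if (w, v) in R)
    raise ValueError(f'unknown formula: {f!r}')

# A small example model: wRv, wRu, uRu.
W = ['w', 'v', 'u']
R = {('w', 'v'), ('w', 'u'), ('u', 'u')}
V = {'w': {'p'}, 'v': {'r'}, 'u': {'p', 'q'}}
assert true_at(W, R, V, 'u', ('box', 'p'))      # p at every successor of u
assert not true_at(W, R, V, 'w', ('box', 'p'))  # p fails at v
assert true_at(W, R, V, 'v', ('box', 'q'))      # vacuously: v has no successors
```

The last assertion illustrates a point that recurs below: a box formula is vacuously true at a world without accessible worlds, because `all` over an empty collection is true.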


Note that ⊥ is by definition false in every possible world. Therefore, > = ¬⊥ is true in every possible

world. In exercise 2 below, you are asked to determine the truth value of formulas such as 2⊥ and 3>.

Example 1. Consider the Kripke model M represented below.

[model diagram: worlds w (p), v (r), u (p), t (p, q); relation wRv, wRt, tRw, tRv, uRu; v has no outgoing arrows]

1. In u only one possible world is accessible, u itself, and p is true there. Hence, in all accessible

possible worlds the proposition p is true. Therefore, 2p is true in u, or M, u |= 2p.

2. In t, by contrast, there is some accessible possible world in which p is true (namely w), but there is

also an accessible world in which p is false (namely v). Therefore, p is possible, but not necessary:

M, t |= 3p but M, t |= ¬2p.

3. In w, only two worlds are accessible. In one of them r is true, in the other one q is true. So in both

of these worlds q ∨ r is true. This means that q ∨ r is true in all accessible worlds for w, and so

M, w |= 2(q ∨ r).

4. From v no possible world is accessible, so 2⊥ is vacuously true in all of v’s accessible worlds: M, v |= 2⊥. And because v is accessible from w: M, w |= 32⊥.

To define what a normal modal logic is semantically, we have to define not only truth but also validity. At

this point we have only defined truth in a world of a model. We generalize this notion now in three steps

to arrive at general validity of a formula. (The concept of a frame class will be made clear in section 4.)

Definition 5 (Validity). Let ϕ be a modal formula, M a Kripke model, F a Kripke frame, and C a class

of Kripke frames.

M |= ϕ ϕ is valid on M ⇔ M, w |= ϕ, for all worlds w in M

F |= ϕ ϕ is valid on F ⇔ M |= ϕ, for all models M based on F

|=C ϕ ϕ is valid on C ⇔ F |= ϕ, for all frames F in frame class C

|= ϕ ϕ is generally valid ⇔ F |= ϕ, for all frames F

An inference from Φ to ψ is generally valid, notation Φ |= ψ, if, and only if, for all models M and worlds

w: M, w |= ϕ for all ϕ ∈ Φ implies that M, w |= ψ.

An inference from Φ to ψ is valid in frame class C, notation Φ |=C ψ, if, and only if, for all models M

based on a frame in C and worlds w: M, w |= ϕ for all ϕ ∈ Φ implies that M, w |= ψ.

Something is valid on a Kripke frame just in case it is true at every possible world in every possible

Kripke model based on that frame. Similarly, something is valid on all Kripke frames just in case it

is true at every possible world in every possible Kripke model (based on any possible Kripke frame).

Because of this, the definition of semantic validity is just the same one as Definition 1 in chapter , when

we take the relativization of truth to possible worlds into consideration. A modal formula ϕ is generally

valid, |= ϕ, if it is valid in all Kripke frames. That just means that it is true in all Kripke models (models

based on any possible frame) at every possible world in that model.

The notion of a frame class will be made more clear in the next chapter.
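Definition 5 quantifies over all models based on a frame. Over a frame with finitely many worlds and a finite stock of atoms there are only finitely many valuations, so frame validity can be checked by enumeration. The sketch below (my own illustration; the restriction to one atom is an assumption) previews the frame correspondence results of the next chapter: 2p → p is valid on a reflexive frame but not on a frame that lacks reflexivity.

```python
from itertools import product

def true_at(W, R, V, w, f):
    """M, w |= f, with formulas as nested tuples and atoms as strings."""
    if isinstance(f, str):
        return f in V[w]
    if f[0] == 'implies':
        return (not true_at(W, R, V, w, f[1])) or true_at(W, R, V, w, f[2])
    if f[0] == 'box':
        return all(true_at(W, R, V, v, f[1]) for v in W if (w, v) in R)
    raise ValueError(f)

def valid_on_frame(W, R, f, atoms=('p',)):
    """f is valid on the frame (W, R) iff it is true at every world
    under every valuation of the atoms over the worlds."""
    W = list(W)
    for bits in product([0, 1], repeat=len(W) * len(atoms)):
        it = iter(bits)
        V = {w: {a for a in atoms if next(it)} for w in W}
        if not all(true_at(W, R, V, w, f) for w in W):
            return False
    return True

T = ('implies', ('box', 'p'), 'p')                              # 2p -> p
assert valid_on_frame({'w', 'v'}, {('w', 'w'), ('v', 'v')}, T)  # reflexive
assert not valid_on_frame({'w', 'v'}, {('w', 'v')}, T)          # not reflexive
```

In the non-reflexive frame the valuation making p false at the endpoint v falsifies 2p → p there: 2p is vacuously true at v, while p is false.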


Example 2 (Duality). On the basis of the definition of semantic validity, we can now show that the

duality of 2 and 3 is valid. In other words, the formulas 3ϕ and ¬2¬ϕ are logically equivalent. And,

by simple propositional logic, the same goes for 2ϕ and ¬3¬ϕ.

Proposition 1 (Semantic validity of duality). 3ϕ ↔ ¬2¬ϕ is generally valid.

Proof. We show that if 3ϕ is true anywhere, ¬2¬ϕ must be true there, and if ¬2¬ϕ is true anywhere,

3ϕ must be true there.

Let M be an arbitrary model, w a world in M, and ϕ a modal formula, such that M, w |= 3ϕ. Then

there is a possible world v such that wRv and M, v |= ϕ. Now suppose that M, w |= 2¬ϕ. Then ¬ϕ is true

in every possible world that is accessible from w. Since v is accessible from w, it must be the case that

M, v |= ¬ϕ. However, then M, v |= ϕ ∧ ¬ϕ, which cannot be the case. So M, w 6|= 2¬ϕ, and therefore

M, w |= ¬2¬ϕ.

Vice versa, suppose that M, w |= ¬2¬ϕ. Then it is not true that, in all possible worlds that are

accessible from w, ¬ϕ is true. In other words, there must be some possible world v, with wRv, in which

¬ϕ is false. So M, v |= ϕ. Now, given the semantic definition of 3, it is true that M, w |= 3ϕ.
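Proposition 1 can also be corroborated by brute force on small models. The sketch below (my own illustration, not from the text) enumerates all 16 relations and 4 valuations on a two-world domain and checks that 3p and ¬2¬p never disagree:

```python
from itertools import product

def true_at(W, R, V, w, f):
    """M, w |= f for formulas built from atoms, 'not', 'box', 'dia'."""
    if isinstance(f, str):
        return f in V[w]
    if f[0] == 'not':
        return not true_at(W, R, V, w, f[1])
    if f[0] == 'box':
        return all(true_at(W, R, V, v, f[1]) for v in W if (w, v) in R)
    if f[0] == 'dia':
        return any(true_at(W, R, V, v, f[1]) for v in W if (w, v) in R)
    raise ValueError(f)

W = ['w', 'v']
pairs = [(a, b) for a in W for b in W]
dia_p = ('dia', 'p')
not_box_not_p = ('not', ('box', ('not', 'p')))

# Every relation on two worlds, combined with every valuation of p:
for bits in product([0, 1], repeat=len(pairs)):
    R = {pair for pair, bit in zip(pairs, bits) if bit}
    for pw, pv in product([0, 1], repeat=2):
        V = {'w': {'p'} if pw else set(), 'v': {'p'} if pv else set()}
        for w in W:
            assert true_at(W, R, V, w, dia_p) == true_at(W, R, V, w, not_box_not_p)
```

Of course this check covers only two-world models; the proof above is what establishes the duality in general.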

Given some modal formula ϕ, we may want to consider whether or not it is a general validity of

modal logic. If so, it must be true at every world in every model. If not, then there must be at least some

world in some Kripke model where ϕ is false. Such a model is called a ‘countermodel’ to the claim that

ϕ is a general validity. Similarly, a countermodel to the claim that Φ |= ψ is a model where, in some

world, all the formulas in Φ are true, but ψ is false.

Example 3. Consider the two formulas 2(p → q) and 2p → 2q. In fact, 2p → 2q 6|= 2(p → q). A

countermodel to prove this is one where at some world 2p → 2q is true and 2(p → q) is false. A first

step is this:

[model diagram: two worlds w and v, with wRv and no atomic propositions true anywhere]

Since p is false at v, 2p is false at w. As a consequence, M, w |= 2p → 2q (look this up in the truth table for → to see this).

Now, we need to make sure that 2(p → q) is false in w. Note that it is true in the model above: in v,

p → q is true (again, look this up in the truth table for →). And, since this is the only accessible world

for w, it is the case that M, w |= 2(p → q). So, we add another accessible world for w, called u, such

that p → q is false there.

[model diagram: wRv and wRu, with V(u) = {p} and no atoms true at w or v]

In this model, in w, 2p is still false (because p is false at v) and so 2p → 2q is still true. But

2(p → q) is not true anymore, because p → q is false at u. Hence, the above model is a countermodel

for 2p → 2q |= 2(p → q). There are models and worlds where the premise is true but the conclusion is

false.
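The finished countermodel can also be checked mechanically. This sketch re-uses a minimal evaluator (the tuple encoding is my own, not from the text) on the model just constructed: wRv and wRu, with p true only at u.

```python
def true_at(W, R, V, w, f):
    """M, w |= f for formulas built from atoms, 'implies' and 'box'."""
    if isinstance(f, str):
        return f in V[w]
    if f[0] == 'implies':
        return (not true_at(W, R, V, w, f[1])) or true_at(W, R, V, w, f[2])
    if f[0] == 'box':
        return all(true_at(W, R, V, v, f[1]) for v in W if (w, v) in R)
    raise ValueError(f)

W = ['w', 'v', 'u']
R = {('w', 'v'), ('w', 'u')}
V = {'w': set(), 'v': set(), 'u': {'p'}}

premise = ('implies', ('box', 'p'), ('box', 'q'))  # 2p -> 2q
conclusion = ('box', ('implies', 'p', 'q'))        # 2(p -> q)
assert true_at(W, R, V, 'w', premise)         # 2p fails at w, so the implication holds
assert not true_at(W, R, V, 'w', conclusion)  # p -> q fails at u
```

The two assertions confirm exactly what the example claims: at w the premise is true while the conclusion is false.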

3.4 Exercises

1. (a) Let 2ϕ stand for “It is known that ϕ”. Explain why the formulas 2ϕ → 22ϕ and ¬2ϕ →

2¬2ϕ are also called ‘knowledge introspection’. (b) If 2ϕ were to mean that ϕ is obligatory, then

which formula would say that ϕ is permitted? And how about a formula saying that ϕ is forbidden?


2. Given the Kripke model below, which of the following statements is true?

[model diagram: worlds w (no atoms true), v (p), u (q), and t (p)]

a. w |= ¬p f. w |= 2p ∨ 2q

b. v |= 2p g. u |= 22⊥

c. t |= 3> h. v |= 2q → ¬p

d. t |= 2⊥ i. w |= 3⊥

e. w |= 2(p ∨ q) j. w |= 2>

3. Give a valuation on the frame below such that all of the following is true: w |= 3p, w |= 22p, v |= 3q, u |= ¬q. Does there exist more than one valuation that validates these constraints?

[frame diagram on the worlds w, v, u]

4. i. Consider this frame, with the valuation V(w) = {p, q}, V(v) = {p}, V(t) = ∅, V(u) = {q}, V(s) = {p}, V(r) = {p, q}. In which world(s) is the following formula true? (a) 32⊥, (b) 22q, (c) 32q, (d) 3ⁿ(q ∨ 2q), for every sequence of 3's of length n ≥ 1.

[frame diagram on the worlds w, v, t, u, s, r]

ii. Give an alternative valuation such that the following formulas are all, simultaneously, true on the resulting model: (a) w |= 32p → 2p, (b) r |= ¬p → 33q, (c) u |= 232¬q, (d) v |= p → 3p, (e) s |= q → 2¬q.

5. Which of the following formulas is a tautology, that is, which formulas are true in all worlds of all

models? If a formula is not a tautology, give a countermodel.

a. 2>
b. 2⊥
c. 3>
d. 3⊥
e. 2p → 3p
f. (2ϕ ∧ 3ψ) → 3ϕ
g. (2(p ∨ q) ∧ 3¬p) → 3q
h. 2(ϕ → ϕ)

6. Explain why the following formulas are generally valid.

a. ϕ ∨ ¬ϕ
b. ϕ, 2ϕ, 22ϕ, . . . , for propositional tautologies ϕ
c. 3ϕ → 3>
d. (2ϕ ∧ 2ψ) → 2(ϕ ∧ ψ)
e. (2ϕ ∨ 2ψ) → 2(ϕ ∨ ψ)
f. (2(ϕ → ψ) ∧ 2ϕ) → 2ψ

7. Show that the other direction of the true formula (2ϕ ∨ 2ψ) → 2(ϕ ∨ ψ) given above, i.e. 2(ϕ ∨

ψ) → (2ϕ ∨ 2ψ), is not generally valid. That is, give formulas ϕ and ψ and a model and world at

which 2(ϕ ∨ ψ) → (2ϕ ∨ 2ψ) does not hold.

8. Give instances of ϕ and ψ for which 2(ϕ ∨ ψ) → (2ϕ ∨ 2ψ) does hold in all worlds in all models.


4 Characterizability and frame correspondence

4.1 Characterizability and the modal language

Different Kripke frames for different modalities

As has been said, there are various modalities: epistemic for knowledge, deontic for obligation, tense

for the flow of time, dynamic for action and causation, and so on. Since the work of Kripke and others,

the common idea is that the truth values of all modal formulas can be captured by Kripke models. The

only difference between those modalities concerns what it means that one ‘possible world’ is ‘accessible’

from another one. For epistemic modality ‘accessible’ means that we cannot distinguish one world (i.e.

one picture of how the world might be) from another on the basis of what we know. For deontic modality

some world, or situation, is ‘accessible’ from the current one if it is permissible to change the current

situation into the other one. For tense, the ‘worlds’ are time points and accessibility represents the

passing of time, from one point in time to the next.

However, even if we accept that all modalities can be understood in terms of Kripke models, that

does not imply that all Kripke models make sense for every modality. For instance, the most common

picture of time is that of a (continuous) line, without loops, circles, or jumps. For any two times, they

are the same, or one comes before the other. (If you believe that you cannot change the past but you

can change the future, then the time line might be ‘branching’ towards the future.) So, with this specific

interpretation of ‘accessibility’ comes a more specific demand on the kinds of Kripke frames we consider

applicable.

Similarly, suppose we think of the accessibility relation such that wRv means that, on the basis of our knowledge, we cannot distinguish between the possible worlds w and v. Then it would be natural that for all possible worlds w and v, if wRv, then also vRw.

And consider the deontic accessibility relation, such that wRv means that, in situation w we are

allowed to perform an action that changes the situation to v. In that sense of ‘accessibility’ it would not

be necessary that wRv implies vRw: if we are in a bad situation, it would be permissible to improve it,

but then it would not be permissible to change the improved situation back to the old one. On the other

hand, we might like to think that, morally, we are always permitted to do something. Our best action is

the most that can, morally speaking, be asked for. Hence, there is always some world that is accessible.

In this way, different senses of ‘accessibility’ lead to different ideas of which models make sense

and which do not. Specifically, in the example with knowledge, only those Kripke frames are applicable

where

∀w∀v(wRv → vRw),

whereas in the case of permissible action the frames should be restricted to those where

∀w∃v(wRv).

In case accessibility represents the flow of time, the only frames we wish to include are those where the

worlds follow each other consecutively, continuously, non-cyclically, and eternally.

The question we then have to deal with is what the consequences of such a restriction of the Kripke frames (or models) are for reasoning about those modalities. General validity is defined in terms of

what is true in all possible worlds in all Kripke models, or, equivalently, in terms of what is valid in

all Kripke frames. Hence, if we restrict ourselves to only some of the Kripke frames, perhaps more

formulas will become valid: there might be formulas that are valid on all Kripke frames with the con-

dition ∀w∀v(wRv → vRw), but not valid on some other Kripke frames. Those formulas could then be

considered ‘valid for the epistemic modality’, but not valid for the deontic modality.

Vice versa, we would like to be able to express using our modal language that the accessibility relation

has this additional structure. For instance, we would like to have a modal formula that says “If I cannot

distinguish this world from that one, then I cannot distinguish that world from this one”. In other words,

a modal formula that is true if, and only if, the accessibility relation is such that ∀w∀v(wRv → vRw).


Due to the limitations in the expressive power of the modal language, this search will be successful in

some cases but not in others. For instance, in view of the proposed condition on epistemic accessibility,

the formula 32ϕ → ϕ is ‘valid for epistemic modality’, and that it is valid in this sense also means, vice

versa, that the frame fulfils this condition. On the other hand, we will not be able to find such a modal

formula characterizing the direction of time.

The expressive power of the modal language is importantly constrained by the fact that (i) modal formu-

las are evaluated at a world in a model, and (ii) their truth values can only depend on the worlds that are

accessible from it (and the worlds accessible from there, and so on). It is sometimes said that the modal

language gives us an ‘internal’ perspective on a model. We evaluate a formula in the given world ‘where

we are’, and then the accessibility relation allows us to ‘travel’ to another world in the model, where

we can evaluate another formula, and so on. This contrasts with predicate logic, in which we evaluate

formulas relative to the model as a whole, thus taking an ‘external’ perspective on the model.

Two different models or frames can seem to be the same, given an internal perspective, although they

can be clearly distinguished once we adopt an external perspective. For instance, if the proposition p is

true everywhere on a model of which the frame is represented by the natural numbers, then ‘travelling’

through it would be indistinguishable from ‘travelling’ from a possible world a to a itself all the time.

All that we would observe is that, no matter how much we travel, the proposition p is true all the time.

0 ──→ 1 ──→ 2 ──→ ⋯              a ⟲

As a consequence of this internal perspective, the modal language is sometimes not able to distinguish

two worlds, frames, or models, when the predicate logical language is able to distinguish them. The

predicate logical formula ∀x∀y(x = y ∧ xRy) is true only for the right hand model. Yet, in both a and in

0 (or any other natural number), it would be true that

p ∧ 3p ∧ 2p ∧ 33p ∧ 22p ∧ . . .

so as far as the modal language goes, those two models are equivalent. We will say, later on, that there

is a “bisimulation” between the two models, or that 0 in the one model and a in the other model are

“bisimilar”.

In what follows, we first discuss properties of frames, the extra demands on the accessibility

relation. As we will see, we can define some of them using the modal language. Others we cannot

define. To understand why this is so, we will give a precise definition of what it means that two worlds,

models, or frames, are indistinguishable as far as the modal language goes. Several techniques are

presented for determining whether a property of accessibility relations can be expressed or not.

Then, the first thing to do is to consider the properties a frame has, so that we may consider to what

extent those properties can be expressed in the modal language. Of course we can mention the size of

the frame, i.e., the number of possible worlds, or for instance the fact that there are exactly three worlds

for which no world is accessible. Here we restrict ourselves to structural uniformities in a frame, such

as the property that an accessible world exists for every world in the frame, or the property that the

accessibility relation is transitive, or symmetric. The definition below lists a number of these properties,

stated in the language of predicate logic.

Definition 6 (Frame properties). The following list defines various properties of frames using predicate

logical notation.


⟨W, R⟩ is reflexive ⇔ ∀w(wRw)
⟨W, R⟩ is irreflexive ⇔ ∀w¬(wRw)
⟨W, R⟩ is serial ⇔ ∀w∃v(wRv)
⟨W, R⟩ is symmetric ⇔ ∀w∀v(wRv → vRw)
⟨W, R⟩ is asymmetric ⇔ ∀w∀v(wRv → ¬vRw)
⟨W, R⟩ is anti-symmetric ⇔ ∀w∀v((wRv ∧ w ≠ v) → ¬vRw)
⟨W, R⟩ is weakly ordered ⇔ ∀w∀v(w ≠ v → (wRv ∨ vRw))
⟨W, R⟩ is transitive ⇔ ∀w∀v∀u((wRv ∧ vRu) → wRu)
⟨W, R⟩ is Euclidean ⇔ ∀w∀v∀u((wRv ∧ wRu) → vRu)
⟨W, R⟩ is dense ⇔ ∀w∀v(wRv → ∃u(wRu ∧ uRv))
⟨W, R⟩ is deterministic ⇔ ∀w∀v∀u((wRv ∧ wRu) → v = u)
⟨W, R⟩ is piecewise connected ⇔ ∀w∀v∀u((wRv ∧ wRu) → (vRu ∨ uRv))
⟨W, R⟩ is universal ⇔ ∀w∀v(wRv)
⟨W, R⟩ is disconnected ⇔ ∀w∀v¬(wRv)

A preordered frame is transitive and reflexive. A partially ordered frame is anti-symmetric and preordered. An equivalence frame is symmetric and preordered.

The frame properties each define a frame class. For instance, C = {F | F is transitive} is the class of

all and only transitive frames. The class of preordered frames is the intersection of the class of reflexive

frames and the class of transitive frames, so Cpreorder = Creflexive ∩ Ctransitive .
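For a finite frame, each property in Definition 6 can be checked by brute force. A Python sketch (the function names are mine, not from these notes; R is given as a set of pairs):

```python
def reflexive(W, R):  return all((w, w) in R for w in W)
def serial(W, R):     return all(any((w, v) in R for v in W) for w in W)
def symmetric(W, R):  return all((v, w) in R for (w, v) in R)
def transitive(W, R): return all((w, u) in R for (w, v) in R for (x, u) in R if x == v)
def euclidean(W, R):  return all((v, u) in R for (w, v) in R for (x, u) in R if x == w)

# The frame 0 -> 1 -> 2 with the 'shortcut' 0 -> 2:
W = {0, 1, 2}
R = {(0, 1), (1, 2), (0, 2)}
print(transitive(W, R))   # True
print(reflexive(W, R), serial(W, R), symmetric(W, R), euclidean(W, R))
# False False False False
```

The example frame is transitive but, for instance, not serial: world 2 has no accessible world at all.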

The question we are concerned with is whether there are modal formulas that express these properties. Or, more precisely, whether there are modal formulas that, if valid on a frame, guarantee that the frame has this property (that it belongs to the frame class). This is what the following definition is concerned with.

Definition 7 (Definable frame class). A set of formulas Φ characterizes a class of frames C if, and only if, the formulas in Φ are jointly valid on all, and only, the frames F in class C.

F |= ϕ for all ϕ ∈ Φ ⇔ F ∈ C

A class of frames is modally definable if there is a set of modal formulas that characterizes it.

Given this definition, trivially every set of formulas characterizes some class of frames. The issue we are interested in, however, is the converse: to what extent are classes of frames modally definable? The following theorem states some positive results in this direction. Many frame classes can be modally defined by means of a single formula.

Theorem 1 (Correspondence Theorems (1)).

- 2ϕ → ϕ characterizes the class of reflexive frames.

- 2ϕ → 3ϕ characterizes the class of serial frames.

- 2ϕ → 22ϕ characterizes the class of transitive frames.

- 32ϕ → ϕ characterizes the class of symmetric frames.

- 3ϕ → 23ϕ characterizes the class of Euclidean frames.

- 22ϕ → 2ϕ characterizes the class of dense frames.

- 3ϕ → 2ϕ characterizes the class of deterministic frames.


Reflexivity means that every possible world is accessible to itself. The opposite of this is irreflexivity:

then no world is accessible to itself. There can also be frames in which some worlds are ‘auto-accessible’

and others are not. Any frame that is not reflexive is called ‘non-reflexive’. So, all irreflexive frames are

non-reflexive, but not vice versa.

Asymmetry means that, for any two worlds in the frame, if the first is accessible to the second, then the

second is not accessible to the first. To see that asymmetry implies that the frame is also irreflexive,

consider the case where the two worlds are one and the same. Anti-symmetry allows for both reflexive and non-reflexive frames, because it restricts the asymmetry condition to pairs of distinct worlds.

Symmetric frames can be either reflexive or non-reflexive.

The difference between transitive and Euclidean frames can be illustrated as follows. The continuous

lines represent the condition and the dotted line indicates what the frame property then stipulates.

[left (transitivity): solid arrows w → v and v → u, dotted arrow w → u; right (Euclidean): solid arrows w → v and w → u, dotted arrow v → u]

Note 1: An R-cycle wR . . . Rw in a transitive frame implies that for any two worlds in that chain, those

worlds are mutually R-accessible. (You can check this for yourself.) In other words, if we were to limit the model to the worlds in that cycle, the relation would be universal. A Euclidean relation implies that the worlds accessible from a given world are always accessible to each other. Both facts are illustrated

below.

[left: a transitive frame containing the cycle w → v → u → w, in which all worlds on the cycle become mutually accessible; right: a Euclidean frame in which the successors v, z, u of w are all mutually accessible]

Note 2: If a frame is reflexive, then it is Euclidean if, and only if, it is symmetric and transitive. The

resulting relation on the frame is called an equivalence relation.
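Note 2 can be verified by brute force on small frames: enumerate every relation R on a three-world set, keep only the reflexive ones, and compare "Euclidean" with "symmetric and transitive". A Python sketch of this check (not from the notes):

```python
from itertools import product

W = [0, 1, 2]
pairs = [(a, b) for a in W for b in W]

def symmetric(R):  return all((v, w) in R for (w, v) in R)
def transitive(R): return all((w, u) in R for (w, v) in R for (x, u) in R if x == v)
def euclidean(R):  return all((v, u) in R for (w, v) in R for (x, u) in R if x == w)

note2_holds = True
# Every relation on W is a choice of which of the 9 pairs to include.
for bits in product([0, 1], repeat=len(pairs)):
    R = {pair for pair, b in zip(pairs, bits) if b}
    if all((w, w) in R for w in W):   # consider reflexive frames only
        note2_holds &= (euclidean(R) == (symmetric(R) and transitive(R)))
print(note2_holds)  # True
```

Three worlds is of course not a proof for all frames, only a sanity check of the stated equivalence.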

- 2(2ϕ → ψ) ∨ 2(2ψ → ϕ) characterizes the class of piecewise connected frames.

A frame is reflexive if, and only if, the formula 2ϕ → ϕ is valid on it. It is important to

realize that these correspondence theorems relate formulas to frames, not models. The formula 2p → p

can be valid on a model that does not have a reflexive frame. But if it is valid on the frame, then that

frame is reflexive. The two models below illustrate this point: they have the same (non-reflexive) frame, but only on the right hand one is the formula 2p → p false.

(p) w ───→ v (p)          () w ───→ v (p)

In order to see how to prove these theorems, we next give two such proofs. The first proves the modal definability of reflexivity, the second the modal definability of transitivity.

⇐: Suppose F = ⟨W, R⟩ is reflexive. We have to show that F |= 2ϕ → ϕ, that is, for all formulas ϕ, for all valuations V, for all w ∈ W, w |= 2ϕ → ϕ in the model ⟨W, R, V⟩. Thus consider an arbitrary formula

ϕ, an arbitrary valuation V and an arbitrary world w in W. Since R is reflexive wRw has to hold. Now

suppose that w |= 2ϕ. This means that for all v, if wRv, then v |= ϕ. Since wRw, this implies that w |= ϕ.

This proves that w |= 2ϕ → ϕ, and we are done.

⇒: This direction we show by contraposition. Thus we assume F = ⟨W, R⟩ is not reflexive, and then show that F ⊭ 2ϕ → ϕ for some formula ϕ. In other words, we have to show that if F is not reflexive, then there is a formula ϕ and a valuation V and a world w in W such that w ⊭ 2ϕ → ϕ in the model ⟨W, R, V⟩. Note that w ⊭ 2ϕ → ϕ is the same as w |= 2ϕ ∧ ¬ϕ. Thus suppose F is not reflexive. This means that there is at least one world w such that not wRw. Now define the valuation V on F as follows (with only one atomic proposition p). For all worlds v:

p ∈ V(v) ⇔ wRv

Observe that in this definition the v are arbitrary, but w is the particular world such that not wRw that we fixed above. The definition implies that v |= p if wRv, and for all other worlds z in W it implies z ⊭ p, i.e., z |= ¬p. For instance, as in this model:

(p) u       v (p)
      ↖   ↗
     w (¬p)
        ↑
     z (¬p)

Since not wRw, we have w |= ¬p. But the definition of V implies that all accessible worlds v of w, i.e., all worlds such that wRv, have v |= p. Thus w |= 2p. Hence w |= 2p ∧ ¬p and so w ⊭ 2p → p. Therefore, there is a formula ϕ, namely the formula p, such that w ⊭ 2ϕ → ϕ.
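The valuation used in the ⇒ direction can be reproduced concretely. Below is a small Python sketch (illustrative world names of my own) that builds V(x) = {p} exactly when wRx on a frame where w does not access itself, and confirms that w then satisfies 2p but not p, falsifying 2p → p.

```python
W = {'w', 'v', 'u', 'z'}
R = {('z', 'w'), ('w', 'v'), ('w', 'u')}   # note: (w, w) is absent

# The valuation from the proof: p holds at x exactly when wRx.
V = {x: ({'p'} if ('w', x) in R else set()) for x in W}

def forces_box_p(x):
    """Does 2p hold at x, i.e. does p hold at every successor of x?"""
    return all('p' in V[y] for y in W if (x, y) in R)

print(forces_box_p('w'))   # True: every world that w sees satisfies p
print('p' in V['w'])       # False: w itself does not satisfy p
```

So w satisfies 2p ∧ ¬p, just as in the proof.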

⇐: Suppose F = ⟨W, R⟩ is transitive. We have to show that F |= 2ϕ → 22ϕ, that is, that for all formulas ϕ, for all valuations V, for all w ∈ W, w |= 2ϕ → 22ϕ in the model ⟨W, R, V⟩. Thus consider

an arbitrary formula ϕ, an arbitrary valuation V and an arbitrary world w in W. Now suppose w |= 2ϕ.

We have to show that w |= 22ϕ, i.e. for all v such that wRv, v |= 2ϕ. Thus consider a v such that wRv.

To show v |= 2ϕ, we have to show that for all u with vRu, u |= ϕ. Thus consider a u such that vRu. The

transitivity of R now implies that wRu. Since w |= 2ϕ, this means that all successors of w force ϕ. Since

wRu, u is a successor of w. Hence u |= ϕ. Thus we have shown that for all u with vRu, u |= ϕ. Hence

v |= 2ϕ. And that is what we had to show, as it proves that w |= 2ϕ → 22ϕ.

⇒: This direction we show by contraposition. Thus we assume F = ⟨W, R⟩ is not transitive, and then show that F ⊭ 2ϕ → 22ϕ for some ϕ. In other words, we have to show that if F is not transitive, then there is a formula ϕ and a valuation V and a world w in W such that w ⊭ 2ϕ → 22ϕ in the model ⟨W, R, V⟩. Note that w ⊭ 2ϕ → 22ϕ is the same as w |= 2ϕ ∧ ¬22ϕ. Thus suppose F is not transitive. Then there are at least three worlds w, v and u such that wRv and vRu and not wRu. Now define the valuation V on F as follows:

p ∈ V(x) ⇔ wRx.

Thus, we put x |= p for the worlds x with wRx, and for all other worlds x in W we put x ⊭ p, i.e. x |= ¬p. E.g. as in this model:

(¬p) u
      ↑
 (p) v
      ↑
(¬p) w

Since not wRu, we have u |= ¬p. This implies that v |= ¬2p. But this again implies that w |= ¬22p. But the definition of V implies that all successors v of w, i.e. all worlds such that wRv, have v |= p. Thus w |= 2p. Hence w |= 2p ∧ ¬22p. Thus w ⊭ 2p → 22p. Thus there is a formula ϕ, namely p, such that w ⊭ 2ϕ → 22ϕ. This proves that the formula 2ϕ → 22ϕ characterizes the class of transitive frames.

We can get a bit further with our quest for modal definability by means of the following proposition.

It states that we can freely combine the formulas characterizing frame properties into sets of formulas

characterizing combined frame properties.

Proposition 2 (Combination). If Φ1 characterizes frame class C1 and Φ2 characterizes frame class C2 ,

then Φ1 ∪ Φ2 characterizes frame class C1 ∩ C2 .

Some interesting frame classes are defined by a set of properties, such as the preordered frames

and the equivalence frames. A direct consequence of Proposition 2 is that these frame classes are also

modally definable:

Theorem 2 (Correspondence Theorems (2)). The set {2ϕ → ϕ, 2ϕ → 22ϕ} characterizes the class of preordered frames. The set {2ϕ → ϕ, 2ϕ → 22ϕ, 32ϕ → ϕ} characterizes the class of equivalence frames.

A proof of the correspondence theorem for equivalence frames can be found in the book of van Ditmarsch et al. [16].

All of this seems to suggest that the frame properties that are definable in the modal language are

always expressible as predicate logical formulas starting with a universal quantifier. However, that is not

quite true. There are a few properties that cannot be defined by means of the predicate logical language

that are nonetheless modally definable.

- 2(2ϕ → ϕ) → 2ϕ characterizes the class of Gödel-Löb frames.

The Gödel-Löb frames are transitive and conversely well-founded. A frame is well-founded if

there is always a finite series of steps to a ‘root’ world, in other words, there is no infinite chain of

worlds . . . Rw2 Rw1 Rw0 . A frame is conversely well-founded if there is no infinite chain of worlds

w0 Rw1 Rw2 R . . .. The property of converse well-foundedness is not expressible in predicate logic, and it

is also not modally definable as such. But in combination with transitivity it is modally defined by the


Gödel-Löb formula. The name originates from the two founders of Provability Logic, which is a modal

logic in which 2ϕ is understood to mean “ϕ is provable”. The modal formula is an axiom in that logic

(see the next section), and it says that ϕ is provable whenever it is provable that the provability of ϕ implies ϕ.

We leave aside the question of what the McKinsey frame property is.

Not all frame properties can be modally defined. To show that a property cannot be defined, there are

several methods available. They are all based on the fact that two possible worlds in two different models

can make exactly the same formulas true. In this section we define a relation between worlds in models

called a ‘bisimulation’, that guarantees that the two worlds are ‘semantically equivalent’, i.e., they make

the same modal formulas true. Put differently, no modal formula can distinguish the two possible worlds,

by being true in the one world and false in the other. In the next section we will use this concept to define

three types of relations between Kripke frames that can be used to show that a frame property cannot be

modally defined.

Definition 8 (Bisimulation). Given two models M = ⟨W, R, V⟩ and M′ = ⟨W′, R′, V′⟩, a bisimulation between M and M′ is a relation Z ⊆ W × W′ such that

1. wZw′ implies that the same propositions are true in w and w′, V(w) = V′(w′),

2. wZw′ and wRv implies that there is a v′ ∈ W′ such that w′R′v′ and vZv′,

3. wZw′ and w′R′v′ implies that there is a v ∈ W such that wRv and vZv′.

Here is a graphical representation of the second and third conditions, the so-called ‘forth’ and ‘back’

conditions. The continuous lines indicate the conditions under which the dotted lines should be found.

(forth)   v ·····Z····→ v′        (back)   v ·····Z····→ v′
          ↑            :                   :            ↑
          R            R′                  R            R′
          w ─────Z────→ w′                 w ─────Z────→ w′

These graphs illustrate the way to check for bisimilarity. You start with a certain world w in one model and another world w′ in the other model, with the same propositional valuation V(w) = V′(w′) (condition 1). Then you look at the worlds that are accessible from w and connect each of them with a world in the other model that is accessible from w′. And you do so vice versa for the worlds that are accessible from w′: they must be connected with a world in the first model. Then, for all of the worlds you 'connected' with each other you do the same thing, until you have reached the point where there are no more worlds to connect.

Here is a simple example of a bisimulation between the worlds of two different models:

M:  (q) w ⇄ v (q)          M′:  (q) a ⟲

The bisimulation is wZa and vZa. For the first condition we check that V(w) = V′(a), which is correct, and V(v) = V′(a), which is also correct. Then we move on to the second condition: w can access v, and wZa, so there must be some world in the right hand model that is (i) accessible from a, and (ii) bisimilar to v. Clearly, that world is a itself: (i) aR′a, and (ii) vZa. And v can access w, so there must be a world in the right hand model that is accessible from a and bisimilar to w. Again, this world is a itself. For the third condition we have to do the same thing in the reverse direction. We observe that wZa and aR′a, so we have to find a world x in the left hand model such that wRx and xZa. That world can only be v: wRv and vZa. (For vZa and aR′a the required world is w.) We are done.
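The checking procedure just described can be run as a greatest-fixpoint computation: start from all pairs of worlds that agree on their atomic propositions, and repeatedly remove pairs that violate the forth or back condition. A Python sketch (the function name is my own), applied to the two models of the example:

```python
def largest_bisimulation(W1, R1, V1, W2, R2, V2):
    """R1, R2 map each world to its set of successors; V1, V2 to its true atoms."""
    # Start from all pairs satisfying condition 1 (same valuation).
    Z = {(w, x) for w in W1 for x in W2 if V1[w] == V2[x]}
    changed = True
    while changed:
        changed = False
        for (w, x) in list(Z):
            forth = all(any((v, y) in Z for y in R2[x]) for v in R1[w])
            back  = all(any((v, y) in Z for v in R1[w]) for y in R2[x])
            if not (forth and back):        # pair violates condition 2 or 3
                Z.discard((w, x))
                changed = True
    return Z

# Left model: w <-> v, both satisfying q.  Right model: a reflexive point a with q.
Z = largest_bisimulation(
    {'w', 'v'}, {'w': {'v'}, 'v': {'w'}}, {'w': {'q'}, 'v': {'q'}},
    {'a'}, {'a': {'a'}}, {'a': {'q'}})
print(Z)  # {('w', 'a'), ('v', 'a')}
```

The computed relation is exactly the bisimulation wZa, vZa found by hand above.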

The notion of bisimilarity goes to the heart of what modal logic is. Bisimulations were introduced

by Van Benthem [14], who actually defined modal logic to be the fragment of first order logic that is

invariant under bisimulation. To understand this characterization, we would have to explain how modal

logic can be seen as a fragment of first-order logic. We leave it to the reader to look this up in for instance

Blackburn et al. [2].

Theorem 4 (Bisimulation theorem). If for two models M = ⟨W, R, V⟩ and M′ = ⟨W′, R′, V′⟩ there is a bisimulation Z such that wZw′ for some w ∈ W and w′ ∈ W′, then the same formulas are true in w and w′. That is,

M, w |= ϕ ⇔ M′, w′ |= ϕ.

Next, we will give a proof of the Bisimulation theorem by means of formula induction. With an

induction on the complexity of a formula, we obtain a general method of showing that the models make

all the same formulas true. We know that they are the same for the atomic propositions. And if they are

equivalent on p and q, then they are equivalent on p ∨ q. And if they are equivalent for p ∨ q, then they

are equivalent for 3(p ∨ q), and so on.

Proof. For reasons of symmetry, we only have to prove it for the direction from M to M′. The other direction is completely analogous. So we assume that M, w |= ϕ, for arbitrary ϕ, and prove that w′ |= ϕ. We prove this by induction on the complexity of ϕ. That is, we prove that for every complex formula ϕ, it is true that w |= ϕ ⇔ w′ |= ϕ, given that this is true for the less complex subformulas. This is the induction hypothesis.

Induction hypothesis: if ψ is a subformula of ϕ, then M, w |= ψ ⇔ M′, w′ |= ψ.

Basic case: suppose ϕ = p, some atomic proposition, and so w |= p. Then it follows directly from the definition of bisimilarity, condition 1, that w′ |= p. So w′ |= ϕ.

Negation: suppose ϕ = ¬ψ, and so w |= ¬ψ. This means that w ⊭ ψ. Then, by the induction hypothesis, it is true that w |= ψ ⇔ w′ |= ψ. And since w ⊭ ψ, also w′ ⊭ ψ. Then, by the definition of ¬ once again, it follows that w′ |= ¬ψ, so w′ |= ϕ.

Disjunction: suppose ϕ = ψ ∨ χ, and so w |= ψ ∨ χ. By the induction hypothesis, we can assume that w |= ψ ⇔ w′ |= ψ and w |= χ ⇔ w′ |= χ. Because w |= ψ ∨ χ, by the semantic definition of ∨ we know that w |= ψ or w |= χ. Suppose that w |= ψ (the other case is similar). Then it is also the case that w′ |= ψ. Therefore, by the semantic definition of ∨ once again, it follows that w′ |= ψ ∨ χ, and so w′ |= ϕ.

The other connectives are similar (and can be defined in terms of negation and disjunction).

Possibility: suppose that ϕ = 3ψ, and so M, w |= 3ψ. By the truth definition, there must be a world v in W such that wRv and M, v |= ψ. From the fact that w and w′ are linked by the bisimulation Z, we may infer that there is some world v′ accessible from w′ in M′, such that vZv′ (this is condition 2 in the definition of bisimulation). The induction hypothesis gives us that M′, v′ |= ψ. But then we may conclude from the fact that w′R′v′ that M′, w′ |= 3ψ, which is precisely what we are after.

The case of necessity is similar.

For each connective, we first use the truth definition to reduce the truth value of the complex formula to the truth value of some simpler formula. Then we use the induction hypothesis to show that in the bisimilar world that formula has the same truth value. Finally, we use the truth definition again to establish the truth value of the complex formula (in the other world) on the basis of the truth values of the simpler formulas. For the modalities we also have to use condition 2, from M to M′, and condition 3, vice versa.

Bisimulations are relations between worlds in models. The bisimulation theorem shows how bisim-

ulations make those related worlds modally indistinguishable. This result can also be used to establish

dependencies between the larger structures themselves: models validating all of the same modal for-

mulas, and frames validating all of the same modal formulas. Those consequences of the bisimulation

theorem are presented below.

Corollary 1. Call a bisimulation between M and M′ complete for M′ if all the worlds in M′ are related to a world in M. If there is a bisimulation between M and M′ that is complete for M′, then

M |= ϕ ⇒ M′ |= ϕ.

Proof. Suppose that there is a bisimulation Z between M and M′ that is complete for M′. This means that for all worlds w′ in model M′, there is some world w in model M such that wZw′, and by the bisimulation theorem any formula that is true in such a w is true in w′. Now, if M |= ϕ, then ϕ is true in every world in M, including the world w related to a given w′. So M, w |= ϕ, and hence M′, w′ |= ϕ. Seeing as this holds for all w′ in M′, ϕ is true in all worlds of M′, and therefore M′ |= ϕ.

Corollary 2. Let F and F′ be two frames. If for any valuation V′ on F′ we can define a valuation on F such that there is a bisimulation between the resulting models M and M′ that is complete for M′, then

F |= ϕ ⇒ F′ |= ϕ.

This corollary will be the key to grasping the limitations of characterizability of frames. We will

consider three kinds of relations between frames that are such that for all valuations on the one frame we

can always find a valuation on the other frame that creates a bisimulation that is complete for the first

model. Using this corollary, we can then conclude that all formulas that are valid on the second frame

are valid on the first frame. As a consequence, no formula can be such that it defines a class of frames

including the first frame but not the second one.

We will define three kinds of relations between frames. Those relations are such that whatever is valid

on the one frame is also valid on the other frame. Therefore, if the one frame has a property that the other

frame does not have, then that property cannot be modally characterized. We will use the Bisimulation

theorem, and its corollaries, to show this.

One of the best known results in modal logic is the Goldblatt-Thomason theorem, which pinpoints

precisely the frame properties that can be modally defined. We will only explain the first three concepts

involved, so we cannot prove the theorem here, but see Blackburn et al. [2] for an extensive treatment.

Theorem 5 (Goldblatt-Thomason Theorem). If a frame property can be formulated in the language of

predicate logic (so not the Gödel-Löb or McKinsey properties), then it is modally definable if, and only

if, it is closed under taking generated subframes, disjoint unions, p-morphic images, and its complement

is closed under taking ultrafilter extensions.


Generated subframes

The intuition behind generated subframes is perhaps relatively easy to grasp. Suppose that a frame F

is a member of the definable class C. Then some formula (or set of formulas) must be true in all the

worlds in any model based on F. We can in some cases throw some worlds out of the frame, without

‘disrupting’ the remaining worlds in the frame. That is, those remaining worlds make all of the same

formulas true, in any model based on that resulting frame. If that is so, then the characterizing formula

must also be true in all of the remaining worlds, so the resulting frame must also be in the class C.

Now we ask, which are those ‘irrelevant’ worlds that can be eliminated without disrupting the remain-

ing worlds? Say that a world w can ‘see’ another world v if v is accessible from w via any intermediate

steps. Then if w cannot see v, whatever is the case in v cannot be relevant for w: no amount of 2 and 3

can make the truth value of a formula in w dependent on what is the case in v. Moreover, if w cannot see

v, then neither can any of the worlds w can see. Hence, if we eliminate the worlds that w cannot see, and

keep all of the other worlds, then the resulting frame should validate all of the formulas that are valid on

the original frame.

The first step is to make this concept of ‘seeing’ more precise.

Definition 9 (Hereditary closure). Let F = ⟨W, R⟩. The hereditary closure of R is R∗:

wR∗v if, and only if, there are worlds u1 . . . un (possibly none) such that wRu1Ru2R . . . RunRv

Let Ww = {w} ∪ {v ∈ W | wR∗v}. The set Ww thus includes all those possible worlds that might be relevant for evaluating a modal formula in world w: its 'immediate successors' (the v with wRv) are relevant for whether 2p is true in w, the ones that are accessible from there are relevant for evaluating whether 23q is true, and so on.

thing we have to do is define the frame resulting from restricting the worlds to Ww . This is really self-

explanatory, but stated here for completeness. We retain all the relational dependencies vRu for worlds u

and v in Ww .

Definition 10 (Generated subframe). Let F = ⟨W, R⟩ be a frame and w ∈ W. The w-generated subframe of F is Fw = ⟨Ww, Rw⟩, where vRwu if, and only if, vRu, v ∈ Ww and u ∈ Ww.
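For finite frames, the w-generated subframe can be computed directly: collect the worlds reachable from w (taking Ww to contain w itself) and restrict R to them. A Python sketch (the helper name is mine):

```python
from collections import deque

def generated_subframe(R, w):
    """R is a set of pairs; returns (Ww, Rw), the subframe generated by w."""
    Ww, frontier = {w}, deque([w])
    while frontier:                      # breadth-first search along R
        x = frontier.popleft()
        for (a, b) in R:
            if a == x and b not in Ww:
                Ww.add(b)
                frontier.append(b)
    Rw = {(a, b) for (a, b) in R if a in Ww and b in Ww}
    return Ww, Rw

R = {('w', 'v'), ('v', 'u'), ('z', 'w')}   # z sees w, but w cannot see z
Ww, Rw = generated_subframe(R, 'w')
print(Ww == {'w', 'v', 'u'} and Rw == {('w', 'v'), ('v', 'u')})  # True
```

The world z is dropped: it can see w, but is not reachable from w, so it is irrelevant for evaluating modal formulas at w.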

Proposition 3. Let F be a frame, Fw its w-generated subframe, and Mw a model based on Fw. Then there is a model M based on F such that M, w ↔ Mw, w via an Mw-complete bisimulation.

Proof. For clarity, we rename the worlds in Ww as w′ , v′ , u′ and so on, to avoid confusion. Nevertheless, w′ = w, v′ = v, and so on.

We take any valuation V such that V(v) = Vw (v′ ) for all v′ ∈ Ww , and define M = ⟨W, R, V⟩. We have to show that the relation Z such that wZw′ for all w′ ∈ Ww is a bisimulation that is complete for Mw .

First, the valuation was chosen such that, if vZv′ , then V(v) = Vw (v′ ). Second, suppose that vRu and vZv′ . Then v′ ∈ Ww , which means that wR∗ v and, because vRu, also wR∗ u, so there is some u′ (namely u itself) such that v′ Rw u′ and, by the definition of Z, uZu′ . Third, suppose that v′ Rw u′ and vZv′ . Then, from the fact that Rw is the restriction of R to Ww , it immediately follows that vRu, with uZu′ . Thus, the relation Z is a bisimulation. Finally, the bisimulation is complete for Mw , because for every world v′ in Mw there is a world v in M with vZv′ .
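The bisimulation conditions verified in proofs like the one above can also be checked mechanically. The following is a sketch with an assumed representation (not from the notes): a model is a triple (W, R, V) with V mapping each world to the set of atoms true there, and Z is a set of pairs relating worlds of the first model to worlds of the second.

```python
def is_bisimulation(M1, M2, Z):
    """Check the atomic, 'forth', and 'back' conditions for Z."""
    (W1, R1, V1), (W2, R2, V2) = M1, M2
    # related worlds agree on atomic propositions
    atomic = all(V1[w] == V2[wp] for (w, wp) in Z)
    # forth: wZw' and wR1v imply some v' with w'R2v' and vZv'
    forth = all(any((wp, vp) in R2 and (v, vp) in Z for vp in W2)
                for (w, wp) in Z for v in W1 if (w, v) in R1)
    # back: wZw' and w'R2v' imply some v with wR1v and vZv'
    back = all(any((w, v) in R1 and (v, vp) in Z for v in W1)
               for (w, wp) in Z for vp in W2 if (wp, vp) in R2)
    return atomic and forth and back

def complete_for(M2, Z):
    """Is every world of the second model related by Z to some world?"""
    W2 = M2[0]
    return all(any(wp == v for (_, v) in Z) for wp in W2)
```

A pair of two-world models with matching valuations gives a quick sanity check of both predicates.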

As will be clear, this proposition implies by Corollary 2 that every formula that is valid on a frame

is valid on its generated subframes. From there it is also straightforward to conclude that no frame class

can be modally defined if some frame belongs to that class but some of its generated subframes do not.


Corollary 3. Let F be a frame and Fw its w-generated subframe. For all ϕ: if F |= ϕ, then Fw |= ϕ. If C

is a characterizable class of frames and F ∈ C, then Fw ∈ C.

This corollary was the reason why we defined generated subframes and submodels above. The corol-

lary demonstrates a limitation to the expressive power of modal logic for characterizing frame classes.

It can now be proven, for instance, that the class of non-reflexive frames is not modally definable. You

will be asked to prove this as an exercise.

Disjoint unions

A second technique for proving that a frame property cannot be characterized is by means of a disjoint

union of two frames. Such a union is ‘disjoint’ if there is no overlap between the possible worlds of the

two frames. In order to guarantee disjointness, the general definition is as follows:

Definition 11 (Disjoint Union). Let F1 = ⟨W1 , R1 ⟩ and F2 = ⟨W2 , R2 ⟩ be two Kripke frames. Their disjoint union, F1 ⊔ F2 , is ⟨W, R⟩, where

- W = {(w, i) | w ∈ Wi }, and

- R = {⟨(w, i), (v, i)⟩ | wRi v}, with i ∈ {1, 2}.

Provided that W1 and W2 are already disjoint, the disjoint union of the two frames is simply F = ⟨W1 ∪ W2 , R1 ∪ R2 ⟩. If w is a member of both W1 and W2 , then there will be two ‘copies’ of it in the

disjoint union: (w, 1) and (w, 2). This guarantees that the structure of the two frames will not be disrupted

by their union. For instance, the disjoint union of two linearly ordered frames will consist of two linearly

ordered frames—even if the two frames have overlapping members. We can also create a disjoint union

by putting two ‘instances’ or ‘copies’ of a single frame together.
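Definition 11 can be sketched in a few lines, using the same assumed representation of frames as sets of worlds and pairs (the function name is mine). Tagging each world with 1 or 2 is exactly what guarantees that overlapping worlds end up as distinct copies.

```python
def disjoint_union(F1, F2):
    """Definition 11: tag worlds with 1 or 2 so the two frames cannot
    interfere, even if their sets of worlds overlap."""
    (W1, R1), (W2, R2) = F1, F2
    W = {(w, 1) for w in W1} | {(w, 2) for w in W2}
    R = ({((w, 1), (v, 1)) for (w, v) in R1}
         | {((w, 2), (v, 2)) for (w, v) in R2})
    return W, R
```

Taking the disjoint union of a one-world reflexive frame with itself, for instance, yields a frame with two reflexive worlds and no arrows between them.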

It is a trivial observation that, for every valuation on F1 , there is a bisimulation relating each (w, 1) in F to w in F1 . That this bisimulation is complete for F1 is also easily seen. There is no ‘influence’ from ‘the other frame’ in F, because the two are disjoint.

However, what we need is a stronger claim.

Proposition 4. Let F = F1 ⊔ F2 . Then,

F |= ϕ ⇔ F1 |= ϕ and F2 |= ϕ.

We want to establish that everything that is valid in both of the smaller frames is also valid in their

disjoint union.

Proof. We only do the direction ⇐. Assume that F1 |= ϕ and F2 |= ϕ. We need to show, for their disjoint union F, that F |= ϕ. We do this by contraposition.

So, suppose F ⊭ ϕ. Then there is some model M = ⟨W, R, V⟩ based on F, and some (w, i) ∈ W such that M, (w, i) ⊭ ϕ. Now take the model Mi = ⟨Wi , Ri , Vi ⟩, where Vi is defined by Vi (w) = V((w, i)). It can easily be checked that the relation Z such that (w, i)Zw, for all w ∈ Wi , is a bisimulation. From that it follows that M, (w, i) |= ϕ if, and only if, Mi , w |= ϕ. But then, Mi , w ⊭ ϕ, and so Fi ⊭ ϕ. That contradicts our initial assumption.

We use this fact about disjoint union to prove the non-characterizability of, among others, (strong/weak)

connectedness and universality. The disjoint union of two universally connected frames (i.e., ∀w∀v(wRv))

is not itself universally connected, because for no two worlds (w, 1) and (v, 2) is it the case that (w, 1)R(v, 2).


P-morphisms

P-morphisms are functions between frames. They exist when there is a certain similarity between the

frames. That is, given a p-morphism f one can define valuations on the frames such that a node and its

image under the p-morphism cannot be distinguished modally: w |= ϕ ⇔ f (w) |= ϕ.

Definition 12 (P-morphism). Given two frames F = ⟨W, R⟩ and F′ = ⟨W′ , R′ ⟩, a p-morphism f : W → W′ between F and F′ is a map such that

1. f is a surjection,

2. wRv implies f (w)R′ f (v),

3. f (w)R′ v′ implies that there is a v ∈ W such that wRv and f (v) = v′ .
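The three conditions of Definition 12 translate directly into a checker. This is a sketch under the same assumed representation as before, with f given as a dictionary from W to W′; the function name is mine.

```python
def is_p_morphism(f, F, Fp):
    """Check conditions 1-3 of Definition 12 for a map f: W -> W'."""
    (W, R), (Wp, Rp) = F, Fp
    surjection = {f[w] for w in W} == Wp                      # condition 1
    forth = all((f[w], f[v]) in Rp for (w, v) in R)           # condition 2
    back = all(any((w, v) in R and f[v] == vp for v in W)     # condition 3
               for w in W for vp in Wp if (f[w], vp) in Rp)
    return surjection and forth and back
```

A finite example in the spirit of the antisymmetry illustration in this section: the four-world cycle 0→1→2→3→0 is antisymmetric, and mapping each world to its parity is a p-morphism onto the symmetric two-world frame with 0⇄1. A three-world chain 0→1→2, by contrast, fails the ‘back’ condition at its endpoint.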

Note the difference between p-morphisms and bisimulations: the former are functions between two

frames, while bisimulations are relations between the worlds in two models. As for bisimulations, the

second (left) and third (right) condition on p-morphisms can be depicted as follows:

        f                         f
   v -------> v′            v -------> v′
   ^          :             :          ^
   R          R′            R          R′
   |          :             :          |
   w -------> w′            w -------> w′
        f                         f

(In both pictures w′ = f (w) and v′ = f (v); the dotted arrows are the ones whose existence the condition requires.)

The similarity with the bisimulation conditions is obvious from this depiction. The differences are worth pointing out. First, because f is a function, in case wRv on the left, there must be images f (w) and f (v) on the right. So condition 2 only needs to state that those images stand in the accessibility relation, f (w)R′ f (v). Therefore, only the arrow from w′ to v′ is dotted. Second, in case f (w)R′ v′ (picture on the right), condition 1 that f is surjective implies that there are one or more v such that f (v) = v′ . Condition 3 only requires that, for one of those v, it is moreover true that wRv.

Proposition 5. Let f : W → W′ be a p-morphism between F = ⟨W, R⟩ and F′ = ⟨W′ , R′ ⟩, and let M′ = ⟨W′ , R′ , V′ ⟩ be a model based on F′ . Then there is a model M = ⟨W, R, V⟩ based on F such that there is a bisimulation M, w ↔ M′ , f (w) that is complete for M′ .

Proof. To obtain the model M we choose V such that V(x) = V′ ( f (x)). We now have to show that the relation Z, defined by wZ f (w) for all w ∈ W, is a bisimulation, and that this bisimulation is complete for M′ .

For the first condition, we chose V such that for any possible world v, V(v) = V′ ( f (v)).

For the second condition, suppose that wRv and wZ f (w). Then the p-morphism definition guarantees that f (w)R′ f (v). So there is a v′ ∈ W′ , namely f (v), such that vZ f (v) and f (w)R′ f (v).

For the third condition, suppose that f (w)R′ v′ . Then the p-morphism definition guarantees that there is a v such that wRv and f (v) = v′ . Since vZ f (v), this immediately fulfils the third condition for a bisimulation.

Finally, to show that the bisimulation is complete for M′ : the p-morphism is a surjection, which means that every possible world in W′ is in the image of the function. Accordingly, the bisimulation with wZ f (w) for every w relates to every world in M′ some world in M. Therefore the bisimulation is complete for M′ .

It directly follows from this that if a frame validates a formula, then so does every p-morphic image of it. Hence, if a frame belongs to a characterizable frame class C, then every p-morphic image of it also belongs to C.


Corollary 4. Let f be a p-morphism between F and F′ . For all ϕ: if F |= ϕ, then F′ |= ϕ. If C is a characterizable class of frames and F ∈ C, then F′ ∈ C.

Below is one example of a p-morphism, showing that the property of antisymmetry is not modally

definable. The frame on the left is antisymmetric, but the frame on the right is not: in fact it is symmetric.

The dotted lines show one possible p-morphism from the left frame to the right one. By means of the

corollary above we can then conclude that antisymmetry is not modally definable.

[Diagram: an antisymmetric frame on the left is mapped by a p-morphism f, indicated by dotted lines, onto the symmetric two-world frame on the right.]

4.5 Exercises

1. Give a bisimulation between the following two models such that w and a become bisimilar.

[Diagram: two models, the left containing the worlds w, b, c and the right containing the worlds a, b; the worlds marked (p) satisfy p.]

2. Prove the correspondence theorems for (a) disconnected, (b) serial, (c) symmetric, (d) preorder.

3. Which formula characterizes the class of frames where there are no three worlds w, v, and u such

that wRv and vRu?

4. Give a model which is not based on a transitive frame, on which the formula 2p → 22p is valid.

Give a different model based on the same frame where that formula is not true.

5. Show that (a) every reflexive frame is serial and dense, (b) every well-founded frame is irreflexive

and antisymmetric, (c) reflexive frames are Euclidean if, and only if, they are symmetric and

transitive.

6. (a) Prove that 2ϕ ↔ 22ϕ holds on all reflexive transitive frames. (b) Give a formula that char-

acterizes the class of reflexive transitive frames. (c) Show that the formula in (a) is not such a

formula.

7. Show by means of a simple infinite frame that transitivity plus irreflexivity is not characterizable.

8. To get a feeling for what generated subframes look like it is useful to prove the following:

(a) Show that Ww is the smallest subset of W such that the following holds:

(i) w ∈ Ww ; (ii) for all v ∈ Ww : if vRu, then u ∈ Ww .


(b) Show that if R is transitive, then Ww = {v ∈ W | w = v or wRv}, and that if R is moreover reflexive, Ww = {v ∈ W | wRv}.

(c) Show that if v ∈ Ww , then Wv = (Ww )v .

9. Prove Corollary 1.

10. Prove Corollary 3.

11. Give an example of a non-reflexive frame and a reflexive generated subframe. What can we prove

on that basis?

12. Give an example of a p-morphism showing that partial ordering is not modally definable.

13. Can you give an example of an irreflexive frame that has a reflexive frame as its p-morphic image?

What does that prove?


5 Basic Modal Logic II: Proof theory

5.1 Hilbert system

The standard way of describing a proof system for modal logic is with a so-called Hilbert system, named

after the German mathematician David Hilbert. In these systems there are only two inference rules and a

variety of axioms. Although those systems are very easy to state, they are nearly impossible to use. For

the interested reader, a statement of the Hilbert system for the basic modal logic is given here.

This system is built on the basis of a Hilbert-system for propositional logic, which has one inference

rule called Modus Ponens (which is the same as Elim →) and a set of axiom schemata (meaning that

we can substitute any propositional formula for the variables ϕ, ψ, and χ). Given the fact that every

propositional formula can be rewritten using only → and ¬, the three axiom schemata below suffice.

(Axiom 1) ϕ → (ψ → ϕ)

(Axiom 2) (ϕ → (ψ → χ)) → ((ϕ → ψ) → (ϕ → χ))

(Axiom 3) (¬ϕ → ¬ψ) → (ψ → ϕ)

To obtain the basic modal logic K, named after Kripke, we allow modal formulas to be used in the above system (so that, e.g., 2p → (q → 2p) is an instance of axiom scheme 1), and we add one more inference rule and one more axiom scheme:

(Axiom K) 2(ϕ → ψ) → (2ϕ → 2ψ)

(Necessitation) from `K ϕ infer `K 2ϕ

As may be clear from the length of these formulas, making proofs with only these tools available will be

quite tedious. Therefore, we will be using the lesser known natural deduction system for modal logic.

However, in some cases we are merely interested in claiming that some proof system can be given, with

certain properties. In those cases it is convenient that we can restrict ourselves to a proof system of six

lines only.

Factual premises

If we make an assumption that some atomic proposition p is actually true, we could seemingly infer

from this that 2p is true. However, that would be clearly mistaken: the statement “If it is raining, then

it is necessary that it is raining” is not (supposed to be) a tautology. Because of this, we cannot use

Necessitation based on such premises.

This shows that we must formulate the Necessitation rule more clearly. It says that, if we can prove

ϕ, then we can also prove 2ϕ. So it should be understood as follows: from `K ϕ infer `K 2ϕ. The same

applies to Modus Ponens, although there we do not have the same risk of misunderstanding.

5.2 Natural deduction

We make use of the Natural Deduction system for modal logic from Garson [7]. To obtain our proof

system, we use Natural Deduction for Propositional Logic (see section 1) and we add four rules. They


involve a special type of assumption, expressed by the 2.

   ⋮
   | 2
   | ⋮

When we make the 2 assumption we, so to say, go into a ‘necessity modus’ in our proof. To make

an analogy with the semantics, the 2 indicates the ‘shift’ from a possible world to the set of all of

its accessible worlds. If we can derive some formula ϕ from the 2 premise, then we can withdraw the

assumption and conclude that 2ϕ (Intro 2). Next to the introduction rule for the 2 there is an elimination

rule for 2. After we have proven that 2ϕ, we can make the assumption 2 and, in the ‘necessity modus’,

infer ϕ. The result of Elim 2 is therefore that ϕ is inferred only under the assumption of 2. There is an

introduction rule for the 3, although we could do without it, because it is derivable from the rules for 2,

using Def 3.

The system K is the natural deduction system defined by the inference rules for Propositional logic

(see section 1) plus the following four inference rules.

(Intro 2) If from the assumption 2 we derive ϕ, we may close the subproof and conclude 2ϕ:

   | 2
   | ⋮
   | ϕ
   2ϕ

(Elim 2) From 2ϕ we may infer ϕ inside a subproof opened with the assumption 2:

   2ϕ
   | 2
   | ϕ

(Intro 3) From 3ϕ, together with a subproof that opens with the assumption 2 and the further assumption ϕ and derives ψ, we may conclude 3ψ:

   3ϕ
   | 2
   | ϕ
   | ⋮
   | ψ
   3ψ

(Def 3) From 3ϕ we may infer ¬2¬ϕ, and from ¬2¬ϕ we may infer 3ϕ.

The 2 assumption is really different from other assumptions, including assumptions of the form 2ϕ.

We can use our familiarity with Kripke models to understand this. If we are trying to prove that, in any

arbitrary possible world, ϕ is true, then we might make an assumption that 2ψ is true in that arbitrary

world. But this is different from the reasoning about what is then true in the worlds accessible from that

arbitrary world, which is what the 2 assumption allows us to do. So, when we make that assumption

twice, from the semantic point of view we are considering the worlds accessible ‘in two steps’.

   22p              w |= 22p
   | 2
   | 2p             wRv ⇒ v |= 2p
   | | 2
   | | p            wRvRu ⇒ u |= p
   | | ⋮

Because of the special nature of the 2 assumption, there is an important further rule to take into

consideration. We need to constrain Reiteration in a specific way. The normal cases of reiteration are


still permissible, including when the premise is a modal formula of whatever form.

   ϕ                       ϕ
   | ψ                     | 2ψ
   | ⋮                     | ⋮
   | ϕ   (allowed)         | ϕ   (allowed)

But reiteration ‘into’ a subproof is not permissible in case the assumption is 2 (see below, on the left). That would amount to inferring, from the fact that ϕ is true in a world, that ϕ is true in every world accessible from it. In this respect, the carefulness required here with reiteration is essentially the same as the concerns with the

Necessitation rule in the context of the Hilbert-style proof system. The other limitation for reiteration

also still holds: we cannot reiterate intermediate steps after we dropped an assumption (on the right).

   ϕ                         | ⋮
   | 2                       | ϕ
   | ⋮                       ⋮
   | ϕ   (not allowed)       ϕ   (not allowed)

5.3 Examples

With these inference rules in place, we can make simple natural deductions in the basic modal logic K.

The first example is a proof for `K (2ϕ ∧ 2ψ) → 2(ϕ ∧ ψ).

1 2ϕ ∧ 2ψ Assumption

2 2ϕ Elim ∧, 1

3 2ψ Elim ∧, 1

4 2 2

5 ϕ Elim 2, 2

6 ψ Elim 2, 3

7 ϕ∧ψ Intro ∧, 5, 6

8 2(ϕ ∧ ψ) Intro 2, 7

9 (2ϕ ∧ 2ψ) → 2(ϕ ∧ ψ) Intro →, 1, 8

If we assume that ϕ is necessary and ψ is necessary as well, then if we start to reason about what is

necessarily the case, we can infer that ϕ is, and ψ is, and therefore ϕ ∧ ψ is. Consequently, we draw the

conclusion that necessarily, ϕ ∧ ψ.


The next example presents a derivation of the K-axiom from the Hilbert-style proof system.

1 2(ϕ → ψ) Assumption

2 2ϕ Assumption

3 2 2

4 ϕ→ψ Elim 2, 1

5 ϕ Elim 2, 2

6 ψ Elim →, 4,5

7 2ψ Intro 2, 6

8 2ϕ → 2ψ Intro →, 2, 7

9 2(ϕ → ψ) → (2ϕ → 2ψ) Intro →, 1,8

Observe that these proofs do not involve reiteration in the context of the 2 assumptions. We only

introduce facts χ in the 2 context of which 2χ was proven outside of the 2 context. A third example of

a 2 ‘distribution rule’ is the following. It proves that 2ϕ ∨ 2ψ `K 2(ϕ ∨ ψ), using the elimination rule

for disjunction.

1 2ϕ ∨ 2ψ Premise

2 2ϕ Assumption

3 2 2

4 ϕ Elim 2, 2

5 ϕ∨ψ Intro ∨, 4

6 2(ϕ ∨ ψ) Intro 2, 5

7 2ψ Assumption

8 2 2

9 ψ Elim 2, 7

10 ϕ∨ψ Intro ∨, 9

11 2(ϕ ∨ ψ) Intro 2, 10

12 2(ϕ ∨ ψ) Elim ∨, 1, 6, 11

As for the 3, in fact we can do without it, since it can be treated as a defined operator. However, here


is a short proof of 3(ϕ ∧ ψ) `K 3ϕ ∧ 3ψ, using the 3 introduction rule.

1 3(ϕ ∧ ψ) Premise

2 2 2

3 ϕ∧ψ Assumption

4 ϕ Elim ∧, 3

5 3ϕ Intro 3, 1,4

6 2 2

7 ϕ∧ψ Assumption

8 ψ Elim ∧, 7

9 3ψ Intro 3, 1,8

10 3ϕ ∧ 3ψ Intro ∧, 5,9

The only way we can validly derive a formula 3ϕ is in case we have first established that something

is possible and, secondly, that necessarily, if that possible something is true, then ϕ is also true. Hence,

if we assume that it is possible that ϕ ∧ ψ, we can derive that, necessarily (assumption 2), if ϕ ∧ ψ is the

case (assumption 3), ϕ is also the case. And therefore we can then conclude that it is possible that ϕ is

the case.

The rule for defining 3 can be used to prove the following: 3¬ϕ `K ¬2ϕ.

1 3¬ϕ Premise

2 ¬2¬¬ϕ Def 3, 1

3 2ϕ Assumption

4 2 2

5 ϕ Elim 2, 4

6 ¬ϕ Assumption

7 ⊥ Elim ¬, 5,6

8 ¬¬ϕ Intro ¬ 6,7

9 2¬¬ϕ Intro 2 4,8

10 ¬2¬¬ϕ Reit, 2

11 ⊥ Elim ¬, 9,10

12 ¬2ϕ Intro ¬, 3,11

The proof for 2¬ϕ `K ¬3ϕ is shorter, and proceeds by means of the assumption that 3ϕ. The proofs

in the other directions, from negated modalities to the dual modality of a negation, are also shorter.

They can both be formulated with assuming the contrary and using negation introduction and the double

negation rule in the last steps.

We can use derivations such as these to form derived rules which are, in effect, shortcuts in a deriva-

tion. That is, whenever we prove that 3¬ϕ, for some modal formula ϕ, then we can continue the

derivation with the above steps 2-12 to ¬2ϕ, and continue the derivation further from there. So, instead,

we can shortcut the derivation, going immediately from 1 to 12, and leaving out the steps 2-11.


As a final example, the following is a rather lengthy demonstration of the correctness of the 3 intro-

duction rule. Or, more precisely, it is a derivation proving that 2(ϕ → ψ), 3ϕ `K 3ψ. The derivation

uses two, permissible, instances of the reiteration rule.

1 2(ϕ → ψ) Premise

2 3ϕ Premise

3 ¬2¬ϕ Def 3, 2

4 2¬ψ Assumption

5 2 2

6 ¬ψ Elim 2, 4

7 ϕ→ψ Elim 2, 1

8 ϕ Assumption

9 ψ Elim →, 7,8

10 ¬ψ Reit, 6

11 ⊥ Elim ¬, 9,10

12 ¬ϕ Intro ¬, 8,11

13 2¬ϕ Intro 2, 5,12

14 ¬2¬ϕ Reit, 3

15 ⊥ Elim ¬, 13,14

16 ¬2¬ψ Intro ¬, 4,15

17 3ψ Def 3, 16

5.4 Soundness

A proof system is called ‘sound’, in relation to the semantics, if it only allows us to make derivations that

are semantically in order, i.e., only derivations in which the truth of the premises guarantees the truth

of the conclusion. The logic K is sound (in relation to the Kripke model semantics). The proof of this

is rather standard as long as we restrict ourselves to the Hilbert system, but for our natural deduction

system we need some more detail.

Because of the 2 in the derivations, we cannot simply define ∆ |= ϕ in the usual manner: ∆ is not

just a conjunction of premises, but rather a sequence of premises and boxes. In the schema below, the

premises at the point where we infer χ are: ∆ = ϕ, 2, ψ, 2.

   ϕ
   | 2
   | ψ
   | | 2              ∆ = ϕ, 2, ψ, 2
   | | ⋮
   | | χ              ∆ `K χ


Accordingly, in order to state the soundness theorem, we first need a concept of ‘truth of the premises

∆’ in a world and a model. The main obstacle is that there is no semantics for the 2 assumption, as it

stands. However, the 2 has a natural interpretation in the context of the proof, as was already illustrated

in the demonstration of the inference rules. This allows us to define a semantic counterpart for it.

Here is an inductive definition for truth of a premise sequence ∆ in a world and a model. We decom-

pose the sequence of premises and 2 assumptions, starting at the right hand side and working our way

toward the left hand side.

M, w |= ∅ always (for the empty sequence);

M, w |= ∆, ϕ ⇔ M, w |= ϕ and M, w |= ∆;

M, w |= ∆, 2 ⇔ ∃v, vRw and M, v |= ∆.

To some extent, we can read the 2 assumption in the semantics as “the present world is accessible from

a world where the preceding sequence is true”.
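This inductive definition can be turned into a small evaluator. The representation below is an assumption of mine, not from the notes: formulas are nested tuples with string atoms, the 2 assumption is represented by the marker 'BOX', and a model is a triple (W, R, V).

```python
def holds(M, w, phi):
    """Truth of a modal formula at world w (the Truth definition)."""
    W, R, V = M
    if isinstance(phi, str):                 # atomic proposition
        return phi in V[w]
    op = phi[0]
    if op == 'not':
        return not holds(M, w, phi[1])
    if op == 'and':
        return holds(M, w, phi[1]) and holds(M, w, phi[2])
    if op == 'box':                          # the 2 of the notes
        return all(holds(M, v, phi[1]) for v in W if (w, v) in R)
    if op == 'dia':                          # the 3 of the notes
        return any(holds(M, v, phi[1]) for v in W if (w, v) in R)
    raise ValueError(op)

def seq_holds(M, w, delta):
    """M, w |= Delta: decompose the premise sequence from the right."""
    if not delta:
        return True
    rest, last = delta[:-1], delta[-1]
    if last == 'BOX':                        # the 2 assumption
        W, R, V = M
        return any((v, w) in R and seq_holds(M, v, rest) for v in W)
    return holds(M, w, last) and seq_holds(M, w, rest)
```

For instance, in a model with wRv where p holds only at v, the sequence ∆ = 2p, 2 is true at v: v is accessible from a world where 2p is true.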

Now, for soundness:

Theorem 6 (Soundness Theorem). Every inference in K is generally valid.

∆ `K ϕ ⇒ ∀M∀w(M, w |= ∆ ⇒ M, w |= ϕ)

Proof. The proof is an induction on the length of a proof. We need to prove that every inference rule is

generally valid. We do so for selected cases.

(Elim →) If ∆ `K ϕ → ψ and ∆ `K ϕ, then ∆ `K ψ. We assume that ∆ |= ϕ → ψ and ∆ |= ϕ, and we prove that ∆ |= ψ. Consider an arbitrary world w such that w |= ∆. Then, by the two assumptions, w |= ϕ → ψ and w |= ϕ. It now follows immediately from the semantic clause for implication that w |= ψ.

(Elim 2) If ∆ `K 2ϕ, then ∆, 2 `K ϕ. We assume that ∆ |= 2ϕ and prove that ∆, 2 |= ϕ. Consider an arbitrary world v such that v |= ∆, 2. The definition of truth of a sequence says that there is a w such that wRv and w |= ∆. Then, by our assumption, w |= 2ϕ. Since wRv, the Truth definition implies that v |= ϕ.

(Intro 2) If ∆, 2 `K ϕ, then ∆ `K 2ϕ. Suppose this is not sound. So, we assume that ∆, 2 |= ϕ, but there is some world w in some model such that w |= ∆ and w ⊭ 2ϕ. Hence, there is some world v such that wRv and v ⊭ ϕ. Then, according to the definition of truth of a sequence of formulas above, v |= ∆, 2. After all, there is a w such that wRv and w |= ∆. But, by the fact that ∆, 2 |= ϕ, it must then be true that v |= ϕ. That contradicts our assumption that v ⊭ ϕ.

(Intro 3) If ∆ `K 3ϕ and ∆, 2, ϕ `K ψ, then ∆ `K 3ψ. We assume that ∆ |= 3ϕ and ∆, 2, ϕ |= ψ. We

prove that ∆ |= 3ψ. Consider an arbitrary w such that w |= ∆. Then, by assumption, w |= 3ϕ. According

to the Truth definition, this implies that there is a v such that wRv and v |= ϕ. Also, in view of the

definition of truth of a sequence, it is then true that v |= ∆, 2 and, combining these things, v |= ∆, 2, ϕ.

Then, by our other assumption, v |= ψ. Hence, there is a world v accessible from w, such that v |= ψ. The

Truth definition now tells us that w |= 3ψ.

The reverse of the soundness theorem is called ‘completeness’, because it says that the proof system

is complete with respect to the semantics: every argument that is semantically valid can be derived with

the proof system. This theorem will be discussed later.

5.5 Other modal logics

A major difference between predicate logic and modal logic, in practice, is that there is a plurality of

modal logics, that are all obtained from the basic modal logic K with additional rules or axioms. That


these are ‘different logics’ is debatable, as also van Benthem [15] explains: in effect they are the same

logic but we make use of an extra inference rule. Also, they have the same semantics, but with extra

restrictions on the models.

Just as modal logic is used to reason about necessity, time, knowledge, action, and more, so predicate

logic is used to reason about physical objects, people, events, spatial locations, and more. Yet, we do not

talk about ‘different predicate logics’, but only about different domains of objects and predicate logical

theories about those domains (i.e., sets of formulas valid on those domains). For instance, the statement

that x and y are in the same location may be false for all physical objects, but not false for all events. In

essence modal logic is no different. Instead of claiming that “the logic of knowledge is K plus inference

rules X” (see van Ditmarsch et. al. [16]), we might also say that inference rules X constitute a theory

about the domain of knowledge: as long as 2 is understood as the epistemic modality, those inference

rules are acceptable to us. As a matter of historical fact, though, the different modal logics were invented

before Kripke semantics showed how we could standardize all of them. This explains why modal logic

is sometimes presented as comprising a diversity of logics, unlike predicate logic.

Nevertheless, there are clearly differences with respect to the natural assumptions for certain modal-

ities. For instance, it is natural to think that what is necessarily the case is actually the case. But it is not

natural at all to think that what is the case always in the future is the case now, or what is a necessary

outcome of executing a certain program is also true prior to executing that program. This example gives

a hint to the kind of additional rules that we might add. The statement that whatever is necessary is true

is represented by the formula 2ϕ → ϕ that, as we saw in the previous section, modally defines the class

of reflexive frames. We may therefore add this formula as an axiom scheme, or add 2ϕ ` ϕ as a rule

of inference. And so, likewise, we can add other frame-class defining formulas as rules or axioms. The

logic we then obtain also corresponds to a class of frames.

K All frames

D K+2ϕ → 3ϕ Serial frames

T K+2ϕ → ϕ Reflexive frames

B K+32ϕ → ϕ Symmetric frames

4 K+2ϕ → 22ϕ Transitive frames

5 K+3ϕ → 23ϕ Euclidean frames

S4 T+2ϕ → 22ϕ Preordered frames

S5 S4+32ϕ → ϕ Equivalence frames

An alternative name for S4 is KT4; alternative names for S5 are KTB4, KT45 and KT5. (To un-

derstand why, answer exercise 5(c) of section 4.) Many more combinations are possible, considering

Proposition 2 in section 4, which states that any combination of frame class characterizing formulas

modally defines the intersection of those frame classes. So for instance the logic KD45 corresponds to

the class of all serial, transitive and euclidean frames.
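The frame properties in the table can be expressed as simple predicates. A sketch under the same assumed representation as in the earlier examples (a frame is a set of worlds W with a set of pairs R; the function names are mine):

```python
# Frame-property predicates corresponding to the table above.
def serial(W, R):     return all(any((w, v) in R for v in W) for w in W)
def reflexive(W, R):  return all((w, w) in R for w in W)
def symmetric(W, R):  return all((v, w) in R for (w, v) in R)
def transitive(W, R): return all((w, u) in R
                                 for (w, v) in R for (x, u) in R if x == v)
def euclidean(W, R):  return all((v, u) in R
                                 for (w, v) in R for (x, u) in R if x == w)
```

For instance, the two-world frame with R = {(1, 2), (2, 2)} is serial, transitive and Euclidean (a KD45 frame) without being reflexive or symmetric.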

These correspondence results lead to the formulation of soundness theorems for the different logics:

Theorem 7 (Soundness for L). For all of the modal logics L defined above, L is sound with respect to its

corresponding frame class CL .

Φ `L ψ ⇒ Φ |=CL ψ

See the definition of validity in section 3.3 for the definition of the right hand side of this equivalence.


The additions to the logic can naturally be understood as additional axiom schemes, in the Hilbert

style. But we can also easily understand them as inference rules:

Rule D: from 2ϕ infer 3ϕ.
Rule T: from 2ϕ infer ϕ.
Rule B: from 32ϕ infer ϕ.
Rule 4: from 2ϕ infer 22ϕ.
Rule 5: from 3ϕ infer 23ϕ.

A few more examples will illustrate the resulting logics. First, a proof in the system 5 of `5 3ϕ → 223ϕ. Second, a simple and useful proof of ϕ `T 3ϕ, which comes in useful as a derived rule: given the T rule, so assuming reflexivity, if ϕ is true, then there is a possible world where ϕ is true.

1 3ϕ Assumption
2 23ϕ Rule 5, 1
3 2 2
4 3ϕ Elim 2, 2
5 23ϕ Rule 5, 4
6 223ϕ Intro 2, 5
7 3ϕ → 223ϕ Intro →, 1,6

1 ϕ Premise
2 2¬ϕ Assumption
3 ¬ϕ Rule T, 2
4 ϕ Reit, 1
5 ⊥ Elim ¬, 3,4
6 ¬2¬ϕ Intro ¬, 2,5
7 3ϕ Def 3, 6

The second derivation above shows that we can always infer 3ϕ from ϕ if we may use Rule T. Given a derivation of some formula ϕ, we can continue the derivation along the steps 2–7 and arrive at 3ϕ. So we can also add a derived rule: if `T ϕ, then `T 3ϕ.

For a last example, we prove Rule B in system KT5. In semantic terms, the following derivation

illustrates that every frame that is reflexive and Euclidean is also symmetric. We use the derived rule for

T proven above. We also use two derived rules of the Def 3 rule. The first of these has been proven

earlier, and the second one is an exercise.

1 32ϕ Assumption

2 ¬ϕ Assumption

3 3¬ϕ Rule T (derived), 2

4 23¬ϕ Rule 5, 3

5 2 2

6 3¬ϕ Elim 2, 4

7 ¬2ϕ Def 3 (derived), 6

8 2¬2ϕ Intro 2, 7

9 ¬32ϕ Def 3 (derived), 8

10 ⊥ Elim ¬, 1,9

11 ¬¬ϕ Intro ¬, 2,10

12 ϕ Double ¬, 11

13 32ϕ → ϕ Intro →, 1,12


5.6 Exercises

1. For additional understanding of natural deduction, consider the exercises in section 1.

2. Give derivations for

(a) 2p `K 2(p ∨ q); (c) 2(ϕ ∨ ψ), 2(ϕ → χ), 2(ψ → χ) `K 2χ;

(b) 2(ϕ ∧ ψ) `K 2ϕ ∧ 2ψ; (d) 2(ϕ → ψ), 2¬ψ `K 2¬ϕ.

3. In this section there is a derivation showing that `K 3¬ϕ ⇒ `K ¬2ϕ is an admissible rule. In

other words, we may use this fact to legitimate the inference rule from 3¬ϕ to ¬2ϕ. Show that

`K 2¬ϕ ⇒ `K ¬3ϕ is also an admissible rule, using the hints in the text. Using Intro 3, Def 3,

and these two additional admissible rules, give derivations for

(a) 2¬2ϕ `K 23¬ϕ; (c) (3ϕ ∨ 3ψ) `K 3(ϕ ∨ ψ);

(b) 3¬3ϕ `K 32¬ϕ; (d) 3(ϕ ∨ ψ) `K (3ϕ ∨ 3ψ).

4. Give derivations for

(a) 232ϕ `KTB ϕ; (c) `S4 2ϕ ↔ 22ϕ;

(b) 2ϕ `S4 ϕ ∧ 22ϕ; (d) `KD5 2ϕ → 23ϕ.

5. Prove that `K ϕ → ψ ⇒ `K 2ϕ → 2ψ is a derived rule in the logic K, (a) using the Hilbert system;

(b) using the Natural Deduction system.

6. Prove that `K 2(ϕ → ψ) ⇒ `K 3ϕ → 3ψ can be derived in the Hilbert system, defining 3 as ¬2¬, and using the derived rule that `K ϕ → ψ ⇒ `K ¬ψ → ¬ϕ for all modal formulas ϕ and ψ.

7. Prove, on the basis of the soundness of K, that the following logics are also sound: (a) T; (b) B;

and (c) S5. That is, prove that the additional inference rules only allow us to derive conclusions

that are valid on all frames in the corresponding class of frames, in the manner of the proof of

Theorem 6.

8. In the proof of Theorem 6 the step for Def 3 is skipped. Write down the proof-part for that

inference rule (both versions).


6 Completeness

Soundness and completeness theorems link the syntax and semantics of modal logics, by providing a

correspondence between derivability (`) and validity (|=). Soundness means that we can only derive

conclusions that are valid (on the class of frames for that logic). Completeness means that we can derive

everything that is valid. In other words, what can be expressed in the language, and is generally valid (or

valid on the class of frames), can be derived using the proof system. The general outline of the proof of

this main theorem in modal logic will be given below, in the section on canonical models.

Theorem 8 (Completeness theorem).

ϕ is valid on all reflexive frames ⇒ `T ϕ

ϕ is valid on all transitive frames ⇒ `4 ϕ

ϕ is valid on all preordered frames ⇒ `S4 ϕ

ϕ is valid on all equivalence frames ⇒ `S5 ϕ

Thus for these logics derivability is connected to a frame property in an elegant way. Because of the correspondence theorems we also know that these classes of frames can be characterized by one single formula, e.g. 2ϕ → ϕ in the case of the reflexive frames, the formula that is the characteristic inference rule (or axiom) of T.

Every modal logic has one special model that is in some sense as general as possible. It is close

to the syntax of the logic because its worlds are sets of formulas. This model is called the canonical

model. Its importance stems from the fact that from the existence of such a model one can in some cases

easily prove the completeness of the logic in question. We will do so at the end of this section. We will

consider the canonical model in detail for the logic K and later comment on its construction for other

modal logics. Some definitions first.

Definition 13 (Maximally consistent set). Given a modal logic L, a set of formulas ∆ is L-consistent if

one cannot derive a contradiction from it, i.e. if ⊥ cannot be inferred from it, in the proof system for L.

A set of formulas ∆ is called maximally L-consistent if it is L-consistent and for every formula ϕ, either

ϕ belongs to the set or ¬ϕ does.

We will mainly work with K in this section, therefore the K-part is often omitted, so consistent means

K-consistent, and so on. A simple but important observation:

Proposition 6. If a set of formulas is true on a model (for L), in a world, then it is consistent.

Proof. For if not, ϕ ∧ ¬ϕ would be derivable from it for some ϕ. By soundness, ϕ ∧ ¬ϕ would then hold in the model, in some world, which cannot be.

Because of this, the set {p, 2q} clearly is consistent, as there are models in which both formulas hold. The same argument applies to any set of formulas that is true in some world of some model.

Obviously, the set {ϕ, ¬ϕ} is not consistent, as it derives ϕ∧¬ϕ. Also the set {2(ϕ → ψ), 2(> → ϕ), 3¬ψ}

is inconsistent, since 2ψ∧¬2ψ follows from it. The set {p, 2q} is not maximally consistent since neither

q nor ¬q belongs to the set (and so do many other formulas).

Lemma 1 (Lindenbaum lemma). Every consistent set of formulas can be extended to a maximal consis-

tent set of formulas.


That this lemma is true can be shown by means of the following method for constructing a maximal

consistent set Γ out of any possible (non-maximal) consistent set ∆. This construction begins by choosing

an enumeration of all the formulas in the modal language. Note that, with a countably infinite set of

atomic propositions there is a countably infinite set of modal formulas (you can try proving this yourself

in the exercises). We then extend ∆ step by step, considering each formula in the language along the

way. We begin with ∆ itself.

Γ0 = ∆

Γn+1 = Γn ∪ {ϕn+1}, if Γn ∪ {ϕn+1} is consistent; otherwise Γn+1 = Γn.

Finally, we define the maximal consistent set as the ‘end point’ of this construction: Γ = ⋃n∈ℕ Γn.

Proof. You are asked to prove that the result of this construction method is a maximal consistent set,

containing the original consistent set. That proves the Lindenbaum lemma.
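The construction can be sketched in code. This is only an illustration: the consistency oracle below is a toy stand-in in which formulas are mere literals and a set is consistent iff it contains no literal together with its negation, whereas in the real proof consistency means non-derivability of ⊥ in the proof system for L. The enumeration is also finite here, while the lemma enumerates the whole (countably infinite) language.

```python
def is_consistent(formulas):
    """Toy oracle: a set of literals is consistent iff no literal
    occurs together with its negation ("p" vs. "~p")."""
    return not any(("~" + f) in formulas for f in formulas if not f.startswith("~"))

def lindenbaum(delta, enumeration):
    """Extend the consistent set `delta` step by step, considering each
    formula of the enumeration in turn, exactly as in the construction."""
    gamma = set(delta)
    for phi in enumeration:
        if is_consistent(gamma | {phi}):
            gamma |= {phi}    # keep phi: the extension stays consistent
        # otherwise leave gamma as it is
    return gamma

atoms = ["p", "q", "r"]
enumeration = atoms + ["~" + a for a in atoms]    # all literals
gamma = lindenbaum({"p", "~q"}, enumeration)
```

The result decides every literal: for each atom exactly one of a, ~a ends up in gamma, and the original set is contained in it.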

Examples of maximal consistent sets are a bit harder to describe. The typical example is the follow-

ing. Given a possible world w in a model, the set of formulas L = {ϕ | w |= ϕ} is a maximal consistent set.

That it is consistent is clear, as it has a model (and ⊥ is not true in any world). That it is also maximal in

this respect follows from the fact that for any formula ϕ, either w |= ϕ or w |= ¬ϕ, and thus either ϕ ∈ L

or ¬ϕ ∈ L. Thus we see that worlds in a Kripke model naturally correspond to maximally consistent sets

of formulas. This is the guiding idea behind the canonical model.

One more observation on the correspondence between worlds and maximally consistent sets of for-

mulas. Given that wRv holds in a model, then for the sets

Lw = {ϕ | w |= ϕ} Lv = {ϕ | v |= ϕ},

it holds that 2ϕ ∈ Lw implies ϕ ∈ Lv , for all formulas ϕ. This immediately follows from the truth

definition.

We are ready for the definition of a canonical model.

Definition 14. The K-canonical model is the Kripke model MK = ⟨WK , RK , VK⟩, where

1. WK is the set of all maximally K-consistent sets of formulas,

2. ΓRK ∆ ⇔ ∀ϕ (2ϕ ∈ Γ ⇒ ϕ ∈ ∆),

3. p ∈ VK(Γ) ⇔ p ∈ Γ, for atomic propositions p.

Thus the canonical model consists of all maximally consistent sets, with arrows between them at the

appropriate places (think of the remark on Lw and Lv above). As explained above, for every world w in

a model, the set {ϕ | w |= ϕ} is maximally K-consistent. Thus one could view the canonical model as

containing all possible Kripke models together, and putting arrows between two sets {ϕ | w |= ϕ} and

{ϕ | v |= ϕ} if for all 2ψ ∈ {ϕ | w |= ϕ} we have ψ ∈ {ϕ | v |= ϕ}.
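The defining condition on RK can be checked mechanically, at least for finite sets of formulas. In the sketch below formulas are strings and "[]phi" stands for 2ϕ; this representation is our own convention, and real worlds of the canonical model are of course infinite maximal consistent sets, so this is only an illustration of the definition.

```python
def canonically_related(gamma, delta):
    """Gamma R_K Delta holds iff every phi with []phi in gamma is in delta."""
    return all(phi[2:] in delta for phi in gamma if phi.startswith("[]"))

gamma = {"p", "[]q", "[](q -> r)"}
print(canonically_related(gamma, {"q", "(q -> r)", "s"}))  # True
print(canonically_related(gamma, {"q"}))                   # False: (q -> r) is missing
```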

The next step is then to reduce the truth of a formula in a maximal consistent set to membership

of that set, which is the content of the truth lemma. This lemma is rather involved, because we have

to consider all possible formulas. Therefore, we first present the valuation lemma, which functions as

an intermediate step. In this lemma we establish several dependencies for membership of a maximal

consistent set.

Lemma 2 (Valuation lemma). For any maximal consistent set Γ, the following are true.


1. if Γ ⊢K ϕ, then ϕ ∈ Γ (maximal consistent sets are deductively closed);

2. ϕ ∈ Γ if, and only if, ¬ϕ ∉ Γ;

3. ϕ ∧ ψ ∈ Γ if, and only if, ϕ ∈ Γ and ψ ∈ Γ;

4. 2ϕ ∈ Γ if, and only if, (ϕ ∈ ∆ for all ∆ such that ΓRK ∆).

The lemma could be extended with other connectives from propositional logic, in an obvious way.

That is left to the reader as an exercise. The propositional cases are simple, but the case for necessity is

more involved.

Proof. 1. Because Γ is maximally consistent, either ϕ ∈ Γ or ¬ϕ ∈ Γ. Given that Γ ⊢K ϕ, if ¬ϕ ∈ Γ, then also Γ ⊢K ¬ϕ and so Γ ⊢K ⊥ (by the inference rule ‘negation elimination’), which would contradict the consistency of Γ. Therefore, it can only be the case that ϕ ∈ Γ.

2. ⇒: given ϕ ∈ Γ, if ¬ϕ ∈ Γ, then Γ ⊢K ⊥ (elim ¬), and so Γ is not consistent. Thus, ¬ϕ ∉ Γ.

⇐: since Γ is maximally consistent, either ϕ ∈ Γ or ¬ϕ ∈ Γ. Therefore, if ¬ϕ ∉ Γ, then ϕ ∈ Γ.

3. ⇒: if ϕ ∧ ψ ∈ Γ, then Γ ⊢K ϕ and Γ ⊢K ψ (elim ∧). By deductive closure, ϕ ∈ Γ and ψ ∈ Γ.

⇐: if Γ ⊢K ϕ and Γ ⊢K ψ, then Γ ⊢K ϕ ∧ ψ (intro ∧). Therefore, by deductive closure, ϕ ∧ ψ ∈ Γ.

4. ⇒: follows immediately from the definition of RK.

⇐: suppose that 2ϕ ∉ Γ; we prove that there is some ∆ such that ΓRK ∆ and ϕ ∉ ∆.

Consider the set Ψ = {ψ | 2ψ ∈ Γ}. Either Ψ ∪ {¬ϕ} is consistent, or it is not. If it is not consistent, then Ψ ⊢K ϕ, for otherwise we could not derive ⊥ by adding ¬ϕ to Ψ. But then from Ψ2 = {2ψ | 2ψ ∈ Γ} we could derive ϕ inside a 2-subproof (using 2 elimination), and so Ψ2 ⊢K 2ϕ (by 2 introduction). Since Ψ2 ⊆ Γ, we know that also Γ ⊢K 2ϕ and so, by deductive closure of Γ (see the first item), 2ϕ ∈ Γ, which contradicts our assumption that 2ϕ ∉ Γ. So Ψ ∪ {¬ϕ} cannot be inconsistent, hence it is consistent. Given that Ψ ∪ {¬ϕ} is consistent, it also has a maximal consistent extension (by the Lindenbaum lemma), and since ¬ϕ is in the set, ϕ is not. This maximal consistent set is now our ∆. The definition of the canonical model guarantees that ΓRK ∆, since Ψ ⊆ ∆, and it has already been established that ϕ ∉ ∆.

Now we can formulate the truth lemma, and prove it. This lemma crucially establishes that truth of a

formula in a ‘world’ in the canonical model comes down to being a member of that maximal consistent

set.

Lemma 3 (Truth lemma). For any maximally K-consistent set of formulas Γ (that is, for any world in

the canonical model), for any formula ϕ:

MK , Γ |= ϕ ⇔ ϕ ∈ Γ.

Note that here MK , Γ |= ϕ means that in the canonical model, in ‘world’ Γ, formula ϕ is true. The

proof for this lemma is by induction on the complexity of the formula, just as in the proof for the

bisimulation theorem (Theorem 4). So we prove that the lemma is correct for atomic propositions, and

then we prove that no way of making a formula more complex (adding ¬, ∧, 2 or another connective)

poses a problem: if it is correct for p and for q, then also for ¬p, p ∧ q, and 2q, and so on; and therefore

also for ¬(p ∧ q), ¬p ∧ 2q, and so on. The induction hypothesis thus says that, given formulas ψ and χ

of arbitrary complexity, if the truth lemma is correct for those formulas, then also for their negation, and

for their conjunction, and for their ‘necessitation’, and so on.

Basic case: Suppose ϕ = p, for some atomic proposition. From the definition of the canonical model

we know that p ∈ Γ if, and only if, p ∈ VK (Γ). That latter fact, by the truth definition, is equivalent to

MK , Γ |= p. So p ∈ Γ is equivalent to MK , Γ |= p.


Negation: Suppose ϕ = ¬ψ. From the valuation lemma we know that ¬ψ ∈ Γ if, and only if, ψ ∉ Γ. By the induction hypothesis, ψ ∉ Γ is equivalent to MK , Γ ⊭ ψ. And, according to the truth definition, that is equivalent to MK , Γ |= ¬ψ. Hence, ¬ψ ∈ Γ is equivalent to MK , Γ |= ¬ψ.

Conjunction: Suppose ϕ = ψ ∧ χ. From the valuation lemma we know that ψ ∧ χ ∈ Γ if, and only if,

ψ ∈ Γ and χ ∈ Γ. By the induction hypothesis, that is equivalent to MK , Γ |= ψ and MK , Γ |= χ, respectively.

Lastly, applying the truth definition, this is equivalent to MK , Γ |= ψ ∧ χ. Therefore, ψ ∧ χ ∈ Γ if, and

only if, MK , Γ |= ψ ∧ χ.

Necessity: Suppose ϕ = 2ψ. By the valuation lemma, 2ψ ∈ Γ is equivalent to (*) ψ ∈ ∆ for every

∆ such that ΓRK ∆. But, by the induction hypothesis, (*) is in turn equivalent to (+) MK , ∆ |= ψ for all ∆

accessible from Γ. Then, by the truth definition, (+) is equivalent to MK , Γ |= 2ψ. So, all in all, 2ψ ∈ Γ

is equivalent to MK , Γ |= 2ψ.

The proof only mentions negation, conjunction and necessity. These are all we need, in principle,

as we can define the other connectives in terms of only those three. You can try to extend the proof for

those other connectives yourself in the exercises.

Now we are ready to prove the completeness theorem for the basic modal logic K: if |= ϕ, then ⊢K ϕ.

We prove this by contraposition, showing that ⊬K ϕ implies ⊭ ϕ. If ⊬K ϕ, then the set {¬ϕ} is consistent, and so there is a maximally consistent set Γ containing ¬ϕ, as the Lindenbaum lemma shows. By the definition of the canonical model, Γ is a world in this model. By the truth lemma we have that MK , Γ |= ¬ϕ ⇔ ¬ϕ ∈ Γ. And thus MK , Γ |= ¬ϕ, since ¬ϕ ∈ Γ. Hence there is a Kripke model, namely MK , and a world in it, namely Γ, where ¬ϕ is true and ϕ is false. Therefore, ⊭ ϕ, and that is what we had to show.

The proofs of the completeness theorem for the other logics follow the same pattern as the proof for

K given above. For instance, if we want to prove that the logic S4 is complete, we need to show that

everything that can be derived in that logic is valid on the class of all preordered frames. So we define

the canonical model MS4 in precisely the same way. Then, we add the following lemma:

If the rule T is valid in logic L, then the canonical relation RL is reflexive.

If the rule 4 is valid in logic L, then the canonical relation RL is transitive.

Proof. Given that T is valid, for every maximal consistent set Γ, if 2ϕ ∈ Γ, then Γ ⊢L ϕ and so, by deductive closure, ϕ ∈ Γ. So, for all ϕ such that 2ϕ ∈ Γ, also ϕ ∈ Γ. Then, by the definition of the canonical relation for L, ΓRL Γ. This proves that the relation is reflexive.

Suppose that in the canonical model ΓRL ∆RL E. We need to prove that ΓRL E. Given that 4 is valid, if 2ϕ ∈ Γ, then Γ ⊢L 22ϕ and so, by deductive closure, 22ϕ ∈ Γ. Therefore, according to the definition of the canonical relation, 2ϕ ∈ ∆. And, again by the definition of the canonical relation, ϕ ∈ E. So, for every ϕ, if 2ϕ ∈ Γ, then ϕ ∈ E. Then, applying the definition of the canonical relation once more, it is the case that ΓRL E. This is what we had to prove.

So we prove that the canonical model for the logic S4 is a preorder, and we prove completeness in the same way as before. Hence, we know that, if some formula is not derivable in the proof system S4, then it is not valid on the class of all preordered frames (because there is a counterexample in the canonical model for S4).

The book of van Ditmarsch et al. [16] contains a completeness proof for the logic S5, on p. 180 and

further. The accessibility relation ∼ca in the canonical model is then an equivalence relation, and Ka


(with intended meaning “agent a knows that . . . ”) is the necessity operator. The completeness proof is

slightly different, because they define the accessibility relation in such a way that it is immediately an

equivalence relation.

Note that we have not discussed completeness relative to the 2-subproofs in the natural deduction system. We have only proven completeness of ⊢K ϕ with respect to |= ϕ, but not completeness of ∆ ⊢K ϕ with respect to ∆ |= ϕ. Such an extension is possible, but has been left out of the present discussion.

6.1 Exercises

1. Show that {2(ϕ → ψ), 2ϕ, 3¬ψ} is inconsistent.

3. If 2ϕ ∈ Γ and ¬2ϕ ∈ ∆, is it possible that ΓRS5 ∆? And ∆RS5 Γ? Explain your answer.

4. Given that wRv holds in a model, show that for the sets

Lw = {ϕ | w |= ϕ} Lv = {ϕ | v |= ϕ}

it holds that 2ϕ ∈ Lw implies ϕ ∈ Lv , for all formulas ϕ.

6. Which clauses can be added to the valuation lemma for disjunction ∨, and for implication →?

Give the proofs for those extra clauses, using the proof for ∧ as an example.

7. Which clause can be added to the valuation lemma for possibility 3? Give the proof for this extra

clause, by applying the (already proven) clauses for ¬ and 2.

8. Which clauses could be added to the truth lemma for disjunction ∨, and for possibility 3? Extend

the formula induction in the proof of the truth lemma with clauses for disjunction and possibility.

(Tip: complete exercises 6 and 7 first, and use your answers here.)

9. Extend the correspondence lemma for rule 5 and prove completeness for S5.

10. Show that the logic KB is complete. That is, show that if we add the inference rule ⊢KB 32ϕ ⇒ ⊢KB ϕ, then everything that is valid on the class of symmetrical frames can be derived in the logic KB.


7 Decidability

By proving soundness and completeness, we have ‘reduced’ the issue of whether a modal formula ϕ is provable with natural deduction, ⊢K ϕ, to the issue of whether ϕ is generally valid, |= ϕ. Similarly, we know that ϕ is not provable if we know that there is a frame on which it is not valid, i.e., if there is a Kripke model with a possible world where ϕ is false.

Kripke model with a possible world where ϕ is false. Given that there are infinitely many frames, this

might not be an easy task. However, we can restrict the frames that we have to consider in such a way

that in order to check whether there is a frame that refutes ϕ, we only have to check a finite number of

finite frames, which implies the ‘decidability’ of the logic. This is the content of this section. We will

see that the number of frames we need to check only depends on the size of the formula ϕ.

A logic L is decidable if, and only if, an effective procedure (Turing machine) exists by means of

which for every formula ϕ in the language it can be settled whether ϕ is generally valid in L (i.e., whether

ϕ is a theorem in L). Propositional logic is decidable, because the truth table method is such an effective

procedure. Predicate logic is not decidable. A. Turing, who proved this fact (as did A. Church, independently), gave a theoretical definition of ‘effective procedure’ that later came to be called a ‘Turing machine’. Using this concept, a logic is decidable if there is a Turing machine that computes the general (in)validity of every formula.

As we will see, normal modal logics such as K and S5 are decidable. This provides a computational reason for preferring formal reasoning in modal logic over predicate logic: for instance, characterizing transitivity of a frame as 2ϕ → 22ϕ instead of ∀w∀v∀u((wRv ∧ vRu) → wRu). In general, the more expressive a logic is (see section 4), the more likely it is to be undecidable, and the more computationally complex. Logicians sometimes describe this as a trade-off between the expressivity and the complexity of a logical formalism.

A by now common method of proving that a logic is decidable is by means of the finite model

property. A logic L has the finite model property if every formula that is not generally valid has a

finite countermodel; in other words, if the fact that a formula is valid on all finite frames is enough to

conclude that it is valid on all frames, finite or not. When a modal logic has this property, we only need

to check the finite frames to see whether some formula is valid, and it turns out that this is an effective

procedure—also in Turing’s sense.

If a sound and complete logic L has only a finite number of axioms and inference rules and it has the

finite model property, then it is decidable. We know that the modal logics we have considered are all

sound and complete, and they have only a finite number of axioms (in the Hilbert system) and inference

rules (in both the Hilbert and natural deduction systems). Therefore, if we can prove the finite model

property for modal logic L, we can conclude that L is decidable.

In order to prove the finite model property, we first prove that to establish validity of a formula on a frame we only need to check frames of a finite depth, dependent on the size of the formula ϕ. Intuitively, we establish “how far up” we have to inspect the frame in order to establish whether a certain node forces a formula. It turns out that the nesting depth of boxes decides this. First, consider the following example.

Example 4. [Diagram: a model with four worlds; p is true at u and v, and false at w and x. The arrows are (at least) wRu, wRv and uRx, so that x is not a successor of w.]


To see that w |= 2p it suffices to consider v and u and check whether p is true in them. In other words, the truth of w |= 2p only depends on the valuation at the successors of w and not on the world x, which is not a successor of w. If p were true in x, this would not change the truth of w |= 2p, whereas a change in the valuation of u or v could. On the other hand, for a formula with two boxes, like 22p, whether w |= 22p holds (it does not) depends on the valuation of p in x.
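The evaluation just described can be carried out mechanically. Below is a minimal Kripke model checker applied to Example 4; the exact arrows of the model (w→u, w→v, u→x) and the tuple encoding of formulas are assumptions of this sketch, not part of the notes.

```python
def holds(model, world, phi):
    """Evaluate formula phi at `world`; phi is an atom (string) or a tuple."""
    R, V = model
    if isinstance(phi, str):                      # atomic proposition
        return phi in V.get(world, set())
    op = phi[0]
    if op == "not":
        return not holds(model, world, phi[1])
    if op == "and":
        return holds(model, world, phi[1]) and holds(model, world, phi[2])
    if op == "box":                               # true at all successors
        return all(holds(model, v, phi[1]) for v in R.get(world, ()))
    raise ValueError("unknown connective: " + op)

# The model of Example 4 (arrows assumed: w -> u, w -> v, u -> x).
R = {"w": ["u", "v"], "u": ["x"]}
V = {"u": {"p"}, "v": {"p"}}                      # p false at w and x
M = (R, V)
print(holds(M, "w", ("box", "p")))                # True: p holds at u and v
print(holds(M, "w", ("box", ("box", "p"))))       # False: p fails at x
```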

Definition 15. The depth of a frame F is the maximum length of a path from a root of the frame (a lowest world, a world that is not a successor of another world) to the top. Formally: the depth of a frame F is the maximum number n for which there exists a chain w1Rw2R . . . RwnRwn+1 in the frame, where all wi are distinct. Clearly, frames can have infinite depth.

The depth of a world v from a world w is the length of the shortest path from w to v; v is of depth 0 from w when it is equal to w or when it cannot be reached from w by travelling along the arrows.

Let |ϕ| be the size of ϕ, i.e. the number of symbols in it, and let b(ϕ) denote the maximal nesting of

boxes in ϕ. The size of a frame is the number of worlds in it.

Example 5. This frame has depth 2: [Diagram: a frame with worlds w, u, v, x and arrows wRu, wRv and vRx.] The world x has depth 2 from w, depth 1 from v, and depth 0 from x and from u. And this frame has depth 0: [Diagram: a single reflexive world w.]

The maximal nesting of boxes in (22p ∧ 2q) is 2, and in 2(2p → 2(2p ∧ q)) it is 3 (coming from the box in front of p, then the box in front of the conjunction, and finally the box in front of the implication). Note that the nesting of boxes in 23p is 2, not 1, since 3 abbreviates ¬2¬.

Returning to the first example: it suggests that to evaluate a formula ϕ in a node w in a model M, we have to consider only the nodes in M that are of depth ≤ b(ϕ) from w. Here follow two more examples to support this claim.

First we consider the case that the number of boxes in a formula ϕ is 0, i.e. b(ϕ) = 0. This means that the formula does not contain boxes. Considering the definition of w |= ϕ, it is not difficult to see that to establish w |= ϕ for a formula without boxes, one only has to know which atomic propositions are true in w and which are not. Thus the truth of ϕ at w is independent of the model outside w.

In the following example, [Diagram: a model in which w sees itself and v, and v sees x and y; one of the top worlds carries the label q.]


the truth of w |= 2p does not depend on x or y. In other words, w |= 2p holds if and only if w |= p and v |= p, no matter whether p is true in x or y or not. However, the truth of v |= 2p depends on the valuation at x, since v |= 2p if and only if x |= p and y |= p. On the other hand, to verify whether w |= 2p → 22q, all the nodes w, v, x, and y have to be taken into account.

This intuition is captured by the following theorem.

Theorem 9 (Finite depth theorem). For all numbers n, for all models M and all nodes w in M, there exists a model N of depth at most n with root w′ such that for all ϕ with b(ϕ) ≤ n:

M, w |= ϕ ⇔ N, w′ |= ϕ.

Proof. We do not formally prove this statement, but only sketch the idea. Given a model M with world

w, consider Mw . By Corollary 3, concerning generated subframes, we have for all formulas ϕ that for

all v in Mw ,

M, v |= ϕ ⇔ Mw , v |= ϕ,

but this does not prove the theorem, as Mw may still have depth > n. Therefore, in Mw we cut out all worlds

that have depth > n from w and call this model N. Observe that the root of N is w. The ideas explained

above imply that for all formulas ϕ with b(ϕ) ≤ n we have M, w |= ϕ if and only if N, w |= ϕ.

Corollary 5.

⊢K ϕ ⇔ F |= ϕ for all frames F of depth ≤ b(ϕ).

Proof. ⇒: immediate from the soundness theorem: if ⊢K ϕ, then ϕ is valid on all frames, so in particular on all frames of depth ≤ b(ϕ).

⇐: this direction we show by contraposition. Thus, assuming ⊬K ϕ, we show that there is a frame F of depth ≤ b(ϕ) such that F ⊭ ϕ. So suppose ⊬K ϕ. By the completeness theorem, there is a frame G such that G ⊭ ϕ. Thus there is a model M on this frame and a world w such that M, w |= ¬ϕ. By Theorem 9 there is a model N of depth ≤ b(¬ϕ) and a world v such that N, v |= ¬ϕ. Since the number of boxes in ϕ and ¬ϕ is the same, b(¬ϕ) = b(ϕ). Let F be the frame of N. This then shows that F has depth ≤ b(ϕ) and F ⊭ ϕ, and we are done.

Results similar to Corollary 5 hold for various modal logics. The result can also be improved in such

a way that in the completeness theorem not only can we restrict ourselves to frames of finite depth, but

even to frames that are finite. The precise formulation is as follows.

Theorem 10.

⊢K ϕ ⇔ ϕ holds on all frames of size ≤ 2^|ϕ|.

⊢T ϕ ⇔ ϕ holds on all reflexive frames of size ≤ 2^|ϕ|.

⊢4 ϕ ⇔ ϕ holds on all transitive frames of size ≤ 2^|ϕ|.

⊢S4 ϕ ⇔ ϕ holds on all preordered frames of size ≤ 2^|ϕ|.

⊢S5 ϕ ⇔ ϕ holds on all equivalence frames of size ≤ 2^|ϕ|.

We say that a logic has the finite model property (FMP) if, whenever a formula ϕ is not derivable in the logic, there is a finite model of the logic (a model in which all formulas of the logic are true) that contains a world in which ϕ is refuted.

Corollary 6. The logics K, T, 4, S4 and S5 have the finite model property.


Proof. We prove it for T. Suppose ⊬T ϕ. Then by Theorem 10 there is a reflexive frame F of size ≤ 2^|ϕ| on which ϕ does not hold. Thus there is a model M on the frame and a node w such that w |= ¬ϕ. By the correspondence theorem, 2ϕ → ϕ holds on all reflexive frames. That is, T holds on all reflexive frames. Thus M is a finite model of T with a world in which ¬ϕ is true. This proves that T has the finite model property.

7.3 Decidability

Recall that a language is decidable if there is a Turing machine that decides it. We can define a similar

notion for logics, by considering them as languages, namely as the set of all formulas that are derivable

in the logic. We say that a formula belongs to a logic when it is derivable in it; e.g. with a logic L is associated the set {ϕ | ⊢L ϕ}. We call a Turing machine a decider for L when it decides {ϕ | ⊢L ϕ}. In general, we call a logic L decidable if there is a Turing machine that is a decider for L. The previous theorem implies the decidability of all modal logics mentioned there.

Corollary 7. The logics K, T, 4, S4, S5 are decidable.

Proof. We show that K is decidable and leave the other logics to the reader. Thus we have to construct a Turing machine that, given a formula ϕ, outputs “yes” if ⊢K ϕ and “no” otherwise. By Theorem 10, ⊢K ϕ is equivalent to ϕ being valid on all frames of size ≤ 2^|ϕ|. Thus the Turing machine has to do the following. Given ϕ, it tests for all worlds w in all models M on all frames of size ≤ 2^|ϕ| whether M, w |= ϕ. (Only the valuations of the finitely many atomic propositions occurring in ϕ matter, so there are only finitely many relevant models to check.) If in all cases the answer is positive, it accepts, and otherwise it rejects. It is clear that this Turing machine decides K.
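This decision procedure is easy to prototype. The sketch below brute-forces all frames and all relevant valuations up to a given size bound; for real inputs the bound would have to be 2^|ϕ| as in Theorem 10, but a small bound, passed in explicitly here, already refutes simple non-theorems. The formula encoding is our own convention.

```python
from itertools import product

def holds(R, V, w, phi):
    """Evaluate phi at world w; R is a set of pairs, V a list of sets of atoms."""
    if isinstance(phi, str):
        return phi in V[w]
    op = phi[0]
    if op == "not": return not holds(R, V, w, phi[1])
    if op == "imp": return (not holds(R, V, w, phi[1])) or holds(R, V, w, phi[2])
    if op == "box": return all(holds(R, V, v, phi[1]) for v in range(len(V)) if (w, v) in R)

def atoms(phi):
    return {phi} if isinstance(phi, str) else set().union(*(atoms(a) for a in phi[1:]))

def valid_up_to(phi, max_size):
    """True iff phi holds at every world of every model on every frame
    with at most max_size worlds (brute force)."""
    ps = sorted(atoms(phi))
    for k in range(1, max_size + 1):
        worlds = range(k)
        for rel in product([False, True], repeat=k * k):          # every relation
            R = {(i, j) for i in worlds for j in worlds if rel[i * k + j]}
            for val in product([False, True], repeat=k * len(ps)):  # every valuation
                V = [{p for n, p in enumerate(ps) if val[w * len(ps) + n]}
                     for w in worlds]
                if not all(holds(R, V, w, phi) for w in worlds):
                    return False                                   # countermodel found
    return True

print(valid_up_to(("imp", ("box", "p"), "p"), 2))   # False: T is not a K-theorem
print(valid_up_to(("imp", "p", "p"), 2))            # True
```

The K axiom 2(p → q) → (2p → 2q), by contrast, survives the search on every bound, as it is valid on all frames.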

7.4 Complexity

In terms of complexity, the Turing machine constructed in the proof above might not do so well, since there are at least exponentially many frames of size ≤ 2^|ϕ|. The exponential factor is likely to be essential, as for many of these logics, including K, T, 4 and S4, one can show that the corresponding satisfiability problems are PSPACE-complete. That is, it can be solved in polynomial space whether a formula belongs to such a logic or not, and any problem in PSPACE can be reduced to such problems. (Recall that the satisfiability problem for propositional logic is NP-complete.) On the other hand, decidability is still nice. Recall that predicate logic is not decidable. Of course, propositional logic is, but since modal logics are extensions of propositional logic with much more expressive power, their decidability is not apparent, and indeed these facts have nontrivial proofs that, regrettably, fall outside the scope of this exposition.


8 Tense Logic

So far we have only looked at modal logic in general, leaving the meaning of ‘necessary’ vague and working with Kripke frames in which possible worlds are simply ‘accessible’. Now we turn to a more specific form of modal logic, namely, the logic of time. We start out with standard tense logic.

To understand the idea behind tense logic, consider that there are two ways to think of propositions:

- Non-indexical (eternal) propositions: specify circumstances absolutely, independently of any perspective of evaluation. For example: There is a vase on top of a table at 12h:45m:17s, at 7 October 2011, on earth, latitude: 52.087325, longitude: 5.108177.

- Indexical propositions: specify circumstances relative to a perspective of evaluation (or index). For example: That vase is standing on top of my table (now, here).

In modal logic we always take propositions to be indexical in one respect or another: we evaluate them

from the perspective of some ‘possible world’, but we can also adopt the perspective of the present

moment, ‘now’, or from our current location, or from the perspective of the speaker, and so on. Expres-

sions such as ‘I’, ‘here’ and ‘now’ are also called indexical expressions. What they mean in a particular

utterance depends on who is speaking, where, and when.

In tense logic we take propositions to be indexical in one respect. They are evaluated relative to

a point in time. The same idea can be expressed differently, looking not at the language but at the

models. These are now no longer populated by different ‘possible worlds’ but rather by different ‘possible

moments’ or time points. And accessibility is no longer a matter of determining which worlds are

possible from a given world, but rather for describing which time points are earlier or later than some

given time point.

Tense Logic was introduced by Arthur Prior as a logic for reasoning about past, present and future. It is a

logic that has two 2-like operators, from which we can also define their corresponding 3-like operators.

The operator G (for ‘it is always Going to be the case that’) is for the future. A formula Gϕ says that,

at all possible future moments in time, ϕ is true. The other operator H (for ‘it Has always been the case

that’) is the reverse of G. So Hϕ means that, at all possible moments in the past, ϕ is true.

The dual of G is F (for Future) and the dual of H is P (for Past). Putting these things together, we

get the following table:

                 always   sometimes
towards future:    G         F
towards past:      H         P

[Ltense ] ϕ ::= p | ⊥ | ¬ϕ | ϕ ∧ ϕ | ϕ ∨ ϕ | ϕ → ϕ | ϕ ↔ ϕ | Gϕ | Fϕ | Hϕ | Pϕ

The statement FH p → p means that, if at some point in the future, p will always have been true

up to that moment, then p is true now. Intuitively, many people take this to be true. Analogously, the

statement PG p → p says that, if at some moment in the past it was true that p would always be true from

that moment on, then p is true now. These two statements are true in tense logic. They characterize the

most general class of all tense frames (see below) and formulated as inference rules they are the basic

rules of tense logic.


To evaluate the sentences of the language of tense logic, we use a Kripke model with only one

accessibility relation. So we have two different modal operators which are defined by means of one

accessibility relation. We could use the same notation with W and R as usual, but instead we will use T

for the domain of times, or moments, < for the earlier-than relation and > for the later-than relation. So

a tense frame is a tuple F = hT, <, >i, if it meets the additional constraint that

for all times t and t0 : t > t0 if, and only if, t0 < t.

This is to guarantee that the earlier-than and later-than relations are complementary in the natural way.

A more abstract formulation of this condition is: >=<−1 . The ‘−1’ means that the relation is inverted. A

tense model can be obtained from a tense frame by adding a valuation, so a model is a tuple M = hT, <

, >, Vi.

Below we only display the semantic definitions for the two modal operators.

M, t |= Gϕ if, and only if, for all t′, if t < t′, then M, t′ |= ϕ

M, t |= Hϕ if, and only if, for all t′, if t > t′, then M, t′ |= ϕ
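These clauses (together with the dual clauses for F and P) can be tried out directly on a small finite model. The sketch below assumes times 0, . . . , 4 ordered by the usual < on integers; the tuple encoding of formulas is our own convention.

```python
def tense_holds(V, t, phi, times=range(5)):
    """Evaluate a tense formula at time t; V maps times to sets of atoms."""
    if isinstance(phi, str):                       # atomic proposition
        return phi in V.get(t, set())
    op, sub = phi
    if op == "G": return all(tense_holds(V, u, sub, times) for u in times if t < u)
    if op == "H": return all(tense_holds(V, u, sub, times) for u in times if u < t)
    if op == "F": return any(tense_holds(V, u, sub, times) for u in times if t < u)
    if op == "P": return any(tense_holds(V, u, sub, times) for u in times if u < t)

V = {0: {"p"}, 1: {"p"}, 2: set(), 3: {"p"}, 4: {"p"}}
print(tense_holds(V, 2, ("P", "p")))   # True: p held at time 1
print(tense_holds(V, 2, ("G", "p")))   # True: p holds at 3 and 4
print(tense_holds(V, 0, ("H", "p")))   # True: vacuously, nothing is earlier than 0
print(tense_holds(V, 0, ("G", "p")))   # False: p fails at time 2
```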

The basic tense logic P (for Prior) is obtained from the basic modal logic K (for both modalities), plus the following:

Tmp1: from FHϕ infer ϕ.        Tmp2: from PGϕ infer ϕ.

The inference rules characterize the class of tense frames. That is, the condition that < and > are complementary in the sense defined above is characterized by the two rules of inference. To prove this, we assume arbitrary frames with two accessibility relations, R1 and R2 , and two corresponding modal operators 21 (and 31 ) and 22 (and 32 ).

Proof. ⇐: Suppose that F = ⟨W, R1 , R2⟩ satisfies the property that R2 = R1⁻¹. We have to show that for every model M based on F and every possible world w ∈ W, if M, w |= 31 22 ϕ, then M, w |= ϕ. So suppose, for arbitrary such M and w, that M, w |= 31 22 ϕ. By the semantic definition this means that there is some possible world v such that wR1 v and M, v |= 22 ϕ. Now, we assumed that R2 = R1⁻¹. Therefore, from wR1 v it follows that vR2 w. According to the semantic definition, M, v |= 22 ϕ means that ϕ is true in all worlds accessible by R2 from v. Since vR2 w, w is one of those worlds, so M, w |= ϕ. This is what we needed to prove.

⇒: By contraposition, we prove that if frame F does not satisfy the property that R2 = R1⁻¹, then it is not true at all worlds in all models based on F that if 31 22 ϕ is true, ϕ is true as well. Suppose that R2 ≠ R1⁻¹. This means that for some w and v, wR1 v but not vR2 w (or vice versa; in that case the problem is with the other inference rule). Now, consider this w. We define a model based on F by means of a valuation V which is such that proposition p is true everywhere except at w: so p ∈ V(u) if, and only if, u ≠ w. Given our assumption, not vR2 w. Therefore, all the worlds R2-accessible from v are worlds where p is true. The semantic definition then tells us that M, v |= 22 p. We also assumed that wR1 v. Therefore, applying the semantic definition once more, M, w |= 31 22 p. But the valuation of p guarantees that M, w ⊭ p.

Note that, apart from the connection between < and >, the earlier-than relation can be anything: circular, linear, or even universal. If we have some intuition on what the earlier-than relation is, we have to restrict the class of tense frames accordingly.


Some people would say that time is serial: that there is no such thing as a last moment of time, or

a beginning of time: time has always been and will always continue. If this is true, then the temporal

ordering relations are serial.

- ∀x∃y(x < y)

- ∀x∃y(x > y)

A further thought is that whatever is in the future of the future, is itself in the future. The same could

be said of the past. If this is correct, then the earlier-than relation is transitive.

A common idea is that time is linear. We commonly speak of a ‘time line’. A tense frame is linear if

any two points in the future are ordered as earlier or later—and similarly for the past.

- ∀xyz((x < y ∧ x < z) → (y < z ∨ y = z ∨ z < y)) (future linear)
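These frame conditions are first-order properties that can be checked directly on a finite earlier-than relation. The sketch below, with times as integers and < given as a set of pairs (our own encoding), tests seriality, transitivity and future linearity.

```python
def serial(T, lt):
    """Every time has a <-successor: forall x exists y (x < y)."""
    return all(any((x, y) in lt for y in T) for x in T)

def transitive(T, lt):
    return all((x, z) in lt
               for x in T for y in T for z in T
               if (x, y) in lt and (y, z) in lt)

def future_linear(T, lt):
    """Any two <-successors of a point are comparable or equal."""
    return all((y, z) in lt or y == z or (z, y) in lt
               for x in T for y in T for z in T
               if (x, y) in lt and (x, z) in lt)

T = {0, 1, 2}
line = {(0, 1), (1, 2), (0, 2)}    # a finite strict line
tree = {(0, 1), (0, 2)}            # 0 branches into 1 and 2
print(transitive(T, line), future_linear(T, line))   # True True
print(future_linear(T, tree))                        # False: 1, 2 incomparable
print(serial(T, line))                               # False: 2 has no successor
```

As the last line shows, no finite strict order can be serial, which is why seriality is a claim about time having no last moment.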

Mostly, it is agreed that there is only one past, so the past is linear. There is not one past in which Philip the Second ruled the Netherlands and another past in which he did not. For the future it is somewhat more debatable: some people say that the future is ‘open’: at the present moment it is not yet

determined what the future will be like. For example, who the next prime minister of the Netherlands

will be depends on how people will vote. Voting is a free choice, so it is not fixed now what I will vote in (perhaps) two years’ time. So at least in some sense, many people say, there are several possible futures, not a single one. This view is further strengthened by certain common interpretations of quantum mechanics according to which, amongst others, radioactive decay does not happen according to strict,

deterministic laws. Others reject this idea, arguing either that the future is determined, or that there are

several futures possible, but only one future is the real future. They will maintain that only future-linear

frames can be real tense frames.

The additional rules for characterizing linearity are the ones below.

Fwd-lin: from Fϕ infer G(Pϕ ∨ ϕ ∨ Fϕ).        Bwd-lin: from Pϕ infer H(Pϕ ∨ ϕ ∨ Fϕ).

Proof. An exercise.

The logic Lin consists of the basic tense logic P with additionally the rules D and 4 for both modal-

ities and both of the linearity rules, Fwd-lin and Bwd-lin. The class of frames characterized by the logic

Lin consists of all frames that are serial, transitive, linear in both directions, and in which the two acces-

sibility relations are each other’s mirror image. That is, the characterized class of frames is simply the

intersection of the frame properties characterized by the additional rules. This class of frames includes

also frames in which time is circular, frames in which times precede themselves (since the irreflexivity

of < cannot be modally characterized), and frames in which there are two time lines side by side (since

connectedness cannot be modally characterized either).


Still, we can define the frame class of genuine time lines—with an acyclic, irreflexive and connected

ordering of times. It turns out that the logic Lin is complete with respect to this frame class: every tense

logical validity on that frame class is provable in the logic. The proof for this fact is more involved

than the straightforward Henkin proof for completeness given earlier and will not be presented here. It

is important to realize that a logic can be complete for a frame class even if that frame class is more

restrictive than the frame class that the logic characterizes.

Below is a representation of a time line, with the rightward arrows for the future direction, left to right,

and the leftward arrows for the past direction.

... t1 <--> t2 <--> t3 <--> t4 <--> t5 <--> t6 ...

The alternative with only backward linearity and not forward linearity has an open future and a fixed

past. Such a tense frame can be pictured as a tree or, in the words of the writer Jorge Luis Borges, as a “garden of

forking paths”.

                  t4 ...
                 /
            t3 --
           /     \
      t2 --       t5 ...
     /     \
... t1      t6 ...
     \
      t7 --- t8 --- t9 ...

Time progresses from left to right, but there are many forks in the road, where time can develop in

different ways. So time can progress in different ways: in one future you visit a museum tomorrow, in

another future you stay at home and watch a movie instead. In this situation we speak of “branching

time”. We call the logic with only backwards linearity Tree. So Lin = Tree+Fwd-lin.

Definition 16. Let F = ⟨T, <, >⟩ be a tense frame in the class Tree.

- F is discrete if, and only if, ∀x∃y(x < y ∧ ∀z((x < z ∧ z ≠ y) → y < z));

- F has finite intervals if, and only if, for any two points, there are only finitely many time points in

between them;

- F is dense if, and only if, ∀x∀y(x < y → ∃z(x < z ∧ z < y));

- F is continuous if, and only if, no cut determines a gap.

A cut is a partition of the domain into two parts such that (i) the parts are non-empty, (ii) together

they are the entire domain, (iii) if x is in the first partition and y in the second partition, then x < y.

A cut is a gap if, and only if, the first partition has no last element and the second partition has no first

element. Continuity then says that no cut is a gap: ∀C(∃x(x ∈ C ∧ ∀y(y ∈ C → y ≤ x)) ∨ ∃x(x ∉ C ∧ ∀y(y ∉ C → x ≤ y))), where C is a cut-set, i.e., a set such that ∀x∀y((x ∈ C ∧ y < x) → y ∈ C).
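These order-theoretic conditions are likewise first-order, so on small finite frames they can be tested directly. The sketch below (names invented for illustration) checks density and discreteness; note that a finite line can only illustrate a failure of density, and that it fails discreteness at its last point, whereas a circular frame slips through — echoing the earlier remark that such frames cannot be excluded modally:

```python
from itertools import product

def dense(points, earlier):
    """∀x∀y(x < y → ∃z(x < z ∧ z < y)) on a finite frame."""
    return all(
        (x, y) not in earlier
        or any((x, z) in earlier and (z, y) in earlier for z in points)
        for x, y in product(points, repeat=2)
    )

def discrete(points, earlier):
    """∀x∃y(x < y ∧ ∀z((x < z ∧ z ≠ y) → y < z)): immediate successors."""
    return all(
        any((x, y) in earlier
            and all((x, z) not in earlier or z == y or (y, z) in earlier
                    for z in points)
            for y in points)
        for x in points
    )

# The three-point line t1 < t2 < t3 (transitively closed) is not dense:
line = {(1, 2), (2, 3), (1, 3)}
print(dense({1, 2, 3}, line))                           # False
# A circular two-point frame, where every point precedes every point
# (itself included), counts as discrete:
circle = {(1, 1), (1, 2), (2, 1), (2, 2)}
print(discrete({1, 2}, circle))                         # True
```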


Characteristic formulas.

Theorem 11. Let F be a frame in the class Tree.

- F is discrete if, and only if, F |= (ϕ ∧ Hϕ) → FHϕ.

- F has only finite intervals if, and only if, F |= G(Gϕ → ϕ) → (FGϕ → ϕ) and F |= H(Hϕ →

ϕ) → (PHϕ → ϕ).

- F is dense if, and only if, F |= Fϕ → FFϕ.

- F is continuous if, and only if, F |= (Fϕ ∧ O¬ϕ ∧ ¬O(ϕ ∧ P¬ϕ)) → O((ϕ ∧ G¬ϕ) ∨ (¬ϕ ∧ Hϕ)).

The logic Lin-Z consists of Lin plus the axioms for discreteness and only finite intervals. Everything

that is valid on the frame Z, the frame in which time points are ordered like the integers, can be validly

inferred from this logic. In other words, the logic is complete for the class of frames {Z}. The logic

does not characterize this class, because there are also other frames which have exactly the same set of

validities as the frame Z.

To get from a logic matching Z to a logic matching N we need to replace the rule for seriality D of

the past by a rule that expresses that there is a beginning of time.

The logic Lin-Q is obtained by adding the axiom (or rule) for density to the logic Lin. This logic is

complete for the class of frames {Q}—the class consisting of the single frame constituted by the rational

numbers. That is, if we think of time as the rational numbers ordered by the smaller/greater than relations,

then every tense logical validity is provable in the logic Lin-Q.

By adding to Lin-Q the axiom (or rule) for continuity, we obtain the logic Lin-R, which is complete

for the class of frames {R}. So if time is structured like the real line, then all tense logical validities to be

had are those provable in Lin-R.

Tense logic can be studied for various reasons. One such reason is historical. Prior himself wanted to

reconstruct the so-called ‘Master Argument’ by the ancient philosopher Diodorus Cronus. Diodorus used

this argument to defend what might be called determinism about the future. Unfortunately, Diodorus’

own exposition has not survived. His argument is only known indirectly through descriptions by his

contemporaries. Epictetus writes ([6] Book II, chapter 19):

The Master Argument seems to be based on premisses of this sort. There is a general conflict

among these three statements:

1. Everything past and true is necessary;

2. The impossible does not follow from the possible;

3. There is something possible which neither is nor will be true.

Seeing this conflict, Diodorus relied on the plausibility of the first two to establish: Nothing

is possible that neither is true nor will be.

The ancient philosophers responded differently to this trilemma. Some were inclined to reject the second

statement and maintain the first and the third, for instance.

Diodorus’s conclusion, that the third proposition is false, can be reformulated as the statement:


Dio If it is possible that p, then either p is true now, or will be true some time in the future.

This is also what we can make out of the following statement by Boethius ([3], 234.22-26).

Diodorus defines as possible that which either is or will be; the impossible as that which,

being false, will not be true; the necessary as that which, being true, will not be false; and

the non-necessary as that which either is already or will be false.

In short, the only possible future is the actual future: nothing is possible other than what is true now or

true in the future.

Prior [12] attempted to reconstruct Diodorus’ argument by means of a combination of (alethic) modal

logic and tense logic. The language of this logic is simply that of tense logic plus the operators 2 and

3. We will first look at Prior’s argument and only later come to discuss what kind of models and

interpretations would make sense for this language.

In the language of modal tense logic, the conclusion by Diodorus is:

Dio 3ϕ → (ϕ ∨ Fϕ)

The trilemma is then expressed by the following three formulas, of which the third is the negation of

Dio.

D1 Pϕ → 2Pϕ

D2 2(ϕ → ψ) → (3ϕ → 3ψ)

D3 ¬(3ϕ → (ϕ ∨ Fϕ))

More precisely, D3 should be, not that the negation of Dio is valid, but that Dio itself is invalid. Ac-

cording to the formulation by Epictetus, it says that there is some proposition for which Dio is not true.

Nevertheless, the conclusion that Diodorus drew is that this is false, so that for no proposition D3 is true,

which means that Dio is generally valid.

In order to come to a valid inference of this kind, Prior added two further premisses. These two

premisses are, he claimed, reasonable for an ancient logician such as Diodorus. (Mates [9] agrees with

Prior’s assessment, but some other commentators have been more critical.)

P1 2(ϕ → HFϕ)

P2 (¬ϕ ∧ ¬Fϕ) → P¬Fϕ

Now Prior attempted to show that Diodorus’s reasoning can be validated using modal tense logic. This

means that we now have four premisses, D1, D2, P1, and P2, leading to the conclusion Dio. The first

of these says (intuitively) that the past is necessary. The second one is a straightforward validity of the

basic modal logic K. Furthermore, we can observe that ϕ → HFϕ is a validity of basic tense logic (it is

essentially the same as the inference rule Tmp2). We can infer P1 from this validity using Necessitation

(in the Hilbert-style inference system), or by a simple natural deduction. Finally, the premise P2 says:

Of whatever is and always will be false (i.e., what neither is nor ever will be true), it has

already been the case that it will always be false.


Effectively, the tense logical system we assume is basic tense logic plus D1 and P2.

Prior’s reconstruction of the Master Argument is a valid inference of ‘Dio’ using these four premisses.

Below is a sketch of the proof, skipping some of the inference steps that you can fill in for yourself. At

two points it makes use of the derived rule called ‘Contraposition’.

ϕ→ψ

......

¬ψ → ¬ϕ

You can prove that this is a derived rule in all (modal) logics. The deduction sketch below then proves

that D1, D2, P1, P2 ` Dio.

1  Pψ → 2Pψ                                Premise D1
2  2(ψ → χ) → (3ψ → 3χ)                    Premise D2
3  2(ϕ → HFϕ)                              Premise P1
4  (¬ϕ ∧ ¬Fϕ) → P¬Fϕ                       Premise P2
5  2(ϕ → ¬P¬Fϕ)                            from 3, duality of H and P
6  2(ϕ → ¬P¬Fϕ) → (3ϕ → 3¬P¬Fϕ)           instance of D2 (ψ = ϕ and χ = ¬P¬Fϕ)
7  3ϕ → 3¬P¬Fϕ                             from 5, 6, Elim →
8  ¬3¬P¬Fϕ → ¬3ϕ                           from 7, contraposition
9  2P¬Fϕ → ¬3ϕ                             from 8, duality of 2 and 3
10 P¬Fϕ → 2P¬Fϕ                            instance of D1 (ψ = ¬Fϕ)
11 (¬ϕ ∧ ¬Fϕ) → 2P¬Fϕ                      from 4 and 10, using Intro/Elim →
12 (¬ϕ ∧ ¬Fϕ) → ¬3ϕ                        from 9 and 11, using Intro/Elim →
13 3ϕ → (ϕ ∨ Fϕ)                           from 12, contraposition

If we view this inference from the perspective of modern modal logic, D2 and P1 are part of basic

modal and basic tense logic, respectively. If we accept those as given, then logically we have to either

give up D1 or P2, or accept Dio. Accepting Dio would mean accepting that the future is settled from the

beginning of the world: at the Big Bang it was determined that you would be reading this sentence now.

On closer inspection, however, these two premisses are not that plausible.

First, consider P2. In fact this formula implies that time is discrete. More precisely, that there is, at

any given time, an immediate past moment.

Theorem 12 (Discreteness). Let F be a frame in the class of all tense frames. Then, F |= (¬ϕ ∧ ¬Fϕ) →

P¬Fϕ if, and only if, every time has a unique closest predecessor, ∀x∃y(y < x ∧ ∀z(z < x → z ≤ y)).

Proof. An exercise.

Second, consider D1. This formula seems to express the plausible idea that the past is fixed, but if

we apply it to future tensed statements it goes beyond the fixedness of the past.

Yesterday it was true that tomorrow it is going to rain. Therefore, it is now necessary that yesterday

it was true that tomorrow it is going to rain.


The first sentence in this example is true if, in the actual world, it is raining tomorrow. The second

sentence is true if, in all worlds that are possible now, it is raining tomorrow. In other words, if we use a

future tensed statement in D1, it implies that the future is determined. This is precisely what is involved

in the Master Argument, on Prior’s reconstruction, as can be seen in the deduction sketch above. Perhaps

the past is fixed in the sense that ‘what’s done is done’, without the past being fixed in the sense that

everything that was true in the past (including statements about the future) is fixed.

So the Master argument, on Prior’s reconstruction, says that a discrete time with a (strongly) fixed

past also has a fixed future.

Aristotle’s argument in favour of the idea of an open future has been very influential. It has been viewed

as a response to the so-called Megaric school of which Diodorus was a member. He makes a distinction

between something being necessary simpliciter and something being necessary when it happens. Using

this distinction he can combine the open future with the idea that everything that happens is necessary

(when it happens). Aristotle uses the example of a sea battle that may or may not take place tomorrow.

The only thing that is necessary is that the sea battle either does or does not happen tomorrow: it must

be one of these, but it need not be both.

Now that which is must needs be when it is, and that which is not must needs not be when

it is not. Yet it cannot be said without qualification that all existence and non-existence is

the outcome of necessity. For there is a difference between saying that that which is, when

it is, must needs be, and simply saying that all that is must needs be, and similarly in the

case of that which is not. In the case, also, of two contradictory propositions this holds

good. Everything must either be or not be, whether in the present or in the future, but it is

not always possible to distinguish and state determinately which of these alternatives must

necessarily come about.

Let me illustrate. A sea-fight must either take place tomorrow or not, but it is not necessary

that it should take place tomorrow, neither is it necessary that it should not take place, yet

it is necessary that it either should or should not take place tomorrow. Since propositions

correspond with facts, it is evident that when in future events there is a real alternative, and

a potentiality in contrary directions, the corresponding affirmation and denial have the same

character. (Aristotle [1], Ch.9)

In our formal language of modal tense logic, what Aristotle proposes is that we acknowledge that necessarily there either will or will not be a sea battle (p) tomorrow,

2(F p ∨ ¬F p),

although it is not necessary now that in the future there is going to be a sea battle, nor necessary that it is

not going to happen,

¬(2F p ∨ 2¬F p).

The only thing that is true is that the sea battle necessarily takes place when it does take place.


8.6 Ockhamist semantics for modal tense logic

There are different ways to make models for combinations of tense and modality. One common approach

is the Ockhamist semantics. This semantics interestingly corresponds with the intuitions of Aristotle. It

makes use of the tree-like frames of tense logic, in which the future is not linear but the past is. We do

not alter these models, but we introduce the notion of a history:

A history is a maximal chain of times. If t and t′ are in history h, then so are all times in between

them. And if the tree goes on for ever into the future (or past), then the history also goes on for

ever into the future (or past).

These histories are also called branches of the tree. They pick out the different possible courses time

might take on the tree. We say that a history “goes through” some moment in time, and vice versa that a

time or moment “occurs in” a history. Observe that, in a tree structure, any moment in time has only one

past, but a multitude of futures. So all histories going through time t have the same past, but they may

have different futures.

In the Ockhamist semantics for modal tense logic, we relativize truth not just to a moment of time,

but rather to the combination of a moment and a history. This means that we get the following truth

definition:

M, t, h |= p if, and only if, p ∈ V(t)

M, t, h |= 2ϕ if, and only if, for all histories h′ going through time t, M, t, h′ |= ϕ

M, t, h |= Gϕ if, and only if, for all times t′ > t occurring in history h, M, t′, h |= ϕ

M, t, h |= Hϕ if, and only if, for all times t′ < t occurring in history h, M, t′, h |= ϕ

The 2 modality makes a shift in the history-dimension and stays fixed at the same moment in time. The

tense modalities stay fixed on the given history: they only make steps forward and backward on the

time-dimension.
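To make this truth definition concrete, here is a small Python sketch of an Ockhamist evaluator on a finite, explicitly given branching-time model. The representation — a dictionary with a valuation, a precedence relation, and a list of histories — is invented for illustration, and the example relation is already transitively closed:

```python
def holds(model, t, h, formula):
    """Evaluate a formula at time t on history h. Formulas are tuples:
    ('atom', p), ('not', f), ('and', f, g), ('box', f), ('G', f), ('H', f)."""
    op = formula[0]
    if op == 'atom':
        return formula[1] in model['V'][t]
    if op == 'not':
        return not holds(model, t, h, formula[1])
    if op == 'and':
        return holds(model, t, h, formula[1]) and holds(model, t, h, formula[2])
    if op == 'box':   # shift in the history dimension: all histories through t
        return all(holds(model, t, h2, formula[1])
                   for h2 in model['histories'] if t in h2)
    if op == 'G':     # all later times on the same history
        return all(holds(model, t2, h, formula[1])
                   for t2 in h if (t, t2) in model['<'])
    if op == 'H':     # all earlier times on the same history
        return all(holds(model, t2, h, formula[1])
                   for t2 in h if (t2, t) in model['<'])
    raise ValueError(op)

def F(f):             # Fϕ defined by duality: ¬G¬ϕ
    return ('not', ('G', ('not', f)))

# Sea-battle model: t0 forks into t1 (battle, p) and t2 (no battle).
model = {
    'V': {'t0': set(), 't1': {'p'}, 't2': set()},
    '<': {('t0', 't1'), ('t0', 't2')},
    'histories': [frozenset({'t0', 't1'}), frozenset({'t0', 't2'})],
}
h1 = model['histories'][0]
print(holds(model, 't0', h1, F(('atom', 'p'))))           # True: Fp on this history
print(holds(model, 't0', h1, ('box', F(('atom', 'p')))))  # False: not 2Fp
```

The example exhibits exactly the failure of F p → 2F p discussed below: the sea battle lies ahead on the actual history, but not on every history through t0.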

Truth of an atomic propositional variable depends only on the time point and not on the history. So

the valuation is V(t) and not V(t, h). A consequence of this is that, for all atomic propositions:

p → 2p.

This does not generalize to all formulas. It is not valid on all tree frames that F p → 2F p, for instance.

Suppose that there will actually be a sea battle tomorrow, but this is not necessary now. Then yesterday

it was actually true that there will be a sea battle in two days time, but this does not mean that it was

necessary yesterday that there will be a sea battle.

A number of other interesting validities are listed here:

- S5 for 2

The relation “h is an alternative history to h′, both going through t” is an equivalence relation.

This means that it is characterized by the validities 2ϕ → ϕ, 2ϕ → 22ϕ and 32ϕ → ϕ.

- H2ϕ ↔ 2Hϕ

What is necessary now with respect to what has always been the case, is the same as what has

always been necessary. This is so in virtue of the fact, mentioned above, that for any given moment

in time there is only one past (all histories going through t have the same past before t).

- P2ϕ → 2Pϕ

If it was once necessary that ϕ, then now it is necessary that once ago, ϕ.


- 2Gϕ → G2ϕ

For the future this is only true in one direction: if in all histories it will henceforth always be true

that ϕ, then henceforth it will always be necessary that ϕ. The possible histories for any later time

are always a subset of the possible histories now.

Some philosophers have been more restrictive in their acceptance of the distinction between actual

and possible future. C.S. Peirce, for example, rejected the idea of an actual future. On his view, the

only notion of the future is the one in which Fϕ is true if in all possible futures there is a moment where

ϕ is true. In terms of the language used here, Peirce would only accept as meaningful 2Fϕ, not the

actual future Fϕ, which assumes that we can speak of the ‘real’ course of time in advance of its actual

occurrence. On the other extreme, many philosophers have supported a view we might call determinism,

according to which there is no future other than the actual one. The very idea of a future that could

happen but that does never actually happen is an absurdity to these philosophers. If we would accept

that view, then there would only be one branch. We would have the inference rule of Forward-linearity

for tense modality, but also 3ϕ ↔ 2ϕ. A consequence of this is Fϕ ↔ 2Fϕ. So, interestingly, at both

extremes the distinction between Fϕ and 2Fϕ collapses.

The logic CTL∗ is a version of tense logic for branching time that has been proposed in computer science.

It is used for automated verification of software, debugging of hardware circuits and communication

protocols.

This logic is similar in interesting ways to the modal tense logic we discussed above, and its semantics

is comparable to the Ockhamist semantics. Its language is defined by means of a simultaneous induction.

We distinguish two separate types of (modal) formulas: state formulas ϕ and path formulas π.

ϕ := p | Aπ | Eπ

π := ϕ | ¬π | π ∧ ρ | Fπ | Xπ | πUρ

U is a binary (or dyadic) modality: if we compare 2 with the ¬-operator, then U is like the ∧ operator:

it has two immediate subformulas instead of one.

By defining the language in this way we can preclude certain combinations of operators. The lan-

guage of CTL∗ does not include formulas such as E pUAq or XAp, because U and X require path formulas

and Ap and E p are state formulas.

Now for the models. They are basically the branching-time structures of the class Tree. Instead of histories, we

now define the concept of a path. A path is like a part of a branch with an initial state. That is, a path is

a sequence of states s0 s1 s2 s3 . . . which is ordered by the precedence relation, so s0 < s1 < s2 < . . .

- Pk is the path obtained by ‘cutting off’ the initial k states, so that Pk (0) = P(k);


M, s |= p iff p ∈ V(s)

M, s |= Aπ iff ∀P(P(0) = s ⇒ M, P |= π)

M, s |= Eπ iff ∃P(P(0) = s and M, P |= π)

M, P |= ϕ iff M, P(0) |= ϕ

M, P |= Fπ iff ∃k : M, Pk |= π

M, P |= Xπ iff M, P1 |= π

M, P |= πUρ iff ∃k : M, Pk |= ρ and M, Pi |= π for all i < k

M, P |= Iπ iff for infinitely many i, M, Pi |= π

We see that Aϕ is basically the same as 2ϕ, and Fϕ is what it used to be in Ockhamist semantics.
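As a concrete illustration, the path clauses can be sketched in Python over finite paths. Real CTL∗ paths are infinite, so this is only an approximation on finite prefixes; all names here are invented:

```python
def suffix(path, k):
    """P^k: the path with the first k states cut off, so suffix(path, k)[0] == path[k]."""
    return path[k:]

def holds_path(V, path, formula):
    """Evaluate a path formula on a finite path (a list of states)."""
    op = formula[0]
    if op == 'atom':          # a state formula, checked at the initial state
        return formula[1] in V[path[0]]
    if op == 'not':
        return not holds_path(V, path, formula[1])
    if op == 'X':             # next: true on the suffix P^1
        return len(path) > 1 and holds_path(V, suffix(path, 1), formula[1])
    if op == 'F':             # eventually: true on some suffix P^k
        return any(holds_path(V, suffix(path, k), formula[1])
                   for k in range(len(path)))
    if op == 'U':             # until: rho at P^k, pi at every P^i with i < k
        return any(holds_path(V, suffix(path, k), formula[2])
                   and all(holds_path(V, suffix(path, i), formula[1])
                           for i in range(k))
                   for k in range(len(path)))
    raise ValueError(op)

V = {'s0': {'p'}, 's1': {'p'}, 's2': {'q'}}
path = ['s0', 's1', 's2']
print(holds_path(V, path, ('U', ('atom', 'p'), ('atom', 'q'))))  # True: p until q
print(holds_path(V, path, ('X', ('atom', 'q'))))                 # False: q not next
```

The state clauses A and E would quantify over all such paths starting at a given state, as in the truth definition above.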


References

[1] Aristotle. On Interpretation (ca. 350 B.C., translated by E. M. Edghill). Kessinger Publishing, 2004.

[2] Patrick Blackburn, Maarten de Rijke, and Yde Venema. Modal Logic, volume 53 of Cambridge Tracts in Theoretical

Computer Science. Cambridge University Press, 2001.

[3] Boethius. Commentary on Aristotle’s De Interpretatione.

[4] R. Carnap. Introduction to Semantics. Harvard University Press, 1942.

[5] R. Carnap. Meaning and Necessity: a Study in Semantics and Modal Logic. University of Chicago Press, 1947.

[6] Epictetus. Discourses. In B. Inwood and L.P. Gerson, editors, Hellenistic Philosophy. Indianapolis: Hackett Publishing

Company, 1988.

[7] J.W. Garson. Modal Logic for Philosophers. Cambridge University Press, 2006.

[8] Saul A. Kripke. A completeness theorem in modal logic. The Journal of Symbolic Logic, 24(1):1–13, 1959.

[9] Benson Mates. Review of Prior, ‘Diodorean Modalities’. The Journal of Symbolic Logic, 21(2):199–200, 1956.

[10] Benson Mates. The Philosophy of Leibniz. Oxford University Press, 1986.

[11] A.N. Prior. Time and Modality. Oxford University Press, 1957.

[12] Arthur N. Prior. Past, Present and Future. Oxford: Clarendon Press, 1967.

[13] B. Russell. Necessity and possibility [1905]. In A. Urquhart and A.C Lewis, editors, Foundations of Logic, 1903-05.

Routledge, 1994.

[14] J. van Benthem. Modal Correspondence Theory. PhD thesis, University of Amsterdam, 1976.

[15] Johan van Benthem. Modal Logic for Open Minds. CSLI Publications, 2010.

[16] Hans van Ditmarsch, Wiebe van der Hoek, and Barteld Kooi. Dynamic Epistemic Logic, volume 337 of Synthese Library.

Springer, 2007.

[17] Ludwig Wittgenstein. Tractatus Logico-Philosophicus. London: Routledge and Kegan Paul, 1922. Translated by C.K.

Ogden, with an Introduction by Bertrand Russell.
