
Logics for Linguistic Structures

Trends in Linguistics
Studies and Monographs 201
Editors
Walter Bisang
Hans Henrich Hock
(main editor for this volume)
Werner Winter
Mouton de Gruyter
Berlin New York
Logics for Linguistic Structures
Edited by
Fritz Hamm
Stephan Kepser
Mouton de Gruyter
Berlin New York
Mouton de Gruyter (formerly Mouton, The Hague)
is a Division of Walter de Gruyter GmbH & Co. KG, Berlin.
Printed on acid-free paper which falls within the guidelines
of the ANSI to ensure permanence and durability.
Library of Congress Cataloging-in-Publication Data
Logics for linguistic structures / edited by Fritz Hamm and Stephan
Kepser.
p. cm. (Trends in linguistics ; 201)
Includes bibliographical references and index.
ISBN 978-3-11-020469-8 (hardcover : alk. paper)
1. Language and logic. 2. Computational linguistics. I. Hamm,
Fritz, 1953 II. Kepser, Stephan, 1967
P39.L5995 2008
401dc22
2008032760
ISBN 978-3-11-020469-8
ISSN 1861-4302
Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie;
detailed bibliographic data are available in the Internet at http://dnb.d-nb.de.
Copyright 2008 by Walter de Gruyter GmbH & Co. KG, D-10785 Berlin
All rights reserved, including those of translation into foreign languages. No part of this
book may be reproduced or transmitted in any form or by any means, electronic or mecha-
nical, including photocopy, recording or any information storage and retrieval system,
without permission in writing from the publisher.
Cover design: Christopher Schneider, Berlin.
Printed in Germany.
Contents
Introduction 1
Fritz Hamm and Stephan Kepser
Type Theory with Records and unification-based grammar 9
Robin Cooper
One-letter automata: How to reduce k tapes to one 35
Hristo Ganchev, Stoyan Mihov, and Klaus U. Schulz
Two aspects of situated meaning 57
Eleni Kalyvianaki and Yiannis N. Moschovakis
Further excursions in natural logic: The Mid-Point Theorems 87
Edward L. Keenan
On the logic of LGB type structures. Part I: Multidominance structures 105
Marcus Kracht
Completeness theorems for syllogistic fragments 143
Lawrence S. Moss
List of contributors 175
Index 179
Introduction
Fritz Hamm and Stephan Kepser
Logic has long played a major role in the formalization of linguistic
structures and linguistic theories. This is certainly particularly true for the
area of semantics, where formal logic has been the major tool ever since
the Fregean program. In the area of syntax it was the rise of principle-based
theories, with the focus shifting away from the generation process of
structures to defining general well-formedness conditions on structures, that
opened the way for logic. The naturalness with which many types of well-formedness
conditions can be expressed in some logic or other has led to different
logics being proposed and used in diverse formalizations of syntactic theories
in general and the field of model-theoretic syntax in particular.
The contributions collected in this volume address central topics in theoretical
and computational linguistics, such as quantification, types of context
dependence, and aspects concerning the formalisation of major grammatical
frameworks, among others GB, DRT and HPSG. All contributions have in
common a strong preference for logic as the major tool of analysis. Two of
them are devoted to formal syntax, three to aspects of logical semantics. The
paper by Robin Cooper contributes both to syntax and semantics. We therefore
grouped the descriptions of the papers in this preface into a syntactic and
a semantic section, with Cooper's paper as a natural interface between these
two fields.
The contribution by Hristo Ganchev, Stoyan Mihov, and Klaus U. Schulz
belongs to the field of finite state automata theory and provides a method for
reducing multi-tape automata to single-tape automata. Multi-tape finite state
automata have many applications in computer science; they are particularly
frequently used in many areas of natural language processing. A disadvantage
for the usability of multi-tape automata is that certain automata constructions,
namely composition, projection, and cartesian product, are a lot more complicated
than for single-tape automata. A reduction of multi-tape automata to
single-tape automata is thus desirable.
The key construction by Ganchev, Mihov, and Schulz in the reduction is
the definition of an automaton type that bridges between multi- and single-tape
automata. The authors introduce so-called one-letter automata. These
are multi-tape automata with a strong restriction: only one type of transition
is permitted, namely one in which only a single letter in the k-tuple of
signature symbols in the transition differs from the empty word. In other words,
all components of the tuple are the empty word with the exception of one
component. Ganchev, Mihov, and Schulz show that one-letter automata are
equivalent to multi-tape automata. Interestingly, one-letter automata can be
regarded as single-tape automata over an extended alphabet which consists of
complex symbols, each of which is a k-tuple with exactly one letter differing
from the empty word.
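The one-letter restriction and the view of such transitions as single symbols over an extended alphabet can be illustrated with a small sketch (illustrative code of our own, not the authors' construction):

```python
# Illustrative sketch (not the authors' construction): a k-tape transition
# label is a k-tuple of strings; in a one-letter automaton exactly one
# component is a single letter and all other components are the empty word.

def is_one_letter(label):
    """Check the one-letter restriction on a k-tuple transition label."""
    nonempty = [s for s in label if s != ""]
    return len(nonempty) == 1 and len(nonempty[0]) == 1

def as_complex_symbol(label):
    """View a one-letter k-tuple as a single symbol (tape index, letter)
    over an extended alphabet, so the machine becomes a one-tape automaton."""
    assert is_one_letter(label)
    for i, s in enumerate(label):
        if s:
            return (i, s)

print(is_one_letter(("", "a", "")))      # True
print(is_one_letter(("ab", "", "")))     # False: two letters on one tape
print(as_complex_symbol(("", "a", "")))  # (1, 'a')
```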
One of the known differences between multi-dimensional regular relations
and one-dimensional regular relations is that the latter are closed under
intersection while the former are not. To cope with this difference, Ganchev,
Mihov, and Schulz define a criterion of essentiality of a component in a
k-dimensional regular relation and show that the intersection of two k-dimensional
regular relations is regular if the two relations share at most one essential
component. This result can be extended to essential tapes of one-letter
automata and the intersection of these automata.
On the basis of this result Ganchev, Mihov, and Schulz present automata
constructions for insertion, deletion, and projection of tapes as well as
composition and cartesian product of regular relations, all of which are based on
the corresponding constructions for single-tape automata. This way the
effectiveness of one-letter automata is shown. It should hence be expected that
this type of automata will be very useful for practical applications.
The contribution by Marcus Kracht provides a logical characterisation of the
data structures underlying the linguistic frameworks Government and Binding
and Minimalism. Kracht identifies so-called multi-dominance structures
as the data structures underlying these theories. A multi-dominance structure
is a binary tree with additional immediate dominance relations which have a
restricted distribution in the following way: all parents of additional
dominance relations must be found on the path from the root to the parent of
the base dominance relation. Additional dominance relations provide a way
to represent movement of some component from a lower part in a tree to a
position higher up.
The logic chosen by Kracht to formalize multi-dominance structures is
propositional dynamic logic (PDL), a variant of modal logic that has been
used before on many occasions by Kracht and other authors to formalize
linguistic theories. In this paper, Kracht shows that PDL can be used to
axiomatize multi-dominance structures. This has the important and highly desirable
consequence that the dynamic logic of multi-dominance structures is decidable.
The satisfiability of a formula can be decided in 2EXPTIME.
In order to formalize a linguistic framework it is not enough to provide
an axiomatisation of the underlying data structures only. The second contribution
of this paper is therefore a formalisation of important grammatical
concepts and notions in the logic PDL. This formalisation is provided for
movement and its domains, single movement, adjunction, and cross-serial
dependencies. In all of these, care is taken to ensure that the decidability
result for multi-dominance structures carries over to grammatical notions
defined on these structures. It is thereby shown that large parts of the linguistic
framework Government and Binding can be formalized in PDL and that this
formalisation is decidable.
The contribution by Robin Cooper shows how to render unification-based
grammar formalisms with type theory using record structures. The paper is
part of a broader project which aims at providing a coherent unified approach
to natural language dialog semantics. The type theory underlying this work
is based on set theory and follows Montague's style of recursively defining
semantic domains. Functions and function types are available in this
type theory, providing a version of the typed λ-calculus. To this base records
are added. A record is a finite set of fields, i.e., ordered pairs of a label and an
object. A record type is accordingly a finite set of ordered pairs of a label and
a type. Records and record types may be nested. The notions of dependent
types and subtype relations are systematically extended to be applicable to
record types.
The main contribution of this paper is a type-theoretical approach to
unification phenomena. Feature structures of some type play an important role
in almost all modern linguistic frameworks. Some frameworks like LFG and
HPSG make this rather explicit. They also provide a systematic way to combine
two feature structures partially describing some linguistic object. This
combination is based on ideas of unification even though this notion need no
longer be explicitly present in the linguistic frameworks. In a type-theoretical
approach, records and their types render feature structures in a rather
direct and natural way. The type-theoretical tools to describe unification are
meet types and equality. Cooper assumes the existence of a meet type for
each pair of types in his theory, including record types. He provides a function
that recursively simplifies a record type. This function is particularly
applicable to record types which are the result of the construction of a meet
record type and should be interpreted as the counterpart of unification in type
theory with records. There are, though, important differences to feature
structure unification. One of them is that type simplification never fails. If the
meet of incompatible types was constructed, the simplification will return a
distinguished empty type.
The main advantage of this approach is that it provides a kind of intensionality
which is not available for feature structures. This intensionality can be
used, e.g., to distinguish equivalent types, such as by the source of the grammatical
information. It can also be used to associate different empty types with different
ungrammatical phrases. This may provide a way to support robust parsing
in that ungrammatical phrases can be processed and the consistent parts
of their record types may contain useful information for further processing.
Type theory with records also offers a very natural way to integrate semantic
analyses into syntactic analyses based on feature structures.
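The simplification behaviour described above, in which a clash yields a distinguished empty type rather than failure, can be sketched for flat record types (our own illustration, not Cooper's implementation):

```python
# Illustrative sketch of meet-type simplification for flat record types:
# shared labels must carry the same type; a clash produces a distinguished
# empty type instead of a unification failure.

EMPTY = "EmptyType"

def simplify_meet(T1, T2):
    """Simplify the meet of two flat record types given as label->type dicts."""
    result = dict(T1)
    for label, typ in T2.items():
        if label in result and result[label] != typ:
            return EMPTY        # incompatible types: empty type, not failure
        result[label] = typ
    return result

print(simplify_meet({"x": "Ind"}, {"c1": "man(x)"}))
# {'x': 'Ind', 'c1': 'man(x)'}
print(simplify_meet({"x": "Ind"}, {"x": "Event"}))
# EmptyType
```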
The paper by Eleni Kalyvianaki and Yiannis Moschovakis contains a
sophisticated application of the theory of referential intensions developed by
Moschovakis in a series of papers (see for instance (Moschovakis 1989a,b,
1993, 1998)) and applied to linguistics in (Moschovakis 2006). Based on
the theory of referential intensions, the paper introduces two notions of
context-dependent meaning, factual content and local meaning, and shows that these
notions solve puzzles in the philosophy of language and linguistics, especially
those concerning the logic of indexicals.
Referential intension theory makes it possible to define three notions of synonymy,
namely referential synonymy, local synonymy, and factual synonymy.
Referential synonymy, the strongest concept, holds between two terms A and B
iff their referential intensions are the same, i.e., int(A) = int(B). Here the
referential intension of an expression A, int(A), is to be understood as the natural
algorithm (represented as a set-theoretical object) which computes the
denotation of A with respect to a given model. Thus referential synonymy
is a situation-independent notion of synonymy. This contrasts with the other
two notions of synonymy, which are dependent on a given state a. Local synonymy
is synonymy with regard to local meaning, where the local meaning
of an expression A is computed from the referential intension of A applied to
a given state a. It is important to note that for the constitution of the local
meaning of A the full meanings of the parts of A have to be computed. In
this respect the concept of local meaning differs significantly from the notion
of factual content and for this reason from the associated notion of synonymy
as well. This is best explained by way of an example.
If in a given state a her(a) = Mary(a), then the factual content of the
sentence "John loves her" is the same as the factual content of "John loves Mary".
The two sentences are therefore synonymous with regard to factual content.
But they are not locally synonymous, since the meaning of her in states other
than a may well be different from the meaning of Mary.
The paper applies these precisely defined notions to Kaplan's treatment of
indexicals and argues for local meanings as the most promising candidates
for belief carriers. The paper ends with a brief remark on what aspects of
meaning should be preserved under translation.
The paper by Edward L. Keenan tries to identify inference patterns which
are specific to proportionality quantifiers. For instance, given the premisses
(1-a) and (1-b) in (1) we may conclude (1-c).

(1) a. More than three tenths of the students are athletes.
    b. At least seven tenths of the students are vegetarians.
    c. At least one student is both an athlete and a vegetarian.
This is an instance of the following inference pattern:

(2) a. More than n/m of the As are Bs.
    b. At least 1 − n/m of the As are Cs.
    c. Ergo: Some A is both a B and a C.
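The counting argument that validates pattern (2) can be made explicit (our reconstruction of the standard reasoning; |X| denotes cardinality and A is assumed finite and non-empty):

```latex
|A \cap B| + |A \cap C| \;>\; \frac{n}{m}\,|A| + \Bigl(1 - \frac{n}{m}\Bigr)|A| \;=\; |A| ,
\qquad\text{hence}\qquad
|A \cap B \cap C| \;\ge\; |A \cap B| + |A \cap C| - |A| \;>\; 0 ,
```

so some A is both a B and a C.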
Although proportionality quantifiers satisfy inference pattern (2), other quantifiers
do so as well, as observed by Dag Westerståhl. Building on (Keenan
2004), the paper provides an important further contribution to the question
whether there are inference patterns specific to proportionality quantifiers.
The central result of Keenan's paper is the Mid-Point Theorem and a
generalization thereof.

The Mid-Point Theorem. Let p, q be fractions with 0 ≤ p ≤ q ≤ 1 and
p + q = 1. Then the quantifiers

  (BETWEEN p AND q) and (MORE THAN p AND LESS THAN q)

are fixed by the postcomplement operation.

The postcomplement of a generalized quantifier Q is that generalized quantifier
which maps a set B to Q(¬B), where ¬B is the complement of B. The following
pair of sentences illustrates this operation:
(3) a. Exactly half the students got an A on the exam.
    b. Exactly half the students didn't get an A on the exam.

The Mid-Point Theorem therefore guarantees the equivalence of sentences
(4-a) and (4-b), and analogously the equivalence of sentences formed with
(MORE THAN p AND LESS THAN q).

(4) a. Between one sixth and five sixths of the students are happy.
    b. Between one sixth and five sixths of the students are not happy.

However, this and the generalization of the Mid-Point Theorem are still
only partial answers to the question concerning specific inference patterns
for proportionality quantifiers, since non-proportional determiners exist which
still satisfy the conditions of the generalized Mid-Point Theorem.
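Why quantifiers of this shape are fixed points of postcomplement can be checked by direct computation (our reconstruction): write r = |A ∩ B| / |A| for the proportion of As that are Bs; replacing B by its complement replaces r by 1 − r, and with p + q = 1 the defining interval is mapped onto itself:

```latex
p \le 1 - r \le q
\;\Longleftrightarrow\;
1 - q \le r \le 1 - p
\;\Longleftrightarrow\;
p \le r \le q
\qquad (\text{since } 1 - q = p \text{ and } 1 - p = q).
```

The case with strict inequalities is analogous.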
The paper by Lawrence S. Moss studies syllogistic systems of increasing
strength from the point of view of natural logic (for a discussion of this notion,
see Purdy (1991)). Moss proves highly interesting new completeness results
for these systems. More specifically, after proving soundness for all systems
considered in the paper, the first result states the completeness of the following
two axioms for L(all), a syllogistic fragment containing only expressions of
the form "All X are Y":
                    All X are Z    All Z are Y
  -----------       --------------------------
  All X are X       All X are Y
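Derivability in this fragment amounts to computing a reflexive-transitive closure of the "all" relation, which can be sketched as follows (illustrative code of our own, not from the paper):

```python
# Illustrative sketch: derivability of "All X are Y" from a set of such
# sentences is reflexive-transitive closure over the 'all' relation.

def derivable(premises, goal):
    """premises: set of (X, Y) pairs meaning 'All X are Y'; goal: (X, Y)."""
    x, y = goal
    if x == y:                      # axiom: All X are X
        return True
    reachable, frontier = {x}, [x]
    while frontier:
        u = frontier.pop()
        for (a, b) in premises:     # rule: from All X are Z, All Z are Y
            if a == u and b not in reachable:
                reachable.add(b)    # ...conclude All X are Y
                frontier.append(b)
    return y in reachable

premises = {("dog", "mammal"), ("mammal", "animal")}
print(derivable(premises, ("dog", "animal")))   # True
print(derivable(premises, ("animal", "dog")))   # False
```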
In addition to completeness the paper studies a further related but stronger
property, the canonical model property. A system which has the canonical
model property is also complete, but this does not hold vice versa. Roughly,
a model M is canonical for a fragment F, a set Γ of sentences in F, and a
logical system for F if for all S ∈ F, M ⊨ S iff Γ ⊢ S. A fragment F has
the canonical model property if every set Γ ⊆ F has a canonical model. The canonical
model property is a rather strong property. Classical propositional logic, for
instance, does not have this property, but the fragment L(all) has it. Some
but not all of the systems in the paper have the canonical model property.
Other systems studied in Moss' paper include "Some X are Y", combinations
of this system with L(all) and with sentences involving proper names, systems
with Boolean combinations, a combination of L(all) with "There are
at least as many X as Y", and logical theories for "Most" and for "Most" together
with "Some". The largest logical system for which completeness is proved adds
sentences "There are more X than Y" to the theory
L(all, some, no, names) with Boolean operations; such a sentence is considered
true in case X contains more elements than Y.
Moss' paper contains two interesting digressions as well. The first is concerned
with sentences of the form "All X which are Y are Z", the second with
"most". For instance, Moss proves that the following two axioms are complete
for "most":
  Most X are Y       Most X are Y
  ------------       ------------
  Most X are X       Most Y are Y
Moreover, if "Most X are Y" does not follow from a set Γ of such sentences, then
there exists a model of Γ with cardinality 5 which falsifies "Most X are Y".
All papers collected in this volume grew out of a conference in honour of
Uwe Mönnich which was held in Freudenstadt in November 2004. Four years
have elapsed since this event. But another important date is now imminent:
Uwe's birthday. Hence we are in the lucky position to present this volume as
a Festschrift for Uwe Mönnich on the occasion of his 70th birthday.

Tübingen, July 2008                     Fritz Hamm and Stephan Kepser
References
Keenan, Edward L.
2004 Excursions in natural logic. In Claudia Casadio, Philip J. Scott, and
Robert A. G. Seely (eds.), Language and Grammar: Studies in Mathematical
Linguistics and Natural Language. Stanford: CSLI.
Moschovakis, Yiannis
1989a The formal language of recursion. The Journal of Symbolic Logic 54:
1216–1252.
1989b A mathematical modeling of pure recursive algorithms. In Albert R.
Meyer and Michael Taitslin (eds.), Logic at Botik '89, LNCS 363.
Berlin: Springer.
1993 Sense and denotation as algorithm and value. In Juha Oikkonen and
Jouko Väänänen (eds.), Logic Colloquium '90. Natick, USA: Association
for Symbolic Logic, A. K. Peters, Ltd.
1998 On founding the theory of algorithms. In Harold Dales and Gianluigi
Oliveri (eds.), Truth in Mathematics. Oxford: Oxford University Press.
2006 A logical calculus of meaning and synonymy. Linguistics and Philosophy
29: 27–89.
Purdy, William C.
1991 A logic for natural language. Notre Dame Journal of Formal Logic
32: 409–425.
Type Theory with Records and unification-based grammar
Robin Cooper
Abstract
We suggest a way of bringing together type theory and unification-based grammar
formalisms by using records in type theory. The work is part of a broader project
whose aim is to present a coherent unified approach to natural language dialogue
semantics using tools from type theory.
1. Introduction
Uwe Mönnich has worked both on the use of type theory in semantics and on
formal aspects of grammar formalisms. This paper suggests a way of bringing
together type theory and unification as found in unification-based grammar
formalisms like HPSG by using records in type theory, which provide us with
feature-structure-like objects. It represents a small offering to Uwe to thank
him for many kindnesses over the years sprinkled with insights and rigorous
comments.
This work is part of a broader project whose aim is to present a coherent
unified approach to natural language dialogue semantics using tools from
type theory. We are seeking to do this by bringing together Head Driven
Phrase Structure Grammar (HPSG) (Sag et al. 2003), Montague semantics
(Montague 1974), Discourse Representation Theory (DRT) (Kamp and Reyle
1993; van Eijck and Kamp 1997, and much other literature), situation semantics
(Barwise and Perry 1983) and issue-based dialogue management (Larsson
2002) into a single type-theoretic formalism. A survey of our approach
to the semantic theories (i.e., Montague semantics, DRT and situation semantics)
and HPSG can be found in (Cooper 2005b). Other work in progress can
be found on http://www.ling.gu.se/~cooper/records. We give a brief
summary here: Record types can be used as discourse representation structures
(DRSs). Truth of a DRS corresponds to there being an object of the
appropriate record type and this gives us the effect of simultaneous binding
of discourse referents (corresponding to labels in records) familiar from the
semantics of DRSs in (Kamp and Reyle 1993). Dependent function types provide
us with the classical treatment of donkey anaphora from DRT in a way
corresponding to the type-theoretic treatment proposed by Mönnich (1985),
Sundholm (1986) and Ranta (1994). At the same time record types can be
used as feature structures of the kind found in HPSG since they have recursive
structure and induce a kind of subtyping which can be used to mimic unification.
Because we are using a general type theory which includes records we
have functions available and a version of the λ-calculus. This means that we
can use Montague's λ-calculus based techniques for compositional interpretation.
From the HPSG perspective this gives us the advantage of being able
to use real variable binding which can only be approximately simulated in
pure unification-based systems. From the DRT perspective this use of
compositional techniques gives us an approach similar to that of Muskens (1996)
and work on λ-DRT (Kohlhase et al. 1996).
In this paper we will look at the notion of unification as used in unification-based
grammar formalisms like HPSG from the perspective of the type-theoretical
framework. This work has been greatly influenced by work of Jonathan
Ginzburg (for example, Ginzburg in prep., Chap. 3). In Section 2 we will give
a brief informal introduction to our view of type theory with records. The version
of type theory that we discuss has been made more precise in (Cooper
2005a) and in an implementation called TTR (Type Theory with Records)
which is under development in the Oz programming language. In Section 3
we will discuss the notion of subtype which records introduce (corresponding
to the notion of subsumption in the unification literature). We will then,
in Section 4, propose that linguistic objects are to be regarded as records
whereas feature structures are to be regarded as corresponding to record types.
Type theory is function-based rather than unification-based. However,
the addition of records to type theory allows us to get the advantages of unification
without having to leave the function-based approach. We show how
to do this in Section 5, treating some classical simple examples which have
been used to motivate the use of unification. Section 6 deals with the way in
which unification analyses are used to allow the extraction of linguistic
generalizations as principles in the style of HPSG. The conclusion (Section 7) is
that by using record types within a type theory we can have the advantages of
unification-based approaches together with an additional intensionality not
present in classical unification approaches and without the disadvantage of
leaving the function-based approach, which is necessary in order to deal
adequately with semantics (at least).
2. Records in type theory
In this section¹ we give a very brief intuitive introduction to the kind of
type theory we are employing. A more detailed and formal account can be
found in (Cooper 2005a) and work in progress on the project can be found
on http://www.ling.gu.se/~cooper/records. While the type-theoretical
machinery is based on work carried out in the Martin-Löf approach (Coquand
et al. 2004; Betarte 1998; Betarte and Tasistro 1998; Tasistro 1997),
we are making a serious attempt to give it a foundation in standard set theory
using Montague-style recursive definitions of semantic domains. There
are two main reasons for this. The first is that we think it important to show
the relationship between the Montague model-theoretic tradition which has
been developed for natural language semantics and the proof-theoretic tradition
associated with type theory. We believe that the aspects of this kind of
type theory that we need can be seen as an enrichment of Montague's original
programme. The second reason is that we are interested in exploring to what
extent intuitionistic and constructive approaches are appropriate or necessary
for natural language. For example, we make important use of the notion
"propositions as types" which is normally associated with an intuitionistic
approach. However, we suspect that our Montague-like approach to defining
the type theory to some extent decouples the notion from intuitionism. We
would like to see type theory as providing us with a powerful collection of
tools for natural language analysis which ultimately do not commit one way
or the other to philosophical notions associated with intuitionism.
The central idea of records and record types can be expressed informally
as follows, where T(a1, ..., an) represents a type T which depends on the
objects a1, ..., an.

If a1 : T1, a2 : T2(a1), ..., an : Tn(a1, a2, ..., an-1), a record:

  [ l1 = a1
    l2 = a2
    ...
    ln = an
    ... ]

is of type:

  [ l1 : T1
    l2 : T2(l1)
    ...
    ln : Tn(l1, l2, ..., ln-1) ]
A record is to be regarded as a finite set of fields ⟨l, a⟩, which are ordered
pairs of a label and an object. A record type is to be regarded as a finite set
of fields ⟨l, T⟩ which are ordered pairs of a label and a type. The informal
notation above suggests that the fields are ordered, with types being dependent
on previous fields in the order. This is misleading in that we regard record
types as sets of fields on which a partial order is induced by the dependency
relation. Dependent types give us the possibility of relating the values in
fields to each other and play a crucial role in our treatment of both feature
structures and semantic objects. Both records and record types are required
to be the graphs of functions, that is, if ⟨l, Φ⟩ and ⟨l′, Ψ⟩ are distinct members of a
given record or record type then l ≠ l′. A record r is of record type R just
in case for each field ⟨l, T⟩ in R there is a field ⟨l, a⟩ in r (i.e., with the same
label) and a is of type T. Notice that the record may have additional fields not
mentioned in the type. Thus a record will generally belong to several record
types and any record will belong to the empty record type. This gives us a
notion of subtyping which we will explore further in Section 3.
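The typing relation just described, under which a record may contain more fields than its type mentions, can be sketched with records and (non-dependent) record types modelled as Python dicts (our own illustration, not the TTR implementation):

```python
# Illustrative model: a record is a dict from labels to objects; a flat,
# non-dependent record type is a dict from labels to type-checking
# predicates. r is of type R iff every field of R is matched in r;
# extra fields in r are allowed.

def of_type(record, record_type):
    return all(label in record and check(record[label])
               for label, check in record_type.items())

is_ind = lambda v: isinstance(v, str)      # toy type of individuals

R = {"x": is_ind}                          # the record type [ x : Ind ]
r = {"x": "a", "c1": "proof-of-man(a)"}    # a record with an extra field

print(of_type(r, R))    # True: extra fields are allowed
print(of_type({}, R))   # False: the x field is missing
print(of_type(r, {}))   # True: every record is of the empty record type
```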
Let us see how this can be applied to a simple linguistic example. We will
take the content of a sentence to be modelled by a record type. The sentence
"a man owns a donkey" corresponds to a record type:

  [ x  : Ind
    c1 : man(x)
    y  : Ind
    c2 : donkey(y)
    c3 : own(x,y) ]
A record of this type will be:

  [ x  = a
    c1 = p1
    y  = b
    c2 = p2
    c3 = p3 ]

where

  a, b are of type Ind (individuals),
  p1 is a proof of man(a),
  p2 is a proof of donkey(b),
  p3 is a proof of own(a,b).
Note that the record may have additional fields and still be of this type.
The types man(x), donkey(y), own(x,y) are dependent types of proofs
(in a convenient but not quite exact abbreviatory notation; we will give a
more precise account of dependencies within record types in Section 3). The
use of types of proofs for what in other theories would be called propositions
is often referred to as the notion of "propositions as types". Exactly what
type man(x) is depends on which individual you choose in your record to
be labelled by x. If the individual a is chosen then the type is the type
of proofs that a is a man. If another individual d is chosen then the type
is the type of proofs that d is a man, and so on. What is a proof? Martin-Löf
considers proofs to be objects rather than arguments or texts. For
non-mathematical propositions proofs can be regarded as situations or events. For
useful discussion of this see (Ranta 1994, p. 53ff). We discuss it in more
detail in (Cooper 2005a).
There is an obvious correspondence between this record type and a discourse
representation structure (DRS) as characterised in (Kamp and Reyle
1993). The characterisation of what it means for a record to be of this type
corresponds in an obvious way to the standard embedding semantics for such
a DRS which Kamp and Reyle provide.
Records (and record types) are recursive in the sense that the value corresponding
to a label in a field can be a record (or record type).² For example,

  r = [ f = [ f = [ ff = a,  gg = b ],  g = c ],
        g = [ h = [ g = a,  h = d ] ] ]

is of type

  R = [ f : [ f : [ ff : T1,  gg : T2 ],  g : T3 ],
        g : [ h : [ g : T1,  h : T4 ] ] ]

given that a : T1, b : T2, c : T3 and d : T4. We can use path-names in records
and record types to designate values in particular fields, e.g.

  r.f = [ f = [ ff = a,  gg = b ],  g = c ]

  R.f.f.ff = T1
The recursive nature of records and record types will be important later in the
paper when we use record types to correspond to linguistic feature structures.
Another important aspect of the type theory we are using is that types
themselves can also be treated as objects.³ A simple example of how this
can be exploited is the following representation for "a girl believes that a man
owns a donkey". This is a simplified version of the treatment discussed in
(Cooper 2005a).

  [ x  : Ind
    c1 : girl(x)
    c2 : believe(x, [ y  : Ind
                      c3 : man(y)
                      z  : Ind
                      c4 : donkey(z)
                      c5 : own(y,z) ]) ]
The treatment of types as first-class objects in this way is a feature which
this type theory has in common with situation theory and it is an important
component in allowing us to incorporate analyses from situation semantics in
our type-theoretical treatment.
The theory of records and record types is embedded in a general type theory.
This means that we have functions and function types available, giving us
a version of the λ-calculus. We can thus use Montague's techniques for
compositional interpretation. For example, we can interpret the common noun
donkey as a function which maps records r of the type [ x : Ind ] (i.e. records
which introduce an individual labelled with the label x) to a record type
dependent on r. We notate the function as follows:

  λr : [ x : Ind ] ( [ c : donkey(r.x) ] )
The type of this function is

  P = ([ x : Ind ]) → RecType

This corresponds to Montague's type ⟨e, t⟩ (the type of functions from
individuals (entities) to truth-values). In place of individuals we use records
introducing individuals with the label x, and in place of truth-values we use
record types which, as we have seen above, correspond to an intuitive notion
of proposition (in particular a proposition represented by a DRS). Using the
power of the λ-calculus we can treat determiners Montague-style as functions
which take two arguments of type P and return a record type. For example,
we represent the indefinite article by
  λR1 : ([ x : Ind ]) → RecType
  λR2 : ([ x : Ind ]) → RecType
  ( [ par   : [ x : Ind ]
      restr : R1 @ par
      scope : R2 @ par ] )
Here we use F @ a to represent the result of applying function F to argument
a.
The type theory includes dependent function types. These can be used to
give a classical treatment of universal quantification corresponding to DRT's
"⇒". For example, an interpretation of "every man owns a donkey" can be the
following record type:

  [ f : ((r : [ x  : Ind
                c1 : man(x) ]) →
         [ y  : Ind
           c2 : donkey(y)
           c3 : own(r.x,y) ]) ]
Functions of the type

(r : [ x  : Ind
       c1 : man(x) ])  →  [ y  : Ind
                            c2 : donkey(y)
                            c3 : own(r.x, y) ]

map records r of type

[ x  : Ind
  c1 : man(x) ]

to records of type

[ y  : Ind
  c2 : donkey(y)
  c3 : own(r.x, y) ]
Our interpretation of every man owns a donkey requires that there exist a
function of this type. Why do we use the record type with the label f rather
than the function type itself as the interpretation of the sentence? One reason
is to achieve a uniform treatment where the interpretation of a sentence is
always a record type. Another reason is that the label gives us a handle which
can be used to anaphorically refer to the function. This can, for example, be
exploited in so-called paycheck examples (Karttunen 1969) such as Everybody
receives a paycheck. Not everybody pays it into the bank immediately, though.
The final notion we will introduce which is important for the modelling
of HPSG typed feature structures as record types is that of manifest field.
16 Robin Cooper
This notion is introduced in (Coquand et al. 2004). It builds on the notion
of singleton type. If a : T, then Ta is a singleton type and b : Ta iff b = a. A
manifest field in a record type is one whose type is a singleton type, e.g.

[ x : Ta ]

written for convenience as

[ x=a : T ]

This notion allows record types to be progressively instantiated, i.e. intu-
itively, for values to be specified within a record type. A record type that only
contains manifest fields is completely instantiated and there will be exactly
one record of that type. We will allow dependent singleton types, where the a
in Ta can be represented by a path in a record type. Manifest fields are important
for the modelling of HPSG-style unification in type theory with records.
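The notions of record, record type and manifest field can be given a quick illustrative rendering in code. The encoding below (records as Python dicts, basic types as strings, singleton types as tagged triples, and an assumed domain for Ind) is a convention of this sketch, not the representation used in the TTR implementation.

```python
# A toy model of records and record types (an illustrative sketch, not
# the actual TTR implementation).  Basic types are strings, singleton
# types are ("singleton", base_type, value) triples, and record types
# are dicts from labels to types.

BASIC_OBJECTS = {
    "Ind": {"m", "j"},          # assumed domain for the basic type Ind
}

def of_type(obj, T):
    """Check obj : T for this toy fragment."""
    if isinstance(T, str):                        # basic type
        return obj in BASIC_OBJECTS.get(T, set())
    if isinstance(T, tuple) and T[0] == "singleton":
        _, base, value = T                        # b : T_a iff b = a and a : T
        return obj == value and of_type(value, base)
    if isinstance(T, dict):                       # record type
        return isinstance(obj, dict) and all(
            label in obj and of_type(obj[label], field_type)
            for label, field_type in T.items()
        )
    return False

# [x : Ind] and its manifest specialisation [x=m : Ind]
T_open     = {"x": "Ind"}
T_manifest = {"x": ("singleton", "Ind", "m")}
```

Note that a record may contain more fields than its type mentions, and that a type whose fields are all manifest is satisfied by exactly one record, as the text observes.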
3. Dependent types and the subtype relation
We are now in a position to give more detail about the treatment of depen-
dencies in record types. Dependent types within record types are treated as
pairs consisting of functions and sequences of path-names corresponding to
the arguments required by the functions. Thus the type on p. 12
corresponding to a man owns a donkey is, in official, though less readable,
notation:

[ x  : Ind
  c1 : ⟨λv:Ind(man(v)), ⟨x⟩⟩
  y  : Ind
  c2 : ⟨λv:Ind(donkey(v)), ⟨y⟩⟩
  c3 : ⟨λv:Ind(λw:Ind(own(v, w))), ⟨x, y⟩⟩ ]
This enables us to give scope to dependencies outside the object in which the
dependency occurs. Thus if, on the model of the type for a girl believes that
a man owns a donkey on p. 14, we wish to construct a type corresponding to
a girl believes that she owns a donkey (where she is anaphorically related to
a girl), this can be done as follows:

[ x  : Ind
  c1 : ⟨λv:Ind(girl(v)), ⟨x⟩⟩
  c2 : ⟨λu:Ind(believe(u, [ z  : Ind
                            c4 : ⟨λv:Ind(donkey(v)), ⟨z⟩⟩
                            c5 : ⟨λv:Ind(own(u, v)), ⟨z⟩⟩ ])),
        ⟨x⟩⟩ ]
There are two kinds of path-names which can occur in such types: relative
path-names, which are constructed from labels, ℓ1. ... .ℓn, and absolute path-
names, which refer explicitly to the record, r, in which the path-name is to be
evaluated, r.ℓ1. ... .ℓn. A dependent record type is a set of pairs of the form
⟨ℓ, T⟩ where ℓ is a label and T is either a type, a dependent record type or a
pair consisting of a function and a sequence of path-names as characterized
above. An anchor for a dependent record type T of the form

[ ℓ1 : T1
  ...
  ℓn : Tn ]

is a record, h, such that for each Ti of the form ⟨f, ⟨π1, ..., πm⟩⟩, each πi is either
an absolute path or a path defined in h, and for each Ti which is a dependent
record type, h is also an anchor for Ti. In addition we require that the result
of anchoring T with h as characterized below is well-defined, i.e., that the
anchor provides arguments of appropriate types to functions and provides
objects of appropriate types for the construction of singleton types as required
by the anchoring. The result of anchoring T with h, T[h], is obtained by
replacing

1. each Ti in T of the form ⟨f, ⟨π1, ..., πm⟩⟩ with f(π1[h])...(πm[h]) (where
πi[h] is the value of h.πi if πi is a relative path-name and the value of
πi if πi is an absolute path-name)

2. each Ti in T which is a dependent record type with Ti[h]

3. each basic type which is the value of a path π in T and for which h.π
is defined, with Ta, where a is the value of h.π, i.e. the singleton type
obtained from the basic type and the value of h.π.

A dependent record type T is said to be closed just in case each path which
T requires to be defined in an anchor for T is defined within T itself. It is the
closed dependent record types which belong to our type universe. If T is a
closed dependent record type then r : T if and only if r : T[r].
Let us return to our type above for a girl believes that she owns a donkey.
An anchor for this type is

[ x = m ]

(where m is an object of type Ind) and the result of anchoring the type with
this record is

[ x=m : Ind
  c1  : girl(m)
  c2  : believe(m, [ z  : Ind
                     c4 : ⟨λv:Ind(donkey(v)), ⟨z⟩⟩
                     c5 : ⟨λv:Ind(own(m, v)), ⟨z⟩⟩ ]) ]
Notice that the anchor has no effect on the dependencies with scope within the
argument type corresponding to that she owns a donkey but only on the depen-
dency with scope external to it.
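A simplified rendering of anchoring: dependent fields are (function, path-list) pairs, and anchoring applies the function to the values the anchor supplies for those paths. The sketch handles only relative paths and omits the singleton-type clause (clause 3 above), so the field x is not specialized to a manifest field as it would be in the full definition; the encoding is an assumption of the sketch.

```python
# Anchoring for a toy fragment of dependent record types (a sketch).
# A dependent field is a (function, paths) pair; anchoring applies the
# function to the values the anchor record h provides for those paths.

def resolve(record, path):
    """Follow a dotted path through a record (a nested dict)."""
    for label in path.split("."):
        record = record[label]
    return record

def has_path(record, path):
    try:
        resolve(record, path)
        return True
    except (KeyError, TypeError):
        return False

def anchor(T, h):
    """Compute T[h]: resolve dependent fields whose paths h defines."""
    result = {}
    for label, field in T.items():
        if isinstance(field, tuple):             # dependent field <f, <pi1..pim>>
            f, paths = field
            if all(has_path(h, p) for p in paths):
                result[label] = f(*[resolve(h, p) for p in paths])
            else:
                result[label] = field            # dependency stays unresolved
        elif isinstance(field, dict):            # nested dependent record type
            result[label] = anchor(field, h)
        else:
            result[label] = field
    return result

# "a girl believes that she owns a donkey", with proof types as strings
girl_type = {
    "x":  "Ind",
    "c1": (lambda v: f"girl({v})", ["x"]),
    "c2": (lambda u: {"z": "Ind",
                      "c4": (lambda v: f"donkey({v})", ["z"]),
                      "c5": (lambda v: f"own({u},{v})", ["z"])}, ["x"]),
}
anchored = anchor(girl_type, {"x": "m"})
```

As in the text, the anchor resolves the externally scoped dependencies on x, while the inner dependencies on z remain as unapplied (function, paths) pairs.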
We now turn our attention to the subtype relation. Record types introduce
a notion of subtype which corresponds to what is known as subsumption in
the unification literature. The subtype relation can be characterized model-
theoretically as follows:

If T1 and T2 are types, then T1 is a subtype of T2 (T1 ⊑ T2) just in case

    {a | a :M T1} ⊆ {a | a :M T2}   for all models M
where the right-hand side of this equivalence refers to sets in the sense of
classical set theory and models as defined in (Cooper 2005a). These models
assign sets of objects to the basic types and sets of proofs to proof-types
constructed from predicates with appropriate arguments, e.g. if k is a woman
according to model M then M will assign a non-empty set of proofs to the
type woman(k). Such models correspond to sorted first-order models. If
this notion of subtype is to be computationally useful, we need some way of
computing whether two types stand in the subtype relation without having to
compute the sets of objects which belong to those types in all possible mod-
els. Thus we define another relation ⊑c which is computed without reference
to the models.
The approach taken to this in the implementation TTR is to instantiate
(dependent) record types, R, recursively as an anchor for R introducing ar-
bitrary formal objects guaranteed to be of the appropriate type. Basic types
and types constructed with predicates are instantiated to arbitrary formal ob-
jects guaranteed to be of the type (in the implementation, pairings of gensym
atoms with the type); singleton types are instantiated to the object used to
define the type; record type structures are instantiated to records containing
the instantiations of the types (or anchors for the dependent record types) in
each field; similar instantiations are given for other complex types. Thus the
instantiation of the record type

[ f   : T1
  g=b : T2
  h   : [ i : r1(f,g)
          j : r2(g,f) ] ]

can be represented as:

[ f = a0#T1
  g = b
  h = [ i = a1#r1(a0#T1, b)
        j = a2#r2(b, a0#T1) ] ]
We use Inst(T) to represent such an instantiation of type T. T1 ⊑c T2 just in
case Inst(T1) : T2. One advantage of this approach is that the computation of
the subtype relation will be directly dependent on the of-type relation. If T1
is a record type containing a superset of the fields of the record type T2 then
T1 ⊑c T2, as desired for the modelling of subsumption in unification systems.
Thus, for example,

[ f : T1
  g : T2
  h : T3 ]   ⊑c   [ f : T1
                    g : T2 ]
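For the non-dependent fragment, the computable relation ⊑c can also be sketched directly as a recursive field comparison rather than via Inst; the encoding (basic types as strings, record types as dicts, singleton types as tagged triples) is assumed for illustration only.

```python
# A sketch of the computable subtype check for non-dependent record
# types: basic types are compared by name, a singleton type
# ("singleton", base, value) is a subtype of its base type, and a
# record type with more (or more specific) fields is a subtype of one
# with fewer.

def subtype(T1, T2):
    if isinstance(T1, dict) and isinstance(T2, dict):
        # every field of T2 must be matched by a subtype field in T1
        return all(l in T1 and subtype(T1[l], T2[l]) for l in T2)
    if isinstance(T1, tuple) and T1[0] == "singleton":
        return T1 == T2 or subtype(T1[1], T2)    # T_a is a subtype of T
    return T1 == T2                              # basic types

T_big   = {"f": "T1", "g": "T2", "h": "T3"}
T_small = {"f": "T1", "g": "T2"}
```

The field-superset case of the text falls out directly: T_big has all of T_small's fields and more, so T_big is computed to be a subtype of T_small but not conversely.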
This method of computing the subtype relation appears to be sound with
respect to the models but not necessarily complete since it does not take ac-
count of the logic associated with predicates. For example, later in the paper
we will make use of an equality predicate. The predicate eq is such that
a : eq(T, x, y) iff a = ⟨x, y⟩, x, y : T, and x = y. Now consider that the type

[ x : T
  y : T
  c : r(x) ]

is not a subtype of

[ x : T
  y : T
  c : r(y) ]

whereas according to the model theoretic definition

[ x : T, y : T, c : r(x), d : eq(T,x,y) ]   ⊑   [ x : T, y : T, c : r(y) ]

since anything of the first type must also be of the second type. The instanti-
ation of the first type

[ x = a0#T
  y = a1#T
  c = a2#r(a0#T)
  d = a3#eq(T, a0#T, a1#T) ]

will not, however, be computed as being of the second type unless we take
account of the import of the equality predicate. This is easily fixed, for ex-
ample by normalizing the instantiation so that all arbitrary objects which are
required to be identical are represented by the same symbols, in this case, for
example, substituting a0 for all occurrences of a1:

[ x = a0#T
  y = a0#T
  c = a2#r(a0#T)
  d = a3#eq(T, a0#T, a0#T) ]

This will then have the desired effect on the computation of the subtype rela-
tion. However, there is no guarantee that it will always be possible to give a
complete characterization of the subtype relation if the logic of the predicates
is incomplete. But we should not let this stop us exploiting those inferences
about subtyping which we can draw in a computational implementation.
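The normalization step just described can be sketched as a union-find pass: each eq field merges the equivalence classes of its two arbitrary objects, and every object is then rewritten to its class representative. The encoding below (arbitrary objects as strings, eq fields and predicate applications as tuples) is an assumption of the sketch.

```python
# Normalizing an instantiation so that objects required to be equal by
# eq fields share one symbol (a sketch).  Arbitrary objects are plain
# strings such as "a0#T"; eq fields are ("eq", type, obj1, obj2).

def normalize(inst):
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            x = parent[x]
        return x

    def union(x, y):
        rx, ry = sorted((find(x), find(y)))
        parent[ry] = rx                      # keep the smaller symbol

    # first pass: merge classes demanded by eq fields
    for value in inst.values():
        if isinstance(value, tuple) and value[0] == "eq":
            _, _, a, b = value
            union(a, b)

    # second pass: rewrite every symbol to its representative
    def rewrite(v):
        if isinstance(v, tuple):
            return tuple(rewrite(x) for x in v)
        return find(v) if isinstance(v, str) else v

    return {label: rewrite(v) for label, v in inst.items()}

inst = {
    "x": "a0#T",
    "y": "a1#T",
    "c": ("r", "a0#T"),
    "d": ("eq", "T", "a0#T", "a1#T"),
}
normalized = normalize(inst)
```

After normalization a1#T has been replaced by a0#T throughout, so the field-by-field of-type check sketched earlier can succeed against the second type.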
4. Records as linguistic objects
We will consider linguistic objects to be records. Here is a simple linguistic
object which might correspond to the word man.

[ phon = [man]
  cat  = n
  agr  = [ num  = sg
           gen  = masc
           pers = third ] ]

It is a record with three fields. The field for phon(ology) has as value
a (singleton) list of words (following the traditional HPSG simplifying as-
sumption about phonology). For the cat(egory) field we will use atomic cat-
egories like n(oun), although nothing in our approach excludes the complex
categories normally used in HPSG analyses. We include three agr(eement)
features: num(ber), in this case with the value s(in)g(ular), gen(der), in this
case with the value masc(uline), and pers(on), in this case with the value third
(person).
Not all words correspond to a single linguistic object. For example, the
English word fish can be singular or plural and masculine, feminine or neuter.
This means that there will be six records corresponding to the single record
for man. Here are three of them:

[ phon = [fish]
  cat  = n
  agr  = [ num  = sg
           gen  = neut
           pers = third ] ]

[ phon = [fish]
  cat  = n
  agr  = [ num  = pl
           gen  = neut
           pers = third ] ]

[ phon = [fish]
  cat  = n
  agr  = [ num  = sg
           gen  = masc
           pers = third ] ]
Now let us consider types of linguistic objects. Nouns correspond to ob-
jects which have a phonology, the category n and the agreement features for
number, gender and person. That is, we can define a record type Noun as
follows:

Noun =
[ phon  : Phon
  cat=n : Cat
  agr   : [ num  : Number
            gen  : Gender
            pers : Person ] ]

where:

Phon = [Lex]   (i.e. the type of lists of objects of type Lex)
the, a, fish, man, men, swim, swims, swam, ... : Lex
n, det, np, v, vp, s, ... : Cat
sg, pl : Number
masc, fem, neut : Gender
first, second, third : Person
We can further define types for determiners, verbs and agreement:

Det =
[ phon    : Phon
  cat=det : Cat
  agr     : [ num  : Number
              gen  : Gender
              pers : Person ] ]

V =
[ phon  : Phon
  cat=v : Cat
  agr   : [ num  : Number
            gen  : Gender
            pers : Person ] ]

Agr =
[ num  : Number
  gen  : Gender
  pers : Person ]
Now we can define the type of linguistic objects corresponding to the word
man.

Man =
[ phon=[man] : Phon
  cat=n      : Cat
  agr        : [ num=sg     : Number
                 gen=masc   : Gender
                 pers=third : Person ] ]

This type identifies a unique linguistic object (namely the record correspond-
ing to man which we introduced above). It is a singleton (or fully specified)
type. It is also a subtype (or specification) of Noun in the sense that if an
object is of type Man it is also of type Noun. We define the type for the plural
men in a similar way.

Men =
[ phon=[men] : Phon
  cat=n      : Cat
  agr        : [ num=pl     : Number
                 gen=masc   : Gender
                 pers=third : Person ] ]
The type Fish corresponding to the noun fish is a less specified type, however:

Fish =
[ phon=[fish] : Phon
  cat=n       : Cat
  agr         : [ num        : Number
                  gen        : Gender
                  pers=third : Person ] ]

The objects which are of this type will be the six records which we identified
earlier. Fish is also a subtype of Noun.
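Since Fish fixes phon, cat and pers but leaves num and gen open, its records over the finite value spaces given above can simply be enumerated; this sketch assumes the dict encoding of records used informally in this discussion.

```python
from itertools import product

# Enumerating the linguistic objects of an underspecified type such as
# Fish (a sketch: the value spaces are the finite sets given in the text).

NUMBER = ["sg", "pl"]
GENDER = ["masc", "fem", "neut"]

def fish_records():
    """Yield every record of type Fish: num and gen vary freely."""
    for num, gen in product(NUMBER, GENDER):
        yield {"phon": ["fish"], "cat": "n",
               "agr": {"num": num, "gen": gen, "pers": "third"}}

records = list(fish_records())
```

Two numbers times three genders gives exactly the six records mentioned in the text.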
We can also define two types IndefArt and DefArt corresponding to the
indefinite and definite articles, which have different degrees of specification:

IndefArt =
[ phon=[a] : Phon
  cat=det  : Cat
  agr      : [ num=sg     : Number
               gen        : Gender
               pers=third : Person ] ]

DefArt =
[ phon=[the] : Phon
  cat=det    : Cat
  agr        : [ num        : Number
                 gen        : Gender
                 pers=third : Person ] ]

Both of these are subtypes of Det. IndefArt is specified for both number and
person whereas DefArt is only specified for person.
Similar differences in specification arise in verbs:⁴

Swims =
[ phon=[swims] : Phon
  cat=v        : Cat
  agr          : [ num=sg     : Number
                   gen        : Gender
                   pers=third : Person ] ]

Swim =
[ phon=[swim] : Phon
  cat=v       : Cat
  agr         : [ num=pl     : Number
                  gen        : Gender
                  pers=third : Person ] ]

Swam =
[ phon=[swam] : Phon
  cat=v       : Cat
  agr         : [ num        : Number
                  gen        : Gender
                  pers=third : Person ] ]

These three types are all subtypes of V.
5. A type theoretical approach to unification phenomena
The types that we introduced in Section 4 lay the basis for a kind of unifica-
tion phenomenon in language which has been discussed in the classical lit-
erature on unification approaches to natural language grammar (e.g. Shieber
1986). The sentence (1) is underspecified with respect to number.

(1) The fish swam.

Note, however, that either all the words are singular or all the words are plural.
It cannot be the case that fish is regarded as singular while swam is regarded
as plural, for example. This is because there are requirements that the de-
terminer and the noun agree in number and that the subject noun-phrase and
the verb agree in number. In a unification-based grammar this is expressed
by requiring that the number features of the relevant phrases unify. In the
terms of our previous discussion it means that there are two linguistic objects
corresponding to (1) rather than eight. Note that the sentences in (2) are all
singular.

(2) a. a fish swam.
    b. the man swam.
    c. the fish swims.

However, the source of the singularity is different in each case. It is the
fact that a, man, and swims respectively are specified for singular, together
with the requirements that all the number features unify, which have as a con-
sequence that all the words are specified for singular in the single linguistic
object corresponding to each of these sentences. Unification is regarded as
a useful tool in linguistic analysis because it reflects the lack of direc-
tionality of the agreement phenomenon, that is, as long as one of the words
is specified for number they all have to be specified for the same number.
Unification is traditionally regarded as partial, that is, it can fail, and this is
used to explain why the strings of words in (3) are not sentences of English,
that is, they do not correspond to linguistic objects allowed by the grammar
of English.

(3) a. *a fish swim.
    b. *the man swim.
    c. *a men swims.
On our type theoretical view the intuitive notion of unification is related
to meet (as in the meet, or conjunction, of two types) and equality. In the
definition of our type theory in Cooper (2005b) we introduce meet-types in
the following way:

If T1 and T2 are types, then T1 ∧ T2 is also a type.
a : T1 ∧ T2 iff a : T1 and a : T2

If T1 and T2 are record types then there will always be a record type (not a
meet) T3 which is equivalent to T1 ∧ T2 (in the sense that a : T3 iff a : T1 ∧ T2).
Let us consider some examples:

[ f : T1 ]  ∧  [ g : T2 ]   ≈   [ f : T1
                                  g : T2 ]

[ f : T1 ]  ∧  [ f : T2 ]   ≈   [ f : T1 ∧ T2 ]
Below we present some informal pseudocode for a function μ which will sim-
plify meets of record types, returning an equivalent record type. The algo-
rithm is similar in essential respects to the graph unification algorithm used in
classical implementations of feature based grammar systems (Shieber 1986).
One important respect in which it differs from the classical unification algo-
rithm is that it never fails. In cases where the corresponding unification would
have failed it will return a record type which is equivalent to the distinguished
empty type ⊥.⁵ Another way in which it differs from the classical unifica-
tion algorithm is that it applies to all types, reducing meets of record types
to non-meet types and recursively performing this reduction within record
types, and otherwise returning the original type. The algorithm that is infor-
mally presented here is a simplification of the one that is implemented in
TTR, which has some additional special cases and also has an additional level
of complication for the handling of dependent types, using the technique of
environments which we referred to in Section 3. In order to understand the
intention of this pseudocode it is important to remember that record types are
considered to be finite sets of ordered pairs (representing the fields) as de-
scribed above in Section 2. When we write Map(T, λℓ, v[φ]) we mean that
each field ⟨ℓ, T′⟩ in T is to be replaced by the result of applying the function
λℓ, v[φ] to ℓ and T′. When we say that T.ℓ is defined we mean that for some
T′, ⟨ℓ, T′⟩ ∈ T. We assume that the type theory will define an incompatibility
relation which holds between certain basic types such that if T1 and T2 are
incompatible then there will be no a such that a : T1 and a : T2. For example,
one might require that all basic types are pairwise incompatible.

μ(T) =
    if for some T1, T2, T = T1 ∧ T2 then
        let
            T1′ = μ(T1)
            T2′ = μ(T2)
        in
            if T1′ ⊑ T2′ then T1′
            elseif T2′ ⊑ T1′ then T2′
            elseif T1′ and T2′ are incompatible, then ⊥
            elseif T1′ and T2′ are record types then
                Map(T1′, λℓ, v[if T2′.ℓ is defined
                               then ⟨ℓ, μ(v ∧ T2′.ℓ)⟩
                               else ⟨ℓ, v⟩])
                ∪ {⟨ℓ, v⟩ ∈ T2′ | T1′.ℓ is not defined}
            else T1′ ∧ T2′
            end
        end
    elseif T is a record type, then
        Map(T, λℓ, v[⟨ℓ, μ(v)⟩])
    else T
    end
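The pseudocode can be rendered as runnable Python for a toy fragment in which basic types are strings, record types are dicts, meets are tagged pairs, and all distinct basic types are pairwise incompatible, with the subtype test simplified to a field comparison. This is an illustrative sketch, not the TTR implementation; in particular it does not collapse a record type containing ⊥ to ⊥ itself (cf. note 5).

```python
# A runnable rendering of the meet-simplification function for a toy
# fragment: basic types are strings, record types are dicts, meets are
# ("meet", T1, T2) pairs, and distinct basic types are incompatible.

BOTTOM = "⊥"

def subtype(T1, T2):
    """Simplified subtype check: field superset for record types."""
    if isinstance(T1, dict) and isinstance(T2, dict):
        return all(l in T1 and subtype(T1[l], T2[l]) for l in T2)
    return T1 == T2

def incompatible(T1, T2):
    """All distinct basic types are pairwise incompatible here."""
    both_basic = isinstance(T1, str) and isinstance(T2, str)
    return both_basic and T1 != T2

def mu(T):
    if isinstance(T, tuple) and T[0] == "meet":
        T1, T2 = mu(T[1]), mu(T[2])
        if subtype(T1, T2):
            return T1
        if subtype(T2, T1):
            return T2
        if incompatible(T1, T2):
            return BOTTOM
        if isinstance(T1, dict) and isinstance(T2, dict):
            # shared labels get the meet of their types, reduced recursively
            merged = {l: mu(("meet", v, T2[l])) if l in T2 else v
                      for l, v in T1.items()}
            # fields of T2 whose label is not defined in T1 are added
            merged.update({l: v for l, v in T2.items() if l not in T1})
            return merged
        return ("meet", T1, T2)          # irreducible meet
    if isinstance(T, dict):
        return {l: mu(v) for l, v in T.items()}
    return T

A = {"f": "T1", "agr": {"num": "sg"}}
B = {"g": "T2", "agr": {"num": "pl"}}
```

As in the text, the function never fails: the clash between sg and pl does not abort the computation but surfaces as ⊥ inside the resulting record type.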
If we know a : T1 and b : T2 and in addition know a = b then we know
a : T1 ∧ T2 and a : μ(T1 ∧ T2). Intuitively, equality of objects corresponds
to meet (or unification) of types. This can be expressed in terms of the
following rules of inference.

a : T1    b : T2    a = b
-------------------------
       a : T1 ∧ T2

       a : T1 ∧ T2
-------------------------
      a : μ(T1 ∧ T2)

We can exploit this in characterizing a type NP which allows noun-phrases
consisting of a determiner and a noun which agree in number.
NP =
[ phon=append(daughters.first.phon, daughters.rest.first.phon) : Phon
  cat=np : Cat
  daughters : [ first : Det
                rest  : [ first    : Noun
                          rest=nil : [Sign] ] ]
  agr=daughters.rest.first.agr : Agr
  c : eq(Number, daughters.first.agr.num, daughters.rest.first.agr.num) ]

In the definition of NP, Sign is to be thought of as a recursively defined type
defined by:

1. if a : Det then a : Sign
2. if a : Noun then a : Sign
3. if a : NP then a : Sign
... (similarly for other word and phrase types)
n. no object is of type Sign except as required by the above clauses

The predicate eq is as defined in Section 3.
The type NP can be further specified to a type where the first daughter is
of type DefArt and the second daughter is of type Man since these types are
subtypes of Det and Noun respectively.

[ phon=append(daughters.first.phon, daughters.rest.first.phon) : Phon
  cat=np : Cat
  daughters : [ first : [ phon=[the] : Phon
                          cat=det    : Cat
                          agr        : [ num        : Number
                                         gen        : Gender
                                         pers=third : Person ] ]
                rest  : [ first    : [ phon=[man] : Phon
                                       cat=n      : Cat
                                       agr        : [ num=sg     : Number
                                                      gen=masc   : Gender
                                                      pers=third : Person ] ]
                          rest=nil : [Sign] ] ]
  agr=daughters.rest.first.agr : Agr
  c : eq(Number, daughters.first.agr.num, daughters.rest.first.agr.num) ]
Note that this type represents that the singularity of the phrase has its source
in man. Similarly we can create the type corresponding to a fish where the
source of the singularity is the determiner.

[ phon=append(daughters.first.phon, daughters.rest.first.phon) : Phon
  cat=np : Cat
  daughters : [ first : [ phon=[a] : Phon
                          cat=det  : Cat
                          agr      : [ num=sg     : Number
                                       gen        : Gender
                                       pers=third : Person ] ]
                rest  : [ first    : [ phon=[fish] : Phon
                                       cat=n       : Cat
                                       agr         : [ num        : Number
                                                       gen        : Gender
                                                       pers=third : Person ] ]
                          rest=nil : [Sign] ] ]
  agr=daughters.rest.first.agr : Agr
  c : eq(Number, daughters.first.agr.num, daughters.rest.first.agr.num) ]
A difference between our record types and feature structures is that the
record types preserve the information about the source of the number information
in these examples whereas in feature structures this information is lost once
the feature structures have been unified. An additional difference is that we
are able to form types corresponding to ungrammatical phrases such as *a
men.

[ phon=append(daughters.first.phon, daughters.rest.first.phon) : Phon
  cat=np : Cat
  daughters : [ first : [ phon=[a] : Phon
                          cat=det  : Cat
                          agr      : [ num=sg     : Number
                                       gen        : Gender
                                       pers=third : Person ] ]
                rest  : [ first    : [ phon=[men] : Phon
                                       cat=n      : Cat
                                       agr        : [ num=pl     : Number
                                                      gen=masc   : Gender
                                                      pers=third : Person ] ]
                          rest=nil : [Sign] ] ]
  agr=daughters.rest.first.agr : Agr
  c : eq(Number, daughters.first.agr.num, daughters.rest.first.agr.num) ]
This is a well-formed type but one that cannot have any elements since sg and
pl are not identical. This type is thus equivalent to ⊥. Such types might be
usefully exploited in robust parsing. Note that even though the type is empty
it contains a great deal of information about the phrase. In particular if our
types were to include information about the meaning or content of a phrase
it might be possible to extract information about the meaning of a phrase
even though it does not actually correspond to any well-formed linguistic
object. This could potentially be exploited in a way similar to that suggested
by Fouvry (2003) for weighted feature structures.
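Detecting that such a type is empty can be sketched as a search for ⊥, or for an eq field relating two manifest fields with distinct values; the flat labels det_num and noun_num below stand in for the dotted daughter paths of the NP type and, like the rest of the encoding, are assumptions of the sketch.

```python
# Detecting emptiness of a reduced record type (a sketch): a type is
# judged empty if ⊥ occurs in it, or if an eq field ("eq", T, p1, p2)
# relates two paths whose types are manifest with distinct values.

BOTTOM = "⊥"

def path_value(T, path):
    """Return the manifest value at a dotted path, or None."""
    for label in path.split("."):
        if not (isinstance(T, dict) and label in T):
            return None
        T = T[label]
    return T[2] if isinstance(T, tuple) and T[0] == "singleton" else None

def is_empty(T, root=None):
    root = T if root is None else root
    if T == BOTTOM:
        return True
    if isinstance(T, tuple) and T[0] == "eq":
        _, _, p1, p2 = T
        v1, v2 = path_value(root, p1), path_value(root, p2)
        return v1 is not None and v2 is not None and v1 != v2
    if isinstance(T, dict):
        return any(is_empty(v, root) for v in T.values())
    return False

# *a men: determiner manifestly singular, noun manifestly plural
a_men = {
    "det_num":  ("singleton", "Number", "sg"),
    "noun_num": ("singleton", "Number", "pl"),
    "c": ("eq", "Number", "det_num", "noun_num"),
}
# a fish: the noun's number is still open, so the constraint can be met
a_fish = {
    "det_num":  ("singleton", "Number", "sg"),
    "noun_num": "Number",
    "c": ("eq", "Number", "det_num", "noun_num"),
}
```

Even when the check succeeds, the full type is still available, which is exactly the property the text suggests exploiting for robust parsing.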
6. Using unification to express generalizations

Unification is also exploited in unification grammars to extract generalities
from individual cases. For example, the noun-phrase agreement phenomenon
that we discussed in Section 5 requires that the agreement features on the
noun be the same as those on the NP. This is an instance of the head feature
principle which requires that the agreement features of the mother be the same
as those of the head daughter. If we identify the head daughter in phrases
then we can extract this principle out by creating a new type HFP which
corresponds to the head feature principle.

HFP =
[ hd-daughter : Sign
  agr=hd-daughter.agr : Agr ]
We also define a new version of the type NP, NP′, which identifies the
head daughter but does not contain the information corresponding to the head
feature principle.

NP′ =
[ phon=append(daughters.first.phon, daughters.rest.first.phon) : Phon
  cat=np : Cat
  hd-daughter=daughters.rest.first : Noun
  daughters : [ first : Det
                rest  : [ first    : Noun
                          rest=nil : [Sign] ] ]
  agr : Agr
  c : eq(Number, daughters.first.agr.num, daughters.rest.first.agr.num) ]

The record type that characterizes noun-phrases is now μ(NP′ ∧ HFP).
7. Conclusions

We have shown how a type theory with records gives us a notion of subtyping
corresponding to subsumption in the unification literature and a way of reduc-
ing meets of record types to record types which is similar to the graph unifi-
cation used in unification-based grammar formalisms. Using record types
instead of feature structures gives us a kind of intensionality which is not
available in feature structures. This intensionality allows us to distinguish
equivalent types which preserve information which is lost in the unification
of feature structures, such as the source of grammatical information associ-
ated with a phrase. This intensionality can also be exploited by associating
empty types with ungrammatical phrases. Such types may contain informa-
tion which could be used in robust parsing. While it may appear odd to refer
to this property of ⊥ as intensionality in the context of parsing, we do so
because it is the same kind of intensionality which is important for our ap-
proach to the semantic analysis of attitudes such as know and believe. The
type theory provides us with a level of abstraction which permits us to make
generalizations across phenomena in natural language that have previously
been treated by separate theories. Finally, this approach to unification is em-
bedded in a rich function-based type theoretical framework which provides
us with the kind of tools that are needed for semantics while at the same time
allowing us to import unification into our semantic analysis.
Acknowledgements
This work was supported by Swedish Research Council projects numbers
2002-4879 Records, types and computational dialogue semantics and 2005-
4211 Library-based Grammar Engineering. I am grateful to Thierry Coquand,
Dan Flickinger, Jonathan Ginzburg, Erhard Hinrichs, Bengt Nordström
and Aarne Ranta for discussion in connection with this work and to an
anonymous referee for this volume for making a number of useful sugges-
tions.
Notes
1. This section contains revised material from (Cooper 2005a).
2. There is a technical sense in which this recursion is non-essential. These records
could also be viewed as non-recursive records whose labels are sequences of
atomic labels. See (Cooper 2005a) for more discussion.
3. In order to do this safely we stratify the types. We define the type system as a
family of type systems of order n for each natural number n. The idea is that
types which are not defined in terms of other types are of order 0 and that types
which are defined in terms of types of order n are of order n+1. We will not
discuss this in detail here but rely on the discussion in (Cooper 2005a). In this
paper we will suppress reference to order in the specification of our types.
4. We are making the simplifying assumption that all the verb forms represented
here are third person.
5. Any record type which has ⊥ in one of its fields will be such that there are no
records of that type and thus the type will be equivalent to ⊥.
References
Barwise, Jon and John Perry
1983 Situations and Attitudes. Bradford Books. Cambridge, Mass.: MIT
Press.
Betarte, Gustavo
1998 Dependent Record Types and Algebraic Structures in Type Theory.
Ph.D. thesis, Department of Computing Science, Göteborg University
and Chalmers University of Technology.
Betarte, Gustavo and Alvaro Tasistro
1998 Extension of Martin-Löf's type theory with record types and subtyp-
ing. In Giovanni Sambin and Jan Smith, (eds.), Twenty-Five Years of
Constructive Type Theory, number 36 in Oxford Logic Guides. Ox-
ford: Oxford University Press.
Cooper, Robin
2005a Austinian truth, attitudes and type theory. Research on Language and
Computation 3: 333–362.
2005b Records and record types in semantic theory. Journal of Logic and
Computation 15(2): 99–112.
Coquand, Thierry, Randy Pollack, and Makoto Takeyama
2004 A logical framework with dependently typed records. Fundamenta
Informaticae XX: 1–22.
Fouvry, Frederik
2003 Constraint relaxation with weighted feature structures. In IWPT 03,
International Workshop on Parsing Technologies. Nancy (France).
Gabbay, Dov and Franz Guenthner, (eds.)
1986 Handbook of Philosophical Logic, Vol. III. Dordrecht: Reidel.
Ginzburg, Jonathan
in prep Semantics and interaction in dialogue. Draft available from
http://www.dcs.kcl.ac.uk/staff/ginzburg/papers.html.
Kamp, Hans and Uwe Reyle
1993 From Discourse to Logic. Dordrecht: Kluwer.
Karttunen, Lauri
1969 Pronouns and variables. In Robert I. Binnick, Alice Davison, Geor-
gia M. Green, and Jerry L. Morgan, (eds.), Papers from the Fifth Re-
gional Meeting of the Chicago Linguistic Society, 108–115. Depart-
ment of Linguistics, University of Chicago, Chicago, Illinois.
Kohlhase, Michael, Susanna Kuschert, and Manfred Pinkal
1996 A type-theoretic semantics for λ-DRT. In Paul Dekker and Mar-
tin Stokhof, (eds.), Proceedings of the 10th Amsterdam Colloquium,
479–498. ILLC, Amsterdam.
Larsson, Staffan
2002 Issue-based Dialogue Management. Ph.D. thesis, University of
Gothenburg.
Mönnich, Uwe
1985 Untersuchungen zu einer konstruktiven Semantik für ein Fragment
des Englischen. Habilitationsschrift, Universität Tübingen.
Montague, Richard
1974 Formal Philosophy: Selected Papers of Richard Montague. New
Haven: Yale University Press. Ed. and with an introduction by Rich-
mond H. Thomason.
Muskens, Reinhard
1996 Combining Montague semantics and discourse representation. Lin-
guistics and Philosophy 19(2): 143–186.
Ranta, Aarne
1994 Type-Theoretical Grammar. Oxford: Clarendon Press.
Sag, Ivan A., Thomas Wasow, and Emily M. Bender
2003 Syntactic Theory: A Formal Introduction. Stanford: CSLI Publica-
tions, 2nd edition.
Shieber, Stuart
1986 An Introduction to Unification-Based Approaches to Grammar. Stan-
ford: CSLI Publications.
Sundholm, Göran
1986 Proof theory and meaning. In Gabbay and Guenthner (1986), chap-
ter 8, 471–506.
Tasistro, Alvaro
1997 Substitution, record types and subtyping in type theory, with appli-
cations to the theory of programming. Ph.D. thesis, Department of
Computing Science, University of Gothenburg and Chalmers Univer-
sity of Technology.
van Benthem, Johan and Alice ter Meulen, (eds.)
1997 Handbook of Logic and Language. North Holland and MIT Press.
van Eijck, Jan and Hans Kamp
1997 Representing discourse in context. In van Benthem and ter Meulen
(1997), chapter 3, 179–237.
One-letter automata: How to reduce k tapes to one
Hristo Ganchev, Stoyan Mihov, and Klaus U. Schulz
Abstract
The class of k-dimensional regular relations has various closure properties that are in-
teresting for practical applications. From a computational point of view, each closure
operation may be realized with a corresponding construction for k-tape finite state
automata. While the constructions for union, Kleene-star and (coordinate-wise) con-
catenation are simple, specific and non-trivial algorithms are needed for relational
operations such as composition, projection, and cartesian product. Here we show
that all these operations for k-tape automata can be represented and computed using
standard operations on conventional one-tape finite state automata plus some trivial
rules for tape manipulation. As a key notion we introduce the concept of a one-letter
k-tape automaton, which yields a bridge between k-tape and one-tape automata. We
achieve a general and efficient implementational framework for k-tape automata.
1. Introduction
Multi-tape nite state automata and especially 2-tape automata have been
widely used in many areas of computer science such as Natural Language
Processing (Karttunen et al. 1996; Mohri 1996; Roche and Schabes 1997)
and Speech Processing (Mohri 1997; Mohri et al. 2002). They provide an uni-
form, clear and computationally efcient framework for dictionary represen-
tation (Karttunen 1994; Mihov and Maurel 2001) and realization of rewrite
rules (Gerdemann and van Noord 1999; Kaplan and Kay 1994; Karttunen
1997), as well as text tokenization, lexicon tagging, part-of-speech disam-
biguation, indexing, ltering and many other text processing tasks (Karttunen
et al. 1996; Mohri 1996; Roche and Schabes 1995, 1997). The properties of
k-tape nite state automata differ signicantly from the corresponding prop-
erties of 1-tape automata. For example, for k 2 the class of relations rec-
ognized by k-tape automata is not closed under intersection and complement.
Moreover there is no general determinization procedure for k-tape automata.
On the other side the class of relations recognized by k-tape nite state au-
tomata is closed under a number of useful relational operations like compo-
sition, cartesian product, projection, inverse etc. It is this latter property that
36 Hristo Ganchev, Stoyan Mihov, and Klaus U. Schulz
makes k-tape automata interesting for many practical applications such as the
ones listed above.
There exist a number of implementations of k-tape finite state automata (Karttunen et al. 1996; Mohri et al. 1998; van Noord 1997). Most of them implement the 2-tape case only. While it is straightforward to realize constructions for k-tape automata that yield union, Kleene-star and concatenation of the recognized relations, the computation of relational operations such as composition, projection and cartesian product is a complex task. This makes the use of the k-tape automata framework tedious and difficult.
We introduce an approach that expresses all relevant operations for k-tape automata using standard operations for classical 1-tape automata plus some straightforward operations for adding, deleting and permuting tapes. In this way we obtain a transparent, general and efficient framework for implementing k-tape automata.
The main idea is to consider a restricted form of k-tape automata where all transition labels have exactly one non-empty component, representing a single letter. The set of all k-tuples of this form represents the basis of the monoid of k-tuples of words under coordinate-wise concatenation. We call this kind of automata one-letter automata. Treating the basis elements as symbols of a derived alphabet, one-letter automata can be considered as conventional 1-tape automata. This gives rise to a correspondence where standard operations for 1-tape automata may be used to replace complex operations for k-tape automata.
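To make the encoding idea concrete, here is a small illustrative sketch (our own, not from the paper; helper names are hypothetical): a one-letter k-tuple is fully determined by a tape index and a letter, and tuples of words factor into such basis elements under coordinate-wise concatenation.

```python
# Illustrative sketch (ours): one-letter k-tuples as basis elements of the
# monoid of k-tuples of words.

def one_letter(k, tape, a):
    """The k-tuple with letter a on the given tape and empty words elsewhere."""
    return tuple(a if i == tape else "" for i in range(k))

def concat(u, v):
    """Coordinate-wise concatenation of two k-tuples of words."""
    return tuple(x + y for x, y in zip(u, v))

# The tuple <"ab", "c"> factors into one-letter basis elements; several
# orders yield the same tuple (the monoid is not free):
t1 = t2 = ("", "")
for f in [one_letter(2, 0, "a"), one_letter(2, 0, "b"), one_letter(2, 1, "c")]:
    t1 = concat(t1, f)
for f in [one_letter(2, 1, "c"), one_letter(2, 0, "a"), one_letter(2, 0, "b")]:
    t2 = concat(t2, f)
print(t1, t2)  # ('ab', 'c') ('ab', 'c')
```

The commutation of letters on distinct tapes is exactly what the later sections exploit.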
The paper is structured as follows. Section 2 provides some formal background. In Section 3 we introduce one-letter k-tape automata. We show that classical algorithms for union, concatenation and Kleene-star over one-letter automata (considered as 1-tape automata) are correct if the result is interpreted as a k-tape automaton. Section 4 is central. A condition is given that guarantees that the intersection of two k-dimensional regular relations is again regular. For k-tape one-letter automata of a specific form that reflects this condition, any classical algorithm for intersecting the associated 1-tape automata can be used for computing the intersection of the regular relations recognized by the automata. Section 5 shows how to implement tape permutations for one-letter automata. Using tape permutations, the inverse relation to a given k-dimensional regular relation can be realized. In a similar way, Section 6 treats tape insertion, tape deletion and projection operations for k-dimensional regular relations. Section 7 shows how to reduce the computation of composition and cartesian product of regular relations to intersections of the kind discussed in Section 4, plus tape insertion and projection. In Section 8 we add some final remarks. We comment on problems that may arise when using k-tape automata and on possible solutions.
2. Formal Background
We assume that the reader is familiar with standard notions from automata theory (see, e.g., Aho et al. 1983; Roche and Schabes 1995). In the sequel, Σ denotes a finite set of symbols called the alphabet, ε denotes the empty word, and Σ_ε := Σ ∪ {ε}. The length of a word w ∈ Σ* is written |w|.
If L₁, L₂ ⊆ Σ* are languages, then
L₁ · L₂ := {w₁w₂ | w₁ ∈ L₁, w₂ ∈ L₂}
denotes their concatenation. Here w₁w₂ is the usual concatenation of words. Recall that ⟨Σ*, ·, ε⟩ is the free monoid with set of generators Σ.
If v = ⟨v₁, …, v_k⟩ and w = ⟨w₁, …, w_k⟩ are two k-tuples of words, then
v · w := ⟨v₁w₁, …, v_k w_k⟩
denotes their coordinate-wise concatenation. With ε̄ we denote the k-tuple ⟨ε, …, ε⟩. The tuple ⟨(Σ*)^k, ·, ε̄⟩ is a monoid that can be described as the k-fold cartesian product of the free monoid ⟨Σ*, ·, ε⟩. As set of generators we consider
Σ_k := {⟨ε, …, ε, a, ε, …, ε⟩ | a ∈ Σ at some position 1 ≤ i ≤ k}.
Note that the latter monoid is not free, due to obvious commutation rules for the generators. For relations R ⊆ (Σ*)^k we define
R⁰ := {ε̄},
R^(i+1) := R^i · R,
R* := ⋃_{i≥0} R^i (Kleene-star).
Let k ≥ 2 and 1 ≤ i ≤ k. The relation
R ⊖ (i) := {⟨w₁, …, w_{i−1}, w_{i+1}, …, w_k⟩ | ∃v ∈ Σ*: ⟨w₁, …, w_{i−1}, v, w_{i+1}, …, w_k⟩ ∈ R}
is called the projection of R to the set of coordinates {1, …, i−1, i+1, …, k}.
If R₁, R₂ ⊆ (Σ*)^k are two relations of the same arity, then
R₁ · R₂ := {v · w | v ∈ R₁, w ∈ R₂}
denotes their coordinate-wise concatenation. If R₁ ⊆ (Σ*)^k and R₂ ⊆ (Σ*)^l are two relations, then
R₁ × R₂ := {⟨w₁, …, w_{k+l}⟩ | ⟨w₁, …, w_k⟩ ∈ R₁, ⟨w_{k+1}, …, w_{k+l}⟩ ∈ R₂}
is the cartesian product of R₁ and R₂, and
R₁ ∘ R₂ := {⟨w₁, …, w_{k+l−2}⟩ | ∃w: ⟨w₁, …, w_{k−1}, w⟩ ∈ R₁ and ⟨w, w_k, …, w_{k+l−2}⟩ ∈ R₂}
is the composition of R₁ and R₂. Further well-known operations for relations are union, intersection, and inversion (k = 2).
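On finite relations these operations are easy to state executably; the following sketch (ours, with hypothetical helper names) mirrors the definitions of cartesian product and composition above.

```python
# Sketch (ours) of the relational operations on finite sets of word tuples.

def cartesian(R1, R2):
    """R1 x R2: concatenate a k-tuple from R1 with an l-tuple from R2."""
    return {u + v for u in R1 for v in R2}

def compose(R1, R2):
    """R1 o R2: join on the last coordinate of R1 and the first of R2."""
    return {u[:-1] + v[1:] for u in R1 for v in R2 if u[-1] == v[0]}

R1 = {("dog", "Hund"), ("cat", "Katze")}
R2 = {("Hund", "chien")}
print(compose(R1, R2))  # {('dog', 'chien')}
```

The point of the paper is, of course, to perform these operations on the automata representing possibly infinite relations, not on the relations themselves.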
Definition 1 The class of k-dimensional regular relations over the alphabet Σ is recursively defined in the following way:
– ∅ and {v} for all v ∈ Σ_k are k-dimensional regular relations.
– If R₁, R₂ and R are k-dimensional regular relations, then so are R₁ ∪ R₂, R₁ · R₂, and R*.
– There are no other k-dimensional regular relations.
Note 1 The class of k-dimensional regular relations over a given alphabet Σ is closed under union, Kleene-star, coordinate-wise concatenation, composition, projection, and cartesian product. For k ≥ 2 the class of regular relations is not closed under intersection, difference and complement. Obviously, every 1-dimensional regular relation is a regular language over the alphabet Σ. Hence, for k = 1 we obtain closure under intersection, difference and complement.
Definition 2 Let k be a positive integer. A k-tape automaton is a six-tuple A = ⟨k, Σ, S, F, s₀, E⟩, where Σ is an alphabet, S is a finite set of states, F ⊆ S is a set of final states, s₀ ∈ S is the initial state, and E ⊆ S × (Σ*)^k × S is a finite set of transitions. A sequence
s₀, a₁, s₁, …, s_{n−1}, a_n, s_n,
where s₀ is the initial state, s_i ∈ S and a_i ∈ (Σ*)^k for i = 1, …, n, is a path for A iff ⟨s_{i−1}, a_i, s_i⟩ ∈ E for 1 ≤ i ≤ n. The k-tape automaton A recognizes v ∈ (Σ*)^k iff there exists a path s₀, a₁, s₁, …, s_{n−1}, a_n, s_n for A such that s_n ∈ F and v = a₁ · a₂ · ⋯ · a_{n−1} · a_n. With R(A) we denote the set of all tuples in (Σ*)^k recognized by A, i.e., R(A) := {v ∈ (Σ*)^k | A recognizes v}.
For a given k-tape automaton A = ⟨k, Σ, S, F, s₀, E⟩ the generalized transition relation E* ⊆ S × (Σ*)^k × S is recursively defined as follows:
1. ⟨s, ⟨ε, …, ε⟩, s⟩ ∈ E* for all s ∈ S,
2. if ⟨s₁, v, s′⟩ ∈ E* and ⟨s′, a, s₂⟩ ∈ E, then ⟨s₁, v · a, s₂⟩ ∈ E*, for all v ∈ (Σ*)^k, a ∈ (Σ*)^k, s₁, s′, s₂ ∈ S.
Clearly, if A is a k-tape automaton, then R(A) = {v ∈ (Σ*)^k | ∃f ∈ F: ⟨s₀, v, f⟩ ∈ E*}.
Note 2 By a well-known generalization of Kleene's Theorem (see Kaplan and Kay (1994)), for each k-tape automaton A the set R(A) is a k-dimensional regular relation, and for every k-dimensional regular relation R′ there exists a k-tape automaton A′ such that R(A′) = R′.
3. Basic Operations for one-letter automata
In this section we introduce the concept of a one-letter automaton. One-letter automata represent a special form of k-tape automata that can be naturally interpreted as one-tape automata over the alphabet Σ_k. We show that basic operations such as union, concatenation, and Kleene-star for one-letter automata can be realized using the corresponding standard constructions for conventional one-tape automata.
Definition 3 A k-tape finite state automaton A = ⟨k, Σ, S, F, s₀, E⟩ is a one-letter automaton iff all transitions e ∈ E have the form
e = ⟨s, ⟨ε, …, ε, a, ε, …, ε⟩, s′⟩ (a at position i)
for some 1 ≤ i ≤ k and a ∈ Σ.
Proposition 1 For every k-tape automaton A we may effectively construct a k-tape one-letter automaton A′ such that R(A′) = R(A).
Proof. First we apply the classical ε-removal procedure in order to construct an ε-free k-tape automaton, which leaves the recognized relation unchanged. Let Â = ⟨k, Σ, S, F, s₀, E⟩ be an ε-free k-tape automaton such that R(A) = R(Â). Then we construct A′ = ⟨k, Σ, S′, F, s₀, E′⟩ using the following algorithm:

S′ = S, E′ = ∅
FOR s ∈ S DO:
  FOR ⟨s, ⟨a₁, a₂, …, a_k⟩, s′⟩ ∈ E DO
    LET I = {i ∈ ℕ | a_i ≠ ε} (I = {i₁, …, i_t});
    LET S″ = {s_{i₁}, …, s_{i_{t−1}}} SUCH THAT S″ ∩ S′ = ∅;
    S′ = S′ ∪ S″;
    E′ = E′ ∪ {⟨s_{i_j}, ⟨ε, …, ε, a_{i_{j+1}}, ε, …, ε⟩, s_{i_{j+1}}⟩ | 0 ≤ j ≤ t−1, where s_{i₀} = s, s_{i_t} = s′, and the letter a_{i_{j+1}} is placed at position i_{j+1}};
  END;
END.

Informally speaking, we split each transition with label ⟨a₁, a₂, …, a_k⟩ with t > 1 non-empty coordinates into t subtransitions, introducing t−1 new intermediate states. □
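The splitting step can be sketched as follows (our own data layout, hypothetical helper names; each label component is assumed to carry at most one letter):

```python
import itertools

# Sketch (ours): a transition whose label has t non-empty coordinates becomes
# t one-letter subtransitions through t-1 fresh intermediate states.

def split_transition(src, label, dst, fresh):
    k = len(label)
    idx = [i for i, a in enumerate(label) if a != ""]
    states = [src] + [next(fresh) for _ in idx[:-1]] + [dst]
    return [
        (states[j], tuple(label[i] if p == i else "" for p in range(k)), states[j + 1])
        for j, i in enumerate(idx)
    ]

fresh = ("q%d" % n for n in itertools.count())
print(split_transition("s", ("a", "b", ""), "t", fresh))
# [('s', ('a', '', ''), 'q0'), ('q0', ('', 'b', ''), 't')]
```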
Corollary 1 If R ⊆ (Σ*)^k is a k-dimensional regular relation, then there exists a k-tape one-letter automaton A such that R(A) = R.
Each k-tape one-letter automaton A over the alphabet Σ can be considered as a one-tape automaton (denoted by Ā) over the alphabet Σ_k. Conversely, every ε-free one-tape automaton over the alphabet Σ_k can be considered as a k-tape automaton over Σ. Formally, this correspondence can be described using two mappings.
Definition 4 The bar mapping maps every k-tape one-letter automaton A = ⟨k, Σ, S, F, s₀, E⟩ to the ε-free one-tape automaton Ā := ⟨Σ_k, S, F, s₀, E⟩. The tilde mapping maps a given ε-free one-tape automaton A′ = ⟨Σ_k, S, F, s₀, E⟩ to the k-tape one-letter automaton Ã′ := ⟨k, Σ, S, F, s₀, E⟩.
Obviously, the bar and tilde mappings are mutually inverse. From a computational point of view, the mappings merely represent a conceptual shift where we use another alphabet for looking at transition labels. States and transitions are not changed.
Definition 5 The mapping
h: Σ_k* → (Σ*)^k : a₁ ⋯ a_n ↦ a₁ · ⋯ · a_n
is called the natural homomorphism between the free monoid ⟨Σ_k*, ·, ε⟩ and the monoid ⟨(Σ*)^k, ·, ε̄⟩.
It is trivial to check that h is in fact a homomorphism. We have the following connection between the two automata mappings and h.
Lemma 1 Let A = ⟨k, Σ, S, F, s₀, E⟩ be a k-tape one-letter automaton. Then
1. (Ā)˜ = A.
2. R(A) = h(L(Ā)).
Furthermore, if A′ is an ε-free one-tape automaton over Σ_k, then (Ã′)¯ = A′.
Thus we obtain a commutative diagram: mapping A directly to R(A) coincides with first passing to Ā, taking the language L(Ā) ⊆ Σ_k*, and then applying h.
We get the following proposition as a direct consequence of Lemma 1 and the homomorphic properties of the mapping h.
Proposition 2 Let A₁ and A₂ be two k-tape one-letter automata. Then we have the following:
1. R(A₁) ∪ R(A₂) = h(L(Ā₁) ∪ L(Ā₂)).
2. R(A₁) · R(A₂) = h(L(Ā₁) · L(Ā₂)).
3. R(A₁)* = h(L(Ā₁)*).
Algorithmic constructions. From Part 1 of Proposition 2 we see the following. Let A₁ and A₂ be two k-tape one-letter automata. To construct a one-letter automaton A such that R(A) = R(A₁) ∪ R(A₂) we may interpret A_i as a one-tape automaton Ā_i (i = 1, 2). We use any union construction for one-tape automata, yielding an automaton A′ such that L(A′) = L(Ā₁) ∪ L(Ā₂). Removing ε-transitions and interpreting the resulting automaton A″ as a k-tape automaton A := Ã″, we obtain a one-letter automaton such that R(A) = R(A₁) ∪ R(A₂). Similarly, Parts 2 and 3 show that classical algorithms for closing conventional one-tape automata under concatenation and Kleene-star can be directly applied to k-tape one-letter automata, yielding algorithms for closing k-tape one-letter automata under concatenation and Kleene-star.
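The conceptual shift between the k-tape view and the one-tape view of a one-letter automaton can be sketched in a few lines (our own encoding; a derived-alphabet symbol is written as a (tape, letter) pair):

```python
# Sketch (ours): a one-letter label viewed as a single symbol of the derived
# alphabet, and back.

def bar(label):
    """One-letter k-tuple -> derived-alphabet symbol (tape, letter)."""
    [(i, a)] = [(i, a) for i, a in enumerate(label) if a != ""]
    return (i, a)

def tilde(symbol, k):
    """Derived-alphabet symbol -> one-letter k-tuple."""
    i, a = symbol
    return tuple(a if j == i else "" for j in range(k))

lab = ("", "x", "")
print(bar(lab))            # (1, 'x')
print(tilde(bar(lab), 3))  # ('', 'x', '') -- the two mappings are inverse
```

States and transitions stay untouched, exactly as the text says; only the label alphabet is reinterpreted.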
4. Intersection of one-letter automata
It is well known that the intersection of two k-dimensional regular relations is not necessarily a regular relation. For example, the relations R₁ = {⟨aⁿbᵏ, cⁿ⟩ | n, k ∈ ℕ} and R₂ = {⟨aˢbⁿ, cⁿ⟩ | s, n ∈ ℕ} are regular, but R₁ ∩ R₂ = {⟨aⁿbⁿ, cⁿ⟩ | n ∈ ℕ} is not regular since its first projection is not a regular language. We now introduce a condition that guarantees that the classical construction for intersecting one-tape automata is correct if used for k-tape one-letter automata. As a corollary we obtain a condition for the regularity of the intersection of two k-dimensional regular relations. This observation will be used later for explicit constructions that yield composition and cartesian product of one-letter automata. A few preparations are needed.
Definition 6 Let v = b₁ … b_n be an arbitrary word over the alphabet Σ, i.e., v ∈ Σ*. We say that the word v′ ∈ Σ* is obtained from v by adding the letter b iff v′ = b₁ … b_j b b_{j+1} … b_n for some 0 ≤ j ≤ n. In this case we also say that v is obtained from v′ by deleting the symbol b.
Proposition 3 Let v = a₁ … a_n ∈ Σ_k* and h(v) = a₁ · a₂ · ⋯ · a_n = ⟨w₁, …, w_k⟩. Let also a = ⟨ε, …, b, …, ε⟩ ∈ Σ_k with b at position i. Then, if v′ is obtained from v by adding the letter a,
h(v′) = ⟨w₁, …, w_{i−1}, w′_i, w_{i+1}, …, w_k⟩,
where w′_i is obtained from w_i by adding the letter b.
Definition 7 For a regular relation R ⊆ (Σ*)^k the coordinate i (1 ≤ i ≤ k) is inessential iff for all ⟨w₁, …, w_k⟩ ∈ R and any v ∈ Σ* we have
⟨w₁, …, w_{i−1}, v, w_{i+1}, …, w_k⟩ ∈ R.
Analogously, if A is a k-tape automaton such that R(A) = R, we say that tape i of A is inessential. Otherwise we call coordinate (tape) i essential.
Definition 8 Let A be a k-tape one-letter automaton and assume that each coordinate in the set I ⊆ {1, …, k} is inessential for R(A). Then A is in normal form w.r.t. I iff for any tape i ∈ I we have:
1. ∀s ∈ S, ∀a ∈ Σ: ⟨s, ⟨ε, …, a, …, ε⟩, s⟩ ∈ E (a at position i),
2. ∀s, s′ ∈ S, ∀a ∈ Σ: (s′ ≠ s) ⟹ ⟨s, ⟨ε, …, a, …, ε⟩, s′⟩ ∉ E (a at position i).
Proposition 4 For any k-tape automaton A and any given set I of inessential coordinates of R(A) we may effectively construct a k-tape one-letter automaton A′ in normal form w.r.t. I such that R(A′) = R(A).
Proof. Let A = ⟨k, Σ, S, F, s₀, E⟩. Without loss of generality we can assume that A is in one-letter form (Proposition 1). To construct A′ = ⟨k, Σ, S, F, s₀, E′⟩ we use the following algorithm:

E′ = E
FOR s ∈ S DO
  FOR i ∈ I DO
    FOR a ∈ Σ DO
      IF (⟨s, ⟨ε, …, ε, a, ε, …, ε⟩, s′⟩ ∈ E′) & (s′ ≠ s) THEN
        E′ = E′ \ {⟨s, ⟨ε, …, ε, a, ε, …, ε⟩, s′⟩};
      END;
      E′ = E′ ∪ {⟨s, ⟨ε, …, ε, a, ε, …, ε⟩, s⟩};   (a at position i)
    END;
  END;
END.

The algorithm does not change any transition on an essential tape. Transitions between distinct states that affect an inessential tape in I are erased. For each state we add loops with all symbols from the alphabet for the inessential tapes in I. The correctness of the above algorithm follows from the fact that for any inessential tape i ∈ I we have ⟨w₁, …, w_i, …, w_k⟩ ∈ R(A) iff ⟨w₁, …, ε, …, w_k⟩ ∈ R(A). □
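A set-based sketch of this transformation (our own representation of transitions; helper names are hypothetical):

```python
# Sketch (ours) of the normal-form construction: on every inessential tape
# i in I, erase transitions between distinct states and add a loop for every
# letter at every state.

def tape_of(label):
    return next(i for i, a in enumerate(label) if a != "")

def normal_form(states, trans, k, I, alphabet):
    keep = {(s, lab, t) for (s, lab, t) in trans
            if tape_of(lab) not in I or s == t}
    loops = {(s, tuple(a if j == i else "" for j in range(k)), s)
             for s in states for i in I for a in alphabet}
    return keep | loops

trans = {("p", ("a", ""), "q"), ("p", ("", "b"), "q")}
print(sorted(normal_form({"p", "q"}, trans, 2, {1}, {"b"})))
```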
Corollary 2 Let R ⊆ (Σ*)^k be a regular relation with a set I of inessential coordinates. Then there exists a k-tape one-letter automaton A in normal form w.r.t. I such that R(A) = R.
The following property of k-tape automata in normal form will be useful when proving Lemma 2.
Proposition 5 Let A = ⟨k, Σ, S, F, s₀, E⟩ be a k-tape one-letter automaton in normal form w.r.t. the set of inessential coordinates I. Let i₀ ∈ I and let v = a₁ … a_n ∈ L(Ā). Then for any a = ⟨ε, …, b, …, ε⟩ ∈ Σ_k with b at position i₀ and any word v′ ∈ Σ_k* obtained from v by adding a we have v′ ∈ L(Ā).
Proof. The condition that the automaton A is in normal form w.r.t. I yields that for all s ∈ S the transition ⟨s, a, s⟩ is in E, which proves the proposition. □
Now we are ready to formulate and prove the following sufficient condition for the regularity of the intersection of two regular relations. With K we denote the set of coordinates {1, …, k}.
Lemma 2 For i = 1, 2, let A_i be a k-tape one-letter automaton and let I_i ⊆ K denote a given set of inessential coordinates for A_i. Let A_i be in normal form w.r.t. I_i (i = 1, 2). Assume that |K \ (I₁ ∪ I₂)| ≤ 1, which means that there exists at most one common essential tape for A₁ and A₂. Then R(A₁) ∩ R(A₂) is a regular k-dimensional relation. Moreover, R(A₁) ∩ R(A₂) = h(L(Ā₁) ∩ L(Ā₂)).
Proof. It is obvious that h(L(Ā₁) ∩ L(Ā₂)) ⊆ R(A₁) ∩ R(A₂): if a₁ … a_n ∈ L(Ā₁) ∩ L(Ā₂), then by Lemma 1 we have a₁ · ⋯ · a_n ∈ R(A₁) ∩ R(A₂). We give a detailed proof of the other direction, showing that
R(A₁) ∩ R(A₂) ⊆ h(L(Ā₁) ∩ L(Ā₂)).
For the proof the reader should keep in mind that the transition labels of the automata A_i (i = 1, 2) are elements of Σ_k, which means that the sum of the lengths of the words representing the components is exactly 1.
Let ⟨w₁, w₂, …, w_k⟩ ∈ R(A₁) ∩ R(A₂). Let j₀ ∈ K be a coordinate such that for each j ∈ K with j ≠ j₀ we have j ∈ I₁ or j ∈ I₂. Let E₁ = K \ I₁. Recall that for i ∈ E₁ with i ≠ j₀ always i ∈ I₂, i.e., i is an inessential tape for A₂. Then by the definition of inessential tapes the tuples ⟨w′₁, …, w′_k⟩ and ⟨w″₁, …, w″_k⟩, where
w′_i = ε if i ∈ I₁, and w′_i = w_i if i ∈ E₁;
w″_i = ε if i ∈ E₁ and i ≠ j₀, and w″_i = w_i otherwise,
are in R(A₁) and R(A₂), respectively. Then there are words
v′ = a′₁ … a′_n ∈ L(Ā₁),
v″ = a″₁ … a″_m ∈ L(Ā₂)
such that h(v′) = ⟨w′₁, …, w′_k⟩ and h(v″) = ⟨w″₁, …, w″_k⟩.
Note that n = Σ_{i=1..k} |w′_i| and m = Σ_{i=1..k} |w″_i|. Furthermore, w_{j₀} = w′_{j₀} = w″_{j₀}. Let l = |w_{j₀}|.
We now construct a word a₁ a₂ … a_r ∈ L(Ā₁) ∩ L(Ā₂) such that a₁ · a₂ · ⋯ · a_r = ⟨w₁, …, w_k⟩, which imposes that r = n + m − l. Each letter a_i is obtained by copying a suitable letter from one of the sequences a′₁ … a′_n and a″₁ … a″_m. In order to control the selection, we use the pair of indices t′_i, t″_i (0 ≤ i < n + m − l), which can be considered as pointers into the two sequences. The definition of t′_i, t″_i and a_i proceeds inductively in the following way. Let t′₀ = t″₀ := 1. Assume that t′_i and t″_i are defined for some 0 ≤ i < n + m − l. We show how to define a_{i+1} and the indices t′_{i+1} and t″_{i+1}. We distinguish four cases:
1. if t′_i = n+1 and t″_i = m+1 we stop; else
2. if a′_{t′_i} = ⟨ε, …, b, …, ε⟩ with b at some position j ≠ j₀, then a_{i+1} := a′_{t′_i}, t′_{i+1} := t′_i + 1, t″_{i+1} := t″_i;
3. if a′_{t′_i} = ⟨ε, …, b, …, ε⟩ with b at position j₀, or t′_i = n+1, and a″_{t″_i} = ⟨ε, …, c, …, ε⟩ with c at some position j ≠ j₀, then a_{i+1} := a″_{t″_i}, t′_{i+1} := t′_i, and t″_{i+1} := t″_i + 1;
4. if a′_{t′_i} = a″_{t″_i} = ⟨ε, …, b, …, ε⟩ with b at position j₀ for some b ∈ Σ, then a_{i+1} := a′_{t′_i}, t′_{i+1} := t′_i + 1 and t″_{i+1} := t″_i + 1.
From an intuitive point of view, the definition yields a kind of zig-zag construction. We always proceed in one sequence until we come to a transition that affects coordinate j₀. At this point we continue using the other sequence. Once we have in both sequences a transition that affects j₀, we enlarge both indices. From w′_{j₀} = w″_{j₀} = w_{j₀} it follows immediately that the recursive definition stops exactly when i + 1 = n + m − l. In fact, the subsequences of a′₁ … a′_n and a″₁ … a″_m from which w_{j₀} is obtained must be identical.
Using induction on 0 ≤ i ≤ n + m − l we now prove that the word a₁ … a_i is obtained from a′₁ … a′_{t′_i − 1} by adding letters in Σ_k which have a non-ε symbol in an inessential coordinate for R(A₁). The base of the induction is obvious. Let the statement be true for some 0 ≤ i < n + m − l. We prove it for i + 1. The word a₁ … a_i a_{i+1} is obtained from a₁ … a_i by adding the letter a_{i+1} = ⟨ε, …, b, …, ε⟩ with b at some position j, and according to the induction hypothesis a₁ … a_i is obtained from a′₁ … a′_{t′_i − 1} by adding letters with a non-ε symbol in a coordinate in I₁. If j ∈ E₁ (Cases 2 and 4), then a_{i+1} = a′_{t′_i}, t′_{i+1} = t′_i + 1 and t′_{i+1} − 1 = t′_i; hence a₁ … a_i a_{i+1} is obtained from a′₁ … a′_{t′_{i+1} − 1} by adding letters satisfying the above condition. On the other hand, if j ∈ I₁ (Case 3), we have a_{i+1} = a″_{t″_i} and t′_{i+1} := t′_i, which means that a′₁ … a′_{t′_{i+1} − 1} = a′₁ … a′_{t′_i − 1} and a_{i+1} is a letter satisfying the condition. Thus a₁ … a_i a_{i+1} is obtained from a′₁ … a′_{t′_{i+1} − 1} by adding letters which have a non-ε symbol on an inessential tape for A₁, which means that the statement is true for i + 1.
Analogously we prove for 0 ≤ i ≤ n + m − l that a₁ … a_i is obtained from a″₁ … a″_{t″_i − 1} by adding letters in Σ_k which have a non-ε symbol in an inessential coordinate for R(A₂).
By Proposition 5, a₁ … a_{n+m−l} ∈ L(Ā₁) and a₁ … a_{n+m−l} ∈ L(Ā₂). From Proposition 3 we obtain that a₁ · ⋯ · a_{n+m−l} = ⟨u₁, …, u_k⟩, where u_i = w′_i if i ∈ E₁ and u_i = w″_i otherwise. But now, recalling the definition of w′_i and w″_i, we obtain that a₁ · ⋯ · a_{n+m−l} = ⟨w₁, …, w_k⟩, which we had to prove. □
Corollary 3 If R₁ ⊆ (Σ*)^k and R₂ ⊆ (Σ*)^k are two k-dimensional regular relations with at most one common essential coordinate i (1 ≤ i ≤ k), then R₁ ∩ R₂ is a k-dimensional regular relation.
Algorithmic construction. From Lemma 2 we see the following. Let A₁ and A₂ be two k-tape one-letter automata with at most one common essential tape i. Assume that both automata are in normal form w.r.t. their sets of inessential tapes. Then the relation R(A₁) ∩ R(A₂) is recognized by any ε-free 1-tape automaton A′ accepting L(Ā₁) ∩ L(Ā₂), treating A′ as a k-tape one-letter automaton A = Ã′.
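Concretely, the standard product construction suffices once labels are compared as opaque symbols; a sketch (ours, with a minimal automaton encoding as (start, finals, transitions)):

```python
# Sketch (ours) of the classical product construction, applied to one-letter
# automata whose labels are treated as plain symbols of the derived alphabet.

def intersect(A1, A2):
    (s1, F1, E1), (s2, F2, E2) = A1, A2
    start = (s1, s2)
    trans, todo, seen = set(), [start], {start}
    while todo:
        p, q = todo.pop()
        for (u, lab, v) in E1:
            if u != p:
                continue
            for (x, lab2, y) in E2:
                if x == q and lab2 == lab:
                    trans.add(((p, q), lab, (v, y)))
                    if (v, y) not in seen:
                        seen.add((v, y))
                        todo.append((v, y))
    finals = {(p, q) for p in F1 for q in F2 if (p, q) in seen}
    return (start, finals, trans)

A1 = ("0", {"1"}, {("0", ("a", ""), "1")})
A2 = ("x", {"y"}, {("x", ("a", ""), "y")})
print(intersect(A1, A2))
```

By Lemma 2 this is sound only when the two automata are in normal form and share at most one essential tape.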
5. Tape permutation and inversion for one-letter automata
In the sequel, let S_k denote the symmetric group on k elements.
Definition 9 Let R ⊆ (Σ*)^k be a regular relation and let σ ∈ S_k. The permutation of coordinates induced by σ, σ(R), is defined as
σ(R) := {⟨w_{σ⁻¹(1)}, …, w_{σ⁻¹(i)}, …, w_{σ⁻¹(k)}⟩ | ⟨w₁, …, w_k⟩ ∈ R},
i.e., the word w_i is moved to position σ(i).
Proposition 6 For a given k-tape one-letter automaton A = ⟨k, Σ, S, F, s₀, E⟩, let σ(A) := ⟨k, Σ, S, F, s₀, σ(E)⟩, where
σ(E) := {⟨s, ⟨ε, …, a, …, ε⟩, s′⟩ (a at position σ(i)) | ⟨s, ⟨ε, …, a, …, ε⟩, s′⟩ ∈ E (a at position i)}.
Then R(σ(A)) = σ(R(A)).
Proof. Using induction over the construction of E* and σ(E)* we prove that for all s′ ∈ S and ⟨w₁, …, w_k⟩ ∈ (Σ*)^k we have
⟨s₀, ⟨w₁, …, w_k⟩, s′⟩ ∈ E* ⟺ ⟨s₀, ⟨w_{σ⁻¹(1)}, …, w_{σ⁻¹(k)}⟩, s′⟩ ∈ σ(E)*.
(⟹) The base of the induction is obvious since ⟨s₀, ⟨ε, …, ε⟩, s₀⟩ ∈ E* ∩ σ(E)*. Now suppose that there are transitions
⟨s₀, ⟨w₁, …, w_{i−1}, w′_i, w_{i+1}, …, w_k⟩, s⟩ ∈ E*,
⟨s, ⟨ε, …, a, …, ε⟩, s′⟩ ∈ E (a at position i).
Then, by the induction hypothesis, ⟨s₀, ⟨w_{σ⁻¹(1)}, …, w′_i, …, w_{σ⁻¹(k)}⟩, s⟩ ∈ σ(E)*, where w′_i occurs at position σ(i). The definition of σ(E) shows that ⟨s, ⟨ε, …, a, …, ε⟩, s′⟩ ∈ σ(E) with a at position σ(i). Hence
⟨s₀, ⟨w_{σ⁻¹(1)}, …, w′_i a, …, w_{σ⁻¹(k)}⟩, s′⟩ ∈ σ(E)*,
which we had to prove.
(⟸) Follows analogously to (⟹). □
Corollary 4 Let R ⊆ (Σ*)^k be a k-dimensional regular relation and σ ∈ S_k. Then σ(R) is also a k-dimensional regular relation.
Algorithmic construction. From Proposition 6 we see the following. Let A be a 2-tape one-letter automaton. If σ denotes the transposition (1, 2), then the automaton σ(A) defined as above recognizes the inverse relation R(A)⁻¹.
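On the label level this is a one-line rewrite; a sketch (ours, with the permutation given as a dict on zero-based tape indices):

```python
# Sketch (ours): move the letter on tape i to tape sigma(i); applying the
# transposition to every label of a 2-tape automaton inverts the relation.

def permute_label(label, sigma):
    out = [""] * len(label)
    for i, a in enumerate(label):
        if a != "":
            out[sigma[i]] = a
    return tuple(out)

swap = {0: 1, 1: 0}  # the transposition (1 2), zero-based
print(permute_label(("a", ""), swap))  # ('', 'a')
```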
6. Tape insertion, tape deletion and projection
Definition 10 Let R ⊆ (Σ*)^k be a k-dimensional regular relation. We define the insertion of an inessential coordinate at position i (denoted R ⊕ (i)) as
R ⊕ (i) := {⟨w₁, …, w_{i−1}, v, w_i, …, w_k⟩ | ⟨w₁, …, w_{i−1}, w_i, …, w_k⟩ ∈ R, v ∈ Σ*}.
Proposition 7 Let A = ⟨k, Σ, S, F, s₀, E⟩ be a k-tape one-letter automaton. Let A′ := ⟨k+1, Σ, S, F, s₀, E′⟩, where
E′ := {⟨s, ⟨ε, …, a, …, ε, ε⟩, s′⟩ | ⟨s, ⟨ε, …, a, …, ε⟩, s′⟩ ∈ E, a at position i ≤ k}
∪ {⟨s, ⟨ε, …, ε, a⟩, s⟩ | s ∈ S, a ∈ Σ, a at position k+1}.
Then R(A′) = R(A) ⊕ (k+1).
Proof. First, using induction on the construction of E′* we prove that for all s′ ∈ S and ⟨w₁, …, w_k, w_{k+1}⟩ ∈ (Σ*)^{k+1} we have
⟨s₀, ⟨w₁, …, w_k, w_{k+1}⟩, s′⟩ ∈ E′* ⟺ ⟨s₀, ⟨w₁, …, w_k, ε⟩, s′⟩ ∈ E′*.
(⟹) The base of the induction is obvious. Assume there are transitions
⟨s₀, ⟨w₁, …, w_{i−1}, w′_i, w_{i+1}, …, w_k, w_{k+1}⟩, s⟩ ∈ E′*,
⟨s, ⟨ε, …, a, …, ε⟩, s′⟩ ∈ E′ (a at position i).
First assume that i ≤ k. By the induction hypothesis,
⟨s₀, ⟨w₁, …, w_{i−1}, w′_i, w_{i+1}, …, w_k, ε⟩, s⟩ ∈ E′*.
Using the definition of E′ we obtain
⟨s₀, ⟨w₁, …, w_{i−1}, w′_i a, w_{i+1}, …, w_k, ε⟩, s′⟩ ∈ E′*.
If i = k+1, then s = s′. We may directly use the induction hypothesis to obtain ⟨s₀, ⟨w₁, …, w_k, ε⟩, s′⟩ ∈ E′*.
(⟸) Let ⟨s₀, ⟨w₁, …, w_k, ε⟩, s′⟩ ∈ E′* and let w_{k+1} = v₁ … v_n ∈ Σ* where v_i ∈ Σ. The definition of E′ shows that for all v_i (1 ≤ i ≤ n) there exists a transition ⟨s′, ⟨ε, …, ε, v_i⟩, s′⟩ ∈ E′ with v_i at position k+1. Hence ⟨s₀, ⟨w₁, …, w_k, w_{k+1}⟩, s′⟩ ∈ E′*.
To finish the proof observe that the definition of E′ yields
⟨s₀, ⟨w₁, …, w_k⟩, s′⟩ ∈ E* ⟺ ⟨s₀, ⟨w₁, …, w_k, ε⟩, s′⟩ ∈ E′*. □
Corollary 5 If R ⊆ (Σ*)^k is a regular relation, then R ⊕ (i) is a (k+1)-dimensional regular relation.
Proof. The corollary follows directly from Propositions 7 and 6, having in mind that R ⊕ (i) = π(R ⊕ (k+1)), where π is the cyclic permutation (i, i+1, …, k, k+1) ∈ S_{k+1}. □
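On the transition level, the insertion construction of Proposition 7 can be sketched as follows (ours; transitions as (state, label, state) triples, the transition set assumed non-empty):

```python
# Sketch (ours) of tape insertion at position k+1: pad every label with eps
# and add a loop for every letter of the new tape at every state.

def insert_final_tape(states, trans, alphabet):
    k = len(next(iter(trans))[1])        # arity read off an existing label
    padded = {(s, lab + ("",), t) for (s, lab, t) in trans}
    loops = {(s, ("",) * k + (a,), s) for s in states for a in alphabet}
    return padded | loops

trans = {("p", ("a",), "q")}
print(sorted(insert_final_tape({"p", "q"}, trans, {"x"})))
```

Insertion at an arbitrary position i is then obtained by composing with the cyclic tape permutation, as in the proof of Corollary 5.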
It is well known that the projection of a k-dimensional regular relation is again a regular relation. The following propositions show how to obtain a (k−1)-tape one-letter automaton representing the relation R ⊖ (i) (cf. Section 2) directly from a k-tape one-letter automaton representing the relation R.
Proposition 8 Let A = ⟨k, Σ, S, F, s₀, E⟩ be a k-tape one-letter automaton. Let A′ := ⟨k−1, Σ, S, F, s₀, E′⟩ be the (k−1)-tape automaton where for i ≤ k−1 we have
⟨s, ⟨ε, …, a, …, ε⟩, s′⟩ ∈ E′ ⟺ ⟨s, ⟨ε, …, a, …, ε⟩, s′⟩ ∈ E (a at position i; with k−1 and k components, respectively)
and furthermore
⟨s, ⟨ε, …, ε⟩, s′⟩ ∈ E′ ⟺ ∃a_k ∈ Σ: ⟨s, ⟨ε, …, ε, a_k⟩, s′⟩ ∈ E.
Then R(A′) = R(A) ⊖ (k).
Note 3 The resulting automaton A′ is not necessarily a one-letter automaton because A′ may have some ε-transitions. It can be transformed into a one-letter automaton using a standard ε-removal procedure.
Proof of Proposition 8. It is sufficient to prove that for all ⟨w₁, …, w_{k−1}, w_k⟩ ∈ (Σ*)^k and s′ ∈ S we have
⟨s₀, ⟨w₁, …, w_{k−1}, w_k⟩, s′⟩ ∈ E* ⟺ ⟨s₀, ⟨w₁, …, w_{k−1}⟩, s′⟩ ∈ E′*.
Again we use an induction on the construction of E* and E′*.
(⟹) The base is trivial since ⟨s₀, ⟨ε, …, ε⟩, s₀⟩ ∈ E* and ⟨s₀, ⟨ε, …, ε⟩, s₀⟩ ∈ E′*.
Let ⟨s₀, ⟨w₁, …, w′_j, …, w_k⟩, s⟩ ∈ E* and ⟨s, ⟨ε, …, a, …, ε⟩, s′⟩ ∈ E with a at position j, for some 1 ≤ j ≤ k. First assume that j < k. The induction hypothesis yields ⟨s₀, ⟨w₁, …, w′_j, …, w_{k−1}⟩, s⟩ ∈ E′*. Since ⟨s, ⟨ε, …, a, …, ε⟩, s′⟩ ∈ E′ we have ⟨s₀, ⟨w₁, …, w′_j a, …, w_{k−1}⟩, s′⟩ ∈ E′*.
If j = k, then the induction hypothesis yields ⟨s₀, ⟨w₁, …, w_{k−1}⟩, s⟩ ∈ E′*. We have
⟨s, ⟨ε, …, ε⟩, s′⟩ ∈ E′,
hence ⟨s₀, ⟨w₁, …, w_{k−1}⟩, s′⟩ ∈ E′*.
(⟸) Similarly as (⟹). □
Corollary 6 If R ⊆ (Σ*)^k is a regular relation, then R ⊖ (i) is a (k−1)-dimensional regular relation.
Proof. The corollary follows directly from R ⊖ (i) = (σ⁻¹(R)) ⊖ (k), where σ is the cyclic permutation (i, i+1, …, k) ∈ S_k, together with Proposition 6. □
Algorithmic construction. The constructions given in Propositions 8 and 6, together with an obvious ε-elimination, show how to obtain a one-letter (k−1)-tape automaton A′ for the projection R ⊖ (i), given a one-letter k-tape automaton A recognizing R.
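Deleting the last tape is just label truncation (sketch, ours); letters that sat on the deleted tape leave ε-transitions behind, to be removed afterwards:

```python
# Sketch (ours) of tape deletion: truncate every label by its last component.

def delete_last_tape(trans):
    return {(s, lab[:-1], t) for (s, lab, t) in trans}

trans = {("p", ("a", ""), "q"), ("q", ("", "b"), "r")}
print(sorted(delete_last_tape(trans)))
# [('p', ('a',), 'q'), ('q', ('',), 'r')]  -- the second is an eps-transition
```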
7. Composition and cartesian product of regular relations
We now show how to construct composition and cartesian product (cf. Section 2) of regular relations via automata constructions for standard 1-tape automata.
Lemma 3 Let R₁ ⊆ (Σ*)^{n₁} and R₂ ⊆ (Σ*)^{n₂} be regular relations. Then the composition R₁ ∘ R₂ is an (n₁ + n₂ − 2)-dimensional regular relation.
Proof. Using Corollary 5 we see that the relations
R′₁ := (… ((R₁ ⊕ (n₁+1)) ⊕ (n₁+2)) …) ⊕ (n₁+n₂−1),
R′₂ := (… ((R₂ ⊕ (1)) ⊕ (2)) …) ⊕ (n₁−1)
are (n₁+n₂−1)-dimensional regular relations. Using the definition of ⊕ we see that the essential coordinates of R′₁ are in the set E₁ = {1, 2, …, n₁} and those of R′₂ are in the set E₂ = {n₁, n₁+1, …, n₁+n₂−1}. Therefore R′₁ and R′₂ have at most one common essential coordinate, namely n₁. Corollary 3 shows that R = R′₁ ∩ R′₂ is an (n₁+n₂−1)-dimensional regular relation. Since the coordinates in E₁ (resp. E₂) are inessential for R′₂ (resp. R′₁) we obtain
⟨w′₁, …, w′_{n₁−1}, w, w″_{n₁+1}, …, w″_{n₁+n₂−1}⟩ ∈ R′₁ ∩ R′₂
⟺ ⟨w′₁, …, w′_{n₁−1}, w⟩ ∈ R₁ and ⟨w, w″_{n₁+1}, …, w″_{n₁+n₂−1}⟩ ∈ R₂.
Using the definition of ⊖ and Corollary 6 we obtain that R ⊖ (n₁) is an (n₁+n₂−2)-dimensional regular relation such that
⟨w′₁, …, w′_{n₁−1}, w″_{n₁+1}, …, w″_{n₁+n₂−1}⟩ ∈ R ⊖ (n₁)
⟺ ∃w ∈ Σ*: ⟨w′₁, …, w′_{n₁−1}, w, w″_{n₁+1}, …, w″_{n₁+n₂−1}⟩ ∈ R.
Combining both equivalences we obtain
⟨w′₁, …, w′_{n₁−1}, w″_{n₁+1}, …, w″_{n₁+n₂−1}⟩ ∈ R ⊖ (n₁)
⟺ ∃w ∈ Σ*: ⟨w′₁, …, w′_{n₁−1}, w⟩ ∈ R₁ and ⟨w, w″_{n₁+1}, …, w″_{n₁+n₂−1}⟩ ∈ R₂,
i.e., R ⊖ (n₁) = R₁ ∘ R₂. □
Lemma 4 Let R₁ ⊆ (Σ*)^{n₁} and R₂ ⊆ (Σ*)^{n₂} be regular relations. Then the cartesian product R₁ × R₂ is an (n₁ + n₂)-dimensional regular relation over Σ.
Proof. Similarly as in Lemma 3 we construct the (n₁+n₂)-dimensional regular relations
R′₁ := (… ((R₁ ⊕ (n₁+1)) ⊕ (n₁+2)) …) ⊕ (n₁+n₂),
R′₂ := (… ((R₂ ⊕ (1)) ⊕ (2)) …) ⊕ (n₁).
The coordinates in {1, 2, …, n₁} are inessential for R′₂ and those in {n₁+1, …, n₁+n₂} are inessential for R′₁. Therefore R′₁ and R′₂ have no common essential coordinate and, by Corollary 3, R := R′₁ ∩ R′₂ is an (n₁+n₂)-dimensional regular relation. Using the definition of inessential coordinates and the definition of ⊕ we obtain
⟨w′₁, …, w′_{n₁}, w″_{n₁+1}, …, w″_{n₁+n₂}⟩ ∈ R
⟺ ⟨w′₁, …, w′_{n₁}⟩ ∈ R₁ and ⟨w″_{n₁+1}, …, w″_{n₁+n₂}⟩ ∈ R₂,
which shows that R = R₁ × R₂. □
Algorithmic construction. The constructions described in the above proofs show how to obtain one-letter automata for the composition R₁ ∘ R₂ and for the cartesian product R₁ × R₂ of the regular relations R₁ ⊆ (Σ*)^{n₁} and R₂ ⊆ (Σ*)^{n₂}, given one-letter automata A_i for R_i (i = 1, 2). In more detail, in order to construct an automaton for R₁ ∘ R₂ we
1. add n₂ − 1 final inessential tapes to A₁ and n₁ − 1 initial inessential tapes to A₂ in the way described above (note that the resulting automata are in normal form w.r.t. the new tapes),
2. intersect the resulting automata as conventional one-tape automata over the alphabet Σ_{n₁+n₂−1}, obtaining A,
3. remove the n₁-th tape from A and apply an ε-removal, thus obtaining A′, which is the desired automaton.
In order to construct an automaton for R₁ × R₂ we
1. add n₂ final inessential tapes to A₁ and n₁ initial inessential tapes to A₂ in the way described above,
2. intersect the resulting automata as conventional one-tape automata over the alphabet Σ_{n₁+n₂}, obtaining A, which is the desired automaton.
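The tape bookkeeping of the pad-intersect-project recipe can be checked on finite data; a set-level sketch (ours) for n1 = n2 = 2, with a small finite word set standing in for Σ*:

```python
# Set-level sketch (ours) of the composition recipe: pad, intersect, project.

def pad_last(R, words):    # add an inessential final coordinate
    return {t + (v,) for t in R for v in words}

def pad_first(R, words):   # add an inessential initial coordinate
    return {(v,) + t for t in R for v in words}

def drop(R, i):            # delete coordinate i (1-based)
    return {t[:i - 1] + t[i:] for t in R}

words = {"", "a", "b", "x"}          # finite stand-in for Sigma*
R1, R2 = {("a", "x")}, {("x", "b")}
R = drop(pad_last(R1, words) & pad_first(R2, words), 2)
print(R)  # {('a', 'b')}, i.e., R1 o R2
```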
Finally, we discuss the problem of how to represent identity relations as regular relations. First observe that the automaton A := ⟨2, Σ, S, F, s₀, E⟩, where Σ := {a₁, …, a_n}, S := {s₀, s₁, …, s_n}, F := {s₀} and
E := {⟨s₀, ⟨a_i, ε⟩, s_i⟩ | 1 ≤ i ≤ n} ∪ {⟨s_i, ⟨ε, a_i⟩, s₀⟩ | 1 ≤ i ≤ n},
accepts R(A) = {⟨v, v⟩ | v ∈ Σ*}. The latter relation we denote by Id_{Σ*}.
Proposition 9 Let R₁ be a 1-dimensional regular relation, i.e., a regular language. Then the set Id_{R₁} := {⟨v, v⟩ | v ∈ R₁} is a regular relation. Moreover,
Id_{R₁} = (R₁ ⊕ (2)) ∩ Id_{Σ*}.
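As data, this identity automaton looks as follows (sketch, ours; state names are hypothetical):

```python
# Sketch (ours): the 2-tape automaton for Id over a finite alphabet reads a
# letter on tape 1 and then the same letter on tape 2, returning to s0.

def identity_automaton(alphabet):
    trans = set()
    for a in alphabet:
        trans.add(("s0", (a, ""), "s_" + a))
        trans.add(("s_" + a, ("", a), "s0"))
    return ("s0", {"s0"}, trans)

start, finals, trans = identity_automaton({"a", "b"})
print(("s0", ("a", ""), "s_a") in trans and ("s_a", ("", "a"), "s0") in trans)
# True
```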
8. Conclusion
We introduced the concept of a one-letter k-tape automaton and showed that one-letter automata can be considered as conventional 1-tape automata over an enlarged alphabet. Using this correspondence, standard constructions for union, concatenation, and Kleene-star for 1-tape automata can be directly used for one-letter automata. Furthermore, we have seen that the usual relational operations for k-tape automata can be traced back to the intersection of 1-tape automata plus straightforward operations for adding, permuting and erasing tapes.
We have implemented the presented approach for transducers (2-tape automata) representing rewrite rules. Using it we have successfully realized Bulgarian hyphenation and tokenization.
Still, in real applications the use of one-letter automata comes with some
specic problems, in particular in situations where the composition algorithm
is heavily used. In the resulting automata we sometimes nd a large number
of paths that are equivalent if permutation rules for generators are taken into
account. For example, we might nd three paths with label sequences
a, ), a, ), , b)
, b), a, , ), a, , )
a, , ), , b), a, , ),
all representing the tuple aa, b). In the worst case this may lead to an expo-
nential blow-up of the number of states, compared to the classical construc-
tion for n-tape automaton.
We currently study techniques to get rid of superuous paths. In many
cases, equivalences of the above form can be recognized and used for elimi-
nating states and transitions. The extension and renement of these methods
is one central point of current and future work.
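Whether two label sequences are equivalent under generator permutation can be decided by projecting them onto the tuples they represent; this also suggests one simple normal form. The sketch below is only an illustration of the equivalence, not the elimination techniques the authors have in mind; all names are hypothetical.

```python
def tuple_of(path, n):
    """Project a sequence of generator labels onto the tuple it represents:
    per tape, concatenate the letters the generators write."""
    return tuple("".join(lab[i] for lab in path) for i in range(n))

def canonical(path, n):
    """One possible normal form for a path's label sequence: write the
    represented tuple tape by tape, one generator per letter."""
    out = []
    for i, w in enumerate(tuple_of(path, n)):
        for c in w:
            lab = [""] * n
            lab[i] = c
            out.append(tuple(lab))
    return out

# the three equivalent paths from the text, with eps written as ""
p1 = [("a", ""), ("a", ""), ("", "b")]
p2 = [("", "b"), ("a", ""), ("a", "")]
p3 = [("a", ""), ("", "b"), ("a", "")]
assert tuple_of(p1, 2) == ("aa", "b")
assert canonical(p1, 2) == canonical(p2, 2) == canonical(p3, 2)
```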
54 Hristo Ganchev, Stoyan Mihov, and Klaus U. Schulz
Two aspects of situated meaning
Eleni Kalyvianaki and Yiannis N. Moschovakis
Abstract
We introduce two structural notions of situated meaning for natural language sentences which can be expressed by terms of Montague's Language of Intensional Logic. Using the theory of referential intensions, we define for a sentence at a particular situation its factual content and its local meaning, which express different abstract algorithms that compute its reference at that situation. With the use of characteristic examples, we attempt to show the distinctive roles of these two notions in any theory of meaning and to discuss briefly their relation to indexicality, propositional attitudes and translation.
1. Introduction
If a speaker of the language can rationally believe A and disbelieve B in the same situation, then the sentences A and B do not have the same meaning: they are not synonymous.
The principle is old (Frege 1892), and it has been used both as a test for theories of meaning and a source of puzzles about belief and synonymy. We think that at least some of the puzzles are due to a confusion between two plausible and legitimate but distinct understandings of situated meaning, the factual content and the (referential) local meaning.

Consider, for example, the sentences

A ≡ John loves Mary, and B ≡ John loves her,

in a state (situation) a in which her refers to Mary. They express the same information about the world in state a (they have the same factual content at that state); but they do not have the same meaning in that state, as they are not interchangeable in belief contexts: one may very well believe A but disbelieve B in a, because she does not know that her refers to Mary.

We will give precise, mathematical definitions of factual content and local meaning for the fragments of natural language which can be formalized in the Language of Intensional Logic of Montague (1973), within the mathematical theory of referential intensions; this is a rigorous (algorithmic), structural modeling of meanings for the typed λ-calculus developed in (Moschovakis 2006), and so the article can be viewed as a contribution to the formal logic of meaning. We think, however, that some of our results are relevant to the discussion of these matters in the philosophy of language and in linguistics, and, in particular, to Kaplan's work on the logic of indexicals. We will discuss briefly some of these connections in Section 5.
2. Three formal languages

There are three (related) formal languages that we will deal with: the Language of Intensional Logic LIL of Montague (1973); the Two-sorted Typed λ-calculus Ty₂ of Gallin (1975); and the extension L^λ_ar of Ty₂ by acyclic recursion in (Moschovakis 2006). We describe these briefly in this section, and in the next we summarize equally briefly the theory of referential intensions in L^λ_ar, which is our main technical tool.¹

All three of these languages start with the same three basic types

e : entities,  t : truth values,  s : states,

and, for the interpretation, three fixed, associated, non-empty sets

T_e = the entities,  T_s = the states,  T_t = the truth values = {0, 1, er},  (1)

where 1 stands for truth, 0 for falsity and er for error.²
The types are defined by the recursion³

σ :≡ e | s | t | (σ₁ → σ₂),  (2)

and a set T_σ of objects of type σ is assigned to each σ by adding to (1) the recursive clause⁴

T_{(σ→τ)} = the set of all functions f : T_σ → T_τ.  (3)
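Over finite base domains, the frames T_σ of clause (3) can be enumerated directly. A small sketch (our own encoding, all names hypothetical): an arrow type is written as a pair, and a function is represented as a dict from arguments to values.

```python
from itertools import product

def frame(sigma, base):
    """T_sigma over finite base domains: sigma is a basic type name ('e',
    's', 't') listed in `base`, or a pair (s1, s2) standing for the arrow
    type s1 -> s2.  Functions are dicts keyed by repr(argument), so that
    higher-type arguments (which are dicts) can still serve as keys."""
    if sigma in base:
        return list(base[sigma])
    s1, s2 = sigma
    dom, cod = frame(s1, base), frame(s2, base)
    # one function per assignment of a codomain value to each domain element
    return [dict(zip([repr(d) for d in dom], vals))
            for vals in product(cod, repeat=len(dom))]

base = {"e": ["John", "Mary"], "t": [0, 1, "er"], "s": ["a"]}
print(len(frame(("e", "t"), base)))         # 3 ** 2 = 9 functions e -> t
print(len(frame(("e", ("e", "t")), base)))  # 9 ** 2 = 81 functions e -> (e -> t)
```

The doubly exponential growth of these frames is exactly why the formal languages below manipulate terms, not the type objects themselves.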
For each type σ, there is an infinite sequence of variables of type σ,

v^σ_0, v^σ_1, . . .

which range over T_σ.

It is also useful for our purpose here to assume a fixed (finite) set K of typed constants as in Table 1, which we will use to specify the terms of all three languages. Each constant c ∈ K stands for some basic word of natural language or logic and is assigned a type which (roughly) corresponds to its grammatical category; notice, though, that common nouns and extensional intransitive verbs are assigned the same type (e → t), because they take an argument of type e and (intuitively) deliver a truth value, as in the simple examples of rendering (formalization) in LIL,

John is running ⟶^render run(John),  John is a man ⟶^render man(John).

Names, indexicals⁵               John, I, he, thetemp : e
Sentences                        it rains : t
Common nouns                     man : e → t
Extensional intransitive verbs   run : e → t
Intensional intransitive verbs   rise : (s → e) → t
Extensional transitive verbs     love : e → (e → t)
The definite article             the : (e → t) → e
Propositional connectives        &, ∨ : t → (t → t)
(Basic) necessity operator       □ : (s → t) → t
de dicto modal operators         Yesterday, Today : (s → t) → t
de re modal operators            Yesterday₁, Today₁ : (s → (e → t)) → (e → t)

Table 1. Some constants with their LIL-typing.
For the fixed interpretation, each constant c of type τ is assigned a function from the states to the objects of type τ:⁶

if c : τ, then den(c) : T_s → T_τ,  (4)

and we write simply c(a) for den(c)(a). Thus John is interpreted in each state a by John(a), the (assumed) specific, unique entity which is referred to by John in state a.⁷ The de re modal operator Yesterday₁ is interpreted by the relation on properties and individuals

Yesterday₁(a)(p)(x) ⟺ p(a⁻)(x),

where for each state a, a⁻ is the state on the preceding day, and similarly for Today₁.

Starting with these common ingredients, the languages LIL, Ty₂ and L^λ_ar have their own features, as follows.
2.1. The language of intensional logic LIL

Montague does not admit s as a full-fledged primitive type like e and t, but uses it only as the name of the domain in the formation of function types. This leads to the following recursive definition of types in LIL:

σ :≡ e | t | (s → σ₂) | (σ₁ → σ₂).  (LIL-types)

We assume that all constants in the fixed set K are of LIL-type.

The terms of LIL are defined by the recursion

A :≡ x | c | A(B) | λ(x)(B) | ∨(A) | ∧(A)  (LIL-terms)

subject to some type restrictions, and each A is assigned a type as follows, where

A : σ ⟺ the type of A is σ.

(LIL-T1) x ≡ v^σ_i for some LIL-type σ and some i, and x : σ.
(LIL-T2) c is a constant (of some Montague type σ), and c : σ.
(LIL-T3) A : (σ → τ), B : σ and A(B) : τ.
(LIL-T4) x ≡ v^σ_i for some LIL-type σ and some i, B : τ and λ(x)(B) : (σ → τ).
(LIL-T5) A : (s → σ) and ∨(A) : σ.
(LIL-T6) A : σ and ∧(A) : (s → σ).

In addition, the free and bound occurrences of variables in each term are defined as usual, and A is closed if no variable occurs free in it. A sentence is a closed term of type t.

The constructs ∨(A) and ∧(A) are necessary because LIL does not have variables over states, and express (roughly) application and abstraction on an implicit variable which ranges over the current state. This is explained by the semantics of LIL and made explicit in the Gallin translation of LIL into Ty₂ which we will describe in the next section.
Semantics of LIL

As usual, an assignment is a function π which associates with each variable x ≡ v^σ_i some object π(x) ∈ T_σ. The denotation of each LIL-term A : σ is a function

den_LIL(A) : Assignments → (T_s → T_σ)

which satisfies the following recursive conditions, where a, b range over the set of states T_s:⁸

(LIL-D1) den_LIL(x)(π)(a) = π(x).
(LIL-D2) den_LIL(c)(π)(a) = c(a), as in (4).
(LIL-D3) den_LIL(A(B))(π)(a) = (den_LIL(A)(π)(a))(den_LIL(B)(π)(a)).
(LIL-D4) den_LIL(λ(x)(B))(π)(a) = (t ↦ den_LIL(B)(π{x := t})(a)), where x : σ and t ranges over the objects in T_σ (with σ a LIL-type).
(LIL-D5) den_LIL(∨(A))(π)(a) = (den_LIL(A)(π)(a))(a).
(LIL-D6) den_LIL(∧(A))(π)(a) = (b ↦ den_LIL(A)(π)(b)) (= den_LIL(A)(π)).
Consider the following simple and familiar examples, which we will use throughout the paper to illustrate the various notions that we introduce; these are sentences whose denotations are independent of any assignment π, and so we will omit it.

John loves her ⟶^render love(John, her)  (5)

den_LIL(love(John, her))(a) = love(a)(John(a), her(a))

John loves himself ⟶^render (λ(x)love(x, x))(John)  (6)

den_LIL((λ(x)love(x, x))(John))(a) = (t ↦ love(a)(t, t))(John(a))
  = love(a)(John(a), John(a))

The President is necessarily American ⟶^render □(∧(American(the(president))))  (7)

den_LIL(□(∧(American(the(president)))))(a)
  = Nec(a)(b ↦ American(b)(the(b)(president(b))))

I was insulted yesterday ⟶^render Yesterday₁(∧be insulted, I)  (8)

den_LIL(Yesterday₁(∧be insulted, I))(a)
  = Yesterday₁(a)(den_LIL(∧be insulted)(a), den_LIL(I)(a))
  = Yesterday₁(a)(b ↦ be insulted(b), I(a))

The temperature is ninety and rising
  ⟶^render (λ(x)[ninety(∨(x)) & rise(x)])(∧(thetemp))  (9)

For (9), a computation similar to those in the examples above gives the correct, expected denotation.
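Clauses (LIL-D1)-(LIL-D6) translate almost verbatim into a toy evaluator. The sketch below (term encoding and one-state model are our own, hypothetical) checks example (5) and the self-love rendering of (6) in a model where her happens to refer to Mary and John loves Mary but not himself.

```python
# terms: ("var", x) | ("const", c) | ("app", A, B) | ("lam", x, B)
#        | ("ext", A)  for the extension operator, clause (LIL-D5)
#        | ("int", A)  for the intension operator, clause (LIL-D6)
def den(term, interp, rho, a):
    tag = term[0]
    if tag == "var":                 # (LIL-D1)
        return rho[term[1]]
    if tag == "const":               # (LIL-D2): constants are state-dependent
        return interp[term[1]](a)
    if tag == "app":                 # (LIL-D3)
        return den(term[1], interp, rho, a)(den(term[2], interp, rho, a))
    if tag == "lam":                 # (LIL-D4)
        return lambda t: den(term[2], interp, dict(rho, **{term[1]: t}), a)
    if tag == "ext":                 # (LIL-D5): apply to the current state
        return den(term[1], interp, rho, a)(a)
    if tag == "int":                 # (LIL-D6): abstract over the state
        return lambda b: den(term[1], interp, rho, b)

# a toy model with a single state "a"
interp = {
    "John": lambda a: "j",
    "her": lambda a: "m",
    "love": lambda a: (lambda x: (lambda y: (x, y) in {("j", "m")})),
}
loves_her = ("app", ("app", ("const", "love"), ("const", "John")),
             ("const", "her"))
self_love = ("app",
             ("lam", "x", ("app", ("app", ("const", "love"), ("var", "x")),
                           ("var", "x"))),
             ("const", "John"))
print(den(loves_her, interp, {}, "a"))  # True
print(den(self_love, interp, {}, "a"))  # False
```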
2.2. The two-sorted, typed λ-calculus Ty₂

The assumption that every term is interpreted in the current state and the lack of state variables are natural enough when we think of the terms of LIL as rendering expressions of natural language, but they are limiting and technically awkward. Both are removed in the two-sorted typed λ-calculus Ty₂, whose characteristic features are that it admits all types as in (2), and interprets terms of type σ by objects in T_σ. We fix a set of constants

K_G = {c_G | c ∈ K}

in one-to-one correspondence with the constants K of LIL⁹. In accordance with the interpretation (rather than the formal typing) of LIL,

if c : τ, then c_G : (s → τ) and den(c_G)(a) = c(a)  (a ∈ T_s),

i.e., the object which interprets c_G in Ty₂ is exactly the function a ↦ c(a) which interprets c in LIL. The terms of Ty₂ are defined by the recursion

A :≡ x | c_G | A(B) | λ(x)(B)  (Ty₂-terms)

where now x can be a variable of any type as in (2), and the typing of terms is obvious. Assignments interpret all variables, including those of type s, and denotations are defined naturally:

(Ty₂-D1) den(x)(π) = π(x).
(Ty₂-D2) den(c_G)(π) = (a ↦ c(a)), independently of π.
(Ty₂-D3) den(A(B))(π) = (den(A)(π))(den(B)(π)).
(Ty₂-D4) den(λ(x)A)(π) = (t ↦ den(A)(π{x := t})).

We notice the basic property of the Ty₂-typing of terms: for every assignment π,

if A : σ, then den(A)(π) ∈ T_σ.  (10)
The Gallin translation

For each LIL-term A and each state variable u representing the current state, the Gallin translation A^{G,u} of A in Ty₂ is defined by the following recursive clauses:¹⁰

[x]^{G,u} :≡ x
[c]^{G,u} :≡ c_G(u)
[A(B)]^{G,u} :≡ A^{G,u}(B^{G,u})
[λ(x)(A)]^{G,u} :≡ λ(x)(A^{G,u})
[∨(A)]^{G,u} :≡ A^{G,u}(u)
[∧(A)]^{G,u} :≡ λ(u)A^{G,u}

By an easy recursion on the LIL-terms, A^{G,u} has the same LIL-type as A, and for every assignment π,

if π(u) = a, then den(A^{G,u})(π) = den_LIL(A)(π)(a).

In effect, the Gallin translation expresses formally (within Ty₂) the definition of denotations of LIL.

Here are the Gallin translations of the standard examples above:¹¹

[love(John, her)]^{G,u} ≡ love_G(u)(John_G(u), her_G(u))  (11)

[(λ(x)love(x, x))(John)]^{G,u} ≡ (λ(x)love_G(u)(x, x))(John_G(u))  (12)

[□(∧(American(the(president))))]^{G,u}
  ≡ □_G(u)(λ(u)American_G(u)(the_G(u)(president_G(u))))  (13)

[Yesterday₁(∧be insulted, I)]^{G,u}
  ≡ Yesterday₁_G(u)(λ(u)be insulted_G(u), I_G(u))  (14)

[(λ(x)[ninety(∨(x)) & rise(x)])(∧(thetemp))]^{G,u}
  ≡ (λ(x)[ninety_G(u)(x(u)) & rise_G(u)(x)])(λ(u)thetemp_G(u))  (15)

Notice that the selected formal variable u occurs both free and bound in these Gallin translations, which may be confusing but poses no problem.
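The six translation clauses are a straightforward structural recursion on terms. A sketch over the same kind of toy term encoding (our own, hypothetical), with the constant c_G spelled as the name of c suffixed by "_G":

```python
def gallin(term, u):
    """A -> A^{G,u}: constants become c_G(u); the extension operator becomes
    application to u; the intension operator becomes abstraction over u."""
    tag = term[0]
    if tag == "var":
        return term
    if tag == "const":                 # [c]^{G,u} := c_G(u)
        return ("app", ("const", term[1] + "_G"), ("var", u))
    if tag == "app":
        return ("app", gallin(term[1], u), gallin(term[2], u))
    if tag == "lam":
        return ("lam", term[1], gallin(term[2], u))
    if tag == "ext":                   # [v(A)]^{G,u} := A^{G,u}(u)
        return ("app", gallin(term[1], u), ("var", u))
    if tag == "int":                   # [^(A)]^{G,u} := lambda(u) A^{G,u}
        return ("lam", u, gallin(term[1], u))

# the intension of thetemp translates to lambda(u) thetemp_G(u)
assert gallin(("int", ("const", "thetemp")), "u") == \
    ("lam", "u", ("app", ("const", "thetemp_G"), ("var", "u")))

# example (11): each constant of love(John, her) picks up the argument u
A = ("app", ("app", ("const", "love"), ("const", "John")), ("const", "her"))
assert gallin(A, "u")[2] == ("app", ("const", "her_G"), ("var", "u"))
```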
2.3. The λ-calculus with acyclic recursion L^λ_ar

We now add to Ty₂ an infinite sequence of recursion variables or locations

p^σ_0, p^σ_1, . . .

for each type σ. In the semantics of the extended language L^λ_ar, these will vary over the corresponding universe T_σ just as the usual (pure) variables v^σ_0, v^σ_1, . . ., but they will be assigned-to rather than quantified in the syntax, and so they will be treated differently by the semantics. The terms of L^λ_ar are defined by the following extension of the recursive definition of the Ty₂-terms:

A :≡ x | p | c_G | A(B) | λ(x)(B)
  | A₀ where {p₁ := A₁, . . . , pₙ := Aₙ}  (L^λ_ar-terms)

where x is a pure variable of any type; p is a location of any type; and the restrictions, typing and denotations are defined exactly as for Ty₂ for all but the last, new acyclic recursion construct, where they are as follows.
Acyclic recursive terms

For A ≡ A₀ where {p₁ := A₁, . . . , pₙ := Aₙ} to be well-formed, the following conditions must be satisfied:

(i) p₁, . . . , pₙ are distinct locations, such that the type of each pᵢ is the same as that of the term Aᵢ, and

(ii) the system of simultaneous assignments p₁ := A₁, . . . , pₙ := Aₙ is acyclic, i.e., there are no cycles in the dependence relation

i ∼ j ⟺ pⱼ occurs free in Aᵢ

on the index set {1, . . . , n}.

All the occurrences of the locations p₁, . . . , pₙ in the parts A₀, . . . , Aₙ of A are bound in A, and the type of A is that of its head A₀. The body of A is the system {p₁ := A₁, . . . , pₙ := Aₙ}.
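Condition (ii) is an ordinary cycle check on the dependence relation. A sketch (bodies as dicts, the `free_locs` function supplied by the caller, all names hypothetical), using the usual three-color depth-first search:

```python
def acyclic(body, free_locs):
    """body maps each location p_i to its term A_i; free_locs(term) yields
    the locations occurring free in it.  Returns True iff the dependence
    relation 'p_j occurs free in A_i' has no cycles (condition (ii))."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {p: WHITE for p in body}

    def visit(p):
        color[p] = GREY
        for q in free_locs(body[p]):
            if q in body and (color[q] == GREY
                              or (color[q] == WHITE and not visit(q))):
                return False          # a back edge: the system has a cycle
        color[p] = BLACK
        return True

    return all(color[p] == BLACK or visit(p) for p in body)

# terms abbreviated to their sets of free locations; the first body mirrors
# the dependence structure of the Partee example below (p depends on r)
partee = {"p": {"r"}, "r": set(), "q": set(), "t": set()}
cyclic = {"p": {"q"}, "q": {"p"}}
assert acyclic(partee, lambda t: t)
assert not acyclic(cyclic, lambda t: t)
```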
To define the denotation function of a recursive term A, we notice first that by the acyclicity condition, we can assign a number rank(pᵢ) to each of the locations in A so that

if pⱼ occurs free in Aᵢ, then rank(pᵢ) > rank(pⱼ).

For each assignment π then (which now interprets all pure and recursive variables), we set by induction on rank(pᵢ),

p̄ᵢ(π) = den(Aᵢ)(π{p_{j₁} := p̄_{j₁}(π), . . . , p_{j_m} := p̄_{j_m}(π)}),

where p_{j₁}, . . . , p_{j_m} is an enumeration of the locations with rank(p_{j_k}) < rank(pᵢ) (k = 1, . . . , m), and finally,

den(A)(π) = den(A₀)(π{p₁ := p̄₁(π), . . . , pₙ := p̄ₙ(π)}).
For example, if

A ≡ (λ(x)(p(x) & q(x)))(t) where {p := λ(x)ninety_G(u)(r(x)),
  r := λ(x)x(u), q := λ(x)rise_G(u)(x), t := λ(u)thetemp_G(u)},  (16)

we can compute den(A)(π), written simply den(A) below, in stages, corresponding to the ranks of the parts, with a = π(u):

Stage 1. r̄ = (x ↦ x(a)), so r̄(x) = x(a);
  q̄ = (x ↦ rise_G(a)(x)) = rise_G(a), so that q̄(x) = 1 ⟺ x is rising in state a; and t̄ = thetemp_G.

Stage 2. p̄ = (x ↦ ninety_G(a)(r̄(x))), so p̄(x) = 1 ⟺ x(a) = 90.

Stage 3. den(A) = p̄(t̄) & q̄(t̄), so

den(A) = 1 ⟺ thetemp_G(a) = 90 & rise_G(a)(thetemp_G) = 1.
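The staged computation above is just evaluation in rank order. A generic sketch (dependency lists and per-part evaluators written by hand, all names our own), instantiated to the body of (16) at a fixed state a where the temperature is ninety and rising:

```python
def solve(deps, evals):
    """Solve a body {p_i := A_i} bottom-up: deps[p] lists the locations that
    A_p depends on, evals[p](env) computes the value of A_p once env holds
    values for them.  This is the rank induction in miniature."""
    env, done = {}, set()
    while len(done) < len(deps):
        for p in deps:
            if p not in done and all(q in done for q in deps[p]):
                env[p] = evals[p](env)
                done.add(p)
    return env

thetemp_G = {"a": 90}                    # a magnitude: state -> number
rise_G_at_a = lambda x: x is thetemp_G   # toy stand-in for 'x is rising in a'
deps = {"r": [], "q": [], "t": [], "p": ["r"]}
evals = {
    "r": lambda env: (lambda x: x["a"]),             # Stage 1: r(x) = x(a)
    "q": lambda env: rise_G_at_a,                    # Stage 1: q = rise_G(a)
    "t": lambda env: thetemp_G,                      # Stage 1: t = thetemp_G
    "p": lambda env: (lambda x: env["r"](x) == 90),  # Stage 2: p(x) iff x(a) = 90
}
env = solve(deps, evals)
den_A = env["p"](env["t"]) and env["q"](env["t"])    # Stage 3
assert den_A is True
```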
We will use the familiar model-theoretic notation for denotational equivalence,

⊨ A = B ⟺ for all assignments π, den(A)(π) = den(B)(π).

It is very easy to check that every L^λ_ar-term A is denotationally equivalent with a Ty₂-term A*, and so L^λ_ar is no more expressive than Ty₂ as far as denotations go; it is, however, intensionally more expressive than Ty₂, as we will see.
Congruence

Two L^λ_ar-terms are congruent if one can be obtained from the other by alphabetic changes of bound variables (of either kind) and re-orderings of the parts in the bodies of recursive subterms, so that, for example, assuming that all substitutions are free,

λ(x)(A{z :≡ x}) ≡_c λ(y)(A{z :≡ y}),
A{p :≡ q} where {q := B{p :≡ q}} ≡_c A where {p := B},
A where {p := B, q := C} ≡_c A where {q := C, p := B}.

All the syntactic and semantic notions we will define respect congruence, and so it will be convenient on occasion to identify congruent terms.

Since Ty₂ is a sublanguage of L^λ_ar, we can think of the Gallin translation as an interpretation of LIL into L^λ_ar; and so we can apply to the terms of LIL the theory of meaning developed for L^λ_ar in (Moschovakis 2006), which we describe next.
3. Referential intension theory

The referential intension int(A) of an L^λ_ar-term A is a mathematical (set-theoretic) object which purports to represent faithfully the natural algorithm (process) which computes den(A)(π) for each π. It models an intuitive notion of meaning for L^λ_ar-terms (and the natural language expressions which they render), and it provides a precise relation of synonymy between terms which can be tested against our intuitions and other theories of meaning that are similarly based on truth conditions. Roughly:

A ≈ B ⟺ int(A) = int(B)  (A, B in L^λ_ar),  (17)

where "A in L^λ_ar" naturally means that A is an L^λ_ar-term. To facilitate the discussion of meaning in LIL, we also set

A ≈_LIL B ⟺ A^{G,u} ≈ B^{G,u}  (A, B in LIL).  (18)

This relation models quite naturally (global) synonymy for terms of LIL.

The operation A ↦ int(A) and the relation ≈ of referential synonymy are fairly complex, and their precise definitions in (Moschovakis 2006) require the establishment of several technical facts. Here we will confine ourselves to a brief summary of the main results of referential intension theory, primarily so that this article can be read independently of (Moschovakis 2006).
There are two important points to keep in mind.

First, variables and some very simple, immediate (variable-like) terms are not assigned referential intensions: they denote directly and immediately, without the mediation of a meaning. Thus (17) is not exactly right: it holds for proper (non-immediate) terms, while for immediate terms synonymy coincides with denotational equality or (equivalently for these terms) congruence. The distinction between direct and immediate reference is precise but not just technical: it lies at the heart of the referential intension approach to modeling meaning, and it plays an important role in our analysis of examples from natural language. We will discuss it in Section 3.2.

Second, the denotational rule of β-conversion

⊨ (λ(x)A)(B) = A{x :≡ B}

does not preserve referential synonymy, so that, for example,

(λ(x)love(x, x))(John) ≉_LIL love(John, John).

This is common in structural theories of meaning in which the meaning of a term A codes (in particular) the logical form of A; see (Moschovakis 2006) for a related extensive discussion. It is good to remember this here, especially as we render natural language phrases into LIL and then translate these terms into Ty₂ and so into L^λ_ar: we want rendering to preserve (intuitive) meaning, so that we have a chance of capturing it with the precisely defined referential intension of the end result, and so we should not lose it by carelessly applying β-conversions in some step of the rendering process.
3.1. Reduction, irreducibility, canonical forms

The main technical tool of (Moschovakis 2006) is a binary relation ⇒ of reduction between L^λ_ar-terms, for which (intuitively)

A ⇒ B ⟺ A ≡_c B, or A and B have the same meaning and B expresses that meaning more simply.

The disjunction is needed because the reduction relation is defined for all pairs of terms, even those which do not have a meaning, for which, however, the relation is trivial. We set

A is irreducible ⟺ for all B, if A ⇒ B then A ≡_c B,  (19)

so that the irreducible terms which have meaning express that meaning as simply as possible.
Theorem 1 (Canonical form) For each term A, there is a unique (up to congruence) recursive, irreducible term

cf(A) ≡ A₀ where {p₁ := A₁, . . . , pₙ := Aₙ},

such that A ⇒ cf(A). We write

A ⇒_cf B ⟺ B ≡_c cf(A).

If A ⇒ B, then ⊨ A = B, and, in particular, ⊨ A = cf(A).
The reduction relation is determined by ten simple reduction rules which comprise the Reduction Calculus, and the computation of cf(A) is effective. The parts Aᵢ of cf(A) are explicit¹², irreducible terms; they are determined uniquely (up to congruence) by A; and they code the basic facts which are needed to compute the denotation of A, in the assumed fixed interpretation of the language. If A : t and den(A) = 1, then the irreducible parts of cf(A) can be viewed as the truth conditions which ground the truth of A.

Variables and constants are irreducible, and so is the more complex-looking term λ(x)love_G(u)(x, x). On the other hand, the term expressing John's self-love in the current state is not:

(λ(x)love_G(u)(x, x))(John_G(u))
  ⇒_cf (λ(x)love_G(u)(x, x))(j) where {j := John_G(u)}.  (20)
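The step in (20) is an instance of one kind of reduction: a proper (non-immediate) argument is moved into the body under a fresh location. A toy sketch of just that step, over our own term encoding; this is an illustration, not the full ten-rule calculus.

```python
def is_immediate(term):
    # in this toy fragment only variables and locations are immediate
    return term[0] in ("var", "loc")

def reduce_app(term, p):
    """A(B)  =>  A(p) where {p := B}, when the argument B is proper;
    the caller supplies a fresh location name p."""
    if term[0] == "app" and not is_immediate(term[2]):
        return ("where", ("app", term[1], ("loc", p)), {p: term[2]})
    return term

# (lambda(x) love_G(u)(x, x))(John_G(u))  =>  ...(j) where {j := John_G(u)}
john = ("app", ("const", "John_G"), ("var", "u"))
love_xx = ("app", ("app", ("const", "love_G"), ("var", "x")), ("var", "x"))
self_love = ("app", ("lam", "x", love_xx), john)
assert reduce_app(self_love, "j") == (
    "where", ("app", ("lam", "x", love_xx), ("loc", "j")), {"j": john})

# with an immediate argument, nothing happens
run_u = ("app", ("const", "run_G"), ("var", "u"))
assert reduce_app(run_u, "k") == run_u
```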
For a more complicated example, the canonical form of the Gallin translation of the Partee term in (15) is the term (16). So canonical forms get very complex, as do their explicit, irreducible parts, which is not surprising, since they are meant to express directly the meanings of complex expressions.

The specific rules of the Reduction Calculus are at the heart of the matter, of course, and they deliver the subtle differences in (formal) meaning with which we are concerned here. It is not possible to state or explain them in this article (they are the main topic of (Moschovakis 2006)); but the most important of them will be gleaned from their applications in the examples below.
3.2. Direct vs. immediate reference

An important role in the computation of canonical forms is played by the immediate terms. These are defined by

X :≡ v | p | p(v₁, . . . , vₙ) | λ(u₁, . . . , u_m)p(v₁, . . . , vₙ),  (Immediate terms)

where v, v₁, . . . , vₙ, u₁, . . . , u_m are pure variables, while p is a location. Immediate terms are treated like variables in the Reduction Calculus; this is not true of constants (and other irreducible terms) which contribute in a non-trivial way to the canonical forms of the terms in which they occur. For example, run_G(u)(p(v)) is irreducible, because p(v) is immediate, while run_G(u)(John_G(u)) is not:

run_G(u)(John_G(u)) ⇒_cf run_G(u)(j) where {j := John_G(u)}.

In the intensional semantics of L^λ_ar to which we will turn next, immediate terms refer directly and immediately: they are not assigned meanings, and they contribute only their reference to the meaning of larger (proper) terms which contain them. Irreducible terms also refer directly, in the sense that their meaning is completely determined by their reference; but they are assigned meanings, and they affect in a non-trivial (structural) way the meanings of larger terms which contain them.
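The grammar of immediate terms is a small syntactic test. A sketch over a toy term encoding of our own (an abstraction is written with its list of bound variables):

```python
def immediate(term):
    """X :== v | p | p(v1, ..., vn) | lambda(u1, ..., um) p(v1, ..., vn):
    a pure variable, a location, a location applied to pure variables, or an
    abstraction (over pure variables) of such an application."""
    tag = term[0]
    if tag in ("var", "loc"):
        return True
    if tag == "app":       # ("app", head, arg1, arg2, ...)
        return term[1][0] == "loc" and all(v[0] == "var" for v in term[2:])
    if tag == "lam":       # ("lam", [u1, ...], body)
        return term[2][0] in ("loc", "app") and immediate(term[2])
    return False

assert immediate(("app", ("loc", "p"), ("var", "v")))                  # p(v)
assert immediate(("lam", ["u"], ("app", ("loc", "p"), ("var", "v"))))  # lam(u) p(v)
assert not immediate(("app", ("const", "run_G"), ("var", "u")))        # run_G(u)
assert not immediate(("const", "John_G"))
```

Constants thus fail the test even when applied to variables, which is exactly why run_G(u)(John_G(u)) above is proper while run_G(u)(p(v)) is irreducible.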
3.3. Referential intensions

If A is not immediate and

A ⇒_cf A₀ where {p₁ := A₁, . . . , pₙ := Aₙ},

then int(A) is the abstract algorithm which intuitively computes den(A)(π) for each assignment π as indicated in the remarks following (16), as follows:

(i) Solve the system of equations

dᵢ = den(Aᵢ)(π{p₁ := d₁, p₂ := d₂, . . . , pₙ := dₙ})  (i = 1, . . . , n)

(which, easily, has unique solutions by the acyclicity hypothesis).

(ii) If the solutions are p̄₁, . . . , p̄ₙ, set

den(A)(π) = den(A₀)(π{p₁ := p̄₁, . . . , pₙ := p̄ₙ}).
So how can we define precisely this abstract algorithm? The idea is that it must be determined completely by the head of A and the system of equations in its body, and it should not depend on any particular method of solving the system; so it is most natural to simply identify it with the tuple of functions

int(A) = (f₀, f₁, . . . , fₙ)  (21)

defined by the parts of A, i.e.,

fᵢ(d₁, . . . , dₙ, π) = den(Aᵢ)(π{p₁ := d₁, p₂ := d₂, . . . , pₙ := dₙ})  (i ≤ n).

Tuples of functions such as (21) are called recursors.
For a concrete example, which also illustrates just how abstract this notion of meaning is, the referential intension of the Partee example (15) is determined by its canonical form A^{G,u} in (16), and it is the recursor

int(A) = (f₀, f₁, f₂, f₃, f₄),

where

f₀(p, r, q, t, π) = (x ↦ (p(x) & q(x)))(t),
f₁(p, r, q, t, π) = (x ↦ ninety_G(π(u))(r(x))),
f₂(p, r, q, t, π) = (x ↦ x(π(u))),
f₃(p, r, q, t, π) = (x ↦ rise_G(π(u))(x)),
f₄(p, r, q, t, π) = thetemp_G.
Theorem 2 (Compositionality) The operation A ↦ int(A) on proper (not immediate) terms is compositional, i.e., int(A) is determined from the referential intensions of the proper subterms of A and the denotations of its immediate subterms.

This does not follow directly from the definition of referential intensions that we gave above, via canonical forms, but it is not difficult to prove.

3.4. Referential synonymy

Two terms A and B are referentially synonymous if either A ≡_c B, or int(A) and int(B) are naturally isomorphic. Now this is tedious to make precise, but, happily, we don't need to do this here because of the following
Theorem 3 (Referential Synonymy) For any two terms A, B of L^λ_ar, A is referentially synonymous with B if and only if there exist suitable terms A₀, . . . , Aₙ, B₀, . . . , Bₙ such that

A ⇒_cf A₀ where {p₁ := A₁, . . . , pₙ := Aₙ},
B ⇒_cf B₀ where {p₁ := B₁, . . . , pₙ := Bₙ},

and for i = 0, 1, . . . , n, ⊨ Aᵢ = Bᵢ, i.e., for all π, den(Aᵢ)(π) = den(Bᵢ)(π).
Thus the referential synonymy relation A ≈ B is grounded by a system of denotational identities between explicit, irreducible terms. It is important, of course, that the formal identities

Aᵢ = Bᵢ,  i = 0, . . . , n

can be computed from A and B (using the Reduction Calculus), although their truth or falsity depends on the assumed, fixed structure of interpretation and cannot, in general, be decided effectively.
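Theorem 3 makes synonymy checkable whenever the denotations of the parts can be compared, e.g., over a finite toy model. The sketch below (all names and the part encoding are our own) also previews the factual-synonymy example of Section 4: the two canonical forms agree at a state where her denotes Mary, but not at every state.

```python
def synonymous(cfA, cfB, states, den):
    """Canonical forms are pairs (head, [part_1, ..., part_n]); per
    Theorem 3 they are synonymous iff they have the same number of parts
    and matching parts are denotationally equal at every state."""
    (headA, partsA), (headB, partsB) = cfA, cfB
    if len(partsA) != len(partsB):
        return False
    return all(den(x, s) == den(y, s)
               for x, y in zip([headA] + partsA, [headB] + partsB)
               for s in states)

# toy denotations: at state "a" the pronoun her_G happens to denote Mary
model = {"a": {"love_G": "LOVE", "John_G": "j", "her_G": "m", "Mary_G": "m"},
         "b": {"love_G": "LOVE", "John_G": "j", "her_G": "x", "Mary_G": "m"}}
den = lambda part, s: model[s][part]
cf_her = ("love_G", ["John_G", "her_G"])
cf_mary = ("love_G", ["John_G", "Mary_G"])
assert synonymous(cf_her, cf_mary, ["a"], den)           # equal at state a
assert not synonymous(cf_her, cf_mary, ["a", "b"], den)  # not at all states
```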
4. Two notions of situated meaning

We can now make precise the two promised notions of situated meaning for terms of LIL, after just a bit more preparation.

State parameters

Intuitively, a notion of situated meaning of a LIL-term A : σ is a way that we understand A in a given state a; and so it depends on a, even when A is closed, when its semantic values do not depend on anything else. To avoid the cumbersome use of assignments simply to indicate this state dependence, we introduce a parameter ā for each state a, so that the definition of the terms of L^λ_ar now takes the following form:

A :≡ x | ā | p | c_G | A(B) | λ(x)(B)
  | A₀ where {p₁ := A₁, . . . , pₙ := Aₙ}  (L^λ_ar-terms)

Parameters are treated like free pure variables in the definition of immediate terms and in the Reduction Calculus; in fact, the best way to think of ā is as a free variable with preassigned value

den(ā)(π) = a

which does not depend on the assignment π.
4.1. Factual content

The term

A^{G,ā} ≡ A^{G,u}{u :≡ ā}  (22)

expresses the Gallin translation of A at the state a: not only its denotation, but also its meaning, or at least one aspect of it. Thus, for each proper LIL-term A and each state a, we set

FC(A, a) = int(A^{G,ā}).  (23)

This is the (referential) factual content of A at the state a. For proper terms A, B and states a, b, we also set¹³

(A, a) is factually synonymous with (B, b) ⟺ FC(A, a) = FC(B, b)
  ⟺ A^{G,ā} ≈ B^{G,b̄}.
By the Referential Synonymy Theorem 3, then, we can read factual synonymy by examining the canonical forms of terms. For example,

[John loves her]^{G,ā} ⟶^render love_G(ā)(John_G(ā), her_G(ā))
  ⇒_cf love_G(ā)(j, h) where {j := John_G(ā), h := her_G(ā)}

and

[John loves Mary]^{G,ā} ⟶^render love_G(ā)(John_G(ā), Mary_G(ā))
  ⇒_cf love_G(ā)(j, h) where {j := John_G(ā), h := Mary_G(ā)},

so that

if den(Mary_G(ā)) = den(her_G(ā)),
  then [John loves her]^{G,ā} ≈ [John loves Mary]^{G,ā},

which expresses formally the fact that John loves her and John loves Mary convey the same information about the world at this state a. These two sentences are not, of course, synonymous, as it is easy to verify by the definition of ≈_LIL in (18) and Theorem 3.
Next consider example (8), which involves the indexical Yesterday. In (Frege 1918), Frege argues that

If someone wants to say today what he expressed yesterday using the word "today", he will replace this word with "yesterday". Although the thought is the same, its verbal expression must be different in order that the change of sense which would otherwise be effected by the differing times of utterance may be cancelled out.

It appears that Frege's "thought" in this case is best modeled by the factual content of the uttered sentence in the relevant state.

In detail, suppose that at state a the speaker is DK and the time is 27 June 2005. If we consider the sentence "I am insulted today" uttered at a state b = a⁻ when the time is 26 June 2005, the speaker is again DK and nothing else has changed, then, according to Frege's remark above, it should be that

[I was insulted yesterday]^{G,ā} ≈ [I am insulted today]^{G,b̄}.
This is indeed the case:

[I was insulted yesterday]^{G,ā}
  ⟶^render Yesterday₁_G(ā)(λ(u)be insulted_G(u), I_G(ā))
  ⇒_cf Yesterday₁_G(ā)(p, q) where {p := λ(u)be insulted_G(u), q := I_G(ā)}

[I am insulted today]^{G,b̄}
  ⟶^render Today₁_G(b̄)(λ(v)be insulted_G(v), I_G(b̄))
  ⇒_cf Today₁_G(b̄)(p, q) where {p := λ(v)be insulted_G(v), q := I_G(b̄)},

and the canonical forms of these sentences at these states satisfy the conditions of Theorem 3 for synonymy, assuming, of course, that Yesterday₁ and Today₁ are interpreted in the natural way, so that for these a and b,

Yesterday₁(a)(p)(x) ⟺ Today₁(b)(p)(x)  (p : T_s → (T_e → T_t), x ∈ T_e).
On the other hand, example (7) shows that, in some cases, the factual content is independent of the state and incorporates the full meaning of the term:

[The President is necessarily American]^{G,ā}
  ⟶^render □_G(ā)(λ(u)American_G(u)(the_G(u)(president_G(u))))
  ⇒_cf □_G(ā)(q) where {q := λ(u)American_G(u)(t(u)),
    t := λ(u)the_G(u)(p(u)), p := λ(u)president_G(u)}.

Notice that the state parameter ā occurs only in the head of the relevant canonical form, and so, with the "necessarily" = "always" interpretation of □ that we have adopted, the factual content of this term is independent of the state a.
4.2. Referential (global) meaning
A plausible candidate for the (global) referential meaning of a LIL-term A is the operation

ā ↦ int(A^{G,ā})

which assigns to each state ā the factual content of A at ā. We can understand this outside the formal system, as an operation from states to recursors; but we can also do it within the system, taking advantage of the abstraction construct of the typed λ-calculus and setting

M(A) = int(λ(u) A^{G,u}).   (24)
It follows by Theorem 3 and the Reduction Calculus that for proper terms A, B,

M(A) = M(B) ⟺ λ(u) A^{G,u} ≈ λ(u) B^{G,u} ⟺ A ≈_LIL B,

and so there is no conflict between this notion of global meaning and the referential synonymy relation between LIL-terms defined directly in terms of the Gallin translation.
The recursor M(A) is expressed directly by the canonical form of λ(u) A^{G,u}, which gives some insight into this notion of formal meaning. For example:

λ(u)[John loves her]^{G,u}
  ⇒_render λ(u) love^G(u)(John^G(u), her^G(u))
  ⇒_cf λ(u) love^G(u)(j(u), h(u)) where {j := λ(u) John^G(u), h := λ(u) her^G(u)}
  ≈ λ(u) love^G(u)(j(u), h(u)) where {j := John^G, h := her^G}
while

λ(u)[John loves Mary]^{G,u}
  ⇒_render λ(u) love^G(u)(John^G(u), Mary^G(u))
  ⇒_cf λ(u) love^G(u)(j(u), h(u)) where {j := λ(u) John^G(u), h := λ(u) Mary^G(u)}
  ≈ λ(u) love^G(u)(j(u), h(u)) where {j := John^G, h := Mary^G}.
To grasp the meanings of these two sentences, as Frege would say, we need the functions love, John, Mary and her: not their values in any one particular state, but their range of values in all states; and to realize that they are not synonymous, we need only realize that her is not Mary in all states.
4.3. Local meaning
Once we have a global meaning of A, we can compute its local meaning at a state ā by evaluation, and, again, we could do this outside the system by defining in a natural way an operation of application of a recursor to an argument; but since we already have application in the typed λ-calculus, we set, within the system,

LM(A, ā) = int((λ(u) A^{G,u})(ā)).   (25)
This is the (referential) local meaning of A at ā. For proper terms A, B and states ā, b̄, we set

(A, ā) is locally synonymous with (B, b̄) ⟺ LM(A, ā) = LM(B, b̄)
  ⟺ (λ(u) A^{G,u})(ā) ≈ (λ(v) B^{G,v})(b̄).

It is important to recall here that, in general,

(λ(u) C^{G,u})(ā) ≉ C^{G,ā},

because β-conversion does not preserve referential synonymy.
The three synonymy relations we have defined are related as one would expect:

Lemma 1 (a) Referential synonymy implies local synonymy at any state, that is,

λ(u) A^{G,u} ≈ λ(u) B^{G,u} ⟹ (λ(u) A^{G,u})(ā) ≈ (λ(u) B^{G,u})(ā).

(b) Local synonymy at a state implies factual synonymy at that state,

(λ(u) A^{G,u})(ā) ≈ (λ(u) B^{G,u})(ā) ⟹ A^{G,ā} ≈ B^{G,ā}.
Both parts of the lemma are easily proved using Theorem 3 and some simple denotational equalities between the parts of the relevant canonical forms. In the following sections, we consider some examples which (in particular) show that neither part of the Lemma has a valid converse. Perhaps most interesting are those which distinguish between factual and local synonymy, and show that the latter is a much more fine-grained relation, very close in fact to (global) referential synonymy.
4.4. Factual content vs. local meaning
In Section 4.1, we showed that for any state ā,

if her(a) = Mary(a), then [John loves her]^{G,ā} ≈ [John loves Mary]^{G,ā}.
To check for local synonymy, we compute the canonical forms of these terms:

(λ(u)[John loves her]^{G,u})(ā)
  ⇒_render (λ(u) love^G(u)(John^G(u), her^G(u)))(ā)
  ⇒_cf (λ(u) love^G(u)(j(u), h(u)))(ā)
        where {j := λ(u) John^G(u), h := λ(u) her^G(u)}
  ≈ love^G(ā)(j(ā), h(ā)) where {j := John^G, h := her^G}
while

(λ(u)[John loves Mary]^{G,u})(ā)
  ⇒_render (λ(u) love^G(u)(John^G(u), Mary^G(u)))(ā)
  ⇒_cf (λ(u) love^G(u)(j(u), h(u)))(ā)
        where {j := λ(u) John^G(u), h := λ(u) Mary^G(u)}
  ≈ love^G(ā)(j(ā), h(ā)) where {j := John^G, h := Mary^G}.
But her^G ≠ Mary^G, and so these two sentences are not locally synonymous at ā, although they have the same factual content at ā.
The example illustrates the distinction between factual content and local meaning: to grasp the factual content FC(John loves her, a) we only need know who her is at state a; on the other hand, to grasp the local meaning LM(John loves her, a) we need to understand her as a function on the states. This is what we also need in order to grasp the (global) referential meaning of John loves her, which brings us to the more difficult comparison between local and global meaning.
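The contrast can be miniaturized in code. In the toy model below (states, entities and the interpretations of her and Mary are all our illustrative assumptions), the two sentences involve the same individual at state a, yet her and Mary are different functions on the states, which is what separates their global meanings:

```python
# Toy states and interpretations (illustrative assumptions, not the formal system).
states = ["a", "b"]

John = lambda s: "John"                           # same entity at every state
Mary = lambda s: "Mary"
her  = lambda s: {"a": "Mary", "b": "Ann"}[s]     # state-dependent reference

# At state a the two sentences concern the same individuals ...
assert her("a") == Mary("a")
# ... but `her` and `Mary` are different functions on the states, so the
# (global) referential meanings of the two sentences differ:
assert any(her(s) != Mary(s) for s in states)
```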
4.5. Local vs. global synonymy
By the Reduction Calculus, if

A^{G,u} ⇒_cf A_0 where {p_1 := A_1, ..., p_n := A_n},
then

λ(u) A^{G,u} ⇒_cf λ(u)(A_0{p_1 :≡ q_1(u), ..., p_n :≡ q_n(u)})
  where {q_1 := λ(u) A_1{p_1 :≡ q_1(u), ..., p_n :≡ q_n(u)},
         ...
         q_n := λ(u) A_n{p_1 :≡ q_1(u), ..., p_n :≡ q_n(u)}}

and

(λ(u) A^{G,u})(ā) ⇒_cf (λ(u)(A_0{p_1 :≡ q_1(u), ..., p_n :≡ q_n(u)}))(ā)
  where {q_1 := λ(u) A_1{p_1 :≡ q_1(u), ..., p_n :≡ q_n(u)},
         ...
         q_n := λ(u) A_n{p_1 :≡ q_1(u), ..., p_n :≡ q_n(u)}}
The computations here are by the most complex, and most significant, rule of the Reduction Calculus, which, unfortunately, we cannot attempt to motivate here. The formulas do imply, however, that for any term B,

(λ(u) A^{G,u})(ā) ≈ (λ(u) B^{G,u})(ā)

if and only if

B^{G,u} ⇒_cf B_0 where {p_1 := B_1, ..., p_n := B_n},

for suitable B_0, ..., B_n, so that:
For any i = 1, ..., n,

⊨ λ(u)(A_i{p_1 :≡ q_1(u), ..., p_n :≡ q_n(u)})
  = λ(u)(B_i{p_1 :≡ q_1(u), ..., p_n :≡ q_n(u)}),   (1)

and

⊨ A_0{u :≡ ā, p_1 :≡ q_1(ā), ..., p_n :≡ q_n(ā)}
  = B_0{u :≡ ā, p_1 :≡ q_1(ā), ..., p_n :≡ q_n(ā)}.   (2)
On the other hand, A ≈_LIL B if (1) holds and, instead of (2), the stronger

⊨ λ(u)(A_0{p_1 :≡ q_1(u), ..., p_n :≡ q_n(u)})
  = λ(u)(B_0{p_1 :≡ q_1(u), ..., p_n :≡ q_n(u)})   (2′)

is true.
Thus, local synonymy is very close to global synonymy, the only difference being that for global synonymy we need the heads of the two terms to be denotationally equal for all states, while for local synonymy at a state ā, we only need their heads to be denotationally equal at ā. This explains why, by Lemma 1, the former implies the latter while the converse may fail.

Natural examples which illustrate this distinction are hard to find, but the following one may, at least, be amusing.
Consider a particular state ā at which two common nouns are co-extensive, for example, 'man' and 'human'. This was the case at the time just after God had created Adam but not yet Eve. At that state ā, then, the sentences 'Adam is a man' and 'Adam is human' are locally synonymous, since

(λ(u)[Adam is a man]^{G,u})(ā)
  ⇒_render (λ(u) man^G(u)(Adam^G(u)))(ā)
  ⇒_cf (λ(u) man^G(u)(j(u)))(ā) where {j := λ(u) Adam^G(u)}

(λ(u)[Adam is human]^{G,u})(ā)
  ⇒_render (λ(u) human^G(u)(Adam^G(u)))(ā)
  ⇒_cf (λ(u) human^G(u)(j(u)))(ā) where {j := λ(u) Adam^G(u)}

and

⊨ man^G(ā)(j(ā)) = human^G(ā)(j(ā)).
These sentences, of course, are not referentially synonymous, as they are
not even factually synonymous in any reasonable state.
4.6. Local synonymy across different states
Things get more complicated when we try to trace local synonymy between sentences at different states. In Section 4.1, it was shown that 'Yesterday I was insulted' uttered at a state ā where the time is 27 June 2005 and 'Today I am insulted' uttered at state b̄ where the time is 26 June 2005 are factually synonymous, provided that the two states are otherwise identical. By the Reduction Calculus,
(λ(u)[Yesterday I was insulted]^{G,u})(ā)
  ⇒_render (λ(u) Yesterday_1^G(u)(λ(u) be insulted^G(u), I^G(u)))(ā)
  ⇒_cf (λ(u) Yesterday_1^G(u)(p(u), q(u)))(ā)
        where {p := λ(u)λ(u) be insulted^G(u), q := λ(u) I^G(u)}

and

(λ(u)[Today I am insulted]^{G,u})(b̄)
  ⇒_render (λ(u) Today_1^G(u)(λ(u) be insulted^G(u), I^G(u)))(b̄)
  ⇒_cf (λ(u) Today_1^G(u)(p(u), q(u)))(b̄)
        where {p := λ(u)λ(u) be insulted^G(u), q := λ(u) I^G(u)},
and these two canonical forms have the same bodies, so they will be locally synonymous if and only if their heads are denotationally equal. But

⊭ Yesterday_1^G(ā)(p(ā), q(ā)) = Today_1^G(b̄)(p(b̄), q(b̄))

on the plausible assumption that John is running while Mary sleeps (today and yesterday), taking

π(p)(a) = π(p)(b) = runs,   π(q)(a) = John,   π(q)(b) = Mary

and computing the denotations of the two sides of this equation for the assignment π; so

LM(Yesterday I was insulted, ā) ≠ LM(Today I am insulted, b̄).
LM(Yesterday I was insulted, a) ,= LM(Today I am insulted, b).
This argument is subtle and a little unusual, and it may be easier to under-
stand in the next example, where we compare the local meanings of the same
sentence, with no modal operator, on two different but nearly identical states.
Consider John runs, in a and b:
(λ(u) run^G(u)(John^G(u)))(ā)
  ⇒_cf (λ(u) run^G(u)(j(u)))(ā) where {j := λ(u) John^G(u)}

(λ(u) run^G(u)(John^G(u)))(b̄)
  ⇒_cf (λ(u) run^G(u)(j(u)))(b̄) where {j := λ(u) John^G(u)}
Local synonymy requires that

⊨ run^G(ā)(j(ā)) = run^G(b̄)(j(b̄)),   (26)

which means that for all functions j : T_s → T_e,

run^G(a)(j(a)) = run^G(b)(j(b)).
Suppose further that the two states a and b are exactly the same except that the speaker is different, an aspect of the state that, intuitively, should not affect the meaning of 'John runs'. In particular, the interpretation run^G : s → (e → t) of the constant run is the same in the two states, i.e., whoever runs at state a also runs at state b and conversely. But unless either everyone or nobody runs in these states, (26) fails: just take an assignment π such that π(j)(a) ≠ π(j)(b) and such that in both states π(j)(a) runs whereas π(j)(b) does not run.
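This counterexample is easy to replay in a toy model (the sets and names below are our illustrative assumptions, not the formal system): run receives literally the same interpretation at both states, yet a single assignment to j falsifies (26):

```python
# Toy domain: whoever runs at a also runs at b (same interpretation of `run`).
runners = {"John"}
run = lambda state: lambda x: x in runners    # identical function at a and b

a, b = "a", "b"

# An assignment to j with j(a) != j(b): j(a) runs, j(b) does not.
j = lambda state: {"a": "John", "b": "Mary"}[state]

# (26) fails for this assignment, so the two local meanings differ:
assert run(a)(j(a)) != run(b)(j(b))
```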
The examples suggest that synonymy across different states is a complex
relation, and that when we feel intuitively that sentences may have the same
meaning in different states, it is factual synonymy that we have in mind.
5. Situated meaning in the philosophy of language
In this section we will investigate briefly the connection of the proposed aspects of situated meaning to indexicality, propositional attitudes and translation.
5.1. Kaplan's treatment of indexicals
Indexicality is a phenomenon of natural language usage, closely connected to situated meaning. Among its many proposed treatments, Kaplan's theory of direct reference (Kaplan 1989) reaches some very interesting results which we can relate to the aspects of situated meaning introduced in this paper.
Kaplan's theory is expressed formally in the Logic of Demonstratives
(LD), where each term or formula has two semantic values, Content and
Character. The Content of a term or a formula is given with respect to a
context, considered as context of utterance, and it is a function from possible
circumstances, considered as contexts of evaluation, to denotations or truth
values, respectively. The Character of a term or a formula A is the function
which assigns to each context the Content of A in that context.
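As a quick concrete gloss (a deliberately minimal toy model; the dictionary representation of contexts and circumstances is our assumption, not Kaplan's formal LD), Character can be coded as a function from contexts to Contents, and Content as a function from circumstances to denotations:

```python
# Character of the indexical 'I': context of utterance -> Content,
# where a Content maps circumstances of evaluation to denotations.
def character_of_I(context):
    speaker = context["speaker"]
    # Direct reference: the Content is fixed by the context and constant
    # across circumstances of evaluation.
    return lambda circumstance: speaker

c1 = {"speaker": "DK"}
c2 = {"speaker": "EK"}

content1 = character_of_I(c1)
content2 = character_of_I(c2)

# One Character, two different Contents in two different contexts:
assert content1({"world": "w"}) == "DK"
assert content2({"world": "w"}) == "EK"
```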
In (Kaplan 1978), it is argued that

I was insulted yesterday (K4)

Thus when I say (K4), specific content, what I said, is expressed. Your utterance of the same sentence, or mine on another day, would not express the same content. What is important to note is that it is not just the truth value that may change; what is said is itself different. Speaking today, my utterance of (K4) will have a content roughly equivalent to that which

David Kaplan is insulted on 20 April 1973 (K5)

would have, spoken by you or anyone at any time.
Kaplan gives formal definitions of these notions in LD, from which it follows that at a context of utterance where the speaker is David Kaplan and the time is 21 April 1973, the two sentences (K4) and (K5) have the same Content, that is, the same truth value for every possible circumstance; but, of course, they have different Characters: (K5)'s Character is a constant function, whereas (K4)'s clearly depends on the context of utterance.
In L^λ_ar, these intuitively plausible semantic distinctions can be made with the use of the factual content and the global meaning which, roughly speaking, correspond to Kaplan's Content and Character respectively.
Suppose ā is a state^14 where the speaker is David Kaplan (or DK for short) and the time is 21 April 1973. As example (8) suggests, (K4) and (K5) are factually synonymous at ā, that is

[I was insulted yesterday]^{G,ā} ≈ [David Kaplan is insulted on 20 April 1973]^{G,ā}.
Moreover, for any state b̄ where DavidKaplan^G(b̄) is David Kaplan,

[I was insulted yesterday]^{G,ā} ≈ [David Kaplan is insulted on 20 April 1973]^{G,b̄}.
These two sentences, however, are not referentially (globally) synonymous,

λ(u)[I was insulted yesterday]^{G,u} ≉ λ(u)[David Kaplan is insulted on 20 April 1973]^{G,u},

since I^G is not denotationally equivalent with DavidKaplan^G, nor is Yesterday_1^G with on20April1973^G. Notice that the indexical 'I' occurs within the scope of the modal operator 'Yesterday' in (K4), and in any such example, in order to account for the directly referential usage of indexicals, we choose the de re reading of the operators, thus translating 'Yesterday' as Yesterday_1.
Kaplan argues that (K4) has another characteristic as well. Consider two contexts c_1 and c_2 which differ with respect to agent and/or moment of time, that is, the aspects of the context of utterance which are relevant to this particular sentence. Then, its Content with respect to c_1 is different from its Content with respect to c_2. Similarly, in L^λ_ar,

[I was insulted yesterday]^{G,ā} ≉ [I was insulted yesterday]^{G,b̄}

for states ā and b̄ which differ in the same way as c_1 and c_2 do.
It is clear from the examples that at least some of the aspects of indexicality which Kaplan (1989) seeks to explain can also be understood using factual content, with no need to introduce contexts of utterance and contexts of evaluation, which (to some) appear somewhat artificial. There are two points that are worth making.
First, in L^λ_ar, synonymy is based on the isomorphism between two recursors, which is a structural condition, whereas in LD the identity of Contents or Characters is defined as a simple equality between two functions.
For example, consider 'I am insulted', a simpler version of (K4) with no modal operator, and suppose (as above) that there are states a and b which differ only in the identity of the speakers, call them Agent_a and Agent_b. Suppose also that both utterances of the sentence by the two agents are true.
To show in LD that the two relevant Contents are different, we need to consider their values in contexts of evaluation other than that determined by the states a and b: the argument being that the interpretation function of the constant be insulted evaluated on some circumstances for the two different agents is not the same (because there are circumstances at which Agent_a is insulted while Agent_b is not), and so the two Contents are not identical. On the other hand, the factual content of this sentence in state ā is expressed by the canonical form

be insulted^G(ā)(I^G(ā)) ⇒_cf be insulted^G(ā)(p) where {p := I^G(ā)},
and the one for b̄ is the same, but with b̄ in place of ā. So, in L^λ_ar,

FC(I am insulted, ā) ≠ FC(I am insulted, b̄)

simply because

⊭ I^G(ā) = I^G(b̄).

There is no need to consider the values of the function be insulted^G in any state, not even at ā and b̄.
Second, in L^λ_ar there is the possibility to compare, in addition, the local meanings of the two sentences (K4) and (K5) at the specific state ā. As one would expect, these are not locally synonymous at ā, i.e.,

(λ(u)[I was insulted yesterday]^{G,u})(ā)
  ≉ (λ(u)[David Kaplan is insulted on 20 April 1973]^{G,u})(ā).

This accounts for the fact that although what is said (the factual content) is the same, to understand this one must know that 'I' and 'yesterday' refer to DK and 20 April 1973 respectively; two sentences are locally synonymous in a state only when the fact that they say the same thing can be realized by a language speaker who does not necessarily know the references of the indexicals in them.
5.2. What are the belief carriers?
The objects of belief must be situated meanings of some sort: can we model them faithfully by factual contents or local meanings, the two versions of situated meaning that we introduced?

There are several well-known paradoxes which argue against taking factual contents as the objects of belief,^15 but our pedestrian example (5) can serve as well. If belief respected factual content, then in a state ā in which her is Mary, an agent would equally believe 'John loves her' as she would 'John loves Mary'; but we can certainly imagine situations in which the agent does not know that 'her' refers to Mary; we make this sort of factual error all the time, and it certainly affects our beliefs. Thus factual synonymy is not preserved by belief attribution.
Local meanings are more promising candidates for belief carriers, especially as they eliminate this sort of belief paradox which depends on the agent's mistaking the values of indexicals. Moreover, the discussion in Section 4.5 suggests that the local meaning LM(A, a) models what has sometimes been called 'the sentence A under the situation a' and has been proposed as the object of belief. So it would appear that of the known candidates, local meanings may be the best formal representations of belief carriers.
5.3. What is preserved under translation?
Faithful translation should also preserve some aspect of meaning, so which is it? It is clear, again, that it cannot be factual content, as 'John loves her' would never be translated as 'Jean aime Marie', whatever the state. Perhaps referential (global) synonymy or local synonymy are translation invariants, but there are no good arguments for one or the other of these, or for preferring one over the other, given how closely related they are. The question is interesting and we have no strong or defensible views on it, but it should be raised in the evaluation of any theory of meaning, and it almost never is.
Acknowledgements
The research of Eleni Kalyvianaki for this article was co-funded by the European Union (European Social Fund) & National Resources, EPEAEK II.
Notes
1. We will generally assume that the reader is reasonably familiar with Montague's LIL and thus it will be clear to her that we employ a rather simplified version of this language, at least in what concerns the way natural language is translated into it. On the other hand, we will describe the basic ideas of Gallin (1975) and Moschovakis (2006), so that this article is largely independent of the details of those papers.
2. In fact, it is convenient (and harmless) to take T_t ⊆ T_e, i.e., simply to assume that the truth values 0, 1, er are in T_e, so that the denotations of sentences are treated like common entities, with Frege's approval. The extra truth value is useful for dealing with simple cases of presupposition, but it will not show up in this article.
3. Pedantically, types are finite sequences (strings) generated by the distinct symbols e, s, t, (, and ), and terms (later on) will similarly be strings from a larger alphabet. We use ≡ to denote the identity relation between strings.
4. We assume for simplicity this standard (largest) model of the typed λ-calculus built from the universes T_s and T_e.
5. We have assumed, for simplicity, a constant thetemp : e which we grouped with names and indexicals, because of its typing.
5. We have assumed, for simplicity, a constant thetemp : e which we grouped with
names and indexicals, because of its typing.
6. For the examples from natural language, we will assume some plausible properties of the interpretations of these constants, e.g., that there are states in which some people are running while others sit, that John loves Mary in some states while he dislikes her in others, etc. None of these assumptions affect the logic of meaning which is our primary concern.
7. This 'John' is just a formal expression (a string of symbols), which, quite obviously, may refer to different objects in different states. Proper names, which should be rigid designators by the prevailing philosophical view, are more than strings of symbols, and in any case, the logic of meaning which concerns us here does not take a position on this (or any other) philosophical view.
8. We denote by π{x := t} the update of an assignment π which changes it only by re-assigning to the variable x : σ the object t ∈ T_σ:

π{x := t}(v_i^σ) = t, if v_i^σ ≡ x,
π{x := t}(v_i^σ) = π(v_i^σ), otherwise.
9. An alternative would be to re-type the same constants in K and distinguish between the LIL-typing and the Ty_2-typing of c. The method we adopted is (probably) less confusing, and it makes it easier to express the Gallin translation of LIL into Ty_2 below.
10. We use G instead of Gallin's * and we make the state variable explicit.
11. In example (15), for simplicity, the logic constant & is not translated into &^G(u) since its denotation is independent of the state.
12. A term A is explicit if the constant where does not occur in it.
13. Factual synonymy can be expressed without the use of parameters by the following, simple result: A^{G,ā} ≈ B^{G,b̄} if and only if there exist terms A_0, ..., B_0, ..., as in the Referential Synonymy Theorem 3, such that for all assignments π,

if π(u) = a and π(v) = b, then den(A_i^{G,u})(π) = den(B_i^{G,v})(π)   (i = 0, ..., n).

The same idea can be used to define the recursor FC(A, ā) directly, without enriching the syntax with state parameters.
14. A state in L^λ_ar acts both as context of utterance, thus disambiguating all occurrences of indexicals, names etc., and as context of evaluation, thus evaluating the denotations of verbs, adjectives etc.
15. See for example Russell's 'author of Waverley' example as presented in the Introduction of (Salmon and Soames 1988) or in (Church 1982).
References
Church, Alonzo
1982 A remark concerning Quine's paradox about modality. Spanish version in Análisis Filosófico, 25–32; reprinted in English in (Salmon and Soames 1988).
Frege, Gottlob
1892 On sense and denotation. Zeitschrift für Philosophie und philosophische Kritik 100. Translated by Max Black in (Frege 1952) and also by Herbert Feigl in (Martinich 1990).
1918 Der Gedanke. Eine logische Untersuchung. Beiträge zur Philosophie des deutschen Idealismus I: 58–77. Translated as 'Thoughts' and reprinted in (Salmon and Soames 1988).
1952 Translations from the Philosophical Writings of Gottlob Frege. Oxford: Blackwell. Edited by Peter Geach and Max Black.
Gallin, Daniel
1975 Intensional and higher-order modal logic. Number 19 in North-Holland Mathematical Studies. Amsterdam, Oxford, New York: North-Holland, Elsevier.
Kaplan, David
1978 On the logic of demonstratives. Journal of Philosophical Logic, 81–98; reprinted in (Salmon and Soames 1988).
1989 Demonstratives: An Essay on the Semantics, Logic, Metaphysics, and Epistemology of Demonstratives and Other Indexicals & Afterthoughts. In Joseph Almog, John Perry, and Howard Wettstein (eds.), Themes from Kaplan, 481–614. Oxford University Press.
Martinich, Aloysius P., (ed.)
1990 The Philosophy of Language. New York, Oxford: Oxford University
Press, second edition.
Montague, Richard
1973 The Proper Treatment of Quantification in Ordinary English. In Jaakko Hintikka et al. (eds.), Approaches to Natural Language: Proceedings of the 1970 Stanford Workshop on Grammar and Semantics, 221–242; reprinted in (Montague 1974). Dordrecht: D. Reidel Publishing Co.
1974 Formal Philosophy. New Haven and London: Yale University Press. Selected papers of Richard Montague, edited by Richmond H. Thomason.
Moschovakis, Yiannis N.
2006 A logical calculus of meaning and synonymy. Linguistics and Philosophy 29: 27–89.
Salmon, Nathan and Scott Soames, (eds.)
1988 Propositions and attitudes. Oxford: Oxford University Press.
Further excursions in natural logic: The Mid-Point
Theorems
Edward L. Keenan
Abstract
In this paper we explore the logic of proportionality quantifiers, seeking and formalizing characteristic patterns of entailment. We uncover two such patterns, though in each case we show that, despite the naturalness of proportionality quantifiers in these paradigms, they apply as well to some non-proportionality quantifiers. Even so we have elucidated some logical properties of natural language quantifiers which lie outside the generalized existential and generalized universal ones.
1. Background
Pursuing a study begun in (Keenan 2004) this paper investigates inference patterns in natural language which proportionality quantifiers enter. We desire to identify such patterns and to isolate any such which are specific to proportionality quantifiers.

Keenan (2004) identified the inference pattern in (1) and suggested that it involved proportionality quantifiers in an essential way.
(1) a. More than n/m of the As are Bs.
       At least 1 − n/m of the As are Cs.
       Ergo: Some A is both a B and a C.
    b. At least n/m of the As are Bs.
       More than 1 − n/m of the As are Cs.
       Ergo: Some A is both a B and a C.
To illustrate (1-a): If more than three tenths of the students are athletes
and at least seven tenths are vegetarians then at least one student is both an
athlete and a vegetarian.
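The counting argument behind (1-a), that the premises force |A ∩ B| + |A ∩ C| > |A| and hence A ∩ B ∩ C ≠ ∅, can be checked by brute force over a small universe. The sketch below is ours (exact arithmetic with Fraction avoids floating-point boundary errors at the threshold):

```python
from itertools import product
from fractions import Fraction

def respects_1a(A, B, C, n, m):
    """Do these particular sets respect pattern (1-a) for the fraction n/m?"""
    if A and len(A & B) > Fraction(n, m) * len(A) \
         and len(A & C) >= (1 - Fraction(n, m)) * len(A):
        return len(A & B & C) > 0    # conclusion: some A is both a B and a C
    return True                      # premises fail: pattern holds vacuously

universe = range(4)
# Each element independently lands in any combination of A, B and C.
for bits in product(range(8), repeat=len(universe)):
    A = {e for e, v in zip(universe, bits) if v & 1}
    B = {e for e, v in zip(universe, bits) if v & 2}
    C = {e for e, v in zip(universe, bits) if v & 4}
    for n, m in [(1, 2), (3, 10), (7, 10), (2, 3)]:
        assert respects_1a(A, B, C, n, m)
```

The exhaustive search finds no counterexample, as the counting argument predicts.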
This is indeed a valid argument paradigm. However recently Westerståhl (p.c.) showed that the pattern (1-a), (1-b) is a special case of a more general one not specific to proportionality quantifiers but which includes them simply as a special case. His result supports the claim that proportionality quantifiers enter inference paradigms common to better understood classes of quantifiers. But it also leads us to question whether there are any inference patterns specific to proportionality quantifiers. To pursue these questions we need some background definitions.
Definition 1 Given a domain E, the set GQ_E of generalized quantifiers over E =_def [P(E) → {0, 1}], the set of functions from P(E) into {0, 1}. Such functions will also be called (following Lindström (1966)) functions of type ⟨1⟩. Interpreting P_1s, one place predicates, as elements of P(E) we can use type ⟨1⟩ functions as denotations of the DPs italicized in (2):
(2) a. No teacher laughed at that joke.
b. Every student came to the party.
c. Most students are vegetarians.
So the truth value of (2-a) is the one that the function denoted by 'no teacher' assigns to the denotation of the P_1 'laughed at that joke'. The Dets no, every, and most combine with a single Noun to form a DP and are naturally interpreted by functions of type ⟨1, 1⟩, namely maps from P(E) into GQ_E. They exemplify three different classes of type ⟨1, 1⟩ functions: the intersective, the co-intersective and the proportionality ones. Informally a D of type ⟨1, 1⟩ is intersective if its value at sets A, B just depends on A ∩ B, it is co-intersective if its value depends just on A − B, and it is proportional if it just depends on the proportion of As that are Bs. Formally, we define these notions and a few others below in terms of invariance conditions.
Definition 2 For D of type ⟨1, 1⟩,

1. a. D is intersective iff for all sets A, B, X, Y: if A ∩ B = X ∩ Y then DAB = DXY.
   b. D is cardinal iff for all sets A, B, X, Y: if |A ∩ B| = |X ∩ Y| then DAB = DXY.
2. a. D is co-intersective iff for all A, B, X, Y: if A − B = X − Y then DAB = DXY.
   b. D is co-cardinal iff for all A, B, X, Y: if |A − B| = |X − Y| then DAB = DXY.
3. D is proportional iff for all A, B, X, Y: if |A ∩ B|/|A| = |X ∩ Y|/|X| then DAB = DXY.
4. D is conservative iff for all A, B, B′: if A ∩ B = A ∩ B′ then DAB = DAB′.
5. D is permutation invariant iff for all A, B ⊆ E, all permutations π of E, DAB = D(πA)(πB).
One checks that NO, EVERY, and MOST defined below are intersective, co-intersective and proportional respectively. All three of these functions are permutation invariant and conservative.

(3) a. NO(A)(B) = 1 iff A ∩ B = ∅.
    b. EVERY(A)(B) = 1 iff A − B = ∅.
    c. MOST(A)(B) = 1 iff |A ∩ B| > (1/2)|A|.
Here is a representative sample of these three classes (our main concern in what follows).

(4) Some intersective Dets
cardinal: some, a/an, no, practically no, several, between six and ten, infinitely many, more than six, at least/exactly/just/only/fewer than/at most six, between six and ten, just finitely many, about/nearly/approximately a hundred, a couple of, a dozen, How many?
non-cardinal: Which?, more male than female, no...but John (as in No student but John came to the party)

(5) Some co-intersective Dets
co-cardinal: every/all/each, almost all, all but six, all but at most six, all but finitely many
non-co-cardinal: every...but John

(6) Some properly proportional Dets (proportional but not intersective or co-intersective)
a. more than half the, less than two thirds, less than/at most/at least/exactly half, at most ten per cent, between a half and two thirds, between ten and twenty per cent, all but a tenth, almost a third, What percentage?
b. most, every third (as in Every third student was inoculated), just/nearly/exactly/only/not one...in ten (as in Just one student in ten was inoculated), (almost) seven out of ten (as in Seven out of ten sailors smoke Players), between six out of ten and nine out of ten
So the proportionality Dets include mundane fractional and percentage expressions, (6-a), usually built on a partitive pattern with of followed by a definite DP, as in most of the students, a third of John's students, ten per cent of those students, etc. (half is slightly exceptional, only taking of optionally: half the students, half of the students are both fine). The precise syntactic analysis of partitive constructions is problematic in the literature. Keenan and Stavi (1986) treat more than a third of the as a complex Det. But more usually linguists treat the expression following of as a definite DP and of expresses a partitive relation between that DP and the Det that precedes of. Barwise and Cooper (1981) provide a compositional semantics for this latter approach which we assume here.

Proportionality Dets also include those in (6-b) which are not partitive, but are followed directly by the Noun as in the case of intersective and co-intersective Dets.
DPs built from proportionality Dets usually require that their Noun argument denotes a finite set. We could invent a meaning for a third of the natural numbers but this would be a creative step, extending natural usage, not simply an act of modeling ordinary usage. In general the functions denoted by proportionality Dets are not intersective or co-intersective, though a few extremal cases are: Exactly zero per cent = no, a hundred per cent = every, more than zero per cent = some, less than a hundred per cent = not every.
Complex members in each of these seven classes can be formed by taking
boolean compounds in and (but), or, not, and neither...nor. We note without
proof:
Proposition 1
1. GQ_E = [P(E) → {0, 1}] is a (complete atomic) boolean algebra inheriting its structure pointwise from {0, 1}.
2. Each of the classes K defined in Definition 2 is closed under the pointwise boolean operations and is thus a (complete, atomic) boolean subalgebra of [P(E) → GQ_E].

So if D of type ⟨1, 1⟩ is intersective (cardinal, co-intersective, ...) so is ¬D, which maps each A to ¬(D(A)), the complement of the GQ D(A). Thus boolean compounds of expressions in any of these classes also lie in that class. E.g. at least two and not more than ten is cardinal because at least two and more than ten are, etc.
We write INT_E (CARD_E, ...) for the set of intersective (cardinal, ...) functions of type ⟨1, 1⟩ over a domain E, omitting the subscript E when no confusion results. Many relations between these subclasses of Dets are known. E.g. INT, CO-INT, PROP are all subsets of CONS; CARD and CO-CARD are PI (permutation invariant, defined in Section 2.3) subsets of INT and CO-INT respectively. When E is finite CARD = INT ∩ PI and CO-CARD = CO-INT ∩ PI. And an easily shown fact, used later, is:
Further excursions in natural logic: The Mid-Point Theorems 91
Proposition 2 INT_E ∩ CO-INT_E = {0, 1}, where 0 is that constant function
of type <1, 1> mapping all A, B to 0; 1 maps all A, B to 1.

Proof. One checks easily that 0 and 1 are both intersective and co-intersective.
For the other direction let D ∈ INT_E ∩ CO-INT_E. Then for A, B arbitrary,
DAB = D(A∩B)(E), since D is intersective, = D(∅)(E), since D is co-intersective. Thus D is constant, so D = 0 or D = 1.
Thus only the two trivial Det functions are both intersective and co-intersective. Further:

Fact 1 In general Dets are not ambiguous according as their denotations are
intersective or co-intersective.

fewer than zero denotes 0, which is both intersective and co-intersective,
but Fact 1 says that no Det expression has two denotations, one intersective
and the other co-intersective.
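These classifications are finitely checkable: over a small finite universe E a type <1, 1> Det denotation can be represented as a map from pairs of subsets to {0, 1}, and intersectivity and co-intersectivity tested by brute force. The following Python sketch is ours, for illustration only (the names and the choice of E are assumptions, not from the text); it confirms the classifications of some, every and more than half, and that the constant functions 0 and 1 of Proposition 2 are both intersective and co-intersective.

```python
from itertools import combinations, product

E = frozenset({1, 2, 3})
POW = [frozenset(c) for r in range(len(E) + 1) for c in combinations(E, r)]

# Sample type <1,1> Det denotations over E, as maps from (A, B) to {0, 1}.
SOME = lambda A, B: int(len(A & B) > 0)
EVERY = lambda A, B: int(A <= B)
MORE_THAN_HALF = lambda A, B: int(len(A) > 0 and 2 * len(A & B) > len(A))
ZERO = lambda A, B: 0          # the trivial functions of Proposition 2
ONE = lambda A, B: 1

def intersective(D):
    """DAB depends only on A & B."""
    return all(D(A, B) == D(X, Y)
               for A, B, X, Y in product(POW, repeat=4) if A & B == X & Y)

def co_intersective(D):
    """DAB depends only on A - B."""
    return all(D(A, B) == D(X, Y)
               for A, B, X, Y in product(POW, repeat=4) if A - B == X - Y)
```

Running the checks confirms, for instance, that more than half is neither intersective nor co-intersective, in line with the text.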
2. Proportionality Dets

We begin with some basic facts regarding proportionality Dets.

2.1. Not first order definable

Barwise and Cooper (1981) argue that MOST, as defined here, is not definable
in first order logic. See also Westerståhl (1989). The arguments given in these
two sources extend to the non-trivial proportionality Dets - those which are
not also intersective or co-intersective. Given that the proportionality Dets
in general are not first order definable (FOD), it is unsurprising that we have
little understanding of their inferential behavior, as inference patterns have
been best studied for first order expressions.
2.2. Not sortally reducible

We say that a possible Det function D is sortally reducible iff there is a
two place boolean function h such that for all subsets A, B of E, DAB =
D(E)(h(A, B)). Note that intersective D and co-intersective D are sortally
reducible, as illustrated below with some and all:

(7) a. Some poets are socialists.
92 Edward L. Keenan
b. Some individuals are both poets and socialists.
c. All poets are socialists.
d. All individuals are either not poets or are socialists (= All individuals are such that if they are poets then they are socialists).

In fact Keenan (1993) shows that the conservative D which are sortally reducible are just the intersective and co-intersective ones. Most reasoning
techniques used with formulas of the form ∃xφ or ∀xφ involve removing the
quantifiers, reasoning with the resulting formula, and then restoring the quantifiers when needed. But such techniques will not apply directly to Ss built
with proper proportionality quantifiers, as they do not admit of a translation
which eliminates the Noun domain of the variable in favor of the entire universe as in (7-b) and (7-d) above.
2.3. Permutation invariant

Given a permutation h of E (so h is a bijection from E to E) we extend h to
subsets of E by setting h(X) = {h(x) | x ∈ X}, for all X ⊆ E. And a possible Det
denotation D is said to be PI (permutation invariant) iff for all permutations
h of E,

D(A)(B) = D(h(A))(h(B)).

Proportionality Dets (over finite E) always denote PI functions (in distinction
for example to no ... but John or Which? among the intersective Dets).
2.4. Two place variants

Proportionality Dets have two place variants, like intersective Dets, as in:

(8) A greater percentage of teachers than (of) students signed the petition.
The same proportion of little boys as (of) little girls laugh at funny faces.
Proportionately fewer female students than male students get drafted.

(9) (A GREATER PERCENTAGE OF A THAN B)(C) = 1 iff |A∩C|/|A| > |B∩C|/|B|.
3. Inference paradigms

To begin our study of the inference paradigms that proportionality Dets enter, we first
review Westerståhl's result concerning our previous attempt (Keenan 2004).
That work built on three operations defined on GQs: complement, postcomplement, and dual. Complement has already been defined (pointwise)
above. For the others:

Definition 3 (Postcomplement and dual)
a. F¬, the postcomplement of F, is that GQ mapping each B to F(¬B), that is, to F(E − B).
b. Fᵈ, the dual of F, =def ¬(F¬).

Note that ¬(F¬) = (¬F)¬, so we may omit parentheses.
We extend these operations pointwise to type <1, 1> functions:

Definition 4 For D of type <1, 1>, ¬D, D¬, and Dᵈ are those type <1, 1>
functions defined by:
a. ¬D maps each set A to ¬(D(A)).
b. D¬ maps each set A to (D(A))¬.
c. Dᵈ maps each set A to (D(A))ᵈ.
3.1. Some examples

We write negX for a DP which denotes the complement of the denotation of
X; similarly Xneg denotes its postcomplement and dualX its dual.

X      | some      | every     | more than half | less than half
negX   | no        | not every | at most half   | at least half
Xneg   | not every | no        | less than half | more than half
dualX  | every     | some      | at least half  | at most half

So the complement of every boy is not every boy, its postcomplement is no
boy, and its dual is some boy. And the complement of more than half is
at most half, its postcomplement is less than half, and its dual at least half.
Observe that the postcomplement and dual operators preserve the property of
being proportional but interchange the intersective and co-intersective Dets:
Proposition 3 For D of type <1, 1>,
a. if D is proportional so are D¬ and Dᵈ, but
b. if D is intersective (cardinal), D¬ and Dᵈ are both co-intersective (co-cardinal), and
c. if D is co-intersective (co-cardinal), then D¬ and Dᵈ are both intersective (cardinal).

Proof sketch. We show b. above, as it plays a role in our later discussion. Let D be intersective. We show that D¬ is co-intersective. Let A − B =
X − Y. We must show that D¬AB = D¬XY. But D¬AB = (DA)¬(B) =
DA(¬B) = DA(A∩¬B), since D is intersective, = DA(A − B) = D(E)(A∩(A − B)) = D(E)(A − B) = D(E)(X − Y) = ... = D¬XY, completing the proof.
To see that Dᵈ is co-intersective we observe that D¬ is by the above and so
then is ¬(D¬) = Dᵈ, since pointwise complement preserves co-intersectivity
(Proposition 1).
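Proposition 3(b) can likewise be confirmed by brute force over a small universe. The helper names below (post, dual) are ours; they implement F¬ and Fᵈ pointwise, and the check is a sketch under the same finite-model assumptions as before.

```python
from itertools import combinations, product

E = frozenset({1, 2, 3})
POW = [frozenset(c) for r in range(len(E) + 1) for c in combinations(E, r)]

SOME = lambda A, B: int(len(A & B) > 0)
EVERY = lambda A, B: int(A <= B)

def post(D):
    # D-postcomplement: maps A, B to D(A)(E - B)
    return lambda A, B: D(A, E - B)

def dual(D):
    # D^d = complement of the postcomplement
    return lambda A, B: 1 - D(A, E - B)

def co_intersective(D):
    # DAB depends only on A - B
    return all(D(A, B) == D(X, Y)
               for A, B, X, Y in product(POW, repeat=4) if A - B == X - Y)
```

Here post(SOME) is the function denoted by not all, and dual(SOME) coincides with EVERY; both are co-intersective, as Proposition 3(b) predicts.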
3.2. Westerståhl's generalization

We repeat (1-a) above, (1-b) being similar.

(10) a. More than n/m of the As are Bs.
At least 1 − n/m of the As are Cs.
Ergo: Some A is both a B and a C.
Now the relevant Dets are interpreted as in (11):

(11) For 0 ≤ n ≤ m, 0 < m,
(MORE THAN n/m)(A)(B) = 1 iff A ≠ ∅ and |A∩B|/|A| > n/m,
(LESS THAN 1 − n/m)(A)(B) = 1 iff A ≠ ∅ and |A∩B|/|A| < 1 − n/m,
(AT LEAST 1 − n/m)(A)(B) = 1 iff A ≠ ∅ and |A∩B|/|A| ≥ 1 − n/m.
Westerståhl (pc) notes that the DPs in the premisses in (10) are duals: (LESS
THAN 1 − n/m) is the postcomplement of (MORE THAN n/m) and (AT LEAST
1 − n/m) is its dual.

Theorem 1 (Westerståhl's Generalization) For D conservative, the following properties are equivalent:
1. D is right increasing (= increasing on its second argument).
2. D(A)(B) ∧ Dᵈ(A)(C) → SOME(A)(B∩C).
Proof. Let D be conservative and assume [2]. We show [1]. Let B ⊆ B′
and assume DAB. We must show DAB′. Assume otherwise. So DAB′ =
0. Then (DA)(B′) = 0, so ¬(DA)(B′) = Dᵈ(A)(¬B′) = 1. So by [2],
A∩B∩¬B′ ≠ ∅, contradicting that B ⊆ B′. Thus DAB′ = 1, and D is right
increasing.

Let D be right increasing and assume DAB = 1 and DᵈAC = 1,
whence by the conservativity of D and Dᵈ we have DA(A∩B) = 1 and DᵈA(A∩C) = 1. Assume, leading to a contradiction, that A∩B∩C = ∅. Then A∩B ⊆ ¬C, so D(A)(¬C) = 1 by the right increasingness of D. Thus D¬(A)(C) = 1.
But ¬(D¬)(A)(C) = DᵈAC = 1, a contradiction. So A∩B∩C ≠ ∅, whence
SOME(A)(B∩C) = 1, establishing [2].
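Theorem 1 can also be spot-checked mechanically: more than half is conservative and right increasing, and over a small E the implication D(A)(B) ∧ Dᵈ(A)(C) → SOME(A)(B∩C) holds for every triple of sets, while a non-right-increasing Det such as exactly half fails both [1] and [2]. The harness below is our own sketch, not from the paper.

```python
from itertools import combinations, product

E = frozenset({1, 2, 3, 4})
POW = [frozenset(c) for r in range(len(E) + 1) for c in combinations(E, r)]

SOME = lambda A, B: int(len(A & B) > 0)
MORE_THAN_HALF = lambda A, B: int(len(A) > 0 and 2 * len(A & B) > len(A))
EXACTLY_HALF = lambda A, B: int(len(A) > 0 and 2 * len(A & B) == len(A))

def dual(D):
    # D^d: complement of the postcomplement
    return lambda A, B: 1 - D(A, E - B)

def right_increasing(D):
    # property [1]: B <= B' implies D(A)(B) <= D(A)(B')
    return all(D(A, B) <= D(A, B2)
               for A, B, B2 in product(POW, repeat=3) if B <= B2)

def westerstahl_property(D):
    # property [2]: D(A)(B) and D^d(A)(C) jointly imply SOME(A)(B & C)
    Dd = dual(D)
    return all(not (D(A, B) and Dd(A, C)) or SOME(A, B & C)
               for A, B, C in product(POW, repeat=3))
```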
So [2] generalizes the argument paradigm in (1-a), (1-b) and does not seem
specific to proportionality Dets, since it holds for DPs built from right increasing conservative Dets in general. So far however I have found it difficult to
find examples of non-proportional Dets which instantiate [1] and [2]. One's
first guess, some and every, satisfies [1] and [2], but these Dets are, recall, proportional: SOME = MORE THAN 0% and EVERY = 100%. The only other
cases I can think of are ones that make presuppositions on the cardinality of
their first argument. Perhaps the least contentious is both and at least one of
the two. Both students are married and At least one of the two students is
a vegan imply Some student is both married and a vegan. But this instance
does require taking at least one of the two as a Det, which we decided against
earlier.
4. The Mid-Point Theorems

We seek now additional inference patterns that proportionality quantifiers naturally enter. Keenan (2004) observes that natural languages present some
non-trivial DPs, distinct from first order ones, which always assign the same
truth value to a predicate and its negation, as in (12-a), (12-b) and (12-c),
(12-d). (13) is the general form of the regularity. Proposition 4 is then immediate.

(12) a. Exactly half the students got an A on the exam.
b. Exactly half the students didn't get an A on the exam.
c. Between a third and two thirds of the students got an A.
d. Between a third and two thirds of the students didn't get an A.

(13) DP(P₁) = DP(not P₁).
Proposition 4 The DPs which satisfy (13) are those which denote in FIX(¬) =
{F ∈ GQ_E | F¬ = F}.

At issue then is a syntactic question: just which DPs do satisfy (13)? Let
us limit ourselves for the moment to ones of the form [Det+N], as we are
interested in isolating the role of the Det. And in characterizing that class do
the proportionality Dets play any sort of distinguished role? It seems to me
that they do, though I can only give a rather informal statement of that role.
Still that informal statement at least helps us to understand why many of the
natural examples of Dets which denote in FIX(¬) are proportional. We begin
by generalizing the observation in (12).
Definition 5 For p and q fractions with 0 ≤ p ≤ q ≤ 1,
a. (BETWEEN p AND q)(A)(B) = 1 iff A ≠ ∅ and p ≤ |A∩B|/|A| ≤ q.
b. (MORE THAN p AND LESS THAN q)(A)(B) = 1 iff A ≠ ∅ and p < |A∩B|/|A| < q.
Thus (12-c) is true iff there is at least one student and at least a third of the
students passed and not more than two thirds passed. Dets of the forms in
Def 5 are fixed by postcomplement when the fractions p, q lie between 0
and 1 and sum to 1. The condition that p + q = 1 guarantees that p and q
are symmetrically distributed around the midpoint 1/2. Clearly p ≤ 1/2, since
p ≤ q and p + q = 1. Similarly 1/2 ≤ q. The distance from 1/2 to p is 1/2 − p,
and that from 1/2 to q is q − 1/2. And 1/2 − p = q − 1/2 iff, adding 1/2 to both
sides, 1 − p = q, iff 1 = p + q. And we have:

Theorem 2 (Mid-Point Theorem) Let p, q be fractions with 0 ≤ p ≤ q ≤ 1,
p + q = 1. Then (BETWEEN p AND q) and (MORE THAN p AND LESS
THAN q) are both fixed by ¬ (postcomplement).

The theorem (plus pointwise meets) guarantees the logical equivalence of
the (a,b) pairs below:
(14) a. Between one sixth and five sixths of the students are happy.
≡ b. Between one sixth and five sixths of the students are not happy.

(15) a. More than three out of ten and less than seven out of ten teachers are married.
≡ b. More than three out of ten and less than seven out of ten teachers are not married.
A variant statement of this theorem using percentages is:

(16) Let 0 ≤ n ≤ m ≤ 100 with n + m = 100. Then
Between n and m per cent of the As are Bs.
≡ Between n and m per cent of the As are not Bs.

For example choosing n = 40 we infer that (17-a) and (17-b) are logically
equivalent:

(17) a. Between 40 and 60 per cent of the students passed.
b. Between 40 and 60 per cent of the students didn't pass.
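The Mid-Point Theorem is easy to confirm computationally for particular p, q: when p + q = 1 the between Det assigns the same value to B and to its complement, and when the pair fails to be centered on 1/2 it does not. The sketch below is ours (E and the sample fractions are illustrative assumptions).

```python
from fractions import Fraction
from itertools import combinations

E = frozenset({1, 2, 3, 4})
POW = [frozenset(c) for r in range(len(E) + 1) for c in combinations(E, r)]

def between(p, q):
    # (BETWEEN p AND q)(A)(B) = 1 iff A is non-empty and p <= |A&B|/|A| <= q
    return lambda A, B: int(len(A) > 0
                            and p <= Fraction(len(A & B), len(A)) <= q)

def fixed_by_postcomplement(D):
    # D is fixed by postcomplement iff DAB = DA(E - B) for all A, B
    return all(D(A, B) == D(A, E - B) for A in POW for B in POW)

centered = between(Fraction(1, 3), Fraction(2, 3))    # p + q = 1
uncentered = between(Fraction(1, 3), Fraction(3, 4))  # straddles 1/2, p + q != 1
low_pair = between(Fraction(1, 4), Fraction(1, 3))    # both below 1/2
```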
And Theorem 2 and (16) are (mutual) entailment paradigms which appear
to use proportionality Dets in an essential if not completely exclusive way.
Many of the pairs of proportional Dets will not satisfy the equivalence in Theorem 2
since their fractions do not straddle the mid-point appropriately. And Dets
such as between 10 and 20 per cent do not satisfy (16) for the same reason. A
very large class of complex proportional Dets which satisfy Theorem 2 or (16) is given
by Theorem 3.

Theorem 3 FIX(¬) is closed under the pointwise boolean operations and
so is a complete (and thus atomic) subalgebra of Type <1,1>.

The proof can be found in the Appendix.

And given our earlier observation that proportionality functions are closed
under the pointwise boolean operations we infer that all the Dets that can be
built up as boolean compounds of the basic fractional and percentage Dets in
Theorem 2 and (16) respectively are both proportional and fixed by ¬, so they satisfy
the equivalence in (13). For example

(18) Either less than a third or else more than two thirds of the As are Bs.
≡ Either less than a third or else more than two thirds of the As are not Bs.

Proof. The Det in this example denotes the boolean complement of BETWEEN A THIRD AND TWO THIRDS and is thus proportional and fixed
by ¬.
It is perhaps worth noticing what happens with proportional Dets of the
form Between p and q when their distribution with respect to the mid-point
(1/2, 50%) changes. If both p and q lie below, or both above, the midpoint
then we have:

Proposition 5 If 0 < p ≤ q < 1/2 or 1/2 < p ≤ q < 1 then
Between p and q of the As are Bs.
≡ It is not the case that between p and q of the As are not Bs.

Thus such Det pairs satisfy the equivalences in (19).

(19) D(A)(B) = Dᵈ(A)(B) = ¬(D(A)¬)(B) = ¬(D(A)(¬B)).

In contrast if the fraction (percentage) pairs p, q include the mid-point but are
not centered then no entailment relation in either direction holds. In (20-a),
(20-b) neither entails the other:

(20) a. Between a third and three quarters of the students passed the exam.
b. Between a third and three quarters of the students didn't pass the exam.
5. Generalizing the Mid-Point Theorem

We observe first that the proportionality Dets differ from the intersective and
co-intersective ones in being closed under the formation of postcomplements:

Proposition 6 If D of type <1, 1> is intersective or co-intersective, and
D¬ = D, then D is trivial (D = 0 or D = 1).

Proof. Let D be intersective. Then D¬ is co-intersective by Proposition 3
(b). So D = D¬ is co-intersective and hence trivial. For D co-intersective the
argument is dual.

Moreover the expression of the postcomplement relation is natural and
does not use a distinctive syntax. Here are the simplest cases:
(21)
                 POSTCOMPLEMENT
more than n/m  | less than 1 − n/m
exactly n/m    | exactly 1 − n/m
at most n/m    | at least 1 − n/m
more than n%   | less than (100 − n)%
exactly n%     | exactly (100 − n)%
at most n%     | at least (100 − n)%

Notice that in our first group n/m ranges over all fractions and so includes
1 − n/m = (m − n)/m. Similarly in the second group n ranges at least over the natural
numbers between 0 and 100 (inclusive) so includes both n% and (100 − n)%.
Thus the linguistic means we have for expressing ratios covers proportionality expressions and their postcomplements indifferently. (Note that postcomplement is symmetric: D¬ = F iff F¬ = D.) Recall also that all the natural
classes we have adduced are closed under the pointwise boolean operations,
expressible with appropriate uses of and, or and not.
Now recall the fractions p, q for which the Mid-Point Theorem holds. If
p = n/m and p and q sum to 1 then q = 1 − n/m. And more than n/m and less than
1 − n/m are postcomplements. (If more than 3/10ths of the As are Bs then less
than 7/10ths of the As are non-Bs.) Similarly between n/m and 1 − n/m just means
the same as at least n/m and at most 1 − n/m. So we have

Theorem 4 (Generalized Mid-Points) For D of type <1, 1>, (D ∧ D¬)
and (D ∨ D¬) are fixed by ¬, as are their complements (¬D ∨ Dᵈ) and (¬D ∧ Dᵈ).

Partial proof.
a. (D ∧ D¬)¬ = D¬ ∧ D¬¬ = D¬ ∧ D = D ∧ D¬.
b. ¬(D ∧ D¬) = ¬D ∨ ¬(D¬) = ¬D ∨ Dᵈ, and
(¬D ∨ Dᵈ)¬ = (¬D)¬ ∨ (Dᵈ)¬ = Dᵈ ∨ ¬D = ¬D ∨ Dᵈ.

These proofs use the following proposition.

Proposition 7 The postcomplement function is self inverting (D¬¬ = D)
and thus bijective, and it commutes with ¬, ∧ and ∨ and thus is a boolean automorphism of GQ_E.
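Theorem 4 holds for arbitrary D, proportional or not, and is directly machine-checkable: taking for D the cardinal (hence non-proportional) Det exactly two, the meet and join of D with its postcomplement are fixed by ¬, and so are their complements. Again the harness below is our own illustrative sketch.

```python
from itertools import combinations

E = frozenset({1, 2, 3, 4})
POW = [frozenset(c) for r in range(len(E) + 1) for c in combinations(E, r)]

D = lambda A, B: int(len(A & B) == 2)          # "exactly two"

def post(F): return lambda A, B: F(A, E - B)   # postcomplement F-not
def comp(F): return lambda A, B: 1 - F(A, B)   # boolean complement not-F
def dual(F): return comp(post(F))              # F^d
def meet(F, G): return lambda A, B: F(A, B) & G(A, B)
def join(F, G): return lambda A, B: F(A, B) | G(A, B)

def fixed(F):
    # F is fixed by postcomplement iff FAB = FA(E - B) for all A, B
    return all(F(A, B) == F(A, E - B) for A in POW for B in POW)
```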
Below we give some (more) examples of proportionality Dets which are fixed
by postcomplement. Colloquial expression may involve turns of phrase other
than simple conjunction and disjunction.
(22) Some examples
A. more than three tenths but less than seven tenths
more than three out of ten but less than seven out of ten
more than thirty per cent but less than seventy per cent
exactly a quarter or exactly three quarters
exactly one in four or exactly three out of four
exactly twenty-five per cent or exactly seventy-five per cent
at least three tenths and at most seven tenths
at least three out of ten and at most seven out of ten
at least thirty per cent and at most seventy per cent
between a quarter and three quarters
between twenty-five per cent and seventy-five per cent
exactly one (student) in ten or exactly nine (students) in ten
B. not more than three tenths or not less than seven tenths
at most three tenths or at least seven tenths
more than three out of ten and less than seven out of ten
at most three out of ten or at least seven out of ten
more than thirty per cent and less than seventy per cent
at most thirty per cent or at least seventy per cent
not at least three tenths or not at most seven tenths
less than three tenths or more than seven tenths
not at least three out of ten or not at most seven out of ten
less than three out of ten or more than seven out of ten
not at least thirty per cent or not at most seventy per cent
less than thirty per cent or more than seventy per cent
6. Summary

So far we have semantically characterized a variety of Dets which build DPs
satisfying (13), as illustrated in (23):

(23) a. More than three out of ten but less than seven out of ten students are vegans.
b. More than three out of ten but less than seven out of ten students aren't vegans.
a′. At least three out of four or else at most one out of four students are vegans.
b′. At least three out of four or else at most one out of four students aren't vegans.
But our paradigms above are given for generalized quantifiers in general, not
just proportionality ones. To what extent can they be instantiated by Dets that
are not proportionality ones? Proposition 6 tells us that, putting the trivial
Dets aside, we cannot find an intersective Det which is its own postcomplement, nor can we find a co-intersective one meeting that condition. But we
can choose <Det, Det+neg> pairs where one is intersective and the other
co-intersective. Here are a few examples.

(24)
                     POSTCOMPLEMENT
some               | not all
no                 | every
exactly five       | all but five
at most five       | all but at most five
just finitely many | all but finitely many
no. . . but John   | every. . . but John

The left hand members of the table above are all intersective, the right hand
members co-intersective. And clearly (25-a), (25-b) are logically equivalent,
as are (26-a), (26-b):

(25) a. Some but not all students read the Times. (F ∧ F¬)
b. Some but not all students don't read the Times.

(26) a. Either all or none of the students will pass that exam. (F ∨ F¬)
b. Either all or none of the students won't pass that exam.

Note that the (compound) Dets in these two examples are properly proportional: some but not all = more than zero per cent and less than 100 per
cent, and all or none = either 100 per cent or else exactly zero per cent. For
the record,

(27) some but not all, which denotes (SOME ∧ ¬ALL), is proportional
and not intersective or co-intersective (E assumed to have at least
two elements).
Basically some but not all fails to be intersective because of the not all part,
which is not intersective; and it fails to be co-intersective because some fails
to be. To see that it is proportional, suppose that the proportion of As that are
Bs is the same as the proportion of Xs that are Ys. Then if Some but not all
As are Bs is true then at least one A is a B and at least one A is not a B, so the
percentage of As that are Bs lies strictly between 0% and 100%, which is
exactly where the percentage of Xs that are Ys lies, whence some but not all Xs
are Ys. One sees then that the (complete) boolean closure of INT_E ∪ CO-INT_E
includes many functions that lie outside INT_E and CO-INT_E. In fact, by Keenan
(1993), this closure is exactly the set of conservative functions and so includes
in particular all the conservative proportional ones.
Note now however that examples (28-a), (28-b) are logically equivalent,
as the <1, 1> functions the Dets denote are postcomplements, but either
exactly five or else all but five is not proportional:

(28) a. Either exactly five or else all but five students came to the party.
b. Either exactly five or else all but five students didn't come to the party.

To see this let A have 100 members, just five of which are Bs. The D denoted
by the Det in (28-a) maps A, B to 1. But for |X| = 1,000 and |X∩Y| = 50,
that D maps X, Y to 0, even though the proportion of Xs that are Ys, 1/20, is the
same as the proportion of As that are Bs.

Clearly then certain boolean compounds of intersective with co-intersective
Dets yield some non-proportional Dets which satisfy Theorem 4, so that
paradigm is not limited to proportionality Dets.

A last case of DPs that may satisfy Theorem 4 is given by partitives of the
form in (29):

(29) a. (EXACTLY n OF THE 2n)(A)(B) = 1 iff |A| = 2n and |A∩B| = n.
b. (BETWEEN n AND 2n OF THE 3n)(A)(B) = 1 iff |A| = 3n and n ≤ |A∩B| ≤ 2n (n > 0).

Of course in general DPs of the form exactly n of the m Ns are not fixed by ¬.
But in the case where m = 2n they are. Note that if we treat exactly n of the
m as a Det (an analysis that we, along with most linguists, reject) we have:

(30) For m > n, exactly n of the m is in general not intersective, co-intersective or proportional (but is conservative and permutation invariant).
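The partitive case (29-a) is also mechanically checkable: exactly n of the 2n is fixed by postcomplement, while exactly n of the m for m ≠ 2n is not. A sketch with n = 2 over an invented six-element universe (the harness is ours, not the paper's):

```python
from itertools import combinations

E = frozenset(range(1, 7))            # a six-element universe
POW = [frozenset(c) for r in range(len(E) + 1) for c in combinations(E, r)]

def exactly_n_of_the_m(n, m):
    # (EXACTLY n OF THE m)(A)(B) = 1 iff |A| = m and |A & B| = n
    return lambda A, B: int(len(A) == m and len(A & B) == n)

def fixed(F):
    # F is fixed by postcomplement iff FAB = FA(E - B) for all A, B
    return all(F(A, B) == F(A, E - B) for A in POW for B in POW)
```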
Acknowledgments
The paper was supported by BSF Grant # 1999210.
Appendix

This appendix contains the proofs of Proposition 1 and Theorems 2 and 3.

Proposition 1
2. Each of the classes K defined in Definition 2 is closed under the pointwise boolean operations and is thus a (complete, atomic) boolean subalgebra of [P(E) → GQ_E].

Proof sketch. (1b). Let D be conservative, let A∩B = A∩B′. We show
that (¬D)(A)(B) = (¬D)(A)(B′): (¬D)(A)(B) = ... = ¬(DAB) = ¬(DA(A∩B)) = ¬(DA(A∩B′)) = ¬(DAB′) = ... = (¬D)(A)(B′). For D¬, let A∩B =
A∩B′. Then D¬AB = DA(¬B) = DA(A∩¬B) = DA(A − (A∩B)) = DA(A − (A∩B′)) = DA(A∩¬B′) = DA(¬B′) = D¬(A)(B′).

(2c). Let D be intersective. Let A − B = X − Y and show that D¬AB =
D¬XY. D¬AB = (DA)¬(B) = DA(¬B) = DA(A − B) = D(A − B)(A − B), since
A∩(A − B) = (A − B)∩(A − B), = D(X − Y)(X − Y) = DX(X − Y) = DX(¬Y) =
(DX)¬(Y) = D¬XY.

Further, since D is intersective so is ¬D by Prop 1, whence by the above,
(¬D)¬ = Dᵈ is co-intersective.

Theorem 2 (First Mid-Point Theorem) Let p, q be fractions with 0 ≤ p ≤
q ≤ 1, p + q = 1. Then (BETWEEN p AND q) and (MORE THAN p AND
LESS THAN q) are both fixed by ¬.

Proof. Assume (BETWEEN p AND q)(A)(B) = 1. Show (BETWEEN p AND
q)(A)(¬B) = 1. Suppose, leading to a contradiction, that |A∩¬B|/|A| < p. Then the
proportion of As that are Bs is greater than q, contrary to assumption. The
second case, in which |A∩¬B|/|A| > q, is similar. Hence the proportion of As that
aren't Bs lies between p and q.

Theorem 3 FIX(¬) is closed under the pointwise boolean operations and
so is a complete (and thus atomic) subalgebra of Type <1,1>.

Proof. a. Let D ∈ FIX(¬). We must show that for all sets A, (¬D)¬(A) =
(¬D)(A), that is, ¬D is fixed by ¬. Let A, B be arbitrary. Then

(¬D)¬(A)(B) = ¬(D(A)(¬B))   Pointwise (twice)
            = ¬(D(A)(B))    D(A) is fixed by ¬
            = (¬D(A))(B)    Pointwise
            = ((¬D)(A))(B)  Pointwise

Thus (¬D)¬(A) = (¬D)(A), as was to be shown.
b. Show (D ∧ D′)¬ = (D ∧ D′) for D, D′ ∈ FIX(¬), i.e. show (D ∧ D′)¬(A) = ((D ∧ D′)(A)):

(D ∧ D′)¬(A)(B) = (DA ∧ D′A)¬(B)
                = ((DA)¬ ∧ (D′A)¬)(B)
                = (DA)¬(B) ∧ (D′A)¬(B)
                = DA(B) ∧ D′A(B)
                = (DA ∧ D′A)(B)
                = (D ∧ D′)(A)(B)

Essentially the same proof carries over for ⋀ᵢ Dᵢ replacing D ∧ D′, showing
completeness. Atomicity then follows.
References

Barwise, Jon and Robin Cooper
1981 Generalized quantifiers in natural language. Linguistics and Philosophy 4: 159-219.
Keenan, Edward L.
1993 Natural language, sortal reducibility and generalized quantifiers. Journal of Symbolic Logic 58: 314-325.
2004 Excursions in natural logic. In Claudia Casadio, Philip J. Scott, and Robert A. G. Seely (eds.), Language and Grammar: Studies in Mathematical Linguistics and Natural Language. Stanford: CSLI.
Keenan, Edward L. and Jonathan Stavi
1986 A semantic characterization of natural language determiners. Linguistics and Philosophy 9: 253-326.
Lindström, Per
1966 First order predicate logic with generalized quantifiers. Theoria 32: 186-195.
Westerståhl, Dag
1989 Quantifiers in formal and natural languages. In Dov Gabbay and Franz Guenthner (eds.), Handbook of Philosophical Logic, Vol. IV. Dordrecht: Reidel.
On the logic of LGB type structures.
Part I: Multidominance structures

Marcus Kracht

Abstract
The present paper is the first part of a sequence of papers devoted to the modal logics
of structures that arise from Government and Binding theory. It has been shown in
(Kracht 2001b) that they can be modeled by so-called multidominance structures
(MDSs). The result we are going to prove here is that the dynamic logic of the
MDSs is decidable in 2EXPTIME. Moreover, we shall indicate how the theory of
Government and Binding as well as the Minimalist Program can be coded in dynamic
logic. Some preliminary decidability results for GB are obtained, which will be
extended in the sequel to this paper.
1. Introduction

In recent years, the idea of model theoretic syntax has been getting more
attention. One of the advantages of model theoretic syntax is that because
it describes syntactic structures using a logical language, fundamental theoretical questions can receive a precise formulation and can, hopefully, be
answered. This idea can be found already in (Stabler 1992), where it was
argued that questions of dependency among different modules of grammar,
or independence questions for principles, can be translated into logical questions. Stabler chose a translation into predicate logic, accompanied by an
implementation in Prolog. Thus, the questions could be posed to a computer,
which would then answer them. The problem with this procedure is twofold.
Often the predicate logic of a class of structures is undecidable and so not
all questions can effectively be answered (and it is impossible to know which
ones). Second, even if the logic is decidable we need to know about its complexity so that we know how long we have to wait until we get an answer.
Thus, the best possible result would be one where we had not only a decidability result but also a complexity result, preferably showing that complexity
is low.

Rabin has shown that the (weak) monadic second order logic (MSO) of
trees is decidable, a result that James Rogers (1998) has applied to syntactic
theory. The main disadvantage of this approach is that it does not cover LGB
type structures.¹ The obvious step was to reduce the latter to the former. This
is not always possible, but it led to a result (independently proved by James
Rogers and myself) that if head movement is bounded then Minimality in the
sense of Rizzi (1990) or Locality in the sense of Manzini (1992) come down
to the theory that the language is strongly context free. However, nothing
could be said about the case when head movement was unbounded because
the reduction fails in this case. Now, Rogers remarks that adding free indexation makes the second order theory undecidable (it is no longer monadic),
and so the monadic second order theory of LGB type structures might after
all be undecidable.

The good news however is that this does not seem to be the case. In
this paper I shall show that the dynamic logic of a good many classes of
structures is decidable. An application to non-context free languages will be
given. Moreover, I shall describe how GB type structures as well as MP type
structures can be described using dynamic logic. The sequel to this paper
will generalise the result of this paper still further.² It will emerge that many
theories of generative grammar are effectively decidable. This is hopefully
the beginning of a general decidability proof that covers the linguistically
relevant structures. The applications of the present results are manifold. We
are given a decision procedure to see whether certain principles of grammar
are independent or not, and we are given a decision procedure to see whether
or not a sentence is in the language.

I have tried to include in the paper all essential definitions. Nevertheless, this paper is not easy to read without some background knowledge. In
particular, I am relying on (Kracht 2001b) for a discussion of the relevance of
the structures discussed below to syntactic structures known from generative
grammar. However, making the material accessible to an ordinary linguistic
audience would make this paper of book size length.³
2. Multidominance structures

In generative grammar, structures are derived from deep structure trees. In
(Kracht 2001b) I considered three kinds of structures: trace chain structures (TCSs), copy chain structures (CCSs) and multidominance structures (MDSs). TCSs are the kind of entities most popular in linguistics.
When an element moves, it leaves behind a trace and forms a chain together
with its trace. The technical implementation is a little different, but the idea is
very much the same. CCSs are different in that the moving element does not
leave just a trace behind but a full copy of itself. This type of chain structures
is more in line with recent developments (the Minimalist Program, henceforth
MP), rather than with Government and Binding (= GB). MDSs, however, are
different from both. In an MDS, there are no traces. Instead, movement to
another position is represented by the addition of a link to that position. Thus,
as soon as there is movement there are elements which have more than one
mother. Moreover, it was shown in (Kracht 2001b) that MDSs contain exactly the same information as TCSs, since there is an algorithm that converts
one into the other. MDSs, like TCSs, are based on an immediate dominance
relation, written >. (The converse of this relation is denoted by <.) In what
is to follow, we assume that structures are downward binary branching. Every node has at most two daughters. To implement this we shall assume two
relations, >₀ and >₁, each of which is a partial function, and > = >₀ ∪ >₁. We
do not require the two relations to be disjoint.

Recall the definition of the transitive closure R⁺ of a binary relation R ⊆
U × U over a set U. It is the least set S containing R such that if (x, y) ∈ S
and (y, z) ∈ S then also (x, z) ∈ S. Recall that R is loop free if and only if R⁺ is
irreflexive. Also, R* := {(x, x) | x ∈ U} ∪ R⁺ is the reflexive, transitive closure
of R.
Definition 1 A preMDS is a structure (M, >₀, >₁), where the following holds
(with > = >₀ ∪ >₁):
(P1) If y >₀ x and y >₀ x′ then x = x′.
(P2) If y >₁ x and y >₁ x′ then x = x′.
(P3) If y >₁ x then there is a z such that y >₀ z.
(P4) There is exactly one x such that for no y, y > x (this element is called the root).
(P5) >⁺ is irreflexive.
(P6) The set M(x) := {y : x < y} is linearly ordered by <⁺.

We call a pair (x, y) such that x < y a link. We shall also write x ; y to say that
(x, y) is a link. An MDS is shown in Figure 1. The lines denote the immediate
daughter links. For example, there is a link from a upward to c. Hence we
have a < c, or, equivalently, c > a. We also have b < c. We use the standard
practice of making the order of the daughters implicit: the leftward link is to
[Diagram omitted: the nodes mentioned in the text are a, b, c, d, e, f, g, h, connected by the immediate-dominance links described below.]
Figure 1. An MDS
the daughter number 0. This means that a <₀ c and b <₁ c. Similarly, it is seen
that b <₁ d and b <₁ h, while c <₀ d and g <₀ h. It follows that M(a) = {c},
while M(b) = {c, d, h}. A link (x, y) such that y is minimal in M(x) is called a
root link. For example, (b, c) is a root link, since c <⁺ d and c <⁺ h. A link
that is not a root link is called derived. A leaf is a node without daughters.
For technical reasons we shall split <₀ and <₁ into two relations each. Put x <₀₀ y iff (= if and only if) x <₀ y and y is minimal in M(x); and put x <₀₁ y iff x <₀ y but y is not minimal in M(x). Alternatively, x <₀₀ y if x <₀ y and (x, y) is a root link, and x <₀₁ y iff x <₀ y but not x <₀₀ y. Then by definition

<₀₀ ∩ <₀₁ = ∅  and  <₀ = <₀₀ ∪ <₀₁

Similarly, we decompose <₁ into

<₁ = <₁₀ ∪ <₁₁

where x <₁₀ y iff x <₁ y and y is minimal in M(x) (or, equivalently, (x, y) is a root link), and x <₁₁ y iff x <₁ y and y is not minimal in M(x). We shall define

⊲₀ := <₀₀ ∪ <₁₀
⊲₁ := <₀₁ ∪ <₁₁
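This split is deterministic once the dominance relation is known. A small illustrative Python sketch (the pair encodings and names are ours):

```python
# Splitting the daughter relations into root (minimal) and derived parts.
# lt0/lt1 hold pairs (x, y) with x <_0 y resp. x <_1 y; dom_plus is <+,
# the transitive closure of the daughter-of relation.

def decompose(lt0, lt1, dom_plus):
    lt = lt0 | lt1
    M = {}
    for (x, y) in lt:
        M.setdefault(x, set()).add(y)

    def is_root_link(x, y):
        # y is the least element of M(x) with respect to <+
        return all(z == y or (y, z) in dom_plus for z in M[x])

    lt00 = {(x, y) for (x, y) in lt0 if is_root_link(x, y)}
    lt10 = {(x, y) for (x, y) in lt1 if is_root_link(x, y)}
    return lt00, lt0 - lt00, lt10, lt1 - lt10  # <_00, <_01, <_10, <_11
```

For a node b with mothers c and d, where c <⁺ d, the link (b, c) comes out as the root link and (b, d) as derived.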
On the logic of LGB type structures 109

We shall spell out the conditions on these four relations in place of just <₀ and <₁. The structures we get are called PMDSs.

Definition 2 An MDS is a structure (M, >₀₀, >₀₁, >₁₀, >₁₁) which, in addition to (P1) – (P6) of Definition 1, satisfies⁴

(P7) If y ∈ M(x) then x ⊲₀ y iff x; y is a root link (iff y is the least element of M(x) with respect to <⁺).
We assume that the leaves are linearly ordered in the following way.

x ≺ y :⟺ (∃z)(∃u)(∃v)(x ⊲₀* z <₀₀ u >₁₀ v ⊳₀* y)   (1)

(Here ⊳₀ denotes the converse of ⊲₀.) This is not the only possible ordering; this establishes in fact the order at D-structure. This is enough for the present purposes, though. It is verified that a ≺ b ≺ e, for example.

Write RS for the relation {(x, z) : there is y : x R y S z}. We can then restate (1) as follows.

≺ := ⊲₀* <₀₀ >₁₀ ⊳₀*
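The composition reading of (1) can be executed directly on finite relations. A sketch (assuming the pair encodings used in the earlier sketches; not from the paper):

```python
# The D-structure order as a relation composition: up along root links, one
# left root link up, one right root link down, then down along root links.
# Pairs (x, y) in lt00/lt10 mean x is a left/right root daughter of y.

def compose(R, S):
    return {(x, z) for (x, y) in R for (y2, z) in S if y == y2}

def star(R, U):
    c = {(x, x) for x in U} | set(R)
    while True:
        new = compose(c, c)
        if new <= c:
            return c
        c |= new

def precedence(U, lt00, lt10):
    sub0 = lt00 | lt10                   # root daughter of
    sup0 = {(y, x) for (x, y) in sub0}   # converse: root mother of
    gt10 = {(y, x) for (x, y) in lt10}   # >_10
    up = compose(star(sub0, U), lt00)
    return compose(compose(up, gt10), star(sup0, U))
```

On a two-leaf tree with root r, left daughter a and right daughter b, the relation contains (a, b) but not (b, a).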
Table 1 gives a synopsis of the various relations used in this paper.
Definition 3 An ordered MDS (OMDS) is an MDS in which ≺ is transitive, irreflexive and linear on the leaves.

Now, since ⊲₀⁺ is a tree ordering, we can extend ≺ to an ordering between any two incomparable nodes (where x and y are incomparable if neither x ⊲₀⁺ y nor y ⊲₀⁺ x nor x = y). In fact, the extension is exactly as defined by (1). Details can be found, for example, in (Kracht 2003b). Notice that in an OMDS, <₀ ∩ <₁ = ∅. For suppose otherwise. Then for some x and y we have x <₀ y and x <₁ y, and therefore z ≺ z for every leaf z below x, by definition of ≺. Contradiction.

In the presence of the ordering postulate, the conditions (P6) and (P7) can be replaced by the following.

The set M(x) := {y : x < y} is linearly ordered by ⊲₀⁺.
This is easy to see. First we prove a lemma.

Lemma 1 Suppose that y < y′ and that there is no x such that y <⁺ x <⁺ y′. Then y ⊲₀ y′.
Symbol   Definition                      Meaning
1_M      {(x, x) : x ∈ M}                diagonal
RS       {(x, z) : (∃y)(x R y S z)}      concatenation
R ∪ S                                    union
R⁺       R ∪ RR ∪ RRR ∪ ···              transitive closure
R*       1_M ∪ R⁺                        reflexive and transitive closure
<₀₀                                      left root daughter of
<₁₀                                      right root daughter of
<₀₁                                      left non-root daughter of
<₁₁                                      right non-root daughter of
<₀       <₀₀ ∪ <₀₁                       left daughter of
<₁       <₁₀ ∪ <₁₁                       right daughter of
⊲₀       <₀₀ ∪ <₁₀                       root daughter of
⊲₀⁺      (<₀₀ ∪ <₁₀)⁺                    root descendant of
⊲₁       <₀₁ ∪ <₁₁                       non-root daughter of
<        <₀ ∪ <₁                         daughter of
≺        ⊲₀* <₀₀ >₁₀ ⊳₀*                 left of (at deep structure)

Table 1. Synopsis of Relations
The proof of the claim is in the fact that y′ ∈ M(y). If the link is derived it is not minimal, so there is a z such that y ⊲₀ z <⁺ y′. And conversely.

Suppose now that x <⁺ y. Then there is a chain x = y₀ < y₁ < y₂ < ··· < yₙ = y. The longest such chain contains only nonderived links, by Lemma 1. This means that x ⊲₀⁺ y. Now, ⊲₀⁺ is a tree ordering, so that if y′ ∈ M(x), then x ⊲₀⁺ y′ as well, and so either y = y′ or y ⊲₀⁺ y′ or y′ ⊲₀⁺ y, as promised.
Proposition 1 Let M be an MDS. M is an OMDS iff the following holds: if x is not the root, <₁₀ is defined on x iff <₀₀ is undefined on x.
We shall prove the theorem and exhibit some useful techniques. We code the elements of M by sequences in the following way. Let I be a chain ⟨xᵢ : i < n+1⟩ such that x₀ is the root, and xᵢ ⊳₀ xᵢ₊₁ for every i < n. (So we are going down.) We call I a standard identifier for x and denote it by I(x). n is called the standard depth of xₙ and we write sd(xₙ) to denote it.

Lemma 2 In an OMDS, every x has exactly one standard identifier. Hence, the standard depth of x is uniquely defined.

(See also (Kracht 2001b) on the notion of an identifier.) Let us see why the standard identifier is unique.
We translate the identifier into a binary sequence b₀b₁···bₙ defined by

bᵢ = 0 if xᵢ >₀₀ xᵢ₊₁,  bᵢ = 1 if xᵢ >₁₀ xᵢ₊₁.   (2)

In this way, we associate a binary sequence with each node. Now recall that (1) defines a linear ordering on the leaves. This means that the number associated to x via (2) is unique. For if not, there are two distinct sequences, b₀b₁···bₙ and c₀c₁···cₘ for xₙ. Let j be the least index such that bⱼ ≠ cⱼ, say bⱼ = 0 and cⱼ = 1. Then, by (1), if z is a leaf below xₙ, z ≺ z. Contradiction.

Now, let x be given. It has a sequence b₀b₁···bₙ associated with it. Let y ⊳₀ x. Then y is defined by b₀b₁···bₙ₋₁, which is unique. So, ⊲₀ is a partial function. Conversely, if ⊲₀ is a partial function, then the translation into binary sequences is unique. Now define ≺ for sequences by b₀b₁···bₙ ≺ c₀c₁···cₘ iff for the first j such that bⱼ ≠ cⱼ, bⱼ = 0 < cⱼ = 1. This is exactly the order (1), spelled out for the representing sequences. This order is loop free, transitive and linear on the maximal sequences (which correspond to the leaves). We add that b₀b₁···bₘ is immediately to the left of c₀c₁···cₙ if

b₀b₁···bₘ = b₀b₁···bⱼ₋₁011···1,
c₀c₁···cₙ = b₀b₁···bⱼ₋₁100···0

(The lengths of these sequences need not be equal.)
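Both the coding and the induced order are easy to compute. A sketch (the list encoding of identifiers is our own):

```python
# Binary coding (2) of a standard identifier, and the lexicographic order on
# the resulting sequences. gt00 holds pairs (mother, daughter) for >_00.

def code(identifier, gt00):
    """identifier: the chain [x_0, ..., x_n] from the root down along root
    links; bit i is 0 for a left root link, 1 for a right root link."""
    return [0 if (x, y) in gt00 else 1
            for x, y in zip(identifier, identifier[1:])]

def precedes(b, c):
    """b comes before c iff at the first differing index b has 0 and c has 1."""
    for bi, ci in zip(b, c):
        if bi != ci:
            return bi == 0 and ci == 1
    return False  # equal, or one is a prefix of the other
```

Maximal (leaf) sequences are linearly ordered by `precedes`, while a sequence never precedes its own prefix.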
I should emphasise that the identifiers do not necessarily form a tree domain. Recall that a tree domain T is a subset of ℕ* such that the following holds: (a) if xi ∈ T then x ∈ T, and (b) if xj ∈ T and i < j then also xi ∈ T. Property (a) holds but (b) does not hold in general. For suppose that x >₀₁ y and x >₁₀ z. Then I(z) = I(x)1. However, since the link y; x is derived there is no standard identifier of the form I(x)0. The identifier I(y) contains I(z) = I(x)1 as a prefix.
3. Dynamic logic
The language of propositional dynamic logic (PDL) is defined as follows. Given any set Π₀ of so-called basic programs,⁵ a set Γ of propositional constants, and a set V of variables, the sets of formulae and programs are the least sets satisfying:

If a ∈ Γ is a propositional constant, a is a formula.
If p ∈ V is a propositional variable, p is a formula.
If φ, φ′ are formulae, so are ¬φ and φ ∧ φ′.
If α ∈ Π₀ is a basic program, α is a program.
If α, α′ are programs, so are α; α′, α ∪ α′ and α*.
If φ is a formula, φ? is a program.
If α is a program and φ a formula, ⟨α⟩φ is a formula.

We put φ ∨ φ′ := ¬(¬φ ∧ ¬φ′) and [α]φ := ¬⟨α⟩¬φ, and similarly for other boolean connectives. The minimal logic, denoted by PDL, is the least set of formulae with the following properties:

1. All propositional tautologies are in PDL.
2. [α](φ → φ′) → ([α]φ → [α]φ′) ∈ PDL.
3. ⟨φ?⟩φ′ ↔ (φ ∧ φ′) ∈ PDL.
4. ⟨α ∪ α′⟩φ ↔ ⟨α⟩φ ∨ ⟨α′⟩φ ∈ PDL.
5. ⟨α; α′⟩φ ↔ ⟨α⟩⟨α′⟩φ ∈ PDL.
6. [α*](φ → [α]φ) → (φ → [α*]φ) ∈ PDL.
7. If φ ∈ PDL then [α]φ ∈ PDL.
8. If φ → φ′ ∈ PDL and φ ∈ PDL then φ′ ∈ PDL.
9. If φ ∈ PDL, then s(φ) ∈ PDL for every substitution s.

Here, a substitution is defined to be a function s that assigns a formula s(p) to every variable p. The formula s(φ) is obtained by replacing every occurrence of a variable p by s(p), for every variable p. A dynamic logic is a set L ⊇ PDL which has the properties (7) – (9). Let φ be a formula and L a dynamic logic; then L ⊕ φ denotes the least dynamic logic containing L and φ. Similarly with a set Δ in place of φ.
Model structures are of the form F = (W, C, R), where W is a set (the set of worlds or points), C : Γ → ℘(W) a function assigning each constant a set of worlds, and R : Π₀ → ℘(W × W) a function assigning each basic program a binary relation on W. A valuation is a function β : V → ℘(W). Based on this we define the interpretation of complex programs as relations in the following way.

R(α ∪ α′) := R(α) ∪ R(α′)
R(α; α′) := R(α) R(α′)
R(α*) := R(α)*
R(φ?) := {(w, w) : (F, β, w) ⊨ φ}

The truth of a formula at a world is defined thus.

(F, β, w) ⊨ ¬φ :⟺ (F, β, w) ⊭ φ
(F, β, w) ⊨ φ ∧ φ′ :⟺ (F, β, w) ⊨ φ and (F, β, w) ⊨ φ′
(F, β, w) ⊨ ⟨α⟩φ :⟺ there is u: w R(α) u and (F, β, u) ⊨ φ

We write F ⊨ φ if for all valuations β and all worlds w: (F, β, w) ⊨ φ. The logic of a class K of structures is

Th(K) := {φ : for all F ∈ K: F ⊨ φ}

It has been shown that PDL is the logic of all structures and that it is also the logic of the finite structures. From this follows the decidability of PDL. However, more is known.
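On a finite model these clauses translate directly into a model checker: programs evaluate to relations, formulae to sets of worlds. A compact sketch (the tuple encoding of the syntax is ours, not the paper's):

```python
# PDL model checking over a finite model: W is the set of worlds, R maps basic
# program names to relations, val maps variables to sets of worlds.

def rel(prog, W, R, val):
    tag = prog[0]
    if tag == 'basic':
        return R[prog[1]]
    if tag == 'union':
        return rel(prog[1], W, R, val) | rel(prog[2], W, R, val)
    if tag == 'comp':
        r, s = rel(prog[1], W, R, val), rel(prog[2], W, R, val)
        return {(w, v) for (w, u) in r for (u2, v) in s if u == u2}
    if tag == 'star':
        r = rel(prog[1], W, R, val)
        c = {(w, w) for w in W} | r
        while True:
            new = {(w, v) for (w, u) in c for (u2, v) in c if u == u2}
            if new <= c:
                return c
            c |= new
    if tag == 'test':
        return {(w, w) for w in sat(prog[1], W, R, val)}

def sat(phi, W, R, val):
    tag = phi[0]
    if tag == 'var':
        return val[phi[1]]
    if tag == 'not':
        return W - sat(phi[1], W, R, val)
    if tag == 'and':
        return sat(phi[1], W, R, val) & sat(phi[2], W, R, val)
    if tag == 'dia':                       # <alpha>phi
        r = rel(phi[1], W, R, val)
        s = sat(phi[2], W, R, val)
        return {w for (w, u) in r if u in s}
```

For example, over W = {0, 1, 2} with R(a) = {(0, 1), (1, 2)} and p true at 2, the formula ⟨a*⟩p holds at every world.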
Theorem 1 PDL is EXPTIME-complete.

This means that there are constants c and b and a polynomial p(x) such that for every formula φ of length n > c the time needed to solve the problem whether or not φ ∈ PDL is at most b^p(n). (Additionally, any problem of this complexity can be coded as such a problem in polynomial time.)
4. Grammars as logics
In context free grammars one distinguishes the terminal alphabet from the rules. A similar distinction is made here as well. Nodes that have no daughters are called terminal. The lexicon is a set of declarations which state what labels a terminal node may have. This is typically done by introducing a finite set of constants and the statement that all and only those nodes may be terminal at which one of the constants is true. Since the constants are part of the language, the lexicon is effectively identified with a specific nonmodal formula. In fact, we are more generous here and assume that the lexicon is a constant formula, which may involve modal operators. This is useful when we want to assume that the lexicon also contains complex items, as is often assumed in generative grammar. The grammar is a (finite) set of formulae expressed in the above language. While the grammar is believed to be the same for all languages, the lexicon is subject to variation.
The logic DPDL (deterministic PDL) is obtained from PDL by adding the formulae ⟨α⟩φ → [α]φ for every formula φ and basic program α. (Nonbasic programs will not necessarily satisfy this postulate even if the basic ones do.) A frame is a frame for DPDL iff for every basic program α: if x R(α) y and x R(α) y′ then y = y′. (Recall that α is called deterministic if it has that property, and this is the reason the logic is called DPDL.) Furthermore, the logic of finite deterministic computations is obtained by adding the formula

[π⁺]([π⁺]p → p) → [π⁺]p

where π is the union of all basic programs (hence this definition requires that Π₀ is finite). If we want to mention the number n of programs, we write DPDLₙ.f. The following is proved in (Kracht 1999) (finite model property and decidability) and (Vardi and Wolper 1986) (EXPTIME-completeness).

Theorem 2 For every n, DPDLₙ.f is the logic of all finite structures with n basic programs, where the basic programs are deterministic and their union is loop free. DPDLₙ.f is decidable, it is EXPTIME-complete and complete with respect to finite trees.

Theorem 3 For every n, the PDL-logic of n-branching trees has the finite model property and is decidable.
Many of the basic results can also be obtained by using a translation of dynamic logic into monadic second-order logic (MSO). The disadvantage of using MSO is that the complexity of the logic is for the most part nonelementary (in the sense of recursion theory), while PDL is elementary (it is EXPTIME complete). Second, the main result that we shall establish here, the decidability of the dynamic logic of multidominance structures, cannot be derived in this way, as far as we can see. For this reason we shall use dynamic logic.
5. The logic of multidominance structures
Let us agree on the following notation. For each of the relations >ᵢⱼ we introduce a program υᵢⱼ, which is interpreted by a relation that we write >ᵢⱼ or <ᵢⱼ rather than R(υᵢⱼ). Structures are of the form

(M, >₀₀, >₀₁, >₁₀, >₁₁).

We use υ₀ in place of υ₀₀ ∪ υ₀₁, υ₁ for υ₁₀ ∪ υ₁₁ and υ for υ₀ ∪ υ₁. The programs υ₀ and υ₁ are interpreted as partial functions. Also, the notation δ₀ := υ₀₀ ∪ υ₁₀ and δ₁ := υ₀₁ ∪ υ₁₁ is frequently used. Finally, let us write

u := υ*

(u stands for universal.) A structure is called generated if there is a single element w such that the least set containing w which is closed under taking successors along all basic relations is the entire set of worlds. (In our case this is exactly true if the structure is a constituent.) The following is easy to see.

Lemma 3 Let M be a generated PDLₙ-structure with root x. Then we have (M, β, x) ⊨ [u]φ iff for all w: (M, β, w) ⊨ φ.
Our first goal is to axiomatise the logic of all MDSs. There is an important tool that we shall use over and over. A formula is constant if it contains no variables.

Theorem 4 Suppose that L is a logic containing PDLₙ which has the finite model property, and let φ be a constant formula. Then the logic L ⊕ φ also has the finite model property.

Proof. Suppose that χ is consistent with L ⊕ φ. Then χ; [u]φ also is L ⊕ φ-consistent, and a fortiori L-consistent. Thus it has a finite model ((F, R), β, x). We may actually assume that for every y, x R(u) y. Then y ⊨ φ, and so the frame is a frame for L ⊕ φ, since φ is constant. □

This theorem has interesting consequences worth pointing out. It allows us to focus on the grammar rather than the lexicon. This reduces the problem to some degree.
Definition 4 Let

PM := DPDL₄.f   (3)
⊕ ⟨υ₁⟩⊤ → ⟨υ₀⟩⊤   (4)
⊕ ⟨υ₀₀⟩⊤ → [υ₀₁]⊥   (5)
⊕ ⟨υ₁₀⟩⊤ → [υ₁₁]⊥   (6)
⊕ ⟨δ₁⟩p → ⟨δ₀⁺; υ⟩p   (7)
From (4) we get that each node with a right hand daughter also has a left hand daughter. Axiom (5) makes sure that one cannot have both a left hand derived and a left hand nonderived daughter; (6) is the same for right hand daughters. The postulates (4) – (6) are constant and can be added without sacrificing decidability, by Theorem 4. It follows easily that since υ₀₀ and υ₀₁ are both functional, so is their union υ₀; and likewise for υ₁.

Postulate (7) ensures that the structures are trees at deep structure. That means that ⊲₀⁺ is a tree ordering. This is because if z ⊲₁ y then z can also be reached from y by a path along nonderived links, as we shall show.

Lemma 4 Suppose F satisfies the following property.

Link. For all w and u: if w ⊳₁ u then there is a y such that w ⊳₀⁺ y and y ⊳ u.

(Here ⊳, ⊳₀ and ⊳₁ denote the converses of <, ⊲₀ and ⊲₁.) Then F satisfies (7).
Proof. Choose a valuation β and a point w such that

(F, β, w) ⊨ ⟨δ₁⟩p

So there is a u ⊲₁ w such that u ⊨ p. By assumption on F, there is a y such that w ⊳₀⁺ y and y ⊳ u. From the second we get y ⊨ ⟨υ⟩p, and from the first

(F, β, w) ⊨ ⟨δ₀⁺⟩⟨υ⟩p

This shows the claim. □
Using this we prove that the axioms of PM are valid in all MDSs. This is Lemma 5 below. This is one half of the characterisation, Theorem 5, which asserts that a finite structure satisfies the axioms of PM exactly if it actually is an MDS. The other half is constituted by Lemma 8.

Lemma 5 Finite MDSs are PM-structures.

Proof. It is clear that MDSs satisfy the postulates (4), (5) and (6). We need to show that they satisfy (7). To see this, we shall verify that they satisfy the property Link from Lemma 4. To this end, take an MDS (M, >₀₀, >₀₁, >₁₀, >₁₁). Suppose that x ⊳₁ y. Then x ∈ M(y), and there is, by assumption, an element u ∈ M(y) such that u <⁺ x. (Notice that by (P7) of Definition 2, x cannot be the least element in M(y) with respect to <⁺ since the link (y, x) is derived.) Choose a path π₀ = u; ···; x. If this path contains only root links we are done. Otherwise, let the path contain v; v′, a derived link. Then there is a path π = v; ···; w; v′ such that w ⊲₀ v′, by a similar argument. Replace the pair v; v′ in π₀ by π. This gives a path which is longer than π₀. Thus, as long as we have derived links we can replace them, increasing the length of the path. However, < is loop free and the structure finite. Hence, the procedure must end. It delivers a path without derived links, as promised. □
In connection with the following lemma, we say that R(α) satisfies the fixed point property if for all formulae φ, frames F, valuations β and points x:

(F, β, x) ⊨ ⟨α*⟩φ ↔ φ ∨ ⟨α; α*⟩φ

Lemma 6 Let (F, R) be a finite frame, β a valuation, and R(α) be loop free. Then for all x and φ:

(F, β, x) ⊨ ⟨α*⟩φ ↔ φ ∨ ⟨α; α*⟩φ
Proof. In PDL, φ → ⟨α*⟩φ and ⟨α; α*⟩φ → ⟨α*⟩φ are generally valid. Hence we only have to establish

(F, β, x) ⊨ ⟨α*⟩φ → φ ∨ ⟨α; α*⟩φ

By assumption on R(α), for every x there is a sequence x = x₀ → x₁ → x₂ → ··· → xₙ where xₙ has no R(α)-successor. We proceed by induction on the maximum length of such a chain starting at x. Call this the height of x. If the height is 0, x has no R(α)-successors. Then ⟨α; α*⟩φ is false, and so the claim reduces to

(F, β, x) ⊨ ⟨α*⟩φ → φ

which is correct. Now let x be of height n+1 and the claim proved for all points of height ≤ n. Suppose ⟨α*⟩φ is true at x. Then there is a chain x = x₀ → x₁ → x₂ → ··· → xₖ, and φ is true at xₖ. Two cases arise. k = 0, in which case x ⊨ φ and we are done. Or k > 0. Then, by inductive hypothesis, since x₁ has height ≤ n, (F, β, x₁) ⊨ ⟨α*⟩φ and so we have x ⊨ ⟨α; α*⟩φ, as promised. □
Say that a program α is progressive in a logic L if R(α) is loop free in every structure for L. In that case we say that a node x has α-height n if there is no sequence x → x₁ → x₂ → ··· → xₙ₊₁ of α-steps. If x has α-height 0 it means that it has no α-successors. The important fact to note is that we can restrict ourselves in the present context to progressive programs, and these are the programs which have the fixed point equation property. We say that α is contained in γ, in symbols α ⊑ γ, if L ⊢ ⟨α⟩p → ⟨γ⟩p. If L has the finite model property this is equivalent to R(α) ⊆ R(γ) in every finite L-structure. If L′ ⊇ L and α ⊑ γ in L, then this holds also in L′. α and γ are equivalent in L if α ⊑ γ as well as γ ⊑ α in L. If γ is progressive then so are γⁿ (n > 0) and γ⁺. The following theorem rests on the fact that the logic of finite computations has a maximal progressive program.
Lemma 7 In PDLₙ.f every program is equivalent to a program of the form φ?, γ, or φ? ∪ γ, where γ is progressive.

Proof. Notice that γ is equivalent to ⊥? ∪ γ, so we do not need a separate case for progressive programs. Let πᵢ, i < n, be the basic modalities. Put

σ := (π₀ ∪ π₁ ∪ ··· ∪ πₙ₋₁)⁺

In PDLₙ.f, σ is progressive. Then γ; σ as well as σ⁺ are likewise progressive. Every γ that is contained in a progressive program is also progressive. What we shall show is that every program that is not a test can be written as φ? ∪ γ where γ is contained in σ. Before we start, notice that if χ? is a test and γ ⊑ σ then χ?; γ ⊑ σ and likewise γ; χ? ⊑ σ.

We note that φ?; χ? is equivalent to (φ ∧ χ)? and that φ? ∪ χ? is equivalent to (φ ∨ χ)?. Finally, (φ?)* is equivalent to ⊤?, so that the program operators reduce on tests to a single test. Now, suppose that α₁ = φ₁? ∪ γ₁ and α₂ = φ₂? ∪ γ₂ with γ₁, γ₂ contained in σ. Then

α₁ ∪ α₂ = (φ₁? ∪ γ₁) ∪ (φ₂? ∪ γ₂)
        = (φ₁ ∨ φ₂)? ∪ (γ₁ ∪ γ₂)

is of the desired form.

α₁; α₂ = (φ₁? ∪ γ₁); (φ₂? ∪ γ₂)
       = (φ₁?; φ₂?) ∪ (φ₁?; γ₂) ∪ (γ₁; φ₂?) ∪ (γ₁; γ₂)
       = (φ₁ ∧ φ₂)? ∪ (φ₁?; γ₂ ∪ γ₁; φ₂? ∪ γ₁; γ₂)

which is again of the desired form. Finally, let α = φ? ∪ γ. We observe that φ? ⊑ ⊤?. Furthermore, since star is monotone,

α* ⊑ (⊤? ∪ γ)* = ⊤? ∪ γ⁺.

Now, γ ⊑ σ, and so γ⁺ ⊑ σ⁺ ⊑ σ, since σ is transitive. □
Definition 5 The Fisher–Ladner closure FL(φ) of a formula φ is the smallest set containing φ such that the following is satisfied.

1. If χ ∧ χ′ ∈ FL(φ) then χ, χ′ ∈ FL(φ).
2. If ¬χ ∈ FL(φ) then χ ∈ FL(φ).
3. If ⟨α ∪ α′⟩χ ∈ FL(φ) then ⟨α⟩χ, ⟨α′⟩χ ∈ FL(φ).
4. If ⟨α; α′⟩χ ∈ FL(φ) then ⟨α⟩⟨α′⟩χ ∈ FL(φ).
5. If ⟨α*⟩χ ∈ FL(φ) then χ, ⟨α⟩⟨α*⟩χ ∈ FL(φ).
6. If ⟨χ?⟩χ′ ∈ FL(φ) then χ, χ′ ∈ FL(φ).
7. If ⟨α⟩χ ∈ FL(φ), α basic, then χ ∈ FL(φ).

We remark that the size of FL(φ) is linear in the length of φ. This is shown by induction on φ. This means that complexity can be measured either in terms of the size of the formula or in terms of the size of FL(φ).
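The closure can be computed by unfolding the clauses of Definition 5. A sketch in the tuple encoding used in the earlier model-checking sketch (ours, not the paper's):

```python
# Fisher-Ladner closure of a tuple-encoded formula, clause by clause.

def fl(phi):
    todo, out = [phi], set()
    while todo:
        f = todo.pop()
        if f in out:
            continue
        out.add(f)
        tag = f[0]
        if tag == 'and':
            todo += [f[1], f[2]]                     # clause 1
        elif tag == 'not':
            todo.append(f[1])                        # clause 2
        elif tag == 'dia':
            alpha, chi = f[1], f[2]
            a = alpha[0]
            if a == 'union':                         # clause 3
                todo += [('dia', alpha[1], chi), ('dia', alpha[2], chi)]
            elif a == 'comp':                        # clause 4
                todo.append(('dia', alpha[1], ('dia', alpha[2], chi)))
            elif a == 'star':                        # clause 5
                todo += [chi, ('dia', alpha[1], f)]
            elif a == 'test':                        # clause 6
                todo += [alpha[1], chi]
            elif a == 'basic':                       # clause 7
                todo.append(chi)
    return out
```

For ⟨a*⟩p the closure consists of ⟨a*⟩p itself, p, and ⟨a⟩⟨a*⟩p, illustrating the linear bound.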
Now let At(φ) be the set of all conjunctions of formulae (or their negations) from the Fisher–Ladner closure of φ. (This set has a size exponential in the size of φ, which induces a rise in complexity for the logic of MDSs in Theorem 5 from EXPTIME to 2EXPTIME.) Set

X(φ) := {⟨δ₁⟩χ → ⟨δ₀⁺; υ⟩χ : χ ∈ At(φ)}
        ∪ {⟨υ₁⟩⊤ → ⟨υ₀⟩⊤, ⟨υ₀₀⟩⊤ → [υ₀₁]⊥, ⟨υ₁₀⟩⊤ → [υ₁₁]⊥}
Lemma 8 φ is consistent with PM iff φ; [u]X(φ) is consistent with DPDL₄.f.

Proof. (⇒). If φ; [u]X(φ) is inconsistent in DPDL₄.f, ¬φ can be proved from [u]X(φ) in DPDL₄.f. However, [u]X(φ) can be proved in PM. Hence ¬φ is provable in PM.

(⇐). Now let us suppose that φ; [u]X(φ) is DPDL₄.f-consistent. Then by Theorem 2 it has a finite model based on a frame

M = (M, >₀₀, >₀₁, >₁₀, >₁₁)

with root w₀ and valuation β. So,

(M, β, w₀) ⊨ φ; [u]X(φ)

Notice that the frame satisfies the formulae (4), (5) and (6). Hence we may assume that the relation ⊲₀ induces a tree ordering on the set of worlds, though with multiple roots (thus we have what is known as a forest). We shall construct a finite PM-model from this. Let S be the closure of w₀ under the relation ⊳₀, that is, S is the least set which contains w₀ and is closed under ⊳₀. Members of S are called standard points. Let

E := {w : there is v ∈ S such that w ⊲₁ v}
For a point w, let a(w) be the unique χ ∈ At(φ) such that

(M, β, w) ⊨ χ

Now choose a w ∈ E. Let v be a standard world such that w ⊲₁ v. By choice of X(φ),

(M, β, w₀) ⊨ [u](⟨δ₁⟩a(w) → ⟨δ₀⁺; υ⟩a(w))

where w₀ is the root. Hence

(M, β, v) ⊨ ⟨δ₁⟩a(w) → ⟨δ₀⁺; υ⟩a(w)

Since a(w) is true at w and since w ⊲₁ v, we have

(M, β, v) ⊨ ⟨δ₀⁺; υ⟩a(w)

Hence there is a standard u ⊲₀⁺ v and a u′ < u such that a(u′) = a(w). For each w, pick such a point and say that it is linked from w, and write w L u′. By definition of E, u′ is either standard, or in E. Thus, L is a function from E to E ∪ S. We note the following. w L u′ does not mean that u′ is standard. However, u′ already has greater standard depth than w, and if u′ ∉ S then u′ ∈ E and so u′ can in turn be linked to some node. It follows that for every w ∈ E there is a standard v such that w L⁺ v. For suppose not. Then there is a w ∈ E of maximal depth which cannot be linked to a standard point. But it can be linked to a point in E. The latter has greater depth. Contradiction.
Now we define a new frame S as follows. The set of points is S. Put x <′₀₀ y iff x <₀₀ y, and x <′₁₀ y iff x <₁₀ y; put x <′₀₁ y iff there is a u such that u <₀₁ y and u L⁺ x; and x <′₁₁ y iff there is a u such that u <₁₁ y and u L⁺ x. Finally,

S := (S, <′₀₀, <′₀₁, <′₁₀, <′₁₁)

The valuation β′ is defined by β′(p) := β(p) ∩ S. (If constants are present, the value of a constant c in S is the value of c intersected with S.) We shall prove for every w ∈ S and every χ ∈ FL(φ):

(S, β′, w) ⊨ χ ⟺ (M, β, w) ⊨ χ   (8)

The basic clause is

(Case 1.) χ = p, a variable (or constant). Then (S, β′, w) ⊨ p iff w ∈ β′(p) iff w ∈ β(p) iff (M, β, w) ⊨ p, by definition of β′.
(Case 2.) χ = ¬χ′.

(S, β′, w) ⊨ ¬χ′ iff (S, β′, w) ⊭ χ′
            iff (M, β, w) ⊭ χ′
            iff (M, β, w) ⊨ ¬χ′

(Case 3.) χ = χ′ ∧ χ″.

(S, β′, w) ⊨ χ′ ∧ χ″ iff (S, β′, w) ⊨ χ′ and χ″
            iff (M, β, w) ⊨ χ′ and χ″
            iff (M, β, w) ⊨ χ′ ∧ χ″
Now let χ = ⟨α⟩χ′. The claim will be proved by induction on the syntactic complexity of α.

(Case 4.) α = α′ ∪ α″.

(S, β′, w) ⊨ ⟨α′ ∪ α″⟩χ′ iff (S, β′, w) ⊨ ⟨α′⟩χ′ ∨ ⟨α″⟩χ′
            iff (M, β, w) ⊨ ⟨α′⟩χ′ ∨ ⟨α″⟩χ′
            iff (M, β, w) ⊨ ⟨α′ ∪ α″⟩χ′

(Case 5.) α = α′; α″.

(S, β′, w) ⊨ ⟨α′; α″⟩χ′ iff (S, β′, w) ⊨ ⟨α′⟩⟨α″⟩χ′
            iff (M, β, w) ⊨ ⟨α′⟩⟨α″⟩χ′
            iff (M, β, w) ⊨ ⟨α′; α″⟩χ′

We use (i) the fact that α′ and α″ are syntactically less complex than α′; α″ and (ii) the inductive hypothesis for ⟨α″⟩χ′.

(Case 6.) α = χ″?.

(S, β′, w) ⊨ ⟨χ″?⟩χ′ iff (S, β′, w) ⊨ χ″ and χ′
            iff (M, β, w) ⊨ χ″ and χ′
            iff (M, β, w) ⊨ ⟨χ″?⟩χ′

Using the inductive assumptions on χ″ and χ′.
(Case 7.) α = γ*. Now, in virtue of Lemma 7 we may assume that γ is progressive, so

⟨γ*⟩χ′ ↔ χ′ ∨ ⟨γ⟩⟨γ*⟩χ′

is a theorem of PDL. Further, γ is of lesser complexity than γ*.

(S, β′, w) ⊨ ⟨γ*⟩χ′ iff (S, β′, w) ⊨ χ′ ∨ ⟨γ⟩⟨γ*⟩χ′
            iff (M, β, w) ⊨ χ′ ∨ ⟨γ⟩⟨γ*⟩χ′
            iff (M, β, w) ⊨ ⟨γ*⟩χ′

(Case 8.) α = υ₀₀. Then the claim follows since <′₀₀ = <₀₀.

(Case 9.) α = υ₁₀. Likewise.
(Case 10.) α = υ₀₁. We show first (⇒) in (8). (S, β′, w) ⊨ ⟨υ₀₁⟩χ′ implies that there is a v <′₀₁ w such that (S, β′, v) ⊨ χ′. v is standard, and by induction hypothesis, (M, β, v) ⊨ χ′. By construction, w >₀₁ u for a u ∈ E such that u L⁺ v. This means that a(u) = a(v) and so (M, β, u) ⊨ χ′; hence (M, β, w) ⊨ ⟨υ₀₁⟩χ′. Now we show (⇐) in (8). Assume (M, β, v) ⊨ ⟨υ₀₁⟩χ′ and v ∈ S. Then there is a w ∈ E such that w <₀₁ v and (M, β, w) ⊨ χ′. By construction there is a standard u such that w L⁺ u, and so (M, β, u) ⊨ χ′, since a(u) = a(w). By inductive hypothesis, (S, β′, u) ⊨ χ′. Again by construction, u <′₀₁ v, so (S, β′, v) ⊨ ⟨υ₀₁⟩χ′.

(Case 11.) α = υ₁₁. Similar.
The next step is to verify that S is a PM-frame. To that effect we have to ensure that the union of the basic programs is deterministic and loop free and that the structure satisfies (7). First, let w ∈ S. Recall the definition of the standard depth. It is easy to see that the standard depth of points is the same in both structures. Now suppose that w <′ u. We claim that sd(w) > sd(u). (Case 1.) w ⊲′₀ u. Then w <′₀₀ u or w <′₁₀ u, and by definition of standard depth, sd(w) = 1 + sd(u). (Case 2.) w <′₀₁ u or w <′₁₁ u. In this case there is a y such that u >₀₁ y or u >₁₁ y and y L⁺ w. This means that sd(w) ≥ 2 + sd(u). Next, to show that the programs are deterministic, observe that the original programs were deterministic, and each link was replaced by just one link. Finally, from Lemma 4 it follows that the constructed structure satisfies PM.

Now, from (8) it follows that

(S, β′, w₀) ⊨ φ

This shows the claim. □
Theorem 5 The logic of MDSs is PM. Moreover, this logic has the finite model property, is finitely axiomatisable and therefore decidable. Its complexity is 2EXPTIME.

The complexity bound follows from the fact that the formula to be satisfied has length O(2ⁿ), and that DPDL₄.f is in EXPTIME.
6. Single movement MDSs
There is an important class of MDSs, those where M(x) has at most two elements. This means in practice that each element is allowed to move only once. This class of structures is very important, since the now current Minimalist Program requires each movement step to be licensed. These structures are the topic of a sequel to this paper. Here we are interested only in the axiomatisation of these structures. We have noted earlier that root links are always the lowest links. Therefore, for every node x there is at most one y such that x ⊲₀ y. On the other hand there can be any number of non-root links. The narrowness determines the maximum number of non-root links.

χ(p) := (p → [υ⁺]¬p) ∧ ¬(⟨υ₀₀; δ₀*⟩p ∧ ⟨υ₁₀; δ₀*⟩p)
Lemma 9 Let β be a valuation such that (F, β) ⊨ χ(p). Then |β(p)| ≤ 1.

Proof. Suppose that x, y ∈ β(p) with x ≠ y. Then x ⊲₀⁺ y cannot hold; for then y ⊨ p but y ⊭ [υ⁺]¬p. Likewise y ⊲₀⁺ x cannot hold. If however x and y are incomparable there are points u, v and v′ such that v ≠ v′ and x ⊲₀* v ⊲₀ u as well as y ⊲₀* v′ ⊲₀ u. Then however u ⊨ ⟨υ₀₀; δ₀*⟩p and u ⊨ ⟨υ₁₀; δ₀*⟩p. Contradiction. □
Definition 6 An MDS is called n-narrow if |M(x)| ≤ n + 1 for all x. An MDS is called narrow if it is 1-narrow.
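Narrowness is a simple counting condition and can be checked directly. A sketch (the pair encoding is the one used in our earlier sketches):

```python
# Checking n-narrowness: |M(x)| <= n + 1 for every node, where the four
# daughter relations are given as sets of (daughter, mother) pairs.

def is_n_narrow(nodes, lt00, lt01, lt10, lt11, n=1):
    lt = lt00 | lt01 | lt10 | lt11
    return all(len({y for (x2, y) in lt if x2 == x}) <= n + 1
               for x in nodes)
```

A node with mothers c and d is narrow; adding a third mother h violates 1-narrowness but not 2-narrowness.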
Set

ν(p) := [u]χ(p) → [u](⟨δ₁⟩p → [υ₀; υ*](⟨υ⟩p → ⟨δ₀⟩p))
Lemma 10 An MDS satisfies ν(p) iff it is narrow.

Proof. Suppose the MDS is not narrow. Then there is a y and z, z′ ∈ M(y) such that z <⁺ z′ and both links y; z and y; z′ are not root links. Then put β(p) := {y}. Then throughout the MDS, p → [υ⁺]¬p holds. Also, there is no point u such that u >₀₀ v, u >₁₀ v′ and y ⊲₀* v as well as y ⊲₀* v′. It follows that z ⊨ ⟨δ₁⟩p and z ⊨ ⟨υ⟩p but z ⊭ ⟨δ₀⟩p, and z′ ⊨ ⟨δ₁⟩p. However, z′ R(υ₀; υ*) z. So the formula is false under this valuation.

Now assume that the MDS is narrow. Take a valuation β such that χ(p) holds everywhere. By the preceding lemma, either β(p) is empty or β(p) = {u} for some u. In the first case no node satisfies ⟨δ₁⟩p, so the second part of ν(p) is true. Now assume β(p) = {u} and let y be a node such that y ⊨ ⟨δ₁⟩p. Then say u ⊲₁ y. We have to show

y ⊨ [υ₀; υ*](⟨υ⟩p → ⟨δ₀⟩p)

To this end let z and z′ be such that z ≤ z′ <₀ y (z reachable from z′ along daughter links) and z ⊨ ⟨υ⟩p. Then z > u. Since the structure is narrow, u ⊲₀ z, showing z ⊨ ⟨δ₀⟩p. □
7. Extending and reducing the language

The fact that we are dealing with cycle free structures has a great effect on the expressivity of the language; basically, using implicit definitions all program constructors of PDL can be eliminated; conversely, many seemingly more powerful constructs can be effectively mimicked. We consider here two devices: nominals (see (Blackburn 1993)) and the converse. A nominal is a variable that can be true only at a single world. It emerges from the discussion above that nominals actually do not add any expressive strength to our language. Consider a formula φ(i) which contains a nominal i. Now consider the formula

χ(p) ∧ ⟨υ*⟩p ∧ φ[p/i]

This formula has a model (F, β, x) only if β(p) is a singleton. The consequence of this is the following
Theorem 6 For every first-order universal formula φ using atomic formulae of the form x R(α) y or x = y there is a modal formula φ^m such that for any MDS F, F ⊨ φ iff F ⊨ φ^m.

Proof. Let φ = (∀x₀)(∀x₁)···(∀xₙ₋₁)ψ. Introduce nominals i₀, i₁, ···, iₙ₋₁ and define the following translation:

(x_p = x_q)† := ⟨u⟩(i_p ∧ i_q)
(x_p R(α) x_q)† := ⟨u⟩(i_p ∧ ⟨α⟩i_q)
(¬ψ)† := ¬ψ†
(ψ ∧ ψ′)† := ψ† ∧ ψ′†

It is not hard to see that (F, β, x) ⊨ ψ† iff F ⊨ ψ. The sought after formula is

χ(p₀) ∧ χ(p₁) ∧ ··· ∧ χ(pₙ₋₁) → ψ†[p_k/i_k : k < n]

This completes the proof. □
Also, let me recall a few other reductions that we have achieved. The following equivalences hold:

⟨α ∪ α′⟩p ↔ ⟨α⟩p ∨ ⟨α′⟩p
⟨α; α′⟩p ↔ ⟨α⟩⟨α′⟩p
⟨χ?⟩p ↔ χ ∧ p

This means that the program constructors union, concatenation and test are eliminable if they occur as outermost program constructors. However, we have also shown that every program is a union of a test and a progressive program and that for progressive programs the following holds in finite structures:

⟨α*⟩p ↔ p ∨ ⟨α⟩⟨α*⟩p

This allows to eliminate the star as follows:
Lemma 11 Let α be progressive and φ a formula not containing q. Then

(F, β) ⊨ q ↔ ⟨α*⟩φ ⟺ (F, β) ⊨ q ↔ φ ∨ ⟨α⟩q

Proof. Directly from Lemma 6. □
Lemma 12 Let α be progressive in F and φ and χ formulae such that χ does not contain q. Then

F ⊨ χ[⟨α*⟩φ/q] ⟺ F ⊨ [u](q ↔ φ ∨ ⟨α⟩q) → χ

Proof. Using the previous lemma. (⇒) Suppose F ⊨ χ[⟨α*⟩φ/q]. Pick β and x. Suppose (F, β, x) ⊨ [u](q ↔ φ ∨ ⟨α⟩q). We have to show that (F, β, x) ⊨ χ. Now, (F, β) ⊨ q ↔ φ ∨ ⟨α⟩q. Then (F, β) ⊨ q ↔ ⟨α*⟩φ (by the previous lemma), and so we can interchange q by ⟨α*⟩φ, giving us (F, β, x) ⊨ χ. (⇐) Choose any valuation β. Suppose that F ⊨ [u](q ↔ φ ∨ ⟨α⟩q) → χ, and choose β′ such that β′(p) = β(p) for all p ≠ q and β′(q) = {y : (F, β, y) ⊨ ⟨α*⟩φ}. (For this to be noncircular we need that φ does not contain q.) Now (F, β′, x) ⊨ [u](q ↔ φ ∨ ⟨α⟩q) by the previous lemma, and so we get (F, β′) ⊨ χ. By definition of β′ this is (F, β) ⊨ χ[⟨α*⟩φ/q]. β was arbitrary, giving us F ⊨ χ[⟨α*⟩φ/q]. □
Notice that q does not need to occur in χ. We may strengthen our language further by adding an operator on programs, the converse (see (Giacomo 1996)). This will allow us to talk about going up the tree. This makes the statement of some restrictions easier. We shall show for a large enough portion of the newly added formulae that they do not add expressive power, they just make life easier. The good news about them is that they can be added without having to redo the proofs.

Recall that for a binary relation R,

R˘ := {(y, x) : x R y}

The language PDL˘ extends PDL by a unary operator ˘ on programs, and we require that

R(α˘) = R(α)˘

PDL˘ is axiomatised over PDL by adding, for every program α:

p → [α]⟨α˘⟩p,  p → [α˘]⟨α⟩p   (9)

It turns out that it is enough to just add the converse for every elementary program, for we have

(R ∪ S)˘ = R˘ ∪ S˘
(RS)˘ = S˘R˘
(R*)˘ = (R˘)*

Also, notice that

R((φ?)˘) = R(φ?)

Thus, rather than looking at PDL˘₄ (four basic programs and a converse operator) we may look at PDL₈ (eight basic programs, no converse), where the postulates (9) have been added just for the basic programs. We shall not take that route, though, since it produces needless complications. Rather, we shall make the following observation.
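The converse identities can be checked on concrete finite relations. A quick sketch (helpers ours):

```python
# Verifying the converse laws on finite relations:
# (R ∪ S)˘ = R˘ ∪ S˘, (RS)˘ = S˘R˘ and (R*)˘ = (R˘)*.

def conv(R):
    return {(y, x) for (x, y) in R}

def compose(R, S):
    return {(x, z) for (x, y) in R for (y2, z) in S if y == y2}

def star(R, U):
    c = {(x, x) for x in U} | set(R)
    while True:
        new = compose(c, c)
        if new <= c:
            return c
        c |= new
```

The laws are exactly what licenses pushing the converse operator inward to the basic programs.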
Lemma 13 Let F = (F, R) be a frame, x ∈ F a world, and P, Ω and Δ modalities such that R(Ω) = R(P)˘ is a partial function, and Δ an operator such that x R(Δ) y for all y. Then for any two formulas φ and χ, not containing q, and any valuation β:

(F, β, x) ⊨ [Δ]((¬⟨Ω⟩⊤ → ¬q) ∧ ((φ → [P]q) ∧ (¬φ → [P]¬q))) → χ   (10)

iff

(F, β, x) ⊨ χ[⟨Ω⟩φ/q]   (11)

Proof. Pick a valuation β. We claim that

(F, β, x) ⊨ [Δ]((¬⟨Ω⟩⊤ → ¬q) ∧ ((φ → [P]q) ∧ (¬φ → [P]¬q)))   (12)

iff β(q) = {u : u ⊨ ⟨Ω⟩φ}. This establishes the claim as follows. Assume (10) holds. Pick β and choose β′ such that (12) holds with β′ in place of β. This is exactly the case if β′(q) = {u : (F, β, u) ⊨ ⟨Ω⟩φ}. Now we have both (F, β′, x) ⊨ χ (by (10)) and (F, β′, x) ⊨ q ↔ ⟨Ω⟩φ. Thus we have (F, β′, x) ⊨ χ[⟨Ω⟩φ/q]. From this we get (F, β, x) ⊨ χ[⟨Ω⟩φ/q], since q does not occur in this formula. Conversely, suppose (11) holds. Choose β. (Case 1) β(q) = {u : (F, β, u) ⊨ ⟨Ω⟩φ}. Then (F, β) ⊨ q ↔ ⟨Ω⟩φ, and so (12) holds. Also, (10) holds. (Case 2) β(q) ≠ {u : (F, β, u) ⊨ ⟨Ω⟩φ}. Then (12) does not hold, so (10) holds as well.

Now (12) is equivalent to

(F, β) ⊨ (¬⟨Ω⟩⊤ → ¬q) ∧ (φ → [P]q) ∧ (¬φ → [P]¬q)

Pick z. We have to show that z ⊨ q iff z ⊨ ⟨Ω⟩φ. Two cases arise. (Case 1.) z has no R(Ω)-successor. Then ¬⟨Ω⟩⊤ is true at z and so are both ¬q and ¬⟨Ω⟩φ. (Case 2.) z has an R(Ω)-successor. Then this successor is unique by assumption. Call it y. By assumption we have y R(P) z. Furthermore, as x R(Δ) y, we have y ⊨ φ → [P]q as well as y ⊨ ¬φ → [P]¬q. Suppose z ⊨ ⟨Ω⟩φ. Then y ⊨ φ, from which y ⊨ [P]q, and so z ⊨ q. If z ⊨ ¬⟨Ω⟩φ then y ⊨ ¬φ, by functionality of R(Ω). Hence y ⊨ [P]¬q and so z ⊨ ¬q. □
This lemma can be used to introduce converses for the programs 00 and 10, since they are backwards deterministic. This seemingly allows for the
reduction of any program to a forward looking program. However, recall that
the elimination of star used the fact that every program is basically progres-
sive. With converses added this is no longer the case. So, star is eliminable
only if the program either contains only downward looking modalities or only
upward looking modalities. Tests belong to either class (they can be included as
only downward looking in the first case, or as only upward looking in the
second). Call such a formula a finite turn formula.
Theorem 7 Suppose a class of constituents is axiomatisable with some finite
turn formulae using the operators 00˘, 10˘ and 0˘ in addition to the programs ij. Then
it can be axiomatised without the use of 00˘ and 10˘.
This can be used in the following way. We have said earlier that the MDSs
are not necessarily ordered in the standard sense. To enforce this we need to
add another postulate. The linear order from (1) is modally definable by

γ := 0˘*; 00˘; 10; 0*

In the definition we have made use of upward looking programs. It is straight-
forward to verify that

x ≺ y ⇔ (x, y) ∈ R(γ)

This would ordinarily involve adding the converse operator. We have seen,
however, that there is a way to consider the converse operators as abbrevia-
tions. Thus we may define the following.
Definition 7 Let

OL := PM ⊕ ⟨00˘⟩⊤ → [10˘]⊥
         ⊕ ⟨10˘⟩⊤ → [00˘]⊥
         ⊕ ⟨00˘⟩p → [00˘]p
         ⊕ ⟨10˘⟩p → [10˘]p
Using Theorem 4 we see that
Theorem 8 OL is decidable in 2EXPTIME.
8. Nearness
The above results are encouraging; unfortunately, they are not exactly what
we need. There typically is a restriction on the distance that an element can
move in a single step. We take as our prime example the subjacency definition
in (Chomsky 1986). As I have argued in (Kracht 1998), perhaps the best
definition is this: the antecedent of a trace can be found within the next
CP which contains the next IP properly containing the trace. This definition
uses the concatenation of the two command relations of IP-command and
CP-command.
One is tempted to cash this out as the following axiom.

⟨1⟩p → ⟨0; (¬CP?; ↓)*; (¬IP?; ↓)*⟩p   (13)
Here, CP and IP are constants, denoting phrasal nodes of category CP and IP.
This formula says that for every node x, if there is a derived downward link
from x to some y, then there is a path to y following first a nonderived link,
then going down non-CP nodes and finally non-IP nodes. Unfortunately,
matters are not that easy. The program (¬φ?; ↓)* can be transcribed as 'while
¬φ holds, go one step down'. This is a nondeterministic program, capturing the
pairs (x, y) such that no node on the path from x to y satisfies φ. (φ may hold
at y, but not at x.)
However, this gives the wrong results (cf. (Kracht 2001a)). Consider a
VP and an NP that scrambles out of it. Consider a movement of the VP that
passes the NP, whereupon the NP moves to pass the VP again.
NP₁ [ t₁ ]VP₂ t₁ t₂   (14)
Then the formula above may be true even if there was a step of the NP that
crossed a barrier. I do not know of a natural example of this kind, but
the formalisation should work even if none existed. Furthermore, the prob-
lem is with the NP movement, so it cannot be dismissed on the grounds that
the VP has crossed a barrier. Interestingly, the latter objection can easily be
eliminated; for we can assume that the VP has moved into spec of CP before
leaving the barrier. And in that case it has blocked the chances of the NP to
do the same.
So, why does (14) pose a problem with (13)? Let us display some more
constituents:

[ NP₁ [ [ t₁ ]VP₂ t₁ [ t₂ ]Z ]Y ]X   (15)
The constituent (= node) X has NP₁ as a derived daughter. (13) requests that
we find a path following first a nonderived link and such that if we ever cross
a CP we do not cross an IP after that. We shall give such a path. First we
go to Y. From Y we go to VP₂ and down to t₁ = NP₁. Recall that we are in
an MDS, so whenever you see a trace there is actually a constituent, and it
is the same as the antecedent of the trace. In particular, Y is mother to VP₂,
and Z mother to t₂; both are mothers of the same VP₂ constituent. And so the
path inside the upper VP₂ is the same path as the one in the lower copy. And
likewise, to go to t₁ is to go to NP₁, because they are the same.
What went wrong? Formula (13) asks for the existence of some path of
the required kind, but it may not be the one that the constituent actually took
when it moved. It is not enough to say, therefore, that some alternative path
satisfies the nearness condition; we must somehow require that it is the actual
path that was used in the movement that satisfies the nearness condition. It is
possible to find such a formula, but it is unfortunately quite a complex one.
The particularly tricky part here is that the structure almost looks as if the
NP has been moving with the VP only to escape after the VP has crossed the
barrier (= piggy backing). But that is not what happened (we have a trace
witnessing the scrambling).
So, nearness constraints are not easily captured in model theoretic terms
because the structure does not explicitly say which link has been added be-
fore which other. Indeed, notice that one and the same MDS allows for
quite different derivations. There is (up to inessential variations, see (Kracht
2003a)) exactly one derivation that satisfies Freeze, and exactly one that sat-
isfies Shortest Steps. The problem is with the Shortest Steps derivations.
As it turns out, however, at least Freeze derivations are easy to charac-
terise. The idea is that the longest path between two standard elements is
actually the one following standard links. Suppose we want to define the sub-
jacency domain for Freeze. (Notice the slightly different formulation of the
command domain from (13). Both can be used, and they differ only mini-
mally. This is anyhow only an example.)

δ := ⟨1⟩p → ⟨(0; ¬CP?)⁺; (0; ¬IP?)⁺; ↓⟩p
Lemma 14 M ⊨ δ iff there is a Freeze derivation such that movement is
within the IP–CP-domain.

Proof. Suppose that movement is such that each step is within the IP–CP-
domain of the trace. Then in the MDS, every path between these nodes re-
spects these domains. Conversely, let x be a node in an MDS and y >₁ x. Put
β(p) := {x}. Then y ⊨ ⟨1⟩p. Hence, by assumption,

(M, β, y) ⊨ ⟨(0; ¬CP?)⁺; (0; ¬IP?)⁺; ↓⟩p

which says that there is a standard path, first along nodes that do not satisfy
CP and then along nodes that do not satisfy IP, to some node z which domi-
nates x immediately. The standard path is the movement path in the Freeze
derivation. This shows the theorem. ∎
This can be generalised to any requirement that says that a path must re-
spect a regular language, which is more general than the definable command
relations of (Kracht 1993). The general principle is therefore of the form

Dist(c; α) := ⟨1⟩(c ∧ p) → ⟨α; ↓⟩p

where c is a constant and α is an expression using only 0 and constants. (It
may not even use 00˘ or 10˘.) Moreover, as we shall see, one can mix these
postulates to have a particular notion of distance for phrases and another one
for heads, for example. In general, any mixture of distance postulates is fine,
as long as it is finite.

Theorem 9 The logic of MDSs which have a Freeze derivation satisfying a
finite number of postulates of the form Dist(c; α) has the finite model property
and is decidable.
Proof. We replay the proof of Lemma 8. Let Dist(cᵢ; αᵢ), i < n, be the distance
postulates.

Y(φ) := {⟨1⟩(cᵢ ∧ ψ) → ⟨αᵢ; ↓⟩ψ : ψ ∈ At(φ), i < n}

Now define the linking in the following way. If w >₁ u and u ⊨ cᵢ, then

w ⊨ ⟨αᵢ; ↓⟩a(u)

Hence there are w′ and u′ such that w′ >₀ u′, the standard path from u to u′
is contained in R(αᵢ), and a(w′) = a(w). We then put w L w′. Thus, the
condition on Freeze derivations is respected. The rest of the proof is the same.
∎
9. First example: Movement
We shall present an example of a language that is trans-context free and can
be generated from a context free language through movement. Furthermore,
it shall follow from our results that the logic of the associated structures is
decidable. Take the following grammar.

S → aT      S → aX
T → bU      X → bc      (16)
U → cS      S → S

This grammar generates the language {(abc)ⁿ : n > 0}. Now, we shall allow
for movement of any element into c-commanding position. Movement is
only constrained by the fact that it is into c-commanding position, nothing
else. Since we have added the rule S → S, the base grammar freely generates
sites to which a constituent can adjoin.
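To convince oneself that (16) generates exactly the strings (abc)ⁿ, one can expand derivations mechanically up to a length bound. A small sketch (plain Python, written for this note; the unit rule S → S is omitted, since it only duplicates sentential forms):

```python
# Sketch: expand derivations of grammar (16) up to a length bound and
# compare the terminal strings with {(abc)^n : n > 0}.

RULES = {
    "S": ["aT", "aX"],
    "T": ["bU"],
    "U": ["cS"],
    "X": ["bc"],
}

def generate(max_len):
    """All terminal strings of length <= max_len derivable from S."""
    results, forms = set(), {"S"}
    while forms:
        form = forms.pop()
        nt = next((c for c in form if c.isupper()), None)
        if nt is None:
            results.add(form)
            continue
        for rhs in RULES[nt]:
            new = form.replace(nt, rhs, 1)
            if len(new) <= max_len + 1:  # one symbol of slack for the nonterminal
                forms.add(new)
    return {w for w in results if len(w) <= max_len}

expected = {"abc" * n for n in range(1, 5)}
assert generate(12) == expected
```

Every sentential form carries exactly one nonterminal, at its right edge, so the expansion terminates once the length bound is reached.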
In order to implement this, we need to add constants. For each terminal
and each nonterminal element there will be a constant, denoted by underlining
it; for example, U is the constant denoting nodes with label U. This will be
our new language. We also add the condition that the constants from C are
mutually exclusive:

Exc(C) := {X → ¬Y : X ≠ Y and X, Y ∈ C}

Also, we express the fact that at each node at least one constant from C must be
true by

Suf(C) := ⋁{X : X ∈ C}

These two together ensure that each node satisfies exactly one constant. Next
the context free grammar is described by a set of rules:
σS := S → (⟨00⟩a ∧ ⟨10⟩T) ∨ (⟨00⟩a ∧ ⟨10⟩X) ∨ (⟨00⟩S ∧ [10]⊥)
σT := T → ⟨00⟩b ∧ ⟨10⟩U
σU := U → ⟨00⟩c ∧ ⟨10⟩S
σX := X → ⟨00⟩b ∧ ⟨10⟩c
σa := a → [↓]⊥
σb := b → [↓]⊥
σc := c → [↓]⊥
Now we are looking at the following logic Mv, where C := {S, T, U, X, a, b, c},
with

Mv := OL ⊕ Exc(C) ⊕ Suf(C) ⊕ {σX : X ∈ C}

Since the added postulates are constant, it is a matter of direct verification
that the structures for this logic are the MDSs in which the underlying tree
(using the nonderived links) satisfies the context free grammar given in (16).
Any constituent may move, and it can move to any c-commanding position.
It is interesting to spell out which linear order we use for the surface con-
stituents. To this end, let x <s y if y is the highest member of P(x); we also call
the link ⟨x, y⟩ a surface link. It is not hard to show that <s⁺ defines a tree order
on the worlds. Moreover, let x <s0 y if x <s y and x <0 y; similarly, x <s1 y iff
x <s y and x <1 y. We say for two leaves x and y that x surface-precedes
y, in symbols x ≺ y.

x ≺ y :⇔ (∃u)(∃v)(∃w)(x <s⁺ u <s0 v >s1 w >s⁺ y)

This order is not modally definable. However, this does not defeat the useful-
ness of the present approach. There are two fixes; one is to introduce a surface
relation. Like we did for the root links, we introduce relations <s0 and <s1
explicitly. The proofs so far go through without a change. Decidability is
again guaranteed.
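On a concrete finite surface tree, the surface-precedence of leaves defined above amounts to the usual left-to-right order: x precedes y exactly when some ancestor has x below its left surface daughter and y below its right one. A tiny sketch (plain Python, names are mine), with trees as nested pairs (left daughter, right daughter):

```python
# Sketch: surface precedence of leaves, computed directly from the
# defining condition: x precedes y iff some node v has x below its
# s0-daughter and y below its s1-daughter.

def leaves(t):
    return [t] if not isinstance(t, tuple) else leaves(t[0]) + leaves(t[1])

def precedes(t):
    """All pairs (x, y) of leaves of t with x surface-preceding y."""
    if not isinstance(t, tuple):
        return set()
    left, right = t
    pairs = {(x, y) for x in leaves(left) for y in leaves(right)}
    return pairs | precedes(left) | precedes(right)

t = (("a", "b"), "c")
assert precedes(t) == {("a", "b"), ("a", "c"), ("b", "c")}
```

The relation comes out as a strict linear order on the leaves, which is what the surface string requires.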
10. Adjunction
The next generalisation we are going to make concerns adjunction. Recall
from (Kracht 1998) that it is not enough to leave adjunction implicit. We
must add an explicit statement of which nodes are maximal. An adjunction
structure is therefore obtained by adding a subset Q of M. (Intuitively, this
set represents the maximal nodes of a category.)

x° := the least y ∈ Q such that y ≥ x
The category of x is defined as follows.

C(x) := {y : y° = x°}

A category is a subset of M of the form C(x). y is a segment of C(x) if
y ∈ C(x). Two categories are either equal or disjoint; hence the categories
form a partition of M. Categories must also be linear. To ensure this it is
enough to require the following of the set Q:

Linear Categories. If y and y′ are distinct daughters of x then
y ∈ Q or y′ ∈ Q (or both).

For suppose y, y′ ∉ Q. Then y°, (y′)° ≥ x, and it is easy to see that y° = (y′)°,
so that C(y) = C(y′) = C(x). On the other hand, if y ∈ Q then y° = y ≠ x, while
(y′)° ≠ y, and so C(y) is disjoint from C(y′).
Finally, in adjunction structures c-command is revised as follows. Say
that y includes x if all segments of y dominate x. x c-commands y iff the least
z including x dominates y. Now we require that chains are linearly ordered
through c-command. This is reflected in the following conditions.
The set M(x) gets replaced by the set P(x), which is formed as follows.
Suppose that x <⁺ u, where u is minimal in its category (so that the category
is the least one that includes x), and there is a path from x to u going
only through nonminimal nodes and following derived links. Then u ∈ P(x).
As before, P(x) reports about the movement history of x. But now that c-
command is no longer defined using the one-node-up version (idc-command
in the sense of (Barker and Pullum 1990)), we need to define a different set
of nodes that need to be compared. This is why we chose P(x) to be the
mothers of the ultimate landing site of a complex formed through successive
adjunction. The link that adjunction creates is always counted as derived. We
shall see below an example of where this arises naturally.
In fact, adjunction has been taken to be more restrictive. Typically, when
an element adjoins, it must adjoin to the maximal segment of the existing
category. And so we shall simplify the task as follows. Call x infimal if there
is no y < x which is nonmaximal (that is to say, x is the least member in its
category).
P(x) := {y : y > x and x infimal, or
there is a noninfimal z and y >₀ z >₁ x}   (17)
Definition 8 A pseudo-ordered adjunction MDS (PAMDS) is a structure
(M, Q, >00, >01, >10, >11), where the following holds:

(A1) Q ⊆ M.
(A2) If y >₀ x and y >₀ x′ then x = x′.
(A3) If y >₁ x and y >₁ x′ then x = x′.
(A4) If y >₁ x then there is a z such that y >₀ z.
(A5) There is exactly one x such that for no y, y > x (this element is called
the root).
(A6) If x >₁ y then y ∈ Q. (Adjoining elements are maximal segments.)
(A7) If x < y and x′ < y and x, x′ ∉ Q then x = x′. (Only one daughter is a
nonmaximal segment. Categories are linear.)
(A8) The set P(x) is linearly ordered by >₀⁺, and if y ∈ P(x) is minimal with
respect to >⁺ then y >₀ x.
As before, we need to define the logic of these structures and then show
that the defined logic has the finite model property, which shall establish its
decidability. First, let us notice a few facts about these structures. Adjunction
typically is head adjunction, because here the new notion of c-command takes
effect. A head adjoins to a higher head, but in the new position it does not
idc-command its trace, it just c-commands it. The postulates are as follows.
We shall introduce a constant Q whose interpretation is the set Q. First, let
us agree on the following notation.

A := ⟨1⟩Q   (18)
μ := (A?; 1) ∪ (0; A?; 1)   (19)

A is true on the node to which one has adjoined; (y, x) ∈ R(μ) iff y ∈ P(x).
Definition 9 Let

PAM := DPDL4.f ⊕ ⟨00⟩¬Q → [10]Q
          ⊕ ⟨10⟩¬Q → [00]Q
          ⊕ [1]Q
          ⊕ ⟨μ⟩p → ⟨0⁺; (p? ∪ μ)⟩p
Lemma 15 Every finite PAMDS satisfies the postulates of PAM.

Proof. (a) The postulates of DPDL4.f are satisfied, by similar arguments.
(b) Suppose M is a PAMDS, and let x ∈ M, x ⊨ ⟨00⟩¬Q. Then there is a
y <00 x which is not in Q. By (A7), if z <10 x, z must be maximal, whence
z ⊨ Q. z was arbitrary (in fact, if it exists, it is unique). Therefore, x ⊨ [10]Q.
Similarly for the second axiom. (c) x ⊨ [1]Q. For let y <1 x. Then by (A6),
y ∈ Q, whence y ⊨ Q. (d) Suppose x ⊨ ⟨μ⟩p. This means that there is a y such
that x ∈ P(y) and y ⊨ p. By (A8), if x is minimal in P(y), then x >₀ y, and so
x ⊨ ⟨0⁺; p?⟩p. Otherwise there is a z ∈ P(y) with x >₀⁺ z; then z ⊨ ⟨μ⟩p, and so
x ⊨ ⟨0⁺; μ⟩p. In either case x ⊨ ⟨0⁺; (p? ∪ μ)⟩p. ∎
Now we turn to the converse. Put

Z(φ) := {[u](⟨μ⟩ψ → ⟨0⁺; (ψ? ∪ μ)⟩ψ) : ψ ∈ At(φ)}
      ∪ {[u](⟨00⟩¬Q → [10]Q), [u](⟨10⟩¬Q → [00]Q)}
      ∪ {[u][1]Q}
Lemma 16 φ is consistent with PAM iff φ; Z(φ) is consistent with DPDL4.f.

Proof. (⇐) Clear. (⇒) Let Z(φ); φ be consistent with DPDL4.f. Then it has a
finite generated model based on M = (M, Q, >00, >01, >10, >11), with valuation β
and w₀ such that

(M, β, w₀) ⊨ Z(φ); φ

(a) By choice of Z(φ), w₀ ⊨ [u](⟨00⟩¬Q → [10]Q). Take z ∈ M. Then, by
definition of u, z ⊨ ⟨00⟩¬Q → [10]Q. Suppose now that y is nonmaximal
and z >00 y. Then z ⊨ ⟨00⟩¬Q. Whence z ⊨ [10]Q. So, if z >10 u, then u is
maximal. Similarly it is seen that if z >10 y and y is nonmaximal, and z >00 u,
then u is maximal. This establishes linearity (A7). (b) z ⊨ [1]Q. Hence
if z >₁ y, y is maximal. Thus, (A6) is satisfied. (c) Now we deal with the
most problematic formula, the last axiom. We replay the proof of Theorem 8.
The only change is that we define the relation L differently. As before, S
is the set of standard points, and E the set of immediate derived daughters
of standard points. We shall have to verify that L is cycle free, and that the
structure obtained by identifying all points L-related to each other is a PAM-
structure and the resulting model satisfies φ. Basically, the proof of the latter
is as in Theorem 8. So let us see why the structure is a PAM-structure. For
this we need to establish that P(x) is linearly ordered by >⁺. ∎
It follows that the logic PAM is in 2EXPTIME. There are typically other
requirements that are placed on adjunction structures. The first is that head
adjunction takes place to the right only. Thus, if y is a zero level projection
and adjoins, it must adjoin to the right. This is captured as follows.
There is a constant H which is true of exactly the zero-level projections. So
we say

H → [10]⊥

Next, at least in the standard theory, the head-head complex cannot be taken
apart by movement again. (The phenomenon is known as excorporation.)
Structurally, it means that an adjoined element cannot have two mothers.
Thus, if x, x′ >₁ y and y is zero level, then x = x′. This must be added to
the list of requirements if needed. This is a universal first-order formula, so
we only have to appeal to Theorem 6 to see that it can be axiomatised modally.
11. Second example: Swiss German
It is worth seeing a concrete example of how the present ideas can be made
to work. We choose Swiss German to exemplify the interplay between move-
ment and adjunction. Our analysis will be the cyclic head adjunction analysis
put forward in the 80s for Dutch and German.
We shall assume that lexical items have internal structure, which is also
binary branching. For simplicity, we denote the relations below the lexical
level by other symbols. (For all those worried about decidability: these
relations are dispensable. We could introduce a constant L, which is true
of all sublexical nodes, and relativise the ordinary relations by L? on the
sublexical side.) The lexicon contains complex nodes whose leftmost part is a
string. The other nodes are auxiliary and carry phonetically empty material,
here one of the following: κ, δ and ι. They are mutually exclusive (just like
the other labels). κ is a feature for accusative case, δ for dative case and ι
for the selection of an infinitival complement. The following are the lexical
trees that we shall use;
Figure 2. Some Lexical Trees


Figure 2 shows two of them in tree format. (By the way, we abandon now the
underscore notation for constants.)

[dchind]NP   [em chind]NP   [aastriche]V   [[halfe]V δ]V   [[laa]V ι]V
The grammar for the deep structure is this:

VP → V¹ VP      VP → V NP
V¹ → V NP       VP → NP VP
We shall assume that the surface structure is created through successive cyclic
head adjunction. That is to say, any head is allowed to move and adjoin to the
next higher head; adjunction is always to the right, but it need not be cyclic.
Suppose we have four heads V1 V2 V3 V4. Then we can first adjoin V3 to
V4, giving [V4 V3], then V1 to V2, giving [V2 V1], and then finally [V2 V1] to
[V4 V3], to give [[V4 V3] [V2 V1]]. This can be excluded, see below.
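The two derivations just sketched can be replayed symbolically. The following toy (plain Python, written for this note) builds the adjunction complexes as nested lists:

```python
# Sketch: right head adjunction as list formation.  adjoin(target, h)
# adjoins h to the right of target, forming [target h].

def adjoin(target, h):
    return [target, h]

# Noncyclic derivation: V3 to V4, V1 to V2, then [V2 V1] to [V4 V3].
step1 = adjoin("V4", "V3")            # [V4 V3]
step2 = adjoin("V2", "V1")            # [V2 V1]
noncyclic = adjoin(step1, step2)      # [[V4 V3] [V2 V1]]
assert noncyclic == [["V4", "V3"], ["V2", "V1"]]

# Cyclic derivation: each head (complex) adjoins to the next higher head.
cyclic = adjoin("V4", adjoin("V3", adjoin("V2", "V1")))
assert cyclic == ["V4", ["V3", ["V2", "V1"]]]
```

Only the cyclic result has a bare head as its left-hand daughter at every level; that is the shape which the stricter axiom given below enforces.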
The rules, together with the lexicon, can be translated into constant axioms
as follows. (Recall from (18) the definition A := ⟨1⟩Q. Furthermore, 0² :=
0; 0.)
σVP := VP → (⟨00⟩V¹ ∧ ⟨10⟩VP)
           ∨ (⟨00⟩V ∧ ⟨10⟩NP)
           ∨ (⟨00⟩NP ∧ ⟨10⟩VP)
σV¹ := V¹ → ⟨00⟩V ∧ ⟨10⟩NP
σNP := NP → (⟨0²⟩(dchind ∨ Hans) ∧ ⟨1⟩κ)
           ∨ (⟨0²⟩(em chind ∨ em Hans) ∧ ⟨1⟩δ)
σVN := (V ∧ ¬A) → (⟨0²⟩aastriche ∧ ⟨1⟩κ)
           ∨ (⟨0⟩⟨0²⟩halfe ∧ ⟨1⟩δ)
           ∨ (⟨0⟩⟨0²⟩laa ∧ ⟨1⟩ι)
σVA := (V ∧ A) → ⟨00⟩V ∧ ⟨11⟩(V ∧ Q)
σκ := κ → [↓]⊥
σδ := δ → [↓]⊥
σι := ι → [↓]⊥
Notice that it is possible to enforce cyclic head adjunction by issuing the
following formula in place of σVA:

σ′VA := (V ∧ A) → ⟨00⟩(V ∧ ¬A) ∧ ⟨11⟩(V ∧ Q)

This says that the left hand daughter must be infimal, hence that daughter is
lexical. The right hand daughter may however be complex.
Case government is implemented as follows.

γκ := V ∧ ⟨1⟩κ → ⟨00˘; 10⟩⟨1⟩κ
γδ := V ∧ ⟨1⟩δ → ⟨00˘; 10⟩⟨1⟩δ

Selectional restriction concerning the infinitive is the formula

γι := V ∧ ⟨1⟩ι → ⟨(¬VP?; ↓)*; ↓⟩VP
Notice that these formulae are all constant. They describe the restrictions that
apply at D-structure.
The only derivational steps are head adjunction, as shown above. The
crucial fact here is that head adjunction is local; so we restrict the condition
(A8) of Definition 8 by saying that the distance between two members of P(x)
must be small. The head movement constraint is embodied in the following
formula

Qh := ⟨μ⟩p → ⟨0²; (p? ∪ μ)⟩p
This formula is somewhat crude, saying that movement is only two steps
up. It suffices for our purposes, thanks to the particular grammar chosen. It
would be no problem to formulate a more sophisticated version which says
that a head may only move to the next head.

Definition 10 Call Swiss the logic

OL ⊕ Exc(C) ⊕ Suf(C) ⊕ {σVP, σV¹, σNP, σVN, σVA, γκ, γδ, γι, Qh}
Swiss is decidable. This follows from our results. The language is trans-
context free. To see this we must first define the surface order. This means
that we have to spell out which of the links is a surface link. It is the
standard link if the element is not a V, and it is not adjoined. Otherwise, it is
a derived link.

⟨s0⟩p ↔ (¬(V ∧ A) ∧ ⟨00⟩p) ∨ ((V ∧ A) ∧ ⟨01⟩p)
⟨s1⟩p ↔ (¬(V ∧ A) ∧ ⟨10⟩p) ∨ ((V ∧ A) ∧ ⟨11⟩p)

Notice that although we have introduced new symbols, s0 and s1, they are
eliminable, so they are in effect just shorthands.
After that we define the left-to-right order on the surface, and finally the
relation ≺s, which is like the surface ≺ but skips intervening empty heads.

≺ := s˘*; s0˘; s1; s*
c := κ ∨ δ ∨ ι
≺s := ≺; (c?; ≺)*; ¬c?

Now, x is immediately to the left of y in surface order if x R(≺) y; x R(≺s) y if
y is the next phonetically nonempty element to the right of x. So, the question
whether the following sequence is derivable
de chind em Hans es huus halfe aastriche
now becomes the question whether the following formula has a model:

[≺s˘]⊥ ∧ ⟨≺s⟩(de chind ∧ ⟨≺s⟩(em Hans ∧ ⟨≺s⟩(es huus
∧ ⟨≺s⟩(halfe ∧ ⟨≺s⟩(aastriche ∧ [≺s]⊥)))))   (20)
12. Conclusion
Let us briefly review what has been achieved and what remains to be done.
We have established a way to reduce a grammar to a logic L, the lexicon to
a constant formula λ. As a result, parsing becomes a satisfiability problem
in a given logic (here L ⊕ λ). (See (Kracht 1995, 2001a) for an extensive
discussion.) Provided that the logic L is decidable, the logic L ⊕ λ is also
decidable and the following questions become decidable:
Given a string x and a particular lexicon λ, is x derivable in L ⊕ λ?
Is a given PDL-definable principle φ satisfiable in a structure of L?
Or does L refute φ?
Is a given regular language included in the language derived by L ⊕ λ?
Since principles are axioms, our results establish decidability of these ques-
tions only on condition that L falls within the range of logics investigated
here (or expansions by constant formulae). In particular, this means that
movement is assumed to satisfy Freeze. (This has consequences only for
the formulation of nearness conditions.)
It should be said that there are questions that are known to be undecidable
and so there is no hope of ever finding an algorithm that decides them once
and for all. One problem is the question whether a given grammar generates
fewer sentences than another one. This is undecidable already for context free
grammars.
The reader might wonder what happened to surface structure and LF.
These two pose no problems, as far as I can see. All that needs to be done
is to split the relations >i into four different ones (which are not mutually
exclusive). In this way, practically the full theory can be axiomatised within
PDL. It is to be noted, however, that while the lexicon consists of constant
formulae, the theory (consisting of general structural axioms) is phrased with
formulae containing variables.
The results obtained in this paper support the claim that properties of gen-
erative grammars developed within GB or the Minimalist Program are in fact
decidable as long as they can be expressed in PDL. In Part II of this sequence
we shall show that this holds true also for the logic of narrow multidomi-
nance structures. These are structures where a given trigger licenses only one
movement step. Decidability will be shown for theories that admit narrow
structures with Freeze-style movement and command relations to measure
distance. This will hopefully be taken up in Part III, where we plan to study
Minimalism in depth.
Notes
1. The shorthand LGB refers to (Chomsky 1981) as a generic source for the kinds
of structures that Government and Binding uses.
2. Added in February 2008. I noted with dismay that none of the promised pa-
pers has reached a satisfactory stage yet. Some of the generalisations have been
obtained, but a thorough analysis of the MP is still missing.
3. The idea for this paper came on a bus ride in Santa Monica, during which I was
unable to do anything but think. It came just in time for the Festcolloquium
for Uwe. I owe thanks to the participants of the colloquium, especially Stephan
Kepser, Jens Michaelis, Yiannis Moschovakis, Larry Moss and Uwe Mönnich for
useful comments. Furthermore, special thanks to Stephan Kepser for carefully
reading this manuscript. All errors that have survived so far are my responsibility.
4. Please note that this definition of an MDS differs slightly from the one given in
(Kracht 2001b).
5. The reader may find it confusing that we talk about programs here. The reason
is simply that PDL is used to talk about the actions of a computer, and this has
given rise to the terminology. Here, however, we shall use PDL to talk about trees.
As shall be seen below, the interpretation of a program is actually a relation over
the constituent tree. So, when I write 'program' it is best to think 'relation'.
References

Barker, Chris and Geoffrey Pullum
1990 A theory of command relations. Linguistics and Philosophy 13: 1–34.
Blackburn, Patrick
1993 Nominal tense logic. Notre Dame Journal of Formal Logic 39: 56–83.
Chomsky, Noam
1981 Lectures on Government and Binding. Dordrecht: Foris.
1986 Barriers. Cambridge (Mass.): MIT Press.
Giacomo, Giuseppe de
1996 Eliminating Converse from Converse PDL. Journal of Logic, Lan-
guage and Information 5: 193–208.
Kracht, Marcus
1993 Mathematical aspects of command relations. In Proceedings of the
EACL 93, 241–250.
1995 Is there a genuine modal perspective on feature structures? Linguis-
tics and Philosophy 18: 401–458.
1998 Adjunction structures and syntactic domains. In Uwe Mönnich and
Hans-Peter Kolb, (eds.), The Mathematics of Sentence Structure.
Trees and Their Logics, number 44 in Studies in Generative Gram-
mar, 259–299. Berlin: Mouton de Gruyter.
1999 Tools and Techniques in Modal Logic. Number 142 in Studies in
Logic. Amsterdam: Elsevier.
2001a Logic and Syntax: A Personal Perspective. In Maarten de Rijke,
Krister Segerberg, Heinrich Wansing, and Michael Zakharyaschev,
(eds.), Advances in Modal Logic 98, 337–366. CSLI.
2001b Syntax in chains. Linguistics and Philosophy 24: 467–529.
2003a Constraints on derivations. Grammars 6: 89–113.
2003b The Mathematics of Language. Berlin: Mouton de Gruyter.
Manzini, Maria R.
1992 Locality: A Theory and Some of Its Empirical Consequences. Num-
ber 19 in Linguistic Inquiry Monographs. Boston (Mass.): MIT Press.
Rizzi, Luigi
1990 Relativized Minimality. Boston (Mass.): MIT Press.
Rogers, James
1998 A Descriptive Approach to Language-Theoretic Complexity. Stan-
ford: CSLI Publications.
Stabler, Edward P.
1992 The Logical Approach to Syntax. Foundation, Specification and Im-
plementation of Theories of Government and Binding. ACL-MIT
Press Series in Natural Language Processing. Cambridge (Mass.):
MIT Press.
Vardi, Moshe and Pierre Wolper
1986 Automata theoretic techniques for modal logics of programs. Journal
of Computer and Systems Sciences 32: 183–221.
Completeness theorems for syllogistic fragments
Lawrence S. Moss
Abstract
Traditional syllogisms involve sentences of the following simple forms: All X are
Y, Some X are Y, No X are Y; similar sentences with proper names as subjects, and
identities between names. These sentences come with the natural semantics using
subsets of a given universe, and so it is natural to ask about complete proof systems.
Logical systems are important in this area due to the prominence of syllogistic ar-
guments in human reasoning, and also to the role they have played in logic from
Aristotle onwards. We present complete systems for the entire syllogistic fragment
and many sub-fragments. These begin with the fragment of All sentences, for which
we obtain one of the easiest completeness theorems in logic. The last system extends
syllogistic reasoning with the classical boolean operations and cardinality compar-
isons.
1. Introduction: the program of natural logic
This particular project begins with the time-honored syllogisms. The com-
pleteness of various formulations of syllogistic logic has already been shown,
for example by Łukasiewicz (1957) (in work with Słupecki), and in differ-
ent formulations, by Corcoran (1972) and Martin (1997). The technical part
of this paper contains a series of completeness theorems for various systems
as we mentioned in the abstract. In some form, two of them were known al-
ready: see van Benthem (1984) and Westerståhl (1989). We are not aware of
systematic studies of syllogistic fragments, and so this is a goal of the paper.
Perhaps the results and methods will be of interest primarily to specialists
in logic, but we hope that the statements will be of wider interest. Even more,
we hope that the project of natural logic will appeal to people in linguistic
semantics, artificial intelligence, computational semantics, and cognitive sci-
ence. This paper is not the place to give a full exposition of natural logic, and
so we only present a few remarks on it here.
Textbooks on model theoretic semantics often say that the goal of the
enterprise is to study entailment relations (or other related relations). So the
question arises as to what complete logical systems for those fragments would
look like. Perhaps formal reasoning in some system or other will be of in-
dependent interest in semantics. And if one has a complete logical system
for some phenomenon, then one might well take the logical system to be the
semantics in some sense. Even if one does not ultimately want to take a log-
ical presentation as primary but treats it as secondary, it still should be
of interest to have completeness and decidability for as large a fragment of
natural language as possible. As we found out by working on this topic, the
technical work does not seem to be simply an adaptation of older techniques.
So someone interested in pursuing that topic might find something of interest
here.
Most publications on syllogistic-like fragments come from either the
philosophical or the AI literature. The philosophical work is generally con-
cerned with the problem of modern reconstruction of Aristotle, beginning
with Łukasiewicz (1957) and including papers which go in other directions,
such as Corcoran (1972) and Martin (1997). Our work is not reconstruc-
tive, however, and the systems from the past are not of primary interest here.
The AI literature is closer to what we are doing in this paper; see for exam-
ple Purdy (1991). (However, we are interested in completeness theorems, and
the AI work usually concentrates on getting systems that work, and the meta-
theoretic work considers decidability and complexity.) The reason is that
it has proposals which go beyond the traditional syllogistic systems. This
would be a primary goal of what we are calling natural logic. We take a step
in this direction in this paper by adding expressions like There are more As
than Bs to the standard syllogistic systems. This shows that it is possible
to have complete syllogistic systems which are not sub-logics of first-order
logic.
The next steps in this area can be divided into two groups, which we might call the conservative and the radical sub-programs. The conservative program is what we just mentioned: to expand the syllogistic systems but to continue to deal with extensional fragments of language. A next step in this direction would treat sentences with verbs other than the copula. There is some prior work on this: e.g., Nishihara et al. (1990) and McAllester and Givan (1992). In addition, Pratt-Hartmann (2003, 2004) and Pratt-Hartmann and Third (2006) give several complexity-theoretic results in this direction. As soon as one has quantifiers and verbs, the phenomenon of quantifier-scope ambiguity suggests that some interaction with syntax will be needed.
Although the program of natural logic as I have presented it seems ineluctably model-theoretic, my own view is that this is a shortcoming that will have to be rectified. This leads to the more radical program. We also want to
Completeness theorems for syllogistic fragments 145
explore the possibility of having proof theory as the mathematical underpinning for semantics in the first place. This view is suggested in the literature on philosophy of language, but it is not well explored in linguistic semantics because formal semantics is nowadays essentially the same as model-theoretic semantics. We think that this is only because nobody has yet made suggestions in the proof-theoretic direction. This is not quite correct: one paper worth mentioning is Ben-Avi and Francez (2005). In fact, Francez and his colleagues have begun to look at proof-theoretic treatments of syllogistic fragments with a view towards what we are here calling the radical program. One can imagine several ways to kick away the ladder after looking at complete semantics for various fragments, incorporating work from several areas. But this paper is not concerned with any of these directions.
The results This paper proves completeness of the following fragments, written in notation which should be self-explanatory: (i) the fragment with All X are Y; (ii) the fragment with Some X are Y; (iii) = (i) + (ii); (iv) = (iii) + sentences involving proper names; (v) = (i) + No X are Y; (vi) All + Some + No; (vii) = (vi) + Names; (viii) boolean combinations of (vii); (ix) = (i) + There are at least as many X as Y; (x) = boolean combinations of (ix) + Some + No. In addition, we have completeness results for systems off the main track: (xi) All X which are Y are Z; (xii) Most; and (xiii) Most + Some.
For the most part, we work on systems that do not include sentential boolean operations. This is partly due to the intrinsic interest of the more spare systems. Also, we would like systems whose decision problem is polynomial-time computable. The existing (small) literature on logics for natural language generally works on top of propositional logic, and so their satisfiability problems are NP-hard. At the same time, adding propositional reasoning to the logics tends to make the completeness proofs easier, as we shall see: the closer a system is to standard first-order logic, the more applicable are well-known techniques. So from a logical point of view, we are interested in exploring systems which are quite weak.
A final point is that the work here should be of pedagogical interest: the simple completeness theorems in the first few sections of this paper are good vehicles for teaching students about logical systems, soundness, and completeness. This is because the presentation completely avoids all of the details of syntax such as substitution lemmas and rules with side conditions on free variables, and the mathematical arguments of this paper are absolutely elementary. At the same time, the techniques foreshadow what we find in the Henkin-style completeness proofs for first-order logic. So students would see
the technique of syntactically defined models quite early on. (However, since we only have three sides of the classical square of opposition, one occasionally feels as if sitting on a wobbly chair.) This paper does not present natural deduction-style logics, but they do exist, and they would add to a presentation for novices. Overall, this material could be an attractive prelude to standard courses.
Dedication Uwe Mönnich has been a source of inspiration and support for many years. I have learned much from him, not only from his work on grammatical formalisms but also from his wide-ranging store of knowledge and his Menschlichkeit (a quality shared with Traudel). I would like to thank Uwe and to wish him many more years of productive work.
1.1. Getting started
We are concerned with logical systems based on syllogistic reasoning. We interpret a syllogism such as the famous

  All men are mortal.
  Socrates is a man.
  Socrates is mortal.

(The first recorded version of this particular syllogism is due to Sextus Empiricus, in a slightly different form.) The interpretations use sets in the obvious way. The idea again is that the sentences above the line should semantically entail the one below the line. Specifically, in every context (or model) in which All men are mortal and Socrates is a man are true, it must be the case that Socrates is mortal is also true.
Here is another example, a bit closer to what we have in mind for the
study:
All xenophobics are yodelers.
John is a xenophobic.
Mary is a zookeeper.
John is Mary.
Some yodeler is a zookeeper.
(1)
To begin our study, we have the following definitions:
Syntax We start with a set of variables X, Y, . . . , representing plural common nouns. We also have names J, M, . . . . Then we consider sentences S of the following very restricted forms:

  All X are Y,  Some X are Y,  No X are Y,  J is an X,  J is M.

The reason we use scare quotes around "sentences" is that we have only five types of sentences, hence no recursion whatsoever. Obviously it would be important to propose complete systems for infinite fragments. The main example which I know of is that of McAllester and Givan (1992). Their paper showed a decidability result but was not concerned with logical completeness; for this, see Moss (to appear).
Fragments As small as our language is, we shall be interested in a number of fragments of it. These include L(all), the fragment with All (and nothing else); and, with obvious notation, L(all, some), L(all, some, names), and L(all, no). We will also be interested in extensions of the language and variations on the semantics.
Semantics One starts with a set M, a subset [[X]] ⊆ M for each variable X, and an element [[J]] ∈ M for each name J. This gives a model M = (M, [[ ]]). We then define

  M ⊨ All X are Y    iff  [[X]] ⊆ [[Y]]
  M ⊨ Some X are Y   iff  [[X]] ∩ [[Y]] ≠ ∅
  M ⊨ No X are Y     iff  [[X]] ∩ [[Y]] = ∅
  M ⊨ J is an X      iff  [[J]] ∈ [[X]]
  M ⊨ J is M         iff  [[J]] = [[M]]

We allow [[X]] to be empty, and in this case, recall that M ⊨ All X are Y vacuously. And if Γ is a finite or infinite set of sentences, then we write M ⊨ Γ to mean that M ⊨ S for all S ∈ Γ.

Main semantic definition Γ ⊨ S means that every model which makes all sentences in the set Γ true also makes S true. This is the relevant form of semantic entailment for this paper.
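These truth conditions translate directly into a model checker. Here is a minimal sketch in Python; the tuple encoding of sentences and the function name `satisfies` are our own scaffolding, not notation from the paper.

```python
def satisfies(nouns, names, sentence):
    """M |= S for the five sentence forms, via the set-theoretic clauses.
    `nouns` maps variables to sets; `names` maps names to elements."""
    kind = sentence[0]
    if kind == "all":    # [[X]] is a subset of [[Y]]
        return nouns[sentence[1]] <= nouns[sentence[2]]
    if kind == "some":   # [[X]] and [[Y]] intersect
        return bool(nouns[sentence[1]] & nouns[sentence[2]])
    if kind == "no":     # [[X]] and [[Y]] are disjoint
        return not (nouns[sentence[1]] & nouns[sentence[2]])
    if kind == "isa":    # [[J]] is an element of [[X]]
        return names[sentence[1]] in nouns[sentence[2]]
    if kind == "is":     # [[J]] equals [[M]]
        return names[sentence[1]] == names[sentence[2]]
    raise ValueError(kind)

# In any model where All men are mortal and Socrates is a man hold,
# Socrates is mortal holds as well:
nouns = {"men": {"s", "p"}, "mortal": {"s", "p", "f"}}
names = {"Socrates": "s"}
assert satisfies(nouns, names, ("all", "men", "mortal"))
assert satisfies(nouns, names, ("isa", "Socrates", "men"))
assert satisfies(nouns, names, ("isa", "Socrates", "mortal"))
```

Semantic entailment Γ ⊨ S could then be checked over any given finite class of models by quantifying `satisfies` over them.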
Notation If Γ is a set of sentences, we write Γ_all for the subset of Γ containing only sentences of the form All X are Y. We do this for the other constructs as well, writing Γ_some, Γ_no, and Γ_names.
Inference rules of the logical system The complete set of rules for the syllogistic fragment may be found in Figure 6 below. But we are also concerned with other fragments, especially from Section 8 onward. Rules for the other fragments will be presented as needed.
Proof trees A proof tree over Γ is a finite tree T whose nodes are labeled with sentences in our fragment, with the additional property that each node is either an element of Γ or comes from its parent(s) by an application of one of the rules. Γ ⊢ S means that there is a proof tree T over Γ whose root is labeled S.
Example 1 Here is a proof tree formalizing the reasoning in (1):

  All X are Y   J is an X      M is a Z   J is M
  -----------------------      -----------------
        J is a Y                    J is a Z
        ------------------------------------
                  Some Y are Z
Example 2 We take

  Γ = {All A are B, All Q are A, All B are D, All C are D, All A are Q}.

Let S be All Q are D. Here is a proof tree showing that Γ ⊢ S:

               All A are B   All B are B
               -------------------------
                    All A are B           All B are D
                    ---------------------------------
  All Q are A              All A are D
  ------------------------------------
              All Q are D

Note that all of the leaves belong to Γ except for one, which is All B are B. Note also that some elements of Γ are not used as leaves. This is permitted according to our definition. The proof tree above shows that Γ ⊢ S. There is also a smaller proof tree that does this, since the use of All B are B is not really needed. (The reason why we allow leaves labeled by sentences such as All B are B is so that we can have one-element trees labeled with sentences of the form All A are A.)
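Since the All rules (the axiom All X are X and the transitivity rule) just compose inclusions, derivability in L(all) amounts to graph reachability along the All sentences of Γ. A small sketch in Python, with our own encoding of All U are V as a pair (u, v):

```python
def all_derivable(gamma, x, y):
    """Decide Gamma |- All x are y in L(all).  By the axiom and the
    transitivity rule this is reachability from x to y along the edges
    {(u, v) : 'All u are v' in gamma}."""
    seen, stack = {x}, [x]
    while stack:                       # depth-first search along All edges
        u = stack.pop()
        for (a, b) in gamma:
            if a == u and b not in seen:
                seen.add(b)
                stack.append(b)
    return y in seen                   # reflexive: x itself is always in seen

gamma = {("A", "B"), ("Q", "A"), ("B", "D"), ("C", "D"), ("A", "Q")}
assert all_derivable(gamma, "Q", "D")      # the conclusion of Example 2
assert not all_derivable(gamma, "D", "A")
```

That this search decides ⊢ is exactly the content of Proposition 2 below, which identifies derivability with the reflexive-transitive closure of the All sentences of Γ.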
Lemma 1 (Soundness) If Γ ⊢ S, then Γ ⊨ S.
Proof. By induction on proof trees. □
  -----------      All X are Z   All Z are Y
  All X are X      -------------------------
                         All X are Y

  Figure 1. The logic of All X are Y.
Example 3 One easy semantic fact is

  Some X are Y, Some Y are Z ⊭ Some X are Z.

The smallest countermodel is {1, 2} with [[X]] = {1}, [[Y]] = {1, 2}, and [[Z]] = {2}. Even if we ignore the soundness of the logical system, an examination of its proofs shows that

  Some X are Y, Some Y are Z ⊬ Some X are Z.

Indeed, the only sentences which follow from the hypotheses are those sentences themselves, the sentences Some X are X, Some Y are Y, Some Z are Z, Some Y are X, and Some Z are Y, and the axioms of the system: sentences of the form All U are U and J is J.
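The countermodel claims in Example 3 can be confirmed by brute force: enumerate every interpretation of X, Y, Z over a given small universe. A sketch (encodings ours):

```python
from itertools import combinations, product

def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    for r in range(len(universe) + 1):
        for c in combinations(universe, r):
            yield frozenset(c)

def countermodel_exists(universe):
    """Is there an interpretation over `universe` making Some X are Y and
    Some Y are Z true but Some X are Z false?"""
    return any((X & Y) and (Y & Z) and not (X & Z)
               for X, Y, Z in product(list(subsets(universe)), repeat=3))

assert not countermodel_exists({1})     # one point is too small
assert countermodel_exists({1, 2})      # e.g. [[X]]={1}, [[Y]]={1,2}, [[Z]]={2}
```

The one-point failure is the observation made again at the end of Section 3: on models of size 1, the two premises force the conclusion.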
There are obvious notions of submodel and homomorphism of models.
Proposition 1 Sentences in L(all, no, names) are preserved under submodels. Sentences in L(some, names) are preserved under homomorphisms. Sentences in L(all) are preserved under surjective homomorphic images.
2. All
This paper is organized in sections corresponding to different fragments. To
begin, we present a system for L(all). All of our logical systems are sound
by Lemma 1.
Theorem 1 The logic of Figure 1 is complete for L(all).
Proof. Suppose that Γ ⊨ S. Let S be All X are Y. Let ⋆ be any singleton, and define a model M by M = {⋆}, and

  [[Z]] = M if Γ ⊢ All X are Z, and [[Z]] = ∅ otherwise.   (2)
It is important that in (2), X is the same variable as in the sentence S with which we began. We claim that if Γ contains All V are W, then [[V]] ⊆ [[W]]. For this, we may assume that [[V]] ≠ ∅ (otherwise the claim is trivial). So [[V]] = M. Thus Γ ⊢ All X are V. So we have a proof tree over Γ as indicated by the vertical dots below:

       .
       .
       .
  All X are V      All V are W
  ----------------------------
         All X are W

The tree overall has as leaves All V are W plus the leaves of the tree above All X are V. Overall, we see that all leaves are labeled by sentences in Γ. This tree shows that Γ ⊢ All X are W. From this we conclude that [[W]] = M. In particular, [[V]] ⊆ [[W]].

Now our claim implies that the model M we have defined makes all sentences in Γ true. So it must make the conclusion true. Therefore [[X]] ⊆ [[Y]]. And [[X]] = M, since we have a one-point tree for All X are X. Hence [[Y]] = M as well. But this means that Γ ⊢ All X are Y, just as desired. □
Remark The completeness of L(all) appears to be the simplest possible completeness result of any logical system! (One can also make this claim about the pure identity fragment, the one whose statements are of the form J is M and whose logical presentation amounts to the reflexive, symmetric, and transitive laws.) At the same time, we are not aware of any prior statement of its completeness.
2.1. The canonical model property
We introduce a property which some of the logical systems in this paper enjoy. First we need some preliminary points. For any set Γ of sentences, define ≤_Γ on the set of variables by

  U ≤_Γ V   iff   Γ ⊢ All U are V.   (3)

Lemma 2 The relation ≤_Γ is a preorder: a reflexive and transitive relation.

We shall often use preorders ≤_Γ defined by (3). Also define a preorder ⊑*_Γ on the variables: let U ⊑_Γ V if Γ contains All U are V, and let ⊑*_Γ be the reflexive-transitive closure of ⊑_Γ. Usually we suppress mention of Γ and simply write ≤, ⊑, and ⊑*.
Proposition 2 Let Γ be any set of sentences in this fragment, and let ⊑*_Γ be defined from Γ as above. Let X and Y be any variables. Then the following are equivalent:

1. Γ ⊢ All X are Y.
2. Γ ⊨ All X are Y.
3. X ⊑*_Γ Y.

Proof. (1) ⟹ (2) is by soundness, and (3) ⟹ (1) is by induction on ⊑*_Γ. The most significant part is (2) ⟹ (3). We build a model M. As in the proof of Theorem 1, we take M = {⋆}. But we modify (2) by taking [[Z]] = M iff X ⊑*_Γ Z. We claim that M ⊨ Γ. Consider All V are W in Γ. We may assume that [[V]] = M, or else our claim is trivial. Then X ⊑*_Γ V. But V ⊑_Γ W, so we have X ⊑*_Γ W, as desired. This verifies that M ⊨ Γ. But [[X]] = M, and therefore [[Y]] = M as well. Hence X ⊑*_Γ Y, as desired. □
Definition 1 Let F be a fragment, let Γ be a set of sentences in F, and consider a fixed logical system for F. A model M is canonical for Γ if for all S ∈ F, M ⊨ S iff Γ ⊢ S. A fragment F has the canonical model property (for the given logical system) if every set Γ ⊆ F has a canonical model.

(For example, in L(all), M is canonical for Γ provided: X ≤ Y iff [[X]] ⊆ [[Y]].)
Notice, for example, that classical propositional and first-order logic do not have the canonical model property. A model of Γ = {p} will have to commit to a value on a different propositional symbol q, and yet neither q nor ¬q follows from Γ. These systems do have the property that every maximal consistent set has a canonical model. Since they also have negation, this last fact leads to completeness. As it turns out, the fragments in this paper exhibit differing behavior with respect to the canonical model property. Some have it, some do not, and some have it for certain classes of sentences.
Proposition 3 L(all) has the canonical model property with respect to our logical system for it.

Proof. Given Γ, let M be the model whose universe is the set of variables, and with [[U]] = {Z : Z ≤ U}. Consider a sentence S = All X are Y. Then [[X]] ⊆ [[Y]] in M iff X ≤ Y. (Both rules of the logic are used here.) □

The canonical model property is stronger than completeness. To see this, let M be canonical for a fixed set Γ. In particular M ⊨ Γ. Hence if Γ ⊨ S, then M ⊨ S; so Γ ⊢ S.
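The canonical model of Proposition 3 can be built and its defining property checked mechanically. A sketch in Python; `leq` computes the preorder ≤ of (3) by reachability, and the encoding of All U are V as a pair (u, v) is ours:

```python
def leq(gamma, x, y):
    """x <= y iff Gamma |- All x are y: reachability along All edges."""
    seen, stack = {x}, [x]
    while stack:
        u = stack.pop()
        for (a, b) in gamma:
            if a == u and b not in seen:
                seen.add(b)
                stack.append(b)
    return y in seen

def canonical_model(gamma, variables):
    """Universe = the variables; [[U]] = {Z : Z <= U}, as in Proposition 3."""
    return {u: {z for z in variables if leq(gamma, z, u)} for u in variables}

gamma = {("A", "B"), ("B", "C")}
variables = ["A", "B", "C", "D"]
m = canonical_model(gamma, variables)
# Canonicity: [[X]] subset of [[Y]] exactly when Gamma |- All X are Y.
for x in variables:
    for y in variables:
        assert (m[x] <= m[y]) == leq(gamma, x, y)
```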
  ---------      ---------      (X,Y,U)   (X,Y,V)   (U,V,Z)
  (X,Y,X)        (X,Y,Y)        ---------------------------
                                          (X,Y,Z)

  Figure 2. The logic of All X which are Y are Z, written here (X,Y,Z).
2.2. A digression: All X which are Y are Z
At this point, we digress from our main goal of the examination of the syllogistic system of Section 1.1. Instead, we consider the logic of All X which are Y are Z. To save space, we abbreviate this by (X,Y,Z). We take this sentence to be true in a given model M if [[X]] ∩ [[Y]] ⊆ [[Z]]. Note that All X are Y is semantically equivalent to (X,X,Y).

First, we check that the logic is genuinely new. The result in Proposition 4 clearly also holds for the closure of L(all, some, no) under (infinitary) boolean operations.

Proposition 4 Let R be All X which are Y are Z. Then R cannot be expressed by any set in the language L(all, some, no). That is, there is no set Γ of sentences in L(all, some, no) such that for all M, M ⊨ Γ iff M ⊨ R.

Proof. Consider the model M with universe {x, y, a}, with [[X]] = {x, a}, [[Y]] = {y, a}, [[Z]] = {a}, and also [[U]] = ∅ for all other variables U. Consider also a model N with universe {x, y, a, b}, with [[X]] = {x, a, b}, [[Y]] = {y, a, b}, [[Z]] = {a}, and the rest of the structure the same as in M. An easy examination shows that for all sentences S ∈ L(all, some, no), M ⊨ S iff N ⊨ S. Now suppose towards a contradiction that we could express R, say by the set Γ. Then since M and N agree on L(all, some, no), they agree on Γ. But M ⊨ R and N ⊭ R, a contradiction. □
Theorem 2 The logic of All X which are Y are Z in Figure 2 is complete.

Proof. Suppose Γ ⊨ (X,Y,Z). Consider the interpretation M given by M = {⋆}, and for each variable W, ⋆ ∈ [[W]] iff Γ ⊢ (X,Y,W). We claim that for (U,V,W) ∈ Γ, [[U]] ∩ [[V]] ⊆ [[W]]. For this, we may assume that M = [[U]] ∩ [[V]]. So we use the proof tree

     .            .
     .            .
  (X,Y,U)      (X,Y,V)      (U,V,W)
  ---------------------------------
              (X,Y,W)

This shows that [[W]] = M, as desired.

Returning to our sentence (X,Y,Z), our overall assumption that Γ ⊨ (X,Y,Z) tells us that M ⊨ (X,Y,Z). The first two axioms show that ⋆ ∈ [[X]] ∩ [[Y]]. Hence ⋆ ∈ [[Z]]. That is, Γ ⊢ (X,Y,Z). □
Remark Instead of the axiom (X,Y,Y), we could have taken the symmetry rule

  (Y,X,Z)
  --------
  (X,Y,Z)

The two systems are equivalent.

Remark The fragment with (X,X,Y) is a conservative extension of the fragment with All, via the translation of All X are Y as (X,X,Y).
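Derivability in the logic of Figure 2 can be decided by the least-fixpoint computation implicit in the proof of Theorem 2: to decide Γ ⊢ (A,B,C), start from the axioms (A,B,A) and (A,B,B) and close under the rule. A sketch (the triple encoding is ours):

```python
def derives(gamma, query):
    """Decide Gamma |- (A,B,C) in the logic of Figure 2.  The set
    D = {W : Gamma |- (A,B,W)} starts from the axioms (A,B,A), (A,B,B)
    and is closed under (A,B,U), (A,B,V), (U,V,W) / (A,B,W)."""
    a, b, c = query
    d = {a, b}                       # the two axioms
    changed = True
    while changed:
        changed = False
        for (u, v, w) in gamma:      # one rule application per triple
            if u in d and v in d and w not in d:
                d.add(w)
                changed = True
    return c in d

gamma = {("X", "Y", "U"), ("X", "Y", "V"), ("U", "V", "Z")}
assert derives(gamma, ("X", "Y", "Z"))      # the proof tree in Theorem 2
assert not derives(gamma, ("U", "V", "X"))
```

That this fixpoint is sound and complete follows the one-point model in the proof of Theorem 2: interpret ⋆ ∈ [[W]] iff W ∈ D.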
3. All and Some
We enrich our language with sentences Some X are Y and our rules with those of Figure 3. The symmetry rule for Some may be dropped if one twists the transitivity rule to read

  All Y are Z   Some X are Y
  --------------------------
        Some Z are X

Then symmetry is derivable. We will use the twisted form in later work, but for now we want the three rules of Figure 3, because the first two alone are used in Theorem 3 below.
Example 4 Perhaps the first non-trivial derivation in the logic is the following one:

                 All Z are X   Some Z are Z
                 --------------------------
                       Some Z are X
                       ------------
  All Z are Y          Some X are Z
  ---------------------------------
           Some X are Y

That is, if there is a Z, and if all Zs are Xs and also Ys, then some X is a Y.
In working with Some sentences, we adopt notation parallel to (3) for All:

  U ∼_Γ V   iff   Γ ⊢ Some U are V.   (4)

Usually we drop the subscript Γ. By the symmetry rule, ∼ is symmetric.

The next result is essentially due to van Benthem (1984), Theorem 3.3.5.
  Some X are Y      Some X are Y      All Y are Z   Some X are Y
  ------------      ------------      --------------------------
  Some Y are X      Some X are X            Some X are Z

  Figure 3. The logic of Some and All, in addition to the logic of All.
Theorem 3 The first two rules in Figure 3 give a logical system with the canonical model property for L(some). Hence the system is complete.

Proof. Let Γ ⊆ L(some). Let M = M(Γ) be the set of unordered pairs {U,V} of variables (i.e., sets with one or two elements) such that U ∼ V. Let

  [[U]] = {{U,V} : U ∼ V}.

Observe that the elements of [[U]] are unordered pairs with one element being U. If U ∼ V, then {U,V} ∈ [[U]] ∩ [[V]]. Assume first that X ≠ Y and that Γ contains S = Some X are Y. Then {X,Y} ∈ [[X]] ∩ [[Y]], so M ⊨ S. Conversely, if {U,V} ∈ [[X]] ∩ [[Y]], then by what we have said above {U,V} = {X,Y}. In particular, {X,Y} ∈ M. So X ∼ Y. Second, we consider the situation when X = Y. If Γ contains S = Some X are X, then {X} ∈ [[X]]. So M ⊨ S. Conversely, if {U,V} ∈ [[X]], then (without loss of generality) U = X, and X ∼ V. Using our second rule for Some, we see that X ∼ X. □
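The unordered-pair construction in the proof of Theorem 3 is easy to run. A sketch in Python; `some_closure` computes ∼ by closing Γ under the first two rules of Figure 3, and the pair encoding of Some U are V is ours:

```python
def some_closure(gamma):
    """Close gamma (pairs (u, v) encoding Some u are v) under the first two
    rules of Figure 3: symmetry, and Some X are Y / Some X are X."""
    derived = set()
    for (u, v) in gamma:
        derived |= {(u, v), (v, u), (u, u), (v, v)}
    return derived

def canonical_model(gamma, variables):
    """The unordered-pair model of Theorem 3: [[U]] = {{U,V} : U ~ V}."""
    rel = some_closure(gamma)
    return {u: {frozenset((u, v)) for v in variables if (u, v) in rel}
            for u in variables}

gamma = {("X", "Y"), ("Y", "Z")}
vars_ = ["X", "Y", "Z"]
m = canonical_model(gamma, vars_)
rel = some_closure(gamma)
for a in vars_:                 # canonicity: [[a]] meets [[b]] iff a ~ b
    for b in vars_:
        assert bool(m[a] & m[b]) == ((a, b) in rel)
assert not (m["X"] & m["Z"])    # Some X are Z stays underivable (Example 3)
```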
The rest of this section is devoted to the combination of All and Some.

Lemma 3 Let Γ ⊆ L(all, some). Then there is a model M with the following properties:

1. If X ≤ Y, then [[X]] ⊆ [[Y]].
2. [[X]] ∩ [[Y]] ≠ ∅ iff X ∼ Y.

In particular, M ⊨ Γ.
Proof. Let N = |Γ_some|. We regard N as the ordinal number {0, 1, . . . , N−1}. For i ∈ N, let V_i and W_i be such that

  Γ_some = {Some V_i are W_i : i ∈ N}   (5)

Note that for i ≠ j, we might well have V_i = V_j or W_i = W_j. For the universe of M we take the set N. For each variable Z, we define

  [[Z]] = {i ∈ N : either V_i ≤ Z or W_i ≤ Z}.   (6)

(As in (3), the relation ≤ is: X ≤ Y iff Γ ⊢ All X are Y.) This defines the model M.

For the first point, suppose that X ≤ Y. It follows from (6) and Lemma 2 that [[X]] ⊆ [[Y]].

Second, take a sentence Some V_i are W_i on our list in (5) above. Then i itself belongs to [[V_i]] ∩ [[W_i]], so this intersection is not empty. At this point we know that M ⊨ Γ, and so by soundness we get half of the second point in this lemma.

For the left-to-right direction of the second point, assume that [[X]] ∩ [[Y]] ≠ ∅. Let i ∈ [[X]] ∩ [[Y]]. We have four cases, depending on whether V_i ≤ X or W_i ≤ X, and on whether V_i ≤ Y or W_i ≤ Y. In each case, we use the logic to see that X ∼ Y. The formal proofs are all similar to what we saw in Example 4 above. □
Theorem 4 The logic of Figures 1 and 3 is complete for L(all, some).

Proof. Suppose that Γ ⊨ S. There are two cases, depending on whether S is of the form All X are Y or of the form Some X are Y. In the first case, we claim that Γ_all ⊨ S. To see this, let M ⊨ Γ_all. We get a new model M′ with universe M ∪ {⋆}, where ⋆ is a new point, via [[X]]′ = [[X]] ∪ {⋆}. The model M′ so obtained satisfies Γ_all and all Some sentences whatsoever in the fragment. Hence M′ ⊨ Γ. So M′ ⊨ S. And since S is a universal sentence, M ⊨ S as well. This proves our claim that Γ_all ⊨ S. By Theorem 1, Γ_all ⊢ S. Hence Γ ⊢ S.

The second case, where S is of the form Some X are Y, is an immediate application of Lemma 3. □
Remark Let Γ ⊆ L(all, some), and let S ∈ L(some). As we know from Lemma 3, if Γ ⊬ S, there is an M ⊨ Γ which makes S false. The proof gives a model M whose size is |Γ_some|. We can in fact get a countermodel of size at most 2. To see this, let M be as in Lemma 3, and let S be Some X are Y. If either [[X]] or [[Y]] is empty, we can coalesce all the points of M to a single point ⋆, and then take [[U]]′ = {⋆} iff [[U]] ≠ ∅. So we assume that [[X]] and [[Y]] are non-empty. Let N be the two-point model {1, 2}. Define f : M → N by f(x) = 1 iff x ∈ [[X]]. The structure of N is given by [[U]]_N = f[[[U]]_M]. This makes f a surjective homomorphism. By Proposition 1, N ⊨ Γ. And the construction ensures that in N, [[X]] ∩ [[Y]] = ∅.

Note that 2 is the smallest size we can get, since on models of size 1,

  Some X are Y, Some Y are Z ⊨ Some X are Z.
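The two-point collapse in the remark above can be carried out mechanically: send every point of [[X]] to 1 and everything else to 2, and take images. A sketch (encodings ours; only noun denotations are modeled):

```python
def collapse(model, x, y):
    """The surjective image onto {1, 2} from the remark: f(m) = 1 iff
    m is in model[x], with [[U]]_N = f[[[U]]_M].  Assumes model[x] and
    model[y] are nonempty and disjoint."""
    f = lambda m: 1 if m in model[x] else 2
    return {u: {f(m) for m in ext} for u, ext in model.items()}

# A model in which Some X are Y fails, with an All sentence to preserve:
m = {"X": {0, 1}, "Y": {2, 3}, "W": {0, 1, 2, 3}}
n = collapse(m, "X", "Y")
assert n["X"] == {1} and n["Y"] == {2}        # still disjoint in the image
assert m["X"] <= m["W"] and n["X"] <= n["W"]  # All X are W survives
```

As Proposition 1 predicts, Some sentences and (by surjectivity) All sentences true in the original model remain true in the image.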
  ------      J is M   M is F      J is an X   J is a Y
  J is J      ---------------      --------------------
                  F is J               Some X are Y

  All X are Y   J is an X      M is an X   J is M
  -----------------------      ------------------
         J is a Y                  J is an X

  Figure 4. The logic of names, on top of the logic of All and Some.
Remark L(all, some) does not have the canonical model property with respect to any logical system. To see this, let Γ be the set {All X are Y}. Let M ⊨ Γ. Then either M ⊨ All Y are X, or M ⊨ Some Y are Y. But neither of these sentences follows from Γ. We cannot hope to avoid the case split in the proof of Theorem 4, due to the syntax of S.
Remark Suppose that one wants to say that All X are Y is true when [[X]] ⊆ [[Y]] and also [[X]] ≠ ∅. Then the following rule becomes sound:

  All X are Y
  ------------
  Some X are Y   (7)

On the other hand, it is no longer sound to take All X are X to be an axiom. So we drop that rule in favor of (7). In this way, we get a complete system for the modified semantics. Here is how one sees this. Given Γ, let Γ⁺ be Γ together with all sentences Some X are Y such that All X are Y belongs to Γ. An easy induction on proofs shows that Γ ⊢ S in the modified system iff Γ⁺ ⊢ S in the old system.
4. Adding Proper Names
In this section we obtain completeness for sentences in L(all, some, names). The proof system adds the rules in Figure 4 to what we have already seen in Figures 1 and 3.

Fix a set Γ ⊆ L(all, some, names). Let ≈ and ε be the relations defined from Γ by

  J ≈ M   iff   Γ ⊢ J is M
  J ε X   iff   Γ ⊢ J is an X

Lemma 4 ≈ is an equivalence relation. And if J ≈ M ε X ≤ Y, then J ε Y.
Lemma 5 Let Γ ⊆ L(all, some, names). Then there is a model N with the following properties:

1. If X ≤ Y, then [[X]] ⊆ [[Y]].
2. [[X]] ∩ [[Y]] ≠ ∅ iff X ∼ Y.
3. [[J]] = [[M]] iff J ≈ M.
4. [[J]] ∈ [[X]] iff J ε X.

Proof. Let M be any model satisfying the conclusion of Lemma 3 for Γ_all ∪ Γ_some. Let N be defined by

  N = M + {[J] : J a name}
  [[X]] = [[X]]_M + {[J] : J ε X}   (8)

where + denotes disjoint union, [J] is the ≈-equivalence class of J, and each name J is interpreted by [[J]] = [J]. It is easy to check that M and N satisfy the same All sentences, that the Some sentences true in M are still true in N, and that points (3) and (4) of our lemma hold. So what remains is to check that if [[X]] ∩ [[Y]] ≠ ∅ in N, then X ∼ Y. The only interesting case is when [J] ∈ [[X]] ∩ [[Y]] for some name J. So J ε X and J ε Y. Using the one rule of the logic which has both names and Some, we see that X ∼ Y. □
Theorem 5 The logic of Figures 1, 3, and 4 is complete for L(all, some, names).

Proof. The proof is nearly the same as that of Theorem 4. In the part of the proof dealing with All sentences, we had a construction taking a model M to a one-point extension M′. To interpret names in M′, we let [[J]] = ⋆ for all names J. Then all sentences involving names are automatically true in M′. □
5. All and No
In this section, we consider L(all, no). Note that No X are X just says that there are no Xs. In addition to the rules of Figure 1, we take the rules in Figure 5. As in (3) and (4), we write

  U ⊥_Γ V   iff   Γ ⊢ No U are V.   (9)

This relation is symmetric.
  All X are Z   No Z are Y      No X are X      No X are X
  ------------------------      -----------     -----------
        No Y are X              No X are Y      All X are Y

  Figure 5. The logic of No X are Y on top of All X are Y.
Lemma 6 L(all, no) has the canonical model property with respect to our logic.

Proof. Let Γ be any set of sentences in All and No. Let

  M = {{U,V} : not U ⊥ V}
  [[W]] = {{U,V} ∈ M : U ≤ W or V ≤ W}   (10)

The semantics is monotone, and so if X ≤ Y, then [[X]] ⊆ [[Y]]. Conversely, suppose that [[X]] ⊆ [[Y]]. If [[X]] = ∅, then X ⊥ X, for otherwise {X} ∈ [[X]]. From the last rule in Figure 5, we see that X ≤ Y, as desired. In the other case, [[X]] ≠ ∅. Fix {V,W} ∈ [[X]], so that not V ⊥ W, and either V ≤ X or W ≤ X. Without loss of generality, V ≤ X. We cannot have X ⊥ X, or else V ⊥ V and then V ⊥ W. So {X} ∈ [[X]] ⊆ [[Y]]. Thus X ≤ Y.

We have shown X ≤ Y iff [[X]] ⊆ [[Y]]. This is half of the canonical model property, the other half being X ⊥ Y iff [[X]] ∩ [[Y]] = ∅. Suppose first that [[X]] ∩ [[Y]] = ∅. Then {X,Y} ∉ M, lest it belong to both [[X]] and [[Y]]. So X ⊥ Y. Conversely, suppose that X ⊥ Y. Suppose towards a contradiction that {V,W} ∈ [[X]] ∩ [[Y]]. There are four cases, and two representative ones are (i) V ≤ X and W ≤ Y, and (ii) V ≤ X and V ≤ Y. In (i), we have the following tree, in which the leaves All V are X, All W are Y, and No X are Y abbreviate proof trees over Γ:

                 All V are X   No X are Y
                 ------------------------
  All W are Y          No Y are V
  -------------------------------
           No V are W

This contradicts {V,W} ∈ M. In (ii), we replace W by V in the tree above, so that the root is No V are V. Then we use one of the rules to conclude No V are W, again contradicting {V,W} ∈ M. □
Since the canonical model property is stronger than completeness, we
have shown the following result:
Theorem 6 The logic of Figures 1 and 5 is complete for All and No.
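The canonical model of Lemma 6 can be computed by first saturating Γ under the rules of Figures 1 and 5 and then forming the unordered-pair model of (10). A sketch in Python over a fixed finite set of variables (the sentence encodings are ours):

```python
def saturate(gamma, variables):
    """Close gamma under the rules of Figures 1 and 5 over `variables`.
    Sentences are ("all", x, y) or ("no", x, y)."""
    s = set(gamma) | {("all", x, x) for x in variables}   # axiom All X are X
    changed = True

    def add(t):
        nonlocal changed
        if t not in s:
            s.add(t)
            changed = True

    while changed:
        changed = False
        for x in variables:
            for y in variables:
                if ("no", x, x) in s:        # No X are X / No X are Y, All X are Y
                    add(("no", x, y))
                    add(("all", x, y))
                for z in variables:
                    if ("all", x, z) in s and ("all", z, y) in s:
                        add(("all", x, y))   # transitivity
                    if ("all", x, z) in s and ("no", z, y) in s:
                        add(("no", y, x))    # All X are Z, No Z are Y / No Y are X
    return s

def canonical_model(gamma, variables):
    """The model of (10): universe = the non-No pairs; [[W]] = the pairs
    containing some U with U <= W."""
    s = saturate(gamma, variables)
    universe = {frozenset((u, v)) for u in variables for v in variables
                if ("no", u, v) not in s}
    return s, {w: {p for p in universe if any(("all", u, w) in s for u in p)}
               for w in variables}

vars_ = ["X", "Y", "Z"]
s, m = canonical_model({("all", "X", "Y"), ("no", "Y", "Z")}, vars_)
for a in vars_:                # canonicity, both halves of Lemma 6
    for b in vars_:
        assert (m[a] <= m[b]) == (("all", a, b) in s)
        assert (not (m[a] & m[b])) == (("no", a, b) in s)
```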
  -----------      All X are Z   All Z are Y      Some X are Y      All Y are Z   Some X are Y
  All X are X      -------------------------      ------------      --------------------------
                         All X are Y              Some X are X            Some Z are X

  ------      J is M   M is F      J is an X   J is a Y      All X are Y   J is an X
  J is J      ---------------      --------------------      ----------------------
                  F is J               Some X are Y                 J is a Y

  M is an X   J is M      All X are Z   No Z are Y      No X are X      No X are X
  ------------------      ------------------------      -----------     -----------
      J is an X                 No Y are X              No X are Y      All X are Y

  Some X are Y   No X are Y
  -------------------------
              S

  Figure 6. A complete set of rules for L(all, some, no, names).
6. The language L(all, some, no, names)
At this point, we put together our work on the previous systems by proving a completeness result for L(all, some, no, names). For the logic, we take all the rules in Figure 6. This includes all the rules from Figures 1, 3, 4, and 5. But we must also add a principle relating Some and No. For the first time, we face the problem of potential inconsistency: there are no models of both Some X are Y and No X are Y. Hence any sentence S whatsoever follows from these two. This explains the last rule, a new one, in Figure 6.
Definition 2 A set Γ is inconsistent if Γ ⊢ S for all S. Otherwise, Γ is consistent.

Before we turn to the completeness result in Theorem 7 below, we need a result specifically for L(all, no, names).

Lemma 7 Let Γ ⊆ L(all, no, names) be a consistent set. Then there is a model N such that
1. [[X]] ⊆ [[Y]] iff X ≤ Y.
2. [[X]] ∩ [[Y]] = ∅ iff X ⊥ Y.
3. [[J]] = [[M]] iff J ≈ M.
4. [[J]] ∈ [[X]] iff J ε X.

Proof. Let M be the model from Lemma 6 for Γ_all ∪ Γ_no. Let N come from M by the definitions in (8) in Lemma 5. (That is, we add the equivalence classes of the names in the natural way.) It is easy to check all of the parts above except perhaps the second. If [[X]] ∩ [[Y]] = ∅ in N, then the same holds in its submodel M, and so X ⊥ Y. In the other direction, assume that X ⊥ Y but, towards a contradiction, that [[X]] ∩ [[Y]] ≠ ∅. There are no points in the intersection in the submodel M of N. So let J be such that [J] ∈ [[X]] ∩ [[Y]]. Then by our last point, J ε X and J ε Y. Using the one rule of the logic which has both names and Some, we see that Γ ⊢ Some X are Y. Since X ⊥ Y, we see that Γ is inconsistent. □
Theorem 7 The logic in Figure 6 is complete for L(all, some, no, names).

Proof. Suppose that Γ ⊨ S. We show that Γ ⊢ S. We may assume that Γ is consistent, or else our result is trivial. There are a number of cases, depending on S.

First, suppose that S ∈ L(some, names). Let N be from Lemma 5 for Γ_all ∪ Γ_some ∪ Γ_names. There are two cases. If N ⊨ Γ_no, then by hypothesis N ⊨ S. Lemma 5 then shows that Γ ⊢ S, as desired. Alternatively, there may be some No A are B in Γ_no such that [[A]] ∩ [[B]] ≠ ∅. And again, Lemma 5 shows that Γ_all ∪ Γ_some ∪ Γ_names ⊢ Some A are B. So Γ is inconsistent.

Second, suppose that S ∈ L(all, no). Let N come from Lemma 7 for Γ_all ∪ Γ_no ∪ Γ_names. If N ⊨ Γ_some, then by hypothesis N ⊨ S. By Lemma 7, Γ ⊢ S. Otherwise, there is some sentence Some A are B in Γ_some such that [[A]] ∩ [[B]] = ∅. And then N ⊨ No A are B. By Lemma 7, Γ ⊢ No A are B. Again, Γ is inconsistent. □
7. Adding Boolean Operations
The classical syllogisms include sentences Some X is not a Y. In our setting, it makes sense also to add other sentences with negative verb phrases:
1. All substitution instances of propositional tautologies.
2. All X are X
3. (All X are Z) ∧ (All Z are Y) → All X are Y
4. (All Y are Z) ∧ (Some X are Y) → Some Z are X
5. Some X are Y → Some X are X
6. No X are X → All X are Y
7. No X are Y ↔ ¬(Some X are Y)
8. J is J
9. (J is M) ∧ (M is F) → F is J
10. (J is an X) ∧ (J is a Y) → Some X are Y
11. (All X are Y) ∧ (J is an X) → J is a Y
12. (M is an X) ∧ (J is M) → J is an X

Figure 7. Axioms for boolean combinations of sentences in L(all, some, no, names).
J is not an X, and J is not M. It is possible to consider the logical system that is obtained by adding just these sentences. But it is also possible to simply add the boolean operations on top of the language which we have already considered. So we have atomic sentences of the kinds we have already seen (the sentences in L(all, some, no, names)), and then we have arbitrary conjunctions, disjunctions, and negations of sentences. We present a Hilbert-style axiomatization of this logic in Figure 7. Its completeness appears in Łukasiewicz (1957) (in work with Słupecki; they also showed decidability), and also in Westerståhl (1989); axioms 1–6 are essentially the system SYLL. We include Theorem 8 in this paper because it is a natural next step, because the techniques build on what we have already seen, and because we shall generalize the result in Section 8.3.

It should be noted that the axioms in Figure 7 are not simply transcriptions of the rules from our earlier system in Figure 6. The biconditional (7) relating Some and No is new, and using it, one can dispense with two of the transcribed versions of the No rules from earlier. Similarly, we should emphasize that the pure syllogistic logic is computationally much more tractable than the boolean system, being decidable in polynomial time.
As with any Hilbert-style system, the only rule of the system in this section is modus ponens. (We think of the other systems in this paper as having many rules.) We define ⊢ φ in the usual way, and then we say that Γ ⊢ φ if there are ψ_1, . . . , ψ_n from Γ such that ⊢ (ψ_1 ∧ · · · ∧ ψ_n) → φ.

The soundness of this system is routine.
Proposition 5 If Γ ∪ {φ_0} ⊆ L(all, some, no, names), and if Γ ⊢ φ_0 using the system of Figure 6, then Γ ⊢ φ_0 in the system of Figure 7.

The proof is by induction on proof trees in the previous system. We shall use this result frequently in what follows, without special mention.
Theorem 8 The logic of Figure 7 is complete for assertions [= in the
language of boolean combinations from L(all, some, no, names).
The rest of this section is devoted to the proof of Theorem 8. As usual, the
presence of negation in the language allows us to prove completeness by
showing that every consistent Γ in the language of this section has a model.
We may as well assume that Γ is maximal consistent.
Definition 3 The basic sentences are those of the form All X are Y, Some X
are Y, J is M, and J is an X, or their negations. Let
Δ = {S ∈ Γ : S is basic}.
Note that Δ might contain sentences ¬(All X are Y) which do not belong
to the syllogistic language L(all, some, no, names).
Claim 1 Δ ⊨ Γ. That is, every model of Δ is a model of Γ.
To see this, let M ⊨ Δ and let φ ∈ Γ. We may assume that φ is in disjunc-
tive normal form. It is sufficient to show that some disjunct of φ holds in M.
By maximal consistency, let ψ be a disjunct of φ which also belongs to Γ.
Each conjunct of ψ belongs to Δ and so holds in M.
The construction of a model of Δ is similar to what we saw in Theorem 5.
Define ≤ to be the relation on variables given by X ≤ Y if the sentence All
X are Y belongs to Δ. We claim that ≤ is reflexive and transitive. We'll just
check the transitivity. Suppose that All X are Y and All Y are Z belong to
Δ. Then they belong to Γ. Using Proposition 5, we see that Γ ⊢ All X are Z.
Since Γ is maximal consistent, it must contain All X are Z; thus so must Δ.
Define the relation ≈ on names by J ≈ M iff the sentence J is M belongs
to Δ. Then ≈ is an equivalence relation, just as we saw above. Let the
Completeness theorems for syllogistic fragments 163
set of equivalence classes of ≈ be [J₁], . . . , [Jₘ]. (Incidentally, this set
does not need to be finite, and we are only pretending that it is finite to
simplify the notation a bit.)
Let the set of Some X are Y sentences in Δ be S₁, . . . , Sₙ, and for 1 ≤ i ≤ n,
let Uᵢ and Vᵢ be such that Sᵢ is Some Uᵢ are Vᵢ. So

    Δ_some = {Some Uᵢ are Vᵢ : i = 1, . . . , n}                    (11)
Let the set of ¬(All X are Y) sentences in Δ be T₁, . . . , Tₚ. For 1 ≤ i ≤ p,
let Wᵢ and Xᵢ be such that Tᵢ is ¬(All Wᵢ are Xᵢ). So this time we are con-
cerned with

    {¬(All Wᵢ are Xᵢ) : i = 1, . . . , p}                           (12)

Note that for i ≠ j, we might well have Uᵢ = Uⱼ or Uᵢ = Wⱼ, or some other
such equation. (This is the part of the structure that goes beyond what we saw
in Theorem 5.)
We take M to be a model with M the following set:

    {(a, 1), . . . , (a, m)} ∪ {(b, 1), . . . , (b, n)} ∪ {(c, 1), . . . , (c, p)}.

Here m, n, and p are the numbers we saw in the past few paragraphs. The
purpose of a, b, and c is to make a disjoint union. Let [[J]] = (a, i), where i
is the unique number between 1 and m such that J ≈ Jᵢ. And for a variable Z
we set

    [[Z]] = {(a, i) : 1 ≤ i ≤ m and the sentence Jᵢ is a Z belongs to Δ}
          ∪ {(b, i) : 1 ≤ i ≤ n and either Uᵢ ≤ Z or Vᵢ ≤ Z}
          ∪ {(c, i) : 1 ≤ i ≤ p and Wᵢ ≤ Z}                        (13)
This completes the specification of M. The rest of our work is devoted to
showing that all sentences in Δ are true in M. We must argue case-by-case,
and so we only give the parts of the arguments that differ from what we have
seen in Theorem 5.
Consider the sentence Tᵢ, that is, ¬(All Wᵢ are Xᵢ). We want to make sure
that [[Wᵢ]] \ [[Xᵢ]] ≠ ∅. For this, consider (c, i). This belongs to [[Wᵢ]] by the
last clause in (13). We want to be sure that (c, i) ∉ [[Xᵢ]]. For if (c, i) ∈ [[Xᵢ]],
then Δ would contain All Wᵢ are Xᵢ. And then our original Γ would be
inconsistent in our Hilbert-style system.
Continuing, consider a sentence ¬(Some P are Q) in Δ. We have to make
sure that [[P]] ∩ [[Q]] = ∅. We argue by contradiction. There are three cases,
depending on the first coordinate of a putative element of the intersection.
Perhaps the most interesting case is when (c, i) ∈ [[P]] ∩ [[Q]] for 1 ≤ i ≤ p.
Then Δ contains both All Wᵢ are P and All Wᵢ are Q. Now the fact that Γ
contains ¬(All Wᵢ are Xᵢ) implies that it must contain Some Wᵢ are Wᵢ. For if
not, then it would contain No Wᵢ are Wᵢ and hence All Wᵢ are Xᵢ; as always,
this would contradict the consistency of Γ. Thus Γ contains All Wᵢ are P,
All Wᵢ are Q, and Some Wᵢ are Wᵢ. Using our previous system, we see that
Γ contains Some P are Q (see Example 4). This contradiction shows that
[[P]] ∩ [[Q]] cannot contain any element of the form (c, i). The other two cases
are similar, and we conclude that the intersection is indeed empty.
This concludes our outline of the proof of Theorem 8.
8. There are at least as many X as Y
In our final section, we show that it is possible to have complete syllogistic
systems for logics which are not first-order. We regard this as a proof-of-
concept; it would be of interest to get complete systems for richer fragments,
such as the ones in Pratt-Hartmann (2008).
We write ∃≥(X,Y) for There are at least as many X as Y, and we are in-
terested in adding these sentences to our fragments. We are usually interested
in sentences in this fragment on finite models. We write |S| for the cardinality
of the set S. The semantics is that M ⊨ ∃≥(X,Y) iff |[[X]]| ≥ |[[Y]]| in M.
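The finite-model semantics just given can be checked mechanically. The following sketch is my own encoding (not from the paper): a model is a dictionary mapping variables to sets, and the four sentence forms used in this paper are evaluated directly from their truth conditions.

```python
# A small model checker for the semantics described above (my own
# encoding, not from the text).  A model is a dict mapping each
# variable to its interpretation, a set.

def holds(sentence, model):
    """Evaluate a sentence (op, x, y) in a finite model."""
    op, x, y = sentence
    X, Y = model[x], model[y]
    if op == "all":                  # All X are Y
        return X <= Y
    if op == "some":                 # Some X are Y
        return bool(X & Y)
    if op == "no":                   # No X are Y
        return not (X & Y)
    if op == "geq":                  # There are at least as many X as Y
        return len(X) >= len(Y)
    raise ValueError(op)

model = {"X": {1, 2}, "Y": {2, 3, 4}}
print(holds(("geq", "Y", "X"), model))   # True: |[[Y]]| = 3 >= |[[X]]| = 2
print(holds(("all", "X", "Y"), model))   # False: 1 is an X but not a Y
```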
L(all, ∃≥) does not have the canonical model property of Section 2.1. We
show this by establishing that the semantics is not compact. Consider

    Γ = {∃≥(X₁, X₂), ∃≥(X₂, X₃), . . . , ∃≥(Xₙ, Xₙ₊₁), . . .}

Suppose towards a contradiction that M were a canonical model for Γ. In
particular, M ⊨ Γ. Then |[[X₁]]| ≥ |[[X₂]]| ≥ · · ·. For some n, we have
|[[Xₙ]]| = |[[Xₙ₊₁]]|. Thus M ⊨ ∃≥(Xₙ₊₁, Xₙ). However, this sentence does
not follow from Γ.
Remark In the remainder of this section, Γ denotes a finite set of sentences.
In this section, we consider L(all, ∃≥). For proof rules, we take the rules
in Figure 8 together with the rules for All in Figure 1. The system is sound.
The last rule is perhaps the most interesting, and it uses the assumption that
our models are finite. That is, if all Y are X, and there are at least as many
elements in Y as in the bigger set X, then the two sets have to be the same.
We need a little notation at this point. Let Γ be a (finite) set of sentences.
We write X ≤_c Y for Γ ⊢ ∃≥(Y, X). We also write X ≡_c Y for X ≤_c Y ≤_c X,
  All Y are X        ∃≥(X,Y)    ∃≥(Y,Z)        All Y are X    ∃≥(Y,X)
  -----------        -------------------        ----------------------
   ∃≥(X,Y)                ∃≥(X,Z)                    All X are Y

Figure 8. Rules for ∃≥(X,Y) and All.
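Since the soundness of the last rule rests on finiteness, it can be confirmed by brute force on a small universe. The check below is an illustration of mine (not part of the paper): over all pairs of subsets of a 4-element set, the premises All Y are X and ∃≥(Y,X) force the two sets to coincide.

```python
# Brute-force soundness check of the last rule above: if All Y are X and
# there are at least as many Y as X, then All X are Y.  (Soundness here
# depends on the finiteness of the universe.)
from itertools import chain, combinations

def subsets(universe):
    return chain.from_iterable(combinations(universe, r)
                               for r in range(len(universe) + 1))

universe = range(4)
for xs in subsets(universe):
    for ys in subsets(universe):
        X, Y = set(xs), set(ys)
        if Y <= X and len(Y) >= len(X):   # premises of the rule
            assert X <= Y                 # conclusion: the sets coincide
print("rule verified on all subsets of a 4-element universe")
```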
and X <_c Y for X ≤_c Y but X ≰_c Y. We continue to write X ≤ Y for
Γ ⊢ All X are Y. And we write X ≡ Y for X ≤ Y ≤ X.
Proposition 6 Let Γ ⊆ L(all, ∃≥) be a (finite) set. Let V be the set of vari-
ables in Γ.
1. If X ≤ Y, then X ≤_c Y.
2. (V, ≤_c) is a preorder: a reflexive and transitive relation.
3. If X ≤_c Y ≤ X, then X ≡ Y.
4. If X ≤_c Y, X ≡ X′, and Y ≡ Y′, then X′ ≤_c Y′.
5. (V, ≤_c) is pre-wellfounded: a preorder with no descending sequences
in its strict part.
Proof. Part (1) uses the first rule in Figure 8. In part (2), the reflexivity of
≤_c comes from that of ≤ and part (1); the transitivity is by the second rule
for ∃≥. Part (3) is by the last rule for ∃≥. Part (4) uses part (1) and transitivity.
Part (5) is just a summary of the previous parts. ∎
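Derivability in this fragment can also be computed by saturation: starting from a finite Γ, close under reflexivity of All and the three rules of Figure 8, and then read ≤_c off the closure. The encoding below is my own sketch (the sentence tags "all" and "geq" are ad hoc names), not an implementation from the paper.

```python
# Saturate a finite set of sentences under: All X are X; transitivity of
# All; and the three Figure 8 rules for ∃≥ (written "geq" here).

def closure(gamma, variables):
    derived = set(gamma)
    derived |= {("all", v, v) for v in variables}        # All X are X
    changed = True
    while changed:
        changed = False
        new = set()
        for (op1, x, y) in derived:
            if op1 == "all":
                new.add(("geq", y, x))                   # rule 1
            for (op2, y2, z) in derived:
                if op1 == "all" and op2 == "all" and y == y2:
                    new.add(("all", x, z))               # All is transitive
                if op1 == "geq" and op2 == "geq" and y == y2:
                    new.add(("geq", x, z))               # rule 2
                if op1 == "all" and op2 == "geq" and (y2, z) == (x, y):
                    new.add(("all", y, x))               # rule 3
        if not new <= derived:
            derived |= new
            changed = True
    return derived

gamma = {("all", "Y", "X"), ("geq", "Y", "X")}
print(("all", "X", "Y") in closure(gamma, {"X", "Y"}))   # True, by rule 3
```

On a finite variable set the sentence space is finite, so the saturation loop terminates.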
Theorem 9 The logic of Figures 1 and 8 is complete for L(all, ∃≥).
Proof. Suppose that Γ ⊨ ∃≥(Y, X). Define a model M by taking M to be a
singleton {∗}, and

    [[Z]] = M if Γ ⊢ ∃≥(Z, X), and [[Z]] = ∅ otherwise.            (14)

We claim that if Γ contains ∃≥(W,V) or All V are W, then [[V]] ⊆ [[W]]. We
only verify the second assertion. For this, we may assume that [[V]] ≠ ∅ (oth-
erwise the result is trivial). So [[V]] = M. Thus Γ ⊢ ∃≥(V, X). So we see
that Γ ⊢ ∃≥(W, X). From this we conclude that [[W]] = M. In particular,
[[V]] ⊆ [[W]].
Now our claim implies that M ⊨ Γ. Therefore |[[X]]| ≤ |[[Y]]|. And [[X]] =
M, since Γ ⊢ ∃≥(X, X). Hence [[Y]] = M as well. But this means that
Γ ⊢ ∃≥(Y, X), just as desired.
We have shown one case of the general completeness theorem that we are
after. In the other case, we have Γ ⊨ All X are Y. We construct a model
M = M_Γ such that for all A and B,
(α) [[A]] ⊆ [[B]] iff A ≤ B.
(β) If A ≤_c B, then |[[A]]| ≤ |[[B]]|.
Let V/≡_c be the (finite) set of equivalence classes of variables in Γ under
≡_c. This set is then well-founded by the natural relation induced on it by ≤_c.
It is then standard that we may list the elements of V/≡_c in some order

    [U₀], [U₁], . . . , [Uₖ]

with the property that if Uᵢ <_c Uⱼ, then i < j. (But if i < j, then it might be
the case that Uᵢ ≡_c Uⱼ.)
We define by recursion on i ≤ k the interpretation [[V]] of all V ∈ [Uᵢ].
Suppose we have [[W]] for all j < i and all W ≡_c Uⱼ. Let

    Xᵢ = ⋃ {[[W]] : j < i and W ≡_c Uⱼ},

and note that this is the set of all points used in the semantics of any variable
so far. Let n = 1 + |Xᵢ|. For all V ≡_c Uᵢ, we shall arrange that [[V]] be a set of
size n.
Now [Uᵢ] is the equivalence class of Uᵢ under ≡_c. It splits into equivalence
classes of the finer relation ≡. For a moment, consider one of those finer
classes, say [A]_≡. We must interpret each variable in this class by the same
set. For this A, let

    Y_A = ⋃ {[[B]] : (∃j < i) Uⱼ ≡_c B ≤ A}.

Note that Y_A ⊆ Xᵢ so that |Y_A| < n. We set [[A]] to be Y_A together with n − |Y_A|
fresh elements. Moving on to the other ≡-classes which partition the ≡_c-
class of Uᵢ, we do the same thing. We must insure that for A ≢ A′, the fresh
elements added into [[A′]] are disjoint from the fresh elements added into [[A]].
This completes the definition of M. We check that conditions (α) and (β)
are satisfied. It is easy to first check that for i < j, |[[Uᵢ]]| < |[[Uⱼ]]|. It
might also be worth noting that [[U₀]] ≠ ∅, so no [[A]] is empty.
For (β), let A ≤_c B. Let i and j be such that A ≡_c Uᵢ and B ≡_c Uⱼ. If
A ≡_c B, then i = j and the construction arranged that [[A]] and [[B]] be sets of
the same cardinality. If A <_c B, then i < j by the way we enumerated the Us,
and so |[[A]]| = |[[Uᵢ]]| < |[[Uⱼ]]| = |[[B]]|.
Turning to (α), we argue the two directions separately. Suppose first that
A ≤ B. Then A ≤_c B. If A <_c B, then [[A]] ⊆ Y_B ⊆ [[B]]. If A ≡_c B, then we also
have A ≡ B. The construction has then arranged that [[A]] = [[B]]. In the other
direction, assume that [[A]] ⊆ [[B]], and let i and j be such that A ≡_c Uᵢ and
B ≡_c Uⱼ. On cardinality grounds, i ≤ j. If i < j, then the construction shows
that A ≤ B. (For if not, [[A]] would be a non-empty set disjoint from [[B]], and
this contradicts [[A]] ⊆ [[B]].) Finally (for perhaps the most interesting point),
if i = j, then we must have A ≡ B: otherwise, the construction arranged that
both A and B have at least one point that is not in the other, due to the "1 +"
in the definition of n.
Since (α) and (β) hold, we know that M ⊨ Γ. Recall that we are assuming
that X ≤ Y holds semantically from Γ; we need to show that this assertion is
derivable in the logic. But [[X]] ⊆ [[Y]] in the model, and so by (α), we indeed
have X ≤ Y. ∎
8.1. Larger syllogistic fragments
Having worked through L(all, ∃≥), it is natural to go on to further syllogistic
fragments. We are not going to do this in detail. Instead, we simply state the
results and move ahead in our final section to the largest system in the paper,
the one that adds ∃≥ to the system from Section 7. We would need two rules:

  Some Y are Y    ∃≥(X,Y)            No Y are Y
  -----------------------            ----------
       Some X are X                   ∃≥(X,Y)

The rule on the left should be added to our existing system for L(all, some)
(together with the rules in Figure 8), and the resulting system would be com-
plete for L(all, some, ∃≥). Similarly, the rule on the right can be added to the
system for L(all, no) to get a completeness result. Finally, adding both rules
to L(all, some, no) would again result in a complete system.
8.2. Digression: Most
The semantics of Most is that Most X are Y is true iff |[[X]] ∩ [[Y]]| > ½ |[[X]]|.
So if [[X]] is empty, then Most X are Y is false.
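In code, the truth condition just given is a one-liner; the helper name `most` below is my own notation, not the paper's.

```python
# The truth condition for Most, as just defined: Most X are Y holds iff
# |X ∩ Y| > |X| / 2; in particular it fails whenever X is empty.

def most(X, Y):
    return len(X & Y) > len(X) / 2

print(most(set(), {1, 2}))        # False: there are no Xs at all
print(most({1, 2, 3}, {1, 2}))    # True: 2 of the 3 Xs are Ys
```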
As an example of what is going on, consider the following. Assume that
All X are Z, All Y are Z, Most Z are Y, and Most Y are X. Does it follow
that Most X are Y? As it happens, the conclusion does not follow. One can
take X = {a, b, c, d, e, f, g}, Y = {e, f, g, h, i}, and Z = {a, b, c, d, e, f, g, h, i}.
Then |X| = 7, |Y| = 5, |Z| = 9, |Y ∩ Z| = 5 > 9/2, |X ∩ Y| = 3 > 5/2, but
|X ∩ Y| = 3 < 7/2. (Another countermodel: let X = {1, 2, 4, 5}, Y = {1, 2, 3},
and Z = {1, 2, 3, 4, 5}. Then |Y ∩ Z| = 3 > 5/2, |Y ∩ X| = 2 > 3/2, but
|X ∩ Y| = 2 ≯ 4/2.)
On the other hand, the following is a sound rule:

  All U are X    Most X are V    All V are Y    Most Y are U
  ----------------------------------------------------------
                       Some U are V

Here is the reason for this. Assume our hypotheses, and also, towards a
contradiction, that U and V were disjoint. We obviously have |V| ≥ |X ∩ V|,
and the second hypothesis, together with the disjointness assumption, tells us
that |X ∩ V| > |X ∩ U|. By the first hypothesis, we have |X ∩ U| = |U|. So at
this point we have |V| > |U|. But the last two hypotheses similarly give us
the opposite inequality |U| > |V|. This is a contradiction.
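The counting argument above can be double-checked by brute force over all assignments of subsets of a small universe to U, V, X, Y. This is an illustration of mine; soundness over all finite models follows from the argument itself, not from this finite check.

```python
# Brute-force check of the rule just argued sound, over all assignments
# of subsets of a 4-element universe to U, X, V, Y.
from itertools import chain, combinations, product

def subsets(u):
    return [set(s) for s in chain.from_iterable(
        combinations(u, r) for r in range(len(u) + 1))]

def most(A, B):
    return len(A & B) > len(A) / 2

S = subsets(range(4))
for U, X, V, Y in product(S, repeat=4):
    if U <= X and most(X, V) and V <= Y and most(Y, U):
        assert U & V    # Some U are V
print("rule verified on a 4-element universe")
```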
At the time of this writing, I do not have a completeness result for L(all,
some, most). The best that is known is for L(some, most). The rules are
shown in Figure 9. We study these on top of the rules in Figure 3.
Proposition 7 The following two axioms are complete for Most:

  Most X are Y          Most X are Y
  ------------          ------------
  Most X are X          Most Y are Y

Moreover, if Γ ⊆ L(most), X ≠ Y, and Γ ⊭ Most X are Y, then there is a
model M of Γ which falsifies Most X are Y in which all sets of the form
[[U]] ∩ [[V]] are nonempty, and |M| ≤ 5.
Proof. Suppose that Γ ⊬ Most X are Y. We construct a model M which
satisfies all sentences in Γ, but which falsifies Most X are Y. There are two
cases. If X = Y, then X does not occur in any sentence in Γ. We let M = {∗},
[[X]] = ∅, and [[Y]] = {∗} for Y ≠ X.
The other case is when X ≠ Y. Let M = {1, 2, 3, 4, 5}, [[X]] = {1, 2, 4, 5},
[[Y]] = {1, 2, 3}, and for Z ≠ X, Y, [[Z]] = {1, 2, 3, 4, 5}. Then the only Most
sentence which fails in the model M is Most X are Y. But this sentence
does not belong to Γ. Thus M ⊨ Γ. ∎
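The 5-element model in the second case can be checked exhaustively; the set names below mirror the interpretations in the proof.

```python
# In the model [[X]] = {1,2,4,5}, [[Y]] = {1,2,3}, [[Z]] = {1,...,5},
# the only failing Most sentence is Most X are Y.

def most(A, B):
    return len(A & B) > len(A) / 2

sem = {"X": {1, 2, 4, 5}, "Y": {1, 2, 3}, "Z": {1, 2, 3, 4, 5}}
failures = [(u, v) for u in sem for v in sem if not most(sem[u], sem[v])]
print(failures)   # [('X', 'Y')]
```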
Theorem 10 The rules in Figure 9 together with the first two rules in Fig-
ure 3 are complete for L(some, most). Moreover, if Γ ⊭ S, then there is a
model M ⊨ Γ with M ⊭ S, and |M| ≤ 6.
  Most X are Y        Some X are X        Most X are Y    Most X are Z
  ------------        ------------        ----------------------------
  Some X are Y        Most X are X                Some Y are Z

Figure 9. Rules of Most to be used in conjunction with Some.
Proof. Suppose Γ ⊬ S, where S is Some X are Y. If X = Y, then Γ contains
no sentence involving X. So we may satisfy Γ and falsify S in a one-point
model, by setting [[X]] = ∅ and [[Z]] = {∗} for Z ≠ X.
We next consider the case when X ≠ Y. Then Γ does not contain S, Some Y
are X, Most X are Y, or Most Y are X. And for all Z, Γ does not contain both
Most Z are X and Most Z are Y. Let M = {1, 2, 3, 4, 5, 6}, and consider the
subsets a = {1, 2, 3}, b = {1, 2, 3, 4, 5}, c = {2, 3, 4, 5, 6}, and d = {4, 5, 6}.
Let [[X]] = a and [[Y]] = d, so that M ⊭ S. For Z different from X and Y, if Γ
does not contain Most Z are X, let [[Z]] = c. Otherwise, Γ does not contain
Most Z are Y, and so we let [[Z]] = b. For all these Z, M satisfies whichever
of the sentences Most Z are X and Most Z are Y (if either) belongs to
Γ. M also satisfies all sentences Most X are Z and Most Y are Z, whether or
not these belong to Γ. It also satisfies Most U are U for all U. Also, for Z, Z′
each different from both X and Y, M ⊨ Most Z are Z′. Finally, M satisfies
all sentences Some U are V except for U = X and V = Y (or vice-versa). But
those two sentences do not belong to Γ. The upshot is that M ⊨ Γ but M ⊭ S.
Up until now in this proof, we have considered the case when S is Some X
are Y. We turn our attention to the case when S is Most X are Y. Suppose Γ ⊬
S. If X = Y, then the second rule of Figure 9 shows that Γ ⊬ Some X are X.
So we take M = {∗} and take [[X]] = ∅ and for Y ≠ X, [[Y]] = M. It is easy to
check that M ⊨ Γ.
Finally, if X ≠ Y, we clearly have Γ_most ⊬ S. Proposition 7 shows that
there is a model M ⊨ Γ_most which falsifies S in which all sets of the form
[[U]] ∩ [[V]] are nonempty. So all Some sentences hold in M. Hence M ⊨ Γ.
∎
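The 6-element model and its subsets a, b, c, d can likewise be checked mechanically.

```python
# The 6-element model from the proof of Theorem 10, with [[X]] = a,
# [[Y]] = d, and every other variable sent to b or c.

def most(A, B):
    return len(A & B) > len(A) / 2

a = {1, 2, 3}; b = {1, 2, 3, 4, 5}; c = {2, 3, 4, 5, 6}; d = {4, 5, 6}
assert not (a & d)                    # Some X are Y fails
assert most(b, a) and not most(b, d)  # b-variables: Most Z are X holds
assert most(c, d) and not most(c, a)  # c-variables: Most Z are Y holds
assert all(most(u, v) for u in (a, d) for v in (b, c))  # Most X/Y are Z
print("model checks out")
```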
8.3. Adding ∃≥ to the boolean syllogistic fragment

We now put aside Most and return to the study of ∃≥ from earlier. We close
this paper with the addition of ∃≥ to the fragment of Section 7.
Our logical system extends the axioms of Figure 7 by those in Figure 10.
Note that the last new axiom expresses cardinal comparison. Axiom 4 in
Figure 10 is just a transcription of the rule for No that we saw in Section 8.1.
1. All X are Y → ∃≥(Y, X)
2. ∃≥(X,Y) ∧ ∃≥(Y, Z) → ∃≥(X, Z)
3. All Y are X ∧ ∃≥(Y, X) → All X are Y
4. No X are X → ∃≥(Y, X)
5. ∃≥(X,Y) ∨ ∃≥(Y, X)

Figure 10. Additions to the system in Figure 7 for ∃≥ sentences.
We do not need to also add the axiom

    (Some Y are Y) ∧ ∃≥(X,Y) → Some X are X

because it is derivable. Here is a sketch, in English. Assume that there are
some Ys, and there are at least as many Xs as Ys, but (towards a contradiction)
that there are no Xs. Then all Xs are Ys. From our logic, all Ys are Xs as
well. And since there are Ys, there are also Xs: a contradiction.
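Semantically, the derived principle is just the observation that |X| ≥ |Y| > 0 forces X to be nonempty; a brute-force check over a small universe (my illustration, not from the paper) confirms it.

```python
# If there are some Ys and at least as many Xs as Ys, then there are
# some Xs: checked over all pairs of subsets of a 4-element universe.
from itertools import chain, combinations

def subsets(u):
    return [set(s) for s in chain.from_iterable(
        combinations(u, r) for r in range(len(u) + 1))]

for X in subsets(range(4)):
    for Y in subsets(range(4)):
        if Y and len(X) >= len(Y):   # Some Y are Y and ∃≥(X,Y)
            assert X                 # Some X are X
print("principle verified on a 4-element universe")
```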
Notice also that in the current fragment we can express There are more X
than Y. It would be possible to add this directly to our previous systems.
Theorem 11 The logic of Figures 7 and 10 is complete for assertions Γ ⊨ φ
in the language of boolean combinations of sentences in L(all, some, no,
names, ∃≥).
Proof. We need only build a model for a maximal consistent set Γ in the
language of this section. We take the basic sentences to be those of the form
All X are Y, Some X are Y, J is M, J is an X, ∃≥(X,Y), or their negations.
Let

    Δ = {S : Γ ⊢ S and S is basic}.

As in Claim 1, we need only build a model M ⊨ Δ. We construct M such
that for all A and B,
(α) [[A]] ⊆ [[B]] iff A ≤ B.
(β) A ≤_c B iff |[[A]]| ≤ |[[B]]|.
(γ) For A ≤_c B, [[A]] ∩ [[B]] ≠ ∅ iff Some A are B belongs to Δ.
Let V be the set of variables in Γ. Let ≤_c and ≡_c be as in Section 8.
Proposition 6 again holds, and now the quotient V/≡_c is a linear order due
to the last axiom in Figure 10. We write it as

    [U₀] <_c [U₁] <_c · · · <_c [Uₖ]
We define by recursion on i ≤ k the interpretation [[V]] of all V ∈ [Uᵢ]. The
case of i = 0 is special. If Γ ⊢ No U₀ are U₀, then the same holds for all
W ≡_c U₀. In this case, we set [[W]] = ∅ for all these W. Note that by our
fourth axiom in Figure 10, all of the other variables W are such that Γ ⊢
Some W are W. In any case, we must interpret the variables in [U₀] even
when Γ ⊢ Some U₀ are U₀. In this case, we may take each [[W]] to be a
singleton, with the added condition that [[V]] = [[W]] iff Some V are W
belongs to Δ.
Suppose we have [[W]] for all j ≤ i and all W ≡_c Uⱼ. Let

    Xᵢ₊₁ = ⋃ {[[W]] : j ≤ i and W ≡_c Uⱼ},

and note that this is the set of all points used in the semantics of any variable
so far. Let m = |Δ_some|, the number of Some sentences in Δ, and let

    n = 1 + m + |Xᵢ₊₁|                                             (15)

For all V ≡_c Uᵢ₊₁, we shall arrange that [[V]] be a set of size n.
Now [Uᵢ₊₁] splits into equivalence classes of the finer relation ≡. For a
moment, consider one of those finer classes, say [A]_≡. We must interpret each
variable in this class by the same set. For this A, let

    Y_A = ⋃ {[[B]] : (∃j ≤ i) Uⱼ ≡_c B ≤ A}.

Note that Y_A ⊆ Xᵢ₊₁ so that |Y_A| ≤ |Xᵢ₊₁| for all A ≡_c Uᵢ₊₁. We shall set [[A]]
to be Y_A plus other points. Let Z_A be the set of pairs {A, B} with B ≡_c Uᵢ₊₁
and Some A are B in Δ_some. (Notice that if both A and B are ≡_c Uᵢ₊₁ and
Some A are B is in Δ, then {A, B} ∈ Z_A ∩ Z_B.) We shall set [[A]] to be
Y_A ∪ Z_A plus one last group of points. If C <_c Uᵢ₊₁ and Some A are C
belongs to Δ, then we must pick some element of [[C]] and put it into [[A]].
Note that the number of points selected like this plus |Z_A| is at most m =
|Δ_some|. So the number of points so far in [[A]] is at most |Xᵢ₊₁| + m. We
finally add fresh elements to [[A]] so that the total is n.
We do all of this for all of the other ≡-classes which partition the ≡_c-class
of Uᵢ₊₁. We must insure that for A ≢ A′, the fresh elements added into [[A′]]
are disjoint from the fresh elements added into [[A]]. This is needed to arrange
that neither [[A]] nor [[A′]] will be a subset of the other.
This completes the definition of the model. We say a few words about
why requirements (α)–(γ) are met. First, an easy induction on i shows that
if j < i, then |[[Uⱼ]]| < |[[Uᵢ]]|. The point is that |[[Uⱼ]]| ≤ |Xᵢ| < |[[Uᵢ]]|. The
argument for (β) is the same as in the proof of Theorem 9. For that matter,
the proof of (α) is also essentially the same. The point is that when A ≡_c B
and A ≢ B, then [[A]] and [[B]] each contain a point not in the other.
For (γ), suppose that A ≤_c B. Let i ≤ j be such that A ≡_c Uᵢ and B ≡_c Uⱼ.
The construction arranged that [[A]] and [[B]] be disjoint except in the case
that Some A are B belongs to Δ.
So this verifies that (α)–(γ) hold. We would like to conclude that M ⊨ Δ,
but there is one last point: (γ) appears to be a touch too weak. We need to
know that [[A]] ∩ [[B]] ≠ ∅ iff Some A are B belongs to Δ (without assuming
A ≤_c B). But either A ≤_c B or B ≤_c A by our last axiom. So we see that
indeed [[A]] ∩ [[B]] ≠ ∅ iff Some A are B belongs to Δ. ∎
The next step in this direction would be to consider At least as many X as
Y are Z.
Acknowledgments
I thank the many people who have commented on or criticized this paper, both
during and after the Fest Colloquium. Special thanks to Dag Westerståhl for
mentioning that he had obtained Theorem 8 in Westerståhl (1989), and that
Łukasiewicz and Słupecki had it much earlier, in Łukasiewicz (1957).
References
Ben-Avi, Gilad and Nissim Francez
2005 Proof-theoretic semantics for a syllogistic fragment. In Proceed-
ings of the Fifteenth Amsterdam Colloquium, 9–15. ILLC/Department
of Philosophy, University of Amsterdam. Available from
http://www.illc.uva.nl/AC05/uploaded files/AC05Proceedings.pdf.
Corcoran, John
1972 Completeness of an ancient logic. Journal of Symbolic Logic 37(4):
696–702.
Łukasiewicz, Jan
1957 Aristotle's Syllogistic. Oxford: Clarendon Press, 2nd edition.
Martin, John N.
1997 Aristotle's natural deduction revisited. History and Philosophy of
Logic 18(1): 1–15.
McAllester, David A. and Robert Givan
1992 Natural language syntax and first-order inference. Artificial Intelli-
gence 56: 1–20.
Moss, Lawrence S.
to appear Syllogistic logics with verbs. Journal of Logic and Computation.
Nishihara, Noritaka, Kenichi Morita, and Shigenori Iwata
1990 An extended syllogistic system with verbs and proper nouns, and its
completeness proof. Systems and Computers in Japan 21(1): 760–
771.
Pratt-Hartmann, Ian
2003 A two-variable fragment of English. Journal of Logic, Language, and
Information 12: 13–45.
2004 Fragments of language. Journal of Logic, Language, and Information
13: 207–223.
2008 On the computational complexity of the numerically definite syllogis-
tic and related logics. Bulletin of Symbolic Logic 14(1): 1–28.
Pratt-Hartmann, Ian and Allan Third
2006 More fragments of language. Notre Dame Journal of Formal Logic
47(2): 151–177.
Purdy, William C.
1991 A logic for natural language. Notre Dame Journal of Formal Logic
32(3): 409–425.
van Benthem, Johan
1984 Questions about quantifiers. Journal of Symbolic Logic 49(2): 443–
466.
Westerståhl, Dag
1989 Aristotelian syllogisms and generalized quantifiers. Studia Logica
XLVIII(4): 577–585.
List of contributors
Robin Cooper
Department of Linguistics
University of Gothenburg
Box 200
405 30 Göteborg
Sweden
cooper@ling.gu.se
Hristo Ganchev
Faculty of Mathematics and Computer Science
Sofia University
5 James Bourchier Blvd.
1164 Sofia
Bulgaria
h.ganchev@fmi.uni-sofia.bg
Eleni Kalyvianaki
Graduate Program in Logic, Algorithms and Computation
Department of Mathematics
University of Athens
15784 Zografou
Greece
ekalyv@math.uoa.gr
Edward L. Keenan
Department of Linguistics
University of California at Los Angeles
3125 Campbell Hall
PO Box 951543
Los Angeles, CA 90024-1543
USA
edward.keenan1@gmail.com
Marcus Kracht
Fakultät LiLi
Universität Bielefeld
Postfach 10 01 31
33501 Bielefeld
Germany
marcus.kracht@uni-bielefeld.de
Stoyan Mihov
Institute for Parallel Processing
Bulgarian Academy of Sciences
25A, Akad. G. Bonchev Str.
1113 Sofia
Bulgaria
stoyan@lml.bas.bg
Yiannis N. Moschovakis
Department of Mathematics
University of California at Los Angeles
520 Portola Plaza
PO Box 951555
Los Angeles, CA 90095-1555
USA
and University of Athens
ynm@math.ucla.edu
Lawrence S. Moss
Department of Mathematics
Indiana University
831 East Third Street
Bloomington, IN 47405-7106
USA
lsm@cs.indiana.edu
Klaus U. Schulz
Centrum für Informations- und Sprachverarbeitung
Ludwig-Maximilians-Universität München
Oettingenstr. 67
80538 München
Germany
schulz@cis.uni-muenchen.de
Index
acyclic recursion, 64–66
adjunction, 132–138
Aristotle, 143–144
belief, 57, 83–84
boolean algebra, 90, 97, 103
cardinality comparison, 169–172
category, 133
closure operation, 36, 38, 39
    cartesian product, 52
    composition, 51
    concatenation, 41
    intersection, 44, 46
    inversion, 48
    Kleene-star, 41
    projection, 50
    union, 41
complement, 90, 93, 99
completeness theorem, 143–146, 149, 152, 155, 157, 158, 160, 162, 165,
    168, 170
dependent types, 12, 13, 16–20
dialogue, 9, 30
    issue-based dialogue management, 9, 30
Discourse Representation Theory (DRT), 9, 10, 15
dual, 93, 94
factual content, 57, 72–74, 76, 81–84
finite state automaton, 35, 36, 40
Fischer–Ladner closure (FL), 118, 119
Gallin translation, 63, 66, 74
government and binding, 107
    narrow, 123
grammar formalisms, 9, 10, 30
HPSG, 9, 10, 20
indexicality, 80–83
inessential coordinate, 43, 44, 46, 48, 51, 52
intensionality, 10, 30
k-tape automaton, 36, 39, 40
language of intensional logic, 58–62
link, 107–111, 116, 117, 120, 128, 129
    derived, 108–111, 116, 117, 133, 139
    root, 108–111, 116, 117, 123
    surface, 132, 139
logical system, 143, 146, 148–151, 156, 161
meaning
    local, 57, 75–76, 83
    referential, 66, 74
    situated, 57, 71–84
mid-point theorem, 96, 99
monadic second order logic (MSO), 106, 114
Montague semantics, 9, 10
most
    logic of, 167–169
multidominance structure (MDS), 106–111, 114–123
    ordered, 109–111
    pseudo-ordered, 134–136
natural homomorphism, 41
natural logic, 143–146
nominal, 124–125
one-letter automaton, 36, 39–53
postcomplement, 93, 94, 98, 99, 101, 102
program, 111–122
    converse, 125–128
    progressive, 117–118, 125, 127
propositional dynamic logic (PDL), 106, 111–122
    deterministic (DPDL), 114, 122
propositional logic, 145, 151, 161
quantifier, 87, 101
    cardinal, 88–90, 94
    co-cardinal, 88–90, 94
    co-intersective, 88–94, 98, 101, 103
    conservative, 88, 89, 92, 95, 103
    first order, 91, 95
    generalized existential, 87–89
    generalized universal, 87–89
    intersective, 88–94, 98, 101, 103
    partitive, 89, 90, 103
    permutation invariant, 89, 90, 92, 103
    proportional, 87–103
    sortally reducible, 91, 92
records, 9–30
reduction calculus, 68
referential canonical form, 68, 69
referential intension, 66–67, 69–70
regular relation, 36, 38–53
situation semantics, 9, 10, 14
Swiss German, 136–139
syllogism, 143–148, 160, 161, 164
synonymy
    factual, 72, 75, 80, 83, 85
    local, 75–80
    referential, 66, 70–71, 75
type theory, 9–30
    with records, 9–30
typed λ-calculus, 10, 14, 26
    two-sorted, 62–63
    with acyclic recursion, 64–66
unification, 9, 10, 24–30
unification-based grammar, 9, 10, 30