
Semantics & Pragmatics Volume 3, Article 1: 1–72, 2010

doi: 10.3765/sp.3.1

Quantifiers in than-clauses∗
Sigrid Beck
University of Tübingen

Received 2009-01-13 / First Decision 2009-03-17 / Revised 2009-06-17 / Second
Decision 2009-07-06 / Revised 2009-07-27 / Accepted 2009-07-27 / Published 2010-

The paper reexamines the interpretations that quantifiers in than-clauses
give rise to. It develops an analysis that combines an interval semantics for
the than-clause with a standard semantics for the comparative operator. In
order to mediate between the two, interpretive mechanisms like maximality
and maximal informativity determine selection of a point from an interval.
The interval semantics allows local interpretation of the quantifier. Selection
predicts which interpretation this leads to. Cases in which the prediction
appears not to be met are explained via recourse to independently attested
external factors (e.g. the interpretive possibilities of indefinites). The goal
of the paper is to achieve coverage of the relevant data while maintaining a
simple semantics for the comparative. A secondary objective is to reexamine,
restructure and extend the set of data considered in connection with the
problem of quantifiers in than-clauses.

Keywords: comparatives, degrees, intervals, quantifiers, indefinites, plurals, scope

∗ Versions of this paper were presented at the workshop on covert variables in Tübingen 2006,
at two Semantic Network meetings (in Barcelona 2006 and Oslo 2007), at the 2009 Topics
in Semantics seminar at MIT, and at the Universität Frankfurt 2009. I would like to thank
the organizers Frank Richter and Uli Sauerland and the audiences at these presentations for
important feedback. Robert van Rooij and Jon Gajewski have exchanged ideas with me. The
B17 project of the SFB 441 has accompanied the work presented here — Remus Gergel, Stefan
Hofstetter, Sveta Krasikova, John Vanderelst — as have Arnim von Stechow and Irene Heim.
Several anonymous reviewers and Danny Fox have given feedback on earlier versions, and
David Beaver and Kai von Fintel have commented on the prefinal version. I am very grateful
to them all.

©2010 Sigrid Beck

This is an open-access article distributed under the terms of a Creative Commons Non-
Commercial License (creativecommons.org/licenses/by-nc/3.0).

1 Introduction

The problem of quantifiers in than-clauses has been puzzling linguists for a
long time, beginning with von Stechow 1984, via Schwarzschild & Wilkinson
2002, Schwarzschild 2004, and Heim 2006b, to very recent approaches in
Gajewski 2008, van Rooij 2008 and Schwarzschild 2008. It can be illustrated
with the examples below.

(1) John ran faster than every girl did.

(1′) a. For all x, x is a girl: John ran faster than x.
     b. #The degree of speed that John reached exceeds the degree of speed
        that every girl reached.
        i.e. “John’s speed exceeds the speed of the slowest girl.”
(2) John ran faster than he had to.
(2′) a. #For all w, w is a permissible world: John ran faster in @ than he
        ran in w.
     b. The degree of speed that John reached in @ exceeds the degree of
        speed that he has in every permissible world w.
        i.e. “John’s actual speed exceeds the slowest permissible speed.”
        (@ stands for the real world)

Example (1) intuitively only has a reading that appears to give the universal
NP scope over the comparison, namely (1′a): all the girls were slower than
John. The reading in which the universal NP takes narrow scope relative to
the comparison is paraphrased in (1′b). Here we must look at degrees of
speed reached by all girls; depending on the precise semantics of the
than-clause (see below), this could mean the maximal speed that they all
reached, i.e. the speed of the slowest girl. Example (1) has no reading that
compares John’s speed to the speed of the slowest girl. Sentence (2), on the
other hand, only has a reading that gives the modal universal quantifier
narrow scope relative to the comparison, (2′b). That is, we consider the
degrees of speed that John reaches in all worlds compatible with the rules
imposed by the modal base of have to. This will yield the slowest permissible
speed, and (2) intuitively says that John’s actual speed exceeded this minimum
requirement. The sentence is not¹ understood to mean that John did something
that was against the rules; that is, reading (2′a), in which the modal takes
scope over the comparison, is not available.

¹ Heim (2006b) and Krasikova (2008) include a discussion of when readings like (2′a) are
available. The reading can be made more plausible with a suitable context, depending on the
modal chosen. For the moment I will stick to the simpler picture presented in the text. See
Section 3 for more discussion.

We must ask ourselves how a quantifier contained in the than-clause can
have wide scope at all, why it cannot get narrow scope in (1), and why (2) is the
opposite. Since, as we will see in more detail below, these questions look
unanswerable under the standard analysis of comparatives, the researchers
cited above have been led to a revision of the semantic analysis of comparison.
Schwarzschild & Wilkinson (2002) employ an interval semantics for the than-
clause and give the comparative itself an interval semantics. Heim (2006b)
adopts intervals, but ultimately reduces the semantics of the comparison
back to a degree semantics through semantic reconstruction. This allows
her to retain a simple meaning of the comparative operator. A than-clause
internal operator derives the different readings that quantifiers in than-
clauses give rise to. The line of research in Gajewski 2008, van Rooij 2008
and Schwarzschild 2008 in turn adopts the idea of a than-clause internal
operator but not the intervals.
In this paper, I pursue a strategy that can be seen as an attempt to simplify
Schwarzschild & Wilkinson’s proposal. Like them, I derive a meaning for
the than-clause without a than-clause internal operator, and that meaning is
based on an interval semantics. But I combine this with a standard semantics
of the comparative in the spirit of von Stechow 1984. This means that the
end result of interpreting the than-clause must be a degree. Everything will
hinge on selecting the right degree, so that each of the relevant examples
receives the right interpretation.
In Section 2, I present the current state of our knowledge in this domain.
The analysis of than-clauses is presented in Section 3. Section 4 ends the
paper with a summary and some discussion of consequences of the proposed
analysis.

2 State of affairs

I first present a sample of data that I take to be representative of the
interpretational possibilities that arise with quantifiers in than-clauses.
Then I sketch Schwarzschild & Wilkinson’s (2002) and Heim’s (2006b) analyses
in Section 2.2, and summarize the proposals in Gajewski 2008, van Rooij 2008
and Schwarzschild 2008 in Section 2.3.

2.1 The empirical picture

2.1.1 A classical analysis of the comparative

The basis of our present perception of the problem presented by (1) and
(2) is the analysis of the comparative construction, because the data are
understood in terms of whether the quantifier appears to take wide scope
over the comparison according to a classical analysis of the comparative,
or whether it would have to be seen as taking narrow scope relative to the
comparison. My presentation assumes a general theoretical framework like
Heim & Kratzer 1998 and begins with specifically Heim’s (2001) version of
the theory of comparison promoted in von Stechow 1984 (see also Klein 1991
and Beck 2009 for an exposition and Cresswell 1977; Hellan 1981; Hoeksema
1983; Seuren 1978 for theoretical predecessors). This theory is what I will
refer to as a classical analysis of the comparative. For illustration, I discuss
the simple example (3a) below. In (3b) I provide the Logical Form and in (3c)
the truth conditions derived by compositional interpretation of that Logical
Form, plus paraphrase. Interpretation relies on the lexical entries of the
comparative morpheme and gradable adjectives as given in (4).

(3) a. Paule is older than Knut is.

    b. [-er [⟨d,t⟩ than 2 [Knut is t2 old]]
            [⟨d,t⟩ 2 [Paule is t2 old]]]
    c. max(λd. Paule is d-old) > max(λd. Knut is d-old) =
       Age(Paule) > Age(Knut)
       “The largest degree of age that Paule reaches exceeds the largest
       degree of age that Knut reaches.”
       “Paule’s age exceeds Knut’s age.”
(4) a. ⟦-er⟧ = λD_⟨d,t⟩. λD′_⟨d,t⟩. max(D′) > max(D)
    b. ⟦old⟧_⟨d,⟨e,t⟩⟩ = [λd. λx. x is d-old]
                       = [λd. λx. Age(x) ≥ d]
    c. Let S be a set ordered by R.
       Then max_R(S) = ιs[s ∈ S & ∀s′ ∈ S[sRs′]]
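
The lexical entries in (4) can be made concrete with a small computational sketch. The following is only an illustration under simplifying assumptions that are not part of the analysis itself: the degree scale is taken to be a finite range of whole years, and the names DEGREES, AGE, old, maximum, er and degrees_of, as well as the ages, are hypothetical.

```python
# Minimal sketch of the classical analysis in (3)-(4).
# Assumptions (not from the paper): degrees are whole years on a finite
# scale, and the ages below are invented for illustration.

DEGREES = range(0, 121)          # hypothetical finite degree scale
AGE = {"Paule": 7, "Knut": 5}    # hypothetical ages


def old(d):
    """[[old]] = λd. λx. Age(x) >= d (x is d-old for every d up to Age(x))."""
    return lambda x: AGE[x] >= d


def maximum(degree_set):
    """max(S): the member of S that is at least as large as every member of S."""
    return max(degree_set)


def er(than_set):
    """[[-er]] = λD. λD'. max(D') > max(D), where D is the than-clause set."""
    return lambda matrix_set: maximum(matrix_set) > maximum(than_set)


def degrees_of(x):
    """Degree abstraction: the set λd. x is d-old."""
    return {d for d in DEGREES if old(d)(x)}


# (3a) "Paule is older than Knut is", and the reverse comparison.
paule_older_than_knut = er(degrees_of("Knut"))(degrees_of("Paule"))
knut_older_than_paule = er(degrees_of("Paule"))(degrees_of("Knut"))
```

Because the adjective meaning is monotone, the maximum of each degree set coincides with the individual's age, which is why (3c) reduces to Age(Paule) > Age(Knut).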

Importantly, the role of the comparative operator is ultimately to relate
the maximal degree provided by the than-clause to some matrix clause
degree. The than-clause provides degrees through abstraction over the
degree argument slot of the adjective. Different versions of such a classical
analysis are available (for instance von Stechow’s (1984) own or Kennedy’s
(1997)), but the problem of quantifiers in than-clauses presents itself in a
parallel fashion in all of them.
I will make one small revision to the above version of the classical analysis:
I will suppose that what is written into the lexical entry of the comparative
morpheme as the maximality operator in (4a) is not actually part of the
meaning of the comparative itself. Rather, it is a general mechanism that
allows us to go from a description of a set to a particular object, for example
also in the case of free relative clauses in (5) (Jacobson 1995); see also Beck
2009. I represent maximality in the Logical Form, as indicated in (4′b). The
meaning of the comparative is then simply (4′a), the ‘larger than’ relation. It
is basically this meaning of the comparative that I will try to defend below.
The resulting interpretation remains of course the same.

(5) a. We bought [what we liked].

b. max(λx. we liked x)
(4′) a. ⟦-er⟧ = λd_d. λd′_d. d′ > d
     b. [-er [d than max 2 [Knut is t2 old]]
             [d max 2 [Paule is t2 old]]]
     c. max(λd. Paule is d-old) > max(λd. Knut is d-old)

2.1.2 Apparent wide scope quantifiers

Universal NPs are a standard example for an apparent wide scope quantifier
(see e.g. Heim 2006b). The sentence in (6) below only permits the reading in
(6′a), not the one in (6′b). This can be seen from the fact that the sentence
would be judged false in the situation depicted below.

(6) John is taller than every girl is.

(6′) a. ∀x[girl(x) → max(λd. John is d-tall) > max(λd. x is d-tall)]
        “For every girl x: John’s height exceeds x’s height.”
     b. #max(λd. John is d-tall) > max(λd. ∀x[girl(x) → x is d-tall])
        “John’s height exceeds the largest degree to which every girl is tall.”
        “John is taller than the shortest girl.”

[Scale diagram, heights in increasing order: g1’s height < J’s height < g2’s height < g3’s height]
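
The contrast between (6′a) and (6′b) in the depicted situation can be checked mechanically. The sketch below is an illustration only: it assumes a finite scale of whole centimetres, and the specific heights (chosen so that g1 < John < g2 < g3, as in the diagram) and all names are hypothetical.

```python
# Sketch: the two candidate readings of (6) under the classical analysis.
# Assumptions (not from the paper): a finite centimetre scale and invented
# heights that reproduce the ordering g1 < John < g2 < g3.

DEGREES = range(0, 251)
HEIGHT = {"John": 170, "g1": 160, "g2": 175, "g3": 180}
GIRLS = ["g1", "g2", "g3"]


def tall_degrees(x):
    """The set λd. x is d-tall, i.e. all degrees d with Height(x) >= d."""
    return {d for d in DEGREES if HEIGHT[x] >= d}


# (6'a): the universal takes apparent wide scope over the comparison.
wide_scope = all(
    max(tall_degrees("John")) > max(tall_degrees(g)) for g in GIRLS
)

# (6'b): the universal is interpreted inside the degree abstract.
shared_degrees = {d for d in DEGREES if all(HEIGHT[g] >= d for g in GIRLS)}
narrow_scope = max(tall_degrees("John")) > max(shared_degrees)
```

In this scenario wide_scope comes out false and narrow_scope true, since the largest degree shared by all girls is the height of the shortest girl. Speakers judge (6) false here, so only (6′a) matches intuitions.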

The classical semantics of comparatives makes this look as if the NP had
to take scope over the comparative. The LF given in (6″a) can straightforwardly
be interpreted to yield (6′a); analogously for (6″b) and (6′b). Thus,
strangely, the only LF that the sentence appears to permit is (6″a), which
violates constraints on Quantifier Raising (QR): QR is normally confined to a
simple finite clause (May 1985 and much subsequent work). The LF in (6″b),
which would be unproblematic syntactically, is not possible.

(6″) a. [[every girl] [1 [[-er [d than max 2 [t1 is t2 tall]]
                               [d max 2 [John is t2 tall]]]]
     b. [[-er [d than max 2 [every girl] [1 [t1 is t2 tall]]]
              [d max 2 [John is t2 tall]]]

The example with the differential in (7) shows the same behaviour (it uses a
version of the comparative that accommodates a difference degree, (7c)).

(7) a. John is 2″ taller than every girl is.
    b. ∀x[girl(x) → max(λd. John is d-tall) ≥ max(λd. x is d-tall) + 2″]
       = For every girl x: John’s height exceeds x’s height by 2″.
    c. ⟦-er_diff⟧ = λd. λd′. λd″. d″ ≥ d + d′
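
The differential entry in (7c) can be sketched as a curried function. This is an illustration under stated assumptions: heights are given in whole inches, the values are invented, and treating the first argument as the than-clause degree and the second as the differential is my reading of (7b) (since addition is symmetric, the result does not depend on that choice).

```python
# Sketch of the differential comparative in (7c), heights in inches.
# Assumptions (not from the paper): invented heights; argument order of
# er_diff as described in the lead-in.

HEIGHT = {"John": 72, "g1": 66, "g2": 69, "g3": 70}
GIRLS = ["g1", "g2", "g3"]


def er_diff(d):
    """[[-er_diff]] = λd. λd'. λd''. d'' >= d + d'."""
    return lambda d_prime: lambda d_double_prime: d_double_prime >= d + d_prime


def two_inches_taller_than_every_girl(heights):
    """(7b): for every girl x, John's height >= x's height + 2 inches."""
    return all(er_diff(heights[g])(2)(heights["John"]) for g in GIRLS)


reading_7b = two_inches_taller_than_every_girl(HEIGHT)
```

With these values the tallest girl is exactly two inches shorter than John, so the "at least" semantics of (7c) makes (7b) true; raising her height by one inch makes it false.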

The problem posed by (6) and (7) is exacerbated in (8), as Schwarzschild &
Wilkinson (2002) observe. We have once more a universal quantifier, but
this time it is one that is taken to be immobile at LF: the intensional verb
predict. Still, the interpretation that is intuitively available looks to be one
in which the universal outscopes the comparison, (8′a). The interpretation
in which comparison takes scope over predict, (8′b), is not possible. This is
problematic because the LF we would expect (8) to have is (10), and (10) is
straightforwardly interpreted to yield (8′b).

(8) John is taller than I had predicted (that he would be).

(9) My prediction: John will be between 1.70 m and 1.80 m.
Claim made by (8): John is taller than 1.80 m.
(8′) a. ∀w[wR@ →
        max(λd. John is d-tall in @) > max(λd. John is d-tall in w)]
        “For every world compatible with my predictions: John’s actual
        height exceeds John’s height in that world.”
     b. # max(λd. John is d-tall in @)
        > max(λd. ∀w[wR@ → John is d-tall in w])
        “John’s actual height exceeds the degree of tallness which he has
        in all worlds compatible with my predictions.”
        “John’s actual height exceeds the shortest prediction, 1.70 m.”
        (where R is the relevant accessibility relation, compare e.g. Kratzer
(10) [[-er [⟨d,t⟩ than max 2 [I had predicted that [John be t2 tall]]]
           [⟨d,t⟩ max 2 [John is t2 tall]]]

This is the interpretive behaviour of many quantified NPs, plural NPs like
the girls, quantificational adverbs, verbs of propositional attitude and some
modals (e.g. should, ought to, might). See Schwarzschild & Wilkinson 2002
and Heim 2006b for a more thorough empirical discussion.

2.1.3 Apparent narrow scope quantifiers

Not all quantificational elements show this behaviour. A universal quantifier
that does not is the modal have to, along with some others (be required, be
necessary, need). This is illustrated below.

(11) Mary is taller than she has to be.

(12) Mary wants to play basketball. The school rules require all players to
be at least 1.70 m. Claim made by (11): Mary is taller than 1.70 m.
(11′) a. ?#∀w[wR@ →
max(λd. Mary is d-tall in @) > max(λd. Mary is d-tall in w)]
= For every world compatible with the school rules:
Mary’s actual height exceeds Mary’s height in that world;
i.e. Mary is too tall.
b. max(λd. Mary is d-tall in @)
> max(λd. ∀w[wR@ → Mary is d-tall in w])
= Mary’s actual height exceeds the degree of tallness which she
has in all worlds compatible with the school rules;
i.e. Mary’s actual height exceeds the required minimum, 1.70 m.

These modals permit what appears to be a narrow scope interpretation
relative to the comparison. Example (11) does not favour an apparent wide
scope interpretation. Krasikova (2008) argues though that some examples
with have to–type modals may have both readings, depending on context. (13)
is one of her examples favouring a reading analogous to (11′a), an apparent
wide scope reading of have to (see Section 3 for more discussion).

(13) He was coming through later than he had to if he were going to retain
the overall lead. (from Google, cited from Krasikova 2008)
= He was coming through too late.

Existential modals like be allowed also appear to take narrow scope:

(14) Mary is taller than she is allowed to be.

(15) a. #∃w[wR@ &
max(λd. Mary is d-tall in @) > max(λd. Mary is d-tall in w)]
= It would be allowed for Mary to be shorter than she actually is.
b. max(λd. Mary is d-tall in @)
> max(λd. ∃w[wR@ & Mary is d-tall in w])
= Mary’s actual height exceeds the largest degree of tallness that
she reaches in some permissible world; i.e. Mary’s actual height
exceeds the permitted maximum.
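
The narrow scope readings of the universal modal in (11) and the existential modal in (14) can be computed side by side. The sketch below is an illustration only: permissible worlds are represented simply by Mary's height in each of them, and the particular heights, the finite centimetre scale and all names are hypothetical.

```python
# Sketch: apparent narrow scope readings for "have to" and "be allowed".
# Assumptions (not from the paper): each permissible world is represented
# by Mary's height in it; the values and the finite scale are invented.

DEGREES = range(0, 251)
PERMITTED_HEIGHTS = [170, 180, 190, 200]   # Mary's height per permissible world
ACTUAL = 178                               # Mary's height in @

# max(λd. ∀w[wR@ -> Mary is d-tall in w]): the required minimum.
required_minimum = max(
    d for d in DEGREES if all(h >= d for h in PERMITTED_HEIGHTS)
)

# max(λd. ∃w[wR@ & Mary is d-tall in w]): the permitted maximum.
permitted_maximum = max(
    d for d in DEGREES if any(h >= d for h in PERMITTED_HEIGHTS)
)

# Narrow scope readings: the available readings of (11) and (14).
taller_than_has_to = ACTUAL > required_minimum
taller_than_allowed = ACTUAL > permitted_maximum

# Apparent wide scope readings, for contrast.
wide_has_to = all(ACTUAL > h for h in PERMITTED_HEIGHTS)
wide_allowed = any(ACTUAL > h for h in PERMITTED_HEIGHTS)
```

In this scenario the narrow scope readings judge (11) true and (14) false, while the wide scope readings give the opposite verdicts, which is what makes the two kinds of reading easy to tell apart.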

And so do some other existential quantifiers and disjunction:

(16) Mary is taller than anyone else is.

(17) a. #There is someone that Mary is taller than.
b. Mary’s height exceeds the largest degree of tallness reached by
one of the others.
(18) Mary is taller than John or Fred are.
(19) a. ?#For either John or Fred: Mary is taller than that person.
b. Mary’s height exceeds the maximum height reached by John or
Fred.

This is the interpretive behaviour of some modals (e.g. need, have to, be
allowed, be required), some indefinites (especially NPIs) and disjunction (com-
pare once more Heim 2006b). It is also the behaviour of negation and negative
quantifiers, with the added observation that the apparent narrow scope read-
ing is one which often gives rise to undefinedness, hence unacceptability (von
Stechow 1984; Rullmann 1995). (That this is not invariably the case is shown
by (22), illustrating that we are concerned with a constraint on meaning rather
than form.)

(20) *John is taller than no girl is.

(21) a. John’s height exceeds the maximum height reached by no girl.
The maximum height reached by no girl is undefined, hence:
unacceptability of this reading.
b. #There is no girl who John is taller than.
(22) I haven’t been to the hairdresser longer than I haven’t been to the

Here is how the empirical picture presents itself from the point of view of
a classical analysis of comparatives. It appears that there are two different
scope readings possible for quantifiers embedded inside the than-clause,
wide or narrow scope relative to the comparison. But there is usually no
ambiguity. Each individual quantifier favours at most one reading (negation
frequently permits none). Apparent narrow scope readings are straightfor-
wardly captured by the classical analysis. It is unclear how apparent wide
scope readings are to be derived at all. As Schwarzschild & Wilkinson argue,
they are beyond the reach of an LF analysis. It is also unclear what creates
the pattern in the readings that we have observed.
Before we examine modern approaches to this problem, a final comment
on the data. I have presented them the way they are presented in the literature
on the subject, as if they were all impeccable and their interpretations clear.
But I would like to use this opportunity to point out that I find some of
them fairly difficult and perhaps not even entirely acceptable. This concerns
example (6), for which I would much prefer a version with a definite plural (the
girls instead of every girl). The NP the girls is, if anything, more problematic
under the classical analysis, as Schwarzschild & Wilkinson (2002) point out
(having less of an inclination towards wide scope); but see Section 4 for a
comment on how this issue may be relevant for the analysis developed in
this paper.

(6‴) a. ?John is taller than every girl is.
     b. John is taller than the girls are.
        ∀x[x ∈ ⟦the girls⟧ → John is taller than x]

Further cases are examples with intensional verbs like predict or expect;
when a genuine range is predicted or expected, intuitions regarding when
sentences with differentials like (8″) would be true vs. false are not very firm.
This seems to me an area in which a proper empirical study might be helpful.
The issue is taken up in Section 3.4.

(8″) a. John is two inches taller than I had predicted (that he would be).
b. John arrived at most 10 minutes later than I had expected.

2.2 New analyses I

Since it is very hard to see how the data can be derived under the classical
theory, the two theories summarized below (Schwarzschild & Wilkinson 2002
and Heim 2006b) both change the semantics of the comparative construction
in ways that reanalyse scope. The quantificational element inside the than-
clause can take scope there even under the apparent wide scope reading.
The two theories differ with respect to the semantics they attribute to the
comparison itself. They also differ in their empirical coverage.

2.2.1 Schwarzschild & Wilkinson 2002

Schwarzschild & Wilkinson (2002) are inspired by the scope puzzle to a
complete revision of the semantics of comparison. The feature of the classical
analysis that they perceive as the crux of our problem is that the than-clause
provides a degree via abstraction over degrees. According to them, the
quantifier data show that the than-clause instead must provide us with an
interval on the degree scale — in (23) below an interval into which the height
of everyone other than Caroline falls.

(23) Caroline is taller than everyone else is.

‘Everyone else is shorter than Caroline.’

[Scale diagram: the heights of x1, x2 and x3 fall inside an interval that covers everyone else’s height; Caroline’s height (C) lies above it. The interval is related to Caroline’s height by the comparative.]

(24) ⟦than everyone else is⟧ = λD. everyone else’s height falls within D
     (where D is of type ⟨d,t⟩)²

To simplify, I will suppose that it is somehow ensured that we pick the right
matrix clause interval (Caroline’s height in (23), Joe’s height in the example
below).

² I present the discussion here in terms of the classical theory’s ontology, where degrees (type
d, elements of D_d) are points on the degree scale and what I call an interval is a set of points,
type ⟨d,t⟩.


(25) Joe is taller than exactly 5 people are.

Here is a rough sketch of Schwarzschild & Wilkinson’s analysis of this
example.

(26) Subord: [λD. exactly 5 people’s height falls within D]
     Matrix + Comp: max D′: [Joe’s height − D′] ≠ 0
     the largest interval some distance below Joe’s height
     Whole clause: the largest interval some distance below Joe’s height
     is an interval into which exactly 5 people’s height falls.
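
The rough sketch in (26) can be turned into a small computation. This is a strong simplification and not Schwarzschild & Wilkinson's actual formalization: intervals are represented as half-open pairs (low, high), "the largest interval some distance below Joe's height" is approximated as everything from 0 up to but excluding Joe's height, and the people, their heights and all function names are hypothetical.

```python
# Sketch of the interval analysis in (26) for "Joe is taller than exactly
# 5 people are". Assumptions (not from the paper): half-open intervals
# (low, high) over centimetres and invented heights.

HEIGHT = {
    "Joe": 180,
    "p1": 150, "p2": 160, "p3": 165, "p4": 170, "p5": 175,
    "p6": 185, "p7": 190,
}
OTHERS = [p for p in HEIGHT if p != "Joe"]


def than_exactly_5_people(interval):
    """Subord: λD. exactly 5 people's height falls within D."""
    low, high = interval
    inside = [p for p in OTHERS if low <= HEIGHT[p] < high]
    return len(inside) == 5


def largest_interval_below(height):
    """Matrix + Comp: the largest interval lying strictly below `height`."""
    return (0, height)


# Whole clause: the than-clause predicate applied to the matrix interval.
sentence_25 = than_exactly_5_people(largest_interval_below(HEIGHT["Joe"]))
```

The quantifier is evaluated inside the than-clause predicate and never scopes over the comparison, which is the point made in the text that follows.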

Note that the quantifier is not given wide scope over the comparison at all
under this analysis. The interval idea allows us to interpret it within the
than-clause. While solving the puzzle of apparent wide scope operators, the
analysis makes wrong predictions for apparent narrow scope quantifiers (cf.
example (27)). The available reading cannot be accounted for ((28a) is the
semantics predicted by the classical analysis, corresponding to the intuitively
available reading; (28b) is the semantics that the Schwarzschild & Wilkinson
analysis predicts).

(27) John is taller than anyone else is.

(28) a. John’s height > max(λd. ∃x[x ≠ John & x is d-tall])
b. #The largest interval some distance below John’s height is an
interval into which someone else’s height falls = Someone is
shorter than John.

The breakthrough achieved by this analysis is that we can assign to the than-
clause a useful semantics while interpreting the quantifier inside that clause.
For this reason, the interval idea is to my mind a very important innovation.
The analysis still has a crucial problem in that it does not extend to the
apparent narrow scope quantifiers. That is, it fails in precisely those cases
that were unproblematic for the classical analysis. I will also mention that
the semantics of comparison becomes rather complex under this analysis,
since the comparative itself compares intervals. This is not in line with the
plot I outlined above of maintaining as the semantics of the comparative
operator the plain ‘larger than’-relation.

2.2.2 Heim 2006b

Heim (2006b) adopts the interval analysis, but combines it with a scope
mechanism that derives ultimately a wide and a narrow scope reading of
a quantifier relative to a comparison. Her analysis extends proposals by
Larson (1988). Larson’s own analysis is only applicable to than-clauses with
an adjective phrase gap denoting a property of individuals — a limitation
remedied by Heim. Let us first consider her analysis of apparent wide scope
quantifier data such as (29). Heim’s LF for the sentence is given in (30). She
employs an operator Pi (Point to Interval, credited to Schwarzschild (2004)),
whose semantics is specified in (31). Compositional interpretation (once more
somewhat simplified for the matrix clause, for convenience) is given in (32).

(29) John is taller than every girl is.

(30) [IP [CP than [1 [every girl [2 [AP [Pi t1] [3 [AP t2 is t3 tall]]]]]]]
     [IP 4 [[-er t4] [5 [John is t5 tall]]]]

(31) ⟦Pi⟧ = λD. λP. max(P) ∈ D

(32) a. main clause:
        ⟦[4 [[-er t4] [5 [John is t5 tall]]]]⟧ = λd. John is taller than d
     b. than-clause:
        ⟦[than [1 [every girl [2 [AP [Pi t1] [3 [AP t2 is t3 tall]]]]]]]⟧ =
        λD′. ⟦[every girl [2 [AP [Pi t1] [3 [AP t2 is t3 tall]]]]]⟧^g[D′/1] =
        λD′. ∀x[girl(x) → ⟦[AP [Pi t1] [3 [AP t2 is t3 tall]]]⟧^g[D′/1, x/2]] =
        λD′. ∀x[girl(x) → [λD. λP. max(P) ∈ D](D′)(⟦[3 [t2 is t3 tall]]⟧^g[D′/1, x/2])] =
        λD′. ∀x[girl(x) → [λD. λP. max(P) ∈ D](D′)(λd. Height(x) ≥ d)] =
        λD′. ∀x[girl(x) → max(λd. Height(x) ≥ d) ∈ D′] =
        λD′. ∀x[girl(x) → Height(x) ∈ D′]
        intervals into which the height of every girl falls
     c. main clause + than-clause:
        ⟦(29)⟧ =
        [λD′. ∀x[girl(x) → Height(x) ∈ D′]](λd. John is taller than d) =
        ∀x[girl(x) → Height(x) ∈ (λd. John is taller than d)] =
        for every girl x: John is taller than x
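
The derivation in (32) can be replayed as a small computation. The sketch below is an illustration, not Heim's implementation: it assumes a finite centimetre scale, represents intervals as Python sets of degrees, and uses invented heights; the names pi, than_every_girl and main_clause are hypothetical.

```python
# Sketch of (31)-(32): Pi with the universal quantifier scoping over it.
# Assumptions (not from the paper): finite centimetre scale, intervals as
# sets of degrees, invented heights.

DEGREES = range(0, 251)
HEIGHT = {"John": 185, "g1": 160, "g2": 170, "g3": 175}
GIRLS = ["g1", "g2", "g3"]


def pi(interval):
    """[[Pi]] = λD. λP. max(P) ∈ D, with P a predicate of degrees."""
    return lambda degree_pred: (
        max(d for d in DEGREES if degree_pred(d)) in interval
    )


def than_every_girl(interval):
    """(32b): λD'. ∀x[girl(x) -> max(λd. Height(x) >= d) ∈ D']."""
    return all(
        pi(interval)(lambda d, x=x: HEIGHT[x] >= d) for x in GIRLS
    )


# (32a): λd. John is taller than d, as a set of degrees.
main_clause = {d for d in DEGREES if HEIGHT["John"] > d}

# (32c): the than-clause applied to the main clause interval.
sentence_29 = than_every_girl(main_clause)

# The reading it reduces to: for every girl x, John is taller than x.
direct_wide_scope = all(HEIGHT["John"] > HEIGHT[x] for x in GIRLS)
```

Both values agree, reflecting the last step of (32c): membership of each girl's height in the set of degrees John exceeds is just the statement that John is taller than her.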

The than-clause provides intervals into which the height of every girl falls.
The whole sentence says that the set of degrees exceeded by John’s height is
such an interval. Semantic reconstruction (i.e. lambda conversion) simplifies
the whole to the claim intuitively made, that every girl is shorter than John.
The analysis assumes that the denotation domain D_d is a set of degree ‘points’,
and that intervals are elements of D_⟨d,t⟩.
The analysis is a way of interpreting the quantifier inside the than-clause,
and deriving the apparent wide scope reading over the comparison through
giving the quantifier scope over the shift from degrees to intervals (the Pi
operator). It is applicable to other kinds of quantificational elements like
intensional verbs in the same way. Our example with predict is analysed
below; the intuitively plausible reading can now be derived straightforwardly
from the LF in (34).

(33) a. John is taller than I had predicted (that he would be).

b. ∀w[wR@ → max(λd. John is d-tall in @) >
max(λd. John is d-tall in w)]
= For every world compatible with my predictions:
John’s actual height exceeds John’s height in that world.
(34) [IP [CP than [1 [I had predicted [CP [Pi t1] [2 [AP John t2 tall]]]]]]]
     [IP 3 [John is taller than t3]]]
(35) a. main clause:
        ⟦[3 [John is taller than t3]]⟧ = (λd. John is taller than d in @)
     b. than-clause:
        ⟦[than [1 [I had predicted [CP [Pi t1] [2 [AP John t2 tall]]]]]]]⟧ =
        [λD′. ∀w[wR@ → ⟦[CP [Pi t1] [2 [AP John t2 tall]]]⟧^g[D′/1]]] =
        [λD′. ∀w[wR@ → max(λd. Height(John)(w) ≥ d) ∈ D′]] =
        [λD′. ∀w[wR@ → Height(John)(w) ∈ D′]]
        intervals into which John’s height falls in all my predictions
     c. main clause + than-clause:
        ⟦(34)⟧ =
        [λD′. ∀w[wR@ → Height(John)(w) ∈ D′]]
        (λd. J is taller than d in @)
        = for every w compatible with my predictions:
        John’s actual height exceeds John’s height in w.

The effect of the Pi operator on the predicate of degrees it combines with
is sketched below for the AP tall. As long as a than-clause quantifier takes
scope over the Pi operator, the resulting meaning of the whole sentence will
be one that lets the quantifier take scope over the comparison, even though
it is interpreted syntactically below the comparative operator and inside the
than-clause.
(36) Pi shifts from degrees to intervals:

[λd. Height(x) ≥ d] =⇒ [λD. Height(x) ∈ D]

In contrast to Schwarzschild & Wilkinson’s original interval analysis, Heim
is able to derive apparently narrow scope readings of an operator relative to
the comparison as well. The sentence in (37a) is associated with the LF in (38).
Note that here, the shifter takes scope over the operator have to. This makes
have to combine with the degree semantics in the original, desired way,
giving us the minimum compliance height (just like it did before, without the
intervals). The shift is essentially harmless.

(37) a. Mary is taller than she has to be.

b. max(λd. Mary is d-tall in @)
> max(λd. ∀w[wR@ → Mary is d-tall in w])
Mary’s actual height exceeds the degree of tallness which she has
in all worlds compatible with the school rules;
i.e. Mary’s actual height exceeds the required minimum, 1.70 m.
(38) [IP [CP than [1 [[[Pi t1] [2 [has-to [Mary t2 tall]]]]]]]
     [IP 3 [Mary is taller than t3]]]
(39) a. main clause:
        ⟦[3 [Mary is taller than t3]]⟧ = (λd. Mary is taller than d in @)
     b. than-clause:
        ⟦[than [1 [[Pi t1] [2 [has-to [Mary t2 tall]]]]]]⟧ =
        λD′. ⟦[[Pi t1] [2 [has-to [Mary t2 tall]]]]⟧^g[D′/1] =
        λD′. max(λd. ⟦[has-to [Mary t2 tall]]⟧^g[d/2]) ∈ D′ =
        λD′. max(λd. ∀w[wR@ → Mary is d-tall in w]) ∈ D′
        intervals into which the required minimum falls
     c. main clause + than-clause:
        ⟦(38)⟧ =
        [λD′. max(λd. ∀w[wR@ → Mary is d-tall in w]) ∈ D′]
        (λd. Mary is taller than d in @)
        = Mary is taller than the required minimum.

Other apparent narrow scope operators receive a parallel analysis. The
crucial ingredient to this analysis is that the Pi operator is a scope bearing
element, able to take local or non-local scope. Pi-phrase scope interaction is
summarized below:

(40) Pi takes narrow scope relative to quantifier

=⇒ apparent wide scope reading of quantifier over comparison
Pi takes wide scope relative to quantifier
=⇒ apparent narrow scope reading of quantifier relative to comparison
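
The scope interaction summarized in (40) can be illustrated with a modal quantifier and both positions of Pi. As before, this is a sketch under simplifying assumptions rather than Heim's own formalization: permissible worlds are represented by Mary's height in each, intervals are sets of degrees on a finite centimetre scale, and the values and names are hypothetical.

```python
# Sketch of (40): the two relative scopes of Pi and a universal modal.
# Assumptions (not from the paper): finite centimetre scale, worlds
# represented by Mary's height in them, invented values.

DEGREES = range(0, 251)
ACTUAL = 178                      # Mary's height in @
PERMISSIBLE = [170, 180, 190]     # Mary's height in each permissible world


def pi(interval):
    """[[Pi]] = λD. λP. max(P) ∈ D."""
    return lambda degree_pred: (
        max(d for d in DEGREES if degree_pred(d)) in interval
    )


# Main clause: λd. Mary is taller than d in @.
main_clause = {d for d in DEGREES if ACTUAL > d}


def modal_over_pi(interval):
    """Pi takes narrow scope: ∀w[wR@ -> Height(Mary)(w) ∈ D']."""
    return all(pi(interval)(lambda d, h=h: h >= d) for h in PERMISSIBLE)


def pi_over_modal(interval):
    """Pi takes wide scope: max(λd. ∀w[wR@ -> Mary is d-tall in w]) ∈ D'."""
    return pi(interval)(lambda d: all(h >= d for h in PERMISSIBLE))


apparent_wide_scope = modal_over_pi(main_clause)
apparent_narrow_scope = pi_over_modal(main_clause)
```

With these values the apparent wide scope reading is false (Mary does not exceed her height in every permissible world), while the apparent narrow scope reading is true (she exceeds the required minimum), matching the judgement reported for (37a).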

Thus than-clauses include a shift from degrees to intervals, which allows
us to assign a denotation to the than-clause with the quantifier. The shift
amounts to a form of type raising. Through semantic reconstruction, the
matrix clause is interpreted in the scope of a than-clause operator when that
operator has scope over the shifter. In contrast to Schwarzschild & Wilkinson,
comparison is ultimately between degrees, not intervals.
Heim’s analysis is able to derive both wide and narrow scope readings of
operators in than-clauses. It does so without violating syntactic constraints.
There is, however, an unresolved question: when do we get which reading?
How could one constrain Pi-phrase/operator interaction in the desired way?
One place where this problem surfaces is once more negation, where we
expect an LF that would generate an acceptable wide scope of negation
reading. That is, the LF in (41b) should be grammatical and hence (41a)
should be acceptable on the reading derived from this LF in (42).

(41) a. *John is taller than no girl is.

     b. [IP [CP than [1 [no girl [2 [[Pi t1] [3 [t2 is t3 tall]]]]]]]
        [IP 4 [[-er t4] [5 [John is t5 tall]]]]
(42) a. main clause:
        ⟦[4 [[-er t4] [5 [John is t5 tall]]]]⟧ = λd. John is taller than d
     b. than-clause:
        ⟦[than [1 [no girl [2 [[Pi t1] [3 [t2 is t3 tall]]]]]]]⟧ =
        λD′. for no girl x: max(λd. x is d-tall) ∈ D′
        intervals into which the height of no girl falls
     c. main clause + than-clause:
        ⟦(41b)⟧ =
        [λD′. for no girl x: max(λd. x is d-tall) ∈ D′](λd. J is taller than d)
        = for no girl x: John is taller than x

Adopting the interval analysis, but combining it with a scope mechanism
and semantic reconstruction, allows Heim to derive both types of readings
(apparent narrow and apparent wide scope), and to reduce the comparison
ultimately back to a comparison between degrees. Thus her empirical coverage
is greater and the semantics of comparison simpler than in Schwarzschild &
Wilkinson’s analysis. The problem that this analysis faces is overgeneration.
We do not have an obvious way of predicting when we get which reading.
The fact that in general, only one scope possibility is available makes one
doubt that this is really a case of systematic scope ambiguity.

2.3 Alternative new analyses: Gajewski, van Rooij, Schwarzschild

There is a group of new proposals — Gajewski 2008, van Rooij 2008 and
Schwarzschild 2008 — for how to deal with quantifiers in than-clauses whose
approach seems to be inspired by Heim’s (2006b) analysis. I present below a
simplified version of this family of approaches that is not entirely faithful
to any of them. I call this the NOT-theory. It can be summarized in relation
to the previous subsection as ‘keep the than-clause internal operator, but
not the intervals’. It adopts the idea that there is an operator — like Heim’s
Pi — that can take wide or narrow scope relative to a than-clause quantifier,
dictating what kind of reading the comparative sentence receives. It does not
adopt an interval analysis, and thus the operator is not Pi and the semantics
of the comparative is not the classical one. Instead, the operator is negation
and the proposed semantics is basically Seuren’s (1978).

2.3.1 Seuren’s semantics for the comparative (operator: NOT)

Seuren (1978) suggests (43b) as the interpretation of (43a). The than-clause
provides the set of degrees of tallness that Bill does not reach. It does so
by virtue of containing a negation, as illustrated in the LF in (44). This
meaning could be combined intersectively with the main clause and the
degree existentially bound, as represented in (45).

(43) a. John is taller than Bill is.

b. ∃d[Height(J) ≥ d & ¬ Height(B) ≥ d]
c. There is a degree of tallness that John reaches and Bill doesn’t

(44) a. than λd[NOT Bill is d-tall]

b. λd[¬ Height(B) ≥ d] = λd[Height(B) < d]
(45) [∃ [λd [John is d-tall] [than λd [NOT Bill is d-tall]]]
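
The truth conditions in (43b) can be checked mechanically on a finite scale. The following is a minimal sketch under stated assumptions, not part of the analysis itself: heights are whole centimetres, the finite scale `DEGREES` is hypothetical, and the helper names (`reaches`, `than_clause_not`, `seuren_comparative`) are introduced here for illustration only.

```python
# Sketch of Seuren-style truth conditions (43b) on a finite degree scale.
# Assumptions (not from the paper): degrees are the integers 0..250 (cm),
# and an individual of height h reaches every degree d with h >= d.

DEGREES = range(0, 251)  # hypothetical finite scale


def reaches(height, d):
    """Height(x) >= d: the individual is d-tall."""
    return height >= d


def than_clause_not(height_b):
    """(44b): the set of degrees that Bill does NOT reach."""
    return {d for d in DEGREES if not reaches(height_b, d)}


def seuren_comparative(height_j, height_b):
    """(43b)/(45): some degree is reached by John and not reached by Bill."""
    not_reached_by_b = than_clause_not(height_b)
    return any(reaches(height_j, d) and d in not_reached_by_b for d in DEGREES)
```

On this sketch the than-clause denotes a set of degrees rather than a single degree, which is the property that becomes relevant for differentials in Section 2.3.4.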

The authors mentioned above note that this semantics gives us an easy
way to derive the intuitively correct interpretation for apparent wide scope
quantifiers. This is illustrated below for the universal NP. In (46) I show that
the desired meaning is easily described in this analysis and in (47) I provide
the LF for the than-clause that derives it. (48) illustrates that some, another
apparent wide scope quantifier, is equally unproblematic.

(46) a. John is taller than every girl is.

b. ∃d[Height(J) ≥ d & ∀x[girl(x) → Height(x) < d]]
c. every girl is shorter than John.
(47) a. than every girl is
b. than λd [every girl [1 [NOT [t1 is d tall]]]]
c. than λd.∀x[girl(x) → Height(x) < d]

[Figure: height scale with the girls’ heights g1, g2, g3, g4 marked as points]
(48) a. John is taller than some girl is.
b. ∃d[Height(J) ≥ d & ∃x[girl(x) & Height(x) < d]]
c. there is a girl who is shorter than John.

An interesting application is negation, here illustrated with the negative
quantifier no. Proceeding in the now familiar way, we derive (49b). Rephrasing
this in terms of (49c) makes it clear that the resulting semantics is very weak.
Whenever the girls have any measurable height at all — that is, whenever the
than-clause can be appropriately used — there will be a height degree that
John reaches and that all the girls reach as well. The smallest degree on the
scale will be such a degree. The NOT-theory proposes that the sentence is
unacceptable because it is necessarily uninformative.

(49) a. *John is taller than no girl is.

b. ∃d[Height(J) ≥ d & for no girl x : Height(x) < d]
c. ∃d[Height(J) ≥ d & for every girl x : Height(x) ≥ d]
(The lowest degree on the height scale makes this true.)
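
The contrast between (46b), (48b) and (49b) can be illustrated with a small sketch. It assumes a hypothetical finite scale of whole centimetres starting at 0; the function names are illustrative and not taken from the works cited.

```python
# Sketch of the NOT-theory truth conditions for every, some and no.
# Assumption (not from the paper): a finite scale of integer degrees 0..250.

DEGREES = range(0, 251)


def taller_than_every(john, girls):
    """(46b): some degree John reaches lies above every girl's height."""
    return any(john >= d and all(g < d for g in girls) for d in DEGREES)


def taller_than_some(john, girls):
    """(48b): some degree John reaches lies above some girl's height."""
    return any(john >= d and any(g < d for g in girls) for d in DEGREES)


def taller_than_no(john, girls):
    """(49b): some degree John reaches lies above no girl's height."""
    return any(john >= d and not any(g < d for g in girls) for d in DEGREES)
```

With non-negative heights, `taller_than_no` comes out true for every choice of John's height, because the lowest degree on the scale always serves as a witness. This mirrors the uninformativity noted for (49).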

2.3.2 NOT has to take varying scope

The NOT-theory needs another important ingredient: Just like the Pi-operator
above, other than-clause internal operators have to take flexible scope relative
to NOT in order to create the different readings we observe. This is illustrated
below with the familiar have to example, and with allowed.

(50) a. Mary is taller than she has to be.

b. #∃d[Height(M)(@) ≥ d & ∀w[wR@ → NOT Height(M)(w) ≥ d]]
Mary should have been shorter than she is.
c. ∃d[Height(M)(@) ≥ d & NOT∀w[wR@ → Height(M)(w) ≥ d]]
Mary is taller than the minimally required height.
(51) a. John is taller than he is allowed to be.
b. #∃d[Height(J)(@) ≥ d & ∃w[wR@ & NOT Height(J)(w) ≥ d]]
∃d[Height(J)(@) ≥ d & ∃w[wR@ & Height(J)(w) < d]]
John would have been allowed to be shorter than he is.
c. ∃d[Height(J)(@) ≥ d & NOT∃w[wR@ & Height(J)(w) ≥ d]]
John is taller than the tallest permissible height.
(52) a. #than λd [allowed [λw [NOT [John is d tall in w]]]]
b. than λd [NOT [allowed [λw [John is d tall in w]]]]
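
The four scope configurations in (50) and (51) can be compared in a short sketch. It assumes that the accessible worlds are represented simply by the height the subject has in each of them (a hypothetical simplification), over a finite integer scale; all names are illustrative.

```python
# Sketch of the scope options for NOT relative to a modal, cf. (50) and (51).
# Assumptions (not from the paper): each accessible world is represented by
# the subject's height in that world; degrees are the integers 0..250.

DEGREES = range(0, 251)


def not_over_required(actual, accessible):
    """(50c): NOT scopes over the universal modal."""
    return any(actual >= d and not all(h >= d for h in accessible)
               for d in DEGREES)


def required_over_not(actual, accessible):
    """(50b), unavailable: the universal modal scopes over NOT."""
    return any(actual >= d and all(not h >= d for h in accessible)
               for d in DEGREES)


def not_over_allowed(actual, accessible):
    """(51c): NOT scopes over the existential modal."""
    return any(actual >= d and not any(h >= d for h in accessible)
               for d in DEGREES)


def allowed_over_not(actual, accessible):
    """(51b), unavailable: the existential modal scopes over NOT."""
    return any(actual >= d and any(not h >= d for h in accessible)
               for d in DEGREES)
```

In this sketch, (50c) and (51b) both amount to exceeding the lowest accessible height, while (50b) and (51c) both amount to exceeding the highest one, so the choice of LF is what separates the attested from the unattested reading.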

Just like the Pi-theory, then, the NOT-theory is able to generate the range
of readings we observe for operators in than-clauses. It seems somewhat
simpler than the Pi-theory in that it does not take recourse to intervals in
addition to a scopally flexible than-clause internal operator. But as in the case
of the Pi-theory, we must next ask ourselves what prevents the unavailable
readings, e.g. what excludes the LF in (52a).

2.3.3 Which reading?

The NOT-theory would have an empirical advantage over the Pi-theory if con-
straints on scope could be found to deal with the overgeneration problem we
noted above. A first successful application concerns polarity items. Example (53a)
can only have the LF in (54b), not the one in (54a), according to constraints on
the distribution of NPIs. Thus we only derive the appropriate interpretation.
Note though that the Pi-theory has the same success since the scope of Pi
is a downward entailing environment, but the rest of the than-clause isn’t
(compare Heim 2006b). (55) is the mirror image.

(53) a. John is taller than any girl is.

b. #∃d[Height(J) ≥ d & ∃x[girl(x) & Height(x) < d]]
there is a girl who is shorter than John.
c. ∃d[Height(J) ≥ d & NOT ∃x[girl(x) & Height(x) ≥ d]]
John reaches a height degree that no girl reaches.
= John is taller than every girl.
(54) a. *than λd [any girl [1 [NOT [t1 is d tall]]]]
b. than λd [NOT [any girl [1 [t1 is d tall]]]]
(55) John is taller than some girl is.

Let us next reexamine negation. Two interpretations need to be considered.
The one in (56b) was already rejected above as uninformative. It turns out that
the alternative interpretation is equally uninformative. The ungrammaticality
of negation in than-clauses is thus captured elegantly by this theory. Here it
has an advantage over the Pi-theory.

(56) a. *John is taller than no girl is.

b. ∃d[Height(J) ≥ d & for no girl x : Height(x) < d] uninformative
c. ∃d[Height(J) ≥ d & NOT for no girl x : Height(x) ≥ d] uninformative
= ∃d[Height(J) ≥ d & some girl x : Height(x) ≥ d]

Among the proponents of the NOT-theory, Schwarzschild (2008) examines
modals. He argues that the NOT-theory predicts that modals in than-clauses
should give rise to the same reading that they have with ordinary clause-mate
negation. This prediction is borne out, as the examples below illustrate.

(57) a. John is not allowed to be that tall. NOT >> allowed
b. than he is allowed to be.
(58) a. John might not be that tall. might >> NOT
b. than he might be.
(59) a. John is not supposed to be that tall. supposed >> NOT
b. than he is supposed to be.
(60) a. John is not required to be that tall. NOT >> required
b. than he is required to be.

While this is helpful with modals, it stops short of explaining the interpreta-
tion associated with intensional full verbs like predict.

(61) a. John was not predicted to be that tall. NOT >> predict — #

b. than he was predicted to be.

Two further possible constraints are discussed. Van Rooij (2008) examines
universal DPs and Gajewski (2008) investigates numeral DPs. Let us consider
both in turn.
Note first that a universal DP is ambiguous relative to clause-mate negation.
In particular it allows a reading in which the universal takes narrow
scope relative to negation. Thus there are no inherent scope constraints that
would help us to exclude (63′b) as an LF of (63a). But exclude it we must,
since it gives rise to the unavailable reading (63c).

(62) a. Every girl isn’t that tall. ambiguous

b. than every girl is.
(63) a. John is taller than every girl is.
b. ∃d[Height(J) ≥ d & ∀x[girl(x) → Height(x) < d]]
‘Every girl is shorter than John.’
c. #∃d[Height(J) ≥ d & NOT ∀x[girl(x) → Height(x) ≥ d]]
‘John reaches a height that some girl doesn’t.’
= John is taller than the shortest girl.
(63′) a. than λd [every girl [1 [NOT [t1 is d tall]]]]
b. *than λd [NOT [every girl [1 [t1 is d tall]]]]

Van Rooij observes that (63′a) yields stronger truth conditions than (63′b). He
proposes that if no independent constraint excludes one of the LFs, you have
to pick the one that results in the stronger truth conditions. This amounts to
the suggestion that than-clauses fall within the realm of application of the
Strongest Meaning Hypothesis (SMH; Dalrymple, Kanazawa, Kim, Mchombo &
Peters 1998). If they do, the NOT-theory can make the desired predictions
about every DPs (and some other relevant examples). So could the Pi-theory,
though, so this does not distinguish between the two scope based theories of
quantifiers in than-clauses.
While I am sympathetic to the idea of extending application of the SMH, I
see some open questions for doing so in the case of than-clauses. Dalrymple
et al. originally proposed the SMH to deal with the interpretation of recipro-
cals. (64a) receives a stronger interpretation than (64b), for example, because
the predicate to stare at makes it factually impossible for the reading of (64a)
to ever be true. Similarly for (64c) vs. (64a,b). But (64a) only has one
interpretation, the strongest one, and (64b) also cannot have a reading parallel to
(64c). The SMH says, very roughly, that out of the set of theoretically possible
interpretations you choose the strongest one that has a chance of resulting
in a true statement, i.e. that is conceptually possible.3

(64) a. These three people know each other.

= everyone knows everyone else.
b. These three people were staring at each other.
= everyone was staring at someone else.
c. These three people followed each other into the elevator.
= everyone followed, or was followed by, someone else.

There is a theoretical question as to when the SMH applies. We would not
wish it to apply in (62) for instance because it would predict that there is
no ambiguity. When there is ambiguity, the data in question must not be
subject to the SMH. Are than-clauses in the domain of application of the SMH?
Prima facie, this seems very plausible, because — just like reciprocals — they
are (almost always) unambiguous, while semantic theory provides several
potential interpretations. What strikes me as problematic is that there is
no way to make the weaker reading emerge, even if the stronger reading is
conceptually impossible. The following sentences are necessarily false, rather
than having the interpretations indicated.

(64′) a. (about a 100 m race:)
The next to last finalist was faster than every other finalist.
≠ the next to last finalist was faster than the slowest other finalist.
b. (in an elevator:)
The second button from the bottom is higher than every other button.
≠ the second button from the bottom is higher than the lowest
other button.
c. Friday is earlier than every other day of the week.
≠ Friday is earlier than the latest other day of the week.

3 Below I provide the formulation of the SMH given in Beck 2001. If we extend the domain of
application of the SMH to than-clauses, we need to strike out those phrases that make explicit
reference to reciprocals, as indicated. The relevant point is that the SMH makes reference
to interpretations compatible with non-linguistic information I, which in the examples in
(64′) would be knowledge about the order of finalists, elevator buttons and weekdays,
parallel to knowledge about processions of people and possibilities for staring in (64).

(i) Strongest Meaning Hypothesis (SMH)
Let Sr be the set of theoretically possible reciprocal interpretations for a sentence
S. Then, S can be uttered felicitously in a context c, which supplies non-linguistic
information I relevant to the reciprocal’s interpretation, provided that the set Sc has
a member that entails every other one.
Sc = {p: p is consistent with I and p ∈ Sr }
In that case, the use of S in c expresses the logically strongest proposition in Sc .

Thus than-clauses do not seem parallel to reciprocals. It would be better if an
LF that gives rise to the ‘the least . . . other’ reading for universal DPs simply
did not exist.
Turning now to numeral DPs, note first that it is not immediately obvious
how the NOT-theory predicts a plausible meaning for them at all. Gajewski
(2008) points out that the following analysis of exactly-DPs gives rise to truth
conditions that are too weak. (65′) would be true in a situation in which more
than three girls stay below John’s height.

(65) John is taller than exactly three girls are.

(65′) ∃d[Height(J) ≥ d & for exactly 3 girls x : Height(x) < d]
At least three girls are below John’s height

[Figure: height scale showing the girls’ heights g1–g5, a degree d, and John’s height H(J)]

Reversing the scope of NOT and the exactly-DP doesn’t help:

(65″) ∃d[Height(J) ≥ d & NOT for exactly 3 girls x : Height(x) ≥ d]
there is a degree of height that John reaches that is not reached by
exactly 3 girls,
i.e. fewer or more girls reach that degree
true e.g. if John is taller than every one of five girls
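
That both scope options for the exactly-DP are too weak can be verified on a small model. This is a minimal sketch assuming a hypothetical finite integer scale and illustrative function names; the scenario (five girls, all shorter than John) follows the situation described for (65″).

```python
# Sketch: both NOT-theory readings of (65) hold when FIVE girls are shorter
# than John. Assumption (not from the paper): integer degrees 0..250.

DEGREES = range(0, 251)


def count_below(girls, d):
    """Number of girls x with Height(x) < d."""
    return sum(1 for g in girls if g < d)


def count_reaching(girls, d):
    """Number of girls x with Height(x) >= d."""
    return sum(1 for g in girls if g >= d)


def exactly3_over_not(john, girls):
    """(65'): some degree John reaches has exactly 3 girls below it."""
    return any(john >= d and count_below(girls, d) == 3 for d in DEGREES)


def not_over_exactly3(john, girls):
    """(65''): some degree John reaches is not reached by exactly 3 girls."""
    return any(john >= d and count_reaching(girls, d) != 3 for d in DEGREES)
```

For girls of heights 150, 155, 160, 165 and 170 and John at 180, both functions return true, although the sentence is intuitively false in that situation.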

Gajewski develops an analysis that relies on Krifka’s (1999) work on exactly,
at least and at most, according to which these elements take effect at the level
of the utterance, far away from their surface position. I present this analysis
in simplified terms below, using (66) to illustrate. The semantic effect of
exactly is due to an operator I call EXACT, which applies at the utterance
level and operates on the basis of the ordinary as well as the focus semantic
value of its argument. The operator’s semantics is given in (67). The truth
conditions derived for the example are the right ones, as shown in (68) ((68)
uses Link’s (1983) operator ∗ for pluralization of the noun).

(66) a. Exactly three girls weigh 50 lb.

b. [EXACT [XP (exactly) threeF girls weigh 50 lb.]]
(66′) ⟦threeF girls weigh 50 lb.⟧o =
∃X[∗girl(X) & card(X) = 3 & ∗weigh.50.lb(X)]
⟦threeF girls weigh 50 lb.⟧f =
{∃X[∗girl(X) & card(X) = n & ∗weigh.50.lb(X)] : n ∈ N}
(67) ⟦EXACT⟧(⟦XP⟧f)(⟦XP⟧o) = 1
iff ⟦XP⟧o = 1 & ∀q ∈ ⟦XP⟧f : ¬(⟦XP⟧o → q) → ¬q
‘Out of all the alternatives of XP, the most informative true one is the
ordinary semantics of XP.’
(68) ⟦(66b)⟧ = 1 iff
∃X[∗girl(X) & card(X) = 3 & ∗weigh.50.lb(X)] &
∀n[n > 3 → ¬∃X[∗girl(X) & card(X) = n & ∗weigh.50.lb(X)]] iff
max(λn.∃X[∗girl(X) & card(X) = n & ∗weigh.50.lb(X)]) = 3
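
The effect of EXACT in (67) can be sketched with propositions modelled as predicates over a small space of situations. The modelling choices are assumptions made for illustration: a situation is just the number of girls who weigh 50 lb, the space is capped at a hypothetical `MAX_GIRLS`, and entailment is checked by brute force over that space.

```python
# Sketch of EXACT, cf. (67). Assumptions (not from the paper): a situation is
# identified with the number k of girls who weigh 50 lb, with k in 0..MAX_GIRLS.

MAX_GIRLS = 10
SITUATIONS = range(0, MAX_GIRLS + 1)


def alt(n):
    """Focus alternative for numeral n: some plurality of n girls weighs 50 lb."""
    return lambda k: k >= n


def entails(p, q):
    """p entails q iff q holds in every situation in which p holds."""
    return all(q(k) for k in SITUATIONS if p(k))


def exact(ordinary, alternatives, k):
    """(67): the ordinary value is true in k, and every alternative that it
    does not entail is false in k."""
    return ordinary(k) and all(
        not q(k) for q in alternatives if not entails(ordinary, q)
    )


def exactly_n_weigh_50(n, k):
    """(68) for an arbitrary numeral n, evaluated in situation k."""
    alternatives = [alt(m) for m in range(1, MAX_GIRLS + 1)]
    return exact(alt(n), alternatives, k)
```

With `n = 3`, the sketch is true only in the situation where the number of girls weighing 50 lb is 3, matching the max-formulation in the last line of (68).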

Krifka’s analysis of exactly allows us to assign the problematic example (65)
the LF in (69), which captures the right meaning, namely the interpretation in
(69′).

(69) EXACT [∃ [λd [John is d-tall] [than λd [threeF girls [λx [NOT x is d-tall]]]]]]
(69′) max(λn. ∃d[Height(J) ≥ d & for n girls x : Height(x) < d]) = 3
the largest number n such that John reaches a height that n girls
don’t is 3. = exactly three girls are shorter than John.

Thus independently motivated assumptions about numerals allow the NOT-theory
to derive the desired interpretation. However, there is still the question
of the other LF, (70), in which NOT takes scope over the DP. This gives rise to
interpretation (70′).

(70) EXACT [∃ [λd [John is d-tall] [than λd [NOT threeF girls [λx [x is d-tall]]]]]]
(70′) max(λn. ∃d[Height(J) ≥ d & NOT for n girls x : Height(x) ≥ d]) = 3
the largest number n such that there is a height John reaches and it’s
not the case that n girls do is 3.
= exactly two girls are shorter than John.

The reasoning in (71) makes it clear that this reading leads to truth conditions
that do not correspond to an available reading; they would make the sentence
true in the situation depicted, where there are two girls shorter than John.

(71) a. ∃d[Height(J) ≥ d & NOT for n girls x : Height(x) ≥ d]

= ∃d[Height(J) ≥ d & fewer than n girls reach d]
b. ∃d[Height(J) ≥ d & fewer than 3 girls reach d]

[Figure: height scale with points for the girls g1–g5 and for John, two of the girls being shorter than John]


The NOT-theory would have to come up with an explanation for why this
reading is unavailable. I am not aware that there is at present such an
explanation. Note that even if we didn’t have the reservations about the SMH
pointed out above, it would not apply here, as the two interpretations don’t
stand in an entailment relation.
To summarize: just like the Pi-theory, the NOT-theory faces an overgen-
eration problem. Both the Pi-theory and the NOT-theory solve this easily
regarding NPIs. The NOT-theory also has a simple story about modals and
negative quantifiers. It does not have an explanation for intensional full verbs
and numeral DPs, and I argue it does not have a story about universal DPs (or
other prospective applications of the SMH) either. Thus I see some progress
compared to the Pi-theory, but not a complete analysis. A conceptual advan-
tage seems to be the NOT-theory’s simplicity. But we will need to reexamine
that in the next subsection.

2.3.4 Reference to degrees — differentials

One of the strengths of the classical analysis of comparatives is the way in
which it deals with explicit reference to degrees. For example differentials in
comparatives, illustrated in (72) and (73), receive an easy and natural analysis.

(72) a. Bill is 1.70 m tall.
b. John is 2″ taller than that.
c. Height(J) ≥ 2″ + 1.70 m
(73) a. John is 2″ taller than Bill is.
b. Height(J) ≥ 2″ + max(λd. Height(B) ≥ d)
= Height(J) ≥ 2″ + Height(B)
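
The classical treatment in (73b) can be spelled out in a few lines. This is a minimal sketch assuming heights in whole inches on a hypothetical finite scale; the function names are illustrative.

```python
# Sketch of the classical differential comparative (73b).
# Assumption (not from the paper): degrees are whole inches 0..100.

DEGREES = range(0, 101)


def than_clause(height_b):
    """The degree set denoted by the than-clause: all d with Height(B) >= d."""
    return {d for d in DEGREES if height_b >= d}


def differential_comparative(height_j, diff, height_b):
    """(73b): Height(J) >= diff + max of the than-clause degree set."""
    return height_j >= diff + max(than_clause(height_b))
```

The maximality step is what hands the comparative a single degree to which the differential can be added.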

It is not obvious how to incorporate differentials into the NOT-theory, whose
semantics of a simple example is repeated in (74). That is because the
than-clause does not refer to a degree.

(74) a. John is taller than Bill is.

b. ∃d[Height(J) ≥ d & NOT Height(B) ≥ d]

Among the proponents of the NOT-theory, Schwarzschild (2008) discusses
this problem. He proposes to understand (75a) in terms of (75b); I simplify
this to (75c) for the purposes of discussion.

(75) a. John is 2″ taller than Bill is.
b. ∃d[Height(J) ≥ d & 2″(λd′. d′ ≤ d & Height(B) < d′)]
c. 2″(λd′. d′ ≤ Height(J) & Height(B) < d′)
“the degrees between Bill’s height and John’s are a 2″ interval”

[Figure: height scale with H(B) and H(J) marked; the stretch between them is labelled ‘2″ interval’]

The question is how to derive this interpretation. Schwarzschild proposes to
replace NOT in the than-clause with an operator FALL-SHORT. The resulting
LF of our example is given in (75′a) and the semantics of FALL-SHORT in
(75′b). Diff is a variable that is the first argument of FALL-SHORT, to be
bound outside the than-clause and identified with the differential in the
matrix clause (as if the differential was raised out of the embedded clause to
its main clause position).

(75′) a. than [[FALL-SHORT Diff] λd [Bill is d-tall]]
b. FALL-SHORT = λDiff. λD_⟨d,t⟩. λd. Diff(λd′. d′ ≤ d & D(d′) = 0)
c. Diff(λd′. d′ ≤ d & Height(B) < d′)
Bill’s Height is a Diff-large distance below d

We combine with the differential next, as shown in (76). Then, the degree d is
bound and the usual semantic mechanisms combine this with the rest of the
main clause in (77). This derives (75).

(76) a. [2″ er] [λDiff [than Bill is tall]]
b. λd. 2″(λd′. d′ ≤ d & Height(B) < d′)
Bill’s Height is a 2″ distance below d
(77) [∃ [λd [John is d-tall] [2″ er] [λDiff [than Bill is tall]]]
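
The composition in (75′) to (77) can be traced in a short sketch. Several simplifications are assumptions made here and are not part of Schwarzschild's proposal: degrees are whole inches, and a differential such as 2″ is modelled as a predicate that is true of a degree set with exactly two members, which on an integer scale matches the length of a half-open interval.

```python
# Sketch of FALL-SHORT, cf. (75'b). Assumptions (not from the paper): degrees
# are whole inches 0..100, and "n inches" holds of a degree set iff the set
# has exactly n members (the length of a half-open interval on integers).

DEGREES = range(0, 101)


def inches(n):
    """Differential as a predicate over sets of degrees."""
    return lambda degree_set: len(degree_set) == n


def fall_short(diff, D, d):
    """(75'b): diff holds of the degrees d' <= d for which D(d') is false."""
    return diff({d1 for d1 in DEGREES if d1 <= d and not D(d1)})


def diff_taller(height_j, diff, height_b):
    """(77): some degree d that John reaches is such that Bill falls short
    of d by the differential."""
    def bill_is_tall(d):
        return height_b >= d

    return any(height_j >= d and fall_short(diff, bill_is_tall, d)
               for d in DEGREES)
```

Because the degree d is existentially bound, the sketch yields an 'at least' interpretation of the differential, in line with the ≥ in (73b).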

It seems to me that this is a rather substantial modification of the original
NOT-theory. The basic points about than-clause scope interaction remain the
same (as the reader may verify), but some of the explanation is less obvious.
In particular, I don’t see that scopal behaviour of a modal with same clause
negation necessarily predicts scopal behaviour relative to FALL-SHORT, any
more than it predicts scopal behaviour relative to Pi. I also believe that we
lose the explanation of the unacceptability of negative quantifiers. Neither
of the readings associated with the two possible LFs below is necessarily
uninformative. Finally, I no longer see that the FALL-SHORT-theory is simpler
than the Pi-theory.

(78) a. John is 2″ taller than no girl is.
b. [∃ [λd [J. is d-tall] [2″ er] [λDiff [than [[FALL-SHORT Diff] λd [no
girl is d-tall]]]
c. [∃ [λd [J. is d-tall] [2″ er] [λDiff [than [no girl λx [FALL-SHORT
Diff] λd [x is d-tall]]]]]
(78′) ∃d[Height(J) ≥ d & 2″(λd′. d′ ≤ d & [λd. no girl is d-tall](d′) = 0)]
= ∃d[Height(J) ≥ d & 2″(λd′. d′ ≤ d & some girl is d′-tall)]
= John and some girl are at least two inches tall.
(78″) ∃d[Height(J) ≥ d & no girl x : 2″(λd′. d′ ≤ d & Height(x) < d′)]
= no girl is 2″ shorter than John.

I conclude that while the type of analysis discussed in this section — what
one might call scopal theories of quantifiers in than-clauses — has brought
forth some very interesting ideas, there are also unanswered questions. It
may be worthwhile to pursue a scopeless alternative, which is what I will do
in the next section.

3 Analysis: Selection

The strategy I propose in this section is inspired by both Schwarzschild &
Wilkinson and Heim. Schwarzschild & Wilkinson’s use of intervals is retained
in order to be able to interpret a quantifier inside a than-clause. But like Heim,
I attempt to make this move compatible with a simple, standard semantics
of the comparative. The novel aspect of the analysis below concerns how
this is done. I do not adopt a than-clause internal operator Pi and I do not
rely on semantic reconstruction. I propose instead that there is a mechanism
that derives a particular degree from an interval provided by the than-clause.
This degree is compared in the normal way with a matrix clause degree.
The trick will be to ensure that the degree chosen is the right one, i.e. that
the comparison ultimately made reflects the intuitively accessible reading of
the comparative sentence in question. The same selection mechanism will
account for both apparent wide scope and apparent narrow scope readings.
The analysis will not employ a scoping mechanism that is specific to compar-
atives. Its relation to the earlier work discussed above can be simply stated
as ‘keep the intervals, but not the operator’.
Two rationales guide me in pursuing this approach. The first is that a
scoping mechanism inside the than-clause overgenerates in ways that we have
yet to find the means of constraining. Therefore it would be an advantage to
make do without such an extra scopal element. The second is that it remains
a strength of the classical analysis that degree operators combine directly
with expressions referring to degrees, and that differentials in particular can
be accounted for in a direct and straightforward way. Therefore I want to
come out of the calculation of the semantics of the than-clause holding in
my hand the degree we will be comparing things to. The combination of
these two lines of reasoning persuades me to attempt a simplification of
Schwarzschild & Wilkinson, which should of course also cover the apparent
narrow scope data that were problematic for them.
Section 3.1 presents the idea behind the selection analysis and applies it to
straightforward cases. Apparent narrow scope universals are not straightfor-
ward and addressed in Section 3.2. Apparent wide scope existentials similarly
seem problematic and are the issue of Section 3.3. In Section 3.4 I reexamine
comparatives that combine a differential with a quantifier in the than-clause
and propose a refinement of the analysis of the comparative to capture the

3.1 Basic idea and simple cases

I illustrate the idea behind the selection analysis with example (79), which
of course would not in fact require intervals at all. But suppose that we in
general compositionally derive as the meaning of the than-clause a set of
intervals, as suggested in the Schwarzschild & Wilkinson and Heim theories.
Suppose furthermore that this comes from the basic lexical entry of the
adjective, as indicated in (80). This is what I will assume in this section, for
the sake of uniformity (see Section 4 for more discussion). It amounts to (79′)
in the present case. How do I propose to derive the truth conditions of (79a),
namely (79b), from that?

(79) a. John is taller than Bill is.

b. Height(John) > Height(Bill)
(79′) a. ⟦[than Bill is tall]⟧ = λD′. Height(Bill) ∈ D′
b. [Figure: height scale with Bill’s height H(B) marked as a point]

(80) ⟦tall⟧ = λD. λx. Height(x) ∈ D

I suggest that general mechanisms available in such situations enable us (in
fact, force us) to pick from the set of intervals something that is suitable
as the input to the comparative operator repeated in (81). I represent this
selection mechanism as ‘the’ in (79″) for the moment. This subsection asks
what the appropriate meaning for ‘the’ is. (Note that the term ‘selection’ is
not intended to imply that there is a genuine choice; I intend to provide one
semantics for ‘the’.)

(81) ⟦-er⟧ = λd_d. λd′_d. d′ > d

(79″) John is taller than ‘the’ ⟦than-clause⟧

In the present case, ‘the’ could be an operator selecting the shortest interval
from the set, i.e. Bill’s height, cf. (82). This seems a natural choice, given that
all other intervals contain extraneous material and that the point that really
‘counts’ is just Bill’s height.

(82) min(p_⟨⟨d,t⟩,t⟩) = ιD. p(D) & ¬∃D′. D′ ⊂ D & p(D′)
(shortest p interval)

Irene Heim and Danny Fox (p.c.) point out to me that the sense in which
choosing the minimal interval is ‘natural’ is informativity. (83) below states
what the maximally informative propositions out of a set of true propositions
(say, a question meaning) are.

(83) a. m_inf(w)(Q_⟨s,⟨⟨s,t⟩,t⟩⟩) = λq. Q(w)(q) &
¬∃q′[Q(w)(q′) & q ≠ q′ & [Q(w)(q′) → Q(w)(q)]]
b. the maximally informative answers to a question Q(w) (Q(w)
the set of true answers to Q in w) is the set of propositions q in
Q(w) such that there is no other proposition q′ in Q(w) such
that Q(w)(q′) entails Q(w)(q) (i.e. if q′ is in Q(w) then so is q).

Informativity allows us to capture the fact that an appropriate answer to
(84a) is the true answer that entails all the other true answers, i.e. John’s
maximal speed (for example the proposition that he drove 50 mph), and in
a parallel way the minimum amount of flour that suffices in (84b) (see Heim
1994; Beck & Rullmann 1999).

(84) a. How fast did John drive?
λw. λp. ∃d[p(w) & p = λw′. John drove d-fast in w′]
{that John drove 50 mph, that John drove 49 mph, that John
drove 48 mph, . . . }
b. How much flour is sufficient?
λw. λp. ∃d[p(w) & p = λw′. d-much flour is sufficient in w′]
{that 500 g is sufficient, that 501 g is sufficient, that 502 g is
sufficient, . . . }

The definition can be extended to (intensions of) arbitrary sets in the following

(85) m_inf(w)(p_⟨s,⟨α,t⟩⟩) = λq. p(w)(q) &
¬∃q′[p(w)(q′) & q ≠ q′ & [p(w)(q′) → p(w)(q)]]

The instance of this generalization that we will be interested in is (86).

(86) a. m_inf(w)(p_⟨s,⟨⟨d,t⟩,t⟩⟩) = λD. p(w)(D) &
¬∃D′[p(w)(D′) & D ≠ D′ & [p(w)(D′) → p(w)(D)]]
b. the maximally informative intervals out of a set of intervals p(w)
is the set of intervals D such that there is no other interval D′ in
p(w) such that p(w)(D′) entails p(w)(D) (i.e. if D′ is in p(w)
then so is D).

Fox & Hackl (2006) argue that we want to extend the definition from the
question case to others in order to capture the similarity between (84a,b)
above and (87a), (88a). (87a) refers to the maximum speed John reached and
(88a) refers to the minimum amount that suffices, both maximally informative
in the sense of (85). The instance in (86) extends the analogy from (84a,b)
and (87a), (88a) to (87b), (88b).

(87) a. the speed that John drove

b. than John drove
(88) a. the amount of flour that is sufficient
b. than is sufficient

Hence, ‘the’ in (79″) is m_inf, which yields a singleton, combined with taking
from a set its only member (here represented with max). We can understand
these operators as semantic ‘glue’ (a term introduced by Partee 1984, see
also von Stechow 1995): operations that have to enter into composition, in
addition to what the syntax strictly speaking provides, in order to make
the sentence parts combinable. Their presence is required by the need for

(79‴) John is taller than max(m_inf(⟦than-clause⟧))

The simple example allows me to emphasize another aspect of what I call
the selection analysis: there is no choice in ‘selecting’ a point from a set of
intervals. Only one interpretation is possible for (79). The ‘glue’ we have here
is entirely semantic (and not, say, subject to pragmatic variability). Although
we will see in a moment that quantifiers in than-clauses require some more
elaboration, this will be preserved. Selection means, basically, taking from
the minimal interval(s) the maximal element.

3.1.1 Apparent wide scope universals

Let’s return to the now familiar example (89). We take the than-clause to have
the denotation in (89′).

(89) a. John is taller than every girl is.

b. For every girl x: John’s height exceeds x’s height.

(89′) ⟦[than every girl is tall]⟧ = λD′. ∀x[girl(x) → Height(x) ∈ D′]
intervals into which the height of every girl falls

[Figure: height scale with the girls’ heights x1, x2, x3 and John’s height J]

The intuitive truth conditions of (89) can be described as making a comparison
between John’s height and the end point of the interval into which all
the girls’ heights fall. If John is taller than the tallest girl, he is taller than all
of them. Thus I propose that from the denotation of the than-clause that is
given in (89′), we first choose the shortest = maximally informative interval
that fits the description (i.e. that covers all the girls’ heights) and then select
the maximal point of that interval.4

(90) John is taller than Max>(m_inf(⟦than-clause⟧))
= John is taller than the height of the tallest girl

(91) and (92) below provide the relevant definitions. We extend the notion of
the ordering relation underlying our degree scale from degrees to intervals,
(91). We can then define the maximal element of a set of intervals, and finally
the end point of an interval, (92).

(91) a. ordering of degree points: d > d′
d is larger than d′
b. ordering of intervals: I > J iff ∃d[d ∈ I & ∀d′[d′ ∈ J → d > d′]]
I extends beyond J
(92) a. max> := the max relative to the > relation on intervals or degrees
b. Max> (p) := max> (max> (p))
= the end ‘point’ of the interval that extends furthest
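
The selection procedure for (89) can be made concrete with a small sketch. The representation is an assumption made for illustration: degrees are whole centimetres on a hypothetical finite scale, intervals are closed pairs (lo, hi), and maximal informativity is approximated extensionally as having no proper sub-interval in the set, in the spirit of (82) and (86). All function names are illustrative.

```python
# Sketch of selection: Max>(m_inf(than-clause)), cf. (90)-(92).
# Assumptions (not from the paper): integer degrees 140..180 (cm); closed
# intervals as (lo, hi) pairs; m_inf approximated by sub-interval minimality.

SCALE = range(140, 181)
INTERVALS = [(lo, hi) for lo in SCALE for hi in SCALE if lo <= hi]


def contains(interval, d):
    lo, hi = interval
    return lo <= d <= hi


def proper_subinterval(inner, outer):
    return inner != outer and outer[0] <= inner[0] and inner[1] <= outer[1]


def than_every(heights):
    """(89'): the intervals into which every girl's height falls."""
    return [D for D in INTERVALS if all(contains(D, h) for h in heights)]


def m_inf(p):
    """Intervals in p that have no proper sub-interval in p."""
    return [D for D in p if not any(proper_subinterval(E, D) for E in p)]


def extends_beyond(I, J):
    """(91b): some degree in I is larger than every degree in J."""
    return I[1] > J[1]


def max_endpoint(p):
    """(92b): the end point of the interval in p that extends furthest."""
    top = [I for I in p if not any(extends_beyond(J, I) for J in p)]
    return max(hi for _, hi in top)


def taller_than(height, than_clause):
    """(90): compare a height with the selected degree."""
    return height > max_endpoint(m_inf(than_clause))
```

For girls of heights 150, 160 and 170, `m_inf` returns the single interval (150, 170), and the comparison is made with its end point 170, i.e. with the height of the tallest girl.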

We straightforwardly derive the desired meaning. Other universal quantifiers
can be treated in the exact same way. This is illustrated below with the
familiar example containing predict. If my prediction was that John would
be between 1.70 m and 1.80 m tall, then the interval [1.70–1.80] is the unique
shortest interval described by the than-clause. The end point of that interval
is 1.80 m, and the example is correctly predicted to be true if John is taller
than 1.80 m.

4 Fox & Hackl propose to replace maximality with maximal informativity. I have not been able
to develop an analysis that incorporates that proposal. The reason is lack of entailment
among the degrees in the minimal than-clause interval: If I know of a degree d that it falls
in between the height of the smallest girl and the height of the tallest girl, I cannot infer
that a degree d′ larger than d also falls within that interval (d′ might be beyond the height
of the tallest girl) and I cannot infer that a smaller degree d″ also falls within that interval
(d″ might be below the height of the shortest girl). Therefore I will use both maximal
informativity and ordinary maximality.

(93) a. John is taller than I had predicted (that he would be).

b. For every world compatible with my predictions:
John’s actual height exceeds John’s height in that world.
(93′) ⟦[than I had predicted (that he would be tall)]⟧ =
λD′. ∀w[wR@ → John’s height in w ∈ D′]
intervals into which John’s height falls in all my predictions
(93″) John is taller than Max>(m_inf(⟦than-clause⟧))
= John is taller than the height according to the tallest prediction

What I call selection yields the maximum relative to the ordering relation
linguistically given — ‘larger than’ on the size scale in the case of taller. This
follows from more general interpretive mechanisms suggested independently
(compare Jacobson 1995; Fox & Hackl 2006). Application of these mecha-
nisms is required by the need for the than-clause to serve as input to the
comparative operator.
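
The selection step for the predict example can be made concrete with a small computational sketch. It is only an illustration under simplifying assumptions that are not part of the analysis itself: the degree scale is a finite range of whole centimetres, an interval is a pair (lo, hi), maximal informativity is implemented as "has no proper subinterval in the than-clause denotation", and Max> is read as returning the end point of the interval that extends furthest. All function names and the three prediction-world heights are hypothetical.

```python
from itertools import combinations_with_replacement

def intervals(scale):
    """All closed intervals (lo, hi) with lo <= hi over a finite degree scale."""
    points = sorted(scale)
    return list(combinations_with_replacement(points, 2))

def contains(outer, inner):
    """True if interval `inner` lies within interval `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

def m_inf(clause, scale):
    """Intervals satisfying `clause` that have no proper subinterval satisfying it."""
    sat = [i for i in intervals(scale) if clause(i)]
    return {i for i in sat if not any(j != i and contains(i, j) for j in sat)}

def max_gt(interval_set):
    """Max>: end point of the interval that extends furthest (assumed reading of (92))."""
    if not interval_set:
        raise ValueError("Max> is undefined for an empty set of intervals")
    return max(hi for _, hi in interval_set)

# Hypothetical model: John's height (cm) in the worlds compatible with my predictions.
predicted_heights = [170, 174, 180]
scale = range(160, 191)

def than_clause(D):
    """(93'): John's height falls into D in every prediction world."""
    lo, hi = D
    return all(lo <= h <= hi for h in predicted_heights)

standard = max_gt(m_inf(than_clause, scale))

def john_is_taller(actual_height):
    """(93''): John's actual height exceeds the selected degree."""
    return actual_height > standard
```

In this toy model the only maximally informative interval is the one spanning the lowest and the highest predicted height, so the comparison is with the tallest prediction.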

3.1.2 Apparent narrow scope existentials

We can apply the same strategy to narrow scope existentials. This is illus-
trated with (94) below. In contrast to Heim’s analysis and like Schwarzschild
& Wilkinson’s, I assume that the than-clause denotes the set of intervals in
(94′) (once more via the shifted lexical entry for the adjective, (80)). Importantly,
remember that I assume that the shift to intervals must take place
locally, i.e. within the adjective phrase. I do not assume a genuine mobile op-
erator Pi like Heim (2006b) does (whose LF for (94a) would give Pi wide scope
relative to anyone). We dispense with the interpretations for than-clauses
that were attributed to wide scope of the Pi operator.

(94) a. Mary is taller than anyone else is.

b. Mary’s height exceeds the largest degree of tallness reached by
one of the others.
(94′) ⟦[than anyone else is tall]⟧ = λD′. ∃x[x ≠ Mary & Height(x) ∈ D′]
intervals into which the height of someone other than Mary falls

The shortest = maximally informative than-clause intervals will be the heights
of the other relevant people. (Thus we get rid of the intervals immediately.)
Out of these, we choose the maximum. This results in the same meaning as
under the classical analysis. Thus the same selection strategy that we used
above will predict the right truth conditions. The analysis extends to other
apparent narrow scope existentials like be allowed etc.

(95) ----•----•----•-------•------->
         x1   x2   x3      M

(94″) Mary is taller than Max>(m_inf(⟦than-clause⟧))
= Mary is taller than the height of the tallest other person.
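
The same two steps can be traced for the existential case. The sketch below is illustrative only and rests on assumptions of my own: a finite centimetre scale, intervals as pairs (lo, hi), and invented heights for three other people. It shows that the maximally informative intervals collapse to points, and that Max> then picks the tallest of them.

```python
from itertools import combinations_with_replacement

def intervals(scale):
    """All closed intervals (lo, hi) with lo <= hi over a finite degree scale."""
    points = sorted(scale)
    return list(combinations_with_replacement(points, 2))

def contains(outer, inner):
    """True if interval `inner` lies within interval `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

def m_inf(clause, scale):
    """Intervals satisfying `clause` that have no proper subinterval satisfying it."""
    sat = [i for i in intervals(scale) if clause(i)]
    return {i for i in sat if not any(j != i and contains(i, j) for j in sat)}

def max_gt(interval_set):
    """Max>: end point of the interval that extends furthest (assumed reading of (92))."""
    if not interval_set:
        raise ValueError("Max> is undefined for an empty set of intervals")
    return max(hi for _, hi in interval_set)

# Hypothetical heights (cm) of the people other than Mary, and of Mary herself.
others = {"x1": 160, "x2": 168, "x3": 175}
mary = 182
scale = range(150, 191)

def than_clause(D):
    """(94'): the height of someone other than Mary falls into D."""
    lo, hi = D
    return any(lo <= h <= hi for h in others.values())

minimal = m_inf(than_clause, scale)      # one point interval per other person
standard = max_gt(minimal)               # the height of the tallest other person
selection_result = mary > standard
classical_result = mary > max(others.values())
```

Under these assumptions the selection result coincides with the classical narrow scope truth conditions.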

The selection strategy predicts the right truth conditions for these ‘apparent
narrow scope’ and ‘apparent wide scope’ quantifier data without changing
scope. This allows us to predict ungrammaticality of negation straightfor-
wardly, as illustrated below.

3.1.3 Negation

Remember that the unacceptability of (96) could be understood in terms of an
undefined contribution of the than-clause (von Stechow 1984; Rullmann 1995).
The selection analysis presented here can retain this desirable prediction.
The meaning of the than-clause is (96′), in accordance with what is said
above. This is the only meaning possible for the than-clause.

(96) *John is taller than no girl is.

(96′) ⟦than no girl is tall⟧ = λD′. for no girl x: Height(x) ∈ D′
intervals into which the height of no girl falls

(96′) will not yield a well-defined meaning for the comparative. Just as in
the original analysis of these data, the than-clause will not provide us with a
maximum, since there is no largest interval containing no girl’s height. Max>
is undefined; hence negation in the than-clause leads to undefinedness of the
comparative as a whole. Since there is no other option, we no longer face
the problem of ruling out the apparent wide scope reading of the negative
quantifier.
The simple data discussed in this subsection highlight the potential attraction
of the selection analysis. We keep a simple semantics for the comparative
and don’t double interpretive possibilities with a scoping mechanism. Next,
we turn to all the complications.

3.2 Refinement I: Have to–type modals

This subsection concerns universal quantifiers that do not behave like every
girl, predict and other apparent wide scope universals. Remember from Sec-
tion 2 that modals like have to appear to favour a narrow scope interpretation
rather than the apparent wide scope interpretation described and derived
above for other universals.

(97) Mary wants to play basketball. The school rules require all players to
be at least 1.70 m.
(97′) a. Mary is taller than she has to be.
b. Mary’s actual height exceeds the degree of tallness which she has
in all worlds compatible with the school rules;
i.e. Mary’s actual height exceeds the required minimum, 1.70 m.

Keeping stable our assumptions about the meaning of than-clauses, we
will assume (98) for this example. Selecting the maximum of the shortest
than-clause interval will not yield the desired truth conditions this time,
though: that would amount to the claim that Mary’s height exceeds the
maximum height permitted. The sentence intuitively says that Mary is above
the required minimum. Contrasts like the one between have to and predict
are of course what motivates the scope analysis (apparent wide scope for
predict, apparent narrow scope for have to). A different description of the
facts is that the example with predict (and similar examples with every girl,
should, etc.) has a ‘more than maximum’ interpretation while have to can
have a ‘more than minimum’ interpretation. I see the task for my approach
as having to explain how factors independent of comparative semantics may
result in a ‘more than minimum’ interpretation rather than the expected
‘more than maximum’ reading.

(98) ⟦than she has to be tall⟧ = λD′. ∀w[wR@ → Mary’s height in w ∈ D′]
intervals into which Mary’s height falls in all worlds compatible with
the rules
the beginning of this interval is below Mary’s actual height, i.e. Mary’s
height exceeds the minimal element of the shortest than-clause interval

There are two analyses, as far as I am aware, that propose to reduce the
variation in the interpretation of than-clauses with universal modals between
maximum and minimum interpretation to independent factors, such that
the readings collapse into one. Meier (2002) proposes that the ordering
source that modal semantics uses is responsible for a contextually guided
determination of the interpretation, explaining away apparent maxima and
minima both. Krasikova (2008) examines the problem of have to–type modals
in comparatives in particular and employs covert exhaustification to explain
away apparent ‘more than minimum’ interpretations. While both approaches
solve the problem at hand equally well for my purposes, I describe below
Krasikova’s suggestions because they seem to me to offer more promise for
identifying which modal operators give rise to which reading(s).
Krasikova (2008) points out that whether we get a ‘more than minimum’
reading like the one illustrated above for this type of modal or a ‘more than
maximum’ reading parallel to the reading illustrated for predict depends on
the context an individual example is put into. Remember example (99) from
above, which shows that have to–type modals may also give rise to a ‘more
than maximum’ reading — the reading we expect under the present analysis.5
Thus what distinguishes have to–type modals from others is the availability
of an apparent narrow scope reading (a ‘more than minimum’ reading under
the present perspective).

(99) He was coming through later than he had to if he were going to retain
the overall lead. (from Google, cited from Krasikova 2008)

Krasikova further observes that the universal modals that can give rise to
the ‘more than minimum’/apparent narrow scope reading are just the ones
that occur in sufficiency modal constructions (SMC). An example of an SMC
is given below (von Fintel & Iatridou 2005).

(100) You only have to go to the North End (to get good cheese).

5 It is not at present clear to me under what circumstances a have to–type modal seems
to permit a more-than-maximum interpretation. Relevant factors may be the choice of a
negative polar adjective and a subjunctive-like interpretation (Danny Fox and Irene Heim,
p.c.). Personally, I find this interpretation very hard to get.

(100′) Truth conditions: You do not have to do anything more difficult than
                         to go to the North End (to get good cheese).
       Implicature:      You have to go to the North End or do something at
                         least as difficult (to get good cheese).

The combination of only and a modal in the SMC considers alternatives to the
proposition that is the complement of have to, and ranks those alternatives
on a scale. Plausible alternatives for our example and their ranking are given
in (101). They provide the domain of quantification, C in (102); (102a) sketches
a structure for the example, (102b) a meaning for ‘only have to’ and (102c)
the outcome, which corresponds to the desired truth conditions (100′). Note
that the SMC reading is one that identifies the point on a scale that is the
minimum sufficiency point, as illustrated in (103).

(101) a. that you go to the nearest supermarket, that you go to the North
End, that you go to New York, that you go to Italy
b. SUPER < NE < NY < Italy (where ‘<’ means: is easier than)
(102) a. [[only have to]C,< [you go to the North End]]
b. ⟦[only have to]C,<⟧(p)(w) = 1 iff
∀q[q ∈ g(C) & ¬(q < p) → ¬⟦have to⟧(q)(w)]
c. For all q such that q is in g(C) and ¬(q < NE): ¬⟦have to⟧(q)
(103) ------------•------------>
       necessary     not necessary

My sketch leaves unaddressed all the thorny problems of the SMC construc-
tion like the composition of only and have to with the rest of the clause,
and the problem of only’s presupposition; compare in particular von Fintel &
Iatridou 2005 and Krasikova & Zhechev 2006. What is important for present
purposes is Krasikova’s observation that the interpretation that have to–type
modals give rise to in than-clauses can be seen as an SMC interpretation. The
‘more than minimum’ interpretation just like the SMC identifies the point
on a scale that is the minimum sufficiency point. Whatever is a plausible
analysis of the SMC should be extendable to the problem at hand.

(104) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _/
• not necessary
1.70 m

Krasikova suggests that have to–type modals can use Fox’s (2007) covert
exhaustivity operator EXH instead of only, whose meanings are basically the
same. This is what happens in our comparatives, and this is responsible for
the ‘more than minimum’ interpretation.6 A structure for the than-clause
of (97′a) is given in (105a). Its interpretation using (102b) is (105b). Suppose
now that the relevant alternatives are the propositions in (106a), which place
Mary’s height in varying intervals. Our context is such that difficulties arise
with respect to reaching a certain height. Being short is not hard, being
tall is difficult. Thus the ordering of the alternatives in (106a) is one that
ranks them according to the height of the interval on the tallness scale into
which Mary’s height falls. The requirement easiest to meet is the minimal
compliance height. Given this, (105b) can be paraphrased as (105c).

(105) a. ⟦(than) [1 [[EXH has to]C,< Mary be t1 tall]]⟧
b. [λD′. ∀q[q ∈ g(C) & ¬(q < λw. M’s height in w ∈ D′) →
¬⟦have to⟧(q)]]
c. [λD′. nothing more difficult is required than for Mary’s height
to fall within D′]
(106) a. {λw. Mary’s height in w ∈ D1 , λw. Mary’s height in w ∈ D2 ,
λw. Mary’s height in w ∈ D3 , . . . }
b. If the ordering in terms of height is D1 < D2 < D3 . . . then:
λw. M’s height in w ∈ D1 < λw. M’s height in w ∈ D2 <
λw. M’s height in w ∈ D3 < . . .
(where ‘<’ means: is easier; in our context, being shorter is easier
than taller.)

Applying maximal informativity as usual yields the meaning below for the
subordinate clause, the minimum ‘point’ as desired. Selection with Max>
is trivial; the resulting meaning is that Mary’s actual height exceeds the
minimum compliance height.

6 As an anonymous reviewer points out, this raises the question of why we cannot have an
overt only in such sentences, cf. the ungrammaticality of (ia). The editors point out that
extraction of the associate of only is not good, cf. (ib). This would have to be different for
EXH than for only in order to answer the reviewer’s question.

(i) a. *Mary is taller than she only has to be.

b. *WhoF did Mary only _ call?

(107) m_inf([λD. nothing more difficult is required than for Mary’s height
to fall within D])
= {the minimum compliance height}
= {[1.70–1.70]}

SMC readings of have to–type modals explain the ‘more than minimum’
reading that they can give rise to in comparative than-clauses with the single
assumption that EXH takes the place of only. Internal to the subordinate
clause, exhaustification occurs. Exhaustification of the than-clause reduces
the than-clause interval to a point. The ‘point’ that exhaustification yields is
the minimum compliance height.
I follow Krasikova in making the connection between SMC use and ‘more
than minimum’ readings and in her analysis in terms of exhaustification. This
allows me to maintain the selection analysis from the previous subsection.
According to this analysis, have to–type modals don’t require any revision of
the semantics of comparative constructions. We need to take into account
the special semantics of SMC modals instead. Contrary to appearances,
we uniformly select a degree from an interval via Max> ; with have to, we
may apply Max> after exhaustification. This gives rise to a ‘more than
minimum’/apparent narrow scope reading. If exhaustification does not apply,
we get the regular ‘more than maximum’ = apparent wide scope reading (cf.
example (99) above). Modals that do not permit an SMC reading do not permit
a ‘more than minimum’ reading either, because the ‘more than minimum’
reading is an SMC reading. I refer the reader to Krasikova 2008 for further
discussion. Crucially for present purposes the correlation with SMC use
provides an independent criterion for when to expect which reading. The
contrast between the different kinds of universal quantifiers is not analysed
as a scope effect. The analysis argued for here makes the interpretation of
have to–type modals a property of those particular lexical items. They are
the only apparent narrow scope items requiring special attention since in
contrast to the scope analysis’ procedure, apparent narrow scope existentials
have already been taken care of.

3.3 Refinement II: Indefinites, numeral NPs and the like

This section concerns existential quantifiers that do not behave like NPI any
and other apparent narrow scope existentials. The problem for the selection
strategy can be illustrated by the example below.

(108) John is taller than exactly five of his classmates are.

(108′) a. Exactly five of John’s classmates are shorter than he is.
b. #John is taller than the tallest of his 5 or more classmates.

The intuitively available interpretation (108′a) looks once more like a
straightforward wide scope reading of the numeral quantifier. Application of the
selection strategy predicts an interpretation that is unavailable, (108′b), as
illustrated below.

(109) λD′. for exactly 5 x: max(λd. x is d-tall) ∈ D′
intervals into which the height of exactly 5 classmates falls
Max>(m_inf([λD′. for exactly 5 x: max(λd. x is d-tall) ∈ D′])) =
the height of John’s tallest classmate, as long as there are at least 5

--•---•---•----•---•---•---•--•------>
  c1  c2  c3   c4  c5  c6  c7 c8
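
The mismatch in (109) can be checked mechanically on a toy model. The sketch is illustrative and makes assumptions that are mine, not part of the analysis: eight classmates with invented heights in centimetres, a finite scale, and Max> read as the end point of the interval that extends furthest. It compares the reading predicted by selection with the intuitively available reading (108′a).

```python
from itertools import combinations_with_replacement

def intervals(scale):
    """All closed intervals (lo, hi) with lo <= hi over a finite degree scale."""
    points = sorted(scale)
    return list(combinations_with_replacement(points, 2))

def contains(outer, inner):
    """True if interval `inner` lies within interval `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

def m_inf(clause, scale):
    """Intervals satisfying `clause` that have no proper subinterval satisfying it."""
    sat = [i for i in intervals(scale) if clause(i)]
    return {i for i in sat if not any(j != i and contains(i, j) for j in sat)}

def max_gt(interval_set):
    """Max>: end point of the interval that extends furthest (assumed reading of (92))."""
    if not interval_set:
        raise ValueError("Max> is undefined for an empty set of intervals")
    return max(hi for _, hi in interval_set)

# Hypothetical heights (cm) of John's eight classmates c1..c8.
classmates = {"c1": 150, "c2": 155, "c3": 160, "c4": 166,
              "c5": 170, "c6": 174, "c7": 178, "c8": 181}
scale = range(145, 186)

def than_clause(D):
    """(109): the heights of exactly five classmates fall into D."""
    lo, hi = D
    return sum(lo <= h <= hi for h in classmates.values()) == 5

minimal = m_inf(than_clause, scale)
standard = max_gt(minimal)

def selection_reading(john):
    """(108'b): John exceeds the degree chosen by selection."""
    return john > standard

def wide_scope_reading(john):
    """(108'a): exactly five classmates are shorter than John."""
    return sum(h < john for h in classmates.values()) == 5
```

With John at 172 cm, exactly five classmates are shorter than he is, yet the selected degree is the height of the tallest classmate, so the two readings come apart.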

We face the combined challenge of (i) predicting the right interpretation and
(ii) not predicting the non-existing one. I propose to tackle this problem
through a more thorough analysis of numeral NPs. We will first consider
indefinite NPs in the context of than-clauses and then move on to numerals
and example (108).

3.3.1 Singular and plural indefinites

Singular indefinites allow in principle two interpretations in than-clauses: an
apparent wide scope and an apparent narrow scope reading. Which reading(s)
is/are possible depends on the indefinite as well as the sentence context.
We have seen examples with NPIs in which only the narrow scope reading is
available. An example that has a wide scope reading is given in (110). (111)
and (112) provide two examples which I take to be genuinely ambiguous (the
English version of (111) probably is too, although native speakers seem to
have some difficulty judging the example).

(110) a. John is taller than one of the girls is.

b. There is a girl x such that John is taller than x.

(111) Annett hat lauter gesungen als eine Sopranistin.

Annett has louder sung than a soprano
‘Annett sang more loudly than a soprano did.’ (German)
(111′) a. There is a soprano x such that Annett sang more loudly than x.
b. Annett sang more loudly than any soprano did.
(112) Sveta could solve this problem faster than some undergrad could.
(112′) a. There is an undergrad x such that Sveta could solve this problem
faster than x could.
b. Sveta could solve this problem faster than any undergrad could.

For examples with apparent narrow scope existentials it was demonstrated
above (with an NPI indefinite, anyone else) how the selection analysis can
derive an appropriate interpretation corresponding to the apparent narrow
scope reading. What about the apparent wide scope reading? One option
open to us is to acknowledge that indefinites quite often give rise to apparent
wide scope readings — so-called specific readings — and to adopt whatever
mechanism is appropriate for the analysis of specific readings in general for
apparent wide scope indefinites in than-clauses. This is what I will do, and I
use the choice function mechanism as the probably best known analysis of
specific indefinites (e.g. Reinhart 1992; Kratzer 1998; but see Endriss 2009 for
a different analysis). I illustrate with example (113a) from Heim 1982, where a
friend of mine can have apparent scope over the conditional.

(113) a. If a cat likes a friend of mine, I always give it to him.
There is a friend of mine such that if a cat likes him, I give it to him.
b. ∃f: CH(f) & [if a cat likes f(friend of mine), I give it to him]
If a cat likes the friend of mine selected by f (f a choice function),
I give it to him.

Furthermore, I will assume that indefinite NPs, e.g. with German ein (‘a’),
are ambiguous between the ‘normal’ interpretation ‘∃x’ (existential
quantification over individuals) and the ‘specific’ interpretation ‘∃f’ (existential
quantification over choice functions). Below I provide a selection analysis
of the two readings of (111) under those assumptions.7 On this analysis,
the apparent narrow scope reading amounts to a ‘∃x’ interpretation and
the apparent wide scope reading amounts to a ‘∃f’ interpretation for the
indefinite.

7 I use the German example because the larger English inventory of indefinites makes it hard
for me to determine which examples are genuinely ambiguous.

(114) a. ⟦[als [1 [eine_x Sopranistin t1 laut gesungen hat]]]⟧
= [λD′. ∃x[soprano(x) & max(λd. x sang d-loudly) ∈ D′]]
intervals that cover the loudness of soprano singers
Annett sang more loudly than
Max>(m_inf([λD′. ∃x[soprano(x) & max(λd. x sang d-loudly) ∈ D′]]))
= Annett sang more loudly than the loudest soprano.
= Annett sang more loudly than any soprano did.
b. ⟦[als [1 [eine_f Sopranistin t1 laut gesungen hat]]]⟧
= [λD′. max(λd. f(soprano) sang d-loudly) ∈ D′]
intervals that include the loudness of the soprano selected by f
∃f: CH(f) & Annett sang more loudly than
Max>(m_inf([λD′. max(λd. f(soprano) sang d-loudly) ∈ D′]))
= Annett sang more loudly than the soprano selected by f (f a choice function).
= There is a soprano x such that Annett sang more loudly than x.
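
The two readings in (114) can be compared on a toy model. The sketch below is an illustration under assumptions of my own: invented loudness values on a finite scale, and a choice function on the single set of sopranos modelled simply by the member it selects. It is not meant as an implementation of any particular choice function theory.

```python
from itertools import combinations_with_replacement

def intervals(scale):
    """All closed intervals (lo, hi) with lo <= hi over a finite degree scale."""
    points = sorted(scale)
    return list(combinations_with_replacement(points, 2))

def contains(outer, inner):
    """True if interval `inner` lies within interval `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

def m_inf(clause, scale):
    """Intervals satisfying `clause` that have no proper subinterval satisfying it."""
    sat = [i for i in intervals(scale) if clause(i)]
    return {i for i in sat if not any(j != i and contains(i, j) for j in sat)}

def max_gt(interval_set):
    """Max>: end point of the interval that extends furthest (assumed reading of (92))."""
    if not interval_set:
        raise ValueError("Max> is undefined for an empty set of intervals")
    return max(hi for _, hi in interval_set)

# Hypothetical loudness values for three sopranos and for Annett.
sopranos = {"s1": 70, "s2": 78, "s3": 85}
annett = 80
scale = range(60, 96)

def exists_x_clause(D):
    """(114a): some soprano's loudness falls into D."""
    lo, hi = D
    return any(lo <= v <= hi for v in sopranos.values())

def exists_f_clause(selected):
    """(114b): the loudness of the soprano selected by f falls into D."""
    def clause(D):
        lo, hi = D
        return lo <= sopranos[selected] <= hi
    return clause

# '∃x' reading: Annett is louder than the loudest soprano.
narrow_reading = annett > max_gt(m_inf(exists_x_clause, scale))

# '∃f' reading: some choice of soprano makes the comparison true.
specific_reading = any(
    annett > max_gt(m_inf(exists_f_clause(s), scale)) for s in sopranos
)
```

In this model the ‘∃x’ reading is false and the ‘∃f’ reading is true, which is the pattern expected if (111) is genuinely ambiguous.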

I further assume that the usual factors (in particular, the nature of the
indefinite and what readings the sentence context permits) decide when we
can get which reading(s) of a singular indefinite. I have nothing illuminating
to say about the particulars of this; note, however, that I do assume that
apparent narrow scope readings are possible with indefinites/existentials
other than NPIs. My intuitions regarding German indefinites like jemand
(someone) + anders/sonst (other/else), wh-word + other/else convince me
of this in particular, because these indefinites are not, I believe, plausibly
analysed as polarity items, nor are they plausibly analysed as generic (hence
not existential). Other languages’ inventory of indefinites may make my view
of what the interpretive possibilities of existentials in than-clauses are appear
less obvious. I am grateful in particular to Sveta Krasikova for discussion of
this point.

(115) a. Hier ist es schöner als anderswo.

here is it nicer than elsewhere
‘It’s nicer here than it is elsewhere.’
b. possible reading:
It is nicer here than it is anywhere else.
(116) a. Sam ist schneller als jemand anderes/sonstwer.
Sam is faster than someone other/someone else
‘Sam is faster than another person.’
b. possible reading:
Sam is faster than anyone else is.

Also, the data in (117) (in addition to (111) above) provide an indefinite, ein
anderer (‘another’), that is ambiguous. Both (117a) and (117b) were collected
informally from the web. Context makes it clear that (117a) is intended to
mean ‘faster than everyone else’ and (117b) is intended to mean that someone
was slower.

(117) a. Wir denken 7-mal schneller, als ein anderer reden kann.
we think 7 times faster than an other talk can
‘We think seven times faster than anyone else can talk.’
b. Die meisten überholten mich, aber ab und zu war ich auch
the most passed me but now and then was I also
mal schneller als ein anderer.
once faster than an other
‘Most people passed me, but now and then I was faster than someone else.’

Matters look somewhat different when we consider plural indefinites. Beginning
with bare plurals, note that many examples sound strange (thank you to
Irene Heim for example (118)).

(118) a. John is taller than a giraffe.

b. ??John is taller than giraffes.

(119) a. Prof. Shimoyama hat einen längeren Beitrag geschrieben
Prof. Shimoyama has a longer contribution written
als eine Doktorandin.
than a Ph.D. student
‘Prof. Shimoyama wrote a longer contribution than a Ph.D. student.’
(ok: ∃x, ok: ∃f)
b. ??Prof. Shimoyama hat einen längeren Beitrag geschrieben
Prof. Shimoyama has a longer contribution written
als Doktorandinnen.
than Ph.D. students
‘Prof. Shimoyama wrote a longer contribution than Ph.D. students.’
(120) a. Hans ist schneller gelaufen als eine Schwester von Greg.
Hans ran faster than a sister of Greg’s. (ok: ∃x, ok: ∃f )
b. ??Hans ist schneller gelaufen als Schwestern von Greg.
Hans ran faster than sisters of Greg’s.
c. Hans ist schneller gelaufen als einige Schwestern von Greg.
Hans ran faster than several sisters of Greg’s. (ok: ∃f )

The version with the singular indefinite can have an apparent narrow scope or
an apparent wide scope interpretation (with some speaker variation regarding
which interpretation is favoured). It is known that bare plurals prefer narrow
scope interpretations — let’s say this implies that the choice function ‘∃f ’
interpretation is dispreferred. What the oddness of the plural data tells us,
then, is that there is something unexpectedly wrong with the non-specific ‘∃X’
interpretation of the plural indefinite (I write capital ‘X’ to indicate plurality,
in contrast to ‘x’ for singular). Note that the data (118)–(120) improve when
some or several/einige is added to the plural indefinite. They then have an
apparent wide scope or ‘∃f’ interpretation. The following generalization emerges:

(121) Max>(m_inf(λD.∃X[. . . ])) is dispreferred relative to
Max>(m_inf(λD.∃x[. . . ])).
A plural indefinite ambiguous between ‘∃X’ and ‘∃f’ will yield ‘∃f’.
A plural indefinite that prefers the ‘∃X’ interpretation will sound odd.

Why should a plural indefinite sound odd unless it can easily receive a
specific interpretation? The generalization is intuitively unsurprising once
we examine the ‘∃X’ interpretation more closely. Careful consideration as to
what it would mean in the case of (120), provided in (122a), reveals that (given
that there is more than one sister of Greg’s) it would be true iff the sentence
with the singular ‘∃x’ (‘any sister of Greg’s’) would be true. I suggest that
this makes the interpretation (122a) somehow inappropriate for the example.
Perhaps this can be seen as a matter of economy: the plural has no purpose,
hence cannot be used gratuitously.

(122) a. #Hans ran faster than
Max>(m_inf([λD′. ∃X[∗sister(X) & ∀x ∈ X: x’s speed ∈ D′]]))
= Hans ran faster than any sister of Greg’s.
b. ∃f: CH(f) & Hans ran faster than
Max>(m_inf([λD′. ∀x ∈ f(∗sister): max(λd. x ran d-fast) ∈ D′]))
= Hans ran faster than each of the sisters selected by f (f a choice function).
(dispreferred with bare plural, ok with some/several)
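
The equivalence behind (122a) can be verified on a small model. The sketch is illustrative and assumes, beyond what the text states, that pluralities have at least two members, that Greg has three sisters with the invented speeds below, and that the scale is finite. It shows that selection over the plural ‘∃X’ denotation and over the singular ‘∃x’ denotation yields the same degree.

```python
from itertools import combinations, combinations_with_replacement

def intervals(scale):
    """All closed intervals (lo, hi) with lo <= hi over a finite degree scale."""
    points = sorted(scale)
    return list(combinations_with_replacement(points, 2))

def contains(outer, inner):
    """True if interval `inner` lies within interval `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

def m_inf(clause, scale):
    """Intervals satisfying `clause` that have no proper subinterval satisfying it."""
    sat = [i for i in intervals(scale) if clause(i)]
    return {i for i in sat if not any(j != i and contains(i, j) for j in sat)}

def max_gt(interval_set):
    """Max>: end point of the interval that extends furthest (assumed reading of (92))."""
    if not interval_set:
        raise ValueError("Max> is undefined for an empty set of intervals")
    return max(hi for _, hi in interval_set)

# Hypothetical speeds of Greg's three sisters.
sisters = {"g1": 10, "g2": 12, "g3": 15}
scale = range(5, 21)

# Pluralities: all groups of at least two sisters (an assumption of this sketch).
pluralities = [X for n in range(2, len(sisters) + 1)
               for X in combinations(sisters, n)]

def singular_clause(D):
    """'∃x': some sister's speed falls into D."""
    lo, hi = D
    return any(lo <= s <= hi for s in sisters.values())

def plural_clause(D):
    """'∃X' as in (122a): every member of some plurality has a speed in D."""
    lo, hi = D
    return any(all(lo <= sisters[x] <= hi for x in X) for X in pluralities)

singular_standard = max_gt(m_inf(singular_clause, scale))
plural_standard = max_gt(m_inf(plural_clause, scale))
```

Both computations select the speed of the fastest sister, so on this model the plural adds nothing to the truth conditions.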

(123) is a first shot at what the relevant constraint might effect. The reading
that survives, (122b), is one in which, compared to the corresponding singular
indefinite, the plural serves a purpose.

(123) Ban on Unmotivated Pluralization (BUMP):

Do not quantify over a plurality if quantification over a singularity
lets you infer the same reference.

It would be good to be able to reduce this phenomenon to other cases with a
similar semantics.8 Below I relate than-clauses to definite descriptions and
embedded questions (I am once more inspired by Danny Fox (p.c.) in making
this connection). The idea is that all three constructions share some sense of
maximality and/or maximal informativity (Fox & Hackl 2006 and the above
considerations). So (124a) refers to the maximal, and in the sense of (85)
above, the maximally informative speed that John ran; (124b) will require
the maximally informative answer, i.e. the maximal speed John reached; and
according to the analysis developed here, (124c) is of course analogous.

(124) a. the speed that John ran

b. how fast John ran
c. than John ran
8 An anonymous reviewer and Danny Fox pointed out to me that a plural is not generally
dispreferred when a singular yields the same interpretation, contrary to a claim I made in an
earlier version of this paper. Negation and other downward monotone environments allow
plural indefinites, as the examples in (i) illustrate. I thank them for pointing out this flaw to
me.

(i) a. We don’t sell apples (??an apple) in this store.
b. There were no women present.

The following three sets of data replace the proper name in (124) with various
kinds of indefinites in the three constructions. The plain singular indefinite
is fine and picks out the fastest speed in the definite description and the
question as well as in the than-clause — in addition to a possible specific
reading. The bare plurals are somewhat odd, which we can explain if a
constraint like the BUMP above is operative (and the ‘∃f ’ interpretation is
dispreferred). The last set, with plural some indefinites, is fine and has the
specific reading. Plural indefinites with some are different from bare plurals
in easily allowing an ‘∃f ’ interpretation.

(125) a. the speed that a sister of Greg’s ran

b. how fast a sister of Greg’s ran
c. than a sister of Greg’s ran
(126) a. ??the speed that sisters of Greg’s ran
b. ??how fast sisters of Greg’s ran
c. ??than sisters of Greg’s ran
(127) a. the speed that some sisters of Greg’s ran
b. how fast some sisters of Greg’s ran
c. than some sisters of Greg’s ran

These data share the problem of having to determine unique reference from
a set via maximality/informativity. They motivate the way that the BUMP
is phrased above. Perhaps it is the nature of maximality/informativity as
‘glue’ that makes it sensitive to such a constraint: the step of postulating
such operators is an inference one draws to have things make sense, and
such inferences are subject to ‘making sense’-type of requirements like the
BUMP. But I hasten to add that I am by no means confident that I understand
what is at stake and that more work ought to be done in figuring out what
the BUMP is really about.
I conclude this subsection with a couple of comments on further kinds
of indefinites. The first data point confirms the perspective on the data
developed so far with the German example (128), where the obligatorily
weak lauter (several/many) sounds very strange. Only einige (several) is
acceptable, under an apparent wide scope reading.

(128) Annett hat lauter gesungen als einige/??lauter Sopranistinnen.

Annett has louder sung than several sopranos
‘Annett sang more loudly than several sopranos.’

This can be understood if lauter disprefers a choice function analysis, permitting
only the BUMP violating reading (128′a), while einige yields an acceptable
interpretation in terms of (128′b). Our assumption about lauter vs. einige
is confirmed by (129), where only the version with einige allows the specific
interpretation of the NP ‘relatives of mine’.

(128′) a. #Annett sang more loudly than
Max>(m_inf([λD′. ∃X[∗soprano(X) &
∀x ∈ X: max(λd. x sang d-loudly) ∈ D′]]))
= Annett sang more loudly than any soprano.
b. ∃f: CH(f) & Annett sang more loudly than
Max>(m_inf([λD′. ∀x ∈ f(∗soprano): max(λd. x sang d-loudly) ∈ D′]))
= Annett sang more loudly than each of the sopranos selected
by f (f a choice function)
(129) a. Wenn einige Verwandte von mir sterben, erbe ich einen Bauernhof.
b. Wenn lauter Verwandte von mir sterben, erbe ich einen Bauernhof.
‘If several relatives of mine die, I will inherit a farm.’

Similarly, we might expect that NPIs in than-clauses will only be licensed on
the apparent narrow scope reading ‘∃x’ (perhaps they have no ‘∃f’ interpretation,
or perhaps that interpretation would fail to satisfy the licensing
requirements on their context). This predicts that singular NPIs only have
an apparent narrow scope reading. It also makes the interesting prediction
that plural NPIs should be odd in than-clauses. (130b) is judged degraded
compared to (130a) and (130c) by some speakers, but not by all.

(130) a. John solved this problem faster than any girl did.
b. ??John solved this problem faster than any girls did.
c. John solved this problem faster than any of the girls did.

I don’t understand why some people judge (130b) to be fine; I wonder whether
a Free Choice interpretation of any girls is possible for those who accept the
example.
A final remark: it is not the case that plural indefinites in than-clauses
are generally bad, not even narrow scope ones. The data in (131) embed the
indefinite beneath another operator, and the BUMP does not apply.

(131) a. More people bought books than read magazines.

b. I buy books more often than I buy magazines.

To sum up: indefinites are semantically ambiguous, and this shows up in
than-clauses just like it does elsewhere. Apparent wide scope of indefinites is
analysed as pseudoscope: a specific reading. Sometimes one interpretation is
excluded by independent factors. In particular an economy constraint BUMP
can rule out ‘∃X’ for plural indefinites in than-clauses.9 The analysis rests on
how the semantic glue interacts with intervals, and on how the interpretation
is derived. I assume that the semantic glue is sensitive to BUMPy constraints,
i.e. that it is a natural place for their application.

3.3.2 Numerals

With these results regarding indefinites in place, let us next be somewhat
more precise in our semantic analysis of ‘exactly n’. Like Gajewski (2008), we
employ a more elaborate analysis of these numerals (compare Hackl 2001a,b;
Krifka 1999 on the semantics of such NPs). Remember the simple example
(66) and its analysis.

9 It is not clear to me that competing analyses of quantifiers in than-clauses can easily explain
the pattern of singular vs. plural indefinites. To give an example, the Pi analysis (supposing
it goes along with my assumptions about the semantics of plural indefinites) predicts for (ia)
a narrow scope reading (ic) in addition to the wide scope reading (ib).

(i) a. John was faster than (some) sisters of Greg’s were.
b. ∃X[∗sister(X) & ∀x ∈ X: Speed(John) > Speed(x)]
‘Some sisters of Greg’s were slower than John.’
c. Speed(John) > max(λd. ∃X[∗sister(X) & ∀x ∈ X: Speed(x) ≥ d])
‘John’s speed exceeds the speed reached by the slowest member of a plurality of
sisters of Greg’s’
= John was faster than the second fastest sister of Greg’s.
d. ∃d[Speed(John) ≥ d & NOT ∃X[∗sister(X) & ∀x ∈ X: Speed(x) ≥ d]]
e. Suppose Greg has three sisters:
--•---------•---------•-------->
  x1        x2        x3
largest speed reached by every member of a plurality of sisters of Greg’s

An interpretation corresponding to (ic) is not available and would have to be excluded (in
the plural case, but not in the singular). The reading predicted by the NOT-theory, (id), is
parallel. Depending on how hard it is to do so, an argument might be gained for the selection
analysis from the pattern of singular vs. plural indefinites in than-clauses.

(66) a. Exactly three girls weigh 50 lb.
b. [EXACT [XP (exactly) threeF girls weigh 50 lb.]]
(66′) ⟦threeF girls weigh 50 lb.⟧o = ∃X[∗ girl(X) & card(X) = 3 & ∗ weigh.50.lb(X)]
⟦threeF girls weigh 50 lb.⟧f =
{∃X[∗ girl(X) & card(X) = n & ∗ weigh.50.lb(X)] : n ∈ N}
(67) ⟦EXACT⟧(⟦XP⟧f)(⟦XP⟧o) = 1
iff ⟦XP⟧o = 1 & ∀q ∈ ⟦XP⟧f : ¬(⟦XP⟧o → q) → ¬q
‘Out of all the alternatives of XP, the most informative true one is the
ordinary semantics of XP.’
(68) ⟦(66b)⟧ = 1 iff
∃X[∗ girl(X) & card(X) = 3 & ∗ weigh.50.lb(X)] &
∀n[n > 3 → ¬∃X[∗ girl(X) & card(X) = n & ∗ weigh.50.lb(X)]] iff
max(λn. ∃X[∗ girl(X) & card(X) = n & ∗ weigh.50.lb(X)]) = 3
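The equivalence stated in (68) can be checked on a toy model. The following is only a sketch under simplifying assumptions: the list of weights is invented, the alternatives are restricted to n between 1 and the number of girls, and entailment between alternatives is hard-coded as "m ≤ n" (a plurality of n girls contains one of m girls). The function names are illustrative and not part of the analysis.

```python
def alt(n, weights, target=50):
    """Alternative 'n_F girls weigh 50 lb': some plurality of n girls each weighs `target`."""
    return sum(1 for w in weights if w == target) >= n

def entails(n, m):
    """Assumption: alt(n) entails alt(m) exactly when m <= n."""
    return m <= n

def exact(n, weights, target=50):
    """(67): alt(n) is true, and every alternative it does not entail is false."""
    domain = range(1, len(weights) + 1)
    return alt(n, weights, target) and all(
        entails(n, m) or not alt(m, weights, target) for m in domain
    )

def max_n(weights, target=50):
    """Last line of (68): the largest n whose alternative is true (None if there is none)."""
    true_ns = [n for n in range(1, len(weights) + 1) if alt(n, weights, target)]
    return max(true_ns) if true_ns else None

weights = [50, 50, 50, 47, 62]   # invented weights of five girls, in lb
print(exact(3, weights), exact(2, weights), max_n(weights))   # True False 3
```

On this model, `exact(n, ...)` holds for exactly the n returned by `max_n`, which is the content of the final "iff" in (68).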

This step does not immediately solve our problem. If we give the than-clause
in (108) the semantics in (132), nothing changes: we still compare with the
tallest of John’s classmates, as long as there are at least five. Notice, however,
that this interpretation is just as strange as the plain plural indefinite ‘∃X’
interpretation above, since the number information serves no real purpose
for the truth conditions.

(108) John is taller than exactly five classmates of his are.
(132) λD′. max(λn. ∃X[∗ classmate(X) & card(X) = n & ∗ Height(X) ∈ D′]) = 5
Intervals into which the height of exactly five of John’s classmates falls
(133) John is taller than
Max>(m_inf(λD′. max(λn. ∃X[∗ classmate(X) & card(X) = n &
∗ Height(X) ∈ D′]) = 5))
(133′) Presupposition: John has at least five classmates.
Assertion: He is taller than any of them.

This reading is thus ruled out by the same constraint BUMP. We should then
alternatively consider a choice function analysis of the indefinite ‘n classmates’.
I combine this below with the assumption that exactly is evaluated in
the matrix clause. In (134), we derive the desired interpretation.

(134) max(λn. ∃f [CH(f) & John is taller than
Max>(m_inf(λD′. ∀x ∈ f(λX. ∗ classmate(X) & card(X) = n) :
Height(x) ∈ D′))]) = 5
‘The largest number n such that John is taller than the tallest of the
n classmates of his selected by some choice function f is 5.’

An LF of example (108) representing a version of Krifka’s analysis looks as in
(135).

(135) a. [EXACT [John is taller [than Max> m_inf [(exactly) 5F of his classmates
are tall]]]]
b. Out of all the alternatives of the form ‘John is taller than n of his
classmates are’, the most informative true one is ‘John is taller
than 5 of his classmates are’.

The applicability of the constraint BUMP to numeral indefinites is empirically
supported by the data below, which behave in a parallel way to plural
indefinites with some, for example.

(136) a. the speed that two finalists drove
b. how fast two finalists drove
c. than two finalists drove

Thus I suggest that a proper semantic analysis of numeral NPs makes the
facts compatible with a selection solution after all.

3.3.3 Further relevant cases

The analysis developed here for indefinite NPs in than-clauses needs to be
extended to NPs with many and most, which show the same apparent wide
scope interpretations we observed for numerals.

(137) a. John is taller than many of his classmates are.

b. There are many classmates of John’s such that he is taller than
they are.
(138) a. John is taller than most of his classmates are.
b. For most x, x a classmate of John’s: John is taller than x.


I will make further use of the semantics developed by Hackl (2001a,b, 2009)
for these NPs, according to which ‘many N’ is an indefinite NP including a
gradable adjective in the positive form, and ‘most N’ is correspondingly a
superlative. This makes feasible analyses that can be paraphrased in the
following way:10

(137′) John is taller than the tallest of the many-membered group of classmates
of his selected by f (f a choice function).
(138′) John is taller than the tallest of the group selected by f, which
comprises a majority of his classmates (f a choice function).

More detailed analyses are given below ((139) provides the two potential
readings of (137) and (140)–(142) analyse (138)). Besides being able to predict
the existing readings, the BUMP constraint in (123) will rule out the ones that
are intuitively unavailable.

(139) a. #John is taller than
Max>(m_inf([λD′. ∃X[∗ classm(X) & many(X) & ∀x ∈ X :
Height(x) ∈ D′]]))
= John is taller than any classmate (as long as there are many).
b. ∃f : CH(f) & John is taller than
Max>(m_inf([λD′. ∀x ∈ f(λX. ∗ classm(X) & many(X)) :
Height(x) ∈ D′]))
= John is taller than each of the many classmates selected by f
(f a choice function)
(140) ⟦than [1 [œX most of his classmates are t1 tall]]⟧ =
[λD′. ∃X∃d[∗ classm(X) & d-many(X) & ∀Y ∈ C[Y ≠ X & ∗ classm(Y) →
¬d-many(Y)] & ∀x ∈ X : Height(x) ∈ D′]]
intervals that contain the heights of a majority of John’s classmates
(141) ⟦than [1 [œf most of his classmates are t1 tall]]⟧ =
[λD′. ∀x ∈ f(λX. ∃d[∗ classm(X) & d-many(X) & ∀Y ∈ C[Y ≠
X & ∗ classm(Y) → ¬d-many(Y)]]) : Height(x) ∈ D′]
intervals that contain the heights of the majority of John’s classmates
selected by f

10 An anonymous reviewer points out that this predicts that these NPs can have the same
specific readings we know from indefinites. I concur, but would like to point out that this
prediction arises from an analysis of these quantifiers as indefinites, not from the application
of that analysis to than-clauses. The empirical test cases include data like (i) below.

(i) a. If many relatives of mine die, I will inherit a farm.
b. If most relatives of mine die, I will inherit a farm.

(142) a. #John is taller than
Max>(m_inf([λD′. ∃X∃d[∗ classm(X) & d-many(X) & ∀Y ∈ C[Y ≠
X & ∗ classm(Y) → ¬d-many(Y)] & ∀x ∈ X : Height(x) ∈ D′]]))
= John is taller than the tallest of any majority of his classmates.
= John is taller than any of his classmates.
b. ∃f : CH(f) & John is taller than
Max>(m_inf([λD′. ∀x ∈ f(λX. ∃d[∗ classm(X) & d-many(X) &
∀Y ∈ C[Y ≠ X & ∗ classm(Y) → ¬d-many(Y)]]) : Height(x) ∈
D′]))
= John is taller than the tallest of the majority of John’s classmates
selected by f (f a choice function)
= For most x, x a classmate of John’s: John is taller than x.

To sum up: this subsection has analysed the available vs. unavailable readings
of indefinite NPs in than-clauses using a choice function mechanism plus a
constraint on unmotivated pluralization. The formulation of the BUMP in
(123) is offered as a first version of the constraint we need; what we want to
derive is that it is strange to say ‘John is taller than exactly three girls are’ if
we meant, and might as well have said ‘John is taller than any girl is’. Since
this seems eminently reasonable, I am hopeful that a good way of stating
the relevant constraint exists. Given this, the present section has extended
the selection analysis to apparent wide scope indefinite NPs of various kinds
(including numerals, many and most), using a pseudoscope mechanism
argued for extensively for indefinites independently of comparatives. The
comparative semantics itself remains simple.

3.4 Refinement III: Differentials

The final kind of data that does not immediately fall out from the selection
analysis is represented by example (143) below: a than-clause containing a
universal quantifier in combination with a differential.

(143) a. John is exactly 2″ taller than every girl is.
b. For every girl x: John is exactly 2″ taller than x.


Compared to Heim, and also Schwarzschild & Wilkinson, we seem to have
a problem. Heim’s analysis can derive the intuitive interpretation as shown
below.

(144) [[than every girl is tall] [5 [John is exactly 2″ taller t5]]]
(144′) ⟦[than [1 [every girl [2 [[Pi t1] [3 [t2 is t3 tall]]]]]]]⟧ =
λD′. ∀x[girl(x) → Height(x) ∈ D′]
intervals into which the height of every girl falls
(145) ⟦(144)⟧ = [λD′. ∀x[girl(x) → Height(x) ∈ D′]](λd. John is exactly 2″ taller than d)
= for every girl x: John is exactly 2″ taller than x

Choice of Max> on the other hand predicts a different interpretation, which
does not seem right for (143):

(146) John is exactly 2″ taller than Max>(m_inf(⟦than-clause⟧))
= John is exactly 2″ taller than the tallest girl.

The intuitively available reading of (143a) can be described as one in which
we assume that all the girls reach the same height. I call this an assumption
of equality among the individuals universally quantified over, EQ for short.
The EQ appears to speak in favor of a scope solution since it is entailed
by the truth conditions resulting from giving the universal wide scope over
the comparison. It is not entailed by the truth conditions according to
the selection analysis, although it is of course compatible with the truth
conditions in (146) that the girls all have the same height.
Sentence (143a) exemplifies a problem that arises when a than-clause
containing a universal quantifier is combined with a differential that includes
exactly, at most or almost. A differential including at least does not
distinguish between the two sets of truth conditions.

(147) a. John is at most/almost 2″ taller than every girl is.
b. For every girl x: John is no more than 2″ taller than x
c. #John is no more than 2″ taller than the tallest girl.
(148) a. John is at least 2″ taller than every girl is.
b. For every girl x: John is at least 2″ taller than x
c. John is at least 2″ taller than the tallest girl.
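The contrast between the two sets of truth conditions can be made concrete with a small model. This is a sketch only: the heights (in inches) are invented, and the three differentials are modelled as simple predicates over the gap between John's height and a girl's height, which abstracts away from how the differential is composed in the LF.

```python
def wide_scope(john, girls, differential):
    """Truth conditions like (143b): the differential holds of John and every girl."""
    return all(differential(john - g) for g in girls)

def selection(john, girls, differential):
    """Truth conditions like (146): the differential holds of John and the tallest girl."""
    return differential(john - max(girls))

def exactly(n):
    return lambda gap: gap == n

def at_most(n):
    return lambda gap: gap <= n

def at_least(n):
    return lambda gap: gap >= n

john, girls = 66, [60, 62, 64]   # invented heights in inches
for name, diff in [("exactly", exactly(2)), ("at most", at_most(2)), ("at least", at_least(2))]:
    print(name, wide_scope(john, girls, diff), selection(john, girls, diff))
# exactly False True
# at most False True
# at least True True
```

With girls of different heights, the exactly and at most differentials separate the two readings, while at least cannot: every gap is at least n precisely when the smallest gap, the one to the tallest girl, is.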

An unmodified differential does not constitute evidence as strong as an
exactly/at most-type differential, because, while it gives rise to the usual
strengthening implicature that amounts to an exactly reading, this implicature
can be canceled. If we suppose the implicature to be present, the
unmodified differential is parallel to exactly.

(149) a. John is 2″ taller than every girl is.
b. Implicature: John is no more than 2″ taller than every girl is.
c. John is 2″ taller than every girl is, perhaps more.

To sum up the picture so far, differentials with exactly and at most, and
perhaps simple differentials, seem to be problematic for the selection analysis
as opposed to the scope analysis.
However, there is more to say about this issue empirically and theoretically.
Beginning with the theoretical side, note that the interpretation of the
matrix clause in (144) was simplified in terms of not giving the differential
quantifier exactly 2″ independent scope.11 Data like (150) show that such
expressions do take scope, however:

(150) You are allowed to be exactly 6′ tall.
(151) ⟦exactly 6′⟧ = λD. max(D) = 6′
(150′) a. max(λd. ∃w[wAcc@ & you are d-tall in w]) = 6′
The largest permitted height for you is 6′.
b. ∃w[wAcc@ & max(λd. you are d-tall in w) = 6′]
It is permitted that you be exactly 6′ tall.

Hence, in addition to (a more elaborate version of) (144) above, the LF and
interpretation in (152) become possible. For the Pi theory, this leads to
availability of the analysis in (153).

(152) [[exactly 2″] [4 [[than every girl is tall] [5 [John is t4 taller t5]]]]]
(144′) ⟦[than [1 [every girl [2 [[Pi t1] [3 [t2 is t3 tall]]]]]]]⟧ = λD′. ∀x[girl(x) →
Height(x) ∈ D′]
intervals into which the height of every girl falls
(153) ⟦(152)⟧ = [exactly 2″](λd′. [λD′. ∀x[girl(x) → Height(x) ∈ D′]]
(λd. John is d′ taller than d))
= [exactly 2″](λd′. for every girl x: John is d′ taller than x)
= max(λd′. for every girl x: John is d′ taller than x) = 2″
‘The largest amount that John is taller than every girl is 2″.’
11 Thanks to Danny Fox for drawing my attention to this point.


Note that this LF no longer predicts all the girls to have the same height. It
says that John is exactly 2″ taller than the tallest girl, just like the selection
analysis. It is thus not clear that the predictions of the scope analysis are
really different from, and superior to, the selection analysis.
Next, let’s take a closer look at the data. Above, we identified as a problem
that EQ is not predicted, the assumption that all individuals universally
quantified over have the same height (or whatever the gradable predicate
measures). However, the data are quite difficult. While I agree with the
perception in the literature that in (143a) the EQ is plausible, it is clear that it
does not always arise. Below are some examples where it doesn’t; (154)–(156)
are collected from the internet.12 The reader can convince her/himself that
further relevant data can easily be found. The difficulty in determining the
interpretation of data with nominal universal quantifiers is related to the
point mentioned in Section 2 about differentials and intensional verbs. I
mention in (156′) a suggestive example also collected from the web.

(154) Aden had the camera for $100 less than everyone else in town was
(155) WOW! Almost 4 seconds faster than everyone else, and a 9 second
gap on Lance.
(156) Jones was almost an inch taller than the both of them. (the both
of them = John Lennon and Paul McCartney, Jones = Tom Jones.
The author thinks that Jones was 5′11″ and that Paul McCartney was
about 5′10″. John Lennon is reported to be shorter than McCartney
by about an inch.)
(156′) I finished 30 seconds faster than I expected. [. . . ] I know my 300
yard time more accurately now.
(the continuation suggests that the speaker’s expectation was a
range rather than a precise point in time.)

The examples are straightforwardly analysed using Max> to determine the
relevant ‘point’ provided by the than-clause.13 The differential measures the
distance between that and the main clause degree. This is demonstrated for
(155) below.

12 A naive Google search has not unearthed a clearly relevant example with an exactly-
13 A different type of example illustrated below is difficult for both a scope and a selection
analysis. I find it hard to decide what such examples mean precisely. It seems plausible to
me that we select some kind of ‘point’ from the meaning of the than-clause, but not in the
way described in the text.

(155′) a. #For all x, x ≠ Z: (Z was) almost 4 seconds faster than x (wide
scope)
b. (Z was) almost 4 seconds faster than Max>(m_inf(λD′. for all x ≠
Z : Speed(x) ∈ D′))
= Z was almost 4 seconds faster than the next fastest person.
(selection Max>)

We face the task of figuring out what distinguishes (143) from (154)–(156),
i.e. why EQ arises in some data but not all. I would like to ask this question
in terms of how the selection analysis might predict not only (154)–(156),
but also (143). To this effect, let’s take a closer look at the combination of a
differential with a comparative.
Note that we understand a claim like (157a) relative to a plausible level of
granularity. For us to judge (157a) to be true, it is in most contexts sufficient
to be precise up to the level of a few millimeters. Suppose on the other hand
that (157b) is about a sensitive piece of machinery. A one millimeter margin
could very well not be acceptable. This means that what we call John’s height,
or that rod’s length, is actually somewhat fuzzy: it is a ‘blob’ or an interval
on the relevant scale whose size depends on context. The sensitivity to a
level of precision is not represented in the standard truth conditions of the
two examples given in (157′).

(157) a. Mary is exactly 2 cm taller than John is.
b. This rod is exactly 2 cm longer than that rod is.
(157′) a. Height(Mary) = Height(John) + 2 cm
b. Length(this rod) = Length(that rod) + 2 cm

To capture this, I follow Krifka (2007) in assuming that a scale can be divided
into different units. A unit on the scale then has to be identified that can
count as a ‘point’ at the contextually relevant level of granularity. Which
division we assume depends on context. Talking about a length of 1.80 m for
example could then refer to a very short or a somewhat larger stretch of the
scale, depending on the relevant standard of precision/unit size. I talk about
unit size as granularity.

(i) a. Ben was almost a year older than everyone else in his class (because he had just
missed the deadline for the previous school year).
b. #For all x ≠ Ben: Ben was almost a year older than x.
c. #Ben was almost a year older than the next oldest in his class.
d. ?The others’ ages center around a point almost a year younger than Ben.

(158) [Figure: a scale with the point 1.80 m marked, shown under successively finer
divisions into units of 5 cm, 2 cm and 1 cm.]

I make use of Schwarzschild’s (1996) notion of a cover as a division of an
entity into its contextually relevant parts, and apply it to scales in (159).
Covers provide the relevant granularity.

(159) Let ⟨S, >⟩ be a scale. Then Cov is a cover of S if Cov is a set of
subsets of S such that each d in S is in some set in Cov, each set in
Cov is contiguous and no two sets in Cov overlap. Assume Cov to
be the set of intervals that are of the contextually relevant size.

I furthermore revise the definition of an end “point” from (160) to (161) ((161b)
is the informal version, (161c) the more precise version employing covers).
Note that the distinction between points and intervals dissolves under this
view because what we usually call a point is an interval on the scale whose
size depends on context.

(160) a. Max>(p⟨⟨d,t⟩,t⟩) := max>(max>(p))
= the end point of the interval that extends furthest
b. Let S be a set ordered by R. Then maxR(S) = ιs[s ∈ S & ∀s′ ∈
S[sRs′]]
(161) a. Max>(p⟨⟨d,t⟩,t⟩) := end>(max>(p))
= the end ‘blob’ of the interval that extends furthest
b. end>(D) := ιd. d ⊆ D & ¬∃d′[d′ ⊆ D & d′ > d] & d counts as a
point at the relevant level of granularity
c. Let Cov be the set of intervals that are of the contextually relevant
size.
end>,Cov(D) := ιd. d ⊆ D & d ∈ Cov &
¬∃d′[d′ ⊆ D & d ≠ d′ & d′ ∈ Cov & d′ > d]
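The definitions in (159) and (161c) can be illustrated on a discretized scale. The sketch below makes several assumptions that are not in the text: degrees are integers (millimetres), a cover is built by cutting the scale into equal-sized cells, and cells are ordered by their lowest point. The names `make_cover`, `is_cover` and `end_cell` are hypothetical.

```python
def make_cover(lo, hi, unit):
    """Cut the integer scale lo..hi-1 into contiguous cells of `unit` points each."""
    return [tuple(range(start, min(start + unit, hi))) for start in range(lo, hi, unit)]

def is_cover(cells, lo, hi):
    """(159): every degree is in some cell, cells are contiguous, and no two cells overlap."""
    points = sorted(d for cell in cells for d in cell)
    contiguous = all(list(cell) == list(range(cell[0], cell[-1] + 1)) for cell in cells)
    return points == list(range(lo, hi)) and contiguous

def end_cell(D, cells):
    """(161c): the highest cell of the cover that lies wholly inside the interval D."""
    inside = [cell for cell in cells if set(cell) <= set(D)]
    return max(inside, key=lambda cell: cell[0]) if inside else None

D = range(1750, 1805)                 # an invented interval, in mm
fine = make_cover(1700, 1900, 5)      # 5 mm units
coarse = make_cover(1700, 1900, 20)   # 20 mm units
print(end_cell(D, fine)[0], end_cell(D, fine)[-1])       # 1800 1804
print(end_cell(D, coarse)[0], end_cell(D, coarse)[-1])   # 1780 1799
```

The same interval yields a different end "blob" depending on the cover, which is the sense in which the point selected by Max> depends on the contextually given granularity.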


Supposing that we talk about what we roughly call 1.80 m, the meanings of
our two than-clauses could (depending on context, i.e. the relevant cover)
come out as in (162). It is thus in the nature of scales that they have a
part/whole structure whose units are determined in a context-dependent
way.

(162) a. Max>,Cov1(than John is tall) = [1.798–1.803] (a 0.5 cm unit)
b. Max>,Cov2(than that rod is long) = [1.7998–1.8002] (a 0.4 mm
unit)

Let’s consider differentials under this refined understanding of scales. A
differential measures the distance from the “point” referred to in the matrix
to the “point” referred to in the than-clause, “point” being determined by
the relevant unit size. Note that a plausible granularity for the than-clause
has to match the granularity level suggested by the differential. If the two
do not match, an odd sentence results. I call this a granularity clash. In the
example below, we know that it is impossible to determine to the second the
amount of time that it took John to learn French. The than-clause comes
inherently with a coarse granularity, which clashes with the granularity of
the differential in (163b).

(163) a. Mary learned arithmetic faster than John learned French.

b. ?Mary learned arithmetic faster than John learned French by 7
minutes 23 seconds.
c. Mary learned arithmetic faster than John learned French by
several months.

We can generalize from the example as follows. In a comparative of the form
(164a), it must at least be given that the cover of the relevant interval that
the than-clause provides (via informativity) furnishes units that are smaller
than the differential; i.e. (164b) is a requirement for the comparative to make
sense. If that is the case, then the unit picked out as a “point” by Max>
will also be smaller than the differential (164c). The comparative can then
measure the gap between the main clause degree and the maximum of the
than-clause with the differential ((164d)). If the maximum itself is larger, this
will be impossible. In our example, suppose that we can with exceptional
precision determine to the day how long it took Mary to learn arithmetic
and John to learn French. We cannot reasonably measure the gap between
two days in terms of the differential ‘7 minutes 23 seconds’. The level of
granularity relevant for the than-clause has to make sense in relation to the
differential.

(164) a. Main Clause Differential than D
b. for all U ∈ Cov: U < Diff
c. Since Max>,Cov(D) ∈ Cov : Max>,Cov(D) < Diff
d. Max>,Cov(Main Clause) = Diff + Max>,Cov(D)
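The requirement in (164b) can be spelled out numerically for the learning example. This is a minimal sketch in which the condition "U < Diff" is read as a comparison of sizes, all durations are converted to seconds, the cover unit is taken to be one day (the "exceptional precision" case mentioned above), and "several months" is arbitrarily set to three 30-day months. These figures are assumptions for illustration only.

```python
MINUTE = 60
DAY = 24 * 60 * MINUTE

def units_fit(unit_size, differential):
    """(164b), read as a size condition: the cover units are smaller than the differential."""
    return unit_size < differential

unit = DAY                                  # assumed granularity of the than-clause
precise_diff = 7 * MINUTE + 23              # '7 minutes 23 seconds' = 443 s
coarse_diff = 3 * 30 * DAY                  # 'several months', assumed to be 90 days

print(units_fit(unit, precise_diff))        # False: granularity clash, as in (163b)
print(units_fit(unit, coarse_diff))         # True: the differential fits, as in (163c)
```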

The reasoning works out given that the cover, and therefore the unit that
counts as ‘maximal point’, is determined locally, i.e. than-clause internally,
independently of the differential which will then either fit or clash.14
I think that granularity offers an explanation for the interpretive effect I
call EQ. Consider the situation depicted below for (165). If we have no further
information regarding the situation, the girls’ sizes can be far apart. This
would indicate a large interval. The idea is that the semantics of the than-
clause itself indicates possible Covers. There is then a danger that we have a
coarse-grained cover. A reasonable division of [x1–x5] would be
into relatively long units, hence Max> is long. This would be incompatible
with the differential — a granularity clash. That is, a sentence in which the
than-clause indicates a real spread (e.g. because of a universal quantifier)
brings with it the danger of a granularity mismatch with a differential.

(165) John is exactly 2″ taller than every girl is.
(166) _ _ _• _ _ _• _ _ _ _ _ •_ _ _ •_ _ _ _ _ _• _ _ _ _ _• _ _ _ _/
x1 x2 x3 x4 x5 J
m_inf(⟦(than) every girl is tall⟧) = {[x1–x5]}

14 A similar effect can be observed with Covers in the plural domain in examples like (i) below.

(i) a. The women and the men love their child.

b. The Smiths and the Johnsons love their child.

Suppose we are talking about Angelina and Reginald Johnson and Mary and John Smith.
Then the two subjects in (ia) and (ib) refer to the same group, but make different covers
salient (Schwarzschild 1996). By virtue of the cover suggested by the subject, (ia) tends to be
understood as ‘the women love their child and the men love their child’, which is unexpected.
(ib) amounts to ‘the Smiths love their child and the Johnsons love their child’, which is more
expected. The point is that the subject group autonomously makes salient a cover, whether
this leads to a plausible interpretation of the whole or not.


The Cover indicated by the than-clause may agree with the differential
only under an additional assumption of closeness of the individual “points”
covered by the than-clause interval. My suggestion is that if a potential
granularity clash could only be avoided under an additional assumption of
closeness, one tends to assume equality and a default Cover of the than-
clause interval D in terms of the singleton set {D}. This is the EQ. In short,
without an informative context, there is a danger of a granularity clash. The
danger is avoided by the EQ. The EQ would under this analysis be an extra
assumption speakers make in order to ensure that a sentence is meaningful.
(Note that the EQ is not the weakest assumption one could make to ensure
that; perhaps it is the simplest assumption.)
The data above for which the selection analysis automatically makes
good predictions with Max> , (154)–(156), are such that we have a rather clear
expectation about the kind of interval denoted by the than-clause — the range
within which the individual degrees fall is fixed. The context is rich, and
no problems with granularity arise. Thus a genuine Max> interpretation (i.e.
one in which we pick out the maximum from a genuine spread) is possible
without further assumptions. This distinguishes those data from our original
example (165). I suggest that danger of a granularity clash leads to EQ: to
supposing that the ‘points’ that are in danger of being spread over too large
an interval in fact collapse into one. We expect that it should depend on
the amount of information available on the interval covered by the than-
clause whether we get an EQ interpretation or a genuine Max> interpretation.
Additional information to the effect that the points are not the same, but
close enough together for the purposes of the differential, may make the EQ
unnecessary and thus make a genuine Max> interpretation possible for our
EQ data. This appears to me to be correct:

(167) Background: we are running an experiment in which we vary the
growth conditions of seedlings. In particular, we test different
fertilizing agents (ViagraFlor, Dung™, ComposFix and GuanoPlus)
and their effect on how fast our seedlings grow. After two weeks, it
is reported that:
(168) The ComposFix seedlings are exactly 2″ taller than all the others.
(Max> possible)

Danger of granularity clash arises in uninformative contexts and triggers EQ.
I should be able to take the same than-clauses that occurred in Max> examples
and place them into a less fortunate context, and trigger EQ. Again, this
seems the right prediction.

(169) a. This pot dries out exactly 40 min faster than all the others.
(EQ likely)
b. This T-Shirt dries exactly 20 min faster than all the others.
(EQ likely)

We see that minimal pairs can be found that have essentially the same
comparative (differential plus comparative adjective plus than-clause) but
differ as to informativity of background context regarding the than-clause
interval. An uninformative context makes us assume that the interval is
point-like, so that Max> will be well defined and suitable — EQ. If we have
enough background information to be sure that the Max> unit in the than-
clause interval is suitable, we do not panic, make no extra assumptions, and
can get a genuine Max> interpretation as expected.
Things are different with an existential quantifier. Consider (170) against
the same background as before. The minimal than-clause intervals will be the
heights of the individual girls. Max> will be well defined and suitable without
any additional assumptions, and will make this a comparison between John’s
height and the height of the tallest girl, as desired.

(170) John is exactly 2 cm taller than any girl is.
Max>(m_inf(⟦(than) any girl is tall⟧))
(171) _ _ _• _ _ _• _ _ _ _ _ •_ _ _ •_ _ _ _ _ _• _ _ _ _ _• _ _ _ _/

x1 x2 x3 x4 x5 J

I conclude that the selection strategy provides a reasonable perspective
on differential comparatives. It depends on context whether we get an EQ
interpretation or a genuine Max> interpretation, and the selection strategy
can explain this. I will not investigate here what a scope strategy could say
about the data.
A more general remark: At this point in the analysis, a pragmatic element
has entered the picture. The ‘glue’ I have been talking about so far is genuinely
semantic and seems fully determined (as far as I can see) given the
requirement of interpretability. But scales (following the insights represented
by Krifka’s work) require reference to context and include a pragmatic element
in the shape of the cover. In addition to the maximality/informativity
operators themselves, we need the contextually relevant part/whole structure
of the scale to interpret a particular example. Properties of the cover become
relevant in particular in the presence of differentials, and speakers may be
led to make extra assumptions (EQ). The fuzzy nature of the data, in my
opinion, speaks in favour of the idea that some kind of pragmatic glue is
required to make things work out. Depending on the context, speakers may
or may not have an easy time figuring out what the necessary glue is. That
said, a remaining caveat is a more thorough empirical understanding of the
data with differentials.

4 Summary and conclusions

4.1 Summary

Building on work primarily by Schwarzschild & Wilkinson and Heim, I propose
an analysis of quantifiers in than-clauses in which the quantifier is interpreted
inside the than-clause. A shift from degrees to intervals of degrees makes
this possible. Despite appearances, there is no scope interaction between
quantifier and shifter or quantifier and comparison operator. Instead, there
is uniformly selection of a point from the subordinate clause interval. The
analysis takes from Schwarzschild & Wilkinson the step to intervals. It shares
with Heim that comparison is ultimately reduced to comparison of points.
Intervals are not directly compared. In contrast to Heim and the subsequent
NOT-theory, apparent scope effects like the interpretation of have to–type
modals and exactly n NPs have been explained away via recourse to alternative
interpretational mechanisms, which have been argued for independently of
than-clauses (in these two examples: exhaustification and an alternative
semantics for exactly-numerals). My strategy is motivated by the lack of clear
scope interaction in than-clauses.
One feature of the proposal is that the semantics of the comparative
operator is very simple. It is the same semantics that one needs for data like
(172a), namely one in which the first argument of the comparative operator is
a degree, (172c). Maximality is still used in clausal comparatives like the ones
we have discussed, but it is independent of the comparative operator.

(172) a. John is taller than 1.70 m.
b. [[-er [than 1.70 m]] [2 [John is t2 tall]]]
c. ⟦-er⟧ = λd1. λd2. d2 > d1
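The division of labour summarized here, with a than-clause denoting a set of intervals, selection of a point from it, and a comparative operator that only compares degrees, can be sketched over a small finite scale. The sketch rests on assumptions: degrees are integers, intervals are pairs (lo, hi), and m_inf is implemented as "intervals in the set that have no proper subinterval in the set", which is my reading of maximal informativity for this toy setting rather than a definition taken from the text. The heights are invented.

```python
from itertools import combinations_with_replacement

def intervals(scale):
    """All closed intervals (lo, hi) with lo <= hi over a finite, sorted scale."""
    return list(combinations_with_replacement(scale, 2))

def m_inf(p, scale):
    """Assumed m_inf: the intervals satisfying p that have no proper subinterval satisfying p."""
    ps = [D for D in intervals(scale) if p(D)]
    def proper_sub(a, b):
        return a != b and b[0] <= a[0] and a[1] <= b[1]
    return [D for D in ps if not any(proper_sub(E, D) for E in ps)]

def max_gt(ds):
    """Max> as in (160a): the end point of the interval that extends furthest."""
    return max(hi for (_, hi) in ds)

def er(d1):
    """(172c): the comparative operator compares two degrees."""
    return lambda d2: d2 > d1

SCALE = range(155, 181)          # invented scale, in cm
girls = [160, 165, 170]          # invented heights

def every_girl(D):
    return all(D[0] <= h <= D[1] for h in girls)

def any_girl(D):
    return any(D[0] <= h <= D[1] for h in girls)

print(m_inf(every_girl, SCALE))                # [(160, 170)]
print(sorted(m_inf(any_girl, SCALE)))          # [(160, 160), (165, 165), (170, 170)]
print(er(max_gt(m_inf(every_girl, SCALE)))(175))   # True
```

In both the universal and the existential case the selected point is the height of the tallest girl, and the comparative operator itself never sees an interval.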


It is in this sense that the analysis developed here is in my opinion ‘simpler’
than Schwarzschild & Wilkinson’s. The complexity that is no doubt there
in the present analysis consists in the assumption that general interpretive
strategies like informativity and maximality are involved (plus in independent
complications like the availability of specific readings for indefinites
and the like). Also, the semantics is no longer completely determined by
compositional semantics. Data with differentials could only be analysed by
enriching the classical semantics with pragmatic notions (covers, contextual
background). However, this aspect of the proposal is supported by contextual
variability of the judgements and thus has to be part of a successful analysis.
In order to ultimately evaluate the success of my proposals, the whole
approach needs to also be extended to adverbials. I will not attempt to do so
now. Other considerations concern a more detailed analysis of the various
modals (including might) and an investigation of the interaction of several
scope bearing elements inside a than-clause. I give some representative data
below and acknowledge the need for further work on the subject (compare
Schwarzschild & Wilkinson 2002, Heim 2006b, Schwarzschild 2008). Finally, I
admit that I have no analysis for Sauerland’s (2008) example (174), for which
he provides a solution in terms of Heim’s theory.

(173) a. It is hotter here today than it often is in New Brunswick.

b. It is hotter today than it might be tomorrow.
c. Sveta solved this problem faster than someone else could have.
(174) Ekaterina is an odd number of centimeters taller than each of her

These issues are left for future work.

4.2 Where do the intervals come from?

There is one important theoretical question left for the intervals-plus-selection
analysis to answer: where do the intervals come from? In Section 3 I made
the assumption that basic adjective meanings already contained intervals:

(175) ⟦tall⟧ = [λD. λx. Height(x) ∈ D]

I could alternatively have assumed that the operator Pi from Heim 2006b
shifts the standard adjective meaning to (175).

(176) Pi shifts from degrees to intervals: [1 [ AP [Pi t1 ] [3 [ AP t2 is t3 tall]]]]

(177) a. ⟦tall⟧ = [λd. λx. Height(x) ≥ d]
b. ⟦Pi⟧ = [λD. λP. max(P) ∈ D]
c. [λD. ⟦Pi⟧(D)(⟦tall⟧(x))] = [λD. Height(x) ∈ D]
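
The equivalence in (177c) can be checked on a small finite model. The following is only a sketch under assumptions that are not part of the text: degrees are integers on a bounded scale, intervals are finite sets of such degrees, and the names `SCALE`, `HEIGHT`, `tall`, `pi` and `tall_interval` are hypothetical. The degree-based entry uses the monotone orientation that (178a) negates, i.e. Height(x) ≥ d.

```python
# Assumption: a bounded, discrete degree scale (centimetres), for illustration only.
SCALE = range(0, 251)

# Hypothetical heights of two individuals.
HEIGHT = {"john": 170, "mary": 182}

def tall(d, x):
    """Degree-based entry: x is tall to degree d iff Height(x) >= d."""
    return HEIGHT[x] >= d

def pi(D, P):
    """Pi as in (177b): the maximal degree satisfying P lies in the interval D."""
    return max(d for d in SCALE if P(d)) in D

def tall_interval(D, x):
    """Interval-based entry as in (175): Height(x) is a member of D."""
    return HEIGHT[x] in D

# (177c): applying Pi to the degree predicate of x gives the interval entry.
D = set(range(165, 176))
for x in HEIGHT:
    assert pi(D, lambda d: tall(d, x)) == tall_interval(D, x)
```

On this toy model the maximal degree to which an individual is tall is just that individual's height, which is why the shifted meaning and (175) coincide.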

Since Pi on the analysis pursued here always takes scope immediately next to
the adjective, this would have served no particular purpose and I simplified
to (175). But a problem for assuming (175) as the basic meaning of a gradable
adjective is that it is very weak. This creates problems for example for the
negation theory of antonymy (compare e.g. Heim 2006a). (178a) analyses
the negative polar adjective short as the negation of tall. I fail to be able to
imagine how a parallel strategy for the interval based meaning (178b) could
be successful.

(178) a. ⟦short⟧ = [λd. λx. ¬ Height(x) ≥ d] = [λd. λx. Height(x) < d]
b. ⟦short⟧ = λD. λx. Height(x) ∉ D

So if the intervals do not come into the semantics via a motivated independent
(since mobile) operator Pi, and nor are they plausibly basic, how do they come
in? It would be attractive to say that intervals enter the semantics because,
that is, if and only if, they are needed. That is what I would like to think, and
(175) really was a simplification for the sake of uniformity that I think of as
An idea for how to bring intervals into the semantics when needed that is
due to Heim (2009) is given below. We begin by observing that a relation can
be expressed between a plurality and a part of a scale — a degree ‘blob’.

(179) a. (You have to be 5′ tall to enter.)
Our children are that tall.
b. (Bill’s GPA is 3.75.)
Sam’s grades are that good, too.

We see a parallel to expressing a relation between a plurality and a mass noun.

The example (180a) can be represented as in (180b) with the meaning in (180c)
in mind for the relation between the two objects of drink — a cumulative
interpretation (see e.g. Beck & Sauerland 2000 and all the earlier work cited
there that they rely on).

(180) a. Our children drank the milk.
b. ∗∗drank(M)(C)
c. ∀x ≤ C : ∃y ≤ M : drank(y)(x) & ∀y ≤ M : ∃x ≤ C : drank(y)(x)
All children participated in drinking the milk, and all parts of
the milk were drunk by one of the children.

Transferring the analysis to our degree example yields (181).

(181) a. Our children are that tall.
b. ∗∗tall(D)(C)
c. ∀x ≤ C : ∃d ≤ D : tall(d)(x) & ∀d ≤ D : ∃x ≤ C : tall(d)(x)
All the children’s heights fall into D, and all parts of D contain
the height of a child.
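
The cumulation schema in (180c) and (181c) can be made concrete on finite pluralities. What follows is a minimal sketch, assuming that pluralities are finite sets of their atomic parts, that the relevant parts of the degree ‘blob’ are unit-sized cells, and that a child stands in the basic relation to the cell that contains its height. The names `cumulate` and `in_cell` and the sample data are hypothetical.

```python
def cumulate(rel, X, Y):
    """** applied to a two-place relation: every part of X is related to some
    part of Y, and every part of Y is related to some part of X."""
    return (all(any(rel(x, y) for y in Y) for x in X)
            and all(any(rel(x, y) for x in X) for y in Y))

# Hypothetical heights in cm; the plurality C is given as the set of its atoms.
heights = {"ann": 120, "ben": 123, "cleo": 123}
C = set(heights)

def in_cell(x, d):
    """Basic relation: the height of x falls into the degree cell d (1 cm cells)."""
    return heights[x] == d

# D1 holds exactly the attested heights, D2 has an empty part, D3 misses a child.
D1, D2, D3 = {120, 123}, {120, 121, 123}, {123}
print(cumulate(in_cell, C, D1))  # True
print(cumulate(in_cell, C, D2))  # False: no child's height is in cell 121
print(cumulate(in_cell, C, D3))  # False: ann's height is not covered
```

The second conjunct of the schema is what rules out D2: a part of the degree plurality that no child's height falls into makes the cumulative statement false.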

It is easy to apply the same analysis to a than-clause containing a definite

plural, and it yields the set of intervals that we need according to the analysis
in Section 3. Comparison will be with the maximum point in that set and the
sentence is predicted to mean that our children are shorter than John.

(182) a. (John is taller) than our children are.
b. λD. ∗∗tall(D)(C)
c. λD. ∀x ≤ C : ∃d ≤ D : tall(d)(x) & ∀d ≤ D : ∃x ≤ C : tall(d)(x)
intervals that contain the heights of all our children (and nothing else)
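
To illustrate how the than-clause in (182) feeds comparison, the sketch below enumerates candidate degree pluralities on a small discrete scale, keeps those that stand in the cumulated relation to the children, and then selects the maximal point. This is an illustration under stated assumptions, not a full implementation of the analysis: the scale is finite, the parts of a degree plurality are 1 cm cells, convexity of intervals is ignored, and all names and heights are hypothetical.

```python
from itertools import combinations

def cumulate(rel, X, Y):
    """** for a two-place relation over finite pluralities X and Y."""
    return (all(any(rel(x, y) for y in Y) for x in X)
            and all(any(rel(x, y) for x in X) for y in Y))

# Hypothetical data: heights in cm, with 1 cm cells as the parts of the scale.
heights = {"ann": 120, "ben": 123, "cleo": 123}
children = set(heights)
scale = range(118, 127)

def tall(x, d):
    """Basic relation: the height of x falls into cell d."""
    return heights[x] == d

# (182b): all non-empty collections of cells D that stand in **tall to the children.
than_clause = [set(D)
               for n in range(1, len(scale) + 1)
               for D in combinations(scale, n)
               if cumulate(tall, children, set(D))]

# Selection: the maximal point found in the than-clause denotation.
standard = max(max(D) for D in than_clause)

john = 130  # hypothetical height of John
print(than_clause, standard, john > standard)
```

On this toy model the only collection that survives is the one containing exactly the attested heights, so the standard of comparison is the height of the tallest child.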

Note that the notion of degree ‘blobs’ that have a part/whole structure is
anticipated by the reference to covers in Section 3. A cover provides us
with the relevant parts of the degree scale. We are consistently assuming a
mass-like structure of the degree scale. To make the connection clear, (182′)
provides a more complete formalisation of (180a) which includes covers
(compare Beck 2001 for this kind of use for covers).

(182′) a. λD. [∗∗λd. λx. d ∈ Cov & x ∈ Cov & tall(d)(x)](D)(C)
b. λD. ∀x[x ≤ C & x ∈ Cov → ∃d[d ≤ D & d ∈ Cov & tall(d)(x)]] &
∀d[d ≤ D & d ∈ Cov → ∃x[x ≤ C & x ∈ Cov & tall(d)(x)]]
(suppose that the relevant parts of ‘the children’ are the individual
children, and that the relevant parts of the cover are the
units according to granularity)

Example (182)/(182′) derives a set of intervals, as pluralities of degrees, as
the meaning of a than-clause via plural predication. What would we need
to do in order for this idea to apply to the range of data examined in this
paper? I briefly discuss three issues for which this change in perspective is
relevant: (i) universal quantifiers, (ii) singular quantifiers, and (iii) maximal
informativity.
First, regarding universal quantifiers: The introduction of intervals analo-
gously to (182) would have to happen with universal quantifiers of various
kinds, in particular universal nominals and intensional verbs (cf. our two
representative examples every girl and predict). Regarding intensional verbs,
there is a proposal by Bošković & Gajewski (2008) that instead of universal
quantification over worlds (183a) they (or at least some of them) involve sum
formation (183b).

(183) a. ⟦believe_x⟧ = λp. ∀w[w ∈ BEL_x → p(w)]
b. ⟦believe_x⟧ = max(λW. W ∈ ∗BEL_x)

This makes possible the following analysis of a than-clause with an intensional
verb (in the simpler version without covers):

(184) a. (John is taller) than you believe.
b. λD. [∗∗λw. λd. John is d-tall in w](max(λW. W ∈ ∗BEL_you))(D)
c. λD. ∀w ≤ max(λW. W ∈ ∗BEL_you) : ∃d ≤ D : tall(w)(d)(John) &
∀d ≤ D : ∃w ≤ max(λW. W ∈ ∗BEL_you) : tall(w)(d)(John)
intervals that contain John’s height in all your belief worlds (and
nothing else)

Nominal universal quantifiers, it has been observed, can sometimes be used to

introduce a plurality, although this is not always easily possible. Perhaps (185)
involves a reinterpretation as a plural definite NP. The same reinterpretation
would be responsible for the interpretation of the than-clause in (186) in
case the girls are of varying heights. This might make sense of my above-
mentioned intuition that a definite plural is more acceptable than a universal

(185) a. Everyone gathered in the hallway.
b. ?Every student gathered in the hallway.
(186) a. John is taller than every girl is.
b. ‘every girl’ → G (the plurality of girls)
c. λD. ∗∗tall(D)(G)
d. λD. [∀x ≤ G : ∃d ≤ D : tall(d)(x)] & [∀d ≤ D : ∃x ≤ G : tall(d)(x)]
intervals that contain the heights of all the girls (and nothing else)

Thus it can be argued that a plural analysis of intervals can capture these
data.15 The discussion from Section 3 is (almost, see below) unchanged; what
changes is what happens below the level of AP, so to speak (the predication ‘x
is d-tall’): what we assumed to be basic in (175) is now compositionally derived
via pluralization mechanisms. Next, let’s reconsider data with singular
quantificational elements:

(187) a. Mary is taller than anyone else is.
b. *John is taller than no girl is.
(187′) a. John is taller than some girls are.
b. λD. ∃X : [∀x ≤ X : ∃d ≤ D : tall(d)(x)] & [∀d ≤ D : ∃x ≤ X : tall(d)(x)]
c. λD. [∀x ≤ f(∗girl) : ∃d ≤ D : tall(d)(x)] & [∀d ≤ D : ∃x ≤ f(∗girl) : tall(d)(x)]

There would be no reason to introduce intervals in the data with singular

indefinites and negative quantifiers. Remember from Section 3.1 that in these
cases, we got rid of the intervals immediately anyway (maximal informativity
reduced the contribution of the than-clause to the set of individual heights).
Now, we could just revert to the classical analysis for those data. This
is not an unwelcome result, since the classical analysis offers a successful
solution for them. Pluralization as the trigger for the introduction of intervals
will continue to play a role for plural indefinites (see example (187′)); the
discussion in Section 3.3 is thus also in important respects unchanged.
Finally, we need to think once more about the role of maximal infor-
mativity. Plural semantics keeps intervals small. The truth conditions of
cumulation are such that the pluralised relation holds between the plurality
and the smallest interval that covers all the individuals in the plurality (cf.
the second conjunct in (181c) and the following analyses). This may make
m_inf unnecessary, leaving us with iterated maximality. Again this can be
seen as a welcome result.

15 I am not sure at this point what to say about the have to–type modals. Perhaps (as
non-neg-raising verbs) they do not have a plural analysis. We then revert to the classical analysis.
If they do have a plural semantics, the story in Section 3.1 is maintained. The first version
relates the behavior of a modal to neg-raising, the second to SMC use.

The attraction of this approach is, as said above, that intervals enter
the picture only when there is a real need for them. The idea is entirely
compatible with the selection analysis and in my view very desirable. Why
did I not set out in this fashion in Section 3? I am not quite confident enough
of the story in (185), (186), and too many details remain to be worked out,
plus the data need to be examined more carefully. As things stand, readers
sceptical of the ideas sketched in this subsection may take Section 3 as it is,
while others have the beginnings of an analysis of how and why intervals
come into play at all.

4.3 Outlook

Let’s take a step back and think about what an analysis of quantifiers in
than-clauses in terms of selection achieves — beyond the empirical coverage
of the mostly well-known set of data that I have been concerned with above.
Compared to its theoretical competitors, it primarily removes quantifiers
in than-clauses from the realm of scope interaction phenomena. For example,
the interpretive behaviour of quantifiers in than-clauses cannot be seen as
an instance of the Heim/Kennedy generalization (Kennedy 1997; Heim 2001).
The analysis I’ve given in Section 3 violates this generalization.

(188) Heim/Kennedy generalization: [ DegP . . . [ QP [. . . tDegP . . . ] . . . ]]

(189) a. than [1 [every girl is t1 tall]]
b. λD. for every girl x : Height(x) ∈ D

The Heim/Kennedy generalization is motivated in particular by quantifiers
in the matrix clause of comparatives. Suppose that the behaviour of
quantifiers in the matrix clause relative to degree operators is regulated by
a scope constraint deriving the Heim/Kennedy generalization. Then there
would be no theoretical connection between this and than-clause quantifiers.
We would accordingly expect empirical differences between quantifiers in
main clause vs. than-clause. On the other hand, if one were to extend the
requirement of finding a definite degree from the than-clause to the main
clause (a good way of ensuring applicability of the lexical entry in (172c),
note), a parallel analysis could still be pursued. (See once more Heim 2009
for a sketch of such an analysis.) There are some striking similarities between
main clause and than-clause quantifiers that motivate such a step, in
particular (190), (191) below: Both sentences in (190) have an interpretation
that talks about the minimum required length of the paper, and neither
sentence in (191) does.

(190) a. The paper is longer than it is required to be.

b. The paper is required to be less long than that.
(191) a. The paper is longer than it is supposed to be.
b. The paper is supposed to be less long than that.

But there are also apparent mismatches:

(192) a. Hier ist es schöner als anderswo.

here is it nicer than elsewhere
‘It is nicer here than it is elsewhere.’
b. ok: It is nicer here than it is in the most beautiful other place.
(193) a. Anderswo ist es weniger schön als hier.
elsewhere is it less nice than here
‘It is less nice elsewhere than it is here.’
b. ??The most beautiful other place is less nice than it is here.
(194) a. Sam war schneller als jemand anderes.
Sam was faster than someone other
‘Sam was faster than another person.’
b. ok: Sam was faster than the fastest other person.
(195) a. Jemand anderes war weniger schnell als Sam.
Someone other was less fast than Sam
‘Another person was less fast than Sam.’
b. ??The fastest other person was less fast than Sam.

At this point, I do acknowledge interesting empirical parallels, but I am also

worried about apparent differences. I would not wish to be committed at
present to claiming that quantifiers in the main clause behave in the same
way as quantifiers in the than-clause, or that they don’t, and will remain
neutral as to whether the analysis developed here should be extended to
cover matrix clause quantifiers as well.
Instead of making a connection to scope interaction phenomena, the
present analysis is based on a plural/mass-semantics related vagueness plus
semantic and pragmatic glue. It makes the interpretation of quantifiers in
than-clauses more of a coercion-like phenomenon. Perhaps the variable and
partly messy nature of the data can motivate the nature of the analysis.

References

Beck, Sigrid. 2001. Reciprocals are definites. Natural Language Semantics
9(1). 69–138. doi:10.1023/A:1012203407127.
Beck, Sigrid. 2009. Comparatives and superlatives. To appear in Klaus
von Heusinger, Claudia Maienborn, and Paul Portner (eds.), Handbook
of semantics: An international handbook of natural language meaning.
Berlin: Mouton de Gruyter.
Beck, Sigrid & Hotze Rullmann. 1999. A flexible approach to ex-
haustivity in questions. Natural Language Semantics 7(3). 249–298.
Beck, Sigrid & Uli Sauerland. 2000. Cumulation is needed: A re-
ply to Winter 2000. Natural Language Semantics 8(4). 349–371.
Bošković, Željko & Jon Gajewski. 2008. Semantic correlates of the NP/DP
parameter. Proceedings of the North East Linguistics Society 39. URL
Cresswell, Max J. 1977. The semantics of degree. In Barbara H. Partee (ed.),
Montague grammar, 261–292. Academic Press.
Dalrymple, Mary, Makoto Kanazawa, Yookyung Kim, Sam Mchombo & Stan-
ley Peters. 1998. Reciprocal expression and the concept of reciprocity.
Linguistics and Philosophy 21(2). 159–210. doi:10.1023/A:1005330227480.
Endriss, Cornelia. 2009. Quantificational topics: A scopal treatment of ex-
ceptional wide scope phenomena (Studies in Linguistics and Philosophy
(SLAP) 86). Springer. doi:10.1007/978-90-481-2303-2.
von Fintel, Kai & Sabine Iatridou. 2005. What to do if you want to go to
Harlem: Anankastic conditionals and related matters. URL http://mit.
edu/fintel/fintel-iatridou-2005-harlem.pdf. Ms, MIT.
Fox, Danny. 2007. Free choice and the theory of scalar implicatures. In
Uli Sauerland & Penka Stateva (eds.), Presupposition and implicature in
compositional semantics, 537–586. New York: Palgrave Macmillan.
Fox, Danny & Martin Hackl. 2006. The universal density of measurement.
Linguistics and Philosophy 29(5). 537–586. doi:10.1007/s10988-006-9004-4.
Gajewski, Jon. 2008. More on quantifiers in comparative clauses. Proceedings
of Semantics and Linguistic Theory 18. doi:1813/13043.
Hackl, Martin. 2001a. Comparative quantifiers. Ph.D. thesis, Massachusetts
Institute of Technology. URL http://hdl.handle.net/1721.1/8765.
Hackl, Martin. 2001b. A comparative syntax for comparative quantifiers.
Proceedings of the North East Linguistics Society 31.
Hackl, Martin. 2009. On the grammar and processing of proportional quan-
tifiers: most versus more than half. Natural Language Semantics 17(1).
63–98. doi:10.1007/s11050-008-9039-x.
Heim, Irene. 1982. The semantics of definite and indefinite noun phrases.
Ph.D. thesis, University of Massachusetts at Amherst. URL http://
Heim, Irene. 1994. Interrogative semantics and Karttunen’s semantics for
know. In Rhonna Buchalla & Anita Mittwoch (eds.), The proceedings of
the conference of the Israel Association for Theoretical Linguistics (IATL 1),
128–144. Hebrew University of Jerusalem. URL http://semanticsarchive.
Heim, Irene. 2001. Degree operators and scope. In Caroline Féry & Wolfgang
Sternefeld (eds.), Audiatur vox sapientiae: A festschrift for Arnim von
Stechow, 214–239. Berlin: Akademie Verlag.
Heim, Irene. 2006a. Little. Proceedings of Semantics and Linguistic Theory 16.
Heim, Irene. 2006b. Remarks on comparative clauses as generalized quanti-
fiers. URL http://semanticsarchive.net/Archive/mJiMDBlN. Ms, MIT.
Heim, Irene. 2009. A unified account? Handout for ‘Topics in Semantics’,
Heim, Irene & Angelika Kratzer. 1998. Semantics in generative grammar.
Oxford: Blackwell.
Hellan, Lars. 1981. Towards an integrated analysis of comparatives (Ergebnisse
und Methoden moderner Sprachwissenschaft 11). Tübingen: Narr.
Hoeksema, Jack. 1983. Negative polarity and the comparative. Natural
Language and Linguistic Theory 1(3). 403–434. doi:10.1007/BF00142472.
Jacobson, Pauline. 1995. On the quantificational force of English free relatives.
In Emmon Bach, Eloise Jelinek, Angelika Kratzer & Barbara H. Partee (eds.),
Quantification in natural languages (Studies in Linguistics and Philosophy
(SLAP) 54), 451–486. Dordrecht: Kluwer.
Kennedy, Chris. 1997. Projecting the adjective: The syntax and semantics of
gradability and comparison. Ph.D. thesis, University of California, Santa
Klein, Ewan. 1991. Comparatives. In von Stechow & Wunderlich (1991),
chap. 32, 673–691.
Krasikova, Sveta. 2008. Quantifiers in comparatives. Proceedings of Sinn
und Bedeutung 12. 337–352. URL http://www.hf.uio.no/ilos/forskning/

Krasikova, Sveta & Ventsislav Zhechev. 2006. You only need a scalar only. Pro-
ceedings of Sinn und Bedeutung 10. URL http://www.sfb441.uni-tuebingen.
Kratzer, Angelika. 1991. Modality. In von Stechow & Wunderlich (1991),
Kratzer, Angelika. 1998. Scope or pseudoscope? Are there wide-scope indefinites?
In Susan Rothstein (ed.), Events and grammar. Dordrecht:
Krifka, Manfred. 1999. At least some determiners aren’t determiners. In Ken
Turner (ed.), The semantics/pragmatics interface from different points of
view (Current Research in the Semantics/Pragmatics Interface 1), 257–291.
Krifka, Manfred. 2007. Approximate interpretation of number words: A case
for strategic communication. In Gerlof Bouma, Irene Maria Krämer &
Joost Zwarts (eds.), Cognitive foundations of interpretation (Verhandelin-
gen der Koninklijke Nederlandse Akademie van Wetenschappen, Afd.
Letterkunde 190), 111–126. Amsterdam: Royal Netherlands Academy of
Arts and Sciences.
Larson, Richard K. 1988. Scope and comparatives. Linguistics and Philosophy
11(1). 1–26. doi:10.1007/BF00635755.
Link, Godehard. 1983. The logical analysis of plurals and mass terms: A
lattice-theoretical approach. In Rainer Bäuerle, Christoph Schwarze &
Arnim von Stechow (eds.), Meaning, use, and interpretation of language,
Grundlagen der Kommunikation und Kognition, 302–323. de Gruyter.
May, Robert. 1985. Logical form: Its structure and derivation (Linguistic
Inquiry Monographs 12). Cambridge, MA: MIT Press.
Meier, Cécile. 2002. Maximality and minimality in comparatives. Sinn und
Bedeutung 6. 275–287. URL http://www.phil-fak.uni-duesseldorf.de/asw/
Partee, Barbara H. 1984. Compositionality. In Fred Landman & Frank Veltman
(eds.), Varieties of formal semantics (Groningen-Amsterdam Studies in
Semantics (GRASS) 3), 281–311. Dordrecht: Foris.
Reinhart, Tanya. 1992. Wh-in-situ: An apparent paradox. Proceedings of the
Amsterdam Colloquium 8. 483–492.
van Rooij, Robert. 2008. Comparatives and quantifiers. Empirical Issues in
Syntax and Semantics 7. 423–444. URL http://www.cssp.cnrs.fr/eiss7/

Rullmann, Hotze. 1995. Maximality in the semantics of wh-constructions. Ph.D.
thesis, University of Massachusetts at Amherst. URL http://scholarworks.
Sauerland, Uli. 2008. Intervals have holes: A note on comparatives with
differentials. Ms, ZAS Berlin.
Schwarzschild, Roger. 1996. Pluralities (Studies in Linguistics and Philosophy
(SLAP) 61). Kluwer.
Schwarzschild, Roger. 2004. Scope splitting in the comparative. URL http:
//www.rci.rutgers.edu/~tapuz/MIT04.pdf. Handout from a colloquium
talk at MIT.
Schwarzschild, Roger. 2008. The semantics of comparatives and other de-
gree constructions. Language and Linguistics Compass 2(2). 308–331.
Schwarzschild, Roger & Karina Wilkinson. 2002. Quantifiers in comparatives:
A semantics of degree based on intervals. Natural Language Semantics
10(1). 1–41. doi:10.1023/A:1015545424775.
Seuren, Pieter A.M. 1978. The structure and selection of positive and negative
gradable adjectives. In Donka Farkas, Wesley M. Jacobsen & Karol W.
Todrys (eds.), Papers from the Parasession on the Lexicon, Chicago Lin-
guistic Society, April 14–15, 1978 (CLS 14), 336–346.
von Stechow, Arnim. 1984. Comparing semantic theories of comparison.
Journal of Semantics 3(1-2). 1–77. doi:10.1093/jos/3.1-2.1.
von Stechow, Arnim. 1995. Lexical decomposition in syntax. In Urs Egli,
Peter E. Pause, Christoph Schwarze, Arnim von Stechow & Götz Wienold
(eds.), Lexical knowledge in the organization of language (Current Issues
in Linguistic Theory 114), 81–118. John Benjamins.
von Stechow, Arnim & Dieter Wunderlich (eds.). 1991. Semantics: An interna-
tional handbook of contemporary research. Berlin: de Gruyter.

Prof. Dr. Sigrid Beck

Chair of Descriptive and Theoretical Linguistics
Englisches Seminar
Universität Tübingen
Wilhelmstr. 50
72074 Tübingen

Semantics & Pragmatics Volume 3, Article 3: 1–41, 2010
doi: 10.3765/sp.3.3

Two kinds of modified numerals∗

Rick Nouwen
Utrecht University

Received 2009-03-27 / First Decision 2009-07-19 / Revised 2009-08-18 / Second

Decision 2009-09-08 / Revised 2009-09-29 / Accepted 2009-10-14 / Final Version
Received 2009-10-15 / Published 2010-01-26

Abstract In this article, I show that there are two kinds of numeral modifiers:
(Class A) those that express the comparison of a certain cardinality with the
value expressed by the numeral and (Class B) those that express a bound
on a degree property. The goal is, first of all, to provide empirical evidence
for this claim and second to account for these data within a framework that
treats modified numerals as degree quantifiers.

Keywords: modified numerals, scalar quantification, modality

1 Introduction

Modified numerals are most commonly exemplified by combinations of a

numeral and a comparative, as in more than 100. Following Hackl (2001),
I will refer to such expressions as comparative quantifiers. As (1) shows,
however, apart from modification by a comparative, numerals combine with
a striking diversity of expressions.

(1)  more/fewer/less than 100                 comparative quantifiers
     no more than 100, many more than 100     differential quantifiers
∗ I would like to thank two anonymous reviewers for their helpful comments. Many thanks,
moreover, to S&P editors Kai von Fintel and, especially, David Beaver, for their painstaking
efforts to point out ways in which to improve the article. A concise presentation of the main
points of this article appeared under the same title in the proceedings of the thirteenth
Sinn und Bedeutung conference (Nouwen 2009). Earlier ideas on this subject were presented
at Semantics and Linguistic Theory 13 in Amherst (2008) and the Journées Sémantique et
Modélisation in Toulouse (2008). I am grateful to the audiences of these events for useful
discussion. Special thanks to Min Que and Luisa Meroni for some help with data. This work
was supported by a grant from the Netherlands Organisation for Scientific Research NWO,
which I hereby gratefully acknowledge.

©2010 R.W.F. Nouwen

This is an open-access article distributed under the terms of a Creative Commons Non-
Commercial License (creativecommons.org/licenses/by-nc/3.0).
     at least/most 100                        superlative quantifiers
     100 or more/fewer/less                   disjunctive quantifiers
     under/over 100, between 100 and 200      locative quantifiers
     from/up to 100, from 100 to 200          directional quantifiers
     minimally/maximally 100, 100 tops        other

For a long time, there seemed to be agreement in the formal semantic lit-
erature that there was little to be gained from a thorough investigation of
these expressions. An especially dominant view, originating from generalised
quantifier theory (Barwise & Cooper 1981), was that there was not much more
to the semantics of such quantifiers than the expression of the numerical
relations >, <, ≤ and ≥. In the past decade, however, several studies have
shown that this is an overly simplistic assumption. Examples are Hackl 2001,
Krifka 1999 and Takahashi 2006 on comparative quantifiers, Nouwen 2008b
on negative comparative quantifiers, Solt 2007 on differential quantifiers,
Geurts & Nouwen 2007, Umbach 2006, Corblin 2007, Büring 2008 and Krifka
2007b on superlative quantifiers, Corver & Zwarts 2006 on locative quan-
tifiers and Nouwen 2008a on directional quantifiers.1 Such investigations
usually concern the specific quirks of a certain type of modified numeral.
While I believe that it is important to have a semantic analysis of modified
numerals on a case by case basis, I also believe that what is lacking from the
literature so far is a view of the extent to which the various modified numerals
in (1) involve the same semantic structures. In this paper, I will attempt to reach
a generalisation along this line by claiming that there are two kinds of modi-
fied numerals: (A) those that relate the numeral to some specific cardinality
and (B) those that place a bound on the cardinality of some property. The
difference will be made clear below. The main examples of (A) are comparative
quantifiers like more/fewer than 100. Most other kinds of modified numerals
fall in the second class.
I will start by making clear what distinguishes the two classes of modified
numerals by presenting a body of data that sets them apart. Then, in section
3, I introduce a well-founded decompositional treatment of comparative
quantifiers, proposed by Hackl (2001), which I take to represent the proper
treatment of class A modifiers. In section 4, I propose that class B modifiers
are operators that indicate maxima/minima. I will then account for the
distribution of these quantifiers by arguing that they are often blocked by
unmodified numerals, which are capable of expressing equivalent meanings.
1 See also Nouwen 2010b for an overview.

Section 5 discusses a particular problem that occurs with the interaction of
B-type quantifiers with modal operators. In section 6, I provide some more
details on the empirical basis for the A/B distinction. Section 7 concludes.

2 Class A and class B modified numerals

It is a striking feature of comparative quantifiers that they can be used to

assert extremely weak propositions. For instance, (2) is acceptable, even
though it expresses a rather under-informative truth.

(2) A hexagon has fewer than 11 sides. A

This example contrasts strongly with the examples in (3), which are all
unacceptable. (Or, alternatively, one might have the intuition that they are

(3) a. #A hexagon has at most 10 sides. B

b. #A hexagon has maximally 10 sides. B
c. #A hexagon has up to 10 sides. B

Why is this so? A naive theory might have it that (2) states that the number
of sides in a hexagon is strictly smaller than 11 (i.e. <11), and that the only
difference with (3) is that, there, it is stated that this number is smaller than or
equal to 10 (i.e. ≤ 10). Clearly, 6 is both < 11 and ≤ 10. So why are not both
kinds of examples under-informative but true? On the naive view, having at
most 10 sides is expected to be equivalent to having fewer than 11 sides. That
is, both these properties pick out objects with n ∈ {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
sides. Semantically, no contrast is to be expected. Given this semantic
equivalence, a pragmatic explanation of the contrast between (2) and (3)
seems equally unlikely.2
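
The extensional equivalence that the naive view relies on is easy to verify mechanically. The following small sketch uses hypothetical function names and an arbitrary finite range of side counts; it only illustrates the naive entries, not the analysis proposed later.

```python
def fewer_than(n):
    """Naive entry: the number is strictly smaller than n."""
    return lambda k: k < n

def at_most(n):
    """Naive entry: the number is smaller than or equal to n."""
    return lambda k: k <= n

# Over whole numbers, 'fewer than 11' and 'at most 10' pick out the same values.
assert all(fewer_than(11)(k) == at_most(10)(k) for k in range(0, 1000))

hexagon_sides = 6
print(fewer_than(11)(hexagon_sides), at_most(10)(hexagon_sides))  # True True
```

Since both entries return the same truth value for every whole number, nothing in these naive meanings can distinguish (2) from (3).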
Let us call quantifiers that are acceptable in such examples class A quan-
tifiers and those that are like (3) class B quantifiers. As the contrast between
(4) and (5) shows, the distinction is also visible with lower bound quantifiers.

2 A reviewer wondered whether the naive view could not be maintained if we assume that
there is a pragmatic effect associated to the fact that ≤ n includes the possibility of n while
< n excludes it. It is very much unclear what kind of effect that would be, however. One
could, for instance, base a pragmatic inference on the fact that, in (3a), the speaker seems to
signal the possibility that a hexagon has 10 sides by using at most 10. However, one could
equally argue that the same signal is given by the speaker of (2), simply by using fewer than
11 instead of fewer than 10.

That is, (4) is under-informative, yet true and acceptable, while the examples
in (5) are unacceptable/false.

(4) A hexagon has more than 3 sides. A

(5) #A hexagon has {at least / minimally} 3 sides. B

What I think is the underlying problem of examples involving class B expressions
is that such quantifiers are incapable of expressing relations to
definite amounts. Class A expressions, on the other hand, excel at doing so.
Imagine, for instance, that we are talking about my new laptop and that we
are concerned with how much internal memory it has. Say that it has 1GB of
memory (and that I know that it has so much memory.) I can then assert (6)
in a context where you, for instance, just told me that your laptop has 2GB of
memory.

(6) My laptop has less than 2GB of memory.

Or, if your computer has a mere 512MB of memory, I can boast that:

(7) My laptop has more than 512MB of memory.

In these examples, I am comparing the definite amount of 1GB, i.e. the precise
amount of memory I know my laptop has, to some given contrasting amount
2GB (512MB) by means of less than (more than). This is something class A
quantifiers can do very well, but something that is unavailable for class B
modified numerals:

(8) I know exactly how much memory my laptop has. . .

a. . . . and it is {#at most / #maximally / #up to} 2GB.
b. . . . and it is {#at least / #minimally} 512MB.

In contrast to (8), class B quantifiers are acceptable when what is ‘under

discussion’ is not a definite amount, but rather a range of amounts, as in (9).

(9) a. Computers of this kind have {at most / maximally / up to} 2GB of memory.
b. Computers of this kind have {at least / minimally} 512MB of memory.

In other words, it appears that class B quantifiers relate to ranges of values,

rather than to a single specific cardinality.3 This intuition is supported by

(10) Jasper invited maximally 50 people to his party.

We normally interpret (10) to indicate that the speaker does not know how
many people Jasper invited. That is, it is unacceptable for a speaker to utter
(10) if s/he has a definite amount in mind, which is why the addition of 43, to
be precise in (11) is infelicitous.4

(11) Jasper invited maximally 50 people to his party. #43, to be precise.

By assuming that the speaker does not know the exact amount, (10) is
interpreted as being about the range of values possible from the speaker’s
perspective. The speaker thus states that there is a bound on that range.
The same intuition occurs if we substitute maximally 50 by any other class B
In sum, I showed that the landscape of modified numerals can be divided
into two separate classes of expressions. What distinguishes class B quanti-
fiers from other modified numerals is that they are incompatible with definite
amounts and are always interpreted with respect to a range of values. Below,
I will present a semantics of class B expressions that makes this intuition

3 In his comments on this article, David Beaver pointed out examples like (i), where the number
appears to be a variable quantified over.

(i) There were maximally 50 people there at any one time.

Although I will not attempt a compositional analysis of cases like (i), such examples do
appear to support the main intuition that class B quantifiers express relations between
amounts and ranges. An example like (i) states that 50 is the maximum of the range formed
by the different number of people present at different times. This is different from (ii),
which states that at any time the number of people present did not exceed 50. (This is true,
for instance, in case from start to finish there were always 20 people present.) So while (i)
expresses a maximum on a range of values created by quantification, (ii) quantifies over
different times and compares the number of people present at that time with 50.

(ii) There were fewer than 50 people there at any one time.

4 Compare this to (i), which forms a minimal pair with (11).

(i) Jasper invited fewer than 50 people to his party. 43, to be precise.

R.W.F. Nouwen

precise. Before I can do so, however, I will need to discuss the semantics of
A-type numeral modifiers.

3 Hackl’s semantics for comparative modifiers

In this section, I discuss the semantics for comparative modified numerals

as developed in Hackl 2001. I will assume that this represents the proper
treatment of class A numeral modifiers. I also extend the framework slightly
by adding a way to account for the ambiguity of non-modified numerals.

3.1 Class A modifiers as degree quantifiers

What is the semantics of a class A quantifier? It is tempting to think that

class A quantifiers correspond to the well-known generalised quantifier-style
determiner denotations such as the ones in (12).5

(12) ⟦more than 10⟧ = λP.λQ.∃x[#x > 10 & P(x) & Q(x)]
     ⟦fewer than 10⟧ = λP.λQ.¬∃x[#x ≥ 10 & P(x) & Q(x)]

In the past decade it has become clear that it is important to have a closer
look at these modified numerals (Krifka 1999; Hackl 2001). In what follows,
I will assume the following semantics of more than and fewer than, which is
based on the arguments in Hackl 2001.

(13) ⟦more than 10⟧ = λM. max_n(M(n)) > 10
     ⟦fewer than 10⟧ = λM. max_n(M(n)) < 10

The workings of this definition will become clear below, but one of the main
motivations for an analysis along this line can be pointed out immediately.
The semantics in (13) is simply that of a comparative construction, where car-
dinalities are seen as a special kind of degrees. That is, like the comparative,
it involves a degree predicate M and a maximality operator that applies to

5 In a set-theoretic approach (12) would correspond to the perhaps more familiar (i). I discuss
(12) rather than (i) since, in what follows, I will assume a framework that makes use of
sum individuals. It is easy to see that, within their own respective frameworks, (12) and (i)
ultimately yield the same truth-conditions.

(i) ⟦more than 10⟧ = λX.λY.|X ∩ Y| > 10
    ⟦fewer than 10⟧ = λX.λY.|X ∩ Y| < 10

this predicate (Heim 2000). In other words, (13) is completely parallel to other
comparatives, like (14). While in (13), M is a predicate like being a number n
such that Jasper invited n people to his party, in (14) M could, for instance,
be filled in with something like being a degree d such that Jasper is tall to
degree d.

(14) ⟦-er than d⟧ = λM. max_d′(M(d′)) > d

Hackl assumes that argument DPs containing a (modified) numeral always

contain a silent counting quantifier many:

(15) ⟦many⟧ = λn.λP.λQ.∃x[#x = n & P(x) & Q(x)]

(16) 10 sushis ⇝ [ DP [ 10 many ] sushis ]

In this framework, the numeral (of type d, of degrees) is an argument
of the silent quantifier many (of type ⟨d, ⟨⟨e, t⟩, ⟨⟨e, t⟩, t⟩⟩⟩, of generalised
quantifier-style determiners parameterised for degrees). By applying [ 10
many ] to the noun (phrase), the standard generalised quantifier denotation
of 10 sushis is derived: λQ.∃x[#x = 10 & sushi(x) & Q(x)]. The structure
of a DP containing a modified numeral does not differ essentially. Modified
numerals are also the argument of a counting quantifier, as illustrated in (17).

(17) fewer than 10 sushis ⇝ [ DP [ [ fewer than 10 ] many ] sushis ]

As was stated above, many is parametrised for cardinalities, which we take

to be degrees. Fewer than 10, however, denotes a degree quantifier, not a
degree constant. Thus, to avoid a type clash, the modified numeral in (17) has
to move, leaving a degree trace and creating a degree property.

(18) Jasper ate fewer than 10 sushis.
     ⇝ [ [fewer than 10] [ λn [ Jasper ate [ [ n many ] sushis ] ] ] ]

This leads to the following interpretation, which results in the desired simple
truth-conditions.

(19) [λM. max_n(M(n)) < 10] (λn.∃x[#x = n & sushi(x) & ate(j, x)])
     = max_n(∃x[#x = n & sushi(x) & ate(j, x)]) < 10
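The derivation in (19) can be mimicked in a small executable model. The following Python sketch is only an illustration under stated assumptions: the toy set `eaten`, the finite degree range `DEGREES`, and the helper names `degree_predicate`, `max_n`, `fewer_than` and `more_than` are hypothetical and not part of Hackl's proposal. The degree predicate is modelled as a function from numbers to truth values, and the modified numeral as a function that applies to that predicate.

```python
# Toy model (assumption): the sushis Jasper ate in the actual world.
eaten = {"s1", "s2", "s3", "s4"}

# Finite range of cardinalities to search (assumption, for computability).
DEGREES = range(1, 51)

def degree_predicate(n):
    """λn.∃x[#x = n & sushi(x) & ate(j, x)]: there is a group of n eaten sushis."""
    return n <= len(eaten)

def max_n(M):
    """max_n(M(n)); assumes at least one degree in DEGREES satisfies M."""
    return max(n for n in DEGREES if M(n))

def fewer_than(k):
    """[[fewer than k]] = λM. max_n(M(n)) < k"""
    return lambda M: max_n(M) < k

def more_than(k):
    """[[more than k]] = λM. max_n(M(n)) > k"""
    return lambda M: max_n(M) > k

# (18) 'Jasper ate fewer than 10 sushis': the moved numeral applies to the predicate.
sentence_18 = fewer_than(10)(degree_predicate)
```

In this toy model the degree predicate holds of 1 through 4, so its maximum is 4 and (18) comes out true.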

This might seem like a rather elaborate way of deriving the truth-conditions
for such simple sentences. Using (12), we would have derived as truth-

conditions ¬∃x[#x ≥ 10 & sushi(x) & ate(j, x)], which is equivalent to (19),
but which does not require resorting to (moving) degree quantifiers and
silent counting quantifiers. Importantly, however, Hackl’s theory makes some
crucial predictions which are not made by theories assuming a semantics as
in (12).
If, like degree operators, modified numeral operators can take scope,
we expect to find scope alternations that resemble those found with degree
operators (Heim 2000). As Hackl observed, this prediction is borne out. For
reasons explained in Heim 2000, structural ambiguity arising from degree
quantifiers and intensional operators like modals is only visible with non-
upward entailing quantifiers, which is why all the following examples are
with upper-bounded modified numerals.
The example in (20), for instance is ambiguous, with (20a) and (20b) as its
two readings.

(20) (Bill has to read 6 books.) John is required to read fewer than 6 books.

a. ‘John shouldn’t read more than 5 books’

b. ‘The minimal number of books John should read is fewer than 6’

One of the readings of (20) states that there is an upper bound on what John
is allowed to read. The more natural interpretation, however, is a minimality
reading, which is about the minimal number of books John is required to
read. (That is, (20) would, for instance, be true if John meets the requirements
as soon as he reads 3 or more books.)
Following Heim (2000), Hackl analyses this ambiguity as resulting from
alternative scope orderings of the modal and the comparative quantifier. The
upper bound reading, (20a), corresponds to a logical form where the modal
takes wide scope. The minimality reading involves the maximality operator
intrinsic to the comparative construction taking wide scope over the modal
(Heim 2000).

(21) □[max_n(∃x[#x = n & book(x) & read(j, x)]) < 6]
     [require [ [fewer than 6] [ λn [John read n-many books] ] ] ]
(22) max_n(□∃x[#x = n & book(x) & read(j, x)]) < 6
     [ [fewer than 6] [ λn [ require [John read n-many books] ] ] ]
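The two scope orderings can be compared in a toy possible-worlds model. This is a minimal sketch, assuming that each deontically accessible world can be represented simply by the number of books John reads in it; the names `worlds`, `reads`, `upper_bound_reading` and `minimality_reading` are hypothetical and introduced only for illustration.

```python
# Each deontically accessible world is represented by the number of books
# John reads in it (assumption). Here the requirement is 'at least 3 books'.
worlds = [3, 4, 5, 6, 7, 8]

DEGREES = range(1, 21)  # finite search space for cardinalities (assumption)

def reads(w, n):
    """In world w there is a group of n books that John reads."""
    return n <= w

def max_n(M):
    """max_n(M(n)); assumes some degree in DEGREES satisfies M."""
    return max(n for n in DEGREES if M(n))

def upper_bound_reading(ws, k):
    """Modal wide scope, as in (21): in every world the maximum is below k."""
    return all(max_n(lambda n: reads(w, n)) < k for w in ws)

def minimality_reading(ws, k):
    """Comparative wide scope, as in (22): the largest n read in every world is below k."""
    return max_n(lambda n: all(reads(w, n) for w in ws)) < k

reading_21 = upper_bound_reading(worlds, 6)
reading_22 = minimality_reading(worlds, 6)
```

With a requirement of at least 3 books, the upper bound reading is false (some permitted worlds have 6 or more books read), while the minimality reading is true, in line with the informal description of (20b).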

A similar structural ambiguity can be observed with existential modals. The

two readings of (23) are an upper bound interpretation as well as a reading

which is very weak, stating simply that values below the numeral are within
what is permitted, without stating anything about the permissions for higher
values. (That is, the reading intended in (23b) is, for instance, verified by a
situation where there are no restrictions whatsoever on how many friends John
is allowed to bring. Clearly, (23a) would be false in such a situation.)

(23) John is allowed to bring fewer than 10 friends.

a. ‘John shouldn’t bring more than 9 friends’
b. ‘It’s OK if John brings 9 or fewer friends (and it might also be OK
if he brings more)’

As before, these readings can be predicted to exist on the basis of the relative
scope of modal and comparative quantifiers.

(24) max_n(♦∃x[#x = n & friend(x) & bring(j, x)]) < 10
     [ [fewer than 10] [ λn [ allow [John bring n-many friends] ] ] ]
(25) ♦[max_n(∃x[#x = n & friend(x) & bring(j, x)]) < 10]
     [ allow [ [fewer than 10] [ λn [John bring n-many friends] ] ] ]

The reader may check that Hackl’s predicted readings in (24) and (25) are
indeed the attested ones.

3.2 Class B modifiers are different

These analyses are strongly supportive of an approach which treats compar-

ative quantifiers as comparative constructions. The question now is whether
class B quantifiers should be given a similar treatment. In other words, will
the semantics in (26) do?

(26) ⟦up to / maximally / at most / etc. 10⟧ =? λM. max_n(M(n)) ≤ 10

Choosing a semantics that is parallel to that of fewer than is partly unintuitive

since the class B quantifiers are not comparative constructions. Yet, cases
like maximally 10 suggest that the crucial ingredient of the semantics is the
same, namely a maximality operator. The unsuitability of the analysis in (26)
becomes immediately apparent, however, if we investigate examples with
class B modified numerals embedded under an existential modal: these turn
out not to be ambiguous (cf. Geurts & Nouwen 2007). Class B modifiers like
maximally, up to and at most always yield an upper bound on what is allowed
and resist the weaker reading that was found with comparative modifiers, as

the contrast between (27) and (28) makes clear.

(27) John is allowed to bring fewer than 10 friends.

But more is fine too.
(28) John is allowed to bring {up to / at most / maximally} 10 friends.
#But more is fine too.

A further interesting property of the interaction of class B modified numeral

quantifiers and modals is that existential modals interfere with the inferences
about speaker knowledge that we found for simple sentences. Above, I
observed that (29) licenses the inference that the speaker does not know
how many friends Jasper invited. In contrast, (30) does not license any such
inference; it is compatible with the speaker knowing exactly what is and what
is not allowed.

(29) Jasper invited maximally 50 friends.

(30) Jasper is allowed to invite maximally 50 friends.

These observations add to the data separating class A from class B quanti-
fiers. Summarising, the distinctions are then as follows. First of all, class B
quantifiers, but not class A quantifiers, resist definite amounts, except when
embedded under an existential modal. Second, class B quantifiers, but not
class A quantifiers, resist weak readings when embedded under an existential
modal.
In the next section I will argue that the peculiarities of class B quantifiers
can be explained if we assume that they are quite simply maxima and minima
indicators. Basically, what I propose is that the semantics of maximally
(minimally) is simply the operator max_d (min_d). This might be perceived as
stating the obvious. What is not obvious, however, is how such a proposal
accounts for the difference between class A and class B quantifiers. I will
argue that the limited distribution of class B modifiers is due to the fact that
they give rise to readings that are in competition with readings available
for non-modified structures. I will show that, in many circumstances, the
application of a class B modifier to a numeral yields an interpretation which
is equivalent to one that was already available for the bare numeral. Before I
can explain the proposal in detail, I therefore need to include an account of
bare numerals in the framework.

3.3 The semantics of numerals

Above, I adopted the semantics of Hackl 2001 for comparative modified

numerals. An important part in that framework is played by the counting
quantifier many. I will re-name this operator many1, for, in what follows, I
assume that for any numeral there are two counting quantifiers available.
These two options are to account for the two meanings of numerals that
may be observed: on the one hand the existential / weak / lower-bounded
meaning and, on the other hand, the doubly bound / strong meaning. An
example like (31), for instance, is ambiguous between (31a) and (31b).

(31) Jasper read 10 books.

a. the number of books read by Jasper ≥ 10
b. the number of books read by Jasper = 10

I assume that, like the meaning in (31a), the meaning in (31b) is semantic and
not the result of a scalar implicature that results from (31a). See e.g. Geurts
2006 for a detailed ambiguity account, and for some compelling arguments
in favour of it.6
In the current framework, that of Hackl 2001, the weak reading in (31a) is
due to a weak semantics for the counting quantifier: i.e. many1. I propose
that the strong reading, (31b), is accounted for by an alternative quantifier
many2 (taking inspiration from Geurts 2006.)7

(32) ⟦many1⟧ = λn.λP.λQ.∃x[#x = n & P(x) & Q(x)]
     ⟦many2⟧ = λn.λP.λQ.∃!x[#x = n & P(x) & Q(x)]

Here, ∃!x[ϕ] abbreviates ∃x[ϕ & ∀x′[x′ ≠ x → ¬ϕ[x/x′]]].8 In other words,
∃!x stands for ‘exactly one . . . ’. When x ranges over groups of individuals,
∃!x[#x = n & P(x)] is verified by assigning to x the maximal group of
individuals with property P, where n is the cardinality of that group. This is
because any smaller group will not be the unique group with property P of its
cardinality. For instance, if our domain is {a, b, c, d}, all of which satisfy P,
then ∃!x[#x = 3 & P(x)] is false, since several groups have three atoms and
property P, among which a ⊕ b ⊕ c and a ⊕ c ⊕ d. However, ∃!x[#x = 4 & P(x)]
6 But see Breheny 2008 for a dissenting view.
7 Here is a mnemonic. The 1 in many1 represents the fact that this operator is unilaterally
bound, namely lower-bounded only. Many2 on the other hand is bilaterally bound.
8 Here, ϕ[x/x′] is the formula that is exactly like ϕ except that free occurrences of x have
been replaced by x′. Moreover, it is assumed that ϕ contains no free occurrences of x′.

is true, since apart from a ⊕ b ⊕ c ⊕ d there is no other group that has 4

atoms while satisfying P . Consequently, ∃!x[#x = n . . .] stands for ‘exactly
n. . . ’. For instance, the doubly bound reading of Jasper read 10 books is (33).
The truth-conditions of (33) are such that it is false if Jasper read fewer than
10 books (for then there would not be 10 books he read), but also false if
Jasper read more than 10 books (for then there would be many groups of 10
books he read).

(33) ∃!x[#x = 10 & book(x) & read(j, x)]
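The uniqueness reasoning behind many2 can be checked by brute force over groups. The sketch below is an illustration only: it assumes that sum individuals can be modelled as sets of atoms and that P and Q are distributive (a group satisfies them just in case each of its atoms does). The names `groups_of_size`, `many1` and `many2` are hypothetical, and n is assumed to be at least 1.

```python
from itertools import combinations

def groups_of_size(atoms, n):
    """All groups (modelled as frozensets) built from n of the given atoms."""
    return [frozenset(c) for c in combinations(sorted(atoms), n)]

def many1(n, P, Q):
    """∃x[#x = n & P(x) & Q(x)]: at least one group of n atoms that are both P and Q."""
    return len(groups_of_size(P & Q, n)) >= 1

def many2(n, P, Q):
    """∃!x[#x = n & P(x) & Q(x)]: exactly one such group."""
    return len(groups_of_size(P & Q, n)) == 1

# The domain {a, b, c, d} from the text, with every atom satisfying P (and Q).
P = {"a", "b", "c", "d"}
Q = {"a", "b", "c", "d"}
```

On this domain `many2(3, P, Q)` fails because there are four distinct three-membered groups, whereas `many2(4, P, Q)` succeeds because only the maximal group has four atoms.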

Not only does the option of two counting quantifiers, many1 and many2,
suffice to account for the ambiguity of bare numerals, it is moreover harmless
with respect to the semantics of comparative quantifiers. A sentence like
Jasper read more than 10 books is not ambiguous. It is important to show
that the availability of two distinct counting quantifiers does not predict
ambiguities in such examples. It will be instructive to see in somewhat more
detail why this is indeed the case.
The structure in (34) is exemplary of a simple sentence with a modified
numeral object. As explained earlier, the modified numeral applies to the
degree predicate that is created by moving the quantifier out of the DP.

(34) [ MOD n [ λd [ Jasper read d many1/2 books ] ] ]

Now that there is a choice between two counting quantifiers, the denotation
of the degree predicate depends on which of many1 and many2 is chosen. The
predicate in (35) is the result of a structure containing many1 ; the predicate
in (36) is based on many2 . If, in the actual world, Jasper read 10 books, then
(35) denotes {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. When, however, the predicate contains
the many2 quantifier, the denotation is a singleton set: {10} if Jasper read
10 books. This is because only the maximal group of books read by Jasper
is such that it is the unique group of that kind of a certain cardinality.
In general, the many2-based degree predicate extension is a singleton set
containing the maximum of the values in the denotation of the many1-based
degree predicate.

(35) λd.∃x[#x = d & book(x) & read(j, x)]

(36) λd.∃!x[#x = d & book(x) & read(j, x)]
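The extensions of (35) and (36) can be computed directly. This sketch assumes a situation in which Jasper read `total` books and uses the fact that the number of d-membered groups of those books is the binomial coefficient; the function names `ext_many1` and `ext_many2` and the finite degree range are hypothetical choices made for illustration.

```python
from math import comb

def ext_many1(total, degrees=range(1, 31)):
    """Extension of (35): d such that some group of d books was read."""
    return {d for d in degrees if d <= total}

def ext_many2(total, degrees=range(1, 31)):
    """Extension of (36): d such that exactly one group of d books was read."""
    return {d for d in degrees if comb(total, d) == 1}

weak = ext_many1(10)     # many1-based predicate when Jasper read 10 books
strong = ext_many2(10)   # many2-based predicate in the same situation
```

The two extensions differ, but they share their maximum, which is all that a maximality-based comparative quantifier is sensitive to.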

As discussed above, comparative quantifiers involve maximality operators.

However, the maximal values for degree predicates like (35) and (36) are

always equivalent. In simple sentences based on a structure like (34), the

option of having two distinct counting quantifiers does therefore not result
in any ambiguity.
When we turn to cases where the degree predicate is formed by moving
the modified numerals over a modal operator with universal force, something
similar can be observed. If Jasper is required to read (exactly) 10 books,
then the structure in (37) yields, again, the set {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. Once
more, the structure which contains the bilateral counting quantifier, the one
in (38), yields the set containing the maximum of its weaker counterpart.

(37) [ λd [ require [ Jasper read d many1 books ] ] ]
     ⇝ λd.□∃x[#x = d & book(x) & read(j, x)]
(38) [ λd [ require [ Jasper read d many2 books ] ] ]
     ⇝ λd.□∃!x[#x = d & book(x) & read(j, x)]

Given that the relation between (38) and (37) is once again one of a set and its
maximal value, no ambiguities can be expected to arise when comparative
quantifiers are applied to these two predicates. This is as is desired.
Of course, it could be that the actual situation is not one containing a
specific requirement, but one with for instance a minimality requirement.
Say, for instance, Jasper has to read at least 4 books. In that case, (37) denotes
the set {1, 2, 3, 4}. The extension of (38), however, is the empty set. (In such a
context, there is no specific n such that Jasper has to read exactly n books.)
Clearly, the maximal value for the predicate is undefined in such a case.
This means that the logical form based on many2 will not lead to a sensible
interpretation and, so, we again do not expect to find ambiguity.
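The two requirement scenarios just discussed can be made concrete as follows. The sketch assumes that each permitted world is represented by the number of books Jasper reads in it, and that a finite sample of worlds stands in for an open-ended requirement such as 'at least 4'; all names are hypothetical.

```python
DEGREES = range(1, 21)  # finite range of cardinalities (assumption)

def ext_required_many1(worlds):
    """(37): d such that in every permitted world Jasper reads some group of d books."""
    return {d for d in DEGREES if all(d <= w for w in worlds)}

def ext_required_many2(worlds):
    """(38): d such that in every permitted world Jasper reads exactly d books."""
    return {d for d in DEGREES if all(d == w for w in worlds)}

exactly_ten = [10]                # requirement: read exactly 10 books
at_least_four = [4, 5, 6, 7, 8]   # requirement: read at least 4 books (finite sample)
```

Under the exact requirement the many2-based set is the singleton containing the maximum of the many1-based set; under the minimality requirement the many2-based set is empty, so no maximum is defined.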
The case of predicates that are formed by abstracting over an existential
modal operator is illustrated in (39) and (40). If Jasper is allowed to read a
maximum of 10 books, then the two predicates are equivalent, both denoting
the set {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}.9

(39) λd.♦∃x[#x = d & book(x) & read(j, x)]

(40) λd.♦∃!x[#x = d & book(x) & read(j, x)]

In sum, the option of two counting quantifiers many1 and many2 is irrelevant
when combined with a comparative quantifier. This is because the compara-
9 If there is in addition a lower bound, the two predicates are no longer equivalent, but their
maximum will be.

tive quantifier is based on maximality and the degree predicates containing

the different counting quantifiers do not differ in their maximum value.

4 The semantics of class B quantifiers

I now turn to the main proposal: class B quantifiers are maxima/minima

indicators. I start with the upper-bounded modifiers.

4.1 Upper bound class B modifiers

In the formula in (41), MOD↓B generalises over any of the class B modifiers at
most, maximally, up to, etc.10

(41) ⟦MOD↓B⟧ = λd.λM. max_n(M(n)) = d

If the semantics of upper bound class B quantifiers is as in (41), then why is

their distribution so limited? What I think is the reason for the awkwardness
of a lot of examples with class B quantifiers is the fact that, in many cases,
(41) is a vacuous operator. To be precise, the two propositions in (42) are
equivalent whenever the cardinality predicate M denotes a singleton set. In
such a case, a bare numeral form is to be preferred over a numeral modified
by a class B modifier, since the latter derives the same meaning from a much
more complex linguistic form.

(42) a. max_n(M(n)) = d
     b. M(d)
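The relation between (42a) and (42b) can be tested on explicit sets of degrees. This is a minimal sketch, assuming that a degree predicate M can be represented by its (non-empty, finite) extension; the helper names `max_equals`, `holds_of` and `equivalent_on` are hypothetical.

```python
def max_equals(M, d):
    """(42a): max_n(M(n)) = d, with M given as a non-empty set of degrees."""
    return max(M) == d

def holds_of(M, d):
    """(42b): M(d)."""
    return d in M

def equivalent_on(M, degrees=range(1, 21)):
    """True if (42a) and (42b) agree for every degree d in the given range."""
    return all(max_equals(M, d) == holds_of(M, d) for d in degrees)

singleton = {10}               # a predicate true of a single degree
interval = set(range(1, 11))   # a predicate true of a range of degrees
```

For the singleton the two statements coincide for every d, so the modifier adds nothing. For the range they come apart: the predicate holds of 7, for instance, although 7 is not its maximum.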

What I have in mind exactly is the kind of reasoning underlying Horn’s

division of pragmatic labour (Horn 1984). The idea is that a maxim of brevity,
10 For modifiers like at most and maximally, one might wonder whether (41) is not too restricted,
given that they are capable of modifying DPs more generally. However, it appears that there
is a common mechanism to all uses of such modifiers. For instance, (i) could be assigned its
intuitive meaning if we assume that at most has the semantics in (ii), where the operator
‘max’ compares properties on the rank order [assistant professor < associate professor <
full professor]:

(i) Jasper is at most an associate professor.

(ii) ⟦at most⟧ = λP.λx. max_P′(P′(x)) = P

It goes beyond the scope of this article to implement a formal connection between (ii) and
(41), but it should be clear that the underlying mechanism is the same.

part of Grice’s maxim of Manner (Grice 1975), steers toward minimising the
form used to express something. This causes simple (unmarked) meanings to
be typically expressed by means of simple (unmarked) forms. Marked forms
which by convention could be given the same unmarked meaning as some
unmarked form are instead given a more marked interpretation. There are
many variations and implementations of this idea (McCawley 1978; Atlas &
Levinson 1981; Blutner 2000; van Rooij 2004),11 but what is most relevant for
this paper is the general idea that an unmarked meaning is blocked as an
interpretation for the marked form.
With this in mind, the equivalence of (42a) and (42b) whenever M denotes
a singleton set has profound consequences for when it actually makes sense
to state that the maximum of a degree predicate equals a certain value. That
is, in cases where (42a) equals (42b), we expect that the use of maximally
does not lead to an interpretation based solely on (42a), since the use of the
bare numeral form would result in the same meaning. To illustrate this in
some more detail let us carefully go through the following examples.
We know from the discussion above that one of the interpretations avail-
able for (43) is (44).

(43) Jasper invited 10 people.

(44) ∃!x[#x = 10 & people(x) & invite(j, x)]

Now consider (45), which is interpreted either as (46) or as (47).

(45) Jasper invited maximally 10 people.

(46) [ maximally 10 [ λd [ Jasper invited d many1 people ] ] ]
     ⇝ max_n(∃x[#x = n & people(x) & invite(j, x)]) = 10
(47) [ maximally 10 [ λd [ Jasper invited d many2 people ] ] ]
     ⇝ max_n(∃!x[#x = n & people(x) & invite(j, x)]) = 10

The interpretations in (46) and (47) are equivalent. In fact, just like we do
not expect ambiguities to arise with comparative quantifiers on the basis
of the many1/many2 choice, we do not expect any ambiguities to arise with
MOD↓B quantifiers, for the simple reason that both such operators involve
11 In fact, there is a close resemblance between this prevalent idea in pragmatics and blocking
principles in other parts of linguistics. The commonality is that two different expressions
cannot have identical meanings. See, for instance, the Elsewhere Condition (Kiparsky 1973)
in phonology or the Avoid Synonymy principle (Kiparsky 1983) in morphology.

a maximality operator and that the maximal values of predicates based on

many1 are always those of predicates based on many2. In what follows,
we will therefore gloss over the two equivalent options by representing the
semantics following the general scheme in (48).

(48) [ maximally 10 [ λd [ Jasper invited d many1/2 people ] ] ]
     ⇝ max_n(∃(!)x[#x = n & people(x) & invite(j, x)]) = 10

Importantly, the single reading of (45) is equivalent to (44), the strong reading
of (43). The example in (43), however, reaches this interpretation by means
of a much simpler linguistic form, one which does not involve a numeral
modifier. I propose that this is why the reading in (48) of (45) does not
surface: it is blocked by (43).12
As observed above, we can nevertheless make sense of (45) once we
interpret the sentence to be about what the speaker holds possible. So, a
further possible reading for (45) is that in (49).

(49) max_n(♦∃(!)x[#x = n & people(x) & invite(j, x)]) = 10

Crucially, this interpretation is not equivalent to (50), which is the result of

interpreting (43) from the perspective of speaker possibility.

(50) ♦∃!x[#x = 10 & people(x) & invite(j, x)]

12 An anonymous reviewer notes two complications with the proposed blocking mechanism.
First of all, s/he wonders why exactly 10 is not blocked in a similar way to minimally 10,
since the same reasoning seems to apply. I acknowledge that this is something that needs to
be explained. Interestingly, this is something any theory that believes in the existence of
an ‘exactly’ sense for numerals has to explain. One promising route has been proposed by
Geurts (2006), who suggests that exactly is semantically empty and that its only function is
“to reduce pragmatic slack” (p. 320). That is, whereas bare 100 allows for an imprecise rough
construal (Krifka 2007a), exactly 100 enforces precision. If Geurts is on the right track, then
there is no reason to expect that exactly 100 is blocked by 100.
A further complication noted by the same anonymous reviewer is that if we assume that
the ‘max’ operator is presuppositional, we might come to expect that maximally 100 blocks
100 instead of the other way around. This prediction appears to be made when at the same
time we assume the Maximize Presupposition principle (Heim 1991). Since maximally 100 and
100 share the same meaning, but the former triggers a presupposition, the use of 100 would
be blocked. This is a very interesting scenario, but since I have little to say about the kind
of presuppositions (if any) expressions like maximally trigger and I furthermore have no
thoughts on how maximize presupposition would interact with a brevity maxim, I will leave
this issue to further research.

In other words, the meaning in (49) for (45) is not blocked by the bare numeral
form in (43) since (43) lacks this reading.
To be sure, I do not claim that (50) would be an available reading for (43).
That is, the particular kind of interpretation that examples like (45) receive
is available only as a last resort strategy. Underlying this analysis is the as-
sumption that there exist silent modal operators. I can offer no independent
evidence for this assumption, but stress that the intuitions regarding exam-
ples like (45) quite clearly point into the direction of some sort of speaker
modality. In work on superlative quantifiers, we find some alternatives to
the present account. Such approaches are meant to deal with at most and at
least only, but if my arguments above are on the right track, then we could
reinterpret these proposals for the semantics of superlative quantifiers as
applying to the whole of class B. For instance, the analysis of class B expres-
sions presented here differs from that of superlative modifiers in Geurts &
Nouwen 2007. According to the present proposal, the modal flavour of (45) is
due to a silent existential modal operator. In Geurts & Nouwen, however, the
modal was taken to be part of the lexical content of superlative quantifiers.
Another alternative, proposed for superlative modifiers in Krifka 2007b and
which is closer to the present proposal, is to analyse examples like (45) not as
involving a modal operator, but rather a speech act predicate, like assert. In
that framework, the analysis of (45) would say that n=10 is the maximal value
for which ∃(!)x[#x = n & people(x) & invite(j, x)] is assertable, rather than
possible.13 That is, according to Krifka, (45) is interpreted by assigning the
modified numeral scope over an illocutionary force operator, rather than
over a modal operator.
I will return to a comparison of these approaches below. I would like to
point out immediately, however, what I think are the major disadvantages of
both alternatives. The main problem is with examples like (51), which contain
an overt existential modal.

(51) Jasper is allowed to invite maximally/at most 10 people.

13 In his comments on the first version of this paper, David Beaver observed that it is not
necessarily the speaker’s knowledge that matters, as can be seen from (his) example (i).

(i) I know how many people were at the party, but I’ve been told not to reveal that
number to the press. However, there were maximally 50 there.

It would be interesting to see if data like these help in reaching a synthesis of Krifka’s
account and the present proposal.

Its most salient reading is one in which 10 is said to be the maximum number
of people Jasper is allowed to invite. That is, it places an upper bound on
what is allowed. For Krifka, this is problematic since, here, the modified
numeral is quite obviously not a speech act operator. For the proposal in
Geurts & Nouwen 2007, such examples are problematic since the modal
lexical semantics of at most predicts a reading with a double modal operator,
one originating from the verb and one from the numeral modifier. To remedy
this, Geurts and Nouwen provide an essentially non-compositional analysis
of such examples as modal concord.14
In contrast, the current proposal deals effortlessly with examples, such
as (51). What was crucial to my explanation of how (45) gets to be interpreted
is that degree predicates based on modals with existential force denote
non-singleton sets even when the counting quantifier associated with the
numeral is many2 . This entails that saying that the maximum value for such
a predicate is n is not equivalent to saying that the predicate holds for n.
More formally, there is a contrast between (52a) and (52b).

(52) a. max_n(∃(!)x[#x = n & people(x) & invite(j, x)]) = 10
        ⇔ ∃!x[#x = 10 & people(x) & invite(j, x)]
     b. max_n(♦∃(!)x[#x = n & people(x) & invite(j, x)]) = 10
        ⇎ ♦∃!x[#x = 10 & people(x) & invite(j, x)]
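The contrast in (52) can be verified in a toy model of speaker possibility. The sketch assumes that each world the speaker holds possible is represented by the number of people Jasper invited in it, and that the non-modal case corresponds to a single (actual) world; the names `exactly`, `possibly`, `max_is`, `M_plain` and `M_modal` are hypothetical.

```python
DEGREES = range(1, 21)  # finite range of cardinalities (assumption)

def exactly(w, d):
    """many2-style predicate: in world w Jasper invited exactly d people."""
    return w == d

def possibly(worlds, d):
    """Possibility over the predicate: some world in the list verifies it."""
    return any(exactly(w, d) for w in worlds)

def max_is(M, d):
    """max_n(M(n)) = d; assumes some degree in DEGREES satisfies M."""
    return max(n for n in DEGREES if M(n)) == d

actual = [10]                    # no modal: only the actual world counts
epistemic = list(range(1, 11))   # speaker holds 1 to 10 invitees possible

def M_plain(n):
    """Degree predicate without a modal, evaluated in the actual world."""
    return possibly(actual, n)

def M_modal(n):
    """Degree predicate containing the existential modal."""
    return possibly(epistemic, n)
```

Without the modal, stating the maximum and stating that the predicate holds coincide for every degree. With the modal they do not: it is possible that exactly 7 people were invited, yet 7 is not the maximum of what is held possible.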

As a result, whenever an upper bound class B modifier scopes over an

existential modal, no blocking from the simpler bare numeral form will be
able to take place. The application of an upper bound class B quantifier to a
degree predicate is only felicitous if the resulting readings are not readings
that can be expressed just as well by omitting the class B modifier. This is
the case when a modal with existential force has scope inside the degree
14 A further problem I see with the proposal in Krifka 2007b is that the analysis does not
appear to extend straightforwardly to illocutionary forces other than assertion, although in
fairness this might be because (at the time of writing) no detailed exposition of this theory
exists. For instance, nothing suggests that superlative modified numerals can scope over a
question operator in questions.
An additional disadvantage for the proposal of Geurts and Nouwen is that it does not
yield an explanation of the lexical form of class B modifiers. Whereas the current proposal
assigns to a modifier like maximally the semantics of a maximality operator, an extension of
Geurts and Nouwen’s approach would have to take it to be a modal, thereby disassociating it
from the intuitive meaning of maximal.

predicate.15 Treating upper bound class B quantifiers as maxima indicators

thus also predicts the absence of weak readings for examples like (51). Given
the flexible scope of the numeral modifier we expect this sentence to have
two corresponding logical forms, (54a) and (54b). (From here on, ◇ indicates
deontic modality, to distinguish it from the (epistemic) speaker possibility
operator ♦.)

(53) Jasper is allowed to invite maximally/at most 10 people.

(54) a. max_n(◇∃(!)x[#x = n & people(x) & invite(j, x)]) = 10
     b. ◇[max_n(∃(!)x[#x = n & people(x) & invite(j, x)]) = 10]

If maximally 10 is taken to have wide scope over the modal, then we arrive
at (54a), the reading that says that the maximum number of people Jasper
is allowed to invite equals 10. This is not a semantic interpretation that is
available for (55). Its many2 reading, for instance, says that inviting exactly
10 people is something that Jasper is allowed to do. This is much weaker
than (54a). (The only way we can arrive at an equally strong reading for (55)
is by means of implicature.)

(55) Jasper is allowed to invite 10 people.

If we take the modal in (51) to have widest scope, as in (54b), the resulting
interpretation is one in which inviting exactly 10 people is allowed for Jasper.
This is the reading for (55) discussed above, and so it is blocked. As a result,
(54a) is the only interpretation available.
An interesting side to the account presented here is that the upper bound
class B quantifiers do not encode the ≤ relation. As maxima indicators, their
application only makes sense if what they apply to denotes a range of values.
Otherwise, using the strong reading of the bare numeral form will do just as
well.
Interestingly, the approach also predicts that some of the examples I
discussed above do not only result in a blocking effect, but could moreover
be predicted to be false. For instance, according to the approach set out
above, the meaning of (56a) is that in (56b).

15 As far as I can see, assertability would have the same (crucially weak) properties as possibility.
So, should a silent speech act predicate seem more plausible than a silent modal operator,
then ♦ can just as well be interpreted as expressing assertability. It appears that such a
move would be largely compatible with the proposal of Krifka 2007b.

R.W.F. Nouwen

(56) a. #A triangle has maximally 10 sides.

b. ‘the maximum number of sides in a triangle is 10’

The reading in (56b) is not only blocked by A triangle has 10 sides, but
it is moreover plainly false. I believe that this predicts that (56a) should
be expected to have a somewhat different status from (57), which strictly
speaking has a true interpretation, but one that can be expressed by simpler means.

(57) #A triangle has maximally 3 sides.

It is difficult to establish whether this difference in status is borne out, or
even how this difference can be recognised. However, my own intuition tells
me that while (56) is never acceptable, (57) could be used in a joking fashion.
Native speakers inform me that (58) is marginally acceptable:

(58) ?A triangle has minimally and maximally 3 sides.

4.2 Lower-bound class B modifiers

Lower-bound class B modifiers correspond to minimality operators. Let MOD↑B
correspond to any of the class B expressions at least, from, minimally, etc.

(59) ⟦MOD↑B⟧ = λd.λM. minn (M(n)) = d

Note first that minimality operators are sensitive to the many1 / many2
distinction. Consider the degree predicate [λd. John read d many1/2 books]
and, say, that John read 10 books. In the many1 version of the logical form,
the minimal degree equals 1. In fact, independent of how many books John
read, as long as he read books, the minimal degree will always be 1. In the
many2 version of the logical form, the predicate denotes a singleton set, {10}
if John read 10 books. The minimal degree in that case is, of course, 10.
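
The sensitivity of the minimality operator to the many1 / many2 distinction can be checked mechanically. The sketch below is a simplification under stated assumptions: a degree predicate is modelled as a finite set of numbers, and the helper names `many1_degrees` and `many2_degrees` are hypothetical.

```python
# Degree predicates modelled as sets of numbers (an illustrative assumption).

def many1_degrees(books_read):
    """Degrees n such that there is SOME group of n books that John read."""
    return set(range(1, books_read + 1))

def many2_degrees(books_read):
    """Degrees n such that John read EXACTLY n books: a singleton set
    (treated as empty if he read nothing, a simplifying assumption)."""
    return {books_read} if books_read > 0 else set()

books_read = 10
print(min(many1_degrees(books_read)))  # minimal degree on the many1 construal
print(min(many2_degrees(books_read)))  # minimal degree on the many2 construal
```

On the many1 construal the minimum is 1 however large `books_read` is, whereas on the many2 construal it coincides with the number of books actually read.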
These observations already straightforwardly account for our intuitions
for an example like (60).

(60) John read minimally 10 books.

The many1 interpretation of (60) will be rejected, for it will always be false.
The minimal value for any simple many1 -based degree predicate is always 1.
The many2 interpretation of (60) will be rejected too, for it will correspond
to an interpretation saying that John read (exactly) 10 books. This reading is
blocked by the bare numeral. (In fact, (60) in the many2 variant is equivalent
to John read maximally 10 books, which, as was explained above, is blocked
for the same reasons.)
We can save (60) by interpreting it with respect to an existential modal
operator. This yields two readings:

(61) a. mind (♦∃x[#x = d & read(j, x) & book(x)]) = 10

b. mind (♦∃!x[#x = d & read(j, x) & book(x)]) = 10

The form in (61a) is once more a contradiction: the minimal degree for which
it is deemed possible that John read d-many1 books is always 1. The reading
in (61b) is much more informative. It says that the minimal number for
which it is thought possible that John read exactly so many books is 10. In
other words, this says that it is regarded as impossible that John read fewer
than 10 books. This is exactly the reading that is available.
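
The two readings in (61) can be compared in a small possible-worlds model. This is a sketch only: it assumes that each world the speaker considers possible is represented by the number of books John read in it (at least one in every world), and `possible_degrees` and `speaker_worlds` are hypothetical names.

```python
# Assumption: an epistemically possible world is represented by the number
# of books John read in it, and that number is at least 1 in every world.

def possible_degrees(epistemic_worlds, exact):
    """Degrees n such that it is deemed possible that John read n-many books."""
    degrees = set()
    for books_read in epistemic_worlds:
        if exact:
            degrees.add(books_read)                   # many2: exactly n books
        else:
            degrees.update(range(1, books_read + 1))  # many1: some group of n books
    return degrees

speaker_worlds = {10, 11, 14}  # the speaker rules out that John read fewer than 10

reading_61a = min(possible_degrees(speaker_worlds, exact=False)) == 10
reading_61b = min(possible_degrees(speaker_worlds, exact=True)) == 10
print(reading_61a, reading_61b)
```

With these worlds, the many1 form fails because its minimum is 1, while the many2 form holds because no world with fewer than 10 books is considered possible.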

4.3 Beyond modals

Some words are in order on the interaction of numeral modifiers with non-
modal operators. Given the current proposal, any property that involves
existential quantification would license the use of a class B modifier. However,
it is known that degree operators (which we take modified numerals to be)
cannot move to take scope over nominal quantifiers (cf. Kennedy 1997; Heim
2000).16 This explains why (62) does not have the reading in (63).

(62) Someone is allowed to invite maximally 50 friends.

(63) the person who is allowed to invite most friends is allowed to invite
50 friends

As observed above, however, bare plurals do interact with class B quantifiers,
as in for instance example (9). This would suggest that some intensional/modal
analysis of the readings involved in such examples is in order.
(Thanks to Maribel Romero for pointing this out to me.) I will leave a detailed
analysis of these cases for further research.
16 In Heim’s formulation: If the scope of a quantificational DP contains the trace of a degree
phrase, it also contains that degree phrase itself. See Heim 2000 for details.

5 Maximal and minimal requirements

As Hackl (2001) observed, there is an interesting interaction between modified
numerals and modals. I have extended these observations by showing how
existential modals have a tight connection to class B modifiers in that they
license their (otherwise blocked) existence. What I have not discussed so far
is how class B modifiers interact with universal modals. It turns out that this
part of the story is not straightforward at all.
Given my proposal in the previous section, we expect that there are in
principle four logical forms that correspond to (64).17

(64) Jasper should read minimally 10 books.

(65) □ > min:
The minimum n such that Jasper will read n books should be 10
a. □[minn (∃x[#x = n & book(x) & read(j, x)]) = 10] many1
b. □[minn (∃!x[#x = n & book(x) & read(j, x)]) = 10] many2

(66) min > □:
The minimum n such that Jasper should read n books is 10
a. minn (□∃x[#x = n & book(x) & read(j, x)]) = 10 many1
b. minn (□∃!x[#x = n & book(x) & read(j, x)]) = 10 many2

It turns out that none of these logical forms provides a reading that is
in accordance with our intuitions regarding (64). First of all, notice that
minn (∃x[#x = n & book(x) & read(j, x)]) = 10 is a contradiction. If there
are 10 books that Jasper read, then there is also a singleton group containing
a book Jasper read. The minimum number of books Jasper read is therefore
either 1 (in case he read something) or 0 (in case he did not read anything).
It could never be 10. Consequently, (65a) is a contradiction. For a similar
reason, (66a) is a contradiction too. If there needs to be a group of 10 books

17 In this paper, I ignore readings which (for the case of at least) Büring (2008) calls speaker
insecurity readings and which Geurts & Nouwen (2007) discuss extensively. Basically, this
reading amounts to interpreting the modal statement with respect to the speaker’s knowledge.
Such readings are especially prominent with superlative quantifiers. For instance, the speaker
insecurity reading of Jasper should read at least 10 books is: the speaker knows that there is
a lower bound on the number of books that Jasper should read, s/he does not know what
that lower bound is, but s/he does know that it exceeds 9.
Furthermore, I also ignore a reading of (64) in which 10 books is construed as a specific
indefinite. In that reading, (64) states that there are 10 specific books such that only if Jasper
reads these books will he comply with what is minimally required.

read by Jasper, then there also need to exist groups containing just a single
book read by Jasper. Once again, the minimum number referred to in (66a)
is either 0 or 1, never 10.
Turning to (65b), notice that the minn -operator is vacuous here, since
there is just a single n such that Jasper read exactly n books. This renders
(65b) equivalent to the many2 reading of Jasper should read 10 books, and
so we predict it to be blocked. The interpretation in (66b) does not fare any
better. In fact, the minn -operator is vacuous here as well. This means that
(65b) is equivalent to (66b) and that it is consequently also blocked. Even if
no blocking were to take place, (65b)/(66b) offer the wrong interpretation
anyway. They state that Jasper must read exactly 10 books (no more, no
fewer), which is not what (64) means.
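
The failure of all four logical forms in (65) and (66) can be reproduced in a toy model. The sketch assumes that a deontically accessible world is represented by the number of books Jasper reads in it and that there is at least one such world; the function names are hypothetical.

```python
# Assumption: each deontically accessible world is represented by the number
# of books Jasper reads in it; the list of worlds is non-empty.

def many1(count):
    """Degrees n such that some group of n books was read."""
    return set(range(1, count + 1))

def many2(count):
    """Degrees n such that exactly n books were read."""
    return {count} if count > 0 else set()

def box_over_min(worlds, degrees, bound):
    """(65): in every accessible world, the minimal degree equals bound."""
    return all(bool(degrees(w)) and min(degrees(w)) == bound for w in worlds)

def min_over_box(worlds, degrees, bound):
    """(66): the minimal degree that holds in every accessible world equals bound."""
    required = set.intersection(*[degrees(w) for w in worlds])
    return bool(required) and min(required) == bound

# A scenario in which (64) is intuitively true: Jasper complies by reading
# 10, 11 or 13 books, but not by reading fewer than 10.
deontic_worlds = [10, 11, 13]
readings = {
    "(65a)": box_over_min(deontic_worlds, many1, 10),
    "(65b)": box_over_min(deontic_worlds, many2, 10),
    "(66a)": min_over_box(deontic_worlds, many1, 10),
    "(66b)": min_over_box(deontic_worlds, many2, 10),
}
for label, value in readings.items():
    print(label, value)
```

All four forms come out false in this scenario. The two many2 forms become true only if every accessible world is one in which exactly 10 books are read, for example `deontic_worlds = [10, 10]`.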
One might think that the problems with (65b) and (66b) can be remedied
by abandoning quantification over sums and instead using reference to
(maximal) sums. For instance, (67) represents the truth-conditions we are
after. (Here σx returns the maximal sum that when assigned to x verifies the
scope of σ ).

(67) minn (□[#σx (book(x) & read(j, x)) ≥ n]) = 10

Still, here too the application of minn is not meaningful, since there is only
a single n such that □[#σx (book(x) & read(j, x)) ≥ n] holds, which is 10
if (64) is true. As a consequence, it would not matter whether we applied
a maximality or a minimality operator. We then wrongly predict that (68)
should share a reading with (64). (Note that (65b) and (66b) suffer from the
same odd prediction, given that the operator minn has no semantic impact
there either.)

(68) Jasper should read maximally 10 books.

It appears then that the proposal defended in this article fails hopelessly on
sentences like (64). As I will show, however, things are not so dire as they
appear. In fact, I will argue that what we stumble upon here is a general,
but poorly understood property of modals, which could be summarised as
follows:

(69) Generalisation: universal modal operators are interpreted as operators
with existential modal force when minimality is at stake

An illustration of (69) is (70), which is a satisfactory paraphrase for (64).
What is striking is that this paraphrase contains allow instead of should.

(70) 10 is the smallest number of books John is allowed to read

I will not offer an explanation for this generalisation (but see Nouwen 2010a
for an attempt). I will simply show that if we look a bit closer at the inter-
pretation of modal operators, then we come to understand that my theory
actually yields a welcome analysis.

5.1 Previous analyses

There is a precedent. In an earlier theory of at least, Geurts & Nouwen 2007,
the correct predictions regarding its relation to universal modals are arrived
at by an essentially non-compositional mechanism. A central claim made in
that paper is that superlative quantifiers are modal expressions themselves.
For instance, (71a) was proposed to correspond to (71b).18 Furthermore, it
was assumed that there may be a non-compositional interaction between the
modal that is implicitly contributed by a modified numeral and an explicit
modal operator. For instance, (72a) is interpreted as an instance of modal
concord, as in (72b), where the two modals fuse and the modal takes on the
deontic flavour of need.19

(71) a. John read at least 10 books.

b. ∃x[#x = 10 & book(x) & read(j, x)]
(72) a. John needs to read at least 10 books.
b. □∃x[#x = 10 & book(x) & read(j, x)]
18 This is how I see the theoretical landscape: Although not immediately obvious, the proposal
by Geurts and Nouwen already carries in it the idea that superlative quantifiers are minimality
and maximality operators. For instance, (71b) is equivalent to stating that 10 is the minimal
number of books John is allowed to read. Given the basic idea of treating class B operators
as min/max-operators, one has a range of options to account for the distribution of such
quantifiers and for their behaviour in intensional contexts. Geurts and Nouwen represent
one extreme, where the lexicon specifies the exact behaviour of such quantifiers (together
with the rule of modal concord). The present proposal puts forward the other extreme,
where the lexical entry for superlative (and other class B) quantifiers is rather minimal, and
where pragmatic mechanisms account for distribution and behaviour in intensional contexts.
19 I am simplifying the analysis here a little bit. Geurts & Nouwen (2007) propose that
there is an additional conjunct to the meaning of sentences containing superlative quan-
tifiers, for which they leave implicit whether it is entailed or implicated. For (71), for
instance, there would be an additional condition in the truth-conditions saying: ¬∃x[#x >
10 & book(x) & read(j, x)]. Similarly for (72).

The approach of Geurts and Nouwen is the most broadly applicable approach
to superlative quantifiers in the (admittedly small body of) literature on that
topic. There are alternatives on the market, but they do not handle examples
like these very well. As I mentioned above, Krifka (2007b) takes at least to
be a speech act modifier. Basically, an example like (71) is analysed by Krifka
in terms of what the speaker finds assertable and is paraphrased as follows:
the lowest n such that it is assertable that John read n books is 10. When
at least is embedded in an intensional context, however, it does not modify
the strength of assertability, but rather the intensional operator. So, taking
Krifka’s analysis as suitable not just for superlative, but rather for all class B
quantifiers, (72a) would be paraphrased as (73).

(73) 10 is the smallest value for n such that John should read n books

In such cases, Krifka’s analysis is identical to the one I have set out above
and it runs into exactly the same problem: (73) is not the reading we are after.
Rather, (72a) means that 10 is the smallest number of books John is allowed
to read.

5.2 Minimal requirements

Geurts & Nouwen (2007) and Krifka (2007b) say nothing about the distinc-
tion between class A and class B expressions. However, if we extend their
proposals for superlative quantifiers to cover all B-type quantifiers, then we
have an interesting trio of competing characterisations of such expressions.
At face value, the observations made so far in this section would appear to
speak in favour of the modal concord proposal of Geurts & Nouwen (2007)
(generalised to all class B quantifiers) and against the account defended here
or in Krifka 2007b. As I will argue now, however, there are reasons to believe
that the problematic predictions made by the latter two theories are not due
to the semantics of the modified numeral, but are actually the result of an
overly simplistic understanding of requirements. What I will do is discuss in
some detail examples like (74).

(74) The minimum number of books John needs to read to please his
mother is 10.

Notice, first of all, that on an intuitive level, (74) is equivalent to (75).

(75) John needs to read minimally 10 books to please his mother.

Note, secondly, that (74) spells out the semantics I have proposed for (75).
What I will show now is that when we look into the semantic details of (74),
we will run into exactly the same problems as we did for (75). What this
shows is that rather than thinking that my account of class B quantifiers is
on the wrong track, there are actually reasons to believe that the proposal
lays bare a hitherto unexplored problem for the semantics of modals like
need, require, etc.
Let us consider the semantics of (74). Say that the minimal requirement
for pleasing John’s mother is in fact that John reads 10 books.
That is, if John reads 10 or more books, she is happy. If he reads fewer,
she will not be pleased. Standard accounts of goal-directed modality (von
Fintel & Iatridou 2005) assume that statements of the form to q, need to p
are true if and only if p holds in all worlds in which the goal q holds. Below, I
refer to the worlds in which John pleases his mother as the goal worlds. It is
instructive to see what we know about the propositions that are true in such
worlds. The following is consistent with the context described above.

(76) a. In all goal worlds: ∃x[#x = 10 & book(x) & read(j, x)]
b. In all goal worlds: ∃x[#x = 9 & book(x) & read(j, x)]
c. In all goal worlds: ∃x[#x = 1 & book(x) & read(j, x)]
d. In some (not all) goal worlds: ∃x[#x = 11 & book(x) & read(j, x)]
e. In some (not all) goal worlds: ∃x[#x = 12 & book(x) & read(j, x)]
f. In no goal world: ¬∃x[book(x) & read(j, x)]

Let us now analyse some examples. First of all, (77a) and (77b) are intuitively
true and are also predicted to be true ((77a) by virtue of (76a) and (77b) by
virtue of (76c).)

(77) a. To please his mother, John needs to read 10 books.

b. To please his mother, John needs to read a book.

The example in (78) is intuitively false, and is also predicted to be false, for
the context is such that there are goal worlds in which John reads only 10,
and not 11, books.

(78) To please his mother, John needs to read 11 books.

So far, so good. If we turn to examples that place a bound on what is required,
however, then the theory makes a wrong prediction. The example in (79) is
intuitively false. If interpreted as (80), however, it is predicted to be true (by
virtue of (76c)).

(79) The minimum number of books John needs to read, to please his
mother, is 1.
(80) minn [In all goal worlds: ∃x[#x = n & book(x) & read(j, x)]] = 1
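
The predictions for (77), (78) and (80) can be verified against the context in (76). The sketch below assumes that a goal world is represented by the number of books John reads in it; `needs_at_least`, `goal_worlds` and `required` are hypothetical names.

```python
# Assumption: each goal world (a world in which John pleases his mother) is
# represented by the number of books John reads in it. The set below is
# consistent with (76): 10 or more books in every goal world.

def needs_at_least(goal_worlds, n):
    """'To q, John needs to read n books' on the existential construal:
    every goal world contains a group of n books that John read."""
    return all(books_read >= n for books_read in goal_worlds)

goal_worlds = {10, 11, 12}

print(needs_at_least(goal_worlds, 10))  # (77a)
print(needs_at_least(goal_worlds, 1))   # (77b)
print(needs_at_least(goal_worlds, 11))  # (78)

# (80): the minimal n such that reading n books holds in all goal worlds.
required = [n for n in range(1, max(goal_worlds) + 1)
            if needs_at_least(goal_worlds, n)]
print(min(required))
```

The minimum computed for (80) is 1 rather than 10, so the form wrongly verifies (79) and falsifies (74).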

In general, theories such as that of von Fintel & Iatridou (2005) predict that
if S is an entailment scale of propositions, and p is a proposition on this
scale, then if p is a minimal requirement for some goal proposition q, then
a statement of the form “the minimum requirement to q is p” is always
predicted to be false, except when p is the minimal proposition of S. This
makes a devastating prediction, namely that minimal requirements could
never be expressed, since they would always correspond to the absolute
minimum.
One might think that what is going wrong in the example above is that I
assume that when we talk about how many books John read we should be
talking about existential sentences, that is about at least how many books
John read. The alternative would be to describe the number of books John
read by means of the counting quantifier many2, that is, how many books
John read exactly. I’m afraid this only makes the problem worse. Here is a
description of the relevant context in terms of the exact number of books
that were read by John.

(81) a. In some but not all goal worlds: John read exactly 10 books.
b. In no goal world: John read exactly 9 books.
c. In no goal world: John read exactly 1 book.
d. In some but not all goal worlds: John read exactly 11 books.
e. In some but not all goal worlds: John read exactly 12 books.

Now, there is no number n such that John read exactly n books in all goal
worlds. So, the smallest number of books John needs to read does not refer.
The upshot is that there is no satisfactory analysis of examples like (74)
under the assumptions made here. In general, it seems that, under standard
assumptions, there is no satisfactory analysis of minimal requirements.
Whatever way we find to fix the semantics of cases like (74), however, this fix
will work to save the account of class B quantifiers too, for (74) was a literal
spell-out of the proposed interpretation of similar sentences with at least,
minimally, etc. It goes beyond the scope of this article to provide such a fix.
The overview in (81), however, can help to indicate where we should look for
a solution.20 Given that there is no goal world in which John read exactly
n books for n’s smaller than 10, it follows that 10 is the minimal number
of books John could read to please his mother. In other words, examples
like (74) show that, in the scope of a minimality operator, modals that are
lexically universal quantifiers get a weaker interpretation.
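
The contrast between universal and existential force over the exact counts in (81) can be made explicit as follows. This is a sketch under the same toy assumption as before, namely that a goal world is represented by the exact number of books John reads in it; the range of candidate values is an arbitrary illustrative choice.

```python
# Assumption: a goal world is represented by the exact number of books John
# reads in it, in line with the overview in (81).
goal_worlds = {10, 11, 12}

def exactly_in_all(worlds, n):
    """Universal force: John reads exactly n books in every goal world."""
    return all(w == n for w in worlds)

def exactly_in_some(worlds, n):
    """Existential force: John reads exactly n books in at least one goal world."""
    return any(w == n for w in worlds)

candidates = range(0, 20)  # arbitrary illustrative range of cardinalities

universal = [n for n in candidates if exactly_in_all(goal_worlds, n)]
existential = [n for n in candidates if exactly_in_some(goal_worlds, n)]

print(universal)         # values of n required in every goal world
print(min(existential))  # smallest n that John could read and still reach the goal
```

With universal force there is no suitable n at all, so a minimum cannot be computed. With existential force the minimum is 10, which is the intuitively correct minimal requirement.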
That said, it is time to revisit example (64), repeated here as (82).

(82) Jasper should read minimally 10 books.

My proposal generated four logical forms, two of which were contradictory
and two of which were blocked by a non-modified form. Let us revisit one of
these logical forms, namely the one with a narrow scope modal and a doubly
bound counting quantifier, represented in (83). The resulting truth-conditions
were presented above as (84).

(83) [ minimally 10 λn [ should [ Jasper read [ n-many2 books ] ] ] ]

(84) minn (□∃!x[#x = n & book(x) & read(j, x)]) = 10

What the discussion in the current section suggests is that it is a misunderstanding
to assume that (83) is interpreted as (84), and that it looks like there
is a mapping to a form like (85), instead.

(85) minn (∃!x[#x = n & book(x) & read(j, x)]) = 10

This captures the intuitive meaning of (82).

At this point I do not have anything to offer which provides the mechanism
behind the generalisation that the combination of a universal modal and a
minimality operator leads to a semantics which is existential in nature. What
is relevant for the present purposes is that this is a general phenomenon.
Interestingly, this means there are noteworthy connections to other areas
where the semantics of a modal statement appear mysterious. Schwager
(2005), for instance, notices that certain imperatives, which are standardly
considered to have universal modal force, require a weaker semantics. Her
key examples are German imperatives containing for example.

(86) Q: How can I save money?

A: Kauf zum Beispiel keine Zigaretten!
Buy for instance no cigarettes
“For example, don’t buy any cigarettes!”
20 See Nouwen 2010a for a proposal along these lines.

In the context of the question asked in (86), the imperative does not convey
that to comply with the advice, the hearer has to stop buying cigarettes.
Instead, it is interpreted as stating that one of the things one could do to
save money is to stop buying cigarettes. Thus, examples like these display
a mechanism that is similar to the interaction of numeral modifiers and
modals.
The mysterious interaction of modified numerals and modals is moreover
reminiscent of the interaction of modals and disjunction (Zimmermann 2000;
Geurts 2005; Aloni 2007), especially since, on an intuitive level at least,
a class B modified numeral like minimally 10 (and, quite obviously, 10 or
more) appears to correspond to a disjunction of alternative cardinalities,
with 10 as the minimal disjunct.21 A central issue in the literature on modals
and disjunction is that classical semantic assumptions fail to capture the
entailments of sentences where a disjunctive statement is embedded under a
modal operator (Kamp 1973). A detailed comparison of this complex issue
with the discussion of minimal requirements that I presented here, however,
will be left to further research.

6 More about the A/B distinction

In this section, I will attempt to give some initial answers to three empirical
questions concerning the distinction between class A and B modified nu-
merals that is central to this article. First of all, I turn to the issue of which
expressions go with which class. So far, I have restricted my attention mostly
to, on the one hand, comparative quantifiers (as prototypical class A expressions)
and, on the other hand, superlative, minimality/maximality and up to-modified
numerals (as representatives of class B). What about expressions
like the prepositional over n or under n or the double bound between n and
m or from n to m? Below, I will turn briefly to such expressions.
A second empirical question concerns the validity of the examples used
so far. Although I believe that the intuitions concerning the constructed
examples in this article are rather clear, my plea for two kinds of modified
numerals would still benefit from some independent objective support.
Below, I present the results of a small corpus study that clearly reflects the
distinction argued for in this article.
Finally, this section will turn to the cross-linguistic generality of the
21 See Nilsen 2007 and Büring 2008 for suggestions along this line for the modifier at least

proposal. I will provide data from a more or less random set of languages that
suggest that the class A/B distinction is not a quirk of English or Germanic,
or even Indo-European, but is, in fact, quite general.

6.1 Filling in class A and B

I will leave it an open question exactly which quantifiers belong to which class.
Nevertheless, I can already offer some speculations on several quantifiers
that I have so far not discussed. To start with disjunctive quantifiers, it
appears that these are clear cases of class B expressions.

(87) a. #A triangle has 3 or more sides.

b. #A triangle has 3 or fewer sides.

With disjunctive quantifiers in class B, one might wonder whether there are
any examples of class A expressions which are not the familiar comparative
quantifiers more/fewer/less than n. I think that locative prepositional modi-
fiers are a likely candidate for class A membership, however. In fact, I believe
that the locative/directional distinction in spatial prepositions corresponds
to the class A/B distinction when these prepositions are used as numeral
modifiers.
Roughly, locative prepositions express the location of an object and are
compatible with the absence of directionality or motion. Directional prepo-
sitions, on the other hand, cannot be used as mere indicators of location.

(88) Locative:
a. John was standing under a tree.
b. That cloud is hanging over San Francisco.
c. Breukelen is located between Utrecht and Amsterdam.
(89) Directional:
a. #John was standing up to here.
b. #John was standing from here.
c. #Breukelen is located from Utrecht to Amsterdam.

Now, compare (90a) and (90b).

(90) a. You can get a car for under €1000.

b. You can get a car for maximally €1000.

The example in (90b) is somewhat strange, since it claims that the most
expensive car you can buy is €1000. The example in (90a), in contrast, makes
no such claim. It clearly has a weak reading: there are cars that are cheaper
than €1000 and there might be more expensive ones too. As explained above,
such weak readings are typical for class A quantifiers and do not occur with
class B quantifiers.22 Furthermore, under seems perfectly compatible with
definite amounts, such as in (91).

(91) The total number of guests is under 100. To be precise, it’s 87.

Class A is then not restricted to comparative constructions only. In fact,
other locative prepositions seem to behave similarly to under.

(92) The total number of guests is between 100 and 150. It’s 122.

The locative complex preposition between . . . and . . . contrasts with its
directional counterpart from . . . (up) to . . . , which behaves like a class B
modifier: it is incompatible with definite amounts, as in (93), but felicitous if
it relates to a range of values.

(93) #The ticket to the Stevie Wonder concert that I bought yesterday cost
from €100 to €800.
(94) Tickets to the Stevie Wonder concert cost from €100 to €800.

It appears then that locative prepositions turn into class A modifiers, while
directional ones turn into class B modifiers. A potential counterexample,
however, is over, which apart from a (relatively rarely used) locative sense, as
in (88b), has a directional sense, such as exemplified in (95).

(95) The bird flew over the bridge.

As a numeral modifier, however, over looks like a class A element. In (96),
over 100 is clearly relating the precise weight 104kg with 100kg. Note in
(97) how this contrasts with the directional 100 . . . and up, which is made

22 An anonymous reviewer notes a complication. It appears that under cannot take wide scope
with respect to a modal. That is, it fails to display scope ambiguities such as the one in (20)
above. For instance, (i) (which is an example given by the reviewer) is odd, since it misses an
interpretation where the modified numeral has scope over require.

(i) #John is required to come up with under 6 brilliant ideas.

felicitous by embedding it under an existential modal.

(96) He weighs over 100 kg. To be precise, he weighs 104 kg.

(97) a. #He weighs 100 kg and up.
b. He is allowed to weigh 100 kg and up.

A potential explanation for why the numeral modifier over lacks a direc-
tional/class B sense23 is that the use of prepositions in numeral quantifiers is
restricted to prepositions that are vertically oriented. This is connected to
the observation of Lakoff & Johnson 1980 that cardinality is metaphorically
vertical: more is higher (as in a high number), less is lower (as in a low
number). Prepositions in modified numerals follow this metaphor.24 What
is interesting about over, however, is that only its locative sense is vertical.
Its directional sense, as in (95), rather expresses a mainly horizontal motion.
This could explain why there is no class B sense numeral modifier over.
Further clues that this analysis is on the right track come from Dutch,
where the preposition over lacks a locative sense.

(98) #De wolk hangt over San Francisco.

The cloud hangs over San Francisco.
(99) De vogel vloog over de brug.
The bird flew over the bridge.

Instead of over in (98), boven (above) should be used for locative meanings.

(100) De wolk hangt boven San Francisco.

The cloud hangs above San Francisco.
‘The cloud hangs over San Francisco.’

In Dutch, only boven can modify numerals. Over, which lacks a vertical sense,
is unacceptable in modified numerals.

(101) Inflatie kan {boven / #over} de 10% zijn.

Inflation can {above / over} the 10% be.
‘Inflation can be over 10%’
23 Thanks to Joost Zwarts for discussing this matter with me.
24 Up (to) and under are clearly vertical. Between and from . . . to are compatible with all possible

I will refrain from attempting to offer further evidence for my suggestion
that there is a correspondence between the locative/directional and the A/B
distinction. In any case, it should be clear that the set of prepositional
quantifiers offers an interesting range of contrasts that support the existence
of two classes of modified numerals.
To summarise this subsection, I tentatively put forward the following
classification for English modified numerals.

(102) Class A
(Positive:) more than —, over —
(Negative:) fewer than —, less than —, under —
(Neutral:) between — and —
(103) Class B
(Positive:) at least —, minimally —, from — (up), — or more
(Negative:) at most —, maximally —, up to —, — or fewer, — or less
(Neutral:) from — to —

Missing from this classification are the negative comparative quantifiers like
no more/fewer than 10. The reason for this is that the occurrence of negation
complicates the comparison with other quantifiers. In fact, I think that such
quantifiers are best treated as the compositional combination of a class A
comparative modifier with a negative differential no. See Nouwen 2008b for
the consequences of such a move and for more details on the interpretations
available for sentences containing such quantifiers.

6.2 Support for the A/B distinction from a corpus study

I now turn to a small corpus study I conducted which supports the division
between class A and class B modifiers. Recall that one of the central observations
in favour of the distinction was connected to contrasts such as (104).
Whereas (104a) can be interpreted with respect to a definite actual number
of people invited by Jasper, (104b) does not allow such an interpretation and
instead is evaluated in relation to what the speaker holds possible.

(104) a. Jasper invited fewer than 100 people. 87, to be precise.

b. Jasper invited maximally 100 people. #87, to be precise.

I explained this contrast by proposing that upper bound class B quantifiers
are indicators of maxima. The indication of the maximum of a single value
leads to infelicity. Existential modals, however, introduce a range of (possible)
values, which thereby license the application of the maxima indicator. For
examples like (104b), where no overt modal is present, the hearer will have
to accommodate an interpretation with respect to speaker possibility. Given
that ♦-modals license the application of an upper bound class B modifier,
one would expect, however, that class B modifiers co-occur with an overt
modal operator relatively often. I conducted a corpus study to find out
whether this expectation is fulfilled.

6.2.1 Method

I used the free service for searching the Corpus of Contemporary American
English (COCA, 385 million words, a mix of fiction, science, newspaper and
entertainment texts and spoken word transcripts) at americancorpus.org
(Davies 2008). For each numeral modifier I took 100 quasi-random25 occur-
rences of the modifier with a numeral. For each of these cases I examined
whether the modified numeral was in the scope of an explicit existential
modal operator (such as can, could, might, possibly, allow, etc.). In other
words, I only looked at the surface form and only counted the number of
cases where a modal expression has a scope relation with a modified numeral.
Given the theory presented in this article, the prediction is that this number
is significantly higher with class B modifiers than with class A modifiers.
I compared five modifiers: fewer than, under, between, at most and up
to. Not all occurrences of these modifiers with a numeral in the corpus were
taken into consideration. For instance, (105) was ignored because in this
example up to is probably not a constituent.26 That is, this example contains
the particle verb to lift up, rather than the verb to lift.

(105) Periodically we’d lift up to 60 kilometers where the temperatures
and pressures are more like Earth’s.

I similarly disregarded occurrences of under n where under is a regular
preposition rather than a preposition in the role of a numeral modifier (for
instance, examples resembling He was known under 2 different names).
25 ‘Quasi’, since the results are given in chronological order and I would just take the earliest
26 From: “To boldly go. . . ”, Donald Robertson (1994), Astronomy, Vol. 22, Iss. 12; pg. 34, 8 pgs.


6.2.2 Results

The results, summarised in the table in (106), support the proposal in this
article. Here, P is the percentage of occurrences within an existential modal
context, within a sample of 100 occurrences of that modifier.27

(106)
                 Class A                      Class B
        fewer than   under   between     at most   up to
    P       4%         3%       4%         23%      21%

The corpus thus shows a clear preference for combining class B quantifiers
with existential modal operators, as was predicted.28 Whether the data are as
clear as (106) for other expressions too remains to be seen. It will be difficult
to extend this type of study to other modifiers. Maximally and from. . . to,
for instance, were included in the present corpus search, but did not yield
enough occurrences to make a meaningful comparison.
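
As a rough check on the figures in (106), the significance test can be recomputed from the table itself. The sketch below assumes that the test was a Pearson chi-square on the pooled class A and class B counts, without continuity correction; the article reports only the statistic, so the choice of test and the helper names are assumptions.

```python
import math

# Occurrences in an existential modal context per 100 sampled tokens, from (106).
counts = {
    "A": {"fewer than": 4, "under": 3, "between": 4},
    "B": {"at most": 23, "up to": 21},
}
SAMPLE = 100  # tokens sampled per modifier

def pooled(cls):
    """Return (modal, non_modal) counts pooled over the modifiers of one class."""
    modal = sum(counts[cls].values())
    total = SAMPLE * len(counts[cls])
    return modal, total - modal

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

a, b = pooled("A")  # class A: modal vs. non-modal contexts
c, d = pooled("B")  # class B: modal vs. non-modal contexts
chi2 = chi_square_2x2(a, b, c, d)

# With one degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi2 = {chi2:.1f}, df = 1, p = {p_value:.3e}")
```

Under these assumptions the statistic rounds to the reported 41.2, and the p-value comes out on the order of 10^-10.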

6.3 The cross-linguistic generality of the distinction

The class A/B distinction is not a peculiarity of the English language. I will
suggest in this subsection that, in fact, the distinction is quite general and
that languages seem to fill in the two classes in roughly the same way. Dutch,
for instance, mirrors the English data perfectly. To illustrate, (107) and (108)
show the A/B distinction in a contrast between comparative and superlative
modifiers.

(107) Een driehoek heeft meer dan 1 zijde.

A triangle has more than 1 side.
(108) #Een driehoek heeft minstens 2 zijdes.
A triangle has at least 2 sides.

There are similar contrasts for other numeral modifiers. In a nutshell, the
Dutch data suggest the two classes in (109) and (110), which parallel the English ones.

27 I also counted the number of occurrences in a universal modal context. As would be
predicted, this yielded no significant difference between class A and class B modifiers. For
all modified numerals, this number was between 1 and 5.
28 The contrast between the Class A and Class B data is significant ($\chi^2 = 41.2$, df = 1,
$p = 1.375 \times 10^{-10}$).


(109) Dutch Class A
      (Positive:) meer dan n (more than), boven de n (above the)
      (Negative:) minder dan n (fewer/less than), onder de n (under the)
      (Neutral:) tussen de n en de m (between the … and …)

(110) Dutch Class B
      (Positive:) ten minste n, minstens n, op z’n minst n (at least), vanaf n (from off), zeker n (certain), minimaal n (minimal)
      (Negative:) ten hoogste n, hoogstens n, op z’n hoogst n (at most), tot n (up to), maximaal n (maximal)
      (Neutral:) van n tot m (from … to …)

In other languages, we find similar data. The division between
comparative and superlative modifiers appears to be cross-linguistically quite
general. In Italian, for instance, the following contrast exists.

(111) Un triangolo ha più di 1 lato.

A triangle has more than 1 side.
(112) #Un triangolo ha almeno 2 lati.
A triangle has at least 2 sides.

In Chinese, there also exists a superlative form that behaves like a class B
modifier.

(113) #Sanjiaoxing zui-shao you liang-tiao bian.

triangle most-little have 2-CL side

On the other hand, there also exists an alternative form resembling English
at least, which behaves differently. The form zhi-shao can be used in a
way similar to English at least in sentences like At least it doesn’t rain!
Despite this parallel to the English superlative modifiers, the example in (114)
appears to be fine, which suggests zhi-shao is of type A.

(114) Sanjiaoxing zhi-shao you liang-tiao bian.

triangles to-little have 2-CL side

I leave a more detailed investigation of such data for further research. What-
ever the outcome, however, the data first and foremost reveal that the type
of contrasts that have been the central focus of this paper occur in Chinese
and that, thereby, Chinese also appears to have the class A/B distinction.


Above, I suggested that prepositional numeral modifiers are to be divided
into two classes in accordance with the locative/directional distinction that
exists for their spatial meanings. The clearest case of a class B directional
prepositional modifier in English is up to. In many other languages, one
and the same particle is used for indicating spatial, numerical and temporal
extremes. (In English, up to cannot be used as a temporal operator, for
which until exists.) In Dutch, for instance, the preposition tot has these
three functions. Crucially, in all these three domains tot displays class B
behaviour.

(115) #Een driehoek heeft tot 10 zijdes.

A triangle has up to/until 10 sides.
(116) #Je auto stond tot hier geparkeerd.
Your car stood up to/until here parked.
‘#Your car was parked up to here’
(117) Je auto mag tot hier geparkeerd worden.
Your car may up to/until here parked be.
‘You may park your car up to here’
(118) #Jasper kwam tot middernacht de kamer binnengelopen.
J. came up to/until midnight the room inside-walked.
‘#J. entered the room until midnight’
(119) Jasper mag tot middernacht de kamer binnen komen
J. may up to/until midnight the room inside come
‘J. is allowed to enter the room until midnight’

Similar data exist for German bis (zu), Hebrew ’ad, Catalan fins a, Spanish
hasta and Italian fino a. In fact, in Italian it appears that (120) is generally
awkward, resisting a reading that connects to speaker’s possibility. However,
it becomes acceptable if an overt modal verb is inserted.

(120) ??John ha invitato {al massimo / fino a} 50 amici.

John has invited {at most / until} 50 friends.
(121) John può invitare {al massimo / fino a} 50 amici.
John can invite {at most / until} 50 friends.


7 Conclusion

The central aim of this article has been to put forward the empirical ob-
servation that numeral modifiers come in two classes: those that relate to
definite amounts (class A) and those that resist association with definite
cardinality (class B). Theoretically, I proposed that underlying this distinction
is a difference in the kind of relations numeral modifiers encode: either a
simple comparison relation between numbers (class A) or a relation between
a range of values and its minimum or maximum (class B). I furthermore
showed how this theory can be implemented in a framework where numeral
modifiers are treated as degree quantifiers.
While there already existed analyses of both type A and type B modifiers,
the class difference that was the central focus of this article has not yet been
discussed. For the treatment of class A quantifiers in this article I adopted
the proposal of Hackl 2001. My account of class B modifiers, on the other
hand, is original. It can be compared to two closely related proposals on the
semantics of superlative modifiers: Geurts & Nouwen 2007, where superlative
modified numerals are proposed to lexically specify modal operators, and
Krifka 2007b, where superlative quantifiers are proposed to be speech act
modifiers. Neither work discusses the class A/B distinction, but I take it
that both these proposals, in view of the main observations of this article,
can be viewed as accounts not just of superlative quantifiers, but of class
B members in general. As suggested in section 5, my proposal is in certain
respects quite close to Krifka’s. It differs greatly, however, from Geurts &
Nouwen 2007 in the way the interaction between modified numerals and
modality is accounted for. In a way, the current article as well as Krifka
2007b represent a position where quantifiers lexically specify quite minimal
functions, which consequently leads to much of the work being done by
pragmatic mechanisms (such as blocking). For the proposal in Geurts &
Nouwen 2007, on the other hand, the balance is different in that a much
greater burden is placed on semantics. An in-depth comparison of these
accounts of class B quantifiers, however, is left for further research.

References


Aloni, Maria. 2007. Free choice, modals, and imperatives. Natural Language
Semantics 15(1). 65–94. doi:10.1007/s11050-007-9010-2.
Atlas, Jay David & Stephen C. Levinson. 1981. It-clefts, informativeness, and
logical form: Radical pragmatics (revised standard version). In Peter Cole
(ed.), Radical pragmatics, 1–61. New York: Academic Press.
Barwise, John & Robin Cooper. 1981. Generalized quantifiers and natural lan-
guage. Linguistics and Philosophy 4(2). 159–219. doi:10.1007/BF00350139.
Blutner, Reinhard. 2000. Some aspects of optimality in natural language in-
terpretation. Journal of Semantics 17(3). 189–216. doi:10.1093/jos/17.3.189.
Breheny, Richard. 2008. A new look at the semantics and pragmatics of
numerically quantified noun phrases. Journal of Semantics 25(2). 93–140.
Büring, Daniel. 2008. The least at least can do. In Charles B. Chang &
Hannah J. Haynie (eds.), Proceedings of WCCFL 26, 114–120. Somerville,
Massachusetts: Cascadilla Press.
Corblin, Francis. 2007. Existence, maximality and the semantics of numeral
modifiers. In Ileana Comorovski & Klaus von Heusinger (eds.), Existence:
Semantics and syntax (Studies in Linguistics and Philosophy 84), Springer.
Corver, Norbert & Joost Zwarts. 2006. Prepositional numerals. Lingua 116(6).
811–836. doi:10.1016/j.lingua.2005.03.008.
Davies, Mark. 2008. The corpus of contemporary American English (COCA):
385 million words, 1990-present. Available online at http://www.
von Fintel, Kai & Sabine Iatridou. 2005. What to do if you want to go to
Harlem: Anankastic conditionals and related matters. Ms. MIT, available
on http://mit.edu/fintel/www/harlem-rutgers.pdf.
Geurts, Bart. 2005. Entertaining alternatives: disjunctions as modals. Natural
Language Semantics 13(4). 383–410. doi:10.1007/s11050-005-2052-4.
Geurts, Bart. 2006. Take five: the meaning and use of a number word.
In Svetlana Vogeleer & Liliane Tasmowski (eds.), Non-definiteness and
plurality, 311–329. Amsterdam/Philadelphia: Benjamins. Pre-published
version available at http://ncs.ruhosting.nl/bart/papers/five.pdf.
Geurts, Bart & Rick Nouwen. 2007. At least et al.: the semantics of scalar
modifiers. Language 83(3). 533–559.
Grice, Paul. 1975. Logic and conversation. In Peter Cole & Jerry L. Morgan
(eds.), Syntax and semantics 3: Speech acts, 41–58. New York: Academic Press.
Hackl, Martin. 2001. Comparative quantifiers: Department of Linguistics
and Philosophy, Massachusetts Institute of Technology dissertation.
Heim, Irene. 1991. Artikel und Definitheit. In Arnim von Stechow & Dieter
Wunderlich (eds.), Semantik: Ein internationales Handbuch der zeitgenössischen
Forschung, Berlin: de Gruyter.
Heim, Irene. 2000. Degree operators and scope. In Proceedings of SALT 10,
Ithaca, NY: CLC Publications.
Horn, Laurence R. 1984. Toward a new taxonomy for pragmatic inference:
Q-based and R-based implicature. In Deborah Schiffrin (ed.), Meaning,
form and use in context, 11–42. Washington: Georgetown University Press.
Kamp, Hans. 1973. Free choice permission. Proceedings of the Aristotelian
Society 74. 57–74.
Kennedy, Christopher. 1997. Projecting the adjective: the syntax and semantics
of gradability and comparison: UCSD PhD. Thesis.
Kiparsky, Paul. 1973. "Elsewhere" in phonology. In Stephen R. Anderson &
Paul Kiparsky (eds.), A festschrift for Morris Halle, 93–106. New York: Holt,
Rinehart & Winston.
Kiparsky, Paul. 1983. Word formation and the lexicon. In Proceedings of
the 1982 Mid-America Linguistics Conference, 47–78. Lawrence, Kansas:
University of Kansas.
Krifka, Manfred. 1999. At least some determiners aren’t determiners. In Ken
Turner (ed.), The semantics/pragmatics interface from different points of
view vol. 1, 257–291. Elsevier.
Krifka, Manfred. 2007a. Approximate interpretation of number words: A case
for strategic communication. In Irene Vogel & Joost Zwarts (eds.), Cognitive
foundations of communication, Amsterdam: Koninklijke Nederlandse
Akademie van Wetenschappen.
Krifka, Manfred. 2007b. More on the difference between more than two and
at least three. Paper presented at University of California at Santa Cruz,
available at http://amor.rz.hu-berlin.de/~h2816i3x/Talks/SantaCruz2007.
Lakoff, George & Mark Johnson. 1980. Metaphors we live by. University of
Chicago Press.
McCawley, James. 1978. Conversational implicature and the lexicon. In Peter
Cole (ed.), Syntax and semantics 9: Pragmatics, New York: Academic Press.
Nilsen, Øystein. 2007. At least: Free choice and lowest utility. Paper presented
at ESSLLI workshop on quantifier modification.
Nouwen, Rick. 2008a. Directionality in modified numerals: the case of up to.
Semantics and Linguistic Theory 18. doi:1813/13056.
Nouwen, Rick. 2008b. Upper-bounded no more: the implicatures of
negative comparison. Natural Language Semantics 16(4). 271–295.


Nouwen, Rick. 2009. Two kinds of modified numerals. In T. Solstad &
A. Riester (eds.), Proceedings of Sinn und Bedeutung 13, Available at http:
//www.let.uu.nl/~Rick.Nouwen/personal/papers/sub09.pdf, 15 pages.
Nouwen, Rick. 2010a. Two puzzles of requirement. In Maria Aloni & Katrin
Schulz (eds.), The Amsterdam Colloquium 2009, Springer. http://www.
Nouwen, Rick. 2010b. What’s in a quantifier? In Martin Everaert, Tom Lentz,
Hannah de Mulder, Øystein Nilsen & Arjen Zondervan (eds.), The linguistic
enterprise: From knowledge of language to knowledge in linguistics (Lin-
guistik Aktuell/Linguistics Today 150), John Benjamins. Pre-published
version available at http://www.hum.uu.nl/medewerkers/r.w.f.nouwen/
van Rooij, Robert. 2004. Signalling games select Horn strategies. Linguistics
and Philosophy 27(4). 493–527. doi:10.1023/B:LING.0000024403.88733.3f.
Schwager, Magdalena. 2005. Exhaustive imperatives. In Paul Dekker & Michael
Franke (eds.), Proceedings of the 15th Amsterdam Colloquium, Universiteit
van Amsterdam.
Solt, Stephanie. 2007. Few more and many fewer: complex quantifiers based
on many and few. In Rick Nouwen & Jakub Dotlacil (eds.), Proceedings of
the ESSLLI 2007 Workshop on Quantifier Modification.
Takahashi, Shoichi. 2006. More than two quantifiers. Natural Language
Semantics 14(1). 57–101. doi:10.1007/s11050-005-4534-9.
Umbach, Carla. 2006. Why do modified numerals resist a referential in-
terpretation? In Proceedings of SALT 15, 258 – 275. Cornell University
Zimmermann, Thomas Ede. 2000. Free choice disjunction and epis-
temic possibility. Natural Language Semantics 8(4). 255–290.

Dr. R.W.F. Nouwen

Utrecht Institute for Linguistics OTS
Janskerkhof 13, NL-3512 BL
Utrecht, the Netherlands

Semantics & Pragmatics Volume 3, Article 4: 1–42, 2010
doi: 10.3765/sp.3.4

Iffiness∗

Anthony S. Gillies
Rutgers University

Received 2009-06-24 / First Decision 2009-08-07 / Revised 2009-09-13 / Second
Decision 2009-09-21 / Revised 2009-10-14 / Accepted 2009-11-18 / Final Version
Received 2010-01-17 / Published 2010-02-01

How do ordinary indicative conditionals manage to convey conditional in-
formation, information about what might or must be if such-and-such is
or turns out to be the case? An old school thesis is that they do this by
expressing something iffy: ordinary indicatives express a two-place condi-
tional operator and that is how they convey conditional information. How
indicatives interact with epistemic modals seems to be an argument against
iffiness and for the new school thesis that if-clauses are merely devices for
restricting the domains of other operators. I will make the trouble both clear
and general, and then explore a way out for fans of iffiness.

Keywords: indicative conditionals, epistemic modality, if-clauses, conditionals,
strict conditionals, dynamic semantics

1 An iffy thesis

One thing language is good for is imparting plain and simple information:
there is an extra chair at our table or we are all out of beer. But, happily, we
do not only exchange plain information about tables, chairs, and beer mugs.
We also exchange conditional information thereof: if we are all out of beer, it
is time for you to buy another round. That is very useful indeed.

∗ This paper has been around awhile, versions of it circulating since 05.2006 and accruing
a lot of debts of gratitude along the way. Chris Kennedy, Jim Joyce, Craige Roberts, Josef
Stern, Rich Thomason, audiences at the Rutgers Semantics Workshop (October 2007), the
Michigan L&P Workshop (Lite Version, November 2007), the Arché Contextualism & Relativism
Workshop (May 2008), the University of Chicago Semantics & Philosophy Language Workshop
(March 2009), and especially (actually, especially∗) Josh Dever, David Beaver, Kai von
Fintel, Brian Weatherson, and the anonymous S&P referees have all done their best trying
to save me from making too many howlers. But too many is surely context dependent, so
caveat emptor. This research was supported in part by the National Science Foundation
under Grant No. BCS-0547814.

©2010 A. S. Gillies
This is an open-access article distributed under the terms of a Creative Commons Non-
Commercial License (creativecommons.org/licenses/by-nc/3.0).

Conditional information is information about what might or must be, if
such-and-such is or turns out to be the case. My target here has to do with
how such conditional information manages to get expressed by indicative
conditionals (not so called because anyone thinks that’s a great name but
because no one can do any better). Some examples:

(1) a. If the goat is behind door #1, then the new car is behind door #2.
b. If the No. 9 shirt regains his form, then Barça might advance.
c. If Carl is at the party, then Lenny must also be at the party.

Each of these is an ordinary indicative, two of them have epistemic modals in
the consequent clause, and all of them express a bit of ordinary conditional
information.1 What I am interested in is how well the indicatives play with
the epistemic modals.
What these examples say is plain. Take (1b). This says that — within
the set of possibilities compatible with the information at hand — among
those in which the star striker regains his form, some are possibilities in
which Barça advance. Or take (1c). It says something about the occurrence
of Lenny-is-at-the-party possibilities within the set of Carl-is-at-the-party
possibilities — that, given the information at hand, every possibility of the
latter stripe is also of the former stripe. So what sentences like these say is
plain. How they say it isn’t. That’s my target here: How is it that the ifs in
our examples manage to express conditional information and do so in a way
compatible with how they play with epistemic modals?
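
The glosses just given for (1b) and (1c) can be made concrete in a toy model. This is only a sketch of the target readings: the world labels, the proposition names and the two helper functions are illustrative assumptions, and a proposition is modelled as the set of worlds at which it is true.

```python
# Toy model. World labels and proposition names are illustrative assumptions.
# A proposition is modelled as the set of worlds at which it is true.

def some_p_are_q(live, p, q):
    """(1b)-style reading: among the live p-possibilities, some are q-possibilities."""
    return any(w in q for w in live if w in p)

def every_p_is_q(live, p, q):
    """(1c)-style reading: every live p-possibility is a q-possibility."""
    return all(w in q for w in live if w in p)

live = {"w1", "w2", "w3"}          # compatible with the information at hand
striker_in_form = {"w1", "w2"}
barca_advance = {"w2"}
carl_at_party = {"w2", "w3"}
lenny_at_party = {"w1", "w2", "w3"}

print(some_p_are_q(live, striker_in_form, barca_advance))   # True
print(every_p_is_q(live, carl_at_party, lenny_at_party))    # True
print(every_p_is_q(live, striker_in_form, barca_advance))   # False
```

The sketch says nothing about how these readings are composed from if and the modal, which is the question pursued in the text.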
The simplest story about how the ifs in our examples manage to express
conditional information is that each of them expresses the information of
a conditional. Which is to say: what these conditional sentences mean can
be read off the fact that if expresses a conditional operator. Let’s say that
a story about if is iffy iff it takes if to express a bona fide operator, a bona
fide iffy operator (that is, a conditional operator properly so called), and the
same bona fide iffy operator in each of the sentences in (1). We will have to
sharpen that up by saying what it means for an operator to be a conditional
operator properly so called. But that is the gist: iffiness (a.k.a. the operator
view) is the thesis that ordinary indicative conditionals manage to express
conditional information because if expresses a conditional operator.
1 We ought to be careful to distinguish between conditional sentences (sentences of natural
language), conditional connectives (two-place sentential connectives in some regimented
language that may serve to represent the logical forms of conditional sentences), and
conditional operators (relations that may serve as the denotations of conditional connectives).
Depending on your upbringing, the operator view of if may well seem
either obvious or obviously wrongheaded. More on that below. Either way,
it is a hard line to maintain: how conditional sentences play with epistemic
modals seems to refute it. A seeming refutation isn’t quite the same as an
actual one, though. I will show that the refutation isn’t quite right by showing
how fans of iffiness can account for what needs accounting for. But before
showing how the operator view can be made to account for how ifs and
modals interact I want to make it look for all the world like it can’t be done.

2 Doom and how to avoid it (sketches thereof)

The operator view is an old school story about indicatives. It says that if
expresses some relation between the (semantic value of the) antecedent and
consequent. So if takes its place alongside other connectives and expresses
an operator — the same operator — on the semantic values of the sentences it
takes as arguments.2 To tell a story like this we have to say exactly what that
operator is. But not just any telling will do. I want to show how our simple
examples cause what looks like insurmountable trouble (doom, even) for any
version of the operator view. Here’s an informal sketch of the trouble, what
rides on it, and how — eventually — we can and ought to get out of the mess.
Take this sketch as a promissory note that a formally precise version of all
that can be given; the rest of the paper makes good on that.
Suppose if expresses the limit case conditional operator of material
implication. Iffiness requires that in sentences like (1b) and (1c) either the
epistemic modals outscope the conditionals or the conditionals outscope
the modals. Neither choice gets the truth conditions right if the conditional
operator is the horseshoe. That’s easy to see (and well known).3 Linguists
grow up on arguments like that. That is one reason why even though the
operator view is the first thing a logician thinks of, it is the last thing a
linguist does.
2 If is a little word with a big history — a big history that we can’t adequately tour here. But
there are guides for hire: for instance, Bennett (2003) and von Fintel (2009).
3 The material conditional analysis of ordinary indicatives is defended (in somewhat different
ways) by, for example, Grice (1989), Jackson (1987), and Lewis (1976). A textbook version of
this “no-scope” argument that has the horseshoe analysis as its target appears in von Fintel
& Heim 2007.
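
One standard way to see the no-scope problem for the horseshoe is a small countermodel for a might-conditional like (1b). The sketch below assumes that the horseshoe is material implication and that might quantifies existentially over the live possibilities; the model, the world labels and the function names are illustrative assumptions, and only the might case is covered.

```python
# Toy model; labels are illustrative assumptions. Propositions are sets of worlds.
live = {"i", "j", "k"}     # live possibilities (the same at every live world)
p = {"j"}                  # the striker regains his form
q = {"k"}                  # Barça advance

def intended(live, p, q):
    """Target reading: some live p-possibility is a q-possibility."""
    return any(w in q for w in live if w in p)

def conditional_over_modal(live, p, q, w):
    """Horseshoe outscopes the modal: p ⊃ might q, evaluated at world w."""
    return (w not in p) or any(v in q for v in live)

def modal_over_conditional(live, p, q):
    """Modal outscopes the horseshoe: might (p ⊃ q)."""
    return any((v not in p) or (v in q) for v in live)

print(intended(live, p, q))                     # False
print(conditional_over_modal(live, p, q, "i"))  # True
print(modal_over_conditional(live, p, q))       # True
```

In this model no live p-possibility is a q-possibility, so the intended reading fails, yet both scope choices come out true: the first because p is false at the world of evaluation, the second because some live world falsifies p.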


But (as I’ll show) this very same trouble holds no matter what conditional
operator an iffy story says if expresses. To see that requires two things. First,
we need to say in a precise way what counts as a conditional operator (Section
4). Given some pretty weak assumptions iffiness requires that if means all
(well, all relevant). Second, there are some characteristic Facts about how
indicatives and epistemic modals interact (Section 5). These neatly divide:
there are some consistency facts and there are some intuitive entailment
facts. The operator view requires that either the conditionals outscope the
modals or the modals outscope the conditionals. Something general then
follows: no matter what conditional operator we say if expresses, one scope
choice is ruled out by the consistency facts, the other by the entailments
(Section 6).
That seems to be bad news for any fan of any version of the old school
operator view. And there seems to be more bad news in the offing since
the operator view isn’t the only game in town (in some circles, it’s a game
played only on the outskirts of town). The anti-iffiness rival — a.k.a. the
restrictor view — is a new school approach. It embraces Kratzer’s thesis that
if is not a connective at all: it doesn’t express an operator, a fortiori not
an iffy operator, and a fortiori not the same iffy operator in each of our
example sentences it figures in.4 Instead, says the restrictor analysis, if
simply restricts other operators. In the cases we will care about, it restricts
(possibly covert) epistemic modals. The restrictor view makes embarrassingly
quick work of the data that spells such trouble for the operator view (Section
But the success of the restrictor analysis is no argument against Chuck
Taylors and skyhooks tout court. That’s because there are old school stories
that say that if expresses a strict conditional operator over possibilities
compatible with the context, and that it can do all the restricting that needs
doing (Section 8). Once we see just how, we can look back and see more
clearly what is at stake in the difference between new school and old, why
iffiness is worth pursuing (Section 9), and how this version of the old school
story relates to recent dynamic semantic treatments (Section 10).
4 The restrictor view gets its inspiration from Lewis’s (1975) argument that certain ifs (under
adverbs of quantification) cannot be understood as expressing some conditional but rather
serve to mark an argument place in a polyadic construction. Kratzer’s thesis is that this holds
for if across the board. The classic references are Kratzer 1981, 1986. There is another rival,
too: some take if to be an operator, but an operator that does not (when given arguments)
express a proposition (Adams 1975; Gibbard 1981; Edgington 1995, 2008). Instead, they say,
ifs express but do not report conditional beliefs on the part of their speakers. I will ignore
this view here: it doesn’t really start off as the most plausible candidate, the trouble I make
here about how ifs and modals interact makes it less plausible not more, and it will just take
us too far afield.

3 Ground rules

Let’s simplify. Assume that meanings get associated with sentences by getting
associated with formulas in an intermediate language that represents the
relevant logical forms (lfs) of them. Thus a story, old school or otherwise,
has to first say what the relevant lfs are and then assign those lfs semantic
values.
We will begin with an intermediate language L that has a conditional
connective that will serve to represent the lfs of ordinary indicatives. So let
L be generated from a stock of atomic sentence letters, negation (¬), and
conjunction (∧) in the usual way. But L also has the connective (if ·)(·),
and the modals must and might. What I have to say can be said about
an intermediate language that allows that the modals mix freely with the
formulas of the non-modal fragment of L but restricts (if ·)(·) so that it
takes only non-modal sentences in its first argument. So assume that L is
such an intermediate language. When these restrictions outlive their utility,
we can exchange them for others.5
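
The shape of L can be sketched as a small abstract syntax with a well-formedness check. The class names and the functions is_non_modal and in_L are hypothetical, and the sketch makes one assumption the text leaves open: "non-modal" is read as "boolean", so an antecedent contains neither a modal nor another if.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical abstract syntax for the intermediate language L.

@dataclass(frozen=True)
class Atom:
    name: str

@dataclass(frozen=True)
class Not:
    sub: "Formula"

@dataclass(frozen=True)
class And:
    left: "Formula"
    right: "Formula"

@dataclass(frozen=True)
class Might:
    sub: "Formula"

@dataclass(frozen=True)
class Must:
    sub: "Formula"

@dataclass(frozen=True)
class If:
    antecedent: "Formula"
    consequent: "Formula"

Formula = Union[Atom, Not, And, Might, Must, If]

def is_non_modal(f):
    """True for formulas built from atoms, negation and conjunction only (assumption)."""
    if isinstance(f, Atom):
        return True
    if isinstance(f, Not):
        return is_non_modal(f.sub)
    if isinstance(f, And):
        return is_non_modal(f.left) and is_non_modal(f.right)
    return False

def in_L(f):
    """True if f respects the restriction on the first argument of (if .)(.)."""
    if isinstance(f, Atom):
        return True
    if isinstance(f, (Not, Might, Must)):
        return in_L(f.sub)
    if isinstance(f, And):
        return in_L(f.left) and in_L(f.right)
    if isinstance(f, If):
        return is_non_modal(f.antecedent) and in_L(f.consequent)
    return False

print(in_L(If(Atom("p"), Might(Atom("q")))))   # True: modal in the consequent
print(in_L(Must(If(Atom("p"), Atom("q")))))    # True: modal over the conditional
print(in_L(If(Might(Atom("p")), Atom("q"))))   # False: modal in the antecedent
```

Both scope configurations that matter later, a modal inside the consequent and a modal over the whole conditional, count as well-formed here.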
Iffiness requires that the if of English expresses something properly iffy.
That leaves open just which conditional operator we say that the if of English
means. But our choices here are not completely free, and some ground rules
will impose some order on what we may say. These will constrain our choice
by saying what must be true for a conditional operator to be rightfully so
called. But before getting to that, I’ll start with what I will assume about
First, a general constraint: assume that truth-values, for the ifs and
the modals (when we come to that) as well as for the boolean fragment of
L, are assigned at an index (world) i with respect to a context. I will assume
that W, the space of possible worlds, is finite. Nothing important turns on
this, and it simplifies things.
For the fragment of L with no modals and no ifs, contexts are idle. It will
be the job of the modals to quantify over sets of live possibilities and the job
of contexts to select these sets of worlds over which the modals do their job.
What I want to say can be said in a way that is agnostic about just what kinds
of things contexts are: all I insist is that, given a world, they determine a set
of possibilities that modals at that world quantify over.6 The functions doing
the determining need to be well-behaved.
5 Conventions: p, q, r, . . . range over sentences of L (subject to our constraints on L); i, j, k, . . .
range over worlds; and P, Q, R, . . . range over sets of worlds. And let’s not fuss over whether
what is at stake is the ‘if’ of English or the ‘if’ of L; context will disambiguate.
Given a context c — replete with whatever things contexts are replete
with — an epistemic modal base C determined by it is just what we need:

Definition 3.1 (modal bases). Given a context c, C is a modal base (for c)
only if:

$C = \lambda i.\,\{\, j : j \text{ is compatible with the } c\text{-relevant information at } i \,\}$

Since the only context dependence at stake here will be dependence on
such bases, we can get by just as well by taking them to go proxy for bona
fide contexts, granting them the honorific “contexts”, and relativizing the
assignment of truth-values to index–modal base pairs directly. So we’ll be
saying just which function $\llbracket \cdot \rrbracket^{C,i} : L \to \{0, 1\}$ is, where C represents the
relevant contextual information. No harm comes from that, and it makes for
a prettier view.7
But not just any function from indices to sets of indices will do as a
(proxy) context. So we constrain C’s accordingly, requiring that they are
well-behaved — that is, reflexive and euclidean:
6 The problems and prospects for iffiness are independent of just whose information in a
context — speaker, speaker plus hearer, just the hearer, just the hearer’s picture of what the
speaker intends, and so on — counts for selecting the domains for the modals to do their job,
and whether or not that information is information-at-a-context at all. So let’s keep things
simple here. If you’d rather be reading a paper which has these (and other) complexities at
the forefront, see von Fintel & Gillies 2007, 2008a,b and the references therein.
7 Three comments. First: take $\llbracket \cdot \rrbracket^{C}$ to be shorthand for $\{\, i : \llbracket \cdot \rrbracket^{C,i} = 1 \,\}$. If p’s denotation
is invariant across contexts (if $\llbracket p \rrbracket^{C} = \llbracket p \rrbracket^{C'}$ no matter the choice for $C$ and $C'$), let’s
agree to conserve a bit of (virtual) ink and sometimes omit the superscript: so, e.g., the
ifs I am focusing on here have non-modal antecedents, and so those antecedents will be
context-invariant. Second: it’s a little misleading to say that the only context dependence
is dependence on modal bases since we will want to allow the possibility that what worlds
are relevant to an if at a world can vary across contexts. But, in fact, we can (and will)
still leave room for that possibility by constraining how contexts and the sets of if-relevant
possibilities relate. Third: if I had different ambitions, we couldn’t simplify quite like this. If
the interaction at center stage were how ifs and quantifiers interact, or if the modals in the
if/modal interaction were deontic, then we’d want our contexts to rightly characterize the
kind of information at stake and taking them to determine sets of possibilities compatible
with what is known would not do. But my ambitions here aren’t different from what they
are.


Definition 3.2 (well-behavedness). C is well-behaved iff:

i. i ∈ Ci (reflexiveness)
ii. if j ∈ Ci then Ci ⊆ Cj (euclideanness)
C represents a (proper) context only if it is well-behaved.

Observation 3.1. If C is well-behaved then Ci is closed: well-behavedness
implies that if j ∈ Ci, then Cj = Ci.

Proof. Suppose j ∈ Ci. Consider any k ∈ Cj. Since C is euclidean and j ∈ Ci,
Ci ⊆ Cj. Since C is reflexive, i ∈ Ci and thus i ∈ Cj. Appeal to euclideanness
again: since k ∈ Cj, Cj ⊆ Ck; but i ∈ Cj and so i ∈ Ck. And once more: since
i ∈ Ck, Ck ⊆ Ci. And now reflexiveness: k ∈ Ck and so k ∈ Ci. Since k was
arbitrary, Cj ⊆ Ci. (The inclusion in the other direction just is euclideanness.)
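
Observation 3.1 can also be spot-checked by brute force. The sketch below is only an illustration under assumptions of mine: a toy space of three worlds, with a context C modeled as a dictionary from worlds to sets of worlds. The names WORLDS, well_behaved, and closed are hypothetical.

```python
from itertools import combinations, product

WORLDS = (0, 1, 2)  # toy set of indices (an assumption for illustration)

def subsets(xs):
    """All subsets of xs, as frozensets."""
    xs = tuple(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def all_contexts():
    """Every function C from worlds to sets of worlds, as a dict."""
    for choice in product(subsets(WORLDS), repeat=len(WORLDS)):
        yield dict(zip(WORLDS, choice))

def well_behaved(C):
    # Definition 3.2: reflexive (i in Ci) and euclidean (j in Ci implies Ci <= Cj).
    reflexive = all(i in C[i] for i in C)
    euclidean = all(C[i] <= C[j] for i in C for j in C[i])
    return reflexive and euclidean

def closed(C):
    # Observation 3.1: j in Ci implies Cj == Ci.
    return all(C[j] == C[i] for i in C for j in C[i])

well_behaved_contexts = [C for C in all_contexts() if well_behaved(C)]
counterexamples = [C for C in well_behaved_contexts if not closed(C)]
```

On this toy space the well-behaved contexts are exactly the partitions of the three worlds, and none of them fails closure.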

Gloss Ci as the set of live possibilities at i in C. That Ci is closed means
that the live possibilities in Ci do not vary across worlds compatible with C.8

4 Conditional operators

By saying something about what must be true of an operator for it to be
a conditional operator properly so called we thereby say something about
what must be true for a story to be iffy. Taking if to express a bona fide
conditional operator requires, minimally, two things.
Thing one: it requires, in the cases we’ll care about, that if such-and-
such, then thus-and-so doesn’t take a stand on whether such-and-such is
the case and so conditionals like that are typically happiest being uttered
in circumstances in which such-and-such is compatible with the context as
it stands when the conditional is issued. I will take this as a definedness
condition on the semantics for our conditional connective.

Definition 4.1 (definedness). ⟦if p q⟧^{C,i} is defined only if p is compatible
with Ci.

This is a weak constraint.9

8 Given euclideanness, we could get by with different assumptions on C to the same effect.
But reflexiveness is a constraint it makes sense to want since, when we come to them,
epistemic modals — what might or must be in virtue of what is known — in a given context
will quantify over the set of possibilities compatible with that context.
9 The motivating idea isn’t novel (see, e.g., Stalnaker 1975): if it’s ruled out that p in C,
and you want to say something conditional on p in C, then you should be reaching for a

A. S. Gillies

Thing two: it requires that if expresses a relation between antecedent
and consequent. Whether if such-and-such, then thus-and-so is true depends
on whether the relevant worlds at which such-and-such is true bear the
right relationship to the worlds where thus-and-so is true. Take an arbitrary
conditional like if p q at i, in C. And let P and Q be the sets of antecedent
and consequent possibilities so related by the if . Now we need to zoom in on
the relevant worlds in P . So let Di be the set of if -relevant worlds at i. For if
to express a conditional operator properly so called, its denotation must be a
relation R between P -together-with-the-relevant-possibilities-Di and Q.
Di is the set of possibilities relevant for the if at i. Since Di is a function
of i, different worlds may be relevant for one and the same if when evaluated
at different worlds. But, depending on your favorite theory, Di may be a
function of more than just i: it may be a function of i, of C, of p, of q,
or of your kitchen sink. We will return to that shortly. No matter your
favorite theory, we can still ex ante agree to this much: i is always among
the possibilities relevant for an if at i, and only possibilities compatible with
the context are relevant for an if at i. That is: Di is the set of if -relevant
worlds at i only if i ∈ Di and Di ⊆ Ci . The first requirement is a platitude:
the facts at a world are always relevant to whether an indicative at that world
is true. The second means that an indicative in a context is supposed to say
something about the possibilities compatible with that context.
Beyond this, what your favorite theory implementing the operator view
says about Di may vary because what stories say counts as an if -relevant
possibility varies. But what does not vary is that all such stories determine
Di in a pretty straightforward way and so the denotation they assign to if
can be put as a relation between the relevant antecedent possibilities and the
consequent possibilities. Three examples:
Example 1 (variably strict conditional). Suppose your favorite story
takes if to be a variably strict conditional based on some underlying ordering
of possibilities (Stalnaker 1968; Lewis 1973). For every world i, let ≼i be
an ordering of worlds, a relation of comparative similarity (at least) weakly
centered on i. Given a conditional if p q at i in C, you will want to
identify Di with the set of possibilities no more dissimilar than the most
similar p-world to i, restricted by Ci .
counterfactual not an indicative. That can be implemented in any number of ways, including
making it a presupposition of if -clauses (see, e.g., von Fintel 1998a).

Example 2 (strict conditional). Suppose your favorite Lewis-inspired story
comes not from D.K. but from C.I. You thus take if to be strict implication
(restricted to C). But that, too, can be put in terms of orderings: your ordering
≼i is universal, treating all worlds the same. Whence it follows that, since
the nearest p-world is the same distance from i as is every world, taking
Di to be the set of possibilities no further from i than the nearest p-world
amounts to taking Di to be the set of all worlds W, restricted by Ci.
Example 3 (material conditional). Suppose you are smitten by truth-tables,
and your favorite incarnation of the operator view is the material conditional
story. Equivalently: you will have a maximally discerning ordering (every
world an island) and take Di to be the set of closest worlds to i simpliciter
according to that ordering. For an if at i you will thus take Di to be {i}.
(For an if at some other world j, even an if with the same antecedent and
consequent as the one at i, take Dj to be {j}.)
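
The three examples differ only in how Di is picked out, which a small sketch can make concrete. Everything named here is a hypothetical stand-in: worlds are integers, similarity is modeled by a numeric distance, and the nearest p-world is looked for inside Ci (one way of reading "restricted by Ci").

```python
def nearest_domain(i, C_i, P, dist):
    """Worlds in C_i no farther from i than the nearest P-world in C_i."""
    p_worlds = C_i & P
    if not p_worlds:
        # Definition 4.1: the conditional is undefined if p is incompatible with C_i.
        raise ValueError("antecedent incompatible with C_i")
    radius = min(dist(i, w) for w in p_worlds)
    return frozenset(w for w in C_i if dist(i, w) <= radius)

def variably_strict_domain(i, C_i, P):
    # Example 1: an ordering weakly centered on i (here, numeric distance).
    return nearest_domain(i, C_i, P, dist=lambda a, b: abs(a - b))

def strict_domain(i, C_i, P):
    # Example 2: the universal ordering treats all worlds the same.
    return nearest_domain(i, C_i, P, dist=lambda a, b: 0)

def material_domain(i, C_i, P):
    # Example 3: every world an island, so the closest worlds to i are just {i}.
    return frozenset({i})

C_i = frozenset({0, 1, 2, 3})
P = frozenset({2, 3})
domains = {
    "variably strict": variably_strict_domain(0, C_i, P),
    "strict": strict_domain(0, C_i, P),
    "material": material_domain(0, C_i, P),
}
```

In each case the resulting domain contains i and is included in Ci, as the shared ground rules require.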
Summing this all up: even before taking a stand on just what relation
between relevant antecedent possibilities and consequent possibilities that if
must express in order to express a conditional operator properly so called,
we know that it must still express such a relation. So let’s insist that we
can put things that way, parametric on just how Di gets picked out and
so parametric on what counts as “relevant” antecedent possibilities and so
parametric on the details of your favorite theory:

Definition 4.2 (relationality). (if ·)(·) expresses a conditional only if its
truth conditions can be put this way:

if defined, ⟦if p q⟧^{C,i} = 1 iff R(Di ∩ P, Q)

for some set of possibilities Di and relation R, where i ∈ Di and Di ⊆ Ci.

But not just any relation between Di ∩ P and Q counts as a conditional
relation properly so called. I insist on three minimal constraints on R, for any
P and Q: (i) that Di ∩ P imposes some order on the set of Q’s so related; (ii)
that Q matters to whether the relation holds; and (iii) that, plus or minus
just a bit, only the relationship between the possibilities in Di ∩ P and
the possibilities in Q matters to whether the relation holds. These are not
controversial, but do bear some unpacking.10
First, the order imposed by the antecedent:
10 This general way of characterizing conditionality is not new: both the assumptions and
the results here are inspired by van Benthem’s (1986: §4) investigation of conditionals as
generalized quantifiers. There are, however, differences between his versions and mine.


Definition 4.3 (order). R is orderly iff:

i. R(Di ∩ P , P )
ii. R(Di ∩ P , Q) and Q ⊆ S imply R(Di ∩ P , S)
iii. R(Di ∩ P , Q) and R(Di ∩ P , S) imply R(Di ∩ P , Q ∩ S)
R is something (if ·)(·) at i could mean only if it is orderly.

Such R’s are precisely those for which the set of Q’s that a Di ∩ P bears R to
forms a filter that contains P.11 That is an aesthetic reason for constraining
R this way. Such R’s also jointly characterize the basic conditional logic.12
The relational properties correspond to reflexivity, right upward monotonicity,
and conjunction. That is another (only partly aesthetic) reason for
constraining them this way.
Second, R must care about consequents. This is just the requirement that
conditional relations, like quantifiers, be active:

Definition 4.4 (activity). R is active iff:

if Di ∩ P ≠ ∅ then there are a Q and a Q′ such that: R(Di ∩ P, Q) but not
R(Di ∩ P, Q′)

R is something (if ·)(·) at i could mean only if it is active.

This means that R cares about how Di ∩ P relates to Q. So long as there
are some relevant P -possibilities, there have to be some Q’s for which the
relation holds and some for which it doesn’t.
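
To see what order and activity do and do not rule out, here is a small check on a three-world toy space. It is a sketch under assumptions of mine: A stands in for Di ∩ P, the reflexivity clause is tested in the form R(A, A), and the two candidate relations, includes and most, are hypothetical examples rather than anything defined in the text.

```python
from itertools import combinations

W = frozenset(range(3))  # toy space of possibilities (assumption)
ALL = [frozenset(c) for r in range(len(W) + 1) for c in combinations(sorted(W), r)]

def includes(A, Q):
    """Every A-possibility is a Q-possibility."""
    return A <= Q

def most(A, Q):
    """More than half of the A-possibilities are Q-possibilities."""
    return 2 * len(A & Q) > len(A)

def order_profile(R, A):
    """The three clauses of Definition 4.3, checked for a fixed A."""
    reflexive = R(A, A)
    monotone = all(R(A, S) for Q in ALL if R(A, Q) for S in ALL if Q <= S)
    conjunctive = all(R(A, Q & S) for Q in ALL if R(A, Q) for S in ALL if R(A, S))
    return reflexive, monotone, conjunctive

def active(R, A):
    """Definition 4.4: if A is nonempty, R holds of some Q and fails of another."""
    holds = [R(A, Q) for Q in ALL]
    return (not A) or (any(holds) and not all(holds))

inclusion_profile = order_profile(includes, W)
most_profile = order_profile(most, W)
```

With A = {0, 1, 2}, most holds of {0, 1} and of {1, 2} but not of their intersection {1}, so it fails the conjunction clause even though it is reflexive, monotone, and active.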
And finally: R is a relation between the sets of possibilities. Thus if R
holds at all between P -plus-the-relevant-possibilities-Di and the consequent-
possibilities Q, R will hold between any two sets of things that play the right
possibility role. Intrinsic properties of worlds don’t count for or against the
relation holding. The idea is simple, the execution harder. That is because
I have allowed you to choose your favorite iffy theory, and what goes into
determining Di depends on your choice.
What is important is this: suppose your favorite story posits some additional
structure to modal space to find just the right worlds which, when
combined with P, give the set of worlds relevant for evaluating Q. That
means that your favorite story cares about how P relates to Q but also about
the distribution of the worlds in P compared to the distribution in Q: for
example, perhaps insisting that it is the closest worlds in P to i that must bear
R to Q. If we systematically swap possibilities for possibilities in a way that
preserves the relevant structure, then the conditional relation ought to hold
pre-swapping iff it holds post-swapping. And mutatis mutandis for Di: once
the posited structure has done its job determining Di, any systematic
swapping of possibilities that leaves the domain untouched should also leave
the conditional relation untouched.13
11 It follows straightaway that orderly R’s are fully reflexive in the sense that R(Di ∩ P, Di ∩ P).
12 See Veltman 1985 for a proof.
Where π is such a mapping and P a set of worlds, let π (P ) be the set of
worlds i such that π (j) = i for some j ∈ P . Then:

Definition 4.5 (quality). R is qualitative iff:

R(Di ∩ P , Q) implies R(π (Di ∩ P ), π (Q))

R is something (if ·)(·) at i could mean only if it is qualitative.

This does generalize the familiar constraint on quantifiers: it allows conditional
operators to care about both the relationship between P and Q and
also where the satisfying worlds are. If ≼i is the universal ordering then this
requirement reduces to the more familiar quantitative one (restricted to Ci).
And if Di = {i}, it trivializes.
I am insisting that a story is iffy only if the truth conditions for an indicative
if p q at i in Ci can be put as a relation R between Di ∩ P and
Q. And we have insisted that the relation be constrained in sensible ways: it
must impose some order on sets of consequent possibilities, it must care
about consequents, and it must not care about the intrinsic properties of
possibilities. Each example of an instance of the operator view above (variably
strict, strict, and material conditionals) lives up to these constraints. Still, it
seems like for all we have said it is possible to take the conditional to be true
just in case most/many/several/some/just the right possibilities in Di ∩ P
are in Q. But that is not so: given our constraints, if must mean all.14
13 This is the natural extension of the familiar requirement that quantifiers be quantitative:
for Q to be a quantifier (with domain E) it must be that QE (A, B) iff QE (f (A), f (B)) where
f is an isomorphism of E. Once we have structure to our domain, this will not do. The
more general constraint is then to require that Q be invariant under O-automorphisms of
the domain, where O is the ordering that imposes the posited structure. We can get by with
slightly less: namely, stability under Di -invariant automorphisms.
14 Well, all relevant. This was first proved by van Benthem — see, e.g., van Benthem 1986. The
version I give is simpler (we’re ignoring the infinite case) and a bit more general (slightly
weaker assumptions); the proof is based on one in Veltman 1985, but generalizes it slightly.


Observation 4.1. Assume R is a conditional relation properly so called. Then
R(Di ∩ P, Q) iff Di ∩ P ⊆ Q.

Proof. I care about the left-to-right direction.

Suppose, for reductio, that R(Di ∩ P, Q) but Di ∩ P ⊈ Q. What we’ll
see is: (i) R(Di ∩ P, P ∩ Q); (ii) the world that witnesses that Di ∩ P ⊈ Q can
be exploited (by quality) to show that no world in P ∩ Q plays a role in
R(Di ∩ P, P ∩ Q) holding, from which it follows that R(Di ∩ P, ∅); (iii) from
which it follows that Di ∩ P must be empty, a contradiction.
(i): By hypothesis R(Di ∩ P, Q). By order it follows that R(Di ∩ P, P) and
hence that R(Di ∩ P, P ∩ Q).
(iia): Claim: Di ∩ P ∩ Q ≠ ∅. Proof of Claim: Assume otherwise. order
guarantees that R(Di ∩ P, Di ∩ P). By hypothesis R(Di ∩ P, Q), and so by
order R(Di ∩ P, Di ∩ P ∩ Q). Applying the assumption that Di ∩ P ∩ Q = ∅:
R(Di ∩ P, ∅). Appeal to order again and we have that R(Di ∩ P, S) for any
S. But then Di ∩ P must be empty (activity), contradicting the assumption
that Di ∩ P ⊈ Q and proving the Claim.
(iib): Let j be a witness to Di ∩ P ⊈ Q. So j ∈ Di ∩ P but j ∉ Q. Now pick
any confirming instance k (that is, any k ∈ Di ∩ P ∩ Q) and let π be the
mapping that swaps k and j and leaves all else untouched:
• π(j) = k
• π(k) = j
• π(i) = i for every i ∉ {j, k}
By (i) R(Di ∩ P, P ∩ Q). Hence, by quality, R(π(Di ∩ P), π(P ∩ Q)). But π
doesn’t affect Di ∩ P. So: R(Di ∩ P, π(P ∩ Q)). That is: R holds between
Di ∩ P and both P ∩ Q and π(P ∩ Q). Hence, by order, it holds also
between Di ∩ P and their intersection: R(Di ∩ P, (P ∩ Q) ∩ π(P ∩ Q)). But
π(P ∩ Q) = ((P ∩ Q) \ {k}) ∪ {j}, so their intersection is (P ∩ Q) \ {k}. So:
R(Di ∩ P, (P ∩ Q) \ {k}). Which is to say that k is irrelevant for R’s holding. But
k was any world in Di ∩ P ∩ Q, so finiteness plus order implies R(Di ∩ P, ∅).
(iii): Appeal to order again: since R(Di ∩ P, ∅), it holds that for any S
whatever R(Di ∩ P, S). Whence, by activity, it follows that Di ∩ P = ∅. And
that contradicts the assumption that Di ∩ P ⊈ Q.

The intuitive version is just this: if R holds between Di ∩ P and Q then the
former must be included in the latter. That is because if things didn’t go that
way then the witnessing counterexample world could play the role of any
one of the confirming worlds. But that would mean that confirming worlds
play no role. Nothing like that could be something a conditional properly so
called could mean. So Di ∩ P must be included in Q after all.
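
For a finite sanity check of Observation 4.1, one can enumerate candidates by brute force. The sketch below simplifies in ways that are my assumptions, not the text's: the space has three worlds, A plays the role of Di ∩ P, a candidate meaning is just the family F of sets Q that A bears R to, reflexivity is tested as A ∈ F, and quality is tested against every permutation of worlds that maps A onto itself.

```python
from itertools import combinations, permutations

W = (0, 1, 2)  # toy space of possibilities (assumption)

def subsets(xs):
    xs = tuple(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

ALL = subsets(W)  # the 8 candidate consequents Q

def image(pi, X):
    return frozenset(pi[x] for x in X)

def orderly(A, F):
    return (A in F
            and all(S in F for Q in F for S in ALL if Q <= S)
            and all((Q & S) in F for Q in F for S in F))

def active(A, F):
    return (not A) or (0 < len(F) < len(ALL))

def qualitative(A, F):
    # Only swaps that leave A untouched as a set are considered.
    for perm in permutations(W):
        pi = dict(zip(W, perm))
        if image(pi, A) == A and any(image(pi, Q) not in F for Q in F):
            return False
    return True

def survivors(A):
    """All families F (out of 256) meeting order, activity, and quality for A."""
    return [F for F in subsets(ALL)
            if orderly(A, F) and active(A, F) and qualitative(A, F)]

def inclusion_family(A):
    return frozenset(Q for Q in ALL if A <= Q)

only_inclusion = all(survivors(A) == [inclusion_family(A)] for A in ALL)
```

For every choice of A the single surviving family is the set of Q's that include A, which is the finite content of the claim that if must mean all.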

5 Three facts

Iffiness requires that if is a conditional connective that expresses a conditional
operator, and that pretty much means that if has to mean all. It
requires that no matter what other operators we might find in its neighborhood.
That spells trouble because of three simple Facts about how indicative
conditionals and epistemic modals play together.15
I have lost my marbles. I know that just one of them — Red or Yellow — is
in the box. But I don’t know which. I find myself saying things like:

(2) Red might be in the box and Yellow might be in the box.
So, if Yellow isn’t in the box, then Red must be.
And if Red isn’t in the box, then Yellow must be.

Conjunctions of epistemic modals like Red might be in the box and Yellow
might be in the box are especially useful when the bare prejacents partition
the possibilities compatible with the context. The first fact is simply that if s
are consistent with such conjunctions of modals.

Fact 1 (consistency). Suppose S1 and S2 partition the possibilities compatible
with the context. Then the following are consistent:
i. might S1 and might S2
ii. if not S1 , then must S2 ; and if not S2 , then must S1
15 Three notes about the Facts. First: “Facts” may be laying it on a little thick. The judgments
are robust, and the costs high for denying the generalizations as I put them. That’s all true
even if what we may say about them is a matter for disputing. But it does not much matter:
what I really care about is three characteristic seeming facts about if s, mights, and musts
that at first blush look like the kind of thing our best story ought to answer to. So let’s agree
to take them at face value and see where that leads. Later, if your English breaks with mine
or if your old school pride overwhelms, you can deny the Facts or explain them away as your
preferences dictate. Second: the Facts may seem eerily familiar. They are not far removed
from the sorts of examples of the interplay between adverbs of quantification and if -clauses
in Lewis 1975 and Kratzer 1986. That is no coincidence, as we’ll see (briefly) in Section 7.
Third: since the operator view isn’t the only game in town and since predicting the Facts
is something any story (old school or otherwise) must do, we should state the Facts in a
way that is agnostic on the iffy thesis. So the Facts characterize what is true of sentences
in (quasi-)English, not necessarily what is true of their lfs in our regimented intermediate


I do not know whether Carl made it to the party. But wherever Carl goes,
Lenny is sure to follow. So if Carl is at the party, Lenny must be — Lenny is at
the party, if Carl is. We just glossed an if with a commingling epistemic must
by a bare if with no (overt) modal at all. Thus:

(3) a. If Carl is at the party, then Lenny must be at the party. ≈

b. If Carl is at the party, then Lenny is at the party.

This pair has the ring of (truth-conditional) equivalence. Fact 2 below records
that. But there are also arguments for thinking that the truth-value of (3a)
should stand and fall with the truth-value of (3b).
For suppose that such if s validate a deduction theorem and modus
ponens, and that must is factive.16 The left-to-right direction: assume that
(3a) is true. And consider the argument:

(4) If Carl is at the party, then Lenny must be at the party.

Carl is at the party.
So: Lenny is at the party.

The first two sentences — intuitively speaking — entail the third. And that is
pushed on us by the assumptions: from the first two sentences we have (by
modus ponens) that Lenny must be at the party, which by factivity entails
Lenny is at the party. Apply the deduction theorem and we have that If Carl
is at the party, then Lenny must be at the party entails If Carl is at the party,
then Lenny is at the party. Since we have assumed that (3a) is true, it follows
that (3b) must be. There are spots to get off this bus to be sure — by denying
either modus ponens or by denying the factivity of must — but those costs
are high.17
The right-to-left direction: assume that (3b) is true and consider:
16 Remember that, for now, we are dealing with properties of sentences of (quasi-)English not
properties of those sentences’ lfs in some regimented language. The argument here isn’t
meant to convince you of Fact 2, it is meant to make some of the costs of denying the data
vivid. Geurts (2005) also notes that bare conditionals and their must-enriched counterparts
are “more or less equivalent”.
17 You have to troll some pretty dark corners of logical space for deniers of modus ponens,
but that’s not true for deniers of the factivity of must. That view has something of mantra
status among linguists (philosophers are surprised to hear that). Mantra or not, it is wrong.
For an all-out attack on it see von Fintel & Gillies 2010. Here is just one sort of consideration:
if must p didn’t entail p (because must is located somewhere below the top of the scale of
epistemic strength), then you’d expect must to combine with only in straightforward ways
the way might can:


(5) If Carl is at the party, then Lenny is at the party.

Carl is at the party.
So: Lenny must be at the party.

This is as intuitive an entailment as we are likely to find. Whence it follows by
the deduction theorem that If Carl is at the party, then Lenny is at the party
on its own entails If Carl is at the party, then Lenny must be at the party. So
if (3b) is true so must be (3a): that’s why the former seems to gloss the latter.

Fact 2 (if/must). Conditional sentences like these are true in exactly the
same scenarios:
i. if S1 , then must S2
ii. if S1 , then S2

The glossing that this pattern permits is a nifty trick. But that is only half
the story since if can also co-occur with epistemic might. The interaction
between if and might is different and underwrites a different glossing.
Alas, my team are not likely to win it all this year. It is late in the season
and they have made too many miscues. But they are not quite out of it. If
they win their remaining three games, and the team at the top lose theirs,
my team will be champions. But our last three are against strong teams
and their last three are against cellar dwellers. Still, my spirits are high:
if we win out, we might win it all. Put another way, within the (relevant)
my-team-wins-out possibilities — of which there are some — lies a my-team-
wins-it-all possibility; there is a my-team-wins-out possibility that is a my-
team-wins-it-all possibility. But that is just to say that there are (relevant)
my-team-wins-out-and-wins-it-all possibilities. Maybe not very many, and
maybe not so close, but some.18
Apart from keeping hope alive, the example also illustrates that we can
gloss an indicative with a co-occurring epistemic might by a conjunction
under the scope of might:

(6) a. If my team wins out, they might win it all. ≈

b. It might turn out that my team wins out and wins it all.
(i) a. I didn’t say it is raining, I only said it might be raining.
b. #I didn’t say it is raining, I only said it must be raining.

But it doesn’t.
18 For the record: the Cubs. Please don’t bring it up.


That gloss sounds pretty good. And for good reason: conjunctions that you
would expect to be happy if the truth of (6a) and (6b) could come apart are
not happy at all:

(7) a. #If my team wins out, they might win it all; moreover, they can’t win
out and win it all.
b. #It might turn out that my team wins out and wins it all, and, in
addition there’s no way that if they win out, they might win it all.

That gives us the third Fact about how if s play with modals.19

Fact 3 (if/might). Sentences like these are true in exactly the same scenarios:

i. if S1 , then might S2
ii. it might be that [S1 and S2 ]

It’s now a matter of telling some story, iffy or otherwise, that answers to
these Facts. Old school operator views will have trouble with them; the new
school restrictor view predicts them trivially.

6 Scope matters

The operator view takes if to express an operator, an iffy operator, and the
same iffy operator no matter whether we have a co-occurring epistemic modal
or not and no matter whether the modal is must or might. In cases where
there is a modal, scope issues have to be sorted out. Take a sentence of the form

(8) If S1 then modal S2

19 There is a wrinkle: Fact 3 implies that if S1 , then might S2 is true in just the same spots as if
S2 , then might S1 . Seems odd:

(i) a. If I jump out the window, I might break a leg.

b. If I break a leg, I might jump out the window.

The first is true, the second an overreaction. I intend, for now, to sweep this under the same
rug that we sweep the odd way in which Some smoke and get cancer/Some get cancer and
smoke don’t feel exactly equivalent even though Some is a symmetric quantifier if ever there
was one. (The rug in question seems to be the tense/aspect rug; similar considerations drive
von Fintel’s (1997) discussion of contraposition of bare conditionals.)


and let S1′ (S2′) be the L-representation for sentence S1 (S2), and modal the
L-representation for modal. We have a short menu of options for the relevant
lf for such a sentence, either the narrowscoped (9a) or the widescoped (9b):

(9) a. (if S1′)(modal S2′)
    b. modal (if S1′)(S2′)

If you want to put your lfs in tree form, be my guest: opting for nar-
rowscoping means opting for sisterhood between modal and S2 ; opting for
widescoping means opting for sisterhood between modal and if S1 then S2 .
The trouble for the operator view is that, since if has to express inclusion,
neither choice will do. One choice for scope relations seems ruled out by
consistency (Fact 1), the other by if/must (Fact 2) and if/might (Fact 3).
To put the trouble precisely, we need one more ground rule. Contexts,
we said, have the job of determining the domains the modals quantify over.
Modals, I’ll assume, do their job in the usual way by expressing their usual
quantificational oomph over those domains: must (at i, with respect to C)
acts as a universal quantifier, and might as an existential quantifier, over Ci .

Definition 6.1 (modal force).

i. ⟦might p⟧^{C,i} = 1 iff Ci ∩ ⟦p⟧^C ≠ ∅
ii. ⟦must p⟧^{C,i} = 1 iff Ci ⊆ ⟦p⟧^C
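
Definition 6.1 translates directly into set operations. The following is a minimal sketch in which propositions and Ci are modeled as sets of worlds; the world labels for the lost-marbles case are hypothetical.

```python
def might(C_i, p):
    """Existential force: some possibility in C_i is a p-possibility."""
    return bool(C_i & p)

def must(C_i, p):
    """Universal force: every possibility in C_i is a p-possibility."""
    return C_i <= p

# The lost marbles: exactly one of Red, Yellow is in the box.
C_i = frozenset({"red_in_box", "yellow_in_box"})
red = frozenset({"red_in_box"})
yellow = frozenset({"yellow_in_box"})
```

Here both might-claims come out true and neither must-claim does, matching the first sentence of (2).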

Now suppose we plump for narrowscoping. Then, given the ground rules,
we cannot predict the consistency of the likes of (2) and that means that we
cannot square iffiness with Fact 1. That’s true no matter how you fill in the
particulars of the iffy story.
Here is the narrowscoped analysis of my lost marbles. We have a modal
and two indicatives:

(10) a. Red might be in the box and Yellow might be in the box.
might p ∧ might q
b. If Yellow isn’t in the box, then Red must be.
if ¬q must p
c. If Red isn’t in the box, then Yellow must be.
if ¬p must q

Any good story has to allow that the bundle of if s in (10b) and (10c) is
consistent with the conjunction in (10a). But, assuming narrowscoping,
this (even without taking a stand on how we choose Di and so without
taking a stand on what counts as the set of if -relevant worlds) seems to be
beyond what can be delivered by any version of the operator view.

Observation 6.1. Suppose p and q partition the possibilities in C and that
(10a) is true. Then the (narrowscoped) sentences in (10) can’t all be true.

Proof. Suppose otherwise: that the regimented formulas in L are all true at
a live possibility, say i, with respect to C. Just one of my marbles is in the
box. So any world in Ci is either a p-world or a q-world, but not both; C is
well-behaved, so i ∈ Ci. That leaves two cases.
case 1: i ∈ ⟦¬q⟧. By hypothesis ⟦if ¬q must p⟧^{C,i} = 1, and so
Di ∩ ⟦¬q⟧^C ⊆ ⟦must p⟧^C. Since i ∈ Di, it then follows that i ∈ ⟦must p⟧^C, which
is to say ⟦must p⟧^{C,i} = 1. Thus Ci has only p-worlds in it. But that is at
odds with the second conjunct of (10a): that might q is true at i guarantees a
q-world, hence a ¬p-world, in Ci.
case 2: i ∈ ⟦¬p⟧. By hypothesis ⟦if ¬p must q⟧^{C,i} = 1, and so
Di ∩ ⟦¬p⟧^C ⊆ ⟦must q⟧^C. Since i ∈ Di, it then follows that i ∈ ⟦must q⟧^C, which
is to say ⟦must q⟧^{C,i} = 1. Thus Ci has only q-worlds in it. But that is at odds
with the first conjunct of (10a): that might p is true at i guarantees a p-world,
hence a ¬q-world, in Ci.
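
The same conclusion can be spot-checked mechanically for one concrete context. This is a sketch under assumptions of mine: three hypothetical worlds (two with Red in the box, one with Yellow), a context of total ignorance among them, and narrowscoped if read as inclusion, as Observation 4.1 requires. It tries every admissible Di (with i ∈ Di ⊆ Ci) at every world.

```python
from itertools import combinations

W = frozenset({"r1", "r2", "y"})   # hypothetical worlds
p = frozenset({"r1", "r2"})        # Red is in the box
q = W - p                          # Yellow is in the box
C = {i: W for i in W}              # well-behaved: every world sees all of W

def might(prop, i):
    return bool(C[i] & prop)

def must(prop, i):
    return C[i] <= prop

def truth_set(modal, prop):
    """The worlds at which the modalized sentence is true."""
    return frozenset(j for j in W if modal(prop, j))

def narrow_if(antecedent, consequent, D_i):
    """Narrowscoped if as inclusion: D_i & antecedent <= consequent."""
    return (D_i & antecedent) <= consequent

def candidate_domains(i):
    others = sorted(C[i] - {i})
    for r in range(len(others) + 1):
        for extra in combinations(others, r):
            yield frozenset({i, *extra})

consistent = [
    (i, D_i)
    for i in sorted(W)
    for D_i in candidate_domains(i)
    if might(p, i) and might(q, i)                      # (10a)
    and narrow_if(W - q, truth_set(must, p), D_i)       # (10b)
    and narrow_if(W - p, truth_set(must, q), D_i)       # (10c)
]
```

No choice of world and domain makes all of (10) true together in this context, which is what the observation predicts.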

Narrowscoping has the virtue of taking plain and simple lfs to represent
indicatives with apparently epistemic modalized consequents. But it has the
vice of not squaring with consistency. This is true no matter the particulars
of your favorite version of the operator view.20
So suppose instead that co-occurring modals scope over the if -constructions
in which they occur. Now it is the generalizations if/must and if/might
that cause trouble. Again, that’s true no matter how Di is chosen and so
no matter what counts as an if -relevant possibility and so no matter what
conditional operator we say if expresses.
Here is a widescope analysis of the key examples (3) and (6):

(11) a. If Carl is at the party, then Lenny must be at the party.

must if p q
b. If Carl is at the party, then Lenny is at the party.
if p q

20 Thus by supplying how your favorite version of the operator view says Di is determined, you
can use this proof to show how that story (assuming narrowscoping) departs from Fact 1.


(12) a. If my team wins out, they might win it all.

might if p q
b. It might turn out that my team wins out and wins it all.
might (p ∧ q)
The facts are that must if p q ≈ if p q and that might if p q ≈
might (p ∧ q). What we need is a semantics for the conditional connec-
tive (if ·)(·) that can predict both patterns. But paths that might lead to one
pretty reliably lead away from the other.
So far I have insisted that i is always among the relevant worlds to an
if at i (i ∈ Di ) and also that only worlds compatible with the context are
relevant (Di ⊆ Ci ). Here I am in good company. But perhaps there is even
more interaction between domains of if -relevant worlds and contexts.
Some theories say that there can be no difference in domains for condi-
tionals between worlds compatible with the context, others disagree:

Definition 6.2 (egalitarianism & chauvinism).

i. A semantics is egalitarian iff, whenever j ∈ Ci, Dj = Di.
ii. A semantics is chauvinistic iff it is not egalitarian.

Egalitarianism requires domains to be invariant across worlds compatible
with a context. That means that distinctions between worlds made by
D’s — this world is relevant, that one isn’t — are unaffected when those dis-
tinctions are made from behind the veil of ignorance (we don’t know which
world compatible with C is the actual world). Chauvinistic theories allow
differences from behind the veil to matter to what possibilities get selected
for domainhood, and thus allow that a possibility j ∈ Ci may determine a
different set of relevant possibilities than does i. Once we have agreed that,
for any i, Di selects from the worlds compatible with C and must include i,
it is a further question whether we want to be egalitarians or chauvinists.21
21 The history of the conditional is littered with chauvinists. The material conditional analysis
is chauvinistic. It says that the only possibility relevant for the truth of an if at i in C is
i itself. And similarly for an if at j: only j matters there. Thus, except in the odd case
where the context rules out uncertainty altogether, we will have that Dj ≠ Di, for any choice
of i and j compatible with C. A variably strict conditional analysis, based on a family
of orderings (one for each world), is chauvinistic if we do not impose an “absoluteness”
condition — the requirement that orderings around any two worlds be the same. (Lewis
(1973: §6) discusses absoluteness in the process of characterizing the V -logics.) What to say
about absoluteness is optional and so there is room for agnosticism about chauvinism.
Stalnaker’s (1975) treatment of indicatives is not officially agnostic about chauvinism, but


It is hard to be a chauvinist. That is because, assuming the particulars
of the chauvinistic theory are compatible with there being a (p ∧ ¬q)-world
in Ci but not in Di, no such story will predict if/must. The data say that
bare indicatives and their must-enriched counterparts are true in the same
scenarios. But chauvinism plus widescoping guarantees that the domain
the if quantifies over is properly included in the domain its must-enriched
counterpart quantifies over. Thus the former says something strictly weaker
than (true in strictly more spots than) the latter. That is at odds with Fact 2.

Observation 6.2. Suppose that Di ⊂ Ci. There are scenarios in which the
widescoped (11b) is true but (11a) isn’t. Thus chauvinism plus widescoping
can’t explain Fact 2.

Proof. Consider a (p ∧ ¬q)-world, call it j, and suppose that Ci does, but
Di does not, contain j. Then every possibility in Di ∩ ⟦p⟧ is in ⟦q⟧ and the
plain if is true (at i, in C): ⟦if p q⟧^{C,i} = 1. But not the widescoped
must-enriched if. That is because there is a world in Ci, namely j, such that not
every possibility in Dj ∩ ⟦p⟧ is a possibility in ⟦q⟧. Thus ⟦if p q⟧^{C,j} = 0,
and so it is not true that the plain if is true at every world in Ci, and so
⟦must if p q⟧^{C,i} = 0.
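
The counterexample in this proof can be checked mechanically. What follows is a minimal sketch and not part of the original analysis: it assumes a toy model with two worlds named "i" and "j", treats propositions as sets of worlds, and uses the hypothetical helper names plain_if and must_if. Chauvinism is modeled by letting the domain of an if at a world contain only that world.

```python
# Toy model (an illustrative assumption): two worlds, propositions as sets of worlds.
P = {"i", "j"}          # p holds at both worlds
Q = {"i"}               # q holds only at i, so j is a (p and not-q)-world

# C[w]: the worlds compatible with the context at w.
C = {"i": {"i", "j"}, "j": {"i", "j"}}
# D[w]: the domain the plain if quantifies over at w.
# Chauvinism: only w itself matters for an if at w.
D = {"i": {"i"}, "j": {"j"}}


def plain_if(p, q, w):
    """The plain if is true at w iff every p-world in D[w] is a q-world."""
    return (D[w] & p) <= q


def must_if(p, q, w):
    """Widescoped must: the plain if is true at every world in C[w]."""
    return all(plain_if(p, q, k) for k in C[w])


print(plain_if(P, Q, "i"))  # True: the only p-world in D["i"] is i, a q-world
print(plain_if(P, Q, "j"))  # False: j is a p-world in D["j"] but not a q-world
print(must_if(P, Q, "i"))   # False: the plain if fails at j, which is in C["i"]
```

The plain if comes out true at i while its widescoped must-enriched counterpart comes out false there, which is the mismatch with Fact 2.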

Again, this is true no matter how we fill in the particulars of the operator
view. If we widescope the modals, and the story is chauvinistic, it will not
square with Fact 2.
Given widescoping, egalitarianism fares no better. But here it is
if/might (Fact 3) that causes trouble. This time the issue is triviality: must-
enriched if s are true iff their might-enriched counterparts are.
Here is why. First, egalitarianism implies that Di covers Ci :

Observation 6.3. egalitarianism implies that Di = Ci.

Proof. Assume otherwise. Di ⊆ Ci, so there must be a j ∈ Ci such that j ∉ Di.
By egalitarianism, Dj = Di. But we know that j ∈ Dj. Contradiction.

that is only because he requires that i induce a total order that is centered pointwise on
i, and that rules against absoluteness. But the pragmatic mechanisms he develops there
are agnostic on the chauvinism question — what he says about how the context constrains
selection functions is compatible with both egalitarianism and chauvinism. I myself see
little reason to go for chauvinism.


Thus if Di reflects some measure of proximity to i, egalitarianism

implies that the underlying ordering is centered not pointwise on i but
setwise on the worlds compatible with C. So egalitarianism implies that
if is really a strict conditional. That’s true whether Di is derived from some
underlying ordering or not: if , might and must quantify over the same
domain of possibilities, and an if is true at i iff all of the antecedent worlds
in that domain are consequent worlds.22 That means that an if at i (in C)
is true iff the corresponding material conditional is true at every possibility
compatible with C. And that means that such an if is true at i iff the material
conditional, widescoped by must, is true at i.23
But from this degree of fit between Di and Ci it follows straightaway that
no two possibilities compatible with C can differ over an if issued in C. There
is solidarity among if s; they stand and fall together:

Observation 6.4. egalitarianism implies

⟦if p q⟧^{C,i} = 1 iff for every j ∈ Ci: ⟦if p q⟧^{C,j} = 1

Proof. ⟦if p q⟧^{C,i} = 1 iff Di ∩ ⟦p⟧ ⊆ ⟦q⟧. By egalitarianism: iff, for any
j ∈ Ci, Dj ∩ ⟦p⟧ ⊆ ⟦q⟧. Equivalently: iff, for any j ∈ Ci, Cj ∩ ⟦p⟧ ⊆ ⟦q⟧; that
is, iff for every such j, ⟦if p q⟧^{C,j} = 1.

Given widescoping, any story with this equivalence will have a hard time
saying why conditionals like (12a) seem to be true iff modalized conjunctions
like (12b) are and so will have trouble with if/might. That is because, given
the usual story for the modals (Definition 6.1), we get triviality:

Observation 6.5. egalitarianism implies:

⟦might if p q⟧^{C,i} = 1 iff ⟦must if p q⟧^{C,i} = 1


Thus widescoping plus egalitarianism implies that must if p q is true
iff might(p ∧ q) is. Not even Cubs fans fall for that.
22 Strictness makes it easy to understand why negating a bare conditional sounds so much
like saying the counterexample might obtain. For more on context-dependent strictness
(of different flavors) see, e.g., Veltman 1985, von Fintel 1998a, 2001, and Gillies 2004, 2007,
23 Thus, given well-behavedness (Definition 3.2), explaining Fact 2 is easy for widescoping
egalitarians: if p q is equivalent to must (p ⊃ q) which, given well-behavedness, is
equivalent to must must (p ⊃ q). And that, in turn, is equivalent to must if p q .


Proof. Note that ⟦might if p q⟧^{C,i} = 1 iff the plain conditional if p q is
true somewhere in Ci. But by Observation 6.4 the plain if is true somewhere
in Ci iff it is true everywhere in Ci. And it is true everywhere in Ci just in case
⟦must if p q⟧^{C,i} = 1. That trivializes rather than explains Fact 3.

No matter the particulars, widescoping plus egalitarianism can’t predict
Fact 3.
Iffiness requires conditionals to have a structure that does not play nice
with modals. That’s because no way of resolving the relative scopes will
work.24 What causes the trouble is that the operator view requires if to mean
all. But the Facts don’t seem to allow that. If we widescope, then sometimes
that seems all right — if the modal in question happens to have universal
quantificational force. But when the modal is existential, if looks more like
conjunction than inclusion. And narrowscoping seems no better, rendering
all manner of coherent bits of discourse inconsistent.
That is pretty bad news for the operator view. True, we could save
iffiness by denying some Fact or other. (With defenders like that who needs
detractors?) Adding insult to injury: the Facts were chosen not at random but
with an eye to the competition. They are Facts that the new school restrictor
view predicts so easily hardly anyone has noticed.

7 Iffiness lost

Lewis (1975) famously argued that ifs appearing in certain quantificational
constructions (under adverbs of quantification) are not properly iffy, that the
if in
24 Could we go for widescoping must-enriched indicatives and narrowscoping might-enriched
indicatives? For all we’ve said so far: yes. But that strategy faces an uphill battle. It is ad
hoc, three times over. First because there is no good reason to think we should settle for
anything less than a uniform story. Second because it is not obvious what it says we should
do when we consider ways in which the modal might be embedded. What if the modal is
can’t (a possibility modal scoped under negation) or needn’t (a universal under negation)?

(i) a. If my team doesn’t win out, they can’t win it all.

b. If the gardener didn’t do it, the culprit needn’t be the butler.

Do we widescope or narrowscope these? What principled story is there that predicts, rather
than stipulates, that the first is widescoped and the second narrowscoped? Third because
as soon as we consider epistemic modals that lie between the existential might and the
universal must (like probably and unlikely), it is doomed to failure anyway.

(13) {Always, Sometimes} if a man owns a donkey, he beats it.


is not a conditional connective with a conditional operator as its meaning
but instead acts as a non-connective whose only job is to mark an argument-place
for the adverb of quantification. The relevant structure is not some
Q-adverb scoped over a conditional nor some conditional with a Q-adverb
in its consequent, he said, but instead something like

(14) Q-adverb + if-clause + then-clause

The job of the if -clause in (13) is merely to restrict the domain over which
the adverb (unselectively) quantifies, and allegedly that restricting job is a
job that cannot be done by treating if as a conditional connective with a
conditional operator as its meaning. If Q-adverb is universal, maybe an iffy
if will work; but if it is existential, then conjunction does better. I want to set
the issue about adverbial (and adnominal, for that matter) quantifiers aside for
two reasons. First because I doubt the allegation sticks. But that is another
argument for another day.25 And second because it will do us good to focus
on simple cases.
Still, the trouble for the operator view that is center stage here does look
quite a lot like the problem Lewis pointed out. We have to make room for
interaction between if -clauses and the domains our modals quantify over.
But that interaction is tricky. That is because it looks impossible to assign
if the same conditional meaning (thereby taking its contribution to be an
iffy one) in all of our examples. Indeed, when the modal is universal a conditional
relation looks good; but when the modal is existential, conjunction
looks better. This is pretty much the same trouble Lewis saw for ifs occurring
under adverbs of quantification, and it led him to conclude that such ifs do not
express operators at all (and a fortiori not conditional operators).26 Just as
with adverbial quantifiers, there is a fast and easy solution to the problem
if we get rid of the old school idea that if is a conditional connective and
plump instead for anti-iffiness. The most forceful way of putting the anti-iffy
thesis is Kratzer’s (1986: 11):
25 There are ways to get the restricting job done after all. The operator-based stories in, e.g.,
Belnap 1970, Dekker 2001, and von Fintel & Iatridou 2003 all manage.
26 For recent and more thorough-going defenses of if s-as-quantifier-restrictors see, e.g., Kratzer
1981, 1986 and von Fintel 1998b. But see Higginbotham 2003 for a dissenting view.


The history of the conditional is the history of a syntactic
mistake. There is no two-place “if ... then” connective in the
logical forms for natural languages. “If”-clauses are devices for
restricting the domains of various operators.

The thesis is that the relevant structure for the conditionals at issue here
is not some modal scoped over a conditional nor some conditional with a
modal in its consequent, but is instead something like

(15) modal + if-clause + then-clause

Or, closer to the way we’ve been putting things:

(16) modal(if-clause)(then-clause)

The job of the if -clause is to restrict the domain over which the modal
quantifies. So instead of searching for a conditional operator properly so
called that if contributes whether it commingles with a modal or not, we
search for an operator for if to restrict. And, for indicative conditionals,
we do not have to search far: the operators are (possibly covert) epistemic modals.
So it is the modals, not the if s, that take center stage. They have logical
forms along the lines of modal(p)(q), with the usual quantificational force:

Definition 7.1 (modal force, amended).

i. if defined, ⟦might (p)(q)⟧^{C,i} = 1 iff (Ci ∩ ⟦p⟧) ∩ ⟦q⟧^C ≠ ∅
ii. if defined, ⟦must (p)(q)⟧^{C,i} = 1 iff (Ci ∩ ⟦p⟧) ⊆ ⟦q⟧^C

This plus two assumptions gets us the now-standard and familiar restrictor
view. It easily accounts for consistency (Fact 1), if/must (Fact 2), and
if/might (Fact 3).
First assumption: assume that when there is no if -clause and so no
restrictor is explicit — as in Blue might be in the box or Yellow must be in
the box — the first argument in the lf of the modal is filled by your favorite
tautology (⊤). In those cases there is nothing to choose between an analysis
that follows our earlier Definition 6.1 and an analysis that follows Definition
27 Officially, our intermediate language now also goes in for a change. L had one-place modals
might and must and a two-place connective (if ·)(·). That won’t do to represent the restrictor
view. Instead, we need the two-place modals might (·)(·) and must (·)(·) and have no need
for a special conditional connective that expresses a conditional operator.


7.1, and so the latter generalizes the former.

Second assumption: assume that the job of if-clauses is to make a (non-trivial)
restrictor explicit. If there is no overt modal, as in a bare conditional,
the if restricts a covert must. Collecting the pieces:

Definition 7.2 (anti-iffiness). For any sentence S, let S′ be its lf in our
intermediate language. Then:
i. A sentence of the form if S1 then S2 has lf:
a. modal(S1′)(R′) if S2′ = modal R′
b. must (S1′)(S2′) otherwise
ii. Truth conditions as in Definition 7.1

Return to the case of my missing marbles. Taking the if-clauses to be
restrictors in the example:

(17) a. Red might be in the box and Yellow might be in the box.
might (⊤)(p) ∧ might (⊤)(q)
b. If Yellow isn’t in the box, then Red must be.
must (¬q)(p)
c. If Red isn’t in the box, then Yellow must be.
must (¬p)(q)

It’s modals all the way down. And the modals can all be true together.

Observation 7.1 (anti-iffiness & consistency). Assume anti-iffiness
(Definition 7.2). And suppose, in C, that (17a) is a partitioning modal. Then the
sentences in (17) can all be true together.

Proof. I am in i and there are just two worlds compatible with the facts I
have, i and j. The first is a (p ∧ ¬q)-world, the second a (q ∧ ¬p)-world.
The restrictors in (17a) are trivial, so it is true at i iff Ci has a p-world in
it and a q-world in it; i witnesses the first conjunct, j the second. The
restricting if -clause of (17b) makes sure that the must ends up quantifying
only over the ¬q-worlds compatible with C: (17b) is true at i iff all of the
worlds in Ci ∩ ⟦¬q⟧ are p-worlds. And the only one, i, is. Similarly for the must
in (17c): it quantifies over the ¬p-worlds in Ci , checking to see that they are
all q-worlds.
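
The two-world scenario of this proof is small enough to run. The sketch below is only an illustration under stated assumptions: worlds are named by which marble is in the box, the context is the same set of worlds at every world, and the function names might, must and neg are hypothetical stand-ins for the two-place modals of Definition 7.1. Definedness conditions are not modeled.

```python
# Toy model (an illustrative assumption): one world per marble location.
W_RED, W_YELLOW = "red_in_box", "yellow_in_box"
CONTEXT = frozenset({W_RED, W_YELLOW})   # plays the role of Ci, identical at each world

P = frozenset({W_RED})       # p: Red is in the box
Q = frozenset({W_YELLOW})    # q: Yellow is in the box
TOP = CONTEXT                # the trivial restrictor, true throughout the context


def neg(prop):
    """Complement of a proposition, relative to the context."""
    return CONTEXT - prop


def might(restrictor, scope):
    """Definition 7.1(i): the restricted domain contains a scope-world."""
    return bool((CONTEXT & restrictor) & scope)


def must(restrictor, scope):
    """Definition 7.1(ii): the restricted domain contains only scope-worlds."""
    return (CONTEXT & restrictor) <= scope


s17a = might(TOP, P) and might(TOP, Q)   # Red might be and Yellow might be
s17b = must(neg(Q), P)                   # if Yellow isn't, Red must be
s17c = must(neg(P), Q)                   # if Red isn't, Yellow must be
print(s17a, s17b, s17c)  # True True True
```

All three sentences in (17) are true together in this model, as Observation 7.1 says.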

It is just as easy to square this picture with if/must (Fact 2) and if/might
(Fact 3). Here are the examples with their new school lfs:


(18) a. If Carl is at the party, then Lenny must be at the party.

must (p)(q)
b. If Carl is at the party, then Lenny is at the party.
must (p)(q)
(19) a. If my team wins out, they might win it all.
might (p)(q)
b. It might turn out that my team wins out and wins it all.
might (⊤)(p ∧ q)

Observation 7.2 (anti-iffiness, if/must, & if/might). Assume anti-iffiness
(Definition 7.2). Then:
i. If S1 , then S2 ≈ If S1 , then must S2
ii. If S1 , then might S2 ≈ might [S1 and S2 ]

Proof. anti-iffiness assigns the same lf to a bare conditional like (18b) and
its must-enriched counterpart (18a): must (p)(q). It would thus be hard, and
pretty undesirable, for their truth conditions to come apart. That explains
if/must.
Now consider the if-as-restrictor analysis of the sort of examples behind
if/might in (19). If (19b) is true at i in C then Ci has a (p ∧ q)-world in it.
But then that same world must be in Ci ∩ ⟦p⟧. It is a q-world, and that will
witness the truth of (19a) at i. Going the other direction: if (19a) is true at
i in C, then there are some q-worlds in Ci ∩ ⟦p⟧. Any one of those will do
as a (p ∧ q)-world in Ci, and that is sufficient for (19b) to be true at i. That
explains if/might.

These explanations are easy. And, given the trouble for the operator
view, it looks like the only game in town is to say that if doesn’t express an
operator and so not an iffy operator. That stings.

8 Iffiness regained

The problem for iffiness is that there is an interaction between if -clauses

and the domains our modals quantify over. That is an interaction that seems
hard to square with the thesis that if is a binary connective with a conditional
meaning if we assume that it has the same meaning in each of the cases we
care about here.


But we have overlooked a possibility. We insisted that for a story to be iffy

it must say that if p q at i in C expresses some relation R between Di ∩ P
and Q, where Di ∩ P is the set of (relevant) worlds where the antecedent is
true and Q the set of worlds where the consequent is true. That is all right.
But we unthinkingly assumed that the context relevant for figuring out what
these sets of worlds are must always be C just because that was the context
as it stood when the if was issued. That was a mistake. Setting it straight
sets the record straight for old school iffiness.
The Ramsey test — the schoolyard version, anyway — is a test for when an
indicative conditional is acceptable given your beliefs. It says that if p q
is acceptable in belief state B iff q is acceptable in the derived or subordinate
state B-plus-the-information-that-p. You zoom in on the portion of B where
p is true and see whether q holds throughout that region. But our job is to say
something about the linguistically encoded meanings of indicatives, not to
dole out epistemic advice. Still, the Ramsey test (plus or minus just a bit) can
be turned into a strict conditional story about truth-conditions.
Here’s how (in three easy steps). Step one: sentences get truth-values at
worlds in contexts. So swap C’s for B’s. Step two: embrace egalitarianism.
The worlds compatible with the context are the if -relevant worlds. These
first two steps give us a strict conditional analysis of indicatives, requiring
that if p q is true at i in C iff all the p-possibilities in Ci are possibilities
at which q is true. But truth depends on both index and context. Question:
What context is relevant for checking to see whether q is true at these
p-possibilities? Answer: The Ramseyan derived or subordinate context C-
plus-the-information-that-p, or C + p for short. That’s step three.
The Ramsey test invites us to add the information carried by the antecedent
to the contextually relevant stock of information C and check the
fate of the consequent. What we fans of iffiness overlooked was that this
assigns two jobs to if-clauses, and we only paid attention to one of them.
One job is the index-shifting job. The if-clause tells us to shift to various
alternative indices (the antecedent-possibilities compatible with C) to see
whether the consequent is true at them. This job is familiar and most versions
of the operator view do a fine job tending to it. But there is another
job. When we add the information carried by the antecedent to C we also
add to the context relevant for figuring out whether the consequent is true.
That is the context-shifting job. The if-clause tells us to shift to an alternative
derived or subordinate state to see whether the consequent is true. We fans
of old school iffiness made the mistake of only making sure that the first job
got done.
So far this isn’t a story about the meaning of if (much less an iffy one). It
is a blueprint for how to construct a semantics that gives a uniform and iffy
meaning to if s whether or not those if s mix and mingle with other operators.
To construct a story using it we need to take a stand on what it means to add
the information carried by an antecedent to the contextually relevant stock
of information. Taking that stand depends on the aspirations of the theory
since different constructions may depend on different sorts of contextually
available information and there is every reason to think that augmenting
information of different sorts goes by different rules. But our aspirations are
pretty modest here: how indicatives interact with epistemic modals. So we
can opt for an equally simple stand on what it means to add information to a context.
Even before getting all the details laid out, we can see how the doubly
shifty behavior of if -clauses will be able to predict what needs predicting
about how indicatives and epistemic modals interact. The difference between
interpreting q against the backdrop of the prior context C and against the
backdrop of C + p is a difference that makes no difference if q has no context
sensitive bits in it. No wonder we missed it! But if q does have context
sensitive bits in it — like might or must, whose semantic value depends
non-trivially on C — then this is a difference that makes all the difference.
For example: consider a modal like must q. The contexts C and C + p may
well determine different sets of possibilities. Since must q depends exactly
on whether that set of possibilities has only q-worlds in it, we then get
a difference. Thus if must q is the consequent of an indicative, context-
shiftiness matters.
Here is the simplest way of constructing a semantics around the blueprint:

Definition 8.1 (iffiness + shiftiness).

i. if defined, ⟦if p q⟧^{C,i} = 1 iff Ci ∩ ⟦p⟧^C ⊆ ⟦q⟧^{C+p}
ii. C + p = λi. Ci ∩ ⟦p⟧^C

Such a story about if is iffy: if expresses a relation between relevant antecedent
and consequent worlds and that relation lives up to all the constraints
we insisted on earlier. Hence if means all. And it expresses that no
matter whether it scopes over a universal modal or an existential modal or
no modal at all in the consequent. It is also doubly shifty. It is index-shifty
since the truth of if p q at i depends on the truth of the constituent q


at worlds other than i. It is context-shifty since the truth of if p q in C
depends on the truth of the constituent q in contexts other than C.
The if/modal interactions that were such trouble were only trouble because
we forgot to keep track of the context-shifting job of if-clauses. And
doing that, even in the simple context-shifting in Definition 8.1, is enough to
make iffiness sit better with the Facts.
I know that just one of my marbles is in the box (either Red or Yellow)
but do not know which it is. Narrowscope the modals. Then all of
these can be true together:

(20) a. Red might be in the box and Yellow might be in the box.
might p ∧ might q
b. If Yellow isn’t in the box, then Red must be.
if ¬q must p
c. If Red isn’t in the box, then Yellow must be.
if ¬p must q

Observation 8.1 (iffiness & consistency). Assume iffiness + shiftiness
(Definition 8.1). Suppose p and q partition the possibilities in C. The
(narrowscoped) sentences in (20) can all be true together in C.

Proof. Here is why. Suppose, for concreteness and without loss of generality,
that C contains just two worlds: i, a (p ∧ ¬q)-world, and j, a (q ∧ ¬p)-world.
So (20a) is true at i.
Now take (20b). It is true at i in C, given iffiness + shiftiness, iff all the
possibilities in Ci ∩ ⟦¬q⟧ are possibilities that ⟦must p⟧^{C+¬q} maps to true.
Thus we have to see whether the following holds:

if k ∈ Ci ∩ ⟦¬q⟧ then ⟦must p⟧^{C+¬q,k} = 1

Iff this is so is (20b) true at i in C. But Ci ∩ ⟦¬q⟧ = {i}, so we have to
see whether or not ⟦must p⟧^{C+¬q,i} = 1. Equivalently: the if is true at i iff
(C + ¬q)i ⊆ ⟦p⟧. And since i is in fact a p-world the if is true at i in C. And
mutatis mutandis for (20c).
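
The doubly shifty semantics can be prototyped directly. The following is a minimal sketch under simplifying assumptions of my own, not the official definition: a context is a flat set of worlds (so Ci is the same set for every i in C), a world is the set of atoms true at it, sentences are functions from a context and a world to a truth value, and definedness is not modeled. All function names (atom, plus, if_ and so on) are hypothetical.

```python
# Sentences are modeled as functions from (context, world) to bool.
# A context is a flat frozenset of worlds, so Ci = C for every i in C
# (a simplifying assumption).

def atom(name):
    return lambda ctx, w: name in w      # a world is the set of true atoms

def neg(phi):
    return lambda ctx, w: not phi(ctx, w)

def conj(phi, psi):
    return lambda ctx, w: phi(ctx, w) and psi(ctx, w)

def might(phi):
    return lambda ctx, w: any(phi(ctx, k) for k in ctx)

def must(phi):
    return lambda ctx, w: all(phi(ctx, k) for k in ctx)

def plus(ctx, phi):
    """C + p: keep the worlds of C where p is true (Definition 8.1(ii))."""
    return frozenset(k for k in ctx if phi(ctx, k))

def if_(p, q):
    """Definition 8.1(i): every p-world of C verifies q relative to C + p."""
    def sem(ctx, w):
        shifted = plus(ctx, p)           # context shift
        return all(q(shifted, k) for k in shifted)   # index shift
    return sem

p, q = atom("red"), atom("yellow")
i = frozenset({"red"})       # a (p and not-q)-world
j = frozenset({"yellow"})    # a (q and not-p)-world
C = frozenset({i, j})

s20a = conj(might(p), might(q))
s20b = if_(neg(q), must(p))
s20c = if_(neg(p), must(q))
print(s20a(C, i), s20b(C, i), s20c(C, i))  # True True True
```

The narrowscoped must in (20b) is evaluated against the shifted context, which contains only i, and that is why it comes out true even though must p is false in C itself.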

The operator view isn’t at odds with consistency after all. It is also easy
to predict if/must (Fact 2) and if/might (Fact 3). Here are the narrowscoped
analyses of the motivating examples:

(21) a. If Carl is at the party, then Lenny must be at the party.

if p must q


b. If Carl is at the party, then Lenny is at the party.

if p q
(22) a. If my team wins out, they might win it all.
if p might q
b. It might turn out that my team wins out and wins it all.
might (p ∧ q)

Observation 8.2 (iffiness, if/must, & if/might). Assume iffiness + shiftiness
(Definition 8.1). Then:
i. If S1 , then S2 ≈ If S1 , then must S2
ii. If S1 , then might S2 ≈ might [S1 and S2 ]

Proof. If must q is true then so is q, no matter the world and context. So
it’s easy to see that when (21a) is true so is (21b). Now suppose (21b) is
true at i (with respect to C). Then all of the p-worlds in Ci are q-worlds
(Ci ∩ ⟦p⟧ ⊆ ⟦q⟧^{C+p}). But if they are all worlds at which q is true, then i (and
so, given well-behavedness, every world in Ci) is equally a world at which
must q is true (with respect to C + p). And so (21a) is true, at i in C, if (21b)
is. That’s just what if/must requires.
if/might is no different. The noteworthy part is seeing how iffiness +
shiftiness predicts that when (22a) is true then so is (22b). Note that (22a) is
true at i (with respect to C) just in case all of the p-worlds in Ci are worlds
where might q, evaluated in C + p, is true. By well-behavedness we have

if j, k ∈ Ci ∩ ⟦p⟧ then (C + p)j = (C + p)k = Ci ∩ ⟦p⟧

If there is a q-world in (C + p)j, then might q is true throughout this set.
Since might q is an existential modal, if it is true with respect to C + p it
must also be true with respect to C. (Updating contexts with + is monotone.)
Whence it follows that the if with a commingling might is true at i iff among
the p-worlds in Ci lies a q-world. And any such q-world will do to witness
the truth of might (p ∧ q) at i in C. That’s just what if/might requires.
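
Both equivalences can also be confirmed by brute force on a small model. The sketch below is an illustration under assumptions of my own: contexts are flat sets of worlds over the two atoms p and q, the function names are hypothetical, and the if is taken to be undefined when no p-world is compatible with the context, so such contexts are skipped.

```python
from itertools import combinations

def atom(name):
    return lambda ctx, w: name in w      # a world is the set of true atoms

def conj(phi, psi):
    return lambda ctx, w: phi(ctx, w) and psi(ctx, w)

def might(phi):
    return lambda ctx, w: any(phi(ctx, k) for k in ctx)

def must(phi):
    return lambda ctx, w: all(phi(ctx, k) for k in ctx)

def plus(ctx, phi):
    """C + p: the worlds of C where p is true."""
    return frozenset(k for k in ctx if phi(ctx, k))

def if_(p, q):
    """Doubly shifty if: every p-world of C verifies q relative to C + p."""
    def sem(ctx, w):
        shifted = plus(ctx, p)
        return all(q(shifted, k) for k in shifted)
    return sem

ATOMS = ("p", "q")
# All four valuations of p and q, each represented by its set of true atoms.
WORLDS = [frozenset(s) for r in range(len(ATOMS) + 1)
          for s in combinations(ATOMS, r)]

def contexts(worlds):
    """All non-empty flat contexts over the given worlds."""
    for r in range(1, len(worlds) + 1):
        for subset in combinations(worlds, r):
            yield frozenset(subset)

p, q = atom("p"), atom("q")
checked = 0
for C in contexts(WORLDS):
    if not plus(C, p):       # assumption: the if is undefined without a p-world
        continue
    for i in C:
        # if/must: the bare if and its must-enriched counterpart agree
        assert if_(p, q)(C, i) == if_(p, must(q))(C, i)
        # if/might: the might-enriched if agrees with might (p and q)
        assert if_(p, might(q))(C, i) == might(conj(p, q))(C, i)
        checked += 1
print(checked)  # number of (context, world) pairs checked
```

This is a finite sanity check on one small model and not a replacement for the proof.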

Indicatives play well with epistemic modals. That interaction seemed

hard to square with old school views that take if to express a conditional
operator. No way of sorting out the relative scopes between the modals and
the conditional seemed right. But that is because we mistakenly thought that
antecedents of conditionals only have one job to do. They shift the index at
which we check to see if the consequent is true. But they also contribute to the


context that is relevant when we do that checking. Once we let antecedents do

both their index-shifting and context-shifting jobs we can safely narrowscope
and there is no special problem posed for old school iffiness. The if in
if p modal q means the same iffy thing — inclusion! — saying that all the
(relevant) worlds where p is true are worlds where modal q is true. That’s
so whether the oomph of modal is universal or existential or null and does
nothing to get in the way of explaining the Facts. That is something we fans
of iffiness ought to dig.28

9 What is at stake

Given the success of anti-iffiness why bother with iffiness at all? A fair
question. Given the context-shifting I’m advocating for fans of iffiness, what’s
the difference between old school and new school? Another fair question. I
owe some answers.
I make three (not wholly unrelated) claims. First, even if the shifty version
of the operator view and the basic version of the restrictor view covered the
same ground, there is still reason to explore the operator view. Second, the
views have different conceptual roots and different allegiances. Third, the
views don’t cover the same ground. I need to argue for each of these.
Suppose that — at least when it comes to accounting for data about the
sorts of constructions at issue here — there’s nothing to choose between
iffiness + shiftiness and anti-iffiness. Even under that assumption there
is reason to take this version of the operator view seriously. That is because
it is important to set the record straight. Maybe you don’t like skyhooks,
Chuck Taylors, and conditional connectives expressing iffy operators in your
lfs. It is important to know that whatever your reasons, it can’t be because
iffiness can’t be squared with the Facts about how if s and modals interact.
The Ramsey test intuition leads naturally to a story according to which
if expresses a bona fide conditional operator that captures the restricting
behavior of if -clauses. Thus the restricting behavior of if -clauses can be a
28 Before I said that I wanted to ignore issues about how this version of the operator view can
meet Lewis’s challenge about the ways if -clauses and adverbs of quantification interact,
saving that argument for another day. I want to stick to that (it really is an argument for
another day), but the general idea is straightforward. First, adjust the kinds of information
represented by a context so that we can sensibly quantify over individuals and the events
they participate in. Second, allow that quantificational domains can be restricted by material
in if -clauses — those domains play the role of the subordinate or derived context. Adverbs
of quantification appear under the conditional and have their usual denotations.


part of, rather than an obstacle to, their expressing something iffy. That is
But what’s the real difference between the views? One view says we have
no conditional operator, just a complicated modal with a slot for a restrictor.
The other says we have a conditional operator but that its antecedent shifts
the context thereby acting like a restrictor. Tomato/tomăto, right? Wrong!
Here is one way of seeing that. Consider three indicatives:

(23) a. If Scorpio succeeds, then the end must be near.

b. If Scorpio succeeds, then the end is near.
c. If Jimbo is in detention, then Nelson might be.

Compare (23a) and (23c). The restrictor view says these have different modals
and different arguments for each of the slots in those modals. So, apart from
the fact that each is a modal expression of some flavor or other, there is
nothing much in common between the two. They are as different as Some
students smoke and All dogs bark: each is a quantificational expression of
some flavor or other. The operator view says something different. It says that,
despite their different antecedents and different consequents, they still share
a common iffy core: there is a conditional connective in common between
them and it contributes the same thing to each of the sentences it occurs in.
Or compare the must-enriched (23a) with its bare counterpart (23b). The
restrictor view says the bare indicative just is the must-enriched version
in disguise. That is how it predicts if/must (Fact 2). It thus treats bare
indicatives as a special case, dealt with by positing a covert and inaudible
necessity modal. Maybe there is reason to posit such an operator, and an
independent and principled reason to posit the necessity modal instead of an
existential one or some different modal with different quantificational force,
and maybe those reasons outweigh the cost of the positing. The operator
view adopts a very different stance here and that is what I want to point out.
It says that bare indicatives like (23b) are ordinary conditionals and their
counterparts with must-ed consequents like (23a) are ordinary conditionals
that happen to have must in their consequents. No special cases, no positing
of inaudible operators, and if/must comes out as a prediction not as a
stipulation. None of this is a knock-down argument for or against either of
the views — it’s not meant to be — but it does highlight their difference in
All of this has been under the assumption that both the doubly shifty iffy
view and the anti-iffy restrictor view cover the same ground about how if s


and modals interact. But that’s not quite right.29 So far we have only worried
about how it is that a conditional sentence manages to express what might be
if such-and-such or how it manages to express what must be if such-and-such.
But conditional information can be more economically expressed than that.
We can just as well have a single conditional sentence that expresses what
must be and what might be if such-and-such.
A case in point: although I have lost my marbles, I know that some of
them — at least one of Red, Yellow, and Blue — are in the box. In fact I know
a bit more. I know that Yellow and Blue are in the same spot and so that Red
can’t be elsewhere if Yellow isn’t in the box. Another example: arriving at
the party, I’m not sure who’s there and who isn’t. I do know that Lenny goes
wherever Carl goes (but sometimes Lenny goes alone), but Monty never goes
where Lenny goes.

(24) a. If Yellow is in the box, then Red might be and Blue must be.
b. If Lenny is at the party, then Carl might be but Monty isn’t.

These are not exotic, each conditional is a true thing to say in the circumstances,
and there is space for the iffy view and incarnations of the anti-iffy
restrictor view to differ on the truth conditions they assign to conditionals
like these, and so the two views can’t be stylistic variants.
Here is the issue: (24a) and (24b) have glosses:

(25) a. If Yellow is in the box, then Red might be and if Yellow is in the box,
then Blue must be.
29 There are reasons independent of interaction with epistemic modals to think that anti-
iffiness, in its purest if -only-restricts form, can’t be the whole story. If it were, and if -clauses
and when-clauses have the same restricting behavior, then we wouldn’t expect differences in
cases like this:

(i) a. If the Cubs get good pitching and timely hitting after the break, they might win
it all.
b. When the Cubs get good pitching and timely hitting after the break, they might
win it all.

But we do detect a difference. I can say something true-if-hopeful with (ia). But (ib) passes
optimistic and heads straight for delusional. It’s hard to see where to locate the difference
(whether it’s semantic or pragmatic) if the semantic contribution of if and when is
purely to mark the restrictor slot for the common operator might. (Lewis (1975) noticed
that sometimes a restricting if is odd when its corresponding restricting when is fine. But
he labeled these differences “stylistic variations”.) Some arguments along these lines are
pushed by von Fintel & Iatridou (2003).


b. If Lenny is at the party, then Carl might be but if Lenny is at the
party, then Monty isn’t.

These swap a single conditional with a complicated consequent for a conjunction
of simple conditionals. The simple incarnation of the anti-iffy restrictor
view in Definition 7.2 says we do one thing when a conditional consequent
has an overt modal, and do another when there isn’t. But we didn’t say how
out in the open a modal must be to count as overt. Depending on what we
say, we can get divergence between the operator view and the restrictor view
for cases like these.
Assume — for now — that a modal is overt in a sentence iff it is the con-
nective featured in (the lf of) that sentence.30 Under that assumption, it
is then easy to see that the two stories come apart: iffiness + shiftiness
predicts that (24a) is equivalent to (25a) and so true (in the relevant context)
and anti-iffiness does not. That is because the consequent of (24a) isn’t
decorated with a leading modal (it’s a conjunction of modals), and so we have
to posit one. So (24a) gets an L-representation like

(26) $\mathit{must}(p)(\mathit{might}(\top)(q) \wedge \mathit{must}(\top)(r))$

But the truth conditions of (26) do not match the truth conditions of (25a)
and so do not match the truth conditions of the original (24a): (26) is false in
the context as we set it up even though both (24a) and (25a) are true.
Now assume, instead, that a modal is overt iff it is pronounced — no
matter how arbitrarily deeply embedded. Then (26) isn’t the right anti-iffy
lf for (24a). Instead, we get something more sensible: (24a) and (25a) have
the same lf. There’s no in-principle problem with that.31 But what about
conditionals like (24b)? We don’t want to posit a must that outscopes the
pronounced might. So we have to posit a narrowscoped one. In order to
get the posited modal appropriately restricted — so that (24b) comes out
equivalent to (25b) — we have two obvious options. Option (i): Argue that
conditionals like those in (24) are not single conditionals at all, that they are
really conjunctions of two simple conditionals. That way there is no difference
at all between the conditionals in (24) and the glosses in (25). Option (ii):
Enrich our intermediate language to allow for explicit domain-restricting
variables, and provide a mechanism for the inheriting of those restrictions
30 In this sense, a modal is any (non-equivalent) stack of musts, mights, and negations.
31 Though it doesn’t come free: it puts strain on the process of assigning formulas of L to serve
as the lfs of sentences of natural language.


across intervening operators like conjunction. Both options are open, and
party line proponents of anti-iffiness are free to pursue them. But they do
require work. Option (i) posits movement we’d not like to have to posit, treats
conditionals with apparent conjoined consequents as yet another special
case, and describes rather than explains why the conditionals in (24) are
glossable by those in (25). Option (ii) requires more expressive resources
for L than we thought necessary and requires something over and above
the anti-iffy story as it stands to say when and how domain restriction gets
inherited over distance and across intervening operators. That’s not an
argument against this option but a description of it.32
But none of that really matters: my point was that iffiness + shiftiness
and anti-iffiness aren’t notational variants. And they are not: the iffy story
takes conditionals like (24) in perfect stride. No special cases, no positing
of inaudible operators, no stress on the parser in assigning formulas of
L to serve as the lfs of conditional sentences, no movement. We get the
right truth conditions, and we get as a prediction not a stipulation that the
conditionals in (24) are equivalent to those in (25).

10 Context and dynamics

Not every fan of old school iffiness will want to follow me this far. But there
is a cost to cutting their trip short since they must then deny or explain away
one of the Facts. Iffiness, they’ll no doubt point out, is not without its own
costs: the price of iffiness is shiftiness twice over.
I reply that there are costs and then there are costs. Embracing context-
shiftiness may be a cost, but I want to point out that it is not a new cost: it
makes the analysis here a broadly dynamic semantic account of indicatives.33
So shiftiness is a cost you may already be willing to bear. I want to (briefly)
point out how it is that this shiftiness amounts to a four-fold dynamic
perspective on modals and conditionals.
32 Something in the neighborhood of Option (ii) is developed (though not with an eye to
conjoined consequents) in von Fintel (1994). For a recent discussion see Rawlins 2008.
33 The general idea that consequents are evaluated in a subordinate or derived context is
standard in dynamic semantics — see, e.g., dynamic treatments of donkey anaphora (Groe-
nendijk & Stokhof 1991) or dynamic treatments of presupposition projection in conditional
antecedents and consequents (Heim 1992; Beaver 1999) or dynamic treatments of counter-
factuals (Veltman 2005; von Fintel 2001; Gillies 2007). But exploiting a derived context isn’t
quite a litmus test for dynamics since that is something shared by a lot of Ramsey-inspired
accounts, whether or not they count as ‘dynamic’.

The version of the operator view I’m advocating for fans of iffiness takes
the truth of an indicative (at an index, in a context) to be doubly shifty.
That doubly shifty behavior makes the semantics dynamic in the sense that
interpretation both affects and is affected by the values of contextually
filled parameters. Whether if p q is true at i in C depends on C; the
indicative can be true at i for some choices of C and false at i for others. So
interpretation is context-dependent. Whether if p q is true at i in C also
depends on the subordinate context C + p. Interpreting the indicative in C
affects — temporarily — the context for interpreting some subparts of it. So
interpretation is also context-affecting.
This analysis is also dynamic in a second sense. It makes certain sentences
unstable: the truth-value a sentence gets in a context $C$ is not a stable or
persistent property since it can have a different truth-value in a context $C'$
that contains properly more information.

Definition 10.1 (persistence).

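i. $p$ is t-persistent iff $\llbracket p \rrbracket^{C,i} = 1$ and $C' \subseteq C$ imply $\llbracket p \rrbracket^{C',i} = 1$
ii. $p$ is f-persistent iff $\llbracket p \rrbracket^{C,i} = 0$ and $C' \subseteq C$ imply $\llbracket p \rrbracket^{C',i} = 0$
$p$ is persistent iff it is both t- and f-persistent.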

The boolean bits are, of course, both t- and f-persistent and so persistent full
stop. But not the modals: might, being existential, is f- but not t-persistent;
must goes the other way. And since if is a strict conditional, equivalent to a
necessity modal scoped over a material conditional, its pattern of persistence
is just like that for must.34
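
The persistence pattern just described can be checked by brute force on a toy model. The following Python sketch is only an illustration under simplifying assumptions that are not part of the analysis: a context is flattened to a single set of worlds (rather than a function from indices to sets of live possibilities), worlds are modeled as sets of true atoms, empty contexts are skipped, and the helper names (`atom`, `might`, `must`, `persistent`) are hypothetical.

```python
from itertools import chain, combinations

ATOMS = ("p", "q")

# Assumption of this sketch: a world is the frozenset of atoms true at it.
WORLDS = [frozenset(c) for r in range(len(ATOMS) + 1)
          for c in combinations(ATOMS, r)]


def subsets(ws):
    """All subsets of a collection of worlds, as frozensets."""
    ws = list(ws)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(ws, r) for r in range(len(ws) + 1))]


def atom(a):
    # Boolean atom: truth depends on the index i only, not on the context C.
    return lambda C, i: a in i


def might(phi):
    # Existential over the (flattened) context: some world in C verifies phi.
    return lambda C, i: any(phi(C, j) for j in C)


def must(phi):
    # Universal over the (flattened) context: every world in C verifies phi.
    return lambda C, i: all(phi(C, j) for j in C)


def persistent(phi, value):
    """Definition 10.1, restricted to non-empty C' (an assumption):
    if phi has `value` at (C, i), it has `value` at (C', i) for all C' ⊆ C."""
    for C in subsets(WORLDS):
        for i in WORLDS:
            if phi(C, i) != value:
                continue
            for C2 in subsets(C):
                if C2 and phi(C2, i) != value:
                    return False
    return True


if __name__ == "__main__":
    for name, phi in [("p", atom("p")),
                      ("might p", might(atom("p"))),
                      ("must p", must(atom("p")))]:
        print(name,
              "t-persistent:", persistent(phi, True),
              "f-persistent:", persistent(phi, False))
```

On this toy model the atom comes out persistent in both directions, might p only f-persistent, and must p only t-persistent, which is the pattern the text describes.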
These two senses in which the story is dynamic are two sides of the same
coin. Together they explain how it is that the narrowscoped conditionals
if ¬p must q and if ¬q must p are consistent with the partitioning
modals in might p ∧ might q. From the fact that $i \in \llbracket \textit{if } \neg p \textit{ must } q \rrbracket^{C}$ and
$i \in \llbracket \neg p \rrbracket^{C}$ it does not follow that $i \in \llbracket \textit{must } q \rrbracket^{C}$. Indeed, with my marbles
lost, this is sure to be false at i in C since might p is true. What is true at i is
that, in the subordinate or derived context C + ¬p, must q is true. That
is allowed because must isn’t f-persistent. But that is not at odds with the
might claim. And mutatis mutandis for the other if.
34 This pattern makes the treatment of indicatives here similar in some respects to Veltman’s
(1985) data semantic treatment of indicatives. But there are important differences between
the two stories. Here’s one: if p might q is data semantically equivalent to if p q .
That won’t do given Fact 3.


So we have dynamics twice over. But so far none of this looks quite
like what is usually called “dynamic semantics”. In that sense of dynamics
meaning isn’t associated with truth conditions or propositions but with
context change potentials, effects on relevant states of information. Take
an information state s to be a set of worlds, and say that what a sentence
means is how its lf updates information states. That assigns to sentences
the semantic type usually reserved for programs and recipes; they express
relations between states — intuitively, the set of pairs of states such that
executing the program in the first state terminates in the second. We can
think of all sentences in this way, thereby treating them as instructions for
changing information states. Thus: the meaning of a sentence p is how it
changes an arbitrary information state. We might put that by saying the
denotation $[p]$ applied to $s$ results in state $s'$; in post-fix notation $s[p] = s'$.35
Now say that p is true in s iff s[p] = s, for then the information p carries is
already present in s.36
Having gone this far, we can make good on the Ramsey test this way:

Definition 10.2 (Dynamic Iffiness).

$s[\textit{if } p \; q] = \{ i \in s : q \text{ is true in } s[p] \}$

Some programs have as their main point to make such-and-such the case;
others to see whether such-and-such. Programs of the latter type are tests
and they either return their input state (if such-and-such) or fail (otherwise).
That is the kind of program Definition 10.2 says if is.37 It says an if tests
s to see whether the consequent is true in s[p]. But — in good Ramseyian
spirit — s[p] is just the subordinate context got by hypothetically adding p
to s. Truth isn’t persistent here, either. That is because a state may pass a
test posed by an existential (Are there p-possibilities?) and yet have
35 For the fragment without if s the updates are as you would expect (Veltman 1996). For the
if -free fragment of L, define [·] as follows:

i. $s[p_{\mathrm{atomic}}] = \{ i \in s : i(p_{\mathrm{atomic}}) = 1 \}$
ii. $s[\neg p] = s \setminus s[p]$
iii. $s[p \wedge q] = s[p][q]$
iv. $s[\textit{might } p] = \{ i \in s : s[p] \neq \emptyset \}$
It then follows straightaway that, for the if- and modal-free fragment, $s[p] = s \cap \llbracket p \rrbracket$.
36 This generalizes the plain vanilla story about satisfaction we were taught when first learning
propositional logic: as the story usually goes, a boolean p is true relative to a set of
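possibilities $s$ iff all the possibilities in $s$ are in $\llbracket p \rrbracket$. But that is equivalent to saying that
adding $\llbracket p \rrbracket$ to the information in $s$ produces no change: $s \cap \llbracket p \rrbracket = s$ iff $s \subseteq \llbracket p \rrbracket$.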
37 See, e.g., Gillies 2004.

some narrower, less uncertain state fail it (No more p-possibilities!).

And dually for the universal must and if .
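
To make the test behavior concrete, here is a minimal Python sketch of the update clauses in footnote 35 together with Definition 10.2. It is an illustration, not a definitive implementation: the tuple encoding of formulas, the two-atom model, the names `update` and `true_in`, and the treatment of must as the dual of might are all assumptions of the sketch.

```python
from itertools import combinations

ATOMS = ("p", "q")

# Assumption: a world is the frozenset of atoms true at it;
# an information state is a frozenset of worlds.
WORLDS = frozenset(frozenset(c) for r in range(len(ATOMS) + 1)
                   for c in combinations(ATOMS, r))

# Formulas are nested tuples, e.g. ("if", P, ("might", Q)).
P, Q = ("atom", "p"), ("atom", "q")


def update(s, f):
    """Return s[f], the result of updating state s with formula f."""
    op = f[0]
    if op == "atom":
        return frozenset(i for i in s if f[1] in i)
    if op == "not":
        return s - update(s, f[1])
    if op == "and":
        return update(update(s, f[1]), f[2])
    if op == "might":
        # Test: return s unchanged if some possibility survives, else fail.
        return s if update(s, f[1]) else frozenset()
    if op == "must":
        # Assumed here to be the dual of might.
        return update(s, ("not", ("might", ("not", f[1]))))
    if op == "if":
        # Definition 10.2: test whether the consequent is true in s[p].
        return s if true_in(update(s, f[1]), f[2]) else frozenset()
    raise ValueError(f"unknown operator: {op!r}")


def true_in(s, f):
    """f is true in s iff updating s with f changes nothing."""
    return update(s, f) == s


if __name__ == "__main__":
    narrower = update(WORLDS, ("not", P))
    print("might p true in the full state:", true_in(WORLDS, ("might", P)))
    print("might p true after learning not-p:", true_in(narrower, ("might", P)))
    print("if p, might q:", true_in(WORLDS, ("if", P, ("might", Q))))
    print("if p, q:", true_in(WORLDS, ("if", P, Q)))
```

In this toy model the full state passes the test posed by might p while the narrower state obtained by adding ¬p fails it, so truth in a state is not persistent. The last two lines also show that the conditional with a might consequent and the bare conditional can come apart in a state.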
An iffy account like the one in Definition 10.2 is dynamic in this third
sense. But the doubly-shifty operator view iffiness + shiftiness doesn’t
look much like a dynamic semantics in that sense. That analysis looks static,
assigning truth-conditions to indicatives at a world in a context. And we can
recover propositions if the mood strikes us. But the two stories are in fact
the same: lack of persistence plus the global behavior of the modals and
if s in the doubly shifty story make it equivalent to a dynamic story of the
indicative that dispenses with the assignment of propositions of the normal
sort from the beginning.38 Even though I told the story about truth-values
assigned at contexts and indices, it is equivalent to a story about changing
information states. So we have dynamics thrice over.
We have gotten this far, and found ways to predict the Facts about how
indicatives and epistemic modals interact, without taking a stand on when
one sentence entails another. (Having said nothing about entailment we
couldn’t have said anything about modus ponens either.) Entailment is
usually taken to be preservation of truth at a point of evaluation: $p_1, \ldots, p_n$
entail $q$ iff $q$ is true at every point at which $p_1, \ldots, p_n$ are all true.
Not necessarily so in a dynamic semantics. Often enough,
what is important and what an entailment relation ought to capture is not
preservation of truth but preservation of information flow — what must be
true after adding the information carried by the premises. That is an update-
to-test entailment relation.39 Similarly, since the story as I have told it turns
out to be a dynamic one, we ought to expect a larger menu of options for
what it takes for a collection of premises to entail a conclusion. That is
because truth is sensitive to both context and index and contexts can shift
about as we move from the $p_i$’s to $q$. To make sure entailment is sensitive
to those shifts, we shouldn’t merely require preservation of truth-at-a-point.
Instead, just as in a more explicitly dynamic set-up, we want to augment the
38 The standard benchmark for dynamics is whether the interpretation function [·] is either
non-introspective (Can it be that $s[p] \not\subseteq s$?) or non-continuous (Can it be that $s[p] \neq
\bigcup_{i \in s} \{i\}[p]$?). In set-ups like the one in Definition 10.2, the behavior of indicatives is not
continuous. See Gillies 2009 for the details on how the iffy story as I have put it is equivalent
to a more directly dynamically iffy semantics, and how the right notions of entailment
coincide in the two set-ups.
39 For more about the space of options for entailment relations in dynamic semantics see van
Benthem 1996 and Veltman 1996. Update-to-test entailment is a lot like Stalnaker’s (1975)
notion of reasonable inference.


context with the information of the premises, evaluating $q$ not in $C$ but in
$(C + p_1) + \cdots + p_n$. And that corresponds exactly to the dynamic
update-to-test entailment relation over our language L. That is the fourth way in which
the semantics here is dynamic.
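
The update-to-test relation can be sketched directly on the dynamic side of the equivalence. The Python code below is a hedged illustration, not the paper's own formulation: it reuses the update clauses of footnote 35 and Definition 10.2 on an assumed two-atom model, and the names `update`, `true_in`, and `entails` are hypothetical. Premises entail a conclusion iff, for every state, updating with the premises in order yields a state in which the conclusion is true.

```python
from functools import reduce
from itertools import chain, combinations

ATOMS = ("p", "q")

# Assumption: worlds are frozensets of true atoms; states are sets of worlds.
WORLDS = [frozenset(c) for r in range(len(ATOMS) + 1)
          for c in combinations(ATOMS, r)]
STATES = [frozenset(c) for c in chain.from_iterable(
    combinations(WORLDS, r) for r in range(len(WORLDS) + 1))]

P, Q = ("atom", "p"), ("atom", "q")


def update(s, f):
    """Return s[f] for formulas encoded as nested tuples."""
    op = f[0]
    if op == "atom":
        return frozenset(i for i in s if f[1] in i)
    if op == "not":
        return s - update(s, f[1])
    if op == "and":
        return update(update(s, f[1]), f[2])
    if op == "might":
        return s if update(s, f[1]) else frozenset()
    if op == "if":
        return s if true_in(update(s, f[1]), f[2]) else frozenset()
    raise ValueError(f"unknown operator: {op!r}")


def true_in(s, f):
    """f is true in s iff s[f] = s."""
    return update(s, f) == s


def entails(premises, conclusion):
    """Update-to-test: the conclusion is true in s[p1]...[pn] for every s."""
    return all(true_in(reduce(update, premises, s), conclusion)
               for s in STATES)


if __name__ == "__main__":
    print("modus ponens:", entails([("if", P, Q), P], Q))
    print("not-p entails not-might-p:",
          entails([("not", P)], ("not", ("might", P))))
    print("might p, not-p entails might p:",
          entails([("might", P), ("not", P)], ("might", P)))
```

In this toy model modus ponens holds for a boolean consequent, and ¬p entails ¬might p. The sequence of premises might p, ¬p fails to entail might p, because the context has shifted by the time the conclusion is tested.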
So the doubly shifty behavior of indicatives reflects this four-fold dynamic
perspective. That is useful to know for two reasons. First because it makes
clear what the costs of iffiness are and it makes clear that some of those costs
are not completely new. Second because it makes clear that the dynamic
perspective on modals and conditionals is broader than we may have thought.
The senses in which the story here reflects a dynamic perspective are familiar
senses, but the mechanisms of that iffy story aren’t the usual mechanisms in
a dynamic semantics. The semantics traffics in things like truth conditions
and propositions, not in things like support or programs or context change
potentials. So nothing in the dynamic perspective on modals and conditionals
requires the latter sort of semantic trafficking at the expense of the former
sort. It’s broader than that.

11 An iffy upshot

My preferred version of the operator view says that an indicative is a
doubly-shifty strict conditional over sets of live possibilities. It assigns two jobs to
if -clauses. They have the index-shifting job of shifting the point at which
we check for a consequent’s truth, but they also have the context-shifting
job of shifting the context relevant for deciding at such a point whether a
consequent is true. That is how if can mean the same iffy thing no matter
whether the consequent is modal, and no matter the quantificational force of
that modal, without running afoul of the Facts.
We began with the iffy thesis that conditional information is information
of a conditional. Then we showed that — given some broad constraints for
what counts as a conditional operator properly so called — apparently no
operator view could be squared with the Facts since no way of sorting out
the scopes would work. But all of that assumed that antecedents have no
context-shifting role. So if you want to plump for an incarnation of the
operator view, and you want to square your story with the Facts, you had
better allow for context-shifting.
It’s easy to get the idea that how if s and operators like epistemic modals
interact is an argument for anti-iffiness. But since some iffy stories — this
one! — can account for that data, that’s not right. Nothing about shiftiness

rules out anti-iffiness, of course. And so it’s open to go for a restrictor
view that co-opts context-shifting to account for the way that conditionals
with conjoined consequents turn out equivalent to conjunctions of simpler
conditionals. So if you want to toe the anti-iffy line, you might want to allow
for context-shifting anyway. Of course, that makes toeing the line a bit like
not toeing the line.

References


Adams, Ernest W. 1975. The logic of conditionals. Dordrecht: Reidel.

Beaver, David. 1999. Presupposition accommodation: A plea for common
sense. In Larry Moss, Jonathan Ginzburg & Martin de Rijk (eds.), Logic,
language, and information vol. 2, 21–44. Stanford, CA: CSLI Publications.
Belnap, Nuel D. 1970. Conditional assertion and restricted quantification.
Noûs 4(1). 1–12. doi:10.2307/2214285.
Bennett, Jonathan. 2003. A philosophical guide to conditionals. Oxford
University Press.
van Benthem, Johan. 1986. Essays in logical semantics (Studies in Linguistics
and Philosophy 29). Dordrecht: Reidel.
van Benthem, Johan. 1996. Exploring logical dynamics. Stanford, CA: CSLI
Dekker, Paul. 2001. On if and only. Semantics and Linguistics Theory [SALT]
11. 114–133. http://staff.science.uva.nl/~pdekker/Papers/OIAO.pdf.
Edgington, Dorothy. 1995. Conditionals. Mind 104(414). 235–329.
Edgington, Dorothy. 2008. Conditionals. In Edward N. Zalta (ed.), The Stanford
encyclopedia of philosophy, Winter 2008 edn. http://plato.stanford.edu/
von Fintel, Kai. 1994. Restrictions on quantifier domains. Amherst, MA:
University of Massachusetts dissertation. http://semanticsarchive.net/
von Fintel, Kai. 1997. Bare plurals, bare conditionals, and only. Journal of
Semantics 14(1). 1–56. doi:10.1093/jos/14.1.1.
von Fintel, Kai. 1998a. The presupposition of subjunctive conditionals. In Uli
Sauerland & Orin Percus (eds.), The interpretive tract (MIT Working Papers
in Linguistics 25), 29–44. http://mit.edu/fintel/fintel-1998-subjunctive.


von Fintel, Kai. 1998b. Quantifiers and if-clauses. Philosophical Quarterly
48(191). 209–214. doi:10.1111/1467-9213.00095.
von Fintel, Kai. 2001. Counterfactuals in a dynamic context. In Michael
Kenstowicz (ed.), Ken Hale: A life in language, 123–152. Cambridge, MA:
MIT Press.
von Fintel, Kai. 2009. Conditionals. Ms, to appear in Seman-
tics: An international handbook of meaning, edited by Klaus von
Heusinger, Claudia Maienborn, and Paul Portner. http://mit.edu/fintel/
von Fintel, Kai & Anthony S. Gillies. 2007. An opinionated guide to epistemic
modality. In Tamar Szabó Gendler & John Hawthorne (eds.), Oxford studies
in epistemology: Volume 2, 32–62. Oxford University Press.
von Fintel, Kai & Anthony S. Gillies. 2008a. CIA leaks. The Philosophical
Review 117(1). 77–98. doi:10.1215/00318108-2007-025.
von Fintel, Kai & Anthony S. Gillies. 2008b. Might made right. In Brian
Weatherson & Andy Egan (eds.), Epistemic modals, Oxford University Press
(to appear). http://rci.rutgers.edu/~thony/fintel-gillies-2008-mmr.pdf.
von Fintel, Kai & Anthony S. Gillies. 2010. Must... stay... strong! Natural Lan-
guage Semantics to appear. http://mit.edu/fintel/fintel-gillies-2010-mss.
von Fintel, Kai & Irene Heim. 2007. Intensional semantics. Lecture Notes, MIT.
von Fintel, Kai & Sabine Iatridou. 2003. If and when if -clauses can restrict
quantifiers. Manuscript, MIT. http://web.mit.edu/fintel/www/lpw.mich.
Geurts, Bart. 2005. Entertaining alternatives: Disjunctions as modals. Natural
Language Semantics 13(4). 383–410. doi:10.1007/s11050-005-2052-7.
Gibbard, Allan. 1981. Two recent theories of conditionals. In William L.
Harper, Robert Stalnaker & Glenn Pearce (eds.), Ifs, 211–248. Dordrecht:
Gillies, Anthony S. 2004. Epistemic conditionals and conditional epistemics.
Noûs 38(4). 585–616. doi:10.1111/j.0029-4624.2004.00485.x.
Gillies, Anthony S. 2007. Counterfactual scorekeeping. Linguistics and Philos-
ophy 30(3). 329–360. doi:10.1007/s10988-007-9018-6.
Gillies, Anthony S. 2009. On truth-conditions for if (but not quite only if ).
The Philosophical Review 118(3). 325–349. doi:10.1215/00318108-2009-00.
Grice, Paul. 1989. Indicative conditionals. In Studies in the way of words,
58–85. Cambridge, MA: Harvard University Press.

Groenendijk, Jeroen & Martin Stokhof. 1991. Dynamic predicate logic. Lin-
guistics and Philosophy 14(1). 39–100. doi:10.1007/BF00628304.
Heim, Irene. 1992. Presupposition projection and the semantics of attitude
verbs. Journal of Semantics 9(3). 183–221. doi:10.1093/jos/9.3.183.
Higginbotham, James. 2003. Conditionals and compositionality. Philosophical
Perspectives 17(1). 181–194. doi:10.1111/j.1520-8583.2003.00008.x.
Jackson, Frank. 1987. Conditionals. Oxford University Press.
Kratzer, Angelika. 1981. The notional category of modality. In Hans-Jurgen
Eikmeyer & Hannes Rieser (eds.), Words, worlds, and contexts: New ap-
proaches in word semantics (Research in Text Theory 6), 38–74. Berlin: de
Kratzer, Angelika. 1986. Conditionals. Proceedings of the Chicago Linguistics
Society [CLS] 22(2). 1–15.
Lewis, David. 1973. Counterfactuals. Cambridge, MA: Harvard University
Lewis, David. 1975. Adverbs of quantification. In Edward Keenan (ed.), Formal
semantics of natural language, 3–15. Cambridge University Press.
Lewis, David. 1976. Probabilities of conditionals and conditional probability.
The Philosophical Review 85(3). 297–315. doi:10.2307/2184045.
Rawlins, Kyle. 2008. (Un)Conditionals. Santa Cruz, CA: UC Santa Cruz disser-
Stalnaker, Robert. 1968. A theory of conditionals. In Nicholas Rescher (ed.),
Studies in logical theory (American Philosophical Quarterly Monograph
Series 2), 98–112. Blackwell.
Stalnaker, Robert. 1975. Indicative conditionals. Philosophia 5(3). 269–286.
Veltman, Frank. 1985. Logics for conditionals. Amsterdam: University of
Amsterdam dissertation.
Veltman, Frank. 1996. Defaults in update semantics. Journal of Philosophical
Logic 25(3). 221–261. doi:10.1007/BF00248150.
Veltman, Frank. 2005. Making counterfactual assumptions. Journal of Se-
mantics 22(2). 159–180. doi:10.1093/jos/ffh022.

Anthony S. Gillies
Department of Philosophy
Rutgers University

Semantics & Pragmatics Volume 3, Article 9: 1–74, 2010
doi: 10.3765/sp.3.9

Cross-linguistic variation in modality systems:

The role of mood∗
Lisa Matthewson
University of British Columbia

Received 2009-07-14 / First Decision 2009-08-20 / Revision Received 2010-02-01 /

Accepted 2010-03-25 / Final Version Received 2010-05-31 / Published 2010-08-06

Abstract The St’át’imcets (Lillooet Salish) subjunctive mood appears in nine
distinct environments, with a range of semantic effects, including weakening
an imperative to a polite request, turning a question into an uncertainty
statement, and creating an ignorance free relative. The St’át’imcets subjunc-
tive also differs from Indo-European subjunctives in that it is not selected by
attitude verbs. In this paper I account for the St’át’imcets subjunctive using
Portner’s (1997) proposal that moods restrict the conversational background
of a governing modal. I argue that the St’át’imcets subjunctive restricts the
conversational background of a governing modal, but in a way which obli-
gatorily weakens the modal’s force. This obligatory modal weakening — not
found with Indo-European non-indicative moods — correlates with the fact
that St’át’imcets modals differ from Indo-European modals along the same
dimension. While Indo-European modals typically lexically encode quantifi-
cational force, but leave conversational background to context, St’át’imcets
modals encode conversational background, but leave quantificational force
to context (Matthewson, Rullmann & Davis 2007, Rullmann, Matthewson &
Davis 2008).

Keywords: Subjunctive, mood, irrealis, modals, imperatives, evidentials, questions,
free relatives, attitude verbs, Salish
∗ I am very grateful to St’át’imcets consultants Carl Alexander, Gertrude Ned, Laura Thevarge,
Rose Agnes Whitley and the late Beverley Frank. Thanks to David Beaver, Henry Davis,
Peter Jacobs, the members of the UBC Pragmatics Research Group (Patrick Littell, Meagan
Louie, Scott Mackie, Tyler Peterson, Amélia Reis Silva, Hotze Rullmann and Ryan Waldie),
three anonymous reviewers, and audiences at New York University, the University of British
Columbia and the 44th International Conference on Salish and Neighbouring Languages
for helpful feedback and discussion. Thanks to Tyler Peterson for helping prepare the
manuscript for publication. This research is supported by SSHRC grants #410-2005-0875
and #410-2007-1046.

©2010 Lisa Matthewson

This is an open-access article distributed under the terms of a Creative Commons Non-
Commercial License (creativecommons.org/licenses/by-nc/3.0).

1 Introduction

Many Indo-European languages possess both modals, lexical items which
quantify over possible worlds, and subjunctive moods, agreement paradigms
which usually require a licensing modal element. The contrast is illustrated
for Italian in (1)–(2). (1) contains modal auxiliaries; (2) contains subjunctive
mood agreement which is licensed by the matrix attitude verb.

(1) a. deve essere nell’ ufficio

must+3sg+pres+ind be in.the office
‘He must be in the office.’ (Italian; Palmer 2006: 102)
b. puo essere nell’ ufficio
may+3sg+pres+ind be in.the office
‘He may be in the office.’ (Italian; Palmer 2006: 102)

(2) dubito che impari

I.doubt that learn+3sg+pres+sbjn
‘I doubt that he’s learning.’ (Italian; Palmer 2006: 117)

Previous work on the Salish language St’át’imcets (a.k.a. Lillooet; see
Matthewson et al. 2007, Rullmann et al. 2008, and Davis, Matthewson & Rullmann
2009) has established the existence of a set of modals in this language,
which differ in their semantics from those of Indo-European. Indo-European
modals typically lexically encode distinctions of quantificational force, but
leave conversational background (in the sense of Kratzer 1981, 1991) up to
context. (1a), for example, unambiguously expresses necessity, while (1b)
unambiguously expresses possibility. However, both modals allow either
epistemic or deontic interpretations, depending on context. In contrast,
modals in St’át’imcets lexically encode conversational background, but leave
quantificational force up to context. (3a), for example, is unambiguously epis-
temic, but is compatible with either a necessity or a possibility interpretation,
depending on context. (3b) is unambiguously deontic, but similarly allows
differing quantificational strengths. See Matthewson et al. 2007, Rullmann
et al. 2008, and Davis et al. 2009 for extensive discussion.1
1 All St’át’imcets data are from primary fieldwork unless otherwise noted. Data are presented
in the practical orthography of the language developed by Jan van Eijk; see van Eijk &
Williams 1981. Abbreviations: adhort: adhortative, caus: causative, circ: circumstantial
modal, col: collective, comp: complementizer, cond: conditional, conj: conjunctive,
counter: counter to expectations, deic: deictic, deon: deontic, demon: demonstrative, det:

(3) a. wá7=k’a s-t’al l=ti=tsítcw-s=a

be=epis stat-stop in=det=house-3sg.poss=exis
‘Philomena must / might be in her house.’ only epistemic
b. lán=lhkacw=ka áts’x-en ti=kwtámts-sw=a
already=2sg.subj=deon see-dir det=husband-2sg.poss=exis
‘You must / can / may see your husband now.’ only deontic

A simplified table representing the difference between the two types of
modal system is given in Table 1:

                 quantificational force    conversational background

Indo-European    lexical                   context
St’át’imcets     context                   lexical

Table 1 Indo-European vs. St’át’imcets modal systems

In this paper I extend the cross-linguistic comparison to the realm of
mood. I argue that St’át’imcets possesses a subjunctive mood, and show that
it induces a range of apparently disparate semantic effects, depending on the
construction in which it appears. One example of the use of the subjunctive
is given in (4): it weakens the force of a deontic modal proposition (in a sense
to be made precise below). Other uses include turning imperatives into polite
requests, and turning questions into statements of uncertainty (cf. van Eijk
1997 and Davis 2006).

(4) a. gúy’t=Ø=ka ti=sk’úk’wm’it=a

sleep=3indic=deon det=child=exis
‘The child should sleep.’
determiner, dir: directive transitivizer, ds: different subject, epis: epistemic, erg: ergative,
exis: assertion of existence, foc: focus, fut: future, impf: imperfective, inch: inchoative,
indic: indicative, infer: inferential evidential, irr: irrealis, loc: locative, mid: middle
intransitive, nom: nominalizer, obj: object, prt: particle, pass: passive, perc.evid: perceived
evidence, pl: plural, poss: possessive, prep: preposition, real: realis, red: redirective
applicative, rem.past: remote past, sbjn: subjunctive, sg: singular, sim: simultaneous, stat:
stative, temp.deic: temporal deictic, ynq: yes-no question. The symbol - marks an affix
boundary and = marks a clitic boundary.

b. guy’t=ás=ka ti=sk’úk’wm’it=a
sleep=3sbjn=deon det=child=exis
‘I hope the child sleeps.’

I will show that the St’át’imcets subjunctive differs markedly from Indo-
European subjunctives, both in the environments in which it is licensed, and
in its semantic effects. I propose an analysis of the St’át’imcets subjunctive
which adopts insights put forward by Portner (1997, 2003). For Portner,
moods in various Indo-European languages place restrictions on the con-
versational background of a governing modal. I argue that the St’át’imcets
subjunctive mood can be analyzed within exactly this framework, with the
twist that in St’át’imcets, the restriction the subjunctive places on the gov-
erning modal obligatorily weakens the force of the proposition expressed.
This has an interesting consequence. While we can account for the
St’át’imcets subjunctive using the same theoretical tools as for Indo-European,
at a functional level the two languages are using their mood systems to
achieve quite different effects. In particular, St’át’imcets uses its mood sys-
tem to restrict modal force — precisely what this language does not restrict
via its lexical modals. At a functional level, then, we find the same kind of
cross-linguistic variation in the domain of mood as we do with modals. This
idea is illustrated in the simplified typology in Table 2:

                 lexically restrict    lexically restrict
                 quant. force          convers. background

Indo-European    modals                moods
St’át’imcets     moods                 modals

Table 2 Modal and mood systems

These results suggest that while individual items in the realm of mood and
modality lexically encode different aspects of meaning, the systems as a
whole have very similar expressive power.
The structure of the paper: Section 2 introduces the St’át’imcets subjunc-
tive data. I first illustrate the nine different uses of the relevant agreement
paradigm, and then argue that this agreement paradigm is a subjunctive,
rather than an irrealis mood. Section 3 shows that the St’át’imcets sub-
junctive is not amenable to existing analyses of more familiar languages.

Section 4 reviews the basic framework adopted, that of Portner (1997), and
Section 5 provides initial arguments for adopting a Portner-style approach
for St’át’imcets. Section 6 presents the formal analysis, and Section 7 applies
the analysis to a range of uses of the subjunctive. Section 8 concludes and
raises some issues for future research.

2 St’át’imcets subjunctive data

St’át’imcets possesses a complex system of subject and object agreement.
There are different subject agreement paradigms for transitive vs. intransitive
predicates. For intransitive predicates, there are three distinct subject
paradigms, one of which is glossed as ‘subjunctive’ by van Eijk (1997) and
Davis (2006).2

        indicative                       subjunctive
        indicative     nominalized

1sg     tsút=kan       n=s=tsut          tsút=an
2sg     tsút=kacw      s=tsút=su         tsút=acw
3sg     tsut=Ø         s=tsút=s          tsút=as
1pl     tsút=kalh      s=tsút=kalh       tsút=at
2pl     tsút=kal’ap    s=tsút=lap        tsút=al’ap
3pl     tsút=wit       s=tsút=i          tsút=wit=as

Table 3 Subject agreement paradigms for the intransitive predicate tsut ‘to say’ (adapted from van Eijk 1997: 146)

With transitive predicates, the situation is similar, except that there are
four separate paradigms, one of which is subjunctive.3,4
2 The cognate forms are often called ‘conjunctive’ in other Salish languages, primarily in order
to disambiguate the abbreviations for ‘subject’ and ‘subjunctive’. See for example Kroeber
3 The traditional terms for the first two columns are ‘indicative’ and ‘nominalized’ respectively.
The nominalized endings are identical to nominal possessive endings, and are glossed as
‘poss’ in the data. The choice between these first two paradigms is syntactically governed: the
so-called ‘indicative’ surfaces in matrix clauses and relative clauses, while the nominalized
paradigm appears in subordinate clauses. Both these sets contrast semantically, in all
syntactic environments, with the subjunctive, hence my overall categorization of the first
two paradigms as ‘indicative’.
4 See Kroeber 1999 and Davis 2000 for justification of the analysis of subject inflection

Lisa Matthewson

In subsection 2.1 I illustrate the uses of the paradigms glossed as
subjunctive, and in subsection 2.2 I argue that these paradigms more closely
approximate familiar subjunctives, rather than irrealis moods.

2.1 Uses of the St’át’imcets subjunctive

The mood I am glossing as ‘subjunctive’ has a wide range of uses, which
at first glance are not easily unifiable. I illustrate all of them here. First,
the subjunctive functions to turn a plain assertion into a wish (Davis 2006:
chapter 24).5

(5) a. nilh s=Lémya7 ti=kél7=a

foc nom=Lémya7 det=first=exis
‘Lémya7 is first.’
b. nílh=as s=Lémya7 ku=kéla7
foc=3sbjn nom=Lémya7 det=first
‘May Lémya7 be first.’

(6) a. ámh=as ku=scwétpcen-su!

good=3sbjn det=birthday=2sg.poss
‘May your birthday be good!’
b. ámh=as ku=s=wá7=su!
good=3sbjn det=nom=be=2sg.poss
‘Best wishes!’ [‘May your being be good.’] (Davis 2006: ch. 24)

This use of the subjunctive is very restricted (see van Eijk 1997: 147).
Minimal pairs cannot usually be constructed for ordinary assertions, as
shown in (7)–(9).

(7) a. kwis lhkúnsa

rain today
‘It’s raining today.’
b. *kwís=as lhkúnsa
rain=3sbjn today
intended: ‘May it rain today.’
assumed here. I do not provide the transitive paradigms, as subject markers vary based on
the person and number of the object and the table is excessively large. See van Eijk 1997 and
Davis 2006 for details.
5 The determiner alternation between (5a) and (5b) (ti=. . . =a vs. ku=) is predictable, but
irrelevant for current concerns. See Matthewson 1998, 1999 for discussion.

(8) a. áma ti=sq’ít=a

good det=day=exis
‘It is a good day.’
b. *ámh=as ti=sq’ít=a
good=3sbjn det=day=exis
intended: ‘May it be a good day.’

(9) a. guy’t ti=sk’úk’wm’it=a
sleep det=child=exis
‘The child is sleeping.’
b. *guy’t=ás ti=sk’úk’wm’it=a
sleep=3sbjn det=child=exis
intended: ‘I hope the child sleeps.’

In general, the subjunctive can be added to a plain assertion only in a
cleft structure, as in (5), or in conventionalized wishes, as in (6). I return
to this issue below.
The more usual case of the subjunctive creating a wish-statement is when
it co-occurs with the deontic modal ka, as in (10)–(11).

(10) a. plan=ka=tí7=t’u7 wa7 máys-n-as

already=deon=demon=prt impf fix-dir-3erg
‘He should have fixed that already.’
b. plan=as=ká=tí7=t’u7 wa7 máys-n-as
already=3sbjn=deon=demon=prt impf fix-dir-3erg
‘I wish he had fixed that already.’

(11) a. gúy’t=ka ti=sk’úk’wm’it=a

sleep=deon det=child=exis
‘The child should sleep.’
b. gúy’t=ás=ka ti=sk’úk’wm’it=a
sleep=3sbjn=deon det=child=exis
‘I hope the child sleeps.’

When used with the deontic modal ka, in addition to the ‘wish’ interpretation
shown in (10)–(11), the subjunctive can also render a ‘pretend to be ...’
interpretation, as in (12).6
6 The data in (12) are from the Upper St’át’imcets dialect; in Lower St’át’imcets, (12a) is
corrected to (i), which has the subjunctive but lacks the deontic modal. This independent

(12) a. skalúl7=acw=ka: saq’w knáti7 múta7 em7ímn-em

owl=2sg.sbjn=deon fly deic and animal.noise-mid
‘Pretend to be an owl: fly around and hoot.’
(Davis 2006: chapter 24)
b. snu=hás=ka ku=skícza7
2sg.emph=3sbjn=deon det=mother
‘Pretend to be the mother.’
(Whitley, Davis, Matthewson & Frank (editors) no date)

The fourth construction which licenses the subjunctive is the imperative;
the subjunctive weakens an imperative to a polite request (Davis 2006:
chapter 24). In each of (13)–(15), the subjunctive imperative in (b) is construed
as ‘more polite’ than the plain imperative in (a). The subjunctive is particularly
common in negative requests, as in (15).

(13) a. lts7á=malh lh=kits-in’=ál’ap!

deic=adhort comp=put.down-dir=2pl.sbjn
‘Just put it over here!’
b. lts7á=has=malh lh=kits-in’=ál’ap
deic=3sbjn=adhort comp=put.down-dir=2pl.sbjn
‘Could you put it down here?’/‘You may as well put it down over
here.’7 (adapted from Davis 2006: chapter 24)

(14) a. nás=malh áku7 pankúph=a

go=adhort deic Vancouver=exis
‘You’d better go to Vancouver.’
b. nás=acw=malh áku7 pankúph=a
go=2sg.sbjn=adhort deic Vancouver=exis
‘You could go to Vancouver.’
pronoun construction is argued by Thoma (2007) to be a concealed cleft. I return to this
issue below.

(i) nu=hás ku=kalúla7

2sg.emph=3sbjn det=owl
‘Pretend to be an owl.’
7 The third person subjunctive ending appears here because the structure is bi-clausal,
involving a third-person impersonal main predicate: ‘It is here that you could put it down.’

(15) a. cw7aoz kw=s=sek’w-en-ácw ta=nk’wanústen’=a

neg det=nom=break-dir-2sg.erg det=window=exis
‘Don’t break the window.’
b. cw7áoz=as kw=s=sek’w-en-ácw ta=nk’wanústen’=a
neg=3sbjn det=nom=break-dir-2sg.erg det=window=exis
‘Don’t break the window.’

Fifth, in combination with an evidential or a future modal, the subjunctive
helps to turn wh-questions into statements of uncertainty or wondering.

(16) a. kanem=lhkán=k’a
‘What happened to me?’
b. kanem=án=k’a
‘I don’t know what happened to me.’ / ‘I wonder what I’m doing.’8

(17) a. kanem=lhkácw=kelh múta7

do.what=2sg.indic=fut again
‘What are you going to be doing later?’
b. kanem=ácw=kelh múta7
do.what=2sg.sbjn=fut again
‘I wonder what you are going to do again.’ (van Eijk 1997: 215)

(18) a. nká7=kelh lh=cúz’=acw nas

where=fut comp=going.to=2sg.sbjn go
‘Where will you go?’
b. nká7=as=kelh lh=cúz’=acw nas
where=3sbjn=fut comp=going.to=2sg.sbjn go
‘Wherever will you go?’ / ‘I wonder where you are going to go
now.’ (adapted from Davis 2006: chapter 24)

The same effect arises with yes-no questions. In combination with the evi-
dential k’a or a future modal, the subjunctive also turns these into statements
of uncertainty which are often translated using ‘maybe’ or ‘I wonder’.
8 For expository reasons, k’a was glossed as ‘epistemic’ in (3a) above, but from now on will be
glossed as ‘inferential’. Matthewson et al. (2007) analyze k’a as an epistemic modal which
carries a presupposition that there is inferential evidence for the claim.

(19) a. lán=ha kwán-ens-as

already=ynq take-dir-3erg
‘Has she already got my letter?’
b. lan=as=há=k’a kwán-ens-as
already=3sbjn=ynq=infer take-dir-3erg
‘I wonder if she’s already got my letter.’ / ‘I don’t know if she got
my letter or not.’
(20) wa7=as=há=k’a tsicw
impf=3sbjn=ynq=infer get.there
i=n-sésq’wez’=a, cw7aoz kw=en
det.pl=1sg.poss-younger.sibling=exis neg det=1sg.poss
‘Perhaps my younger siblings went along, I don’t know.’
(Matthewson 2005: 265)
In combination with a wh-indefinite and the evidential k’a, the subjunctive
creates free relatives with an ‘ignorance/free choice’ reading; see Davis 2006
for discussion.
(21) a. qwatsáts=t’u7 múta7 súxwast áku7, t’ak aylh áku7,
leave=prt again go.downhill deic go then deic
nílh=k’a s=npzán-as
foc=infer nom=meet(dir)-3erg
k’a=lh=swát=as=k’a káti7 ku=npzán-as
infer=comp=who=3sbjn=infer deic det=meet(dir)-3erg
‘So he set off downhill again, went down, and then he met who-
ever he met.’ (van Eijk & Williams 1981: 66, cited in Davis 2009)
b. o, púpen’=lhkan [ta=stam’=as=á=k’a]
oh find=1sg.indic [det=what=3sbjn=exis=infer]
‘Oh, I’ve found something or other.’
(Unpublished story by “Bill” Edwards, cited in Davis 2009)
When used in combination with the scalar particle t’u7, the subjunctive
creates a statement translated as ‘might as well’ or ‘may as well’.

(22) a. wá7=lhkan=t’u7 wa7 k’wzús-em

impf=1sg.indic=prt impf work-mid
‘I am just working.’
b. wá7=an=t’u7 wa7 k’wzús-em
impf=1sg.sbjn=prt impf work-mid
‘I might as well stay and work.’
(23) a. wá7=lhkacw=t’u7 lts7a lhkúnsa ku=sgáp
be=2sg.indic=prt deic now det=evening
‘You are staying here for the night.’
b. wá7=acw=t’u7 lts7a lhkúnsa ku=sgáp
be=2sg.sbjn=prt deic now det=evening
‘You may as well stay here for the night.’
And finally, in combination with a wh-word and the scalar particle t’u7,
the subjunctive creates free relatives with a universal / indifference reading.
(24) a. wa7 táw-em ki=smán’c=a, ns7á7z’-em
impf sell-mid det.col=tobacco=exis trade-mid
‘He was selling tobacco, trading it for whatever . . . ’
(van Eijk & Williams 1981: 74, cited in Davis 2009)
b. wa7 kwám=wit ku=káopi, ku=súkwa, ku=saplín,
impf take(mid)=3pl det=coffee det=sugar det=flour
[stám’=as=t’u7 cw7aoz
[what=3sbjn=prt neg
‘They got coffee, sugar, flour, whatever we couldn’t grow on our
land. . . ’ (Matthewson 2005: 105, cited in Davis 2009)
c. [stám’=as=t’u7 káti7 i=wá7
[what=3sbjn=prt deic det.pl=impf
ka-k’ac-s-twítas-a i=n-slalíl’tem=a]
circ-dry-caus-3pl.erg-circ det.pl=1sg.poss-parents=exis]
wa7 ts’áqw-an’-em lh=as sútik
impf eat-dir-1pl.erg comp(impf)=3sbjn winter
‘Whatever my parents could dry, we ate in wintertime.’
(Matthewson 2005: 141, cited in Davis 2009)

The nine uses of the St’át’imcets subjunctive are summarized in Table 4:

environment                                 indicative meaning              subjunctive meaning

plain assertion                             assertion                       wish
deontic modal                               deontic necessity/possibility   wish
deontic modal                               deontic necessity/possibility   ‘pretend’
imperative                                  command                         polite request
wh-question + evidential/future modal       question                        uncertainty/wondering
yes-no question + evidential/future modal   question                        uncertainty/wondering
wh-word + evidential                        question                        ignorance free relative
scalar particle t’u7                        ‘just/still’                    ‘might as well’
wh-word + scalar particle t’u7              N/A                             indifference free relative

Table 4 Uses of the St’át’imcets subjunctive

These are all the cases where the subjunctive has a semantic effect; in
the next sub-section we will also see some cases where the subjunctive is
obligatory and semantically redundant. I will not aim to account for the entire
panoply of subjunctive effects in one paper. However, the analysis I offer
will explain the first seven uses, setting aside for future research only the
two uses which involve the particle t’u7. See Section 8 for some speculative
comments about the subjunctive in combination with t’u7.

2.2 This is a subjunctive mood

In this sub-section I justify the use of the term ‘subjunctive’ for the subject
agreements being investigated. The choice of terminology is intended to
reflect the fact that the St’át’imcets mood patterns with Indo-European sub-
junctives, rather than with Amerindian irrealis moods, in several respects.
However, we will see below that the St’át’imcets subjunctive also differs
semantically in important ways from Indo-European subjunctives.9

Palmer (2006) observes that there is a broad geographical typology, such
that European languages often encode an indicative/subjunctive distinc-
tion, while Amerindian and Papuan languages often encode a realis/irrealis
distinction. A typical irrealis-marking system is illustrated in (25).

(25) a. ho bu-busal-en age qo-in

pig sim-run.out-3sg+ds+real 3pl hit-3pl+rem.past
‘They killed the pig as it ran out.’ (Amele; Palmer 2006: 5)
b. ho bu-busal-eb age qo-qag-an
pig sim-run.out-3sg+ds+irr 3pl hit-3pl-fut
‘They will kill the pig as it runs out.’ (Amele; Palmer 2006: 5)

According to Palmer (2006: 145), the indicative/subjunctive distinction
and the realis/irrealis distinction are ‘basically the same’. The core function
of both a subjunctive and an irrealis is to encode ‘non-assertion’.10 However,
there are differences in distribution and in syntactic functions.
First, Palmer observes that subjunctive is not marked independently of
other inflectional categories such as person and number. Instead, there is
typically a full subjunctive paradigm. On the other hand, irrealis is often
marked by a single element. In this respect, the St’át’imcets mood patterns
like a subjunctive; see Table 3 above.
Second, in main clauses, irrealis marking is often used for questions,
futures and denials; this is not the case for main clause subjunctives. In this
respect also, the St’át’imcets mood patterns like a subjunctive. It is not used
to mark questions, futures or denials. (26)–(28) all have indicative marking.

9 This raises a terminological issue which arises in many areas of grammar. Should we apply
terms which were invented for European languages to similar — but not identical — categories
in other languages? For example, should we say ‘The perfect / definite determiner /
subjunctive in language X differs semantically from its English counterpart’, or should we
say ‘Language X lacks a perfect / definite determiner / subjunctive’, because it lacks an
element with the exact semantics of the English categories? I adopt the former approach
here, as I think it leads to productive cross-linguistic comparison, and because it suggests
that the traditional terms do not represent primitive sets of properties, but rather potentially
decomposable ones.
10 Palmer does not provide a definition of ‘non-assertion’. He observes that common reasons
why a proposition is not asserted are because the speaker doubts its veracity, because the
proposition is unrealized, or because it is presupposed (Palmer 2006: 3). See Section 3 below
for discussion.

(26) t’íq=Ø=ha kw=s=Josie?

arrive=3indic=ynq det=nom=Josie
‘Did Josie arrive?’

(27) t’íq=Ø=kelh kw=s=Josie

arrive=3indic=fut det=nom=Josie
‘Josie will arrive.’

(28) cw7aoz kw=s=t’iq=s s=Josie

neg det=nom=arrive=3poss nom=Josie
‘Josie didn’t arrive.’

Third, Palmer notes that subjunctive marking is obligatory and redundant
only in subordinate clauses, while irrealis marking is often obligatory and
redundant in main clauses. Here again, the St’át’imcets mood patterns like a
subjunctive. It is obligatory and redundant in only three cases. The first is
when it is embedded under the complementizer lh=. lh= is glossed by van Eijk
(1997) as ‘hypothetical’, and analyzed by Davis (2006) as a complementizer
which introduces subjunctive clauses, including if-clauses, as in (29a) and
(29b), temporal adjuncts (29b), locative adjuncts (29c), and complements to
the evidential k’a when this is used as a (focused) adverb (29d).

(29) a. lh=cw7áoz*(=as)=ka kw=s=gúy’t=su,

comp=neg*(=3sbjn)=irr det=nom=sleep=2sg.poss
lán=ka=tu7 wa7 xzum i=n’wt’ústen-sw=a
already=irr=then impf big det.pl=eye-2sg.poss=exis
‘If you hadn’t slept, your eyes would have been big already.’
(van Eijk & Williams 1981: 12)
b. xwáyt=wit=ka lh=wa7=wit*(=ás)=t’u7 qyax
many.people.die=3pl=irr comp=be=3pl*(=3sbjn)=prt drunk
múta7 tqálk’-em lh=w*(=as) qyáx=wit
and drive-mid comp=impf*(=3sbjn) drunk=3pl
‘They would die if they got drunk and drove when they were
drunk.’ (Matthewson 2005: 367)
c. lts7a lh=wa7*(=as) qwál’qwel’t
deic comp=impf*(=3sbjn) hurt
‘It is here that it is hurting.’

d. k’a lh=7án’was*(=as) sq’it,

maybe comp=two*(=3sbjn) day
ka-láx-s-as-a n-skícez7=a
circ-remember-caus-3erg-circ 1sg.poss-mother=exis
‘Maybe two days later, my mother remembered the fish my
brother had been soaking.’
(Matthewson 2005: 152; cited in Davis 2006: chapter 23)11

The second case where the St’át’imcets subjunctive is obligatory and
redundant is when it is embedded under the complementizer i= ‘when’, as in (30).
i= has a similar distribution to lh=, but is restricted to past-time contexts.
See van Eijk 1997: 235-6 and Davis 2006: chapter 27 for discussion.

(30) a. i=kél7=at tsicw, áts’x-en-em

when.past=first=1pl.sbjn get.there see-dir-1pl.erg
i=cw7ít=a tsitcw
det=many=exis house
‘When we first got there, we saw lots of houses.’
(Matthewson 2005: 74)
b. wá7=lhkan lexláx-s i=kwís*(=as)
impf=1sg.indic remember-caus when.past=fall*(=3sbjn)
na=n-sésq’wez’=a, s=Harold Peter
det.abs=1sg.poss-younger.sibling=exis nom=Harold Peter
‘I remember when my little brother was born, Harold Peter.’
(Matthewson 2005: 354-5)
11 Incidentally, Davis (2006: chapter 23) observes that ‘two or more k’a lh= clauses strung
together form the closest equivalent in [St’át’imcets] of [English] “either...or”.’ An example is
given in (i).
(i) k’a lh=xw7utsin-qín’=as, k’a lh=tsilkst-qín’=as=kelh
maybe comp=four-animal=3sbjn maybe comp=five-animal=3sbjn=fut
‘It’ll either be a four point or a five point buck.’ (Davis 2006: chapter 23)
As Davis implies, St’át’imcets lacks any lexical item which renders logical disjunction, and
constructions like (i), although used to translate English ‘or’, are literally two ‘maybe’-clauses
strung together.

Finally, the subjunctive is obligatory when it appears in combination
with the perceived-evidence evidential =an’. =an’ is analyzed by Matthewson
et al. (2007) as an epistemic modal which is defined only if the speaker has
perceived indirect evidence for the prejacent proposition.

(31) a. *táyt=kacw=an’
‘You must be hungry.’
b. táyt=acw=an’
‘You must be hungry.’

(32) a. *nílh=Ø=an’ s=Sylvia ku=xílh-tal’i

foc=3indic=perc.evid nom=Sylvia det=do(caus)-top
‘Apparently it was Sylvia who did it.’
b. nílh=as=an’ s=Sylvia ku=xílh-tal’i
foc=3sbjn=perc.evid nom=Sylvia det=do(caus)-top
‘Apparently it was Sylvia who did it.’
(Matthewson et al. 2007: 208)

The perceived-evidence evidential is the only environment in the language
where the subjunctive is obligatory in a matrix clause. I assume that the
subjunctive lacks semantic import here, as an otherwise very similar
evidential lákw7a does not allow the subjunctive in cases parallel to (31)–(32)
(Matthewson 2010, to appear).
The conclusion is that St’át’imcets, in spite of being an Amerindian lan-
guage, has a mood which patterns, at least morpho-syntactically, like a
subjunctive rather than an irrealis. This fits with how van Eijk (1997) and
Davis (2000, 2006) gloss the relevant forms. However, we will see in the next
section that the St’át’imcets subjunctive differs semantically in interesting
ways from European subjunctives.

3 Why previous analyses do not work for St’át’imcets

The vast majority of formal research on the subjunctive deals with Indo-
European. In languages such as the Romance languages, the subjunctive
mood is used for wishes, fears, speculations, doubts, obligations, reports,
unrealized events, or presupposed propositions. Some examples are provided
in (33)–(34).

(33) a. creo que aprende

I.believe that learn+3sg+pres+indic
‘I believe that he is learning.’ (Spanish; Palmer 2006: 5)
b. dudo que aprenda
I.doubt that learn+3sg+pres+sbjn
‘I doubt that he’s learning.’ (Spanish; Palmer 2006: 5)

(34) potessi venire anch’ io

can+1sg+pres+sbjn come also I
‘If only I could come too.’ (Italian; Palmer 2006: 109)

In this section I briefly discuss some of the main approaches to the
subjunctive. I cannot do justice to the full array of proposals in the literature;
the goal is to provide enough background to establish that the St’át’imcets
subjunctive is not amenable to a range of existing approaches.
One pervasive line of thought is that subjunctive encodes a general se-
mantic contribution of ‘non-assertion’ (Bolinger 1968, Terrell & Hooper 1974,
Hooper 1975, Klein 1975, Farkas 1992, Lunn 1995, Palmer 2006, Haverkate
2002, Panzeri 2003, among others). One recent formal proposal in this line
is that of Farkas (2003). Farkas argues that there is a correlation between
indicative mood and complements which have assertive context change po-
tential relative to the embedded environment. Assertive context change for a
matrix clause is defined as in (35); the context set of worlds Wc is narrowed.

(35) Assertive context change

c + φ is assertive iff Wc′ = Wc ∩ p, where c′ is the output context.
(Farkas 2003: 5)
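
The set-theoretic content of (35) can be made concrete with a short Python sketch. This is only an illustrative toy and not part of Farkas's proposal: possible worlds are modelled as strings, propositions and context sets as sets of worlds, and the function names `assertive_update` and `is_assertive` are hypothetical.

```python
# Toy model of (35): worlds are strings, propositions are sets of worlds.
# All names here are illustrative assumptions, not Farkas's own notation.

def assertive_update(context_set, proposition):
    """Return the output context set Wc' = Wc ∩ p."""
    return frozenset(context_set) & frozenset(proposition)


def is_assertive(input_set, output_set, proposition):
    """Check whether an update from input_set to output_set is assertive
    in the sense of (35), i.e. whether output_set equals input_set ∩ p."""
    return frozenset(output_set) == assertive_update(input_set, proposition)


# Wc: worlds still compatible with what has been established so far.
Wc = frozenset({"w1", "w2", "w3", "w4"})
# p: worlds in which the asserted sentence is true.
p = frozenset({"w2", "w3", "w5"})

Wc_out = assertive_update(Wc, p)
print(sorted(Wc_out))  # ['w2', 'w3']
```

On this toy model, an update that discards worlds on some basis other than the truth of p will in general fail the `is_assertive` check, even if it also shrinks the context set.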

Farkas provides an analysis of assertion in embedded contexts which
predicts that positive epistemic predicates like believe or know take indicative
complements, as these complements are asserted relative to the matrix
subject’s epistemic state.12
Predicates of assertion (‘say’, ‘assert’) and of fiction (‘dream’, ‘imagine’)
similarly introduce complements which are assertively added to the embed-
ded speech context, and also take indicative complements. On the other
hand, complements to desideratives (‘want’, ‘wish’, ‘desire’) and directives
(‘command’, ‘direct’, ‘request’) are not assertive. Rather than eliminating
12 Predicates like believe take subjunctive complements in Italian; see Giorgi & Pianesi 1997,
among many others, for discussion.

worlds in the context set where the complement is false, these predicates
eliminate worlds in the context set which are low on an evaluative ranking.13
Thus, these predicates take the subjunctive:

(36) Maria vrea să-i răspundă

Maria wants subj-cl answer.sbjn
‘Maria wants to answer him.’ (Romanian; Farkas 2003: 2)

Giannakidou (1997, 1998, 2009) offers an alternative characterization
of the distribution of the subjunctive, according to which it appears in
nonveridical contexts, while indicative appears in veridical contexts. The
relevant definition is given in (37):

(37) A propositional operator F is veridical iff from the truth of Fp we
can infer that p is true relative to some individual x (i.e., in some
individual x’s epistemic model) . . . If inference to the truth of p under
F is not possible, F is nonveridical. (Giannakidou 2009: 1889)
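
The definition in (37) can be checked mechanically on a small finite model. The following Python sketch is an illustration under strong simplifying assumptions that are not Giannakidou's own: worlds are strings, a proposition is a set of worlds, an individual's epistemic model at a world is a set of worlds, and 'p is true relative to x' is read as 'p holds throughout x's epistemic model'. The operators `believe` and `want`, the individual `maria`, and all helper names are hypothetical.

```python
from itertools import combinations

WORLDS = frozenset({"w1", "w2", "w3"})


def all_propositions(worlds):
    """Every subset of the world set counts as a proposition."""
    items = sorted(worlds)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


# Assumed epistemic models: for each individual, a function from an
# evaluation world to the set of worlds compatible with what they believe.
EPISTEMIC = {
    "maria": lambda w: frozenset({"w1", "w2"}),
    "speaker": lambda w: frozenset({"w1", "w2", "w3"}),
}

# Assumed bouletic alternatives: the worlds Maria desires most.
MARIA_DESIRES = frozenset({"w3"})


def believe(p, w):
    """'Maria believes p' holds at w iff p holds throughout her model."""
    return EPISTEMIC["maria"](w) <= p


def want(p, w):
    """'Maria wants p' holds at w iff p holds in all her desired worlds."""
    return MARIA_DESIRES <= p


def is_veridical(operator, models=EPISTEMIC, worlds=WORLDS):
    """F is veridical iff whenever F p is true at a world, p holds
    throughout the epistemic model of at least one individual."""
    for w in worlds:
        for p in all_propositions(worlds):
            if operator(p, w) and not any(m(w) <= p for m in models.values()):
                return False
    return True


print(is_veridical(believe))  # True
print(is_veridical(want))     # False
```

On this toy model the belief operator comes out veridical and the desiderative operator nonveridical, which is the kind of contrast the nonveridicality approach is intended to capture.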

According to this analysis, the division between indicative-taking and
subjunctive-taking predicates relies on whether at least one epistemic agent
is committed to the truth of the embedded proposition. Giannakidou’s
approach predicts a similar division between indicative- and subjunctive-taking
predicates to Farkas’s. In Modern Greek, the indicative is found
in complements to predicates of assertion or fiction, epistemics, factives
and semi-factives. The subjunctive is found in complements to volitionals,
directives, modals, permissives, negatives, and verbs of fear (Giannakidou
2009: 9).14
An approach which aims to derive mood selection directly from the
semantics of subordinating predicates is that of Villalta (2009). Villalta argues
13 The complements of desideratives are also not ‘decided’ relative to their context set, which
is what is actually crucial here for Farkas (2003). Farkas proposes an Optimality Theory
account involving the two constraints in (i):

(i) *SUBJ/+Decided *IND/-Assert

Different rankings of these two constraints give rise to different mood choices in Romanian
vs. French for emotive factive predicates like ‘be sorry/happy’, ‘regret’. Emotive factives are
+Decided but -Assertive, and take the indicative in Romanian and the subjunctive in French.
14 Giannakidou (2009) proposes that the Modern Greek subjunctive complementizer na con-
tributes temporal semantics (introducing a ‘now’ variable). The generalization is still that
subjunctive appears in non-veridical contexts; see Giannakidou 2009 for details.

that subjunctive-selecting predicates are those whose embedded propositions
are compared to contextual alternatives on a scale encoded by the predicate.
The contribution of the subjunctive is to evaluate the contextual alternatives.
Quer (1998, 2001), looking mainly at Catalan and Spanish, argues that the
subjunctive signals a shift in the model of the evaluation of the truth of the
proposition. For unembedded assertions, the anchor is the Speaker and the
model is the epistemic model of the Speaker. Operators which introduce sub-
junctive introduce buletic models, or other models which create comparative
relations among worlds. This predicts we will find subjunctive in purpose
clauses, and predicts indicative/subjunctive alternations in restrictive rel-
ative clauses, concessives, and free relatives. Quer (2009) also discusses
indicative/subjunctive alternations in conditionals, claiming that indicative
appears in protases that are ‘realistic in the sense that they quantify over
worlds which are close enough to the actual one’ (2009: 1780). Subjunctive is
used when the worlds are further away from the actual one or even disjoint
from it.
An approach to mood which draws on notions from noun phrase se-
mantics is offered by Baker & Travis (1997). Baker and Travis argue that in
Mohawk, mood marks a division between ‘verbal specificity’ (‘factive’ mood)
and Kamp/Heim-style indefiniteness (two variants of non-factive mood, pre-
viously called the ‘future’ and the ‘optative’). Indefinite/non-factive mood
appears in future contexts, in past habituals, in negative clauses, under the
verbs ‘promise’ and ‘want’, and in free relatives with a non-specific reading.
What links all these indefinite-mood environments, according to Baker and
Travis, is the same feature that characterizes indefinite noun phrases in the
Kamp/Heim system: a free variable (in the Mohawk case, an event variable)
which undergoes existential closure in the scope of various operators.
This ends our brief tour through some major formal approaches to the
subjunctive.15 The reader is referred to Portner (2003) for further overview
and discussion. In the next sub-section I show that the St’át’imcets subjunc-
tive does not behave like the Indo-European or Mohawk subjunctives, and
that a new approach is required.
15 I defer discussion of Portner’s (1997) analysis to Section 5, since I will be adapting Portner’s
approach for St’át’imcets.

3.1 The St’át’imcets subjunctive is not amenable to existing approaches

The St’át’imcets subjunctive differs from familiar subjunctives in both its
distribution and semantic effects. Although there are some initial similarities,
such as the fact that both St’át’imcets and Indo-European subjunctives can be
used to express wishes and hopes, St’át’imcets mood displays no sensitivity to
the choice of matrix predicate. Thus, unlike in Romance or Greek, predicates
of assertion, belief and fiction are not differentiated from desideratives or
directives. All attitude verbs in St’át’imcets take the indicative, as illustrated
for a representative range in (38).16,17

(38) a. tsut k=Laura kw=s=t’iq=Ø k=John

say det=Laura det=nom=arrive=3indic det=John
‘Laura said that John came.’
b. tsut-ánwas k=Laura kw=s=t’iq=Ø k=John
say-inside det=Laura det=nom=arrive=3indic det=John
‘Laura thought that John came.’
c. zwát-en-as k=Laura kw=s=t’iq=Ø k=John
know-dir-3erg det=Laura det=nom=arrive=3indic det=John
‘Laura knew that John came.’
16 Interestingly, the same is not true of the related language Skwxwú7mesh (Squamish). In
Skwxwú7mesh, the subjunctive (glossed as ‘conjunctive’; see fn. 2) is obligatory under ‘tell
someone to do something’ (as in (i)), but is optional under ‘I think’, depending on whether
the speaker knows that the event did not take place (ii-iii) (all data from Peter Jacobs, p.c.).

(i) chen tsu-n-Ø-Ø mi as uys

I tell-dir-dat-3obj come 3conj come.inside
‘I told him to come inside.’

(ii) chen ta7aw’n kwi s-Ø-s mi uys

I think det nom-real-3poss come come.inside
‘I think he came inside.’

(iii) chen ta7aw’n k’-as mi uys

I think irr-3conj come come.inside
‘I thought he came inside (but then I found out that he’s still outside playing).’

Jacobs (1992) analyzes the mood distinction in Skwxwú7mesh as encoding speaker certainty,
which suggests that it differs from the St’át’imcets mood system.
17 The expected subject inflection in the embedded clauses in (38) would actually be possessive
=s; see van Eijk 1997 and Davis 2006. However, many modern speakers prefer to omit the
possessive ending and to use matrix indicative =Ø in these contexts. This does not affect
the point at hand, as the variation is between two forms of indicative marking.

d. kw7íkwl’acw k=Laura kw=s=t’iq=Ø k=John
dream det=Laura det=nom=arrive=3indic det=John
‘Laura dreamt that John came.’
e. xát’-min’-as k=Laura kw=s=t’iq=Ø k=John
want-red-3erg det=Laura det=nom=arrive=3indic det=John
‘Laura wanted John to come.’
f. tsa7cw k=Laura kw=s=t’iq=Ø k=John
glad det=Laura det=nom=arrive=3indic det=John
‘Laura was happy that John came.’
g. tsún-as k=Laura k=John kw=s=ts7as=Ø
say(dir)-3erg det=Laura det=John det=nom=come=3indic
‘Laura told John to come.’18

The St’át’imcets subjunctive is also not used under negated verbs of
belief or report, as it is in many European languages (cf. Palmer 2006: 116).
Compare Spanish (39a) with St’át’imcets (39b) and (39c).

(39) a. no creo que aprenda

not I.think that learn+3sg+pres+sbjn
‘I don’t think that he is learning.’ (Spanish; Palmer 2006: 117)
b. cw7aoz kw=en=tsut-ánwas kw=s=zwátet-cal=s
neg det=1sg.poss=say-inside det=nom=know-act=3poss
‘I don’t think that he is learning.’
c. cw7aoz kw=s=tsut=s kw=s=Aggie
neg det=nom=say=3poss det=nom=Aggie
kw=s=t’cum=s i=gáp=as
det=nom=win=3poss when.past=evening=3sbjn
‘Aggie didn’t say she won last night.’

Nor does the St’át’imcets subjunctive give rise to interpretive differences
inside relative clauses. In some Indo-European languages, an
indicative/subjunctive contrast in restrictive relatives gives rise to a distinction
which has variously been analyzed as referential/attributive, specific/non-specific,
or wide-scope/narrow-scope (see Rivero 1975, Farkas 1992, Giannakidou
1997, Beghelli 1998, Quer 2001, among many others). This is illustrated in
18 The predicate in (38g) differs from that in (38a)–(38f) because the ‘ordering’ environment in
(38g) requires an unergative embedded verb.

(40) for Catalan. Quer’s analysis of these examples involves a shifting of the
model in which the descriptive condition in the relative clause is interpreted;
the effect is one of apparent ‘wide-scope’ for the descriptive condition in the
indicative (40a), as opposed to in the subjunctive (40b).

(40) a. necessiten un alcalde [que fa grans

need.3pl a mayor that make.indic.prs.3sg big
‘They need a mayor that makes big investments.’
(Catalan; Quer 2001: 90)
b. necessiten un alcalde [que faci grans
need.3pl a mayor that make.sbjn.prs.3sg big
‘They need a mayor that makes big investments.’
(Catalan; Quer 2001: 90)

In St’át’imcets, nominal restrictive relatives uniformly take indicative
marking, as shown in (41). The distinction which in Catalan is encoded
by mood is achieved by means of determiner choice in St’át’imcets (see
Matthewson 1998, 1999 for analysis).

(41) a. wa7 xat’-min’-ítas ti=kúkwpi7=a wa7

impf want-red-3pl.erg det=chief=exis impf
ka-nuk’wa7-s-tanemwít-a k=wa=s mays
circ-help-caus-3pl.pass-circ det=impf=3poss fix
‘They need a (particular) chief who can help them build houses.’
[wide-scope indefinite]
b. wa7 xat’-min’-ítas ku=kúkwpi7 wa7
impf want-red-3pl.erg det=chief impf
ka-nuk’wa7-s-tanemwít-a k=wa=s mays
circ-help-caus-3pl.pass-circ det=impf=3poss fix
‘They need a(ny) chief who can help them build houses.’
[narrow-scope indefinite]

The mood effects seen in conditionals in some Indo-European languages
are also absent in St’át’imcets. The antecedents of both notionally indicative
and subjunctive conditionals are obligatorily marked with the subjunctive,
as shown in (42), a paradigm borrowed from Quer 2009: 1780. Although
there are ways to distinguish the different types of conditionals, they do not
involve an indicative-subjunctive mood alternation.

(42) a. Context: I’m looking for John. You say:

lh=7áts’x-en=an, nílh=t’u7 s=qwál’-en-tsin
comp=see-dir=1sg.sbjn foc=prt nom=tell-dir-2sg.obj

‘If I see him, I’ll tell you.’

b. Context: I’m looking for John, and I suspect you know where he
is but you haven’t been telling me. You say:
lh=7ats’x-en=án=ka, sqwal’-en-tsín=lhkan=kelh
comp=see-dir=1sg.sbjn=irr tell-dir-2sg.obj=1sg.indic=fut

‘If I saw him, I would tell you.’

c. Context: I was looking for John, but he left town before I could
find him. You say:

‘If I had seen him, I would have told you.’

The St’át’imcets subjunctive is also not like the Mohawk one. Unlike in
Mohawk, St’át’imcets futures take the indicative, as shown in (43); so do past
habituals, as shown in (44), and plain negatives, as in (45).

(43) a. ats’x-en-tsí=lhkan=kelh lh=nátcw=as

see-dir-2sg.obj=1sg.indic=fut comp=one.day.away=3sbjn
‘I’ll see you tomorrow.’
b. *ats’x-en-tsín=an=kelh lh=nátcw=as
see-dir-2sg.obj=1sg.sbjn=fut comp=one.day.away=3sbjn
‘I’ll see you tomorrow.’

(44) a. wa7=lhkalh=wí7=tu7 n-záw’-em ku=qú7

impf=1pl.indic=emph=then loc-get.water-mid det=water
lhel=ta=qú7qu7=a múta7 lhel=ta=tswáw’cw=a
from=det=water(pl)=exis and from=det=creek=exis
‘We used to fetch water from the spring and the creek.’
(Matthewson 2005: 370)
b. *wa7=at=wí7=tu7 n-záw’-em ku=qú7
impf=1pl.sbjn=emph=then loc-get.water-mid det=water
lhel=ta=qú7qu7=a múta7 lhel=ta=tswáw’cw=a
from=det=water(pl)=exis and from=det=creek=exis
‘We used to fetch water from the spring and the creek.’

(45) a. áy=t’u7 kw=en=gúy’t ku=pála7 sgap

neg=prt det=1sg.poss=sleep det=one evening
‘I didn’t sleep one night.’ (Matthewson 2005: 267)
b. *áy=t’u7 kw=s=gúy’t=an ku=pála7 sgap
neg=prt det=nom=sleep=1sg.sbjn det=one evening
‘I didn’t sleep one night.’

Finally, there are the cases where the St’át’imcets subjunctive does ap-
pear, with a predictable meaning difference, which are not attested in other
languages. These include the use of the St’át’imcets subjunctive to weaken
an imperative to a polite request, or to help turn a question into a statement
of uncertainty (see examples in (13)–(15) and (16)–(20) above).
I will argue below that in spite of these major empirical differences
between the St’át’imcets subjunctive and that of familiar languages, the basic
framework for mood semantics advanced by Portner (1997) can be adapted
to capture all the St’át’imcets facts. This will support Portner’s proposal
that moods are dependent on modals and place restrictions on the modal
environments in which they appear.

4 Basic framework: Portner 1997

Portner’s (1997) leading idea is that moods place presuppositions on the
modal environment in which they appear. More precisely, moods typically
restrict properties of the accessibility relation associated with a governing
modal operator (see also Portner 2003: 64). The modal operator may be
provided by a higher attitude verb or modal; it may also, in unembedded
situations, be provided by context.
For illustration, let us first see how Portner analyzes English ‘mood-
indicating may’. In each of the examples in (46), the may is not the ordinary
modal may; it is not asserting possibility. (46b), for example, does not mean
‘it is possible that it is possible that Sue wins the race.’

(46) a. Jack wishes that you may be happy.

b. It is possible that Sue may win the race.
c. May you have a pleasant journey! (Portner 1997: 190)

Portner argues that mood-indicating may presupposes that p is doxastically
possible (possible according to someone’s beliefs). For example, (46a)
presupposes that Jack believes it is possible for you to be happy. He provides
the analysis in (47).

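(47) For any reference situation $r$, modal force $F$, and modal context $R$,
$\llbracket \textit{may}_{dep}(\varphi) \rrbracket^{r,F,R}$ is only defined if $\varphi$ is possible with respect to
$\mathrm{Dox}_{\alpha}(r)$, where $\alpha$ is the denotation of the matrix subject.
When defined, $\llbracket \textit{may}_{dep}(\varphi) \rrbracket^{r,F,R} = \llbracket \varphi \rrbracket^{r,F,R}$ (Portner 1997: 201)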

Portner further argues that there are actually two mood-indicating may’s,
with slightly different properties. Mood-indicating may under wish, pray,
etc. (as in (46a)) or in unembedded clauses (as in (46c)) has an extra require-
ment: it presupposes that the accessibility relation R is buletic (deals with
somebody’s wishes or desires).
The discussion of mood-indicating may illustrates an important aspect
of Portner’s analysis, namely that moods place presuppositions on the modal
accessibility relation (a type of conversational background). With English
mood-indicating may, there is a doxastic and sometimes a buletic restriction.
For the English mandative subjunctive, which appears in imperatives as well
as in embedded contexts as in (48), R must be deontic, as shown in (49).

(48) Mary demands that you join us downstairs at 3pm. (Portner 1997: 202)

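(49) For any reference situation $r$, modal force $F$, and modal context $R$,
$\llbracket \textit{m-subj}(\varphi) \rrbracket^{r,F,R}$ is only defined if $R$ is a deontic accessibility relation.
When defined, $\llbracket \textit{m-subj}(\varphi) \rrbracket^{r,F,R} = \llbracket \varphi \rrbracket^{r,F,R}$ (Portner 1997: 202)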
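
The definedness conditions in (47) and (49) can be read as partial identity functions: the mood returns the denotation of its complement unchanged, but only when the modal environment has the required property. The following is a minimal sketch under simplifying assumptions that are not Portner's: propositions are sets of worlds, the reference situation is treated as a world, and the `ModalContext` class with its `kind` tag is a hypothetical device for recording what type of accessibility relation R is. Presupposition failure is modelled by returning `None`.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional

World = str
Prop = FrozenSet[World]  # a proposition: the set of worlds in which it is true


@dataclass(frozen=True)
class ModalContext:
    """Hypothetical stand-in for the modal context R."""
    kind: str                            # e.g. "deontic", "buletic", "epistemic"
    accessible: Callable[[World], Prop]  # worlds accessible from a reference situation


def m_subj(phi: Prop, r: World, context: ModalContext) -> Optional[Prop]:
    """Sketch of (49): defined only if R is deontic; then phi is returned unchanged."""
    if context.kind != "deontic":
        return None  # presupposition failure: no denotation
    return phi


def may_dep(phi: Prop, r: World, dox: Callable[[World], Prop]) -> Optional[Prop]:
    """Sketch of (47): defined only if phi is compatible with the subject's beliefs at r."""
    if not (dox(r) & phi):
        return None  # phi is not doxastically possible
    return phi


# Toy model (all names are illustrative assumptions).
you_join = frozenset({"w1", "w2"})
deontic = ModalContext("deontic", lambda r: frozenset({"w1"}))
epistemic = ModalContext("epistemic", lambda r: frozenset({"w1", "w3"}))

jack_beliefs = lambda r: frozenset({"w2", "w3"})  # Dox_Jack(r)
you_happy = frozenset({"w2"})
you_fly = frozenset({"w1"})

print(m_subj(you_join, "w0", deontic) == you_join)       # defined: identity on phi
print(m_subj(you_join, "w0", epistemic))                  # undefined
print(may_dep(you_happy, "w0", jack_beliefs) == you_happy)
print(may_dep(you_fly, "w0", jack_beliefs))               # undefined
```

In both functions the mood contributes nothing to the asserted content; all of its work is done by the condition that decides whether a denotation exists at all.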

For Italian moods, Portner claims that R is restricted to being (non-)factive.19

The idea that moods restrict modal conversational backgrounds is common
to several other modal-based analyses of mood (e.g., Farkas 1992 and Giorgi &
Pianesi 1997),20 and is also found in James 1986. What James calls ‘manners
of representation’ are root vs. epistemic conversational backgrounds:

The ambiguity of the modal auxiliaries . . . supports the hypothesis
that there are two separate manners of representation.
Moods . . . signify manners of representation. They are not am-
biguous, however; they signify one modality or the other (James
1986: 15).

In the analysis to follow, I will adopt Portner’s idea that moods place
restrictions on a governing modal operator. I will argue that the empirical
differences between the St’át’imcets subjunctive and Indo-European sub-
junctives derive from the fact that the former restricts the conversational
background of the modal operator in such a way that the modal force is weakened.

5 Adapting Portner’s approach for the St’át’imcets subjunctive

I deal here only with the constructions where the subjunctive has a semantic
effect; I will not address the cases of obligatory subjunctive agreement which
were presented in subsection 2.2.21 My analysis will account for all meaningful
uses of the St’át’imcets subjunctive except the two uses which contain the
particle t’u7. See Section 8 for some discussion of the t’u7-constructions.
19 Interestingly, the Italian indicative imposes a modal force restriction as well as a conver-
sational background restriction; it is only used with a force of necessity (Portner 1997:
20 According to Giorgi and Pianesi, the subjunctive indicates that the ordering source is non-
empty; this is a restriction on a conversational background.
21 The analysis presented below is actually compatible with the obligatory presence of the
subjunctive in if -clauses introduced by lh=, and may even help to explain why lh= obligatorily
selects the subjunctive when it means ‘if’, but selects indicative when it means ‘before’.
Thanks to Henry Davis for discussion of this point, and see Davis 2006: chapter 26. (See also
van Eijk 1997: 217, although van Eijk analyzes the subjunctive-inducing lh= as distinct from
(e)lh= ‘before’.) As for the other obligatory cases of subjunctive, these may be grammaticized,
semantically bleached relics of original meaningful uses, aided by the fact that subjunctive
marking is intertwined with person agreement.

5.1 The St’át’imcets subjunctive presupposes rather than asserts a modal semantics


The first thing to establish is that like Portner’s moods, the St’át’imcets
subjunctive does not itself assert a modal semantics, but is dependent on
a governing modal operator. One piece of evidence for this is that the
St’át’imcets subjunctive must co-occur with an overt modal in almost all its
uses. Of the seven uses of the subjunctive being analyzed here, five of them
have an overt modal (the deontics, ‘pretend’, wh-questions, yes-no questions,
ignorance free relatives), one of them is plausibly analyzed as containing a
covert modal (imperatives), and only one is non-modal (plain assertions). As
noted above, the addition of the subjunctive to plain assertions is extremely
restricted and at least semi-conventionalized. If the subjunctive were itself
independently modal, it would be difficult to explain the minimal contrasts
in (50)–(51).22

(50) a. *gúy’t=as ti=sk’úk’wm’it=a

sleep=3sbjn det=child=exis
Attempted: ‘I hope the child sleeps.’
b. gúy’t=as=ka ti=sk’úk’wm’it=a
sleep=3sbjn=deon det=child=exis
‘I hope the child sleeps.’

(51) a. *skalúl7=acw: saq’w knáti7 múta7 em7ímnem

owl=2sg.sbjn fly deic and make.animal.noise
‘Pretend to be an owl: fly around and hoot.’
b. skalúl7=acw=ka: saq’w knáti7 múta7 em7ímnem
owl=2sg.sbjn=deon fly deic and make.animal.noise
‘Pretend to be an owl: fly around and hoot.’

22 As noted above, Portner’s analysis does allow for unembedded uses of non-indicative moods,
with the modal accessibility relation being provided by context. So there is no problem with
the cases where the St’át’imcets subjunctive can appear without a c-commanding modal
(as in (5)–(6)). Of course, we would eventually like to explain when these unembedded
subjunctives can and cannot appear. Portner (1997: 201) notes for mood-indicating may and
the mandative subjunctive that ‘Neither of these have a completely predictable distribution,
in that neither occurs in every context in which a purely semantic account would predict
that it could . . . it must be admitted that lexical and syntactic idiosyncracies come into play.’

Furthermore, just like with English mood-indicating may, the interpretation
of St’át’imcets subjunctive clauses indicates that the mood does not
itself contribute modal semantics. For example, (50b) does not mean ‘It
should be the case that the child should sleep’.
The St’át’imcets subjunctive also patterns morphosyntactically like a
mood rather than like real modals in the language. As shown above, the
subjunctive is obligatorily selected by some complementizers, unlike modals.
The subjunctive is also fused with subject marking into a full paradigm, unlike
the modals, which are independent second-position clitics.23 I therefore
conclude that the St’át’imcets subjunctive does not itself introduce a modal
operator, but requires one in its environment.

5.2 The St’át’imcets subjunctive does not presuppose a particular conversational background

The St’át’imcets subjunctive differs from most Indo-European moods in that
it cannot be analyzed as being restricted to a certain type of conversational
background. This is illustrated by the fact that it allows deontic, buletic or
epistemic uses. Deontic conversational backgrounds arise with imperatives,
as in (52) or (14b), repeated here in (53):

(52) ets7á=has=(malh) lh=xílh-ts=al’ap

deic=3sbjn=(adhort) comp=do-caus=2pl.sbjn
‘Could you do it like this, you folks?’

(53) nás=acw=malh áku7 pankúph=a

go=2sg.sbjn=adhort deic Vancouver=exis
‘You could go to Vancouver.’

Buletic conversational backgrounds arise with the modal ka:

(54) plan=as=ká=ti7=t’u7 wa7 máys-n-as

already=3sbjn=deon=demon=prt impf fix-dir-3erg
‘I wish he had fixed that already.’

(55) guy’t=ás=ka ti=sk’úk’wm’it=a

sleep=3sbjn=deon det=child=exis
‘I hope the child sleeps.’

23 Or in one case, a circumfix on the verb; see Davis et al. 2009.

And epistemic conversational backgrounds arise with questions.

(56) nká7=as=kelh lh=cúz’=acw nas

where=3sbjn=fut comp=going.to=2sg.sbjn go
‘Wherever will you go?’ / ‘I wonder where you are going to go now.’
(adapted from Davis 2006: chapter 24)

(57) lan=as=há=k’a kwán-ens-as

already=3sbjn=ynq=infer take-dir-3erg
‘I wonder if she’s already got my letter.’ / ‘I don’t know if she got my
letter or not.’

These data suggest that the St’át’imcets subjunctive is not analyzable in
the same way as the European moods discussed by Portner (1997), which
hardwire a restriction to a particular type of conversational background.

5.3 Instead, the St’át’imcets subjunctive functions to weaken the modal force


The core idea of my proposal is that the St’át’imcets subjunctive restricts its
governing modal only in such a way as to weaken the force of the proposi-
tion expressed. The intuition that the St’át’imcets subjunctive weakens the
proposition it adds to was already expressed by Davis (2006: chapter 24):

The best way to characterize this meaning difference is in terms
of the ‘force’ of a sentence. With ordinary indicative subjects,
a sentence expresses a straightforward assertion, question or
command; but with subjunctive subjects, the effect is to weaken
the force of the sentence, so that an assertion becomes a wish,
a question becomes a conjecture, and a command becomes a

The important question is what exactly is meant by ‘weakening’ in this
context, and how to derive the various effects of the subjunctive in a unified
way. I will claim that the St’át’imcets subjunctive restricts the conversational
background of a governing modal in such a way that the modal imparts a
force no stronger than weak necessity. Since there are no modals which
lexically encode quantificational force in St’át’imcets, this will mean that the
subjunctive must appear in the scope of a variable-force modal, and will
restrict it to a weakened interpretation.

6 Analysis

The idea to be pursued is that the St’át’imcets subjunctive restricts the
domain of quantification of a c-commanding modal, so that the interpretation
which obtains is weaker than pure necessity.24 Rullmann et al. (2008) argue
that St’át’imcets possesses no modals which are lexically restricted for a
pure necessity reading (see also Matthewson et al. 2007 and Davis et al.
2009). Instead, all St’át’imcets modals seem to allow both weak and strong
interpretations (see (3) above, and see the references cited for many more
examples). So, what we need to say is that the subjunctive forces an already
potentially weak c-commanding modal to have a weak reading. In order to
see how this will work, I first very briefly review the basics of a Kratzerian
analysis of modals, and then outline how modals in St’át’imcets are analyzed.
We will then add the subjunctive.
Modals in a standard analysis introduce quantifiers over possible worlds.
The set of worlds quantified over is narrowed down by two conversational
backgrounds. First, it is narrowed down by the modal base, and then it is
ordered and further narrowed down by the ordering source. The modal base
and the ordering source are both usually provided by context in English,
although there are systematic contributions of tense and aspect to the con-
versational background (see e.g., Condoravdi 2002 for discussion). A simple
example is given in (58).

(58) Chris must do his homework.

Modal base (circumstantial): The set of worlds in which the relevant

facts are the same as in the actual world (e.g., we ignore worlds where
Chris is not in school).
Ordering source (normative): Orders worlds in the modal base so
that the best worlds are those which come closest to the ideal repre-
sented by the school’s homework regulations.
Universal quantification: In all the best worlds, Chris does his homework.
24 I would like to thank David Beaver and three anonymous reviewers for helping me clarify
aspects of the analysis and its presentation.

Rullmann et al. (2008) argue that there are two differences between English
universal modals like must and St’át’imcets modals. First, the St’át’imcets
modals place presuppositions on the conversational backgrounds. Second,
the set of best worlds is further narrowed down by a choice function which
picks out a potentially proper subset of the best worlds to be quantified
over. This can lead to a weaker reading, depending on context. The idea is
illustrated informally in (59).25

(59) gúy’t=ka ti=sk’úk’wm’it=a

sleep=deon det=child=exis
‘The child must/should/can sleep.’

Modal base (presupposed to be circumstantial): Worlds in which the

relevant facts about our family are the same as in the actual world.
Ordering source (presupposed to be normative): The best worlds
are those in which my desire for an early night is fulfilled.
Choice function: Picks out a potentially proper subset of the best worlds.
Universal quantification: In all worlds in the subset of the best worlds
picked out by the choice function, the child sleeps.

Since the quantification is over a potentially proper subset of the best
worlds, sentences like (59) can be interpreted with any strength ranging
from a pure possibility (‘The child can/may sleep’) to a strong necessity
(‘The child must sleep’). The apparent variable quantificational force of
St’át’imcets modals is thus derived not by ambiguity in the quantifier itself,
but by restricting the size of the set of worlds quantified over by the universal
quantifier. The larger the subset of the best worlds selected by the choice
function, the stronger the proposition expressed. As a limiting case, the
choice function may be the identity function. This results in a reading that is
equivalent to the standard analysis of strong modals like must in English.

25 A very sensible suggestion that we should replace Rullmann et al.’s choice function with
an(other) ordering source has been made independently by Kratzer (2009), Portner (2009),
and Peterson (2009, 2010). I will in fact do this below when I compare the current analysis to
that of von Fintel & Iatridou (2008).

Now we turn to the subjunctive. In order to capture the idea that the
subjunctive weakens the c-commanding modal, I analyze the subjunctive as
presupposing that at least one world in the set of best worlds is a world
in which the embedded proposition is false. This will prevent the choice
function from being the identity function.26 This is illustrated informally for
a deontic case in (60).
(60) guy’t=ás=ka ti=sk’úk’wm’it=a
sleep=3sbjn=deon det=child=exis
‘I hope the child sleeps.’

Modal base (presupposed to be circumstantial): Worlds in which the

relevant facts about our family are the same as in the actual world.
Ordering source (presupposed to be normative): The best worlds
are those in which my desire for an early night is fulfilled.
Choice function (must pick out a proper subset of the best worlds, to
avoid a contradiction with the presupposition of the subjunctive): The
very best worlds are those in which my spouse’s desire for an early
night is also fulfilled.
Universal quantification: All the very best worlds are worlds in which
the child sleeps.
(59) allows a strong interpretation which (60) disallows. If the choice
function in (59) is the identity function, the speaker will be satisfied only
if the child sleeps (‘in all the worlds where my desire for an early night is
fulfilled, the child sleeps’). In (60), the speaker will certainly be satisfied if
the child sleeps, but there are also other ways to make him/her happy. (60)
asserts only that ‘in all the worlds where my and my spouse’s desires for an
early night are fulfilled, the child sleeps’ — so the speaker’s desires may be
satisfied if the speaker’s spouse looks after the child while the speaker goes
to sleep. The requirement that (60) places on the child is thus weaker than a
strong necessity.
In the remainder of this section I provide a more formal implementation
of this idea, and in Section 7 I show how the analysis accounts for a wide
range of uses of the St’át’imcets subjunctive, including imperative-weakening,
question-weakening, and ignorance free relatives.
26 Thanks to Hotze Rullmann (p.c.) for discussion of this point. The requirement that p be false
in at least one of the best worlds appears reminiscent of a nonveridicality-style analysis,
and there may be some deep significance to this. However, the analyses are different. For
Giannakidou, the issue is always epistemic, as veridicality is defined in terms of a truth
entailment in an individual’s epistemic model; see (37). Thus, subjunctive is predicted
under verbs like ‘want’, as propositions under ‘want’ are not entailed to be true in any
individual’s epistemic model. Under my analysis, the subjunctive has an anaphoric modal
base and ordering source. I will show in subsection 7.5 that my analysis correctly predicts
the indicative under verbs like ‘want’ in St’át’imcets.

I adopt the following basic definitions from von Fintel & Heim 2007. (61)
shows the ordering of worlds according to how well they satisfy the set of
propositions in the ordering source, and (62) shows how the best worlds are selected.

(61) Given a set of worlds $X$ and a set of propositions $P$, define the strict
partial order $<_P$ as follows:
$\forall w_1, w_2 \in X : w_1 <_P w_2$ iff $\{p \in P : p(w_2) = 1\} \subset \{p \in P : p(w_1) = 1\}$
For any worlds $w_1$ and $w_2$, $w_1$ comes closer to the ideal set up by
the ordering source than $w_2$ does iff the set of propositions in the
ordering source which are true in $w_2$ is a proper subset of the set of
propositions in the ordering source which are true in $w_1$.

(62) For a given strict partial order $<_P$ on worlds, define the selection
function $\max_P$ that selects the set of $<_P$-best worlds from any set $X$
of worlds:
$\forall X \subseteq W : \max_P(X) = \{w \in X : \neg\exists w' \in X : w' <_P w\}$
(von Fintel & Heim 2007: 55)
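
The definitions in (61) and (62) can be sketched computationally. The code below is an illustration only, assuming a finite set of worlds and modelling each proposition as the set of worlds in which it is true; the function names and the three-world toy model are assumptions made for the example and do not come from von Fintel & Heim.

```python
from typing import FrozenSet, Iterable, Set

World = str
Prop = FrozenSet[World]  # a proposition is modelled as the set of worlds where it is true


def satisfied(w: World, ordering_source: Iterable[Prop]) -> Set[Prop]:
    """The propositions of the ordering source that are true in world w."""
    return {p for p in ordering_source if w in p}


def closer(w1: World, w2: World, ordering_source: Iterable[Prop]) -> bool:
    """w1 <_P w2 as in (61): what is true in w2 is a proper subset of what is true in w1."""
    source = list(ordering_source)
    return satisfied(w2, source) < satisfied(w1, source)


def max_worlds(xs: Iterable[World], ordering_source: Iterable[Prop]) -> Set[World]:
    """max_P(X) as in (62): the worlds in X that no world in X is strictly closer than."""
    worlds = set(xs)
    source = list(ordering_source)
    return {w for w in worlds if not any(closer(v, w, source) for v in worlds)}


# Hypothetical toy model: three worlds and a two-proposition ordering source.
X = {"w1", "w2", "w3"}
homework_done = frozenset({"w1", "w2"})
handed_in_on_time = frozenset({"w1"})
P = [homework_done, handed_in_on_time]

best = max_worlds(X, P)
print(best)  # only w1 satisfies both propositions
```

Because the order is only partial, several worlds can be best at once: with an empty ordering source no world outranks any other, and `max_worlds` returns the whole of `X`.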

The best worlds are those for which there are no worlds closer to the ideal
than they are. The analysis of English must is given in (63). must takes as
arguments a modal base, an ordering source and a proposition, and asserts
that in all the best worlds in the modal base, as defined by the ordering
source, the proposition is true.27,28

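(63) $\llbracket \textit{must} \rrbracket^{c,w} = \lambda h_{\langle s,\langle st,t\rangle\rangle} . \lambda g_{\langle s,\langle st,t\rangle\rangle} . \lambda q_{\langle s,t\rangle} . \forall w' \in \max_{g(w)}(\cap h(w)) : q(w') = 1$
(von Fintel & Heim 2007: 55)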
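
A small executable rendering of (63) may help make the division of labour between the modal base and the ordering source concrete. It is a sketch under assumptions: the universe of worlds is finite, conversational backgrounds are functions from a world to a list of propositions, and the toy model for (58) (the world names and the two propositions) is invented for illustration.

```python
from typing import Callable, FrozenSet, List, Set

World = str
Prop = FrozenSet[World]                     # set of worlds where the proposition is true
Background = Callable[[World], List[Prop]]  # conversational background: world -> propositions

# Hypothetical toy universe for 'Chris must do his homework'.
WORLDS: Set[World] = {"school_hw", "school_no_hw", "no_school"}


def best_worlds(base: Set[World], source: List[Prop]) -> Set[World]:
    """The worlds in base that no other world in base strictly outranks."""
    def sat(w: World) -> Set[Prop]:
        return {p for p in source if w in p}
    return {w for w in base if not any(sat(w) < sat(v) for v in base)}


def must(h: Background, g: Background, q: Prop, w: World) -> bool:
    """Sketch of (63): q is true in every best world of the modal base."""
    base = set(WORLDS).intersection(*h(w))  # the intersection of the propositions in h(w)
    return all(v in q for v in best_worlds(base, g(w)))


in_school = frozenset({"school_hw", "school_no_hw"})
does_homework = frozenset({"school_hw"})

h: Background = lambda w: [in_school]      # circumstantial: Chris is in school
g: Background = lambda w: [does_homework]  # normative: the homework regulations

print(must(h, g, does_homework, "school_no_hw"))   # True: every best world is a homework world
print(must(h, lambda w: [], does_homework, "school_no_hw"))  # False with an empty ordering source
```

The second call shows why the ordering source matters: without it, every world in the modal base counts as best, and the universal claim fails.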

The analysis of St’át’imcets normative ka is given in (64). ka takes as
arguments a modal base, an ordering source, and a proposition. $f_c$ represents
the contextually given choice function.
27 Nothing crucial hinges on having the conversational backgrounds present in the syntax (as
in von Fintel & Heim 2007) rather than being parameters of interpretation (as in Portner
1997). However, the syntactic version may have a potential advantage in enforcing the
required anaphoricity of the conversational backgrounds once we bring in the subjunctive.
In Rullmann et al.’s (2008) analysis of St’át’imcets modals, the choice function is also a
syntactic argument of the modal. Following the suggestion of an anonymous reviewer, I have
changed this here, but again, nothing crucial hinges on the decision.
28 As an anonymous reviewer reminded me, English must also encodes restrictions on its
modal base and ordering source, parallel to (but obviously different from) those defined for
ka in (64). See for example von Fintel & Gillies 2010 and Matthewson 2010, to appear for

(64) $\llbracket \textit{ka}(h)(g) \rrbracket^{c,w}$ is only defined if $h$ is a circumstantial modal base and
$g$ is a normative ordering source.
If defined, $\llbracket \textit{ka}(h)(g) \rrbracket^{c,w} = \lambda q_{\langle s,t\rangle} . \forall w' \in f_c(\max_{g(w)}(\cap h(w))) : q(w') = 1$
(adapted from Rullmann et al. 2008: 340)
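
The effect of the choice function in (64) can be illustrated with a toy model of the scenario in (59). This is a sketch, not Rullmann et al.'s implementation: the `Background` class with its `kind` tag is a hypothetical way of checking the presuppositions on $h$ and $g$, undefinedness is modelled as `None`, and the three worlds are invented. The same proposition comes out false under the identity choice function and true under a more restrictive one.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Optional, Set

World = str
Prop = FrozenSet[World]  # a proposition is the set of worlds where it is true
Choice = Callable[[Set[World]], Set[World]]


@dataclass(frozen=True)
class Background:
    """A conversational background tagged with its type (the tag is an assumption of this sketch)."""
    kind: str                          # e.g. "circumstantial", "normative", "epistemic"
    at: Callable[[World], List[Prop]]  # the propositions it supplies at a world


WORLDS: Set[World] = {"child_sleeps", "spouse_minds_child", "nobody_rests"}


def best_worlds(base: Set[World], source: List[Prop]) -> Set[World]:
    def sat(w: World) -> Set[Prop]:
        return {p for p in source if w in p}
    return {w for w in base if not any(sat(w) < sat(v) for v in base)}


def ka(h: Background, g: Background, choice: Choice, q: Prop, w: World) -> Optional[bool]:
    """Sketch of (64): None if the presuppositions on h and g fail, else a truth value."""
    if h.kind != "circumstantial" or g.kind != "normative":
        return None
    base = set(WORLDS).intersection(*h.at(w))
    selected = choice(best_worlds(base, g.at(w)))  # f_c applied to the best worlds
    return all(v in q for v in selected)


facts = Background("circumstantial", lambda w: [frozenset(WORLDS)])
i_rest = frozenset({"child_sleeps", "spouse_minds_child"})  # my early night is possible
early_night = Background("normative", lambda w: [i_rest])
child_sleeps = frozenset({"child_sleeps"})

identity: Choice = lambda best: best
spouse_rests_too: Choice = lambda best: best & {"child_sleeps"}

strong = ka(facts, early_night, identity, child_sleeps, "nobody_rests")
weak = ka(facts, early_night, spouse_rests_too, child_sleeps, "nobody_rests")
print(strong, weak)  # False True
```

The larger the set returned by the choice function, the more worlds the universal quantifier has to cover, and so the stronger the resulting claim.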

Now for the subjunctive. As shown in (65), the subjunctive does not affect
truth conditions but merely enforces a weaker-than-necessity reading of a
modal in the environment. The subjunctive does not itself introduce any
conversational backgrounds; h and g in (65) are free variables. I assume
that this enforces anaphoricity: the mood must be c-commanded by a modal
which introduces h and g.29

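(65) $\llbracket \textit{sbjn}(\varphi) \rrbracket^{c,w}$ is only defined if $\exists w' \in \max_{g(w)}(\cap h(w)) [\varphi(w') = 0]$.

When defined, $\llbracket \textit{sbjn}(\varphi) \rrbracket^{c,w} = \lambda w' . \llbracket (\varphi) \rrbracket^{c,w}$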

According to (65), the subjunctive is only defined if there is at least one
world $w'$ in the set of best worlds in the modal base, as defined by the
ordering source, such that $\varphi$ is false in $w'$. The analysis is applied to a
normative subjunctive case in (66).

(66) guy’t=ás=ka ti=sk’úk’wm’it=a

sleep=3sbjn=deon det=child=exis
‘I hope the child sleeps.’

$\llbracket \textit{ka}(h)(g)(\textit{as}(\text{guy’t ti sk’úk’wm’ita})) \rrbracket^{c,w}$ is only defined if

i. $h$ is a circumstantial modal base and $g$ is a normative ordering source

ii. $\exists w' \in \max_{g(w)}(\cap h(w))$ [the child doesn’t sleep in $w'$]

When defined, $\llbracket \textit{ka}(h)(g)(\textit{as}(\text{guy’t ti sk’úk’wm’ita})) \rrbracket^{c,w} = 1$ iff

$\forall w' \in f_c(\max_{g(w)}(\cap h(w)))$ [the child sleeps in $w'$]

29 Thanks to an anonymous reviewer for pointing out an inconsistency in an earlier version of

As above, $\max_{g(w)}(\cap h(w))$ picks out the best worlds in the modal base,
as defined by the normative ordering source. The contextually determined
choice function $f_c$ picks out a subset of $\max_{g(w)}(\cap h(w))$, and the modal
universally quantifies over the set picked out by the choice function. Because
the subjunctive mood presupposes that there is at least one world
in $\max_{g(w)}(\cap h(w))$ in which the proposition is false, the choice function
must pick out a proper subset of the worlds provided by the modal base
and ordering source. This forces a weaker-than-universal reading. We in
fact predict gradient readings with the subjunctive — anything from pure
possibility to weak necessity. This seems to fit with the facts about when the
subjunctive is felicitous.
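
The interaction between the presupposition in (65) and the choice function can be checked on a toy model of (66). The sketch below makes several assumptions: worlds and propositions are finite sets, presupposition failure is modelled as `None`, the restrictions that ka places on $h$ and $g$ are left out for brevity, and the three worlds are invented. It shows that whenever the subjunctive is defined, the identity choice function yields a false sentence, so a true use requires a proper subset of the best worlds.

```python
from typing import Callable, FrozenSet, List, Optional, Set

World = str
Prop = FrozenSet[World]
Background = Callable[[World], List[Prop]]
Choice = Callable[[Set[World]], Set[World]]

WORLDS: Set[World] = {"child_sleeps", "spouse_minds_child", "nobody_rests"}


def best_worlds(base: Set[World], source: List[Prop]) -> Set[World]:
    def sat(w: World) -> Set[Prop]:
        return {p for p in source if w in p}
    return {w for w in base if not any(sat(w) < sat(v) for v in base)}


def best_in_context(h: Background, g: Background, w: World) -> Set[World]:
    """The best worlds of the modal base h(w) under the ordering source g(w)."""
    return best_worlds(set(WORLDS).intersection(*h(w)), g(w))


def sbjn_defined(h: Background, g: Background, phi: Prop, w: World) -> bool:
    """Sketch of (65): some best world is a world in which phi is false."""
    return any(v not in phi for v in best_in_context(h, g, w))


def ka_sbjn(h: Background, g: Background, choice: Choice, phi: Prop, w: World) -> Optional[bool]:
    """Modal plus subjunctive, as in (66); None models presupposition failure."""
    if not sbjn_defined(h, g, phi, w):
        return None
    return all(v in phi for v in choice(best_in_context(h, g, w)))


h: Background = lambda w: [frozenset(WORLDS)]
g: Background = lambda w: [frozenset({"child_sleeps", "spouse_minds_child"})]  # I get an early night
strict_g: Background = lambda w: [frozenset({"child_sleeps"})]
child_sleeps = frozenset({"child_sleeps"})

identity: Choice = lambda best: best
spouse_rests_too: Choice = lambda best: best & {"child_sleeps"}

with_identity = ka_sbjn(h, g, identity, child_sleeps, "nobody_rests")
with_proper_subset = ka_sbjn(h, g, spouse_rests_too, child_sleeps, "nobody_rests")
undefined_case = ka_sbjn(h, strict_g, identity, child_sleeps, "nobody_rests")
print(with_identity, with_proper_subset, undefined_case)  # False True None
```

With `strict_g`, every best world is a world where the child sleeps, so the presupposition fails and the subjunctive has no denotation. This is the strong-necessity situation that the subjunctive excludes.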
I have so far been simply following Portner (1997) in modeling the mood
restriction as a presupposition, rather than as ordinary asserted content, or
some other kind of inference. The question arises of whether there is any
St’át’imcets-internal justification for the assumption that presupposition is
If the subjunctive contributed ordinary asserted content, we would predict
that it would fail to project through presupposition holes such as negation or
conditionals, and that it could be directly affirmed or denied by the hearer.
The issue of projection through presupposition holes is not testable for
most of the relevant constructions in St’át’imcets. For example, negation in
St’át’imcets is a predicate which embeds an obligatorily nominalized (i.e.,
indicative) subordinate clause. When a subjunctive clitic does co-occur with
negation, it attaches to the negation itself, as shown in (67). Thus, while (67) is
not interpretable in a way which would show that the subjunctive contributed
asserted content, the results are not conclusive because the subjunctive is
probably not scoping under negation syntactically.

(67) cw7aoz=as=ká=t’u7 kw=s=nas=ts

neg=3sbjn=deon=prt det=nom=go=3poss
‘I wish he wouldn’t go.’ (van Eijk 1997: 214)

≠ ‘It is not the case that [in at least one of the best worlds in the
modal base, he doesn’t go, and in all of the set of worlds selected by
the choice function, he goes].’

i.e., ≠ ‘It is not the case that [it’s good if he goes, and I can still be
happy if he doesn’t].’

30 Thanks to David Beaver and an anonymous reviewer for asking for clarification of this issue.

Nor can we test projection through ‘if’, as ‘if’-clauses obligatorily and redundantly
select the subjunctive in St’át’imcets (see subsection 2.2). However,
questions provide evidence that the subjunctive does not contribute ordinary
asserted content. Recall that the subjunctive plus an inferential evidential
when added to a question results in a statement of uncertainty (16)–(20).
The question in (68) cannot be interpreted as if the subjunctive contributed
asserted content which scopes below the question. (See subsection 7.2 for
analysis of questions like (68).)

(68) nilh=as=há=k’a s=Lémya7 ku=kúkwpi7

foc=3sbjn=ynq=infer nom=Lémya7 det=chief
‘I think maybe Lémya7 is the chief / I wonder if Lémya7 is the chief.’

≠ ‘Is it the case that [in at least one of the best worlds compatible
with the inferential evidence, Lémya7 is not the chief, and in all of the
set of worlds selected by the choice function, Lémya7 is the chief]?’

i.e., ≠ ‘Is it the case that [Lémya7 is possibly but not necessarily the chief]?’

Further evidence that the subjunctive does not contribute ordinary as-
serted content comes from the impossibility of directly affirming or denying
its contribution. This is shown in (69), where B and B’ try to deny A’s sub-
junctive claim that in at least one world compatible with A’s knowledge and
desires, the children don’t sleep. The consultant absolutely rejects the replies
in B and B’.

(69) A guy’t=ás=ka i=sk’wemk’úk’wm’it=a

sleep=3sbjn=deon det.pl=child(pl)=exis
‘I hope the children sleep.’
B #cw7aoz kw=s=wenácw. plán=lhkacw zewát-en
neg det=nom=true already=2sg.subj know-dir
kw=s=cuz’ gúy’t=wit
det=nom=going.to sleep=3pl
‘That’s not true. You already know they will sleep.’
B’ #cw7aoz kw=s=wenácw. lh=cw7áoz=as
neg det=nom=true comp=neg=3sbjn
kw=s=gúy’t=wit i=sk’wemk’úk’wm’it=a, áoz=kelh
det=nom=sleep=3pl det.pl=child(pl)=exis neg=fut
kw=a=s áma ta=scwákwekw-sw=a
det=impf=3poss good det=heart-2sg.poss=exis
‘That’s not true. If the children don’t sleep, you won’t be happy.’

Having established that the weakening contribution of the subjunctive is
not ordinary asserted content, the question now is whether it contributes a
presupposition per se, or some other not-at-issue content, such as a Potts
(2005)-style conventional implicature. One major empirical difference between
a traditional understanding of presuppositions (e.g., Stalnaker 1974)
and conventional implicatures is that only the former impose constraints on
the state of the common ground. Conventional implicatures, in contrast, stan-
dardly contribute information which is new to the hearer (Potts 2005). I have
argued elsewhere (Matthewson 2006, 2008b) that St’át’imcets entirely lacks
presuppositions of the common ground type; all not-at-issue content in this
language is treated as potentially new to the hearer.31 In those earlier works I
argued that the St’át’imcets facts necessitate an alternative analysis of pre-
supposition (for example that of Gauker 1998). However, another way to look
at things is to say that out of the class of not-at-issue meanings, St’át’imcets
lacks one sub-type, namely common ground presuppositions. What I have
modeled as a presupposition of the St’át’imcets subjunctive would then be
some other kind of not-at-issue content, perhaps a conventional implicature.
However, these issues go beyond the scope of the present paper and do not
affect the main points being made here, so with these caveats I will continue
to model the subjunctive as introducing a presupposition.
Before turning to more complex constructions involving the subjunc-
tive, it is interesting to consider the similarity between the analysis of the
St’át’imcets subjunctive provided here and von Fintel & Iatridou’s (2008) ideas
about weak necessity modals. von Fintel and Iatridou are concerned with
the difference in quantificational strength between ought and have to/must.
In (70), we see that the restriction on employees is stronger than that on
everyone else.

(70) After using the bathroom, everybody ought to wash their hands;
employees have to.
(von Fintel & Iatridou 2008: 116)

(71) also illustrates the contrast between the different modal strengths.
In (71a), taking Route 2 is the only option, if you want to get to Ashfield: all
the worlds in which you get to Ashfield are Route 2-worlds. In (71b), there
are other getting-to-Ashfield worlds apart from only Route 2-worlds. But the
Route-2 worlds are the best, taking into consideration some other factors
(such as a scenic route).
31 For example, attempts to elicit ‘Hey, wait a minute!’ responses to presupposition failures for
a wide range of standard presupposition triggers have all failed (Matthewson 2006, 2008b).
We are therefore unable to decide the presupposition issue for the subjunctive by using the
‘Hey, wait a minute!’ test (as was suggested by an anonymous reviewer).

Lisa Matthewson

(71) a. To go to Ashfield, you have to / must take Route 2.

b. To go to Ashfield, you ought to take Route 2.
(von Fintel & Iatridou 2008: 118)

von Fintel and Iatridou argue that ought is a weak necessity modal, and
that weak necessity modals signal the existence of a secondary ordering
source. This is illustrated informally in (72)–(73). (72) contains a strong
necessity modal, and gives a strong reading, as usual. In (73), a secondary
ordering source further restricts the set of worlds which are universally
quantified over, leading to a weaker reading.

(72) To go to Ashfield, you have to / must take Route 2.

Modal base: Restricts worlds considered to those in which the same
facts about roads hold as in the actual world.
Ordering source: Orders worlds in the modal base so that the best
worlds are those in which you attain your goal of getting to Ashfield.
Universal quantification: In all the best worlds, you take Route 2.

(73) To go to Ashfield, you ought to take Route 2.

Modal base: Restricts worlds considered to those in which the same
facts about roads hold as in the actual world.
Ordering source 1: Orders worlds in the modal base so that the best
worlds are those in which you attain your goal of getting to Ashfield.
Ordering source 2: Further orders the best worlds picked out by
ordering source 1, so that the very best worlds are those in which
you not only attain your goal of getting to Ashfield, but also attain an
additional goal of going via a scenic route.
Universal quantification: In all the very best worlds, you take Route 2.
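
The informal contrast between (72) and (73) can be sketched computationally. The following Python fragment is only an illustration under stated assumptions: the three toy worlds, the `World` record, and the helper names `best`, `strong_necessity` and `weak_necessity` are hypothetical and are not part of von Fintel and Iatridou's proposal, which gives no formal definition of a modal with a double ordering source.

```python
from collections import namedtuple

# Toy worlds (hypothetical): the route taken, whether Ashfield is reached,
# and whether the trip is scenic.
World = namedtuple("World", ["route", "arrives", "scenic"])


def best(worlds, ideals):
    """Return the worlds that are not strictly outranked.

    A world v outranks w iff v satisfies a proper superset of the ideals
    that w satisfies (a Kratzer-style ordering, assumed for illustration).
    """
    def satisfied(w):
        return frozenset(i for i, ideal in enumerate(ideals) if ideal(w))

    return {w for w in worlds
            if not any(satisfied(w) < satisfied(v) for v in worlds)}


def strong_necessity(modal_base, ordering, prejacent):
    """(72): the prejacent holds in all the best worlds."""
    return all(prejacent(w) for w in best(modal_base, ordering))


def weak_necessity(modal_base, ordering1, ordering2, prejacent):
    """(73): a second ordering source narrows the best worlds further."""
    very_best = best(best(modal_base, ordering1), ordering2)
    return all(prejacent(w) for w in very_best)


# Modal base: the road facts allow three kinds of trip.
MODAL_BASE = {
    World(route=2, arrives=True, scenic=True),
    World(route=9, arrives=True, scenic=False),
    World(route=5, arrives=False, scenic=True),
}
reach_ashfield = [lambda w: w.arrives]   # ordering source 1
scenic_route = [lambda w: w.scenic]      # ordering source 2


def take_route_2(w):
    return w.route == 2


HAVE_TO = strong_necessity(MODAL_BASE, reach_ashfield, take_route_2)
OUGHT_TO = weak_necessity(MODAL_BASE, reach_ashfield, scenic_route, take_route_2)
```

In this toy model `HAVE_TO` comes out false, since Route 9 also reaches Ashfield, while `OUGHT_TO` comes out true, since Route 2 is the only scenic way of getting there.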

As von Fintel & Iatridou (2008: 137) put it: ‘The idea is that saying that
to go to Ashfield you ought to take Route 2, because it’s the most scenic
way, is the same as saying that to go to Ashfield in the most scenic way,
you have to take Route 2.’ This is very parallel in spirit to Rullmann et al.’s
(2008) analysis of St’át’imcets modals, where a weak reading is obtained by
a universal quantifier with a restriction provided by a choice function. And
just like Rullmann et al.’s analysis, von Fintel and Iatridou’s actually predicts
gradience: how ‘weak’ a weak necessity modal is can vary, depending on

Cross-linguistic variation in modality systems: The role of mood

which secondary ordering source you pick. In fact, given that the motivation
for using a choice function rather than an ordering source was unconvincing
anyway (cf. Kratzer 2009, Peterson 2009, 2010, and Portner 2009), the
Rullmann et al.-style analysis is better implemented using a double ordering
source, exactly as in von Fintel & Iatridou 2008.32
So what is the difference between English and St’át’imcets? Simply that in
English, we lexically encode the weak necessity (ought vs. have to/must). In
St’át’imcets, no differences in modal force are lexically encoded by modals,
but what English modals do, St’át’imcets does via mood. Another way of
describing the analysis offered here would be to say that the St’át’imcets
subjunctive enforces weak necessity (via domain restriction): it forces there
to be two (non-vacuous) restrictions on the set of worlds in the modal base.
While further cross-linguistic investigation goes beyond the scope of this
paper, it is worth pointing out a connection to another intriguing observation
of von Fintel and Iatridou’s, namely that in many languages, weak necessity
modals are created transparently from a strong necessity modal plus coun-
terfactual morphology. This is illustrated in (74) for French, where the modal
appears in the conditional mood, the one which occurs in counterfactual
conditionals.

(74) tout le monde devrait se laver les mains mais les serveurs
everybody must/cond refl wash the hands but the waiters
sont obligés
are obliged
‘Everybody ought to wash their hands but the waiters have to.’
(von Fintel & Iatridou 2008: 121)

This is very reminiscent of St’át’imcets, where a modal which introduces
universal quantification gives rise to weak necessity interpretations in the
presence of the subjunctive. In St’át’imcets, I have analyzed the weakening
effect as the sole contribution of the subjunctive mood. Of course, ‘counter-
factual’ and ‘subjunctive’ are not the same thing, and I am not in a position
to claim that the current analysis of the subjunctive can extend to counter-
factual morphology in the languages discussed by von Fintel and Iatridou.
However, the present analysis at the very least supports von Fintel and Iatri-
dou’s cross-linguistic generalization that mood morphology can derive weak
32 Like von Fintel and Iatridou, I omit a formal definition of a modal with a double ordering
source; see von Fintel & Iatridou 2008: 138 for some suggestions on how to do this.


necessity interpretations, and may offer a potential new avenue for looking
at languages like French.

7 Applying the analysis to other subjunctive constructions

In the previous section I presented an analysis of the St’át’imcets subjunctive
and applied it to cases involving a normative modal. In this section I aim
to establish that the analysis of the subjunctive as restricting the conver-
sational background of a co-occurring modal can extend to the other uses
of the subjunctive. I deal in turn with imperatives (subsection 7.1), ques-
tions (subsection 7.2), ignorance free relatives (subsection 7.3), the ‘pretend’
cases (subsection 7.4), and finally I return to the fact that in St’át’imcets, the
subjunctive is not licensed by any attitude verbs (subsection 7.5).

7.1 Imperatives

Recall that the subjunctive, when added to an imperative, makes the com-
mand more polite. An example is repeated here:

(75) a. lts7á=malh lh=kits-in’=ál’ap!
deic=adhort comp=put.down-dir=2pl.sbjn
‘Just put it over here!’
b. lts7á=has=malh lh=kits-in’=ál’ap
deic=3sbjn=adhort comp=put.down-dir=2pl.sbjn
‘Could you put it down here?’/‘You may as well put it down over here.’

The easiest way to analyze the imperatives would be as sub-cases of the
deontic cases already analyzed above. We could say that the imperative
introduces a deontic necessity modal, and the subjunctive weakens the
proposition expressed. That is what I will in fact say, adopting Schwager’s
(2005, 2006) analysis of imperatives.
Schwager (2005, 2006) claims that imperatives introduce a modal opera-
tor, which is a more restricted version of a deontic necessity modal.33 Nor-
mally, the imperative modal expresses necessity, with the Common Ground
33 See Han 1997, 1999 for an earlier proposal of a similar idea. Han’s modal analysis shares
many of the advantages for St’át’imcets of Schwager’s approach. However, since Han models
the modal claim of the imperative as a presupposition rather than part of the assertion,
extra assumptions would be required to apply it to St’át’imcets subjunctive imperatives.


serving as the modal base, and a contextually given set of preferences giv-
ing the ordering source. In addition, imperatives carry presuppositions, as
shown in (76). The presuppositions restrict an imperative to situations where
a performative use of a deontic modal would be possible, namely those in
which the speaker is an authority on the matter.34

(76) Presuppositions of an imperative:

1. The speaker is an authority on the parameters. [modal base and
ordering source]
2. The ordering source is preference-related.35
3. The speaker affirms the ordering source as a good maxim for
acting in the given scenario. (Schwager 2006: 248–249)

A simple case is illustrated in (77).

(77) Get up!

Modal base: What the speaker and hearer jointly take to be possible
Ordering source: The speaker’s commands

(77) is true iff all worlds in the Common Ground that make true as much as
possible of what the speaker commands at the world and time of utterance
make it true that the addressee gets up within a certain event frame t
(Schwager 2005: chapter 6). The difference between (77) and the plain modal
statement ‘You must get up’ is that with the imperative, the speaker is
presupposed to be an authority. This has the consequence that whenever an
imperative is defined, it is necessarily true.
Adopting Schwager’s analysis enables us to treat the St’át’imcets sub-
junctive imperatives the same way we treated the weakened normative ka-
statements above. We have to assume that the deontic modal in a St’át’imcets
imperative is, like the overt ka, a universal modal which introduces a choice
function or secondary ordering source. While a normal imperative roughly
says that in all the best worlds (the worlds where you obey my commands),
34 The descriptive vs. performative use of a deontic modal is shown in (i), from Schwager
(2008: 26).

(i) a. Peter may come tomorrow. (The hostess said it was no problem.) descriptive
b. Okay, you may come at 11. (Are you content now?) performative

35 The preferences may relate to the addressee’s wishes, as in the case of advice or suggestions.

you do P, a subjunctive imperative presupposes that at least one world in
which you obey my commands is a world in which you do not do P. This
predicts that a weakened imperative means that in the very best worlds,
you do P, but there are other ways to satisfy me. The requirement on the
addressee becomes weaker, just as the requirement on the child to sleep
becomes weaker in the examples discussed above.
An advantage of Schwager’s analysis for St’át’imcets is that it makes the
correct predictions for ‘permission imperatives’ like ‘Have a cookie!’ These
do not perform a speech act of ordering, but rather of invitation. It might be
natural to think that permission imperatives involve a possibility modal, but
Schwager argues that imperatives always introduce a necessity operator. For
Schwager, the permission effect arises due to the contextual parameters; this
is shown in (78).

(78) Take an apple if you like!

Given what we know the world to be like and given what you want, it
is necessary that you take an apple. (cf. Schwager 2008: 49)

Under Schwager’s analysis, then, the difference between an order and an
invitation consists not in a difference in quantificational force, but in ordering
source. This correctly predicts that in St’át’imcets, permission imperatives
do not have to take the subjunctive:36,37

(79) Context: Your friend comes over and is visiting with you. You hear
her stomach rumbling. You give her a plate and say ‘Have some cake!’
a. wá7=malh kiks-tsín-em
be=adhort cake-eat-mid
‘Have some cake!’
b. #wá7=acw=malh kiks-tsín-em
be=2sg.sbjn=adhort cake-eat-mid
‘You may as well have some cake.’
36 (79b) is marked as infelicitous in this context, which is how the consultant judges it. (80b)
appears to be ungrammatical. The difference possibly relates to the presence in (79b) of the
adhortative particle malh, an interesting element whose analysis must await future research.
37 An anonymous reviewer points out that permission imperatives should be able to take the
subjunctive in certain circumstances, meaning something like ‘the very best way to achieve
your desires is p, though there are other ways’. Future research is required to see whether
this prediction is upheld once the right discourse contexts are provided.


(80) Context: You are at a gathering and they are almost running out of
food. You take the last piece of fish and then you see an elder is
behind you and is looking disappointed and has no fish on her plate.
You say ‘Take mine!’
a. kwan ts7a ti=n-tsúw7=a
take(dir) deic det=1sg.poss-own=exis
‘Take mine!’
b. *kwán=acw ts7a ti=n-tsúw7=a
take(dir)=2sg.sbjn deic det=1sg.poss-own=exis
intended: ‘Take mine!’

We have seen that an analysis of imperatives as containing a concealed
necessity modal works for St’át’imcets. In the remainder of this section I
briefly discuss the alternative analysis of Portner (2004, 2007).
Portner’s (2004, 2007) analysis of imperatives relies on the notion of a
‘To-Do List’. The idea is that each participant in a conversation has a To-Do
List, a set of properties which they are committed to satisfying. The To-Do
list Function (which maps each participant to their own To-Do List) is a
component of the Discourse Context (along with the Common Ground and
the Question Set). An imperative, as in (81), denotes a property whose subject
is the addressee. This causes the property to be added to the addressee’s
To-Do List.

(81) $[\![\text{Leave!}]\!]^{w^*,c} = [\lambda w \lambda x : x = \text{addressee}(c) \,.\, x \text{ leaves in } w]$

As in Schwager’s analysis, ‘permission’ imperatives are dealt
with in Portner’s analysis by the counterpart of the ordering source, namely
different sub-sets of the To-Do List. The To-Do List is divided into deontic,
bouletic and teleological sub-parts, corresponding to orders, invitations, and
suggestions respectively. The addressee can therefore keep track of actions
she is supposed to take to satisfy someone’s orders, her own wishes, or her
own goals.
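
The bookkeeping behind the To-Do List idea can be sketched as a small data structure. This is a minimal illustration only: the class `DiscourseContext`, the function `utter_imperative`, and the use of plain strings to stand in for properties are assumptions made here for exposition, not part of Portner's formal system.

```python
from dataclasses import dataclass, field

# The three sub-parts of a To-Do List mentioned in the text.
FLAVORS = ("deontic", "bouletic", "teleological")


@dataclass
class DiscourseContext:
    """Hypothetical container for the components of the Discourse Context."""
    common_ground: set = field(default_factory=set)
    question_set: list = field(default_factory=list)
    # To-Do List function: participant -> {flavor -> set of properties}
    to_do: dict = field(default_factory=dict)

    def to_do_list(self, participant):
        """Return (creating it if necessary) the participant's To-Do List."""
        return self.to_do.setdefault(participant, {f: set() for f in FLAVORS})


def utter_imperative(ctx, addressee, prop, flavor="deontic"):
    """Add the property denoted by the imperative to the addressee's list."""
    if flavor not in FLAVORS:
        raise ValueError(f"unknown To-Do List sub-part: {flavor}")
    ctx.to_do_list(addressee)[flavor].add(prop)
    return ctx


ctx = DiscourseContext()
utter_imperative(ctx, "hearer", "leave")                              # an order
utter_imperative(ctx, "hearer", "have some cake", flavor="bouletic")  # an invitation
```

On this sketch, an order and an invitation differ only in which sub-part of the addressee's To-Do List is updated, which mirrors the role of the ordering source in Schwager's account.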
An important feature of this analysis is that under the To-Do List ap-
proach, imperatives do not contain modal operators. While for Portner,
imperatives and root modals are closely linked — for example, the successful
utterance of an imperative leads to the truth of a corresponding sentence
containing a root modal — imperatives do not themselves contain modals.38
38 See Portner 2007: 363ff for arguments against Han’s (1999) and Schwager’s (2005, 2006)
analysis of imperatives as containing concealed modals.

My analysis of the St’át’imcets subjunctive, however, seems to require the
presence of a modal, whose force is functionally weakened via a restriction
on the conversational background. A unified analysis of the St’át’imcets
subjunctive across all its uses would therefore seem to require a modal in
the imperative.
However, as pointed out by an anonymous reviewer, Portner’s analysis of
imperatives will work for St’át’imcets. The lexical entry for the subjunctive
given above in (65) does not literally require the presence of a governing
modal; it merely requires the presence of contextually available conversa-
tional backgrounds. These are provided within Portner’s analysis, given that
the Common Ground corresponds to (at least a subset of) a circumstantial
modal base, while a To-Do List corresponds to (at least a subset of) a deon-
tic, bouletic or teleological ordering source. To apply Portner’s analysis to
St’át’imcets, we only need to assume that the imperative morpheme can take
the Common Ground plus two To-Do Lists as arguments. The subjunctive
will presuppose that there is a world among the best worlds in the Common
Ground, according to To-Do List 1, in which the imperative is not satisfied.
Assuming that the second To-Do List is ‘more ignorable’ than the first (cf.
also von Fintel and Iatridou 2008 on the primacy of the first ordering source),
then a hearer can decide to be bound either by both To-Do Lists, or only by
the first. If the speaker has set up her own desires as the secondary To-Do
List, we obtain the politeness reading typical of a St’át’imcets subjunctive
imperative.
In summary, we have seen that our analysis of the St’át’imcets subjunctive
extends to the weakened imperatives, as long as we assume that imperatives
are concealed normative modal statements, or at least provide the same
conversational backgrounds as a normative modal. This idea can be
implemented within the approach of either Schwager (2005, 2006, 2008) or
Portner (2004, 2007).

7.2 Questions

The subjunctive appears, in combination with an evidential or future modal,
in both yes-no and wh questions in St’át’imcets, in each case turning the
question into a statement of uncertainty. Some examples are repeated here.
Following Littell, Matthewson & Peterson (2009), I use the term ‘conjectural
question’ for this construction.

(82) a. lán=ha kwán-ens-as
already=ynq take-dir-3.erg
‘Has she already got my letter?’
b. lan=as=há=k’a kwán-ens-as
already=3.sbjn=ynq=infer take-dir-3.erg
‘I wonder if she’s already got my letter.’ / ‘I don’t know if she got
my letter or not.’

(83) a. nká7=kelh lh=cúz’=acw nas
where=fut comp=going.to=2sg.sbjn go
‘Where will you go?’
b. nká7=as=kelh lh=cúz’=acw nas
where=3sbjn=fut comp=going.to=2sg.sbjn go
‘Wherever will you go?’ / ‘I wonder where you are going to go
now.’ (adapted from Davis 2006: chapter 24)

Previous discussion of conjectural questions in Salish includes Matthewson
2008a, Littell et al. 2009 and Littell 2009.39 The analysis given here will
essentially be that of Littell (2009), with the addition of an account of the
role of the subjunctive (which Littell does not discuss), and an extension to
cases where the subjunctive in a conjectural question is licensed by a future
modal, rather than an evidential.
The paradigms in (84) and (85) illustrate the distributional facts for
conjectural questions which contain an evidential (as opposed to a future
modal). We see that the evidential is obligatory (the (b) examples), but
the subjunctive, while strongly preferred, is not quite obligatory (the (c)
examples).40

(84) a. t’íq=Ø=ha k=Bill
arrive=indic=ynq det=Bill
‘Did Bill arrive?’ indic
39 Littell et al. (2009) investigate conjectural questions in three languages: St’át’imcets,
Nɬeʔkepmxcín (Thompson Salish) and Gitksan, while Littell (2009) focuses mainly on
40 While subjunctive evidential questions (as in (84d), (85d)) are obligatorily interpreted as
statements of uncertainty rather than questions, indicative evidential questions (as in (84c),
(85c)) can optionally be interpreted as ordinary questions. I return to this below.

b. *t’íq=as=ha k=Bill
arrive=3sbjn=ynq det=Bill sbjn
c. ?t’íq=ha=k’a k=Bill
arrive=ynq=infer det=Bill
‘I wonder if Bill arrived.’ evid + indic
d. t’iq=as=há=k’a k=Bill
arrive=3sbjn=ynq=infer det=Bill
‘I wonder if Bill arrived.’ evid + sbjn

(85) a. ínwat=wit
‘What did they say?’ indic
b. *inwat=wít=as
say.what=3pl=3sbjn sbjn
c. ??inwat=wít=k’a
‘I wonder what they said.’ evid + indic
d. inwat=wít=as=k’a
‘I wonder what they said.’ evid + sbjn

As argued in the above-mentioned references, conjectural questions have
the syntax and the semantics of a question, but the pragmatics of an
assertion (as they do not require an answer in discourse). With respect to
syntax, conjectural questions clearly pattern with ordinary questions. Littell
et al. (2009) point out that not only do conjectural questions contain the
normal yes-no question particle or sentence-initial wh-phrase plus extraction
morphology, they embed under the same predicates as ordinary questions
do. This is shown in (86).

(86) aoz kw=s=zwát-en-as k=Lisa
neg det=nom=know-dir-3erg det=Lisa
lh=wa7=as=há=k’a áma-s-as k=Rose ku=tíh
comp=impf=3sbjn=ynq=infer good-caus-3erg det=Rose det=tea
‘Lisa doesn’t know whether Rose likes tea.’

The ability to embed under question-taking predicates is prima facie
evidence that conjectural questions have the same semantic type as ordinary
questions.
Pragmatically, however, conjectural questions do not behave like ordinary
questions, because conjectural questions do not require an answer from the
addressee. In fact, conjectural questions are infelicitous in any situation
where the hearer can be assumed to know the answer. This is illustrated in (87).

(87) a. ??lan=acw=há=k’a q’a7
already=2sg.sbjn=ynq=infer eat
‘I wonder if you’ve already eaten.’
b. Context: You see your friend wearing a watch and you say:
‘Would you know what the time was?’
Consultant’s comment: “You wouldn’t have seen the watch if
you say this.”

Nor are conjectural questions a type of rhetorical question. Han (2002)
argues that rhetorical questions have the force of a negative assertion, as in (88).

(88) Did I tell you it would be easy? ≈ I didn’t tell you it would be easy.

But this is not the meaning we get in St’át’imcets for conjectural questions.
In order to express a true rhetorical question, St’át’imcets speakers use
something which is string-identical to an ordinary question, just as in English.
This is illustrated in (89)–(90). (90b) shows that adding a subjunctive plus an
evidential to a rhetorical question results in rejection of the utterance.

(89) Context: Your daughter is complaining that learning how to cut fish
is hard. You say:
a. tsun-tsi=lhkán=ha k=wa=s lil’q
say(dir)-2sg.obj=1sg.indic=ynq det=impf=3poss easy
‘Did I tell you it would be easy?’
41 See Rocci 2007: 147 for the same claim for an Italian construction with similar semantics to
St’át’imcets conjectural questions.

b. swat ku=tsút k=wa=s lil’q
who det=say det=impf=3poss easy
‘Who said it would be easy?’

(90) Context: You are at the PNE (a fair) and there is this very scary ride
which looks really dangerous. Your friend asks you if you are going
to go on it. You say:
a. tsut-anwas=kácw=ha kw=en=klíisi
say-inside=2sg.indic=ynq det=1sg.poss=crazy
‘Do you think I’m crazy?’
b. *tsut-anwas=ácw=ha=k’a kw=en=klíisi
say-inside=2sg.sbjn=ynq=infer det=1sg.poss=crazy
‘Do you think I’m crazy?’

The status of speaker and addressee knowledge also differs between rhetori-
cal questions and conjectural questions. In rhetorical questions, the speaker
knows the true answer to the question, and typically assumes that the hearer
does as well (e.g., Caponigro & Sprouse 2007). Subjunctive questions are the
exact opposite: neither the speaker nor the addressee typically knows the
answer.
In the remainder of this section I will first present the analysis of
conjectural questions which contain evidentials, and then explain an interesting
difference between the evidential and the future with respect to subjunctive
questions.
First, we need an analysis of questions. I adopt a fairly standard approach,
according to which a question denotes a set of propositions, each of which
is a (partial, true or false) answer to the question (Hamblin 1973).42 This is
illustrated in (91)–(92).

(91) $[\![\text{does Hotze smoke}]\!]^{w}$ = {that Hotze smokes, that Hotze does not smoke}

(92) $[\![\text{who left me this fish}]\!]^{w}$ = {that Ryan left me this fish, that Meagan
left me this fish, that Ileana left me this fish, ...} = $\{p : \exists x\,[p = \text{that } x \text{ left me this fish}]\}$
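
As a rough illustration of the Hamblin-style denotations in (91) and (92), propositions can be modelled as sets of worlds and questions as sets of propositions. The sketch below is an assumption-laden toy: the encoding of a world as the set of people who left the fish, and the helper names `that_left`, `who_left` and `polar`, are introduced here purely for exposition.

```python
from itertools import product

PEOPLE = ("Ryan", "Meagan", "Ileana")

# Toy encoding (an assumption): a world is the set of people who left the fish.
WORLDS = frozenset(
    frozenset(p for p, left in zip(PEOPLE, bits) if left)
    for bits in product([False, True], repeat=len(PEOPLE))
)


def that_left(x):
    """The proposition 'that x left me this fish', as a set of worlds."""
    return frozenset(w for w in WORLDS if x in w)


def who_left():
    """Wh-question as in (92): {p : there is an x such that p = that x left the fish}."""
    return {that_left(x) for x in PEOPLE}


def polar(p):
    """Yes-no question as in (91): the proposition and its negation."""
    return {p, WORLDS - p}


WH_QUESTION = who_left()
YN_QUESTION = polar(that_left("Ryan"))
```

Here `WH_QUESTION` contains one proposition per person, and `YN_QUESTION` contains exactly two propositions which together cover all the worlds.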
42 As far as I am aware, this choice is not critical and a different approach to questions would
work just as well.

Next, we need an analysis for the inferential evidential k’a. I adopt
Matthewson et al.’s (2007) and Rullmann et al.’s (2008) analysis of k’a as an
epistemic modal with a presupposition about evidence source.

(93) $[\![\text{k’a}(h)(g)]\!]^{c,w}$ is only defined if $h$ is an epistemic modal base, $g$ is a
stereotypical ordering source, and for all worlds $w'$, $\cap h(w')$ is the set
of worlds in which the inferential evidence in $w$ holds.
If defined, $[\![\text{k’a}(h)(g)]\!]^{c,w} = \lambda q_{\langle s,t \rangle} .\, \forall w' \in f_c(\max_{g(w)}(\cap h(w)))\,[q(w') = 1]$
(adapted from Matthewson et al. 2007: 245)
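
The shape of the entry in (93) can be illustrated with a small executable sketch. It simplifies in ways that should be kept in mind: the definedness condition is checked only at the evaluation world rather than for all worlds, the ordering is a Kratzer-style comparison assumed for illustration, and all names (`ka`, `max_worlds`, `PresuppositionFailure`, the four toy worlds) are hypothetical.

```python
class PresuppositionFailure(Exception):
    """Raised when the (simplified) definedness condition is not met."""


def max_worlds(worlds, ordering):
    """Worlds in `worlds` not strictly outranked with respect to `ordering`.

    `ordering` is a list of propositions (sets of worlds); v outranks w iff
    v verifies a proper superset of the propositions that w verifies.
    """
    def satisfied(w):
        return frozenset(i for i, p in enumerate(ordering) if w in p)

    return {w for w in worlds
            if not any(satisfied(w) < satisfied(v) for v in worlds)}


def ka(h, g, f_c, w, evidence):
    """Sketch of (93): universal quantification over f_c(max_g(w)(∩h(w)))."""
    base = frozenset.intersection(*h(w))
    if base != evidence:
        raise PresuppositionFailure("modal base is not the inferential evidence")
    domain = f_c(max_worlds(base, g(w)))
    return lambda q: all(v in q for v in domain)


# Toy model: Bill's car is in the driveway in w1-w3; stereotypically he is
# then at home (w1, w2); he has arrived in w1, w2 and w4.
CAR_IN_DRIVEWAY = frozenset({"w1", "w2", "w3"})
STEREOTYPE = frozenset({"w1", "w2"})
BILL_ARRIVED = frozenset({"w1", "w2", "w4"})


def h(w):
    return [CAR_IN_DRIVEWAY]   # epistemic modal base


def g(w):
    return [STEREOTYPE]        # stereotypical ordering source


def identity(worlds):
    return worlds              # choice function selecting every best world


KA_BILL_ARRIVED = ka(h, g, identity, "w1", CAR_IN_DRIVEWAY)(BILL_ARRIVED)
```

With the identity choice function the modal is at its strongest; a choice function that selects a proper subset of the best worlds quantifies over fewer worlds and so yields a weaker claim.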

I assume that the evidential modal scopes under the question operator,
so that each proposition in the question denotation contains the evidential.
A conjectural question thus bears some similarity to an English question
containing a possibility modal (e.g., ‘Could Bill have (possibly) arrived?’), with
the additional factor that the evidential introduces a presupposition about
evidence source. Following Guerzoni (2003), I assume that when a question
contains a presupposition trigger, each proposition in the alternative set
carries the relevant presupposition. The question therefore denotes a set of
alternative partial propositions. This is illustrated in (94).43

(94) a. t’iq=as=há=k’a k=Bill
arrive=3sbjn=ynq=infer det=Bill
‘I wonder if Bill arrived.’
b. Alternatives introduced by (94a):
{that Bill possibly arrived [presupposing there is inferential evi-
dence that Bill arrived], that Bill possibly did not arrive [presup-
posing there is inferential evidence that Bill did not arrive]}

Notice that the evidence presuppositions of the two propositions in (94b)
conflict with each other: there is presupposed to be evidence both that Bill
did arrive, and that Bill did not arrive. As Guerzoni (2003) has shown for the
presuppositions of English even, questions whose alternative propositions
introduce different presuppositions end up presupposing the conjunction of
all the individual presuppositions. Take, for example, the question in (95).

(95) Guess who even solved Problem 2?

43 Recall that although (94a) is translated into English using wonder, the meaning of (94a) does
not include an attitude verb. The claim is that (94a) denotes a set of alternative propositions.

This question introduces ‘a set of alternative partial propositions that for
each relevant person x contains an answer asserting that x solved Problem
2 and presupposing that solving problem 2 was less likely for x than solving
any other relevant problem’ (Guerzoni 2003: 127). Guerzoni then observes
that a speaker who utters (95)

knows that for any arbitrary individual in the restrictor of who,
if the addressee answers that that individual solved the problem,
he will automatically presuppose that the problem was
difficult for that person. Moreover, if the speaker is unbiased,
she doesn’t know in advance (and has no expectations regard-
ing) which propositions will be chosen by the addressee as the
true answer to her question. Given this, it must be the case
that she is taking for granted that the problem was hard for
every arbitrary x in the restrictor of who. Since the addressee
will be able to infer this much, the question is a presupposition
failure unless this condition is indeed satisfied in the context
of the conversation (Guerzoni 2003: 128).

Applying this idea to the St’át’imcets conjectural questions, we obtain the
result that an utterance of (94a) commits the speaker to the presupposition
that there is evidence both that Bill did arrive, and that he did not. This is
illustrated in (96).

(96) Alternatives introduced by (94a):

{that Bill possibly arrived, that Bill possibly did not arrive}
Presupposition of (94a):
There is inferential evidence both that Bill arrived and that Bill did
not arrive
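
The step from (94b) to (96), conjoining the presuppositions of all the alternatives, can be made concrete as follows. This is a toy rendering under explicit assumptions: worlds are reduced to pairs recording whether there is inferential evidence for and against Bill's arrival, and `PartialProposition` and `question_presupposition` are names invented for this illustration.

```python
from dataclasses import dataclass
from itertools import product

# Toy worlds (an assumption): (evidence that Bill arrived, evidence that he did not).
WORLDS = frozenset(product([False, True], repeat=2))


@dataclass(frozen=True)
class PartialProposition:
    content: str           # informal label for the asserted content
    defined_in: frozenset  # worlds in which the presupposition is satisfied


def question_presupposition(alternatives):
    """Conjoin the presuppositions of all alternatives (set intersection)."""
    return frozenset.intersection(*(a.defined_in for a in alternatives))


ALTERNATIVES = [
    PartialProposition("that Bill possibly arrived",
                       frozenset(w for w in WORLDS if w[0])),
    PartialProposition("that Bill possibly did not arrive",
                       frozenset(w for w in WORLDS if w[1])),
]

MIXED_EVIDENCE = question_presupposition(ALTERNATIVES)
```

The only world surviving the intersection is the one with evidence in both directions, which corresponds to the mixed-evidence presupposition stated in (96).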

In previous work (Matthewson 2008a, Littell et al. 2009), I assumed that
the mixed-evidence presuppositions which result when we conjoin the
presuppositions of all the propositions in the question set could derive the
reduced interrogative force of conjectural questions. The idea was that a
speaker who utters a question while presupposing that there is mixed or
even contradictory evidence about the true answer cannot be taken to be
requiring that the hearer provide the true answer to the question. That is,
the mixed presuppositions about evidence signal that the speaker does not


believe the question is easily answerable, and this lets the hearer off the hook
with respect to providing an answer.44
However, there are various problems with this analysis, as pointed out
by Littell (2009). One is that the evidence presuppositions are not always
contradictory. For example, a conjectural question such as ‘Who likes ice
cream?’ would presuppose for each contextually salient individual x that
there is inferential evidence that x likes ice cream. But it is perfectly possible
that everyone likes ice cream, and the evidence presuppositions in this case
do not rule out the possibility that the hearer knows the true answer. A
second problem is seemingly incorrect predictions about questions which
contain other evidentials, such as reportative or direct evidentials. Littell
argues that an analysis of conjectural questions which relies on conjoined
evidence presuppositions should predict reduced interrogative force for
any evidential question — yet cross-linguistically it is overwhelmingly only
inferential or conjectural evidentials which result in reduced interrogative
force. This is certainly true of St’át’imcets, as shown in the minimal pair in (97).

(97) a. stám’=as=k’a ts7a
what=3sbjn=infer here
‘I wonder what these are.’
b. *stám’=as=ku7 ts7a
what=3sbjn=report here

For these reasons, I instead adopt and extend an analysis proposed by
Littell (2009). Two assumptions are required. First, the evidence source
44 Rocci (2007) analyzes a construction in Italian with strikingly similar semantics and prag-
matics: the che-subjunctive construction. According to Rocci, che-subjunctives, which are
formed from questions, are interpreted as statements of doubt. He argues that they involve
epistemic modality and inferential evidentiality, and induce the following presuppositions:

(i) p is not in the Common Ground and ¬p is not in the Common Ground

(ii) There is no sign that either Speaker or Hearer knows whether p or ¬p

(iii) There is some set of facts E in CG, such that E is non-conclusive evidence in favor
of p

These are very similar to the effects of the St’át’imcets conjectural questions. However, Rocci
does not give a compositional analysis, perhaps partly because the che-subjunctives have no
overt evidentials or epistemic modals in the structure.
45 Cheyenne is an exception; reportatives in questions in Cheyenne allow non-interrogative
readings under certain circumstances (Murray to appear).

requirement of an evidential in a question can or must undergo ‘interrogative
flip’ (or ‘origo shift’; Garrett 2001, Faller 2002, 2006, Aikhenvald 2006, Tenny
& Speas 2004, Tenny 2006, Davis, Potts & Speas 2007, Murray to appear,
among others). Thus, a question containing an evidential expects that the
hearer, rather than the speaker, has the relevant type of evidence for the
answer. For example, (98) is not appropriate if directed to your mother, if she
is the one who always cooks dinner. However, it is acceptable when directed
to a third person, who might have heard from your mother what you are
going to eat.

(98) stám’=ku7 ku=cuz’=s-q’á7-lhkalh
what=report det=going.to=nom-eat-1pl.poss
‘What are we going to eat?’

The second assumption is that a speaker who uses an evidential which is
low on a hierarchy of evidence strength implicates that there is no available
evidence of a stronger type (Faller 2002, among others). This also seems to
be correct in St’át’imcets; the use of an inferential evidential, for example,
leads a hearer to infer that the speaker did not have reportative or direct
evidence.
These two assumptions lead to the following result: a question containing
an evidential which is low on the scale of evidence strength will lead to an
implicature that the hearer does not have evidence of any stronger type. This
is illustrated in (99).

(99) a. man’c-em=há=k’a k=Hotze
smoke-mid=ynq=infer det=Hotze
‘I wonder if Hotze smokes.’
b. Alternatives introduced by (99a):
{that Hotze might smoke, that Hotze might not smoke}
c. Presupposition of (99a):
The hearer has inferential evidence both that Hotze smokes and
that Hotze does not smoke
46 Evidential hierarchies are a topic of some debate and there are many interesting questions
to be investigated (see Faller 2002 for an overview). It is also an interesting question how
evidence-type hierarchies interact with the variable interpretations of all evidentials in
St’át’imcets (Matthewson et al. 2007, Rullmann et al. 2008). Although all strengths are
possible for all evidentials in St’át’imcets, inferential k’a is more likely to be weaker (i.e., to
have a more restricted domain of worlds to quantify over), while the reportative ku7 and the
perceived-evidence =an’ are much more likely to give rise to stronger interpretations.

d. Implicature: The hearer does not have any stronger type of
evidence than inferential about the correct answer
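
The way the two assumptions combine to yield (99d) can be sketched as follows. The sketch is illustrative only: the text states just that inferential evidence ranks below reportative and direct evidence, so the relative ranking of the latter two in `HIERARCHY` is an assumption, as are the function names.

```python
# Evidence types from weakest to strongest. Placing "direct" above
# "reportative" is an assumption made for this illustration.
HIERARCHY = ["inferential", "reportative", "direct"]


def evidence_holder(is_question):
    """Interrogative flip: in a question, the hearer is the evidence holder."""
    return "hearer" if is_question else "speaker"


def stronger_types(evidential):
    """All evidence types ranked above the one that was used."""
    return HIERARCHY[HIERARCHY.index(evidential) + 1:]


def implicature(evidential, is_question):
    """Pairs (holder, type): the holder is implicated to lack that evidence type."""
    holder = evidence_holder(is_question)
    return {(holder, stronger) for stronger in stronger_types(evidential)}


CONJECTURAL_QUESTION = implicature("inferential", is_question=True)
PLAIN_ASSERTION = implicature("inferential", is_question=False)
```

For a question containing k’a, `CONJECTURAL_QUESTION` records that the hearer is implicated to lack both reportative and direct evidence, as in (99d); in a declarative the same implicature attaches to the speaker.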

According to Littell (2009), this analysis accounts for the reduced inter-
rogative force of conjectural questions. The idea is that inferential evidence
is a fairly weak type of evidence, and a speaker who asks a question while
implicating that the hearer only has inferential evidence about the true an-
swer is letting the hearer off the hook with respect to answering. This is
intended to account for (a) the judgments of St’át’imcets consultants that
conjectural questions do not require an answer, (b) the fact that conjectural
questions are infelicitous when the addressee is likely to know the answer (cf.
(87)), and (c) the fact that conjectural questions are translated as ‘I wonder’
or ‘maybe’-statements (although they do not literally have the semantics
of ‘wonder’). ‘I wonder’ is simply a typical method in English of raising a
question without demanding an answer.
However, this account does not seem to predict a complete absence of
interrogative force. After all, the inferential evidence the hearer is assumed
to possess is better than no evidence at all. In line with this, an English
question like ‘According to the weak evidence you have, could Hotze smoke?’
still functions pragmatically as an interrogative. I conclude, therefore, that
interrogative flip plus implicatures about the absence of stronger evidence are
not sufficient in and of themselves to completely let the hearer off the hook
with respect to answering. This is actually a welcome result, since questions
containing k’a in the indicative mood are sometimes translated by speakers
into English using ordinary questions (rather than as statements of doubt;
see footnote 40). However, conjectural questions containing the subjunctive
are never translated as ordinary questions. I therefore assume that while a
question containing an evidential is already somewhat ‘weakened’ in terms
of its interrogative force, the subjunctive performs a further weakening. The
task now is to see whether this falls out from the analysis of the subjunctive
proposed above.
Recall that in the context of a governing modal, the subjunctive adds the
presupposition that in at least one of the best worlds in the modal base, the
proposition is false. The best worlds here (as the modal is epistemic) are
those which conform to the propositions known to be true, and in which
things happen as normal. Since the evidential has undergone interrogative
flip, the epistemically accessible worlds must also be flipped to be the worlds
compatible with the hearer’s knowledge. The results are shown in (100).47

Lisa Matthewson

(100) a. cuz’=as=há=k’a ts7as s=Bill
going.to=3sbjn=ynq=infer come nom=Bill
‘I wonder if Bill is going to come.’
b. Alternatives introduced by (100a):
{that Bill is possibly going to come, that Bill is possibly not going
to come}
c. Presuppositions of (100a):
The hearer has inferential evidence both that Bill is going to
come and that Bill is not going to come; Bill doesn’t come in at
least one normal world compatible with the hearer’s knowledge,
and Bill comes in at least one normal world compatible with the
hearer’s knowledge
d. Implicature: The hearer does not have any stronger type of
evidence than inferential about the correct answer

As before, the implicature that the hearer does not have strong evidence
about the true answer, combined with the mixed-evidence effect of the
evidential presuppositions, will partially reduce the expectation that the
hearer is able to answer the question. In addition, thanks to the subjunctive,
the question now presupposes not only that the evidence about Bill’s possible
arrival is mixed, but also that there are worlds compatible with the hearer’s
knowledge in which Bill does come, and worlds compatible with the hearer’s
knowledge in which he does not come. In other words, the hearer does not
know whether he will come or not. The result is that a subjunctive conjectural
question has a significantly reduced expectation on the hearer to provide an
answer.48

The account just given, which incorporates the analysis of the St’át’imcets
subjunctive as weakening a modal proposition via domain restriction,
successfully accounts for the distributional and interpretive facts illustrated
in (84)–(85) above. The fact that the subjunctive requires a modal licenser
in a question follows from the analysis of the subjunctive as requiring a
governing modal. The fact that an evidential in a question always licenses
at least slightly reduced interrogative force, regardless of mood, falls out
from the fact that the evidential plays a part in reducing interrogative force.
However, the added contribution of the subjunctive accounts for the
preferred presence of the subjunctive in conjectural questions, as well as for
the fact that questions containing an evidential plus the subjunctive, in
contrast to indicative evidential questions, can only be interpreted with reduced
interrogative force.

In the final part of this section I extend the discussion to conjectural
questions which contain a future morpheme rather than an evidential. We
have already seen some examples of this ((17b)–(18b) above). In contrast to the
evidential k’a, the future modal obligatorily requires the subjunctive mood if
it is to be interpreted as a statement of doubt. This is shown in (101)–(102),
where the (a) examples are only interpretable as ordinary questions which
expect an answer.

(101) a. t’íq=ha=kelh k=Bill
arrive=ynq=fut det=Bill
‘Is Bill going to come?’ fut + indic
b. t’iq=as=há=kelh k=Bill
arrive=3sbjn=ynq=fut det=Bill
‘I wonder if Bill will come.’ fut + sbjn

47 An anonymous reviewer raises a potentially significant issue with the choice function
required for these cases. With the deontic and imperative cases discussed above, the choice
function had intuitive content (e.g., the ‘very best way to achieve some end’), but here the role
of the subjunctive is purely to make sure there are some ‘best worlds’ where the prejacent is
false. It is thus not clear which proper subset of the best worlds the function picks out.

48 As noted above, conjectural questions also imply that the speaker does not know the answer.
I assume that this follows, by Gricean reasoning, from the fact that the speaker uttered a
question, rather than having simply asserted the true answer. However, there is a bit more to
be said here, since plain questions in St’át’imcets allow a ‘display question’ use, in which a
teacher can ask (i):

(i) k’win ku=án’was múta7 án’was
how.many det=two and two
‘What is two plus two?’

As an anonymous reviewer points out, this display use should technically remain even when
the subjunctive is added. However, consultants judge the subjunctive version of (i) to no
longer be a teacher’s question, but a student’s reply:

(ii) k’wín=as=k’a ku=án’was múta7 án’was
how.many=3sbjn=infer det=two and two
‘I don’t know how much two plus two is.’

Perhaps conjectural questions like (ii) simply do not make good questions for a teacher to
ask because they encode addressee ignorance.

(102) a. inwat=wít=kelh
‘What will they say?’ fut + indic
b. inwat=wít=as=kelh
‘I wonder what they will say.’ fut + sbjn

The contrast between the evidential and the future with respect to whether
the subjunctive is required to create a conjectural question is striking. So
far, I have argued that the evidential k’a contributes to reduced interrogative
force by means of an implicature that the hearer has no better than inferential
evidence for the true answer, and that the subjunctive contributes to further
reduced interrogative force by presupposing that it is compatible with the
hearer’s knowledge state that each possible answer is false. Now unlike k’a,
the future modal kelh has not been analyzed as an epistemic modal, and it
does not introduce any evidence presuppositions. The denotation for kelh is
given in (103).

(103) $[\![\textit{kelh}(h)(g)]\!]^{c,w,t}$ is only defined if $h$ is a circumstantial modal base
and $g$ is a stereotypical ordering source.
If defined, $[\![\textit{kelh}(h)(g)]\!]^{c,w,t} = \lambda q_{\langle s,\langle i,t\rangle\rangle}.\,\forall w' \in f_c(\max_{g(w)}(\cap h(w,t)))\,[\exists t'\,[t < t' \wedge q(w')(t') = 1]]$
(adapted from Rullmann et al. 2008)49
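
As a rough illustration of the truth conditions in (103), the following sketch evaluates them over a small toy model. It is not part of the analysis and rests on simplifying assumptions: the domain picked out by the choice function from the best worlds of the modal base is simply passed in as a finite set (`selected_worlds`), times are integers, and a proposition is a Python function from a world and a time to a truth value. All names, and the toy facts about Bill, are hypothetical.

```python
from typing import Callable, Iterable

World = str
Time = int


def kelh(selected_worlds: Iterable[World],
         times: Iterable[Time],
         t: Time,
         q: Callable[[World, Time], bool]) -> bool:
    """Universal future: in every selected world, q holds at some time after t.

    `selected_worlds` stands in for f_c(max_g(w)(∩h(w,t))); deriving that set
    from a modal base and an ordering source is outside this sketch.
    """
    times = list(times)
    return all(any(t < t2 and q(w, t2) for t2 in times)
               for w in selected_worlds)


# Hypothetical toy facts: Bill arrives at time 2 in w1, at time 3 in w2,
# and never arrives in w3.
ARRIVAL = {"w1": 2, "w2": 3}


def bill_arrives(w: World, t: Time) -> bool:
    return ARRIVAL.get(w) == t


TIMES = range(0, 5)

print(kelh({"w1", "w2"}, TIMES, 1, bill_arrives))
print(kelh({"w1", "w2", "w3"}, TIMES, 1, bill_arrives))
```

The first call holds because Bill arrives after time 1 in both selected worlds; the second fails because w3 contains no arrival at all. The presupposition about the type of modal base and ordering source is not modelled here.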

Applying this analysis of kelh to questions containing a subjunctive gives
the results in (104).

(104) a. nká7=as=kelh lh=cúz’=as nas k=Gloria
where=3sbjn=fut comp=going.to=2sg.sbjn go det=Gloria
‘I wonder where Gloria will go.’
b. Alternatives introduced by (104a):
{that Gloria will go home, that Gloria will go to her mother’s
house, . . . }
c. Presuppositions of (104a):
The future claim is made on the basis of the facts; Gloria won’t
go home in at least one stereotypical world compatible with the
facts, Gloria will not go to her mother’s house in at least one
stereotypical world compatible with the facts, . . .

49 I have altered Rullmann et al.’s formula to incorporate the ordering source and to make the
format parallel to that of other formulas above. The modal base in (103) is a function from
world-time pairs to sets of propositions.

There are no implicatures about evidence types this time, but interestingly,
we still predict reduced interrogative force. And this time, the contribution
of the subjunctive is absolutely critical to deriving the effect. Due to the
subjunctive, the question as a whole presupposes for each contextually
salient place that Gloria might go, that there is at least one stereotypical
world compatible with the facts in which she doesn’t go there. This means
that the facts underdetermine where she might go — and thus, that the
addressee may not know where she will go. Given that the subjunctive is
crucial in deriving the reduced interrogative force, we correctly predict that
the subjunctive is obligatory in conjectural questions like (102).
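
The reasoning just given can be checked mechanically on a toy model. The sketch below is an illustration under simplifying assumptions rather than the formal system of this paper: the set of stereotypical worlds compatible with the facts is given directly as a finite set, each world is identified only by Gloria's destination in it, and every name is hypothetical. It shows that whenever the subjunctive presupposition holds for every alternative, no alternative is true throughout the domain, so the facts leave the answer open.

```python
from typing import Callable, Dict, List, Set

World = str
Proposition = Callable[[World], bool]

# Hypothetical toy model: where Gloria goes in each world.
DESTINATION: Dict[World, str] = {
    "w1": "home",
    "w2": "mother's house",
    "w3": "home",
}


def goes_to(place: str) -> Proposition:
    return lambda w: DESTINATION[w] == place


ALTERNATIVES: Dict[str, Proposition] = {
    place: goes_to(place) for place in ("home", "mother's house")
}


def sbjn_presuppositions_met(domain: Set[World]) -> bool:
    """Each alternative is false in at least one world of the domain."""
    return all(any(not alt(w) for w in domain)
               for alt in ALTERNATIVES.values())


def settled_answers(domain: Set[World]) -> List[str]:
    """Alternatives that are true in every world of a non-empty domain."""
    return [name for name, alt in ALTERNATIVES.items()
            if domain and all(alt(w) for w in domain)]


for domain in ({"w1", "w2"}, {"w1", "w3"}):
    print(sorted(domain),
          sbjn_presuppositions_met(domain),
          settled_answers(domain))
```

In the first domain both presuppositions are satisfied and no answer is settled; in the second, Gloria goes home in every world, so the presupposition for that alternative fails.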

7.3 Ignorance free relatives

Ignorance free relatives in St’át’imcets are formed by the combination of a
wh-word, the subjunctive, and the inferential evidential k’a. Some examples
are repeated here.50

(105) a. qwatsáts=t’u7 múta7 súxwast áku7, t’ak aylh áku7,
leave=prt again go.downhill deic go then deic
nílh=k’a s=npzán-as
foc=infer nom=meet(dir)-3erg
k’a=lh=swát=as=k’a káti7 ku=npzán-as
infer=comp=who=3sbjn=infer deic det=meet(dir)-3erg
‘So he set off downhill again, went down, and then he met who-
ever he met.’ (van Eijk & Williams 1981: 66, cited in Davis 2009)
b. o, púpen’=lhkan [ta=stam’=as=á=k’a]
oh find=1sg.indic [det=what=3sbjn=exis=infer]
‘Oh, I’ve found something or other.’
(Unpublished story by “Bill” Edwards, cited in Davis 2009)

50 Thanks to Henry Davis for helpful discussions of free relatives in St’át’imcets.

There is a large literature on free relatives, concentrating mainly on
English (although see Dayal 1997 for discussion of Hindi and Davis 2009 for
discussion of St’át’imcets). Here I adopt von Fintel’s (2000) analysis; as far as I
know, nothing crucial hinges on the differences between von Fintel’s analysis
and those of, for example, Jacobson (1995) or Dayal (1997). I will argue
that the St’át’imcets ignorance free relatives are compatible with von Fintel’s
proposals, and that their interpretation relies on the independently-attested
semantics of the subjunctive and the evidential.
According to von Fintel, both ignorance and indifference free relatives
presuppose that there is variation among the worlds in the modal base with
respect to the identity of the referent. The free relative denotes a definite
description, and the sentence as a whole asserts that the definite description
satisfies the relevant property.

(106) $\textit{whatever}(w)(F)(P)(Q)$
a. presupposes: $\forall w' \in \min_w [F \cap (\lambda w'.\,\iota x.P(w')(x) \neq \iota x.P(w)(x))] : Q(w')(\iota x.P(w')(x)) = Q(w)(\iota x.P(w')(x))$
b. asserts: $Q(w)(\iota x.P(w)(x))$ (von Fintel 2000: 34)

With ignorance free relatives, the modal base F is the epistemic alterna-
tives of the speaker.51 Consider (107), for example.

(107) There’s a lot of garlic in whatever (it is that) Arlo is cooking.
(von Fintel 2000: 27)

(107) presupposes that in all the speaker’s epistemically accessible worlds
which are minimally different from the actual world and in which Arlo is
cooking something different from what he is actually cooking, there is the
same amount of garlic in what he is cooking. As the min-operator introduces
an existential presupposition, (107) presupposes that there are epistemically
accessible worlds in which Arlo is cooking something different from what
he is actually cooking. This amounts to a presupposition that the speaker is
ignorant about the identity of what Arlo is cooking. (107) then asserts that
the unique thing which Arlo is cooking has a lot of garlic in it.
Turning to St’át’imcets, we see that von Fintel’s semantics captures the
required meanings accurately. (105a) presupposes that the speaker does not
know who ‘he’ (the man being talked about) met, and asserts that he met
whoever he met. Moreover, it seems that we can account for the presence of
the subjunctive in free relatives, and also for the presence of the inferential
evidential. In particular, I would like to suggest that the presupposition of
speaker ignorance about the denotation of the free relative actually derives
from the evidential k’a and the subjunctive.

The basic idea is that an ignorance free relative is formed from a
conjectural question (see Davis 2009 for this insight, although Davis does not word
it in this way). The free relative in (105a), for example, is formed from the
conjectural question in (108).

51 With indifference free relatives, the modal base includes counterfactual alternatives.

(108) swát=as=k’a káti7 ku=npzán-as
who=3sbjn=infer deic det=meet(dir)-3erg
‘I wonder who he met.’

Following the analysis of conjectural questions given in subsection 7.2, (108)
denotes the set of propositions of the form ‘he met x’. The evidential in
(108) would normally undergo interrogative flip, giving rise to the inference
that the hearer is not in a position to answer the question of who he met.
When (108) is embedded in a non-matrix environment as in (105a), however, I
assume that interrogative flip does not take place. The free relative based on
(108) will therefore carry a conjoined presupposition that the speaker has
inferential evidence for each alternative, and an implicature that the speaker
has no stronger evidence about who he met. And due to the subjunctive,
it will presuppose that for each alternative, there is at least one best world
in the modal base in which that alternative is false. Thus, the free relative
formed from (108) will presuppose that there is mixed evidence about who he
met, and that for each person x, it’s compatible with the speaker’s knowledge
that he did not meet x. This derives the desired ‘speaker ignorance’ presup-
position. Moreover, we can regard the subjunctive as an overt spell-out of
the existential presupposition of the min-operator, namely that there are
epistemically accessible worlds in which the person he met is not who he met
in the actual world.
A final advantage of this approach is that we correctly capture the fact
that the modal base contains epistemic alternatives, as k’a lexically encodes
an epistemic conversational background. This accounts for the fact that only
ignorance free relatives, and not indifference free relatives, contain k’a in
St’át’imcets (Davis 2009).52

52 Free relatives in St’át’imcets are far from solved. For example, Davis (2009) points out
a problem with free relatives which surface as DPs, as in (105b) above. Davis shows that
syntactically, this wh-word acts like the head noun of a relative clause. This poses a challenge
for the claim that (105b) is formed from a conjectural question. Moreover, if the wh-word is
functioning as a head noun in (105b), the evidential k’a should not be able to attach to it, as
k’a attaches only to predicates. This is a peculiarity of k’a; Davis shows that other
second-position evidentials, such as reportative ku7 or perceived-evidence =an’, are ungrammatical
in free relatives. Further research is required.

7.4 ‘Pretend’

There are two patterns to account for with the ‘pretend’ cases, depending on
the dialect. In Upper St’át’imcets, the subjunctive plus the normative modal
ka frequently renders a ‘pretend to be ...’ interpretation. In Whitley et al. no
date, a native-speaker-produced St’át’imcets teaching manual, the standard
construction when the teacher is asking the students to pretend something
is that in (109).

(109) a. skalúl7=acw=ka: saq’w knáti7 múta7 em7ímn-em
owl=2sg.sbjn=deon fly deic and animal.noise-mid
‘Pretend to be an owl: fly around and hoot.’ (Davis 2006: chapter
b. snu=hás=ka ku-skícza7
2sg.emph=3sbjn=deon det=mother
‘Pretend to be the mother.’ (Whitley et al. no date)

In Lower St’át’imcets, however, examples like the ones in (109) are rejected
in ‘pretend’ contexts. Lower St’át’imcets uses either an emphatic pronoun
in a cleft, as in (110a), or the adhortative particle malh, as in (110b). In each
case, the subjunctive is present, but ka is absent.

(110) a. nu=hás ku=skalúla7: sáq’w=kacw knáti7
2sg.emph=3sbjn det=owl fly=2sg.indic deic
‘Pretend to be an owl.’
b. skalúl7=acw=malh: sáq’w=kacw knáti7
owl=2sg.sbjn=adhort fly=2sg.indic deic
‘Pretend to be an owl: fly around.’

In each of the dialectal variants, the apparent ‘pretend’ construction
seems to reduce to another usage, rather than really meaning ‘pretend’.
The examples in (109) are merely instances of the subjunctive adding to a
normative modal assertion. (109a) thus really means something like ‘I wish
you were an owl’, and (109b) means ‘I wish you were the mother.’ In (110a),
the subjunctive adds to a plain assertion to create a wish, something which is
possible with clefts; cf. (5) above. As for (110b), the consultant spontaneously
translates this into English as ‘You may as well be an owl’. The presence
of adhortative malh here is a matter for future research; see comments in
Section 8 below.
Support for the idea that (109) and (110) are not really ‘pretend’ construc-
tions comes from the fact that exactly parallel structures are used when the
wish is not that someone pretend to be something, but rather is a wish which
has a chance of coming true. This is shown in (111). While the consultant
accepts a ‘pretend’ translation for the sentences in (111), she spontaneously
translates them into English using simply ‘you be . . . ’. She judges that the
St’át’imcets sentences do not really mean ‘pretend’.

(111) a. nu=hás ku=kúkwpi7
2sg.emph=3sbjn det=chief
‘Pretend to be the chief.’ [accepted]
‘You be the chief.’ [spontaneously given]
b. nu=hás ku=kúkw
2sg.emph=3sbjn det=cook
‘Pretend to cook.’ [accepted]
‘You be the cook.’ [spontaneously given]

7.5 Why St’át’imcets is not like Romance

In this final sub-section I return to a major cross-linguistic difference between
the St’át’imcets subjunctive and more familiar, Indo-European subjunctives,
namely that in St’át’imcets the subjunctive is never selected by a matrix
predicate, and in fact is ungrammatical under all attitude verbs (as shown in
(38) above).
It turns out that this falls out from the current analysis. The St’át’imcets
subjunctive is parasitic on a modal, and introduces the presupposition that
in at least one of the best worlds in the modal base according to the ordering
source, the embedded proposition is false. This presupposition is incompati-
ble with the semantics of attitude verbs, which are standardly analyzed as
introducing universal quantification over a set of worlds. This is illustrated
in (112) for English believe.

(112) $[\![\textit{believe}]\!]^{w,g} = \lambda p_{\langle s,t\rangle}.\,\lambda x.\,\forall w'$ compatible with what $x$ believes in $w$: $p(w') = 1$
(von Fintel & Heim 2007: 18)

There is no reason to assume that attitude verbs like ‘believe’ have different
semantics in St’át’imcets from in English. On the contrary, the St’át’imcets
verb tsutánwas ‘think, believe’ must involve universal quantification over
belief-worlds, without the possibility of domain restriction (in other words,
there is no choice function or second ordering source). Thus, (113), just like
its English gloss, requires that in all Laura’s belief-worlds, John has left. It
cannot mean that Laura’s beliefs allow, but do not require, that John has left.

(113) tsut-ánwas k=Laura kw=s=qwatsáts=s k=John
say-inside det=Laura det=nom=leave=3poss det=John
‘Laura thinks that John left.’

Given this, adding the subjunctive under the verb ‘believe’ in St’át’imcets
leads to the following contradictory result.

(114) *tsut-ánwas k=Laura kw=s=qwatsáts=as k=John
say-inside det=Laura det=nom=leave=3sbjn det=John
‘Laura thinks that John left.’

$[\![(114)]\!]^{w}$ is only defined if $\exists w'$ compatible with Laura’s beliefs in $w$: John didn’t leave in $w'$
If defined, $[\![(114)]\!]^{w} = 1$ iff $\forall w'$ compatible with Laura’s beliefs in $w$: John left in $w'$

The presupposition of the subjunctive contradicts the assertion. This explains
why the subjunctive is not used under verbs like ‘believe’ in St’át’imcets,
unlike in Romance.
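
The clash between presupposition and assertion in (114) can be verified exhaustively on a small model. The following is a minimal sketch under stated assumptions, not a piece of the analysis itself: worlds are plain strings, a belief state is a finite set of worlds, presupposition failure is modelled by returning `None`, and the toy facts about which worlds John left in are hypothetical.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Optional

World = str
Proposition = Callable[[World], bool]


def believe_sbjn(belief_worlds: FrozenSet[World],
                 p: Proposition) -> Optional[bool]:
    """Three-valued result; None models presupposition failure."""
    # Presupposition contributed by the subjunctive:
    # p is false in at least one belief world.
    if not any(not p(w) for w in belief_worlds):
        return None
    # Assertion contributed by 'believe': p is true in all belief worlds.
    return all(p(w) for w in belief_worlds)


WORLDS = ("w1", "w2", "w3")
LEFT_IN = {"w1", "w2"}  # hypothetical: the worlds in which John left


def john_left(w: World) -> bool:
    return w in LEFT_IN


# Every possible belief state over the toy worlds, including the empty one.
belief_states = [frozenset(c)
                 for r in range(len(WORLDS) + 1)
                 for c in combinations(WORLDS, r)]

outcomes = {believe_sbjn(b, john_left) for b in belief_states}
print(outcomes)
```

Across all belief states the sentence comes out either undefined or false, never true, which is the pattern the contradiction predicts.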
We need to separately discuss the absence of subjunctive under desire
verbs in St’át’imcets. An example was given in (38e), repeated here.53

(115) xát’-min’-as k=Laura kw=s=t’iq=Ø k=John
want-red-3erg det=Laura det=nom=arrive=3indic det=John
‘Laura wanted John to come.’

53 Thanks to an anonymous reviewer for discussion of this issue.

Desire verbs are often treated as involving comparison between alternative
worlds (e.g., Stalnaker 1984, Heim 1992 and much subsequent work). The
intuition is that ‘John wants you to leave means that John thinks that if you
leave he will be in a more desirable world than if you don’t leave’ (Heim 1992:
193). Here I adopt Portner’s (1997) analysis of desire verbs, and in particular
we will see that the St’át’imcets verb xát’min’ is better analyzed as similar to
English hope (which according to Portner is similar to believe, and therefore
is not intrinsically comparative) than to English want.
Portner analyzes hope in terms of a buletic accessibility relation $Bul_\alpha(s, b)$.
For any situation $s$ and belief situation $b$ of an agent $\alpha$, $Bul_\alpha(s, b)$ is the set
of buletic alternatives for $\alpha$ in $s$, i.e., ‘the worlds in which the most of α’s
plans in s (relative to his or her beliefs in b) are carried out’ (Portner 1997:
178). The sentence in (116) receives the interpretation shown: it is true just in
case in all of James’s buletic alternatives, Joan arrives in Richmond soon.

(116) James hopes that Joan arrives in Richmond soon.
$\{s : Bul_{James}(s, b) \subseteq [\![\text{Joan arrives in Richmond soon}]\!]^{s}\}$
(Portner 1997: 188)

Portner’s analysis of hope differs from that of want, and is parallel to that
of believe, in crucial respects (which explain the different embedding possi-
bilities for hope/believe vs. want). In particular, while hope and believe are
defined directly in terms of (doxastic or buletic) alternatives, want is defined
in terms of the agent’s plans. Portner argues that the difference between
hope and want is ‘an idiosyncratic lexical one’ (Portner 1997: 189). If this is
correct, it would not be unexpected that a language could contain only the
hope-type of desire predicate.
If we apply Portner’s analysis of hope to St’át’imcets xát’min’, and attempt
to use the subjunctive in the embedded clause, we get the result in (117).

(117) *xát’-min’-as k=Laura kw=s=t’íq=as k=John
want-red-3erg det=Laura det=nom=arrive=3sbjn det=John
‘Laura wanted John to come.’

$[\![(117)]\!]^{s}$ is only defined if $\exists s \in Bul_{Laura}(s, b)$: John does not come in $s$
If defined, $[\![(117)]\!]^{s} = 1$ iff $\{s : Bul_{Laura}(s, b) \subseteq [\![\text{John comes}]\!]^{s}\}$

(117) is defined only if there is at least one situation in Laura’s buletic alter-
natives in which John does not come, but it asserts that in all Laura’s buletic
alternatives, John comes. The contradiction between the presupposition and
the assertion leads to the unacceptability of the sentence.

The idea that St’át’imcets xát’min’ is parallel to English hope or believe
rather than to English want leads to the following cross-linguistic compari-
son. While Indo-European has two kinds of attitude verbs — those involving
universal quantification over alternative worlds, and those which are intrin-
sically comparative — St’át’imcets has only the former kind. This explains
why St’át’imcets lacks subjunctives under attitude verbs, and even allows us
to draw the broader generalization that St’át’imcets only allows universal
quantification over worlds. This language lacks both true possibility modals
and comparative subjunctive-embedding predicates.54

8 Conclusions and questions for future research

The goal of this paper was to extend the formal cross-linguistic study of
modality to the related domain of mood. Prior work on St’át’imcets has
proposed that languages vary in whether their modals encode quantifica-
tional force (as in English), or conversational background (as in St’át’imcets)
(Matthewson et al. 2007, Rullmann et al. 2008, Davis et al. 2009). Here, I have
argued that languages vary in their mood systems along the same dimension,
at least functionally. While some languages use moods to encode distinctions
of conversational background (buletic, deontic, etc.), St’át’imcets uses mood
to functionally achieve a restriction on modal quantificational force. (Of
course technically, both modals and moods in St’át’imcets restrict conver-
sational backgrounds: the modal force is always universal.) If this view is
correct, then each language-type draws on its moods and its modals together
to allow the full range of specifications. In other words, what modals don’t
encode, moods do. The simplified typological table is repeated here.

                lexically encode   lexically encode
                quant. force       conv. background

Indo-European   modals             moods
St’át’imcets    moods              modals

Table 5 Modal and mood systems

54 Thanks to an anonymous reviewer for discussion of this point.

The analysis presented here raises some questions for future research.
One outstanding issue is the status of subjunctives with no overt licenser at
all, as in (5)–(6). As noted earlier, these appear to be productive only in clefts.
It is not immediately obvious that a cleft contains a modal operator which
would license the subjunctive, so further investigation is required (although
see fn. 22).
A second interesting puzzle relates to subjunctive imperatives (see sub-
section 7.1). These seem to strongly prefer the presence of the adhortative
particle malh, which is normally optional in imperatives. Perhaps malh (which
has not previously been analyzed) is a modal, and perhaps its obligatoriness
reflects the licensing requirement of the subjunctive. But what consequence
would this have for the analysis provided above, which assumes that even
imperatives with no adhortative particle contain a concealed deontic modal?
This question cannot be answered without a real investigation of malh,
something which goes beyond the bounds of the current paper.
An even trickier element is the particle t’u7. t’u7 is the culprit in the
two uses of the subjunctive I have declined to analyze here, the ‘might as
well’ cases and the indifference free relatives. Like malh, t’u7 has not yet
been formally analyzed, but for t’u7 there are not even any clear descriptive
generalizations about its usage. It is often translated as ‘just’ or ‘still’, but also
occurs where there is no obvious English translation, or even any detectable
semantic contribution. t’u7 frequently appears with strong quantifiers, as in
(118a), is almost obligatory if one wants to express ‘only’, as in (118b), and is
also the St’át’imcets way to express ‘but’, as in (118c) (although here, unlike
in its other uses, it is not a second-position enclitic, and this may therefore
be a case of homophony).

(118) a. tákem=t’u7 swat áolsvm l=ti=tsítcw=a
all=prt who sick in=det=house=exis
‘Everyone in the house was sick.’ (Matthewson 2005: 311)
b. tsúkw=t’u7 snilh ti=tsícw=a aolsvm-áolhcw
finish=prt 3sg.emph det=get.there=exis sick-house
‘It was only him who went to the hospital.’ (Matthewson 2005:
c. plan aylh láku7 wa7 cw7it i=tsetsítcw=a, t’u7
already then deic impf many det.pl=houses=exis but
pináni7 cw7aoz láti7 ku=wá7 tsitcw
temp.deic neg deic det=impf house
‘Now there are lots of houses there, but then there were no

(Matthewson 2005: 54)

As noted above, t’u7 is present in the ‘might as well’ uses of the subjunc-
tive, and in indifference free relatives. Examples are repeated here.

(119) a. wá7=lhkacw=t’u7 lts7a lhkúnsa ku=sgáp
be=2sg.indic=prt deic now det=evening
‘You are staying here for the night.’
b. wá7=acw=t’u7 lts7a lhkúnsa ku=sgáp
be=2sg.sbjn=prt deic now det=evening
‘You may as well stay here for the night.’

(120) [stám’=as=t’u7 káti7 i=wá7 ka-k’ac-s-twítas-a
[what=3sbjn=prt deic det.pl=impf circ-dry-caus-3pl.erg-circ
i=n-slalíl’tem=a] wa7 ts’áqw-an’-em
det.pl=1sg.poss-parents=exis] impf eat-dir-1pl.erg
lh=as sútik
comp(impf)=3sbjn winter
‘Whatever my parents could dry, we ate in wintertime.’
(Matthewson 2005: 141, cited in Davis 2009)

Given the analysis above, we expect there to be a modal (or at least a
modal base and an ordering source) present in any structure where the
subjunctive is licensed. The interpretation of subjunctive + t’u7 in (119b) is
plausibly modal — the consultants are remarkably consistent with the ‘might
as well’ translation. There is also a certain similarity between the ‘might
as well’ construction and the Sufficiency Modal Construction (Krasikova &
Zchechev 2005, von Fintel & Iatridou 2008), illustrated in (121).

(121) To get good cheese, you only have to go to the North End!
(von Fintel & Iatridou 2008: 445)

The crucial elements of the Sufficiency Modal Construction are (a) a
necessity modal and (b) an exclusive operator such as ‘only’.55 The possible
connection between (119) and (121) may be fruitful to investigate in future
research.56

55 For von Fintel and Iatridou, the ‘only’ is decomposed into ‘neg . . . except’ (and shows up
overtly as this in some languages).
56 See also Mitchell 2003 on ‘might as well’ in English.

As for indifference free relatives as in (120), these also very plausibly con-
tain a covert modal, presumably a necessity one. The important question will
be whether the subjunctive can be analyzed as a weakener in the indifference
free relatives. Ideally, the future analysis of (119)–(120) will also elucidate
the semantic connection between the two t’u7-subjunctives, both of which
somehow express the notion of ‘indifference’ (although perhaps in different
senses of the word). (119b), for example, conveys that you can stay here for
the night or not, I don’t really care.
In spite of these outstanding questions, I believe that the empirical cover-
age of the analysis presented here is encouraging. Out of the nine meaningful
uses of the St’át’imcets subjunctive, we set aside two which rely on the poorly-
understood particle t’u7, but have managed to unify the remaining seven.
The analysis accounts for such seemingly disparate effects as the weakening
of imperatives, the reduction in interrogative force of questions, and the
non-appearance of the subjunctive under any attitude verb. The analysis, if
correct, supports the modal approach to mood advocated by Portner (1997),
and suggests that languages have a certain amount of freedom in how they
divide up the various functional tasks required of moods and modals.
Finally, the research reported on here opens up broader questions about
the nature of mood cross-linguistically, for example about the relation be-
tween subjunctive and irrealis. In Section 2, I showed that the St’át’imcets
subjunctive patterns morpho-syntactically, as well as in some of its semantic
properties, like a subjunctive rather than an irrealis. However, we also saw
that the St’át’imcets subjunctive differs semantically from Indo-European
subjunctives. I argued above (see fn. 9) that the use of the term ‘subjunctive’
was justified, even in the face of such non-trivial cross-linguistic variation.
However, there is much more work to be done on the formal semantics of
mood cross-linguistically. Once a wider range of systems are investigated
in depth, we may find that the traditional terminology does not correlate
with the cross-linguistically interesting divisions. Topics for future inquiry
include whether there is a minimal semantic change which would turn a
subjunctive morpheme into an irrealis one, or vice versa, and in general what
the semantic building blocks are from which moods are composed.

References


Aikhenvald, Alexandra. 2006. Evidentiality. New York: Oxford University

Baker, Mark & Lisa Travis. 1997. Mood as verbal definiteness in a
“tenseless” language. Natural Language Semantics 5(3). 213–269.
Beghelli, Filippo. 1998. Mood and the interpretation of indefinites. The
Linguistic Review 15(2-3). 277–300. doi:10.1515/tlir.1998.15.2-3.277.
Bolinger, Dwight. 1968. Postposed main phrases: an English rule for the
Romance subjunctive. Canadian Journal of Linguistics 14. 3–33.
Caponigro, Ivano & Jon Sprouse. 2007. Rhetorical questions as questions. In
Proceedings of Sinn und Bedeutung 11, 121–133. http://idiom.ucsd.edu/
Condoravdi, Cleo. 2002. Temporal interpretation of modals: Modals for the
present and the past. In David Beaver, Stefan Kaufmann, Brady Clark &
Luis Casillas (eds.), Stanford Papers on Semantics, vol. 7, 59–88. Stanford:
CSLI Publications. http://semanticsarchive.net/Archive/2JmZTIwO/.
Davis, Christopher, Christopher Potts & Margaret Speas. 2007. The pragmatic
values of evidential sentences. In Masayuki Gibson & Tova Friedman (eds.),
Proceedings of the 17th Conference on Semantics and Linguistic Theory,
71–88. Ithaca, NY: CLC Publications. doi:1813/11294.
Davis, Henry. 2000. Remarks on Proto-Salish subject inflection. International
Journal of American Linguistics 66(4). 499–520. doi:10.1086/466439.
Davis, Henry. 2006. A grammar of Upper St’át’imcets. Ms., University of
British Columbia.
Davis, Henry. 2009. Free relatives in St’át’imcets (Lillooet Salish). Ms., Univer-
sity of British Columbia.
Davis, Henry, Lisa Matthewson & Hotze Rullmann. 2009. ‘Out of control’
marking as circumstantial modality in St’át’imcets. In Lotte Hogeweg,
Helen de Hoop & Andrey Malchukov (eds.), Cross-linguistic semantics of
tense, aspect and modality, 205–244. Oxford: John Benjamins. http://
Dayal, Veneeta. 1997. Free relatives and ever: Identity and free choice read-
ings. In Proceedings of SALT VII, 99–116. http://www.rci.rutgers.edu/
van Eijk, Jan. 1997. The Lillooet language: Phonology, morphology, syntax.
Vancouver, BC: UBC Press.


van Eijk, Jan & Lorna Williams. 1981. Lillooet legends and stories. Mt. Currie,
BC: Ts’zil Publishing House.
Faller, Martina. 2002. Semantics and pragmatics of evidentials in Cuzco
Quechua: Stanford dissertation.
Faller, Martina. 2006. Evidentiality and epistemic modality at the se-
mantics/pragmatics interface. http://www.eecs.umich.edu/~rthomaso/
Farkas, Donka. 1992. On the semantics of subjunctive complements. In Paul
Hirschbühler & Konrad Koerner (eds.), Romance languages and modern
linguistic theory: Papers from the 20th linguistic symposium on Romance
languages, 69–104. Amsterdam and Philadelphia: Benjamins.
Farkas, Donka. 2003. Assertion, belief and mood choice. Paper presented at
the Workshop on Conditional and Unconditional Modality, ESSLLI, Vienna.
von Fintel, Kai. 2000. Whatever. In Proceedings of SALT X, 27–40. http:
von Fintel, Kai & Anthony Gillies. 2010. Must . . . stay . . . strong! Natural
Language Semantics. doi:10.1007/s11050-010-9058-2.
von Fintel, Kai & Irene Heim. 2007. Intensional semantics lecture notes. Ms.,
MIT. http://mit.edu/fintel/IntensionalSemantics.pdf.
von Fintel, Kai & Sabine Iatridou. 2008. How to say ought in foreign: The
composition of weak necessity modals. In Jacqueline Guéron & Jacqueline
Lecarme (eds.), Time and modality, 115–141. Dordrecht: Springer. http:
Garrett, Edward. 2001. Evidentiality and assertion in Tibetan. Los Angeles,
CA: UCLA dissertation.
Gauker, Christopher. 1998. What is a context of utterance? Philosophical
Studies 91(2). 149–172. doi:10.1023/A:1004247202476.
Giannakidou, Anastasia. 1997. The landscape of polarity items. Groningen:
University of Groningen dissertation.
Giannakidou, Anastasia. 1998. Polarity sensitivity as (non)veridical depen-
dency. Amsterdam and Philadelphia: John Benjamins.
Giannakidou, Anastasia. 2009. The dependency of the subjunctive re-
visited: Temporal semantics and polarity. Lingua 119(12). 1883–1908.
Giorgi, Alessandra & Fabio Pianesi. 1997. Tense and aspect: From semantics
to morpho-syntax. Oxford: Oxford University Press.
Guerzoni, Elena. 2003. Why ‘even’ ask? On the pragmatics of questions and
the semantics of answers: MIT dissertation. http://hdl.handle.net/1721.1/
Hamblin, C. L. 1973. Questions in Montague English. Foundations of Language
10(1). 45–53. http://www.jstor.org/stable/25000703.
Han, Chung-hye. 1997. Deontic modality of imperatives. Language and
Information 1. 107–136.
Han, Chung-hye. 1999. Deontic modality, lexical aspect and the semantics
of imperatives. In Linguistics in the morning calm 4, Seoul: Hanshin
Publications. http://www.sfu.ca/~chunghye/papers/morningcalm.
Han, Chung-hye. 2002. Interpreting interrogatives as rhetorical questions.
Lingua 112(3). 201–229. doi:10.1016/S0024-3841(01)00044-4.
Haverkate, Henk. 2002. The syntax, semantics and pragmatics of Spanish
mood. Amsterdam and Philadelphia: John Benjamins.
Heim, Irene. 1992. Presupposition projection and the semantics of attitude
verbs. Journal of Semantics 9(3). 183–221. doi:10.1093/jos/9.3.183.
Hooper, Joan B. 1975. On assertive predicates. In John Kimball (ed.), Syntax
and semantics 4, 91–124. New York: Academic Press.
Jacobs, Peter. 1992. Subordinate clauses in Squamish: A Coast Salish language.
MA thesis, University of Oregon.
Jacobson, Pauline. 1995. On the quantificational force of English free relatives.
In Emmon Bach, Eloise Jelinek, Angelika Kratzer & Barbara Partee (eds.),
Quantification in natural language, 451–486. Dordrecht: Kluwer.
James, Frances. 1986. Semantics of the English subjunctive. Vancouver, BC:
UBC Press.
Klein, Flora. 1975. Pragmatic constraints in distribution: the Spanish subjunc-
tive. In Papers from the 11th CLS, 353–365.
Krasikova, Sveta & Ventsislav Zhechev. 2005. Scalar uses of only in con-
ditionals. In Proceedings of the fifteenth Amsterdam Colloquium, 137–
142. University of Amsterdam. http://www.ventsislavzhechev.eu/Home/
Kratzer, Angelika. 1981. The notional category of modality. In Hans-Jürgen
Eikmeyer & Hannes Rieser (eds.), Words, worlds, and contexts: New ap-
proaches in word semantics (Research in Text Theory 6), 38–74. Berlin: de
Kratzer, Angelika. 1991. Modality. In Dieter Wunderlich & Arnim von Stechow
(eds.), Semantics: An international handbook of contemporary research,
639–650. Berlin: de Gruyter.


Kratzer, Angelika. 2009. Modals and conditionals again, chapter 3. To be
published by Oxford University Press.
Kroeber, Paul. 1999. The Salish language family: Reconstruct-
ing syntax. Lincoln, NE: The University of Nebraska Press.
Littell, Patrick. 2009. Conjectural questions and the wonder effect or: What
could conjectural questions possibly be? Ms, University of British
Littell, Patrick, Lisa Matthewson & Tyler Peterson. 2009. On the semantics of
conjectural questions. Paper presented at the MOSAIC Workshop (Meeting
of Semanticists Active in Canada), Ottawa.
Lunn, Patricia. 1995. The evaluative function of the Spanish subjunctive.
In Joan Bybee & Suzanne Fleischman (eds.), Modality and grammar in
discourse, 419–449. Amsterdam and Philadelphia: Benjamins.
Matthewson, Lisa. 1998. Determiner systems and quantificational strategies:
Evidence from Salish. The Hague: Holland Academic Graphics.
Matthewson, Lisa. 1999. On the interpretation of wide-scope indefinites.
Natural Language Semantics 7(1). 79–134. doi:10.1023/A:1008376601708.
Matthewson, Lisa. 2005. When I was small – i wan kwikws: Grammatical
analysis of St’át’imcets oral narratives. Vancouver, BC: UBC Press.
Matthewson, Lisa. 2006. Presuppositions and cross-linguistic variation. In
Proceedings of NELS 36, Amherst, Mass: GLSA Publications.
Matthewson, Lisa. 2008a. Moods vs. modals in St’át’imcets and beyond. Paper
presented at New York University.
Matthewson, Lisa. 2008b. Pronouns, presuppositions and semantic vari-
ation. In Proceedings of SALT XVIII, 527–550. Cornell University:
CLC Publications. http://www.linguistics.ubc.ca/sites/default/files/
Matthewson, Lisa. 2010. Evidence about evidentials: Where fieldwork
meets theory. Paper presented at Linguistic Evidence 2010, Uni-
versity of Tübingen. http://www.linguistics.ubc.ca/sites/default/files/
Matthewson, Lisa. to appear. On apparently non-modal evidentials. To appear
in Proceedings of CSSP 2009 (EISS8).
Matthewson, Lisa, Hotze Rullmann & Henry Davis. 2007. Evidentials as
epistemic modals: Evidence from St’át’imcets. In J.V. Craenenbroeck (ed.),
Linguistic Variation Yearbook, vol. 7, 201–254. John Benjamins Publishing


Mitchell, Keith. 2003. Had better and might as well: On the margins of modal-
ity? In R. Facchinetti, M. Krug & F. Palmer (eds.), Modality in contemporary
English, 129–149. Berlin: Mouton de Gruyter.
Murray, Sarah. to appear. Evidentiality and questions in Cheyenne. In Suzi
Lima (ed.), Proceedings of SULA 5: Semantics of under-represented lan-
guages in the Americas, Amherst, MA: GLSA Publications.
Palmer, Frank. 2006. Mood and modality. Cambridge: Cambridge University
Press 2nd edn. doi:10.2277/0521804795.
Panzeri, Francesca. 2003. In the (indicative or subjunctive) mood. In Pro-
ceedings of Sinn und Bedeutung 7, http://ling.uni-konstanz.de/pages/
Peterson, Tyler. 2009. The ordering source and graded modality in Gitksan
epistemic modals. Ms., University of British Columbia. http://www.
Peterson, Tyler. 2010. Epistemic modality and evidentiality in Gitksan at the
semantics-pragmatics interface: University of British Columbia disserta-
tion. http://hdl.handle.net/2429/23596.
Portner, Paul. 1997. The semantics of mood, complementation and
conversational force. Natural Language Semantics 5(2). 167–212.
Portner, Paul. 2003. The semantics of mood. In Lisa Cheng & Rint Sybesma
(eds.), The second Glot international state-of-the-article book, 47–77. Berlin:
Mouton de Gruyter.
Portner, Paul. 2004. The semantics of imperatives within a theory of clause
types. In Proceedings of SALT XIV, Cornell University: CLC Publications.
Portner, Paul. 2007. Imperatives and modals. Natural Language Semantics
15(4). 351–383. doi:10.1007/s11050-007-9022-y.
Portner, Paul. 2009. Modality (Oxford Surveys in Semantics and Pragmatics).
Oxford: Oxford University Press.
Potts, Christopher. 2005. The logic of conventional implicatures. Oxford:
Oxford University Press.
Quer, Josep. 1998. Mood at the interface. The Hague: Holland Academic
Quer, Josep. 2001. Interpreting mood. Probus 13(1). 81–111.
Quer, Josep. 2009. Twists of mood: The distribution and interpre-
tation of indicative and subjunctive. Lingua 119(12). 1779–1787.


Rivero, María. 1975. Referential properties of Spanish noun phrases. Language
51(1). 32–48. doi:10.2307/413149.
Rocci, Andrea. 2007. Epistemic modality and questions in dialogue. the
case of Italian interrogative constructions in the subjunctive mood. In
L. de Saussure, J. Moeschler & G. Puska (eds.), Tense, mood and aspect:
Theoretical and descriptive issues, 129–153. Amsterdam and New York:
Rullmann, Hotze, Lisa Matthewson & Henry Davis. 2008. Modals as
distributive indefinites. Natural Language Semantics 16(4). 317–357.
Schwager, Magdalena. 2005. Interpreting imperatives: University of Frank-
furt/Main dissertation.
Schwager, Magdalena. 2006. Conditionalized imperatives. In Proceedings of
SALT XVI, Cornell University: CLC Publications. http://ecommons.library.
Schwager, Magdalena. 2008. Optimizing the future - imperatives between
form and function. Course notes, ESSLLI 2008. http://zis.uni-goettingen.
Stalnaker, Robert. 1974. Pragmatic presuppositions. In Milton Munitz & Peter
Unger (eds.), Semantics and Philosophy, 197–214. New York University
Stalnaker, Robert. 1984. Inquiry. Cambridge, MA: MIT Press.
Tenny, Carol. 2006. Evidentiality, experiencers and the syntax of sen-
tience in Japanese. Journal of East Asian Linguistics 15(3). 245–288.
Tenny, Carol & Peggy Speas. 2004. The interaction of clausal syntax, discourse
roles and information structure in questions. Paper presented at the Work-
shop on Syntax, Semantics and Pragmatics of Questions. ESSLLI, Université
Henri Poincaré, Nancy. http://www.linguist.org/ESSLI-Questions-hd.pdf.
Terrell, Tracy & Joan Hooper. 1974. A semantically based analysis of mood in
Spanish. Hispania 57(3). 484–494. doi:10.2307/339187.
Thoma, Sonja. 2007. The categorical status of independent pronouns in
St’át’imcets. Ms., University of British Columbia.
Villalta, Elisabeth. 2009. Mood and gradability: an investigation of the
subjunctive mood in Spanish. Linguistics and Philosophy 31(4). 467–522.
Whitley, Rose (translator), Henry Davis, Lisa Matthewson & Beveley Frank
(editors). no date. Teaching St’át’imcets Through Action. Translation of
Bertha Segal Cook Teaching English Through Action. Upper St’át’imcets
Language, Culture and Education Society.

Lisa Matthewson
UBC Department of Linguistics
Totem Field Studios
2613 West Mall
Vancouver, BC, V6T 1Z4, Canada

Semantics & Pragmatics Volume 3, Article 10: 1–38, 2010
doi: 10.3765/sp.3.10

Free choice permission as resource-sensitive reasoning∗

Chris Barker
New York University

Received 2009-10-14 / First Decision 2009-11-24 / Revised 2010-07-04 / Accepted
2010-08-14 / Final Version Received 2010-08-31 / Published 2010-09-01

Abstract Free choice permission is a long-standing puzzle in deontic logic
and in natural language semantics. It involves what appears to be a conjunc-
tive use of or: from You may eat an apple or a pear, we can infer that You
may eat an apple and that You may eat a pear — though not that You may
eat an apple and a pear. Following Lokhorst (1997), I argue that because
permission is a limited resource, a resource-sensitive logic such as Girard’s
Linear Logic is better suited to modeling permission talk than, say, classical
logic. A resource-sensitive approach enables the semantics to track not only
that permission has been granted and what sort of permission it is (i.e.,
permission to eat apples versus permission to eat pears), but also how much
permission has been granted, i.e., whether there is enough permission to
eat two pieces of fruit or only one. The account here is primarily semantic
(as opposed to pragmatic), with no special modes of composition or special
pragmatic rules. The paper includes an introduction to Linear Logic.

Keywords: Free choice, permission, linear logic, deontic, implicature,
resource-sensitive, substructural

∗ Thanks to Simon Charlow, Emmanuel Chemla, Cleo Condoravdi, Judith Degen, Nicholas
Fleisher, Sven Lauer, Koji Mineshima, Paul Portner, Daniel Rothschild, Philippe Schlenker,
Chung-chieh Shan, Seth Yalcin, and my anonymous referees.

©2010 Chris Barker

This is an open-access article distributed under the terms of a Creative Commons Non-
Commercial License (creativecommons.org/licenses/by-nc/3.0).

1 The resource-sensitivity of permission talk

Since Ross 1941, it has been clear that the logic of obligation and permission
behaves dramatically differently than other sorts of ordinary reasoning:

(1) a. You may eat an apple or a pear.
b. You may eat an apple.
c. You may eat a pear.

If (1a) is true, then it is certainly true that you may eat an apple. Likewise, it
is equally true that you have it within your power to safely eat a pear. So an
adequate account of the meaning of (1a) must explain how it comes to imply
(1b) and (1c).
This pattern is by no means the usual case. Consider a variation on (1) in
which the permissive modal may is omitted:

(2) a. You ate an apple or a pear.
b. You ate an apple.
c. You ate a pear.

In this case, (2a) certainly does not imply either (2b) or (2c). So something
about permission talk correlates with the unusual implications we are con-
cerned with here.
The puzzle posed by the facts in (1) is known as the free choice permission
problem (Kamp (1973) attributes the choice of name to von Wright).
Since (1a) implies both (1b) and (1c), (1b) and (1c) are therefore both equally
true. Thus in many discussions, (1a) is said to imply (3a), since (3a) is merely
the conjunction of (1b) and (1c):

(3) a. You may eat an apple and you may (also) eat a pear.
b. You may eat an apple or you may (*also) eat a pear.

Crucially, however, (3a) has an interpretation on which it furnishes permission
to eat more than one piece of fruit. This interpretation is the one compatible
with adding also in the second conjunct. Now, although (1a) may be consistent
with a situation in which the addressee is allowed to eat more than one piece
of fruit (as we will see below), the truth of (1a) alone is never sufficient to
guarantee that more than one piece of fruit may be eaten. As a result, (3b) is
a better candidate for a paraphrase of (1a): it, too (surprisingly!) implies (1b)
and (1c), but, like (1a), it does not ever justify eating more than one piece of
fruit. This is why also is never appropriate in the second disjunct in (3b) on
the intended reading.
What I am suggesting is that a complete characterization of permission
sentences must not only tell us whether permission exists and what type of
permission it is (i.e., permission to eat an apple versus permission to eat a
pear), it must also characterize how much permission has been granted. Thus
it must predict that (1a) and (3b) guarantee permission only to eat one piece
of fruit, but that (3a) can be used to provide permission to eat two pieces of
fruit.
The key insight that I would like to develop in this paper first appears,
as far as I know, in unpublished work of Lokhorst (1997): that permis-
sion and obligation form a resource-sensitive domain, so that logics based on
(resource-insensitive) classical logic are not appropriate. Lokhorst suggests
using Girard’s (1987) Linear Logic instead, and I will follow the technical
details of his proposal closely. The contribution of this paper will be to
introduce Lokhorst’s work to a linguistic audience, to evaluate it with respect
to competing linguistic analyses, and to investigate the implications of adapt-
ing Lokhorst’s proposal for the theory of natural language semantics and
pragmatics.
Resource-sensitive (‘substructural’) logics are already familiar in linguis-
tics as tools for building syntax/semantics interfaces (e.g., Moortgat 1997
or Dalrymple 2001). As far as I know, however, no one has yet suggested
that natural language connectives such as or or and can have uses in which
they behave semantically like connectives in a substructural logic, as I am
suggesting here.
Kamp (1973, 1978) discusses free choice permission not just as a puzzle
for modeling reasoning about obligation (deontic logic), but as a puzzle
for the composition of natural language expressions. From the point of
view of natural language semantics, the interesting thing about the free
choice permission problem is that it appears to require not only making
assumptions about the meaning of certain uses of modal expressions such as
may, but about the meaning of the corresponding uses of the coordinating
conjunctions and and or. This will be true of the solution I offer below.
Many solutions to the free choice permission problem rely on pragmatic
mechanisms for much of the heavy lifting, including Kamp 1978, Zimmer-
mann 2000, Fox 2007, and others. The arguments that free choice implica-
tions are pragmatic, and more specifically are scalar implicatures, stem from
discussions of indefinites in Kratzer and Shimoyama 2002, as developed by

Chris Barker

Alonso-Ovalle (2006) and Fox (2007). The main evidence that free choice
implications may be scalar implicatures turns on the behavior of negated
permission sentences (You may not eat an apple or a pear); I show how the
analysis here can explain the behavior of such sentences in section 5.
In contrast to the pragmatic approaches, I will argue that the main free
choice implications, including especially the implications from (1a) to (1b)
and to (1c), are matters of entailment. To the extent that the analysis here
is viable, it calls into question whether free choice implications are indeed
implicatures. I discuss other entailment approaches (e.g., Aloni 2007) in
section 6.2.

2 Classical logic versus Linear Logic

The account of free choice given below will depend on understanding the
basics of Linear Logic at a fairly deep level. Since Linear Logic is unfamiliar
to most semanticists, this section will present the basics of Linear Logic.

2.1 Classical logic

I will only introduce the elements of classical logic that will be relevant for
comparison with Linear Logic in the discussion below. This will include
conjunction, disjunction, negation, and Weakening, but not, for example,
Formulas. There is a set of atomic formulas a, b, c, . . . , and a set of
variables over formulas A, B, C, . . . . Assume A and B are formulas. Then the
classical negation of A, written ¬A, is a formula; the classical conjunction of
A and B, written A ∧ B, is a formula; and the classical disjunction of A and B,
written A ∨ B, is a formula. In addition, the classical implication of A and B,
written A → B, is defined as an abbreviation of (¬A) ∨ B.
Sequents. A sequent A, B, . . . , M ⊢ N, O, . . . , Z consists of two multisets
of formulas joined by a turnstile (‘⊢’). Classical sequents are interpreted as
asserting that whenever all of the formulas in the leftmost multiset hold,
then at least one of the formulas in the rightmost multiset must also hold.
Saying that a sequent contains multisets rather than lists of formulas means
that the order in which formulas are written is immaterial. Thus A, B and
B, A represent the same multiset, but A, B is a different multiset than A, A, B,
since the second multiset contains two instances of the formula A.
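
The multiset behavior just described can be illustrated with a short Python sketch. Modeling one side of a sequent with `collections.Counter`, and writing formulas as plain strings, are assumptions made purely for illustration.

```python
from collections import Counter

# One side of a sequent, modeled as a multiset of formula names
# (an illustrative encoding, not part of the logic itself).
ab = Counter(["A", "B"])
ba = Counter(["B", "A"])
aab = Counter(["A", "A", "B"])

# Order is immaterial: A, B and B, A are the same multiset.
same_up_to_order = (ab == ba)

# Multiplicity matters: A, B differs from A, A, B.
differs_by_count = (ab != aab)

print(same_up_to_order, differs_by_count)
```

A plain Python `set` would not do here, since it would collapse the two instances of A in A, A, B into one.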


Capital Greek letters (∆, Γ, . . . ) schematize over (possibly empty) multisets
of formulas. The turnstile can occur in any position, and there can be more
than one formula on the right hand side, so that the expression ‘∆ ⊢ A, B’,
the expression ‘∆ ⊢’, and the expression ‘⊢ ∆’ are all legitimate sequents.
Negation. The following pair of inference rules characterize classical
negation:

\[
\frac{\Delta, A \vdash \Gamma}{\Delta \vdash \neg A, \Gamma}\;\neg_1
\qquad
\frac{\Delta \vdash A, \Gamma}{\Delta, \neg A \vdash \Gamma}\;\neg_2
\]

Beginning with ¬1, the inference rule on the left: if Γ follows from the
formulas in ∆ along with A (this is what the sequent above the horizontal
line expresses), then from ∆ alone we can conclude that either some member
of Γ is still true, or else A must be false (the sequent below the horizontal
line). Similar reasoning applies for the inference rule on the right, ¬2.
Proofs. A proof that a sequent is valid begins with trivial tautologies,
here, that A ⊢ A:

\[
\dfrac{\dfrac{A \vdash A}{\vdash \neg A, A}\;\neg_1}{\neg\neg A \vdash A}\;\neg_2
\]

As long as each subsequent inference step instantiates a valid inference rule,
the proof guarantees that the final sequent will also be valid. A sequent at
the bottom of such a proof is called a theorem of the logic.
Reading from top to bottom, the first step of the proof here is an in-
stantiation of the inference rule ¬1. This step concludes that either A or its
negation must be true (a version of the law of excluded middle); the second
step (labeled ¬2) proves that two adjacent negations cancel out (the law of
double negation). Proving that A ⊢ ¬¬A is equally easy.
Conjunction. The inference characterizing classical conjunction has two
premises:

\[
\frac{\Delta \vdash A \qquad \Delta \vdash B}{\Delta \vdash A \wedge B}\;\wedge
\]

If the assumptions in ∆ allow you to prove that A is true (i.e., if ∆ ⊢ A), and
the very same set of assumptions also allow you to prove that B is true, then
you are certainly in a position to assert that the classical conjunction of A
and B must be true.
Disjunction. For disjunction, we have a matched pair of inferences:

\[
\frac{\Delta \vdash A}{\Delta \vdash A \vee B}\;\vee_1
\qquad
\frac{\Delta \vdash B}{\Delta \vdash A \vee B}\;\vee_2
\]


If the assumptions in ∆ allow you to prove that some proposition A is true,
you can conclude that the classical disjunction of A and B is true. After all,
if you know that Ann arrived, then you know that either Ann arrived or Bill
arrived. The reason we need a pair of rules is that disjunction is symmetric,
i.e., we are free to add the new disjunct either on the left or on the right.
The classical duality of conjunction and disjunction. The following
equivalences hold:

(4) a. ¬¬A ≡ A
b. ¬(A ∧ B) ≡ ¬A ∨ ¬B
c. ¬(A ∨ B) ≡ ¬A ∧ ¬B

The last two (DeMorgan’s laws) express the logical interrelationship between
disjunction and conjunction. These equivalences can be thought of as bi-
directional inference rules. In any case, I will freely replace formulas with
forms deemed equivalent by (4).
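
The equivalences in (4) can be checked mechanically by running through all truth-value assignments. The following Python sketch does so; the helper name `equivalent` is hypothetical and introduced only for this illustration.

```python
from itertools import product

def equivalent(f, g, arity):
    """True if f and g agree on every assignment of truth values."""
    return all(f(*vals) == g(*vals)
               for vals in product([False, True], repeat=arity))

# (4a): double negation
double_negation = equivalent(lambda a: not (not a), lambda a: a, 1)

# (4b): not (A and B)  is equivalent to  (not A) or (not B)
de_morgan_and = equivalent(lambda a, b: not (a and b),
                           lambda a, b: (not a) or (not b), 2)

# (4c): not (A or B)  is equivalent to  (not A) and (not B)
de_morgan_or = equivalent(lambda a, b: not (a or b),
                          lambda a, b: (not a) and (not b), 2)

print(double_negation, de_morgan_and, de_morgan_or)
```

This is a semantic check over truth tables, not a sequent derivation; it confirms only that the two sides of each equivalence never differ in truth value.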
Weakening. Weakening allows assumptions to be discarded.

\[
\frac{\Delta \vdash \Gamma}{\Delta, A \vdash \Gamma}\;\text{Weak}
\]

If Γ follows from ∆, then Γ certainly still follows if A also happens to be true,
no matter what A happens to express. The assumption A is gratuitous, but
harmless. Weakening allows us to pick and choose among evidence as we
focus on different parts of an argument.
Implication as a form of disjunction. Recall that in the definitions of
well-formed formulas, we defined classical implication A → B as an abbrevia-
tion of ¬A ∨ B. The inference rule that characterizes implication is Modus
Ponens, which says that A, A → B ⊢ B is valid. We can prove Modus Ponens
as follows. The main aspect of the proof that is relevant for comparison with
Linear Logic is the role of Weakening.

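\[
\dfrac{\dfrac{\dfrac{\dfrac{A \vdash A}{A, \neg B \vdash A}\;\text{Weak}
\qquad
\dfrac{\neg B \vdash \neg B}{\neg B, A \vdash \neg B}\;\text{Weak}}
{\neg B, A \vdash A \wedge \neg B}\;\wedge}
{A, \neg(A \wedge \neg B) \vdash \neg\neg B}\;\neg_1, \neg_2}
{A, A \to B \vdash B}\;\equiv
\]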
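
The classical reading of sequents given earlier (whenever everything on the left holds, something on the right holds) can be tested by brute force over valuations. The sketch below is a semantic check under that reading, not an implementation of the proof rules; the function name `valid` and the encoding of formulas as Python functions are assumptions for illustration.

```python
from itertools import product

def valid(left, right, atoms=("A", "B")):
    """Classical reading of a sequent: on every valuation where all
    formulas on the left hold, at least one formula on the right holds."""
    for values in product([False, True], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(f(v) for f in left) and not any(g(v) for g in right):
            return False
    return True

A = lambda v: v["A"]
B = lambda v: v["B"]
not_B = lambda v: not v["B"]
A_implies_B = lambda v: (not v["A"]) or v["B"]   # A -> B as (not A) or B

modus_ponens = valid([A, A_implies_B], [B])           # A, A -> B |- B
weakened_axiom = valid([A, not_B], [A])               # A, not B |- A
excluded_middle = valid([], [lambda v: not v["A"], A])  # |- not A, A
affirming_consequent = valid([B, A_implies_B], [A])   # not a valid sequent

print(modus_ponens, weakened_axiom, excluded_middle, affirming_consequent)
```

The second check corresponds to the Weakening steps in the proof: adding the gratuitous assumption ¬B to A ⊢ A cannot make the sequent invalid.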


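Wadler (1993) uses classical modus ponens in the following proof to
emphasize the differences between classical logic and Linear Logic:

\[
\dfrac{\dfrac{A \vdash A}{A, A \to B \vdash A}\;\text{Weak}
\qquad
\begin{array}{c}[\text{see previous proof}]\\ A, A \to B \vdash B\end{array}}
{A, A \to B \vdash A \wedge B}\;\wedge
\]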

Weakening allows us to make use of assumption A twice: once to justify the
left conjunct of the conclusion, and once to support modus ponens in order
to derive the right conjunct of the conclusion. We will see that Linear Logic
requires careful accounting: each assumption can be used exactly once, so
this proof will not go through.
Finally, completing the ¬, ∧, ∨ fragment of classical logic requires Con-
traction: from ∆ ⊢ A, A, infer ∆ ⊢ A. In Linear Logic, Contraction is also
rejected, but Contraction does not play a role in the exposition here.

2.2 Linear Logic

Formulas. Once again there is a set of atomic formulas a, b, c, . . . , and a
set of variables over formulas A, B, C, . . . . However, since none of the Linear
Logic connectives mean what their classical counterparts mean, Linear Logic
uses a completely distinct set of connective symbols. Assume A and B are
formulas. Then the linear negation of A, written A⊥, is a formula; the additive
conjunction of A and B, written A & B (pronounced “A with B”), is a formula;
the multiplicative conjunction of A and B, written A ⊗ B (pronounced “A
times B”), is a formula; the additive disjunction of A and B, written A ⊕ B
(pronounced “A plus B”), is a formula; and the multiplicative disjunction of
A and B, written A ⅋ B (pronounced “A par B”), is a formula. (Many things in
natural language semantics are called ‘additive’. The Linear Logic notions of
‘additive’ and ‘multiplicative’ do not line up with any of them.) In parallel
with the definition of classical implication above, linear implication, written
A ⊸ B (pronounced “A lollipop B”), is defined as an abbreviation for A⊥ ⅋ B.
Sequents. A sequent ∆ ⊢ Γ says that whenever the multiplicative con-
junction of ∆ holds, then the multiplicative disjunction of Γ must hold.
Fragment of Linear Logic for the free choice permission problem. Fig-
ure 1 displays the complete set of rules of Linear Logic that we will use in the
discussion of the free choice permission problem.


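\[
\frac{\Delta, A \vdash \Gamma}{\Delta \vdash A^{\bot}, \Gamma}\;\bot_1
\qquad
\frac{\Delta \vdash A, \Gamma}{\Delta, A^{\bot} \vdash \Gamma}\;\bot_2
\]

\[
\begin{array}{ll}
A \multimap B \equiv A^{\bot} \parr B & \\
A^{\bot\bot} \equiv A & \\
(A \mathbin{\&} B)^{\bot} \equiv A^{\bot} \oplus B^{\bot} & (A \otimes B)^{\bot} \equiv A^{\bot} \parr B^{\bot} \\
(A \oplus B)^{\bot} \equiv A^{\bot} \mathbin{\&} B^{\bot} & (A \parr B)^{\bot} \equiv A^{\bot} \otimes B^{\bot}
\end{array}
\]

\[
\frac{\Delta \vdash A \qquad \Delta \vdash B}{\Delta \vdash A \mathbin{\&} B}\;\&
\qquad
\frac{\Delta \vdash A \qquad \Gamma \vdash B}{\Delta, \Gamma \vdash A \otimes B}\;\otimes
\]

\[
\frac{\Delta \vdash A}{\Delta \vdash A \oplus B}\;\oplus_1
\qquad
\frac{\Delta \vdash B}{\Delta \vdash A \oplus B}\;\oplus_2
\qquad
\frac{\Delta \vdash A, B}{\Delta \vdash A \parr B}\;\parr
\]

Figure 1 Fragment of Linear Logic for FCP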


Linear conjunction and disjunction. The rules for & and ⊕ (the ‘additive’
connectives) look exactly like the classical rules for ∧ and ∨, except for the
substitution of & for ∧ and of ⊕ for ∨. However, as a result of how they
interact with the rest of the logic, the linear logic additives behave differently
from their classical counterparts. For instance, the law of the excluded
middle is valid for classical disjunction: ⊢ (¬A) ∨ A. In Linear Logic, the law
of excluded middle is not valid for additive disjunction, despite the fact that
the inference rule for additive disjunction has the same form as the inference
rule for classical disjunction: ⊬ A⊥ ⊕ A. However, the excluded middle is
valid for multiplicative disjunction (⊢ A⊥ ⅋ A).
Linear negation. We have direct analogs to the classical rules for pushing
a formula across the turnstile, namely, ⊥1 and ⊥2. Since we now have two
kinds of conjunctions and two kinds of disjunctions, there are more duality
equivalences; however, each conjunction is still dual to a disjunction, and
vice versa.
Linear implication. Once again, we have defined implication in terms of
disjunction. Now, interestingly, we can prove the linear version of Modus
Ponens without using Weakening (which is a good thing, since Weakening is
not allowed in Linear Logic):
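\[
\dfrac{\dfrac{\dfrac{A \vdash A \qquad B^{\bot} \vdash B^{\bot}}
{A, B^{\bot} \vdash A \otimes B^{\bot}}\;\otimes}
{A, (A \otimes B^{\bot})^{\bot} \vdash B^{\bot\bot}}\;\bot_1, \bot_2}
{A, A \multimap B \vdash B}\;\equiv
\]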
Because the inference rule for ⊗ splits up the resources (that is, the formulas)
into those used to prove A and those used to prove B, there is no need to
ignore gratuitous assumptions via Weakening.
If we try to reproduce Wadler’s classical proof from the previous section,
we’re out of luck:
\[
\frac{?? \vdash A \qquad ?? \vdash B}{A, A \multimap B \vdash A \otimes B}\;\otimes
\]
We could take some of the resources to the left of the turnstile to prove A,
and we could take some (actually, we would need all) of the resources to
prove B, but no matter how we divide up the left-hand formulas, we’ll fall
short of proving one or the other of the conjuncts. Linear Logic requires
strict accounting of assumptions, and we can’t make use of A twice, the way
we could in the classical proof.
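
The failure of the ⊗ step can be made concrete by enumerating every way of dividing the left-hand resources. The Python sketch below uses a tiny, deliberately incomplete checker: it knows only the axiom A ⊢ A and linear Modus Ponens for atomic goals, plus the splitting behavior of the ⊗ rule. The names `lolli`, `proves`, and `proves_tensor` are hypothetical, and the checker is an illustration of resource accounting, not a prover for Linear Logic.

```python
from collections import Counter
from itertools import product

def lolli(a, b):
    """Hypothetical encoding of the linear implication a -o b."""
    return ("-o", a, b)

def proves(resources, goal):
    """Tiny, deliberately incomplete checker for atomic goals.
    Every resource must be used exactly once."""
    res = Counter(resources)
    if res == Counter([goal]):                 # axiom: goal |- goal
        return True
    for f in list(res):
        if isinstance(f, tuple) and f[0] == "-o" and f[2] == goal:
            rest = res - Counter([f])          # spend the implication
            if proves(list(rest.elements()), f[1]):
                return True                    # linear modus ponens
    return False

def proves_tensor(resources, a, b):
    """Tensor rule: split the resources into one part that proves a
    and a disjoint part that proves b."""
    items = list(resources)
    for mask in product([0, 1], repeat=len(items)):
        left = [x for x, m in zip(items, mask) if m == 0]
        right = [x for x, m in zip(items, mask) if m == 1]
        if proves(left, a) and proves(right, b):
            return True
    return False

modus_ponens = proves(["A", lolli("A", "B")], "B")
one_copy = proves_tensor(["A", lolli("A", "B")], "A", "B")
two_copies = proves_tensor(["A", "A", lolli("A", "B")], "A", "B")

print(modus_ponens, one_copy, two_copies)
```

With a single copy of A, no division of the resources supports both conjuncts; supplying a second copy of A lets one copy prove the left conjunct while the other feeds Modus Ponens.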


2.3 Choice

Since free choice permission is about making choices, what does Linear Logic
have to say about choice?
The critical connectives will be the additive conjunction ‘&’ and its (also
additive) disjunctive dual, ‘⊕’. The relevant inference rules are repeated here:
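\[
\frac{\Delta \vdash A \qquad \Delta \vdash B}{\Delta \vdash A \mathbin{\&} B}\;\&
\qquad
\frac{\Delta \vdash A}{\Delta \vdash A \oplus B}\;\oplus_1
\qquad
\frac{\Delta \vdash B}{\Delta \vdash A \oplus B}\;\oplus_2
\]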
Imagine yourself in the role of the prover. Then the assumptions on the left
of the turnstile are what your environment gives you to work with, and the
conclusion on the right of the turnstile is what you return as the result of
your labors (perhaps to be used as an assumption in a larger proof).
So here is what the & inference says: if the resources in ∆ allow you to
provide A, and if the same resources allow you to provide B, then you can
certainly offer to provide either A or B. Furthermore, since you are prepared
to provide either alternative, you can leave the choice up to whoever might
be interested in making use of the conclusion. Thus & conjoins two equally
viable alternatives.
Though both alternatives are equally viable, the consumer is forced to
choose between them. For instance, imagine that ∆ contains a certain amount
of sugar and a certain number of eggs. Using the resources provided, you
can construct either a meringue or else an angel food cake, but you don’t
have enough ingredients to cook both. Being as flexible and gracious as
possible, you offer “meringue & cake” for dessert, and you let your guest
choose. Tellingly, “meringue & cake” is pronounced “meringue or cake” in
idiomatic English (this is a point that we will return to in section 7.3).
In the context of granting permission, the consumer is the entity to which
permission has been granted: we shall see that (unembedded) & corresponds
to free choice on the part of the entity given permission.
Continuing our investigation of choice in Linear Logic, consider the ⊕1
inference rule: if the resources in ∆ allow you to provide A, then you
can certainly offer to provide A ⊕ B, as long as you remain in control
of which of the alternatives is chosen. You may only know how to make
one dessert, perhaps. You can truthfully promise that dessert will either be
meringue or else Baked Alaska, although you know in advance that it will
have to be meringue. (Analogously, with the roles reversed, for ⊕2.)
In the context of granting permission, offering A ⊕ B does not give the
grantee free choice.
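The dessert scenario can be sketched in a few lines of code. The recipes and quantities below are invented purely for illustration, and `offer_with` and `offer_plus` are hypothetical names standing in for the two additive connectives: the first returns the options the consumer may choose between, the second the options the producer could actually deliver.

```python
from collections import Counter

# Hypothetical recipes and quantities, chosen only for illustration.
RECIPES = {
    "meringue": Counter(sugar=1, eggs=3),
    "angel food cake": Counter(sugar=1, eggs=3),
    "baked alaska": Counter(sugar=1, eggs=3, ice_cream=1),
}

def can_make(pantry, dishes):
    """Check whether the pantry covers all the listed dishes at once."""
    needed = Counter()
    for dish in dishes:
        needed += RECIPES[dish]
    return all(pantry[item] >= amount for item, amount in needed.items())

def offer_with(pantry, a, b):
    """A & B: each dish must be makeable from the same pantry, one at a time.

    Returns the options the consumer may choose between (empty if no offer).
    """
    if can_make(pantry, [a]) and can_make(pantry, [b]):
        return [a, b]
    return []

def offer_plus(pantry, a, b):
    """A ⊕ B: the producer keeps the choice.

    Returns the dishes the producer could deliver; the offer is legitimate
    as long as this list is not empty.
    """
    return [dish for dish in (a, b) if can_make(pantry, [dish])]

pantry = Counter(sugar=1, eggs=3)
guest_options = offer_with(pantry, "meringue", "angel food cake")
both_at_once = can_make(pantry, ["meringue", "angel food cake"])
cook_options = offer_plus(pantry, "meringue", "baked alaska")
```

With this pantry the guest may pick either dessert, the two desserts cannot both be made, and the ⊕ offer is honest even though only the meringue can be delivered.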

Free choice permission as resource-sensitive reasoning

In order to complete the picture of the dualities of & and ⊕, we must
consider what happens on the other side of the turnstile. Hopping across the
turnstile involves negation, which exchanges & for ⊕ (and vice versa).

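\[
\dfrac{A \vdash \Delta}{A \mathbin{\&} B \vdash \Delta}
\qquad
\dfrac{B \vdash \Delta}{A \mathbin{\&} B \vdash \Delta}
\qquad
\dfrac{A \vdash \Delta \qquad B \vdash \Delta}{A \oplus B \vdash \Delta}
\]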

These rules follow from the official inference rules by applications of ⊥1 and
⊥2.
If A alone is enough to enable you to provide ∆, then if someone promises
you A & B, you can certainly commit to providing ∆: just select A when they
give you your choice. (Similarly for the other rule introducing & on the left of
the turnstile.)
Finally, if having A is enough for you to be able to offer ∆, and if having
B is likewise enough for you to be able to offer ∆, then you’re in a position to
promise ∆ even if all you can count on is A ⊕ B. All you know is that you’ll
get either an A or a B, and that which one you get will be someone else’s
choice. However, since you are prepared to cope with either possibility, you
can commit to providing ∆.
The bottom line is that & and ⊕ are two perspectives on a single choice,
differing only in who has the power to make the selection: & provides two
equally legitimate alternatives, but forces an unconstrained (free) choice
between them; ⊕ also provides two alternatives, but reserves the choice for
whoever is providing the resource.

3 Strong permission versus weak permission

Standard deontic logics introduce unary modalities representing obligation
(□) and permission (♦), and add axioms that characterize an appropriate set of
entailments, usually including at least K and D, though there is considerable
variation; see McNamara 2006 or Portner 2009a for an introduction to deontic
logic. Lokhorst (1997) chooses instead a strategy attributed independently
to Anderson and to Kanger called deontic reduction. Deontic reduction
depends on a special proposition δ (pronounced “yay”), glossed as ‘the good
thing’, or ‘all things are as required’. Thus δ is roughly analogous to Kratzer’s
(e.g., 1991) notion of an ordering source, that is, the set of propositions that
characterize how things ought to be.
Then A is obligatory iff δ ⊸ A: if A follows from the state where all things
are as required, then A is required. Dually, a weak version of permission
is often defined as (δ ⊸ A⊥)⊥: if the negation of A is not obligatory, then
A is at least not forbidden. However, there is a difference between weak
permission, which is the absence of prohibition, and strong permission, i.e.,
a permissive norm (as discussed in, e.g., Hansen et al. 2007), which is the
assertion that some action is explicitly ok.
Lokhorst (1997) renders strong permission as A ⊸ δ. Viewed from the
linguistics tradition, it is not so easy to make sense out of this as a statement
of permission (as discussed in Portner 2009a:60). It is important to bear in
mind that the ‘strong’ part of ‘strong permission’ does not mean that merely
eating an apple will guarantee that everything is ok, no matter what else
happens. If only permission could be that strong! Rather, the difference
between ‘weak’ and ‘strong’ here is the difference between a system in which
we have only obligation and its negation (in which everything that is not
forbidden is permitted), and a more articulated system in which some things
are permitted (A ⊸ δ), some things are forbidden ((A ⊸ δ)⊥), and some
things are neither permitted nor forbidden. If I explicitly give you permission
to eat an apple, and I explicitly forbid you to eat a pear, what about eating a
banana? Is it permitted or forbidden? Maybe yes, maybe no.
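For reference, the four deontic notions introduced so far can be collected in one place. The unfolded forms on the right do not appear in the discussion above; they are a sketch that assumes the definition of linear implication as A⊥ ⅋ B (written `\parr` below, as in the cmll package) together with the duality between ⊗ and ⅋ under negation.

```latex
\begin{align*}
\text{obligation:}        \quad & \delta \multimap A
  && \equiv\ \delta^{\perp} \parr A \\
\text{weak permission:}   \quad & (\delta \multimap A^{\perp})^{\perp}
  && \equiv\ \delta \otimes A \\
\text{strong permission:} \quad & A \multimap \delta
  && \equiv\ A^{\perp} \parr \delta \\
\text{prohibition:}       \quad & (A \multimap \delta)^{\perp}
  && \equiv\ A \otimes \delta^{\perp}
\end{align*}
```

In the banana case, neither of the last two formulas has been asserted, which is why the question of whether eating a banana is permitted or forbidden stays open.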
There is not much discussion of weak permission versus strong per-
mission in the linguistics literature, but at least Asher and Bonevac (2005)
conclude that free choice permission involves strong permission. Certainly
if we want to distinguish between explicit permission and the absence of
prohibition, then we need a logic that can express strong permission. Since I
have claimed that You may eat an apple or a pear crucially neither permits
nor forbids eating both an apple and a pear, we must use strong permission.
But what exactly does A ⊸ δ assert, if not that eating an apple will
guarantee the good thing? The key is to consider when A ⊸ δ will be true.
We will be in a situation in which A ⊸ δ holds just in case eating an apple in
that situation is compatible (‘cotenable’ in the terminology of Relevant Logic)
with all obligations being fulfilled. There are two kinds of such situations:
situations in which eating an apple happens to be obligatory, in which case
we can only conform to obligations by eating the apple (after all, everything
that is obligatory is at least permitted); and situations in which we’re already
in compliance, but eating an apple is optional and does not disturb our happy
state. But if we are otherwise in compliance, and we decide to eat an apple
(A), and we decide to simultaneously kill the postman (K), the fact that apple
eating is permitted will not save us: because of the resource-sensitivity of
linear logic, in particular, the absence of Weakening, we can’t ignore the dead
postman. As a result, the combination of eating an apple and killing the
postman will land us in a situation that is far from ok: A, K, A ⊸ δ ⊬ δ.
A fuller understanding of linear implication, and therefore of strong
permission, will emerge from the model theory developed in section 8.
One major expository advantage of the reduction strategy is that it enables
us to talk about permission without complicating the logic with inference
rules for □ and ♦. Note that we do not necessarily give up anything by omitting
the unary connectives: McNamara (2006) and Lokhorst (2006) show that
under appropriate additional assumptions, deontic reduction characterizes
all the theorems of standard deontic modal logics.
Not that replicating standard deontic logic should be our goal; after all,
standard deontic logic has □A → ¬□¬A as a tautology, which imposes a kind
of consistency on the set of deontic obligations. In the linguistics tradition,
a number of people (notably Kratzer (1991)) have argued that this is not
appropriate for describing natural language modality, and that we should
instead allow for inconsistent laws. However, I’m not aware of any reason
why deontic reduction is incompatible with Kratzer’s characterization of
deontic modality.
I should note that deontic reduction is not an innocent choice for the
empirical phenomena under consideration here. As I will explain shortly,
because linear implication is defined as A ⊸ B ≡ A⊥ ⅋ B, the formula for
which permission is granted (i.e., A) occurs in a downward-entailing position.
This will be crucial in deriving the desired entailments. For all I know,
however, it is possible that if a suitable notion of strong permission were
defined in a standard deontic framework (i.e., one based on unary operators
like □), similar entailments would go through.
I intend for deontic reduction to be a convenient expository choice, and
not an essential feature of a resource-sensitive approach to free choice
permission. Nevertheless, there may be some empirical support for the
naturalness of deontic reduction. After all, in addition to being able to use
a modal verb to express permission and obligation, English can also deploy
a conditional: It’s ok if you eat ‘You may eat’. In fact, in Japanese there is
no modal verb that expresses permission, and permission normally can only
be conveyed by means of a conditional construction (Clancy 1985, Akatsuka
1992): tabe-temo ii ‘eat-even.if good’, ‘It’s ok if you eat’.

4 Free choice permission

We can now suppose that or has among its meanings ⊕, so that You may
eat an apple or⊕ a pear translates as (a ⊕ p) ⊸ δ: the additive disjunction
of a and p is explicitly permitted. Then the desired free-choice implication
follows directly from simple linear reasoning. Generalizing slightly by using
variables over formulas (A, B) instead of atomic formulas (a, p), we have:

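\[
% \parr is the multiplicative disjunction (for example, from the cmll package)
\dfrac{
  \dfrac{
    \dfrac{
      \dfrac{
        \dfrac{\vdash A,\ A^{\perp}}{\vdash A \oplus B,\ A^{\perp}}\,{\oplus_1}
        \quad \vdash \delta^{\perp},\ \delta
      }{\vdash (A \oplus B) \otimes \delta^{\perp},\ A^{\perp},\ \delta}\,{\otimes}
    }{\vdash (A \oplus B) \otimes \delta^{\perp},\ A^{\perp} \parr \delta}\,{\parr}
    \qquad
    \dfrac{
      \dfrac{
        \dfrac{\vdash B,\ B^{\perp}}{\vdash A \oplus B,\ B^{\perp}}\,{\oplus_2}
        \quad \vdash \delta^{\perp},\ \delta
      }{\vdash (A \oplus B) \otimes \delta^{\perp},\ B^{\perp},\ \delta}\,{\otimes}
    }{\vdash (A \oplus B) \otimes \delta^{\perp},\ B^{\perp} \parr \delta}\,{\parr}
  }{\vdash (A \oplus B) \otimes \delta^{\perp},\ (A^{\perp} \parr \delta) \mathbin{\&} (B^{\perp} \parr \delta)}\,{\&}
}{(A \oplus B) \multimap \delta \vdash (A \multimap \delta) \mathbin{\&} (B \multimap \delta)}\,{\perp_2, \equiv}
\]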

This theorem is noted in Lokhorst 1997:6.¹

What the speaker provides when she utters You may eat an apple or⊕ a
pear is justification for assuming either that eating an apple is permitted,
or that eating a pear is permitted. She is not providing enough resources
to prove both, so if her utterance is to provide the justification for action,
a choice must be made. However, since the resources allow proof of either
alternative, the consumer is free to choose whichever of the alternatives he
prefers. That is how the addressee has permission to eat an apple, or else
permission to eat a pear, but normally (and certainly not by virtue of the
utterance of (1a)) does not have permission to eat two pieces of fruit.
This result depends on only two assumptions: that or can express ad-
ditive disjunction, and that it is reasonable to represent strong permission
using the deontic reduction strategy. The assumption that or can express
additive disjunction is essential, and is the heart of the explanation offered
here. Deontic reduction is a well-established approach to deontic logic moti-
vated entirely independently of any concern with the free choice permission
problem. Whether it can be replaced with a modal system more familiar to
linguists (if desired) remains for future work.
1 Strictly speaking, since the inference rules given above in section 2.2 are written with a single
formula on the right-hand side, many of the steps given in this proof (for example, the ⊕1
inference) require shuffling extra formulas across the turnstile, applying the inference rule
of interest, then shuffling them all back.

It is worth emphasizing that the basic free choice meaning is purely
semantic, without requiring any silent pragmatically-triggered type shifting
operators (as in, e.g., Fox 2007), or other pragmatic enrichment.
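The free choice theorem of this section can also be checked mechanically. The following is a minimal sketch of a brute-force proof search for the one-sided sequent calculus of the multiplicative-additive fragment (no units, no exponentials, identity on atoms only). All function names are hypothetical, and the encoding of a two-sided sequent Γ ⊢ C as ⊢ Γ⊥, C is an assumption of the sketch rather than the presentation used in section 2.

```python
def atom(name):
    return ("atom", name, True)

def neg(f):
    """Linear negation, pushed down to the atoms by the De Morgan dualities."""
    if f[0] == "atom":
        return ("atom", f[1], not f[2])
    dual = {"tensor": "par", "par": "tensor", "with": "plus", "plus": "with"}
    return (dual[f[0]], neg(f[1]), neg(f[2]))

def tensor(a, b): return ("tensor", a, b)
def par(a, b): return ("par", a, b)
def with_(a, b): return ("with", a, b)
def plus(a, b): return ("plus", a, b)
def lolli(a, b): return par(neg(a), b)  # A ⊸ B is defined as A⊥ ⅋ B

def provable(seq):
    """Backward proof search on a one-sided sequent (a list of formulas)."""
    seq = list(seq)
    # Axiom: exactly an atom and its negation, with nothing left over.
    if len(seq) == 2 and seq[0][0] == "atom" and seq[0] == neg(seq[1]):
        return True
    for i, f in enumerate(seq):
        rest = seq[:i] + seq[i + 1:]
        tag = f[0]
        if tag == "par":
            if provable(rest + [f[1], f[2]]):
                return True
        elif tag == "with":  # both branches get the same context
            if provable(rest + [f[1]]) and provable(rest + [f[2]]):
                return True
        elif tag == "plus":  # the prover picks one disjunct
            if provable(rest + [f[1]]) or provable(rest + [f[2]]):
                return True
        elif tag == "tensor":  # the context is split between the conjuncts
            n = len(rest)
            for mask in range(2 ** n):
                left = [rest[j] for j in range(n) if (mask >> j) & 1]
                right = [rest[j] for j in range(n) if not (mask >> j) & 1]
                if provable(left + [f[1]]) and provable(right + [f[2]]):
                    return True
    return False

def entails(assumptions, conclusion):
    return provable([neg(a) for a in assumptions] + [conclusion])

A, B, K, DELTA = atom("A"), atom("B"), atom("K"), atom("delta")
permission = lolli(plus(A, B), DELTA)
free_choice = entails([permission], with_(lolli(A, DELTA), lolli(B, DELTA)))
joint = entails([permission], lolli(tensor(A, B), DELTA))
postman = entails([A, K, lolli(A, DELTA)], DELTA)
```

In this sketch `free_choice` succeeds, while `joint` (permission to eat both pieces of fruit) and `postman` (the sequent A, K, A ⊸ δ ⊢ δ from section 3) both fail. The search is exponential and is meant only for formulas of this size.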

5 Prohibition

The behavior of permission under negation plays an important role in recent
discussions. As mentioned above, Alonso-Ovalle (2006) and Fox (2007) argue
that the fact that free-choice implications seem to disappear under negation
shows that free choice implications are likely to be implicatures. Since I
am claiming that the relevant free choice implications are entailments, it is
important to carefully examine negated cases.
Whatever is not permitted is forbidden: just as in English, Lokhorst
renders (strong) prohibition as negated (strong) permission. Thus if (A ⊸ δ)⊥,
then A is prohibited. (It is a well-known property of English that may
not is always construed with negation taking scope over may.)

(5) a. You may not eat this apple or this pear.
    b. You may not eat this apple.
    c. You may not eat this pear.

The main fact to be explained is that (5a) implies (perhaps entails) (5b) and
(5c). Unlike positive free choice implications, we can usually infer that (5b)
and (5c) hold simultaneously. That is, you cannot comply with (5a) by merely
refraining from eating apples. Apparently, permission is a scarce resource,
but prohibition is all too abundant. I will call this construal of (5a) the
double-prohibition reading, and I will suggest that it arises as a standard Gricean
implicature. As with most stories about scalar implicatures, we will be concerned with
the epistemic state of the discourse participants.

(6) a. You may not eat this apple or this pear.
    b. You may not eat this apple or you may not eat this pear.
    c. ((A ⊕ B) ⊸ δ)⊥ ⊢ (A ⊸ δ)⊥ ⊕ (B ⊸ δ)⊥

The translation of (6a) entails the translation of (6b) (that is, (6c) is a theorem),
so we predict that (6a) ought to have an interpretation on which it guarantees
that (6b) is true. Such an interpretation is widely attested in the literature,
and usually is described as favoring the continuation . . . but I don’t know
which. I’ll call this the ignorance reading.

Note, by the way, that if a forgetful babysitter utters (6) to the child she is
babysitting, and the child behaves rationally, he will not eat either piece of fruit,
since he can’t be sure which action is safe — exactly the same behavior as if
both actions had been explicitly forbidden.
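The claim that (6c) is a theorem can be spelled out as a short derivation. This is a sketch under two assumptions: that the left rules for & and ⊕ from section 2.3 may be applied with additional side formulas on the left, and that ⊸ may be introduced on the right by moving its antecedent across the turnstile. The step labels are informal.

```latex
\begin{align*}
& A,\ A \multimap \delta \vdash \delta
  && \text{linear Modus Ponens} \\
& A,\ (A \multimap \delta) \mathbin{\&} (B \multimap \delta) \vdash \delta
  && \text{left rule for } \& \\
& B,\ (A \multimap \delta) \mathbin{\&} (B \multimap \delta) \vdash \delta
  && \text{the same two steps for } B \\
& A \oplus B,\ (A \multimap \delta) \mathbin{\&} (B \multimap \delta) \vdash \delta
  && \text{left rule for } \oplus \\
& (A \multimap \delta) \mathbin{\&} (B \multimap \delta) \vdash (A \oplus B) \multimap \delta
  && \text{introduce } \multimap \\
& ((A \oplus B) \multimap \delta)^{\perp} \vdash ((A \multimap \delta) \mathbin{\&} (B \multimap \delta))^{\perp}
  && \perp_1,\ \perp_2 \\
& ((A \oplus B) \multimap \delta)^{\perp} \vdash (A \multimap \delta)^{\perp} \oplus (B \multimap \delta)^{\perp}
  && \text{negation exchanges } \& \text{ for } \oplus
\end{align*}
```

The fifth line is the converse of the free choice theorem of section 4, so the two formulas (A ⊕ B) ⊸ δ and (A ⊸ δ) & (B ⊸ δ) are interderivable.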
So far, so good. Next, consider a situation in which the speaker is not
ignorant. Exactly one of the alternatives is prohibited, and this time the
speaker knows which one it is. Let’s say that apple-eating is forbidden, but
pear eating is fine. If the speaker were being fully cooperative, then she
would normally choose to simply say (5b), and certainly would not choose to
say (5a). In Gricean terms, adding a superfluous disjunct would violate either
the maxim of Quantity, or the maxim of Manner, or both.
There are nevertheless situations in which this kind of uncooperative
statement might be used. For instance, if a father tells an older sister the
rules (“apples forbidden, pears ok”), she might later uncooperatively tell her
younger brother

(7) You may not eat this apple or this pear . . . but I won’t tell you which.

Once again, the rational course of action on the part of the younger sibling
will be to refrain from eating either piece of fruit. Presumably this is exactly
the outcome the unkind sister is aiming for. (I’m indebted to Sven Lauer for
this scenario; see also Simons 2005:273n.4.)
In both the ignorance scenario and the uncooperative scenario, at least
one of the disjuncts holds, but the choice of which fruit is prohibited belongs
to the master, not the slave. The subject of the prohibition must plan for the
worst, and therefore can’t safely commit to either alternative.
Finally, imagine that the speaker is neither ignorant nor uncooperative.
She may be an expert (perhaps she just received full instructions from the
parents) or she may be herself the source from which permission flows; in
any case, she is fully opinionated about what is forbidden. Crucially, although
(6) guarantees only one disjunct, it is consistent with situations in which
both disjuncts hold. As just argued, if exactly one disjunct held, the speaker
would simply have said so. We can deduce, therefore, that both disjuncts
must hold.
There is one more step to complete the Gricean explanation. If the speaker
intends to convey double prohibition, why not use and?

(8) You may not eat an apple and a pear.

Although this sentence may have the desired double-prohibition reading,
it certainly also has a reading on which it prohibits (only) complex events
that involve eating both an apple and a pear. Uttering (8), then, leaves in
play the possibility that eating a single piece of fruit may be permitted. The
speaker uses a weak form in (6) to express a stronger meaning in order to
avoid misinterpretation.
Thus the assumption that the speaker is opinionated and cooperative de-
rives the implicature that both disjuncts are prohibited via ordinary Gricean
reasoning, without the need to stipulate any special uniformity or distributivity
axioms (as in Alonso-Ovalle 2006) or Zimmermann’s (2000:286) Authority Principle.

6 Comparisons with other accounts

6.1 Implicature accounts

A number of authors, including Schulz (2005) and Fox (2007), suggest that
free choice implications are implicatures that arise in contexts in which the
speaker is opinionated about which options are permitted and which are not.
Fox (2007) reasons as follows: if a speaker utters a disjunction when she
could have made a stronger statement, this could naturally lead to a Quantity
implicature that she did not have sufficient evidence to assert the stronger
statement. If those ignorance implicatures are implausible, as when the
speaker is describing permissions in a situation in which her judgment is
authoritative, the implausibility can trigger a repair strategy under which the
disjunction is pragmatically enriched by the application of a predicate exh
(for “exhaustive”). For instance, if an authoritative speaker says You may eat
an apple or a pear, it may be implausible that she doesn’t know whether you
may eat an apple, or whether you may eat a pear. Therefore the statement
♦(A ∨ P ) can be strengthened (given a number of additional assumptions) to
an exhaustive meaning equivalent to the proposition ♦A ∧ ♦P ∧ ¬(♦(A ∧ P )).
This asserts that you may have an apple, and you may have a pear, but you
may not both have an apple and a pear.
I will discuss three potential problems with these accounts. The first
problem is that the free-choice reading can survive even in the presence of
manifest ignorance on the part of the speaker:

(9) I don’t know whether you may have an apple or a pear.

Since exhaustivity is supposed to be triggered by contexts that are incompatible
with ignorance, (9) should only have a reading on which it means ‘I don’t
know whether you may have an apple or whether you may have a pear’. But
(9) robustly also has a free-choice reading on which it means ‘I don’t know
whether you may eat a piece of fruit, where the fruit is your choice between
an apple or a pear’.

(10) If it turns out that John may have an apple or a pear, he’ll choose the

Likewise, as Kamp (1978:279) notes, free choice interpretations remain available
for the antecedent of a conditional, where it is far from clear how
assumptions about complete knowledge of the alternatives could enter in.
The second problem is that if free choice implications were implicatures,
we should expect them to be generally cancelable:

(11) You may eat an apple or a pear, although in fact you may not eat an

Probably (11) has a non-free choice reading on which it is at least logically
consistent. If this were the basic semantic meaning of (11), then we would
expect it to emerge whenever the free-choice implication is cancelled. The
puzzling thing is that if we assume the speaker is opinionated, (11) gives a
strong impression of contradiction rather than of a cancelled implicature.
Chemla (2009a, 2009b) proposes a pragmatic principle that he calls
symmetry, which says that the epistemic attitude of the speaker must be
uniform across disjuncts. Symmetry correctly predicts that (11) should
be infelicitous, since it implies that the speaker holds a different attitude
towards one disjunct than towards the other. However, symmetry alone
cannot explain why (11) sounds contradictory.
One possibility is that performativity is interfering. Portner (2009b)
suggests that performative uses (see section 7.2 below) force, or at least
strongly promote, a free choice interpretation. If so, then what (11) shows
is that at least when an utterance is performative, free choice implications
cannot be cancelled.
The third problem applies to Fox’s account, though not to Schulz’s: as
Fox himself notes, the proposed implicatures for the free-choice reading do
not match intuitions about the meanings of the sentences in question. Fox’s
exh-enhanced truth conditions assert that eating an apple is permitted, and
eating a pear is permitted, but eating an apple and a pear is forbidden. But
as Simons (2005) and others observe, free choice is compatible with joint
permission. For instance,

(12) [You may eat as much fruit as you want, so]
     You may (certainly) eat an apple or a pear.

On Fox’s account, (12) should be contradictory on a free-choice reading of
the final clause. However, although (12) may be mildly redundant, there is no
hint of contradiction.
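The clash between the exhaustified meaning and (12) can be illustrated with a small sketch. It assumes a deliberately simple model, not taken from any of the accounts discussed: a permission state is a list of permitted courses of action, each a set of fruits eaten, and ♦ is true of a proposition if some permitted course of action satisfies it. The function names are hypothetical, and `free_choice_only` is only a classical stand-in for the two free choice entailments.

```python
def may(permitted, prop):
    """Possibility modal: some permitted course of action satisfies prop."""
    return any(prop(action) for action in permitted)

def apple(action):
    return "apple" in action

def pear(action):
    return "pear" in action

def both(action):
    return apple(action) and pear(action)

def exhaustified(permitted):
    """The strengthened meaning ♦A ∧ ♦P ∧ ¬♦(A ∧ P)."""
    return (may(permitted, apple) and may(permitted, pear)
            and not may(permitted, both))

def free_choice_only(permitted):
    """Just ♦A ∧ ♦P, with nothing said about eating both."""
    return may(permitted, apple) and may(permitted, pear)

# A state in which exactly one piece of fruit may be eaten.
one_piece = [frozenset(), frozenset({"apple"}), frozenset({"pear"})]
# The state described by the bracketed context of (12).
as_much_as_you_want = one_piece + [frozenset({"apple", "pear"})]

exh_one_piece = exhaustified(one_piece)
exh_unrestricted = exhaustified(as_much_as_you_want)
fc_unrestricted = free_choice_only(as_much_as_you_want)
```

In the state set up by the context of (12), the exhaustified meaning comes out false, whereas the two free choice entailments on their own remain true.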
Franke (2009:8) and van Rooij (2010:18) derive results similar to Fox’s
by using a particular game-theoretic technique (“Iterated Best Response”) to
compute implicatures. One advantage of their approach is that the proposi-
tion that eating both an apple and a pear is forbidden arises as an implicature
only when certain alternatives are salient, correctly predicting that (12) need
not be a contradiction.
On the account here, of course, the explanation for the fact that (12) is
not a contradiction is particularly simple and direct: You may eat an apple or
a pear entails that you may eat an apple, and that you may eat a pear, but
refrains from saying anything about whether it’s ok to eat both an apple and
a pear. It neither grants permission to eat two pieces of fruit, nor forbids it.
Van Rooij frames the comparison between exhaustivity and game theory
as part of the debate about embedded implicatures: if free choice implications
can be handled using iterated best response, then free choice no longer
provides an argument that implicatures must be calculated locally (i.e., in
embedded contexts). The resource-sensitive approach here weakens the
argument that free choice motivates embedded implicatures even further, by
calling into question whether free choice implications are implicatures in the
first place.

6.2 Alternative set semantics

Zimmermann (2000) proposes that disjunction contributes a set of exhaustive
epistemic alternatives, so that You may eat an apple or you may eat a pear
expresses the claim that it is possible that you may eat an apple and it is
possible that you may eat a pear. Novel pragmatic principles (notably his
Authority Principle) strengthen this conjunction into an assertion that you
may eat an apple and you may eat a pear.

Geurts (2005) elaborates on Zimmermann’s analysis, arguing that disjunctive
alternatives should not always be epistemic. Rather, disjunction “fuses”
with nearby modal operators, so that You may eat an apple or a pear means
that you may eat an apple and you may eat a pear without needing to invoke
any special pragmatic principle.
Neither Zimmermann’s nor Geurts’ analysis explains why the free-choice
or differs from an overt and (i.e., You may eat an apple and you may eat
a pear) in failing to guarantee that two pieces of fruit may be eaten. In
addition, as Geurts (2005:406) briefly discusses, it is not clear how either
analysis accounts for negated free choice (discussed above in section 5).
Zimmermann’s idea that disjunction introduces a set of alternatives has
been implemented in a variety of ways. I will mention three here.
Kratzer and Shimoyama (2002) propose that indefinites contribute a set
of alternatives, one for each way of resolving the indefinite. This requires in
turn a modification of the basic compositional semantics, since it is necessary
to allow for composition with sets of meanings instead of single meanings.
This is done pointwise using “Hamblin semantics”, so that an embedded
indefinite can give rise to a set of alternatives at higher compositional levels
(see Shan 2004 for discussion of the complexities of pointwise composition).
Alonso-Ovalle (2006) extends this strategy from indefinites to disjunction,
explicitly addressing the free choice problem.
Aloni’s (2007) approach manages disjunction-alternatives within a dynamic
semantics based on Dekker 2002, supplemented with structured propositions.
Van Rooij (2008:309) sketches yet a third implementation, on which
alternatives are built into the definition of a minimal extension of a world.
Then a world in which you eat only an apple might qualify as a minimal
extension of the world we are in, but not a world in which you eat both an
apple and a pear. In order to deliver free choice implications, it is necessary
for the propositions expressed by a disjunction to always be among those
used for articulating minimal extensions, though this requirement is not
guaranteed by the formal analysis.
In these approaches, free choice effects arise when certain operators
explicitly manipulate alternative sets. For instance, Aloni stipulates that
may(Φ) is true (where Φ is a set of alternatives) just in case the ordinary
meaning of may is true of each alternative. Thus You may eat an apple or a
pear involves applying may to the set of alternatives corresponding to the
addressee eating an apple and the addressee eating a pear. The sentence will
be true, then, just in case You may eat an apple is true and You may eat a
pear is true.
The account here resembles Aloni’s alternatives account in two important
respects. First, free choice implications are entailments rather than implica-
tures. As we saw in section 6.1, the fact that free choice implications do not
always seem to be cancelable argues in favor of theories on which they are
treated as entailments.
Second, because alternative-taking may requires that ordinary may must
be true of every alternative, it is a downward-entailing operator with re-
spect to the disjunction that gives rise to the alternatives. Aloni points out
that this explains why (so-called free choice) any is licensed (e.g., You may
eat anything), and since the antecedent of linear implication is likewise a
downward-entailing position (as noted above), the same explanation carries
over here. (Of course, there is more to free choice than placing an indefinite
in a downward entailing context. For instance, a referee observes that in
some Romance languages, some free-choice indefinites are licensed under
permission, but not in the antecedent of conditionals or in other downward
entailing contexts.)
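The alternative-taking modal and its downward-entailing behavior can be sketched as follows. This is a toy rendering under stated assumptions, not Aloni's dynamic system: a permission state is a list of permitted courses of action, alternatives are predicates over such courses of action, and `may_alt` and `eats` are hypothetical names.

```python
def may(permitted, prop):
    """Ordinary may: some permitted course of action satisfies prop."""
    return any(prop(action) for action in permitted)

def may_alt(permitted, alternatives):
    """Alternative-taking may: ordinary may must hold of every alternative."""
    return all(may(permitted, alt) for alt in alternatives)

def eats(fruit):
    """Build the alternative 'the addressee eats this fruit'."""
    return lambda action: fruit in action

permitted = [frozenset(), frozenset({"apple"}), frozenset({"pear"})]

apple_or_pear = [eats("apple"), eats("pear")]
apple_pear_or_banana = apple_or_pear + [eats("banana")]

narrow = may_alt(permitted, apple_or_pear)
wide = may_alt(permitted, apple_pear_or_banana)

# Downward entailment: whenever may_alt holds of a larger set of
# alternatives, it also holds of every subset of that set.
generous = permitted + [frozenset({"banana"})]
wide_generous = may_alt(generous, apple_pear_or_banana)
narrow_generous = may_alt(generous, apple_or_pear)
```

Adding an alternative can only make the sentence harder to verify: `narrow` holds and `wide` fails in the first state, and in the more generous state the truth of the wider disjunction carries over to the narrower one.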
One important difference between the approach here and alternative-based
analyses, including Aloni’s, is the integration with the larger compositional
system. The alternative-set approach in effect creates unbounded
dependencies in the semantics: or introduces alternatives which the
compositional system must track until an alternative-aware operator collapses the
alternatives back into a single proposition. The account here adjusts only
the denotations of the logical connectives, leaving the compositional system
entirely undisturbed. (Not that I have provided a compositional analysis,
though I trust that appropriate details can easily be supplied.)

7 Issues

7.1 Free choice effects apart from permission

It is widely assumed that whatever explains free choice implications for
deontic modals should be the same thing that explains the similar behavior
of epistemic modals:

(13) a. John might be in Aarhaus or in Boston.
     b. John might be in Aarhaus.
     c. John might be in Boston.

In parallel with the permission cases, the disjunction in (13a) entails (13b) and
(13c).
The simplest way to extend the account here to epistemic cases would
be to add to our logic a new atomic formula ε, which is true just in case
everything that is epistemically known holds. Then You might be in Aarhaus
would translate as A ⊸ ε, and the desired entailments follow as a matter of
course.
Adding an epsilon to the logic is more than a superficial change. It is
important to keep track of what the logic claims to be modeling. Classical logic
promises to preserve truth: if the assumptions are true, the conclusion will
be true. Since truth is not resource sensitive (if something is true once, it is
true again and again), it is legitimate to duplicate and discard assumptions.
Linear Logic promises to preserve resources: whatever resources
the assumptions provide, that is exactly what resources will appear in the
conclusion. In our deontic application, the critical resource is permission:
if the assumptions provide enough permission to eat exactly one piece of
fruit, then the conclusion will provide the same amount of permission. In
the epistemic case, the critical resource is epistemic commitment: whatever
commitments are made by the assumptions, the conclusion will make exactly
the same commitments.
There are other important differences between deontic logic and epistemic
logic. For instance, it is generally considered desirable for an epistemic logic
to guarantee that if you know that A is true, then A is true (□A ⊢ A). But
deontically, you would not want to conclude from the fact that A is obligatory
that A must hold, since obligations are all too often not fulfilled. More
relevantly, there are empirical dis-analogies between the free choice behavior
of deontic uses of modals versus epistemic modals. For instance, Kamp
(1978), Zimmermann (2000), and Aloni (2007) note that it is significantly more
difficult to construe epistemic modals as having a . . . but I don’t know which
interpretation (though it is still possible — see especially Simons 2005:274).
I’m not aware of any reason why a reduction strategy could not be part
of a more complete analysis of epistemic modality; nevertheless, it would
be prudent to be cautious about assuming that any deontic analysis should
automatically extend to epistemic cases.
In addition to the possibility that free choice effects may occur in other
modalities, Fox (2007) argues that free choice effects can be discerned in
non-modal contexts that involve existential quantifiers.

(14) There’s beer in the fridge or in the cooler out back.

Especially when (14) is heard as an implicit permissive, (14) entails both that
there is beer in the fridge and that there is beer in the cooler out back. Both
alternatives are guaranteed to be true, and the consumer of the information
has free choice of which one is relevant for forming a plan of action.
Klinedinst (2007) suggests that free choice effects are present with some
existential quantifiers, but only when the quantificational DP is plural:

(15) a. Some passengers got sick or had difficulty breathing.
     b. A passenger got sick or had difficulty breathing.

In (15a), there is a reading on which some passengers got sick, and some had
difficulty breathing. On such a reading, at least some of the passengers must
have gotten sick, and at least some of the passengers must have had difficulty
breathing. But in (15b), there is no guarantee that both of the properties must
be instantiated.
Having mentioned these facts, I will not attempt a discussion here of the
interaction of free choice with quantifiers or with plurals. See Chemla 2009a
for experimental evidence and relevant discussion.

7.2 Performativity

Kamp (1978) draws a distinction between granting permission versus describing
permission, where granting permission is a performative action. When
a parent says You may eat an apple or a pear in the right circumstances,
fruit-eating options may come into being that were not present before the
utterance. But when a sibling comments later Apparently, you may eat an
apple or a pear, they are merely describing the current situation, and no
new options come into being. Van Rooij (2008) and Portner (2009b) de-
velop a dynamic semantics for permission on which a permission sentence
performatively changes the set of what is allowed.
One of the main arguments that performativity is important relies on
correlations between performative uses and the availability of free choice
interpretations. Certainly descriptive uses (such as the sibling’s comment) can
have a free choice interpretation or not. Performatives, however, strongly pre-
fer a free choice interpretation. Yet it may still be possible for a performative
to have a non-free choice interpretation:

Chris Barker

(16) You may pillage city X or city Y. But first take counsel with my secretary.

Kamp (1973:67; see also Kamp 1978:279) says of this example that “[t]he
second part of this statement makes it clear that the vassal should not infer
from the first part that he may make his own choice of city. Which one he may
loot ultimately depends on the secretary’s advice, the tenor of which — we
may assume — is at this point unknown to king and vassal alike.” To be
sure, nothing specific has been permitted, and the vassal cannot form a
complete plan of action. If we conceive of a performative as something that
enlarges what an agent may safely do, we might therefore suppose that (16) is
a merely descriptive use, since it does not by itself allow the vassal to act. Yet
something must have been permitted: where does the disjunctive permission
that the sentence describes come from, if not from the performance of (16)?
As far as the current paper is concerned, it is enough for permission
sentences to characterize what is allowed. Then whether an utterance ex-
pands the sphere of permissibility depends on the interaction of the truth
conditions with the normal range of factors that influence how a discourse
participant decides to react to an utterance. Whether this minimalist strategy
is viable, or whether it will ultimately be necessary to provide a special role
for performativity remains to be seen. (See Kamp 1978 for extensive, but
ultimately inconclusive, discussion.)

7.3 Is there a conjunctive use of or after all?

Geurts (2005) and Simons (2005) emphasize the importance of explaining
how free choice implications arise when or takes scope over the permission
modal:

(17) a. You may eat an apple or a pear.

b. You may eat an apple or you may eat a pear.

The account of free choice given so far does not explain why (17b) also has a
free choice interpretation.
Simons proposes an across-the-board LF movement operation on which
the sentence with unembedded or is predicted to be logically equivalent to
You may [eat an apple or eat a pear]. That approach is compatible with the
account of free choice here.

However, there is an alternative explanation that may be worth some
consideration: perhaps resource-sensitive or is ambiguous between ⊕ (the
translation we’ve given it so far) and &.
After all, there is no other lexical item that is a candidate for expressing &.
For instance, as mentioned above, if you have ingredients for either meringue
or angel food cake, but only enough to make one recipe, and someone asks
‘What’s for dessert?’, the answer is meringue or& cake, never meringue and& cake.
A second intriguing clue comes from conditionals. In Linear Logic,
strengthening of the antecedent is valid for & but not for ⊗. That is, we
have A ⊸ C ⊢ (A & B) ⊸ C but A ⊸ C ⊬ (A ⊗ B) ⊸ C. The observation
that and never expresses & explains why trying to strengthen an antecedent
using and in English does not work: If John left, we could all play bridge does
not entail If John left and Mary left, we could all play bridge. But if or has a
conjunctive use, then we could explain why the inference does seem valid if
we use or: If John left or& Mary left, we could all play bridge.
If or can express &, then the ability of (17b) to serve as a paraphrase of
(17a) is immediately explained: it translates directly as (A ⊸ δ) & (B ⊸ δ),
and it is easy to prove that (A ⊸ δ) & (B ⊸ δ) ⊢ (A ⊕ B) ⊸ δ.
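The proof can be sketched as follows. This is only an illustration: it assumes the standard two-sided sequent rules for ⊸, & and ⊕, and the rule labels (⊸L, ⊸R, &L, ⊕L) are names chosen here that may differ from the labels used elsewhere in the paper. The LaTeX assumes the amsmath and amssymb packages.

```latex
\begin{align*}
1.\quad & A \vdash A \qquad \delta \vdash \delta
        && \text{axioms}\\
2.\quad & A \multimap \delta,\ A \vdash \delta
        && \multimap\!L \text{ from } 1\\
3.\quad & (A \multimap \delta) \mathbin{\&} (B \multimap \delta),\ A \vdash \delta
        && \&L \text{ from } 2\\
4.\quad & (A \multimap \delta) \mathbin{\&} (B \multimap \delta),\ B \vdash \delta
        && \text{as in 1--3, with } B \text{ for } A\\
5.\quad & (A \multimap \delta) \mathbin{\&} (B \multimap \delta),\ A \oplus B \vdash \delta
        && \oplus L \text{ from } 3, 4\\
6.\quad & (A \multimap \delta) \mathbin{\&} (B \multimap \delta) \vdash (A \oplus B) \multimap \delta
        && \multimap\!R \text{ from } 5
\end{align*}
```

Step 5 is where the additive character of ⊕ matters: both premises share the same single copy of the & formula, and each branch selects a different conjunct from it.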
Of course, if or had such a conjunctive use, we would expect it to occur
in embedded position too, for example, You may eat an apple or& a pear. But
this is harmless, and merely gives a different route to the . . . but I don’t know
which reading, which we derived above by giving (disjunctive) or wide scope.
More problematically, we would also expect a conjunctive or to be avail-
able in non-modal sentences. Then saying that John left or& Mary left would
offer the addressee free choice of which disjunct to believe, yet would license
belief in at most one of the disjuncts. Such a meaning does not appear to be available.
Put another way, non-modal uses of or appear to always be classical
disjunction (this is hardly surprising). One notable feature of Linear Logic
is that the classical connectives are easily expressible, given the addition of
the so-called exponential operators, ! (pronounced ‘of course’) and ? (‘why
not?’): from ∆ ⊢ !A infer ∆ ⊢ (A, !A); from ?A ⊢ ∆ infer ⊢ ∆. These operators
allow a richer control over resources in which assumptions can be used
repeatedly, as in contraction, or ignored, as in weakening. Given Linear
Logic with exponentials, we can choose a more relaxed classical resource
management regime, or a more fussy pure Linear Logic regime, as needed.
For instance, the classical disjunction of A and B can be expressed as !A ⊕ !B.

So there is no problem allowing Linear reasoning to peacefully coexist
with classical reasoning, as long as we can reliably tell which kind of resource
management to use in any given context. To a first approximation in English,
linear resource management appears to be relevant only for untensed clauses
with bare verb forms, as in You may eat an apple or eat a pear, in which or
takes scope over the untensed bare verb phrases eat an apple and eat a pear.
Then we could suppose the reason that John left or Mary left does not have
a conjunctive interpretation is because the tensed clauses trigger (only) a
classical interpretation of or.
Figuring out how to regulate the distribution of an ambiguous or would
be a major undertaking, so I leave this issue unresolved for now.

8 Semantics for linear logic

The discussion so far has been conducted entirely in terms of inference rules
and proofs. It is unusual these days, though not unheard of, to express
the meaning of natural language using proof theory without giving a model
theory. More often, of course, we have the opposite situation, in which
semantic analyses provide models without any proof theory.
The most complete picture, however, emerges when proof theory and
model theory complement each other. Therefore I will discuss models for
Linear Logic here, with a detailed illustration of a free choice example.
There are a number of semantic approaches to Linear Logic. Girard’s
(1987, 1995) original semantics in terms of coherence spaces and in terms of
phase spaces would not be directly helpful here. There are other semantic
approaches, however, that have tantalizing associations with the granting
and denying of permission. I will mention three. First, Petri nets describe
the movement of tokens through a network. Lokhorst (1997) uses Petri nets
as models of his Linear Logic treatment of deontic reasoning. (Think of the
tokens as lumps of permission moving from one location to another.) Second,
in game semantics a Proponent and an Opponent take turns making choices,
and I have argued that tracking choice is central to understanding permission
talk. See, e.g., Accorsi and van Benthem 1999 for a discussion of game
semantics for Linear Logic. Third, there are computational models of Linear
Logic that make an explicit connection between the additives and choice. For
example, Abramsky’s (1993) computational semantics for intuitionistic Linear
Logic interprets A ⊗ B as an ordered pair ⟨A, B⟩ both of whose elements
will be used in further computation (eager evaluation); A & B, on the other
hand, denotes an ordered pair only one of whose elements will ever be used
(lazy evaluation), and of course A ⊕ B delivers a projection function that
chooses one or the other of the elements in a & pair. Unfortunately for our
purposes here, Abramsky’s computational interpretation of classical Linear
Logic involves parallel distributed processing, which would take us too far afield.2
Most reassuringly familiar for linguists, Allwein and Dunn (1993) provide
a kosher Kripke-style possible worlds semantics, and that is the approach
that I will present here.
Following Allwein and Dunn, the expository strategy will be to begin with
an algebraic model that is faithful to the inference rules, then show how to
reconstruct that algebra in terms of worlds.

8.1 An algebraic semantics

The algebraic model contains three main components: a lattice for modeling
the additive connectives, a unary operation for modeling negation, and a
binary operation for modeling the multiplicative connectives.
Additives: let A, ∧, and ∨ form a bounded lattice with partial order ≤ and
top and bottom elements. The lattice can be finite or non-finite, and it can be
distributive or non-distributive.
Negation: now let ∼ be a DeMorgan negation on that lattice. This means
that ∼ must be order-reversing (for all x, y in A, x ≤ ∼y iff y ≤ ∼x), and it
must be involutive (for all x in A, ∼∼x ≤ x).
Multiplicatives: we add a commutative, associative binary operation ◦
with identity element t (that is, t ◦ a = a = a ◦ t for all a in A). Thus A, ◦, and
t form a commutative monoid. Note that t may be distinct from the top of
the lattice. The monoid operation must distribute over the join operation,
that is, for all a, b, c ∈ A : a ◦ (b ∨ c) = (a ◦ b) ∨ (a ◦ c). It must also be
compatible with negation in the sense that for all a, b ∈ A : a ◦ b ≤ c iff
a ◦ ∼c ≤ ∼b (“antilogism”).

2 Though it is intriguing to think that the meaning of some natural language expressions might
be appropriately modeled by a distributed process. Perhaps some permission sentences
denote programs which the recipient can execute in various environments in order to
produce whichever certificate of permission is required. Then a free choice permission
sentence denotes a program whose execution is blocked until it receives an external choice
(a selection of which alternative to deploy).

The points in the lattice model formulas. Given a valuation v mapping
atomic formulas onto elements of A, we extend v to complex formulas as
follows: v(A⊥) = ∼v(A); v(A & B) = v(A) ∧ v(B); v(A ⊕ B) = v(A) ∨ v(B);
v(A ⊗ B) = v(A) ◦ v(B); v(A ⅋ B) = ∼(∼v(A) ◦ ∼v(B)); and v(A ⊸ B) =
∼(v(A) ◦ ∼v(B)).
As an example, I will present a six-element, non-distributive lattice:

    5         x  ∼x       ◦ | 0 1 2 3 4 5
   / \        0   5       --+------------
  3   4       1   3       0 | 0 0 0 0 0 0
  |   |       2   4       1 | 0 1 2 1 2 5
  1   2       3   1       2 | 0 2 1 2 1 5
   \ /        4   2       3 | 0 1 2 3 4 5
    0         5   0       4 | 0 2 1 4 3 5
                          5 | 0 5 5 5 5 5

The Hasse diagram on the left gives the lattice order in the usual way, so that
0 ≤ 1, 1 ≤ 3, and so on. In addition, since ≤ is reflexive and transitive, we
also have 0 ≤ 0, 0 ≤ 3, etc.
Since meet (∧) in a lattice is the unique greatest lower bound, it can be
read off the Hasse diagram, e.g., 5 ∧ 5 = 5, 4 ∧ 5 = 4, 4 ∧ 3 = 0, and so on
(dually for the join operation ∨).
It is easy to see by inspection that the negation relation ∼ is involutive
(e.g., ∼∼3 = 3) and order reversing (e.g., along with 0 ≤ ∼3 we have 3 ≤ ∼0).
Note that 3 serves as the identity element t of the monoid. Since the
monoid operation is commutative, the matrix is symmetric across the top-left
to bottom-right diagonal (e.g., 4◦2 = 2◦4). Furthermore, mechanical checking
will confirm that the monoid operation is associative (e.g., (4 ◦ 2) ◦ 1 = 4 ◦ (2 ◦
1)), that it distributes over the join operation (e.g., 3◦(1∨4) = (3◦1)∨(3◦4)),
and that it respects the antilogism requirement (e.g., 4 ◦ 2 ≤ 3 ≡ 4 ◦ ∼3 ≤ ∼2).
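The mechanical checking mentioned here can be sketched in a few lines of Python. The encoding is an illustration only: the names `CHAINS`, `NEG`, `TIMES`, `leq`, `join` and `failed_laws` are hypothetical, and the order is read off the Hasse diagram on the assumption that it consists of the two chains 0 < 1 < 3 < 5 and 0 < 2 < 4 < 5. The involution test checks the equality ∼∼x = x, which is slightly stronger than the stated requirement ∼∼x ≤ x.

```python
from itertools import product

POINTS = range(6)
# Assumed reading of the Hasse diagram: two chains sharing bottom 0 and top 5.
CHAINS = [(0, 1, 3, 5), (0, 2, 4, 5)]
# Negation table: NEG[x] is ∼x.
NEG = {0: 5, 1: 3, 2: 4, 3: 1, 4: 2, 5: 0}
# Monoid table: TIMES[a][b] is a ◦ b.
TIMES = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 2, 1, 2, 5],
    [0, 2, 1, 2, 1, 5],
    [0, 1, 2, 3, 4, 5],
    [0, 2, 1, 4, 3, 5],
    [0, 5, 5, 5, 5, 5],
]


def leq(x, y):
    """x ≤ y iff both lie on one chain and x comes no later than y."""
    return any(x in c and y in c and c.index(x) <= c.index(y) for c in CHAINS)


def join(x, y):
    """Least upper bound of x and y."""
    upper = [z for z in POINTS if leq(x, z) and leq(y, z)]
    return next(z for z in upper if all(leq(z, w) for w in upper))


def failed_laws():
    """Names of the requirements of this section that the tables violate."""
    pairs = list(product(POINTS, repeat=2))
    triples = list(product(POINTS, repeat=3))
    laws = {
        "involutive": all(NEG[NEG[x]] == x for x in POINTS),
        "order-reversing": all(leq(x, NEG[y]) == leq(y, NEG[x]) for x, y in pairs),
        "identity is 3": all(TIMES[3][a] == a == TIMES[a][3] for a in POINTS),
        "commutative": all(TIMES[a][b] == TIMES[b][a] for a, b in pairs),
        "associative": all(
            TIMES[TIMES[a][b]][c] == TIMES[a][TIMES[b][c]] for a, b, c in triples
        ),
        "distributes over join": all(
            TIMES[a][join(b, c)] == join(TIMES[a][b], TIMES[a][c])
            for a, b, c in triples
        ),
        "antilogism": all(
            leq(TIMES[a][b], c) == leq(TIMES[a][NEG[c]], NEG[b])
            for a, b, c in triples
        ),
    }
    return [name for name, holds in laws.items() if not holds]


print(failed_laws())  # an empty list means every requirement holds
```

Because the algebra has only six points, exhaustive checking over all pairs and triples is cheap, so there is no need to rely on spot checks.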
A sequent Γ semantically entails ∆ (written ‘Γ ⊨ ∆’) just in case the
valuation of the multiplicative conjunction of the formulas in Γ is dominated
by the valuation of the multiplicative disjunction of the formulas in ∆. For
instance, since x ∧ y ≤ x for all x, y in A by the definition of meet in a
lattice, we have that A & B ⊨ A.
To illustrate how these tables provide a model of the logic, recall that we
have the following three theorems discussed in previous sections and one
non-theorem:

(18) a. (A ⊸ δ) & (B ⊸ δ) ⊢ (A ⊕ B) ⊸ δ
     b. (A ⊕ B) ⊸ δ ⊢ (A ⊸ δ) & (B ⊸ δ)
     c. (A ⊸ δ) ⊕ (B ⊸ δ) ⊢ (A & B) ⊸ δ
     d. (A & B) ⊸ δ ⊬ (A ⊸ δ) ⊕ (B ⊸ δ)

If the given algebra is a faithful model of Linear Logic, we expect that for
every valuation v assigning a lattice element to the propositional symbols
δ, A, and B, the valuation of the left hand side of any theorem will be
dominated (in the sense of the lattice order ≤) by the valuation of the right
hand side. This is the case for (18) (a) through (c), but we have a countermodel
for (18d): if v(δ) = 0, v(A) = 1, and v(B) = 2, then v((A & B) ⊸ δ) =
v(((A & B) ⊗ δ⊥)⊥) = ∼((v(A) ∧ v(B)) ◦ ∼v(δ)) = ∼((1 ∧ 2) ◦ ∼0) = 5. But
v((A ⊸ δ) ⊕ (B ⊸ δ)) = 0, and 5 ≰ 0.
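The claim about (18) can be checked by brute force over all 216 valuations of A, B and δ. The following Python sketch is illustrative only: the tuple encoding of formulas, the connective tags (`"with"`, `"plus"`, `"tensor"`, `"par"`, `"lolli"`, `"neg"`) and the function names are assumptions made here, and the lattice order is again taken to be the two chains 0 < 1 < 3 < 5 and 0 < 2 < 4 < 5.

```python
from itertools import product

POINTS = range(6)
CHAINS = [(0, 1, 3, 5), (0, 2, 4, 5)]  # assumed reading of the Hasse diagram
NEG = {0: 5, 1: 3, 2: 4, 3: 1, 4: 2, 5: 0}
TIMES = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 2, 1, 2, 5],
    [0, 2, 1, 2, 1, 5],
    [0, 1, 2, 3, 4, 5],
    [0, 2, 1, 4, 3, 5],
    [0, 5, 5, 5, 5, 5],
]


def leq(x, y):
    return any(x in c and y in c and c.index(x) <= c.index(y) for c in CHAINS)


def meet(x, y):
    lower = [z for z in POINTS if leq(z, x) and leq(z, y)]
    return next(z for z in lower if all(leq(w, z) for w in lower))


def join(x, y):
    upper = [z for z in POINTS if leq(x, z) and leq(y, z)]
    return next(z for z in upper if all(leq(z, w) for w in upper))


def value(f, v):
    """Extend valuation v (a dict on atom names) to the formula f.

    A formula is an atom name or a tuple (connective, subformula, ...).
    """
    if isinstance(f, str):
        return v[f]
    op, *args = f
    x = [value(a, v) for a in args]
    if op == "neg":
        return NEG[x[0]]
    if op == "with":
        return meet(x[0], x[1])
    if op == "plus":
        return join(x[0], x[1])
    if op == "tensor":
        return TIMES[x[0]][x[1]]
    if op == "par":
        return NEG[TIMES[NEG[x[0]]][NEG[x[1]]]]
    if op == "lolli":
        return NEG[TIMES[x[0]][NEG[x[1]]]]
    raise ValueError(f"unknown connective: {op}")


def countermodels(lhs, rhs):
    """Valuations of A, B and d (standing in for δ) where v(lhs) ≤ v(rhs) fails."""
    found = []
    for a, b, d in product(POINTS, repeat=3):
        v = {"A": a, "B": b, "d": d}
        if not leq(value(lhs, v), value(rhs, v)):
            found.append(v)
    return found


may_a, may_b = ("lolli", "A", "d"), ("lolli", "B", "d")
may_a_or_b = ("lolli", ("plus", "A", "B"), "d")
may_a_with_b = ("lolli", ("with", "A", "B"), "d")

print(countermodels(("with", may_a, may_b), may_a_or_b))    # (18a)
print(countermodels(may_a_or_b, ("with", may_a, may_b)))    # (18b)
print(countermodels(("plus", may_a, may_b), may_a_with_b))  # (18c)
print(len(countermodels(may_a_with_b, ("plus", may_a, may_b))) > 0)  # (18d)
```

An empty list for (18a) through (18c) only shows that this particular algebra does not refute them; it is the countermodel for (18d) that carries logical weight, since soundness guarantees that a refuted sequent is not a theorem.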
There are (infinitely) many other possible choices for a lattice, and for
any given lattice, there may be many choices for a suitable negation and for
a suitable monoid operation. For instance, Restall (2000:170) gives an even
simpler (but still instructive) model of (distributive) Linear Logic based on a
four-element lattice. Since Linear Logic is sound and complete with respect
to the class of algebraic models given here, a sequent is a theorem iff its left
hand side semantically entails its right hand side for every valuation in every model.

8.2 A possible-worlds semantics

The algebraic semantics is simple and straightforward, in part because it
merely recapitulates the inference rules; for the same reason, it may not
add any insight beyond what is already evident from the inference rules
themselves. Constructing a Kripke-style possible worlds semantics is a bit
more complicated, but may allow natural language semanticists to transfer
some of their intuitions from more familiar sorts of semantics for natural
languages. We shall see that one particularly intriguing feature of the Kripke
semantics for Linear Logic is that there will be three possibilities for the
status of a formula at a world: it may be true, false, or neither true nor false,
which is exactly what makes Linear Logic suitable for modeling actions that
may be permitted, forbidden, or neither permitted nor forbidden.
Allwein and Dunn associate each element in A with a particular set of
worlds. The construction goes as follows. Consider pairs of the form ⟨F, I⟩,
where F and I are sets of points in the lattice. We require ⟨F, I⟩ to satisfy the
following four requirements: first,

(w1): F and I must be disjoint.
(w2): F must be closed upward under ≤, so that for all a ∈ F and for all
b ∈ A : (a ≤ b) implies b ∈ F . Dually, I must be closed downward under ≤,
so that for all a ∈ A and for all b ∈ I : (a ≤ b) implies a ∈ I. In particular, F
always contains the top element, and I always contains the bottom element
of the lattice.
Third, F and I must be closed under meets and joins, respectively. That is:
(w3): for all a, b ∈ F : a ∧ b ∈ F ; and for all a, b ∈ I : a ∨ b ∈ I.
In other words, conditions (w2) and (w3) require that F must be a filter,
and that I must be an ideal.
Finally, there is a maximality condition:
Maximality: A filter/ideal pair ⟨F, I⟩ satisfying (w1), (w2), and (w3) satisfies
maximality only if there is no other distinct pair of sets ⟨F′, I′⟩ also satisfying
(w1), (w2) and (w3) that properly includes the first, i.e., such that F ⊆ F′ and
I ⊆ I′.
Here are a few of the possible pairs of subsets that fail to satisfy the
requirements:
⟨{1, 2}, {1, 3}⟩ violates w1
⟨{3}, {0}⟩ violates w2
⟨{4, 3, 5}, {0}⟩ violates w3
⟨{4, 5}, {0}⟩ violates Maximality
In fact, in this model there are exactly four maximal disjoint filter/ideal pairs:
World a: ⟨{4, 5}, {0, 2}⟩
World b: ⟨{3, 5}, {0, 1}⟩
World c: ⟨{2, 4, 5}, {0, 1, 3}⟩
World d: ⟨{1, 3, 5}, {0, 2, 4}⟩
These pairs will stand in one to one correspondence with our possible worlds.
For each world w = ⟨F, I⟩, we will interpret F as the set of points that are
true at w, and I as the set of points that are false at w.
For worlds c and d, every point in the lattice is either true or false. But
for world a, points 1 and 3 are neither true nor false. Similarly, for world
b, points 2 and 4 are neither true nor false. In terms of permission talk,
there may be situations in which some things are permitted, some things are
forbidden, and some things are neither permitted nor forbidden.
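The construction of the worlds can be reproduced by enumeration. The Python sketch below is an illustration under two assumptions: the lattice order consists of the chains 0 < 1 < 3 < 5 and 0 < 2 < 4 < 5, and F and I are required to be nonempty, in line with the remark that F always contains the top element and I the bottom element (with empty sets allowed, a degenerate pair whose ideal is the whole lattice would also pass the maximality test). All function names are hypothetical.

```python
from itertools import combinations

POINTS = range(6)
CHAINS = [(0, 1, 3, 5), (0, 2, 4, 5)]  # assumed reading of the Hasse diagram


def leq(x, y):
    return any(x in c and y in c and c.index(x) <= c.index(y) for c in CHAINS)


def meet(x, y):
    lower = [z for z in POINTS if leq(z, x) and leq(z, y)]
    return next(z for z in lower if all(leq(w, z) for w in lower))


def join(x, y):
    upper = [z for z in POINTS if leq(x, z) and leq(y, z)]
    return next(z for z in upper if all(leq(z, w) for w in upper))


def subsets():
    """All nonempty subsets of the lattice, as frozensets."""
    return [frozenset(c) for n in range(1, 7) for c in combinations(POINTS, n)]


def is_filter(F):
    upward = all(b in F for a in F for b in POINTS if leq(a, b))    # (w2)
    meets = all(meet(a, b) in F for a in F for b in F)              # (w3)
    return upward and meets


def is_ideal(I):
    downward = all(a in I for b in I for a in POINTS if leq(a, b))  # (w2)
    joins = all(join(a, b) in I for a in I for b in I)              # (w3)
    return downward and joins


def maximal_pairs():
    """Disjoint filter/ideal pairs not properly included in another such pair."""
    filters = [s for s in subsets() if is_filter(s)]
    ideals = [s for s in subsets() if is_ideal(s)]
    pairs = [(F, I) for F in filters for I in ideals if not F & I]  # (w1)

    def extended(F, I):
        return any((G, J) != (F, I) and F <= G and I <= J for G, J in pairs)

    return [(F, I) for F, I in pairs if not extended(F, I)]


for F, I in maximal_pairs():
    gaps = set(POINTS) - F - I
    print(sorted(F), sorted(I), "neither true nor false:", sorted(gaps))
```

The points left over in the last column are exactly those that a world leaves undecided, which is the source of the three-way distinction between permitted, forbidden, and neither.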

The next step is to associate each point in the lattice with a set of worlds.
If w is a world associated with the pair of sets of points ⟨F, I⟩, let w1 indicate
F and w2 indicate I. Then we can define a map β that takes each point p in
the lattice onto the set of worlds w such that p ∈ w1 :
β(0) = {}
β(1) = {d}
β(2) = {c}
β(3) = {b, d}
β(4) = {a, c}
β(5) = {a, b, c, d}
In other words, we map each point in the lattice to the set of worlds that
make it true.
We now need to define relations over sets of worlds that will allow us to
reconstruct the logical operations we want to model: ∧, ∨, ∼, and ◦.
The meet operation is straightforward. We extend β in the following way:
β(p ∧ q) = β(p) ∩ β(q). So meet corresponds to simple set intersection.
Thus 4 ∧ 2 = 2, and β(4 ∧ 2) = β(4) ∩ β(2) = {a, c} ∩ {c} = {c} = β(2).
The join operation is not quite so straightforward. We cannot represent
join as set union. To see why, note that 3 ∨ 2 = 5, but β(3) ∪ β(2) =
{b, d} ∪ {c} = {b, c, d} ≠ β(5). The solution is to exploit the information
present in the second element in the pair of sets that define the worlds. To
do this, we define two operations on sets of worlds. Let W be our set of
worlds, and let C be any subset of W :

l(C) = {x | for all y ∈ W, x1 ⊆ y1 implies y ∉ C}

r(C) = {x | for all y ∈ W, x2 ⊆ y2 implies y ∉ C}

Although l and r are defined over all subsets of W , we will only need to apply
them in the following cases:
r (β(0)) = r ({}) = {a, b, c, d}
r (β(1)) = r ({d}) = {b, c}
r (β(2)) = r ({c}) = {a, d}
r (β(3)) = r ({b, d}) = {c}
r (β(4)) = r ({a, c}) = {d}
r (β(5)) = r ({a, b, c, d}) = {}
For instance, a is not in r(β(1)) because a2 ⊆ d2 and d ∈ β(1).
Allwein and Dunn show that for all points p in the lattice, l(r (β(p))) = β(p).
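The map β and the two shifting operations are easy to compute directly. The following Python sketch hard-codes the four worlds listed above; the names `WORLDS`, `beta`, `r` and `l` are chosen here for illustration, and each world is stored as a pair (F, I) of true and false points.

```python
# Each world is a pair (F, I): the points true at it and the points false at it.
WORLDS = {
    "a": ({4, 5}, {0, 2}),
    "b": ({3, 5}, {0, 1}),
    "c": ({2, 4, 5}, {0, 1, 3}),
    "d": ({1, 3, 5}, {0, 2, 4}),
}


def beta(p):
    """The set of worlds at which point p is true."""
    return frozenset(w for w, (F, I) in WORLDS.items() if p in F)


def r(C):
    """Worlds x such that no world y with x2 ⊆ y2 belongs to C."""
    return frozenset(
        x for x in WORLDS
        if all(y not in C for y in WORLDS if WORLDS[x][1] <= WORLDS[y][1])
    )


def l(C):
    """Worlds x such that no world y with x1 ⊆ y1 belongs to C."""
    return frozenset(
        x for x in WORLDS
        if all(y not in C for y in WORLDS if WORLDS[x][0] <= WORLDS[y][0])
    )


for p in range(6):
    print(p, sorted(beta(p)), sorted(r(beta(p))), sorted(l(r(beta(p)))))
```

In this small model r(β(p)) comes out as the set of worlds at which p is false, which gives an intuitive handle on why shifting with r and back with l recovers β(p).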

We can now define join by shifting the disjuncts using r, then taking their
intersection, then shifting back using l: β(p ∨ q) = l(r(β(p)) ∩ r(β(q))). For
instance, we have β(1 ∨ 3) = l(r(β(1)) ∩ r(β(3))) = l({b, c} ∩ {c}) = l({c}) =
β(3). Trying the problematic case given above, β(3 ∨ 2) = l(r(β(3)) ∩
r(β(2))) = l({c} ∩ {a, d}) = l({}) = {a, b, c, d} = β(5), as desired.
At this point, β, l, and r allow us to fully simulate the structure of the
lattice in terms of sets of worlds.
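That β, l and r reproduce the lattice operations can be confirmed for every pair of points. The sketch below is illustrative: it hard-codes the four worlds, assumes the order given by the chains 0 < 1 < 3 < 5 and 0 < 2 < 4 < 5, and uses hypothetical function names.

```python
from itertools import product

POINTS = range(6)
CHAINS = [(0, 1, 3, 5), (0, 2, 4, 5)]  # assumed reading of the Hasse diagram
WORLDS = {
    "a": ({4, 5}, {0, 2}),
    "b": ({3, 5}, {0, 1}),
    "c": ({2, 4, 5}, {0, 1, 3}),
    "d": ({1, 3, 5}, {0, 2, 4}),
}


def leq(x, y):
    return any(x in c and y in c and c.index(x) <= c.index(y) for c in CHAINS)


def meet(x, y):
    lower = [z for z in POINTS if leq(z, x) and leq(z, y)]
    return next(z for z in lower if all(leq(w, z) for w in lower))


def join(x, y):
    upper = [z for z in POINTS if leq(x, z) and leq(y, z)]
    return next(z for z in upper if all(leq(z, w) for w in upper))


def beta(p):
    return frozenset(w for w, (F, I) in WORLDS.items() if p in F)


def shift(C, side):
    """side=1 gives r(C) (compare false sets); side=0 gives l(C) (true sets)."""
    return frozenset(
        x for x in WORLDS
        if all(y not in C for y in WORLDS if WORLDS[x][side] <= WORLDS[y][side])
    )


def r(C):
    return shift(C, 1)


def l(C):
    return shift(C, 0)


def represents_lattice():
    """True if meet is intersection and join is l(r(.) ∩ r(.)) at all points."""
    return all(
        beta(meet(p, q)) == beta(p) & beta(q)
        and beta(join(p, q)) == l(r(beta(p)) & r(beta(q)))
        for p, q in product(POINTS, repeat=2)
    )


print(represents_lattice())
print((beta(3) | beta(2)) == beta(5))  # plain union does not give the join
```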
Representing negation: β(∼p) = {x | ⟨∼x2, ∼x1⟩ ∈ r(β(p))} (where applying
∼ to a set of points returns the set resulting from applying ∼ to each member
of the original set). For instance, we have β(∼1) = {⟨∼{0, 1}, ∼{3, 5}⟩,
⟨∼{0, 1, 3}, ∼{2, 4, 5}⟩} = {b, d} = β(3).
Note that linear negation expresses something about provability, not
about falsity. One way to see this is to observe that in this model, 3 and its
negation ∼3 = 1 are both true at world d.
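The negation clause can be tried out in the same style. In this Python sketch the helper `star`, which sends a world x to the world ⟨∼x2, ∼x1⟩, is a name introduced here for illustration; the worlds and the negation table are copied from the text.

```python
NEG = {0: 5, 1: 3, 2: 4, 3: 1, 4: 2, 5: 0}
WORLDS = {
    "a": ({4, 5}, {0, 2}),
    "b": ({3, 5}, {0, 1}),
    "c": ({2, 4, 5}, {0, 1, 3}),
    "d": ({1, 3, 5}, {0, 2, 4}),
}


def beta(p):
    return frozenset(w for w, (F, I) in WORLDS.items() if p in F)


def r(C):
    return frozenset(
        x for x in WORLDS
        if all(y not in C for y in WORLDS if WORLDS[x][1] <= WORLDS[y][1])
    )


def star(x):
    """The world whose true set is ∼x2 and whose false set is ∼x1."""
    F, I = WORLDS[x]
    flipped = ({NEG[p] for p in I}, {NEG[p] for p in F})
    return next(w for w, pair in WORLDS.items() if pair == flipped)


def beta_neg(p):
    """β(∼p), computed from worlds alone: {x | star(x) ∈ r(β(p))}."""
    return frozenset(x for x in WORLDS if star(x) in r(beta(p)))


for p in range(6):
    print(p, sorted(beta_neg(p)), sorted(beta(NEG[p])))

# Point 3 and its negation ∼3 = 1 are both true at world d.
print(3 in WORLDS["d"][0] and NEG[3] in WORLDS["d"][0])
```

In this model `star` fixes worlds a and b and swaps c with d, so negation is evaluated by looking at a partner world rather than by complementing a set of worlds.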
Representing the tensor relation ◦ proceeds in two steps. In the usual
Kripke semantics, unary modal operators are characterized by an accessibility
relation, a two-place relation over worlds. Because the multiplicatives are
two-place connectives, we will need a three-place relation.3

Sxyz iff ∀p, q : (p ◦ q ∈ z2 and q ∈ y1 ) implies p ∈ x2

The strategy here is a generalization of the Routley-Meyer semantics for
Relevant Logic. The goal is for the relation S to capture all of the information
present in the monoid operation ◦. In order to do this, it needs to take ad-
vantage of both sets of points that define the worlds: the set of propositions
that are true at a world as well as those that are false at that world.
Conceptually, S models modus ponens, in which x plays the role of
antecedent, y plays the role of the implication, and z plays the role of the
consequent. If the implication is true at y, and the consequent is false at z, S
guarantees that the antecedent must be false at x. For instance, since 3 (role:
the implication) is true at b and 1 ◦ 3 (the consequent) is false at c, but 1
(the antecedent) is not false at a, S does not hold of a, b, and c. We do have
Saba, however. The complete relation is aab, aba, baa, bbb, caa, cad,
cbb, cbc, cca, ccd, cdb, cdc, dab, dac, dba, dbd, dcb, dcc, dda, ddd.
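The relation S can be generated directly from its definition and the monoid table. The following Python sketch does so; the names `TIMES`, `WORLDS` and `S` are illustrative, and worlds are again stored as (true points, false points).

```python
from itertools import product

# Monoid table: TIMES[p][q] is p ◦ q.
TIMES = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 2, 1, 2, 5],
    [0, 2, 1, 2, 1, 5],
    [0, 1, 2, 3, 4, 5],
    [0, 2, 1, 4, 3, 5],
    [0, 5, 5, 5, 5, 5],
]
WORLDS = {
    "a": ({4, 5}, {0, 2}),
    "b": ({3, 5}, {0, 1}),
    "c": ({2, 4, 5}, {0, 1, 3}),
    "d": ({1, 3, 5}, {0, 2, 4}),
}


def S(x, y, z):
    """Sxyz: whenever p ◦ q is false at z and q is true at y, p is false at x."""
    _, x_false = WORLDS[x]
    y_true, _ = WORLDS[y]
    _, z_false = WORLDS[z]
    return all(
        p in x_false
        for p in range(6)
        for q in y_true
        if TIMES[p][q] in z_false
    )


# product() enumerates triples in alphabetical order.
triples = ["".join(t) for t in product("abcd", repeat=3) if S(*t)]
print(len(triples), " ".join(triples))
```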
3 Lambek grammars (e.g., Moortgat 1997) also use a three place relation to give a semantics
for a multiplicative conjunction, where the conjunction is used to model concatenation of
linguistic expressions. For an example of modus ponens in type-logical grammar, DP ⊗
DP\S ⊢ S.

Once we have constructed S as a function of ◦, we can define multiplicative
conjunction purely in terms of relations over worlds:

β(p ◦ q) = l({z|∀x, y : Sxyz and y ∈ β(q) implies x ∈ r (β(p))})

This definition unpacks S in order to reconstruct the original relation ◦.
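As a sanity check, the definition can be run against the monoid table for all 36 pairs of points. The Python sketch below is an illustration with hypothetical names; it rebuilds S from its definition and then compares the world-based value of β(p ◦ q) with β applied to the table entry.

```python
from itertools import product

TIMES = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 2, 1, 2, 5],
    [0, 2, 1, 2, 1, 5],
    [0, 1, 2, 3, 4, 5],
    [0, 2, 1, 4, 3, 5],
    [0, 5, 5, 5, 5, 5],
]
WORLDS = {
    "a": ({4, 5}, {0, 2}),
    "b": ({3, 5}, {0, 1}),
    "c": ({2, 4, 5}, {0, 1, 3}),
    "d": ({1, 3, 5}, {0, 2, 4}),
}


def beta(p):
    return frozenset(w for w, (F, I) in WORLDS.items() if p in F)


def shift(C, side):
    """side=1 gives r(C) (compare false sets); side=0 gives l(C) (true sets)."""
    return frozenset(
        x for x in WORLDS
        if all(y not in C for y in WORLDS if WORLDS[x][side] <= WORLDS[y][side])
    )


def S(x, y, z):
    """Sxyz: whenever p ◦ q is false at z and q is true at y, p is false at x."""
    return all(
        p in WORLDS[x][1]
        for p in range(6)
        for q in WORLDS[y][0]
        if TIMES[p][q] in WORLDS[z][1]
    )


def beta_times(p, q):
    """β(p ◦ q) computed from S, β, r and l alone, without consulting TIMES."""
    r_p = shift(beta(p), 1)
    candidates = frozenset(
        z for z in WORLDS
        if all(x in r_p for x in WORLDS for y in beta(q) if S(x, y, z))
    )
    return shift(candidates, 0)


mismatches = [
    (p, q) for p, q in product(range(6), repeat=2)
    if beta_times(p, q) != beta(TIMES[p][q])
]
print(mismatches)  # an empty list means the reconstruction agrees everywhere
```

Note that `beta_times` consults the table only indirectly, through S, which is the sense in which the worlds carry all the information in ◦.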

8.3 Understanding linear implication

What does the multiplicative conjunction of two formulas mean? Since we
now have both an algebraic and a possible worlds semantics in correspondence,
we can move back and forth between the two semantics in search of insight.
Begin with the algebra. We can keep track of the state of our reasoning
process by picking out a point in the lattice. Assume that I have good
reason to believe we are located at lattice position 1. This is a highly specific
situation: I know that we are located on world d, since that is the only world
at which 1 is true.
Now assume that I learn something: that you have eaten a pear. Call this
fact B, and associate it with lattice point 4 (i.e., let v(B) = 4). To find out
where we are now, I compute 1 ◦ 4 = 2. Since β(1 ◦ 4) = β(2) = {c}, we are
now on world c. Learning that you have eaten a pear changes our location
from world d to world c.
This may initially seem somewhat distressing. In the usual Stalnakerian
system, adding information is typically a monotonic process of eliminating
possible worlds. If we’ve already narrowed the set of live options to a single
world d, there is no way to end up on a distinct world c. Because ◦ is non-
monotonic in this sense, it may be better to think of what we have been
calling worlds as classes of worlds. Sometimes the term ‘set-up’ is used
instead of ‘world’. I will use the term ‘situation’. Then learning that you have
eaten a pear changes the current situation into a different situation, one in
which the consequences of having eaten a pear obtain.
Let’s continue to reason. We pick a point in the lattice to serve as A, the
situation in which you eat an apple, and a separate point to serve as δ, the
situation in which all obligations are fulfilled. Say that v(A) = 2, v(δ) = 3,
and v(B) is still 4. Now consider the proposition that eating an apple is
permitted: A ⊸ δ. Then v(A ⊸ δ) = v((A ⊗ δ⊥)⊥) = ∼(v(A) ◦ ∼v(δ)) =
∼(2 ◦ ∼3) = ∼(2 ◦ 1) = ∼2 = 4. Apparently, in this model, the situation
in which you eat a pear is modeled by the same situation in which you are
permitted to eat an apple. (This sort of coincidence is unavoidable in such a
tiny model, in the same way that a valuation for classical logic will be forced
to map very different formulas to the same truth value.)
So let’s say that I know we’re in a situation in which you are permitted
to eat an apple (say, point 4), and then I learn that you have eaten an apple.
Perhaps I watch you eat it. This changes things: I compute 4 ◦ 2 = 1.
Thanks to your eating an apple, we’re now in situation 1. And since 1 ≤ 3,
things are as they are supposed to be. In terms of worlds, δ is modeled by
worlds (situations) b and d; and since point 1 corresponds to (a singleton set
containing only) world d, we must be in a δ-world.
So, what if you are permitted to eat an apple or a pear? That’s ∼((2 ∨ 4) ◦
∼3) = 4. We just saw that if we start at 4 and you eat an apple, we land on a
δ-world. And indeed, if we’re at point 4 and you eat a pear instead, 4 ◦ 4 = 3,
and once again we’re in a δ-situation.
But what if you eat an apple and you eat a pear? 4 ◦ 4 ◦ 2 = 2. Situation
2 is not a δ situation, so things are not ok. Having permission to eat an
apple or a pear is not the same thing as having permission to eat an apple
and a pear. Likewise, if killing the postman is modeled by situation 4 (i.e.,
v(K) = 4), then eating an apple and killing the postman will definitely not
leave us in a δ-situation. (This small model is somewhat unrealistic, however,
in that there are situations in which eating an apple, killing the postman, and
then eating another apple is perfectly permissible.)
However, as emphasized above, having permission to eat an apple or a
pear is compatible with also having permission to eat both. Making use of
the same model, if we have v(A) = v(δ) = v(B) = 3, then v((A & B) ⊸ δ) =
v((A ⊗ B) ⊸ δ) = 3. With this valuation, eating apples and pears is truly
optional: you can eat an apple and stop, or you can eat a pear and stop, or
you can eat an apple and you can eat a pear, and in all three cases you’ll end
up in a δ-situation.
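The arithmetic in this section can be replayed step by step. The Python sketch below is illustrative: `lolli`, `after` and `ok` are hypothetical helper names, the constants record the valuation v(A) = 2, v(B) = 4, v(δ) = 3 used above, and the order is assumed to be the chains 0 < 1 < 3 < 5 and 0 < 2 < 4 < 5.

```python
from functools import reduce

POINTS = range(6)
CHAINS = [(0, 1, 3, 5), (0, 2, 4, 5)]  # assumed reading of the Hasse diagram
NEG = {0: 5, 1: 3, 2: 4, 3: 1, 4: 2, 5: 0}
TIMES = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 2, 1, 2, 5],
    [0, 2, 1, 2, 1, 5],
    [0, 1, 2, 3, 4, 5],
    [0, 2, 1, 4, 3, 5],
    [0, 5, 5, 5, 5, 5],
]


def leq(x, y):
    return any(x in c and y in c and c.index(x) <= c.index(y) for c in CHAINS)


def meet(x, y):
    lower = [z for z in POINTS if leq(z, x) and leq(z, y)]
    return next(z for z in lower if all(leq(w, z) for w in lower))


def join(x, y):
    upper = [z for z in POINTS if leq(x, z) and leq(y, z)]
    return next(z for z in upper if all(leq(z, w) for w in upper))


def lolli(x, y):
    """Value of X ⊸ Y given the values x and y: ∼(x ◦ ∼y)."""
    return NEG[TIMES[x][NEG[y]]]


def after(state, *events):
    """Situation reached from `state` once each event has been combined in."""
    return reduce(lambda s, e: TIMES[s][e], events, state)


def ok(state, delta):
    """True if `state` is a δ-situation, i.e. state ≤ v(δ)."""
    return leq(state, delta)


APPLE, PEAR, DELTA = 2, 4, 3

print(after(1, PEAR))                             # 1 ◦ 4
may_apple = lolli(APPLE, DELTA)                   # v(A ⊸ δ)
may_either = lolli(join(APPLE, PEAR), DELTA)      # v((A ⊕ B) ⊸ δ)
print(may_apple, may_either)
print(ok(after(may_either, APPLE), DELTA))        # eat an apple
print(ok(after(may_either, PEAR), DELTA))         # eat a pear
print(ok(after(may_either, PEAR, APPLE), DELTA))  # eat both

# The alternative valuation v(A) = v(B) = v(δ) = 3, where both are allowed.
print(lolli(meet(3, 3), 3), lolli(TIMES[3][3], 3), ok(after(3, 3, 3), 3))
```

Treating each event as one more factor in a running product makes the resource sensitivity concrete: the same permission point is consumed by the first piece of fruit, and a second piece moves the situation somewhere else.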

9 Conclusions

On the view presented here, understanding free choice hinges on recognizing
that permission is a scarce resource, and so permission talk requires a
resource-sensitive semantics. Following Lokhorst (1997), I propose Linear
Logic as a way of tracking permission: not only what kind of permission has
been granted, but how much. Then primary free choice implications (given
You may eat an apple or a pear, infer You may eat an apple & You may eat a
pear) follow merely from expressing permission using the (independently-motivated)
Anderson/Kanger deontic reduction strategy. Double prohibition
(from You may not eat an apple or a pear infer You may not eat an apple and
You may not eat a pear) follows from standard Gricean reasoning, without
any need to postulate special pragmatic mechanisms.
The implications of this view are fairly dramatic. The claim is that natural
language expressions can differ in the resource management schemes they
impose. At the least, alethic modes impose classical resource management,
and deontic modes impose linear resource management (and quite likely,
other modes as well).
Linear Logic is one of the better known resource-sensitive logics. Other
logics may be worth considering instead. Similarly, the Anderson/Kanger
deontic reduction strategy was adopted in part for ease of exposition, and
work remains to integrate the account here within a more general framework
of modality in natural language. But apart from the advantages of Linear
Logic specifically or the deontic reduction, I would like to suggest a more
general conclusion: that we may be able to gain new and valuable insights into
long-standing puzzles in natural language semantics if we allow ourselves to
consider richer logical approaches than standard classical logic.

References

Abramsky, Samson. 1993. Computational interpretations of Linear Logic. Theoretical
Computer Science 111(1–2). 3–57. doi:10.1016/0304-3975(93)90181-
Accorsi, Rafael & Johan van Benthem. 1999. Lorenzen’s games and Linear
Logic. University of Amsterdam manuscript. http://www.informatik.
Akatsuka, Noriko. 1992. Japanese modals are conditionals. In Diane Brentari,
Gary Larson & Lynn MacLeod (eds.). The joy of grammar: A festschrift
in honor of James D. McCawley. Amsterdam: John Benjamins. 1–10.
Allwein, Gerard & J. Michael Dunn. 1993. Kripke models for Linear Logic. The
Journal of Symbolic Logic 58(2). 514–545. doi:10.2307/2275217.
Aloni, Maria. 2007. Free choice, modals, and imperatives. Natural Language
Semantics 15(1). 65–94. doi:10.1007/s11050-007-9010-2.
Alonso-Ovalle, Luis. 2006. Disjunction in alternative semantics. UMass
Amherst: PhD dissertation.
Asher, Nicholas & Daniel Bonevac. 2005. Free choice permission is strong
permission. Synthese 145(3). 303–323. doi:10.1007/s11229-005-6196-z.

Brown, Mark. 1996. Doing as we ought: Towards a logic of simply dischargeable
obligations. In Mark Brown & José Carmo (eds.). Deontic
logic, agency and normative systems (Third International Workshop on
Deontic Logic in Computer Science). Berlin: Springer. 47–65.
Chemla, Emmanuel. 2009a. Universal implicatures and free choice effects: experimental
data. Semantics and Pragmatics 2(2). 1–33. doi:10.3765/sp.2.2.
Chemla, Emmanuel. 2009b. Similarity: Towards a unified account of
scalar implicatures, free choice permission and presupposition pro-
jection. Manuscript. http://www.emmanuel.chemla.free.fr/Material/
Clancy, Patricia M. 1985. The acquisition of Japanese. In Dan Slobin (ed.).
The crosslinguistic study of language acquisition: The data (Volume 1).
Hillsdale, NJ: Lawrence Erlbaum Associates. 373–524.
Dalrymple, Mary. 2001. Lexical Functional Grammar (Syntax and Semantics
volume 34). New York: Academic Press.
Dekker, Paul. 2002. Meaning and use of indefinite expressions. Journal of
Logic, Language and Information 11(2). 141–194. doi:10.1023/A:1017575313451.
Fox, Danny. 2007. Free choice disjunction and the theory of scalar implicature.
In Uli Sauerland & Penka Stateva (eds.). Presupposition and
implicature in compositional semantics. New York: Palgrave Macmillan.
Franke, Michael. 2009. Free choice from iterated best response. In Maria
Aloni, Harald Bastiaanse, Tikitu de Jager, Peter van Ormondt & Katrin
Schulz (eds.). Pre-proceedings of the seventeenth Amsterdam Collo-
quium. Amsterdam: ILLC/Department of Philosophy. 267–276.
Geurts, Bart. 2005. Entertaining alternatives: disjunctions as modals. Natural
Language Semantics 13(4). 383–410. doi:10.1007/s11050-005-2052-4.
Girard, Jean-Yves. 1987. Linear Logic. Theoretical Computer Science 50(1).
1–102. doi:10.1016/0304-3975(87)90045-4.
Girard, Jean-Yves. 1995. Linear Logic: its syntax and semantics. In Jean-Yves
Girard, Yves Lafont & Laurent Regnier (eds.). Advances in Linear Logic.
Lecture Note Series 222. Cambridge, UK: Cambridge University Press.
Hansen, Jörg, Gabriella Pigozzi & Leendert van der Torre. 2007. Ten philo-
sophical problems in deontic logic. Dagstuhl Seminar Proceedings
07122. http://drops.dagstuhl.de/opus/volltexte/2007/941.

Kamp, Hans. 1973. Free choice permission. Proceedings of the Aristotelian
Society 74. 57–74.
Kamp, Hans. 1978. Semantics versus pragmatics. In Franz Guenthner &
Siegfried J. Schmidt (eds.). Formal semantics and pragmatics for natural
languages. Dordrecht, Holland: Reidel. 255–287.
Klinedinst, Nathan. 2007. Plurality and possibility. UCLA, CA: PhD dissertation.
Kratzer, Angelika. 1991. Modality. In Arnim von Stechow & Dieter Wunderlich
(eds.). Semantik: Ein internationales Handbuch der zeitgenössischen
Forschung. Berlin: De Gruyter. 639–650.
Kratzer, Angelika & Junko Shimoyama. 2002. Indeterminate pronouns:
The view from Japanese. In Yukio Otsu (ed.). The proceedings of the
third Tokyo conference on psycholinguistics. Tokyo: Hituzi Syobo. 1–25.
Lokhorst, Gert-Jan C. 1997. Deontic linear logic with Petri net semantics.
Technical report, FICT (Center for the Philosophy of Information and
Communication Technology). Rotterdam. http://homepages.ipact.nl/
Lokhorst, Gert-Jan C. 2006. Andersonian deontic logic, propositional
quantification, and Mally. Notre Dame Journal of Formal Logic 47(3). 385–395.
McNamara, Paul. 2006. Deontic Logic. In Dov M. Gabbay & John Woods (eds.).
Handbook of the history of logic, volume 7: Logic and the modalities
in the twentieth century. Amsterdam: Elsevier. 197–288. A version is
also available in the Stanford encyclopedia of philosophy. http://plato.
Moortgat, Michael. 1997. Categorial Type Logics. In Johan van Benthem &
Alice ter Meulen (eds.). Handbook of logic and language. Cambridge,
MA: MIT Press. 93–177.
Portner, Paul. 2009a. Modality. Oxford, UK: Oxford University Press.
Portner, Paul. 2009b. Permission and choice. Georgetown University:
Restall, Greg. 2000. An introduction to substructural logics. London: Rout-
van Rooij, Robert. 2008. Towards a uniform analysis of any. Natural
Language Semantics 16(4). 297–315. doi:10.1007/s11050-008-9035-1.

van Rooij, Robert. 2010. Conjunctive interpretation of disjunction. Semantics
and Pragmatics 3(11). doi:10.3765/sp.3.11.
Ross, Alf. 1941. Imperatives and logic. Theoria 7(1). 53–71.
Schulz, Katrin. 2005. A pragmatic solution for the paradox of free choice
permission. Synthese 147(2). 343–377. doi:10.1007/1-4020-4631-6_10.
Shan, Chung-chieh. 2004. Binding alongside Hamblin alternatives calls
for variable-free semantics. In Kazuha Watanabe & Robert B. Young
(eds.). Proceedings from Semantics and Linguistic Theory XIV. Cornell
University Press. 289–304.
Simons, Mandy. 2005. Dividing things up: the semantics of or and the
modal/or interaction. Natural Language Semantics 13(3). 271–316.
Wadler, Phil. 1993. A taste of Linear Logic. In Andrzej Borzyszkowski & Stefan
Sokolowski (eds.). Proceedings of the 18th international symposium on
mathematical foundations of computer science (Lecture Notes in Computer
Science Volume 711). Heidelberg: Springer. 185–210. doi:10.1007/3-
Zimmermann, Ede. 2000. Free choice disjunction and epistemic possibility.
Natural Language Semantics 8(4). 255–290. doi:10.1023/A:1011255819284.

Chris Barker
10 Washington Place
New York, NY 10003, USA

Semantics & Pragmatics Volume 3, Article 6: 1–54, 2010
doi: 10.3765/sp.3.6

The semantics and pragmatics of plurals

Donka F. Farkas
Department of Linguistics, University of California at Santa Cruz

Henriëtte E. de Swart
Department of Modern Languages, Utrecht University

Received 2008-11-14 / First Decision 2009-01-23 / Revised 2009-04-28 / Second
Decision 2009-07-07 / Revised 2009-09-18 / Third Decision 2009-10-07 / Revised
2009-10-30 / Accepted 2009-11-21 / Final Version Received 2010-01-06 / Published
Abstract This paper addresses the semantics and pragmatics of singular
and plural nominals in languages that manifest a binary morphological
number distinction within this category. We review the main challenges
such an account has to meet, and develop an analysis which treats the plural
morpheme as semantically relevant, and the singular form as not contributing
any number restriction on its own but acquiring one when in competition
with the plural form. The competition between singular and plural nominals
is grounded in bidirectional optimization over form-meaning pairs. The main
conceptual advantage our proposal has over recent alternative accounts
is that it respects Horn’s ‘division of pragmatic labor’, in that it treats
morphologically marked forms as semantically marked, and morphologically
unmarked forms as semantically unmarked. In our account, plural forms
are polysemous between an exclusive plural sense, which enforces sum
reference, and an inclusive sense, which allows both atoms and sums as
possible witnesses. The analysis predicts that a plural form is pragmatically
appropriate only in case sum values are among the intended referents.
To account for the choice between these two senses in context we invoke
the Strongest Meaning Hypothesis, an independently motivated pragmatic
principle. Finally, we show how the approach we develop explains some
puzzling contrasts in number marking between English three/more children
and Hungarian három/több gyerek (‘three/more child’), a problem that has
not been properly accounted for in the literature so far.

Keywords: singular, plural, morphology, markedness, optimality theory, strongest
meaning hypothesis, Hungarian

©2010 Farkas and de Swart

This is an open-access article distributed under the terms of a Creative Commons
Non-Commercial License (creativecommons.org/licenses/by-nc/3.0).

1 Atoms, sums and the inclusive/exclusive sum interpretation