
THEORETICAL LINGUISTICS

Vol. 1

1974

WALTER DE GRUYTER · BERLIN · NEW YORK


Archiv-Nr. 3109001089
ISSN 0301-4428
© 1974 by Verlag Walter de Gruyter & Co., vormals G. J. Göschen'sche Verlagsbuchhandlung / J. Guttentag, Verlagsbuchhandlung / Georg Reimer / Karl J. Trübner / Veit & Comp., 1 Berlin 30, Genthiner Straße 13. Printed in Germany. All rights reserved, including those of translations into foreign languages. No part of this journal may be reproduced in any form (by photoprint, microfilm or any other means) nor transmitted nor translated into a machine language without permission from the publisher. Typesetting: H. Hagedorn, Berlin. Printing: Mercedes-Druck, Berlin. Binding: T. Fuhrmann, Berlin.

CONTENTS

Articles
BELLERT, IRENA: On inferences and interpretation of natural language sentences 215
GABBAY, DOV M., and MORAVCSIK, J. M. E.: Branching quantifiers, English, and Montague-grammar 139
HOEPELMAN, JAN PH.: Tense logic and the semantics of the Russian aspects 158
ISARD, STEPHAN D.: What would you have done if ...? 233
KARTTUNEN, LAURI: Presupposition and linguistic context 181
KASHER, ASA: Mood implicatures: A logical way of doing generative pragmatics 6
KUTSCHERA, FRANZ VON: Indicative conditionals 257
LIEB, HANS-HEINRICH: Grammars as theories: The case for axiomatic grammar (part I)* 39
SOAMES, SCOTT: Rule orderings, obligatory transformations and derivational constraints 116

Discussions and Expositions
BELLERT, IRENA: A reply to H. H. Lieb 287
DASCAL, MARCELO, and MARGALIT, AVISHAI: A new 'revolution' in linguistics? 'Text-grammars' vs. 'sentence-grammars' 195
LEWIS, HARRY A.: Model theory and semantics 271

* Part II of Lieb's article is to appear in volume 2, No. 3 of THEORETICAL LINGUISTICS.

EDITORIAL

Theoretical Linguistics is concerned with linguistic theories. This statement seems to be almost a tautology, but it is not, though that may not be immediately obvious. The formulation needs interpretation, especially if it is to be taken as a condensed statement of the editorial policy of this journal. Let me begin by stating the general tendencies determining this policy: rigour of presentation, adequacy with respect to the more complicated aspects of natural languages, and critique of the foundations of the analysis of languages. The first tendency stems from logic, the second from linguistics, and the third from philosophy of language, the three "parents" of theoretical linguistics. But let me explain this in detail.

"Theoretical Linguistics" is not only the title of this journal; it is also meant to designate a sub-discipline of linguistics. Theoretical linguistics, in being concerned with the development of theories about general aspects of particular languages or of language and its uses in general, as well as with the discussion and analysis of the form, scope and applicability of such theories, will have to be contrasted with that part of linguistics which is either concerned with experiments, tests and field-work applied to languages, or else with setting up particular empirical hypotheses about the structures of languages based on judgements about (introspective or observable) experiences. Obviously, there should be a very close interrelation between the different parts of linguistics, including theoretical linguistics. In view of this interrelation, linguistic theories will have to be extensively illustrated by derivations of hypotheses about particular phenomena of languages in order to become accepted as systems of hypotheses, and these special hypotheses will have to be tested to evaluate the empirical adequacy of the linguistic theories from which they are derived. Many expositions and illustrations of linguistic theories describing fragments or aspects of particular languages, dialects etc. will therefore belong to theoretical linguistics, as well as empirical research trying to assess the justification of some theory in general. But the central concern of theoretical linguistics should be the construction and discussion of general theories of a high degree of methodological and, in particular, meta-linguistic sophistication.


Wherever possible, the concept of an axiomatized, though not necessarily formalized, theory should be taken as a standard at which theories presented or discussed in this journal aim. Glossing over a great number of problems, I assume that (1) linguistic theories (or the central empirical propositions correlated with them) are sets of statements, some of which are empirically true or false; and that (2) the logical relations among the statements of a linguistic theory may be exhibited by an axiomatic system. The logical relations may be deductive or inductive. We may require, in particular, that (3) the logical relations of a linguistic theory should be presented either (a) as an axiomatic deductive theory with descriptive constants (which may be expressed, in its strictest form, in some formal language, the metalanguage of the language to be described); or (b) by definition of a set-theoretical predicate (or a set-theoretical structure).

In empirical disciplines aiming at all at strict axiomatization, method (3b) is often applied, especially because it is then possible to draw upon the huge stock of standard mathematical notions (under the assumption that their definition as set-theoretical predicates has already been accomplished in meta-mathematics). Indeed, an axiomatization along these lines may have been an implicit aim of mathematical linguistics, to the extent that the latter developed in relation to generative transformational grammars. The basic intention was apparently to define set-theoretical predicates presenting the structural properties of descriptive constants like "... is a grammatical sentence in L" and "... is a syntactic structure of sentence ... in L", defined for each natural language L. This approach was based on some fundamental notions such as "... is an elementary unit of expression" (e.g. phonetic unit) and the notion of concatenation, supplemented by definitions of auxiliary notions such as "grammatical construction", "grammatical (or syntactical) rule" etc. The basic notions could be taken to correspond to the set-theoretically defined mathematical notions "element of the generating set" and "binary operation" of a free semi-group with a generator set.

In connection with these definitions particular emphasis has been placed upon the requirement that the definitions should be recursive. Since, however, the sets corresponding to the notions to be defined could not be taken as finite, methods of recursively constructing the elements had to be applied. This led to speaking of devices that generate sentences, from which some people have mistakenly supposed that the particular method of recursive definition applied might serve as a model of psychological procedures performed by speakers, and possibly also by hearers, of the language for which grammaticality of sentences and their structure had been defined.
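By way of illustration only, such a recursive, set-theoretical definition might be sketched as follows; the two-rule grammar and the lexicon are invented for the purpose, and the fragment is a sketch of the idea of definition by recursion, not a proposal about any particular language:

```python
# A sketch, with an invented two-rule grammar: the set-theoretical
# predicate "... is a grammatical sentence in L" defined recursively over
# the free semi-group whose generating set is the vocabulary and whose
# binary operation is concatenation.
LEXICON = {"popes": "NP", "speak": "VP", "old": "Adj"}   # invented lexicon

def concat(x, y):
    """The binary operation of the free semi-group: concatenation."""
    return f"{x} {y}"

def is_np(words):
    """Recursive clause: NP -> 'popes' | Adj NP."""
    if len(words) == 1:
        return LEXICON.get(words[0]) == "NP"
    return LEXICON.get(words[0]) == "Adj" and is_np(words[1:])

def is_sentence(string):
    """S -> NP VP: a finite definition of an infinite set of strings."""
    words = string.split()
    return any(is_np(words[:i]) and
               all(LEXICON.get(w) == "VP" for w in words[i:])
               for i in range(1, len(words)))

print(is_sentence(concat("old old popes", "speak")))   # True
print(is_sentence("speak popes"))                      # False
```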


In fact, only the recursive definition of these notions and no psychological procedures were in question. (It could at most be said that these notions, if adequately defined, put constraints on possible descriptions of psychological procedures.)

Recently, a complex of set-theoretical predicates (or set-theoretical structures) for syntax and semantics of natural languages has been advanced by applying the methods of model-theory and intensional logic. In complexity and, perhaps, in scope, the resulting systems seem to be at least comparable to generative transformational systems. But in addition to syntax, these proposals included as an essential component an axiomatized theory of indirect semantic interpretation, obtained by defining a translation function into a language with a syntactic structure that is better suited to the definition of a direct model-theoretic interpretation than are natural languages. Recently, method (3a) has been used for axiomatization within the generative theory of grammar, and the first steps have been taken towards axiomatizing grammars. Further developments in the field of axiomatic grammars will be presented in the first issues of this journal. There may be alternative ways of presenting precise linguistic theories, but they will have to be comparable in rigour to the standards of formal theories that are already well understood and analysed.

Certain judgements of value may seem to be implicit in my discussion of various approaches to theoretical linguistics. Let me therefore state explicitly that these judgements are of necessity subjective. I do not believe, though, that this should exclude such judgements from an editorial. Any bias that might result from my own convictions will, as a matter of editorial policy, be corrected by consultation of the members of the editorial board, who will not necessarily agree in all their views on theoretical linguistics. Admitting all this, I would still suggest that certain consequences follow concerning the style of presentation preferred in this journal: currently established mathematical and logical conventions of formulation should be followed as far as possible.

Some readers may, perhaps, object that too much emphasis has been placed upon the form in which theories are presented. It is certainly true that the force of a theory does not derive from its rigour but from its leading ideas and from the answers that can be given to the following questions: Are these ideas revealing and fruitful? Do they offer important generalizations and insights? Are they adequate and applicable? Can they be related to neighbouring fields, or do they even sustain notions that are basic in a number of neighbouring fields, etc., etc.? True enough! But it is also true that the reliability and the continuity of development of ideas does depend on the perspicuity, rigour and systematic controllability of the theories in which they are presented. It may be granted that fruitful ideas are very often not stated in a rigorous form in the first instance; a number of attempts at finding the best formulation will have to be allowed.


However, the very attempt to integrate an idea into a theory will demand precise and rigorous formulation. Nevertheless, rigour of meta-linguistic presentation is only a virtue in an empirical discipline if it is not empty, i.e. provided that the proposed theory aims at empirical adequacy. Progress in logical and theoretical analyses of languages during recent years should allow for the development of theories that combine rigour with empirical adequacy. The situation has changed from that of the early days of logical analysis of languages, when rigour could only be achieved at the sacrifice of adequacy in the description of natural languages. Still, whereas, because of their different training, the logician has had the advantage over the linguist with regard to meta-linguistic rigour, the position is reversed when it comes to meeting the requirements of empirical adequacy. Logicians tend to underestimate the complexities of natural languages and to ignore whatever does not fit into their framework of description.

Finally, we have to acknowledge that strict adherence to the basic principles of a discipline may, at times, be detrimental to its fruitful development. It is the task of the philosopher to insist in season and out of season on the need for reflection on the basic principles and to doubt the validity of tacit assumptions. And of course, this also applies to the conception of theoretical linguistics as presented in this editorial. In particular, such points as the relation of theoretical linguistics to formalization, which are somewhat controversial even among the members of the editorial board, should be discussed in the journal itself.

It is to be hoped that logicians, linguists and philosophers of language (and at times perhaps others, such as the computer scientist) will collaborate in the development of theoretical linguistics. It should be borne in mind, however, that contributions are variously assessed in different disciplines: our field will not be an exception. In order to aid mutual understanding, authors should, therefore, provide their contributions with careful introductions which present the problems succinctly, as well as the criteria for a satisfactory solution, and the methods applied in trying to attain it. This effort on the part of the authors should be met by the reader with patience, tolerance and willingness to learn even from approaches that seem to be unfamiliar at a first reading.

In order to further mutual understanding as well as more thorough acquaintance with the different approaches of theoretical linguistics, the journal will have a special section called "Discussions and Expositions", containing detailed critical analyses as well as contributions designed to give an easily readable introduction and exposition of more complicated ideas and theories which are of interest. At present, for example, a number of aspects of model-theoretic grammars seem to need such an introduction. Authors who are willing and able to provide expository contributions of this type are invited to submit them to the journal. It is up to the authors and readers of this journal to contribute to the development of theoretical linguistics:


the authors by providing not only original contributions of high standard, but also pedagogically perspicuous and lucid introductions to, or critical discussions of, the different ways of approaching a common field of interest; the readers by analyzing the results published with an appropriate combination of criticism and tolerance and by applying them fruitfully.
H. Schnelle

ASA KASHER

MOOD IMPLICATURES: A LOGICAL WAY OF DOING GENERATIVE PRAGMATICS*

In this paper we present an extension of the model-theoretic framework of semantics in which some pragmatical aspects of natural language can be treated adequately. In ch. 1 we specify the scope of generative pragmatics. Ch. 2 outlines the formal framework of semantics. In ch. 3 we pose the problem of non-indicative sentences, and in chs. 4 and 5 we reject the solutions suggested by Stenius, Åqvist and Lewis. In ch. 6 we define some pragmatical concepts (preconditions and implicatures of various types), using them in ch. 7 to present a pragmatical characterization of moods in terms of preference-implicatures. Some ramifications are discussed. In ch. 8 we draw a distinction between basic preference-implicatures ("pragmemes") and derived ones. The derivations involve communication rules. In ch. 9 we outline the extended formal framework. Finally, in ch. 10, we present some open questions.

It may be that the next great advance in the study of language will require the forging of new intellectual tools that permit us to bring into consideration a variety of questions that have been cast into the waste-bin of 'pragmatics'.
N. Chomsky, Form and Meaning in Natural Language

1. Introductory

Ideal speakers live in speech-communities, communicating with their fellows regularly and happily. They ask questions and answer questions, as all of us, real speakers, do; some of them issue commands, as some of us do; and a few of them give verdicts, as a few of us do.

* The preparation of this paper was partly supported by the Deutsche Forschungsgemeinschaft (Bonn-Bad Godesberg). I am grateful to John Bacon, L. Jonathan Cohen and Helmut Schnelle for pointing out some important, apparent counterexamples. I owe special thanks to Helmut Schnelle for extensive comments on an earlier version of the paper. As usual, I am grateful also to Yehoshua Bar-Hillel for some comments and remarks.


However, unlike any of us, ideal speakers never fail to perform happily the speech acts they intend to carry out. They utter an instance of sentence S in context C only if S is linguistically appropriate to C; in their discourse, preconditions never fail to obtain and presuppositions are always true.

Pragmatics is the study of that part of linguistic knowledge and behavior which pertains to speaker-sentence-context relations. In a homogeneous community of ideal speakers the linguistic behavior matches strictly the linguistic knowledge, but this is not the case when real speakers are under consideration. In that case there is the distinction between Pragmatics in a narrow sense, the study of the pragmatical competence of real speakers (which is, so to speak, the study of the pragmatical aspects of the behavior and knowledge of ideal speakers), and Pragmatics in a broad sense, the study of the pragmatical aspects of the linguistic behavior of real speakers. This paper is confined to Pragmatics in the narrower sense.

A pragmatical theory purports to describe and explain adequately part of linguistic competence. When such a theory incorporates a system of explicitly presented postulates and rules that govern the pragmatically basic relations, it will be called a generative pragmatical theory.

Pragmatical investigations are intertwined with semantical considerations. We side with Montague in the dispute about representation of meanings: logical languages and model-theoretical systems of interpretation are the backbone of adequate semantical theories of natural languages.1 On the other hand, we take sides with all those linguists who have argued to the effect that at least some transformations have no semantical bearings, and hence, pace Montague, not every linguistic rule (of a syntactical nature) should be semantically justified. Dubbing this approach "logical generativism", we may call the present work a logical way of doing generative pragmatics.

In this paper we attempt to provide a framework in which the performative functions of sentences can be described and explained. Some aspects of J. L. Austin's views of speech-acts have been systematized in J. R. Ross's work on declarative sentences and J. F. Sadock's works on hyper-structures, but they are restricted to syntactical considerations.2 We are interested in the pragmatical point of view.

2. Semantical Preliminaries3

Semantics is a hydra-headed label, but we are interested here in just two heads. On the one hand it means the science of meaning, and on the other hand it refers to "a discipline which, speaking loosely, deals with certain relations between expressions of a language and the objects ... 'referred to' by those expressions ..., or possibly ... [the] 'states of affairs' described by them." (Tarski 1944, 345).
1 Richard Montague's papers will be published in a collection by Yale University Press. Meanwhile, consult Montague (1970) and Montague (1973).
2 Ross (1970) and Sadock (1969).
3 The reader who is fluent in some version of model-theoretical semantics may skip this chapter. See, however, the fourth conclusion of this chapter.


Since the cleavage between theories of meaning and theories of reference is endorsed both by philosophers and linguists4, let me disclose my sticker at the very outset: Unionist. A general theory of reference provides foundations of, and a framework for, an adequate theory of meaning for natural language. Replacing slogans by examples, consider first the phrase

(1) the former Pope.

This string of words does not refer on its own merits to anybody on earth, even when an appropriate syntactical structure is attached to it. Otherwise it could not be used to refer to different people on different occasions. Consider the following schematic description of the recent papacy:
Figure 1

An utterance of (1) at t_1 was used to refer to Pius XI, while an utterance of it at t_2 was used to refer to Pius XII. Generally,

(2) Ref("the former Pope", C) = f_1(t_C, P_C)

where C is the context of utterance of (1), t_C is the time of that utterance, and P_C is a succession of popes, say, from St. Peter on, future popes not excluded. Where the actual state of the papacy is not clear we can stick, say, to the official view of the Catholic Church or to an alternative view. The function f_1 transfers us, so to speak, from a given interval in figure 1 to the former interval, if any. The second argument of f_1(x, y) specifies the succession under consideration; the first argument specifies which interval in this succession serves as a starting point for the operation of f_1, which in turn moves us from x to the former interval of y, if any.

Notice that f_1(t_C, P_C) is sometimes undefined. Assuming that the consecration of the second pope, St. Linus, took place in 67, f_1 is undefined for every earlier point. Presuming that antipopes were popes, f_1 is undefined whenever there are two or more former popes; thus f_1(1122, P_C) is undefined, because both Gelasius II and Gregory VIII were then former popes. The function f_1 is defined at t if and only if any pope at t is an immediate successor of exactly one pope in P_C.

We can now see what the contribution of each element of (1) is to the reference made by any utterance of (1). The word "Pope" directs us to P_C, the word "former" brings forward t_C, and then f_1, and finally "the" introduces restrictions on the domain of f_1, mentioned in the former paragraph.
4 Quine (1961) 21f, 47ff and 130ff. Cf. Katz (1966).
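The partiality of f_1 can be made vivid by a small computational sketch; the reign dates below are rounded and purely illustrative, and the function f1 is our gloss on f_1, not a definition taken from the text:

```python
# A sketch of f_1 as a partial function over a papal succession P_C.
# Reigns (name, start, end) are rounded and purely illustrative.
P_C = [
    ("Pius XI",    1922, 1939),
    ("Pius XII",   1939, 1958),
    ("John XXIII", 1958, 1963),
]

def f1(t, succession):
    """The former pope at t, where defined; None marks undefinedness."""
    current = [n for n, s, e in succession if s <= t < e]
    if len(current) != 1:        # no pope at t, or rival popes at t
        return None              # f_1 undefined
    i = [n for n, _, _ in succession].index(current[0])
    if i == 0:                   # no former pope in this succession
        return None              # f_1 undefined (cf. the pre-67 case)
    return succession[i - 1][0]

print(f1(1945, P_C))   # 'Pius XI'  -- the reference of (1) uttered in 1945
print(f1(1960, P_C))   # 'Pius XII'
print(f1(1925, P_C))   # None: Pius XI's predecessor lies outside the fragment
```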


What would be the reference made by using (1) at t_2, had John XXIII been the successor of Pius XI? Obviously, the reference made would be to Pius XI rather than to Pius XII. The word "Pope" in (1) would then direct us not to P_C but to P_C', which is a possible history of the papacy, referred to in a possible context of utterance C' at which (1) was used.

Figure 2

One can imagine other contexts of utterance in which (1) is used to refer to popes in other possible courses of events, including different possible papal successions. However, (2) holds for each of these possible contexts of utterance, where "P_C" stands for what is the actual papal succession from the point of view of C, rather than from our point of view, or from the present Pope's point of view.

Summarizing what we have found so far, we maintain that when (1) is used in a context C to refer, the reference is a function of some aspects of the context C; one is the time-index t_C, involved in the use of "former", another is the exponent (as opposed to index) e_C, which is the possible world or possible history under consideration, that is involved in the use of "Pope". We take the meaning of the phrase "the former Pope" to be fully captured by a satisfactory characterization of the function Ref("the former Pope", C). One way of characterizing this function is sketched in (2), which needs however some further clarification.

The reference of "the former Pope" in C is a function of P_C, but how exactly should P_C be characterized? It is not the reference of "the Pope" in C, because it specifies a whole succession of popes and not just the one that heads the Roman Catholic Church in C, according to some view. On the other hand, given some means G_C for determining for a given time point t and a person e whether e is a pope at t or not, we can define the appropriate papal succession P_C in terms of G_C. Let us then amend (2) to read

(3) Ref("the former Pope", C) = f_1(t_C, G_C).

Again, one can imagine different C's that determine different G's; this is why we have G_C rather than just G. Every G_C is related to some possible history; the answer to the question whether e is a pope at t is relative to this possible history. Trying to get rid of this relativity we take G to be the means for determining for a given possible history h, a time point t and a person e, whether e is a pope at t in h. Again, a satisfactory characterization of G will be here considered to capture in full the meaning of the word "Pope". G_C is then the function G(h_C, *, *), where h_C is the possible history under consideration at C, and t and e are appropriate variables. Using the λ-notation for specification of variables, we have:


(4) Ref("the former Pope", C) = f_1(t_C, λt λe G(h_C, t, e)),

and, consequently, the meaning of (1), which is λC Ref("the former Pope", C), is represented as a function of the meanings of "the", "former" and "Pope" in (5):

(5) λC f_1(t_C, λt λe G(h_C, t, e)),

from which it is also clear that the context C in which (1) is used to refer contributes significantly to the reference, both by index and exponent.

Notice that we take a satisfactory characterization of G to be first of all a sufficient condition for capturing in full the meaning of the word "Pope". Whether it is a necessary condition or not is open to debate. We cannot do here more than declaring that for some expressions it seems to us to be a necessary condition as well. "Pope" is among these expressions. However, we admit that in many cases this could not be a necessary condition. For an extreme anti-necessitarian view, consult Schnelle (1973).

Having seen how the meaning and reference of the phrase "the former Pope" are related to semantical properties of its elements, we turn now to see how this phrase plays the role of an element in bigger phrases.

(6) The former Pope was Italian.

Denoting by "I" the meaning of "Italian", which is a function I(h_C, t, e), similar to the function G we saw earlier, and ignoring the tense of (6), it is clear that the meaning of (6) is a function of H (the meaning of "the former Pope") and I. Since both H and I have contextual arguments, viz. h_C and t_C, it should come as no surprise that the meaning of (6) turns out to have contextual arguments as well. Why (6) depends on the context of utterance should be clear by now. On one occasion, given appropriate index and exponent, the proposition made is true: John XXIII was, in the actual history, say, according to the official view of the Catholic Church, an Italian; on other occasions the produced proposition is false: John XXII was, according to the same official history, a Frenchman, and a possible history can be outlined in which John XXIII was not an Italian.

The meaning of (6) will then be taken to be fully captured by an adequate characterization of the function that relates a context of utterance (its indexes and exponent) to the truth-value of the proposition made by using (6) at that context. An adequate characterization of the meaning of (6) should show how the meaning of (6) depends on the semantic properties of its elements. I(h_C, t, e) is a function from exponents, time-points and persons to truth-values. Given any context of utterance C, we determine by I whether the reference of (1) in that context was an Italian, during the appropriate period of the possible history which is the exponent of C. In other words, we apply the function I(x, y, z) to a certain triple of arguments, as follows: for x we substitute 'h_C', denoting the exponent of C, which is the possible history under consideration, and for z we substitute 'H(C)', which refers to the same person, if at all, that (1) refers to in C. What we substitute for y depends on an analysis of the tense of (6) which is beyond the scope of our discussion.

Altogether we have:

(7) I(h_C, t'_C, H(C))

which is either true or false, or undefined in case the last argument fails to refer to anything. The following partial function (8) from contexts of utterance to truth-values represents the meaning of (6):

(8) λC I(h_C, t'_C, H(C))

The purpose of this section has been to hint at our semantic framework.

1) A semantic theory of a language is a theory of functions that are related to phrases of the language and their semantical properties; given a phrase x, a word or a sentence, and an appropriate context of utterance C, the function attached by the theory to x supplies the value at C, if any. Values of some noun-phrases are individuals referred to by using x in an appropriate C.

2) Values of other phrases, e.g., predicates, adjectives and prepositions, are functions that contribute to the functions attached to other phrases in which the former ones appear. Generally, a linguistic adequacy criterion for a semantic theory is that the function attached by the theory to a phrase x be determined by the functions attached to the elements of x and by its syntactic structure.

3) Functions may not be defined for some arguments; for example, a noun phrase x may fail to refer whenever uttered in contexts in which presuppositions induced by x do not obtain.

4) The contribution of any C is two-fold: first, it provides values of indexes: speaker, addressees, time and place of utterance, and so forth. Secondly, it fixes an exponent, including a possible factual background for x's functioning. An exponent may be a possible state of affairs, a possible succession of such states, or, generally, any class of such states. We use the term "exponent" in order to be neutral with respect to the extent and structure of those classes.

Justification and detailed development of such frameworks are not our job here and we refer the reader to works by Montague, Lewis and others.5 In what follows some problems will be considered within such a framework.
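A minimal sketch, with invented histories and nationalities, of how the pieces (2)-(8) compose: a context supplies an index t_C and an exponent h_C; H(C) plays the role of Ref("the former Pope", C); and meaning_of_6 is the partial function (8):

```python
# A sketch of the composition in (2)-(8); the data are invented stand-ins
# for the exponents and indexes discussed in the text.
from dataclasses import dataclass

@dataclass
class Context:
    t: int     # time index t_C
    h: str     # exponent: the possible history under consideration, h_C

REIGNS = {"official": [("Pius XI", 1922, 1939), ("Pius XII", 1939, 1958),
                       ("John XXIII", 1958, 1963)]}
NATIONALITY = {"Pius XI": "Italian", "Pius XII": "Italian",
               "John XXIII": "Italian"}

def H(C):
    """Ref("the former Pope", C) = f_1(t_C, lambda t, e: G(h_C, t, e))."""
    reigns = REIGNS[C.h]
    now = [i for i, (n, s, e) in enumerate(reigns) if s <= C.t < e]
    if len(now) != 1 or now[0] == 0:
        return None                     # reference failure: H(C) undefined
    return reigns[now[0] - 1][0]

def I(h, t, e):
    """The meaning of "Italian": histories x times x persons -> truth-values."""
    return NATIONALITY.get(e) == "Italian"

def meaning_of_6(C):
    """(8): lambda C . I(h_C, t'_C, H(C)); undefined if H(C) fails to refer."""
    person = H(C)
    return None if person is None else I(C.h, C.t, person)

print(meaning_of_6(Context(t=1945, h="official")))   # True
print(meaning_of_6(Context(t=1925, h="official")))   # None: no former pope
```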

3. The problem of other moods

We assume that the meaning of the sentence

(9) He is still alive

is captured satisfactorily by an adequate presentation of some partial function from contexts of utterance to truth-values. Such a presentation seems possible, and even natural, because whenever (9) is uttered in an appropriate context, what is said may be true or false.
5 Lewis (1972) is an excellent introduction to Montague (1973). Consult also Thomason (1972). Notice that we have not subscribed here to any particular logical language or theory, but to the general conception of model-theoretical semantics for natural languages.


A context of utterance is inappropriate for (9) if, for example, it does not specify the reference of "he" in (9). Thus, lack of information may cause the function attached to (9) to be undefined for some contexts. Consider now the following sentences:

(10) May he be still alive!
(10') Is he still alive?
(10'') Who is still alive?

Trying to attach to each of these sentences a function, depicting its meaning along similar lines, we have no difficulty in circumscribing the domains of these functions (the classes of elements serving as arguments of these functions), viz. contexts of utterance determining indexes and exponents. What are the values of these functions when defined? That is the question. Assuming that no information is lacking in a context C of uttering (10), we are still inclined to say that what was said was neither true nor false. Since we assume that no information pertaining to (10) is lacking in C, the third alternative, viz. that the truth-value is undefined, is also excluded. Our functions are defined for some C's and their values seem not to be 'true' or 'false'. What are they, then?

Without adhering to any solution of that problem, we mention two useful distinctions that have been drawn in relation to some solutions. Philosophers since Frege (1918/19) distinguish between two logical elements of a sentence: the element which the sentences (9), (10), (10') and (10'') share, and what they do not share. The former is the "descriptive content" of the sentences, indicatives and non-indicatives as well; the latter is the "mood" of the sentence (indicative, interrogative, imperative, etc.), indicated in many cases by a performative verb or by the syntactical form of the sentence. E. Stenius (1964) 168 pointed out that there are different ways in which a mood/radical distinction can be applied, and introduced the terms "grammatical mood" and "semantical mood"6. Obviously, the grammatical mood of a sentence may be different from its "deeper" mood:

(11) You will pack at once and leave this house.

This sentence is of the indicative mood in the grammatical sense, but in many contexts it is of the imperative mood, pragmatically. In what follows we shall confine ourselves to a discussion of semantically non-indicative sentences. Our major problem in the present paper is how to present the meanings of their moods, within the formal framework outlined earlier. However, we shall have something to say about the use of indicative sentences as well.
6 Jespersen used "notional" for the same purpose (1924) 319. Better terms would be "syntactical mood" and "pragmatical mood", but we shall use Stenius' terms.
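Put in programming terms (a gloss of ours, not the author's), the problem of this chapter is one of typing: for indicatives the meaning is a partial function from contexts to truth-values, while for (10)-(10'') the domain is clear but the codomain is exactly what is missing:

```python
# A sketch of the typing problem: what should V be in Meaning = Context -> V
# for non-indicative sentences? The toy context layout is invented.
from typing import Callable, Optional

Context = dict
Meaning = Callable[[Context], Optional[bool]]   # fits indicatives like (9)

def meaning_9(C: Context) -> Optional[bool]:
    """ "He is still alive": true, false, or undefined (no referent)."""
    if "he" not in C:
        return None                              # lack of information
    return C["exponent"]["alive"].get(C["he"])

indicative_meaning: Meaning = meaning_9          # well-typed

def meaning_10(C: Context):
    """ "May he be still alive!": the domain is the same, but no member of
    {True, False, undefined} is an acceptable value; the codomain is open."""
    raise NotImplementedError("see chs. 4-7 for candidate answers")
```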


4. Stenius and Åqvist on the other moods

One theory of meaning for (semantically) non-indicative sentences postulates the existence of a sentence-radical and a sentence-mood for every sentence, indicative and non-indicative as well. Thus, at some level of representation the suggestive notation of (16)-(19) can be used for (12)-(15), respectively:

(12) Do you live here now?
(13) It is obligatory for you to live here now, or: live here now!
(14) You live here now.
(15) It is necessary that you live here now.

(16) ?p
(17) Op
(18) Ip
(19) INp7

Notice the difference between Ip and p. The latter does not denote the sentence (14). It is a sentence-radical which does not have a mood, and it may be expressed in English by (20) rather than by (14):

(20) that you live here now.

The operators ?, I and others, which combine with sentence-radicals to form representations of sentences, play a role both in the syntactical and semantical processes. The syntactical role of such operators is well known, having been used by some linguists, though under different terms. Katz and Postal (1964) 86ff and Katz (1972) 201ff introduced a special morpheme Q into the underlying structures of yes-no questions, and Ross (1970) argues to the effect that declarative sentences are derived from deeper structures including as their main verb a complex symbol, [+V, +performative, +communication, +linguistic, +declarative], which is defended on syntactical grounds. Whether these detailed analyses stand up to criticisms or not is not of our concern here (cf. Fraser 1970). It suffices to notice that the postulated operators used in (16)-(19) provide syntactical anchors, that is, starting points for a transformational process accounting for the surface differences between (12) and (14).

The main point of Åqvist's variant of Stenius' theory is that the sentence-radical specifies what should be true somewhere and the sentence-mood indicates in a sense where (Åqvist 1967). When the sentence (14) is under consideration, what is true in some possible world is (20), that you live here now, which is the sentence radical of (14). The sentence mood of (14) indicates in which possible world (20) is true, according to what is conveyed by (14) in its standard use.

7 Stenius' own notation for (19) is "Np". Cf., however, his paper (1967). We shall not discuss the differences between Stenius' two works.


Since the semantical mood of (14) is the indicative, the possible world indicated is usually the actual world; more accurately, if a context of utterance C has the indexes <addressee: John>, <time: t>, and <place: p>, then what is conveyed by a standard use of (14) at C is that it is true at the exponent of C, i.e., in the possible world which is actual from the point of view of C, that John lives in the vicinity of p at t.

Consider now the sentence (13). As Åqvist pointed out, it is related to the sentence

(21) Let it be the case that you live here now

where the italicized part stands for the mood and the rest for the radical (20). A better approximation to (13) would be:

(22) See to it that it is the case that you live here now.

Any standard use of (13) or (22) conveys that their radical is true somewhere. It is the function of the operator O, verbalized by what stands in front of the that-clause in (22) or by the syntactical form of the imperative, to indicate that the radical is true in every possible world in which the command produced by a standard use of (13) or (22) is obeyed. Yes-no questions are treated similarly, being disguised requests for information.8

The ingenuity of Åqvist's variant of the mood/radical theory of non-indicative sentences does not conceal its shortcomings. It seems to render imperatives and certain interrogatives amenable to an ordinary treatment within an outlined theory of truth conditions of sentences in possible worlds. Given a sentence, in whatever mood, three steps are required: first, its radical and mood are determined; secondly, a class of possible worlds is specified, depending on the determined mood; and thirdly, truth-conditions are specified for the determined radical in these possible worlds. Now each of these steps involves difficulties. Given a sentence, it is perhaps easy in many cases to determine its radical, but it is not so easy to determine its mood; this difficulty motivates the distinction between semantical and syntactical moods. Given a radical of a sentence, it is by no means easy to provide its truth-conditions in possible worlds, but since general specification of such truth-conditions is a goal of formal semantics anyway, we ignore the difficulties involved in this step. The second step is based on a possible characterization of moods in terms of relations between possible worlds: supposing that a sentence S of radical r and mood m is used successfully in possible world w_0, we have the radical r true in every possible world w which is an appropriate alternative to w_0, or in other words such that the relation R_m(w_0, w) holds. R_m, the m-alternativeness relation, is the core of this analysis of m-sentences. A mood has not been characterized unless a theory has been provided that marks the properties of R_m which make it different from any other non-intended mood-relation R_m'.
8 This observation was made by Jespersen, loc. cit., and also by Åqvist (1967).
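The three-step procedure just outlined can be put schematically as follows; the worlds and the relations below are invented toy examples of the alternativeness relations R_m:

```python
# A sketch of the mood-as-alternativeness analysis, with invented toy
# worlds: R_m picks out, for the world w0 of utterance, the worlds in
# which the radical r must be true.
WORLDS = {
    "w0": {"you live here now": False},
    "w1": {"you live here now": True},   # a world obeying the command
    "w2": {"you live here now": True},
}
R = {
    "I": {("w0", "w0")},                 # indicative: the actual world itself
    "O": {("w0", "w1"), ("w0", "w2")},   # imperative: the obeying worlds
}

def conveyed(mood, radical, w0):
    """An m-sentence used at w0 conveys: r true at every w with R_m(w0, w)."""
    return all(WORLDS[w][radical] for v, w in R[mood] if v == w0)

print(conveyed("O", "you live here now", "w0"))  # True: obeyed in w1 and w2
print(conveyed("I", "you live here now", "w0"))  # False: not true at w0
```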


A theory of moods should provide a general framework for the representation of all moods and an explanation of the variety in moods and its internal relations. Stenius' and Åqvist's suggestions provide neither a general theory of moods, nor a characterization of any single one of them. Notice that a theory of moods has two different goals. Characterization of each mood separately is not enough. Explanation of the variety is impossible where no measures for comparing moods, or sentences of different moods, are available, and measures for comparison can evolve only within a common framework. Hence, any adequate theory of moods should include at least some important elements of a unified theory of moods. In one sense a cleavage of logic according to mood into logics of commands, questions, advice or what have you, is a desolation of logical theories of mood. In the sequel we adopt the basic distinction between moods and radicals and the idea of characterizing moods by specifying truth-conditions in appropriate possible worlds. But first we consider a rival theory.

5. David Lewis on other moods

David Lewis proposes a method of "paraphrased performatives" (Lewis 1972, 205-212). First, an explicit performative sentence is paired with each non-indicative sentence. For example:

(23) Be late

and

(24) I command you to be late

are associated. Non-indicative sentences are "treated as paraphrases of the corresponding performatives, having the same base structure, meaning, intension and truth-value" on an occasion (Lewis 1972, 208). Secondly, the truth conditions of (24) on its performative reading are equated with its truth-conditions on its non-performative, self-descriptive reading.

Doubt can be thrown on the success of this ingenious program. For some reason, indicative sentences are not on a par with non-indicative ones, which are being treated as paraphrases of some explicit performatives. The sentences (25) and (26) are taken to have distinct truth conditions:

(25) The earth is flat.
(26) I declare that the earth is flat.

This unwarranted discrepancy between indicatives and non-indicatives is a symptom of some deeper difficulties in both parts of Lewis' program. First of all, Lewis' proposal hinges upon the assumption that to every sentence there corresponds an explicit performative. There are, however, several kinds of sentences which resist restatement. Consider the following examples9:
9 (27) is discussed in Culicover (1972).


(27) One more can of beer and I'm leaving.
(28) Clearly, I concede that I've lost the election.
(29) Be glad that we aren't taking away your teddy-bear.
(30) Go in the back door and hope no one sees you.
(31) Give me an Akvavit, or don't you approve of drinking?

The difficulty in paraphrasing (27) is due to the fact that it cannot be embedded within an explicit performative hyper-structure without modifying the first conjunct of (27) to include a verbal phrase. It has, however, been shown that the interpretation of that conjunct is "systematically unspecifiable". Since systematic paraphrasing requires systematic specifying, and the latter is impossible, the former is also impossible.

Another problem is exposed in (28). Any movement of the adverb will result in either a non-grammatical expression or a sentence of a different meaning. Adjoining an additional explicit performative hyper-structure will not do either. Now, the following sentence (32) is not a paraphrase of (28), because the adverb in the latter is significant, while the second conjunct in the former is pointless; and (33) is not a paraphrase of (28), because (34) is not grammatical.

(32) I concede that I've lost the election and it is clear that I concede that I've lost the election.
(33) It is clear that I concede that I've lost the election.
(34) It is clear that I hereby concede that I've lost the election.

Another problem is presented in (30), viz. how to paraphrase the second conjunct. (35) is a near paraphrase of it:

(35) You should hope that no one sees you.

but in a sense it is too strong, since it presupposes the existence of some particular "institution" or set of rules according to which you have to hope no one sees you, a presupposition not shared by (30), as is clear from (36):

(36) You don't have to hope no one sees you, but hope no one sees you.

Even in the simpler case of (29) any "You should" expression is too strong, since no kind of obligation is involved in standard contexts of utterance of (29). One might argue that "should" is ambiguous, and that no obligation is presupposed in (37):

(37) You should be glad that we aren't taking your teddy-bear.

but it has been argued10 that "should" is not ambiguous, and that (37) does presuppose the existence of some rules that, if followed, would lead you to the conclusion that you should rather be glad than sorry. However, even if (37) does not presuppose any kind of obligation or rules, it is not a paraphrase of (29), because the latter presupposes that you are not glad in the context of its utterance, while the former does not:

(38) You are glad and you should be.

10 Wertheimer (1972). I take this argument to mean that the difference is meant to be a difference in meaning and not in use. We return to this point later.


Our final example, (31), is a case of one speech act affecting the force of another one. This interaction is not depicted adequately by an explicit request followed by an explicit question:

(39) I hereby request that you give me an Aquavit and I hereby ask whether you don't approve of it.

Our hypothesis that not every sentence has an explicit performative paraphrase is not in conflict with any principle of expressibility, as formulated by Frege, Tarski or Searle, under different titles (cf. Searle 1969, 19f). According to the latter, for example, for "any meaning X and any speaker S, whenever S means (intends to convey, wishes to communicate in an utterance, etc.) X, then it is possible that there is some expression E such that E is an exact expression of or formulation of X." (Searle 1969, 20). Indeed, what is at stake here is not even whether it is possible for natural languages to provide explicit performative paraphrases for each non-indicative, but whether it is possible to express what is expressed by non-indicatives, and this has obviously an affirmative answer.

Notice also that in some cases what looks like the appropriate explicit performative paraphrase of a given sentence in a speech act is non-grammatical for most speakers11:

(40) *I hereby ask you what time it is.
(41) *I hereby fire you.
(42) *I hereby point out that most speakers find this sentence unacceptable.
(43) *I plead that you will spare my life.
(44) *I threaten that I will kill you.
(45) *I boast that I have done that.

Again, the non-grammaticality of (40)-(45) does not violate the principle of expressibility, because each of these expressions has a non-indicative paraphrase, having the appropriate force under standard circumstances. The lack of strict correspondence between sentences and explicit performatives is perhaps at the root of the seeming divergence of the meanings of indicatives from their purported explicit performative paraphrases. An adequate theory of explicit performatives will undoubtedly have syntactic ingredients.

The other part of Lewis' program does not seem to fare better. There are several objections to the semantic identification of explicit performatives and self-descriptive readings of sentences like (24). Lewis maintains that different sentences which share their meanings may have different uses, but he does not outline any theory of use. It remains to be seen how such a theory is worked out to accommodate a distinction between truth- and felicity-conditions, presuppositions and conversational implicatures, and so forth. In the sequel we present what is, in a sense, a theory of use, but we do not identify the meaning of a non-indicative with that of the corresponding self-descriptive sentence, if any.
11 These examples are due to J. Sadock.



We turn now to another objectionable aspect of that identification thesis.12 A sentence (type) is pragmatically self-verifying (-falsifying) if, and only if, every happy use of this sentence, or any other sentence having the same meaning and performativeness status, must say something that is contingently true (false). It is reasonable to assume, following Gale, that a sentence is pragmatically self-verifying if, and only if, its contradictory is pragmatically self-falsifying. Now, from the identification hypothesis it follows that performatives are true or false, and pragmatically self-verifying, self-falsifying or neutral. Actually, "I hereby promise you to come" and similar performatives turn out to be self-verifying: in a happy use of this sentence a promise is issued, and a corresponding descriptive sentence would have been true, had it been uttered and used happily somehow in the same context, in addition to the performative sentence.

Moreover, if performatives have truth-values, they have contradictories as well; the contradictory of "I hereby promise you to come" is naturally "I do not hereby promise you to come", which is, surprisingly, self-verifying as well.13 Thus, both a performative and its contradictory are self-verifying, i.e., contingently true in each of their happy uses. If one adopts Lewis' suggestion of saying one sentence and at the same time putting down in writing another sentence, one can produce a performative and its contradictory at one and the same time. Saying of one of them that its use is unhappy would be arbitrary, and saying it of both is ruling out too much: one of them is true, and consequently cannot be used unhappily. Indeed, to say that both are used happily would involve us in a contradiction.

Finally, we mention an odd consequence of the identification hypothesis: performatives are self-verifying, but their self-descriptive paraphrases are almost always false. If we exclude the possibility of performing two different acts simultaneously, these self-description sentences are self-falsifying. It is odd that paraphrases should differ in their truth-values to such an extent.

6. Preconditions and implicatures

The next part of this paper will be devoted to a proposal of an alternative theory. Moods will be characterized within a pragmatical, rather than a semantical, framework, which however will include a new substantial semantical element. It is fairly obvious that besides the syntactical concept of grammaticality and the semantical concept of meaningfulness there is some linguistic concept of pragmatical appropriateness. A happy linguistic use of any given sentence is possible only in some possible contexts of its utterance.

12 See Gale (1970) and Sampson (1971). Both do not apply it, however, to our problem.
13 We ignore here some niceties mentioned by Gale.


It is linguistically pointless to address an English sentence to a German whom you assume not to know English at all. Similarly, if your intention is to convey information directly, and not, say, to tell a joke, it is inappropriate from a purely linguistic point of view to tell your friend anything about hairs or heirs of what you call "the present king of France" if you know that either you or your friend or both of you fail to believe, at the context of utterance under consideration, that there is a person the reference to whom by that famous expression, at that context, is appropriate. From the same point of view it seems linguistically hollow to ask the girl who lives next door what the time is when you know exactly what it is.

Appropriateness is a relation between speakers, sentences and contexts of utterance. We shall not dwell here on the important problems of how to characterize contexts of utterance. We mentioned earlier that contexts of utterance determine, at least partially, indexes and exponents, i.e. there are functions index(C) = <speaker α, addressee(s) β, time t, place pl, ...> and Exp(C) = e_C. Singling out one of the indices of these contexts, the speaker α, we abbreviate the predicate "appropriate for uttering sentence S by speaker α", applicable to contexts of utterance, to "α-S-appropriate" or "S-appropriate for α". Every adequate theory of appropriateness of contexts for sentences (for a particular language L) will have true consequences of the form:

(46) Every α-S-appropriate context of utterance, C, has the property φ.

At the moment we do not impose any severe restrictions on φ except that it depends on C or S, i.e., φ = φ(C, S). φ may, of course, depend on α, but this is implicit in its expressed dependence on C. Assuming that φ₁(C, S) is expressible in L for every C and S, we try putting (46) differently. An adequate theory will have true conclusions of the form:

(47) In every α-S-appropriate context of utterance C the sentence φ₁(C, S) is true.

However, (47) might be misleading, blurring the distinction between what is true in a context of utterance and what is true in the exponent of a context of utterance, e.g., in a possible world that is under consideration in that context of utterance. In 1967 an instance of sentence (48) was issued by an author of some dictionary of philosophy:

(48) Pythagoras founded a reactionary Pythagorean Union in Croton.

What is true in that context of utterance is for example that the speaker is a communist and that Pythagoras is dead. In the exponent of that context of utterance it is however true that Pythagoras is alive and not a materialist. The distinction between a context of utterance and its exponent does not exclude the possibility of a context of utterance's being its own exponent, under some circumstances, but this is not the standard case. If S is true in the exponent of C then indeed it is true in C that S is true in its (C's) exponent. Thus, truth-statements pertaining to exponents of contexts of utterance are directly related to truth-statements about these contexts themselves.
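As a sketch (with an invented appropriateness test standing in for a genuine consequence of the form (46)/(47)), contexts of utterance and the predicate "α-S-appropriate" might be modelled thus:

```python
# A sketch of contexts of utterance: index(C) and Exp(C) as in the text,
# with an invented toy instance of phi(C, S).
from dataclasses import dataclass, field

@dataclass
class Context:
    speaker: str          # alpha
    addressee: str        # beta
    time: int
    place: str
    exponent: dict = field(default_factory=dict)   # e_C

def appropriate(alpha, S_language, C):
    """A toy phi(C, S): C is alpha-S-appropriate only if alpha is the
    speaker of C and the addressee is assumed to understand S's language."""
    return (C.speaker == alpha and
            C.addressee in C.exponent.get("speaks", {}).get(S_language, set()))

C1 = Context("alpha", "beta", 1974, "Berlin",
             exponent={"speaks": {"English": {"alpha", "beta"}}})
print(appropriate("alpha", "English", C1))   # True
print(appropriate("alpha", "German", C1))    # False: pointless address
```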


Since the opposite is not true (truth-statements with regard to some indexes of the context may have nothing to do with its exponent), we replace (47) by a better, two-fold reformulation; every adequate theory of appropriateness for L will have true consequences of the following forms:

(49) For every α-S-appropriate context of utterance, C, the sentence φ₁(C, S) is true in e_C (= the exponent of C), or: For every α-S-appropriate context of utterance, C, the sentence φ₁(C, S) is true in C.

Now, interesting cases of either form are those in which φ₁(C, S) depends only on S, i.e. φ₁(C₁, S) = φ₁(C₂, S) for any α-S-appropriate contexts of utterance C₁ and C₂. These are the cases where there is a sentence S' of L which is true in every exponent of an α-S-appropriate context of utterance, or in every α-S-appropriate context of utterance (whoever the speaker is). Such sentences are sometimes called "presuppositions of S", but we shall not use this expression for that purpose. Instead we adopt Thomason's suggestion to distinguish between semantical and pragmatical presuppositions, and further introduce "precondition" and "implicature" as technical terms, employing the following definitions.

Definition I
A sentence S' is an exponent α-precondition of a sentence S: In every α-S-appropriate context of utterance C it is true that S' is believed by α to be true at e_C. In other words, S' is believed by α to be true in the circumstances that are discussed in C.

Definition II
A sentence S' is a context α-precondition of a sentence S: In every α-S-appropriate context of utterance C it is true that S' is believed by α to be true at C. That is to say, S' is believed by α to hold in the context of utterance.

It is a necessary condition for α's performing a happy speech act by uttering S in C that all the sentences that are exponent α-preconditions of S will be believed by α at C to be true in e_C, and that all context α-preconditions of S will be believed by him at C to be true then. Now the concept of exponent α-precondition, for example, is not a linguistic one. Consider a speaker α' who pretends to be a French monarchist. He does not believe that there is presently a person who may be properly referred to as "the present king of France", but he is never reluctant to utter sentences like "Long live the king of France!" or "The present king of France has no heirs". Obviously, the sentence "There exists a person who is presently a king of France" is not an exponent α'-precondition of the former sentences.


In order not to blur the issue by introducing such performance effects, we restrict the realm of speakers under consideration to that of ideal speakers, who mean exactly what they say, in each and every context of utterance, in distinction from the above mentioned fake monarchist. Later on we shall elaborate this point. Meanwhile we notice that the following concepts are purely linguistic ones.

Definition III
A sentence S' is an exponent precondition of a sentence S: For every ideal speaker α (of the language L to which S and S' belong) S' is an exponent α-precondition of S.

Definition IV
A sentence S' is a context precondition of a sentence S: For every ideal speaker α (of the language L to which S and S' belong) S' is a context α-precondition of S. In the sequel we omit the reference to L.

Definition V
A sentence S' is a precondition of a sentence S: S' is an exponent precondition of S or S' is a context precondition of S.

Whenever S' is a precondition of S, S' is (semantically) indicative, at least in the S-appropriate contexts of utterance. Otherwise S' could not have been true in these contexts. Indeed, sentences have preconditions even if they are not indicative in any context appropriate for their utterance. Nevertheless, if a sentence S is indicative in every S-appropriate context of utterance, then S has one distinguished precondition, viz. S itself. This should come as no surprise: when an ideal speaker utters such a sentence he is committed to granting then that the sentence is true in the circumstances under consideration. An ideal speaker believes what he asserts. Similarly, an ideal speaker α₁ who utters (50) in a context C, which is α₁-(50)-appropriate, is committed to granting that (51) is true at e_C:

(50) Ellsberg's present psychiatrist's former office is not in Washington.
(51) There is presently exactly one psychiatrist who renders his professional service to Ellsberg.

Bringing into the picture the ideal speaker to whom (50) is addressed in C, the addressee β, one wonders how β is involved in his fellow's preconditions.


If it were necessary that β believes at C that (51) is true in eC, then α would have been required to know it at C. Being required to be thus knowledgeable, α would have to entertain true, justified beliefs about β's own beliefs. This is a stern demand. When we scrutinize a speech act of asserting, a necessary condition for its happiness is that the (ideal) speaker involved believes what he asserts; it is to the best of his belief that what he asserts is true in eC. If, when we consider the heart of an assertive speech act, we find that genuine belief, rather than knowledge, is demanded, why should we expect to find a demand for knowledge, rather than justified belief, when we consider one limb of such a speech act? While the addressee is exempted from actual belief at C that (51) is true in eC, he has to be taken by the speaker to entertain this belief. It is to the best of the speaker's belief at C that it is to the best of the addressee's belief at C that (51) is true in eC. Otherwise C would not have been an α-(50)-appropriate context of utterance for the speaker α. If the speaker has somehow been assured that his addressee believes that Ellsberg presently uses the services of no psychiatrist, then he is not entitled to use (50). Similarly, if the speaker takes it for granted that according to the addressee's beliefs Ellsberg is presently treated by a battery of psychiatrists, the speaker is not entitled to use what purports to be a uniquely referring expression at C: "Ellsberg's present psychiatrist". We see, then, that in an α-(50)-appropriate context of utterance the (ideal) speaker not only believes (51) to be true in eC, but also takes it for granted that the addressee shares this belief with him. An ideal speaker does not disregard the beliefs of the addressee about the population of the possible world or worlds under consideration, because by performing a speech act he participates deliberately in a cooperative activity. The speaker intends to affect his fellow's beliefs, and one cannot affect something intentionally without having some presumptions about it. It is clear that not every belief the speaker has to hold in an α-(50)-appropriate context of utterance should be taken by him to be shared with the addressee. In the case of the precondition (50) itself, we should rather expect the speaker to believe that the addressee has not yet adopted it. Otherwise it would be, from a pragmatical point of view, linguistically pointless to address it to him. The principle we have just applied is the maxim of precluded superfluity14, taken as well to be part of our characterization of ideal speakers. Applied to phrases and not only to sentences, this maxim provides an explanation for the different degrees of pragmatical inappropriateness of (52) and (53) in standard contexts of utterance.

(52) I met yesterday an unmarried, illiterate bachelor.
(53) I saw in the zoo a black raven.

14 We apply Grice's observations but introduce some new terms.


Consider now the expression "Ellsberg's present psychiatrist". According to the maxim of precluded superfluity, there is a reason for the (ideal) speaker's using this expression rather than, say, "Ellsberg's psychiatrist". The reason seems to be that the (ideal) speaker believes there to be some other psychiatrist rendering his service to Ellsberg, besides the incumbent. Obviously, this other psychiatrist is not another present psychiatrist of Ellsberg's, but rather a former or a future psychiatrist of Ellsberg's. In an α-(50)-appropriate context of utterance, our ideal speaker is not required to presume that he shares this existential precondition with his addressee. Nor is he required to presume that the negation of this precondition is thus shared. Hence, there are three kinds of association of a precondition with the related belief about the addressee's beliefs. Let C be any α-S-appropriate context of utterance, in which α is the speaker and β is the addressee.15 Denote by "B(γ, C, w, X)" the statement that person γ believes at C that X is true in w. Confining ourselves to exponent preconditions, the three possibilities are as follows:

(54) B(α, C, eC, S') & B(α, C, eC, B(β, C, eC, S'))
(55) B(α, C, eC, S') & B(α, C, eC, ~B(β, C, eC, S'))
(56) B(α, C, eC, S') & ~B(α, C, eC, B(β, C, eC, S')) & ~B(α, C, eC, ~B(β, C, eC, S'))

Definition VI S' is a strong exponent implicature of S: For every ideal speaker α and for every α-S-appropriate context of utterance C, in which α is the speaker and β is the addressee, (54) is true.

Strong context implicature is defined similarly, using (57) instead of (54):

(57) B(α, C, C, S') & B(α, C, C, B(β, C, C, S'))
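For instance (our gloss, spelling out the example above): taking S = (50) and S' = (51), (54) says that α believes (51) to be true in eC and also believes that β believes it; since this holds for every ideal speaker α in every α-(50)-appropriate context of utterance, (51) is a strong exponent implicature of (50).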

Weak exponent and context implicatures are defined differently. We do not prohibit α from believing that a certain belief is entertained by β. The speaker is simply not required to believe that he shares a certain precondition with β, nor is he required to believe that they do not share it. In other words, α is linguistically exempted from having certain beliefs pertaining to β's beliefs.

Definition VII S' is a weak context implicature of S:
(i) S' is a context precondition of S;
(ii) B(β, C, C, S') is not a context precondition of S;
(iii) ~B(β, C, C, S') is not a context precondition of S.

15 Formally speaking, β is a meta-variable whose value is fixed by C in a certain way. Intuitively it denotes in each case the addressee (or class of addressees). The same holds for α, mutatis mutandis.

Weak exponent implicature is defined similarly. We now define two natural notions of pragmatical presuppositions.16

Definition VIII

S' is a strong (weak) exponent pragmatical presupposition of S: S' is a strong (weak) exponent implicature of both S and its negation.

Definition IX S' is a strong (weak) context pragmatical presupposition of S: S' is a strong (weak) context implicature of both S and its negation.

Thus, (51) is a strong exponent pragmatical presupposition of (50), and (58) is a weak exponent pragmatical presupposition of (59):

(58) These are impressions from familiar places in Charlottesville.17
(59) There are places in Charlottesville which are not familiar.

We explicate some pragmatical concepts in terms of beliefs. Without indulging ourselves in a discussion of how belief is related to knowledge, we would like to clear away one natural misunderstanding. When α is said deliberately to believe that sentence S is true in some possible world w, what is ascribed to him is significantly less than believing truly that S is true in w. Now we apply the maxim of precluded secrecy, according to which ideal speakers do not suppress any relevant information. When β's attitude towards S's being true in w is under consideration, an ideal speaker α will not say that β believes S to be true in w if α believes that β knows that S is true in w. Hence, a belief-statement has among its weak exponent implicatures the negation of the corresponding knowledge-statement. However, a belief-statement does not imply the negation of that knowledge-statement. If we take knowledge to be at least true, justified belief, then the two propositional attitudes are indeed compatible. Definition I, for example, does not imply that a sentence S' is not an exponent α-precondition of a sentence S if in some α-S-appropriate context of utterance C α knows that S' is true in eC. Our definitions are stipulations of truth-conditions, and consequently "believes" should be taken in the natural sense of "at least believes".18

7. Mood-preconditions and implicatures

Every speech act in which a sentence S is employed under certain circumstances C centers around a manipulation of S's radical at C. Indeed, what subjects a sentence-radical to the right type of manipulation, at an appropriate context of utterance, is the accompanying sentence-mood.
16 These notions are more similar to Thomason's pragmatical presuppositions than to Stalnaker's, in his paper (1972).
17 We assume that C provides reference for "these" adequately.
18 A similar case is involved in the meaning and use of expressions like "there is one ...".


In some cases sentences show their semantical mood in their grammatical mood. Explicit performative expressions are similarly efficacious. But grammatical moods and explicit performatives do not make up a general means of putting forward semantical moods. Our contention is that every sentence carries a class of weak context implicatures that characterizes its semantical mood. To be sure, what we propose is a pragmatical characterization of moods. Roughly speaking, any performance of a speech act comprises two acts of choosing: the speaker can be thought of as picking out a semantically characterizable radical, and indicating, by way of choosing a semantical mood, to what use he is putting that radical. One way of using a radical is to import somehow that a certain predicate applies to it or that it participates in certain relations. Now, a semantical mood signifies some standard relations in which an accompanying radical participates. By choosing a certain mood the speaker shows that he believes these relations to obtain at the context of utterance. Which are these standard relations? The claim that we would like to put forward is that the standard relations are preference relations exclusively. When an ideal speaker α commands β by saying "Give me the file marked confidential", he conveys his preference of a state of affairs in which he has been given the file by β over a state of affairs in which he has not been given it by β, assuming that these two states of affairs are different only in this respect. If α asks β by uttering "Where were you born?", he makes it known to β that, everything else being equal, he prefers knowing where β was born to not knowing it. Similarly, upon asserting by using the sentence "The notion that the end justifies the means proved contagious", α manifests the preference on his part of one state of affairs, w1, over another state of affairs, w2, if they differ from each other exactly to the extent that in w1 β knows that α believes that the notion that the end justifies the means proved contagious, while in w2 β does not know it.

We proceed now to an elaboration of one of these observations by way of illustration. For a certain sentence S we shall show that an expression of an appropriate preference relation is a context precondition of S and a weak context implicature of it, but not a pragmatical presupposition of that sentence. Then we generalize. Consider, then, a context of utterance C1, in which an ideal speaker α addresses to β the following sentence:

(60) Give me the file marked confidential.

Let C0 be a context immediately preceding C1; both α and β are present, and α is about to utter (60). Now, there are many possible courses of events extending somewhat beyond C0 and C1. We divide them into two disjoint classes: in the first class there are all those in which, not much later than C1, β gives α the file marked confidential, and in the second there are all the rest, i.e., all those in which β does not give α that file a short while after C1. (In both cases the relevant time-interval stretches from the endpoint of C1.)


Let us pick out two possible courses of events, w1 from the first class and w2 from the other one. Although we know that α wants β to give him the file marked confidential, and in w1 but not in w2 β does it, we cannot generally be assured that α prefers w1 over w2. They might be different in too many respects. If in w1 β gives α the file and then shoots him to death, while in w2 β just leaves and disappears, it is reasonable to assume that α prefers w2 to w1, rather than the other way round. However, knowing that α is an ideal speaker who utters (60) in C1, what we can be assured of is that, given w3 and w4 that are different from each other exactly to the extent that in w3 β gives α the file marked confidential a short while after C1 and in w4 he does not, α prefers w3 over w4. Introducing a piece of notation, a preference relation that is involved in an appropriate use of (60) by α is:

(61) PREFα,C1 (p, ~p)
where "p" stands for "β gives the file marked confidential to α a short while after C1". Assuming that (61) and similar formulae are expressible in natural languages, we shall ignore in the sequel the due distinction between (61) and its expression in any L. When referring to (61) as a precondition or an implicature of some sentence S, we shall mean that the corresponding expression in L is such a precondition or an implicature of S, respectively. Notice that p represents the radical of (60). If α uttered in C1 (62) rather than (60), what would we say about his linguistic behaviour?

(62) I don't want you to give me the file marked confidential. Give me the file marked confidential.
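As will be discussed presently, (62) houses a clash of preferences; displayed in the notation of (61) (the display is ours), its first sentence explicitly expresses PREFα,C1 (~p, p), while its second sentence, which is (60), implicitly conveys PREFα,C1 (p, ~p).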

Two ways are open for explaining (62). Either α changed his mind immediately after uttering the first sentence of (62), in which case there is no conflict between the two parts of (62) because they reflect α's state of mind at two different moments; or else the first part of (62) is addressed to one person, while the second part is directed to another one. In this case there is also no conflict between the different parts of (62). If the speaker is, however, of the same mind during the utterance of (62) and he directs all of it to one and the same hearer, a conflict does arise, simply because the two parts of (62) show divergent preferences on the part of the speaker. The first part expresses one preference explicitly, while the second part, which is (60), conveys implicitly the opposite preference.

(61) is a context precondition of (60). This should seem by now self-evident, except for the observation that what is required by the definition of context preconditions is not a preference on the part of the speaker, but rather his belief that he entertains that preference at the context of utterance. In many cases such preferences are accompanied by beliefs that they obtain, and vice versa, but this need not be so. At the bottom of the distinction between cases of preference-beliefs and of actual preferences are two important observations.


First, a distinction should be drawn between extrinsic and intrinsic preferences.19 A preference of one state of affairs over another is extrinsic when it can be adequately explained; in such cases the question "Why is this preferred to that?" does make sense, and an answer can be provided, at least in principle. When an explanation is not sought for and just a liking is involved, the preference is intrinsic. Indeed, in the most natural instances of using (60) to command, the preference involved is not α's mere fondness of β's giving him the marked file. Usually α tries to adhere to his own general plans. The command he issues by uttering (60) in C1 is meant to achieve some minor goal, thus supporting a major plan or purpose. Preference relations like (61) are often consequences of more general preference relations that constitute α's general purposes and plans. Now, α is an ideal speaker and not necessarily a rational person or an ideal logician. Hence, his general preferences may fail to entail some particular preference he entertains. He believes that (61) obtains, but he would concede that he has been mistaken if he were to be shown that (61) does not follow from, or is not even compatible with, his more basic preferences. However, from a linguistic point of view it is sufficient that when α utters (60) he believes (61) to obtain. Whether (61) is congruent with α's more general patterns of conduct and belief is of no linguistic significance.

(61) is a weak context implicature of (60). We have just shown that (61) is a context precondition of (60). To prove that it is also a weak context implicature, we have to show that neither B(β, C, C, (61)) nor its negation is a context precondition of (60). This can be done by pointing out two α-(60)-appropriate contexts of utterance C1 and C2, such that B(β, C1, C1, (61)) is not believed by the speaker in C1 to hold in C1, while B(β, C2, C2, (61)) is believed by him in C2 to obtain then. The first case is obvious. Usually, α has no reason to assume that β is already aware of α's current particular preferences. Indeed, α would not utter (60) in C1 had he seen that β is bringing him the file marked confidential. On the other hand, there are cases in which α has been given some reasons to believe that β knows exactly what α wishes him to do. Consider the case of α's uttering (60) repeatedly, just because β is openly refusing to bring that file to α. After all, β is an ideal hearer, not an ideal friend, collaborator or disciple.

(61) is not a pragmatical presupposition of (60), of any kind. It is not clear what one takes, or should take, the negation of a command or of a sentence like (60) to be. Assuming that negation applies to the radical of (60), the resulting sentence will invoke in its appropriate contexts of utterance not (61) but rather its negation. Nothing in our arguments hinges upon the particular example we discussed. A consideration of any other sentence S and a class of α-S-appropriate contexts of utterance in all of which S is semantically imperative will lead us to similar conclusions: a preference relation of the same structure is involved, and it is a weak context implicature of S.

19 von Wright (1972) 142ff.


Now, what about sentences of other semantical moods? For our proposal to have at least some theoretical significance, we have to show that every sentence, of whatever semantical mood, carries such preference relations, and that each semantical mood bears a characteristic class of such relations. We present now, without discussion, some preference relations that are weakly implicated by sentences of certain different semantical moods. These relations will all be amended in the sequel, but the following examples do already show one important property of the ultimate class of these preference relations, viz. that sentences of different semantical moods weakly implicate structurally different preference relations. This property will remain intact throughout all the amendments.

(63) Did you give me the file marked confidential?
(64) PREFα,C1 (v1, v2)
where v1 is the set {Bα,C1 (Kβ,C1 p ∨ Kβ,C1 ~p), Kα,C2 p ∨ Kα,C2 ~p} and v2 is the set {Bα,C1 (Kβ,C1 p ∨ Kβ,C1 ~p), ~(Kα,C2 p ∨ Kα,C2 ~p)}, where "Bγ,C" and "Kγ,C" stand, respectively, for "γ believes at C" and "γ knows at C", and where C2 is a context immediately following the context of utterance C1. p represents the radical of (63). Under similar notation:

(65) She gave me the file marked confidential.
(66) PREFα,C1 (v1, v2)
where v1 is the set {Bα,C1 p, Kβ,C2 Bα,C1 p} and v2 is the set {Bα,C1 p, ~Kβ,C2 Bα,C1 p}.

(67) If you don't give me the file marked confidential I shall dismiss you.
(68) (In the sense of threatening) PREFα,C1 (r, ~r) & PREFα,C1 ({~r, q}, {~r, ~q})20
where ~r is the antecedent of the radical of (67), the radical itself being represented by p: if ~r, q.

20 When the arguments of a preference relation are unit sets, we don't use the set notation, but specify only the element.

(69) I advise you to give me the file marked confidential.
(70) PREFα,C1 (v1, v2)
where v1 is the formula PREFβ,C2 (p, ~p) and v2 is the formula PREFβ,C2 (~p, p).

(71) I hereby apologize for not giving you the file marked confidential.
(72) PREFα,C1 (v1, v2)
where v1 is the formula PREFα,eC (p, ~p) and v2 is the formula ~PREFα,eC (p, ~p).

In a variety of semantical moods, the mutual distinctions are thus structurally displayed, without any recourse to the internal structures of the radicals themselves. However, this is not generally so. The basic preference relation weakly implicated by (73) seems structurally similar to (61):

(73) I promise to give you the file marked confidential.

The obvious distinction between commands and promises is revealed in the preference relations implicated when their structures are specified beyond the level of radicals. If the subject of p points to α and the predicate of p describes a future act of α, (61) is a preference relation characterizing promises21, but when the subject of p points to β and the predicate of it to a future act of β, (61) characterizes commands.
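Schematically (our gloss of the point just made): if p1 is "α gives β the file marked confidential a short while after C1", then PREFα,C1 (p1, ~p1) is the preference relation of a promise; if p2 is "β gives α the file marked confidential a short while after C1", then PREFα,C1 (p2, ~p2) is that of a command. The two relations share their structure and differ only in the role assignment within the radical.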

Thus we see that the semantical moods put the accompanying radicals to characteristic uses in forming weakly implicated preference relations, possibly restricted by properties of the radicals themselves.

Still, there are different semantical moods that are so far indiscernible. Even if the properties of the radicals are taken into account, we have not yet provided means for telling demands from requests, and both from commands, for example. There are two ways to tackle this problem, and for reasons that will become clear soon, we are going to use both. Both a skyjacker demanding ransom and a refugee asking for political asylum are not in a capacity of issuing orders as to their particular preferences. Both convey preferences of the same structure, but obviously demands seem to be more intense than requests, which in turn may take different degrees of intensity, as shown by the possible modification of the explicit performative involved by "please", "very much", "very very much", etc. So one may try to adopt either of two approaches: demands and requests are of the same semantical mood, but employ it in different degrees of intensity; or, alternatively, they are different kinds of speech acts, involving different semantical moods, which however share their structure but not their intensities. Without delving into a comparative study of the two approaches, we here adopt the second one, to distinguish between requests of different kinds, and between demands and requests. Consequently, the characterizing implicatures include a variety of preference relations, sharing their formal properties, except those pertaining to particular intensities.

21 There are necessary conditions for the happiness of an utterance of (73), but we ignore them for the moment. The same holds for the commands under consideration.


Now, when α issues a command he presumes that both he (or she) and β believe that α has the right to be obeyed, being in an appropriate position of authority in some mutual hierarchy, at least in the context of utterance. That α has, at the context of uttering S, by which he orders β to do something or to avoid doing something, a certain authority over β, is a strong context implicature of S. If captain α did not believe that β is a soldier under his command, he would perhaps prefer β's not bringing him the file marked confidential to his doing it, by reasons of regulations, caprice, or what have you. Thus, what is weakly implicated by (60) is (74):

(74) PREFα,C1 (v1, v2)
where v1 is the set {p, q}, v2 is the set {~p, q}, p is the proposition that β gives the file marked confidential to α a short while after C1, and q is the proposition that α is in a position to command β at C1. Put differently, α at C1 prefers one state of affairs, w1, in which β brings him the file marked confidential a short while after C1, over another state of affairs, w2, in which β does not do it, if w1 and w2 are different just to that extent and in both of them α has command over β at C1. We shall generalize this observation later. Meanwhile we notice that this amendment of (61) enables us now to discern between commands and requests, and between commands and demands, by recourse to the structures of the preference relations weakly implicated, namely (61) and (74). Similarly, we have to include in both arguments of the characteristic preference relations of promises and warnings their appropriate positive and negative strong implicatures, respectively (cf. Searle 1969, 58f). This will render these semantical moods discernible from each other, and different from other ones in still another respect.

One may object to our approach by pointing at a context of utterance C which is appropriate for uttering a sentence S by a speaker α, although the preference relation we claim to be implicated is not believed by α to hold at C or at eC. Consider for example the case of α who is obliged to command β, at C, to give him the file marked confidential. He might have uttered the sentence (60) even though his own preferences had actually been in favour of β's not giving him that file at all. (A similar apparent counterexample concerning promises is suggested by Carter (1973).) In such a case, it might be argued, (61) does not hold in C and it is not believed by α to hold then; consequently, (61) is not (weakly) implicated by (60). Now, the interpretation of a PREF-formula is done by means of certain possible worlds, which differ from each other to a certain, specified extent.


We admit that (61) is not implicated by (60) if no restrictions are imposed on the possible worlds involved, except those concerning the extent of differences between compared possible worlds. But there is nothing remarkable in this observation. We have seen already an additional restriction imposed on all these possible worlds, viz. that the strong context implicatures of (60) hold in each of them. When α issues a command by uttering (60) he exercises his authority over β. Even if α is obliged to issue that command he is not absolutely compelled to do it: he may resign, he can violate a regulation, he is even in a position to commit suicide if he wants to avoid by all means uttering (60) to command β to give him the file marked confidential. By uttering (60) α shows that he is not willing to resign, to violate a regulation, or to commit suicide, in order to avoid the production of that command. Given that he deliberately avoids taking any step to avoid uttering (60), α prefers β's giving him the file marked confidential to β's not doing it, everything else being equal. Notice that since α is an ideal speaker he cannot pretend to command β. If he utters (60) in an appropriate context, he cannot fail to produce a command. Assuming that β believes that α has some authority over him at C, we conclude that β believes also that if α issues a command, then α prefers its being obeyed by β over α's resignation, violation of law, or death. Hence, these preferences are strong context implicatures. All the possible worlds involved in the interpretation of (61) are required to be such that these implicatures hold in them. When (61) is thus interpreted it is true.

8. Pragmemes

Given that the sentence (60) (Give me the file marked confidential) weakly implicates appropriate English expressions of (74), let us compare two possible states of affairs, w5 and w6, in both of which

(75) α is in a position to command β at C1, and
(76) At C1, β knows that α believes that (74) holds,

and which differ from each other exactly to the extent that in w5

(77) β does not give the file marked confidential to α a short while after C1

holds, while in w6

(78) β gives the file marked confidential to α a short while after C1

is the case. Obviously it follows from our ceteris paribus interpretation of "PREF" in formulae like (74) that α prefers at C1 w6 over w5. Compare now two other possible states of affairs, w7 in which (77) holds and w8 in which (78) holds, in both of which (75) and the negation of (76) are true. Again it follows from (74) that α at C1 prefers w8 over w7. Thus not only (74) is weakly implicated by (60) but obviously also (79) and (80):

(79) PREFα,C1 ({(75), (76), (78)}, {(75), (76), (77)})
(80) PREFα,C1 ({(75), non-(76), (78)}, {(75), non-(76), (77)})


These two weak implicatures of (60) are, so to speak, parts of the meaning of the weak implicature (74); no special non-deductive rules are needed for deriving them from it. Consider however two other possible states of affairs, w9 and w10, that differ from each other exactly to the extent that (75), (76) and (78) hold in w9, but (75), non-(76) and non-(78) (i.e., (77)) hold in w10. Which, if any, of these states of affairs does α prefer at C1 over the other? Would you prefer such a w9 to such a w10, playing the role of α at C1? Certainly this is the question of balancing the merits of β's bringing that file to α on time and the possible demerits of disclosing α's preference (74) to β. Clearly, by uttering (60) at C1 α tilted the balance towards w9. If he had preferred w10 over w9, he would have kept silent, at least with regard to his preference (74).

(81) PREFα,C1 ({(75), (76), (78)}, {(75), non-(76), non-(78)})

Since in his capacity as an ideal speaker α is neither required to believe that β takes (81) to obtain in an α-(60)-appropriate context of utterance nor required to believe that β takes the negation of (81) to obtain then, it seems that (81) is weakly implicated by (60). Notice that this seeming weak implicature does not follow from (74) logically, and in this respect it is interestingly different from the above-mentioned weak implicatures of (60), viz. (79) and (80). Actually, (81) is not a weak implicature of (60) at all, for one nicety. In uttering (60) at C1, α shows his willingness to "pay" for the prospective rendered service, by revealing some of his current particular preferences to β. The latter has his own purposes and plans, and he is apt to use information about α to advance his own interests, whether α would like it or not. In this sense, α pays β. But this payment is significantly restricted. An ideal speaker is aware of what he is saying; we assume that he knows what he is doing by uttering (60). The payment is, so to speak, under control. Consequently, α would prefer w9 over w10 if in the former α knows that β knows that α believes (74) to be true, at C1. There is no reason to assume that in his capacity as an ideal speaker α would prefer, say, w11, in which he has command over β, β brings him the file on time, β knows that α believes (74) to hold, at C1, but α does not know that β knows it, over w12, in which he has the same command over β, β does not bring him the file on time, and β does not know that α believes (74) to be true, at C1. It is imaginable that α is willing to take the risk of letting β know that he is interested in a file marked confidential, but only on condition that he knows what β knows about it and can take the appropriate protective steps, if necessary. A surprising disclosure of the fact that α prefers some state of affairs to another might be too risky for α. Consequently, a weak implicature of (60) is (82) rather than (81):

(82) PREFα,C1 (v1, v2)
where v1 = {p, q, Kβ,C1 (74), Kα,C1 Kβ,C1 (74)} and v2 = {~p, q, ~Kβ,C1 (74)}; p and q are as in (74).
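Read informally (our gloss): in the preferred member of (82) β gives α the file and α has command over β, β knows of α's preference (74), and α knows that β knows it; in the dispreferred member β does not give the file, α still has command, and β does not know of the preference.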

One might object to our using the knowledge operator and not the belief operator in (76) and (82). Notice, however, that our discussion has been restricted to a linguistic community of ideal speakers. Thus α means exactly what he says, and conveys exactly what is implicated by the sentences he utters, and β understands exactly what α says; α never misleads β and β never misunderstands α. When α manifests a certain belief r on his part, he entertains it at the context of utterance, and β grasps it precisely and completely; β is, therefore, in a position to know that α believes r at that context. To be sure, we assume that the community is linguistically homogeneous, and that every ideal member of it knows that he and the others are ideal and that they all know it.

Obviously, (82) is logically independent of (74), but they are not pragmatically so. A rule that we dubbed elsewhere22 "the rule of the communication price" enables us to derive (82) from (74). Where "w>" stands for the relation of weak implication, we have:

(R) If S w> PREFα,C (v1, v2), then S w> PREFα,C (u1, u2),
where u1 = v1 ∪ {Kβ,C PREFα,C (v1, v2), Kα,C Kβ,C PREFα,C (v1, v2)}
and u2 = v2 ∪ {~Kβ,C PREFα,C (v1, v2)}.
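To see the rule at work (the intermediate computation is ours): applying (R) to the weak implicature (74) of (60), with v1 = {p, q} and v2 = {~p, q}, yields u1 = {p, q, Kβ,C1 (74), Kα,C1 Kβ,C1 (74)} and u2 = {~p, q, ~Kβ,C1 (74)}, which are exactly the arguments of (82).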

Applying (R) to (82), one finds still another weak implicature of (60). This process can of course be reiterated indefinitely many times, yielding new weak implicatures of (60). The set of preference relations involved, characterizing the semantical mood of (60), includes this infinity of weak implicatures of (60) as well as (79) and (80) and, indeed, similar consequences. But this infinite class is finitely definable: it includes the basic preference relation (74) and is closed under (R) and under logical deduction rules. All the characteristic classes of preference relations are similarly definable. The basic preference relations of these classes we call "pragmemes".23
22 Kasher (1973). See also Kasher (forthcoming).
23 We do not develop here the due distinction between uninterpreted pragmemes and interpreted ones; see chapter 9, below. We introduced the term "pragmemes" in our lecture at the IVth Congress of Logic, Philosophy and Methodology of Science (Bucharest 1971). Since then it has been put to another use by Konrad Ehlich and Jochen Rehbein.



9. A formal theoretical framework, outlined

Earlier we have made informal use of a variety of preference relations between possible worlds, at some contexts of utterance. We shall now show how to formalize this conception within a theoretical framework, meant to be a basic, pragmatical component of an adequate linguistic theory. Let L be a natural language. A linguistic theory for L is a system M of postulates and rules, formulated in a language serving as a meta-language of L. First, we outline some elements of this system.

1. Sentences. M specifies recursively the set of sentences of L. By "sentences" we do not mean just strings of phonological units or of letters, but pairs each consisting of such a string and some syntactical surface parsing and lexical specifications of it, in a way that makes sentences syntactically24 and lexically unambiguous.25 In this we deviate from some common ways of understanding the vague term "sentence", but this deviation will make our exposition simpler.

2. Semantical representations. M includes a set of uninterpreted formulae by means of which semantic analyses of sentences of L will eventually be represented. We call these formulae "creative". (This term is the pragmatical counterpart of "generative".) These might be the well-formed formulae of a second-order predicate language with intensional operators of various kinds, with many-sorted and restricted variables, branching quantifiers and other logical devices which will be found useful.26 M includes also general means for considering interpretations of the creative formulae. Adopting Thomason's terminology, a valuation consists of an interpretation assignment, which associates semantical values (intensions) with atomic expressions, and a method of projection (or better: a method of combination), which determines the semantical values of molecular expressions, recursively. Where intensions are taken to be functions whose domains consist of classes of possible worlds or other semantical exponents (see section 2 above), a valuation includes also an interpretation structure which specifies a class of semantical exponents. In order to capture some linguistic phenomena, e.g. semantical presuppositions, we admit valuations that are partial, in the sense that the functions which serve as semantical values of formulae or other expressions might be left undefined for some elements in their domains.
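For instance (our illustration, reusing an earlier example): a partial valuation may leave the intension of "the present king of France" undefined at semantical exponents in which no such person exists; a formula containing the corresponding expression then receives no truth-value at those exponents, and this is just how semantical presuppositions are meant to be captured.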

24 Notice that only surface syntax is involved here.
25 For a similar conception of "sentence" see my paper (1972).
26 For the usefulness of intensional operators, consult Montague, loc. cit.; concerning many-sorted and restricted variables, see Kasher (1973a); a discussion of branching quantifiers is included in some unpublished papers of Hintikka; see also Gabbay and Moravcsik, this issue.


3. Contexts of utterance. Another facility of M is the means it provides for considering classes of possible contexts of utterance. A major duty of M is to characterize adequately the notion of appropriateness of contexts for uttering sentences. At the moment we assume that a context of utterance C determines the values of the meta-variables α (the speaker) and β (the addressee or addressees), and the exponent eC.

4. Translation. M is meant to render ultimately adequate descriptions and explanations of meaning and use. For that purpose, M characterizes a translation function that maps pairs ⟨S, C⟩ of sentences and contexts appropriate for their utterance onto triplets ⟨m, r, v⟩, where r is a creative formula and v is a valuation; r is related to the radical of S at C27; and m is a representation of the mood of S at C. Indeed eC should be admissible according to the interpretation structure of v. A problem that arises here is that of sentences which are semantically ambiguous, e.g. those having both a specific and a non-specific reading for one of their phrases. Our view is that this should be taken care of by the context: a context of utterance is not appropriate for uttering S if it does not resolve all the semantical ambiguities of S. Indeed, people produce and understand sentences even if used in some contexts that do not supply all the information needed for resolving the ambiguities, but then people also produce quite often and understand quite well utterances of non-grammatical expressions. These two facts should, to our mind, be dealt with similarly, in a performance-extension of the theory we outline here. Pragmatical inappropriateness should be on a par with syntactical inappropriateness. We mention in passing that we hold v fixed in most of the cases.

The characterization of uses of sentences is done within M in the following way. A given pair ⟨S, C⟩ is evaluated by the translation function. If C is not S-appropriate, then the translation function is undefined for ⟨S, C⟩. If C is S-appropriate, then ⟨S, C⟩ corresponds to some ⟨m, r, v⟩. The pair ⟨r, v⟩ is actually an interpreted creative formula which represents, as a function of eC, the radical of S at C. Now it is the mood's turn to be staged. Following the general conception of this paper, M ascribes to each semantical mood m a class of characteristic pragmemes. These are presented as special formulae of M whose main relational expression is "PREFi", where "i" refers to a degree of intensity.28

27 r is not the radical of S at C, since it is not interpreted yet. The valuation v, the exponent of C, eC, and r define the radical of S at C, which is an interpreted creative formula.
28 We deliberately ignore again the due distinction between PREF-formulae and their English expressions.


Among the arguments of these relations are always functions of the interpreted formulae. M also includes logical rules and rules of communication (like our (R)) that are iteratively applied to the pragmemes and to the derived preference formulae. These formulae are interpreted with the aid of the valuation v and the exponent eC. The class of these interpreted formulae represents in M the contribution of the semantical mood of S to its use at C. Another component of M, yet unspecified, will provide a representation for that part of the contribution of r to the use of S at C which is independent of m.29

We conclude this outline by showing how to formalize the interpreted preference relations. Generally speaking we adopt von Wright's logic of preference, somewhat revised according to our purposes.30 Preference relations are relations between possible states of affairs. We formalize the latter notion by employing van Fraassen's concept of supervaluations.31 These are partial functions from formulae to truth-values, i.e. to {"True", "False"}. Some restrictions are imposed on these functions within M:

(i) If a supervaluation s ascribes the value "True" to x, and y follows logically from x, then s(y) = "True";
(ii) If y is a semantical presupposition of x and s is defined for x, then s(y) = "True".

The third restriction is imposed only on those supervaluations that take part in characterizing the semantical mood of S at C by preference relations. In this case we also require:

(iii) If y is a strong implicature of S, then s(y) = "True" if s(S) = "True".

Thus, the preference relations involved in uttering S at C are relations between possible states of affairs in which the strong implicatures of S hold. It seems that eventually we shall have to impose a further restriction, namely that the participating supervaluations render "True" what is weakly implicated by the radical of S at C, independently of the accompanying mood; but since we don't have even an approximation to a theoretical framework for this purpose, we do no more than mention it here. Naturally we require also that the supervaluations participating in the interpretation of a preference formula related to the utterance of S at C will all be admissible according to the valuation v that the translation function relates to ⟨S, C⟩.
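By way of illustration (the instance is ours): if (51) is a semantical presupposition of the radical of (50), then by (ii) any supervaluation defined for that radical ascribes "True" to (51); and by (iii), any supervaluation taking part in characterizing the mood of (60) at C1 ascribes "True" to q of (74), i.e. to the strong context implicature that α is in a position to command β at C1, whenever it ascribes "True" to (60).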
29 See however our paper (1973a) for a discussion of one aspect of the problem. Unfortunately we used there "presupposition" for a few different relations, one of which is the weak context implicature defined here.
30 von Wright, loc. cit., and the bibliography thereof.
31 van Fraassen (1971) 153-163.


Finally, given two finite sets of formulae, v1 and v2, we say that "the states of affairs w1 and w2 are different from each other exactly to the extent that v1 holds in w1 and v2 holds in w2" just in case there are three supervaluations s1, s2, and s3 such that s1 is an extension of s3 which is defined where s3 is defined, on v1, and nowhere else, and which ascribes to all the elements of v1 the value "True"; s2 is also an extension of s3 which is defined where s3 is defined, on v2, and nowhere else, and which ascribes to all the elements of v2 the value "True"; and s1 defines w1 and s2 defines w2.

10. Open problems

In conclusion we would like to mention two problems that have not been discussed here. Both seem to deserve close consideration. The first problem pertains to some semantical theories as well as to our pragmatical theory. The closed set of logical consequences of a proposition produced by uttering S at C is sometimes taken to characterize the meaning of S (at C).32 Similarly, we define a set of preference relations which includes certain pragmemes and is closed under logical rules and under (R). This is somewhat doubtful in the sense that it takes the ideal speaker to be an ideal logician with respect to meaning and use. If one rejects the conception of logical capacities being perfect in one area of application but not in another, which is admittedly somewhat strange, it is possible for him to tackle the problem by using Hintikka's distinction between surface semantics and depth semantics.33 The second problem is deeper: What is it in the pragmemes that makes them, rather than other preference relations or other relations in general, basic elements of language use? And, being so basic, why do natural languages not reflect them perspicuously in the surface structures of the sentences?

32 E.g. Hiz (1969), and Bellert (1969).
33 Hintikka (1973).

References
ÅQVIST, L. (1967) Semantic and pragmatic characterizability of linguistic usage, Synthese 17, 281-291.
BELLERT, I. (1969) On the problem of a semantic interpretation of subject and predicate in sentences of particular reference, pp. 9-26 in: Bierwisch, M. and K.E. Heidolph (Eds.), Progress in Linguistics, The Hague: Mouton.
CARTER, W.R. (1973) On promising the unwanted, Analysis 33, 88-92.
CULICOVER, P.W. (1972) OM-sentences, Foundations of Language 8, 199-236.
FRASER, B. (1970) A reply to "On declarative sentences", MLAT, NSF-24, Cambridge, Mass.
FREGE, G. (1918/19) The thought: A logical inquiry, pp. 17-38 in: Strawson, P.F. (Ed.), Philosophical Logic, London: Oxford University Press.


GALE, R.M. (1970) Do performative utterances have any constative function?, The Journal of Philosophy 67, 117-121.
HINTIKKA, J. (1973) Surface semantics: Definition and its motivation, pp. 128-147 in: Leblanc, H. (Ed.), Truth, Syntax and Modality, Amsterdam: North-Holland Publ. Comp.
HIZ, H. (1969) Aletheic semantic theory, The Philosophical Forum 1, 438-451.
JESPERSEN, O. (1924) The Philosophy of Grammar, London: George Allen & Unwin.
KASHER, A. (1972) Sentences and utterances reconsidered, Foundations of Language 8, 313-345.
KASHER, A. (1973) Worlds, games and pragmemes: A unified theory of speech-acts, pp. 201-207 in: Bogdan, R.J. and I. Niiniluoto (Eds.), Logic, Language and Probability, Dordrecht/Boston: D. Reidel.
KASHER, A. (1973a) Logical forms in context: Presuppositions and other preconditions, The Monist 57, 371-395.
KASHER, A. (forthcoming) Aspects of Pragmatical Theory, Frankfurt: Atheneum.
KATZ, J.J. (1966) Mr. Pfeifer on questions of reference, Foundations of Language 3, 241-244.
KATZ, J.J. (1972) Semantic Theory, New York: Harper & Row.
KATZ, J.J. and P.M. POSTAL (1964) An Integrated Theory of Linguistic Description, Cambridge, Mass.: MIT Press.
LEWIS, D. (1972) General semantics, pp. 169-218 in: Davidson, D. and G. Harman (Eds.), Semantics of Natural Language, Dordrecht: D. Reidel.
MONTAGUE, R. (1970) Universal grammar, Theoria 36, 373-398.
MONTAGUE, R. (1973) The proper treatment of quantification in ordinary English, pp. 221-242 in: Hintikka, J., J.M.E. Moravcsik, and P. Suppes (Eds.), Approaches to Natural Language, Dordrecht/Boston: D. Reidel.
QUINE, W.V.O. (1961) From a Logical Point of View, Cambridge, Mass.: Harvard University Press.
ROSS, J.R. (1970) On declarative sentences, pp. 222-272 in: Jacobs, R.A. and P.S. Rosenbaum (Eds.), Readings in English Transformational Grammar, Waltham: Ginn and Co.
SADOCK, J.M. (1969) Hypersentences, Papers in Linguistics 1, 283-370.
SAMPSON, G. (1971) Pragmatic self-verification and performatives, Foundations of Language 7, 300-302.
SCHNELLE, H. (1973) Meaning constraints, Synthese 26, 13-37.
SEARLE, J. (1969) Speech Acts, London: Cambridge University Press.
STALNAKER, R.C. (1972) Pragmatics, pp. 380-397 in: Davidson, D. and G. Harman (Eds.), Semantics of Natural Language, Dordrecht: D. Reidel.
STENIUS, E. (1964) Wittgenstein's 'Tractatus', Oxford: Basil Blackwell.
STENIUS, E. (1967) Mood and language-games, Synthese 17, 254-274.
TARSKI, A. (1944) The semantic conception of truth, Philosophy and Phenomenological Research 4; reprinted pp. 52-84 in: Feigl, H. and W. Sellars (Eds.), Readings in Philosophical Analysis, New York: Appleton-Century-Crofts, 1949.
THOMASON, R.H. (1972) Extensions of Montague Grammars, MS, University of Pittsburgh.
VAN FRAASSEN, B.C. (1971) Formal Semantics and Logic, New York: Macmillan.
VON WRIGHT, G.H. (1972) The logic of preference reconsidered, Theory and Decision 3, 140-169.
WERTHEIMER, R. (1972) The Significance of Sense: Meaning, Modality and Morality, Ithaca and London: Cornell University Press.

HANS-HEINRICH LIEB1

GRAMMARS AS THEORIES: THE CASE FOR AXIOMATIC GRAMMAR (PART I)

Introduction
Part I: Axiomatic grammar writing: A program

1. Historical setting: The status of generative grammars as theories
1.1. The thesis
1.2. Classical TGs: The Earlier Model
1.3. Classical TGs: The Later Model (excluding semantics)
1.4. Classical semantics. Later developments
1.5. Re-interpreting generative grammars
1.6. Correlating grammars and axiomatic theories
2. A framework for axiomatic theories (1): Formalized systems and abstract theories
2.1. Introduction
2.2. Formalized systems
2.3. Interpreted formalized systems
2.4. Abstract axiomatic theories
3. A framework for axiomatic theories (2): Realized theories and theory integration
3.1. Realized axiomatic theories: Interpreted and applied
3.2. On the definition of "applied axiomatic theory"
3.3. Uninterpreted constants in applied theories
3.4. Integration of theories: Conflation and use
4. Grammars as axiomatic theories
4.1. The program and its linguistic background
4.2. Questions of logic
4.3. Problems of feasibility
4.4. Advantages of axiomatic grammar writing
4.5. Historical notes
Index of technical terms
1 This essay is part of the work currently undertaken by a research group on Theory of Language and Theory of Grammar, directed by the author, at the Freie Universität Berlin. Basic research for the essay was begun as early as 1970, when the author was at the Universität Köln and was then supported by a grant from the Deutsche Forschungsgemeinschaft.


Introduction

In the present essay I shall propose and develop in detail a new idea that has important consequences for the scientific study of natural languages. (In making such a claim, I do not wish to be included among the busy revolutionaries of the linguistic scene: When time has restored perspective, today's revolution may appear as the deposing of the city mayor.) The idea is simply that a scientific grammar (as opposed, say, to a pedagogic one) of a natural language or language variety can and should be formulated as an interpreted axiomatic theory. Like most new ideas, this one, too, has a history, and this one, too, may be difficult to distinguish from more familiar ones if viewed from the wrong angle. For those who might be prone to take such a view, § 1 may have a corrective function: There it is argued in detail that generative grammars as offered by the more influential schools of that province of linguistics are not theories in any ordinary sense of the term. There are indeed two methods, one developed by Wang, the other by myself, by which generative grammars can be related to axiomatic theories; but, so the argument goes, as the latter have advantages over the former but the former have none over the latter, generative grammarians may well be left to their scholastic squibs: Both for future work in the theory of grammar and for future grammar writing the more fruitful framework should be assumed. Given the half-life of linguistic schools, § 1 may be of historic rather than immediate interest by the time this paper appears; let me emphasize, therefore, that my main points do not depend on either accepting or rejecting any tenet of transformational grammar.

In Part I of the essay, discussion of generative grammars is followed by an outline of a framework for axiomatic theories (§§ 2 and 3) that contains various improvements over traditional conceptions; in particular, there is a new account of how different axiomatic theories may be combined. Combination of theories is important because a theory of language must be combinable with a theory of communication and with psychological and sociological theories; a grammar must be combinable with a theory of language; and various partial grammars (or parts of a grammar) with each other (§ 3.4). Further comments on the framework may be found in § 2.1. The main thesis of the essay is put forward, explained, and defended in § 4: "A complete or partial linguistic grammar can and should be formulated as an applied axiomatic theory in the language of predicate logic or set theory"; in a second, stronger version the stipulation "and in terms of a theory of language that presupposes a theory of communication" is added (§ 4.1). First, those parts of the thesis which cannot be understood by reference to the general framework for axiomatic theories are explained. Assuming a distinction between theory of language and theory of linguistic description (in particular, theory of grammar) as drawn in Lieb (1970), it is seen that the thesis belongs within the theory of grammar but presupposes a theory of language (which has to be used for explaining the phrase "complete or partial linguistic grammar" and for evaluation of the thesis in general). The theory of language (and communication) developed in Lieb (1970) is chosen as basic, together with various extensions that are outlined in the essay itself.


The pragmatic aspects of languages are explicitly recognized by the theory of language; correspondingly, "linguistic grammar" is understood in an extended sense so as to allow for semantic and pragmatic (parts of) grammars. In § 4.2, some logical aspects of the presupposed theory of language are discussed. This is followed (§ 4.3) by an attempt to counter several more or less obvious objections that can be raised against the program; in particular, it may be doubted that an extensional language of logic is sufficient for grammars, and axiomatizability should not naively be taken for granted. This discussion is followed by an account of the advantages that the proposed format for grammars will have for linguistics (§ 4.4), and by a sketch of previous work that seems to recognize or imply an axiomatic approach to grammar writing (§ 4.5).2

In Part II, "Axiomatic idiolect grammars: Testing the program", the main proposal of Part I is systematically studied as to its feasibility. Assuming that systems of idiolects and systems of languages and language varieties are similarly structured, we concentrate on idiolects and their systems, treating separately each major subsystem: the phonological, morpho-syntactic, and semantic subsystems. The presupposed theory of language is extended by additional assumptions on the systems and subsystems and on their relation to speakers and actual speech events. The structure of a grammar is outlined for each subsystem such that the grammar is an axiomatic theory of the subsystem whose 'application' (interpretative part) relates the system to a given speaker during a given time and to given speech events. In this way, grammars for the 'syntactical' (the phonological and morpho-syntactic) subsystems are discussed, and 'semantic idiolect grammars' are given detailed consideration (pragmatic ones are considered only briefly). For morpho-syntactic grammars a new conception of morpho-syntactic structure is introduced that gives a prominent place to the notion of paradigm, thus resuming a central concept of traditional European grammar. Two methods of specifying the semantic subsystem are distinguished, the translation method and the direct approach; the structure of corresponding grammars is outlined separately for each approach. We conclude by discussing questions of 'integration': It is shown how partial idiolect grammars may be integrated into a complete grammar; how idiolect grammars, grammars of language varieties, and grammars of languages may be related to each other; and how non-linguistic theories such as psychological and sociological ones can be linked to linguistic ones.

The present paper might be called abstract to the second degree if we
2 Linguists may find Part I of the present essay more accessible by reading § 1 and subsections 4.4f before turning to the more technical parts. Explicit recognition of the axiomatic approach is quite recent; relevant work is only beginning to appear. I may perhaps claim priority, for whatever that may be worth (cf. fn. 1). I am happy to find my position strengthened by other authors taking a similar stand.


consider as concrete a theory of an individual language or idiolect, as abstract a general theory of language, and as even more abstract an investigation not into languages or language but into the properties of theories whose subject matter is idiolects or languages. In order to persuade the reader that such an undertaking is important for the scientific study of natural languages, I conclude by indicating three of the benefits that may be obtained from axiomatic grammar writing (for a fuller picture, cf. § 4.4, below): The discussion of theories in the philosophy of science will directly apply to grammars; in particular, questions of interpretation, confirmation, and evaluation of grammars can be discussed directly within the framework of the philosophy of science. Secondly, it can be stated clearly what a theory of language contributes to grammars of individual languages, or language varieties, or idiolects. Thirdly, it can be shown in which way a grammar may be formally linked to theories on psychological, sociological, or other aspects of communication so as to give a fuller picture of actual language behavior, which goes a long way to explicating the concept of pragmatics for natural languages, a task in which generative grammar so far has failed completely.
PART I

AXIOMATIC GRAMMAR WRITING: A PROGRAM

Historical setting: The status of generative grammars as theories The thesis It has been asserted repeatedly by Chomsky and his followers that a transformational generative grammar of a natural language is a 'theory'.3 On closer inspection this turns out to be a questionable claim. It has never been clear what the subject matter of such a theory should be; in Chomsky (1965) Ch. 1, the reader is confronted with a series of vague and possibly inconsistent characterizations, which remain vague even in the light of Chomsky's later work.4 Disregarding problems of subject matter, a more fundamental objection remains: A theory in any customary sense of the term says something about something; more specifically, it contains expressions that can be understood as statements on the intended subject matter of the theory. (This formulation is still naive, deliberately so at the present stage of discussion.) A transformational generative grammar does not contain any expressions that may be understood in this way if a Chomskyan framework is assumed: A grammar is an algorithm for specifying expressions of a
3 E.g., Chomsky (1955) Ch. 1; (1957) 49; (1965) Ch. 1 pass.; Katz and Postal (1964) 2; Bach (1964) 9.
4 Juxtaposing some of the well-known formulations in Chomsky (1965) brings out their possible inconsistency: "A grammar of a language purports to be a description of the ideal speaker-hearer's intrinsic competence" (3). "...by a generative grammar I mean simply a system of rules that in some explicit and well-defined way assigns structural descriptions to sentences" (8). "Any interesting generative grammar will be dealing, for the



'formal system'; as such, it does not say anything about the language of which it is to be a grammar or a theory. This objection can be raised not only against any Chomsky-type grammar, whether based on the 1955 theory or the latest version of the Standard Theory: Any generative semantic grammar is confronted with the same difficulty. Such grammars, it is true, have not been claimed to be theories; even so, we are left with the problem of how to relate them to their intended subject matter in a more than intuitive way. The interpretative problem for Chomsky grammars was first analyzed in Lieb (1967), (1968) and a solution suggested. Using a different approach Wang (beginning in 1968) was eventually led to establish a direct link between transformational generative grammars and axiomatic theories. To give full historical perspective to the present essay I shall first show in detail why a generative grammar cannot be a theory of its intended subject matter, and then discuss the attempts made independently by Wang and myself to overcome the ensuing difficulties. It will be concluded that these attempts, whatever their value, are misguided: Writing a grammar as an axiomatic theory is superior to choosing an algorithmic format and does not depend on it. For my arguments concerning generative grammars I will concentrate on Chomsky-type grammars up to 1965: During this period the formal properties of grammars are reasonably well defined, and the relevant arguments carry over to later developments in generative grammar, which as a rule are rather sketchy in their specification of grammatical form. Let us understand by "classical TG" any transformational generative grammar (of a natural language) that conforms to the theory of transformational generative grammars up to and including the versions proposed in or based on Chomsky (1965). Let us call this theory "the classical theory". I will argue for the following thesis:

(1) If understood within the framework of the classical theory, neither a classical TG nor its levels (as far as they are specified) contain expressions that can be understood as statements on the intended subject matter of the grammar.5
most part, with mental processes that are far beyond the level of actual or even potential consciousness" (8). "A grammar can be regarded as a theory of a language [my emphasis, H.L.]; it is descriptively adequate to the extent that it correctly describes the intrinsic competence [my emphasis] of the idealized native speaker" (24). Among the host of more or less competent discussions of Chomsky's "competence", Botha (1968) § 3.5, may be mentioned as an early ambitious attempt; in Lieb (1970) Chs. 10 and 11, I have investigated a number of relevant problems in a new theoretical framework.
5 I am using "statement" informally but assuming that any statement is a sentence of some natural or constructed language that can be assigned a truth-value. In arguing for (1) I shall disregard formulations such as, "We can consider a grammar of L to be a function mapping the integers onto L..." (Chomsky (1959) fn. 1): Under this interpretation of "grammar", (1) would be trivially true. A position similar to (1) is also taken in Hermanns (1971).


The earlier and the later forms of the classical theory will be kept apart, and the proposals for semantic description are treated separately.

1.2. Classical TGs: The Earlier Model

By "the Earlier Model" we mean the classical theory from 1955 to 1963, excluding the Katz-Fodor proposal for a semantic component (Katz and Fodor 1963). For this period the original presentation in Chomsky (1955) remains fundamental for grammars and their levels. This means, in particular, that all 'constituents' of a grammatical component (e.g. of the phrase structure component) are 'strings' of some level of the grammar or set-theoretical entities (e.g. relations) obtained from such strings.6 Given a grammatical level, the only entities that can reasonably be considered as interpretable expressions are the strings of that level. In Lieb (1968) § 5, I investigated for all levels except the transformational one how the strings may be interpreted if the framework of Chomsky (1955) is assumed. As it turned out, those strings which are interpretable at all can be understood only as names of entities of the object language (of the grammar), or as names of sets of strings. Hence, none of these levels contains expressions that may be understood as statements on the intended subject matter of the grammar, for the simple reason that it does not contain expressions that may be understood as statements. Consider, in particular, the set of phrase markers which is a constituent of the phrase structure level. According to Chomsky (1955) Ch. 6, a phrase marker is a set of strings of symbols of the phrase structure level, viz. the set of all strings occurring in the elements of a set of derivations which are equivalent in a certain respect (which can be represented by the same 'tree'). Neither the individual strings nor the set of strings (or, for that matter, the corresponding 'tree') is or can be understood as a statement. Hence, it is misleading to adopt the usual terminology and call a phrase marker a 'structural description' of a sentence: A phrase marker is not a statement and therefore does not 'describe' anything in any ordinary sense of the word. Now it is indeed possible to obtain statements on the basis of a phrase marker when terms such as "is" or "dominates" are defined for substrings (including individual symbols) of strings in a phrase marker (cf. the '6-relations' in Chomsky (1955), e.g. 202). Any such statement, however, will be a statement on strings to the effect that a certain relation holds between substrings of a phrase
6 The interpretation of "constituent", above, depends on the logical structure assumed for a grammatical component (or level); e.g. in Chomsky's formulation of the phrase structure component (1959, Section 2) the constituents are the members of an ordered n-tuple. A component of a grammar must be distinguished from the corresponding level; thus, the phrase structure component is (or has as a constituent, or contains in some other way) a set of 'rules'; the phrase structure level has as one of its constituents the set of phrase markers specified by those rules. The set of rules belongs only to the component, the set of phrase markers only to the level.


marker. As these strings do not form part of the intended subject matter of the grammar (which is a 'natural language', in some sense of the term), the derivative statements do not refer to it; hence, they do not justify a terminology in which a phrase marker may be called a 'structural description' of a sentence (of the natural language).7 For the grammatical levels studied in Lieb (1968) it was found that they do not contain expressions which may be understood as statements. This result carries over to the corresponding components of the grammar since their only interpretable expressions are strings of the levels. To be more specific, consider the phrase structure component of a grammar as characterized in Chomsky (1959) Section 2. This is an ordered septuple (or would be, in a stricter formulation). One of its constituents is a two-place relation, denoted by "→", whose members might be considered as strings of a phrase structure level in the sense of Chomsky (1955). It is assumed that there is a finite subrelation of →, i.e. a finite set of ordered pairs of strings, with certain specified properties; the elements of this set are called the rules of the grammar (i.e. of the phrase structure component of a grammar). The 'rules' are the only entities of the grammar that it might seem possible to interpret as statements. Now a rule is an ordered pair of strings, and however the two strings are interpreted, the pair of strings is not (and can hardly be) considered an interpretable expression. On the other hand, an expression of the form α → β, where "α" and "β" stand for arbitrary strings, is easily interpreted as a sentence of symbolic logic stating that the relation → holds between α and β. We might redefine "rule" so as to refer to such expressions. But then a rule would be a statement only on two strings of a grammatical level, which do not belong to the intended subject matter of the grammar but can at best be interpreted to refer to it. Hence, (1) would remain unaffected: A rule could be understood as a statement but not as one on the intended subject matter of the grammar. The transformational level and component have not been considered. Because of well-known insufficiencies of the early conception of transformations I will take them into account only in connection with the later model of transformational generative grammars, for which more adequate treatments are available. Our later results carry over to the transformational component of the Earlier Model; they do not say much concerning the earlier transformational level because such a level was left unspecified in the later model.
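The situation may be illustrated by a small sketch (in Python; the three rules are an invented toy fragment, not taken from any actual grammar): a phrase structure component in this sense is a device for producing strings, and at no point does it contain anything that could be assigned a truth-value.

    # A toy phrase structure component in the sense just discussed: each
    # 'rule' is merely an ordered pair of strings over the vocabulary.
    rules = [("S", "NP VP"), ("NP", "the boy"), ("VP", "runs")]

    def step(s, rules):
        """All strings obtainable from s by one application of some rule."""
        out = []
        for left, right in rules:
            i = s.find(left)
            if i >= 0:
                out.append(s[:i] + right + s[i + len(left):])
        return out

    print(step("S", rules))       # ['NP VP']
    print(step("NP VP", rules))   # ['the boy VP', 'NP runs']

Every object produced is again a string of the grammatical level; a 'derivation' such as S, NP VP, the boy VP, the boy runs is a sequence of such strings, and none of its members says anything about English.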
7 In Chomsky (1955), and frequently in later versions of the classical theory, the distinction between object language and metalanguage is blurred; the above statement is, therefore, somewhat questionable. For a detailed discussion, see Lieb (1968) §§ 1.3 and 6. For a method of relating a phrase marker to a true 'description' cf. § 1.6, below.
1.3. Classical TGs: The Later Model (excluding semantics)

By "the Later Model" we understand the classical theory from 1964 (when


essential points of Chomsky 1965 were first presented) to, roughly, 1968, including the 1963 Katz-Fodor proposal for semantics and the treatment of transformations in Ginsburg and Partee (1969). Thus, most of the Later Model corresponds to the 'standard theory', to the exclusion of 'the extended standard theory' as developed, in particular, in Chomsky (1972) (a collection of articles published or prepublished years earlier) and Jackendoff (1972).8 We first consider the transformational component as conceived in Ginsburg and Partee (1969).9 Within this framework the transformational rules cannot be understood as statements on the intended subject matter of the grammar. Ginsburg and Partee define a transformational rule "as an ordered pair (D, C), where D is a domain statement and C is a structural change statement on D" (310). "A domain statement is any expression of the form D1 or D1 D0" (309), where D1 is a finite string of symbols from a certain set and D0 is a 'Boolean domain statement'; the latter concept is defined recursively as either (a) a finite string of symbols from the given set or (b) any expression of one of the forms (D1 ∨ D2), (D1 ∧ D2), ~D1, where D1 and D2 are Boolean domain statements (309). A structural change statement on D (D a domain statement) is again a string of symbols. The expressions "domain statement" and "structural change statement" seem to suggest that the members D and C of a transformational rule are intended to be interpreted as statements. There is, however, not the slightest hint how this might be achieved. In the case of the domain statement only a single interpretation suggests itself: Given a certain 'graph assignment', the domain statement might be said to denote the set of 'trees' that 'satisfy' the statement for that assignment (cf. the definitions l.c. 311; "tree" is not defined). For the 'structural change statement', I do not see any interpretation at all. The transformational rule as a whole remains uninterpreted. If an interpretation is sought, there is only one that would seem defensible: The rule (D, C) denotes a two-place relation (more specifically, a function) between 'trees', viz. between trees satisfying D and their 'transforms'.10 Even so, a transformational rule is not understood as a statement. There is an informal way of specifying transformational rules which makes use of an arrow notation. This notation reappears in Ginsburg and Partee (1969) 313 as "T ⇒τ,f T′", which, on the basis of a similar formulation l.c., may be read as "T and f change into T′" (τ a transformational rule, f a graph assignment,
8 For the two terms, see Chomsky (1972a) 3.2.
9 Ginsburg and Partee characterize the scope of their paper as follows (297f): "As yet, no mathematical model has been given which encompasses most of these different versions of a T-grammar. The purpose of this paper is to propose one such mathematical model". (298, fn.:) "One important exception is the notion of "syntactic features" described in Chomsky (1965) and included in most subsequent T-grammars."
10 Again, denotation would have to be relativized to a graph assignment.


and T, T′ trees).11 By binding the variables or replacing them by constants, we would obtain an expression that could indeed be interpreted as a statement. However, this would be a statement on transformational rules, graph assignments, and trees. From Ginsburg-Partee it is fairly clear that the rules belong to a grammar of a language, not to the intended subject matter of the grammar, however defined. Therefore, any such expression, although interpreted as a statement, would not (or not completely) be interpreted as a statement on the intended subject matter of the grammar. Given the definition of "transformational grammar" l.c. 315, the transformational rules and their two components are the only entities of the transformational component for which an interpretation as statements might be considered.12 Ginsburg and Partee formalize the transformational rules of syntax only. In the Later Model the rules of the phonological component came eventually to be interpreted as being, in a sense, transformational too. The most explicit formulation of this view is found in Chomsky and Halle (1968) 20: The phonological rules are said to be 'local transformations' in the sense of Chomsky (1965).13 Such transformations are apparently covered by the Ginsburg-Partee formalization if it can be extended to allow for the 'features' of phonology, which seems possible.14 In this case, our previous results carry over to the phonological rules. The situation is, however, complicated by the fact that Chomsky and Halle develop a formalization of their own (l.c. 390-399) which makes the phonological rules very different formal objects from the transformational rules of syntax as usually conceived and as formalized by Ginsburg-Partee.15 I will not try to reconcile the two versions.16 Suppose that the Chomsky-Halle formalism is adopted. No explicit interpretation is given but it is clear that rules are the only expressions that could possibly be interpreted as statements (besides rule schemata, which can be reduced
11 Ginsburg and Partee omit reference to f, which is, however, presupposed in the context introducing the arrow notation.
12 It is not clear if and how on the basis of Ginsburg and Partee a transformational level could be constructed that would be compatible with the theory of levels in Chomsky (1955) or an appropriate modification of that theory. In the present context, such a level must be considered as unspecified.
13 "By a local transformation (with respect to A) I mean one that effects only a substring dominated by the single category symbol A" (l.c. 215, fn. 18).
14 Ginsburg-Partee did not consider grammars with 'syntactic features', see above, fn. 9.
15 The phonological rules as specified l.c. 391, (6), are isomorphic to context-sensitive phrase structure rules in the following sense: If the so-called 'units' in a rule are replaced by single symbols, then a phonological rule has the same form as a context-sensitive phrase structure rule, understood as an expression of the form α → β / γ__δ. The formalism is eventually expanded to include 'rule schemata' (393-396).
16 The situation is further complicated by the fact that Chomsky and Halle consider extending their formalism to cover certain rules "which are rather similar to transformations in their formal properties" (l.c. 398), i.e. similar to syntactic transformational rules in their usual format.


to rules, cf. 394, (13)): In a rule α → β, the arrow is taken as denoting a two-place relation whose members are somehow 'given' by α and β, where "α" and "β" are to be replaced by strings of certain primitive symbols (specified 390, (1)). As a reading of the arrow, "rewrite as" is given; furthermore, the rules are 'applied' so as to allow the transition from strings to strings (391f, (8) and (9)). Hence, if taken as a statement, a rule should be taken as a statement on strings. The following formulation may be roughly adequate: "For any string x corresponding to α and any string y corresponding to β, x is rewritten as y" (where "corresponding to" can be defined on the basis of 391, (8), and 392, (9)). The strings are not understood as belonging to the intended subject matter of the grammar; they are apparently considered as expressions that should be interpreted with respect to it. Hence, even if considered as a statement, a rule is not to be taken as a statement on the intended subject matter of the grammar. Let us now briefly consider grammars according to Chomsky (1965), disregarding the semantic component. The categorial component of the base and the phonological component are covered by our previous discussion. The same holds for the (syntactic) transformational component if we assume that taking into account 'syntactic features' will leave unaffected our previous statements on transformational rules. Although I have not made a detailed investigation, this seems to be true with the following qualification: If the restated transformational rules involve complete 'lexical entries', those entries are among the entities which might be considered as statements. There is, however, no explicit interpretation to this effect, and an appropriate interpretation would not invalidate our thesis (1).17 Hence, the lexicon of the grammar (consisting of lexical entries and 'redundancy rules', which are not problematic with respect to our thesis (1)) also contains no expressions that can be understood as statements on the intended subject matter of the grammar. Thus, thesis (1) holds for all the non-semantic components of a grammar conforming to Chomsky (1965), which is the key formulation of the Later Model.18
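The reading just considered can be rendered schematically as follows (a Python sketch; the predicate corresponds below is an invented stand-in for the clauses (8) and (9) cited above, so nothing here reproduces the Chomsky-Halle conditions themselves):

    # A rewriting rule (a, b), read as a statement on strings: it asserts,
    # of certain pairs of strings, that the first is rewritten as the second.
    def corresponds(x, pattern):
        # Hypothetical stand-in for 'corresponding to': here a string
        # corresponds to a pattern iff the pattern occurs in it.
        return pattern in x

    def licensed_pairs(rule, strings):
        """The pairs (x, y) the rule 'speaks about'; both members are
        strings of the grammatical level, not parts of the language."""
        a, b = rule
        return {(x, x.replace(a, b, 1)) for x in strings if corresponds(x, a)}

    print(licensed_pairs(("VP", "V NP"), {"NP VP", "NP"}))
    # {('NP VP', 'NP V NP')}: a claim about level-strings only.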
1.4. Classical semantics. Later Developments

As conceived in Katz and Fodor (1963), the semantic component of a grammar consists of a dictionary and a projection rule component. (The revisions
17 A lexical entry is a pair (D, C) where D is a phonological distinctive feature matrix, and C is a set of specified syntactic and other features (Chomsky (1965) 84, 87). These features are expressions such as "+N" (corresponding to the phonological specified features in Chomsky and Halle (1968) 391, (6a)). It might be proposed that "(D, C)" is to be interpreted as synonymous with the conjunction of statements "D is F1 and ... and D is Fn", where F1, ..., Fn are the features in C. This is a statement on a phonological distinctive feature matrix (such matrices are the values of the variable "D"; the expressions substitutable for "D" are names of matrices). The distinctive feature matrices do not, however, belong to the intended subject matter of the grammar (see our fn. 4); they are at best expressions that can be interpreted with respect to it.
18 The concept of level must again be regarded as unspecified.


in Katz and Postal (1964) are irrelevant to our problem; hence, we use the earlier conception.) The projection rules are indeed formulated as statements. They are, however, statements on paths in phrase markers and on their 'amalgamation'. The paths and their amalgams consist (in some sense of the word) of 'lexical strings' and strings of 'syntactic' and 'semantic markers' and 'distinguishers'. The markers and distinguishers (and strings and paths?) are regarded as expressions that should be interpreted with respect to the intended subject matter of the grammar; they do not belong to it.19 Hence, the projection rules are not statements on the intended subject matter of the grammar (which apparently consists of certain 'abilities' of speakers, cf. Katz and Fodor (1963) [(1964) 484, 493]). This should also follow from our § 1.3, if Katz's claim is correct that projection rules are transformational rules (Katz (1971)). In the case of dictionary entries, be it in the original form of Katz and Fodor (1963) or as modified in Katz (1967) 144f, 149, we may argue in a similar way as for the lexical entries according to Chomsky (1965).20 The 'semantic interpretation of a sentence' contains statements on that sentence (Katz and Fodor (1963) [(1964) 503]), and the interpretation might appear among the entities of a 'semantic level' (left unspecified) which would correspond to the semantic component. The 'sentence' is, however, not a sentence of the natural language but a corresponding string of symbols of a grammatical level (cf. Lieb (1968) § 3) and for this reason not part of the intended subject matter of the grammar. Hence, thesis (1) remains unaffected. This concludes our argumentation in support of (1), which by now should be well established. The thesis is restricted to classical TGs; our original claims in § 1.1 were more general, covering most generative grammars that have been suggested. We shall briefly discuss whether (1) can be extended to later developments in generative grammar. It is natural to do so in the present context since in those developments it was treatment of semantic phenomena that came to be the key issue. Katz (1972) is an updated and expanded version of 'classical' semantics and might have been subsumed under it (as the most elaborate version of the standard
19 There are a few passages in Katz and Fodor (1963) where "marker" does not seem to refer to an expression [(1964) 518]. But cf. Katz (1967) 129: "A semantic marker is a theoretical term that designates a class of equivalent concepts or ideas."
20 There are, however, certain problems with this: (1) So-called complex semantic markers (as introduced by Katz) may perhaps be reconstructed as open formulas of predicate logic (cf. Bierwisch (1969); accepted in principle in Katz (1972) 166, fn. 21; criticized in Bartsch (1971) 43-48). (2) Whereas markers in Katz and Fodor seem to denote classes of lexical items that share a factor in their meanings, they are later interpreted by Katz as denoting a class of 'ideas' (see previous footnote). As the phonological part of a dictionary entry continues to be understood as a phonological distinctive feature matrix (Katz (1967) 144), i.e. as an expression, an interpretation on the lines of our fn. 17 would require some adjustment.


theory in the sense of Chomsky). The following differences are the most important ones in the present context: (a) Introduction of additional notational devices, in particular, of 'categorized variables'.21 (b) Replacement of different projection rules by a single one and assignment of this rule to 'general semantic theory' rather than to individual grammars (Ch. 3, § 10). (c) Introduction of a new interpretative component of grammars to take care of 'rhetorical' phenomena such as topic and comment. The new notation is covered by our previous discussion.22 The single projection rule is again of a metalinguistic nature; moreover, its status is no longer relevant because the rule is no longer part of individual grammars. Of the 'rhetorical interpretations' it is explicitly stated (l.c. 433) that they "will employ the same vocabulary of representation as semantic interpretation does." Thus, thesis (1) seems to hold for Katz (1972) too. Let us briefly consider 'the extended standard theory' as proposed in Chomsky (1970), (1972a) (including the 'lexicalist hypothesis' of Chomsky (1970a)) and developed most fully in Jackendoff (1972). Outside semantics, the only innovation that need concern us is the use of syntactic features instead of non-lexical category symbols such as "NP", both in Chomsky (1970a) and Jackendoff (1972).23 This extension of the feature notation is easily covered by an extension of our earlier discussion (above, fn. 17). As for the semantic rules, Chomsky does not make any proposals of his own. Jackendoff's 'projection rules' (e.g. (1972) 107, 293) are statements or pseudo-statements (using imperatives) on formal objects of the grammar (readings, semantic markers, syntactic phrase markers, and the like). In the case of his 'semantic structures', in particular his 'functional structures', I am at a loss how to understand them because of poorly explained notation. With this proviso, then, thesis (1) may be extended to the extended theory as formulated by its main proponents, and thus to the more important recent developments in 'interpret(at)ive' semantics as a whole. An extension of thesis (1) should also be correct for most or all of generative semantics. Concerning the conceptions developed by McCawley and Lakoff, we argue as follows.24 As far as a systematic account of grammars has
21 Ch. 3, § 9. Also: various parentheses, p. 165f; a number of 'abbreviatory' symbols (e.g., p. 314).
22 No demonstration will be attempted. Of a semantic marker it is now stated that it "is a theoretical construct which is intended to represent a concept" (Katz (1972) 38); the question of what the ontological status of a concept is "will be left here without a final answer" (39).
23 In Chomsky, there are additional changes in the form of the base rules, which, however, are of no consequence with respect to thesis (1).
24 I had worked out a detailed argument on the basis of Lakoff's unpublished book [1969], only a small part of which was published as Lakoff (1971). In 1972 I learned from the author that his book is being completely rewritten. I therefore decided not to publish my analysis, which, however, led to basically the same results as the following argument.


been given, the only expressions of a grammar that could possibly be understood as statements are the rules of the grammar. All rules (whether 'descendants' of the earlier phrase structure rules or of the transformational rules) are understood as specifying 'constraints' on 'derivations' that are sequences of phrase markers. The phrase markers do not belong to the intended subject matter of the grammar (a natural language, in some sense) because they are partly constructed from metalinguistic symbols such as "S", "N" etc. Hence, the rules cannot be understood as statements on the intended subject matter of the grammar.25 The work of Keenan (e.g. Keenan (1972), Keenan to appear) should also be covered by our previous arguments: There are rules for specifying an interpreted formal system certain expressions of which are the 'logical forms' or 'logical structures' of sentences of a natural language. Neither the rules nor the expressions are statements on the intended subject matter of the grammar. The logical forms are to be related to sentences of the natural language (rather, to names of such sentences) by transformations. To the transformational rules our previous arguments (§ 1.3 and this paragraph) apply.26 In a way similar to Keenan, Bartsch (1972) and Bartsch and Vennemann (1972) are aiming at greater logical explicitness within a generative-semantics-type framework. Thesis (1) can also be extended to cover their work although this is somewhat difficult to establish due to imperfections of their formalization (e.g. Bartsch (1972) Ch. XX).27
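The role of the metalinguistic labels can be made explicit in a minimal sketch (Python; the encoding of a tree in roughly McCawley's sense, cf. fn. 25 below, is a simplification of my own):

    # A 'tree' with an explicit labeling relation: the labels are category
    # NAMES, i.e. symbols such as "S" and "NP".  (Toy encoding; McCawley's
    # conditions on dominance and left-to-right order are omitted.)
    class Tree:
        def __init__(self, nodes, dominates, label):
            self.nodes = nodes            # e.g. {1, 2, 3}
            self.dominates = dominates    # set of pairs of nodes
            self.label = label            # node -> category name

    t = Tree({1, 2, 3}, {(1, 2), (1, 3)}, {1: "S", 2: "NP", 3: "VP"})

A 'constraint on derivations' is then a condition on sequences of such set-theoretical objects; it speaks about nodes and symbols like "S", not about sentences of the natural language.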
25 Lakoff's account of the concept of phrase marker in [1969] is completely inadequate if taken literally, though similar in intent to McCawley's definition of tree ((1968) 244), which, allowing for minor modifications, may be taken as basic for later work (McCawley (1972) assumes the 1968 account: 543, fn. 4). Our above statement on phrase markers is based on the following feature of McCawley's conception: McCawley includes a 'labeling relation' among the constituents of a tree and speaks explicitly of trees "whose non-terminal nodes are labeled by syntactic category names" (246). It should be noted, though, that McCawley introduces "tree" as a purely set-theoretical term. One might try to apply his definition to trees that are entities of the natural language: The set of 'nodes' would have to be identified with a sentence of the natural language and certain of its parts, the set of 'labels' with a set of syntactic categories (not category names). But certain conditions in the definition would now exclude many trees of the usual sort. (The difficulties are with multiple occurrence of the same part of the sentence, and with assigning a single part to several categories simultaneously, e.g. to N and NP.) For different but related criticism, cf. Bellert (1972) 294f, Bellert (1973).
26 For Keenan's own view of his work, cf. the following quotations ((1972) 430, 453): "Using the naturalness condition presented in Part I we shall argue here that the L-structures we have proposed are natural transformational sources for NL-structures." ("L": "logical"; "NL": "natural language".) "Our work is akin in spirit to that in generative semantics (G. Lakoff 1969). The difference here is more one of emphasis: we have been more concerned to formulate rigorously certain logical properties of NL and less concerned to define the functions which derive surface structures from logical structures."
27 The question was discussed with the authors in winter 1972/73 in the Berlin research group on Theory of Language and Theory of Grammar.


Our sketch of recent developments in generative grammar should be sufficient; it would be rather useless to try and aim at exhaustiveness. Most forms of generative grammars that were not considered are systematically compared in Hirschman (1971); apparently they fall in line with the forms that were studied. Thesis (1) or an appropriate extension may now be considered as well-established for the majority of all work undertaken in generative grammar since its inception. Hence, we have also established the thesis that most generative grammars cannot be considered as theories of their intended subject matter in any ordinary sense of "theory (of)"; moreover, we are confronted with the general problem of how those grammars can be interpreted so as to 'refer', in a reasonable sense, to their intended subject matter. It is these questions that were taken up independently by Wang and myself and studied with respect to the syntactic components of classical TG grammars.

1.5. Re-interpreting generative grammars

In my own work I attacked the problems connected with (1) by what may be called the re-interpretation approach, as opposed to the correlation approach used by Wang. Briefly, the first method attempts to re-interpret the rules of a generative grammar so that they may be understood as statements on the intended subject matter of the grammar; the second method attempts to correlate an axiomatic theory to a given generative grammar.28 In Lieb (1967), an idea formulated in Lieb (1968) 379 is applied to (one version of) the categorial subcomponent of a grammar as understood in Chomsky (1965)29: The rules are re-interpreted so as to become statements on the system of the natural language. For this purpose, all rules are considered as expressions containing "→", and are first interpreted in a standard way as statements on strings of a grammatical level. The lexical and grammatical formatives occurring in the terminal strings of the component are then interpreted as names of certain entities called "L-constructs"; the basic morphological elements of the system of the object language are assumed to be L-constructs. The terminal strings are interpreted to denote appropriate sequences of L-constructs. A category symbol
28 The possibility of re-interpretation seems to be naively assumed in much work within the generative framework (see also the quotations from Bach (1964) in Lieb (1967) 369f). For an approach similar to Lieb (1967) though less formal and questionable in detail, cf. Hermanns (1971). The conception of syntax developed in Bellert (1972) may be considered as a first step to either the re-interpretation or the correlation approach; cf. below, § 4.5. (Both Hermanns and Bellert are unaware of the work done by Wang or myself.) In a number of publications Marcus distinguishes and compares 'generative' and 'analytic' models of language (esp. Marcus (1967a), (1969)). Although this parallels a distinction between algorithmic and axiomatic grammars, Marcus does not recognize the problems presented by our thesis (1) and considers the distinction between la linguistique analytique and la linguistique synthétique ou générative as a distinction essentielle, irréductible à une autre, plus élémentaire ("an essential distinction, irreducible to another, more elementary one") ((1969) 320).
29 Lieb (1967) was written later than Lieb (1968) and actually published only in 1969.


denotes "a class of sequences of L-constructs, roughly, the class of sequences with the following property: Each sequence is denoted by a certain part of the terminal string of some generalized phrase-marker, where the phrase-marker is a deep structure underlying a surface structure, and the part of the terminal string is dominated by the category symbol." (Lieb (1967) 371). Now a second interpretation of the rules is given, which makes them statements on the system of the object language, formulated either in predicate logic or in set theory. To give an example: Suppose that "S → NP⌢VP" is one of the rules. The italicized symbols "S", "NP", and "VP" denote the same classes as the corresponding category symbols "S", "NP" and "VP".30 The arrow is interpreted as "⊆", and "⌢" as denoting the product operation on classes of sequences; e.g. "NP⌢VP" denotes the class of sequences obtained by concatenating every sequence in NP with every sequence in VP. On this interpretation, the rule is equivalent with "Every element of S consists of an NP followed by a VP" (understood as "Every element of S is the concatenation of some NP and some VP"), which corresponds to an intuitive reading given to such rules by many linguists (as could be easily documented from the literature). A way is suggested as to how the rules, understood as statements on L-constructs and classes of L-constructs, can be understood as statements on units or categories of the object language at least in the case of a 'correct grammar'. Any such possibility presupposes the second interpretation of the rules.31 Lieb (1967) does not yet formulate definite results but outlines a program for research. In subsequent work (unpublished) I carried the program through in complete detail, but eventually discontinued the entire line of research for two reasons: (a) When the classical theory was supplemented by the interpretative part I had worked out, the whole system seemed to be unnecessarily complex, especially if compared to the conception of theories and their interpretation as developed in the philosophy of science. (b) After 1965, the classical theory dissolved. In Lieb (1967) the question of how to understand a classical TG as an interpreted axiomatic theory was not treated explicitly; rather, I confined myself to the immediate problems raised by thesis (1). The correlation approach developed by Wang is concerned with the former question. It has grown out of work with a somewhat different orientation, which we will also characterize.
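The second interpretation may be sketched as follows (Python; the three classes are invented stand-ins whose elements, in Lieb (1967), would be sequences of L-constructs rather than of written words):

    # Category symbols denote classes of sequences; the arrow is read as
    # inclusion, concatenation as the product operation on such classes.
    NP = {("the", "boy"), ("she",)}
    VP = {("runs",)}
    S  = {("the", "boy", "runs"), ("she", "runs")}

    def product(A, B):
        """Concatenate every sequence in A with every sequence in B."""
        return {a + b for a in A for b in B}

    # "S -> NP VP", re-interpreted: every element of S is the
    # concatenation of some NP and some VP.
    assert S <= product(NP, VP)

Note that only the inclusion of S in the product class is asserted; the converse inclusion, which would make every NP-VP concatenation an S, is not. This difference becomes crucial in the discussion of Wang's correlation approach below (§ 1.6, point A).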
1.6. Correlating grammars and axiomatic theories

In Wang (1968), a grammar of a natural language is conceived as a system of rules by which 'grammatical statements' (einfache grammatische Aussagen, l.c. 21)
30 For reasons explained in Lieb (1968) § 2.2, the letter symbols in rules should be italicized and thus distinguished from the corresponding symbols of the grammatical levels.
31 That interpretation is not a psychological one; hence, it may be questioned whether we are establishing a direct relation to the intended subject matter; cf. the quotations above, § 1.1, fn. 4. The same objection may be raised against Wang's solution.


can be derived. These statements have the form of sentences of first-order predicate logic consisting of a one-place predicate and a closed individual expression. (For formulating the rules Wang also introduces expressions analogous to individual variables.) Wang's 'statements' can indeed be interpreted as statements, and they are obviously intended as statements on the natural language.32 It is proposed (l.c. 40) that a 'structural description' (strukturelle Beschreibung) be identified with a construction (Konstruktion, l.c. 23) of a grammatical statement on the basis of the rules, the construction being a sequence of grammatical statements ending with the statement in question (say, "S he came"). Wang's general approach consists in developing formal systems that correspond to a syntactic system for first-order predicate logic (in the sense of Carnap (1958) Ch. B); a grammar of a natural language is the system of rules of one of the formal systems; grammatical statements on the natural language are derived by those rules, which thus correspond to the rules of inference in a syntactic system for logic.33 It is then shown that the resulting grammars correspond (in a specific way and allowing for a number of deviations) to transformational grammars and structural descriptions as characterized in Chomsky (1965). Further developments of this approach are found in Wang (1971b). For structural descriptions in the sense of Wang it is indeed easier to see how they could be directly interpreted with respect to the object language of the grammar. On the other hand, the rules of a grammar cannot be interpreted in this way since they are rules for the derivation of grammatical statements. E.g., "NPu, VPv → Suv" ((1968) 24), if interpreted as a statement, might be read as: "From "NPu" and "VPv", "Suv" is derivable" (where "u" and "v" are variables whose values are expressions of the natural language, and juxtaposition of variables denotes concatenation). The rule in this example is easily replaced by a sentence of predicate logic: "(u)(v)(NPu & VPv → Suv)", i.e. "for every u and v, if u is an NP and v a VP, then u concatenated with v is an S." This suggests that the language of first-order predicate logic might have been used in the first place: Instead of giving a formal system whose rules of derivation coincide with the rules of the grammar, the latter are introduced as axioms in an axiomatic theory formulated completely within first-order predicate logic itself. The axioms may then be understood as statements on the object language. The usual rules of inference are used to derive theorems which can be interpreted as statements on the object language, such as "S (he comes)".
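The replacement of rules of derivation by axioms may be sketched as follows (Python; the two ground facts are an invented toy lexicon, and the closure operation stands in for the ordinary rules of inference):

    # The correlate of the rule "NPu, VPv -> Suv" is the axiom
    # (u)(v)(NPu & VPv -> Suv), applied here as a Horn clause to a
    # stock of ground 'grammatical statements'.
    facts = {("NP", "he"), ("VP", "comes")}

    def close(facts):
        """One round of forward chaining with the single axiom above."""
        derived = {("S", u + " " + v)
                   for (p, u) in facts if p == "NP"
                   for (q, v) in facts if q == "VP"}
        return facts | derived

    # The resulting theorem can be read as "S (he comes)", i.e. as a
    # statement on the object language.
    assert ("S", "he comes") in close(facts)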

32 As constants denoting expressions of the natural language Wang uses those expressions autonymously (cf. l.c. 21). If this is taken literally, he allows only for written forms of natural languages. Cf. "N'man'" (or "Nman") l.c. 20, to be interpreted as meaning that "man" is an N.
33 Basically the same approach is used in the appendix of Smaby (1971), where the concept of a formal system as put forward in Smullyan (1961) is applied.


This possibility is indeed explored in Wang (1971) (cf. also Wang (1972a) Section IV, (1973a)), where the rule quoted from Wang (1968) 24 is replaced as suggested above. Wang indicates a general way of replacing a context-free grammar by an axiom system formulated within first-order predicate logic (with identity) such that the axioms can be understood as statements on the object language. He formulates a special case of a theorem (proved in Wang (1971a)) stating that a sentence of the form Kx can be derived from the axioms (where K is a category symbol and x a closed term to be interpreted as the name of a syntactic constituent) if x is a terminal string derivable from K in the grammar ((1971) Section 4).34 An attempt is made to replace transformational rules by axioms, too (Section 5). Emphasizing that his results agree with Chomsky's conception of a grammar as a theory (p. 273), Wang finally applies the concept of deductive-nomological explanation to grammars reformulated as axiom systems. A more detailed application of that concept is attempted in Wang (1972a) (to be discussed below, § 4.4). In recent work Wang has been trying to combine ideas from Katz and Fodor (1963), Knuth (1968) and Montague (1970a, b), (1973) in order to arrive at a semantic component that could be combined with an axiomatic syntax (Wang (1972a) Section V; (1972); (1973)); this attempt does not provide an axiomatic reformulation of semantic symbolisms as developed within the generative framework. Wang's work may well represent the most promising attempt to understand the syntactic component of a classical TG as an interpreted axiomatic theory (by mechanically constructing an uninterpreted axiomatic theory from the grammar). Still, his account is problematic because of (A) and inadequate because of (B) and (C): A. Consider phrase structure rules such as "S → NP VP". The sentence correlated with this rule, "(x)(y)(NPx & VPy → Sx⌢y)", is equivalent with "NP + VP ⊆ S" (assuming an appropriate language of logic and a definition of "+" so that NP + VP = {z: (∃x)(∃y)(NPx & VPy & z = x⌢y)}). But why not correlate "(z)(Sz → (∃x)(∃y)(NPx & VPy & z = x⌢y))", which would be equivalent with "S ⊆ NP + VP"? Indeed, it is this latter sentence that is equivalent with the rule in the re-interpretation approach of Lieb (1967), and it is only this sentence that corresponds to the immediate constituent analysis which phrase structure grammars were meant to capture. Wang naively assumes that it is only a difference of formulation ("anders ausgedrückt", (1971b) 277) whether constituents are combined or subdivided. But the two axiomatic theories that could be correlated with a phrase structure grammar differ in a non-trivial way. How are they related? Would it be possible to read "=" in each axiom of the two theories
34 In Section 3, it is shown that from any semi-Thue system a corresponding axiom system in first-order predicate logic can be constructed. The resulting system is, however, quite different from the systems obtained in Section 4, and it is left open how an interpretation with respect to natural languages might be effected.


("S = NP + VP"), which collapses the difference and gives us a grammar very much like the ones proposed in Cooper (1964)? B. In many cases, the theory correlated with the complete syntactical component of a classical TG will be inadequate for lack of correspondence and may even be inconsistent. According to Chomsky (1965) 174, "rules of agreement clearly belong to the transformational component." We may surely assume a classical TG of English where the strings the boy is running and the boys are running (or analogous strings that would also serve the purpose of the demonstration) are terminal strings of the syntactic component and have (surface or shallow) phrase markers in which the boy is dominated by "NP" and are running by "VP". By Wang's method we should have "NP the boy" and "VP are running" as theorems of the correlated theory.35 Since agreement is handled by transformational rules, we may assume a base rule "S → NP + VP" (or some other rule that would serve for the rest of the argument). The correlated theory then contains the axiom "(x)(y)(NPx & VPy → Sx⌢y)". We immediately obtain the theorem "S the boy are running". On the other hand, the boy are running cannot be derived from "S" in the grammar, assuming an appropriate rule of agreement that blocks the derivation. Hence the theory is inadequate, for lack of correspondence with the grammar. It may even be the case that we have a theorem of the form ~(Sx) whenever the derivation of x from "S" by means of the grammar is transformationally blocked. In this case, we have "~(S the boy are running)", which makes the theory inconsistent. C. Wang's informal indications of how to interpret a theory correlated with a classical TG grammar are inadequate.36 Consider the theory correlated with the categorial subcomponent of a classical TG. Suppose the grammar is meant to be a grammar of a certain natural language L (in a sense where L is not identified with a set of strings derived from "S" by means of the grammar, to avoid circularity; cf. Lieb (1968) § 3). In the theory associated with the categorial subcomponent, all category symbols "S", "NP" etc. are primitive (undefined) axiomatic constants. Following Wang's interpretative hints we would interpret them to refer to categories of the system of L by sentences such as "'S' denotes the set of sentences of L" etc. Assuming that denotation for category symbols can be understood as a finite two-place relation and thus be given by enumeration, we are still left with the problem of making

35 "Should", because Wang makes only suggestions for treating the transformational component ((1971b) 5). If those theorems cannot be obtained, the method is inadequate for that very reason.
36 Wang (1971b) 277: "Ein formales System ist erst für die Linguistik von Interesse, wenn es eine linguistische Deutung besitzt. Wir sagen zum Beispiel im Fall der Regel wie NP → T N, daß NP für eine Nominalphrase, N für ein Nomen und T für einen Artikel steht." ["A formal system is of interest for linguistics only if it has a linguistic interpretation. We say, for example, in the case of a rule like NP → T N that NP stands for a noun phrase, N for a noun, and T for an article."] Wang probably wants to say that "NP" denotes the set of noun phrases, etc., which is then implicitly assumed also for "NP" in the axioms of the correlated grammar. Wang's interpretative hints are extremely scanty.


sense of the expressions "sentences of", "noun phrases of", etc., and of the constant "L". Wang does not even see the problem. Thus, an interpretation along his lines is empty. Instead of supplementing the theory by an interpretation as above we may suggest to add axioms which are identities: "S = Sentence-of-L", "NP = Noun-phrase-of-L", etc. This would bring out even more clearly the emptiness of Wang's interpretative hints: They do not contain anything about how to interpret the new theory. What are the consequences to be drawn from our investigation of the two approaches that were meant to overcome the difficulties created by thesis (1)? If either the re-interpretation or the correlation approach is successful, a generative grammar (or rather, the syntactic component of a classical TG) may be taken as equivalent, in a certain sense, with an interpreted axiomatic theory. I do not wish to maintain that the problems and inadequacies of Wang's method could not be overcome, just as my own re-interpretation program can be carried through in detail. My argument is that the two attempts are equally misguided: To make sense of a generative grammar (to make it say something about its intended subject matter) we either end up with an interpretative machinery inferior to the interpretation concept that has been or could be developed for axiomatic theories (Lieb), or we make use of an axiomatic theory anyhow (Wang). In order to make sense of an axiomatic theory of a language, we certainly do not need a generative grammar. If it can be shown that such a theory is not otherwise inferior to a corresponding generative grammar, there is no reason why one should bother with such a grammar in the first place. And indeed, this can be shown (§ 4.4, below). Hence, the attempts made by Wang and myself are misguided from the point of view of developing an optimal format for scientific grammars. Their main value consists in showing that the enormous amount of generative research could be made available, or partly available, for use in a better framework. In turning to our main subject, grammars as axiomatic theories, we have to admit right away that the accepted view of interpreted axiomatic theories is not quite sufficient for our purposes. Thus, a general discussion of axiomatic theories is required.

2. A framework for axiomatic theories (1): Formalized systems and abstract theories

2.1. Introduction

The aim of this and the next section is an account of axiomatic theories, preliminary in many ways, that may be adequate for treating the general problems of axiomatic grammar writing. The two sections have grown out of an attempt to handle those problems; from a systematic point of view they are basic to the discussion of grammars in Part II. At the same time they try to develop a coherent picture that may be of value independently. At the beginning even elementary


points may be stated explicitly (also for the possible benefit of some readers); soon enough we shall have to modify customary conceptions. Such departures will not always be made explicit; the rich literature on axiomatization must to a large extent be presupposed. Our way of presentation will be relatively informal; a rigorous discussion of all relevant problems in the present context would be both impossible and out of place. Even so the two sections assume a wider scope than is customary in the field and may come closer to giving a synthetic picture than most existing work. They also contain some new results, especially on the combination of different theories into new ones (§ 3.4). An important feature of the framework is separation of formalized systems (logical calculi) (§ 2.2f) from axiomatic theories (§ 2.4 and § 3): An axiomatic theory is formulated by means of a calculus without being a mere extension of the latter, hence, another calculus. We also assume a richer structure for axiomatic theories than is customary: An axiomatic theory is not identified with a set of sentences of some calculus, quite apart from questions of 'interpretation'. The basic distinction usually made between 'uninterpreted' and 'interpreted' axiomatic theories is reconstructed as a distinction between 'abstract' and 'realized' theories. Two types of abstract axiomatic theories are distinguished (§ 2.4), one using axiomatic constants, the other axiomatic variables and an 'explicit predicate' (e.g. a set-theoretical predicate). The realized theories are subdivided into 'interpreted' and 'applied' (§ 3.1); the latter correspond to 'partially interpreted (empirical) theories' as discussed by Carnap and others. They are given special attention (§ 3.2f) because our later account of axiomatic grammars (Part II) is based on them. The realized theories present a more differentiated picture of 'interpretation' than is usually assumed; in particular, there is no 'partial interpretation' in a Carnapian sense although there may be partial interpretation in a literal sense in the case of an interpreted theory, and there must be such interpretation in the case of an applied one (the 'theoretical terms' are uninterpreted). Interpretation of the logical calculus used in a theory is discussed in connection with 'interpreted formalized systems' (§ 2.3) where a concept of pre-interpreted formalized system is introduced; it is such pre-interpreted systems that are used for formulating axiomatic theories.37 The subsections on formalized systems (§ 2.2) are in my own view less satisfactory than the ones on axiomatic theories. Not only did I have to be sketchy in my presentation of the ordinary conception of calculi (choosing, moreover, a fairly conservative version that may be too narrow to include recent developments in logic); I also had to introduce some new concepts without being able to develop their consequences. It may be worthwhile for the logician to pursue any nontrivial problem my account may present since it is completely motivated by the actual needs of linguistic theory construction.
37 Stegmüller has emphasized the ambiguity of "interpretation" as applied to theories ((1969/70) II, 340f); our conception takes his criticisms into account.


These needs led to elaboration of the last subsection (§ 3.4) where a new conception of theory integration is developed. I distinguish two main types of joining given theories into new ones and formulate two theorems which are important for combining linguistic theories with each other and with non-linguistic ones. My account of formalized systems will at the beginning be based on Fraenkel et al. (1973) 280-293.
2.2. Formalized systems

In Fraenkel et al. (1973) a formal system38 is considered as "an ordered quintuple of sets fulfilling certain requirements" (282): "(1) A set of primitive symbols, the (primitive) vocabulary, divided into various kinds, such as variables, constants and auxiliary symbols" (280). "(2) A set of terms as a subset of the set of expressions", an expression being any string (usually finite) of symbols (281). "(3) A set of formulae as a subset of the set of expressions" (281). "(4) A set of axioms as a subset of the set of formulae" (282). (5) A set of "rules of inference according to which a formula is immediately derivable as conclusion" from an appropriate "set of formulae as premises" (282). If those sets satisfy certain requirements of 'effectiveness' (mentioned in their specification pp. 280-282), such as the terms being effectively specifiable, the formal system is said to be a logistic system (285). The notion of theorem is introduced in the usual way (282): A sequence (finite in the case of a logistic system)
of one or more formulae is called a derivation from the set Γ of premises if each formula in the sequence is either an axiom or a member of Γ or immediately derivable from a set of formulae preceding it in the sequence. The last formula is called derivable from Γ. A derivation from the empty set of premises is called a proof of its last formula, hence, of each of its formulae. A formula is called provable or a formal theorem if there exists a proof of which it is the last formula.
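The quintuple conception and the notion of derivation just quoted can be rendered schematically (a Python sketch under simplifying assumptions: formulas are treated as opaque objects, and the effectiveness requirements are ignored):

    # A formal system as a quintuple; 'derivable' encodes immediate
    # derivability of a formula from a set of formulas.
    class FormalSystem:
        def __init__(self, vocabulary, terms, formulas, axioms, derivable):
            self.vocabulary = vocabulary
            self.terms = terms
            self.formulas = formulas
            self.axioms = axioms
            self.derivable = derivable  # derivable(formula, set_of_formulas)

    def is_derivation(seq, premises, fs):
        """Each formula is an axiom, a premise, or immediately derivable
        from a set of formulas preceding it in the sequence."""
        return all(phi in fs.axioms or phi in premises
                   or fs.derivable(phi, set(seq[:i]))
                   for i, phi in enumerate(seq))

    def is_proof(seq, fs):
        # A proof is a derivation from the empty set of premises; its
        # last formula is then a formal theorem.
        return is_derivation(seq, set(), fs)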

In addition to formal systems, Fraenkel et al. introduce what they call 'formalized theories', each being (287)
an ordered sixtuple of sets: a set of symbols, a set of terms, a set of formulae, a set of logical axioms, a set of rules of inference, and a set of valid sentences, with various relations obtaining between, and various conditions fulfilled by, these sets.

The essential features are exclusion of non-logical axioms and introduction of a special set of 'valid' sentences (l.c.):
38 On terminological equivalents, cf. op. cit. 281. For technical terms, variables, assumptions, and theorems introduced in §§ 2-4, cf. the index pp. 106ff (end of Part I of this essay). I gratefully acknowledge some valuable suggestions that were made to me for the present subsection by Franz von Kutschera (Universität Regensburg) and Wolfgang Stegmüller (Universität München).


The exact extension of this term has to be defined from case to case, the only general condition which such a definition will have to fulfil being that the set of valid sentences should be closed with respect to derivability, i.e. that every sentence derivable from valid sentences by the rules of inference should be valid itself...

It is maintained that any formal system is a formalized theory (287). There is an inconsistency in this, since a formal system is a quintuple, not a sixtuple. Obviously, the authors think of a sixtuple obtained from a formal system by taking the set of theorems as a sixth member. But even then not all formal systems are formalized theories because a formal system may contain non-logical axioms. Let us introduce "formalized system" as a new term to cover both formal systems and formalized theories. A formalized system, then, would be a sixtuple of a certain kind. This conception is still not adequate for our purposes: Fraenkel et al. "wished to dodge the problem of the status of definitions in formal systems" (284). We shall explicitly recognize definitions by including the set of definitions among the members of the formalized system; thus, a formalized system becomes an ordered septuple. A final change concerns the 'rules of inference'. By the conception that seems to be adopted in Fraenkel et al. (cf. also Carnap (1942) 22, 157), rules are sentences of some language that refer to formulas of the formalized system; thus, they are 'metalinguistic' entities. To avoid such entities, we take the relation of immediate derivability itself, i.e. a certain relation between formulas and sets of formulas, following the suggestions in Carnap (1958) 26b. (The rules of inference themselves could also be reconstructed as entities that are no longer 'metalinguistic' in nature; cf. Hatcher (1968) 12.) We thus arrive at the following conception. A FORMALIZED SYSTEM (FS) is an ordered septuple of sets satisfying certain conditions (which are partly spelled out in Fraenkel et al. (1973) and cannot be reproduced here). The first member of the FS is called the set of symbols or the vocabulary of the FS; the second is the set of its terms (including the variables and certain constants from the vocabulary); the third the set of its formulas; the fourth the set of its axioms. These four members are related to each other and to the expressions of the FS (strings of symbols of the FS) as previously indicated. The fifth member is the (possibly empty) set of definitions of the FS, all of them taken as formulas. The sixth member is a two-place relation between formulas and sets of formulas of the FS, called (immediate) derivability in the FS. The last member is the set of valid sentences of the FS, again a subset of the set of formulas, more specifically, a subset of the set of sentences, i.e. the closed formulas of the FS (the ones without 'free variables').39 The provable formulas or formal theorems of the FS are defined as before, but
39

This assumes that the expression "free variable" can be defined for arbitrary FSs. Note that validity here is a non-semantic notion that must not be confused with truth.

The provable formulas or formal theorems of the FS are defined as before, but the concept of derivation is changed by allowing not only axioms but definitions in the derivation sequence. All sentences that are provable and all sentences that are derivable from a set of valid sentences are valid sentences. A FORMAL SYSTEM (F1S) is now defined as an FS whose valid sentences are exactly the provable sentences or theorems of the FS. The terms "logistic system" and "formalized theory" could be re-introduced but we shall have no use for them.40 The following term refers to an important property of FSs: An FS is called axiomatizable if there is an F1S with the same set of valid sentences. Discovery of non-axiomatizable FSs was one of the main results of foundational research in mathematics.

Since we explicitly recognize definitions, a few explanatory remarks may be in place. The definitions are formulas of a specific form that satisfy certain conditions relative to each other and relative to the other constitutive sets of the system. Thus, the definitions and axioms form disjoint sets; and there are four functions from the set of definitions: the defined term of, the defining terms of, the definiendum of, and the definiens of. The first is a function into the set of constants of the system; the second into the set of non-empty sets whose elements are either constants or variables; the third and fourth are functions into the set whose elements are either terms or formulas. The defined term is the only constant occurring in the definiendum; it does not occur in the definiens. The defining terms are exactly the constants and variables occurring in the definiens. The defined terms of all the definitions are the defined terms of the system. The set of constants of the system can be partitioned into the set of defined and the set of undefined terms.41

Using "F", "F1", ... as variables ranging over septuples such as formalized systems, and "C", "C1", ... as variables over sets of constants of FSs, we now introduce the following notions. Let C be a set of constants of the FS F. The C-defined terms of F = the defined terms of those definitions of F among whose defining terms there is an element of C or a C-defined term of F. (This should be replaced by a proper recursive definition.) The C-terms of F = those terms of F in which an element of C or a C-defined term of F occurs;42 analogously, the C-formulas, C-definitions, C-axioms, C-theorems, valid C-sentences and C-expressions of F.
40 Also, "formalized theory" will be used later (§ 2.4) for different purposes.
41 For a systematic account of the conditions that must be satisfied by the set of definitions, see a sufficiently comprehensive standard introduction to logic such as Suppes (1957) Ch. 8; for a specialized treatise, cf. Essler (1970). For treating definitions not as formulas but as rules of inference, cf. e.g. Carnap (1942) 157f. For replacing definitions by axioms of a special type cf. Shoenfield (1967), 4.6.
42 Occurrence does, of course, allow for identity.


The following concept (which I have not found in the literature) is fundamental for later definitions: Let F be an FS and C a set of constants of F. F-MINUS-C = the septuple consisting of the following sets: the vocabulary of F - (C ∪ the C-defined terms of F);43 the terms of F - the C-terms of F; and the five sets obtained analogously from the formulas, axioms, definitions, the derivability relation, and the valid sentences of F (in the case of derivability we take that subrelation of immediate derivability in F for which no first member and no element of a second member is a C-formula of F). We shall say that of two FSs F and F1, F is contained in F1 if there is a set C of constants of F1 such that F = F1-minus-C. The union of F and F1 is the septuple obtained by uniting corresponding members of F and F1, except that the fourth member is to be the set of axioms of F and F1 that are not definitions.

Under certain conditions, here left unspecified, an FS may be called a FORMALIZED SYSTEM OF LOGIC (FS of logic, logical calculus), where "logic" may be replaced by more specific terms such as "predicate logic", "first-order predicate logic" etc.44 It is assumed that in any logical calculus the set of constants (similarly, of axioms) may be partitioned into a non-empty set of logical constants and a set, possibly empty, of non-logical constants.45 If there are no non-logical constants, the calculus will be called pure, otherwise, non-pure.46 If F is an FS of logic, C will be called a harmless set of constants of F if C is a set of constants of F such that some logical axioms of F are not C-axioms of F. We make the following ASSUMPTION ON MINUS: For any FS of logic F and any harmless set C of constants of F, F-minus-C is an FS of logic.47
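Before proceeding, the set-theoretic skeleton of these definitions may be made vivid by a small computational sketch. The following Python fragment is ours and purely illustrative: it ignores all conditions on well-formedness, models expressions as strings of space-separated symbols, and lets each definition carry its defined and defining terms explicitly. It renders an FS as a septuple and F-minus-C as the corresponding componentwise operation:

    from dataclasses import dataclass
    from typing import FrozenSet, Tuple

    @dataclass(frozen=True)
    class Definition:
        defined_term: str
        defining_terms: FrozenSet[str]
        formula: str

    @dataclass(frozen=True)
    class FS:
        vocabulary: FrozenSet[str]                           # 1st member
        terms: FrozenSet[str]                                # 2nd member
        formulas: FrozenSet[str]                             # 3rd member
        axioms: FrozenSet[str]                               # 4th member
        definitions: FrozenSet[Definition]                   # 5th member
        derivability: FrozenSet[Tuple[str, FrozenSet[str]]]  # 6th member
        valid_sentences: FrozenSet[str]                      # 7th member

    def c_defined_terms(F: FS, C: FrozenSet[str]) -> FrozenSet[str]:
        # The 'proper recursive definition' asked for in the text: iterate
        # until no further defined term is reachable from C.
        reached = set(C)
        changed = True
        while changed:
            changed = False
            for d in F.definitions:
                if d.defined_term not in reached and d.defining_terms & reached:
                    reached.add(d.defined_term)
                    changed = True
        return frozenset(reached - C)

    def minus(F: FS, C: FrozenSet[str]) -> FS:
        # F-minus-C: strike C and the C-defined terms from the vocabulary,
        # and every C-expression from the remaining members.
        struck = C | c_defined_terms(F, C)
        is_c = lambda e: any(s in e.split() for s in struck)
        return FS(
            vocabulary=F.vocabulary - struck,
            terms=frozenset(t for t in F.terms if not is_c(t)),
            formulas=frozenset(f for f in F.formulas if not is_c(f)),
            axioms=frozenset(a for a in F.axioms if not is_c(a)),
            definitions=frozenset(d for d in F.definitions
                                  if not is_c(d.formula)),
            derivability=frozenset(
                (f, E) for (f, E) in F.derivability
                if not is_c(f) and not any(is_c(e) for e in E)),
            valid_sentences=frozenset(s for s in F.valid_sentences
                                      if not is_c(s)),
        )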
43 Where "-" stands for difference and "∪" for union of sets.
44 For a recent introductory survey, see Rogers (1971); cf. also Hatcher (1968).
45 In a sense where this distinction does not depend on an interpretation of the calculus (cf. the remarks in Carnap (1958) § 25a). Logical connectives and operator symbols like "∃" are counted among the logical constants. We shall also assume that in any logical calculus the (possibly empty) set of mathematical constants can be specified by purely formal means. The mathematical constants are assumed to be non-logical. If our conception of FS of logic is to cover systems of natural deduction (cf. Hatcher (1968) 1.6), we may have to allow for the set of logical axioms to be empty (unless we keep the tautologies as logical axioms). 'Dummy constants' (as introduced by Hatcher l.c., or 'ambiguous names' in Suppes 1957, 4.3) must be excluded as constants. Concepts such as derivation and proof would have to be redefined.
46 Reflecting Church's "pure functional calculus" and "applied functional calculus", used in an analogous context ((1956) 173f).
47 A complete explication of "FS of logic" might show that the above assumption is too strong. In that case we would have to strengthen the concept of harmlessness by including additional requirements in its definition.


We finally extend the concept of FS as follows. Most standard logical calculi, at least the pure ones, have 'readings' in natural languages, i.e. some or all of their constants, terms, and formulas are systematically correlated with expressions of the natural language (thus, "All x are P", "Every x is P", "For every x, x is P", "All x are elements of P" etc. may all correspond to "(x)Px"). Frequently axiomatic theories are formulated not directly by means of an FS of logic but by using a reading of such a system. I allow for readings from the very beginning, suggesting an explication along the following lines. Let us consider regimented forms of natural languages as a special type of FS.48 Such expressions as "All x are P" would belong to a regimented form of English. Assuming this concept, we define as follows.49 F is a natural language reading (NLR) of F1 iff: (1) F is an FS. (2) F1 is an FS of logic whose expressions are not expressions of a regimented part of a natural language. (3) There is an F2 and a function r such that: (a) F2 is a regimented form of a natural language. (b) The domain of r is the set of constants, terms, and formulas of F1. (c) The values of r are sets of expressions of F2. (d) For all e, e1 (e ≠ e1) in the domain of r, r(e) ∩ r(e1) = ∅.50 (e) The vocabulary (the constants, the variables, terms, formulas, axioms, definitions, valid sentences, respectively) of F = the set of all e such that, for some e1, e ∈ r(e1) and e1 ∈ the vocabulary (the constants, variables, terms, formulas, axioms, definitions, valid sentences) of F1. (f) Immediate derivability in F = {<e, E>: (∃e1)(e ∈ r(e1) & <e1, {e2: (∃e3)(e3 ∈ E ∩ r(e2))}> ∈ Immediate derivability in F1)}.51 If F is an NLR of F1, any r that for some F2 satisfies condition (3) is called a rendering of F1 as F. We now make the following ASSUMPTION ON NLRs: If F is an NLR of F1, F is an FS of logic (and if F1 is an FS of predicate logic etc., so is F). This assumption is, on the whole, unproblematic.52
48 In the light of Montague's work (esp. Montague 1970), this seems justifiable.
49 As of now, there does not seem to exist an explication of the concept of reading that would really do justice to the domain the explicatum should cover, i.e. the controlled use of natural languages (extended by symbolisms) in the rigorous formulation of theories. The following definition may still be simplistic, i.e. may apply to only some of the entities that should be covered. I believe, though, that the rest can be obtained by formal operations from readings in the defined sense.
50 With "∩" for set-theoretical intersection and "∅" for the empty set.
51 I.e. e is immediately derivable in F from E if there is an e1 such that: e ∈ r(e1) and e1 is immediately derivable in F1 from the set of all e2 for which there is an element of E that is an element of r(e2). "<..,..>" is the ordered-pair notation and "{..: ...}" denotes the set of all .. such that .... "(∃..)" is to be read as "for some ..".
52 Note that for each term or formula of F1 there may be a multiplicity of terms or formulas of F; e.g., for one axiom of F1, say, "(x)(Px ⊃ Rx)", we might have "Every P is R", "All P are R", "For all x, if x is P, x is R", etc. Also, r(e) may be empty, for a given r and e. Neither possibility should raise problems. It may, however, be advantageous not to introduce r and F2 by existential quantification but take an NLR as a triple <F, r, F2>; this should exclude the possibility of a reading being an FS of different types of logic simultaneously, which is not yet excluded by the above assumption. In view of the ensuing complications I would rather give up the Assumption on NLRs and modify some later formulations. A detailed discussion of many problems connected with readings is found in Schnelle (1973a), 78-100.
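How condition (f) transports derivability from the calculus to its reading can be seen in a toy computation. The following sketch is again ours: the two-formula calculus, the one-step derivability relation, and the rendering r into regimented English are all invented for the purpose:

    # Illustrative only: a calculus F1 with formulas p, q, p&q, one
    # derivability step, and an assumed rendering r into regimented English.
    r = {
        "p": {"Snow is white"},
        "q": {"Grass is green"},
        "p&q": {"Snow is white and grass is green"},
    }
    derivability_F1 = {("p&q", frozenset({"p", "q"}))}

    def derives_in_F(e, E, r, derivability_F1):
        # Condition (f): e is immediately derivable in F from E iff, for
        # some e1 with e in r(e1), e1 is immediately derivable in F1 from
        # the set of all e2 such that E meets r(e2).
        back = frozenset(e2 for e2, vals in r.items() if E & frozenset(vals))
        return any(e in r[e1] and back == E1 for (e1, E1) in derivability_F1)

    print(derives_in_F("Snow is white and grass is green",
                       frozenset({"Snow is white", "Grass is green"}),
                       r, derivability_F1))    # prints True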

In discussing interpreted formalized systems we shall have to keep in mind that NLRs have been allowed as formalized systems.
2.3. Interpreted formalized systems

The standard way of interpreting a formalized theory is characterized in Fraenkel et al. (1973) by stating how a 'first-order theory', i.e. a non-pure calculus of first-order predicate logic, is interpreted; this may be quoted in full (288) because it also applies in the case of formalized systems:
Such a theory is interpreted by providing the logical symbols with their usual signification, by fixing the universe (of discourse) U over which the variables range, and by assigning, through rules of designation, to each individual constant some member of U, to each unary predicate a certain subset of U and in general to each n-ary predicate a certain n-ary relation whose field is a subset of U, finally to each n-ary operation symbol an n-ary function from ordered n-tuples of U to members of U. Calling the sequence consisting of U and these individuals, sets, relations, and functions, ordered in a way similar to that of the constants to which they are assigned, a structure, semi-model, or (possible) realisation of T, rules of truth finally determine under what conditions a formula of T is true in a given structure, relative to a given value-assignment to its free variables if it contains such. (These rules determine, of course, also truth conditions for all the sentences.) An interpretation of a given theory T is called sound if under it all the valid sentences of T become true in the structure determined in the interpretation; this structure itself (and often also the assigning function) is then called a model of the theory.53

53 The above account only indicates the starting-point for model-theoretic work; a more detailed presentation would be impractical in the present context.

Accepting this account, how can we formally represent an interpreted formalized system? There is no answer to this question in Fraenkel et al. (1973). An interpreted system is specified by adding 'semantic rules' to an FS F. The rules are metalinguistic sentences on the 'meaning' or the 'reference' of expressions of F (the above 'rules of designation' belong to the semantic rules). We might take an interpreted system as a couple <F, R> where F is an FS and R a set of semantic rules for F. Alternatively, we could supplement F by what R is about, thus avoiding a second component of the interpreted system that is metalinguistic relative to the first. This solution will here be adopted. Using a Carnapian approach, we identify an interpreted system with a couple <F, D> where D is a 'designation' relation for expressions of F (to be taken as a relation between expressions of F and 'extensions' such as classes and truth values, or between expressions and 'intensions' such as properties and propositions). To make this more precise, we first introduce an auxiliary concept. Let "B", "B1", ..., "C", "C1", ..., "D", "D1", ..., "U", "U1", ..., "v", "e", "f", "x" stand for any sets. Let F be an FS of logic. B is called a MINUS-C BASIS FOR <F, D> if D is a two-place relation and C a harmless set of constants of F and B a triple <<U1, ..., Un>, v, f> whose first member is a non-repetitious n-tuple of non-empty sets, with n = the number of different types of individual variables of F, such that:
(1) f is a one-place function whose domain is the set of constants, terms, and formulas of F, and there is exactly one m > 1 such that the values of f are 'set-theoretical constructs in <<U1, ..., Un>, {1, ..., m}>'.54
(2) v is a one-place function whose domain is the (possibly empty) set of undefined non-logical constants of F-minus-C such that, for any argument e of v, v(e) ∈ f(e).55
(3) The domain of v ⊆ the domain of D ⊆ the domain of f.56
(4) For all <e, x> ∈ D, x ∈ f(e), and if e ∈ the domain of v, x = v(e).
The term "interpreted formaUzed system" will not be defined; we only give necessary conditions for its application that are indispensable for subsequent discussion. From the very beginning we will relativize the notion to "interpreted except for a set of constants C (and all C-expressions)". This will eventually enable us to deal with so-called partial interpretation of axiomatic theories. ASSUMPTIONS ON MINUS-C INTERPRETED FORMALIZED SYSTEMS OF LOGIC (C-IFS of logic). If <F, D> is a C-IFS of logic, then: (1) F is an FS of logic. (2) C is a set of constants of F. (3) D is a one-place function. (4) The domain of D is the set of constants (including logical ones), closed terms (i.e. terms without free variables), and closed formulas (i.e. sentences) of F-minus-C. (5) There is a minus-C basis for <F, D>. An interpreted formalized system of kgic (IFS of logic) is a couple <F, D> that is a C-IFS of logic, for some C.
54 f is the function assigning to each 'well-formed' expression of F the set of its 'possible values relative to <<U1, ..., Un>, {1, ..., m}>'. Thus, in the above quotation from Fraenkel et al. (1973) it is specified that the set of 'possible values' for any individual constant is U, the set of 'possible values' of a unary predicate is the power set (the set of subsets) of U, etc.; the set of possible values for formulas could be taken as {1, 2}, the set of 'truth-values'. The term "set-theoretical construct in" will not be defined, but set identity, power set, and Cartesian product may be taken as examples. Following recent advances in intensional logic, intensional entities such as properties of elements of U1 could also be treated via set-theoretical constructs in an appropriate tuple of sets that would include U1; e.g. we might add among others a set J of 'possible worlds' as a fourth member of B and take the values of f as set-theoretical constructs in <<U1, ..., Un>, {1, ..., m}, J>. Thus, if e is a formula, f(e) could be the set of propositions understood as functions from J into {1, ..., m}, and properties etc. would be treated by also using the Uj. The above formulation is to be understood as generalized on these lines whenever necessary. We did not use a more general formulation in order to leave open the specific form it might take.
55 I.e. v(e) is a 'possible value of e relative to <<U1, ..., Un>, {1, ..., m}>'.
56 With "⊆" for set inclusion.

A minus-C interpreted formalized system of predicate logic (etc.) is a C-IFS of logic <F, D> such that F is an FS of predicate logic (etc.).57 An IFS of predicate logic (etc.) is a C-IFS of predicate logic (etc.), for some C. We further make the following ASSUMPTION 1 ON IFSs OF LOGIC: For all F, B, C, if there is a D such that <F, D> is a minus-C IFS of logic (of predicate logic etc.) and B is a minus-C basis for <F, D>, then there is exactly one D such that <F, D> is a minus-C IFS of logic (of predicate logic etc.) and B is a minus-C basis for <F, D>. This assumption is justified as follows. A minus-C basis for an interpreted formalized system of logic <F, D> would be specified by rules of designation, including rules for possible values (these would specify a class of functions to which f belongs). On this basis, the rules of evaluation (including the rules of truth) uniquely determine D if an interpretation of the logical constants of F is presupposed. The assumption is needed for some important considerations on axiomatic theories in § 3.2, Ad (10), and § 3.3.

An IFS of logic <F, D> is called extensional if every sentence e of F is extensional in <F, D> in the following sense: Let e1 be in the domain of D and occur in e; let e2 be 'equivalent' to e1 in <F, D>;58 let e3 be the expression obtained from e by replacing an occurrence of e1 in e by e2: Then e is equivalent to e3 in <F, D>.59

In constructing an empirical theory we do not use a formalized system of logic that is completely unspecified semantically.60 Rather, we use systems that have been specified 'up to designation': The semantic rules have been formulated up to the point where indication of a set C of constants and an appropriate triple <<U1, ..., Un>, v, f> is sufficient to characterize a relation D that would make the system (minus-C) interpreted. Let us call such systems 'pre-interpreted' and treat them as pairs <F, Δ> (where "Δ" stands for sets of relations D), as follows: <F, Δ> is a PRE-INTERPRETED FORMALIZED SYSTEM OF LOGIC (Pre-IFS of logic) iff: (1) F is an FS of logic. (2) For any D ∈ Δ: (a) The domain of D ⊆ the set of constants, terms, and formulas of F. (b) For any C, if there is a minus-C basis for <F, D>, <F, D> is a minus-C IFS of logic. (Again, "logic" as part of the defined term may be replaced by a more specific expression like "predicate logic" if the same is done in the definiens.)

57 These definitions may have to be replaced by different ones that contain special conditions on D. The term in question should then be taken as undefined in the present essay, assuming, of course, that a C-IFS of predicate logic etc. is a C-IFS of logic.
58 I.e. the sentence e2 ↔ e1 is true in <F, D> (where "↔" is defined by material equivalence between sentences and "true in" is defined by reference to truth-values in the usual way).
59 This is strictly analogous to Carnap's definition of "extensional" for 'semantical systems' ((1956) § 48). Note that this definition does not assume anything about the values of D (whether they are 'extensions' or 'intensions').
60 This has been emphasized by Stegmüller ((1969/70) II, 303-305, 340).


A Pre-IFS of logic <F, Δ> is called extensional if, for every D ∈ Δ, if <F, D> is an IFS of logic, <F, D> is extensional (as defined above). Making a certain assumption on natural language readings we may obtain a Pre-IFS from F1 if F1 is an NLR of F and <F, Δ> is a Pre-IFS. We first define, for any two-place relation D, any NLR F1 of F and any r that renders F as F1: The r-form of D = {<e, x>: (∃e1)(<e1, x> ∈ D & e1 ∈ the domain of r & e ∈ r(e1))}. We now make the following ASSUMPTION ON NLRs AND PRE-IFSs: Let <F, Δ> be a Pre-IFS of logic (predicate logic etc.), F1 an NLR of F, r a rendering of F as F1, and D ∈ Δ. Then, for all C, if there is a minus-C basis for <F1, the r-form of D>, then <F1, the r-form of D> is a minus-C IFS of logic (of predicate logic etc.). From this we obtain the theorem: If <F, Δ> is a Pre-IFS of logic (etc.), F1 an NLR of F, and r a rendering of F as F1, then <F1, Δ*> is a Pre-IFS of logic (etc.), where Δ* = {D: (∃D1)(D1 ∈ Δ & D = the r-form of D1)} (proof omitted). If <F, Δ> is a Pre-IFS of logic and C a set of constants of F, we understand by <F, Δ> EXCLUDING C that couple <F1, Δ1> for which F1 = F-minus-C and Δ1 = {D1: (∃D)(D ∈ Δ & D1 = D - the CF-part of D)}, where the CF-part of D is defined as follows, for any FS F, set C of constants of F, and any relation D: that subrelation of D whose first-place members are C-expressions of F. Making an additional assumption to be formulated immediately, we obtain the following theorem (proof omitted): If <F, Δ> is a Pre-IFS of logic and C a harmless set of constants of F, <F, Δ> excluding C is a Pre-IFS of logic. For use in the assumption we first define, for any FS of logic F, any set C of constants of F, and any relation D: the C-reduction of <F, D> = <F-minus-C, D - the CF-part of D>. ASSUMPTION 2 ON IFSs OF LOGIC then reads: Let F be an FS of logic and C a set of constants of F. If existence of a minus-C1 basis for <F, D> is sufficient for <F, D> to be a C1-IFS of logic, then existence of a minus-C1 basis for the C-reduction of <F, D> is sufficient for the C-reduction of <F, D> to be a C1-IFS of logic. The following three concepts are all based on exclusion. <F1, Δ1> is contained in a Pre-IFS of logic <F, Δ> if, for some set C of constants of F, <F1, Δ1> = <F, Δ> excluding C.61 If <F, Δ> is a Pre-IFS of logic, the underlying logic of <F, Δ> = <F, Δ> excluding C*(F), where C*(F) = the set of undefined non-logical constants of F. A Pre-IFS <F, Δ> is logically compatible with another, <F1, Δ1>, if the underlying logic of <F, Δ> is contained in the underlying logic of <F1, Δ1>; every expression e of F that is an expression of F1 'has the same logical status' in F and F1; and, for all e, x, x1, D ∈ Δ and D1 ∈ Δ1, if <e, x> ∈ D and <e, x1> ∈ D1, x = x1.62
61 This concept will be needed for discussion in § 4.2.
62 "Has the same logical status in ... and ..." cannot be defined by the concepts introduced in § 2.2. Intuitively, we must prevent cases like a constant e being an individual constant in F and a two-place predicate in F1; or a logical constant in F and a non-logical one in F1. For explicating "logical status" we would have to specify the subclassifications assumed for the vocabulary, terms, and formulas of a given FS of logic. This goes beyond the limits of the present essay.
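Schematically, the exclusion operation on pre-interpreted systems combines the two 'minus' operations componentwise. In the following sketch (ours), minus and cf_part are assumed as stand-ins for F-minus-C and the CF-part of D as defined above:

    def excluding(F, Delta, C, minus, cf_part):
        # <F, Delta> excluding C: apply F-minus-C to the first member, and
        # strike from each D in Delta its CF-part, i.e. the subrelation
        # whose first-place members are C-expressions of F.
        F1 = minus(F, C)
        Delta1 = {frozenset(D - cf_part(D, C, F)) for D in Delta}
        return (F1, Delta1)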


The following concept and assumption will be needed for the Conflation Theorem in § 3.4, below: If <F, Δ> and <F1, Δ1> are Pre-IFSs, the amalgamation of <F, Δ> and <F1, Δ1> is the couple <F2, Δ2> for which F2 = the union of F and F1, and Δ2 = {D: For some D1 ∈ Δ and some D2 ∈ Δ1, D = D1 ∪ D2}. ASSUMPTION ON AMALGAMATIONS: The amalgamation of two Pre-IFSs of logic that are (mutually) logically compatible with each other is a Pre-IFS of logic. ALL SUITABLE TERMS OF § 2.2 THAT REFER TO FSs, and for which so far no analogues have been defined, ARE EXTENDED to both interpreted and pre-interpreted systems as follows: The vocabulary of such a system <F, D> or <F, Δ> is the vocabulary of F, the terms are the terms of F, etc. Finally, a minus-C interpreted formal system of logic is a C-IFS of logic <F, D> such that F is a formal system; "IF1S of logic" and "Pre-IF1S of logic" are defined analogously. We now have available the necessary apparatus for discussing axiomatic theories.

2.4. Abstract axiomatic theories

For developing our notion of abstract ('uninterpreted') axiomatic theory (AT) we may start from Carnap's account of 'axiom systems' ((1958) § 42):
For the formulation of an AS we need to choose or construct a language L, the so-called basic language of the AS. Usually this basic language contains only logical signs. The axioms and theorems of the AS contain certain constants not occurring in language L, called the axiomatic constants of the AS. Some of them are given without definitions; they are called the axiomatic primitive constants of the AS. All other axiomatic constants are introduced by definitions on the basis of the primitives. The language L' obtained from the basic language L by adding the axiomatic constants is called the axiomatic language.

By "axiom system" (AS) Carnap seems to understand the set of all axioms, theorems, and definitions.63 For relating an AT to an AS in the sense of Carnap, different proposals may be considered. First, we could simply identify AT and AS. But this is inadequate because ATs with different sets of axioms could still be identical (there may be more than one way to 'axiomatize' the same set of sentences). Next we could identify the AT with the 'axiomatic language' of the AS. The axiomatic language would be taken as a formal system of logic in the sense of 2.2, which agrees well with Carnap's conception ((1958) 26b, 42a, b). Choosing the axiomatic language as the AT would correspond to the approach in Fraenkel et al. (1973) (cf. e.g. 283). However, the difference made by Carnap between those constants and sentences of the axiomatic language which belong to the AS and those which don't is not formally reconstructed. This may seem unimportant if the 'basic language' of the AS contains only logical constants; in
63

In Carnap (1960), "Axiomensystem" refers to the set of axioms only.

Grammars as theories: the case for axiomatic grammar (part I)

69

This may seem unimportant if the 'basic language' of the AS contains only logical constants; in this case, it can be identified with the axiomatic language minus its non-logical constants, in the technical sense of "minus" (§ 2.2). For the conception to be developed here it will be important, though, that some non-logical constants of the 'language' of the AT may not be axiomatic constants of the AT. Therefore, in defining the notion of an AT, the axioms, definitions, and constants of the AT should be given independent status. Also, the 'language' of the AT should not be a formalized system but a pre-interpreted formalized system, as noted above (§ 2.3). The 'basic language' can be identified with the axiomatic language excluding the axiomatic constants, in the technical sense of "excluding" (§ 2.3); we start from the axiomatic rather than the basic language.64 This leads us to a third proposal: An AT is considered as a quadruple consisting of the axiomatic language, the axioms, definitions, and axiomatic constants of an AS in the sense of Carnap.65 We will have to distinguish two forms of ATs, only one of which is closely related to ASs. In order to have a second handy term for theories of this form we will re-introduce "axiom system" as a technical term into our framework.

For the following definitions, "L", "L1", ... are introduced as variables over pairs <F, Δ>; "S", "S1", ... stand for arbitrary sets of formulas of formalized systems; "C", "C1" for any sets of elements of the vocabulary of any formalized system. An ABSTRACT AXIOMATIC THEORY OF TYPE 1 (AT1) or AXIOM SYSTEM (AS) is an ordered quadruple <L, C, S, S1> such that: (1) L is a pre-interpreted formalized system of logic. (2) C is a set of undefined constants of L. (3) S is the (non-empty) set of C-axioms of L. (4) S1 is the set of C-definitions of L. (5) Every element of C occurs in an element of S or S1. Note that C is non-empty because of (3), and that S1 may be empty. Both the C-axioms and C-definitions may contain non-logical constants of L that are neither in C nor C-defined. (Cf. the definitions in § 2.2, p. 61.) By (5) 'unused' constants are excluded, which would have been too stringent a requirement on formalized or formal systems in general. S and S1 are disjoint by (1), (3), and (4).

64 The appealing idea that the axiomatic language is obtained from the basic language by 'adding constants' seems to cause insuperable difficulties when a precise formulation is attempted. Similarly, Stegmüller ((1969/70) II, 305) notes for the 'observation language' and the 'theoretical language' that they can be specified only as parts of the total language of a theory.
65 A similar approach is occasionally found in the literature; thus, in Mates (1965) Ch. 11, § 1, a first-order theory is identified with a pair whose second member is the set of non-logical constants occurring in the formulas of its first member (the set of theorems); similarly, Rogers (1971) 54.


Using "T", "T1", ... to stand for quadruples <L, C, S, S1>, we introduce the following auxiliary terms for any AS T = <L, C, S, S1>: L is the axiomatic language of T; C the set of axiomatic primitives of T; S the set of axioms, and S1 the set of definitions of T. The defined constants of T = the C-defined terms of the axiomatic language of T. The axiomatic constants of T = the axiomatic primitives and defined constants of T. The axiomatic terms of T = the C-terms of the axiomatic language of T (i.e. any term in which an axiomatic constant occurs) and the axiomatic constants of T. The basic language of T = the axiomatic language of T excluding C. The non-logical constants of T = the non-logical constants of the axiomatic language of T. The mathematical constants of T = the mathematical constants of the axiomatic language of T (cf. above, fn. 45). The basic non-logical (non-mathematical) constants of T = the non-logical (non-mathematical) constants of the basic language of T. The defined (undefined) terms of T = the defined (undefined) terms of the axiomatic language of T. The theorems of T = the sentences of the axiomatic language of T that are not axioms or definitions of the axiomatic language of T; are derivable in the axiomatic language of T from a set of sentences each element of which is an axiom of T, a definition of T, or a valid sentence of the basic language of T; and are not derivable from a set of valid sentences of the basic language of T. The sentences of T = the axioms, definitions and theorems of T.

It is unusual that we should have defined "theorem" by using validity rather than just derivability. This enables us to speak of axiomatic theories even if the basic language (or rather, its first component) is non-axiomatizable in the sense of § 2.2, p. 61, which would make the axiomatic language non-axiomatizable, too (note that the latter has not been required to be a formal Pre-IFS of logic). In this way we make sure that all true mathematical sentences can actually be used in an empirical theory in the derivation of theorems; if we had required that only formal theorems of the axiomatic language of T be allowed in the derivation of theorems of T, we would run into trouble because of non-axiomatizability results obtained in the foundations of mathematics.66

Our conception of "theorem" also allows us to introduce a concept of axiomatizability for theories that does not presuppose axiomatizability of the language of the theory. We first define: An abstract formalized theory of type 1 (AFT1) is an ordered quadruple <L, C, S, S1> such that: (1) L is a Pre-IFS of logic. (2) C is a set of undefined constants of L. (3) S is the (non-empty) set of valid C-sentences of L. (4) S1 is the set of C-definitions of L. T is called a (type-1) axiomatization of T1 iff: (1) T is an AT1. (2) T1 is an AFT1. (3) The axiomatic language of T contains the first component of T1. (4) The axiomatic constants of T = the second component of T1. (5) The theorems of T = the third component of T1. (6) The definitions of T = the fourth component of T1. T is called (type-1) axiomatizable if there is a (type-1) axiomatization of T. If an AFT1 is not axiomatizable, it may still have an axiomatizable 'part', in the following sense: Let T, T1 be any AFT1 or AT1. T is a part of T1 (T1 is an extension of T) iff the first component of T is contained in the first component of T1 and the second, third, and fourth components of T are subsets of the corresponding sets of T1.

66 Cf. Fraenkel et al. (1973) 310-320, for a summary of relevant results.
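The definition of "theorem" (derivable with the help of the axioms, but not from the valid sentences of the basic language alone) can be mimicked computationally. The sketch below is ours, with an invented one-step inference rule and invented example sentences:

    def closure(premises, derive_step):
        # Close a set of sentences under a one-step derivability relation.
        known = set(premises)
        while True:
            new = derive_step(known) - known
            if not new:
                return known
            known |= new

    def theorems(axioms, definitions, basic_valid, derive_step):
        # derivable from axioms, definitions, and valid basic sentences ...
        derivable = closure(set(axioms) | set(definitions) | set(basic_valid),
                            derive_step)
        # ... but not already derivable from the valid basic sentences alone,
        # and not themselves axioms or definitions
        trivial = closure(set(basic_valid), derive_step)
        return derivable - trivial - set(axioms) - set(definitions)

    # Toy one-step rule (assumed): from "A" and "A->B" infer "B".
    def modus_ponens(sentences):
        out = set()
        for s in sentences:
            if "->" in s:
                a, b = s.split("->", 1)
                if a in sentences:
                    out.add(b)
        return out

    print(theorems(axioms={"p"}, definitions=set(),
                   basic_valid={"p->q"}, derive_step=modus_ponens))  # {'q'}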


The second type of abstract axiomatic theories is obtained by defining an 'explicit predicate' (Carnap (1958) § 42d); best-known examples are 'axiomatizations within set theory', where a 'set-theoretical predicate' is defined (Suppes (1957) Ch. 12, is one of the best introductions). By an ABSTRACT AXIOMATIC THEORY OF TYPE 2 (AT2) we understand a couple <L, e> such that: (1) L is a pre-interpreted formalized system of logic that contains a Pre-IFS of predicate logic. (2) e is a definition of L. (3) The defined term of e is a predicate of L. (4) The definiendum of e is an atomic formula of L. (5) All {the defined term of e}-definitions of L except e are conditional definitions whose antecedent is the definiendum of e. (6) The axioms of L = the axioms of L excluding {the defined term of e}. We have the following auxiliary terms for an AT2 <L, e>: L is the language; e the (main) definition; and the defined term of e is the predicate of <L, e>. The axioms of <L, e> are the conjuncts forming the definiens of e if the definiens is a conjunction; otherwise, the definiens is the only axiom. The definitions of <L, e> are the {the defined term of e}-definitions of L. The theorems of <L, e> are the universal implications having the definiendum of e as their antecedent which are derivable in L from a set each element of which is a definition of <L, e> or a valid sentence of L, and which are not derivable in L from a set of sentences of L that does not contain a definition of <L, e>.67 The non-logical (non-mathematical) constants of <L, e> are the non-logical (non-mathematical) constants of L (there may be such although there are no 'axiomatic constants').

The following two concepts apply to both types of ATs in an analogous way: An AT is formulated in the language of predicate logic if the axiomatic language (the language) of the AT is a Pre-IFS of predicate logic. The AT is formulated in the language of set theory if there is an AT1 that is a set theory and whose axiomatic language = the axiomatic language (the language) of the AT excluding the set of undefined non-logical constants of the AT that are not axiomatic primitives of the set theory.68

AT1s and AT2s can be related as follows (cf. also Carnap (1958) § 42d). Assume an AT1 T such that: The axiomatic language of T is a pre-interpreted formal system of predicate logic; the axioms, definitions and axiomatic primitives of T are finite in number; and all definitions satisfy the criterion of eliminability for the defined term.69 Transform T as follows: First eliminate all defined axiomatic constants from the axioms on the basis of their definitions. In the sentences so obtained, replace the n axiomatic primitives, in a standard way, by n distinct variables of the axiomatic language of T. Let e be a non-repetitious conjunction of the new formulas: e is an open formula of the basic language of T.

67 Valid sentences are included for the same reason as above in the case of AT1s. The concepts of abstract theory and axiomatizability could also be generalized.
68 The qualification is not needed if those primitives are counted as logical.
69 Some but not all of these restrictions could be removed.


Now consider any L1 such that, for some e1 and e2: e1 is an n-place predicate of L1; e2 is a definition of L1; the definiendum of e2 = the open atomic formula consisting of e1 and a non-repetitious n-tuple of the n free variables of e (thus, e1 = the defined term of e2); the definiens of e2 = e; the axioms of L1 = the axioms of the basic language of T; and the basic language of T = L1 excluding {e1}. Now consider any expression e3 that has been obtained from a theorem of T by treating it in the same way as the axioms of T: Obviously, the universal implication whose antecedent is the definiendum of e2 and whose consequent is e3 is provable in L1. T is, in a sense, represented by <L1, e2>. This leads to the introduction of AT2s.70 The most striking feature of an AT2 is its lack of axiomatic constants. In particular, if the basic language of a corresponding AT1 is a pure Pre-IFS of logic (without any non-logical constants), the same holds for the language of the AT2. Considering the interpretation problems raised by some axiomatic constants in empirical theories, e.g. by the 'theoretical terms', AT2s might seem superior. Actually, their apparent advantage will turn out to be spurious.
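As an illustration of the transformation (our example, in the spirit of Suppes' set-theoretical predicates): from an AT1 for semigroups whose single axiomatic primitive is an operation symbol and whose single axiom asserts associativity, we obtain an AT2 whose main definition e might read, in LaTeX notation:

    x \text{ is a semigroup} \leftrightarrow (\exists A)(\exists \circ)
      [\, x = \langle A, \circ \rangle \;\wedge\; \circ\colon A \times A \to A
      \;\wedge\; (a)(b)(c)\,((a \circ b) \circ c = a \circ (b \circ c)) \,]

The theorems of such an AT2 are then universal implications of the form "For all x, if x is a semigroup, then ...", exactly as required by the definition above.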

3. A framework for axiomatic theories (2): Realized theories and theory integration

3.1. Realized axiomatic theories: Interpreted and applied

As mentioned in § 2.1, the realized axiomatic theories (RTs) are subdivided into interpreted and applied. The former are obtained from AT1s as would be expected, by 'interpretation', i.e. by adding a triple <<U1, ..., Un>, v, f> (where n is the number of different types of individual variables of the axiomatic language). Certain constants may be left uninterpreted, for reasons that will become clear immediately. A MINUS-C INTERPRETED AXIOMATIC THEORY (C-INT) is a couple <T, B> such that: (1) T is an AT1. (2) Every element of C is an undefined non-logical, non-mathematical constant of T. (3) There is a D such that, if <F, Δ> = the axiomatic language of T, then D ∈ Δ, and B is a minus-C basis for <F, D>.71 An interpreted axiomatic theory (INT) is a couple <T, B> that is a C-INT, for some C. A completely interpreted axiomatic theory (cINT) is a ∅-INT (∅ = the empty set). An incompletely interpreted axiomatic theory (iINT) is an INT that is not a ∅-INT. The concept of minus-C interpreted formalized theory (C-INTF) may be defined in strict analogy to C-INT, by using AFT1s and making obvious substitutions.

70 The above account has to be modified when an AT1 is related to a theory 'axiomatized within set theory' in the customary sense: The predicate of the latter usually has m places more than would be expected, by introducing an m-tuple of sets such that the i-th set corresponds, in the AT1, to the set of possible values for individual variables of type i.
71 As <F, Δ> is a Pre-IFS, it follows from the relevant definitions in § 2.3 that <F, D> is an interpreted formalized system of logic.
Grammars as theories: the case for axiomatic grammars (part I)

73

One would expect a subdivision of RTs corresponding to the distinction between AT1s and AT2s. On the usual conception an AT2 is 'interpreted' by adding a sentence to the effect that a certain n-tuple of entities satisfies the predicate of the theory. Technically, this corresponds to constructing an abstract axiomatic theory (of type 1) from the AT2 by introducing an axiom and axiomatic primitives. From this theory we may obtain an interpreted theory in the previous sense. Hence, instead of introducing a new type of RT to correspond to AT2s, we characterize a special type of AT1s: An interpreted axiomatic theory with predicate (INTP) is an INT <T, B> for which there is an AT2 <L, e> and an e1 such that: (1) The basic language of T = L. (2) e1 is an axiom of T. (3) e1 is an atomic sentence such that: (a) The predicate of e1 = the predicate of <L, e>. (b) The argument expression of e1 consists of the axiomatic primitives of T.72

The concept of an incompletely interpreted axiomatic theory was introduced to deal with so-called 'partial interpretation' of empirical theories. It would be inadequate, though, simply to equate a 'partially interpreted' empirical theory with an iINT: The essential distinction in the former between the 'theoretical' part and the 'correspondence' part (relating 'theoretical terms' and 'observation terms') would not be reconstructed. A 'partially interpreted' empirical theory should have a richer structure than an INT. This leads us to introduce applied theories as a second type of realized axiomatic theories. The following definition will be explained in the next subsection (§ 3.2): An APPLIED AXIOMATIC THEORY (APT) is an ordered quadruple <T, T1, L, B> such that: (1) T is an AT1. (2) T1 is an AT1. (3) The axiomatic primitives of T ⊂ the axiomatic primitives of T1. (4) The axiomatic primitives of T1 ⊆ the constants of T1 that are non-logical and non-mathematical. (5) The axiomatic language of T = the axiomatic language of T1 excluding the set of axiomatic primitives of T1 that are not axiomatic constants of T. (6) The definitions of T1 = the definitions of T. (7) For every axiom e of T1 that is not an axiom of T there is an e1, e2, e3 such that: (a) e1 is an axiomatic constant of T1 that is not an axiomatic constant of T. (b) e1 occurs in e. (c) e2 is an axiom of T1. (d) e1 occurs in e2. (e) e3 is an axiomatic term of T. (f) e3 occurs in e2.73 (8) L is a pre-interpreted formalized system of logic. (9) The axiomatic language of T1 is contained in L.

72 On this account AT1s are more basic than AT2s in the following respect: Even the concept of an INTP presupposes the concept of AT1; the attempt to interpret an AT2 automatically creates an AT1 that is then interpreted. Until Sneed (1971), interpretation problems of AT2s had been given little consideration. The above conception corresponds most closely to the first of several approaches discussed by Sneed. (His book came to my attention too late to be carefully considered in the present paragraph.)
73 In addition, e must satisfy certain formal conditions, such as not being the conjunction of a sentence of T and a sentence of T1 that is not a sentence of T. For relevant discussion, see Stegmüller (1969/70) II, Ch. V, esp. § 9.


(10) If <F, Δ> is the axiomatic language of T1 and C the set of axiomatic primitives of T and undefined basic non-logical and non-mathematical constants of T1, then for some D ∈ Δ, B is a minus-C basis for <F, D>.

Corresponding to an INTP we have an applied axiomatic theory with predicate (APTP), i.e. an APT <T, T1, L, B> for which there is an AT2 <L1, e> and an e1 such that: (1) The basic language of T = L1. (2) e1 is an axiom of T. (3) e1 is an atomic sentence such that: (a) The predicate of e1 = the predicate of <L1, e>. (b) The argument expression of e1 consists of the axiomatic primitives of T. Notice that we have obtained a complete parallelism between INTs and APTs: In either form there are theories with and without predicate. The realized axiomatic theories may now be identified with the INTs and APTs.

In defining "applied theory" we were trying to reconstruct the concept of 'partially interpreted (empirical) theory' as developed by Carnap and others and exhaustively discussed in Stegmüller (1969/70), II, 181-437.74 There are, roughly, the following correspondences. T: the theoretical part of the theory. T1: the theoretical part supplemented by correspondence rules. L: the total language of the theory, possibly including sentences that are not expressions of the theory. B: an 'interpretation' specifying domains of individuals for all individual variables of the theory and assigning values only to the non-theoretical primitives of the theory.

74 For further discussion see, for instance, Suppe (1971) (who does not really take into account Carnap's later views), and Lewis (1970); partly relevant also Sneed (1971). Lewis presents an important proposal that takes Carnap's final views as a starting-point but "has the advantage that it permits theoretical terms to be fully interpreted and explicitly defined" (439f). It consists, roughly, in positing 'meaning postulates' to the effect that the theoretical terms name the components of the unique realization of the theory (the model that satisfies the conjunction of its axioms) in case there is such a realization, and do not name anything otherwise (434). I find this quite unsatisfactory for a reason which I can only hint at: If we allow the theoretical terms to be first members of a naming relation, then we should also allow entities to be named by them regardless of whether those entities form a model that satisfies the axioms: We should admit a unique realization of the theory and an interpretation of its theoretical terms such that the latter do not name the components of the former. In other words, I am arguing that Lewis' 'meaning postulates' are inadequate for explicating "interpretation" for theoretical terms as intended by Lewis, viz. in a 'realistic', not an 'instrumentalist', way. There are other problems with Lewis' proposal (thus, his crucial 'empirical hypothesis' p. 439 does seem to lead to a negative result, at least in my case) but it has to be admitted that accepting his proposal should simplify some of the problems raised by our own treatment (cf. below, § 3.3). In Stegmüller (1973), which appeared only after the present essay was finished, a new conception of theories is proposed that is a modified and expanded version of Sneed's approach (1971). As I shall show in another paper, most of the criticisms that Stegmüller directs against the 'received view' of theories do not apply to the conception of realized axiomatic theories as developed above, and valuable insights gained by Sneed and Stegmüller can be incorporated into my own framework.


The axiomatic constants of T correspond to the 'theoretical terms', the axioms of T to the 'theoretical axioms'; the axiomatic constants of T1 that are not constants of T correspond to the 'observation terms', the axioms of T1 that are not axioms of T to the 'correspondence rules'. The axiomatic language of T corresponds to the 'theoretical language'; L excluding the non-logical constants of T to the 'observation language'.

Before turning to some more detailed comments on applied theories, we introduce the auxiliary terminology that will be needed for APTs later on in this paper. Let "Θ", "Θ1", ... stand for arbitrary quadruples <T, T1, L, B>. If Θ = <T, T1, L, B> is an APT, T will be called the core, T1 the applied core, L the total language, and B the semantic basis of Θ. The semantic basis (an ordered triple) consists of the domain sequence, the primitive interpretation, and the possible-value function of Θ. In connection with T, we have the core language of Θ = the axiomatic language of T; the core constants of Θ (divided into axiomatic and basic) = the non-logical constants of T. The core axioms, definitions, theorems, and sentences of Θ are the corresponding sets of T. The application language of Θ = the total language excluding the core constants of Θ. The axiomatic language of T1 is called the applied-core language of Θ. The axiomatic constants, axiomatic terms, axioms, definitions, theorems, and sentences of Θ are the corresponding sets of T1. Those axiomatic primitives of T1 which are not axiomatic constants of T will be called application constants of Θ, and those axioms of T1 which are not axioms of T, application axioms of Θ. The triple <S1, L1, B> such that S1 is the set of application axioms, L1 the application language, and B the semantic basis of Θ is the application of Θ.

The concept of formalized theory can be extended so as to allow for applied theories, as follows. An applied formalized theory (APFT) is a quadruple <T, T1, L, B> such that: (1) T is an AFT1. (2) T1 is an AFT1. (3) The second component of T ⊂ the second component of T1. (4) The second component of T1 ⊆ the constants of the first component of T1 that are non-logical and non-mathematical. (5) The first component of T = the first component of T1 excluding the difference of the second components of T1 and T. (6) The fourth component of T1 = the fourth component of T. (7) L is a Pre-IFS of logic. (8) The first component of T1 is contained in L. (9) If <F, Δ> is the first component of T1 and C = the second component of T ∪ the set of undefined non-logical and non-mathematical constants of T1 that are not elements of the second component of T1, then for some D ∈ Δ, B is a minus-C basis for <F, D>.

The concepts of axiomatization and axiomatizability can now also be extended: Θ is an axiomatization of Θ1 iff: (1) Θ is an APT. (2) Θ1 is an APFT. (3) The core of Θ is a (type-1) axiomatization of the first component of Θ1. (4) The applied core of Θ is a (type-1) axiomatization of the second component of Θ1. (5) The total language of Θ = the third component of Θ1. (6) The semantic basis of Θ = the fourth component of Θ1. Θ1 is axiomatizable if there is an axiomatization of Θ1.
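Schematically (our illustration only, with the pre-interpreted languages and the semantic basis left unanalysed), the derived terminology can be read off the quadruple as follows:

    from dataclasses import dataclass
    from typing import FrozenSet

    @dataclass(frozen=True)
    class AT1:                     # abstract axiomatic theory of type 1
        language: object           # a Pre-IFS of logic (left unanalysed here)
        primitives: FrozenSet[str]
        axioms: FrozenSet[str]
        definitions: FrozenSet[str]

    @dataclass(frozen=True)
    class APT:                     # applied axiomatic theory <T, T1, L, B>
        core: AT1                  # T: the 'theoretical part'
        applied_core: AT1          # T1: T supplemented by correspondence rules
        total_language: object     # L
        semantic_basis: object     # B = <<U1,...,Un>, v, f>

    def application_constants(theta: APT) -> FrozenSet[str]:
        # the 'observation terms' (simplified: the text subtracts all
        # axiomatic constants of T, including the defined ones)
        return theta.applied_core.primitives - theta.core.primitives

    def application_axioms(theta: APT) -> FrozenSet[str]:
        # the 'correspondence rules': axioms of T1 that are not axioms of T
        return theta.applied_core.axioms - theta.core.axioms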


We also generalize the concept of PART as defined for AT1s and AFT1s (§ 2.4, p. 70). Let T be any AT1 or AFT1 and Θ and Θ1 any APTs or APFTs. T is a part of Θ if T is a part of the second component of Θ; Θ is a part of T if the first component of Θ is a part of T; and Θ is a part of Θ1 if the first component of Θ is a part of the second component of Θ1.75 The (homogeneous) part relations are obviously reflexive; they are also antisymmetric and transitive if the same is assumed for the relation of containing in § 2.3, p. 67. An APT or APFT Θ is FORMULATED IN THE LANGUAGE OF PREDICATE LOGIC if the third component of Θ is a Pre-IFS of predicate logic, and FORMULATED IN THE LANGUAGE OF SET THEORY if there is an AT1 that is a set theory and whose axiomatic language = the third component of Θ excluding the set of undefined non-logical and non-mathematical constants of the third component of Θ that are not axiomatic primitives of the set theory. We now turn to a more detailed discussion of APTs, beginning with comments on the definition.

75 For the last definition a stronger requirement could also be considered: Θ is a part of Θ1 iff: (1) The first component of Θ is a part of the first component of Θ1. (2) The second component of Θ is a part of the second component of Θ1. (3) The third component of Θ is contained in the third component of Θ1. (4) The fourth component <<U11, ..., U1n>, v1, f1> of Θ is related to the fourth component <<U21, ..., U2m>, v2, f2> of Θ1 as follows: U2j ⊇ U1i, for all i = 1, ..., n and some j = 1, ..., m; v1 ⊆ v2; and f1 ⊆ f2.

3.2. On the definition of "applied axiomatic theory"

Ad (1) to (6). The set of primitives of T is non-empty by (1) and the definition of "AT1", (3). As it is a proper subset of the set of primitives of T1, the set of axioms of T is also properly included in the set of axioms of T1, by (5), (6), and the definition of "AT1", (5); in other words, there are 'correspondence rules', and the 'observation terms' occurring in them are not defined in T1. It also follows that all theorems of T are theorems of T1. In (4) we explicitly recognize the possibility of non-logical constants other than the axiomatic ones and exclude purely logical or mathematical theories from the applied ones.

Ad (7). This is a liberalization of the usual requirement on 'correspondence rules', which can be partly restated as: Every axiom of T1 that is not an axiom of T contains an axiomatic constant of T and an axiomatic constant of T1 that is not a constant of T. The liberalization is motivated by the fact that we allow for non-logical constants of the basic language. An application axiom may relate an application term and a basic non-logical constant of T on the condition that the application term also occurs in an application axiom where it is related to an axiomatic term of T. By adding condition (7) to the previous ones we are able to prove the following theorem: The set of application axioms of Θ (the 'correspondence rules') = the set of C*-axioms of the applied-core language of Θ (the axiomatic language of T1), where C* = the application constants of Θ (the 'observation terms' occurring in axioms of Θ).

Ad (8) and (9). It has been emphasized by Stegmüller ((1969/70), II, 305) that the 'theoretical language' and the 'observation language' can be distinguished only within the total language of the theory (the latter also contains 'mixed sentences' in which both theoretical and observation terms occur, most notably the correspondence rules). Correspondingly, the total language L is introduced independently (8), and two abstract axiomatic theories are distinguished (1, 2) such that the axiomatic language of one is contained in the other (5), whose language in turn is contained in L (9).

Ad (10). As T1 is an AT1 by (2) and C a set of undefined non-logical, non-mathematical constants of T1 by (3), (4), and (10), it follows from (10) and the definition of "minus-C interpreted theory" that <T1, B> is a minus-C interpreted axiomatic theory. Moreover, there is a B1 and C1 such that <T, B1> is a minus-C1 INT: Let B1 be the triple obtained from B by keeping its first member, replacing the second by the empty relation, and the third member f by that subfunction of f whose domain does not contain any expressions with axiomatic constants of T1 that are not constants of T. Let C1 = the undefined non-logical and non-mathematical constants of T. <T, B1> is a minus-C1 interpreted axiomatic theory: The first two conditions of the definition of "C-INT" are met. For the third condition we argue as follows. There is a D, say D*, as in (10). Let C2 = the set of axiomatic primitives of T1 that are not primitives of T. Let <F, Δ> be the axiomatic language of T1, and <F1, Δ1> the one of T. D* ∈ Δ. <F, Δ> is a Pre-IFS; hence, by (10) and the definition of "Pre-IFS", <F, D*> is a minus-C IFS of logic (obviously, C = the set of axiomatic primitives of T ∪ the set of undefined basic non-logical, non-mathematical constants of T1). Hence, by Assumption (4) on C-IFSs, the domain of D* = the set of constants and closed terms and formulas of F-minus-C. Let D1 = D* - the C2F-part of D*. By (5) and the definition of "excluding" in § 2.3, D1 ∈ Δ1 and F1 = F-minus-C2. It now follows that B1 is a minus-C1 basis for <F-minus-C2, D1> (cf. the definition of "basis" in § 2.3). Hence, by the definition of "C-INT", <T, B1> is a minus-C1 interpreted axiomatic theory.

The axiomatic language of T1 is a Pre-IFS of logic (§ 2.4, definition of "AT1"); hence, it follows from (10), the definition of "Pre-IFS" (2b) and Assumption 1 on IFSs of logic (§ 2.3, p. 66) that there is exactly one D such that <F, D> is a C-IFS of logic and B a minus-C basis for <F, D>. Let us call this D the B-interpretation of T1, or, for short, int(B, T1) (remember that this is a function with expressions of T1 as arguments). All expressions of T1 that contain an axiomatic constant of T are not in the domain of int(B, T1): they are not 'interpreted' at all. This amounts to saying, in customary terms, that no 'theoretical term' and no expression in which it occurs receives an interpretation; only the 'observational terms' and corresponding terms and formulas do.
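The upshot of Ad (10) may be pictured thus (a schematic illustration of ours, with an invented occurrence test): int(B, T1) is simply undefined on every expression containing a theoretical term:

    def int_B_T1(D_star, theoretical_terms, contains):
        # Restrict the designation function: expressions containing an
        # axiomatic constant of T (a 'theoretical term') drop out.
        return {e: x for e, x in D_star.items()
                if not any(contains(t, e) for t in theoretical_terms)}

    # e.g. with contains = lambda t, e: t in e.split(), a sentence like
    # "mass ( a ) = 5" is uninterpreted if "mass" is theoretical.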


I am here taking the position that there is no 'partial interpretation' of theoretical terms by supplying 'correspondence rules': Those very rules remain uninterpreted if they contain theoretical terms.76 Moreover, there may be basic non-logical and non-mathematical constants of T1 (that are neither axiomatic constants of T1 nor of T). These, too, would not be in the domain of int(B, T1), nor would the expressions in which they occur. This view raises a number of problems which may throw into doubt our whole concept of applied axiomatic theory.
3.3. Uninterpreted constants in applied theories

We shall formulate and discuss four objections to condition (10) in the definition of "applied axiomatic theory", which are all concerned with the problems raised by allowing uninterpreted constants.
First objection to (10).

Leaving the axiomatic constants of T uninterpreted has the consequence that no sentence of the axiomatic language of T^ containing such constants is assigned a truth-value on the basis of int(B, Tx) (understood as in the preceding subsection); for this the sentence would have to be in the domain of int(B, Tj). Thus, no theorem or axiom of Tj with such constants can be said to be either true or false: It cannot be understood as a statement. I suggest the following solution for the special case that the axioms of T! are finite in number. Let T2 be any AT1 that is a part of Tj (in the sense of 2.4, p. 70; T2 may be identical with Tj). Consider the Ramsey sentence of T^77 Assuming that the axiomatic language of Tj has bindable variables for each nonlogical constant, the Ramsey sentence of T2 (RT2) is obviously a sentence of the axiomatic language of Tj excluding the set of basic non-logical and non-mathematical constants of Tx and axiomatic constants of T; thus, RT2 is in the domain of int(B, Ti) and is assigned a truth-value. Assuming, in particular, that the values of int(B, Tj) are classes etc. arid truth-values, the truth-value of RT2 = int(B, Tj) (RT2), /. e. a certain positive integer. For all sentences of the axiomatic language of Tx that are in the domain of int(B, Tt) we can define the concept of -truth in the usual way by means of truth-values. The concept of BTX-truth is then extended to theorems and axioms of T! as follows: A theorem or axiom of Tj is BTj-true if it is an axiom or theorem of a part of Tj whose Ramsey sentence is BT^true. The two definitions can be combined in the following way:
76 For a similar view on 'partial interpretation' cf. Kutschera (1972), I, 3.1f.
77 This is the (more accurately: a) sentence obtained by eliminating all defined non-logical, non-mathematical constants from the axioms of T₂; forming a non-repetitious conjunction of the axioms; replacing all non-logical, non-mathematical constants of T occurring in the conjunction by appropriate variables; and by preposing existential quantifiers which bind those variables. On the special role of the Ramsey sentence, cf. Carnap (1963), (1966), and the detailed discussion in Stegmüller (1969/70), II, 400-437.


Let e be any sentence of the axiomatic language of T₁. e is BT₁-true iff either (a) or (b): (a) e is in the domain of int(B, T₁), and int(B, T₁)(e) = 1; (b) e is an axiom or theorem of a part T₂ of T₁ such that RT₂ is BT₁-true. e is BT₁-false iff either (a) or (b): (a) e is in the domain of int(B, T₁), and e is not BT₁-true; (b) e is an axiom or theorem of T₁ and not BT₁-true. As all axioms and theorems of T are axioms and theorems of T₁, the two concepts apply in all important cases. Given the special status of the Ramsey sentence (which may be taken as representing the factual content of the theory) the above solution should be acceptable. True enough, it contains a number of limiting assumptions but these are easily removed except for the assumption of a finite number of axioms. That requirement may be dropped only if the concept of Ramsey sentence is generalized to at least sentence schemata (unless we allow infinitely long expressions in the axiomatic language of T₁). Assuming the above solution we may feel less reluctant to accept the following hypothesis: No sufficiently complex theory in the empirical sciences can be developed or reconstructed as a minus-C INT so that some 'theoretical terms' are not in C. This is analogous to what Stegmüller calls the Braithwaite-Ramsey Vermutung ((1969/70) II, Ch. IV, 5) and can be justified by similar arguments. It is this hypothesis which makes us choose APTs rather than INTs as a format for linguistic theories.78 Assuming that the first objection to (10) has been refuted there are still other properties of (10) that may seem objectionable.
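Before turning to the next objection, a toy case may make the Ramsey-sentence construction concrete (my illustration; the constants "M" and "O" are not from the text). Suppose a part T₂ of T₁ has exactly the axioms ∀x(Mx → Ox) and ∃x(Mx), where "M" is an axiomatic constant of T (a 'theoretical term') and "O" an application constant (an 'observation term'). Following fn. 77, we conjoin the axioms, replace "M" by a bindable predicate variable X, and prepose an existential quantifier:

$$ RT_2 \;=\; (\exists X)\,\bigl(\forall x\,(Xx \rightarrow Ox)\,\wedge\,\exists x\,Xx\bigr). $$

RT₂ contains no axiomatic constant of T, hence lies in the domain of int(B, T₁) and receives a truth-value; if it is BT₁-true, then by clause (b) so is, e.g., the axiom ∃x(Mx), although that axiom itself is not interpreted.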
Second objection to (10).

There may be non-logical constants of the total language L that are not constants of the axiomatic language of T₁ and thus remain uninterpreted. This is inadequate because it destroys the correspondence between the application language of the APT and a customary 'observation language'. Anybody who is convinced by this argument may simply replace "the axiomatic language of T₁" in (10) by "L". I decided on the weaker formulation for the following reason: An empirical theory will always be applied in a context where the understandability of certain terms and sentences is not questioned. These terms may be anything but 'observation terms' as conceived by an empiricist, and the sentences may be far from being Protokollsätze. Hence, they should not be covered by the theory's semantic basis. The application terms must receive a basic interpretation unless one wants to give up all hope of calling a sentence of the theory either true or false (no great sacrifice to an 'instrumentalist' philosopher of science); the only general requirements on the terms of the application language should be pragmatic ones such as, perhaps, understandability in a standard context of use.79
78 The real contrast, then, is between INTs and APTs and not between AT1s and AT2s as some formulations in Suppes (1957), Ch. 12, or Sneed (1971), Ch. 1, might suggest; on our account the AT1-AT2 dichotomy is not reproduced for realized theories, where we have both INTs and APTs.
Let me emphasize, though, that this semantic open-endedness of the application language is not meant as a loop-hole for terms from other theories to slip in: For dealing with such terms, non-logical constants other than axiomatic ones have been allowed into the applied-core language (their role will become apparent in the next subsection on theory integration). Further requirements on the application language of an empirical theory may still be in place, admitting that the concept of 'empirical' has become dubious after the Zusammenbruch der Signifikanzidee, i.e. the sceptical belief that there is no important concept of empirical significance to be explicated.80
Third objection to (10).

Leaving the axiomatic core constants uninterpreted has the consequence that no APT is a theory of something or, conversely, has a subject matter: It is by such constants that the subject matter of the theory would have to be denoted (e.g., "theory of English"). This is a serious objection that seems to have passed unnoticed in the literature. I see two possible ways of countering it. One is a recourse to pragmatics by understanding "theory of" as referring to a relation involving the theoretician; e.g. it might be proposed to understand "theory of" and "subject matter of" always in the sense of "intended theory of (subject matter of)". The term after "theory of" could then be characterized independently of the theory as to its reference. The second counter-argument, which I will here adopt, is a semantic one. Although the core constants of the theory are not in the domain of the basic interpretation of Θ, they are in the domain of the possible-value function f and are thus assigned a set-theoretical construct in the domains of Θ as their set of possible values. We may use this fact and propose definitions along the following lines. Let x be any set. ⟨T, B⟩ is a theory of x iff, for some C and n: ⟨T, B⟩ is a minus-C interpreted theory, and there is a non-repetitious n-tuple ⟨e₁, ..., eₙ⟩ and an n-tuple ⟨x₁, ..., xₙ⟩ = x such that {e₁, ..., eₙ} = the set of axiomatic primitives of T, and xᵢ ∈ f(eᵢ) if eᵢ ∈ C, and xᵢ = v(eᵢ) if eᵢ ∉ C, where f = the third member of B and v = the second member of B.81 Conversely, x is a subject matter (an object) of ⟨T, B⟩ if ⟨T, B⟩ is a theory of x. We generalize to an APT Θ as follows: Θ is an abstract theory of x if ⟨the core of Θ, B*⟩ is a theory of x, where B* = ⟨the domain sequence of Θ, the empty relation, that subfunction of the possible-value function of Θ whose domain does not contain any expressions with application constants of Θ⟩. Θ is an applied theory of x if ⟨the applied core of Θ, the semantic basis of Θ⟩ is a theory of x.
79 For a similar position, cf. Kutschera (1972) I, Ch. 3, 2. See also Lewis (1971) 428, who even includes all non-logical terms that are not axiomatic terms of T₁.
80 Cf. Stegmüller (1969/70) II, Ch. 5, 13.
81 This presupposes a finite number of axiomatic primitives. If we want to consider a non-finite number, we might choose an appropriate function instead of an n-tuple.
Conversely, x is an abstract subject matter (object) of Θ if Θ is an abstract theory of x, and a specific subject matter (object) of Θ if Θ is an applied theory of x.82 As the most striking consequence of the definitions we have an indeterminacy of subject matter except for completely interpreted theories.83 Hence, all APTs are indeterminate even for their specific subject matters. But this is quite different from not having a subject matter. Thus, the third objection is rejected if we do not accept the presupposed uniqueness of subject matters.84 The fourth and last objection is related to the third. Fourth objection to (10). Leaving the axiomatic core constants uninterpreted may have the consequence that no sufficiently complex APT in the empirical sciences is either a true theory or a false theory of something. This claim is indeed correct. We might propose to call an APT Θ a true theory of x if it is a theory of x and, for its applied core T₁ and its semantic basis B, the Ramsey sentence of T₁ is BT₁-true. But this would be inadequate because the Ramsey sentence is true in case there is some appropriate n-tuple, which might not be x. The only acceptable explication may be as follows: Θ is a true (abstract) theory of x if Θ is an (abstract) theory of x and there is a B₁ such that, for B = the semantic basis of Θ and T₁ = the applied core of Θ: The first and third members of B₁ = the first and third members of B; the second member of B ⊆ the second member of B₁; ⟨T₁, B₁⟩ is a completely INT; and the Ramsey sentence of T₁ is B₁T₁-true; if it is false, Θ is false. In discussing the first objection to (10) we already formulated a hypothesis for sufficiently complex empirical theories that would exclude the existence of a B₁ such that ⟨T₁, B₁⟩ is a completely INT. Hence, no such theory would be either true or false. I find this consequence acceptable. On our conception, a theory is not a sentence; in particular, it is not the conjunction of its axioms. Whereas "theory of" may be needed as a semantic concept, "true theory of" may well be replaced by pragmatic ones as far as the component of truth is concerned. Let me emphasize, though, that in this subsection we have been moving on notoriously unstable ground and should not feel certain to have gained a tenable position. We finally turn to the question of how different theories may be combined into new ones.

82 All these definitions generalize to interpreted and applied formalized theories.
83 The trivial multiplicity of subject matters due to different arrangements of the axiomatic primitives may be disregarded.
84 There may, of course, be exactly one intended subject matter; this is a relevant pragmatic feature of a given theory, see above.
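To fix ideas before moving on, here is a minimal sketch of the preceding definition of "theory of" (my example; the constants are merely illustrative). Let T have exactly the axiomatic primitives e₁ = "English" and e₂ = "Word", let ⟨T, B⟩ be a minus-C INT with C = {"English"}, and let v and f be the second and third members of B. Then ⟨T, B⟩ is a theory of every pair

$$ x = \langle x_1, x_2\rangle \quad\text{with}\quad x_1 \in f(\text{"English"}),\; x_2 = v(\text{"Word"}) $$

(disregarding rearrangements, fn. 83): an uninterpreted primitive contributes a whole range of candidate objects, its set of possible values, while an interpreted one contributes its fixed value. This is precisely the indeterminacy of subject matter noted above.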

3.4. Integration of theories: Conflation and use
The relations between different theories that have so far been studied in logic and the philosophy of science are mainly of two types: In logic, 'extensions' or 'subtheories' of a given theory have been considered; in the philosophy of science, problems of 'reducibility' of one theory to another have received special attention, due to the specific problems of the natural sciences.85 It is perhaps not surprising that there are important relations between linguistic theories which are not covered by studies of this kind. Take the following examples: (a) A psychological theory is used in a theory of language to derive theorems of that theory. (b) A theory of language is used in a grammar of an individual language to derive theorems on that language. (c) Axiomatic constants of a theory of language are combined with axiomatic constants of a grammar to form axiomatic terms of the grammar ("is a word in", "English", "is a word in English"). (d) The phonological theory of a language is combined with a morpho-syntactic theory of the same language to form a grammar (in the traditional narrow sense). Whereas a 'subtheory' approach could be considered for (d), it would be misleading to attempt either a subtheory or a reduction approach in cases (a) and (b): A theory of language is not 'reduced' to the psychological theory that it may invoke in the derivation of theorems, and is certainly no 'subtheory' of it; and the same holds for grammars and theories of language. Rather, we are confronted with a relation of 'presupposition'. In (c) the situation is still different; we may say that the grammar is (partly) formulated 'in terms of' the theory of language (which should also be presupposed). In (d) we may speak of 'conflation' of theories. Whereas the sentences of a presupposed theory should not appear among the sentences (axioms, theorems, and definitions) of the given theory, all sentences of the conflated theories should also be sentences of the conflation. In explicating the various concepts we must distinguish between abstract and realized theories. In the special case of formulation-in-terms-of it will turn out that an applied axiomatic theory formulated in terms of an abstract one provides a 'completion' for that theory, i.e. another applied theory 'based' on the abstract one. This result is of special importance because it suggests that a theory of language may be taken as an abstract axiomatic theory in terms of which certain applied theories, viz. grammars, are formulated, which then provide completions for the theory of language. For the following definitions we first consider AT1s and then generalize to APTs.86 Let T and T₁ be any AT1s and Θ and Θ₁ any APTs. We first define "conflation" as follows.
85 For a recent study of 'reducibility' and 'equivalence' of theories, cf. Sneed (1971) Ch. 7. Cf. also Kutschera (1972) II, Ch. 4, 6.
86 I have not attempted to include AT2s and INTs (which seems possible) because the linguistic theories to be proposed will not assume that format for reasons indicated above (3.3, first objection to (10)).
Grammars as theories: the case for axiomatic grammar (part I)

83

THE CONFLATION OF T AND T₁ = the ordered quadruple ⟨L, C, S, S₁⟩ such that: (1) L = the amalgamation of the axiomatic language of T and the axiomatic language of T₁.87 (2) C = the set of those axiomatic primitives of T which are not defined constants of T₁ ∪ the corresponding set for T₁.88 (3) S = the set of those axioms of T which are not definitions of the axiomatic language of T₁ ∪ the corresponding set for T₁. (4) S₁ = the definitions of T ∪ the definitions of T₁.

An AT1 T will be called compatible with an AT1 T₁ provided that: (1) The axiomatic language of T is logically compatible with the axiomatic language of T₁. (2) For every C, C₁, and e: If C is the set of axiomatic primitives of T and C₁ the set of axiomatic primitives of T₁ that are not C-defined terms of T₁, and e is a defined constant of T₁ and e ∈ C, then e is a C₁-defined term of T₁.89 (3) No axiomatic constant of T is a basic constant of T₁. (4) Some axioms of T (of T₁) are not definitions of the axiomatic language of T₁ (of T).

It is now possible to prove the following CONFLATION THEOREM, for any AT1s T and T₁: If T is compatible with T₁ and conversely, the conflation of T and T₁ is an AT1. The proof makes use of the assumption on amalgamations in 2.3 and the definition of "union" in 2.2, p. 62; it is fairly straightforward and will be omitted.

The concept of conflation may be generalized to include APTs: The conflation of T and Θ = the quadruple ⟨T₁, T₂, L, B⟩ such that: (1) T₁ = the conflation of T and the core of Θ. (2) T₂ = the conflation of T and the applied core of Θ. (3) L = the amalgamation of the axiomatic language of T and the total language of Θ. (4) B = the semantic basis of Θ. The conflation of Θ and Θ₁ = the quadruple ⟨T, T₁, L, B⟩ such that: (1) T = the conflation of the core of Θ and the core of Θ₁. (2) T₁ = the conflation of the applied core of Θ and the applied core of Θ₁. (3) L = the amalgamation of the total language of Θ and the total language of Θ₁. (4) B = the 'addition' of the semantic basis of Θ and the semantic basis of Θ₁, in the following sense: B is a triple whose second and third components are the union of, respectively, the second and third components of the semantic bases of Θ and Θ₁; the first member is the (n + m)-tuple obtained from the domain sequence of Θ (n = the number of its components) by adding the components of the domain sequence of Θ₁ that are not in the domain sequence of Θ (keeping sequential order). So far I have not established the precise conditions under which the conflation of Θ (T) and Θ₁ is again an APT but it seems quite likely (on the basis of informal considerations) that the Conflation Theorem can be extended in an appropriate way.
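For instance (a schematic toy case of my own, assuming T and T₁ to be compatible in the sense just defined): if T and T₁ are AT1s without definitions, with single axiomatic primitives P and Q respectively and axiom sets {A_P} and {A_Q}, then

$$ T = \langle L, \{P\}, \{A_P\}, \varnothing\rangle, \qquad T_1 = \langle L_1, \{Q\}, \{A_Q\}, \varnothing\rangle, $$
$$ \text{the conflation of } T \text{ and } T_1 = \langle L', \{P, Q\}, \{A_P, A_Q\}, \varnothing\rangle, $$

where L′ is the amalgamation of L and L₁; by the Conflation Theorem this quadruple is again an AT1.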
87 For "amalgamation", cf. 2.3, p. 68.
88 The same constant could be an axiomatic primitive in T and defined in T₁, or conversely.
89 This avoids circularity when axiomatic primitives of T are defined by means of axiomatic primitives of T₁.


Note that even in the case of AT1s the conflated theories may not be parts of the conflation as "part" was defined in 2.4 and 3.1: Some axiomatic primitives and axioms of the conflated theories may not be among the axiomatic primitives and axioms of the conflation. Distinguishing between "part" and "subtheory", we may define the latter term as follows: If T and T₁ are AT1s, T is a SUBTHEORY of T₁ if there is an AT1 T₂ such that T₁ = the conflation of T and T₂. This concept is generalized to APTs in an obvious way. The (homogeneous) subtheory relations are obviously reflexive. We now explicate the concepts connected with theory use. T is USED in T₁ (T₁ PRESUPPOSES T) iff: (1) The axiomatic language of T is logically compatible with the basic language of T₁.90 (2) Some valid non-logical sentences of the basic language of T₁ are valid (non-logical) sentences of the axiomatic language of T. (3) A valid sentence of the axiomatic language of T is a valid sentence of the axiomatic language of T₁ if and only if it is a valid sentence of the basic language of T₁. T is used completely in T₁ (T₁ presupposes all of T) if T is used in T₁ and all valid non-logical sentences of the axiomatic language of T are valid (non-logical) sentences of the basic language of T₁.91 T is used in Θ if T is used in the applied core of Θ; Θ is used in T if the applied core of Θ is used in T; and Θ is used in Θ₁ if the applied core of Θ is used in the applied core of Θ₁. "Used completely" is generalized in the same way. Formulating a theory in terms of another is explicated as follows. T IS FORMULATED IN TERMS OF T₁ iff: (1) The axiomatic language of T₁ is contained in the basic language of T.92 (2) The axiomatic language of T₁ contains a Pre-IFS of predicate logic. (3) No undefined term of the axiomatic language of T₁ is a defined term of the axiomatic language of T. (4) For each axiomatic primitive e of T there is an e₁, e₂, e₃ such that: (4a) e₁ is a closed non-logical term of T₁. (4b) e₁ is a predicate expression and e₁e₂ is an atomic formula of the axiomatic language of T, or e₁ is a functor expression and e₁e₂ is a term of the axiomatic language of T. (4c) e occurs in e₂. (4d) e₁ is an axiomatic term of T₁ or there is an axiomatic term of T₁ that occurs in e₂. (4e) e₃ is an axiom or definition of T. (4f) e₁e₂ occurs in e₃. If "each" in (4) is replaced by "some" we say that T is partly formulated in terms of T₁. If T is formulated in terms of T₁ and the predicate logic contained in T₁ (hence, in T) is sufficiently strong to contain an abstraction operator (like the λ-operator), then for any axiomatic primitive e of T there is an axiomatic term e₁ of T obtained as follows. Let e₂ and e₃ be expressions such as e₁ and e₂ in (4) of the above definition.
For "logically compatible", see 2.3, p. 67. Note that theory use does not require use of an axiom, theorem, or definition of T; a valid non-logical sentence of the axiomatic language of T must be involved but this may be a valid sentence of the basic language of T. (The Conflation Theorem allows only for theory use of this limited type: cf. (3) in the definition of "compatible".) It is only complete theory use that necessarily involves the sentences of T. 92 Because of (1), formulation-in-terms-of implies complete theory use.
91 90


Let e₄ be an expression obtained from e₃ by replacing all constants other than e by appropriate variables. We now define e₁ as follows: If e₄ = e, e₁ = e. Otherwise, e₁ is the closed term obtained from e₂e₄ by prefixing an abstraction operator that binds all the free variables in e₄. Depending on whether e₂e₄ is a formula or a term, e₁ is a predicate expression or a functor expression.93 If, in addition, there are no defined axiomatic constants of T, reliance on the non-logical terms of T₁ is particularly strong: T is strongly formulated in terms of T₁ if T is formulated in terms of T₁ and there are no definitions of T. The concept of formulation-in-terms-of is extended in analogy to the extension of use: T is formulated in terms of Θ if T is formulated in terms of the applied core of Θ and the axiomatic primitives of T are non-logical, non-mathematical constants of T; Θ is formulated in terms of T if the applied core of Θ is formulated in terms of T and the axiomatic primitives of T are non-logical, non-mathematical constants of T; and Θ is formulated in terms of Θ₁ if the applied core of Θ is formulated in terms of the applied core of Θ₁.94 If Θ is formulated in terms of T, the abstract theory T can be 'completed' into an APT by making appropriate use of Θ. This result is of great importance for the problem of relating grammars to a theory of language and will therefore be established in a more precise way. We first introduce the following concept, for any AT1 T and APT Θ. THE Θ-COMPLETION OF T = the quadruple ⟨T₁, T₂, L, B⟩ such that: (1) T₁ = the quadruple ⟨L₁, C, S, S₁⟩ such that: (a) L₁ = the core language of Θ. (b) C = the axiomatic primitives of T. (c) S = the C-axioms of L₁. (d) S₁ = the C-definitions of L₁. (2) T₂ = the quadruple ⟨L₁, C, S, S₁⟩ such that: (a) L₁ = the applied-core language of Θ. (b) C = the second member of T₁ ∪ the application constants of Θ. (c) S = the third member of T₁ ∪ the application axioms of Θ. (d) S₁ = the fourth member of T₁. (3) L = the total language of Θ. (4) B = the semantic basis of Θ. We now have the following COMPLETION THEOREM, for any APT Θ and AT1 T: If Θ is formulated in terms of T, the Θ-completion of T is an APT with the same application as Θ.95 The general outline of the proof is obvious: One has to check the various conditions in the definition of "applied axiomatic theory". As four rather complex definitions are involved (the other three being the definitions of "completion", "formulated in terms of", and "axiom system") the proof is complex and has to be omitted for lack of space.
93 To give a simple example: If T₁ is a theory of language and T a theory of English, e could be "English"; e₂ = "Word" ("is a word in", a two-place predicate); e₃ = e₄ = "(x, English)", with "x" a variable; e₁ = "(λx)(Word(x, English))" ("the class (or property) of all x such that x is a word in English").
94 The condition on T is to exclude the awkward case of a logical or mathematical theory being formulated in terms of a non-logical or non-mathematical one, and conversely.
95 For "application", cf. 3.1, p. 75.


"formulated in terms of" and "axiom system'*) the proof is complex and has to be omitted for lack of space. As one important feature of the proof we mention the following idea: Assume that Tj and T2 in the completion are indeed AT^s. Consider the set consisting of the axiomatic primitives of Tt and the undefined basic non-logical, non-mathematical constants of T2. Then consider the set consisting of the axiomatic primitives of the core of and the undefined basic non-logical, non-mathematical constants of the applied core of . Under the conditions of formulation-in-terms-of these two sets are identical, only the role of a given constant may change (the axiomatic primitives of T2 are undefined basic non-logical, non-mathematical constants of the applied core of ). Hence, the semantic basis of should serve as a semantic basis for the completion. The Completion Theorem enables us to extend the notions of "theory of" and "subject matter of" (3.3, pp. 80f) even to abstract axiomatic theories, relativized to applied theories that are formulated in terms of them. Let be an APT and an AT1. is a theory of relative to if is formulated in terms of and the -completion of is an abstract or applied theory of . is a subject matter of relative to S if T is a theory of relative to . Thus, there is a relative notion of subject matter that would apply to a theory of language even when taken as an abstract axiomatic theory.

4. Grammars as axiomatic theories

4.1. The program and its linguistic background

We shall make a weaker and a stronger proposal for grammar writing:

(2) Thesis A. A complete or partial linguistic grammar can and should be formulated as an applied axiomatic theory in the language of predicate logic or set theory.

(3) Thesis B. A complete or partial linguistic grammar can and should be formulated as an applied axiomatic theory in the language of predicate logic or set theory and in terms of a part or subtheory of a theory of language that presupposes a part or subtheory of a theory of communication.

Both proposals are meant for scientific grammars, not for pedagogic ones. They are to be understood within the framework of axiomatic theories developed in 2 and 3: For "applied axiomatic theory" and "formulated in the language of predicate logic or set theory", cf. 3.1, pp. 73 and 76; for "subtheory", 3.4, p. 84.96 Requiring an applied rather than an interpreted theory is motivated by considerations in 3.3 (cf. discussion of first objection to (10)).
96 Part-of and Subtheory-of are reflexive; hence, the two theses allow for formulation in terms of the complete theory. In (3) it is to be understood that a complete grammar would be formulated in terms of a theory of language whereas a partial grammar might require only a proper part or a subtheory.
Requiring formulation in predicate logic or set theory is anything but self-evident and will have to be justified (below, 4.3). We are left with the task of explaining the expressions "theory of communication", "theory of language", and "complete or partial linguistic grammar". This amounts to characterizing the linguistic background of our program now that the general framework for theories has been developed. In the present subsection I shall informally develop a general outline; the necessary detail will be filled in later in Part II. I draw a basic distinction, frequently obscured in modern linguistics, between two disciplines: the theory of language and the theory of linguistics; the latter has the theory of linguistic description as one of its branches.97 The theory of language develops individual theories on the properties of natural languages and their systems (see below); the theory of linguistic description develops, among other things, theories of grammars of natural languages and their varieties, such as dialects; any such theory is relative to a theory of language.98 At this point we may ask whether a theory of grammars is relative to a theory of language in the sense of presupposing such a theory. I would indeed take the position that "theory of grammars" is most fruitfully understood in a sense where any axiomatic theory of grammars presupposes a theory of language in the defined sense of "presupposes" (3.4). The present paper may go quite a long way to developing an axiomatic theory of axiomatic grammars but it certainly will not arrive at an axiomatic formulation; hence, relativity to a theory of language can only be taken as an informal analogon to presupposition in the technical sense. Let us say that the proposed theory of grammars assumes a theory of language, to stress the analogy. The same relation may hold between the proposed theory of grammars and the framework for axiomatic theories (which is not yet an axiomatic theory). I indeed believe that the proposed theory of grammars, if axiomatized, would presuppose an axiomatic theory of language and an axiomatic version of the framework for axiomatic theories, but I will not here pursue this idea any further. Turning to theories of language I shall assume that any theory of language contains exactly one axiomatic constant, to be called the distinguished term of the theory, that is 'meant to refer to languages'.99 Likewise, for any grammar I shall assume that it contains exactly one constant or n-tuple of constants, to be called the distinguished term of the grammar, that is 'meant to refer to the object of the grammar'. I further assume that an axiomatic theory of language is of limited fruitfulness if it neither is a part or subtheory of a theory of communication nor presupposes such a theory.100
97 There is at least one other branch, viz. the methodology of linguistics, dealing with the methods applied or applicable in any branch of linguistics.
98 For a more detailed account of this position, see Lieb (1970) 1.7, or Lieb (1968a) 1.
99 For the problem of explicating this phrase, cf. 3.3, third objection to (10).
Presupposition seems preferable to a subtheory approach insofar as one might wish to use other theories in a theory of language without granting special status to a theory of communication. Differently from Lieb (1970), I will here adopt the presupposition approach, to which the theory in Lieb (1970) is easily converted. The theory presented in Lieb (1970) is the most comprehensive axiomatic theory that presently exists in linguistics. At the time of working it out I had no clear picture of theory integration, which made me take my theory of language as a part of a theory of communication; also, I had not developed the conception of realized theories as outlined above, 3.1. I would now say that the theory was intended as an applied axiomatic theory, with the application characterized in general terms only. I shall here assume the theory as part of the linguistic background, restructured as follows: The theory is part of a theory of language, and may be taken as an abstract axiomatic theory if Thesis B is accepted. It presupposes a (part of a) theory of communication, which in turn presupposes a theory of physical time. Most of the theory in Lieb (1970) consists of the presupposed (part of a) theory of communication and can be characterized informally as follows. Consider a linguistic complex, i.e. either a complete natural language through time (an historical language) or a complete variety of an historical language, such as an historical dialect; or a stage or a period of a language or language variety. Any linguistic complex is a communication complex in the sense of Lieb (1970): a non-empty finite class of means of communication. A variety is a subclass of the language, and a stage or a period of a language or variety is a subclass of the latter. The elements of a linguistic complex are called linguistic means of communication. A linguistic means of communication (corresponding, roughly, to a homogeneous idiolect in a synchronic sense) is a class of abstract texts. Each text is a pair of a 'form' and a 'meaning', both abstract if compared to actual utterances and their meanings. More specifically, an appropriate class of texts is a means of communication for somebody during a certain time. It is actually 'used' by the speaker, i.e. certain texts are 'realized' by meaningful utterances of the speaker. Given the class, the speaker, and the interval, there is a system of the means of communication that uniquely determines the means and is, during the interval, at the speaker's disposal. Given a stage of a language or language variety, there is a system for the stage that is an entity of the same kind as the systems of the elements of the stage and is related to them by a relation of abstraction. More specifically, there is a finite sequence of such systems, each more abstract than the preceding one. When a certain degree of abstraction is reached, a system for a stage may also be a system for other stages, and eventually a system for the complete language or variety. It is assumed that there is such a system for any complete natural language or variety.
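In outline (my compression of the informal description just given, not a formula of the theory itself): texts t, linguistic means of communication M, and linguistic complexes D are related as

$$ t = \langle \text{form}, \text{meaning}\rangle, \qquad M = \{t_1, t_2, \ldots\}, \qquad D = \{M_1, \ldots, M_k\},\; k \geq 1, $$

with varieties, stages, and periods appearing as subclasses of the corresponding language or variety.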

100 Cf. Lieb (1970) 1.3, for arguments.

Allowing for certain qualifications to be mentioned below (4.2) the theory of Lieb (1970) will be assumed in the present essay in an extended form that incorporates additional assumptions along the following lines (these assumptions would not be expressed in the presupposed theory of communication but by sentences, i.e. axioms, theorems, or definitions, of the theory of language). Any system of a linguistic means of communication contains the phonological, the morpho-syntactic, the semantic, and (possibly) the pragmatic subsystem; the phonological system may be replaced by an analogous one, say, a graphemic system. More specifically, the system is the triple or quadruple of its three or four subsystems (ordered as above), or rather, the corresponding unit class.101 Within these subsystems, minor systems may be distinguished, to be called parts of the subsystems. The phonological and morpho-syntactic subsystems are the syntactical subsystems of the complete system. A number of additional assumptions will be formulated in Part II of the present paper. A system for a linguistic complex is structured in the same way as a system of a linguistic means of communication. On the basis of this sketch, at least the following subtheories of the theory of language may be assumed: a theory of systems (to be further subdivided into a phonological, morpho-syntactic, semantic, and (possibly) pragmatic theory) dealing with the structure of systems (for linguistic complexes or of their elements), and a realization theory dealing with the relations between systems and spatio-temporal entities. This subtheory may perhaps be further subdivided into at least a phonetic theory and a speech act theory, depending on which spatio-temporal entities are primarily involved. This division will be tentatively accepted for the rest of the present essay. The theory of communication used in the theory of language provides most of the basic non-logical constants of the theory and its subtheories: "... is a means of communication (MC) for ... during ...", "... is a system of ..." etc. I will now assume a theory of grammar distinguishing complete and partial linguistic grammars as follows: The intended subject matter102 of a complete linguistic grammar is either a linguistic complex and a system for the complex, or a linguistic means of communication and a system of the means. The intended subject matter of a partial linguistic grammar is either a linguistic complex and a (part of a) subsystem of a system for the complex, or is a linguistic means of communication and a (part of a) subsystem of a system of the means. If the subject matter involves a means of communication, we shall speak of a (complete or partial) idiolect grammar.
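In an obvious notation (mine, not the theory's; cf. fn. 101 for the choice of the unit class): a system S of a linguistic means of communication has the form

$$ S = \{\langle S_{\mathrm{phon}}, S_{\mathrm{morph}}, S_{\mathrm{sem}}\rangle\} \quad\text{or}\quad S = \{\langle S_{\mathrm{phon}}, S_{\mathrm{morph}}, S_{\mathrm{sem}}, S_{\mathrm{prag}}\rangle\}, $$

where the phonological component may be replaced by, say, a graphemic subsystem.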
101 The unit class is chosen for logical reasons explained in the next subsection. In the extended theory, systems lose their status as individuals which they had in Lieb (1970); this changes the logical type of certain variables and constants in my earlier theory.
102 "Intended" shows that we are tacitly assuming a pragmatic relation. I will not go into the relevant problems. They are analogous to the ones discussed for theories in 3.3, third objection to (10).


The expression "is a grammar of will be construed as follows: A complete grammar of a linguistic complex and a system is any linguistic grammar whose intended subject matter is the complex and the system; a partial grammar of a complex and a system is a linguistic grammar whose intended subject matter is the complex and a (part of a) subsystem of the system; analogously, for linguistic means of communication.103 Accepting either Thesis A or Thesis B substantially enriches the theory of grammar as should be obvious from the framework for axiomatic theories developed in 2 and 3. In particular, we may introduce the following assumptions, again stated informally: A grammar of a linguistic complex and a system is an abstract theory o/104 a couple consisting of the complex and the system; correspondingly, for a linguistic means of communication. The two assumptions may have to be weakened by choosing an -tuple that includes other entities besides the ones just mentioned. As the grammar is an applied axiomatic theory, it is an applied theory of an n-tuple that includes the complex (the means of communication), the system, and entities denoted by the application terms of the theory. In the case of an idiolect grammar a speaker and a time interval may be assumed among those entities. If Thesis B is acceptable we further make the following assumptions, for any grammar formulated in terms of the presupposed theory of language: A grammar of a language contains a theorem of the form c 6 C,105 where C is the distinguished term of the theory of language and c the distinguished term of the grammar; a grammar of a language variety (that is not itself a language) contains a theorem of the form: For some cls ct e C and ccq; a grammar of a linguistic means of communication contains a theorem of the form: For some q, G! 6 C and c ct. Each sentence will be called the distinguished sentence of the corresponding grammar.106 Our main task will consist in demonstrating that the proposals (2) and (3) are feasible, given a theory of language as outlined above. The whole of Part II will be devoted to that demonstration. Throughout, we concentrate on the stronger Thesis B although most of our arguments could be reformulated so as to support at least the weaker proposal. The linguistic background of our program has been characterized in fairly general terms only. In the next subsection I shall be more specific about the logical properties of the theory of language that will be assumed. (The next two subsections are of a more technical nature; the reader interested in the advantages and the history of our approach may turn to 4.4 immediately.)
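In symbols (my transcription of the three forms, with "c" and "C" the distinguished terms as above):

$$ c \in C, \qquad (\exists c_1)(c_1 \in C \wedge c \subseteq c_1), \qquad (\exists c_1)(c_1 \in C \wedge c \in c_1) $$

for a grammar of a language, of a language variety, and of a linguistic means of communication, respectively.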
103 Further explication would, of course, require an explication of "linguistic grammar".
104 For "abstract theory of" and "applied theory of" below, cf. 3.3, p. 80.
105 Or an analogous form, depending on the system of logic involved.
106 The concept of distinguished sentence could be generalized by relativizing it to arbitrary theories of language. This is important in connection with language universals (Lieb to appear, 6.2ff) but may be neglected.

4.2. Questions of logic

Using the theory in Lieb (1970) for our present purposes is more problematic than our previous account might suggest, due to the language in which the theory is formulated. The applied-core language (and, in principle, the total language) of the theory of language in Lieb (1970) is a couple ⟨F, Δ⟩ such that there is a pre-interpreted formalized system ⟨F₁, Δ₁⟩ of higher-order predicate logic of the following kind: The underlying logic of ⟨F₁, Δ₁⟩ is a four-sorted version of Carnap's Language C (Carnap (1958)), i.e. a version with four types of individual expressions; it is a system assuming the 'simple theory of types' with a finite number of levels. ⟨F₁, Δ₁⟩ contains the axiomatic language of an abstract axiomatic theory of the real numbers. Δ₁ is the set of all D such that: The domain of D = the set of constants, closed terms, and closed formulas of F₁-minus-C*, where C* = the set of undefined non-logical, non-mathematical constants of F₁; and D satisfies the rules of value assignment and evaluation (appropriately modified) for Carnap's Language B ((1958) 25).107 F is a German reading of F₁.108 For some rendering r of F₁ as F, Δ is the set of r-forms of elements of Δ₁. Thus, by the theorem on r-forms (2.3, p. 67), ⟨F, Δ⟩ is a Pre-IFS of predicate logic of the same kind as ⟨F₁, Δ₁⟩.

Actually, the preceding account is not quite correct in the following respects: F₁ and a certain rendering of F₁ as F were characterized in general terms only, and the same holds for the application of the theory of language (its application axioms, semantic basis, and application language). If (3) is acceptable, the theory of language does not have to be an applied theory at all. According to 4.1 we take an extension of the theory in Lieb (1970), characterized informally, as the theory of language to be assumed for our account of grammars and for possible use in the grammars themselves. To avoid confusion due to different natural languages (the present essay is in English), we modify this decision as follows: First, we consider not the theory in Lieb (1970) but a symbolic counterpart of the theory obtained by replacing the theory's language ⟨F, Δ⟩ by a version of Carnap's Language C (as just indicated) and replacing the axioms, definitions etc. accordingly. We might then take the theory of language as an extension of the counterpart; i.e. take the latter as a part of the former. This would be problematic on a number of accounts. First, individuals of certain types (systems, texts) must be treated as non-individuals in the new theory of language, which therefore would be an extension not of the counterpart but only of a modification of the counterpart. Second, whereas ⟨F₁, Δ₁⟩ is an extensional Pre-IFS of logic (cf. 2.3, p. 67), this may not be true of the axiomatic (the total) language of the new theory (for reasons to be discussed in the next subsection).
107 It has to be assumed that ⟨F₁, Δ₁⟩ is a Pre-IFS of logic; no proof is possible because the term "minus-C interpreted formalized system of logic" was not defined. (Cf. the definition of "Pre-IFS" in 2.3.)
108 Possibly with the qualifications noted in 2.2, fn. 49.
Thirdly, we may not be able to restrict ourselves to a finite number of levels: In the semantic part of the theory of language, axiomatic constants of transfinite levels may be required.109 Finally, there are minor problems with versions of Carnap's Language C, in particular: They do not contain constants or variables whose values are ordered n-tuples; such constants and variables may be needed. For the theory in Lieb (1970) we therefore consider a symbolic counterpart as above, with qualifications as follows: The (total or axiomatic) language of the counterpart is either an extensional Pre-IFS obtained from an appropriate version of Carnap's Language C by making the following changes: (a) modifications as required by restructuring the domains of individuals; (b) minor changes of symbolism; or it is a non-extensional Pre-IFS that is a modification of an appropriate version of Carnap's Language C and also incorporates the previous changes (a) and (b). The theory of language to be assumed for the present essay will then be an extension of this counterpart, and may be so even with respect to logic by containing variables and constants of transfinite levels. Ordered n-tuples could be assumed as introduced by a variant of the Wiener-Kuratowski method. We shall, however, do without them and instead take the unit classes of n-tuples, which are available in Carnap's Language C. The 'minor changes of symbolism' will be pointed out at their first occurrence. They include use of "∈" as an auxiliary symbol (not as a constant); thus, "R⟨x, y⟩" and "⟨x, y⟩ ∈ R" are treated as notational variants of each other. The assumed theory of language will not be developed formally. In informal discussion I may continue to use expressions of an English language reading (obtainable on the basis of Lieb (1968a) or by translating from Lieb (1970)), speaking of classes, elements etc. (cf. Carnap (1958) 28c). I do not wish to maintain that every theory of language should have the logical properties just indicated. On the contrary, if extensional systems are sufficient for formulating a theory of language we may try to avoid complexities such as transfinite levels by formulating theories of language in the language of set theory. This would also solve a problem in our informal version of a theory of grammars: The framework for axiomatic theories in 2 and 3 was formulated by using set-theoretical means of expression. Assuming that a formal version of the theory of grammars presupposes both a theory based on our framework for axiomatic theories and the theory of language, an awkward situation develops if the language of set theory is used in one but not the other theory. It will soon become apparent, though, that a general restriction to the language of set theory may be problematic.
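For reference, the construction alluded to here is the standard set-theoretical definition of ordered tuples (a textbook fact, not part of Lieb's system; Kuratowski's version shown):

$$ \langle x, y\rangle = \{\{x\}, \{x, y\}\}, \qquad \langle x_1, \ldots, x_{n+1}\rangle = \langle\langle x_1, \ldots, x_n\rangle, x_{n+1}\rangle; $$

the theory assumed here works instead with the unit classes {⟨x₁, ..., xₙ⟩} of such tuples, since these are available in Carnap's Language C.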

109 For the reasons, cf. already Carnap (1942) 12.


4.3. Problems of feasibility

In this subsection we consider various objections that question the general feasibility of our program. Both Thesis A and B contain the requirement that a grammar should be formulated in the language of predicate logic or set theory (in the sense of 3.1, p. 76). Although we did not specify the conditions that would make a Pre-IFS of logic a Pre-IFS of predicate logic, the requirement may seem too strong if we take "predicate logic" to imply extensionality of the system, as we certainly would in the case of "set theory". Thus, we arrive at the following Objection 1. The program is doomed because the total language of a grammar cannot be extensional. We may argue as follows. According to 4.1, a complete linguistic grammar will contain not only a semantic part but may even contain a pragmatic one, and its application will establish a relation to actual language use. In the semantic part the grammar will have to deal with meanings of (abstract) expressions and in its application, possibly also in its pragmatic part, with attitudes of speakers. For either task a non-extensional language is required. If Thesis B is adopted we may even strengthen this argument as follows: The axiomatic language of the theory of language in terms of which a complete grammar is formulated is contained in the grammar's total language. In 4.1 we assumed two subtheories of a theory of language, a systems theory containing a semantic and (possibly) a pragmatic part, and a realization theory containing a speech act part. Developing these three parts requires a non-extensional language for the same reasons as given before in the case of grammars. Hence, the total language of the grammar cannot be extensional. I would immediately subscribe to these arguments except for the crucial assumption: Non-extensional languages are required for dealing with meanings of abstract expressions (i.e. not of concrete utterances or speech acts) and for attitudes of speakers.110 First, consider meanings. The first explicit discussion with which I am acquainted is Carnap (1956) 38. His result, though inconclusive, supports the hypothesis that an extensional language may be sufficient for the semantic description of interpreted formalized systems of logic.111 Recent developments in logical semantics ('possible-worlds semantics') also seem to support that hypothesis: Intensional entities such as properties are all conceived as set-theoretical constructs, and the semantic metalanguage contains only sentences that could be sentences of the axiomatic language of an abstract set theory, including constants for specific sets.
110 Note that such attitudes are also involved in the meanings of utterances, which are intended meanings (or understood meanings).
111 "On the basis of these considerations, I am inclined to believe that it is possible to give a complete semantical description even of an intensional language system like S2 in an extensional meta-language like Me. However, this problem requires further investigation." (1956, 172).
However, I have been unable to ascertain whether this is sufficient for extensionality in a Carnapian sense.112 Next, consider attitudes, in particular, so-called propositional attitudes like knowing, believing, meaning, intending etc. It is apparent from the work of Irena Bellert that these concepts may occupy an important place in grammars, and fairly obvious from the work of Bellert and other authors that they should occupy such a place in a theory of language.113 We need non-logical constants such as "believes", "intends" in a grammar and a theory of language; it may be left open whether these are axiomatic constants of a theory of language or of a psychological theory used in the theory of language.114 The question, then, is whether an axiomatic theory of the relevant propositional attitudes (that might be part of a theory of language or of actual grammars) can be formulated in an extensional language. Relevant discussion is found mainly in logical semantics and the philosophy of language, where an extensive literature on propositional attitudes has accrued, centering to a large extent around the concept of belief. Mainly since Carnap (1956) (i.e. 1947) it has been widely debated whether belief should be taken as a relation between a person and a sentence or a person and a proposition, or still something else.115 A decision in this matter is immediately relevant for our problem: According to Carnap, an extensional language may suffice if sentences are chosen (1954; (1956) 232); Rescher, who advocates propositions, reaches the conclusion that "an adequate logical theory of belief statements is not, it would seem, to be had unless modal concepts be presupposed as an available tool for its development" ((1968) 53). Again, I am not certain about the extensionality of the language if a possible-worlds semantics is chosen for treating propositional attitudes (cf., e.g., Hintikka (1969)).
112 Carnap (1956) 168f considers one of his metalanguages as non-extensional simply because it contains sentences which say what the intension of a certain expression is; in his view, any such sentence is non-extensional. This would seem to answer our question also for the more recent developments. But Carnap's supporting example seems to be wrong; he apparently confuses equivalence in the metalanguage with equivalence in the object language. He also presupposes that an expression such as "the property Human" would be a predicate expression of the metalanguage because "Human" is; in a semantics taking properties as particular sets and sets as individuals, such an expression would be an individual expression.
113 Cf. Bellert (1972a), (1972b) (to be discussed in 4.5). For the role of propositional attitudes for a speech act theory (as a subtheory of a theory of language), cf. Searle (1969) as a classic reference.
114 It may be argued that such constants should be taken as logical operators; this would affect the form not the substance of our argument. In any case, we are not concerned with actual 'verbs of believing' etc. in natural languages.
115 The question is taken up from a linguist's point of view in Partee (1973). Cf. also Moravcsik (1973) for a survey of part of the relevant literature.


Given this situation, I am taking the course already hinted at in the preceding paragraph: I explicitly allow as Pre-IFSs of predicate logic systems that are non-extensional. If necessary, I would be prepared to give up the relevant parts of Theses A and B: It is not essential to the program that the underlying logic should have a particularly simple structure. There is a second objection of a more fundamental nature: Objection 2. The program is problematic because it naively assumes axiomatizability. This objection loses part of its force because we did not require that the (axiomatic or total) language of an axiomatic theory should be axiomatizable (in the sense of 2.2, p. 61). Given a formal theory, abstract, interpreted, or applied, axiomatizability of the theory involves only a subset of the set of valid sentences of the theory's language (cf. 2.4, p. 70, 3.1, p. 75). To make sense of the objection within our framework we have to assume that an 'ordinary' grammar can be represented by a formalized theory of some sort. The question then is: Are those theories axiomatizable in one of the previously introduced senses? For representation as a formalized theory we have to decide on which set of sentences of the theory's language should be the third member of the theory. Objection 2 may already break down because no such set of sentences can be delimited unless the resulting theory is axiomatizable. In the foundations of mathematics it was possible to find non-axiomatizable formalized systems because in some cases the set of valid sentences could be defined semantically as true under such and such an interpretation; it then turned out that the system with those valid sentences was non-axiomatizable. In empirical theories we have the problem of uninterpreted constants that complicates the concept of truth. I have been unable to extend to formalized theories even the relativized concept of truth introduced above for realized axiomatic theories (3.3, third objection to (10)). One might think of characterizing a set of valid sentences 'pragmatically' but this should always lead to finite sets unless a semantic or syntactical characterization is included. But suppose we have a concept of truth that applies to interpreted and applied formalized theories, and Θ is an applied formalized theory and is a theory of a linguistic means of communication.116 It may be natural to require that every true sentence of the form: s is a text of C, is an element of the third component of Θ (s and C are constants and C is 'meant to refer to the intended subject matter of the theory'). Θ will be axiomatizable only if the set of all those sentences is recursively enumerable. Assuming that it is corresponds to the one most basic assumption of generative grammar (at least, if "text of C" is understood as "sentence of C"). To my knowledge, only Henry Hiz has argued against that assumption in favor of a weaker one (Hiz (1968) 248f). Suppose the set of those sentences is not recursively enumerable. Then we may either give up the requirement

96

Hans-Heinrich Lieb

meftt that it should be a subset of the third component of , which removes the obstacle to axiomatizability; or we may attempt to find a significant subset (of the set of sentences) that is recursively enumerable; that is, we weaken our claim of axiomatizability to a part of . An axiomatization of that part would then be an incomplete axiomatization' of , in an obvious sense. I conclude that the assumption of axiomatizability is at present defensible; in the worst case our program would have to be relativized to 'reduced grammars', /. e. to theories that are 'incomplete axiomatizations' of corresponding formalized theories. One might also consider re-formulating it for formalized instead of axiomatic theories. It would then considerably lose in interest: There are additional semantic problems with formalized theories; the results on theory integration of 3.4 do not seem to generalize to formalized theories in a non-trivial way;117 and the systematizing effect of axiomatic theories would be lost. Objection 3. The program is misguided: It is based on a conception of theories that derives from and aims at theory construction in the natural sciences; this conception is [may be] inadequate, or partially inadequate, for the humanities to which linguistics belongs. I would subscribe to the premises of the objection in their non-dogmatic version ("may be" instead of "is"), taking as the conception of theories the ordinary one in the philosophy of science. I would not, however, accept the conclusion, for the following reasons: (a) The very attempt to apply a certain conception may be the best means to judge its adequacy in a certain field; the framework for axiomatic theories in 2 and 3 already contains a considerable number of modifications and extensions of the traditional conception that are motivated by the needs of theory construction in linguistics; (b) I would indeed submit that the conception of theories as developed in the philosophy of science, whatever its historic origin, is adequate to a significant degree for any empirical discipline, (hence, for linguistics) although there may be no single discipline yet for which it is completely adequate.118 Objection 4. The program is lop-sided: It does not take into account the non-deductive aspects of grammars and grammar writing. This objection is connected with an important complex of questions on which I can comment only briefly: If the objection is understood as establishing a contrast between deductive theories and theories allowing for probability statements it is clearly mistaken since theories of probability can also be formalized or even axiomatized. If, however, the objection refers to the role of inductive reasoning in theory construction, then it has to be admitted that the program makes no reference to it. But this is only natural: Induction plays an important part in the
117 Note that the decisive definitions all contain reference to axioms.
118 There are good reasons for this hypothesis, and so far I have not seen any argument in favour of the singular status of the humanities that could not be refuted or reduced to dogma.
Indeed, it is one of the advantages of the proposed format that questions of heuristics and confirmation can be formulated in a well-established framework.

4.4. Advantages of axiomatic grammar writing

Fraenkel et al. (1973, 323) make the following remark on formalization of theories (representation by formalized systems):
There are very many mathematicians, and even more so other scientists, who doubt it very much whether mathematical (and other) theories should be formalized even if they can be so in principle, suspecting that the fruits of formalization are not worth the effort.

This applies directly to our program of axiomatization in linguistics, which needs to be justified. There are at least four reasons which recommend an axiomatic format for grammars as stipulated in 4.1. Although those reasons carry weight independently of any disadvantages that may be connected with generative grammars, I will include some comparison with generative grammars in the following discussion.

First advantage: The language of a grammar is a formalized system of logic.

This is true for the axiomatic language (or total language) of an axiomatic theory even if a restriction to the language of predicate logic or set theory is not accepted. The system may be a natural language reading of a system of logic, which then provides the possibility of symbolization. Thus, the linguist does not have to bother with formally specifying the language he is going to use: This task has been solved for him by the logician. All that is needed is appropriate introduction of the non-logical components: the non-logical constants and the axioms and definitions that involve such constants.

Second advantage: The metalinguistic approach to the theory of language is avoided.

One of the basic ideas of generative grammar can be summarized as follows: Language should be studied by specifying the form of generative grammars that characterize natural languages; more specifically, the generative grammars used by the linguist should eventually be formally constrained in such a way that they characterize precisely the natural languages. Insofar as the symbolism used in a generative grammar can be compared to a constructed language, this approach may be called metalinguistic because it directly studies the formal properties of grammars in order to indirectly arrive at properties of natural languages ("Every natural language has the property of requiring generative grammars with the following formal properties: ..."). This is at best a very clumsy procedure for developing a theory of language; at worst it is a permanent source of

pseudo-problems and ad hoc decisions on matters of principle.119 The axioms and theorems of a grammar as stipulated in 4.1 are sentences of a language which is interpreted to a degree where the axioms and theorems can be understood as assertions on the intended subject matter of the grammar. A theory of language does not deal with the linguistic properties of grammars but with arbitrary natural languages. Grammars and theories of language are related by the concepts of theory integration as developed in 3.4.
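To make the first two advantages concrete, here is a minimal sketch of what an axiom of an axiomatic grammar might look like; the predicates Sent_L, Verb_L and Occ are hypothetical illustrations introduced only for this sketch, not constants actually proposed in this article:

```latex
% 'Every sentence of L contains an occurrence of a verb of L', symbolized in
% the (pre-interpreted) language of predicate logic; the non-logical
% constants are supplied by the grammar, the logical apparatus by the logician.
\forall x\,\bigl(\mathrm{Sent}_L(x) \rightarrow \exists y\,(\mathrm{Verb}_L(y)
  \wedge \mathrm{Occ}(y, x))\bigr)
```

An axiom of this form is a statement about the language L itself, not about strings of symbols of a calculus; that is precisely the contrast with the metalinguistic approach criticized above.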
Third advantage: The discussion of theories in the philosophy of science can be made to bear directly on grammars.

This claim is obviously correct because of our paragraphs 2 and 3, which incorporate and develop many important features of the conception of theories found in the philosophy of science. Because our framework for axiomatic theories is sufficiently similar to that conception, the latter is relevant in still other respects. Generally, the following points can be made.

(a) Heuristics and explication. It is well-known that a theory must be distinguished from the ways of arriving at a theory, which in turn can be studied systematically. Developing the heuristics of grammar writing is simplified when it can be seen as a special case of characterizing theory construction in the empirical sciences. In particular, the relation between an informal scientific grammar and a corresponding grammar that has been formulated as an applied axiomatic theory can be understood in terms of a set of simultaneous 'explications'.

(b) Interpretation. Transformational generative grammars, or algorithmic grammars in general, run afoul of the problem of 'making sense of them'. The two existing attempts to solve this problem directly or indirectly make use of axiomatic theories. Thus, solving the interpretation problem for generative grammars seems to require a solution of the same problem for axiomatic theories. In 3, such a solution was attempted, based on the relevant discussion in the philosophy of science.
119 As an example of a pseudo-problem or set of pseudo-problems, take the question of how to develop an 'evaluation measure' for grammars in terms of formal properties of grammars. As a general requirement, evaluation is to correspond to how closely a grammar is in agreement with certain psycholinguistic assumptions, especially on language learning. Normally, this would be seen as a problem of relating sentences of the grammar to sentences of a psycholinguistic theory, regardless of the form of the sentences involved. From this point of view, the problem of finding corresponding formal properties of the grammar is spurious. For an ad hoc decision on a matter of principle, see the way linguistics is claimed for psychology by Chomsky (cf. Lieb (1970), Ch. 10).
120 In his somewhat mistitled book, Botha (1968) made an ambitious attempt to apply to transformational generative grammars the concepts which have been developed for theories in the philosophy of science. Unfortunately, he failed to demonstrate that such grammars are theories in the required sense. Even if some of his arguments can be saved in the light of Wang's work or my own (cf. above, 1.5f), this does not hold for his elaborate discussion of 'mentalism', 'competence' and related matters, which therefore is largely empty.

(c) Confirmation. The involved discussion of what constitutes the 'data' for a grammar and how the grammar is related to the 'data' could greatly profit from placing it into the general methodological framework for theories: Assuming that grammars are applied axiomatic theories, the discussion could be reformulated as argumentation concerning (a) the heuristics of grammar writing; (b) the interpretation of the total language and the relation between core, applied core, and application of the grammar; (c) the confirmation of grammars.

(d) Explanation. Wang's recent attempts to apply the Hempel-Oppenheim schema for nomological-deductive explanation of facts to generative grammars presuppose their conversion into axiomatic theories (Wang 1972a). The value of Wang's proposals is doubtful.121 Still, if a grammar is written as an applied axiomatic theory it should be possible to give a more adequate account of what Chomsky tried to cover by his distinction between observational, descriptive, and explanatory adequacy. The first kind of adequacy concerns questions of heuristics and confirmation. The second concerns the problem of explaining facts of language use by relating them to what is stated (a) in the grammar and (b) in a psycholinguistic theory that indicates how the system described by the grammar is represented in the speaker. The third kind of adequacy concerns the problem of explaining what is stated in the grammar by relating it to what is stated in a theory of language (or language acquisition). The last two problems may perhaps be reconstructed as explanation of facts vs. explanation of laws. In any case, these and other concepts of explanation now apply directly to grammars.

(e) Evaluation. An applied axiomatic theory can be evaluated by such criteria as formal simplicity, organizing power, predictive power, and 'placeability', i.e. the readiness with which it is linked to other theories in the same or related fields. At present, such evaluations can hardly be more than informal estimates according to certain principles; as such, they are quite important. They may be sufficient, too, for applied axiomatic theories whose application axioms are sufficiently well understood, since any such theory has a semantic basis and a pre-interpreted formalized system of logic as its total language. I suggest that problems of grammar evaluation came to occupy a prominent place in generative transformational grammar mainly because the grammars remained semantically unspecified.
Fourth advantage: Grammars can be combined with grammars and other axiomatic theories by theory use or conflation (3.4).
121 Wang considers atomic sentences of the axiomatic grammar that assign lexemes to lexical categories as statements of facts, and universal sentences on syntactic categories as laws. Sentences of the two types are then taken as the explanans of an explanation whose explanandum is a theorem derived from the explanans to the effect that such and such is a sentence of the language. But this is a way to have an abstract system explain itself. In my view, the facts to be partially explained by invoking sentences of the grammar are facts of the actual use of the system; for this, pointing out deductive relationships between sentences of the grammar is at best a first step.

This point is important for (a) practical and (b) theoretical reasons.

(a) Practical. Even idiolect grammars will hardly ever be complete. Grammars whose subject matters are a given linguistic means of communication and different subsystems of the means may be integrated into a more comprehensive grammar by theory conflation, if they have the proposed format. Corresponding statements hold for grammars of linguistic complexes.

(b) Theoretical. It is of considerable theoretical interest to have a unified conception for combining partial grammars into more comprehensive ones; for formally relating grammars of idiolects, language varieties, and languages to each other and to a theory of language; and for making a theory of communication or physiological, psychological, and sociological theories available in a theory of language. Take, in particular, the relation between a grammar and a theory of language. Chomsky remarks ((1965) 6):
It is only when supplemented by a universal grammar that the grammar of a language provides a full account of the speaker-hearer's competence.

There is no hint how to understand "supplemented". It is naively taken for granted that deductive relationships may be established ((1965) 46):
Whenever this is done [sc. abstracting a statement or generalization from a particular grammar and attributing it to the general theory of linguistic structure], an assertion about a particular language is replaced by a corresponding assertion, from which the first follows, about language in general.

This is wishful thinking, given the nature of generative grammars (cf. 1). On the other hand, analogous statements would have a precise meaning on our conception of grammars, where "supplemented by", for instance, could be interpreted as "formulated in terms of". It is in connection with the fourth advantage that the desirability of accepting Thesis B shows most clearly. If a grammar is formulated in terms of a theory of language, the following conditions are satisfied: (a) No sentence of the theory of language is a sentence of the grammar. (b) A sentence derivable from sentences of the grammar and sentences of the theory of language is a sentence of the grammar. (c) A sentence derivable from sentences of the grammar and sentences of the theory of communication that is used in the theory of language (or of any other theory so used) is a sentence of the grammar. (d) Different grammars are comparable if they are formulated in terms of the same theory of language.122 (e) The interpretation problem for theories of language is solved in an elegant way: It is the very grammars formulated in terms of the theory of language that provide applications for that theory, without giving rise to any circularity.
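As a gloss on (a) and (b), here is one hedged set-theoretic rendering; the notation (G and T for the sets of sentences of the grammar and of the theory of language, Cn for the consequence operation of the common total language) is introduced only for this illustration and is not the notation of 3.4:

```latex
% (a) the general and the particular are kept apart:
T \cap G = \emptyset
% (b) read so as to be consistent with (a): whatever is derivable from the
% grammar together with the theory of language, but not from the theory of
% language alone, belongs to the grammar:
\mathrm{Cn}(G \cup T) \setminus \mathrm{Cn}(T) \subseteq G
```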
122 This would hold even more strongly if we could replace "formulated in terms of" in (3) by "strongly formulated in terms of", i.e. if the grammar did not contain any defined axiomatic terms; cf. 3.4, p. 85.

Because of (a), there is no mixing of the general and the particular. Because of (b) and (c) the grammar nevertheless specifies properties of its subject matter that are not purely specific. Because of (d), grammars can be used directly in the quest for language universals. Because of (e) the interrelation between the general and the particular is reconstructed in an intuitively satisfying way that should do justice to traditional arguments both for and against a 'universalist' position in general linguistics.123

The above four reasons in favour of axiomatic grammars cannot be jointly adduced for generative grammars. There would seem to be only one advantage of generative grammars, greater succinctness of symbolization. This turns out to be an erroneous assumption since the very formulas of a generative grammar can be reinterpreted as set-theoretical expressions (Lieb (1967)). Moreover, any abbreviation that might be useful in a grammar (like "VCV", "+", etc.) can be introduced as a defined constant, as sketched below. In the light of our discussion it may be surprising that the axiomatic approach to grammar writing should have been so late in being seriously considered.
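For instance, 'VCV' might be introduced by a definition of roughly the following form; the predicates Vowel and Cons and the concatenation functor are assumed for the sake of the illustration and are not taken from this article:

```latex
% A defined constant replacing the phonological abbreviation 'VCV':
% x is a VCV-string iff x is the concatenation of a vowel, a consonant,
% and a vowel.
\mathrm{VCV}(x) \;\leftrightarrow\; \exists y\,\exists z\,\exists w\,
  \bigl(x = y \frown z \frown w \,\wedge\, \mathrm{Vowel}(y) \,\wedge\,
  \mathrm{Cons}(z) \,\wedge\, \mathrm{Vowel}(w)\bigr)
```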

4.5. Historical notes

Both in linguistics and in the theory of linguistics, the axiomatic method has a well-established tradition: On the one hand there have been attempts at axiomatic theories of language or aspects of language;124 on the other there are many axiomatic investigations of grammars. Either type of research is quite different, though, from a grammar itself being conceived or formulated as an 'interpreted' axiomatic theory. My own work in (1967), (1968) was motivated by the hypothesis that axiomatic grammars are preferable to generative ones in definite respects. In the published versions that idea was not made explicit. The possibility of an axiomatic grammar is implicit in the discussion of Hiz (1968) 248f, where it is no longer assumed that the set A of texts in a language is recursively enumerable: This would suggest an axiomatic treatment of A by which various properties of texts could be specified, without an attempt at recursive enumeration. A Hizean approach to grammar writing is at the basis of Smaby (1971); the proposed 'paraphrase grammars' are formal systems as originally used by Wang, and thus close to axiomatic theories (cf. above, 1.6, p. 54). Moreover, a grammar corresponding to the general theory of Hiz (1969)
123 The conception formulated in Thesis B plays a prominent role in my study on the concept of language universal, Lieb (to appear). One of the theoretical advantages of Thesis B is exactly its fruitfulness for universality research and for clarifying the concept of language universal.
124 A trend that is exemplified by much of the East European work summarized in Marcus

can be conceived as an applied axiomatic theory, taking as primitive core constants ('theoretical terms') expressions such as "is a paraphrase of" (for a relation between texts). Assuming a general pragmatic theory that is used in the grammar and application terms that refer to speakers, the paraphrase relation between, say, texts of an idiolect could be related to facts about the use of texts that are paraphrases of each other.125

Irena Bellert has recently proposed a format for grammars which makes a grammar "analogous to a formal interpreted theory" (Bellert (1972) 299).126 Bellert's interesting and important proposal seems to be inadequate as it stands. I will discuss this question in some detail. Following Bellert (1972) 299f, "a complete grammar or theory of a natural language L" will consist of a syntactic and an interpretative component and, corresponding to each component, of a "meta-rule" or "meta-theorem" which is "independent of any particular natural language" (300). The syntactic component "generates an infinite set of pairs (S, D), where S is a string of symbols over the terminal vocabulary of L and D is a deep structure tree dominated by a distinguished element of the auxiliary vocabulary of L, which denotes the category of sentence" (299). The syntactic meta-rule states that, "for all pairs (S, D) generated by the syntactic component of the grammar, S is a sentence of L and D its structural description." The 'meta-rule' is meant to repair an inadequacy of the pairs (S, D): They are not statements.127 Thus, the 'meta-rule' has an interpretative function relative to the expressions of the syntactic component. Proposals of this type were shown to be inadequate in Lieb (1968), for the following reason: They lead to an interpretation that establishes a relation not to a natural language but only to (sets of) strings of symbols of (levels of) the grammar.128 Bellert's proposal for syntax is inferior to both the re-interpretation and the correlation approach, which are also of limited value.
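A toy rendering may make the point vivid; the generator and the 'meta-rule' below are free inventions for illustration, not Bellert's actual rules:

```python
# Toy illustration (hypothetical): the syntactic component as a generator of
# (S, D) pairs, and the syntactic 'meta-rule' as the step that turns a
# generated pair into something of statement form.

def syntactic_component():
    """Yield (string, deep-structure) pairs; the pairs are mere objects."""
    lexicon = {"N": ["John", "Mary"], "V": ["sees"]}
    for subj in lexicon["N"]:
        for verb in lexicon["V"]:
            for obj in lexicon["N"]:
                s = f"{subj} {verb} {obj}"
                d = ("S", ("NP", ("N", subj)),
                          ("VP", ("V", verb), ("NP", ("N", obj))))
                yield s, d

def syntactic_meta_rule(pair, language="L"):
    """The 'meta-rule' interprets a generated pair as an assertion."""
    s, d = pair
    return f"'{s}' is a sentence of {language} with structural description {d}"

for pair in syntactic_component():
    print(syntactic_meta_rule(pair))
    break
```

Note that the 'statement' so produced still predicates sentencehood of a string of symbols generated by the grammar; it is exactly this equation of natural-language sentences with strings of the grammar that constitutes the difficulty pointed out above.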

125 I made a similar proposal to Professor Hiz in 1972, who agreed that it would be compatible with his 1969 theory.
126 In a number of previous publications (collected in Bellert (1972a)), Bellert explored the possibility of describing the semantical aspect of a natural language by stating 'implicational rules' for the use of sentences. In 1971 I suggested to her that these 'rules' might best be taken as axioms on the use of sentences in a given language. In a way, this is now done in her article, although in a form which I find objectionable.
127 Cf. Bellert's discussion (1972) 294.
128 Bellert apparently fails to notice that she is equating the sentences of the natural language with strings of symbols of (a level of) the grammar; this mistake has been discussed for Chomsky's work in Lieb (1968) 3. Thus, Bellert's "theorems of the form: 'X is a sentence of L with a structural description represented by D'" ((1972) 294) are not statements on the natural language.

The interpretative component consists of (299) a finite set of axiomatic implications of the form

C → A PROPOSITIONAL ATTITUDE that S'

where C stands for a Boolean function of conditions on the structural descriptions D, A stands for the addresser, PROPOSITIONAL ATTITUDE stands for one of the purported attitudes, and S' is a sentential form ...129

The corresponding 'meta-rule' says that, for all S, D, R and C, if the sentence S with structural description D is used by addresser A and directed to receiver R and D satisfies condition C, then "A PROPOSITIONAL ATTITUDE that S', where C and A PROPOSITIONAL ATTITUDE that S' are, respectively, the antecedent and the consequent of an axiomatic implication" (299f). The separation of the 'axiomatic implications' from the 'meta-rule' is not tenable. To make sense of the conception, we have to take "A" in an axiomatic implication as a free variable (bound in the meta-rule). We thus arrive at 'axioms' as exemplified by "Question_yes/no[S] → A WANTS R to say if S" (298), where "R" has to be taken as another free variable.130 Because of the free variable(s) these 'axioms' are neither true nor false, and they have no acceptable interpretation: What exactly is it to mean that, for every yes-no question S (S being a sentence, not a concrete utterance or speech act), A wants R to say if S?131 All these problems disappear if Bellert's 'implicational axioms' are given up and the semantic 'meta-rule' is understood as specifying the form of certain axioms such that any grammar of a natural language has such axioms, viz. universal implications of the following form: If x correctly uses ⟨S, D⟩ in L by addressing some utterance U of S to y and if C(S, D, L), then PA(x, y, f); where x and y are variables ranging over humans, S is a variable ranging over sentences of natural languages and D a variable ranging over syntactic structures of sentences; U ranges over objects or events in space-time; L is a constant 'for' the language in question; C(S, D, L) is a sentential formula (of the language of the grammar) in which L occurs and S and D are the only free variables; PA(x, y, f) is an atomic formula consisting of a three-place or two-place predicate PA, which belongs to a small set of predicates 'for' propositional attitudes, and a three-place or two-place argument expression consisting of the free variable x and (possibly) y and a
129 For "PROPOSITIONAL ATTITUDE" in the above schema we are to substitute expressions such as "BELIEVES", understood as "purports to believe" (roughly: behaves linguistically as though he believed); cf. 297.
130 The above example, like most of the examples on p. 298, does not completely correspond to the schema.
131 Bellert maintains for her axiomatic implications that "the consequents can be said to follow formally from the antecedents" (297). But even in analytic sentences material implication does not mean deducibility. Bellert might mean that, given the antecedent, modus ponens could be applied; but this would not answer my question.

formula f 'specifying properties of propositions relative to C and, possibly, x and y'.132 This description obviously has no place in any particular grammar but only in a theory of grammars. There it may be dispensed with altogether: What Bellert seems to be aiming at is a certain conception of the semantic subsystem of a language system and its relation to 'pragmatic' factors. Such a conception should be developed in a theory of language in terms of which grammars can be formulated. We thus arrive at the following picture: Bellert's proposal as it stands is given up. A partial grammar of a language 'covering syntax and semantics' is an applied axiomatic theory formulated in terms of a theory of language among whose axiomatic constants there are terms such as "uses correctly", "addresses", and among whose non-logical constants there are terms such as "BELIEVES", "WANTS". All axioms of the semantic part of the grammar are formulated by using those constants as indicated above, and this is justified by assumptions made in the theory of language.133 In this way, the basic features of Bellert's approach to semantics (which I continue to consider as fruitful) could be kept in a less objectionable framework.

The idea of axiomatic grammar writing has also been emerging in recent work by Helmut Schnelle. In Schnelle (1970), a grammar is taken as a theory of a language which in principle ought to be formulated in one of the 'languages' of logic. Such a formulation would frequently be "zu umfangreich und unübersichtlich" [too voluminous and hard to survey] (8); therefore, 'linguistic algebras' such as context-free grammars are developed.134 For translations into symbolic logic, Schnelle (l.c. 9) refers to Wang (1968), a reference that was premature at the time since Wang developed his 'correlation method' only in later work. The first explicit studies of axiomatic grammars are Wang (1971b), (1971c); the focus of interest is not, however, on axiomatic grammars but on problems of generative grammar. In Schnelle (1973b, end of 2), the requirement is made that results such as obtained by Wang should be made available for any proposed 'linguistic algebra'; "at the same time one should try to write grammars and the general linguistic theory immediately in the standard version [as a theory using a 'language' of logic, H.L.]". This already comes close to our own conclusion at the end of 1.6.
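For concreteness, a hypothetical instance of the above schema for yes-no questions might look as follows; the predicate letters (Uses, YNQ, WANTS) and the rendering of f are illustrative assumptions of this sketch, not notation from the text:

```latex
% 'If x correctly uses <S, D> in L by addressing an utterance U of S to y,
% and <S, D> is a yes-no question of L, then x WANTS y to say whether S.'
\forall x\,\forall y\,\forall S\,\forall D\,\forall U\,
  \bigl[\mathrm{Uses}(x, S, D, L, U, y) \wedge \mathrm{YNQ}(S, D, L)
  \rightarrow \mathrm{WANTS}(x, y, f_{\mathrm{say\ whether}}(S))\bigr]
```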
132 This is a first approximation. The final clause indicates a problem area in Bellert's approach; in the final version, Bellert has now changed the specification that S' in an 'axiomatic implication' be a sentential form; instead, "S' is a sentence expressing the corresponding belief, assertion or any other purported attitude of the speaker".
133 By no means are the axioms analytic; neither are the syntactic axioms and theorems, which Bellert calls "unconditionally true statements" ((1972) 294). On such a view, a grammar would have no empirical content. (In Lieb (1967), I almost stumbled into the same trap.) Some of the semantic axioms may be said to be on analyticity in the natural language, though I doubt that using "analytic" in this context is helpful.
134 This is hardly correct; cf. above, end of 4.3.

Schnelle also remarks that "formalizations of this type are not common in linguistics" (l.c. 2.4). To my knowledge, no linguistic grammar has ever been formulated as an 'interpreted' (realized) axiomatic theory. As a possible exception, one might suggest the grammars for various 'fragments of English' contained in Montague's papers (Montague (1970a), (1970b), (1973)). Those grammars are indeed close to abstract axiomatic theories of type 2 (cf. 3.1) and could probably be reformulated as such; even so, we would be left with the task of transforming them into interpreted or applied theories.135

After arguing for the desirability of our program and discussing conceptions that are similar to it, we now argue in Part II for its feasibility. As explained at the end of 4.1, the test case will be partial idiolect grammars.

135 Montague's repeated claim that there is no important theoretical difference between formal and natural languages ((1970a) 189; (1970b) Introduction) obscures the fact that a linguistic grammar will eventually have to be related to facts of actual language use, a requirement that does not seem to hold for grammars of formal languages.

Index of technical terms, variables, assumptions and theorems introduced in 2–4


abstract axiomatic theory of type 1 (AT1) 69
abstract axiomatic theory of type 2 (AT2) 71
abstract theory of 80
abstract formalized theory of type 1 (AFT1) 70
abstract subject matter of 81
AFT1 = abstract formalized theory of type 1 70
amalgamation 68
APFT = applied formalized theory 75
(the) application (of an APT) 75
application axiom (of an APT) 75
application constant (of an APT) 75
(the) application language (of an APT) 75
applied axiomatic theory (APT) 73
applied axiomatic theory with predicate (APTP) 74
(the) applied core (of an APT) 75
(the) applied-core language (of an APT) 75
APT = applied axiomatic theory 73
APTP = applied axiomatic theory with predicate 74
applied formalized theory (APFT) 75
applied theory of 80
AS = axiom system 69
assume 87
assumption on amalgamations 68
assumptions on C-IFSs of logic 65
assumption 1 on IFSs of logic 66
assumption 2 on IFSs of logic 67
assumption on Minus 62
assumption on NLRs 63
assumption on NLRs and Pre-IFSs 67
AT = AT1 or AT2
AT1 = abstract axiomatic theory of type 1 69
AT2 = abstract axiomatic theory of type 2 71
auxiliary symbol 59
axiom: of a formal system (Fraenkel et al.) 59; of an FS 60; of an IFS 68; of an AT1 69; of an AT2 71; of an APT 75
C-axiom: of an FS 61; of an IFS 68; of a Pre-IFS 68
axiomatic constant: of an AT1 70; of an APT 75
axiomatic core constant (of an APT) 75
(the) axiomatic language (of an AT1) 69
axiomatic primitive (of an AT1) 69
axiomatic term: of an AT1 70; of an APT 75
axiomatizable: FS 61; APFT 75
(type-1) axiomatizable 70
axiomatization (of an APFT) 75
(type-1) axiomatization 70
axiom system: in the sense of Carnap 68; in the sense of Lieb (= AT1) 69
B, B1, ... (variables) 64
basic core constant (of an APT) 75
(the) basic language (of an AT1) 70
basic non-logical constant (of an AT1) 70
basic non-mathematical constant (of an AT1) 70
basis cf. minus-C basis for ⟨F, D⟩
B-interpretation cf. interpretation
BT1-false cf. false
BT1-true cf. true
C, C1, ... (variables) 61, 64, 69
C-axiom etc. cf. axiom etc.
CF-part cf. part
C-IF1S of logic = minus-C interpreted formal system of logic 68
C-IFS of logic = minus-C interpreted formalized system of logic 65
C-IFS of predicate logic (etc.) 66
cINT = completely interpreted axiomatic theory 72

C-INT = minus-C interpreted axiomatic theory 72
C-INTF = minus-C interpreted formalized theory 72
compatible (ATs) 83
complete linguistic grammar 89
completely interpreted axiomatic theory (cINT) 72
(the) -completion of T 85
completion theorem 85
(the) conflation: of AT1s 83; of an AT1 and an APT 83; of APTs 83
conflation theorem 83
constant: of a formal system (Fraenkel et al.) 59; of an FS 60; of an IFS 68; of a Pre-IFS 68
contained in: applied to FSs 62; applied to Pre-IFSs 67
(the) core (of an APT) 75
core axiom (of an APT) 75
core constant (of an APT) 75
core definition (of an APT) 75
(the) core language (of an APT) 75
core sentence (of an APT) 75
core theorem (of an APT) 75
D, D1, ... (variables) 64
…, …1, ... (variables) 66
defined constant (of an AT1) 70
(the) defined term (of a definition) 61
defined term: of an FS 61; of an IFS 68; of a Pre-IFS 68; of an AT1 70
C-defined term: of an FS 61; of an IFS 68; of a Pre-IFS 68
(the) definiendum (of a definition) 61
(the) definiens (of a definition) 61
(the) defining terms (of a definition) 61
definition: of an FS 61; of an IFS 68; of a Pre-IFS 68; of an AT1 70; of an AT2 (cf. also: main definition) 71; of an APT 75
C-definition: of an FS 61; of an IFS 68; of a Pre-IFS 68
derivable from 59
derivation: in a formal system (Fraenkel et al.) 59; in an FS 61; in an IFS 68; in a Pre-IFS 68
(the) distinguished sentence (of a grammar) 90
(the) distinguished term: of a theory of language 87; of a grammar 87
(the) domain sequence (of an APT) 75
e, e1, ... (variables) 63
⟨F, D⟩ excluding C 67
expression: of a formal system (Fraenkel et al.) 59; of an FS 60; of an IFS 68; of a Pre-IFS 68
C-expression: of an FS 61; of an IFS 68; of a Pre-IFS 68
extension 64
extension (of a theory) 70
extensional: applied to IFSs of logic 66; applied to Pre-IFSs of logic 66
F, F1, ... (variables) 61
f, f1, ... (variables) 64
BT1-false 79
false theory of 81
first-order theory 64
F1S = formal system (Lieb) 61

F-minus-C cf. minus
(the) r-form of D 67
formal system: in the sense of Fraenkel et al. 59; in the sense of Lieb (= F1S) 61
formal theorem = provable formula
formalized system (FS) 60
formalized system of (predicate etc.) logic (FS of (predicate etc.) logic) 62
formalized theory 59
formula: of a formal system (Fraenkel et al.) 59; of an FS 60; of an IFS 68; of a Pre-IFS 68
C-formula: of an FS 61; of an IFS 68; of a Pre-IFS 68
(is) formulated in terms of: applied to AT1s 84; applied to ATs and (or) APTs 85
formulated in the language of predicate logic: applied to ATs 71; applied to APTs and APFTs 76
formulated in the language of set theory: applied to ATs 71; applied to APTs and APFTs 76
FS = formalized system 60
FS of (predicate etc.) logic = formalized system of (predicate etc.) logic 62
harmless (set of constants) 62
idiolect grammar 89
IF1S of logic = interpreted formal system of logic 68
IFS of logic = interpreted formalized system of logic 65
IFS of predicate logic (etc.) = interpreted formalized system of predicate logic (etc.) 66
iINT = incompletely interpreted axiomatic theory 72
immediate derivability: in a formal system (Fraenkel et al.) 59; in an FS 60; in an IFS 68; in a Pre-IFS 68
incompletely interpreted axiomatic theory (iINT) 72
INT = interpreted axiomatic theory 72
int(B, T1) = the B-interpretation of T1 77
intension 64
(the) B-interpretation of T1 (int(B, T1)) 77
interpreted axiomatic theory (INT) 72
interpreted axiomatic theory with predicate (INTP) 73
interpreted formal system of logic (IF1S of logic) 68
interpreted formalized system of logic (IFS of logic) 65
interpreted formalized system of predicate logic (etc.) (IFS of predicate logic etc.) 66
(minus-C) interpreted formalized system of logic cf. minus
INTP = interpreted axiomatic theory with predicate 73
L, L1, ... (variables) 69
(the) language (of an AT2) 71
linguistic complex 88
linguistic means of communication 88
logical axiom: of a formalized theory 59; of an FS of logic 62; of an IFS of logic 68; of a Pre-IFS of logic 68
logical calculus = formalized system of logic
logical constant: of an FS of logic 62; of an IFS of logic 68; of a Pre-IFS of logic 68
logically compatible 67
logistic system 59
(the) main definition (of an AT2) 71
mathematical constant: of an FS 62, fn. 45; of an IFS 68; of a Pre-IFS 68; of an AT1 70
MC = means of communication 88

means of communication (MC) 88
F-minus-C 61
minus-C basis for ⟨F, D⟩ 64
minus-C IFS = minus-C interpreted formalized system of logic 65
minus-C interpreted axiomatic theory (C-INT) 72
minus-C interpreted formal system of logic (C-IF1S of logic) 68
minus-C interpreted formalized system of logic (C-IFS of logic) 65
minus-C interpreted formalized system of predicate logic (etc.) (C-IFS of predicate logic etc.) 66
minus-C interpreted formalized theory (C-INTF) 72
natural language reading (NLR) of an FS 63
NLR = natural language reading 63
non-logical axiom: of a formalized theory 59; of an FS of logic 62; of an IFS of logic 68; of a Pre-IFS of logic 68
non-logical constant: of an FS of logic 62; of an IFS of logic 68; of a Pre-IFS of logic 68; of an AT1 70; of an AT2 71
non-mathematical constant: of an FS 62, fn. 45; of an IFS 68; of a Pre-IFS 68; of an AT2 71
non-pure logical calculus 62
object of = subject matter of 80
part: applied to AT1s and AFT1s 70; applied to ATs, AFT1s, APTs, APFTs 76, also fn. 75; applied to systems 89
(the) CF-part of D 67
partial linguistic grammar 89
(is) partly formulated in terms of 84
(the) possible-value function (of an APT) 75
(the) predicate (of an AT2) 71
Pre-IF1S of logic = pre-interpreted formal system of logic 68
Pre-IFS of logic = pre-interpreted formalized system of logic 66
pre-interpreted formal system of logic (Pre-IF1S of logic) 68
pre-interpreted formalized system of (predicate etc.) logic (Pre-IFS of (predicate etc.) logic) 66
presuppose(s): applied to AT1s 84; applied to ATs and (or) APTs 84
presupposes all of: applied to AT1s 84; applied to ATs and (or) APTs 84
(the) primitive interpretation (of an APT) 75
proof 59
provable (formula): in a formal system (Fraenkel et al.) 59; in an FS 60; in an IFS 68; in a Pre-IFS 68
pure logical calculus 62
Ramsey sentence 78, fn. 77
realization theory 89
realized axiomatic theory (RT) 74
(the) C-reduction of ⟨F, D⟩ 67
regimented (form of a natural language) 63
rendering of ... as ... 63
r-form cf. form
RT = realized axiomatic theory 74
rule of inference (Fraenkel et al.) 59 f.
S, S1, ... (variables) 69
(the) semantic basis (of an APT) 75
sentence: of an FS 60; of an IFS 68; of a Pre-IFS 68; of an AT1 70; of an APT 75
specific subject matter of 81
(is) strongly formulated in terms of 85
subject matter of 80

subject matter of T relative to 86
subtheory 84
syntactical subsystem 89
system of 88
system for 88
T, T1, ... (variables) 69
term: of a formal system (Fraenkel et al.) 59; of an FS 60; of an IFS 68; of a Pre-IFS 68
C-term: of an FS 61; of an IFS 68; of a Pre-IFS 68
…, …1, ... (variables) 75
-completion cf. completion
theorem on application axioms 76 f.
theorem on excluding 67
theorem on NLRs and Pre-IFSs 67
(formal) theorem = provable formula
theorem: of an FS 61; of an IFS 68; of a Pre-IFS 68; of an AT1 70; of an AT2 71; of an APT 75
C-theorem: of an FS 61; of an IFS 68; of a Pre-IFS 68
theory of 80
theory of relative to 86
theory of systems 89
(the) total language (of an APT) 75
BT1-true 79
true theory of 81
(type-1) axiomatizable cf. axiomatizable
(type-1) axiomatization cf. axiomatization
U, U1, ... (variables) 64
undefined term: of an FS 61; of an IFS 68; of a Pre-IFS 68; of an AT1 70
(the) underlying logic of ⟨F, D⟩ 67
union (of FSs) 62
(is) used (in): applied to ATs 84; applied to ATs and (or) APTs 84
(is) used completely (in): applied to AT1s 84; applied to AT1s and (or) APTs 84
v, v1, ... (variables) 64
valid (sentence): of a formal system (Fraenkel et al.) 59; of an FS 60; of an IFS 68; of a Pre-IFS 68
valid C-sentence: of an FS 61; of an IFS 68; of a Pre-IFS 68
variable: of a formal system (Fraenkel et al.) 59; of an FS 60; of an IFS 68; of a Pre-IFS 68
vocabulary: of a formal system (Fraenkel et al.) 59; of an FS 60; of an IFS 68; of a Pre-IFS 68
x, x1, ... (variables) 64, 80

REFERENCES

BACH, E. (1964), An Introduction to Transformational Grammars, New York: Holt, Rinehart and Winston
BARTSCH, Renate (1971), Semantische Darstellung von Prädikaten, Linguistische Berichte No. 13, 33–48
BARTSCH, Renate (1972), Adverbialsemantik. Die Konstitution logisch-semantischer Repräsentationen von Adverbialkonstruktionen, Frankfurt/M: Athenäum Verlag (= Linguistische Forschungen 6)
BARTSCH, Renate and T. VENNEMANN (1972), Semantic Structures. A Study in the Relation between Semantics and Syntax, Frankfurt/M: Athenäum Verlag (= Athenäum Skripten Linguistik 9)
BELLERT, Irena (1972a), On the Logico-Semantic Structure of Utterances, Wrocław etc. (= Polska Akademia Nauk, Komitet Językoznawstwa, Prace Językoznawcze 66)
BELLERT, Irena (1972b), Theory of language as an interpreted formal theory, pp. 292–300 in: [Preprints of Proceedings of the 11th International Congress of Linguists, Bologna, 28 August – 2 September 1972]
BELLERT, Irena (1973), Sets of implications as the interpretative component of a grammar, in: F. Kiefer, N. Ruwet (eds.), Generative Grammar in Europe, Dordrecht (Holland): Reidel 1973
BIERWISCH, M. (1969), On certain problems of semantic representation, Foundations of Language 5, 153–184
BOTHA, R.P. (1968), The Function of the Lexicon in Transformational Generative Grammar, The Hague: Mouton (= JL, series maior, 38)
CARNAP, R. (1942), Introduction to Semantics, = vol. I in: Carnap, R., Introduction to Semantics and Formalization of Logic, two vols. in one, Cambridge (Mass.): Harvard University Press, 1958
CARNAP, R. (1954), On belief-sentences. Reply to Alonzo Church, pp. 230–232 in Carnap (1956). [First published in 1954.]
CARNAP, R. (1956), Meaning and Necessity. A Study in Semantics and Modal Logic, 2nd enlarged ed. [1st ed. 1947], Phoenix Books, Chicago and London: The University of Chicago Press
CARNAP, R. (1958), Introduction to Symbolic Logic and its Applications, New York: Dover
CARNAP, R. (1960), Einführung in die symbolische Logik mit besonderer Berücksichtigung ihrer Anwendungen, 2nd rev. and enlarged ed., Wien: Springer-Verlag
CARNAP, R. (1963), C.G. Hempel on Scientific Theories, pp. 958–966 in P.A. Schilpp (ed.), The Philosophy of Rudolf Carnap, La Salle, Ill.: Open Court (= The Library of Living Philosophers, vol. 11)
CARNAP, R. (1966), Philosophical Foundations of Physics, ed. by M. Gardner, New York–London: Basic Books
CHOMSKY, N. (1955), The Logical Structure of Linguistic Theory. Preliminary draft [unpublished typescript, 1955], M.I.T., Microreproduction Laboratory: Cambridge, Mass. [microfilm]
CHOMSKY, N. (1957), Syntactic Structures, The Hague: Mouton (= Janua Linguarum, Series minor, Nr. 4, 1957 etc.)


CHOMSKY, N. (1959), On certain formal properties of grammars, Information and Control 2, 137–167. Also pp. 125–155 in: Luce, R.D., R. Bush and E. Galanter (eds.), Readings in Mathematical Psychology, Vol. II, New York: Wiley
CHOMSKY, N. (1965), Aspects of the Theory of Syntax, Cambridge, Mass.: The M.I.T. Press
CHOMSKY, N. (1970a), Remarks on nominalization, in: Jacobs, R.A. and P.S. Rosenbaum (eds.), Readings in English Transformational Grammar, Waltham, Mass.: Ginn. Also pp. 11–61 in: Chomsky (1972b)
CHOMSKY, N. (1970b), Deep structure, surface structure, and semantic interpretation, in: Jakobson, R. and Shigeo Kawamoto (eds.), Studies in General and Oriental Linguistics Presented to Shiro Hattori on the Occasion of his Sixtieth Birthday, TEC Co.: Tokyo. Also pp. 62–119 in: Chomsky (1972b)
CHOMSKY, N. (1972a), Some empirical issues in the theory of transformational grammar, pp. 63–130 in: Peters, S. (ed.), Goals of Linguistic Theory, Englewood Cliffs, N.J.: Prentice-Hall. Also pp. 120–202 in: Chomsky (1972b)
CHOMSKY, N. (1972b), Studies on Semantics in Generative Grammar, The Hague: Mouton (= Janua Linguarum, Series minor No. 107)
CHOMSKY, N. and M. HALLE (1968), The Sound Pattern of English, New York: Harper and Row
CHURCH, A. (1956), Introduction to Mathematical Logic, Vol. I, Princeton, N.J.: Princeton University Press
COOPER, W.S. (1964), Set Theory and Syntactic Description, The Hague: Mouton (= Janua Linguarum, Series minor, Nr. 34)
DAVIDSON, D. and G. HARMAN (eds.) (1972), Semantics of Natural Language, Dordrecht-Holland: Reidel
ESSLER, W.K. (1970), Wissenschaftstheorie I: Definition und Reduktion, Freiburg–München: Karl Alber
FRAENKEL, A.A., Y. BAR-HILLEL, A. LEVY, with the collaboration of D. van Dalen (Fraenkel et al. (1973)), Foundations of Set Theory, 2nd rev. ed., Amsterdam–London: North-Holland (= Studies in Logic and the Foundations of Mathematics, Vol. 67)
GINSBURG, S. and B. PARTEE (1969), A mathematical model of transformational grammars, Information and Control 15, 297–334
HATCHER, W.S. (1968), Foundations of Mathematics, Philadelphia etc.: W.B. Saunders
HERMANNS, F. (1971), Die generative Grammatik als Deskriptionsgrammatik, Lingua 27, 301–329
HINTIKKA, J. (1969), Semantics for propositional attitudes, pp. 21–45 in: Davis, J.W. et al. (eds.), Philosophical Logic, Dordrecht-Holland: Reidel
HINTIKKA, K.J.J., J.M.E. MORAVCSIK, and P. SUPPES (eds.) (1973), Approaches to Natural Language. Proceedings of the 1970 Stanford Workshop on Grammar and Semantics, Dordrecht-Holland: Reidel (Synthese Library)
HIRSCHMAN, Lynette (1971), A Comparison of Formalisms for Transformational Grammar, University of Pennsylvania (= Transformations and Discourse Analysis Papers 87)
HIZ, H. (1968), Computable and uncomputable elements of syntax, pp. 239–254 in: Rootselaar, B. van, J.F. Staal (eds.), Logic, Methodology and Philosophy of Science III.
Proceedings of the Third International Congress for Logic, Methodology and Philosophy of Science, Amsterdam 1967, Amsterdam: North-Holland
HIZ, H. (1969), Aletheic semantic theory, The Philosophical Forum I (New Series), 438–451
JACKENDOFF, R.S. (1972), Semantic Interpretation in Generative Grammar, Cambridge, Mass.: The M.I.T. Press (= Studies in Linguistics Series 2)
KATZ, J.J. (1967), Recent issues in semantic theory, Foundations of Language 3, 124–194
KATZ, J.J. (1971), Generative semantics is interpretive semantics, Linguistic Inquiry 2, 313–331
KATZ, J.J. (1972), Semantic Theory, New York: Harper and Row
KATZ, J.J. and J.A. FODOR (1963), The structure of a semantic theory, Language 39, 170–210. Also pp. 479–518 in: Fodor, J.A. and Katz, J.J., The Structure of Language. Readings in the Philosophy of Language, Englewood Cliffs, N.J.: Prentice-Hall, 1964
KATZ, J.J. and P.M. POSTAL (1964), An Integrated Theory of Linguistic Descriptions, Cambridge, Mass.: The M.I.T. Press (= Research Monograph No. 26)
KEENAN, E.L. (1972), On semantically based grammar, Linguistic Inquiry 3, 413–462
KEENAN, E.L. (to appear), A presupposition logic for natural language, mimeographed, to appear in: The Monist
KNUTH, D.E. (1968), Semantics of context-free languages, Mathematical Systems Theory 2, 127–145
KUTSCHERA, F. von (1972), Wissenschaftstheorie: Grundzüge der allgemeinen Methodologie der empirischen Wissenschaften, 2 vols., München: Fink (= Uni-Taschenbücher 100 and 198)
LAKOFF, G. [1969], Generative Semantics [preliminary prepublication draft]
LAKOFF, G. (1971), On generative semantics, pp. 232–296 in: Steinberg and Jakobovits (eds.) (1971)
LEWIS, D. (1970), How to define theoretical terms, The Journal of Philosophy 67, 427–446
LIEB, H. (1967), A note on transformational grammars, Word 23, 369–373. [Written and published after Lieb 1968b]
LIEB, H. (1968a), Communication Complexes and their Stages. A Contribution to a Theory of the Language Stage, The Hague, Paris: Mouton
LIEB, H. (1968b), Zur Kritik von N. Chomskys Theorie der Ebenen, Lingua 19, 341–385
LIEB, H. (1970), Sprachstadium und Sprachsystem. Umrisse einer Sprachtheorie, Stuttgart etc.: Kohlhammer Verlag
LIEB, H. (to appear), Universals of language: Quandaries and prospects, Foundations of Language
MARCUS, S. (1967a), Analytique et génératif dans la linguistique algébrique, pp. 1252–1260 in: To Honour Roman Jakobson. Essays on the Occasion of his Seventieth Birthday, vol. 2, The Hague: Mouton
MARCUS, S. (1967b), Algebraic Linguistics: Analytical Models, New York: Academic Press
MARCUS, S. (1969), Linguistique générative, modèles analytiques et linguistique générale, Revue Roumaine de Linguistique XIV, 313–326
MATES, B. (1965), Elementary Logic, London: Oxford University Press
McCAWLEY, J.D. (1968), Concerning the base component of a transformational grammar, Foundations of Language 4, 243–269

MCCAWLEY, J.D. (1972), A program for logic, pp. 498–544 in: Davidson and Harman (eds.) (1972)
MONTAGUE, R. (1970a), English as a formal language, pp. 189–224 in: Linguaggi nella società e nella tecnica. Convegno promosso dalla Ing. C. Olivetti & C., S.p.A. per il centenario della nascita di Camillo Olivetti, Milano, 14–17 ottobre 1968, Milano: Edizioni di Comunità (= Saggi di cultura contemporanea 87)
MONTAGUE, R. (1970b), Universal grammar, Theoria 36, 373–398
MONTAGUE, R. (1973), The proper treatment of quantification in English, pp. 221–243 in: Hintikka, Moravcsik, Suppes (eds.) (1973)
MORAVCSIK, J. (1973), Comments on Partee's paper, pp. 349–370 in Hintikka et al. (eds.) (1973)
PARTEE, Barbara H. (1973), The semantics of belief-sentences, pp. 309–336 in Hintikka et al. (eds.) (1973)
RESCHER, N. (1968), Topics in Philosophical Logic, Dordrecht-Holland: Reidel (Synthese Library)
ROGERS, R. (1971), Mathematical Logic and Formalized Theories: A Survey of Basic Concepts and Results, Amsterdam–London: North-Holland
SCHNELLE, H. (1970), Zur Entwicklung der theoretischen Linguistik, Studium Generale 23, 1–29
SCHNELLE, H. (1973a), Sprachphilosophie und Linguistik. Prinzipien der Sprachanalyse a priori und a posteriori, Reinbek bei Hamburg: Rowohlt (rororo studium, Linguistik)
SCHNELLE, H. (1973b), Problems of Theoretical Linguistics, pp. 805–831 in: P. Suppes et al. (eds.), Logic, Methodology and Philosophy of Science IV, Amsterdam: North-Holland Publishing Comp.
SHOENFIELD, J.R. (1967), Mathematical Logic, Reading (Mass.) etc.: Addison-Wesley
SEARLE, J.R. (1969), Speech Acts, London: Cambridge University Press
SMABY, R.M. (1971), Paraphrase Grammars, Dordrecht-Holland: Reidel (= Formal Linguistics Series Vol. 2)
SMULLYAN, R. (1961), Theory of Formal Systems, rev. ed., Princeton, N.J.: Princeton University Press
SNEED, J.D. (1971), The Logical Structure of Mathematical Physics, Dordrecht-Holland: Reidel (Synthese Library)
STEGMÜLLER, W. (1969/70), Probleme und Resultate der Wissenschaftstheorie und Analytischen Philosophie, vol. I, Wissenschaftliche Erklärung und Begründung, 1969, vol. II, Theorie und Erfahrung, 1970, Berlin etc.: Springer
STEGMÜLLER, W. (1973), Probleme und Resultate der Wissenschaftstheorie und Analytischen Philosophie, vol. II, Theorie und Erfahrung, Zweiter Halbband: Theorienstrukturen und Theoriendynamik, Berlin etc.: Springer
STEINBERG, D.D. and L.A. JAKOBOVITS (eds.) (1971), Semantics. An Interdisciplinary Reader in Philosophy, Linguistics, and Psychology, Cambridge, Engl.: Cambridge University Press
SUPPE, F. (1971), On partial interpretation, The Journal of Philosophy 68, 57–76
SUPPES, P. (1957), Introduction to Logic, Princeton: van Nostrand [Frequently reprinted]

WANG, J.T. (1968), Zur Anwendung kombinatorischer Verfahren der Logik auf die Formalisierung der Syntax, Institut für Phonetik und Kommunikationsforschung der Universität Bonn, IPK-Forschungsbericht 68-5 [Ph.D. diss. Bonn]
WANG, J.T. (1971a), Zu den Begriffen der grammatischen Regel und der strukturellen Beschreibung, pp. 57–71 in: Wunderlich, D. (ed.), Probleme und Fortschritte der Transformationsgrammatik. Referate des 4. Linguistischen Kolloquiums, Berlin 6.–10. Oktober 1969, München: Hueber (= Linguistische Reihe 8)
WANG, J.T. (1971b), Zur Beziehung zwischen generativen und axiomatischen Methoden in linguistischen Untersuchungen, pp. 273–282 in: Stechow, A. von (ed.), Beiträge zur generativen Grammatik. Referate des 5. Linguistischen Kolloquiums [Oktober 1970], Braunschweig: Vieweg
WANG, J.T. (1971c), On the representation of context-free grammars as first-order theories, pp. 486–488 in: Shu Lin (ed.), Proceedings of the 4th Hawaii International Conference on System Sciences, North Hollywood: Western Periodicals Company
WANG, J.T. (1972a), Wissenschaftliche Erklärung und generative Grammatik, pp. 50–66 in: K. Hyldgaard-Jensen (ed.), Linguistik 1971. Referate des 6. Linguistischen Kolloquiums 11.–14. August 1971 in Kopenhagen, Frankfurt/M: Athenäum
WANG, J.T. (1972b), On the description of semantics of context-free languages in first-order logic, pp. 355–357 in: A. Lew (ed.), Proceedings of the 5th Hawaii International Conference on System Sciences, North Hollywood: Western Periodicals Company
WANG, J.T. (1973a), On the representation of generative grammars as first-order theories, pp. 302–316 in: R.J. Bogdan and I. Niiniluoto (eds.), Logic, Language and Probability. A Selection of Papers Contributed to Sections IV, VI, and XI of the Fourth International Congress for Logic, Methodology, and Philosophy of Science, Bucharest, September 1971, Dordrecht: Reidel
WANG, J.T. (1973b), Zum rekursiven Mechanismus im semantischen System, pp. 205–219 in: A.P. ten Cate and P. Jordens (eds.), Linguistische Perspektiven. Referate des VII. Linguistischen Kolloquiums, Nijmegen, 26.–30. September 1972, Tübingen: Max Niemeyer Verlag

SCOTT SOAMES

RULE ORDERINGS, OBLIGATORY TRANSFORMATIONS AND DERIVATIONAL CONSTRAINTS*

The author explicates and criticizes the abstract characterization of grammars given in Lakoff (1971). In addition, he constructs proposals for formalizing the notions of rule ordering constraints and obligatory transformations in the context of transformational grammar. Each proposal is examined to determine both the restrictiveness of the claim that it makes about the structure of natural languages and the empirical evidence that would be needed to confirm it.
I.

In the past decade and a half the development of transformational grammar has not only produced new insights regarding the structure of natural languages but has also provided a new conception of the explanatory goals of linguistics and of the theoretical devices needed to achieve those goals. The key insights regarding the proper goals of linguistic science are: first, that the grammar of a language is a scientific theory whose task is to correctly predict the linguistically relevant properties of sentences of the language; second, that the linguistically relevant properties of sentences are determined by what a native speaker knows about their semantic, syntactic, and phonological characteristics; and third, that a linguistic theory is a higher level scientific construct that attempts to specify the common structure of all natural languages by formulating principles that are embodied by the grammars of each such language. The fundamental insight regarding the structure of grammars is that in order to achieve these explanatory goals, grammars must contain not only a set of context free constituent structure rewrite rules but also a set of more powerful grammatical transformations. It is only by the combined operation of both of these types of rules that the sentences of a natural language can be generated and given a maximally perspicuous representation of their semantic, syntactic, and phonological properties.
* I would like to thank Noam Chomsky, Julius Moravcsik, James Corum, and Jerry Katz for their helpful comments regarding an earlier version of this paper.

This claim can be clarified by characterizing more precisely the nature of context free constituent structure rewrite rules and grammatical transformations. According to Chomsky, a context free constituent structure rewrite rule is a rule "of the form A → Z where A is a category symbol such as S (for "sentence"), NP (for "noun phrase"), N (for "noun"), etc., and Z is a string of one or more symbols which may again be category symbols or which may be terminal symbols (that is symbols which do not appear on the left hand side of any base rule [constituent structure rewrite rule]). Given such a system we can form derivations, a derivation being a sequence of lines that meets the following conditions: the first line is simply the symbol S (standing for sentence); the last line contains only terminal symbols; if X, Y are two successive lines, then X must be of the form ... A ... and Y of the form ... Z ... [that is Y must result from X by substituting a certain occurrence of A in X by Z], where A → Z is one of the rules. A derivation imposes a labeled bracketing on its terminal string in the obvious way. Thus, given the successive lines X = ... A ..., Y = ... Z ..., where Y is derived from X by the rule A → Z, we will say that the string derived from Z (or Z itself if it is terminal) is bracketed by [A ]A. Equivalently, we can represent the labeled bracketing by a tree diagram in which a node labeled A (in this example) dominates the successive nodes labeled by the successive symbols of Z." (Chomsky (1967) 422) A simple example of a set of context free constituent structure rewrite rules is (1):

(1) S → NP VP
    VP → V NP
    NP → N

With these rules we can construct the following derivation:

(2) S
    NP VP
    NP V NP
    N V NP
    N V N
Finally, we can represent what is essential to this derivation by either a labeled bracketing (3) or by a labeled tree (4) (Chomsky 1967, 421/2).

(3) [S [NP N ]NP [VP V [NP N ]NP ]VP ]S

(4) [Tree diagram omitted in this reproduction: the labeled tree equivalent to (3), with S dominating NP and VP, VP dominating V and NP, and each NP dominating N.]
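The mechanics can be sketched in a few lines of code; the grammar below is (1), while the tuple representation and the two functions are illustrative choices of this sketch, not part of the formal theory. (The derivation printed expands the leftmost category first, whereas (2) expands VP before the first NP; either order imposes the same labeled bracketing.)

```python
# A minimal sketch of the rewrite rules in (1), a derivation like (2), and
# the labeled bracketing (3) that the derivation imposes.

RULES = {"S": ["NP", "VP"], "VP": ["V", "NP"], "NP": ["N"]}  # grammar (1)

def derive(line):
    """Print each line of a leftmost derivation."""
    print(" ".join(line))
    for i, sym in enumerate(line):
        if sym in RULES:                       # rewrite the leftmost category
            derive(line[:i] + RULES[sym] + line[i + 1:])
            return

def bracketing(symbol):
    """Build the labeled bracketing imposed by the derivation, as in (3)."""
    if symbol not in RULES:                    # terminal symbol: no brackets
        return symbol
    inner = " ".join(bracketing(s) for s in RULES[symbol])
    return f"[{symbol} {inner} ]{symbol}"

derive(["S"])            # S / NP VP / N VP / N V NP / N V N
print(bracketing("S"))   # [S [NP N ]NP [VP V [NP N ]NP ]VP ]S
```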

A labeled bracketing, or correspondingly a labeled tree, that is generated by the combined operation of (i) the constituent structure component of the grammar and (ii) a class of lexical insertion transformations that replace Δ's in the above generated structures by lexical items is called a base phrase marker.1 More generally, a phrase marker is any properly labeled bracketing of elements.2 Using this general notion we can characterize a transformation as a structure dependent mapping of phrase markers onto phrase markers. More specifically, a transformation is defined by specifying both what is called 'a structural description' and 'a structural change.' The former consists of a finite string of category symbols, terminal symbols and/or variables: for example, NP, V, NP, X (plus, in some cases, certain other restrictions that are irrelevant to our discussion). Such a structural description defines the domain of a transformation as the set of all phrase markers Q whose full string t of terminal symbols can be divided into substrings t1, t2, t3, t4 (t4 possibly null) which are such that t1 is a noun phrase in Q, t2 is a verb in Q, and t3 is a noun phrase in Q. If a phrase marker meets this condition we say that it is exhaustively analyzable into a string of constituents having the structure NP, V, NP, X. In general, a phrase marker Q satisfies the structural description D of a transformation iff Q is exhaustively analyzable into a string of constituents having the structure specified by D. A transformation is an operation of permutation, deletion, or insertion that is defined on phrase markers that satisfy its structural description. The structural change of a transformation consists of a set of instructions for determining which phrase marker(s) the transformation associates with each phrase marker in its domain. For example, a transformation T might be defined by
1 The combination of the rewrite rules and lexical insertion transformations of a grammar is called the base component of the grammar.
2 In footnote 16 of Chomsky (1967), Chomsky gives the following example of the difference between proper and improper bracketing:
(i) [A ... [B ... [C ... ]C ... ]B ... ]A
(ii) [A ... [B ... ]A
(iii) [A ... [B ... ]A ... ]B
Only (i) is a properly labeled bracketing.

associating the structural description NP, V, NP, X with the structural change 1, 2, 3, 4 ⇒ 3, be+en, 2, 4, by 1. This structural change tells us that for all Q, if t1, t2, t3, t4 (with t4 possibly null) represents an exhaustive analysis of Q into a string of constituents with the structure specified by the structural description of T, and if Q' arises from Q by adding 'by' to t1, placing this new constituent at the end of the string, moving t3 to the front, and inserting 'be+en' before the verb, then T associates Q' with Q.3 Thus, a transformation can be taken to be a binary relation whose domain is specified by a structural description and whose range is determined by a structural change in the manner just illustrated. Finally, the last notion that I need to assume for the discussion that follows is that of a transformational derivation. A transformational derivation in a grammar G (from now on simply a derivation in G) is a finite sequence of phrase markers P1, ..., Pn such that (i) P1 is a base phrase marker defined by the base component of G, and (ii) each ⟨Pi, Pi+1⟩ satisfies the constraints defined by the transformations of G. Just what this second condition comes to I will leave open for a moment since it involves a number of substantive questions that I will discuss in the main body of this paper. For the present, suffice it to say that a pair of successive phrase markers ⟨Pi, Pi+1⟩ in a finite sequence of phrase markers meets the constraints defined by the transformations of a grammar G only if G contains a transformation T that associates Pi+1 with Pi. What further conditions, if any, might be needed to define this notion will be the subject of further investigation below. (Since in what follows we shall be concerned only with transformational derivations and not with derivations resulting from the application of rewrite rules alone, the term "derivation" will hereafter be used exclusively to refer to transformational derivations.)
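The passive example just given can be rendered as a small executable sketch; representing an exhaustively analyzed phrase marker simply as a sequence of (terminal substring, category) pairs is a simplification assumed here, and the derived constituent structure of the output (cf. footnote 3) is deliberately left out:

```python
# Sketch of the transformation T: structural description NP, V, NP, X and
# structural change 1, 2, 3, 4 => 3, be+en, 2, 4, by 1.

def analyze(analysis):
    """Return (t1, t2, t3, t4) if the phrase marker is exhaustively analyzable
    into NP, V, NP, X (X matching a possibly null remainder), i.e. if it
    satisfies the structural description; otherwise None."""
    cats = [cat for _, cat in analysis]
    if len(cats) < 3 or cats[:3] != ["NP", "V", "NP"]:
        return None
    t = [c for c, _ in analysis]
    return t[0], t[1], t[2], " ".join(t[3:])

def passive(analysis):
    """Apply the structural change 1, 2, 3, 4 => 3, be+en, 2, 4, by 1."""
    parts = analyze(analysis)
    if parts is None:
        return None                 # the phrase marker is outside T's domain
    t1, t2, t3, t4 = parts
    return " ".join(x for x in (t3, "be+en", t2, t4, "by", t1) if x)

print(passive([("John", "NP"), ("see", "V"), ("Mary", "NP")]))
# Mary be+en see by John
```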
II.

Two notions that play an important role in defining the place of transformations in a grammar are the notion of rule ordering and the distinction between optional and obligatory transformations. However, despite their importance, little attention has been given to determining how they should be formally represented in linguistic theory. One might think that the reason for this is that the formalization of these notions is unproblematic or that there is unanimous agreement among generative grammarians regarding the devices needed to characterize them. Unfortunately, this is not the case. To show this I will examine a recent proposal made by George Lakoff and seconded by Paul Postal about how these notions should be formalized. I will argue that Lakoff's proposal is defective. Nevertheless, the difficulties with it are instructive in that a close examination of them brings to light the different consequences of adopting one or another formal device for explicating what it means to say that a rule is a grammatical transformation, that a given transformation is obligatory, or that one transformation is ordered before another. The motivating force behind Lakoff's characterization is his introduction of grammatical rules that are defined on distinct non-adjacent phrase markers in derivations and that are different from standard transformational apparatus. Lakoff calls these new rules 'derivational constraints'. He then attempts to minimize the difference between these rules and grammatical transformations by redefining the latter in a way that reveals a maximum formal similarity between such transformations and his newly postulated rules. Finally, Lakoff expresses this similarity by claiming that all traditional grammatical devices can be characterized as comprising distinct subtypes of derivational constraints. Recently, Paul Postal has claimed that Lakoff's characterization of grammars is "a fundamental theoretical clarification." (Postal 1972, 139) One of Postal's main reasons for thinking this is that he approves of the way in which Lakoff handles rule ordering. After saying that by a filter he means a rule that marks some derivations as ill-formed even though they "may be perfectly well-formed as far as the base rules and transformations are concerned," (ibid. 138) Postal adds that
... a rich set of filters has been implicitly part of transformational theory since the beginning. I refer to what has been called 'rule ordering.' One can regard rule ordering as consisting of a set of filters which throw out all derivations, thought of as generated by randomly ordered rules, whose trees are not aligned by a sequence of rules which meet the ordering condition. A formal account of rule ordering in these terms is given by G. Lakoff. I note only that this point of view brings out clearly that ordering statements must be regarded as actual grammatical rules, a right conclusion, I think, since in many cases ordering statements compete with other types of grammatical apparatus ... I see no reason whatever for assuming, as in the past, that ordering statements are freely available while other types of statements require justification. This asymmetry has, as far as I know, never been justified. (ibid. 140-1)

In what follows I will try to show that the picture that Postal presents is misleading in several respects. First, it is a mistake to call Lakoff's abstract characterization of grammars "a fundamental theoretical clarification." It is a mistake because (a) Lakoff's characterization of transformations countenances devices which do not function in the way in which grammatical transformations must and excludes from the class of grammatical transformations all of those devices which have traditionally been thought to make up this class; (b) his treatment of rule ordering weakens the empirical claims made by linguistic theory by contradicting two restrictive and widely held assumptions regarding such orderings; and (c) the distinction that he draws between optional and obligatory transformations renders his system either incoherent or inconsistent. Second, although there are several different ways of writing internally consistent grammars that treat rule ordering roughly in the way in which Lakoff suggests, none of them has "been implicitly part of transformational theory since the beginning." (ibid. 140) Finally, although it is clearly Lakoff's intention that the justification of the rule ordering statements of a grammar be regarded as fully on a par with the justification of language particular transformations, there are no a priori grounds for accepting such a view; nor does Lakoff cite any empirical evidence in its favor. Far from favoring this position, the relevant a priori considerations give us reason to reject it. Despite the fact that Postal presents a misleading picture of the merits of Lakoff's proposal, he is right in thinking that the abstract characterization of rule orderings in a grammar can have important consequences. One of the things that I will try to do is to spell out what some of these consequences are. Of particular importance in this respect is the relationship between the characterization of rule orderings in grammars and the formalization of the distinction between optional and obligatory transformations. Thus, in addition to explicating and criticizing Lakoff, I will construct abstract characterizations of grammar which differ in how they formalize these two notions. These characterizations will then be compared with respect to the extent and manner in which they restrict the class of possible grammars. It should be emphasized, however, that, although I will attempt to specify the types of empirical considerations that are relevant to accepting or rejecting the theories in question, I will not attempt to determine which is the empirically more highly confirmed theory. In this respect, my investigation is almost entirely a priori and aimed at clarification. I hope, however, that the proposals that I make will result not only in a clearer understanding of the formal nature of linguistic theory, but also in the posing of new questions which may guide empirical research.

III.
In "On Generative Semantics" Lakoff takes over and modifies a way of abstractly characterizing grammars that was introduced by Chomsky (1971) 183/4. Each grammar is characterized as defining an infinite class K of finite sequences of phrase markes P! ... Pn, each of which meets the following conditions.

122 (5)

Scott Soames (i) Pn is a surface structure. (ii) Each pair P| and Pi+1 meets the well-formedness constraints defined by transformations, (iii) There is no P0 such that P0, PI Pn meets (i) and (ii).

One way of putting the intent of this view is to say that for a grammar G, the class K defined by G is the class of completed derivations in G. From this it follows that the class of well formed derivations in G is a subset of K.4 However, it does not follow that K is the class of well formed derivations. K will fail to be this class just in case a grammar contains constraints on the well formedness of derivations that are not transformational constraints on successive pairs of phrase markers. In fact, Lakoff argues, grammars contain many different types of non-transformational constraints. Among them are what previously have been called constraints on the order of application of transformations and constraints on the operation of transformations. These Lakoff calls "global derivational constraints," transformations being dubbed "local derivational constraints."5 In order for a sequence of phrase markers to be a well formed derivation in G, it must meet two conditions. First, it must be a member of K. Second, it must satisfy all global derivational constraints.6

4 It may be noted that the notion of a grammatical derivation as a finite sequence of phrase markers P1 ... Pn in which each Pi (1 < i ≤ n) follows from Pi-1 by virtue of some transformational rule bears an obvious resemblance to the notion of a derivation in standard logical systems. In the case of such logical systems no distinction is made between well formed and ill formed derivations. However, it should not be thought that this is an objection to Lakoff's notion of a well formed derivation in a grammar. Lakoff believes that this notion is needed to restrict the set of abstract objects defined by a grammar as characterizing the structure of a language. So long as the difference between Lakoff's terminology and the terminology used in standard logical systems is remembered, no confusion should result. Thus, for purposes of this paper, I will follow Lakoff in speaking of well formed derivations in a grammar and even of well formed pairs of successive phrase markers that appear in such derivations. It should also be understood that a derivation is well formed only if it is completed. Just what this means will be explained in a moment.

5 It should be noted that Lakoff uses the term 'global derivational constraint' to cover two quite different notions: (i) conditions on transformations (e.g. rule orderings and conditions on the application of transformations) and (ii) conditions on non-adjacent segments of derivations that don't have anything to do with the operation of transformations. Rules of the former type state restrictions on the occurrence of pairs of successive phrase markers in derivations, whereas the latter, like the constraints that relate semantic representations and shallow structure, typically relate individual phrase markers that occur at different levels.

6 An important omission in Lakoff's account is the lack of a formal explication of the cycle. In what follows, I will ignore this omission and criticize Lakoff on independent grounds.
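Lakoff's architecture, so construed, is a two-stage filter, and a brief sketch may fix ideas. It is an illustration only (Python, with invented names), transformations being represented extensionally as functions from a phrase marker to its possible outputs.

def in_K(derivation, transformations):
    # A sequence of phrase markers belongs to K iff every successive pair
    # is licensed by some transformation (a local derivational constraint).
    return all(any(nxt in t(cur) for t in transformations)
               for cur, nxt in zip(derivation, derivation[1:]))

def well_formed(derivation, transformations, global_constraints):
    # A derivation is well formed iff it is a member of K and in addition
    # satisfies every global derivational constraint, each taken here as
    # a predicate on the derivation as a whole.
    return (in_K(derivation, transformations)
            and all(g(derivation) for g in global_constraints))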


Unfortunately, there are immediate difficulties with this view, the chief of which is that the constraint imposed by (5) on initial structures in derivations is too weak. Consider, for example, the following phrase marker:
(6) [tree diagram not reproducible in this copy; its surviving labels are: VP, 'under the idea', 'the man', 'the machine proved the theorem', 'the theorem']
There is no constituent in English that has the structure (7).

(7) [tree diagram not reproducible in this copy]

Thus, there is no derivation in English in which (6) appears. However, since (6) is analyzable into a string of constituents having the structure NP, V, NP, X, it is associated with (8) by the passive transformation.

(8) [tree diagram not reproducible in this copy; its surviving labels are: 'Mary', 'the machine proved the theorem']

This means that ⟨(6), (8)⟩ satisfies (ii) of (5). What then is to prevent P1, P2 (P1 = (6), P2 = (8)), or P1 ... Pn-1, Pn (Pn-1 = (6), Pn = (8)), or P1 ... Pi, Pi+1 ... Pn (Pi = (6), Pi+1 = (8)) from being completed derivations in G?

It is conceivable that condition (iii) could rule out the first alternative, but not the other two. Nor is it clear that condition (i) can do the job, since the claim that a phrase marker is a surface structure is usually taken to mean that it is the last phrase marker in a completed derivation. But if this is so, then (i) cannot be one of the defining conditions of a completed derivation. In light of these difficulties, I propose the following modification of Lakoff's proposals. First, we may define the class K of derivations in a grammar G as the class of finite sequences of phrase markers P1 ... Pn such that (i) P1 is generated by the base component of G and (ii) each pair ⟨Pi, Pi+1⟩ meets the well formedness constraints defined by transformations. Second, we shall define the class K' of completed derivations in G as the class of derivations in G which are such that for all obligatory transformations T in G, either Pn (the final phrase marker in the derivation) is not in the domain of T or applying T to Pn would violate constraints on rule ordering. Third, the class of well formed derivations in G consists of those completed derivations in G that satisfy all nontransformational derivational constraints. This is the framework that I will presuppose in my evaluation of the central aspects of Lakoff's characterization. Having come this far it is important to distinguish what is at issue regarding Lakoff's characterization and what is not. One thing that is not at issue is Lakoff's claim that grammars can be abstractly characterized solely in terms of derivational constraints. This claim is not at issue since, lacking an explanation of what sort of devices count as derivational constraints, it is not clear that anything is ruled out by such a claim. Rather, the things that are at issue are (i) the particular proposals that Lakoff makes for construing traditional devices like transformations and rule orderings as derivational constraints and (ii) the introduction of new types of grammatical rules that are defined on non-adjacent segments of derivations. The second of these issues has been discussed in a number of papers and, consequently, will not be discussed here. (cf., e.g., Chomsky (1972), Postal (1972)) The first, however, has seldom been given much attention. It is with it that I will be concerned in this paper. So far, we have seen that in order for a sequence of phrase markers to be a member of K each successive pair of phrase markers in the sequence must "meet the well formedness constraints defined by transformations." (Lakoff (1971) 232) We do not yet know, however, just how transformations are to be construed as derivational constraints; nor, if they are, do we know what it means to say that a pair of phrase markers ⟨Pi, Pi+1⟩ meets the constraints defined by transformations. With regard to the first point, Lakoff says that a transformation is

an ordered pair of tree conditions ⟨C1, C2⟩. C1 corresponds to what has usually been called the structural description of the transformation, whereas C2 imposes a similar condition on the factorization of the output of the transformation. Further, for any tree condition C, a phrase marker P is said to satisfy C (i.e. P/C) iff it can be exhaustively analyzed into a string of constituents having the structure specified by C. In light of this, it would be natural to suppose that a pair of phrase markers ⟨Pi, Pi+1⟩ meets the constraints defined by transformations iff there is a transformation ⟨C1, C2⟩ such that Pi/C1 and Pi+1/C2. However, such a characterization utterly misrepresents the nature of transformations. This can be readily seen if it is remembered that a transformation is a rather restrictive kind of binary relation, the domain of which is the set of phrase markers analyzable in accordance with its structural description. In saying that a transformation is a restrictive type of binary relation, I mean to draw attention to the fact that an element in the domain of a transformation is always correlated with a rather small number of phrase markers in its range. In fact, the only cases in which the correlation is not one-one are those, like the double passive, in which there is more than one way to factor a phrase marker in accordance with the structural description of the transformation in question.7 It is just this sort of restrictiveness that Lakoff's characterization of transformations as ordered pairs of tree conditions fails to capture. To see this, suppose that some transformation T = ⟨C1, C2⟩, that A is the set of phrase markers that satisfies C1 and that B is the set of phrase markers that satisfies C2. Then, it is obvious that for all a ∈ A and b ∈ B, ⟨a, b⟩ satisfies T. Thus, to say that a transformation T = ⟨C1, C2⟩ and that ⟨Pi, Pi+1⟩ meets the well formedness constraints defined by T iff Pi/C1 and Pi+1/C2 is tantamount to saying that T is a binary relation which associates each member of its domain with every member of its range. The difficulties with such a view can be revealed with the help of examples. Chomsky, for instance, has pointed out that a transformation whose sole effect is to interchange constituents of the same type would be one whose input and output phrase markers satisfy exactly the same conditions on factorization. But if transformations just are ordered pairs of tree conditions, then such a transformation could only have the form ⟨C1, C1⟩, the same form that the identity transformation would have to be given in Lakoff's system. This result alone is disastrous, since the identity mapping is not identical with a mapping that interchanges constituents.8 However, it is worth noting that this example is just one example of a larger problem. Suppose, for instance, that C1 ≠ C2. Still, if there is a phrase marker Q such that (i) Q/C2 and (ii) Q contains two constituents of the same category (no matter what that category may be and no matter whether or not that category is mentioned in C2), then the phrase marker Q' which differs from Q only in having these constituents interchanged is such that Q'/C2. What this means is that the "transformation" ⟨C1, C2⟩ won't be able to distinguish ⟨Pi, Q⟩ from ⟨Pi, Q'⟩. As a result, one such pair will be well formed iff the other is. Finally, an exactly analogous result holds in the case of phrase markers which differ only in their terminal elements. That is, if ⟨Pi, Q⟩ satisfies ⟨C1, C2⟩ then so does ⟨Pi, Q*⟩, where Q* is identical with Q except for containing completely different terminal elements.9

Each of these difficulties can be illustrated by considering what the results of the passive transformation would be if it were thought of as an ordered pair of tree conditions ⟨C1, C2⟩ of the type under discussion. In order to achieve empirical adequacy the phrase marker underlying (9) would have to satisfy C1, and the phrase marker underlying (10) would have to satisfy C2.

(9) John gave a watch to Susan.
(10) A watch was given to Susan by John.

Roughly, we can express this by letting C1 = NP, V, NP, X and C2 = NP, be+en, V, X, by NP. However, if these two conditions define the passive, then each of the following must also be passives of (9).

(11) John was given a watch by Susan.
(12) A watch was given to John by Susan.
(13)*? John was given Susan by a watch.
(14) George was given a kiss by Mary.
(15)* A watch was given said Bill to Susan by John.
(16)* A watch was given Bill said that the earth is flat to Susan by John.

In light of these results we must reject the view that (i) transformations are ordered pairs of tree conditions in the sense just illustrated and that (ii) a pair of phrase markers ⟨Pi, Pi+1⟩ is well formed just in case there is a transformation ⟨C1, C2⟩ such that Pi/C1 and Pi+1/C2. However, it should not be thought that it is formally impossible to adequately characterize transformations using some notion of tree conditions. A step in this direction would be to include numerical indices in certain tree conditions. For example, in the case of the passive, C2 might include indices on the NPs, V, and X (e.g. NP3, be+en, V2, X4, by NP1). Of course, if this were done, we would have to give up the idea that it makes sense to ask of a phrase marker Q, considered in isolation, whether or not it satisfies such a tree condition. Rather, for any tree condition C2 that is the second co-ordinate of a transformation ⟨C1, C2⟩ we would have to stipulate that a phrase marker Pi+1 in a sequence P1 ... Pn satisfies C2 only if (i) Pi+1 is analyzable into a string having the structure specified by C2 and (ii) for all numerical subscripts n occurring on symbols A in C2, the substring in Pi+1 corresponding to A must be identical with the nth substring in an analysis of Pi that satisfies C1. Then, by appropriately numbering the symbols that occur in C2 (which in the case of the passive would involve indexing everything except 'be+en' and 'by') we might avoid the problems just cited. It should be noted, however, that adopting these conventions regarding what it means for a phrase marker to satisfy the second co-ordinate of a pair ⟨C1, C2⟩ reduces the notion of a tree condition to complete vacuity. Since first and second co-ordinates of such pairs are so different, it is incongruous to claim that both are tree conditions. More importantly, given the second of the two conventions, the first becomes unnecessary. For example, in the case of the passive, C2 could be represented '3, be+en, 2, 4, by 1'. The categorization of these terms (NP, be+en, V, X, by NP) is predictable from C1 together with convention (ii) above. Clearly, however, there is no intuitive sense in which '3, be+en, 2, 4, by 1' is a tree condition. This brings us back to the characterization of transformations that I gave earlier. According to it, the structural description of a transformation defines its domain, and the structural change consists of a set of instructions for determining which phrase marker(s) is associated with each phrase marker in the domain. Since this characterization both avoids the difficulties that plague Lakoff's proposal and defines the notion of a transformation in a perspicuous way, I will assume that it is correct when making my own proposals regarding rule orderings and obligatory transformations.

7 Sentences (ii) and (iii) provide an example of the double passive. (i) Everyone took advantage of John. (ii) John was taken advantage of by everyone. (iii) Advantage was taken of John by everyone. (ii) and (iii) can both be derived from the phrase marker underlying (i) because there are two different ways in which that phrase marker can be analyzed into a string of constituents having the structure specified by the passive transformation.

8 Chomsky makes this point in footnote 19 of Chomsky (1972).

9 Because Lakoff says so little about what tree conditions are supposed to be, it might be thought that he could reply to my argument by claiming that he means something other by a tree condition than what I have taken him to mean. As against this I should point out that the substance of my criticism holds no matter what interpretation is given to tree conditions, so long as they are such that it makes sense to speak of an individual phrase marker, considered in isolation, as either satisfying or not satisfying a given tree condition. That Lakoff does speak of such conditions in this way is indicated by the following: "A transformation, or local derivational constraint, is a conjunction of the form Pi/C1 and Pi+1/C2, where C1 and C2 are tree conditions defining the class of input trees and the class of output trees respectively" (p. 233) (my emphasis). If C1 defines the class of input trees and C2 defines the class of output trees, then, as I have shown, if A is the set of trees that satisfy C1 and B is the set of trees that satisfy C2, then for all a ∈ A and b ∈ B, ⟨a, b⟩ satisfies ⟨C1, C2⟩.
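How the indexing convention operates can again be shown mechanically. The sketch below is illustrative only (Python, invented names): the output condition is stated as the index string '3, be+en, 2, 4, by 1', and satisfaction is checked, per convention (ii), against an analysis of the input rather than against the output considered in isolation.

def satisfies_indexed_condition(input_analysis, output_factors, condition):
    # input_analysis: the substrings t1 ... t4 of an analysis of the input
    # that satisfies C1; output_factors: the substrings of the candidate
    # output; condition: a list whose integer entries are indices into the
    # input analysis and whose string entries are constants ('be+en', 'by').
    if len(output_factors) != len(condition):
        return False
    for got, want in zip(output_factors, condition):
        expected = input_analysis[want - 1] if isinstance(want, int) else want
        if got != expected:
            return False
    return True

passive_change = [3, 'be+en', 2, 4, 'by', 1]
inp = ['John', 'gave', 'a watch', 'to Susan']                     # analysis of (9)
out_ok = ['a watch', 'be+en', 'gave', 'to Susan', 'by', 'John']   # cf. (10)
out_bad = ['John', 'be+en', 'gave', 'to Susan', 'by', 'a watch']  # cf. (13)
print(satisfies_indexed_condition(inp, out_ok, passive_change))   # True
print(satisfies_indexed_condition(inp, out_bad, passive_change))  # False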

IV.

This brings me to my second main criticism of Lakoff, namely that the distinction that he draws between obligatory and optional transformations is either incoherent or inconsistent. According to Lakoff
A derivation will be well formed only if for all i, 1 ≤ i < n, each pair of phrase markers ⟨Pi, Pi+1⟩ is well formed. Such a pair will, in general, be well formed if it meets some local derivational constraint. There are two sorts of such constraints: optional and obligatory. To say that a local derivational constraint, or transformation, ⟨C1, C2⟩ is optional is to say:
(x) [Px/C1 ⊃ (Px+1/C2 ⊃ ⟨Px, Px+1⟩ is well formed)]
To say that ⟨C1, C2⟩ is obligatory is to say:
(x) [Px/C1 ⊃ (Px+1/C2 ≡ ⟨Px, Px+1⟩ is well formed)] (Lakoff 1971, 233)

The first thing to notice about these remarks is that, as they stand, they are confused. What they seem to assert, taken in conjunction with (5) (or with my reformulation of (5)), is that

(17) A sequence of phrase markers is a member of K only if every pair of successive phrase markers ⟨Pi, Pi+1⟩ in the sequence is well formed.
(18) A pair of phrase markers ⟨Pi, Pi+1⟩ is well formed iff
(a) There is some optional transformation ⟨C1, C2⟩ such that (x) [Px/C1 ⊃ (Px+1/C2 ⊃ ⟨Px, Px+1⟩ is well formed)] or
(b) There is some obligatory transformation ⟨C1, C2⟩ such that (x) [Px/C1 ⊃ (Px+1/C2 ≡ ⟨Px, Px+1⟩ is well formed)].10

However, this clearly won't do. First, to adopt (17) and (18) would be to fail to formally define the notion of well formedness as it applies to successive pairs of phrase markers; for what looks like a definition, namely (18), mentions the term to be defined on both sides of the biconditional. Second, to say that a specific pair of phrase markers ⟨Pi, Pi+1⟩ is well formed iff some universal statement of the form mentioned in (a) or (b) is true of the whole sequence of phrase markers of which ⟨Pi, Pi+1⟩ is a part is just bizarre. Finally, any pair of phrase markers ⟨Pi, Pi+1⟩ which is a member of a sequence none of whose members satisfies the structural description of some one transformation will automatically be well formed by falsity of the antecedent of the statement corresponding to that transformation. Surely these results are undesirable. Thus, if we are to make sense of Lakoff's remarks, some other interpretation is needed.
10 Although Lakoff doesn't tell us what the bound variables in (18) are supposed to range over, it seems fair to take them as ranging over the subscripts of phrase markers in an ordered sequence of phrase markers. First, it is only in such sequences that phrase markers have subscripts (and, of course, they have different subscripts in different sequences). Second, Lakoff uses the same notation in stating rule ordering constraints, where the bound variables must be interpreted in this way.

Perhaps, then, we ought not to interpret Lakoff as offering a definition of the notion of well formedness as it applies to pairs of phrase markers. Instead, we may be able to construe his remarks as giving axioms regarding the use of this notion. According to this interpretation, we can keep (17) above. However, in place of (18) we will have a list of statements of the forms mentioned in (a) and (b), one for each transformation. Then, to find out whether a given pair of phrase markers ⟨Pi, Pi+1⟩ is well formed, one instantiates each of the universally quantified formulas, substituting the constant "i" for the variable "x". If one of the resulting formulas of the form mentioned in (b) has a true antecedent, then we can derive by modus ponens a biconditional that states necessary and sufficient conditions for the well formedness of ⟨Pi, Pi+1⟩. Similarly, if an instantiation of one of the formulas of the form mentioned in (a) has a true antecedent, then modus ponens gives us a sufficient, but not necessary, condition for the well formedness of ⟨Pi, Pi+1⟩. These claims are comprehensible. The question now is "Are they adequate?" As they stand, they are not. The reason for this is that nothing has been said to assure us that our axioms regarding well formedness are consistent. To see this, consider the situation in which a phrase marker Pi satisfies the structural description of transformations ⟨C1, C2⟩ and ⟨C3, C4⟩, only one of which need be obligatory. Suppose, for purposes of illustration, that ⟨C1, C2⟩ is obligatory. Suppose further that Pi+1/C4 & ¬Pi+1/C2. Then, clearly, ⟨Pi, Pi+1⟩ is both well formed and not well formed. Since this is a contradiction, any set of axioms that allows this possibility is inconsistent and must be rejected. It might be thought that on our present interpretation Lakoff could be saved from inconsistency by adopting constraints on the class of allowable transformations. That is, it might be thought that we could constrain the class of transformations so that the possibility considered above could never arise. However, this is not the case. For consider what constraints would have to be imposed. To avoid contradiction we would have to demand that every obligatory transformation T be such that, for all other transformations T', either the structural descriptions of T and T' are incompatible or the conditions imposed on the outputs of T and T' are identical. If a grammar didn't meet this demand, then for some obligatory ⟨C1, C2⟩ there would be a transformation ⟨C3, C4⟩ and a pair of phrase markers Pi and Pj such that Pi/C1 & Pi/C3, and Pj/C4 & ¬Pj/C2. However, from this it would follow that ⟨Pi, Pj⟩ (i.e. letting j = i+1) is both well formed and not well formed. Further, on Lakoff's theory no amount of rule ordering can save us from this result. This is the case since rule orderings are constraints defined on members of K, whereas what we are worried about now is specifying the class of well formed pairs of phrase markers as a means of defining K itself. Finally, it is worth noting that the problem that Lakoff encounters here is independent of his misrepresentation of the nature of transformations.

That is, if it is granted that transformations are binary relations in the restrictive sense defined above, then we could just as well take Lakoff as claiming that for each obligatory transformation T there is a statement that
(i) [Pi ∈ domain of T ⊃ (⟨Pi, Pi+1⟩ ∈ T ≡ ⟨Pi, Pi+1⟩ is well formed)]
and that for each optional transformation T' there is a statement that
(j) [Pj ∈ domain of T' ⊃ (⟨Pj, Pj+1⟩ ∈ T' ⊃ ⟨Pj, Pj+1⟩ is well formed)]
(where 'i' and 'j' range over numerical subscripts of phrase markers in derivations). Then, to avoid contradiction, we would have to restrict the class of transformations in a grammar by demanding that every obligatory transformation T be such that for every other transformation T'' and every phrase marker Pi, if Pi ∈ domain of both T and T'', then for all Pj, ⟨Pi, Pj⟩ ∈ T iff ⟨Pi, Pj⟩ ∈ T''. In practice what this restriction would amount to is the absurdly strong demand that the structural description of every obligatory transformation be incompatible with (not just distinct from) the structural description of every other transformation in the grammar. It would amount to this because otherwise we would have to imagine two distinct transformations yielding identical outputs from the same input. Anyone familiar with the actual work of generative grammarians will recognize that none of the transformations that have been empirically motivated have this property.
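The threatened inconsistency can be exhibited concretely. The sketch below is illustrative only (Python, invented names): reading each obligatory statement as a biconditional axiom, two rules that share an input but disagree about an output immediately classify one pair as both well formed and not well formed.

def verdicts(p, p_next, rules):
    # rules: list of (name, is_obligatory, relation), a relation being a
    # set of (input, output) pairs. Returns the verdict each axiom issues
    # (if any) about the pair <p, p_next>.
    out = []
    for name, obligatory, rel in rules:
        if not any(a == p for a, _ in rel):
            continue                     # antecedent false: axiom is silent
        if (p, p_next) in rel:
            out.append((name, True))     # sufficient condition satisfied
        elif obligatory:
            out.append((name, False))    # biconditional: NOT well formed
    return out

T1 = ('T1, obligatory', True, {('P', 'Q1')})
T2 = ('T2, optional', False, {('P', 'Q2')})
print(verdicts('P', 'Q2', [T1, T2]))
# [('T1, obligatory', False), ('T2, optional', True)]: a contradiction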

V.

Having shown that both Lakoff's characterization of transformations and his distinction between optional and obligatory rules are inadequate, I now turn to what he says about rule ordering. One of the facts that Postal takes to be central to Lakoff's characterization is that rule ordering constraints are to be represented in grammars by independent theoretical statements. Specifically, such statements are supposed to define well formedness conditions on members of K. About this, Lakoff says
Rule orderings, for example, are given by global derivational constraints, since they specify where in a given derivation two local derivational constraints can hold relative to one another. Suppose ⟨C1, C2⟩ and ⟨C3, C4⟩ define local derivational constraints. To say that ⟨C1, C2⟩ is ordered before ⟨C3, C4⟩ is to state a global constraint of the form:
(i) (j) [(Pi/C1 & Pi+1/C2 & Pj/C3 & Pj+1/C4) ⊃ i < j] (Lakoff 1971, 234)

Adopting the characterization of transformations that I suggested earlier, we can take Lakoff as asserting that, if T1 and T2 are transformations, then the statement that T1 is ordered before T2 imposes the following constraint on members of K:
(i) (j) [(⟨Pi, Pi+1⟩ ∈ T1 & ⟨Pj, Pj+1⟩ ∈ T2) ⊃ i < j]
(where 'i' and 'j' range over subscripts of phrase markers in a derivation)

The intent of such constraints is to define a subset of K by eliminating all members of K that, intuitively, are produced by applying rules in the wrong order. There are two major points to notice regarding this proposal. First, if it is accepted, then there is every reason to believe that grammars will be allowed to vary in the extent to which they impose rule ordering, just as there is every reason to believe that grammars vary in the number of language particular transformations that they allow. Second, this characterization does not require that the ordering relation holding between transformations be transitive. In fact, it leads one to expect that it is not. This second point can be readily grasped by considering a grammar that contains three transformations: T, T*, and T#. Suppose further that the grammar orders T before T* and T* before T#. Given Lakoff's characterization of rule ordering, to say this is just to say that the grammar contains the following two derivational constraints.

(19) (i) (j) [(⟨Pi, Pi+1⟩ ∈ T & ⟨Pj, Pj+1⟩ ∈ T*) ⊃ i < j]
(20) (i) (j) [(⟨Pi, Pi+1⟩ ∈ T* & ⟨Pj, Pj+1⟩ ∈ T#) ⊃ i < j]

However, from (19) and (20) one cannot deduce (21).

(21) (i) (j) [(⟨Pi, Pi+1⟩ ∈ T & ⟨Pj, Pj+1⟩ ∈ T#) ⊃ i < j]

Suppose, for example, that W is a member of K which is such that for all ⟨Pi, Pi+1⟩ in W, ⟨Pi, Pi+1⟩ ∉ T*. In a case like this it is not inconsistent to suppose both that (19) and (20) are true of W (by falsity of antecedent) and that for some Pi and Pj in W, ⟨Pi, Pi+1⟩ ∈ T and ⟨Pj, Pj+1⟩ ∈ T# and i > j. What this means is that according to Lakoff's characterization of rule ordering, the fact that T is ordered before T* and T* is ordered before T# does not guarantee that T is ordered before T#.11 Of course, it is open to Lakoff to accept (21) as a separate grammatical rule in the case just described. If he were to do so, then T, T* and T# would be linearly ordered in the usual sense. However, if it is the case that whenever grammars impose orderings, the orderings imposed are linear, then we want a theory that is not just compatible with this result, but which predicts it. Therefore, if we want to preserve the idea that all orderings are linear, we must reject Lakoff's characterization of such devices. This point can be brought out more clearly by comparing Lakoff's characterization of rule ordering with an alternative characterization which requires all rules that are ordered to be linearly ordered. According to this characterization, universal grammar contains a single rule ordering statement (22).
11 It doesn't even guarantee that T# is not ordered before T.
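The loophole of vacuous satisfaction is easy to check mechanically. The sketch below is illustrative only (Python, invented names): a Lakoff-style pairwise constraint holds of a derivation in which one of the two rules never applies, so (19) and (20) can both be true while the ordering of T before T# is violated.

def ordered_before(steps, a, b):
    # steps[i] names the transformation licensing the pair <P_i, P_i+1>.
    # "a before b" in Lakoff's sense: whenever a licenses step i and b
    # licenses step j, i < j; vacuously true if either never applies.
    return all(i < j
               for i, x in enumerate(steps) if x == a
               for j, y in enumerate(steps) if y == b)

steps = ['T#', 'T']                         # T* never applies here
print(ordered_before(steps, 'T', 'T*'))     # True, vacuously: (19) holds
print(ordered_before(steps, 'T*', 'T#'))    # True, vacuously: (20) holds
print(ordered_before(steps, 'T', 'T#'))     # False: (21) fails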

(22) For every generative grammar G, (i) if T' and T'' are transformations of G to which G assigns numerical subscripts m and n respectively, (ii) and if m < n, then (iii) for all sequences of phrase markers W, ⟨P1 ... Pn⟩, such that W ∈ the class K defined by G, and for all Pi, Pj ∈ W, if ⟨Pi, Pi+1⟩ ∈ T' and ⟨Pj, Pj+1⟩ ∈ T'', then W is a well formed derivation in G only if Pi precedes Pj in W.

I will use the following formula, considered as a part of universal grammar, as an abbreviation for this constraint:

(m) (n) [m < n ⊃ [(i) (j) [(⟨Pi, Pi+1⟩ ∈ T'm & ⟨Pj, Pj+1⟩ ∈ T''n) ⊃ i < j]]]

(Where the variables 'i' and 'j' range over subscripts of phrase markers in derivations and the variables 'n' and 'm' range over subscripts of transformations in the grammar of a language.) Individual grammars can impose rule ordering constraints by assigning positive integers to transformations that need to be ordered. It is clear that this proposal requires rule orderings to be linear, since the indices of transformations are drawn from a linearly ordered system, namely the sequence of natural numbers under successor. However, there are two important features of Lakoff's original proposal that are shared by this one. First, rule ordering constraints are global constraints defined on members of K (even though they are not language particular rules). Second, grammars may vary in the amount of rule ordering that they impose. In evaluating this characterization of rule ordering against Lakoff's, there is one important consideration to bear in mind. That is, whereas Lakoff's proposal is compatible with the possibility that all, some, or no transformational orderings are linear, the alternative just constructed is compatible only with the first of these three possibilities. Thus, the characterization that incorporates (22) is more restrictive than Lakoff's in that it makes a stronger prediction about the nature of natural languages. Unless there is empirical evidence that contradicts this prediction, Lakoff's characterization of rule ordering must be rejected in favor of the alternative just constructed. In light of this, it would be natural to expect Lakoff to attempt to provide some empirical justification for his proposal. However, not only does he make no such attempt, he seems not to be aware that any empirical justification is necessary. Rather, he seems to regard his characterization of rule ordering as uncontroversial. For example, he says that the framework that he constructs is one which "Most of the work in generative semantics since 1967 has assumed." (Lakoff 1971, 236). In addition, he says that if we restrict the theory that he presents by limiting "global derivational constraints to those which specify rule order" (and if we make certain other restrictions which are irrelevant to our

discussion), then "The resulting restricted version of the basic theory is what Chomsky describes as a version of the standard theory." (ibid. 268). All of this suggests that Lakoff believes that his characterization of rule ordering captures assumptions that, in Postal's words, have "been implicitly part of transformational theory since the beginning." (Postal 1972, 140) However, since Lakoffs characterization of rule ordering can be accepted only if some grammars impose nonlinear orderings and since generative grammarians have traditionally assumed that all orderings are linear, it seems fair to conclude that neither Lakoff nor Postal is aware of the consequences of LakofPs proposal. Because of this and because no empirical evidence has been brought to bear against the methodologically superior proposal that employs (22), I will assume in what follows that all transformational orderings are linear.12
VI.

We are now in a position to define the notion of an obligatory transformation. To do this we must determine exactly what we take obligatory transformations to be. The standard intuitive characterization of these rules is that they are transformations that cannot be allowed not to apply. That is, they are rules that cannot be "passed over" in their turn in the ordering if their structural descriptions are met. This remark, though far from precise, tells us one very important thing, namely, that in the standard view the notion of an obligatory transformation is defined partly in terms of rule ordering considerations. Given this much, we can determine precisely which members of K violate the constraints defined by obligatory transformations. The offending derivations are those that contain a phrase marker Pi such that

(i) Pi ∈ domain of some obligatory transformation Tn
(ii) ⟨Pi, Pi+1⟩ ∉ Tn
(iii) ¬∃m [m < n & ⟨Pi, Pi+1⟩ ∈ Tm]
(iv) (r) (j) [(r > n & ⟨Pj, Pj+1⟩ ∈ Tr) ⊃ j ≥ i]
(v) (T) [T is unordered ⊃ ⟨Pi, Pi+1⟩ ∉ T]13

(Where the variables 'n', 'm' and 'r' range over numerical subscripts of transformations in a grammar, the variables 'i' and 'j' range over numerical subscripts of phrase markers in a derivation, and 'T' ranges over transformations in a grammar.)
12 Though I intend to leave open for the moment whether or not all grammars completely order their transformations.

13 To say that a transformation is unordered is to say that there are no constraints affecting when it can apply. This is reflected in our formalism by not assigning a numerical subscript to it.

Intuitively, what these conditions state is that if a derivation D defined by a grammar G is such that

(i) It contains a phrase marker Pi that satisfies the structural description of some obligatory transformation Tn,
(ii) In constructing D, Tn did not apply to Pi,
(iii) Failure to apply Tn to Pi was not the result of the fact that Tn had not yet been reached in the ordering of transformations at the time that Pi was an available input to transformations,
(iv) Failure to apply Tn to Pi was not the result of the fact that Tn had already been passed in the ordering of transformations at the time that Pi was available, and
(v) ⟨Pi, Pi+1⟩ was not produced by a rule upon which no ordering constraints are stated,

then D is not a well formed derivation in G. The reason that these conditions are adequate to define the standard notion of an obligatory transformation is that to say that a derivation satisfies them (and hence is not well formed) is just to say that the derivation was produced by allowing an obligatory transformation to be passed over in its turn in the ordering even though its structural description was met. Finally, by collapsing these conditions we can formally define the notion of an obligatory transformation. Thus, a transformation Tn of a grammar G is obligatory in G iff G contains a constraint of the following form.

(23) (i) [[⟨Pi, Pi+1⟩ ∉ Tn & ¬∃m [m < n & ⟨Pi, Pi+1⟩ ∈ Tm] & (T) (T is unordered ⊃ ⟨Pi, Pi+1⟩ ∉ T) & (r) (j) ((r > n & ⟨Pj, Pj+1⟩ ∈ Tr) ⊃ j ≥ i)] ⊃ Pi ∉ domain of Tn]

Only those members of K that satisfy all constraints of the form (23) are well formed derivations in G. (Note, to say that a derivation W fails to satisfy (23) is logically equivalent to saying that it satisfies conditions (i)-(v) above.) I claim that this characterization captures what we standardly take obligatory transformations to be. However, there is one caveat that must be added. Since the above characterization makes use of rule ordering considerations, the extent to which some subset of the rules of a grammar corresponds to what we standardly take obligatory transformations to be depends upon the extent to which the grammar orders its rules. There are three cases to consider. First, if a grammar orders all of its rules, then the obligatory transformations of that grammar will correspond perfectly with what we standardly take obligatory transformations to be. Second, if a grammar orders none of its rules, then, according to the characterization just given, it can have no obligatory transformations. Finally, if a grammar stands somewhere between these two extremes, then (i) none of its unordered rules can be obligatory and (ii) the application of an unordered rule can keep an obligatory transformation from applying (if it destroys the relevant environment).14 Of course, for one who thinks that grammars can vary in the amount of rule ordering that they impose, considerations like these might lead one to try to define the notion of an obligatory transformation independently of rule ordering considerations. Two obvious ways of doing this would be to require (a) that every obligatory transformation apply to every phrase marker in a derivation that meets its structural description or (b) that if some phrase marker Pi in a derivation satisfies the structural description of an obligatory transformation T, then that derivation must also contain some pair ⟨Pj, Pj+1⟩ such that ⟨Pj, Pj+1⟩ ∈ T. Unfortunately, although the constraints imposed by these characterizations are interesting, their empirical adequacy has never been seriously examined. Thus, we are not now in a position to authoritatively determine whether or not the notion of an obligatory transformation should be defined independently of rule ordering considerations. Still, the analysis just presented does enable us to specify the empirical considerations that bear on this question. If it could be independently determined that (i) either (a) or (b) is satisfied by all well formed derivations or that (ii) some grammars require unordered but obligatory rules, then we would have some evidence for characterizing obligatory transformations independently of rule orderings. On the other hand, if it could be shown that neither of these two conditions is met, then the more standard characterization of rule ordering would be sustained. Finally, it should be noted that the task of gathering evidence in support of one or another of these characterizations is a matter of some importance, since the incorporation of a characterization like (a) into linguistic theory could, if correct, further restrict the class of possible grammars. If such a restriction were not accompanied by a weakening of other restrictive conditions in universal grammar,15 the result would be a more highly constrained theory about the nature of natural languages than has thus far been achieved.
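These clauses can be checked mechanically, which may help to fix ideas. The sketch below is illustrative only (Python, invented names), simplifying so that each step of a derivation records the single rule licensing it; it flags a derivation exactly when clauses (i)-(v) all hold at some position.

def violates_obligatory(markers, licensing, rules):
    # markers: the phrase markers P1 ... Pn; licensing[i]: the rule that
    # licenses the pair <P_i, P_i+1>; rules: the grammar's transformations,
    # each a dict with 'subscript' (int or None), 'obligatory', 'in_domain'.
    for i, p in enumerate(markers[:-1]):
        for t in rules:
            n = t['subscript']
            if not (t['obligatory'] and n is not None and t['in_domain'](p)):
                continue                # clause (i) fails
            lic = licensing[i]
            if lic is t:
                continue                # clause (ii) fails: Tn did apply
            m = lic['subscript']
            if m is None:
                continue                # clause (v) fails: unordered rule applied
            if m < n:
                continue                # clause (iii) fails: Tn not yet reached
            if any(l['subscript'] is not None and l['subscript'] > n and j < i
                   for j, l in enumerate(licensing)):
                continue                # clause (iv) fails: Tn already passed
            return True                 # (i)-(v) all hold: not well formed
    return False

t1 = {'subscript': 1, 'obligatory': True, 'in_domain': lambda p: 'a' in p}
t2 = {'subscript': 2, 'obligatory': False, 'in_domain': lambda p: True}
u = {'subscript': None, 'obligatory': False, 'in_domain': lambda p: True}
print(violates_obligatory(['a', 'ax'], [t2], [t1, t2, u]))  # True: T1 passed over
print(violates_obligatory(['a', 'ax'], [u], [t1, t2, u]))   # False, by clause (v)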

VII.
In this paper I have attempted to clarify the nature of a number of key grammatical notions, to construct proposals for formalizing these notions, and to spell out the empirical consequences of accepting one or another of these proposals. In addition, I have tried to explicate and criticize Lakoff's abstract characterization of grammars. However, there are two questions regarding this characterization that remain. First, considerations of transitivity aside, is it the case that Lakoff's treatment of rule ordering "has been implicitly part of transformational theory since the beginning"? (Postal 1972, 140) Second, is there any reason to accept a theory in which the justification of rule ordering constraints in a grammar is fully on a par with the justification of language particular transformations? It seems to me that the answer to both of these questions is 'no'. This follows from what is probably the most significant feature of Lakoff's treatment of rule ordering, namely, the claim that grammars can vary in the amount of rule ordering that they employ.16 Only if this claim is accepted can one maintain that rule ordering constraints are no more freely available than language particular transformations. Suppose, for example, that in order to account for some phenomena of a particular language we have to choose between postulating a new language particular transformation and ordering two independently motivated transformations. What sort of theoretical considerations are relevant to making this decision? If one's theory of grammar does not require that all grammatical transformations be ordered, then it might be plausible to suppose that there are no theoretical grounds for selecting one alternative over the other. That is, it might be plausible to suppose that rule orderings don't come any more freely than do language particular transformations. If, on the other hand, one's theory of grammar requires that every transformation be ordered with respect to every other transformation, then, since the two independently motivated transformations would have to be ordered with respect to each other anyway, selection of the rule ordering solution would be preferred because it gives us a chance to save ourselves the postulation of an extra grammatical device. From this result two conclusions immediately follow. First, any reason to accept a theory in which rule ordering constraints are no more freely available than language particular transformations must also be a reason to reject the claim that all grammars impose maximal rule ordering. Second, the extent to which rule orderings have traditionally been considered to be more freely available than language particular transformations gives us some reason to think that Lakoff's treatment of rule ordering has not "been implicitly part of transformational theory since the beginning" (Postal (1972) 140), a conclusion that seems obvious anyway from an examination of much of the work of generative grammarians. Finally, we must decide whether or not to accept a theory that allows grammars to vary in the amount of rule ordering that they employ. Clearly, there is at least one good reason not to do so, namely, to adopt such a theory is to expand the class of possible grammars.17 What this means is that instead of being a proposal for which there is a priori justification, the claim that grammars can vary in the amount of rule ordering that they impose is, from an a priori point of view, inferior to the standard account. Of course, factual considerations might ultimately force us to this less restrictive position. However, in order to determine whether or not they do, it is necessary first to distinguish methodological considerations (like restricting or expanding the class of possible grammars) from empirical considerations and second to evaluate alternative formal characterizations in terms of both their restrictiveness and their empirical adequacy. A persistent problem with Lakoff's theoretical discussions is that these tasks are not adequately performed. With respect to rule ordering, the result is that the issue of restrictiveness is ignored. Hence, the need to empirically justify the claim that grammars vary in the amount of rule ordering that they impose is not recognized. However, once formal, methodological and empirical concerns are distinguished, this oversight can be corrected, and the nature of different hypotheses can be clearly recognized.

14 There are several possibilities regarding unordered rules. The reason for my condition (v) is to permit, as well formed, derivations which are such that, for some obligatory Tn and phrase marker Pi meeting its structural description, there is an unordered transformation that applies to Pi prior to the time in the ordering at which Tn has a chance to apply. It may also be worth noting that by complicating my conditions one could retain this feature of my characterizations while stipulating that no unordered rule could keep an obligatory rule from operating by applying at the time at which the obligatory transformation is applicable. Assuming that there are unordered rules, the decision of whether or not to accept conditions like these is purely empirical.

15 For example, the claim that all grammars impose maximal rule ordering.

16 The fact that Lakoff regards rule ordering constraints as global constraints defined on members of K is of no theoretical consequence since we can define an equivalent theory that lacks this feature. For example, consider the following: (Recall that in this discussion we are following Lakoff in ignoring the principle of the cycle.)
(I) Let the transformations in a grammar be given by means of an ordered list T1, T2, T3 ... Tn. Further, let each transformation be marked either "obligatory" or "optional."
(II) Let the notion of a derivation be defined as a finite sequence of phrase markers in which each ⟨Pi, Pi+1⟩ is well formed.
(III) Then for all sequences W, ⟨Pi, Pi+1⟩ is well formed in W iff either (a) or (b) holds:
(a) Pi = P1; P1 is generated by the base component; and ∃n [⟨Pi, Pi+1⟩ ∈ Tn & (m) [m < n ⊃ (Tm is optional ∨ Pi ∉ domain of Tm)]]
(b) Pi ≠ P1; ∃n [⟨Pi, Pi+1⟩ ∈ Tn & (r) (h) [(r ≥ n & h < i) ⊃ ⟨Ph, Ph+1⟩ ∉ Tr] & (k) [k < n ⊃ (∃h [h < i & ⟨Ph, Ph+1⟩ ∈ Tk] ∨ Tk is optional ∨ Pi ∉ domain of Tk)]]
Intuitively, what this theory states is that
(i) Members of K (i.e. the class of derivations in a grammar) are all those finite sequences of phrase markers which are such that each pair of phrase markers ⟨Pi, Pi+1⟩ in them is well formed.
(ii) The initial pair of phrase markers ⟨P1, P2⟩ in any sequence is well formed iff (a) P1 is generated by the base. (b) P2 is produced from P1 by applying some transformation T. (c) In arriving at T in the ordering no obligatory transformation whose structural description is met is allowed not to apply.
(iii) A pair of phrase markers ⟨Pi, Pi+1⟩ that is non-initial in a derivation is well formed iff (a) Pi+1 is produced from Pi by applying a transformation T. (b) T has not been previously reached in the ordering. (c) All transformations ordered before T have either already been passed in the ordering, are optional and hence can be allowed not to apply, or are not applicable to Pi because their structural descriptions are not met.

17 My argument assumes that any adequate linguistic theory will allow grammars to order at least some of their rules. If this assumption is accepted, then any constraint on rule ordering that is common to all languages further restricts the class of possible grammars. The requirement that all grammars impose maximal ordering is an obvious choice for such a constraint. Of course, the assumption that adequate linguistic theory will allow grammars to order at least some of their rules is an empirical hypothesis which itself needs justification. However, if it is rejected, then Postal's claim that there is no asymmetry between the availability of rule ordering constraints and other grammatical apparatus is automatically false. It is false because to reject such an assumption is just to say that rule ordering constraints are not available at all.
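The regime described in footnote 16 is essentially procedural, and a final sketch may clarify clause (III). It is illustrative only (Python, invented names): the rules of the ordered list are tried from left to right in a single pass, an applicable optional rule may be skipped, and an applicable obligatory rule may not.

def derivations(markers, rules, i=0):
    # markers: the derivation built so far; rules[i] = (apply, obligatory),
    # where apply(P) returns the possible outputs of the rule at P (empty
    # if P is outside its domain). Yields every completed derivation that
    # the single-pass regime of footnote 16 allows.
    if i == len(rules):
        yield markers
        return
    apply_rule, obligatory = rules[i]
    outputs = apply_rule(markers[-1])
    if not outputs or not obligatory:
        yield from derivations(markers, rules, i + 1)   # skip this rule
    for out in outputs:
        yield from derivations(markers + [out], rules, i + 1)

opt = (lambda p: ['opt(' + p + ')'] if p == 'P1' else [], False)
obl = (lambda p: ['obl(' + p + ')'] if p.startswith('opt') else [], True)
for d in derivations(['P1'], [opt, obl]):
    print(d)
# ['P1']: the optional rule was skipped, the obligatory one never applicable
# ['P1', 'opt(P1)', 'obl(opt(P1))']: once applicable, it could not be skipped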

References

CHOMSKY, N. (1961), On the notion "rule of grammar," pp. 119-136 in: Fodor, J.A. and J.J. Katz (Eds.), The Structure of Language, Englewood Cliffs, N.J.: Prentice Hall
CHOMSKY, N. (1967), The formal nature of language, pp. 397-442 in: Lenneberg, E.H., Biological Foundations of Language, appendix, New York, N.Y.: John Wiley
CHOMSKY, N. (1971), Deep structure, surface structure, and semantic interpretation, pp. 183-216 in: Jakobovits, L.A. and D.D. Steinberg (Eds.), Semantics, Cambridge, Engl.: The University Press
CHOMSKY, N. (1972), Some empirical issues in the theory of transformational grammar, pp. 63-130 in: Peters, S. (Ed.), Goals of Linguistic Theory, Englewood Cliffs, N.J.: Prentice Hall
EMONDS, J. (1969), Root and structure preserving transformations. Ph.D. dissertation, Cambridge, Mass.: MIT
LAKOFF, G. (1971), On generative semantics, pp. 232-296 in: Jakobovits, L.A. and D.D. Steinberg (Eds.), Semantics, Cambridge, Engl.: The University Press
POSTAL, P. (1972), The best theory, pp. 131-170 in: Peters, S. (Ed.), Goals of Linguistic Theory, Englewood Cliffs, N.J.: Prentice Hall

DOV M. GABBAY AND J. M. E. MORAVCSIK

BRANCHING QUANTIFIERS, ENGLISH, AND MONTAGUE-GRAMMAR

In this paper we distinguish branching quantifier constructions from linear ones, and illustrate these in English. We then call attention to the interesting properties of systems that include branching quantifiers, and raise the issue of what their inclusion would show for English. In part I we show certain correlations between grammatical and logical order within different syntactic constructions, and show also the syntactic devices in English that allow us to build branching quantifier constructions of a wide variety. In part II we illustrate how a Montague-type grammar can accommodate branching quantifiers. In doing this we use a simplified version of Montague-grammar for the sake of facilitating exposition. Our work shows that Montague-semantics is stronger than the semantics for the 1st order predicate calculus.

This paper is an attempt to shed some light on quantificational structures in English. As such, it is a part of a larger enterprise, namely, that of presenting a rigorous semantics and syntax for certain important parts of English. Some parts of this enterprise have already appeared (see Gabbay 1973, Gabbay and Moravcsik 1973, and Gabbay forthcoming). Below we shall present a brief sketch of the general project. Our aim is to link an empirically motivated rigorous syntax to a model-theoretic semantics so as to capture what is involved in the understanding of a natural language like English. We assume that the syntax and semantics of English can be represented in a formal way. This amounts to the assumption that the class of well-formed sentences of English, together with their interpretations, can be generated by a set of rules. We do not feel that well known facts, such as that grammaticality is a matter of degree, that some terms in natural languages are vague, and that some limits of what certain terms in natural languages denote can be understood only against certain background assumptions, create insurmountable obstacles for our project. Arguments like the ones in Ryle (1957) against the possibility of projects like ours do not seem to us convincing.

Essential to our work is the assumption that the notions of truth and denotation are fundamental to the understanding of the semantics of a natural language. We do not claim that no other notions are needed for a full theory of meaning; nor do we claim that there are no other dimensions of understanding in addition to the dimension of veracity. But we expect that other aspects of meaning will turn out to be additions, and that their full understanding presupposes an explicit semantics in terms of satisfaction relations, such as the one we are trying to formulate. There are further arguments in support of the claim that the notions of truth and reference are fundamental for the semantics of natural languages in Moravcsik (1972). To be more specific, the grammars that we envisage are context-free phrase structure grammars with transformations, while the semantics that we wish to attach to such a syntax is a set-theoretic semantics that assigns to each non-syncategorematic term a set-theoretic entity as the semantic object representing the denotation of that term. Ideally, these assignments should be governed by the following conditions: i) terms that would normally be interpreted as having different extensions should have different semantic objects assigned to them; ii) terms belonging to the same syntactic category should have the same type of semantic object assigned to them; iii) different syntactic categories should have different types of semantic objects associated with them; iv) since we deal with natural languages, the construction of the syntactic categories should depend partly on empirical arguments independent of semantic considerations. Our project has its origins in a similar project that was left unfinished by the late Professor Montague (see Montague 1970 and 1973). Montague's work incorporated some, but not all, of the conditions enumerated above. Our work is both an extension and a modification of Montague's work, in ways that are indicated separately in the various parts of our project. In view of some of the work done on the formal structures of the grammars of natural languages, due mostly to Chomsky, we are in a position today to formulate and partially answer certain clearly formulated questions regarding the complexity of the mind, or machine, that is required to interpret natural languages. As of now, no analogous question can be formulated with regard to the semantic component. Hopefully, work such as ours will lead to a change in this undesirable situation.

I. BRANCHING CONSTRUCTIONS AND ENGLISH SYNTAX

We shall assume the usual treatment of existential and universal quantifiers, as presented in modern symbolic logic. In recent years there have been attempts to relate this treatment to quantificational structures in English. These attempts, however, have been restricted to structures in which, within the sequence of quantifiers, each quantifier depends for its interpretation on its predecessor. In other words, in these structures we find a linear dependency among the quantifiers. E.g., in the much used example:

(1) Every man loves some woman
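In standard notation (a gloss of ours, not part of the original text), the linear reading of (1) is

$$\forall x\,\bigl(\mathrm{Man}(x) \supset \exists y\,(\mathrm{Woman}(y) \wedge \mathrm{Love}(x,y))\bigr)$$

where the choice of y is allowed to depend on x.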

the interpretation of the second quantified noun phrase depends on the first one. There is no non-arbitrary restriction on the number of quantifiers that can occur in a linear sequence of this sort within a sentence of English. E.g., in:

(2) Every man loves some woman some time

we have a sequence with three quantified phrases, and it is easy to imagine the addition of further (dependent) modifiers containing quantifiers. The attempts to account for these structures have focused so far mostly on the kinds and number of ambiguities that the relevant English sentences exhibit, and on the investigation of how the syntactic analysis of these sentences can be related to the structure demanded by logical analysis. Montague 1973 contains, in addition, also a set-theoretic semantics for such sentences. Logicians have considered for some time configurations other than the linearly dependent ones in quantificational logic. One can construct configurations within which the linear dependency breaks, and thus branches of dependent quantifiers arise. But it is only in recent work by Hintikka (forthcoming) that attention has been called to the existence of such structures in English. E.g., in contrast with (1) and (2):

(3) All products of some subdivisions of every Japanese company, and all products of some subdivisions of every Korean company are shipped to the U.S.A. on the same boat

contains two sequences of quantified phrases, linked by a conjunction, such that though within each sequence there is linear dependency, the two sequences themselves are not connected by such a dependency. Nevertheless, the two sequences are tied to the phrase 'are shipped to the U.S.A. on the same boat'; this phrase functions as a logical predicate. Thus (3), unlike (1) and (2), exemplifies the structure of branching quantifiers in English. These structures are of interest by themselves. E.g., it is not clear that various proposals within the framework of generative grammars, such as generative semantics and Chomsky's extended standard theory, are equally adequate in representing the syntax of these structures. Furthermore, the representation of the semantics presents an interesting challenge. There are, however, other issues of theoretical interest that rest partly on the analysis of these structures. It has been shown by logicians that the class of logical


truths that contains all of the constructions involving branching quantifiers is not recursively enumerable, and is equivalent to some fragment of second order logic. (For references, see Hintikka [forthcoming] and Enderton 1970.) In view of these considerations, our work is an attempt to shed light on the following questions:

(a) Can a Montague-grammar handle branching quantifiers?
(b) Does the existence and adequate treatment of branching quantifiers in English show anything about the need to go beyond 1st order predicate calculus to present an adequate semantics for English?
(c) Does the investigation of these phenomena give support to the claim that the class of analytic sentences in English is not recursively enumerable? (I.e., English includes all logical truths expressible by branching quantifiers, and that set is not r.e.)

If the answer to the third question is affirmative, and we assume that there is an effective procedure for generating all and only the grammatically well-formed sentences of a natural language, then we will be able to establish an interesting asymmetry between the semantic and syntactic components: the set of valid sentences is not r.e., while the set of well-formed sentences is recursive. Before we proceed to deal with these issues, let us consider the order of occurrences of quantified phrases in various types of English sentences and the ordering of quantifiers required by the corresponding paraphrases in any standard logical analysis. Even a cursory look at (3) shows discrepancies. The logical paraphrase would start: 'for every Japanese company there are some subdivisions such that ...'; in short, the quantifiers would occur in the REVERSE order from the order in which the quantified phrases in (3) occur. Thus we should look for some regularities relating logical and grammatical orderings. An alternative to this would be to start with a structure in which the quantifiers occur as required by any reasonable interpretation in 1st order predicate calculus, and, assuming this to be "deep structure", proceed to construct a derivation via transformations that arrives eventually at the English sentences. However, there seems to be no empirical syntactic motivation for this, and it is not clear to what extent the types of transformations needed would obey a suitable set of constraints. Sequences of quantified phrases in English can be constructed via the following syntactic patterns:

i) with prepositional phrases (such as in (3))
ii) within subject-verb-object sequences (with modifiers, such as in (2))
iii) with relative clauses
iv) with embedded structures (e.g. sentences containing 'believes that')

We shall try to show that there are regularities of interpretation linked to these four classes.


In the case of structures like (3) the logical predicate linking the branches ('are shipped to the U.S.A. on the same boat') is dyadic. The variables linked to it in logical paraphrase will be the ones tied to the quantifiers appearing last in the branches. I.e., 'for every x s.t. x is a Japanese company there are some y s.t. y is a subdivision of x, and every z s.t. z is a product of y ..., and for all u (Korean company), some v (subdivision), and all w (product), z and w are shipped ...'. We shall see later that more complex branching can also be expressed in English. While the respective products make up the classes to which the predicate in (3) applies, and thus occupy the last place in the logical sequences of quantifiers, they occupy the initial positions in the sequences as expressed by the grammatical construction of English. This suggests the hypothesis that when we deal with sequences of quantifiers that are constructed out of prepositional phrases, the grammatical order is the reverse of the logical order. As the examples below indicate, this hypothesis seems to work for the prepositions 'of', 'to', 'in' (and other locationals), and 'for'.

(4) Some gift to every girl and some gift to every boy are bought by the same Santa Claus
(5) Every deer in some forest and every moose in some meadow drink from the same brook
(6) Every sacrifice for some good cause and every prayer for some blessing please the gods equally
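Schematically, the branching reading of a sentence like (3) can be displayed with a Henkin prefix. The following regimentation is ours (with J, K for 'Japanese/Korean company', Sub and Prod for 'subdivision of' and 'product of', and B(z, w) for 'z and w are shipped to the U.S.A. on the same boat'):

$$\begin{pmatrix} \forall x\,\exists y\,\forall z \\ \forall u\,\exists v\,\forall w \end{pmatrix}\;\Bigl[(J(x) \wedge K(u)) \supset \bigl(\mathrm{Sub}(y,x) \wedge \mathrm{Sub}(v,u) \wedge ((\mathrm{Prod}(z,y) \wedge \mathrm{Prod}(w,v)) \supset B(z,w))\bigr)\Bigr]$$

The two rows are interpreted independently: y depends only on x, and v only on u.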

The same point is illustrated by sentences in which different prepositions are combined, such as:

(7) Some entrances to every freeway to some city of every country are badly constructed

Though this does not involve branching, the same point is exemplified: the logical order will start with the last quantified phrase ('every country') and work in reverse order, ending with 'some entrances', which is linked to the predicate. The hypothesis stands up, even though in this project we do not distinguish between different senses of the prepositions, such as the 'of' of possession, origin, and object (the portrait of David), the 'to' of direction in contrast with the true dative ('he gave it to me'), and the 'for' of purpose as distinct from the 'for' in 'doing something for someone'. One class of exceptions to our hypothesis will be the pseudo-prepositionals: prepositional phrases that function, semantically at least, not as relational phrases. In one such type of case, the phrase functions adjectivally; e.g.:

(8) Every man of some intelligence ('with some sense'?) smokes cigars.

In another type of case, the preposition links events with their aspects, and thus once more does not denote a genuine relation. E.g.:

(8a) Some aspects of recent linguistic studies and some aspects of recent logical studies are equally depressing.


The other exception is the preposition 'with' and its relatives; as of now we have no explanation for this phenomenon. But the point becomes clear from such examples as:

(9) Every man without some woman is like every ship without some sail
(10) Every man with a large income and every woman with a large appetite suit each other

In sentences like these, the logical order and the grammatical order coincide, and the dyadic predicate does not apply to the variables bound by the last quantifiers in the sequences. The same coincidence of logical and grammatical order can be seen in sentences with subject-verb-object structure, such as:

(11) Every farmer has some sons and every banker has some daughters who belong to the same club

But in the case of this syntactic configuration the dyadic predicate is applied to the variables bound by the last quantified phrases in the logical and grammatical order, i.e., those denoting the sons and daughters respectively. Sentences of this structure admit ambiguities, and thus we cannot assume that all readings of all such sentences will preserve the coincidence of grammatical and logical order. Let us now consider the third type of construction mentioned initially, namely relative clauses. Sentences of the following sort are illustrative:

(12) Some truths that were rejected by every ancient sage in some civilization and some falsehoods that are accepted by every modern scientist in some country resemble each other

The main predicate 'resemble each other' applies clearly to the truths and falsehoods. These are clearly not the things bound by the logically last quantifiers in the sequences. Within the relative clauses, however, the regularities observed so far apply. This shows that we have to determine the logical sequences within each relative clause first, and then attach the whole clause as a complex predicate to the head NP's; the main predicate of the sentence then applies to these. There are sentences where a predicate applies both to something within the clause and outside, such as:

(13) Some men who pursue every woman are rejected by them

In these cases the same considerations apply as to constructions built around 'with'. Similar considerations apply to clauses built with 'where', such as:

(14) Every place where some lawyers lived and every house where some doctors lived comes under the same zoning law

The specificity-quantifiers of English, such as 'a certain', always come first in logical order, and this rule is prior to all other regularities mentioned here.


Finally, we turn to embedded sentences. One type of construction is exemplified by sentences with 'believe' as the main verb; e.g.

(15) Some men believe that every woman hates them

This sentence shows that there can be co-reference between noun-phrases and pronouns across the embedding construction. Needless to say, we can also have branching constructions in which one branch is also outside the belief-context, e.g.

(15a) Some men of some countries believe that they resemble some animals of every species.

But note the impossibility of a branching construction in which the branches would be held together by a predicate denoted by the verb which creates the embedding construction:

(15c) *Some x and all y such that ... and some z and all w such that ..., x and z believe that w.

Let us now review the whole range of syntactic devices that allow us to express a wide variety of branching configurations. So far we dealt only with cases where distinct branches are held together by a predicate. But there are cases in which the branches are preceded by a common node, as in:

(16) Men who make a deal with a certain chisler and women who keep company with him deserve the same fate.

Here the expression referring to the chisler picks out an element that is necessary for the interpretation of both branches, giving us the structure:

    There is a chisler s.t.   men who ...     \
                                                deserve the same fate
                              women who ...   /

Little reflection on (16) shows that one should be able to attach such an expression, thus forming the diamond-like structure, in front of any arbitrary number of quantified phrases and with any number of branches; the guarantee comes from the fact that these structures are equivalent to very complex relative clauses, as the schematic representation above indicates. Furthermore, these same syntactic devices make it possible to add one diamond-like structure after another, e.g. to go on with (16) in some such form as:

    ... deserve the same fate s.t.   some angels of all religions        \
                                                                           equally abhor ...
                                     some devils of all superstitions    /

Further complexities of this sort are also expressible in English. Another relevant question is: how many branches can we have? The answer is: any number of these; one can link them with conjunctions, and predicates like the one in (11) can apply to any number of branches.


Still another dimension of complexity is revealed when we see that the main predicate can apply to more than one class denoted within any one of the branches. E.g.,

(17) Some lie of every politician and some weakness of every voter make the voters hate the politicians

Here all four NP's are tied to the main predicate; and further complexity in this dimension (involving more quantified NP's) is imaginable. Examples used in logic textbooks usually treat only verbs that function as monadic or dyadic predicates in English. But the syntactic device of adding cases via prepositional phrases shows that verbs can function as predicates with a much larger number of elements related. E.g., consider the schema: x brings y with z to w in v for u. We already saw above that more complex verb phrases can apply to any arbitrary finite number of elements. Furthermore, grammatically as well as semantically, prepositional phrases such as 'to Mars', 'to London', 'to California' have common structure that must be brought out by analyzing them into further constituents. Thus the treatment of prepositional phrases as adverbial operators is unsatisfactory, since under such treatment (an extra primitive for each phrase) we would lose internal structure and thus miss important generalizations. Thus we have shown how one can build up English sentences with any number of branches, and predicates tying the branches together, applying to any finite number of variables, and how the language allows us to form the diamond-type branching with relative clauses. Can we form branches of arbitrary length? Again, there is a syntactic device that guarantees this, for some prepositions, such as 'of', allow an arbitrary number of iterations. Thus we can always form branches of the form 'quantified NP of quantified NP of ...' of arbitrary length. In this section we illustrated a number of regularities linking the order of quantifiers within standard logical paraphrases and the order of quantified noun phrases in a variety of English syntactic constructions. We also indicated several syntactic devices of English that allow the construction of branching structures with an arbitrary number of branches, with an arbitrary number of NP's within any of the branches, and with predicates that can apply to an arbitrary number of elements. We also illustrated diamond-like structures of arbitrary complexity, and sentences in which the common main predicate is applied to more than one quantified NP from each branch. The regularities and the variety of sentences expressible in English indicate the extent to which branching structures are part of a natural language like English. This sets the background for the development of the rigorous semantics.

II. BRANCHING QUANTIFIERS AND MONTAGUE SEMANTICS


1. Introduction

Montague (1973) presented a grammar and a semantical interpretation for a certain fragment of English. The fragment is small, but "accommodates all of the more puzzling cases of quantification and reference" known to him. In part I we have shown that branching quantifiers behave according to certain rules. We now show that branching quantifiers can be expressed in Montague grammar and that the respective semantics is correct for them. Thus the system of Montague (1973) allows for the construction of sentences with branching quantification. Our plan is as follows. In §2 we give an introduction to Montague-type semantics. As explained in Gabbay (1973), Montague's paper (1973) is extremely elegant, and some of its features are only technical options and not conceptual necessities. We therefore give in §2 a simplified account of Montague grammar. In §3 we show how the simplified grammar of §2 can accommodate branching quantification, and in §4 we show how Montague grammar (1973) accommodates branching quantification.

2. Simplified Montague Semantics

Let us begin with a certain fragment of English, for example the fragment containing words like 'John', 'Mary', 'run', 'kill', and sentences like 'John runs', 'Mary kills John'. Our first step is to divide the words into categories, e.g., in this case into the four categories: NP (containing John), IV (containing run), TV (containing kill), S (containing 'John runs'). We supply rules that allow us to construct the sentences we have in mind, e.g., in our case:

(R1) S → NP + IV
(R2) IV → TV + NP



(The choice of rules and categories depends on what tasks we set for ourselves: which sentences do we want to construct? What ambiguities do we want to account for? Can we give a simple semantics for this grammar? etc.) Given a grammar, that is, given a set of categories and rules of construction, we can supply this grammar with a semantics. A semantics consists of the following. (a) With each properly constructed phrase x of the language we associate a semantical object ||x||. Phrases belonging to the same category obtain the same kind of semantical object. (b) With each rule R we associate a semantical rule SR. If the rule R allows you to construct phrase z from phrases x₁, ..., xₙ (e.g., S → NP + IV) then SR tells you how to construct ||z|| from ||x₁||, ..., ||xₙ||. These assignments must be natural. That is, the semantics keeps close to the meaning of the English phrases and reflects correctly conditions of truth, makes distinctions in case English makes them, etc. As an example let us construct a semantics for the fragment given above. We start with a set of objects Δ (which can be thought of, e.g., as a set of people). NP's get elements of the set Δ, e.g. ||John|| ∈ Δ. IV's get subsets of Δ, e.g. ||run|| ⊆ Δ (intuitively, the set of those who run). TV's get binary relations on Δ, e.g. ||kill|| ⊆ Δ² (i.e., a list of who kills whom). Sentences get truth values (true or false). We now have to specify the rules SR1, SR2. SR1 tells you to check, given x ∈ NP and y ∈ IV, whether the element ||x|| belongs to the set ||y||, and gives a truth value. So ||John runs|| is true if ||John|| ∈ ||run||, i.e., the element associated with John is in the set ||run|| (those who run). The rule SR2 says: take the relation ||TV|| and collect all those elements that are related to the ||NP||; e.g., ||kill Mary|| is obtained from ||kill|| and ||Mary|| by collecting all those elements that the relation ||kill|| relates to the object ||Mary||. Thus ||kill Mary|| is the set of all those who kill Mary. Thus semantics for the above grammar are obtained by taking sets Δ and assignments || ||. There are many possible semantical interpretations. The nature of the interpretation depends on the richness of the fragment, the rules and categories chosen, the various distinctions required, etc. We give two examples: 1. A fragment containing the verb seek cannot be handled like we handled kill, because you kill objects of the universe but you may seek non-existent objects. 2. But even without new kinds of words, how about adding the simple rule (R3) NP → NP + NP (i.e., we want to construct 'John loves John (and) Mary')? What will be ||NP + NP|| (i.e., ||John (and) Mary||)? 'John (and) Mary' is a member of NP and therefore must be assigned the same type of element as 'John'. A simple way out is to assign to NP's


finite subsets of Δ. So ||John|| = the set containing one element (John himself); ||John (and) Mary|| is the set containing two elements. If x, y are NP's and z = x + y, then ||z|| = ||x|| ∪ ||y||: SR3 tells you to take unions. All the other rules remain the same, except that we replace Δ by Δ* = the set of all finite subsets of Δ. If you look at the fragment and rules of Montague (1973), you will begin to see why the semantics of Montague (1973) is complicated. We now describe how we can introduce quantifiers into the language. Let us, for simplicity, confine ourselves to the fragment with rules R1, R2, and categories NP, IV, TV. First we add to the category NP the variable names he₁, he₂, he₃, ... We can now form phrases that depend on unspecified names, e.g., kill he₁ (or kill him). We don't know who 'him' is; for different choices of he we get different sentences. We also introduce common nouns like woman, man. This is a new category CN. The semantical objects for them are subsets of Δ: ||man|| (i.e., the set of all men), etc. We add the category Q of quantifiers, containing every and some, and add the rule NP → Q + CN. Now we can form: 'every man runs' or 'every man kills some woman' or 'he kills every man' or 'he kills her', etc. This is a very simple fragment. However, we need one more rule to allow us to express branching quantification! Luckily this rule was introduced by Montague to treat relative clauses. Recall that we can construct sentences with variables in them, like 'he runs'. We indicate that a sentence S contains a variable name he by writing S(he) (a sentence with he). We want to allow the construction of CN from CN + S(he) by the use of 'such that', e.g., man such that he runs. We can therefore form 'every woman kills some man such that he (the man) runs'. In the next section we present this grammar formally and show how it can accommodate branching quantifiers.
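The following small program is a sketch of this toy semantics, purely our own illustration; the names DELTA, H, sr1 and sr2 are invented stand-ins for Δ, the assignment || ||, and the rules SR1 and SR2:

```python
# A sketch of the toy semantics of section 2 (our illustration only).
# DELTA plays the role of the universe; the dictionary H plays the role of
# the assignment || ||: NP's map to elements, IV's to subsets, TV's to relations.

DELTA = {"John", "Mary", "Bill"}

H = {
    "John": "John",                    # basic NP: an element of DELTA
    "run":  {"John"},                  # basic IV: a subset of DELTA
    "kill": {("Mary", "John")},        # basic TV: a binary relation on DELTA
}

def sr2(tv, np):
    """SR2: ||kill Mary|| = all a such that ||kill|| relates a to ||Mary||."""
    return {a for (a, b) in H[tv] if b == H[np]}

def sr1(np, iv_denotation):
    """SR1: ||John runs|| is true iff ||John|| belongs to ||run||."""
    return H[np] in iv_denotation

print(sr1("John", H["run"]))             # True: 'John runs'
print(sr1("Mary", sr2("kill", "John")))  # True: 'Mary kills John'
```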
3. A Grammar for Branching Quantifiers

We have the following basic categories¹:

1. NP (noun phrases). Contains two sets of basic phrases: N₁ = {John, Mary, etc.} and N₂ = {x, y, z, he₁, he₂, she₁, she₂, etc.}, the set of name variables. Besides these basic NP's we can construct more using the rules below.

¹ Morphology is neglected, i.e., run/runs, love/loves, he/him, she/her, he/she. In our examples we use the correct English form only for reasons of style. It doesn't affect the representation of branching quantifiers.


2. IV (basic ones are for example run, walk).
3. TV (basic ones are e.g. love, kill).
4. Q (contains every and some).
5. CN (basic common nouns are man, woman, sheep, etc.)

We now define some derived categories and give some rules of grammar. To do this we shall simultaneously define, for any phrase P of any category, the notion 'the name variable x appears free in P'. We denote this by writing P(x). The following are the clauses defining the rules and the notion.

6. If Z is a basic element of any category (see (1)-(5) for a list of basic elements, such as run, kill, etc.), then x is free in Z iff Z is x itself (as a name in NP).

7. Z is an element of the category S of sentences if it is of the form X + Y where X is an NP and Y is an IV (i.e., the rule S → NP + IV). A variable name x is free in Z if x is free in either X or Y.

8. Z is in IV if Z = X + Y with X in TV and Y in NP. A variable name x is free in Z if it is free in either X or Y (i.e., the rule IV → TV + NP).

9. Z is in NP if Z = X + Y and X is in Q and Y is in CN. A variable name x is free in Z if it is free in either X or Y.

10. If X is in CN and Y(x) is in IV with x free in Y and not appearing in X, then the following phrase Z is in CN: Z = X such that Y(heₙ), where heₙ is the first new name variable of this type not appearing in X or Y. u is a free variable of Z if u ≠ x, u ≠ heₙ, and u is free in X or in Y.

11. If Y(x) is in S with x free and X is in NP, then the following Z is in S:
(a) If X is a variable name u, then take Z(u) to be Y(u), and a variable v is free in Z iff v = u, or v ≠ x and v is free in Y.
(b) If X is not a name variable, then Z is obtained by replacing in Y(x) the first occurrence of x by X and every other occurrence of x by heₙ (or optionally sheₙ, depending on gender), where heₙ is the first name of this form not occurring in X or Y.

Using the notion of construction tree (Montague 1970, 1973), we can give some examples. Near each node of the tree we indicate the free variables of the phrase of that node and the rule (if in doubt) used to construct it.
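Clause 11(b) is, in effect, a substitution routine. The following sketch is ours, not part of the authors' formal apparatus (the function name rule11b and the pronoun parameter are invented); it mimics the bookkeeping on flat word lists:

```python
# A sketch of the bookkeeping in clause 11(b): substitute the NP for the
# first free occurrence of the name variable, and a pronoun for the rest.

def rule11b(sentence, var, np, pronoun="he1"):
    out, replaced = [], False
    for word in sentence:
        if word == var and not replaced:
            out.extend(np)              # first occurrence: insert the NP
            replaced = True
        elif word == var:
            out.append(pronoun)         # later occurrences: pronominalize
        else:
            out.append(word)
    return out

# 'x loves x' + NP 'every man'  ->  'every man loves he1'
print(rule11b(["x", "loves", "x"], "x", ["every", "man"]))
```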


(18) every man loves some woman

(Construction tree: 'every' + 'man' form the NP 'every man', 'some' + 'woman' form the NP 'some woman'; 'loves' + 'some woman' form the IV 'loves some woman'; the NP 'every man' and this IV combine into the sentence.)

Meaning: For every man there is a woman (depending on the man) whom the man loves.

(19) every man loves some woman

(Construction tree: the sentence 'every man loves x' is formed first, and the NP 'some woman' is then quantified in by rule 11.)

Meaning: There is some woman such that every man loves her. (The woman is the same for all men.)


(20)² Every man loves some woman such that she₁ kills every sheep such that he₂ runs.

(Construction tree: 'sheep such that he₂ runs' is built by rule 10 from 'sheep' and 'he₂ runs'; with 'every' it forms an NP; 'y kills every sheep such that he₂ runs' is an IV; rule 10 then gives the CN 'woman such that she₁ kills every sheep such that he₂ runs', which with 'some' forms the object NP; finally 'every man' combines with 'loves' plus this NP.)

The meaning is that the woman depends on the man (i.e., for every man x there is a woman w = w(x) such that w(x) kills every sheep that runs).

² We are not concerned with the deletions that take us from the sentence of this example to the final surface form.

(21) (Branching quantification):

Every man loves some woman (and) every sheep befriends some girl that belong to the same club (i.e., the woman and the girl).

(Construction tree: the top node is Z(every man, every sheep), obtained by applying rule 11 twice, first to 'every sheep' and then to 'every man', starting from the sentence Z(x, u) = 'some woman such that x loves she₁ belongs to the same club as some girl such that u befriends she₂'. Inside Z(x, u), the CN's 'woman such that x loves she₁' and 'girl such that u befriends she₂' are built by rule 10 from 'x loves y' and 'u befriends y' respectively.)

Once we know how to express branching like (21), we can also express branching like (16) in part I: simply form a statement Z(u) where u is a name variable (i.e., u replaces the 'chisler') and now quantify over u. Clearly any kind of lattice can be created in this way. (We regard 'belong to the same club (as)' as one unit here; we abbreviate it by 'belong-'. In the top nodes (with Z(x, u)) we used rule 11.) The semantics will show that the meaning of the quantifiers is branching.


Now suppose we look at the sentence 'every man loves some woman and every sheep befriends some girl that belongs to the same club owned by the man'. This sentence has the same tree as in (21) except that the node 'belong-' should be replaced by the node 'belong to the same club owned by x'. This phrase has to be constructed separately (i.e., the node is really replaced by a branch constructing the above). This shows we can express branching quantification where the main verb depends on more than the last of the branching quantifiers. We now turn to the semantics for the grammar of this section. Since our fragment is smaller than Montague's (not including intensional objects), our semantics is less complicated, in the sense that the semantical objects associated with elements of the basic categories need not be sets of too high a type. As we remarked in section 2, the natural simple semantics of §2 has to be changed a little to accommodate various technical difficulties. So in order to make the present semantics more transparent, we use the original semantics as our starting point. Let H be a function assigning semantical objects to the elements of the categories. H is defined as follows. Let Δ be a set (our universe of objects). Let H(x) ∈ Δ for any name variable x (by name variable we mean variable for names). Let H(n) ∈ Δ for any basic name n such as John or Mary. Let H(n) ⊆ Δ for any basic IV n such as run or basic CN such as man. Let H(n) ⊆ Δ² for any basic TV n such as love. Given such an assignment we define our semantical interpretation. We define a semantical object ||P||H associated with any phrase P of any category constructible in the language. The definition is given by induction: || ||H is defined first for the basic elements of the categories, and then with each syntactical rule of grammar that allows us to obtain new phrases from old we associate a semantical rule that allows us to obtain the semantical object associated with the newly constructed phrase. The numbers here follow the numbers of the definition of the grammar.

(S1) If n is a basic NP then ||n||H = the function f such that for any subset A ⊆ Δ, f(A) is a truth value and f(A) = true exactly when H(n) ∈ A.
(S2) If n is a basic IV then ||n||H = H(n).
(S3) If n is a basic TV then ||n||H is the function f such that for any function F (giving truth values to subsets of Δ), f yields the set f(F) = {a | F({b | (a, b) ∈ H(n)}) = true}.
(S4) ||some||H = the function that associates with every subset A ⊆ Δ the function f_A on subsets of Δ with the property that f_A(B) = true exactly when A ∩ B ≠ ∅; ||every||H = the function that associates with every subset A ⊆ Δ the function G_A such that for any B ⊆ Δ, G_A(B) = true iff A ⊆ B.
(S5) If n is a basic CN, then ||n||H = H(n).
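Read functionally, (S1) and (S4) assign NP's and quantifiers what would now be called generalized-quantifier denotations. A minimal executable sketch of (S1) and (S4) follows (our illustration; the universe and the sets man, run are invented for the example):

```python
# A sketch of (S1) and (S4). An NP denotes a function from subsets of the
# universe to truth values; 'some' and 'every' map a CN-denotation A to
# such a function.

def np_basic(individual):              # (S1)
    return lambda A: individual in A

def some(A):                           # (S4): f_A(B) is true iff A meets B
    return lambda B: bool(A & B)

def every(A):                          # (S4): G_A(B) is true iff A is a subset of B
    return lambda B: A <= B

man, run = {1, 2}, {1, 2, 3}
print(some(man)(run))                  # True:  'some man runs'
print(every(man)(run))                 # True:  'every man runs'
print(np_basic(4)(run))                # False: 'John runs' with H(John) = 4
```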

Branching quantifiers, english, and montague-grammar

155

To continue we need a definition. For a name variable x, let H₁ =ₓ H if H₁ is like H except possibly for giving a different value on x, i.e., ∀y(y ≠ x → H(y) = H₁(y)).

(S7-S9) Each of the rules (7)-(9) has the form Z = X + Y, where Z is the new phrase obtained from X and Y. The corresponding semantical rules say ||Z||H = ||X||H(||Y||H), i.e., apply the semantical function ||X||H to the argument ||Y||H and the value is ||Z||H.
(S10) If Z = X such that Y(x) then ||Z||H = ||X||H ∩ {a | ||Y(x)||H* = true}, where H* =ₓ H and H*(x) = a.
(S11) If Z is obtained by applying rule 11 to the NP X and sentence Y(x) then ||Z||H = ||X||H({a | ||Y(x)||H* = true, where H* =ₓ H and H*(x) = a}).

Lemma 1: Let P(x₁, ..., xₙ) be a phrase with the only free name variables x₁, ..., xₙ. Let H, H¹ be two semantic functions that agree on x₁, ..., xₙ and on all the basic phrases appearing in P (such as run, kill, etc.); then ||P||H = ||P||H¹.
Proof: clear, by induction.

Lemma 2: Let P be a phrase not containing the free name variable u; then no existential quantifier in P can be dependent on u.
Proof: follows from Lemma 1, since by changing the value H assigns to u, ||P||H does not change.

Corollary: The quantification of the tree (21) is branching, since 'some girl' does not depend on 'man' and 'some woman' does not depend on 'sheep': they are both constructed as different phrases (with variables x, u) in different parts of the tree.

(22) (Branching quantifier): Every son of some king and every daughter of some queen are friends.

First construct: y is son of x; u is daughter of v.
Then construct: man such that he is son of x; woman such that she is daughter of v.
Now construct: every man such that he is son of x; every woman such that she is daughter of v.
Now construct: S(x, v) = every woman such that she is daughter of v and every man such that he is son of x are friends.
And you can now say: S(some king, v);
and finally: S(some king, some queen),

assuming that every king is a man and every queen is a woman. Also note that we are not concerned with rules yielding the final surface form.
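In standard notation, the branching reading that this construction gives (22) can be displayed as follows (our rendering, with the obvious predicates):

$$\begin{pmatrix} \exists x\,\forall y \\ \exists u\,\forall v \end{pmatrix}\;\Bigl[\mathrm{King}(x) \wedge \mathrm{Queen}(u) \wedge \bigl((\mathrm{Son}(y,x) \wedge \mathrm{Daughter}(v,u)) \supset \mathrm{Friends}(y,v)\bigr)\Bigr]$$

The king is chosen independently of the queen, exactly as the phrase containing 'some king' is constructed without reference to the phrase containing 'some queen'.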


4. BRANCHING QUANTIFIERS IN MONTAGUE (1973)

Our grammar of §3 is a subgrammar of Montague (1973), and therefore this system can express branching quantification. The reader should note that we did not give a rigorous proof that every sentence with branching quantifiers can be expressed in our grammar; we simply gave several examples to convince the reader that this can be done.

5. DEGREE OF COMPLEXITY OF THE MONTAGUE LANGUAGE

The language with branching quantification is stronger than the 1st order predicate calculus. In fact (due to F. Galvin) there exists a sentence of the form

(23)  ∀x∃y
      ∀u∃v   A(x, y, u, v)

(the two rows forming a branching, not a linear, prefix) that cannot be expressed in 1st order predicate calculus, as it is true in precisely the models with infinite domains. The set of valid sentences of branching quantification is not recursively enumerable (see Enderton 1970, p. 393). The fact that we can express, e.g., the above sentence in Montague grammar shows that the Montague language is stronger than 1st order predicate calculus. The reader may wonder about this, since on the face of it whatever we can express in Montague grammar can be expressed also in the predicate calculus. The difference, however, is in the semantic interpretation. Take (23). Montague grammar rewrites this essentially as A(x, F(x), u, G(u)), which is expressible in predicate logic. However, semantically, Montague semantics gives it the interpretation ∃F∃G∀x∀u A(x, F(x), u, G(u)), which cannot be done in predicate logic.
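To make the last point explicit (our rendering of the standard observation; cf. Enderton 1970): Skolemizing the two rows of (23) separately yields

$$\exists F\,\exists G\,\forall x\,\forall u\;A(x, F(x), u, G(u)),$$

whereas the linear prefix ∀x∃y∀u∃v would permit v to depend on x as well, i.e. v = G(x, u). The branching reading forbids precisely this dependence, and that restriction is what escapes first-order expression.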

References
ENDERTON, H. B. (1970) Finite partially-ordered quantifiers, Zeitschrift für Mathematische Logik und Grundlagen der Mathematik 16, 393-397.
GABBAY, D. M. (1973) Representation of the Montague semantics as a form of the Suppes semantics, pp. 395-412 in: Hintikka, J., Moravcsik, J., and Suppes, P. (eds.) Approaches to Natural Language, Reidel: Dordrecht.
GABBAY, D. M. (forthcoming) Tense logics and the tenses of English, in: Moravcsik, J. (ed.) Logic and Philosophy for Linguists, Mouton: The Hague.
GABBAY, D. M. and J. MORAVCSIK (1973) Sameness and individuation, Journal of Philosophy 70, 513-525.


HINTIKKA, J. (forthcoming) Branching quantifiers, Linguistic Inquiry.
MONTAGUE, R. (1970) English as a formal language, pp. 189-223 in: Visentini et al. (eds.) Linguaggi nella società e nella tecnica, Edizioni di Comunità: Milano.
MONTAGUE, R. (1973) The proper treatment of quantification in ordinary English, pp. 221-242 in: Hintikka, J., Moravcsik, J., and Suppes, P. (eds.) Approaches to Natural Language, Reidel: Dordrecht.
MORAVCSIK, J. (1972) Review of G. Leech's Towards a Semantic Description of English, Language 48, 445-454.
RYLE, G. (1957) The theory of meaning, pp. 239-264 in: Mace, C. A. (ed.) British Philosophy in the Mid-Century, George Allen & Unwin: London.

J. PH. HOEPELMAN

TENSE-LOGIC AND THE SEMANTICS OF THE RUSSIAN ASPECTS¹

We consider the applicability of J. A. W. Kamp's system for S(ince) and U(ntil) to the formalization of the supposed deep-structure of Russian sentences in which the aspects occur. We will see that, assuming certain expressions for the representation of the perfective and the imperfective, the consequences that are generally felt to be implied by these aspects in spoken Russian can be inferred, assuming the axioms for linear and dense time. The semantical relations between the imperfective and the perfective aspects become clearer.

Introduction

If a "natural logic" exists (Lakoff 1970), it is to be expected that a tense-logical fragment will occur in it. Even in advanced treatments such as Montague (1973), the tense-operators are those of the propositional tense-logical system Kt and its extensions. These operators, however, cannot give a proper account of the logical form of all tense-phenomena that occur in natural language. In the following we consider the drawbacks of the aforementioned operators in the treatment of the logical form of Russian sentences in which the so-called "aspects" are found. Then we will make some proposals concerning the representation of these forms by means of Kamp's system for the operators S(ince) and U(ntil). We will limit our tense-logical analysis to one standard example, the verb "zakryt'/zakryvat'", "to close". This is not due to a limitation of Kamp's system, but to difficulties in the analysis of "unbalanced" word-pairs, like "to close" and "to open", by means of Potts' operator "Δ" (Cooper 1966). To expose this would take too much room for the purpose of the present article.

¹ This article is part of the research-project "The tense-logical fragment of natural logic", supported by The Netherlands Organization for the Advancement of Pure Research. I am indebted to Prof. S. C. Dik, J. A. W. Kamp, G. Berger and E. Krabbe for their help.



I.


Until recently the study of tenses in linguistics has been more or less primitive. Most linguists treat the tenses in ways similar to those of Russell (1903, 458-476) or Jespersen (1924) and Reichenbach (1947). Prior (1967, 12), however, shows how Russell's analysis leads to a paradox in the treatment of compound tenses, and we in turn can show how Reichenbach's analysis leads to a similar paradox. In his treatment of the tenses, Reichenbach, following Jespersen, uses diagrams like figure 1.

Fig. 1 (S: point of speech, R: point of reference, E: point of event):

    E - R - S    I had seen John
    S - E - R    I shall have seen John
    E,R - S      I saw John
    E - S,R      I have seen John

Let us now assume that the sentence "Once all speech will have come to an end" is true (cf. Prior 1967, 12). Then a finite number of utterances will have taken place. Assuming further that each utterance has a finite number of reference points, there will be a finite number of reference points. At least one of these is the last one. But Reichenbach's analysis of the sentence "there will have been a last reference point" gives a reference point that is later than the last one. A similar pardox can be constructed when the expression "now" is analysed as "contemporaneous with this utterance" (Krabbe 1972). If the analysis of tenses is related to utterances, one is forced to assume that there always were and always will be utterances, in order to avoid these problems.

II.

Of the different forms of tense-logic, the non-metric propositional ones, with proposition-forming operators on propositions, seem to bear the greatest formal resemblance to the tensed sentences of natural languages. J. A. W. Kamp (Kamp 1968) studies in detail the advantages of non-metric tense-logics for the treatment of tensed expressions in natural languages. We shall enumerate the axioms of standard propositional tense-logic and briefly mention the properties of the related models. The basis we choose is a system for standard propositional logic. The set of well-formed formulas is extended with those well-formed formulas which are preceded by:

F  "it will be the case that"
P  "it was the case that"
G  "it will always be the case that"
H  "it always was the case that",

plus the usual truth-functional combinations of these. "F" and "P" are undefined. "A", "B", ... are metavariables for well-formed formulas.

Def. G: GA =df ¬F¬A
Def. H: HA =df ¬P¬A

The rules of inference of propositional logic are extended with:

RG. ⊢A ⟹ ⊢¬F¬A
RH. ⊢A ⟹ ⊢¬P¬A
RM ("Mirror-image" rule). If in a thesis we replace all occurrences of F by P, and of P by F, the resulting formula is a thesis.

Ax. 1. ¬F¬(A ⊃ B) ⊃ (FA ⊃ FB)
Ax. 2. P¬F¬A ⊃ A

Ax. 1 and 2 together give the system Kt, the theses of which are valid in every model for the axioms given below. Extensions of Kt:

Ax. 3. FFA ⊃ FA (transitivity)
Ax. 4. PFA ⊃ (A ∨ FA ∨ PA) (linearity)
Ax. 5. ¬F¬A ⊃ FA (non-ending, non-beginning time)
Ax. 6. FA ⊃ FFA (denseness)
Ax. 7. D(GA ⊃ PGA) ⊃ (GA ⊃ HGA) (completeness)
Def. D: DA =df A & GA & HA
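To illustrate how these operators behave in a model, here is a small evaluator, entirely our own sketch and not part of the axiomatics, over a finite linear frame; formulas are prefix strings like 'PFp', and the valuation VAL is invented for the example:

```python
# A small evaluator for F, P, G, H over a finite linear frame 0 < 1 < ... < N-1.

N = 5
VAL = {"p": {2}}          # p is true only at moment 2

def true_at(fmla, t):
    op, rest = fmla[0], fmla[1:]
    if op == "F":
        return any(true_at(rest, s) for s in range(t + 1, N))   # some later moment
    if op == "P":
        return any(true_at(rest, s) for s in range(0, t))       # some earlier moment
    if op == "G":
        return all(true_at(rest, s) for s in range(t + 1, N))   # all later moments
    if op == "H":
        return all(true_at(rest, s) for s in range(0, t))       # all earlier moments
    return t in VAL[fmla]                                       # atomic case

print(true_at("Fp", 0))    # True: p will be the case
print(true_at("PFp", 4))   # True: it was the case that p would be the case
```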


III.

Russell offers the following definition of "change": "Change is the difference in respect of truth or falsehood, between a proposition concerning an entity and a time T, and a proposition concerning the same entity and another time T', provided that the two propositions differ only by the fact that T occurs in the one, where T' occurs in the other. Change is continuous when the propositions of the above kind form a continuous series, correlated with a continuous series of moments ... Mere existence at some, but not all, moments constitutes change on this definition" (Russell 1903, 469-470). This definition can, with due modifications, equally well be applied to non-metric systems. Von Wright (1963), (1965) has developed a system with a dyadic proposition-forming operator on propositions, T, by means of which four elementary transformations can be described. Clifford (1966) has pointed out that von Wright's system goes together with a discrete series of moments. T(p, q) means "p now, and q the next moment". If p represents the proposition "the window is open", then T(p, ¬p) describes the transformation of a world in which a window is open into a world in which it is closed; T(¬p, p) describes the reverse; T(p, p) describes the staying open of the window and T(¬p, ¬p) its staying closed. Agreeing with Russell's definition we can say that only T(p, ¬p) and T(¬p, p) describe changes. Anscombe has given an operator Ta, that Prior (1967, 70) has defined as follows:

Def. Ta: Ta(A, B) =df P(PA & B)

Ta(p, q) may be called "p and then q". Def. Ta can be given for any of the axiom systems given above, and so does not presuppose discrete time. Ta(p, ¬p) and Ta(¬p, p) describe changes as well, but do not preclude the possibility of there having been more changes in between.

IV.

In Russian there are in general two verbal forms corresponding to one English verb. So for instance the verb "to close", which in Russian is represented by the two forms "zakryt'" and "zakryvat'". These two forms are referred to as "perfective" and "imperfective" respectively (if necessary we will indicate the perfectivity or imperfectivity of a form by the superscripts ᵖ and ⁱ). It has for a long time been thought that the aspects are a feature characterising only the Slavic languages, but recent studies show that they can be assumed in the basis of other languages as well, e.g. in Dutch; cf. Verkuyl (1971). Aspectual differences, however, are expressed very systematically in the morphology of the Slavic languages.


There is considerable disagreement among linguists as to the meaning of the two Russian aspects and their different functions. The great "Grammatika Russkogo Jazyka" tries to cover all their basic meanings (Forsyth 1970, 3): "The category of aspect indicates that the action expressed by the verb is presented: (a) in its course, in process of its performance, consequently in its duration or repetition, e.g. zit', pet', rabotat', chodit', citat', ... (imperfective); (b) as something restricted, concentrated at some limit of its performance, be it the moment of origin or beginning of the action, or the moment of its completion or result, e.g. zapet', koncit', pobezat', propet', prijti, uznat', ujti, ... (perfective)". Forsyth tries to define the difference between the perfective and the imperfective by means of Jakobson's concept of "privative opposition": "A perfective verb expresses the action as a total event, summed up with reference to a single specific juncture. The imperfective does not inherently express the action as a total event summed up with reference to a single specific juncture" (Forsyth 1970, 7). The Dutch Slavist Barentsen too uses the term "limit" to define the meaning of the perfective and says further: "The perfective points out that in the Narrated Period two contrasting parts are distinguished, of which one contains the Contrast Orientation Period. The element NONPERF points out that in the Narrated Period no two contrasting parts are distinguished". Furthermore, analysing the meaning of the perfective and imperfective forms "otkryt'ᵖ/otkryvat'ⁱ" ("to open"), he states: "The notion of contrast ... asks for explanation. Let us consider the following example: There exists a situation 'the window is closed'. After some time a change is observed: the window is now open. This means that a transition has taken place from one situation into another" (Barentsen 1971, 10; translation mine, J.H.). The similarity of this definition (and many others could be adduced) to Russell's definition of change is easily seen. Both the imperfective and the perfective forms can describe a change, but whereas the imperfective past form "zakryvalas'" ("closed") in the sentence "dver' zakryvalas'" ("the door closed/the door was closing") does not necessarily mean that the door was ever closed, the perfective past form "zakrylas'" ("closed") in "dver' zakrylas'" ("the door closed"/"the door is (was) closed") does mean that the "limit" of closing is attained (in a complete, i.e. continuous, series of moments this limit is the first moment of the door being closed, the first moment at which the sentence "the door is closed" is true). In other words, the imperfective form may describe a change like "becoming more and more closed" while the door is open, whereas the perfective form describes not only this change, but also what may be called the result of this change: the fact that the door was really closed for the first time. The attainment of this result is an event without duration, which may be called an "instantaneous event" (cf. Anscombe 1964, 17).



V.


Let us now assume that we have attached a system for predicate logic to the systems of propositional tense-logic given above. We express predicate constants by "m₁", "m₂", ..., predicate variables by "f₁", "f₂", ..., individual constants by "a", "b", ..., individual variables by "x", "y", "z", ... In this article we will only consider one-place predicates. Now we can express that an individual, a, gets a quality (e.g. "closed") for the first time:

1. H¬m₁a & m₁a

Clearly (1) is too strong to represent the meaning of "dver' zakrylas'ᵖ" ("the door closed"). Neither the Russian nor the English sentence implies that the door has never been closed before. What we want to express is that for some period the door became less and less open and was finally closed. If we try to express this in the propositional tense-logic given above, the best we can get is:

2. HF¬m₁a & m₁a

In dense time HF¬m₁a is true if ¬m₁a was true during some interval until now. But it is also possible that HF¬m₁a is true because ¬m₁a is true in the future, as can easily be inferred from ax. 2 and ax. 5. Even if we therefore stipulate that Gm₁a, HF¬m₁a can in a dense series be verified by a "fuzz": if between any past moment of ¬m₁a's truth and the present, however close, there is a moment of ¬m₁a's falsehood, and conversely, cf. (Prior 1967, 108). A second difficulty is that the standard predicate logical systems do not enable us to relate the result of an event to the event itself, so that we cannot distinguish between an event that stops because its result is attained and an event that stops without its result being attained:
3. P¬m₁a & m₁a

is true when a in the past gradually became m₁ and finally was (or is) m₁, as well as when a "jumped" to m₁. On the other hand, if we have an expression "Δm₁a" to represent the imperfective verb "zakryvat'" ("to close (gradually)"),

4. PΔm₁a

would be true if a stopped closing gradually without finally being closed, as well as when this result was indeed attained.

VI.

To express the concept of gradual becoming, Potts (1969) has devised a system of rules of natural deduction. His main rule we use here as an axiom². If p stands for "x is m₁", then Δp stands for "x becomes m₁".

Ax. Pl. Δp ⊃ ¬p

If we substitute "a is closed" for p in Pl. we get: "if a becomes closed, a is not closed". Contraposition gives: "if a is closed, it doesn't become closed". We attach Potts' operator and axiom to the system of predicate logic we choose.

VII.
We still cannot express that a proposition, p, was true during a certain interval of time. To express intervals in non-metric systems Kamp (1968) has developed the dyadic proposition-forming operators on propositions S(ince) and U(ntil). S(p, q) means: "Since a time at which p was true, q was true until now", and U(p, q) means: "From now on q will be true until a time at which p is true". Kamp has proved the functional completeness of S and U (Kamp 1968). We give some expressions defined in S and U (Prior 1967, 106f.):

PA =df S(A, A ⊃ A): "A ⊃ A has been true since A was the case".
FA =df U(A, A ⊃ A): "A ⊃ A will be true until A is the case".
H′A =df S(A ⊃ A, A): "A has been the case since A ⊃ A, i.e. A has been true for some period until now".
G′A =df U(A ⊃ A, A): "A will be true until A ⊃ A will be true, i.e. A will be true during some period from now".
P′A =df ¬H′¬A: "There can be found no interval, however short, stretching from the past until now, during which ¬A is uninterruptedly true".
F′A =df ¬G′¬A: "There can be found no interval stretching from now into the future during which ¬A is uninterruptedly true".
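The truth conditions behind these glosses, stated in our own notation for a model ordered by <, are:

$$t \vDash S(p,q) \;\text{iff}\; \exists t'\,\bigl(t' < t \wedge t' \vDash p \wedge \forall t''\,(t' < t'' < t \supset t'' \vDash q)\bigr)$$

$$t \vDash U(p,q) \;\text{iff}\; \exists t'\,\bigl(t < t' \wedge t' \vDash p \wedge \forall t''\,(t < t'' < t' \supset t'' \vDash q)\bigr)$$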

² Potts' system: four natural-deduction rules a)-d) for Δ, of which rule b) carries the proviso that "B" depends on no premisses other than "A", and that "A" depends on no premisses other than "B". I do not think that Potts' system is suited to express all kinds of becoming, e.g. not the concept of becoming more and more open.


H′A and G′A are the expressions we have been looking for. H′A is not verified by a "fuzz". Kamp (personal communication) has devised an axiom-system for S and U:

Axioms:
I. All axioms of standard propositional logic.
II. All formulas of the following forms:
1. ¬S(A & ¬A, B)
2. S(A, B) & S(C, D) ⊃ (S(A & C, B & D) ∨ S(A & D & S(C, D), B & D) ∨ S(C & B & S(A, B), B & D))
3. S(A & U(C, D), B) ≡ (S(C & B & S(A, B & D), B) ∨ (C & S(A, B & D)) ∨ (U(C, D) & D & S(A, B & D)))
4. (¬S(A ∨ ¬A, ¬A) & S(C, D)) ≡ (S(A & D & S(C, D), D) & ¬S(A ∨ ¬A, ¬A))
5. S(A ∨ B, C) ≡ S(A, C) ∨ S(B, C)
6. HA ≡ ¬S(¬A, A ∨ ¬A)
7. S(A ∨ ¬A, A ∨ ¬A)
8. U(A ∨ ¬A, A ∨ ¬A)
9. ¬S(A ∨ ¬A, A & ¬A)
10. ¬S(A, B) ≡ (¬S(A, A ∨ B) ∨ ¬S(B ∨ ¬B, B) ∨ S(¬A & (¬B ∨ ¬S(B ∨ ¬B, B)), ¬B))

Rules of inference:
MP. ⊢A, ⊢A ⊃ B ⟹ ⊢B
Eq. ⊢A ≡ A′, ⊢B ⟹ ⊢B′, where A is a subformula of B and B′ results from replacing an occurrence of A in B by A′.
RM. Mirror-image rule for S and U (cp. Sect. II).

Def. P, H′, P′, F, G′, F′ as given above. The axioms 1-6 correspond to linear time, the axioms 1-6 together with 7 and 8 to linear non-beginning, non-ending time, axiom 9 to dense time, axiom 10 to complete time. Furthermore we assume that we have attached a system for predicate-logic, extended with Ax. Pl. (cp. Sect. VI), to I and II.
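By way of illustration (again a sketch of our own, in the style of the evaluator given for Kt in Sect. II): S and U can be evaluated directly over a finite linear frame. Note that on such a discrete frame the interval operators H′ and G′ trivialize, which is one way of seeing why density matters for the results below:

```python
# S(ince) and U(ntil) over a finite linear frame 0 < 1 < ... < N-1.

N = 5
VAL = {"p": {1}, "q": {1, 2, 3}}

def holds(a, t):
    return t in VAL[a]

def S(a, b, t):
    """S(a, b) at t: for some earlier s, a held at s and b held at every
    moment strictly between s and t."""
    return any(holds(a, s) and all(holds(b, r) for r in range(s + 1, t))
               for s in range(t))

def U(a, b, t):
    """U(a, b) at t: for some later s, a will hold at s and b will hold at
    every moment strictly between t and s."""
    return any(holds(a, s) and all(holds(b, r) for r in range(t + 1, s))
               for s in range(t + 1, N))

print(S("p", "q", 4))   # True: q has held since a moment at which p held
print(U("p", "q", 0))   # True: q will hold until a moment at which p holds
```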

VIII.

6. H′Δm₁a & m₁a

is true if a is closed now for the first time, after becoming more and more closed during some period. We can prove³ that
³ Throughout, proofs of theorems and lemmas can be found in the Appendix.


7. H′p ⊃ Pp

and thus

8. H′Δf₁x & f₁x ⊃ P¬f₁x & f₁x

can be inferred from axioms 1-6 and 9, for dense time (cp. Sect. VII). From P¬m₁x & m₁x ("x wasn't closed and is now closed") we can, by means of Pl., infer Ta(¬m₁x, m₁x), i.e. the contrast that, according to Forsyth, Barentsen and other grammarians, is implied by a sentence like "dver' zakrylas'ᵖ" ("the door closed"), and that, according to Russell, can be used to define change. But because

9. H′Δm₁x & m₁x

is stronger than P¬m₁x & m₁x, we are now able to express formally the difference between the proposition that a door, a, was closing during some period until now and is now, indeed, closed for the first time, and the proposition that a was closing during some period until now, still being open at the present moment. The former was expressed by (9), the latter we can express by

10. H′Δm₁a & ¬m₁a.

Although it is possible that a Russian sentence with a perfective-past verb form refers to present time, as in the following examples, this is not often the case:

11. "Umer!" vskriknul Kukuskin, brosajas' na koleni u ego krovati. "Umer." ("He is dead/he died!" shouted Kukuskin, falling on his knees at his bed. "He is dead/he died.") (B. N. Polevoj, Ex. Russ. Synt. II, 300.)
12. My pogibli (We are lost) (Bondarko 1967, 99)
13. I teper', poborov otorop', ja resil ... (And now, having fought my shyness, I decided to ...) (A. Terc, Pchenc.)

To (14) we can ascribe the following structure : 15. P(O(/ixn)&(PO(/jxm)vO(y5xin)), where ^) represents a formula in which /^ occurs.


So we can assume that

16. H′Δf₁x & f₁x
17. P(H′Δf₁x & f₁x)

as well as

18. (H′Δf₁x & f₁x) ∨ P(H′Δf₁x & f₁x)

are represented by forms like "zakrylas'" in surface structure. It is equally possible to assume that only (18) is represented by forms like "zakrylas'" in surface structure, because (18) is implied by (16) as well as by (17), which occur on some supposed deeper level, without direct representation in the surface structure. By means of the λ-operator (cf. Carnap 1959, 82ff., 129ff.) we define a predicate-forming operator on predicates, p, such that pf₁x is true if and only if H′Δf₁x & f₁x is true:

Def. p.: pf₁ =df (λx)(H′Δf₁x & f₁x).

Let us for the time being assume that

19. pf₁x ∨ P(pf₁x)

in the deep structure of Russian sentences is represented by perfective-past forms like "zakrylas'". From (19) we can easily infer

20. H′Δf₁x ∨ P(H′Δf₁x),

while on the other hand from (20) it is impossible to infer (19). If we assume furthermore that (20) occurs in the deep structure of Russian sentences with forms of imperfective-past verbs, like the imperfective-past correlate of "zakrylas'", i.e. "zakryvalas'", this corresponds to the situation that in Russian

21. *Dver' zakrylas'ᵖ, no ne zakryvalas'ⁱ (The door was closed, but didn't close)

is unacceptable, while

22. Dver' zakryvalas'ⁱ, no ne zakrylas'ᵖ (The door was closing, but wasn't closed finally)

is a normal sentence. We may therefore assume that (21) is unacceptable for a logical reason, but not ill-formed (in the sense of "not generated by the formation-rules").

23. Jesli dver' zakrylas'ᵖ, ona zakryvalas'ⁱ (If the door was closed, it was closing)

can therefore be considered as an instance of a tense-logical postulate for Russian.

IX.

Russian perfective verb-forms of verbs like "zakryt's'a" ("to close") with present-tense endings refer to the future, either denoting an event that is already taking place in the present and has its result in the (near) future, or an event that starts in the future and has its result in a more distant future. Examples:

24. Ja risujuⁱ portret moej staroj n'ani; kupiteᵖ li vy etot portret, kogda ja jego narisujuᵖ? (I am drawing the portrait of my old nanny; will you buy it when it is finished?)
25. Vse ze oni dumaliⁱ, cto Kutuzov dozivetᵖ do rassveta. (Nevertheless they thought that Kutuzov would live until dawn.)
26. Kogda vy im napiseteᵖ? My im napisemᵖ cerez tri nedeli. (When will you write them? We will write them in three weeks' time.) (Compare also Rassudova 1968, 93-94.)

This corresponds to a consequence of

27. Fpf₁x

in linear time. We first prove:

28. F(H′p & q) ⊃ (U(q, p) ∨ FU(q, p)),

and then:

29. Fpf₁x ⊃ (U(f₁x, Δf₁x) ∨ FU(f₁x, Δf₁x)).

Proof of (29): Subst. Δf₁x/p, f₁x/q in (28); Def. p. If we now define a dyadic proposition-forming operator on propositions, Qa ("and then", "and after that"), corresponding to Anscombe's Ta, in the following way:

Def. Qa: Qa(A, B) =df F(A & FB) ∨ (A & FB),

we can infer from axioms 1-6 and 9, for linear, dense time (cp. Sect. VII):
30.

(U(p, q) FU(p, q)) ID Qa(q, p)

Substitution of ^/q and^x/p in (30) gives 31. (U(/lX> /,) FU(/lX) /)) ^ (^(AfrJri.

From ,(^,^) we infer by PI, Lemma 9 and PL. 32. ,/,,^),

and so from (29) and (31) by Syll.:


33.

Tense-logic and the semantics of the russian aspects

169

So, if we assume that Fpf₁x occurs in the deep structure of Russian sentences with perfective-present verb-forms, the contrast of Barentsen, Forsyth and Russell, mentioned previously, can be inferred for linear, dense time. Furthermore, as we saw, U(f₁x, ¬f₁x) ∨ F(U(f₁x, ¬f₁x)) can be inferred from Fpf₁x. From U(f₁x, ¬f₁x) ∨ FU(f₁x, ¬f₁x) we can infer

34. G'¬f₁x

by Lemma 4 and Lemma 9. Conversely, from (34) we cannot infer Fpf₁x. Assuming that (34) occurs in the deep structure of Russian sentences with the imperfective correlate of perfective verbs with present-tense endings, e.g. "budet zakryvat's'a" ("will be closing"), the situation described above again corresponds to that of Russian:

35. *Dver' zakroets'aᵖ, no ne budet zakryvat's'aⁱ
The door will close, but it will not be closing

is unacceptable (this, as previously pointed out, for a logical reason), while

36. Dver' budet zakryvat's'aⁱ, no ne zakroets'aᵖ
The door will be closing, but it will not be closed

is perfectly acceptable.

X.

A negated perfective verb can often be replaced by the negated corresponding imperfective verb (Forsyth 1970, 102f.). This possibility is also accounted for by our assumed deep structure for perfective verbs.

37. pf₁x ≡ H'¬f₁x & f₁x (by Def. p.)

If one of the conjuncts of the right member of (37) is negated, then pf₁x is not true. Thus, if we assume H'¬f₁x to occur in the deep structure of imperfective verbs, the negation of H'¬f₁x suffices for the negation of pf₁x. On the other hand, as we saw, the negation of a perfective verb can mean that the result of the event described by the verb, i.e. f₁x, is not attained, while nevertheless H'¬f₁x was the case:

38. Ja dolgo ubezdalⁱ prepodavatel'nicu, cto v etom net nikakogo anarchizma, naoborot. Ubezdalⁱ, no ne ubedilᵖ.
I tried for a long time to convince the teacher that this was not a manifestation of anarchism, on the contrary. I tried to convince her, but I didn't succeed. (Erenburg, L'udi, gody, zizn'. From Forsyth 1970, 104.)
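Continuing the toy model above, the asymmetry of (37)-(38) can be reproduced: on a valuation where the result-state never comes about (the persuasion of (38) never succeeds), the perfective is false although H'¬f₁x, the conjunct carried by the imperfective, still holds. The valuation is again our own invention.

never = lambda t: False                   # the result f is never attained

assert not perfective(never)(5)           # "ne ubedil": the perfective fails,
assert H1(lambda t: not never(t), 5)      # but only in its result-conjunct;
                                          # H'¬f still holds ("ubezdal")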

XL

We have already assumed that expressions in which H'¬f₁x occurs play a role in the deep structure of Russian sentences with imperfective-past verb forms. We have seen that these forms are implied by the postulated expressions for the deep structure of perfective forms, so that it is impossible to state the perfective form but to deny the imperfective one. It is, however, not possible to replace the perfective form by the imperfective one in all contexts. Perfective forms are required when the verb has the function of a perfect and when a series of successive events is described (Forsyth 1970, 92f.). We will try to find a formal expression for these contexts. The perfect meaning of a Russian perfective verb form expresses that a situation, of which the beginning is described by the perfective-past verb, has been existing up to and including now.

39. On pol'ubilᵖ ee
He fell in love with her (and still is in love with her)/He loves her.

40. Moroz snova krepkij: podulᵖ severnyj veter
It's hard frost again (because) the north wind has got up. (Erenburg, Ottepel'.)

41. Ja zabylᵖ, gde on zivet
I forget where he lives

42. On s uma soselᵖ
He is mad

(examples from Forsyth 1970, loc. cit.)

As a formal expression of this perfect (for the group of verbs considered here) we propose:

43. f₁x & S(pf₁x, f₁x)

e.g.: "the door is closed now, and has been closed since it became closed". When perfective verbs are used to describe a series of successive events, each perfective verb describes an event that takes place in a new situation, the beginning of which is described by the preceding perfective verb. This situation can continue to exist after the new event has started, but it is equally well possible that it ceases therewith.

44. D'akon vstalᵖ, odels'aᵖ, vz'alᵖ svoju tolstuju sukovatuju palku i ticho vyselᵖ iz domu
The deacon got up, dressed, took his thick, rough stick and quietly left the house (Cechov, Duel'. Forsyth 1970, 65).

45. On otkrylᵖ dver', vyselᵖ, i zaperᵖ ee op'at'
He opened the door, went out, and closed it again (Forsyth 1970, 9).


As a general formal expression of such a sequence we propose:

46. P(φ(xₙ₊₁) & S(pfₙxₙ & S(pfₙ₋₁xₙ₋₁ & S(. . . & S(pf₁x₁, f₁x₁) . . .), fₙ₋₁xₙ₋₁), fₙxₙ)) ∨ (φ(xₙ₊₁) & S(pfₙxₙ & S(pfₙ₋₁xₙ₋₁ & S(. . . & S(pf₁x₁, f₁x₁) . . .), fₙ₋₁xₙ₋₁), fₙxₙ))

xⱼ may be identical with xⱼ₊₁. For the future we replace in (46) all occurrences of P by F, and of S by U, in accordance with RM. The formulation of (46) as a disjunction allows us to consider sequences of events of which the last one took place in the past, as well as sequences of events of which the last one takes place in the present (and which eventually goes on in the future). The presence of the expression φ(xₙ₊₁) allows for the possibility of an interruption or termination of the sequence of perfectives by an imperfective expression, as is often the case in Russian:

47. Cto i govorit', eto bylo ne samoe obrazcovoe otdelenie. Proderzaliᵖ nas tam minut sorok: kuda-to zvoniliⁱ, vyjasn'aliⁱ, trebovaliⁱ fotoplenku, i tol'ko posle aktivnych nasich ubezdenij . . . i dopol'nitel'nych zvonkov nas otpustiliᵖ i daze izvinilis'ᵖ.
I must say it wasn't a model police-station. They held us there for about forty minutes while they made phone-calls, asked questions, demanded the film from the camera. And it was only after our active persuasions . . . and further phone-calls that they let us go, and even apologised. (V. Nekrasov, Po obe storony okeana. Forsyth 1970, 65.)

We see now that (43) is a special case of the second member of the disjunction (46).
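The perfect reading (43) can be added to the toy model with a discrete "since" operator. As before, this is a sketch under our own discrete reading of the dense-time semantics, with invented names, reusing perfective() and closed from above.

def S(p, q, now):    # S(p, q): p held at some earlier instant, q ever since
    return any(p(t) and all(q(u) for u in range(t + 1, now + 1))
               for t in range(now))

def perfect(f):      # (43): fx & S(pfx, fx), i.e. "closed now, and closed
    pf = perfective(f)                     # ever since it became closed"
    return lambda now: f(now) and S(pf, f, now)

assert perfect(closed)(9)    # still closed at 9, and has been since 6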

XII.

Except for the expression of a gradual change (and certain other functions), the imperfective forms of Russian verbs can have two functions that stand in a relationship to one another, and to the perfect meaning of perfective verbs. The first one is the expression of a "two-way-action", the other that of a repeated action. The imperfective verb describing a two-way-action stipulates that the situation which came into being by the action described by the verb does not exist any more. This function of the imperfective thus contrasts with the perfect meaning of perfective-past verbs.


48. Vojd'a v komnatu on skazal tovariscu: Kak zdes' dusno! Ty by chot' otkrylᵖ okno. Da ja ego nedavno otkryvalⁱ.
When he entered the room he said to his friend: "How stuffy it is in here! You might at least have opened the window". "But I did open it (have it open) not long ago". (Forsyth 1970, 78.)

49. Prochod'a mimo nee on snimalⁱ sl'apu.
As he passed her he raised (i.e. took off and put on again) his hat.

Compare (49) to (50), in which the corresponding perfective form of snimalⁱ, sn'alᵖ, occurs:

50. Vstretiv ee, on sn'alᵖ sl'apu i skazal . . .
When he met her he took off his hat and said (still with his hat off) . . . (Forsyth 1970, loc. cit.)

In our formalism, and for the group of verbs considered here, we can express this meaning of the imperfective as follows:

51. (Ppf₁x & H'¬f₁x) ∨ P(Ppf₁x & H'¬f₁x)

In dense, linear time:

52. H'¬f₁x ⊃ ¬S(pf₁x, f₁x)
53. ¬(f₁x & S(pf₁x, f₁x))
54. P(Ppf₁x & ¬(f₁x & S(pf₁x, f₁x))) ∨ (Ppf₁x & ¬(f₁x & S(pf₁x, f₁x))).

H'¬f₁x implies ¬S(pf₁x, f₁x), by (52). From ¬S(pf₁x, f₁x) we infer (53) by PL. So we can infer (54) from (51) by PL., Lemma 9 and RM. (53) is the negation of (43), which we proposed as the formal expression of the perfect meaning of the perfective-past. A repetition of a proposition, p, being true at different moments can, in Anscombe's formalism, be expressed as follows:

Ta(Ta(p, ¬p), p), . . . etc.

or

Ta(Ta(Ta(¬p, p), ¬p), . . .), . . . etc.

Ppf₁x & H'¬f₁x implies Ta(Ta(¬f₁x, f₁x), ¬f₁x), i.e. a repetition of ¬f₁x.

55. Ppf₁x & H'¬f₁x ⊃ Ta(Ta(¬f₁x, f₁x), ¬f₁x)

This means that, if we assume that (51) occurs in the deep structure of Russian sentences with imperfective-past forms, such as those of "zakryvat's'a" ("to close"), which denote a "two-way-action", then we can infer the repetition that, as may appear from the examples, is implied by this function of the imperfective, given the axioms for linear, dense time.

XIII.


The other function of imperfective verb forms we mentioned was the expression of repeated action (the iterative). An imperfective verb that expresses an iterative can be considered as a repetition of perfectives:

56. Kazdyj den' on vypivalⁱ pered obedom r'umku vodki
Every day before lunch he drank a glass of vodka
i.e.: v ponedel'nik vypilᵖ, vo vtornik vypilᵖ, ... on Monday he drank one, on Tuesday he drank one . . . (Forsyth 1970, 164).

57. Kazdyj den' on zakryvalⁱ okno
He closed the window every day
i.e.: v ponedel'nik zakrylᵖ, vo vtornik zakrylᵖ ... he closed it on Monday, on Tuesday . . . etc.

We can infer the repetition Ta(Ta(f₁x, ¬f₁x), f₁x) without any new axioms if we express this function of the imperfective by a repetition of a perfective verb:

58. P(pf₁x & Ppf₁x) ∨ (pf₁x & Ppf₁x) ⊃ Ta(Ta(f₁x, ¬f₁x), f₁x)

Forms of the future tense of imperfective verbs can also have an iterative meaning:

59. Kazdyj den' on budet vypivat'ⁱ pered obedom r'umku vodki
Every day before lunch he shall drink a glass of vodka
i.e.: v ponedel'nik vyp'etᵖ, vo vtornik vyp'etᵖ ... on Monday he shall drink one, on Tuesday he shall drink one, ...

60. Kazdyj den' on budet zakryvat'ⁱ okno
Every day he shall close the window
i.e.: v ponedel'nik zakroetᵖ, vo vtornik zakroetᵖ, . . . he shall close it on Monday, on Tuesday, . . .

We can infer that Qa(Qa(f₁x, ¬f₁x), f₁x), if we represent this meaning of the imperfective future by a future repetition of perfectives, i.e. if a perfective will twice or more be the case:

61. F(pf₁x & Fpf₁x) ⊃ Qa(Qa(f₁x, ¬f₁x), f₁x)

As to the semantical relationship between the perfective and the imperfective (of the group of verbs considered) these results mean that we can suppose that in surface structure an imperfective verb form occurs when in the deep structure an expression occurs that implies ¬f₁x, P¬f₁x or F¬f₁x, but not "¬f₁x and then f₁x", or a repetition of ¬f₁x or of f₁x, whereas the perfective occurs in the surface structure when in the deep structure an expression occurs that implies "¬f₁x and then f₁x", but not a repetition of f₁x or of ¬f₁x.
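The iterative readings (58) and (61) can likewise be checked in the toy model, once the future operators and Anscombe-style "and then" are added. The cyclic valuation below (a closing repeated every third instant, as in (59)-(60)) is our own; the definitions mirror Def. Qa above.

def F(f, now):                      # Ff: f will hold at some later instant
    return any(f(t) for t in range(now + 1, T + 1))

def Qa(a, b):                       # Def. Qa: Qa(A, B) = F(A & FB) v (A & FB)
    then = lambda now: a(now) and F(b, now)
    return lambda now: then(now) or F(then, now)

f_iter = lambda t: t % 3 == 0       # hypothetical repeated closings
pf_iter = perfective(f_iter)
not_f = lambda t: not f_iter(t)

now = 0
if F(lambda t: pf_iter(t) and F(pf_iter, t), now):   # a future repetition of pf
    assert Qa(Qa(f_iter, not_f), f_iter)(now)        # yields (61)'s consequent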

APPENDIX

Proofs of theorems.

Proof of 8. H'¬f₁x & f₁x ⊃ P¬f₁x & f₁x:

Lemma 1. S(p, q) ⊃ S(p, q ∨ ¬q)
Proof of Lemma 1.:
1. ¬(S(p, q) & S(p, q ∨ ¬q)) ≡ ¬(S(p & p, q & (q ∨ ¬q)) ∨ S(p & (q ∨ ¬q) & S(p, q ∨ ¬q), q & (q ∨ ¬q)) ∨ S(p & q & S(p, q), q & (q ∨ ¬q))) (PL, ax. 2, subst.)
2. ¬(S(p, q) & S(p, q ∨ ¬q)) ≡ ¬(S(p, q) ∨ S(p & S(p, q ∨ ¬q), q) ∨ S(p & q & S(p, q), q)) (1. PL, Eq.)
3. ¬(S(p, q) & S(p, q ∨ ¬q)) ≡ ¬S(p, q) & ¬S(p & S(p, q ∨ ¬q), q) & ¬S(p & q & S(p, q), q) (2. DeM.)
4. ¬(S(p, q) & S(p, q ∨ ¬q)) ⊃ ¬S(p, q) (3. PL, Sep.)
5. S(p, q) ⊃ S(p, q) & S(p, q ∨ ¬q) (4. Contrap.)
6. S(p, q) & S(p, q ∨ ¬q) ⊃ S(p, q ∨ ¬q) (5. PL)
7. S(p, q) ⊃ S(p, q ∨ ¬q) (6. PL)

Lemma 2. H(p ⊃ q) ⊃ (H'p ⊃ H'q)
Proof of Lemma 2.:
1. ¬S(p ∨ ¬p, ¬q) & S(p ∨ ¬p, p) ⊃ S(¬q & p & S(p ∨ ¬p, p), p) (ax. 4, Sep.)
2. S(¬q & p & S(p ∨ ¬p, p), p) ⊃ S(¬q & p, p) & S(S(p ∨ ¬p, p), p) (1., ax. 2, PL)
3. ¬S(p ∨ ¬p, ¬q) & S(p ∨ ¬p, p) ⊃ S(¬q & p, p) (1, 2, Syll., Sep.)
4. S(¬q & p, p) ⊃ S(¬q & p, p ∨ ¬p) (Lemma 1., subst.)
5. ¬S(p ∨ ¬p, ¬q) & S(p ∨ ¬p, p) ⊃ S(¬q & p, p ∨ ¬p) (3, 4, Syll.)
6. ¬S(¬(¬p ∨ q), p ∨ ¬p) ⊃ ¬S(p ∨ ¬p, p) ∨ S(p ∨ ¬p, q) (5., Contrap., DeM.)
7. H(p ⊃ q) ⊃ (H'p ⊃ H'q) (6., Def. H, Def. H', Def. ⊃)

Lemma 3. H'p ⊃ Pp.
Proof of Lemma 3.:
1. ¬S(p ∨ ¬p, ¬(p ∨ ¬p)) & S(p ∨ ¬p, p) ⊃ S(¬¬(p ∨ ¬p) & p & S(p ∨ ¬p, p), p) (ax. 4.)
2. S(p ∨ ¬p, p) ≡ S((p ∨ ¬p) & p & S(p ∨ ¬p, p), p) (1. ax. 9, Eq., PL)
3. S(p ∨ ¬p, p) ⊃ S(p & S(p ∨ ¬p, p), p) (2. PL, Eq.)
4. S(p & S(p ∨ ¬p, p), p) ⊃ S(p, p) & S(S(p ∨ ¬p, p), p) (ax. 2, PL)
5. S(p ∨ ¬p, p) ⊃ S(p, p) & S(S(p ∨ ¬p, p), p) (3, 4, Syll.)
6. S(p ∨ ¬p, p) ⊃ S(p, p) (5. Sep.)
7. S(p, p) ⊃ S(p, p ∨ ¬p) (Lemma 1.)
8. S(p ∨ ¬p, p) ⊃ S(p, p ∨ ¬p) (6, 7, Syll.)
9. H'p ⊃ Pp (8. Def. H', P)

Proof of 8. H'¬f₁x & f₁x ⊃ P¬f₁x & f₁x:
1. ¬f₁x ⊃ ¬f₁x (PL)
2. H(¬f₁x ⊃ ¬f₁x) (1. ax. 1, ax. 6)
3. H'¬f₁x ⊃ H'¬f₁x (2. Lemma 2., MP.)
4. H'¬f₁x ⊃ P¬f₁x (Lemma 3.)
5. H'¬f₁x ⊃ P¬f₁x (3, 4, Syll.)
6. H'¬f₁x & f₁x ⊃ P¬f₁x & f₁x (5. PL) Q.E.D.

Proof of 28. F(q & H'p) ⊃ FU(q, p) ∨ U(q, p):
1. U(q & S(p ⊃ p, p), p ⊃ p) ≡ U((p ⊃ p) & (p ⊃ p) & U(q, p ⊃ p), p ⊃ p) ∨ ((p ⊃ p) & U(q, (p ⊃ p) & p)) ∨ (S(p ⊃ p, p) & p & U(q, (p ⊃ p) & p)) (ax. 3)
2. U(q & S(p ⊃ p, p), p ⊃ p) ≡ U(U(q, p), p ⊃ p) ∨ U(q, p) ∨ (S(p ⊃ p, p) & p & U(q, p)) (1. PL, Eq.)
3. U(q & S(p ⊃ p, p), p ⊃ p) ⊃ U(U(q, p), p ⊃ p) ∨ U(q, p) ∨ U(q, p) (2. PL, Eq.)
4. U(q & S(p ⊃ p, p), p ⊃ p) ⊃ U(U(q, p), p ⊃ p) ∨ U(q, p) (3. PL.)
5. F(q & H'p) ⊃ FU(q, p) ∨ U(q, p) (4. Def. F, H') Q.E.D.

Proof of 30. (U(p, q) ∨ FU(p, q)) ⊃ Qa(q, p):

Lemma 4. U(p, q) ⊃ U(p ∨ ¬p, q)
Proof of Lemma 4.:
1. ¬U(p ∨ ¬p, q) ≡ ¬U(p, q) & ¬U(¬p, q) (ax. 5, DeM.)
2. ¬U(p ∨ ¬p, q) ⊃ ¬U(p, q) (1. PL., Sep.)
3. U(p, q) ⊃ U(p ∨ ¬p, q) (2. Contrap.)

Lemma 5. G'p & G'q ⊃ G'(p & q)
Proof of Lemma 5.:
1. U(p ∨ ¬p, p) & U(p ∨ ¬p, q) ≡ U(p ∨ ¬p, p & q) ∨ U((p ∨ ¬p) & q & U(p ∨ ¬p, q), p & q) ∨ U((p ∨ ¬p) & p & U(p ∨ ¬p, p), p & q) (ax. 2, Eq.)
2. U(p ∨ ¬p, p) & U(p ∨ ¬p, q) ⊃ U(p ∨ ¬p, p & q) ∨ U(p ∨ ¬p, p & q) ∨ U(p ∨ ¬p, p & q) (1. Lemma 4, PL.)
3. U(p ∨ ¬p, p) & U(p ∨ ¬p, q) ⊃ U(p ∨ ¬p, p & q) ∨ U(p ∨ ¬p, p & q) (2. Eq.)
4. U(p ∨ ¬p, p) & U(p ∨ ¬p, q) ⊃ U(p ∨ ¬p, p & q) (3. PL.)
5. G'p & G'q ⊃ G'(p & q) (4. Def. G', Eq.)


Lemma 6. Fp ⊃ G'Fp
Proof of Lemma 6.:
1. (¬U(p ∨ ¬p, ¬¬U(p, p ∨ ¬p)) & U(p, p ∨ ¬p)) ≡ (U(¬U(p, p ∨ ¬p) & (p ∨ ¬p) & U(p, p ∨ ¬p), p ∨ ¬p) & ¬U(p, ¬¬U(p, p ∨ ¬p))) (ax. 4, Subst.) (RM.)
2. (¬U(p ∨ ¬p, ¬¬U(p, p ∨ ¬p)) & U(p, p ∨ ¬p)) ⊃ U(¬U(p, p ∨ ¬p) & (p ∨ ¬p) & U(p, p ∨ ¬p), p ∨ ¬p) (1. PL., Sep.)
3. (¬U(p ∨ ¬p, ¬¬U(p, p ∨ ¬p)) & U(p, p ∨ ¬p)) ⊃ U(¬U(p, p ∨ ¬p) & U(p, p ∨ ¬p), p ∨ ¬p) (2. PL., Eq.)
4. ¬U(¬U(p, p ∨ ¬p) & U(p, p ∨ ¬p), p ∨ ¬p) (ax. 1., Subst.)
5. ¬(¬U(p ∨ ¬p, ¬¬U(p, p ∨ ¬p)) & U(p, p ∨ ¬p)) (3, 4, Contrap., MP.)
6. U(p ∨ ¬p, ¬¬U(p, p ∨ ¬p)) ∨ ¬U(p, p ∨ ¬p) (5. DeM.)
7. U(p, p ∨ ¬p) ⊃ U(p ∨ ¬p, ¬¬U(p, p ∨ ¬p)) (6. Def. ⊃)
8. Fp ⊃ G'Fp (7. Def. F, G')

Lemma 7. FFp ⊃ Fp
Proof of Lemma 7.:
1. U((p ∨ ¬p) & (p ∨ ¬p) & U(p, p ∨ ¬p), (p ∨ ¬p) & (p ∨ ¬p)) ⊃ U(p ∨ ¬p, p ∨ ¬p) & U(p, p ∨ ¬p) (ax. 2, PL.) (RM)
2. U((p ∨ ¬p) & U(p, p ∨ ¬p), p ∨ ¬p) ⊃ U(p, p ∨ ¬p) (1. Sep.)
3. U(U(p, p ∨ ¬p), p ∨ ¬p) ⊃ U(p, p ∨ ¬p) (2. Eq.)
4. FFp ⊃ Fp (3. Def. F.)

Lemma 8. U(p, q) ⊃ F(q & Fp)
Proof of Lemma 8.:
1. U(p, q) ⊃ Fp (Lemma 1, RM)
2. G'(q & Fp) ⊃ F(q & Fp) (Lemma 3, RM)
3. U(p, q) ⊃ G'q (Lemma 4, RM)
4. Fp ⊃ G'Fp (Lemma 6)
5. U(p, q) ⊃ G'q & G'Fp (1, 3, 4, PL)
6. (G'q & G'Fp) ⊃ G'(q & Fp) (Lemma 5)
7. U(p, q) ⊃ F(q & Fp) (5, 6, 2, Syll.)


Lemma 9. G(p ⊃ q) ⊃ (Fp ⊃ Fq)
Proof of Lemma 9.:
1. ¬(U(¬(¬p ∨ q), p ∨ ¬p) ∨ U(q, p ∨ ¬p)) ≡ ¬U(¬(¬p ∨ q) ∨ q, p ∨ ¬p) (ax. 5., Eq.) (RM)
2. ¬U(¬(¬p ∨ q) ∨ q, p ∨ ¬p) ≡ ¬U((p & ¬q) ∨ q, p ∨ ¬p) (DeM., Eq.)
3. ¬U((p & ¬q) ∨ q, p ∨ ¬p) ≡ ¬U((p ∨ q) & (q ∨ ¬q), p ∨ ¬p) (Distr., Eq.)
4. ¬U((p ∨ q) & (q ∨ ¬q), p ∨ ¬p) ≡ ¬U(p ∨ q, p ∨ ¬p) (PL., Eq.)
5. ¬U(p ∨ q, p ∨ ¬p) ≡ ¬U(p, p ∨ ¬p) & ¬U(q, p ∨ ¬p) (ax. 5, DeM.)
6. ¬U(p ∨ q, p ∨ ¬p) ⊃ ¬U(p, p ∨ ¬p) (5. Sep.)
7. ¬(U(¬(¬p ∨ q), p ∨ ¬p) ∨ U(q, p ∨ ¬p)) ⊃ ¬U(p, p ∨ ¬p) (1, 6, Syll.)
8. ¬(¬U(¬(¬p ∨ q), p ∨ ¬p) ⊃ U(q, p ∨ ¬p)) ⊃ ¬U(p, p ∨ ¬p) (7, Def. ⊃)
9. ¬(G(p ⊃ q) ⊃ Fq) ⊃ ¬Fp (8. Def. G, F, Def. ⊃)
10. G(p ⊃ q) ⊃ (Fp ⊃ Fq) (9. PL.)

Proof of 30.
1. G(U(p, q) ⊃ F(q & Fp)) (Lemma 8, ax. 1, RM, Def. G.)
2. FU(p, q) ⊃ FF(q & Fp) (Lemma 9, 1., MP.)
3. FU(p, q) ⊃ F(q & Fp) (2. Lemma 7.)
4. U(p, q) ∨ FU(p, q) ⊃ F(q & Fp) (3., Lemma 8, PL.)
5. U(p, q) ∨ FU(p, q) ⊃ F(q & Fp) ∨ (q & Fp) (4. PL.)
6. U(p, q) ∨ FU(p, q) ⊃ Qa(q, p) (5. Def. Qa) Q.E.D.

Proof of 52. H'¬f₁x ⊃ ¬S(pf₁x, f₁x):

Lemma 10. H'q ⊃ ¬S(p ∨ ¬p, ¬q)
Proof of Lemma 10.:
1. S(p ∨ ¬p, q) & S(p ∨ ¬p, ¬q) ≡ S((p ∨ ¬p) & (p ∨ ¬p), q & ¬q) ∨ S((p ∨ ¬p) & ¬q & S(p ∨ ¬p, ¬q), q & ¬q) ∨ S((p ∨ ¬p) & q & S(p ∨ ¬p, q), q & ¬q) (ax. 2)
2. S(p ∨ ¬p, q) & S(p ∨ ¬p, ¬q) ⊃ S(p ∨ ¬p, q & ¬q) ∨ S(p ∨ ¬p, q & ¬q) ∨ S(p ∨ ¬p, q & ¬q) (1. Lemma 4, Eq.)
3. S(p ∨ ¬p, q) & S(p ∨ ¬p, ¬q) ⊃ S(p ∨ ¬p, q & ¬q) (2. PL.)
4. ¬S(p ∨ ¬p, q & ¬q) ⊃ ¬(S(p ∨ ¬p, q) & S(p ∨ ¬p, ¬q)) (3. Contrap.)
5. ¬S(p ∨ ¬p, q & ¬q) (ax. 9.)
6. ¬(S(p ∨ ¬p, q) & S(p ∨ ¬p, ¬q)) (4, 5, MP.)
7. H'q ⊃ ¬S(p ∨ ¬p, ¬q) (6. DeM., Def. ⊃, Def. H'.)


(Proof of 52.:)
1. S(p, q) ⊃ H'q (Lemma 4.)
2. H'q ⊃ P'q (Lemma 10, Def. P'.)
3. S(p, q) ⊃ P'q (1, 2, Syll.)
4. ¬P'q ⊃ ¬S(p, q) (3, Contrap.)
5. H'¬q ⊃ ¬S(p, q) (4, Def. H'.)
6. H'¬f₁x ⊃ ¬S(pf₁x, f₁x) (5. Subst.) Q.E.D.

Proof of 55. H'¬f₁x & Ppf₁x ⊃ Ta(Ta(¬f₁x, f₁x), ¬f₁x):

Lemma 11. (H'p & Pq) ⊃ P(p & Pq)
Proof of Lemma 11.:
1. H'p & Pq ⊃ H'p & H'Pq (PL., RM., Lemma 6.)
2. H'p & H'Pq ⊃ H'(p & Pq) (Lemma 5, RM.)
3. H'(p & Pq) ⊃ P(p & Pq) (Lemma 3.)
4. H'p & Pq ⊃ P(p & Pq) (1, 2, 3, Syll.)

(Proof of 55.:)
1. H'¬f₁x & Ppf₁x ⊃ P(¬f₁x & Ppf₁x) (Lemma 11, Subst.)
2. P(¬f₁x & Ppf₁x) ≡ P(¬f₁x & P(f₁x & H'¬f₁x)) (Def. p.)
3. P(¬f₁x & P(f₁x & H'¬f₁x)) ⊃ P(¬f₁x & P(f₁x & P¬f₁x)) (PL, Lemma 3, Lemma 2, Lemma 9, RM.)
4. H'¬f₁x & Ppf₁x ⊃ Ta(Ta(¬f₁x, f₁x), ¬f₁x) (1, 2, 3, Syll., Def. Ta) Q.E.D.

Proof of 58. (pf₁x & Ppf₁x) ∨ P(pf₁x & Ppf₁x) ⊃ Ta(Ta(f₁x, ¬f₁x), f₁x):

Lemma 12. P(q & r) ⊃ (Pq & Pr)
Proof of Lemma 12.:
1. S(p, r) & S(q, r) ≡ S(p & q, r) ∨ S(p & r & S(q, r), r) ∨ S(q & r & S(p, r), r) (ax. 2.)
2. S(p & q, r) ⊃ S(p, r) & S(q, r) (1. PL.)
3. S(q & r, p ∨ ¬p) ⊃ S(q, p ∨ ¬p) & S(r, p ∨ ¬p) (2. Subst.)
4. P(q & r) ⊃ (Pq & Pr) (3. Def. P.)

(Proof of 58.:)

1. pf₁x & Ppf₁x ≡ (H'¬f₁x & f₁x) & P(H'¬f₁x & f₁x) (Def. p)
2. (H'¬f₁x & f₁x) & P(H'¬f₁x & f₁x) ⊃ (H'¬f₁x & f₁x) & P(f₁x & P¬f₁x) (PL, Lemma 2, Lemma 9, RM.)
3. (H'¬f₁x & f₁x) & P(f₁x & P¬f₁x) ⊃ H'¬f₁x & f₁x & Pf₁x & PP¬f₁x (Lemma 12, Sep.)
4. H'¬f₁x & f₁x & Pf₁x & PP¬f₁x ⊃ H'¬f₁x & Pf₁x & f₁x (Lemma 12, Sep.)
5. H'¬f₁x & Pf₁x & f₁x ⊃ P(¬f₁x & Pf₁x) & f₁x (Lemma 11.)
6. f₁x & P(¬f₁x & Pf₁x) ⊃ Ta(Ta(f₁x, ¬f₁x), f₁x) (PL, Def. Ta)
7. P(pf₁x & Ppf₁x) ⊃ Ta(Ta(f₁x, ¬f₁x), f₁x) (Lemma 9, RM, Lemma 7, PL.)
8. (pf₁x & Ppf₁x) ∨ P(pf₁x & Ppf₁x) ⊃ Ta(Ta(f₁x, ¬f₁x), f₁x) (1–7, PL.) Q.E.D.

Proof of 61. F(pf₁x & Fpf₁x) ⊃ Qa(Qa(f₁x, ¬f₁x), f₁x):

Lemma 13. F(q & H'p) ⊃ F(p & Fq)
Proof of Lemma 13.:
1. F(H'p & q) ⊃ U(q, p) ∨ FU(q, p) (28.)
2. U(q, p) ∨ FU(q, p) ⊃ F(p & Fq) ∨ FF(p & Fq) (Lemma 8., Lemma 9.)
3. F(p & Fq) ∨ FF(p & Fq) ⊃ F(p & Fq) (Lemma 7., PL.)
4. F(H'p & q) ⊃ F(p & Fq) (1, 2, 3, Syll.)

Proof of 61.:
1. F(pf₁x & Fpf₁x) ≡ F((H'¬f₁x & f₁x) & F(H'¬f₁x & f₁x)) (Def. p.)
2. F((H'¬f₁x & f₁x) & F(H'¬f₁x & f₁x)) ⊃ F(f₁x & F(H'¬f₁x & f₁x)) (PL, Lemma 9.)
3. F(f₁x & F(H'¬f₁x & f₁x)) ⊃ F(f₁x & F(¬f₁x & Ff₁x)) (Lemma 12, RM., Lemma 7, Lemma 13.)
4. F(f₁x & F(¬f₁x & Ff₁x)) ⊃ F((f₁x & F¬f₁x) & Ff₁x) (PL., Lemma 7, Lemma 9, RM.)
5. F(pf₁x & Fpf₁x) ⊃ Qa(Qa(f₁x, ¬f₁x), f₁x) (1–4, Syll., Def. Qa.) Q.E.D.


References
ANSCOMBE, G. E. M. (1964), 'Before and After.' Philosophical Review 73:1, 3–24.
BARENTSEN, A. (1971), 'K opisaniju semantiki kategorii "vid" i "vrem'a" (Na materiale sovremennogo russkogo literaturnogo jazyka).' Amsterdam. Unpubl.
BONDARKO, A. V. and BULANIN, L. L. (1967), 'Russkij glagol. Posobije dl'a studentov i ucitelej; pod red. Ju. S. Maslova.' Leningrad.
CARNAP, R. (1958), 'Introduction to Symbolic Logic and its Applications.' New York: Dover.
CLIFFORD, J. (1966), 'Tense logic and the logic of change.' Logique et Analyse 34, 219–230.
COOPER, N. (1966), 'Scale-words.' Analysis 27 (1966–1967), 153–159.
FORSYTH, J. (1970), 'A Grammar of Aspect. Usage and Meaning of the Russian Verb.' Cambridge: Cambridge University Press.
JESPERSEN, O. (1924), 'The Philosophy of Grammar.' London: Allen and Unwin.
KAMP, J. A. W. (1968), 'Tense logic and the theory of linear order.' Diss. Univ. of California.
KRABBE, E. (1972), 'Propositionele tijdslogica.' Amsterdam. Unpubl.
LAKOFF, G. (1970), 'Linguistics and natural logic.' Synthese 22, 151–271.
POTTS, T. (1969), 'The logical description of changes which take time.' (Abstract.) Journal of Symbolic Logic 34, 537.
PRIOR, A. (1967), 'Past, Present and Future.' Oxford: Clarendon Press.
RASSUDOVA, O. P. (1968), 'Upotreblenie vidov glagola v russkom jazyke.' Moskva.
REICHENBACH, H. (1947), 'Elements of Symbolic Logic.' London: Collier-Macmillan.
RUSSELL, B. (1903), 'Principles of Mathematics.' Cambridge: At the University Press.
VERKUYL, H. (1971), 'On the compositional nature of the aspects.' Diss. Utrecht.
VON WRIGHT, G. H. (1963), 'Norm and Action, a Logical Inquiry.' London: Routledge and Kegan Paul.
VON WRIGHT, G. H. (1965), 'And Next.' Acta Philosophica Fennica, Fasc. 18.

LAURI KARTTUNEN

PRESUPPOSITION AND LINGUISTIC CONTEXT*

According to a pragmatic view, the presuppositions of a sentence determine the class of contexts in which the sentence could be felicitously uttered. Complex sentences present a difficult problem in this framework. No simple "projection method" has been found by which we could compute their presuppositions from those of their constituent clauses. This paper presents a way to eliminate the projection problem. A recursive definition of "satisfaction of presuppositions" is proposed that makes it unnecessary to have any explicit method for assigning presuppositions to compound sentences. A theory of presuppositions becomes a theory of constraints on successive contexts in a fully explicit discourse.

What I present here is a sequel to a couple of my earlier studies on presuppositions. The first one is the paper "Presuppositions of Compound Sentences" (Karttunen 1973a), the other is called "Remarks on Presuppositions" (Karttunen 1973b). I won't review these papers here, but I will start by giving some idea of the background for the present paper. Earlier I was concerned about two things. First, I wanted to show that there was no adequate notion of presupposition that could be defined in purely semantic terms, that is, in terms of truth conditions. What was needed was a pragmatic notion, something along the lines Stalnaker (1972) had suggested, but not a notion of the speaker's presupposition. I had in mind some definition like the one given under (1).

(1) Surface sentence A pragmatically presupposes a logical form L, if and only if it is the case that A can be felicitously uttered only in contexts which entail L.
* Presented at the 1973 Winter Meeting of the Linguistic Society of America in San Diego. This work was supported in part by the 1973 Research Workshop on Formal Pragmatics of Natural Language, sponsored by the Mathematical Social Science Board. I acknowledge with special gratitude the contributions of Stanley Peters to my understanding of the problems in this paper. Any remaining confusions are my own.


The main point about (1) is that presupposition is viewed as a relation between sentences, or more accurately, as a relation between a surface sentence and the logical form of another.1 By "surface sentence" I mean expressions of a natural language as opposed to sentences of a formal language which the former are in some manner associated with. "Logical forms" are expressions of the latter kind. "Context" in (1) means a set of logical forms that describe the set of background assumptions, that is, whatever the speaker chooses to regard as being shared by him and his intended audience. According to (1), a sentence can be felicitously uttered only in contexts that entail all of its presuppositions. Secondly, I argued that, if we look at things in a certain way, presupposition turns out to be a relative notion for compound sentences. The same sentence may have different presuppositions depending on the context in which it is uttered. To see what this means, let us use "X" as a variable for contexts (sets of logical forms), let "A" and "B" stand for (surface) sentences, and let "P_A" and "P_B" denote the sets of logical forms presupposed by A and B, respectively. Let us assume that A and B in this instance are simple sentences that contain no quantifiers and no sentential connectives. Furthermore, let us assume that we know already what A and B presuppose, that is, we know the elements of P_A and P_B. Given all that, what can we say about the presuppositions of complex sentences formed from A and B by means of embedding and sentential connectives? This is the notorious "projection problem" for presuppositions (Morgan 1969, Langendoen & Savin 1971). For instance, what are the presuppositions of "If A then B"? Intuitively it would seem that sentential connectives such as if ... then do not introduce any new presuppositions. Therefore, the set P_If A then B should be either identical to or at least some proper subset of the combined presuppositions of A and B. This initially simple idea is presented in (2).

(2) P_If A then B ⊆ P_A ∪ P_B

However, I found that when one pursues this line of inquiry further, things become very complicated. Consider the examples in (3).

(3) (a) If Dean told the truth, Nixon is guilty too.
(b) If Haldeman is guilty, Nixon is guilty too.
(c) If Miss Woods destroyed the missing tapes, Nixon is guilty too.

In all of these cases, let us assume that the consequent clause "Nixon is guilty too" is interpreted in the sense in which it presupposes the guilt of someone else. The question is: does the compound sentence as a whole carry that presupposition? In the case of (3a), the answer seems to be definitely yes, in the case
1 There is some question over whether this notion of presupposition is properly labeled "pragmatic". For Stalnaker (1972, 1973), pragmatic presupposing is a propositional attitude of the speaker. However, I will follow Thomason (1973) and others who would like to reserve the term "presupposes" for relations (semantic or pragmatic) between sentences. The idea that it is important to distinguish in this connection between surface sentences and their logical forms is due to Lakoff (1972, 1973).


of (3b) definitely no, and in the case of (3c) a maybe, depending on the context in which the sentence is used. For example, if the destruction of the tapes is considered a crime, then Miss Woods would be guilty in case she did it, and (3c) could be a conditional assertion that Nixon was an accomplice. In this context the sentence does not presuppose that anyone is guilty. But in contexts where the destruction of the tapes in itself would not constitute a crime (3c) apparently does presuppose the guilt of someone other than Nixon. These examples show that if we try to determine the presuppositions of "If A then B" as a particular subset of the joint presuppositions of A and B, the initial simplicity of that idea turns out to be deceptive. In reality it is a very complicated enterprise. The kind of recursive principle that seems to be required is given in (4a) in the form it appears in Karttunen (1973b). (4b) says the same in ordinary English.
(4) (a) P_If A then B/X = P_A/X ∪ (P_B/X∪A − (E_X∪A − E_X))

where E_X is the set of logical forms entailed (in the standard sense) by X, and X∪A is the result of adding the logical form of A to X.

(b) The presuppositions of "If A then B" (with respect to context X) consist of (i) all of the presuppositions of A (with respect to X) and (ii) all of the presuppositions of B (with respect to X∪A) except for those entailed by the set X∪A and not entailed by X alone.

One would like to find a better way to express this, but I am not sure there is one.2 It really is a complicated question. So much for the background. What I want to show now is that there is another way to think about these matters, and about presuppositions of complex sentences in particular. Let us go back for a moment to the attempted pragmatic definition in (1). The point of that definition is that the presuppositions of a sentence determine in what contexts the sentence could be felicitously used. A
2 Peters has pointed out to me that, under certain conditions, (4a) is equivalent to the following projection principle:

P_If A then B = P_A ∪ {⌜A ⊃ C⌝ | C ∈ P_B}

Peters' principle has the advantage that it assigns the same set of presuppositions to "If A then B" irrespective of any context. Note that this set is not a subset of P_A ∪ P_B, as required by my initial assumption in (2). Peters' principle says that, for each presupposition of B, "If A then B" presupposes a conditional with that presupposition as the consequent and the logical form of A as the antecedent. In addition, "If A then B" has all of the presuppositions of A. I realize now that some of the complexity in (4a) comes from trying to state the principle in such a way that (2) holds. If this is not worth doing, Peters' way of formulating the rule is superior to mine. However, in the following I will argue that we can just as well do without any explicit projection method at all, hence the choice is not crucial.


projection method, such as (4a), associates a complex sentence with a class of such contexts by compiling a set of logical forms that must be entailed in any context where it is proper to use the sentence. Thus we say that the sentence "If A then B" can be felicitously uttered in context X only if X entails all of the logical forms in the set P_If A then B/X defined in (4a). There is another, much simpler, way to associate complex sentences with proper contexts of use. Instead of characterizing these contexts by compiling the presuppositions of the sentence, we ask what a context would have to be like in order to satisfy those presuppositions. Of course, it is exactly the same problem but, by turning it upside down, we get a surprisingly simple answer. The reason is that we can answer the latter question directly, without having to compute what the presuppositions actually are. The way we go about this is the following. We start by defining, not presupposition, but a notion of satisfaction of presuppositions. This definition is based on the assumption that we can give a finite list of basic presuppositions for each simple sentence of English. For all cases where A is a simple, non-compound sentence, satisfaction is defined as in (5).

(5) Context X satisfies-the-presuppositions-of A just in case X entails all of the basic presuppositions of A (that is, P_A ⊆ E_X).

The basic presuppositions of a simple sentence presumably can be determined from the lexical items in the sentence and from its form and derivational history, say, the application of certain transformations such as Pseudo-Clefting. To give a somewhat oversimplified example, consider the word too that occurs in the examples under (3). As a first approximation to the meaning of too we could give a condition like the one in (6), which is based on Green (1968).

(6) Context X satisfies-the-presuppositions-of "a is P, too" only if either (i) X entails "b is P" for some b (≠ a), or (ii) X entails "a is Q" for some Q (≠ P).
This in turn is equivalent to saying that a simple sentence like "Nixon is guilty too" either has a presupposition that someone else is guilty or that Nixon has some other property.3 One or the other must be entailed in context. For compound sentences we define satisfaction recursively by associating each part of the sentence with a different context. The basic idea behind this
3 It appears to me that the only contribution too makes to the meaning of a sentence is that it introduces a presupposition whose form depends on the sentence as a whole and the particular constituent too focuses on. If this is so, there is no reason to assume that too is represented in the logical form of the sentence. As far as the truth conditions are concerned, "Nixon is guilty too" seems equivalent to "Nixon is guilty", therefore, it is possible to assign the same logical form to them. The same point has been raised in Lakoff & Railton (1971) with regard to two-way implicative verbs, such as manage, whose only function also seems to be to bring in a presupposition.


was independently suggested in both Stalnaker (1973) and Karttunen (1973b). For conditionals, satisfaction is defined in (7).

(7) Context X satisfies-the-presuppositions-of "If A then B" just in case
(i) X satisfies-the-presuppositions-of A, and
(ii) X∪A satisfies-the-presuppositions-of B.

As before, the expression "X∪A" denotes the set that results from incrementing X with the logical form of A.4 For conjunctions, that is, sentences of the form "A and B", satisfaction is defined just as in (7). For disjunctions, sentences of the form "A or B", we have "~A" instead of "A" in part (ii). Examples that illustrate and support these principles can be found in my earlier papers.5 Note that satisfies-the-presuppositions-of is a relation between contexts and sentences. As I have tried to indicate orthographically, we are defining it here as a primitive, irreducible locution. Eventually it would be better to replace this clumsy phrase with some simple verb such as "admits", which has the right pragmatic connotations. I keep the former term only to bring out the connection between (4) and (7) more clearly. At the end, of course, it comes down to having for each simple sentence a set of logical forms that are to be entailed (in the standard logical sense) by a certain context. What is important is that we define satisfaction for complex sentences directly, without computing their presuppositions explicitly. There is no need for a projection method. Secondly, in case a sentence occurs as part of a larger compound, its presuppositions need not always be satisfied by the actual conversational context, as long as they are satisfied by a certain local extension of it. For example, in order to admit "If A then B" a context need only satisfy-the-presuppositions-of A, provided that the presuppositions of B are satisfied by the context as incremented with the logical form of A. It can be shown that the new way of doing things and the old way are equivalent. They sanction the use of any sentence in the same class of contexts. Although it may not be obvious at first, the statement in (8) is true just in case (9) holds, and vice versa.
4 In simple cases, incrementing a context consists of adding one more logical form to it. If the context entails the negation of what is to be added to it, as in counterfactual conditionals, other changes are needed as well to keep the resulting set consistent. This is a difficult problem; see Lewis (1973) for a general discussion of counterfactuals.
5 It is possible that the principle for disjunctions, and perhaps that for conjunctions as well, should be symmetric. This depends on how we want to deal with sentences like "Either all of Jack's letters have been held up, or he has not written any" (see Karttunen 1973a, ftn. 11). A symmetric condition for "or" would read as follows: X satisfies-the-presuppositions-of "A or B" iff X∪{~A} satisfies-the-presuppositions-of "B" and X∪{~B} satisfies-the-presuppositions-of "A". For "and", substitute "A" for "~A" and "B" for "~B".
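Before turning to the equivalence of the two approaches, note that the recursive clauses (5) and (7) and their and/or analogues are simple enough to be written out as a little program. The following Python sketch is ours, not part of Karttunen's paper: sentences are tiny trees, contexts are sets of formulas, and entailment is crudely approximated by closing a context under a user-supplied list of consequences (in the paper it is standard logical entailment); every name below is invented for illustration.

def close(context, rules):
    # tiny entailment surrogate: add the listed consequences of each member
    out = set(context)
    changed = True
    while changed:
        changed = False
        for f in list(out):
            for c in rules.get(f, ()):
                if c not in out:
                    out.add(c)
                    changed = True
    return out

def form(sentence):
    # the "logical form" used for incrementing a context
    return sentence[1] if sentence[0] == 'atom' else sentence

def admits(context, sentence, presup, rules):
    op = sentence[0]
    if op == 'atom':                 # (5): X entails every basic presupposition
        ctx = close(context, rules)
        return all(p in ctx for p in presup.get(sentence[1], []))
    _, a, b = sentence
    if op in ('if', 'and'):          # (7): A against X, B against X u {A}
        return (admits(context, a, presup, rules) and
                admits(context | {form(a)}, b, presup, rules))
    if op == 'or':                   # B against X u {~A}
        return (admits(context, a, presup, rules) and
                admits(context | {('not', form(a))}, b, presup, rules))
    raise ValueError(op)

# (3b): "If Haldeman is guilty, Nixon is guilty too" is admitted even by the
# empty context, because the incremented local context supplies the
# presupposition of "too"; the bare consequent is not admitted.
rules = {'Haldeman is guilty': ['someone other than Nixon is guilty']}
presup = {'Nixon is guilty too': ['someone other than Nixon is guilty']}
A = ('atom', 'Haldeman is guilty')
B = ('atom', 'Nixon is guilty too')
assert admits(set(), ('if', A, B), presup, rules)
assert not admits(set(), B, presup, rules)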


(8) X satisfies-the-presuppositions-of "If A then B".
(9) P_If A then B/X ⊆ E_X

The proof is straightforward and will not be presented in detail. Here it suffices to note that, by (4a), (9) is equivalent to the conjunction of (10) and (11).

(10) P_A ⊆ E_X
(11) P_B − (E_X∪A − E_X) ⊆ E_X

Similarly, by (7), (8) is equivalent to the conjunction of (12) and (13).

(12) X satisfies-the-presuppositions-of A.
(13) X∪A satisfies-the-presuppositions-of B.

Given our basic definition of satisfaction in (5) and that A and B are simple sentences, it follows that (10) and (12) are equivalent. So it remains to be shown that (11) and (13) also amount to the same thing. This can be done with simple set-theoretic means by proving the equivalence of (11) and (14). (Note that E_X ⊆ E_X∪A.)

(14) P_B ⊆ E_X∪A

(14) in turn says the same thing as (13) provided that B is a simple sentence, as we have assumed here. In short, (8) and (9) are equivalent by virtue of the fact that (10) is equivalent to (12) and (11) is equivalent to (13). Consequently, the class of contexts that satisfy-the-presuppositions-of "If A then B" by principle (7) is the same class of contexts that entail all of the presuppositions assigned to this sentence by (4a).6 As we move on to more complicated sentences, the advantages of (7) over (4) become more and more clear. For example, consider sentences of the form (15).

(15) If (A and B) then (C or D).

It is a very cumbersome undertaking to compute the set of logical forms presupposed by (15) by means of rules like (4a). But it is a simple matter to tell by principles like (7) what is required of a context in which (15) is used. This is shown in (16). Note that (16) is not a new definition but a statement that directly follows from (7) and the corresponding principles for conjunctions and disjunctions. (16) Context X satisfies-the-presuppositions-of "If (A and B) then (C or D)" just in case

6 The same holds in case we choose Peters' principle (see ftn. 2) over (4a). In demonstrating this, what we prove equivalent to (14) is not (11), of course, but that {⌜A ⊃ C⌝ | C ∈ P_B} ⊆ E_X. This equivalence follows straightforwardly from the fact that ⌜A ⊃ C⌝ ∈ E_X just in case C ∈ E_X∪A.


(i) X satisfies-the-presuppositions-of A,
(ii) X∪A satisfies-the-presuppositions-of B,
(iii) X∪A&B satisfies-the-presuppositions-of C, and
(iv) X∪A&B∪~C satisfies-the-presuppositions-of D.

As we study complex cases such as this one, we see that we could look at satisfaction of presuppositions in an even more general way. As illustrated in (16), by our definition a given initial context satisfies-the-presuppositions-of a complex sentence just in case the presuppositions of each of the constituent sentences are satisfied by a certain specific extension of that initial context. For example, the presuppositions of D in (15) must be satisfied by a set of logical forms that consists of the current conversational context as incremented with the logical forms of "A and B" and the negation of C. In compound sentences, the initial context is incremented in a left-to-right fashion, giving for each constituent sentence a local context that must satisfy its presuppositions.7 We could easily define a notion of local context separately and give the following general definition of satisfaction for all compound sentences.

(17) Context X satisfies-the-presuppositions-of S just in case the presuppositions of each of the constituent sentences in S are satisfied by the corresponding local context.
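The notion of local context appealed to in (17) can be computed directly in the same little framework as before: walk the sentence tree from left to right, incrementing as the connective dictates, and return the context each constituent must be checked against. Again our own illustration, reusing form() from the sketch above.

def local_contexts(context, sentence):
    if sentence[0] == 'atom':
        return [(frozenset(context), sentence[1])]
    op, a, b = sentence
    out = local_contexts(context, a)
    if op in ('if', 'and'):
        out += local_contexts(context | {form(a)}, b)
    elif op == 'or':
        out += local_contexts(context | {('not', form(a))}, b)
    return out

# (15): "If (A and B) then (C or D)" reproduces the four clauses of (16):
A, B, C, D = (('atom', s) for s in 'ABCD')
for ctx, s in local_contexts(set(), ('if', ('and', A, B), ('or', C, D))):
    print(sorted(map(str, ctx)), '->', s)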

Note that in this new framework the earlier question of how it comes about that presupposition is a relative notion for compound sentences does not arise at all. Also, the distinction between cases like (3a) and (3b) is of no particular importance. What is required in both cases is that the presupposition of the consequent clause contributed by the word too be entailed by the current conversational context as incremented with the logical form of the antecedent. In the case of (3b), we recognize that this condition is met, no matter what the initial context is like, by virtue of the particular antecedent. In (3a) it appears that the antecedent does not contribute anything towards satisfying the presuppositions of the consequent, at least not in contexts that immediately come to mind. Hence we can be sure that the presuppositions of the consequent are satisfied in the incremented context just in case they are already satisfied initially. It seems to me now that this is a much better way of putting it than to talk about a presupposition being "shared" by the compound in (3a) and being "cancelled" or "filtered away" in (3b), as I did in the earlier papers. Such locutions can be thrown out with the projection method that gave rise to them.

7 Lakoff has pointed out to me that a notion of local context is also needed for transderivational constraints that make the well-formedness of derivations in which a certain transformation has applied dependent on the context. In compound sentences, it is the local context these constraints must refer to, not the overall conversational context.


So far I have only discussed complex sentences that are formed with sentential connectives. However, satisfaction of presuppositions can easily be defined for all kinds of complex sentences. Without going into any great detail, I will try to outline how this is done for sentences with sentential subjects or objects. Let us represent such sentences with the expression "v(... A ...)" where "v" stands for a complementizable verb and "A" for an embedded subject or object clause. Sentences with verbs like believe and want that require non-sentential subjects are represented with "v(a, A)" where "a" stands for the underlying subject. In this connection we have to distinguish three kinds of complementizable verbs, as shown in (18).

(18) I Verbs of saying: say, ask, tell, announce, etc. (including external negation).
II Verbs of propositional attitude: believe, fear, think, want, etc.
III All other kinds of complementizable verbs: factives, semi-factives, modals, one- and two-way implicatives, aspectual verbs, internal negation.

Essentially this amounts to a distinction between verbs that are "transparent" with respect to presuppositions of their complements (type III) and verbs that are "opaque" to one degree or another (types I and II).8 These distinctions of course are not arbitrary but presumably follow from the semantics of verb complementation in some manner yet to be explained. For sentences where the main verb is of the last type, we need the condition in (19).

(19) If v is of type III, context X satisfies-the-presuppositions-of "v(... A ...)" only if X satisfies-the-presuppositions-of A.

Thus in a case such as (20), where may, force, and stop all are of type III, a context satisfies-the-presuppositions-of the whole sentence only if it satisfies those of all the nested complements.9 (20) The courts may force Nixon to stop protecting his aides.

For example, a context for (20) ought to entail that Nixon has or will have been protecting his aides.
8 One of the mistakes in Karttunen (1973a) was the claim that verbs of saying and propositional attitude verbs are all "plugs".
9 Since ordinary negation is a sentential operator of type III, it also follows from (19) that a context satisfies-the-presuppositions-of "Nixon won't stop protecting his aides" just in case it satisfies-the-presuppositions-of "Nixon will stop protecting his aides". This is an important fact, but there is no need to make it part of the definition of pragmatic presupposition, as Thomason (1973) does, presumably for historical reasons, because the semantic notion of presupposition is traditionally defined in that way.


For verbs of propositional attitude we need a condition such as (21), where the expression "B_a(X)" stands for the set of beliefs attributed to a in X.

(21) If v is of type II, context X satisfies-the-presuppositions-of "v(a, A)" only if B_a(X) satisfies-the-presuppositions-of A.10

The condition says that sentences such as (22) require that the subject of the main sentence be understood to have a set of beliefs that satisfy-the-presuppositions-of the complement.

(22) John fears that Nixon will stop protecting his aides.

To satisfy the presuppositions of (22), a context must ascribe to John a set of beliefs that satisfy-the-presuppositions-of "Nixon will stop protecting his aides". Finally, with verbs of type I a complex sentence does not necessarily require that the presuppositions of the complement be satisfied, as we can observe by contemplating examples such as (23).

(23) Ziegler announced that Nixon will stop protecting his aides.

(23) can be spoken felicitously, perhaps even truly, no matter what the facts are understood to be or whether anyone is supposed to hold a set of beliefs that satisfy the presuppositions of the complement. As a final example of complementation, consider the sentence in (24).

(24) John thinks that, if Rosemary believes that Nixon has been protecting his aides, she is afraid that Nixon will stop protecting them.

By applying the principles in (21) and (7) recursively, we arrive at the conclusion that, if a given context, X, satisfies the presuppositions of (24), then the presuppositions of the last clause in (24), "Nixon will stop protecting his aides", are satisfied by the set (25).

(25) B_Rosemary(B_John(X) ∪ {Rosemary believes that Nixon has been protecting his aides})

This set contains all of the beliefs attributed to Rosemary in a context that consists of all of the beliefs attributed to John in X and the logical form of the antecedent. By virtue of its last-mentioned ingredient, the set in (25) is guaranteed to entail that Nixon has been protecting his aides. Therefore, (24) does not require that this particular presupposition of the last clause be entailed in contexts where (24) is used, or by the set of beliefs that in those contexts are attributed to John or to Rosemary. As far as I am able to tell, this is the correct result. This concludes what I have to say about satisfaction of presuppositions. What we are interested in is associating sentences with proper contexts of use. We can achieve this goal directly by defining a notion of satisfaction as a relation between contexts and sentences. In this way we avoid the many complications
10 It is implicit in this treatment that every individual's beliefs are considered to be closed under entailment. I am not sure whether this is a defect.


that have to be built into a projection method that does the same by associating each sentence with a set of presuppositions. The efforts by Langendoen and Savin (1971), Morgan (1969, 1973), Keenan (1973), Lakoff and Railton (1971), Herzberger (1973), myself (1973a, 1973b), and many others to find such a method now seem misplaced to me. The best solution to the projection problem is to do away with it. The moral of this paper is: do not ask what the presuppositions of a complex sentence are, ask what it takes to satisfy them. I will conclude with a few comments about the notion of context. It is implicit in what I have said about satisfaction that a conversational context, a set of logical forms, specifies what can be taken for granted in making the next speech act. What this common set of background assumptions contains depends on what has been said previously and on other aspects of the communicative situation. In a fully explicit discourse, the presuppositions of the next sentence uttered are satisfied by the current context. This guarantees that they are true in every possible world consistent with the context. Of course, it is possible that the actual world is not one of them, since people may be talking under various misapprehensions. Satisfaction of presuppositions is not a matter of what the facts really are, just what the conversational context is. Once the new sentence has been uttered, the context will be incremented to include the new shared information. Viewed in this light, a theory of presuppositions amounts to a theory of a rational order of contexts from smaller to larger sets of shared information. At each step along the way that a fully explicit discourse proceeds, the current context satisfies the presuppositions of the next sentence, which in turn increments it to a new context. There are definitions of pragmatic presupposition, such as (1), which suggest that there is something amiss in a discourse that does not proceed in this ideal, orderly fashion. Those definitions make it infelicitous to utter sentences whose
11 Many things can of course go wrong. First of all, the listener may refuse to go along with the tacit extension that the speaker appears to be suggesting. In the case of the classical example "Have you already stopped beating your wife?" he may have a good reason to balk. The listener may also be unable to comprehend what tacit extension of the current context the speaker has in mind. Some types of presupposition are especially unsuited for conveying anything indirectly. For example, "Nixon is guilty too" is not a good vehicle for suggesting that Agnew is guilty, although the presuppositions of the sentence are satisfied in all contexts where the latter is the case. Finally, the listener may extend the context in some way other than what was intended by the speaker. To what extent we actually can and do make use of such shortcuts depends on pragmatic considerations that go beyond the presuppositions themselves. Note also that there are certain expressions in current American English that are almost exclusively used to convey matters indirectly, hence it is a moot question whether there is anything indirect about them any more. One is likely never to hear "Don't you realize it's past your bedtime?" in a context entailing that the addressee ought to be in bed.


presuppositions are not satisfied by the current conversational context. They outlaw any leaps and shortcuts. All things considered, this is an unreasonable view. Consider the examples in (26).

(26) (a) We regret that children cannot accompany their parents to commencement exercises.
(b) There are almost no misprints in this book.
(c) I would like to introduce you to my wife.
(d) John lives in the third brick house down the street from the post office.
(e) It has been pointed out that there are counterexamples to my theory.

The underlined items in these sentences bring in a certain presupposition. Thus (26a) presupposes that its complement is true. Yet the sentence could readily be used in a conversational context that does not satisfy this presupposition. Perhaps the whole point of uttering (26a) is to let it be known that parents should not bring their kids along. Similarly, (26d) might be used to give directions to a person who up to that point had no idea that there are at least three brick houses down the street from the post office, which is a presupposition of the sentence by virtue of the underlined definite description. The same goes for the other examples in (26). What do we say here? I am not at all sure we want to say that, in these cases, a sentence has been used infelicitously. I am sure that there is no advantage in saying that sentences like (26a) sometimes do and sometimes do not presuppose their complements. A notion of "part-time presupposition" is not going to help; on the contrary. Had we defined presupposition as a relation between a sentence and its speaker, we would be tempted to talk about some presuppositions being optional. I think the best way to look at this problem is to recognize that ordinary conversation does not always proceed in the ideal orderly fashion described earlier. People do make leaps and shortcuts by using sentences whose presuppositions are not satisfied in the conversational context. This is the rule rather than the exception, and we should not base our notion of presupposition on the false premiss that it does not or should not happen. But granting that ordinary discourse is not always fully explicit in the above sense, I think we can maintain that a sentence is always taken to be an increment to a context that satisfies its presuppositions. If the current conversational context does not suffice, the listener is entitled and expected to extend it as required. He must determine for himself what context he is supposed to be in on the basis of what was said and, if he is willing to go along with it, make the same tacit extension that his interlocutor appears to have made.11 This is one way in which we communicate indirectly, convey matters without discussing them. When we hear a sentence such as (26a), we recognize that it increments contexts which entail that children are not permitted at commencement exercises. These are the only contexts that satisfy the presuppositions of (26a). So if we


have not realized already that we are supposed to be in that kind of context, the sentence lets us know that indirectly. Perhaps the whole point of uttering (26a) was to make us conclude this for ourselves so that we would not have to be told directly.12 One must be careful not to confuse presuppositions with features of contexts that satisfy those presuppositions. Consider a sentence such as (27), which is a modified version of an example discussed by Lakoff (1971).

(27) John called Mary a Republican and then she insulted him back.

Because of the word back, the second conjunct of (27) presupposes that John has insulted Mary. The principle (17) tells us that this presupposition ought to be satisfied by the corresponding local context. In this case, the local context consists of the initial context for (27) incremented with the logical form of "John called Mary a Republican". Let us suppose that this context in fact satisfies the presupposition that John has insulted Mary, and that the initial context by itself would not satisfy it. This state of affairs could come about in several ways. The most obvious one is that the initial context entails that calling someone a Republican constitutes an insult. Note that there is nothing in (27) which presupposes that "Republican" is a dirty word. It is not a necessary feature of every context that satisfies the presuppositions of (27). But there are some contexts in which the presuppositions of (27) are satisfied only because of it. Sometimes we can exploit this fact by uttering (27) in a context which does not satisfy its presuppositions. In that case we expect the listener to notice what extension we have in mind. This is similar to what can be done with the examples in (26), except that here the piece of information that is passed along under the counter is neither presupposed nor entailed by any part of (27). As a final example, consider a case of the kind first discussed in Liberman (1973).

(28) Bill has met either the King or the President of Slobovia.

The two disjuncts that constitute (28) have conflicting presuppositions: Slobovia is a monarchy/Slobovia is a republic. Yet (28) as a whole is not contradictory. It seems to assert that Bill has met the Slobovian Head of State and indicates that the speaker does not know much about Slobovia. What sort of context does it take to satisfy-the-presuppositions-of (28)? Assuming that the condition for "or" is symmetric (see ftn. 5 above), we find that, according to our principles, (28) can be admissible at least in contexts which entail the logical forms of the three sentences in (29).

(29) (a) Slobovia is either a monarchy or a republic.
(b) If Slobovia is a monarchy, Bill has met the King of Slobovia.
(c) If Slobovia is a republic, Bill has met the President of Slobovia.

Such a context can satisfy the presuppositions of (28) for the following reason. By
12 I owe this example to an official MIT bulletin about the spring 1973 commencement.


incrementing it with the negation of the first disjunct, "Bill has not met the King of Slobovia", we get a context which entails that Slobovia is a republic, which is what the second disjunct presupposes. By incrementing the original context with the negation of the second disjunct, we get a context which entails that Slobovia is a monarchy, which is a presupposition of the first disjunct. Given that both constituent sentences in (28) are admissible in their respective local contexts, (28) as a whole is admissible. If our way of looking at presuppositions is correct, it should in principle be possible to utter (28) to someone who has never even heard of Slobovia and leave it up to him to conclude that the speaker assumes (29). It seems to me that this is a desirable result. In this paper I have argued that a theory of presuppositions is best looked upon as a theory of constraints on successive contexts in a fully explicit discourse, in which the current conversational context satisfies-the-presuppositions-of, or let us say from now on, admits the next sentence that increments it. I have outlined a recursive definition of admittance, based on the assumption that we can give a finite list of presuppositions for each simple sentence. In this approach we do not need an explicit projection method for assigning presuppositions to complex sentences. A theory of presuppositions of the kind advocated here attempts to achieve both less and more than has been expected of such a theory: less in the sense that it is not a theory of how ordinary discourse does or ought to proceed; more in the sense that it tries to explain some of the principles that we make use of in communicating indirectly and in inferring what someone is committed to, although he did not exactly say it.
References

DAVIDSON, D. and G. HARMAN (Eds.) (1972), Semantics of Natural Language, Dordrecht: D. Reidel.
FILLMORE, C. J. and D. T. LANGENDOEN (Eds.) (1971), Studies in Linguistic Semantics, New York, N.Y.: Holt, Rinehart, and Winston.
GREEN, G. (1968), On too and either, and not just too and either, either, in: Darden, B. et al. (Eds.), Papers from the Fourth Regional Meeting of the Chicago Linguistic Society, University of Chicago, Chicago, Illinois.
HERZBERGER, H. G. (1973), Dimensions of Truth, Journal of Philosophical Logic 2, 535–556.
KARTTUNEN, L. (1973a), Presuppositions of Compound Sentences, Linguistic Inquiry IV:2, 169–193.
KARTTUNEN, L. (1973b), Remarks on Presuppositions, in: Murphy, J., A. Rogers, and R. Wall (Eds.).
KEENAN, E. (1973), Presupposition in Natural Logic, The Monist 57:3, 344–370.
LAKOFF, G. (1971), The Role of Deduction in Grammar, in: Fillmore, C. J. and D. T. Langendoen (Eds.).
LAKOFF, G. (1972), Linguistics and Natural Logic, in: Davidson, D. and G. Harman (Eds.).
LAKOFF, G. (1973), Pragmatics and Natural Logic, in: Murphy, J., A. Rogers, and R. Wall (Eds.).

194

Lauri Karttunen

LAKOFF, G. and P. RAILTON (1971), Some Types of Presupposition and Entailment in Natural Language, unpublished manuscript. LANGENDOEN, D. T. and H. B. SAVIN (1971), The Projection Problem For Presuppositions, in: Fillmore, C. J. and D. T. Langendoen (Eds.). LEWIS, D. (1973), Counterfactuals, Cambridge, Mass.: Harvard University Press. LIBERMAN, M. (1973), Alternatives, in: Papers from the Ninth Regional Meeting of the Chicago Linguistic Society, University of Chicago, Chicago, Illinois MORGAN, J. L. (1969), On the Treatment of Presupposition in Transformational Grammar, in: Binnick, R. et al. (Eds.), Papers from the Fifth Regional Meeting of the Chicago Linguistic Society, University of Chicago, Chicago, Illinois. MORGAN, J. L. (1973), Presupposition and the Representation of Meaning, unpublished Doctoral dissertation, University of Chicago, Chicago, Illinois. MURPHY, J., A. ROGERS, and R. WALL (Eds.) (forthcoming), Proceedings of the Texas Conference on Performatives, Presuppositions, and Conversational Implicatures, Center for Applied Linguistics, Washington, D. C. STALNAKER, R. C. (1972), Pragmatics, in: Davidson, D. and G. Harman (Eds.). STALNAKER, R. C. (1973), Presuppositions, Journal of Philosophical Logic 2, 447457 THOMASON, R. H. (1973), Semantics, Pragmatics, Conversation, and Presupposition, in: Murphy, J., A. Rogers, and R. Wall (Eds.).

DISCUSSIONS AND EXPOSITIONS


MARCELO DASCAL AND AVISHAI MARGALIT

A NEW 'REVOLUTION' IN LINGUISTICS? 'TEXT-GRAMMARS' VS. 'SENTENCE-GRAMMARS'

Some of the arguments presented in favor of a replacement of the existing 'sentence-grammars' by a 'text-grammar' (a grammar whose rules would generate a set of well-formed 'texts', and not merely a set of well-formed 'sentences') are discussed and evaluated. Three main arguments in favor of T-grammar are put forward in one of the most comprehensive expositions of the subject (van Dijk, 1972): a methodological, a grammatical and a psycholinguistic argument. The paper examines in detail only the first two, showing that none of them provides satisfactory support for the replacement of S-grammar by T-grammar.

1. Introduction

In the past few years, the number of publications devoted to 'text-grammar' has been steadily increasing, particularly in continental Europe.1 The advocates of the new product tend to interpret such an increase as a sign of a new 'revolutionary' development taking place now in the field of linguistics: the replacement of the old and limited 'sentence-grammars' by the new, more ambitious and powerful, 'text-grammars'. In spite of the great amount of work recently devoted to it, research on text-grammar is still in an embryonic state; the term 'text-grammar', therefore, should be taken as a label for a 'research program', rather than for a full-fledged linguistic 'theory'. Nevertheless, it is time to attempt at least a preliminary evaluation of the claims and achievements of text-grammar. Firstly, because enough has been said about it in order to see what it roughly purports to be, and where it might lead. Secondly, because although embryonic in nature, research programs are able to mobilize considerable resources, both human and economic, which might eventually be employed elsewhere, in the light of the results of a careful evaluation.

1 For example: van Dijk, 1972 (see bibliography); van Dijk et al., 1971; Ihwe et al., 1971; Petőfi, 1971; Stempel, 1971; and many more announced papers, books, symposia, etc.


And thirdly, because criticism may have a positive effect on the research program itself, leading to sharper formulations, shifts in its aims, etc. Before embarking on the analysis of the text-grammar research program, it is necessary to point out some limitations of our study. Instead of surveying the whole literature on text-grammar, we will concentrate our attention on one book (van Dijk, 1972), which contains a systematic, detailed and comprehensive presentation of the basic assumptions, aims and current achievements of the text-grammar research program. One of these aims is the establishment of new and more precise foundations for the theory of literature, as a result of the advocated reform in linguistic theory. However, whatever its effects on the theory of literature, the reform in linguistics is claimed to be independently motivated by shortcomings within linguistic theory itself.2 We shall deal here only with this allegedly 'internal' motivation for the proposed reform in linguistics, disregarding 'external' considerations such as the possible applications of 'reformed' linguistics to the theory of literature. To be sure, the possibility of such applications might turn out to be a crucial factor in choosing between two theories equally adequate on purely linguistic grounds. But arguments concerning their respective linguistic adequacy (both descriptive and explanatory) naturally have more weight than arguments concerning their applications to other domains, and should, therefore, be considered first.3

Van Dijk's main thesis is that the existing 'sentence-grammars' ("structural and generative-transformational grammars ... limited to the formal enumeration and structural description of the sentences of a language", henceforth 'S-grammars') are inadequate for the account of the phenomena of natural language, and should be replaced by a grammar which would account also for the formal structure of 'texts' or 'discourse', i.e., by a 'text-grammar' (henceforth 'T-grammar') (p. 1).4 For him, a T-grammar is significantly different from an S-grammar, and cannot be construed as a trivial extension of the latter. He gives three types of arguments, which he considers "decisive", in support of his thesis: a methodological one, a grammatical one, and a psycholinguistic one. The first is the claim that the proper object of linguistics is discourse (or 'texts') and not sentences. The second consists in pointing out that S-grammar is unable to explain
2 Thus Petőfi claims that a satisfactory description of the linguistic facts pertaining to the sentence level can only be achieved within the framework of a text-grammar ("Eine solche Beschreibung scheint uns, selbst in bezug auf die satzgrammatische Ebene, nur im Rahmen einer in bestimmter Weise aufgebauten Textgrammatik durchführbar zu sein") (Petőfi, 1971, p. 16).
3 This argument is somewhat question-begging, since one of the main contentions of the text-grammarians is that the 'domain' of linguistic theory should be enlarged far beyond its traditional scope. We will return to this question in what follows. Let us point out, however, that the text-grammarians do not deny that the facts within the scope of traditional linguistic theory should be adequately accounted for by any linguistic theory.
4 Henceforth non-specified page references will be to van Dijk, 1972.


certain grammatical phenomena, like pronominalization, definitivization, etc., which can only be accounted for, according to van Dijk, by a grammar which postulates texts as its basic units. The third is an appeal to the intuition that a 'well-formed' piece of discourse must be 'coherent' in some sense, and that we are able to express its coherent structure in summaries, outlines, etc. This intuition is interpreted as suggesting that a 'plan' or 'coherent structure' underlies every piece of discourse in much the same way as sentences have underlying 'deep structures'. We will examine in this paper mainly the first and second types of arguments (Sections 2 and 3, respectively), since the third hinges on considerations of a more 'external' nature.5

2. The 'natural domain' of linguistic theory

According to van Dijk, "discourses are the only justifiable 'natural domain' of an empirically adequate theory of language" (p. 7). From this he goes on to claim that what a grammar of a language should generate is texts, abstract entities which underlie discourses, and not merely sentences. There is nothing new about mentioning 'text' and 'discourse' in connection with linguistic inquiries. However, one should avoid lumping together two distinct types of contexts in which these terms have been used: the context of discovery and the context of justification. Those who use 'discourse' in the context of discovery mean that given discourses (corpora) are to be considered the only legitimate data from which significant linguistic generalizations can be extracted, through the use of controlled discovery procedures (cf. Harris, 1951). Those who, like van Dijk, use 'discourse' or 'text' in the context of justification mean that the adequacy of a grammar of a language can only be evaluated with respect to its ability to generate the correct 'texts' of this language. This approach is characteristic of the recent T-grammar research program. The two approaches are clearly different, but they are not incompatible. Neither does the use of texts as a part of discovery procedures imply that texts are to be considered the proper objects of linguistic explanation, nor does this last claim, characteristic of the more recent use of the notion of text in linguistics, imply anything whatsoever concerning the existence and nature of discovery procedures. For those who use 'text' or 'discourse' in the context of discovery, the main linguistic object to be explained and described is the set of sentences of a language and not the set of its texts, whereas for van Dijk, who (following Chomsky) certainly rejects the idea that there are discovery procedures in linguistics, the set of texts is the basic explanandum of linguistics.
5 We discuss the third type of argument more extensively in Dascal and Margalit, 1973. The present paper is partly based on that earlier criticism of the idea of a text-grammar.


What is apparently common to both approaches is the assumption that the 'natural domain' of linguistic theory is a set of (possible or given) discourses. In one sense, they really share this assumption, namely, insofar as 'discourses' constitute for both approaches the 'observational basis' or set of empirical data that linguistic theory must account for. Nevertheless, from the fact that certain data constitute the 'natural domain' of observation for a certain theory, nothing follows as to the 'naturalness' of the 'proper' theoretical entities that should be postulated in the theory. A good example is the case of syllables and phonemes. We are definitely more familiar with syllables than with phonemes, in the sense that syllables are closer to easily observable units than phonemes. And yet, in phonology, phonemes are the basic theoretical units employed, and not syllables, since in using phonemes we are able to capture more generalizations (Bever, 1971, pp. 176-9). Van Dijk, on the contrary, seems to believe it necessary or fruitful to postulate abstract linguistic entities, 'texts', corresponding as closely as possible to the observational entities, 'discourses'.

Van Dijk distinguishes between the 'naturalness' of a theory and that of its domain. According to him, a theory T₁ is more natural than another, T₂, with respect to a specified natural domain of objects D, if all the statements of T₂ are derivable from T₁ and the basic theoretical entities postulated in T₁ are somehow 'closer' to the objects of D than the basic theoretical entities postulated in T₂. Since van Dijk bases his definition of the naturalness of a theory on certain methodological concepts derived from Nagel's framework, it is useful to recall that a theory, in the broad sense, has two components: an uninterpreted set of statements (sometimes called the 'theory' in the narrow sense) and rules of correspondence (Nagel, 1961, pp. 97-105). According to van Dijk's definition, all you have to do in order to compare the 'naturalness' of theories is to look at their correspondence rules. The more straightforward the rules of correspondence, the more natural the theory. Therefore, van Dijk's claim that T-grammars are more natural than S-grammars amounts to the claim that the rules of correspondence from texts (as abstract linguistic entities) to discourses (as natural observational units) are more straightforward than the rules of correspondence of a theory in which the highest-level abstract entities are sentences. Nothing is said, however, in van Dijk's account, about the nature of the other part of the theories, namely, their theoretical statements. It may turn out, therefore, that a theory which is more natural in van Dijk's sense, because it has simpler correspondence rules, will have to pay for this feature a high price in terms of the complexity of its purely theoretical part. This is in fact the case in van Dijk's model of T-grammar, where 'macro-structures' of texts replace simple 'deep-structures' of sentences and very complex 'transformational' rules are needed in order to map such macro-structures onto the deep structures of the sentences composing the text. An idea of the complexity of the T-grammar proposed can be formed by inspecting figure 1. The specific contribution of the T-grammar (marked T) consists of the components R₁, R₂, and R₃, with their respective outputs. The other


components (marked S) are essentially those of an S-grammar of the type proposed by generative semantics. The output of R₃ is the 'surface structure' of a discourse or text, i.e., the structure corresponding closely to that of the sequence of sentences in the text. It is obtained by the operation of two sets of transformation rules on the 'macro-structures' generated by R₁. These are conceived not as syntactic, but as semantic 'deep' structures, in the manner of generative semantics. That is, they should look rather like huge logical formulae, expressed by means of a hybrid set of logical devices (drawn from predicate calculus of different orders, modal logic, epistemic logic, etc.) as well as by extra categories such as 'actant', 'text-qualifier', etc. (pp. 140-1). Not much is said about the formation rules in R₁, and still less is said about the precise nature of the transformations in R₂ and R₃.6 If compared with S-grammar, T-grammar presents not only an enormous amount of extra complexity in its theoretical framework, but also a considerable loss in precision which casts doubts on the very possibility of achieving a formalization of the theory. If so, in shifting from S-grammar to T-grammar, linguistics would be abandoning one of the major achievements of the Chomskyan revolution: the establishment of the requirement of explicitness (i.e., formalization) as a conditio sine qua non for any serious linguistic theory. In any case, it is clear that in order to be 'natural' in van Dijk's sense, i.e. in order to have relatively simple correspondence rules, a theory must make use of a highly complex and logically powerful set of theoretical constructs. Given this trade-off relationship, it is by no means obvious (as most text-grammarians assume) that a 'natural' theory should be preferred over a 'non-natural' one.
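Figure 1 (reproduced at the end of this article) can be read as a pipeline of rule components. The stub below is our own schematic reading of it, not a reconstruction of van Dijk's rules (whose content, as just noted, is largely unspecified); only the order of application is taken from the exposition, and every name and return value is a placeholder.

```python
def R1_semantic_formation():
    """Formation rules generating the semantic macro-structure of a text."""
    return "macro-structure"

def R2_macro_transformations(macro):
    """Macro-transformation rules yielding a transformed macro-structure."""
    return f"transformed({macro})"

def R3_transformations(transformed):
    """Transformation rules yielding the 'textual surface structure', i.e.
    the sequence of sentential deep structures composing the text."""
    return [f"deep-structure-{i}({transformed})" for i in (1, 2)]

def S_grammar(deep_structures):
    """The embedded sentence-grammar of the generative-semantics type:
    micro-semantic formation, transformations, lexicalization, and the
    morphophonological rules (R4)."""
    return [f"surface({d})" for d in deep_structures]

text = S_grammar(R3_transformations(R2_macro_transformations(R1_semantic_formation())))
print(text)
```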

3. The grammatical argument

According to van Dijk, "one of the basic properties of natural language is not only the possibility of constructing complex sentences by recursive embedding or coordinating other sentences (sentoids), but also the possibility of producing sequences of syntactically independent sentences possessing certain relations between each other" (p. 39).

3.1. Our position in the face of this is that no evidence has been produced by van Dijk supporting the claim that the description of the relations between independent sentences is not equivalent to the description of the conditions for constructing complex sentences by recursively embedding or coordinating other sentences. If our view is correct, then such a task belongs properly to S-grammar and does not

6 It is even doubtful whether such rules might be properly called 'transformations', in the technical sense. But in order to pass judgement on this point, one must wait for at least some examples of such 'transformations'.


require postulation of 'texts' as additional theoretical entities. Certainly, in order to cope with the problem indicated by van Dijk in the preceding quotation, the available system of rules for embedding and coordination (in the framework of the existing S-grammars) is not yet sufficient. Thus, the discussion of such problems by text-grammarians may be beneficial, in the sense of pointing out difficulties that must be solved in the course of extending and completing S-grammars. Nevertheless, the main point is one of principle, as recognized by van Dijk himself: "It is clear that we must not reject S-grammars as inadequate for the description of intersentential relation(s) on the ground of their hitherto rather restricted attention to such problems. We should first be sure that an S-grammar cannot IN PRINCIPLE account for such relations. It might be the case that the conditions determining the well-formedness of complex and compound sentences are similar, if not identical, to those which should be formulated for sequences" (pp. 15-6). As we will try to show in the following sections, none of van Dijk's grammatical arguments provides grounds for his being sure (as he in fact is) that S-grammar cannot account for intersentential relations. On the contrary, we suggest that, once a (complete) S-grammar becomes able to explain satisfactorily coordination and embedding, all the problems related to "producing sequences of syntactically independent sentences possessing certain relations between each other" will be ipso facto solved, without need of any special additions falling outside the scope of S-grammar.7

To illustrate these claims, consider one of van Dijk's examples:
(1) We will have guests to lunch. Calderon was a great Spanish writer.
According to him, this 'text' is "definitely ungrammatical" and "any native speaker of English will consider this sequence, when presented in one utterance, as nonsense" (p. 40). Admittedly, this is indeed, in normal conditions, a bizarre sequence

7 Fodor and Katz (1964, pp. 490-1) defended, almost ten years ago, a very similar thesis. For them, "except for a few types of cases, discourse can be treated as a single sentence in isolation by regarding sentence boundaries as sentential connectives". They acknowledge, however, that in many cases (e.g., pieces of discourse containing questions and answers and, we would add, dialogs of any type) there is no straightforward technique for reducing discourse to complex sentences. In our opinion, these are the cases par excellence of prima facie evidence for including the notion of 'text' in the theoretical apparatus of a grammar. Therefore, such cases should be thoroughly explored by van Dijk as a source of support for T-grammar. But he, on the contrary, simply relegates their treatment to the (merely sketched) 'pragmatic' component of his T-grammar (cf. p. 14 and chap. 9). However, it seems by now clear that, in spite of Katz's continuous claims to the contrary, there is no escape from supplementing S-grammars with a 'pragmatic' component as well (cf. Kasher, 1970). Our main line of argumentation can, therefore, be adopted also in the present case. That is, one can argue that the conditions which define a (pragmatically) well-formed sequence of sentences must, ultimately, be accounted for by the 'pragmatic' component of any adequate S-grammar.


of sentences. As for its 'ungrammaticality', let us assume that van Dijk's claim is correct. We suggest that whatever has to be said in a grammar about the grammaticality or ungrammaticality of such a sequence has to be said also of either (2) or (3):
(2) We will have guests to lunch and Calderon was a great Spanish writer.
(3) We will have guests to lunch but Calderon was a great Spanish writer.

That is, if a grammar is inadequate to solve the problems raised by (1), it is also inadequate to solve problems which an S-grammar should solve; in that case, it is inadequate as an S-grammar. If, on the other hand, the S-grammar is adequate, it is also able to solve the problems raised by (1), and need not be supplemented by a T-grammar. Van Dijk himself eventually acknowledges this fact: "In this case, the conditions for the combination of two sentences in a sequence seem to be parallel with those for combining 'sentences' (sentoids) in a complex sentence" (p. 40). To be sure, his observations and examples support the conclusion that "at least some conditions for the combination of sentences in a sequence are similar to those for combining sentences (sentoids) in a complex sentence" (p. 41). But what he must produce, in order to substantiate the need of a T-grammar, are examples in which the conditions are not parallel. He should prove that at least some conditions for the combination of sentences in a sequence are radically different from those for combining sentences in a complex sentence. He believes, indeed, that "there are a great number of theoretical and empirical arguments against an identical treatment of (compound) sentences and sequences" (p. 14). What are, then, these arguments?

3.2. The problem of characterizing the conditions of use of definite and indefinite articles provides the basis for the most elaborated grammatical argument put forward by van Dijk. We shall pay it, therefore, due respect, and try to show that the attempt to treat definitivization (and, later on, also pronominalization) by means of various logical operators proves the contrary of what van Dijk expects to prove. For this attempt shows that to treat texts adequately means to treat them ultimately as complex sentences. One of the conditions for definitivization, as is well known, is the uniqueness of the referent of the 'definite description'. Consider the following text:
(4) (i) Yesterday John bought a book.
    (ii) The book is about theoretical syntax.

As a possible representation of (4), van Dijk suggests the following formulae:
(5) (i) (∃x)(b(x) ∧ c(z₁,x))
    (ii) s((ιx)(b(x) ∧ c(z₁,x)))
where 'ι' is the iota-operator, 'z₁' is a constant for 'John', and the interpretation


of predicate letters is obvious. Nevertheless, van Dijk considers (5) (ii) not as a formalization of (4) (ii), but rather of the complex sentence
(6) The book John bought yesterday is about theoretical syntax.
He then argues that (5) (ii) is redundant with respect to (4) (ii) and proposes, as a better approximation for text (4), the formula
(7) (∃x)(((b(x) ∧ c(z₁,x))) ∧ s((ιy)((y = x) ∧ b(y)))).
But this version is also considered defective by van Dijk. Partly, because it does not take into account the fact that the order of the conjuncts may be significant, a fact which is not expressed by the use of the logical sign '∧'. But the main reason is that the existential quantifier used in it "merely specifies that the property holds for at least one individual, not for one individual" (p. 49). This is the reason for introducing a new operator, ETA (η), which identifies a particular but unspecified individual. After this remark, van Dijk proposes another formula:
(8) s((ηx)(b(x) ∧ c(z₁,x))).
Unfortunately, he does not say what sentence (or text) this formula is supposed to formalize, but it seems that it can only be taken to formalize the entire text, like (5) (ii). We shall return to this problem soon.

Coming back now to the main point: we agree with van Dijk's claim that (5) (ii) is the formalization of (6). But, unlike van Dijk, we consider it also a suitable formalization of the entire text (4). In fact, if the ι-operator is eliminated according to standard procedures, it is easily seen that (5) (i) becomes a part of the disabbreviated version of (5) (ii). Therefore, (5) (i) is redundant relative to (5) (ii), which, alone, suffices as a formalization of (4). If we are correct, then it is clear that the same logical form (or semantic representation) is assigned to a 'text', i.e. (4), and to a complex sentence, i.e. (6). Therefore, examples of this type do not support the claim that a distinction between texts and complex sentences must be drawn.

Let us disregard, for the sake of the argument, what has just been proved, and let us examine (7) on its own merits, in the light of van Dijk's 'linguistic' criteria. Apparently, what he has in mind when affirming the 'linguistic' superiority of (7) is the fact that this formula has, within the scope of the existential quantifier, the form of a conjunction of two formulae, each of which somehow represents one of the two sentences of text (4). It seems that van Dijk's requirement is based on a sort of Wittgensteinian picture theory of language. It could be formulated as follows: a semantic representation is better if it corresponds more closely ideographically to the surface structure represented. Such an 'ideographic' interpretation of van Dijk's requirement is suggested by the explanation he gives of the advantage of adopting (7) as a formalization of (4), namely, that (7) "shows the linguistically relevant fact that the individuals in the subsequent sentoids/sentences are identical and described both as books" (p. 49).
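The elimination step invoked above can be spelled out; the expansion below is our own reconstruction, using Russell's contextual definition of the ι-operator (the definition van Dijk himself adopts, as noted further on). Eliminating ι, (5) (ii) disabbreviates to
(∃x)((b(x) ∧ c(z₁,x)) ∧ (∀y)((b(y) ∧ c(z₁,y)) → y = x) ∧ s(x)),
whose first conjunct yields (5) (i) at once by conjunction elimination and existential generalization.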


This notion of 'linguistic perspicuity', which seems to be connected with the previously discussed concept of 'naturalness' of a theory, may have its merits and be worth considering. In a sense, it is reasonable to claim that '~~p' is a better representation of an English sentence containing a double negation than just 'p', although both are logically equivalent. But van Dijk's formalization (7), although probably better on these marginal grounds, fails to support his main claim, viz., that texts and complex sentences are to be differently treated in the grammar. For, obviously, (7) itself is a complex sentence. Moreover, from such formulae as (7) separate representations for (4) (i) and (4) (ii) can be easily derived by standard inference rules, whereas the opposite is not true, i.e., from a conjunction of the form
(9) (∃x)(b(x) ∧ c(z₁,x)) ∧ (∃x)s((ιy)((y = x) ∧ b(y)))

it is impossible to derive (7): each conjunct of (9) does follow from (7), by conjunction elimination and existential generalization, but nothing in (9) guarantees that its two existential quantifiers are satisfied by one and the same individual, which is just what (7) asserts. As for the 'linguistic perspicuity', it may be said that (7) is indeed in some respects more perspicuous but, in other, perhaps more important, respects, less perspicuous than (5) (ii), when both are viewed as alternative formalizations of (4). Expression (7) is more perspicuous than (5) (ii) in the sense we just pointed out: it contains, as it were, two parts which closely correspond to the parts of (4).8 But it is less perspicuous than (5) (ii) insofar as (a) it is essentially the representation of a complex sentence and not of two independent sentences and (b) it does not have a representation for the definite description which appears in (4) (ii), whereas (5) (ii) contains IOTA, which represents this definite description quite perspicuously. That is to say, ideographic perspicuity in one respect trades off with perspicuity in another respect, so that van Dijk cannot accumulate much capital out of his ideographic requirements.

Let us come back now to formula (8) and to its alleged superiority over the other formalizations discussed. As we have seen, the fundamental reason adduced for the rejection of (7) in favor of (8) is that the existential quantifier of (7) does not represent the fact that only one book was bought by John, but allows also for an interpretation according to which some books were bought by John. That is, (7) does not offer an adequate representation of the first sentence of (4), namely, (4) (i). In other words, what is claimed is that

(10) (∃x)(b(x) ∧ c(z₁,x))

is not a correct rendering of the logical force of (4) (i) taken in isolation. The question, then, is what would count as a correct representation of the logical

8 Incidentally, this effect is obtained through the use of an unnecessary pair of parentheses around b(x) ∧ c(z₁,x).


form of (4) (i). The ETA-operator is certainly useful for this purpose, and a natural answer would be:
(11) c(z₁, (ηx)(b(x))),

but certainly not (8), since it contains the predicate s ("...is about syntax"), which does not appear in (4) (i). Formula (8), therefore, although preventing the undesirable interpretation, is not a 'perspicuous' representation of (4) (i). What sentence, then, is formalized by (8)? It cannot be (4) (ii), since this sentence contains a definite article and not an indefinite one. Its formalization should, therefore, contain the IOTA- but not the ETA-operator. The only possibility left, then, is to consider (8) as a formalization of the entire text (4). But in that case, since the text as a whole contains both an indefinite and a definite article, why select for formalization the former and not the latter? If van Dijk can accept that one of the articles has to be 'sacrificed', so to speak, in the formalization of the text, then (5) (ii) would be as good a candidate as (8). We have already pointed out that van Dijk's idea is that there must be a sort of 'ideographic' perspicuity in the formalization. In that case, it would be reasonable to expect that both articles, which constitute important formal elements in the two sentences, should be somehow represented in the formalization. In order to obtain such a result, the following conjunction could be envisaged:
(12) c(z₁, (ηx)(b(x))) ∧ s((ιx)(b(x) ∧ c(z₁,x)))

But it is clear that such a formula does not capture what van Dijk considers the main linguistic insight to be captured by the formalization of text (4), namely, the fact that the individual referred to in the second sentence is the same as the one referred to in the first. This is so because the two occurrences of the variable x are not within the scope of a single quantifier. Moreover, the addition of a clause specifying this identity is not possible without using a quantifier whose scope comprises the whole conjunction. And the only quantifier able to do that job is the existential quantifier, so that, at best, we would have to return to a formula like (7). This, it has been claimed, is inadequate because it does not reproduce the specific characteristics of (4) (i). Alternative solutions would be (5) (ii), which is inadequate for the same reason, or (8), which does not reproduce the specificity of (4) (ii). Furthermore, this inadequacy stems from the fact that all three alternative formalizations of (4) are, in an essential way, complex sentences, which cannot be split into separate elements, each standing for one of the sentences of text (4). At this point, we can envisage the situation faced by van Dijk as a straightforward dilemma. Either something like (12), namely, two independent sentences combined by means of conjunction, or a complex single formula like (7), (5) (ii) or (8), is the adequate formalization of text (4). If the former, then the two-sentence text (4) has as its semantic representation a conjunction of two independent semantic representations, each corresponding to one of the sentences of the text. In that case,


it becomes obvious that the full description of a 'text' does not require anything that is not provided by an S-grammar. If the second alternative is chosen, then the semantic structures represent both the 'text' and a sentence which is in a paraphrase relation to it. And, again, the mechanisms of an S-grammar would be entirely sufficient to account for the structure of the 'text'.

Let us compare now van Dijk's type of argumentation with the 'traditional' way transformationalists present deep structures. Usually, a text like (4) would not be taken as an example whose deep structure is to be exhibited, for only sentences in isolation are considered. The novelty in van Dijk's approach is that he brings to the fore, as 'surface' starting points of linguistic analysis, sequences of sentences or 'texts'. But it would be wrong to think that 'texts' (in van Dijk's sense) do not play any role whatsoever in transformational analysis. As a matter of fact, in most grammatical arguments in which a certain 'deep structure' is proposed for a given sentence, such a deep structure is not presented explicitly employing the full formal apparatus of the theory. Instead, a set of kernel sentences which 'roughly' stands for the deep structure is supplied. In many respects such sets of sentences are like van Dijk's 'texts': the individual sentences are simple; they are separated by full stops. The only difference seems to be that their order is not considered essential to the description, whereas in van Dijk's 'texts' order is a factor of major importance. This way of arguing derives either from a desire to facilitate communication with readers who do not master the formal apparatus, or, and perhaps more frequently so, because the writers have no precise idea of what would be the full formal representation. Whatever the reason, it is clear that 'texts' have somehow been taken into account, at least as a heuristic device, in the literature.

When facing examples which seem to endanger his own preferred explanations, van Dijk reverts to the sort of argumentation we called above 'traditional'. Thus, in
(13) The girl I told you about yesterday will visit us next week.
a definite article is used before the corresponding referent has been 'introduced'. This example of 'retrospecification' of the referent apparently violates van Dijk's main rule for definitivization, namely, that "only previous sentoids... may identify discourse referents" (p. 57). His solution to the problem consists in postulating the 'text'
(14) (i) I told you about a girl yesterday.
     (ii) The girl will visit us next week.
as the structure underlying (13). But the informal presentation of (14) is nothing but the 'traditional' way of arguing on behalf of a particular deep structure for the complex sentence (13). From this point of view, there is absolutely nothing new in van Dijk's treatment of such a sentence.

Let us return once more to definitivization. The chief result of van Dijk's analysis of definitivization is the rule that
(15) all definite articles in a text are derived from preceding indefinite articles.
This rule, if true, provides an argument for introducing the notion of a text into


the grammar, for only by taking into account 'preceding' sentences/sentoids would it be possible to explain the use of the definite article in a given sentence. Van Dijk admits that an explanation for an occurrence of a definite article need not be found in a preceding sentence (or sentoid), but may also be provided by a 'preceding' situation or context. In that case, however, it cannot be simply said that the definite article is preceded by an indefinite article, for an article is a linguistic item, and not a feature of a situation. The rule would have to be reformulated in order to take care of this. It would have to say that an individual or class of individuals, or whatever may be referred to by the definite article, was previously 'introduced' in the discourse through the situation. This solution seems to be effective for the referential uses of the definite article, but it is useless for the non-referential uses. We turn now to the application of rule (15) to these different uses of the articles. Indefinite NPs may be referential, as in (16), or non-referential, as in (17):
(16) James bought a car yesterday.
(17) James cannot buy a car.
The term 'derive' in rule (15) is to be interpreted as meaning that, in a text, the characteristic features of the preceding indefinite article are transferred to the corresponding definite article which follows. The rule is a sort of law of conservation of features. It can be split into:
(15a) If the preceding 'a' is non-referential, so is the following 'the'.
(15b) If the preceding 'a' is referential, so is the following 'the'.
There are, however, counter-examples both to the rule (15) and to its specifications (15a) and (15b). Against the general rule that every 'the' must somehow be preceded by an 'a', we can mention the sentence
(18) Sally would surely fall in love with the man who is able to teach her Swahili.
Here, the 'the' is not, and need not be, preceded by an 'a'. Van Dijk would claim that this is so only because 'the man' is retro-specified by the restrictive relative clause 'who is able to teach her Swahili'. And, according to him, postnominal restrictive relative clauses are always derived from 'preceding' sentoids in the deep structure. Thus, for him, there would be in the deep structure of (18) a sentoid in which the noun 'man' would have the feature [-DEF]:9
(19) (i) A man is able to teach Sally Swahili.
     (ii) Sally would surely fall in love with him.
But (19) says much more than (18), as can be shown by means of standard logic notation. Thus, (18) can be formalized as
(20) (∀x)((Mx ∧ Txz₂) → Lz₂x)
(Mx = x is a man; Txy = x is able to teach Swahili to y; z₂ = Sally; Lxy = x will fall in love with y).

9 Since van Dijk's treatment of pronominalization is essentially parallel to his treatment of definitivization, the fact that we have in (19) 'him' and not 'the man' is immaterial for the discussion.


Now, (19) (i) seems to mean:
(21) (∃x)(Mx ∧ Txz₂)
But, whereas (21) affirms the existence of a man who is able to teach Sally Swahili, no such assertion is to be found in (20) or in the original (18). The reason for that is simply that the 'the' in (18) is a generic 'the', which does not imply existence, as, for instance, in
(22) The unicorn is a mythical animal.
As for (19) (ii), it could be formalized as:
(23) Lz₂((ιx)(Mx ∧ Txz₂))
But since, according to van Dijk, every occurrence of an ι-operator must be 'preceded', at least in deep structure, by an occurrence of an ETA-operator (or, at least, an existential quantifier), the fact that (23) can be accepted as a rendering of (19) (ii) is of no consequence, if no suitable formulation for (19) (i) can be found. Moreover, even (23) is not exempt from certain difficulties. As we have shown above (see our discussion of (5) (ii)), sentences like (23) can be taken as formalizations both of complex sentences and of texts, without the need of a separate representation of the 'other part of the text'. In that sense, (23) should be taken as a formalization of the sentence (18) and not of one of the parts of (19). But then none of the definitions of the ι-operator would be adequate: neither Russell's definition, used by van Dijk, from which a separate assertion of the existence of the individual described by the definite description does not follow; nor the more usual definition, from which such an assertion follows (cf. Reichenbach, 1947, p. 261). The reason is that in both definitions there is a commitment to the uniqueness of the individual in question, whereas in (18), the 'the' being generic, there is no such commitment.10
10 The distinction between the two definitions of IOTA becomes crucial once we replace 'fall in love' by 'marry', as in
(10.1) Sally would surely marry the man who is able to teach her Swahili.
This is so because 'marry' (in our culture) implies uniqueness. Russell's definition of IOTA could be used to capture successfully both the fact that Sally would marry exactly one man who has the ability to teach her Swahili and the fact that it is not sure that such a man exists. On the other hand, from (23), with Russell's interpretation, (21) does not follow. That means that (23) (interpreting 'L' as 'marry') would be a happy formalization of (10.1) but not of the text which corresponds to it, namely:
(10.2) (i) A man is able to teach Sally Swahili.
       (ii) Sally would surely marry him.
In order to make (23) represent the entire text (10.2), it would be necessary to use the usual definition of IOTA. But in that case (21) follows from (23) (both with 'L' interpreted as 'marry', of course), and the proposed formalization is no longer adequate for (10.1).


The preceding discussion shows that the text-grammar program is misguided not only in its general lines, but also in at least some of its few detailed analyses of grammatical facts. Let us turn now to a less detailed analysis of the remaining grammatical arguments, in order to show that they suffer from similar deficiencies. 3.3. Van Dijk's treatment of pronominalization is essentially parallel to that of definitivization. The basic rule is:11 "pronominalization may take place if the antecedent NP precedes, and 'backward' pronominalization is possible if the following 'postcedent' occurs in a sentence which immediately dominates the S-symbol under which the pronominalizable NP occurs" (p. 61) This rule was formulated by Langacker within the framework of transformational S-grammar. Van Dijk characterizes it as 'roughly correct'. One could ask, then, in what sense S-grammars are inadequate for dealing with pronominalization and, in particular, how does van Dijk justify the conclusion that "a textual treatment seems to impose itself in the case of pronominalization (p. 63). In fact, nothing in van Dijk's elaborate discussion of the problem offers an answer to these questions. In his discussion, he merely recapitulates the history of pronominalization treatment in S-grammars, trying to create the impression that something is wrong with such a treatment. After formukting the above rule, he discusses a wellknown apparent counterexample to it, the so-called Bach-Peters paradox. This 'paradox' hinges on the fact that if the rule is applied to a sentence like (24), then it will be impossible to show how the sentence is recursively generated. (24) The girl who fell in love with him kissed the boy who insulted her. Van Dijk mentions a series of authors who have tried to show that the paradox is only apparent and to provide solutions for it. He criticizes one of these attempts, McCawley's, for not being able to distinguish between the difference in meaning of two sentences closely related to (24), namely (25) and (26) The girl who fell in love with the boy who insulted her kissed him; The boy who insulted the girl who fell in love with him was kissed by her.

The difference becomes clear if we consider the possibility of 'him' referring not to the individual referred to by 'the boy', but to another individual i. In that case, according to (25) the girl kissed i and fell in love with the boy, whereas, according to (26), the girl fell in love with i and kissed the boy. But if McCawley failed in this case, Kuroda didn't, and van Dijk himself uses Kuroda's predicate logic formalization of the two sentences as a means for showing their difference in meaning. However,
11 Compare this rule with van Dijk's rule for definitivization (p.), which was discussed towards the end of the preceding section.


this solution is also rejected by van Dijk on the grounds that it is "cumbersome and not easy to handle" (p. 63), which can only mean, since no further explanation is given, that he considers it of excessive formal complexity, but not incorrect. Well, but then why is Karttunen's informal account, which "gives an immediate insight into who fell in love with whom and who kissed whom" (id.), also rejected? Certainly because it is too informal. It is clear, therefore, that van Dijk is looking for a sort of golden middle which combines rigour in formalization with intuitive clearness. Before getting to his proposal, let us point out that it is always possible to offer an ad hoc solution to a particular problem which is 'simpler' and 'more intuitive' than other theoretical solutions. But the measure of 'simplicity', as a methodological criterion, cannot be applied 'locally', but only 'globally'. And since van Dijk does not develop a global system of rules, even for the limited domain of pronominalization12, it is pointless to argue that his proposal is better on the grounds that it is simpler. But is it really simpler and formally satisfactory? Van Dijk's solution consists in postulating 'texts' (27) and (28) underlying, respectively, (25) and (26):
(27) (i) A boy insulted a girl.
     (ii) The girl fell in love with the boy.
     (iii) The girl kissed the boy.
(28) (i) A girl fell in love with a boy.
     (ii) The boy insulted the girl.
     (iii) The girl kissed the boy.

He then gives a quasi-formalization of the embedding relations between the sentences of such 'texts', and of the transformational chain which leads to (25) and (26). Nevertheless, this procedure is nothing but using the old trick again: that is to say, scoring points in 'simplicity' just by not being explicit enough. Had he presented explicitly the rules connecting his postulated texts and sentences (25) and (26), he would have realized that his solution is either as 'cumbersome' as Kuroda's or else altogether inadequate. Unlike van Dijk, who refuses to draw general conclusions from his discussion of pronominalization processes (p. 74), we draw the following one: nothing in van Dijk's discussion has substantiated the claim that in order to deal with pronominalization S-grammars are inadequate and therefore a shift to T-grammars is needed.
12 "It is impossible at this moment to draw general conclusions from our discussion of pronominalization processes, let alone to formulate sufficiently general and simple rules" (p. 74).


3.4. Definitivization and pronominalization are examples of grammatical facts that can only be dealt with, according to van Dijk, at the level of 'textual surface structure', because they involve relations between sentences which form a sequence. The semantic relations which explain definitivization and pronominalization do not involve entire sentences (or their semantic representations), but only some of their elements, especially their noun phrases. Examples of such semantic relations are "referential identity, semantic identity, lexical identity, ..." (p. 91). But there are also semantic relations between the whole sentences of a textual sequence, which must also be accounted for, in van Dijk's model, at the level of 'textual surface structure'. One of these relations is the relation of presupposition, which we shall now consider briefly. Van Dijk's definition of presupposition is an informal counterpart of some definitions that can be found in the literature (e.g., van Fraassen, 1968). Since it is not intended to be a significant contribution to the clarification of the linguistic notion of presupposition, it is pointless to discuss its shortcomings here.13 Leaving aside the question of a satisfactory definition of presupposition, we shall try to see in what sense the addition of the notion of 'text' to a grammar contributes to the explanation of the semantic relation of presupposition between sentences, as suggested by van Dijk.

Van Dijk's basic idea for a 'textual' treatment of presuppositions is that the presuppositions of a sentence should be equated with the 'preceding sentoids' in the semantic representation of the 'text' underlying the sequence of sentences to which the sentence belongs (p. 100). That means that, for each sentence involving presuppositions, there is an underlying 'textual surface structure' of which these presuppositions, as well as the semantic representation of the sentence in question, are parts, and in which the former 'precede' the latter. Consider the following example:
(29) Peter realizes that Johnᵢ pretends that heᵢ is ill.
This sentence has presuppositions of the first and second order. The first-order presupposition is the embedded sentence "Johnᵢ pretends that heᵢ is ill", which, in turn, presupposes that John is not ill. Since, according to van Dijk, presupposition is a transitive relation,14 the second-order presupposition (that John is not ill) is also a presupposition of the whole sentence (29). All this information is made explicit by van Dijk in the underlying 'text':
(30) (i) John is not ill.
     (ii) John pretends S(iii).
     (iii) John is ill.
     (iv) Peter realizes S(ii).

13 Some discussion of this topic can be found in Dascal and Margalit (1973).
14 That this is not a general rule is clearly shown by the existence of contexts that 'cancel out' the presuppositions of the embedded sentences (cf. Karttunen, 1973).


Having in mind the fact that one of the major purposes of text-grammars is to explain the notion of 'coherence' of a text, one may wonder in what sense (30) is a coherent text. For (30), as far as we understand, involves a straightforward contradiction, namely, between (i) and (iii). The presentation of a sentence on a separate line strongly suggests, indeed, that its truth is simply asserted. Van Dijk could, of course, argue that (iii) is 'dominated' by (ii), and therefore, that its truth is not independently asserted. But then, what is gained by separating (iii) from (ii), except for the typographical illusion of having a 'text' and not a complex sentence underlying (29)? Moreover, it is clear that (30), or something of the like, is nothing but an informal presentation of the set of kernel sentences that belong to the 'deep structure' of (29). The difference between T-grammarians and S-grammarians is that the latter, but not the former, offer also an explicit formal account of the structural relationships connecting (i)-(iv). We see, then, that the introduction of 'texts', even assuming that this can be done without introducing contradictions into the semantic representations, is not likely to lead us beyond the limits of an S-grammar, but rather leaves us far behind it in what concerns explicitness and rigour. To conclude this section, we may say that the only statement in van Dijk's treatment of presupposition with which we entirely agree is the following one: "From the observations made above, the precise theoretical status of presupposition ... has not become fully clear" (p. 100).
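The 'transitive' reading of presupposition that underlies (30) can be made concrete in a small sketch. Everything below is our own illustration: the presupposition rules for 'realize' (factive) and 'pretend' are stipulated for the example only, and, as footnote 14 observes, the unrestricted transitive closure computed here is in fact unsound, since some embedding contexts cancel the presuppositions of their complements.

```python
from dataclasses import dataclass

@dataclass
class S:
    verb: str                  # main predicate: "realize", "pretend", ...
    subject: str
    complement: object = None  # an embedded S, if any

def first_order_presuppositions(s):
    if s.verb == "realize":    # factive: presupposes its complement
        return [s.complement]
    if s.verb == "pretend":    # presupposes the negation of its complement
        return [("not", s.complement)]
    return []

def all_presuppositions(s):
    """Close the first-order presuppositions transitively, as van Dijk's
    text (30) requires; see footnote 14 for why this is too strong."""
    result, agenda = [], first_order_presuppositions(s)
    while agenda:
        p = agenda.pop()
        result.append(p)
        if isinstance(p, S):   # recurse into presupposed embedded sentences
            agenda.extend(first_order_presuppositions(p))
    return result

# Example (29): Peter realizes that John pretends that he is ill.
ill = S("be ill", "John")
s29 = S("realize", "Peter", S("pretend", "John", ill))
print(all_presuppositions(s29))
# yields the first-order presupposition (John pretends that he is ill) and,
# transitively, the second-order one: ('not', John is ill).
```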
* * *

It is clear, thus, that neither definitivization, nor pronominalization, nor presupposition, nor any of the other grammatical facts discussed by van Dijk require the replacement of S-grammar by T-grammar. Ultimately, van Dijk himself is aware of this fact. After spending about a hundred pages in trying to prove that a T-grammar is necessary to account for these grammatical facts, he withdraws from this position by declaring that the additional rules of grammar required to handle the combination of subsequent sentences in a text "could be introduced into a sentence-grammar without profound changes in the form of the grammar" (p. 132). This simply means that the grammatical arguments which allegedly support T-grammar vs. S-grammar do not in fact offer any evidence whatsoever for this claim. The same is true for the methodological arguments discussed in section 2. Hence, the remaining set of arguments, relying on our intuitions about the 'macro-coherence' of texts as reflected in the fact that we normally process texts by discerning in them structured 'plans' or plots, should provide not further evidence (p. 130) for T-grammars, but rather the only evidence, if any, that van Dijk is able to produce. A close examination of these arguments (see Dascal and Margalit, 1973), however, shows that the intuitions on which they depend are rather vague, and by no means comparable to our intuitions about the acceptability of sentences. Therefore,


the attempt to characterize the 'well-formedness' of texts by a set of recursive rules, analogous to those which characterize the well-formedness of sentences, seems to be doomed, and does not warrant the conclusion that there is such a thing as a 'grammar' of texts, and still less the conclusion that this alleged 'grammar' should replace the existing grammars of sentences.

[Figure 1. Schematic representation of the structure of van Dijk's T-grammar: semantic formation rules (R₁) generate macro-structures; macro-transformation rules (R₂) yield transformed macro-structures; transformation rules (R₃) yield the textual surface structure; the S-grammar components (micro-semantic formation rules, transformation rules, lexicalization rules) yield the syntactic representation of sequences; and R₄ yields the morphophonological representations.]


The inevitable conclusion, then, is that even the most elaborated results of the T-grammar research program have not, so far, provided the necessary justification for its central claims. The program has not produced a convincing 'refutation' of S-grammar, nor presented a viable alternative to it. Far from being the beginning of a promising 'revolution' in the field of linguistics, the T-grammar research program looks, so far, more like a doomed 'coup d'état'. In the light of the available evidence, therefore, no linguist could be suspected of 'conservatism', should he decide to stick to the old, but still fruitful, research program of transformational S-grammar.

REFERENCES

BEVER, T. G. (1971), The Integrated Study of Language Behavior, pp. 158-209 in: Morton, John (ed.), Biological and Social Factors in Psycholinguistics, London: Logos Press.
DASCAL, MARCELO and MARGALIT, AVISHAI (1973), Text-grammars - a critical view, in: Ihwe, J., Rieser, H., and Petőfi, J. (eds.), Probleme und Perspektiven der neueren textgrammatischen Forschung I, Papers in Textlinguistics, vol. 5, Hamburg: Buske.
VAN DIJK, TEUN A. (1972), Some Aspects of Text-Grammars, The Hague: Mouton.
VAN DIJK, T. A., JENS IHWE, JANOS PETŐFI, HANNES RIESER (1971), Textgrammatische Grundlagen für eine Theorie narrativer Strukturen, Linguistische Berichte 16, pp. 1-38.
FODOR, J. and KATZ, J. J. (1964), The structure of a semantic theory, pp. 479-518 in: Fodor, J. and Katz, J. J. (eds.), The Structure of Language, Englewood Cliffs: Prentice Hall.
FRAASSEN, BAS C. VAN (1968), Presupposition, Implication and Self-Reference, The Journal of Philosophy 65, 136-152.
HARRIS, ZELLIG S. (1951), Structural Linguistics, Chicago: The University of Chicago Press.
IHWE, JENS; HANNES RIESER; WOLFRAM KÖCK; MARTIN RÜRTENAUER (1971), Informationen über das Konstanzer Projekt 'Textlinguistik', Linguistische Berichte 13, pp. 105-6.
KARTTUNEN, LAURI (1973), Presuppositions of Compound Sentences, Linguistic Inquiry IV, pp. 169-193.
KASHER, ASA (1970), The Logical Status of Indexical Expressions, unpublished Ph.D. thesis, Jerusalem.
NAGEL, ERNST (1961), The Structure of Science. Problems in the Logic of Scientific Explanation, London: Routledge and Kegan Paul.
PETŐFI, JANOS S. (1971), Transformationsgrammatiken und eine ko-textuelle Texttheorie. Grundfragen und Konzeptionen, Frankfurt am Main: Athenäum.
REICHENBACH, HANS (1947), Elements of Symbolic Logic, New York: Macmillan.
STEMPEL, WOLF-DIETER (ed.) (1971), Beiträge zur Textlinguistik, München: Fink.

ANNOUNCEMENT

The Department of Philosophical Studies of Southern Illinois University at Edwardsville, in joint sponsorship with the Association for Symbolic Logic, takes great pleasure in announcing an International Conference on Relevance Logic, to be held in St. Louis, Missouri, from September 26 through September 28, 1974. The conference will be devoted to an examination of both the logical and philosophical aspects of relevance logics. There will be planned sessions on the following topics: I. The Pros and Cons of Relevance Logics; II. Philosophical Applications and Implications; III. Neighbors and Relatives of E and R; IV. Semantics and Proof Theory; V. Negation in Relevance Logics. In addition there will be one session for contributed papers. Among those presently scheduled to read papers are the following: Alan Ross Anderson, John A. Barker, Nuel D. Belnap, Jr., J. Michael Dunn, Kit Fine, Dov M. Gabbay, Louis F. Goble, Robert K. Meyer, Garrel Pottinger, R. Zane Parks, William T. Parry, Richard Routley and Alasdair Urquhart. Papers submitted for the open session may be on any aspect, logical or philosophical, of relevance logics and may reflect any point of view toward such topics. The papers to be read at the open session will be selected by the steering committee composed of Alan Ross Anderson, Nuel D. Belnap Jr., Kenneth W. Collier, Robert K. Meyer and Robert G. Wolf. It is planned to publish the proceedings of the conference. Individuals interested in further information about the conference should write to Professor Robert G. Wolf, Department of Philosophical Studies, Southern Illinois University, Edwardsville, Illinois 62025. Individuals interested in submitting papers to the open session should write to Professor Kenneth W. Collier at the same address. The deadline for submitted papers is 1 April 1974.

IRENA BELLERT

ON INFERENCES AND INTERPRETATION OF NATURAL LANGUAGE SENTENCES


The paper discusses the calculus of inferences of a complex sentence from the inferences attributable to its constituent parts. It is argued that the concept of presupposition is not well defined in a number of linguistic papers. Presuppositions could be defined in the same way as all other inferences, and the calculus of presuppositions would then constitute only part of the general calculus of inferences, which is a crucial problem in natural language semantics.

The problem of inferences in a natural language has recently received much attention in the literature on semantics. It may, therefore, be worthwhile to recall briefly the different ways of approaching this problem. Papers concerned with the relation of consequence, presupposition, entailment, implication, meaning postulates or meaning components are in fact all coping with the same problem, namely, the problem of the semantic relations holding between a sentence and the set of propositions it expresses, or in other words, the problem of what can be correctly said to follow from a sentence. It is evident that the semantics of any language (natural or formal) is inseparably connected with the rules that account for the relation of inference holding between the sentences of that language. Every speaker of a language can draw conclusions from the sentences of that language and from his implicit knowledge of the rules of that language, and the problem, in fact the crucial one in semantics, is how to account for such conclusions in an adequate way, that is, how to describe explicitly both the sentence structure and the rules of language which allow us to make the correct inferences, those we are capable of making in an intuitive way when we argue, get persuaded by someone's argument and draw proper conclusions from what we hear. In several papers (Bellert, 1968, 1970, 1972) I have proposed to identify the semantic interpretation of a sentence with the set of conclusions that can be drawn from that sentence and from a set of implicational rules pertinent to the syntactic and lexical characterization of that sentence. Such implicational rules correspond to Carnap's meaning postulates, and this concept has a tradition in philosophy; but such a conception of semantics had not been proposed before in linguistics. As I argued, it seems a possible and promising approach to
semantics. The proposal has empirical support, for the capability of drawing conclusions can be taken as evidence of understanding, and on the other hand we can expect that if someone understands what has been said, he is capable of drawing the correct conclusions. We may, therefore, say that:
(1) X can understand (is able to interpret) S ⇔ X can draw conclusions from S.

There are two problems which need some clarification here. Firstly, it is obvious that only some of the conclusions we can draw are made on the grounds of just the knowledge of language; other conclusions are made on the grounds of some additional premises pertaining to the factual knowledge of the world. The difference between the two is evident in clear-cut cases, but there is a large number of boundary cases for which only an arbitrary decision can be made as to whether we do or do not want to include the information in the description of our lexicon (our descriptive terms). This is a question that constantly poses itself to lexicographers: should such and such information be included in the description of a lexical item, or should it, rather, be left aside as pertaining to encyclopaedic knowledge? In any case, once such a (partly) arbitrary decision is made, we need to distinguish between the two types of conclusions: the ones which we make on additional grounds of the factual knowledge of the world, and which will not be obtainable from our description, and those which will be accounted for by our description of the given language and general rules of inference. I used to call the latter "consequences" in order to distinguish them from the former; in the present paper no confusion may arise, for I will be concerned with only those conclusions or inferences that, clearly, should be accounted for by an explicit description of language, and I will thus leave the problem of boundary cases aside. The second problem that I want to clarify is the following. There are obviously some inferences of complex sentences which correspond to the propositions included in and represented by the deep structure description of those sentences (obviously, there are also some which are even represented in the surface structure). The problem is whether it is possible to represent all propositions corresponding to the inferences we would have to account for in the deep (or logical) structure. I will argue below that the answer is in the negative. But even if it were possible to represent all such propositions in the deep (or logical) structure, it would still be necessary to add to our description additional rules for the calculus of inferences of complex sentences from the representation of their deep structure in terms of the inferences of the particular propositions included in the description. That is to say, the deep (or logical) structure representation alone would not be a solution to this problem. Earlier proponents of generative semantics suggested, however, that all semantic information be explicitly included in the deep (logical) structure. The extreme claim was that the deep, logical structure is the semantic interpretation of the corresponding sentence, a claim which is untenable. In more recent papers, generative semanticists do not maintain such an extreme position1. What seems to me important to realize is that no matter whether we accept the lexical decomposition hypothesis (as proposed by generative semantics) or not, whether we accept the deep structure of the interpretivists or just a grammar generating surface structures alone, we are always faced with the problem of the calculus of inferences of a complex sentence from its component parts. We cannot escape this problem in a full description of a language including semantics, and therefore the inclusion of the propositions corresponding to inferences in the representation of a sentence will never do the job; we still have to know in which syntactic positions such propositions constitute inferences of the entire sentence, and in which other positions they do not. The acceptance of the lexical hypothesis has, however, the following disadvantage for this purpose, as compared with the meaning postulates method. Each time a given lexical item occurs, the tree structure from which it is said to be derived must also occur in the description, thus making the representation of sentences unnecessarily complex. But if we add just one meaning postulate with the equivalent information, it will hold true of all occurrences of the lexical item with which it is associated, and we achieve the same goal by simpler means. In either case the rules for the calculus of inferences of complex sentences are indispensable. It is clear from the recent literature on the subject of inferences that the set of inferences that can be made from a complex sentence neither includes nor is included in, but in most cases overlaps with, the sum total of the inferences that would be made from the component clauses, if the latter were to be interpreted as sentences standing by themselves. There are inferences of the whole sentence that do not hold of any of its components, and on the other hand, there are inferences that could be made of a component (if it were a sentence by itself) and which do not hold of the entire sentence. The former can be exemplified, say, by counterfactual conditionals, from which we can infer the negation of the antecedent and the negation of the consequent clause2, whereas the latter can be shown in the case

1 For instance, Lakoff (1970) says: "We want a logic which is capable of accounting for all correct inferences in natural language and which rules out incorrect ones" and: "Meaning postulates will have to be accepted for a full description of natural language". He also adds: "I think it is clear that there is a range of cases where lexical decomposition is necessary. In addition, it is also clear that certain meaning postulates are necessary. The question is where to draw the line".
2 Consider, as an example, the following sentence: "If John had not been in Boston yesterday, he would not have lost his suitcase", from which we may infer: "John was in Boston yesterday" and "John lost his suitcase". In general, however, we need additional restrictions for the negation of the second clause.


of so-called verba dicendi and sentiendi, the complements of which cannot be inferred from the entire sentence. Thus, to use Karttunen's terminology (Karttunen, 1971), certain verbs behave like 'holes' (let the inferences through), others behave like 'plugs' (block off the inferences), still others behave like 'filters' (cancel the inferences under certain conditions)3. In view of what we already know about the inferences of complex sentences, we have to admit that the calculus of inferences is a problem to be dealt with independently of how much information is represented in the deep structure description. I would even venture to claim that the more we include in the deep structure, the more complex becomes the calculus of inferences. The proposal for representing auxiliaries, quantifiers, sentential connectives, abstract performatives, etc. as higher verbs (or 'abstract predicates') in the deep structure obliterates the semantic differences which obviously exist between these abstract entities, so that the apparent advantage gained from a uniform syntactic representation (all such entities have the same syntactic category) would be obtained at the expense of an extremely complex calculus of inferences, in which we would have to differentiate among higher verbs that are quantifiers, those that are sentential connectives, those that are modals, auxiliaries, etc. For we cannot simply assume that once we have such a uniform syntactic representation of the various 'abstract predicates', we can use standard logic to account for the different semantic properties of those 'predicates'. Moreover, linguistic quantifiers and connectives have additional semantic properties which are ignored in standard logic, and the remaining 'abstract predicates', such as auxiliaries, modals or tenses, need to be additionally defined. For instance, the linguistic quantifiers all or some give grounds for additional inferences concerning the embedded propositions (see Bellert, 1969,
3 The filtering conditions are roughly the following:
Let S stand for any sentence of the form 'If A then B'.
(a) If A presupposes C, then S presupposes C.
(b) If B presupposes C, then S presupposes C unless A semantically entails C.
Let S stand for any sentence of the form 'A and B'.
(a) If A presupposes C, then S presupposes C.
(b) If B presupposes C, then S presupposes C unless A semantically entails C.
Let S stand for any sentence of the form 'A or B'.
(a) If A presupposes C, then S presupposes C.
(b) If B presupposes C, then S presupposes C unless the negation of A semantically entails C.
The filtering conditions described above have been slightly revised by both Karttunen (1971) and Lightfoot (1971), but this is irrelevant to my argument, as in any case the filtering conditions play a significant role in the calculus of inferences. I will only argue for the extension of the filtering conditions so that they cover not only presuppositions but all inferences.
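Stated procedurally, the filtering conditions above amount to a small recursion over compound sentences. The following sketch is mine, in modern Python, and is not anything proposed by Karttunen or in this paper; the sentence encoding and the entailment test 'entails' are hypothetical stand-ins introduced only for the illustration:

from itertools import chain  # stdlib only; no further imports needed

def presuppositions(S, entails):
    # S is ("atom", asserted_content, [presuppositions]) for a clause,
    # or (connective, A, B) with connective in {"if", "and", "or"}.
    if S[0] == "atom":
        return set(S[2])
    conn, A, B = S
    result = set(presuppositions(A, entails))      # (a): A's presuppositions always project
    for C in presuppositions(B, entails):          # (b): B's may be filtered out
        if conn in ("if", "and") and entails(A, C):
            continue                               # filtered: A semantically entails C
        if conn == "or" and entails(("not", A), C):
            continue                               # filtered: ~A semantically entails C
        result.add(C)
    return result

# Toy example: "If John has children, then John's children are asleep."
entails = lambda X, C: X[0] == "atom" and X[1] == C   # toy entailment oracle
A = ("atom", "John has children", [])
B = ("atom", "John's children are asleep", ["John has children"])
print(presuppositions(("if", A, B), entails))         # set(): filtered out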


1972)4. Besides, we have more quantifiers in any natural language than we have in logic, and those that do share the semantic properties of the quantifiers defined in logic in some syntactic positions display some additional properties in other positions. The latter has been demonstrated by Zeno Vendler (1967) for the case of the distributive and non-distributive interpretation of quantifiers depending on the predicate in the quantified sentence. There is in addition one argument for the necessity of adding meaning postulates (or some equivalent device) to any possible description of a natural language that would account for the correct inferences, and this is the following. There is no denial of the fact that there are simple (non-complex) sentences from which it is possible to infer an unlimited number of sentences. This alone forces us to admit that we need some rules, such as meaning postulates, which will be finite in number and will account for the possibility of an unlimited number of inferences. To give a trivial example, consider the following sentences:
(2) This statue has stood in the city of Quebec for two hundred years.

(2a) This statue was standing in the city of Quebec fifty minutes ago.
(2b) This statue was standing in the city of Quebec five years ago.
(2c) This statue was not standing in the city of Montreal (New York, Warsaw, etc.) a year (fifty years, seven months, etc.) ago.

The sentences (2a) and (2b), and many analogical sentences, are correct inferences which can be accounted for by one meaning postulate, in which reference is made to all moments of time within the period denoted by the Time Adverbial preceded by the preposition for, the moment of speaking and the present perfect marker. Sentences of the form (2c), and an unlimited number of analogical sentences, can be accounted for if we add one additional meaning postulate, which roughly will say that: for any physical entity, if it is at a given time in a given place, it is not at the same time in any other place. What I have said so far may be summarized as follows: Meaning postulates (or some equivalent formal rules which would permit us to draw inferences from sentences) are indispensable for a full description of a natural language, that is, for a theory of language with semantics (a theory relating 'sounds to meanings'), independently of whether we accept one or another version of grammar. Evidently there is a separate problem worth consideration: what kind of information is really

4 This matter has been discussed in more detail in my other papers, but a well-known example is the one of an inference which does not hold in the case of a universal quantifier, but holds in the case of the linguistic all-quantifier. From the sentence "All the students sitting in Jim's room are reading" we may infer that there are students in Jim's room.


necessary for defining recursively all sentences of a language, and what can otherwise be 'left over' to be accounted for by meaning postulates.5 What I now want to submit for discussion is the following. If the problem of inferences of natural language sentences is indeed a crucial problem of semantics (which can hardly be denied), and if a number of linguists are already concerned with the problem, why then has so little concern been given to an explicit description and the formal properties of the relation of presupposition, inference or entailment in natural language? Let me give some examples. George Lakoff, in his very interesting paper (Lakoff, 1972, Chapter V), discusses the problem of transitivity of the relation of presupposition, but it is only in a footnote that he gives a definition of this relation, that is, he explains what he means by 'presupposition' in his discussion (Footnote 2, Chapter V): "This notation" (i.e. the implicational sign) "is introduced purely as a device to keep track of what is going on. It is not meant to have any theoretical significance. I take the term 'presupposition' as meaning what must be true in order for the sentence to be either true or false." If we rephrase Lakoff's definition formally, it is equivalent to Keenan's definition of presupposition (which will be discussed below), and then it can easily be proved by a few trivial steps that the relation of presupposition is necessarily transitive. Thus the puzzling problem of the transitivity of the relation of presupposition disappears immediately. Yet Lakoff concludes by leaving the problem of transitivity of presuppositions an open question. What he discusses in his paper, and what is of real interest, is obviously not the problem of transitivity of presupposition, but the general problem of the calculus of inferences in complex sentences. One could argue that the value of a discussion is independent of the fact that the problem is somewhat confusingly posed, and indeed the discussion and evidence presented by Lakoff is of great interest for further research in the field. However, the problem is ill-posed, and as often happens, this brings about some confusion. On the other hand, a properly posed problem, in non-ambiguous terms, already constitutes an achievement, as it may contribute to a better understanding of the subject matter. Similar observations can be made on some other papers which deal with the problem of presupposition, consequence and inference in natural language description. If I am examining some statements in more detail, it is because they appear in papers which are of great significance in the field. I also believe that there is a good deal of confusion on the matter in question and a great need for adequately describing the relations which are of crucial importance for natural language semantics.
5 Instead of deriving a lexical item from a tree structure corresponding to its meaning components, we can give the equivalent information in a meaning postulate associated with that lexical item. Many transformations could also be reformulated in terms of implicational rules.


Let me now make a few remarks on Karttunen's definitions concerning inferences in natural language, as Karttunen's papers have advanced to a great extent the linguistic description of presuppositions and inferences. Again, what is missing in his description, and what therefore brings about a slight confusion and logically puzzling problems, is explicitness in the definitions of the relations being discussed. Let me take an example from his classification of verbs into implicatives, negative implicatives, if-verbs, etc. (Karttunen, 1971). The meaning postulates for implicative verbs (such as 'manage to', 'succeed in', etc.) are roughly described as:
(3) (a) v(S) ⇒ S    'v(S) is a sufficient condition for S'
    (b) ~v(S) ⇒ ~S   'v(S) is a necessary condition for S'

Let me quote what Karttunen has said on the formal properties of the implicative sign ⇒ used in his definition: "In saying that p implies q, I am not using the term 'imply' in the sense of 'logically implies' or 'entails'. The relation is somewhat weaker, as indicated by the definition
(4) P implies Q, iff whenever P is asserted, the speaker ought to believe that Q."

Karttunen mentions, then, that this relation is closely related to Van Fraassen's notion of necessitation:
(5) P implies Q, iff whenever P is true, Q is also true.
There are some problems, however, with the meaning postulates which contain the implicative sign in question, as for example (3). The statement in (3a) says that 'v(S) is a sufficient condition of S' (which is obvious), while the statement in (3b) says that 'v(S) is a necessary condition of S', and the latter is not justified by the author. If such were the case, Karttunen's meaning postulates for implicative verbs would constitute an equivalence (v(S) would be truth-functionally equivalent to S, on the grounds that v(S) is said to be a sufficient and necessary condition of S). And this is something that Karttunen did not mean; on the contrary, he claims earlier in the same paper that the two are not logically equivalent, the implication holds in one direction only; the sentence with an implicative verb 'John managed to kiss Mary' is not equivalent to 'John kissed Mary', as the verb 'manage' carries along an extra assumption that is not shared by the latter sentence. The inconsistency stems from the fact that the statement 'v(S) is a necessary condition of S' holds true in this case only if the negation is conceived of as a logical, sentential negation, and then v(S) and S are indeed logically equivalent, as the law of contraposition applies to ~v(S) ⇒ ~S, and we obtain S ⇒ v(S). However, if the negation in ~v(S) is conceived of as a verbal, internal negation, then it does not follow that v(S) is also a necessary condition of S. This is not to say that the law of contraposition (or modus tollens) does not apply to the second meaning postulate in (3); it does, but then


the consequent is a negation of the internal negation of S, rather than v(S), and it is the former, rather than the latter, which can be correctly said to be a necessary condition of S. I wish to emphasize here that my critical comments concern the formal aspect of Karttunen's definitions only, something that can easily be corrected6, but I have learnt a lot from the discussion presented in the paper. And it seems that once we agree on specifying meaning postulates in terms of common criteria, it will be easier to push forward a description which would account adequately for inferences in natural language. Keenan's notion of presupposition (Keenan, 1970) is explicitly defined, and can therefore be discussed easily. From a formal point of view there can be no confusion as to the properties of the relation of presupposition:
(6) If a sentence S presupposes a sentence S1 (or S1 is a presupposition of S), then both S and its negation ~S logically imply S1 (or S1 is a consequence of S and ~S).

I will discuss this definition in detail, as it constitutes an explicit version of the definitions of presupposition accepted implicitly in several linguistic papers (the definition in Lakoff (1970) discussed above is an example). Such a definition would be improper in a two-valued logic. A presupposition S1 is a logical consequence of both S and ~S, that is to say, S1 is implied by S independently of the truth value of S (in all possible worlds). However, presuppositions are not logically true sentences, which is what would make it possible for them to be true in all possible worlds (in all possible states of affairs). In order to solve this problem, Keenan assigns a third or zero value to sentences whose presuppositions do not hold. Thus a presupposition may be false, but then the corresponding sentence will be said to be neither true nor false; it will have a zero value.

6 The same problem arises in the case of some other meaning postulates in Karttunen's paper. For instance, the negative only-if verbs (e.g. 'to hesitate') are associated with the following meaning postulate:
~v(S) ⇒ S    'v(S) is a necessary condition of ~S'
I think that v(S) can by no means be said to be a necessary condition for ~S. The first part of the meaning postulate correctly accounts for the fact that the sentence 'Bill did not hesitate to call Jim a liar' implies that Bill called Jim a liar, but the statement 'v(S) is a necessary condition for ~S' does not hold true, as it would mean that the sentence 'Bill did not call Jim a liar' implies that Bill hesitated to call him a liar. And it is obvious from what Karttunen says in the same paper that he did not mean this.
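For explicitness, the contraposition point made in the main text and in this note can be set down compactly. The lines below are only a restatement of the argument already given, not an addition to it; the symbol $v_{\mathrm{neg}}$, for the internally negated verb, is mine, introduced solely for the restatement:

\begin{align*}
\text{sentential negation:}\quad & \lnot v(S) \supset \lnot S \;\equiv\; S \supset v(S) \quad\text{(contraposition)},\\
& \text{which together with } v(S) \supset S \text{ yields } v(S) \equiv S;\\
\text{internal negation:}\quad & v_{\mathrm{neg}}(S) \supset \lnot S \;\equiv\; S \supset \lnot v_{\mathrm{neg}}(S),\\
& \text{so it is } \lnot v_{\mathrm{neg}}(S)\text{, not } v(S)\text{, that is a necessary condition of } S.
\end{align*}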


There is nothing formally wrong in Keenan's proposal. However, we wouldn't really want to use a non-bivalent logic, for many reasons. We would then have to dispense with the rules of standard logic and work out new inference rules based on a three-valued model. In addition to all the disadvantages of such an approach, we would not gain anything that is of interest to linguistics. For it is worthwhile to realize that linguists are not interested in the question of whether a sentence is actually true, false or neither true nor false (has a zero value), because this cannot be decided by linguistic or logical methods; it can be decided only by knowledge of the factual state of affairs and of the time at which a (declarative) sentence has been used. What linguists are interested in is, rather, the question of what can be correctly inferred from a sentence on the grounds of its component parts. What we need to know, then, is how to formulate rules which would account for the inferences that can be made whenever a given sentence (not necessarily a declarative one) is used, independently of the speaker, of the time of speaking, independently of whether it happens to be true, false or neither true nor false; independently of whether or not the speaker believes what he claims to be the case, and whether he is sincere or not. All such considerations do not belong to linguistics and should not bother us at all, for the rules of a language remain the same no matter whether a speaker tells the truth or not, whether he is sincere or not; if he lies, then the inferences will not be true either, and the listener will be misinformed, but the interpretation of the sentence constituting a lie will be correct. Now in order to give rules which would account for the inferences we make whenever a given sentence is uttered, it is necessary to extend the rules of standard logic, rather than to reject two-valued logic by accepting a three-valued logic. In any case, we need much more than standard logic rules for truth conditions for our purposes.7 And this can be achieved by meaning postulates or other equivalent devices, without rejecting a two-valued logic. In addition to the objection against the three-valued logic involved in the discussed definition, I doubt that the definition is empirically adequate or that it covers all cases referred to as presuppositions in the linguistic literature. Is it really
7 For instance, a speaker who uses a sentence of the form 'S1 and S2' or 'S1 but S2' expresses more (if only by the meaning of the connectives used) than just 'S1 is true and S2 is true'; a speaker who uses a sentence of the form 'If S1 then S2' also expresses more than just 'If S1 is true, then S2 is true'. This is to say that even the logical connectives have some additional semantic properties which can be formulated by appropriate meaning postulates. The sentence 'John returned home and had dinner' gives grounds to different inferences (has different truth conditions) than the sentence 'John had dinner and returned home'. For the meaning of the connective if ... then, see footnote 12.


a logical negation that is in question here? Is it not the case that only an internal (verbal) negation of S implies S1? Many linguists (including Keenan himself) have pointed to the semantic difference between the internal negation:
(7) John did not realize that Mary liked him.
and the weak (sentential) negation:
(8) It is not true (the case) that John realized that Mary liked him.
The sentence 'Mary liked John' can be said to follow from (7), whereas most people would not agree that it follows from (8). As a test, we may add the negation of 'Mary liked John' to (8), without making the statement contradictory:
(9) It is not the case (true) that John realized that Mary liked him; she didn't even like him at all (in fact she didn't like him).

The logical negation in the definition of presupposition would, however, cover both types of negation (if we want to retain a two-valued logic), so that the definiens would not constitute a necessary condition. Let me now sketch briefly the main point of my proposal concerning meaning postulates or implicational rules that would account for inferences in natural language. Let me discuss it first intuitively. Presuppositions of a sentence obviously constitute a subset of the inferences of that sentence, and in the calculus of inferences we should take both into account equally. If we want to distinguish the particular inferences (let us call them presuppositional inferences or p-inferences), it is on the grounds that p-inferences are those (inferences) which are more strongly associated with certain linguistic elements than are other inferences; that is, a p-inference holds not only of a simple sentence containing the corresponding "source-element" of that inference, but it will hold also of a complex sentence under some conditions (in some syntactic positions of the "source-element") in which other inferences do not hold or should be modified. For instance, a p-inference will hold true if the "source-element" is within the scope of a verbal negation, of verbs expressing the propositional attitude of the speaker (to want, doubt, imagine, etc.) or in the scope of if. Hence the problem of distinguishing a p-inference from other inferences boils down to a particular problem of the calculus of inferences, namely, the problem of specifying the conditions in which the inferences due to some components of a sentence remain the same or not in other syntactic positions8. And this is the basic
8 David Lightfoot (1971), for instance, has proposed conditions under which the inferences holding true of a subjunctive conditional are cancelled if the subject of the antecedent clause is indefinite and co-referential with the subject of the consequent clause in deep structure. These conditions will account, e.g., for the fact that the sentence 'If anybody had gone to Athens, they would have seen Socrates' does not imply that nobody went to Athens.


problem of the calculus of inferences of complex sentences. Moreover, once we have the rules for the calculus of inferences, and we do obtain the inferences of complex sentences by those rules, there is no need of distinguishing the p-inferences from other inferences. For what does it matter if a subset of inferences we can make from a complex sentence would have been also inferences of a different sentence, in which the "source-element" of the said inferences would have been in the scope of negation or in the scope of a verb expressing a propositional attitude, etc.? We could ask similar questions about other inferences and other syntactic positions as well. What remains to be shown, and what I will try to show, is that the rules of the calculus of p-inferences should then be extended to cover other inferences as well; in fact, those rules in some cases hold of p-inferences and other inferences alike, and in other cases they can be modified so as to cover all the inferences. It seems thus unreasonable to restrict the study of inferences to a particular case only. I will go into a more detailed examination of this problem after I present definitions which seem to me adequate for the relation of inference that holds in natural language. From a formal point of view, it seems to me that the concept of strict implication, originating from C. I. Lewis, is adequate for the linguistic concept of inference, as it applies to all cases discussed in the linguistic papers in the field. We will say that p ⥽ q (p strictly implies q) only if it is necessary that if p then q. Equivalently, we may say that p strictly implies q if it is not possible that p and not q. Thus we have:

(10) p ⥽ q =df □(p ⊃ q)

or equivalently

(11) p ⥽ q =df ~◇(p & ~q)

The intuitive understanding of the strict implication, proposed for meaning postulates that should account for the linguistic inferences, is the following. As we have to account for inferences that would hold true independently of the factual conditions in which sentences may be uttered (independently of who the speaker is, what he actually believes, what the actual state of affairs is), the implication is necessarily true, that is, its truth, by definition, is guaranteed by the meaning of p and q alone. This is exactly what we intuitively require of the relation called 'presupposition' in the literature, and what we require of all other inferences as well. The corresponding pragmatic definition, which follows from the formal properties of the strict implication, as defined above, is:
(12) p ⥽ q =df for any speaker, it is necessary that if he uses (utters) p, then he purports to believe that q


or equivalently
(13) p ⥽ q =df for any speaker, it is not possible to use (utter) p and at the same time to purport to believe that q is not the case.
If I say that the pragmatic definition follows from the definition of strict implication, it is for the following reasons. If the implication p ⥽ q is necessarily true (that is, q is a necessary condition of p by the meaning alone), then whenever we utter p, we cannot but express our purported belief that q. In other words, it is impossible to utter p and at the same time to express a belief that q is not the case. This holds independently of all extralinguistic factors pertaining to the time, place and circumstances in which the sentence is used; the inferences will thus be independent of the factual state of affairs or the state of mind of the speaker, of whether he is lying or telling the truth; if he lies, that is, if the sentence is false, then the conclusions will be false. But this is what happens when we make correct inferences from statements that are lies: we are deceived by the speaker, but our interpretation is correct. Finally, the pragmatic definition applies to questions and commands as well, since we do not apply the truth value to a sentence "p", but to the sentence "a speaker uses (utters) p", which is true whenever a speaker uses (utters) p. This way our inferences are independent of the actual states of affairs. The states of affairs change constantly from speaker to speaker, from one moment of time to another for the same speaker, etc. This is why a referential semantics for all possible discourses is out of the question. Our proper names and definite descriptions have unique referents only for a given discourse at a given time (with a few exceptions)9. But in establishing meaning postulates we are not concerned with this problem. Let me now discuss in some detail the rules for the calculus of inferences of complex sentences in relation to the calculus of p-inferences. Let me first discuss those rules that have been established for p-inferences, but which hold of all other inferences as well. Consider the following examples.
(14)
(A) John awoke Mary at noon        (B) Mary was asleep immediately before noon
    John killed Mary at noon           Mary was not alive after noon
    John pretended to be sick          John was not sick
    John met Mary in Boston            John was in Boston

The sentences of (B) would not be called p-inferences of the respective sentences of (A), as it is possible to use a negation of the latter together with the negation of

9 This is one of the crucial points concerning and limiting the possible semantic approaches to natural language. I have discussed some aspects of this matter in (Bellert, 1968).


the former without contradiction (and this would not be the case for p-inferences).
(15) John did not awake Mary at noon; she was not asleep immediately before noon.
John did not kill Mary at noon; she was alive after noon.
John did not pretend to be sick; he was sick.
John did not meet Mary in Boston; he was not in Boston.

The relation between sentences of (A) and those of (B) is of the same interest to semantics as the relation of p-inference, since sentences of (B) can clearly be inferred from the respective sentences of (A) on the grounds of the meaning of the respective verbs. Now if we test such inferences against the rules defined by Karttunen for the calculus of presuppositions (Karttunen, 1972), we will see that it is possible to extend the calculus of p-inferences to other inferences and thus cover a broader area of the semantics of natural language. Verbs classified as "holes", that is, those that let through all the p-inferences of their complements, will let through all the other inferences as well. Consider, as an example, the verbs 'to force', 'to regret' and 'to realize':
(16) Jim forced John to awake Mary at noon → Mary was asleep immediately before noon.
Jim forced John to kill Mary at noon → Mary was not alive after noon.
John regretted that he pretended to be sick → John was not sick.
John realized that he met Mary in Boston → John was in Boston.
Let us now examine the case of verbs classified as 'plugs', that is, those which cancel (to use Karttunen's terminology) the p-inferences of their complements. It could be expected that if such verbs cancel those inferences that remain 'untouched' in the scope of negation, then they should cancel other inferences as well. However, as it appears, in the calculus of inferences, in which we pose the problem with regard to all the inferences rather than with regard to a particular subset of inferences, we can extend the rule by stating that the inferences (whether p-inferences or not) remain but are modified: we have to refer to the beliefs of the individual denoted by the subject of the verb. Consider some examples:
(17) John has told me that Jim's children are sick.
John asked George to stop beating his wife.
John ordered his wife to awake the children at noon.
John said that he met Mary in Boston.


Now the sentences 'Jim has children', 'George used to beat his wife', 'The children will be asleep at noon' and 'John was in Boston' can be inferred, respectively, from the sentences in (17), but only as the reported beliefs of John. An interesting fact, which should also be accounted for in the calculus of inferences, is that when the verbs called 'plugs' become negated, then the denial 'cancels' the embedded propositions with their inferences (such is the semantic function of a denial in general), with the exception of those inferences which are not sensitive to propositional attitudes (and therefore to the denial), and thus remain unchanged. Both sentences:
(18) John has not told me that Jim's children are sick.
John did not ask George to stop beating his wife.

imply the purported belief of the speaker that Jim has children and that George used to beat his wife. Another interesting fact is that 'plugs' which correspond to verbs of propositional attitude, and which obviously have an effect on the asserted proposition (instead of being asserted it acquires the corresponding propositional attitude), have no effect on the remaining inferences. They are unchanged if the speaker is at the same time the subject of the verb in question10, or they become modified as the reported beliefs of the individual denoted by the subject (as in the case of verba dicendi in (17)). In order to make clear in our examples which propositions are asserted, I will indicate the main stress.
(19) I doubt if John's children went to Boston yesterday.
The speaker's purported beliefs are that John has children and that John's children went to Boston.
(20) Mary doubts if John's children went to Boston yesterday.
If Mary is reported to have doubts as to the date of John's children's trip to Boston, she must have assumed that John has children and that his children did go to Boston. This is something that we can infer from such a sentence. There is obviously a lot more to investigate in this respect, but what I want to show is that 'plugs' do not always cancel the inferences of their complements. The main argument for canceling the inferences in the case of 'plugs' was that the individual denoted by the subject could be misinformed. However, the same argument applies mutatis mutandis to the speaker and his purported beliefs: he can be misinformed as well. And according to our interpretation we take all sentences with all the inferences as relative truths in any case. Let me now say a few words on the interpretation of if ... then and (either) ... or sentences, and the filtering conditions established for such
10 Karttunen (1973) has observed this with respect to presuppositions, but the conditions become still more satisfying when we deal with the other inferences as well.


sentences (Karttunen, 1971, Lightfoot, 1971)11. Generally speaking, the filtering conditions tell us which inferences hold true of a given state of affairs, that is, which inferences are not filtered out by the scope of the element if or or, which indicate that reference is made to a possible state of affairs. The rules are very important for the calculus of inferences in general, but in my understanding of the semantic interpretation of sentences, they leave a very important aspect of the semantic interpretation of such sentences aside. We cannot ignore the inferences that hold true of the state of affairs that is claimed to be possible (or counterfactual) by the speaker, as this is what he intends to convey, this is his semantic message. The inferences of sentences, which can be accounted for by meaning postulates based on the notion of strict implication (or the corresponding pragmatic definition), are correct inferences independently of the actual state of affairs. For they depend solely on the meaning of linguistic expressions. The proposed definitions are, then, suitable even in the case of counterfactuals or sentences of the type 'Imagine (suppose) that S'. The inferences are then valid in the possible state of affairs as described by the speaker by the contents of the if-clause. And this is exactly the intended meaning of such sentences12. The interesting thing is that the interpretation is twofold. Some inferences, which are not sensitive to the scope of if, will be said to hold true of the actual state of affairs; the remaining inferences will be said to hold true of the state of affairs claimed by the speaker to be possible.
11 See footnote 3.
12 Notice that the semantic interpretation of the relation holding between the two clauses in sentences of the form 'If A then B' is not an inferential relation which we have to account for by linguistic rules. If a speaker says 'If A then B', it would be incorrect to say that B is inferred, entailed or implied, in any sense, by A. It is the speaker who claims that there is a dependency of B on A, and that, if A is the case, then B is (or will be) the case. A speaker may say 'If it rains I will go to the movies' as well as 'If it rains I will not go to the movies'. The relation between A and B, then, cannot, and does not need to, receive linguistic explanation. The connection between A and B is part of the claim of the speaker, and may depend on various considerations which may be absolutely unpredictable by the hearer. The speaker, in fact, introduces a new premise in the if-clause, which shifts the actual state of affairs the speaker was referring to into a possible (or sometimes counterfactual) state of affairs which is then referred to. The speaker then argues what would follow (or what would have followed) in such a state of affairs. Suppose someone says: 'If René Lévesque wins the elections, then Quebec will be independent, French will be the official language, etc.' The consequent clauses are probably correct inferences, but they are not of linguistic relevance, for they are based on additional premises that pertain to the factual knowledge of the world (for the above example); the consequent clauses may be based on just what the speaker intends to do in the case described by the if-clause (as was shown in the former example). From a linguistic standpoint, it seems to me that it would be sufficient to add a meaning postulate associated with all sentences of the form 'If A then B', to the effect that the speaker asserts that B is dependent on A (where 'asserts' is an abbreviation for 'intends to inform the addressee' (Bellert, 1972)).

Consider the sentence:
(21) If Bill remains in Boston, John's son will kill him in May.

The inferences constituting the speaker's purported beliefs concerning the actual state of affairs are the following: 'Bill is in Boston at the present time' and 'John has a son'. The inferences constituting the speaker's purported beliefs concerning a possible state of affairs are: 'Bill continues to stay in Boston', 'Bill will not be alive after May'. The example has not been analysed in precise terms, but it is used only to serve the point. The distinction between the two states of affairs is necessary, and both types of inferences are equally important for the semantic interpretation of such sentences, the semantic interpretation of a sentence being conceived of as a set of inferences or conclusions that can be drawn from that sentence and from a set of implicational rules (meaning postulates) pertinent to the syntactic and lexical characterization of that sentence.13

Bibliography

BELLERT, IRENA (1968), On a condition of the coherence of texts. International Symposium on Semiotics, Warsaw. (Reprinted in Semiotica 2.4, 1970.)
(1969), Arguments and predicates in the logico-semantic structure of utterances. Dordrecht-Holland: Reidel.
(1970), On the use of linguistic quantifying operators. COLING, Sanga Saby, 1969. (Reprinted in Poetics.)
(1972), On the logico-semantic structure of utterances. Wroclaw, Poland: Ossolineum.

CARNAP, RUDOLF (1947), Meaning and necessity. Chicago, Ill.: University of Chicago Press.
KARTTUNEN, LAURI (1971), The logic of English predicate complement constructions. IULC, mimeographed.
(1972), Presuppositions of compound sentences. Linguistic Inquiry 4.2.

13 The present text is an extended version of my paper "On the Application of Meaning Postulates to Linguistic Description", presented at the LSA Meeting in San Diego, 1973. I wish to express my indebtedness to David Lightfoot for his critical comments, owing to which I was able to clarify some of my points and eliminate some of the defects of my text. I also feel indebted to several other people who have discussed the paper and made comments. It is difficult, however, to realize to what extent the ideas and the improvements on my earlier concepts have been affected by the criticism I have received. In any case I would like to acknowledge the fact that some of Lieb's critical remarks (H. Lieb, "Grammars as Theories", Theoretical Linguistics, No. 1-2, 1974) have helped me to realize where I was wrong in my formulations (in "Theory of Language as an Interpreted Formal Theory", Proceedings of the 11th International Congress of Linguists, Bologna, 1972). Although I still disagree with Lieb on some points, I would like to express my indebtedness to him for his patience in going through the detailed and sometimes incorrect formulations of my earlier paper, which will certainly help me in further developing my concepts.


KEENAN, EDWARD (1970), A logical base for a transformational grammar of English. Philadelphia: TDAP 82.
LAKOFF, GEORGE (1970), Linguistics and natural logic. Mimeographed. (Reprinted in Synthese 22, No. 1-2.)
LIGHTFOOT, DAVID (1971), Notes on entailment and universal quantifiers. Papers in Linguistics 5, No. 2.
VAN FRAASSEN, B. C. (1968), Presupposition, implication and self-reference. Journal of Philosophy 65.
VENDLER, ZENO (1967), Linguistics in philosophy. Ithaca, N.Y.: Cornell University Press.

S. D. ISARD

WHAT WOULD YOU HAVE DONE IF ... ?

In this paper I formulate principles to account for some of the ways in which tense, mood, aspect and modal verbs are used in English, and describe a computer program which operates according to these principles. The program is capable of playing a game of tic-tac-toe (noughts and crosses) and answering questions about the course of the game. In particular, it is able to discuss hypothetical situations, both past and future, and to answer questions about possible, as well as actual, events. One of the main ideas on which the program is based is a "pronominal" account of both tense and mood as forms of definite reference to previously mentioned situations.

1. Introduction

In this paper I shall put forth some ideas on the use of tense, mood, aspect and modal verbs in English, and describe a computer program which embodies these ideas. The program is capable of playing a game of tic-tac-toe (noughts and crosses) and answering questions about the course of the game, in particular questions having the form of the paper's title. Before plunging into details, I would like to discuss briefly the point of writing such a computer program. To begin with, it is intended as an expository device, rather than a useful, or even potentially useful, piece of technology. It is a small working model, meant to illustrate some abstract proposals. As such, it is supposed not only to perform its task, but to do so in a way that is comprehensible and, hopefully, illuminating to an observer. A computer program cannot constitute a theory in itself, but it is often easier to grasp a theory through consideration of a detailed example than by starting with abstractions. Unfortunately, most readers of this paper will not have direct access to the program itself, for one reason or another, and will find themselves in a position where they must accept on faith that it actually exists and does what is claimed of it. Such readers are, in fact, being presented with whatever I may have to offer through the traditional medium of the ordinary language essay. However, even without the intention of actually writing a program, one can see the temptation to resort to computing terminology in discussing any process as complex as language use.


Our everyday vocabularies just don't seem to be very rich in the right sorts of words, and computational concepts, "calling a subroutine", for example, or "assigning a temporary value to a variable", provide us with the best, if still inadequate, source of process metaphors that we have. What the program itself then contributes is a demonstration that one's ideas about the process being modelled are at least coherent, that they can be filled in in detail, even if they ultimately turn out to be wrong. In the circumstances, this sort of demonstration is really of more direct benefit to the producer of ideas than to the consumer, but it is to be hoped that he will profit too in finding the wares in the marketplace better developed and less woolly than they might otherwise have been. To give a somewhat better idea of the range of English, and subject matter, that the program is concerned with, I would like to introduce a sample of its performance. Before I can do this, however, I must make the admission that it does not actually play tic-tac-toe in that game's classic form. Instead, it plays a variant in which, rather than alternately putting marks (0 and X) on a three-by-three board,

[fig. 1: a three-by-three board bearing an X and a 0]

the players choose numbers between one and nine. If we consider these numbers arrayed as an appropriate "magic square"
2 9 4
7 5 3
6 1 8

fig. 2

the game's original objective, to make three marks in a straight line, translates itself into the goal of taking three numbers which add up to fifteen. The program and its opponent communicate by means of a teletype. The game is played in the players' heads, so to speak, but after each of its own moves, and at other times if requested, the program signals its understanding of the current state of play by typing out the magic square with the numbers it has taken replaced by "0"s and those that its opponent has taken replaced by "X"s. Here is a sample conversation. It exhibits all the important syntactic constructions and most of the vocabulary with which the program can cope. The user's remarks are prefaced by U and the program's by P. These symbols do not actually appear on the teletype, nor do the parenthesized numbers, which are for future reference. Our teletype does not possess a question mark, but question marks are inserted here for ease of reading, as are commas, which the program does not use.
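The arithmetical reformulation is easy to make concrete. The following is only an illustrative sketch in modern Python (the paper's program is written in POP-2, and nothing here comes from it):

from itertools import combinations

def has_won(taken):
    # A player has won the number game exactly when some three of his
    # numbers sum to fifteen, i.e. when the corresponding cells of the
    # magic square of fig. 2 form a straight line.
    return any(sum(triple) == 15 for triple in combinations(taken, 3))

print(has_won({2, 5, 8}))   # True: a diagonal of the magic square
print(has_won({2, 5, 7}))   # False: no three numbers sum to 15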


The conversation begins before either player has made a move.


U: Display the board.
P:
2 7 6
9 5 1
4 3 8
U: What have you taken?
P: Nothing.
U: Move.
P:
0 7 6
9 5 1
4 3 8
U: I take five.
U: You move.
P:
0 7 6
9 X 1
4 3 0
U: Had you taken eight when I took five? (1)
P: No.
U: If I had taken six when I took five, would you have taken what you did? (2)
P: No.
U: What would you have taken? (3)
P: 4
U: Have you taken two? (4)
P: Yes.
U: Could you have taken two when I took five? (5)
P: No.
U: Did you take seven? (6)
P: No.
U: What did you take? (7)
P: Eight.
U: If I had taken four when I took five, what would you have done?
P: Take six.
U: Could you have won?
P: Yes.
U: Will you win?
P: Dunno.
U: Move.
P: I can't move now.
U: I take eight.
P: I took eight.
U: Had you taken eight when I took two?
P: You haven't taken 2.
U: Had you taken two?
P: Under what circumstances?
U: When you took eight.
P: Yes.
U: Can I win?
P: No.

2. Syntax

2.1. Translating vs. obeying

In describing how the system goes about responding to an English sentence, it will be helpful to distinguish between two modes of operation: a "translation" mode, in which the English is turned into computer instructions, and an "obeying" mode, in which the operations specified by these instructions are actually carried out. In computer terminology, this is essentially the distinction between compiling a program and running it. In this model it is supposed to correspond approximately to the distinction between a hearer's understanding what a sentence calls upon him to do (answer a question, say, or believe a statement) and his actually doing it. This is not, in fact, an altogether clear-cut distinction in the use of natural language. Consider, for instance, the process of establishing the referent of "that man over there in the corner" in a question such as "Is that man over there in the corner the Professor of Paleontology?" Identifying the man allows you to reduce the question to whether he is the Professor of Paleontology (and once he has been identified, it won't make any difference if he moves out of the corner and wanders about the room while you ponder the matter). In this sense, making the identification is part of finding out what the question is, and so to be viewed as part of the translation, or compiling, process. On the other hand, there is a sense in which it is an "obeying" action. For one thing, you could refuse to do it. You could simply not deign to figure out who was specified by the phrase "that man over there in the corner" (you could avert your gaze), and you could do this while understanding the words perfectly and knowing exactly how you would go about discovering who was indicated if you felt like it.


The moral, then, is that although there is a distinction to be made between comprehending instructions and carrying them out, there will be times when both sorts of process are going on simultaneously, when one instruction is being carried out in order to make clear the content of another. One consequence of this in the operation of our model is that for many, perhaps most, of the sentences that it treats, there never appears anything that could properly be called a translation of the entire sentence. That is, if it were to treat sentences about suspected paleontologists, which it does not, it would at some stage cook up instructions to itself to find out who was in the corner, and it would carry these out, with the result, let us say, that he is identified as Jones. It would next pose itself the question whether Jones is Professor of Paleontology, and, if it happens that he is not, there could be a final instruction to say "no". Of these three instructions, the one which discovers whether Jones is Professor of Paleontology would seem to come closest to capturing the sense of the question, but it misses out the information that the original sentence referred to him as the man in the corner. The identical instruction might arise in the process of answering a question which called him "that chap with the extraordinarily thick spectacles". One might, in analysing the system's operation, try to display all of the instructions at various levels on a tree, and let the tree as a whole represent the system's understanding of the sentence. Whatever the possible merits of such a notation, the system itself does not employ it because, as a model of a naive language user, it is not attempting to analyse its own operation.

2.2. Closure functions

I would like now to discuss the form of the instructions which the program derives for itself from the English sentences it is given. To do this, it is necessary to introduce the notion of a closure function. In fact I shall go back one step further and begin by saying that (plain, old) function is the name given to the fundamental unit of program in the programming language POP-2 (Burstall et al., 1971), in which the system is written. If you want to accomplish something with POP-2, you apply (or, equivalently, run or call) a function. The programming language comes equipped with a set of basic functions, including, for instance, addition and multiplication, and the activity of programming consists of combining, in various ways, functions you already have, to make new and more complicated ones. In general, functions require arguments on which to work. We can't call the addition function without supplying it with two numbers to add together. The number of arguments required will vary from function to function, and there are some, like the one which turns off the machine, which require none at all. Now, given a function, PLUS, say, which requires two arguments, we can, in POP-2, construct a new function by "freezing" one of the arguments of PLUS at some


specific number, say 1. That is, the new function, written PLUS(%1%), will require only one argument, and what it will do is to add 1 to this argument. Or, to put it another way, it is just like PLUS except that it supplies its own second argument, i.e. 1. A function constructed by "freezing" arguments in this manner is called a closure function. To choose an example more pertinent to the matter at hand, the tic-tac-toe playing system has a function TAKE, which requires two arguments, a player and a number, and whose effect is to transfer the number from the list of those not yet taken to the list of those taken by the player (or to print a rude remark if the number is no longer available to be taken). The closure function TAKE(%7%) then only requires a player as argument and has the effect of transferring 7 to his list. Where TAKE is used to translate the (two-argument) English verb "take", TAKE(%7%) corresponds to a (one-argument) verb phrase. Let me stress here the fact that TAKE(%7%) is a function, the same sort of beast as TAKE itself, the percentage signs being introduced so as to avoid a possible confusion arising from the fact that in ordinary mathematical notation, a term of the form F(x) represents the result of applying the function F to the argument x, and not, in general, another function. Sin(0) is a number, 0 in fact, whereas sin(%0%) is a function, a rather silly one to be sure, but a function nonetheless. One consequence of this is that we can, if we wish, repeat the process of freezing arguments and from TAKE(%7%) construct, e.g. TAKE(%7%)(%I%), which fixes both player and number and, when applied, wants no arguments at all, but simply awards 7 to me.
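By way of illustration, here is a rough Python analogue of closure functions (a sketch, not the POP-2 original; functools.partial plays the role of the %...% construction, though it freezes named rather than trailing arguments):

from functools import partial

def plus(x, y):
    return x + y

plus1 = partial(plus, y=1)      # analogue of PLUS(%1%): second argument frozen at 1
print(plus1(5))                 # prints 6

def take(player, number):
    print(player, "takes", number)

take7 = partial(take, number=7)          # analogue of TAKE(%7%): a one-argument "verb phrase"
take7_me = partial(take7, player="I")    # analogue of TAKE(%7%)(%I%): no arguments left
take7_me()                               # I takes 7

# Like POP-2 closure functions, a partial can be taken apart again:
print(take7.func is take, take7.keywords)   # True {'number': 7}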

2.3. Translation of clauses

Now, if TAKE(%7%) is to be compared to a verb phrase, and we further add a subject, to form TAKE(%7%)(%I%), we get something which might serve as the translation of a phrase like "my taking of seven", "that I take seven" or "for me to take seven". It falls short of representing a full clause since it does not contain, within itself, any indication of the circumstances to which it is meant to apply, past, present, or future, real or imaginary. As a result, in the translation of a sentence like "Have I taken seven?" it finds itself placed as an argument in a larger structure HAVE(%TAKE(%7%)(%I%)%), where the role of HAVE is to search (the machine's memory of) the past for an event of the sort specified by its argument, in this case an occasion on which I took seven. Without going too far into the details of the way in which HAVE is programmed, we can note here that in trying to decide whether such an event has occurred, it is able to look at the component parts of the closure function given to it as argument, and so seek out the agent, action, etc. of the desired event. This is because there are basic functions, supplied with the language, which will identify the original function and the frozen arguments out of which a closure


function has been constructed. In fact, given a closure function, we are not restricted to just looking at its components, but we can also alter the function by replacing them with new values. This ability turns out to be useful in the translation procedure. It is by iterating the process of embedding one closure function as the argument of another that the program builds up its translations of English clauses. Thus the function corresponding to "would you have taken seven" is finally SUBJUNCT(%WILL(%HAVE(%TAKE(%7%)(%YOU%)%)%)%), where each of SUBJUNCT, WILL and HAVE is a function taking a single closure function as argument, and the expression as a whole therefore represents a function of no argument. In fact, every clause is given a translation of this general form, which is to say a matrix of verb and one or two arguments embedded successively in functions which are classified as an aspect, a modal verb and a tense. This is true even for sentences such as "I take four", which do not exhibit any overt signs of aspect or modal verb. The modal and aspect functions employed in such cases are dummy ones, which do no real work when called, but simply call the next function in line. Thus the translation of "I take seven" is actually PRES(%NOMODAL(%NOASPECT(%TAKE(%7%)(%I%)%)%)%), where the effect of calling NOMODAL(%NOASPECT(%TAKE(%7%)(%I%)%)%) is to call NOASPECT(%TAKE(%7%)(%I%)%), which in turn calls TAKE(%7%)(%I%). These dummy functions were introduced in the interest of making the translation program, if not its output, somewhat simpler and easier to read, and they are not meant to have any theoretical significance. It is just that it is convenient to be able to furnish each clause with a fixed number of slots, and to know that, say, the main verb belongs in the fourth one from the top, both when deciding what to do when the main verb is encountered and also when it is necessary in making some later decision to ask what the main verb is. In what follows, I shall normally omit dummies when giving examples of the translator's output.
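The fixed-slot shape of a clause translation can be imitated in the same Python dress (hypothetical function names; the dummies simply pass control along):

from functools import partial

def take(number, player):
    print(player, "takes", number)

def noaspect(body): body()      # dummy aspect: just call the next function in line
def nomodal(body): body()       # dummy modal
def pres(body): body()          # present tense: evaluate against the current situation

# The translation of "I take seven", with every slot filled:
clause = partial(pres, partial(nomodal, partial(noaspect, partial(take, 7, "I"))))
clause()                        # I takes 7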

2.4. Syntactic trees

We are now in a position to discuss how the translation procedure works. What it produces, in fact, is translations of clauses, rather than of whole sentences. This distinction does not, of course, make itself felt until one begins to consider sentences with more than one clause. What happens in such cases is that when the translation of a clause is complete, it is run: the instructions into which it has been translated are obeyed. The fact that one clause is embedded in another is reflected not in the translation of either, as we have already noted, but rather in the fact that the translation and running of a subordinate clause are treated as a subprocess in the translation of a higher clause. That is, for a sentence such as "I took what you would have taken", one can represent not the translation procedure's output, but its operation with a tree diagram such as


[fig. 3: a tree whose root CLAUSE expands into NP, V and NP; the second NP dominates REL and a subordinate CLAUSE containing NP, MODAL, ASPECT and V]

where each node stands for a function which is called within the program, and the nodes lower down the tree are invoked during the operation of the ones higher up. While running the clause translator, we are called upon to run the nounphrase translator, the verb translator and the nounphrase translator again, and this second time the nounphrase translator itself calls the clause translator, and so on. There is, of course, nothing novel about relating the syntactic analysis of a sentence to the operation of an automaton in this way. In fact standard proofs that pushdown store automata can parse context free languages (see, e.g. Hopcroft and Ullman (1969)) exploit a correspondence between the parse tree of a string and the sequence of states through which the store of the automaton passes in processing it. Now the sort of tree whose absence is explained away in this fashion corresponds to what would normally be called surface structure. One can also find a correlate for a deeper level of analysis in the sequence of operations that are performed as the compiled programs are run. For it is at this level that sentences which would normally be said to have the same underlying representation receive similar treatment. For example, when faced with the sentence

(8) When you took seven, had I taken two?

the system first translates the words "when you took seven", then executes the resulting instructions, before translating the remaining words. In the case of

(9) Had I taken two when you took seven?

the translation of "had I taken two" is performed first, but execution of the result is postponed until after both translation and execution of the subordinate clause. In both cases, then, we execute the translation of "when you took seven" before the translation of "had I taken two", and in running the translations of


clauses in an order different from the one in which they are formed, the system might be viewed as performing grammatical transformations. However, the structures being transformed will have their existence only in the eye of the beholder, and not in the machine itself. This strikes me as a relatively happy situation from the point of view both of the linguist who would like his structures and transformations to have some sort of psychological reality, and the psychologist who would like to make use of the linguist's insights, but who is reluctant to postulate operations on trees going on in the head. Unfortunately, the sample of English understood by this program is small and far from exhibiting all of the complexities of English syntax, to say nothing of meaning, and so it can only be used as an illustration of the proposed relation between syntactic structures and the process of comprehension, and not as evidence for it.
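The two orders of events can be traced with a toy sketch (assumed helpers, not the actual program):

def run(label):
    print("run:", label)

def translate(label):
    print("translate:", label)
    return lambda: run(label)       # the compiled instructions, not yet obeyed

# (8) "When you took seven, had I taken two?"
sub = translate("when you took seven")
sub()                               # the subordinate clause is run at once
main = translate("had I taken two")
main()

# (9) "Had I taken two when you took seven?"
main = translate("had I taken two") # translated first...
sub = translate("when you took seven")
sub()                               # ...but the subordinate clause is still run first
main()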

2.5. The translation procedure

Translation is actually carried out by associating with each word a list of procedures to be run whenever the word is recognized. These procedures correspond, in the main, to the syntactic category of the word. They operate on closure functions, initially filled with dummies as described above, and in the most straightforward sort of case, simply replace one of the dummies with a "real" function. For example the verb "take" is assigned just the procedure V. When "take" is encountered this has the effect of replacing the dummy verb of the closure function with the function TAKE. In general, each verb has an associated "semantic" function, usually with the verb itself as its name, and the V procedure puts the semantic function of whatever word has just been recognized into the verb position of the closure function. A verb can also have other procedures as well as V on its list. "Win", for example, has both V and INTRANS, with the effect that after V inserts the WIN function in the verb position, INTRANS removes the object position from the closure function. There are similar procedures which replace the dummies in the tense, modal, aspect, subject and object positions when appropriate words are encountered. These procedures are also able to terminate the clause being translated and begin a fresh one if the position that they are trying to fill is already occupied. The tense, modal and aspect procedures also check positions which should not be filled until after theirs in a given clause. For instance, in the translation of "If I take six, will you win?" the modal procedure activated by "will" notes that TAKE has already replaced the dummy verb of the "if" clause and, since a modal cannot follow the main verb, it ends the translation of the "if" clause, runs it, and puts WILL into the modal position of a new clause.


(The decision to terminate the "if" clause cannot be taken before "will" is encountered because of the possibility of a subordinate "when" clause, as in "If I take six when you take five, will you win?", which would have to be dealt with before the "if" clause could be considered finished.)
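In outline, the dispatch might look as follows (a sketch with invented data structures):

def fresh_clause():
    return {"tense": "PRES", "modal": "DUMMY", "aspect": "DUMMY",
            "verb": "DUMMY", "subject": "DUMMY", "object": "DUMMY"}

def V(word, clause):
    clause["verb"] = word.upper()   # install the word's "semantic" function

def INTRANS(word, clause):
    del clause["object"]            # intransitive verbs lose the object position

procedures = {"take": [V], "win": [V, INTRANS]}

clause = fresh_clause()
for proc in procedures["win"]:      # run every procedure on the word's list
    proc("win", clause)
print(clause)                       # verb slot holds WIN, object slot removed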

2.6. Tense and mood markers

The tense position is capable of being occupied by functions which correspond to mood as well as tense. Its initial situation is different from that of the other positions in that it starts off with a "default" value of PRES rather than a dummy value. This corresponds to the common claim by linguists (see, e.g., Lyons (1968)) that the present tense is an unmarked case in English and means, in effect, that a clause is assumed to be present tense and non-subjunctive unless we encounter explicit evidence to the contrary. Verbs carrying markers of other tenses and moods set off both the procedures associated with the verb itself, and those associated with the marker. Actually, there is only one function that can supplant PRES at this stage, and that is REMOTE. That is to say that a verb form like "took" is not considered to be ambiguous between a past form which appears in "I took seven" and a subjunctive form which appears in "If I took seven, what would you do?". The system knows only a single "remote" form. However, the interpretation placed on this remote form differs with the circumstances in which it finds itself, and so the function REMOTE has two essentially different modes of behaviour. It can behave like a past tense and it can behave like a subjunctive. I shall spell out what it means to "behave like" a past tense or subjunctive in greater detail later on, but the distinction I have in mind is the ordinary one, expressed in informal terms by saying that the past tense is used to refer to actual past situations, while the subjunctive is used for hypothetical situations. In "if" clauses, the remote form is behaving as a subjunctive if replacing it by "were to" plus the infinitive produces a paraphrase. That is, "became" in

(10) If he became Prime Minister, there would be a revolution,

serves as a subjunctive, and the sentence is equivalent to

(11) If he were to become Prime Minister, there would be a revolution.

On the other hand, in

(12) If he became Prime Minister, then my history book is wrong,

"became" serves as a past tense form and there is no paraphrase

(13) If he were to become Prime Minister, then my history book is wrong.

The decision as to which way a given instance of REMOTE should behave is made by the procedures attached to other parts of the sentence.


In the program's present form, REMOTE is initially set to behave like a past tense, but when a sentence appears which contains both "if" and the remote form of a modal verb, i.e. "would", "could", "might", REMOTE begins to behave like a subjunctive when the translation of the "if" clause is run. The idea behind this initially complicated-looking condition is that the "if" clause conjures up a hypothetical situation to which subsequent remote modal forms refer. Any clauses whose translations are run before that of the "if" clause cannot refer to the hypothetical situation because it hasn't been created yet. The remote form continues to be taken as subjunctive in succeeding sentences as long as they contain remote modal forms. REMOTE reverts to being a past tense when a sentence is encountered which does not contain such a form. Thus in (1) of the sample dialogue the remote form is interpreted as a past tense. (2) contains the words "if" and "would", so that REMOTE behaves like a subjunctive in the "if" clause, and in the main clause, whose translation is run after that of its subordinate clauses. However, the "when" clause is subordinate to the "if" clause, and so its translation is run first, with the result that REMOTE is still a past tense in the "when" clause. (3) contains the remote modal form "would", so the subjunctive interpretation persists, but (4) does not contain a remote modal form, so that REMOTE reverts to being a past tense in (5). Note that if (5) had followed (3) directly, then it would have been natural to take it as a subjunctive, and a continued reference to the hypothetical case raised in (2). The "cancelling" effect of change of tense, which in this example operates through (4) to interrupt reference to the hypothetical case and make (5) a reference to the actual past, was pointed out to me by Christopher Longuet-Higgins (as was the example about the Prime Minister). There is one sort of clause which the program treats as an exception to the principles just set out and in which REMOTE acts as a past tense under all circumstances, but without interrupting the subjunctive interpretation for further clauses. This is the case of "what" clauses, either question or relative, such as "what you did" or "what did you do", which contain a remote tense marker, but no modal. Thus, in a sentence like "If I had taken four, would you have taken what you did?", the "if" clause is translated first, and its translation run, with REMOTE acting as a subjunctive. The sub-clause "what you did" of the main clause is the next to have its translation run, and here the remote marker acts as a reference to the actual past. But in the main clause, REMOTE acts as a subjunctive once again. In this way we get the desired effect of comparing what actually happened with what would have happened under different conditions. Unfortunately, it is not always correct to take the remote form in such "what" clauses as a reference to the actual past. The "what we said" in "If he became Prime Minister he would ignore what we said" can perfectly well be used to refer to what we would say in the hypothetical circumstance of his becoming Prime Minister, as well as to an actual past utterance.
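The toggling convention itself is small enough to summarize in a sketch (the real decision is distributed over several procedures):

class RemoteSetting:
    """Remote forms read as past tense until a sentence contains both "if"
    and a remote modal; they then read as subjunctive until a sentence
    arrives without any remote modal."""
    def __init__(self):
        self.subjunctive = False
    def next_sentence(self, has_if, has_remote_modal):
        if has_if and has_remote_modal:
            self.subjunctive = True
        elif not has_remote_modal:
            self.subjunctive = False
        return "subjunctive" if self.subjunctive else "past"

r = RemoteSetting()
print(r.next_sentence(has_if=False, has_remote_modal=False))  # past        (like (1))
print(r.next_sentence(has_if=True,  has_remote_modal=True))   # subjunctive (like (2))
print(r.next_sentence(has_if=False, has_remote_modal=True))   # subjunctive (like (3))
print(r.next_sentence(has_if=False, has_remote_modal=False))  # past again  (like (4))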


2.7. Binders

It still remains to discuss the syntactic procedures associated with the words "if", "when" and "what", which the program classes together as "binders". These procedures are quite similar in that they all bring about the translation of a sub-clause (the clause translator is called recursively) and then they apply a semantic function to the result. Thus in the translation of (2), the occurrence of "when" causes the clause translator to turn the succeeding words "I took five" into the function REMOTE(%TAKE(%5%)(%I%)%) to which a function WHEN is then applied. The procedure associated with "what" is slightly different in that the word is effectively decomposed into a word DAT, which gets inserted into the subject or object position in the sub-clause, and a function WH which is applied to the translation of the sub-clause. "What did you take?" then causes an application of WH to REMOTE(%TAKE(%DAT%)(%YOU%)%). Other differences among the binder procedures arise from the fact that in this context, an "if" clause can have a subordinate "when" clause, but not vice versa, and that relative clauses, but not questions, beginning with "what" can appear inside the other two kinds of clause. (The program does not deal with "when" questions.)
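Schematically, a binder procedure does no more than this (hypothetical stand-ins for the program's own functions):

def translate_clause(words):
    # stand-in for the recursive clause translator: returns a closure function
    return lambda: print("running the clause:", " ".join(words))

def WHEN(clause_fn):
    print("WHEN: fixing the tense referent from the sub-clause")
    clause_fn()

def binder_when(words):
    # the binder procedure: translate the sub-clause, then apply the
    # semantic function to the result
    WHEN(translate_clause(words))

binder_when(["I", "took", "five"])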

3. Semantics

3.1. Situations

We now turn to what the programs into which the English has been translated actually do. Very informally, we might say that they are programs for obtaining information about the machine's state of mind, and for altering this state of mind. Since the machine never thinks about anything but tic-tac-toe, and even about that with no great profundity, its states of mind can be described in relatively simple fashion. The central concept in its world view is that of a situation, which is basically a state of the game. For most purposes, situations are determined by the values of six variables in the program. These variables can be thought of as pigeonholes whose contents can be looked at, to determine the answers to questions, or replaced, by programs which alter the situations, or, rather, the machine's mental image of them. The particular variables at stake here are:

(1) MINE, which contains the list of numbers which have been taken by the machine.
(2) HIS, which contains the list of numbers taken by the opponent.
(3) REST, holding the list of numbers not yet taken.


(4) MEMORY, a list of the moves made so far, in the order in which they were made.
(5) TURN, which specifies whose turn it is.
(6) INTRAIN, which can specify that a particular move is in progress, or can be undefined if the situation to be represented lies between moves.
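In modern dress, the six pigeonholes might be rendered as follows (a hypothetical sketch, not the POP-2 representation):

from dataclasses import dataclass, field
from typing import Optional, List, Tuple

@dataclass
class Situation:
    mine: List[int] = field(default_factory=list)                        # (1) taken by the machine
    his: List[int] = field(default_factory=list)                         # (2) taken by the opponent
    rest: List[int] = field(default_factory=lambda: list(range(1, 10)))  # (3) not yet taken
    memory: List[Tuple[str, int]] = field(default_factory=list)          # (4) moves made so far
    turn: str = "his"                                                    # (5) whose turn it is
    intrain: Optional[Tuple[str, int]] = None                            # (6) move in progress, if any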

It is evident that there is considerable redundancy in this form of representation, but it is convenient to have such information as whose turn it is ready to hand, and not have to work it out each time it is wanted. There is also one further variable, FOCUSSED, which is part of the machine's concept of a situation, but it is of interest only in special cases, and I shall postpone discussion of it until later. At any given moment, the system can have only one situation under active consideration, in the sense that the crucial variables have one set of values, and no others. However, it can also have other situations "in mind", to be turned to, or returned to, when the occasion demands. In particular, we want to be able to remember the actual board position while considering hypothetical cases. There are two ways of keeping situations "in mind". The first is by means of a function called SPOSE. The effect of SPOSE is to copy down the values of the variables in the situation currently under consideration so that they can be reinstated at a prearranged point later on. Within the scope of a SPOSE, it is safe to tamper with the situation without losing track of reality. SPOSE appears typically in contexts like SPOSE TAKE(I, 7), which would be part of the program corresponding to the clause "if I take seven". The other way to store a situation is to create a function which, when called, will set up the situation by giving the crucial variables the right values. REMOTE uses a function of this sort when it behaves like a past tense. Its effect is to set up the past situation currently under discussion, so that the rest of the sentence can be considered with respect to it. Thus, when it comes time to run the translation of (6), which, omitting dummies, amounts to REMOTE(%TAKE(%7%)(%YOU%)%), the REMOTE in the tense position recreates the board position
referred to in (5) by the clause "when I took five" (fig. 4):

0 9 4
7 5 3
6 1 8

fig. 4
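The two storage devices might be sketched like this (a toy version over a dictionary of the crucial variables):

import copy

current = {"mine": [], "his": [], "rest": list(range(1, 10)), "memory": []}

def spose(player, number, body):
    """SPOSE: copy down the current situation, tamper with it, then reinstate it."""
    global current
    saved = copy.deepcopy(current)
    current["rest"].remove(number)
    current[player].append(number)
    current["memory"].append((player, number))
    body()                          # consider the rest of the sentence here
    current = saved                 # back to reality

def make_restorer():
    """The other device: a function which, when run, sets the situation up again."""
    frozen = copy.deepcopy(current)
    def restore():
        global current
        current = copy.deepcopy(frozen)
    return restore

recreate = make_restorer()          # REMOTE, as a past tense, runs a function like this one
spose("his", 7, lambda: print("inside 'if I take seven':", current["his"]))
print("afterwards:", current["his"])    # [] -- the real position is restored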


3.2. Tenses as pronouns

It is this sort of behaviour on the part of the past tense marker that makes McCawley (1971) refer to it as a kind of pronoun. It acts as a form of definite reference to a past situation on which the attention of the conversants has recently been focussed, usually by a previous mention. If you ask me "What did you do after your lecture this morning?" and I reply "I ate lunch", I mean that at the particular past time we are talking about I ate lunch, in just the same way that if you ask me "Is your brother right or left-handed?" and I reply "He is left-handed", I mean the particular man you asked me about. By way of contrast, the reply "I have eaten lunch" seems irrelevant or evasive, like "Someone is left-handed", because it doesn't make reference to the particular time in question. We might note in this regard that those operators in formal tense logics which are meant to be read as "at some time in the past it was true that" (see, e.g. Prior (1967)) do a job which is usually performed in English by the perfective aspect, "have" plus past participle, rather than the past tense. The program distinguishes between its HAVE and PAST functions. HAVE appears in the translation of a sentence like "Have I taken two?" and searches the memory for any occasion on which I took two, while PAST appears in the translation of "Did I take two?" and goes to a specific past situation to see whether I took two then. Both functions come into play in the program's translations of past perfect sentences, "Had I taken two?", where we first move back to a point in the past, and then search the range of moves before it. However, it is also possible to use past perfect sentences in such a way that the right translation would seem to involve two "pronominal" past tense functions. Consider "What had I done on the previous move? Had I taken two?".
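The contrast can be caricatured in a few lines (toy data; the real functions inspect closure functions rather than tuples):

memory = [("I", 5), ("you", 9), ("I", 7)]   # the moves so far, oldest first
past_referent = 1                           # index of the situation under discussion

def have(event):
    """HAVE: search the whole past for an occasion of the given sort."""
    return event in memory

def past(event):
    """PAST: go to the specific past situation referred to and look only there."""
    return memory[past_referent] == event

print(have(("I", 7)))   # True: at some time I took seven
print(past(("I", 7)))   # False: not at the time we are talking about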

3.3. Setting referents

An interesting difference between the case of pronouns and that of tenses is that we have few, if any, devices in English whose function is just to call attention to some item so that it can be referred to elsewhere by a pronoun. A possible candidate for such a device would be the sort of topicalization that produces constructions such as "And your father, how is he?" where "And your father" does nothing but call attention to its referent. However, this attention-calling need not necessarily be in the service of a pronoun, as we can see from "And your father, now there's a man I admire" or "And your appointment, did you arrive in time?". We do, however, have devices whose role is just to provide referents for tenses. "When" clauses, in particular, serve this function. The situation referred to by a "when" clause is always the referent of the tense in a higher clause, and, as in (5)-(7), it can serve as the referent of the tense in subsequent sentences as well.


In the cases dealt with by the program I have written, a past tense "when" clause, "when I took seven", establishes a referent for the past tense which stands either until the appearance of another past tense "when" clause, or, as mentioned earlier, until there is a change of tense. A new time clause will, of course, establish a new referent, but a change of tense leaves the past tense with no referent at all. In such a context, a return to the past tense with a sentence unadorned by any time clause or time adverbial will appear odd. Consider, for example, the effect of replacing sentence (5) of the sample dialogue by sentence (7). The pronominal behaviour of the present tense is slightly more complicated. To begin with, it can never find itself entirely without a referent, because it can always adopt its "default value", the time of speaking, in the absence of any other referent. The role of present tense "when" clauses is to provide other referents for the present tense. These are usually future situations, as in "When I take seven, what will you do?", or generic ones, as in "When I take aspirin, the spots grow even larger" or "When I take my sugar to tea, I'm as happy as I can be". The program does not attempt to cope with generic situations and I shall not discuss them here. A peculiarity of future situations is that further references to them will normally be made in conjunction with modal verbs, as in the example just cited or "When we get there, they may be gone" and "When you have typed that letter, you can go home". I am adopting here the position put forward by Boyd and Thorne (1969) that these modal verbs are not in themselves markers of a future tense, and that English, in contrast to, say, French or Italian, does not in fact have a syntactically distinguishable future tense. This is not to say that we cannot refer to future times, but just that when we do, we use the same tense that we use for talking about the present. In particular, then, I would not regard a sentence like "You can go home" as syntactically ambiguous, in spite of the fact that it can be used either in reply to "What shall I do when I have finished typing this letter?", making reference to a time in the future, or in reply to "What shall I do now?", making reference to the present moment. Boyd and Thorne, in arguing against the existence of a future tense, concentrate their fire on the claim that "will" serves as the marker for one. They give examples both of sentences about the future which lack a "will", like "He goes to London tomorrow", and sentences containing "will" which are not about the future, as in "My cousin is downstairs. He will be wondering what has happened to me." I think it is also worth remarking on the absence of "will" from time clauses which refer to the future. A language like French, which has a distinct future tense, will use it in such clauses, producing "quand il viendra" or "après qu'il sera venu" in contrast to the English "when he comes" or "after he has come". This fits in well with the contention of Boyd and Thorne that "will" is a marker of prediction, rather than of futurity as such. The prediction in "When he becomes Prime Minister, he will shed his lofty principles" is made by the main clause. His becoming Prime Minister is presupposed.


3.4. WHEN

These considerations should make clear the motivation behind the way that the program's WHEN function works. WHEN operates upon the sort of closure function that the program produces as the translation of a clause. In the case of "when I took seven", for example, it gets applied to REMOTE(%TAKE(%7%)(%I%)%). What it does, in essence, is to search out the situation described by the clause, in this case my taking of seven, and make the tense of the sentence, in this case past, refer to it. In slightly more detail, it first notes the tense, in order to find out where to look for a situation of the sort described. If the tense is past, it searches the memory. If this search were to fail, it would indicate a disagreement with the presupposition of the "when" clause, and a message to this effect would be printed out. In the present example, this would be "You haven't taken seven". If the search succeeds, the program notes the values of the variables which define the situation that it finds, and creates a function which, when called, will conjure up this situation by assigning the variables these values. The newly created function is then given the label PAST. When it is not in use, which is to say before any past tense references have been made in a conversation, or after a change of tense, this PAST label is hung on a function whose effect is to temporarily abandon the attempt to understand the sentence it is working on, print out "Under what circumstances", and interpret the response, in the hope that it will provide the missing referent. It then has another try at the original sentence. It is the function currently labelled PAST that REMOTE calls upon when it is acting as a past tense. Thus the REMOTE in the translation of (6) of the sample dialogue looks to see what function is currently labelled PAST, and finds the one created during the sentence (5). It then runs this function, recreating the scene "when I took five" and answers, with respect to it, a question which amounts to "are you just about to take seven, or in the process of doing so?". The reason for this formulation is that "when I took five" does not, even in the restricted universe under consideration, specify a precise point in time. We can compare, for instance,

(14) Did I win when I took five?

with

(15) Did you take six when I took five?

In the game of the sample dialogue, "when I took five" is equivalent to "on my first move" for the purposes of (14), but to "on your second move" for (15). The reasons for this do not lie in the syntax of (14) and (15), or even in their semantics, in the sense that the opposite interpretations, "your second move" for (14) and "my first move" for (15), would not be meaningless. It is not in general out of order to speak of two people doing things at the same time.


It is, however, impossible for two moves to take place at once in a game played according to the rules of tic-tac-toe, and the program makes the (pragmatic) assumption that it will only be asked to discuss possibilities that might arise during a well-formed game. It is in order to give it room to operate this assumption that the variable INTRAIN is introduced. The value of this variable is supposed to represent a move in progress, and the program can take a question about whether an event occurred in a given situation as asking either whether the move in progress constituted such an event, the interpretation for (14), or whether the event took place just after the move in progress, the interpretation for (15). When the body of a "when" clause specifies an event, as it does in "when I took five", the WHEN function makes this event the value of INTRAIN in the situation that it constructs and gives the memory and board position variables the values they held just before the event. In answering a question such as (14) or (15), the system first looks to see if the event INTRAIN and the event being asked about involve the same agent. If they do, as in (14), it then goes on to see whether the event INTRAIN meets the description given in the question. In the case of (14) it would have to ask whether my taking of five was a winning move. If the agents do not match, it is not the event INTRAIN that is of interest. In this case, the board position variables are altered by letting the move INTRAIN take place, and the question is taken as applying to the following move. This altered situation then carries on as the referent of the tense for the purpose of further sentences. Thus "Did I take four?" following a "Yes" answer to (15) would not be taken as equivalent to "Did I take four when I took five?", whereas following (14) it would be. In trying to seek out the situation referred to by a present tense "when" clause, "when I take six", the program must look to the future. This raises the difficulty, which the nature of the game allowed us to avoid in the case of the past tense, that there may be several different possible future situations in which the event could occur. The program does not answer "Well, that depends" in such circumstances, but instead looks to see whether the event might happen as one of the next pair of moves, and, if not, indicates that it doesn't know what situation is referred to. Furthermore, a present tense "when" clause carries the presupposition that the event it mentions, my taking of six, say, will actually happen. The program objects to this supposition if the event is impossible, for instance if six has already been taken, or if its own strategy is such as to prevent it happening. For example, if it is the program's turn to move, and it is about to take six, it will not countenance a "when I take six" from the user. Similarly, it will object to "when you take four" if its strategy dictates that it will not. If the event description manages to clear all these hurdles, things proceed as with the past tense. The program constructs a function which, when called, will create the situation referred to. In this case, of course, the function is given the label PRES, rather than PAST, and it can be called into use by further present tense sentences with modal verbs as discussed above.


As soon as a sentence appears whose main clause does not have this form, PRES reverts to the "actual present", the current game position. It is important to note that when the program encounters a present tense "when" clause, the point of departure for its search into the future is the current referent of PRES, and not necessarily the actual position in the game. Thus we might have an exchange like

U: What will you do when I take six?
P: Take 4.
U: What will you do when I take two?
P: Take 3.

where the second question-answer pair refers to the situation after six and four have been taken. We can also have sentences like "If you take two, what will you do when I take three?" where the referent of the present tense is given by the "if" clause, and the "when" clause follows on from the hypothetical situation established there.
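The agent-matching step of 3.4 might be rendered as follows (a sketch, with a canned continuation in place of the real game-player):

def answer_about(situation, asked_agent, test):
    """If the agent asked about is the agent of the move in progress, test that
    move itself (as in (14)); otherwise let the move happen and test the next
    one (as in (15))."""
    agent, number = situation["intrain"]
    if agent == asked_agent:
        return test(agent, number)
    situation["memory"].append((agent, number))      # the move INTRAIN takes place
    return test(*situation["next_move"](situation))  # ...and the following move is examined

situation = {"intrain": ("I", 5), "memory": [],
             "next_move": lambda s: ("you", 6)}      # hypothetical canned continuation

# (15) "Did you take six when I took five?" -- agents differ, so look one move on:
print(answer_about(situation, "you", lambda a, n: (a, n) == ("you", 6)))   # True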

3.5. IF

The function corresponding to the word "if" operates on present tense clauses in a manner quite similar to that just described for the function WHEN. It seeks out a future situation to serve as the referent of PRES. The main difference between the two arises from the lack of any presupposition, in "if" clauses, that the event described by the clause will actually take place. The IF function does not share the WHEN function's concern with what the program's strategy would lead it to do in a given situation, and so the program is willing to speculate on what might happen "if you take six" in circumstances that would lead it to reject "when you take six". There is a considerable divergence between IF and WHEN when we come to clauses bearing the remote tense marker. As mentioned earlier, "if" clauses with this marker are always interpreted by the program as being subjunctive, rather than past tense. The subjunctive mood shares a syntactic marker with the past tense, and exhibits the same sort of "pronominal" behaviour that we have attributed to tenses, but the sorts of things to which it "refers" are somewhat different. In the usages we are concerned with here, it does not stand for a particular situation, even a hypothetical one, but rather for an entire alternative time line. Consider the useful example (contributed by Christopher Longuet-Higgins) "If my father's name had been Smith, my name would be Smith". Speaking informally, we can say that the "if" clause asks us to consider a parallel universe in whose past my father's name was Smith. In the main clause, we go on to an assertion about the present of this parallel universe, namely that my name is Smith. The way that I have attempted to capture this intuition in the program is to let the referents of SUBJUNCT be functions which temporarily assign values to PRES and PAST which make them refer to situations different from their "real" values.


Thus applying PAST within the scope of a SUBJUNCT evokes the subjunctive past, the past of the alternative universe, rather than a point in the actual past. Now, it is "if" clauses that tell SUBJUNCT which alternative universe to consider. Taking first remote clauses without perfective aspect, "if I took seven", we see that they refer to possible future events in the same way as present tense "if" clauses without perfective aspect, "if I take seven". I believe that there is a preference in ordinary usage to employ the subjunctive form in cases where we deem the event unlikely, but I have not taken account of this in the program. The search for a referent situation is therefore carried out in the same way. However, this situation, when found, is not now given directly to PRES, but rather a function which gives it to PRES is created, and made the value of SUBJUNCT. The situation can then be re-evoked only by first entering the subjunctive. Our account of the difference between sequences like

(16) What will you do if I take five? Will you take four?

and

(17) What would you do if I took five? Would you take four?

is not then in terms of a difference in meaning between them, but just that the hypothetical situation is filed under the label "present" in one case, and under "present subjunctive" in the other. A possible reason for wanting to have such alternative filing systems is that, as we have already noted, the syntactic labels can be used in a variety of different ways, and one of them might be in use at any given moment. Consider, for instance,

U: What will you do if I take five?
P: Take four.
U: If I take six, will I win?

In U's second question, the present tense can continue the reference to the hypothetical case established in his first. However, in

In U's second question, the present tense can continue the reference to the hypothetical case established in his first. However, in U: P: U: What would you do if I took five? Take 4. If I take six, will I win?

the change of tense shifts the topic away from the hypothetical situation, and back to the current board position.
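The "filing system" itself is tiny; a sketch (hypothetical labels as dictionary keys):

referents = {"PRES": "the actual board position", "SUBJUNCT": None}

def file_hypothetical(situation, remote):
    # (16) files the hypothetical case under PRES; (17) files it under SUBJUNCT
    referents["SUBJUNCT" if remote else "PRES"] = situation

file_hypothetical("position after I take five", remote=True)   # as in (17)
print(referents["PRES"])       # unchanged: plain present tense still means reality
print(referents["SUBJUNCT"])   # the hypothetical case, reached only via remote modals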

3.6. The subjunctive past

Since the subjunctive manifests itself in the same syntactic marker that might otherwise signal the past tense, the language must press some other device into service to form the past of the subjunctive. What it uses is the perfective aspect marker "have", and in subjunctive environments the program always translates this marker as PAST, rather than the function HAVE discussed earlier. (This can happen in other constructions as well. We have seen an example involving the past perfect in 3.2. and "He may have" can also serve as either "Maybe he has" or "Maybe he did", depending on whether there is a particular past situation under discussion.) The role of IF in a construction like "If I had taken six when I took two" is then to seek out the hypothetical situation which has been specified, and then declare that when we are in the subjunctive mood, this hypothetical situation counts as the past. In somewhat greater detail, the program's operations on this clause go as follows: The translation procedure produces a closure function of the form IF(%SUBJUNCT(%PAST(%TAKE(%6%)(%I%)%)%)%). In doing this it has produced, and run, a translation of the sub-clause "when I took two", with the result that the (ordinary, non-subjunctive) value of PAST has been set to a specific past situation that has my taking of two in progress. IF then looks down into its frozen argument, finds PAST, and applies it, setting up the past situation. The next step is to alter this situation to one in which I take six. It does this by changing the move in progress, the value of the variable INTRAIN, to a move in which I take six. (If the "when" clause were "when you took two", it would be the following move that was altered.) A function is then constructed whose effect is to set the value of PAST to this altered situation, and this function is made the value of SUBJUNCT. The hypothetical situation is now ready to be consulted by succeeding past subjunctive main clauses. Suppose, for instance, that we get the clause "would you have taken three?", which translates to SUBJUNCT(%PAST(%WILL(%TAKE(%3%)(%YOU%)%)%)%). SUBJUNCT gives PAST the value established by the "if" clause. PAST is then applied, actually setting up the hypothetical situation, with respect to which a question amounting to "now will you take three?" is asked.
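A miniature trace of these operations (hypothetical representation of situations):

import copy

# Set by running "when I took two": the real past, with that move in progress.
PAST_SITUATION = {"memory": [("you", 9)], "intrain": ("I", 2)}

def if_subjunctive_past(event):
    """IF on "if I had taken six ...": alter the move in progress to the
    hypothetical event, and let the result be what PAST means while we are
    in the subjunctive mood."""
    altered = copy.deepcopy(PAST_SITUATION)
    altered["intrain"] = event          # I take six instead of two
    return lambda: altered              # this function becomes the value of SUBJUNCT

SUBJUNCT = if_subjunctive_past(("I", 6))
print(SUBJUNCT())   # the hypothetical past, ready for "would you have taken three?"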

3.7. Modal verbs

The program treats the three modal verbs "may", "will" and "can", together with their associated remote forms "might", "would" and "could". The senses in which these words are taken are as the "may" of possibility (as opposed to permission), "can" in the sense of "be able" and the "will" of prediction.


Unfortunately for these purposes, the "permission" sense of "may" is the preferred one when the word appears in questions, and "might" is often used in a non-remote sense to replace it. "Might you take four?" is a much more natural question than "May you take four?". The program is therefore written to allow "might" to be either remote or not, whichever seems appropriate in the context. There are also cases in which "might" and "could" can take their remote forms without an explicit "if" clause having gone before to define the subjunctive, as in "When I took six, might (or could) you have taken two?". Such sentences seem to carry an implicit "if things had been different" or "if you had felt like it". For the program's purposes, it is possible to gain the effect of such an implicit clause by simply letting the actual past act as the subjunctive past in these sentences. Thus in (5) of the sample dialogue the "have" in the main clause is translated as PAST, because the clause is subjunctive, and this PAST is used to refer to the situation described in the "when" clause, with respect to which "can you take two?" is asked. The presupposition that two was not actually taken is overlooked. The functions MAY, WILL, and CAN operate by considering possible continuations from the board position under consideration at the time they are run. All three take functions corresponding to events, e.g. TAKE(%4%)(%YOU%), as arguments. MAY is the simplest and just hunts for any possible continuation that includes an event corresponding to its argument. WILL ignores those continuations which involve the program's making a move that would be contrary to its strategy. It can confidently predict that none of these will happen. It reports back on whether all, none, or some but not all of the remaining possible games lead to an occurrence of the specified event. Thus, suppose that the program is asked "Will you win?". If every continuation played according to the program's strategy leads to a win, it answers "Yes", if none do, it answers "No" and if some do, but not all, it answers "Dunno". The function CAN notes the agent of the event it is given as argument and checks to see whether he can, by making a suitable move at each stage, force the event to occur. The program ignores its actual strategy in these calculations, so that it is able to report that there are things which it can do, but will not. CAN is also quite different from MAY, because the program will not claim that it can win when the opponent is able to prevent it, but it will still say that it may win if it is possible for it to do so through a mistake on the opponent's part. The program answers "No" to "can" questions in situations where the agent has no sure-fire strategy for bringing the event about. In fact it would often be better to say "Not necessarily", because "No" seems to express impossibility, rather than lack of certainty. That is, a reply of "No" to "Can you win?" looks equivalent to "I cannot win", which would be wrong in a situation where a win was still possible if the opponent slipped up.
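The three searches differ only in how they filter and quantify over continuations; schematically (toy data standing in for the generated game tree):

# Each toy continuation: (conforms to the machine's strategy?, events occurring in it)
continuations = [
    (True,  {"machine wins"}),
    (True,  {"machine wins"}),
    (False, {"user wins"}),
]

def may(event):
    """MAY: any possible continuation at all containing the event."""
    return any(event in evs for _, evs in continuations)

def will(event):
    """WILL: discard continuations contrary to strategy; report all/none/some."""
    relevant = [evs for ok, evs in continuations if ok]
    hits = sum(event in evs for evs in relevant)
    return "Yes" if hits == len(relevant) else ("No" if hits == 0 else "Dunno")

print(may("user wins"))        # True: a mistake could still let it happen
print(will("machine wins"))    # Yes: every strategy-conforming game gets there
# CAN would instead ask whether the agent can force the event at every stage,
# ignoring the machine's actual strategy.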


3.8. Focus

The discussion of the modal functions has so far proceeded as if they always searched possible continuations from the given board position all the way to the end of the game. But a question like "If I take six, will you take two?" seems to be more naturally taken as "will you take two on your next move?" than as "will you take two during the rest of the game?". Questions involving "win", on the other hand, "If I take six, will you win?", do not seem to be focussed on the next move. I have not produced any principled account of this phenomenon, but have provided the variable FOCUSSED, which can have the value 0 or 1, as one of the elements in a situation. When operating in a situation where the value of the variable is 1, the modal functions only look ahead to the agent's next move. There are two main ways in which situations become focussed. The program sets the variable to 1 before asking questions with the verbs "take" or "do", and the situations supplied by the function WHEN are always focussed. Note that the question "When I take six, will you win?" does mean "Will you win on your next move?". This device serves its purpose in the present restricted context, but it is really just an indication that I have noticed a problem, not that I have solved it.

3.9. Other functions

The program contains a number of further functions whose workings I will not go into here. They do things which have to be done somehow or other in order to make the program work, but the way in which they do their jobs is not meant to say anything about the way in which people use English. Two such functions that have been mentioned in passing are the WHAT function, which searches the universe for an object, or action, that fits a given description, and the function that decides whether two event descriptions amount to the same thing in a given context, e.g. whether my taking of seven constitutes a winning move. I have tried to keep such functions to a minimum, sometimes at the cost of failing to make the program do things that it clearly could be made to do without much difficulty. The reason comes back to the program's original purpose as an expository device. With this in mind, I thought it best to keep it as simple as possible, so as to help the reader see more easily what the limits of my proposals are, and what phenomena are meant to fall within their scope. Ad hoc patches, to make the basic principles appear to cover more cases than they really do, would only confuse matters.

Acknowledgements

This work was supported by a grant from the Science Research Council. As should be evident from the text, my thinking on these topics has been greatly influenced by conversations with Christopher Longuet-Higgins. This does


not, of course, mean that he necessarily subscribes to my conclusions. I am also grateful to Julian Davies, Anthony Davey and Graeme Ritchie for many helpful discussions about both English and programming.

References

BOYD, JULIAN and THORNE, J. P. (1969), The semantics of modal verbs, Journal of Linguistics 5, 57-74.
BURSTALL, R. M., COLLINS, J. S. and POPPLESTONE, R. J. (1971), Programming in POP-2. Edinburgh: Edinburgh University Press.
HOPCROFT, J. E. and ULLMAN, J. D. (1969), Formal Languages and Their Relation to Automata. Reading, Mass.: Addison-Wesley.
MCCAWLEY, JAMES D. (1971), Tense and Time Reference in English. In Studies in Linguistic Semantics, CHARLES J. FILLMORE and D. TERENCE LANGENDOEN (Eds.). New York, N.Y.: Holt, Rinehart and Winston.
PRIOR, A. N. (1967), Past, Present and Future. Oxford: Oxford University Press.

FRANZ VON KUTSCHERA

INDICATIVE CONDITIONALS

In this paper a semantics and logic of conditional necessity is developed as the basis of a logic of indicative and subjunctive conditionals and of causal sentences. It is argued, against E. Adams and D. Lewis, that these three types of statements differ only in presupposition.

In "Counterfactuals" (1973) David Lewis has developed a logical system VC for counterfactuals. For the normal cases of counterfactuals, in which the antecedent is false, VC can be replaced by VW, a system based on weak instead of strong centering. In this paper I shall try to show that this system can also be applied to indicative conditionals and can generally be used for a comprehensive and unified treatment of the logic of conditionals. The main obstacles in generalizing Lewis' analysis of counterfactuals to include indicative conditionals are, first, that the intuitive background he provides for his semantics favours strong centering, and, second, an argument by Ernest W. Adams in (1970) to the effect that "subjunctive and indicative conditionals are... logically distinct species" so that the truth-conditions for the former cannot be derived from those for the latter by adding suitable presuppositions. Our first step will be a reconsideration of that argument. 1. Adams' argument

If counterfactuals derive from indicative conditionals or both from a basic type of conditionals then it should be true that:

(1) an indicative conditional 'If it is the case that A, then it is the case that B' (shortly: 'If A, then B') has for non-A the same truth conditions as the counterfactual 'If it were the case that A, then it would be the case that B'.

According to Adams the two following sentences form a counter-example to (1), since we consider (2) true and its antecedent false, but (3) false:

(2) If Oswald didn't shoot Kennedy, then someone else did.
(3) If Oswald hadn't shot Kennedy, then someone else would have.


Now Lewis, and before him N. Goodman, N. Rescher and R. Stalnaker, have analyzed the truth conditions of conditionals as dependent upon ceteris-paribus-conditions C, not explicitly mentioned in the conditional, that are compatible with the antecedent. If we change our assumptions as to the truth or compatibility of C with A, then we also change our assessment of the truth of the conditional. In the example we are only prepared to accept (2) as true if we know that Kennedy was indeed shot in Dallas, and if we consider that compatible with Oswald's innocence, although this may be very unlikely for us as things stand. If, on the other hand, we consider (3) to be false, we take it that Oswald did indeed shoot Kennedy and that there was no one else around with intentions or means to assassinate the president. We therefore do not consider Kennedy's being shot in Dallas compatible with Oswald's innocence.1 So we have changed our assessment of the ways things might have been, if Oswald hadn't shot Kennedy. But (1) presupposes that this assessment remain the same for the indicative and the counterfactual conditional. The difference in our assessment of the truth of (2) and (3) seems to be a consequence of the fact that, while (2) speaks about the author of the killing in Dallas, (3) implies that Kennedy was somehow fated to be killed anyhow, which is not implied in the logical representation of (3). There are indeed quite a lot of differences of meaning in natural language, for instance by a change in topic and comment as in Goodman's example of Georgia and New York2, that are not accounted for in the usual straight-forward logical representation. So Adams' example is not a conclusive argument against (1).

2. Types of conditionals

In traditional grammar three types of conditionals are distinguished: those using the indicative in the antecedent and consequent (indicative conditionals) and two forms of subjunctive conditionals. These two forms can be distinguished morphologically in some languages, as in Latin, by their use of present or past tense (Si hoc credas, erres vs. Si hoc crederes, errares), but generally they have to be determined by the fact that one type (the counterfactual) carries the presupposition that the antecedent (and also normally the succedent) is false, while the other type (in Latin potentialis) carries no such presupposition but expresses the speaker's opinion that the antecedent is improbable or uncertain. We shall also include causal statements of the form "Since it is the case that A, it is the case that B" in our investigation of conditionals. Such sentences presuppose that A (and hence B) is true. The grammatical subdivision of conditionals is of little logical interest since it mixes syntactical criteria (mood) with semantical (presupposition) and pragmatical
1 Cf. also Lewis (1973), p. 71.
2 Cf. Goodman (1965), pp. 14 seq.


ones (beliefs of the speaker). As we are here only after truth conditions and not after expressive meaning components the difference between indicative conditional and potentialis is not relevant for us. And since we shall not consider partial interpretations3 we shall take no account of presuppositions. We want to argue that we can get along then with only one type of conditional which we write A ⇒ B. We say that A ⇒ B is used as an indicative conditional if it is undecided (for the speaker) whether the antecedent A holds or not. A ⇒ B is used as a counterfactual if (for the speaker) it is a fact that ¬A. And it is used as a causal statement "Since it is the case that A, it is the case that B" if (for the speaker) it is a fact that A.4 We think, therefore, that the difference between the indicative, counterfactual, and causal conditional is not a difference of truth-conditions but only a difference in presupposition. If we assert for instance

(1) If Jack believes that John is married, then he is wrong,

and are told that Jack does not believe John to be married, then we are committed to the statement

(2) If Jack were to believe that John is married, he would be wrong.

The reason for asserting (1), viz. that John is not married, is the same as that for asserting (2). And if we assert (2) we are committed to (1) if we hear that it is really uncertain whether Jack believes John to be married or not. And if I assert (1) then if I learn that Jack really does believe John to be married, then I am committed to the statement

(3) Since Jack believes that John is married, he is wrong.

And conversely, if I assert (3) and then learn that it is not sure that Jack believes John to be married, I shall say that (1) is true. One or a few examples are not conclusive evidence for our thesis of course. They just serve to give it a certain intuitive plausibility. Our main argument has to be that the semantic analysis of A ⇒ B is such that the thesis is intuitively adequate.

3. Similarity of worlds

D. Lewis gives several types of semantics for the language of conditionals. The intuitively fundamental one is that of comparative similarity systems5. Such systems are based on relations j ≤ᵢ k on the set I of possible worlds, for all i ∈ I. j ≤ᵢ k
3 Cf. for instance Kutschera (1974a).
4 We shall not discuss the difference between subjective and objective presuppositions here.
5 Cf. Lewis (1973), pp. 48 seq.


says that the world k is at least as similar to i as j is. j ≤ᵢ k is to be a weak ordering for which several conditions hold, among them

(1) j <ᵢ i for all i, j ∈ I with j ≠ i.

This is the condition of strong centering, which says that every world is more similar to itself than any other world. Lewis' truth condition for A ⇒ B is

(2) A ⇒ B is true in i iff A is impossible or there is an A-world j so that all A-worlds that are at least as similar to i as j are B-worlds.

From (1) we obtain then

(3) A ∧ B ⊃ (A ⇒ B).

This is harmless for counterfactuals, which normally are used only under the presupposition that ¬A. It is unacceptable, however, if we want to interpret A ⇒ B as the basic form of conditionals, since every causal conditional would then be true. If we replace (1) by the condition for weak centering

(1′) j ≤ᵢ i for all i, j ∈ I,

then (3) is not valid anymore, but the assumption that there is a world, different from i, which is to i just as similar as i itself, is counterintuitive. Similarity of j and i, according to Lewis, is to be overall-similarity, so that j is the more similar to i the more details they have in common and the more important these common details are. Since for j ≠ i, j must in some details, however few and unimportant, be different from i, i itself must certainly be more similar to i than j. To obtain an adequate semantics for our A ⇒ B there remain then only two possibilities: change (2) or change the whole intuitive background of the semantics. We might, for instance, change (2) to

(2′) A ⇒ B is true in i iff A is impossible or necessary or there is an A-world j and a ¬A-world k so that all A-worlds that are at least as similar to i as j or k are B-worlds.
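Restated symbolically (a reconstruction from the prose above, writing [A] for the set of worlds in which A is true):

(2) A ⇒ B is true in i iff $[A]=\emptyset \;\lor\; \exists j\in[A]\,\forall k\in[A]\,(j\le_i k \rightarrow k\in[B])$.

(2′) A ⇒ B is true in i iff $[A]=\emptyset \;\lor\; [A]=I \;\lor\; \exists j\in[A]\,\exists k\in[\lnot A]\,\forall m\in[A]\,((j\le_i m \lor k\le_i m) \rightarrow m\in[B])$.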

In case A is false in i this coincides with (2), i.e. nothing is changed for counterfactuals. (2') expresses the fact that A=^B holds if we can infer B from A together with a suitable ceteris-paribus-condition compatible both with A and A, and looks, therefore, like a good candidate for indicative conditionals. The trouble with (2'), however, is that the logic we obtain from this condition is too weak. Conditionals are a type of inference-relation and though many of the fundamental principles valid, for instance, for logical entailment or material or strict implication (like the laws of contraposition, strengthening of the premiss or transitivity) are not valid for conditionals6 they are valid in the normal cases. From (2'), however, we do not obtain sufficiently strong restrictions of these laws.
Cf. Lewis (1973), 1.8.

Indicative conditionals

261

We shall therefore follow the second course and abandon the use of worldsimilarities for the interpretation of A^B altogether.

4.

Conditional necessity

We shall interpret A=>B as a statement about conditional necessity and read it as "On condition that A, it is necessary that B". The notion of conditional necessity is a generalization of the usual notion of (unconditional) necessity, as conditional probability or conditional obligation are generalisations of the notions of (unconditional) probability and obligation. Under different conditions different propositions may be necessary. From conditional necessity we obtain two concepts of unconditional necessity: proposition p is weakly necessary if it is necessary on a tautologous condition, and p is strongly necessary if it is necessary under all conditions, p is weakly necessary if under the given circumstances p is normally the case. Therefore A^B expresses a notion of weak necessity: on condition that A, it is normally the case that B. Conditional possibility can then be defined by D4.1. A=?>B: = -,(A=>-iB) We read A==^B as "On condition that A, it is (weakly) possible that B". A proposition p is unconditionally weakly possible if under the given circumstances it would not be abnormal if p were the case. And p is strongly possible if there is a condition under which p is (weakly) possible. Before we discuss these intuitive concepts further let me give the formal definitions: Let ( be the language obtained from that of Predicate Logic by stipulating that (A^B) be a sentence if A and B are. To economize on brackets , , are to bind stronger and n>, = weaker than =>, so that we may write => C=>C=>TAinsteadof (()=>( Q) D (C=>nA). D4.2. An interpretation of ( is a quadruple <U, I, f, > so that: (1) (2) (3) U is a non-empty set of (possible) objects. I is a non-empty set of (possible) worlds. f(i, X) is a function on I x P(I) (P(I) being the power set of I) so that for all i I and X r> I (a) f(i,X)c=X (b) X c f(i, ) r> f(i, ) 7 (c) X c: f(i, Y)f|X => f(i, ) = f(i, )) (d) ief(i,I)

For the sake of brevity we use the logical operators of ( also as metatheoretical symbols.

262

Franz von Kutschera

(4)

For all i is a function from the set of sentences of ( into the set {t, } of truth values so that (a) Oj (a) = <Pj(a) for all j I and all individual constants a. (b) Oj satisfies the conditions for interpretations of the language of Predicate Logic over U. (c) (=) = t iff f(i, A) c [B], where [B] = {jel: Oj(B) = t}.

In modal logic we set Oj(NA) = t iff Sj c [A], where for all Sj is a non-empty subset of I. Sj is the set of worlds possible from the standpoint of i. (4c) is the straight-forward generalization for conditional necessity: f(i, A) is the set of worlds (weakly) possible under condition that A from the standpoint of i. If we set Si = (J((i, X), then Sj is the set of worlds strongly possible from the standpoint of i, i.e. the proposition X is strongly necessary iff Sj c X, and X is strongly impossible iff Sj c X. We construe f so that

()

f(i,X) = A = SiCX,

i.e. f(i, X) is empty iff X is strongly impossible. This follows from the conditions of D4.2(3), and the definition of S^ If SiC: X then f(i, ) = according to (a). If Sj then there is a Y with f(i, Y) f) , so according to (c) f(i, X f) Y) = f(i, ) , and according to (b) f(i, ) . From the definition of Si we obtain () Sj c=X iff for all Y c I f(i, Y) c X, And (a) together with (a) implies

()
D4.3. (a) (b)

SiC=X = f(i,X)cX.
NA :=nA=>A MA:=-iNiA,

We can, therefore, define strong necessity and possibility by

while weak necessity and weak possibility are defined by D4.4. (a) (b) LA : = T=>A, where T is a tautology, PA : = -iLiA.

Now condition D4.2(3 a) says that all worlds (weakly) possible on condition that X are X-worlds. Condition (b)always of D4.2(3)says that if X is strongly possible and Xc Y, then Y is strongly possible. This is the law A =>BhMA => MB of modal logic. Condition (c) (in view of (a) and (b)) is equivalent to f(i, )) ^ f(i, Y) f(i> ) X; i.e. if among the worlds (weakly) possible on condition that Y there are some X-worlds, then these are the worlds (weakly) possible on condition that X and Y. This implies the law of im- and exportation of premisses ( B=>C = A=>B => C).

Indicative conditionals

263

Condition (d) finally says that i is weakly possible from the standpoint of i. (d) together with (c) implies the law of modus ponens for conditional necessity: (A=>B) =>B. A word, perhaps, is also in order on condition D4.2 (4a): All individual constants are interpreted as standard names. S. Kripke has given good reasons for such a procedure in (1972). Since we are not interested in existence here we have not introduced sets Uj of objects existing in i. If is a one-place predicate constant of ( we could set () = Ui and define quantification over existing instead of possible objects by .[]: = ( ID A[x]). D4.5. An interpretation SR = <U, I, f, > satisfies a sentence A in i iff () = t. A is valid in $R iff 9M satisfies A for all i I. And A is C-valid iff A is valid in all interpretations of (. Our concept of interpretation appears in Lewis (1973), 2.7 as that of a model based on a weakly centered selection function. His selection functions, however, are introduced on the basis of comparative similarity concepts for which only weak centering is counterintuitive, as we have seen. To arrive at such functionsf(i, A) being interpreted as the set of -worlds most similar to ithe Limit-Assumption has to be assumed, that for all i and A there is an A-world most similar to A. Though this makes no difference for the resulting logical system it is intuitively not well-founded as Lewis points out, since the similarity of worlds may depend on the values of real valued parameters like places, times, masses etc. in them. Our approach avoids these difficulties in giving another interpretation to the selection functions. If we want to consider iterated applications of modal operators the principles () () NA^NNA of C. I. Lewis'system S4, and -iNAiDN-iNA ofS5

suggest themselves. As S5 seems to be intuitively most adequate, we may incorporate the condition
(e) j Sj ID Sj = Sj for all )!

intoD4.2(3a). If we want to obtain principles for iterated applications of => it seems best to generalize () and (). The following two conditions are the likeliest candidates : () ()
(f)

A=>B=>L(A=>B) i(A=>B)iDL-i(A=>B).
j f(i, I) => f(j, X) = f(i, X) for all )! and X c I.

These two conditions are equivalent with postulating in D4.2(3) also If we assume (') A=>BiDN(A=>B) and (') -(=>) ID Ni(A=*B),
18a TLI3

264

Franz von Kutschera

instead of () and (), =* holds iff it holds on all conditions, and does, not hold iff it holds on no conditions, all conditionals would be necessarily true or false, and we would have LA=NA and N(A isB) = A=s-B, i.e. conditional necessity would coincide with strict implication. This is not adequate since it may be true that If my barometer goes up, then the atmospheric pressure rises but, since my barometer does not necessarily function correctly, it is false that The going up of my barometer strictly implies that the atmospheric pressure rises. If, on the other hand, we postulate (") (") =>^>=>(=>) and -(=>) =D A=*-i(A=>B)

this would not be intuitively correct, since, if A^>B holds, then A is a reason for B, but not a reason for A => B. 5. The logic of conditional necessity Let CQ be Predicate Logic plus the following rule and axioms AhNA . A=>A NA=>B=>A N(A =>B) (C=>A) => C=>B (A=>B) (A=>C) ^ A=>B C A=^B=>(A A B = > C = A=>B=3C) A=>BiD(Az)B) (=>[]) ^ =>[] is to be (0 plus C8: NA => NNA C9: iNA=>lShNA (2 is to be (E! plus CIO: A=>B^L(A=>B) Cll: n(A=>B)=)L-i(A=i>B). The propositional part of G0 i-e. (0 minus C5, is equivalent with D. Lewis' system VW in (1973), pp. 132 seq. C0 contains the basic modal system M of von Wright (or T of R. Feys) together with the Barcan formula [] is [], and therefore ^ contains S5. For inference relations > like logical entailment and material or strict implication the following principles are fundamental:
(1) (2) (3) A> A (A > ) ( > C) => (A > C) (>)=>()

CR: Cl: C2: C3: C4: C5: C6: C7:

Indicative conditionals

265

(4) (5) (6) (7) (8) (9)

(A>B)=>(A>BvC) (A > C) = (A > ) (A > C) ( > C) = (A > C) ( > C) (A>B) ( ( >C) = (A > B ID C)

For => in place of > only (1), (4), (5) and (8) hold. In place of (2) we have ( =>) (=) (B=>C) ^ A=>C, in place of (3) (A =>C) (=>) z> C=>B, in place of (6) ( =>A) ( B=>B) ID (A v B=>C = (A=>C) (B=>C)), in place of (7) nLB (A=>B) =3 iB=>nA and in place of (9) we have C5.

6.

Conditional necessity and conditionals

We have to show now that conditionals can be adequately analyzed in terms of statements A ^> about conditional necessity.
(A) Indicative Conditionals

A statement A=>B may be used as an indicative conditional if PA A, i.e. if under the given circumstances A and -\ A are both weakly possible (it may very well be the case that A, but also that ). . Goodman's analysis of counterfactuals in (1965) can in part be carried over to indicative conditionals. Then a sentence (1) "If A, then B" is not only true if N(A => B) but also if there is a relevant condition C, not mentioned in "If A, then B" so that N(A C ^ B). C cannot be a free parameter for then (1) would have no definite truth value. C cannot be the conjunction of all true statements, for if A is false (1) would always be true since N(A A z>B). C has to be at least consistent with A. C cannot always be true since, on condition that A, C might be an implausible assumption if A. As Goodman has shown, the truth condition "If A, then B" is true iff there is a C so that ( => C) and N(A C => B) violates the principle MB (=>)=>-(=>). So we will have to choose a stronger relation than ( ^ nC) between A and C which Goodman calls cotenability. It seems natural to take A=>C in place of such a relation. If A ^> C then C is (weakly) necessary or the normal case on condition that A, so that assuming A C goes without saying. This accounts
18b TLI 3

266

Franz von Kutschera

for C not being mentioned in (1). We then have: "If A, then B" iff there is a condition C such that A=>C and N(A C=>B). From this it follows that "If A, then B" holds iff A=>B does. For if A => is true we can take A D as our C; then N(A C=>B) and (in view of C3) A=s>C. And if we have A=>C and N(A C=>B), then we have A=>A C (Cl and C4) and therefore A^B, in view of C3. This equivalence A=>B = C(A=>C ( C )) also holds for counterfactuals and causal conditionals and so is a valuable argument for the correctness of our analysis. In the normal case of an indicative conditional "If A, then " is not weakly necessary, i.e. we have iLB. We would, for instance, not normally say "If Nixon is stillpresident next year, then he will be over sixty" (1) normally expresses that there is a connection between the facts expressed by A and so that it may very well be that nB if A. The case LB can be excluded by using the strong conditional defined by D6.1. A=*B: = (A=>B) -(=>), which implies that is (weakly) possible under condition that A. For -iLB, A=>B,and A => are equivalent. A =>B can be read as "If it is the case that A, then it may be the case that B". As Lewis points out, such "may"-conditionals (speaking of counterfactuals he had his eye on "might"-conditionals) also play a role in ordinary discourse. They come quite naturally from the standpoint of conditional necessity. (B) Counter/actuals

A =s> may be used as counterfactual if A. Then we also have . We shall not stipulate nPA, i.e. LnA however, since under the given circumstances, though A is false, A might very well be possible. So on our definition for PA A=>B may be used both as an indicative and a counterfactual conditional, depending on the speakers knowledge, of which we have taken no account in our semantics. The logic of counterfactuals then coincides with that given by D. Lewis. In the normal case of a counterfactual we again have -iLB. If we want to imply that B is in fact true, we say "If it were the case that A, then it would still be the case that B". By use of => instead of => we can exclude this case LB, while for /iLB A =>B is equivalent again to A => B. (C) Causal conditionals

A=>B may be used as a causal statement (2) "Since it is the case that A, it is the case that B", if A is true. Then we have PA, but we don't stipulate that , i.e. that A is normally the case. In fact from A=>B and LA we obtain LB,

Indicative conditionals

267

so that B is normally the case too; from LA (or PA) and LB A^>B already follows, so that this statement is quite uninformative in case of LA. In ordinary discourse, we usually only give reasons for phenomena that are unexpected, strange or unusual as for instance J. K nig, H. Hart and H. Honore, J. Passmore and E. Scheibe have pointed out in their discussions of the notion of explanation. This is not always so, but iLB (and therefore iLA), is certainly an important case in the use of causal statements. Taking A => B instead of A => B (equivalent with A => B again in case of -iLB) we do not exclude the case LB, but only iLA LB and give an informative sense to a causal statement in case of LA LB since -(=>) does not follow from LA and LB. A=>B then states that A is necessary for the weak necessity of B. It should finally be emphazised first that a causal conditional "Since A, B" does not state that A is a cause of B. As in Since my barometer is going ftp, the atmospheric pressure rises or Since the period of the pendulum is t, its length is g()2

A may be an effect or a symptom for B that is a reason for believing that B. Moreover, A may not be the only possible or actual reason for B. But in =* if A were not to be the case then at least B might be false. The question, wether CIO and Cll are adequate, is very hard to decide, since we lack reliable truth criteria for ordinary knguage sentences with iterated "if-then's". Take the following examples of sentences of the form A=> (B=* C), (^)=> C, and A AB=*>C: (3) (4) (5) If John will come, then if Jack mil come too, it will be a nice party. If Jack will come in caseJohn comes, it will be a nice party. If John andJack will come, it will be a nice party.

If under the given circumstances it is possible that John will come (PA), then (3) according to CIO and Cll is equivalent to (6) If Jack will come, then it will be a nice party.

And if under the given circumstances it is possible that Jack will come in case John comes (P(A=>B)), (4) is equivalent to (7)
to (3).

Under the given circumstance it will be a nice party. Under condition that P(A=>B) (5) follows from (3), but is not equivalent

All this is hardly convincing. But I doubt that we can really spell out the difference of meaning between (3), (4), and (5). Such constructions are very rare so that we have only a narrow basis for a test for the adequacy of CIO and Cll.
18b*

268

Franz von Kutschera

If we do not want to exclude such iterated "if-then" 's altogetherand we wouldn't lose much for ordinary language analyses therebythen it seems best to adopt strong principles that permit us to reduce many such sentences to simple "if-then" 's, as C8 and C9 do in Modal Logic. 7. Conclusion

An explication of a concept is adequate if the explicatum is coextensional with the explicandum for the great mass of normal instances, and if the explicatum is simple and fruitful. I have tried to show that our explication of conditionals captures the main ideas that we express by them. The simplicity of the explicatum is achieved only by leaving open the question of how to give a precise sense to the notion of relative necessity8 and by passing over a lot of problems connected with natural language analyses. As for fruitfulness just one example: Many conditional obligations have to be analyzed in the form 'If it is the case that A, then it is obligatory that B', for which a rule of detachment holds, so that we can infer from A that B is obligatory? If we take the 'if-then' here as a material or a strict implication, this conditional obligation takes no exceptions. As we can think, with a little imagination, for almost every current conditional obligation of situations, in which it would not hold, i.e. of conditions C so that, if C, then not O(B), we would have inconsistency in almost all our normative systems.10 But if we analyze 'If A, then O(B)' as A=>O(B), then O(B) is only said to be the normal case on condition that A, not that O(B) holds in -worlds in which extraordinary circumstances and strange coincidences obtain. It may then very well be the case that A=>O(B), but -( C=>O(B)). Such restrictions to normal cases are implied, I think, in most every-day statements of conditional obligation. References
ADAMS, E. W. (1970), Subjunctive and Indicative Conditionals. Foundations of Language 6, pp. 8994. FRAASSEN, B. VAN (1973), Values and the Heart's Command. The Journal of Philosophy 70, pp. 5-19.

In another paper I define an epistemic interpretation of conditional necessity and try to show that a purely "objective" interpretation is, as in the cases of unconditional necessity or the similarity of worlds, impossible. 9 For other types of conditional obligations cf. Lewis (1973), 5.1, and Kutschera (1974B). 10 For a way out of this problem that comes close to deontic suicide cf. B. van Fraassen (1973).

Indicative conditionals

269

GOODMAN, N. (1965), Fact, Fiction, Forecast. 2nd ed. Indianapolis 1965. KRIPKE, S. (1972), Naming and Necessity, pp. 253355, 763769 in G. HARMAN and D. DAVIDSON (ed.): Semantics of Natural Language, Dordrecht: Reidel. KUTSCHERA, F. VON (1974a), Partial Interpretations. To appear in E. KEENAN (ed.): Formal Semantics for Natural Language, Cambridge. KUTSCHERA, F. VON (1974b), Normative Prferenzen und bedingte Obligationen. In H. LENK (ed.): Normlogik. Mnchen-Pulkch. LEWIS, D. (1973), Counterfactuals, Cambridge, Mass.: Harvard Univ. Press. STALNAKER, R. C. (1968), A Theory of Conditionals, pp. 98112 in N. RESCHER (ed.): Studies in Logical Theory, Oxford.

DISCUSSIONS AND EXPOSITIONS


H.A.LEWIS MODEL THEORY AND SEMANTICS1

In this paper some basic concepts of model theory are introduced. The structure of a definition of truth for a formal language is illustrated and the extension and alteration required for model theory proper is explained. The acceptability of a modeltheoretic account of truth in a natural language is discussed briefly.

I.

Introduction

In recent years several writers have proposed that formal logic has a role to play in linguistics. One suggestion has been that a semantic theory for a natural language might take the same form as the semantic accounts that are usual for the artificial languages of formal logic. Without presupposing a knowledge of formal logic on the part of the reader, I attempt here to sketch the arguments for this suggestion in the forms it has been given by Donald Davidson and by Richard Montague. Readers already familiar with their views will find nothing here that is new, but I hope also nothing that is dangerously misleading. One good reason for seeking to say nothing that presupposes knowledge of formal logic or of model theory is that questions of importance for my subject have to be settled, or begged, before formal semantics can be developed at all. I shall therefore be dealing with some basic concepts of model theory rather than with any detailed formal developments. It may help to identify my topic more clearly if I explain why it seems to me to survive two criticisms that might be broughttwo views about the role of formal logic in the semantics of natural languages that imply (from opposite directions) that no question about the applicability of formal semantics to natural languages arises.

This paper is a revised version of a paper presented at the meeting of the Semantics Section of the Linguistics Association of Great Britain at the University of York on 5 April 1972. My presentation of the issues owes much to Donald Davidson, in particular to Davidson (1967). I am grateful to the referee for many improvements to an earlier version.

272

Harry A. Lewis

One school of thought is very hospitable to formal logic: allowing a distinction between deep and surface structures in a grammar, it claims that in a correct grammar deep structures will be nothing other than sentences of formal logic, and that such deep structures are necessarily bearers of clear semantic information. The only serious semantic questions that arise for natural languages would then be questions about the derivation of surface structures from deep structures. This view, a caricature to be sure, seems to me too hospitable to formal logic. If formulas of logic are usable as deep structures in a generative grammar, and the principle that meaning does not change in proceeding from deep to surface structure is espoused, semantic questions are simply thrown back onto the deep structures. The semantics of first order predicate logic (to mention the most familiar logical system) is well established for the purposes of logic textbooks but not without its limitations if it is taken as accounting for meaning in natural languages. A linguist who chooses logical formulas for his deep structures enters the same line of business as the philosophers who have puzzled over the philosophically correct account of the semantics of the logical formulas themselves. Some of their puzzles depend on the acceptance of a standard way of translating to or from logical symbolism, but others arise from the usual semantic accounts for first-order logic2. Even if it were legitimate to take the semantics of standard first-order logic for granted, this logic notoriously cannot deal in a straightforward way with many aspects of natural languages, such as tenses, modalities and intentional verbs, and indexicals. But the semantics of the more complex logical systems designed to deal with such notions is sufficiently controversial among logicians that no one can safely take it for granted. Another school of thought, viewing my subject from the opposite direction, holds that we know that a semantic account appropriate to an artificial language could not be appropriate to a natural language just because the former is artificial. The formulas of an artificial language are stipulated at their creation to have the meaning that they have, whereas a natural language, although it is a human creation, must be investigated empirically by the linguist before he can hope to erect a semantic theory3. It seems to me that the bare charge of artificiality is a pointless one: there is no reason why a semantic account that fits a language we have invented should not also fit another language that we have not. (Just as there is no reason why a human artifact should not be exactly the same shape as an object found in nature.)

See below, p. 276/7. In many presentations of first-order logic, only the logical constants (the connectives and quantifiers and perhaps the identity sign) are regarded as having a fixed meaning. In such an approach no formula (except one consisting entirely of logical constants) has a meaning until an interpretation is assigned to the non-logical constants: so the stipulation is a two-stage process.
3

Model theory and semantics

273

The idea that the semantics of natural languages is subject to empirical constraints that do not operate for artificial languages is worthy of greater respect, however. Whereas I may, it seems, decree what my artificial symbols are to mean, I must take the natural language as I find itwe can talk of 'getting the semantics right* for a natural language but not for an artificial one. As a matter of fact there is such a thing as getting the semantics wrong, indeed provably wrong, for an artificial language, because of the requirements of consistency and completeness that formal semantic accounts are intended to meet: but this remark, although it may make formal semantics sound more interesting, does not meet the difficulty about natural languages. Let us imagine that we have a proposed semantic theory for English before us, and that it gives an account of the meaning of the words, phrases, and sentences of the language. An empirical linguist must then inspect this theory to see if it squares with the facts. But what facts? The answer to this question depends to some extent on the nature of the theory: it may be a theory with obviously testable consequences. If, for example, it alleges that speakers of the language will assent to certain sentences, then provided you know how to recognize speakers of the language, and their assentings, such consequences can be tested in the field. Alternatively, the theory may provide a translation of sentences of English into another language. If the other knguage is Chinese, and you speak Chinese, you can check the translations for accuracy. If the other language is a knguage no one speaks, such as semantic markerese, whose existence is asserted only by the theory, then no such check is possible4. The problem of empirical adequacy is a central one for semantics. A semantic theory must provide an account of the meaning of the sentences of the language it purports to describe. If the language is our own language, we should be able to tell without difficulty whether the account is correct. It is a minimal requirement of a semantic theory that it offer a translation, paraphrase or representation of each sentence of the language. Any transktions it offers should be synonymous with the sentences of which they are translations. If the transktions talk about abstract entities of a kind of whose existence we were unaware, we shall need to be persuaded that we really were talking about them all the time, although we did not realize it. (It is an even more basic requirement that the transktion of a deckrative sentence should be a declarative sentence, rather than (for example) a verbless string.) It is not obvious that these platitudes bring us any closer to an understanding of the empirical constraints on a semantic theory. Certainly, if we think of translation as a simple pairing of sentences, it does not5. But if we think of

I owe the expression 'semantic markerese' to David Lewis. See Lewis, D. (1972). For the possibility of giving semantics by pairing synonymous expressions, cf. Hiz (1968) and Hiz (1969).
18a*

274

Harry A. Lewis

translation as the explaining in one language of the sentences of another, we may find a way out. Compare: (1) (2) 'Pierre attend' means the same as 'Pierre is waiting'; 'Pierre attend' means that Pierre is waiting.

(1) informs us of a relation between two sentences: (2) tells us what a particular French sentence means. A semantic theory should not simply pair sentences, it should tell us what they mean. (1) and (2) are both contingent truths, but the same cannot be said of both (3) and (4): (3) (4) 'John is tall' means the same as 'John is tall'; 'John is tall' means that John is tall.

We know that (3) is true in virtue of our understanding of 'means the same as', and so it is a necessary truth. We also know that (4) is true, but (4) is a contingent truth about the sentence 'John is tall'6. Moreover a semantic theory about English in English, worthy of the name, should have (4) as a consequence, as well as (5): (5) (A) 'Four is the square of two' means that four is the square of two, S means that p. and in general should have as consequences all sentences like (A) where 'S' is replaced by a syntactic description of a sentence and 'p' is replaced by that sentence or a recognizable paraphrase of it. Such a theory does not simply pair sentences: it tells us what they mean. A minimal requirement on a semantic theory for a natural language is that it have as consequences sentences of form (A). The fact that the -sentences are contingent truths rather than stipulations or necessary truths proves to be no block to providing for a natural language a semantic account similar to some that can be given for formal languages: indeed the founding father of formal semantics, Alfred Tarski, made it one of his basic requirements for a formal semantic theory that it yield something very like the-A-sentences7. It may seem that the requirement that a semantic theory yield the Asentences is a weak one, and that the production of such a theory would be a trivial matter. It will be part of my purpose in what follows to show that this is not the case.

6 This point can easily be misunderstood because any (true) statement about meanings might be thought to be true *in virtue of meaning' and so necessarily true. But we do not need to know what 'John is tall* means to recognise (3) as true: all we need to know is that the same expression occurs before and after 'means the same as*. 7 See Tarski, A. (1956), in particular section 3 (pp. 186 sqq.). The idea that Tarski's approach may yet be appropriate for natural language is due to Donald Davidson. See in particular Davidson, D. (1967), (1970) and (1973).

Model theory and semantics

275

II.

Simple semantics

The semantic theories that will now be described have the form of definitions of truth for a language: they set out to give the truth-conditions of its sentences. The classical account of the definition of truth is Tarski's paper 'The concept of truth in formalized languages'. It seems to me to be essential to present briefly the main themes ofthat important paper8. A definition of truth is given in a language for a language. There are thus typically two languages involved, the one for which truth is defined, which plays the role of Object-language', and the one in which truth is defined, playing the role of 'metalanguage'. The metalanguage must be rich enough to talk about the object-language, in particular it must contain names of the symbols or words of the object-language and the means to describe the phrases and sentences of the object-language as they are built up from the words: it must also contain translations of the sentences of the object-language. Tarski lays down, in his Convention T, requirements for an adequate definition in the metalanguage of a truth-predicate: that is, requirements that a definition must fulfil if the predicate so defined is to mean 'is true'. The convention demands that it should follow from the definition that only sentences are true; and that the definition should have as consequences all strings of the form (B) S is true if and only if p where 'S' is replaced by a structure-revealing description of a sentence of the object-language and 'p' is replaced by a transktion of the sentence S in the metalanguage. Convention T resembles the requirement that an adequate semantic theory must have the -sentences as consequences. If we require that the definition of truth be finitely statable, then for an object-language with infinitely many sentences it is not possible to take as our definition of truth simply the conjunction of all the infinitely many B-sentences. It is in the attempt to compass all the B-strings in a finite definition of truth that the interest of the Tarski-type truth-definition lies. It is perhaps still not widely appreciated even among philosophers that the production of a definition of truth that fulfils Convention T for an interesting but infinite object-language is far from a trivial matter. Tarski himself showed that one superficially attractive trivialising move does not work. We might be tempted to use the axiom (6) (x) ('x' is true if and only if x) but this does not succeed in doing what was intended since the expression immediately to the left of the 'is true' is but a name of the letter 'x'.
8

cf. note 7 above. A simple introduction to Tarski's ideas is given in Quine (1970), chapter 3: Truth.
19 TLI3

276

Harry A. Lewis

In his paper, Tarski showed how Convention T's requirements could be net for one logical language. In order to illustrate his method I shall use a tiny fragment of first-order logic, with the following syntactic description: (7) Symbols: variables: w y 2 predicates: F G the existential quantifier: E the negation sign: Sentences: (i) (ii) (iii) 'F' followed by a single variable is a sentence; 'G' followed by two variables is a sentence, If S is a sentence, S' followed by S is a sentence, called the negation of S. If S is a sentence containing a variable other than 'w' or V, the result of writing ' followed by the variable followed by S is also a sentence, called (if the variable is v{) the existential quantification of S with respect to Fx Gwx nFw EyFy EznGxz Fy

Thus the following are sentences of the fragment: (8) A definition9 of truth will be offered for this fragment, using a small part of ordinary English as metalanguage. This part should at least contain the predicates 'smokes' and 'loves' and the names 'John' and 'Mary'. In order to explain the truthdefinition, I need to introduce the notion of satisfaction, since the fundamental semantic notion that we use is not truth but truth of, a relation between an individual and a predicate (or verb-phrase). We say that 'smokes' is true of John just in case John smokes; 'is red' is true of this poppy just in case this poppy is red. We have to complicate the notion in a natural way to fit a sentence like 'John loves Mary'. We can already say: 'loves Mary' is true of John if and only if John loves Mary, and 'John loves' is true of Mary if and only if John loves Mary. But we cannot say that 'loves' is true of John and Mary, for that would also be to say that 'loves' is true of Mary and John, but 'John loves Mary' means something different from 'Mary loves John'. We have to say in what order John and Mary are taken: so we use the notion of an ordered pair, John then Mary,of a sequence with two members, John and Mary (in that order). Then we say that a sequence satisfies a predicate, by which we mean that the objects in the sequence,

Since a definition of truth in Tarski's style proceeds by way of a recursive definition of satisfaction, it may more appropriately be styled a theory of truth. If the metalanguage is powerful enough, such a theory may be converted into an explicit definition of truth.

Model theory and semantics

277

ordered as they are, fit the predicate, ordered as it is. We must use some notational device to keep track of the places in the predicate and to correlate them with the places in the sequence. (Note however that the places in the sequence are occupied by objects, the places in the predicate by names or other noun-phrases.) In the examples just given, both object-language and metalanguage are English. In our logical language, 'F' does duty for 'smokes' and 'G' for 'loves': so T' is true of John if and only if John smokes, and 'G' is true of John and Mary (in that order) if and only if John loves Mary. It is now possible to give the recursive definition of satisfaction for the fragment:10 (9) (i) (ii) (iii) (iv) For any sequence q of persons whose first member is John and whose second member is Mary11, and all i and j, q satisfies Fv{ if and only if the i'th member of q smokes. q satisfies Gv-^v^ if and only if the i'th member of q loves the j'th member of q. q satisfies the negation of S if and only if q does not satisfy S. q satisfies the existential quantification of S with respect to the i'th variable if and only if at least one sequence differing from q in at most the i'th pkce satisfies S. A sentence is true if and only if it is satisfied by all sequences of persons whose first member is John and whose second member is Mary. 'EyGyw' is true if and only if someone loves John.

and the definition of truth: (10)

If the implications of this definition are unravelled, we find out for example that (11) (It is common to draw a veil, as I have done, over the process of abbreviation that yields this result: mention of sequences has been obliterated, but they are the most important piece of machinery required for the definition.) The definition of truth just stated tells us what sentences of the fragmentary language are to mean rather than what they do, as a matter of fact, mean: but this is only because the language that served as object-language was not one whose sentences already had a definite meaning. The procedure for defining truth could equally be followed for English sentences where both the metalanguage and the object-language were English. If it was followed, the role of the semantic definition of truth would be to articulate the structure of the languageto show how the meanings of sentences of arbitrary complexity depend on the meanings of their partsrather than to give any very helpful information about the meanings of the parts. A definition of truth of the simple type that I have presented does
10

The italicised *P and *G* do duty for names of the predicates *F and 'G'. Concatenation, the writing of one symbol next to another, has been left to be understood, although it is usual in formal approaches to make it explicit. Vj' means 'the i'th variable', e.g. (v^ means *y* 11 I owe this device for handling proper names to Donald Davidson.
19*

278

Harry A. Lewis

all its serious work in the recursive clauses, and we look to it in vain for more than obvious information about the meanings of the simples (in this language, the elementary predicates and names). The attempt to extend a definition of truth according to Convenction T to a more useful fragment of English is a much more difficult task than it at first appears, however. Although we can leave many questions aside in pursuing this objective, it is still necessary to decide how sentences are built up and to determine the precise semantic role of the parts. I may illustrate the difficulties by a suggestive example familiar to philosophers. Students of modal logic (so-called) interest themselves in the notion of necessity, and concern themselves in particular with 'necessarily' as an adverb modifying whole sentences. The consequences required by Convention T of the truth definition would include sentences such as (12) (12) 'Necessarily John is tall' is true if and only if necessarily John is tall.

An obvious way to accommodate necessity in the recursive definition of satisfaction would be this: (13) q satisfies the necessitation of S (i.e. the result of writing 'necessarily' then S) if and only if necessarily q satisfies S.

Such a clause implies that the sequence whose first member is the member one satisfies 'necessarily vl is odd' if and only if it is necessary that the sequence satisfies ivl is odd'but the notion of a necessary link between a sequence and an open sentence is surely not present in the given sentence-form, and thus (13) is false. Intentional notions resist straightforward treatment in the definition of truth. A warning about the idea of a simple definition of truth in English for English that satisfies Convention T is appropriate here. There are formal reasons why a definition of truth in a language for the same language is not possible : unless the language concerned is very weak, a version of the Epimenides paradox will emerge : (14) 'Is not true of itself is not true of itself.12

III.

Model theory

Model theory is the investigation of the relationship between languages that can be formally described and the structures, interpretations or models for which their sentences are true. The sentences of such a formal language are, typically, true only in certain models, so that, given a collection of sentences, it is often possible to say what features any model in which all the sentences hold must have. Starting from the other end, with a model, it is usual to find that only certain sentences

See Tarski (1956), and cf. Martin (1970).

Model theory and semantics

279

hold in it. Thus, given some sentences, we can investigate their possible models: given a model, we can investigate which sentences it makes true. The connection between a language and a model is set up by means of a definition of truth: by a definition that explains under what conditions a sentence holds, or is true, in the model. The simplest definitions for first-order logic go by way of a recursive characterization of satisfaction, as described above, or directly to a recursive characterization of truth13. The sole difference from the semantics of Section II is that truth is defined as relative to a given model. In contrast, the simple semantics of Section II offers a definition of truth as absolute. For present purposes I take it as a defining characteristic of model theory that it studies relative definitions of truth. Relative definitions of truth are quite standard in logic textbooks, both for elementary logic (where the notion of model or interpretation is needed to define logical truth and logical consequence) and for modal and intentional logics (where truth is often defined as relative to a 'possible world'14). In applications to natural languages, there is a measure of agreement that truth of sentences is relative to the context of utterance where indexical elements (e.g. tense, location) occur: the philosophical divide comes between those who hold that Semantic' primitive notions are admissible in the definition of truth and those who do not15. An influential recent writer who supported the claims of model theory was Richard Montague. Here are two typical statements of faith:
I reject the contention that an important theoretical difference exists between formal and natural languages. On the other hand, I do not regard as successful the formal treatments of natural languages attempted by certain contemporary linguists. Like Donald Davidson, I regard the construction of a theory of truthor rather, of the more general notion of truth under an arbitrary interpretationas the basic goal of serious syntax and semantics; and the developments emanating from the Massachussetts Institute of Technology offer little promise towards that end16.

For a recursive definition of truth-in-an-interpretation, see Mates (1972) p. 60, and for a definition of truth-in-an-interpretation by way of a recursive definition of satisfaction, see Mendelson (1964) pp. 5051. 14 Thus for example the necessitation of S is said to be true just in case S is true in all possible worlds. For formal purpose, possible worlds function as do modelsthey are abstract entities in which certain sentences hold. 15 This is not the place to argue at length that this is the philosophically interesting divide. Davidson favours the absolute definition of truth, Montague the relative (see Davidson 1973, and Wallace, 1972). It is Davidson who has taught us the importance of this contrast, and I follow his usage in using 'absolute* to cover accounts of truth that do not use semantic relatae.g. interpretations, possible worldseven if they use a notion of truth as relative to such things as place and time of utterance. 16 Montague (1970a) p. 189. (Emphasis added.) It will be clear from the last note that 'like Donald Davidson* is here misleading.

13

280

rtarryA.Lewis There is in my opinion no important theoretical difference between natural languages and the artificial languages of logicians; indeed, I consider it possible to comprehend the syntax and semantics of both kinds of languages within a single natural and mathematically precise theory. On this point I differ from a number of philosophers but agree, I believe with Chomsky and his associates. It is clear, however, that no adequate and comprehensive semantical theory has yet been constructed, and arguable that no comprehensive and semantically significant syntactical theory yet exists. The aim of the present work is to fill this gap, that is, to develop a universal syntax and semantics17.

Both papers from which I have quoted offer a syntax and semantics for fragments of English. These accounts are technically complex and even a summary is out of the question. What I shall try to do is to characterise in general terms the modeltheoretic approach and to indicate the special contributions made by Montague. For first-order predicate logic the standard models are collections of sets, one of them, the domain, containing all the objects that can be talked about; predicates have subsets of the domain or relations on the domain assigned to them, and names have members of the domain assigned to them. An interpretation can thus be construed as a domain together with a function that assigns to each predicate or name an appropriate object taken from the domain. A sentence is then said to be true for an interpretation if it is satisfied by every sequence of objects from the domain, given the interpretation. In this approach, meaning is determined in two stages. The meanings of the logical constantsthe connectives ('and', 'if then...' etc.) and the quantifiers ('all', 'some')that also provide the recursive elements in the definition of satisfaction, are stated in advance for all interpretations. An interpretation then gives the meanings of the non-logical constants (the predicates and proper names of the language). One advantage is that the method permits of the definition of a notion of logical truth as truth in all interpretations. Moreover, truth-in-an-interpretation can be defined in advance of knowing a particular interpretation. In the simple semantics that I sketched earlier, this was not so: to give a definition of truth, we need to have all the basic clauses for the recursive definition to hand, and we had no general way of characterizing the interpretation of, for example, a one-place predicate. The new facility could be seen as an advantage: you may have felt that the basic clauses in the simple semantics, such as (9) (i), were disappointingly trivial. The corresponding clause in a relative definition might look like this: (15) q satisfies Fv{ in I if and only if the i'th member of q is a member of the set assigned by I to F.

This appears to open up the possibility of discussing alternative interpretations of the basic elements in the language, a possibility that was not evident for the absolute

Montague (1970b) p. 373.

Model theory and semantics


18

281

definition. An interpretation can be thought of as a dictionary for a language whose syntax is known and about which we have semantic information up to a point: we know the meanings of the logical constants (we are not allowed to vary these from one dictionary to another) and we know the kind of interpretation that is allowed for any lexical item whatsoever, since the kind is determined by the syntactic category. What a particular dictionary, or interpretation, does is to specify which meaning of the appropriate kind each lexical item possesses. If we could approach a natural language in a similar way, we could hope to describe its syntax precisely and to determine what the appropriate meaning or interpretation for each lexical item would be. We could expect to discover some logical constants, in particular among the devices for constructing compound or complex sentences out of simpler parts. It is plain, however, that shifting our attention from the absolute to the relative definition of truth has not at all changed the problems that must be met. The standard mode of interpreting predicate logic together with the rough-and-ready methods we have for translating between English and the logical symbolism fail to deal with a host of constructions and aspects of English, such as intentional contexts in general, in particular tense, modality, intentional verbs such as 'believe', 'seek*; and indexical or token-reflexive elements such as personal pronouns. Moreover the kck of a serious syntactic theory linking the logical notation and English is an embarrassment. A defect that is likely to strike the linguist as particularly important is the lack of syntactic ambiguity in the logical notation. The list of obstacles could be prolonged, and they are serious. Montague and others have sought to overcome them all by providing a single theory of language, embracing syntax and semantics, which provides a framework within which a complete formal grammar for any particular language, natural or artificial, could be fitted19. Although Montague's framework is complex, I believe that a large part of it can be understood by the use of two simple ideas20. The first involves framing all our semantical talk in terms of'functions (with truth as the ultimate value). The second is the idea of analysing the syntax of a language into functional relationships. If the functions are matched in a suitable way, an elegant and powerful semantic theory appears to fall out naturally. I shall try to explain first how the notion of function can be used in semantics. The familiar requirement that a semantic theory determine under what conditions a declarative sentence is true, could be stated in a more abstract way by asking that a

18 Strictly, the part of the interpretation that assigns meaning to expressions, but not the domain itself. 19 See Montague (1970a), (1970b) and (1973), and Lewis, D. (1972). 20 Here I am indebted to Lewis, D. (1972). I recommend this article as a preliminary to any readers who would understand Montague's writings.

282

Harry A. Lewis

semantic theory define a function from declarative sentences to truth-values so that every sentence has a truth-value. The abstract way of talking, in terms of functions, appears quite gratuitous at this point, but it is indispensible for the steps that follow. Consider a simple declarative sentence: (16) John is running. We want a semantic theory to assign a truth-value to it in a model M: in particular, we now should like it to entail sentences like (17) val ('John is running') = T in M if and only if John is running in M. According to the standard approach to the model theory of predicate logic, the name 'John* would receive a member of the domain of an interpretation, and the predicate 'is running* would be given a subset of the domain. (Of course, predicate logic is an artificial language: I continue to use English expressions for illustration only.) The resources of this mode of interpretation do not allow us to say that the subset assigned to 'is running' varies: but this leads to a difficulty, for of course the extension of 'is running'the set of people who are runningvaries from moment to moment, although the meaning of the expression does not. How can we assign a single meaning to 'is running' within a formal semantic theory which allows for this complexity? The answer is, we assign to the predicate a function from times to subsets of the domain; a function, it could be argued, that we already know to existfor we know that at any time some people are running and some not. The resulting account of the truth-conditions of 'John is running' might look like this: val ('John is running', tk) = T in M if and only if val('x is running')(val('John'), tk) = T in M This can be read: the valuation function yields truth in M for the arguments 'John is running' and tk(the k-th time) if and only if the result of applying the interpretation of the predicate 'x is running' (which is a function of two arguments) to the arguments (a) the interpretation of 'John', (b) tk, is truth in M. The standard interpretation of predicates by subsets of the domain can be progressively complicated to deal with any features of the occasion of utterance that are relevant to truth-value. A particular valuation, or model, will then ascribe to each predicate an appropriate function. The other simple idea involved is the extension to the semantic realm of Ajdukiewicz' syntactic ideas which derive in turn from Frege. Ajdukiewicz showed how, given two fundamental syntactic categories, it was possible to assign syntactic categories to other parts of speech21. The categories other than the basic sentence (18)

See Ajdukiewicz (1967).

Model theory and semantics

283

and name are all functions of more and less complexity, so that the criterion of sentencehood is that the syntactic categories of the component expressions should when applied to one another yield 's' (sentence). I shall illustrate the idea and its semantic analogue by the case of a simple verb-phrase. If we know that (19) Arthur walks. is a sentence, and 'Arthur' a name, we know that the syntactic category of 'walks' is s/n, i.e. the function that takes a name into a sentence. A semantic analogue (much simpler however than anything in Montague) would be this: if we know that the interpretation of the whole sentence is to be a truth-value, and the interpretation of the name 'Arthur' is to be a person, we can infer that the interpretation of 'walks' is to be a function from people to truth-values. Montague gives a far more complex account of the interpretation even of simple predicates, as he wishes to allow for the occasion of utterance and further factors. But the principle by which the appropriate meaning for items of a certain syntactic category is discovered is the same. The case of adverbs that modify verb-phrases is in point. Syntactically speaking, such adverbs can be seen as turning verb-phrases into verb-phrases (e.g. 'quickly', 'clumsily'): so semantically speaking, they turn verb-phrase meanings into verb-phrase meanings. We therefore know the type of function that such adverbs require as interpretationsthey are functions from verb-phrase interpretations to verb-phrase interpretations. Adjectives that attach themselves to common nouns are treated in the same way. The syntax of the fragment of English described in Montague's 'Universal Grammar' (Montague, 1970b) is sufficiently sophisticated to allow that (20) Every man such that he loves a woman is a man is a logical truth, whereas (21) Every alleged murderer is a murderer is not. The assignment to syntactic categories, given the semantic principle I have just presented, is not a trivial matter, and it seems to me, although I claim no expertise, that the syntactic descriptions given of fragments of English in 'Universal Grammar' and 'English as a formal language' are ingenious and interesting22. Both fragments allow syntactic ambiguities, and in the latter paper Montague suggests a way of dealing with ambiguity by relativizing his semantics to analyses of sentences, where an ambiguous sentence receives two distinct analyses.

IV.

Conclusion

My aim in this paper has been to present the basic ideas of two approaches to the semantics of natural languages, those associated with Donald Davidson and with Richard Montague. A long critical discussion would be out of place and it would have to draw on writings and detail not expounded here.
See also Montague (1973).

284

Harry A. Lewis

The main themes have been these. A semantic theory can have empirical content even if it is built on the pattern of the theories of truth usually offered for formal languages. Such a theory of truth may represent truth as an absolute notion, or as a relative notion, where the relativity may be to context of utterance (time, place, person) or to "possible worlds". Such notions as 'interpretation' or 'possible world', used as undefined terms in a theory of truth, rest truth on a prior notion that is 'semantic' in that it involves essential use of the notion of truth or a rekted concept such as reference23. A great deal of philosophy is condensed into the contrast between those accounts of truth that use a semantic primitive and those that do notfor example, the question of the intelligibility of concepts of necessity or logical truth can be phrased as the question of the acceptability of certain semantic primitives. In the present context, the contrast is that between theories of meaning for natural languages that make reference to possible worlds, models and interpretations and those that do not. The reader new to this subject may be tempted to suggest that this contrast is unimportant, and perhaps that allegiance to the truth-definition as the criterion of adequacy in semantics is the sole interesting test. Possible worldssome but not all modelsare theoretical entities, it would seem, whose existence is postulated to help with a smoothly-running account of language. If the question is whether possible worlds are disreputable epicycles or respectable ellipses, surely time alone will tell? To be sure, the understanding of language is in us, not in the heavens: but we now readily concede that the ability to produce grammatical sentences is not the same as the ability to produce a grammar that will model the former ability. Why should semantics not trade in notions as obscure to the lay mind as are 'phrase-structure grammar' and 'generalised transformation', provided that they aid theory? Surely we can expect our theory of meaning to be at least as complicated as our syntax? One obstacle to such a generous view lies in the criterion of adequacy built into this approach to semantics: Tarski's convention T. It is a powerful constraint on a theory that it generate the theorems required by'the Convention. Such theorems have on one side a translation or paraphrase of the sentence whose truth-conditions they thus state. Proponents of the relative definition of truth as the semantic paradigm have to persuade us that talk of possible worlds does paraphrase everyday English. If they are right, we should all be easy to persuade, for it strains credulity that native speakers cannot recognize synonymous pairs of sentences when they see them24. The welcome feature of this approach to semantics is that a lay audience can easily test the plausibility of particular proposals by asking that the B-sentences be exhibited: they may then inspect the two sides to see if one is a plausible paraphrase of the other. In other words, such a proposal has empirical

22 See also Montague (1973).
23 Cf. Davidson (1973).
24 This is the argument of Wallace (1972), VI.



References
AJDUKIEWICZ, K. (1967), On syntactical coherence (translated from the Polish by P. T. Geach), Review of Metaphysics 20, 635-647.
DAVIDSON, D. (1967), Truth and meaning, Synthese 17, 304-323.
DAVIDSON, D. (1970), Semantics for natural languages, pp. 177-188 in: Linguaggi nella società e nella tecnica, Milan: Edizioni di Comunità.
DAVIDSON, D. (1973), In defense of Convention T, pp. 76-86 in: Leblanc, H. (Ed.), Truth, Syntax and Modality, Amsterdam: North-Holland.
HIZ, H. (1968), Computable and uncomputable elements of syntax, pp. 239-254 in: van Rootselaar, B. and J. F. Staal (Eds.), Logic, Methodology and Philosophy of Science III, Amsterdam: North-Holland.
HIZ, H. (1969), Aletheic semantic theory, The Philosophical Forum 1 (New Series), 438-451.
LEWIS, D. (1972), General semantics, pp. 169-218 in: Davidson, D. and G. Harman (Eds.), Semantics of Natural Language, Dordrecht: Reidel.
MARTIN, R. L. (1970), The Paradox of the Liar, New Haven: Yale University Press.
MATES, B. (1972), Elementary Logic (second edition), London: Oxford University Press.
MENDELSON, E. (1964), Introduction to Mathematical Logic, Princeton, N.J.: D. Van Nostrand Company.
MONTAGUE, R. (1970a), English as a formal language, pp. 189-223 in: Linguaggi nella società e nella tecnica, Milan: Edizioni di Comunità.
MONTAGUE, R. (1970b), Universal grammar, Theoria 36, 374-398.
MONTAGUE, R. (1973), The proper treatment of quantification in ordinary English, pp. 221-242 in: Hintikka, K. J. J., J. M. E. Moravcsik and P. Suppes (Eds.), Approaches to Natural Language, Dordrecht: Reidel.
QUINE, W. V. O. (1970), Philosophy of Logic, Englewood Cliffs: Prentice-Hall.
TARSKI, A. (1956), The concept of truth in formalised languages, pp. 152-278 in: Logic, Semantics, Metamathematics (translated by J. H. Woodger), Oxford: Clarendon Press.

IRENA BELLERT

REPLY TO H. H. LIEB

I wish to take this opportunity to reply to Lieb's comments (in: Grammars as theories, Theoretical Linguistics, Vol. 1 [1974], pp. 39-115) on my proposal (I. Bellert, Theory of language as an interpreted formal theory, Proceedings of the 11th International Congress of Linguists, Bologna, 1972). I will discuss only those critical remarks which, if valid, would make my proposal untenable. The first is due to a misinterpretation of one part of my text, which in fact was carelessly formulated and hence misleading; I am indebted to Lieb for his observations. The others, which I found objectionable, give me the opportunity to clarify my statements.

On page 103 Lieb says: "The separation of the 'axiomatic implications' from the 'meta-rule' is untenable. To make sense of the conception, we have to take "A" in an axiomatic implication as a free variable (...) Because of the free variable(s) these axioms are neither true nor false and they have no acceptable interpretation (...)" Of course the separation of the axiomatic implications from the meta-rule is untenable, and by no means was it intended so in the proposal. But it is quite evident that the conception would not make any sense at all if we took "A" as a free variable. As I said on page 291: "Notice that in the above implicational scheme the expressions A, R and (S, D) (addresser, receiver and sentence with its structural description, respectively) are all bound by universal quantifiers" (emphasis added). It is obvious, then, that I did not mean "A" to be a free variable. The meta-rule cannot be separated from the expression "C ⊃ A PROPOSITIONAL ATTITUDE S1", which constitutes only part of it and thus, if taken separately, could by no means be said to be true or false, nor even to constitute a well-formed formula. However, what evidently misled Lieb was that part of my paper in which, for the sake of brevity, I used the latter expression in referring to axiomatic implications. The reason for my careless formulation was that the expressions "C" in the antecedent and "PROPOSITIONAL ATTITUDE" and "S1" in the consequent are the only ones which are specific to each implication and essential for analyticity, whereas the remaining expressions are exactly the same in all implications and the variables are bound by universal quantifiers. Therefore, when establishing axiomatic implications for any language or a fragment of a language, we would have to specify only the mentioned expressions, while the entire meta-rule would always be presupposed as part of each implication. Perhaps the term 'axiomatic scheme' would be more appropriate than 'meta-rule'. In conclusion, I should then have said that the interpretative component will consist of a finite set of axiomatic implications of the form given by the axiomatic schema (meta-rule), the essential and language-specific expressions of which are: "C", "PROPOSITIONAL ATTITUDE" and "S1".


Lieb objects to my statement that "the consequents can be said to follow formally from the antecedents". He says: "But even in analytic sentences material implication does not mean deducibility" (footnote 131, page 104). I cannot agree with this objection. Material implication, clearly, does not mean deducibility, but that is not what my statement says. What I say here is in agreement with the terminology established by Tarski and widely accepted in the literature. Let me recall Tarski's definitions of the terms in question. He considers a finite class of sentences K from which a given sentence X follows. He denotes by the symbol Z the implication whose antecedent is the conjunction of the sentences in the class K and whose consequent is the sentence X. He then gives the following equivalences: "The sentence X is (logically) derivable from the sentences of the class K if and only if Z is logically provable. The sentence X follows formally from the sentences of the class K if and only if Z is analytical. The sentence X follows materially from the sentences of the class K if and only if the sentence Z is true" (Logic, Semantics, Metamathematics, Oxford: Clarendon Press, 1956, p. 419). In my proposal I can correspondingly say that the sentence in the consequent of an axiomatic implication follows formally from the class of sentences (or conjunction of sentences) in the antecedent, as the implication is analytical.

Furthermore, Lieb questions the analyticity of the axiomatic implications (footnote 133, page 104). The question of properly distinguishing analytic statements from synthetic (contingent) statements has been widely discussed in the literature, and there is no complete agreement as to the status of some statements. However, when discussing analyticity the authors agree that a statement is said to be analytical if its truth is based on meaning alone, independently of extra-linguistic facts. Carnap's meaning postulates have been proposed as an intended explication of the concept of analyticity. He defines meaning postulates as L-true implications, L-truth being an explicatum for what Leibniz called necessary and Kant analytic truth. In Carnap's formulation the antecedent L-implies the consequent. His example is: "If Jack is a bachelor, then he is not married" (Meaning and Necessity, with the supplement 'Meaning Postulates', Phoenix Books, 1967, pp. 10 and 222). In spite of the controversies involved, there is undoubtedly a difference made in logic between unconditionally true statements and contingent, factual statements: to the former class belong logically true statements and also those that are not theorems of standard logic but whose truth is independent of extra-linguistic facts; these are usually called analytical. Now, since my implications are intended to be constructed so that their truth depends solely on the meanings of the words and the structures involved, I presume that they can correctly be called analytical. Moreover, if they are taken as axioms of the theory, their truth cannot, without contradiction, be considered contingent, and they have to be taken as unconditionally true statements in the theory.
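Both points may be displayed schematically. The first line merely restates Tarski's definition in symbols; the second is Carnap's own example written out as a meaning postulate (the notation is added here only for concreteness):

    Z:  (K1 ∧ K2 ∧ ... ∧ Kn) ⊃ X,   and X follows formally from K if and only if Z is analytical;
    (x) (x is a bachelor ⊃ x is not married).

The second implication is L-true: its truth rests on the meanings of 'bachelor' and 'married' alone, and not on any matter of fact.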


Marian Przełęcki has discussed in detail the status of meaning postulates in axiomatized empirical theories (The Logic of Empirical Theories, Routledge & Kegan Paul, New York, 1969). As he observed, it is usual practice in axiomatizing empirical theories to explicate the meaning of extra-logical terms by meaning postulates, which are then considered to be analytical sentences of that language.

Lieb questions, however, the empirical content of such a theory or grammar (footnote 133, p. 104). The class of sentences that follow from a given sentence S and some pertinent meaning postulates obviously adds nothing to the meaning of S. Meaning postulates, or axiomatic implications (in my terminology), are established for explicating the non-logical terms and structures contained in S. The axiomatic implications in my proposal are intended to explicate more complex predicates, used in specific structural conditions, in terms of other predicates, in a way which, in principle, should reflect the native speakers' understanding of the language; in particular, they should account for the conclusions speakers generally can draw from the corresponding utterances in any fixed universe of discourse for which the meaning and denotation of the terms involved are clearly understood. In order to test the empirical adequacy of an axiomatic implication, one must look for a possible state of affairs in which the antecedent holds true but the consequent does not. If such a case is found, the implication in question should be rejected, or the conditions C in the antecedent should be modified in such a way that they become necessary conditions (as they are intended to be). But this can be done only by determining a universe of discourse, as well as the denotation of some predicates (those that are not further explicated by axiomatic implications but occur in the consequents only), by establishing in some way (other than the verbal way of specifying axiomatic implications) the sets of individuals in the universe of discourse of which the given predicates hold true. Otherwise the theory will indeed have no empirical content. It is clear, however, that such tests are based ultimately on speakers' judgements only.
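A hypothetical illustration (the predicates and the implication itself are invented here purely for the sake of the example): suppose an axiomatic implication licenses the inference

    x SELLS y TO z ∧ C  ⊃  z OWNS y.

A state of affairs in a fixed universe of discourse in which the sale is concluded but immediately annulled, so that the antecedent holds while the consequent fails, would require either rejecting the implication or strengthening the conditions C until they are genuinely necessary.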


Finally, I wish to add that, being a linguist with only some knowledge of logic, I did not aim at a rigorous formalization of the proposal. Rather, I did what is common practice for non-logicians interested in the possibility of formalizing some aspects of their empirical field: I submitted for discussion a rough outline of a theory which would account for the empirical fact that speakers are capable of drawing a number of conclusions from a single utterance by virtue of only the meanings of the words and structures involved, independently of extra-linguistic facts. And I am indebted for all critical observations that may help me to clarify my proposal further, as has been the case with Lieb's comments.
