
John Sinclair

Meaning in the Framework of Corpus Linguistics [1]



1 Introduction
2 The nature of meaning
3 The acquisition of meaning
4 The language of definition
5 Irony
6 Sentences
7 Truth value
8 Conclusion
9 Bibliography

Introduction

The principal unit of meaning is called the lexical item (Sinclair 1996, 1999), which consists sometimes of a single word, but corpus study suggests that the influence of the verbal
environment (the cotext) on the occurrence of a word is so strong that many lexical items
typically consist of more than one word, and often several. The generally received position
in lexicology is that each word realises one or more meanings (normally more than one)
and that when one of its meanings is required, that word is chosen. Relating this position to
corpus evidence, we would regard it as confirmed if the choice of a word appeared to be
unrelated to the cotext, chosen independently of the words around it; but the evidence at
present is pointing the other way. The evidence suggests not only that words are coselected
with other words to form complex lexical structures called lexical items, but also that the
citation of a lexical item in its full form normally realises just a single meaning, and therefore the process of incorporating relevant cotext into the item removes ambiguity, the ambiguity which has diverted the attention of computational linguists for too many years. [2]
Another preliminary finding of corpus linguistics is that the creation of meaning is not
confined to the sequential selection of predetermined units of meaning. This means that no
matter how comprehensive and accurate a lexicon may be, it will not be able to account for
the meaning in a normal sentence. This is not just because of the constant innovation that
we experience in lexis [3], but because of the juxtaposition of the items selected, and their components.
Since the possible juxtapositions are not limited, no finite lexicon can include them all.
Some ad hoc interpretation is necessary, probably in every sentence uttered. Hence there is
an irreducible indefiniteness in the assignment of meanings. Those who describe languages
should face up to this and devise a means of accounting for as much as possible of the
meaning-creation that is predictable; those concerned with applications such as lexicography, translation and language teaching should consider how they can cope with the uncertainty of meaning.

[1] Acknowledgement: The origin of this paper is a transcription of a conversation between the author
and Professor Wolfgang Teubert, which was recorded in the West Highlands of Scotland during the summer of 2002. With Prof. Teubert's permission, I have taken out my contributions to
the discussion and edited them to make a self-standing paper. The editing was very substantial,
and although I have more-or-less suppressed Prof. Teubert's words, his contributions often provided the stimulus for the next stage in the conversation, so his influence is still strong.
The recording was made as a follow-on from one we made in July 2002, which has been edited
out without losing the conversation structure, and forms the introduction to Krishnamurthy (ed.
2004).

[2] If a word has several meanings, it is easy to show that at least some of the cotext is a component
of the meaning. Make a concordance of perhaps a hundred instances of the word, in a span of four
words on either side. In almost all cases it will be clear from no other evidence than the cotext
which meaning of the item is present, and therefore which meaning the original word partly realises. (Sinclair 1996a)

LEXICOGRAPHICA 20/2004
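
The concordance procedure described in the note on ambiguity (instances of a word shown with a span of four words on either side) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the software used in the studies cited: the function name `concordance`, the crude regular-expression tokeniser and the two-sentence sample are all hypothetical. The first sample sentence is one of the chair citations quoted later in this paper; the second is invented to show a different meaning of the same word form.

```python
import re

def concordance(text, node, span=4):
    """Return (left, node, right) triples for every occurrence of `node`,
    with up to `span` words of cotext on each side."""
    # Assumption: a very rough tokeniser; real corpus tools do much more.
    tokens = re.findall(r"[A-Za-z'-]+", text.lower())
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = tokens[max(0, i - span):i]
            right = tokens[i + 1:i + 1 + span]
            lines.append((" ".join(left), tok, " ".join(right)))
    return lines

# Illustrative sample only; the second sentence is invented.
sample = (
    "He moved his chair between the two girls. "
    "She will chair the meeting on Monday morning."
)

for left, node, right in concordance(sample, "chair"):
    print(f"{left:>25}  [{node}]  {right}")
```

Even in this toy case the four-word span is enough to separate the furniture sense from the sense of presiding over a meeting. Note that the sketch ignores sentence boundaries, so the cotext of the second instance runs back into the first sentence.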

The nature of meaning

Meaning is an impression in the mind of an individual, and any consensus we will find is a
very loose consensus based on just sufficient similarity of these impressions for the discourse to proceed. Meaning is entirely provisional; a lot of it is quite ad hoc and not something that can be recorded in reference books. It is also not in a straightforward relationship
to any discernible units in the language. It is built up from units in the language like words
or phrases; but the relationship is not fully predictable, nor can it be formalised, because
it is too imprecise.
We are accustomed to appreciating meaning in another way, based solely on the meaning that words retain when cited. Without any support from a cotext, each individual word
still yields one or more meanings when cited; we shall call these the residual meanings. A
residual meaning is one that is realised when a single word is cited; it has to be strongly
enough associated with the particular word that it is recalled on citation, and it cannot rely
on people being able to imagine a suitable cotext. Most of the semantic influence of coselection is thus not retrievable from the citation form and so does not appear in conventional
dictionaries and formal accounts of meaning.
However, it is clear to everyone that the meaning of a text cannot be expressed as the
concatenation of the residual meanings of each of the words; the way words interact with
each other in making textual meaning is the antithesis of citation. Nevertheless, stringing
together the residual meanings is the established model for the vast expenditure on computational analysis that we have seen in the last two decades; each word is looked up in a
lexicon that gives only residual meanings; next, attempts are made at disambiguation since
most of the words have several residual meanings. Most schemes go no further, and are
neither accurate nor comprehensive.
The origin of meaning is in the text, the selection and coselection of words. The interpretation of meaning is made by participants in the discourse, and so while there has to be
broad general agreement about the meanings of words and phrases for society to function,
there may be differences of opinion in marginal cases, because of the unique experience,
education, personality and preferences of the individuals.
[3] Studies some years ago of The Times showed that around a dozen new lexical items appeared in
each day's edition (after proper names and trivial variations had been removed).

There is a similar situation accepted in dialectology; each individual's usage is called an idiolect, and is thought to be a unique combination of features. The usage of many individuals in a community, taken together, is called a dialect. Probably no-one speaks exactly
what the dialect describes, but they understand each other enough to get along.
The same goes for meaning. The differences in interpretation between members of a
speech community are small and they do not interfere much with normal communication.
We are not here talking about personal interpretations that are markedly not shared, individual quirks which we all have but which do not affect the sharing of meaning; if such
interpretations became pronounced then the user would be considered eccentric or worse.
That is one extreme of a continuum of which the other end is Chomsky's positing of an
ideal community where each person has exactly the same internalised, operational description of a language as everyone else (Chomsky 1965).
So if people talk about a chair, for instance (the kind of chair you sit on [4]), they will
generally agree whether or not an object falls under this category. If, as may well happen,
there are occasional differences of opinion in marginal cases, all linguistics can do is to
explore this common area. There will always be some imprecision, though, particularly
once you go beyond physical objects like chairs and into abstract ideas like languishing.

The acquisition of meaning

There are essentially three roots of linguistic competence: we can be told about something,
we can experience something, and we can infer something. Being told about something is
verbal experience, and it can be in combination with non-verbal experience or not. The
consensus of meaning in a society must be built out of these two kinds of experience along
with the application of inductive reasoning.
The reasoning is there all the time, but can play a more or a less central role in profiting
from the two kinds of experience; the purely verbal experience we will deal with later, and
here we will consider the mixture.
How do we know what a chair is? You might say to a learner "Could you get me a
chair?", and if they brought something else, you could say, "Well, that's not a chair." So
they learn from this verbal input. They might bring you something to sit on, a stool perhaps,
and you could comment "But that hasn't got a back on it", thus verbalising one of the distinguishing properties of a chair.
Another way might be to show the learner a lot of objects and say "This is a chair",
"This is not a chair" etc. until the learner can make more-or-less the same distinction with
new objects. However practical and simple this may look, it is one of David Abercrombie's "pseudoprocedures" (1965). Abercrombie pointed out that linguists tend to escape
difficulties in descriptions by proposing experiments which are, when you take a closer
look, very unlikely to be carried out in the real world. His first example is of a linguist
whose procedure begins by arranging all the possible utterances of a normal monoglot
speaker of a language into pairs, so that each utterance is paired with every other utterance;
and next to select from these pairs those which are different in sense and different though
similar in sound. Philosophers, who have, of course, other interests and other priorities, do
not worry about pseudoprocedures, and we will return to this point when dealing with truth
values below.

[4] To simplify the presentation I will stick with the familiar assumption that the word chair has a
powerful residual meaning and does not require a cotext for its use to mean a piece of furniture
that you can sit and lean back on.
The learner will not encounter much discourse concerning what is or is not a chair;
rather the word will be used incidentally, in discourse like:
I can hear the greeting with which he threw himself on a chair
He moved his chair between the two girls
the new second-hand sofa made the chair look even worse
leaning on the chair backs in front of them, watching the band
Sitting in a cane chair on the terrace with the sun glistening

These examples from The Bank of English each give a little relevant information about
chairs: you can throw yourself on one, move it, lean on its back or just sit on it; it is comparable to a sofa, and it can be made of cane. Gradually and mainly imperceptibly, these
experiences will add up to some workable notion of a chair.
There is however evidence (Sinclair 2001) that just encountering lexical items in the
jungle of experience, verbal or not, is not likely to lead to an appreciation of the precise
meaning which they realise. Let us assume that each lexical item offers two kinds of
meaning: the familiar classificatory meaning of the regular dictionary, and a kind of attitudinal or pragmatic meaning that is called semantic prosody. Experience will rapidly lead to
an accurate perception of the semantic prosody, but (apart from formal definition) the experience may not clarify the classificatory meaning.
The problem can be elaborated by considering the classic structure of a dictionary definition, which contains two items, a superordinate and a discriminator. The discriminator
ideally picks out one feature of the item which distinguishes its meaning from all others.
Much of the skill of lexicography goes into finding and stating the discriminators. In ordinary talk and writing, it would be a matter of chance if the discriminator happened to be
made explicit, and it would be a matter of luck if the user recognised it and its function.
Despite this caveat, the sum total of a person's verbal experience of a word or phrase
must be the principal resource from which the meaning is induced, but this hypothesis
cannot be verified; it is just not practical to gather the entire verbal experience of a person
from birth and examine it analytically; it has the feel of a pseudoprocedure about it. In
addition there are non-verbal experiences which, while secondary, can reinforce and clarify
the verbal. So if a learner has encountered the words hard and soft and has a preliminary
notion of the difference between them, it is not necessary to go through a series of exchanges like "This is hard. This is soft." Once shown what properties hard and soft are, then
as he goes through life and presses something and it yields, he will think "This must
be soft." These are experiences not based directly on verbal exchanges, but they make use of
what Chomsky calls the language faculty; all of us have such a faculty, and it bridges
the gap between the patchy, disorganised data of experience and the fairly clear appreciation of meaning that the mature language user appears to possess.
Imagine a situation where someone is in danger of stepping on a banana skin, and a friend shouts
"Watch out, it's slippery!" She may not have heard the word slippery before, but if she
previously has been exposed to banana skins lying on the ground, she will have no difficulties understanding what slippery means. Humans have this extraordinary capacity for inference. Our experiments have shown that if you replace a word in a concordance by some
nonsense word, in no time at all people can come up with a reasonable definition of it.
Part of the language faculty is sensitivity to collocation. We may not often be aware of
it, but users of a language tend to choose words at least partly with reference to the other
words in the cotext. Halliday's (1966) famous example of strong tea and powerful engine
will suffice here. If cited in isolation, most people would regard the two adjectives as very
similar semantically, but they are not interchangeable. The habit of collocation can be exploited by a language, and frequent collocation can firm up into coselections which acquire
a unique meaning; so strong currency is currency which is stable over a period and reasonably safe for investment, while hard currency is currency which can be readily converted into others at the prevailing exchange rate. The meanings and tokens could easily
have been merged or reversed; the only convention that keeps them apart and settled in
meaning is the Saussurean contract of the sign and its meaning, a relationship that is
usually accepted as essentially arbitrary.
As well as the coselections we call collocational, there are many coselections responsive
to the wider context of an utterance: its place in a document or conversation. For example,
if you meet or telephone an Italian who has some English but is not fluent, and say "How
are you?" you may well get the response "Good." or "Well.", which are inappropriate, but
clear enough as a message. So you will probably let them pass, leaving the learner to make
the same mistake again. Good is not used for self-evaluation in English, and in any case
its residual meaning has a strong moral overtone; well on its own conjures up its residual
meaning, which in turn brings up a contrast with ill. There are various phrases centring
on well which can be used, e.g. "I'm very well, thank you.", but not the word on its own.
The usual response in English is "Fine"; at a similar place in a conversation an Italian is
likely to say "Bene.", and good and well are the standard Italian equivalents of bene, as
adjective or adverb. Hence the problem; fine is not offered as a translation of bene in
standard bilingual dictionaries.
In technical terminology there can be explicit negotiations about the meaning of terms,
as they are used and re-used, and as the community of users diversifies. For example, in the UK
the term autistic was normally used of a child to mean that the child gave no indication
whatsoever of being aware of anything else in the world, except certain obsessional objects.
The American use of the term is much broader; it covers any child with a communication disorder of
the kind that makes him or her withdrawn and reclusive. Autistic has become quite a fundamental term in child psychology, tending towards the American usage, so it seems as if
you could say there is a degree of autism in everybody. Some of this movement of meaning
could be the result of popularisation, which militates against technical purity [5].

[5] After this paper was finished, a newspaper report (Sunday Times 18.1.04 p.3) gave dramatic
support to this very example. Researchers claimed that an apparent 25 % increase in the number of
children diagnosed with autism between 1993 and 2000 was due to a change in the diagnosis of
behavioural disorders, i.e. in the definition of autism, and not a rise in what had been called
autism in the past.

The language of definition

Let us continue the consideration of the purely verbal experience of language, and move to
the most formal extreme of that: the language of definition. Our starting-point is that there
is no ideal definition of any word or phrase [6]. There is no definition of, say, the sort of chair
you sit in, which you could say is the best. There will never be a situation where you could
say now we have achieved the ideal definition of a chair and that any object in the world
can be tested against it, and if it fits the criteria, it is a chair.
Nevertheless, there are a lot of good definitions of chair, and of thousands of other
words and phrases. All the words in a dictionary are defined in terms of other words.
Nothing else comes in: no objects, no logical or ontological relations other than those
expressed [7]. A dictionary is a huge circular description of meaning, all enclosed within the
language. The world outside does not matter; a unicorn will be defined in the same way as
a horse, a hippopotamus or a grasshopper except that where the hippo may have "African"
in the definition, the unicorn will have "mythical" or "fabulous".
The language of definitions is quite fundamental in the theory of language. Lexicography is seen by most people, including many of the practitioners, as a purely practical skill
with no theoretical overtones: whatever works is good. And in the commercial world
where dictionaries are big money-earners that is certainly the case. But it is vitally important to keep the description of language self-standing, and not to rely on external objects,
events or arguments, because these are far less reliable and relevant than the language itself.
The capacity of language for self-reference is much remarked on, and is the capacity that
makes possible the independence of description.
It would be absurd to apologise for the lack of total precision in the description of
meaning; one of the main resources of the language is the elasticity of meaning, so that
many problematic situations can be resolved quietly and efficiently. Meaning is inexact just
because those who create and deploy it are people who interpret texts according to a wide
and unaccountable range of experience. The society holds together for as long as the members are reasonably content with the equalisation of meaning; when, as in the UK recently, the
society becomes dissatisfied with the government's lack of straightforwardness in communication (called spin), strong pressure for reform is generated.

[6] Even when a technical term is coined by its originator and first defined.

[7] Some dictionaries, mainly for children or learners, use illustrations to aid understanding, though
they probably cause as much confusion as they clarify. We can expect that modern mixed media
will come up with all sorts of non-verbal representations of experience instead of definition. But
the essence of the dictionary is that it is all verbal; the language defines itself in its own terms.

Irony

The message about the slightly volatile nature of meaning is the same whether from verbal
or non-verbal experience; it is the product of a tension between the individual's perceptions
and the social contract; and because meaning does not exist with any objective security, it
is always open to revision and reconsideration.
We can see this kind of movement in the accumulation of irony around a word or
phrase.
Take the adverb solemnly. In current usage, outside the language of ceremony, it is very
often used ironically. Here are two examples from The Bank of English:
The names of persons long dead were solemnly inscribed in voting registers
40 or more western and Romanian scholars and a cloud of fans, including Loyalists of the Vampire Realm, and the Japanese chapter of the Transylvanian Society of Dracula, are solemnly debating the origins of what is essentially an Irish fiction set mostly in Whitby and London.

The solemnity is insincere either on the part of the actors, as in the first example, or from
the perspective of the commentator, as in the second example. No doubt a careful study of
the cotexts of the instances would give a number of clues as to how and when the irony is
perceived, but the pattern of use shows that the interpretation of the irony is often in the
hands of the reader, rather than inescapable, like the instances above.
Another very light irony comes with wholeheartedly. If you hear of a local football team
"They played wholeheartedly", you know they have lost. Now it is unlikely that anyone tells
a learner not to use wholeheartedly unless they want to convey a sense of irony, so this
must be a case of induction. But it remains mysterious. How often do you actually encounter a word as infrequent as wholeheartedly? It is difficult to claim that a user acquires this
verbal understanding with only experience and induction.
People learn to monitor the veracity of everything they are told: to compare what they
hear with what they think is likely, and to take into account the probity of the teller and the
demands of the situation. When there is a mismatch between what is said and what is likely,
one possible interpretation is irony, so this very general stance may be the origin of the
sensitivity.
Another example of the perplexity of how meaning is created is the topical one of
friendly fire. Far from being ironic in use, it is to be understood as the saddest eventuality
for combatants in warfare, where they are accidentally killed or wounded by fighters on
their own side. It is a very strange use of friendly, but if readers repeatedly come across
sentences such as this one from The Bank of English:
British troops died from friendly fire when an American warplane mistakenly bombed two British
vehicles during a heated battle

they will infer the meaning despite the fact that overwhelmingly the word friendly has a
warm and positive semantic prosody [8].

[8] Just as a quick indication of the semantic prosody of friendly without fire, the adjectives that
frequently precede and friendly are warm, relaxed, open, kind, helpful and efficient. There is some
more detail on this strange lexical item in Sinclair (2004). Cf. also Teubert's analysis of
friendly fire in M. A. K. Halliday, Wolfgang Teubert, Colin Yallop and Anna Čermáková (2004): Lexicology and Corpus Linguistics, p. 142ff.
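
The kind of check reported in the note on friendly (which words frequently precede "and friendly") can be approximated with a simple pattern count. The sketch below is an assumption-laden illustration, not the query actually run on The Bank of English: the function name `words_before` is hypothetical, the sample sentences are invented, and no part-of-speech filtering is attempted, so any preceding word is counted, not only adjectives.

```python
import re
from collections import Counter

def words_before(text, tail="and friendly"):
    """Count the word that immediately precedes `tail` in `text`."""
    pattern = re.compile(r"\b([a-z]+)\s+" + re.escape(tail) + r"\b")
    return Counter(pattern.findall(text.lower()))

# Invented sentences, for illustration only (not corpus data).
sample = (
    "The staff were warm and friendly. "
    "It is a relaxed and friendly pub. "
    "Service was warm and friendly throughout."
)

counts = words_before(sample)
print(counts.most_common())
```

On a real corpus the resulting frequency list would be read against the overall frequency of each word, since a raw count alone does not show that a collocation is stronger than chance.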


Sentences

So far the focus of discussion has been the lexical item, which is one of the two principal
structures that create meaning in language. We now turn to the other. Just as the word is
the essential starting-point for describing the lexical item, so the sentence is the starting-point for studying the other structure.
Word and sentence are the two primitives of the description of meaning; sentences can
become exceedingly complex, and the simple sentence, consisting of just one main clause,
is our focus. Such a sentence is usually held to express one proposition, and indeed proposition is the French word for a sentence. In speech act terms it would be expected to be one
illocution.
Despite such an assortment of terms for this kind of meaningful structure, I would like
to offer a less well-known one, but a useful concept for dynamic textual description.
David Brazil, in his Grammar of Speech (1995), introduces the term increment for a
structure that moves the discourse forward by changing the state of awareness of the participants. This term implies affinity with clauses and propositions, but does not carry implications of their grammatical relationships in networks, tree structures or taxonomic hierarchies.
The essential component of an increment is the perception of two elements of structure
which are distinct from each other and combined in a meaningful relationship like that of
actor-action. I would like to use the term exocentric for this structure [9]. In contrast the
lexical item is constructed through the perception of several words combining to make up a
single unit of meaning, like friendly fire, and this is called an endocentric construction.
The hypothesis is that there are two interpretive mechanisms working on every stretch
of text. One is aiming to segment the string into lexical items and is therefore testing the
possibilities of combining several of them together. The guiding criterion here is the semantic prosody, because the perception of a prosody is a clear indication of the presence of
a lexical item. The other interpretive mechanism is scanning the same string, aiming to
divide it into increments, and is therefore looking out for exocentric relationships; the
guiding criterion is the perception of an illocution. One mechanism tries to express the
entire text as a concatenation of lexical items, while the other is trying to express the entire
text as a string of exocentric relationships.
Note that meaning creation on both levels is primarily syntagmatic, and
since they operate over the same data to give different results, the two mechanisms are
incompatible with each other. They are operating on different principles; according to the
relevant philosophy, sentences are related to truth, while lexical items, which cannot be true
or false, are related to reference. They entail each other only insofar as you cannot have
anything truthful unless it refers to something, and there is no sense in referring to things
unless you want to say something about them. But this is the only point where they entail
each other.

Otherwise they are quite separate modes of interpreting the same data. Their boundaries
do not necessarily coincide, and the assignment of meaning to a passage is some kind of
reconciliation of the two. This provides further evidence that meaning is not precisely
aligned with either grammar or lexis, but is a compromise between the two that is resolved
as part of the interaction.

[9] I have hesitated a lot about this term and its partner, endocentric, because they have been used
before to distinguish structural relationships of a different sort. But they are useful and long neglected, and I hope that my revival of them is not confusing.
Exocentric structures take advantage of all sorts of cohesive devices and assumptions;
because we all know that most text is a string of increments, it is less essential for
them to be unequivocally shown, and they are sometimes only minimally expressed. For
example, if a reader comes across a noun phrase like:
The enemy in Brussels [10]

underlined or in bold face, standing as a side-head in a document, the reader will interpret it
as an announcement that the succeeding paragraphs, until the next side-head, will have as
their subject matter the meaning of the noun phrase. Grammars of old tried to persuade us
that the noun phrase actually realised a sentence, such as "The subject of this section is
The enemy in Brussels", and that the words "The subject of this section is" were "understood". The sentence could then be analysed as subject, verb, complement and all was
regular.
The problem was, of course, that the number of different ways in which the missing
part of the sentence could be phrased was not limited, nor the number of different analyses
that could be performed on the resultant sentence. Using the notion of an increment, there is
no need to conjure up any more language; we merely need to use the cotextual information
provided. Any piece of language isolated like a heading, and probably marked out typographically from the rest, is very likely to increment our understanding, and therefore to be
interpreted, minimally, as a two-part structure. In this instance only one part is articulated,
and there is no cohesive element to point to another part of the text. We can then suppose
that the other part of the exocentric structure is to be found in some or all of the surrounding text: not in its word-by-word meaning, but in its physical existence and position.
From the conventions of document construction we can assume that this is a heading, so
the relevant text is what follows. The increment is now fully understood and requires no
further support, and specifically no provision of ghostly text to make it into a grammatically
plausible, but phantom, sentence.
An implication of this argument is that we all have, in addition to an interpretive capacity for running text, a means of interpreting other signals, like layout and typography, as
part of the meaning of the text. If not, they would be merely decoration, and altering them
would not alter the meaning of the text. Sometimes this aspect of meaning assumes a dominant role, for example in a telephone directory. A telephone directory contains thousands of
increments, each realised not by a conventional sentence but by the simple juxtaposition of
a name, address and a number. To use it the reader invokes a local grammar which interprets each line of the directory as an exocentric structure, of which the two elements are the
name/address complex, and the number. The first element is interpreted as an endocentric
structure, where the name and address are considered as coselected.
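
The local grammar for a telephone directory described above can be made concrete with a small sketch. Everything specific in it is an assumption introduced for illustration: the line layout ("name, address number"), the regular expression, the type names `Subscriber` and `Entry`, and the sample line with its fictional number. The only point carried over from the text is the shape of the analysis: each line is read as a two-part (exocentric) structure whose first element, the name/address complex, is itself treated as a single coselected unit.

```python
import re
from typing import NamedTuple, Optional

class Subscriber(NamedTuple):
    """First element: name and address taken together as one unit."""
    name: str
    address: str

class Entry(NamedTuple):
    """One directory line read as a two-part structure."""
    subscriber: Subscriber
    number: str

# Hypothetical layout: "Name, Address 0000 000 0000"
LINE = re.compile(
    r"^(?P<name>[^,]+),\s*(?P<address>.+?)\s+(?P<number>\d[\d ]*\d)$"
)

def parse_line(line: str) -> Optional[Entry]:
    """Return an Entry if the line fits the local grammar, else None."""
    m = LINE.match(line.strip())
    if m is None:
        return None
    subscriber = Subscriber(m["name"].strip(), m["address"].strip())
    return Entry(subscriber, m["number"])

print(parse_line("Smith J, 12 High Street 0141 496 0000"))
```

A line that does not fit the pattern is simply rejected, which mirrors the modest scope of a local grammar: it describes one text type well and claims nothing about text in general.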
There are many such local grammars, simple and useful for all sorts of routine communications. The efforts of linguists to write comprehensive and universal grammars have obscured the need for more modest grammars to cope with the large amount of text that cannot be described satisfactorily by general grammars. The insistent evidence of corpus investigation has led to a new interest in local grammars (Barnbrook/Sinclair 2001).

[10] The Economist. November 15th 2003, page 32.

Truth value

Until the seminal work of J. L. Austin (1962) it was normal to suppose that a sentence
contained a proposition, and that a proposition had a truth value. Actual language sentences
had often to be manipulated to become available for processing by the logical procedures,
but it was accepted that there was no critical meaning loss in that process.
Austin pointed out that a large number of actually-occurring sentences did not have a
truth value, and indeed could not be conceived of as having one. How, then, did they make
their meaning? From this starting-point he worked out the notion of performatives, sentences which performed actions directly in the world, like some important person, nominated to launch a ship, naming it and swinging a bottle of champagne at its hull as it is
launched.
Austin's next step was to appreciate that all sentences had some such performative
function; it first showed in those that did not have a truth value, but Austin's major contribution was the realisation that a truth value and a performative function were not mutually
exclusive. For this he introduced another term, illocution, and proposed that each sentence
had an illocutionary force, whether or not it had a truth value. An ordinary statement like:
Often, coaching is a way to give problem employees one last chance. [11]

may be systematically relatable to a proposition that has a truth value, but from the point of view of a conversation or a document its illocutionary force is that of averral. Someone avers it, and that is the nearest it gets to truth. If the speaker or writer is in good standing from the point of view of veracity, then other participants will usually accept that the averral is true, but that acceptance is purely provisional. If it is not questioned or denied by the end of the conversation or document of which it is a part, then it acquires a longer-term quasi-truth status.
We must always remind ourselves that logical analysis is a simplification and a regimentation of what happens in language text, and so cannot be used to explain the meaning
of a text. Increments and their tell-tale illocutions do not have to be explicit ; they only have
to be successful.
Truth value is a good example of how a respectable concept in one discipline becomes a
barrier to clear thinking in another. Let us re-examine two aspects of the uncertainty of
meaning discussed above.
The first is the understanding that there is no neutral ground in meaning, nowhere where the differences between people are finally ironed out. Language is irretrievably interactive. Until Austin opened up the prospects of discourse analysis this inherent quality of language was not prominent, and the equation of a sentence and a proposition was uncontroversial.

11 The Economist. November 15th 2003, page 65.

The second aspect of the uncertainty of meaning is that since the two structures involved in meaning creation, the lexical item and the increment, are incompatible with each
other (or, at the very least, not necessarily compatible with each other), there is no guarantee that real-time discourse will be well-formed, in the Chomskyan sense (op.cit.).
The lack of reliable well-formedness is characteristic of impromptu spoken language,
though it can also be found in hasty written language. But because spoken language is
happening in real time, the endocentric and exocentric structuring do not always dovetail as
neatly as they do in written language where the writer usually has plenty of time to coordinate the components.
Spoken language has come in for a great deal of disparagement since tape recorders made it possible for spoken text to be studied. By then the standards for what text should be like had been established by monitors of the written form of the language. Clearly spoken text did not conform12, and it began to be described in terms which, while sympathetic, were also disparaging to the point of abuse. Speech transcripts were said to be full of mistakes and slovenly constructions (Biber/Johansson/Leech/Conrad/Finegan (1999) use "dysfluency", "hesitation", "inconsistency", "memory loss", "error", "degenerate", "incomplete" among many other terms to describe perfectly normal features of spoken language).
To attract such a barrage of criticism, the mere pressures of real time seem hardly strong
enough ; the trivia of concord and the simple links of cohesion do not make an insuperable
barrier to observing the regular rules of the language. But if the meaning-creating structures
of the language do not fit together like the parts of a Swiss watch, then there could be a
serious barrier to well-formedness.
All this demonstrates the irrelevance of (written-language) well-formedness in the description of spoken text. We need a grammar which will operate without flinching on
what actually happens in speech and in many types of writing, where there is no guarantee
that the criteria for well-formedness will be met. Increments are thus quite a long way from
propositions, both physically and conceptually, despite the overlapping area where a simple
sentence can be seen as an averral, a statement and a proposition without any tinkering.
Following Abercrombie's reasoning, the operation of truth value is a pseudo-procedure. For a sentence/proposition to have a truth value, it must be possible to envisage a procedure whereby its truth may be established or challenged. For an indefinitely large number of sentences, the procedure can be imagined but could not be carried out. For example, the truth or falsehood of "Bill has the smallest head of any man on earth" could be established if someone measured the heads of all the men on earth. This would have to be done simultaneously since the population is constantly changing, and would only hold for an instant, but it can be imagined. It is of course out of the question that any such event could take place, no matter how much money and logistical skill were devoted to it.
This is not intended as a criticism of the role of truth value in philosophy, but only of its relevance to language. A logician is not interested in whether a particular assertion is true or false, but only in whether its truth or falsehood is discoverable, no matter how far-fetched the procedure that would be required. Averral is as close as an utterance can get to veracity, and while averral has illocutionary force, it does not have truth value.

12 Neither did written text, but it took some years of corpus research for this to come to light. In the meantime written text was preprocessed to make it conform to the expectations of parsers.

Conclusion

Meaning is an inalienable property of language, and its most important property ; the object
of language analysis and description is to explain how the articulation of words and sentences creates meaning. While the word and the sentence provide fairly manageable starting-points for the elucidation of meaning, they have to be made the basis of two more abstract categories in order to carry the complexity of language structure. These are called the
lexical item and the increment respectively. All text is sequentially processed by interpretive devices tuned to the two types of meaning-creation, and there is no guarantee that they
will co-ordinate tidily. In fact, informal spoken language often shows a lack of coordination that is edited out of written material.
Meaning has thus an element of uncertainty at its core ; also it is retrieved by individuals
with quite different experience of both language and life. One of the key properties of language is paraphrase, whereby alternative phrasing is offered for a meaning which remains
largely unchanged. At one extreme of paraphrase is definition, where words are held to be
semantically almost equivalent to semi-formal paraphrases of them. Since there is no ideal
paraphrase nor ideal definition, there is an uncertainty of meaning here as well.
Utterances do not have truth value ; though many may be identical to sentences which
can realise propositions which have truth value, such utterances can only be averred. Each
participant makes a personal assessment of the likely veracity of an averral.

Bibliography

Abercrombie 1965 = David Abercrombie : Pseudo-procedures in linguistics. In : Studies in Phonetics and Linguistics. London 1965.
Allén/Berg/Malmgren/Norén/Ralph 2001 = Sture Allén, Sture Berg, Sven-Göran Malmgren, Kerstin Norén, Bo Ralph (eds.) : Gäller Stam, Suffix och Ord. Meijerbergs Arkiv för
Svensk Ordforskning 29. Göteborg, Meijerbergs Institut för Svensk Etymologisk Forskning, Göteborgs Universitet 2001.
Austin 1962 = John Austin : How to do things with words. London 1962.
Barnbrook/Sinclair 2001 = Geoff Barnbrook, John Sinclair : Specialised corpus, local and
functional grammars. In : Mohsen Ghadessy, Alex Henry, Robert Roseberry (eds.) : Small
Corpus Studies and ELT. Amsterdam 2001 (Studies in Corpus Linguistics no.5), 237-276.
Bazell/Catford/Robins 1966 = Charles E. Bazell, John C. Catford, M. A. K. Halliday, and
R. H. Robins (eds.) : In Memory of J. R. Firth. London 1966.
Biber/Johansson/Leech/Conrad/Finegan 1999 = Douglas Biber, Stig Johansson, Geoffrey
Leech, Susan Conrad, Edward Finegan : The Longman Grammar of Spoken and Written
English. London 1999.
Brazil 1995 = David Brazil : A Grammar of Speech. Oxford 1995.
Chomsky 1965 = Noam Chomsky : Aspects of the theory of syntax. Boston 1965.
Corpas 2000 = Gloria Corpas (ed.) : Las Lenguas de Europa : estudios de fraseología, Fraseografía
y Traducción. Editorial Comares, SL, Granada 2000.
Ghadessy/Henry/Roseberry 2001 = Mohsen Ghadessy, Alex Henry, Robert Roseberry
(eds.) : Small Corpus Studies and ELT. Amsterdam 2001 (Studies in Corpus Linguistics no.5).
Halliday 1966 = M. A. K. Halliday : Lexis as a linguistic level. In : Charles E. Bazell, John C.
Catford, M. A. K. Halliday, and R. H. Robins (eds.) : In Memory of J. R. Firth. London
1966.
Halliday/Teubert/Yallop/Čermáková 2004 = M. A. K. Halliday, Wolfgang Teubert, Colin
Yallop and Anna Čermáková : Lexicology and Corpus Linguistics. An Introduction.
London 2004.
Krishnamurthy 2004 = Ramesh Krishnamurthy (ed.) : English Collocation Studies. The OSTI
report. London 2004 (Corpus and Discourse).
Merlini/Sinclair 1996 = Lavinia Merlini, John Sinclair : Lessico e Morfologia. Textus IX, 1,
1996.
Sinclair 1996a = John Sinclair : Introduction and Multilingual Databases. In : IJL Vol 9 no 3.
1996, 172-196.
Sinclair 1996b = John Sinclair : The Search for Units of Meaning. In : Merlini/Sinclair (eds.),
75-106. Reprinted in Gloria Corpas (ed.) : Las Lenguas de Europa : estudios de fraseología, Fraseografía y Traducción. Editorial Comares, SL, Granada 2000, 7-38.
Sinclair 1999 = John Sinclair : The Lexical Item. In : Edda Weigand (ed.) : Contrastive Lexical
Semantics. Amsterdam/Philadelphia 1999 (Volume 17, Current Issues in Linguistic Theory), 1-24.
Sinclair 2001 = John Sinclair : The Floating Dictionary. In : Sture Allén, Sture Berg, Sven-Göran Malmgren, Kerstin Norén, Bo Ralph (eds.) : Gäller Stam, Suffix och Ord. Meijerbergs Arkiv för Svensk Ordforskning 29. Göteborg, Meijerbergs Institut för Svensk Etymologisk
Forskning, Göteborgs Universitet 2001, 393-422.
Sinclair 2004 = John Sinclair : New evidence, new priorities, new attitudes. In : Sinclair (ed.)
Sinclair 2004 = John Sinclair (ed.) : How to use corpora in language teaching. Amsterdam, Philadelphia, Benjamins 2004, 271-299.
Sinclair/Payne/Hernandez 1996 = J. Sinclair, J. Payne, C. Hernandez (eds.) : Corpus to Corpus: A study of Translation Equivalence. Special issue of International Journal of Lexicography
9(3) 1996.
Weigand 1999 = Edda Weigand (ed.) : Contrastive Lexical Semantics. Amsterdam/Philadelphia
1999 (Volume 17, Current Issues in Linguistic Theory).
