IN ENGLISH
KREMENA DIMITROVA
TABLE OF CONTENTS
1. Introduction
5. Comparison
5.1.
5.2.
5.3.
5.4.
5.5.
5.6.
5.7.
6. Summary
Abstract
Idiomatic expressions (IEs) are a common problem for formal grammar. The question is how a
system built on regular combination can deal with the different kinds of irregularity that
IEs show. Often the meaning of the whole expression cannot be composed from the meanings of
its daughters in a regular way, and IEs moreover differ in their degree of flexibility.
Many theories attempt to account for the irregular behaviour of these elements. All
contribute interesting ideas, but in this paper we concentrate on three approaches: the
analyses of Krenn and Erbach (1994), Sailer (2003) and Söhn (2005). We examine where their
concepts differ, why some of their ideas have been criticized and not adopted for further
use, and in what respects one appears more successful than the others. We first briefly
introduce the three approaches, in order to give the reader an idea of the main concepts of
the analyses, and then summarize and compare the most important issues in them.
1 Introduction
Two groups of IEs can be distinguished on the basis of their semantics. Some have a meaning
that cannot be derived from the meanings of their parts and therefore cannot be analysed
compositionally. For other IEs, the constituents can be assigned meanings with which they
contribute regularly to the meaning of the whole expression. These idioms can be analysed
compositionally, but what distinguishes them from free combinations is that the parts
require the presence of particular lexical items in order to keep the idiomatic meaning.
Moreover, IEs show different degrees of flexibility, which a theory of grammar should
account for. The way they exhibit their irregularity distinguishes them not only from
regular phrases but also from one another. It is a challenge for formal grammars to give a
full account of these irregularities, since such systems usually aim to express
generalizations and regularities in natural language. In this paper a few IEs are used to
show how the analyses work. We do not attempt a detailed linguistic analysis of them, so we
mention only those properties which are relevant for the three theories. The examples are
the following:
Shoot the breeze ('chat')
The meaning of this IE is 'chat'. Even though the phrase is structurally well-formed, its
meaning cannot be distributed among the constituents. The flexibility of this IE is quite
low: the only observable variation is inflection.
1) Yesterday we were shooting the breeze with some friends and we decided to make a cake
for Nick's birthday.
2) *The breeze was shot by us.
3) *We shot the breeze which was very interesting.
4) *We shot the interesting breeze.
Take into account ('consider')
The IE contains the completely frozen prepositional object into account, which does not
allow any modification. The verb take always requires a direct object. The meaning of the
IE cannot be derived from its parts, but it certainly exhibits more flexibility than shoot
the breeze.
be attributed to the fact that his theory was written for German, which has a more
complicated syntax than English; this problem is not the topic of this paper, however, so
we will adapt our English examples to his theory. His analysis adopts modified ideas from
both Sailer and Krenn and Erbach. Some technical details remain unfinished in his theory,
and another weak point is that the analysis turns out to be too restrictive in some cases.
Of course, our purpose in this paper is not to explain all the details of the analyses
thoroughly, but only those needed to introduce the ideas of the approaches briefly and so
provide a basis for comparison. A discussion of the semantic frameworks adopted by the
three theories, and of the differences in how semantics is introduced for the IEs, is also
omitted. What matters for us is that an account of the semantics exists.
But let us first start with a more detailed overview of Krenn and Erbach (1994).
[ SYNSEM|LOC [ CAT [ HEAD adj
                     SUBCAT < NP[1] > ]
               CONTENT [ bright
                         BRIGHT-IS [1] ] ] ]
For those idioms that allow a certain level of flexibility, the same analysis does not seem
to be the proper solution. Krenn and Erbach (1994) try to give these IEs a unified
syntactic treatment in order to capture their selectional and occurrence restrictions,
whereas the problem of how to represent their semantics is handled differently for each
group of IEs. The syntactic treatment they suggest is a subcategorization analysis that
keeps everything on the lexical level, thus allowing Derivational Rules (DRs) to account
for processes like passivization and inflection. In this way, all information about the
idiom is included in the lexical entry of its head, with the information about its
complements specified on the SUBCAT list.
At first they consider the possibility of selecting for phonology, which is motivated in
particular by cases like trip the light fantastic or take into account. The light fantastic
and into account are frozen complements, and the syntactic category of the constituents of
the former is not clear-cut. Also, their parts cannot be assigned a meaning which is part
of the meaning of the whole expression. It would be very convenient if the heads of these
expressions, the verbs trip and take respectively, could simply subcategorize for
complements with a specific phonology.
[ PHON <take>
  SYNSEM|LOC [ CAT [ HEAD verb ]
               CONT [ consider
                      CONSIDERER [1]
                      CONSIDERED [2] ] ] ]
Fig.2 LE of the idiomatic version of take (into account) when selecting for phonology
Attractive as it sounds, this solution is problematic in several respects. First, such a
representation violates the Principle of Selection in HPSG, which states that selection is
allowed only for the syntactic and semantic information contained in synsem objects, and
not for phonology or constituent structure. To avoid this problem, Krenn and Erbach (1994)
assume that heads subcategorize for signs and not for synsems. However, they accept that
selecting for phonology is not the best way to represent the fixed complements, as the
phonological information may change in processes like passivization, modification and
others, as evidenced by German data. For example, the frozen complement den Garaus in the
idiom den Garaus machen (literally 'the Garaus make', 'to finish off') appears in the
accusative case in the active voice, but receives nominative case in the passive:
There are also cases where the subcategorized element can appear both in the singular and
in the plural, like einen/den Beschluß fassen, (die) Beschlüsse fassen, lange vorbereitete
Beschlüsse fassen.
This shows that specifying the PHON value is not very successful, as each variant of the
PHON value would have to be specified additionally.
Searching for another way of representing the information about the frozen complements
without having to specify their PHON values, the authors introduce the feature LEXEME,
which was proposed earlier by Erbach (1992). LEXEME is a head feature, because its purpose
is to determine which particular lexeme is the head of the complement phrase on the SUBCAT
list of the head daughter of the IE. The information about the head lexeme percolates
according to the Head-Feature Principle. But in order to make it relevant in cases of
relativization and pronominalization, Krenn and Erbach (1994) specify LEXEME as an index
feature. This is motivated by the fact that, as a feature of sort index, it is shared
between the NP and the relative or personal pronoun respectively.
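The percolation and selection mechanism just described can be illustrated with a minimal Python sketch. It is only an illustration under simplifying assumptions: the class Sign, the method head_lexeme and the function selects are hypothetical names, LEXEME is modelled as a plain string that a phrase inherits from its head daughter, and the index sharing needed for relative and personal pronouns is left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Sign:
    """Toy sign: a word (no head daughter) or a phrase with a head daughter."""
    phon: str
    lexeme: Optional[str] = None           # LEXEME, set directly on words only
    head_dtr: Optional["Sign"] = None      # head daughter of a phrase
    non_head_dtrs: List["Sign"] = field(default_factory=list)

    def head_lexeme(self) -> Optional[str]:
        # Simplified Head-Feature Principle: a phrase shares the LEXEME
        # value of its head daughter; a word contributes its own value.
        if self.head_dtr is None:
            return self.lexeme
        return self.head_dtr.head_lexeme()


def selects(required_lexeme: str, complement: Sign) -> bool:
    """True if the complement is headed by a word with the required LEXEME."""
    return complement.head_lexeme() == required_lexeme


the = Sign("the", lexeme="the")
the_breeze = Sign("the breeze", head_dtr=Sign("breeze", lexeme="breeze"),
                  non_head_dtrs=[the])
the_wind = Sign("the wind", head_dtr=Sign("wind", lexeme="wind"),
                non_head_dtrs=[the])

print(selects("breeze", the_breeze))  # the NP idiomatic shoot is looking for
print(selects("breeze", the_wind))    # an NP headed by a different lexeme
```

In this toy model the verb never inspects the phonology of its complement, only the lexeme of the complement's head, which is the point of replacing PHON selection by LEXEME.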
Let us now see how our example IE take into account (Fig.3) looks within the
subcategorization analysis using the LEXEME feature of Krenn and Erbach (1994):
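[ PHON <take>
  SYN|LOC [ HEAD verb
            SC < NP[1], NP[2], ... > ]
  CONT [ consider
         CONSIDERER [1]
         CONSIDERED [2] ] ]
Fig.3 LE of the idiomatic version of take (into account) using the LEXEME feature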
The idiomatic verb take subcategorizes for a phrase which is headed by a word with the
LEXEME value account. The relevant semantics is specified in the CONTENT value of the verb.
Since we are operating on the lexical level, the DRs can take care of the syntactic and
semantic operations which the IE allows. Modification is excluded by specifying structural
properties of the phrase for which the verb subcategorizes: since the head daughter is
specified as a lexical sign, modification is impossible. But what happens when an IE
disallows a kind of flexibility which is not blocked by the structure of the LE?
Having explained the syntactic side of the analysis, it is time to discuss the semantics.
As previously mentioned, the semantics of an unanalysable idiom cannot be decomposed into
the meanings of its constituents. So how can such an IE be analysed semantically, if the
usual compositional means do not work here? Since the compositional semantics of HPSG is
encoded in the lexical head, the specific semantic behaviour of the IEs can be specified in
the lexical entries of their heads. Let us illustrate this with the LE of shoot (the
breeze) (Fig.4). The IE does not allow any flexibility; the only possible variation is
inflection.
[ PHON <shoot>
  SYNSEM|LOC [ CAT [ HEAD verb
                     SUBCAT < NP[nom][1],
                              NP[acc, sg][DTRS|HEAD-DTR ... INDEX|LEXEME breeze],
                              PP[with] > ]
               CONT [ chat
                      CHATTER [1] ] ] ]
Fig.4 LE of the idiomatic version of shoot (the breeze)
[ PHON <spill>
  SYN|LOC [ CAT [ HEAD verb ]
            CONT [ divulge-information
                   DIVULGER [1]
                   DIVULGED [2] ] ] ]

[ PHON <make>
  SYNSEM|LOC|CAT [ HEAD verb
                   SUBCAT < NP[1],
                            [ PHON <difference>
                              SYN|LOC [ CAT|SC < NP[1] >
                                        CONT [2] ] ] > ] ]
cases when the complement of the verb has an empty semantic contribution, but often this is
not the case.
These problems show that the analysis suggested by Krenn and Erbach (1994) for handling the
different cases of IEs in terms of subcategorization is not very successful. The next
approach we examine is that of Sailer (2003). His analysis is conceptually quite different,
since the distributional restrictions of IEs are accounted for globally, by stating the
possible context of appearance of the constituents as the value of the newly introduced
COLL attribute.
Every element in the VP occurs in the same form in some other combination.
Syntactically, the VP is of a regularly built shape.
If it is a V-NP combination, the direct object can be modified syntactically.
If it is a V-NP combination, it can be passivized (and further raised).
If it is a V-NP combination, the direct object can be topicalized.
If it is a V-NP combination, the direct object can have the shape of a relative pronoun.
According to these criteria our examples are assigned the relevant kind of irregularity.
Shoot the breeze, take into account and spick and span are internally irregular, because
the meanings of their constituents do not contribute to the meanings of the IEs. Spill the
beans and make a difference are internally regular, because their meaning is a regular
combination once their constituents are assigned the proper meaning. We will return to this
later in this chapter when we explain the lexical entries which the analysis assigns to
them. Let us now look in more detail at the analysis, which adopts ideas from Gazdar et
al. (1985) but is more successful in accounting for the internally irregular IEs.
Sailer also assumes that the internally irregular IEs are listed in the lexicon. But in
contrast to Krenn and Erbach (1994), he proposes for them a representation on the phrasal
rather than the lexical level. Since these IEs do not have the usual properties of regular
phrases, the principles of semantic and syntactic combination do not apply to them.
Therefore, all the principles of grammar must be modified so that they operate only on
regular phrases. But what marks the distinction between the regular and the irregular
phrase when an IE also has a literal meaning, which is usually the case? The answer to this
and many other questions is the new attribute COLL, defined on the sort sign. This feature
is used in the case of internally irregular IEs to mark their irregular status, but its
function is of much greater importance for the analysis of more flexible IEs, where COLL
accounts for their distributional restrictions.
The value of the attribute is a list. Each internally irregular phrase is marked by a
non-empty list as its COLL value; regular phrases, by contrast, have the value elist. With
this mechanism for distinguishing the IE from the regular phrase, the grammar principles
can easily be modified. An instance is the
25) HEAD-FEATURE PRINCIPLE:
[ phrase, COLL elist ] → [ SS|LOC|CAT|HEAD [1], DTRS|H-DTR|SS|LOC|CAT|HEAD [1] ]
In the original version of Pollard and Sag (1994), the principle informally says that the
HEAD values of a phrase and its head daughter are identical. Here the principle is changed
by the additional specification that this identity is required only for regular phrases.
Similar trivial changes are applied to the rest of the grammar principles, with the result
that they hold only for regular combinations.
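The effect of this restriction can be sketched in a few lines of Python. This is a toy model, not Sailer's formalization: the class Sign, its fields and the function obeys_head_feature_principle are hypothetical names, HEAD values are plain strings, and an empty Python list stands in for elist.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Sign:
    """Toy sign with only the attributes needed here (hypothetical model)."""
    head: str                                       # stand-in for the HEAD value
    coll: List[str] = field(default_factory=list)   # empty list plays the role of elist
    head_dtr: Optional["Sign"] = None               # None for words


def obeys_head_feature_principle(sign: Sign) -> bool:
    if sign.head_dtr is None:
        return True   # words are not constrained by the principle
    if sign.coll:
        return True   # non-empty COLL: irregular phrase, the principle is not applied
    # Regular phrase: HEAD must be identical to the HEAD of the head daughter.
    return sign.head == sign.head_dtr.head


verb = Sign(head="verb")
regular_vp = Sign(head="verb", head_dtr=verb)
broken_vp = Sign(head="noun", head_dtr=verb)
irregular = Sign(head="adj", coll=["marked as irregular"], head_dtr=verb)

for phrase in (regular_vp, broken_vp, irregular):
    print(phrase.head, bool(phrase.coll), obeys_head_feature_principle(phrase))
```

The third phrase has a HEAD value different from that of its head daughter, yet it is not ruled out, because the check is simply skipped for phrases whose COLL value is non-empty.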
The internally irregular IEs are represented in the lexicon in the form of Phrasal Lexical
Entries (PLEs). Up to this point, the principle which contains a description of all lexical
entries (words) is the WORD PRINCIPLE (Pollard and Sag, 1994).
26) The WORD PRINCIPLE:
word → (LE1 ∨ ... ∨ LEn)
In the theory of Sailer the principle which describes all PLEs is the INTERNAL
IRREGULARITY PRINCIPLE (IIP):
27) INTERNAL IRREGULARITY PRINCIPLE (IIP):
Informally, the LEXICON PRINCIPLE describes all entries which have a nelist as the value of
the COLL attribute. Here is the formal representation of the principle:
28) The LEXICON PRINCIPLE:
sign
It is important to mention that the constituents of a PLE, and also the phrases that
dominate it, are regular. The only irregular item is the phrase of the IE itself. So, for
example, if one of the constituents is a phrase, its COLL value is elist, i.e. the
traditional principles of combination apply to it, and if it is a word, it appears with its
regular meaning. Even though its building elements are regular, the IE ignores the
principles of grammar and combines them irregularly. The IE only blocks the material of its
daughters from appearing higher in the structure and blocks the introduction of new
material.
Let us now turn to our examples and see how the analysis accounts for them.
The IE shoot the breeze (Fig.8) does not allow any flexibility except inflection, and its
meaning cannot be distributed among its parts. All this must be specified in its PLE. The
relevant meaning is given as the CONTENT value of the phrase. Modification is excluded by
specifying breeze as a direct daughter of the NP. Passivization is blocked by requiring the
presence of an NP complement daughter in the structure of the PLE. The impossibility of
topicalization and relativization is also accounted for by the nature of the PLE: the PLE
requires an NP non-head daughter with its specific phonology. Assuming a traceless analysis
of extraction, if this direct object is extracted, the required phonology is missing from
its place, so there is no VP which satisfies the requirements of the PLE. It should be
mentioned that the analysis of Sailer (2003) provides no mechanism for blocking
pronominalization, which is impossible for this IE.
The other IE from our examples is spick and span (Fig.7). It functions entirely like an
adjective and is completely frozen. For such entities a PLE is appropriate in which only
the phonology, the syntactic category and the meaning are specified.
[ phrase
  SS|LOC [ CAT|HEAD [ adj
                      MOD [3] ]
           CONT λx.bright'@(x@) ]
  COLL nelist ]
Fig.7 PLE of spick and span

[ phrase
  PHON [1] ⊕ [2]
  SS|LOC [ CAT [ HEAD [3] verb
                 SUBCAT < NP[4], PP[with][5] > ]
           CONT λx.chat'@(x@) ]
  DTRS [ head-compl-struc
         H-DTR [ word
                 PHON [1] <shoot>
                 SS|LOC|CAT [ HEAD [3]
                              SUBCAT < [4], [5], [6] > ] ]
         N-DTR [ PHON [2]
                 SS [6] LOC [ CAT [ HEAD noun
                                    SUBCAT < > ]
                              CONT λP.the y: breeze'@(y@) (P@(y)) ]
                 DTRS [ H-DTR [ PHON <breeze>
                                SS N ]
                        N-DTR [ PHON <the>
                                SS Det ] ]
                 COLL elist ] ]
  COLL nelist ]
Fig.8 PLE of shoot the breeze
The IE take into account contains the frozen complement into account. We cannot assign it a
meaning, and since it exhibits no flexibility, it can be introduced in the PLE of take into
account as a phrase with fixed phonology.
[ phrase
  PHON [1] ⊕ [2]
  SYN|LOC [ CAT [ HEAD [3] verb
                  SC < NP[4], NP[5] > ]
            CONT λxλy.consider'@(x@, y@) ]
  DTRS [ H-DTR [ word
                 PHON [1] <take>
                 SS|LOC|CAT [ HEAD [3]
                              SUBCAT < NP[4], NP[5], [6] > ] ]
         N-DTR [ phrase
                 PHON [2] <into account>
                 SS [6] ] ]
  COLL nelist ]
In this part we discuss how the analysis works for internally regular IEs. As explained
previously, those elements behave much like regular phrases, apart from the fact that their
constituents appear with their idiomatic meaning only within the IE, or do not allow some
semantic or syntactic operations typical of free combinations. Still, they show much more
flexibility than the internally irregular IEs, and once their parts are assigned a meaning,
these parts regularly compose the meaning of the whole expression. Therefore, in Sailer's
analysis they are treated as regular combinations, and the only thing needed to account for
them fully is a mechanism to specify their distributional restrictions. This is done by
specifying, by means of the COLL attribute, the linguistic context in which each
constituent may appear. Until now, the new feature was used only to mark the difference
between regular and internally irregular phrases. In the case of internally regular IEs,
COLL is given another function, namely to specify the context of occurrence of a word. The
context of occurrence is assumed to be the minimal clause of the word. Sailer defines the
minimal clause with the help of the dominance relation:
29) Minimal-clause (informal definition):
For two signs x and y such that x ≠ y, x is the minimal clause for y iff
(i) x dominates y, and
(ii) x has a SUBCAT value of sort empty-list and a HEAD value of sort verb, and
(iii) there is no sign z, dominated by x and distinct from x, which satisfies (i) and (ii)
in place of x.
30) Definition of the relation dominate:
dominate(1, 2) ⇐ 1 = 2
dominate(1, 2) ⇐ ∃3 ( 1[DTRS H-DTR 3] and dominate(3, 2) )
dominate(1, 2) ⇐ ∃3 ( 1[DTRS N-DTR 3] and dominate(3, 2) )
dominate(1, 2) ⇐ ∃3 ( 1[STORE [der-rule, IN 3]] and dominate(3, 2) )
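Definitions 29) and 30) translate fairly directly into a recursive procedure. The following Python sketch is an illustration only: the class Sign and all field and function names are assumptions made for the example, HEAD and SUBCAT are reduced to a string and a list of strings, and the base clause of dominate is taken to be identity of the two signs.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass(eq=False)
class Sign:
    """Toy sign; eq=False makes comparison identity-based (hypothetical model)."""
    label: str
    head: str = ""
    subcat: List[str] = field(default_factory=list)
    h_dtr: Optional["Sign"] = None      # DTRS H-DTR
    n_dtr: Optional["Sign"] = None      # DTRS N-DTR
    dr_input: Optional["Sign"] = None   # input of a derivational rule


def daughters(x: Sign) -> Iterator[Sign]:
    for d in (x.h_dtr, x.n_dtr, x.dr_input):
        if d is not None:
            yield d


def below(x: Sign) -> Iterator[Sign]:
    """Every sign strictly below x."""
    for d in daughters(x):
        yield d
        yield from below(d)


def dominate(x: Sign, y: Sign) -> bool:
    # Base clause: x and y are the same sign; otherwise descend one step.
    return x is y or any(dominate(d, y) for d in daughters(x))


def is_clause(x: Sign) -> bool:
    # Condition (ii): verbal HEAD and empty SUBCAT list.
    return x.head == "verb" and not x.subcat


def minimal_clause(x: Sign, y: Sign) -> bool:
    if x is y or not (dominate(x, y) and is_clause(x)):
        return False
    # Condition (iii): no clause strictly below x also dominates y.
    return not any(z is not y and is_clause(z) and dominate(z, y) for z in below(x))


# "John said Mary spilled the beans"
beans = Sign("beans", head="noun", subcat=["Det"])
np = Sign("the beans", head="noun", h_dtr=beans, n_dtr=Sign("the", head="det"))
spilled = Sign("spilled", head="verb", subcat=["NP", "NP"])
vp = Sign("spilled the beans", head="verb", subcat=["NP"], h_dtr=spilled, n_dtr=np)
inner = Sign("Mary spilled the beans", head="verb", h_dtr=vp,
             n_dtr=Sign("Mary", head="noun"))
said = Sign("said", head="verb", subcat=["NP", "S"])
outer_vp = Sign("said Mary spilled the beans", head="verb", subcat=["NP"],
                h_dtr=said, n_dtr=inner)
outer = Sign("John said Mary spilled the beans", head="verb", h_dtr=outer_vp,
             n_dtr=Sign("John", head="noun"))

print(minimal_clause(inner, beans), minimal_clause(outer, beans))
```

In the example, both clauses dominate beans, but only the embedded clause counts as its minimal clause, because the matrix clause has a smaller clause between itself and the word.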
In order to ensure that, for each sign and for each word dominated by this sign, the COLL
value of this word either dominates the sign or is dominated by it, the COLL PRINCIPLE is
assumed.
31) The COLL PRINCIPLE:
sign → ∀1 ∀2 ( ( dominate(:, 1) and 1[word, COLL <2 sign>] )
               → ( dominate(:, 2) or dominate(2, :) ) )
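Read procedurally, the principle is a check over every pair of a sign and a word below it. The Python sketch below illustrates that reading under stated assumptions: Sign, signs, dominate and obeys_coll_principle are hypothetical names, a word's COLL list is reduced to at most one sign, and derivational rules are ignored.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass(eq=False, repr=False)
class Sign:
    """Toy sign (hypothetical model); identity-based comparison."""
    label: str
    h_dtr: Optional["Sign"] = None
    n_dtr: Optional["Sign"] = None
    coll: Optional["Sign"] = None   # the single element of a word's COLL list, if any

    @property
    def is_word(self) -> bool:
        return self.h_dtr is None and self.n_dtr is None


def signs(x: Sign) -> Iterator[Sign]:
    """x itself and every sign below it."""
    yield x
    for d in (x.h_dtr, x.n_dtr):
        if d is not None:
            yield from signs(d)


def dominate(x: Sign, y: Sign) -> bool:
    return any(s is y for s in signs(x))


def obeys_coll_principle(root: Sign) -> bool:
    for s in signs(root):                 # each sign in the structure
        for w in signs(s):                # each word dominated by that sign
            if w.is_word and w.coll is not None:
                c = w.coll
                if not (dominate(s, c) or dominate(c, s)):
                    return False
    return True


beans = Sign("beans")
np = Sign("the beans", h_dtr=beans, n_dtr=Sign("the"))
vp = Sign("spilled the beans", h_dtr=Sign("spilled"), n_dtr=np)
clause = Sign("Mary spilled the beans", h_dtr=vp, n_dtr=Sign("Mary"))

beans.coll = clause   # the COLL value is a sign of the same structure
print(obeys_coll_principle(clause))
```

If the COLL value of beans were instead a sign that belongs to an unrelated structure, neither dominance condition could hold for the NP, the VP or the clause, and the check would fail.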
For each idiomatic constituent a new lexical entry is specified with the relevant semantics
and COLL value. The idiomatic word differs from its literal counterpart first in its
meaning, since a new constant is specified for the idiomatic version, and second in its
contexts of appearance.
The following table (Fig.10) summarizes the possible values of COLL:
Entity                             sort      COLL value
Regular phrase                     phrase    elist
Internally irregular IE            phrase    [ sign]
Internally regular IE              phrase    elist
Constituent of a regular phrase    word      [ sign]
Constituent output of DR           word      elist
An internally regular IE from our example data is spill the beans. In terms of flexibility,
the phrase behaves very much like a free combination, but topicalization and relativization
are not allowed. The context of appearance of the IE is specified as the COLL value of the
LEs of spill and beans, and it accounts for the above-mentioned inflexibilities. In the
case of topicalization, the minimal clause of the IE does not satisfy the context of
occurrence required in the LEs of the constituents, so this operation is excluded.
Analogously, relativization is excluded because, after the extraction of the beans, its
semantic contribution is absent from the minimal clause of the verb spill. A more detailed
explanation of these phenomena can be found on page 341 of the paper. Here are the LEs for
spill (Fig.11) and beans (Fig.12):
∃1 ∃2 ∃3
[1] [ word
      PHON <spill>
      S|L [ CAT [ HEAD verb
                  SUBCAT < NP, NP, (PP[to]) > ]
            CONT λyλx.spill'@(x@, y@) ]
      COLL < [2] > ]
and [2] sign
and minimal-clause([2], [1])
Fig.11 LE of the idiomatic version of spill

∃1 ∃2
[1] [ word
      PHON <beans>
      S|L [ CAT [ HEAD noun
                  SUBCAT < Det > ]
            CONT λx.beans'@(x@) ]
      COLL < [2] > ]
and [2] sign
and minimal-clause([2], [1])
Fig.12 LE of the idiomatic version of beans
The last of our examples is the support verb construction make a difference (Fig.13, 14).
Its distributional requirements are stated in terms of the co-occurrence of the
constituents. For the verb make, a new semantic constant make is introduced. The COLL value
of the constituents specifies the context requirements of the IE. Sailer assumes that for
support verbs a concrete word, and not only a CONTENT value, must be specified as the
direct object of the head. The support verb therefore places a restriction on the phonology
of the head of its direct object.
∃1 ∃2 ∃3 ∃4 ∃5
[1] [ word
      PHON <make>
      SS|LOC|CAT [ HEAD verb
                   SUBCAT < NP, NP[LOC [3]] > ]
      COLL < [2] > ]
and dominate([2], [4])
and [4][ S|L [3] ]
and lexical-head([4], [5])
and [5][ PHON <difference>
         S|L|C [ HEAD noun ] ]
Fig.13 LE of the support verb make

[ PHON <difference>
  HEAD noun
  CONT λx.difference'@(x@) ]
... ([2], [1])
Fig.14 LE of difference
In short, the great advantage of Sailer's analysis is its simplicity: after some minimal
changes and the introduction of a single attribute, which both distinguishes IEs from
regular phrases and captures the distributional restrictions of IEs, a wide range of
collocational data can be represented in the formal grammar.
The critique of the theory comes from the fact that it appears to be too powerful and
unrestrictive. As the whole utterance is made available to lexical elements, any sort of
restriction could be stated as collocational. Sailer therefore suggests that the mechanism
be used only in cases where the traditional assumptions of formal grammar cannot handle the
distributional phenomena.
Söhn's analysis largely follows that of Sailer, but in order to make it more restrictive,
he adopts an earlier version of the COLL attribute which can take only local information as
its value. He also borrows an idea from Krenn and Erbach's analysis, the ability of a head
to select a certain word, but in a much modified form.
lot of the locally contained information is missed. Therefore, he adopts an earlier version
of the COLL attribute introduced in the work of Richter/Sailer (1999). There the possible
values of the COLL list are barriers. A barrier contains local information and specifies
the linguistic structure (unit) within which the idiomatic constituent must be realized.
The version of COLL used in Söhn's analysis is intended to be much more restrictive than
the one introduced in Sailer (2003). The purpose of the feature stays the same, namely to
restrict the environment in which an idiomatic constituent may occur. In this approach,
too, COLL is a sign feature and its value is a list. The difference is that here the
possible barrier is not only the complete clause, as in Sailer (2003); barriers can also be
NPs, PPs, VPs or utterances.
32) barrier [ LOC-LIC local ], with the subsorts:
      complete-clause
      utterance
      xp
        np
        vp
        pp
        ...
[ sign]
34) Value of the COLL attribute (Söhn's version):
barrier [ LOC-LIC local ]
The barrier objects have the attribute LOCAL-LICENSER (LOC-LIC), the value of which is of
sort local. By means of these barriers the distributional restrictions of a word are
specified in its lexical entry, and the word is obliged to appear only within the specified
context. A barrier has to be minimal and has to dominate the word. All this is ensured by
the LICENSING PRINCIPLE (LIP):
35) LICENSING PRINCIPLE (LIP):
For each barrier object on the COLL list of a sign x and for each phrase z:
the LOCAL value of z is identical with the LOC-LIC value
iff z dominates x, z can be identified as the barrier specified,
and z dominates no sign y which in turn dominates x and forms an equivalent barrier.
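The informal statement of the LIP can be approximated by a small check over the phrases that dominate a word. The Python sketch below is a simplification and not Söhn's formalization: Phrase, Barrier and licensed are hypothetical names, LOCAL values are replaced by descriptive strings, the dominating phrases are given as a list ordered from nearest to farthest, and all barriers on the COLL list are treated as a conjunction.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Phrase:
    kind: str    # e.g. "np", "vp", "pp", "complete-clause", "utterance"
    local: str   # stand-in for the phrase's LOCAL value


@dataclass(frozen=True)
class Barrier:
    kind: str
    loc_lic: str  # LOC-LIC: the LOCAL value the barrier phrase must have


def licensed(coll: List[Barrier], ancestors: List[Phrase]) -> bool:
    """ancestors: the phrases dominating the word, nearest first."""
    for barrier in coll:
        # The nearest dominating phrase of the right kind is the minimal barrier.
        minimal = next((p for p in ancestors if p.kind == barrier.kind), None)
        if minimal is None or minimal.local != barrier.loc_lic:
            return False
    return True


# Idiomatic "beans" wants a complete clause headed by idiomatic "spill".
beans_coll = [Barrier("complete-clause", "headed by idiomatic spill")]

idiomatic_context = [
    Phrase("np", "headed by idiomatic beans"),
    Phrase("vp", "headed by idiomatic spill"),
    Phrase("complete-clause", "headed by idiomatic spill"),
]
literal_context = [
    Phrase("np", "headed by idiomatic beans"),
    Phrase("vp", "headed by literal spill"),
    Phrase("complete-clause", "headed by literal spill"),
]

print(licensed(beans_coll, idiomatic_context), licensed(beans_coll, literal_context))
```

Taking the first matching phrase in a nearest-first list is what models the minimality requirement: a more distant phrase of the same kind cannot license the word.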
36) The formal representation of the LIP in the case of a single vp-barrier is:
[1] [ sign
      COLL < [2] [ vp
                   LOC-LIC [4] ] > ]
[3] [ phrase
      SS|LOC [4] ]
As the barrier object on the COLL list must fulfil the locality requirements and the LIP,
we can say that Söhn has succeeded in solving the problem of the unrestrictedness and lack
of locality of Sailer's version of COLL.
However, in attempting to restrict the too powerful COLL version of Sailer (2003), Söhn has
perhaps achieved the opposite effect and made his analysis too restrictive. An important
detail in his theory is that, in contrast to Sailer (2003), there may be more than one
element on the COLL list, and the elements are represented as conjunctions or disjunctions.
This results from the fact that the information specified as the value of COLL is too
local, so that in some cases a single barrier is not enough to account for the
distributional restrictions of IEs and other barriers have to be added.
There is also a technical problem for which Söhn does not provide a formalized solution. It
concerns the dominance relation, which is so important for the analysis. A relation can
operate only on objects which are lower in the structure than the level at which it is
called. Therefore, a word cannot in principle specify a barrier which is higher in the
structure: when a relation like is_vp is invoked, it does not have access to this phrase,
as the phrase is higher in the tree. In this way each node which is higher in the structure
than the word could be its barrier. Söhn proposes two possible solutions to this problem
(p. 102), but does not develop either of them into a formal representation which accounts
for this part of his analysis.
An important characteristic of idiomatic expressions is the co-occurrence requirements of
their constituents. This motivates an analysis in which the head element of an IE can
select a particular word as its co-occurrence partner. A possible solution is the LEXEME
feature of Krenn and Erbach (1994), but as discussed in ch. 2, it has been criticized for
the problems it poses, and it is evidently not the proper mechanism for expressing this
kind of restriction. Therefore, Söhn introduces another feature for his theory: LISTEME.
The term listeme was introduced by Di Sciullo and Williams (1987) and represents the
property that a sign is listed in the lexicon. Söhn specifies LISTEME as a head feature,
more precisely below CATEGORY, so that its value is available for selection. Also, the
LISTEME value of a projection is the same as that of its head, because, as a HEAD feature,
its percolation is licensed by the HEAD-FEATURE PRINCIPLE. In order to handle
pronominalization, small changes are made in the theory so that a coindexed pronoun takes
not only the INDEX but also the LISTEME value from its antecedent. For non-coindexed
pronouns an additional feature PRO-LISTEME is introduced. The idea is that if a pronoun is
coindexed, its LISTEME value is the same as that of its antecedent, and if it is not
coindexed, the pronoun has a LISTEME value identical to its PRO-LISTEME, i.e. it, she and
so on. In the analysis, pronominalization is handled by an additional specification in the
structure of the LEs which defines a nominal element as referential or not. The handling of
pronominalization is an advantage over the theory of Sailer, which offers no mechanism for
this problem, so that impossible reference cannot be blocked.
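The interplay of LISTEME and PRO-LISTEME for pronouns can be made concrete with a short Python sketch. The names Nominal, Pronoun and verb_accepts are hypothetical, listemes are plain strings, and coindexation is modelled simply as a reference to the antecedent; the referential/non-referential distinction mentioned above is not modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Nominal:
    listeme: str


@dataclass
class Pronoun:
    pro_listeme: str                       # PRO-LISTEME: "it", "she", "they", ...
    antecedent: Optional[Nominal] = None   # set when the pronoun is coindexed

    @property
    def listeme(self) -> str:
        # Coindexed: share the antecedent's LISTEME; otherwise use PRO-LISTEME.
        if self.antecedent is not None:
            return self.antecedent.listeme
        return self.pro_listeme


def verb_accepts(required_listeme: str, obj) -> bool:
    """True if the object carries the LISTEME value the verb selects."""
    return obj.listeme == required_listeme


idiomatic_beans = Nominal("idiomatic beans")
coindexed = Pronoun("they", antecedent=idiomatic_beans)
free = Pronoun("they")

print(coindexed.listeme, "|", free.listeme)
print(verb_accepts("idiomatic beans", coindexed), verb_accepts("idiomatic beans", free))
```

A head that selects a particular LISTEME therefore accepts a pronoun only when the pronoun is coindexed with a nominal carrying that value.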
The analysis takes two possible directions, depending on the classification of the IE as
decomposable or non-decomposable.
We will first consider the analysis of the decomposable IEs.
As already discussed, the meaning of these elements can be decomposed into the meanings of
their constituents. If the words in such an IE are given their idiomatic meaning and are
represented by an additional lexical entry, then nothing prevents them from combining
semantically and syntactically into the required IE according to the regular principles of
grammar. But these words can usually occur with this meaning only within the IE, and this
property must be declared in their individual lexical entries. They also have their own
LISTEME value, which makes it natural to regard them as independent words. The question
then is how the LISTEME values of the idiomatic and the literal version of a word are
distinguished. This is done simply by numbering: for example, spill in the literal meaning
is the listeme spill1 and in the idiomatic meaning spill2. The distributional requirements
of the constituents of decomposable IEs are, of course, specified on their COLL lists as
barriers. Informally, the head of the phrase is required to subcategorize for a word, or a
projection of it, with a specific LISTEME value. The COLL list of the non-head of the IE
specifies the barrier within which the phrase must appear, and the listeme which heads this
barrier should also be the head of this complement daughter of the IE. Let us look at an
example which illustrates all this (Fig.15). If we distribute the meaning of the IVP spill
the beans among its parts, then spill is assigned the meaning 'reveal' and beans the
meaning 'secret, information'. The verb spill subcategorizes for the definite NP the beans.
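The numbering of listemes and the two-way dependency between the idiomatic words can be sketched as a small lexicon lookup in Python. Everything in the sketch is an assumption made for illustration: the names Entry, LEXICON and combine, the fields selects and licensed_by, the listemes beans1 and beans2 (numbered by analogy with spill1 and spill2), and the reduction of the COLL barrier to the listeme that must head it.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Entry:
    listeme: str
    meaning: str
    selects: Optional[str] = None       # LISTEME the verb requires of its object
    licensed_by: Optional[str] = None   # LISTEME that must head the noun's barrier


LEXICON: Dict[str, Entry] = {
    "spill1": Entry("spill1", "spill"),
    "spill2": Entry("spill2", "reveal", selects="beans2"),
    "beans1": Entry("beans1", "beans"),
    "beans2": Entry("beans2", "secret", licensed_by="spill2"),
}


def combine(verb: str, noun: str) -> Optional[str]:
    """Return a toy meaning for verb + object, or None if the pair is excluded."""
    v, n = LEXICON[verb], LEXICON[noun]
    if v.selects is not None and v.selects != n.listeme:
        return None   # the head does not find the listeme it subcategorizes for
    if n.licensed_by is not None and n.licensed_by != v.listeme:
        return None   # the noun's barrier is not headed by the required listeme
    return f"{v.meaning}({n.meaning})"


for verb in ("spill1", "spill2"):
    for noun in ("beans1", "beans2"):
        print(verb, noun, combine(verb, noun))
```

Only the purely literal pair and the purely idiomatic pair yield a meaning; the mixed pairs are excluded, one by the verb's selection and the other by the noun's licensing requirement.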
[ word
  PHON <spill>
  SS|LOC|CAT [ HEAD [ verb
                      LISTEME spill ]
               SUBCAT < NP, NP[ noun ] > ]
  COLL < complete-clause > ]
Fig.15 LE of the idiomatic version of spill

[ word
  PHON <make>
  SS|LOC [ CAT [ HEAD [ verb
                        LISTEME make3 ]
                 SUBCAT < NP, NP[ LOC ... ] > ]
           CONT [ MAIN make ] ] ]
with the object's lexical head:
[ PHON <difference>
  S|L|C|HEAD [ noun
               LISTEME difference ] ]

[ PHON <difference>
  SS|LOC [ CAT [ HEAD [ noun
                        LISTEME difference ]
                 SUBCAT < [ ] > ]
           CONT [ INDEX|PHI referential
                  MAIN difference ] ]
  COLL < complete-clause > ]
The analysis of non-decomposable IEs in Söhn's work closely follows the one proposed by
Sailer (2003). He also uses the term PLE to refer to the representation of an IE whose
meaning cannot be derived from the meanings of its parts. Again, the regular principles of
grammar do not apply to these entities: all such principles are changed so that they apply
only to phrases with an empty COLL list. Söhn also adopts the LEXICON PRINCIPLE, which
summarizes and licenses all kinds of lexical entries, PLEs and words alike, i.e. elements
which have a ne-list as the value of the COLL attribute. The COLL value of regular phrases,
and of words which are the output of DRs, is e-list. Söhn's argument for this common
treatment is that the set of all LISTEME values covers all elements in the lexicon, since
each LE or PLE has an individual LISTEME value. So, once the same trivial changes as in
Sailer (2003) are made to all principles, they do not apply to the PLEs, because their COLL
values are ne-lists. The distributional restrictions of the PLEs are specified by the way
their lexical entries are built. The representation of our non-decomposable examples in
Söhn's analysis is quite similar to that in Sailer's. One difference is the way
passivization is excluded. In the PLE for shoot the breeze (Fig.19), the breeze is
specified as an accusative object, which means that it cannot appear as the subject of a
passive sentence. Söhn assumes a trace analysis of extraction. Since DRs cannot apply to
PLEs, operations like topicalization and passivization cannot be blocked through them. In
the case of non-decomposable IEs, this is done instead by encoding the restrictions in the
DTRS structure of the PLEs.
[ phrase
  PHON [1] ⊕ [2]
  SS|LOC|CAT [ HEAD [ LISTEME shoot-the-breeze ]
               SUBCAT < NP[4], PP[with][5] > ]
  DTRS [ head-compl-struc
         H-DTR [ word
                 PHON [1] <shoot>
                 SS|LOC|CAT [ HEAD [3] [ verb
                                         LISTEME shoot2 ]
                              SUBCAT < [4], [5], [6] > ] ]
         N-DTR [ PHON [2]
                 SS [6] LOC|CAT [ HEAD noun
                                  SUBCAT < > ]
                 DTRS [ H-DTR [ PHON <breeze>
                                SS|LOC [ HEAD [ CASE acc
                                                LISTEME breeze2 ]
                                         SUBCAT ... ] ]
                        N-DTR [ PHON <the>
                                SS Det ] ]
                 COLL elist ] ]
  COLL nelist ]
Fig.19 PLE of shoot the breeze

[ phrase
  PHON [1] ⊕ [2]
  SYN|LOC|CAT [ HEAD [3]
                SC < NP[4], NP[5] > ]
  DTRS [ H-DTR [ word
                 SS|LOC|CAT [ HEAD [3] [ verb
                                         LISTEME take2 ]
                              SUBCAT < NP[4], NP[5], [6] > ] ]
         N-DTR [ phrase ... ] ]
  COLL nelist ]
Fig.20 PLE of take into account

[ adj
  CONT [ MAIN bright ]
  COLL nelist ]
Fig.21 PLE of spick and span
As we can see from spick and span (Fig.21) and from the frozen complement into
account (Fig.20), completely fixed elements are represented and treated as one unit, and
their phonology and relevant meaning are specified.
With this we can conclude our discussion of the analysis of Söhn (2005). Its problems appear
to be its restrictiveness and technical details whose formalization remains unclear.
5. Comparison
5.1 Classification of IE and Range of Data Analysed
After this short description of the three analyses, this chapter sketches the similarities
and differences, advantages and problems of the theories for handling IEs of Krenn and
Erbach (1994), Sailer (2003) and Söhn (2005). Let's summarize what we have already
discussed. All of the theories start by explaining the data and the range of phenomena that
the proposed analyses can handle.
Krenn and Erbach's work tries to account for quite a large variety of IEs. They start from
completely frozen ones, which receive a multi-word-lexeme representation, then discuss
those IEs which are unanalysable but show some level of flexibility, then quite briefly
mention how the metaphorical ones are to be analysed, and finally suggest an analysis of
support verb constructions.
The range of data discussed in Sailer's theory is not much different. He proposes a few
criteria of regularity and bases his theory on the distinction between internally regular and
internally irregular IEs. In his work, he concentrates mainly on IEs which have a VP
structure, but the analysis is so powerful that, as he points out in ch. 8, p. 348, we have
chances to capture even the most exotic context restrictions. Even though this sounds quite
promising and is presented as a great advantage of his work, the unrestrictiveness of the
analysis is what turns out to be the object of major critique.
Söhn suggests many more criteria for the classification of IEs, but builds his analysis on the
distinction between decomposable and non-decomposable ones. He takes into account
morphological, lexical, semantic and syntactic markers relevant to the idiomatic cases of
German with its quite complex syntax. In this respect, Söhn has considered more factors and
covered a greater area in explaining the characteristics of IEs. He also pays greater attention
to idiomatic VPs, but his analysis is general enough to handle other types of idiomaticity as
well.
The most important classification feature of IEs for all three analyses turns out to be their
compositionality. In Krenn and Erbach's paper the presence or lack of this property is
reflected in the classification of idioms into metaphorical and unanalysable. Much the same
criterion leads Söhn's analysis, where IEs are divided into decomposable and
non-decomposable. Sailer's approach additionally takes the syntactic regularity of the
structure of an IE into account. He distinguishes between internally regular IEs, which are
regular semantic and syntactic combinations once their constituents are attributed a
meaning, and internally irregular IEs, whose meaning has no relation to the meanings of the
constituents and/or which do not have a proper syntactic structure. It is not certain whether
we should think of this as a real difference: if an expression does not have a clear syntactic
structure, then the distribution of its meaning among its parts also becomes a problem. In
such a case everything comes down again to the compositionality of meaning.
Another issue in Krenn and Erbach's approach is that they provide an explicit analysis for
completely frozen idioms. The difference between completely frozen and more flexible IEs
turns out to be very important, as the first group receives a multi-word-lexeme treatment
and the second a unified syntactic analysis by subcategorization, in which only the semantic
part is individual for the different types of IEs.
Idiomatic            Classification of the IE according to the analysis of:
Expression           Krenn and Erbach (1994)      Sailer (2003)          Söhn (2005)
Shoot the breeze     Unanalysable                 Internally irregular   Non-decomposable
Take into account    Unanalysable                 Internally irregular   Non-decomposable
Spick and span       Unanalysable                 Internally irregular   Non-decomposable
Spill the beans      Metaphorical                 Internally regular     Decomposable
Make a difference    Support-verb constructions   Internally regular     Decomposable
Fig.22 Classification of IEs according to the different analyses
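The classification in Fig.22 can be restated as a small data structure, which makes it easy to check the observation that all three analyses single out the same group of IEs as lacking a compositional meaning. The following Python sketch is our own illustration: the variable and function names are hypothetical, and the labels are simply copied from the table.

```python
ANALYSES = ("Krenn and Erbach (1994)", "Sailer (2003)", "Söhn (2005)")

# Rows of Fig.22: one label per analysis, in the order given in ANALYSES.
CLASSIFICATION = {
    "shoot the breeze":  ("unanalysable", "internally irregular", "non-decomposable"),
    "take into account": ("unanalysable", "internally irregular", "non-decomposable"),
    "spick and span":    ("unanalysable", "internally irregular", "non-decomposable"),
    "spill the beans":   ("metaphorical", "internally regular", "decomposable"),
    "make a difference": ("support-verb construction", "internally regular", "decomposable"),
}

# The label each analysis uses for IEs without a compositional meaning.
NON_COMPOSITIONAL_LABEL = dict(
    zip(ANALYSES, ("unanalysable", "internally irregular", "non-decomposable"))
)


def non_compositional(analysis: str) -> set:
    """Return the idioms that the given analysis treats as non-compositional."""
    column = ANALYSES.index(analysis)
    label = NON_COMPOSITIONAL_LABEL[analysis]
    return {idiom for idiom, labels in CLASSIFICATION.items() if labels[column] == label}


groups = [non_compositional(analysis) for analysis in ANALYSES]
print(groups[0] == groups[1] == groups[2])  # True
```

The three analyses differ only in how finely they subdivide the remaining, compositional IEs: Krenn and Erbach separate metaphorical idioms from support verb constructions, while the other two analyses group them together.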
[Table fragment. Only the Söhn (2005) column is recoverable: PLE, PLE, PLE, Decomposable IE, Decomposable IE; the last row is Make a difference.]
attribute proves to be both technically and conceptually problematic. Here the analysis of
Sailer behaves quite differently. He specifies the context of occurrence of a word as a
complete-clause on the COLL list in its LE. But the analysis has been found too strong and
unrestrictive. Söhn takes ideas from both theories: from Krenn and Erbach (1994), the
ability of a head to select a particular word via the feature LISTEME; and from Sailer, the
COLL attribute, but in a much more restrictive version (Richter/Sailer, 1999), so that the
information on the list is local. So, LISTEME is used to account for the co-occurrence
restrictions of the idiomatic constituents, and COLL is responsible for specifying the barrier
of occurrence of the IE. If the disadvantage of Sailer's theory is that it is too powerful and
unrestrictive, the opposite holds for the analysis of Söhn: it may be too restrictive, because
the local information specified on the barriers might not fully account for all distributional
requirements of an IE. In such cases additional barriers must be specified. So, whether this
is the best way to handle the distributional restrictions of IEs is in question for both
theories. At least the technical side of Sailer's analysis appears to be correct, whereas
technical details in Söhn's theory (the dominance relation) remain unclear.
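The idea that a head selects a particular word via LISTEME can be sketched in a few lines of Python. This is a simplified illustration under our own assumptions, not the formalization of any of the three analyses: the classes Word and Selection, the function selects and the LISTEME values beans1, beans2 and peas1 are all hypothetical (the numbering merely imitates labels such as take2 in the figures).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Word:
    phon: str
    category: str
    listeme: str


@dataclass(frozen=True)
class Selection:
    """One element on a head's SUBCAT list (heavily simplified)."""
    category: str
    listeme: Optional[str] = None  # None: any lexical item of that category


def selects(requirement: Selection, candidate: Word) -> bool:
    """True if the candidate satisfies the head's selection requirement."""
    if candidate.category != requirement.category:
        return False
    return requirement.listeme is None or candidate.listeme == requirement.listeme


# Hypothetical lexical items: literal and idiomatic "beans" share their phonology
# but differ in their LISTEME value.
literal_beans = Word("beans", "noun", "beans1")
idiomatic_beans = Word("beans", "noun", "beans2")
peas = Word("peas", "noun", "peas1")

literal_spill_object = Selection("noun")              # any noun will do
idiomatic_spill_object = Selection("noun", "beans2")  # only the idiomatic beans

print(selects(idiomatic_spill_object, idiomatic_beans))  # True
print(selects(idiomatic_spill_object, peas))             # False
print(selects(literal_spill_object, peas))               # True
```

Selection of this kind only covers the direction from the head to its complement. Restricting where the idiomatic noun itself may occur is the task that the text assigns to COLL.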
list and with the use of relations. Some restrictions are also imposed by a specification in
the LE. The distributional restrictions of internally irregular IEs are specified in the
structure of the PLEs, and when this is not enough, COLL is again used to express the
possible environment of occurrence of the phrase. It should also be mentioned that Sailer
assumes a traceless analysis, and this is a factor in blocking topicalization and
relativization.
The idea of the COLL attribute is made more restrictive and local in Söhn's theory. The
barriers of occurrence can also be XPs and utterances, and not only complete-clauses as in
Sailer's analysis. Moreover, Sailer's version of COLL takes only one element as its value,
while the COLL list in Söhn (2005) can contain more than one, combined by disjunction or
conjunction. This follows from the fact that the more restrictive the barrier is, the more
restricted the context of appearance of a word becomes. Therefore, a single barrier might
not be enough to cover all possible occurrences of a word.
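Why a single local barrier may be insufficient can be made concrete with a small Python sketch. It rests entirely on our own simplifying assumptions: a barrier is modelled as the smallest dominating node of a given kind together with a LISTEME value that must occur inside it, and the names Node, Barrier, barrier_satisfied and coll_satisfied, as well as the values spill2, beans2 and cook1, are hypothetical. It is not the formalization given in Söhn (2005).

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence


@dataclass(frozen=True)
class Node:
    label: str                # e.g. "vp", "complete-clause", "utterance"
    listemes: FrozenSet[str]  # LISTEME values of the words the node dominates


@dataclass(frozen=True)
class Barrier:
    label: str             # kind of node that acts as the barrier
    required_listeme: str  # item that must occur inside that barrier


def barrier_satisfied(barrier: Barrier, dominating: Sequence[Node]) -> bool:
    """Check the smallest dominating node of the barrier's kind.

    `dominating` lists the nodes above the word, innermost first.
    """
    for node in dominating:
        if node.label == barrier.label:
            return barrier.required_listeme in node.listemes
    return False  # no node of that kind dominates the word


def coll_satisfied(coll: List[Barrier], dominating: Sequence[Node],
                   disjunctive: bool = True) -> bool:
    """Combine the barriers on a COLL list by disjunction or conjunction."""
    if not coll:
        return True  # assumption: an empty list imposes no restriction
    combine = any if disjunctive else all
    return combine(barrier_satisfied(b, dominating) for b in coll)


# Contexts of an idiomatic noun: inside the VP of its verb, outside any VP
# but in the same clause, and in a clause with an unrelated verb.
inside_vp = [Node("vp", frozenset({"spill2", "beans2"})),
             Node("complete-clause", frozenset({"spill2", "beans2"}))]
outside_vp = [Node("complete-clause", frozenset({"spill2", "beans2"}))]
unrelated = [Node("vp", frozenset({"cook1", "beans2"})),
             Node("complete-clause", frozenset({"cook1", "beans2"}))]

vp_only = [Barrier("vp", "spill2")]
vp_or_clause = [Barrier("vp", "spill2"), Barrier("complete-clause", "spill2")]

print(coll_satisfied(vp_only, inside_vp))        # True
print(coll_satisfied(vp_only, outside_vp))       # False
print(coll_satisfied(vp_or_clause, outside_vp))  # True
print(coll_satisfied(vp_or_clause, unrelated))   # False
```

With the VP barrier alone, the occurrence outside the VP is wrongly excluded; adding a second, wider barrier by disjunction admits it while still excluding the unrelated context.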
But here too, as in Sailer (2003), the distributional restrictions of decomposable IEs are
accounted for by stating the context of appearance on the COLL list and by some small
specifications in the LEs, while those of non-decomposable IEs are specified, analogously to
Sailer (2003), in the way the PLEs are built. In contrast to Sailer, Söhn assumes a trace
analysis, which is understandable bearing in mind the complex German syntax for which the
analysis was originally designed. He proposes that processes like passivization and
nominalization can be blocked for some decomposable IEs by additionally specifying that
the DRs do not apply to the relevant constituents.
An important advantage of Söhn (2005) over Sailer's work in accounting for the
distributional properties of IEs is that he proposes a mechanism to allow or block
pronominalization of a given nominal element. All that is required is an additional
specification in its structure stating whether or not it is referential.
5.7 Successfulness
Having briefly compared the analyses of Krenn and Erbach (1994), Sailer (2003) and
Söhn (2005), we can now summarize in which respects they are successful and in which
they are not. The theory of Krenn and Erbach (1994) faces many problems in handling the
distributional restrictions of IEs, and the other two approaches account better for the
phenomena discussed. While the approach of Sailer (2003) is criticized for being too
powerful and unrestrictive, the approach of Söhn can be criticized for being too restrictive.
On the other hand, the analysis of Sailer is at least technically successful, whereas Söhn,
although he tries to improve on Sailer's approach and comes up with good ideas, has left
technical details unclear. Perhaps solutions can be found for many of the problems present
in the three analyses. But if we have to draw a conclusion from the facts discussed up to
this point, it is that technically only the approach of Sailer turns out to be satisfactory,
and even though its unrestrictiveness puts the conceptual side of the theory in question, it
is capable of expressing all kinds of distributional restrictions.
6. Summary
In this paper we discussed three analyses of idiomatic expressions: those of Krenn and
Erbach (1994), Sailer (2003) and Söhn (2005). We explained the most important details of
each approach in that order. Krenn and Erbach propose a subcategorization analysis and
introduce the feature LEXEME in order to make selection for a particular word possible.
Their theory faces many technical and conceptual difficulties. Therefore, it was concluded
that their analysis is not the best way to account for the distributional requirements of IEs.
Sailer suggests an analysis in which a sign feature COLL is introduced. The distributional
requirements of IEs are specified as a context of occurrence on the list value of the new
attribute. The analysis is criticized for being too powerful and unrestrictive, as all kinds of
distributional restrictions could be explained as collocational.
Söhn combines concepts from both Krenn and Erbach (1994) and Sailer (2003). He gives
a modified version of LEXEME, the LISTEME feature, and restricts the information on the
COLL list to local information. The analysis proves too restrictive in some cases, and some
technical details are left unclear. As we have seen, even though many attempts have been
made to capture the different irregularities of idiomatic expressions, many problems still
await a solution, which is good motivation for future research.
BIBLIOGRAPHY
Erbach, Gregor (1992). Head-Driven Lexical Representation of Idioms in
HPSG. Universität des Saarlandes.
Krenn, Brigitte and Erbach, Gregor (1994). Idioms and Support Verb
Constructions. In J. Nerbonne, K. Netter, and C. Pollard (eds.), German in
Head-Driven Phrase Structure Grammar.
Nunberg, Geoffrey, Sag, Ivan A. and Wasow, Thomas (1994). Idioms.
Language 70, pp. 491-538.
Pollard, Carl and Sag, Ivan A. (1994).
Grammar.