
Frequency vs. iconicity in explaining grammatical asymmetries
MARTIN HASPELMATH*
Abstract
This paper argues that three widely accepted motivating factors subsumed
under the broad heading of iconicity, namely iconicity of quantity, iconicity
of complexity and iconicity of cohesion, in fact have no role in explaining
grammatical asymmetries and should be discarded. The iconicity accounts
of the relevant phenomena have been proposed by authorities like Jakobson,
Haiman and Givón, but I argue that these linguists did not sufficiently
consider alternative usage-based explanations in terms of frequency of use. A
closer look shows that the well-known Zipfian effects of frequency of use
(leading to shortness and fusion) can be made responsible for all of the
alleged iconicity effects, and initial corpus data for a range of phenomena
confirm the correctness of the approach.
Keywords: frequency; iconicity; markedness; economic motivation
1. Introduction
The notion of iconicity has become very popular in the last 25 years
among functional and cognitive linguists. In Croft's (2003: 102) words,
"the intuition behind iconicity is that the structure of language reflects in
some way the structure of experience." Iconicity is thus a very broad
notion, and it has been understood and applied in a great variety of ways
(see Newmeyer 1992: 23 for an attempt at a survey). In this paper,
I will examine just the three sub-types of (diagrammatic)¹ iconicity in
(1)–(3), which have played an important role in discussions of
grammatical asymmetries. I will argue that in fact none of these is relevant
for explaining grammatical asymmetries, and that the phenomena in
question should instead be explained by asymmetries of frequency of
occurrence.
Cognitive Linguistics 19–1 (2008), 1–33
DOI 10.1515/COG.2008.001
0936–5907/08/0019–0001
© Walter de Gruyter
(1) Iconicity of quantity
Greater quantities in meaning are expressed by greater quantities
of form.
Example: In Latin adjective inflection, the comparative and
superlative denote increasingly higher degrees and are coded by
increasingly longer suffixes (e.g., long(-us) 'long', long-ior 'longer',
long-issim(-us) 'longest').
(2) Iconicity of complexity
More complex meanings are expressed by more complex forms.
Example: Causatives are more complex semantically than the
corresponding non-causatives, so they are coded by more complex forms,
e.g., Turkish düş(-mek) 'fall', causative düş-ür(-mek) 'make fall,
drop'.
(3) Iconicity of cohesion
Meanings that belong together more closely semantically are ex-
pressed by more cohesive forms.
Example: In possessive noun phrases with body-part terms, the
possessum and the possessor are conceptually inseparable. This is
mirrored in greater cohesion of coding in many languages, e.g.,
Maltese id 'hand', id-i 'my hand', contrasting with siġġu 'chair', is-siġġu
tiegħ-i [the-chair of-me] 'my chair' (*siġġ(u)-i).
While iconicity of quantity is mentioned rarely, iconicity of complexity
and iconicity of cohesion are often invoked in the functional and
cognitive literature (and recently to some extent also in the generative
literature; see §4.5). Both have been applied to a wide range of grammatical
phenomena by many different authors.
I argue in this paper that these three types of iconicity play no role
in explaining grammatical asymmetries of the type long(-us)/long-ior,
düş(-mek)/düş-ür(-mek), id-i/siġġu tiegħ-i. Instead, such formal
asymmetries can and should be explained by frequency asymmetries: In all
these cases, the shorter and more cohesive expression types occur
significantly more frequently than the longer and less cohesive expression types,
and this suffices to explain their formal properties. No appeal to iconicity
is necessary. Worse, iconicity often makes wrong predictions, whereas
frequency consistently makes the correct predictions.
I want to emphasize that I make no claims about other types of
iconicity, such as
- iconicity of paradigmatic isomorphism (one form, one meaning in the
  system, i.e., synonymy and homonymy are avoided; Haiman 1980; Croft
  1990a: 165, 2003: 105);
- iconicity of syntagmatic isomorphism (one form, one meaning in the
  string, i.e., empty, zero and portmanteau morphs are avoided; Croft
  1990a: 165, 2003: 103);²
- iconicity of sequence (sequence of forms matches sequence of
  experiences; e.g., Greenberg 1963 [1966: 103]);
- iconicity of contiguity (forms that belong together semantically occur
  next to each other; this is similar to iconicity of cohesion, but different in
  crucial ways, cf. §5);
- iconicity of repetition (repeated forms signal repetition in experience,
  as when reduplication expresses plurality or distribution).
For most of these iconicity types, frequency is clearly not a relevant
factor, and I have no reason to doubt the conventional view that the
relevant phenomena are motivated by functional factors that can be con-
veniently subsumed under the label iconicity. Whether these functional
factors can be reduced to a general preference for iconic over noniconic
patterns is a separate question that I will not pursue here.
I also need to emphasize that I am interested in explanation of gram-
matical structures, perhaps more so than many other authors that have
discussed iconicity. That is, I want to know why language structure is
the way it is, whereas some authors seem to be content with observing
that language structure is sometimes iconic:
The traditional view of language is that most relationships between linguistic units
and the corresponding meanings are arbitrary . . . But the cognitive claim is that
the degree of iconicity in language is much higher than has traditionally been
thought to be the case. (Lee 2001: 77)
As long as one merely observes that cases like long(-us)/long-ior and
düş(-mek)/düş-ür(-mek) can be regarded as iconic in some way, I have
no problem. What I am denying is that iconicity plays a motivating role
and should be invoked in explaining why the patterns are the way they are.
What I observed while reading the literature on iconicity is that a
number of authors (e.g., Hockett 1958: 577–578; Givón 1985, 1991) seem to
use the term iconicity as a kind of antonym of arbitrariness, so that
almost anything about language structure that is not arbitrary falls under
iconicity. I am in broad sympathy with Givón's general account of the
relation between arbitrariness and non-arbitrariness in language, but I
would insist on the need to identify the relevant factors as precisely as
possible and to make testable predictions. It is quite possible that the dis-
agreements about the role of frequency vs. iconicity will eventually turn
out to be less severe than it may seem at the beginning, but in any event
this paper should help to clarify the issues.
Iconicity and the frequency asymmetries discussed here are universal
explanatory factors, so their effects should be universal. This means that
in principle confirming data could come from any language, and ideally
the data should come from a large representative sample of languages.
Such data are still not very widely available, so this paper will continue
the practice of Haiman (1983) (and much other work) of making claims
about universal asymmetries that are not fully backed up by confirming
data, but that nevertheless seem very plausible because of the apparent
absence of counterevidence. Likewise, disconfirming data could come
from any language, but of course isolated counterexamples are not
sufficient to show that no systematic coding asymmetry exists. Many of the
generalizations cited here are known to be merely strong tendencies, not
absolute universals.
The remainder of this paper is organized as follows: §2 discusses
iconicity of quantity, §§3–4 discuss iconicity of complexity, and §§5–6 discuss
iconicity of cohesion. For each subtype of iconicity, I will first cite
authors who have advocated it and mention examples of phenomena that
are allegedly motivated by iconicity, before presenting my arguments for
a frequency-based explanation of the phenomena. The final §7 presents
the conclusions.
2. Iconicity of quantity
2.1. Advocates and examples
Iconicity of quantity was defined in §1 as follows:
(4) Greater quantities in meaning are expressed by greater quantities of
form.
It seems that the first author to mention this motivating principle was
Jakobson (1965[1971: 352]) and (1971). Jakobson cited three examples:
(i) In many languages, "the positive, comparative and superlative
degrees of adjectives show a gradual increase in the number of phonemes,
e.g., high-higher-highest, [Latin] altus, altior, altissimus. In this way, the
signantia reflect the gradation gamut of the signata" (1965[1971: 352]).
The higher the degree, the longer the adjective.
(ii) "The signans of the plural tends to echo the meaning of a numeral
increment by an increased length of the form" (1965[1971: 352]). The
more referents, the more phonemes (e.g., singular book, plural books,
French singular je finis 'I finish', plural nous finissons 'we finish').
(iii) In Russian, the perfective aspect expresses a limitation in the
extent of the narrated event, and it is expressed by a more limited (i.e.,
a smaller) number of phonemes (e.g., perfective zamoroz-it', imperfective
zamoraž-ivat' 'freeze') (Jakobson 1971).
Iconicity of quantity is mentioned approvingly in Plank (1979: 123),
Haiman (1980: 528–529, 1985: 5), Anttila (1989: 17), in Taylor's (2002:
46) Cognitive Grammar textbook, and in Itkonen (2004: 28); see also
Lakoff and Johnson (1980: 127).
2.2. Frequency-based explanation
Any efficient sign system in which costs correlate with signal length will
obey the following economy principle:³
(5) The more predictable a sign is, the shorter it is.
Since frequency implies predictability, we also get the following
prediction for efficient sign systems:
(6) The more frequent a sign is, the shorter it is.
These principles have been well known at least since Horn's (1921) and
Zipf's (1935) work, but somehow under the influence of the structuralist
movements many linguists lost sight of them for a few decades. However,
more recently cognitively oriented linguists have begun to appreciate the
importance of frequency again (e.g., Bybee and Hopper 2001, among
many others). I do not claim to have original insights about the way in
which frequency influences grammatical structures, but I want to argue
that iconicity turns out to be less important as an explanatory concept if
one gives frequency the explanatory role that it deserves.
Principle (6) straightforwardly explains Jakobson's observations about
adjectival degree marking and singular/plural asymmetries, because
universally comparative and superlative forms are significantly rarer than
positive forms of adjectives, and plural forms are significantly rarer than
singular forms (see Greenberg 1966: 34–37, 40–41). It is not possible to
make such a universal statement about perfective and imperfective aspect,
and the frequency of these aspectual categories depends much more on
the lexical meaning of the individual verb. But for Russian, Fenk-Oczlon
(1990) has shown that there is a strong correlation between length and
frequency of a verb form: in general, the more frequent member of a Rus-
sian aspectual pair is also shorter.
This frequency-based explanation is not only sufficient to account
for the phenomena cited by Jakobson, but also necessary, because the
principle of iconicity of quantity makes many wrong predictions (as
was also observed by Haiman 2000: 287). For example, it predicts that
plurals should generally be longer than duals, that augmentatives should
generally be longer than diminutives, that words for 'ten' should be
longer than words for 'seven', or even that words for 'long' should
be longer than words for 'short', or that words for 'elephant' should be
longer than words for 'mouse'. None of these predictions are generally
correct (except perhaps for the last prediction, but note that mouse is
about twice as frequent as elephant in English).⁴
Iconicity of quantity has never been considered particularly important,
and its refutation here is only a prelude to the refutation of the other two
kinds of iconicity in §§3–6.
3. Iconicity of complexity: Advocates and examples
Iconicity of complexity was defined in §1 as follows:
(7) More complex meanings are expressed by more complex forms.
Here are some quotations from the literature that describe this principle
and refer to it as isomorphic or iconic.
- Lehmann (1974: 111): "Je komplexer die semantische Repräsentation
  eines Zeichens, desto komplexer seine phonologische Repräsentation."
  ('The more complex the semantic representation of a sign is,
  the more complex is its phonological representation.')
- Mayerthaler (1981: 25): "Was semantisch 'mehr' ist, sollte auch
  konstruktionell 'mehr' sein." ('What is "more" semantically should also
  be "more" constructionally.')
- Givón (1991: §2.2): "A larger chunk of information will be given a
  larger chunk of code."
- Haiman (2000: 283): "The more abstract the concept, the more
  reduced its morphological expression will tend to be. Morphological
  bulk corresponds directly and iconically to conceptual intension."
- Langacker (2000: 77): "[I]t is worth noting an iconicity between of's
  phonological value and the meaning ascribed to it (cf. Haiman 1983).
  Of all the English prepositions, of is phonologically the weakest by
  any reasonable criterion. . . . Now as one facet of its iconicity, of is
  arguably the most tenuous of the English prepositions from the
  semantic standpoint as well . . ."
In Lehmann's (1974) approach, semantic complexity is measured by
counting the number of features needed to describe the meaning of an
expression. A contrast between presence and absence of a semantic feature
is often called semantic markedness, and very often iconicity of
complexity is described as a kind of iconicity of markedness matching:
(8) Marked meanings are expressed by marked forms.
This principle was already formulated by Jakobson (1963[1966: 270]),
and repeated many times in the later literature, e.g.,
- Plank (1979: 139): "Die formale Markiertheitsopposition bildet die
  konzeptuell-semantische Markiertheitsopposition d[iagrammatisch]-
  ikonisch ab." ('The formal markedness opposition mirrors the
  conceptual-semantic markedness opposition in a diagrammatically
  iconic way.')
- Haiman (1980: 528): "Categories that are marked morphologically
  and syntactically are also marked semantically."
- Mayerthaler (1987: 48–9): "If (and only if) a semantically more
  marked category C_j is encoded as more featured [= formally complex]
  than a less marked category C_i, the encoding of C_j is said to be
  iconic."
- Givón (1991: 106, 1995: 58): "The meta-iconic markedness principle:
  Categories that are cognitively marked – i.e., complex – tend also to
  be structurally marked."
- Aissen (2003: 449): "Iconicity favors the morphological marking of
  syntactically marked configurations."
For similar statements, see also Zwicky (1978: 137), Matthews (1991:
236), Newmeyer (1992: 763), and Levinson (2000: 136–137).
By "formally marked", these authors generally mean "expressed
overtly". Typical examples of such markedness matching are given in
(9).
(9)            less marked/unmarked            (more) marked
    number     singular (tree-Ø)               plural (tree-s)
    case       subject (Latin homo-Ø)          object (homin-em)
    tense      present (play-Ø)                past (play-ed)
    person     third (Spanish canta-Ø)⁵        second (canta-s)
    gender     masculine (petit-Ø)             feminine (petit-e)
    causation  non-causative                   causative
               (Turkish düş-Ø-mek 'fall')      (düş-ür-mek 'fell, drop')
    object     inanimate                       animate
               (Spanish Veo la casa            (Veo a la niña
               'I see the house')              'I see the girl.')
That there are universal formal asymmetries in these (and many other)
categories has been known since Greenberg (1966), and Jakobson
(1963[1966]) and (1965[1971]) explicitly refers to Greenberg's
cross-linguistic work. However, Greenberg did not invoke iconicity to explain
the formal asymmetries of the kind illustrated in (9). He had good
reasons, as we will see in the next section.
4. Iconicity of complexity: frequency-based explanation
4.1. Complex/marked expressions are rarer
Greenberg's (1966) explanation was in terms of the frequency
asymmetries in the use of the grammatical forms. He noted that less marked
forms are more frequent, and more marked forms are less frequent
across languages. Thus, the economy principles in (5)–(6) are sufficient
to explain the asymmetries in (9) (see also Croft 2003: 110–117). The
English preposition of is not only the most semantically "tenuous"
(Langacker 2000: 77), but also the most frequent of all the English
prepositions. Singulars are more frequent than plurals, nominatives are more
frequent than accusatives, the present tense is more frequent than the
past tense, the third person is more frequent than other persons, and
the masculine is more frequent than the feminine. All of this was
documented by Greenberg (1966) for a few selected languages, and the
hypothesis that it holds universally has not been challenged. That
causatives are generally less frequent than the corresponding non-causatives
is also clear; I discuss this case in more detail below (§4.4). And among
objects, inanimate referents are much more frequent than animate
referents (§4.5).
This frequency-based explanation is not only sufficient to account for
the relevant phenomena, but also necessary, because iconicity of
complexity makes some wrong predictions. In (10), I list cases that go in the
opposite direction of the patterns in (9).
(10)           less marked/unmarked            (more) marked
    number     plural                          singular
               Welsh plu 'feathers'            plu-en 'feather'
    case       object case                     subject case
               Godoberi mak'i 'child'          mak'i-di (ergative)
    person     second p. imperative            third p. imperative
               Latin canta-Ø 'sing!'           canta-to 'let her sing'
    gender     female                          male
               English widow-Ø                 widow-er
    causation  causative                       noncausative
               German öffnen                   sich öffnen
In all these cases, frequency makes the right predictions. Plurals like
Welsh plu 'feathers' are more frequent than singulars (Tiersma 1982), in
the imperative mood the second person is more frequent than the third
person, the word widow is more frequent than the word widower, and
with verbs like 'open', the causative is more frequent than the
noncausative (see §4.4).
These exceptions have long been known in the literature, but linguists
have often described them in terms of markedness reversal. The idea is
that markedness values can be different in different contexts, so that, for
example, third person is not absolutely unmarked with respect to second
person, but in certain contexts second person can be unmarked and first
person can be marked (e.g., Waugh 1982; Tiersma 1982; Witkowski and
Brown 1983; Haiman 1985: 148–149; Croft 1990a: 66). But in order to
reconcile the cases in (10) with iconicity of complexity, one would have
to show that not only the formal coding, but also the semantic/functional
markedness value has changed. This is much more difficult, and it has
not been shown that it is generally true that in cases of markedness
reversal, the formally unmarked term of the opposition is also semantically or
functionally unmarked. For example, Tiersma's (1982) main additional
evidence that locally unmarked plurals like Welsh plu 'feathers' are
generally unmarked (i.e., do not merely show reversed formal coding) is
that in analogical leveling, the plural survives. But analogical leveling is
of course just another symptom of frequency of occurrence (cf. Bybee
1985: Ch. 3).
To make matters even more complex, some authors seem to mean
frequency when they say (functional) unmarkedness: "Marked" means "rare",
and "unmarked" means "frequent". For example, in a discussion of
unmarked plurals, Haiman writes:
. . . what is fundamentally at issue is markedness. Where plurality is the norm, it
is the plural which is unmarked, and a derived marked singulative is employed
to signal oneness: thus, essentially, wheat vs. grain of wheat. (Haiman 2000:
287)
"The norm" is of course the same as the more frequent situation, so what
is fundamentally at issue is frequency. Linguists are of course free to
define their terms in whatever way they wish, but claiming not only that
formally marked elements tend to be functionally marked (in the sense
of being less frequent), but also that this is a surprising instance of
"markedness matching" (or iconicity), is not helpful. The much simpler
observation is that formally marked elements tend to be less frequent, and
this observation is straightforwardly explained by the economy
principles in (5)–(6). Neither iconicity nor markedness are relevant
concepts in stating and explaining these facts (see Haspelmath 2006 for
detailed argumentation that a notion of markedness is superfluous in
linguistics).
The contrasts in (9) show zero expression vs. overt expression, but
some authors such as Lehmann (1974) and Haiman (2000) also talk
about length differences between different types of morphemes. In
particular, both authors note that grammatical morphemes are universally
shorter than lexical morphemes, and they claim that this iconically
mirrors their more abstract or less complex meaning. But again frequency
and economy account for the same facts. Iconicity makes the wrong
prediction that lexical items with highly abstract or simple meanings
should be consistently shorter than items with more concrete or complex
meanings (as noted by Ronneberger-Sibold 1980: 239). It predicts, for
example, that entity should be shorter than thing or action, that animal
should be shorter than cat, that perceive should be shorter than see, and
so on.⁶
4.2. Relative frequency and absolute frequency
It is important to recognize that the relevant type of frequency for the
purposes of this paper is relative frequency, not absolute frequency (cf.
Corbett et al. 2001 for some discussion of this contrast). That is, what I
am looking at here is the relation between the frequency of one category
and the frequency of another category (within a class of lexemes or a
construction): e.g., the relation between the frequency of singulars and
the frequency of plurals (in nouns), the relation between the frequency of
positive forms and the frequency of comparative forms (in adjectives), the
relation between the frequency of inanimate objects and the frequency of
animate objects (in transitive verb phrases), and so on.
I am not looking at the absolute frequencies of individual lexemes with
a particular category. The absolute frequency of English books, the plural
of book, is 131 (occurrences per million words, Leech et al. 2001), while
the singular of notebook occurs only 8 times. But the singular and the
plural should not be compared across different lexemes. The relative
frequencies are as expected: book 243, books 131, notebook 8, notebooks 3.
Likewise for positives and comparatives: the comparative lower occurs
111 times, and the positive bright occurs only 54 times. But the propor-
tions (i.e., relative frequencies) are as expected: low 158, lower 111, bright
54, brighter 5.
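The difference between absolute and relative frequency can be made concrete with a small calculation. The following is only an illustrative sketch: it uses the per-million counts quoted above from Leech et al. (2001), while the function name `share_of_short_form` and the dictionary layout are assumptions introduced here and are not part of the original analysis.

```python
# Per-million frequencies quoted in the text (Leech et al. 2001).
# Each entry pairs the shorter form with the longer, overtly coded form.
PAIRS = {
    "book/books": (243, 131),
    "notebook/notebooks": (8, 3),
    "low/lower": (158, 111),
    "bright/brighter": (54, 5),
}


def share_of_short_form(short_count, long_count):
    """Relative frequency of the shorter form within its own pair."""
    return short_count / (short_count + long_count)


# Relative frequency is computed within each lexeme, never across lexemes.
shares = {
    pair: share_of_short_form(short_n, long_n)
    for pair, (short_n, long_n) in PAIRS.items()
}

for pair, share in shares.items():
    print(f"{pair}: shorter form accounts for {share:.0%} of the pair")
```

Comparing absolute counts across lexemes (plural books at 131 against singular notebook at 8) would suggest the opposite pattern, which is why only the within-pair proportions are computed here.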
What is crucial is that the items whose frequency and formal expression
are compared are paradigmatic alternatives, i.e., that in some sense they
must occur in the same slot. It is in such slots that expectations arise, so
that more frequent items can make do with shorter coding because of
their greater predictability. If two items are not paradigmatically related,
it does not make so much sense to compare their frequency.
Another question is how big the frequency difference should be to be
reflected in grammar. The answer is: significant. Perhaps one would see
bigger differences in form where the frequency differences are bigger, but
this is an issue that I do not pursue in this paper.
4.3. Adjectives and abstract nouns: Resolving an iconicity paradox
Croft and Cruse (2004: 175) observe a curious iconicity paradox in
connection with adjectives such as those in (11) and the corresponding
abstract nouns:
(11) long leng-th
deep dep-th
high heigh-t
thick thick-ness
They note that definitions of such adjectives presuppose a scale of length,
depth, height, or thickness that is expressed by an abstract noun. Thus,
long means something like 'noteworthy in terms of length' (cf. also
Mel'čuk 1967). This abstract noun is thus conceptually simpler than the
adjective, and yet it tends to be morphologically more complex across
languages. The situation in (11) thus appears to run counter to the
principle that "morphological complexity mirrors cognitive complexity" (Croft
and Cruse 2004: 175).
Croft and Cruse try to solve the paradox, but do not seem to be very
confident in their solution:
One possible explanation is that, in applying the iconic principle, we should
distinguish between structural complexity (in terms of the number of elementary
components and their interconnections) and processing complexity (in terms of
the cognitive effort involved). Perhaps they are acquired first of all in an
unanalyzed, primitive, Gestalt sense, which is basically relative. Maybe in order to
develop the full adult system, analysis and restructuring are necessary. Some of
the results of the analysis may well be conceptually simpler in some sense than
the analysand, but the extra effort that has gone into them is mirrored by the
morphological complexity. (Croft and Cruse 2004: 175)
But in fact, no solution to the paradox is required, because it is a
pseudo-paradox: There is no principle that "morphological complexity
mirrors cognitive complexity". As we saw, morphological complexity
(in the sense of length) mirrors rarity of use. It is easy to determine that
adjectives are significantly more frequent than the corresponding abstract
nouns. In (12), frequency figures from Leech et al. 2001 are given (the
figures again indicate occurrences per million words). The example of
beautiful/beauty shows that isolated exceptions to the coding regularity
are possible.⁷
(12) long 392 leng-th 85
deep 97 dep-th 41
high 547 heigh-t 47
thick 51 thick-ness <10
beautiful 87 beauty 44
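The comparison in (12) can be restated as a short check. This is a hedged sketch rather than part of the original argument: the figures are those given in (12), the value 10 for thickness is only the upper bound implied by "<10", and orthographic length is used as a rough stand-in for length of form. Both of these choices, like the variable names, are assumptions made for illustration.

```python
# Per-million frequencies from (12) (Leech et al. 2001).
# "thickness" is reported only as "<10"; 10 is used as an upper bound (assumption).
ROWS = [
    # (adjective, abstract noun, adjective frequency, noun frequency)
    ("long", "length", 392, 85),
    ("deep", "depth", 97, 41),
    ("high", "height", 547, 47),
    ("thick", "thickness", 51, 10),
    ("beautiful", "beauty", 87, 44),
]

# Pairs in which the adjective is the more frequent member.
adjective_more_frequent = [
    adj for adj, noun, adj_freq, noun_freq in ROWS if adj_freq > noun_freq
]

# Pairs in which the more frequent member is NOT the shorter one.
# Orthographic length is a crude proxy for form length (assumption).
coding_exceptions = [
    adj
    for adj, noun, adj_freq, noun_freq in ROWS
    if (adj_freq > noun_freq) != (len(adj) < len(noun))
]

print("adjective more frequent:", adjective_more_frequent)
print("exceptions to 'more frequent = shorter':", coding_exceptions)
```

On these figures the adjective is the more frequent member in every pair, and beautiful/beauty is the only pair in which the more frequent member is not also the shorter one.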
4.4. The inchoative-causative alternation: Economy instead of iconicity
In §3 and §4.1, we saw that pairs of noncausative (inchoative) and
causative verbs are not uniformly coded: Sometimes the causative is coded
overtly, based on the inchoative (e.g., Turkish düş-Ø-mek 'fall', düş-ür-mek
'fell, drop'), and sometimes the inchoative is coded overtly, based
on the semantically causative verb. Such cases are called anticausatives
(e.g., German öffnen 'open (tr.)', sich öffnen 'open (intr.)'; Russian
otkryvat' 'open (tr.)', otkryvat'-sja 'open (intr.)').
On the natural assumption that causatives have an additional meaning
element (i.e. Russian otkryvat'sja means 'become open', and otkryvat'
means 'cause to become open'), anticausative coding would be
counter-iconic (as was observed by Mel'čuk 1967). This was seen as a problem
by Haspelmath (1993), who assumed the iconicity-of-complexity principle
(as well as markedness matching). However, Haspelmath found in a
cross-linguistic study that different verb pairs tend to behave differently
with respect to which member of the pair (the inchoative or the causative)
tends to be coded overtly (cf. also Croft 1990b). Some verb meanings
(which for convenience will be called automatic) tend to be coded as
causatives (e.g., 'freeze', 'dry', 'sink', 'go out', 'melt'), whereas others (which
for convenience will be called costly) are preferably coded as
anticausatives (e.g., 'split', 'break', 'close', 'open', 'gather'). The idea behind the
terms automatic and costly is that the automatic events do not often
require input from an agent to occur, whereas the costly events tend not
to occur spontaneously but must be instigated by an agent. While the
automatic events conform to iconicity, it is especially the costly events that
do not. Haspelmath tried to save the iconicity hypothesis by suggesting
that in some way the frequency of occurrence of a particular event
description is reflected in the way its meaning is treated by speakers:
Iconicity in language is based [not on objective meaning but] on conceptual
meaning . . . Events that are more likely to occur spontaneously will be associated
with a conceptual stereotype (or prototype) of a spontaneous event, and this will
be expressed in a structurally unmarked way. (Haspelmath 1993: 106–107)
This move is reminiscent of Lehmann's suggestion that rarity results in
a high informational value and therefore somehow in high semantic
complexity (cf. note 6), and of the desperate attempt by Croft and Cruse
to solve their iconicity paradox.
Fortunately, a much simpler explanation is available in which iconicity
of complexity plays no role, and the coding preferences are explained
in terms of economy: Automatic verb meanings tend to occur more fre-
quently as inchoatives than costly verb meanings, which tend to occur
more frequently as causatives. Due to economic motivation, the rarer
elements tend to be overtly coded. Wright (2001: 127–128) presents some
preliminary corpus evidence from English, as shown in Table 1:
Thus, inchoatives and causatives behave in much the same way as singu-
lars and plurals: Whichever member of the pair occurs more frequently
tends to be zero-coded, while the rarer (and hence less expected) member
tends to be overtly coded. Language-particular differences often obscure
this picture (e.g., languages that never have overtly coded singulars, or
languages lacking overtly coded causatives), which emerges fully only
once a typological perspective is adopted.
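The claim here is a relative one: automatic meanings occur as causatives less often than costly meanings do. It can be checked against the English percentages that Wright reports in Table 1. The sketch below is illustrative only. The percentages are those of Table 1, and the split into automatic and costly follows the verb lists given earlier in this section (burn is not classified there and is left out of the comparison); the variable names are assumptions introduced for the example.

```python
# Percentage of transitive (= causative) uses, from Table 1 (Wright 2001).
PCT_TRANSITIVE = {
    "freeze": 62,
    "dry": 61,
    "melt": 72,
    "burn": 76,
    "open": 80,
    "break": 90,
}

# Classification taken from the verb lists in the text.
# "burn" does not appear in either list, so it is left unclassified.
AUTOMATIC = {"freeze", "dry", "melt"}
COSTLY = {"open", "break"}

# Rank verbs from least to most often causative.
ranked = sorted(PCT_TRANSITIVE, key=PCT_TRANSITIVE.get)

max_automatic = max(PCT_TRANSITIVE[verb] for verb in AUTOMATIC)
min_costly = min(PCT_TRANSITIVE[verb] for verb in COSTLY)

print("ranking by % transitive:", ranked)
print("every automatic verb below every costly verb:", max_automatic < min_costly)
```

Note that all six verbs are transitive in more than half of their occurrences, so the comparison that matters for the argument is between verb classes, not against a fixed 50% threshold.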
4.5. Differential object marking: Economy instead of iconicity
It has long been observed (e.g., Blansitt 1973; Comrie 1989; Bossong
1985, 1998) that the overt coding of a direct object often depends on
its animacy, and that such variation in object-marking can be subsumed
under a general rule:
(13) The higher a (direct) object is on the animacy scale, the more likely
it is to be overtly coded (i.e., accusative-marked).
According to Comrie, this is because animate objects are not as natural
as inanimate objects:
. . . the most natural kind of transitive construction is one where the A[gent] is
high in animacy and definiteness and the P[atient] is lower in animacy and
definiteness; and any deviation from this pattern leads to a more marked
construction. (Comrie 1989: 128)

Table 1. Percentage of transitive (= causative) occurrences of some English
inchoative-causative verb pairs

verb pair    % transitive
freeze       62%            (more causatives)
dry          61%
melt         72%
burn         76%
open         80%
break        90%            (more anticausatives)
In an interesting paper that tries to integrate insights from the
functional-typological literature into an Optimality Theory (OT)
framework, Aissen (2003: §3) proposes an account that appeals to a fixed
constraint subhierarchy involving local conjunction of a markedness
hierarchy of relation/animacy constraints (cf. 14) with a constraint against
non-coding (*Ø_Case):
(14) markedness subhierarchy:
     *Obj/Hum ≫ *Obj/Anim ≫ *Obj/Inan
The resulting fixed constraint subhierarchy is shown in (15). Roughly this
can be read as follows: Structures with zero-coded human objects are
worse than structures with zero-coded animate objects, and these in turn
are worse than structures with zero-coded inanimate objects.
(15) *Obj/Hum & *Ø_Case ≫ *Obj/Anim & *Ø_Case ≫ *Obj/Inan & *Ø_Case
Aissen motivates these constraints by appealing to markedness matching
and iconicity:
The effect of local conjunction here is to link markedness of content (expressed by
the markedness subhierarchy) to markedness of expression (expressed by *Ø).
That content and expression are linked in this way is a fundamental idea of
markedness theory (Jakobson 1939; Greenberg 1966). In the domain of Differential
Object Marking, this is expressed formally through the constraints [in (15)]. Thus
they are iconicity constraints: they favor morphological marks for marked
configurations. (Aissen 2003: 449)
Combined with economy constraints (*Struc), these constraints allow
Aissen to describe all and only the attested language types in her
framework.
However, a much more straightforward explanation of the Differential
Object Marking universal is available: Inanimate NPs occur more
frequently as objects, whereas animate NPs occur more frequently as
subjects. Due to economic motivation, the rarer elements tend to be overtly
coded. This explanation has in fact long been known (Filimonova 2005
cites antecedents in the 19th century), though actual frequency evidence
has been cited only more recently (see Jäger 2004).⁸
Thus, no appeal to markedness matching or iconicity is needed, nor is
Aissen's elaborate machinery of OT constraints needed to explain
Differential Object Marking.
5. Iconicity of cohesion: Advocates and examples
Iconicity of cohesion was defined in §1 as follows:
(16) Meanings that belong together more closely are expressed by more
cohesive forms.
Iconicity of cohesion is discussed in detail by Haiman (1983) under the
label "iconic expression of conceptual distance" ("The linguistic distance
between expressions corresponds to the conceptual distance between
them", Haiman 1983: 782).⁹ What he means by linguistic distance is
made clear by the scale in (17), where (a)–(d) show diminishing linguistic
distance (in my terms, increasing cohesion).
(17) Haiman's (1983: 782) cohesion scale
a. X word Y (function-word expression)
b. X Y (juxtaposition)
c. X-Y (bound expression)
d. Z (portmanteau expression)
I prefer the term cohesion to distance for this scale, because (b) and (c) do
not literally differ in distance, and distance is not really applicable to (d).
Moreover, I want to distinguish strictly between cohesion and contiguity.
That there is a functionally motivated preference for contiguity, i.e.,
for elements that belong together semantically to occur next to each
other in speech, is beyond question (see also Hawkins 2004: Ch. 5).
Newmeyer's (1992: 761–762) discussion of "iconicity of distance" (and
similarly Givón's (1985: 202, 1991: 89) "proximity principle") conflate
cohesion and contiguity. I only argue against an iconicity-based
explanation of phenomena related to cohesion.
The following four examples of iconicity of cohesion are the most
important cases cited in the literature:
(i) Possessive constructions: Inalienable possession shows at least the
same degree of cohesion as alienable possession, because in inalienable
possession (i.e., possession of kinship and body part terms) the possessor
and the possessum belong together more closely semantically (Haiman
1983: 793–795, 1985: 130–136; see also Koptjevskaja-Tamm 1996). An
example:
(18) Abun (West Papuan; Berry and Berry 1999: 77–82)
a. ji bi nggwe
   I of garden
   'my garden'
b. ji syim
   I arm
   'my arm'
(ii) Causative constructions: Causative constructions showing a greater
degree of cohesion tend to express direct causation (where cause and
result belong together more closely), whereas causative constructions
showing less cohesion tend to express indirect causation (Haiman 1983:
783–787; cf. also Comrie 1989: 172–173; Dixon 2000: 74–78). The
following example is cited by Dixon (2000: 69):
(19) Buru (Austronesian; Indonesia; Grimes 1991: 211)
a. Da puna ringe gosa.
   3sg.A cause 3sg.O be.good
   'He (did something which, indirectly,) made her well.'
b. Da pe-gosa ringe.
   3sg.A caus-be.good 3sg.O
   'He healed her (directly, with spiritual power).'
A similar Japanese example is provided by Horie (1993: 26):
(20) a. John-wa Mary-ni huku-o ki-se-ta.
        John-top Mary-dat clothes-acc wear-caus-past
        'John put clothes on Mary.'
     b. John-wa Mary-ni huku-o ki sase-ta.
        John-top Mary-dat clothes-acc wear cause-past
        'John made Mary wear clothes.'
The much-discussed English distinction between kill and cause to die is of
course also an instance of this contrast (e.g., Lakoff and Johnson 1980:
131).
(iii) Coordinating constructions: Many languages distinguish between
loose coordination and tight coordination (i.e., less vs. more cohesive
patterns), where the first expresses greater conceptual distance and the latter
expresses less conceptual distance (Haiman 1983: 788–790, 1985: 111–124).
Haiman discusses coordination of clauses and cites the two examples
in (21) and (22), where the greater cohesion is manifested by the
absence of a coordinator. In (21a), the greater conceptual distance lies in
the temporal non-connectedness, while in (22a), the greater conceptual
distance lies in the lack of subject identity.
(21) Fe'fe' (Bantoid; Cameroon; Hyman 1971: 43)
a. à kà gen ntee n njwen lwà'
   he past go market and buy yams
   'He went to the market and also (at some later date) bought yams.'
b. à kà gen ntee njwen lwà'
   he past go market buy yams
   'He went to the market and bought yams (there).'
(22) Aghem (Bantoid; Cameroon; Anderson 1979: 114)
a. Ò nam kb gha y a z
   she cook fufu we.excl and eat
   'She cooked fufu and we ate it.'
b. Ò m m mam kb
   she past sing cook fufu
   'She sang and cooked fufu.'
Wälchli (2005: Ch. 3) also discusses noun phrase coordination and cites
contrasts such as (23). He calls the semantic distinction between them
"accidental coordination" vs. "natural coordination", and claims that
the formal contrast between loose coordination in (23a) and tight
coordination in (23b) iconically reflects this semantic contrast (2005: 13).
(23) Georgian
a. gveli da k'ac'i
   snake and man
   'the snake and the man'
b. da-dzma
   sister-brother
   'brother and sister'
(iv) Complement clause constructions: Haiman (1985: 124–130) also
discusses complement-clause constructions in terms of iconicity of
cohesion mirroring conceptual closeness. He observes that in the contrast in
(24), the reduced or contracted version signals conceptual closeness
(same subject), while a non-reduced version signals conceptual distance
(different subject) (1985: 126).
(24) a. Who do you wanna succeed? (who = patient; same subject)
     b. Who do you want to succeed? (who = agent possible; diff.
        subject possible)
But much better known is Givón's work on iconic form-function
correspondences in complement clauses (1980, 1990: Ch. 13, 2001: Ch. 12;
see also 1985: 199–202, 1991: 95–96), which posits a scale of "event
integration" (called "binding hierarchy" in earlier versions) that
corresponds to a scale of formal integration. In the most recent version of
this, Givón posits an iconic principle of "event integration and clause
union":
The stronger is the semantic bond between the two events, the more extensive will
be the syntactic integration of the two clauses into a single though complex clause
(Givón 2001: 40)
Among his examples are contrasts such as the following, where in each
case the first example exhibits greater event integration and greater
syntactic integration (non-finiteness and/or absence of a complementizer):
(25) a. John made Mary quit her job. (2001: 45)
b. John caused Mary to quit her job.
(26) a. She wanted him to leave. (2001: 47)
b. She wished that he would leave.
(27) a. She told him to leave. (2001: 48)
b. She insisted that he must leave.
(28) a. She saw him coming out of the theatre. (2001: 50)
b. She saw that he came out of the theatre.
6. Iconicity of cohesion: frequency-based explanation
My claim here is that Haiman's cohesion scale in (17) does not reflect one
single underlying cause. It should be taken apart into three different
distinctions: (i) overt coding vs. lack of coding (X word Y vs. X Y), (ii)
juxtaposition vs. bound expression (X Y vs. X-Y), and (iii) portmanteau
expression (Z). All three are related to frequency, but not in the same
way. This is clearest in the case of portmanteau expression (or
suppletion), which only occurs when the combination of the two elements has a
high absolute frequency. For instance, in the domain of causative
constructions, English has the bound causatives sadd-en 'make sad', wid-en
'make wide', hard-en 'make hard', but it is only for high-frequency
adjectives like good and small that it has suppletive causatives (improve 'make
good', reduce 'make small'). Similarly, a few cases of suppletion in
possessive constructions are attested, but these all come from high-frequency
nouns such as 'mother' (e.g., Ju|'hoan taqè 'mother', a a 'my mother',
Dickens 2005: 35). The reason why high absolute frequency favours
suppletion (and irregularity more generally) has long been known: High
frequency elements are easy to store and retrieve from memory, so
there is little need for regularity (cf. Osthoff 1899, Ronneberger-Sibold
1988).
However, the overt-covert contrast (X word Y vs. X Y) and the
free-bound contrast (X Y vs. X-Y) are due to frequency-induced predictability,
as seen earlier for contrasts that others have explained by iconicity of
quantity (§2) and by iconicity of complexity (§3–4). Predictability leads to
shortness of coding by economy, and shortness of coding itself leads
to bound expression, because short (and unstressed) elements do not
have enough bulk to stand on their own. The phenomena that Haiman
explains through iconicity of cohesion actually all instantiate only the
overt-covert contrast and/or the free-bound contrast, so what matters
for them is again relative frequency.
Let us now examine the four main construction types with alleged ef-
fects of iconicity of cohesion to see how their properties can be explained
in terms of relative frequency.
6.1. Possessive constructions
With inalienably possessed nouns, possessive constructions are of course
much more frequent than with alienably possessed nouns (cf. Nichols
1988: 579). This can be easily demonstrated with corpus figures. Table 2
shows frequencies of three (hopefully representative) sets of nouns in
spoken English and spoken Spanish.
Table 2. Frequencies of selected kinship terms, body part terms and alienable nouns
English         kinship terms (a)    body part terms (b)    alienable nouns (c)
total           16235   100%         11038   100%           24991   100%
possessed        7797    48%          4940    45%            2967    12%
nonpossessed     8434    52%          6098    55%           22024    88%
Source: British National Corpus, spoken part
a mother, father, brother(s), sister(s), wife, husband, son(s), daughter(s), mum, dad,
grandfather, grandmother, aunt, uncle
b head, hand(s), face, finger(s), knee(s), ear(s), leg(s), wrist, hair, nose, neck, belly,
skin, elbow, chest
c car, dinner, health, tree, knife, bed, community, meat, money, bike, suitcase, tools,
book(s), room, bedroom, kitchen
Spanish         kinship terms (d)    body part terms (e)    alienable nouns (f)
total           18391   100%          8863   100%           10913   100%
possessed        7362    40%          1297    15%             776     7%
nonpossessed    11029    60%          7566    85%           10137    93%
Source: Corpus del Español, spoken part
d madre, padre(s), hermano(s), hermana(s), esposa, marido, hijo(s), hija(s), mamá, papá,
abuelo(s), abuela, tía, tío
e cabeza, mano(s), cara, dedo(s), rodilla(s), oído(s), pierna(s), muñeca, pelo, nariz,
cuello, vientre, piel, codo, pecho, hombro(s)
f coche, cena, salud, árbol, cuchillo, cama, comunidad, pueblo, carne, dinero, bicicleta,
maleta, herramientas, libro(s), habitación, dormitorio, cocina
We see that alienable nouns occur as possessed nouns in a possessive
construction only relatively rarely (12% and 7% of the time, respectively),
while it is very common for kinship terms and body part terms to occur
as possessed nouns. (The fact that the figure for Spanish body part terms
is relatively low here is due to the omissibility of overt possessors in
body-part constructions like levanta la mano 'raise your hand'; strictly
speaking, all notional possessors would have to be counted, but this is
impossible to do automatically.)
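The percentages in Table 2 can be recomputed from the raw counts. The following Python sketch is only an illustration of that arithmetic: the counts are copied from Table 2, while the dictionary layout and the function name `possessed_share` are assumptions introduced here and are not part of the original corpus study.

```python
# Counts copied from Table 2 as (possessed occurrences, total occurrences).
# The data layout and all names below are illustrative assumptions.
TABLE_2 = {
    "English": {
        "kinship terms": (7797, 16235),
        "body part terms": (4940, 11038),
        "alienable nouns": (2967, 24991),
    },
    "Spanish": {
        "kinship terms": (7362, 18391),
        "body part terms": (1297, 8863),
        "alienable nouns": (776, 10913),
    },
}


def possessed_share(possessed: int, total: int) -> float:
    """Return the percentage of occurrences that are possessed."""
    return 100.0 * possessed / total


if __name__ == "__main__":
    for language, noun_classes in TABLE_2.items():
        for noun_class, (possessed, total) in noun_classes.items():
            share = possessed_share(possessed, total)
            print(f"{language:8} {noun_class:16} {share:5.1f}% possessed")
```

In both languages the share for kinship terms and for body part terms exceeds the share for alienable nouns, which is the relative-frequency asymmetry that the argument relies on.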
As we saw in §4.2, what counts is relative frequencies, not absolute
frequencies. Since frequent alienable nouns like house or show are much
more frequent than rare inalienable nouns like kidney or great niece in
most cultural contexts, the alienable nouns may well occur in a possessive
construction more often than the inalienable nouns. However, the
percentage of possessed occurrences of inalienable nouns will always be
significantly higher than the corresponding percentage of alienable nouns.
Thus, upon encountering an inalienable noun, it will be much easier to
predict that it occurs in a possessive construction, and the possessive
marking is therefore relatively redundant. Since languages are efficient
systems, they tend to show less overt coding with inalienable nouns.
Moreover, since pronominal possessors are more predictable, they show
a greater tendency to become affixed, thus accounting for the contrast
between juxtaposition and bound expression.
Crucially, the economy account given here makes somewhat different
predictions from Haiman's (1983) iconicity account. The facts show that
the predictions of the economy account are the correct ones.
First, the iconicity account is compatible with a hypothetical situation
in which the pronominal possessor in the inalienable construction is
actually longer than the corresponding form in the alienable possession.
However, economy additionally predicts that the form of the inalienable
pronominal possessor not only tends to be bound, but also tends to be
shorter than the alienable possessor. This is in general borne out, and I
know of no counterexamples. Some examples are given in (29).
(29)                         alienable construction    inalienable construction
a. Nakanai                   luma taku                 lima-gu
   (Johnston 1981: 217)      house I                   hand-1sg
                             'my house'                'my hand'
b. Hua                       dgai fu                   d-za
   (Haiman 1983: 793)        I pig                     1sg-arm
                             'my pig'                  'my arm'
c. Ndjebbana                 budmanda ngayabba         nga-ngardabbamba
   (McKay 1996: 302–6)       suitcase I                1sg-liver
                             'my suitcase'             'my liver'
d. Kpelle                    a pri                     m-polu
   (Welmers 1973: 279)       I house                   1sg-back
                             'my house'                'my back'
e. Ju|'hoan                  m tjù                     m ba
   (Dickens 2005: 35)        1sg house                 1sg father
                             'my house'                'my father'
Second, Haiman's account in terms of distance matching predicts that
the additional element in alienable constructions should occur in the
middle between the possessor and the possessum, as seen in the canonical
examples from Maltese (is-siġġu tiegħ-i [the-chair of-me] 'my chair', see
§1) and from Abun (ji bi nggwe [I of garden] 'my garden', see (18)).
However, the extra element may also occur to the left or right of both
the possessor and the possessum, as seen in (30).
(30)                            alienable construction    inalienable construction
a. Puluwat                      nay-iy hamwol             pay-iy
   (Elbert 1974: 55, 61)        poss-1sg chief            hand-1sg
                                'my chief'                'my hand'
b. 'O'odham                     ñ-mi:stol-ga              ñ-je'e
   (Zepeda 1983: 74–81)         1sg-cat-possd             1sg-mother
                                'my cat'                  'my mother'
c. Koyukon                      se-tel-e'                 se-tlee'
   (Thompson 1996: 654, 667)    1sg-socks-possd           1sg-head
                                'my socks'                'my head'
d. Achagua                      nu-caarru-ni              nu-wíta
   (Wilson 1992)                1sg-car-possd             1sg-head
                                'my car'                  'my head'
My economy account only predicts that the coding of inalienable con-
structions should tend to be shorter, but it says nothing about the posi-
tion of the extra coding element in alienable constructions, so cases like
(30a–d) are counterevidence to Haiman's iconicity account, but
compatible with my economy account. Haiman (1983: 795) himself cites the
Puluwat example, recognizes that it is a problem for him, and ac-
knowledges the need to reformulate his initial generalization. But he
does not seem to recognize that the facts no longer support any role of
iconicity.
Finally, some languages show overt coding of inalienable nouns as
well, but only when they are not possessed. An example comes from
Koyukon (Athabaskan; Thompson 1996: 654, 656, 667):
(31) Koyukon        unpossessed      possessed
     alienable      te               se-tel-e'
                    socks            1sg-socks-possd
                    'socks'          'my socks'
     inalienable    k'e-tlee'        se-tlee'
                    unsp-head        1sg-head
                    'head'           'my head'
Haiman's iconicity does not make any predictions about unpossessed
constructions, but the economy account predicts just what we see: Alien-
able nouns tend to have overt coding in the possessed construction,
whereas inalienable nouns tend to have overt coding in the unpossessed
construction.
Thus, the iconicity account is both too weak (in that it does not predict
the shortness of inalienable possessive pronouns, seen in (29)) and too
strong (in that it wrongly predicts that the patterns in (30) should not be
possible). Economy, by contrast, makes just the right predictions.
6.2. Causative constructions
Again I claim that direct causatives are significantly more frequent than
indirect causatives and that that explains why they exhibit more cohesive
coding than indirect causatives. No appeal to iconicity is necessary.
In order to show that this is true, ideally one would examine a corpus
of a language with a regular grammatical contrast between direct and
indirect causation, as illustrated in (19) for Buru and in (20) for Japanese. I
hope that this paper will inspire such research, and I expect that the direct
causatives are much more frequent than the indirect causatives. In the
literature on English, the contrasts between the different types of
periphrastic causatives have received some attention. According to Gilquin (2006:
7), the frequencies in the British National Corpus of the four causative
verbs that combine with an infinitive are as in (32):
(32)                            spoken   written   total
make (I made him go)               898       258   1,156
get (I got him to go)              350        52     402
cause (I caused him to go)          15       207     222
have (I had him go)                 48        29      77
Since the make and get causatives are usually regarded as expressing a
more direct type of causation, while the cause and have causatives express
a more indirect type of causation, this is just what we would expect.
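To make the comparison in (32) explicit, the counts can be grouped by causation type. This is a minimal sketch: the counts are those cited from Gilquin (2006) in (32), the grouping of make and get as direct and of cause and have as indirect follows the text, and all variable and function names are hypothetical.

```python
# Counts copied from (32) as (spoken, written) BNC frequencies.
# Grouping into direct and indirect follows the text; names are illustrative.
CAUSATIVES = {
    "make": (898, 258),
    "get": (350, 52),
    "cause": (15, 207),
    "have": (48, 29),
}
DIRECT = ("make", "get")
INDIRECT = ("cause", "have")


def total(verbs):
    """Sum spoken and written counts over the given verbs."""
    return sum(sum(CAUSATIVES[verb]) for verb in verbs)


direct_total = total(DIRECT)
indirect_total = total(INDIRECT)
direct_share = 100.0 * direct_total / (direct_total + indirect_total)

if __name__ == "__main__":
    print(f"direct: {direct_total}, indirect: {indirect_total}, "
          f"direct share: {direct_share:.1f}%")
```

On these counts the make and get causatives account for 1,558 of the 1,857 tokens.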
It is also possible to compare lexical causative verbs with the
corresponding periphrastic cause causatives (this is also what Haiman 1983
mostly does for the semantic aspects). Some figures from the British
National Corpus are given in (33) (these are only the forms with a pronoun
object, i.e., kill me, cause him to die, etc.).
(33) stop          3267    cause to stop         6
     kill          2400    cause to die          2
     raise          466    cause to rise         3
     bring down     269    cause to come down    0
     drown           80    cause to drown        0
These comparisons are more problematic than those in (32) in that the
length of the two types of causatives differs sharply, so one might suspect
that the lexical direct causatives are more frequent simply because they
are shorter. In general, such effects do not seem to be particularly strong,
if they exist at all (see Haspelmath 2008: §6.5 for further discussion), but
still in the ideal case we would like to perform our corpus study on a
language where all causatives are expressed grammatically (i.e., even 'kill'
and 'raise' are expressed as 'die-caus' and 'rise-caus'). But since many
direct causatives are highly frequent (in an absolute sense) in all languages,
we normally find a lot of portmanteau expression of causatives, which
limits our options for corpus counts. Nevertheless, the figures in (32) and
(33) should be sufficient to make a good initial case for the claim that
direct causatives are generally more frequent than indirect causatives.
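The size of the asymmetry in (33) can be expressed as the share of lexical causatives within each pair. The sketch below assumes nothing beyond the counts given in (33); the names `PAIRS` and `lexical_share` are hypothetical labels chosen for this illustration.

```python
# Counts copied from (33): BNC frequencies with a pronoun object, as
# (lexical causative count, periphrastic "cause to ..." count).
# All names are illustrative assumptions.
PAIRS = {
    "stop / cause to stop": (3267, 6),
    "kill / cause to die": (2400, 2),
    "raise / cause to rise": (466, 3),
    "bring down / cause to come down": (269, 0),
    "drown / cause to drown": (80, 0),
}


def lexical_share(lexical: int, periphrastic: int) -> float:
    """Percentage of all tokens of a pair that use the lexical causative."""
    return 100.0 * lexical / (lexical + periphrastic)


if __name__ == "__main__":
    for pair, (lexical, periphrastic) in PAIRS.items():
        share = lexical_share(lexical, periphrastic)
        print(f"{pair:32} {share:6.2f}% lexical")
```

For every pair the lexical causative makes up more than 99% of the tokens, though, as noted above, the difference in length between the two types means these figures should be read with caution.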
If this is true, then the economy account makes a further prediction:
that markers of indirect causation should not only be less cohesive, but
also tend to be longer. And indeed a number of languages have two
causatives differing primarily in length, not in cohesion (cf. Dixon 2000:
74–78).
(34)                            indirect causative    direct causative
a. Amharic                      as-balla              a-balla
   (Haiman 1983: 786,           caus-eat              caus-eat
   Amberber 2000: 317–320)      'force to eat'        'feed'
b. Hindi                        ban-vaa-              ban-aa-
   (Dixon 2000: 67,             be.built-caus         be.built-caus
   Saksena 1982)                'have sth. built'     'build'
c. Jinghpaw                     -shangun              sha-
   (Maran and Clifton 1976)
d. Creek                        -ipeyc                -ic
   (Martin 2000: 394–399)
Although Haiman (1983: 786) cites the example from Amharic as an
instance of an iconicity contrast, it does not actually fit his iconicity
explanation. The two causatives of Amharic and the other languages in (34)
do not differ in cohesion, but only in length, so the contrast is predicted
only by the economy account.¹⁰
6.3. Coordinating constructions
While Haiman's discussion of examples like (21–22) above only mentions
the semantic contrast between greater and less conceptual distance,
Wälchli's terminology ("accidental" vs. "natural" coordination) already points to
the real motivating factor: Natural coordination (as in 21b, 22b and 23b)
is natural, i.e., frequent and expected for the pair of expressions, while
accidental coordination is infrequent and hence unexpected. Thus, it is
economical to use more explicit and less cohesive coding in accidental
coordination, and less explicit and more cohesive coding in natural
coordination.
Doing the frequency counts for clause coordination is fairly trivial. For
example, in the German version of The wolf and the seven little kids (one
of Grimm's fairy tales), there are 47 und-coordinations, and 41 of them
show subject identity, while only 6 have different subjects. All 47 cases
exhibit temporal closeness.
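The proportion behind this count is easy to verify. The snippet below is a small sketch that uses only the three figures reported in the text; the constant names are assumptions made for the illustration.

```python
# Counts reported in the text for und-coordinations in the German version of
# "The wolf and the seven little kids". Constant names are illustrative.
TOTAL_COORDINATIONS = 47
SAME_SUBJECT = 41
DIFFERENT_SUBJECT = 6

# Sanity check: the two categories should exhaust the total.
assert SAME_SUBJECT + DIFFERENT_SUBJECT == TOTAL_COORDINATIONS

same_subject_share = 100.0 * SAME_SUBJECT / TOTAL_COORDINATIONS

if __name__ == "__main__":
    print(f"same-subject share: {same_subject_share:.1f}%")
```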
For noun phrase conjunction of the type da-dzma 'brother-and-sister'
(23b), the frequency counts are less straightforward, because the
definition of accidental and natural coordination is quite vague: Wälchli (2005:
5) describes natural coordination as "coordination of items which are
expected to cooccur, which are closely related in meaning, and which form
conceptual units". This is not specific enough to test the claim directly,
but it seems plausible that for noun phrases, too, it will be possible to
show that coordinations of the type brother and sister are
more frequent than coordinations of the type the man and the snake.
6.4. Complement-clause constructions
For many of the examples given by Haiman and Givón, the frequency
explanation is completely straightforward. With 'want' verbs (cf. 24a–b),
the same-subject use is of course overwhelmingly more frequent than the
different-subject use, for well-understood reasons (our desires naturally
concern first of all our own actions), and this is often reflected in shorter
coding (cf. Haspelmath 1999). This explains the contrast between English
wanna and want to, and also a similar contrast between gotta and got to (I
gotta go home now vs. I got to go to Hawaii last winter) that was already
pointed out and correctly explained by Bolinger (1961: 27) ("condensation
is tied to familiarity", cited approvingly by Haiman 1985: 126).
There are also obvious frequency asymmetries between the pairs make/
cause (cf. 25), want/wish (cf. 26), and tell/insist (cf. 27) which suffice to
explain the shorter coding of the first member of each pair.¹¹ Givón is
right that in each case there is also a semantic contrast, but in order
to show that the semantic contrast is indeed responsible for the formal
contrast, he should provide contrasting examples of constructions with
roughly equal frequency.
In contrasts such as (28a–b) (She saw him coming out of the theatre vs.
She saw that he came out of the theatre), which do not exhibit a striking
frequency asymmetry, another factor is clearly highly relevant: In (28a),
the complement event necessarily occurs simultaneously with the main
event, in contrast to (28b), where the complement event could take place
at some other time (She saw that he would come out only two hours later/
that he had come out two hours earlier). In Cristofaro's (2003: §5.3.2)
terms, (28a) shows "predetermination" of the tense value of the
complement clause, and Cristofaro rightly explains the lack of finiteness (i.e., the
lack of tense) in (28a) as due to "syntagmatic economy": Information
that can be readily inferred from the context can be left out. (See also
Horie 1993: 203–212 for related discussion.)
This factor of predetermination is of course not unrelated to the
broader notion of semantic closeness. If a complement-taking verb
predetermines the tense value and other semantic properties of its complement,
this can be seen as one facet of "conceptual closeness" or "event
integration". However, such cases do not provide evidence for iconicity of
cohesion, because the higher syntagmatic cohesion of She saw him coming out
of the theatre would be expected anyway for reasons of economy.¹²
7. Conclusion
I conclude that for most of the core phenomena for which iconicity of
quantity, complexity and cohesion have been claimed to be responsible,
there are very good reasons to think that they are in fact explained by
frequency asymmetries and the economy principle. The final result may look
iconic to the linguist in some cases, but iconicity is not the decisive causal
factor.
Linguists have rarely discussed the mechanism by which iconicity could
come to have a causal role in shaping grammars. However, Givón claims
that iconic structures are easier to process than noniconic structures:
The iconicity meta-principle: All other things being equal, a coded experience is
easier to store, retrieve, and communicate if the code is maximally isomorphic to
the experience. (Givón 1985: 189)
And similarly, Dressler et al. (1987: 18) say that "the more iconic a sign is,
the more natural it is", i.e., the easier speakers find using it.
If these claims were correct also for iconicity of quantity, complexity
and cohesion, it would indeed be predicted that such iconic structures
should be preferred by speakers, and we should see a significant effect of
iconicity in language structures. But in fact we do not see such an effect.
We see effects of frequency and predictability, i.e. of the economy
principle, which (as everyone agrees) is independently needed. What we can
conclude from this is that the above claims are wrong, i.e., that iconic
structures are apparently not necessarily preferred in processing.
The respective role of iconicity and economy was discussed already
in the 1980s. Haiman (1983: 802) recognized that formal complexity/
simplicity is very often economically motivated, and he rejected the sub-
sumption of economic motivation under iconicity, even though one might
argue that the correspondence between a linguistic dimension (full vs. re-
duced form) and a conceptual dimension (unpredictable vs. predictable) is
itself iconic. As an example of economic motivation, he cites the tendency
for predictable referents to be coded with little material (short pronouns
or zero), while less predictable or unpredictable referents are coded with
more material (longer pronouns or full NPs) (as documented in Givón
(ed.) 1983).
However, Givón (1985: 197) sees the correlation between
unpredictability and amount of coding material as primarily iconic (see also Givón
1991: 87–89), and he objects to Haiman's economy account:
. . . the principle of economy has not been working here by itself, since the end
result of such a situation would have been the exclusive use of zero anaphora for all
topic identification in discourse. (Givón 1991: 87–89)
But that economy (favoring the speaker's needs) is not the only relevant
factor in communication should be clear from the beginning: if there
were no opposite principle of distinctiveness (favoring the hearer), we
would have no linguistic forms at all. Another argument that Givón
makes is the following:
It may well be that Zipf-like economy considerations were indeed involved in the
diachronic . . . shaping of the quantity-scale . . . But the end result is nonetheless
an iconic – isomorphic – relation between code and coded. And such a relation
surely carries its own meta-motivation, i.e., [the iconicity meta-principle, cited at
the beginning of this section]. (Givón 1991: 87–89)
This last sentence simply does not follow. If the end result is iconic in the
eyes of the analyst, this does not mean that it is iconically motivated, i.e.,
that iconicity is a relevant causal factor. The empirical evidence from
frequency distributions and cross-linguistic coding types that was cited in
this paper shows that iconicity may well be irrelevant for an explanation
of the grammatical asymmetries considered here. That is, in the debate
between Haiman and Givón, Haiman was right to favor economy over
iconicity in explaining the quantity scale for referent expressions.
However, as I hope to have shown here, Haiman's economy explanation
should be extended also to many other cases that he and others explained
in terms of iconicity.
Received 04 December 2006; revision received 05 April 2007
Max-Planck Institute for Evolutionary Anthropology, Leipzig, Germany
Notes
* Versions of this paper were presented at the University of Jena, Tohoku University,
Seoul National University, and the Scuola Normale Superiore in Pisa. I am grateful
for all comments that I received on these occasions and on other occasions. I also
thank Brian Kessler for help with the corpus counts. Contact address: Max Planck
Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103 Leipzig, Germany;
author's e-mail: <haspelmath@eva.mpg.de>.
1. In C. S. Peirce's received typology of signs, there are three types of icons: diagrams,
images, and metaphors (see, e.g., Dressler 1995 for discussion). Nowadays metaphor
is not generally discussed under the heading of iconicity, and imagic iconicity is
relevant primarily for onomatopoeia. This paper is exclusively concerned with possible
iconicity effects in grammar, so only diagrammatic iconicity will be considered here.
The relevance of Peirce's semiotic concepts to the study of grammar was first brought
to linguists' attention by Jakobson (1965).
2. The idea that (syntagmatic and paradigmatic) isomorphism can be considered an
instance of Peircean iconicity was apparently first proposed by Anttila (1989, originally
published in 1972). A number of authors have noted that this represents a fairly
extreme extension of Peirce's original concept, and Itkonen (2004) flatly rejects the
subsumption of isomorphism under iconicity.
3. Haiman (1985: 194–195) recognizes that the motivation for the reduction is also
partly economic: "one gives less expression to that which is familiar or
predictable", but he does not consider the possibility that the motivation may be entirely
economic.
4. Lakoff and Johnson (1980: 127) apply their principle "more of form is more of
content" (which they call a metaphor, not relating it to iconicity) to cases of iteration (She
ran and ran and ran and ran) and lengthening (He is bi-i-i-i-ig!). Such extragrammatical
phenomena may well be motivated by a kind of iconicity of quantity. However, their
attempt to extend the principle to grammatical reduplication fails: While many cases of
reduplication signal "more of content" (e.g., plurality, continuative aspect), this is by
no means always the case (Moravcsik 1978 also mentions a widespread sense of
diminution and attenuation, and more specific senses such as indifference and pretending).
Grammatical reduplication is apparently just like affixation in that the reduplicated
form is always the rarer one.
5. The third vs. first/second person contrast has also been interpreted as a kind of
"iconicity of absence" (closely related to iconicity of quantity as seen in §2): Haiman (1985:
45), citing Benveniste (1946), claims that the third person, as a non-speech act
participant, can be seen as an absent person, a "non-person" that is iconically represented
by a "non-desinence" (i.e., zero). But neither Benveniste nor Haiman mention
imperatives, where the hearer is present, but a second-person desinence is typically absent
(see (10) below). (See also the discussion in Helmbrecht 2004: 228–229.)
6. Lehmann (1974: 113) notes that length correlates with rarity, but instead of following
Zipf in explaining length with reference to frequency/rarity, he suggests that rarity can
also be seen as equivalent to improbability or informational value. He then assumes
that informational value correlates with semantic complexity and infers that rare items
tend to be semantically complex. But evidently informational value in the statistical
sense is very different from semantic complexity. Talking about animals or perceiving
is perhaps in some technical sense of high informational value (even though it is not
very informative), but it is hard to argue that animal and perceive are semantically
more complex than cat or see.
7. A reviewer observes that the English pair widow/widower in (10) is also an isolated
exception and asks how it is different from beautiful/beauty. The answer is that the
widow/widower contrast is not isolated from a cross-linguistic point of view: There is
a general tendency for this pair to show overt coding on the male member (e.g., Ger-
man Witwe/Witw-er, Russian vdova/vdov-ec), whereas beautiful/beauty is isolated not
only within English, but also cross-linguistically.
8. Also much of the earlier functionalist literature is insufficiently explicit with regard to
the causal factor. For example, Comrie (1989: 128) only invokes the naturalness of
certain associations between role and animacy, a relatively vague notion compared to
frequency.
9. Cf. also Lakoff and Johnson's (1980: 128–132) principle "closeness is strength of
effect", which is, however, not related to iconicity by them, but is regarded as a
metaphor. The frequency-based perspective here suggests that Lakoff and Johnson's
metaphor-based account is not necessary.
10. A further observation is that direct vs. indirect causation is not the only semantic
parameter by which competing causatives differ. Dixon (2000: 76) lists the following
parameters and observes that they all tend to correlate with the degree of compactness
of the causative marker (i.e., its shortness).

longer marker              shorter marker
action                     state
transitive                 intransitive
causee having control      causee lacking control
causee unwilling           causee willing
causee fully affected      causee partially affected
accidental                 intentional
with effort                naturally

Not all of these can be subsumed under "less conceptual distance", but they can be
plausibly related to frequency asymmetries. This is a matter for future research.
11. Leech et al. (2001) give the following figures for the verbal lexemes, which can be taken
as representative for the complement-clause constructions as well: want 945, wish 30;
tell 775, insist 67; make 2165, cause 206.
12. Cristofaro (2003: Ch. 9), while pointing to the importance of the factor of predetermi-
nation, still wants to retain semantic integration and iconicity as explanatory factors
for complement-clause constructions. But like Haiman and Givón, she does not even
consider the potential explanatory value of frequency-based economy.
References
Aissen, Judith
2003 Differential object marking: Iconicity vs. economy. Natural Language and
Linguistic Theory 21(3), 435–483.
Amberber, Mengistu
2000 Valency-changing and valency-encoding devices in Amharic. In R. M. W.
Dixon and Alexandra Y. Aikhenvald (eds.), Changing Valency. Cambridge:
Cambridge University Press, 312–332.
Anderson, Stephen C.
1979 Verb structure. In Larry Hyman (ed.), Aghem Grammatical Structure
(Southern California Occasional Papers in Linguistics 7). Los Angeles:
University of Southern California, 73–136.
Anttila, Raimo
1989 Historical and Comparative Linguistics. 2nd ed. Amsterdam: Benjamins.
Benveniste, Émile
1946 Relations de personne dans le verbe. Bulletin de la Société de Linguistique de
Paris 43, 1–12.
Berry, Keith and Christine Berry
1999 A description of Abun: A West Papuan language of Irian Jaya. (Pacific
Linguistics, B-115.) Canberra: Australian National University.
Blansitt, Edward L.
1973 Bitransitive clauses. Working Papers in Language Universals (Stanford) 13,
1–26.
Bolinger, Dwight
1961 Generality, Gradience, and the all-or-none. The Hague: Mouton.
Bossong, Georg
1985 Differenzielle Objektmarkierung in den neuiranischen Sprachen. Tübingen:
Narr.
1998 Le marquage différentiel de l'objet dans les langues d'Europe. In Jack Feuillet
(ed.), Actance et valence dans les langues de l'Europe. Berlin: Mouton de
Gruyter, 193–258.
Bybee, Joan L.
1985 Morphology: A Study of the Relation between Meaning and Form. Amster-
dam: Benjamins.
Bybee, Joan L. and Paul Hopper (eds.)
2001 Frequency and the emergence of linguistic structure. Amsterdam: Benjamins.
Comrie, Bernard
1989 Language Universals and Linguistic Typology. 2nd ed. Oxford: Blackwell.
Corbett, Greville, Andrew Hippisley, Dunstan Brown, and Paul Marriott
2001 Frequency, regularity and the paradigm: A perspective from Russian on a
complex relation. In Bybee, Joan and Paul Hopper (eds), Frequency and the
Emergence of Linguistic Structure. Amsterdam: John Benjamins, 201–226.
Cristofaro, Sonia
2003 Subordination. Oxford: Oxford University Press.
Croft, William
1990a Typology and Universals. Cambridge: Cambridge University Press.
1990b Possible verbs and the structure of events. In Tsohatzidis, S. L. (ed.),
Meanings and Prototypes: Studies in Linguistic Categorization. London:
Routledge, 48–73.
2003 Typology and Universals. 2nd ed. Cambridge: Cambridge University Press.
Croft, William and Alan Cruse
2004 Cognitive Linguistics. Cambridge: Cambridge University Press.
Dickens, Patrick J.
2005 A Concise Grammar of Ju|'hoan. Cologne: Köppe.
Dixon, R. M. W.
2000 A typology of causatives: Form, syntax and meaning. In Dixon, R. M. W.
and Alexandra Y. Aikhenvald (eds.), Changing Valency. Cambridge:
Cambridge University Press, 30–83.
Dressler, Wolfgang U.
1995 Interactions between iconicity and other semiotic parameters in language. In
Raffaele Simone (ed.), Iconicity in Language. Amsterdam: Benjamins, 21–37.
Dressler, Wolfgang U., Willi Mayerthaler, Oswald Panagl and Wolfgang U. Wurzel
1987 Leitmotifs in Natural Morphology. (Studies in Language Companion Series
10). Amsterdam: Benjamins.
Elbert, Samuel
1974 Puluwat Grammar. (Pacific Linguistics, B-29.) Canberra: Australian
National University.
Fenk-Oczlon, Gertraud
1990 Ikonismus versus Ökonomieprinzip: Am Beispiel russischer Aspekt- und
Kasusbildungen. Papiere zur Linguistik 42(1), 49–69.
1991 Frequenz und Kognition – Frequenz und Markiertheit. Folia Linguistica
25(3–4), 361–394.
Filimonova, Elena
2005 The noun phrase hierarchy and relational marking: Problems and
counterevidence. Linguistic Typology 9(1), 77–113.
Gilquin, Gaëtanelle
2006 The verb slot in causative constructions: Finding the best fit. Constructions
SV1-3. (www.constructions-online.de)
Givón, Talmy
1980 The Binding Hierarchy and the typology of complements. Studies in
Language 4, 333–377.
1985 Iconicity, isomorphism and non-arbitrary coding in syntax. In John Haiman
(ed.), Iconicity in Syntax. Amsterdam: Benjamins, 187–219.
1990 Syntax: A Functional-Typological Introduction. Vol. II. Amsterdam:
Benjamins.
1991 Isomorphism in the grammatical code: Cognitive and biological
considerations. Studies in Language 15(1), 85–114.
1995 Markedness as meta-iconicity: Distributional and cognitive correlates of
syntactic structure. In Functionalism and Grammar, T. Givón, 25–69.
Amsterdam: Benjamins.
2001 Syntax: An Introduction. Volume II. Amsterdam: Benjamins.
Givón, T. (ed.)
1983 Topic Continuity in Discourse: A Quantitative Cross-Language Study.
Amsterdam: Benjamins.
Greenberg, Joseph H.
1963[1966] Some universals of grammar with particular reference to the order of
meaningful elements. In Joseph H. Greenberg (ed.), Universals of Language. 2nd
ed. 1966. Cambridge, Mass.: MIT Press, 73–113.
1966 Language Universals, with Special Reference to Feature Hierarchies. (Janua
Linguarum, Series Minor 59). The Hague: Mouton.
Haiman, John
1980 The iconicity of grammar. Language 56, 515–540.
1983 Iconic and economic motivation. Language 59, 781–819.
1985 Natural Syntax. Cambridge: Cambridge University Press.
2000 Iconicity. In Geert Booij, Joachim Mugdan and Christian Lehmann (eds.),
Morphology: An International Handbook. Vol. I. Berlin: de Gruyter,
281–288.
Haspelmath, Martin
1993 More on the typology of inchoative/causative verb alternations. In Bernard
Comrie and Maria Polinsky (eds.), Causatives and Transitivity (Studies in
Language Companion Series 23). Amsterdam: Benjamins, 87–120.
1999 On the cross-linguistic distribution of same-subject and different-subject
complement clauses: Economic vs. iconic motivation. Paper presented at the
ICLC, Stockholm, July 1999. (Handout available from author's website.)
2006 Against markedness (and what to replace it with). Journal of Linguistics
42(1), 1–46.
2008 Creating economical morphosyntactic patterns in language change. To
appear in Good, Jeff (ed.), Language Universals and Language Change.
Oxford: Oxford University Press.
Hawkins, John A.
2004 Efficiency and Complexity in Grammars. Oxford: Oxford University Press.
Helmbrecht, Johannes
2004 Ikonizität in Personalpronomina. Zeitschrift für Sprachwissenschaft 23,
211–244.
Horie, Kaoru
1993 A cross-linguistic study of perception and cognition verb complements:
A cognitive perspective. PhD dissertation, University of Southern Cali-
fornia.
Hockett, Charles F.
1958 A Course in Modern Linguistics. New York: MacMillan.
Horn, Wilhelm
1921 Sprachkörper und Sprachfunktion. Berlin: Mayer and Müller.
Hyman, Larry
1971 Consecutivization in Fe'fe'. Journal of African Languages 10(2), 29–43.
Itkonen, Esa
2004 Typological explanation and iconicity. Logos and Language 5(1), 21–33.
Jäger, Gerhard
2004 Learning constraint sub-hierarchies: The bidirectional gradual learning
algorithm. In R. Blutner and H. Zeevat (eds.), Pragmatics in OT. Palgrave
MacMillan, 251–287.
Jakobson, Roman
1963[1966] Implications of language universals for linguistics. In Joseph H. Greenberg
(ed.), Universals of language. 2nd ed. 1966. Cambridge, MA: MIT Press,
263–278. (First edition 1963.)
1965[1971] Quest for the essence of language. In Selected Writings, vol. II. The Hague:
Mouton, 345–359. (Originally published in Diogenes 51, 1965).
1971 Relationship between Russian stem suffixes and verbal aspects. In Selected
Writings, vol. II. The Hague: Mouton, 198–202.
Johnston, Ray
1981 Conceptualizing in Nakanai and English. In Franklin, Karl (ed.), Syntax
and Semantics in Papua New Guinea Languages. Ukarumpa, Papua New
Guinea: SIL, 212–224.
Koptjevskaja-Tamm, Maria
1996 Possessive noun phrases in Maltese: Alienability, iconicity and
grammaticalization. Rivista di Linguistica 8(1), 245–274.
Lakoff, George and Mark Johnson
1980 Metaphors We Live By. Chicago: University of Chicago Press.
Langacker, Ronald
2000 The meaning of "of". In Grammar and Conceptualization. Berlin: Mouton de
Gruyter, 73–90.
Lee, David
2001 Cognitive Linguistics: An Introduction. Melbourne: Oxford University Press.
Leech, Geoffrey, Paul Rayson, and Andrew Wilson
2001 Word Frequencies in Written and Spoken English Based on the British
National Corpus. Harlow, England: Pearson Education.
Lehmann, Christian
1974 Isomorphismus im sprachlichen Zeichen. In Seiler, Hansjakob (ed.),
Linguistic workshop II: Arbeiten des Kölner Universalienprojekts 1973/4
(Structura 8). München: Fink, 98–123.
Levinson, Stephen C.
2000 Presumptive Meanings: The Theory of Generalized Conversational
Implicature. Cambridge, MA: MIT Press.
Maran, L. R. and J. R. Clifton
1976 The causative mechanism in Jinghpaw. In Shibatani, Masayoshi (ed.), The
Grammar of Causative Constructions. New York: Academic Press,
443–458.
Martin, Jack B.
2000 Creek voice: Beyond valency. In Dixon, R. M. W. and Alexandra Y.
Aikhenvald (eds.), Changing Valency. Cambridge: Cambridge University
Press, 375–403.
Matthews, Peter
1991 Morphology. 2nd ed. Cambridge: Cambridge University Press.
Mayerthaler, Willi
1981 Morphologische Natürlichkeit. Wiesbaden: Athenaion.
1987 System-independent morphological naturalness. In Dressler et al., Leitmotifs
in Natural Morphology. (Studies in Language Companion Series 10).
Amsterdam: Benjamins, 25–58.
McKay, Graham R.
1996 Body parts, possession marking and nominal classes in Ndjebbana. In
Chappell, Hilary and William McGregor (eds.), The Grammar of Inalienability.
Berlin: Mouton de Gruyter, 293–326.
Mel'čuk, Igor A.
1967 K ponjatiju slovoobrazovanija. Izvestija Akademii Nauk SSSR, serija
literatury i jazyka 26(4), 352–362.
Moravcsik, Edith A.
1978 Reduplicative constructions. In Greenberg, Joseph H. (ed.), Universals of
Human Language. Vol. 3. Word Structure. Stanford: Stanford University
Press, 297–334.
Newmeyer, Frederick
1992 Iconicity and generative grammar. Language 68, 756–796.
Nichols, Johanna
1988 On alienable and inalienable possession. In Shipley, William (ed.), In Honor
of Mary Haas. Berlin: Mouton de Gruyter, 557–609.
Osthoff, Hermann
1899 Vom Suppletivwesen der indogermanischen Sprachen. Heidelberg: Hörning.
Plank, Frans
1979 Ikonisierung und De-Ikonisierung als Prinzipien des Sprachwandels.
Sprachwissenschaft 4, 121–158.
Ronneberger-Sibold, Elke
1980 Sprachverwendung – Sprachsystem: Ökonomie und Wandel (Linguistische
Arbeiten 87). Tübingen: Niemeyer.
1988 Entstehung von Suppletion und Natürliche Morphologie. Zeitschrift für
Phonetik, Sprachwissenschaft und Kommunikationsforschung 41, 453–462.
Saksena, Anuradha
1982 Contact in causation. Language 58, 820–831.
Taylor, John R.
2002 Cognitive Grammar. Oxford: Oxford University Press.
Thompson, Chad
1996 On the grammar of body parts in Koyukon Athabaskan. In Chappell,
Hilary and William McGregor (eds.), The Grammar of Inalienability. Berlin:
Mouton de Gruyter, 651–676.
Tiersma, Peter
1982 Local and general markedness. Language 58, 832–849.
Wälchli, Bernhard
2005 Co-compounds and Natural Coordination. Oxford: Oxford University Press.
Waugh, Linda R.
1982 Marked and unmarked: A choice between unequals. Semiotica 38, 299–318.
Welmers, Wm. E.
1973 African Language Structures. Berkeley: University of California Press.
Wilson, Peter J.
1992 Una descripción preliminar de la gramática del Achagua (Arawak). Bogotá:
Asociación Instituto Lingüístico de Verano.
Witkowski, S. R. and Cecil H. Brown
1983 Marking reversal and cultural importance. Language 59, 569–582.
Wright, Saundra Kimberly
2001 Internally caused and externally caused change of state verbs. Ph.D.
dissertation, Northwestern University, Evanston, IL.
Zepeda, Ofelia
1983 A Papago Grammar. Tucson: University of Arizona Press.
Zipf, George K.
1935 The Psycho-Biology of Language: An Introduction to Dynamic Philology.
Boston: Houghton Mifflin.
Zwicky, Arnold
1978 On markedness in morphology. Die Sprache 24, 129–143.
In defence of iconicity
JOHN HAIMAN*
Abstract
A number of iconically motivated grammatical distinctions, among them
that between alienable and inalienable possession in Japanese and Korean,
are graded. Haspelmath's Zipfian frequency hypothesis may be able to
accommodate these facts (lowest bulk is most frequent, middle bulk is less
frequent, and maximal bulk is maximally infrequent), but until more data
are forthcoming, iconicity alone makes the correct predictions in those
cases, and (crucially) in others where bulk is simply not the grammatical
variable at issue in signaling markedness (as, for example, the distinction
between nominative/absolutive and ergative/accusative in Kurdish). The
productivity (not just the fortuitous correctness) of an iconically motivated
"more form implies more meaning" principle is attested in: (a) the
(pre)history of the development of nominalizations in Romanian and
Khmer, (b) the frequent operation of Watkins' Law, whereby 3sg.
forms are interpreted as if they were zero-marked, even when they are not,
and (c) grammaticality judgments about the differences between anaphoric
epithets and structurally identical non-anaphoric noun phrases like "the pig"
in English. Like reduced form, elaborated form too may have a number
of motivations, not only iconic and economic (both cognitive), but also
esthetic. It is probably misconceived to look for only one motivating factor
to account for most observed grammatical facts, although the motivating
factors are more easily identified when they operate alone.
Keywords: iconicity; frequency; productivity.
1. Introduction
Martin Haspelmath's article is a stimulating and thought-provoking critique
of the notion of iconic motivation which deals with a broad range
of data and demands careful scrutiny. Not surprisingly, I am not equally
convinced by all of his arguments.
Cognitive Linguistics 19–1 (2008), 35–48
DOI 10.1515/COG.2008.002
0936–5907/08/0019–0035
© Walter de Gruyter
Haspelmath's fundamental argument is a version of Occam's razor:
certain phenomena of reduced expression which seem to be iconic are
equally motivated by Zipfian reduction, which is "necessary anyway".
He thus proposes that for a variety of phenomena which seem to manifest
diagrammatic iconicity, only frequency (in fact, no cognitive explanation
whatsoever) is necessary. It is important to recognize this aspect of
his argument. To say that frequency itself is motivated by some conceptual
considerations would be to beg the question: which one? But since
he never denies that iconicity is also "necessary anyway" in other areas
of grammar, Occam's razor won't work for him. Moreover, he has not
yet done all his homework. The iconicity hypothesis is compatible with
graded phenomena. For example, Sohn (1994) and Tsunoda (1995) have
argued that possession (in Korean and Japanese) may be graded so
that there may be three or even four-way contrasts in conceptual closeness
that are mirrored in grammatical performance and grammaticality
judgments. To claim that frequency counts also reflect this gradation,
Haspelmath would need to produce frequency comparisons of not two,
but three or more forms. Until such evidence is available, paired
frequency counts alone will not be able to compete with iconicity.
Let us however assume for now that there are a variety of phenomena
for which both a frequency and an iconicity explanation are equally plau-
sible. When a structure is equally motivated by two constraints, however,
credit should be given
a) to the one that is applicable to a broader range of phenomena, not
just the one which seems to have one or two fewer exceptions. (I am
not impressed by Haspelmath's claims that here and there an exception
to the iconicity principle makes the wrong prediction, while
frequency does not. Frequency also has its exceptions. For example,
as Orwell (1957: 150) and others have pointed out, it is not necessarily
always true that the shorter of two forms is the more frequent:
in frequency, verbal sludge like "the American people", at least in political
discourses, swamps out homely expressions like "Americans",
probably by orders of magnitude. In the same way, the occasional
counterexample to a generalization about iconicity is not convincing.)
b) to the one that is shown to be productive, that is, responsible for the
creation of novel forms. Productivity is the real test for psychological
reality.
2. Broader range of phenomena arguments
2.1. Alienable vs. inalienable possession
One of the best apparent pieces of evidence for diagrammatic iconicity is
the contrast between alienable and inalienable possession. Typically,
though not always, the expression of alienable possession is more complex,
with greater linguistic distance between possessor and possessum,
than that of inalienable possession, and this seems to reflect conceptualization
iconically (inalienable possession, at least of body parts, is conceptual
closeness to the point of identity, and you can't get closer than
that). Haspelmath argues, with convincing statistics, that inalienable possession
is more frequently expressed, and that the different degrees of
bulkiness of "my arm" versus "my house" in languages which make an explicit
distinction between the two are nothing but a Zipfian consequence of
this difference in their relative frequency of occurrence. Indeed Haspelmath
makes much of the fact that in some languages like Puluwat, his
frequency test makes the right predictions about morphological bulk,
while iconicity does not. (I first noted this as a problem myself. I would
now hazard the guess that Puluwat, like other Oceanic languages, first
allowed the inversion of alienable possession structures like
Possessor₁ X # Possessum₂ → 2 1
as an occasional stylistic inversion, as is still the case in Tinrin (Osumi
1995: 437–438) or Paamese (Crowley 1995: 384, 386). Iconicity is not
eternal.)
But morphological bulk is not the only means whereby the conceptual
contrast between alienable and inalienable possession can be expressed.
As William James (1890) pointed out:
it is clear that between what a man calls me and what he calls mine, the line is
difficult to draw. In its widest possible sense, a man's self is the sum total of all
that he can call his, not only his body and his psychic powers, but his clothes
and his house, his wife and children, his ancestors and his friends. (James 1890:
291–292)
That is, the contrast between what one is and what one merely has is an
infinitely gradable one, and languages sometimes reflect this gradation in a variety
of (iconic) ways.
2.1.1. Possessor ascension. The phenomenon of possessor ascension or
external possession exists in English as well as in many other languages
(cf. Bally 1926; Hyman 1977; Durie 1987; Clark 1995; Tsunoda 1995;
Payne and Barshi 1999). The contrast is illustrated by pairs like:
She patted his cheek/head/knee. (no possessor ascension)
She patted him on the cheek/head/knee. (possessor ascension)
In English possessor ascension is possible with all (real or imagined) body
parts, and with clothes one is actually wearing at the time the action
occurred, but not with clothes in one's closet, with pets, or with one's
productions or other possessions.
She tapped him on the shoe (OK when worn, not OK when not)
*She tapped him on the gerbil/wallet/article he had just written/car
The operative criterion is not exactly inalienable possession but one very
closely related to it. Possessor ascension may occur when the possessor
can be identified with the possessum. It may seem like arrant chutzpah to
invoke possessor ascension in defence of the idea of conceptual closeness,
since it is the relatively inalienable possessor which can be separated from
the possessum in this construction. But note that ascension is a natural
consequence of identity: Who pats my shoulder is ipso facto patting me
(cf. Hyman 1977: 107; Durie 1987: 388; Tsunoda 1995: 590, among
many). Thus possessor ascension is iconic of conceptual closeness, although
the means for expressing this closeness are different than in cases
like Hua d-zorgeva 'my hair' versus d-gai zu 'my house'. Iconicity can
provide a common explanation (or at least a common characterization)
of these facts, and frequency does not.
2.1.2. Honorific agreement. The phenomenon of honorific agreement is
the tendency for honorifics to appear not only on NP denoting respected
persons, but on NP denoting their possessions, or on predications that are
made concerning these possessed NP.
Sohn (1994) and Tsunoda (1995) provide careful examinations of
honorific agreement in Korean and Japanese, whereby a verb may
mark the respect that the speaker accords to its subject or object. However,
when that subject or object is a NP consisting of a possessor (modifier)
and a possessum (head), and the one respected is the possessor,
as in "the emperor's X", there is a cline of subtle and widely shared
grammaticality judgments depending on where the possessum is on the
hierarchy:
Body part > inherent attribute > clothing worn > (kin) > pet >
production > other
The higher the possessum on the hierarchy, the more likely that possessor
respect agreement as marked on the verb (either by a special verb form or
by a respect suffix) will be acceptable (Tsunoda 1995: 576). Accordingly,
"the emperor's hand" is accorded respect; "the emperor's glasses" less; "the
emperor's horse" still less; "the emperor's book that he wrote" less still;
and "the emperor's car/villa" none at all. Below, the same range of facts is
illustrated from Korean (Sohn 1994: 176):
(1) a. sensayng-nim-uy   phali      khu-sey-yo
       teacher hon.gen.  arm        big hon.pol.
       'The teacher's arms are big.' (arms are inalienably part of the
       teacher)
    b. sensayng-nim-uy   ankyengi   khu-(sey)-yo
       teacher hon.gen.  glasses    big hon.pol.
       'The teacher's glasses are big.'
       (glasses are less likely to bask in the teacher's reflected honor
       and glory)
    c. sensayng-nim-uy   namwuka    khu-(?sey)-yo
       teacher hon.gen.  trees      big hon.pol.
       'The teacher's trees are big.'
       (trees even less than glasses)
    d. sensayng-nim-uy   kaytuli    khu-(sey)-yo
       teacher hon.gen.  dogs       big hon.pol.
       'The teacher's dogs are big.'
       (dogs the teacher owns are less likely to share in his honor than
       trees he has planted, perhaps because they have a will of their
       own)
The iconic principle behind these judgments is this:
The more we tend to identify the possessum with the possessor, the more
we ... show our respect for it, in accordance with our respect for the possessor.
(Tsunoda 1995: 584)
This is exactly in accordance with the conceptual closeness of possessor
and inalienable possessum as marked in physical closeness. The iconicity
hypothesis suggests a common conceptual basis for these facts. The fre-
quency hypothesis proposes none.
2.2. Markedness in general
Haspelmath argues (I think largely convincingly) that local markedness
(Tiersma 1982) or markedness reversal (Andersen 1972) phenomena dem-
onstrate that markedness is not so much an icon of the unexpected as
a consequence of the relative infrequency of the unexpected. There is
nothing inherently marked even about singular or plural, which is why
the unmarked form of stars may be (in some languages) the plural. But
relative markedness is reflected not only in relative bulk (the Zipfian
correlation) but in other ways as well.
Consider one elaboration of markedness reversal, Silverstein's well-known
hierarchy of animacy (1976), and its ability to explain a number
of nominative/ergative case-marking splits. The ergative is marked
relative to the nominative and marks unexpected/infrequent subjects
(typically inanimate nouns, and typically transitive subjects in the past
tense). Conversely, the accusative is marked relative to the nominative
and marks unexpected/infrequent objects (typically animate human
nouns, and objects in the present tense).
Sorani Kurdish happens to be a language in which the accusative
and ergative are marked in exactly the same way, a triumphant demonstration
of Silverstein's hypothesis that markedness alone is at issue in
both nominative/accusative and nominative/ergative oppositions. But in
Kurdish the marked/unmarked distinction (called the oblique/direct distinction
in Western accounts, cf. McCarus 1958) is instantiated not by
greater versus lesser bulk, but by the contrast between agreement suffixes
on the verb (for the unmarked S and O) versus mobile pronominal clitics
which land (roughly) after the first immediate constituent of the VP (for
the marked A and O), cf. Haiman (forthcoming c). There is no difference
in bulk between the agreement suffixes on the one hand and the pronominal
clitics on the other. It is only in their syntactic behaviour that they
differ systematically. To claim with Haspelmath that the unmarked is
simply the most reduced is to miss the obvious generalization that in
Kurdish, as in other languages with split ergativity, it is the nominative
which is the unmarked grammatical relation.
3. Productivity arguments
Haspelmath correctly notes that investigators have had little to say on the
genesis of iconicity: it is merely something that is already there, to be
(alternately) oohed and aahed over or dismissed as epiphenomenal. This
section examines some evidence for the productivity of iconic motivations
for morphological asymmetries. Such evidence is relatively hard, but not
impossible, to find.
It is worth emphasizing before going on that token frequency can make
no predictions about productivity. It can only account for changes that
have already happened. When an utterance is about to be made for the
first time, there is nothing for frequency to work on.
3.1. takete/maluma thought experiments
I suggested that periphrastic causatives like "cause to rise" tend to evoke
some image of magic or telekinesis as opposed to "raise". Haspelmath argues
that there is no need to account for such judgments since the shorter
form is simply the most frequent. One typically raises objects through direct
contact rather than by waving a wand. But consider now a new form
like the verb "disappear" as a transitive verb, which first made its appearance
in English (at least for me) in Joseph Heller's Catch-22:
"I just heard them say they were going to disappear Dunbar."
"Why are they going to disappear him?"
"I don't know."
"It doesn't make sense. It isn't even good grammar. What the hell
does it mean when they disappear someone?" (Heller 1972 [1955]:
376)
It has now become widespread, but I still recall the image it conjured up
when I first read this book in the sixties. Contrasted with "make disappear",
it included as at least part of its meaning the notion of directly killing, as
opposed to "make disappear", which would have suggested some bureaucratic
mediation. Speakers who make judgments like this are basing their
images on a contrast between patterns and performing their computations
without reference to frequency, since the frequency of a new form when it
is first introduced is zero.
3.2. length revisited
As most cognitive linguists maintain, and as Haspelmath (1993: 106–107)
has also contended, human imagery and conceptualization tend to be
based on concrete experience, and are not always the same as what is
viewed by the elite of the scientific community as objective physical
fact. (For example, consider notions of up and down: the sun still rises
and sets, even though we accept Copernicus.) In his discussion here,
Haspelmath seems to retreat from this sensible position: "an entity" is
trisyllabic, although "things" are monosyllabic. Hence length does not
correspond to conceptual complexity. To Haspelmath (as to Leibniz), an
entity may have seemed conceptually prior to things and people (and
how fortunate at least for Leibniz that ens is monosyllabic in Latin). Humans
in general simply do not seem to operate in this manner: before we
make abstractions about entities, we are at home with things and people
(cf. Wierzbicka 1972, who boldly disregarded both the thought of Leibniz
and the morphology of English in insisting that "someone" and "something"
are mutually independent semantic primitives).
In the same way, it is, if not ridiculous, at least highly unlikely, that hu-
man beings begin their thinking with a priori dimensions of absolute
space, time, colour, and morality (among them length), which they then
populate with judgments like long versus short, good versus
bad, green versus red and so forth. Rather, the conceptual dimen-
sions like length, width, time and morality and personications
like life, death, beauty and justice come into being (if they do
so at all) only after scores of these judgments are made and people have
reflected on them. (It is satisfying, if ultimately irrelevant, that modern
physics now seems to agree with this folklore to the extent that space,
rather than preexisting, is thought to be created by the objects within it.)
This is why in every language I have ever heard of, nominalizations like
length are systematically more complex than judgments like long from
which they derive (again, we can and should take occasional exceptions
like beauty and (German) Tod 'death' in stride). And that is also why
there are languages (like Hua) in which words like "death", "justice"
and "beauty" do not exist at all, and therefore have a frequency of zero.
We can observe the generation of the verb/nominalization distinction
in the recorded history of one language (Romanian) and in the tentatively
reconstructible prehistory of another (Khmer). In both languages,
inherited phonetic material was exapted (essentially from the careful
pronunciation of a verb form) to create a novel and productive
derivational nominalizing suffix (Romanian -re from the inherited
infinitive) or infix (Khmer awm(n)- from an inherited unstable anacrusic
syllable in sesquisyllabic roots) to form new words (Haiman 2003).
Considerations of frequency will not explain why this recycled material
was assigned the novel task of marking nominalization in both languages
(and also marking causativity in Khmer). The iconic principle that "more
form is more meaning" can do so naturally.
3.3. Watkins' Law
It is not only true that the 3sg. form in the indicative or the 2sg. in
the imperative are typically zero, facts which may or may not be
accounted for by Zipfian reduction. It is more interesting to observe
that in paradigmatic restructurings, 3sg. is often treated as if it were
zero, even when it isn't (Watkins 1962: 16; Haiman and Benincà 1992: 89;
Bybee 1985: 55). So we are faced not with actually reduced forms but
with the reinterpretation of non-reduced forms. Zipf may account for the
actual erosion of a frequently occurring form, but not for the
perception of a non-reduced form as if it were reduced.
3.4. Full nouns and anaphoric expressions
On the relative abbreviation of anaphoric forms, I would have been
tempted to accept Haspelmath's position, but now I am not so sure.
Consider the well-known e-mail joke variously told about various political
leaders:
George Bush and his chauffeur are out for a drive in the countryside. Suddenly a
pig darts across the road in front of the car and is killed. Bush sends his driver to
the farmhouse to apologize and make amends (insert a more plausible politician if
you wish) and settles down to wait. After more than an hour, the chauffeur
re-emerges from the farmhouse. In his left hand he holds a Havana cigar; in his right,
a bottle of champagne. His shirt is undone and covered with lipstick.
"What happened?"
"Well, I got the cigar from the farmer, and the champagne from his wife, and
for the last hour, their daughter has been making passionate love to me."
"What did you tell them?"
"Just that I'm George Bush's driver, and I've killed the pig."
The linguistic judgment that needs to be accounted for is that this joke
can only be written, and not told. The reason, as everyone seems to agree,
is that what the chauffeur said can only be pronounced
. . . I've killed the PIG
while what the farmer's family responded to could only be
. . . I've KILLED the pig.
The contrast, of course, is between the uses of the expression "the pig"
as a full noun phrase (the chauffeur's speech) and as an epithet (as in
the farmer's understanding). Epithets need not be monosyllabic. Other
possibilities include the cocksucking bastard, that idiotic asshole, the
cross-eyed son of a bitch, or virtually anything else one might want to
use to characterize George Bush or anyone else. Thus there need be no
contrast in morphological bulk between an epithet and a full NP. What is
common to all epithets, however, is that in addition to incorporating
any amount of information or speaker's attitude about a referent, they
also function as anaphoric expressions, and refer back to some
antecedent, wherein that referent is first named. The totally iconic
intuition that underlies the grammaticality judgment above is that
epithets, as anaphors, are copies of an original referring expression,
and like copies everywhere, paler than their originals. This pallor is
indirectly reflected in the locus of sentence stress on the verb rather
than its object. Whether relative pallor via destressing is the same
fact as the relative abbreviation of anaphoric
pronouns in general is perhaps open to debate, but here Occam is on the
side of iconicity.
4. Conclusions
Both complexity and elaboration may arise more or less accidentally:
Frequency may lead to erosion (thus Zipf 1935), and appendix-like
quirky vestigial residues may result in unnecessary elaboration in
languages as well as in biological organisms (Mayr 2000, Dahl 2005,
Kuteva to appear). But there are also multiple non-accidental
motivations for both compact and elaborated expression. Among the
functionally motivated bases for compactness are brutality (e.g.,
four-letter words) and esthetic power (e.g., haiku). Among other
motivations for elaboration are
a) various mechanisms of phonetic bulking which prevent total loss of
both lexical and grammatical categories (Bloomfield 1933: 395–396;
Bolinger 1975: 438; Matisoff 1982: 74–76; Heath 1998),
b) high register and/or politeness (Geertz 1955; Aoki and Okamoto
1988),
c) disambiguation via diacritics (Haiman 1985: 60–67),
d) the iconic representation of conceptual symmetry (Haiman 1988) and
e) esthetic appeal (ritual elaboration, Haiman to appear a,b).
Doubtless there are also others. In approaching human language, the
rational exuberance of Chappell and Thompson 1992 (who uncover no
fewer than seven different motivations for the absence of qualifying de in
Chinese possessive constructions), or of Hugo Schuchardt 1885: 23 ("I
perceive here the motley interplay of innumerable drives") seems to do
more empirical justice to its subject than Haspelmath's reductionism.
Each of these motivations is most clearly attested ceteris paribus, that
is, when they operate unopposed. Haspelmath makes a strong case for
frequency as the sole possible motivation for differing expressions of
one conceptual dimension (transitive versus intransitive): frequently or
typically spontaneous events like "freezing" will typically occur as
root intransitives, and form their marked transitive congeners via an
extra causative morpheme. Conversely, events which are typically seen to
be brought about by external agents, like "breaking", will typically
occur as transitive verbs, and form their intransitive congeners via an
extra mediopassive or reflexive morpheme. (Note that here again, human
conceptualization is not the same as objective physical fact. To a
physicist, freezing, melting and boiling are brought about by external
agency no less than breaking.) This case is exactly analogous to the
contrast between typically
"introverted" and typically "extroverted" transitive actions discussed at
considerable length in Haiman 1985: actions typically performed upon
oneself occur in the typical case with unexpressed or reduced objects
(e.g., I shave: middle voice) while actions more typically performed on
others occur when they are reflexive with a separate object noun phrase
(e.g., I kicked myself: reflexive voice). Armed with this clear example
of economic motivation, Haspelmath attempts to eliminate iconicity in
other cases as well, with less success.
I think we must acknowledge that iconicity is clearly one possible
motivation for the asymmetric realization of referential asymmetry. I have
tried to argue here that it is preferable to frequency when it accounts
for a broader range of related phenomena, and when it seems to be
productive in generating new forms. Moreover, iconicity is the only possible
motivation for the even more widespread if not universal manifestations
of referential symmetry (in distributivity, comparison, reciprocity,
coordinate conjunction and so on) that have been discussed elsewhere in the
literature (Lakoff and Peters 1969; Haiman 1980, 1985, 1988).
I believe that this is puzzling: Iconicity seems at least at present to
offer no proven cognitive benefits (here I am reluctantly in disagreement
with Givón 1985: 189; cf. Bellugi and Klima 1976; Bonvillian et al. 1997;
Tomasello et al. 1999). If we grant this, it is unexplained why it should
occur at all. Given the fact that it disappears so rapidly under
conventionalization (Bloom 1979), moreover, it cannot possibly be
regarded as a vestigial feature, whether from proto-language, from Old
English, or even from last week. In other words, for iconicity to appear
in language at all, it has to be productive. Now it seems that neither of
the traditional motivations for linguistic form (economy of effort for the
benefit of the speaker versus clarity for the benefit of the hearer) can
account for it.
I very tentatively propose that iconicity is generated over and over not
only for purely cognitive reasons, but because speakers take a purely
esthetic pleasure in making the form fit the sense. Some purely creative
drive is necessary to account for the ultimate genesis of linguistic material
(which sound change, analogy and grammaticalization merely erode and
tidy up), but it is frequently overwhelmed by the other two (and perhaps
others). Yet a creative esthetic drive compounded of imitation and
ambition is well attested in human behavior generally and even in language
it is not only inferable on a priori grounds. Indeed it is responsible for the
creation of non-referential non-iconic symmetry, which may include not
only twin forms like imam (Pott 1862; Marchand 1960), but even
nuts-and-bolts phenomena like grammatical agreement (Ferguson and
Barlow 1988: 17; Haiman to appear a,b). A creative esthetic drive may
even be responsible for the spontaneous creation of expressive
morphemes like ideophones, as noted long ago by Hermann Paul 1880,
Ch. 9. But that is a subject that deserves another treatment.
Received 28 February 2007 Macalester College, USA
Notes
* Author's email address: <Haiman@Macalester.edu>. Contact address: Linguistics
Search, Macalester College, 1600 Grand Avenue, St. Paul, MN 55105, USA.
References
Andersen, Henning
1972 Diphthongization. Language 48, 11–50.
Aoki, Haruo, and Okamoto, S.
1988 Rules for Conversational Rituals in Japanese. Tokyo: Taishukan.
Bally, Charles
1926 [1995] The expression of the concepts of personal domain and indivisibility in
Indo-European languages. In Chappell, H., and W. McGregor (eds.), 31–61.
Bellugi, Ursula, and Klima, Edward
1976 Two faces of the sign: Iconic and abstract. In Harnad, S. et al. (eds.), The
Origins and Evolution of Language and Speech. New York: N.Y. Academy
of Sciences, 514–538.
Bloom, H.
1979 Language Creation in the Manual Modality. Honors Thesis, University of
Chicago.
Bloomfield, Leonard
1933 Language. New York: Holt.
Bolinger, Dwight
1975 Aspects of Language, 2nd ed. New York: Harcourt: Brace.
Bonvillian, John, A. M. Garber, and S. B. Dell
1997 Language origin accounts. First-Language 17, 219–239.
Bybee, Joan
1985 Morphology. Amsterdam: Benjamins.
Chappell, Hilary, and McGregor, William (eds.)
1995 The Grammar of Inalienability. Berlin/New York: Mouton de Gruyter.
Chappell, H., and Thompson, S.
1992 Semantics and pragmatics of associative de in Chinese discourse. Cahiers de
Linguistique Asie Orientale 21, 199–229.
Clark, M.
1995 Where do you feel? Stative verbs and body-part terms in Mainland
Southeast Asia. In Chappell, H., and W. McGregor (eds.), 529–563.
Crowley, Terry
1995 Inalienable possession in Paamese. In Chappell, H., and W. McGregor
(eds.), 383–432.
Dahl, Östen
2005 Origins and Maintenance of Linguistic Complexity. Amsterdam: Benjamins.
Durie, Mark
1987 Grammatical relations in Acehnese. Studies in Language 11, 365–399.
Ferguson, Charles, and Barlow, Michael
1988 Introduction to Agreement in Natural Language. Stanford: CSLI.
Geertz, Clifford
1955 Linguistic etiquette. The Religion of Java. Glencoe: The Free Press.
Givón, Talmy
1985 Iconicity, isomorphism, and non-arbitrary coding in syntax. In Haiman,
John (ed.) Iconicity in syntax, 187–219. Amsterdam: Benjamins.
Haiman, John
1980 The iconicity of grammar: Isomorphism and motivation. Language 56,
515–540.
1985 Natural Syntax. Cambridge: CUP.
1988 Incorporation, parallelism, and focus. In Hammond, Michael, Edith
Moravcsik, Jessica Wirth (eds.), Studies in Syntactic Typology. Amsterdam:
Benjamins, 303–320.
2003 Explaining inxation. In Polinsky, M., and J. Moore (eds.), The Nature of
Explanation in Linguistic Theory. Stanford: CSLI, 105–120.
forthc. a Competing motivations. In Song, J. (ed.), Handbook of Typology. Oxford:
OUP.
forthc. b Decorative imagery in ritual elaboration. In Noonan, M. et al. (eds.), For-
mulaic Language. Amsterdam: Benjamins.
forthc. c Ergativity in Kurdish.
Haiman, John, and Benincà, P.
1992 The Rhaeto-Romance Languages. London: Routledge.
Haspelmath, Martin
1993 More on the typology of inchoative/causative verb alternations. In Comrie,
Bernard, and Maria Polinsky (eds.), Causatives and Transitivity.
Amsterdam: Benjamins, 87–120.
Heath, Jeffrey
1998 Hermit crabs. Language 74, 728–759.
Heller, Joseph
1972 [1955] Catch-22. New York: Dell.
Hinton, Leanne
1982 How to cause in Mixtec. BLS 8, 354–363.
Hyman, Larry
1977 On the syntax of body parts in Haya. In Haya Grammatical Structure,
(Southern California Occasional Papers in Linguistics 6). Los Angeles:
University of Southern California, 99–117.
James, William
1890 Principles of Psychology. NY: Dover reprints.
Kuteva, Tania
forthc. On the frills in language. Unpublished manuscript.
Lakoff, George, and Peters, S.
1969 Phrasal conjunction and symmetric predicates. In Reibel, D., and S. Schane
(eds.), Modern Studies in English. NJ: Englewood Cliffs, 113–142.
Marchand, Hans
1960 Categories and Types in English Word Formation. Heidelberg: Carl Winter.
Matisoff, James
1982 The Grammar of Lahu. Berkeley: University of California Press.
Mayr, Ernst
2000 What Evolution Is. New York: Basic Books.
McCarus, Ernest
1958 A Kurdish Grammar: Descriptive Analysis of the Kurdish of Sulaimaniya,
Iraq. New York: American Council of Learned Societies.
Orwell, George
1957 Politics and the English language. In: Inside the whale and other essays,
143–157. Harmondsworth: Penguin.
Osumi, M.
1995 Body parts in Tinrin. In Chappell, H., and W. McGregor (eds.), 433–462.
Paul, Hermann
1880 Prinzipien der Sprachgeschichte. Tübingen: Max Niemeyer.
Payne, Doris, and Barshi, Emmanuel
1999 External Possession. Amsterdam: Benjamins.
Pott, August
1862 Die Doppelung. Lemgo: Detmold.
Schuchardt, Hugo
1885 Gegen die Junggrammatiker. Berlin: Robert Oppenheimer.
Silverstein, Michael
1976 Hierarchy of features and ergativity. In Dixon, R. (ed.), Grammatical
Categories in Australian Languages. Canberra: Australian Institute of
Aboriginal Studies, 112–171.
Sohn, Ho-Min.
1994 Korean. London: Routledge.
Tiersma, Peter
1982 Local and general markedness. Language 58, 832–849.
Tomasello, Michael, Tricia Striano, and Philippe Rochat
1999 Do young children use objects as symbols? British Journal of Developmental
Psychology 17, 563–584.
Tsunoda, Tasaku
1995 The possession cline in Japanese and other languages. In Chappell, H., and
W. McGregor (eds.), 565–630.
Watkins, Calvert
1962 Indo-European Origins of the Celtic Verb. Dublin: Institute for Advanced
Studies.
Wierzbicka, Anna
1972 Semantic Primitives. Frankfurt: Athenäum.
Zipf, George Kingsley
1935 The Psychobiology of Language. Boston: Houghton-Mifflin.
On iconicity of distance
WILLIAM CROFT*
Abstract
Haspelmath argues that certain universal asymmetries in linguistic distance
previously analyzed as examples of iconicity of distance are better analyzed
as the result of frequency. It is argued here that Haspelmath's arguments
can be countered by an advocate of iconicity of distance as an explanatory
factor. Iconicity of distance is not different in kind from iconicity of
contiguity, which Haspelmath endorses. Haspelmath's argument works only if
one takes relative frequency instead of absolute frequency; yet it is
generally accepted that economy effects are the result of absolute frequency.
The empirical frequency data that Haspelmath presents is inconclusive.
However, Haspelmath presents data that suggest that an iconicity of
distance analysis, at least for possession constructions, must be revised as
iconicity of length. Finally, criteria are offered to differentiate the
effects of economy, iconicity of distance/length, and iconicity of
independence.
Keywords: frequency; iconicity; economy; distance.
Haspelmath's article challenges explanations based on iconic motivation
for three categories of linguistic phenomena, quantity, complexity and
cohesion (distance). For all three of these phenomena, Haspelmath argues
that an explanation in terms of economic motivation, that is, based on
differences in frequency, is superior to the iconicity explanation that
has been offered in the literature. Haspelmath does not deny that
iconicity plays a major role in determining linguistic structure; his critique
does not touch the most important manifestations of iconicity in
language, namely paradigmatic isomorphism, syntagmatic isomorphism,
and contiguity.
Cognitive Linguistics 19–1 (2008), 49–57
DOI 10.1515/COG.2008.003
0936–5907/08/0019–0049
© Walter de Gruyter
I believe that Haspelmath is correct in his arguments that economic
motivation is a superior explanation for the quantity and complexity
phenomena he discusses. I did not consider either of these to be examples
of iconic motivation in Typology and Universals (Croft 2003). Instead,
length and complexity are reflexes of typological markedness (Greenberg
1966). Typological markedness, or at least the formal asymmetries in
expression that Haspelmath discusses, is economically motivated, as
Greenberg argues (Haspelmath cites Greenberg's frequency-based explanations
in both cases).
Iconicity of cohesion is another matter. An explanation in terms of
iconic motivation can be largely defended, and an explanation in terms
of economic motivation appears to be unsatisfactory. Nevertheless,
Haspelmath's article helps us to tease apart the relationship between iconicity
and economy in motivating linguistic universals.
Haspelmath divides Haiman's iconicity of distance into two, contiguity
(see above) and cohesion. Haspelmath distinguishes contiguity from the
scale of iconicity of distance for the grammatical relationship of X to Y
in (1) (from Haiman 1983: 783):
(1) a. X A Y (an additional word is used to express the relationship
between X and Y)
b. X Y (no additional word is used to express the relationship
between X and Y)
c. X-Y (X and Y are morphologically bound)
d. Z (a portmanteau expression of the concepts denoted by X and Y)
Haspelmath argues that the scale in (1) does not correspond to
distance, because (b) and (c) do not literally differ in distance, and distance
is not really applicable to (d) (§5). But it is not clear to me that the notion
of distance is inappropriate for the distinctions in (1). The presence vs.
absence of a third morpheme can be fairly straightforwardly interpreted
in terms of linguistic distance (but see below). The contrast between
morphological freedom (juxtaposition) and boundedness is intended to
represent both prosodic and segmental differences in behavior that do
represent phenomena that can reasonably be called distance, even in a
strict temporal sense. Prosodically, morphologically free elements may
occur in different intonation units, and be interrupted by pause.
Segmentally, the articulatory gestures for the forms X and Y may overlap (in
assimilation and other segmental effects), which represents a certain
temporal overlap of the formal expression of X and Y. Finally, portmanteau
expression represents complete temporal overlap of the formal expression
of X and Y: all of Z expresses both X and Y. Thus, it is not unreasonable
to consider the scale in (1) to be an extension of iconicity of contiguity,
which Haspelmath accepts as genuinely iconically motivated.
The more important question, however, is whether the phenomena that
Haspelmath discusses really are better explained in terms of economic
rather than iconic motivation. Haspelmath discusses four examples:
attributive possession constructions, causative constructions, coordinating
constructions, and complement clause constructions. In the case of
possessive constructions, he gives frequency data as evidence for a
frequency-based explanation, and offers other grammatical arguments to support a
frequency-based explanation over an iconic explanation. In the case of
the other three constructions, however, he offers little or no frequency
data and few other arguments. It is only for possessives that Haspelmath
has a well developed argument against the iconicity explanation and in
favor of the frequency explanation. I will therefore focus on
Haspelmath's arguments regarding possessive constructions.
The relevant typological universal for possessives is that if there is a
difference in linguistic distance (cohesion) between the alienable and
inalienable constructions, the inalienable construction will always be lower
on the distance scale in (1) than the corresponding alienable construction.
Haiman explains this universal by iconicity of distance.
Haspelmath's frequency explanation is based on the relative frequency
of the possessed to the unpossessed form of a noun.¹ In text counts from
English and Spanish, Haspelmath demonstrates that the relative
frequency of body part terms and kinship terms in the possessed form
compared to the unpossessed form is greater than the relative frequency
of alienable nouns in the possessed form compared to the unpossessed
form. Haspelmath notes that inalienable nouns in the unpossessed
construction are crosslinguistically sometimes overtly coded (see his
Koyukon examples), and that this fact can be explained in terms of frequency.
In fact, Haspelmath's text counts actually indicate that even kinship
terms and body part terms occur more frequently in the unpossessed
construction.
Thus, an economy explanation only works if one uses relative
frequency of unpossessed vs. possessed inalienable nouns compared to the
relative frequency of unpossessed vs. possessed alienable nouns. But all
other examples of typological markedness (frequency-based differences
in the structural expression of concepts) are of absolute frequency, not
relative frequency. Many such examples are given in Greenberg (1966)
and Bybee (1985); see also Croft (2003: 151, 154). In the one study
that compares relative and absolute frequency with respect to phenomena
attributed to economy, namely morphological irregularity in Russian
nominal paradigms (Corbett et al. 2001), absolute frequency was a
strongly significant factor, but relative frequency was only weakly
significant (see Croft 2003: 206–207).
It is not an accident that absolute frequency has been found to be the
causal factor for economically motivated linguistic patterns. The
theoretical explanation for economy (e.g., Bybee 1985) requires absolute
frequency. Economy effects are due to degree of entrenchment of linguistic
forms (morphological forms or constructions such as the possessive) in
the mental representation of linguistic knowledge. Entrenchment leads
to routinization of the production of the form by a speaker, which in
turn brings about reduction of that form. But entrenchment is a result
of exposure to the number of tokens of the linguistic form; that is,
entrenchment is a function of the absolute frequencies of forms, not
relative frequencies.
Haspelmath also appeals to predictability to account for economy
(§2.2). Unfortunately, predictability is a vague concept: any
mathematical relationship can be construed as predictable. But the most natural
psychological interpretation of predictability, as what the speaker would
be expected to produce, also relies on absolute frequency. If the
possession construction is reduced for inalienable nouns compared to
alienable nouns because inalienable nouns are more predictable in the
possession construction, this means that one would expect the absolute
frequency of inalienable nouns in the possession construction to be
greater than the absolute frequency of alienable nouns in the possession
construction (or perhaps greater than the absolute frequency of
inalienable nouns in the unpossessed construction). In other words, relative
frequency would not be expected to lead to economy effects such as
reduction.²
Furthermore, the iconicity of distance hypothesis is not about the rela-
tionship of the possessed construction to the unpossessed construction.
The iconicity of distance hypothesis compares the relationship of two pos-
sessed constructions, the inalienable construction and the alienable con-
struction. The iconicity of distance hypothesis makes no claim about the
unpossessed construction, or about the relationship of the unpossessed
construction to the possessed construction. This evidence is irrelevant to
the iconicity account. A genuine comparison of an iconicity account and
an economy account for distance/cohesion should compare the absolute
frequency of the inalienable possession construction to that of the alien-
able possession construction.
Haspelmath's figures appear to suggest that comparing these two
absolute frequencies does support an economic explanation. In English, there
are 12737 tokens of body part and kinship terms in the possessed
construction, and only 2967 tokens of alienable nouns in the possessed
construction. Unfortunately, the data that Haspelmath presents represents
only a subset of both inalienable and alienable nouns. We cannot be
certain if the frequency difference will remain the same once the vast number
of inalienable nouns is included. Thus, Haspelmath's frequency data is
inconclusive. Also, an economy explanation would make a different
prediction for those languages in which kinship terms are not found in the
inalienable possession construction (e.g., Kosraean [Kusaiean]). In such
a language, the alienable possession construction tokens would probably
outnumber the inalienable ones (this is what is implied by the token
frequency data for English and Spanish offered by Haspelmath). In that
case, an economy account would predict that the alienable possession
construction would be the more cohesive one. This would be an incorrect
empirical prediction.
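The difference between the two measures discussed above can be made
concrete with a short sketch in Python. The two possessed-token counts for
English are the figures cited in the text; every unpossessed count, and all
counts in the second scenario, are invented placeholders chosen only to
mirror the qualitative situation described here (they are not Haspelmath's
data). The class and function names are likewise illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class NounClass:
    label: str
    possessed: int    # tokens in the possessed construction
    unpossessed: int  # tokens in the unpossessed construction

    def absolute(self) -> int:
        """Absolute frequency: raw token count in the possessed construction."""
        return self.possessed

    def relative(self) -> float:
        """Relative frequency: share of this class's tokens that are possessed."""
        return self.possessed / (self.possessed + self.unpossessed)


def favored(inal: NounClass, al: NounClass, measure: str) -> str:
    """Return the label of the class with the higher value on the given measure."""
    a, b = getattr(inal, measure)(), getattr(al, measure)()
    return inal.label if a > b else al.label


# English sample: possessed counts are the figures cited in the text;
# unpossessed counts are HYPOTHETICAL placeholders.
english_inal = NounClass("inalienable", possessed=12737, unpossessed=20000)
english_al = NounClass("alienable", possessed=2967, unpossessed=60000)

# A language with a small inalienable class (all counts HYPOTHETICAL).
small_inal = NounClass("inalienable", possessed=1000, unpossessed=1500)
small_al = NounClass("alienable", possessed=5000, unpossessed=95000)

for name, inal, al in [("English sample", english_inal, english_al),
                       ("small inalienable class", small_inal, small_al)]:
    for measure in ("absolute", "relative"):
        print(f"{name}: {measure} frequency favors {favored(inal, al, measure)}")
```

Under these assumed counts the two measures agree for the English sample
but diverge in the second scenario, which is the situation in which an
economy account based on absolute frequency would predict the alienable
construction to be the more cohesive one.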
For the possessive construction, Haspelmath argues that it is the
(relative) frequency of the construction that matters, not the individual words:
". . . the [frequent] alienable nouns may well occur in a possessive
construction more often than the inalienable nouns. However, the percentage
of possessed occurrences of inalienable nouns will always be
significantly higher than the corresponding percentage of [infrequent]
inalienable nouns." But in every case of reduced alternative constructions that
has been investigated, what determines the reduction of the linguistic
expression is the token frequency of specific words in the construction, not
the construction itself (e.g., Bybee and Slobin 1982; Bybee and Thompson
1997; Bybee and Scheibman 1999; Bybee 2001). If an individual word has
a low token frequency, it tends to be regularized. This latter phenomenon
cannot be explained by an economy account based on construction
frequency. The fact that low token frequency inalienable nouns still take
the more cohesive possessive construction must be due to some other
factor. That other factor is iconicity of distance.
Haspelmath proposes that a frequency account can explain the fact
that the inalienable pronominal possessor expression is phonologically
shorter than the alienable pronominal possessor expression, and an icon-
icity of distance account cannot. The greater phonological fusion and
reduction of the inalienable pronominal possessor is almost certainly due
to the fact that the inalienable possessive construction is more highly
grammaticalized than the alienable possession construction. Higher fre-
quency plays a major role in grammaticalization (Bybee 2003). Once a
linguistic expression is extended to a grammatical function, it increases
in frequency, and the increase in frequency leads to erosion of the gram-
maticalizing morphemes in the expression.
However, the relevant frequency contrast in grammaticalization is be-
tween the grammaticalizing construction in a grammatical function and
its historical antecedent in its non-grammatical function. This is not
what the iconicity of distance account intends to explain. What we observe
in possessive constructions is that a less grammaticalized pronominal
construction, which is presumably newer, has entered the possessive domain,
and is competing with the more grammaticalized possessive construction.
The result of this competition in many languages is a semantic division of
labor, a common result of competing variants (Croft 2000: 176–177).
The semantic division is always such that the more cohesive construction
is used for more inalienable possession relations. The frequency effects of
grammaticalization cannot explain this fact. Iconicity of distance explains
this fact.³
The remaining argument that Haspelmath raises against the iconicity
account is the only one that is a serious challenge to iconicity of distance
in possessive constructions. Haspelmath argues that iconicity of distance
requires the extra morpheme in alienable possessive constructions to oc-
cur between the possessor and possessum in the construction, yet it some-
times does not do so. But the iconicity explanation can be reformulated to
accommodate this phenomenon.
What is required is a reformulation of iconicity of distance with a
different measure of linguistic distance than simple temporal distance.
Haiman himself allows a nontemporal measure of linguistic distance
in his analysis of causative constructions: he argues that there is greater
linguistic distance when the causee is expressed as an indirect object
than as a direct object (Haiman 1983: 792). Haiman discusses the
Puluwat example (as Haspelmath notes) and proposes an alternative
formulation (which Haspelmath does not discuss) in terms of
phonological bulk (Haiman 1983: 795). Haiman's reformulation is that a
conceptually more distant relation is encoded by a linguistically bulkier
expression. This formulation changes the iconic mapping from distance
between X and Y to length of the linguistic form used to code the
relation between X and Y. This reformulation would allow us to distinguish
between iconicity of distance proper, based on temporal distance of forms
in the linguistic signal, and an iconicity of temporal length of the
relational expression. An iconicity of length account for possessive
constructions is superior to both the original iconicity of distance
hypothesis and the economy hypothesis.
Haspelmath's arguments against iconicity of distance or length do not
hold up, at least for possessive constructions. The frequency data that
Haspelmath invokes in favor of an economy account are inconclusive or
irrelevant. Haspelmath appeals to relative frequency of constructions,
whereas economy is due to absolute token frequencies of lexical forms (in
either morphological paradigms or grammatical constructions). Where
Haspelmath offers grammatical phenomena that really are economically
motivated (grammaticalization), they are different phenomena from the
ones that are hypothesized to be iconically motivated by Haiman. Where
frequency and semantics conflict (low frequency forms, languages with
smaller inalienable classes), or frequency makes no prediction (division of
labor in grammaticalization), semantics explains what is observed, and the
semantic differences are iconically motivated.
Nevertheless, some grammatical phenomena are undoubtedly econom-
ically motivated. Nor should we rule out the possibility of multiple moti-
vations, which Haspelmath appears to do. For example, Haspelmath
suggests (though with little data) that certain patterns in complex sen-
tence cohesion that have been taken to be conceptually close are also
higher in frequency. Haspelmath concludes that iconicity therefore has
no role to play in explaining linguistic differences in complex sentence
constructions. But there is no a priori reason to assume that only one
functional motivation applies for every linguistic construction (Croft
2003: 64–69). For example, natural coordination (such as brother and
sister; see Haspelmath, §6.3) may be more frequent than man and snake,
but it is also conceptually more of a unit (Wierzbicka 1980). What is re-
ally necessary is a careful examination of what motivations appear to be
operating to explain each typological universal, and the linguistic predic-
tions each makes.
For instance, in the analysis of complex sentences, one must distinguish
between iconicity of distance and iconicity of independence: concepts
having less conceptual independence will be linguistically no more independent
than concepts having greater conceptual independence (Cristofaro
2003; Croft 2003: 213–219). In discussing Cristofaro's (2003) typological
universals of subordination, I suggest that deranking of clauses
(Stassen 1985) is economically motivated, because deranking involves
asymmetries in overt coding and reduced behavioral potential as well
as differences in token frequency. However, differences in linguistic integration
of subordinate clauses are iconically motivated by conceptual independence,
because semantic integration and temporal dependence of the
subordinate clause (the factors that determine degree of linguistic independence)
are symptoms of conceptual independence, not conceptual
distance.
Table 1 indicates salient properties of different types of functional
explanations for linguistic cohesion.
Some properties are found under more than one explanation: the same
phenomenon may have alternative explanations. For example, coding
length can be explained either by economy or iconicity of distance/
length. Haspelmath appears to assume that coding length is only explainable
in terms of economy (§6), but this is not necessarily the case. One
must examine all of the properties in Table 1 in order to identify which
motivation is operating. If one finds the union of properties from more
than one motivation, then it is likely that multiple motivation is the best
explanation for the linguistic phenomenon.
Received 30 April 2007
University of New Mexico, Albuquerque, USA
Notes
1. Haspelmath incorrectly describes the distinction between absolute and relative frequency
in §4.2. There he describes relative frequency as a comparison of the absolute frequencies
of paradigmatic alternatives such as singular book and plural books. But this is simply
comparison of absolute frequencies. Relative frequency is a proportional frequency,
measured by comparing percentages of one form relative to another form. That is,
relative frequency is a second-order comparison of sets of absolute frequencies (see also
Corbett et al. 2001: 202–203).
2. An anonymous referee points out that some recent work in cognitive linguistics (e.g.,
Gries et al. 2005 and works cited therein) makes use of relative frequency. However,
this work uses relative frequency to tease apart subtle semantic distinctions between con-
structions and to better identify individual words semantically most closely associated
with a construction's meaning. It does not claim to motivate economy effects such as
phonological reduction.
3. Haspelmath also cites the relative length of direct vs. indirect causation markers as parallel
to the relative length of inalienable vs. alienable pronominal possessors. The same
counterargument applies to the causatives as well. It is also possible that in causation,
the difference between direct and indirect causation reflects the conceptual distance between
causer and causee, i.e., the arguments of the causative construction. In that case,
the arguments are X and Y and the causative marker is A in Haiman's formula for linguistic
distance in (1a), and the differences in length of direct vs. indirect causation
markers are exactly predicted by iconicity of distance.

Table 1. Major properties of different types of functional motivation

Economy | Iconicity of Distance or Length | Iconicity of Independence
Asymmetry in coding length | Asymmetry in coding length |
 | Asymmetry in morphological boundedness (e.g., [X Y] vs. [X-Y] or [X A Y] vs. [X A-Y]) | Asymmetry in morphological boundedness
Asymmetry in behavioral potential (Croft 2003: 95–99) | | Asymmetry in syntactic potential
Independent phenomenon from rest of construction (e.g., contrasts between [X A] and [X] regardless of presence or absence of Y) | Involves coding of relation between X and Y in construction | Involves coding of relation between X and Y in construction
Motivated by absolute lexical token frequency | Motivated by conceptual distance | Motivated by conceptual independence (which also involves conceptual distance)
References
Bybee, Joan L.
1985 Morphology: A Study into the Relation between Meaning and Form. Amster-
dam: John Benjamins.
2001 Frequency effects on French liaison. In Bybee, Joan L., and Paul Hopper
(eds.), Frequency and the Emergence of Linguistic Structure. Amsterdam:
John Benjamins, 337–359.
2003 Mechanisms of change in grammaticalization: the role of frequency. In Joseph,
Brian, and Richard Janda (eds.), Handbook of Historical Linguistics.
Oxford: Blackwell, 602–623.
Bybee, Joan L. and Joanne Scheibman
1999 The effect of usage on degrees of constituency: the reduction of don't in
English. Linguistics 37, 575–596.
Bybee, Joan L. and Dan I. Slobin
1982 Rules and schemas in the development and use of the English past tense.
Language 58, 265–289.
Bybee, Joan L. and Sandra A. Thompson
1997 Three frequency effects in syntax. In Juge, Matthew L., and Jeri Moxley
(eds.), Proceedings of the 23rd Annual Meeting of the Berkeley Linguistics
Society. Berkeley: Berkeley Linguistics Society, 378–388.
Corbett, Greville G., Andrew Hippisley, Dunstan Brown and Paul Marriott
2001 Frequency, regularity and the paradigm: a perspective from Russian on a
complex relation. In Bybee, Joan L., and Paul Hopper (eds.), Frequency
and the Emergence of Linguistic Structure. Amsterdam: John Benjamins,
201–226.
Cristofaro, Sonia
2003 Subordination. Oxford: Oxford University Press.
Croft, William
2000 Explaining Language Change: An Evolutionary Approach. Harlow, Essex:
Longman.
2003 Typology and Universals, 2nd ed. Cambridge: Cambridge University Press.
Greenberg, Joseph H.
1966 Language Universals, with Special Reference to Feature Hierarchies. (Janua
Linguarum, Series Minor 59.) The Hague: Mouton.
Gries, Stefan Th., Beate Hampe and Doris Schönefeld
2005 Converging evidence: bringing together experimental and corpus data on the
association of verbs and constructions. Cognitive Linguistics 16, 635–677.
Haiman, John
1983 Iconic and economic motivation. Language 59, 781–819.
Stassen, Leon
1985 Comparison and Universal Grammar. Oxford: Basil Blackwell.
Wierzbicka, Anna
1980 Coordination: the semantics of syntactic constructions. Lingua Mentalis:
The Semantics of Natural Language. New York: Academic Press, 223–285.
Reply to Haiman and Croft
MARTIN HASPELMATH*
I am grateful to John Haiman and William Croft for their penetrating cri-
tiques of my claims and for the interesting challenges that they provide
for them. This offers me a chance to clarify and elaborate on some of the
central points of my article. This is an important debate, because iconicity
and frequency are central explanatory concepts in functional and cognitive
linguistics. Even if we do not succeed in resolving the issues, our
understanding will be enhanced by this discussion.
1. How frequency explains grammatical asymmetries
A key presupposition of my paper is that frequency of use implies short
coding because frequent items are more predictable. Croft assumes a
rather different account of the frequency-shortness connection. He claims
that
[e]conomy effects are due to degree of entrenchment of linguistic forms . . . Entrenchment
leads to routinization of the production of the form by a speaker,
which in turn brings about reduction of that form.
This echoes similar remarks in Joan Bybee's work (e.g., Bybee 2001,
2003), but I do not see how such a view can be reconciled with some basic
facts. To be sure, routinization often cooccurs with reduction of form, be-
cause forms that are routinized for the speaker are often also predictable
for the hearer. But in such cases the cause of the reduction is not the
routinization, but the speaker's tendency to save energy when part of
the message is predictable. When a routinized form is not predictable
(e.g., when I dictate my phone number to someone), no reduction occurs.
George Kingsley Zipf saw this correctly from the beginning of his
writings:
In listening to spoken language, we notice that, among other things, the speaker
invariably emphasizes these two: first, what is new or unexpected to the hearer;
second, what the hearer desires [for the speaker] to make especially clear . . . But
that which is unexpected, unusual, or unfamiliar to the hearer is, by definition, the
seldom. (Zipf 1929: 5)
Cognitive Linguistics 19–1 (2008), 59–66
DOI 10.1515/COG.2008.004
0936–5907/08/0019–0059
© Walter de Gruyter
Thus, frequency-induced reduction is to a large extent a hearer-based
phenomenon and is not due to routinization, but to predictability. It
should also be noted that predictability need not be due to linguistic fre-
quency. Stereotypical situations allow massive reduction, simply because
the context makes the utterance content easy to predict. In grammar, too,
some reductions (e.g., lack of stress on anaphoric epithets, discussed by
Haiman in his §2.4) are due to referential predictability from the context,
not to high frequency of use.
A related issue is productivity, which, as Haiman rightly observes, is
the real test for psychological reality. However, an explanatory factor
like frequency of use is not meant to be psychologically real in the way
in which cognitive schemas or generative rules are sometimes said to be
psychologically real. Frequency effects in processing (cf. Ellis 2002) affect
language structure through speakers' innovations that ultimately lead to
language change (cf. Bybee 2007). This type of explanation is thus akin
to adaptive explanation in biology (cf. Haspelmath 1999; Croft 2000; Blevins
2004), and I take this to be the standard approach to explaining
universals in current functional linguistics (cf. Bybee 1988; Kirby 1999;
Newmeyer 2005). Even one of Haiman's productivity arguments (underanalysis
of 3rd person markers, or Watkins' Law, in his §2.3) is clearly
of the diachronic sort. The two real examples of productivity of a claimed
iconicity effect, transitive disappear (Haiman's §2.1) and unstressed epithets
(I've KILLED the pig, §2.4), evidently illustrate the productivity of
conventional regularities of English, not of iconicity itself. In German,
for instance, the verb verschwinden 'disappear' could not possibly develop
transitive uses, because there is no productive ambitransitive alternation
in the language. Of course, to the extent that these regularities of English
reflect universal tendencies, these might be due to iconicity (or frequency
or some other explanatory factor), but in that case the explanation is
again mediated by diachrony.
2. How to compare frequencies
As Croft's comments show, there may be a question about which frequencies
to compare with which other frequencies. My claim is that
alleged iconicity-of-cohesion effects such as alienability contrasts in
possessive constructions are due to the same kinds of frequency asymmetries
that give rise to the classical effects of typological markedness
(Greenberg 1966, Croft 2003: Ch. 4). Croft disputes this and claims that
my explanation is based on relative frequency, whereas typological markedness
is based on absolute frequency. I believe that this reflects a misunderstanding,
so let me clarify the way I see the parallels.
We consider two forms (A and B) that are paradigmatic alternatives
(Croft 2003: 90). In the case of markedness reversal, there are two
classes of lexemes (I and II) that behave differently, both in terms of frequency
and (consequently) in terms of coding. Let us take singular and
plural marking again, where quite a few languages have a class of nouns
(let us call them plural-prominent) which often occur in the plural and
hence have a longer singular form (singulative, cf. Croft 2003: 189–190).
Even English could be said to have a few such nouns, e.g., datu-m/
data, criterio-n/criteria. Table 1 shows the frequencies of two selected
English nouns, a singular-prominent and a plural-prominent noun. For
each noun, the first column gives the absolute frequency, and the second
column gives the relative frequency in percentages.
The standard frequency-based and least-effort-based explanation of
the coding contrasts is that they are due to within-class, across-form differences
in frequency: In each case, the overtly coded form is significantly
rarer than the other form. As long as we only look at individual nouns,
it does not matter whether we compare the absolute or the relative frequencies.
But when we compare different classes, it is important to compare
relative rather than absolute frequencies, because in absolute terms,
houses is much more frequent than criteria. Clearly, across-class, within-
form comparisons are not meaningful in the present context.
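
The difference between the two kinds of comparison can be made concrete with a small calculation. The following Python sketch starts from the counts reported for house and criterion in Table 1 and derives the relative frequencies from them; the function name `relative_frequencies` and the dictionary layout are illustrative assumptions, not part of the original analysis.

```python
# Absolute token counts as reported in Table 1 (British National Corpus, spoken).
counts = {
    "house": {"singular": 4811, "plural": 1020},
    "criterion": {"singular": 137, "plural": 365},
}

def relative_frequencies(forms):
    """Return each form's share of the lexeme's total tokens, in percent.

    Hypothetical helper, introduced only for this illustration.
    """
    total = sum(forms.values())
    return {form: 100 * n / total for form, n in forms.items()}

# Within-class, across-form comparison: which form of each lexeme is rarer?
for lexeme, forms in counts.items():
    shares = relative_frequencies(forms)
    rarer = min(forms, key=forms.get)
    print(lexeme, {f: round(p) for f, p in shares.items()}, "rarer form:", rarer)

# Across-class, within-form comparison of absolute counts: the plural of
# 'house' outnumbers the plural of 'criterion', although only 'criterion'
# belongs to the plural-prominent class.
print(counts["house"]["plural"] > counts["criterion"]["plural"])
```

Within each lexeme, the rarer form is the same whether absolute counts or percentages are compared; only the percentages can be compared across the two classes.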
Table 1. Frequencies of house and criterion (singular/plural) in the British National Corpus
(spoken)

                     class I (singular-prominent)    class II (plural-prominent)
form A (singular)    house      4811    83%          criterio-n    137    27%
form B (plural)      house-s    1020    17%          criteria      365    73%
                                5831   100%                        502   100%

The picture for alienability contrasts is completely analogous. The two
forms are the possessed and the unpossessed form, and the two classes are
alienable nouns and inalienable (kinship and body-part term) nouns.
Table 2 shows the frequencies of two selected English nouns.
Again, I claim that the coding contrasts (which this time cannot be illustrated
from English, because English treats all possessed nouns alike)
are due to (vertical) within-class, across-form differences in frequency.
Croft, by contrast, suggests that one should compare (horizontal) across-class,
within-form differences, but these are as irrelevant for the form
differences as in Table 1. Some alienable nouns (such as house) are very
frequent, others (such as palace) are rarer, and some inalienable nouns
(such as head or hand) are very frequent, whereas others (such as nose or
kidney) are rarer. What unites inalienable nouns is that they have a high
proportion of possessed occurrences, i.e., a high relative frequency of
form B. All this is just as in the singular/plural contrasts seen earlier.
In Table 2, the proportion of form B in class II is more than 50%, as is
the proportion of form B (plural) in class II (plural-prominent) in Table
1. Likewise, the proportion of form B in class I is below 50% in both
tables. However, this is not actually necessary in order to explain the
form contrast between class I and class II. All that matters is that the proportion
of form B is significantly higher in class II than in class I. A
higher proportion of form B means that form B is more predictable than
in class I, which means that it is more likely to be expressed in a short
way. Thus, while the figures in Table 3 are not as overwhelmingly significant
as those in Tables 1 and 2, they are still significant and sufficient to
explain the fact that in some languages, paired body parts have longer
singulars than plurals.
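
The comparison of proportions across classes can be sketched in a few lines of Python. The text does not say which significance test is meant, so the pooled two-proportion z-test used here is only an assumption, chosen as one conventional option; the function name `two_proportion_z` is likewise hypothetical. The counts are those reported for nose and foot in Table 3.

```python
import math

def two_proportion_z(k1, n1, k2, n2):
    """Pooled two-proportion z statistic for k2/n2 versus k1/n1.

    Assumed test, not taken from the article.
    """
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p2 - p1) / se

# Table 3 counts: plural tokens (form B) out of all tokens of each noun.
nose_plural, nose_total = 32, 404      # class I (singular-prominent)
feet_plural, foot_total = 877, 1763    # class II (plural-prominent)

share_class_1 = nose_plural / nose_total
share_class_2 = feet_plural / foot_total
z = two_proportion_z(nose_plural, nose_total, feet_plural, foot_total)

print(f"share of form B: class I {share_class_1:.3f}, class II {share_class_2:.3f}")
print(f"z = {z:.1f}")
```

Under this assumed test, an absolute z value above roughly 1.96 corresponds to the conventional 5% level. The point of the sketch is only that the prediction concerns the difference between the two shares, not whether either share exceeds 50%.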
Table 2. Frequencies of house and nose (unpossessed/possessed) in the British National
Corpus (spoken)

                        class I (alienable)                  class II (inalienable)
form A (unpossessed)    house                3614    75%     nose                134    36%
form B (possessed)      (someone's) house    1197    25%     (someone's) nose    238    64%
                                             4811   100%                         372   100%

Table 3. Frequencies of nose and foot (singular/plural) in the British National Corpus
(spoken)

                     class I (singular-prominent)    class II (plural-prominent)
form A (singular)    nose      372    92%            foot     886    51%
form B (plural)      nose-s     32     8%            feet     877    49%
                               404   100%                    1763   100%

The figures given in my article for body-part and kinship terms in English
and Spanish are more like the figures in Table 3 than the figures in
Table 1, and this may strike some observers (such as Croft) as less than
fully convincing. However, requiring that the frequency should be higher
than 50% in a within-class comparison would not be reasonable, because
quite generally, implicational universals make only relative predictions.
Some languages never mark number or possession, and some languages
always do. But when number or possession marking (or any other kind
of marking) is different for different lexeme classes, the general prediction
is that the higher the frequency of a form, the less marking it receives.
This prediction is fully borne out by the available data on possessive
constructions.
Croft also mentions Corbett et al.'s (2001) discussion of relative and
absolute frequency, and their result that relative frequency is much less
important than absolute frequency. However, Corbett et al. are interested
in morphological irregularities, not in coding asymmetries. As I note at
the beginning of §6 of my paper, high absolute frequency favours suppletion
(and irregularity more generally), because irregularity is due to
memorizability and has nothing to do with predictability.
Thus, coding asymmetries that correlate with frequency asymmetries
are due to differential predictability, which can be measured by relative
frequencies. Absolute frequencies explain irregularity. Haiman's cohesion
scale has two completely dierent explanations.
3. Kinds of iconicity
In his comments, Croft insists that Haiman's cohesion scale can be interpreted
in terms of temporal distance (not just cohesion, as I had argued),
and thus as an extension of iconicity of contiguity (an iconicity type that I
do not question). I find this a legitimate view, and indeed the similarities
between iconicity of cohesion/distance and iconicity of contiguity are too
obvious to be overlooked. But sometimes appearances are deceptive, and
I have claimed that the phenomena explained by iconicity of cohesion are
not uniform and must be explained in two different ways. This makes
it less surprising that I also claim that contiguity phenomena have yet
another explanation. Thus, instead of Croft's uniform explanation in
terms of iconicity of contiguity and distance (Croft 2003: §7.2.1), I have
three separate explanations for three separate kinds of phenomena: icon-
icity of contiguity (for constituency), frequency-induced predictability
(for coding asymmetries), and frequency-induced memorizability (for
suppletion).
This flies in the face of Haiman's principle that an explanation should
be preferred if it is applicable to a broader range of phenomena. But I
do not think that this is a useful heuristic for developing explanatory accounts
of highly complex phenomena such as language. Nobody doubts
that language structure is influenced by a variety of factors, so we should
put all our energy into identifying the precise roles of these factors and
refining our predictions, rather than reducing everything to a few big principles
and sweeping the details under the rug.¹ Thus, I happily admit that
my frequency account cannot be extended to the external possession and
possessor agreement constructions mentioned by Haiman.
Croft accepts the frequency explanation for iconicity of complexity and
only argues for iconicity of distance/cohesion, and Haiman (1983) had
also argued only for iconicity of cohesion. So it is surprising to see Haiman
defend iconicity of complexity, in his §2.2, with regard to contrasts
such as entity/thing and leng-th/long. He apparently wants to claim that
words expressing basic and concrete meanings such as thing and long
tend to be short, whereas words expressing derived and abstract meanings
such as entity and length tend to be long. This could be evaluated
properly once the claim is made more precise, but let me note here that
many highly abstract concepts are expressed in a very short way (e.g.,
in, on, to, have), and that many concrete concepts are expressed in
a very long way (e.g., caterpillar, rhododendron, game console, thunderstorm).
Maybe these latter examples would not count as basic,
and of course children first acquire shorter words (e.g., car, dog,
rain), but this is because they tend to be more frequent. Haiman
cites my (1993) paper in support of this view, so I should emphasize
again here that I now believe that the relevant part of that paper was
mistaken.
Croft recognizes that the cases where the possessive marker is in a peripheral
rather than in a medial position (examples (30a–d) of my article)
present a serious challenge to iconicity of distance, and he suggests that it
should be replaced by iconicity of length. However, he does not explain
in what sense the relationship between form and meaning would still be
iconic. According to Haiman (1983: 782–783), short linguistic distance
iconically corresponds to short conceptual distance (conceptual closeness).
Iconicity of length would presumably consist in an iconic correspondence
between linguistic length and conceptual length, but it is un-
clear what the latter could mean. Perhaps it would mean the same as
conceptual complexity, but then iconicity of length would be indistin-
guishable from iconicity of complexity, which Croft rejects just as I do.
4. Multiple motivations
Both Croft and Haiman emphasize that we should allow for the possibility
of multiple motivations (Schuchardt's motley interplay of innumerable
drives), and I fully agree with this. I also agree with Haiman
that each of these motivations is most clearly attested ceteris paribus,
that is, when they operate unopposed, and with Croft that one must
examine all the properties [of a phenomenon] in order to identify which
motivation is operating. Discovering the motivation(s) of a universal
tendency of language structure is not at all straightforward, and overall
our understanding is still very limited. Both John Haiman and William
Croft have made substantial contributions to this enterprise, but I still
believe that the modification of their views that I proposed in my article
is hard to avoid. But to make further progress in this area, we need more
empirical work and more debates like the current one.
Received 1 June 2007
Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
Notes
* Contact address: Max Planck Institute for Evolutionary Anthropology, Deutscher Platz
6, 04103 Leipzig, Germany; author's e-mail: <haspelmath@eva.mpg.de>.
1. In his conclusions, Haiman himself emphasizes the diversity of factors, and (ironically)
accuses me of reductionism. Evidently we need both some reductionism (such as the
principle that identical effects should be derived from identical causes) and close attention
to the details, and the only question is how to achieve the right balance. It seems to
me that I am more on the splitters' side, whereas Haiman is more (and a bit too much)
of a lumper.
References
Blevins, Juliette
2004 Evolutionary Phonology. Cambridge: Cambridge University Press.
Bybee, Joan L.
1988 The diachronic dimension in explanation. In John A. Hawkins (ed.),
Explaining Language Universals. Oxford: Blackwell, 350–379.
2001 Phonology and Language Use. Cambridge: Cambridge University Press.
2003 Mechanisms of change in grammaticalization: the role of frequency. In
Janda, Richard and Joseph, Brian (eds.), Handbook of Historical Linguistics.
Oxford: Blackwell, 602–623.
2007 Frequency of Use and the Organization of Language. Oxford: Oxford University
Press.
Corbett, Greville, Andrew Hippisley, Dunstan Brown, and Paul Marriott
2001 Frequency, regularity and the paradigm: A perspective from Russian on a
complex relation. In Joan Bybee and Paul Hopper (eds.), Frequency and the
Emergence of Linguistic Structure. Amsterdam: John Benjamins, 201–226.
Croft, William
2000 Explaining Language Change: An Evolutionary Approach. London:
Longman.
2003 Typology and Universals. 2nd ed. Cambridge: Cambridge University Press.
Ellis, Nick C.
2002 Frequency effects in language acquisition: A review with implications
for theories of implicit and explicit language acquisition. Studies in Second
Language Acquisition 24(2), 143–188.
Greenberg, Joseph H.
1966 Language Universals, with Special Reference to Feature Hierarchies. (Janua
Linguarum, Series Minor 59.) The Hague: Mouton.
Haiman, John
1983 Iconic and economic motivation. Language 59: 781–819.
Haspelmath, Martin
1993 More on the typology of inchoative/causative verb alternations. In Comrie,
Bernard, and Maria Polinsky (eds.), Causatives and Transitivity. Amsterdam:
Benjamins, 87–120.
1999 Optimality and diachronic adaptation. Zeitschrift für Sprachwissenschaft
18(2), 180–205.
Kirby, Simon
1999 Function, Selection, and Innateness: The Emergence of Language Universals.
Oxford: Oxford University Press.
Newmeyer, Frederick J.
2005 Possible and Probable Languages. Oxford: Oxford University Press.
Schuchardt, Hugo
Gegen die Junggrammatiker. Berlin: Robert Oppenheimer.
Zipf, George Kingsley
1929 Relative frequency as a determinant of phonetic change. Harvard Studies in
Classical Philology 40: 1–95.
Linguistic and metalinguistic categories
in second language learning
KAREN ROEHR*
Abstract
This paper discusses proposed characteristics of implicit linguistic and ex-
plicit metalinguistic knowledge representations as well as the properties of
implicit and explicit processes believed to operate on these representations.
In accordance with assumptions made in the usage-based approach to language
and language acquisition, it is assumed that implicit linguistic knowledge
is represented in terms of flexible and context-dependent categories
which are subject to similarity-based processing. It is suggested that, by
contrast, explicit metalinguistic knowledge is characterized by stable and
discrete Aristotelian categories which subserve conscious, rule-based processing.
The consequences of these differences in category structure and
processing mechanisms for the usefulness or otherwise of metalinguistic
knowledge in second language learning and performance are explored. Reference
is made to existing empirical and theoretical research about the role
of metalinguistic knowledge in second language acquisition, and specific
empirical predictions arising out of the line of argument adopted in the current
paper are put forward.
Keywords: categorization; explicit and implicit knowledge; metalinguistic
knowledge; second language learning; usage-based model.
1. Introduction
This article is concerned with the role of metalinguistic knowledge, or ex-
plicit knowledge about language, in the area of second language acquisi-
tion (SLA). It is situated within a cognitive-functional approach to lan-
guage and language learning, in the belief that our understanding of an
essentially pedagogical notion, metalinguistic knowledge, may be enhanced
if we consider this notion in terms of a specific linguistic theory,
that is, the usage-based model of language. In this way, light can be shed
on a concept which is of interest to second language (L2) teachers, adult
language learners themselves, and last but certainly not least, applied linguists
of all theoretical persuasions, including cognitive linguists with a
pedagogical outlook (e.g., Achard and Niemeier 2004; Boers and Lindstromberg
2006).
Cognitive Linguistics 19–1 (2008), 67–106
DOI 10.1515/COG.2008.005
0936–5907/08/0019–0067
© Walter de Gruyter
In this paper, I argue that while implicit linguistic knowledge is charac-
terized by exemplar-based categories, explicit metalinguistic knowledge
relies on Aristotelian categories. Exemplar-based categories are flexible,
highly contextualized, and subject to prototype effects, whereas Aristotelian
categories are stable, discrete, and clearly delineated. These characteristics
can be illustrated briefly with the help of the following examples
(from Taylor 2003): (1) The Pope is a bachelor. (2) Her husband is an unrepentant
bachelor.¹ If the construction bachelor is considered in terms of
Aristotelian category structure, i.e., if it is defined by means of primitive
binary features such as adult, male, married, etc., sentence (1) would
be judged semantically acceptable, while sentence (2) would have to be
regarded as semantically anomalous. Conversely, if the construction
bachelor is considered in terms of exemplar-based category structure, categorization
by means of primitive binary features no longer applies. Instead,
specific attributes associated with the category [bachelor] can be
perspectivized in accordance with the linguistic and cultural context provided
by the sentences in which the construction appears, whereas other
attributes may be filtered out. Thus, sentence (1) seems somewhat odd,
since bachelorhood is taken for granted in a pope. Sentence (2), by con-
trast, is no longer anomalous, since certain behavioural attributes associ-
ated with the (idealized) prototype of an unmarried man are highlighted;
at the same time, the attribute associated with the marital status of a pro-
totypical bachelor is temporarily ignored.
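
The Aristotelian reading of the two example sentences can be sketched as a membership test over binary features. The Python fragment below is a minimal illustration only: the feature values assigned to the Pope and to the husband, and the helper name `is_member`, are assumptions made for the example. It does not attempt to model the exemplar-based reading, in which attributes are weighted by context.

```python
# Aristotelian (classical) category: membership is all-or-nothing and is
# decided by a fixed set of binary features.
BACHELOR = {"adult": True, "male": True, "married": False}

def is_member(entity, category):
    """True only if the entity matches every defining feature (hypothetical helper)."""
    return all(entity.get(feature) == value for feature, value in category.items())

# Assumed feature assignments for the referents of sentences (1) and (2).
pope = {"adult": True, "male": True, "married": False}
her_husband = {"adult": True, "male": True, "married": True}

print(is_member(pope, BACHELOR))         # sentence (1): judged acceptable
print(is_member(her_husband, BACHELOR))  # sentence (2): judged anomalous
```

Because the test admits no degrees of membership, it cannot express why sentence (1) sounds odd or why sentence (2) is interpretable, which is the contrast the exemplar-based account is meant to capture.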
In addition to positing qualitatively distinct category structures, I as-
sume that the processing mechanisms operating on implicit linguistic and
explicit metalinguistic knowledge representations are qualitatively different.
While implicit linguistic knowledge is stored in and retrieved from
an associative network during parallel distributed, similarity-based pro-
cessing, explicit metalinguistic knowledge is processed sequentially with
the help of rule-based algorithms. I suggest that these distinctions between
linguistic and metalinguistic knowledge representations and processes
affect the way in which the two types of knowledge can be used in
L2 learning and performance.
Indeed, it appears that the proposed conceptualization of linguistic and
metalinguistic knowledge in terms of different category structures and associated
differences in processing mechanisms can help explain available
findings from the area of SLA which are indicative of both facilitative potential
and apparent limitations of metalinguistic knowledge in L2 learning
and performance. Moreover, if read in conjunction with existing
research, the proposed conceptualization allows for the formulation of
specific predictions about the use of metalinguistic knowledge in L2 learning,
both at a general level and for particular types of language learners.
The article is organized as follows: Section 2 provides definitions of the
main constructs under discussion, that is, explicit and implicit knowledge,
explicit and implicit learning, pedagogical grammar, and metalinguistic
knowledge. In Section 3, assumptions about the nature of implicit linguis-
tic knowledge commonly made by researchers working in a usage-based
paradigm are outlined. Section 4 contains a summary and evaluation of
key empirical and theoretical research in relation to the role of explicit
knowledge in language acquisition, with a strong emphasis on L2 learn-
ing. Section 5 puts forward the proposal which is at the core of the
current paper, with the argument focusing on the contrasting category
structures of implicit linguistic knowledge and explicit metalinguistic
knowledge as well as differences in processing mechanisms associated
with these. Section 6 details empirical predictions that emerge from the
argument put forward in the current paper. Section 7 offers a brief
conclusion.
2. Construct definitions
Explicit knowledge is defined as declarative knowledge that can be
brought into awareness and that is potentially available for verbal report,
while implicit knowledge is defined as knowledge that cannot be brought
into awareness and cannot be articulated (Anderson 2005; Hulstijn 2005).
Accordingly, explicit learning refers to situations "when the learner has
online awareness, formulating and testing conscious hypotheses in the
course of learning". Conversely, implicit learning "describes when learning
takes place without these processes; it is an unconscious process of
induction resulting in intuitive knowledge that exceeds what can be
expressed by learners" (N. Ellis 1994: 38–39; see also N. Ellis 1996;
Hulstijn 2005).
It is assumed that focused attention is a necessary requirement for
bringing representations or processes into conscious awareness, i.e., for
knowledge or learning to be explicit. In accordance with existing research,
three separable but associated attentional sub-processes are assumed, that
is, alertness, orientation, and detection (Schmidt 2001; Tomlin and Villa
1994). In this conceptualization of attention, alertness refers to an
individual's general readiness to deal with incoming stimuli; orientation
concerns the allocation of resources based on expectations about the
particular class of incoming information; during detection, attention
focuses on specific details. Detection is thought to require more attentional
resources than alertness and orientation, and to enable higher-level pro-
cessing (Robinson 1995). Stimulus detection may occur with or without
awareness. If coupled with awareness, stimulus detection is equivalent
to noticing, which is defined as awareness in the sense of (momentary)
subjective experience (Schmidt 1990, 1993, 2001). Proponents of the so-
called noticing hypothesis argue that noticing, or attention at the level of
awareness, is required for L2 learning to take place.
It is worth noting that the concepts of attention, noticing, and aware-
ness, as well as their application in SLA, remain controversial (for critical
reviews, see, for instance, Robinson 2003; Simard and Wong 2001).
Nevertheless, a working definition is needed to allow for a clear discussion.
Thus, for the purpose of the present article, it is assumed that the fine
line between focused attention in the sense of stimulus detection and fo-
cused attention in the sense of noticing can be regarded as the threshold
of conscious awareness, that is, the point of interface between implicit
and explicit processes and representations.
First and foremost, the present paper is concerned with the notion of
metalinguistic knowledge. Metalinguistic knowledge is a specific type of
explicit knowledge, that is, an individual's explicit knowledge about
language. Accordingly, L2 metalinguistic knowledge is an individual's
knowledge about the L2 they are attempting to learn. The term metalin-
guistic knowledge tends to be used in applied linguistics research concen-
trating on L2 learning and teaching (e.g., Alderson et al. 1997; Bialystok
1979; Elder and Manwaring 2004), and it is closely related to applied
linguists' conceptualization of pedagogical grammar (e.g., McDonough
2002; Saporta 1973; Towell 2002). Pedagogical grammar has been
described as "a cover term for any learner- or teacher-oriented description
or presentation of foreign language rule complexes with the aim of
promoting and guiding learning processes in the acquisition of that language"
(Chalker 1994: 34, quoting Dirven 1990). It is worth noting that, in
discussions of pedagogical grammar, the term grammar is used in a broad
sense as referring to any aspect of language that can be described system-
atically; it is therefore not restricted to morphosyntactic phenomena.
In sum, the notion of metalinguistic knowledge is concerned with a
learner's explicit mental representations, while the notion of pedagogical
grammar is concerned with explicit written or oral descriptions of lin-
guistic systematicities which can be presented to a learner as a source of
information about the L2. Accordingly, a learner's metalinguistic
knowledge may arise from encounters with pedagogical grammar, e.g., through
textbooks and/or through exposure to rule-based or other types of form-
focused instruction (R. Ellis 2001; Sanz and Morgan-Short 2005). By the
same token, pedagogical grammar has arisen from the metalinguistic
knowledge of applied linguists, L2 teachers, and materials designers.
Thus, while the labels of metalinguistic knowledge and pedagogical
grammar are used to denote, respectively, an individual's mental
representations and written or oral instructional aids, the two notions are
similar to the extent that they are both explicit by definition and that the latter
can give rise to the former as well as vice versa.
As the argument presented in what follows is concerned with differences
in category structure between explicit and implicit knowledge, the
question of whether a learner's explicit knowledge has been derived
bottom-up through a process of analysis of the linguistic input or whether
it has been acquired top-down through formal study of grammar text-
books is not of immediate relevance. In other words, for the purpose of
the current discussion, it does not matter whether explicit knowledge has
arisen from implicit knowledge, e.g., when an L2 learner, perhaps after
prolonged experience with the L2, discovers certain systematicities and
arrives at a pedagogical grammar rule of their own, which is represented
as metalinguistic knowledge and can be articulated, or whether explicit
knowledge is assimilated from the environment, e.g., when an L2 learner
listens to a teacher's explanation drawing on a pedagogical grammar rule
and memorizes this information as metalinguistic knowledge. In either
scenario, the defining characteristics, including the internal category
structure, of the metalinguistic knowledge held by the learner remain the
same, as will become apparent in Section 5 below.
It is acknowledged that there may be pedagogically relevant differences
between internally induced metalinguistic knowledge and metalinguistic
knowledge gleaned from externally presented pedagogical grammar that
are of practical interest to teachers and learners in the L2 classroom. I
am not aware of any empirical research pertaining to this specic issue,
but one could hypothesize, for instance, that pedagogical grammar rules
presented to the learner are more accurate than metalinguistic knowledge
induced bottom-up by the learner him/herself, since the cumulative
knowledge of the applied linguistics community is based on more exten-
sive language experience than the average individual learner has been
able to gather. Alternatively, one could hypothesize that metalinguistic
knowledge derived by the learner him/herself is more relevant to the
individual's L2 learning situation than one-size-fits-all pedagogical grammar
rules acquired from a commercially produced textbook. These questions,
though clearly interesting in themselves, do not impact on the theoretical
argument put forward here, however.
Finally, it is worth noting that rule-based or other types of form-
focused instruction occur not only in the L2 classroom, but also in the
context of laboratory studies. Reports of such empirical studies as well
as theoretical papers with a psycholinguistic orientation (e.g., DeKeyser
2003; N. Ellis 1993; Robinson 1997) tend not to use the terms form-
focused instruction, pedagogical grammar, or metalinguistic knowledge;
instead, they refer more generally to explicit learning conditions and
learners' explicit knowledge. However, explicit learning conditions drawing
on learners' explicit knowledge typically require knowledge about the
L2, i.e., metalinguistic knowledge. Hence, the notion of metalinguistic
knowledge is of relevance to L2 learning and L2 teaching, as well as to
psycholinguistically oriented and applied SLA research.
In the context of the present article, metalinguistic knowledge is defined
as a learner's explicit or declarative knowledge about the syntactic,
morphological, lexical, pragmatic, and phonological features of the L2.
Metalinguistic knowledge includes explicit knowledge about categories as well
as explicit knowledge about relations between categories (R. Ellis 2004;
Hu 2002; Roehr 2007). Metalinguistic knowledge can vary in terms of
specificity and complexity, but it minimally involves either a schematic
category or a relation between two categories, specific or schematic.
Metalinguistic knowledge relies on Aristotelian categories, i.e., categories that
are stable and discrete. These categories subserve sequential, rule-based
processing.
In the following sections, these proposed characteristics of metalinguistic
knowledge will be explained and exemplified. I will begin by comparing
and contrasting the characteristics of explicit metalinguistic knowledge
with the characteristics of implicit linguistic knowledge as conceptualized
in the usage-based model of language.
3. Implicit linguistic knowledge in the usage-based model
Within the framework of cognitive-functional linguistics, the usage-based
model makes several fundamental assumptions about the nature of lan-
guage: First, interpersonal communication is seen as the main purpose of
language. Second, language is believed to be shaped by our experience
with the real world. Third, language ability is regarded as an integral
part of general cognition. Fourth, all linguistic phenomena are explained
by a unitary account, including morphology, syntax, semantics, and prag-
matics. Hence, at the most general level, the usage-based model charac-
terizes language as a quintessentially functional, input-driven phenome-
non (e.g., Bybee and McClelland 2005; Goldberg 2003; Tomasello 1998).
Two specic theoretical consequences arising from these general premises
are particularly relevant to the current discussion, namely, first, the
process of categorization and the sensitivity of knowledge representations to
context and prototype effects, and second, the notion of linguistic
constructions as conventionalized form-meaning pairings varying along the
parameters of specificity and complexity.
In the usage-based model, the representation and processing of lan-
guage is understood in terms of general psychological mechanisms such
as categorization and entrenchment, with the former underlying the lat-
ter. Entrenchment refers to the strengthening of memory traces through
repeated activation. Categorization can be defined as a comparison
between an established structural unit functioning as a standard and an
initially novel target structure (Langacker 1999, 2000). In view of
well-established empirical evidence from the area of cognitive psychology
(Rosch and Lloyd 1978; Rosch and Mervis 1975), it is accepted that
cognitive categories are subject to prototype effects, which are assumed to
apply in equal measure to conceptual and linguistic knowledge (Dirven
and Verspoor 2004; Taylor 2003; Tomasello 2003). A prototype can be
defined as the best example of a category, i.e., prototypical members of
cognitive categories have the largest number of attributes in common
with other members of the category and the smallest number of attributes
which also occur with members of neighbouring categories. In terms of
attributes, prototypical members are thus maximally distinct from the
prototypical members of other categories. To illustrate by means of a
well-known example, robin or magpie are prototypical members of the
category [bird] for (British) speakers of English, while penguin consti-
tutes a marginal category member (Ungerer and Schmid 1996).
Categorization is influenced by the frequency of exemplars in the input
as well as by the recency and context of encounters with specic exem-
plars (N. Ellis 2002a, 2002b). As the parameters of frequency, recency,
and context interact, specic memory traces may be more or less en-
trenched and hence more or less salient and accessible for retrieval (Mur-
phy 2004). In addition, exemplars encountered in the input may be more
or less similar to exemplars encountered previously. Accordingly, cate-
gory membership is often a matter of degree and cannot normally be un-
derstood as a clear-cut yes/no distinction. It follows from this that cate-
gory boundaries may be fuzzy, and that categories may merge into one
another (Langacker 1999, 2000).
Two theoretical approaches to categorization are compatible with the
usage-based assumptions outlined in the previous paragraphs, that is,
the prototype view and the exemplar view (Murphy 2004). In its pure
form, the prototype view holds that concepts are represented by schemas,
i.e., structured representations of cognitive categories. Schemas contain
information about both attributes and relations between attributes that
characterize a certain category. Conversely, the exemplar view, in its pure
form, posits that our mental representations never encompass an entire
concept. Instead, an individual's concept of a category is the set of
specific category members they can remember, and there is no summary
representation. In this view, categorization is determined not only by the
number of exemplars a person remembers, but also by the similarity of a
new exemplar to exemplars already held in memory.
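The contrast between the two pure views can be illustrated with a minimal sketch. Everything in the Python example below is an assumption made for illustration: the two-dimensional feature vectors, the use of a mean vector as the summary representation, and the exponential similarity function are not specified in the literature cited here. The prototype classifier compares a new item with one summary representation per category, whereas the exemplar classifier keeps no summary and sums the similarity of the new item to every remembered exemplar.

```python
import math
from typing import Dict, List, Tuple

Vector = Tuple[float, ...]

# Hypothetical feature vectors for remembered category members.
MEMORY: Dict[str, List[Vector]] = {
    "bird": [(1.0, 1.0), (0.9, 0.9), (0.2, 0.8)],
    "fish": [(0.0, 0.0), (0.1, 0.2)],
}


def distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def prototype_classify(item: Vector, memory: Dict[str, List[Vector]]) -> str:
    """Prototype view: one summary representation per category
    (assumed here to be the mean of the stored members)."""
    prototypes = {
        label: tuple(sum(dim) / len(members) for dim in zip(*members))
        for label, members in memory.items()
    }
    return min(prototypes, key=lambda label: distance(item, prototypes[label]))


def exemplar_classify(item: Vector, memory: Dict[str, List[Vector]]) -> str:
    """Exemplar view: no summary representation; the similarity of the item
    to each remembered exemplar is summed, so both the number of exemplars
    and their similarity to the new item matter."""
    def summed_similarity(members: List[Vector]) -> float:
        return sum(math.exp(-distance(item, m)) for m in members)

    return max(memory, key=lambda label: summed_similarity(memory[label]))


if __name__ == "__main__":
    new_item = (0.8, 0.9)
    print(prototype_classify(new_item, MEMORY))
    print(exemplar_classify(new_item, MEMORY))
```

In clear-cut cases such as the one shown, the two classifiers agree; they can diverge when a category contains many atypical exemplars, because these shift the summed similarity without necessarily moving the summary representation to the same extent.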
While the prototype and exemplar views may be incompatible in their
pure forms, they share a sufficiently large number of characteristics to
allow for a hybrid model to be formulated which includes both schema-
based and exemplar-based representations (Abbot-Smith and Tomasello
2006; Langacker 2000). As a hybrid model is not only compatible with
usage-based assumptions, but also particularly informative for accounts
of language learning and use, it is adopted in the current paper.
According to the hybrid model, all learning is initially exemplar-based.
As experience with the input grows and as repeated encounters with
known exemplars gradually change our mental representations of these
exemplars, it is believed that, ultimately, abstractions over instances are
derived (Kemmer and Barlow 2000; Taylor 2002). These abstractions
are in fact schemas. Schema formation can be defined as the emergence
of a structure through reinforcement of the commonality inherent in
multiple experiences, while, at the same time, experiential facets which do
not recur are filtered out. Correspondingly, a schema is "the commonality
that emerges from distinct structures when one abstracts away from their
points of difference by portraying them with lesser precision and
specificity" (Langacker 2000: 4).
To illustrate with the help of a linguistic example, a large number of
encounters with specic utterances such as I sent my mother a birthday
card and Harry is sending his friend a parcel lead to entrenchment, i.e.,
the strengthening of memory traces for the form-meaning associations
constituting these constructions. Gradually, constructional subschemas
such as send-[np]-[np] and finally the wholly general ditransitive schema
[v]-[np]-[np] are abstracted. Entrenched constructions, both general and
specific, are described as conventional units. Accordingly, a speaker's
linguistic knowledge can be defined as "a structured inventory of
conventional linguistic units" (Langacker 2000: 8).
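The step from specific utterances to constructional subschemas and schemas can be sketched as follows. The Python example is a deliberately simplified illustration and not a model proposed in the literature cited here: it assumes that utterances have already been lemmatized and segmented into slots labelled with a category, it leaves out the subject so as to match the notation send-[np]-[np], and the third exemplar (She gave her brother a book) as well as the function name abstract_schema are invented for the purpose. For each slot, the form is retained if it recurs across all exemplars; otherwise it is replaced by the shared category label.

```python
from typing import List, Tuple

Slot = Tuple[str, str]  # (lemma or phrase, category label)

# "I sent my mother a birthday card" (subject omitted, verb lemmatized)
SENT: List[Slot] = [("send", "v"), ("my mother", "np"), ("a birthday card", "np")]
# "Harry is sending his friend a parcel"
SENDING: List[Slot] = [("send", "v"), ("his friend", "np"), ("a parcel", "np")]
# Hypothetical additional exemplar: "She gave her brother a book"
GAVE: List[Slot] = [("give", "v"), ("her brother", "np"), ("a book", "np")]


def abstract_schema(exemplars: List[List[Slot]]) -> str:
    """Keep what recurs across exemplars; replace points of difference
    with the category label shared by the slot."""
    parts = []
    for slots in zip(*exemplars):  # assumes exemplars have equal length
        forms = {form for form, _ in slots}
        categories = {category for _, category in slots}
        if len(forms) == 1:
            parts.append(next(iter(forms)))            # recurring form is kept
        elif len(categories) == 1:
            parts.append(f"[{next(iter(categories))}]")  # abstract to category
        else:
            parts.append("[...]")                      # no commonality at all
    return "-".join(parts)


if __name__ == "__main__":
    print(abstract_schema([SENT, SENDING]))
    print(abstract_schema([SENT, SENDING, GAVE]))
```

Applied to the two send exemplars, the function yields the subschema send-[np]-[np]; once an exemplar with a different verb is added, only the wholly general pattern [v]-[np]-[np] remains.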
Crucially, the hybrid view argues that representations of specific
exemplars can be retained alongside more general schemas subsuming these
exemplars. Put differently, specific instantiations of constructions and
constructional schemas at varying levels of abstraction exist alongside
each other, so that the same linguistic patterns are potentially represented
in multiple ways. Thus, linguistic knowledge is represented in a vast, re-
dundantly organized, hierarchically structured network of form-meaning
associations.
Conventional linguistic units, or constructions, are viewed as inherently
symbolic (Kemmer and Barlow 2000; Taylor 2002), so that constructions
at all levels of abstraction are pairings of form and meaning (Goldberg
2003: 219). Hence, even though a constructional schema at the highest
level of abstraction such as the English ditransitive [v]-[np]-[np] no longer
contains any specific lexical items, it is still endowed with constructional
meaning. Accordingly, a construction is always more than the sum of its
parts; beyond symbolizing the meanings and relations of its constituents,
it has its own semantic profile (Langacker 1991, 2000). For instance, at
the most general level, the semantics of the English ditransitive schema
[v]-[np]-[np] are captured by the notions of transfer and motion (Gold-
berg 1995, 1999, 2003).
To reiterate, the unitary approach to language which characterizes the
usage-based model is applied both at the level of cognition and at the
level of linguistic structure itself. Hence, syntax, morphology, and the lex-
icon are all accounted for by the same system (Bates and Goodman 2001;
Langacker 1991, 2000; Tomasello 1998); they are regarded as differing in
degree rather than as differing in kind. Syntax, morphology, and the
lexicon are conceptualized as a graded continuum of conventional linguistic
units, or constructions, varying along the parameters of specificity and
complexity, as shown in Figure 1.²
As Figure 1 indicates, schematic and complex constructions such as the
ditransitive [v]-[np]-[np] occupy the area traditionally referred to as
syntax. Words such as send or above are both minimal and specific and
occupy the area traditionally labelled lexicon. Morphemes such as English
plural -s or regular past tense -ed are situated at the centre of the two
clines, since instances of morphology are neither entirely specific nor
entirely schematic; by the same token, they are neither truly minimal nor
truly complex, but they are always bound. Lexical categories like [noun],
[verb], and [adjective] are minimal but schematic, while idioms such as
kick the bucket tend to be both complex and specific in that they allow
for little variation. The example kick the bucket only permits verb
inflection for person and tense, for instance, and thus ranges high on the
specificity scale. At the same time, the construction kick the bucket can be
considered as more complex than the constructions send or above because the
latter cannot be broken down any further.
To summarize, the usage-based model assumes that categorization is a
key mechanism in language representation, learning, and use. As linguis-
tic knowledge is regarded as an integral part of cognition, it is accepted
that both conceptual and linguistic categories are subject to context and
prototype effects. Linguistic knowledge is conceptualized in terms of
constructions, i.e., conventionalized form-meaning units varying along the
parameters of specificity and complexity. Crucially, these assumptions
underlie the usage-based account of implicit phenomena of language
representation, acquisition, and use. The role of explicit phenomena, in
particular as studied in the field of SLA, is the focus of the next section.
4. Explicit knowledge in language learning
The notion of explicit knowledge has consistently attracted the interest of
researchers in the areas of SLA and applied linguistics more generally.
Over the past two decades in particular, this interest has generated an im-
pressive amount of both empirical and theoretical research. Depending
on whether researchers take a primarily educational or a primarily psy-
cholinguistic perspective, empirical studies have drawn on a variety of
correlational and experimental research designs, investigating the
relationship between L2 learners' linguistic proficiency and their
metalinguistic knowledge, the role of explicit knowledge in instructed L2
learning, and the effects of implicit versus explicit learning conditions on the
acquisition of selected L2 constructions.
Figure 1. Linguistic constructions in the specificity/complexity continuum
The most uncontroversial cumulative finding resulting from this body
of research has borne out the prediction that attention (in the sense of
stimulus detection) is a necessary condition for the learning of novel input
(Doughty 2003; N. Ellis 2001, 2003; MacWhinney 1997). Moreover, it
has been found that form-focused instructional intervention is more
effective than mere exposure to L2 input (Doughty 2003; R. Ellis 2001, 2002;
Norris and Ortega 2001). As it is the intended purpose of all types of
form-focused instruction to direct learners' attention to relevant form-meaning
associations in the linguistic input, this is not a surprising outcome.
Beyond the well-substantiated claim that attention in the sense of stim-
ulus detection is a necessary requirement for input to become intake, the
picture is much less clear. In other words, findings regarding the role of
explicit knowledge, i.e., knowledge above the threshold of awareness,
yield a more complex and sometimes even apparently contradictory pat-
tern of evidence. As it is beyond the scope of this paper to present an ex-
haustive review of the large body of research that has been carried out in
the preceding decades, the following summary is deliberately brief and fo-
cused exclusively on representative studies that are directly relevant to the
current discussion (for more comprehensive recent reviews of the
literature, see DeKeyser 2003; R. Ellis 2004). In particular, work which
illustrates the sometimes contrasting nature of findings and conclusions as
well as work which emphasizes the complex interplay of variables in lan-
guage learning processes has been selected.
Empirical research concerned with metalinguistic knowledge in SLA
has led to at least two results that highlight the potential benefits of
explicit knowledge and learning. First, learners' metalinguistic knowledge
and their L2 linguistic proficiency have been found to correlate positively
and significantly, even though the strength of the relationship varies
between studies, ranging from a moderate 0.3 to 0.5 (e.g., Alderson et al.
1997; Elder et al. 1999) to between 0.6 and 0.7 (Elder and Manwaring
2004), and, reported most recently, up to 0.8 (Roehr 2007). Thus, there
is evidence for an overall association between higher levels of learner
awareness, use of metalinguistic knowledge, and successful L2
performance (Leow 1997; Nagata and Swisher 1995; Rosa and O'Neill 1999).
Second, learners' use of metalinguistic knowledge when resolving
form-focused L2 tasks has been found to be associated with consistent and
systematic performance (Roehr 2006; Swain 1998).
While these findings are indicative of a generally facilitative role for
explicit knowledge about the L2, empirical evidence likewise demonstrates
that use of metalinguistic knowledge by no means guarantees successful
L2 performance. For instance, Doughty (1991) found equal gains in per-
formance across two experimental groups comprising 20 university-level
learners of L2 English from various L1 backgrounds. Focusing on restric-
tive relative clauses (e.g., I know the people who you talked with), learners
receiving meaning-oriented instruction with enhanced input and learners
exposed to rule-oriented instruction with explicit explanation of the
targeted L2 construction showed equal gains in performance, a finding
which suggests that metalinguistic explanations may be unnecessary.
By the same token, Sanz and Morgan-Short (2004) found support for
the null hypothesis that providing learners with explicit information
about the targeted L2 construction either before or during exposure to
input-based practice would not affect their ability to interpret and
produce L2 sentences containing the targeted L2 construction, as long as
learners received structured input aimed at focusing their attention appro-
priately. The study was carried out with 69 L1 English learners of L2
Spanish and concentrated on preverbal direct object pronouns. The re-
searchers concluded that structured input practice which made linking
form and meaning task-essential, as proposed in processing instruction
(VanPatten 1996, 2004), appeared to be sufficient for successful learning.
Additional explicit information about the targeted L2 construction did
not enhance participants' performance any further.
The ambivalent relationship between use of metalinguistic knowledge
and successful L2 performance was likewise underlined by Green and
Hecht (1992), Camps (2003), and Roehr (2006). Green and Hecht (1992)
report a study with 300 L1 German learners of L2 English which targeted
the use of various morphosyntactic features such as tense and word order.
While successful metalinguistic rule formulation typically co-occurred
with the successful correction of errors instantiating the rules in question,
it was also found that successful error correction could be associated with
the formulation of incorrect rules, or no rule knowledge at all.
In a study involving 74 L1 English learners of L2 Spanish focusing on
third-person direct object pronouns, Camps (2003) collected both concur-
rent and retrospective verbal protocol data. He found that references to
the targeted L2 construction co-occurred with accurate performance in
92 percent of cases; yet, no reference to the targeted L2 construction still
co-occurred with accurate performance in 69 percent of cases. Thus,
despite providing additional benefits in some cases, use of explicit
knowledge appears to have been far from necessary.
Roehr (2006) studied retrospective verbal reports from ten L1 English
learners of L2 German, which were obtained immediately after the
completion of form-focused tasks targeting adjectival inflection. She found
that although reported use of metalinguistic knowledge co-occurred
more frequently with successful than with unsuccessful item resolution
overall, fully correct use of metalinguistic knowledge still co-occurred
with unsuccessful item resolution in 22 percent of cases. Along similar
lines, anecdotal evidence from the L2 classroom suggests that, on occa-
sion, learners may use their metalinguistic knowledge to override more
appropriate intuitive responses based on implicit linguistic knowledge
(Gabrielatos 2004).
Theoretically oriented work concerned with metalinguistic knowledge
has mainly sought to identify the defining characteristics of the concept
of explicit knowledge as well as the facilitative potential of such
knowledge in SLA. The most substantial contribution to establishing the
defining characteristics of metalinguistic knowledge has arguably been made
by R. Ellis (2004, 2005, 2006), according to whom explicit L2 knowledge
is represented declaratively, characterized by conscious awareness, and
verbalizable, as mentioned in the construct definition presented in Section
2 above. Moreover, explicit L2 knowledge is said to be learnable at any
age, given sufficient cognitive maturity. As explicit knowledge is
employed during controlled processing, it tends to be used when the learner
is not under time pressure. Finally, it has been hypothesized that learners'
explicit L2 knowledge may be more imprecise and more inaccurate than
their implicit knowledge.
Research with a primarily theoretical outlook has further considered
metalinguistic knowledge in terms of the categories and relations between
categories that are represented explicitly, as well as the nature of the L2
constructions described by explicit categories and relations between cate-
gories. Typically, such research has conceptualized metalinguistic knowl-
edge as knowledge of pedagogical grammar rules consisting of explicit de-
scriptions of linguistic phenomena. It has been argued that metalinguistic
descriptions may vary along several parameters, including complexity,
scope, and reliability (DeKeyser 1994; Hulstijn and de Graaff 1994).
For instance, metalinguistic descriptions may refer to either
prototypical or peripheral uses of a particular L2 construction (Hu 2002).
Moreover, the L2 construction described may itself vary in terms of
complexity, perceptual salience, or communicative redundancy (Hulstijn and de
Graaff 1994). In view of this multifaceted interaction between the type of
explicit description and the type of L2 construction described, it is
notoriously difficult to predict which kind of metalinguistic description is likely
to be helpful to the L2 learner. Accordingly, positions have shifted some-
what over the years, with earlier work advocating fairly categorically ei-
ther the teaching of more complex metalinguistic descriptions (Hulstijn
and de Graaff 1994), or the teaching of simpler rules (DeKeyser 1994;
Green and Hecht 1992).
In recent years, researchers have adopted a more sophisticated line of
argument. DeKeyser (2003) has highlighted the fact that the difficulty
(and hence the potential usefulness) of metalinguistic descriptions is a
complex function of a number of variables, including the characteristics
of the description itself, the characteristics of the L2 construction being
described (see also DeKeyser 2005), and individual learner differences in
aptitude.
Indeed, the fact that the relative usefulness of metalinguistic
descriptions in L2 learning and performance is affected by a range of variables
is to be expected, since language is necessarily learned and used by
specific individuals in specific contexts. First and foremost, the role of
metalinguistic knowledge in SLA is at least partially dependent upon a
learner's current level of L2 proficiency (Butler 2002; Camps 2003; Sorace
1985). Second, a learner's use of metalinguistic knowledge is likely to be
subject to situation-specic variation, since both the targeted L2 construc-
tion(s) and the task requirements at hand play a part in determining
whether and how metalinguistic knowledge is employed (R. Ellis 2005;
Hu 2002; Klapper and Rees 2003; Renou 2000). Hence, timed tasks in
general and oral task modalities in particular may prevent a learner
from allocating sufficient attentional resources to controlled processing
involving metalinguistic knowledge, whereas untimed tasks in general
and written task modalities in particular may have the opposite effect,
possibly encouraging the use of metalinguistic knowledge.
Third, the L1-L2 combination under investigation, paired with the rel-
ative typological distance between L1 and L2, may have a part to play
(Elder and Manwaring 2004). Fourth, length of prior exposure to L2
instruction and the type of instruction experienced have been shown to
impact on a learner's level and use of metalinguistic knowledge (Elder et al.
1999; Roehr 2007). Finally, individual differences in cognitive and
learning style, strategic preferences, and aptitude may influence a learner's use
of metalinguistic knowledge (Collentine 2000; DeKeyser 2003; Roehr
2005).
Most recently, existing work concerned with the role of explicit knowl-
edge in SLA has been complemented by hypotheses about the nature of
the representations and processes involved in the use of metalinguistic
knowledge. Crucial to the current paper, both empirical findings and
theoretical research suggest that explicit and implicit knowledge are
separable constructs which are nonetheless engaged in interplay (N. Ellis 1993,
2005; R. Ellis 2005; Segalowitz 2003). In other words, the so-called
weak-interface position³ allows for the possibility of explicit metalinguistic
knowledge contributing indirectly to the acquisition of implicit linguistic
knowledge, and vice versa. It has been argued that the two types of
knowledge come together during conscious processing (for particularly
readable reviews of the complex subject matter of consciousness, see
Baddeley 1997; Cattell 2006). Moreover, when explicit knowledge is
brought to bear on implicit knowledge and vice versa, enduring learning
effects may result (N. Ellis and Larsen-Freeman 2006).
The mechanism which is thought to enable conscious processing is
called binding. During binding, a number of implicit representations in
different modalities are activated simultaneously and integrated into a
unified explicit representation that is held in a multimodal code in working
memory (Bayne and Chalmers 2003; Dienes and Perner 2003; N. Ellis
2005). We consciously experience this unified representation as a coherent
episode. Put differently, the mechanism of binding, explained through the
temporally synchronized firing of a number of neurons in different brain
regions (Engel 2003), accounts for how implicit representations subserve
explicit representations.
With regard to explicit metalinguistic and implicit linguistic processing,
it has been proposed that "implicit learning of language occurs during
fluent comprehension and production. Explicit learning of language occurs
in our conscious efforts to negotiate meaning and construct communication"
(N. Ellis 2005: 306). Thus, during fluent language use, the implicit
system automatically processes input and produces output, with the
individual's conscious self focused on the meaning rather than the form of the
utterance. When comprehension or production difficulties arise, however,
explicit processes take over. We focus our attention on linguistic form,
and we notice patterns; moreover, we become aware of these patterns as
unified, coherent representations. Such explicit representations can then
be used as pattern recognition units for new stimuli in future usage
events. In this way, conscious processing helps consolidate new bindings,
which are fed back to the brain regions responsible for implicit processing
(N. Ellis 2005).
Steered by the focus of our conscious processing, the repeated simulta-
neous activation of a range of implicit representations helps consolidate
form-meaning associations, often to the extent that implicit learning on
subsequent occasions of use becomes possible. Thus, as the various ele-
ments constituting a coherent form-meaning association are activated si-
multaneously during processing, they are bound together more tightly
(N. Ellis 2005). Crucially, however, it is not a question of the explicit
representation turning into an implicit representation. According to the
weak-interface position, it is not the metalinguistic knowledge, e.g., in the
form of an explicit description of a linguistic phenomenon, that becomes
implicit, but its instantiation, i.e., the sequences of language that the
description is used to comprehend or to construct (R. Ellis 2004: 238).⁴
The locus of conscious processing, metaphorically speaking, is working
memory. Put differently, explicit knowledge is conceptualized as
information that is selectively attended to, stored, and processed in working
memory. Working memory refers to "the system or mechanism underlying
the maintenance of task-relevant information during the performance
of a cognitive task" (Shah and Miyake 1999: 1). Thus, working
memory allows for the temporary storage and manipulation of informa-
tion which is being used during online cognitive operations such as lan-
guage comprehension, learning, and reasoning (Baddeley 2000; Baddeley
and Logie 1999). The so-called episodic buffer, a component of working
memory, is capable of binding information from a variety of sources and
holding such information in a multimodal code. Importantly, working
memory is limited in capacity (Just and Carpenter 1992; Miyake and
Friedman 1998), i.e., we can only attend to and hence be aware of so
much information at any one time.
Clearly, the fact that limited working memory resources constrain
explicit processing of language affects L2 and L1 in equal measure. It is
well established that individuals differ in the maximum amount of
activation available to them, i.e., that individuals differ in terms of their
working memory capacity (e.g., Daneman and Carpenter 1980; Just and
Carpenter 1992; Miyake and Shah 1999). Moreover, young children generally
have smaller working memory capacity than cognitively mature
adolescents and adults. In other words, beyond the issue of individual
differences, working memory capacity increases in the course of an individual's
development.
In L1 acquisition and use, the emergence of metalinguistic ability is
closely associated with the development of literacy skills, that is, another
dimension of linguistic competence which requires selective attention to
language form (Birdsong 1989; Gombert 1992). As both metalinguistic
ability and literacy skills rely on conscious processing drawing on working
memory resources, a certain level of cognitive maturity which guarantees
sufficient working memory capacity is required; hence, these abilities
do not tend to develop until a child is between six and eight years of
age.
Metalinguistic processes, whether concerned with L1 or L2, are
analogous to other higher-level mental operations that draw on working
memory resources and thus require a certain level of cognitive maturity.
Hence, the application of metalinguistic knowledge and the process of
analytic reasoning as applied during general problem-solving appear to
rely on the same basic mechanisms. Put differently, use of metalinguistic
knowledge in language learning and performance can be regarded as an-
alytic reasoning applied to the problem space of language; metalinguistic
processing is problem-solving in the linguistic domain (Anderson 1995,
1996; Butler 2002; Hu 2002).
In L1, a child may raise questions about form-meaning associations
("Why are there two names, orange and tangerine?"), comment on
non-target-like utterances they have overheard (e.g., if another child
mispronounces certain words), or objectify language ("Is 'the' a word?"), thus not
only demonstrating their ability to monitor language use, but also
showing the first signs of what will eventually result in the ability to reason
about language (examples adapted from Birdsong 1989: 17; Karmiloff
and Karmiloff-Smith 2002: 80). In L2, use of metalinguistic knowledge
can likewise be understood in terms of monitoring and reasoning based
on hypothesis-testing operations (N. Ellis 2005; Roehr 2005), which are
characteristic of a problem-solving approach. Thus, the cognitively
mature L2 learner may deliberately analyze input in an attempt to
comprehend an utterance ("What is the subject and what is the object in this
sentence?"), or creatively construct output that is monitored for formal
accuracy ("If I use a compound tense in this German clause, the first
verb needs to be in second position and the second verb in final position.").
To summarize this section, available empirical evidence about the role
of explicit knowledge in language learning and use bears out the
theoretically motivated expectation that metalinguistic knowledge can have both
benefits and limitations. Whilst the facilitative effect of focused attention
in the sense of stimulus detection is all but undisputed, determining the
impact of higher levels of learner awareness and more explicit types of
learner knowledge which go beyond focused attention in the sense of
stimulus detection is less straightforward. On the one hand, L2
proficiency and metalinguistic knowledge have been found to correlate
positively and significantly. Moreover, use of metalinguistic knowledge is
typically associated with performance patterns characterized by consistency
and systematicity. On the other hand, use of metalinguistic knowledge is
by no means a guarantee of successful performance, and higher levels of
learner awareness that reach beyond noticing may be unnecessary or pos-
sibly even unhelpful in certain situations.
In the area of theory, a recent position includes the proposal that ex-
plicit and implicit knowledge are separate and distinct, but can interact.
Hence, explicit knowledge about language may contribute indirectly to
the development of implicit knowledge of language, and vice versa. As
explicit and implicit knowledge interface during conscious processing,
and as such processing is subject to working memory constraints, use of
metalinguistic knowledge in language learning and performance is likely
to have not only benefits, but also certain limitations. On the one hand,
conscious processing involving the higher-level mental faculty of analytic
reasoning allows the cognitively mature individual to apply a problem-
solving approach to language learning. On the other hand, conscious
processing is constrained by limited working memory capacity and thus
only permits the consideration of a restricted amount of information at
any one time.
Finally, existing research acknowledges that the relative usefulness of
metalinguistic knowledge can be expected to depend on a range of
learner-internal and learner-external variables, including task modalities,
the learner's level of L2 proficiency, their language learning experience,
their cognitive abilities, and their stylistic orientation.
Whilst it is important to bear in mind that all these factors will
differentially affect the role of metalinguistic knowledge in language learning
and performance (see Section 6 below), it is argued here that, ceteris
paribus and over and above these factors, another, more fundamental
variable which goes beyond specific usage situations and individual learner
differences is worthy of consideration: The contrasting category structures
of implicit linguistic knowledge representations on the one hand and
explicit metalinguistic knowledge representations on the other hand as well
as the different modes of implicit, associative processing and explicit,
rule-based processing constitute the basic cognitive conditions in which
language learning and performance take place. If taken into account, these
phenomena not only help explain existing findings about the apparently
ambivalent role of metalinguistic knowledge in L2 learning and use, but
also permit us to formulate specific empirical predictions that can guide
future research.
5. The representation and processing of implicit linguistic knowledge and
explicit metalinguistic knowledge
As linguistic and metalinguistic knowledge pertain to the same cognitive
domain (language), they can be expected to share certain characteristics.
Specifically, it appears that linguistic constructions and metalinguistic
descriptions vary along the same parameters, namely, specificity and
complexity. The usage-based model assumes that linguistic constructions can
be more or less specific as well as more or less complex (see Figure 1
above). By the same token, empirical evidence suggests that L2 learners'
metalinguistic knowledge can be more or less specific and more or less
complex (e.g., Roehr 2005, 2006; Rosa and O'Neill 1999).
For the purpose of illustration, one might imagine the case of an
educated L1 English-speaking adult learner of L2 German and consider their
metalinguistic knowledge which has mostly been derived from encounters
with pedagogical grammar in the classroom and in textbooks.⁵ Thus, a
metalinguistic description which this learner is aware of can refer to
specific instances, e.g., "German hin expresses movement away from the
speaker, while her expresses movement towards the speaker".
Alternatively, it can be entirely schematic and therefore involve no specific
exemplars at all, e.g., "a subordinating conjunction sends the finite verb to the
end of the clause". Both of these examples are additionally complex, i.e.,
they state relations between categories, and they can be broken down into
their constituent parts and therefore require several mental manipulations
during processing (DeKeyser 2003; Stankov 2003). However, a
metalinguistic description can also be minimal, e.g., "noun". Various
combinations of different levels of specificity and complexity seem possible, with
the exception of descriptions that are both minimal and specific.
In fact, the joint characteristics of minimal and specific appear to be
unique to lexical items, that is, linguistic constructions. By contrast, even
entirely specific metalinguistic descriptions containing no schematic
categories, such as "German ei is pronounced like English i" or "English desk
means Schreibtisch in German", involve a relation between two specific
instances and can therefore still be broken down into their constituent
parts. By the same token, a minimal metalinguistic description such as
"noun", which cannot be broken down any further, is schematic rather
than specific. Put differently, as soon as implicit linguistic knowledge is
made explicit, i.e., when a metalinguistic knowledge representation is
created (no matter by whom, whether an L2 learner, an applied linguist, or
any other language user), it seems to take the form of either a schematic
description ("noun"), or a proposition involving at least two categories
and a relation between them.
It should be pointed out that this circumstance does not exclude state-
ments about the lexicon from the realm of metalinguistic description and
representation; quite to the contrary, semantic knowledge is perhaps the
most obvious area of explicit knowledge about language, since it typically
encompasses not only L2 metalinguistic knowledge, but also L1 metalin-
guistic knowledge. Indeed, we can glean metalinguistic knowledge about
lexical items from any monolingual or bilingual dictionary. However, it is
crucial to note that, when made explicit, semantic knowledge
incorporates at least two categories and a relation between them, as exemplified
by dictionary definitions of any description. Even the briefest listing of a
synonym without further explanatory comment amounts to stating a
relation between two categories ("X means Y"). Hence, one can argue that
implicit knowledge of the meaning, function, and appropriate usage
contexts of minimal and specific linguistic constructions such as lexical items
is distinguishable from explicit knowledge about the meaning, function,
and appropriate usage contexts of these constructions. This claim applies
not only to implicit knowledge of and explicit knowledge about the lexi-
con, but also to all other areas of language.
Whilst metalinguistic knowledge is comparable with linguistic
constructions in terms of the parameters of complexity and specificity,
explicit metalinguistic knowledge differs qualitatively from implicit linguistic
knowledge in the crucial respect of categorization, that is, one of the
key cognitive phenomena underlying conceptual as well as linguistic
representation and processing. As outlined in Section 3 above, the
usage-based model assumes that cognitive categories, whether conceptual or
linguistic, are flexible and context-dependent, sensitive to prototype effects,
and have fuzzy boundaries.
By contrast, metalinguistic knowledge appears to be characterized by
stable, discrete, and context-independent categories with clear-cut
boundaries. Put differently, metalinguistic knowledge relies on what has
variously been labelled Aristotelian, categorical, classical, or scientific
categorization (Anderson 2005; Bod et al. 2003; Taylor 2003; Ungerer and
Schmid 1996). For instance, the metalinguistic category "subordinating
conjunction" is stable and clearly defined; in the case of German, it is
instantiated by a certain number of exemplars, such as weil ("because"), da
("as"), wenn ("if, when"), etc. Although some instantiations occur more
frequently than others, there are no better or worse category members; all
subordinating conjunctions have equal status and are equally valid
exemplars, regardless of context.
By the same token, the linguistic construction [noun] and the
metalinguistic description "noun" can be contrasted. As all linguistic
constructions are form-meaning pairings, the linguistic construction [noun] is not
devoid of semantic content. Even though it has no specific phonological
instantiation, it has been abstracted over a large number of exemplars
occurring in actual usage events (as exemplified in more detail for the
English ditransitive construction in Section 3 above); accordingly, the
linguistic construction [noun] is strongly associated with the semantics of
its most frequent instantiations, such as lexical items denoting entities in
the real world. Consequently, in the average user of English, the highly
frequent and prototypical constructions man, woman and house can be
expected to be more strongly associated with the schema [noun] than the
relatively rare constructions rumination and oxymoron, or the dual-class
words brush and kiss, for instance. Likewise, in the average user of
German, Fühlen ("the sensing/feeling") is likely to be a relatively marginal
instantiation of the category [noun], compared with the more common
instantiation Gefühl ("sensation/feeling"). The more marginal status of
Fühlen can be attributed to the relative rarity of its nominal usage as
well as its homophone fühlen ("sense/feel"), a prototypical verb. Thus,
by dint of its association with various instantiations, their respective
conceptual referents, and their usage contexts, the linguistic schema
[noun] exhibits a category structure which is characterized by
flexibility and context-dependency, and which takes into account prototype
effects.
The metalinguistic description "noun", on the other hand, relies on
Aristotelian categorization. It may be defined by means of a discrete
statement, e.g., as "a word (...) which can be used with an article" (Swan
1995: xxv) or "a content word that can be used to refer to a person, place,
thing, quality, or action".⁶ Metalinguistic categorization is based on clear
yes/no distinctions; frequency distributions or contextual information are
not taken into account, and prototype effects are filtered out. Thus, in
metalinguistic terms, the constructions man, woman, house, rumination,
oxymoron, brush, kiss, Fühlen, and Gefühl all have equal status as
members of the Aristotelian category "noun".
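The contrast between the two category structures can be made concrete in a minimal Python sketch. The usage counts below are invented purely for illustration and are not corpus data; the function names are likewise hypothetical. As a simplifying assumption, graded association with the schema [noun] is modelled as an item's share of the count of the most frequent item.

```python
# Hypothetical counts of nominal uses per item; invented for illustration only.
NOMINAL_USES = {
    "man": 900, "woman": 850, "house": 800,
    "brush": 120, "kiss": 100,
    "rumination": 5, "oxymoron": 3,
}


def aristotelian_is_noun(word):
    """Yes/no membership: every listed item is a noun, and equally so."""
    return word in NOMINAL_USES


def graded_association(word):
    """Graded membership: strength of association with the schema [noun],
    modelled (as an assumption) as a share of the most frequent item's count."""
    top = max(NOMINAL_USES.values())
    return NOMINAL_USES.get(word, 0) / top


for item in ("man", "brush", "oxymoron"):
    print(item, aristotelian_is_noun(item), round(graded_association(item), 3))
```

Under these assumed counts, man and oxymoron receive the same answer from the yes/no test, whereas their graded association with the schema differs considerably.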
Of course, use of Aristotelian categorization does not mean that we as
language users are unaware of the potential shortcomings of such an
approach. This awareness is also acknowledged in L2 instruction which
draws on metalinguistic descriptions. Most L2 learners will be able to
think of examples of pedagogical grammar rules that are qualified by
frequency adverbs such as usually, in general, etc. Most L2 learners will
likewise be familiar with statements about specific usage contexts as well as
lists of exceptions to a rule that apparently have to be learned by rote.
Finally, the realm of metalinguistic descriptions is not immune to prototype
effects. For instance, descriptions of prototypical functions of a certain
L2 form will occur more often than descriptions of less prototypical
functions of the same form and will thus be more familiar to learners
(Hu 2002). However, it is argued here that these prototype effects only
concern the presentation and/or our perception of metalinguistic de-
scriptions; they do not seem to have any bearing on the internal cate-
gory structure of explicit knowledge representations or the processing
mechanisms operating on these representations, as explicated in the
following.
As a matter of fact, in order to be of use, metalinguistic knowledge re-
quires conditions of stability and discreteness; otherwise, it would be of
little practical value (see also Swan 1994). For metalinguistic knowledge
to be informative, the user needs to decide categorically whether a specific
linguistic construction is to be classified as a noun or not; otherwise, a
metalinguistic description such as "the verb needs to agree in number
with the preceding noun or pronoun" cannot be implemented. By the
same token, the user needs to decide categorically whether a linguistic
construction is a subordinating conjunction or not; otherwise, a
metalinguistic description such as "in German, the finite verb appears at the end of a
subordinate clause" cannot be employed.
To exemplify further, the metalinguistic description "in English
reported speech, the main verb of the sentence changes to the past tense
when it is in the present tense in direct speech" applies in equal measure
to all English utterances, unless it is qualified by further statements about
specific contexts, e.g., "if something that is still true at the time of
speaking is being reported, the main verb may remain in the present tense".
Further propositions are required to make explicit the formal and
functional criteria of introducing reported speech by means of different verbs
such as say and tell, to describe the formal and functional aspects of
reported questions, and so forth (example adapted from Murphy 1994).
No matter how many statements are formulated, though, the user needs
to be able to clearly assign category membership in each case in order to
be able to apply the metalinguistic description, represented as
metalinguistic knowledge, to a concrete linguistic construction. If we cannot
decide categorically whether something is a main verb, whether something is direct
speech, etc., we cannot bring to bear our explicit knowledge.
As a final example, consider a general, dictionary-style metalinguistic
description pertaining to the constructions desk and Schreibtisch ("desk"),
which is again necessarily stable and discrete. The statement that "English
desk means Schreibtisch in German" is posited as a context-independent
proposition which does not take into account prototypicality or usage
situations. In order to achieve a finer descriptive grain, additional
propositions need to be formulated, e.g., "in the context of English check-in desk,
the word Check-in-Schalter needs to be used in German". Conversely, the
implicit linguistic knowledge of a proficient user of both English and
German would accurately reflect the frequency distributions of the
constructions desk, Schreibtisch, and Schalter in connection with the relevant
referential meanings and suitable pragmatic contexts in which these
constructions tend to appear.
The same principle applies to the internal structure of all metalinguistic
categories and propositions about relations between categories that make
up metalinguistic descriptions, regardless of whether these refer to
lexico-semantic, morphosyntactic, phonological, or pragmatic phenomena:
Aristotelian categories are needed to allow for the effective deployment of
metalinguistic knowledge. To reiterate, if we cannot take clear-cut deci-
sions about category membership, our metalinguistic knowledge is of lit-
tle practical value in concrete usage situations.
The contrasting category structures of implicit linguistic and explicit
metalinguistic representations can be expected to affect the processing
mechanisms which operate on these representations during language
learning and use. Indeed, implicit and explicit mental operations involv-
ing natural language appear to be analogous with what is respectively
termed similarity-based and rule-based processing in the field of cognitive
psychology.
Similarity-based and rule-based processing have been studied in
relation to categorization, reasoning, and artificial language learning, and
experimental evidence for a qualitative distinction between the two
processes is quite robust, though not uncontroversial. In accordance with
the weak-interface position adopted in the current paper (see Section 4
above), I am in agreement with researchers who not only regard
rule-based and similarity-based processing as separable and distinct, but also
argue that the defining property of rule-based processing is its conscious
nature (Cleeremans and Destrebecqz 2005; Hampton 2005; Smith 2005).
As mentioned previously, conscious awareness occurs in working
memory, a limited-capacity resource; as rule-based processes require executive
attention and effort, they may exceed an individual's working memory
capacity (Ashby and Casale 2005; Bailey 2005; Reber 2005).
Empirical evidence indicates that rule-based processing is characterized
by compositionality, productivity, systematicity, commitment, and a drive
for consistency (Diesendruck 2005; Pothos 2005; Sloman 2005). A set of
operations is compositional when more complex representations can be
built out of simpler components without a change in the meaning of the
components. Productivity means that, in principle, there is no limit to the
number of such new representations. An operation is systematic when it
applies in the same way to a whole class of objects (Pothos 2005). Rule-
based processing entails commitment to specic kinds of information,
while contextual variations are neglected (Diesendruck 2005). The reason
for this is that rule-based operations involve only a small subset of an
object's properties which are selected for processing, while all other object
dimensions are suppressed (Markman et al. 2005; Pothos 2005). A strict
match between an object's properties and the properties specified in the
rule has to be achieved for rule-based processing to apply. Because of
this, rule-based judgements are more consistent and more stable than
similarity-based judgements (Diesendruck 2005; Pothos 2005). It should
be immediately apparent that all these properties of rule-based processing
are in keeping with the characteristics of Aristotelian category structure
detailed and exemplified above in relation to metalinguistic knowledge,
i.e., stability, discreteness, lack of flexibility, as well as selective and
categorical decision-making.
The characteristics of rule-based processing can be contrasted with the
characteristics of similarity-based processing. The latter involves a large
number of an object's properties, which only need to be partially matched
with the properties of existing representations to allow for successful
categorization (Pothos 2005). Moreover, and contrary to rule-based
processing, similarity-based processing is flexible, dynamic, open, and
susceptible to contextual variation (Diesendruck 2005; Markman et al.
2005). Again, it should be apparent that the attributes of similarity-based
processing identified in the field of cognitive psychology are fully
consonant with the characteristics of implicit linguistic categories assumed in
the usage-based model.
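The difference between the two modes of processing can be sketched in a few lines of Python. This is a toy illustration rather than a model proposed in the literature cited above: the feature names, the feature values assigned to each word, and the 0.5 threshold are all hypothetical. The rule checks a strict match on the single property it names and ignores everything else; the similarity measure considers all properties and requires only a partial match with a stored exemplar.

```python
# Hypothetical feature descriptions; the values are assumptions for illustration.
HAUS = {"takes_article": True, "denotes_entity": True,
        "frequent_as_noun": True, "has_verb_homophone": False}
GEFUEHL = {"takes_article": True, "denotes_entity": False,
           "frequent_as_noun": True, "has_verb_homophone": False}
FUEHLEN = {"takes_article": True, "denotes_entity": False,
           "frequent_as_noun": False, "has_verb_homophone": True}

# "A word which can be used with an article": one property, strictly matched.
NOUN_RULE = {"takes_article": True}


def rule_based_match(obj, rule):
    """Strict match on the few properties named in the rule; all other
    properties of the object are suppressed."""
    return all(obj.get(key) == value for key, value in rule.items())


def similarity(obj, exemplar):
    """Share of all properties on which object and exemplar agree."""
    keys = set(obj) | set(exemplar)
    agreeing = sum(1 for k in keys
                   if k in obj and k in exemplar and obj[k] == exemplar[k])
    return agreeing / len(keys)


def similarity_based_match(obj, exemplars, threshold=0.5):
    """Partial match: close enough to at least one stored exemplar."""
    return any(similarity(obj, e) >= threshold for e in exemplars)


for name, word in (("Gefuehl", GEFUEHL), ("Fuehlen", FUEHLEN)):
    print(name, rule_based_match(word, NOUN_RULE),
          similarity(word, HAUS), similarity_based_match(word, [HAUS]))
```

With these assumed features, the rule treats both words identically, whereas the similarity-based judgement separates the more central from the more marginal instantiation and would shift if the stored exemplars or the threshold changed.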
It is now possible to consider the empirical findings about the role of
metalinguistic knowledge in language learning (see Section 4 above) in
light of the proposed conceptualization of explicit metalinguistic
representations and processes as opposed to implicit linguistic representations
and processes. First, I have argued that linguistic and metalinguistic
knowledge pertain to the same cognitive domain (language) and vary
along the same parameters (specificity and complexity). These
circumstances are consistent with the empirical finding that the two types of
knowledge are positively correlated in L2 learners. At the same time, it
is of course necessary to bear in mind that, considered on their own,
correlations do not allow for direct conclusions to be drawn about
cause-effect relationships, or indeed the directionality of such relationships.
Second, I have suggested that linguistic and metalinguistic knowledge
differ qualitatively in terms of their internal category structure, with
implicitly represented categories characterized by flexibility, fuzziness, and
context-dependency, and explicitly represented categories showing the
contrasting attributes of Aristotelian structure. This proposal is compati-
ble with the existing claim that the two types of knowledge are separate
and distinguishable constructs.
Third, research in cognitive psychology has revealed that rule-based
processes, i.e., processes which operate on explicit knowledge represen-
tations, are characterized by compositionality, productivity, systematic-
ity, commitment, and a drive for consistency. These characteristics are
consonant with the empirical finding that use of metalinguistic
knowledge is associated with consistent, systematic, and often successful L2
performance.
Fourth, rule-based processes are associated with stability and definite
commitment to selected information, while flexibility and attention to
contextual variation are absent. Furthermore, as rule-based processes
require both attentional resources and effort, they are constrained by an
individual's working memory capacity. These circumstances are in
keeping with the empirical finding that use of metalinguistic knowledge does
not guarantee successful L2 performance and may even be unhelpful
in certain situations. Put differently, rule-based processes operating on
Aristotelian categories may not only exceed an individual's working
memory resources in a given situation, but may also fail to capture the
intricacies of certain linguistic constructions in the first place, as
exemplified below.⁷
In sum, it appears that the proposed conceptualization of explicit meta-
linguistic representations and rule-based processes can account for the
benefits as well as the limitations of knowledge based on Aristotelian
category structure. Such knowledge is at its best when it pertains to highly
frequent and entirely systematic patterns whose usage is largely indepen-
dent of context and may be described in terms of one or a few relations
between categories. "In English, an -s needs to be added to present tense
verbs in the third person" is an example of a metalinguistic description
instantiating metalinguistic knowledge of this kind. Conversely, metalin-
guistic knowledge is less useful, or perhaps even useless, when less
frequent, more item-based constructions exhibiting complicated form-
meaning relations need to be captured, since the required number of cat-
egories and propositions specifying relations between categories grows
rapidly with every specic usage context that diverges from the regular
pattern.
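A description of the favourable kind can be written as a single short function, as in the following Python sketch. The function name and its parameters are hypothetical, and the restriction to the singular is an assumption added here, since the description as quoted mentions only the third person. Spelling variants (e.g., goes) and irregular verbs (e.g., has) are deliberately left out; each would require a further proposition of its own.

```python
def present_tense(verb, person, number):
    """Apply one proposition: third person (singular assumed) takes -s."""
    if person == 3 and number == "singular":
        return verb + "s"
    return verb


print(present_tense("walk", 3, "singular"))
print(present_tense("walk", 1, "singular"))
```

The rule involves one relation between two categories and applies regardless of context, which is what makes it cheap to hold in working memory.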
To exemplify, our implicit representations of the linguistic
constructions desk and Schreibtisch ("desk") include a wealth of information about
appropriate pragmatic usage contexts of the linguistic forms based on
cultural models relating to the meanings they symbolize. Accordingly, the
implicit linguistic representations of a proficient user of English and
German would include information about the suitability of the construction
desk to describe an item of furniture commonly found in an office, as
well as the place where you check in at an airport or see a bank clerk to
open an account. Furthermore, the proficient user would hold
information about the suitability of the construction Schreibtisch in the former
scenario but not in the latter.
At the implicit level, this probabilistic information is represented in a
vast network of associations subject to parallel distributed processing,
i.e., non-conscious operations that are unaffected by the constraints of
working memory and the cumbersome propositional nature of explicit
knowledge representations and processes. By contrast, the Aristotelian
categories and relations of the relevant metalinguistic description require
the formulation of a set of independent propositions that specify different
usage situations, such as "English desk is Schreibtisch in German.
However, if you want to say English desk in German and if the expression is
used in the context of an airport or a bank, Schalter needs to be used",
and so forth.
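The propositional format of such a description can be illustrated with a small Python sketch in which each explicit proposition is one entry in a list. The data structure, the function name, and the context labels are hypothetical; only the three translation statements themselves are taken from the example above.

```python
# Each explicit proposition pairs a usage context with a German form.
# A context of None marks the general, context-independent statement.
PROPOSITIONS = [
    ("airport", "Schalter"),
    ("bank", "Schalter"),
    (None, "Schreibtisch"),
]


def translate_desk(context=None):
    """Return the German form licensed by the first matching proposition."""
    for proposition_context, german in PROPOSITIONS:
        if proposition_context is None or proposition_context == context:
            return german
    raise LookupError("no proposition covers this context")


print(translate_desk("office"))
print(translate_desk("airport"))
```

Every usage context that diverges from the general statement requires a further entry, and each entry must be consulted one at a time, whereas the implicit network of associations is assumed to represent the same information in parallel.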
At the level of more schematic categories, the implicit linguistic
knowledge of a proficient user of English and German would include not only
the schema [co-ordinating conjunction], but likewise instantiations of
this schema, all of which are associated with a wealth of linguistic and
conceptual context information. Accordingly, the fact that the German
constructions aber, jedoch, allein and sondern may all be translated as
English "but" would be complemented not only by information about the high
frequency of aber, but also by knowledge of the specific syntactic proper-
ties of jedoch, the literary or archaic connotations of allein, the tendency
of sondern to be used in contradicting a preceding negative, etc. However,
the metalinguistic descriptions formulated in the previous sentence clearly
show that, when made explicit, this information needs to be stated in
terms of additional independent propositions based on stable and discrete
categories.
This potentially explosive growth of propositions that would be re-
quired to make explicit representations applicable in different contexts
has two detrimental consequences. First, it increases working memory
load and thus renders metalinguistic knowledge proportionally more bur-
densome to process; and, second, it becomes less widely applicable. These
potential drawbacks of explicit, rule-based processes apply in equal mea-
sure to the use of metalinguistic knowledge, i.e., reasoning about lan-
guage, and reasoning in other cognitive domains: 'If there is white-grey
smoke coming out of the kitchen oven where I have had fish cooking for
the last three hours, then there is a fire' (example adapted from Pothos
2005: 8) is obviously both harder to process and less useful than 'if there
is smoke, then there is fire'. Unfortunately, the complexity, flexibility, and
context-dependency of natural language mean that general (and truthful)
metalinguistic descriptions equivalent to the latter statement are inevita-
bly rather rare.
6. Empirical predictions
In the preceding section, I have argued that the distinct category struc-
tures and processes which characterize explicit and implicit knowledge
are consonant with existing findings in the area of SLA. Naturally, a ret-
rospective explanatory account can only take us so far. However, the the-
oretical proposals I have put forward offer us further and arguably more
important insights: They allow for the formulation of empirically testable
predictions with regard to the role of metalinguistic knowledge in L2
learning. In what follows, five specific hypotheses which are intended to
inform future research are presented.
(1) Linguistic constructions which are captured relatively easily by Aris-
totelian categories and relations between such categories will be easier to
acquire explicitly than linguistic constructions which are not captured
easily by Aristotelian categories and relations between such categories.
Specifically, linguistic constructions which show comparatively system-
atic, stable, and context-independent usage patterns should be more ame-
nable to explicit teaching and learning than linguistic constructions which
do not show these usage patterns.
There is as yet very little existing research which has investigated the
potential amenability of specific linguistic constructions to explicit L2
instruction drawing on metalinguistic descriptions, even though theo-
retically motivated predictions about the potential difficulties of simple
versus complex metalinguistic rules were put forward more than a decade
ago (e.g., DeKeyser 1994; Hulstijn and de Graaff 1994). Recent empirical
findings suggest that L2 form-function mappings which can be described
metalinguistically in conceptually simple terms and which refer to system-
atic usage patterns appear to pose the least explicit learning difficulty (R.
Ellis 2006; Roehr and Ganem 2007) and may therefore be particularly
suitable for explicit teaching and learning. By contrast, L2 form-function
mappings with less systematic usage patterns which require conceptually
complex metalinguistic descriptions should pose greater explicit learning
difficulty. In view of the small number of studies that have been con-
ducted so far, further investigation of Hypothesis 1 is clearly required.
(2) Use of metalinguistic knowledge will differentially affect the fluency,
accuracy, and complexity of L2 performance. Specifically, fluency may
decrease, while accuracy and complexity may increase.
Existing research has shown that L2 learners' metalinguistic knowledge
correlates positively with L2 proficiency, provided that the latter is oper-
ationalized by means of written rather than oral measures (e.g., Alderson
et al. 1997; Elder et al. 1999; Renou 2000). Given that the use of explicit
knowledge requires controlled processing which is by definition slow and
effortful compared with automatic, implicit operations, this finding is
perfectly compatible with previous theoretical argumentation. However,
whilst L2 proficiency has typically been operationalized via discrete-item
tests of structural and lexical competence and/or via the four skills of
reading, writing, speaking, and listening, no study to date has investigated
learners' use of metalinguistic knowledge in relation to the SLA-specific
developmental measures of fluency, accuracy, and complexity (R. Ellis
and Barkhuizen 2005; Larsen-Freeman 2006; Skehan 1998) which cut
across both oral and written performance.
In view of the fact that explicit, rule-based processing drawing on rep-
resentations with Aristotelian category structure is subject to working
memory constraints and thus relies on the selective allocation of atten-
tional resources, one would expect that increased accuracy, for instance,
can only be achieved at the expense of decreased complexity and fluency.
Likewise, increased complexity can only be achieved at the expense of
decreased accuracy and fluency, whereas increased fluency is unlikely to be
achieved at all in association with high use of metalinguistic knowledge.
Averaged across a group of learners, these predicted patterns should
hold for both oral and written performance, although trade-off effects
can be expected to be stronger in the case of oral performance, since the
time pressures of online processing inevitably place even higher demands
on working memory. To my knowledge, none of the performance pat-
terns hypothesized here have been subjected to empirical enquiry yet.
(3) Use of metalinguistic knowledge will be related to cognitively based
individual learner differences. Specifically, a learner's cognitive and learn-
ing style, language learning aptitude, and working memory capacity are
likely to differentially affect their use of metalinguistic knowledge in L2
performance.
I have argued that metalinguistic knowledge representations exhibit
Aristotelian category structure and that rule-based processing mecha-
nisms operate on these representations. As mentioned previously, rule-
based processing mechanisms are characteristic of analytic reasoning
more generally, so that use of metalinguistic knowledge can be regarded
as problem-solving in the linguistic domain. Accordingly, individuals
with an analytic stylistic orientation and large working memory capacity
should be particularly adept at using metalinguistic knowledge.
While existing research has occasionally speculated on some of these
issues (e.g., Collentine 2000; DeKeyser 2003), no study to date has
probed the relationship between L2 learners' metalinguistic knowledge
and their stylistic preferences (for recent work on cognitive and learning
style in SLA more generally, see, for instance, Ehrman and Leaver 2003;
Reid 1998). As far as I am aware, only one study to date has directly in-
vestigated the interplay of L2 learners' metalinguistic knowledge, their
language learning aptitude, and their working memory capacity (Roehr
and Ganem 2007). Results indicate that learners' level of metalinguistic
knowledge and their working memory capacity are unrelated, but that
analytic components of language learning aptitude, i.e., components
whose operationalization incorporates no purely memory-based or purely
auditory elements, were positively correlated with learners' level of meta-
linguistic knowledge (r = 0.42). In view of the shortage of available evi-
dence, further research into the relationship between metalinguistic
knowledge and cognitively based individual difference variables is needed.
(4) Use of metalinguistic knowledge and cognitively based individual
differences will be related to learners' affective responses. Specifically, in-
dividuals with an analytic disposition who are likely to benefit from ex-
plicit learning and teaching drawing on metalinguistic knowledge will
experience feelings of greater self-efficacy and will thus develop positive
attitudes towards their L2 learning situation. By contrast, individuals with
a non-analytic disposition who are likely to benefit less from explicit learn-
ing and teaching drawing on metalinguistic knowledge will experience
greater anxiety and will thus develop negative attitudes towards their L2
learning situation.
To my knowledge, there is as yet no published research that has put
this prediction to the test (but see Roehr 2005 for some preliminary
analyses based on a small number of cases; for work on the interaction
of affect and cognition more generally, see, for instance, Schumann 1998,
2004; Stevick 1999). In view of Hypothesis 1 above, it is plausible to hy-
pothesize that metalinguistic descriptions which pertain to linguistic con-
structions characterized by systematic and relatively context-independent
usage patterns may be facilitative for any L2 learner, regardless of cogni-
tively based individual differences. Such metalinguistic descriptions may
focus a learner's attention on aspects of the L2 input that might otherwise
be ignored, thus leading to noticing, i.e., conscious processing just above
the threshold of awareness, and all its associated benefits.
If, on the other hand, metalinguistic descriptions pertaining to linguis-
tic constructions that pose more substantial explicit learning difficulty ac-
cording to Hypothesis 1 are used, cognitively based individual learner dif-
ferences should begin to matter. An analytically oriented individual may
continue to benefit by moving beyond noticing towards understanding,
thus relying on conscious processing at a high level of awareness (Schmidt
1990, 1993, 2001). The achievement of understanding is likely to result in
positive affective responses such as feelings of greater self-efficacy and en-
hanced self-confidence. A positive attitude towards the L2 learning situa-
tion may result, which would in turn encourage the learner to deliberately
seek further exposure to the L2. In a learner with a different stylistic ori-
entation, however, this upward dynamic could well be replaced by a
downward spiral of failure to understand, feelings of anxiety and loss of
control, a negative attitude towards the L2 learning situation, and, in the
worst-case scenario, the eventual abandonment of L2 study. This hy-
pothesized interaction of cognitive and affective variables can and should
be put to the test.
(5) Use of metalinguistic knowledge in L2 learning will be related to L1
metalinguistic ability. Specifically, individuals who show strong metalin-
guistic ability and literacy skills in L1 development are likely to exhibit
high levels of metalinguistic knowledge in L2.
With regard to metalinguistic knowledge in adult learners, the link be-
tween L1 and L2 skills has not been widely explored. Some studies have
incorporated measures of L1 metalinguistic knowledge alongside tests of
L2 metalinguistic knowledge (e.g., Alderson et al. 1997), or acknowledged
the association between metalinguistic and literacy skills (e.g., Kemp
2001). Furthermore, existing research has emphasized the link between
L1 ability and aptitude for L2 learning (e.g., Sparks and Ganschow
2001), or highlighted the fact that multilingual individuals generally
show greater metalinguistic awareness (e.g., Jessner 1999, 2006). Yet, I
am not aware of any published study of cognitively mature learners
which has directly focused on the relationship between L1 and L2 compe-
tence on the one hand and L1 and L2 metalinguistic knowledge on the
other hand. If Hypotheses 3 and 4 are borne out, the patterns of interplay
between individual difference variables and metalinguistic knowledge can
be expected to be similar in both L1 and L2.
7. Conclusion
In this paper, I have put forward a theoretically motivated and empirically
grounded conceptualization of the construct of metalinguistic knowledge,
or explicit knowledge about language, with specific reference to L2 learn-
ing. I have argued that explicit metalinguistic and implicit linguistic
knowledge vary along the same parameters, specificity and complexity,
but that they differ qualitatively in terms of their internal category struc-
ture and, accordingly, the processing mechanisms that operate on their
representation in the human mind. In consonance with assumptions made
in the usage-based approach to language, implicit knowledge is character-
ized by flexible and context-dependent categories with fuzzy boundaries.
By contrast, explicit knowledge is represented in terms of Aristotelian cat-
egories with a stable, discrete, and context-independent structure.
In accordance with research in cognitive psychology, implicit knowl-
edge is subject to similarity-based processing which is characterized by
dynamicity, flexibility, and context-dependency. Conversely, explicit
knowledge is subject to rule-based processing which is both conscious
and controlled. Such processing is constrained by the capacity limits of
working memory; it requires effort, selective attention, and commit-
ment. Rule-based processing is further characterized by stability and
consistency, properties that are achieved at the cost of flexibility and
consideration of contextual and frequency information. Rule-based pro-
cessing underlies analytic reasoning, whether in the linguistic or any other
cognitive domain. Hence, use of metalinguistic knowledge can be under-
stood as problem-solving applied to language.
The proposed attributes of implicit linguistic and explicit metalinguistic
category structures and processes have been considered in relation to
available research in the field of SLA, and a post-hoc account that is
consistent with both the benefits and the limitations of metalinguistic
knowledge as identified in existing research has been provided. Arising
from the theoretical proposals put forward in the present paper, I have
further formulated five specific predictions which, if confirmed, would
identify the conditions under which metalinguistic knowledge is likely to
be useful to the L2 learner. These predictions constitute empirically test-
able hypotheses which, it is hoped, will be addressed in future research.
Received 7 August 2006 University of Essex, UK
Revision received 16 May 2007
Notes
* I would like to thank Martin Atkinson, Bob Borsley, Ewa Dabrowska, and two anony-
mous reviewers for their helpful and constructive comments. I am also grateful to Sonja
Eisenbeiss, Roger Hawkins, and Max Roberts for reading an earlier version of this
paper. Address for correspondence: Karen Roehr, Department of Language & Lin-
guistics, University of Essex, Wivenhoe Park, Colchester CO4 3SQ, UK; email:
<kroehr@essex.ac.uk>.
1. The following notation conventions are used: Schematic categories are shown in small
capitals with square brackets, e.g., [bird]. Exemplars of conceptual categories are shown
in small capitals, e.g., robin. Specific linguistic constructions are shown in italics, e.g.,
bachelor, unrepentant, etc. Metalinguistic descriptions are shown in single inverted com-
mas, e.g., 'da sends the finite verb to the end of the clause'.
2. Langacker's (1991) terminology is employed throughout this article. Croft (2001) uses
the terms 'atomic' and 'substantive' instead of 'minimal' and 'specific', respectively.
3. The weak-interface position can be contrasted with the non-interface position and the
strong-interface position. The non-interface position contends not only that explicit and
implicit knowledge are separate and distinct constructs, but also that they cannot engage
in interplay (Krashen 1981, 1985; Paradis 2004). The strong-interface position maintains
that explicit and implicit knowledge interact directly, and that explicit knowledge may
be converted into implicit knowledge, e.g., through prolonged practice (DeKeyser
1994; Johnson 1996; McLaughlin 1995). A review of these various positions can be
found in R. Ellis (2005).
4. Current research into the interface between explicit and implicit knowledge does not yet
offer any highly precise descriptions of the links between the level of the mind and the
level of the brain. Likewise, researchers' understanding of the notion of consciousness
is still incomplete. Therefore, what I present here are hypotheses that are compatible
with existing empirical findings. While recognizing that further research is required, I re-
gard these hypotheses both as sufficiently plausible to be given serious consideration and
as sufficiently detailed to be incorporated into a coherent line of argument.
5. As mentioned previously, for the current discussion it does not matter whether an indi-
vidual's metalinguistic knowledge has been derived internally or assimilated from exter-
nal sources.
6. URL: <http://wordnet.princeton.edu/perl/webwn>, retrieved 16 April 2007, based on a
keyword search for noun.
7. This circumstance is consistent with the proposal that explicit knowledge about lan-
guage may be more inaccurate and more imprecise than implicit knowledge (R. Ellis
2004, 2005, 2006). While, at first glance, this hypothesis seems to be incompatible with
the attributes of rule-based processing, it fits into the picture if the limitations of meta-
linguistic knowledge based on representations with Aristotelian category structure are
taken into consideration.
References
Abbot-Smith, Kirsten and Michael Tomasello
2006 Exemplar-learning and schematization in a usage-based account of syntactic
acquisition. The Linguistic Review 23 (3), 275–290.
Achard, Michel, and Susanne Niemeier (eds.)
2004 Cognitive Linguistics, Second Language Acquisition, and Foreign Language
Teaching. Berlin: Mouton de Gruyter.
Alderson, J. Charles, Caroline Clapham, and David Steel
1997 Metalinguistic knowledge, language aptitude and language proficiency. Lan-
guage Teaching Research 1, 93–121.
Anderson, John R.
1995 Learning and Memory: An Integrated Approach. New York, NY: John Wiley
and Sons.
1996 The Architecture of Cognition. Mahwah, NJ: Erlbaum.
2005 Cognitive Psychology and its Implications (6th ed.). New York, NY: Worth
Publishers.
Ashby, F. Gregory and Michael B. Casale
2005 Empirical dissociations between rule-based and similarity-based categoriza-
tion. Behavioral and Brain Sciences 28 (1), 15–16.
Baddeley, Alan D.
1997 Human Memory: Theory and Practice. Hove: Psychology Press.
2000 The episodic buffer: A new component of working memory? Trends in Cog-
nitive Sciences 4 (11), 417–423.
Baddeley, Alan D. and Robert H. Logie
1999 Working memory: The multiple-component model. In Miyake, Akira and
Priti Shah (eds.), Models of Working Memory: Mechanisms of Active Main-
tenance and Executive Control. Cambridge: Cambridge University Press,
28–61.
Bailey, Todd M.
2005 Rules work on one representation; similarity compares two representations.
Behavioral and Brain Sciences 28 (1), 16.
Bates, Elizabeth A. and Judith C. Goodman
2001 On the inseparability of grammar and the lexicon: Evidence from acquisi-
tion. In Tomasello, Michael and Elizabeth A. Bates (eds.), Language Devel-
opment. Malden, MA: Blackwell, 134–162.
Bayne, Tim and David J. Chalmers
2003 What is the unity of consciousness? In Cleeremans, Axel (ed.), The Unity of
Consciousness: Binding, Integration, and Dissociation. Oxford: Oxford Uni-
versity Press, 23–58.
Bialystok, Ellen
1979 Explicit and implicit judgements of L2 grammaticality. Language Learning
29 (1), 81–103.
Birdsong, David
1989 Metalinguistic Performance and Interlinguistic Competence. Berlin: Springer.
Bod, Rens, Jennifer Hay, and Stefanie Jannedy
2003 Introduction. In Bod, Rens, Jennifer Hay, and Stefanie Jannedy (eds.),
Probabilistic Linguistics. Cambridge, MA: MIT Press, 1–10.
Boers, Frank and Seth Lindstromberg
2006 Cognitive linguistic applications in second and foreign language instruction:
Rationale, proposals, and evaluation. In Kristiansen, Gitte, Michel Achard,
René Dirven, and Francisco J. Ruiz de Mendoza Ibáñez (eds.), Cognitive
Linguistics: Current Applications and Future Perspectives. Berlin: Mouton
de Gruyter, 303–355.
Butler, Yuko Goto
2002 Second language learners' theories on the use of English articles: An analysis of
the metalinguistic knowledge used by Japanese students in acquiring the En-
glish article system. Studies in Second Language Acquisition 24 (3), 451–480.
Bybee, Joan L. and James L. McClelland
2005 Alternatives to the combinatorial paradigm of linguistic theory based on do-
main general principles of human cognition. The Linguistic Review 22 (2–4),
381–410.
Camps, Joaquim
2003 Concurrent and retrospective verbal reports as tools to better understand the
role of attention in second language tasks. International Journal of Applied
Linguistics 13 (2), 201–221.
Cattell, Ray
2006 An Introduction to Mind, Consciousness and Language. London: Continuum.
Chalker, Sylvia
1994 Pedagogical grammar: Principles and problems. In Bygate, Martin, Alan
Tonkyn, and Eddie Williams (eds.), Grammar and the Language Teacher.
New York, NY: Prentice Hall, 31–44.
Cleeremans, Axel and Arnaud Destrebecqz
2005 Real rules are conscious. Behavioral and Brain Sciences 28 (1), 19–20.
Collentine, Joseph
2000 Insights into the construction of grammatical knowledge provided by user-
behavior tracking technologies. Language Learning and Technology 3 (2),
44–57.
Croft, William
2001 Radical Construction Grammar: Syntactic Theory in Typological Perspective.
Oxford: Oxford University Press.
Daneman, Meredyth and Patricia A. Carpenter
1980 Individual differences in working memory and reading. Journal of Verbal
Learning and Verbal Behavior 19, 450–466.
DeKeyser, Robert M.
1994 How implicit can adult second language learning be? AILA Review 11, 83–96.
2003 Implicit and explicit learning. In Doughty, Catherine J. and Michael H.
Long (eds.), The Handbook of Second Language Acquisition. Malden, MA:
Blackwell, 313–348.
2005 What makes learning second-language grammar difficult? A review of issues.
Language Learning 55 (s1), 1–25.
Dienes, Zoltan and Josef Perner
2003 Unifying consciousness with explicit knowledge. In Cleeremans, Axel (ed.),
The Unity of Consciousness: Binding, Integration, and Dissociation. Oxford:
Oxford University Press, 214–232.
Diesendruck, Gil
2005 Commitment distinguishes between rules and similarity: A developmental
perspective. Behavioral and Brain Sciences 28 (1), 21–22.
Dirven, René
1990 Pedagogical grammar. Language Teaching 23 (1), 1–18.
Dirven, René and Marjolijn Verspoor
2004 Cognitive Exploration of Language and Linguistics (2nd ed.). Amsterdam:
John Benjamins.
Doughty, Catherine J.
1991 Second language instruction does make a difference: Evidence from an em-
pirical study of SL relativization. Studies in Second Language Acquisition 13,
431–469.
2003 Instructed SLA: Constraints, compensation, and enhancement. In Doughty,
Catherine J. and Michael H. Long (eds.), The Handbook of Second Lan-
guage Acquisition. Malden, MA: Blackwell, 256–310.
Ehrman, Madeline E. and Betty Lou Leaver
2003 Cognitive style in the service of language learning. System 31, 393–415.
Elder, Catherine and Diane Manwaring
2004 The relationship between metalinguistic knowledge and learning outcomes
among undergraduate students of Chinese. Language Awareness 13 (3),
145–162.
Elder, Catherine, Jane Warren, John Hajek, Diane Manwaring, and Alan Davies
1999 Metalinguistic knowledge: How important is it in studying a language at
university? Australian Review of Applied Linguistics 22 (1), 81–95.
Ellis, Nick C.
1993 Rules and instances in foreign language learning: Interactions of explicit and
implicit knowledge. European Journal of Cognitive Psychology 5 (3), 289–318.
1994 Consciousness in second language learning: Psychological perspectives on
the role of conscious processes in vocabulary acquisition. AILA Review 11,
37–56.
1996 Sequencing in SLA: Phonological memory, chunking, and points of order.
Studies in Second Language Acquisition 18, 91–126.
2001 Memory for language. In Robinson, Peter (ed.), Cognition and Second Lan-
guage Instruction. Cambridge: Cambridge University Press, 33–68.
2002a Frequency effects in language processing: A review with implications for
theories of implicit and explicit language acquisition. Studies in Second Lan-
guage Acquisition 24 (2), 143–188.
2002b Reflections on frequency effects in language processing. Studies in Second
Language Acquisition 24 (2), 297–340.
2003 Constructions, chunking, and connectionism: The emergence of second lan-
guage structure. In Doughty, Catherine J. and Michael H. Long (eds.), The
Handbook of Second Language Acquisition. Malden, MA: Blackwell, 63–103.
2005 At the interface: Dynamic interactions of explicit and implicit language
knowledge. Studies in Second Language Acquisition 27 (2), 305–352.
Ellis, Nick C. and Diane Larsen-Freeman
2006 Language emergence: Implications for applied linguistics. Applied Linguis-
tics 27 (4), 558–589.
Ellis, Rod
2001 Introduction: Investigating form-focused instruction. Language Learning 51
(1), 1–46.
2002 Does form-focused instruction affect the acquisition of implicit knowledge?
Studies in Second Language Acquisition 24 (2), 223–236.
2004 The definition and measurement of L2 explicit knowledge. Language Learn-
ing 54 (2), 227–275.
2005 Measuring implicit and explicit knowledge of a second language: A psycho-
metric study. Studies in Second Language Acquisition 27 (2), 141–172.
2006 Modelling learning difficulty and second language proficiency: The differen-
tial contributions of implicit and explicit knowledge. Applied Linguistics 27
(3), 431–463.
Ellis, Rod and Gary Barkhuizen
2005 Analysing Learner Language. Oxford: Oxford University Press.
Engel, Andreas K.
2003 Temporal binding and the neural correlates of consciousness. In Cleere-
mans, Axel (ed.), The Unity of Consciousness: Binding, Integration, and Dis-
sociation. Oxford: Oxford University Press, 132–152.
Gabrielatos, Costas
2004 If-conditionals in ELT materials and the BNC: Corpus-based evaluation of
pedagogical materials. Paper presented at the Corpus Linguistics Research
Group meeting on 26 April 2004, Lancaster University.
Goldberg, Adele E.
1995 Constructions: A Construction Grammar Approach to Argument Structure.
Chicago: University of Chicago Press.
1999 The emergence of the semantics of argument structure constructions. In
MacWhinney, Brian (ed.), The Emergence of Language. Mahwah, NJ: Erl-
baum, 197–212.
2003 Constructions: A new theoretical approach to language. Trends in Cognitive
Sciences 7 (5), 219–224.
Gombert, Jean Emile
1992 Metalinguistic Development. Hemel Hempstead: Harvester.
Green, Peter S. and Karlheinz Hecht
1992 Implicit and explicit grammar: An empirical study. Applied Linguistics 13
(2), 168–184.
Hampton, James A.
2005 Rules and similarity – a false dichotomy. Behavioral and Brain Sciences 28
(1), 26.
Hu, Guangwei
2002 Psychological constraints on the utility of metalinguistic knowledge in sec-
ond language production. Studies in Second Language Acquisition 24 (3),
347–386.
Hulstijn, Jan H.
2005 Theoretical and empirical issues in the study of implicit and explicit second-
language learning: Introduction. Studies in Second Language Acquisition 27
(2), 129–140.
Hulstijn, Jan H. and Rick de Graaff
1994 Under what conditions does explicit knowledge of a second language facili-
tate the acquisition of implicit knowledge? A research proposal. AILA Re-
view 11, 97–112.
Jessner, Ulrike
1999 Metalinguistic awareness in multilinguals: Cognitive aspects of third lan-
guage learning. Language Awareness 8 (3–4), 201–209.
2006 Linguistic Awareness in Multilinguals: English as a Third Language. Edin-
burgh: Edinburgh University Press.
Johnson, Keith
1996 Language Teaching and Skill Learning. Oxford: Blackwell.
Just, Marcel Adam and Patricia A. Carpenter
1992 A capacity theory of comprehension: Individual differences in working
memory. Psychological Review 99 (1), 122–149.
Karmiloff, Kyra and Annette Karmiloff-Smith
2002 Pathways to Language: From Fetus to Adolescent. Cambridge, MA: Harvard
University Press.
Kemmer, Suzanne and Michael Barlow
2000 Introduction: A usage-based conception of language. In Barlow, Michael
and Suzanne Kemmer (eds.), Usage-Based Models of Language. Stanford,
CA: CSLI, vii–xxviii.
Kemp, Charlotte
2001 Metalinguistic awareness in multilinguals: Implicit and explicit grammatical
awareness and its relationship with language experience and language at-
tainment. Unpublished doctoral dissertation, University of Edinburgh.
Klapper, John and Jonathan Rees
2003 Reviewing the case for explicit grammar instruction in the university foreign
language learning context. Language Teaching Research 7 (3), 285–314.
Krashen, Stephen D.
1981 Second Language Acquisition and Second Language Learning. Oxford:
Pergamon.
1985 The Input Hypothesis: Issues and Implications. London: Longman.
Langacker, Ronald W.
1991 Concept, Image, and Symbol: The Cognitive Basis of Grammar. Berlin: Mou-
ton de Gruyter.
1999 Grammar and Conceptualization. Berlin: Mouton de Gruyter.
2000 A dynamic usage-based model. In Barlow, Michael and Suzanne Kemmer
(eds.), Usage-Based Models of Language. Stanford, CA: CSLI, 1–64.
Larsen-Freeman, Diane
2006 The emergence of complexity, fluency, and accuracy in the oral and written
production of five Chinese learners of English. Applied Linguistics 27 (4),
590–619.
Leow, Ronald P.
1997 Attention, awareness, and foreign language behavior. Language Learning
47, 467–505.
MacWhinney, Brian
1997 Implicit and explicit processes: Commentary. Studies in Second Language
Acquisition 19, 277–281.
Markman, Arthur B., Sergey Blok, Kyungil Kom, Levi Larkey, Lisa R. Narvaez, C. Hunt
Stilwell, and Eric Taylor
2005 Digging beneath rules and similarity. Behavioral and Brain Sciences 28 (1),
29–30.
McDonough, Steven
2002 Applied Linguistics in Language Education. London: Arnold.
McLaughlin, Barry
1995 Aptitude from an information-processing perspective. Language Testing 12,
370–387.
Miyake, Akira and Naomi P. Friedman
1998 Individual differences in second language proficiency: Working memory as
language aptitude. In Healy, Alice F. and Lyle E. Bourne (eds.), Foreign
Language Learning: Psycholinguistic Studies on Training and Retention.
Mahwah, NJ: Erlbaum, 339–364.
Miyake, Akira and Priti Shah
1999 Toward unified theories of working memory: Emerging general consensus,
unresolved theoretical issues, and future research directions. In Miyake,
Akira and Priti Shah (eds.), Models of Working Memory: Mechanisms of
Active Maintenance and Executive Control. Cambridge: Cambridge Univer-
sity Press, 442–481.
Murphy, Gregory L.
2004 The Big Book of Concepts. Cambridge, MA: MIT Press.
Murphy, Raymond
1994 English Grammar in Use (2nd ed.). Cambridge: Cambridge University Press.
Nagata, Noriko and Virginia M. Swisher
1995 A study of consciousness-raising by computer: The effect of metalinguistic
feedback on second language learning. Foreign Language Annals 28 (3),
337–347.
Norris, John M. and Lourdes Ortega
2001 Does type of instruction make a difference? Substantive findings from a
meta-analytic review. Language Learning 51 (1), 157–213.
Paradis, Michel
2004 A Neurolinguistic Theory of Bilingualism. Amsterdam: John Benjamins.
Pothos, Emmanuel M.
2005 The rules versus similarity distinction. Behavioral and Brain Sciences 28 (1),
1–49.
Reber, Rolf
2005 Rule versus similarity: Different in processing mode, not in representations.
Behavioral and Brain Sciences 28 (1), 31–32.
Reid, Joy M. (ed.)
1998 Understanding Learning Styles in the Second Language Classroom. Upper
Saddle River, NJ: Prentice Hall Regents.
Renou, Janet M.
2000 Learner accuracy and learner performance: The quest for a link. Foreign
Language Annals 33 (2), 168–180.
Robinson, Peter
1995 Review article: Attention, memory, and the 'noticing' hypothesis. Lan-
guage Learning 45 (2), 283–331.
1997 Generalizability and automaticity of second language learning under im-
plicit, incidental, enhanced, and instructed conditions. Studies in Second
Language Acquisition 19, 223247.
2003 Attention and memory during SLA. In Doughty, Catherine J. and Michael
H. Long (eds.), The Handbook of Second Language Acquisition. Malden,
MA: Blackwell, 631678.
Roehr, Karen
2005 Metalinguistic knowledge in second language learning: An emergentist per-
spective. Unpublished doctoral dissertation, Lancaster University.
2006 Metalinguistic knowledge in L2 task performance: A verbal protocol analysis.
Language Awareness 15 (3), 180–198.
2007 Metalinguistic knowledge and language ability in university-level L2 learners.
Applied Linguistics. doi: 10.1093/applin/amm037. URL:
<http://applij.oxfordjournals.org/cgi/content/full/amm037?ijkey=1xNurNzW63Rt3Um&keytype=ref>.
Roehr, Karen and Adela Ganem
2007 Metalinguistic knowledge in L2 learning: An individual difference variable.
Paper presented at Euro SLA on 13 September 2007, Newcastle University.
Rosa, Elena and Michael D. O'Neill
1999 Explicitness, intake, and the issue of awareness. Studies in Second Language
Acquisition 21, 511–556.
Rosch, Eleanor, and Barbara B. Lloyd (eds.)
1978 Cognition and Categorization. Hillsdale, NJ: Erlbaum.
Rosch, Eleanor and Carolyn B. Mervis
1975 Family resemblances: Studies in the internal structure of categories. Cognitive
Psychology 7, 573–605.
Sanz, Cristina and Kara Morgan-Short
2004 Positive evidence versus explicit rule presentation and explicit negative
feedback: A computer-assisted study. Language Learning 54 (1), 35–78.
2005 Explicitness in pedagogical interventions: Input, practice, and feedback. In
Sanz, Cristina (ed.), Mind and Context in Adult Second Language Acquisition:
Methods, Theory, and Practice. Washington, DC: Georgetown University
Press, 234–263.
Saporta, Sol
1973 Scientific grammars and pedagogical grammars. In Allen, J. P. B. and Pit
Corder (eds.), The Edinburgh Course in Applied Linguistics. London: Oxford
University Press, 265–274.
Schmidt, Richard W.
1990 The role of consciousness in SLA learning. Applied Linguistics 11, 129–158.
1993 Awareness and second language acquisition. Annual Review of Applied Linguistics
13, 206–226.
2001 Attention. In Robinson, Peter (ed.), Cognition and Second Language Instruction.
Cambridge: Cambridge University Press, 3–32.
Schumann, John H.
1998 The neurobiology of affect in language. Language Learning 48 (s1), xi–341.
2004 The neurobiology of aptitude. In Schumann, John H., Sheila E. Crowell,
Nancy E. Jones, Namhee Lee, Sara Ann Schuchert, and Lee A. Wood
(eds.), The Neurobiology of Learning: Perspectives from Second Language
Acquisition. Mahwah, NJ: Erlbaum, 7–22.
Segalowitz, Norman
2003 Automaticity and second languages. In Doughty, Catherine J. and Michael
H. Long (eds.), The Handbook of Second Language Acquisition. Malden,
MA: Blackwell, 382–408.
Shah, Priti and Akira Miyake
1999 Models of working memory: An introduction. In Miyake, Akira and Priti
Shah (eds.), Models of Working Memory: Mechanisms of Active Maintenance
and Executive Control. Cambridge: Cambridge University Press, 1–27.
Simard, Daphnée and Wynne Wong
2001 Alertness, orientation, and detection: The conceptualization of attentional
functions in SLA. Studies in Second Language Acquisition 23, 103–124.
Skehan, Peter
1998 A Cognitive Approach to Language Learning. Oxford: Oxford University
Press.
Sloman, Steven
2005 Avoiding foolish consistency. Behavioral and Brain Sciences 28 (1), 33–34.
Smith, Edward
2005 Rule and similarity as prototype concepts. Behavioral and Brain Sciences 28
(1), 34–35.
Sorace, Antonella
1985 Metalinguistic knowledge and language use in acquisition-poor environments.
Applied Linguistics 6 (3), 239–254.
Sparks, Richard and Leonore Ganschow
2001 Aptitude for learning a foreign language. Annual Review of Applied Linguistics
21, 90–111.
Stankov, Lazar
2003 Complexity in human intelligence. In Sternberg, Robert J., Jacques
Lautrey, and Todd I. Lubart (eds.), Models of Intelligence: International
Perspectives. Washington, DC: American Psychological Association, 27–42.
Stevick, Earl W.
1999 Affect in learning and memory: From alchemy to chemistry. In Arnold, Jane
(ed.), Affect in Language Learning. Cambridge: Cambridge University Press,
43–57.
Swain, Merrill
1998 Focus on form through conscious reflection. In Doughty, Catherine
J. and Jessica Williams (eds.), Focus on Form in Classroom Second
Language Acquisition. Cambridge: Cambridge University Press, 64–81.
Swan, Michael
1994 Design criteria for pedagogic language rules. In Bygate, Martin, Alan Tonkyn,
and Eddie Williams (eds.), Grammar and the Language Teacher. New
York, NY: Prentice Hall, 45–55.
1995 Practical English Usage (2nd ed.). Oxford: Oxford University Press.
Taylor, John R.
2002 Cognitive Grammar. Oxford: Oxford University Press.
2003 Linguistic Categorization (3rd ed.). Oxford: Oxford University Press.
Tomasello, Michael
1998 Introduction: A cognitive-functional perspective on language structure. In
Tomasello, Michael (ed.), The New Psychology of Language: Cognitive and
Functional Approaches to Language Structure (Vol. 1). Mahwah, NJ: Erlbaum,
vii–xxiii.
2003 Constructing a Language: A Usage-Based Theory of Language Acquisition.
Cambridge, MA: Harvard University Press.
Tomlin, Russell and Victor Villa
1994 Attention in cognitive science and second language acquisition. Studies in
Second Language Acquisition 16, 183–203.
Towell, Richard
2002 Design of a pedagogical grammar. URL:
<http://www.lang.ltsn.ac.uk/resources/goodpractice.aspx?resourceid=410>.
Ungerer, Friedrich and Hans-Jörg Schmid
1996 An Introduction to Cognitive Linguistics. London: Longman.
VanPatten, Bill
1996 Input Processing and Grammar Instruction in Second Language Acquisition.
Norwood, NJ: Ablex.
VanPatten, Bill (ed.)
2004 Processing Instruction: Theory, Research, and Commentary. Mahwah, NJ:
Erlbaum.
Explaining intersubjectivity. A comment on
Arie Verhagen, Constructions of
Intersubjectivity
WOLFRAM HINZEN and MICHIEL VAN LAMBALGEN
1. Overview
Constructions of Intersubjectivity (CoI) is an important addition to the
growing body of work on cognitive and construction-based grammars,
which CoI links to evolutionary issues in interesting ways. CoI also
touches upon a number of fundamental (indeed philosophical) issues in
the study of linguistic communication, meaning, and human cognition; it
should be applauded for the explicitness with which it does so, using lan-
guage as a window on the mind (p. 210). A concrete vision of the evolu-
tion of language is endorsed, arising against the background of analyses
of a number of seemingly disparate and scattered linguistic data. The
book thus forms an excellent starting point to engage with foundational
assumptions entering into the theoretical framework adopted. We will
here equally embed our comments within a theoretical discussion at the
level of frameworks.
The book begins by isolating a number of seemingly unrelated "small"
grammatical puzzles, which later gain theoretical significance for certain
"big" theoretical issues. The small grammatical puzzles concern negation
(in particular the lack of functional equivalence in the use in discourse of
"not impossible" and "possible"); whether finite sentential complements in
copular constructions like "The danger is that depleted uranium is poisonous"
are subjects or predicates; and discourse connectives (e.g., concessive
conjunctions like "although"). These three construction types form the topics of
Chapters 2, 3, and 4, respectively. Chapter 5 concludes the book. We here
reverse the order of "small" and "big" and begin big, with some claims of
linguistic anthropology.
2. Anthropological and evolutionary issues
Following Verhagen, using human language is essentially a manipulative
activity: language is fundamentally a matter of regulating and assessing
Cognitive Linguistics 19–1 (2008), 107–123
DOI 10.1515/COG.2008.006
0936–5907/08/0019–0107
© Walter de Gruyter
others (9). Its use is never just informative, but always argumentative
(9–10); like animal communication systems geared at getting conspecifics
to act in ways beneficial to the communicator (8), language is about getting
things done rather than the disinterested representation of the world.
Any such similarity between human language and non-human communication
systems would be a welcome result, as it reduces the apparent gap
separating human and animal language. That said, while human language
can be used to manipulate others and get them to behave as one desires,
very often it is not so used, highlighting a crucial dissimilarity between
human language and animal communication systems: we may as
well use language to freely express our thoughts or ponder and assert the
truth of something, without necessarily expecting particular functional
benefits to ensue from that. Unlike in non-human communicating species,
there is no apparent cause or functional pressure for our deliberate decisions
to assert what we do, and much functional pressure is needed to
prevent them. Nor are we restricted in what we choose to refer to, assert,
or communicate. By contrast, non-human animals, stuck as they by and
large are in the immediate here and now, only have a small number of
non-voluntary vocalizations at their disposal, all intrinsically linked to an
immediate adaptive purpose. No doubt human language use would seem
somewhat pathological if all we said served some instrumental purpose
and were intrinsically linked to a certain response we wish to achieve.
Interestingly, the descriptive and assertoric aspect of language is inescapable
even where language is used manipulatively, as in paying compliments to
a lady, where unavoidably we are making a descriptive claim too ("what a
beautiful perfume!").
The denial that human language exhibits the very features that romanticists
like Schlegel, Herder, or Humboldt claimed to be so distinctive of
it (its use for the free and creative expression of thought) also has an
intellectual heritage we should be aware of. Assimilating human language
to non-human animal communication systems was part and parcel of
B. F. Skinner's (1957) vision of language. Skinner flatly denied that language
is used for purposes of reference, representation, or the assertion of truth,
arguing instead that it is an instrument serving purposes of the control of
behavior. In CoI, too, we read that language evolved as a mechanism
producing pressure favoring long-term predictability of behavior (14).
CoI does not support a Skinnerian psychology, to be sure; nor does it
claim that all language use is a function of strategic interaction. Yet it is
not entirely clear how far removed its foundational claims about language
are from Skinnerian views of language as an instrument of control.
We think it is an obvious fact that language is used as an instrument of
control. Our point is merely that (i) the opposite is equally true, (ii) not
so using it is actually a hallmark of human language that should be cen-
tral in any account of its evolution.
We suggest that, more generally, the assimilation of human to
non-human language on the basis of ascriptions of an evolutionary function
to language, such as communication, will not lead to much insight into
linguistic structure and its special character. To begin with, function
ascriptions to whole, complex systems such as language do not typically
transfer to the parts from which such systems are assembled: these will
typically have independent evolutionary trajectories, unrelated to the
function for which they are later employed when entering the system
of language. To whatever extent cognitive mechanisms entering language
are used in non-humans, and have non-communicative functions there,
language will not be rationalizable by looking at it as a communication
system. Nor will the study of non-linguistic animal communication unlock
the secret of what makes language special. If there is anything special
to the human communication system, it is that it is a linguistic one, which
means that its being a communication system cannot possibly be what as
such explains its special features. The study of communication systems
(Hauser 1996) does not tell us much about the special properties of
human language, such as its structural and computational aspects, or the
fact of its intentional and creative use.
Again, none of this means that the study of the communicative use of
language will not let us see many interesting facts about language. CoI
succeeds rather remarkably in unearthing such facts. This book's fundamental
theoretical commitment, however, is deeper: that social and cultural
cognition alone is the key to the understanding of language. The
most basic explanatory notion in Verhagen's framework, used extensively
throughout the book, is taken from Tomasello (e.g., 1999, 2003): the
human ability to take others' perspectives (2), understand what they attend
to, and share their intentions. On Verhagen's view this complex of
mental reasoning abilities is the prime biological factor distinguishing us
from other primates. Let a primate interacting manipulatively with others
understand itself as an intentional agent, and have him ascribe intentional
life to other agents as well; have him want to share beliefs and identify
with the intentional mental life of others; then culture becomes possible,
with its own special mechanisms of inheritance, since humans can now
learn from others as opposed to merely from their own interactions with
a non-human environment. With this, language is on its way, if not given,
Verhagen suggests. For language simply is a system of conventions (of
symbols and ways of using them) that solve a cognitive coordination
problem. It is culturally transferred (3); and thus there is no biological
adaptation specific to language needed. In sum, starting from the one
basic notion of taking another's perspective, language evolves for the
coordination and managing of multiple such perspectives in discourse.
3. Testing a hypothesis
What would be evidence for the correctness of such a view? What we
need is independent empirical evidence that human language is optimized
to some significant extent for the coordination task envisaged. A good degree
of optimization is what testing any functionalist hypothesis in biology
ogy requires. In short, the hypothesis should be a particularly good
source for predictions of mechanisms that we can then empirically attest.
But note that even if this proves possible, the functional rationale of the
mechanisms in question will not be their cause or origin. An independent
story about the mechanisms will have to be told, as a mere hypothesis
about functions will leave the question of origin (proximate causes) open.
Recognizing the need for validation above, Verhagen asserts that we
must be able to see repercussions for the content that is systematically
coded in linguistic symbols of the capacity of understanding others as
like oneself; in short, read off the semantics of basic linguistic units from
their ways of handling perspectives:
[I]f coordinating cognitively with others is so basic a component of human
practices, then we should see it reflected in more than one area of grammar [...]
connecting, differentiating and tailoring the contents of points of view with respect
to each other (rather than organizing a connection to the world) is essential for
understanding their semantics [...] (p. 4)
Here we note a potentially wrong opposition, to which we will return sev-
eral times: even granted that, generally, coordinating cognitively with
others is basic to human cognition, and this general principle of cognition
is also instantiated in grammar, we do not see that there somehow exists
an opposition between coordinating cognitively and organizing a
connection to the world, which entails that semantics cannot be under-
stood as serving both functions simultaneously. We contend, in line with
our anthropological claims above, that it can and does.
1
We also note that there are potentially two different aspects of
language that we might want to explain by appeal to their discourse
function: sentence-internal organization on the one hand, and discourse
phenomena transcending the sentence-boundary on the other. By
sentence-internal organization we mean the structure of the clause, the or-
ganization of phrases and their dependents, and syntactic mechanisms
like complementation. Sentential connectives and discourse conjunctions
fall into the class of discourse phenomena. The complementation construction
in (1) illustrates the first one; (2) is an example of an
inter-sentential or discourse phenomenon:
(1) George saw/knew/said that his opponent was closing in.
(2) Max fell. John pushed him.
Clearly, it is discourse phenomena that we expect a discourse-based per-
spective to elucidate best. It is much less clear that such a perspective
would illuminate sentence-internal and syntactic organization. Verhagen's
striking claim is that a perspective departing from discourse and
cognitive coordination shows us that both central semantic and syntactic
analyses of particular linguistic constructions have been mistaken.
Now, in (2), the two sentences are obviously semantically connected
(even though they are not parts of one another, in a phrase-structural
sense, as in the construction (1)). To understand (2), it has to be inferred
that the event order is the reverse of the sentence order, and it is only by
applying causal knowledge (pushing can be a cause of falling) and a gen-
eral inference principle (no other possible cause is mentioned, whence one
must assume pushing is the only operant cause) that the listener can con-
struct the corresponding event structure. The speaker need not supply ex-
plicit information about the intended event order since he knows that the
listener is able to compute this herself. This is in fact a general fact about
discourse production and understanding: it is both impossible and unde-
sirable to supply all relevant information in linguistic form and both
speaker and listener therefore appeal to general principles in computing
that information from the linguistic material given. So the principles driving
the understanding in cases like (2) are not specifically linguistic ones:
they are more generally cognitive, logical, or inferential ones (examples
will be seen below). Again, we expect this to be different in (1), where we
meet a hypotactic construction missing in (2), in which, at least on a stan-
dard syntactic analysis, that his opponent was closing in is the internal ar-
gument of saw (see below for more on this structural claim). It is therefore
more plausible that cognitive coordination in discourse could potentially
tell us much about (2), but little about (1).
4. Negation and discourse connections
Let us see whether this is so and begin with the observation that clearly the
discourse in (2) is about the world, and involves a large amount of cognitive
coordination, exemplifying language's potential, insisted on above, to
serve both of these functions simultaneously. The difference between the
general fact about discourse understanding just noted and Verhagen's
claims is that he argues for the existence of specific grammatical constructions
whose purpose would lie precisely in cognitive coordination, and
whose semantics would not be explainable otherwise (or on more traditional
semantic assumptions). Verhagen considers that negation is an instance
of such a construction, and this is the topic of Chapter 2, to which
we now turn.
We will give a slightly more formal treatment of Verhagen's examples
in CoI, to see whether negation can indeed be used in building a case
against semantics as organizing a connection to the world. We first summarise
Verhagen's take on negation, with page numbers to where Verhagen
states his views:
a. the primary function of negation is intersubjective cognitive coordi-
nation (42, bottom of page)
b. the relation between language and the world is only secondary (42,
bottom of page)
2
c. negation is concerned with the relation between distinct mental
spaces of participants in discourse (57)
d. more specifically, the speaker uses negation to instruct the addressee
to entertain two distinct mental spaces, one of which has to be re-
jected (42, bottom of page)
e. these mental spaces may incorporate topoi, collections of culturally
determined default rules (58).
Now consider the following three example discourses:
(3) A. Do you think our son will pass his courses this term?
B. Well, he passed them in the autumn term.
(4a) A. Do you think our son will pass his courses this term?
B-a. Well, he did not pass his first statistics course.
(4b) A. Do you think our son will pass his courses this term?
B-b. Well, he barely passed his first statistics course.
The general principle behind understanding such exchanges is that, instead
of giving a direct answer, B invites addressee A to activate a defeasible
rule in her semantic memory (cf. the topoi mentioned under e.
above) and to perform an inference based on the rule and the information
supplied by B. Thus in example (3), A must retrieve a defeasible rule of
the type "normally, if a student passes his exams in term n, then also in
term n + 1", and apply modus ponens using B's observation about the autumn
term. Things get really interesting in example (4a). Here B invites A
to activate a defeasible rule like "normally, if a student passes his first
statistics course, he can pass other courses as well" and apply an inference
using his utterance B-a. That the rule is defeasible can be seen from the
possible continuation of (4a) in (4a*):
(4a*) A. Do you think our son will pass his courses this term?
B-a. Well, he did not pass his first statistics course.
A. But he got a very good grade for the astrophysics course!
A more formal analysis of these examples goes as follows.
3
A defeasible
rule is an implication of the form "if P and nothing exceptional is the case,
then Q". Here P can be the proposition "a student passes his first statistics
course", and Q the proposition "he can pass other courses as well". Using
this representation, one may disentangle the coordinating and
world-relating functions of "not". First off, sentence B-a has a dual function: it
states a fact and it triggers an inference process that allows A to deduce
B's opinion on the relevant issue. Sentence B-a can have this dual function
because the inference process that it triggers has certain universal features
which are common knowledge of A and B. Namely, the inference is
a form of closed world reasoning, a form of logical reasoning which is different
from classical logic but which is applied all the time in discourse
understanding (see van Lambalgen and Hamm 2004). The logical principle
invoked here is: assume all propositions are false which you have no
reason to assume to be true. One can make sense of (4a) by invoking this
principle twice. First, the defeasible rule, written fully as "if a student
passes his first statistics course and nothing exceptional is the case, he
can pass other courses as well", is reduced to "if a student passes his first
statistics course, he can pass other courses as well", because no information
about exceptions is supplied in the discourse. Secondly, no other sufficient
conditions for passing the other courses are given, so that the rule
is actually an equivalence, and utterance B-a can be used to derive the
intended conclusion "he will not pass all his courses this term". Note that
without invoking closed world reasoning, the inference that B implicitly
appeals to in (4a) is the classically invalid denial of the antecedent. In
(4b) the suggestion is that if a student barely passes a statistics course,
then one actually has an exceptional circumstance. Therefore the previous
reduction of the defeasible rule no longer applies, and the inference using
utterance B-a fails.
The defeasible character of the inferences involved is brought home
further by the discourse (4a*), where the function of the utterance But
he got a very good grade for the astrophysics course! is precisely to high-
light a second defeasible rule: if a student passes an astrophysics course
and nothing exceptional is the case, he can pass other courses as well. In
this case the second application of closed world reasoning fails, thus ren-
dering invalid the conclusion previously drawn. The circumstance that
conclusions from logical arguments may have to be withdrawn when new
information comes in, may have reinforced the impression that organiz-
ing the connection to the world is of minor importance in language use.
But in actual fact, these discourses are all about one's best guesses about
the state of the world. We conclude, then, that at least with respect to
sentential negation, the general framework of non-monotonic logic elegantly
captures the data in question, and the uniqueness implied in Verhagen's
claims about the need for a functional explanation is without support.
Note that non-monotonic logic is not intrinsically a framework for reasoning
in an intersubjective context at all: we find the same principles of
reasoning in other cognitive domains such as planning, hence their ratio-
nale is not purely in cognitive coordination, leading to further doubts
about the foundational assumptions used.
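The two applications of closed world reasoning in (4a), and the withdrawal of the conclusion in (4a*), can be illustrated with a minimal sketch. This is only a toy rendering under our own assumptions, not the formal system of van Lambalgen and Hamm (2004): the function name, the atom names, and the encoding of rules as triples are all hypothetical, and abnormality atoms are assumed to occur only as discourse facts, never as rule heads.

```python
def closed_world_model(facts, rules):
    """Return the set of atoms taken to be true.

    facts: atoms asserted in the discourse.
    rules: (head, positive_body, negated_body) triples, read as
           "if all positive atoms hold and no negated atom holds, then head".
    Anything that cannot be derived is assumed false (closed world).
    Assumption: negated atoms (abnormalities) are never rule heads.
    """
    true_atoms = set(facts)
    changed = True
    while changed:
        changed = False
        for head, positive, negated in rules:
            if head in true_atoms:
                continue
            body_holds = all(atom in true_atoms for atom in positive)
            no_exception = not any(atom in true_atoms for atom in negated)
            if body_holds and no_exception:
                true_atoms.add(head)
                changed = True
    return true_atoms


# Hypothetical atom names for the two defeasible rules in the text.
STATS_RULE = ("can_pass_other_courses", ["passed_statistics"], ["ab_statistics"])
ASTRO_RULE = ("can_pass_other_courses", ["good_grade_astrophysics"], ["ab_astrophysics"])

# (4a): B says the son did not pass statistics, so that atom is not a fact.
# No exception is mentioned, and no other sufficient condition is available.
model_4a = closed_world_model(facts=set(), rules=[STATS_RULE])
concludes_4a = "can_pass_other_courses" in model_4a

# (4a*): A adds the astrophysics grade, which activates a second rule.
model_4a_star = closed_world_model(
    facts={"good_grade_astrophysics"}, rules=[STATS_RULE, ASTRO_RULE]
)
concludes_4a_star = "can_pass_other_courses" in model_4a_star

print(concludes_4a, concludes_4a_star)
```

In the first call the consequent is not derivable and is therefore taken to be false; in the second, the added information makes it derivable again. That reversal is the non-monotonic behaviour discussed above, and nothing in the sketch refers to mental spaces or to a second discourse participant.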
Another example in the same vein is taken from Chapter 4, on discourse
connections. Consider Verhagen's discussion of "although" and "but"
on pp. 167–174. He mentions the following general explication of the
meaning of "although" (167): "p although q" means: (a) truth conditions: p
& q; (b) presupposition: q implies not-p. Here presupposition means that
if "q implies not-p" is not yet present in the discourse, it must be introduced
(presupposition accommodation). Verhagen correctly notes that if
"q implies not-p" is formalised as the material implication of classical
logic, (a) and (b) are in immediate contradiction, and then after some discussion
draws the following moral: "What is especially important to avoid
the derivation of contradictions, even if the defeasibility of generalizations
is recognized, is that a background mental space, distinct from that
of the speaker/writer, is invoked in which the shared topos is construed
as a basis for a causal inference" (168).
A formalisation in non-monotonic logic again shows that we can remain
agnostic about the necessity (and precise form) of mental space
representations. We shall provide representations for "although" and "but"
using the defeasible conditionals introduced above. These feature a conjunct
"nothing exceptional is the case", which we shall formalize here as
not-ab (where ab is a proposition letter indicating some abnormality):
"p although q" means: (a) truth conditions: p & q; (b) presupposition: q
& not-ab implies not-p.
"p but q" means: (a) truth conditions: p & q; (b) presupposition: p &
not-ab implies not-q.
In both cases (a) and (b) are consistent, and jointly entail the derivation
of an abnormality. Thus, if someone utters "p although q", he contributes
a variable for an abnormality to the discourse, which can be unified with
a concrete circumstance. E.g., "He failed his exam, although he worked
very hard. He was sick on the day of the exam." The second sentence is
read as an instantiation of the abnormality pointed at by the first sentence.
No special machinery for mental spaces needs to be adopted; it suffices
to appeal to general principles for discourse coherence such as the
introduction of variables to be unified with linguistic material.
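The consistency claims for "although" and "but" can be checked by brute force over truth assignments. The following is a small sketch under our own assumptions (the helper names are hypothetical, and the presupposition is treated simply as a classical formula over p, q and ab); it is meant only to make the propositional point explicit.

```python
from itertools import product


def implies(antecedent, consequent):
    """Material implication of classical logic."""
    return (not antecedent) or consequent


def models(constraints):
    """All assignments to (p, q, ab) that satisfy every constraint."""
    return [
        {"p": p, "q": q, "ab": ab}
        for p, q, ab in product([False, True], repeat=3)
        if all(constraint(p, q, ab) for constraint in constraints)
    ]


def truth_conditions(p, q, ab):
    # (a) p & q, shared by "although" and "but"
    return p and q


def classical_presupposition(p, q, ab):
    # (b) on the classical reading: q implies not-p
    return implies(q, not p)


def although_presupposition(p, q, ab):
    # (b) for "p although q": q & not-ab implies not-p
    return implies(q and not ab, not p)


def but_presupposition(p, q, ab):
    # (b) for "p but q": p & not-ab implies not-q
    return implies(p and not ab, not q)


classical_models = models([truth_conditions, classical_presupposition])
although_models = models([truth_conditions, although_presupposition])
but_models = models([truth_conditions, but_presupposition])

print(classical_models)
print(although_models)
print(but_models)
```

On the classical reading no assignment satisfies both (a) and (b), which is the contradiction Verhagen notes. With the abnormality conjunct, exactly one assignment survives in each case, and in it ab is true: the two clauses are consistent and jointly force an abnormality, which the following discourse can then instantiate.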
5. Cognitive significance
Before we continue with our discussion of linguistic matters and return to
the issue of sentential complementation in the next section, there is a
methodological point we want to raise: the use of formal representations
in cognitive linguistics, especially Verhagen's use of Fauconnier's theory
of mental spaces in explaining the function of negation. We presented a
formal analysis of Verhagen's examples involving negation in
non-monotonic logic, without first explaining Verhagen's own mental space
analysis. We did so because we have severe doubts as to the adequacy of
such analyses in a cognitive context. We fully agree that the most produc-
tive way to do linguistics is to relate it to human cognition as a whole.
But what makes a particular piece of linguistic analysis also cognitive?
Let us pause to consider this important question in some detail. At the
outset of modern linguistics in the 1950s a demand was imposed on
theories of linguistic competence according to which such theories should
be explicit. That is, they should not rely on badly understood and
question-begging notions such as "understanding", "intending", or "grasping
the meaning". In practice, explicitness meant giving such psychological
processes a computational or algorithmic description.
4
Adopting
this methodological decision, a given semantic analysis of a natural language
should employ representations that have well-defined formation
rules, and the mapping between syntactic and semantic representations
should be computationally transparent.
Note that a purely semantic analysis of a linguistic phenomenon can as
such be considered to be successful if it gets the truth conditions of sen-
tences and entailments between sentences in context right. Here, one
does not put any demands upon the semantic representations used except
that one can meaningfully speak of entailments between them. Although
this demand is by no means trivial, it does not yet suffice for explanatory
significance in the context of a study of human cognition. We do not wish
to imply that only pointing at a neural substrate suffices for a demonstration
of cognitive reality. Clearly, a given linguistic analysis can stand on its
own feet and does not need validation from neuroscience.
5
Yet, the con-
cepts and entities used in abstract syntactic and semantic representations
must at least not be in conflict with known constraints on the processing
of these structures or their storage in long-term and working memory, for
example.
6
The simple point we want to make here is that this integration of fields
of inquiry operating at different levels of abstraction (i.e., linguistic and
neurological) depends on the explicitness of the computational descriptions
involved. In particular, semantic representations need to be mathematically
definite enough to be used in algorithms. We have strong doubts
that this desideratum is met by Fauconnier's theory. The analysis of negation
presented above in terms of non-monotonic logic goes some way toward
fulfilling these desiderata, since, as is shown in Stenning and van
Lambalgen (2008), the proposed system has considerable cognitive
significance, including an appealing neural implementation.
6. The complementation construction
Let us now return to sentential complementation constructions such as
(1). Verhagen's suggestion (Chapter 3) is that sentential complementation
is a special purpose construction that, again, intrinsically serves a
coordination aim. Verhagen claims that (1), repeated here as (5), is
fundamentally different in structure from a construction like (6):
(5) George knew/saw/said that his opponent was closing in.
(6) George knew/saw/said something.
That is, it is wrong to construe (5) as a transitive construction on the
basis of a mere analogy with (6). In particular, he argues that the embedded
clause in (5) is not a syntactic constituent or verbal argument (p. 83).
Rather, (5) is a construction in its own right, a holistic template with
irreducible sound and meaning properties (p. 79) that doesn't follow from
any general phrase-structural rules.
However, no structural analysis of the sentences in question is actually
provided in this chapter, and no definition of what it would be for the
that-clause to be a constituent is provided. Clearly, a structural analysis
is not ipso facto provided once certain functional claims are made: the
mechanisms underlying certain functions are a logically independent
issue. But standard tests for constituency suggest that we can question
the that-clause, as in (7), or elide it, as in (8):
(7) George saw/knew/said what?
(8) George saw/knew/said that his opponent was closing in, and Bill
saw/knew/said so too.
Verhagen's conclusion by contrast is rather exclusively derived from
claims about differences in discourse functions of (7) and (8), which we
claim is a logical error and fails to provide any independent evidence for
the functions used to an explanatory purpose.
In addition, a wrong opposition arises again. Let it be true that (5)
indicates a perspective in the matrix clause, and that a thought is being
perspectivized in the embedded one, as Verhagen argues. This observation
appears fully consistent with the that-clause in (5) being a constituent
that is the complement of the matrix verb. To the extent that there is a
difference between (5) and (6) in the functional respects just noted
(although that difference is not obvious to us), it can follow
compositionally from the difference in the two complements of the matrix
verb, which after all differ in syntactic category and Case. Again,
independent evidence is needed for a difference in structure between (5)
and (6): evidence not simply predicated on the functionalist hypothesis made.
Contrary to the claims made in this chapter, a standard generative
constituent structure analysis of (5) would not proceed merely from an
intuited analogy or relatedness between (5) and (6) (as stated on p.
87). It would also not proceed by a top-down analysis (p. 82). On the
contrary, it would build such a structure from the bottom upwards, be-
ginning with the minimal assumption that saw and the CP in question
must be somehow merged with one another, giving rise to a structure of
the general form [X Y]. Assuming in addition to that minimal require-
ment that in human language, phrases are headed, one of X and Y will
have to be the head, H, which thus projects, with Y becoming its com-
plement or internal argument. The result is then as a whole predicated of
an external argument, Z (i.e., George). In this way we derive that the
common underlying structure of (5) and (6) is indeed [Z [X [Y]]], an anal-
ysis making the rather minimal assumptions that:
(i) human language is combinatorial (there is a recursive operation
merging constituents),
(ii) the organization of expressions is hierarchical (it contains phrases
over and above lexical items),
(iii) phrases are headed (Merge(X,Y) is of type X or else type Y), and
(iv) branching is binary (Merge takes two arguments).
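The four assumptions above can be illustrated with a small sketch. It is only an illustration under stated assumptions, not the analysis of any particular framework: the names Lex, Phrase, merge, bracket and shape are hypothetical, the category labels (D, V, C) are chosen for convenience, and the that-clause is treated as an unanalysed atom.

```python
from __future__ import annotations
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Lex:
    """A lexical item; the category labels used below are illustrative."""
    form: str
    cat: str


@dataclass(frozen=True)
class Phrase:
    """Output of Merge: exactly two daughters (iv), one of which projects (iii)."""
    left: Node
    right: Node
    head_left: bool

    @property
    def cat(self) -> str:
        # Headedness: the phrase inherits the category of its head.
        head = self.left if self.head_left else self.right
        return head.cat


Node = Union[Lex, Phrase]


def merge(left: Node, right: Node, head_left: bool) -> Phrase:
    """Binary, recursive combination (i): the output can itself be merged again."""
    return Phrase(left, right, head_left)


def bracket(node: Node) -> str:
    """Bracketed string showing the hierarchical organization (ii)."""
    if isinstance(node, Lex):
        return node.form
    return f"[{bracket(node.left)} {bracket(node.right)}]"


def shape(node: Node) -> str:
    """Same bracketing with lexical material blanked out, to compare skeletons."""
    if isinstance(node, Lex):
        return "_"
    return f"[{shape(node.left)} {shape(node.right)}]"


george = Lex("George", "D")
saw = Lex("saw", "V")
that_clause = Lex("that his opponent was closing in", "C")  # CP kept atomic here
something = Lex("something", "D")

# The verb merges with its complement and projects; the result is then
# merged with the external argument.
s5 = merge(george, merge(saw, that_clause, head_left=True), head_left=False)
s6 = merge(george, merge(saw, something, head_left=True), head_left=False)

print(bracket(s5))
print(bracket(s6))
print(shape(s5) == shape(s6))
```

In this sketch merge never inspects whether its arguments are clausal or nominal, so the structures built for (5) and (6) share one skeleton, corresponding to the scheme [Z [X [Y]]], while still differing in the category of the complement.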
This analysis moreover does not automatically assume the possibility of
generalizing over clausal and nominal structures: it does not refer to any
such constructions, which are not even visible for a minimal analysis that
appeals to abstract notions such as head, complement, internal argument,
and external argument, alone. So it also does not predict that in all
contexts nominal arguments can be inserted where the putative clausal
arguments can be, which is the prediction that Verhagen (pp. 83–85)
provides evidence against.
It is neither clear to us why double object constructions like "They
warned us that the profit would turn out lower" would support Verhagen's
viewpoint (see p. 86), nor why inversely linked predications of the type
in (7) and (8) do:
(7) [The danger] is [that the middle class feels alienated].
(8) [That the middle class feels alienated] is [the danger].
We here briefly discuss only the latter case. The problem posed by Verhagen
is that more than a hundred years of analysis could not settle whether
the that-clause in (7)–(8) is a subject or predicate. But perhaps this is a
wrong dilemma. It may precisely be a feature of these constructions that
they are organized around a symmetrical predicational relation between
two XPs in a Small Clause (SC) as in (9), in a way that either of them
can raise to a sentence-subject position in front of the auxiliary, resulting
in either (10) or (11) (see Moro 2000):
(9) [SC [CP that . . . alienated] [DP the danger]]
(10) SUBJECT [BE [Small Clause [The danger] [that the middle class feels alienated]]]
(11) SUBJECT [BE [Small Clause [The danger] [that the middle class feels alienated]]]
[Arrows in the original diagrams, marking the phrase that raises to SUBJECT, are not reproduced.]
Neither the CP nor the DP in (9) is the head in the Small Clause (or
projects), which explains their symmetry, and potentially the fact that
either of them can raise out of the Small Clause.
7. Constructions as such
Above we appealed to a minimal computational machinery in terms of binary
Merge, which led us to the scheme [Z [X [Y]]]. An argument for using
a minimal phrase structural analysis generated by a recursive operation
Merge is that we need some account of the recursive machinery of language
(unless recursivity is denied, which it is not in the present volume).
If one assumes a minimal conception to account for recursive structure
building (Merge on its current minimalist construal is such a candidate,
see Hinzen 2006), the question whether there is a complementation
construction and whether or not it is identical to a direct object
construction (p. 86) cannot even be formulated. Merge is too primitive to be
sensitive to such categorial distinctions, giving us a much simpler vision of
the linguistic system's basic computations. The question is whether this is
a bad or a good result.
The claimed achievement of the Principles and Parameters framework,
incorporated into Minimalism, was that constructions as we can perceive
them in languages at a descriptive level can be shown to follow from
more abstract generative principles which are neither language-specific
nor construction-specific. Thus, what we called the complementation
construction above is simply the overt consequence of Merge plus the
fact that some heads subcategorize for an object that is semantically a
proposition. This, if feasible, is a desirable view, we contend, because the
abstract generative principles in question, if indeed minimal, have to be
part of anyone's account; and because having constructions as merely
the overt result of deeper, fewer, and more abstract structure-building
operations is both explanatorily beneficial and in no conflict with the fact
that they take up distinctive discourse functions when used. From an
evolutionary viewpoint, too, a minimal and construction-free grammar (that
remains descriptively adequate) should be welcomed: it allows one to
accomplish more (a great variety of linguistic constructions) with less
(minimal structuring principles cutting across constructions), which is
arguably in line with general principles of economy and conservativity in
biological evolution.
8. Perspective-taking
As noted, Verhagen doesn't deny recursion, but places it outside language,
in perspective-taking, which as such, he argues, is inherently recursive
(p. 98). That sentential complementation constructions are paradigmatically
recursive is on his view only a sign of the fact that they are the
grammaticalization of this basic human cognitive capacity. The problem
with this account however is that to our knowledge there is no evidence
for recursive perspective-taking outside human language; ipso facto we
cannot invoke perspective-taking to explain language, and the direction
of explanation might precisely have to be reversed, unless there is a
common cause of both. Furthermore, taking a perspective on something
can, although it need not, involve what philosophers traditionally have
called a propositional attitude. It need not, since it has been observed in
false belief tasks that while a child may take the wrong perspective
(namely its own) in propositional terms, it takes the right perspective in
behavioural terms, e.g., by looking at the right spot (Clements and Perner
1994). That is, the notion of perspective as such is consistent with both
propositional and non-propositional mental representations; it doesn't
explain why it should be the case that we take propositional perspectives or
why such forms of thought exist. Although there are some claims for
propositionality in non-humans (Seyfarth 2006), there are also strong
ones against it (Terrace 2005), and the notion of propositionality invoked
in the former claims is too broad to illuminate the specifics of human
clause structure and the propositional meanings that sentential
constructions have. There is also evidence that the understanding of
sentential complementation is actually itself an instrumental causal factor
in the genesis of mind-reading and how the child forms explicit propositional
representations of false beliefs, a task that is not mastered before
sentential complementation itself is (De Villiers 2005). All of this
indicates that Verhagen's bold attempt to explain language from social
cognition may well, at least partially, have the cart before the horse.⁷
9. Meaning
We close with a general observation on the philosophy of meaning assumed
in CoI. If the meaning of linguistic expressions is inherently and
necessarily linked to their discourse purpose, we face consequences such
as that an assertion of "There are seats in this room" implies a
presupposition having to do with the seats being comfortable, as Verhagen
asserts (15). But obviously, there can be assertions about seats in rooms
where these seats fail to be comfortable. Hence, the implicature is a mere
contextual one, and ipso facto not an inherent (non-contextual) aspect of the
expression in question. Is the claim the radical one that there are no such
inherent aspects of the meaning of an expression at all? If it isn't, a
non-contextual notion of linguistic meaning as determined by linguistic form
needs to be preserved on which the compositional process of meaning
determination would be based. If it is, that would entail giving up the
compositionality of meaning, which depends on the availability of a
context-independent notion of meaning that is determined by the syntactic
part-whole structure of the expression in question (see Fodor and
Lepore 2002). We may be wary of giving up this widely endorsed constraint,
as it seems needed to explain the forms of recursivity that language
exhibits. Note that to whatever extent we endorse compositionality
as a principle for the generation of meaning, meaning will not be
conventional: meaning will follow by necessity from algebraic laws of phrasal
composition, in much the way that 5 follows from composing 2 and 3 by
means of the operation +.
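The arithmetic analogy can be made concrete with a minimal sketch. It assumes a toy language in which the values of the atomic symbols and of the operation are simply stipulated; the names LEXICON, OPERATIONS and meaning are hypothetical and serve only to illustrate the principle that the value of a complex expression is fixed by its parts and their mode of combination.

```python
import operator

# Stipulated values of the atomic symbols (illustrative only).
LEXICON = {"2": 2, "3": 3}

# Stipulated value of the mode of combination.
OPERATIONS = {"+": operator.add}


def meaning(expr):
    """Compute the value of an expression from its parts and how they combine."""
    if isinstance(expr, str):
        # Atomic symbol: look its value up.
        return LEXICON[expr]
    # Complex expression: a (left, operation, right) triple, possibly nested.
    left, op, right = expr
    return OPERATIONS[op](meaning(left), meaning(right))


result = meaning(("2", "+", "3"))
nested = meaning((("2", "+", "3"), "+", "2"))
print(result, nested)
```

Once the values of the parts are given, nothing further needs to be stated for the complex expressions, including nested ones: their values follow from the part-whole structure alone.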
Note, also, that if the meaning of a sentence is spelled out by appeal to
its argumentative consequences, it will be the case that there is nothing to
rationally explain why we endorse the inferences we do. If we want to
justify moving from A&B to A, say (or claim classical validity for this
move), part of what we will appeal to is the meaning of & (and our
grasp of that meaning). We couldn't justify the classical rule of
conjunction elimination, say, by the existence of a causal mechanism carrying
us from premise to conclusion, or the desirability of the result, or the
force of a drug that we take. By consequence, an independent notion of
meaning is needed, even if an argumentation-oriented perspective is adopted,
and meaning can't consist in argumentative consequences alone.
10. Conclusions
Summarizing our main claims, we believe that while the data that CoI
unearths are rich and certainly need explanation, they have an explanation
in more traditional formal semantic or syntactic frameworks which are
implicitly rejected in CoI. In short, the data do not support either the
analyses provided or the foundational assumptions about language endorsed.
Again, we see no conflict between older representational or disinterested
perspectives on the use of language, and observations on the
discursive functions that linguistic expressions may serve. We also see a
danger in one-sided perspectives on language that leave out some of its
distinctive features. Coordination in discourse and manipulative
communication are very clearly vital functions of language, and taking this
as our starting point many important phenomena of language may come to
the surface: we fully concur with Verhagen on this issue. But their
explanation will be another question.
Received 31 January 2007 Durham University
Revision received 21 March 2007 University of Amsterdam
Notes
1. Figure 1.2 on p. 7, as one referee notes, may suggest that Verhagen recognizes both
factors. But the claim made is that special foundational significance attaches to the former
function and that negation and complementation illustrate this, and we dispute this.
2. Since the first two points are important in what follows, it is worthwhile to quote
Verhagen directly: "[T]he linguistically most relevant properties of negation, the ones that it
shares with other elements in the same paradigmatic class, are purely cognitive
operations" (p. 57).
3. Here we follow the analysis of defeasible conditionals given in Stenning and van
Lambalgen (2006). The interested reader is referred to this paper for a fully formal treatment
of phenomena related to the ones discussed here.
4. "Algorithmic" is taken in a wide sense here, and also includes computations in neural
networks.
5. Empirical linguistic arguments for a universal argument-adjunct distinction, for
example, are not empirically invalid if we can't link or translate the primitives used in the
analysis to primitives of a neurobiological description.
6. Together with constraints flowing in this particular direction (Dąbrowska 2004), it is an
equally reasonable proposal at this point that linguistics may and should impose
constraints on neuroscience. That is, explicit linguistic proposals for computational
processes underlying language should be the basis for evaluations of (and predictions for)
neuroscientific experimentation (see e.g., Stockall and Marantz 2006; Poeppel and
Embick 2005; for such a perspective for the case of syntax, and Baggio and van Lambalgen
2007 for the case of semantics).
7. One referee claims that it is no objection to Verhagen that there is no evidence for
recursive perspective-taking outside human language, since Verhagen precisely claims
that perspective taking is what makes humans differ from other animals. The point
however is whether it explains language, and recursion therein. For this it needs to
have the relevant formal properties (propositionality, recursivity) independently of
language.
References
Baggio, Giosuè and Michiel van Lambalgen
2007 The processing consequences of the imperfective paradox. Journal of
Semantics 24, 307–330.
Clements, W. A., and Josef Perner
1994 Implicit understanding of belief, Cognitive Development 9, 377–395.
Dąbrowska, Ewa
2004 Language, Mind and Brain. Some Psychological and Neurological
Constraints on Theories of Grammar. Edinburgh University Press.
De Villiers, Jill
2005 Can language acquisition give children a point of view? In Astington, J. W.
and J. A. Baird (eds.): Why Language Matters for Theory of Mind, Oxford
University Press, 186–219.
Fodor, Jerry, and Ernie Lepore
2002 The Compositionality Papers. Oxford: Oxford University Press.
Hauser, Mark D.
1996 The Evolution of Communication, Cambridge, MA: MIT Press.
Hinzen, Wolfram
2006 Mind Design and Minimal Syntax, Oxford: Oxford University Press.
Moro, Andrea
2000 Dynamic Antisymmetry, Cambridge, MA: MIT Press.
Poeppel, David and David Embick
2005 The relation between linguistics and neuroscience. In Cutler, A. (ed.),
Twenty-First Century Psycholinguistics: Four Cornerstones. Lawrence
Erlbaum.
Seyfarth, Robert
2005 Primate social cognition and the origins of language, Trends in Cognitive
Sciences 9, 264–266.
Skinner, B. F.
1957 Verbal Behavior. New York: Appleton-Century-Crofts.
Stenning, Keith and Michiel van Lambalgen
2006 Semantic interpretation as computation in nonmonotonic logic, Cognitive
Science 29 (2006), 919–960.
2008 Human Reasoning and Cognitive Science. Cambridge: MIT Press.
Stockall, Linnea, and Alec Marantz
2006 A single route, full decomposition model of morphological complexity:
MEG evidence, The Mental Lexicon 1:1.
Terrace, Herbert
2005 Metacognition and the evolution of language, in Terrace, H. and Metcalfe
(eds.), The Missing Link in Cognition. Oxford University Press, 84–115.
Tomasello, Michael
1999 The Cultural Origins of Human Cognition. Cambridge: Harvard University
Press.
2003 Constructing a Language: A Usage-Based Theory of Language Acquisition.
Cambridge: Harvard University Press.
van Lambalgen, Michiel, and Fritz Hamm
2004 The Proper Treatment of Events, Blackwell.
Verhagen, Arie
2005 Constructions of Intersubjectivity. Oxford: Oxford University Press.
Intersubjectivity and explanation
in linguistics: A reply to Hinzen
and van Lambalgen
ARIE VERHAGEN*
1. Introduction
Let me start by saying that I very much appreciate both the effort that
Hinzen and Van Lambalgen (hereafter, H&L) have put into commenting
on Constructions of Intersubjectivity (hereafter, CoI), and their comments
as such. It is important for all cognitive disciplines studying language that
representatives from different schools of thought try to address each
other's work, in terms of both results and foundations. We may not reach
agreement as a result of a discussion, but it will still be helpful in
clarifying matters for ourselves and for other interested scholars, and thus
for the future development of our common field of study. This is true even
if the divide is deep, which is the case here in a number of respects, as
H&L indicate themselves.
Another important preliminary remark concerns the nature and scope
of our differences. Philosophically they are certainly far reaching, but
from an empirical point of view it is useful to notice that H&L do not
present counterexamples to the actual linguistic analyses presented in
CoI. Rather, their main point is that such analyses can also be provided
in other frameworks, which they label "more traditional" than cognitive
linguistics, and which should in their view be preferred for other than
empirical reasons, having more to do with general ideas about concepts such
as "meaning", "communication", "grammar", etc., and the way these relate
to even more comprehensive concepts such as "evolution" or "language".
Below, I will actually dispute that H&L's comments show that
the alternative, non-cognitive, frameworks provide these explanations
(and suggest that they are not forthcoming either), but it is good to note
at the start that their own comments do not concern the empirical claims
of CoI. In fact, in my own view, our main difference concerns the question
what may count as an explanation in the analysis of linguistic phenomena.
Finally, as to the organization of this reply, I will not follow H&L's
comments step by step, as this would lead me to repeat myself too much.
Instead, I will first concentrate on the notion of meaning, addressing
mainly sections 2, 3, and 9 of H&L (section 2 below); then I will look at
the grammar of negation and argue that the alternative analysis H&L
suggest is linguistically unmotivated, which is partly due to them leaving
out some pieces that constitute important components of the argumentation
in CoI; it is at this point that the difference in what should be allowed
to count as an explanation in linguistic analysis becomes most concrete.
In this section (3), I will also deal with H&L's remarks about mental
spaces, cognitive significance (their section 5), and formalization.
Section 4 concerns H&L's sections 6–8, dealing with complementation,
recursion, and some basic assumptions about grammatical structure.
Section 5 concludes this reply.
Cognitive Linguistics 19–1 (2008), 125–143
DOI 10.1515/COG.2008.007
0936–5907/08/0019–0125
© Walter de Gruyter
2. What do we mean by meaning?
Perhaps the most baffling passage for me to read in H&L's comments was
in the second paragraph of their Section 3. They first summarize the
general programme of CoI: to demonstrate that the specific human ability to
manage perspectives is systematically reflected in the meanings of several
grammatical constructions, in the sense that these meanings are often
related to the management of such perspectives (what I call intersubjective
cognitive coordination) rather than to describing the world (specifying
an object of conceptualization in some way). What baffled me was
that they immediately add to this: "which entails that semantics cannot
be understood as serving both functions simultaneously" (and then they
set out to argue that this is a bad idea). How could it be that they see
this as a core idea of CoI, while evidence against it is abundantly present
in the book? Specifically, the first section (pp. 210–212) of the Concluding
Remarks is entitled, "Not everything is intersubjectivity (although
intersubjectivity is widespread)", and it refers back to parts of the book
where the meaning of different items was claimed to involve both the
objective and the intersubjective level of conceptualization (cf. also CoI
section 1.3, esp. p. 18). Moreover: why would it be an entailment? There must
be something that I missed, and I assume it is to be found in what H&L
conceive of as meaning, and hence as semantics.
H&L devote a separate section to meaning, but the points they make
there are closely related to some they make at the beginning. In section
9, they contest the proposal that evoking inferences is part of the meaning
of linguistic expressions, and defend a context-independent notion of
meaning; in section 2, they oppose an argumentative view of language
use (their picture of this view is a bit of a straw man; see the end of this
section) to the romanticist view that language is used for the free and
creative expression of thought (construed as reference, representation
or the assertion of truth), claiming that the latter function, unlike the
former, is crucial for understanding what makes language differ from
animal communication systems. We can safely equate these two oppositions,
since "argumentative" in the Ducrot-sense adopted in CoI means
evoking inferences (through associated topoi, or defeasible rules),
and the context-independent meaning, as explicated by H&L, consists in
the contribution of a linguistic (or logical) symbol to the reference or the
truth conditions of an expression containing the symbol.
Just how closely these two oppositions are connected also comes out in
H&L's discussion of Ducrot's example of the use of seats, used in CoI to
elucidate and specify the idea of argumentativity: saying "There are
seats in this room" invites the addressee to (i.a.) ascribe a certain positive
degree of comfort to the room under discussion. H&L write: "But obviously,
there can be assertions about seats in rooms where these seats fail
to be comfortable. Hence [my italics], the implicature is a mere contextual
one, and ipso facto not an inherent [italics original] (non-contextual)
aspect of the expression". The implicit premise, necessary to complete this
line of reasoning, can only be: "If an aspect of the interpretation of an
expression is not truth-conditional (does not have to represent something
in the world of which the expression is predicated), then this aspect is not
an inherent aspect of the meaning of the expression, but a contextual
one". First of all, this begs the question, the point of dispute precisely
being how linguistic meaning should be construed: as (strictly)
truth-conditional or as (at least also) argumentative. So in principle, we
could stop the debate here, as this basic point of H&L contains a fatal
fallacy.
However, I find it even more important to note that H&L overlook the
fact that their observation has actually been used as an argument for the
argumentative view (cf. CoI 11, and the Ducrot reference cited there).
The point is that the utterance "There are seats in this room" has its
argumentative value regardless of the actual degree of comfort, or lack
thereof, of the seats in the room under discussion (the only condition is
that the language users mutually share the idea that rooms with seats are
normally more comfortable than rooms without). This is precisely the point
that explains why the statement that the seats are uncomfortable can only
be connected to this utterance by means of an adversative connective,
e.g., but, and that something like and moreover is incongruent. Assuming,
for the sake of the argument, that it is somehow established as true that
the seats in a certain room are not exactly comfortable, this still does not
make the text "There are seats in this room, and moreover they are
uncomfortable" a coherent one. If we want to express, i.e., represent
linguistically, both the presence of seats and their lack of comfort, then we
have to mark this as contrastive, and that is what makes a linguist, whose
job is to account for the use and distribution of linguistic expressions and
their constituent parts, conclude that the argumentative character is
inherent in the linguistic elements involved.
H&L do say that semantics should account for both inherent and
contextual aspects of linguistic expressions. But they equate these two
notions with truth and argumentativity, respectively, and then also
with the sentence and discourse levels (their Section 3). So according to
H&L, the following 1-to-1 relationships hold:
a) Inherent meaning : descriptive : sentence level (and presumably
below)
b) Contextual meaning : argumentative/inferential : discourse level
It seems to be this relatively implicit (but contestable and contested¹)
view of meaning and the organization of semantic description that makes
H&L conclude that the CoI-view of linguistic meaning as including aspects
of discourse and argumentation gives up the possibility to account
for relationships between language and the world. Not only do they first
implicitly identify inherent meaning with descriptive meaning, thus
begging the question, they moreover connect descriptive meaning especially
to the sentence level. Since sentence semantics presumably in their
view precedes discourse and inferential semantics (sentences being taken
as the building blocks of discourse), it follows from considering some
inferential and discourse meaning as inherent that there is no possibility
to account for correspondences between language and the world. In any
case, this is the only way in which I can make any sense at all of their
statement.
But, of course, nothing of this kind actually follows from the basic
assumptions of CoI, or cognitive linguistics in general. It is knowledge of
shared (i.e., cultural) cognitive models that is directly evoked by linguistic
elements, not information about the world; but some of the inferences
that knowing these models allows us to make, do involve the world.
Thus, the primary meaning of beautiful is to express a positive evaluation,
not to give a description of some sort (consider the task of specifying the
truth conditions for H&L's example of beautiful perfume . . .); knowing
the culture, and especially having some relevant experience, allows many
language users to make some inferences about actual properties of the
perfume involved. But it is not necessary to make such descriptive
inferences, and a person not (capable of) making them can still understand
the utterance.
Another important point about conceptions of meaning relates to the
role of convention. Section 9 of H&L contains many clauses of which
the noun meaning is a part, but it is not at all clear that it can be used in
the same sense in all these statements; in other words, H&L do not seem
to be aware of, or at least they do not at all worry about, a possible
polysemy of the term meaning, which might affect the contents and
consequences of their statements. They object to the philosophy of meaning
they think they find in CoI, but do not explicate what specific sense of
meaning they mean. They state one point of their own position, in relation
to compositionality, as: "meaning will not be conventional: meaning
will follow by necessity from algebraic laws, etc.". But in a context
like this (leaving aside the issue whether compositionality is indeed to be
viewed as an algebraic phenomenon, independent of a particular cognitive
system), meaning does not have the same sense as in, for example,
"The meaning of the word banana is: a category of fruit with (prototypically)
characteristics X, Y, Z". The latter involves a relation between a
sound and a concept that is conventional; banana means what it does
because speakers of English mutually share knowledge of the rules for
the proper use of the word. So in all larger expressions, the meaning of
the whole is partly conventional, because of the words; moreover, the
balance between conventionality and compositionality is not fixed (consider
banana republic), and there are even complete sentences with a meaning
that is mostly a matter of convention (An apple never falls far from the
tree). This is very elementary linguistics, of course. The basic hypothesis
of CoI, about intersubjectivity being a prominent aspect of meaning, is
explicitly stated in terms of the meanings of linguistic symbols (words
and constructions) (p. 4), i.e., conventional signs. The claim is that
intersubjectivity is so important that several linguistic elements, especially
a number of grammatical ones, are conventional instruments for
intersubjective management (and that they have not sufficiently been
recognized as such in the past). Nothing in the argumentation for this point
hinges on a view of compositionality, which is an important, but independent
issue. But H&L keep talking about (philosophy of) meaning as if it were
a unitary concept, and then present compositionality of meaning as an
argument against conventionality of meaning. It will be clear that this
is simply completely beside the point. Moreover, if all the senses of meaning
are to be subsumed under one philosophy of meaning, this philosophy
is never going to be anywhere near coherent, so of little explanatory
value.²
As a nal remark on meaning, a word on H&Ls terminology in rela-
tion to very general scientic and philosophical commitments. In their
attempt to challenge the idea that argumentation and intersubjectivity
are inherent aspects of linguistic meaning, H&L use manipulation and
controlalluding to behaviourismas terms for the function of
A reply to Hinzen and van Lambalgen 129
language use as viewed in CoI, while CoI itself uses "management" and
"assessment" and "cognitive coordination". Manipulation normally goes
against the interests of the receiver and, especially, proceeds without the
receiver recognizing the intentions of the sender (usually, it involves
deceit). The point of the use of "argumentation" in CoI is precisely to
simultaneously express similarity to animal communication (it is an attempt
to influence) and a difference: it is an attempt to convince, i.e., to
influence the receiver's decision-making process by (i.a.) displaying one's
communicative intention.³
Of course, it is true that we can and do sometimes use language, and our
brains, to ponder the truth of something, just as it is true that we can
and do sometimes use our legs to run for fun or in an athletics
competition, that we can and do use our brains to play chess and
watch the stars, etc. But focussing on these kinds of uses is not going to
get us very far in understanding how the features involved (legs, brains,
language, etc.) fit into the natural world, that is: in explaining them. The
challenge is precisely to develop hypotheses, maximally constrained by
what we know about evolution and communication in general, about the
way the human communication system also got to be usable for some
functions, such as reference and description, for which it was not, in all
probability, originally an adaptation (cf. Verhagen forthcoming a).
3. Negation and connectives: Interaction between grammatical items
and its explanation
For negation, the point H&L try to make is that a more traditional
non-monotonic logical approach, enriched with clauses that introduce
the possibility of exceptions, can account for the same observations and
generalizations as CoI without introducing the notion of mental spaces;
if that were true, then their analysis would be simpler (using at least one
theoretical construct less than mine). Moreover, they have objections
against this construct as they have doubts about its cognitive and formal
status (see the end of section 3.2 for some remarks on this last point).
3.1. Exception clauses and topoi
As H&L notice, their use of exception clauses runs parallel to the use
in CoI of Ducrot's concept of topoi. The general template of the latter
is "If P, then normally Q"; the template for H&L's defeasible rules is
"If P and nothing exceptional is the case, then Q". Indeed, for the cases
they discuss, their analysis produces the same account of inferences
associated with negative sentences as the one in CoI; the descriptive
adequacy of their account is thus not better than CoI's, so the approaches might
be considered notational variants. But one important question is: How
about cases they do not discuss, but which are part of the account in
CoI? Does their analysis generalize to these cases? This amounts to a
question of explanatory power; it will be taken up in section 3.2. Another
question, also an issue of explanation, is: Does their analysis classify
elements into categories that make sense linguistically? In other words:
Does their characterization of the semantics fit the distribution of the
linguistic elements involved? This is the issue for the remainder of this
section.
In H&L's analysis, "barely" indicates the existence of an exception; for
example, in "He barely passed his first statistics course", "barely"
indicates that the passing was abnormal, so the clause "nothing exceptional
is the case" is not satisfied, and therefore the subsequent derivation of a
relevant inference Q (e.g., "he can pass other courses as well") is
blocked.⁴
The same result is produced in CoI by the assumption that "barely"
invalidates the applicability of topoi associated with the content of the
sentence (the performance was so minimal that one cannot draw conclusions
that one would otherwise draw from the fact that he passed). First, it seems
to me that there may be a serious conceptual problem with the
exception-approach. "Exception" does not seem to be a primitive notion; it
presupposes the notion of a rule, whereas the reverse does not hold. Rules
(including those about what is normally, not necessarily always, the case)
can be experientially based generalizations (e.g., in terms of frequency:
"What happens most of the time?"), but not the other way around. Indeed,
exceptions must be defined in terms of rules (negatively), as they
are not themselves generalizations (what makes something an exception
is the background rule). Thus it seems to me that H&L's use of "ab" as a
proposition letter indicating some abnormality, as if it were something
unanalysable, may mask the possibility that their analysis ultimately
reduces to mine.
Second, it is clear that the exception-approach to "barely" implies that it
belongs to a different class of linguistic elements than "not": the first
belongs to the "abnormality indicators", the second does not. Here we reach
a fundamental difference between the logical approach of H&L and the
linguistic one of CoI. The initial reason for reconsidering the semantics of
"barely" in argumentative terms was that both "not" and "barely" license the
"let alone" construction, i.e., their distribution is similar in a
linguistically important way (grammatical behaviour). What the analysis of
CoI shows is that this grammatical behaviour parallels the inferential (and
discourse connecting) properties of both elements (not the real-world
relations), and can thus be explained by assuming that the grammatical
properties are determined by argumentative rather than real-world aspects of
meaning. By putting "not" and "barely" in semantically different categories of
elements, H&L simply give up this explanatory power. On the basis of
their account, if the grammatical behaviour of elements reflects their
meaning, then one should expect "not" and "barely" to be grammatically
very different, but in fact they are not; taking the programmatic idea of
"language as a window on the mind" seriously should precisely lead one
to take the intersubjective analysis seriously, I maintain. As I said, I
suspect the ultimate source of H&L overlooking this point is that their
basic concerns are logical, rather than linguistic.⁵
3.2. Explanatory scope
The last comments in the previous section already indicate that H&L do
not always take into account that an important part of the argumentation
in CoI involves connections between different parts of the linguistic
system, and that they focus their semantic analysis only on certain words
and constructions in isolation. In fact, this is a more general tendency
that severely undermines the power of their criticism and their
alternative. For one thing, they do not discuss how their analysis of the
defeasibility of the argumentative implications of "not" and "barely" can be
applied/extended to "almost", while this is, again, an integral part of the
argumentation in CoI. The importance of the point can be demonstrated
with example (1) (H&L's 4a*), showing the defeasibility of (at least some
of the) argumentative inferences associated with a negated sentence:
(1) a  A.   Do you think our son will pass his courses this term?
    b  B-a. Well, he did not pass his first statistics course.
    c  A.   But he got a very good grade for the astrophysics course!
Just as I invoked a wide-spread cultural model ("Statistics is a hard
subject"), H&L invoke another one in the form of astrophysics to
demonstrate that in a next move in the discourse, the initial suggestion
"He is not going to pass" can be reversed again ("If he is smart enough to
get a good grade for astrophysics, he may still pass"). Now the point is
that the same reversal can also be established by certain sentences that
contain the operator "almost":
(2)    (1)a-b
    c  A.   But he almost passed the astrophysics course!
In this case, A's utterance entails the negation of "He passed
astrophysics": in actual fact, the student in question passed neither
statistics nor astrophysics. But by means of "almost", speaker A construes
the latter as an argument for the conclusion that he might still pass the
term, i.e., in the same way as the strongly positive statement in (1)c. This
is straightforwardly
accounted for in CoI (just like "barely" is a relatively weak negative
operator, "almost" is a relatively weak positive operator on the
argumentative orientation of an utterance), which also explains why "almost"
does not license the "let alone" construction (the argument from linguistic
distribution again), despite the entailment of a negation.
How should this be accounted for in an exception-approach? First of
all, H&L do not themselves indicate what such a generalization would
look like. Perhaps we should say that "almost" also marks the event
described in the sentence as an exception, so that otherwise licensed
inferences cannot be derived? That would clearly not suffice, as it would
then be said to have the same meaning as "barely". In fact, we can now see
that the characterization of "barely" as an exception-indicator is
insufficient: minimally, the direction, i.e., negative, of the inferences
involved should be included in this characterization. So suppose we
characterize "almost P" as "not-P, and something abnormal is the case". Even
though this might seem better, I don't think it is. The problem is to derive
positive inferences from the negative statement. Recall that the general
form of the defeasible rule, according to H&L, is "If (P and nothing
abnormal), then Q". When we now have, due to the presence of "almost",
"not-P" as a minor premise, then it seems to me that nothing can be derived
anymore. The conjunction of the rule (a) "If (P & nothing is abnormal), then
Q" with (b) "something abnormal is the case" can lead to the derivation of
(c) "not-Q", even if P is the case (with closed world reasoning); but the
conjunction of the same rule (a) with (b) "not-P and something abnormal
is the case" cannot produce the derivation of (c) "Q", as "not-P" by itself
contradicts the antecedent clause of the rule (P & nothing is abnormal).
In fact, it seems to me that in this case, too, "not-Q" would have to be
derived, given the falsity of the antecedent clause. Thus, I conclude that
there are good grounds for claiming that the exception-approach does
not generalize to "almost", and in that sense is also low in explanatory
power, while "almost" fits naturally into the argumentative framework of
CoI, as a weak positive argumentative operator, complementary to the
negative "barely".
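The derivation problem described in this paragraph can be checked mechanically. The following is only a minimal sketch, not H&L's formalism: it assumes that "closed world reasoning" can be modelled as completion of the single rule (so that Q holds exactly when P holds and nothing is abnormal), and the names `completed_rule`, `barely_premises` and `almost_premises` are illustrative choices, as is the hypothetical characterization of "almost P" as "not-P and something abnormal".

```python
from itertools import product

ATOMS = ("P", "ab", "Q")


def models(constraints):
    """All truth assignments over ATOMS that satisfy every constraint."""
    found = []
    for values in product([False, True], repeat=len(ATOMS)):
        valuation = dict(zip(ATOMS, values))
        if all(check(valuation) for check in constraints):
            found.append(valuation)
    return found


def entails(constraints, conclusion):
    """True if the constraints are consistent and the conclusion
    holds in every assignment that satisfies them."""
    satisfying = models(constraints)
    return bool(satisfying) and all(conclusion(m) for m in satisfying)


# Assumption: closed world reasoning is modelled as completion of the rule
# "If (P and nothing is abnormal), then Q": Q holds exactly when P and not ab.
def completed_rule(v):
    return v["Q"] == (v["P"] and not v["ab"])


def q(v):
    return v["Q"]


def not_q(v):
    return not v["Q"]


# "barely P": P is the case, and something abnormal is the case.
barely_premises = [completed_rule, lambda v: v["P"], lambda v: v["ab"]]

# Hypothetical "almost P": not-P, and something abnormal is the case.
almost_premises = [completed_rule, lambda v: not v["P"], lambda v: v["ab"]]

if __name__ == "__main__":
    print("barely: not-Q derivable?", entails(barely_premises, not_q))
    print("almost: Q derivable?", entails(almost_premises, q))
    print("almost: not-Q derivable?", entails(almost_premises, not_q))
```

Under these assumptions the negative inference goes through for "barely", while for the hypothetical "almost" characterization only "not-Q" is derivable and the positive inference Q is not, which is the asymmetry the argument relies on.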
A similar conclusion holds for H&L's discussion of the concessive
connective "although". While their reanalysis in terms of the
exception-approach can provide an adequate semantic characterization of
sentences of the type "p although q" in isolation from the rest of the
linguistic system, they do not show that it accounts for interactions with
other elements, in particular negation. Precisely this interaction is the
key part in the argumentation in CoI: "although" cannot occur in the scope
of negation, i.e., "not p although q" must be understood as "(not p)
although q"; it cannot be interpreted as "not (p although q)", while its
positive (causal)
counterpart "because" can occur in the scope of negation: "not p because q"
can in principle be read both as "(not p) because q" and as "not (p
because q)". In this case, I will refrain from elaborating H&L's approach
myself to see how it might work, and simply observe that this is what
they actually should have done in order to make their point, but they
haven't.
The analysis in CoI of these phenomena crucially rests on the assumption
that sentential negation introduces a separate representation ("mental
space") of the viewpoint that the speaker of the present sentence opposes.
This point is also not mentioned by H&L, who simply dismiss
mental spaces as if they were only used in the analysis of negative
sentences as such. On the contrary, chapter 4 of CoI shows that a mental
space analysis of negation provides an explanation not only of the
combinatorial restrictions between negation and "although", but also of a
number of such restrictions between negation and causal connectives, some of
which exhibit scope restrictions similar to "although". Moreover, the mental
space analysis of negation is motivated in chapter 2 (as it is in the mental
space literature in general) independently of the argumentative analysis of
negation, viz. in terms of the interpretation of discourse anaphors
following negative sentences, and the connective "On the contrary". The
greatest explanatory power of the mental space approach, according to CoI,
lies in the capacity of this single idea to unify the analysis of the
linguistic distribution of a number of phenomena. While it may be possible
to construct an analysis of simple "although"-sentences without "special
machinery" for mental spaces, this analysis again does not naturally
generalize to cases of interaction with other phenomena, as manifested in
distributional and interpretive restrictions, a fundamental concern for a
linguist with the ambition to provide explanations. But again, H&L's concerns
seem to be located more in the dimension of logical rather than linguistic
analysis.
It is in this context that H&L dedicate a separate section, with the title
"Cognitive significance", to the status of the theoretical construct of
mental spaces, which in their view is "rather dubious". To many cognitive
scientists, this may appear somewhat puzzling, because the basic
idea of mental spaces seems to be just a specific formulation of the
fundamental human capacity of perspective taking and perspective shifting:
to entertain the "same" object or idea in different ways, from different
angles, etc., i.e., to combine difference and sameness, by means of
partitioned representations (Dinsmore 1991). As it turns out, however, what
H&L mean is that it is not (to their knowledge and/or standards)
sufficiently formalized. In their view, and invoking Chomsky's earliest work,
the most important condition for a linguistic analysis to be called a
"cognitive" one is "to be explicit", which they immediately identify with
"to be given a computational or algorithmic description". Firstly, notice
that they move, very quickly, from what might be a necessary condition
to a necessary-and-sufficient one. Secondly, it is quite strange, in view of
the history of science (including recent cognitive science), to read that
the integration of fields of inquiry should depend on the explicitness of
computational descriptions. In actual fact, the possibility of
operationalising generalizations obtained by one kind of research method in
terms of another seems at least as important, and to my mind much more
common ("If your distributional analysis says that A and B are basically the
same/different, and if this is psychologically real, then the results of my
reaction time/fMRI-measurements/etc. should look like this: ..."). But
in the main stream of the generative enterprise, the focus has been on
developing formalisms rather than on deriving such predictions from the
theory and testing them. The confrontation with evidence, however, is the
hallmark of empirical science; the generative preoccupation with formalisms
at the expense of maximising evidence is thus indicative of the fact
that linguistics is seen more like philosophy or mathematics than like
science. It seems to me that H&L's point of view is only a recent carry-over
of the unfortunate identification, in the 1950s indeed, of "language"
with "formal language" (in the sense of the set of well-formed strings of
elements taken from some finite alphabet) that has hindered the
understanding of human languages as historical and psychological phenomena
that cannot be so defined, but that are still quite real (just like, to take
a well known example, a biological species).
4. Complementation and recursion
4.1. A minimalist account?
There is a curious sort of complementarity in H&L's response to CoI.
Their discussion of negation and connectives contains an alternative
semantic analysis, but does not really pay attention to combinatorial and
distributional (i.e., syntactic) aspects as sources of evidence for the
semantics. Their treatment of complementation constructions exhibits the
reverse pattern: it focuses almost completely on the issue of the proper
syntactic analysis, and contains no more than two sentences about the
semantics; in fact, for the sake of the argument they go along with the
CoI-analysis (in brief: matrix clauses are perspectival operators, rather
than event descriptions with other events as parts),⁶ so here they ignore
the possibility that the semantics may provide a constraint on the
syntactic analysis (which, to be sure, is not to say that it would ipso facto
provide such an analysis). Be that as it may, their comments essentially
come down to an argument against a constructionist approach to syntax,
from the point of view of Chomsky's minimalist program,⁷ and I will
accordingly also only comment on issues of syntactic analysis stricto sensu.
They do make some comments on perspective taking, and I will also have
a bit to say about that, but they are unrelated to the syntactic analysis.
H&L suggest that standard tests for constituency provide evidence in
favour of the idea that clausal complements should be analysed as verbal
arguments, i.e., as bearing the same syntactic relation to the verb as a
nominal complement. The problem is that these tests of constituency are
never conclusive. They give the examples "George saw/knew/said what?"
and "George saw/knew/said that X, and Bill saw/knew/said so too", to
suggest the generalization that complement clauses in general can be
replaced by "what" and "so". But that is simply not true, witness "*George
warned/was afraid what?" and "*... and Bill warned/was afraid so too",
while "George warned/was afraid that his opponent would raise taxes" is
fine. Thus, although the distribution of "what" and of "so" partly overlaps
with that of complement clauses, there are also discrepancies. This makes
replacement by "what"/"so" basically worthless as a test, as it
sometimes produces the answer "no" and sometimes "yes" to the question
"Does this complement clause bear the same syntactic relationship to
the matrix verb as a noun phrase or a pronoun?". What H&L do, deciding
that the "yes"-answer is the decisive one, is a clear case of the
"methodological opportunism" in much syntactic argumentation exposed by
Croft (2001: Ch. 1). As we saw previously, H&L have a tendency to overlook
one of the most basic concerns of a (cognitive) linguist: to account
for the distribution of linguistic elements, and to take the patterns in this
distribution as the most reliable indicators for the precise way in which
language provides a "window on the mind".
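The inconclusiveness of the replacement tests can be made concrete with a small tabulation. This is merely an illustrative sketch: the judgments are those reported in the paragraph above, while the data layout and the function names `replacement_verdicts` and `is_conclusive` are assumptions introduced here.

```python
# Acceptability judgments as reported in the text
# (True = acceptable, False = starred).
JUDGMENTS = {
    "saw":        {"that_clause": True, "what": True,  "so": True},
    "knew":       {"that_clause": True, "what": True,  "so": True},
    "said":       {"that_clause": True, "what": True,  "so": True},
    "warned":     {"that_clause": True, "what": False, "so": False},
    "was afraid": {"that_clause": True, "what": False, "so": False},
}


def replacement_verdicts(proform):
    """For every predicate that takes a that-clause, report whether
    replacing the clause by the given proform is acceptable."""
    return {
        predicate: frames[proform]
        for predicate, frames in JUDGMENTS.items()
        if frames["that_clause"]
    }


def is_conclusive(proform):
    """A replacement test is conclusive only if it gives the same
    answer for all predicates that take a complement clause."""
    return len(set(replacement_verdicts(proform).values())) == 1


if __name__ == "__main__":
    for proform in ("what", "so"):
        print(proform, replacement_verdicts(proform), is_conclusive(proform))
```

On these data neither test is conclusive: each answers "yes" for some complement-taking predicates and "no" for others, so picking the "yes" cases as decisive is a choice made by the analyst, not a result of the test.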
H&L mention part of the more complete discussion of this issue in CoI,
admitting that it is not really clear to them what the problem is, and then
go on to provide an analysis of one type of complementation construction
in terms of the minimalist program. They start with formulating a
number of what they call "rather minimal assumptions". Leaving aside
whether they are really minimal in the sense of "virtually a conceptual
necessity" (I don't think so), I will restrict myself to the question
whether this approach accomplishes what H&L claim it does. They state that
these assumptions allow one to describe the similarities between nominal
objects and clausal complements, and of course they do: any sufficiently
abstract analysis does. They then say that because this analysis does not
refer to syntactic categories (i.e., it abstracts from the differences
between nominal and clausal phrases), it does not predict that nominal
and clausal phrases have the same distribution. But of course, without
additional (presumably not so minimal) stipulations, that is precisely
what the analysis does predict. If the claim is (and that is how H&L pres-
ent it) that the minimalist approach can explain the occurrence of both
nominal and clausal complements (and not only describe what, however
minute and abstract, is similar to them), then the system as they describe
it must predict the same distribution for the two (and more?) types of
phrases that are instantiations of the fully general category label X
(again, without additional stipulations). It seems to me that there is more
of a logical error here than in CoI (cf. footnote 6).
Somewhat more mildly, one could say that while it may be true that
H&L's minimalist analysis does not strictly predict the same distribution
for nominal and clausal phrases, it does not predict the differences either.
Then the CoI-analysis would still have to be viewed as superior, since
it does predict the possibility of clausal complements with "warn" and "be
afraid" despite the fact that these predicates do not take nominal objects:
they are both perspective markers (as a verb of communication, and a
mental state predicate, respectively) and hence fully compatible with the
hypothesized meaning of the complementation construction (notice that
the analysis of form and function crucially meet here). But in any case,
the minimalist analysis as provided definitely does not account for the
observed distribution of clausal complements as only partially overlapping
with that of nominal ones.
Surprisingly, the most detailed actual syntactic analysis in H&L's paper
ultimately results in a full contradiction. For the rather sketchy analysis
of standard object complementation (the "George saw/knew/warned/..."
examples above), H&L invoke the general principle that phrases are
headed. They then attempt to give a minimalist account of copular
complementation constructions of the type "The danger is that the middle
class feels alienated", which fit straightforwardly into CoI since being a
danger is not an observable property in the world, but rather a subjective
assessment; hence the matrix clause evokes a (perhaps unidentified)
perspective, and thus satisfies the conditions for combination with a
complement clause. In the minimalist account, the predication must count as
symmetrical: only in this way is it possible for either element of the
"small clause" allegedly underlying such sentences to surface as the
subject of the sentence (cf. "That the middle class feels alienated is the
danger"). H&L state explicitly: "Neither the CP [= clause] nor the DP
[= nominal phrase] are the head". This directly contradicts their
minimalist claim (iii) ("phrases are headed"), which was moreover necessary in
the description of object complements. I conclude that their account is
inherently inconsistent, and hence that it again does not accomplish what
it is claimed to accomplish, also not for copular matrix clauses of
complements. And then we have not even touched upon all the theoretical
machinery invoked, such as movement and empty structural positions,
for which Occam's razor would require independent evidence; but that
is a much more general issue than need concern us here.
4.2. Perspective taking, recursion, and understanding false beliefs
In CoI, it is observed that the hypothesis of complementation expressing
perspective taking immediately accounts for the fact that complementation
is a prototype of recursion in language (the possibility for a structure
of type X to be embedded in another structure of type X), since conceptual
perspective taking itself inherently allows for recursion. Thus, the
source of this case of recursion in language is in a sense placed outside
language, but that is different from "placing recursion outside language",
as H&L construe it. More importantly, they contest the CoI explanation
on the basis of the argument that this explanation would only
work if this conceptual recursivity is "propositional", for which they
claim there is no evidence. They do not state very precisely what they
mean by "propositionality", but they refer to experiments involving
what is known in "theory of mind" research as "false belief tasks"; these
are tasks in which subjects must be able to entertain another person's
belief about the world and predict how s/he would act on that basis, while
knowing simultaneously that this belief is false (hence the term), so that
the subject's own response to the situation would be different. If recursion
were restricted to this kind of management of incompatible beliefs,
then H&L would have a point, because (understandably) having a system
of secondary representation for beliefs, i.e. a system on top of the primary
sensori-motor system, seems to be a necessary condition for performing
false belief tasks adequately.
However, managing false beliefs is simply not the same as perspective
taking; it is one of its most abstract and complex forms. Human children
develop several skills of social cognition before language, such as
recognizing intentionality (distinguishing intentional acts from accidental
events), sharing attention (e.g., in gaze following), and directing attention
(pointing, showing). These basic skills all involve perspective taking, and
their development is a necessary condition (given the arbitrariness of
connections between sound and meaning as children encounter them in the
world) for the development of linguistic symbolic communication (Tomasello
1999): it is only through recognition of an adult's intention that a
child can start to make guesses about the meaning of some sound. Moreover,
these skills clearly already exhibit the potential for recursion; e.g., a
child can manage other people's attention to get them to show something
to the child. More recently, it has been shown that very young children
(and, to a limited extent, young chimpanzees) can also recognize other
people's goals and desires, as evidenced by their propensity to provide
help (Warneken and Tomasello 2006). These are complex social cognitive
skills, and even in cases where our closest relatives can be argued to have
similar abilities, humans are usually much better at them, also at a young
age. Still, they are less complex than understanding beliefs, which are
relatively permanent mental states not directly caused by the outside world
(such as perceptions) but by other mental states or events, nor directly
causing actions, but only indirectly so through guiding plans and
intentions (cf. D'Andrade 1987). And they all involve simple alignment of the
self with the other, not alignment plus dissociation, which, as mentioned
above, requires a system of secondary representation.
Thus, understanding other people as having intentions and desires like
oneself is simpler and more basic, also developmentally, than understanding
beliefs, and especially false beliefs. Now language, being a system of
symbolic communication, has the (fortunate) automatic side effect of
also providing humans with a system of secondary representation (cf.
Keller 1998: 127-128), so that it may, itself being based on capacities for
social cognition, provide the scaffolding to enhance these capacities to
a level like that of managing false beliefs. Thus, the acquisition of false
belief understanding may very well be dependent on the acquisition of a
representation system for perspectivization, such as complementation. As
a matter of fact, Tomasello has been one of the scholars contributing
some of the most compelling evidence for this view so far (Lohmann
and Tomasello 2003). So it is not at all a matter of "putting the cart
before the horse", but rather a matter of treating "perspective taking",
"theory of mind", etc. not as monolithic concepts, but as configurations
of features that constitute a family of related perspectivization capabilities
of different degrees of complexity.⁸
5. Conclusion
H&L attempt to show that data adduced in CoI as support for an
intersubjective, argumentative view of meaning in grammar have an
explanation in other approaches, which they consider "more traditional" and
which they would in principle consider superior; but in general these
attempts fail. The reasons for this failure are various, including
misunderstandings and misconstruals, but the most important one is the fact
that they ignore the precise character of the task of explanation in
linguistics, which involves taking the distribution of linguistic elements
seriously:
if several linguistic forms behave similarly with respect to one or more
environments (grammatical ones and/or discourse ones), then an analysis
should account for this (with a minimum of assumptions, of course)
to be acceptable as an explanation (in some cases, H&L's overlooking of
this crucial point even makes them leave out certain crucial parts of
analyses in CoI from their own discussion). Indeed, it is only by taking the
distribution of linguistic elements seriously in this sense that the study of
language provides an independent "window on the mind", such that certain
conceptions of the nature of meaning and mind are not already
built into the foundational concepts of a purported explanation.
It is thus ironic, in my view, that H&L turn "explanation" into their
major point in their conclusions: I couldn't agree more.
Received 28 September 2007 Leiden University, The Netherlands
Revision received 19 November 2007
Notes
* Author's email address: <arie@arieverhagen.nl>.
1. The usefulness of drawing boundaries and connections in this way has been disputed by
cognitive and functional linguists for decades now, and these alternatives have produced
several important insights. In an important sense, Fauconnier's (1985) theory of Mental
Spaces was motivated by the discovery of the inferential character of meaning at the
level of the sentence, and even below; metaphors were demonstrated to be both inherent
(in sentences and in words) and argumentative in Lakoff and Johnson (1980), etc.; an
approach showing quite directly that the distinctions invoked by H&L must be called
into question is Levinson's (2000) theory of presumptive meanings. So given the state
of the art in cognitive linguistics, and in cognitive science in general, some more
argumentation to still maintain the traditional view is highly desirable, to say the least.
Even though, admittedly, one cannot cover everything in a relatively short commentary
article, references to other work do not constitute such an argumentation (cf. Bierwisch's
2006 response to Hamm, Kamp and Van Lambalgen 2006).
2. The situation may even be worse. In English (unlike some other languages), the term
"meaning" may also be used for a contextually derived, i.e., person and time bound,
interpretation of a linguistic element or a piece of discourse, and even for what a speaker/
writer "means" (i.e., intends to convey) with an utterance, and it seems to me that
H&L include these senses in their notion of "meaning" too. Needless to say, all the
senses are related, but they are certainly not identical. One important difference is that
conventional meaning is, by definition, a social phenomenon, while speaker meaning is,
also by definition, an individual phenomenon. Therefore, a theory of speaker-meaning
and a theory of conventional meaning can never be the same, as a matter of principle
(although they can and should inform and constrain each other).
3. H&L widen the gap between animal and human communication, not only by making
humans very different from animals, but also by underestimating the cognitive and
communicative capabilities of animals, especially when they say that these are "[s]tuck in the
immediate here and now" (see Emery and Clayton 2004 on certain food caching birds),
"only have a small number of vocalizations" (see Kroodsma 2004: 122 on some
songbirds having repertoires of thousands of songs), "which are all intrinsically linked to an
immediate [...] purpose" (cf. Pepperberg 2004 on Grey parrots).
4. H&L also notice that the exception-approach entails the derivation of a variable for an
abnormality. Rather than an advantage, I consider this somewhat of a problem, as I
have no trouble understanding "I barely passed" and "I failed although I worked hard"
without being committed to even the existence of a particular abnormality such as being sick
on the day of the exam; so the occurrence of some abnormality does not seem to be a
necessary condition for the occurrence of an exception. Rather, the inference of the
possible existence of an abnormality seems to be a defeasible inference itself (which it is
not in H&L's approach, as far as I can see).
5. In their footnote 2, they actually cite one of a number of passages from CoI stating this,
but they seem to simply have missed the point. In a way, their analysis consists of a
return to the position of Fillmore, Kay and O'Connor (1988), the problems of which
precisely motivated the alternative analysis in CoI.
6. In one of these two sentences, they insert the proviso that the functional difference
between their examples (5) and (6) is not obvious to them. But one of the (repeated)
methodological points in CoI is that such differences are often not obvious when one
looks at sentences in isolation, and only become visible when one looks at what are
and are not coherent ways of fitting a sentence into a piece of discourse; it is that kind
of evidence that is adduced in CoI to make the point. H&L do not recognize the validity
of this kind of evidence, claiming it contains a logical error, but it is clear that this
opinion is entirely based on the fallacy of assuming a 1-to-1 relationship between
sentence and inherent meaning as discussed in section 2, so it is not a matter of logic but of
assumptions about the subject matter.
7. In fact, they devote a separate section to a very general discussion of this topic. The
issue has been discussed in many other places in a more adequate way than I could do
here, so I will restrict myself to two remarks. First, H&L call the idea that constructions
can be reduced to deeper principles that are not construction-specific, a claimed
achievement of Chomsky's two most recent research programmes. However, if there
is one thing that work in constructional approaches over the last 10 years or so has
established, then it is tons of evidence that the claimed result is not at all achieved, and in
fact unachievable, for all practical and theoretical purposes. The alleged reduction of
raising constructions and passive constructions to a single non-specific rule Move NP
or even Move, a few other principles, plus some construction-like stipulations (cf.
H&L's idea of some syntactic heads subcategorizing for specific semantic categories)
to take care of the details, turned out not to generalize to many other constructions, and
meanwhile one construction after the other was found that has demonstrably unique,
so irreducible, and yet productive features. Second, for conceptual reasons to doubt
the general desirability of the Minimalist approach, I would like to point here to work
within the generative tradition that H&L adhere to (though not the two research
programmes mentioned above), viz. Jackendoff and Pinker (2005) and Culicover and
Jackendoff (2005), esp. chapter 1.
8. A possible misunderstanding I have encountered in discussions of this point is that
perspective taking would be the only source of recursion in language. It is true that CoI
does not mention other possible sources, i.e., in other conceptual domains. As a matter
of fact, I think that there are other such sources, independent of perspective taking (e.g.,
the specification of locations or referents, as manifested in embedding of prepositional
phrases and relative clauses). But these still do not create an overall potential for
recursion of all kinds of phrases; rather, recursion is restricted to its own functional
niches (see also Verhagen forthcoming.b).
A reply to Hinzen and van Lambalgen 141
References
Bierwisch, Manfred
2006 Comments on: Fritz Hamm, Hans Kamp, Michiel van Lambalgen, "There is
no opposition between Formal and Cognitive Semantics". Theoretical
Linguistics 32: 41–45.
Croft, William
2001 Radical Construction Grammar. Syntactic Theory in Typological Perspective.
Oxford: Oxford University Press.
Culicover, Peter W. and Ray Jackendoff
2005 Simpler Syntax. Oxford: Oxford University Press.
D'Andrade, Roy G.
1987 A folk model of the mind. In: Dorothy Holland and Naomi Quinn (eds.),
Cultural Models in Language and Thought. Cambridge: Cambridge
University Press, 112–148.
Dinsmore, John
1991 Partitioned Representations: A Study in Mental Representation, Language
Understanding, and Linguistic Structure. Dordrecht: Kluwer.
Ducrot, Oswald
1996 Slovenian Lectures/Conférences Slovènes. Argumentative Semantics/Sémantique
argumentative. Igor Ž. Žagar (ed.). Ljubljana: ISH Inštitut za
humanistične študije Ljubljana.
Emery, Nathan J. and Nicola S. Clayton
2004 The mentality of crows: Convergent evolution of intelligence in corvids and
apes. Science 306: 1903–1907.
Fauconnier, Gilles
1985 Mental Spaces. Aspects of Meaning Construction in Natural Language. Cam-
bridge, MA: The MIT Press. [Reprinted 1994, Cambridge: Cambridge Uni-
versity Press.]
Fillmore, Charles J., Paul Kay and Mary Catherine O'Connor
1988 Regularity and idiomaticity in grammatical constructions: the case of let
alone. Language 64: 501–538.
Hamm, Fritz, Hans Kamp and Michiel van Lambalgen
2006 There is no opposition between Formal and Cognitive Semantics.
Theoretical Linguistics 32: 1–40.
Jackendoff, Ray and Steven Pinker
2005 The nature of the language faculty and its implications for evolution
of language (Reply to Fitch, Hauser, and Chomsky). Cognition 97: 211–225.
Keller, Rudi
1998 A Theory of Linguistic Signs. Oxford: Oxford University Press.
Kroodsma, Don
2004 The diversity and plasticity of birdsong. In: Peter Marler and Hans
Slabbekoorn (eds.), Nature's Music. The Science of Birdsong. Amsterdam: Elsevier
Academic Press, 108–131.
Lakoff, George and Mark Johnson
1980 Metaphors We Live By. Chicago/London: The University of Chicago Press.
Levinson, Stephen C.
2000 Presumptive Meanings. The Theory of Generalized Conversational
Implicature. Cambridge, MA: The MIT Press.
Lohmann, Heidemarie and Michael Tomasello
2003 The role of language in the development of false belief understanding: A
training study. Child Development 74: 1130–1144.
Pepperberg, Irene M.
2004 Grey parrots: learning and using speech. In: Peter Marler and Hans
Slabbekoorn (eds.), Nature's Music. The Science of Birdsong. Amsterdam: Elsevier
Academic Press, 363–373.
Tomasello, Michael
1999 The Cultural Origins of Human Cognition. Cambridge, MA: Harvard Uni-
versity Press.
Verhagen, Arie
forthc.a Intersubjectivity and the architecture of the language system. In: Jordan
Zlatev, Timothy P. Racine, Chris Sinha, Esa Itkonen (eds.), The Shared Mind:
Perspectives on Intersubjectivity. Amsterdam/Philadelphia: John Benjamins
Publishing Company.
forthc.b What do you think is the proper place of recursion? Conceptual and empiri-
cal issues. The Linguistic Review.
Warneken, Felix and Michael Tomasello
2006 Altruistic helping in human infants and young chimpanzees. Science 311:
1301–1303.
Tense and cognitive space:
On the organization of tense/aspect systems
in Bantu languages and beyond
ROBERT BOTNE AND TIFFANY L. KERSHNER*
Abstract
Bantu languages are well-known for their complex tense systems encoding
multiple degrees of remoteness. Two assumptions underlie most approaches
to analysis of such systems: (1) that linguistic time is optimally construed
as a unidimensional expanse, whereby multi-tense systems carve up the
timeline in regular progressive intervals away from the speech event; and
(2) that tense markers quintessentially exhibit no overlap in denoting
reference along this expanse. In this paper, the authors propose a different
approach to understanding Bantu tense systems which treats linguistic
time, from the perspective of Ego (the conceptualizer), as a
multi-dimensional array comprising cognitively dissociated temporal worlds, or
domains, temporally linked and grounded in the deictic dichotomy between
events construed as occurring in a contemporal world of the present
versus those situated in cognitively dissociated domains. That is, tense markers
function to situate events in one of two distinct conceptual types of domain
that correlate with different construals of time: Ego-moving or
moving-time. Support comes from a variety of curious facts found in Bantu
languages. A key element of this approach is that it provides an explanation
for why temporal overlap of tenses does, indeed, occur, and advances the
position that there are conceptually different pasts and futures.
Keywords: Bantu; cognitive domains; dissociation; semantics; tense.
Cognitive Linguistics 19–2 (2008), 145–218
DOI 10.1515/COG.2008.008
0936–5907/08/0019–0145
© Walter de Gruyter
1. Introduction
In his influential work Tense, Comrie (1985: 50) alludes to a possible
universal of tense systems: "in a tense system, the time reference of each
tense is a continuity." By this, he seems to imply (1) that linguistic time
is optimally construed as a unidimensional expanse and (2) that tense
systems exhibit no gaps in denoting reference along this expanse, a
position reiterated in, for example, Givón (2001) and Frawley (1992).
However, Comrie points out a possible exception to that hypothesis in Burera,
an Australian aboriginal language.¹ Burera has a formal opposition
between two verbal suffixes, -nga and -de, each of which has two temporal
interpretations: present time reference ('be V-ing') and recent past ('V-ed
in last few days') with -nga, hodiernal past ('V-ed earlier today') and
remote past ('V-ed more than a few days ago') with -de. Both morphemes
appear to denote discontinuous time reference and, hence, constitute
counter-examples to the purported universal. Bybee et al. (1994: 104),
citing data from Merrifield (1968), point out a similar case in Palantla
Chinantec. In this language, ka¹ denotes an action just completed or
completed on another day, in opposition to na², which refers to an event
occurring earlier on the same day. Significantly, Comrie suggests
marginalizing this kind of situation in coming to understand tense systems:
This kind of tense opposition does not fit well within most current conceptions of
tense, although its existence must be acknowledged; at best, one could appeal to
its rarity as an excuse for according it marginal status within the overall theory.
(p. 89)
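The sense in which the Burera suffixes violate the continuity hypothesis can be made concrete with a minimal sketch. It assumes, purely for illustration, that the four time references mentioned above can be ordered as four adjacent slots leading back from the speech event; the slot labels and the function name `is_continuous` are ours and are not Comrie's notation.

```python
# Remoteness slots ordered from the speech event backward. This four-way
# ordering is a simplifying assumption of the sketch, not Comrie's notation.
SLOTS = ["present", "earlier today", "last few days", "remote past"]

# The two Burera suffixes and the time references attributed to them above.
BURERA = {
    "-nga": {"present", "last few days"},
    "-de": {"earlier today", "remote past"},
}


def is_continuous(reference):
    """Return True if the covered slots form one unbroken stretch of SLOTS."""
    indices = sorted(SLOTS.index(slot) for slot in reference)
    return all(later - earlier == 1 for earlier, later in zip(indices, indices[1:]))


for suffix, reference in BURERA.items():
    print(suffix, "continuous:", is_continuous(reference))
```

Under this ordering each suffix skips a slot that the other one fills, which is exactly the interleaving that a single linear timeline with non-overlapping, gap-free tenses does not predict.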
We believe, contrary to Comrie's view, that such seemingly
idiosyncratic distinctions constitute keys to understanding how tense systems
are organized. In particular, we believe that they provide evidence for a
multi-dimensional conceptualization of time and cognitive space. In this
paper, we set out various kinds of evidence from Bantu and similar
Bantoid languages that support this view.
Tense systems in Bantu languages are typically rich and complex, with
multiple past and/or future tense markings. Thus, a common set of past
tenses may include a distinct form for immediate past, another for a past
earlier in the day, a third for yesterday or a few days ago, and a fourth for
a more distant past. Generally, Bantuists conceive of these different past
forms as denoting linear temporal reference at farther and farther remove
from the speech event. Nurse (2003: 99), for example, states, ". . . different
languages divide the timeline up differently, resulting in a different
number of tenses. In principle, the timeline can be cut at many points." If this
view were correct, we should expect to find that the only difference
semantically would be in the time referred to or, morpho-syntactically, in
the form of the tense marker. However, as we will show in the cases
examined here, other differences in the semantics and morpho-syntax arise
that cannot be explained, or are unsatisfactorily explained, in terms of a
simple linear timeline.
146 R. Botne and T. L. Kershner
2. Tense and time
Tense, painted in rather broad strokes, has commonly been defined as
that grammatical category that marks the location in time of some event²
with respect to some conventionally recognized reference locus (see, for
example, Chung and Timberlake 1985 among others). That is, temporal
relations can purportedly be conveyed in terms of four basic concepts:
an anchoring reference locus, a situated event, a direction or temporal
location vis-à-vis the reference locus, and, in some cases, the degree of
remoteness from the reference locus. The typical deictic reference locus
in natural language is the time of the speech event itself, with events
construed as situated temporally before, after, or simultaneous with it.
Consonant with this perception of tense is the common view that tense is best
understood and represented in terms of a one-dimensional linear timeline
anchored by the speech event. Indeed, Frawley (1992: 337–338) explicitly
states that "[t]he stereotypical, ideal timeline is an entirely adequate
model of linguistic time." Likewise, Givón (2001: 285) asserts that "[t]he
category tense involves the systematic coding of the relationship between
two points along the ordered linear dimension of time." As we intend to
show, data from a variety of Bantu languages demonstrate that this is too
simple a mental model of tense systems and that there is not such a simple
linguistic correspondence between time and tense: the common
correlation of tense marking solely with the traditional unidimensional timeline
fails to account adequately for the range and differences in usage one finds.
As suggested in the statement from Frawley cited above, linguists in
general and Bantuists in particular have persisted in correlating tenses
with a simple timeline. However, Comrie (1985: 2), though subscribing
to the simple linear view that such a diagrammatic representation of
time is adequate for an account of tense in human language, does
observe that the timeline does not directly represent the flow of time, i.e.,
whether the present moment is viewed as moving along a stationary
timeline, or whether time is viewed as flowing past a stationary present
reference time point. (p. 3) Nevertheless, he demurs in stating that these
different perspectives on the flow of time do not seem to play any role
in the characterisation of grammatical oppositions cross-linguistically.
(p. 3) Binnick (1991: 56) and, later, Lakoff and Johnson (1999) also
call attention to these alternative perspectives of time but, again, do not
correlate them formally with tense or tense systems. We believe it is
necessary to integrate the contrasting perspectives directly into any semantic
analysis of tense/aspect systems as a cogent organizational principle.
In adopting this position, we do not claim that there are different
timelines but, rather, different construals of time: time as path vs. time as
Tense and cognitive space in Bantu languages 147
stream (Figure 1). In the former, time is construed as a stationary
timeline along which Ego, the conceptualizer, moves, as diagrammatically
presented in Figure 2a; in the latter, time itself is perceived as moving
(Figures 2b, c). Additionally, either Ego or Event may be perceived as
moving with respect to the other. Metaphorically, one could visualize the
former as a person on a raft (moving-Ego) floating past a gathering
(stationary-Event) on the bank of a stream (Fig. 2b), the latter as a
person standing on a bridge over the stream (stationary-Ego) observing
various items floating by beneath the bridge (moving Event(s)) (Fig. 2c). In
the latter case, one can imagine the observer either to be observing items
floating toward her (coming from the future) or, on the other side of
the bridge, to be observing items as they pass by going downstream
Figure 1. Alternative construals of time
Figure 2. Ego (speaking at (S)) and Event (E) construed in relation to time(line) [The
figure (I) represents Ego, a diamond (◊) the location of an event on a timeline.]
(moving off into the past). [N.B. abbreviations can be found in the
Appendix.]
In Figure 2b, Ego (at S) conceptualizes herself as moving in time with
respect to a stationary event (E); in Figure 2c, she conceptualizes herself as
stationary while E moves in time toward her. In each case, the temporal
relation of Ego and Event is constant in time; what varies is the cognitive
orientation the individual conceptualizing the situation chooses to adopt.
Our claim is that a language may correlate these different orientations
with different formal linguistic features, providing the means for a speaker
to adopt either a path or stream construal at the time of speaking.
In order to reduce the number of schemas used in the paper and to fa-
cilitate comparison of formal marking in each construal, we combine the
path and stream orientations illustrated in Figure 2 into one diagram-
matic representation, as in Figure 3. Furthermore, we will, henceforth,
for ease of exposition, refer to each line as a timeline, even though con-
ceptually they represent alternative perspectives on one timeline.
This contrast in perspectives of time was noted and expressed as early
as the work of Gustave Guillaume (1929, 1937, 1945 cited in Hewson
et al. 2000) and later in the work of Benveniste (1965), Traugott (1978),
Fleischman (1982), Emanatian (1992), Hewson et al. (2000) and, most
recently, Evans (2005). Hewson et al. (2000: 38–40), following Guillaume,
correlate moving time with aspect, moving ego with tense; Traugott
(1978) and Fleischman (1982) discuss the differences with respect to come
and go as grammaticized temporal markers, but do not utilize the
contrast further in developing a model of the organization of tense systems.
Emanatian (1992), though espousing a solely moving-ego analysis of
come and go temporal use in Chaga (E.62)³, grants the possibility
that the moving-ego analysis and the moving-event analysis may describe
different routes for come verbs to become future markers. Evans
Figure 3. Linguistic construals of time(line) combined: (a) Ego-moving; (b) moving-Ego or
moving-Event [i.e., either Past to Future or Future to Past (as shown), respectively]
(2003) provides the most detailed discussion and analysis of these
cognitive models of time, arguing that they represent complex, and not
primary, metaphors. All of these views consider there to be a binary contrast
in perspectives. We believe there to be a ternary distinction that has
subtle consequences for temporal marking systems. We will return to this
issue in Section 4, but first we consider tense in relation to mental worlds.
3. Tense and mental worlds
Tense systems constitute the overt manifestation of the linguistic
organization of time. Although the multiple conceptual perspectives of time
noted in the preceding section constitute a key organizing principle, just
as important is the concept of mental worlds or, as we shall refer to
them, cognitive temporal domains. These domains are grounded in the
fundamental dichotomy that exists between basic and dissociated deictic
views of realis, space, and time. As background to this discussion, we
begin with a brief overview of reference time, or reference locus, found in
the linguistic literature. Two linear models of tense merit a brief discussion:
one, Reichenbach's (1947) model, has been and continues to be particularly
influential (cf. Smith 2004; Helland 1995; Hornstein 1990, for example); the
other, Bull's (1960) model, much less so.
Although Reichenbach's work can be considered anti-mentalist to a
certain extent and ignored aspectual relations, the influence of his model
of tense relations nevertheless merits a brief review. In breaking with the
Jespersenian model of primitive absolute tenses (cf. Binnick 1991: 110–112),
Reichenbach (1947) defined tenses in terms of the relations holding
among three times: the time of the speech event (S), the time of the event
(E), and a reference time (R), inclusion of the abstract reference time
perhaps the most significant feature of his model. The relative order of each
of these with respect to the two others determined temporal reference.
Although this model provided a solution to problems inherent in
differentiating the preterite and the perfect, there nevertheless were problems with
the approach. As Comrie (1981: 25) and later Declerck (1991: 236) point
out, the model advanced the idea that specification and strict ordering of
all three times in a linear manner along a timeline were both necessary
and sufficient conditions for the proper specification of any tense.
However, for some tenses, such specification is unnecessary and infelicitous.
For example, in the future perfect in French or English, as in il aura
chanté 'he will have sung', the strict ordering between all three times for
the future perfect (S–E–R; S,E–R; E–S–R) is unwarranted; the
relation of E to S is not part of the meaning of the form, but simply
determined pragmatically from context. That is, aura chanté 'will have sung'
simply indicates that the event 'sing' preceded the reference time (R),
which itself is posterior to S; it does not indicate the temporal relationship
of E with respect to S. Rather, as Comrie (1981) and Dinsmore (1982)
propose, and Binnick (1991) reiterates, one can specify two pairwise
relations: E with respect to R, and R with respect to S. As Binnick (1991:
115) rightly points out with respect to Reichenbach's approach, "[t]ense
is a matter of how R relates to S." In Reichenbach's terms, this would
be R–S (past), R,S (present), S–R (future).
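The point about the future perfect can be checked mechanically. The following minimal sketch assumes only the two pairwise relations given above (E before R, S before R) and enumerates every ordering of the three times that is compatible with them; the function name `orderings` and the ASCII rendering of Reichenbach's notation (a hyphen for precedence, a comma for simultaneity) are our own conventions for the illustration.

```python
from itertools import product


def orderings(constraints):
    """Enumerate the distinct orderings of E, R, S that satisfy all constraints.

    Simultaneity is allowed, so the results are weak orderings. Each one is
    rendered with '-' for precedence and ',' for simultaneity.
    """
    names = ("E", "R", "S")
    found = set()
    for ranks in product(range(3), repeat=3):
        position = dict(zip(names, ranks))
        if not all(check(position) for check in constraints):
            continue
        # Canonical form: group the names by rank, earliest group first.
        groups = [
            ",".join(sorted(n for n in names if position[n] == rank))
            for rank in sorted(set(ranks))
        ]
        found.add("-".join(groups))
    return sorted(found)


# Future perfect, stated pairwise: E precedes R, and S precedes R.
future_perfect = [
    lambda p: p["E"] < p["R"],
    lambda p: p["S"] < p["R"],
]

print(orderings(future_perfect))
```

Three orderings survive, corresponding to the three Reichenbachian formulas listed in the text, and they differ only in how E relates to S. That is the sense in which the E-to-S relation is left open by the form and supplied by context.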
Although Reichenbach's introduction of the concept of reference time
was, perhaps, his most controversial and key contribution in thinking
about temporal relations, it is not simply a question of reference point.
As Klein (1992: 533) indicates, the possibility of explaining the different
behavior of the past and perfect, for example, hinges on what is
understood by R. Thus, he proposes a more explicit and precise definition of
reference time, which he labels topic time (TT), as the time span to
which the claim made on a given occasion is constrained (p. 535). While
we believe this to be a very useful and salutary proposal, what
Reichenbach (and subsequent adherents to his model) as well as Klein have not
addressed, and what seems to be vaguely hinted at, for example, in
Comrie's and Declerck's critiques, is that reference time needs to be further
decomposed into two separate concepts, reference anchor and reference
world, and that linguistic time is conceptualized cognitively. The
reference anchor constitutes a locus of orientation with respect to which an
event may be temporally related, as in the English past perfect, for
example she had sung, in which the singing occurred prior to some other time
or event which itself preceded the moment of speaking (in the pairwise
modification of Reichenbach's terms, E–R:R–S). On the other hand,
reference worlds (or, as we will label them since we are speaking of mental
activity, cognitive domains) constitute time spans within which
events are asserted to occur. This kind of distinction began to emerge in
the work of Bull (1960).
Bull's model, similar in certain respects to that of Reichenbach, has had
significant influence on the development of some of the ideas presented in
this paper. In his work, Bull proposes a model in which there are multiple
axes of orientation, each anchored by a different reference point.
Although the number of axes is in principle infinite, Bull (1960: 22) suggests
that the maximum number grammatically encoded in any language is not
likely to exceed four. He labels these axes in the following manner: PP
(present point equivalent typically to the time of speaking S) for reference
time at the speech event, RP (retrospective point) for a reference time in
the past, AP (anticipated point) for a reference point in the future, and
RAP (retrospective anticipated point) for a reference point posterior to
another reference point in the past. Events can be temporally related to
each of these reference anchors in three ways, which Bull labels vectors:
anterior to it, simultaneous with it, posterior to it, as shown in Figure 4.
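The combinatorics of Bull's schema, four axes crossed with three vectors, can be laid out as a small data structure. This is only an illustrative sketch: the class names `Axis` and `Vector` and the helper `bull_slots` are hypothetical labels of ours, and the code simply enumerates the twelve axis/vector pairings described above without claiming that any particular language encodes all of them.

```python
from enum import Enum
from itertools import product


class Axis(Enum):
    """Bull's four axes of orientation, named by their reference points."""
    PP = "present point"
    RP = "retrospective point"
    AP = "anticipated point"
    RAP = "retrospective anticipated point"


class Vector(Enum):
    """The three ways an event can relate to a reference point."""
    ANTERIOR = "anterior to"
    SIMULTANEOUS = "simultaneous with"
    POSTERIOR = "posterior to"


def bull_slots():
    """Return every (axis, vector) pairing the schema makes available."""
    return list(product(Axis, Vector))


for axis, vector in bull_slots():
    print(f"event {vector.value} {axis.name} ({axis.value})")
```

Laying the schema out this way makes visible that each slot relates an event to exactly one reference point at a time, which is the relativity assumption discussed next.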
Since Bull's system is based on the notion of relativity, events can only
be projected and construed in relation to one reference point at a time.
Hence, reference points other than PP, the present speech event, may be
potentially encoded in grammatical forms that indicate the temporal
relation of the particular axis (or reference point) to the speech event.
Though providing a rich model for analyzing tense relations by implicitly
differentiating reference anchor (his primary points, i.e., PP, AP, etc.)
and reference world (his axes of orientation), Bull nevertheless
envisioned temporal relations as features of a single timeline (pp. 22, 24).
Furthermore, his model fails to separate the different kinds of temporal
relations encoded in the different verb forms, in particular, the difference
between have forms and -ed forms. This singular view of the timeline, as
we have stated, is insufficient to account for the distinctions one
encounters in many tense systems; rather, a dual perspective is necessary.
Figure 4. Bull's (1960) tense schema
We incorporate and develop further some of the insights from these
various models into one conceptual framework, but diverge in significant
ways from the simple linear approaches. Tense, in our view, denotes that
relation that holds between S (the locus of the speech event) and a
cognitive temporal domain (comparable, but not identical, to Bull's notion of
axis and Klein's topic time), a relation that is best construed in terms
of clusivity: inclusivity, i.e., the deictic center (anchored at S) occurs
within the time span of the cognitive world, versus exclusivity, or
dissociation, i.e., the deictic center at S is external to, or dissociated
from, the cognitive world. In the privileged case of inclusion, i.e., when
the cognitive world includes S, we label that world the P-domain,
denoting a primary, prevailing experiential past and future perspective. For
relations of non-inclusion, or dissociation, we refer to that cognitive world
as a D-domain. For expository convenience, we represent these different
temporal domains (i.e., the different cognitive worlds) as bounded
quadrangular planes, as in Figure 5, correlated with two perspectives of
time: (i) ego projecting movement over the temporal landscape from one
cognitive domain to the next, and (ii) either moving-ego or moving-event
(dotted arrows) passing through the P-domain. That is, Ego construes
herself as moving across the temporal landscape from one cognitive
domain, or world, to another. Within a given cognitive world, Ego
construes time as moving, either carrying Ego along into the future, or
carrying events toward Ego from the future. In our analyses of the various
languages, we endeavor to determine which perspective of Time-moving is
relevant. However, limited data in some languages has precluded making
a definitive determination. This lacuna, though unfortunate, does not
impede analysis of the organization of temporal systems within the domain
model.
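The clusivity relation just defined lends itself to a compact illustration. The sketch below assumes, for the sake of the example only, that a cognitive domain can be approximated by a closed time span on an arbitrary numeric scale with S at 0; the class `Domain`, the function `classify` and the sample spans are hypothetical and are not part of the model's formal apparatus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    """A cognitive temporal domain approximated as a closed time span.

    The numeric scale is arbitrary (an assumption of this sketch); only the
    position of S relative to the span matters.
    """
    name: str
    start: float
    end: float


def classify(domain, s=0.0):
    """Return 'P-domain' if the span includes S, otherwise 'D-domain'."""
    return "P-domain" if domain.start <= s <= domain.end else "D-domain"


contemporal = Domain("contemporal world", start=-1.0, end=1.0)
dissociated_past = Domain("dissociated past world", start=-10.0, end=-2.0)

for domain in (contemporal, dissociated_past):
    print(domain.name, "->", classify(domain))
```

Note that the classification depends only on whether S falls inside the span, not on how far an event within the domain lies from S, which is what distinguishes this notion of tense from a measure of remoteness along a single line.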
Figure 5. Correlation of cognitive worlds with three perspectives on time
To illustrate this model, we can consider the binary tense distinction in
English represented morphologically in the contrast opposing -ED and
Ø-marked verb forms. The Ø-marked English verb forms situate the event
in the P-domain. Although labeled a present tense, the Ø-form does
not necessarily denote coincidence with the time of speaking (however,
see, for example, Langacker 2001 for a vigorous argument that it does⁴).
Rather, the event may be construed in a number of ways other than
present within the domain. It may, for example, be construed as future or
as past (better known under the rubric historical present). Consider the
phrase we're dining out, progressive aspect in the Ø-tense form. One can
use this in any of the following:
(1) a. We're dining out. (response to a query via cell phone while at a
restaurant)
b. Tomorrow, we're dining out.
c. Yesterday, we're dining out, completely enjoying our evening,
when her phone buzzes.
d. We're dining out every night this month.
e. When we're dining out, I always have red wine.
Context and/or use of adverbials of time situate the event with respect to
the speech event. Certainly, there are specific constraints in English on
when the simple Ø-form can be used felicitously, for example, its
non-use for an on-going action at S (such as I work now), a Modern English
development (cf. Middle English al dares for drede 'all are cowering for
fear', Burrow and Turville-Petre 1992: 45). Our claim is simply that the
Ø-form situates the event somewhere in the P-domain, as illustrated by
the positions of the small diamonds (◊) in Figure 6a, in opposition to -ED.
The -ED form, in our view, indicates that the cognitive domain does
not include the deictic locus at S and, furthermore, is specifically past
(Fig. 6b). Note that we are not claiming that the only semantic function
of -ED in English is to mark past tense; that is only one of its functions. It
clearly has others, for example, marking irrealis (e.g., If I knew the
answer, I would tell you) or marking social distance or politeness (I wanted
to ask you about that picture). In its role as tense marker, however, it
has only past meaning. However, our model presents a framework
inviting a unified approach to both temporal and non-temporal uses of
morphology.
4. Cognitive Grammar and mental space models of tense and aspect
Having briefly outlined our model in the preceding section, we turn here
to a synoptic consideration of two other cognitive approaches that also
address tense and aspect. In his Cognitive Grammar approach,
Langacker (2000: 23) considers tense to be the primary grounding element
of a finite clause, which profiles a grounded instance of a [process] type
[i.e., a lexical verb]. Hence, tense is one kind of grounding
predication, whose function is to locate the clausal profile, i.e., the process
denoted by the verb in the finite clause, in relation to the ground (the time
of the speech event, the participants, and any immediate circumstances)
(Ibid. 220), which constitutes the locus of conception and viewpoint, and
is evoked implicitly as a point of reference. A verb profiles a complex
relation, a process, in which its evolution through time is salient (Ibid. 222).
As an example, for English, Langacker proposes two aspectual classes,
perfective and imperfective, construed essentially as bounded, as in
Figure 7a, or unbounded (Fig. 7b) within the immediate temporal scope
(IS), respectively.
Figure 6. Temporal worlds and S: English Past (-ED) vs Contemporal (Ø)
a. P-domain (contemporal) construals of Ø-marked forms [◊ = potential positions of events]
b. D-domain (past) indicating dissociation of cognitive space from S; marked by -ED
Figure 7. Perfective and imperfective processes in time (Langacker 2000: 224)
The processual profile, i.e., the grounded process, in English is specified
for location in time by either of two markers, Ø or -D, which denote that
the process is either proximal or distal to the ground. An example of
such a relationship, the past of a perfective verb, is schematized in Figure 8,
where the squiggly lines denote the time of the speech event.
This approach to tense and aspect, although adopting a conceptual
cognitive framework, differs little from those that we discussed
previously. Tense is construed and represented in terms of relations along a
unidimensional expanse of time. As we have stressed, such a simple view
of tense relations is inadequate to capture the multi-layered systems of
Bantu languages.
A richer cognitive model addressing tense and aspect is that originally
developed and propounded in Fauconnier (1985, 1997) and further ex-
panded in Cutrer (1994). According to Mental Spaces Theory (MST),
tense and mood provide the means for keeping track of the time and reality
status (epistemic distance in MST terms) of a configuration of mental
spaces built up in discourse. Essentially, then, they constitute a discourse
management tool. Mental spaces, in the theory, are partial and temporary
conceptual domains constructed during the process of discourse
(Fauconnier 1997; Evans and Green 2006). There are four different
kinds of space (Cutrer 1994: 71–73): (1) a Base space, which is always
in the present and contains the initial viewpoint from which events are
construed; (2) a Viewpoint space, essentially equivalent to the notion of
reference or vantage point, that space from which deictic relations are
determined; (3) a Focus space, which is where meaning is actively being
constructed (that space which an utterance is about); and (4) an Event
space, the temporal space in which the event encoded by the verb takes
place. The following brief narrative will make these concepts clearer.
(2) Anna is moving to Kansas. She has lived in Indiana for five years.
Yesterday, she rented a U-Haul truck.
An MST representation of the passage in (2) begins with a Base (B)
space, which is also the initial Viewpoint (V), Focus (F), and Event (E)
space, as in Figure 9. This space is interpreted by default as in the present,
as denoted by the verb form is moving. Note that there is much more to
Figure 8. Past of a perfective process (Langacker 2000: 225)
the structure of these spaces in MST; we have limited the information to
that which is relevant to tense and aspect representation.
In the second sentence, according to the principles of MST, the present
perfect functions to keep the Base space in focus while adding new infor-
mation relevant to the meaning structure being built in the Base. That is,
the event lived represents an event that is complete with respect to the
Base space and, hence, is accorded a new space (Fig. 10). Focus, how-
ever, remains in the Base (indicated by present has). Current relevance of
the perfect arises from the divergence of Event and Focus spaces, indicat-
ing that knowledge of the former has some relevance in the latter.
In the third sentence, the adverbial yesterday is considered to be a space
builder, hence, this sentence establishes a new space which is marked for
past (-D) with respect to the Viewpoint space, which is in the Base (Fig.
11). This new space is now also the Event space. What differentiates past
from present perfect is that the new space is also the Focus space.
Figure 9. Representation of Anna is moving to Kansas.
Figure 10. Representation of She has lived in Indiana for five years.
Figure 11. Representation of Yesterday, she rented a U-haul truck.
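The bookkeeping described for passage (2) can be traced in a few lines of Python. This is only an illustrative sketch and not part of the MST formalism: the space labels M1 and M2 and the function names are assumptions introduced here, and only the assignment of the four roles (Base, Viewpoint, Focus, Event) follows the description above.

```python
from typing import NamedTuple


class Config(NamedTuple):
    """Which mental space currently fills each of the four roles."""
    base: str
    viewpoint: str
    focus: str
    event: str


def simple_present(space: str = "B") -> Config:
    # Sentence 1: the Base is also Viewpoint, Focus, and Event space.
    return Config(space, space, space, space)


def present_perfect(cfg: Config, new_space: str) -> Config:
    # Sentence 2: the event gets a new space, but Focus stays where it was.
    return cfg._replace(event=new_space)


def simple_past(cfg: Config, new_space: str) -> Config:
    # Sentence 3: the new space is both Event and Focus; Viewpoint is unchanged.
    return cfg._replace(focus=new_space, event=new_space)


s1 = simple_present("B")        # Anna is moving to Kansas.
s2 = present_perfect(s1, "M1")  # She has lived in Indiana for five years.
s3 = simple_past(s2, "M2")      # Yesterday, she rented a U-Haul truck.

for label, cfg in (("S1", s1), ("S2", s2), ("S3", s3)):
    print(label, cfg)
```

In this sketch the present perfect and the past differ only in whether Focus moves to the new space, which is the contrast the passage draws between the second and third sentences.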
This short sample of MST representation of temporal relations in a text
illustrates the basic principle behind the theory: Base, Viewpoint, Focus,
and Event serve as general discourse organizers. Tense and aspect provide
information on the distribution, location, and configuration of these
organizing mental spaces. Tense comprises three categories (Past, Present,
Future) that either denote an already existing space or create a new one.
They function, therefore, as discourse links, connecting various spaces
cognitively. Aspect (Perfect, Progressive, Imperfective, Perfective)
provides information about the arrangement of Viewpoint and Focus
(Cutrer 1994: 100), but, unlike tense markers, does not put a space in Focus.
Crucially, with respect to our model,
[t]he tense-aspect categories characterized here are not representations of semantic
form, nor are they intended as language specific grammatical categories. But
rather, they are characterizations of conceptual discourse links, which operate at
the cognitive construction level, and which in the strongest possible claim, are
universal. Each tense-aspect category is a universal type of local link between
spaces, a local relationship which may be established between spaces as part of
the underlying cognitive structure. These discourse links are conceptual notions
which are separate from language, but which may be encoded by the grammatical
conventions of individual languages. (Cutrer 1994: 94)
We believe our model complements this approach. As Fauconnier (1997:
82) notes, "Languages differ . . . in the type of coding they adopt and
what they code." Thus, whereas MST focuses on tense-aspect in terms of
its conceptual linking of events in discourse, our focus is on the organiz-
ing principles of the tense-aspect system itself, what distinctions are made
within a system and how they relate to each other, in short, what gets
encoded. Consequently, we feel the two approaches may be combined
fruitfully to provide a global picture of how tense-aspect systems are or-
ganized and how they are used to manage discourse.
Concluding this brief excursus into other cognitive approaches, we re-
turn to a consideration of the issue of deixis in our approach.
5. Tense and other verbal deixis
Our view of temporal deixis in terms of dissociation is commensurate
with two other possible deictic verbal categories, realis and spatial
position, which denote whether the situated event is treated as real or
not, or as occurring in the immediate vicinity of the speech event or not.
In each case we can identify two domains (real vs. not real, here vs. not
here), one coincident, one dissociated, a distinction comparable, perhaps,
to Traugott's (1978) proximal-distal relation.
The speech event, then, can be considered to be grounded in the real,
the here, and the contemporal.⁵ We believe that this deictic dichotomy
between the extant 'real', 'here' and 'contemporal' and the displaced 'not
real', 'not here', or 'not contemporal' constitutes a second significant facet
of the organization of tense distinctions in cognitive space. For this
reason, we propose that cognitive space is divided into two distinct
conceptual domains for each of four contrasting deictic components: realis,
temporality (opposing a contemporal domain with a non-contemporal
(past) domain, on one hand, and a non-contemporal (future) one, on
the other), and spatial location, as in Table 1.⁶ In each case we can
contrast inclusion of the deictic center within the prevailing cognitive
world, for which we use the label P-domain, and dissociation, for which
we employ the label D-domain. More simply stated, we are proposing
that there are paired conceptual worlds, suggested by the oppositions
set out in Table 1, that natural human languages may choose to mark
grammatically.
A language may choose to mark none of these oppositions grammati-
cally, one, or more. Comparison of several disparate languages in Table
2, for example, shows that Norwegian morphologically marks not con-
temporal (past) with -ET, while Slave [Athapaskan, Hare dialect] (Rice
2000) marks not real and not contemporal (future) with -O-, while Tunen
Table 1. Verbal deixis
                    inclusive      dissociative
reality:            real           not real
temporality:        contemporal    not contemporal P (i.e., Cog domain prior to S)
                                   not contemporal F (i.e., Cog domain later than S)
spatial position:   here           not here
Table 2. Dissociative marking (morphological) in Norwegian, Slave, and Tunen
                                          Norwegian   Slave   Tunen
reality:       not real                   n.m.        -O-     n.m.
temporality:   not contemporal P(ast)     -ET         n.m.    ls
               not contemporal F(uture)   n.m.        -O-     jo
space:         not here (away)            n.m.        n.m.    ka
[N.M. = not marked; indicates no overt morphological marking]
[Bantu, Cameroon] (Dugast 1971) marks not contemporal (Past) with ls,
not contemporal (Future) with jo, and not here with ka.⁷
What is relevant and significant here is that markers for one deictic
relation may come to be used for one of the other relations. For example,
Botne (2003a: 396–97) shows that in Chindali (Bantu, Malawi) an
itive marker -ka-, indicating an action occurring at a distance from
the deictic center, developed an additional role as a future tense marker.
Parallel to this, a remote past marker, also -ka-, added the function of ir-
realis marker. Consequently, Chindali marks all of the deictic contrasts
with the same morpheme -ka-. This expansion of functions from one
deictic function to another is also found in English -ED, which began
as a marker of past tense, but came to be used as well for irrealis.
The inclusive vs. dissociative distinction, therefore, constitutes an important
cognitive opposition that unifies these and, perhaps, other such
contrasts.
This concept of dissociation is not a new idea. Seiler (1971) appears to
have been the first to use the concept as a feature in his analysis of the
preterit in Greek. Steele (1975) adopts it in her analysis of irrealis and
past in the reconstruction of Proto-Uto-Aztecan, while Traugott (1978)
implies it in her proximal-distal distinction of tense relations. However,
James (1982) and Fleischman (1989) argue against Steele's use, preferring
instead to retain the temporal meaning 'past' as a basic, fundamental
notion from which an irrealis reading is derived. More recently, Cutrer
(1994: 184) notes that ". . . temporal distance extends to express
non-actuality or non probability." Similarly, Taylor (2002: 395) states that
". . . the past tense presents a situation as located distant from the
ground, whether this be distance in time or distance in reality." We do
not dispute that irrealis use may derive from past use. What we are
proposing here, however, and what differs from previous proposals, is that
dissociation involves potentially all deictic phenomena related to
linguistic specification of event occurrence and that it constitutes a fundamental
organizing principle not only of tense phenomena, but also of related
verbal deictic phenomena. Furthermore, temporally it is not simply a
separation of past from present. As our data and analyses will
demonstrate, there are potentially several kinds of past or future reference that
can be differentiated: one or more that fall within the P-domain and one
or more which are dissociative and, hence, fall within a (past or future)
D-domain.
As further exemplification of the essential concepts we have set forth
here, consider briefly the case of Nugunu (A.62), a Bantu language
spoken in Cameroon, whose basic TAM system provides a concrete illus-
tration of these relationships in a complex system. There are eight verb
conjugations in Nugunu, as shown in (3) (Gerhardt 1989). In addition
to the pre-verbal temporal marker, the verb in some cases acquires a
H(igh) tone on non-initial syllables (for example, go fyga reveiller
[wake up] > fyga in the remote past (P3)).
(3) Nugunu verb conjugations (Gerhardt 1989; Orwig 1991 [ex. in (e)])
a. P3   matoa ma mba mo bombana gala
        voiture elle P3 le cognerH avant-hier
        la voiture l'a cogné avant-hier
        [the car struck him the day before yesterday]
b. P2   a a ds maa⁸ nts ms yshs iyo
        il P2 défricherH champs son hier
        il a défriché son champs hier
        [he cleared his field yesterday]
c. P1   a baa fyga tulubu
        il P1 réveiller tôt
        il s'est réveillé tôt (ce matin)
        [he woke up early (this morning)]
d. RSL  go a g l ok d ba tsbsn
        tu RSL prendreH femme de ton_frère
        tu as pris la femme de ton frère (et tu l'as encore)
        [you've taken the wife of your brother (and you still have her)]
        go a fy ga
        tu RSL réveillerH
        tu t'es réveillé
        [you've awakened]
e. Pr   a d mba
        3S leave
        he's leaving/about to leave
   IMPF a duenene (< due 'sell')
        3S sell.IMPF
        he is selling/will sell
f. F1   ds gaa miee noni yssys
        nous F1 enterrerH aujourd'hui ceci
        nous allons l'enterrer aujourd'hui
        [we are going to bury her/him today]
g. F2   a na bola
        il F2 arriver
        il arrivera [demain/dans quelques jours]
        [he will arrive (tomorrow/in a few days)]
h. F3   a nga foaga nya ja heeni
        il F3 construireH maison là-bas
        il construira une maison là-bas
        [he will build a house over there]
Time reference:
P3 before yesterday
P2 preceding relevant time unit (e.g., yesterday, last month, etc.)
P1 earlier today
RSL resultative
F1 today or tomorrow [but with adverb can be used for more distant time]
F2 1 or 2 days after tomorrow [later if certain]
F3 >2 days
The remote tenses, P3 and F3, comprise an initial nasal segment, hence,
m.ba and n.ga. The near tenses, P1 and F1, are decomposable as ba-a and
ga-a, respectively. The initial CV element can be observed alone in certain
relative clauses, where only the initial morpheme appears, as illustrated
by the P1 example in (4a), followed by what Orwig (1991: 151) terms a
dependent marker (DEP), -na-. Note that the P2 and F2 tense markers
do not change in form, as shown for P2 in (4b).
(4) Nugunu P1 and P2 in relative clauses (Orwig 1991)
a. P1 a baa ja a baa go gue, gscamsna gssgs m ba na bola
he P1 do he P1 INF die time which I P1 DEP arrive
he had already died when I arrived
b. P2 aja msss ma a na hume, m bss sda naa nyony
when mass it P2 DEP let_out I NAR go to market
when mass let out, I went to the market
The time denoted by P2 varies according to context, but always de-
notes the relevant time unit preceding the temporal locus, for example,
yesterday (if the locus is today), last month, last year. Consequently, the
temporal denotation overlaps that of P3 in time. In the model we are
proposing, this is readily accounted for: the two denote different perspectives
of the timeline; P3 situates an event in a D-domain, P2 in an anterior time
unit of the P-domain. The temporal markers of Nugunu are summarized
in Table 3 below.
A salient feature of this set is the parallel nature of the morphemes for
past and non-past, morphologically similar in both segmental form and
tonal marking. The remote tenses are both N.CV and low-toned, the
P2/F2 tenses mono-morphemic and high-toned, the P1/F1 tenses CV-a
with reversed H and L pattern. Given the separability of the final -a and
the predictability of the tones, we can analyze the -a as the same element.
The identity in form (i.e., a) of P2 and RSL makes it tempting to analyze
them as the same morpheme, as Gerhardt (1989: 321) does. Historically,
the P2 use undoubtedly arose from the resultative (RSL) use, which de-
notes a post-Nucleus resultant state of an event (Fig. 12) that has oc-
curred at some time prior to the moment of speaking; this state continues
to exist at the speech locus S, as illustrated in Figure 13a. Because the
state is denoted as current at S, adverbials denoting the time of the event
E cannot be used, as Gerhardt notes.
P2, on the other hand (illustrated in Figure 13b), denotes the temporal
relation of the event proper (E) with respect to S. In this case, it situates E
in that time unit immediately anterior to the current time unit. The rele-
vant temporal unit may be a natural time unit such as yesterday (or
Table 3. Nugunu tense markers
         Tense marking
P3       m-ba      n-ga      F3
P2       a         na        F2
P1       ba-a      ga-a      F1
RSL      a         (-an)     Pr (Impf)
last month, last year), or a societal time unit, such as that of a ruler's
reign, as in (5). In this use, a temporal adverbial such as iyo 'yesterday' is
appropriate.
(5) ofuje yunu yo indenyee gsdj nyoma ss d   (Gerhardt 1989: 321)
    chef ce-là 3S P2 diriger.P2 village an dix
    ce chef-là dirigea le village dix ans
    (sous-entendu, c'est le prédécesseur de celui qui règne maintenant)
    [that chief ruled the village for ten years
    (understood that it is the predecessor of the one who rules now)]
The question we pose here is why there should be such regularity in the
patterning of form and meaning. We propose that our domain model
provides a principled and motivated answer. The schema in Figure 14 il-
lustrates the Nugunu tenses in the model we have laid out. The tense
markers can be sub-divided into two sets based on their formal and se-
mantic characteristics, one that patterns along the moving-Ego timeline
through the P-domain, one along the Ego-moving timeline across the
Figure 12. Event structure and extension (post-Nucleus result phase)
a. a as Resultative (RSL) marker
b. a as Past P2 marking Anterior time unit
[Anterior (AnTU) and Current (CTU) time units]
Figure 13. Resultative (RSL) and Past P2 interpretations of a
temporal landscape connecting domains. We have analyzed each tense
marker as comprising two elements; the P2 and F2 have ∅ marking
where the P1 and F1 have final -a. Those markers that correlate with the
moving-Event timeline (and, hence, the P-domain) have a final -a when
marking the current time unit (i.e., within the bold quadrangle), zero-
marking when specifying the adjacent time units. Those correlated with
the Ego-moving timeline have an initial nasal element. Close observation
shows that the tones also pattern regularly.
What evidence is there in Nugunu to support the claim that the perceived
flow of time through the P-domain is toward the future, i.e., a
moving-Ego conceptualization? Both the resultative (RSL, see (3d)) and
the present imperfective (Pr Impf, see (3e)) foster this interpretation. The
RSL does not denote a retrospective view (]E X) of the event it marks,
but rather a continuous, on-going view of the result state at the time Ego
speaks (]E---Xd; Figure 15). This is supported by the fact that use of this
form does not permit a past time adverbial that would situate the time of
the event itself.
The present imperfective (marked by -an- or -anan- or a phonological
variant) denotes an unbounded temporal interval which is construed
Figure 14. Organization of tense markers in Nugunu
Figure 15. Resultative (RSL) ]E---Xd
either as time contained within the event (an internal view of the event
E) (Fig. 16a) or as time containing the event (an external view that
invites a future interpretation) (Fig. 16b). In this case, Ego's perspective
is toward the endpoint (]E) of the event, hence, Ego is construed as
embedded in the time matrix moving forward through the interval into
the future.
Further justification for this analysis of the organization of tense
markers comes from two sources. First, the F2 and F3 futures do not
reflect simply a difference in remoteness. They differ in the degree of
certainty associated with each. The F2 future typically denotes a time
tomorrow or the day after. However, it may also denote a more distant time if
the speaker is certain. The F3 future typically denotes a time a few days
away, but may also be used to indicate uncertainty on the speaker's part.
This epistemic difference is captured nicely in the dissociation of the
future D-domain, and unifies temporal and epistemic meaning.
Second, certain types of dependent (DEP) clause marking differ
according to domain (see Orwig 1991). The examples in (4) above illustrate
this for P1 and P2, where the marker is na. In fact, it is na for all of the
P-domain markers, except for Pr and F2, which require a suffix -mo on the
verb. (These -mo forms appear to be an alternative to use of na in order
to avoid a present tense that would look identical in form to F2 and a na
na sequence that would result in F2.) On the other hand, the two remote
tenses, P3 and F3, both attach a final -a, as shown in (6) (Orwig 1991:
158). Thus, the two D-domains mark dependency differently from the
P-domain.
(6) a. P3 a mba ja a baa go due fsa sshs, gecamsna gssgs m mba-a bola
       he P3 do he P1 INF sell avocados her time which I P3-DEP arrive
       he had already sold her avocados when I arrived
    b. F3 nobola no jga ja no baa go naaa, gecamsna gssgs o jga-a gulu
       rain it F3 do it P1 INF fall time which you F3-DEP return
       the rain will already have fallen when you return
a. Internal imperfective view of time in E (be V-ing)
b. External imperfective view of time containing E (will V)
Figure 16. Pr Impf Xd]E
The sentences in (6) also show that the P1 marker is not an absolute
tense, but a relative one: specifically, it denotes a past event considered as
a completed whole occurring within the temporal domain of the particular
reference time. The futures behave in the same way, as shown for F1 in (7).
(7) Kunuu a mba ls i yimene, gojaa a gss sda
Tortoise he P3 be he know that he F1 go
naa Makoa [Orwig 1991: 160]
to Makoa
Tortoise knew that he would go to Makoa (later that day)
The Nugunu tense system, then, can be analyzed as having the catego-
ries shown in Table 4, lines indicating which forms may cooccur.
The morphemes |ba| and |ga| invariably denote past and future, respec-
tively, whether situated in the P- or D-domains, which particular domain
being dependent on whether they co-occur with N- or not.
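The two regularities summarized in this section, the reading of |ba| and |ga| with or without N-, and the domain-sensitive dependent marking, can be restated as a small lookup. The following is a minimal sketch under stated assumptions: the function names, the hyphen-segmented input strings (taken from Table 3), and the return codes are introduced here for illustration, and tone is ignored entirely.

```python
from typing import Tuple

# Domain membership as described in the text: the two remote tenses are
# dissociative (D-domain); the remaining forms belong to the P-domain.
D_DOMAIN = {"P3", "F3"}
P_DOMAIN = {"P2", "P1", "RSL", "Pr", "F1", "F2"}


def analyze(marker: str) -> Tuple[str, str]:
    """Read domain and past/future off a segmented marker such as 'm-ba'."""
    parts = marker.split("-")
    # An initial nasal (written m or n in Table 3) signals the D-domain.
    domain = "D" if parts[0] in {"m", "n"} else "P"
    if "ba" in parts:
        return domain, "past"
    if "ga" in parts:
        return domain, "future"
    raise ValueError(f"no |ba| or |ga| element in {marker!r}")


def dependent_marking(tense: str) -> str:
    """Return the dependent-clause marking reported for a tense label."""
    if tense in D_DOMAIN:
        return "-a"      # final -a attached to the tense marker
    if tense in {"Pr", "F2"}:
        return "-mo"     # suffix on the verb instead of na
    if tense in P_DOMAIN:
        return "na"      # dependent marker following the tense marker
    raise ValueError(f"unknown tense label {tense!r}")


for label, marker in {"P3": "m-ba", "P1": "ba-a", "F3": "n-ga", "F1": "ga-a"}.items():
    print(label, analyze(marker), dependent_marking(label))
```

The sketch covers only the four markers that contain |ba| or |ga|; the mono-morphemic P2 and F2 markers are handled by the dependency lookup but not by `analyze`.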
6. Resultatives and perfects: differences between Nugunu and English
What we have labeled the Resultative (RSL) in Nugunu (marked by a)
has a translation in the English Present Perfect, as noted above in (3d).
We have also observed that P1 (marked by baa) may be translated by
the English Past Perfect (4a and 6a). Clearly, there is not a
straightforward correspondence between the Nugunu and English forms. How to
reconcile this apparent lack of correspondence within the framework of
our model?
As we showed in Figure 12, Resultative a denotes a continuing post-
Nucleus state brought about by some event E that occurred in the past.
Table 4. Nugunu verbal categories
Domain      Past/Future      Dependency
N.          a, ba, ga, na    a, na, (V)-mo
[The lines connecting forms that may cooccur are not reproduced here.]
That is, it denotes the extension of the results of E into the present.
Because of this phasal focus, we can consider the Resultative to be a kind
of Aktionsart, marked only in the P-domain. From this developed the
P2 anterior time unit sense. In order to differentiate this latter kind of
domain-internal relation from the cross-domain relations of tense proper,
we adopt the term tenor to refer to the different temporal relations
(distinct from aspect and Aktionsart) marked in the P-domain.⁹ Consonant
with this distinction, henceforth, the terms past and future will be used
to refer to tense relations (i.e., cross-domain temporal relations), Mpast
and Mfuture for tenor (domain-internal) relations along the Time-moving
dimension.
How, then, do the Nugunu Resultative and the English Perfect
compare? Both exhibit a moving-Ego time dimension, while neither permits
co-occurrence with a time adverbial denoting a past time. In both the
event E may have just occurred, though this is not directly denoted by
the construction itself, but determined in context. They differ in that the
Nugunu Resultative denotes a post-Nucleus phase to the event, while the
English Perfect denotes a temporal template (or overlay) imposed on
the event schema (Fig. 17). The imposed overlay establishes a point of
assessment (PoA) that acts as a new reference point (R2) that is, itself,
situated temporally with respect to S, while the event proper is interpreted
temporally with respect to R2. It is this point of assessment from which
the duration of time elapsed from the onset phase of the event (E[) (as in
Figures 17b–d) is determined. In the present perfect, R2 is coincident
with S as a default reading. The Nugunu Resultative lacks not only the
temporal overlay, but also the point-of-assessment from R2 of the English
Perfect. Consequently, it does not permit specification of duration
of time since the event occurred; rather, it only denotes a position in the
extension (the result phase) of the event.
Although many linguists consider the perfect to be an aspect, we
follow Bybee (1985: 160) here in considering it to have primarily
features of tense (tenor, in our model); it appears aspectual because it
introduces a point-of-assessment (R2) that may be situated inside the
event boundaries, as in (Fig. 17b). It is tense-like in that it relates the
time of the event to the time of assessment, which itself is related to
the time of reference. Current relevance at S is an invited interpretation
from the extension of the temporal overlay to a point of coincidence (of
R2) with S.
Consider now the past perfect use of Nugunu baa. In simple usage,
P1 baa denotes a hodiernal past, as shown in (8). However, it may be
combined in a complex construction with auxiliary ja 'do', which may
occur with any of the tense/tenor markers, for example, as with P1 baa
(9a) and P3 mba (9b) (Orwig 1991: 158).¹⁰ Although these examples
have been translated with the past perfect, they denote a past-in-the-
past or ante-past reading rather than a perfect reading. The English
Perfect may have either a past perfect or an ante-past interpretation,
as in (10).
(8) P1 a baa bola na gsys ns
he P1 arrive with morning
he arrived this morning
(9) a. P1P1 a baa ja a baa go gue, gscamsna gssgs m ba
he P1 do he P1 INF die time which I P1
na bola
DEP arrive
he had already died when I arrived
a. Temporal overlay (bold) on Event structure.
b. Dave has lectured for 40 minutes; he'll wrap up shortly.
c. He has lectured at IU since 1985.
d. He has lectured at IU. [experiential]
Figure 17. Interpretations of the English Perfect (moving-Ego time matrix)
b. P3P1 a mba ja a baa go due fsa ss hs, gecamsna
he P3 do he P1 INF sell avocados her time
gssgs m mba-a bola
which I P3-DEP arrive
he had already sold her avocados when I arrived
(10) a. The Corps had built a bridge across the river the   [past perfect]
        previous year, but Spring flooding destroyed it
        last year.                                           [past]
     b. The Corps had built a bridge across the river the   [antepast perfect]
        previous year, but Spring flooding had destroyed
        it before the road opened.                           [antepast]
Thus, P1 denotes a past tenor relation (i.e., within the same domain) with
respect to some specified reference locus. When that reference locus is not
S, its temporal location is indexed by temporal marking on the auxiliary
verb ja 'do', either P1 or P3, a schema of the latter provided in Figure 18.
The difference between Nugunu and English lies in the nature of the
temporal marking within the domains. In the P-domain, Nugunu
morphologically marks either tenor, i.e., proximity (current or anterior time
unit), or result state of a past event. In contrast, English does not mark
domain-internal temporal relations morphologically. Zero-marked verb
forms do not specifically situate the event with respect to S, though the
default interpretation is time of speaking (S). Rather, these non-past
forms can be used for any time, but only when temporal position is
specified by the appropriate temporal adverb. Thus, either the historical
R2 indexed by ja; temporally situated by mba (P3)
E (sell ) temporally situated by baa (P1)
Figure 18. Past-in-the-past (or ante-past) in Nugunu
present "yesterday he tells the chief he's taking the day off, and the chief
fires him" or the future "tomorrow he tells the chief he's quitting" is
acceptable. The periphrastic present perfect imposes over the event structure a
temporal overlay whose salient endpoint establishes a point of assessment
and a new point of orientation from some contextually identifiable locus
(the default being S), from which magnitude (= duration) and/or past
existence may be assessed; hence, one may indicate duration of the interval
or the time at which the event began (e.g., since 2 o'clock). Thus, while in
Nugunu one can indicate the time period (this morning, yesterday, etc.),
in English the perfect permits only adverbials of duration or point of
origination.
R2 indexed by have; temporally situated by -D
E (built) temporally situated by -N (i.e., the past participial form)
a. Perfect in the past
b. Perfect in the antepast
Figure 19. Perfect and Past-in-the-past (or ante-past) in English
In this paper, we present a general picture of tense, tenor, and dissocia-
tion, adducing from a variety of Bantu languages seemingly curious
evidence of different kinds that doesn't find a satisfactory analysis in a
simple one-dimensional linear approach. These cases, we propose, sup-
port the view of a bipartite mental model of deictic relations. Before turn-
ing to consideration of curiosities in individual languages, we add a
brief note on aspect in our approach.
7. Aspect and tenor in the domain approach
Aspect denotes the particular temporal view of time in the narrated event.
More precisely, a specific aspect denotes a particular temporal phase of
the narrated event as the focal frame for viewing the event. This focal
frame depicts the status of the event in relation to the vantage point deter-
mined by Ego, by default typically the moment of speaking. Tenor, on
the other hand, situates the event at some location in time in relation to
a reference point. We have already addressed differences between
resultatives and perfects in the last section. Here we present a brief sketch of
some aspect and tenor marking in Kilega (D.25), a Bantu language spo-
ken in eastern Democratic Republic of Congo, in order to illustrate other
types of aspect and tenor in the P-domain.
Kilega, like most Bantu languages, exhibits a complex set of tense,
tenor, and aspectual marking. Data here are drawn from Botne (2003b and
field notes).
From the vantage point of the moment of speaking, Ego may adopt
any of three aspectual views of the event: inceptive, continuative,
culminative, marked by the prefixes -sa-, -ku-, and -a-, respectively. Because
aspect interacts differently with the different inherent lexical aspect of verbs,
we illustrate below three cases, with the activity verb -kangula 'clear', the
inchoative transitional achievement verb -zombama 'be(come) hot', and
the inceptive achievement verb -boboka 'soften; become soft'.
With the activity verb -kangula, the aspects denote views of three
different phases of the event, the beginning (or Onset), the core (or Nucleus), or
the culmination (or Coda), as exemplified by the sentences in (11) and the
diagrammatic representation in Figure 20.
(11) a. a-sa-kangula i swa¹¹   he has started clearing the field
     b. a-ku-kangula i swa     he is clearing the field
     c. a-(a-)kangula i swa    he has just cleared the field
Not all events are activities. Achievements differ in that they have a
punctual core (Nucleus), and may have different configurations of onset
and coda phases. Two such verbs are illustrated below. The inchoative
transitional achievement verb -shika 'be(come) hot', as exemplified in
(12), comprises both an onset coming-to-be phase as well as a stative
coda phase. As with the activity verb, the inceptive and culminative as-
pects indicate a focal frame just before or just following the nucleus of
the event, as in Figure 21. The continuative -ku-, however, denotes time
in the stative coda phase, rather than in the nucleus phase. That happens
for two reasons: (1) the nucleus is a point, so it cannot be continuative; (2)
the stative coda constitutes the semantic essence of the event. Hence, we
can say that continuative -ku- depicts an interval in the semantic core of
the event, which may or may not be the nucleus.
(12) a. i dya li-sa-sh ka the food is beginning to get hot
b. i dya ly-a-sh ka the food has just become hot
c. i dya li-ku-sh ka the food is very hot
The inceptive achievement verb -boboka 'soften' (13) differs from -shika
in not encoding a coda phase; rather, it encodes an onset and punctual
nucleus (Fig. 22). Again, the three aspects may apply, with the inceptive
and culminative aspects denoting, as expected, inception and culmination
with respect to the nucleus of softening. However, the semantic essence
in this verb is the onset phase, which is denoted by continuative -ku-.
(13) a. by na bi-ku-boboka the fish [are showing signs that they] will
soften [as they cook]
b. by na bi-sa-boboka the fish are almost [beginning to be] soft
c. by na by-a-boboka the fish have softened [and are ready to
eat]
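The phase analysis of the three Kilega verbs above can be restated as a small lookup. The following Python sketch is only an illustration and not part of the original analysis: the class name Verb, its field names, and the function focal_frame are assumptions introduced here, and the verb stems are written without tone marks. It records which phases each verb encodes and which phase carries its semantic essence, and returns the interval that each aspect picks out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Verb:
    stem: str
    phases: tuple  # phases encoded by the verb, in temporal order
    core: str      # phase that carries the verb's semantic essence

# Entries follow the prose descriptions of (11)-(13); names are illustrative.
KANGULA = Verb("-kangula", ("onset", "nucleus", "coda"), core="nucleus")
SHIKA = Verb("-shika", ("onset", "nucleus", "coda"), core="coda")
BOBOKA = Verb("-boboka", ("onset", "nucleus"), core="onset")

# Aspect markers as they appear in the examples (segmental shape only).
ASPECT_MARKERS = {"inceptive": "-sa-", "continuative": "-ku-", "culminative": "-a-"}

def focal_frame(verb: Verb, aspect: str) -> str:
    """Return the interval that the given aspect focuses on for this verb."""
    if verb.core not in verb.phases:
        raise ValueError(f"{verb.stem}: core phase must be one of its phases")
    if aspect == "inceptive":
        return "just before nucleus"
    if aspect == "culminative":
        return "just after nucleus"
    if aspect == "continuative":
        # The continuative targets the semantic core, whichever phase that is.
        return verb.core
    raise ValueError(f"unknown aspect: {aspect}")
```

Under these assumptions, focal_frame(SHIKA, "continuative") gives "coda" and focal_frame(BOBOKA, "continuative") gives "onset", which corresponds to the readings in (12c) and (13a).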
Figure 20. Focal phases of -kangula denoted by aspect marking
Figure 21. Focal phases of -shika 'be(come) hot' by aspect marking
The three aspectual markers constitute a set depicting different views
of the event at the moment of speaking S, schematized for the different
views of an activity verb in Figure 23.
These aspects may have as their vantage point some time other than S.
In such cases, a complex construction (a form of the auxiliary verb 'be'
plus the aspect-marked form of the verb) is used. 'Be' indexes the new
vantage point, determined according to tenor marking in the P-domain,
as shown in Figure 24, or by tense marking. Note that the completive
and continuous aspect constructions behave as tenor markers in the current
time unit in opposition to those that mark the anterior and posterior
time units.
(14) a. tw-a-bez-ag-ile tu-ku-kangula i swa
1P-P2-be-IMPF-P2 1P-CONT-clear field
we had been clearing the field [at that time] (yesterday)
b. tw-a-b-e tu-sa-kangula i swa
1P-F2-b-F2 1P-INCEP-clear field
we will have started clearing the field [at that time] (tomorrow)
c. tw-a-bez-ag-a tw-a-kangula i swa
1P-P1-be-IMPF-P1 1P-COMP-clear field
we had cleared the field [at that time] (earlier today)
Figure 22. Focal phases of -boboka 'soften' by aspect marking
Figure 23. Three views of an activity verb at S (P-domain)
Figure 24. Tenor marking in the Kilega P-domain
As the Kilega data suggest, certain forms may be perceived as more or
less denoting aspect or tenor within a TMA system. Data from other
languages in our exposition will illustrate further differences between the
two.
Having first fleshed out our model, we turn now to our goal in this
paper: to illustrate various kinds of evidence from Bantu and Bantoid
languages that support this concept of separate cognitive domains. The
kinds of evidence that we focus on here come from unexpected and
inadequately explained curiosities in the data.
8. Curiosity #1: Use of the remote past in Basaa (A.43; Cameroon)
The common approach to Bantu language tenses, as for most languages,
is to map tense markings to appropriate intervals of a timeline (cf., for
example, Nurse and Muzale 1999 for Ruhaya and other lacustrine Bantu
languages; Maganga and Schadeberg 1992 for Kinyamwezi; Taylor 1985
for Runyankore/Rukiga, among others). We have done this for some
simple tense markings in Basaa, labeling for convenience each form P1,
P2 and so forth, as it seemingly situates events farther and farther from
the tense locus, here the speech event (S) (see Fig. 25 and examples in
(15)). While this approach seems intuitively sensible at first glance, it
nevertheless fails to account for the data in any satisfying way or provide any
insight into how Basaa speakers organize and conceptualize event space.
(15) a. P1 a n-sebel juu she called last night
b. P2 a b -sebel snd ntagbs she called last week
c. P3 a -w gwet b 14 she died in the war of (19)14 (WWI)
a -pam aj lsn she went out ages ago today
then today
d. Pr/F1 a n-temb kokoa she is returning this evening
e. F2 a ga-masak jw i nl she will dance next year
f. F3 nsajgw a-a jkj s ksl yada there will be peace in the world one day
a a-ks ha lsn she will leave later today
there today
Figure 25. Tenses in Basaa (Mbom 1996; Hyman 2003)
Time reference (general):
P3 remote past
P2 yesterday or earlier
P1 earlier today
Pr/F1 present or future today
F2 tomorrow or later
F3 remote future
The particular curiosity that we focus on here, illustrated by the second
sentence under each of (15c) and (15f), is the use of the remote past P3
and remote future F3 with the adverbial len 'today', which, apparently,
cannot be done with either P2 or F2. Naturally, one has to wonder why this
should be the case. We propose that the P3 and F3 markers, unlike the P2
or F2 markers, situate the event in a D-domain, as schematized in Figure 26.
In Basaa, this represents a subjective sense of distance or separation of
the event with respect to the speech event; hence, it can be used to refer
not only to temporally distant events, but also to temporally proximate
ones, which are subjectively construed as remote, or 'dissociated' in our
terms. The other verbal morphemes denote temporal divisions within the
P-domain, as illustrated. The N- morpheme denotes an event within the
current time unit (CTU), with tone marking determining 'before' or 'after'
S (for example, 'today', 'this month', etc.); P2 b - and F2 ga- denote
adjacent time units (anterior or posterior to the CTU), for example,
'yesterday', 'last month', or 'tomorrow', 'next year', respectively. P2 and
F2 cannot be used with len 'today' because they encode a time period that
is NOT the current time unit.
Figure 26. P- and D-domains in Basaa
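The compatibility restriction described above can be stated as a simple rule over the proposed domains. The Python sketch below is one possible formalization, not the authors' own: the dictionary BASAA_TENSES, its keys, and the function allows_today are names assumed here for illustration. Each tense label is assigned a domain (P or D) and, within the P-domain, a time unit; a tense is predicted to combine with the adverbial glossed 'today' if it is either dissociated (D-domain) or anchored in the current time unit.

```python
# Hypothetical encoding of the Basaa analysis; labels follow the text.
BASAA_TENSES = {
    "P3":    {"domain": "D", "unit": None},          # past D-domain
    "P2":    {"domain": "P", "unit": "contiguous"},  # anterior to the CTU
    "P1":    {"domain": "P", "unit": "current"},     # CTU, before S
    "Pr/F1": {"domain": "P", "unit": "current"},     # CTU, at or after S
    "F2":    {"domain": "P", "unit": "contiguous"},  # posterior to the CTU
    "F3":    {"domain": "D", "unit": None},          # future D-domain
}

def allows_today(label: str) -> bool:
    """Predict whether a tense can co-occur with the adverbial 'today'."""
    tense = BASAA_TENSES[label]
    # D-domain tenses express subjective remoteness, so clock time is free;
    # P-domain tenses must be located in the current time unit.
    return tense["domain"] == "D" or tense["unit"] == "current"
```

On this encoding, allows_today("P3") and allows_today("F3") are true while allows_today("P2") and allows_today("F2") are false, which is the pattern reported for (15).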
Time in the P-domain is represented tentatively as moving-Ego, although
there is little evidence available. Most grammatical descriptions
appear to describe the Mbene variety of the language. Schürle (1912: 74)
notes, however, that the Bakoko variety uses a form that incorporates the
verb ks 'go' in the near future, as in mi n-ks l 'I will come' [lit. 'I F1-go
come']. Following the analyses with motion verbs set out in Botne (2006b),
we believe this use to be indicative of a moving-Ego timeline.
In a more conventional manner, we can represent the semantic organization
of the Basaa tense system as in Table 5. The (extended) contemporal
dimension comprises two cross-cutting concepts: (1) direction, earlier
(<S) or later (>S), with respect to the deictic anchor, i.e., the speech event
S, and (2) location with respect to the deictic anchor, either situated within
the same time unit or in the comparable adjacent/contiguous time unit.
This analysis of Basaa implies that there are, then, different kinds of
remoteness possible. In the P-domain, we find a measured remoteness in
terms of temporal proximity to the deictic center, within or outside of the
relevant current time unit. Projection of an event into a D-domain, on the
other hand, connotes a subjective separation and distance; the event is in
another world. This distinction, we feel, provides the basis for a more
nuanced analysis of the concept of remoteness. This issue will appear
again in several of the following cases.
9. Curiosity #2: Negation patterns in Tunen (A.44; Cameroon)
A second type of evidence comes from Tunen. Like Basaa, Tunen exhibits
multiple past and future tenses. Unlike Basaa, however, it is the negative
forms that are of direct interest, as they vary across the tenses. Hence,
examples of both affirmative and negative tenses are illustrated in the
sentences in (16)–(18).
Table 5. Organization of Basaa temporal markers
      Contemporal [P-domain]           Not contemporal
      Current TU     Contiguous TU
<S    n-             b -               [Past D-domain]
>S    n-             ga-               a- [Future D-domain]
(16) a. P4 msk ls wam a mon n [Dugast 1971: 182]
leopard P4 my child kill
the leopard killed my child
a′. wam a mon ata mts a ls ls na [Ibid.]
my child NEG one 1S NEG P4 be_sick
not one of my children was ever sick
b. P3 ba ka nekaka b lihani m`ss malsndolonum [Ibid. 180]
3P P3 meeting fix days seven
they set the meeting in seven days
b′. hiseli sa siana metana ta buss bomts [Ibid. 181]
antelope NEG lie_down hunger NEG day one
Antelope did not sleep hungry, not even for one day
c. P2 ms na nifu sambs o buana numwa [Ibid. 178]
1S P2 package put bed under
I put the package under the bed
c′. o sa miajo sin [Ibid. 179]
2S NEG me see
you did not see me
d. P1 ms no mokolo nk [Ibid. 176]
1S P1 foot break
I broke my foot (just a moment ago)
d′. same negative form as P2 [Ibid. 194]
(17) a. Aorist a miajo mona bwansn [Ibid. 172]
3S me child carry.for
he carries the child for me
a′. msss ls bslabsnia b bsndo ns [Ibid. 173]
chimpanzees NEG food of humans eat
chimpanzees don't eat human food
b. Present ba ndo efs nys [Ibid. 176]
3P Pr maize_porridge
they are eating maize-porridge
b′. a ls ndo buoli nyo [Ibid.]
3S NEG Pr work work
he is not working
(18) a. F1 ms ndo bua buh na sabon-ak [Ibid. 183]
1S F1 your debt pay-F1
I will pay your debt [today]
a′. nij a miajoa, ms sa noye kia ton [Ibid. 185]
save me 1S NEG in_that_manner do anymore
save me, I won't act like that anymore
b. F2 o na sabon imw nyi na many kul emts [Ibid.]
2S F2 pay goat and medicines time one
you will pay (for) the goat and medicines at the same time
b′. o kal o ss? yam mila sa ta [Ibid. 186]
2S explain 2S say my palm_nuts NEG produce_much
you said that my palm nuts would not produce much [oil]
c. F3 ms jo ndasa bulila [Ibid. 187]
1S F3 VEN.come tomorrow
I will come tomorrow
c′. ms so jo ajo mima f`alabi [Ibid. 188]
1S F3 NEG 2S hut build.CAUS
I will not have a hut built for you
These sentences illustrate the general use of the different tense markers.
As one might expect, the markers change form from one tense to another,
with the sole exception of F1, which is the Present plus the suffix -Vk
(whose vowel harmonizes with the root vowel). The general time reference
of each tense, together with its affirmative and negative markers, is listed
in Table 6. The curious facts in Tunen are (1) why some tenses form their
negatives with a form of sa while others do so with ls or so, and (2) why the
present and F1, which have the same affirmative marker ndo, have different
negative marking. Moreover, note that the ls and so negatives are
added to the affirmative tense marker, while sa negatives replace the
corresponding tense marker.
Table 6. Affirmative and negative tense markers in Tunen (Dugast 1971)
Tense     Time reference                Affirmative T   Negative T
P4        distant, time not precise     ls              ls ls
P3        pre-hodiernal                 ka              sa
P2        earlier today; stories        na              sa
P1        immediate past                no              [same as P2]
Aorist    general present               ∅               ls
Present   in midst of E at S            ndo             ls ndo
F1        hodiernal                     ndo -Vk         sa
F2        time not precise, certain     na              sa
F3        tomorrow or later, certain    jo              so jo
At first glance, the forms and distribution of the negative markers seem
almost arbitrary; sometimes a variant of sa, sometimes ls, and in one case
so, with no apparent motivation for the distribution. For example, why
should P4 and Pr tenses have ls, but P3 and P2 forms of sa? The distribution
assumes a definite pattern when we consider the organization of these
markers in terms of P- and D-domains. Our proposed analysis of the
organization and distribution of the affirmative tense markers, based on
their form and semantics, is shown in Figure 27. The timeline in the
P-domain is presented as moving-Event, based on the premise that P3 ka
was derived from the itive (movement away) marker ka. It is temporally
sub-divided by multiple verb markers; the P4 and F3 markers, however,
situate events in different D-domains, either not contemporal (past) or
not contemporal (future), respectively.
It is the organization and distribution of negative markers, however,
that is of primary interest here. First, we find that all of the sa negatives
are situated along the moving-Event timeline (i.e., within the P-domain),
differing only in tone: pre-S before today a low tone, pre-S today a
high tone, post-S a rising tone (Fig. 28a). Mous (2003: 294–95) treats
each sa form as a distinct temporal morpheme because "nothing can
be gained in terms of economy of description by extracting a common
negative element sa . . .". We disagree; there are two pieces of semantic
information encoded in each unit: sa indicates negation, tone the time
negated. Hence, we conceive of each as a combination of two elements,
segmental sa plus a tone. Furthermore, these sa forms completely replace the
affirmative tense markers and are, consequently, the only indicators of
time.
Figure 27. Distribution of affirmative tense marking in Tunen
Second, the 'odd' negatives ls and so lie along the Ego-moving timeline,
ls for non-future, so for the future D-domains. Unlike the sa forms,
these co-occur with the corresponding affirmative tense marker.
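The two observations just made (sa replaces the tense marker, whereas ls and so are added to it) amount to a small procedure for deriving the negative marking in Table 6. The following Python sketch is an illustrative restatement under stated assumptions: the names AFFIRMATIVE, NEGATIVE_ELEMENT and negative_marking are introduced here, the marker strings are copied in the segmental shape in which they appear in this text (without tone marks or special characters), and the F1 suffix -Vk is not modeled.

```python
# Affirmative tense markers as listed in Table 6 (segmental shape only).
AFFIRMATIVE = {
    "P4": "ls", "P3": "ka", "P2": "na", "P1": "no",
    "Aorist": "",      # bare verb: no tense marker
    "Present": "ndo",
    "F1": "ndo",       # the suffix -Vk on the verb is not modeled here
    "F2": "na", "F3": "jo",
}

# Negative element selected by each tense, following Table 6.
NEGATIVE_ELEMENT = {
    "P3": "sa", "P2": "sa", "P1": "sa", "F1": "sa", "F2": "sa",
    "P4": "ls", "Aorist": "ls", "Present": "ls",
    "F3": "so",
}

def negative_marking(tense: str) -> str:
    """Derive the negative tense marking from the affirmative marker."""
    neg = NEGATIVE_ELEMENT[tense]
    if neg == "sa":
        # sa negatives replace the affirmative tense marker entirely.
        return neg
    # ls and so negatives precede and co-occur with the tense marker.
    return f"{neg} {AFFIRMATIVE[tense]}".strip()
```

For example, negative_marking("Present") yields "ls ndo" and negative_marking("F1") yields "sa", even though both tenses share the affirmative marker ndo; tonal differences among the sa forms are deliberately left out of this sketch.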
There are two present constructions, the Aorist (called Present indefini
by Dugast) and the Present (called the Present ponctuel by Dugast). In
both cases, ls is inserted before any tense marking. Since the Aorist form
is simply the bare verb, it has simply ls plus the verb in the negative. In
the present, ls occurs before the tense marking, hence, ls ndo (the high
tone on ls being contributed by ndo). The Aorist expresses a general
fact, habit, or situation "sans consideration . . . de position dans le temps
. . ." (Dugast 1971: 172). That is, it does not denote an event on-going
at S. Thus, the semantics of this form naturally correlate with the
Ego-moving timeline, which depicts time as static and unbounded. It is not
surprising, then, that the negative patterns the way it does. On the other
hand, the Present does denote an event on-going at S and so presents a
different situation.
a. Negatives along the moving-event timeline (P-domain)
b. Negatives along the Ego-moving timeline
Figure 28. Distribution of negative tense marking in Tunen
From the data examined so far, we can see that Pr and F1, both having
the tenor marker ndo, behave differently with the negative: F1 patterns
in the negative (i.e., manifests a form of sa) with other tenor forms
in the P-domain; Pr, like the Aorist, patterns with ls negatives outside
the P-domain. Since we naturally anticipate an on-going present to
pattern within the P-domain, why might the negative Pr pattern as it
does in Tunen? We propose the following. In the affirmative, the F1 ndo
Root-Vk construction clearly derives from the Pr; hence, both F1 and Pr
in the affirmative can be assumed to assert a fact about an event occurring
in the P-domain, one soon after S (F1), one on-going at S. Negating
F1 denotes that in the world that Ego perceives to exist at S (i.e., the
P-domain), it is not a possibility that E will occur in the near future.
Negating Pr, on the other hand, denotes that E is not real at the moment of
speaking S. Tunen speakers have, as we have observed, correlated this
fact with the use of ls, denoting 'not real at S'. That is, the present
affirmative asserts the reality of E at S, the negative the non-reality (or, to
put it another way, X may be doing something, but E isn't it, i.e., the
reality).
Support for this comes from the observation that the morpheme ls
is linked to the verbal deictic D-domain 'not real', in that it is found
in the negative of irrealis constructions (what Dugast (1971: 189–90)
labels 'la forme subjective' and Mous (2003: 297) 'optative'), which in the
affirmative have the form sp- -root, in the negative sp-ls-root. This
subjective form is used in expressing wishes, desires, intentions, and
questions. Tunen speakers have, then, negated Pr with the same negative
form associated with dissociated domains, past and, specifically, irrealis.
The analysis of negative forms in Tunen suggests an organization such
as that depicted in Table 7. What is clear from these data is that the dis-
tribution of negatives in Tunen is not arbitrary, but motivated, we claim,
by the contrast in perspectives on time coupled with the cognitive division
between P- and D-domains.
Table 7. Organization of negative marking in Tunen
      Contemporal       Not contemporal          Not real
      [P-domain]        [Temporal D-domains]     [Irrealis D-domain]
<S    sa (P3)           ls T (P4)
      sa (P2)
S                                                ls T (Pr)
>S    sa (F1 and F2)    so T (F3)
10. Curiosity #3: Differential implications with tense markers: Lusaamia
(J.34, Kenya) and Ekoti (P.30, Mozambique)
The simple sequential correlation of tense markers with a single timeline
would suggest that the only difference between, for example, one future
marker and another would lie in the temporal distance from the time
of speaking (although see Janssen 1994 for a different view in Dutch).
However, data from Lusaamia and Ekoti demonstrate that this is not
necessarily the case. Rather, there is a difference in implications or in
restrictions on use. Consider first a pair of examples from Lusaamia, one
marked with the near future prefix -na-, the other with the remote future
prefix -axa- and final -e, as shown in (19).
(19) Lusaamia (data from Botne field notes)
a. xusuub ra mbwee a-na-meny-a
1P.hope.FV that 3S-F2-live-FV
we hope that it [child] will live
[implies child exists, i.e., is living]
b. xusuub ra mbwee y-axa-meny-e
1P.hope.FV that 3S-F2-live-F2
we hope that it [child] will live
[implies child has not been born]
Note that in (19a) the sentence is only felicitous when the child spoken
of is actually alive at the moment of speaking. Use of the remote future
construction, in contrast, is felicitous if the child has not yet been born,
regardless of whether, say, the mother is currently pregnant or not. This
difference in implication falls out naturally in the model we are proposing.
As illustrated by the schema in Figure 29, the -na- future situates the
event within the P-domain, that is, within the contemporal world in
which the speaker perceives herself to be. In contrast, the -axa-ROOT-e
future construction situates the event in a dissociated future world in which
the child does not yet exist; that world we have labeled a D-domain. (See
Botne 2006a for greater detail and discussion of Lusaamia TMA forms.)
The dotted timeline indicates that we do not have enough evidence at
this time to select between a moving-Ego or moving-Event analysis.
A similar contrast is found in Ekoti. There are two simple past con-
structions, a recent past formed with -a-. . .-a (20) and a remote past
formed with -aa-. . .-iy-e (21).
(20) Recent Past (P1)
a. taana n-a-c-a fooxi [Schadeberg and Mucanheia 2000: 172]
yesterday 1P-P1-eat-P1 together
yesterday we ate together
b. mwanakhwaawe a-(a-)n-x c-el-a wuuluvala [Ibid. 170]
chicken.her 3S-P1-it-slaughter-APPL-P1 become_old
her chicken, she slaughtered it because of old age
c. k(i)-a-n-s kan-a ari wi kuri [Ibid. 135]
1S-P1-3S-meet-P1 3S.be LOC.Inguri
I met him (while) he was in Inguri
d. mvuka w-(a-)uum-a ncuwa [Ibid. 164]
rice it-P1-dry-P1 LOC.sun
the rice (has) dried in the sun
(21) Remote Past (P2)
a. Hayaathi ti-ye y-aa-m-par z-iye hooma taana [Ibid. 151]
Hayathi COP-3S 9-P2-3S-heat-P2 9.fever yesterday
Hayathi got/had a fever yesterday
b. (a-)aa-lum-ach- (y)-w-e [Ibid. 89]
1S-P2-bite-INT-P2-PASS-P2
he was very badly bitten
According to Schadeberg and Mucanheia (2000: 112), the P1 tense de-
notes a completed action in the recent past, perfective in nature. It also
may be used when there is a sense of present result, as in (20d). Although
translatable by the English Perfect, it is neither a perfect as in English nor
fully resultative, indicating a present state. Nevertheless, it invites a sense
of current relevance.
The remote past denotes an event that happened in a distant past.
However, as the example in (21a) illustrates, this may be as recent as
yesterday. We posit that the difference between the uses of the two tenses
lies in the dissociative nature of the remote past. That is, the so-called
remote past situates an event in the D-domain, the recent past in the
P-domain (Fig. 30). Current relevance arises from an event's being situated
in the P-domain in opposition to the D-domain.
Figure 29. Lusaamia domain organization
The timeline through the P-domain denotes moving-Ego. Evidence
comes from grammaticalization of the verb -eetta 'go' (> -tta) in the
present progressive/immediate future construction. This construction, as
shown by the examples in (22), consists of the present tense marker
-n(i)- prefixed to the phonologically reduced form of 'go' followed by
the infinitival form of the main verb. The present marker alone on the
main verb, as in (22b), indicates a generic fact, and can be considered to
mark present along the Ego-moving timeline.
(22) a. ki-n-tta o-lawa [Schadeberg and Mucanheia 2000: 142]
1S-Pr-GO INF-leave
I am leaving or I am about to leave
b. akot a-n-l ma maxapa m-pamela [Ibid. 109]
Koti_people 3S-Pr-cultivate farm LOC-interior
the Koti people farm in the interior
What is curious in Ekoti is that the distinction between the two past
forms has consequences for the syntax. In the examples in (23), we find
that the same proposition ('Portuguese build fortress') requires different
tense marking depending on whose world, the Portuguese or the
fortress, is perceived as salient.
12
In (23a), the Portuguese are the salient
agents who built in the remote past; hence, the remote tense marker is
used. However, in the passive (23b), when the fortress becomes salient as
the subject, the near past marker is used.
Figure 30. Ekoti pasts
(23) Ekoti [Schadeberg and Mucanheia 2000: 116]
a. azuku (a-)aa-cek- ye fortaleeza
Portuguese 3P-P2-build-P2 fortress
the Portuguese built the fortress
b. fortaleeza y-a-cek- w-a naazuku
fortress 3S-P1-build-PASS-FV by-Portuguese
the fortress was built by the Portuguese
As with the case in Lusaamia, this difference in use and implications
falls out naturally from the model we are proposing. The action of the
Portuguese, the subject and topic of the active sentence (23a), occurred
in a remote and dissociated past and, hence, is marked with the D-domain
tense marker. However, the fortress, subject and topic of the passive
sentence (23b), still exists in the contemporal world of the speaker
and is marked with the past appropriate to the P-domain.
A reviewer has suggested that use of the recent past for (23b) may be a
conventionalization of the fact that the resulting state is nearer the
present than is the agent of the action. The same reviewer also suggested
that the Lusaamia examples could be accounted for by the lexicalization
or idiomatization of an invited inference. Specifically, the remote future
use emanates from the invited inference that the child's not yet existing
is necessarily more remote than if the child has already been born. True,
these may be possible ways to account for the observed data. However,
our model provides a principled and motivated framework within which
these, as well as other, disparate facts can be accounted for in a unified
manner. Another of these disparate curiosities occurs in the next language
we examine.
11. Curiosity #4: Differential temporal implications of lexical items in
Chisukwa (M.20, Malawi)
Another kind of implicational evidence can be found in the senses of
particular lexical items as they are used with different tenses. Consider, for
example, the Chisukwa verb -fwa 'die' (24) (data from Kershner field
notes). Apart from its common use referring to humans (24a), it can also
be used metaphorically with respect to the closing of a store (24b and c).
(24) Inchoative verb uku-fwa to die
a. PR a-ku-fw-a. s/he is dying/will die.
3S-Pr-die-FV
b. P2 isitoolo y-aa-fw- ile the store closed
9.store 9-P2-die-CMPL [implication: temporary]
c. P3 isitoolo i-ka-fw-a the store closed
9-P3-die-FV [implication: out of business]
In a strictly linear analysis, we would expect the only difference between
P2 and P3 to be an earlier or later time reference, such as 'yesterday'
and 'before yesterday', respectively. However, what we find is that
there is an implication of temporariness for P2, but of permanency for
P3. Although anomalous in a simple linear model, this distinction is
motivated in our domain model. The P3 marker -ka- situates the event in the
D-domain where the event is interpreted as permanent, i.e., no longer
active, as opposed to P2, which indicates it is past in the P-domain but still
active, hence, interpreted as a potentially temporary state of affairs.
A second, and similar, piece of lexical evidence is found in the difference
in interpretation of the aspectualizer -leka 'cease' when it occurs
with -ka- P3 forms in contrast with tenses marked with -aa-, e.g., P1 and
P2, as shown in (25). Similar to the -fwa case in (24), there is an
implication of permanency with -ka- and temporariness without -ka-; this is
reflected in the translation equivalent the verb receives, that is, 'quit'
(for P3) versus 'stop' (for P1).
(25) Aspectualizing verb uku-leka cease
a. P3 tu-ka-lek-a pakuseenga boo aafulala
1P-P3-cease LOC.INF.build after 3S.P1.become_injured
we quit building after he became injured
P1 tw-aa-lek-a pakuseenga boo aafulala
1P-P1-cease LOC.INF.build after 3S.P1.become_injured
we stopped (temporarily) building after he became
injured
b. P3 ba-ka-leka pakugobola if loombe looli bakubyaala
3P-P3-cease LOC.INF.harvest maize but 3P.Pr.plant
nukugobola amalesi
and.harvest millet
they quit harvesting maize and, instead, are planting and
harvesting millet
P2 b-aa-lek-ite pakugobola if loombe
3P-P2-cease-CMPL LOC.INF.harvest maize
they stopped (temporarily) harvesting the maize
As in the preceding cases of interpretation in Lusaamia and Ekoti, one
might say that the difference in interpretations is due to an invited
inference that permanent cessation would naturally extend further into the
past, while temporary cessation is more likely with a recent past. Again
we point out that, while such an account is possible, it seems to us to be
rather ad hoc. In our analysis, the various and disparate curiosities we
have enumerated can all be accounted for in a unified manner.
12. Curiosity #5: Parallel constructs in Lucazi (K.13, Angola)
With Lucazi, we consider six non-future verbal constructs that illustrate
the inter-connection of tense, tenor, and aspect. These constructs
do not constitute all the non-future forms in the language (there are
also gnomic, habitual, and progressive forms), but the ones we examine
here are sufficient to illustrate a particularly interesting case of pattern
congruity within the dissociative model.
The six constructs each consist of a prefix (two prefixes in one instance)
and a suffix circumscribing a verb stem, as shown in Table 8, labeled as in
Fleisch (2000). As the reader will note, there are two prefixes, -na- and -a-,
that recur, and which Fleisch has associated with the concepts 'anterior'
and 'perfective', respectively. Our analysis diverges from Fleisch's in
significant ways; consequently, the reader should note carefully that we do
not employ the terminology as he does.
We begin our analysis by considering first the two formally parallel
patterns with cross-cutting co-occurrence of the prefixes -na- and -a- with
the suffixes -V and -ile, as set out in Table 9. The suffix -V represents a
vowel that typically harmonizes with that of the verb stem.
Consider the examples in (26) of the -a-. . .-ile construct, Fleisch's
Simple Past.
(26) -a-. . .-ile construct
a. v-a-h t-ile mu-musenge [Fleisch 2000: 164]
3P-PST-pass-PFV LOC-bush
they crossed the wilderness
Table 8. Lucazi non-future constructs (Fleisch 2000)
Anterior         -na-. . .-V         -a-. . .-V      Present Perfective
Past Anterior    -na-. . .-ile       -a-. . .-ile    Simple Past (Perfective)
Hesternal Past   -na-ka-. . .-ile    -a-. . .-a      Perfective
Table 9. Parallel prefix–suffix pairs in Lucazi
affixes   -V             -ile
-na-      -na-. . .-V    -na-. . .-ile
-a-       -a-. . .-V     -a-. . .-ile
b. ka-tali u-a-mu-sum-in-ine
13
[Ibid. 165]
dog 3S-PST-3S-bite-APPL-PFV
the dog bit him
c. kasumbi u-a-y-ile ku-alenga ndonga [Ibid. 142]
chicken 3S-PST-go-PFV LOC-border river
one day, Chicken walked along the river
Fleisch (2000: 166) notes the following characteristics associated with use
of the Simple Past:
- it denotes absolute temporal reference;
- it is associated with a notion of remoteness, though not in a strictly
  metrical sense (i.e., the sense of remoteness arises from something
  other than simply temporal linear distance);
- it provides the temporal reference point as a background for other
  states of affairs;
- it does not express or imply later consequences resulting from
  occurrence of the event.
These semantic attributes lead us to conclude that the -a-. . .-ile construct
denotes a complete (i.e., whole) and completed event situated at an
undivided moment in the past. In our model, it situates an event in the past
D-domain. Hence, we label it the D-Past. How does this D-past differ
from the similar -na-. . .-ile construct, which also denotes a past event?
Fleisch (2000: 168) notes first that the -na-. . .-ile construct ". . . may
refer to markedly remote states of affairs, but at the same time it is in some
respects less remote than the simple past [i.e., what we are calling the
D-Past]." Second, it has potentially a past perfect interpretation, as in
(27c–e), meaning it can be interpreted with respect to some reference
point other than S; hence, it is relative and not absolute.
(27) -na-. . .-ile construct
a. tu-na-h luk-ile [Fleisch 2000: 168]
1P-PANT-return-PFV
we (had) returned
b. kaha tusitu vose va-na-l -kungulu-ile [Ibid. 344]
then animals all 3P-PANT-REFL-gather-PFV
then all the animals gathered
c. kaha vangazi vavene va-na-handek-ele ngu-avo: . . . [Ibid.]
then judges themselves 3P-PANT-speak-PFV QUOT-3P
then the judges themselves spoke, saying . . .
d. kasumbi ngu-eni mu-nji-n(a)-amb-ile [Ibid. 169]
chicken QUOT-3S 18-1S-PANT-say-PFV
Chicken said, [that is] why I had said . . .
e. amba, nge-oco mu-na-han-a mulonga ku-li [Ibid. 349]
that like-that 18-PANT-give-CMPL judgment 17-COP
ou kasumbi, kasumbi omu njila na-handek-ele
DEM chicken chicken DEM way 3S.PANT-speak-PFV
so, like that, a judgment was found; Chicken had told the
truth
The -na-. . .-ile construct, then, is like the D-past in that it denotes
completion of the event. Unlike the D-past, though, it typically implicates
relevance to some posterior situation. In our model, we propose that it
situates an event in the past of that domain in which the reference locus
is situated, that is, it is a past internal to a domain, the default
interpretation being the P-domain. Hence, though denoting past, that past may be
felt to be "less remote" than the D-past or "markedly remote" when
interpreted as anterior to some other situation.
We account for the similarity and differences between these two constructs
in the following manner. The -ile suffix denotes a perfective aspect.
That is, it denotes a perspective on the event following the terminative
coda phase of the event, as shown for the -na-. . .-ile construct in Figure
31. If the point perspective is interpreted with respect to the event itself
(Fig. 31a), an aspectual role, then the interpretation is a completed event.
However, the point perspective can also be interpreted as a reference
point, as in (Fig. 31b), in which case it functions to index a new reference
locus (R2). The event is, then, interpreted as occurring prior to that event
and contributing a sense of relevance at that point.
The -a-. . .-ile and -na-. . .-ile pasts differ, then, in two important ways.
First, the former situates a completed event in a dissociated D-domain,
i.e., that domain external to the reference locus (S); the latter situates the
event internal to a domain (by default in the P-domain). Hence, the -a-
form situates the event along the Ego-moving timeline, thereby marking
a tense relation, the -na- form along the Time-moving timeline, thereby
marking a tenor relation. Second, the -a- form treats the event as a
completed whole, the -na- form as a completed endpoint.
a. Simple perfective in past reading
b. Perfect reading (had V-ed prior to some reference event)
Figure 31. Dual interpretations of the -na-. . .-ile perfective aspect
Let us turn now to the -na-. . .-V construct. Instead of the perfective
suffix -ile, it has a final, typically harmonizing, vowel, transcribed -V. This
ending is semantically similar to, yet subtly different from, the perfective
-ile. Although, like the perfective, it expresses completion according to
Fleisch, it also expresses either immediacy of the event ("[w]ith action
type verbs the anterior predicates express a verbal action which
immediately preceded the reference moment and is still relevant to the latter"
(pp. 157–8)), as in (28a and b), or it expresses relevance of the event to
the time of reference (by default the speech event), as in (28c and d).
(28) -na-. . .-V construct
a. tu-na-fum-u mu-Vunonge [Fleisch 2000: 160]
1P-PFV-come_from-CMPL LOC-Menongue
‘we have just come from Menongue’
b. tu-na-sokulul-a lipito mu-va-na-het-e [Ibid. 207]
1P-PANT-open-CMPL door 18-3P-PANT-reach-CMPL
‘we have opened the door (because) they have [just] arrived’
c. vi-ka u-na-kuatilil-a? [Ibid. 124]
8-which 2S-PANT-seize-CMPL
‘what [lit. which thing] have you seized [and still hold]?’
d. hee, mu-na-man-e ku-teta laza mulonga? [Ibid. 346]
INTJ 2P-PANT-finish-CMPL INF-cut already 3-judgment
‘ooh, have you already reached a judgment?’
yii, u-na-hu-u
yes 3-PANT-become-CMPL
‘yes, it has come to be’
Figure 32. Temporal analysis of -a-. . .-ile and -na-. . .-ile in the dissociative model
The difference between this construct and the perfective ones lies in the
location of the aspectual viewpoint. For the perfectives, we saw that the
viewpoint was situated post-Coda; in this construct, however, it is
situated post-Nucleus. That is, in both instances something has been
completed: in the former it is the whole event, in the latter the nucleus phase.
That it is the nucleus only that is construed as completed is evident from
the difference in interpretation of activity and inchoative achievement
verbs. The former have an immediate past interpretation, as in (28a
and b) above, the latter either an immediate past or a resultative state
interpretation, as in (29). We will label this the Completive aspect, in
opposition to the Perfective aspect discussed previously.
(29) a. cimbanda na-y-i [Fleisch 2000: 280]
healer 3S.PANT-go-CMPL
‘the healer has/is gone’
b. muangana na-tsi [Ibid. 157]
king 3S.PANT-die.CMPL
‘the king has died/is dead’
The reason for this is illustrated by the schemas in Figure 33. Inchoative
achievement verbs, unlike activity verbs, encode a stative coda phase.
The completive aspect situates Ego in the post-N coda phase of the
event (Fig. 33b), which may be interpreted as immediately post-N (hence,
a ‘have just V-ed’ interpretation), or as some unspecified location in
the coda (hence, an ‘is V-ed’ interpretation). A comparable distinction
has been proposed for similar suffixes in Zulu (see Botne and Kershner
2000).
a. With activity verbs
b. With inchoative achievement verbs (N = point of transition)
Figure 33. Completive -V
The meaning and use of the -a-. . .-V construct follow from the analyses
we have just presented. The suffix -V denotes a post-Nucleus perspective,
the prefix -a- that the event is situated at an undivided (i.e., punctive)
moment of time along the Ego-moving timeline. This, in fact, is what we
find. With some verbs, there is a performative sense with the utterance, as
in (30a), which might best be translated as ‘have just this instant named’.
With others, there may be a sense of ‘have come to act in this way at this
very moment’, as in (30c). In all cases, the event is immediate to S.
(30) Present Completive -a-. . .-V
a. nji-a-mu-luk-u ou mu-ana li-zina
1S-PST-name-CMPL this 1-child 5-name
li-a-eni Joao [Fleisch 2000: 163]
5-POSS-3S John
‘I (have) named this child John’
b. vi-ze vi-nzunda vi-mu-a-mon-o va-a-li-kungulul-ile [Ibid.]
8-DEM 8-frog 8-2P-PST-see-CMPL 3P-PST-REFL-gather-PFV
‘these frogs that you (pl) have just seen [just mentioned in
story] gathered (again)’
c. m-bambi u-a-hev-e [Ibid. 303]
9-duiker 3S-PST-be_foolish-CMPL
‘how foolish the duiker is’ or ‘the duiker is being foolish’
Before considering the final two constructs, we pause here briefly to
summarize the findings so far. We have differentiated the four constructs
in terms of the aspectual meaning encoded in the suffix and in terms of
the temporal character encoded in the prefix. The suffix denotes either
a completive (post-Nucleus) perspective or a perfective (post-Coda)
perspective on the event. The prefixes -na- and -a- distinguish relations
within a domain (Time-moving timeline) from relations across domains
(Ego-moving timeline), respectively. Table 10 sets out the distinctions.
Table 10. Lucazi constructs in a dissociative model analysis
                      Aspect
Tempus                Completive (post-Nucleus)    Perfective (post-Coda)
Anterior (tenor)      -na-. . .-V                  -na-. . .-ile
Past (tense)          -a-. . .-V                   -a-. . .-ile
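The two-way classification in Table 10 can be read as a small lookup: the prefix selects the tempus (and timeline), the suffix selects the aspectual viewpoint. The following is only an illustrative sketch of that reading; the dictionary and function names are our own hypothetical choices, and the remaining Lucazi constructs (-na-ka-. . .-ile and -a-. . .-a) are deliberately left out.

```python
# Illustrative encoding of Table 10 only. All identifiers are hypothetical;
# the category labels are those used in the table and surrounding text.
PREFIX = {
    # prefix: (tempus, relation type, timeline)
    "-na-": ("Anterior", "tenor", "Time-moving"),
    "-a-": ("Past", "tense", "Ego-moving"),
}

SUFFIX = {
    # suffix: (aspect, location of the aspectual viewpoint)
    "-V": ("Completive", "post-Nucleus"),
    "-ile": ("Perfective", "post-Coda"),
}


def describe(prefix, suffix):
    """Return the Table 10 cell for a prefix/suffix combination."""
    tempus, relation, timeline = PREFIX[prefix]
    aspect, viewpoint = SUFFIX[suffix]
    return {
        "tempus": tempus,
        "relation": relation,
        "timeline": timeline,
        "aspect": aspect,
        "viewpoint": viewpoint,
    }


if __name__ == "__main__":
    for p in PREFIX:
        for s in SUFFIX:
            print(p, s, describe(p, s))
```

Keeping prefix and suffix in separate tables mirrors the claim that the two morphemes contribute independently to the meaning of each construct.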
The -na-ka-. . .-ile construct (Fleisch's “Hesternal Past”) is the same as the
Anterior Perfective tenor in form, with addition of the prefix -ka-.¹⁴ It is
perfective and past, situating an event in that time unit adjacent to the
current relevant time unit, for example, yesterday (vs. today) (31a). It is
relative in the sense that it is anchored to a moment of speaking whether
current or otherwise, as the example in (31b) illustrates. It differs from the
Anterior Perfective in that it apparently has only a past perfective
interpretation, and not a past perfect interpretation as well.
(31) Contiguous Anterior Perfective -na-ka-. . .-ile
a. tu-na-ka-y-ile ku-Venduka ngoco
1P-ANT-IT-go-PST 17-Windhoek and
tu-na-ka-h luk-ile zau [Fleisch 2000: 169]
1P-ANT-IT-return-PST yesterday
‘we went to Windhoek and returned yesterday’
b. muan-etu na-handek-a zau ngu-eni mema
brother-1P 3S.PANT-speak-CMPL yesterday QUOT-3S 6-water
a-na-ka-tontol-ele muakama zaulize [Ibid. 170]
6-PANT-AnTU-be_cold-PFV very_much day_before_yesterday
‘my brother said yesterday that the water had been cold the
day before’
The final construct to consider is the -a-. . .-a construct. Two related
facets of its three uses are exhibited in its experiential (32) and resultative
(33) readings. The experiential interpretation denotes not only an
occurrence of the event at some indefinite time in the past, but typically that
the event denotes an important attribute or characteristic of the subject
at the reference time. Hence, according to Fleisch, (32c) indicates not
only that we have come from Menongue, but that our coming from
Menongue is an integral attribute, either because we were born there or
because we lived there for a lengthy period of time.
(32) Experiential -a-. . .-a
a. nj-a-mon-a ngandu [Fleisch 2000: 160]
1S-PF-see-FV crocodile
‘I have seen a crocodile’
b. na-ngandu na eni u-a-fum-a mu-liyaki [Ibid. 161]
and-crocodile and 3S 3S-PF-come_from-FV LOC-egg
‘and Crocodile, he too, has come from inside an egg’
c. tu-a-fum-a mu-Vunonge [Ibid. 160]
1P-PF-come_from-FV LOC-Menongue
‘we come from Menongue’
[implies birth there or living there for some time]
The resultative use is very similar to the experiential in that the event
has occurred at some indefinite time in the past, and a characteristic state
exists at the time of reference. This use is found with inchoative
achievement verbs that express a resultant state, as in (33a). In order to indicate
that the state existed at a particular time in the past, it is necessary to
combine the -a-. . .-a construct with the auxiliary verb -pu- ‘be(come)’,
itself marked with the past perfective construct, as in (33b). In this complex
construction, the auxiliary verb functions to index a point of reference
with respect to which the resultant state is understood.
(33) Resultative -a-. . .-a
a. kasumbi u-a-l -zind-a na-ngandu [Fleisch 306; 341]
chicken 3S-PF-REFL-hate-FV COM-crocodile
‘Chicken and Crocodile hate one another’
b. ci-tapalo c(i)-a-pu-ile c(i)-a-sungam-a [Ibid. 307]
7-street 7-PST-be-PFV 7-PST-be_straight-FV
‘the street was straight’
In neither case does the -a-. . .-a construct co-occur with temporal ad-
verbials, either of location or duration in time. On the other hand, as the
reader can see, both interpretations denote a property characteristic of the
subject at the reference time. We propose that the -a-. . .-a construct im-
poses a temporal frame (in bold in Figure 34) over event structure along
the Ego-moving timeline. The difference between the experiential and
resultative readings is determined solely by whether the event coded by the
verb expresses a stative coda phase or not. If so, then we have the
resultative reading (Fig. 34b), in which the viewpoint (the right edge or
boundary) of this frame is located in the coda phase of the event; if not, we find
an experiential reading, in which the right edge is outside the event, but
the event is within the experiential frame (Fig. 34a).
The temporal frame overlies the timeline, precluding identification of
specific time units on the timeline. Consequently, the time at which the
nucleus (N) of the event occurred is not statable with this verb construct.
Moreover, the frame depicts the state of the event as existing over the
duration of the interval, interpreted as denoting an experience or attribute
true of the subject over that interval.
There is a third use of the -a-. . .-a construct, one which does not follow
from either of the complex temporal models (Ego-moving or
Time-moving) we have outlined so far. This is its use in narrative to express
consecutive occurrence of events, as in (34), although Fleisch (2000: 263)
notes that this is not as common, or as stylistically appropriate, as use of
the D-past perfective in many contexts.
(34) Narrative sequencing with -a-. . .-a
setting: kasumbi u-a-y-ile kualenga
chicken 3S-PST-go-PFV along
ndonga [Fleisch 2000: 342]
river
‘one day Chicken walked along the river’
event 1: kaha u-a-mon-a nguvi
then 3S-PST-see-FV hippopotamus
‘then he saw Hippopotamus’
event 2: kunahu u-a-mu-sik-a ngu-eni: . . .
then 3S-PST-3S-call-FV QUOT-3S
‘then he called him, saying: . . .’
(35) event 3a: kaha nguvu u-a-y-ile
then hippo 3S-PST-go-PFV
u-a-ka-lek-ile ngandu [Ibid.]
3S-PST-SEQ-tell-PFV croc
‘then Hippopotamus went and told Crocodile’
event 3b: kaha nguvu u-a-y-a
3S-PST-go-FV
u-a-ka-lek-a ngandu [Ibid. 162]
3S-PST-SEQ-tell-FV
‘then Hippopotamus went and told Crocodile’
a. Experiential, e.g., ‘see a crocodile’
b. Resultative (of change-of-state verb), e.g., ‘hate’
Figure 34. Interpretations of the -a-. . .-a construct
The storyteller apparently gave two renditions of the same story, one in
which he reverted to the D-past perfective when the scene changed (35a),
one in which he continued with use of the -a-. . .-a construct in sequencing
events (35b). In both cases, the itive marker -ka- adds emphasis to a
closely tied sequence.
The use of -a-. . .-a in the narrative follows not from the Ego- vs Time-
moving models that we have discussed throughout the paper, but rather
from a complex temporal sequencing model (see Evans 2005: 229–234
for a detailed discussion), in which sequences of events are conceptualized
as discrete entities with respect to some event in question other than the
time of speaking (S). In Lucazi story narrative, this event is typically
marked with the D-past perfective construct -a-. . .-ile, as in (34) above.
Subsequent events may be marked either with the same form or with the
-a-. . .-a construct. As Evans (2005: 230) indicates, a consequence of inte-
gration into this model is the imposition of an in-tandem alignment on
the events (Fig. 35).
These rather curious parallel sets of constructs find a motivated analysis
in the model we are proposing. As illustrated in Figure 36, the similarity,
both morphological and semantic, between the Anterior and Past sets
results from their patterning in similar fashion along a timeline; their
differences arise from their patterning along different perspectives of the
timeline, the Anteriors along the moving-Event perspective, the Pasts
along the Ego-moving perspective. The Perfect indicates that an event
has been completed, but also falls within the experiential domain of the
reference locus, i.e., it is relative. Hence, we can conclude that it is found
in both the P-domain and the past D-domain. As noted above, the Past
Anterior does not co-occur with the D-Past, but does co-occur with the
Perfective. This falls out from the occurrence of both in the P-domain, in
contrast to the D-Past, which situates an event in a different domain.
Figure 35. Complex sequencing model of narrative sequencing with -a-. . .-a
13. Curiosity #6: Multiple futures in Chisukwa (M.20, Malawi)
To this point we have focused primarily on past tenses. Here, we will
discuss the occurrence of multiple future tenses in Chisukwa that seem to
overlap at least to some extent in temporal reference. There are four such
forms, as shown in Figure 37; data in (36) are from Kershner (2002).
They differ in marking: ti vs tiise, and Ø vs ka. Furthermore, speakers of
Chisukwa consider tiise . . .-ka-. . .-e marked events to be “deeply
remote.” What needs to be clarified is the relation these configurations
have to one another semantically and what the individual morphemes
contribute to the meanings.
Figure 36. Organizational schema of Lucazi Anteriors and Pasts
Figure 37. Future tenses in Chisukwa
Figure 38. Simple futures in Chisukwa
(36) F1 ti a-mu-busy-e
F 3S-3S-tell-F
‘s/he will tell him/her (sometime soon)’
F2 tiise a-mu-busy-e pala abasikali biisa
FCont 3S-3S-tell-F if police 3P.P1.come
‘s/he may tell him/her if/when the police (have) come’
F3 ti tu-ka-byaal-e amalima
F 1P-F3-plant-F beans
‘we will plant beans (at some point)’
F4 tiise tu-ka-byaal-e amalima
FCont 1P-F3-plant-F beans
‘we might plant beans (e.g., at some point if we have money)’
[Note: FCont = contingent future]
The configuration ti . . .-e (F1) denotes the highly probable occurrence
of an event in relatively close proximity to the speech event (S). That is,
the speaker is confident of the event happening; hence, it is perceived as in
the here-and-now (Fig. 38). In contrast, the similar configuration ti . . .
-ka-. . .-e (F3) suggests less certainty, both in speaker confidence of the
event taking place and in the time at which it might occur. We propose
that the -ka- marks dissociation, situating the event in a future D-domain.
Hence, from the speaker's perspective, the event is subjectively more
remote. The configurations with tiise (F2 and F4) contrast with each other
in the same way as the ti configurations (F1 and F3). The question, then,
is how the tiise forms differ in meaning and use from the ti forms.
The tiise configuration indicates that the occurrence of the reported
event depends on the fulfillment of prior information or on the occurrence
of some other event, a second reference locus R2. The element tiise is
appropriately analyzed as bi-morphemic: ti + :se. The morpheme -:se
induces vowel length, but does not affect vowel quality. Evidence for this
comes from the negative forms: ta (< ti + a) and taase (< ti + a + :se)
(Kershner 2002: 193). It is the morpheme -:se that establishes the new
R2. Moreover, this -:se derives from the verb ‘come’, -iisa.
Consider, then, the sentence with F4 in (36) above and its schematic
representation in Figure 39 below. The marker -ka- situates the event in
the future D-domain. tiise indicates that there is another condition or event
(in this case, ‘have money’) that must be fulfilled before the reported event
‘plant beans’ will occur. Hence, tiise establishes a new reference locus, an
R2, effectively indexing some discourse-established event as the substantive
time of R2, thereby creating a new locus of orientation and, hence, a new
perspective on time. The reported event is future with respect to R2 within
the future D-domain. It is for this reason that tiise . . .-ka-. . .-e events are
considered by Sukwa speakers to be “deeply remote.”
A simple linear analysis does not account for these differences in a
descriptively accurate or explanatory manner; our domain analysis does. As
the schema in Figure 40 shows, the ti and tiise configurations comprise
parallel sets in the P- and D-domains. Events marked with ti are simple
futures, either in the P-domain or D-domain. On the other hand, events
marked with tiise are indexed to an antecedent R2. This, in essence, con-
stitutes a kind of mediated remoteness, in contrast to the more direct
remoteness of the D-domain.
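The composition of the four Chisukwa futures described above can be summarized as two independent binary choices: presence of -ka- (P-domain vs future D-domain) and ti vs tiise (direct vs indexed to a second reference locus R2). The sketch below merely restates that analysis; the function name and the returned field names are hypothetical conveniences of ours, not terminology from Kershner (2002).

```python
# A minimal sketch of the 2 x 2 analysis of Chisukwa futures.
# Function and field names are hypothetical; labels F1-F4 follow example (36).
LABELS = {
    ("ti", False): "F1",
    ("tiise", False): "F2",
    ("ti", True): "F3",
    ("tiise", True): "F4",
}


def chisukwa_future(particle, has_ka):
    """Classify a future configuration.

    particle: "ti" or "tiise"
    has_ka:   True if the dissociative marker -ka- is present
    """
    if particle not in ("ti", "tiise"):
        raise ValueError("particle must be 'ti' or 'tiise'")
    return {
        "label": LABELS[(particle, has_ka)],
        # -ka- situates the event in the future D-domain
        "domain": "D" if has_ka else "P",
        # tiise (ti + :se) indexes the event to an antecedent R2
        "indexed_to_R2": particle == "tiise",
    }


if __name__ == "__main__":
    for particle in ("ti", "tiise"):
        for has_ka in (False, True):
            print(chisukwa_future(particle, has_ka))
```

On this reading, F4 is the only form that is both dissociated and mediated by R2, which is consistent with speakers treating it as the most remote of the four.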
Figure 39. Representation of F4
Figure 40. Combined schema of Chisukwa futures
These data demonstrate, once again, that the same markers may be
used to express temporal relations in both P- and D-domains. Typically,
though not necessarily, there will be a marker such as Chisukwa -ka- that
indicates which of the two domains the tense marking is referencing.
We have seen in this analysis of Chisukwa that the futures involve
more than simple temporal meaning; they involve as well semantic di-
mensions of greater or lesser epistemic certainty and contingency. Our
model, we believe, lends itself well to integrating both temporal and
non-temporal dimensions of meaning in one unified account. Although we are
not able to extend the analysis here to other non-temporal uses, condi-
tional contexts, for example, we feel that the account we have provided
holds much promise for future investigation.
14. Curiosity #7: Discontinuity and pattern in Kom tense markers
(Grassfields Bantu, Cameroon)
Kom presents a challenging case, both for Comrie's hypothesis of
continuity in the time reference of tense marking as well as for our thesis that a
dissociated domain approach provides a motivated, explanatory analysis
of the data. There are four past tenses, a present/future, and three fu-
tures, in addition to two aspects, imperfective and completive. Data are
from Chia (1976).
(37) a. Amadu nun l fel
A. P4 work
‘Amadu worked [remote]’
b. ivu t na su su? a
rain P3 IMPF INCEP fall
‘rain was beginning to fall’
c. Peter l nyij busi-busi
P. P2 ran in_morning
‘Peter ran in the morning’
d. wu ni yem njaj j taka wa gwi
3S P1 sing song before 2S come
‘she sang a song before you came’
(38) es nun kf asaj uwe afein
1P Pr harvest corn week this
‘we are harvesting corn this week’
(39) a. Pauline ni yem njja a-l kf a
P. F1 sing song LOC-evening
‘Pauline will sing a song in the evening’
b. Peter l fel-a al bus
P. F2 work-F tomorrow
‘Peter will work tomorrow’
c. wu nun l yem njaj ulva wu lema kwen
3S F3 sing song when 3S grow reach
‘she will sing a song when she matures’
As one can readily see from these examples and the list of tense
markers alone in Table 11, Kom tense marking is replete with segmental
forms recurring in different tense constructions and, hence, having
different time reference. For example, the forms nun and l (with different
tones) occur in tenses referring to three different times, while ni is
employed in two.
The P4 and F4 tenses stand out as curiosities in two ways: they are
composed of two elements, unlike other markers, and they incorporate
the same morpheme, nun, as does the present/future (Pr/F1) tense.
Although one could certainly describe the system in terms of a simple linear
model with different degrees of remoteness incorporated into it, it would
be difficult to see any coherent motivation for the semantic and
morphological patterning in such an analysis. Although the overall distribution of
tense markers suggests a gradual temporal demarcation outwards from
the present (and even that pattern is disrupted by P3), that doesn't
provide any explanation for the composition of P4 and F4 from Pr/F1 nun
and P2/F3 l. However, it does become understandable and motivated
in the domain approach we are advocating. First, nun functions to denote
a kind of present along both timelines. Not only does it indicate an
ongoing event at S, it is found in generic sentences as well. In (40) a simple
present generic is formed with the present tense. Furthermore, a past or
future generic is also possible, as Chia points out (1976: 125), but only
with the P4 or F3 tenses, respectively, as illustrated in (41). No other
tenses can be used this way. That is, only nun tenses can be used to denote
generic events, i.e., genericness is associated with the static perspective of
time along the Ego-moving timeline. At any point along this line, the
speaker's perspective is from within a particular cognitive domain for
which that situation named by a verb is true.¹⁵
Table 11. Kom tense markers
Tense    Marker    Temporal reference
P4       nun l     long ago
P3       t         sometime before today (e.g., yesterday, last year)
P2       l         early in the day
P1       ni        a little while ago (aP3 hours)
Pr/F1    nun       now or in a little while (aP3 hours)
F2       ni        later in the day
F3       l         tomorrow or specific known time in the future
F4       nun l     some time in the future (after tomorrow)
(40) Kom nun fel akun
Kom_people Pr cultivate rice
‘the Kom cultivate rice’
(41) a. ngum su nun l na kful nfu?
locusts P4 IMPF eat grass
‘locusts ate grass’
b. ngum su nun l na kful nfu?
locusts F3 IMPF eat grass
‘locusts will eat grass’
Thus, nun (in its generic or gnomic use) situates events along the
Ego-moving timeline (see Fig. 41); l marks either of the dissociated
D-domains (41). Consequently, location of events in either D-domain is
marked by nun l, past or future being differentiated by reversed tone
patterns on the tense markers.
In contrast, all the tenor markers situate events along the moving-event
timeline, that is, within the P-domain, as illustrated in Figure 42. Note in
particular that nun occurs as a marker of present along both timelines.
Note, too, that l here marks time units removed from S, i.e.,
not contiguous with S; hence, use of l appears to be consistent over
each timeline.¹⁶ More specifically, the tenor forms in the P-domain are
organized in form, meaning, and distribution by two interacting
principles. First, the time frame is sub-divided into approximately equivalent
intervals grounded in the concept of ‘today’ (see Fig. 43a), creating a
bilateral symmetry. Two relatively short intervals (approximately e3 hrs)
form the central core, the more future interval of which contains the
time of the speech event. One time unit removed are the comparable
intervals earlier and later in the day. All four of these ‘today’ intervals
receive a low (L) tone on the tense marker. Parallel intervals outside of
‘today’, the pre-hodiernal and post-hodiernal intervals, both receive high
(H) tone on the tense marker. Second, the tense forms refer to abstract
metrical units in relation to S (Fig. 43b). Thus, ni refers to those time
units abutting S, l to those time units twice removed from S. Those two
that are temporally past receive L tone, those that are future, H tone.
Figure 41. Distribution of nun forms
Figure 42. Tenor marking in the P-domain
Figure 43. Organizational principles in the Kom P-domain
Note that the tones in the upper and lower schemas match, with the sole
exception of ni ‘later in the day’, where we observe a HL sequence. This
pattern, which initially appeared out of place, is to be expected from the
correlation of hodiernal temporal divisions with grammatically abstract
temporal units denoted by the tense markers, as illustrated here.
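The interaction of the two principles can be checked mechanically for the ni and l tenors. The sketch below is only an illustration under our own simplifying assumptions: it covers just the four ni/l forms, treats each principle as assigning a single L or H, and uses hypothetical names throughout. It does not model how the surface HL sequence itself arises; it only identifies where the two principles disagree.

```python
# Illustrative check of the two organizing principles in the Kom P-domain.
# Only the ni and l tenors are modelled; all identifiers are hypothetical.
TENORS = {
    # label: (segmental form, interval lies within 'today'?, direction from S)
    "P2": ("l", True, "past"),     # early in the day
    "P1": ("ni", True, "past"),    # a little while ago
    "F2": ("ni", True, "future"),  # later in the day
    "F3": ("l", False, "future"),  # tomorrow
}


def hodiernal_tone(in_today):
    """Principle 1: 'today' intervals take L, intervals outside today take H."""
    return "L" if in_today else "H"


def metrical_tone(direction):
    """Principle 2: past units take L, future units take H."""
    return "L" if direction == "past" else "H"


def conflicts():
    """Return the tenors for which the two principles assign different tones."""
    return [
        label
        for label, (_, in_today, direction) in TENORS.items()
        if hodiernal_tone(in_today) != metrical_tone(direction)
    ]


if __name__ == "__main__":
    print(conflicts())
```

Under these assumptions the only conflict is F2 ni ‘later in the day’, the one form for which the text reports a HL sequence.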
The Kom case demonstrates that continuity of time reference is a prin-
ciple that holds within a region of one domain, for example, over the past
in the P-domain. Thus, l occurs in the past of the P-domain, but also in
the future, a separate region. It also marks each of the D-domains. This
symmetrical pattern is not just a coincidental fact about Kom; the
same type of correspondence in form between remote past and future do-
main marking can be found in numerous Bantu languages, as the exam-
ples in Table 12 attest.
Segmentally, the dissociative (remote) marker in each of the languages
is identical; tonally, they are identical in two languages, different in three,
as well as in Kom. This fact supports the view that Bantu languages often
mark the two dissociative D-domains in the same way, typically using
tone to separate past from future.
In sum, we can retain Comrie's continuity hypothesis as a universal
principle, but with the stipulation that the same marker may be used in
different domains or in comparable intervals (past/future) of the
P-domain.
15. Limits on domains: Bamileke-Dschang (Grassfields Bantu;
Cameroon)
The discussion to this point has addressed languages that mark at most a
past and a future D-domain in addition to the P-domain. One may well
ask whether more than one past or future D-domain can be marked and,
in fact, whether there are any limits on the number of domains marked.
In making a first pass at addressing these questions, we can consider the
case of Bamileke-Dschang, whose tense system represents perhaps the
fullest development possible, having five pasts and five futures, illustrated
by the examples in (42)–(43).
Table 12. Some Bantu languages with comparable marking for D-domains
Tones identical
Sibende F12       P2 -a-ka-            cf. P1 -a-            Yuko Abe p.c.
                  F2 -loo-ka-              F1 -loo-
Kimatuumbi P13    P2 -a--. . .-ite     cf. P1 --. . .-ite    Odden 1996
                  F2 -a-luwa- . . -a       F2 -luwa-. .-a
Tones differ
Ewondo A72a       P3 -ngaH                                   Redden 1979
                  F3 -ngaMid
Gimbala H41       P3 -ga-                                    Ndolo 1972
                  F2 -ga-
Ciila M63         P2 -a-ka-            cf. P1 -a-            Yukuwa 1987
                  F2 -la-ka-               F1 -la-
(42) Pasts in Bamileke-Dschang [Hyman 1980: 227]
a. P1 aa !taj ‘he bargained [just a moment ago]’
3S.P1 bargain
b. P2 a aa ntaj ‘he bargained [earlier today]’
3S P2 bargain
c. P3 a ke taj!j ‘he bargained [yesterday]’
3S P3 bargain
d. P4 a le taj!j ‘he bargained [before yesterday]’
3S P4 bargain
e. P5 a le la n!taj ‘he bargained [long ago]’
3S P4 P5 bargain
(43) Futures in Bamileke-Dschang [Ibid. 228]
a. F1 a!a !taj ‘he is about to bargain’
3S.F1 bargain
b. F2 aa !pij!j taj ‘he will bargain [later today]’
3S.F1 F2 bargain
c. F3¹⁷ aa !sue taj ‘he will bargain [tomorrow]’
3S.F1 F3 bargain
d. F4 a!a lae !taj ‘he will bargain [after tomorrow; some days hence]’
3S.F1 F4 bargain
e. F5 a!a fu !taj ‘he will bargain [a long time from now]’
3S.F1 F5 bargain
Following analysis of these constructions, Hyman (1980: 228) extracts
the tense formatives as listed in Table 13. What is of particular interest
and relevance for our discussion here are the two pasts, P4 and P5, and
the F4 future.
Table 13. Bamileke-Dschang tenses
Time reference       Past           Future
Proximate            P1 a`          F1 !a
Same day             P2 aa          F2 a !pi
e1 day               P3 ke          F3 a !s$
e2 days              P4 le          F4 !a la$
e1 year or more      P5 le la$      F5 !a fu
Before discussing in more detail those tenses, it is important to note
that the Bamileke system, according to Hyman, represents relative tense
marking rather than absolute. This can be observed, for example, in the
crastinal (‘tomorrow’) future F3, derived from the verb le-su ‘to come’.
A sentence such as that in (44) may be interpreted in absolute terms with
respect to the time of speaking, as illustrated in the schema in Figure 44.
(44) a ke !le jgs oo !sue zu !m [Hyman 1980: 229]
he P3 say that you.F1 F3 see child
‘he said that you will see the child [tomorrow]’
However, the same sentence can be interpreted relatively, i.e., with
respect to some other reference locus identified in the discourse context, in
this case that denoted by ‘said’, as in (45). The F3 tense marker still
situates the event in the adjacent posterior time unit, but that unit is now
understood to be ‘today’ because the reference locus R is situated in the
time unit ‘yesterday’ (Fig. 45). It is thus significant for our model that it
is not restricted to absolute tense, S being simply a special locus of
orientation. At any orienting locus, there will be two perspectives on time, and
the model will apply accordingly.
(45) a ke !le jgs oo !sue zu !m [Hyman 1980: 229]
he P3 say that you.F1 F3 see child
‘he said that you will see the child [today]’¹⁸
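The contrast between the absolute reading in (44) and the relative reading in (45) amounts to a change of anchor: the crastinal future always selects the day unit immediately after the reference locus, whichever locus that is. The following sketch illustrates only that arithmetic; representing day units as integer offsets from the day of speaking (S = 0) and all names used are our own assumptions.

```python
# A minimal sketch of relative tense anchoring for the crastinal future (F3).
# Day units are integer offsets from the day containing S (0 = today).
# All identifiers are hypothetical.
DAY_NAMES = {-1: "yesterday", 0: "today", 1: "tomorrow"}


def crastinal_target(reference_day=0):
    """Return the day unit picked out by F3: the unit adjacent to and
    following the reference locus."""
    return reference_day + 1


if __name__ == "__main__":
    # Absolute reading (44): the reference locus is S itself.
    print(DAY_NAMES[crastinal_target(0)])
    # Relative reading (45): the reference locus is the saying event, yesterday.
    print(DAY_NAMES[crastinal_target(-1)])
```

The same function serves both readings; only the argument changes, which is the sense in which S is simply a special locus of orientation.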
The Bamileke-Dschang system, because it is more complex than most,
lends itself well to a domain analysis rather than a simple linear analysis.
Figure 44. Absolute interpretation of a su
Consider that four of the tenses denote times within the current time
unit of ‘today’, while two others denote times one time unit away, either
‘yesterday’ or ‘tomorrow’. We propose that these tenors situate events
along the moving-Event timeline in the P-domain (Fig. 46). One piece of
evidence supporting this analysis comes from the crastinal (‘tomorrow’)
future noted above; the verb ‘come’ (44–45) denotes events moving
toward and past the reference locus. Its counterpart, ke, seems likely to
have derived from an obsolete verb for ‘go’ or an itive marker (compare
the case of Lucazi, note 14 and see Botne 2006b).
Consider now the more remote past tenses, P4 and P5. We can analyze
le (P4) as denoting a remote past, situating an event in a D-domain. But
what about P5? As the time reference indicated in (42) and Table 13
shows, this is a very remote past; in fact, it is constructed in part with the
P4 past marker le. Hence, we propose that it establishes a second
D-domain in a more remote past than P4 (Fig. 47), that is, the la element
instills an earlier time sense.
Figure 45. Relative interpretation of a su
Figure 46. Bamileke-Dschang P-domain
We see also in Figure 47 and in (43d) and Table 13 that la appears as a
marker of a future D-domain (F4), but that it is not the most remote fu-
ture. Assuming these two tenses derived from the same morpheme la, this
is an odd situation. There is, however, a unifying analysis behind this cu-
riosity. Hyman (1980: 234) points out that la, in conjunction with P2 and
P3 (i.e., the P-domain pasts) is also used as a pluperfect, as shown by the
examples in (46). He concludes from this that the pluperfect and far re-
mote past use (P5) are comparable anteriors and can be factored out of
the tense system, but that the future la (F4) is not comparable and must
be treated differently. However, within the framework of our model, we
can see that la functions consistently and coherently in the different time
dimensions. In the tenor dimension of the P-domain, i.e., along the
moving-Event timeline, it combines with P2 and P3 to indicate anteriority
to an indexed orienting locus; in the tense dimension across domains,
i.e., along the Ego-moving timeline, it denotes a domain prior to another
D-domain, i.e., P5 prior to P4, or F4 prior to F5. Thus, the key concept
underlying la in both time dimensions is temporal anteriority, construed
in a manner commensurate with the particular conceptualization of the
timeline. This division of functional roles lends further support to the
analysis proposed in Figure 47, as the hesternal ke` and crastinal su both
behave in the same way with respect to la.
Figure 47. Bamileke-Dschang temporal domains
(46) a. a aa nda? ntaj ‘he had already bargained (earlier today)’ [Hyman 1980: 234]
he P2 la bargain
b. a ke nda? ntaj ‘he had already bargained (yesterday)’
he P3 la bargain
Assuming this analysis to be appropriate, we propose further that
it also represents the limits of a tense and tenor system. That is, once
a language goes beyond one D-domain in either the past or the future,
it will systematize only one more domain in that time sphere (whether
past or future) and that marking of that domain will be built upon or
with respect to the marking of the other D-domain. One might appropriately
consider and compare this situation with the surcomposé past
tenses of French, built upon the passé composé, or the use of the English
pluperfect for a past-before-past interpretation, as noted in §5 (Fig.
19b).
16. A brief re-examination of Burera and Palantla Chinantec
We return here briefly to the cases of Burera and Palantla Chinantec,
whose tense systems, as we indicated in the introduction, posed problems
for the idea of a linear, continuous marking of tenses. Following from our
analyses in terms of the approach we have advocated here, there is no
issue of continuity; seemingly discontinuous tenses are continuous within
their domains. Where Burera differs from Bantu languages is that it only
marks tenor relations within domains and not tense relations between
domains, as shown in Figure 48. Note that we have no data to determine the
direction of the Time-moving timeline; for expository purposes, we have
represented it by a dotted line moving toward the past. The same holds
for the case of Palantla Chinantec below (Fig. 49).
Figure 48. Burera tenses in the domain approach
As an interesting aside, Glasgow (1964) suggests (following a sugges-
tion by Richard Pittman) interpreting the Burera tenses as occurring in
two frames of reference. Although it is unlikely that either of them had
our approach in mind, the spirit seems similar.
The case of Palantla Chinantec (Merrifield 1968) is similar to that of
Burera but illustrates a simpler system than that of Burera in that it does
not make a remoteness distinction within the D-domain. Like Burera,
however, it only marks tenor relations within domains and not tense rela-
tions between them.
17. Conclusion
Our goal in this paper has been to present various curious kinds of data
that find no satisfactory motivation or analysis in a traditional one-
dimensional linear model of temporal relations. The different kinds of
curiosities investigated, we propose, provide evidence for a multi-
dimensional model, organized by and grounded in two cogent factors:
(1) the relevance of two perspectives on time, Ego-moving vs Time-
moving (moving event vs moving ego), which motivate differences in
the organization of tense markers, and (2) the differentiation of temporal
marking into cognitively distinct domains. This differentiation into separate
cognitive domains is not unlike the distinction proposed by Chafe
(1994: 198–99) between displaced conscious experience and immediate
conscious experience, and perhaps relates in some way to the distinction
Givón (1984: 405) makes between active and permanent files in a
mentally projected world.
Figure 49. Palantla Chinantec
In the model we propose, tense systems may grammatically mark continuity
of time relations along a timeline in each domain; that is, a tense
marker may denote the temporal relation of an event with respect to a
reference locus within a domain, the tenor of the event. We can thus
salvage Comrie's proposed universal of continuity by stating that in a
tense system, the time reference of each tense (i.e., tenor) is a continuity
within a specific domain.
Second, tense systems may mark discontinuity of relations reflecting
deictic dissociation, i.e., contemporal vs not contemporal for either
past or future times, that is, the relation of a reference frame to S. Thus,
a tense marker may function to project an event into a separate cognitive
domain. This also has implications for Comrie's universal: along the Ego-
moving timeline, i.e., across domains, the time reference of each tense
marker will have the same general meaning (recall the Kom case of l as
remote marking).
Third, tense systems may mark different kinds of remoteness, three of
which we have identified here. These include metrical remoteness in
either the P- or D-domains, dissociative remoteness imbued by projecting
an event into a D-domain, and mediated remoteness, a consequence
of a reported event being directly related to a second tense locus
intervening between it and the speech event.
Given these findings, we believe this model establishes a functionally
versatile cognitive framework that accommodates the diversity and range
of tense/aspect systems we encounter in Bantu languages and beyond,
and provides a dynamic means for analyzing and comparing deictic phenomena
in the verb in a finer-grained manner than has previously been
the case.
Received 5 September 2006 Indiana University, USA
Revision received 28 August 2007 Kansas State University, USA
Appendix
Abbreviations
AnTU  anterior time unit    IMPF   imperfective      PFV   perfective
APPL  applicative           INCEP  inceptive         PoA   point of assessment
C     coda                  INF    infinitive        POSS  possessive
CMPL  completive            INT    intensive         Pr    present
COM   comitative            INTJ   interjection      PST   past
CONT  contingent            IS     immediate scope   QUOT  quotative
COP   copula                IT     itive             R2    2nd reference locus
CTU   current time unit     L      low tone          REFL  reflexive
CV    consonant-vowel       LOC    locative          RSL   resultative
DEM   demonstrative         N      nucleus           S     speech time
DEP   dependent             NAR    narrative         SEQ   sequentive
E     event                 NEG    negative          t     time
F     future                O      onset             T     tense
FV    verb-final vowel      PANT   past anterior     TAM   tense, aspect, mood
H     high tone             PASS   passive           V     vowel
HOD   hodiernal             PF     perfect
Note: 1S, 2S, 3S = 1st person singular, etc.; 1P, 2P, 3P = 1st person plural,
etc.; P1, P2, P3, etc. = nearer past, more distant past, etc.; F1, F2,
F3, etc. = nearer future, more distant future, etc.; numbers that appear
as glosses with nouns indicate noun classes. The same number on the verb
or modifier denotes agreement with that noun. A diamond (◊) marks the
location of an event on a timeline. In Reichenbach, PP = point present,
RP = recalled point, AP = anticipated point, RAP = recalled anticipated
point.
Notes
* This paper has gone through many revisions, profiting from comments and discussion
following presentations at the 98th Annual meeting of the American Anthropological
Society (1999), the 32nd Annual Conference on African Linguistics (2001), The 4th
World Conference on African Linguistics (2003), an IULC colloquium (2004), and the
International Conference on Bantu Grammar (2006). We wish to thank Phil Lesourd,
John Hewson, Brian Joseph, and Ewa Dabrowska, as well as a coterie of reviewers
for their valuable comments and suggestions. We also thank Carol Orwig and Keith
Patman for assistance with obtaining and clarifying data on Nugunu. Any errors or
misinterpretations are the sole responsibility of the authors: Robert Botne, Indiana
University (botner@indiana.edu) and Tiffany L. Kershner, Kansas State University
(tlkershn@ksu.edu).
1. Dixon (2002) refers to it as Burarra.
2. We use the term event throughout the paper in a very general and generic sense to
refer to any type of basic predication. It is equivalent here to what situation would
be in a more technical discussion of verb types.
3. Numbering such as E.62 refers to the referential classification of individual Bantu languages
followed by Bantuists (based on Guthrie 1948, 1971), the letter identifying one
of 16 zones, the number a particular language within a zone.
4. There are at least two problems with Langacker's arguments. First, with respect to future
use, although the simple Ø-form often applies to scheduled events, this does not
necessarily appear to be the case. If I am exasperated with the meals I am getting at
home, I could say 'Tomorrow we eat out', or 'Tomorrow we will eat out', or 'Tomorrow
we are going to eat out'. The first seems no more scheduled than the other
two. A similar remark can be made about past use as a historical present. Langacker
considers this use to constitute a virtual occurrence of the event(s) at the time of
speaking. However, there's no reason to believe this past use is any more virtual than
that marked by -ED; neither is actually occurring at the time of speaking.
5. There is no term that captures exactly the essence of the cognitive domain coincident
with S, hence we use the term contemporal here for lack of a more precise term.
Contemporal is, naturally, a relative notion whose range is determined by each language
but whose meaning suggests prevailing effect or relevance at S. It replaces the
use of (extended) now in Botne (2003a).
6. A possible additional contrast seems to exist in the case of oral vs written language use.
Consider, for example, the passé composé vs preterit distinction in French, with the latter
having narrowed its use to marking specifically written language, again indicative of
a dissociative function. Perhaps others are possible as well: narrative vs non-narrative, for
example.
7. Allomorphic variation occurs in most morphemes: Norwegian (-et, -te, -de, -dde), Slave
(-o-, -u-, -wo-).
8. The double letters at the end of the verb appear to be an orthographic convention for
writing a falling or rising tone on a short vowel. For temporal markers, double letters
indicate length.
9. We do not believe this to be the same thing as taxis, Jakobson's (1971) term often
equated with relative tense, although Güldemann's (2002) definition ('the time relation
between the communicated state of affairs and another state of affairs which is
encoded or implied in the discourse context and which serves as the temporal reference
point') is very similar. The difference is to be found in the use of S or some specified
time (other than that related to some event) as the reference anchor in our case.
10. It may also be used in the future as, for example, in (6b) above.
11. Kilega has a 7-vowel system, apparently involving ATR distinctions in the high vowels
(Botne 2003b).
ATR

High i i u u
Non-high e a o
12. We thank Thilo Schadeberg for bringing this example to our attention.
13. -ine is an allomorph of -ile, occurring following a nasal in the verb stem.
14. Note that the itive (i.e., motion away) marker -ka- denotes the movement out of the
current time period of today into the preceding time interval yesterday. This accords
with the moving event perspective of time, as shown in the schema.
15. In our model, simple or progressive, habitual, and generic presents can be differentiated
by the timeline and domain that are relevant. A simple or progressive present
indicates an event situated at S along the Time-moving timeline in the P-domain; a ha-
bitual is true of the whole domain. Generics or gnomics, on the other hand, denote
the relation of an event to the static timeline from the perspective of Ego-moving
across a static temporal landscape.
16. The morpheme l appears to have derived from the verb lli 'wake up' (Chia 1976: 95–96),
grammaticized as a marker referring to events earlier in the day. While this may
have been motivation for its grammaticization as an 'early today' past, it doesn't appear
to provide motivation for its other uses.
17. There is an alternative construction for F3 with !lu !lu instead of !su !e. Since, according
to Hyman (1980), they are comparable semantically, we will only consider the latter
form here.
18. One reviewer noted that this sentence might be ambiguous between seeing the child
later today (i.e., after S) and seeing the child earlier today (i.e., before S). Although an
interesting question, we do not have an answer, as Hyman (1980) does not give any
more information than the two readings we have cited here.
References
Benveniste, Emile
1965 Language and human experience. Diogenes 51, 1–12.
Binnick, Robert I.
1991 Time and the Verb: A Guide to Tense and Aspect. Oxford/New York:
Oxford University Press.
Botne, Robert
2003a Dissociation in tense, realis, and location in Chindali verbs. Anthropological
Linguistics 45, 390–412.
2003b Lega (Beya dialect) (D25). In Nurse, Derek and Gérard Philippson (eds.),
The Bantu Languages. London/New York: Routledge, 422–449.
2006a A Grammatical Sketch of the Lusaamia Verb. Köln: Rüdiger Köppe Verlag.
2006b Motion, time, and tense: Grammaticization of come and go futures in
Bantu. Studies in African Linguistics 35, 127–188.
Botne, Robert and Tiffany L. Kershner
2000 Time, tense, and the perfect in Zulu. Afrika und Übersee 83, 161–180.
Bull, William E.
1960 Time, Tense, and the Verb. Berkeley: University of California Press.
Burrow, J. A. and Thorlac Turville-Petre.
1992 A Book of Middle English. Oxford/Cambridge, MA: Blackwell.
Bybee, Joan
1985 Morphology: A study of the relation between meaning and form. Amsterdam/
Philadelphia: John Benjamins Publishing Company.
Bybee, Joan, Revere Perkins, and William Pagliuca
1994 The Evolution of Grammar: Tense, Aspect, and Modality in the Languages of
the World. Chicago/London: The University of Chicago Press.
Chafe, Wallace
1994 Discourse, Consciousness, and Time: The Flow and Displacement of Con-
scious Experience in Speaking and Writing. Chicago: The University of Chi-
cago Press.
Chia, Emmanuel Nges
1976 Kom tenses and aspects. Doctoral dissertation, Georgetown University.
Chung, Sandra, and Alan Timberlake
1985 Tense, aspect, and mood. In Shopen, Timothy (ed.), Language Typology and
Syntactic Description: Grammatical Categories and the Lexicon. Cambridge:
Cambridge University Press, 202–258.
Comrie, Bernard
1981 On Reichenbach's approach to tense. In Hendrick, Roberta A., Carrie S.
Masek, and Mary Frances Miller (eds.), Papers from the Seventeenth Regional
Meeting, Chicago Linguistic Society, 24–30.
1985 Tense. Cambridge: Cambridge University Press.
Cutrer, L. Michelle
1994 Time and Tense in Narratives and Everyday Language. Doctoral disserta-
tion, University of California, San Diego.
Declerck, Renaat
1991 Tense in English. London/New York: Routledge.
Dinsmore, John
1982 The semantic nature of Reichenbach's tense system. Glossa 16, 216–239.
Dixon, Robert M. W.
2002 Australian Languages: Their Nature and Development. Cambridge/New
York: Cambridge University Press.
Dugast, Idelette
1971 Grammaire du tunen. Paris: Éditions Klincksieck.
Emanatian, Michele
1992 Chagga come and go: Metaphor and the development of tense-aspect.
Studies in Language 16, 1–33.
Evans, Vyvyan
2005 The Structure of Time. Amsterdam/Philadelphia: John Benjamins.
Evans, Vyvyan and Melanie Green
2006 Cognitive Linguistics: An Introduction. Mahwah, NJ/London: Lawrence
Erlbaum Associates.
Fauconnier, Gilles
1985 Mental Spaces. Cambridge, MA: MIT Press.
1997 Mappings in Thought and Language. Cambridge/New York/Melbourne:
Cambridge University Press.
Fleisch, Axel
2000 Lucazi Grammar. Köln: Rüdiger Köppe Verlag.
Fleischman, Suzanne
1982 The past and the future: Are they coming or going? Proceedings of the Berkeley
Linguistics Society 8, 322–334.
1989 Temporal distance: A basic linguistic metaphor. Studies in Language 13, 1–50.
Frawley, William
1992 Linguistic Semantics. Hillsdale, NJ/London: Lawrence Erlbaum Associates.
Gerhardt, Phyllis
1989 Les temps en nugunu. In Barreteau, Daniel and Robert Hedinger (eds.), Descriptions
de Langues Camerounaises. Paris: Agence de Coopération Culturelle
et Technique and ORSTOM, 315–331.
Givón, Talmy
1984 Syntax: A Functional-Typological Introduction. Amsterdam/Philadelphia:
John Benjamins Publishing.
2001 Syntax: An Introduction (rev. ed.). Amsterdam/Philadelphia: John Benja-
mins Publishing.
Glasgow, Kathleen
1964 Frame of reference for two Burera tenses. In Pittman, Richard and Harland
Kerr (eds.), Papers on the Languages of the Australian Aborigines. Occa-
sional Papers in Aboriginal Studies 3. Canberra: Australian Institute of Ab-
original Studies, 118.
Guillaume, Gustave
1929 Temps et verbe. Paris: Champion.
1937 Thèmes de présent et système des temps français. Journal de psychologie.
Reprinted in Guillaume, Gustave, Langage et science du langage (1964).
Québec: Presses de l'Université Laval, 59–72.
1945 Architectonique du temps dans les langues classiques. Copenhagen: Munksgaard.
Güldemann, Tom
2002 The relation between imperfective and simultaneous taxis in Bantu: Late
stages of grammaticalization. In Fiedler, Ines, Catherine Griefenow-Mewis,
and Brigitte Reineke (eds.), Afrikanische Sprachen im Brennpunkt der Forschung.
Köln: Rüdiger Köppe Verlag, 157–177.
Guthrie, Malcolm
1948 The Classification of the Bantu Languages. London: Oxford University Press
for the International African Institute (IAI).
1971 Comparative Bantu: An introduction to the comparative and pre-history of the
Bantu languages, vol. 2. London: Gregg International.
Helland, Hans Petter
1995 A compositional analysis of the French tense system. In Thieroff, Rolf (ed.),
Tense Systems in European Languages II. Tübingen: Max Niemeyer Verlag,
69–94.
Hewson, John, Derek Nurse, and Henry Muzale
2000 Chronogenetic staging of tense in Ruhaya. Studies in African Linguistics 29,
33–56.
Hornstein, Norbert
1990 As Time Goes By: Tense and universal grammar. Cambridge, MA: MIT Press.
Hyman, Larry M.
1980 Relative time reference in the Bamileke tense system. Studies in African Linguistics
11, 227–237.
2003 Basaa (A43). In Nurse, Derek and Gérard Philippson (eds.), The Bantu Languages.
London/New York: Routledge, 257–282.
Jakobson, Roman
1971 (1956) Shifters, verbal categories, and the Russian verb. In Jakobson, R., Selected
Writings, The Hague: Mouton, 130–47.
James, Deborah
1982 Past tense and the hypothetical: A cross-linguistic study. Studies in Language
6, 375–403.
Janssen, Theo A. J. M.
1994 Tense in Dutch: Eight tenses or two tenses? In Thieroff, Rolf, and Joachim
Ballweg (eds.), Tense Systems in European Languages. Tübingen: Max Niemeyer
Verlag, 93–118.
Kershner, Tiffany L.
2002 The verb in Chisukwa: Aspect, tense, and time. Doctoral dissertation. Indi-
ana University, Bloomington.
Klein, Wolfgang
1992 The present perfect puzzle. Language 68, 525–552.
Lakoff, George and Mark Johnson
1999 Philosophy in the Flesh. New York: Basic Books.
Langacker, Ronald W.
2000 Grammar and Conceptualization. Berlin/New York: Mouton de Gruyter.
2001 The English present tense. English Language and Linguistics 5, 251–272.
Maganga, Clement and Thilo C. Schadeberg
1992 Kinyamwezi: grammar, texts, vocabulary. Köln: Rüdiger Köppe Verlag.
Mbom, Bertrade
1996 The parameter of remoteness distinction in temporal organization: The
case of Basaa. Paper presented at the 27th Annual conference of African
Linguistics. Gainesville: University of Florida.
Merrifield, William R.
1968 Palantla Chinantec Grammar. Mexico City: Instituto Nacional de Antropología
e Historia de México.
Mous, Maarten
2003 Nen (A44). In Nurse, Derek and Gérard Philippson (eds.), The Bantu Languages.
London/New York: Routledge, 283–306.
Ndolo, Pius
1972 Essai sur la tonalité et la flexion verbale du Gimbala. Tervuren: Musée Royal
de l'Afrique Centrale.
Nurse, Derek
2003 Aspect and tense in Bantu languages. In Nurse, Derek and Gérard Philippson
(eds.), The Bantu Languages. London/New York: Routledge, 90–102.
Nurse, Derek and Henry Muzale
1999 Tense and aspect in Great Lakes Bantu languages. In Hombert, J. M. and
L. M. Hyman (eds.), Recent Advances in Bantu Historical Linguistics. Stanford:
CSLI, 517–544.
Odden, David
1996 The Phonology and Morphology of Kimatuumbi. Oxford: Clarendon Press.
Orwig, Carol
1991 Relative time reference in Nugunu. In Anderson, Stephen and Bernard
Comrie (eds.), Tense and Aspect in Eight Languages of Cameroon. Dallas:
SIL, 147–62.
Redden, James E.
1979 A Descriptive Grammar of Ewondo (Occasional Papers on Linguistics 4). Carbondale,
Illinois: Department of Linguistics, Southern Illinois University.
Reichenbach, Hans
1947 Elements of Symbolic Logic. New York: The Macmillan Company.
Rice, Keren
2000 Morpheme Order and Semantic Scope: Word Formation in the Athapaskan
Verb. Cambridge/New York: Cambridge University Press.
Schadeberg, Thilo C. and Francisco Ussene Mucanheia
2000 Ekoti: The Maka or Swahili language of Angoche. Köln: Rüdiger Köppe
Verlag.
Schürle, Georg
1912 Die Sprache der Basa in Kamerun: Grammatik und Wörterbuch. Hamburg:
L. Friederichsen and Co.
Seiler, Hansjakob
1971 Abstract structures for moods in Greek. Language 47, 79–89.
Smith, Carlota S.
2004 The domain of tense. In Guéron, Jacqueline and Jacqueline Lecarme (eds.),
The Syntax of Time. Cambridge, Massachusetts/London: The MIT Press,
597–619.
Steele, Susan
1975 Past and irrealis: just what does it all mean? International Journal of American
Linguistics 41, 200–17.
Taylor, Charles
1985 Nkore-Kiga. London: Croom Helm.
Taylor, John R.
2002 Cognitive Grammar. Oxford: Oxford University Press.
Traugott, Elizabeth Closs
1978 On the expression of spatio-temporal relations in language. In Greenberg,
Joseph H. (ed.), Universals of Human Language: Word Structure (Vol. 3).
Stanford: Stanford University Press, 370–400.
Three types of conditionals and their verb
forms in English and Portuguese
GILBERTO GOMES*
Abstract
An examination of conditionals in different languages leads to a distinction
of three types of conditionals instead of the usual two (indicative and sub-
junctive). The three types can be explained by the degree of acceptance or
as-if acceptance of the truth of the antecedent. The labels subjunctive and
indicative are shown to be inadequate. So-called indicative conditionals
comprise two classes, the very frequent uncertain-fact conditionals and the
quite rare accepted-fact conditionals. Uncertain-fact conditionals may have
a time shift in contemporary English and the future subjunctive in Portu-
guese (though not all of them do). Moreover, paraphrases of 'if' with 'in
case' or 'supposing' are usually possible with approximately the same mean-
ing. Accepted-fact conditionals never have these features.
Keywords: conditionals; indicative; subjunctive; counterfactuals.
1. Indicative and subjunctive
Conditionals are often classified into two types: subjunctives (or counter-
factuals) and indicatives (Edgington 1995; Dancygier 1998; Bennett
2003). Here is an example of a subjunctive conditional:
(1) If he were here today, he would certainly help her.
The verb form used in the antecedent of this conditional is traditionally
called the past subjunctive. The verb to be is at present the only verb in
English that has a distinctive form for the past subjunctive (in the first
and third persons singular: were). It should be noted that the past
subjunctive refers to the present time. The use of the subjunctive implicates
that the condition expressed by the antecedent is not real, but only
imaginary. The main verb in the consequent (help) is preceded by the
modal verb would, and this verb phrase corresponds to the conditional
mood of other languages. It usually expresses an unreal, imaginary situation
that would be the consequence of the condition expressed by the
antecedent.
Cognitive Linguistics 19–2 (2008), 219–240
DOI 10.1515/COG.2008.009
0936–5907/08/0019–0219
© Walter de Gruyter
Thus, subjunctive conditionals typically involve unreal, imaginary situations.
That is why they are often called counterfactual conditionals. It
is usually agreed, however, that the falsity of the antecedent in counterfactuals
is conversationally implicated rather than asserted (Anderson
1951; Stalnaker 1975; Iatridou 2000). This is because a subsequent sentence
may assert it without redundancy or cancel it without contradiction.
The term counterfactual is somewhat too strong, since the antecedent is
not always deemed contrary to fact. Sometimes this type of
conditional is used when the speaker thinks that the antecedent is only
probably (and not certainly) false. For example:
(2) If she were at home, we might visit her now.
Counterfactuals may be used even when the speaker considers the antecedent
probable but wants to keep the conditional from being interpreted as
too direct a suggestion. For example, Jean may say to Charles
(3) If you took a taxi, you would arrive on time.
believing that Charles will probably accept the implicit suggestion. But in
saying so she is distancing herself from this suggestion by speaking as if
she believed that he was not (or probably not) going to take a taxi; other-
wise she would have simply said If you take a taxi, you will arrive on time.
The subjunctive verb form were is certainly related to the indicative
form were used for the past, although the latter is not used for the first
and third persons singular. Would may also be the past of will, but here
it merely indicates an imaginary present or future. According to Iatridou
(2000), past tense morphology as a component of counterfactual mor-
phology is found not only throughout Indo-European languages but also
in other totally unrelated languages. Imagining a situation that is not
occurring now seems to be cognitively related to remembering a past sit-
uation which is similarly not occurring now. As Langacker (1991: Ch. 6)
observes, both involve an epistemic distance between the designated pro-
cess and the speaker. According to him, instead of present vs. past we
can speak more generally of a proximal/distal contrast in the epistemic
sphere (Langacker 1991: 245). As this contrast is usually referred to a
time-line mental model, the predication of immediate reality is commonly
interpreted as one of present time and that of non-immediate reality as
one of past time (Langacker 1991: 246). In counterfactuals, by contrast,
the distal morpheme is interpreted as one of unreal circumstances.
We should bear in mind that the verb forms described above are those
of the English language. The same counterfactual conditional structure
may be expressed in other languages with the aid of verb forms that do
not have the same properties as those used in English. For instance, in
German, the same verb form (Konjunktiv II) is used for both the ante-
cedent and the consequent. What is important, however, is that there are
verb forms for conditionals involving imaginary and unreal conditions
that are different from those used in conditionals involving possibly real
conditions, such as the following one:
(4) If he was here yesterday, he certainly helped her.
Here there is no would in the consequent, and the indicative is used in
both the antecedent and the consequent. Conditionals of this sort are
called indicative conditionals. Instead of he were, as in (1), we have
he was. It should be noted, however, that in contemporary English the
meaning of (1) may also be expressed by:
(5) If he was here today, he would certainly help her.
Formerly this was considered incorrect, and some still consider it so,
but it is part of the spoken and written language in many dialects of English.
Many would say that the verb in the antecedent of (5) is in the indicative
mood. Yet the fact that a verb form normally used for simple statements
about the past is here used for the present time (a past/present time
shift) may at least be considered an equivalent of the past subjunctive.
Fowler's Modern English Usage (quoted in Edgington 1995: 240) gives the
following examples:
(6) If he heard, he gave no sign.
(7) If he heard, how angry he would be!
The first heard refers to the past, the second to the present. According
to Fowler's, the first heard is indicative, the second subjunctive. Others
would consider both as simple past indicative. It would be harder to
maintain that I were and he/she/it were also belong to the simple past
indicative.
2. The present subjunctive in English indicative conditionals
Consider now the following two examples, which are not counterfactual,
since they involve possibly real conditions:
(8) If he is here tomorrow, he will certainly help her.
(9) If he be here tomorrow, he will certainly help her.
These say in relation to the future what (4) says in relation to the past.
However, while in (4) there is no time shift and no subjunctive, in (8) we
have a present/future time shift and in (9) a subjunctive.¹ In (8), is, which
in simple statements is normally used in relation to the present, refers to
the future. (9) follows the regular form for this kind of conditional in
16th- and 17th-century English. For example:
(10) If he be not in love with some woman, there is no believing old signs.
(Shakespeare, Much Ado about Nothing, Act III, Scene II)
(11) A commander of an army in chief, if he be not popular, shall not
be beloved, nor feared as he ought to be by his army (. . .) (Hobbes,
Leviathan, Ch. XXX)
Here we have what is called the present subjunctive, by contrast with
the past subjunctive that we have seen in counterfactual conditionals.
This archaic form is still sometimes found in recent times:
(12) If it be your will [ . . . ], I will speak no more. (Song by Leonard
Cohen, 1984)
(13) I will be fine with you if you be good to me. (Song by Rick Astley,
1988)
(14) . . . in general, this has a negligible effect on the correlogram, but if
the grouping be very drastic, it is possible to introduce corrections
analogous to Sheppard's corrections . . . (L. B. C. Cunningham and
W. R. B. Hynd, 1946)
(15) But right now those considerations, if we be at war, are secondary
to victory. (Victor Davis Hanson, in National Review Online, 23
October 2001)
(16) And if we be robbers, how can we expect anything different from
our children? (Sermon by Rabbi Barry H. Block, 17 February
2006)
(17) "It would make it more important if that be the case," he [Ralph
Nader] said yesterday. (New York Daily News, 5 February 2007)
This use of the present subjunctive in English conditionals has usually
been overlooked. Although rare now, it clearly contradicts, for example, the
following statement by Bennett (2003: 11): "The conditionals that are
called 'indicative' under this proposal are indeed all in the indicative
mood (. . .)."
The fact that indicative conditionals such as (9)–(17) use the subjunctive
mood (though this use is now archaic) may be reason enough
to question the adequacy of the traditional terms subjunctive and indicative
for distinguishing these two classes of conditionals, even in English.
The fact that the subjunctive mood is also used in many indicative
conditionals in Portuguese and in classical Spanish (see below) is an additional
argument against this label.
The adequacy of classifying conditionals as indicative or subjunctive
has previously been questioned for the opposite reason. Thus Dudman
(1988) maintains that English counterfactuals use the indicative, not the
subjunctive mood, in spite of If I/he/she/it were. Bennett (2003: 11) also
states that "most and perhaps all of [subjunctive conditionals] are in the
indicative mood also". To my mind, at least those with If I/he/she/it
were are undeniably in the subjunctive mood. In addition, the subjunctive
is also the rule in counterfactuals in other languages, such as German and
Spanish. An example in Spanish:
(18) Si el jefe estuviese/estuviera aquí no sucedería eso.
     If the boss were here not would-happen this.
     'If the boss were here, this would not happen.'
     (Estuviese/estuviera are alternative forms of the past
     subjunctive (pretérito imperfecto de subjuntivo).)
My point against this nomenclature is not that most subjunctives in
English use the indicative, but rather that indicatives may have the
present subjunctive in English (If it be, etc.), even if this is exceptional
in current English, and the future subjunctive in Portuguese and also in
classical Spanish. An example in classical Spanish:
(19) Si fuere a México, visitaré las pirámides.
     If go-1sg-fut sbj to Mexico, visit-1sg-fut ind the pyramids.
     'If I go to Mexico, I'll visit the pyramids.'²
3. Three syntactical forms for conditionals in Portuguese
Let us now examine conditionals in Portuguese. (I will present the discus-
sion in a way that can be followed by those who have no knowledge of
Portuguese.)
(20) (I know that she is not Italian.)
Se ela fosse italiana, ela seria europeia.
If she were Italian, she would be European.
(21) (I do not know whether she is Italian or not.)
Se ela for italiana, ela é europeia.
If she be-3sg-fut sbj Italian, she is European.
(22) (I know that she is Italian.)
Se ela é italiana, ela é europeia.
If she is Italian, she is European.
In Portuguese, there are three different forms of the verb in the antecedent
in these three cases: fosse, for, é. (20) has the Portuguese imperfect
subjunctive (corresponding to the past subjunctive in English) in the
antecedent: fosse (were). (If she were [fosse] Italian, she would be
European.) (21) has the Portuguese so-called future subjunctive in the
antecedent: for. (If she is [for] Italian [which is not certain], she is European.)
(22) has the present indicative: é (is). (If she is [é] Italian [as we
know she is], she is European.)
The use of the future subjunctive always implicates doubt. For in-
stance, if X tells Y that Maria has studied a lot, Y may respond:
(23) Se ela estiver cansada, é melhor parar.
If she be-3sg-fut sbj tired, is better to stop.
‘If she is tired, she had better stop.’
This implicates that, although she has studied a lot, she may be tired or
not. It also implicates that, if she is not tired, perhaps the best thing to do
is to go on studying (for example, because of her test tomorrow).
Now let us imagine a second situation, in which X told Y that Maria is
tired, because she has studied a lot. Y may respond:
(24) Se ela está cansada, é melhor parar.
If she is tired, is better to stop.
‘If she is tired, she had better stop.’
Y could never use (23) in this situation. If he already knows that she is
tired, he would never use estiver, which implicates doubt. He must use the
present indicative está. In the first situation, by contrast, some dialects of
Portuguese would use (24), but others would not (unless the speaker had
already concluded that she is tired, from the fact that she has studied a
lot).
Thus, the Portuguese language has three grammatical forms for the
conditional, not just two. The one using the future subjunctive (or future
perfect subjunctive) in the antecedent, which is absent in English, French,
German and other languages, is usually a clear sign of doubt and is
not used when the antecedent is treated as certain. In English (among
other languages), the noncounterfactual conditional construction is usu-
ally used in situations involving uncertain conditions, but it can also be
used in those involving conditions accepted as facts, like (22).³ The three
grammatical forms present in Portuguese and the differences in their use
suggest a distinction among three types of conditional sentences.
4. Three types of conditional according to acceptance or as-if acceptance
of the antecedent
What should we call these three types of conditional? Those such as
(1)–(3), (5), (7) and (20), in which the speaker accepts or speaks as if she
accepted that the antecedent is false or probably false, but imagines a
situation in which it would be true, are often called counterfactual
conditionals, a traditional name that may be kept.⁴ I propose to call those
such as (4), (6), (8)–(17), (19), (21) and (23), in which the speaker is or
pretends to be or speaks as if she were uncertain about the truth of the
antecedent, uncertain-fact conditionals. For those such as (22) and (24), in
which the speaker accepts or speaks as if she accepted that the antecedent is
true, I suggest the name accepted-fact conditionals.⁵
Thus, I suggest that we should prefer ‘counterfactual’ to ‘subjunctive’
to refer to the first class, and that so-called indicative conditionals
should be divided into two classes: uncertain-fact conditionals and
accepted-fact conditionals. This classification of conditionals based on
the acceptance or as-if acceptance of the truth of the antecedent needs to
be defended against objections that may be raised following two influential
traditions in the philosophy of conditionals. First, several philosophers
have noted that counterfactuals are sometimes used in cases in
which the speaker believes the antecedent to be true. Second, it has been
argued that the difference between counterfactual and indicative
conditionals is deeper than and not explained by the belief in or acceptance
of the truth of the antecedent. The first objection is discussed in section 8
and the second in section 9.
5. The distinction between accepted-fact and uncertain-fact conditionals
Further examples of uncertain-fact and accepted-fact conditionals are
given below. Suppose Johnny is trying to solve the following problem:
‘What is the value of x if x + y = 27 and x − y = 9?’ He is a clever boy,
but he has never studied algebra. He thinks: ‘27 may be the result of
adding several pairs of numbers. Let’s try one.’
(25) If x is equal to 20, then y is equal to 7.
(26) And if x is equal to 20 and y is equal to 7, then x minus y is equal to
13.
But x − y = 9. So x is not equal to 20. After trying another pair of
numbers that add up to 27 and failing again, he decides to ask his older
sister for help. Then she teaches him:
(27) Look: x − y is equal to 9. And if x − y is equal to 9, then x is equal
to 9 + y.
(28) Now, if x is equal to 9 + y and x + y is equal to 27, then 9 + y + y
is equal to 27.
From there she finds the solution.
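The remaining steps of the sister's reasoning are not spelled out in the text. As a brief worked sketch, starting from the two equations of the problem ($x + y = 27$ and $x - y = 9$), each line below follows from premises she already accepts, which is why every step can be phrased as an accepted-fact conditional:

```latex
\begin{align*}
x - y = 9 \;&\Rightarrow\; x = 9 + y \\
x = 9 + y \text{ and } x + y = 27 \;&\Rightarrow\; 9 + y + y = 27 \\
9 + 2y = 27 \;&\Rightarrow\; y = 9 \\
x = 9 + y \text{ and } y = 9 \;&\Rightarrow\; x = 18
\end{align*}
```

The result can be checked against both equations: $18 + 9 = 27$ and $18 - 9 = 9$.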
The verb forms used in all four of these conditionals in English are: is–is.
If Johnny were thinking in Portuguese, (25) and (26) would typically have
the verb forms for–será (future subjunctive–future indicative).⁶ This
would show that Johnny is just trying out numbers that may or may not
be the right ones. By contrast, his sister would use the verb forms é–é
(present indicative–present indicative) in (27) and (28), because she is
dealing with certainties. In (28), for example, she is certain that x is equal
to 9 + y, because she deduced this (in (27)) from the second equation
of the problem. In (27) and (28) we have accepted-fact conditionals,
with is–is in English and é–é in Portuguese. In (25) and (26) we have
uncertain-fact conditionals, with is–is in English and typically for–será
in Portuguese.
We can see that the verb form used in the antecedent does not in
general allow one to make the distinction between accepted-fact and
uncertain-fact conditionals in English. In Portuguese, the use of the future
subjunctive (or future perfect subjunctive) indicates an uncertain-fact con-
ditional, but indicative forms may be used in both types.
The question then arises whether the conventional meaning of the
conditional construction is different or the same in accepted-fact conditionals
as compared to what it is in uncertain-fact conditionals. Let us consider
English conditionals without would in the consequent. One could ar-
gue that the default interpretation of the antecedent of such conditionals
is that it refers to an uncertain fact and that, in certain cases, additional
information may override this default interpretation, so that their ante-
cedent is understood as referring to an accepted fact. Alternatively, one
could argue that the meaning of the conditional construction does not
include anything about the antecedent referring to an accepted fact or
to an uncertain fact. In other words, one may ask whether the condi-
tional construction in these cases is ambiguous or vague as regards the
uncertain-fact/accepted-fact contrast.⁷
This is a difficult question, but there is an argument that favours the
ambiguity thesis. This is the fact that if can usually be paraphrased with
in case or supposing in uncertain-fact conditionals (but not in
accepted-fact conditionals) and with since or given that in accepted-fact
conditionals (but not in uncertain-fact conditionals). This points to a
difference in the meaning of if in each type of conditional. In an
accepted-fact conditional,
the meaning of if is similar to the meaning of since or given that, while in
uncertain-fact conditionals it is similar to the meaning of in case or
supposing. (This may be compared to the two meanings of while, a word
that may either mean ‘whereas’ or ‘during the time that’.)
Note that I am not claiming that if, as used in uncertain-fact and in
accepted-fact conditionals, is synonymous with in case (or supposing)
and with since (or given that), respectively, but only that their meanings
are usually similar enough to allow the respective paraphrases. However,
this differential possibility of paraphrasing accepted-fact and
uncertain-fact conditionals is a linguistic fact that indicates a difference
in the meaning of the conditional construction in these two types.
For example,
(29) If you don’t want me here, (then) I’ll leave.
may either mean something similar to
(30) In case you don’t want me here, (then) I’ll leave.
or something similar to
(31) Since you don’t want me here, (then) I’ll leave.
Example (29) could be used either by someone who is considering the
hypothesis that she is unwanted there (just as (30)) or by someone
who has had clear evidence that she really is unwanted there (just
as (31)). It will be an uncertain-fact conditional in the first case and an
accepted-fact conditional in the second.
Suppose the following isolated sentence is overheard in an airport:
(32) If your flight is late, you’ll miss your connection.
Two interpretations are possible: (1) There is a possibility of your flight
being late and, in that case, you’ll miss your connection; (2) Your flight
is late and consequently you’ll miss your connection. Excluding any
influence of special intonation or facial expression, the conditional
construction itself might favour the first interpretation. However, special
circumstances might favour the second. Suppose that this takes place in a small
airport with only one scheduled departure in the next three hours and
that the person who hears the sentence knows that this departure is
delayed. She may then think that the addressee is taking this flight and that
the speaker is referring to the known fact that it is late. My point is that
the hearer cannot fail to interpret the sentence one way or the other (or
even consider both alternatives). According to the first interpretation, the
sentence could be paraphrased as ‘In case your flight is late, you’ll miss
your connection’ or ‘Supposing your flight is late, you’ll miss your
connection’. According to the second, it could be paraphrased as ‘Since
your flight is late, you’ll miss your connection’ or ‘Given that your flight
is late, you’ll miss your connection’.
Many conditionals in Portuguese are also ambiguous as concerns the
uncertain-fact/accepted-fact distinction, as in the following example:
(33) Se ele foi contratado, vamos primeiro ver o
If he was hired, go-1pl imp first see the
trabalho dele para depois criticar.
work of him for after criticize.
‘If he was hired, let’s first see his work and then criticize it.’
The sentence could be used either by one who thinks that the man was
hired or by one who is merely considering the hypothesis that he was.⁸ As
in English, however, different paraphrases for se [if] would be possible in
each case. If (33) is meant as an accepted-fact conditional, se could be
paraphrased with já que [since] or dado que [given that], but not with
caso [in case] or supondo que [supposing]. If it is meant as an
uncertain-fact conditional, se could be paraphrased with caso or supondo que (in
which case the verb tense would have to be changed to the past perfect
subjunctive: Caso ele tenha sido contratado, . . . or Supondo que ele
tenha sido contratado, . . .) but not with já que or dado que.
6. Comparison with other proposed distinctions
My distinction has nothing to do with the thesis of Dudman (1984, 1989)
according to which indicatives should be divided into two classes according
to the presence or absence of a time shift (and that those presenting
a time shift should be classified in the same group as counterfactuals).
To my mind, the presence of a present/future time shift is undoubtedly
significant, since it is a sure sign of an uncertain-fact conditional. (No
accepted-fact conditional has a time shift.) However, there are many
uncertain-fact conditionals that do not have a time shift. For example,
when the antecedent refers to the past, as in (4), there is no time shift.
Thomason and Gupta (1980: 299) give an example in which the present
tense in the antecedent may refer to the present, thus without a time shift:
If he loves her, he will marry her.
Haegeman (2003) proposed a distinction between two types of indicative
conditionals that is also different from that between uncertain-fact
and accepted-fact conditionals: the distinction between
premise-conditionals and event-conditionals. According to her, the conditional
clause in event-conditionals structures the event: it expresses an event
which will lead to the main clause event. In premise-conditionals, by
contrast, the conditional clause structures the discourse: it expresses a
premise leading to the matrix clause (Haegeman 2003: 318–19).
As it happens, almost all of her examples of premise-conditionals are
accepted-fact conditionals or may be interpreted as such. Here is one:
(34) John won’t finish on time, if there’s (already) such a lot of pressure
on him now. (Haegeman 2003: 322)
The speaker here clearly accepts that there is a lot of pressure on
John. However, the following example, also classified by the author as a
premise-conditional, is an uncertain-fact conditional:
(35) If his children aren’t in the garden, John will already have left home
(. . .). (Haegeman 2003: 325)
The speaker now seems uncertain about whether John’s children are still
in the garden or not. So we see that Haegeman’s distinction does not
coincide with mine.
In fact, I do not find the distinction between event- and premise-
conditionals very clear. In (34), classied as a premise-conditional, we
could also say that the event expressed by the conditional clause will
lead to the main clause event, which is how Haegeman characterizes
event-conditionals.
Edgington (2003) also found difficulties with Haegeman’s distinction.
She stresses the following two characteristics of event-conditionals as
discussed by Haegeman: a causal relation between the conditional clause
and the main clause, and tense oddity (what I have called a present/
future time shift). And she concludes:
Given that there can be tense oddity and no causation running from conditional
to main clause, and vice versa, I am left somewhat uncertain about where to draw
the line between event-conditionals and the rest (Edgington 2003: 396).
Haegeman states that event-conditionals may be clefted and premise-
conditionals may not. (A conditional of the form A only if B is said to
be clefted when it is transformed to one of the form It is only if B that
A.) For example, we cannot say:
(36) *It is only if there is already such a lot of pressure on him now, that
John will finish the book. (Haegeman 2003: 323)
Edgington remarks that without the word such this example would be
in order. She notes that the role of such here is to suggest that the
speaker already knows that there is all this pressure on John now. She
considers that conditionals in which the premise is really accepted by the
speaker are marginal and untypical and notes that ‘while this is not
part of Haegeman’s official doctrine of premise-conditionals ( . . . ) quite
a few of her examples are of this kind’ (Edgington 2003: 397). Such
conditionals are precisely my accepted-fact conditionals.
Other authors have also proposed distinctions between types of indicative
conditionals that do not coincide with the one I am arguing for. Eve
Sweetser, for example, makes a distinction between content conditionals,
in which ‘the realization of the event or state of affairs described in the
protasis is a sufficient condition for the realization of the event or state
of affairs described in the apodosis’ (Sweetser 1990: 114), and epistemic
conditionals, in which ‘knowledge of the truth of the hypothetical premise
expressed in the protasis would be a sufficient condition for concluding
the truth of the proposition expressed in the apodosis’ (Sweetser 1990:
116). Both may either be uncertain-fact conditionals or accepted-fact
conditionals. Incidentally, it may be noted that an example such as (4)
(If he was here yesterday, he certainly helped her) fits both of Sweetser’s
categories.⁹
7. Features and uses of accepted-fact conditionals
Accepted-fact conditionals are no doubt much rarer than those of the
two other types. In chapters 1–8 (part 1) of Hobbes’s Leviathan, I found
only one accepted-fact conditional against 41 uncertain-fact and 11
counterfactual conditionals. In chapters 1–6 of Portrait of a Lady, by
Henry James, I also found only one accepted-fact conditional against 17
uncertain-fact and 9 counterfactual conditionals. In Portuguese, a search
in Contos Fluminenses by Machado de Assis revealed 6 accepted-fact
conditionals against 31 uncertain-fact and 16 counterfactual conditionals.
(Atypical conditionals as defined elsewhere (Gomes 2007) and discussed
in section 10 were excluded from these counts. The search involved only
conditionals with if in English or se in Portuguese.)
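To make the relative rarity easier to compare across the three samples, the counts just reported can be turned into proportions. The following is only an illustrative sketch: the dictionary keys and the function name `shares` are hypothetical labels introduced here, and the figures are exactly those given in the text.

```python
# Counts of if/se conditionals reported in the text for each sample.
# Keys and names are illustrative labels, not taken from the source.
counts = {
    "Leviathan": {"accepted-fact": 1, "uncertain-fact": 41, "counterfactual": 11},
    "Portrait of a Lady": {"accepted-fact": 1, "uncertain-fact": 17, "counterfactual": 9},
    "Contos Fluminenses": {"accepted-fact": 6, "uncertain-fact": 31, "counterfactual": 16},
}


def shares(sample):
    """Return each type's share of all counted conditionals in one sample."""
    total = sum(sample.values())
    return {kind: n / total for kind, n in sample.items()}


for title, sample in counts.items():
    total = sum(sample.values())
    summary = ", ".join(
        f"{kind}: {share:.1%}" for kind, share in shares(sample).items()
    )
    print(f"{title} (n={total}): {summary}")
```

On these figures, accepted-fact conditionals account for roughly 2%, 4% and 11% of the counted conditionals in the three samples, respectively. The samples are small, so these proportions are indicative only.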
One might ask why people would use a conditional if they are certain
about the antecedent. They may do so to draw a conclusion from a
known fact or an accepted premise. Examples are Johnny’s sister’s
sentences (27) and (28). Another example is the following (in a context in
which the speaker had a life-threatening illness):
(37) If I’m alive, (it’s because) my doctors did a good job.
Dudman (1986) quotes two other good examples of what I call
accepted-fact conditionals:
(38) If it had not been possible to stop, or even delay, the Japanese up
country with the help of prepared defences and relatively fresh
troops, it was improbable that they would be stopped now at the
gates of the city (J. G. Farrell 1978).
(39) If they weren’t my doing, and they weren’t, then I couldn’t control
their appearance or disappearance (Donald E. Westlake 1974).
In accepted-fact conditionals (as noted earlier), if (or if . . . then) may
often be paraphrased with since or given that with little change in mean-
ing, as for example in (38). This may lead one to question whether
accepted-fact conditionals are in fact conditionals (see Bennett 2003: 5).
I will argue that they are, for four reasons. First (most obviously), they
share the same overall linguistic structure with other conditionals. They
use the same conjunctions (if; if . . . then), the same pattern for building
the compound sentence and the same or similar intonation and prosody
in speech. They may have different verb forms, but counterfactuals also
do and this does not prevent us from considering them as conditionals.
From a grammatical point of view, there is no reason not to consider
them as conditionals.
Second, they usually share many basic logical and cognitive properties
with the other two types of conditionals. All three types are often used to
make inferences. They may be used to draw a conclusion, based on regu-
larity or on logical necessity, or to indicate this regularity or logical neces-
sity itself. They may all be used to make a prediction, dependent on some
condition. They may also be used to indicate the subject’s intention to do
something in the future, conditional on a certain circumstance.
Third, though in accepted-fact conditionals since can often be used to
paraphrase if, this does not show that their subclause is merely a reason
clause. This is shown by the fact that many since-clauses cannot be para-
phrased with if-clauses. For example: Since she was not there, I went
away. The subclause here is not meant as conditional and consequently
we cannot say: *If she was not there, I went away. Thus, the subclause in
accepted-fact conditionals is not merely an adverbial clause of reason (or
cause), as might be thought from the possibility of paraphrasing if with
since, but a real conditional adverbial clause.
Fourth, accepted-fact conditionals may in many cases supply an ade-
quate contrapositive for counterfactual conditionals. For example:
(40) If she were Italian, she would be European.
(41) If she isn’t European, she isn’t Italian.
Within a context that gives reason to state (40), (41) is an accepted-fact
conditional, since in fact we know that she is neither European nor
Italian. If we did not, we would not assert the counterfactual (40). Other
examples:
(42) If it had rained, the road would be wet.
(43) If (as is indeed the case) the road isn’t wet, it hasn’t rained.
(44) If she were very ill, she would be in bed.
(45) If (as is indeed the case) she is not in bed, she is not very ill.
The phrase as is indeed the case was included in parentheses in (43) and
(45) to make clear that these are intended as accepted-fact conditionals.
It could be omitted in a suitable context. In many dialects of Portuguese,
we would not need to include the corresponding phrase, since the verb
form (present indicative) would already implicate that. (If we had been
in doubt, we would have used the future subjunctive.)
Although quite rare, accepted-fact conditionals should be recognized
and distinguished from other indicative conditionals. They are the con-
ditionals that are really indicative, since they involve conditions that the
speaker considers (or acts as if she considered) to be real. The others deal
with uncertain conditions, and in some cases this is reflected in the use of
a time shift in English (and other languages) and of the future subjunctive
in Portuguese and classic Spanish.
8. Acceptance and as-if acceptance
In rare cases, a counterfactual is employed even though the speaker does
not really accept the antecedent as false. Anderson (1951) gives the fol-
lowing example:
(46) If he had taken arsenic, he would have shown just these symptoms
[those which he in fact shows].
Note, however, that this example could have been used as a usual
counterfactual, in a situation where the speaker believes the antecedent
to be false. Suppose that there is another medical condition that presents
the same symptoms as arsenic poisoning and that the result of a special
test has shown that the patient has that medical condition. The sentence
would then be just a comment on the similarity of symptoms. Alterna-
tively, the counterfactual could have been used to convey that the speaker
finds it highly improbable that the man has taken arsenic, and that he is
perplexed by the similarity between his symptoms and those of arsenic
poisoning.
If the sentence is used in a situation where the speaker believes the an-
tecedent to be true (the possibility that the example is intended to show),
we should first ask why the speaker would have chosen to use it, instead
of saying something simpler, such as: ‘He shows symptoms of arsenic
poisoning.’ It seems that the latter would be a clear suggestion that the
man has taken arsenic, and that making such a direct suggestion is pre-
cisely what the speaker is trying to avoid in (46). Here is where an as-if
acceptance of the falsity of the antecedent can be identied. The speaker
acts as if she was making a default assumption that the man has not
taken arsenic, but remarks that, had he done so, he would have shown
just the symptoms he in fact shows. It is a euphemistic way of suggesting
that he has indeed taken arsenic.
An uncertain-fact conditional could have been used to make the same
point in a simpler (though not as euphemistic) way:
(47) If one takes arsenic, one shows just these symptoms [which he
shows].
Edgington (1995: 240) gives another example:
(48) People in line are picking up their bags and inching forward, and
that’s what they would be doing if a bus were coming.
It would seemingly be more natural to say: ‘and that’s what they usually
do if a bus is coming’. The counterfactual here seems to be a more
elaborate way of saying the same thing. It is as if the speaker were saying
something like: ‘First let’s assume that no bus is coming, since we cannot
see one from here. Then let’s imagine a situation that we’ll treat as unreal
in which a bus is coming. What would people do in this situation? They
would pick up their bags and inch forward. Now, what are they doing
now? They are picking up their bags and inching forward. So let’s revise
our initial assumption and conclude that a bus is probably coming.’
Again, the speaker seems to provisionally act as if she accepted that the
situation described in the antecedent is unreal. It is a way of avoiding
commitment to the hypothesis that a bus is coming.
As noted earlier, the falsity of the antecedent in counterfactuals is usu-
ally considered to be conversationally implicated rather than asserted
(Anderson 1951; Stalnaker 1975; Iatridou 2000), since a subsequent sen-
tence may assert it without redundancy or cancel it without contradiction.
The same applies to the truth of the antecedent in accepted-fact condi-
tionals, as shown in the following example by Sweetser (1990: 128):
(49) Well, if (as you say) he had lasagne for lunch, he won’t want
spaghetti for dinner. But I don’t believe he had lasagne for lunch.
Declerck and Reed (2001: 45) have also shown that there are cases in
which the antecedent is accepted only to be challenged by a question in
the consequent.
It is thus clear that in special cases an accepted-fact conditional may be
used even though the antecedent is not in fact accepted as true. In such
cases, however, an as-if acceptance is always the reason for using this
type of conditional. Suppose someone believes that the other person is
lying and this is why he is nervous. She says:
(50) If you are not lying, there is no reason to be nervous.
This may be seen as an ironic (or cautious) equivalent of:
(51) If you were not lying, there would be no reason to be nervous.
Pretended belief or a provisional strategic acceptance of the antecedent
is again the explanation. In (50) the speaker acts as if she accepted as a
fact that he is not lying, when in fact she believes he is. The utterance
seems to function as a reductio ad absurdum. If the addressee is not lying,
there is no reason to be nervous and a person does not get nervous when
there is no reason to be nervous. But the addressee is nervous, so it is not
true that he is not lying. The feigned belief in the truth of the antecedent
(achieved by giving it the form of an accepted-fact conditional) is pre-
cisely what makes the sentence ironic, since the speaker is suggesting
something (the fact that the addressee is lying) which is the opposite of
the natural implicature of the sentence (which could be accepted if in
fact the addressee were not nervous).
The antecedents of some accepted-fact conditionals are said to be
echoic, since they repeat something that has previously been stated by
the interlocutor. It has been noted (Sperber and Wilson 1986; Dancygier
1998) that in such cases the speaker does not necessarily share the belief
in the assumption echoed. However, she certainly acts as if she shared
that belief. She manifests at least a provisional acceptance (which may
be ironic or not) of the content of the antecedent.
An uncertain-fact conditional may also be used instead of a counterfac-
tual for irony. Instead of saying that since he is not Superman he will not
be able to do it, one might say:
(52) If he is Superman, he will be able to do it.
In saying this, one acts as if one considered his being Superman as an un-
certain fact, while in fact one believes it to be false.
9. Degree of acceptance or as-if acceptance of the antecedent as a basis
for distinguishing the three types of conditionals
I will now argue that the speaker’s degree of acceptance or as-if
acceptance of the reality or probability of the condition described in the
antecedent is sufficient for explaining the difference between the three
types of conditionals. Consider a situation in which three people saw a
man kill John. X is uncertain whether this man was Oswald or not and
says:
(53) If Oswald wasn’t the one who killed John, then someone else was.
Y is sure that the man was not Oswald and says:
(54) If Oswald wasn’t the one who killed John (as in fact he wasn’t), then
someone else was.
Z is sure that the man was Oswald and says:
(55) If Oswald had not been the one who killed John, then someone else
would have been the one who killed him.
Though these three sentences sound unnatural, they are grammatical
and make sense. They could certainly be replaced by simpler ones, but
they were chosen on purpose to have a parallel formulation in the three
cases and at the same time avoid different contextual assumptions that
would be induced by a simpler wording (see Fogelin 1998).
The only difference between the three is the belief that the speaker has
concerning the truth of the antecedent (and that of the consequent, as a
result). Y believes it is true, Z believes it is false and X is uncertain about
it.¹⁰ If they did not have these respective beliefs, at least they would be
implicating acceptance of, non-acceptance of and uncertainty about the
truth of the antecedent, respectively.
We have a different situation in the following famous pair of examples
(from Lewis 1973: 3, based on Adams 1970):
(56) If Oswald didn’t kill Kennedy, then someone else did.
(57) If Oswald hadn’t killed Kennedy, then someone else would have.
The person asserting (56) implicates that she is uncertain and the one
asserting (57) implicates that she is certain about Oswald having killed
Kennedy. As Fogelin (1998) has shown, however, in addition to the
different degree of acceptance concerning the truth of the antecedent, each
conditional involves different contextual assumptions. Thus, they are
interpreted differently by the listener and they would be asserted by people
wanting to communicate different thoughts. One believes that Kennedy
was bound to be killed; the other is merely concerned with the identity
of the killer.
Pairs of examples such as this (first suggested by Adams 1970) have
been considered by Lewis (1973) and many others after him as evidence
that the difference between indicative and subjunctive conditionals cannot
be explained by the speaker’s opinion about or acceptance of the truth of
the antecedent. However, I am in complete agreement with Fogelin
(1998) in attributing any further difference to the contextual setting. He
shows that the disparity in the reasons for believing each conditional
simply disappears when the relevant contextual features are held constant.
This is obtained by changing the wording of the sentences, as in (53) and
(55).¹¹ (I have merely added (54) to complete the picture of the three
types.)
Counterfactuals are thus used when the speaker accepts or speaks as
if she somehow accepted that the antecedent is false or highly improba-
ble; uncertain-fact conditionals are used when the speaker accepts or
speaks as if she somehow accepted that the antecedent is uncertain; and
accepted-fact conditionals are those used when the speaker accepts or
speaks as if she somehow accepted that the antecedent is true or highly
probable.
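The three-way classification summarized above can be restated as a simple mapping from the speaker's real or as-if stance toward the antecedent to the type of conditional used. The sketch below is only an illustration of that mapping, not a model of the grammar: the names `Stance`, `ConditionalType` and `classify` are hypothetical, and the stances assigned to examples (53)–(55) follow the description of speakers X, Y and Z in this section.

```python
from enum import Enum


class Stance(Enum):
    """Speaker's real or as-if stance toward the antecedent (hypothetical labels)."""
    FALSE = "false or highly improbable"
    UNCERTAIN = "uncertain"
    TRUE = "true or highly probable"


class ConditionalType(Enum):
    COUNTERFACTUAL = "counterfactual"
    UNCERTAIN_FACT = "uncertain-fact"
    ACCEPTED_FACT = "accepted-fact"


# One conditional type per stance, as in the summary paragraph above.
_TYPE_BY_STANCE = {
    Stance.FALSE: ConditionalType.COUNTERFACTUAL,
    Stance.UNCERTAIN: ConditionalType.UNCERTAIN_FACT,
    Stance.TRUE: ConditionalType.ACCEPTED_FACT,
}


def classify(stance: Stance) -> ConditionalType:
    """Map the speaker's (as-if) stance to the type of conditional used."""
    return _TYPE_BY_STANCE[stance]


# Stances toward the antecedent 'Oswald wasn't the one who killed John':
# X is uncertain (53), Y accepts it as true (54), Z accepts it as false (55).
example_stances = {53: Stance.UNCERTAIN, 54: Stance.TRUE, 55: Stance.FALSE}

for number, stance in example_stances.items():
    print(f"({number}): {classify(stance).value} conditional")
```

The mapping deliberately takes the stance as its only input, which mirrors the claim that no further factor is needed to distinguish the three types; atypical conditionals fall outside it.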
10. Atypical conditionals
I have distinguished three types of conditionals. This is not to say that ev-
ery conditional should fall into one of these types. There are also some
deviant ones, which I call atypical conditionals. (I have elsewhere
proposed a definition and an explanation of atypical conditionals (Gomes
2007).) For instance (from Edgington 1995: 240):
(58) If he took arsenic, he’s showing no signs.
The person who says so probably believes the antecedent is false and
could have said:
(59) If he had taken arsenic, he would be showing signs of arsenic
poisoning, but he isn’t.
At least she is uncertain about it and could have said:
(60) If he took arsenic, signs of arsenic poisoning are expected, but he’s
showing no such signs.
Example (59) includes a typical counterfactual and (60) a typical
uncertain-fact conditional, and they also include a comment with but
after these conditionals, to convey the meaning of the atypical (58).
11. Conclusion
An examination of conditionals in English and Portuguese has thus led
us to distinguish three types of conditionals instead of the usual two
(‘indicative’ and ‘subjunctive’). The labels ‘indicative’ and ‘subjunctive’
were found inadequate, since subjunctive verb forms may be found in
indicative conditionals (in the archaic use of the present subjunctive in
English and of the future subjunctive in classical Spanish, and in the current
use of the future subjunctive in Portuguese). Moreover, so-called
indicative conditionals comprise two classes, the very frequent uncertain-fact
conditionals and the quite rare accepted-fact conditionals.
Uncertain-fact conditionals may have a time shift in contemporary
English and the future subjunctive in Portuguese (though not all of
them do). Accepted-fact conditionals never have these features. Al-
though accepted-fact conditionals are rare, I have argued that they are
genuine conditionals, which have the theoretically important function
of providing a contrapositive for many counterfactuals (when a contrapositive
is valid). When the verb forms used do not permit the identification
of an accepted-fact conditional, it may be recognized by the
possibility of adding '(as is indeed the case)', '(as you say)' or '(as X
says)' after 'if', or by the possibility of paraphrasing 'if' with 'since' or
'given that'.
I have argued that the degree of real or as-if acceptance by the
speaker of the truth of the proposition expressed by the antecedent is
sufficient to explain the differential use of these three types (and that
further differences are accidental and due to contextual features). The
task of establishing common or different truth conditions for them may
be considered as a subsequent one, which is outside the scope of this
paper.
Received 14 May 2007; revision received 15 December 2007
Universidade Estadual do Norte Fluminense, Brasil
Notes
* Laboratory of Cognition and Language, Universidade Estadual do Norte Fluminense,
Campos, RJ, Brazil. E-mail: <ggomes@uenf.br>.
1. Interestingly, Gibbard (1980) considers conditionals in which there is a present/future
time shift as grammatically subjunctive.
2. The following abbreviations are used in the glosses: 1, 3 = first, third person; sg =
singular; fut = future; sbj = subjunctive; ind = indicative; imp = imperative; perf =
perfect.
3. Although I have emphasized in section 2 that there is an archaic use of the present sub-
junctive in indicative conditionals in English (which questions the adequacy of this
label), I am not claiming that this use is preferentially associated with a type of condi-
tional, as the future subjunctive is in Portuguese.
4. Against the term 'counterfactual', Bennett (2003: 12) remarks that it may be considered
as based on a feature that has nothing to do with the antecedent's being
contrary-to-fact, but only with the speaker's thinking that it is so. However, I do
not think that this is really a problem. The label's reference to the speaker's opinion
may easily be considered as implicit: a conditional will be called 'counterfactual' when
the speaker accepts or speaks as if she accepted that the antecedent is (or probably is)
contrary-to-fact.
5. Following Auwera (1986), Comrie (1986) and Bhatt and Pancheva (2006), among
others, one might call such conditionals 'factual conditionals'. However, the term has
already been used in relation to uncertain-fact conditionals that express habitual or
general facts. Moreover, 'accepted-fact' shows that the speaker may merely be treating
the antecedent as true, without in fact committing herself to its truth.
6. Though they might also have e e (present indicative–present indicative).
7. I am indebted to the Editor for this observation.
8. However, the use of the indicative seems to favour the accepted-fact interpretation.
Using the future perfect subjunctive, this could be framed unambiguously as an
uncertain-fact conditional:
Se ele tiver sido contratado, vamos primeiro ver o
If he be-1sg-fut perf sbj hired, go-1pl-imp first see the
trabalho dele para depois criticar.
work of him for after criticize.
'If he was hired, let's first see his work and then criticize it.'
9. This accords with the following observation by Dancygier and Sweetser (2005: 17):
'Since reasoning from cause to likely effect is just as possible as reasoning from effect
to likely cause, epistemic conditionals can also follow the direction of content causal
contingency.'
10. In Portuguese, a different verb form could have been used in each: (56) Se não tiver sido
. . . (57) Se não foi . . . (58) Se não tivesse sido . . .
11. The context of (57) is fixed by changing the pair to: 'If Oswald did not kill Kennedy, then
someone else stepped in and did. If Oswald had not killed Kennedy, then someone else
would have stepped in and killed him' (Fogelin 1998). The first sentence might have
been used by a conspirator who was unsure whether Oswald had succeeded in killing
Kennedy. That (56) might be used in a context similar to that of (57) had already
been pointed out by Bennett (1995: 334–5). The same conspirator, having the same
beliefs concerning the presence of someone prepared to step in if Oswald failed, might
utter (56) before knowing that Oswald had succeeded and (57) after knowing that he
had. By contrast, the context of (56) is fixed by using wordings similar to those of (53)
and (55) (Fogelin 1998).
References
Adams, Ernest W.
1970 Subjunctive and indicative conditionals. Foundations of Language 6, 89–94.
Anderson, Alan Ross
1951 A note on subjunctive and counterfactual conditionals. Analysis 12, 35–38.
Auwera, Johan van der
1986 Conditionals and speech acts. In Traugott, E. C., Meulen, A. t., Snitzer-
Reilly, J. and Ferguson, C. A. (eds.) On Conditionals. Cambridge: Cam-
bridge University Press.
Bennett, Jonathan
1995 Classifying conditionals: The traditional way is right. Mind 104, 331–354.
2003 A Philosophical Guide to Conditionals. Oxford: Oxford University Press.
Bhatt, Rajesh and Roumyana Pancheva
2006 Conditionals. In Everaert, M. and Riemsdijk, H. C. van (eds.) Blackwell
Companion to Syntax, vol. 1. Oxford: Blackwell.
Comrie, Bernard
1986 Conditionals: A typology. In Traugott, E. C., Meulen, A. t., Snitzer-Reilly,
J. and Ferguson, C. A. (eds.) On Conditionals. Cambridge: Cambridge
University Press.
Dancygier, Barbara
1998 Conditionals and Prediction. Cambridge: Cambridge University Press.
Dancygier, Barbara and Eve Sweetser
2005 Mental Spaces in Grammar: Conditional Constructions. Cambridge: Cam-
bridge University Press.
Declerck, Renaat and Susan Reed
2001 Conditionals: A Comprehensive Empirical Analysis. Berlin: Mouton de
Gruyter.
Dudman, Victor H.
1984 Parsing if-sentences. Analysis 44, 145–153.
1986 Antecedents and consequents. Theoria 52, 168–199.
1988 Indicative and subjunctive. Analysis 48, 113–122.
1989 Vive la révolution! Mind 98, 591–603.
Edgington, Dorothy
1995 On conditionals. Mind 104, 235–329.
2003 What if? Questions about conditionals. Mind and Language 18 (4), 380–401.
Fogelin, Robert J.
1998 David Lewis on indicative and counterfactual conditionals. Analysis 58 (4),
286–289.
Gibbard, Allan
1980 Two recent theories of conditionals. In Harper, W. L., Stalnaker, R. and
Pearce, G. (eds.) Ifs. Dordrecht: Reidel.
Gomes, Gilberto
2007 Truth in natural language conditionals. Unpublished paper. Laboratory of
Cognition and Language. Universidade Estadual do Norte Fluminense,
Campos, RJ, Brazil.
Haegeman, Liliane
2003 Conditional clauses: External and internal syntax. Mind and Language
18 (4), 317–339.
Iatridou, Sabine
2000 The grammatical ingredients of counterfactuality. Linguistic Inquiry 31(2),
231–270.
Langacker, Ronald W.
1991 Foundations of Cognitive Grammar, vol. II: Descriptive Application. Stan-
ford, CA: Stanford University Press.
Lewis, David K.
1973 Counterfactuals. Cambridge, MA: Harvard University Press.
Sperber, Dan and Deirdre Wilson
1986 Relevance: Communication and Cognition. Oxford: Blackwell.
Stalnaker, Robert C.
1975 Indicative conditionals. Philosophia 5, 269–289.
Sweetser, Eve
1990 From Etymology to Pragmatics: Metaphorical and Cultural Aspects of
Semantic Structure. Cambridge: Cambridge University Press.
Thomason, Richmond and Anil Gupta
1980 A theory of conditionals in the context of branching time. In Harper, W. L.,
Stalnaker, R. and Pearce, G. (eds.) Ifs. Dordrecht: Reidel.
Much mouth much tongue:
Chinese metonymies and metaphors
of verbal behaviour
ZHUO JING-SCHMIDT
Abstract
This paper explores metonymical and metaphorical expressions of verbal
behaviour in Chinese. While metonymy features prominently in some of
these expressions and metaphor in others, the entire dataset can be best
viewed as spanning the metonymy-metaphor-continuum. That is, we observe
a gradation of conceptual distance between the source and target which corresponds
to the gradation of figurativity. Specifically, roughly half of the
expressions we encounter are based on the ORGAN OF SPEECH ARTICULATION
FOR SPEECH metonymy and can be considered as clustering around the met-
onymic pole. The other half can be seen as tending towards the metaphoric
pole, as they are largely motivated by conceptual metaphors: (a) VERBAL
BEHAVIOUR IS PHYSICAL ACTION, (b) SPEECH IS CONTAINER, (c) ARGUMENT
IS WAR (or WORDS ARE WEAPONS) and (d) WORDS ARE FOOD. The interac-
tion between metonymy and metaphor is an important cognitive strategy in
the conceptualisation of verbal behaviour. The findings (i) evidence the gradient
predictability of idiom meanings based on semantic compositionality,
(ii) confirm the hypothesis of a bodily and experiential basis of cognition,
(iii) suggest the existence of culture-specific models in the utilization of basic
experiences, and (iv) point to the role of emotion in the metaphorisation
of verbal behaviour as a socio-emotional domain.
Keywords: Chinese; verbal behaviour; metaphor; metonymy; embodi-
ment; emotion.
Cognitive Linguistics 19–2 (2008), 241–282
DOI 10.1515/COG.2008.010
0936–5907/08/0019–0241
© Walter de Gruyter
1. Introduction
Since Lakoff and Johnson published Metaphors We Live By (1980), many
cognitive linguistic studies have been conducted on conceptual metaphor
and metonymy as evidence of the embodiment of human cognition (e.g.,
Lakoff and Johnson 1999; Lakoff 1987; Johnson 1987; Kövecses 2005, inter
alios). This line of research has corrected the long-held misconception
of metaphor and metonymy as mere rhetorical devices. They are, as we
now know, the fundamental components of our cognitive behaviour as
well as an integral part of our socio-cultural practice (Kövecses 2005:
89).
Within the framework of Cognitive Linguistics, metaphor is understood
as the conceptualisation of an abstract or, to use Kövecses' word,
'intangible', domain in terms of a basic, usually physical and tangible,
domain (Lakoff and Johnson 1980; Kövecses 2005; Langacker 1987).
The former is known as the target domain and the latter the source. Metaphor
is not only functionally expressive and interactional, it is also
conceptually constitutive, as Kövecses (1999) has argued. For example,
according to Lakoff and Johnson, underlying the utterance 'I'm on top of
the world!' is the HAPPINESS IS UP metaphor. This metaphor not only enables
us to understand and express the emotion of happiness in terms of
the spatial relationship encoded in 'up'. More fundamentally, it enables us
to understand and to express how it feels to be happy at all.
Metonymy, on the other hand, is understood as the process whereby a
certain aspect of a given domain provides mental access to another aspect
of the same domain or, as Croft (2002) points out, a subdomain is
mapped into another subdomain within the same domain matrix. For example,
the question 'Have you read Goethe?' makes little sense unless the
name of the writer is taken to refer to, and, more importantly, to provide
conceptual access to, the literary works produced by the writer. Thus,
metonymy is functionally a conceptual access mechanism (Kövecses and
Radden 1998).
Given the distinct functions of metaphor and metonymy, it might appear
that the two processes would be two distinct mental strategies in
their respective prototypical instantiations. In real linguistic conceptualisations,
however, metaphor and metonymy are hard to separate. Goossens
(2002) describes a number of different forms in which metaphor and
metonymy interact in British English expressions of verbal behaviour. He
coined the term 'metaphtonymy' to refer to the intertwinement of the
two processes. Barcelona (2000) argues that metaphor and metonymy
are inseparable not only at the level of combined uses, but, more funda-
mentally, at the conceptual level. He points out that metonymy enables
metaphorical mapping by recognising the abstract structural similarity
between the source domain and the target domain. Because of the intimate
relationship between the two processes, metaphor and metonymy are
increasingly being regarded as constituting a continuum rather than a bi-
nary distinction. Dirven (2002) contends that the metaphor-metonymy
continuum can be understood as a gradation between conceptual closeness
and conceptual distance, which explains the varying degrees of figurativity
as seen in metaphor and different types of metonymy.
Having outlined the basic tenets of Cognitive Linguistics regarding
conceptual metaphor and metonymy, a brief reference to Conceptual
Blending is in order because of its immediate relevance to the Theory of
Conceptual Metaphor in the sense of Lakoff and Johnson. It should be noted
that Lakoff and Johnson (1980: 147–148) have explicitly argued that the
metaphorical mapping creates similarities between the source domain and
the target, similarities that do not exist independently of the metaphor.
The recognition of a creation of similarities alone, however, is insufficient
for the construction of a distinct novel meaning, at least with respect
to certain metaphors. Critically, the novelty of the constructed meaning
seems to resist explanation based on a two-domain mapping. This point
has been stressed by a number of cognitive linguists in view of the
strengths of the four-space model known as mental integration or blending
in the sense of Turner and Fauconnier (1995) and Fauconnier and
Turner (2002). Grady et al. (1999), for instance, argue that blending develops
emergent content as a result of experiential incongruity. Such
incongruity gives rise to, and thus accounts for, connotations that are
otherwise not inferable from the input. A famous example is the SURGEON
AS BUTCHER metaphor. The sense of incompetence behind this metaphor,
Grady et al. (1999: 103–106) argue, results from the contradiction between
helping and healing as the surgeon's presumable goal, and butchery
as the means being named. Croft and Cruse (2004: 203–204), Kövecses
(2005: 268), and Evans and Green (2006: 403–404), among other
scholars, also acknowledge the relative mental complexity and conceptual
richness made explicit by the blending model. In the present paper, the
reader will also encounter particular cases that call for the notion of
blending as an adequate complement to the main model being adopted
here, namely the metaphor-metonymy-interaction model. Accordingly,
applicability of the blending model will be pointed out in the analysis of
such cases.
Cognitively oriented studies of figuration in the Chinese language have
made significant contributions to our awareness and appreciation of
culture-specific as well as universal patterns of conceptualisation. For example,
Kornacki (2001) shows that metaphors and metonymies are among
the driving mechanisms of Chinese concepts of anger. Ye (2001) presents
metaphors and metonymies in the conception of sadness in Chinese. Most
conspicuously, Yu's numerous analyses of metaphors and metonymies
demonstrate how different body-part terms are employed for the conceptualisation
of various abstract experiences including emotion, social
dignity, control, and thought (Yu 2000, 2001, 2002, 2003a, 2003b). The
present study aims to continue the cognitive linguistic effort to explicate
the conceptual mechanisms by which humans make sense of complex
and abstract experiences. Specifically, I focus on Chinese lexical compounds
and idiomatic expressions that metonymically and/or metaphorically
conceptualise verbal behaviour. In short, I study the figuration in
the Chinese language about language.¹
Verbal behaviour generally refers to the use of language for social pur-
poses. In this general sense, it includes instantaneous linguistic actions
as well as stable dispositions that characterise persons in relation to the
use of language. To the extent that the physical production of language
(speech) is based on species-specic physiology, verbal behaviour has its
universal biological basis and is subjected to physiological constraints.
Consequently, body-parts that conspicuously participate in the articula-
tion of speech sounds constitute an important source from which the con-
ception of verbal behaviour derives by way of metonymy. In Chinese,
about 50 compounds and idioms describing verbal behaviour involve one
or more salient speech organs including the mouth (zui or kou), the tongue
(she), the lips (chun) and the teeth (chi or ya). To illustrate the role played
by body-part related metonymies in the conception of verbal behaviour,
consider the compounds (1a, b), the idiom (1c), and their uses in (2):²
(1) a. zui-ying (mouth-hard) 'verbally stubborn, unwilling to admit an
obvious mistake'
b. zui-tian (mouth-sweet) 'marked by a readiness to utter flattering
words'
c. duo-zui-duo-she (much-mouth-much-tongue) 'marked by the annoying
tendency to make unsolicited remarks or general verbal
indiscretion'
(2) a. women zuo-de bu hao, dei chengren, buyao zui-ying.
1PL do-RES not-good, must admit, not-want mouth-hard
'We are not doing well, we have to admit it, and shouldn't be
too stubborn to admit it.'
b. zhe ren suiran benshi bu da, danshi zui-tian.
this person though ability not big, but mouth-sweet
'Although this person doesn't have great abilities, he's good at
flattering.'
c. dajia dou bu yanyu, pian ni duo-zui-duo-she, you ni shenme shi a?
everyone all not speak, just 2SG much-mouth-much-tongue,
have 2SG what matter Q
'All were silent, only you couldn't spare your mouth and
tongue. It was none of your business!'
Here, by means of a metonymical mapping, the body-parts zui 'mouth'
and she 'tongue' define the conceptual space in which to understand the
meanings of the respective expressions as the space of verbal behaviour.
However, structurally simple as they are, the items in (1) cannot be analysed
as simply metonymic. Rather, the juxtaposition of the body-parts
with the adjectives describing palpable properties indicates the interaction
between metonymy and metaphor. To be specific, ying 'hard', tian 'sweet'
and duo 'much' are metaphorical because literally they describe texture
and taste in the sensual domain and quantity in the physical domain, respectively.
Thus, the three expressions represent an embedment of metaphor
in metonymy in the conception of verbal stubbornness, verbal docility,
and verbal indiscretion, respectively. It is crucial to note that the
properties described by ying, tian and duo are in no way objective and inherent
to the entities being described. To the contrary, they express how
people feel about certain verbal behaviours, thus reflecting people's interaction
with their environment including both physical objects and abstract
phenomena. That is to say, they are interactional in the sense of
Lakoff (1987: 51).
The idea of interactional properties of reality, however, has long been
well-known and widely acknowledged in cognitive psychology. Church
(1961: xii), for instance, points out that human knowledge has an inevitable
component of ambiguity, since we repeatedly discover that properties
found in reality are in fact reflections of ourselves: projections.
Hebb (1972: 234–245), drawing on the fact that the same sensory stimulation
can give rise to completely distinct perceptions, and different stimulations
can give rise to the same perception, argues for the necessity to
distinguish perception from sensation. Although Hebb does not explicitly
claim that the properties of reality we perceive are interactional in nature,
the experimental evidence of the complexity of perception he provides
suggests this idea. Section 2.1 will address in detail the interactional properties
described by metaphors embedded in body-part metonymies.
The metonymy involving the body-parts of speech articulation is not
uniquely Chinese. Expressions that operate by the same metonymic principle
abound in languages throughout the world. For example, English
speakers are familiar with mouthpiece, give mouth to one's feelings,
badmouth someone, give someone a mouthful, the gift of tongues, have a
sharp tongue, lip service, loose lips, etc. The Japanese use kuchi ga karui
(mouth-light) 'annoyingly talkative', warukuchi (bad-mouth) 'slander',
kuchisaki dake (mouth-first-merely) 'mere words, lip service'. The Germans
say böse Zungen (evil-tongue) 'verbally vicious people', jemandem
die Zunge lösen (somebody-tongue-release) 'cause somebody to talk', mundfaul
(mouth-lazy) 'unwilling to speak', in aller Munde (in-all-mouths)
'well-known', sich den Mund verbrennen (self-mouth-burn) 'do harm to
oneself by speaking mindlessly', to name just a few. In Goossens' (2002:
359) data on English expressions, 49 out of 109 items based on body-parts
contain a body-part that is instrumental to speech. The Duden Universal
Dictionary (1996) lists roughly 30 phrasal idioms and compounds containing
the mouth and about 20 items containing the tongue as referring
to verbal behaviour in German. The crosslinguistic observance of the
metonymic conception of verbal behaviour in terms of the relevant oral
structure suggests that the universal physiological reality of speech articulation
has a powerful and predictable effect on the conceptualisation of
verbal behaviour.
Universality or near universality is also readily observed in the large
repertoire of expressions motivated by conceptual metaphors in Chinese:
(a) VERBAL BEHAVIOUR IS PHYSICAL ACTION, (b) SPEECH IS CONTAINER, (c)
ARGUMENT IS WAR and WORDS ARE WEAPONS, and (d) WORDS ARE FOOD.
These conceptual metaphors are not arbitrary conventions of the lan-
guage, but are rooted in basic human experience relevant to existence
and survival. Because of the widely shared experiential, and mostly phys-
ical, basis, the meanings of the idioms are by and large recoverable on
account of the meanings of the components, though to a varying extent.
On the other hand, because the use of language for social purposes
is largely learned cultural behaviour, cultural conventions of perceiving
and discoursing on such behaviour are likely to give rise to variations in
the conceptualisation of verbal behaviour. Variation may occur either
in the source domain or in the target. That is to say, on one side, the
ideas associated with a source domain agreed upon by a community of
speakers to define a certain abstract experience may be culture-specific
(Kövecses 2005: 12). These culture-specific ideas are congruent with
what Bruner (1990: 40) calls 'folk psychology', which embodies the interpretive
principles elaborated by a culture. For example, sweetness in the
physical domain of taste is a crosslinguistically common source of metaphor.
Yet the affective connotation of the target onto which sweetness is
mapped may vary from language to language. While sweet taste, e.g., in
(1b) zui-tian 'sweet-mouthed, apt to flatter', is mapped onto a slightly
contemptible verbal tendency in Chinese, it is usually associated with
affection and related positive emotions in English, e.g., sweetheart,
sweetie.³ On the other side, the same target domain may be approached
through different sources across cultures. For example, the quality of
garrulity is approached via the physical domain of (the lack of) weight
in Japanese, e.g., kuchi ga karui (mouth-light). In Chinese, by contrast,
it is understood in terms of physical disintegration, e.g., zui-sui (mouth-shattered).
Thus, given the role of culture in the conceptualisation of
abstract matters such as verbal behaviour, it might not be an exaggera-
tion to state, as does Bruner, that culture is constitutive of mind.
The way the present paper is organized reflects the metonymy-metaphor
continuum, with the conceptual metonymy ORGAN OF SPEECH
ARTICULATION STANDS FOR SPEECH at one pole, and the four conceptual
metaphors at the other. Section 2 focuses on the metonymically
based expressions that fall into three major subtypes depending on the
specific event types encoded in the particular form of interaction between
the speech organ metonymy and particular metaphors. Section 3 deals
with facts and principles of the major conceptual metaphors of verbal
behaviour. The relationship between conventionality and semantic predictability,
between figurativity and emotionality, between universality
and culture-specificity is addressed throughout the analysis. Section 4 addresses
the role of emotion in the metaphorisation of verbal behaviour.
Quantitative data extracted from a questionnaire survey are employed
to show the existence of a negativity bias in the affective valence of the
figurative lexicon of verbal behaviour. Section 5 concludes the article by
stating the theoretical implications and setting forth possible tasks for further
research.
The data are extracted from four different and functionally complementary
sources: (a) Yao (2000), a standard Mandarin Chinese dictionary,
(b) Zhu (2002), a standard Mandarin Chinese dictionary of idioms
and prefabricated expressions, (c) the Chinese language internet search
engine www.baidu.com, and (d) the author's own native lexical repertoire.
The dictionaries are the principal sources with regard to the semantics
of the expressions being studied. The on-line data are useful insofar as
they provide clues into the lexical status of certain colloquial expressions
that are not included in the dictionaries. The author's native knowledge
of the lexicon has been helpful in determining the direction of search for
relevant expressions both in the dictionaries and in the on-line resource
specied above. The present dataset consists of a total of 122 items. De-
spite the attempt to include as many examples as possible, the dataset is
not intended to be exhaustive and remains to be supplemented by future
explorations.
The transcription adopted here for the examples is based on pinyin,
the standard romanization system used in mainland China. Tonal
markers are omitted. Word-for-word literal glosses are provided in the
parentheses following each example, which precede a translation of conceptual
proximity. The Chinese originals of the data employed in the
study are provided in Appendix A at the end of this paper, numbered in
correspondence to the numbering of the examples in the text. Details of
the extraction of data for the analysis in section 4 are provided in the
beginning of that section. The original questionnaire is provided in Ap-
pendix B.
2. Conceptual metonymy of verbal behaviour
The systematic conceptual metonymy of verbal behaviour can be schematised
as ORGAN OF SPEECH ARTICULATION STANDS FOR SPEECH. For the sake
of simplicity, I call it the 'speech organ metonymy'. This basic metonymy
has a variety of representations in Chinese depending on the scenes or
event types being encoded. It takes the interaction between the speech
organ metonymy and a supporting metaphor to describe a particular
type of verbal behaviour. Generally, expressions based on the interaction
between the speech organ metonymy and a supporting metaphor fall
into three subtypes: (I) PROPERTY OF SPEECH ORGAN STANDS FOR PROPERTY
OF VERBAL BEHAVIOUR, (II) ACTION AFFECTING SPEECH ORGAN STANDS FOR
VERBAL ACTION, and (III) EFFECT OF A SPEECH ORGAN STANDS FOR EFFECT
OF VERBAL BEHAVIOUR. In what follows, we shall consider the subtypes in
turn.
2.1. Property of speech organ as source
To consider speech organs, a look at the mouth is fundamental. So I shall
begin by looking at expressions with the mouth (zui or kou) as a metony-
mic vehicle providing mental access to speech. Consider the compounds
involving zui in (3):
(3) a. zui-ying (mouth-hard) 'verbally stubborn, unwilling to admit an
obvious mistake'
b. zui-tian (mouth-sweet) 'marked by a readiness to utter flattering
words'
c. zui-sui (mouth-shattered) 'annoyingly talkative, garrulous, apt
to nag'
d. zui-jin (mouth-tight) 'unlikely to spread news, able to keep secrets'
Like the first two expressions, (3a) and (3b), which I already discussed in
the introductory section as (1a) and (1b), (3c) is not simply metonymic or
metaphorical, but reflects the intertwinement of both conceptual processes.
More concretely, a certain interactional property of a speech
organ metonymically refers to a property of verbal behaviour that the
organ helps to produce. The description of the interactional property is
metaphorical because sui 'shattered, broken in pieces' is taken from the
familiar domain of physical objects in which it depicts the shattered state
of breakable things. In the context set up by the speech organ metonymy,
the loss of physical integrity described by sui signifies the loss of form
and coherence, the resistance to collection, the incessantness and repetitiveness
that typify the verbal activity of nagging and the quality of garrulity.
Note, however, that sui, unlike English 'broken', which profiles a
damaged state or defect through breaking, does not profile the rather abstract
concept of damage or defect, but merely depicts the physical scene
that something is fragmented as a result of breaking. In light of the
language-specific semantic profiling that sets sui apart from 'broken', the idea
of damage and defect is absent in the imagery underlying zui-sui. This
absence inhibits the inference that, since the mouth enables one to
talk, someone with a 'broken' mouth will be unable to talk, an inference
that would have been justified if sui were a true semantic equivalent of
'broken'.⁴
The same principle of metonymy-metaphor interaction applies to (3d).
Here, the quality of tightness describes physical objects such as the lid of
a container that prevents leaking or, alternatively, a door or a gate that
can be shut tightly and securely. Metaphorically, it expresses the idea that a
person is trustworthy because he or she doesn't spread words. Thus, the
combination of the mouth metonymy and the metaphor constituted by
jin 'tight' suggests an image of the mouth as a tightly shut physical object
in the context of verbal behaviour. This combination might be thought of
as a conceptual integration or blend. That is, the verbal function of the
mouth as selectively derived from input 1, zui 'mouth', and the idea of
prohibited leaking related to an object being tightly shut as selectively
derived from input 2, jin 'tight', are projected into a blended space to
yield the distinct meaning of 'discreet in verbal behaviour and thus
trustworthy'.⁵
In view of these examples, it is crucial to point out that the physical
domains of texture, taste, configuration, etc. are systems entirely independent
of the domain constituted by the concept of a speech organ. Thus, it
appears plausible that certain 'conventional knowledge' (Kövecses 2002)
is required to make sense of the idiosyncratic collocation, say, of the concept
of sweet taste and the system of verbal behaviour constituted by the
mouth. It is likely that this conventional knowledge consists of both
knowledge of the physical world and cultural knowledge.
The functional salience of the mouth as a speech organ is also evident
in expressions where the mouth is paired with another body-part. In what
follows, I explore double metonymies in which the property of the mouth
is contrasted with that of the heart (or the bowel) in the conceptualisation
of a particular language-mind relationship. The body-parts xin 'heart'
Chinese metonymies and metaphors 249
and fu bowel refer metonymically to the mind or intention, as the heart
and the bowel are considered the loci of feelings and thoughts in Chinese
folk psychology. Let us look at the idioms in (4):
(4) a. ku-kou-po-xin (bitter-mouth-grandma-heart) 'lovingly intended
advice put in unpleasant words'
b. fo-kou-sheng-xin (Buddha-mouth-saint-heart) 'compassionate in
words and intentions'
c. fo-kou-she-xin (Buddha-mouth-snake-heart) 'nice talk, vicious
intention'
d. daozi-zui, doufu-xin (knife-mouth, tofu-heart) 'marked by a ten-
dency to utter biting words in spite of a sympathetic disposition'
e. kou-shi-xin-fei (mouth-true-heart-false) 'verbally agree, but
think the opposite'
f. xin-zhi-kou-kuai (heart-straight-mouth-quick) 'frank and
straightforward'
g. zui-tian-xin-ku (mouth-sweet-heart-bitter) 'say good words while
holding malignant intentions'
h. kou-mi-fu-jian (mouth-honey-bowel-sword) 'say good words
while holding malignant intentions'
Expressions (4a), (4b) and (4c) in this category share the form of a juxta-
position of two NPs in which the mouth (N1) and the heart (N2) are each
preceded by a modifier (X), thus X1 N1 X2 N2. In (4a), 'bitter mouth' is
contrasted with 'grandma heart' to refer metonymically to the contrast
between unpleasant speech and loving intention. The modifier 'bitter'
is used metaphorically because it prototypically describes taste in the sen-
sual domain. On the other hand, 'grandma' is used metonymically in that
a kinship term whose referent is typically associated with loving inten-
tions refers to loving intentions. The modifiers fo 'Buddha', sheng 'saint',
and she 'snake' in (4b) and (4c) are all metonymies in the sense that their
referents are the respective prototypes of mercy, virtue, and malig-
nancy. As such they provide mental access to the respective abstract
qualities. In (4d), a sharp utensil, daozi 'knife', stands metonymically
for sharpness and a culture-specific food of a soft texture, doufu 'tofu', re-
fers to softness. This softness, in turn, is a metaphor for sympathy, be-
cause touch in the sensual domain is mapped into the domain of social
emotion.
Expressions (4e), (4f), (4g) and (4h) take the structure N1 P1 N2 P2,
where the two body-part nouns are each followed by a predicate (P) to
describe the perceived relationship between verbal behaviour and thought.
(4e) can be seen as a pair of straightforward metonymies where the truth-
fulness of the mouth refers to the truthfulness of verbal behaviour and the
falseness of the heart stands for the falseness of intention. By contrast,
both (4f) and (4g) contain a metaphorical mapping that is embedded in
the metonymy based on the properties of a speech organ. The adjectival
predications zhi 'straight' and kuai 'quick' in (4f) originate in the respec-
tive domains of geometric properties and speed; tian 'sweet' and ku
'bitter' in (4g) belong to the sensual domain of taste. They are mapped into
the domain of verbal behaviour to refer to frankness in (4f), deceptive
verbal friendliness and malignant intention in (4g). In (4h), a prototype
of sweet and enjoyable food, mi 'honey', is reduced to enjoyability; a cul-
tural prototype of aggressive weapon, jian 'sword', is reduced to aggres-
siveness by way of metonymy. This metonymical mapping is embedded
in the main metonymy involving a property of a speech organ as the
source. That is, the honey-like quality associated with the speech organ
stands for the enjoyability of speech produced by that organ. The sword-
like quality associated with the bowel stands for the aggressiveness har-
boured in the bowel as the locus of emotion. On the other hand, the
transfer of enjoyability and aggressiveness available in the physical do-
main of food and weapon into the behavioural domain of language and
intention is distinctly metaphorical.
Further examples of the interactional properties of speech organ as
source are found in the following expressions involving the mouth, the
tongue, the lips and the teeth or various combinations of these oral struc-
tures. Consider (5):
(5) a. duo-zui-duo-she (much-mouth-much-tongue) 'marked by the an-
noying tendency to make unsolicited remarks or general verbal
indiscretion'
b. you-zui-hua-she (oil-mouth-lubricant-tongue) 'speak in an insin-
cere manner'
c. she-jian-kou-kuai (tongue-sharp-mouth-quick) 'verbally aggres-
sive'
d. chi-kou-bai-she (red-mouth-white-tongue) 'talking groundlessly,
irresponsibly'
e. chang-she-fu (long-tongue-woman) 'gossipy woman'
f. du-she (poison-tongue) 'ability to make hurtful remarks'
g. san-cun-bu-lan-she, liang-hang-ling-li-chi (three-inch-not-rotten-
tongue, two-row-dexterous-teeth) 'eloquence'
h. ling-ya-li-chi (nimble-incisor-dexterous-molar) 'marked by ver-
bal skill'
i. gou-zui-tu-bu-chu-xiang-ya (dog-mouth-spit-not-out-elephant-
teeth) 'A verbally mean person is unlikely to utter good re-
marks.'
In (5a), as has been mentioned with regard to (1c), the interactional quan-
titative property of the speech organ, duo 'much', refers to the excessive-
ness and indiscretion of verbal behaviour. In (5b), the sense of insincerity
is recoverable from the idiosyncratic metaphorical use of you 'oil' and hua
'lubricant'. These two materials are apt to make things smooth and slip-
pery at once. When it comes to persons, one who is slippery is insincere,
tricky and undependable. The association of slipperiness with trickiness is
not unfamiliar to speakers of English. In (5c), sharpness as an interac-
tional property of the tongue and the mouth stands for aggressiveness of
verbal behaviour produced with the help of these speech organs. Here
again, the use of jian 'sharp, pointed' and kuai 'quick' is metaphorical
because the proper understanding of these words depends on a cross-
domain transfer from the physical to the verbal.
Compared to (5a), (5b) and (5c), in (5d), the metaphorical meanings ex-
pressed by the interactional properties associated with chi 'red' and bai
'white' are less predictable. Two metaphors are indicated here. On the one
hand, the colour red describes bareness and emptiness. This metaphor re-
lies on an imagery: red is the usual colour of a newborn baby who comes
into the world naked. Thus, chi 'red' acquires a polysemous extension,
namely 'naked', by way of the birth scene. This extension is metonymic in
nature. As the scene-specic metonymic link between the two senses is re-
moved from the original physical context of birth, chi becomes a general
representation of abstract bareness and emptiness. This is a metaphoric
process. On the other hand, the white colour, being perceived as the most
colourless colour, is taken as the source for the understanding of plainness,
blankness, emptiness and similar abstract qualities. Together, in the verbal
context defined by the mouth and tongue metonymy, red and white char-
acterise the verbal behaviour of talking irresponsibly, or accusing someone
groundlessly. This example is a perfect illustration of the deep experiential
grounding of seemingly unmotivated expressions.
While the first four examples in (5) represent the metonymy MOUTH
AND TONGUE AS SPEECH, (5e) and (5f) feature the TONGUE AS SPEECH met-
onymy. (5e) is a pejorative name given to a gossipy female. The improper
size of the tongue as an interactional property stands for the excessiveness
of speech that the tongue helps to produce. In (5f), the venomous prop-
erty of the tongue gives access to the malignant nature of speech meant to
hurt. In the sense that the hazard of poison is partially mapped into the
verbal domain to describe the power of words to hurt, we are dealing
with a metaphor embedded in the metonymy PROPERTY OF SPEECH ORGAN
FOR PROPERTY OF VERBAL BEHAVIOUR.
A frequently used idiom in the oral tradition of urban narratives, (5g)
describes eloquence or verbal persuasiveness in terms of the skilfulness
of the tongue and the dexterity of the teeth by metonymy. Likewise, the
dexterity of the teeth stands for good verbal skill in (5h). As usual,
the properties of nimbleness and dexterity assigned to the tongue and the
teeth are subjective and interactional.
(5i) is conceptually more complex in that it contains two metonymies
based on interactional properties of a speech organ as source. Here,
gou-zui 'dog-mouth' is contrasted with xiang-ya 'elephant teeth', whereby zui
'mouth' and ya 'teeth' both refer to verbal behaviour per metonymy. The
respective interactional properties signalled by the modifiers gou 'dog'
and xiang 'elephant' are also metonymically derived, as the two animals
are taken as the respective prototypes of meanness and dignity. However,
in the sense that a scenario in the animal domain is used for the under-
standing of a phenomenon in human verbal behaviour, it is certain that
metaphor, too, is at work here.
Closely related to the expressions based on the speech organ meton-
ymy are items containing the words sheng 'voice', qi 'air', qiang 'accent',
and diao 'tone' or their combinations. Although these words are not
speech organ terms from the perspective of physiology, their referents
are important components of speech articulation from the perspective of
phonetics. Examples in (6), below, exhibit a similar conceptual process,
namely that the interactional property of the physical manner of speech
articulation refers to the property of verbal behaviour.
(6) a. di-sheng-xia-qi (low-voice-down-air) 'humble-toned in deference
to a superior'
b. yin-yang-guai-qi (yin-yang-anomalous-air) 'verbally elusive,
ambiguous and marked by a dubious intention'
c. kou-qi-da (mouth-air-big) 'boastful'
d. li-zhi-qi-zhuang (reason-straight-air-strong) 'talk assertively on
the ground of a strong argument'
e. you-qiang-hua-diao (oil-accent-lubricant-tone) 'speak in an insin-
cere manner'
Clearly, as (6a) shows, the humbleness of tone that characterises the sub-
missive verbal behaviour of an inferior is made accessible by the meto-
nymical depiction of the properties of low voice and weak air as compo-
nents of the articulation of speech. Imbedded in this metonymy is a
spatial metaphor of social status. Specifically, the notions of di 'low' and
xia 'down' are used metaphorically because their understanding as signal-
ling inferiority in the present context requires a two-domain mapping
from spatial perception to social relationship.
Expression (6b) provides a description of a culture-specific experience
of sarcastic verbal behaviour. The interactional properties described by
yin and yang, the two primordial principles in opposition that govern all
things, are obviously unique to the Taoist cosmology. The ambiguity in
the waxing and waning of yin and yang in the air involved in speech ar-
ticulation signifies the anomalously mixed tone that marks elusive and
ambiguous verbal behaviour. Here, the culture-specificity of the metaphor
does not seem to forbid a straightforward interpretation of the idiom on
account of the probable semantics of the components, though basic cul-
tural knowledge is necessary. (6c), too, involves the air in the mouth as a
physical component of speech articulation. In this case, the great physical
force as an interactional property associated with the release of air in
speech articulation stands for verbal exaggeration. Meanwhile, verbal
assertiveness is conceived of as strong air in (6d). (6e) exhibits the same
imagery of slipperiness as (5b) except that the mouth and the tongue are
here replaced by the accent and the tone.
By now, the discerning reader may have noticed that the expressions
based on the PROPERTY OF SPEECH ORGAN STANDS FOR PROPERTY OF VERBAL
BEHAVIOUR metonymy exhibit a remarkable syntactic regularity. That is,
they instantiate two major form-meaning pairs, or constructions. One is
the nominal construction X-N, containing a modifier (X) and a modified
entity (N), e.g., you-qiang-hua-diao (oil-accent-lubricant-tone); the other
is the subject-predicate construction N-P, containing a subject noun (N)
and a predicate (P), e.g., zui-ying (mouth-hard) and kou-mi-fu-jian
(mouth-honey-bowel-sword). The speech organ as the metonymic vehicle
is the N in both constructions. The modifier (X) and the predicate (P) are
semantically metaphorical in some cases, specifying the interactional
properties being communicated. They are metonymic in other cases, pro-
viding a prototype that typifies the interactional properties to be con-
veyed. This compositionality-based constructional regularity corresponds
to the semantic recoverability of the expressions, although the individual
lexical collocations are largely conventional. Thus, clearly, as Nunberg
et al. (1994) argue, conventionality should not be confused with non-
compositionality.
2.2. Action upon speech organ as source of metonymy
Expressions in this subcategory construe verbal behaviour in terms of a
transitive event in which the speech organ is the object being affected by
the transitive action. Thus, we have the metonymy ACTION AFFECTING
SPEECH ORGAN STANDS FOR VERBAL ACTION. Inherent in this metonymy is
the metaphorical mapping from a physical action into a verbal action serv-
ing particular social purposes. There are two typical forms. The simple VO
is usually used in everyday contexts and the V1 O1 V2 O2 construction is
used more in literary contexts. Let us first consider the expressions in (7):
(7) a. ding-zui (upward push-mouth) 'retort to the explicit criticism or
charge made by a superior'
b. dou-zui (fight-mouth) 'argue, quarrel'
c. du-zui (stuff-mouth) 'disallow someone to speak up by bribing
them'
d. cha-zui (stick (vt.)-mouth) 'chip in, interrupt'
In these expressions, the word zui 'mouth' acquires its meaningfulness
only through a metonymical highlighting of the mouth as a speech organ.
The verbal actions of arguing with a superior, quarrelling, and silencing
someone by bribing are invariably accessed via physical actions upon the
mouth as speech organ. The transitive action of ding 'upward pushing' in
(7a) contains the implicit spatial element 'upward' which metaphorically
alludes to a social hierarchy in which a superior is considered 'up'. This
metaphor enables a further metaphor, namely that an upward bodily
movement is social defiance which, by way of the mouth metonymy, is
understood as defiance in verbal communication. Similarly, dou 'fight' in
(7b) constitutes the metaphor ARGUMENT IS WAR, whereby the physical
action of fighting stands for verbal fight.
In (7c), the strategic practice of bribing someone is understood as the
physical action of stuffing someone's mouth, that is, feeding someone
with a bribe. Furthermore, the stuffing of a person's mouth in the literal
sense metonymically entails the immediate result of the action of stuffing:
the person with a stuffed mouth is physically prohibited from speaking.
This metonymy, however, acquires its relevancy only in the verbal con-
text set up by the speech organ metonymy. The complex construction
of meaning in this case, it seems, involves the creation of a novel space,
e.g., the inference of manipulating someone's verbal behaviour by giving
them a bribe, which is erstwhile unavailable in du and zui as the input. In
this sense, it may be suitable to invoke the notion of conceptual blending
as an alternative to metaphor-metonymy interaction. In (7d), the verbal
behaviour of interrupting someone's speech by taking an unjustified turn
of speech is referred to as the physical action of sticking one's mouth, as if
it were a manually manipulable object, in between someone's talk.
Clearly, as these examples show, the mouth as a speech organ invaria-
bly acts as the metonymic vehicle via which Chinese speakers understand
verbal behaviour. In addition, verbal actions are described as if they were
physical actions. Thus, the metonymy ACTION AFFECTING SPEECH ORGAN
FOR VERBAL ACTION is not purely metonymic, but relies on a metaphorical
mapping also. The idiom in (8) works by the same principle:
(8) ma-bu-huan-kou, da-bu-huan-shou (scolded-not-return-mouth,
beaten-not-return-hand) 'entirely obedient, non-resistant'
Here, the first half of the idiom describes one's failure to verbally defend
oneself in terms of returning the mouth as a speech organ. The second
half describes one's failure to physically defend oneself against physical
assault in terms of returning the hand as the most useful bodily instru-
ment for action. No doubt the mouth and the hand are taken metonymi-
cally to refer to speech and action, respectively. On the other hand, the
choice of the verb huan 'return' suggests the metaphor SOCIAL
INTERACTION IS EXCHANGE OF OBJECTS. Thus, the verb-object collocations represent
a cardinal metaphor-metonymy interaction. The metaphorical under-
standing of speaking as performing physical actions is also illustrated by
the items in (9) below. These expressions, again, invariably contain at
least one speech organ that serves as the metonymic vehicle. The verbs
preceding the nouns of speech organ, however, are used differently, vary-
ing between the literal, as in (9a), the metonymical, as in (9b), and the
metaphorical sense, as in (9f).
(9) a. xue-she (imitate-tongue) 'imitate, repeat what others say'
b. nan-yi-qi-chi (difficult-to-open-teeth) 'having difficulty in talking
about something'
c. jiao-shetou (chew-tongue) 'gossip'
d. zhang-kou-jie-she (stretch-mouth-knot-tongue) 'speechless as in
shock'
e. yao-chun-gu-she (sway-lip-puff-tongue) 'attempt to verbally
instigate'
f. gao-chun-shi-she (balm-lip-wipe-tongue) 'attempt to persuade by
nice talk'
The action of repeating what others say is understood as the action of
imitating other people's tongue, as in (9a). Opening one's teeth refers to
the start of a speech, as in (9b). While these two items are essentially met-
onymic, the items in (9c–f) seem to lean towards the metaphoric pole
in that they rely more heavily on a two-domain mapping. Consequently,
the level of figurativity and affectivity is significantly higher in these
items. The verbal activity of gossiping is made accessible by the depic-
tion of the physical scene of tongue chewing in (9c). The link is enabled
by the perception that gossiping and chewing share certain oral move-
ments involving the activities of the teeth and the tongue. Presumably,
the scene of chewing the tongue is impressionistic and interactional rather
than objective, reflecting a negative attitude towards gossip. The negative
sense probably arises from the incongruity characterising the image of
a gossiping mouth. That is, the teeth are busy doing the wrong thing:
what gets chewed is the tongue instead of food, which would have been
sensible.
Fictive dynamic actions can be observed in the conception of verbal ac-
tions in the three idioms (9d–f) involving the V1 O1 V2 O2 construction.
Specifically, the imaginary knotting of the tongue indicates the inability
to speak in (9d), the elaborate performance of swaying the lips and puff-
ing the tongue metaphorically and hyperbolically describes the verbal ef-
fort to instigate in (9e), and the physical actions of beautifying the lips
and the tongue as speech organs signify the verbal attempt to beautify
one's speech for the purpose of persuasion in (9f). Thus, what we encoun-
ter in the conception of verbal behaviour is an insistent imagination of dra-
matic physical scenes involving speech organs. This dramaticity gives rise
to a sense of irony with regard to the verbal behaviours being described.
The expressions in (10), below, show that verbal effort can be con-
ceived of in monetary terms.
(10) a. fei kou-she (spend-mouth-tongue) 'talk in vain'
b. fei zuipizi (spend-lip/mouth skin) 'talk in vain'
c. fei-she-lao-chun (spend-tongue-labour-lip) 'take a verbal effort
to convince'
Here, the effort to talk to convince someone is understood in terms of
spending one's organs of speech articulation. Thus the metaphor VERBAL
EFFORT IS MONEY, or simply SPEAKING IS SPENDING, is embedded in the met-
onymy ACTION AFFECTING SPEECH ORGAN FOR VERBAL ACTION.
The expressions discussed in this section (with 9a as an exception,
where the verb xue 'imitate, learn' is more literal than metaphorical) ex-
hibit an embedment of a metaphor in the metonymy ACTION AFFECTING
SPEECH ORGAN FOR VERBAL ACTION. The metaphor can be based on a phys-
ical action that involves the dynamic transaction of bodily energy. Alter-
natively, the metaphor may be based on the familiar experience of financial
transactions. In any case, however, the source action does not occur in re-
ality when a person engages in the target verbal action. It is merely an im-
pression, or imagination, based on our subjective experience of the target
action in the verbal domain. In this sense, it is important to recognise the
interactional properties of the action being depicted. Here again, a syn-
tactic regularity (V-O) accompanies the semantic consistency of the
expressions. This syntactic regularity reflects the cognitive grammatical
principle that basic constructions encode humanly relevant events (Lakoff
1987; Goldberg 1995).
2.3. Effect of speech organ as source of metonymy
In this subcategory, we encounter idioms that conceptualise the effect
of verbal behaviour in terms of what the organs of speech articulation
accomplish in a physical sense. Thus we have the metonymy EFFECT OF
SPEECH ORGAN FOR EFFECT OF VERBAL BEHAVIOUR. Inevitably, the meton-
ymy contains a metaphorical mapping from the physical domain into the
socio-verbal domain. Let us consider the items in (11):
(11) a. zhong-kou-shuo-jin (multitude-mouth-melt-gold) 'collective crit-
icism is destructive'
b. chi-she-shao-cheng (red-tongue-burn-town) 'blatant words are
destructive'
c. yi-kou-yao-ding (one-mouth-bite-firm) 'make a vociferous ac-
cusation'
d. xue-kou-pen-ren (blood-mouth-spray-person) 'ruthlessly attack
someone with false accusations'
e. she-zhan-qun-ru (tongue-combat-multitude-scholar) 'verbally
combat many scholars at once'
f. chi-ya-wei-huo (molar-incisor-do-harm) 'words cause harm'
g. kou-kou-sheng-sheng (mouth-mouth-voice-voice) 'repeatedly
declare'
In (11a), 'mouths of the multitude' metonymically refers to collective
criticism and 'melt gold' metaphorically expresses the idea of a highly
destructive power. A similar idea is expressed in (11b) where the destruc-
tive potentials of the red tongue metonymically stand for the destructive
potentials of blatant words. The sense of blatancy is derived from chi 'red,
naked', as has been explained with regard to (5d) in section 2.1. The phys-
ical scene of a town being burned down, however, is a metaphorical
description of abstract destruction.
(11c) and (11d) are expressions of the highly specic verbal practice of
making a false accusation. By means of the metaphorical use of yao-ding
'bite-firm', (11c) emphasizes the violent and determined manner in which
the accuser makes the charge. By employing the metaphorical senses of
xue 'bloody' and pen 'spray', (11d) emphasizes the aggressiveness and
ruthlessness of the accuser. In both cases the metaphorical imageries are
framed in the metonymy that understands the effect of speech organ as
the effect of verbal behaviour. (11e) is the war metaphor of argument
embedded in the metonymy based on the performance or effect of a
speech organ. In (11f), the negative consequence of speaking is referred
to as the potential harm done by the teeth as body-parts of speech articu-
lation. (11g) consists of the mouth word, kou, and the voice word, sheng.
Both are metonymic, referring to speech. The reduplication of the two
words for the purpose of expressing repetition seems to work metaphor-
ically in the sense of MORE OF FORM IS MORE OF CONTENT (Lakoff and John-
son 1980: 127).
In this section, I have shown that the highly schematic conceptual met-
onymy ORGAN OF SPEECH ARTICULATION FOR SPEECH is a central mecha-
nism underlying the Chinese conceptualisation of verbal behaviour.
Referentiality and figurativity (Dirven 2002: 102–105) are evident in the
speech organ metonymy: the speech organ being named cannot be taken
literally. Rather, it is invariably figurative and provides mental access to
speech-related behaviour. This metonymy is realized in three specic sub-
types each of which emphasizes a particular aspect of verbal behaviour.
The particularities of the subtypes derive from the particular metaphors
embedded in the basic speech organ metonymy.
It might be worthwhile to observe that this central metonymy interacts
with the embedded metaphors in nontrivial fashions. More specifically,
the successful metaphorical mapping into the abstract domain of verbal
behaviour as target presupposes a metonymical mapping from speech or-
gan onto speech physically produced by the speech organ within the do-
main matrix of language. Put otherwise, the speech organ metonymy,
being referential in nature, puts a relevancy constraint on the metaphori-
cal mapping such that the target is restricted to the domain of verbal be-
haviour. On the other hand, the metaphorical mapping, by virtue of its
interactional and expressive force, is responsible for the conceptual rich-
ness that arises from the imagic particulars of the source concepts.
In summary, our analysis of the relevant expressions amounts to two
generalisations. First, the conceptualisation of verbal behaviour is em-
bodied. Second, the meanings of the idiosyncratic expressions are not en-
tirely opaque, but recoverable and even predictable from the metonymic
and metaphoric senses conveyed by the components. The predictability,
however, is not an all-or-nothing matter but relative and gradient; e.g.,
(4h) kou-mi-fu-jian (mouth-honey-bowel-sword), (5f) du-she (poison-
tongue) and (9a) xue-she (imitate-tongue) may be more readily recovered
than (6b) yin-yang-guai-qi (yin-yang-anomalous-air), (7a) ding-zui (up-
ward push-mouth) and (9f) gao-chun-shi-she (balm-lip-wipe-tongue).
3. Major conceptual metaphors of verbal behaviour
As well as the conceptual metonymy based on organs of speech articula-
tion, Chinese is rich in conceptual metaphors of verbal behaviour. De-
pending on the concepts or schemata that constitute the source domain,
four conceptual metaphors of verbal behaviour are observed. They are:
(i) VERBAL BEHAVIOUR IS PHYSICAL ACTION, (ii) SPEECH IS CONTAINER, (iii)
ARGUMENT IS WAR, and (iv) WORDS ARE FOOD. Because these metaphors
arise from basic human experience, they are likely to be universal. How-
ever, as we shall see in the forthcoming paragraphs, these conceptual
metaphors may at times allow culture-specific image-schemata associated
with special affective connotations. In what follows, I shall discuss the
four conceptual metaphors underlying a large number of Chinese expres-
sions of verbal behaviour.
3.1. VERBAL BEHAVIOUR IS PHYSICAL ACTION
The conceptual process underlying this metaphor pertains to the mapping
from physical actions onto social actions of using words in interpersonal
interaction. Importantly, this metaphor is closely related to the MEANINGS
(OR WORDS) ARE OBJECTS metaphor as part of what Reddy (1979: 290) refers
to as the conduit metaphor. The expressions in (12) illustrate this process:
(12) a. gua-zai-zui-bian (hang-on-mouth-side) 'habitually or repeat-
edly say something'
b. bu-zu-gua-chi (not-enough-hang-tooth) 'not worth mentioning'
c. zhi-di-you-sheng (throw-ground-have-sound) 'making remarks
that produce resonance'
d. yi-tu-wei-kuai (one-spit-make-pleasure) 'speak up uninhibitedly
to feel good'
e. zhan-ding-jie-tie (chop-nail-cut-iron) 'speak resolutely'
f. zi-zhen-ju-zhuo (word-pour with measure-sentence-pour with
measure)
In (12a), the verbal behaviour of habitually or repeatedly saying some-
thing is conceived of as a physical event of hanging a three-dimensional
object on the side of one's mouth. In this physical domain of a transitive
action, the action of hanging has the effect that the object being hung be-
comes attached to the mouth and remains in that state of attachment. It is
this physical attachment as part of the imagery in the source domain that
gets mapped into the target domain of verbal behaviour to describe the
incessantness with which a certain unit of language is uttered. A similar
process is at play in (12b) where the teeth instead of the mouth participate
in the physical scene. In (12c) a resonant verbal expression is described as
a physical object of, presumably, substantial volume, weight and a particu-
lar texture such that it is able to produce a loud sound when thrown on
the ground. Clearly, this mapping entails that words are conceived of as
having an existence independent of people and context (Lakoff and
Johnson 1980: 11) and are capable of generating physical effects. The
need to talk about something is conceived of as a bodily urgency to spit
something out in (12d): something that one feels a great desire to commu-
nicate is described as a disturbing object in the mouth. One must spit it
out for the sake of one's well-being. (12e) describes the resolute manner
in which something is said. Determination and resolution are conceived
of as the physical potency to cut and break something as hard and un-
breakable as nails and iron. (12f) describes the discreet manner of a cir-
cumspective speaker as a careful physical action of pouring out tea or
wine with exact measure, whereby words and utterances are imaged as
the liquids being measured and dispensed from a container.
The more schematic metaphor VERBAL BEHAVIOUR IS PHYSICAL ACTION
involving a physical object has a specific instantiation in the metaphor
VERBAL BEHAVIOUR IS MANIPULATION OF A MUSICAL INSTRUMENT, illustrated
by the idioms in (13):
(13) a. dui-niu-tan-qin (towards-bovine-play-musical instrument) 'say
things that are beyond the hearer's ability to understand' (cf.
'pearls before swine')
b. da-bian-gu (beat-side-drum) 'to help someone inconspicuously
by saying things in the background'
c. zi-chui-zi-lei (self-blow trumpet-self-beat drum) 'blow your own
trumpet, boast about oneself'
d. lao-diao-chong-tan (old-tone-renewed-play) 'to talk about
something that is already old hat'
It is noteworthy that, in (13a), the lack of intelligence is metonymically
inferable from niu 'bovine animal', a typical dull creature. By contrast, a
string instrument metonymically gives rise to the inference of sophistica-
tion and good taste. Thus, even in this apparently metaphorical expres-
sion in which talking is understood as playing a string instrument, meton-
ymy is at work in tandem with metaphor.
Apart from the idioms I have discussed, there are a number of lexical
compounds in the colloquial language that encode verbal behaviours met-
aphorically by describing a transitive physical action involving a physical
object. Because of lexical entrenchment due to frequent use, it is possible
that the physical scenes behind these prefabricated compounds have be-
come washed out, if not entirely obscure, to native speakers of Chinese.
Consider (14):
(14) a. da-cha (beat-road branch) 'chip in, interrupt'
b. che-pi (pull-skin) 'chat, talk rubbish'
c. wa-ku (dig-bitterness) 'speak sarcastically'
d. chui-niu (blow-bull) 'boast'
e. pai-ma-pi (pat-horse-ass) 'toady'
f. po-leng-shui (slosh-cold-water) 'verbally discourage'
g. huo-xi-ni (mix-thin-mud) 'say neutral things to dilute the inten-
sity of a conflict between other people'
h. fa-lao-sao (let out-prison-urine stench) 'complain, grumble'
i. kai-men-jian-shan (open-door-see-mountain) 'speak directly'
j. dakai-tian-chuang-shuo-liang-hua (open-sky-window-speak-
light-words) 'talk openly'
k. xin-kou-kai-he (let-mouth-open-river) 'speak heedlessly'
Syntactically, all of these compounds can be schematised as the transitive
construction, i.e., the VO construction. However, some of them describe
semantically intransitive events, as, for example, (14b), (14c), (14d),
(14h), (14i) and (14j). This apparent mismatch between the syntactic
form and the meaning may at first glance suggest that the conventional-
ised metaphors resist an interpretation on account of the respective mean-
ings of the verb and its argument, much in the same way as 'kick the
bucket', which must be treated as an unanalysed whole. However, a closer
examination of these items will correct this initial impression. The verb-
noun collocations are analysable and can be shown to have cognitive
motivations despite their high idiosyncrasy. In each case, the semantic
motivation resides in the dynamic scene conjured by the transitive con-
struction. Some scenes are more dramatic and outrageous than others.
In (14a), beating a road branch is a guration of interruption exactly be-
cause of the semantic contribution made by road branch in its gurative
sense of deviating from the main road to a side road. In (14h), expressing
grief and discontent is understood via the image that someone opens a
window in the prison to let out the chronic, cumulate odour of urine.
While this imagery might be the height of idiosyncrasy, the constitutive
elements of grief, namely prolonged connement and intense poignancy,
are made palpable via lao prison and sao urine stench. These two
things may be considered the respective archetypes of connement and
poignancy. The same semantic recoverability based on constituent input
characterises all the other items in this list, though the degree of semantic
transparency varies from expression to expression.
Compared to the expressions in (14), the compounds in (15) are seman-
tically more transparent because the objects of the transitive actions ex-
plicitly name concepts in the domain of verbal behaviour instead of
three-dimensional things. Nevertheless, the transitive verbs per se describe
physical actions such that the verbal behaviours in question are accessed
via a metaphorical mapping.
(15) a. sa-huang (throw-lie) 'tell a lie'
b. che-huang (pull-lie) 'tell a lie'
c. zao-yao (manufacture-rumour) 'start a rumour'
Another type of the conceptual metaphor VERBAL BEHAVIOUR IS PHYSICAL
ACTION involves a transitive action the object of which is the heart.
Consider (16):
(16) a. tao-xin (pull out-heart) 'candidly communicate'
b. jiao-xin (exchange-heart) 'communicate'
c. tui-xin-zhi-fu (push-heart-put in-bowel) 'communicate with
mutual trust'
These expressions are based on the interaction between the physical
action metaphor and a metonymy that employs the heart as a body-
part to refer to thoughts. Thus, the act of communication is conceived of
as physical actions whereby the heart is treated as a manipulable object
that can be displaced or transferred between persons.
A further subtype of the physical action metaphor describes a verbal
act that is conducted upon an implicit human object. The figurativity arises
from the fact that the verb being used to describe such an act belongs to the
physical domain where it affects a physical object. Yet regular speakers of
Chinese may not be aware of the underlying metaphorical mapping and
simply learn these expressions as part of the lexicon. Consider (17):
(17) a. ding-zhuang (push upward-hit) 'verbally insult (a superior)'
b. pang-qiao-ce-ji (side-knock-side-strike) 'indirectly suggest'
c. hu-you (sway-swing) 'flatter, toady to'
d. chui-peng (blow-lift) 'lavishly praise'
e. kai-dao (open-guide) 'instruct, help to understand'
f. wai-qu (crooked-bend) 'distort'
g. da-duan (beat-broken) 'interrupt'
h. jie-lu (pull-bare) 'reveal, uncover'
The physical and mostly kinetic sense of the actions is largely latent and
may not be accessed by the average language user. Nevertheless, the lexical
semantics of these items is not arbitrary, but motivated, and cannot be
explained without reference to the conceptual process of a metaphorical
mapping from physical actions to verbal behaviours. For example, in
(17a), verbally insulting someone is understood in terms of the physical
action of upward pushing and hitting someone. Here again, the spatial
element implicit in ding 'push upward' signifies relationship in a social
hierarchy. In (17b), talking is described as knocking and striking and the
spatial terms pang 'side' and ce 'side' indicate the abstract indirectness in
verbal interaction. Specifically, SPATIAL PERIPHERY IS VERBAL INDIRECTNESS.
The rest of (17) works by the same mechanism of mapping a bodily
action onto a verbal action.
3.2. SPEECH IS CONTAINER
According to Reddy's (1979: 290) observation, the majority of English
expressions about language instantiate the conduit metaphor, of which
the container metaphor is a significant part. The same conceptual metaphor
is operative in the Chinese conceptualisation of linguistic expressions
and verbal behaviour. The expressions in (18) illustrate this:
(18) a. yan-wai-zhi-yi (word-outside-of-meaning) 'unsaid but inferable
message'
b. hua-li-you-hua (speech-inside-have-speech) 'There's a (hidden)
message in the utterance.'
c. hua-li-hua-wai (speech-inside-speech-outside) 'direct and indirect
message'
d. dakai-hua-xiazi (open-word-box) 'start to talk excessively'
e. shi-hua (filled-word) 'honest words'
f. kong-hua (empty-word) 'empty or pretentious words'
g. xian-wai-zhi-yin (string-outside-of-sound) 'unsaid or hidden
message'
The spatial concepts of li 'inside' and wai 'outside' in (18a–c) signal that
words or meanings are understood as a container that has an inside and
an outside that can be physically defined. Example (18d) takes the container
metaphor a step further by indicating that words may be imaged
more concretely as a box that can be opened or shut. As containers of
meanings, words may be filled (18e) or empty (18f). (18g) is related to
the metaphor of containment in a more complex manner. On the one
hand, we are dealing with the metaphor TALKING IS PLAYING AN INSTRUMENT,
whereby xian 'string' (as of a musical instrument) stands in a part-whole
metonymic relation to a stringed instrument. On the other hand,
the spatial concept of wai 'outside' points at the container metaphor.
3.3. ARGUMENT IS WAR
This conceptual metaphor, too, is not unique to Chinese and its near
universality is grounded in the fundamental human experience of conflict of
varying scopes. Conflicts and battles among individuals and groups have
always accompanied the history of evolution and are no doubt one of the
most entrenched experiences in human existence. This experiential basis
accounts for the fact that this metaphor is conventionalised in a multitude
of idioms and compounds in Chinese. As we shall observe in (19), inherent
in the war metaphor is the more specific metaphor WORDS ARE
WEAPONS.
(19) a. chun-qiang-she-jian (lip-spear-tongue-sword) 'disputatious verbal
exchange'
b. dan-dao-zhi-ru (single-knife-straight-enter) 'engage in a direct
verbal attack'
c. maotou-zhi-xiang (spearhead-point-to) 'aim a verbal attack at'
d. hua-feng-yi-zhuan (speech-sharp point of a weapon-one-turn)
'change the direction of verbal attack'
e. yi-yu-zhong-di (one-utterance-hit-target) 'hit the spot verbally'
f. chu-kou-shang-ren (exit-mouth-hurt-person) 'verbally insult'
g. e-yu-shang-ren (evil-language-hurt-person) 'verbally insult'
h. ren-yan-ke-wei (people-word-worth-fear) 'words are worth
fearing'
i. ti-wu-wan-fu (body-without-complete-skin) 'completely affected
by harsh verbal attacks'
j. yan-ci-ji-lie (word-rhetoric-fierce-intense) 'speak polemically'
The expressions (19a–d) explicitly name specific weapons as the source of
the metaphor. (19a) exhibits a complex interaction between the speech
organ metonymy and the weapon metaphor. On the one hand, chun
'lip' and she 'tongue' refer metonymically to speech; on the other hand,
qiang 'spear' and jian 'sword' describe the aggressive potency of speech
metaphorically by mapping weapons in the domain of war and battle
onto the domain of argument. While (19b–e) encode language in terms
of a weapon used to attack an opponent in the linguistic battle, (19f–g)
state plainly that words have the potential to wound a victim physically
and are thus feared (19h). (19i) conceives of the devastating effect of
verbal attacks in terms of the extent to which the body is physically
wounded. (19j), on the other hand, adopts the description ji-lie 'fierce
and intense', which is usually used to describe battles.
Similarly, the compounds in (20) describe verbal aggression in terms of
physical aggression that typies war and battle.
(20) a. ci-er (pierce/stab-ear) 'biting (remarks)'
b. feng-ci (sarcasm-stab) 'sarcasm'
c. mo-sha (wipe-kill) 'deny'
d. peng-ji (blow-strike) 'vehemently criticise'
e. zhong-shang (hit-wound) 'verbally defame'
All of these expressions contain at least one morpheme describing a physical
act of aggression, e.g., ci 'stab' in (20a–b), sha 'kill' in (20c), peng 'hit,
blow' and ji 'strike' in (20d) and shang 'wound' in (20e). Thus, a hostile
remark is described as a sharp object (weapon) that pierces the ear in
(20a). The verbal behaviour known as sarcasm is associated with the
aggressive physical act of stabbing, as in (20b). To verbally deny a fact is to
wipe out and even kill that fact, as in (20c). The issuance of criticism is
conceived of as a physical act of striking in (20d) and to defame someone
verbally is to wound them physically, as in (20e).
3.4. LINGUISTIC EXPRESSIONS ARE FOOD
As the most immediate survival necessity for all living organisms including
humans, food is fundamental to our existence and, as Kass (1999) and
Rozin (1999) show, influences our behaviour in the most profound ways
possible. Thus, it is likely that food constitutes a natural universal
conceptual domain that serves as the source of conceptual metaphors. Lakoff
and Johnson (1980: 152) and Kövecses (2002) have discussed the IDEAS
ARE FOOD metaphor. Kövecses (2002) has talked about the SEXUAL DESIRE
IS APPETITE metaphor, the source of which is related to the experientially
basic domain of food. In Chinese, we observe the conceptual potential of
food in the following expressions:
(21) a. tian-yan-mi-yu (sweet-words-honey-speech) 'exceedingly nice
talk intended to flatter or deceive'
b. hua-bu-dui-wei (language-not-right-flavour) 'words with some
peculiar hidden message'
c. tun-tun-tu-tu (swallow-swallow-spit-spit) 'speak with reluctance
and dishonesty'
d. yao-wen-jiao-zi (bite-text-chew-word) 'write or speak verbosely'
e. ye-ren (choke-person) 'aggressive (remarks)'
f. sheng-se (raw-puckery) 'obscure and difficult to understand'
g. tian-you-jia-cu (add-oil-add-vinegar) '(as a third party) intensify
a conflict by saying things conducive to an escalation'
The taste of food is mapped onto the agreeability of words in (21a) and
(21b). We further observe mappings that focus on food as physical objects
that can be swallowed, spat out, bitten or chewed, as in (21c) and (21d),
too hard or too coarse to swallow and thus capable of choking the eater, as
in (21e), or too unripe (as of fruit) to be enjoyable or even digestible, as in
(21f). Related to the metaphor LINGUISTIC EXPRESSIONS ARE FOOD is (21g),
in which the culinary practice of adding seasonings to enhance the flavour
of food is employed to describe a particular verbal behaviour that is
intended to intensify a conflict.
3.5. Culture-specific metaphors
Apart from the major conceptual metaphors discussed in the foregoing
paragraphs, we encounter several other metaphors that may not immedi-
ately arise from universal source domains. The three expressions in (22)
illustrate the collocation of disagreeable physical temperatures, both cold
and heat, with speech in the conceptualisation of unfriendly or sarcastic
attitude associated with certain hostile verbal acts.
(22) a. leng-yan-leng-yu (cold-speech-cold-language) 'unfriendly
speech'
b. leng-chao-re-feng (cold-irony-hot-satire) 'speak with biting
sarcasm'
c. shuo-feng-liang-hua (say-wind-cold-words) 'speak ironically'
In the background of the metaphor HOSTILE SPEECH IS ADVERSE TEMPERATURE,
verbal behaviour marked by a lack of affection, as (22a) shows,
does seem to reflect a more universal metaphor, namely AFFECTION IS
WARMTH, or LACK OF AFFECTION IS COLD. This points to the universal
figurative potential of warmth in emotion conceptualisation.
The expressions in (23) utilize the natural weather phenomena of wind
and rain to describe the mobility of unreliable verbal information. The
circulatory character of the wind and the disseminating character of the
rain seem to be the images being mapped onto spreading rumours in
the verbal domain in (23a) and (23b). (23c), by contrast, focuses on the
intentional aspect of the verbal behaviour by construing it as a volitional
transitive action of blowing wind. In addition, zhen-bian 'pillow side' is
metonymic in that the typical location of nuptial communication stands
for nuptial communication.
(23) a. feng-yan-feng-yu (wind-speech-wind-talk) 'gossip, rumours'
b. man-cheng-feng-yu (full-town-wind-rain) 'a rumour being
spread widely'
c. chui-zhen-bian-feng (blow-pillow-side-wind) 'engage in pillow
talk in order to influence the spouse's decision'
The items in (24) have as their common source domain the action of sing-
ing, which apparently gives rise to a negative connotation, though to a
varying degree of negativity:
(24) a. yi-chang-yi-he (one-sing-one-echo) 'offer mutual sympathetic
verbal response'
b. ci-chang-bi-he (here-sing-there-echo) 'offer mutual sympathetic
verbal response'
c. gao-chang-ru-yun (high-sing-into-clouds) 'propagandise a cause
or doctrine'
d. chang-gao-diao (sing-high-tone) 'carry on propaganda'
e. shuode-bi-changde-hao-ting (speaking-compare-singing-good-to
hear) 'nice talk that is unaccountable (waffling, empty promise,
or blatant flattery etc.)'
The collaborative performative act of singing and echoing in (24a) and
(24b) emphasizes the elaborated mutual responsiveness characterising a
conspicuous display of harmony. The underlying metaphor may be
schematised as COLLABORATIVE VERBAL PERFORMANCE IS COLLABORATIVE
MUSICAL PERFORMANCE. (24c) and (24d) both employ the complex metaphor
PROPAGANDA IS SINGING IN HIGH PITCH to convey the deliberate effort
involved in propagandising a cause. (24e) is an explicit comparison
between talking and singing whereby talking is said to surpass singing in
sensual agreeability. However, as common sense has it, the opposite is
usually true. That is to say, singing is perceived (or at least intended) to
be more pleasant to the ear than speaking. It is precisely this paradox in
the comparative claim that invites the inference that talk that sounds
nicer than singing is unaccountable and suspicious. This comparison is
metaphorical because auditory agreeability in the sensual domain is
mapped onto semantic and interpersonal agreeability in the verbal
domain.
As these examples show, the metaphors discussed in this subsection display
a higher degree of culture-specificity than those analysed in the previous
subsections, both with regard to the sources being mapped and in
terms of the emotional connotations behind the metaphors. Yet it is clear
that culture-specificity does not contradict the idea of an experiential
basis for the more local metaphors, but reflects a "differential experiential
focus" underlying them, to use Kövecses' (2005: 246) terms.
To summarize the section on metaphors of verbal behaviour, let us
state the basic propositions that arise from the analysis. Generally, it
appears that the concepts that constitute the source domains of all the
metaphors of verbal behaviour pertain to familiar human experiences that
can be envisaged as physical scenarios, as Semino (2005) points out in
her discussion of English metaphors of speech activity. Some of these
experiences are more fundamental to our existence and survival than others.
Concretely speaking, physical actions, food, and self-defence are probably
the most fundamental aspects of human life: we inevitably and routinely
conduct physical actions in everyday life; we depend on food and
are familiar with its properties; we are adaptively tuned to the survival
significance of conflicts and battles. Because these experiences determine
our fitness and vitality as living beings, their effects on conceptualisation
are powerful and predictable. Consequently, we draw on these experiences
in understanding the more complex and less palpable social experience
of using words in interaction with our relevant others. The sense that
our body is a container, too, is an irreducible physical experience that
influences the way we think of language. Thus, just as many cognitive
linguists (e.g., Lakoff and Johnson 1980; Sweetser 1990; Gibbs 1994, 2006;
Grady 1999) have argued, the conventionalisation of conceptual metaphors
is not arbitrary, but experientially motivated.
Consequently, it is important to note, as Nunberg and his colleagues
have argued in their study of idioms, that conventionality does not equal
arbitrariness and non-compositionality. Rather, there is a substantial
semantic recoverability on account of the lexical input that conjures a rich
gestalt of experience. This recoverability is due to the fact that the coining
of the metaphors "draw[s] on the full richness of our encyclopaedic
knowledge of our bodily and cultural experience", as Croft and Cruse (2004:
204) put it. Kövecses (2002: 207–208) speaks of "conventional knowledge"
and considers it an important conceptual factor that contributes to the
understanding of idioms. Such knowledge enables the association between
the source and the target such that the image-schematic similarity between
the two domains can be established and novel senses can be created.
From a social psychological perspective, metaphors are expressions of
emotion. The expression of emotion, as most psychologists seem to agree,
is both a communicative and a social strategy. As has been pointed out in
section 2 with regard to metaphors embedded in the speech organ metonymy,
metaphors do not name truth-conditional properties of verbal behaviour.
On the contrary, they give voice to feelings and beliefs about the
perceived particularities of various socially significant verbal behaviours.
In the sense that feelings and beliefs about verbal behaviour, universal or
culture-specific, are encoded and transmitted through metaphors, the
expressive, interactional and constitutive function of metaphors can be
specified as representing emotions with regard to verbal behaviour. This
view will be elaborated in the following section.
4. Emotion and the negative metaphor/metonymy
The definitions of emotion are various and controversial. For a working
definition, I follow Arnold (1960: 182) in considering emotion as "the felt
tendency toward anything intuitively appraised as good (beneficial), or
away from anything intuitively appraised as bad (harmful)". Accordingly,
the felt tendency associated with a positive appraisal may be called a
positive emotion and that which is associated with a negative appraisal may
be called a negative emotion. Positive and negative emotions divide the
affective space (Russell 1979). The positivity vs. negativity of an emotion
is known as affective valence.
To measure the overall tendency of the present dataset in terms of
affective valence, a questionnaire survey was conducted. The questionnaire
contained the entire dataset adopted in this paper. All 122 items were
presented in isolation, and their order was randomised. Fifty
informants were instructed to rate each item as positive, negative, or
neutral.⁶ Four informants each left one item unevaluated, giving rise to
four missing values. The total frequencies of positive, negative and neutral
ratings are presented in Table 1.
The figures in Table 1 show that the negative rating has the highest total
frequency and the positive rating the lowest. Furthermore, the total
frequency of the negative rating is significantly higher than expected. By
contrast, the total frequency of the positive rating is significantly lower
than expected. The total frequency of the neutral rating is not significantly
higher than expected. On the whole, the mismatch between the observed
and the expected frequency is significant with the positive and the
negative rating. That is, the items of the entire set were rated negative
significantly more frequently than positive.⁷ The strong
asymmetry in the rating points to a negativity bias.
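The statistic reported in Table 1 can be checked with a short calculation. The following is a minimal sketch, not the procedure actually used for the paper: it assumes that the expected frequency is the same for all three ratings (one third of the 6096 valid ratings), an assumption that is not spelled out in the text but that reproduces the reported chi-square value. All variable names are illustrative.

```python
import math

# Observed rating totals as reported in Table 1.
observed = {"positive": 990, "negative": 3005, "neutral": 2101}

# 50 informants x 122 items, minus 4 missing values.
total = sum(observed.values())

# ASSUMPTION (not stated in the text): equal expected frequency per rating.
expected = total / len(observed)

# Pearson chi-square goodness-of-fit statistic.
chi_square = sum((o - expected) ** 2 / expected for o in observed.values())
df = len(observed) - 1

# For df = 2 the chi-square survival function has the closed form exp(-x / 2).
p_value = math.exp(-chi_square / 2)

# Pearson residuals, (O - E) / sqrt(E), show which ratings drive the result.
residuals = {k: (o - expected) / math.sqrt(expected) for k, o in observed.items()}

print(f"total = {total}, expected per rating = {expected:.0f}")
print(f"chi-square = {chi_square:.3f}, df = {df}, p = {p_value:.3g}")
for rating, r in residuals.items():
    print(f"{rating:>8}: residual = {r:+.2f}")
```

Under this assumption the residuals are about -23.1 for the positive rating, +21.6 for the negative rating and +1.5 for the neutral rating, which is in line with the observation that only the positive and the negative ratings depart markedly from the expected frequency.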
This bias is not isolated from a crosslinguistic perspective. A similar
tendency has been reported by Simon-Vandenbergen (1995) in her study of
metaphors of linguistic actions in British English. White (1994: 226) points
to a preponderance of negative terms in the A'ara lexicon of emotion.
Although these are referentially different from metaphors of verbal behaviour,
the converging trend is remarkable, especially given the attitudinal
and emotional nature of such metaphors. Simon-Vandenbergen, however,
speaks of "value judgments" instead of emotions. I prefer the notion of
emotion over value judgement in the current context for two important
reasons. First, from an evolutionary psychological perspective, emotion is
a superordinate orchestrating program that directs the activities and
interactions of various subprograms including value judgement (Cosmides
and Tooby 2000: 93). To use Johnson-Laird and Oatley's (2000: 459)
words, "emotions guide our lives". And since making value judgments is
part of our lives, it follows that emotions guide value judgments. Secondly,
metaphor does not pertain to the objective conceptual representation of
the external world, but pertains to the attitudinal and affective evaluation
of percepts via recurrent basic bodily experience. Thus, metaphor assumes
an affective basis and is immediately relevant to emotion.
Table 1. Frequencies of positive, negative and neutral rating

Rating      Frequency   Percentage
Positive    990         16.2%
Negative    3005        49.3%
Neutral     2101        34.5%
Total       6096        100%
Chi-square = 1002.586; df = 2; p < 0.001

Now what do we make of the negativity bias observed in the metaphorisation
of verbal behaviour? This question is nontrivial, for such a
negativity bias must be considered peculiar or marked in the face of
the well-known Pollyanna Hypothesis, which claims that positive words
universally outnumber negative words (Boucher and Osgood 1969). To
explain the marked tendency, Simon-Vandenbergen (1995: 112) contends
that linguistic actions that are perceived as being out of the ordinary,
extreme in one way or another, i.e., too much or too little of something,
call for metaphorisation. This statement implies that metaphorisation
is a selective process. Apparently, metaphor is not used to conceptualise
any and all verbal behaviours, but primarily those that are
perceived as inadequate or negative. Moreover, it appears that metaphor
is not used merely to provide conceptual access to verbal behaviour.
Instead, it seems to construe subjective experience of verbal behaviour.
However, the recognition of the selective character and the construal
function of metaphor does not explain why there are more negative
metaphors of verbal behaviour than positive and neutral ones.
I will now propose that the predominance of negative metaphors of
verbal behaviour can be explained and predicted (a) on account of the
socio-emotional nature of verbal behaviour and (b) on account of the
basic cognitive-affective principle underlying its conceptualisation via
metaphor as a process of affective information processing. As has been
stated previously, verbal behaviour pertains to the use of language for
the purpose of communication in social interaction. The interpersonal
nature of verbal behaviour determines its socio-emotional significance.
Therefore, the conceptualisation of verbal behaviour via metaphorical
representation pertains to the processing of socio-emotional information.
To be more accurate, the metaphorisation of verbal behaviour can be
viewed as an instantiation of socio-emotional information processing
and is consequently subject to the principled patterns thereof. Bearing
this in mind, let us consider the default pattern of affective information
processing.
In cognitive psychology, it is widely accepted that people do not pay
equal attention to all the information in their surroundings. Rather, our
attention allocation is a limited capacity process and as such highly
selective (Nosofsky 1986). Hebb (1972: 88) defines attention as sensory
selectivity and considers it the distinguishing mark of the higher animal.
More recently, converging evidence has confirmed that negative social
information has a stronger impact on people than positive and neutral
information and that the processing of socio-emotional information is
automatically biased towards negative events. It has been argued that such a
bias is in keeping with our general adaptive behaviour that emphasizes
vigilance and self-defence (Baumeister et al. 2001; Rozin and Royzman
2001; Jing-Schmidt 2007).⁸ In light of the socio-emotional character of
verbal behaviour and in light of the negativity bias as the default
processing pattern with regard to socio-emotional information, the
predominance of negative metaphors of verbal behaviour as reported by
Simon-Vandenbergen and in the present study is no longer surprising but
seems to have a psychological grounding.
On a sociolinguistic note, the predominance of negatively valenced
metaphors also requires an analytic model that reflects principles of
interaction. Here I shall suggest some foundational ideas underlying such a
model. On the one hand, conventional metaphors are prefabricated and
as such convenient and reassuring. As Matisoff (1979: 110) puts it, "[t]he
security of knowing the right thing to say in a given situation is a precious
commodity" (italics in the original). Such security, I shall argue, is
strategically appreciable especially in situations where a negative message such
as dismay, contempt, disgust, indignation, anger, etc. needs to be conveyed.
Metaphorical prefabs fulfil our communicative need to express
negative emotions not in our own name, but in the name of received wisdom,
i.e., conventionalised collective emotions. This possibility is part of
what Goffman (1981: 34) calls the embedding capacity that gives us
dramatic liberties. Thus, to be able to use negative prefabs is not only
reassuring, but also empowering in that the utterances of the speaking
individual assume a collective frame of emotionality. On the other hand,
White's idea that the articulation of emotion serves a moral regulatory
function in interaction is highly relevant. Specifically, negative emotions
communicate moral discontent, which may serve as a motivation to
change undesirable situations and improve the social environment.
5. Conclusions
I conclude this paper by emphasizing four points. First, the cognitive
processes underlying the Chinese conceptualisations of verbal behaviour are
metonymy and metaphor. Metonymy and metaphor form a continuum of
figurativity. At the metonymic pole, we encounter expressions that contain
one or more salient bodily components of speech articulation which
refer to speech. At the metaphorical pole, we observe expressions in
which recurrent concrete physical actions, activities and experiences are
mapped onto abstract social behaviours involving the use of language.
Thus, common to both cognitive processes is the embodiment of the
conceptualisation of verbal behaviour.
Secondly, the metonymy-metaphor continuum is one of figurativity.
Correlating with the continuum of figurativity that extends from metonymy
to metaphor is the continuum of semantic predictability. The speech
organ metonymy is highly schematic and semantically predictable. Compared
to this metonymy, the metaphors show an increasing degree of
conventionality and variously lowered degrees of semantic predictability
depending on what aspect of the source domain participates in the mapping.
However, based on semantic compositionality, the meanings of the
metaphors are more or less recoverable.
Thirdly, many of the Chinese expressions of verbal behaviour can be
categorized in terms of universal conceptual metaphors because of the
experientially fundamental nature of their source domains. This said,
however, the particular affective valence inherent in the semantics of an
expression does suggest the reality of a culture-specific experiential focus.
For this reason, a proper interpretation of the Chinese metaphors
requires a cultural model that accounts for the culture-specific affective
valence.
Finally, the overall distribution of affective valence, characterised
crosslinguistically by a negativity bias, may be attributable to the
socio-emotional nature of verbal behaviour and the cognitive-affective patterns
underlying its perception. On this view, the metaphorisation of verbal
behaviour is not only a cognitive phenomenon, but, more accurately, a
cognitive-affective process whereby emotion plays a crucial part in the
conventionalisation of metaphors. By making reference to the larger context
of human cognitive-affective behaviour and especially emotion, the
current approach seems to have provided an adequate perspective from
which to deal with the phenomenon at hand. The intellectual significance
of this perspective is that it raises important questions for linguists with
regard to the relationship between language, cognition and emotion.
Concretely, it will be a task for future studies to determine to what extent
metaphors concerned with other target domains are related to emotion. It
is hoped that the answers will not only shed new light on our knowledge
of metaphorical language as a cognitive phenomenon, but will also carry
our understanding of emotion and language a step further.
Received 14 May 2007 University of Cologne, Germany
Revision received 26 October 2007
Appendix A: Examples in Chinese original
[Chinese-character forms of examples (1) to (24); the characters were not preserved in this text version.]
Appendix B: Questionnaire
[Questionnaire listing the 122 items, each followed by three rating boxes; the Chinese text was not preserved in this text version.]
Notes
* I thank Ewa Dąbrowska and the two anonymous expert reviewers for their comments
and suggestions, which contributed to the improvement of this paper. I wish to express
my gratitude to Jing Ting of Harbin Normal University, China, who administered the
questionnaire survey and helped me collect the data employed in section 4. I thank her
and Stefan Th. Gries of UCSB for the help they generously offered me in dealing with
the statistics on which the quantitative analysis in section 4 rests. My thanks also go to
my friend Debra Grant, who patiently studied the manuscript and improved my English.
Of course, all remaining errors are my own. Author's contact address: University of
Cologne, Department of General Linguistics, Albertus-Magnus-Platz, 50923 Köln.
Author's e-mail address: zjingsc0@uni-koeln.de
1. Throughout this paper, the term "Chinese" refers to the standard official language used
in mainland China and Taiwan, known as Mandarin.
2. The following abbreviations are adopted in the relevant glosses in this paper: 1Pl 'first
person plural', 2SG 'second person singular', RES 'resultative', Q 'question'.
3. An exception has been pointed out to me by Debra Grant: the expression "Don't try
to sweet talk your way out of it". In general, however, culture-specific food preferences
and culinary experiences may account for the contrastive semantic coloration of
sweetness. The European appreciation of confectionery is not only evident in the delight
in dessert as the culinary highlight. It is also linguistically evident in the idiomatic
expression "have a sweet tooth", which encodes the favouring of sweetness. More
importantly, the emotional significance of confectionery is such that sweets are powerful
symbols of love and their withdrawal can serve as punishment, thus carrying enormous
psychological and pedagogical consequences. The Chinese are a people known to place
great value on the entrée, which by virtue of its variety and elaboration inevitably
pre-empts the dessert, which is at most an afterthought. Within this cultural frame,
confectionery is marginalized, and even despised or considered destructive to a culinary
event if overindulged. In light of this, the word "sweet" carries very different
connotations in the two languages.
4. The meaning of "broken" may be rendered variously in Chinese as sui 'shattered', po
'damaged but unshattered', duan '(of an oblong object) broken in two or more sections',
or huai 'defective, damaged'. While the first three senses focus on the perceptual
features of the damaged object, the last one emphasizes functional damage. See Chen
(2007) for an in-depth study of the conceptualisation of cutting and breaking events in
Chinese.
5. For our example to be considered an instance of conceptual blending, it is essential to
argue against the availability of the quality of being discreet and trustworthy in either
input. It seems to me, however, such availability is not an all-or-nothing matter, but
one of degree, depending largely on how far one wishes to stretch the association with
the input meanings. In our case, something that is tight is unlikely to leak, which allows
the association that it is safe, metaphorically so when it comes to the organ of speech.
6. All the informants are undergraduate students of the Department of Education, Harbin
Normal University, China. Mandarin is the only native language of the informants. The
questionnaire was presented at the beginning of the Fall/Winter semester of 2007.
7. The fact that the total frequency of the neutral rating is slightly higher than expected can
be explained if we take into account the potential weakness of the current questionnaire
design. Because the items are presented out of context, the rating depends heavily on the
informants' abstract lexical knowledge. This is problematic especially because many
negative emotions are usually associated with behaviour in a specific situation. Thus, in
the absence of context, informants might find it difficult to make a definite judgment on
the affective valence of a certain item, which may have contributed to the relatively high
frequency of the neutral rating. This applies particularly to items that are too
infrequently used in everyday life for informants to know what they mean at all,
especially in isolation. This methodological weakness is acknowledged here and should be
overcome in follow-up research.
8. Details regarding the vast literature on negativity bias are beyond the scope of this
paper. The interested reader may consult Peeters and Czapinski (1989), Skowronski and
Carlston (1989), Pratto and John (1991), Taylor (1991), Cacioppo and Berntson (1994),
and Cacioppo et al. (1997, 1999) in addition to the three references in the parentheses.
References
Arnold, Magda B.
1960 Emotion and Personality, vol. 1, Psychological Aspects. New York: Colum-
bia University Press.
Barcelona, Antonio
2000 On the plausibility of claiming a metonymic motivation for conceptual met-
aphor. In A. Barcelona (ed.), Metaphor and Metonymy at the Crossroads,
31–58. Berlin/New York: Mouton de Gruyter.
Baumeister, Roy F., Ellen Bratslavsky, Catrin Finkenauer, and Kathleen D. Vohs
2001 Bad Is Stronger Than Good. Review of General Psychology 5(4), 323–370.
Boucher, Jerry and Charles E. Osgood
1969 The Pollyanna Hypothesis. Journal of Verbal Learning and Verbal Behaviour
8, 1–8.
Bruner, Jerome
1990 Acts of Meaning. Cambridge, MA./London: Harvard University Press.
Cacioppo, John T. and Gary G. Berntson
1994 Relationship between attitudes and evaluative space: A critical review, with
emphasis on the separability of positive and negative substrates. Psychological
Bulletin 115, 401–423.
Cacioppo, John T., Wendi L. Gardner, and Gary G. Berntson
1997 Beyond bipolar conceptualizations and measures: The case of attitudes and
evaluative space. Personality and Social Psychology Review 1, 3–25.
Cacioppo, John T., Wendi L. Gardner, and Gary G. Berntson
1999 The affect system: Form follows function. Journal of Personality and Social
Psychology 76, 839–855.
Chen, Jidong
2007 He cut-break the rope: Encoding and categorizing cutting and breaking
events in Mandarin. Cognitive Linguistics 18(2), 273–286.
Church, Joseph
1961 Language and the Discovery of Reality. New York: Vintage Books.
Cosmides, Leda and John Tooby
2000 Evolutionary Psychology and the Emotions. In Michael Lewis and Jeannette
M. Haviland-Jones (eds.), Handbook of Emotions, 2nd edition. New
York/London: The Guilford Press, 91–115.
Croft, William
2002 The role of domains in the interpretation of metaphors and metonymies.
In René Dirven and Ralf Pörings (eds.), Metaphor and Metonymy in
Comparison and Contrast. Berlin/New York: Mouton de Gruyter, 161–206.
Croft, William and Alan Cruise
2004 Cognitive Linguistics. Cambridge: Cambridge University Press.
Dirven, René
2002 Metonymy and Metaphor: Different mental strategies of conceptualisation.
In René Dirven and Ralf Pörings (eds.), Metaphor and Metonymy in
Comparison and Contrast. Berlin/New York: Mouton de Gruyter, 75–111.
DUDEN
1996 Deutsches Universal-Wörterbuch. Mannheim/Leipzig: Dudenverlag.
Evans, Vyvyan and Melanie Green
2006 Cognitive Linguistics: An Introduction. Edinburgh: Edinburgh University
Press.
Fauconnier, Gilles, and Mark Turner
2002 The Way We Think: Conceptual Blending and the Mind's Hidden
Complexities. New York: Basic Books.
Gibbs, Raymond W. Jr.
1994 The Poetics of Mind: Figurative Thought, Language, and Understanding.
Cambridge: Cambridge University Press.
1996 Why many concepts are metaphorical. Cognition 61, 309–319.
2006 Embodiment and Cognitive Science. Cambridge: Cambridge University Press.
Goffman, Erving
1981 Forms of Talk. Philadelphia, PA.: University of Pennsylvania Press.
Goldberg, Adele E.
1995 Constructions: A Construction Grammar Approach to Argument Structure.
Chicago/London: The University of Chicago Press.
Goossens, Louis
2002 Metaphtonymy: The interaction of metaphor and metonymy in expressions
for linguistic action. In René Dirven and Ralf Pörings (eds.), Metaphor and
Metonymy in Comparison and Contrast. Berlin/New York: Mouton de
Gruyter, 349–378.
Grady, Joseph
1999 A Typology of Motivation for Conceptual Metaphor: Correlation vs.
Resemblance. In Raymond W. Gibbs and Gerard J. Steen (eds.), Metaphor
in Cognitive Linguistics. Amsterdam/Philadelphia: Benjamins, 79–100.
Grady, Joseph, Todd Oakley, and Seana Coulson
1999 Blending and Metaphor. In Raymond W. Gibbs and Gerard J. Steen (eds.),
Metaphor in Cognitive Linguistics. Amsterdam/Philadelphia: Benjamins,
101–124.
Hebb, D. O.
1972 Textbook of Psychology. Philadelphia/London: W.B. Saunders Company.
Jing-Schmidt, Zhuo
2007 Negativity bias in language: A cognitive-affective model of emotive
intensifiers. Cognitive Linguistics 18 (3), 417–443.
Johnson, Mark
1987 The body in the mind: The bodily basis of meaning, imagination, and reason.
Chicago: University of Chicago Press.
Johnson-Laird, P. N. and Keith Oatley
2000 Cognitive and Social Construction in Emotions. In Michael Lewis and Jeannette
M. Haviland-Jones (eds.), Handbook of Emotions, 2nd edition. New
York/London: The Guilford Press, 458–475.
Kass, Leon R.
1999 The Hungry Soul. Chicago: The University of Chicago Press.
Kornacki, Paweł
2001 Concepts of anger in Chinese. In Jean Harkins and Anna Wierzbicka (eds.),
Emotions in Crosslinguistic Perspective. Berlin/New York: Mouton de
Gruyter, 255–290.
Kövecses, Zoltán
1999 Metaphor: Does it constitute or reflect cultural models? In Raymond W.
Gibbs and Gerard J. Steen (eds.), Metaphor in Cognitive Linguistics.
Amsterdam/Philadelphia: Benjamins, 167–188.
2002 Metaphor: A Practical Introduction. Oxford/New York: Oxford University
Press.
2005 Metaphor in Culture. Cambridge: Cambridge University Press.
Kövecses, Zoltán and Günter Radden
1998 Metonymy: developing a cognitive linguistic view. Cognitive Linguistics 9
(1), 37–77.
Lakoff, George
1987 Women, Fire, and Dangerous Things: What Categories Reveal about the
Mind. Chicago: The University of Chicago Press.
Lakoff, George and Mark Johnson
1980 Metaphors We Live By. Chicago: The University of Chicago Press.
1999 Philosophy in the Flesh. New York: Cambridge University Press.
Langacker, Ronald W.
1987 Foundations of Cognitive Grammar, vol. I. Stanford, CA.: Stanford Univer-
sity Press.
Matisoff, James A.
1979 Blessings, Curses, Hopes, and Fears: Psycho-Ostensive Expressions in Yid-
dish. Stanford: Stanford University Press.
Nosofsky, R. M.
1986 Attention, similarity, and the identification-categorization relationship.
Journal of Experimental Psychology, General 115, 39–57.
Nunberg, Geoffrey, Ivan A. Sag and Thomas Wasow
1994 Idioms. Language 70, 491–538.
Peeters, Guido and Janusz Czapinski
1989 Positive-negative asymmetry in evaluations: The distinction between
affective and informational negativity effects. In W. Stroebe and M. Hewstone
(eds.), European Review of Social Psychology, vol. 1. Chichester, UK: Wiley,
33–60.
Pratto, Felicia and Oliver P. John
1991 Automatic vigilance: the attention-grabbing power of negative social
information. Journal of Personality and Social Psychology 61, 380–391.
Reddy, Michael J.
1979 The conduit metaphor: A case of frame conflict in our language about
language. In Andrew Ortony (ed.), Metaphor and Thought. Cambridge:
Cambridge University Press, 164–201.
Rozin, Paul
1999 Food is fundamental, fun, frightening, and far-reaching. Social Research 66,
9–30.
Rozin, Paul and Edward B. Royzman
2001 Negativity Bias, Negativity Dominance, and Contagion. Personality and
Social Psychology Review 5(4), 296–320.
Russell, James A.
1979 Affective space is bipolar. Journal of Personality and Social Psychology 37,
345–356.
Semino, Elena
2005 The metaphorical construction of complex domains: The case of speech
activity in English. Metaphor and Symbol 20 (1), 35–70.
Simon-Vandenbergen, Anne-Marie
1995 Assessing Linguistic Behaviour: A Study of Value Judgements. In Louis
Goossens, Paul Pauwels, Brygida Rudzka-Ostyn, Anne-Marie Simon-
Vandenbergen, and Johan Vanparys (eds.), By Word of Mouth: Metaphor,
Metonymy, and Linguistic Action in Cognitive Perspective. Amsterdam/
Philadelphia: Benjamins, 71–124.
Skowronski, John J. and Donal E. Carlston
1989 Negativity and extremity biases in impression formation: A review of
explanations. Psychological Bulletin 105, 131–142.
Sweetser, Eve
1990 From Etymology to Pragmatics. Cambridge: Cambridge University Press.
Taylor, Shelley E.
1991 Asymmetrical effects of positive and negative events: the
mobilization-minimization hypothesis. Psychological Bulletin 110 (1), 67–85.
Turner, Mark and Gilles Fauconnier
1995 Conceptual integration and formal expression. Metaphor and Symbolic
Activity 10, 183–203.
White, Geoffrey M.
1994 Affecting culture: emotion and morality in everyday life. In S. Kitayama and
H. R. Markus (eds.), Emotion and Culture. Washington D.C.: American
Psychological Association.
Yao, Naiqiang (ed.)
2000 Xinhua Zidian (Xinhua Chinese Dictionary). Beijing: Shangwu
Yinshuguan.
Ye, Zhengdao
2001 An inquiry into sadness in Chinese. In Jean Harkins and Anna Wierz-
bicka (eds.), Emotions in Crosslinguistic Perspective. Berlin/New York:
Mouton de Gruyter, 359–404.
Yu, Ning
2000 Figurative uses of finger and palm in Chinese and English. Metaphor and
Symbol 15, 159–175.
2001 What does our face mean to us? Pragmatics and Cognition 9, 1–36.
2002 Body and emotion: body parts in Chinese expressions of emotion. In Enfield,
N. and Anna Wierzbicka (eds.), The Body in Description of Emotion:
Cross-linguistic Studies, Pragmatics and Cognition (special issue) 10, 341–367.
2003a Metaphor, body, and culture: The Chinese understanding of gallbladder and
courage. Metaphor and Symbol 18, 13–31.
2003b The bodily dimension of meaning in Chinese: what do we do and mean
with hands? In Eugene H. Casad and Gary B. Palmer (eds.), Cognitive
Linguistics and Non-Indo-European Languages. Berlin/New York: Mouton
de Gruyter, 337–362.
Zhu, Zuyan (ed.)
2002 Hanyu Chengyu Dacidian (Chinese Dictionary of Idioms). Beijing: Zhong-
hua Shuju.
Subjects in the hands of speakers:
An experimental study of syntactic subject
and speech-gesture integration
FEY PARRILL*
Abstract
Work by Russell Tomlin has shown that there is a close relationship
between the syntactic subject of an utterance and the entity the speaker's
attention is focused on while the utterance is being formulated, for
descriptions of a simple event (Tomlin 1985, 1995, 1997). The experiment
presented in this paper demonstrates that the same effect can be obtained for
a more complex event, and that attention also impacts the spontaneous
hand gestures produced along with speech. The paper shows that both
syntactic subject and the information contained in gesture can be manipulated
by changing which entity a speaker is focused on during utterance
formulation. This pattern suggests that changes in conceptualization give rise to
changes in both speech and gesture.
Keywords: co-speech gesture, attention, syntactic subject.
Cognitive Linguistics 19–2 (2008), 283–299
DOI 10.1515/COG.2008.011
0936–5907/08/0019–0283
© Walter de Gruyter
1. Introduction
Language researchers must deal with the following very fundamental and
very troublesome question: how do speakers choose between the different
syntactic structures available in their language when encoding a mental
representation? For instance, why does a person say "Mark was scared by
the raccoon" rather than "the raccoon scared Mark"? Within traditional
approaches, these two sentences are assumed to be semantically equivalent,
so why choose one over the other? Researchers tend to agree that such a
choice is driven by a difference in the speaker's underlying construal of
the situation. (Perhaps the raccoon is more central to the conversation in
the latter case.) There is wide disagreement, however, about the formal
apparatus necessary for describing the relationship between speaker
construal and grammatical structure. Some approaches require a series of
translations from one part of the language system to another: from
pragmatics, to semantics, to syntax, for example. Other approaches
suggest that syntactic structures can directly reflect the outcome of cognitive
processes. This paper advocates the latter kind of approach.
Specifically, the paper offers further support for a hypothesized link between
sentence structure and attention. A number of authors have suggested
that attention plays a major role in determining how language users
employ grammatical constructions (MacWhinney 1977; Talmy 1996, 2007,
forthcoming; Tomlin 1985, 1995, 1997). In addition, using a simple
experimental paradigm (discussed in more detail below), Russell Tomlin has
shown that the element a person's attention is focused on while she
formulates an utterance is likely to be encoded as the subject of that
utterance (Tomlin 1985, 1997). The choice between an active or passive
sentence (as in the examples above) thus reflects a difference in how the
speaker's attention is deployed.
While Tomlin shows that this pattern obtains in a number of lan-
guages, this work is open to certain criticisms. First, Tomlin focuses on a
very simple (transitive) event. Second, psycholinguistic experiments are
particularly vulnerable to the claim that a pattern observed in the
data arises from what participants think they should do in an experi-
ment (sometimes referred to as demand characteristics: Intons-Peterson
1983), rather than from the way that language works under normal
circumstances.
The experiment described in this paper provides responses to these two
criticisms. First, a modification of Tomlin's paradigm will be used to
explore the role of attention in predicting syntactic subject in descriptions of
a very complex (caused motion) event. Second, an additional source of
information will be exploited to support the claim that syntax and
conceptualization are linked: the hand gestures that people produce while
they are talking. Because these gestures are packed with meaning that is
directly connected to the meaning of the accompanying speech, they are
extremely informative about the relation between language and
conceptualization. Because they are not consciously monitored, however, they can
offer a more direct path to the speaker's conceptualization than does
speech.
There are two goals for this paper. The first is to demonstrate a
connection between conceptual structure and grammatical form, along the lines
of Tomlin's proposal. The paper will show that the effect of attention on
speech that Tomlin obtains can be observed for gesture as well. That is, both
the syntactic subject of an utterance and the information encoded in
gesture can be manipulated by changing which entity a speaker is focused on
while planning her utterance. Because gesture can provide information
about imagistic aspects of a speaker's representations (Beattie and
Shovelton 2002; Goldin-Meadow 2003; Kita 2000; Kita and Özyürek 2003;
McNeill 1992; Parrill and Sweetser 2004; Sweetser 1998), we can make
more forceful claims about how those representations might be expressed
linguistically. The second goal is to show that speech-gesture patterning
can be manipulated. Previous work on the integration of speech and
gesture takes advantage of grammatical differences across languages that are
thought to correlate with differences in the information encoded in
gesture (Kita and Özyürek 2003; McNeill and Duncan 2000). This study,
on the other hand, attempts to evoke different patterns of
conceptualization for the purposes of linguistic expression (often referred to as "thinking
for speaking": Slobin 1987, 1996) within a single language.
In what follows, extremely basic information about the kinds of
gestures that occur with speech, and their connection to that speech, will be
presented.¹ The experimental paradigm Tomlin employs will then be
reviewed, and the modifications necessary to permit an analysis of gesture
will be explained. An experiment replicating Tomlin's basic findings for
syntactic subject and extending the findings to the domain of gesture will
then be presented.
2. Coordination of speech and gesture
When people talk they typically also produce hand and arm motions, or
gestures. These gestures link spaces in front of the speaker's body to
topics in the discourse, point to elements that are not physically present,
or depict aspects of scenes the speaker is describing. In all such cases,
the gestures produced are very closely connected to the accompanying
speech. Because of this tight connection, many researchers have come to
believe that gesture is part of the language system: that language is not
just speech (or sign), but speech (or sign) plus gesture (Goldin-Meadow
2003; Liddell 2003; McNeill 1992, 2000, 2005; Núñez and Sweetser 2006;
Parrill and Sweetser 2004; Sweetser 1998).
Once language has been broadened to include gesture, discovering
exactly how the two modalities are coordinated during production presents
a major theoretical problem. In the past few decades, there have been
many significant discoveries on this front (for reviews, see
Goldin-Meadow 2003; Kendon 2004; McNeill 2005). Much of the work that
explicitly examines the encoding of semantic information in the two
channels has focused on iconic gestures (gestures that iconically depict some
aspect of an event or scene) produced when the speaker is talking about
motion. There are two reasons for this focus. First, gestures tend to be
very frequent and complex when we speak about motion. Second, motion
is a relatively well-understood semantic domain. The experiment
presented here also involves motion event descriptions, so some of this work
will be briefly reviewed.
2.1. Speech and gesture in descriptions of motion
Motion events can be studied by breaking an event up into a set of
components, as Talmy (1985) has done. These components (conventionally
placed in small caps) include PATH (the trajectory of motion), MANNER
(internal structure of the motion), FIGURE (the moving object), and others.
Talmy has examined the ways in which different languages express these
components, and has sorted languages into two groups based on how
PATH is encoded. If PATH is encoded in a satellite (e.g., a prepositional
phrase), the language is referred to as a satellite-framed language. This is
true of English, which typically uses prepositional phrases for PATH, and
conflates MANNER and activity in a motion verb (e.g., "the ball rolled down
the hill"). Verb-framed languages, on the other hand, have many verbs
encoding PATH or direction, while MANNER is encoded in a separate phrase
(e.g., Spanish "fue a Ibiza nadando", literally 'he went to Ibiza swimming').
Interestingly, studies of gesture production in motion event descriptions
reveal that speakers also gesture differently depending on whether their
language is satellite-framed or verb-framed (Kita and Özyürek 2003;
McNeill and Duncan 2000). Speakers of satellite-framed languages tend to
accompany utterances containing MANNER verbs with PATH-only gestures,
unless there is particular focus on MANNER in the description (McNeill
and Duncan 2000). Figure 1 is an example of an English motion event
description exhibiting this typical pattern. The verb ("roll") encodes MANNER,
while the prepositional phrase ("down the street") encodes PATH. The
gesture also encodes PATH. The studies presented here ask whether this
typical pattern can be manipulated by shifting the speaker's attention.
Speech: then he's like [rolling down the street]. Gesture: speaker's right
hand moves downward from left to right (PATH) while left hand holds.
The gesture occurs during the bracketed speech. The peak prosodic
emphasis is in bold. These transcription conventions, which are used
throughout the paper, are borrowed from McNeill 1992.
3. Linking attention and syntactic subject
People are very likely to gesture when describing a motion event. The
paradigm Tomlin has used to explore the link between attention and
syntactic subject involves asking participants to describe an event. For this
reason, Tomlin's paradigm can be adapted for analysis of speech-gesture
coordination.
In Tomlin's study, participants describe an event as they are watching
it unfold. In this event, two cartoon fish swim towards each other. When
they reach the center of the screen, one fish opens its mouth and swallows
the other. At two points during the scene, an arrow appears above one of
the fish, directing participants' attention to that fish, as pictured in Figure
2. The second appearance of the arrow occurs right before the swallowing
event. Participants are instructed to fix their eyes on the element to which
the arrow points.
If attention is directed to the agent fish (the fish doing the eating; in
this case, the grey fish), participants produce an active sentence to
describe the event, such as "the grey fish eats the white fish". If attention is
directed to the patient fish (the fish being eaten, as is the case in the
figure), speakers produce a passive sentence, such as "the white fish gets
eaten by the grey fish". In other words, the choice between these two
grammatical patterns, the active and the passive, is regulated by
attention. (It should be noted that while English is the focus here, Tomlin has
used this paradigm with a variety of languages: see Tomlin 1997.)
Figure 1. PATH gesture
Figure 2. Stimulus for Tomlin's experiments (redrawn from Tomlin 1997)
3.1. Adaptation of Tomlin's paradigm
For the experiment described here, Tomlin's paradigm is used with a
cartoon motion event. Participants watch a video clip involving two
elements and describe it while it is unfolding. Their attention is directed to
one of the two elements using a visual cue (an arrow). The event used
for Experiment 1 is a short segment from a cartoon (Pierce and Freleng
1950). Because this manipulation requires the event to have very specific
properties, the stimulus will be described in a fair amount of detail.
(A complete scene-by-scene description of the cartoon from which the
stimulus comes, as well as a very detailed description of this scene, can
be found in McNeill 1992, pp. 366–374.)
The cartoon depicts the antics of a cat and a bird. The cat is attempting
to catch the bird. The bird is on a building window ledge, while the cat
is below on the street. The cat climbs up towards the bird through the
interior of a drainpipe affixed to the side of the building. The bird sees
him coming and puts a bowling ball into the mouth of the drainpipe.
The viewer sees the impact within the pipe of the cat and ball colliding.
The cat then emerges from the bottom of the pipe with the ball inside his
body; one infers that he has swallowed the ball. Because the street is on
an incline, the ball begins to roll inside the cat, propelling him down the
street. The cat's legs flail in a circular motion as he rolls, though they do
not touch the ground (and thus are not the cause of motion). The event of
specific interest for the experiment is the final portion of the scene, where the
cat moves down the street with the ball inside his body, shown in Figure
3. This will be referred to as the target event.
Figure 3. Target event for experiment
This event was used because it meets certain requirements imposed by
the nature of Tomlin's attention manipulation. First, there are two
entities participating in one event (the ball and the cat). Second, the motion
event components associated with the two entities in the event are
separable. While the cat's motion involves MANNER, it also involves a simple
PATH. The ball's motion involves a very distinctive rotating MANNER. A
narrator can potentially encode one, both, or neither of these components
in gesture. PATH alone would look like Figure 1. Such a gesture encodes
trajectory, but no explicit manner of motion.² Examples of very similar
speech accompanied by MANNER alone and PATH plus MANNER gestures
are shown in Figures 4 and 5, respectively. (It should be noted that
participants also sometimes represent the flailing motion of the cat's legs in
gesture; such gestures are discussed below in the analysis section.)
Speech: so he's [rolling down the street]. Gesture: speaker's right hand
traces a circular path repeatedly.
Speech: then it is [rolling down the hill]. Gesture: speaker's left hand
traces a circular path moving downward.
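The four possibilities described above (PATH alone, MANNER alone, PATH plus MANNER, or neither) can be written down as a small coding scheme. The following Python sketch is purely illustrative: the names `Component` and `describe` are hypothetical labels introduced here, and nothing in it is taken from the coding procedure actually used in the study.

```python
from enum import Flag


class Component(Flag):
    """Motion event components a gesture may encode (hypothetical labels)."""
    NONE = 0
    PATH = 1
    MANNER = 2


def describe(encoded: Component) -> str:
    """Return the category name for the set of components a gesture encodes."""
    if encoded == Component.PATH | Component.MANNER:
        return "PATH + MANNER"
    if encoded == Component.PATH:
        return "PATH alone"
    if encoded == Component.MANNER:
        return "MANNER alone"
    return "neither"


# A downward sweep with no rotation, as in Figure 1.
print(describe(Component.PATH))
# A circular trace that also moves downward, as in Figure 5.
print(describe(Component.PATH | Component.MANNER))
```

Treating the components as combinable flags, rather than as four unrelated labels, reflects the point that PATH and MANNER are separable and can be encoded independently of one another.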
This paradigm also requires the language being spoken (English) to
have a syntactic alternation that allows either of the two elements to
appear as sentence subject. This is possible with the target event.
Participants can use a causative ("the ball rolls the cat down the hill") or an
intransitive motion construction ("the cat rolls down the hill") to describe the
event. Finally, it is possible to predict a difference in how participants
will gesture as a function of which element they encode as the subject of
their utterance. This is because this event has been used for a number of
experiments exploring speech-gesture patterning in motion event
descriptions (Duncan 1996; Kita 2000; Kita and Özyürek 2003; McNeill 1992,
2000, 2005; McNeill and Duncan 2000). This body of research has shown
that the typical way for an English speaker to describe the target event is
to say "the cat rolls down the street", and to accompany the verb phrase
with a gesture depicting PATH (the trajectory followed by the cat), but
not rotating MANNER (showing the motion of the ball). This typical
pattern is shown in Figure 1. Knowledge of this typical pattern allowed
attention to be manipulated and predictions to be made about the ways
in which gesture would change.
Figure 4. MANNER alone gesture
4. Experimental manipulation of speech-gesture integration
Before discussing the experiment and predictions, it is important to note
that this stimulus differs from Tomlin's in a number of significant ways.
First, the two fish in Tomlin's experiments are identical except in color. This
is not true of the bowling ball and the cat, which are different sizes and
shapes. Second, the visual cue that directs attention to one of the two
fish in Tomlin's stimulus is extremely carefully timed. It takes about 150
milliseconds for a person to shift attention from one target to another
(Posner and Petersen 1990). In Tomlin's stimulus, the arrow appears
over the fish 75 milliseconds before the eating event, and the eating event
is very brief (220 ms). This timing ensures that participants will not be
able to shift attention away from the entity the arrow is indicating before
beginning to describe the event. Again, this is not the case with the target
event in Experiment 1. The target event lasts 8700 ms. As a result,
although the arrow remains on the screen during the entire target event,
participants in this study can shift their attention during their descriptions
of the action. Third, both the agent and the patient in Tomlin's stimulus
are animate, whereas in the target event, one of the elements is inanimate
(the bowling ball). Animate entities are thought to make better subjects
(Bock 1986; Chafe 1976; Tomlin 1997). Finally, the event in Tomlin's
experiment lacks any kind of narrative complexity. In the target event, there
is a fair amount of visual and narrative complexity. One is required to
make inferences about what has happened (e.g., that the cat and ball
have collided, that the cat has swallowed the ball), and each new event is
relatively unpredictable. In summary, much of the careful control built
into Tomlin's paradigm is lacking here.
Figure 5. PATH + MANNER gesture
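The timing contrast described above can be made concrete with a little arithmetic. The Python sketch below uses only the durations given in the text (150 ms per attention shift, a 75 ms cue lead and a 220 ms event in Tomlin's stimulus, and an 8700 ms target event here). The function `max_sequential_shifts` is a hypothetical helper, and the assumption that attention shifts happen strictly one after another at a fixed cost is an idealization introduced for illustration, not a claim made by the study.

```python
ATTENTION_SHIFT_MS = 150  # approximate cost of one shift (Posner and Petersen 1990)


def max_sequential_shifts(window_ms: int, shift_ms: int = ATTENTION_SHIFT_MS) -> int:
    """Upper bound on back-to-back attention shifts that fit in a time window.

    Assumes (for illustration only) that shifts are strictly sequential
    and that each one takes exactly shift_ms.
    """
    return window_ms // shift_ms


# Tomlin's stimulus: the cue appears 75 ms before a 220 ms eating event.
tomlin_window_ms = 75 + 220

# Cartoon target event: the arrow stays on screen for the whole 8700 ms event.
cartoon_window_ms = 8700

tomlin_shifts = max_sequential_shifts(tomlin_window_ms)
cartoon_shifts = max_sequential_shifts(cartoon_window_ms)

print(tomlin_shifts)   # 1
print(cartoon_shifts)  # 58
```

Under these assumptions, Tomlin's window leaves room for the shift to the cued fish but not for a second shift away from it, whereas the cartoon event leaves room for dozens of shifts, which is why attention is much less tightly controlled in the present design.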
4.1. Methods
Thirty University of Chicago students participated in the experiment for
payment. All were native speakers of English. Each arrived at the
experiment room with a friend who served as a listener during the participant's
narration. (Participants produce more naturalistic narrations when not
speaking to an experimenter who they may assume has already seen
the stimulus: Parrill, forthcoming.) Each participant watched three
one-minute cartoon clips and described what was happening as the clip
unfolded. At the beginning of the experiment participants were given the
following verbal instructions:
Toward the end of the video, you will see a red arrow on the screen pointing to an
element in the video. When you see the arrow, keep your eyes on the element the
arrow is pointing to. Do not mention the arrow in your description, though. Just
keep describing whats happening in the clip.
Before watching each clip, participants were reminded of these instructions.
The first two clips were practice clips and will not be discussed further.
The third (experimental) clip was the bowling ball scene described
above. The stimuli were presented on a laptop, which was placed between
the two participants so that only the narrator could see the screen. The
experimental set-up can be seen in Figures 1, 4 and 5. During the target
event (the cat's transit down the street), a red arrow flashed either above
or below the cat, as shown in Figures 6a and 6b. These conditions will be
referred to as the "cat arrow" and "ball arrow" conditions, respectively.
Fifteen participants were in the cat arrow condition and fifteen were in the
ball arrow condition. To ensure that the arrow was actually directing
attention to the cat or the ball (not the cat's head and the cat's bottom),
participants were explicitly asked during debriefing what they thought the
arrow pointed to, and always responded appropriately ("cat" in the cat
arrow condition, "ball" in the ball arrow condition).
4.2. Predictions
English speakers typically describe this event by saying "the cat rolls down
the street," and produce a gesture that depicts path. Participants in the
cat arrow condition are expected to produce just this pattern. That is,
because their attention is directed to the cat, no change from the default
pattern is anticipated. In the ball arrow condition, however, participants
are expected to produce more utterances with the ball as the subject. This
is because their attention has been directed to the ball. They are also
expected to produce more rotating manner gestures (with or without a path
component) because the ball exhibits this kind of manner. When there is
particular focus on manner, it tends to appear in gesture (McNeill and
Duncan 2000).
4.3. Analysis
The speech and gesture each participant produced for the target event
were transcribed. Each utterance describing the target event was coded as
having either the cat or the ball as the syntactic subject. Participants'
gestures were sorted into the following categories: gestures depicting rotating
manner (an example of which can be seen in Figure 4), rotating manner
combined with path (shown in Figure 5), path alone (shown in Figure
1), and gestures depicting the manner in which the cat's legs move as he
rolls (leg manner). The latter gesture always involves two hands, moving
in flapping motions, shown in Figure 7, and is very distinct from the
rotation gestures discussed above. No gestures were produced for the target
event that did not contain motion event information.
Figures 6a and 6b. Cat and ball arrow conditions
Speech: "His feet are sort of like [not touching the ground]." Gesture:
speaker's two hands move up and down.
4.4. Results
Results describing the effect of the manipulation on speech will be
presented first. Interestingly, while directing participants' attention to the
ball did make them more likely to produce utterances in which the ball
was the subject, the manipulation also made participants in the ball arrow
condition more likely to produce multiple utterances describing the target
event. Table 1 shows the number of ball-subject utterances and
cat-subject utterances produced in the ball arrow and cat arrow conditions.
Participants in the cat arrow condition universally produced one utterance
per participant, and the cat was the syntactic subject of all of these
utterances. Participants in the ball arrow condition, on the other hand,
produced an average of 2.4 utterances. For half the utterances produced
in this condition the ball was the syntactic subject.
Figure 7. Gesture depicting MANNER of motion of cat's legs
Table 1. Number and syntactic subject of utterances
             Syntactic subject of utterance
Condition    Ball   Cat   Total
Ball arrow   18     18    36
Cat arrow    0      15    15
Total        18     33
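The per-condition figures quoted in the text (an average of 2.4 utterances and a one-half share of ball-subject utterances in the ball arrow condition) can be recomputed from the counts in Table 1. The following is only an illustrative sketch: the variable and function names are hypothetical, and the group size of fifteen is taken from Section 4.1.

```python
# Utterance counts copied from Table 1, keyed by condition and syntactic subject.
table1 = {
    "ball_arrow": {"ball": 18, "cat": 18},
    "cat_arrow": {"ball": 0, "cat": 15},
}
PARTICIPANTS_PER_CONDITION = 15  # fifteen participants per condition (Section 4.1)


def summarize(counts, n_participants):
    """Return total utterances, mean per participant, and ball-subject share."""
    total = counts["ball"] + counts["cat"]
    return {
        "total_utterances": total,
        "mean_per_participant": total / n_participants,
        "ball_subject_share": counts["ball"] / total,
    }


summary = {
    condition: summarize(counts, PARTICIPANTS_PER_CONDITION)
    for condition, counts in table1.items()
}

for condition, stats in summary.items():
    print(condition, stats)
```

The sketch uses only aggregate counts, so it cannot reproduce the Mann-Whitney comparison, which needs the number of utterances produced by each individual participant.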
The two groups differed in the number of utterances produced (Mann-Whitney
U = 108, p = 0.001). In order to explore the relationship between
syntactic subject and condition, a chi-square test for independence
was used. A chi-square statistic was calculated based on the first utterance
produced by each participant. The rationale for this analysis is as follows.
If the attention manipulation is successful, participants in the ball arrow
condition should first produce a ball-subject utterance, even if the cat is
the syntactic subject of subsequent utterances. The frequencies of
ball-subject and cat-subject initial utterances are shown in Table 2. A
chi-square test showed that condition had a significant effect on the syntactic
subject of the first utterance produced (χ²(1, N = 30) = 15, p < 0.001).
The majority of participants in the ball arrow condition first produced
a ball-subject utterance. Interestingly, the five participants who did not
begin their description of the target event with a ball-subject utterance
never produced one. In other words, the manipulation appeared to have
no effect on those participants.
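The chi-square statistic for the first-utterance data can be recomputed from the counts in Table 2. The sketch below is an illustration, not the author's analysis script: it assumes a Pearson chi-square without continuity correction (the text does not say which variant was used), and the function names are hypothetical. The p-value uses the identity that a chi-square variable with one degree of freedom is the square of a standard normal variable.

```python
import math

# First-utterance counts from Table 2.
# Rows: ball arrow, cat arrow. Columns: ball-subject, cat-subject.
observed = [
    [10, 5],
    [0, 15],
]


def chi_square_independence(table):
    """Pearson chi-square statistic (no continuity correction), df and N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, n


def p_value_df1(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


chi2, df, n = chi_square_independence(observed)
p = p_value_df1(chi2)
print(f"chi2({df}, N = {n}) = {chi2:.2f}, p = {p:.5f}")
```

Two of the four expected cell counts are 5, the conventional lower bound for the chi-square approximation, so an exact test would be a reasonable cross-check on such a small sample.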
To explore the effect of the experimental manipulation on the gestures
participants produced, gestures were sorted into those having a rotation
component (rotating manner alone: 9 produced; rotating manner with
path: 2 produced) and those not having a rotation component (path
gestures: 16 produced; gestures depicting the manner of motion of the cat's
legs: 2 produced), as shown in Table 3.
Participants in the cat arrow condition never produced gestures in
which rotation was present, while participants in the ball arrow condition
produced an average of .73 rotating manner gestures. The difference
across the two groups approaches significance (Mann-Whitney U = 150,
p = .06). Of the 11 rotating manner gestures, 63 percent accompanied
ball-subject utterances. While these numbers are too small to permit a
meaningful statistical analysis, they suggest a link between syntactic
subject and rotating manner.
Table 2. First utterance produced by each participant
             Syntactic subject of utterance
Condition    Ball   Cat   Total
Ball arrow   10     5     15
Cat arrow    0      15    15
Total        10     20
Table 3. Hand gestures
Condition    Rotation   No rotation   Total
Ball arrow   11         13            24
Cat arrow    0          5             5
Total        11         18
In this study, gestures involving rotation were assumed to reflect
thinking about the ball's motion, not the motion of the cat's legs. This is
because gestures accompanying speech about the cat's legs look quite distinct,
as noted above. Further, if rotating manner were associated with the cat,
the results above might be surprising. While the exact significance of a
gesture can only be inferred, the pattern above offers some support for
the starting assumptions of this study.
4.5. Discussion
The attention manipulation influenced which entity participants encoded
as the subject of utterances describing the target event. When attention
was directed to the cat, participants uniformly encoded the cat as the
subject of their utterances, following the typical English pattern. When
attention was directed to the ball, however, participants tended to first produce
an utterance with the ball as the syntactic subject, but then to revert to
the typical pattern. An impulse to return to the typical English pattern is
one explanation for the larger number of utterances produced in the ball
arrow condition. That is, participants strongly prefer to describe the event
with utterances in which the cat is the syntactic subject, and continue to
talk until they have done so.
Why might the cat be the preferred syntactic subject? First, as noted
above, the cat is animate and animate entities make for better syntactic
subjects (Chafe 1994; Tomlin 1995, 1997). Second, the cat is more central
to the overall narrative than is the ball. The cat is a well-known character
and appears in other scenes (making it discourse-old), whereas the ball
appears only briefly. Such factors also make an entity a better candidate
for syntactic subject (Chafe 1994; MacWhinney 1977; Tomlin 1995,
1997).
A second explanation for the larger number of utterances in the ball
arrow condition has less to do with the properties of the cat, and more
to do with the complexity of the motion event. Choosing the cat as the
subject may allow speakers to avoid encoding some of the complexity of
the event. That is, the cat's trajectory can be described without giving any
detail about the causal source of that trajectory (the ball's rotation).
While space does not permit a full discussion of the speech produced in
this study, it is noteworthy that thirteen of the participants in the cat
arrow condition described the event by saying "he's rolling down the street."
The other two used the same construction but with the verbs "scoot" and
"run." Such descriptions give very little detail about the real manner of
motion, and might even create a misleading impression of what transpired.
Participants in the ball arrow condition, on the other hand, are cued to
focus attention on the motion of the ball. As a result, they describe the
event in more detail. When participants in this condition selected the ball
as the syntactic subject, they produced a variety of constructions. While
"the ball's rolling down the street" was most frequent (57%), caused motion
constructions were also used ("the ball's pushing/dragging/rolling him
down the street": 25%), as were descriptions of the ball's location ("the
ball's in his stomach": 18%).
Variation in the number of utterances produced when describing an
event has previously been noted for speakers of different languages (Kita
et al. 2005). Such differences are assumed to be a product of the different
semantic and syntactic resources provided by the language. The current
project, however, shows that the number of utterances produced is
influenced by a speaker's decisions about what is most important for the
narrative at the moment of utterance formulation, not just by how her
language packages information. That is, these data may inform us about
thinking for speaking within a single language.
The patterns observed also indicate that speech and gesture change in
coordinated ways. Differences in the selection of syntactic subject as a
function of the attention manipulation were associated with differences
in the motion-event components appearing in gesture. Participants
produced rotating manner gestures only in the ball arrow condition, and
tended to produce them with ball-subject utterances. While there is a
large body of work exploring how speech and gesture pattern differently
as a function of the grammatical features of the language spoken, this is
the first systematic manipulation of speech-gesture integration within a
single language.
5. Conclusion
The entity on which a speaker's attention is focused while she is planning
her utterance shapes both what she says and how she gestures. This holds
true even when the event being described is very complex, thus extending
Russell Tomlin's work on a simple transitive event. The patterns observed
also indicate that while attention is a powerful force in language
production, factors such as animacy and discourse relevance also play a central
role in determining how a person will speak and gesture. These results
have implications for our understanding of the linguistic system as well
as for our understanding of language as a multimodal behavior. They
suggest that changes in conceptualization give rise to changes in both
speech and gesture. It therefore seems prudent to include gesture in our
accounts of the human language system.
Received 16 January 2007
Revisions received 24 September 2007
Case Western Reserve University, USA
Notes
* I am grateful for extensive and thoughtful comments from the editor of Cognitive
Linguistics and from three anonymous reviewers. Author's affiliation: Department
of Cognitive Science, Case Western Reserve University. Author's e-mail address:
fey.parrill@case.edu.
1. Gestures are also produced in the absence of speech. This paper focuses on a
relatively narrow subset of gesture, co-speech gestures. More general discussions of
gesture can be found in Kendon 1981, 2004; Ekman and Friesen 1972; Müller and Posner
2004.
2. One reviewer suggests that sliding manner of motion and trajectory alone (path) will be
indistinguishable, thus the gesture depicted in Figure 1 could potentially encode manner
as well, just of a different type (sliding rather than rotating). This distinction happens
not to be important for the study, however.
References
Beattie, Geoffrey and Heather Shovelton
2002 An experimental investigation of the role of different types of iconic
gesture in communication: A semantic feature approach. Gesture 1(25),
129–149.
Bock, J. Kathryn
1986 Syntactic persistence in language production. Cognitive Psychology 18(3),
355–387.
Chafe, Wallace
1976 Givenness, contrastiveness, definiteness, subjects, topics, and point of view.
In Li, Charles N. (ed.), Subject and Topic. New York: Academic Press,
27–55.
1994 Discourse, Consciousness, and Time: The Flow and Displacement of
Conscious Experience in Speaking and Writing. Chicago: Chicago University
Press.
Duncan, Susan D.
1996 Grammatical form and thinking-for-speaking in Mandarin Chinese and
English: an analysis based on speech-accompanying gestures. Unpublished
doctoral dissertation, University of Chicago, Chicago IL.
Ekman, Paul and Wallace Friesen
1972 Hand movements. The Journal of Communication 22, 353–374.
Goldin-Meadow, Susan
2003 Hearing Gesture: How Our Hands Help Us Think. Cambridge: Belknap
Press of Harvard University Press.
Intons-Peterson, Margaret J.
1983 Imagery paradigms: How vulnerable are they to experimenters'
expectations? Journal of Experimental Psychology: Human Perception and
Performance 9, 394–412.
Kendon, Adam (ed.)
1981 Nonverbal Communication, Interaction and Gesture: Selections from
Semiotica. The Hague: Mouton.
2004 Gesture: Visible Action as Utterance. Cambridge: Cambridge University
Press.
Kita, Sotaro
2000 How representational gestures help speaking. In McNeill, David (ed.),
Language and Gesture. Cambridge: Cambridge University Press, 162–185.
Kita, Sotaro and Aslı Özyürek
2003 What does cross-linguistic variation in semantic coordination of speech and
gesture reveal? Evidence for an interface representation of spatial thinking
and speaking. Journal of Memory and Language 48(1), 16–32.
Kita, Sotaro, Aslı Özyürek, Shanley Allen, Reyhan Furman and Amanda Brown
2005 How does linguistic framing of events influence co-speech gestures?
Insights from cross-linguistic variations and similarities. Gesture 5(1/2),
219–240.
Liddell, Scott K.
2003 Grammar, Gesture, and Meaning in American Sign Language. Cambridge:
Cambridge University Press.
MacWhinney, Brian
1977 Starting points. Language 53, 152–187.
McNeill, David
1992 Hand and Mind: What Gestures Reveal about Thought. Chicago: University
of Chicago Press.
2005 Gesture and Thought. Chicago: University of Chicago Press.
McNeill, David (ed.)
2000 Language and Gesture. Cambridge: Cambridge University Press.
McNeill, David and Susan D. Duncan
2000 Growth Points in thinking-for-speaking. In McNeill, David (ed.), Language
and Gesture. Cambridge: Cambridge University Press, 141–161.
Müller, Cornelia and Roland Posner (eds.)
2004 The Semantics and Pragmatics of Everyday Gestures: Proceedings of the
Berlin Conference April 1998. Berlin: Weidler.
Núñez, Rafael and Eve Sweetser
2006 With the future behind them: Convergent evidence from Aymara language
and gesture in the crosslinguistic comparison of spatial construals of time.
Cognitive Science 30(5), 401–450.
Parrill, Fey and Eve Sweetser
2004 What we mean by meaning: Conceptual integration in gesture analysis and
transcription. Gesture 4(2), 197–219.
Parrill, Fey
forthc The hands are part of the package: Gesture, common ground, and
information packaging. In Rice, Sally and John Newman (eds.), Empirical and
Experimental Methods in Cognitive/Functional Research. Stanford: CSLI
Publications.
Pierce, Tedd (writer), and Friz Freleng (director)
1950 Canary Row [Television series episode]. Los Angeles: Warner Brothers.
Posner, Michael I. and Steven E. Petersen
1990 The attention system of the human brain. Annual Review of Neuroscience 13,
24–42.
Slobin, Dan I.
1987 Thinking for speaking. In Aske, John, Natasha Beery, Laura Michaelis and
Hana Filip (eds.), Proceedings of the 13th Annual Meeting of the Berkeley
Linguistic Society. Berkeley: Berkeley Linguistic Society, 435–445.
1996 From "thought and language" to "thinking for speaking". In Gumperz,
John J. and Stephen C. Levinson (eds.), Rethinking Linguistic Relativity.
Cambridge: Cambridge University Press, 70–96.
Sweetser, Eve
1998 Regular metaphoricity in gesture: Bodily-based models of speech
interaction. Actes du 16e Congrès International des Linguistes.
Talmy, Leonard
1985 Lexicalization patterns: semantic structure in lexical forms. In Shopen,
Timothy (ed.), Language Typology and Syntactic Description. Cambridge:
Cambridge University Press, 57–149.
1996 The windowing of attention in language. In Shibatani, Masayoshi and
Sandra A. Thompson (eds.), Grammatical Constructions: Their Form and
Meaning. Oxford: Oxford University Press, 235–287.
2007 Attention phenomena. In Geeraerts, Dick and Hubert Cuyckens (eds.), The
Oxford Handbook of Cognitive Linguistics. London: Oxford University
Press.
forthc The Attention System of Language. Cambridge: MIT Press.
Tomlin, Russell S.
1985 Interaction of subject, theme, and agent. In Wirth, Jessica R. (ed.), Beyond
the Sentence: Discourse and Sentential Form. Ann Arbor: Karoma, 59–80.
1995 Focal attention, voice and word order. In Downing, Pamela and Michaela
Noonan (eds.), Word Order in Discourse. Amsterdam: John Benjamins,
517–552.
1997 Mapping conceptual representations into linguistic representations: the role
of attention in grammar. In Nuyts, Jan and Eric Pederson (eds.), Language
and Conceptualization. New York: Cambridge University Press, 162–189.
Book reviews
Catherine E. Travis, Discourse Markers in Colombian Spanish: A Study in
Polysemy. Series Cognitive Linguistics Research 27. Berlin/New York:
Mouton de Gruyter, 2005, x + 327 pp., ISBN 978-3-11-018161-6.
Hardcover EUR 98; US$128
Reviewed by Natalya I. Stolova, Colgate University, USA. Email
<nstolova@mail.colgate.edu>
Recent years have seen an increased interest in the discourse markers
(DMs) employed in the Romance languages. Discourse Markers in
Colombian Spanish: A Study in Polysemy by Catherine E. Travis is a
welcome and important contribution to this dynamic field of research. This
monograph focuses on four Colombian Spanish DMs: bueno (which is
usually rendered into English as 'well', 'ok', 'alright' or 'anyway'), o sea
(which translates into English as 'I mean', 'rather' or 'that is to say'),
entonces (which is roughly equivalent to the English 'so' or 'then') and pues
(which is similar to the English 'well', 'so', 'then'). The book is based on
the author's doctoral dissertation completed at La Trobe University
(Australia).
As Travis points out in the first introductory chapter, most previous
accounts of Spanish DMs choose a pragmatic approach which focuses
on the functions of these markers yet overlooks their meanings. Thus, on
the descriptive level the author's goal is to determine and explicate the
meanings associated with the aforementioned markers. On the theoretical
level, Travis's aim is twofold. The first theoretical goal is to test whether
her semantic approach that previously has been applied to lexical and
grammatical items could be extended to DMs. The second theoretical
objective is to provide further insights into the notion of polysemy in
discourse.
Cognitive Linguistics 19–2 (2008), 301–348
DOI 10.1515/COG.2008.012
0936–5907/08/0019–0301
© Walter de Gruyter
Since Discourse Markers in Colombian Spanish is a corpus-based study,
in Chapter 2 ("Methodology") Travis describes her data collection
process. The tokens that she analyzes were extracted from audio recordings
of the spontaneous conversational Spanish of the city of Cali,
which is situated in south-western Colombia. The recordings were made
by a Cali native research assistant who took part in every conversation.
The other 15 participants (10 women and 5 men) from the 21–55
age group included the research assistant's family, friends, and colleagues.
In order to avoid the observer's paradox Travis herself did not participate
in the conversations and was not present when they took place. The
participants were aware that they were being recorded and were told that
the author was interested in politeness in Colombian Spanish. Out of
the 12 hours of recordings collected by the research assistant, Travis
extracted 4 hours 48 minutes' worth of conversation (40,900 words; 15,500
intonation units) because of their high acoustic quality. These 4 hours 48
minutes represent 13 conversations that took place in a series of settings
(e.g., university cafeteria, restaurant, a participant's house, etc.) and
covered a range of topics (e.g., food, things to be done around the house,
work, university life, etc.). When the database did not contain a particular
function on which Travis needed to comment she used some examples
from previous studies as well as some examples from novels by
contemporary Colombian authors.
Like many linguistic labels, the term "DMs" is highly problematic.
Alternative names such as "interjections", "particles", "pragmatic
markers", "discourse connectives", "discourse operators", muletillas
(literally 'little crutches') and conectores pragmáticos ('pragmatic
connectives') have been suggested. As Travis points out (p. 27), such an array
of terms reflects the variety of interpretations about the defining features
of these linguistic elements. Therefore she starts Chapter 3 ("What are
discourse markers?") by surveying the main perspectives on DMs that
have been proposed to date: 1) the theoretical approach represented by
such scholars as Deborah Schiffrin and Bruce Fraser who view DMs as
markers of sequential relations, 2) Relevance Theory, which treats
DMs as carriers of procedural meaning, 3) Argumentation Theory,
which treats DMs as encoding argumentational relations, and 4) the
theory of grammaticalization, which focuses on the historical process
through which DMs have emerged. Following this literature review¹
the author discusses the formal features of DMs such as prosodic and
syntactic independence, indeterminate semantic scope, use by a primary
speaker, as well as their functional features such as playing a role in
contextualizing utterances and in speaker-hearer interaction.
In the final section of Chapter 3 Travis outlines the theoretical
approach that she chooses to adopt, namely the Natural Semantic
Metalanguage (NSM). The NSM, which was originated by Anna Wierzbicka
in the early 1970s, is an approach to semantic analysis based on reductive
paraphrase using a small collection of semantic primes. The method
of reductive paraphrase consists of breaking semantically complex
concepts/words down into combinations of simpler concepts/words. As
for semantic primes, these are understood as atomic, primitive meanings
present in all human languages. For example, the complex word/concept
"lie" can be analyzed in NSM as follows: "X lied to Y" = "X knew it was
not true / X said it because X wanted Y to think it was true / people
think it is bad if someone does this". According to the NSM, semantic
primes represent elements of linguistic conceptualization and this focus
on conceptualization is what places this theoretical approach within the
broader framework of cognitive linguistics.
Travis starts her analysis in Chapter 4 with the DM bueno ('well', 'ok',
'alright' or 'anyway'). She groups the 92 tokens of this DM found in her
corpus into six different functional categories: (1) expressing acceptance,
(2) initiating a leave-taking, (3) prefacing a dispreferred response, (4)
marking a reorientation in topic, (5) marking a correction, and (6)
introducing direct speech. These six functions represent four different
meanings. The context in which the marker occurs makes it possible for a single
meaning to encompass several functions. For example, the first meaning
of bueno that Travis identifies can be represented with the following script
(p. 92):
bueno₁
1. you said something to me now
2. I think that you want me to say something now
3. I say: this is good
Based on the context, bueno₁ can function as a marker of acceptance as
well as a marker of pre-closing the conversation, as illustrated by
examples (1) and (2), respectively² (pp. 88; 94–95):
(1) Angela: Julia, ahí le dejamos dos pancakes. . . . Coma --.
    Julia: ¿Hah?
    Angela: Cómase una arepa también, ¿oyó?
    Julia: Bueno, mijita. Gracias.
    Angela: Julia, we've left two pancakes for you. . . . Have --.
    Julia: Hah?
    Angela: Have an arepa as well, OK?
    Julia: Bueno, my love. Thank you.
(2) Patricia: Faltan veinte para las tres.
    Claudio: Huy, ooh vea, que estoy un rato.
    Patricia: Faltan veinticinco.
    Claudio: Bueno, mi querida estimada.
    Angela: Bueno, Claudio, muchas gracias.
    Patricia: It's twenty to three.
    Claudio: Ooh, hey, I've been here a while.
    Patricia: It's twenty-five to.
    Claudio: Bueno, my esteemed darling.
    Angela: Bueno, Claudio, thank you very much.
Bueno in expressing acceptance from example (1) and bueno in initiating a
leave-taking from example (2) constitute the same meaning because in
both cases the marker signals an acceptance of some sort, whether that
of an offer of food or that of a conversation closure.
The second meaning of bueno that Travis identifies is labeled
"dispreferred response" (p. 101):
bueno₂
1. you said something to me now
2. I know that you want me to say something now
3. I say: this is good
4. I want to say something more about this
As can be seen from example (3) (p. 97), its function is to encode the
recognition of the validity of the previous comment or question, while
indicating that the main response is still to follow:
(3) Santi: ¿Sabes cómo le miden la edad a un árbol?
    Angela: ¿Con el carbono catorce?
    Santi: Bueno, también. No me acordaba de eso.
    Santi: Do you know how the age of trees is measured?
    Angela: With carbon-14?
    Santi: Bueno, that too. I didn't remember that.
The third meaning (bueno₃), which is labeled "reorientation", plays
the contextualizing role of accepting the prior discourse in order to allow
the discourse to move on (p. 112):
bueno₃
1. someone here said something
2. I say: this is good
3. someone here can say something else now
For instance, in example (4) the interlocutors cannot remember the name
of the beach about which one of them is talking, but since they seem to
know the beach to which she is referring the DM bueno indicates that the
conversation can carry on (p. 109):
(4) Omar: No me acuerdo cómo se llama.
    Rosario: Bueno. ¿Una vez fuimos allá, llegando unos amigos
    alemanes?
    Omar: I don't remember what it's called.
    Rosario: Bueno. We went there once, when some German friends
    came to visit?
Bueno₄ ("correction") serves to slightly modify the position that the
speaker has previously expressed (p. 116):
bueno₄
1. I said something now
2. I think that someone can say: I don't think the same
3. I say: this is good
4. I want to say something more about this now
As can be seen from example (5), the self-correction usually is not triggered
by an objection expressed by one of the interlocutors but rather is
motivated by an unexpressed objection that the speaker can foresee (p. 113):
(5) Angela: Allá las olas son más grandes, el mar es como más
    alborotado, me parece a mí. Y la arena -- Bueno, Cartagena no
    es clarito, ¿cierto?
    Angela: There, the waves are bigger, the sea is like more rough, it
    seems to me. And the sand -- Bueno, in Cartagena the sea
    is not clear, is it?
The DM in question can also be employed to quote one of the
functions identified above. This direct speech usage, as illustrated in
example (6) (p. 118), highlights the fact that the direct speech presented is
drawn from a specific discourse context to which it was responding:
(6) Santi: Por ejemplo, vos te casaste, y tu mamá dijo, bueno, les
    colaboro con dos cientos mil pesos mensuales.
    Santi: For example, you got married and your mum said, bueno,
    I'll help you out with 200,000 pesos monthly.
Travis argues that all the functions of bueno discussed above are
centered around the notion of saying something good about the prior
discourse, and that this common element derives directly from the
adjective bueno 'good'. Therefore she treats the DM and the adjective as
polysemous uses of the one form. She also hypothesizes that on the
diachronic level bueno₂ and bueno₃ represent two independent developments
from bueno₁, while bueno₄ is a further development of bueno₂.
The focus of Chapter 5 is on the DM o sea ('I mean', 'rather' or 'that is
to say'). Travis found 93 tokens of o sea in her corpus and classified them
according to five different functions: 1) clarification, 2) utterance
completion, 3) repair, 4) side comment, and 5) conclusion. The first three
functions fall under one single meaning captured by o sea₁ (p. 141), which
refers to what is meant by a preceding utterance produced by the same speaker:
o sea₁
1. I said something with some words
2. I say: I want to say something more about this
3. because I want you to know what I want to say with these words
Examples (7) through (9) illustrate the clarification, the utterance
completion and the repair functions of o sea₁, respectively (pp. 136; 143;
148–149):
(7) Sara: Cuando se paga semestral, el pago es por caja. O sea, en
    efectivo o en cheque.
    Sara: When it's paid six-monthly, it's paid to a cashier. O sea, by
    cash or by check.
(8) Angela: Pero ya me canso oírlos, o sea, hablar mal de otras
    personas.
    Angela: But I get sick of hearing them, o sea, speak badly about
    others.
(9) Angela: Ella -- Ella me dijo que -- o sea, que le daba miedo.
    Angela: She -- She told me that -- o sea, that she was afraid.
In contrast to o sea₁, which gives the speaker an opportunity to
reformulate something said, or implied, or something he/she is in the process
of expressing, o sea₂ introduces a side comment which contains
information new to the discourse (p. 158):
o sea₂
1. someone said something
2. I say: I want to say something more about this with some words
3. I want you to know what I want to say with these words
Travis sees this use illustrated in example (10) (pp. 151–152) as a further
development of the notion of reformulation:
(10) Santi: que cosas hablara también de uno, mi amor.
     Angela: Pues, mi amor, o sea, . . . no se trata de atacar, ni de
     nada. Pero, si tú te pones a observar, la misma situación
     se da en t -- en tu casa.
     Santi: What would she also say about me, my love.
     Angela: Pues, my love, o sea, . . . I don't mean to have a go at
     you, or anything. But if you start to observe, the same
     situation is found in -- in your house.
O sea₃ involves reformulation as well, but in its case the reformulation
expresses a conclusion that can be drawn from prior discourse (p. 163), as
can be seen in example (11) (p. 160):
o sea₃
1. someone said something with some words
2. I say: I can know something because of this
3. I want to say what I can know
(11) Javier: Estas llaves, ¿de dónde son? ¿de la casa de Roberto?
     Santi: Una es de la oficina.
     Javier: O sea, que --
     Santi: [otra es de] donde la casa.
     Javier: . . . O sea, que esta quedó mamando sin llaves de la
     oficina hoy. . . . ¿Ahí todos tienen llaves de oficina?
     Javier: What are these keys to? To Roberto's house?
     Santi: One is for the office.
     Javier: O sea, que --
     Santi: another is for the house.
     Javier: . . . O sea, que he's -- he's sucking his thumb without office
     keys today. . . . Does everyone there have office keys?
While, as stated earlier, the DM bueno and the adjective from which it
derives constitute polysemous uses of the one form, the DM o sea and
its etymon (i.e., the combination of o 'or' and sea '3p.Sg.Pres.Subj.' of
ser 'to be') are, according to Travis, homonymous. As for the historical
evolution of its functions, Travis hypothesizes that the clarification
function gave rise to the utterance completion and the repair functions,
on the one hand, and to the conclusion function, on the other hand, and
that the utterance completion and the repair uses eventually produced the
side comment option.
In Chapter 6 the author takes up the DM entonces ('so' or 'then').
Travis' corpus contains 201 tokens of this unit, which she groups into five
functional groups. The first two functions (prefacing result and
highlighting main clause), illustrated with examples (12) and (13),
respectively, are grouped together under a single meaning, entonces₁ (p. 186):
entonces₁
1. I said something (X)
2. I say: because of X, Y
(12) Paula: Tenemos que costarlo nosotros.
Angela: Mhm.
Paula: Entonces, queremos comparar la calidad de trabajo, y el
precio.
Paula: We have to pay for it ourselves.
Angela: Mhm.
Paula: Entonces, we want to compare the quality of the work,
and the price.
(13) Celia: Le digo yo, si usted es una buena -- buena abogada,
entonces, puede tranquilamente trabajar, libre de meterse
con los políticos, y hacer sus cosas sucias y torcidas.
Celia: I say to her, if you are a good -- good lawyer, entonces, you
can calmly work, without getting involved with politicians,
and doing things that are dirty and corrupt.
The next two functions (prefacing a response and closing a response)
(examples 14 and 15, respectively) fall under entonces₂ (p. 206):
entonces₂
1. you said something
2. I say: I know something because of this
3. because of this I say this:
(14) Santi: . . . Y yo, a esa hora tengo que estar en la empresa para
ver si -- si recupero esta plata de María Elena Gómez.
Angela: No, sí, entonces, yo me voy temprano a la Universidad.
Santi: . . . And I have to be at the company right at the same
time, to see if -- if I can get back that money from María
Elena Gómez.
Angela: No, yes, entonces, I'll go to the university early.
(15) Fabio: Lo que más nos gusta hacer a Clara y a mí es eso. Salir a
comer.
Angela: Comelones. Ahí tenemos que salir, entonces. Conoces La
Carbonera?
Fabio: . . . No.
Fabio: What Clara and I most like to do is that. Go out to eat.
Angela: Gluttons. We'll have to go out some time, entonces. Do
you know The Carbonera?
Fabio: . . . No.
In addition to the two scripts (entonces₁ and entonces₂) listed above,
Travis identifies a third option (p. 222), which corresponds to the function
labeled discourse progression (example 16) (p. 216):
entonces₃
1. someone here said something
2. I say: because of this, I say this:
(16) Angela: dentro de las cuatro . . . semanas de cualquier grabación,
que quiero que se borre. . . . Entonces, aquí tenemos que
firmar.
Angela: within four weeks from [the date of ] any recording,
which I wish to be erased. (reading) . . . Entonces, we
have to sign here.
The functions of entonces illustrated above share the notion of marking
information as a result of prior discourse. Its use as a DM derives from
a consecutive conjunction which itself goes back to a temporal adverb
entonces with the meaning 'at that time'. Travis argues that the temporal
adverb and the DM are homonyms. She also hypothesizes that the prefacing
result function gave rise to the highlighting main clause and prefacing a
response functions, and that the latter gave rise to the functions labeled
closing a response and discourse progression.
The focus of Chapter 7 is the DM pues ('well', 'so', 'then'). Travis finds
a total of 219 tokens of this marker in her corpus, which she classifies
into seven functional groups: 1) adding information, 2) serving as a focus
device, 3) introducing a repair, 4) prefacing a response, 5) prefacing an
answer, 6) introducing direct speech, and 7) making topic completion.
According to the author, functions 1 through 6 correspond to a single
meaning (pues₁), represented as the following script (p. 267):
pues₁
1. someone here said something
2. I say: because of this I want to say something more about this
Examples (17) through (22) (pp. 241–242; 249; 262; 264; 271; 276)
illustrate functions 1 through 6, respectively:
(17) Sara: Esto se puede pagar mensual o semestral o anual. ¿Sí?
Angela: Mhm.
Sara: Pues, cuando se paga semestral o anual, se puede dar
cheque, o en efectivo.
Sara: You can pay this monthly, or six-monthly, or annually.
OK?
Angela: Mhm.
Sara: Pues, when you pay six-monthly or annually, you can
pay by check, or in cash.
(18) Angela: . . . Y cuando ya, pues, tengamos pa la nevera, pues,
cambiamos nuestra nevera.
Angela: . . . And when, already, pues, we have [the money] for the
fridge, pues, we'll change the fridge.
(19) Patricia: . . . que estaba tan de m-- pues, tan en furor --
Angela: Tan de moda.
Patricia: when it was so -- pues, all the rage.
Angela: So fashionable.
(20) Clara: Mi reloj no quiere trabajar más.
Angela: Pues, cámbiese el reloj, hermana.
Clara: Está perezoso.
Clara: My watch doesn't want to work anymore.
Angela: Pues, get a new watch, sister.
Clara: It's lazy.
(21) Santi: Ah, ¿es que son varios casetes? . . . ¿Varios casetes tenés
que entregarle?
Angela: Pues, yo creo que por ahí, unos tres, unos cuatro.
Santi: Oh, so it's several cassettes? . . . Several cassettes that you
have to give her?
Angela: Pues, I think about three or about four.
(22) Nuri: Cuando veo que esa pelada la pierde, le dije, no, pues, dale
una oportunidad.
Nuri: When I saw that girl was going to fail, I said to her
[the coordinator of the course], no, pues, give her an
opportunity.
The seventh function that Travis identifies (i.e., topic completion)
corresponds to the meaning pues₂ (p. 283) and is illustrated in example (23)
(pp. 282–283):
pues₂
1. someone here said something
2. I say: because of this, I said something more about this now
3. I don't want to say anything more about this
(23) Fabio: Se comunicó contigo. Chévere.
Angela: Mhm.
Fabio: Se hicieron amigas, pues.
Angela: Más o menos.
Fabio: She got in touch with you. Great.
Angela: Mhm.
Fabio: You made friends, pues.
Angela: More or less.
On the diachronic level the DM in question derives from Latin POST (a
temporal and spatial adverb meaning 'after'), which produced the Old
Spanish pues, employed as a temporal adverb and as a causal conjunction.
In contemporary Spanish the former usage has been lost while the latter
has survived and, according to Travis, pues as a causal conjunction and
pues as a DM represent a case of polysemy. As for the paths of development
responsible for the creation of the multiple functions of pues, the author
hypothesizes that the function labeled adding information gave rise to the
four functions of serving as a focus device, prefacing a response,
prefacing an answer, and making topic completion, while at a later stage
the focus device function made possible the evolution of the repair one and
the prefacing responses/answers function created the one labeled direct
speech.
The account provided by Travis makes several important contributions
to the field. First of all, through the application of the NSM approach to
discourse data the author was able to define the DMs, making an explicit
distinction between their meanings (i.e., the meta-messages independent
of the context) and their functions (i.e., a combination of what is
inherent in the markers with what is attributable to the context). As Travis
points out (p. 75), these newly constructed definitions are particularly
valuable for foreign language learners, cross-linguistic studies and
research on Spanish concerned with the markers' subtle similarities and
differences that cannot be grasped when broad functional labels are used.
Furthermore, the study lays the ground for further research by identifying a
series of possible directions for future work. As demonstrated throughout
the book and as stressed by the author in the concluding chapter (pp.
290–293), her findings raise the following issues that await investigation:
the relationship between prosody and function; the distribution of the
different functions across the different markers; a comparison of the
contextualizing role of markers across languages and dialects; differences in
the functional distribution in the use of DMs in spoken and written language;
pragmatic extensions of DMs cross-linguistically and cross-dialectally;
the insights these elements provide into a specific culture; and the
relationship between the various functions on the diachronic level. In sum,
Discourse Markers in Colombian Spanish will be of great interest to
linguists working within the cognitive approach, to researchers carrying
out corpus-based studies, to scholars interested in DMs, as well as to
Hispanists at large.
Notes
1. While overall Travis' review of literature is very thorough and up to date, her discussion
of the previous studies that treat the Spanish DMs diachronically could have benefited
from references to the following works: Garachana Camarero 1998a, 1998b; Cuenca
1998; Cuenca and Marín 2000; Iglesias Recuero 2000; and González Ollé 2002.
2. Travis divides all of her examples into intonation units and provides morpho-syntactic
glosses. We omit this information to save space.
References
Cuenca, Maria Josep
1998 Processi di grammaticalizzazione: il caso dei connettivi in catalano e in
spagnolo. In Ruffino, Giovanni (ed.), Atti del XXI Congresso Internazionale di
Linguistica e Filologia Romanza (Palermo, 18–23 de septiembre de 1995),
volume II. Tübingen: Niemeyer, 195–204.
Cuenca, Maria Josep and María José Marín
1999 Verbos de percepción gramaticalizados como conectores. Análisis
contrastivo español-catalán. In Maldonado, Ricardo (ed.), Estudios cogniscitivos
del español. Revista Española de Lingüística Aplicada. Número monográfico
2000. Logroño/Querétaro: Asociación Española de Lingüística Aplicada y
Facultad de Lenguas y Letras, UAQ, 215–238.
Garachana Camarero, María del Mar
1998a La evolución de los conectores contraargumentativos: la gramaticalización
de no obstante y sin embargo. In Martín Zorraquino, María Antonia and
Estrella Montolío (eds.), Los marcadores del discurso. Teoría y análisis.
Madrid: Arco Libros, 193–212.
1998b La noción de preferencia en la gramaticalización de ahora (que), ahora bien,
antes, antes bien y más bien. In Cifuentes Honrubia, José Luis (ed.), Estudios
de lingüística cognitiva. 2 volumes. Alicante: Universidad de Alicante, II:
593–614.
González Ollé, Fernando
2002 Vamos. De subjuntivo a marcador (con un excurso sobre imos). In Álvarez
de Miranda, Pedro and José Polo (eds.), Lengua y diccionarios. Estudios
ofrecidos a Manuel Seco. Madrid: Arco Libros, 117–135.
Iglesias Recuero, Silvia
1999 La evolución histórica de pues como marcador discursivo hasta el siglo XV.
Boletín de la Real Academia Española 80(280), 209–308.
Verena Haser, Metaphor, Metonymy and Experientialist Philosophy:
Challenging Cognitive Semantics. Berlin/New York: Mouton de Gruyter,
2005, 286pp., ISBN 3110182831. hb 88.00EUR.
Reviewed by Dominik Lukes, University of East Anglia, England. Email
<d.lukes@uea.ac.uk>
If Cognitive Linguistics was looking to celebrate a quarter of a century of
its existence, it could have done worse than the year 2005, 25 years since
the publication of Metaphors We Live By. It may not have been the first
publication to broach the subject that had been occupying many since
the mid-1970s, nor the most comprehensive and detailed examination of
cognitive processes as a whole, but it did provide a clarity and
single-mindedness of purpose that changed the linguistic landscape forever.
This work of only seventeen references (its many intellectual debts briefly
and timidly acknowledged in the preface) turned out to be a clarion call
and a poster child to many who had harbored doubts about some of the
claims to universality made by formal linguistics, no matter how implicit,
which were then reaching their peak.
And what better way to examine the history of an academic discipline
than with a critical review of some of its key texts? Which is precisely what
Verena Haser set out to do in her passionate challenge to cognitive
semantics in general and the work of George Lakoff and Mark Johnson (Lakoff
and Johnson 1980, 1999 and Lakoff 1987) in particular. While I agree
with Haser that a careful critique of some of the basic assumptions
behind cognitive linguistics is long overdue if the field expects to move
forward, her study fails to provide a compelling enough argument or
offer a viable alternative. In the succeeding paragraphs I will briefly outline
her argument, show how and why it fails to achieve its stated purpose and
attempt to offer an alternative avenue of critical engagement with these
texts.
Haser's approach is that of an iconoclast. She claims: "Anyone
attempting a critical exposition of Lakoff/Johnson's approach is treading
on holy ground." (p. 239) and she approaches her task as a critic with
near revolutionary zeal. She calls her approach "deconstructivist" (p. 2)
but this is merely to allow her to explore every possible avenue of
criticism from personal attacks to substantive analysis. As a result, the most
potent impression a reader gains after a brief exposure to her volume is that
Verena Haser really does not like George Lakoff¹ (speaking
metonymically, of course). This is not to minimize the strength of her argument
but since her dislike comes through so strongly, I feel that in the interest
of full disclosure, I should admit that I rather like George Lakoff (also
metonymically). I like his work so much that I spent over a year of my
life translating his Women, Fire and Dangerous Things into my native
Czech (published 2006). While this might impair my objectivity, it also
gives me the qualification of detailed exposure to Lakoff's text and his
style of argument. I feel justified in making this argumentative detour
since Haser herself admits that her investigation will focus not solely on
Lakoff/Johnson's claims but also the way they are presented and on
recurrent structural features of Lakoff/Johnson's exposition. (p. 4). In
light of this, I also consider it only fair to subject Haser's own text to the
same treatment.
Haser's argument proceeds in nine chapters (including an introductory
and a concluding one) and is further supported by an appendix in which
she deconstructs Lakoff (1987)'s style of argument. In chapter 2, she
starts out with a discussion of metonymy, concluding that cognitive
linguistics has not been able to establish a firm enough distinction between
this concept and that of metaphor. Drawing on research by Glucksberg
and Keysar, she suggests that the difference is that "with metaphors, as
opposed to metonymies, knowledge of the target concept does not imply
knowledge of the source concept." (p. 51 [original emphasis]) This
distinction, however, only makes sense if we accept her statement that often
supposedly typical examples of metonymies are arguably clear instances
of metaphors. (p. 51) She rejects the possibility that if examples such
as He is a Judas can be analyzed both as metaphors and metonymies,
the distinction itself is the problem. She insists that it's important that
metonymic and metaphoric ingredients can be clearly distinguished
from each other. (p. 49).
Having dealt with metonymy, Haser proceeds to examine the concept
of metaphor itself. She styles the chapter as an assessment of McCawley's
praise for Metaphors We Live By as an original contribution. Her
1. Haser, of course, examines many of the works written jointly by Lakoff and Johnson,
but she seems to be particularly badly disposed toward the former.
conclusion is that "compelling arguments in Lakoff/Johnson (1980) are
not infrequently conspicuous by their absence." (p. 71) She claims to be
puzzled by the lack of references and acknowledgements in the book and
chastises them for "ample use of the unfair strategies which they hold are
encapsulated in the argument is war metaphor" (p. 71). Her more
substantive argument is focused on the concept of mapping. She claims, for
instance, that "there is no reason to believe that one and the same
similarity can explain all the different mappings from physical falling to more
abstract kinds of falling." (p. 64)
At first glance, chapter 4 may seem like an argumentative non sequitur.
Without preamble Haser plunges into a polemic with Lakoff and
Johnson's argument against objectivism. However, the spirit of exegetic attack
continues. Once again, "Lakoff/Johnson's proposals turn out to be
almost invariably vague; the authors fail to do justice to opposing views;
their presentation of objectivist tenets is at times patently mistaken."
(p. 74) She faults them for inconsistencies in their account of objectivism
and concludes that they are fighting a non-existent foe ("It is even
questionable whether any scholar embraces the majority of positions that
Lakoff/Johnson attribute to objectivism." p. 80), which forces them to
resort to rhetoric (p. 74). To demonstrate her point she mounts a
vigorous defense of Davidson, who, according to her, had been libeled by
Lakoff and Johnson as maintaining that meaning and use are
independent of each other. She presents Davidson's view as being much more
subtle than Lakoff and Johnson give him credit for. Lakoff and Johnson,
on the other hand, provide an account that is incoherent, vague and
lacking in compelling arguments (pp. 121–122). Any worthwhile
contribution that can be spotted in their work had already been made by
philosophers such as Goodman or Putnam. Overall, Haser concludes:
"Lakoff/Johnson's criticism of objectivism leaves as much to be desired
as their own contribution to philosophy." (p. 122).
Having thus stripped Lakoff and Johnson of all intellectual credibility,
in chapter 5 Haser proceeds to demolish the experientialist theory of
meaning as outlined in Women, Fire and Dangerous Things. "Lakoff's
suggestions do not provide us with a viable philosophical theory of
meaning." (p. 139, my emphasis) because it does not explain the nature of
meaning itself, since "the putative presence of intrinsically meaningful
structures is not a philosophical answer to the questions What is
meaning?" (p. 141, original emphasis) Haser denies cognitive semantics the
power to distinguish between concepts as simple as Labrador and dog.
She also puts forward the claim that the theory runs afoul of
Wittgenstein's critique of mental images as a foundation of meaning. I will
argue later that part of the problem with this critique (as with most
philosophical theories of meaning) lies in the fact that it is based entirely on
the meanings of words or at best contrived short phrases rather than real
usage events, completely ignoring the cognitive/construction grammar view
of language, where meaning and form are closely interlinked rather than
forming dictionary/pictionary-like assignment pairs.
Chapter 6 returns us to a critique of conceptual metaphor theory.
Her conclusion is twofold. First, the conceptual metaphor theory as
expounded by Lakoff and Johnson (1980, 1999) is unoriginal and internally
inconsistent. The inconsistencies lie mainly on the interface between
metaphorical concepts and metaphorical expressions. Second, the theory, as
proposed, does nothing to answer philosophical questions such as those
posed by Blackburn: "How [do] we understand words?" and "What does
that understanding consist in." (pp. 170–171). "Conceptualizing
something in terms of another thing by itself does not amount to
understanding it in any specific way, since metaphorical definitions can be
interpreted (understood) in various ways." (p. 171) It might be pointed out
here that this conclusion relies on the validity of her criticism in the
preceding chapters, which, as I will argue below, relies on an essentially
objectivist view of meaning which according to her does not exist.
Chapters 7 and 8 represent the most substantive portion of Haser's
argument and attempt to provide a positive alternative to the conceptual
metaphor theory. In chapter 7, she focuses her criticism on the fact that
the selection of target/source domain mapping is largely a matter of a
researcher's choice: "various potential source domains of individual
metaphorical expressions are typically overlapping which shows once more
that source domains, like metaphorical concepts themselves, are probably
a mere construct." (p. 193). She lists a number of potential overlapping
source domains for argument, including: conflict, assuming a stance,
situating sth, etc., concluding that every source domain selected is
largely an arbitrary choice. (p. 195) In Chapter 8, she suggests that the
conceptual metaphor theory is insufficient to account for the transfer of
meaning between conceptual domains. Indeed, she claims, with an appeal
to Ockham's dictum, that there is no need for positing additional structures
such as conceptual metaphors because metaphorical expressions can be
accounted for by family resemblances. For example, we do not need a love is a
journey metaphor to be able to understand phrases such as "our
relationship has been a long bumpy road"; all we need is a principle akin to
that of family resemblances that allows us to extend the use of phrases
such as "long bumpy road" to all potential metaphorical extensions such
as "piracy fight", "design process", "stardom", etc. (pp. 228–229). This
account, relying on research and its interpretation by scholars such as
inter alia Glucksberg, Keysar and McGlone, is presented as superior to
that of Lakoff and Johnson (with some scant acknowledgement of the
work of Gibbs), the latter of which involves insurmountable difficulties,
leading to a psychologically implausible proliferation of metaphorical
concepts. (p. 238) Haser claims that her approach based on family
resemblances is more parsimonious and cognitively more realistic and as
such may not only explain links between the senses of different lexical
items that are used in similar domains [but also] account for similar
figurative senses of a single lexeme thus possibly providing the key to a unified
account of figurative meaning. (p. 238)
It might be slightly unfair to the careful analysis of a large volume of
texts Haser has done, but it appears that her argument can be roughly
reduced to the following bullet points:
1. Lakoff's and Johnson's view of metaphor is internally inconsistent
and cognitively implausible due to relying on a flawed theory of
meaning; it is inferior to that of Glucksberg and his colleagues.
2. Lakoff's and Johnson's contributions are not nearly as original and
groundbreaking as others and they themselves claim; they often
ignore prior research or fail to attribute ideas.
3. Lakoff and Johnson often construct non-existent argumentative
foes; most specifically there may not be any thinkers who could be
classified as objectivist.
4. Lakoff's and Johnson's style of writing uses unfair argumentative
practices, and conceals vagueness and internal inconsistencies.
Let us take each of these points in turn.
Ad 1. On the surface Haser's critique of experientialism and conceptual
metaphor theory is devastating. And indeed, she points out a number of
true weaknesses of the theory. But despite her insistence on having dealt a
mortal blow to the theory, she remains on the margins of interest for
practicing cognitive linguists.² Her ferocity and single-mindedness of
purpose caused her to miss an opportunity for a truly constructive critique
that could move the inquiry forward. She claims that the major
difficulty with Lakoff's and Johnson's approach is that "[u]nderstanding a
given target domain in terms of a given source can trigger many different
conceptions (ways of understanding) of the target. Thus understanding
cannot simply consist in viewing one thing in terms of another." (p. 244)
This statement is hard to argue with; however, we can certainly quibble
2. It is impossible to address the number of detailed analyses carried out by Haser with
as relatively blunt an instrument as this review. As such, I will only deal with some of her
broader statements.
with ascribing such a limited view of metaphor to Lakoff and Johnson.
They never claim that any one concept can only be understood in terms
of one other concept. However, they do claim that the fact that one
conceptual domain is structured in terms of another, with inferential
consequences, is significant to our view of human cognition. Haser's confusion
stems from the fact that she fails to distinguish between language as a
system and language as a process (even in the primitive competence vs.
performance sense). As a result, she conflates arguments about online
processing with those about the structure of the conceptual system.
Lakoff and Johnson are concerned primarily with the latter but Haser's
critique freely moves from one to the other.³ Indeed, her alternative solution
based on family resemblances is nothing but a germinal theory of
conceptual integration, which, of course, is concerned precisely with the problem
of online processing of conceptual domains, including metaphors. Haser
only makes one dismissive reference to blending theory but she would do
well to examine it in more detail, particularly when she faults Lakoff and
Johnson for conflating language and thought or when accusing them of
ignoring Wittgenstein's warnings against mental images. Actually, she
could have gotten a lot more mileage out of chastising Lakoff and
Johnson for not having a good theory of the role of metaphor in text and of the
evidentiary value of metaphor occurrences for the positing of conceptual
metaphorical structures. As any discourse-aware metaphor analyst can
attest, it is not easy to identify metaphors by careful reading (Cameron
2003) let alone automatically based on keywords (Goatly 1997; Musolff
2005; Lukes 2005, and others) and neither is it clear what can be said
about the metaphorical meaning of a single text containing a particular
metaphor in the absence of other evidence. This lack of a good heuristic
for metaphor identification resulted in a proliferation of metaphors
mentioned below.
The problem with Haser's critique of CMT and experientialism is that
it is purely philosophical. This might seem reasonable given that
Lakoff and Johnson make explicitly philosophical points, but by ignoring
the linguistic dimension of their (primarily Lakoff, 1987) enterprise, she
misses a number of important points. She faults experientialism for not
being able to answer a philosophical question such as "How do we
understand words?" But this is only a problem if we have a philosophical
theory of language and meaning. A linguistic theory of meaning requires
3. This critique might seem strange from a linguist committed to a broadly usage-based
view of language but Haser's conflations are a result of confusion rather than a
commitment to a construction-based view of grammar.
a much more complex view of linguistic structures including a theory of
compositionality such as that put forth in cognitive and construction
grammars (to which Lakoff himself devoted considerable space in
Women, Fire and Dangerous Things). If we're only looking at the meaning
of words and isolated expressions, as philosophers are prone to do, we
miss what language is all about. Moreover, Haser leaves the reader with
the impression that meaning in Lakoff and Johnson is reduced to
metaphor. She dismisses the much more subtle theory of conceptual frames
or idealized cognitive models as outlined in Lakoff (1987). It is as if the
following was never written: "One of the most robust and far-reaching
findings of cognitive linguistics is the phenomenon of framing (Fillmore 1975,
1982) and correlative notions of idealized cognitive models (Lakoff 1987).
How a person frames a particular situation will determine what they
experience as relevant phenomena, what they count as data, what inferences
they make about the situation, and how they conceptualize it." (Johnson
and Lakoff 2002: 246) Johnson and Lakoff's statement: "Embodied
realism is not a philosophical doctrine tacked onto our theory of conceptual
metaphor. It is the best account of the grounding of meaning that makes
sense of the broadest range of converging empirical evidence that is
available from the cognitive sciences" (Ibid: 249) would also make it
more difficult for Haser to make some of her assertions uncontested.
Instead, Haser limits herself to a critique of the feasibility of image
schemas as pre-conceptual structures (arguably one of the more
controversial aspects of Lakoff's approach) but ignores their potential as an
account of complex concepts. In fact, she never mentions the problem of
compositionality at all. Neither does she discuss the central point of
interest in Philosophy in the Flesh, where Lakoff and Johnson explicitly state
that their main interest lies in reasoning. Although they posit metaphor
as one of the central elements in their new philosophy, it stands and
falls with the concept of non-literal embodied meaning rather than the
minutiae of the conceptual metaphor theory.
Haser did a lot of reading of Lakoff and Johnson so it is not surprising
that she set some of their voluminous output aside. But while
understandable, her omissions are also revealing. She completely ignores (perhaps
understandably for a philosopher) Lakoff's (1996) attempt at applying
his theory of cognitive models to the analysis of US political
discourse. Had she paid it more heed, she could hardly have placed such
exclusive emphasis on metaphor in her critique of Lakoff and Johnson.
It is perhaps fair to point out that Lakoff, Johnson and many others
influenced by them place metaphor center stage in their inquiries but that is
not what Haser's critique is about. She also does not mention (although it
may not have been available to her due to publication deadlines) the new
afterword Lakoff and Johnson wrote for the 2003 edition of Metaphors
We Live By. In it, they evaluate the concept of metaphor as metaphorical
itself and discuss the limitations of various conceptual underpinnings,
most notably the view of metaphor as domain mapping, which is the
main subject of Haser's criticism. This may be perhaps the most
important new perspective on the study of metaphors. If we maintain that
metaphors (or metaphor-like conceptual structures) are partial and
complex concepts are inevitably metaphor-like, we should perhaps not expect
these concepts to be exempt from this rule and be exhaustive and comply
with classical rules for logical contradiction. And herein lie the limitations
of traditional philosophy that Lakoff and Johnson primarily argue
against. An objectivist-philosophical approach simply cannot see
concepts such as these as valid but will instead waste time in trying to square
circles instead of spending it on substantive analysis. However, while
some of her omissions are understandable, if in my view fatal to her
account, others can only be classified as willful. She references Rakova's
(2002) critique of experientialism as kindred to her own but completely
ignores a debate that took place in the very same issue of Cognitive
Linguistics, including a rebuttal by Lakoff and Johnson from which I quoted
above. Is it because many of the points raised in that debate (cf. Sinha
2002) provide a counterargument to her claims which she asserts does
not exist?
Furthermore, Haser does not only ignore points raised by her opponents.
She cites McGlone (2001)⁴ as having caused irreparable damage
to the conceptual metaphor theory but ignores a much more conciliatory
paper by Bortfeld and McGlone (2001) offering a relativistic account of
metaphor processing which rests on the assumption that different
modes of metaphor interpretation are [...] operative in different discourse
contexts thus explaining why people might favor attributional
interpretations of figurative expressions in some circumstances and analogical
interpretations in others (p. 75). Bortfeld and McGlone are
hoping to reconcile competing models of metaphor processing as describing
different points on a continuum. Haser is right that
Keysar, Glucksberg et al.'s attributive model of metaphor deserves more
attention from conceptual metaphor theorists (sadly, to my knowledge, it
has received none from Lakoff and Johnson), but by positing it as the only
possible alternative she commits the same error of judgment of which she
accuses Lakoff and Johnson.
4. Chiappe (2003) characterized McGlone's view as easily identifiable as partisan and
one-sided.
Haser's objection that metaphorical concepts impose an ad hoc compartmentalization
on the data (p. 245) carries a bit more weight. This
is mostly a methodological problem, however. Metaphors We Live By
opened the floodgates of conceptual metaphor identification: researchers
behaved, and many still do, like children in a chocolate factory, flitting
hungrily from one conceptual metaphor to another, often forgetting that
ARGUMENT IS WAR and IDEAS ARE THINGS are merely labels for more complex
conceptual structures. This led to a proliferation of research that was
too simplistic, relying on Lakoff and Johnson (1980) and ignoring much
of the subsequent research and refining of the theory. As recently as 2004,
Charteris-Black (2004) posited a metaphor POLITICS IS RELIGION, based
purely on the presence of key words such as "mission" in his corpus. In
a recent survey of literature on metaphor in education, I discovered that
the vast majority of research limits the foundations of its metaphor
analysis to the methodology outlined in Metaphors We Live By (some of it
second-hand), completely ignoring the conceptual complexity behind metaphor.
Some of this blame can certainly be laid at the feet of Lakoff's
and Johnson's more programmatic efforts, but Lakoff's own work in
Moral Politics and the appendices of Women, Fire, and Dangerous Things,
whatever its other limitations, is much more nuanced and worthy of emulation
in the applied context.
As we have seen, Haser does not limit her critique to substantive issues
of an intellectual enterprise in general. At least three quarters of her text
are devoted to criticism targeted directly at the work of Lakoff and Johnson.
In the following I will attempt to address some of these points.
Ad 2. Haser is repeatedly puzzled why Lakoff's and Johnson's
approach has been so compelling and influential when, on her account,
it brings nothing new but confused concepts. What is so revolutionary
about an approach that has had many antecedents (both acknowledged
and unacknowledged) and brings little that is radically new? She makes
a valid point when she notes the limited engagement with past approaches
to similar problems exhibited in Lakoff and Johnson's work.
Lakoff and Johnson (1999) purport to undertake an analysis of the history
of philosophy, yet they give phenomenology short shrift and fail to
mention deconstruction and hermeneutics altogether. Husserl and Foucault
are mentioned only in passing, and Gadamer and Ricoeur are
completely ignored, which is hard to understand in a book purporting to
present a new philosophy. There is much to be learned from these
thinkers even though I agree with Lakoff and Johnson that their
approach was ultimately flawed. Some of their intellectual antecedents
are briefly mentioned in the preface of Metaphors We Live By, but neither
Lakoff nor Johnson ever engaged with their work again. However,
Haser's puzzlement over the source of the influence of Lakoff and Johnson's
work stems from a complete misunderstanding of the history of
science. Writers such as Kuhn and Holton have repeatedly demonstrated
that the success of ideas relies more heavily on clarity of purpose than on
completeness and that giving predecessors short shrift may well be essential
for a new paradigm to take hold. If anything, a certain level of
underspecification⁵ can be beneficial to a scientific enterprise, as Keller
(2000) demonstrated by her analysis of the use of the term "gene" by practicing
geneticists. Such work also reveals as irrelevant Haser's argument
that Lakoff and Johnson's work should be questioned because few of
their contemporaries or predecessors (such as Putnam) have engaged
with their approach. In fact, if we look closely, this is true of all the great
revolutions in linguistic thought, including those of de Saussure and
Chomsky (the same being true of other disciplines, including physics and
biology). Conceptual revolutions rarely expunge all competing perspectives.
Rather, these coexist alongside them until they either disappear or
re-emerge in another shift.
Ad 3. Just as with her preceding point, Haser's criticism of Lakoff's
and Johnson's propensity for fighting imaginary foes is not entirely without
foundation. However, as pointed out above, this is a necessary
feature of all revolutionary reconceptualizations, and while it should be
pointed out, it is difficult to accept it as a justification for rejecting the
theory out of hand. Furthermore, she and Leezenberg (2001) are
incorrect in their assertion that objectivism as described by Lakoff and
Johnson does not exist. They are simply looking in the wrong places.
While it may be true that an objectivist subscribing explicitly to all the
tenets of objectivism as caricatured by Lakoff (1987: xii–xiii) might be
difficult to find, particularly among philosophers, objectivism is rife in
the applied disciplines and popular consciousness, underpinning much of
the non-philosophical study of the mind. As pointed out above, Haser herself
exhibits certain objectivist commitments in her critique of Lakoff and
Johnson when she assumes that only ideas conforming to Aristotelian
logic can have any validity. This is another reason why Haser fails to
perceive the great appeal Lakoff and Johnson's anti-objectivism holds
for many practicing linguists and semanticians. While it is possible to
employ plausible deniability by quoting from the work of philosophers
such as Davidson to show that some of their ideas cannot be classified as
objectivist, it is hard to deny that much of the work on grammar and
5. A criticism leveled at Lakoff's (1987) concept of motivation by Vervaeke and Green
(1997).
artificial intelligence in the last forty years has been informed by an essentially
objectivist view of the mind which, as Robin Tolmach Lakoff (2000:
6) points out, makes it impossible to say almost [anything] that might be
interesting, anything normal people might want or need to know about
language.
Ad 4. Haser puts much stock in the fact that Lakoff and Johnson use
argumentative shortcuts or "unfair gimmicks" (p. 59), as she puts it. This
line of argument, however, can be rejected out of hand. Accusing an opponent
of unfair argumentation is a common trope in academic discourse
and is completely irrelevant to the strength of their case (similar treatments
of Chomsky have almost spawned a little industry). I have already
conceded to Haser that Lakoff and Johnson do not always do justice to
the complexity of the ideas of those they construe as their opponents. However,
such approximations have long been the mainstay of academic arguments,
and it could be argued that without them argument would be
impossible, just as language would be impossible without categorization.
But perhaps most devastatingly, if we want to see examples of unfair
argumentative practices, we need look no further than Haser's work.
Throughout the book she mixes ad hominem attacks (how else can we classify
expressions such as "unfair gimmicks"?) with substantive arguments.
She repeatedly softens up her targets by denying them credibility
through an accusation of vagueness before she approaches the matter at
hand. Even her use of "Lakoff/Johnson" to refer to her opponents
throughout the book could be seen as serving to diminish their identity
rather than simply as a device to save her some typing. It is no use pretending
that it is possible to present an argument for one's view without
resorting to shortcuts that the other side will inevitably perceive as unfair.
However, it would be foolish and fundamentally dishonest to base
a critique of a body of work purely on the deconstruction of argumentative
practices contained therein. This is to accuse Haser of neither foolishness
nor dishonesty but rather to point out how easy it is to fall prey to
one's own strictures.
In conclusion, it is difficult to see Haser's book as anything but a
wasted opportunity. While she points out many areas that would benefit
from further inspection, her unsuccessful attempt to topple the experientialist
enterprise as a whole is more likely to obscure than to elucidate
the issues that most need to have light shed on them.
References
Bortfeld, Heather, and Matthew S. McGlone
2001 The continuum of metaphor processing. Metaphor and Symbol 16, 75–86.
Cameron, Lynne
2003 Metaphor in Educational Discourse. (Advances in applied linguistics.)
London: Continuum.
Charteris-Black, Jonathan
2004 Corpus Approaches to Critical Metaphor Analysis. Basingstoke: Palgrave
Macmillan.
Chiappe, Dan
2003 Review: Understanding figurative language. Metaphor and Symbol 18,
55–61.
Goatly, Andrew
1997 The Language of Metaphors. London; New York: Routledge.
Johnson, Mark, and George Lakoff
2002 Why cognitive linguistics requires embodied realism. Cognitive Linguistics
13, 245–263.
Keller, Evelyn Fox
2000 The Century of the Gene. Cambridge, MA/London: Harvard University
Press.
Lakoff, George
1987 Women, Fire, and Dangerous Things: What Categories Reveal About the
Mind. Chicago: University of Chicago Press.
1996 Moral Politics: What Conservatives Know that Liberals Don't. Chicago:
University of Chicago Press.
Lakoff, George, and Mark Johnson
1980 Metaphors We Live By. Chicago: University of Chicago Press.
1999 Philosophy in the Flesh: The Embodied Mind and its Challenge to Western
Thought. New York: Basic Books.
Lakoff, Robin Tolmach
2000 The Language War. Berkeley/Los Angeles/London: University of California
Press.
Lukes, Dominik
2005 Towards a classification of metaphor use. In Alan Wallington et al. (eds.),
Proceedings of the Third Interdisciplinary Workshop on Corpus-Based
Approaches to Figurative Language. Birmingham: University of Birmingham,
27–34.
McGlone, Matthew S.
2001 Concepts as metaphors. In Sam Glucksberg (ed.), Understanding Figurative
Language: From Metaphors to Idioms. Oxford: Oxford University Press,
90–107.
Musolff, Andreas
2004 Metaphor and Political Discourse: Analogical Reasoning in Debates about
Europe. Basingstoke: Palgrave Macmillan.
Rakova, Marina
2002 The philosophy of embodied realism: A high price to pay? Cognitive
Linguistics 13, 215–244.
Sinha, Chris
2002 The cost of renovating the property: A reply to Marina Rakova. Cognitive
Linguistics 13, 271–276.
Vervaeke, John, and Christopher D. Green
1997 Women, fire, and dangerous theories: A critique of Lakoff's theory of
categorization. Metaphor and Symbol 12, 59–80.
Dirk Geeraerts (ed.), Cognitive Linguistics: Basic Readings. Berlin/New
York: Mouton de Gruyter, 2006, 485 pp., pb ISBN 978-3-11-019085-4
EUR 24.95; hb ISBN 978-3-11-019084-7 EUR 98.00.
Reviewed by Thora Tenbrink, University of Bremen, Germany. Email
<tenbrink@uni-bremen.de>
This book represents an overview of fundamental issues in Cognitive Linguistics
by reprinting ground-breaking articles spanning two full decades
of research (from 1982 to 2000). The collection is intended as a course
reader for students and as a resource for advanced researchers within the
field.
1. Synopsis
Dirk Geeraerts: Introduction: A rough guide to Cognitive Linguistics.
In this introductory chapter, the editor of the Basic Readings collection
sets out to show how the 12 papers in this book belong together and
how they fit into the overall Cognitive Linguistics (CL) landscape. He
does this by using inviting language, directly addressing the reader ("So
this is the first time you visit the field of Cognitive Linguistics, no?",
p. 1), and by discussing key aspects of CL along with the articles that
focus on these aspects. Most crucially, CL researchers share the assumption
that abstract formal descriptions and definitions are not sufficient to
grasp a concept, a relation, or a structure of any kind; this idea is reflected
in the book's articles in various ways. In general terms, CL aims
at accounting for natural language adequately by focusing on its meaning,
while considering fundamental aspects of cognition in relation to
their reflections in language. Geeraerts spells out four distinct aspects
of linguistic meaning that are particularly crucial in CL: its potential
to reflect various perspectives, its flexibility, its embodiment (or
non-autonomy), and its intricate relation to usage and experience.
As is typical for introductions in edited books, summarizing the key aspects
of each paper in the book leads to a considerable density of
information in the overview. Geeraerts counters
this with a style chosen to make reading easier, but the text is at times
nevertheless fairly challenging in the range of terminology and notions
introduced in this very first chapter. At least parts of this chapter
are probably more useful after, not before, having dealt intensively with the
individual articles. Generally, the introductory text sometimes seems to be
addressing experts and sometimes total newcomers, which can be confusing.
This aspect may, however, quite accurately reflect an underlying double
purpose of the book in that it subsumes useful resources for readers on
any level of prior knowledge.
Since this chapter does not provide much background information
about the impact of each of the 12 ground-breaking articles in the scientific
world, it is best understood together with its companion chapter, the
Epilogue. Here, at the end of the book, the significance of each area
sketched in the individual articles is highlighted, and many useful pointers
to further work can be found in which the basic ideas are developed further.
The Epilogue does not go into much detail with respect to contents
such as, conceivably, partial aspects of each chapter that may have been
disproved by now, or substantiated by empirical evidence. Nevertheless,
this short concluding chapter usefully wraps up what has been presented
in the separate articles, and puts the individual ideas into the meaningful
context of a wider research community. Further information
concerning current progress in the field can then be accessed in an edited
collection by Gitte Kristiansen et al., which was published simultaneously
with this collection of Basic Readings.
Within the Introduction, CL is presented as a coherent whole that is
manifest in the 12 key notions reflected by the 12 chapters of this book.
This is an appealing and well-founded idea which might usefully have
been elaborated more. Page 19 presents a suggestive conceptual map
of CL that would have merited more accompanying text. The subsequent
part of the introduction presents an overview of the channels through
which CL is developed further, such as book series, journals, and conferences.
While somewhat surprising in an introductory chapter to an edited
book, this overview is actually quite informative even for researchers who
are not total newcomers to the field. Naturally it is only a snapshot of the
current status of the field, but even as such, it may well be welcome
decades from now as a way to look back on the developments. The chapter
then concludes by illustrating the place of CL within the overall linguistics
discipline, a non-trivial task, as assessing the relative importance of
specific research directions is never straightforward. The author nevertheless
succeeds in providing an insightful impression of how CL contributes
to our understanding of language as a whole, and what this might imply
concerning its further investigation.
Chapter 1: Cognitive Grammar. Ronald W. Langacker: Introduction to
Concept, Image, and Symbol.
This concise summary of Langacker's well-received theory has its rightful
place within this book, though it gives the naive beginner a somewhat
rough start. The chapter starts with complex theoretical considerations
and gradually becomes a bit easier to grasp through the extensive use of examples
and illustrations. Nevertheless it remains extremely dense with
respect to information content, and its train of thought is certainly not
designed to introduce newcomers to the field. Rather, it provides a condensed
insight into a full-fledged cognition-based grammatical theory.
The sheer complexity of this theory poses a considerable challenge for the
reader on any level of proficiency. True beginners in CL will be in trouble if this
is really the first chapter they encounter (as seems to be suggested by the
ordering of articles within the book). For more advanced readers, however,
this chapter is a necessity and cannot be ignored given its impact on
the field. Langacker's theory explicitly opposes generative approaches to
grammar, claiming that all grammatical constructs meaningfully reflect
underlying cognitive structures. Even superficially similar sentences like
the pair "John gave the book to Mary" vs. "John gave Mary the book",
which (according to transformational approaches) are derived from a
common deep structure, reflect systematically different conceptualizations
of the same scene. Langacker illustrates such concepts by using schematic
depictions that highlight "profile" and "base", "trajector" and "landmark",
to name some key notions in his framework. Cognitive structures
systematically vary on several dimensions, such as domain, level of specificity,
scale and scope, perspective, and relative salience of substructures.
Based on such arguments, Langacker sharply criticizes approaches that
impose further abstract formal structures, not conceptually or semantically
motivated, on language in a grammatical description. Instead, he sets
out to explain grammatical units on the basis of their cognitive motivation,
using mechanisms of integration and composition on phonological
and semantic levels.
Chapter 2: Grammatical construal. Leonard Talmy: The relation of
grammar to cognition.
Like Chapter 1, this chapter presents basic ideas and concepts in a fashion
rather too dense for newcomers. Unlike Chapter 1, this chapter is not
a summary of a full-fledged theory, but rather a sub-part of a theory or
broader approach toward understanding cognitive aspects involved and
reflected in language. Other aspects of this approach are spelled out in
Talmy's many other publications, most coherently so in his two volumes
(2000: Toward a Cognitive Semantics). In the present article, Talmy
focuses on the distinction between lexicon and grammar. He discusses in
detail how particular concepts (or classes of concepts) are expressed in
grammar (taking English as an example, though Talmy claims the generalized
underlying procedures to be universal) while other kinds of concepts can
only be expressed at the lexical level of language. In particular, grammar
systematically reflects qualitative (relative, topological) concepts, while
quantitative (metric, absolute) measures need to be conveyed by particular
lexical items. He then shows how particular notions are conceptualized
in systematic ways but can be converted into a different category.
For instance, while there is a systematic conceptual difference between
bounded and unbounded entities (objects/events and mass/action), each
instance can be construed as belonging to a different category, as in "he
slept for an hour", in which a conceptually unbounded activity is presented
as bounded. Similar effects occur with other conceptions with
respect to dimension, plexity, and dividedness; together, they constitute a
complex system that is illustrated in a useful schematization on page 84.
Further systematic effects concern degree of extension (both bounded
and unbounded entities have a certain extent), pattern of distribution
(e.g., some actions occur only once, others in a multiplex fashion, like
"breathe"), perspectival mode (how the speaker chooses to present the
entity linguistically), and others. All of these specific effects fit into four
broader imaging systems (p. 96/97) that also encompass other aspects
of Talmy's approach as presented elsewhere. Talmy concludes by proposing
that there are shared conceptual structures that may differ concerning
particular aspects between the domains, but in general, together reflect the
nature of human cognition across conceptual domains. While Talmy's
proposal of strictly differentiating between grammar and lexicon may be
somewhat surprising at this place (other non-formalistic approaches tend
to stress the similarities and fuzzy boundaries between both, as expressed,
for instance, by the term "lexicogrammar", Halliday 1994: 15), his line
of argument is nevertheless convincing. Specifically, his analysis of the
concepts conveyed by grammar (vs. the lexicon) in a simple sentence on
page 75f. is quite illustrative. In general, the same is true for Talmy's
far-reaching and insightful account of systematic conceptualization
processes and the procedures of conversion that come along with them.
However, the presentation of parallels between space and time seems
problematic. These two domains differ in many substantial respects
(some of which can also be found in Talmy 2000; for a systematic comparison
of how concepts of space and time are reflected in grammar and
usage see Tenbrink 2007); here Talmy refers only to a subpart of their
interrelationship but presents the idea of parallelism between them in very
general terms. A case in point concerns the conversion between objects
and events: While events can straightforwardly be presented as objects
(as in "John gave me a call"; p. 78), the reverse is not as easily the case
as Talmy claims. Although a similar effect can be observed if an object is
presented as part of a process or event ("It hailed in through the window";
p. 79), this is not quite the same effect as construing an event as
such (rather than part of it) as an object. Obviously, Talmy concentrates
here on the similarities, rather than the differences, between conceptual
structures and conversion processes. The example just given shows that
there may be systematic limitations and differences that would be worth
attending to (as Talmy himself indicates towards the end of the
chapter).
Chapter 3: Radial network. Claudia Brugman and George Lakoff:
Cognitive topology and lexical networks.
This chapter builds on one of the most famous M.A. theses ever written,
namely, Brugman's (1981) detailed and innovative study of "over". Here,
the most crucial findings are summed up and discussed in light of the controversy
between cognitive topological and more formal, feature-based
approaches to semantics. The particular preposition "over" is interesting
because of its highly polysemous semantic structure that encompasses
both a broad range of truly spatial senses that are somehow meaningfully
interrelated with each other, plus a further range of metaphorical extensions
that are derived more or less directly from particular spatial senses.
While such richness in semantic structure has seldom been evidenced for
other lexical items, a fair number of generalizations can be drawn from a
close analysis of this particular one, highlighting basic facts about human
cognition and its reflection in lexical semantics. The approach of providing
a basic image-schema that is extended systematically to capture derived
senses has since then been extensively applied in other work. While
focussing strictly on the interpretation of "over" (rather than, more than
exceptionally, providing a broader range of exemplifications), the article
mentions many crucial issues in CL more or less in passing. It usefully
makes explicit some of the major underlying assumptions that are often
implicitly taken for granted: for instance, that there are two levels of semantic
structure, namely a structure for each single meaning of a term, and a
structure with respect to the meaningful interrelationships between all of
these meanings. Another basic aspect that often remains implicit concerns
the non-predictability of semantic structure and the notion of motivation
that explains why we can analyze meaningful structure (contra
arbitrariness) without assuming underlying processes of determinism. In
addition, the chapter succeeds in illustrating the value and cognitive reality
of topological representations, and thus provides a strong argument
against formal approaches that presuppose arbitrariness in semantic relationships.
Altogether, this chapter conveys a multitude of fundamental
messages and is in this respect no less dense in information content than
the first two articles. However, basic aspects of the analysis of "over" can
be grasped by newcomers even without fully comprehending the intention
behind the more general discussions offered here. Here, the extensive
usage of notational conventions (that enable abbreviated conveyance of
condensed content) such as "ABV.NC.X.P" for "above/non-contact/
extended/along a path" is rather challenging (though explained and
motivated well enough).
Chapter 4: Prototype theory. Dirk Geeraerts: Prospects and problems of
prototype theory.
Unlike the previous chapters, this contribution does not introduce or
summarize any basic notion (for this purpose, an early article by Eleanor
Rosch might have been expected here). Rather, it presents an integrated
overview of the various phenomena that have been studied under the umbrella
of prototype theory. The chapter is valuable in highlighting the
most crucial assets of that general notion. It distinctly works out a number
of basic features associated with the idea of prototype, and it points in
a usefully dense fashion to a range of crucial publications. Apart from the
ideas that are widely discussed in the associated papers, Geeraerts furthermore
points to a number of aspects that are less well known in the area.
Among them is the necessary distinction between polysemy and prototype
structure, which need to be carefully separated as they capture different
aspects of the semantics of a lexical item. Furthermore, the author illustrates
convincingly how different lexical items may have different semantic
structures and thus fit either more or less to the notions of prototype
theory. Therefore not all words can be analysed and described in the
same way; care must be taken to account for the systematic differences
in lexical analysis. On another level, the theory itself exhibits a prototypical
structure. The notions associated with it are meaningfully interconnected.
Also, the general idea encompasses various contributions and
approaches that fit more or less prototypically to the central notions.
The chapter concludes with a discussion of how cognition and the real
world both can be conceptualized as fuzzy or neat; concepts can be described
in either way, yielding more or less descriptive adequacy; furthermore,
scientific notions should themselves not be fuzzy as such. It could
be added, in this train of thought, that one of the most central aims of
science is to capture neatly what is, in everyday life, typically only treated
in fuzzy ways, even, and especially, notions and relations that are inherently
fuzzy themselves. In other words, science aims at capturing and
describing fuzziness in a neat way.
Chapter 5: Schematic network. David Tuggy: Ambiguity, polysemy, and
vagueness.
This is a nice, brief, fairly easily readable article with a clear message,
one that is useful for anyone concerned with notions of ambiguity (polysemy)
or underspecification (here vagueness). As with the other chapters,
it is useful to draw on prior knowledge with respect to these well-discussed
notions, otherwise the impact of the message (the presented
model) itself would be hard to assess. The model presented here builds
extensively and straightforwardly on previous findings, such as prototype
notions and Langacker's theory. These building blocks are now put to
use to highlight the conceptual relationship between ambiguity and
vagueness: two notions that are clearly distinguishable for some concepts,
such as "aunt" (which is vague because it has one united sense, "parent's
sister") versus "bank" (which is ambiguous because of its polysemy),
while difficult to apply for others (such as "paint", which has many related
context-bound senses). Tuggy proposes a model which accounts
for these problems via the notion of saliency. For some lexical items,
the uniting sense is salient, while for others, it is fairly remote; for still
others, both the uniting sense and the subcases are salient, which opens
up the possibility of conceptualizing them either as vague or as ambiguous.
Thus, there is no clear-cut distinction between the two notions,
but a cline, with varying degrees of salience enhancing either one of the
interpretations.
Chapter 6: Conceptual metaphor. George Lakoff: The contemporary
theory of metaphor.
Anybody not familiar with the theory of conceptual metaphor should
definitely turn to this chapter (if they decide not to start with the more
popular book source, Lakoff & Johnson 1980). This chapter sums up the
main aspects of the theory and relates it, on the one hand, to metaphor in
poetry (where the classical interpretation of metaphor originates), and
on the other hand, to the broader scientific context, highlighting its significance
and impact within the field of cognitive science in general
terms. Also, it purposively counters a number of earlier assumptions in
regard to the notion of metaphor, thus attacking previous assumptions
of literal interpretations of language as a norm. In fact, Lakoff claims
that metaphorical expressions are pervasive in language to the extent that
it is only in relation to concrete physical experience that literal language
is more common than metaphor. He outlines how metaphorical processes
govern lexical polysemy, how they combine systematically to yield
consistent conceptual patterns, how they influence semantic change, and
how novel specific metaphors are put to use in relation to the more general
conceptual notions behind them. Through the extensive use of illustrative
examples, the reader gets thoroughly familiarized with the general
conception. In accounts such as these, there is always a certain danger of
focussing on the phenomena under discussion to such a high degree as to
neglect their boundaries. In the present case, many interesting cases of
metaphorical processes are described in relation to fundamental concep-
tions such as time, causality, event structure, love, and many others. An
interesting counterpart of this study would be an investigation of those
cases which could not be explained on the basis of metaphor, building
on their own conceptual structures. This idea fits well within Lakoff's
theory in that he explicitly points out that conceptual metaphors are pervasive
but nevertheless not predictable: they are grounded in experience,
but the metaphorical processes could have taken a different path. Furthermore,
not all aspects of a broad conceptual target domain (such as
time) are linguistically or conceptually derived in some way or other
from a source, or even a bundle of sources (see also Tenbrink 2007).
Thus, it remains an important future goal for metaphor theory to develop
a concrete model of the specific extent to which metaphorical
processes have explanatory force, in relation to other processes of lexical
development.
Chapter 7: Image schema. Raymond W. Gibbs, Jr. and Herbert L.
Colston: The cognitive psychological reality of image schemas and their
transformations.
This article presents a broad and detailed literature review establishing
the relationship between empirical research in Cognitive Psychology, and
the central notion of image schema within Cognitive Linguistics. As
such, it is not an introduction to the notion of image schema (contrary
to what the editor's chapter title might be taken to suggest), although
the idea is sketched sufficiently for the main argument. More importantly,
the article follows up on one of Lakoff's major claims and messages
(also to be found emphatically at the end of the previous chapter),
namely, that the approach Cognitive Linguistics takes is fundamentally
interdisciplinary. The article carefully spells out what this might mean
in detail, showing how previous evidence from a different direction
can be interpreted to support the linguistic theories. Also, it highlights
directions of meaningful future research combining the disciplines in a
fairly concrete way, illustrating how basic ideas within CL could be
validated experimentally. Many of these suggestions are still valid: the
interdisciplinary relations could still be reinforced much more strongly in
targeted research.
Chapter 8: Metonymy. William Croft: The role of domains in the
interpretation of metaphors and metonymies.
In this highly theoretical chapter, Croft establishes a number of system-
atic aspects in relation to the occurrence and interpretation of metaphors
and metonymies. His idea is to account for the differentiation between
these two notions on the grounds of the underlying domains: Roughly,
in metaphorical language, there is a mapping between two different domains,
while metonymy highlights particular aspects within a domain.
Croft's approach to the notion of domains builds on Langacker's idea of
a profile on a base, which relates to concepts that are presupposed by a
specific concept. Thus, an arc presupposes a circle; the circle, on
the other hand, is itself defined relative to two-dimensional space, i.e., it
can also be a profile on a base. Thus, concepts are part of a complex domain
structure with a limited number of basic domains which do not
presuppose anything else. Here, the technical usage of abstract (vs.
basic) for concepts based on other concepts may be confusing, since
the more common (related) usage of abstract (vs. concrete) is also
used fairly often within approaches to (conceptual) metaphor. In the
sense used here, time is a basic domain; while in other accounts time
is often described as (more) abstract (than space, motivating the systematic
mapping from space to time). Here, the bridge to Lakoff's notion of
conceptual metaphor is the idea that the two domains being compared in
metaphor are the underlying basic domains of the specic concepts
involved. In metonymy, there is a mapping within one domain matrix,
though the processes involved may be fairly intricate. A further step in
Croft's approach then involves the introduction of yet another phenomenon,
namely, relationality: in Mara sings, the subject is nonrelational
but the predicate is relational; therefore, sings is here dependent while
Mara is autonomous. This notion of (relative) dependency is then dis-
cussed in the light of possible mappings and highlightings in metaphor
and metonymy. In this way, this approach goes in the direction of explor-
ing systematically where metaphorical and metonymical interpretations
can be expected or not, working towards indicating the boundaries of
these phenomena within the use of language in general. It remains unclear
(also taking into account Geeraerts' pointers to further reading in the Epilogue,
in which he mentions a difficulty in operationalizing the notion of
domains) how far this (complex but intuitively appealing) theoretical
framework has been substantiated and integrated in subsequent work.
Chapter 9: Mental Spaces. Gilles Fauconnier and Mark Turner:
Conceptual integration networks.
This rather lengthy chapter establishes a framework in which the notion
of metaphorical transfer is generalized to a major conceptual principle,
namely, blending. Drawing upon a broad variety of evidence, the authors
propose a number of systematic and generalizable features of this princi-
ple. They explain how the mind manages to merge diverse conceptual
spaces together in order to develop new ideas or ways of describing
concepts, i.e., to construct meaning. This happens primarily through the
three basic operations of composition, completion, and elaboration, each
of which is defined briefly. In order to develop this framework the authors
provide intuitive schematic depictions, and they rely on a limited
number of examples which they re-use time and again to illustrate their
classications. A number of competing governing principles (somewhat
confusingly called optimality principles) serve to explain in what ways
a blending may or may not be successful. Altogether, the article intro-
duces a broad range of theoretical notions and ideas; manageable for the
reader because the authors do not (in contrast to some other contribu-
tions in this book) presuppose prior knowledge, and the re-use of the
same examples provides a sense of coherence. Nevertheless, to my mind
the article could have profited from a broader presentation of evidence
rather than merely referring to the fact (plus reference to a website)
that the theoretical framework builds on much more than is discussed
explicitly here. While some of the proposed principles do not necessarily
strike the reader as inevitable or fundamental, and some details of the
framework seem hard to verify or substantiate, the general idea is never-
theless both important and convincing. In fact, after being introduced to
the cognitive principle of blending, it may become hard to see why one
has not recognized such a basic and widespread phenomenon clearly
before.
Chapter 10: Frame Semantics. Charles Fillmore: Frame semantics.
This fairly short, comprehensible piece is one of the few articles in the
book that I'd recommend unhesitatingly to student newcomers without
requiring much prior knowledge. As Fillmore himself points out, it is entirely
pre-formal and descriptive in a way that allows the unsophisticated
reader to grasp the author's main point quickly and intuitively.
The article suggests analyzing the semantics of a language against the
background in which it has developed. (For the purpose of presenting
this point, section 2, which basically gives a historical overview of the
idea's development, could easily have been omitted; it might in effect be
more disturbing than enlightening for newcomers.) Any lexical item can
only be understood fully if the community in which it is used, and sometimes
a rather specific part or aspect of that community, the frame, is
known. Fillmore provides many illustrative examples for this idea, and
he evaluates the more fundamental claims of frame semantics in the
light of a number of basic topics within the field of semantics, such as
taxonomies or presupposition. However, since the article does not go
into much detail in any of these directions, it remains up to the reader to
determine what kind of impact such a view on semantics could have in
linguistic theory. In Geeraerts' Epilogue, however, the information is
provided that Frame Semantics builds the basis of a corpus-based online
dictionary.
Chapter 11: Construction Grammar. Adele E. Goldberg: The inherent
semantics of argument structure: The case of the English ditransitive
construction.
This fairly early work by one of the major proponents of Construction
Grammar starts out from a discussion of another paper dealing with
ditransitive constructions. Goldberg makes a point of the fact that Con-
struction Grammar, in contrast to that alternative account, does not
require idiosyncratic semantic explanations for specific verbs that may
occur in a particular construction. Instead, the (abstract) grammatical
construction itself is associated with a specific meaning. In the case of
the English ditransitive, this meaning is fairly specific; it involves a notion
of transfer between a volitional agent and a willing recipient. This basic
meaning has a number of systematic extensions, which are accounted
for, on the one hand, by principles of integration of the construction and
the lexical items that fill it, and on the other hand, by metaphorical transfer
(of the basic construction sense) much like the kind of metaphorical
transfer known for lexical items. The article is well-written and compre-
hensive and may well serve as an introduction to Construction Grammar
(being more like a case study, it was certainly not intended as such),
although the motivating contrast to the earlier paper may have become
redundant by now.
Chapter 12: Usage-based linguistics. Michael Tomasello: First steps
toward a usage-based theory of language acquisition.
This chapter falls somewhat out of the general scope in that it treats a
specific approach to language (usage-based theory) not with respect
to adult language (like the other chapters), but with respect to child
language acquisition. This choice is well-motivated since, as Tomasello
points out, it is specifically crucial in language acquisition to take usage
into account to a much higher degree than has been done so far. Usage
explains how children start out by producing item-based structures that
are re-used and gradually expanded and reproduced creatively. In doing
this they rely heavily on what they have heard adults use in contexts and
with communicative intentions that they understand. This idea can be
developed further to account for patterns in adult language; partly, the
findings developed in the present chapter already provide a sound background
for some other accounts in the present book (e.g., frame semantics).
This chapter reads like an overview chapter on child language
acquisition (though, to become a true one, relationships to other func-
tion-based theories such as Halliday 1975 would need to be established),
yet it introduces a new view on language that is highly appealing and
motivating to pursue further in various directions. Some of these are
sketched by Geeraerts in his useful Epilogue chapter.
2. General evaluation
This book is specifically designed to introduce newcomers to the field of
Cognitive Linguistics, as well as to serve as a handbook for more ad-
vanced scientists. It brings together a broad range of articles by highly
renowned researchers, each of which is central in some way or other for
one particular basic notion in this field. This endeavour in itself is highly
laudable, given the high diversity in this research area and the lack of one
singular central figure or ground-breaking work that could count as the
fundamental core of Cognitive Linguistics. In this book, the various directions
taken in this field are systematically brought together and discussed
with respect to their relationship to each other as well as their impact
within the wider area. The articles cover the most important aspects
of the field. Only a few areas come to mind that could be considered as missing;
for instance, the issue of universals as opposed to variability across
languages is barely touched. A CL approach to this issue involves study-
ing cognitive principles as a basis for universal principles of language.
Though some ideas in this direction are expressed in Talmys article, the
general approach and corresponding evidence could have been repre-
sented directly by a central article. Examples in this direction are Dan
Slobin's work on thinking for speaking, Lera Boroditsky's findings on
time concepts in relation to language, or other work related to Neo-Whorfianism.
A key concept related to this research direction would be
the notion of iconicity and motivation as addressed, for example, by Hai-
man (1980).
Potential readers should be aware that true beginners in this particular
field will meet with a challenge in terms of scientific and informational
density. Students of Linguistics using this book need to bring in a fair
amount of previous experience with linguistic theories. Also, introductory
books like Ungerer & Schmid (1996, second edition: 2006) or the recent
one by Evans and Green (2006) would be a very helpful preparation for
understanding the basic readings in this collection, since many of the ar-
ticles were originally written not for students but for highly sophisticated
scientific discourse. Thus, the fact that this book presents itself as designed
for an introductory course in Cognitive Linguistics, as stated on
the book cover, should not mislead towards expectations of easy reading
(with a number of exceptions as hinted above). Nevertheless, it remains
a useful source of basic readings which can be put to valuable use by
teachers of CL who are ready to provide their students with the required
guidance throughout the journey suggested by the book. Particular care
should then be taken to sort out how the various notions highlighted
(though not necessarily introduced from scratch) throughout the book
fit together and how they relate to each other, since many of the basic
ideas cross over the particular chapters, in spite of their being assigned a
single keyword each by the editor. A fair share of such relationships are
insightfully spelled out in the editor's introductory chapter, which is generally
a useful overview article (not only) for beginners. In general terms,
this book provides basic reading for everyone involved in scientific investigation
in some particular area within the wider diversified field of CL.
In this sense, it brings in a high potential to keep the eld together, so
to speak, working against the danger inherent in any kind of (otherwise
welcome) diversification: namely, the lack of coherence yielding not one
convincing unified theory, but a conglomeration of nice ideas that one
may or may not take seriously in individual thinking.
To sum up, although the book might have profited from an extra round
of proofreading, this collection may be highly recommended. It offers
- a selection of indispensable readings for students who have had a first
  textbook-based acquaintance with CL and who are now ready to confront
  the primary literature;
- an integrated view of the field that will not only help newcomers but
  even more experienced cognitive linguists to see how the various key
  notions and the corresponding domains of investigation within CL
  hang together;
- a set of highly useful practical tips about CL, together with a comprehensive
  collection of pointers to further literature, which will act as an
  eye-opener even for advanced researchers.
References
Brugman, Claudia
1981 Story of over. M.A. thesis, University of California, Berkeley.
Evans, Vyvyan and Melanie Green
2006 Cognitive Linguistics. An Introduction. Edinburgh: Edinburgh University
Press.
Haiman, John
1980 The iconicity of grammar: Isomorphism and motivation. Language 56(3),
515–540.
Halliday, Michael A. K.
1975 Learning How to Mean. Explorations in the Development of Language.
London: Edward Arnold.
1994 An Introduction to Functional Grammar. 2nd ed. London: Edward Arnold.
Kristiansen, Gitte, Michel Achard, René Dirven, and Francisco J. Ruiz Mendoza (eds.)
2006 Cognitive Linguistics: Current Applications and Future Perspectives. Berlin/
New York: Mouton de Gruyter.
Lakoff, George and Mark Johnson
1980 Metaphors We Live By. Chicago: University of Chicago Press.
Talmy, Leonard
2000 Toward a Cognitive Semantics, 2 vols. Cambridge, MA: MIT Press.
Tenbrink, Thora
2007 Space, Time, and the Use of Language: An Investigation of Relationships.
Berlin/New York: Mouton de Gruyter.
Ungerer, Friedrich and Hans-Jörg Schmid
2006 An Introduction to Cognitive Linguistics. Second edition. (First published in
1996.) London/New York: Longman.
Vyvyan Evans and Melanie Green, Cognitive Linguistics. An Introduction.
Mahwah, NJ: Lawrence Erlbaum Associates, Inc. (in Canada and USA).
Edinburgh: Edinburgh University Press (in Europe and the rest of the
world), 2006, 830 pp., ISBN 978-0-7486-1832-3. 24.99.
Reviewed by René Dirven, Mechelen, Belgium. Email <rene.dirven@pandora.be>
This introduction to CL is in many respects a remarkable book, and in
some respects even an excellent book. It presents itself as a course book,
whose 23 chapters are structured in 4 parts. Part I offers an overview of
the cognitive linguistics enterprise, Part II deals with cognitive semantics
(largely lexical semantics), Part III discusses various cognitive approaches
to grammar, and Part IV is a brief concluding chapter. As the authors
say, the book can be handled in a 12-week seminar either on cognitive
semantics, or else on cognitive grammar. Logically speaking, a two-
semester course could then cover the whole book. As such it is better
seen (and perhaps also called in its subtitle) an advanced introduction,
keeping the middle, as it does, between the shorter type of introductions
such as Ungerer and Schmid (1996, 2nd ed. 2006), Dirven and Verspoor (1998,
2nd ed. 2004), or Lee (2001), and the more specialised type such as Taylor
(2002) or Croft and Cruse (2003), or the highly specialised handbook by
Geeraerts and Cuyckens (2007).
Alternatively, given its length and wide scope, Cognitive Linguistics: An
Introduction (henceforward CLAI) could also be seen (though it is not
intended to be) as a kind of pocket encyclopedia of cognitive linguistics.
Indeed, one of the most specific roles and functions CLAI could serve is
that of a reference work for any major strand within CL and for any con-
ceptual or descriptive tool developed in it. This looking-up function is
inherent to any reference work of this size (but if this function were to be
strengthened in a later edition, this would of necessity require a far more
detailed index of at least twice as many keywords). This more general ref-
erence function covering all possible strands within CL could hardly be
fulfilled by the specialised introductions like Taylor (2002), which mainly
concentrates on Langacker's Cognitive Grammar, nor by Croft and
Cruse (2003), with its emphasis on typology and construction grammar,
nor by the more specialised information provided by the authors compe-
tent in one given area as in the CL handbook by Geeraerts and Cuyckens
(2007).
A brief survey of the main topics dealt with in the three parts of CLAI
reinforces this potentially encyclopedic character. Part I deals in its four
chapters with basic themes such as: knowing a language (Ch. 1), CL as-
sumptions and commitments (Ch. 2), universals and variation (Ch. 3),
and language in use (Ch. 4). In Ch. 1 knowing a language is related to
the two basic functions of language, i.e., the symbolic and the interactive
ones. The order in Ch. 2 is the reverse of what the title says: first
come the cognitive commitments and then the assumptions. Ch. 3 applies
the notions of universals and variation not only to language, but also to
thought and experience. In Ch. 4 the notion of language in use is related
to knowledge of language, language change, and language acquisition.
What is somehow puzzling, not only in Ch. 2, but also in the two other
chapters, is that the internal order of the various sections would look far
more natural, if reversed. Indeed, what naturally comes first, normally
serves as the basis for what comes later. Thus experience comes before
thought, and thought before language. Similarly, language use starts off
as language acquisition, results in language knowledge and may undergo
or cause language change. What is perhaps lacking here is a clear sense of
direction: a cognitive bottom-up approach instead of remnants of a gen-
erative top-down way of thinking about language.
Part II (Ch. 5–13) deals with various inroads into cognitive semantics,
and Part III (Ch. 14–22) with the various strands in cognitive grammar.
This macro-structure of CLAI may have different motivations. On the
one hand, it may be a reaction against the absolute primacy of syntax
over semantics in generative grammar, but this would somehow contradict
CLAI's explicitly stated intention (p. 781) of building bridges between
the two paradigms. On the other hand, this macro-order may also
say much about how the authors themselves see the internal relations
within CL by giving priority to Lakoffian or other semantic concerns
over the various grammatical approaches such as Langacker's Cognitive
Grammar, or the various versions of construction grammar. (Here and
throughout this review, upper cases stand for a specific strand, and lower
case for generic uses of terms). Still other, e.g., pedagogic factors may
favour the macro-order chosen: the demands put on the audience by se-
mantic models may be less stringent than those required by grammatical
models. Last but not least, the authors may also express their own prefer-
ential thought patterns by following the given order. Anyway, though
wholly or partially decided upon unconsciously, such orderings speak
volumes.
The single chapters of Part II deal with the main semantic insights de-
veloped in CL. Ch. 6, Embodiment and conceptual structure, focuses
on Johnson's image schemas and Talmy's notion of conceptual structure.
Ch. 7, The encyclopedic nature of meaning representation, focuses not
only on the contrasts between the generative dictionary view of meaning
and the CL encyclopedic view, but also on that between Fillmore's Frame
Semantics (linking it to findings in cognitive psychology) and Langacker's
theory of domains. Chapters 8–10 reflect (and partly reinforce the picture
of) Lakoff's dominance in cognitive semantics. Ch. 8, Categorisation
and idealised cognitive models, compares the classical approach to
meaning in terms of semantic features to the revolutionary findings of
Eleanor Rosch's Prototype Theory, most emphatically propagated by Lakoff
and also extended by his Idealised Cognitive Model (ICM) theory.
Ch. 9, Metaphor and metonymy, compares the traditional view that
figurative language use is primarily a matter of language with Lakoff &
Johnson's cognitive view that metaphor is primarily a matter of thought,
as suggested by the adjective in conceptual metaphor and Conceptual
Metaphor Theory. Also extensively discussed are Grady's (1997a, b) notion
of primary metaphor, the extension of the notion of conceptual to
apply to metonymy in the term conceptual metonymy, introduced by Kövecses
& Radden (1998), and Radden & Kövecses (1999), and the interaction
of metaphor and metonymy, mainly referring to Barcelona (2000),
but not to newer insights such as those developed in Bartsch (2002) or
Geeraerts (2002). Ch. 10, Word meaning and radial categories, allows
the authors to present a much improved version of Lakoff's radial network,
as developed in Tyler and Evans (2001) and drawn up on the basis
of strict criteria. Part II's last two chapters present Fauconnier's analysis
of the on-line construction of meaning in discourse. Ch. 11, Meaning
construction and mental spaces, opposes the truth-conditional approach
in formal semantics to Fauconnier's own Mental Space Theory, explaining
the creation of mental spaces or small packages of information for
each discourse referent, thus enabling discourse participants to keep track
of the various referents at any point in the discourse. Mental Space
Theory is not limited to reference, but also explores the tense-aspect-
modality systems serving the grammatical functions of perspective, view-
point, epistemic distance and grounding. The last concept, which can
directly be linked to Fauconnier's notion of base space, is however not
discussed by the authors here, but it is only associated with Langacker
later in the book (p. 575). Mental Space Theory is also at the basis of
Fauconnier & Turner's metaphor theory, discussed in Ch. 12 as Conceptual
blending. According to Fauconnier and Turner, the source domain
is not just mapped onto the target domain as in Lakoff & Johnson's Conceptual
Metaphor Theory, but both domains are input spaces, which, via
the generic space containing their common elements, are blended or inte-
grated into the blend, which may also contain new emergent meaning,
not present in the input spaces. At the end of the chapter (p. 440), the
authors state that Conceptual Blending Theory cannot completely replace
Conceptual Metaphor Theory, but the question is whether that has ever
been Fauconnier and Turner's intention. As shown by Grady et al.
(1999), the models of Conceptual Metaphor Theory and Conceptual
Blending Theory are to be seen as complementary to each other rather
than as rivals. Ch. 13, Cognitive semantics in context, contrasts cogni-
tive semantics with truth-conditional semantics and with Relevance
Theory. Though the former contrast is quite a legitimate one, the treatment
of the latter as CL's opposite is certainly too strong. Though the
authors admit that Relevance Theory is more consonant with CL
(p. 465), they fail to see that the relevance principle, just like that of
salience or any other perceptual or conceptual principle, is an exquisite
cognitive tool in itself, offering an explanatory motivation for a large
number of phenomena in language use.
Part III, Cognitive Approaches to Grammar, contains fewer points
of divergence and will therefore also be treated more briefly. It is organised
in a way remarkably parallel to that of Part II. As was seen, Part
II is structured around CL scholars and their main semantic theories, with
three central chapters on Lakoff's ideas, preceded and followed by some
other major CL inroads into semantics. Similarly, Part III has three central
chapters mainly expounding Langacker's Cognitive Grammar ideas,
preceded by a chapter on Talmy and Langacker, and followed by two
chapters on construction grammar, one on Kay & Fillmores version of
it and one on Goldberg's, Croft's and Bergen's versions. Thus the major
cognitive grammar models are all discussed, but it is neither clear by itself
nor explained why Fillmore's type of construction grammar, which is to
be situated somewhere halfway between the generative and cognitive
worlds of thought, is given so much space and attention, or conversely,
why Goldberg's fully cognitive Construction Grammar or Croft's Radical
Construction Grammar is not analysed more extensively and centrally,
more or less on a more equal footing with Langacker. One also wonders
why (in view of their followings) lesser strands in cognitive grammar such
as Hudson's Word Grammar or Nuyts's Cognitive-Functional Approach
or even RRG (Role and Reference Grammar) are not discussed in the
same vein as the other strands. If ever the encyclopedic potential of
CLAI were to be exploited, these would certainly deserve priority. Let's
now look at the various chapters in Part III.
Ch. 14, What is a cognitive approach to grammar, deals with the
common assumptions and principles of cognitive grammar such as the
symbolic nature of grammar, its usage-based nature (and acquisition),
the concept of grammar as an inventory of symbolic units, and the con-
ceptual unity of lexicon and grammar, conceived as a continuum. Ch.
15, The conceptual basis of grammar, explores both Talmy's and
Langacker's cognitive models. Talmy's Conceptual Structuring System
Model comprises the four subsystems of configurational structure, attention,
perspective, and force dynamics. Langacker's Cognitive Grammar is
conceived as a network model, which is specified in great detail in the
next three chapters. Chapters 16, 17, and 18 focus on word classes, con-
structions, and tense, aspect, mood, and voice, respectively. The distribu-
tion of the attention paid to the various versions of construction grammar
is, as already stated above, bizarre and the opposite of what one would
expect. Ch. 19 Motivating construction grammar is completely devoted
to the partly generative and partly cognitive model of Kay & Fillmore's
construction grammar, whereas the fully cognitive construction
grammar versions, i.e., Goldberg's Construction Grammar, Croft's
Radical Construction Grammar, and Bergen & Chang's Embodied Construction
Grammar are packed together in Ch. 20, The architecture of
construction grammars. Although Goldberg's Construction Grammar
is dealt with in more detail (underlying assumptions, applicability to
verb argument structure, interaction between single verbs and construc-
tions, and the organisation of constructions in constructional networks),
the newest views of Goldberg's are not represented yet. Instead of a comparison
between Fillmore and Langacker, one would also very much
have preferred a comparison between Goldberg and Langacker (and their
attempts to come closer to each other's views), certainly when facing the
comparison between Goldberg and the other two construction grammar
versions, which, though equally relevant, may still be less conspicuous.
Before coming to a concluding part or chapter, CLAI's Part III still offers
two highly interesting chapters. Ch. 21, Grammaticalisation, links
CL to diachronic linguistics in its three main approaches: Heine's Metaphorical
Extension Approach, Traugott's Invited Inferencing Theory and
Langacker's Subjectification Theory. In Ch. 22, Cognitive approaches to
grammar in context, the term context is taken in the sense of a (changing)
scientific paradigm, although that qualification would be too strong
for what is discussed. It is rather a comparison than a confrontation of
paradigms. The cognitive approach is, in a final bridge-building effort,
compared to generative and functional-typological accounts. Here one
cannot but be astonished about this obsession to compare and oppose
CL to generative and other approaches. In a way, this may be condi-
tioned by the general climate of a generative predominance in the
Anglo-American world, where cognitive linguistics is not yet as fashion-
able as it is in many other parts of the world. However, a new paradigm
like CL cannot become strong nor independent by eternally defining and
redefining itself vis-à-vis competing paradigms. The idea of building
bridges between the various paradigms, which is part of the authors'
dream, will at best remain a one-way road anyway, and more likely than
not, end up in a dead-end street. In Part IV or Ch. 23, Assessing the cog-
nitive linguistic enterprise, the authors consider the fact that CL tries to
integrate language and thought as its greatest achievement, but here the
more encompassing term cognition might be more appropriate than
thought, since it embraces both the perceptual and the conceptual facul-
ties, orto speak in ontological termsthe body and the mind. It is this
body-in-the-mind logic that, according to the authors, makes cognitive
linguistics attractive to neighbouring disciplines in the humanities and
also in the social sciences. The strong emphasis on the embodied nature
of cognition and language may be reopening channels of investigation
into language and mind that take into account embodiment, experience
and usage while remaining rmly committed to the mentalist approach
(p. 778).
As this last vista confirms, on the whole the book leaves a very strong
impression, which cannot be diminished by the critical remarks above or
by those that will follow below. The great strength of CLAI is that it offers
a lucid, coherent and substantial synthesis of the most important
strands in CL, as laid down in the major publications by cognitive linguists.
The crucial question therefore is how reliably the various CL insights
and constructs are represented and to what extent the two authors
of this pocket encyclopedia offer a wide enough overview of CL. The answer
to the first question is decidedly positive, but the answer to the second
question is more complex and depends very much on the question
whether linguistics is seen as a theoretical discipline or also as an applied
one. Seen from this wider viewpoint, CLAI is to be characterised as an
extensive overview of the whole theoretical field of cognitive linguistics,
and one that does not take notice of the very wide field of applications
in the interdisciplinary exchange between CL and neighbouring disciplines.
A large part of this exchange, though not all of it, has now been
surveyed and spelled out in Kristiansen et al.'s (2006) collective volume
Cognitive Linguistics: Current Applications and Future Perspectives. One
field of application is the dialogue between cognitive linguistics and
anthropological linguistics, which emphasizes the cultural core of language
and the linguistic core of culture. Another field of application of CL is
language pedagogy, where a long tradition of empirical research can be
enriched by the motivational potential of a cognition- and usage-based
approach. A further field is multimodal communication, focussing either
on the contrast or else on the interaction between verbal and other modes
of conceptual self-expression. The former finds application in signed languages
and the latter in media studies, e.g. in the exploration of the role
of visual metaphors. Still another major field is cognitive poetics, which
applies cognitive tools to literary text worlds, thus trying to break down
the wall between linguistics and literary studies. A last interdisciplinary
field covered in Kristiansen et al. is computer linguistics, spelling out the
conditions that CL has to fulfil in order to make it testable for computational
experimentation. This list is far from exhaustive and could be
further exemplified by interdisciplinary explorations such as cognitive
sociolinguistics, cognitive translation studies, cognitive discourse studies
and cultural studies, which may also be cross-mapped internally.
Of course, this enumeration of possible fields of interdisciplinary application
of CL is not meant as a critique of CLAI, but rather as a reminder
that the field of CL is not only its own theoretical world, but comprises
a much wider area of investigation. In that sense, the enumeration just
points to the fact that we must make allowance for a number of unavoid-
able gaps in an encyclopedic book like the present one. In fact, a second
volume covering all the applications in interdisciplinary exchange would
be a logical next step. Only, it could hardly be composed by two authors
alone, but would require a whole team. This also demonstrates the tour
de force accomplished by the two authors of CLAI. On the whole, and
limiting oneself to theoretical cognitive linguistics, one cannot but admire
the careful selection of the most representative CL authors and the in-
sightful characterisations of their work.
We will finish off by briefly pointing out what we think to be avoidable
gaps or shortcomings. Some of the not so coincidental or more
systematic gaps must be pointed out anyway, but again they do not
basically diminish the great esteem in which we hold the book. One such
gap, almost a general Anglo-American phenomenon, is the underrepresentation
of non-Anglo publications. Either merely mentioned as
bibliographical references (which is not a real measuring stick) or totally
missing in the index are scholars such as John Barnden, Renate Bartsch,
Dirk Geeraerts, Theo Janssen, Tanja Kuteva, Kurt Queller, Francisco
Ruiz de Mendoza, Klaus-Uwe Panther, Ted Sanders, Arie Verhagen,
Anna Wierzbicka and Jordan Zlatev. Although sometimes references to
publications by these authors are made, they may appear in a rather
strange context, as for example Wierzbicka, who is only named in the
context of a discussion of Katz and Fodor's feature semantics.
Another gap in the book relates to the philosophical background
against which CL must be seen. Obviously, the philosophical roots and
background of CL are not a major concern in the authors' minds. Thus
the term "phenomenology" (see Geeraerts 1985) is not present in the index,
but this is a critique that rather addresses Lakoff and also Johnson. In
Lakoff (1987) the philosophical outlook underlying CL is claimed to be
"experiential realism", or "experientialism", but it is nowhere clarified what
philosophy this experientialism is based on. Geeraerts (1985: 355) clearly
reveals that the CL approach goes in the direction of Merleau-Ponty's interpretation
of phenomenology, which holds that consciousness is present
in the corporal experience of the world. Here in a nutshell we find all
the basic epistemological tenets of CL: its realism, its experientialism, its
embodiment and its embodied mind. In contrast to this philosophical
blank, the psychological background of CL has a much better fortune:
Gestalt psychology is already mentioned on p. 65, and further richly illustrated
in a number of principles. But again the roots of Gestalt psychology
itself are not historically situated and thus, just like Lakoff's experientialism,
seem to stem from nowhere. There is no background link either
to its anti-Wundt or anti-molecular character or to its philosophical
implications.
A last point concerns not so much a gap as the way of formulating
things, e.g. about the notion of image schema and its relation to
embodiment. In the authors' view (p. 46), image schemas are rudimentary
concepts like contact, container and balance, which are meaningful
because they derive from and are linked to human pre-conceptual
experience: experience of the world directly mediated and structured by
the human body. The problem is of course: what can we understand by
a rudimentary concept, and when does a rudimentary concept become a
non-rudimentary one? Much of the trouble could be avoided here by not
pinning oneself down on the conceptual status of the experience and using
a term like "pre-conceptual configuration", as Johnson himself regularly
does, or "pre-conceptual patterns", as Hampe (2005: 1) shows to be a
current term. Further down the authors say that embodied concepts of
this kind can be systematically extended to provide more abstract concepts.
At this point one cannot but wonder whether there is any difference
between a rudimentary concept and an embodied concept. Still
on the same page we find the term "fundamental concept" in the statement
"abstract concepts like love are structured and therefore understood
by virtue of the fundamental concept container". Apart from
the cumulative terminological load, this statement can also be easily misinterpreted
as if love were a container. Love can be a struggle, a fight, or
even war, but it is never understood as a container. What is left out of
sight here is another conceptual projection, i.e., that states are conceived
of as locations such that be in love (with someone) is structured as
being in one place with someone. But love itself is not seen nor under-
stood as a container. That so much terminological overload could be
avoided is shown by a quotation from Jean Mandler, still on p. 46: "the
image schema is more than a spatio-geometrical representation. It is a
theory about a particular kind of configuration, in which one entity is
supported by another entity that contains it." Not only do we find here
the above-suggested term "configuration" for image schema, but we can
also infer from this definition that such configurations are not mere
bodily experiences, but also imply a functional dimension added by a perceiving
and conceiving human consciousness. To conclude, we can
only understand what embodiment means if we see this human body as
a conscious body. Paraphrasing Merleau-Ponty, we could say that an
em-bodied mind presupposes an en-minded body.
References
Barcelona, Antonio (ed.)
2000 Metaphor and Metonymy at the Crossroads: A Cognitive Perspective.
(Topics in English Linguistics 30.) Berlin/New York: Mouton de Gruyter.
Barcelona, Antonio
2000 On the plausibility of claiming a metonymic motivation for conceptual
metaphor. In Barcelona, Antonio (ed.), Metaphor and Metonymy at the
Crossroads: A Cognitive Perspective. Berlin/New York: Mouton de Gruyter,
31–58.
Bartsch, Renate
2002 Generating polysemy: Metaphor and metonymy. In Dirven, René and Ralf
Pörings (eds.), Metaphor and Metonymy in Comparison and Contrast.
Berlin/New York: Mouton de Gruyter, 49–74.
Croft, William and D. Alan Cruse
2004 Cognitive Linguistics. Cambridge: Cambridge University Press.
Dirven, René and Marjolijn Verspoor (eds.)
2004 Cognitive Exploration of Language and Linguistics. Second edition. (First
published in 1998.) Amsterdam/Philadelphia: John Benjamins.
Evans, Vyvyan
2004 The Structure of Time: Language, Meaning and Temporal Cognition.
Amsterdam/Philadelphia: John Benjamins.
Geeraerts, Dirk
1985 Paradigm and Paradox. Explorations into a Paradigmatic Theory of Meaning
and its Epistemological Background. Leuven: Leuven University Press.
1993 Cognitive semantics and the history of philosophical epistemology. In
Geiger, Richard A. and Brygida Rudzka-Ostyn (eds.), Conceptualizations
and Mental Processing in Language. (A Selection of Papers from the First
International Cognitive Linguistics Conference in Duisburg 1989.) Berlin/
New York: Mouton de Gruyter, 53–79.
2002 The interaction of metaphor and metonymy in composite expressions.
In Dirven, René and Ralf Pörings (eds.), Metaphor and Metonymy in
Comparison and Contrast. Berlin/New York: Mouton de Gruyter, 435–465.
Geeraerts, Dirk and Hubert Cuyckens (eds.)
2007 Handbook of Cognitive Linguistics. New York: Oxford University Press.
Grady, Joseph
1997a Foundations of meaning: Primary metaphors and primary scenes. Department
of Linguistics, University of California at Berkeley: Ph.D. dissertation.
1997b THEORIES ARE BUILDINGS revisited. Cognitive Linguistics 8, 267–290.
Grady, Joseph, Todd Oakley and Seana Coulson
1999 Conceptual blending and metaphor. In Gibbs, Raymond W. Jr. and Gerard
J. Steen (eds.), Metaphor in Cognitive Linguistics. Amsterdam/Philadelphia:
John Benjamins, 101–124.
Hampe, Beate (in cooperation with Joseph E. Grady) (eds.)
2005 From Perception to Meaning: Image Schemas in Cognitive Linguistics. (Cog-
nitive Linguistics Research 29.) Berlin/New York: Mouton de Gruyter.
Kövecses, Zoltán and Günter Radden
1998 Metonymy: Developing a cognitive linguistic view. Cognitive Linguistics 9,
37–77.
Kristiansen, Gitte, Michel Achard, René Dirven and Francisco J. Ruiz de Mendoza (eds.)
2006 Cognitive Linguistics: Current Applications and Future Perspectives. (Appli-
cations of Cognitive Linguistics 1.) Berlin/New York: Mouton de Gruyter.
Lee, David
2001 Cognitive Linguistics: An Introduction. Oxford: Oxford University Press.
Radden, Günter and Zoltán Kövecses
1999 Towards a theory of metonymy. In Panther, Klaus-Uwe and Günter Radden
(eds.), Metonymy in Language and Thought. Amsterdam/Philadelphia:
John Benjamins.
Taylor, John
2002 Cognitive Grammar. Oxford: Oxford University Press.
Tyler, Andrea and Vyvyan Evans
2001 Reconsidering prepositional polysemy networks: The case of over. Language
77, 724–765.
Ungerer, Friedrich and Hans-Jörg Schmid
2006 An Introduction to Cognitive Linguistics. Second edition. (First published in
1996.) London/New York: Longman.
Introduction
ARNE ZESCHEL*
One of the central tenets of Cognitive Linguistics is its fundamentally
usage-based orientation: language is seen as an inventory of dynamic
symbolic conventions (constructions) whose organisation is constantly
updated by (and hence adapting to) language use (Langacker 2000).
Such usage-based, emergentist views of language are also found in re-
cent work outside Cognitive Linguistics in the narrower sense: for in-
stance, there is experimental evidence from various sources that shared
symbolic communication systems can indeed emerge (on the interper-
sonal level) and be learned (on the individual level) in a data-driven, self-
organising manner that does not require substantial language-specific
stipulations (be it in humans or machines).¹ This is not to deny that
many aspects of the usage-based language model are still underspecified
and have the status of assumptions rather than established facts. However,
there is currently a commendable trend within Cognitive Linguistics
to put its programmatic appeal to the usage-based hypothesis to the test:
more and more studies set out to evaluate specific predictions of the
approach in different domains against appropriate experimental and/or
corpus data, thereby contributing to a successive refinement of the overall
model and helping to put it on a sound empirical footing (cf. Tummers
et al. 2005 as well as the papers in Gries and Stefanowitsch 2006 and
Gonzalez-Marquez et al. 2007 for recent overviews and applications).
Cognitive Linguistics 19–3 (2008), 349–355
DOI 10.1515/COGL.2008.013
0936–5907/08/0019–0349
© Walter de Gruyter
* Author's e-mail address: zeschel@uni-bremen.de. I am grateful to Kerstin Fischer, Anatol
Stefanowitsch and Felix Bildhauer for comments.
1. For the spontaneous emergence of novel symbolic communication systems among
humans, cf. Galantucci (2005); for the emergence of shared linguistic communication
systems (construction grammars) among cognitive robots, cf. Steels (2005); for over-
views of the usage-based approach to child language acquisition, cf. Tomasello (2003)
and Goldberg (2006); for unsupervised machine learning of a Langacker-style natural
language construction grammar, cf. Solan et al. (2005).
The papers in this special issue (which has grown out of a theme session
on Constructions in Language Processing held at the 2nd International
Conference of the German Cognitive Linguistics Association in Munich
in October 2006) all represent this line of research, with a focus on con-
structionist perspectives on (human) language processing and its rela-
tionship to the linguistic representations that speakers extract from their
experience.
The issue opens with a study of island effects in English clausal complement
constructions by Ben Ambridge and Adele Goldberg ("The island
status of clausal complements: evidence in favor of an information structure
explanation"). The authors compare the classical subjacency account
of constraints on filler-gap relations (Chomsky 1973) with an item-based
analogical approach (in which acceptability is a function of semantic
distance to a stored prototype) and their own proposal, in which ease of
extraction depends on the target's degree of backgroundedness in discourse
(a principle which they refer to as BCI: "backgrounded constituents
are islands"). Ambridge and Goldberg substantiate their hypothesis
with the results of two questionnaire studies, suggesting that the effects
investigated are best interpreted as a pragmatic anomaly reflecting the
fact that a constituent cannot be at the same time backgrounded and
focused. The authors conclude that the possibility of combining two constructions
in production is influenced by the information-structural properties
of the constructions involved (among other things).
Unbounded dependency constructions in English are also the topic of
the second study, "Questions with long-distance dependencies: A usage-based
perspective" by Ewa Dąbrowska. In contrast to Ambridge and
Goldberg, Dąbrowska is concerned with how the acceptability of different
types of WH-questions with long-distance dependencies can be
predicted from their similarity to an assumed prototype rather than from
general semantic/pragmatic principles: starting from the observation
that naturally occurring instances of this construction tend to be highly
stereotypical, she suggests that they are not derived by abstract rules but
by modifying (or, in comprehension: by comparing a given target to)
a stored low-level schema of the format WH do you think/say S-GAP?
Dąbrowska presents evidence for the predicted prototypicality effects
from an acceptability judgment experiment and points to possible interpretations
of the obtained results in terms of both strongly item-based/
analogical models and a hybrid architecture that also represents abstract
schemas alongside specific exemplars.
Similar to the first two contributions, the third and fourth papers in the
volume both deal with the same linguistic phenomenon, but with a different
focus and with different aims. In my own contribution ("Lexical
chunking effects in syntactic processing"), I report an experiment on syntactic
ambiguity resolution that seeks to probe the psychological reality
and processing relevance of partially schematic "prefabs" (i.e., the kinds
of low-level schemas that speakers are assumed to store in usage-based
Construction Grammar). The results of the experiment indicate that
global complementation preferences applying to a given verb at large
(i.e., considering its entire usage spectrum) may be overridden by conflicting
evidence for specific syntagmatic chunks in which this verb occurs.
These results are interpreted as support for the usage-based view that
such structures may have independent memory storage even when they
are fully predictable, and that such representations are furthermore privileged
over more abstract (i.e., lexically unfilled) constructions in language
processing.
Dealing with the same phenomenon (i.e., garden path effects resulting
from a specific type of local syntactic ambiguity in English), Daniel
Wiechmann's paper "Initial parsing decisions and lexical bias: Corpus
evidence from local NP/S-ambiguities" has a more methodological focus.
The author presents a corpus-linguistic approach to assessing verbal complementation
preferences in terms of collostruction strength using the
method of Distinctive Collexeme Analysis (DCA; Gries and Stefanowitsch
2004). Using a balanced corpus, both verb-general and (verb-)sense-specific
associations with different complementation patterns are
computed for 20 verbs and related to on-line measures of processing difficulty
from an earlier reading experiment with these verbs (Hare et al.
2003). The results confirm the hypothesis that sense-specific associations
(as determined by the DCA) are a better predictor of processing preferences/difficulties
than form-based associations. Moreover, the author suggests
that the observed correlation between the corpus-derived predictions
and Hare et al.'s experimental findings indicates that collostruction
strength is a valid approximation of constructional association strength
on the psychological plane.
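The kind of association measure involved can be illustrated with a small sketch. It assumes, purely for illustration, that collostruction strength is expressed as the negative base-10 logarithm of a two-sided Fisher exact p-value over a 2×2 table of verb-by-pattern counts. The function names and all counts are hypothetical and are not taken from Wiechmann's study.

```python
from math import comb, log10


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one.
    return sum(
        p for p in map(prob, range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )


def distinctive_collexeme(verb_np, verb_s, all_np, all_s):
    """Return (preferred pattern, strength) for one verb.

    verb_np / verb_s: hypothetical counts of the verb with NP vs. sentential
    complements; all_np / all_s: corpus totals of each pattern.
    """
    p = fisher_exact_two_sided(verb_np, verb_s, all_np - verb_np, all_s - verb_s)
    expected_np = (verb_np + verb_s) * all_np / (all_np + all_s)
    preferred = "NP" if verb_np > expected_np else "S"
    return preferred, -log10(p)


# Invented counts, for illustration only.
print(distinctive_collexeme(verb_np=30, verb_s=5, all_np=400, all_s=600))
```

On this reading, a larger value indicates a stronger attraction between the verb and the preferred pattern. Computing the same table separately for each sense of a verb would give the sense-specific counterpart of the verb-general score.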
Holger Diessel's study "Iconicity of sequence: A corpus-based analysis
of the positioning of temporal adverbial clauses in English" is devoted to
aspects of production again. The author discusses a range of factors that
influence speakers' choice of the positioning of adverbial clauses relative
to the matrix clause in different languages, with special attention to one
of these motivations, iconicity of sequence (i.e., the iconic encoding of
prior events in preposed clauses and posterior events in postposed
clauses). Diessel's study reveals that the ordering of temporal adverbial
clauses in English is significantly affected by iconicity of sequence, which
is viewed as a processing principle geared at avoiding structures that are
difficult to plan and to interpret. In a second step, the author uses logistic
regression analysis to integrate the observed effect into a more comprehensive
model of processing constraints on clause order in complex
sentences which also includes factors such as clause length, syntactic complexity
and pragmatic import. The resulting picture is a model in which
speakers seek to balance multiple constraints on their constructional encoding
options in order to minimise overall processing load.
Though concerned with yet a different aspect of language processing,
Martin Hilpert's study "New evidence against the modularity of
grammar: Constructions, collocations and speech perception" is again
interested in the psychological status and processing relevance of entrenched
exemplars of a given construction. However, the overall thrust
of Hilpert's argument is different from that of other papers in the issue
which are concerned with item-based effects in language processing: by
showing that the phonemic categorisation of a synthesised ambiguous
sound (located somewhere on a continuum between two phonemes) can
be biased in either direction by embedding it in an appropriate collocational
carrier phrase, the study documents syntactic top-down effects
on word recognition that are difficult to reconcile with strictly serial-modular
theories of language processing. Hilpert provides evidence that
the observed effect applies immediately (i.e., at the level of auditory input
processing), which implies that it cannot be explained by appealing to
late feedback between modules. Instead, the author argues that frequent
word combinations have psychological reality as independent units
of linguistic knowledge, and that lexical and syntactic aspects of language
processing are not plausibly attributed to separate (i.e., informationally
encapsulated) mental modules.
Like Hilpert's study, the final contribution addresses a famous tenet
of linguistic theories that are decidedly non-emergentist: in "Negative
entrenchment: A usage-based approach to negative evidence", Anatol
Stefanowitsch presents a new perspective on the so-called "no negative
evidence" problem that figures prominently in nativist accounts of language
acquisition. The author contrasts different strategies for overcoming
the problem that have been proposed in the literature and then
presents a new approach that builds on the notion of negative entrenchment:
if speakers keep track of how often a particular simplex element
or feature occurs in the input, Stefanowitsch argues, such information
could be used to form subconscious expectations as to how often it should
co-occur with other elements or features in the language if there were
nothing in the grammar to prevent this. Learners could thus distinguish
absences in the input that are statistically significant from those that are
merely accidental, with continued non-occurrence of statistically expected
combinations resulting in their growing negative entrenchment. The
author backs up his proposal with the results of a pilot study which suggests
that corpus-derived scores of negative entrenchment are a better predictor
of experimental (un)acceptability judgments than corpus-derived
measures of constructional pre-emption (i.e., one of the other mechanisms
discussed in the literature that are assumed to compensate for the lack of
explicit negative evidence).
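The reasoning behind negative entrenchment can be sketched numerically. The following is a minimal illustration under a simplifying independence assumption and is not Stefanowitsch's actual measure. The function names and frequencies are hypothetical.

```python
def expected_cooccurrences(freq_item, freq_construction, corpus_size):
    """Co-occurrences expected by chance if item and construction were independent."""
    return freq_item * freq_construction / corpus_size


def chance_of_total_absence(freq_item, freq_construction, corpus_size):
    """Probability of observing zero co-occurrences under independence.

    Each of the corpus_size slots is treated as an independent trial, a
    simplifying assumption made only for this sketch.
    """
    p_joint = (freq_item / corpus_size) * (freq_construction / corpus_size)
    return (1 - p_joint) ** corpus_size


# Invented frequencies: a frequent and a rare item, each never attested
# in a construction that occurs 2,000 times in a 10,000-slot corpus.
for label, freq in [("frequent", 1000), ("rare", 10)]:
    print(label,
          expected_cooccurrences(freq, 2000, 10_000),
          chance_of_total_absence(freq, 2000, 10_000))
```

With these invented figures, the frequent item would be expected to co-occur with the construction many times, so its complete absence is very unlikely to be accidental. The rare item's absence remains compatible with chance, which is the distinction between significant and accidental gaps drawn above.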
In sum, the papers collected in this special issue demonstrate many
interesting prospects of combining a usage-based approach to grammar
with suitable empirical methodologies: the contributions fill empirical and
methodological gaps on the constructionist research agenda (Wiechmann;
Diessel), they put important assumptions of the hypothesised model to
the test or extend it in novel ways (Zeschel; Hilpert; Stefanowitsch), they
reframe classical issues in grammatical theory from a usage-based per-
spective (Ambridge and Goldberg; Dabrowska; Stefanowitsch), and they
challenge more general claims about the properties of language and cog-
nition that rest in part on questionable arguments from theoretical lin-
guistics (Hilpert). At the same time, there are a number of important
issues on which not all contributors might agree (such as the scope and
explanatory status of item-based approaches to language processing and
representation; cf. Abbot-Smith and Tomasello 2006). However, this
should only encourage further empirical investigation of these issues, and
future research can of course only benefit from the fact that relevant dif-
ferences are clearly articulated rather than glossed over.
That said, readers may wonder how it is that one particular strand of
this research is not featured in this special issue at all, i.e., usage-based
work in computational linguistics. Clearly, statistical approaches to
natural language processing share important assumptions of usage-based
theories of language, and particular models might thus provide a useful
empirical touchstone for hand-crafted reconstructions of e.g., construc-
tion learning processes (cf. Bod, in press). Moreover, moving beyond
the purely statistical aspects of language and language processing, the
transition from traditional computational modelling to experiments with
embodied robotic agents that learn to associate linguistic signs with
aspects of their sensory-motor experience (e.g., Dominey and Boucher
2005; Steels and Kaplan 2002; Sugita and Tani 2005) provides a wealth
of further interesting possibilities for investigating some of the very key
concerns of Cognitive Linguistics from a new perspective (cf. also Zlatev
and Balkenius 2001). However, it is beyond the scope of this special issue
to map out points of contact between these two research communities.
For the moment, then, suffice it to acknowledge that usage-based ap-
proaches to language are gaining more and more currency also in neigh-
bouring disciplines, and that the increasing integration of appropriate
methodologies from linguistics, cognitive psychology and computer sci-
ence promises many interesting perspectives for future research on the
cognitive instantiation of language.
Universität Bremen, Germany
References
Abbot-Smith, Kirsten, and Michael Tomasello
2006 Exemplar-learning and schematization in a usage-based account of syntactic
acquisition. The Linguistic Review 23, 275–290.
Bod, Rens
in press Constructions at work or at rest? Cognitive Linguistics.
Chomsky, Noam
1973 Conditions on Transformations. In: Anderson, Stephen R., and Paul
Kiparsky (eds.), A Festschrift for Morris Halle. New York: Holt, Rinehart
& Winston, 232–286.
Dominey, Peter F. and Jean-David Boucher
2005 Developmental stages of perception and language acquisition in a perceptually
grounded robot. Cognitive Systems Research 6, 243–259.
Galantucci, Bruno
2005 An experimental study of the emergence of human communication systems.
Cognitive Science 29, 737–767.
Goldberg, Adele E.
2006 Constructions at work. The nature of generalization in language. Oxford:
Oxford University Press.
Gonzalez-Marquez, Mónica, Irene Mittelberg, Seana Coulson and Michael J. Spivey (eds.)
2007 Methods in Cognitive Linguistics. Amsterdam: John Benjamins.
Gries, Stefan Th. and Anatol Stefanowitsch
2004 Extending collostructional analysis: a corpus-based perspective on alternations.
International Journal of Corpus Linguistics 9(1), 97–129.
Gries, Stefan Th., and Anatol Stefanowitsch (eds.)
2006 Corpora in Cognitive Linguistics: Corpus-based Approaches to Syntax and
Lexis. Berlin: Mouton de Gruyter.
Hare, Mary L., Ken McRae, and Jeffrey L. Elman
2003 Sense and structure: Meaning as a determinant of verb subcategorization
preferences. Journal of Memory and Language 48(2), 281–303.
Langacker, Ronald W.
2000 A dynamic usage-based model. In: Barlow, Michael and Suzanne Kemmer
(eds.), Usage-Based Models of Language. Stanford: CSLI, 1–63.
Solan, Zach, David Horn, Eytan Ruppin, and Shimon Edelman
2005 Unsupervised learning of natural languages. Proc. Natl. Acad. Sci. 102,
11629–11634.
Steels, Luc
2005 The emergence and evolution of linguistic structure: from lexical to grammatical
communication systems. Connection Science 17(3/4), 213–230.
Steels, Luc and Frédéric Kaplan
2002 Bootstrapping grounded word semantics. In Briscoe, Ted (ed.), Linguistic
evolution through language acquisition: formal and computational models.
Cambridge: Cambridge University Press, 53–73.
Sugita, Yuuya and Jun Tani
2005 Learning semantic combinatoriality from the interaction between linguistic
and behavioral processes. Adaptive Behavior 13(1), 33–52.
Tummers, José, Kris Heylen, and Dirk Geeraerts
2005 Usage-based approaches in Cognitive Linguistics: A technical state of the
art. Corpus Linguistics and Linguistic Theory 1, 225–261.
Tomasello, Michael
2003 Constructing a Language: A Usage-based Theory of Language Acquisition.
Cambridge, MA: Harvard University Press.
Zlatev, Jordan, and Christian Balkenius
2001 Introduction: Why Epigenetic Robotics? In: Balkenius, Christian, Jordan
Zlatev, Hideki Kozima, Kerstin Dautenhahn, and Cynthia Breazeal (eds.),
Proceedings of the First International Workshop on Epigenetic Robotics:
Modeling Cognitive Development in Robotic Systems. Lund University Cog-
nitive Studies, 85 14.
The island status of clausal complements:
Evidence in favor of an information structure
explanation
BEN AMBRIDGE and ADELE E. GOLDBERG*
Abstract
The present paper provides evidence that suggests that speakers determine
which constructions can be combined, at least in part, on the basis of the
compatibility of the information structure properties of the constructions involved.
The relative island status of the following sentence complement
constructions is investigated: bridge verb complements, manner-of-speaking
verb complements and factive verb complements. Questionnaire
data is reported that demonstrates a strong correlation between acceptability
judgments and a negation test used to operationalize the notion of
backgroundedness. Semantic similarity of the main verbs involved to
think or say (the two verbs that are found most frequently in long-distance
extraction from complement clauses) did not account for any variance; this
finding undermines an account which might predict acceptability by analogy
to a fixed formula involving think or say. While the standard subjacency
account also does not predict the results, the findings strongly support the
idea that constructions act as islands to wh-extraction to the degree that
they are backgrounded in discourse.
Keywords: island constraints; constructions; sentence complements; man-
ner of speaking verbs; factive verbs; bridge verbs.
1. Introduction
Imagine the President was given an incriminating top secret FBI file
about a person who worked closely with him. Watching him storm out
Cognitive Linguistics 19–3 (2008), 357–389
DOI 10.1515/COGL.2008.014
0936–5907/08/0019–0357
© Walter de Gruyter
* We are grateful to Ewa Dąbrowska, Mirjam Fried, Jon Sprouse, Mike Tomasello and
Robert Van Valin for comments on an earlier draft. We are also grateful to Blazej Galkowski
and to Ewa Dąbrowska for discussion about the Polish facts. This work was sup-
ported by an NSF grant to the second author: NSF 0613227. Correspondence addresses:
Ben.Ambridge@liverpool.ac.uk, adele@princeton.edu.
of the room, the people gathered may well wonder who the report was
about. And yet they could not formulate the question as follows:
(1) *Who did he just read the report that was about _?
As this example illustrates, even when questions appear to be semanti-
cally appropriate, there are constraints on what can count as a question.
Where do such constraints come from? The question has been at the heart
of linguistic theorizing for decades. Many researchers assume that the an-
swer must lie in a system of innate linguistic knowledge that is built on
purely formal principles that are specic to language, since it is not di-
cult to come up with contexts in which ill-formed questions would seem
to be semantically appropriate as in the example just given (e.g., Chom-
sky 1973; Ross 1967; Pinker and Bloom 1990).
In this paper we compare the viability of the following proposals: a) a
formal subjacency account, b) an account that predicts acceptability to
be determined by semantic comparison to a high-frequency formula, and
c) the hypothesis that discourse properties of the constructions involved
determine the relative acceptability of long-distance dependencies.
1.1. Filler-Gap constructions
WH-questions typically involve a constituent that appears in a position
other than its canonical position. We refer to the displaced constituent as
the filler (indicated by italics), and the place where the constituent would
appear in a simple sentence, the gap (_). In this way, we can avoid the
common terminology that the filler is "extracted" from the site of the gap
and "moved" to the front of the sentence, since we do not assume that
there is any actual movement (see e.g., Ambridge et al. 2006, 2008a; Sag
and Fodor 1994; Van Valin 1993; for non-movement accounts of simple
and complex question formation). An example of a question filler-gap
construction is given in (2):
(2) Who did she think he saw _?
Relative clauses and topicalizations are other types of filler-gap
constructions as in (3) and (4):
(3) I met the man who I think you saw _. (relative clause)
(4) Whitefish and bagels, she served _. (topicalization)
Ross (1967) first observed constraints on filler-gap relations. Certain
syntactic constructions are "islands" to such relations: in particular, they
may not contain the gap.¹ Syntactic islands include complex noun
phrases, subjects, adjuncts, complements of manner-of-speaking verbs
and complements of factive verbs as illustrated below.
Judgments in the case of complex NPs and subject islands are more
robust, and less dependent on context, than in any of the latter three
instances. Exploring these subtle differences in judgments requires us to
look in a more detailed way at the discourse functions of each of the
constructions involved. We return to this issue of graded judgments below.

Table 1. Classic examples of island constraints

Example                                               Island type
*Who did she see the report that was about _?         Complex NPs (both noun complements
  (cf. She saw the report that was about x)           and relative clauses)
*Who did that she knew _ bother him?                  Subjects
  (cf. That she knew x bothered him)
??What did she leave the movie because they           Presupposed adjuncts
  were eating _?
  (cf. She left the movie because they were eating x)
??What did she whisper that he saw _?                 Complements of manner-of-speaking
  (cf. She whispered that he saw x)                   verbs
??What did she realize that he saw _?                 Complements of factive verbs
  (cf. She realized that he saw x)

1.2. Subjacency
How should constraints on filler-gap constructions be accounted for?
Since Chomsky (1973), the dominant view has been that constraints on
filler-gap constructions arise from a "subjacency" constraint: namely
that the gap cannot be separated from the filler by two or more "bounding
nodes", where S and NP are defined to be bounding nodes.² Subjacency
is a parade example of a constraint that has been claimed to be formal
and specific to language: part of "universal grammar" (Newmeyer
1991). The subjacency account predicts that complex NPs, subjects and
all adjuncts should be islands.
1. The "island" metaphor was based on the idea that the filler moved from the gap
position to the front of the sentence. Islands refer to constituents from which a filler
cannot move.
2. NP and S are considered bounding nodes in English. NP and S′ appear to be bounding
nodes in Italian (Rizzi 1982) and S, S′ and NP appear to be bounding nodes in Russian
(Freidin and Quicoli 1989). That is, Italian speakers can apparently extract out of
WH-complements, while Russian speakers can only extract out of main clauses.
At the same time, the subjacency account predicts that gaps within
clausal complements should be acceptable since only one bounding node
(S) intervenes between the filler (who) and the gap (_). This prediction in
fact holds when the main verb is a semantically light ("bridge") verb of
saying or thinking (including think, say, believe) (cf. 5):
(5) Who did she think that he saw _?
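The bounding-node count behind these predictions can be sketched in a few lines of Python. This is only an illustration of the simplified counting described in the text, not an implementation of any syntactic theory: the function names and the hand-annotated node paths are assumptions introduced here, and the paths simply list the labels taken to intervene between filler and gap.

```python
# Bounding nodes for English, as stated in the text (S and NP).
BOUNDING_NODES = {"S", "NP"}


def count_bounding_nodes(path):
    """Count the bounding nodes among the labels between filler and gap."""
    return sum(1 for label in path if label in BOUNDING_NODES)


def violates_subjacency(path):
    """Return True if two or more bounding nodes separate filler and gap."""
    return count_bounding_nodes(path) >= 2


# Hand-annotated, simplified paths (hypothetical annotations for illustration).
# (5) Who did she think that he saw _?  -> one bounding node (S)
bridge_complement = ["S'", "S"]
# *Who did she see the report that was about _?  -> NP plus S
complex_np = ["NP", "S'", "S"]

for name, path in [("bridge complement", bridge_complement),
                   ("complex NP", complex_np)]:
    print(name, count_bounding_nodes(path), violates_subjacency(path))
```

On this counting, a purely structural account has no way to separate bridge, manner-of-speaking and factive complements, since all three would receive the same path annotation.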
However, while gaps within the complement clauses of bridge verbs
are, as predicted, acceptable, the subjacency account does not explain
why gaps within the complements of manner of speaking verbs or factive
verbs should be less than fully acceptable, since the syntactic structures
appear to be the same (Erteschik-Shir and Lappin 1979; Ross 1967).
(6) ??Who did she mumble that he saw _?
Manner of speaking verb complement
(7) ??Who did she realize that he saw _?
Factive verb complement
The natural solution for a syntactic account is to argue that the syntac-
tic structures are not actually the same. In fact it has been suggested that
the complements of manner of speaking verbs are adjuncts, not argu-
ments (Baltin 1982). This idea is supported by the fact that the clausal
complement is optional:
(8) She shouted that he left.
(9) She shouted.
Since adjuncts are predicted to be islands on the subjacency account,
this move predicts that clausal complements of manner of speaking verbs
should be islands. However, clausal complements are restricted to
appear with a fairly narrow set of verbs including verbs of saying and
thinking; this restrictiveness is a hallmark of arguments, not adjuncts.
Moreover, (9) does not convey the same general meaning as (8) insofar
as only (8) implies that propositional content was conveyed; the change
of basic meaning when omitted is another hallmark of arguments. In
addition, direct object arguments can replace clausal complements (e.g., 10),
and yet it would be highly unusual to treat a direct object as an adjunct:
(10) She shouted (the remark).
Finally, the possibility of treating the complement clause as an adjunct
clearly does not extend to factive verbs, since their clausal complements
are not generally optional (cf. 11–12).
(11) She realized that he left.
(12) ??She realized.
Kiparsky and Kiparsky (1971) suggest a different solution to account
for the island status of clausal complements of factive verbs. They suggest
that factive clausal complements contain a silent "the fact" rendering the
clausal complements part of a complex NP (as in 13).
(13) She realized the fact that he left.
This analysis predicts that the complement clauses of factive verbs
should be as strong islands as overt NP complements, since expressions
such as (14) and (15) would be structurally identical:
(14) *Who did she realize the fact that he saw?
(15) ??Who did she realize that he saw?
Intuitively, however, (14) is less acceptable than (15). Moreover, positing
a silent "the fact" phrase to account for the ill-formedness of examples
like (15) is ad hoc unless a principled reason can be provided for not
positing a silent NP (e.g., "the idea") in the case of bridge verbs which readily
allow extraction.
(16) *Who did she believe the idea that he saw?
(17) Who did she believe he saw?
To summarize, if, in fact, the syntax is the same and only the lexical
semantics differs, subjacency does not predict variation in judgments
across different verb classes. The complement clauses must be reanalyzed
as either adjuncts or parts of complex NPs (to our knowledge, it has not
been proposed that they could be subjects, but that would be the other
option), but each of these possibilities raises issues that would need to be
addressed for the proposed alternative analyses to be convincing.
1.3. A possible direct-analogy account
Other researchers have emphasized that long-distance filler-gap
constructions are exceedingly rare in spoken corpora. Dabrowska (2004) and
Verhagen (2006) both observe that the only long-distance filler-gap
expressions to occur with any regularity at all are specific formulas with the
verb think or say (WH DO you think/say S?).
Dabrowska notes that of a total of 49 long-distance filler-gap
constructions produced by five children in CHILDES corpora, all but two were
instances of these formulas. Dabrowska notes further that 96 percent of
adults' long-distance filler-gap constructions in the Manchester corpus
also involve the main verb think or say (2004: 197).
Verhagen (2006) likewise observes that in both English and Dutch
corpora, questions out of main verb complements are almost uniformly
instances of the formula, WH do you think S? or, in the case of Dutch,
WH denk-pron_2nd dat?. In a search of the English Brown corpus of written
texts, Verhagen finds that 10 out of 11 examples of long-distance
filler-gap constructions involved the verb think; in a search of a Dutch
newspaper 34 out of 43 long-distance filler-gap constructions likewise involved
the verb denken ("think").
Dabrowska (2004, this issue) reports sentence judgment studies in
which she compared judgments on instances of the WH do you think/say
S? formula with variations of the formula. Her study demonstrates that
questions of the form WH do you think S? are judged to be more
grammatical than questions that instead involve auxiliaries (will or would) or
a different verb (suspect, claim, swear, believe) or that include an overt
complementizer that.³ See also Poulsen (2006) for similar findings for the
verb denken ("think") in Dutch. One might quibble with certain aspects of
Dabrowska's study; for example, half of the questions used as stimuli
involved the verbs think or say, and it is possible that the repetition led
subjects to give those instances higher ratings due to a general fluency
effect (see e.g., Jacoby et al. 1989). In addition, we know that strings
that contain more frequent words tend to be judged as more acceptable,
all other things being equal (Ambridge et al. 2008b; Featherston 2005;
Keller 2000; Kempen and Harbusch 2003, 2004; Schuetze 1996); yet the
high frequency do was compared with the low frequency would, and the
high frequency think was compared with lower frequency verbs. Nonetheless,
simply given the high frequency of WH do you think S? and WH do
you say S? it seems reasonable to accept that these templates may be
stored, as Dabrowska, Verhagen and Poulsen suggest.
Both Dabrowska (this issue) and Verhagen (2006) go further, however,
and argue that other instances of long distance dependency questions are
judged by analogy to a fixed high-frequency formula, WH do you think
S?. Verhagen (2006), for example, suggests that "Instances that do not
conform to [the formulaic question], can be seen as analogical extensions
from this prototype. . . . invented sentences exhibiting long distance
WH-movement will be worse, the more they deviate from the prototype."
3. Dabrowska (this issue) finds no significant effect for changing the second person
subject, you, to a proper name, and the auxiliary must agree with the subject, so the
stored formulas may be the more general WH DO NP think S? and WH DO NP say S?
Dabrowska (this issue) likewise suggests that in order to produce
questions such as What does she hope she'll get?, i.e., questions that do not
fit the stored WH do you think S? template, speakers must adapt the
template, substituting she for you, hope for think, and does for do.
Bybee (2007) interprets usage-based theories to claim that
grammaticality = familiarity, with general semantic or pragmatic constraints playing
little role. She states, "Under the usage-based notion that lack of
grammaticality is lack of familiarity, the oddness of these sentences [island
violations] can be said to be in part due to the fact that one rarely hears such
combinations of structures" (2007: 695).
If this view were extended to all constructions and combinations of
constructions, it might be suggested that all of our knowledge of grammar
is essentially item-based. What appear to be generalizations or novel
combinations of constructions would on this view simply be one-shot
analogies on memorized formulaic expressions.
Few researchers have actually defended a purely exemplar-based model
of linguistic knowledge, as usage-based models are not normally
interpreted in this way. In particular, usage-based models espoused by
Langacker (1988), Tomasello (2003) and Goldberg (2006) emphasize that
speakers form generalizations over instances as they record specific
instance-based knowledge (see also Murphy 2002 for a similar view
of non-linguistic categorization). Dabrowska (2004, this issue) and
Verhagen (2006) in fact likewise take a moderate position, allowing that
generalizations are often formed for constructions that are exemplified by a
wide variety of examples in the input. Dabrowska has argued, for example
in the case of other constructions, that "early usage is highly
stereotypical and . . . development proceeds from invariant formulas through
increasingly general formulaic frames to abstract templates" (2004: 200,
emphasis added). Verhagen (2006) also notes that higher type frequency
of examples will lead to more abstract representations (see also Bybee
1985, 1995).
Still, the question of whether we generalize beyond the exemplars is
highly relevant to the present case in which the vast majority of attested
examples instantiate only one or two relatively concrete types. We may
grant that these types, namely the formulas WH do you think S? and
WH do you say S?, are likely to be stored, given their high frequency
and the judgment data collected by Dabrowska (this issue). The question
raised by Dabrowska, Verhagen and Poulsen's work is: is this all
speakers have? Or, instead, is there evidence for a more abstract
generalization about the function of long distance dependency constructions that
enables us to combine the clausal complement and question constructions
on the fly?
1.4. Backgrounded Constructions are Islands (BCI) account
Several researchers have argued that the constraints on filler-gap
constructions are best accounted for in terms of certain discourse properties
of the constructions involved. A fundamental insight of this perspective
is the observation that the gap generally must fall within the potential
focus domain of the sentence (Erteschik-Shir 1979; Erteschik-Shir 1998;
Takami 1989; Deane 1991; Van Valin 1993, 1995; Van Valin and LaPolla
1997).⁴ That is, the constituent in which the gap exists (i.e., the
constituent containing the canonical position for the filler) must be within the
part of the utterance that is asserted; it cannot be presupposed or
otherwise backgrounded. Presuppositions of a sentence are revealed by a
classic negation test: presuppositions are implied by both the positive
and negative form of a sentence. In accordance with this observation,
notice that all of the constructions in Table 1, with the exception of
manner of speaking verb complements, convey presupposed information.
This is indicated in Table 2: i.e., the negations of the sentences in Table 1,
just like their positive counterparts, imply the propositional content
expressed by the island. Thus these island constructions do not express
the assertion of a sentence: they are not part of the focus domain.⁵

Table 2. Islands involve non-asserted (here presupposed) information
Complex NPs
1. She didn't see the report that was about him.
   → The report was about him.
Sentential subjects
2. That she knew it didn't bother him.
   → She knew it.
3. She didn't leave the movie after they ate it. → They ate it.
4. She didn't realize that he saw the roses. → He saw the roses.

Presupposition is a special case of non-assertion: what is presupposed is
taken for granted by both the positive and negated version of a sentence.
Another type of non-assertion is also revealed by the negation test, but is
distinct from presupposition in that neither the embedded proposition nor
its negation is implied by either the positive or the negated form of the
sentence. Complements of manner-of-speaking verbs involve this type of
non-assertion:
(18) She shouted that he left.
implies neither He left nor He didn't leave.
(19) She didn't shout that he left.
implies neither He left nor He didn't leave.
That is, normally a manner of speaking verb is used when the manner
of speaking and not the content of the complement clause is the main
assertion of the clause:
(20) She didn't mumble that he left.
Natural interpretation: She didn't mumble the content.
Notice that in a context in which the manner of speaking can be taken
for granted, the complement clause can be interpreted as asserted. For
example, in a game of whisper-down-the-alley, main clause negation can be
interpreted as negating the lower clause:
(21) I didn't whisper that the horse was green.
Natural interpretation: That the horse was green is not what I
whispered. (e.g., I whispered that the house was clean)
As predicted by the information structure account, in this context, a
gap within the complement clause is much improved:
(22) What did you whisper that the house was?
Thus we see that when the complements of manner-of-speaking verbs
are not within the focus domain (i.e., not construed to convey the main
assertion of a sentence), they are islands to extraction. In special
contexts where they are construed to be within the focus domain, their island
status is noticeably mitigated. Thus the notion of "potential focus
domain" is clearly relevant to island constraints, as many have noted for
a long time (see references above).
4. Van Valin's (1995) account suggests that the potential focus domain is defined
structurally: that all direct daughters of direct daughters of the illocutionary force
operator are within the potential focus domain. This account, like the subjacency account
above, requires an appeal to other factors to explain the fact that the complements of
manner of speaking and factive verbs are not fully acceptable since they are within his
structurally defined potential focus domain (being direct daughters of direct daughters
of the illocutionary force operator). A Gricean explanation has been offered for manner
of speaking verbs (Van Valin 1997), but complements of factive verbs are predicted to be
acceptable. In its favor, the direct-daughters proposal is aimed at predicting which
constructions are non-backgrounded so that each construction need not be investigated on
a case-by-case basis.
5. In interpreting sentential negation, care must be taken not to place focal stress on any
constituent. Contrastive or metalinguistic negation can negate content expressed within
islands, but then this type of negation can be used to negate anything at all, including
pronunciation or choice of lexical items (She didn't realize that he saw the ROSES, she
realized that he saw CARNATIONS!).
At the same time, the potential focus domain does not capture the
relevant facts perfectly. Subject complements are not within the focus
domain, as they (or their existence) are presupposed:
(23) The king of France is bald.
→ There is a king of France.
(24) The king of France isn't bald.
→ There is a king of France.
And yet the entire subject argument is available for questioning:
(25) Who is bald?
The subject argument is not within the focus domain,⁶ but it plays a
special role in the information structure of a sentence in that it generally
serves as the primary topic. In order to allow for the fact that (entire)
subject arguments are available to serve as gaps, despite their not being
within the focus domain of a sentence, Goldberg (2006: 135) formulates the
generalization as follows:
Backgrounded constituents may not serve as gaps in filler-gap
constructions.
(Backgrounded constructions are islands: BCI)
Backgrounded constituents are defined as constituents that are neither
the primary topic nor part of the focus domain of a sentence. Elements
within clausal subjects are backgrounded in that they are not themselves
the primary topic, nor are they part of the focus domain. Relative clauses,
noun complements, presupposed adjuncts, parentheticals, and active
ditransitive recipients are also not part of the focus domain of the clause
and are therefore backgrounded (cf. Goldberg 2006). In this way, the
account correctly predicts that a wide range of constructions should all be
islands to long-distance dependency relations.
The restriction on backgrounded constructions is motivated by the
function of the constructions involved. Elements involved in unbounded
dependencies are positioned in discourse-prominent slots. It is
pragmatically anomalous to treat an element as at once backgrounded and
discourse-prominent.
We have seen that the BCI predicts that complements of factive verbs
should be islands, since, by definition, the complements of factive verbs
are presupposed and are therefore backgrounded. The complements
of manner-of-speaking verbs are also predicted to be islands except
in special contexts in which the manner is taken for granted. But as
noted above the judgments of ill-formedness in these cases are somewhat
subtle. While factive verbs more strongly presuppose the content of their
complement clauses, it is not obvious that they are stronger islands than
manner-of-speaking verbs, though this is what the BCI hypothesis
predicts. Complements of semantically light bridge verbs (e.g., say, think)
are predicted not to be islands, as these "neutral" verbs are generally
used to introduce a complement clause containing the foregrounded
information.
6. Subject arguments may be within the focus domain in a limited type of sentence-focus
construction (Lambrecht 1994). This construction requires special sentence accent on
the subject argument and occurs with a restricted set of mostly intransitive verbs.
2. Testing the hypotheses
In this paper we set out to investigate the following questions: a) Do
judgments relating to the negation test correlate with judgments concerning
island status as the BCI account predicts? b) Do judgments concerning
island status correlate with similarity of the main verbs involved to the
verbs think and say as the direct-analogy proposal would predict?
We decided to restrict our investigation to one particular filler-gap
construction: long-distance WH-extraction from clausal complements.
This allowed us to control for overall sentence length and complexity,
as ratings were obtained for different verbs in exactly the same syntactic
pattern. Four verbs were chosen from each of three classes of
clausal-complement-taking verbs:⁷
a. factive verbs (realize, remember, notice, know)
b. manner-of-speaking verbs (whisper, stammer, mumble, mutter)
c. bridge verbs (say, decide, think, believe)
7. We originally additionally included four whether-complement taking verbs, but these are
treated as fillers in the analysis that follows. The authors have (currently unpublished)
data which suggests that subjects treat whether as being intermediate between a
complementizer and a WH-word.
2.1. Difference scores
As described in detail in the methods section, we collected acceptability
ratings for both WH-questions and the corresponding declarative
statements. We used as our measure of acceptability of the WH-question a
difference score (or "dispreference for question-form" score) calculated by
subtracting the rating for each WH-question from the rating for the
corresponding declarative statement, averaging across all subjects for each
item. For example, the number assigned to measure the dispreference for
extraction in Who did Pat stammer that she liked? was arrived at by
subtracting subjects' rating of this sentence from their rating of the
corresponding declarative sentence, Sara stammered that she liked Dominic.
This allows us to control for any general (dis)preferences that participants
might have for particular VERB+COMP combinations. Such (dis)preferences
might be expected to occur on the basis of simple frequency (e.g.,
sentences containing say that might be rated as more acceptable than
sentences containing stammer that, regardless of whether they are
interrogative or declarative) and/or the extent to which certain verbs felicitously
introduce complement clauses (again in both declaratives and
interrogatives). Indeed, in the present study, for example, declarative sentences of
the form NP said that S received a mean rating of 5.9 out of 7, while
sentences of the form NP stammered that S received a mean rating of 4.7.
The finding that subjects give lower ratings of acceptability to sentences
containing low frequency strings when other factors are held constant is
well attested in the literature (see references cited earlier). Using difference
scores ensures that our dependent measure reflects the extent to which
participants consider particular WH-extraction questions to be
ungrammatical, controlling for the frequency of particular lexical strings. The
higher the difference score, the higher the dispreference for the
WH-question form (i.e., the higher the difference score, the stronger the island
to extraction).
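The difference score described in this section can be sketched as a short computation. The following Python fragment is a minimal illustration under stated assumptions: the function name is introduced here, and the ratings are invented placeholder values on the seven-point scale, not data from the study.

```python
from statistics import mean


def difference_score(declarative_ratings, question_ratings):
    """Mean over subjects of (declarative rating - WH-question rating).

    Each list holds one rating per subject for the same item, in the
    same subject order. Higher scores mean a stronger dispreference
    for the question form.
    """
    if len(declarative_ratings) != len(question_ratings):
        raise ValueError("need one declarative and one question rating per subject")
    return mean(d - q for d, q in zip(declarative_ratings, question_ratings))


# Hypothetical ratings from four subjects (placeholders, not study data).
stammer_declarative = [5, 4, 5, 5]
stammer_question = [3, 2, 4, 3]
say_declarative = [6, 6, 5, 7]
say_question = [5, 6, 5, 6]

print("stammer:", difference_score(stammer_declarative, stammer_question))
print("say:", difference_score(say_declarative, say_question))
```

Because each subject's question rating is subtracted from that same subject's declarative rating, a verb that is simply rated low across the board does not by itself produce a high difference score.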
2.2. Negation test
A central goal of the study was to investigate whether the extent to which
a complement clause is backgrounded correlates with its resistance to
WH-extraction. As a measure of backgrounding of the complement
clause, the negation test was used. The degree to which a clause C is
considered backgrounded varies inversely with the extent to which main
clause negation implies that C itself is negated. To determine scores on
the negation test, we simply asked native speakers to judge the extent to
which main clause negation implied that the subordinate clause was
negated. For example, subjects judged the extent to which sentences like
that in (26) implied (27) on a seven point scale:
(26) She didn't think that he left.
(27) He didn't leave.
Clearly (26) does not strictly entail (27), but it does imply it to some
extent (as the judgments collected confirm). The negation test has the virtue
of being a well-motivated, objective and independent measure. This test is
intended to predict what is in the focus domain generally. For present
purposes, the BCI predicts a correlation between the negation test and
acceptability of the long distance dependencies, at least to the extent to
which negation test judgments differ for particular verbs.
2.3. Similarity judgments
In order to determine whether semantic analogy to the verbs think or say
plays a role in acceptability judgments, we used both human and
automatic calculations of semantic similarity. For the human judgment data,
we created a second questionnaire to investigate verbs' similarity to think
and say. For the automated calculation, we used Latent Semantic Analysis
(Deerwester et al. 1990). The similarity judgments are discussed in
section 6.
3. Predictions
3.1. Predictions of the BCI hypothesis
To recap, the BCI hypothesis predicts that the greater the extent to which
sentential negation implies negation of the complement clause, the lesser
the extent to which the complement clause is backgrounded, and hence
the weaker the island. That is, the higher the negation-test score, the
higher the predicted acceptability of the related WH-question, and the
lower the difference score. Thus the BCI hypothesis predicts a significant
negative correlation between negation-test and difference scores.
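The predicted relationship can be illustrated with a small correlation computation. This is a sketch only: Pearson's r is used here as an assumption, since this section does not name the statistic, and the per-verb scores are invented placeholders chosen to show what a negative correlation looks like, not results from the questionnaire.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical per-verb means (placeholders, not study data).
# Higher negation-test score = complement less backgrounded.
negation_test_scores = [2.0, 3.0, 4.0, 5.0]
# Higher difference score = stronger island.
difference_scores = [2.0, 1.5, 1.0, 0.5]

r = pearson_r(negation_test_scores, difference_scores)
print("r =", r)  # negative under the BCI prediction
```

In this toy data the relationship is perfectly linear, so r is at the negative extreme; real judgment data would of course be noisier.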
3.2. Predictions of subjacency account
A purely syntactic subjacency account would expect all structurally
identical sentences to behave identically, and thus would predict no systematic
differences across semantic verb classes. The proposals to treat
complements of manner of speaking verbs as adjuncts and complements of
factive verbs as part of complex NPs were argued to be problematic.
However, if either of these analyses is correct it would predict that the
constituent in question is an island to extraction. It is well-known that
island status is somewhat variable, but no particular gradience of
judgments is predicted on this account. That is, there is no reason to expect
that grammaticality judgments should correlate in any systematic way
with judgments on the negation test.
3.3. Predictions of a direct analogy account
Another possibility is that acceptability judgments (difference scores)
are based on semantic similarity to a fixed formula involving the verb
think or say (WH do you think/say S?). Dabrowska (this issue) found
that judgments on questions involving the second person subject you
were not significantly different from those with a proper name, so we
might generalize the template to WH DO NP think/say S? where DO
is capitalized to indicate that its form is determined by agreement with
the subject argument. Our stimuli all contain past tense did and not do
or does; this difference from the fixed formula is controlled for across
items. Our stimuli all contain the complementizer that so this difference
from the fixed formula is also controlled for across items. The key
difference among our items is the main verb involved. The direct-analogy
account would thus seem to predict that there should be a negative
correlation between difference scores and scores of similarity of the main verbs
involved to think or say: the more similar a verb is to think or say, the less
difference there should be between the acceptability of a question and the
acceptability of its corresponding declarative.
4. Questionnaire #1: acceptability ratings and negation test
The first questionnaire collected acceptability judgments and judgments
on a negation test. Similarity judgments were collected in a separate
questionnaire (see section 6).
4.1. Method
4.1.1. Participants. Participants who filled out the acceptability/
negation-test questionnaire were 71 naïve undergraduate and graduate
students from Princeton University (mean age 19;6), all of whom were
monolingual English speakers. None of the participants were linguistics
majors and few if any had any background in linguistics. Participants
received $5 for their participation during a "questionnaire day".
4.1.2. Design. For each of twelve verbs, each participant rated the
grammatical acceptability of a WH-question and a declarative statement
both containing a complement clause, and performed a
negation-test-judgment task (see Materials section). The verb (class) was manipulated
as a within-subjects factor with 12 levels for a correlation analysis,
and three levels (factive, manner of speaking, bridge; with four verbs in
each class) for a factorial analysis. Counterbalance-version (six different
versions of the questionnaire were used) was manipulated as a between-
subjects factor.
4.1.3. Materials. Each participant completed a two-part questionnaire;
the first part consisted of judgments of grammatical acceptability
for WH-questions and declarative statements; the second part consisted
of judgments about the extent to which main clause negation implied
negation of the complement clause.
Acceptability judgments of WH-questions featuring WH-extraction
from a clausal complement, as in (A), were collected:
A) What did [NP1] [VERB1] [[that] [NP2] [VERB2]]?
(e.g., What_i did Jess think that Dan liked t_i?)
VERB1 was one of the 12 experimental verbs: realize, remember, notice,
know; whisper, stammer, mumble, mutter; say, decide, think, believe.
NP1 and NP2 were drawn from 12 female and 12 male proper names
respectively, while VERB2 was the past tense of one of 12 transitive verbs (ate,
bought, built, drew, fixed, found, knew, liked, made, needed, opened, pulled,
read, threw, took, wanted).
Six different versions of the questionnaire were created. For each
version, sentences were generated at random using the template in (A).⁸
Acceptability judgments of declarative statements of the form given in
(B) were also collected:⁹
B) [NP1] [VERB1] [that] [[NP2] [VERB2 + APPROPRIATE NP]]
(e.g., Danielle thought that Jason liked the cake)
8. The actual sentence for each of the 12 experimental verbs (though not the structure of
the sentence) differed across all six versions. For example, the experimental verb realize
occurred in the sentence What did Ella realize that Adam threw? in Version 1, What did
Trinity realize that Andy drew? in Version 2, and so on. This was to guard against the
possibility of our findings being distorted by item effects.
9. Again, VERB1 was one of the 12 experimental verbs (this time in past tense form). As
for questions of the form in (A), the declarative statements were generated at random
using this template, and differed across the six versions of the questionnaire. VERB2
was selected from the same list of 12 verbs used for the questions, each paired with an
appropriate NP (ate the chips, bought the groceries, drew the picture, fixed the computer,
found the keys, knew the secret, made the dinner, needed the map, pulled the car, read the
book, threw the ball, wanted the chocolate). NP1 and NP2 were selected from two further
lists of 12 female and 12 male names (i.e., each name never appeared more than once
throughout the study). This was to avoid explicitly highlighting to the subjects the
formal relationship between each WH-extraction question and its equivalent declarative.
For each of the six questionnaire versions, the 24 items in part one of
the questionnaire (12 WH-questions and 12 declarative statements)
were presented in a different pseudo-random order, with the stipulation
that no two verbs from the same verb class (factive, manner of speaking,
bridge) were presented consecutively.
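The ordering constraint can be illustrated with a short sketch. The paper does not say how the pseudo-random orders were produced; the rejection-sampling approach below, the function name `constrained_order`, and the fixed seed are all assumptions made for illustration. Only the verb classes come from the text.

```python
import random

# Verb classes as listed in the Materials section.
VERB_CLASS = {
    "realize": "factive", "remember": "factive",
    "notice": "factive", "know": "factive",
    "whisper": "manner", "stammer": "manner",
    "mumble": "manner", "mutter": "manner",
    "say": "bridge", "decide": "bridge",
    "think": "bridge", "believe": "bridge",
}


def constrained_order(items, class_of, rng, max_tries=1_000_000):
    """Reshuffle until no two neighbouring items share a class.

    Hypothetical helper: plain rejection sampling, not the authors' procedure.
    """
    order = list(items)
    for _ in range(max_tries):
        rng.shuffle(order)
        if all(class_of(a) != class_of(b) for a, b in zip(order, order[1:])):
            return order
    raise RuntimeError("no valid order found; increase max_tries")


# The 24 part-one items: each verb once as a question, once as a declarative.
items = [(verb, kind) for verb in VERB_CLASS
         for kind in ("question", "declarative")]

rng = random.Random(1)  # arbitrary seed, so that a version can be regenerated
order = constrained_order(items, lambda item: VERB_CLASS[item[0]], rng)
```

Rejection sampling is wasteful (most shuffles are discarded) but is easy to verify; calling the function with six different seeds would give six differently ordered versions.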
Negation test judgments. The second part of the questionnaire consisted
of negation test judgments that were designed to indicate the extent to
which sentential negation was interpreted as implying negation of (i.e.,
having scope over) the clausal complement. Each negated complex
sentence (e.g., Maria didn't know that Ian liked the cake) was paired with a
negated simple sentence corresponding to the complement clause of the
complex declarative (e.g., Ian didn't like the cake). For each of the six
different questionnaire versions, the complex + simple negated declarative
sentence pairs were presented in a different pseudo-random order, with
the stipulation that no two pairs involving verbs from the same class
(factive, manner of speaking, bridge) were to be presented consecutively.
These items (see Sentence C below for an example) were created using
an additional set of 12 female names (NP1s) and male names (NP2s),
along with the same lists of VERB1s and VERB2 + APPROPRIATE
NPs as in the declarative statements from Part 1:
C) [NP1] didn't [VERB1] [that] [NP2] [VERB2 + APPR. NP]
[NP2] didn't [VERB2 + APPR. NP]
e.g., Maria didn't know that Ian liked the cake.
Ian didn't like the cake.
Again, the items generated for each verb differed across each of the six
different versions of the questionnaire with regard to the NPs used.
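As a rough illustration of how templates (A) to (C) expand into stimuli, the sketch below fills them with names and verbs taken from the paper's own examples. The function names and the choice to pass inflected forms in by hand are assumptions; the paper does not describe any software used to build the items.

```python
def wh_question(np1, verb1_base, np2, verb2_past):
    """Template (A): What did [NP1] [VERB1] that [NP2] [VERB2]?"""
    return f"What did {np1} {verb1_base} that {np2} {verb2_past}?"


def declarative(np1, verb1_past, np2, vp_past):
    """Template (B): [NP1] [VERB1] that [NP2] [VERB2 + APPROPRIATE NP]."""
    return f"{np1} {verb1_past} that {np2} {vp_past}."


def negation_pair(np1, verb1_base, np2, vp_past, vp_base):
    """Template (C): a negated complex sentence and its negated complement."""
    complex_sentence = f"{np1} didn't {verb1_base} that {np2} {vp_past}."
    simple_sentence = f"{np2} didn't {vp_base}."
    return complex_sentence, simple_sentence


print(wh_question("Ella", "realize", "Adam", "threw"))
print(declarative("Danielle", "thought", "Jason", "liked the cake"))
print(negation_pair("Maria", "know", "Ian", "liked the cake", "like the cake"))
```

Passing the base and past-tense forms explicitly avoids any need for a morphological rule for irregular verbs such as threw or bought.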
4.1.4. Procedure. Subjects completed the questionnaire in written
form, and were given only printed instructions.
For Part 1 (judgments of grammatical acceptability), these instructions
stated:
Please rate each of the sentences below for how acceptable you find
them. 7 = Perfect (completely acceptable), 1 = Terrible (completely
unacceptable).
Please indicate your response by drawing a circle around the appropriate
number as shown in the examples below. Please judge the sentences
only on how acceptable you find them (and not, for example, whether
the event they describe is plausible or implausible, good or bad etc.).
Acceptability is a sliding scale and not a yes/no judgment; people
tend to differ in their judgments of how acceptable sentences are.
For Part 2 (negation test judgments), these instructions stated:
Here, you will be given two statements. Your task is to decide the
extent to which the first statement implies the second statement. Consider
the example sentence pairs in A-C below:
(A) Bob left early. Bob didn't leave early.
The first statement strongly implies that the second statement is NOT
true, so in this case you would circle the 1, as shown above.
(B) Bob left the party early. Bob left the party.
This time, the first statement strongly implies that the second statement
IS true, so this time, you would circle the 7 as shown above.
(C) Bob might leave the party late. Bob left the party early.
This time, the first statement neither implies nor does not imply the
second statement, so here you would circle the 4 as shown above.
We are interested in what average people typically imply with their
everyday statements. Bearing these examples in mind, please rate the
pairs below for the extent to which the first statement implies that
the second statement is true. That is, if you heard a person say
[Statement 1], to what extent would you assume that they are implying
[Statement 2].
5. Results and discussion
Difference scores, raw scores (ratings for questions and declaratives), and
negation-test scores can be found in Table A1 (Appendix).
5.1. Preliminary analysis
A preliminary analysis of variance with mean difference scores
(preference for declarative over WH-extraction question) as the dependent
variable and verb-type (factive, manner of speaking, bridge) and
counterbalance version as within-subjects variables was conducted to investigate
the effect of counterbalance version. This variable was not associated with
any significant main effects or interactions. Subsequent analyses therefore
collapsed across all six different questionnaire versions. The data (raw
scores, difference scores and negation test scores) were also checked for
normality of distribution (for each verb individually, and collapsed into
the three verb-type categories). Although data in some conditions
displayed skew and kurtosis, all subsequent analyses yielded the same
pattern of results with raw and (log) transformed data. We therefore report
results for untransformed data only.
5.2. Analyses of variance
In order to investigate the role of verb classes, we conducted an analysis
of variance for difference scores and negation test scores separately, at the
level of verb classes (factive, manner of speaking, bridge). That is, for
each subject, the difference score for (for example) factive verbs
represents the mean of that subject's difference scores for realize, remember,
notice and know (and the same for negation-test scores).
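The class-level aggregation can be checked against the verb-level means printed in Table A1. The sketch below averages the four factive rows. It assumes that every subject rated every verb, in which case the mean of the verb means equals the mean of the per-subject class means; the variable names are illustrative only.

```python
from statistics import mean

# Verb-level means for the four factive verbs, copied from Table A1:
# verb -> (difference score, negation-test score)
FACTIVE = {
    "realize": (2.17, 1.54),
    "remember": (1.76, 2.20),
    "notice": (1.72, 2.18),
    "know": (2.61, 1.68),
}

# Class-level scores: the mean over the four verbs in the class.
class_difference = mean(d for d, _ in FACTIVE.values())
class_negation = mean(n for _, n in FACTIVE.values())

print(f"factive difference score:    {class_difference:.3f}")
print(f"factive negation-test score: {class_negation:.3f}")
```

The two values come out at about 2.065 and 1.90, in line with the factive class means reported in section 5.2 (2.06 and 1.90), allowing for rounding in the table.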
These analyses were conducted to investigate (a) whether subjects gave
significantly higher ratings of grammatical acceptability (looking at
difference scores) for certain classes of complement-taking verbs than others
and (b) whether participants' negation-test judgments mirrored (i.e.,
predicted) these acceptability ratings. These data are shown in Figure 1 and
Figure 2 respectively (and also in Table A1; see Appendix).
Figure 1. Mean negation test scores. Higher scores indicate less backgrounding of the com-
plement clause
Figure 2. Mean difference (dispreference-for-extraction-question) scores. Higher scores
indicate greater ungrammaticality of the question form (relative to the corresponding
declarative)
As predicted by the BCI hypothesis, the increase in difference scores is
paralleled by a decrease in negation-test scores (recall that the BCI
hypothesis predicts a negative correlation between our negation-test and
difference-score measures).
A one-way within-subjects ANOVA with the independent variable
of verb-type (factive, manner of speaking, bridge) and the dependent
variable of difference score yielded a significant main effect of verb-type
($F_{2,70} = 27.01$, $p < 0.001$, $\eta_p^2 = 0.28$). Post hoc tests revealed that
factive verbs yielded the strongest islands (i.e., highest difference scores;
M = 2.06 places on the 7-point scale; SE = 0.16). Manner-of-speaking
verbs (M = 1.74, SE = 0.15) yielded the next strongest islands, with (as
their name implies) bridge verbs forming the weakest islands (M = 0.97,
SE = 0.13). All comparisons were significant at p < 0.001 with the
exception of that between factive and manner-of-speaking verbs, which was
marginally significant at p = 0.056.
A one-way within-subjects ANOVA with the independent variable of
verb-type (factive, manner of speaking, bridge) and the dependent
variable of negation-test score also yielded a significant main effect of
verb-type ($F_{2,70} = 49.27$, $p < 0.001$, $\eta_p^2 = 0.41$). Factive verbs yielded the
lowest negation test score (i.e., highest backgrounding of the complement
clause; M = 1.90, SE = 0.13), then manner-of-speaking verbs (M = 2.75,
SE = 0.15), then bridge verbs (M = 3.35, SE = 0.14), with all
comparisons significant at p < 0.001.
In summary, the results of these two ANOVAs provide considerable
support for the BCI hypothesis. Factive verbs, which, as a class, are
rated as strongly backgrounding the complement clause (as measured by
the negation test), form the least acceptable WH-extraction questions.
Bridge verbs, which, as a class, are rated as only weakly backgrounding
the complement clause, form the most acceptable WH-extraction
questions, with manner of speaking verbs in between the two.
In order to quantify the negative correlation between the difference
scores and the judgments on the negation test, we additionally performed
a correlational analysis on the data. The correlational analysis is affected
by within-verb-class correlations as well as correlations between verb
classes, so it is a more sensitive measure.
5.3. Correlation analysis
We entered into the correlation analysis the mean negation-test score
and the mean difference score, pooling across all subjects (see Lorch
and Myers 1990). What our analysis lacks in power (having only 12
datapoints) it makes up for in reliability, as each point includes scores
from 71 participants. A scatterplot of this correlation is shown in Figure
3.
This analysis revealed that the mean negation test score was a highly
significant (negative) predictor of mean difference score (r = -.83,
p = 0.001), accounting for over two thirds of the observed variance
($R^2 = 0.69$).¹⁰ The correlation of |.83| is strikingly high, as perfect
correlations (±1) are almost non-existent when distinct measures are used.
Separate measures of the same thing, e.g., mean length of utterance
(MLU) at 28 months, have been found to correlate in the .75 to .80 range
(Bates and Goodman 1997).
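The link between the correlation coefficient and the proportion of variance explained can be made concrete with a small sketch. The helper names below are illustrative. The four data points are the factive rows of Table A1 only, so the correlation they yield is not the paper's 12-verb result and is shown purely to demonstrate the computation.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def variance_explained(r):
    """With a single predictor, R squared is simply r squared."""
    return r * r


# Reported coefficients: main analysis (r = -.83) and footnote 10 (r = 0.58).
print(round(variance_explained(-0.83), 2))
print(round(variance_explained(0.58), 2))

# Factive verbs only (Table A1): negation-test score vs. difference score.
negation = [1.54, 2.20, 2.18, 1.68]
difference = [2.17, 1.76, 1.72, 2.61]
r_factive = pearson_r(negation, difference)
```

Squaring the reported coefficients gives 0.69 and 0.34, the values cited in the text and in footnote 10.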
5.4. Any role for subjacency?
The subjacency account clearly does not predict the pattern of results
found in the present study. In particular, subjacency does not predict
any distinctions based on the semantic class of the verbs involved without
stipulation. Manner of speaking complements and factive complements
would require reanalysis as adjuncts or parts of complex NPs as outlined
above in order to predict their relative ill-formedness vis-à-vis semantically
light verbs. Such analyses would require independent support, of course,
or risk being ad hoc; moreover, even if reanalysis into adjuncts and
complex NPs is granted, it does not predict the strong correlation
found between difference judgments and judgments on the negation test.
Moreover, subjacency does not predict the fact that questions from
complements of the verb think (and say) are judged to be particularly
well-formed.
Figure 3. Correlation between difference scores (dispreference for question scores) and
negation test scores
10. Mean negation test score was also a significant (positive) predictor of mean rating of
acceptability for the extraction question (r = 0.58, p < 0.05), accounting for
approximately one third of the observed variance ($R^2 = 0.34$). Thus although, as we have
argued, difference scores constitute a more appropriate measure of (un)acceptability than
raw scores, our finding of a significant association between backgrounding and the
acceptability of WH-extraction questions does not hinge on using difference scores.
5.5. Think and say questions as stored formula
As Figure 3 illustrates, the present study replicates Dąbrowska's (2004,
this issue) findings that WH-questions with think and say are rated as
somewhat more acceptable than such questions with other verbs (think is
significantly more acceptable than all other verbs besides say and decide
at p < 0.05 by paired t-test; say is more acceptable than all other verbs
except think, believe, decide and stammer). At the same time, the
grammaticality judgments do not provide unambiguous evidence for formulaic
status since the semantic properties of think and say predict that they
should be favored. To demonstrate that speakers judge WH-extraction
questions with think and say to be more acceptable than would be
predicted given their semantics, it would be necessary to show that scores
for these items fall well below the regression line. Generally a difference
of 1.96 standard deviations is accepted as indicating outlier status and
neither say nor think meets this criterion. Although say is 1.62 standard
deviations below the regression line, and thus farther from the regression
line than most of the other verbs, it does not meet this criterion for outlier
status; neither is it the closest to being classified as an outlier (mutter is
judged 1.65 SDs worse than would be predicted given its negation test
score). Moreover, acceptability of the WH-extraction question is
predicted better by the negation test for think than for any other verb (at
only 0.28 SDs below the regression line).
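The outlier criterion can be sketched as follows. The paper does not say exactly how distances from the regression line were standardized; dividing each residual by the residual standard deviation (with n - 2 degrees of freedom) is one common choice and is an assumption here, as is the function name. The data are invented toy values, not the study's scores.

```python
from math import sqrt


def standardized_residuals(xs, ys):
    """Residuals from the least-squares line, in units of the residual SD."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    residual_sd = sqrt(sum(e * e for e in residuals) / (n - 2))
    return [e / residual_sd for e in residuals]


# Invented toy data: 12 points on a falling line, with item 5 pushed
# one unit below it to mimic an unexpectedly acceptable question.
xs = [float(i) for i in range(12)]
ys = [2.0 - 0.1 * x for x in xs]
ys[5] -= 1.0

z = standardized_residuals(xs, ys)
outliers = [i for i, value in enumerate(z) if abs(value) > 1.96]
print(outliers)
```

On this toy set only item 5 exceeds the 1.96 threshold. By the same logic, the values of 1.62 (say) and 0.28 (think) reported above fall short of it.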
The BCI generalization goes some way toward explaining why the
same verbs, think and say, are more likely to appear in long-distance
dependency constructions than other verbs cross-linguistically: their
semantics motivates their discourse properties which in turn motivate their
distribution (recall that e.g., Dutch denken 'think' shows a tendency to
be used frequently in filler-gap constructions (Verhagen 2006); and cf.
discussion of Polish 'say' below). The idea that think is used with special
discourse properties is buttressed by the idea that clauses with the main
verb think are often cited as in some sense monoclausal (Lakoff 1969;
Thompson 2002; Verhagen 2006). An indication that say is likewise often
used to foreground the information in the complement clause comes from
the fact that the verb say has been known to grammaticalize into a
complementizer (Haspelmath 1989).
Thus although we agree that the forms WH DO NP think S? and WH
DO NP say S? are likely to be stored (due to their high frequency), the
present study does not provide evidence that this is necessarily the case,
as their well-formedness may be due to their semantics. To demonstrate
that WH-extraction questions with think and say are necessarily stored
as templates, one might turn to on-line comprehension time measures
which may be more likely to reveal formulaic status than acceptability
judgments (cf. Wonnacott et al. 2008). The following section investigates
the stronger claim that these high-frequency formulas are used as the
basis of direct semantic analogy when other WH-questions with the
same form are at issue.
6. Questionnaire #2: Semantic similarity
As noted at the outset, the direct-analogy proposal claims that the ques-
tions WH DO NP think S? and WH DO NP say S? constitute semantic
prototypes, and that the grammatical acceptability of other such ques-
tions may vary as a function of their semantic similarity to these proto-
types. In order to test this possibility, we investigated whether semantic
similarity of each verb to think (or say) accounted for any of the observed
variance in difference scores.
6.1. Method
6.1.1. Participants. 12 naïve undergraduate and graduate students
(11 from Princeton and one from the University of Liverpool) (mean
age 22.5) filled out semantic similarity questionnaires. None of them had
taken part in the first study and, as before, none of the participants were
linguistics majors, and few if any had any background in linguistics.
Participants received $7 each.
6.1.2. Design. Participants rated the semantic similarity of say and
think (and, as a control, four other verbs used in the main study) to
each of the 11 remaining verbs (see Materials section). The control verbs
allow us to determine whether semantic similarity to think/say in
particular predicts the difference scores from the first study better than semantic
similarity to an arbitrary verb that is not claimed to form part of a
semantic template (e.g., remember). Each subject received one of six
differently ordered versions of the questionnaire in order to guard against
possible order effects.
6.1.3. Materials. In order to give semantic similarity the strongest
chance of predicting the acceptability of question forms, we asked
speakers to judge the similarity of the verbs as they appeared in questions.
We used yes/no questions because judgments on WH-questions would
have been confounded by variation in acceptability, which may have
influenced speakers' similarity ratings in unforeseen ways. Participants filled
out a questionnaire containing items such as the following:
How (dis)similar are the following verbs to think, in the context
A. Did you think that Mary needed the map?

Did you decide that Mary needed the map?    Meanings are very different  1 2 3 4 5 6 7  Meanings are very similar
Did you say that Mary needed the map?       Meanings are very different  1 2 3 4 5 6 7  Meanings are very similar
Did you whisper that Mary needed the map?   Meanings are very different  1 2 3 4 5 6 7  Meanings are very similar
The verbs think, say, remember, notice, stammer and mumble were used
in target questions as think is in the question in A above. For each target
verb, similarity ratings were requested for each of the other 11 verbs (re-
alize, remember, know, whisper, mutter, decide, and believe in addition to
those used in the target sentences).
6.1.4. Procedure. Subjects completed the questionnaire in written
form, and were given only printed instructions:
Your task in this study is to rate verbs for how similar in meaning they
are to another verb (as it is used in a particular sentence). For example,
consider the sentence
John saw the man.
You might decide that, in this context, 'spotted' means something
very similar to 'saw', in which case you would circle the 7 as shown
below:
John spotted the man.   Meanings are very different  1 2 3 4 5 6 7  Meanings are very similar
You might also decide that, in this context, 'kicked' means something
entirely different to 'saw', in which case you would circle the 1,
as shown below:
John kicked the man.    Meanings are very different  1 2 3 4 5 6 7  Meanings are very similar
Finally, you might decide that, in this context, the meaning of
'watched' is not very similar to that of 'saw', but it is not very
different either, in which case you would circle the 5 as shown below:
John watched the man.   Meanings are very different  1 2 3 4 5 6 7  Meanings are very similar
6.2. Results and discussion
The second questionnaire aimed to determine whether there was evidence
for the idea that think or say WH-extraction questions were used as the
basis for an analogy when judging the well-formedness of such questions
with other main verbs. We therefore entered into a correlation analysis,
for each verb (except think itself), the score representing the semantic
similarity of this verb to think (predictor variable) and the mean difference
score from the first study (outcome variable). A separate correlation
analysis was performed for semantic similarity to say, and also to each of
the four control verbs in the same way. The mean semantic-similarity
to think (and say) scores are shown in Table A1 (Appendix).
The semantic-similarity judgment data failed to show a significant
correlation with the judgment data for well-formedness of questions (i.e.,
difference scores). The correlations did not approach significance for
similarity to either think (r = 0.08, p = 0.79) or say (r = 0.17, p = 0.62) (or,
indeed, for any of the four control verbs: remember, notice, stammer
or mumble). Indeed, the small and non-significant correlations for think
and say were in the opposite direction to that predicted by the analogy
account.
Relatively few subjects (12) were involved because preliminary analysis
showed that judgments were highly reliable across participants. Each
individual participant's judgments were significantly correlated with the
mean scores collapsing across all participants at p < 0.01. Note also that
because we used mean scores pooled across participants, the power of
the statistical test is unaffected by sample size. The validity of our analysis
(as well as its power to detect effects) is demonstrated by the systematic
finding of significant correlations between similarity scores. For example,
similarity-to-think scores were significantly (negatively) correlated with
similarity-to-say scores (r = -0.741, p < .02). In fact, judgments of
similarity to all target verbs (say, think, remember, notice, stammer and
mumble) were intercorrelated at p < 0.05 or better, with one exception
(similarity-to-notice and similarity-to-stammer: r = .586, p = 0.075).
Perhaps participants were basing their similarity judgments on some
sort of conscious strategy that was not relevant to the implicit similarity
judgments that might be used on the analogy proposal. To test for this
possibility, we also calculated similarity scores using an on-line automatic
similarity calculator, Latent Semantic Analysis (Deerwester et al. 1990).¹¹
As before, since a higher LSA score indicates greater semantic similarity
to think (or say), and a lower difference score indicates a higher rating of
grammatical acceptability, a negative correlation between LSA and mean
difference score was predicted.
The analysis found that LSA semantic similarity of the verbs to say was
not significantly correlated with mean difference score (r = 0.02, p = 0.96).
Similarity to think was also not a significant predictor of mean difference
score; in fact there was again a small non-significant correlation in the
opposite direction to that predicted (r = 0.11, p = 0.75).
Another potential correlation we considered involved determining, for
each verb, the maximum similarity score to either the verb think or the
verb say. That is, if we assume that two distinct formulas are stored,
WH DO NP say S? and WH DO NP think S?, then judgments may be
determined by a comparison between a target verb and whichever formula
it is semantically closest to. We therefore calculated the correlation
between the difference scores and the array of scores determined by the
following formula:
\[
\max(\text{similarity-of-verb}_i\text{-to-}\mathit{say},\ \text{similarity-of-verb}_i\text{-to-}\mathit{think})
\]
for $i \in \{$realize, remember, notice, know, whisper, stammer, mumble,
mutter, decide, believe$\}$.
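A minimal sketch of this maximum-similarity score is given below, using the judged similarity values for the four factive verbs in Table A1. Reading each fused table entry (e.g., 4.583.00) as a pair of two-decimal numbers is an assumption, as is treating the first number as similarity to think; since the maximum is symmetric, the column order does not affect the result.

```python
# Judged similarity scores read from Table A1 (factive verbs only):
# verb -> (similarity to think, similarity to say); column order assumed.
JUDGED_SIMILARITY = {
    "realize": (4.58, 3.00),
    "remember": (2.67, 2.42),
    "notice": (3.08, 2.25),
    "know": (4.50, 3.42),
}

# For each verb, keep the similarity to whichever formula verb is closer.
max_similarity = {
    verb: max(to_think, to_say)
    for verb, (to_think, to_say) in JUDGED_SIMILARITY.items()
}

for verb, score in max_similarity.items():
    print(f"{verb}: {score:.2f}")
```

The resulting array of scores (one per verb) is what would then be correlated with the difference scores.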
However, neither the judgment scores of similarity nor the LSA
similarity scores correlated significantly with the difference scores by this
measure either (r = 0.17, p = 0.64; r = 0.45, p = 0.19, respectively).
The correlation with LSA scores is of a fair size (r = 0.45), but it is
in the opposite direction to that predicted by the direct analogy account.
Recall that difference scores are smaller to the extent that the question
form is relatively well-formed. If judgments were based on semantic
analogies to the fixed formulas, there should be a negative, not a positive
correlation.
11. See http://lsa.colorado.edu/ (texts denoted 'General Reading up to 1st Year
College').
Thus, whichever way it is analyzed, by similarity to say or to think or
to their combination, and according to either human judgment data or to
the automatic LSA similarity calculation, semantic similarity to say or
think is a poor predictor of judgment data. The direct-semantic analogy
proposal fails to account for the data.
7. Conclusion
In conclusion, the BCI hypothesis (or an information structure account
more generally) has been shown to be an excellent predictor of the island
status of clausal complements. Participants' negation-test judgments were
able to predict over two-thirds of the variance associated with their
dispreference-for-WH-extraction-question scores. As this correlation is
neither expected nor easily explained on a purely syntactic account, this
finding lends strong support to the idea that the discourse function of the
constructions involved plays a critical role in island phenomena.
It is beyond the scope of this paper to provide a thorough comparison
of the BCI and subjacency, but there are many other generalizations that
the BCI accounts for without additional stipulation that subjacency does
not (see Goldberg 2006). The first two predictions were considered in the
present study.
1. Complements of manner-of-speaking verbs and factive verbs are
islands.
2. Grammaticality judgments should correlate with the degree of back-
groundedness, when length and complexity are held constant.
3. Direct replies are sensitive to islands (Morgan 1975).
4. Exclamative ah! is sensitive to islands (James 1972).
5. The active recipient argument of ditransitive, as a secondary topic,
resists being a gap, while the passive recipient argument of a ditransi-
tive, as a primary topic, is free to be a gap.
6. Presentational relative clauses are not always islands.
7. Definite relative clauses are stronger islands than indefinite relative
clauses.
8. Parentheticals are islands.
There is ample evidence that general processing constraints play a role
in island violations (and their amelioration) (cf. e.g., Ellefson and Chris-
tiansen 2000; Gibson 1998; Kluender and Kutas 1993). In particular,
several factors including length, definiteness, complexity, and interference
effects (involving similar referents between filler and gap) have been
shown to play a role. As the present experiment controls for these factors,
we can see that information structure constraints play an independent
role in addition to effects of processing.
Filler-gap constructions involving the complement clause
of the main verb think (and say) were judged to be significantly more
acceptable than those involving most other main verbs, as Dąbrowska
(2004, this issue) has also found. The BCI hypothesis actually predicts
that these verbs should be preferred on semantic grounds (the
acceptability judgments correlate well with the negation test scores), so other
data are needed to confirm that a fixed formula is stored (but, again, we
take the idea that such formulas are stored to be quite plausible).
On the other hand, the possibility that all filler-gap expressions
involving complement clauses are judged by direct analogy to the formulaic
expression with think or say was not supported by the data. Neither the
human similarity judgment scores nor the automated LSA similarity
measure correlated with the acceptability data. This finding argues against a
strong version of item-based grammar in which acceptability judgments
are necessarily determined by one-shot analogies to well-learned
formulaic patterns.
In general, we must be careful when appealing to frequency in the input
data as an explanation for linguistic generalizations. The explanation may
be question-begging unless an account is offered as to why there should be
cross-linguistic generalizations about the nature of the input, as there are,
at least to some extent, in the case of island constraints. We must ask, why
is the input the way it is? An account that appeals to information structure
provides an answer to this question: speakers avoid combining
constructions that would place conflicting constraints on a constituent, such
as requiring it to be at once backgrounded and discourse-prominent.
At the same time, certain cross-linguistic differences do exist. As noted
above (n. 4), Russian allows gaps only in main clauses, whereas Italian
appears to allow long distance dependencies somewhat more freely than
English. Insofar as backgroundedness is a matter of degree, languages
appear to select different cut-off points in how backgrounded a
constituent may be while containing a gap (cf. Erteschik-Shir 1973; Fodor 1991
for similar suggestions). Languages differ as to the location of the
cut-off point, but all languages seem to prefer extraction out of
non-backgrounded constituents.
One further intriguing piece of evidence that suggests that
conventionality (item-based learning) plays a role in addition to the information
structure generalization comes from the fact that there are some
cross-linguistic differences in which verbs within the class of bridge verbs are
most likely to allow extraction from their complement clause. According
to Cichocki (1983), the Polish verb mówić ('say') allows extraction from
its finite complement clause while other verbs, including myśleć ('think'),
do not.¹² At the same time, there are certain intriguing differences in how
Polish myśleć and English think are used that deserve further
exploration.¹³ In any case, say, like think, is a light verb which allows its
complement clause to be foregrounded (as evidenced by the present negation
test scores). Thus, while the relative difference between Polish 'think'
and 'say' is not necessarily predicted by the BCI account, the following
more general prediction is made: we do not expect to find any language in
which a factive verb or a manner-of-speaking verb is more likely to allow
extraction from its complement clause than a light verb of thinking or
saying.
There is a vast and growing amount of evidence that speakers are
aware of detailed statistical patterns in the input. We in no way wish to
deny this. Certainly, speakers' inventories of constructions are learned by
generalizing over instances, and the generalizations are often statistical in
nature. The effects of statistics in the input are also clearly relevant to
language processing (cf. e.g., other papers in this issue).
12. We thank Ewa Dąbrowska and Błażej Gałkowski for confirming the preference for
extraction with Polish 'say' over 'think', although Dąbrowska notes that even extraction
out of 'say' complements is not fully grammatical in Polish (p.c. 20 March 2007 and
2 May 2007, respectively).
13. Gałkowski (p.c. 2 May 2007) observes that Polish myśleć cannot be used as a hedge to
assert the content of the subordinate clause the way English think can be, when there is
main clause negation. He suggests that a more elaborate context in which the thought
processes of the subject argument are at issue is required for the following type of
example.
Nie myślę że (on) zjadł tego hamburgera.
Not think-1sg that he eat-3sg-past this/the hamburger
'I don't think he ate the hamburger'
For example, Gałkowski offers the following context: [My grandpa with Alzheimer's
can't be trusted to eat the food I leave for him. So when I see the plate is empty, I don't
think he ate the hamburger. I'd rather look for it under his bed.] 'So the emphasis is on
thinking' (BG, p.c. 2 May 2007). Insofar as the focus is on the main verb think and
not the complement clause, the information structure account would predict that
extraction from the complement clause should be dispreferred, as it is.
Yet constructions are combined to form actual expressions, and it
seems unlikely that every possible combination of constructions is some-
how stored in advance. The present studies undermine the position that
the felicity of combination is always determined by semantic comparison
with a relatively concrete, fixed formula. They also undermine any purely
structural account such as subjacency. Rather, the current findings support
a view of grammar in which speakers determine which constructions
can be combined, at least in part, on the basis of the information struc-
ture properties of the constructions involved.
Received 21 May 2007 University of Liverpool, UK
Revision received 21 December 2007 Princeton University, USA
Appendix
Table A1 shows, for each verb (class), mean ratings of grammatical
acceptability (and corresponding standard deviations) for questions and
declarative sentences, difference scores, negation-test scores, and human/
latent semantic analysis similarity-to-think scores.
Table A1. Raw data

                    Difference    Question      Declarative   Negation      Judged similarity   LSA similarity
                    score                                     test score    to think   to say   to think   to say
Verb (class)        Mean   SD     Mean   SD     Mean   SD     Mean   SD
Realize             2.17   1.75   4.13   1.76   6.30   1.14   1.54   1.12   4.58       3.00     0.49       0.41
Remember            1.76   2.25   4.27   1.94   6.03   1.36   2.20   1.74   2.67       2.42     0.55       0.50
Notice              1.72   1.70   4.45   1.77   6.17   1.40   2.18   1.72   3.08       2.25     0.38       0.30
Know                2.61   1.85   3.82   1.84   6.42   1.28   1.68   1.27   4.50       3.42     0.70       0.63
Whisper             1.92   1.96   3.99   1.81   5.90   1.57   2.63   1.55   1.58       4.75     0.28       0.33
Stammer             1.30   2.19   3.41   1.75   4.70   1.82   2.75   1.69   1.50       4.75     0.15       0.27
Mumble              1.79   2.10   3.96   1.78   5.75   1.48   2.69   1.65   1.42       4.67     0.11       0.20
Mutter              1.96   2.05   3.90   1.97   5.86   1.53   2.93   1.61   1.83       3.92     0.14       0.25
Say                 0.89   2.06   5.04   2.00   5.93   1.66   2.90   1.48   2.08       N/A      0.62       N/A
Decide              1.07   1.85   4.73   1.82   5.80   1.59   3.11   1.59   4.33       2.42     0.34       0.28
Think               0.62   1.64   5.28   1.64   5.90   1.71   3.94   1.80   N/A        3.42     N/A        0.62
Believe             1.28   1.95   4.86   1.78   6.14   1.40   3.44   2.01   5.83       2.33     0.52       0.51
Factives            2.06   1.33   4.17   1.43   6.23   1.05   1.90   1.07
Manner of speaking  1.74   1.25   3.81   1.30   5.55   1.20   2.75   1.26
Bridge              0.96   1.13   4.98   1.42   5.94   1.21   3.35   1.21
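The difference scores in Table A1 appear to be the declarative mean minus the question mean. That relation is inferred here from the table values and is not stated in this appendix, so the sketch below treats it as an assumption. It recomputes the score for five rows copied from the table; the function and variable names are illustrative only.

```python
# (question mean, declarative mean, reported difference score), copied from Table A1.
TABLE_A1 = {
    "Realize": (4.13, 6.30, 2.17),
    "Know": (3.82, 6.42, 2.61),
    "Whisper": (3.99, 5.90, 1.92),
    "Say": (5.04, 5.93, 0.89),
    "Think": (5.28, 5.90, 0.62),
}


def difference_score(question_mean, declarative_mean):
    """Assumed definition: declarative rating minus question rating."""
    return round(declarative_mean - question_mean, 2)


def largest_mismatch(table):
    """Largest gap between recomputed and reported difference scores."""
    return max(
        abs(difference_score(q, d) - reported)
        for q, d, reported in table.values()
    )


if __name__ == "__main__":
    for verb, (q, d, reported) in TABLE_A1.items():
        print(verb, difference_score(q, d), reported)
```

For these rows the recomputed and reported values agree to within about 0.01, which would be expected if the published differences were computed before the means were rounded.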
References
Ambridge, Ben and Adele E. Goldberg
forthc. Is whether a complementizer or a WH-word?
Ambridge, Ben, Caroline F. Rowland, Anna L. Theakston, and Michael Tomasello
2006 Comparing different accounts of inversion errors in children's non-subject
wh-questions: What experimental data can tell us? Journal of Child
Language 33(3), 519–557.
Ambridge, Ben, Caroline F. Rowland, and Julian Pine
2008a Is Structure Dependence an innate constraint? New experimental evidence
from children's complex-question production. Cognitive Science 32(1),
222–255.
Ambridge, Ben, Julian M. Pine, Caroline F. Rowland, and Chris R. Young
2008b The effect of verb semantic class and verb frequency (entrenchment) on
children's and adults' graded judgments of argument-structure
overgeneralization errors. Cognition 106(1), 87–129.
Baltin, Mark R.
1982 A landing site for movement rules. Linguistic Inquiry 13, 1–38.
Bates, Elizabeth and Judith Goodman
1997 On the inseparability of grammar and lexicon: evidence from acquisition,
aphasia and real-time processing. Language and Cognitive Processes 12,
507–84.
Bybee, Joan L.
1985 Morphology. John Benjamins.
Bybee, Joan L.
1995 Regular morphology and the lexicon. Language and Cognitive Processes 10,
425–55.
Bybee, Joan L.
2007 Review of Goldberg's Constructions at Work: the nature of generalization in
language. Journal of Child Language.
Chomsky, Noam
1973 Conditions on Transformations. In S. R. Anderson and P. Kiparsky (eds.),
A Festschrift for Morris Halle. New York: Holt, Rinehart and Winston.
Cichocki, Wladyslaw
1983 Multiple Wh-questions in Polish: a two-comp analysis. Toronto Working
Papers in Linguistics 4, 53–71.
Dabrowska, Ewa
2004 Language, Mind and Brain. Georgetown: Georgetown University Press.
Deane, Paul
1991 Limits to attention: A cognitive theory of island phenomena. Cognitive
Linguistics 2, 1–63.
Deerwester, Scott, Susan T. Dumais, George W. Furnas, Thomas K. Landauer, and
Richard Harshman
1990 Indexing by latent semantic analysis. Journal of the American Society for
Information Science 41, 391–407.
Ellefson, Michelle R. and Morten Christiansen
2000 Subjacency Constraints without Universal Grammar: Evidence from Artificial
Language Learning and Connectionist Modeling. Proceedings of the
22nd Annual Conference of the Cognitive Science Society. Mahwah, NJ:
Lawrence Erlbaum Associates, 645–650.
Erteschik-Shir, Nomi
1973 On the nature of island constraints. Indiana University Linguistics Club.
Erteschik-Shir, Nomi
1979 Discourse constraints on dative movement. In S. Laberge and G. Sankoff
(eds.), Syntax and Semantics, Vol. 12: Speech Acts. New York: Academic
Press, 441–467.
Erteschik-Shir, Nomi
1998 The dynamics of focus structure. Cambridge: Cambridge University Press.
Erteschik-Shir, Nomi and Shalom Lappin
1979 Dominance and the functional explanation of island phenomena. Theoretical
Linguistics 6, 41–85.
Featherston, Sam
2005 Magnitude estimation and what it can do for your syntax: some
wh-constraints in German. Lingua 115, 1525–1550.
Fodor, Janet D.
1991 Sentence processing and the mental grammar. In P. Sells, S. M. Shieber, and
T. Wasow (eds.), Foundational Issues in Natural Language. Cambridge,
MA: MIT Press.
Freidin, Robert and Carlos Quicoli
1989 Zero-stimulation for parameter setting. Behavioral and Brain Sciences 12,
338–39.
Gibson, Edward
1998 Linguistic complexity: locality of syntactic dependencies. Cognition 68, 1–76.
Goldberg, Adele E.
2006 Constructions at work: The nature of generalization in language. Oxford:
Oxford University Press.
Haspelmath, Martin
1989 From purposive to infinitive: a universal path of grammaticalization. Folia
Linguistica Historica 12, 287–310.
Jacoby, Larry L., Charlotte M. Kelley, and Jane Dywan
1989 Memory attributions. Varieties of memory and consciousness: essays in
honour of Endel Tulving, ed. by H. L. Roediger and F. I. M. Craik. Hillsdale,
NJ: Erlbaum, 391–422.
James, Deborah
1972 Some aspects of the syntax and semantics of interjections. Paper presented at
the 8th Regional Meeting of the Chicago Linguistic Society, Chicago.
Keller, Frank
2000 Gradience in grammar: experimental and computational aspects of degrees
of grammaticality. Unpublished PhD Thesis, University of Edinburgh.
Kempen, Gerard, and Karin Harbusch
2003 An artificial opposition between grammaticality and frequency: Comment
on Bornkessel, Schlesewsky and Friederici (2002). Cognition 90, 205–210.
Kempen, Gerard, and Karin Harbusch
2004 The relationship between grammaticality ratings and corpus frequencies: A
case study into word order variability in the midfield of German clauses. In
S. Kepser and M. Reis (eds.), Linguistic Evidence: Empirical, Theoretical,
and Computational Perspectives. Berlin: Mouton de Gruyter.
Kiparsky, Paul and Carol Kiparsky
1971 Fact. Progress in Linguistics, M. Bierwisch and K. Heidolph (eds.). The
Hague: Mouton.
Kluender, Robert and Marta Kutas
1993 Subjacency as a Processing Phenomenon. Language and Cognitive Processes
8, 573–633.
Lakoff, Robin
1969 A syntactic argument for negative transportation. Proceedings of the
Chicago Linguistic Society 15, 140–49.
Lambrecht, Knud
1994 Information Structure and Sentence Form. Cambridge: Cambridge University
Press.
Langacker, Ronald W.
1988 A usage-based model. In B. Rudzka-Ostyn (ed.), Topics in Cognitive
Linguistics. Amsterdam: Benjamins, 127–161.
Lorch, Richard F. J., and Jerome L. Myers
1990 Regression analyses of repeated measures data in cognitive research. Journal of
Experimental Psychology: Learning, Memory and Cognition 16(1), 149–157.
Morgan, Jerry L.
1975 Some interactions of syntax and pragmatics. In P. Cole and J. L. Morgan
(eds.), Syntax and Semantics, Vol. 3: Speech Acts. New York: Academic Press.
Murphy, Gregory L.
2004 The Big Book of Concepts. Cambridge, MA: MIT Press.
Newmeyer, Frederick J.
1991 Functional explanation in linguistics and the origins of language. Language
and Communication 11(1–2), 3–28.
Pinker, Steven, and Paul Bloom
1990 Natural language and natural selection. Behavioral and Brain Sciences 13(4),
707–784.
Poulsen, Mads
2006 The time course of extraction constraints in Dutch. Manuscript.
Rizzi, Luigi
1982 Issues in Italian Syntax. Dordrecht: Foris.
Ross, John R.
1967 Constraints on variables in syntax. Ph.D. dissertation, MIT. Published as
Infinite Syntax!. 1986. Norwood, NJ: Ablex Publishing Corporation.
Sag, Ivan A. and Janet D. Fodor
1994 Extraction without Traces. WCCFL 13.
Schuetze, Carson T.
1996 The Empirical Base of Linguistics: Grammaticality Judgments and Linguistic
Methodology. Chicago: University of Chicago Press.
Takami, Ken-ichi
1989 Prepositional stranding: arguments against syntactic analyses and an
alternative functional explanation. Lingua 76, 299–335.
Thompson, Sandra A.
2002 Object complements and conversation: towards a realistic account. Studies
in Language 26, 125–64.
Tomasello, Michael
2003 Constructing a Language: A Usage-Based Theory of Language Acquisition.
Cambridge, MA: Harvard University Press.
Van Valin, Robert
1993 Synopsis of RRG. Advances in Role and Reference Grammar, ed. by Robert
Van Valin. Benjamins.
Van Valin, Robert
1998 The acquisition of wh-questions and the mechanisms of language acquisition.
In M. Tomasello (ed.), The new psychology of language: Cognitive and
functional approaches to language structure. Hillsdale, NJ: Lawrence
Erlbaum Associates.
Van Valin, Robert
1995 Toward a functionalist account of so-called extraction constraints. In B.
Devriendt et al. (eds.), Complex Structures: A Functionalist Perspective.
Berlin: Mouton de Gruyter, 29–60.
Van Valin, Robert, and Randy J. LaPolla
1997 Syntax: structure, meaning and function. Cambridge: Cambridge University
Press.
Verhagen, Arie
2006 On subjectivity and long distance Wh-movement. Subjectification: Various
paths to subjectivity, ed. by A. Athanasiadou, Costas Canakis and Bert
Cornillie. New York: Mouton de Gruyter, 323–46.
Wonnacott, Elizabeth, Elissa L. Newport, and Michael K. Tanenhaus
2008 Acquiring and processing verb argument structure: Distributional learning
in a miniature language. Cognitive Psychology 56(3), 165–209.
Questions with long-distance dependencies:
A usage-based perspective
EWA DABROWSKA*
Abstract
Attested questions with long-distance dependencies (e.g., What do you
think you're doing?) tend to be quite stereotypical: the matrix clause
usually consists of a WH word, the auxiliary do or did, the pronoun you,
and the verb think or say, with no other elements; and they virtually never
contain more than one subordinate clause. This has led some researchers
in the usage-based framework (Dabrowska 2004; Verhagen 2005) to
hypothesise that speakers' knowledge about such constructions is best
explained in terms of relatively specific, low-level templates rather than
general rules that apply across the board. The research reported here was
designed to test this hypothesis and alternative hypotheses derived from
rule-based theories.
Keywords: Usage-based model; long-distance dependencies; unbounded
dependencies; acceptability judgment experiment; prototype
effects.
Cognitive Linguistics 19–3 (2008), 391–425
DOI 10.1515/COGL.2008.015
0936–5907/08/0019–0391
© Walter de Gruyter
* This study was supported by the Arts and Humanities Research Council (grant number
AH/F001924/1). I would like to thank Adele Goldberg, Marcin Szczerbiński and two
anonymous referees for their comments on an earlier draft of this paper, and Mike
Pincombe for help with data collection. Address for correspondence: School of English
Literature, Language and Linguistics, University of Sheffield, Sheffield S10 2TN, United
Kingdom. E-mail: e.dabrowska@shef.ac.uk.
1. Introduction
Questions and other constructions with long-distance dependencies
(henceforth LDDs) have played an important role in the development of
syntactic theory, especially in the generative framework. Such structures
are interesting because they exhibit a dependency between a filler in the
main clause and a gap in a subordinate clause, as in example (1). The
dependencies are frequently referred to as unbounded, as, in principle,
there can be any number of clauses intervening between the filler and the
gap, as illustrated by (1d) and (1e).
(1) a. What will John claim that you did __? (Culicover 1997: 184)
b. Which problem does John know (that) Mary solved __?
(Ouhalla 1994: 72)
c. Whom do you believe that Lord Emsworth will invite __?
(Haegeman 1991: 342)
d. Who did Mary hope that Tom would tell Bill that he should
visit __? (Chomsky 1977: 74)
e. Which problem do you think (that) Jane believes (that) Bill
claims (that) Mary solved __? (Ouhalla 1994: 71)
It is noteworthy, however, that attested questions with LDDs are very
different from these constructed examples, as illustrated by the following
examples from the spoken part of the British National Corpus:
(2) a. And how do you think you'd spell classical like do you like
classical music? (FMG 725)
b. Why'd why do you think why do you think it is that there
wasn't that motivation? (FY8 201)
c. What is it and why do you think it looks like that? (JJS 882)
d. What do you think Brian'll say? (KE1 256)
e. What did they say it meant? (KD0 622)
These real-life LDD questions are much more stereotypical than the
sentences in (1). The textbook examples contain a variety of matrix subjects
and verbs and different auxiliaries; most of them also contain an
overt complementizer and two involve a dependency over more than one
intervening clause. In the corpus sentences, in contrast, the matrix subject
is usually you, the matrix verb think or say, and the auxiliary do; there are
no other elements in the matrix clause, no complementizer, and only one
complement clause. In fact, almost 70 percent of the LDD questions with
finite complement clauses in the spoken part of the BNC have the form
WH do you think S-GAP? or WH did NP say S-GAP?, where S-GAP is
a subordinate clause with a missing constituent. Most of the remaining
questions are minimal variations on these patterns: that is to say, they
contain a different matrix subject or a different verb or a different auxiliary
or an additional element like an adverbial or complementizer. Only
6 percent depart from the prototype in more than one respect (Dabrowska
in press a; see also Dabrowska 2004 and Verhagen 2005).
1
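As a rough illustration of what "having the form WH do you think S-GAP? or WH did NP say S-GAP?" could mean operationally, the sketch below matches question strings against two surface patterns. It is only a toy approximation: the regular expressions, the list of WH words, the one-to-three-word limit on the NP, and the function name are all assumptions made for this example, and it is not the procedure used for the corpus counts reported above.

```python
import re

# Hypothetical list of WH expressions; "which" is allowed to take one noun.
WH = r"(?:what|where|when|who|whom|why|how|which\s+\w+)"

# Surface approximations of the two templates named in the text.
THINK_TEMPLATE = re.compile(rf"^{WH}\s+do\s+you\s+think\b", re.IGNORECASE)
# The NP slot is approximated as one to three words (an assumption).
SAY_TEMPLATE = re.compile(rf"^{WH}\s+did\s+(?:\w+\s+){{1,3}}?say\b", re.IGNORECASE)


def match_template(question):
    """Return 'think', 'say', or None for a question string."""
    text = question.strip()
    if THINK_TEMPLATE.search(text):
        return "think"
    if SAY_TEMPLATE.search(text):
        return "say"
    return None


if __name__ == "__main__":
    for q in ["What do you think Brian'll say?",
              "What did they say it meant?",
              "What will John claim that you did?"]:
        print(q, "->", match_template(q))
```

Examples (2d) and (2e) fall under the two patterns, while the constructed example (1a) matches neither.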
This has led some researchers in the usage-based framework (Dabrowska
2004, Verhagen 2005) to hypothesise that speakers' knowledge
about such constructions is best explained in terms of relatively specific,
low-level templates (WH do you think S-GAP? and WH did you say
S-GAP?) rather than in terms of abstract rules and principles of the type
proposed by formal linguists (see, for example, Cheng and Corver 2006;
Chomsky 1977; Levine and Hukari 2006).
2
Declaratives with verb complement
clauses, in contrast, are much more varied: the main clauses
take different subjects, auxiliaries and verbs, and often contain additional
elements (Dabrowska in press a; Verhagen 2005). As a result,
language learners develop more general representations for this construction
in addition to lexically specific templates for frequent combinations
such as I think S, I don't think S, I mean S.
However, conclusions about speakers' mental representations based on
the fact that a particular structure is rare or not attested at all in a corpus
are problematic, since this could be merely a result of sampling. Even
with a large and balanced corpus, sentences which are perfectly compatible
with speakers' mental grammars may be unattested simply because
they are pragmatically implausible. In short, while restricted patterns of
usage are suggestive, they do not license strong conclusions about mental
representation: the observational data need to be corroborated by experi-
mental studies.
1. I am using the term 'prototype' as it is usually used in linguistics: to refer to an
idealised typical instance. Many natural categories are centred around a prototype, in the
sense that other instances are assimilated in the category on the basis of their perceived
similarity to it (cf. Lakoff 1987; Langacker 1987). The properties of prototypical instances
are thus shared by most other members of the category.
2. The term 'construction' is used in this paper to refer to any grammatical pattern found
in any language: thus expressions such as 'constructions with long-distance dependencies'
should not be taken as implying that such patterns necessarily have any mental
reality. I will use the terms 'template' and 'schema' to refer to speakers' mental
representations of these patterns.
According to the usage-based proposals put forward by Dabrowska
and Verhagen, prototypical LDD questions are produced simply by inserting
new material into the appropriate slots in a pre-existing template.
Non-prototypical LDD questions such as What does she hope she'll get?
require additional work, since the speaker has to adapt the template: in
this case, substitute she for you and hope for think, and modify the auxiliary
so that it agrees with the subject. To be able to do this, the speaker
would have to construct a proportional analogy such as the one in (3),
where semantic structure is represented in CAPITALS and phonological
structure in italics:
3
(3) YOU THINK SHE WILL GET SOMETHING: WHAT?
is to What do you think shell get?
as SHE HOPES SHE WILL GET SOMETHING: WHAT?
is to ???
To solve this problem, the speaker needs to establish correspondences
between the relevant parts of semantic and phonological structure:
YOU corresponds to SHE, therefore the target expression will have the phonological
form corresponding to SHE, namely she, in place of the phonologi-
cal form corresponding to YOU, namely you, and so on. This requires
knowledge about linguistic categories (the speaker must know what can
be substituted for what), the internal structure of the source expression,
i.e., YOU THINK SHE WILL GET SOMETHING: WHAT?/What
do you think she'll get? (so that he/she knows where to substitute it), and
about agreement (the auxiliary needs to agree with the new subject). A
listener or reader, of course, would have to use the phonological forms
of source and target, plus an understanding of the relationships between
their constituent parts, to construct a semantic representation of the
target.
4
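The substitution steps described above can be sketched in code. This is a toy illustration under strong simplifying assumptions, not a model proposed in the paper: the slot labels, the tiny lexicon `FORMS`, the agreement table `DO_FORMS`, and the function `solve_analogy` are all hypothetical names introduced here. The source expression is stored with its internal structure (which word fills which slot), the target meaning says which semantic units fill the subject and verb slots, and the auxiliary is chosen to agree with the new subject.

```python
# Hypothetical toy lexicon: semantic unit -> phonological form.
FORMS = {"YOU": "you", "SHE": "she", "THINK": "think", "HOPE": "hope"}

# Hypothetical agreement table for the auxiliary "do".
DO_FORMS = {"YOU": "do", "SHE": "does"}

# Source expression with its internal structure: (slot, form) pairs.
SOURCE = [
    ("wh", "What"),
    ("aux", "do"),
    ("subj", "you"),
    ("verb", "think"),
    ("complement", "she'll get?"),
]


def solve_analogy(source, target_meaning):
    """Build the target form by substituting into the source expression."""
    words = []
    for slot, form in source:
        if slot == "aux":
            # Agreement: the auxiliary depends on the new subject.
            form = DO_FORMS[target_meaning["subj"]]
        elif slot in target_meaning:
            # Correspondence: use the form paired with the target meaning.
            form = FORMS[target_meaning[slot]]
        words.append(form)
    return " ".join(words)


if __name__ == "__main__":
    print(solve_analogy(SOURCE, {"subj": "SHE", "verb": "HOPE"}))
```

With the target meaning SHE HOPES, the sketch yields the non-prototypical question discussed in the text; with YOU THINK it simply returns the source expression.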
If speakers don't have a ready-made template for non-prototypical
questions and have to extrapolate from existing knowledge in the manner
described above, such sentences will require extra effort to produce and
understand (which should translate into longer processing times) and
should be judged to be less acceptable than more prototypical variants.
Both of these predictions can be tested experimentally; the present paper
is devoted to testing the second one.
3. For ease of exposition, I have assumed that the source expression for the analogy is an
actual expression rather than the template; but this need not be the case.
4. Reliance on analogy and schema use are not as different as it might at first seem. As
Langacker points out, applying analogy requires the speaker to apprehend an abstract
commonality between the source and target forms; and the abstract commonality, of
course, is what would be captured by the schema (Langacker 2000: 60; see also
Dabrowska 2008). Furthermore, repeated use of analogy will result in the abstract
commonality being entrenched until it becomes a linguistic unit in its own right. The
critical difference between the two processes, then, is whether the relevant knowledge is
retrieved from memory or created on the fly.
An acceptability judgment experiment, of course, can only provide indirect
evidence about sentence processing, and hence is less clearly relevant
to the subject of this special issue than, for example, a study which
compared reading times or the time taken to respond to a sentence. On
the plus side, an acceptability judgment study is easier to conduct (since it
doesn't require any special apparatus) and hence it is a sensible first step
when investigating a syntactic construction. Furthermore, many linguists
would argue that it provides more useful evidence about the nature of
speakers' underlying linguistic representations, or competence (cf.
Wasow and Arnold 2005), and hence, perhaps, will be less likely to be
dismissed as mere performance.
Of course, judging the acceptability of a sentence is a type of performance,
and, like other types of performance, can be influenced by a variety
of factors: plausibility, complexity, fatigue, mode of presentation, and
so on. This raises an obvious problem for an analyst trying to interpret
the results. The solution to the problem, however, is not (as some linguists
have suggested) to give up attempts to study speakers' judgments
experimentally, but to control as many confounding factors as possible,
and be cautious in interpreting the results.
2. Experimental design
Dabrowska (2004) reports on a preliminary study showing that speakers
rate prototypical LDD questions such as Where do you think they sent the
documents? and What did you say the burglars stole? as more acceptable
than questions which had lexical subjects, a main verb other than think
or say, and an auxiliary other than do (e.g., Where will the customers
remember they sent the documents? What have the police revealed the
burglars stole?). There was no corresponding effect for declaratives. It is
not clear, however, whether the difference in speakers' judgments was
due to the choice of subject, verb, or auxiliary, or some combination
of these factors. The experiment described in this paper was designed to
investigate how each of these three factors individually contributes to
speakers' judgments. It will also examine two additional grammatical factors:
the presence or absence of a complementizer and the number of
clauses intervening between the WH word and the gap, as well as the
effects of plausibility and syntactic complexity.
In the experiment, native speakers of English completed a written ques-
tionnaire in which they were asked to rate the acceptability of LDD ques-
tions of varying degrees of prototypicality. There were seven experimental
conditions:
1. Prototypical LDD questions (WH Prototypical): These had the form
WH do you think S-GAP? or WH did you say S-GAP?;
2. LDD questions with lexical matrix subjects (WH Subject);
3. LDD questions with auxiliaries other than do (WH Auxiliary);
4. LDD questions with matrix verbs other than think or say (WH Verb);
5. LDD questions with overt complementizers (WH Complementizer);
6. LDD questions with very long dependencies, i.e., with an addi-
tional complement clause (WH Long);
7. Unprototypical LDD questions (WH Unprototypical): These had a
lexical subject, an auxiliary other than do, and a main verb other than
think or say, an overt complementizer, and an additional complement
clause.
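One way to see how the seven conditions relate to one another is to describe each question by the respects in which its matrix clause departs from the prototype. The sketch below does this for a hand-coded feature description; the dictionary keys, the function names, and the use of lemmas (so that does counts as do) are assumptions made for illustration, not part of the experimental materials.

```python
PROTOTYPE_VERBS = {"think", "say"}

# Set of departures from the prototype -> condition label from the list above.
CONDITION_BY_DEVIATIONS = {
    frozenset(): "WH Prototypical",
    frozenset({"subject"}): "WH Subject",
    frozenset({"auxiliary"}): "WH Auxiliary",
    frozenset({"verb"}): "WH Verb",
    frozenset({"complementizer"}): "WH Complementizer",
    frozenset({"extra_clause"}): "WH Long",
    frozenset({"subject", "auxiliary", "verb",
               "complementizer", "extra_clause"}): "WH Unprototypical",
}


def deviations(item):
    """Return the respects in which a question departs from the prototype."""
    changed = set()
    if item["subject"] != "you":
        changed.add("subject")
    if item["auxiliary"] != "do":  # lemma of the auxiliary
        changed.add("auxiliary")
    if item["verb"] not in PROTOTYPE_VERBS:
        changed.add("verb")
    if item["complementizer"]:
        changed.add("complementizer")
    if item["extra_clause"]:
        changed.add("extra_clause")
    return changed


def condition(item):
    """Return the condition label, or None if the design has no such cell."""
    return CONDITION_BY_DEVIATIONS.get(frozenset(deviations(item)))


if __name__ == "__main__":
    claire = {"subject": "Claire", "auxiliary": "do", "verb": "think",
              "complementizer": False, "extra_clause": False}
    print(condition(claire))
```

Conditions 2 to 6 each change exactly one feature, and condition 7 changes all five; combinations of two to four changes do not correspond to any condition in this design, so the lookup returns None for them.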
The questionnaire also contained two types of control sentences.
Grammatical controls were declarative versions of the LDD questions
constructed by replacing the WH word with a noun phrase or a prepositional
phrase (and adding a conjunction: see below). Ungrammatical
controls involved four types of structures: that-trace violations (*That),
sentences involving a dependency reaching into a complex NP (*ComplexNP),
negative sentences without do-support (*Not), and negative sentences
with double tense marking (*DoubleTn). Examples of each type of
sentence are provided in Table 1; a complete list of all sentences used in
one version of the questionnaire is given in the Appendix.
3. Predictions
3.1. General rules
If speakers have the competence attributed to them by generative linguists,
and if their grammaticality judgments are a more or less direct reflection
of this competence, then we could expect the following prediction
to hold:
Prediction 1: Grammatical sentences should receive ratings close to 5;
ungrammatical sentences should be rated about 1.
3.2. General rules + processing and pragmatics
Prediction 1 is unrealistic, since it is well known that speakers' judgments
are influenced by factors such as complexity and plausibility, just like any
other kind of performance. In particular, sentences involving filler-gap
dependencies are computationally more demanding since the filler must
be held in memory while the rest of the sentence is being processed (cf.
Frazier and Clifton 1989; Hawkins 1999; Kluender and Kutas 1993); and
sentences involving a dependency over more than one clause are particularly
difficult. Furthermore, since it is rather odd to assert what the addressee
thinks or says (and perfectly natural to ask about these things),
we might expect an interaction between construction type and the lexical
properties of the matrix subject: specifically, speakers may assign low ratings
to declaratives with second person subjects, but accept the corresponding
interrogatives.
Table 1. Examples of sentences used in the experiment

Condition               Example
1. WH Prototypical      What do you think the witness will say if they don't intervene?
2. WH Subject           What does Claire think the witness will say if they don't intervene?
3. WH Auxiliary         What would you think the witness will say if they don't intervene?
4. WH Verb              What do you believe the witness will say if they don't intervene?
5. WH Complementizer    What do you think that the witness will say if they don't intervene?
6. WH Long              What do you think Jo believes he said at the court hearing?
7. WH Unprototypical    What would Claire believe that Jo thinks he said at the court hearing?

Grammatical controls
1. DE Prototypical      But you think the witness will say something if they don't intervene.
2. DE Subject           And Claire thinks the witness will say something if they don't intervene.
3. DE Auxiliary         You would think the witness will say something if they don't intervene.
4. DE Verb              So you believe the witness will say something if they don't intervene.
5. DE Complementizer    So you think that the witness will say something if they don't intervene.
6. DE Long              So you think Jo believes he said something at the court hearing.
7. DE Unprototypical    Claire would believe that Jo thinks he said something at the court hearing.

Ungrammatical controls
*That                   *What did you say that works even better?
*Complex NP             *What did Claire make the claim that she read in a book?
*Not                    *Her husband not claimed they asked where we were going.
*DoubleTn               *His cousin doesn't thinks we lied because we were afraid.

Taking processing demands and pragmatics into consideration, one
might make the following predictions:
Prediction 2: WH questions (WH) will receive lower ratings than the
corresponding declaratives (DE):
DE Prototypical > WH Prototypical
DE Subject > WH Subject
DE Verb > WH Verb
DE Auxiliary > WH Auxiliary
DE Complementizer > WH Complementizer
DE Long > WH Long
DE Unprototypical > WH Unprototypical
Prediction 3: WH questions involving very long dependencies (WH
Long, WH Unprototypical) should receive lower ratings
than WH questions involving dependencies over just one
clause boundary (all other WH conditions). The corre-
sponding declaratives should be rated equally acceptable,
since they do not involve long distance dependencies.
WH Prototypical, WH Subject, WH Verb, WH Auxiliary,
WH Complementizer > WH Long, WH Unprototypical
Prediction 4: Declaratives with lexical subjects and complement-taking
verbs will receive higher ratings than declaratives with second
person subjects. There will be no corresponding effect
for interrogatives.
DE Subject > DE Prototypical
3.3. Usage-based models
According to the usage-based hypothesis, speakers store lexically specific
templates corresponding to frequent combinations they have encountered
in their experience such as WH do you think S-GAP? and WH did you say
S-GAP? If this hypothesis is correct, prototypical LDD questions (i.e.,
those that fit one of these templates) should be rated as more acceptable
than non-prototypical questions. There should be no corresponding differences
in the acceptability of declaratives, or the relevant differences
should be smaller.
It was also hypothesised that speakers construct non-prototypical LDD
questions on analogy with prototypical questions, by adding elements or
substituting a different item in a particular position in a template. Since
some elements may be more easily substitutable than others, different
substitutions may result in different degrees of acceptability. We know
from language acquisition research that children learn to substitute new
items into nominal slots relatively early; the ability to substitute verbs
into slots emerges later, and auxiliary substitutions are later still (Dabrowska
and Lieven 2005; Lieven et al. 2003; Tomasello et al. 1997). The
most likely explanation for this finding is that nominals are autonomous
units which can be defined independently of the constructions they occur
in, while verbs are non-autonomous in the sense that their descriptions
must make reference to the entities participating in the relationship they
designate: in other words, these entities are part of the verb's profile (cf.
Langacker 1987: 298). Likewise, tensed auxiliaries, as grounding predications
(cf. Langacker 1991: 193), conceptually presuppose the events
that they ground, which are designated by the verb plus its arguments. It follows
that nominals should be more easily substitutable than verbs, which in
turn should be more easily substitutable than auxiliaries.
Prediction 5: Prototypical LDD questions will receive higher ratings
than questions with lexical subjects, which will be more
acceptable than questions with matrix verbs other than
think or say; questions with auxiliaries other than do will
be judged least acceptable.
WH Prototypical > WH Subject > WH Verb > WH
Auxiliary
Finally, LDD questions containing overt complementizers and very
long dependencies are also less prototypical, in that both contain an ex-
tra element (the complementizer or an additional clause). It is not clear
whether inserting an extra element is more or less difficult than substitution;
however, we can make the following predictions:
Prediction 6: LDD questions without overt complementizers will receive
higher ratings than questions with that:
WH Prototypical > WH Complementizer
Prediction 7: LDD questions with very long dependencies (i.e., depen-
dencies reaching over more than one intervening clause)
will be judged less acceptable than questions with shorter
dependencies, but more acceptable than unprototypical
questions (since the latter contain a very long dependency
as well as lexical substitutions).
WH Prototypical > WH Long > WH Unprototypical
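Predictions 5 to 7 are all ordering claims over condition means, so each can be written as a chain and checked mechanically once mean ratings are available. The sketch below shows one way to do that. The function name and data layout are hypothetical, and the numbers in `PLACEHOLDER_MEANS` are invented solely to exercise the code; they are not results from the experiment.

```python
# Each prediction is a chain: every condition should be rated strictly
# higher than the one that follows it.
PREDICTIONS = {
    5: ["WH Prototypical", "WH Subject", "WH Verb", "WH Auxiliary"],
    6: ["WH Prototypical", "WH Complementizer"],
    7: ["WH Prototypical", "WH Long", "WH Unprototypical"],
}


def ordering_holds(chain, mean_ratings):
    """True if the mean ratings strictly decrease along the chain."""
    values = [mean_ratings[name] for name in chain]
    return all(a > b for a, b in zip(values, values[1:]))


# Placeholder values, invented for illustration only. NOT experimental data.
PLACEHOLDER_MEANS = {
    "WH Prototypical": 4.5,
    "WH Subject": 4.0,
    "WH Verb": 3.5,
    "WH Auxiliary": 3.8,
    "WH Complementizer": 4.2,
    "WH Long": 3.0,
    "WH Unprototypical": 2.0,
}

if __name__ == "__main__":
    for number, chain in PREDICTIONS.items():
        print(number, ordering_holds(chain, PLACEHOLDER_MEANS))
```

A check like this only compares means; whether a difference is reliable would still have to be established with an appropriate statistical test.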
4. Method
4.1. Stimuli
4.1.1. Experimental sentences. The experimental sentences were con-
structed by combining a sentence stub with a completion consisting of
either a complement clause and an adverbial clause or two complement
clauses (see below). The stubs for the WH Prototypical condition were as
follows:
(4) a. What do you think . . .
b. Where do you think . . .
c. What did you say . . .
d. Where did you say . . .
The stubs for the non-prototypical sentences were constructed by
changing or adding the relevant element. Thus, in the WH Subject condi-
tion, you was changed to a proper name (and a third person ending was
added to the auxiliary so that it agreed with the subject); in the WH Aux-
iliary condition, do was changed to will or would; in the WH Verb condi-
tion, think was replaced with believe or suspect and say with claim or
swear; in the WH Complementizer condition, an overt complementizer
(that) was added after the verb; and in the WH Unprototypical condition,
all of the above changes were made.⁵
The completions for conditions 1–5 consisted of a four-word complement
clause followed by a four-word adverbial clause, e.g.,
(5) (What do you think) the witness will say
(stub) (first complement clause)
if they don't intervene?
(adverbial clause)
The completions for sentences with very long dependencies (i.e., condi-
tions 6 and 7) consisted of a two-word complement clause followed by a
pronominal subject, verb, and a four-word prepositional phrase, e.g.,
(6) (What do you think) Jo believes
(stub) (first complement clause)
he said at the court hearing?
(second complement clause)
Thus, all experimental sentences without complementizers were 12
words long and contained three clauses, with seven words intervening be-
tween the WH word and the gap. Sentences with complementizers were
13 words long, with 8 words between the WH word and the gap.
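
The length and distance figures above can be checked mechanically. The following is a minimal sketch, not part of the original procedure: the gap marker `__` is an annotation introduced here for illustration, and the WH Complementizer sentence is constructed from the description in this section (that added after the matrix verb of example (5)).

```python
def dependency_stats(sentence, gap="__"):
    """Return (word count, number of words between the WH word and the gap).

    The gap marker is a hypothetical annotation added for this sketch;
    it is not counted as a word.
    """
    tokens = sentence.rstrip("?.").split()
    gap_index = tokens.index(gap)
    n_words = len(tokens) - 1      # exclude the gap marker itself
    intervening = gap_index - 1    # tokens strictly between the WH word (index 0) and the gap
    return n_words, intervening


examples = {
    "WH Prototypical": "What do you think the witness will say __ if they don't intervene?",
    "WH Long": "What do you think Jo believes he said __ at the court hearing?",
    # Constructed for illustration from the description of the condition.
    "WH Complementizer": "What do you think that the witness will say __ if they don't intervene?",
}

for condition, sentence in examples.items():
    print(condition, dependency_stats(sentence))
```

Whitespace tokenization treats don't as a single word, which is the counting convention the figures in the text appear to assume.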
5. Throughout this paper, I use the term non-prototypical to refer to
questions which differ in some respect from the prototypical instances of
the construction, and the term unprototypical to refer to questions which
differ from the prototype in all relevant respects.

There were two versions of the questionnaire, and two sets of completions.
In version 1, the stubs were combined with completion set 1 in conditions
1, 3, 5 and 7 and with completion set 2 in conditions 2, 4, and 6; in
version 2, the stubs were combined with completion set 1 in even-numbered
conditions and with completion set 2 in odd-numbered conditions.
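
The counterbalancing of completion sets across the two questionnaire versions can be written as a small lookup. This is an illustrative sketch only; the function name and the integer coding of versions and conditions are assumptions made here, not part of the original materials.

```python
def completion_set(version, condition):
    """Return the completion set (1 or 2) paired with a stub.

    version:   1 or 2 (questionnaire version)
    condition: 1..7   (experimental condition number)
    """
    if version not in (1, 2) or not 1 <= condition <= 7:
        raise ValueError("version must be 1 or 2 and condition must be 1..7")
    is_odd = condition % 2 == 1
    if version == 1:
        return 1 if is_odd else 2   # set 1 in conditions 1, 3, 5, 7
    return 2 if is_odd else 1       # version 2 reverses the pairing


for version in (1, 2):
    print(version, [completion_set(version, c) for c in range(1, 8)])
```

Under this scheme every stub is paired with each completion set in exactly one version, so an effect of the lexical content of the completions would surface as an effect of version.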
4.1.2. Grammatical controls. The grammatical control conditions were
constructed by supplying appropriate lexical material for the WH word in
the appropriate position in the clause; in addition, a conjunction (so, and,
or but) was added at the beginning of declaratives corresponding to inter-
rogatives with the auxiliary do: for example, the declarative counterpart
of
(7) What do you think the witness will say if they don't intervene?
was
(8) But you think the witness will say something if they don't intervene.
This was done so that the interrogative sentence and the corresponding
declarative control contained the same number of words; it also made the
declarative sentences sound more natural. (It is somewhat odd for a
speaker to assert what the addressee thinks or said; adding the conjunction
makes the sentence pragmatically more plausible because it conveys
the impression that the speaker is either inferring the addressee's beliefs
from his or her words, contrasting them with those of another person, or
clearing up an apparent misunderstanding.)
4.1.3. Ungrammatical controls. The ungrammatical control conditions,
like the experimental sentences and the grammatical controls, contained
subordinate clauses. Half of the ungrammatical sentences were declara-
tive and the other half interrogative. That-trace sentences (*That) con-
tained an overt complementizer immediately before a gap in the subject
position:
(9) *What do you think that probably got lost during the move?
(10) *Who do you think that will turn up in the evening?
In Complex NP sentences (*ComplexNP) the matrix clause contained a
complement-taking noun (claim, fact, rumour, hypothesis) followed by a
complement clause with a gap, e.g.,
(11) *What did Claire make the claim that she read in a book?
(12) *Where did you discover the fact that the criminals put the car?
In Negatives without do support (*Not), the matrix clause contained a
negated verb but no auxiliary, with tense being marked on the main verb:
(13) *Her husband not claimed they asked where we were going.
Finally, in declaratives with double tense/agreement marking (*DoubleTn),
the matrix clause contained a third person subject and a negated
verb, and agreement was marked on the auxiliary as well as on the main
verb:
(14) *The girl doesn't remembers where she spent her summer holidays.
The same ungrammatical controls were used in both versions of the
questionnaire. There were four sentences in each condition, giving a total
of 16 ungrammatical controls.
4.1.4. Constructing the questionnaire. The test sentences were divided
into four blocks, each containing one sentence from each of the seven
experimental and eleven control conditions. The order of the sentences
within each block was random.
A full list of the sentences used in version 1 of the experiment is given
in the Appendix.
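
The block structure described in 4.1.4 can be sketched as follows. This is a hypothetical reconstruction rather than the script actually used: the item labels are placeholders, the rule assigning the i-th sentence of each condition to block i is an assumption (the text does not say how sentences were allocated to blocks), and the seed is arbitrary.

```python
import random

EXPERIMENTAL = ["WH Prototypical", "WH Subject", "WH Verb", "WH Auxiliary",
                "WH Complementizer", "WH Long", "WH Unprototypical"]
GRAMMATICAL_CONTROLS = [name.replace("WH", "DE", 1) for name in EXPERIMENTAL]
UNGRAMMATICAL_CONTROLS = ["*That", "*ComplexNP", "*Not", "*DoubleTn"]
CONDITIONS = EXPERIMENTAL + GRAMMATICAL_CONTROLS + UNGRAMMATICAL_CONTROLS  # 7 + 11 = 18


def build_blocks(items_by_condition, n_blocks=4, seed=None):
    """Split items into blocks with one item per condition, shuffled within block."""
    rng = random.Random(seed)
    blocks = []
    for b in range(n_blocks):
        # Assumption: the b-th item of every condition goes into block b.
        block = [(cond, items[b]) for cond, items in items_by_condition.items()]
        rng.shuffle(block)  # random order within each block
        blocks.append(block)
    return blocks


# Placeholder items standing in for the sentences listed in the Appendix.
items = {cond: [f"{cond} item {i + 1}" for i in range(4)] for cond in CONDITIONS}
blocks = build_blocks(items, seed=0)
print(len(blocks), [len(block) for block in blocks])
```

Blocking in this way spreads the four sentences of each condition evenly over the questionnaire, so that no condition is concentrated at the beginning or the end.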
4.2. Participants
38 second and third year literature students from the School of English at
the University of Newcastle participated in the experiment. All were na-
tive speakers of English.
4.2.1. Procedure. Participants were asked to complete a written ques-
tionnaire and were given the following instructions:
The questionnaire is part of a study of speakers' intuitions about
English sentences. It is not an intelligence test or a grammar test.
Please indicate how acceptable/unacceptable you find each of the
following sentences by choosing a number on a scale from 1 (very bad)
to 5 (fine). Read the sentences carefully, but do not spend too much
time thinking about them: we are interested in your initial reaction.
Do not go back and change your responses to earlier sentences.
The instructions were followed by two examples for which ratings were
provided:
(15) Will the girl who won the prize come to the party?
(16) Did the man who arrive by train is my cousin?
The first was given a rating of 5 and the second 1. These examples were
provided in order to anchor the participants' ratings. Thus, in essence, the
participants' task was to decide whether the sentences in the questionnaire
were more like 1, more like 5, or in-between.
The questionnaires were distributed after a lecture and took 10–15
minutes to complete. Participants were randomly assigned to one version
of the questionnaire, with about half completing each version.
5. Results and discussion
5.1. Grammatical v. ungrammatical sentences
The mean acceptability ratings for all conditions are given in Table 2. For
clarity, the same information is presented visually in Figure 1, where the
bars corresponding to each experimental condition have been arranged
from highest to lowest. Although the mean rating for all grammatical
sentences combined (3.67) was considerably higher than for the
ungrammatical controls (1.98), there is no sharp contrast between
grammatical and ungrammatical sentences. What we have instead is a continuum
of acceptability ratings ranging from 4.3 for prototypical WH questions
to 1.3 for negatives without do support, with the other sentence
types occupying various intermediate points. The four ungrammatical
sentence types cluster at the lower end of the continuum; however,
the acceptability ratings for unprototypical LDD questions were not
significantly different from those for that-trace and double-tense sentences
(WH Unprototypical v. *That: t(37) = 0.26, p = 0.798; WH Unprototypical
v. *DoubleTn: t(37) = 0.70, p = 0.486), although they were higher
than those for the other two ungrammatical control conditions (WH
Unprototypical v. *ComplexNP: t(37) = 6.15, p < 0.001; WH Unprototypical
v. *Not: t(37) = 8.81, p < 0.001). Thus, the pure competence grammar
prediction that grammatical sentences will receive ratings close to 5
and ungrammatical sentences will be rated about 1 is clearly false.

Table 2. Acceptability ratings for all conditions

Condition             Mean    Std. Deviation
WH Prototypical       4.31    0.63
WH Subject            4.25    0.59
WH Verb               3.93    0.71
WH Auxiliary          3.23    0.83
WH Complementizer     3.84    0.84
WH Long               3.85    0.76
WH Unprototypical     2.54    0.75
DE Prototypical       3.57    0.85
DE Subject            4.00    0.63
DE Verb               3.74    0.78
DE Auxiliary          3.49    0.66
DE Complementizer     3.53    0.79
DE Long               3.89    0.75
DE Unprototypical     3.14    0.90
*That                 2.50    0.75
*DoubleTn             2.41    0.95
*ComplexNP            1.69    0.56
*Not                  1.31    0.49
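
The two combined means quoted above (3.67 and 1.98) can be reproduced from the condition means in Table 2. The sketch below assumes that every condition contributes the same number of ratings (four sentences per condition per participant), so that the combined mean equals the unweighted mean of the condition means; the variable names are illustrative.

```python
from statistics import fmean

# Condition means transcribed from Table 2.
grammatical = {
    "WH Prototypical": 4.31, "WH Subject": 4.25, "WH Verb": 3.93,
    "WH Auxiliary": 3.23, "WH Complementizer": 3.84, "WH Long": 3.85,
    "WH Unprototypical": 2.54,
    "DE Prototypical": 3.57, "DE Subject": 4.00, "DE Verb": 3.74,
    "DE Auxiliary": 3.49, "DE Complementizer": 3.53, "DE Long": 3.89,
    "DE Unprototypical": 3.14,
}
ungrammatical = {"*That": 2.50, "*DoubleTn": 2.41, "*ComplexNP": 1.69, "*Not": 1.31}

grammatical_mean = fmean(grammatical.values())
ungrammatical_mean = fmean(ungrammatical.values())

# The values agree with the reported 3.67 and 1.98 to within rounding of
# the two-decimal condition means.
print(f"grammatical: {grammatical_mean:.3f}, ungrammatical: {ungrammatical_mean:.3f}")
```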
5.2. Grammatical sentences
A preliminary analysis of the participants' ratings for the grammatical
sentences was conducted using a construction (2) × prototypicality
(7) × version (2) ANOVA. The analysis revealed a significant main
effect of prototypicality, F(6, 216) = 49.82, p < 0.001, ηp² = 0.58, and a
prototypicality × construction interaction, F(6, 216) = 15.14, p < 0.001,
ηp² = 0.30. No other effect or interaction was significant. Since there was
no significant effect of version, and no interaction between version and
any of the other factors, the results for the two versions were collapsed
in all further analyses.
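
For readers who wish to relate the effect sizes to the F statistics, partial eta squared can be recovered from F and its degrees of freedom through the standard identity ηp² = (F × df1) / (F × df1 + df2). The sketch below applies it to the two effects reported above; the function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error) as reported for the preliminary ANOVA.
reported = [
    ("prototypicality", 49.82, 6, 216),
    ("prototypicality x construction", 15.14, 6, 216),
]
for label, f_value, df1, df2 in reported:
    print(f"{label}: {partial_eta_squared(f_value, df1, df2):.2f}")
```

Both values round to the effect sizes given in the text (0.58 and 0.30).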
Figure 1. Mean acceptability ratings for all conditions
Note: Grey bars correspond to questions; white bars correspond to declaratives
and striped bars to ungrammatical controls. Error bars represent 95 percent
confidence intervals.
5.2.1. Processing cost of dependencies: Questions v. declaratives.
Prediction 2 was that LDD questions will receive lower ratings than the
corresponding declaratives because they involve displaced constituents. This
prediction was not confirmed: there was no significant effect of
construction. Instead, as indicated above and shown in Table 3, we have an
interaction between construction type and lexical content, with prototypical
LDD questions, questions with lexical subjects, and questions with overt
complementizers being judged significantly more acceptable than the
corresponding declaratives. (Note that the significance levels reported in
Table 3 and elsewhere in this paper have not been corrected for multiple
comparisons: since the hypothesis tested predicts that all the relevant
comparisons should be significant, using the Bonferroni adjustment or
an equivalent method would not be appropriate.)
Table 3. Testing prediction 2

Prediction                 Mean    SD      t-test value    p value    Prediction confirmed?
WH Prototypical            4.31    0.63    5.234           <0.001     ✗✗
  < DE Prototypical        3.57    0.85
WH Subject                 4.25    0.59    2.859           0.007      ✗✗
  < DE Subject             4.00    0.63
WH Complementizer          3.84    0.84    2.149           0.038      ✗✗
  < DE Complementizer      3.53    0.79
WH Verb                    3.93    0.71    1.644           0.109      ✗
  < DE Verb                3.74    0.78
WH Long                    3.85    0.76    0.452           0.654      ✗
  < DE Long                3.89    0.75
WH Auxiliary               3.23    0.83    1.950           0.059      ✗
  < DE Auxiliary           3.49    0.66
WH Unprototypical          2.54    0.75    5.882           <0.001     ✓
  < DE Unprototypical      3.14    0.90

Note: ✓ indicates that a prediction has been confirmed; ✗ indicates that a
prediction has not been confirmed; ✗✗ indicates a significant difference in
the opposite direction.

The interrogative sentences, particularly in the WH Prototypical condition,
contained some frequent bigrams (what do, do you, you think): these
occur with a frequency of 6433, 27602, and 9901 respectively in the British
National Corpus. This could be partly responsible for the fact that
interrogatives were rated as more acceptable than the corresponding
declaratives in most conditions. However, as shown in Table 4, there is
no strong relationship between the number of frequent bigrams and
acceptability ratings of WH questions (i.e., questions containing more
high-frequency bigrams are not necessarily more acceptable) or the number
of frequent bigrams and the advantage for questions over declaratives
(questions containing more high-frequency bigrams are not necessarily
better than the corresponding declaratives). Furthermore, bigram frequency
cannot explain the interaction between construction type and
verb or between construction type and complementizer (see below), since
the words immediately preceding and following the verb and the
complementizer were the same in both the declarative and the interrogative
variants. In fact, for sentences with complementizers, bigram frequency
makes precisely the wrong predictions. In the WH Prototypical and DE
Prototypical conditions, the main clause verb (think or say) was followed
by the pronoun they or we or the determiner the, while in the WH
Complementizer and DE Complementizer conditions, it was followed by the
complementizer that. The mean frequency of the bigrams think the, think
they, think we, say the, say they, say we in the British National Corpus is
2539, while the mean frequency of the bigrams think that and say that is
9473. Thus, if acceptability ratings were simply a reflection of bigram
frequency, sentences with complementizers should receive higher ratings
than sentences without them. (The mean frequency of the bigrams that
they, that the and that we is even higher: 59333.) However, as we will see
in section 5.2.4, the ratings for WH questions with complementizers were
considerably lower than for the prototypical variants, while there was no
difference in the acceptability of the corresponding declaratives.

Table 4. Frequent bigrams in WH questions

Condition             Mean    Difference score*    Frequent bigrams
WH Prototypical       4.31     0.74                what do, do you, you think
WH Complementizer     3.84     0.31                what do, do you, you think
WH Subject            4.25     0.25                what do
WH Verb               3.93     0.19                what do, do you
WH Long               3.85    -0.04                what do, do you, you think
WH Auxiliary          3.23    -0.26                you think
WH Unprototypical     2.54    -0.60                (none)

* Difference scores were computed by subtracting the rating of the
declarative control from the rating of the interrogative sentence.
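
The difference scores in Table 4 follow directly from the condition means in Table 2. The sketch below recomputes them (interrogative rating minus declarative control) and lists them next to the number of frequent bigrams per condition; the data structures and variable names are illustrative, and the bigram counts are simply read off Table 4.

```python
# Mean ratings from Table 2: (interrogative, declarative control).
ratings = {
    "Prototypical": (4.31, 3.57),
    "Complementizer": (3.84, 3.53),
    "Subject": (4.25, 4.00),
    "Verb": (3.93, 3.74),
    "Long": (3.85, 3.89),
    "Auxiliary": (3.23, 3.49),
    "Unprototypical": (2.54, 3.14),
}
# Number of frequent bigrams (what do, do you, you think) per WH condition.
frequent_bigrams = {
    "Prototypical": 3, "Complementizer": 3, "Subject": 1, "Verb": 2,
    "Long": 3, "Auxiliary": 1, "Unprototypical": 0,
}

difference_scores = {
    condition: round(wh - de, 2) for condition, (wh, de) in ratings.items()
}

for condition, score in difference_scores.items():
    print(f"{condition:15s} bigrams={frequent_bigrams[condition]}  difference={score:+.2f}")
```

Conditions with three frequent bigrams span difference scores from positive (Prototypical) to slightly negative (Long), which is the pattern the text describes as the absence of a strong relationship.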
5.2.2. Processing cost of very long dependencies. According to
prediction 3, WH questions containing very long dependencies, i.e.,
dependencies spanning two clause boundaries, should receive lower ratings
than questions containing dependencies across just one clause boundary,
whilst the corresponding declaratives should be rated equally acceptable,
since they do not involve a filler-gap dependency. This prediction was
evaluated by comparing the mean ratings for questions with very long
dependencies (WH Long and WH Unprototypical) and the corresponding
declaratives and for questions containing dependencies over one
clause boundary (WH Prototypical, WH Subject, WH Verb, WH Auxiliary,
WH Complementizer) and the corresponding declaratives. The ratings
were analyzed using a 2 × 2 ANOVA with the within-participants
factors of construction type (WH question, declarative) and
complementation (1 clause, 2 clauses). This revealed a main effect of
complementation, F(1, 37) = 50.44, p < 0.001, ηp² = 0.58, indicating that
sentences containing two complement clauses were judged as less acceptable
than sentences containing one complement clause and one adverbial clause.
This was qualified by a construction × complementation interaction,
F(1, 37) = 34.14, p < 0.001, ηp² = 0.48 (see Figure 2). Post-hoc
comparisons showed that WH questions with very long dependencies were judged
significantly worse than questions with long dependencies: t(37) = 9.62,
p < 0.001. For declaratives, the corresponding difference approaches
significance: t(37) = 1.87, p = 0.070. This suggests that processing
difficulty had the predicted effect on acceptability ratings.

Figure 2. Construction type × complementation interaction (All conditions)
However, comparisons of ratings for each of the two very long dependency
questions with each of the long dependency conditions suggest
a slightly more complex picture. As shown in Table 5, WH Long items
were judged significantly worse than WH Prototypical and WH Subject
items, and unprototypical questions were judged significantly worse than
all other WH questions. However, there was no significant difference
between WH Long and WH Verb items or between WH Long and WH
Complementizer (indeed, the latter received slightly higher ratings), and
questions with a modal auxiliary were judged significantly worse than
WH Long sentences. Since the declarative versions of unprototypical
LDD questions were also judged to be less acceptable than the other
declarative sentences (see Table 6), the low acceptability ratings for the
unprototypical sentences may be partially due to the lexical content of the
sentences. Thus, although the existence of very long dependencies may
have an effect on acceptability, this can only account for some of the
observed differences.
Table 5. Testing prediction 3 (Questions)

Prediction                 Mean    SD      t-test value    p value    Prediction confirmed?
WH Prototypical            4.31    0.63    4.67            <0.001     ✓
  > WH Long                3.85    0.76
WH Subject                 4.25    0.59    3.63            0.001      ✓
  > WH Long                3.85    0.76
WH Verb                    3.93    0.71    0.62            0.539      ✗
  > WH Long                3.85    0.76
WH Auxiliary               3.23    0.83    4.13            <0.001     ✗✗
  > WH Long                3.85    0.76
WH Complementizer          3.84    0.84    0.11            0.910      ✗
  > WH Long                3.85    0.76
WH Prototypical            4.31    0.63    14.17           <0.001     ✓
  > WH Unprototypical      2.54    0.75
WH Subject                 4.25    0.59    15.36           <0.001     ✓
  > WH Unprototypical      2.54    0.75
WH Verb                    3.93    0.71    11.29           <0.001     ✓
  > WH Unprototypical      2.54    0.75
WH Auxiliary               3.23    0.83    5.73            <0.001     ✓
  > WH Unprototypical      2.54    0.75
WH Complementizer          3.84    0.84    9.38            <0.001     ✓
  > WH Unprototypical      2.54    0.75
5.2.3. Pragmatics. Prediction 4 stated there should be an interaction
between construction type and the lexical properties of the matrix subject:
declaratives with lexical subjects should receive higher ratings than
declaratives with second person subjects, but there should be no
corresponding difference for questions. To test this prediction, a 2 × 2
ANOVA with the within-participants factors of construction (WH question,
declarative) and subject (second person, lexical) was conducted. The
analysis showed that the predicted interaction did indeed occur:
F(1, 37) = 14.82, p < 0.001, ηp² = 0.29 (see Figure 3); the main effect of
construction was also significant, F(1, 37) = 25.01, p < 0.001, ηp² = 0.40.
Further analysis confirmed that declaratives with lexical subjects were
judged more acceptable than declaratives with second person subjects:
t(37) = 3.92, p < 0.001. The ratings for interrogatives with lexical
subjects were marginally lower than for interrogatives with second person
subject, but the difference was not statistically significant:
t(37) = 0.91, p = 0.368.
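
The shape of this interaction can be seen in the cell means themselves. The sketch below, which uses the four relevant means from Table 2, computes the simple effect of subject type within each construction and the interaction contrast as the difference between the two simple effects. It is a descriptive illustration only and is no substitute for the ANOVA, which requires the participant-level ratings; the names are illustrative.

```python
# Cell means from Table 2 for the construction x subject design.
cell_means = {
    ("WH", "second person"): 4.31,   # WH Prototypical
    ("WH", "lexical"): 4.25,         # WH Subject
    ("DE", "second person"): 3.57,   # DE Prototypical
    ("DE", "lexical"): 4.00,         # DE Subject
}


def simple_effect(construction):
    """Effect of replacing a second person subject with a lexical subject."""
    return (cell_means[(construction, "lexical")]
            - cell_means[(construction, "second person")])


# Interaction contrast: how much larger the subject effect is in
# declaratives than in questions.
interaction_contrast = simple_effect("DE") - simple_effect("WH")

print(f"declaratives: {simple_effect('DE'):+.2f}")
print(f"questions:    {simple_effect('WH'):+.2f}")
print(f"interaction contrast: {interaction_contrast:+.2f}")
```

A lexical subject raises the mean rating of declaratives by 0.43 points and lowers that of questions by 0.06, so the subject manipulation matters for declaratives but hardly at all for questions.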
Table 6. Testing prediction 3 (Declaratives)

Prediction                 Mean    SD      t-test value    p value    Same as question?
DE Prototypical            3.57    0.85    3.10            0.004      ✗
  v. DE Long               3.89    0.75
DE Subject                 4.00    0.63    1.22            0.230      ✗
  v. DE Long               3.89    0.75
DE Verb                    3.74    0.78    1.45            0.156      ✓
  v. DE Long               3.89    0.75
DE Auxiliary               3.49    0.66    3.89            <0.001     ✓
  v. DE Long               3.89    0.75
DE Complementizer          3.53    0.79    3.68            0.001      ✗
  v. DE Long               3.89    0.75
DE Prototypical            3.57    0.85    3.22            0.003      ✓
  v. DE Unprototypical     3.14    0.90
DE Subject                 4.00    0.63    6.74            <0.001     ✓
  v. DE Unprototypical     3.14    0.90
DE Verb                    3.74    0.78    4.31            <0.001     ✓
  v. DE Unprototypical     3.14    0.90
DE Auxiliary               3.49    0.66    2.99            0.005      ✓
  v. DE Unprototypical     3.14    0.90
DE Complementizer          3.53    0.79    3.17            0.003      ✓
  v. DE Unprototypical     3.14    0.90

5.2.4. Prototypicality effects. According to the usage-based hypothesis,
any modification of the LDD template should result in lower
acceptability ratings for interrogative sentences but have no effect (or a
much smaller effect) on declaratives. In other words, usage-based theories
predict an interaction between construction type and lexical content.
As we have just seen, although there is a significant interaction between
construction type and the lexical properties of the subject, this is best
interpreted as reflecting pragmatic implausibility of second person
declaratives with think and say. Substituting a lexical subject for you in
questions did not result in a significant reduction of acceptability,
although there was a small difference in the predicted direction. This
suggests that the LDD template does not specify the subject, although it is
also possible that the processing cost of NP substitution is too small to be
revealed by an acceptability judgment task.
The relationship between construction type and the other four factors
(verb, auxiliary, complementizer, and the number of complement clauses)
was investigated by means of four additional 2 × 2 ANOVAs (see
Table 7) followed up by t-tests. All the interactions were as predicted
by the usage-based account. There was a significant interaction between
construction type and verb (see Figure 4): changing the matrix verb
in a LDD question results in significantly lower acceptability ratings
(t(37) = 3.23, p = 0.003); changing the verb in the corresponding
declarative, on the other hand, results in a slightly more acceptable
sentence, although the difference is not statistically significant
(t(37) = 1.45, p = 0.155). There was also an interaction between
construction type and auxiliary (see Figure 5): replacing do or did with the
modal auxiliary will or would made the interrogatives less acceptable
(t(37) = 8.26, p < 0.001), while adding the same auxiliaries in declaratives
had no effect on ratings (t(37) = 0.56, p = 0.578). The size of the
interaction between construction type and complementizer (Figure 6) was
somewhat smaller, although it was also in the predicted direction: adding an
overt complementizer had no effect on declaratives (t(37) = 0.44,
p = 0.666) but resulted in lower ratings for interrogatives (t(37) = 3.77,
p = 0.001).⁶ Finally, there is a significant interaction between
construction type and complementation: adding a second complement clause
between the filler and the gap makes the interrogative less acceptable while
the corresponding declarative is more acceptable (see Figure 7; for pairwise
comparisons, see Table 5). (Note that this analysis compares just the WH
Prototypical and WH Long conditions and their declarative counterparts, i.e.,
sentences which are most closely matched lexically. The analysis in
section 5.2.2 contrasted questions with very long dependencies, i.e., WH
Long and WH Unprototypical, with questions containing dependencies
spanning just one clause, i.e., WH Prototypical, WH Subject, WH Verb,
WH Auxiliary, and WH Complementizer.)

Figure 3. Construction type × subject interaction

Table 7. ANOVA results

Effect/Interaction                   Test statistics

Construction (WH, DE) × verb (think/say, believe/suspect/claim/swear)
  Construction                       F(1, 37) = 8.28, p < 0.001, ηp² = 0.34
  Construction × verb                F(1, 37) = 13.81, p = 0.002, ηp² = 0.27

Construction (WH, DE) × auxiliary (do/did, will/would)
  Construction                       F(1, 37) = 7.30, p < 0.001, ηp² = 0.17
  Auxiliary                          F(1, 37) = 47.06, p < 0.001, ηp² = 0.56
  Construction × auxiliary           F(1, 37) = 22.06, p < 0.001, ηp² = 0.37

Construction (WH, DE) × complementizer (none, that)
  Construction                       F(1, 37) = 21.15, p < 0.001, ηp² = 0.36
  Complementizer                     F(1, 37) = 13.29, p = 0.001, ηp² = 0.26
  Construction × complementizer      F(1, 37) = 6.68, p = 0.014, ηp² = 0.15

Construction (WH, DE) × complementation (1 clause, 2 clauses)
  Construction                       F(1, 37) = 12.59, p = 0.001, ηp² = 0.25
  Construction × complementation     F(1, 37) = 5.82, p < 0.001, ηp² = 0.42

Note: Only significant main effects and interactions are listed in the table.

Figure 4. Construction type × verb interaction

Figure 5. Construction type × auxiliary interaction

The usage-based model adopted here also predicts a particular order in
the acceptability of non-prototypical LDD questions: specifically, LDD
questions with a lexical subject should be more acceptable than questions
with a non-prototypical matrix verb, which in turn should be more
acceptable than those with a non-prototypical auxiliary. As shown in
Tables 8 and 9, these predictions have also been confirmed. (Table 8
also shows the results of pairwise comparisons relevant for testing other
usage-based predictions for the sake of completeness.)

6. We know from the psycholinguistic literature that overt complementizers
facilitate processing, presumably because they signal the presence of a
subordinate clause and thus help the processor to avoid garden path effects:
for instance, sentences with complement clauses introduced by a
complementizer are read faster than sentences without complementizers, even
when the main clause verb has a strong preference for clausal complements
(Trueswell et al. 1993, Holmes et al. 1989). Thus the presence of the
complementizer effect for questions provides strong evidence in favour of
lexical storage of the whole construction.
Although the model made no specific predictions about the relative size
of the effect of the other manipulations, it is interesting to see how they
compare to lexical substitutions in the subject, verb and auxiliary slot.
As can be seen from Figure 1, the effects of adding a complementizer and
of adding an additional complement clause are about the same as that of
changing the matrix verb. The mean acceptability ratings for WH Verb,
WH Long and WH Complementizer sentences were 3.93, 3.85, and 3.84
respectively; none of the differences between these conditions was
statistically significant (WH Verb v. WH Complementizer: t(37) = 0.71,
p = 0.484; WH Verb v. WH Long: t(37) = 0.62, p = 0.539; WH
Complementizer v. WH Long: t(37) = 0.11, p = 0.910).

Figure 6. Construction type × complementizer interaction

Figure 7. Construction type × complementation interaction (Prototypical and
Long sentences only)
Table 8. Testing predictions 5, 6 and 7 (Questions)

Prediction                 Mean    SD      t-test value    p value    Prediction confirmed?
WH Prototypical            4.31    0.63    0.91            0.368      ✗
  > WH Subject             4.25    0.59
WH Subject                 4.25    0.59    3.32            0.002      ✓
  > WH Verb                3.93    0.71
WH Verb                    3.93    0.71    4.96            <0.001     ✓
  > WH Auxiliary           3.23    0.83
WH Prototypical            4.31    0.63    3.77            0.001      ✓
  > WH Complementizer      3.84    0.84
WH Prototypical            4.31    0.63    4.69            <0.001     ✓
  > WH Long                3.85    0.76
WH Long                    3.85    0.76    10.51           <0.001     ✓
  > WH Unprototypical      2.54    0.75

Table 9. Testing predictions 5, 6 and 7 (Declaratives)

Prediction                 Mean    SD      t-test value    p value    Same as question?
DE Prototypical            3.57    0.85    3.92            <0.001     ✗
  v. DE Subject            4.00    0.63
DE Subject                 4.00    0.63    2.85            0.007      ✓
  v. DE Verb               3.74    0.78
DE Verb                    3.74    0.78    2.26            0.030      ✓
  v. DE Auxiliary          3.49    0.66
DE Prototypical            3.57    0.85    0.44            0.666      ✗
  v. DE Complementizer     3.53    0.79
DE Prototypical            3.57    0.85    3.10            0.004      ✗
  v. DE Long               3.89    0.75
DE Long                    3.89    0.75    8.44            <0.001     ✓
  v. DE Unprototypical     3.14    0.90
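
The ordinal predictions can also be checked against the observed means in a few lines. The sketch below encodes predictions 5, 6 and 7 as ordered lists of conditions and tests whether the mean ratings for questions decrease strictly along each list. The helper name is illustrative, and an ordering of means is only a necessary condition: whether each step is reliable is what the pairwise t-tests in Table 8 address (the first step of prediction 5, for instance, is not significant).

```python
# Mean ratings for questions, from Table 2.
wh_means = {
    "WH Prototypical": 4.31, "WH Subject": 4.25, "WH Verb": 3.93,
    "WH Auxiliary": 3.23, "WH Complementizer": 3.84, "WH Long": 3.85,
    "WH Unprototypical": 2.54,
}

# Predicted orderings, from most to least acceptable.
predictions = {
    5: ["WH Prototypical", "WH Subject", "WH Verb", "WH Auxiliary"],
    6: ["WH Prototypical", "WH Complementizer"],
    7: ["WH Prototypical", "WH Long", "WH Unprototypical"],
}


def means_follow_order(order, means):
    """True if the means decrease strictly along the predicted order."""
    values = [means[condition] for condition in order]
    return all(a > b for a, b in zip(values, values[1:]))


ordering_holds = {
    number: means_follow_order(order, wh_means)
    for number, order in predictions.items()
}
print(ordering_holds)
```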
5.3. Why are real life LDD questions so stereotypical?
Why are questions with long distance dependencies so stereotypical? One
possible explanation is offered by Verhagen (2005). Verhagen observes
that the propositional content of most complementation constructions is
expressed by the subordinate clause; the main clause normally just signals
epistemic stance (see also Thompson 2002), i.e., it invites the hearer to
adopt a particular subjective perspective on the object of
conceptualization. The greater the distance between the onstage
conceptualizer (i.e., the subject of the main clause) and the ground (in the
sense of Langacker 1987), the more difficult it is to construe the main
clause as an epistemic marker (as opposed to a prediction in its own right).
Verhagen argues that this distance is minimal when the conceptualizer is the
first person (in declaratives) or the second person (in interrogatives),
when the verb is relatively generic, and when there are no other elements
qualifying the verb; it follows that the matrix clause in LDD questions will
normally contain a second person subject, a relatively non-specific verb
such as think and say, and no additional constituents.
A different, but not necessarily incompatible, explanation for restrictions
on questions and other constructions with long distance dependencies
is proposed by Goldberg (2006). Goldberg argues that differences in
acceptability arising as a result of the use of different main clause verbs
can be explained by appealing to a general principle which she calls BCI,
which states that backgrounded constituents are islands. The gap in a
filler-gap dependency construction must occur within the potential focus
domain; the constituent containing the gap cannot be backgrounded.
Since complements of factive verbs and manner of speaking verbs (as
well as complex NPs, sentential subjects, and presupposed adjuncts) are
backgrounded, they cannot participate in filler-gap constructions.
An experimental study by Ambridge and Goldberg (this issue) provides
some empirical support for this proposal. Participants in this experiment
completed two tasks. In the first task, they rated the acceptability of WH
questions with long distance dependencies (e.g., What did Jess think that
Dan liked?) and the corresponding declaratives (e.g., Daniele thought that
Jason liked the cake). In the second task they were presented with a
negated sentence containing a verb complement clause (e.g., Maria didn't
know that Ian liked the cake) and asked to judge to what extent it implied
the negation of the complement clause (Ian didn't like the cake): this
measured the extent to which speakers judged the information in the
subordinate clause to be presupposed, and thus backgrounded. The main finding
was that, as predicted by the BCI hypothesis, there was a very strong
negative correlation between responses on the negation test and difference
scores computed by subtracting the acceptability rating of the questions
from those of the corresponding declaratives, and a weaker negative
correlation between responses on the negation test and acceptability ratings.
It remains to be seen whether BCI can also explain other restrictions
on LDD questions documented in this study: the fact that they strongly
disprefer main clause verbs other than think or say (not just factives
and manner-of-speaking verbs), auxiliaries other than do, and comple-
mentizers, and that in real life (as opposed to the examples found in the
linguistic literature) they virtually never involve a dependency spanning
more than one clause.
Thus we have two independent proposals explaining why particular
lexical variants of LDD questions may be preferred or dispreferred in
usage. A central claim of usage-based approaches is that mental grammars
are shaped by usage patterns: it is thus not surprising that speakers
develop strong lexically specific templates for LDD questions and possibly
fail to develop more abstract representations of these constructions.
6. Conclusion
The most striking result of the experiment reported here is the existence of
strong prototypicality effects for LDD questions. Prototypical instances
of this construction, i.e., those which fit one of the templates postulated
on the basis of corpus research (WH do you think S-GAP?, WH did you
say S-GAP?) were judged to be the most acceptable of all sentences.
Departures from the prototype (use of a different auxiliary or verb in the
matrix clause, addition of a complementizer or an extra complement
clause) resulted in lower acceptability ratings. Crucially, there was no
corresponding effect on declaratives, so the differences in grammaticality
cannot be attributed simply to the properties of the lexical items used in
the experiment. Acceptability also depended on the type of substitution
required: nominals are apparently easier to substitute than verbs, which
in turn were easier than auxiliaries.
The participants' judgments were also influenced by pragmatic
considerations: declaratives with lexical subjects (DE Subject, e.g., So
Steve said the children could stay here when their father returns) were
judged to be more acceptable than declaratives with second person subjects
(DE Prototypical, e.g., So you said the children could stay here when their
father returns). This effect, however, was fairly small in comparison with
the purely lexical effects.
Adding an additional complement clause also reduced the acceptability
of the questions (and had the opposite effect on declaratives). This could
be attributed to the greater processing demands posed by the increased
syntactic distance between the filler and the gap, since the filler must be
held in working memory while the pre-gap part of the sentence is being
processed. However, questions with very long dependencies (with two
clause boundaries intervening between the filler and the gap) were not
consistently judged to be less acceptable than questions involving
dependencies across only one clause boundary; and the effect of adding an
additional complement clause was no bigger than that of adding a
complementizer or changing the matrix verb. Thus, appealing to prototypicality
effects provides a more parsimonious explanation for these findings. This
is not to say that the processing demands of holding the filler in memory
have no effect on processing, but the costs may be relatively small
compared to the effects of prototypicality.⁷
Interestingly, unprototypical LDD questions (those with a complementizer,
an additional verb complement clause, a lexical subject, a
modal auxiliary, and a verb other than think or say in the main clause)
were judged to be just as bad as that-trace violations and sentences with
double tense/agreement marking (though better than sentences involving
extraction out of a complex NP and negatives without an auxiliary).
This is consonant with the results of two acceptability experiments conducted
by Kluender and Kutas (1993). In their first experiment, in which
participants were required to provide speeded categorical acceptability
judgments, LDD questions were accepted only 54 percent of the time. In
the second experiment, participants rated acceptability on a scale from 1
to 40, and could take as much time as they wished to make the judgment.
The mean acceptability rating for LDD questions was 19, significantly
higher than that for WH island variations, but much lower than those
for Y/N questions containing complement clauses. Kluender and Kutas
conclude that the low acceptability of LDD questions is attributable to
the processing demands of holding the filler in working memory. However,
since their stimuli were quite different from prototypical LDD questions
(they all contained overt complementizers and non-prototypical
verbs; some also had lexical subjects or auxiliaries other than do), it could
also be explained by appealing to their unprototypicality.
7. Acceptability judgment and ease of processing are of course two different things, although
they tend to be correlated: other things being equal, sentences which are difficult
to process tend to be judged less acceptable (cf. Fanselow and Frisch 2006; Frazier and
Clifton 1989; Kluender and Kutas 1993). Note, too, that unprototypical LDD questions
contain more dysfluencies than prototypical instances of the construction (Dabrowska
forthc.), which also suggests that they are more difficult to process.
So what does the presence of prototypicality effects in acceptability
judgments about LDD questions (in particular, those due to the lexical
properties of the sentences, since these are easier to interpret) tell us
about speakers' mental representations of this construction? It has long
been acknowledged, of course, that the choice of lexical items in a sentence
affects speakers' acceptability judgments. In the generative tradition,
this is usually regarded as a confound: the presence of a particular
word can make an otherwise well-formed sentence unacceptable, which
could lead the analyst to draw incorrect conclusions about grammar; lexical
effects, therefore, are something that should be controlled for where
possible, discounted when encountered (Featherston 2005: 702). In this
case, however, we are dealing with the opposite situation: LDD questions
are fully acceptable only with particular lexical content. This suggests
that they are more like a constructional idiom than a fully general construction.
In other words, questions with long-distance dependencies are
conventional units with an unusual form (a WH word at the beginning
of the main clause associated with a gap in a subordinate clause) and a
specialized meaning: the unit WH do you think S-GAP? is used to inquire
about the speaker's opinion about the content of the subordinate clause
(What do you think he wants? = 'In your opinion, what does he want?');
and WH did you say S-GAP? is used when the addressee has already given the
speaker the relevant information but the speaker does not remember
(When did you say he came? = 'You've already told me when he came,
but please tell me again.').[8]
The non-prototypical uses are rather like
what Moon (1998) calls 'exploitations' of idioms, exemplified by expressions
such as throw in the moist towelette (constructed on analogy with
throw in the towel) or use an earthmover to crack a nut (cf. use a sledgehammer
to crack a nut). Once all the lexical content has been changed
(cf. the WH Unprototypical condition), it is no longer possible to identify
the motivating construction, and hence the sentence is judged as
unacceptable.[9]
Alternatively, one could argue that when processing unprototypical
LDD questions, speakers have to fall back on more abstract
schemas (the mental analogues of the WH question construction and the
complementation construction), and that the low acceptability ratings for
such sentences reflect the difficulty of combining highly complex and
abstract schemas.
8. It is interesting to note that the Collins Cobuild English Language Dictionary (Sinclair
1987: 1519) lists the use of think in long-distance questions as a separate sense of the
verb.
9. With most idioms, substituting different lexical items for every word would result in
expressions which are still judged acceptable, provided they make sense semantically: for
example, taking use a sledgehammer to crack a nut as the model, one could construct
perfectly acceptable phrases such as do a headstand to impress a neighbour and tickle an
earthworm to amuse the children. This is possible because speakers have fully general
schemas for the transitive and the infinitival construction.
At this point one may wonder why the possibility that speakers have
lexically specific knowledge about LDD questions has not even been considered
by most syntacticians in spite of the fact that these constructions
have been so intensively studied for several decades. The short answer is
that the possible existence of lexical effects was not regarded as theoretically
interesting, so although there are a number of experimental studies
of LDD constructions (see, for example, Cowart 1997; Frazier and Clifton
1989; Kluender and Kutas 1993), to my knowledge, nobody has systematically
investigated the effect of lexical content on speakers' linguistic
intuitions about them.[10]
A second reason is that linguists tend to rely on
their own intuitions, and linguists' intuitions about LDD questions may
be systematically different from those of ordinary speakers. Dabrowska
(in press b) shows that linguists tend to judge unprototypical LDD questions
as considerably more acceptable than that-trace violations and sentences
with double tense marking, and not much worse than prototypical
questions. This could be a reflection of their theoretical commitments (the
belief that instances of the same construction should be equally grammatical),
but it could also be a result of differences in linguistic experience.
Many linguists spend a considerable amount of time constructing
examples of the structures they are interested in and reading papers containing
such constructed examples.[11]
Since LDD questions have been the
object of very intensive research, it is likely that linguists (or at least linguists
who work on LDD constructions, or discuss them with their students)
have been exposed to more instances of this construction than
most ordinary language users, and, crucially, the instances they have encountered
are much more varied, as demonstrated by the examples in (1).
As a result, they are much more likely to develop more general representations
of these constructions, and accept unprototypical instances of
them.[12]
10. There is some work on lexical effects on acceptability judgments in basic argument
structure constructions: see Theakston 2004, Ambridge et al. in press. In both studies,
speakers judged argument structure violations with high frequency verbs (e.g., *I
poured you with water) as less acceptable than argument structure violations with low
frequency verbs (e.g., *I dribbled teddy with water); the authors explain this by appealing
to the higher entrenchment of the pattern with the frequent verb. Ambridge et al.
(2008) also found that fully grammatical sentences with high frequency verbs were
judged slightly more acceptable than sentences with low-frequency verbs, although the
difference was very small (for adults, 4.82 v. 4.76; the authors do not indicate whether
or not it was statistically significant).
11. Note that constructing examples for a linguistic paper (or for one's students to analyse)
is a very different kind of activity from ordinary language use: it is conscious and deliberate,
and relies on metalinguistic and/or general problem-solving abilities rather than
normal linguistic routines.
12. For other work suggesting that linguists' judgments may be systematically different
from those of ordinary speakers, see Spencer 1973 and Bradac et al. 1980. See also
Hiramatsu 1999 and Snyder 2000 for experimental studies of syntactic satiation, a
phenomenon whereby sentences which were initially judged ungrammatical become
increasingly acceptable as a result of repeated exposure.
Received 11 April 2007
Revision received 18 December 2007
Sheffield University, UK
Appendix: List of sentences used in version 1 of the experiment
Experimental sentences
Prototypical LDD question (WH Prototypical)
What did you say the family should know before they go there?
What do you think they decided to do when they got home?
Where did you say they hid the treasure when they found out?
Where do you think the children could stay when their father returns?
LDD question with lexical subject (WH Subject)
What did Steve say we bought for Alice when we visited her?
What does Claire think the witness will say if they don't intervene?
Where did Andy say the young man went after they found her?
Where does Paul think we put the documents after he saw them?
LDD question with a different verb (WH Verb)
What did you claim we bought for Alice when we visited her?
What do you believe the witness will say if they don't intervene?
Where did you swear the young man went after they found her?
Where do you suspect we put the documents after he saw them?
LDD question with a different auxiliary (WH Auxiliary)
What will you say the family should know before they go there?
What would you think they decided to do when they got home?
Where will you say they hid the treasure when they found out?
Where would you think the children could stay when their father returns?
LDD question with an overt complementizer (WH Complementizer)
What did you say that the family should know before they go there?
What do you think that they decided to do when they got home?
Where did you say that they hid the treasure when they found out?
Where do you think that the children could stay when their father returns?
LDD question with an additional subordinate clause (WH Long)
What did you say Eve claimed we bought during our first visit?
What do you think Jo believes he said at the court hearing?
Where did you say Mike swore he went after the evening performance?
Where do you think Phil suspects we were during the last war?
Unprototypical LDD question (WH Unprototypical)
What will Steve believe that Jo thinks they did with their old furniture?
What would Claire claim that Eve said they know about the whole affair?
Where will Andy suspect that Phil thinks they stayed during the school
holidays?
Where would Paul swear that Mike said they were during the afternoon
session?
Grammatical control sentences
Prototypical declarative (DE Prototypical)
And you think the children could stay here when their father returns.
But you think they decided to do something when they got home.
So you said the family should know everything before they go there.
So you said they hid the treasure somewhere when they found out.
Declarative with lexical subject (DE Subject)
And Claire thinks the witness will say something if they don't intervene.
But Steve said we bought something for Alice when we visited her.
So Andy said the young man went home after they found her.
So Paul thinks we put the documents back after he saw them.
Declarative with a different verb (DE Verb)
And you swore the young man went home after they found her.
But you suspect we put the documents back after he saw them.
So you believe the witness will say something if they don't intervene.
So you claimed we bought something for Alice when we visited her.
Declarative with a different auxiliary (DE Auxiliary)
You will say the family should know everything before they go there.
You will say they hid the treasure somewhere when they found out.
You would think the children could stay here when their father returns.
You would think they decided to do something when they got home.
Declarative with an overt complementizer (DE Complementizer)
And you said that the family should know everything before they go there.
But you think that the children could stay here when their father returns.
So you said that they hid the treasure somewhere when they found out.
So you think that they decided to do something when they got home.
Declarative with an additional subordinate clause (DE Long)
And you think Phil suspects we were here during the last war.
But you said Mike swore he went home after the evening performance.
So you said Eve claimed we bought something during our first visit.
So you think Jo believes he said something at the court hearing.
Unprototypical Declarative (DE Unprototypical)
Andy will suspect that Phil thinks they stayed here during the school
holidays.
Claire would claim that Eve said they know everything about the whole
affair.
Paul would swear that Mike said they were here during the afternoon
session.
Steve will believe that Jo thinks they did something with their old
furniture.
Ungrammatical control sentences
that trace sentences (*that)
What did you say that will kill cockroaches but not ants?
What do you think that probably got lost during the move?
Who did you say that ate the spinach your mother cooked?
Who do you think that will turn up in the evening?
Sentences with extraction from a complex NP (*ComplexNP)
What did Claire make the claim that she read in a book?
What did Paul hear the rumour that I found in my garage?
Where did you discover the fact that the criminals put the car?
Where did you put forward the hypothesis that all the weapons were?
Negatives without do support (*Not)
Her husband not claimed they asked where we were going.
The manager not implied you knew what they were doing.
The teacher not suspected she remembered where that woman lived.
Your sister not believed I forgot what he had done.
Declaratives with double tense marking (*DoubleTn)
His cousin doesn't thinks we lied because we were afraid.
The girl doesn't remembers where she spent her summer holidays.
The mother doesn't knows Julia was absent from school today.
Your brother doesn't believes the man is telling the truth.
References
Ambridge, Ben and Adele E. Goldberg
this issue The island status of clausal complements: Evidence in favor of an informa-
tion structure explanation.
Ambridge, Ben, Julian M. Pine, Caroline F. Rowland and Chris R. Young
2008 The effect of verb semantic class and verb frequency (entrenchment) on
children's and adults' graded judgments of argument structure overgeneralization
errors. Cognition.
Bradac, James J., Larry W. Martin, Norman D. Elliott and Charles H. Tardy
1980 On the neglected side of linguistic science: multivariate studies of sentence
judgment. Linguistics 18, 967–995.
British National Corpus, The, version 2 (BNC World)
2001 Distributed by Oxford University Computing Services on behalf of the BNC
Consortium. URL: http://www.natcorp.ox.ac.uk/.
Cheng, Lisa Lai-Shen and Norbert Corver (eds.)
2006 Wh Movement: Moving On. MIT Press.
Chomsky, Noam
1977 On wh-movement. In Formal Syntax, edited by Peter W. Culicover, Thomas
Wasow, and Adrian Akmajian. Academic Press, New York, 71–132.
Cowart, Wayne
1997 Experimental Syntax: Applying Objective Methods to Sentence Judgments.
Sage Publications, Thousand Oaks, CA.
Culicover, Peter W.
1997 Principles and Parameters: An Introduction to Syntactic Theory. Oxford Uni-
versity Press, Oxford.
Dabrowska, Ewa
2008 The effects of frequency and neighbourhood density on adult native speakers'
productivity with Polish case inflections: An empirical test of usage-based
approaches to morphology. Journal of Memory and Language 58, 931–951.
in press a Prototype effects in questions with unbounded dependencies.
in press b Naïve v. expert competence: An empirical study of speaker intuitions.
2004 Language, Mind and Brain. Some Psychological and Neurological Con-
straints on Theories of Grammar. Edinburgh University Press, Edinburgh.
Dabrowska, Ewa and Elena Lieven
2005 Developing question constructions: Lexical specificity and usage-based operations.
Cognitive Linguistics 16, 437–474.
Fanselow, Gisbert and Stefan Frisch
2006 Effects of processing difficulty on judgments of acceptability. In Gradience in
Grammar, edited by G. Fanselow, C. Féry, M. Schlesewsky, and R. Vogel.
Oxford University Press, Oxford.
Featherston, Sam
2005 Universals and grammaticality: wh-constraints in German and English. Linguistics
43, 667–711.
Frazier, Lyn and Charles Clifton, Jr.
1989 Successive cyclicity in the grammar and the parser. Language and Cognitive
Processes 4, 93–126.
Goldberg, Adele E.
2006 Constructions at Work. The Nature of Generalization in Language. Oxford
University Press, Oxford.
Haegeman, Liliane
1991 Introduction to Government and Binding Theory. Basil Blackwell, Oxford.
Hawkins, John A.
1999 Processing complexity and filler-gap dependencies across grammars. Language
75, 244–285.
Hiramatsu, Kazuko
1999 What syntactic satiation can tell us about islands. Papers from the Regional
Meetings, Chicago Linguistic Society 35, 141–151.
Holmes, V. M., L. Stowe and L. Cupples
1989 Lexical expectations in parsing complement-verb sentences. Journal of Memory
and Language 28, 668–689.
Kluender, Robert and Marta Kutas
1993 Subjacency as a processing phenomenon. Language and Cognitive Processes
8, 573–633.
Lakoff, George
1987 Women, Fire and Dangerous Things. What Categories Reveal about the
Mind. Chicago University Press, Chicago.
Langacker, Ronald W.
1987 Foundations of Cognitive Grammar. Volume 1: Theoretical Prerequisites.
Stanford University Press, Stanford, CA.
1991 Foundations of Cognitive Grammar. Volume 2: Descriptive Application. Stanford
University Press, Stanford, CA.
2000 A dynamic usage-based model. In Usage-Based Models of Language, edited
by Michael Barlow and Suzanne Kemmer, 1–63. CSLI Publications, Stanford,
CA.
Levine, Robert D. and Thomas E. Hukari
2006 The Unity of Unbounded Dependency Constructions. CSLI Publications,
Stanford, CA.
Lieven, Elena V., Heike Behrens, Jennifer Speares, and Michael Tomasello
2003 Early syntactic creativity: A usage-based approach. Journal of Child Language
30, 333–370.
Moon, Rosamund
1998 Fixed Expressions and Idioms in English: A Corpus-Based Approach. OUP,
Oxford.
Ouhalla, Jamal
1994 Introducing Transformational Grammar: From Rules to Principles and Parameters.
Edward Arnold, London.
Sinclair, John (ed.)
1987 Collins Cobuild English Language Dictionary. HarperCollins, London and
Glasgow.
Snyder, William
2000 An experimental investigation of syntactic satiation effects. Linguistic Inquiry
31, 575–582.
Spencer, N. J.
1973 Differences between linguists and non-linguists in intuitions of
grammaticality-acceptability. Journal of Psycholinguistic Research 2, 83–98.
Theakston, Anna L.
2004 The role of entrenchment in children's and adults' performance on grammaticality
judgment tasks. Cognitive Development 19, 15–34.
Thompson, Sandra
2002 Object complements and conversation. Towards a realistic account.
Studies in Language 26, 125–164.
Tomasello, Michael, Nameera Akhtar, Kelly Dodson, and Laura Rekau
1997 Differential productivity in young children's use of nouns and verbs. Journal
of Child Language 24, 373–387.
Trueswell, John C., Michael K. Tanenhaus, and Christopher Kello
1993 Verb-specific constraints in sentence processing: separating effects of lexical
preference from garden-paths. Journal of Experimental Psychology: Learning,
Memory and Cognition 19, 528–553.
Verhagen, Arie
2005 Constructions of Intersubjectivity: Discourse, Syntax and Cognition. Oxford
University Press, Oxford.
Wasow, Thomas, and Jennifer Arnold
2005 Intuitions in linguistic argumentation. Lingua 115, 1481–1496.
Lexical chunking effects
in syntactic processing
ARNE ZESCHEL*
Abstract
Research on syntactic ambiguity resolution in language comprehension has
shown that subjects' processing decisions are influenced by a variety of
heterogeneous factors such as syntactic complexity, semantic fit and the
discourse frequency of the competing structures. The present paper investigates
a further potentially relevant factor in such processes: effects of
syntagmatic lexical chunking (or matching to a complex memorized prefab)
whose occurrence would be predicted from usage-based assumptions about
linguistic categorisation. Focusing on the widely studied so-called DO/SC-ambiguity,
in which a post-verbal NP is syntactically ambiguous between
a direct object and the subject of an embedded clause, potentially biasing
collocational chunks of the relevant type are identified in a number of
corpus-linguistic pretests and then investigated in a self-paced reading experiment.
The results show a significant increase in processing difficulty from a
collocationally neutral over a lexically biasing to a strongly biasing condition.
This suggests that syntagmatically complex and partially schematic
templates of the kind envisioned in usage-based Construction Grammar
may impinge on speakers' online processing decisions during sentence
comprehension.
Keywords: Sentence processing; prefabs; usage-based model.
Cognitive Linguistics 19–3 (2008), 427–446
DOI 10.1515/COGL.2008.016
0936–5907/08/0019–0427
© Walter de Gruyter
* I am grateful to Anatol and Benjamin Stefanowitsch for programming the experimental
software and supplying it to me. Many thanks also to Ewa Dabrowska, Stefanie Wulff
and Kathryn Allan for running parts of the experiment. Earlier versions of this paper
were presented at DGKL/GCLA 2 in Munich and ICLC 10 in Krakow. I would like to
thank the audiences for discussion and useful comments. Finally, I am grateful for the
very helpful suggestions of two reviewers for Cognitive Linguistics which have significantly
improved the quality of the paper. All remaining errors are mine. Contact address:
Universität Bremen, Fachbereich 10, Postfach 33 04 40, 28334 Bremen, Germany.
Author e-mail: zeschel@uni-bremen.de.
1. Introduction
Research on language processing has shown that comprehenders have
temporary difficulties with the interpretation of locally ambiguous sentences
like those in (1):
(1) a. The criminal confessed his sins harmed too many people.
(Rayner and Frazier 1987)
b. The thief searched by the police had the missing weapon.
(Trueswell 1996)
c. The complex houses single and married students and their families.
(Jurafsky 1996)
Whereas it is widely assumed that such difficulties provide valuable
cues as to how the underlying processing system is organised, there is as
yet no general consensus about the factors that account for the observed
difficulties (cf. Tanenhaus and Trueswell 1995 for an overview). Most
consonant with usage-based approaches to language are so-called
constraint-based models of the ambiguity resolution process which argue
for an immediate interaction and rapid integration of different information
sources: in contrast to syntax-centered two-stage models, these approaches
are non-modular and accord central importance to matters of
usage frequency and psychological entrenchment, both assumptions that
are well in keeping with central tenets of cognitively oriented versions of
Construction Grammar (Bybee 2006; Goldberg 2006; Langacker 2000).
The present paper adds to this research by evaluating the role of a factor
that has received comparatively little attention in the literature so far,
viz. potential effects of collocational chunking on the ambiguity resolution
process. Focusing on the so-called DO/SC (sometimes also called
NP/S) ambiguity in which a post-verbal NP is temporarily ambiguous
between a direct object and the subject of an embedded clause (cf. 1a),
initial evidence from a self-paced reading experiment is presented which
suggests that complex and partially schematic templates of the kind envisioned
in usage-based Construction Grammar impinge on subjects' syntactic
processing decisions.
2. A usage-based perspective on sentence processing
One of the foundational assumptions of usage-based approaches is that
linguistic knowledge is heavily redundant, with abstract schemas coexisting
with specific instances of the relevant type that have independent unit
status themselves (provided they are sufficiently entrenched). From a
processing perspective, this raises the following question: if there are several
elements in the constructicon that could be invoked as the categorising
structure for a given target expression, then which of these will actually
be chosen? The question is important since different candidates may
warrant different predictions as to how the utterance will unfold further,
meaning that the choice of a particular candidate structure at the expense
of others may have consequences for later processing decisions.
As indicated in the introduction, traditional psycholinguistic approaches
to sentence processing can be broadly distinguished into two
types of models. Prototypical instances of the first type are serial and
modular: such models assume that parsing decisions are initially guided
by considerations of syntactic complexity alone, and that attachment ambiguities
are resolved through general heuristics such as minimal attachment
and late closure without recourse to non-syntactic information
unless the initial analysis fails, in which case the parser has to backtrack
and reanalyse (Ferreira and Clifton 1986; Frazier 1987; Frazier and Fodor
1978). The second class of models is typically parallel and interactive:
here, comprehenders are assumed to employ constraints from a variety
of different sources from the outset, with several different analyses competing
for selection (e.g., MacDonald et al. 1994; Trueswell et al. 1993;
Trueswell et al. 1994).
Langacker (2000) sketches a usage-based perspective on the selection of
linguistic categorising structures that is compatible with models of the latter
type. Specifically, Langacker assumes that only a single categorising
structure will be selected in the end, and that the choice of this element is
influenced by three types of factors: entrenchment (i.e., relative degree of
resting activation and routinization of the competing alternatives), contextual
priming (plausibility within the present discourse context) and
overlap (degree of structural similarity with the specific target at hand).
While the first two of these three factors have been extensively studied
in the ambiguity resolution literature (cf. e.g., Cuetos, Mitchell and Corley
1996; Jurafsky 1996; Pickering, Traxler and Crocker 2000; Trueswell
1996 on aspects of frequency and entrenchment; Garnsey et al. 1997;
Hare et al. 2003, 2004; Tanenhaus et al. 2000; Trueswell et al. 1994;
Wiechmann, this issue on semantic fit and contextual plausibility), the
third factor, overlap, has received considerably less attention.
The present paper addresses this factor, with overlap understood as
similarity of the target to a larger composite structure that is hypothesised
to have psychological unit status and which could thus be invoked as a
single categorising structure holistically. A possible explanation for the
relative neglect of this question in previous research is that issues of syntax
and sentence processing are not commonly thought about in terms of
prefabricated formulae because they are seen as the province of free
and unrestricted combinatoriality in language, with highly general syntactic
rules being the most salient manifestation of what Sinclair (1987) has
called the 'open-choice principle' in language. As a consequence, sentence
processing is usually assumed to involve an incremental build-up
from atomic units rather than an amalgamation of more or less complex
structures that may already be retrieved en bloc. As indicated above,
however, usage-based approaches assume that speakers do indeed memorize
such internally complex chunks (regardless of whether they are predictable
or not), and that the generalisations that speakers extract from structurally
similar elements in their repertoire (e.g., VP → V NP as a
generalisation over ask a question, hit the post, pull strings etc.) are
in fact epiphenomenal (in the sense that they are merely implicit in a set
of stored exemplars, cf. Langacker 2000). Prefabricated chunks of various
grain sizes thus play an important role in the model, and it seems reasonable
to hypothesise from here that they are also relevant for processing.
So far, there has been little experimental work on this issue, even
though it has not remained unnoticed: for instance, Elman et al. (2004)
mention the possibility that formulaic sequences such as Mr. and Mr.
Smith proudly announce . . . may induce processing biases that differ from
those of the verb as viewed in isolation (here: a DO-preference in spite
of the overall SC-preference of announce). Apart from such potentially
biasing effects of prior context, i.e., preverbal material, one would also
expect the attachment of the ambiguous noun phrase to be sensitive to the
concrete lexical identity of its head: specifically, encountering an NP that
is a direct object collocation of the respective verb should privilege a DO-analysis
of the developing structure, even if the verb in isolation otherwise
favours SC. Moreover, one would expect that any additional syntagmatic
cues for this reading should increase the hypothesised effect.
Arguably, aspects of entrenchment, contextual plausibility and overlap
may be difficult to disentangle in practice: what is frequent in the input is
usually frequent for a good semantic reason, and individual frequent
combinations are of course likely to be stored. Hence, it can be expected
that collocating nouns in VN-sequences occur in this position more frequently
than expected, that they will allow a semantically coherent interpretation
and that the entire sequence may in fact be stored as a prefab.
Nevertheless, habitual co-occurrence is still not the same as semantic
plausibility. Specifically, one can expect routinized co-occurrences to be
semantically plausible for some reason or other, but not necessarily vice
versa (i.e., there are all sorts of things that can be plausibly confessed,
but not all of the corresponding nouns are habitual collocates of the verb
confess). This is where the present study comes in: the experiment reported
in section 3 attempts to tease apart effects of contextual plausibility
on the one hand and overlap on the other hand by comparing the
processing of putative prefabs to the processing of presumably non-stored
VN-sequences that contain a close semantic variant of the collocating
noun.
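The notion of a noun occurring after a verb "more frequently than expected" can be made concrete with a small calculation. The following is a minimal sketch only: it assumes the simplest possible baseline, namely the co-occurrence count expected if verb and noun were statistically independent, and all counts are invented for illustration. The function name and variables are hypothetical, and the pretests reported in this paper may rely on a different association measure.

```python
def expected_cooccurrences(verb_freq, noun_freq, n_pairs):
    """Expected verb-noun co-occurrence count if verb and noun were independent."""
    return verb_freq * noun_freq / n_pairs

# Hypothetical counts, invented purely for illustration (not corpus data).
observed = 30        # times the noun follows the verb
verb_freq = 200      # all occurrences of the verb in V N sequences
noun_freq = 500      # all occurrences of the noun in V N sequences
n_pairs = 100_000    # all V N sequences in the (imaginary) sample

expected = expected_cooccurrences(verb_freq, noun_freq, n_pairs)
ratio = observed / expected  # values above 1 mean "more frequent than expected"
print(f"expected = {expected:.1f}, observed/expected = {ratio:.1f}")
```

A ratio well above 1 would mark the pair as a candidate collocation, whereas a semantically plausible but non-habitual pairing would be expected to stay close to its chance level.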
To my knowledge, there is no previous research on the influence of collocation
effects on syntactic processing in the sense outlined above. On a
more general level, however, there is experimental evidence that comprehenders
do indeed use latent statistical cues from the linguistic context to
speed up comprehension. For instance, McDonald and Shillcock (2003)
present eyetracking evidence that transitional probabilities between verbs
and nouns affect gaze duration in reading, concluding that the brain
is able to draw upon statistical information in order to rapidly estimate
the lexical probabilities of upcoming words: a computationally inexpensive
mechanism that may underlie proficient reading (McDonald and
Shillcock 2003: 648). The present study combines corpus-linguistic and
experimental methods to investigate whether such information also influences
syntactic processing, and how possible chunking effects of this type
relate to item-based preferences pertaining to the verb when viewed in
isolation.
3. Prefabs in sentence comprehension
Effects of syntagmatic lexical chunking on syntactic processing were
investigated in a self-paced reading experiment. Subjects read different
types of locally ambiguous sentences that ultimately turned out to involve
sentential complementation. Target items were of three types:
- stimuli that could not be said to privilege a DO-analysis due to lexical
  chunking effects because they did not involve a collocating VN-pair at
  all (even though the respective noun was a near-synonym of the
  collocating noun and hence semantically plausible)
- stimuli that supported a transitive DO-analysis before the
  disambiguation region by involving a DO-collocating noun, yet no additional
  pointers to the ultimately wrong DO-analysis (i.e., the core of the
  hypothesised DO-prefab alone)
- stimuli that strongly supported a transitive DO-analysis before the
  disambiguation region by involving a DO-collocating noun and additional
  preverbal cues for the collocating DO-chunk (i.e., a hypothesised
  complex prefab)
Suitable stimuli were constructed on the basis of a number of
corpus-linguistic pretests. These pretests departed from a list of 16 SC-biased
verbs that were taken from an earlier study (Garnsey et al. 1997) and
explored potentially interesting DO-uses of these items in the British
National Corpus (BNC).¹ In the first step, each verb was concordanced
in all relevant forms, and frequency counts for nouns occurring at
positions R-1 and R-2 were summed. For each verb, the list of co-occurring
nouns was then sorted for frequency and the top three VN-pairs were
concordanced anew, this time with a larger span of up to five words in
between verb and noun in order to also capture instances involving e.g.,
disjuncts and modification. Finally, three VN-collocations from the
resulting concordances were chosen that looked promising for present
concerns.² The target items of the present study thus selected were the three
combinations admit defeat, believe luck, and prove worth.
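
The first pretest step (summing noun frequencies at positions R-1 and R-2 for each verb and ranking the result) can be illustrated with a minimal sketch. It assumes the corpus is already available as a flat list of (word, lemma, part-of-speech) triples; the toy tokens, the coarse tag names and the function name `count_right_nouns` are hypothetical and are not taken from the study or from any BNC tool.

```python
from collections import Counter

def count_right_nouns(tokens, verb_lemma, positions=(1, 2)):
    """Count noun lemmas found at R-1 or R-2 of any form of verb_lemma.

    tokens: list of (word, lemma, pos) triples (hypothetical input format).
    """
    counts = Counter()
    for i, (_, lemma, pos) in enumerate(tokens):
        if lemma != verb_lemma or pos != "VERB":
            continue
        for offset in positions:
            j = i + offset
            # Only tokens tagged as nouns contribute to the count.
            if j < len(tokens) and tokens[j][2] == "NOUN":
                counts[tokens[j][1]] += 1
    return counts

# Toy data standing in for a tagged corpus (invented for illustration).
tokens = [
    ("she", "she", "PRON"), ("admitted", "admit", "VERB"), ("defeat", "defeat", "NOUN"),
    ("they", "they", "PRON"), ("admit", "admit", "VERB"), ("the", "the", "DET"),
    ("mistake", "mistake", "NOUN"),
    ("he", "he", "PRON"), ("admits", "admit", "VERB"), ("defeat", "defeat", "NOUN"),
]

# Rank co-occurring nouns by frequency and keep the top three VN-pairs.
top_three = count_right_nouns(tokens, "admit").most_common(3)
print(top_three)
```

Such positional counts ignore syntactic structure (and, in this toy version, sentence boundaries), which is why the candidate pairs still need to be checked by hand.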
It was ensured that the non-collocating combinations in the first
condition were nevertheless attested in naturally occurring English text (if not
in the BNC, then at least on .uk sites on the web). Examples of such
near-synonymous combinations are given in (2):
(2) a. Davids proved his value for the Ajax team again.
http://gov-certificates.co.uk/birth/certificate/Edgar_Davids
(last accessed 25 January 2008)
b. Tories can't ever admit losing without stamping their feet up
and down and howling and bawling not fair, not fair.
http://chat.thisislondon.co.uk/london/threadnonInd.jsp?
forum=18&thread=220080
(last accessed 25 January 2008)
c. Darren Moore could scarcely believe his fortune when he
headed gently in amid a motionless Hull defence.
http://football.guardian.co.uk/Match_Report/0,72111,00.html
(last accessed 25 January 2008)
3.1. Corpus-linguistic pretests
3.1.1. Methods. Association strengths for the presumed collocations
were calculated in the form of a covarying collexeme analysis
(Stefanowitsch and Gries 2005) in order to apply a measure that is
sensitive to syntactic structure. In other words, only those occurrences of e.g.,
admit followed by defeat (within a certain preset span) were counted as a
relevant hit in which the noun was actually the direct object of the verb.
This was necessary because the very existence of the ambiguity illustrates
that mere (near-) adjacency of two words in a sequence does not say
anything about the structural relations between these items. For instance, in
the following hits for queries of the type [V] . . . ≤5 . . . [N], the supposedly
DO-collocating noun is not a direct object of the verb:
(3) a. . . . Dowens admitted the defeat left him a little at . . . [BNC
A9U]
b. Nobody but a fool who believes his luck lies around the corner
could . . . [BNC ART]
c. Pieces which have proved to be of enduring worth have passed
. . . [BNC FPY]

1. In the study by Garnsey and colleagues, SC-biased items were defined as verbs that
occurred with sentential complements at least twice as often as with NP direct objects (as
determined by a sentence completion task).
2. Suitable items had to be both relevantly frequent and permit a substitution of the noun
with a non-collocating close semantic variant for the collocationally neutral condition.

Covarying collexeme analyses permit the identification of significant
associations between words in different slots of one and the same
grammatical construction. Unfortunately, the method requires either a parsed
corpus or extensive manual post-editing of the data. Since the BNC is
not syntactically annotated and balanced parsed corpora such as ICE-GB
are much too small to investigate the comparably rare bigrams that
are at issue here, samples had to be drawn for the figures that could not
be exhaustively coded by hand.³ Specifically, these were the frequencies
of
- the target verb co-occurring with all other nouns in the transitive
  construction
- the target noun co-occurring with all other verbs in the transitive
  construction
- the transitive construction in the corpus at large
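
The source does not say how the sizes of these samples were chosen. As a purely illustrative aside, the sizes reported in note 3 happen to coincide with what the textbook sample size formula with finite population correction yields for a 95 percent confidence level and a 5 percent margin of error. The sketch below only checks that coincidence; the formula, its parameters and the function name `sample_size` are assumptions, not the author's documented procedure.

```python
def sample_size(population, z=1.96, margin=0.05, p=0.5):
    """Textbook sample size with finite population correction.

    z, margin and p are assumed values (95 percent confidence,
    5 percent margin of error, maximal variance); the source does not name them.
    """
    n0 = z ** 2 * p * (1 - p) / margin ** 2        # size for an infinite population
    return round(n0 / (1 + (n0 - 1) / population))  # correction for finite populations

# Corpus frequencies and sample sizes as reported in note 3.
reported = {
    "admit": (11283, 372),
    "defeat": (3476, 346),
    "believe": (34559, 380),
    "luck": (3180, 343),
    "prove": (14593, 374),
    "worth": (3194, 343),
}

for item, (frequency, coded) in reported.items():
    print(item, frequency, coded, sample_size(frequency))
```

For very large populations the same formula levels off at 384, the number of verb tags sampled for the transitive construction estimate.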
Table 1 illustrates the actual calculation of these values on the example
of admit defeat. The plain format figures in the table were obtained
directly from the corpus/corpus samples, the italicized ones were arrived at
by subtraction (see below):

Table 1. Input figures for admit defeat

                               Noun N in the                 All other nouns                     Totals
                               transitive Cxn                (transitive)
Verb V in the transitive Cxn   [admit defeat]_{trans}        [admit N_{-defeat}]_{trans}
                               58                            2065                                2123
All other verbs (transitive)   [V_{-admit} defeat]_{trans}   [V_{-admit} N_{-defeat}]_{trans}
                               716                           3,744,787                           3,745,503
Totals                         774                           3,746,852                           3,747,626

3. Samples were evenly distributed across the corpus; samples for verbs reflected the
proportions of the four different inflected forms in the overall concordance. The detailed
figures are as follows (total corpus frequency in brackets): admit (11,283): 372 coded
sample tokens; defeat (with noun tag: 3476): 346 tokens; believe (34,559): 380 tokens;
luck (3180): 343 tokens; prove (14,593): 374 tokens; worth (3194): 343 tokens; transitive
construction (3,747,626): 384 tokens (see note 5).
4. For the noun worth in prove one's worth, it was not possible to adopt this approach
since the BNC tagging was unusually inaccurate here: as it turned out, the vast majority
of occurrences of worth with a nominal tag were not in fact nouns but wrongly classified
adjectives (uses of the type X is worth Y). In order to address this problem, the
following procedure was applied: first, the proportion of nominal uses of worth was estimated
by manually analysing a sample of 373 tokens out of the overall 12,381 occurrences of
the word (17.96 percent), thus giving an estimated 2224 nominal instances of worth in
the entire corpus. On the basis of this figure, it was then possible to calculate the number
of nominal observations that had to be analysed in order to estimate the proportion of
transitive object uses among these 2224 tokens (328 examples). Finally, I began to
analyse the complete concordance for worth (all tags) until I had identified 328 nominal
tokens and then assessed how many of these featured worth as the head of the direct object
constituent in a transitive construction (106), a figure that was then extrapolated to the
overall population (thereby giving 32.32 percent or 719 tokens).

To begin with, the frequency of [admit defeat]_{transitive} in the upper left
cell (58) was obtained by syntactically analysing the 62 raw hits of the
corpus query. The frequency of [V_{-admit} defeat]_{transitive} in the lower left cell
was obtained by analysing a sample of 346 examples out of the 3,476
total occurrences of defeat that are tagged as a noun in the BNC.⁴
Specifically, I first identified the number of hits in this sample in which defeat
functioned as the direct object in a transitive construction (77); second,
the proportion of transitive direct object uses in the sample was
extrapolated to the overall population of nominal defeat (774); third, the number
of hits for [admit defeat]_{transitive} was subtracted from this figure, thus
giving the estimated number of transitive constructions consisting of a verb
other than admit and defeat as the direct object (716). The same
procedure was applied in order to arrive at the estimated number of tokens
for [admit N_{-defeat}]_{transitive} (2065). The figure in the lower right cell (all
transitive constructions in the BNC which have neither admit in the
V-slot nor defeat in the N-slot) was arrived at in two steps: first, the figures
for [admit N]_{transitive} and [V defeat]_{transitive} (i.e., the known row and column
totals) were subtracted from the total number of transitive constructions
in the BNC, thus giving the last missing row- and column totals.⁵ Once
these were in place, the figures for [admit N_{-defeat}]_{transitive} and
[V_{-admit} defeat]_{transitive} could be subtracted from these results, thereby
giving the value in the final missing cell. On the basis of the completed table, it was
then possible to calculate the expected frequency of [admit defeat]_{transitive}
in the BNC (0.4) and to evaluate the difference between the observed and
the expected value.
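
The arithmetic behind Table 1 can be retraced in a short sketch. It uses only figures quoted in the text and notes (77 of 346 sampled defeat tokens, 141 of 384 sampled verb tags, the row total of 2123 for admit), completes the table by subtraction, and derives the expected frequency. The one-tailed Fisher-Yates test is implemented here directly on the hypergeometric distribution in log space; this is an assumed stand-in for whatever software was actually used, so the resulting p-value need not match the published one digit for digit. All function names are illustrative.

```python
from math import exp, lgamma

def extrapolate(hits, sample_size, population):
    """Scale a proportion observed in a sample up to the whole population."""
    return round(hits * population / sample_size)

def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def fisher_right_tail(a, row_total, col_total, grand_total):
    """P(X >= a) for a 2x2 table with fixed margins (hypergeometric tail)."""
    log_denominator = log_comb(grand_total, row_total)
    return sum(
        exp(log_comb(col_total, k)
            + log_comb(grand_total - col_total, row_total - k)
            - log_denominator)
        for k in range(a, min(row_total, col_total) + 1)
    )

# Figures taken from the text and notes 3 and 5.
observed = 58                                     # [admit defeat], coded exhaustively
defeat_total = extrapolate(77, 346, 3476)         # column total for defeat
grand_total = extrapolate(141, 384, 10206300)     # all transitive constructions
admit_total = 2123                                # row total for admit (as reported)

# Remaining cells by subtraction.
admit_other_nouns = admit_total - observed
other_verbs_defeat = defeat_total - observed
neither = grand_total - admit_total - defeat_total + observed

expected = admit_total * defeat_total / grand_total
p_value = fisher_right_tail(observed, admit_total, defeat_total, grand_total)

print(admit_other_nouns, other_verbs_defeat, neither)
print(round(expected, 1), p_value)
```

The contrast between 58 observed and roughly 0.4 expected tokens is what makes the association so strong; the exact test merely quantifies how unlikely such a surplus is under independence.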
Once association strengths were calculated in this manner, all
attestations of the three target items in the BNC were subject to a detailed
manual coding of their syntagmatic context profile within a span of ±five
words (with the verb as node). Full manual post-editing was applied for
two reasons: first, the POS-tagging of the BNC is not 100 percent reliable,
so that the co-occurrence figures obtained from automatically generated
collocate lists are not necessarily correct.⁶ Second, even if tagging were
100 percent reliable, automatically generated collocate lists would still
remain an imperfect approximation of the grammatical co-occurrence
properties of the investigated items because they do not take syntactic
structure into account. For instance, admit defeat is often found with
adverbs such as finally or never that typically (4 a, b), though not always
(4 c, d), occur immediately before the verb:
(4) a. . . . the man who believed this would never admit defeat . . .
[BNC CAW]
b. . . . she finally admitted defeat when . . . [BNC BP4]
c. Some people never do admit defeat. [BNC G3D]
d. . . . but was finally having to admit defeat and . . . [BNC G3D]

5. The number of transitive constructions in the BNC was estimated by drawing a sample
of 384 verb tags, counting the number of transitive constructions in the sample (141) and
extrapolating its proportion to the overall number of verb tags in the corpus (giving an
estimated 3,747,626 out of 10,206,300 verb tags in total). This is essentially the approach
advocated by Stefanowitsch and Gries (2003) who bootstrap argument structure
construction frequencies from verb frequencies, an intuitively appealing yet admittedly less
than perfect operationalisation since there is not a 1:1 correspondence between verb
tags and argument structure constructions (for instance, an expression like I really
shouldn't have eaten that manifests a single transitive construction, yet contains three
items that would receive a verb tag in the BNC). While certain refinements of this
measure would have been easy to implement (such as an exclusion of all modal verbs), this
would not have been possible for other problematic cases such as auxiliary uses of do
and have or light verb uses of e.g., go in serial verb constructions. Since it would have
been unprincipled to exclude only some of the potentially problematic cases, I simply
took the total number of verb tags in the corpus. Coding-wise, constructions were
counted as transitive iff the main verb occurred with two arguments and the second
argument was a direct object that could be passivized. Mood was ignored, meaning that
examples like He was hit by a truck were coded as transitive. Finally, examples were
counted as transitive as soon as the relevant clause was transitive, regardless of whether
the sampled verb tag itself did in fact belong to the transitive main verb or to an
auxiliary.
6. For instance, of the 58 hits for admit/admits/admitted/admitting . . . (≤5) . . . defeat in
which defeat is actually the direct object of the verb, 20 have the infinitive marker to
(tagged TO0) in position L-1. However, a look at the lexical co-occurrence statistics
shows that the marker to actually appears 25 times in this position, with five occurrences
(20 percent) erroneously tagged as a preposition (PRP) instead.

It would of course be desirable to quantify co-occurrences with such
elements in the same way in which the association strength between the
verb and the noun was quantified, i.e., using standard collostructional
methods by assessing the association of each item in the string with the
constructional slot in question. On the other hand, these methods are
only applicable to the constitutive slots of a given construction, which
makes it difficult to accommodate optional elements such as e.g., negators
or adverbial adjuncts. As a result, behavioural profiles were identified in a
more informal way that relied on raw frequency of co-occurrence instead
(giving observations of the type X percent of the instances of admit
defeat involve negation, X percent involve an aspectual adverb such as
finally etc.). Combining different such observations, lexico-grammatical
context profiles were identified through detailed manual annotation of 58
relevant (i.e., transitive direct object) hits for admit defeat, 77
observations of believe luck, and likewise 77 instances of prove worth.
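
A raw-frequency profile of this kind amounts to counting, for each annotated feature, the share of hits that carry it. The following sketch shows the idea on invented annotations; the feature labels, the four toy hits and the function name `behavioural_profile` are hypothetical and do not reproduce the study's coding scheme or data.

```python
def behavioural_profile(hits):
    """Return the percentage of hits that carry each annotated feature.

    hits: list of sets of feature labels, one set per concordance line.
    """
    total = len(hits)
    features = sorted({feature for hit in hits for feature in hit})
    return {
        feature: 100 * sum(feature in hit for hit in hits) / total
        for feature in features
    }

# Invented annotations for four concordance lines (illustration only).
annotated_hits = [
    {"negation", "aspectual_adverb"},
    {"aspectual_adverb"},
    {"obligation", "aspectual_adverb"},
    set(),
]

profile = behavioural_profile(annotated_hits)
for feature, percent in profile.items():
    print(f"{percent:.0f} percent of the hits involve {feature}")
```

Unlike an association measure, such percentages say nothing about whether a feature is more frequent with the bigram than elsewhere, which is the limitation acknowledged in the text.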
3.1.2. Results. The results of the association strength computations are
reported in Table 2 (values give the probability of error in assuming that
the association is not due to chance, as computed by the Fisher-Yates exact
test, cf. Stefanowitsch and Gries 2003, 2005 for discussion):

Table 2. Association strengths of the three VN-bigrams

Bigram   admit defeat    believe luck    prove worth
p        2.51e-101***    5.69e-125***    4.12e-171***

Regarding the syntagmatic periphery of the hypothesised chunks, the
following templates (comparable to the compound lexical items
proposed in Sinclair 1996) emerged from the corpus pretests:
(5) a. admit defeat
NP (ADV) ({OBLIGATION}) ADMIT defeat
b. believe luck
NP_i {ABILITY} NEG BELIEVE POSS_i (good/bad) luck
c. prove worth
({POSS./TRANSFER}) NP_i (DET chance to) PROVE POSS_i worth
As for the notation, elements in brackets are optional and elements in
curly brackets represent semantic categories with variable lexical
encoding. Hence, (5.a) indicates that transitive DO-uses of admit defeat consist
of a subject NP that is typically followed by a particular kind of adverb
(usually an aspectual one like finally, eventually or never, but there are
also some manner items like reluctantly), followed by an element
signalling OBLIGATION (typically have to, but also must, be forced to etc.),
followed by a form of admit, followed by the direct object noun defeat.
In the case of believe luck, the subject of believe is commonly followed
by an element signalling ABILITY (typically can), followed by a negative
element such as not, hardly or scarcely, followed by a form of believe,
followed by a possessive pronoun that is coreferential with the subject,
followed by the direct object luck. The dominant usage pattern for transitive
prove worth is a little more complex since the bigram is typically
embedded in an infinitival construction in which the referent of an NP is
said to have (or be given) a chance (opportunity etc.) to prove his or her
worth in a particular respect (or to the benefit of a certain third party).
Some representative examples of each pattern are given in (6)–(8):
(6) a. . . . the man who believed this would never admit defeat. [BNC
CAW]
b. Despite soldiering on for three days she finally admitted defeat
when . . . [BNC BP4]
c. . . . but was finally having to admit defeat and accept
powerlessness. [BNC G3D]
(7) a. Juliet couldn't believe her luck. [BNC JY0]
b. He could hardly believe his bad luck though . . . [BNC CEP]
c. . . . she fitted so exactly that Wycliffe could scarcely believe his
luck. [BNC GWB]
(8) a. . . . and we are giving him the chance to prove his worth as a
footballer. [BNC K5A]
b. . . . he will not feel the need to prove his worth, either to himself
or . . . [BNC GVF]
c. . . . had an extended early opportunity to prove his worth as . . .
[BNC BN9]
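
To make the bracket notation concrete, template (5.a) can be approximated as a regular expression over plain text. This is only a rough sketch: the word lists are limited to the items named in the description of (5.a), the subject NP is left out, and a string pattern cannot verify that defeat really is the direct object. The constant names are illustrative.

```python
import re

# Optional adverb slot: items named in the description of (5.a).
ADV = r"(?:finally|eventually|never|reluctantly)"
# Optional OBLIGATION slot: have to, must, be forced to (a few inflected forms).
OBLIGATION = r"(?:(?:has|have|had|having) to|must|(?:is|are|was|were|be|been) forced to)"
# Any form of the verb admit.
ADMIT = r"(?:admit|admits|admitted|admitting)"

TEMPLATE_5A = re.compile(
    rf"\b(?:{ADV}\s+)?(?:{OBLIGATION}\s+)?{ADMIT}\s+defeat\b",
    re.IGNORECASE,
)

examples = [
    "the man who believed this would never admit defeat.",
    "Despite soldiering on for three days she finally admitted defeat when",
    "but was finally having to admit defeat and accept powerlessness.",
    "Dowens admitted the defeat left him a little at",
]

for text in examples:
    match = TEMPLATE_5A.search(text)
    print(match.group(0) if match else None)
```

The last string, adapted from (3.a), is not matched because a determiner intervenes; variants with intervening material are exactly the cases that required manual coding in the pretests.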
Using the templates in (5), it was now possible to construct appropriate
stimuli for the following reading experiment.
3.2. Reading experiment
3.2.1. Subjects. 35 participants took part in the experiment. Subjects
were students at the Universities of Bremen, Sheffield, Salford and
Michigan.⁷ All participants were native speakers of English. Further
demographic characteristics such as sex and age were not recorded since they
were not deemed relevant for the purpose at hand. Subjects were not
paid for participation in the experiment.

7. Subjects had to be drawn from this wide geographic range because it had proved
difficult to find enough native speaker participants among students in Bremen alone at the
time of investigation. However, since regional linguistic differences between subjects can
be assumed to be irrelevant for the task at hand, this was not deemed problematic.

3.2.2. Materials. The experiment was conducted on standard personal
computers with Microsoft Windows XP operating systems. Stimuli were
presented using Self Paced Reading Projector from experimentalSuite
0.9 beta for Windows, a custom-made stand-alone application
programmed with Macromedia Director.
Subjects were seated at the screens and presented with eight short texts
about the Football World Cup 2006 followed by a timed yes-no
comprehension question. Texts 4, 6 and 8 contained the actual test items in
different orders, with the remaining texts serving as distractors. In the target
items, critical passages were preceded by one to three sentences
establishing an appropriate discourse context, followed by the sentence containing
the critical passage, followed by one or two additional sentences before
the comprehension question. Texts were designed to be as ecologically
valid as possible by assembling them (as far as possible) from pieces of
real-life British sports reporting. As an illustration, the set of
experimental stimuli for prove worth is reproduced below:

After the first two matches, Brazilian superstar Ronaldo was criticised a lot for
being overweight, slow and lacking determination. Coach Carlos Alberto Parreira
stayed stubbornly loyal, though, promising his centre forward a place in the
starting line-up for the next match.
a. Ronaldo will prove his value for the team . . .
b. Ronaldo will prove his worth for the team . . .
c. Ronaldo will get a chance to prove his worth for the team . . .
has been downplayed by the media. He is an exceptional player, and I am very
confident that he will score today. Ronaldo repaid Parreira with two goals
against Japan that led his team to the knockout stages and equalled Gerd Müller's
all-time record of 14 world cup goals.

The stimulus sets used for the other two verbs are included in the Ap-
pendix. Subjects were given printed instructions which read as follows:
You are taking part in an experiment on text comprehension. You will read a
number of short texts, each followed by a short statement relating to their content
that you are asked to qualify as either right or wrong.
The texts are presented on a word by word basis. The current word is presented in
the middle of the screen. Pressing SPACE will advance the presentation to the
next word, replacing its predecessor, until the end of the text is reached.
You can determine the pace of the presentation yourself. It is important that you
pay attention to details, so simply take as much time as you need for a careful
reading of the texts. Crucially, it is not possible to go back to earlier words.
At the end of each sentence, the string is displayed. When you are ready
for the next sentence, press SPACE to move on. When you have reached the end
of a text, the string ??? is displayed in the middle of the screen.
When you get to the ??? prompt, please place one finger each on the keys
cursor left and cursor right. Pressing either key will prompt the test statement to
appear in the middle of the screen. Please indicate whether the statement matches
the contents of the preceding text as quickly as possible by either pressing cursor
right (YES) or cursor left (NO).
After your response, the procedure is repeated for the next text until the end of the
experiment is reached.
The experiment is not a quiz. All the information that you need for a correct re-
sponse is supplied in the texts.
The first text is for training. Please pause after you have responded to the first test
statement. The instructor will ask you if there is still anything unclear about the
procedure before the experiment begins.
3.2.3. Procedure. Words were presented one at a time in 48 point
yellow font in the middle of a blue screen. Subjects advanced the
presentation by pressing the space bar. Comprehension questions were answered
by pressing either the key cursor left (no) or cursor right (yes).
Subjects were told that the first text was for training. Once subjects had
responded to the first comprehension question, the experimenter ensured
that the overall procedure was clear to them and left them to complete the
rest of the experiment unsupervised. Each subject saw each of the three
investigated verbs only once, in one of the three conditions outlined above.
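
The text does not state how verbs were paired with conditions for individual subjects. One common way to meet the constraint that every subject sees each verb exactly once is a rotating (Latin square) assignment, sketched below purely as an assumption; the function name `assign_conditions` and the rotation scheme are illustrative and not taken from the study.

```python
VERBS = ["admit defeat", "believe luck", "prove worth"]
CONDITIONS = ["non-collocating", "collocating", "complex prefab"]

def assign_conditions(subject_index):
    """Rotate conditions across verbs so each subject sees every verb once.

    Assumed scheme (Latin square): subject 0 gets conditions in order,
    subject 1 gets them shifted by one position, and so on.
    """
    shift = subject_index % len(CONDITIONS)
    return {
        verb: CONDITIONS[(position + shift) % len(CONDITIONS)]
        for position, verb in enumerate(VERBS)
    }

for subject in range(3):
    print(subject, assign_conditions(subject))
```

Under this scheme each subject contributes one observation per condition, and every verb appears in every condition equally often across each block of three subjects.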
3.2.4. Results. Reading times were compared at the second word of
the disambiguation region (e.g., been in the stimulus set reproduced in
3.2.2) in order to compensate for spillover effects. Following suggestions
by Ferreira and Clifton (1986), length-adjusted residual reading times
were computed for each subject and then related to degree of collocativity
in an analysis of variance. Residual reading times were obtained by
computing linear regression analyses for each subject's reading performance
on words of different lengths (in number of characters including
punctuation) with number of characters as the explanatory variable and reading
time as the dependent variable. Data from the training run and all
sentence-final words were excluded. Reaction times faster than 100 ms and
slower than 2500 ms were treated as missing data (0.93 percent). A
single-group t-test on the regression coefficients for all subjects revealed that
both the intercept and the linear component were significantly different
from zero (p < 0.001***) (cf. Lorch and Myers 1990). After computing
the regressions, residuals were obtained by subtracting the predicted
reading time for a given word from the subject's actual reading time for this word.
Hence, positive residuals indicate slower and negative residuals faster
processing of a given word than predicted by the regression. The residuals
for all critical words in all three conditions (all VN-sequences were
analysed together) were then submitted to an analysis of variance that
indicated a significant effect of collocation (F(2, 102) = 5.5716, p < 0.01**).
Figure 1 presents the results in graphical form:
Figure 1. ANOVA results
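
The two analysis steps described above (per-subject length regression with trimming, followed by a one-way analysis of variance over the residuals) can be sketched as follows. The reading times and group values are invented for illustration, the function names are hypothetical, and the plain least-squares and F-ratio code is a generic textbook implementation rather than the software used in the study.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - slope * mean_x, slope

def residual_reading_times(trials, fastest=100, slowest=2500):
    """Length-adjusted residuals for one subject.

    trials: list of (number_of_characters, reading_time_ms).
    Times outside [fastest, slowest] are treated as missing (None).
    """
    kept = [(chars, rt) for chars, rt in trials if fastest <= rt <= slowest]
    intercept, slope = fit_line([c for c, _ in kept], [rt for _, rt in kept])
    return [
        rt - (intercept + slope * chars) if fastest <= rt <= slowest else None
        for chars, rt in trials
    ]

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    values = [v for group in groups for v in group]
    grand_mean = sum(values) / len(values)
    means = [sum(group) / len(group) for group in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between, df_within = len(groups) - 1, len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within

# Invented data: one subject's trials, then residuals grouped by condition.
subject_trials = [(3, 300), (5, 360), (7, 380), (4, 3000), (9, 440)]
residuals = residual_reading_times(subject_trials)

conditions = [[-10, 0, 10], [0, 10, 20], [40, 50, 60]]
f_ratio, df_between, df_within = one_way_anova(conditions)
print(residuals)
print(f_ratio, df_between, df_within)
```

With 35 subjects contributing one critical word in each of three conditions, the same degrees-of-freedom formula gives 2 and 105 - 3 = 102, which is consistent with the reported F(2, 102).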
3.3. Discussion
As predicted, Figure 1 shows a uniform increase in mean residual reading
time from the syntagmatically non-biasing over the biasing to the
strongly DO-biasing condition. Somewhat unexpectedly, mean residual
reading times for critical words in the non-biasing condition were slightly
shorter than reading times for other words of the same length. Even
though it is difficult to explain why words in a potentially garden-pathing
position were processed slightly faster than other experimental words of
the same length (small as the difference may be), this result is in keeping
with earlier findings that the three investigated verbs are indeed biased
towards sentential complementation, i.e., there is clearly no indication of a
processing difficulty at the critical position in this condition. The mean
increase in processing difficulty in the collocating condition is likewise only
slight, but nevertheless suggests that the presence of the collocating noun
alone already works against the isolated verb bias towards sentential
complementation. As expected, the most marked deviations from
predicted reading times are found in the complex prefab condition with a
mean increase of almost 200 ms. These results can be taken as an
indication that speakers do not merely memorize particular collocations of a
given verb (which is uncontroversial), but that these units are at least in
some cases more profitably viewed as syntagmatically complex chunks
rather than as simple bigrams, and that such larger prefabs may also
influence on-line syntactic processing decisions during comprehension.
Nevertheless, the present results are but a first indication that needs to
be interpreted with caution. To begin with, the study did not contain an
unambiguous baseline condition with an overt complementiser (e.g.,
Ronaldo will prove that his worth for the team has been downplayed by the
media) as is usually included in studies of syntactic ambiguity resolution.
That way, verbs could be investigated in three collocationally different
conditions that could be directly compared without markedly boosting
the number of experimental subjects. Irrespective of the collocation/
chunking issue, however, it is well documented that the basic ambiguity
effect is more pronounced for some verbs than for others, which
introduces a potentially confounding factor that should be controlled for in
possible follow-ups. Likewise, questions remain as to the precise
definition of the complex prefabs in condition 3 and the extent to which they
can be directly compared across verbs. As indicated in section 3.1.1, it is
at present unclear how the collostructional methodology developed for
clearly delimited constructions such as [V NP] could be extended to larger
idiom chunks with fuzzy boundaries of the type in (5) whose formal
specifications gradually shade off into mere semantic preferences. Finally, the
overall operationalisation of degree of syntagmatic attraction in terms of
discrete levels such as non-collocation vs. collocation vs. complex
prefab is certainly an imperfect approximation of what in reality is
clearly a continuum.
All in all, however, the significant increase in mean residual reading
time from condition 1 to condition 3 is a promising indication that future
research in this direction may be worthwhile. Moreover, all three verbs
show a uniform increase in reading time between the non-collocating
and the collocating condition which suggests that the observed effect is
not due to contextual priming in the sense of section 2 alone: since the
non-collocating nouns in condition 1 are a consistently weaker cue for the
DO-analysis than their synonymous variants in the collocating condition,
the increase in reading time is obviously not due to semantic factors.
Likewise, the difference in mean residual reading time between the
collocating and the complex prefab condition cannot be accounted for by
appealing to increased semantic plausibility of the DO-analysis in the latter
case. Instead, the results support Langacker's (2000) assumption that
overlap between stretches of the input and complex preassembled
categorising structures is a relevant processing factor in its own right.
4. Implications
The results of the present study suggest that speakers retain memory for a
variety of syntagmatic context features associated with the different usage
patterns of a given verb, and that accumulating syntagmatic evidence for
patterns of this type may override otherwise dominant parsing biases
attaching to the verb in isolation. These findings are consistent with the
usage-based hypothesis that

lower-level schemas, i.e., structures with greater specificity, have a built-in
advantage in the competition with respect to higher-level schemas. Other things being
equal, the finer-grained detail of a low-level schema affords it a larger number of
features potentially shared by the target (Langacker 2000: 16).

On the procedural level, it seems plausible to assume that this is indeed
a relevant factor that influences pattern capture, i.e., the question
which candidate out of the initial activation set the system will actually
settle to in the end. Functionally, a bias towards concreteness also has
clear advantages: since speakers/hearers store whatever is sufficiently
frequently encountered and hence both communicatively and cognitively
routinized, accumulating evidence for a particular chunk of this type
means that there is a good chance that the corresponding analysis will
prove the correct guess again and thus serves to relieve (or rather bypass)
further processing load probabilistically.
Coming back to the question raised in the beginning (if there are
several elements in the constructicon that could be invoked as the
categorising structure for a given expression, then which of these candidates will
actually be chosen?), the following answer would be consistent both
with the general bottom-up orientation of usage-based models and the
empirical results presented above: all else being equal, hearers/readers
will choose the most concrete potential categorising structure that is
consistent with the currently identified input, to the effect that (more) abstract
schemas will only be invoked as a kind of last resort where a more
concrete standard of comparison is not available. It remains for future
research to show whether this more specific hypothesis can be corroborated,
and how the different factors that were found to influence syntactic
ambiguity resolution should be weighted. For the moment, suffice it to say that
lexical chunking effects of the type investigated in this study do seem to
be one of these factors, a result that is well in keeping with usage-based
assumptions about the kinds of linguistic representations that speakers
store and retrieve in processing.
Received 29 November 2007 Universität Bremen, Germany
Revision received 25 January 2008
Appendix
Stimulus Set A: prove worth
After the first two matches, Brazilian superstar Ronaldo was criticised a
lot for being overweight, slow and lacking determination. Coach Carlos
Alberto Parreira stayed stubbornly loyal, though, promising his centre
forward a place in the starting line-up for the next match.
a. Ronaldo will prove his value for the team . . .
b. Ronaldo will prove his worth for the team . . .
c. Ronaldo will get a chance to prove his worth for the team . . .
has been downplayed by the media. He is an exceptional player, and I am
very confident that he will score today. Ronaldo repaid Parreira with
two goals against Japan that led his team to the knockout stages and
equalled Gerd Müller's all-time record of 14 world cup goals.
Lexical chunking eects 443
Stimulus Set B: admit defeat
England did not live up to the high expectations at home, and for most of
the time, coach Sven-Göran Eriksson seemed obstinate in his decision to
ignore what was happening right before his eyes. In an interview after the
disastrous penalty shoot-out against Portugal, Eriksson still continued to
act as if he could scarcely believe that his time was up: oddly, he spoke of
how "we still have the team to reach the final".
a. Eriksson and his team admitted losing on penalties again . . .
b. Eriksson and his team admitted defeat on penalties again . . .
c. Eriksson and his team finally had to admit defeat on penalties again
. . .
was particularly tragic: "We practised penalties so much, I really don't
know what more we could do about it", the Swede said.
Stimulus Set C: believe luck
Hosts Germany turned out to be one of the positive surprises of the tour-
nament. When their team was grouped with Costa Rica, Poland and Ec-
uador in last December's draw,
a. the German fans did not believe their fortune in the draw . . .
b. the German fans did not believe their luck in the draw . . .
c. the German fans could hardly believe their luck in the draw . . .
would take them anywhere past the first knock-out round, and surely no-
body expected Jürgen Klinsmann's team to beat an opponent like Argen-
tina. Five matches into the cup it was 5–3 to Germany on penalties and it
looked like nothing could keep them from storming into the final.
References
Bybee, Joan
2006 From usage to grammar: the mind's response to repetition. Language 82(4),
711–733.
Cuetos, Fernando, Don C. Mitchell and Martin Corley
1996 Parsing in different languages. In: Manolo Carreiras, Jose Garcia-Albea,
and N. Sabastian-Galles (eds.), Language processing in Spanish. Hillsdale,
NJ: Erlbaum, 145–187.
Elman, Jeffrey L., Mary Hare, and Ken McRae
2004 Cues, constraints, and competition in sentence processing. In: Tomasello,
Michael, and Dan Slobin (eds.), Beyond Nature-Nurture: Essays in Honor
of Elizabeth Bates. Mahwah, NJ: Lawrence Erlbaum Associates, 111–138.
Ferreira, Fernando, and Charles J. Clifton
1986 The independence of syntactic processing. Journal of Memory and Language
25, 348–368.
Frazier, Lyn
1987 Sentence processing: A tutorial review. In Max Coltheart (ed.), Attention
and performance XII: The psychology of reading. Hillsdale, NJ: Lawrence
Erlbaum Associates, 559–586.
Frazier, Lyn and Janet Dean Fodor
1978 The sausage machine: A new two-stage parsing model. Cognition 6, 291–325.
Garnsey, Susan M., Neal J. Pearlmutter, Elizabeth M. Myers, and Melanie A. Lotocky
1997 The contributions of verb bias and plausibility to the comprehension of tem-
porarily ambiguous sentences. Journal of Memory and Language 37(1),
58–93.
Goldberg, Adele E.
2006 Constructions at Work: The Nature of Generalization in Language. Oxford:
Oxford University Press.
Hare, Mary L., Ken McRae and Jeffrey L. Elman
2003 Sense and structure: Meaning as a determinant of verb subcategorization
preferences. Journal of Memory and Language 48(2), 281–303.
2004 Admitting that admitting verb sense into corpus analyses makes sense. Lan-
guage and Cognitive Processes 19(2), 181–224.
Jurafsky, Daniel
1996 A Probabilistic Model of Lexical and Syntactic Access and Disambiguation.
Cognitive Science 20, 137–194.
Langacker, Ronald W.
2000 A dynamic usage-based model. In: Barlow, Michael and Suzanne E.
Kemmer (eds.), Usage-Based Models of Language. Stanford: CSLI Publica-
tions, 24–63.
Lorch, Robert F. and Jerome L. Myers
1990 Regression analyses of repeated measures data in cognitive research. Journal
of Experimental Psychology: Learning, Memory, and Cognition 16,
149–157.
MacDonald, Maryellen C., Neil Pearlmutter and Mark Seidenberg
1994 Lexical nature of syntactic ambiguity resolution. Psychological Review 101,
676–703.
McDonald, Scott A. and Richard C. Shillcock
2003 Eye movements reveal the on-line computation of lexical probabilities. Psy-
chological Science 14, 648–652.
Pickering, Martin J., Matthew J. Traxler, and Matthew W. Crocker
2000 Ambiguity Resolution in Sentence Processing: Evidence against Frequency-
Based Accounts. Journal of Memory and Language 43(3), 447–475.
Rayner, Keith and Lyn Frazier
1987 Parsing temporarily ambiguous complements. Quarterly Journal of Experi-
mental Psychology 39, 657–673.
Sinclair, John
1987 Collocation: A progress report. In: Steele, Ross and Terry Threadgold
(eds.), Language Topics. Essays in honour of Michael Halliday. Vol. 2. Am-
sterdam: John Benjamins, 319–331.
Sinclair, John
1996 The Search for Units of Meaning. Textus 9, 75–106.
Stefanowitsch, Anatol and Stefan Th. Gries
2003 Collostructions: Investigating the interaction between words and construc-
tions. International Journal of Corpus Linguistics 8(2), 209–243.
Stefanowitsch, Anatol and Stefan Th. Gries
2005 Covarying Collexemes. Corpus Linguistics and Linguistic Theory 1(1), 1–46.
Tanenhaus, Michael K. and John C. Trueswell
1995 Sentence comprehension. In: Miller, Joanne L. and Peter D. Eimas (eds.),
Handbook of perception and cognition Vol. 11: Speech, language and commu-
nication. San Diego, CA: Academic Press, 217–262.
Tanenhaus, Michael K., Michael J. Spivey-Knowlton, and Joy E. Hanna
2000 Modeling thematic and discourse context effects on syntactic ambiguity res-
olution within a multiple constraints framework: Implications for the archi-
tecture of the language processing system. In: Pickering, Martin J., Charles
Clifton and Matthew Crocker (eds.), Architecture and Mechanisms of the
Language Processing System. Cambridge: Cambridge University Press,
90–118.
Trueswell, John C.
1996 The Role of Lexical Frequency in Syntactic Ambiguity Resolution. Journal
of Memory and Language 35(4), 566–585.
Trueswell, John C., Michael K. Tanenhaus, and Christopher Kello
1993 Verb-specific constraints in sentence processing: Separating effects of lexical
preference from garden-paths. Journal of Experimental Psychology: Learn-
ing, Memory and Cognition 19, 528–553.
Trueswell, John C., Michael K. Tanenhaus, and Susan M. Garnsey
1994 Semantic influences on parsing: Use of thematic role information in syntac-
tic disambiguation. Journal of Memory and Language 33, 285–318.
Initial parsing decisions and lexical bias:
Corpus evidence from local NP/S-ambiguities
DANIEL WIECHMANN*
Abstract
Recent research in sentence comprehension suggests that lexically specific
information plays a key role in on-line syntactic ambiguity resolution. On
the basis of an analysis of the local NP/S-ambiguity, the present study of-
fers a corpus-based approach to sentence processing that supports this view.
However, it is proposed that the relevant information used to recover the
syntactic structure of an incoming string of words is not retrieved from indi-
vidual verbs but from a more fine-grained level of form-meaning pairings
that distinguishes different verb senses. The investigation proceeds in two
steps: First, verb-general and sense-specific preferences for nominal and
sentential complementation are induced from corpus data and compared us-
ing odds ratios as a measure of association. Second, correlation analyses
are performed that relate the computed coefficients of association to read-
ing time latencies from a recent self-paced moving window experiment
(Hare et al. 2003). The results corroborate the view that individual verb
senses, rather than individual verbs, guide initial parsing decisions.
Keywords: parsing; lexical guidance; local syntactic ambiguity; distinc-
tive collexeme analysis.
Cognitive Linguistics 19–3 (2008), 447–463
DOI 10.1515/COGL.2008.017
0936–5907/08/0019–0447
© Walter de Gruyter
* I would like to thank Stefan Th. Gries, Holger Diessel and two anonymous reviewers for
discussion and valuable comments on earlier versions of this paper. All remaining weak-
nesses are, of course, my fault alone. Also, I would like to thank Stefanie Wulff for shar-
ing her ICE-GB isomorphic BNC sample with me. Finally, special thanks to Mary Hare
for providing me with the original reading time data of the study reported in Hare et al.
(2003). Correspondence address: Institut für Anglistik und Amerikanistik, Friedrich-
Schiller-Universität Jena, 07743 Jena, Germany. e-mail: daniel.wiechmann@uni-jena.de.
1. Introduction
Comprehending a natural language sentence is a complex process involv-
ing numerous sub-processes below and above the sentence level such as
recognizing words, resolving anaphoric relationships, recognizing figura-
tive language, establishing discourse coherence, and various kinds of in-
ferencing. However, one of the most central tasks is the analysis of the
syntactic structure of the signal, i.e., parsing. In languages like English,
which are morphologically comparatively poor, a perceived string of
words is likely to allow for more than one way of combining lexical units
into larger syntactic structures, which may give rise to local syntactic am-
biguities during on-line processing.
One of the best-studied local syntactic ambiguities involves the alterna-
tion between nominal and sentential complements. In this ambiguity, a
post-verbal NP cannot be straightforwardly interpreted with respect to
the grammatical role that it plays in the sentence since it could either
function as the direct object of the preceding verb or as the subject of an
embedded clause:
(1) a. Inspector Clousseau revealed [NP Dreyfus's intentions].
b. Inspector Clousseau revealed [S [NP Dreyfus's intentions] were
indeed diabolic].
Using ambiguities of the type in (1) as an example, the present study
investigates a particular hypothesis as to how such ambiguities are re-
solved in on-line sentence comprehension. Specifically, what is at issue is
the assumption that the process involves probabilistic subcategorization
preferences that are associated with individual senses of a given verb.
Corpus-linguistic evidence in support of this hypothesis is presented and
compared to recent experimental results from a self-paced reading study
(Hare et al. 2003). With regard to linguistic model-building, the study ar-
gues that conceptions of subcategorization preferences should make refer-
ence to a quite fine-grained level of representation, i.e., the individual
senses of a verb. Methodologically, it is argued that such preferences can
be appropriately estimated by means of quantitative corpus-linguistic
methodologies.
2. The verb sense guidance hypothesis (VSGH)
Early research in the field of sentence comprehension was dominated by
the view that the human comprehension system employs a two-stage se-
rial mechanism with different processes operating at each stage (Fodor
1978): the initial stage uses syntactic category information only and
adopts very general parsing heuristics (like "minimal attachment" or
"late closure") to recover syntactic structures. When the mechanisms of
the initial phase fail to detect the correct structure, the parser employs a
backtracking mechanism to reanalyze the string. In this second stage, in-
formation from several sources (e.g., semantic or discourse pragmatic
properties) is integrated into the structure-building process.
As syntactic theories put more and more emphasis on lexical represen-
tations (cf. Chomsky 1970; Jackendoff 1975), psycholinguistic research,
too, supplied more and more evidence for a parsing mechanism that is
guided by lexically specific information. In "lexical guidance" accounts
of sentence comprehension (Ford et al. 1982; Mitchell 1994), it is com-
monly assumed that particular lexical items, most notably verbs, exhibit
individual preferences for possible subcategorization patterns and that
these preferences enable the comprehension system to anticipate likely
structural continuations. Such accounts predict that sentences should be
easy to process if a verb's structural expectations are met, and harder to
process if such expectations are violated. Consequently, these accounts
predict that the sentences in (2) differ significantly in terms of processing
difficulty:
(2) a. Inspector Clousseau suspected Sir Charles Litton was the
phantom.
b. Inspector Clousseau remembered Sir Charles Litton was the
phantom.
c. Inspector Clousseau suspected Sir Charles Litton all along.
d. Inspector Clousseau remembered Sir Charles Litton only vaguely.
Specically, 2a and 2d should be easier to process than 2b and 2c, re-
spectively, because the structural continuations are in accordance with the
preferences of the verbs in these examples: remember is biased towards
nominal complements, whereas suspect prefers sentential continuations.
There is compelling evidence for such a lexically driven parsing mecha-
nism, which I will only briefly sketch here: Fodor (1978) predicted that a
verb's preference for transitive or intransitive complementation could in-
fluence the initial parsing decision of whether a gap should be postulated
after the verb. Ford et al. (1982) generalized Fodor's ideas and claimed
that each verb has associations of differing strengths to all its possible
subcategorization frames. These strengths reflect a combination of verb
frequency and contextual factors and are exploited to build up expec-
tations that are used in parsing. Ford et al. tested this hypothesis in an
off-line experiment in which subjects were asked to make a forced choice
between two possible interpretations of an ambiguous sentence. It could
be shown that a set of subcategorization preferences could be used to pre-
dict subjects' choices. Although Ford and colleagues did not test for fre-
quency effects themselves, it was later shown that the biases assumed in
their study corresponded to frequencies in the Brown corpus (Jurafsky
1996). Clifton et al. (1984) tested the approach by using the frequency
norms collected by Connine et al. (1984) and showed that these fre-
quencies could be used for predicting differences in processing difficulty.
Tanenhaus et al. (1985) demonstrated that fronted direct objects resulted
in longer reading times for verbs with a transitive bias, but not for verbs
that preferred intransitive use. Trueswell et al. (1993) used a cross-modal
naming paradigm to show that frequency-based subcategorization prefer-
ences are relevant for on-line disambiguation. MacDonald et al. (1994)
reported that the lexical bias effect was also detectable with main verb/
reduced relative clause ambiguities. Jennings et al. (1997), in an exten-
sion of Trueswell et al. (1993), used a similar cross-modal naming ex-
periment and focused on an alleged design flaw in that experiment: up
to this point, previous studies had binned the verb preferences into just
two classes (high and low frequency). Jennings and colleagues demon-
strated a correlation between the strength of the bias and reading time at
the target word such that the stronger the bias, the larger the advantage
they found in naming latency for the preferred over the non-preferred
continuation.
However, it has been suggested that verb-specific preferences are not
quite fine-grained enough: many verbs can express different meanings
which in turn may be associated with different argument structure config-
urations. Consider the examples in (3):
(3) a. Peter [VP [V admitted] [NP his ex-girlfriend] [PP to the club]].
b. Peter [VP [V admitted] [S [NP his ex-girlfriend] was hotter than his
current one]].
c. Peter [VP [V admitted] [NP his error]].
The verb admit in (3a) roughly means "grant entry" and takes NP ob-
jects only, whereas in (3b) and (3c) it means roughly "acknowledge to
be true" and can take either nominal or sentential complements. Recent
studies have therefore addressed the possibility that subcategorization
preferences are in fact sense-contingent: Argaman and Pearlmutter (2002)
showed that verbs and their derived nominals, which presumably share a
number of semantic features, have similar subcategorization probabil-
ities. This suggests that the semantic properties of a verb influence its sub-
categorization choice. Hare et al. (2003) conducted a self-paced moving
window experiment to investigate this possibility. They found increased
reading times in cases in which the structural expectation after the crucial
NP was not met, concluding that "[r]eaders were influenced by structural
expectations contingent on verb sense" (Hare et al. 2003: 294; see also
Hare et al. 2004). This hypothesis can be formulated as follows:
Verb Sense Guidance Hypothesis (VSGH)
Each conventionalized verb sense carries probabilistic information ex-
pressing its bias for possible argument structure configurations. This in-
formation is used to guide early parsing decisions.
The present study investigates whether the VSGH can be corroborated
from a corpus-linguistic point of view. It is divided into two parts: First, a
distinctive collexeme analysis (henceforth DCA; Gries and Stefanowitsch
2004) is conducted to assess form-based and sense-contingent preferences
for 20 verbs in a balanced 17-million-word sample of the British Na-
tional Corpus (BNC). This analysis supplies for each verb (sense) an as-
sociation score expressing the degree to which a given verb form or verb
sense prefers one of the two relevant complementation patterns. Second,
these results are compared with experimental findings from the self-paced
reading study reported in Hare et al. (2003) by computing correlation
analyses for the results of the DCA and the reading-time deltas measured
by Hare and colleagues.
3. Form-based vs. sense-contingent preferences
There are two ways of estimating lexical preferences: they can either be
assessed experimentally, e.g., by means of sentence completion tasks
(e.g., Garnsey et al. 1997) or sentence production tasks (e.g., Connine et
al. 1984), or via corpus investigation.
1
The two methods have different
strengths and weaknesses: experimental techniques permit the investiga-
tion of a single factor in isolation by allowing the researcher to control,
in principle, all known factors that are not addressed in a given design.
By contrast, corpus data usually consist of samples of naturally occurring
language that is embedded in real-life communicative situations and thus
influenced by a multitude of factors which cannot easily be identified.
However, the naturalistic quality of corpus data is also what makes them
so attractive: experimental settings can easily produce linguistic artifacts
that are detached from the constraints of normal discourse. For instance,
since the meaning of the sentences to be produced is largely irrelevant,
participants in sentence completion tasks might prefer short variants
1. Garnsey and colleagues used a proper name followed by a verb, as in "Debbie remem-
bered", and asked subjects to complete this fragment. In Connine et al. (1984), sub-
jects were presented with a verb and were asked to write down a sentence containing
that verb.
over longer ones simply to minimize their effort. However, in real-life sit-
uations speakers are of course bound to their communicative intentions
and must thus use forms which are appropriate for the speech act to be
performed. Given these respective strengths and weaknesses of experi-
mentally and corpus-derived norms, it appears obvious that they should
be employed in a complementary way. Nevertheless, as has been pointed
out elsewhere (cf., e.g., Tummers et al. 2005), it is necessary to engage
in rigorous, quantitative methodologies to make full use of the corpus-
linguistic potential.
3.1. Assessing form-based preferences
3.1.1. Method. The present study employs a variant of collostruc-
tional analysis (cf. Stefanowitsch and Gries 2003 for detailed discus-
sion), a family of collocational techniques that was developed to inves-
tigate the relationship between syntax and lexis. Formulated in the
framework of construction grammar (Goldberg 1995; Lakoff 1987), it ad-
dresses the interaction of linguistic signs of various levels of abstraction,
e.g., lexical items and abstract argument structure constructions. The
degree of association between such constructions (i.e., metaphorically,
the "glue" between these units) is referred to as their "collostruction
strength". One of the variants of this method, distinctive collexeme anal-
ysis, employs the general logic of the approach to compare a given
word's relative attraction to a set of constructional variants in which this
item can occur. In other words, it offers a way to measure a verb's relative
preference for a given set of complementation options. In the present
study, these alternatives are the nominal and the sentential complementa-
tion pattern that compete in the resolution of NP/S-ambiguities. As re-
gards the lexical items to be investigated in these constructions, the study
covers all of the 20 verbs used in the reading experiment by Hare and col-
leagues (i.e., acknowledge, add, admit, anticipate, bet, claim, confirm, de-
clare, feel, find, grasp, indicate, insert, observe, project, recall, recognize,
reflect, report and reveal), each of which can occur with both nominal
and sentential complements.
The data were extracted from a balanced 17-million-word sample of the
British National Corpus which was compiled to be isomorphic to the
British component of the ICE corpus.
2
Of interest were all instances of
these verbs that are immediately followed by a noun phrase. The study is
restricted to past tense forms of the verbs and lexical rather than prono-
minal NPs (pronominal realizations of the relevant NP were excluded be-
cause they are formally marked for case and thus do not give rise to NP/
S-ambiguities).
3
2. For detailed information about the properties of that corpus cf. Nelson (1996).
As expected, the investigated verbs had markedly different frequencies
in the corpus. In order to attain a data set of manageable size, the follow-
ing procedure was applied:
- for verbs with a token frequency greater than 3,000, a random 10%
  sample was extracted
- for verbs with a token frequency between 300 and 3,000, a random
  sample of 300 items was extracted
- for verbs with a token frequency lower than 300, all occurrences were
  extracted
This gave a set of 4,960 data points, which was then coded by hand for the
grammatical role of the post-verbal NP. The labels "NP" and
"S" were used to indicate nominal and sentential complementation, re-
spectively. Cases that could not be assigned to either of these two catego-
ries received the label "other".
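The three-way sampling rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sample_verb_tokens` is hypothetical, the band "between 300 and 3,000" is read as inclusive at both ends, and the 10% sample is rounded down; none of these details are specified in the text.

```python
import random


def sample_verb_tokens(tokens, rng=None):
    """Return the subset of corpus hits kept for one verb.

    Assumed reading of the procedure: more than 3,000 hits -> random 10%
    sample; 300 to 3,000 hits -> random sample of 300; fewer than 300
    hits -> keep everything.
    """
    rng = rng or random.Random(0)  # fixed seed only to make the sketch repeatable
    n = len(tokens)
    if n > 3000:
        return rng.sample(tokens, n // 10)  # 10% sample, rounded down (assumption)
    if n >= 300:
        return rng.sample(tokens, 300)
    return list(tokens)


# Hypothetical hit lists standing in for concordance lines of three verbs.
for hits in (5000, 1000, 120):
    kept = sample_verb_tokens(list(range(hits)))
    print(hits, "->", len(kept))
```

Sampling without replacement (`random.sample`) is used here so that no corpus hit can enter the data set twice.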
Once extracted and coded, the data were submitted to the DCA
in order to compute association strengths between a given verb and the
two syntactic patterns. The figures that were required for this calculation
are given in Table 1.
Required figures include the observed frequencies of verb V in either of
the two constructions (O11, O21) as well as the observed frequencies of
these constructions occurring with other verbs (O12, O22). The labels
R1, R2 and C1, C2 stand for row and column totals and N denotes over-
all frequency, i.e., O11 + O12 + O21 + O22. Given these frequencies, the
relative attraction between verbs and the two constructions in question
can be computed. Generally speaking, candidate measures of the prop-
erty of interest (association strength) compare the observed distribution
with the expected distribution under the assumption of statistical indepen-
dence and evaluate how much evidence the observed distribution provides
Table 1. Input distributions
                  verb V    other verbs    row total
nominal OBJ       O11       O12            R1
sentential OBJ    O21       O22            R2
column total      C1        C2             N
3. The analysis was restricted to past tense forms because Hare et al. (2003) used these
forms in their experiment as well.
against this assumption. On closer inspection, however, it is far from triv-
ial to determine exactly what measure is best suited to adequately express
degrees of association between linguistic units (cf. Evert 2004; Wiech-
mann in press).
4
Following Gries (2006), the present study makes use of
a discounted odds ratio to express collostruction strength, because a)
this measure approximates the results of more accurate measures (such
as exact hypothesis tests) fairly well, and b) in contrast to such other mea-
sures, its estimation of the relationship in question is less dependent on
sample sizes.
5
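As a rough illustration of the measure, the following Python sketch computes a discounted log odds ratio from the four cells of Table 1. It is a minimal sketch, not the author's implementation: the function name is hypothetical, the natural logarithm is an assumption (the text only says the scores are logarithmically scaled), and the example counts are invented.

```python
import math


def discounted_log_odds_ratio(o11, o12, o21, o22):
    """Log odds ratio for a 2x2 table laid out as in Table 1.

    o11/o21: verb V with nominal/sentential complements;
    o12/o22: all other verbs with nominal/sentential complements.
    Adding 0.5 to every cell keeps the ratio finite when a cell is zero.
    """
    a, b, c, d = (x + 0.5 for x in (o11, o12, o21, o22))
    return math.log((a * d) / (b * c))  # natural log is an assumption


# Invented counts: a verb that never takes a sentential complement
# still receives a finite, positive (NP-biased) score.
print(discounted_log_odds_ratio(20, 100, 0, 200))
# A verb that mostly takes sentential complements scores below zero.
print(discounted_log_odds_ratio(5, 200, 50, 100))
```

With this cell layout, positive scores correspond to a preference for nominal complements and negative scores to a preference for sentential complements.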
3.1.2. Results. Table 2 and Figure 1 present the results of the DCA,
specically the preference of a given verb for NP-complementation. The
4. Evert (2004) provides a comprehensive overview of measures proposed in the computa-
tional and corpus-linguistic literature and discusses their mathematical properties and
areas of application. Wiechmann (in press) evaluates 47 scores of different mathematical
types with respect to how well they predict the eye-tracking data reported in Kennison
(2001).
5. The discounted variant of the odds ratio adds 0.5 to each factor in order to avoid in-
finite values.
Table 2. Verb preferences for nominal complements
Verb            form-based bias (log odds ratios)
confirm         -3.66
feel            -2.04
anticipate      -1.35
recall          -1.20
acknowledge      0.11
reflect          0.27
bet              0.30
reveal           0.38
claim            0.59
recognize        0.64
indicate         0.89
insert           1.30
observe          1.38
grasp            1.40
project          1.40
add              2.22
declare          2.36
admit            2.62
report           3.38
find             4.28
left column in Table 2 lists the investigated verbs and the right column
specifies the corresponding association strength coefficients, i.e., the re-
spective (logarithmically scaled) odds ratios. These express the degree to
which a given verb prefers one of the two patterns: the higher the score,
the stronger the preference for NP-complementation. Negative values in-
dicate that a verb is biased towards sentential complementation.
3.1.3. Discussion. Figure 1 reveals that the investigated verbs differ no-
ticeably with regard to their structural preferences. Only four of the 20
verbs (confirm, feel, anticipate, recall) do in fact show a preference for
sentential complementation. All remaining verbs have at least a tendency
to prefer nominal complements. The overall preference for nominal com-
plementation of these 20 verbs reflects a general or global tendency of
English to favor simple monotransitive patterns. Other things being
equal, comprehenders are thus more likely to expect NP continuations,
simply because the global transitivity bias acts on the comprehension sys-
tem even before the verb is perceived. Consequently, verbs must ex-
hibit rather strong preferences for sentential complementation to counter
this effect.
Figure 1. Verb preferences
3.2. Assessing sense-contingent preferences
3.2.1. Method. Different senses of the investigated verbs were identi-
fied in a lexical database, WordNet 2.0, which was also used in Hare et
al.'s (2003) study.
6
Each of the 4,960 items in the data set was assigned to
the sense that was considered to provide the best fit relative to the list
of senses proposed in WordNet.
7
To give an example, there were 656 occurrences of [find NP] in the
data. 608 tokens of these involve nominal complementation and 48 in-
stances involve sentential complementation. A semantic subclassification
of these uses revealed that 210 instances out of the 608 nominal tokens
are instantiations of sense 1 (FIND_1) in WordNet, which is described as
"verb of possession; come upon after searching". Sense FIND_1 does not
occur with sentential complements. This contrasts with sense FIND_2,
glossed as "come to believe on the basis of emotions, intuitions, or indef-
inite grounds" in WordNet, which is instantiated 180 times in the sample
and has 137 occurrences in the nominal and 43 occurrences in the senten-
tial pattern. The remaining tokens of find realize yet other senses of the
verb, for which as many as 16 distinct senses are distinguished in Word-
Net (however, FIND_1 and FIND_2 are the most frequent and semanti-
cally different ones and account for roughly 60% of the data).
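The counts in this example can be checked with a short tally. The sketch below simply redoes the arithmetic from the figures quoted above; the variable names are illustrative, and the split of the remaining tokens across patterns is derived here rather than reported in the text.

```python
# Counts for "find" as given in the text.
total_tokens = 656
by_pattern = {"NP": 608, "S": 48}
find_1 = {"NP": 210, "S": 0}   # "come upon after searching"
find_2 = {"NP": 137, "S": 43}  # "come to believe ..."

# Tokens left over for all other WordNet senses, per pattern (derived).
other_senses = {p: by_pattern[p] - find_1[p] - find_2[p] for p in by_pattern}

# Share of the data covered by the two contrasted senses.
share_two_senses = (sum(find_1.values()) + sum(find_2.values())) / total_tokens

print(other_senses)
print(f"{share_two_senses:.1%}")
```

The two senses together cover 390 of the 656 tokens, which is consistent with the "roughly 60%" stated above and leaves 266 tokens for the remaining senses.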
Having classified the data in this manner for all 20 verbs, the syntactic
preference of a given verb sense could then be estimated by submitting the
distributional information to a second DCA. For each verb, two senses
(namely the ones that fit the semantics of Hare et al.'s context sentences)
were contrasted.
3.2.2. Results. Table 3 presents the odds ratios expressing the sense-
contingent collostruction strengths:
As above, positive scores indicate a preference for nominal comple-
mentation and negative values indicate a preference for sentential
complementation.
6. WordNet was compiled by a group of psycholinguists at Princeton University in 1985,
and elaborated ever since, as an attempt to investigate lexical memory. For more infor-
mation on WordNet, cf. Fellbaum (1998).
7. The assignment of WordNet senses to a large set of novel examples is not unproble-
matic, because the sense distinctions in WordNet are very fine-grained. As a result a certain
degree of misclassification had to be accepted. Note, however, that the most important
semantic distinction concerns very coarse-grained contrasts: Hare and colleagues chose
senses from WordNet in such a way that "[f]or each of the 20 verbs, we identified two
senses that appeared to be sufficiently distinct, that we believe are known to undergrad-
uates, and that allow different subcategorization frames according to WordNet" (p. 285).
Figure 2 presents the results for both form-based and sense-contingent
preferences in graphical form.
3.2.3. Discussion. The results show that form-based and sense-
contingent preferences may differ both quantitatively, i.e., in terms of as-
sociation strength (cf. e.g., bet or reveal), and qualitatively, i.e., in terms
of the preferred pattern at large (cf. e.g., admit or confirm). The fact that
the subcategorization preferences are different for different meanings ex-
pressed by a given verb form corroborates the position advocated in
Hare et al. (2004) that psychological models and, consequently, experi-
mental protocols using subcategorization preferences should take verb
senses into account. However, in order to assess their relevance for as-
pects of on-line processing, it is necessary to compare these off-line data
to appropriate experimental observations.
3.3. Comparing corpus-based and experimental findings
In order to test whether the employed method, distinctive collexeme anal-
ysis, can be fruitfully applied to estimate speakers' on-line processing
preferences, the computed association scores were compared with the
reading time latencies of the individual items observed by Hare and
colleagues.
Table 3. Form-based vs. sense-contingent preferences
Verb            form-based    sense1    sense2
confirm         -3.66         1.63      3.22
feel            -2.04         2.15      0.96
anticipate      -1.35         0.21      2.55
recall          -1.20         0.35      1.22
acknowledge      0.11         0.35      1.76
reflect          0.27         1.82      1.57
bet              0.30         4.38      1.39
reveal           0.38         0.38      0.21
claim            0.59         0.53      1.53
recognize        0.64         0.91      1.61
indicate         0.89         0.25      0.91
insert           1.30         0.93      0.79
observe          1.38         0.98      1.33
grasp            1.40         0.07      0.85
project          1.40         0.73      2.39
add              2.22         1.27      0.98
declare          2.36         0.75      0.39
admit            2.62         1.08      0.87
report           3.38         1.47      1.04
find             4.28         0.02      1.04
Before I present the results, it will be helpful to provide a more detailed
description of the experiment in question. As indicated, the study was de-
signed to test whether a verb's sense-contingent subcategorization bias is
exploited during on-line processing, specifically for the resolution of tem-
porary NP/S-ambiguities. Participants were asked to read two sentences:
a context sentence and the actual target sentence, which incorporated the
investigated verb and always involved a sentential continuation. The con-
text sentences were designed so as to evoke a scenario compatible with one
of two maximally different senses of the verb under investigation.
8
Having
read the context sentence first, the participants then read through the test
sentence, which was presented one word at a time. As an illustration, con-
sider the stimulus set for the verb find in (4) and (5) (crucial NP italicized):
(4) Condition 1
    a. The intro psychology students hated having to read the assigned text because it was boring.
    b. They found the book was written poorly and difficult to understand.
Figure 2. Form-based vs. sense-contingent preferences
8. The properties of the context sentences were controlled so as not to directly prime the relevant syntactic patterns themselves, i.e., they involved neither an NP V S nor an NP V NP structure.
(5) Condition 2
a. Allison and her friends had been searching for John Grisham's new novel for a week, but yesterday they finally were successful.
b. They found the book was written poorly and were annoyed that
they had spent so much time trying to get it.
Hence, having read up to the investigated verb in the target sentence, subjects were predicted to show a disposition to interpret this verb as instantiating the sense that is compatible with the scenario conveyed by the context sentence, i.e., they should expect an S-continuation once found has been read in (4) and an NP-continuation in (5). The authors predicted a context-by-ambiguity interaction in the disambiguation region (DR) and, in fact, the strongest ambiguity effect could be measured at the second word of that region (i.e., at written in the above example). In other words, an S-biasing context sentence (as in condition 1) should lead to relatively shorter reading times at the second word of the disambiguation region (DR_POS2) of the S-target sentence. Conversely, an NP-biasing context (as in condition 2) should lead to increased reading times at DR_POS2 of the S-target sentence. Averaged across verbs, these predictions were fulfilled.
The present study investigates whether the relevant preferences can be quantified using the collostructional methodology introduced in section 3.1.1. To that end, the sense-contingent preferences as expressed by discounted odds ratios were compared with the reading time latencies at the second word of the disambiguation region. If collostruction strength is in fact a good predictor of the relevant biases, it is expected that there is a correlation between collostruction strength and reading time latency. In other words: the stronger the association with nominal complementation, the greater the ambiguity effect should be. Conversely, a negative correlation is expected if reading time deltas are compared with preferences for sentential complementation, the pattern that was consistently employed in the experimental study by Hare and colleagues.
3.3.1. Method. Correlational analyses were conducted between the computed association scores (discounted odds ratios) and the reading time latencies at DR_POS2, both on the level of lexical form and on the level of lexical meaning, using Spearman's rank order correlation.⁹
9. All statistics were calculated with the R statistics package version 2.2.1.
3.3.2. Results. The analysis revealed a significant negative correlation between sense-contingent preferences and reading time for the second word of the disambiguation region (Spearman's rho = −0.3136; p < 0.05*): the weaker a sense's preference for sentential complementation, the greater the ambiguity effect when this pattern is encountered. No such correlation could be observed for form-based preferences and reading time latencies (Spearman's rho = 0.1172; p = 0.471).
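The rank-order correlation used here can be illustrated with a minimal sketch. The study itself used R; the Python code below is only an illustration, and all names and numbers in it (`s_bias`, `ambiguity_effect_ms`, the five value pairs) are hypothetical and are not taken from the corpus or from the reading-time data. The sketch ranks both variables (tied values share their mean rank) and computes the Pearson correlation of the ranks, which is one standard way of obtaining Spearman's rho. It does not compute a p-value.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical association scores (preference for sentential complements)
# and hypothetical ambiguity effects in milliseconds, one pair per verb sense.
s_bias = [2.1, 0.4, -1.3, 1.0, -0.2]
ambiguity_effect_ms = [5.0, 31.0, 48.0, 12.0, 20.0]

rho = spearman_rho(s_bias, ambiguity_effect_ms)
print(round(rho, 4))
```

With these invented values rho is negative: the weaker the hypothetical preference for sentential complementation, the larger the hypothetical ambiguity effect, which is the direction of the relationship reported above.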
4. Discussion
The present study has provided corpus-linguistic evidence for the existence of detailed sense-specific probabilistic information that is associated with particular lexical forms and that appears to guide the human language comprehension system upon resolving local syntactic ambiguities. In particular, the employed method of distinctive collexeme analysis as well as the selected association strength measure of discounted odds ratios were shown to provide a useful means for inducing the observable biases from corpus data.
Nevertheless, some qualifications are in order: First, although verb sense-specific preferences seem to play an important role in guiding comprehenders' syntactic analysis of a sentence, there are many other factors that are known to influence the ambiguity resolution process, too (cf. MacDonald 1997 for an overview; see also Zeschel, this volume). Furthermore, nothing in the present study excludes the possibility that the relevant expectations are in fact encoded on a more general level (i.e., a level of semantically coherent verb classes) rather than stored separately for particular senses of individual verbs.
However, wherever these preferences are encoded, the observed results tie in nicely with central tenets of usage-based approaches to language. First, usage-based models (Langacker 1988) predict a connection between statistical patterns in the input (to be approximated by studying large-scale balanced corpus data) and the mental representations that are built up in response to speakers' linguistic experience. Second, usage-based approaches to grammar are construction-based by capitalizing on the notion of form-meaning pairings. The present study has presented evidence in support of the idea that a particular type of such form-meaning pairings (i.e., the association between syntactic complementation patterns and particular lexical meanings) indeed plays a role in determining the distribution of verbs with different senses across grammatical constructions and also seems to influence comprehenders' on-line processing decisions when confronted with syntactic ambiguities involving these items.
One recent addition to the family of usage-based theories is Embodied Construction Grammar (Bergen and Chang 2005). Bryant (2003, 2004) has provided a parsing component for this approach, called the constructional analyzer. On this approach, parsing is an analysis process which takes an input utterance in context and determines the set of constructions that are most likely to be responsible for it. The advantage of a construction-based parser is that "[ . . . ] constructions carry both phonological and conceptual content, [and] a construction[al] analyzer [ . . . ] must respect both kinds of constraint" (Bergen and Chang 2005: 172). Constructions and their constraints are regarded not as deterministic but as fitting a given utterance and context to some quantifiable degree. Bryant suggests that constructions and their constraints could be associated with connection weights. The present paper is sympathetic to such a conception of language and suggests that these connection weights can be inferred from collostruction strengths.
Received 20 March 2006
Revision received 1 May 2007
Friedrich-Schiller-Universität Jena, Germany
References
Argaman, Vered and Neal J. Pearlmutter
2002 Lexical semantics as a basis for argument structure frequency biases. In Merlo, Paola and Suzanne Stevenson (eds.), The Lexical Basis of Sentence Processing: Formal, Computational and Experimental Perspectives. Amsterdam: John Benjamins, 303–324.
Bergen, Benjamin K. and Nancy C. Chang
2005 Embodied Construction Grammar in simulation-based language understanding. In Östman, Jan-Ola and Mirjam Fried (eds.), Construction Grammar(s): Cognitive and Cross-Language Dimensions. Amsterdam: John Benjamins, 147–190.
Bryant, John
2003 Constructional analysis. Master's thesis, UC Berkeley.
2004 Scalable Construction-Based Parsing and Semantic Analysis. In Proceedings of the Second International Workshop on Scalable Natural Language Understanding, Boston, 2004.
Chomsky, Noam
1970 Remarks on Nominalization. In Jacobs, Roderick and Peter Rosenbaum (eds.), Readings in English Transformational Grammar. Waltham, MA: Ginn and Co., 184–221.
Clifton, Charles, Lyn Frazier and Cynthia Connine
1984 Lexical expectations in sentence comprehension. Journal of Verbal Learning and Verbal Behavior 23, 696–708.
Connine, Cynthia, Fernanda Ferreira, Charlie Jones, Charles Clifton and Lyn Frazier
1984 Verb frame preferences: Descriptive norms. Journal of Psycholinguistic Research 13(4), 307–319.
Evert, Stefan
2004 The Statistics of Word Cooccurrences: Word Pairs and Collocations. Unpublished doctoral dissertation, University of Stuttgart.
Fellbaum, Christiane (ed.)
1998 WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Fodor, Janet Dean
1978 Parsing strategies and constraints on transformations. Linguistic Inquiry 9, 427–473.
Ford, Marilyn, Joan Bresnan and Ronald M. Kaplan
1982 A competence-based theory of syntactic closure. In Bresnan, Joan (ed.), The Mental Representations of Grammatical Relations. Cambridge, MA: MIT Press, 727–796.
Garnsey, Susan M., Neal J. Pearlmutter, Elizabeth M. Myers and Melanie A. Lotocky
1997 The contribution of verb bias and plausibility to the comprehension of temporarily ambiguous sentences. Journal of Memory and Language 37, 58–93.
Goldberg, Adele E.
1995 Constructions. A Construction Grammar Approach to Argument Structure. Chicago: University of Chicago Press.
2006 Constructions at Work: The Nature of Generalization in Language. Oxford: Oxford University Press.
Gries, Stefan Th. and Anatol Stefanowitsch
2004 Extending collostructional analysis: A corpus-based perspective on alternations. International Journal of Corpus Linguistics 9, 97–129.
Gries, Stefan Th.
forthc. Exploring variability within and between corpora: some methodological considerations. Corpora.
Hare, Mary L., Ken McRae and Jeffrey L. Elman
2003 Sense and structure: Meaning as a determinant of verb subcategorization preferences. Journal of Memory and Language 48(2), 281–303.
2004 Admitting that admitting verb sense into corpus analyses makes sense. Language and Cognitive Processes 19(2), 181–224.
Jackendoff, Ray
1975 Morphological and semantic regularities in the lexicon. Language 51, 639–671.
Jennings, F., Billi Randall and Lorraine K. Tyler
1997 Graded effects of verb subcategory preferences on parsing: Support for constraint-satisfaction models. Language and Cognitive Processes 12(4), 485–504.
Jurafsky, Daniel
1996 A Probabilistic Model of Lexical and Syntactic Access and Disambiguation. Cognitive Science 20, 137–194.
Kennison, Shelia
2001 Limitations on the use of verb information during sentence comprehension. Psychonomic Bulletin and Review 8(1), 132–138.
Lakoff, George
1987 Women, Fire, and Dangerous Things. Chicago: University of Chicago Press.
Langacker, Ronald W.
1988 A usage-based model. Current Issues in Linguistic Theory 50, 127–161.
MacDonald, Maryellen C., Neal Pearlmutter and Mark Seidenberg
1994 Lexical nature of syntactic ambiguity resolution. Psychological Review 101, 676–703.
MacDonald, Maryellen C.
1997 Lexical Representations and sentence processing: An introduction. In MacDonald, Maryellen C. (ed.), Special Issue of Language and Cognitive Processes: Lexical Representations and Sentence Processing. Hove, East Sussex: Psychology Press, 121–136.
Mitchell, Don C.
1994 Sentence Parsing. In Gernsbacher, Morton Ann (ed.), Handbook of Psycholinguistics. San Diego, California: Academic Press, 375–409.
Nelson, Gerald
1996 The Design of the Corpus. In Greenbaum, Sidney (ed.), Comparing English Worldwide: The International Corpus of English. Oxford: Clarendon Press, 27–35.
Stefanowitsch, Anatol and Stefan Th. Gries
2003 Collostructions: Investigating the interaction between words and constructions. International Journal of Corpus Linguistics 8(2), 209–243.
Tanenhaus, Michael K., Greg Carlson, and Mark S. Seidenberg
1985 Do listeners compute linguistic representations? In Dowty, David, Lauri Karttunen and Arnold Zwicky (eds.), Natural Language Parsing: Psychological, Computational, and Theoretical Perspectives. Cambridge: Cambridge University Press, 359–408.
Trueswell, John C., Michael K. Tanenhaus and Christopher Kello
1993 Verb-specific constraints in sentence processing: Separating effects of lexical preference from garden-paths. Journal of Experimental Psychology: Learning, Memory and Cognition 19, 528–553.
Tummers, José, Kris Heylen and Dirk Geeraerts
2005 Usage-based approaches in Cognitive Linguistics: A technical state of the art. Corpus Linguistics and Linguistic Theory 1(2), 225–261.
Wiechmann, Daniel
in press On the computation of collostruction strength: Testing measures of association as expressions of lexical bias. Corpus Linguistics and Linguistic Theory.
Resources
The British National Corpus (Version 1.0)
1995 Oxford University Computing Services for the BNC Consortium. Oxford: Oxford University.
The International Corpus of English: The British Component on PC CD ROM
1998 Survey of English Usage. London: University College London.
Iconicity of sequence: A corpus-based
analysis of the positioning of temporal
adverbial clauses in English
HOLGER DIESSEL*
Abstract
Recent work in functional and cognitive linguistics has argued and presented evidence that the positioning of adverbial clauses is motivated by competing pressures from syntactic parsing, discourse pragmatics, and semantics. Continuing this line of research, the current paper investigates the effect of the iconicity principle on the positioning of temporal adverbial clauses. The iconicity principle predicts that the linear ordering of main and subordinate clauses mirrors the sequential ordering of the events they describe. Drawing on corpus data from spoken and written English, the paper shows that, although temporal clauses exhibit a general tendency to follow the main clause, there is a clear correlation between clause order and iconicity: temporal clauses denoting a prior event precede the main clause more often than temporal clauses of posteriority. In addition to the iconicity principle, there are other factors such as length, complexity, and pragmatic import that may affect the positioning of temporal adverbial clauses. Using logistic regression analysis, the paper investigates the effects of the various factors on the linear structuring of complex sentences.
Keywords: iconicity; temporal adverbial clauses; constituent order; competing motivations; logistic regression.
Cognitive Linguistics 19–3 (2008), 465–490
DOI 10.1515/COGL.2008.018
0936–5907/08/0019–0465
© Walter de Gruyter
* I would like to thank Karsten Schmidtke, Daniel Wiechmann, and especially Beate Hampe for many helpful comments and suggestions. All remaining errors are, of course, mine. Contact address: University of Jena, Institut für Anglistik/Amerikanistik, Ernst-Abbe-Platz 8, 07743 Jena, Germany. Author's email address: holger.diessel@uni-jena.de.
1. Introduction
Adverbial clauses are subordinate clauses that are combined with a main clause in complex sentences. As can be seen in examples (1) to (4), in English the adverbial clause may precede or follow the associated main clause. This raises the interesting question of what motivates the sequential ordering of main and subordinate clauses. When does the adverbial clause precede the main clause and when does it follow it?
(1) If it's a really nice day, we could walk.
(2) I'd quite like to go to Richmond Park because I was reading about it in this novel.
(3) When you get a tax rebate, you get the money back after about a year, don't you?
(4) Weigh up all these factors carefully before you commit yourself to the manoeuvre.
1.1. Competing motivations for the positioning of adverbial clauses
In a recent paper, Diessel (2005) argued that the ordering of main and adverbial clauses is motivated by functional and cognitive pressures from three sources: (1) syntactic parsing, (2) discourse pragmatics, and (3) semantics. Drawing on Hawkins' (1994, 2004) processing theory of constituent order and complexity, he shows that adverbial clauses are easier to process, and thus more highly preferred, if they follow the main clause. According to Hawkins, the human processor prefers linear structures that allow for fast and easy access to the recognition domain. The recognition domain is defined as the string of linguistic elements that must be processed and kept in working memory until the parser has accessed all immediate constituents of a phrase once the mother node of the phrase has been recognized.
Complex sentences consist of two clauses functioning as the immediate constituents of a bi-clausal structure, which is organized by the subordinate conjunction creating the mother node S_complex that dominates the complex sentence construction (cf. Hawkins 1994: 360). If the adverbial clause follows the main clause, the subordinate conjunction establishes the S_complex-node right after the main clause has been processed and before the adverbial clause is accessed, which means that the two immediate constituents of the complex sentence can be attached to their mother node (i.e., S_complex) as soon as this node is constructed. In contrast, if the adverbial clause precedes the main clause, the subordinate conjunction establishes the S_complex-node right at the beginning of the bi-clausal structure, which means that the human parser first has to process the adverbial clause before the second immediate constituent, i.e., the main clause, can be attached to S_complex. Complex sentences with an initial adverbial clause thus have a longer recognition domain (5a) than complex sentences with final adverbial clauses (5b) (cf. Diessel 2005). If the human processor prefers complex sentences with final adverbial clauses, one has to ask what motivates the occurrence of initial adverbial clauses. Why do speakers prepose adverbial clauses if complex sentences with final adverbial clauses are easier to parse?
(5) a. [ When . . . . . . . . . ]_SUB [ . . . . . . . . . . . ]_Main
       recognition domain
    b. [ . . . . . . . . . . . . . ]_Main [ when . . . . . . ]_SUB
       recognition domain
One factor that motivates the preposing of adverbial clauses is their pragmatic function. A number of studies have argued and presented evidence that the discourse function of an adverbial clause varies with its position relative to the main clause (cf. Chafe 1984; Diessel 2005; Ford 1993; Givón 1990: 846–847; Ramsay 1987; Thompson and Longacre 1985; Thompson 1985, 1987; Verstraete 2004). If the adverbial clause follows the main clause it tends to provide new information, or else functions as an afterthought; but if the adverbial clause precedes the main clause, it serves to organize the information flow in the ongoing discourse. As Chafe (1984), Givón (1990: 846–847), and others have argued, initial adverbial clauses provide a guidepost for the interpretation of subsequent clauses; they are often used at the beginning of a new paragraph or a new turn to organize the transition between discourse topics. In other words, the occurrence of initial adverbial clauses is motivated by particular discourse-pragmatic functions. Complex sentences containing initial adverbial clauses can be seen as particular constructions that speakers use to stage information, i.e., to lay a thematic foundation for the following discourse (cf. Ford 1993: Ch 3; Givón 1990: 846–847; Thompson 1987; Verstraete 2004).
However, this general orientation function of initial adverbial clauses does not explain why certain semantic types of adverbial clauses occur in initial position more readily than others. In addition to syntactic parsing and discourse pragmatics, we thus have to consider the meaning of complex sentences to account for the sequential ordering of main and subordinate clauses. In the literature, the following major semantic types of adverbial clauses are usually distinguished: temporal clauses, indicating a temporal relationship between two events; conditional clauses, expressing a condition or prerequisite for the realisation of the main clause event; causal clauses, providing a cause or reason for the proposition expressed in the main clause; result clauses, referring to the result or consequence of the main clause event; and purpose clauses, denoting the goal or purpose of the activity expressed in the main clause (see Quirk et al. 1985: Ch 12 for a detailed discussion of the various semantic types of adverbial clauses).
Using corpus data from both spoken and written genres, a number of studies have demonstrated that temporal, conditional, causal, result, and purpose clauses tend to occur in different positions relative to the main clause (cf. Altenberg 1984; Biber et al. 1999: 820–825; Diessel 1996, 2005; Ford 1993; Quirk et al. 1985: Ch 12; Ramsay 1987). To simplify, conditional clauses usually precede the main clause, temporal clauses are commonly used both before and after the main clause, and causal, result, and purpose clauses predominantly follow the associated main clause.
Interestingly, the same positional patterns have also been observed in many other languages across the world. Investigating the distribution of adverbial clauses in a representative sample of the world's languages, Diessel (2001) identified two common cross-linguistic patterns. There are languages in which all adverbial clauses precede the main clause, unless they are extraposed (e.g., Japanese), and there are languages in which the positioning of adverbial clauses varies with their meaning (e.g., Punjabi). In the latter language type, conditional clauses usually precede the main clause, temporal clauses exhibit a mixed pattern of pre- and postposing, and causal, result, and purpose clauses commonly follow the associated clause (see also Hetterle 2007).
1.2. Iconicity of sequence
Another factor that seems to influence clause order is iconicity. The notion of iconicity comprises two basic types, diagrammatic iconicity, which is concerned with structural (or relational) similarities between the sign and the referent, and imagic iconicity, which is concerned with substantial similarities between the sign and the referent (e.g., sound symbolism). The notion of diagrammatic iconicity has been used in various functional and cognitive explanations of linguistic structure (cf. Croft 2003: Ch 4.2; Dressler 1995; Fenk-Oczlon 1991; Givón 1985, 1991; Haiman 1980, 1983, 1985, 1994, 2006; Haspelmath forthc.; Itkonen 2004; Jakobson 1965[1971]; Plank 1979; Tabakowska et al. 2007; Taylor 2002: 45–48). The general idea behind [diagrammatic] iconicity is that "the structure of language reflects in some way the structure of experience" (Croft 2003: 102); but this general notion of iconicity subsumes a wide variety of different meanings.¹ In this paper, I concentrate on a particular subtype of diagrammatic iconicity, iconicity of sequence, which refers to the sequential ordering of linguistic elements in discourse and complex sentences. Note that, unlike other types of iconicity, this kind of iconic motivation cannot be explained by frequency of occurrence (cf. Haspelmath 2008) or effort reduction (cf. Haiman 2006).
There are a number of studies suggesting that clause order in complex
sentences is usually iconic. For instance, Lehmann (1974) and Haiman
(1978, 1983) argued that conditional clauses tend to precede the main
clause because conditional clauses refer to an event that is conceptually
prior to the one expressed in the main clause; Greenberg (1963 [1966])
proposed that purpose clauses follow the main clause because they denote
the intended endpoint or result of the activity expressed in the associated
clause (cf. Schmidtke in press); and Clark (1971) argued that after-clauses
precede the main clause more often than before-clauses, because after-
clauses refer to an event that occurs prior to the one in the main clause,
whereas before-clauses refer to a posterior event (cf. Diessel 2005).
While all of these studies suggest that iconicity of sequence is an important determinant of the linear structuring of complex sentences, it must be emphasized that the distributional properties of certain semantic types of adverbial clauses are not consistent with the iconicity principle. In particular, the positioning of causal clauses violates the iconicity of sequence. Although causes and reasons are conceptually prior to the effect expressed in the main clause, causal clauses tend to occur sentence-finally (cf. Altenberg 1984; Diessel 2001, 2005; Ford 1993: Chs 3–4; Hetterle 2007). Across languages, causes and reasons are commonly expressed in constructions that follow the semantically associated clause, suggesting that iconicity of sequence is not relevant for the positioning of causal clauses. Diessel (2006) argues that the tendency of causal clauses to follow the main clause is motivated by the fact that causal clauses are primarily used to back up a previous statement that the hearer may not accept or may not find convincing.
1. In a recent review of the literature, Haspelmath (2008) identified eight different subtypes of (diagrammatic) iconicity: (1) iconicity of quantity (greater quantities are expressed by more linguistic structure), (2) iconicity of complexity (more complex meanings are expressed by more complex forms), (3) iconicity of cohesion (semantic cohesion is reflected in structural cohesion), (4) iconicity of paradigmatic isomorphism (one meaning, one form in the system), (5) iconicity of syntagmatic isomorphism (one form, one meaning in the clause), (6) iconicity of sequence (sequences of form match sequences of experiences), (7) iconicity of contiguity (semantically associated elements occur adjacent to each other), and (8) iconicity of repetition (repetition in linguistic form reflects repeated experiences).
Moreover, while the positioning of conditional clauses is consistent with the iconicity of sequence, there is an alternative explanation for their distribution. Conditional clauses precede the main clause because they denote a hypothetical situation, providing a conceptual framework (or mental space) for the interpretation of subsequent clauses (cf. Dancygier 1998; Dancygier and Sweetser 2000; Lehmann 1974). If the conditional clause follows the main clause, the hearer may at first misinterpret the preceding main clause as a factual statement. Since the revision of a previous utterance increases the processing load, there is a strong motivation to place conditional clauses before the main clause (cf. Diessel 2005). Thus, it seems that the iconicity principle is not immediately relevant for the positioning of causal and conditional clauses.
Moreover, one might hypothesize that iconicity of sequence, which denotes the temporal dimension of experience, primarily concerns the ordering of temporally related clauses. Previous studies suggest that temporal clauses denoting a prior event precede the main clause more often than temporal clauses of posteriority (cf. Clark 1971; Diessel 2005). But although iconicity of sequence has been widely discussed in the literature, it has never been systematically investigated. It is the purpose of this study to fill this gap. Using corpus data from spoken and written English, the paper presents the first quantitative analysis of the positioning of temporal adverbial clauses to systematically investigate the effect of the iconicity principle on clause order.
2. Analysis
The analysis concentrates on five types of temporal clauses marked by the subordinating conjunctions when, after, before, once, and until. The five conjunctions have been chosen for two reasons: first, they are among the most frequent temporal conjunctions in English, and second, they are semantically especially interesting for the purpose of this study.
When-clauses are interesting because when is the only temporal conjunction in English that does not specify the temporal sequence between main and adverbial clauses. As can be seen in examples (6) to (8), when-clauses denote situations that can occur prior, posterior, or simultaneously to the one expressed in the main clause.
(6) We shall make up our mind when the IMF has reported. [prior]
(7) They had already made breaches in the defensive wall of sand [ . . . ]
when the order came. [posterior]
(8) I did cook occasionally, when they were out. [simultaneous]
The four other conjunctions are interesting because they form semantic pairs: after and before describe a temporal sequence of two events from reverse perspectives (cf. 9–10). After-clauses refer to an event that precedes the one expressed in the main clause, whereas before-clauses refer to a posterior event. The iconicity principle would thus predict that after-clauses precede the main clause more frequently than before-clauses.
(9) a. After her father died, of course, Isabel's trust fund included quite a substantial holding in the company. [prior]
    b. I put Emily back in her own bed, after she'd fallen asleep. [prior]
(10) a. Before the debt crisis set in, Brazil was enjoying growth rates of 7 percent per year. [posterior]
     b. The heat [ . . . ] from the sun is retained by the earth for a while, before it's radiated away. [posterior]
Quirk et al. (1985: 1082) point out that after- and before-clauses are not generally converses of one another. Both clause types have special uses in which the two constructions have different meanings. For instance, a complex sentence with a before-clause referring to a non-factual (or counterfactual) situation does not have the same meaning as the corresponding complex sentence with an after-clause (cf. 11–12); but constructions of this type are rare (examples 11–12 are the only counterfactual before-clauses in the entire database).
(11) a. An Asian man [ . . . ] triggered the alarm before I could stop him.
     ≠ I could stop an Asian man, after he triggered the alarm.
(12) a. Before he could move in for the tackle, Hughes had driven the ball high past Grobbelaar from 25 yards.
     ≠ He could move in for the tackle, after Hughes had driven the ball high past Grobbelaar from 25 yards.
Once and until parallel after and before: an adverbial clause introduced by once refers to a prior event, whereas an adverbial clause marked by until denotes a posterior situation. However, once and until differ from after and before in that they introduce adverbial clauses that are telic: once indicates a designated starting point of the situation expressed in the main clause and until marks its endpoint (cf. 13–14).
(13) a. Once the problem became clear, policy was tightened. [prior]
     b. We'll be pretty busy once our course gets back into full swing. [prior]
(14) a. Until I'd spoken to William Davis I'd no idea that the monarchy was the only bright spot on our horizon. [posterior]
     b. There should be no further cuts in interest rates, until the underlying rate of inflation begins to tumble. [posterior]
Note that all five conjunctions can have non-temporal meanings (cf. Quirk et al. 1985: 1078–1086). When-clauses may have a conditional interpretation, after-clauses are sometimes interpreted with a causal connotation, before-clauses can express a purpose or goal, once-clauses are often conditional, and until-clauses may express a combination of time, purpose, and result. However, these non-temporal semantic features are not or only weakly grammaticalized; they usually emerge as conversational implicatures from the interpretation of temporal clauses in the discourse context.²
2.1. Study 1
2.1.1. Methods. The analysis is based on data from the British Component of the International Corpus of English (ICE-GB). The ICE-GB corpus consists of 1 million words compiled from a wide variety of spoken and written genres. The corpus is tagged and includes detailed information about syntactic structure. For this study, I randomly selected 200 when-clauses, 200 after- and before-clauses (100 after and 100 before), and 200 once- and until-clauses (100 once and 100 until). Half of the data come from spoken discourse, the other half come from written genres. The study is restricted to finite adverbial clauses and disregards participle constructions and gerunds. After the initial search, I excluded all adverbial clauses that were not relevant for the purpose of the current investigation. Specifically, I excluded adverbial clauses that are inserted into the main clause (cf. 15) and adverbial clauses that do not occur with an associated main clause (cf. 16).
(15) And the reason for that before you ask me was that uhm everybody
was confusing my brain.
(16) Uhm half an hour after I leave probably.
Moreover, I excluded adverbial clauses that are related to the main
clause at the speech act level (cf. Hengeveld 1989). There were, for
instance, several before-clauses that speakers used as independent speech
acts to coordinate the interaction between the speech participants (cf.
17–18).
2. In some uses, the non-temporal meanings have been conventionalized as in I would vote
for Kennedy before I vote for Bush; but constructions of this type are rare.
(17) Now before you . . . uhm . . . break into groups and look at the
results of the two analyses and try and see what's going on . . . any
sort of questions?
(18) Uhm well before we get into the detailed discussion of all of this
have you got something else Mary?
Since adverbial clauses of this type do not describe a sequence of two
related events, they were disregarded. Table 1 shows the frequency of the
five conjunctions after the irrelevant items were excluded.
All sentences were manually coded for two features: (1) the position of
the adverbial clause relative to the main clause (initial ADV-clause vs.
final ADV-clause), and (2) the conceptual order of main and adverbial
clauses (prior ADV-clause vs. posterior ADV-clause vs. simultaneous
ADV-clause). The data were coded separately by the author and a student
assistant; intercoder reliability was very high, at almost 100 percent.
Table 1. Raw frequencies
          Spoken  Written  Total
when          94       95    189
after         47       50     97
before        41       46     87
once          48       50     98
until         49       50     99
Total        279      291    570
2.1.2. Results. The majority of temporal clauses follow the main
clause. Overall, there are 166 initial and 404 final adverbial clauses in
the data, i.e., 70.9 percent of the temporal clauses follow the main
clause and only 29.1 percent precede it. Figure 1 shows the proportions
of initial and final adverbial clauses expressing a prior, posterior, or
simultaneously occurring event. As can be seen in the graph, 53.9 percent
(N = 119) of the prior adverbial clauses precede the main clause, 22.2
percent (N = 36) of the simultaneous adverbial clauses are preposed, and
only 5.9 percent (N = 11) of the posterior temporal clauses are placed
before the associated main clause. There is thus a clear correlation
between conceptual order and linear structure: temporal clauses denoting
a prior event precede the main clause more often than temporal clauses
denoting a simultaneously occurring event, which in turn are more
frequently preposed to the main clause than temporal clauses of
posteriority. A 2 × 3 χ²-analysis revealed that the association between
conceptual order and linear structure is significant (χ² = 185.13,
df = 2, p < 0.001).
While there is a preference for an iconic clause order, it must be
emphasized that a substantial number of complex sentences violate the
iconicity of sequence. If we disregard adverbial clauses referring to a
simultaneously occurring event, there are 295 complex sentences with
iconic and 113 complex sentences with non-iconic clause orders, i.e.,
27.7 percent of the temporal clauses examined in this study violate the
iconicity principle. Interestingly, complex sentences containing initial
adverbial clauses are more consistent with the iconicity principle than
complex sentences containing final adverbial clauses. As can be seen in
Figure 2, if the adverbial clause precedes the main clause, 91.5 percent
(N = 119) of all sentences are iconic, but if the adverbial clause
follows the main clause, only 63.3 percent (N = 176) exhibit an iconic
ordering (χ² = 35.25, df = 1, p < 0.001).
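The 2 × 2 comparisons reported in this section can be checked with a few lines of code. The following is a minimal sketch, assuming that the reported values are Pearson χ² statistics computed without a continuity correction; the function name is introduced here for illustration, and the non-iconic cell counts (11 and 102) are derived from the totals given above rather than stated directly in the text.

```python
def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Rows: initial vs. final adverbial clause; columns: iconic vs. non-iconic.
# 119 and 176 are the iconic counts given in the text; 11 and 102 are
# derived from the 295 iconic and 113 non-iconic sentences overall.
chi2 = pearson_chi2_2x2(119, 11, 176, 102)
print(round(chi2, 2))  # 35.25
```

The same function applied to the cell counts for after/before and once/until (initial and final frequencies per conjunction) yields the statistics reported for those comparisons.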
Since the positioning of temporal adverbial clauses varies with the
subordinate conjunction, I also examined the positional patterns of
individual types of temporal clauses. As can be seen in Table 2,
when-clauses tend to follow the main clause: 51 when-clauses precede the
main clause and 138 when-clauses occur after it. The majority of the
when-clauses denote a situation that occurs simultaneously with the one
expressed in the main clause. As can be seen in this table, there are 26
prior when-clauses, 162 simultaneous when-clauses, and only 1 posterior
when-clause.
The positioning of the when-clause correlates with the conceptual order:
57.7 percent of the prior when-clauses precede the main clause, but only
22.2 percent of the simultaneously occurring when-clauses are preposed.
Leaving aside the one posterior when-clause, a 2 × 2 χ²-analysis revealed
a significant association between linear structure and conceptual order
(χ² = 14.26, df = 1, p < 0.001), confirming the hypothesis that clause
order is iconic.
Figure 1. Conceptual order and linear structure
Figure 2. Clause order and iconicity
Table 2. When-clauses: conceptual order and linear structure
Linear order  Prior  Simultaneous  Posterior  Total
Initial          15            36          0     51
Final            11           126          1    138
Total            26           162          1    189
Like when-clauses, after- and before-clauses tend to occur at the end of
a complex sentence. As can be seen in Table 3, there are 151 final and
only 33 initial after- and before-clauses in the data. Of the initial
subordinate clauses, 27 are introduced by after and only 6 are introduced
by before. A 2 × 2 χ²-analysis revealed a significant association between
clause order and clause type (χ² = 13.66, df = 1, p < 0.001), suggesting
that the conceptual order expressed by after and before influences the
ordering of the main and the subordinate clauses. Note, however, that
complex sentences with initial adverbial clauses are more often iconic
than complex sentences with final adverbial clauses: 81.8 percent
(N = 27) of the sentences with initial after- and before-clauses are
iconic, but only 53.6 percent (N = 81) of the sentences with final
adverbial clauses are consistent with the iconicity of sequence
(χ² = 8.868, df = 1, p < 0.003).
Table 3. After- and before-clauses: conceptual order and linear structure
Linear order  after  before  Total
Initial          27       6     33
Final            70      81    151
Total            97      87    184
Interestingly, before-clauses functioning as independent speech acts (see
examples 17–18 above) always precede the main clause: there are five
before-clauses of this type in the data and all five clauses occur before
the main clause. However, even if we include speech act before-clauses
in the analysis, adverbial clauses marked by after precede the main
clause significantly more often than adverbial clauses marked by before
(χ² = 7.411, df = 1, p < 0.006).
Finally, once and until parallel after and before in that they indicate a
temporal sequence between two situations: once-clauses are conceptually
prior to the event in the main clause, while until-clauses denote a
posterior situation. However, the distributional contrast between once
and until is much more pronounced than the distributional contrast
between after and before. As can be seen in Table 4, 77 once-clauses
precede the main clause but only 5 until-clauses are preposed. A 2 × 2
χ²-analysis revealed that the distributional difference between once and
until is highly significant (χ² = 109.56, df = 1, p < 0.001). Once again,
the iconicity principle is more consistent with complex sentences
containing initial adverbial clauses than with complex sentences
containing final adverbial clauses: 93.9 percent (N = 77) of the initial
adverbial clauses occur in complex sentences that are iconic, but only
81.7 percent (N = 94) of the final adverbial clauses are embedded in an
iconically structured sentence (χ² = 6.182, df = 1, p < 0.013).
Table 4. Once- and until-clauses: conceptual order and linear structure
Linear order  once  until  Total
Initial         77      5     82
Final           21     94    115
Total           98     99    197
To summarize, we have seen that the positioning of temporal adverbial
clauses varies with conceptual order: temporal clauses denoting a prior
event precede the main clause significantly more often than temporal
clauses denoting a simultaneous event, which in turn are more frequently
preposed to the main clause than temporal clauses of posteriority. The
analysis suggests that iconicity of sequence has a significant effect on
the positioning of temporal adverbial clauses in English. However, the
data also reveal that the iconicity principle cannot be the sole
determinant of the sequential structuring of complex sentences because
27.7 percent of the sentences with prior or posterior adverbial clauses
examined in this study do not have an iconic clause order; that is, more
than a quarter of these sentences violate the iconicity of sequence.
Moreover, the iconicity principle does not explain why complex sentences
with initial adverbial clauses are more often iconic than complex
sentences with final adverbial clauses (cf. Figure 2), and why the
positioning of the temporal adverbial clause varies with the subordinate
conjunction. For instance, although both after and once introduce prior
adverbial clauses, once-clauses precede the main clause more often than
after-clauses (cf. Tables 3 and 4). In order to account for these
findings, we have to include additional factors in the analysis. The
second study was designed to investigate the combined effect of the
iconicity principle and other factors influencing clause order in
English.
2.2. Study 2
Based on the previous research (see Section 1.1.), we may hypothesize
that in addition to iconicity of sequence the following factors are relevant
for the positioning of temporal adverbial clauses:
1. The semantic relationship between main and adverbial clauses. Complex
sentences containing temporal adverbial clauses often imply a
conditional, causal, or purposive relationship (see above). Since
conditional clauses tend to precede the main clause, while causal and
purpose clauses usually follow it, it is a plausible hypothesis that the
positioning of temporal adverbial clauses is affected by their implicit
meanings. This may account for the distributional differences between
once-clauses, which are often conditional, and after-clauses, which can
be causal.
2. The length of the adverbial clause. It is well-known that heavy
constituents tend to occur sentence-finally (Behaghel 1932). There are
two explanations for this: information structure and syntactic parsing
(see Wasow 2002 for a review of the literature). In the
discourse-pragmatic literature it is commonly assumed that given
information tends to precede new information because new information
needs to be grounded in information that is already known to the hearer.
Since new information needs more explicit coding than given information,
long constituents tend to occur at the end of a sentence (cf. Dik 1989:
351). Alternatively, Hawkins (2004: 104–108) argued that right-branching
languages like English tend to place long constituents at the end of the
sentence because the order short-before-long is easier to parse than the
reverse ordering (see above). Since adverbial clauses are heavy
constituents it is a plausible hypothesis that the predominance of final
temporal clauses results from the weight of these constructions.
Moreover, we may assume that temporal clauses preceding the main clause
tend to be shorter than temporal clauses that follow it (cf. Diessel
2005).
3. The complexity of adverbial clauses. Hawkins (1994, 2004) argued that
constituent order is crucially affected by the structural complexity of
linguistic elements. Specifically, he claimed that in right-branching
languages like English, syntactically complex structures tend to occur
sentence-finally because in final position they are easier to parse.
Since adverbial clauses can vary in terms of their complexity, we may
assume that initial temporal clauses are structurally less complex than
final adverbial clauses.
2.2.1. Methods. In order to test these hypotheses, I conducted a binary
logistic regression analysis, in which all of the above mentioned factors
are taken into account. Logistic regression analysis is an extension of
ordinary regression analysis, in which the dependent variable is
categorical (rather than continuous as in ordinary regression analysis)
(cf. Tabachnik and Fidell 2004: Ch 12; Backhaus et al. 2006: Ch 7). The
goal of binary logistic regression analysis is to predict the value of
the dichotomous dependent variable from one or more predictor variables
that can be continuous, discrete, dichotomous, or a mix of them (cf.
Tabachnik and Fidell 2004: 517; Backhaus et al. 2006: 428).³ In the
current study, logistic regression analysis was used to predict the
position of the adverbial clause (i.e., initial or final) from the
following set of predictors: conceptual order (i.e., iconicity), meaning,
length, and syntactic complexity. Figure 3 shows the research design.
3. Logistic regression analysis involves the same formula as ordinary
regression analysis except that the dependent variable is expressed by
the natural logarithm of the odds, i.e., ln(p/(1 − p)) = a + bx. The odds
provide a probability measure that is defined as the ratio of the
probability that an event A will occur and the probability that the event
A will not occur, i.e., odds = P(A)/(1 − P(A)). The odds must be
distinguished from simple probabilities. For instance, in a corpus of 100
complex sentences with 40 initial adverbial clauses and 60 final
adverbial clauses, the odds of randomly selecting an initial adverbial
clause are 40/60 = 0.666, and the odds of randomly selecting a final
adverbial clause are 60/40 = 1.5. By contrast, the probability of
selecting an initial adverbial clause is 0.4 and the probability of
selecting a final adverbial clause is 0.6. Probability values increase
linearly, but the odds increase exponentially (cf. 10/90 = 0.11,
20/80 = 0.25, 30/70 = 0.43, 40/60 = 0.66, 50/50 = 1, 60/40 = 1.5,
70/30 = 2.3, 80/20 = 4, 90/10 = 9, 95/5 = 19, 99/1 = 99). The natural
logarithm of the odds transfers the exponential curve into a symmetrical
S-curve which defines the two outcomes of a binary logistic regression
analysis (cf. Tabachnik and Fidell 2004: Ch 7; Backhaus et al. 2006:
Ch 7).
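The relation between probabilities, odds, and log-odds that underlies the logistic model can be made concrete with a short sketch. The function names below are illustrative only and are not taken from the study; the numbers reproduce the 40/60 corpus example from note 3.

```python
import math


def odds(p):
    """Odds of an event with probability p (0 < p < 1)."""
    return p / (1 - p)


def log_odds(p):
    """Natural logarithm of the odds (the logit), i.e. ln(p / (1 - p))."""
    return math.log(odds(p))


def probability_from_log_odds(x):
    """Inverse of the logit: maps a log-odds value back to a probability."""
    return 1 / (1 + math.exp(-x))


# Hypothetical corpus of 100 sentences: 40 initial, 60 final adverbial clauses.
p_initial = 40 / 100
p_final = 60 / 100
print(round(odds(p_initial), 3))      # 0.667
print(round(odds(p_final), 3))        # 1.5
print(round(log_odds(p_initial), 3))  # -0.405
print(round(log_odds(p_final), 3))    # 0.405
```

The two log-odds values are symmetric around zero, which is why the logit is a convenient scale for a regression on a binary outcome.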
Conceptual order and syntactic complexity were coded as dichotomous
variables: adverbial clauses denoting a prior event were distinguished
from adverbial clauses denoting a posterior or simultaneously occurring
event, and simple adverbial clauses consisting of a single clause were
distinguished from complex adverbial clauses containing another
subordinate clause. Meaning was coded as a discrete variable with three
levels: (i) purely temporal, (ii) temporal with an implicit conditional
meaning, and (iii) temporal with an implicit causal or purposive meaning.
Finally, length was coded as a continuous variable, measured by dividing
the number of words in the adverbial clause by the total number of words
in the complex sentence.⁴ For all features, intercoder reliability was at
least 95 percent.
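The length measure can be illustrated with example (13a). The sketch below assumes that words are counted by splitting on whitespace, which is an assumption made here for illustration; the text does not specify how word counts were obtained, and the function name is hypothetical.

```python
def relative_length(adverbial_clause, sentence):
    """Share of the sentence's words that belong to the adverbial clause."""
    return len(adverbial_clause.split()) / len(sentence.split())


# Example (13a): the once-clause has 5 of the 8 words in the sentence.
sentence = "Once the problem became clear, policy was tightened."
clause = "Once the problem became clear,"
print(relative_length(clause, sentence))  # 0.625
```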
2.2.2. Results. Table 5 shows the raw frequencies of the categorical
predictors, i.e., conceptual order, complexity, and meaning, and Figure 4
shows the histograms of the continuous predictor, relative length (i.e.,
the ratio of adverbial clause/complex sentence), for final and initial
temporal clauses.
Figure 3. Research design
4. For instance, if the adverbial clause consists of 6 words and the
complex sentence of 13 words, the relative length of the adverbial clause
is 6/13 = 0.4615384, i.e., 46.15 percent.
Note that the frequency distributions are consistent with the proposed
hypotheses: prior temporal clauses precede the main clause on average
more often than posterior and simultaneous temporal clauses. In addition,
Table 5 shows that simple adverbial clauses are more often preposed to
the main clause than complex adverbial clauses (i.e., adverbial clauses
including another subordinate clause), and that temporal clauses with an
implicit conditional meaning tend to precede the main clause, whereas
temporal clauses with an implicit causal or purposive meaning almost
always follow it. The histograms show that the average relative length of
final temporal clauses is greater than the average relative length of
initial adverbial clauses, but the difference is small: if the adverbial
clause follows the main clause the mean relative length of the adverbial
clause is 45 percent, and if the adverbial clause precedes the main
clause the mean relative length of the adverbial clause is 40.5 percent
of the entire sentence.
Table 5. Frequencies of the categorical predictor variables
Variable          Level                      Initial  Final  Total
Conceptual order  1. posterior/simultaneous       47    302    349
                  2. prior                       119    102    221
Complexity        1. simple                      138    309    447
                  2. complex                      28     95    123
Meaning           1. purely temporal              89    299    388
                  2. conditional                  76     52    128
                  3. causal/purposive              1     53     54
Figure 4. Frequency of the relative length of initial and final temporal clauses
In order to test if and to what extent these asymmetries are relevant for
the positioning of temporal adverbial clauses, I conducted a stepwise
logistic regression analysis starting with the maximal model in which all
predictor variables and their interactions are included in the
regression. This model is compared to the null (or empty) model in which
none of the predictor variables is included (cf. Tabachnik and Fidell
2004: Ch 12; Backhaus et al. 2006: Ch 7). In the current study, the
maximal model was significantly different from the null model, indicating
that the predictors as a group reliably distinguish between initial and
final position. However, since the interactions between the various
predictor variables were not significant, they were excluded from the
model (cf. Crawley 2005: 104). In the next step, I computed a regression
model including only the predictor variables without their interactions.
In this model, three of the predictor variables turned out to be
significantly related to the dependent variable, i.e., conceptual order,
meaning, and length. Since syntactic complexity was not significantly
related to clause order, it was removed from the regression model. The
resulting minimally adequate model fit the data significantly better than
the null model (χ² = 174.69, df = 4, p < 0.001) and had almost the same
explanatory power (Nagelkerke's R² = 0.38) as the maximal model
(Nagelkerke's R² = 0.39). The overall prediction accuracy increased from
70.9 percent in the null model to 80 percent in the minimally adequate
model, which is a reasonable improvement given that prediction accuracy
can only increase if the model correctly predicts some of the initial
adverbial clauses (which account for only 29.1 percent of the data).
As in ordinary multiple regression analysis, regression coefficients
indicate the effect of the individual predictor variables on the outcome;
but since the regression coefficients of logistic regression analysis are
difficult to interpret, they are commonly transformed into odds ratios,
which are a measure of effect size that indicates the likelihood of a
particular outcome.⁵ Table 6 provides a summary of the analysis of the
predictor variables in the minimally adequate model.
5. Odds ratios are calculated by dividing the odds of an event occurring
by the odds of another event occurring. For instance, if 65 percent of
the days during one year are sunny and 35 percent are rainy, the odds of
a sunny day are 1.86 and the odds of a rainy day are 0.54 and the odds
ratio (sunny/rainy) is about 3.45, which means that the odds of a sunny
day are about 3.45 times the odds of a rainy day.
The regression coefficients indicate the direction of change induced by a
particular predictor: positive values (which correspond to odds ratios
larger than 1.0) indicate that the predictor variable increases the
likelihood of the adverbial clause to precede the main clause; negative
values (which correspond to odds ratios smaller than 1.0) indicate that
the predictor variable decreases the likelihood of the adverbial clause
to precede the main clause. The Wald χ²-values and the associated levels
of significance indicate that the predictor variables (conceptual order,
meaning, and length) are significant. The odds ratios show the change in
odds for an adverbial clause to be placed in initial position. For
instance, the odds ratio for conceptual order indicates that for
adverbial clauses denoting a prior event the odds of preceding the main
clause are 6.7 times larger than the odds for adverbial clauses denoting
a posterior or simultaneous event. The two final columns show the lower
and upper boundaries of the confidence intervals for the odds ratios (cf.
Backhaus et al. 2006: 475–476).
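The link between a regression coefficient, its Wald statistic, and the odds ratio with its confidence interval can be sketched as follows. This is a minimal illustration, assuming that the intervals in Table 6 are 95 percent Wald intervals and that the Wald χ² for a single-degree-of-freedom predictor equals (B/SE)²; the function name is hypothetical, and the input values are those reported for conceptual order (B = 1.902, Wald χ² = 73.69).

```python
import math


def odds_ratio_with_ci(b, wald, z=1.96):
    """Odds ratio exp(B) with a Wald confidence interval.

    The standard error is recovered from the Wald chi-square, assuming
    wald = (B / SE)^2 for a predictor with one degree of freedom.
    """
    se = abs(b) / math.sqrt(wald)
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)


# Values reported for conceptual order (prior vs. posterior/simultaneous).
odds_ratio, lower, upper = odds_ratio_with_ci(1.902, 73.69)
print(round(odds_ratio, 2), round(lower, 2), round(upper, 2))
```

Under these assumptions the result agrees with the reported odds ratio of 6.70 and its interval up to rounding of the published coefficient and Wald value.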
Note that conceptual order and conditional meaning increase the like-
lihood of the adverbial clause to precede the main clause (compared to
posterior/simultaneous temporal clauses with purely temporal meaning),
whereas a causal/purposive meaning and an increase in length decrease
the likelihood of the adverbial clause to precede the main clause (com-
pared to purely temporal clauses that are shorter). Note also that concep-
tual order, i.e., the encoding of a prior event, is the strongest predictor for
the initial occurrence of a temporal adverbial clause.
Table 6. Results of the logistic regression analysis
Factor              Reg. coef. B  Wald χ²  df      p  Odds ratio  Lower CI  Upper CI
Conceptual order           1.902    73.69   1  0.001        6.70      4.34     10.35
Meaning                             41.07   2  0.001
  a. causal/purpose       −2.775     7.27   1  0.007        0.06      0.01     0.469
  b. conditional           1.364    31.20   1  0.001        3.91      2.42      6.31
Length                    −1.643     7.39   1  0.007        0.19      0.06      0.63
Since the positioning of temporal adverbial clauses varies with the
subordinate conjunction (see above), I also computed regression models
for individual types of temporal clauses. Specifically, I developed three
separate logistic regression models for when-clauses, after- and
before-clauses, and once- and until-clauses using the same stepwise
procedure as in the model described above (Table 7 in the Appendix
provides a summary of the frequency data). Interestingly, while
conceptual order had a significant effect on the positioning of all
temporal clauses (when: χ² = 12.149, df = 1, p < 0.001; after/before:
χ² = 14.504, df = 1, p < 0.001; once/until: χ² = 32.285, df = 1,
p < 0.001), meaning and length were only significant for certain types of
temporal clauses, suggesting that the effect of conceptual order is more
consistent across clause types than the effect of the other predictor
variables. Meaning was significant for the positioning of conditional
once- and until-clauses (χ² = 6.491, df = 1, p < 0.011) and marginally
significant for the positioning of causal/purposive after- and
before-clauses (χ² = 3.601, df = 1, p < 0.061); but although when-clauses
were often used with an implicit conditional meaning, conditionality did
not affect their position (χ² = 9.546, df = 1, p < 0.010). Length was
only significant for once- and until-clauses (χ² = 6.491, df = 1,
p < 0.011), but not for when-, after-, and before-clauses (when:
χ² = 2.000, df = 1, p > 0.157; after/before: χ² = 0.398, df = 1,
p > 0.528).
3. Discussion
The analysis suggests that iconicity of sequence has a strong and
consistent effect on the linear structuring of complex sentences with
temporal adverbial clauses. Temporal clauses referring to a prior event
precede the main clause more often than temporal clauses expressing a
simultaneously occurring event, which in turn precede the main clause
more often than temporal clauses of posteriority. The iconicity of
sequence holds both for complex sentences in which the conceptual order
of main and adverbial clauses is encoded by the subordinate conjunction
(i.e., after-, before-, once-, and until-clauses) and for complex
sentences in which the conceptual order is inferred from the meaning of
the whole sentence because the conjunction itself does not express a
particular order (i.e., when-clauses). In both types of sentences, clause
order correlates with conceptual structure: after- and once-clauses,
referring to a prior event, precede the main clause significantly more
often than before- and until-clauses, denoting a posterior situation, and
when-clauses referring to a prior event are more frequently preposed to
the main clause than when-clauses denoting a posterior or simultaneously
occurring event. The analysis also revealed that complex sentences
including initial adverbial clauses are more consistent with the
iconicity principle than complex sentences including final adverbial
clauses: while complex sentences with initial adverbial clauses are
almost always iconic, more than one third of all complex sentences with
final adverbial clauses violate the iconicity of sequence.
Another factor that correlates with the positioning of temporal adverbial
clauses is their implicit meaning. About one third of all adverbial
clauses examined in this study imply a conditional, causal, or purposive
relationship between the events expressed by main and subordinate
clauses. Like ordinary conditional clauses, temporal clauses with an
implicit conditional meaning tend to precede the main clause, and like
ordinary causal and purposive clauses, temporal clauses with an implicit
causal or purposive meaning almost always follow it. This may explain why
once- and after-clauses differ in their distribution: although both types
of adverbial clauses denote a prior event, once-clauses, which are often
conditional, precede the main clause more often than after-clauses, which
are frequently used with an implicit causal meaning. Note that in the
logistic regression analysis the meaning of the adverbial clause had less
predictive power than iconicity of sequence. Moreover, the analysis
showed that while the iconicity principle influenced all temporal
clauses, the implicit meaning was only relevant for certain types of
temporal clauses.
Apart from conceptual order and implicit meaning, the length ratio of
main and adverbial clauses was a significant predictor of clause order.
The analysis revealed that initial temporal clauses account for a smaller
proportion of the overall length of the complex sentence than final
adverbial clauses, i.e., adverbial clauses that precede the main clause
are shorter than adverbial clauses that follow it; but since the
difference was relatively small, length had only a small effect on the
positioning of the adverbial clause. In the conjunction-specific
analyses, once- and until-clauses were the only adverbial clauses for
which the length ratio was a significant predictor.
Why do these factors influence the positioning of temporal adverbial
clauses? I suggest that all of the factors examined in this study are
relevant for clause order because they influence the processing of
complex sentences. Specifically, I claim that iconicity of sequence,
which is commonly characterized as a semantic principle, can be
interpreted as a processing principle that contributes to the overall
processing load of a complex sentence construction because a non-iconic
clause order is difficult to plan and to interpret. As Givón (1985: 189)
put it: "All other things being equal, a coded experience is easier to
store, retrieve and communicate if the code is maximally isomorphic to
the experience" (emphasis in the original). There are several
experimental studies supporting this view. For instance, Ohtsuka and
Brewer (1992) found that iconic sentences combined by next are easier to
understand and to remember than non-iconic sentences combined by before,
and Clark (1971) found that English-speaking children have less
difficulty understanding before- and after-clauses if clause order is
iconic (see also Carni and French 1984; Clark 1973; Coker 1978; Diessel
2004; Ferreiro and Sinclair 1971; Trosborg 1982). Assuming that
non-iconic orders are difficult to plan and to interpret, it is a
plausible hypothesis that complex sentences tend to be iconic because
speakers prefer linguistic structures that are easy to process.
Like iconicity, the meaning of the adverbial clause is relevant for the
processing of the complex sentence. In particular, conditional clauses
put a particular constraint on the processing of complex sentences. As I
have argued in Diessel (2005), conditional clauses provide a particular
conceptual framework for the interpretation of the semantically
associated clause. More precisely, the conditional clause indicates that
the main clause is a hypothetical statement that is contingent on the
realization of the event expressed in the subordinate clause. If the
conditional clause precedes the main clause, it is immediately obvious
that the sentence describes a hypothetical situation, but if the
conditional clause follows the main clause the hearer may at first
misinterpret it as a factual statement. Since the reanalysis of previous
clauses is difficult to process, conditional clauses tend to occur at the
beginning of the sentence or their occurrence is announced in the initial
main clause by intonation or a subjunctive verb form.
In addition to the meaning, the pragmatic function can influence the
positioning of adverbial clauses. As has been repeatedly argued in the
literature, initial and final adverbial clauses serve different
discourse-pragmatic functions. While final adverbial clauses are commonly
used to provide new information or to spell out information that was
pragmatically presupposed in the preceding main clause, initial adverbial
clauses are commonly used to provide a thematic ground that facilitates
the semantic processing of subsequent clauses (see Section 1.1. for
relevant references). Moreover, we may assume that causal clauses
typically follow the main clause because causal clauses are commonly used
to back up a previous statement, i.e., the final occurrence of causal
clauses is a consequence of the fact that causal clauses are often
embedded in a particular discourse routine (cf. Diessel 2006; see also
Diessel 2004: Ch 7, who discusses the discourse function of causal
clauses in early child language).
Finally, length is an important factor for the processing of complex
sentences because the length of constituents defines the recognition
domain (see above). Adopting Hawkins' parsing theory, we may assume that
final adverbial clauses are easier to parse than initial adverbial
clauses because complex sentences with final adverbial clauses have a
shorter recognition domain than complex sentences with initial adverbial
clauses. This explains the predominance of final adverbial clauses in
English. Note that in left-branching languages like Japanese adverbial
clauses are often consistently placed before the main clause because in
this language type complex sentences are easier to process if the
adverbial clause occurs at the beginning of the sentence (cf. Diessel
2001, 2005). However, in right-branching languages like English, final
position is the default and the initial occurrence of adverbial clauses
is motivated by competing processing forces.
Adopting an incremental model of sentence comprehension in which the
overall processing load of linguistic structures is determined by the
cumulative effect of syntactic, semantic, and other processing
constraints (cf. MacDonald et al. 1994), we may assume that speakers tend
to avoid structures in which the overall processing load exceeds a
certain level. This may explain why iconicity of sequence exerts a
particularly strong effect on complex sentences with initial adverbial
clauses. Since the combined effect of the initial position of the
adverbial clause (which is difficult to parse) and the occurrence of a
non-iconic clause order (which is difficult to conceptualize) can raise
the overall processing load to a very high level, speakers seek to avoid
the use of non-iconic clause orders in complex sentences with initial
adverbial clauses. Put differently, if the adverbial clause follows the
main clause there is less processing pressure to use an iconic clause
order because complex sentences with final adverbial clauses are easier
to parse; there is thus more tolerance in complex sentences with final
adverbial clauses for the increased processing load that arises from the
violation of the iconicity principle.
In sum, this paper has shown that the positional patterns of temporal
adverbial clauses are consistent with the hypothesis that clause order in
complex sentences is usually iconic. While iconicity of sequence is often
characterized as a semantic factor, it can be seen as a processing principle
that is especially relevant for complex sentences with initial adverbial
clauses because these structures are difficult to parse, so that speakers
seek to limit the overall processing load by using an iconic clause order.
Received 5 July 2007 University of Jena, Germany
Revision received 11 January 2008
Appendix
References
Altenberg, Bengt
1984 Causal linking in spoken and written English. Studia Linguistica 38, 20–69.
Backhaus, Klaus, Bernd Erichson, Wulff Plinke, and Rolf Weiber
2006 Multivariate Analysemethoden. Eine anwendungsorientierte Einführung. Berlin: Springer. [11th edition].
Behaghel, Otto
1932 Deutsche Syntax. Eine geschichtliche Darstellung. Vol. IV. Wortstellung, Periodenbau. Heidelberg: Winter.
Biber, Douglas, Stig Johansson, Geoffrey Leech, Susan Conrad, and Edward Finegan
1999 Longman Grammar of Spoken and Written English. London: Longman.
Carni, Ellen and Lucia A. French
1984 The acquisition of before and after reconsidered: What develops? Journal of
Experimental Psychology 37, 394–403.
Chafe, Wallace
1984 How people use adverbial clauses. Berkeley Linguistics Society 10, 437–49.
Clark, Eve V.
1971 On the acquisition of the meaning of after and before. Journal of Verbal
Learning and Verbal Behavior 10, 266–75.
1973 How children describe time and order. In Charles A. Ferguson and Dan I.
Slobin (eds.), Studies of Child Language Development. New York: Holt,
Rinehart and Winston, 585–606.
Coker, Pamela L.
1978 Syntactic and semantic factors in the acquisition of after and before. Journal
of Child Language 5, 261–77.
Crawley, Michael J.
2005 Statistics. An Introduction into R. John Wiley and Sons, Ltd.
Croft, William
2003 Typology and Universals. Cambridge: Cambridge University Press.
Dancygier, Barbara
1998 Conditionals and Prediction. Time, knowledge, and causation in conditional
constructions. Cambridge: Cambridge University Press.
Table 7. Position, length, conceptual order, and implicit meaning
Columns: CONCEPT ORDER (prior, prior) | MEANING (temp., caus., cond.) | LENGTH (MEAN PROP.)
WHEN initial 15 36 31 20 0.425
WHEN final 11 127 95 4 39 0.472
AFTER/BEFORE initial 27 6 32 2 0.388
AFTER/BEFORE final 70 81 128 20 2 0.412
ONCE/UNTIL initial 77 5 24 57 0.359
ONCE/UNTIL final 21 94 78 28 10 0.498
Dancygier, Barbara and Eve Sweetser
2000 Constructions with if, since and because: causality, epistemic stance, and
clause order. In Elizabeth Couper-Kuhlen and Bernd Kortmann (eds.),
Cause, Condition, Concession, Contrast: cognitive and discourse perspectives.
Berlin: Mouton de Gruyter, 111–42.
Diessel, Holger
1996 Processing factors of pre- and postposed adverbial clauses. Berkeley Linguistics Society 22, 71–82.
2001 The ordering distribution of main and adverbial clauses: A typological
study. Language 77, 343–65.
2004 The Acquisition of Complex Sentences. Cambridge: Cambridge University
Press.
2005 Competing motivations for the ordering of main and adverbial clauses. Linguistics 43, 449–70.
2006 Causal and conditional constructions. Paper presented at the Second International Conference of the German Cognitive Linguistics Association.
Munich.
Dik, Simon
1989 The Theory of Functional Grammar: Part 1: The structure of the clause. Dordrecht: Foris.
Dressler, Wolfgang U.
1995 Interactions between iconicity and other semiotic parameters in language. In
Raffaele Simone (ed.), Iconicity in Language. Amsterdam: John Benjamins,
212–37.
Fenk-Oczlon, Gertraud
1991 Frequenz und Kognition – Frequenz und Markiertheit. Folia Linguistica 25,
361–94.
Ferreiro, Emilia and Hermina Sinclair
1971 Temporal relationships in language. International Journal of Psychology 6,
39–47.
Ford, Cecilia E.
1993 Grammar in Interaction. Adverbial clauses in American English conversations.
Cambridge: Cambridge University Press.
Givón, Talmy
1985 Iconicity, isomorphism and nonarbitrary coding in syntax. In John Haiman
(ed.), Natural Syntax. Amsterdam: John Benjamins, 187–220.
1990 Syntax. A functional-typological introduction. Vol. II. Amsterdam: John
Benjamins.
1991 Isomorphism in the grammatical code: cognitive and biological considerations. Studies in Language 15, 85–114.
Greenberg, Joseph H.
1963[1966] Some universals of grammar with particular reference to the order of
meaningful elements. In Joseph H. Greenberg (ed.), Universals of Grammar
[2nd edition]. Cambridge, Mass.: MIT Press. [1st edition 1963], 73–113.
Haiman, John
1978 Conditionals are topics. Language 54, 564–89.
1980 The iconicity of grammar. Language 56, 515–40.
1983 Iconic and economic motivations. Language 59, 781–819.
1985 Iconicity in Syntax. Amsterdam: John Benjamins.
1994 Iconicity. In R. E. Asher (ed.), The Encyclopedia of Language and Linguistics. Oxford: Pergamon Press, 1629–33.
2006 Iconicity. In Keith Brown (ed.), Encyclopedia of Language and Linguistics.
Vol. V. [2nd edition]. Amsterdam: Elsevier, 457–461.
Haspelmath, Martin
2008 Frequency vs. iconicity in explaining grammatical asymmetries. Cognitive
Linguistics 19, 1–33.
Hawkins, John A.
1994 A Performance Theory of Order and Constituency. Cambridge: Cambridge
University Press.
2004 Efficiency and Complexity in Grammars. Oxford: Oxford University
Press.
Hengeveld, Kees
1989 Layers and operators in Functional Grammar. Journal of Linguistics 25,
127–57.
Hetterle, Katja
2007 Causal clauses in cross-linguistic perspective. Unpublished Manuscript. University of Jena.
Itkonen, Esa
2004 Typological explanation and iconicity. Logos and Language 5, 21–33.
Jakobson, Roman
1965[1971] Quest for the essence of language. In Roman Jakobson (ed.), Selected Writings. Vol. II. The Hague: Mouton. [Originally published in Diogenes
51 (1965)], 345–59.
Lehmann, Christian
1974 Prinzipien für 'Universal 14'. In Hansjakob Seiler (ed.), Linguistic Workshop
II. Munich: Wilhelm Fink, 69–97.
MacDonald, Maryellen C., Neal J. Pearlmutter, and Mark S. Seidenberg
1994 The lexical nature of syntactic ambiguity resolution. Psychological Review
101, 676–703.
Ohtsuka, Keisuke and William F. Brewer
1992 Discourse organization in the comprehension of temporal order in narrative
texts. Discourse Processes 15, 317–336.
Plank, Frans
1979 Ikonisierung und De-Ikonisierung als Prinzipien des Sprachwandels. Sprachwissenschaft 4, 121–158.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech, and Jan Svartvik
1985 A Comprehensive Grammar of the English Language. London: Longman.
Ramsay, Violetta
1987 The functional distribution of preposed and postposed if and when clauses in
written discourse. In Russell Tomlin (ed.), Coherence and Grounding in Discourse. Amsterdam: John Benjamins, 383–408.
Schmidtke, Karsten
in press A typology of purpose clauses.
Tabachnick, Barbara G. and Linda S. Fidell
2004 Using Multivariate Statistics. New York: Harper Collins. [3rd edition].
Tabakowska, Elzbieta, Christina Ljungberg, and Olga Fischer (eds.)
2007 Insistent Images. Amsterdam: John Benjamins.
Taylor, John
2002 Cognitive Grammar. Oxford: Oxford University Press.
Thompson, Sandra A.
1985 Grammar and written discourse. Initial and final purpose clauses in English.
In Talmy Givón (ed.), Quantified Studies in Discourse. Special issue of Text
5, 55–84.
1987 Subordination and narrative event structure. In Russell Tomlin (ed.), Coherence and Grounding in Discourse. Amsterdam: John Benjamins, 435–54.
Thompson, Sandra A. and Robert E. Longacre
1985 Adverbial clauses. In Timothy Shopen (ed.), Language Typology and
Syntactic Description. Vol. II. Cambridge: Cambridge University Press,
171–234.
Trosberg, Anna
1982 Children's comprehension of before and after reinvestigated. Journal of
Child Language 9, 381–402.
Verstraete, Jean-Christophe
2004 Initial and final position of adverbial clauses in English: the constructional
basis of the discoursive and syntactic differences. Linguistics 42, 819–853.
Wasow, Thomas
2002 Postverbal Behavior. Stanford: CSLI Publications.
New evidence against the modularity
of grammar: Constructions, collocations,
and speech perception
MARTIN HILPERT*
Abstract
This paper combines quantitative corpus data and experimental evidence to
address the question whether speech perception is influenced by knowledge
of grammatical constructions and, more specifically, knowledge of preferred
collocation patterns of these constructions. Lexical identification tasks are
devised in which subjects are presented with synthesized, phonetically ambiguous stimuli. The results suggest that knowledge of constructions and
collocations influences speech perception, thus providing evidence for a
usage-based, non-modular view of grammar.
Keywords: modularity of grammar; constructions; collocations; lexical
identification task; phonemic boundaries; compensation for
coarticulation.
1. Introduction
Usage-based approaches to language (Barlow and Kemmer 2000, Bybee
and Hopper 2001, Bybee 2006) hold that repeated usage events over time
shape grammar. One foundational aspect of this hypothesis is the often-
made observation that frequent words tend to reduce phonetically and
phonologically (Zipf 1935; Hooper 1976; Bybee 2000, 2001, inter alia).
Several phenomena empirically support this point. Jurafsky et al. (2001)
find that word-final t/d deletion correlates positively with the relative
Cognitive Linguistics 19–3 (2008), 491–511
DOI 10.1515/COGL.2008.019
0936–5907/08/0019–0491
© Walter de Gruyter
* I would like to thank Katherine Crosswhite, Suzanne Kemmer, Nancy Niedzielski, Joan
Bybee, Anatol Stefanowitsch, and two anonymous referees for Cognitive Linguistics, who
have offered helpful comments on earlier versions of this paper. Also, the audience of the
8th CSDL in San Diego provided valuable suggestions. The usual disclaimers apply.
Please address correspondence to <hilpert@icsi.berkeley.edu>.
frequency of lexical items such as want or mind. Similarly, Cooper and
Paccia-Cooper (1980) observe increased palatalization of word-final stops
before a glide in items with higher frequency. Palatalization is thus more
likely in a phrase such as did you, as compared to mind you. Finally,
Gregory et al. (1999) report that word duration is relatively shorter
for items with higher text frequency and greater contextual probability.
These and many other studies strongly support the relation of frequency
and phonetic reduction, and hence the usage-based model.
The present study focuses on another tenet that is commonly held
in usage-based approaches and elsewhere, but which as yet has not been
sufficiently supported through empirical studies. A core assumption in
both cognitive linguistics (Langacker 1987) and connectionist modeling
(McClelland et al. 1986) has been that the mental representation of gram-
mar is non-modular. The common distinction between a syntactic mod-
ule, the mental lexicon, and a phonological module is rejected in favor
of a monotonic structure of grammar. In many formalist frameworks, a
modular organization of grammar with particular emphasis on the autonomy of syntax is presupposed, following suggestions and definitions of
Fodor (1983). To illustrate, Newmeyer (1998: 23) defines the autonomy
of syntax in terms of the following hypothesis:
The Autonomy Of Syntax (Autosyn):
Human cognition embodies a system whose primitive terms are nonsemantic
and nondiscourse-derived syntactic elements and whose principles of combination
make no reference to system-external factors.
This hypothesis does not, of course, deny that information from different
grammatical modules is integrated at some level of linguistic processing,
or even at multiple levels. The crux of the argument is therefore that certain types of information are processed in one module but disregarded in
another. Frazier (1987) lays out how, for instance, acoustic spectra are instrumental for parsing speech into words, whereas the resolution of anaphoric reference seems quite unrelated to this particular task. Clifton
(1991: 97) explicates this assumption in the following quote:
Modules are defined, in part, in terms of the information relevant to them, and
thus in terms of their representational vocabularies. Information about letters or
speech sounds is relevant to the lexical module (if there is such a thing), but is of
no possible value to the syntactic module.
In cognitive linguistics, the idea of modularity has been repeatedly
criticized (Bybee 2006; Fillmore et al. 1988; Goldberg 1995, 2006; Langacker 1987). The alternative view is that lexical and syntactic knowledge
are stored in the same way, and are not processed by different mental
modules. A representative position is expressed by Langacker (2005):
Lexicon, morphology, and syntax form a continuum, divided only arbitrarily into
discrete components. Everything along this continuum is fully describable as
assemblies of symbolic structures. A symbolic structure is specifically defined as
the pairing between a semantic structure and a phonological structure (its seman-
tic and phonological poles).
While this approach denies the existence of distinct modules that handle different aspects of grammatical knowledge, little experimental evidence has been offered to demonstrate the uniformity of grammar. As a
notable exception, Tanenhaus et al. (1995) demonstrate that visual information has an immediate influence on syntactic processing: A sentence
such as Put the apple on the towel in the box presents hearers with a temporary syntactic ambiguity, as the prepositional phrase on the towel can
initially be understood as a destination. If hearers are given a visual context that contains an apple, a towel, and a box, they will, for a brief period, pay close attention to the towel. Tanenhaus and colleagues show
that the situation is very different if the visual context contains a second
apple that is placed on a napkin. When presented with a contrastive set of
two apples, a towel, and a box, hearers do not initially parse on the towel
as a directional prepositional phrase; little attention is paid to the irrelevant sole towel. Tanenhaus and colleagues interpret this result as evidence
against the modularity of syntactic processing.
Despite these findings, the hypothesis that grammar is non-modular is
still both more speculative and not as well supported as the more general
hypothesis that grammar is shaped through usage. The present study
addresses the need to demonstrate more thoroughly that knowledge of
language is indeed a large inventory of symbolic pairings of sound and
meaning, and nothing else.
The issue of modularity is closely related to the question of how auditory perceptual input is integrated with knowledge of lexical items and
syntactic structures. Is speech perception a purely bottom-up process in
which sounds are sequentially parsed into phonemes, words, phrases,
and sentences, or are there lexical and syntactic top-down effects that
guide the perception of speech? Strict bottom-up organization would accord with (though not necessitate) a modular approach to grammar. Conversely, top-down effects on speech perception, in which lexical or syntactic levels of processing interact with the processing of speech sounds, are
more naturally accounted for in a non-modular approach. It has to be
pointed out, though, that both types of organization could in principle be
modeled by either a modular or non-modular architecture.
Lexical top-down effects on speech perception have been reported on
several occasions (Elman and McClelland 1988; Ganong 1980; Magnuson et al. 2003; Warren and Warren 1970). As will be explained in more
detail below, these effects only operate on lexical units and therefore do
not bear on the question of the autonomy of syntax. The present study
goes beyond lexical effects and presents a top-down effect on speech perception that is driven by speakers' knowledge about constructions and
non-lexicalized collocations. The fact that this type of knowledge has an
effect on the perception of auditory input demonstrates the immediate interrelatedness of syntax and phonology, and thus constitutes new evidence against the purported modularity of grammar.
Recently, empirical evidence for syntactic effects on speech production
has been presented by Gahl and Garnsey (2004). In a study of pronunciation variation, they show that words are not only shortened if their overall text frequency is high, but also when their syntactic context makes
them highly probable. The overall duration of the same word is shorter in
contexts where it is more likely to occur, and hence more easily identified
by the hearer.
To illustrate, in the corpus data used by Gahl and Garnsey, the verb
suggest co-occurs with sentential complements more often than with direct objects, such that sentences like The director suggested the scene
should be filmed at night are more likely than The director suggested the
scene between Kim and Mike (2004: 752). In a production study that measures reading times, Gahl and Garnsey find that syntactic biases toward
one complementation pattern significantly correlate with reduced production of the verb in question (2004: 763). Verbs such as argue, believe,
claim, conclude, confess, or decide are pronounced shorter if they occur
with a sentential complement, and longer if they occur with a direct object. Conversely, verbs such as accept, advocate, confirm, or emphasize
are reduced when occurring with a direct object, but not when they take
a sentential complement. Gahl and Garnsey conclude that the probabilities of different complementation patterns are mentally represented for
each verb, and that this knowledge of syntactic probabilities affects
speech production (2004: 768).
These findings suggest that the organization of grammar is in fact non-modular. On a strictly formalist view, the relative frequencies of syntactic
patterns would not be part of any grammatical module to begin with, as
it is held that usage does not affect the mental representation of grammar
(Newmeyer 2003). On any modular view, the fact that syntactic representations affect subphonemic speech production would require an elaborate
interface between modules, effectively reducing the autonomy of each respective module to a relative degree. The only conceivable explanation in
terms of strict modularity would require the ad-hoc postulation of different lexical entries for verbs such as suggest, only differing in relative
length, depending on their complementation patterns. Since such a solution requires a fair amount of technical machinery and auxiliary assumptions, Gahl and Garnsey point out that the most parsimonious accounts
of these effects will be ones in which the grammar itself is enriched with
probabilistic information (2004: 769).
The present study aims to provide further evidence for the non-modular view of grammar, and extends the approach of Gahl and
Garnsey to another domain of language use: it will be argued that syntactic probabilities affect not only speech production, but also speech
perception.
2. The experimental paradigm
The experimental paradigm used in the present study is that of lexical
identification tasks. Subjects hear a stimulus and are asked to identify
the word or phrase they perceived by selecting an orthographical representation on a computer screen. The stimuli used in this task are often
not unambiguously identifiable. The stimuli are rendered ambiguous
through synthesized elements that lie on a continuum between two phonemic poles, such as /p/ and /b/. For minimal pairs such as pear and
bear, the intermediate steps on the continuum allow two possible interpretations, each of which is a lexical word of English. The experiment determines at which step of the continuum subjects flip from one interpretation
to the other.
One of the best known applications of this experimental paradigm is
the demonstration of the so-called categorical perception of speech
sounds by Liberman et al. (1957). Liberman and colleagues showed that
speech perception differs fundamentally from other types of perception.
For instance, if subjects are presented with a color continuum from red
to orange, that continuum is perceived as a gradual change that goes
through a stage of orange-red in the middle. By contrast, if subjects are
presented with a continuum from the syllable /pa/ to the syllable /ba/,
each stimulus is perceived as either one or the other. No stimulus is perceived as midway between /pa/ and /ba/. This finding suggests that
speech perception is categorical and depends on a specialized speech
processing system, but this point has been subject to controversy (Fry
et al. 1962, Kewley-Port and Luce 1984). This particular debate is not of
concern here, as the present study merely shares the experimental para-
digm, not the theoretical stakes, with these early studies.
An aspect of speech perception that is relevant to the present study has
been discussed by Warren and Warren (1970), who observe a remarkable
effect: When sounds are cut out from a recording and replaced by a non-phonemic sound such as a cough, hearers will fill in the missing sound
without even being aware of it. For instance, if the first /s/ is replaced in
a recording of the word legislatures, hearers will fail to hear that the word
has been altered, even when they are explicitly asked to identify the
spliced element. Warren and Warren call this effect phonemic restoration.
They further find that phonemic restoration is sensitive to the meaning
of the context. In a sentence such as It was found that the <cough> eel was
on the shoe, hearers robustly restore the word heel. By contrast, replacing
the last word of the sentence with axle leads to the restoration of wheel.
The restored element appears thus to be the semantically most appropriate candidate from a set of phonologically related items.
Another application that is similar in spirit to the present investigation
was developed by Ganong (1980). Ganong found that hearers perceive
phonologically ambiguous stimuli with a lexical bias: if a string of phonemes can be interpreted as a lexical word, subjects will favor this interpretation over a competing interpretation that is merely a phonotactically
legal non-word. To illustrate, a stimulus that is ambiguous between face
and faish will tend to be perceived as face. The Ganong effect can be
measured by comparing responses to different continua of stimuli. Responses to a continuum from face to faish will differ markedly from responses to a continuum from /eɪs/ to /eɪʃ/, which apart from the missing
onset is identical to the face to faish continuum, but which offers only
non-word syllables as possible interpretations. In the first continuum, subjects will be biased towards the competitor interpretation that is a lexical
item (face). In the second continuum, none of the competitors is a lexical
item, and so comparatively fewer responses will identify an ambiguous
stimulus as /eɪs/. This effect can be interpreted as a shift in the perceptual
category boundaries of phonemes such as /s/ and /ʃ/.
While both phonemic restoration and the Ganong effect represent
striking lexical top-down effects in the processing of auditory input, they
do not address the hypothesis that syntax is an autonomous module of
grammar. In order to put this hypothesis to the test, the present study investigates whether not only lexical words, but also larger syntactic units
such as constructions (Goldberg 2006) or collocations can affect speech
perception. Under the notion of syntactic effects, the present approach
subsumes even lexico-grammatical dependencies such as the collocational
preferences of particular grammatical constructions (Stefanowitsch and
Gries 2003). The present study devises lexical identification tasks in which
subjects hear ambiguous stimuli that are embedded in different syntactic
and collocational contexts. Unlike Ganong (1980), the present study uses
stimuli for which both competing interpretations are actual words of English, as for example cry and try. Further, unlike Warren and Warren
(1970), the present study employs stimuli whose competing interpretations are all semantically viable, i.e., hearers will not be given the biased
choice of, say, a wheel being either on an axle or on a shoe. If a syntactic
context significantly leads subjects to favor one competitor over the other,
this constitutes evidence against the autonomy of syntax. The basic idea
of the experiments in this study is thus to test whether syntactic context
can shift category boundaries in phonemic perception. Figure 1 shows
two idealized curves of phoneme perception in a continuum from /k/ to
/t/.
The grey curve represents responses to stimuli heard in isolation, the
black curve represents responses to stimuli heard embedded in syntactic
context. The null hypothesis of the present study is that syntactic context
has no influence whatsoever on phonemic speech perception. The stimuli
should be categorized the same, whether they are heard in isolation or
embedded within a given syntactic context. The two curves should thus
coincide. The research hypothesis is that syntactic context can introduce
a bias that leads subjects to shift their phonemic category boundaries,
such that the two curves coincide at the ends of the continuum, but diverge in the middle. These alternative hypotheses are weighed in three different experiments.
Figure 1. Idealized curves of phonemic categorization
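The two hypotheses can be made concrete with a small numerical sketch. It assumes, purely for illustration, that the idealized curves are logistic functions of the continuum step and that syntactic context moves the category boundary by one step; the function name `p_k_response`, the slope, and both boundary values are hypothetical and are not taken from the experiments reported here.

```python
import math


def p_k_response(step, boundary, slope=1.5):
    """Idealized probability of a /k/ response at a given continuum step.

    Step 1 is the clear /k/ end of the continuum; `boundary` is the step at
    which responses are split 50/50. Both parameters are illustrative
    assumptions, not fitted values.
    """
    return 1.0 / (1.0 + math.exp(slope * (step - boundary)))


STEPS = range(1, 11)          # a ten-step /k/-/t/ continuum
ISOLATION_BOUNDARY = 5.5      # hypothetical boundary for stimuli in isolation
CONTEXT_BOUNDARY = 6.5        # hypothetical boundary shifted by context

isolation = [p_k_response(s, ISOLATION_BOUNDARY) for s in STEPS]
in_context = [p_k_response(s, CONTEXT_BOUNDARY) for s in STEPS]

# Under the research hypothesis the curves nearly coincide at the ends of the
# continuum but diverge in the middle.
divergence = [c - i for c, i in zip(in_context, isolation)]
for step, d in zip(STEPS, divergence):
    print(f"step {step:2d}: difference in /k/ responses = {d:.3f}")
```

Under the null hypothesis the two boundaries would be identical and every entry of `divergence` would be zero.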
The first experiment of this paper tests the general question whether
syntactic knowledge has any effect on lexical identification. It is shown
that constructions actually influence the perception of phonemically ambiguous stimuli. Given a stimulus that is ambiguous between two lexical
elements in a construction, subjects are more likely to identify the stimulus as an element that frequently occurs in the respective construction,
and less likely to identify it as an element that only sparsely occurs in
that construction.
The second experiment tests how robust the findings of the first experiment are. If constructions influence the perception of phonemically
ambiguous stimuli, it should be possible to find constructions that induce
opposing biases, thus shifting the perceptual category boundaries in opposite directions. The results show that this is the case, corroborating the
findings of the first experiment.
In a third experiment it is tested whether constructional context
not only influences the level of phonemic processing, but also extends to
low-level phonetic processing. While the results of the first two experiments could be dismissed as operations that potentially involve the late
re-categorization of misheard words, the third experiment investigates
whether syntactic knowledge directly and instantaneously influences
lower levels of speech perception. The test case used is whether constructional context can trigger the phonetic effect of compensation for coarticulation (Elman and McClelland 1988). In processing naturally occurring
speech, hearers accommodate the fact that any string of phonemes is affected by coarticulation. The production of every speech sound is influenced by the preceding elements that are coarticulated with it. For example, the /k/ in this car will sound somewhat different from the /k/ in one
car. Compensation for coarticulation can be thought of as an increased
tolerance, such that even less than perfect examples of a phoneme are
categorized as such, when hearers know that there is a reason for the undershoot. Compensation for coarticulation is necessarily a low-level phonetic process that works instantaneously to keep up with the natural
speech flow. The results of the third experiment show that constructions
can indeed induce compensation for coarticulation. This means that
knowledge of syntax has an effect on phonetic processing, which in turn
constitutes evidence against syntactic modularity of the kind hypothesized
in Newmeyer (1998).
3. Materials and participants
The speech stimuli used in the present study are based on recordings of
an adult female human voice, speaking in a standard American English
variety. After the recordings, the stimuli were altered with a computerized
synthesizer to yield ten-step continua between two phonemic poles,
such as /k/ and /t/. The chosen method of synthesizing was sample-averaging. This technique divides two wave forms of the same length
into small slices at the rate of 44.1 kHz and creates continua of ambiguous sounds by laying wave forms from the two different sources on top of
each other. Depending on how strongly each source is represented at a
given continuum step, the resulting wave form sounds more or less like
one of the original sources. The synthesized stimuli are embedded in unaltered recordings of actual words to yield a continuum between, say, the
English words cry and try. The end points of the continuum are unambiguously perceived as cry and try, but the point at which the perceptual
crossover from cry to try occurs will vary from person to person.
In some experiments reported in this paper, the outer continuum steps
were discarded if pilot studies indicated that even steps that lay more towards the center of the continuum were identified unambiguously across
subjects.
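The sample-averaging technique described above can be sketched as a weighted mix of two equally long wave forms. The sketch below is a minimal illustration and not the script used in the study: the function names `sample_average` and `make_continuum` are hypothetical, wave forms are represented as plain lists of sample values, and the even spacing of mixing weights between 0 and 1 is an assumption.

```python
def sample_average(wave_a, wave_b, weight_b):
    """Mix two equally long wave forms sample by sample.

    `weight_b` is the share of source B in the mix (0.0 = pure A, 1.0 = pure B).
    """
    if len(wave_a) != len(wave_b):
        raise ValueError("sample-averaging requires wave forms of equal length")
    return [(1.0 - weight_b) * a + weight_b * b for a, b in zip(wave_a, wave_b)]


def make_continuum(wave_a, wave_b, n_steps=10):
    """Return n_steps wave forms ranging from pure A to pure B.

    Evenly spaced weights are an assumption made for this illustration.
    """
    weights = [i / (n_steps - 1) for i in range(n_steps)]
    return [sample_average(wave_a, wave_b, w) for w in weights]


# Toy stand-ins for the two recorded onsets; real input would be audio
# samples recorded at 44.1 kHz.
onset_a = [0.0, 1.0, -1.0, 0.5]
onset_b = [1.0, 0.0, 1.0, -0.5]
continuum = make_continuum(onset_a, onset_b)
```

The first and last steps reproduce the two sources unchanged, which corresponds to the unambiguous end points of the continuum; the intermediate steps are the ambiguous mixtures.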
Fifteen volunteer subjects with self-reported normal hearing, normal or
corrected-to-normal vision, and English as their native language participated, each one in all three of the experiments. Since this was a procedure
of about 45 minutes, subjects were instructed to take breaks whenever
they felt the need to do so. The experimental design was fully self-paced through mouse clicks and allowed for subject-controlled breaks.
All subjects were Rice University undergraduate or graduate students
who were either paid or given course credit for their participation. None
of the data had to be excluded.
4. Experiment 1: The English make-causative
In the first experiment, subjects are presented with ambiguous speech signals within a construction that is intended to bias the lexical identification
process towards one of the two competing interpretations. The construction used in this experiment is the English make-causative, which has a
strong bias towards verbs of emotion and psycho-physiological reaction
(Kemmer 2001). Typical examples are It made me feel dizzy or That
makes it look a lot bigger; examples involving activity verbs such as He
made me do it are much less frequent, despite the high text frequency of
the verb do. Table 1 shows the twenty most frequent verbs from an exhaustive extraction of the make-causative construction from the British
National Corpus (Leech 1992), which yields 10,708 examples.
While the verb cry occurs 73 times in the make-causative construction,
the verb try, which is not shown in Table 1, occurs only eleven times. As a
minimal pair with cry, it affords a test case for the effect of constructional
context on speech perception. Note that try is ten times as frequent in discourse as cry (Francis and Kucera 1982), such that the frequency of cry
and try in the make-causative is asymmetrical to their overall frequency.
At any rate, the context of the make-causative should bias the categorization of stimuli ambiguous between cry and try towards cry. The carrier
phrase that is used in the experiment is the phrase They made me, which
is followed by a signal that ranges on an eight-step continuum from /trai/
to /krai/. It is hypothesized that the constructional carrier phrase biases
hearers towards perceiving a principally ambiguous signal as /krai/. To
test this hypothesis, subjects categorized the ambiguous signals both within the constructional frame and in isolation.
Table 1. The 20 most frequent verbs in the English make-causative construction in the BNC

Verb      Tokens    Verb      Tokens
feel      1654      appear    142
look      822       happen    119
think     542       come      111
laugh     358       realise   111
seem      293       see       100
work      264       pay       97
sound     258       meet      93
go        237       stand     91
want      195       take      74
wonder    157       cry       73

4.1. Method
4.1.1. Materials. For the experiment, a ten-step /t/-/k/ continuum
was created using a sample-averaging script within Praat (Boersma and
Weenink 2005). The source signals were the items cry and try, recorded
from the speech of a female native speaker of American English. Input
sections were selected that contained the burst of the consonant as well
as the first four glottal pulses. Again using Praat, the longer of the
two sections was shortened such that both sections were of equal length.
From these continuum endpoints, intermediate signals were created in
10% steps. Each of the resulting /t/-/k/ continuum steps was concatenated
with the remaining stretch of the recording of try, which comprised
the entire word minus the burst and the first four glottal pulses. This
procedure yields a continuum of sounds, the first one an unambiguous
/krai/, and the last one an unambiguous /trai/. In pilot studies the two
endpoint steps were identified unambiguously and hence discarded; only
the intermediate eight steps were used. For the carrier phrase, the phrase
they made me was recorded from the same speaker and subsequently
concatenated with each of the eight stimuli. Crossing the two contexts
with the continuum yields 16 different stimuli (carrier phrase and null
context by eight try-cry continuum steps).
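The sample-averaging step described in the Materials paragraph can be illustrated with a minimal Python sketch. It is not the Praat script used in the study: the function names, the toy sample values and the mixing weights are assumptions made for illustration, and the text does not spell out how the ten steps map onto the 10% increments, so the weights are simply passed in.

```python
def blend(signal_a, signal_b, weight_b):
    """Weighted sample-by-sample average of two equal-length signals."""
    if len(signal_a) != len(signal_b):
        raise ValueError("signals must be trimmed to equal length first")
    return [(1.0 - weight_b) * a + weight_b * b
            for a, b in zip(signal_a, signal_b)]


def make_continuum(signal_a, signal_b, weights):
    """One blended signal per mixing weight (0.0 = pure a, 1.0 = pure b)."""
    return [blend(signal_a, signal_b, w) for w in weights]


# Toy stand-ins for the /k/ and /t/ onsets (hypothetical sample values).
k_onset = [0.0, 0.5, 1.0, 0.5]
t_onset = [1.0, 0.0, 0.0, 1.0]
continuum = make_continuum(k_onset, t_onset, [0.0, 0.25, 0.5, 0.75, 1.0])
```

In this sketch each blended onset would then be spliced onto the shared remainder of the word, mirroring the concatenation step in the text.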
4.1.2. Procedure. The experiment was conducted using PsyScope 1.2.5
(Cohen et al. 1993). Subjects were given on-screen instructions, stating
that they would see a clickable red dot as a fixation point at the center
of the computer screen. After clicking the dot, they would hear a
pre-recorded sound file and have to identify the percept as a word of English
in a two-way choice. Orthographical representations in a 44 pt font were
displayed to the left and right side of the screen. The same orthographical
representation would appear in the same place throughout. For each subject,
this experiment involved 64 trials, such that each of the eight continuum
steps was heard eight times, four times in isolation, and four times in
the constructional context. No filler trials were used.¹ Stimuli were
presented in randomized order, while the relative positions of the
orthographical representations were kept constant.
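The trial structure just described (eight steps, two contexts, four repetitions, randomized order) can be sketched as follows. This is an illustrative Python reconstruction, not the PsyScope script; the function name, the context labels and the seed are assumptions.

```python
import random


def build_trials(steps, contexts, repetitions, seed=None):
    """Cross contexts with continuum steps, repeat, and shuffle."""
    trials = [(context, step)
              for context in contexts
              for step in steps
              for _ in range(repetitions)]
    # A seeded generator keeps the randomized order reproducible.
    random.Random(seed).shuffle(trials)
    return trials


# Experiment 1: eight continuum steps, heard in isolation and after the
# carrier phrase, four times each.
trials = build_trials(range(1, 9), ["isolation", "they made me"], 4, seed=1)
```

The list has 64 entries, matching the trial count reported for each subject.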
4.2. Results
Figure 2 summarizes the outcome of the first experiment. In both conditions,
the perceptual crossover from /k/ to /t/ occurs between steps three
and six. The outer two steps on either side of the continuum are
unambiguously identified by all subjects. The figure shows that the
categorization curve is drawn half a step towards the right side of the
continuum in the context of the make-causative construction, which is
consistent with the research hypothesis. More instances of ambiguous sounds
are identified as cry if they are presented in the constructional carrier
phrase. A repeated measures ANOVA was conducted for the cry responses in
isolation and in the constructional context to measure the effect of the
constructional carrier phrase. The calculation is based on all cry responses
of fifteen subjects in two different conditions (isolation, causative) across
the eight steps in the synthesized continuum.
1. To the extent that filler trials serve to obscure the research question in an experimental
design, they were not deemed necessary in the present study. Additionally, since
participants completed all three experiments in one lengthy sitting, fillers would have meant
an additional strain on the participants.
The constructional effect is significant in a by-subject analysis
(F(1, 14) = 18.44, p < 0.001) and approaches significance in the corresponding
by-item analysis (F(1, 7) = 4.17, p = 0.08).
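For readers who want to see where the degrees of freedom come from, here is a minimal Python sketch of a one-way repeated-measures ANOVA. It assumes that responses are collapsed into one score per unit (subject or item) and condition, which the text does not state explicitly; the function name and the toy data are hypothetical and do not reproduce the study's data.

```python
def repeated_measures_anova(scores):
    """One-way repeated-measures ANOVA.

    scores[i][j] is the score of unit i (a subject or an item)
    in condition j. Returns (F, df_condition, df_error).
    """
    n_units = len(scores)
    n_conditions = len(scores[0])
    grand_mean = sum(sum(row) for row in scores) / (n_units * n_conditions)

    condition_means = [sum(row[j] for row in scores) / n_units
                       for j in range(n_conditions)]
    unit_means = [sum(row) / n_conditions for row in scores]

    # Partition the total variation into condition, unit and error parts.
    ss_condition = n_units * sum((m - grand_mean) ** 2 for m in condition_means)
    ss_units = n_conditions * sum((m - grand_mean) ** 2 for m in unit_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_condition - ss_units

    df_condition = n_conditions - 1
    df_error = (n_conditions - 1) * (n_units - 1)
    f_value = (ss_condition / df_condition) / (ss_error / df_error)
    return f_value, df_condition, df_error


# Hypothetical "cry" counts for three subjects (isolation, causative).
toy_scores = [[1, 2], [2, 4], [3, 3]]
f_value, df1, df2 = repeated_measures_anova(toy_scores)
```

With fifteen subjects and two conditions the same formula gives degrees of freedom of 1 and 14; with eight items and two conditions it gives 1 and 7, as in the reported analyses.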
5. Experiment 2: Using collocations to induce opposing biases
While the first experiment is designed merely to explore whether constructional
knowledge, which falls into the domain of syntax, has an effect on
speech perception, the second experiment tests how robust this effect is.
Given that only the by-subject analysis returned a significant result in
the first experiment, further investigation seems necessary. Again, subjects
are presented with phonologically ambiguous stimuli that are embedded
in constructional frames. This time, however, subjects hear the same stimuli
in three different conditions. In the control condition, subjects hear the
stimuli in isolation. In the two other conditions, the stimuli are embedded
in collocations that exhibit different lexical preferences. The research
hypothesis is that each carrier phrase biases lexical identification towards its
lexical preference, while the control condition yields responses that fall in
between the two other conditions. The null hypothesis is that all three
conditions should receive similar responses, or responses that differ
randomly.
Figure 2. Perception of the /trai/-/krai/ continuum in isolation and the causative

The phrases it's always and it's getting are used as carrier phrases because
they collocate heavily with the elements worse and worth, which
form a phonological minimal pair of English. The collocational preferences
are opposed to each other: while it's always frequently occurs with
worth, it's getting is frequently followed by worse. The reverse combinations
are not ungrammatical, but very infrequent. Sentences such as It's
getting worth investing again are thus rarely seen. Table 2 shows the ten
most frequent items that occur after it's always and it's getting in the
BNC. The table is based on 502 occurrences of it's always and 292 tokens
of it's getting.
As expected, function words such as the determiners the and a, and
prepositions such as to and on are among the most frequent elements.
However, both lists also contain open-class elements such as the adjectives
nice, difficult, easier, and better with it's always, and late and dark
with it's getting. What matters to the present analysis is that the minimal
pair members worse and worth approximate a complementary distribution
across the two collocational environments. In terms of absolute frequency,
worse and worth occur at the same order of magnitude (Francis
and Kucera 1982), such that any observed effect should not be due to
stronger familiarity with one of the two competitors.
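The distributional claim can be made concrete with the counts reported in Table 2. The following Python sketch only restates those counts; the dictionary layout and the function name are illustrative assumptions. Items that do not appear in the top-ten list have no reported count, so the function returns None for them rather than guessing.

```python
# Counts taken from Table 2; only the minimal-pair members are included.
TABLE_2 = {
    "it's always": {"total": 502, "top_items": {"worth": 11}},
    "it's getting": {"total": 292, "top_items": {"worse": 10}},
}


def share(phrase, item):
    """Proportion of a carrier phrase's tokens that are followed by item."""
    entry = TABLE_2[phrase]
    tokens = entry["top_items"].get(item)
    if tokens is None:
        return None  # not among the ten most frequent items, count unknown
    return tokens / entry["total"]


worth_after_always = share("it's always", "worth")
worse_after_getting = share("it's getting", "worse")
worse_after_always = share("it's always", "worse")
```

Each preferred item accounts for roughly two to three percent of its carrier phrase's tokens, while the reverse combination does not reach the top ten in either list.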
The distributional asymmetry between the items after it's getting and
it's always should lead subjects to interpret the same ambiguous stimuli
in different ways, depending on the preceding context. The question pursued
in this experiment is whether this difference is strong enough to induce
opposing biases that are statistically significant, and that are both
distinct from intermediate responses to the control condition.
Table 2. The ten most frequent items after it's always and it's getting in the BNC

it's always    Tokens    it's getting    Tokens
the            74        a               25
a              64        late            20
like           14        dark            15
nice           14        more            11
difficult      13        worse           10
there          12        to              9
worth          11        on              8
easier         9         the             7
going          9         too             7
better         8         very            7

5.1. Method
5.1.1. Materials. A ten-step /s/-/θ/ continuum was created using the
same sample-averaging script within Praat. The source signals were the
items worse and worth, which were recorded as spoken by a female native
speaker of English. Input sections were selected that contained the last
four glottal pulses from the vowel //. The longer /s/-section was shortened
such that it was of equal length as the /θ/-section. From these continuum
endpoints, intermediate signals were created in 10% steps. Each of
the resulting /s/-/θ/ continuum steps was concatenated with the remainder
of the word worth, which comprised the entire word minus the last
four glottal pulses and the frication. This procedure yielded a continuum
of sounds, the first one an unambiguous /ws/, and the last one an
unambiguous /wθ/. The first point and the last two points of the continuum
were interpreted unambiguously in pilot tests, such that they were
discarded and not used in the actual experiment. Only the seven steps
from step two to step eight were used.
5.1.2. Procedure. The experiment was conducted in much the same
way as experiment 1, using PsyScope 1.2.5 with on-screen instructions.
For each subject, this experiment involved 84 trials, such that each of
the seven continuum steps was heard twelve times, four times in isolation,
and four times in each of the two different constructional contexts.
No filler trials were given. Stimuli were presented in randomized order;
the relative positions of the orthographical representations were kept
constant.
5.2. Results
Figure 3 shows that syntactic context actually biases lexical identification
in the predicted way. The diagram shows all worse responses relative to
the three conditions of the experiment. The perceptual crossover covers
all steps except the first one, regardless of context. It can be seen that
syntactic context has an effect on the interpretation of the ambiguous
stimulus, as the condition it's getting produces the most worse responses. The
light grey curve, representing the condition it's always, is flatter and drawn
more to the left than the black curve, as this condition yields the fewest
worse responses. The medium grey line, representing worse responses in
the absence of a carrier phrase, falls between the two other lines, except
at step 7.
A repeated measures ANOVA was conducted for all worse responses to
measure the effect of the constructional carrier phrases. The calculation is
based on all worse responses of fifteen subjects in three different conditions
(isolation, it's always, it's getting) across the seven steps of the
synthesized continuum. The constructional effect is significant in a by-subject
analysis (F(2, 28) = 13.60, p < 0.001) and in the corresponding by-item
analysis (F(2, 12) = 18.83, p < 0.001).
6. Experiment 3: Compensation for coarticulation
The first two experiments yield evidence that knowledge of constructions
and collocations induces shifts in phonemic category boundaries. This
can be interpreted as a syntactic effect on phonological processing. While
the explanation of such an effect requires some auxiliary assumptions on
a modular view of grammar (Newmeyer 1998, 2003), the effect itself does
not amount to a refutation of modularity. One possible criticism is that
the observed results are the effect of late feedback between modules,
which is how the effect of phonemic restoration due to semantic context
(Warren and Warren 1970) is most appropriately interpreted. It could
thus be that an input that is passed from the phonological module to the
syntactic module is left unspecified or subsequently judged to be a
misperception, and therefore re-analyzed at a relatively late processing stage. In
order to show that syntactic effects on speech perception apply immediately
at the level of auditory input processing, it needs to be demonstrated
that on-line phonetic processing is affected by syntactic context. The third
experiment investigates whether this is actually the case.
Figure 3. worse responses relative to condition

A potential source for such evidence is the phonetic effect of compensation
for coarticulation (Elman and McClelland 1988). In processing
naturally occurring speech, hearers do not expect each token of a phonemic
category to be invariant. Hearers unconsciously compensate for
the fact that every speech sound is influenced by its preceding elements.
If a transition from one phoneme to the next takes effort, hearers will
accommodate the resulting undershoot and perceive even less than perfect
examples of a phoneme as a proper member of its category. The behavior
of compensation for coarticulation can be exploited in an experimental
setting. What the experiment aims to test is whether compensation for
coarticulation, as an on-line phonetic effect, can be triggered by the
syntactic context of a given construction.
To this end, not one but two ambiguous stimuli are required. For
the first part of the complex ambiguous stimulus, the third experiment
re-uses stimuli of the second experiment. Subjects are presented with stimuli
that are ambiguous between worse and worth in two conditions. In the
control condition, subjects hear the stimulus in isolation, while the second
condition presents the stimulus appended to the carrier phrase it's always.
As has been shown in experiment 2, this carrier phrase biases lexical
identification towards the competitor worth. It is assumed here that prior
exposure to the stimuli does not have a biasing effect; the same subjects
heard the enhanced stimuli in the third experiment.
The stimulus continues with an element that is phonetically ambiguous
between the words trying and crying. It is here that compensation for
coarticulation comes into play. The interpretation of the first stimulus
(worse-worth) should lead subjects to categorize the second stimulus
(trying-crying) in different ways, depending on the degree of effort on the
part of the speaker to make the transition. Transitional effort is
operationalized here, in a somewhat simplistic but not confounding way, as
distance in production site. The two ambiguous stimuli yield four possible
interpretations, which are shown in Table 3.
If the first stimulus is perceived as worth, which ends on the interdental
/θ/, subjects should forgive that a following velar /k/ is pronounced
somewhat more towards the front; so they should be more likely to perceive
the second stimulus as crying. Put simply, as /θ/ is produced further
in the front of the mouth than /s/, we expect it to generate a relatively
greater tolerance. By contrast, the transitions from dental to velar (/s/
> /k/) and from interdental to alveolar (/θ/ > /t/) are relatively easy, and
the transition from dental to alveolar (/s/ > /t/) is the easiest option
altogether. Hearers should therefore be the least tolerant with respect to
this transition.²

Table 3. Stimuli interpretations and degree of effort in coarticulation

Interpretation    Transition               Effort
worth crying      interdental - velar      difficult
worse crying      dental - velar           intermediate
worth trying      interdental - alveolar   intermediate
worse trying      dental - alveolar        easy
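The operationalization of effort as distance in production site can be sketched in a few lines of Python. The numeric positions below are illustrative assumptions chosen only to respect the front-to-back ordering of the four places of articulation; they are not measurements from the study, and the place labels follow Table 3.

```python
# Assumed front-to-back positions (arbitrary units, for illustration only).
PLACE_POSITION = {"interdental": 0, "dental": 1, "alveolar": 2, "velar": 5}

# Place of the final consonant of the first word and of the initial
# consonant of the second word, as labelled in Table 3.
FINAL_PLACE = {"worth": "interdental", "worse": "dental"}
INITIAL_PLACE = {"trying": "alveolar", "crying": "velar"}


def transition_distance(first_word, second_word):
    """Distance between the two production sites of a word sequence."""
    start = PLACE_POSITION[FINAL_PLACE[first_word]]
    end = PLACE_POSITION[INITIAL_PLACE[second_word]]
    return abs(end - start)


# Rank the four interpretations from most to least effortful.
ranking = sorted(
    ((first, second) for first in FINAL_PLACE for second in INITIAL_PLACE),
    key=lambda pair: transition_distance(*pair),
    reverse=True,
)
```

Under these assumed positions the ranking runs from worth crying down to worse trying, which matches the ordering of effort in Table 3; any other monotone assignment of positions with a comparably distant velar would give the same order.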
On the research hypothesis, the constructional context it's always
should bias subjects towards perceiving worth more often than in the
control condition. This, in turn, should result in a bias to perceive the second
ambiguous stimulus as crying more often. If we thus observe more crying
responses in the second condition, this would suggest that syntax affects
even low-level phonetic processing.
6.1. Method
6.1.1. Materials. The carrier phrase it's always and the seven-step
worse-worth continuum from the second experiment were re-used without
further changes, but the stimuli were concatenated with further material.
A ten-step continuum was created from the recorded items crying and trying,
using the previously discussed method. Here, the first point and the
last three points of the continuum were interpreted unambiguously in
pilot tests, such that they were discarded and not used in the actual
experiment. Only the six steps from step two to step seven were used. The
subsequent concatenation of stimuli types yielded 42 different stimuli (seven
worth-worse continuum steps times six trying-crying continuum steps).
6.1.2. Procedure. The experiment was conducted in the same way as
the other experiments, using PsyScope 1.2.5 with on-screen instructions.
The only difference concerned the fact that this time, subjects had to identify
a percept as a word of English in a four-way choice: worth crying,
worse crying, worth trying, or worse trying. Orthographical representa-
tions in a 44 pt font were either arrayed into the four corners of the screen
or displayed to the left and right side of the screen. The same orthograph-
ical representation would appear in the same place throughout. For each
subject, this experiment involved 252 trials. Each of the 42 stimuli was
heard six times, three times in isolation, and three times after the carrier
phrase it's always. No filler trials were given. Stimuli were presented in
randomized order while the relative positions of the orthographical repre-
sentations were kept constant.
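The stimulus grid and response options of the third experiment follow from crossing the two continua. The Python sketch below is only a bookkeeping illustration; the variable names and condition labels are assumptions, not part of the original PsyScope setup.

```python
from itertools import product

# Steps retained after the pilot tests, as reported in the Materials section.
worse_worth_steps = range(2, 9)    # seven steps: two to eight
crying_trying_steps = range(2, 8)  # six steps: two to seven

# Every complex stimulus pairs one step of each continuum.
stimuli = list(product(worse_worth_steps, crying_trying_steps))

# The four-way choice offered to subjects.
responses = [f"{first} {second}"
             for first, second in product(("worth", "worse"),
                                          ("crying", "trying"))]

# Each stimulus is heard three times per condition.
conditions = ("isolation", "it's always")
trials = [(condition, stimulus)
          for condition in conditions
          for stimulus in stimuli
          for _ in range(3)]
```

The counts agree with the text: 42 stimuli, four response options and 252 trials per subject.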
2. A reviewer points out that both worth and worse should trigger the fronting of the initial
consonant of crying and asks whether a minor difference in place of articulation, such as
dental vs. interdental, has been shown to make a significant difference in compensation
for coarticulation. Elman and McClelland (1988) report an effect for the alternation
between /s/ and /ʃ/, i.e., an alveolar and a postalveolar fricative, so that indeed minute
differences seem sufficient for the effect to obtain.
6.2. Results
Figure 4 shows that there are fewer crying responses in the control condition
than in the condition that involved the constructional carrier phrase.
The diagram shows the absolute numbers of responses to the second
ambiguous stimulus (crying-trying). After the constructional carrier phrase
it's always we expected a higher number of worth responses and consequently
a higher number of crying responses. This expectation is borne
out.
A repeated measures ANOVA was conducted for all crying responses
to measure the effect of the constructional carrier phrases. The calculation
is based on all crying responses of fifteen subjects in two different conditions
(isolation, it's always) across the six steps of the synthesized continuum.
The constructional effect approaches significance in a by-subject
analysis (F(1, 14) = 4.02, p = 0.065); the corresponding by-item analysis
returns a significant result (F(1, 5) = 6.58, p = 0.050).
Figure 4. Crying responses to the second ambiguous stimulus by condition

7. Conclusion
The results of the three experiments demonstrate that syntactic context in
the form of constructions and collocations has an effect on both phonemic
categorization and low-level phonetic processing. Presenting ambiguous
sounds in the carrier phrase of a constructional or collocational frame
alters the phonemic category boundaries in a lexical identification task,
and it can induce the phonetic effect of compensation for coarticulation.
It needs to be acknowledged that the observed effects fail to reach
significance in the cases of the by-item analysis of experiment 1 and the
by-subject analysis of experiment 3. Here, there is only evidence in the form
of trends. The directions of the observed effects are, however, as predicted;
they always move towards the lexical element that more frequently
occurs with the carrier phrase. This reaffirms the point that collocations
and collocational patterns within constructions (Stefanowitsch
and Gries 2003) have a psychological reality that shapes the way in which
hearers perceive speech. It can also be concluded that the lexically based
Ganong effect has a more abstract counterpart which extends to the level
of syntax, and which is not restricted to the opposition of words and
non-words. The result that subjects are biased towards hearing entrenched
units over hearing chance collocations is consistent with views held in
Construction Grammar and cognitive linguistics, but up to now, these
views had not been sufficiently supported through empirical studies. The
results of the present study provide new evidence that syntactic and lexical
knowledge are not stored in different mental modules, but rather form
a continuum from heavily entrenched and conventionalized units to
loosely connected elements (Bybee 2006).
Received 27 June 2007 ICSI Berkeley, USA
Revision received 3 January 2008
References
Barlow, Michael and Suzanne E. Kemmer
2000 Usage-based Models of Language. Stanford: CSLI.
Boersma, Paul and David Weenink
2005 Praat: doing phonetics by computer (Version 4.3.14) [Computer program].
Retrieved May 26, 2005, from http://www.praat.org/.
Bybee, Joan L.
2000 The phonology of the lexicon: Evidence from lexical diffusion. In M. Barlow
and S. Kemmer (eds.), Usage-based Models of Language. Stanford: CSLI.
2001 Phonology and Language Use. Cambridge: Cambridge University Press.
2006 From usage to grammar: The mind's response to repetition. Language 82(4),
711–733.
Bybee, Joan L. and Paul Hopper (eds.)
2001 Frequency and the Emergence of Linguistic structure. Amsterdam: John
Benjamins.
Clifton, Charles Jr.
1991 Syntactic modularity in sentence comprehension. In R. R. Hoffman and
D. S. Palermo (eds.), Cognition and the symbolic processes, Vol. 3: Applied
and ecological perspectives. Hillsdale, NJ: Erlbaum, 95–114.
Cohen, Jonathan, Brian MacWhinney, Matthew Flatt, and Jefferson Provost
1993 PsyScope: An interactive graphical system for designing and controlling
experiments in the psychology laboratory using Macintosh computers. Behavior
Research Methods, Instruments, and Computers 25, 257–271.
Cooper, William E. and Jeanne Paccia-Cooper
1980 Syntax and Speech. Cambridge, MA: Harvard University Press.
Elman, Jeff L. and James L. McClelland
1988 Cognitive penetration of the mechanisms of perception: Compensation for
coarticulation of lexically restored phonemes. Journal of Memory and Language
27, 143–165.
Fillmore, Charles J., Paul Kay, and Mary C. O'Connor
1988 Regularity and idiomaticity in grammatical constructions: The case of Let
Alone. Language 64(3), 510–538.
Fodor, Jerry A.
1983 The Modularity of Mind. Cambridge: Bradford books.
Francis, W. Nelson and Henry Kucera
1982 Frequency Analysis of English Usage: Lexicon and Grammar. Boston:
Houghton-Mifflin.
Frazier, Lyn
1987 Sentence Processing: A tutorial review. In M. Coltheart (ed.), Attention and
Performance XII: The Psychology of Reading. Hove: Erlbaum, 559–586.
Fry, Dennis B., Arthur Abramson, Peter D. Eimas, and Alvin M. Liberman
1962 The identification and discrimination of synthetic vowels. Language and
Speech 5, 171–189.
Gahl, Susanne and Susan M. Garnsey
2004 Knowledge of grammar, knowledge of usage: Syntactic probabilities affect
pronunciation variation. Language 80(4), 748–775.
Ganong, William F.
1980 Phonetic categorization in auditory perception. Journal of Experimental
Psychology: Human Perception and Performance 6, 110–125.
Goldberg, Adele E.
1995 Constructions. Chicago: University of Chicago Press.
2006 Constructions at Work. The Nature of Generalization in Language. Oxford:
Oxford University Press.
Gregory, Michelle, W. D. Raymond, A. Bell, E. Fosler-Lussier, and D. Jurafsky
1999 The effects of collocational strength and contextual predictability in lexical
production. Chicago Linguistics Society 35.
Hooper, Joan B.
1976 An Introduction to Natural Generative Phonology. New York: Academic Press.
Jurafsky, Dan, Alan Bell, Michelle Gregory, and William D. Raymond
2001 Probabilistic Relations between Words: Evidence from Reduction in Lexical
Production. In J. L. Bybee and P. Hopper (eds.), Frequency and the Emergence
of Linguistic Structure. Amsterdam: John Benjamins, 229–254.
Kemmer, Suzanne E.
2001 Causative Constructions and Cognitive Models: The English Make Causative.
The First Seoul International Conference on Discourse and Cognitive
Linguistics: Perspectives for the 21st Century, 803–832.
Kewley-Port, Diane and Paul A. Luce
1984 Time-varying features of initial stop consonants in auditory running spectra:
A first report. Perception and Psychophysics 35, 353–360.
Langacker, Ronald W.
1987 Foundations of Cognitive Grammar. Stanford: Stanford University Press.
2005 Construction grammars: Cognitive, radical, and less so. In F. J. Ruiz de
Mendoza Ibáñez and M. S. Peña Cervel (eds.), Cognitive Linguistics. Internal
Dynamics and Interdisciplinary Interaction. Berlin: Mouton de Gruyter,
101–159.
Leech, Geoffrey
1992 100 million words of English: the British National Corpus. Language Research
28(1), 1–13.
Liberman, Alvin M., Katherine S. Harris, H. S. Hoffman, and B. C. Griffith
1957 The discrimination of speech sounds within and across phoneme boundaries.
Journal of Experimental Psychology 54, 358–68.
Magnuson, James, Bob McMurray, Michael Tanenhaus and Richard Aslin
2003 Lexical effects on compensation for coarticulation: The ghost of Christmash
past. Cognitive Science 27(2), 285–298.
McClelland, James L., David E. Rumelhart and the PDP Research Group
1986 Parallel Distributed Processing: Explorations in the Microstructure of Cogni-
tion. Volume 2: Psychological and Biological Models. Cambridge, MA: MIT
Press.
Newmeyer, Frederick J.
1998 Language Form and Language Function. Cambridge: MIT Press.
2003 Grammar is grammar and usage is usage. Language 79, 682–707.
Stefanowitsch, Anatol and Stefan Th. Gries
2003 Collostructions: Investigating the interaction between words and constructions.
International Journal of Corpus Linguistics 8, 209–43.
Tanenhaus, Michael, Michael Spivey-Knowlton, Kathleen Eberhard and Julie Sedivy
1995 Integration of visual and linguistic information in spoken language
comprehension. Science 268, 1632–1634.
Warren, Richard M. and Roslyn P. Warren
1970 Auditory illusions and confusions. Scientific American 223, 30–36.
Zipf, George K.
1935 The Psycho-biology of Language. Boston: Houghton-Mifflin.
Negative entrenchment: A usage-based
approach to negative evidence
ANATOL STEFANOWITSCH*
Abstract
The alleged absence of negative evidence in the linguistic input has played a
major role both in linguistic theorizing and in discussions about linguistic
methodology. I argue that, given a sufficiently sophisticated understanding
of frequency, negative evidence can be inferred from the positive evidence in
the linguistic input. Using an extension of collostructional analysis, I show
how the corpus linguist, and, by analogy, the language learner, can discrim-
inate between combinations of linguistic items that are accidentally absent
from a given corpus and combinations whose absence is statistically significant.
I also show that this kind of negative corpus evidence correlates with
degrees of acceptability in judgment tasks. I propose a conceptualization of
such negative evidence as negative entrenchment in a usage-based model.
Keywords: negative evidence; acceptability; collostructional analysis; cor-
pus linguistics; entrenchment; usage-based model.
1. Introduction
The alleged absence of negative evidence in the linguistic input has played
a major role both in linguistic theorizing and in discussions about linguistic
methodology. In this short paper, I will argue that negative evidence is
not as absent as it may seem, but that this perception is due to a relatively
simplistic understanding of the kind of information that can be derived
from the frequency of linguistic items in the input. With a more sophisticated
approach, negative evidence can, in fact, be inferred from the positive
evidence in the linguistic input.
Cognitive Linguistics 19–3 (2008), 513–531
DOI 10.1515/COGL.2008.020
0936–5907/08/0019–0513
© Walter de Gruyter
* I would like to thank the editor of this special issue, Arne Zeschel, and the two
anonymous reviewers for their extremely helpful comments and suggestions. I would also like
to thank Martin Hilpert for recruiting the participants for the experiment reported in this
paper. Contact Address: Universität Bremen, Fachbereich 10, Postfach 33 04 40, 28334
Bremen, Germany. Author e-mail: stefanowitsch@uni-bremen.de.
In Section 2, I will briefly introduce the "no negative evidence" problem
(Bowerman 1988) and discuss some potential solutions that have
been proposed in the literature. I will also touch on the way in which
these solutions might fit in with the idea of a usage-based model of
language.
In Section 3, I will then expand one particular solution and show how a
corpus linguist, faced with the same problem as the language learner, can
solve the "no negative evidence" problem using inferential statistics
rather than relying on raw frequencies, as is still widely done.
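The difference between raw frequencies and inferential statistics can be previewed with a small Python sketch. It computes how likely a complete absence of co-occurrence would be if two items were distributed independently, using a hypergeometric model. This is an illustrative assumption, not necessarily the statistic developed in Section 3; the function name and all counts are hypothetical.

```python
from math import comb


def probability_of_absence(corpus_size, freq_a, freq_b):
    """P(zero co-occurrences) if items a and b were independent.

    Hypergeometric model: freq_b slots are drawn without replacement from
    corpus_size slots, freq_a of which contain item a.
    """
    return comb(corpus_size - freq_a, freq_b) / comb(corpus_size, freq_b)


# Two rare items: never co-occurring is unsurprising (accidental gap).
p_rare = probability_of_absence(1000, 5, 5)

# Two frequent items: never co-occurring is very unlikely by chance.
p_frequent = probability_of_absence(1000, 200, 50)
```

A raw count of zero is identical in both cases, but only the second absence would be hard to attribute to chance.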
Finally, in Section 4, I will present the results of a pilot study correlat-
ing two kinds of corpus-derivable negative evidence with acceptability
judgments, showing that it is plausible to assume that speakers have ac-
cess to such evidence.
2. The "no negative evidence" problem
There seems to be widespread agreement across theoretical frameworks
that the crucial task of language learners is to arrive at some representa-
tion of the general properties of a grammar, when all they have in terms
of evidence is a limited corpus of actually occurring utterances. The
main piece of evidence for the existence of such representations is the
fact that speakers seem to be able to classify sentences that they have
never heard before as grammatical or ungrammatical.
It is a reasonable assumption that speakers arrive at such representa-
tions by generating hypotheses about the general properties of the gram-
mar on the basis of the available input and then testing whether these hy-
potheses are true. For example, they might be faced with utterances such
as those in (1a, b) and (2a, b) (cf. Baker 1979; Bowerman 1988):
(1) a. Dad told a story to Sue.
b. Dad told Sue a story.
(2) a. I gave a book to John.
b. I gave John a book.
On the basis of these utterances, a learner might hypothesize that all
verbs that occur in the pattern [NP_subj V NP_obj to NP_obj] (the dative
construction) can also occur in the pattern [NP_subj V NP_obj NP_obj] (the
ditransitive construction). However, this hypothesis turns out to be overly
general, as the following judgments show:
(3) a. Dad said something nice to Sue.
b. *Dad said Sue something nice.
(4) a. Mary donated a book to the library.
b. *Mary donated the library a book.
The problem for the speaker is that they have no way of testing their
hypothesis: there is nothing in the input that tells them that (3b) and (4b)
are ungrammatical; there is no negative evidence. It is generally argued
that the fact that such sentences never occur cannot in itself provide such
evidence because the task of generalizing from a limited input to a grammar
potentially producing an unlimited output always requires going beyond
the input. In other words, grammaticality cannot be equated with
(likelihood of) actual occurrence (cf. Chomsky 1957: 15ff.).
It seems that usage-based cognitive grammar (cf. Langacker 1991,
2000) does not, as such, provide a solution to this problem. In the usage-
based model, grammaticality is taken to be a graded phenomenon. The
usage-based model assumes that linguistic knowledge is represented in
the form of linguistic units that emerge from recurrent usage-events.
Such units may differ in their degree of schematicity, ranging from fully
specified linguistic expressions over relatively concrete multi-morphemic
expressions with a single open slot to highly schematic configurations of
abstract linguistic categories. As these units emerge from and are maintained
by concrete usage-events, they may also differ in their degree of
entrenchment. As Langacker puts it:
Every use of a structure has a positive impact on its degree of entrenchment,
whereas extended periods of disuse have a negative impact. With repeated use, a
novel structure becomes progressively entrenched, to the point of becoming a unit;
moreover, units are variably entrenched depending on the frequency of their oc-
currence . . . (Langacker 1987: 59)
Once a unit with a particular degree of entrenchment and schematicity
is established in the system, it may serve to sanction further usage events
to a greater or lesser extent:
To the extent that a target structure accords with the conventional units in the
grammar, these units are said to sanction this usage. It is crucial to realize that
sanction is a matter of degree and speaker judgment. It is a measure of an
expression's well-formedness, i.e. how closely it conforms to linguistic
convention, in all its aspects and dimensions. (Langacker 1987: 66)
The mechanisms of entrenchment and sanction explain straightforwardly
how speakers are able to distinguish degrees of grammaticality
(or conventionality, in Langacker's terms) among those utterances and
utterance types that do occur. These degrees of grammaticality simply
reflect the degree to which an utterance conforms to one or more established
units and the degree to which these units are entrenched. However,
these mechanisms do not explain how speakers are able to distinguish degrees
of grammaticality among utterances and utterance types that do not occur.
If a particular configuration of linguistic elements is never instantiated,
speakers will not derive a schema corresponding to this configuration.
Essentially, this means that, when faced with an utterance that does
not correspond to an established unit, they should reject it categorically.
Of course, matters are slightly more complex: speakers do have the
option of comparing the utterance in question to schemas to which it
corresponds partially. This is referred to as partial sanction (Langacker
1987: 69). However, the mechanism of partial sanction is obviously
highly restricted, since many utterances will correspond partially to estab-
lished schemas but still be unacceptable. For example, (3b) and (4b) are
partially sanctioned by (and can be interpreted in relation to), respec-
tively, (1b), (2b) and the ditransitive schema, yet they are categorically
unacceptable to speakers of English. Thus, as it stands, the usage-
based model does not provide a solution to the no negative evidence
problem.
In the literature, a range of suggestions have been made across frameworks
as to how the language learner might get around the lack of negative
evidence. Many of these are discussed in detail in Bowerman (1988),
so I will summarize them here only briefly in the form of strategies
potentially available to the language learner:
(a) don't hypothesize rules that would require negative evidence to
prove them incorrect (innate constraints on possible rules, Baker
1979);
(b) construct extremely conservative grammars, either by choosing the
narrowest grammar compatible with the evidence from the set of
grammars offered by a UG (cf. Berwick 1985; Berwick and Weinberg
1984), or by abstracting only very concrete, low-level schemas
from the input (cf. Dabrowska 2000; Tomasello 2003; Lieven et al.
2003);
(c) if you have positive evidence that something can be expressed in a
particular way, assume that it cannot be expressed in other ways,
unless you have positive evidence for the other ways (preemption);
(d) form expectations about what to encounter in a particular context
and take the occurrence of other things in that context to be nega-
tive evidence (also a kind of preemption, Goldberg 1995).
The idea of innate constraints on possible rules is an attractive sugges-
tion in theory, but it shares the drawback of all innatist accounts: unless
the properties of a mechanism capable of constraining the learner to the
relevant class of rules can be described in detail and unless it can be
shown that such a mechanism actually exists in humans (or at least, that
it could have evolved in the history of our species), Baker's suggestion
remains purely speculative. Certainly, recent proposals on the nature of
Universal Grammar (Hauser, Chomsky and Fitch 2002) contain nothing
that could serve this function even in principle. This does not mean that
no appropriate mechanism will ever be found, but until it is, it seems a
plausible research strategy to focus on more tangible explanations.
The idea of narrowly constructed grammars may seem somewhat simplistic,
but it has been shown to go a long way towards explaining language
acquisition. However, it clearly cannot be the whole story: first, as
Bowerman (1988: 81) points out, overextensions do occur in child language
and speakers do at some point begin to use their grammar productively.
Second, it does not fully explain the ability of speakers to discriminate
between grammatical and ungrammatical sentences in cases where
something falls outside of the scope of their previous linguistic experience.
Like the idea of narrowly constructed grammars, the idea of preemption
is a simple but powerful mechanism that has been shown to have an
influence on the acquisition of grammatical constructions (Brooks and
Tomasello 1999; Brooks and Zizak 2002). But as in the case of narrowly
constructed grammars, it cannot provide a complete explanation, at least
in the case of grammar. It is quite plausible that a child, upon repeatedly
hearing an irregular past tense form like went, will take this as evidence
against the existence of the regular alternative goed. However, it seems
much less plausible that the same child, upon repeatedly hearing sen-
tences like Dad told a story to Sue will take this as evidence against the
alternative Dad told Sue a story: if they did, grammatical alternatives
could not persist in the language. In other words, the idea that the prepo-
sitional dative could preempt the ditransitive might appear to account for
the fact that (3b) and (4b) are unacceptable, but it also wrongly predicts
that (1a) and (2a) should be unacceptable.
Goldberg (1995: 29f.) puts forth an interesting extension of the notion
of preemption that avoids this problem. She observes that grammatical
alternatives, where they exist, are never completely synonymous, and
suggests that language learners could exploit this. For example, the
ditransitive and the dative differ in terms of their information-structural
properties (cf. Erteschik-Shir 1979, cf. also Gries 2003). Therefore,
when children hear the dative being used with a particular verb in an
information-structural context that calls for the ditransitive, they could
infer that the ditransitive is not an option for that verb. Goldberg's proposal
would be extremely interesting if it could be shown that, for example, the dative
occurs in inappropriate information-structural contexts with verbs that do
not alternate between the ditransitive and the dative. However, this lies
outside the focus of the present paper.
Instead, this paper will investigate an alternative extension of the pre-
emption account, namely the possibility that the degree of entrenchment
of a potentially preempting construction might play a role in providing in-
direct negative evidence. Note that in the case of irregular morphology,
the entrenchment of the preemptive form is typically maximal: the combi-
nation of go and past tense is realized as went one-hundred percent of
the time. In contrast, the entrenchment of verb-construction combina-
tions can vary quite widely. For example, tell may never occur in the di-
transitive construction, but that does not mean that it occurs in the dative
construction one-hundred percent of the time. Instead, it occurs in a
variety of constructions including transitives with or without additional
oblique arguments (e.g., Dad told a story, Dad told Sue about his own
childhood). Whether or not a combination of a particular verb and
construction is capable of preempting a semantic or functional alternative
might depend on how strongly it is statistically associated with the poten-
tially preempting construction.
Finally, this paper will discuss an additional source of negative evi-
dence that has not, to the best of my knowledge, received extensive treat-
ment in the previous literature (but cf. Briscoe and Copestake 1999: 26
for a related discussion). Formulated as a strategy, access to this source of
negative evidence can be characterized as follows:
(e) Form expectations about the frequency of co-occurrence of linguistic
features or elements on the basis of their individual frequency of oc-
currence and check these expectations against the actual frequency
of co-occurrence.
In the next section, I will first rephrase this strategy in terms of a
corpus-linguistic method and then briefly discuss it within the framework
of the usage-based model.
3. Negative evidence in corpus linguistics
As mentioned above, there is a long-standing assumption in linguistics,
beginning with Chomsky (1957), that corpus-linguistic methods do not
provide access to negative evidence and are therefore of no use, or at least
of very limited use, as a tool for theoretical linguistics. This assumption
is by no means restricted to researchers with a generally anti-empirical
mindset, but it is, somewhat surprisingly, shared by many corpus
linguists. For example, McEnery and Wilson's widely-used textbook
on corpus linguistics contains the following passage:
Without recourse to introspective judgments, how can ungrammatical utterances
be distinguished from ones that simply haven't occurred yet? If our finite corpus
does not contain the sentence:
*He shines Tony books.
how do we conclude that it is ungrammatical? [ . . . ] It is only by asking a native or
expert speaker of a language for their opinion of the grammaticality of a sentence
that we can hope to differentiate unseen but grammatical constructions from those
which are simply ungrammatical and unseen. (McEnery and Wilson 2001: 11–12)
In other words, the corpus linguist is assumed to be in a position simi-
lar to that of the child learning a language, although, unlike the child,
they can resort to an alternative source of information, acceptability
judgments.
While this may seem a plausible position at first glance, I have argued
elsewhere (Stefanowitsch 2005, 2006, cf. also 2007) that there is a solution
to the apparent problem of no negative evidence for the corpus linguist
that does not require recourse to acceptability judgments.
McEnery and Wilson's claim holds true only for individual linguistic
features or simplex elements: if a particular feature or element does not
occur in our corpus of a given language, there is no way of determining
whether it exists in that language or not.
This may well be a problem, for example, for the historical linguist try-
ing to pin down the exact date at which a particular word entered the lan-
guage. It is not, at least in principle, a problem for a theoretical linguist
trying to uncover the structural properties of a language. The theoretical
linguist is rarely concerned with the existence of individual features or
simplex elements, since their task is to formulate the principles according
to which such features or elements are combined.
In other words, they are concerned with the co-occurrence of features
or elements. As long as these elements occur in the corpus at all, their
individual frequencies, together with information about the size of the
corpus, can be used to calculate (or, in some cases, estimate) their expected
frequency of co-occurrence. This can then be compared to the actual
frequency of occurrence (whether it is zero or some other number) and
statistically evaluated to determine whether the difference between the
observed and the expected frequency is significant.
Consider the case of the dative shift discussed above, specifically
example (3b), repeated here as (5):
(5) *Dad said Sue something nice
The crucial claim expressed by the judgment of unacceptability on this
sentence is that say cannot take ditransitive complementation, or, put
differently, that the lexical construction say cannot co-occur with the
ditransitive construction. In this case, the acceptability judgment seems
uncontroversial, and therefore it does not come as a surprise that not a
single instance of say with ditransitive complementation occurs in the
one-million word British Component of the International Corpus of En-
glish (or, indeed, in the 100 million word British National Corpus).
But in fact we do not need an acceptability judgment to tell us that (5)
is not an acceptable sentence of English. Table 1 shows the frequencies
required to determine this purely on the basis of corpus data. The ditransitive
construction occurs 1,824 times and the verb say occurs 3,333 times.
The total number of verbs in the ICE-GB is 136,551. Thus, if there were
no particular relationship between say and the ditransitive construction,
we would expect the combination to occur 44.52 times in the corpus
(1,824 × 3,333 / 136,551); all expected frequencies are shown in parentheses
in Table 1. The difference between the observed frequency of zero and
the expected frequency can now be tested for significance using any
statistical test appropriate for contingency tables. For example, the
Fisher-Yates exact test yields a probability of error of p = 1.96E-20.¹ In other
words, if we reject the hypothesis that there is a chance relationship
between say and the ditransitive and assume instead that the non-occurrence
of this combination is non-accidental, there is a chance of
less than one in 5 quintillion that we are wrong. Extending the terminology
of collexeme analysis (Stefanowitsch and Gries 2003), we might refer
to say as a (significant) zero collexeme of the ditransitive construction.

Table 1. Ditransitive complementation and say in the ICE-GB
(expected frequencies in parentheses)
         Ditransitive        ¬Ditransitive            Total
say      0 (44.52)           3,333 (3,288.48)         3,333
¬say     1,824 (1,779.48)    131,394 (131,438.52)     133,218
Total    1,824               134,727                  136,551

1. While no particular claim is made here that the Fisher-Yates exact test is the best model
for the way in which speakers would calculate association strengths, it is the best test
available to the corpus linguist for reasons discussed in detail in Stefanowitsch and Gries
(2003) and Gries and Stefanowitsch (2004). In particular, it will not exaggerate the
significance of low frequency events in the way in which, for example, Mutual Information
will.
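The arithmetic behind Table 1 can be retraced in a few lines. The following is a minimal sketch in Python, not the procedure actually used for the study: it derives the expected frequency from the marginal totals and then computes a one-tailed Fisher-Yates probability of observing no more co-occurrences than were actually found. The function names are illustrative, and the choice of a one-tailed test is an assumption.

```python
from math import lgamma, exp

def log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def expected_frequency(verb_freq, cxn_freq, corpus_size):
    """Co-occurrence frequency expected if verb and construction are independent."""
    return verb_freq * cxn_freq / corpus_size

def fisher_lower_tail(observed, verb_freq, cxn_freq, corpus_size):
    """One-tailed Fisher-Yates p: probability of at most `observed`
    co-occurrences given the marginal totals (hypergeometric distribution)."""
    log_total = log_choose(corpus_size, verb_freq)
    p = 0.0
    for k in range(observed + 1):
        log_p = (log_choose(cxn_freq, k)
                 + log_choose(corpus_size - cxn_freq, verb_freq - k)
                 - log_total)
        p += exp(log_p)
    return p

# Figures for say and the ditransitive in the ICE-GB, taken from the text.
SAY, DITRANSITIVE, VERBS = 3333, 1824, 136551

expected = expected_frequency(SAY, DITRANSITIVE, VERBS)
p_value = fisher_lower_tail(0, SAY, DITRANSITIVE, VERBS)
print(f"expected = {expected:.2f}, p = {p_value:.2e}")
```

Working in log space avoids overflow in the very large binomial coefficients involved. Under these assumptions the result should agree closely with the expected frequency of 44.52 and the p-value of 1.96E-20 given in the text.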
Of course, a corpus-derived statement of significant non-occurrence
does not tell us why two (or more) features or elements cannot co-occur
any more than an intuition-based judgment of unacceptability does: in
both cases, it is up to linguistic theory to provide an explanation. Such
explanations are likely to take a variety of forms. Consider Table 2,
which shows the top twelve zero collexemes of the ditransitive (all those
whose absence is statistically significant at corrected levels of significance,
cf. Stefanowitsch 2006).²

Table 2. Top twelve zero collexemes of the ditransitive in the ICE-GB
Collexeme   F(Corpus)   FO(Ditr)   FE(Ditr)   p-value
be          25416       0          339.50     4.29E-165
be|have     6261        0          83.63      3.66E-038
have        4303        0          57.48      2.90E-026
think       3335        0          44.55      1.90E-020
say         3333        0          44.52      1.96E-020
know        2120        0          28.32      3.32E-013
see         1971        0          26.33      2.54E-012
go          1900        0          25.38      6.69E-012
want        1256        0          16.78      4.27E-008
use         1222        0          16.32      6.77E-008
come        1140        0          15.23      2.06E-007
look        1099        0          14.68      3.59E-007

In all cases except for say, the non-occurrence with the ditransitive
construction can be relatively straightforwardly accounted for in semantic
terms: it is simply not clear how any of these verbs should be interpreted
in the ditransitive. Say (and other verbs occurring further down the list,
such as put, believe, provide, produce, suggest, and describe), on the other
hand, would be perfectly interpretable in the ditransitive, since they
regularly occur with three arguments whose semantic roles match those of the
ditransitive construction. Here, phonological, syntactic or fine-grained
semantic constraints must be (and have been) posited to explain their
statistically significant absence. These will not concern us here, but see
Stefanowitsch (2007) for some discussion.

2. As discussed in more detail in Stefanowitsch (2006), both the verb and the construction
in question must have a certain minimal frequency of occurrence in order for this
approach to yield significant results. In a one-million word corpus like the ICE-GB, it
will only pick out a few dozen verbs as significantly non-occurring and leave the status
of many other non-occurring verbs uncertain. Even in a hundred-million word corpus
like the British National Corpus, there will be verbs and/or constructions that are not
frequent enough to determine whether the fact that they do not co-occur is due to
chance or not. However, while this might be a problem for the corpus linguist, it is not
a fundamental problem for the suggestion that language learners may use this kind of
negative evidence: it simply makes the prediction that judgments of unacceptability
should be less strong for infrequent verbs and/or constructions than for frequent ones.
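The point made in note 2, that a verb must be reasonably frequent before its absence from a construction can count as significant, can be made concrete. The sketch below searches for the smallest verb frequency at which zero co-occurrences with the ditransitive would reach significance in a corpus with the ICE-GB totals quoted above. It is an illustration only: the function names are hypothetical, and the uncorrected 5% level is an assumption, whereas Table 2 uses corrected levels, which would push the threshold higher.

```python
from math import lgamma, exp

def log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def p_zero(verb_freq, cxn_freq, corpus_size):
    """Probability that a verb never occurs in the construction by chance alone."""
    return exp(log_choose(corpus_size - cxn_freq, verb_freq)
               - log_choose(corpus_size, verb_freq))

def min_verb_frequency(cxn_freq, corpus_size, alpha=0.05):
    """Smallest verb frequency at which zero co-occurrences reach `alpha`."""
    freq = 1
    while p_zero(freq, cxn_freq, corpus_size) >= alpha:
        freq += 1
    return freq

# Ditransitive tokens and total verb tokens in the ICE-GB, from the text.
DITRANSITIVE, VERBS = 1824, 136551

threshold = min_verb_frequency(DITRANSITIVE, VERBS)
print(f"minimum verb frequency at alpha = 0.05: {threshold}")
```

On these assumptions a verb needs a little over two hundred occurrences before its complete absence from the ditransitive can be distinguished from chance; rarer verbs remain undecided, as the note predicts.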
Instead, let us briefly relate the corpus-linguistic method introduced
here to the usage-based model. I would suggest that we could conceptual-
ize the notion of entrenchment in terms of expectations of co-occurrence
rather than in terms of raw frequency of occurrence. If a corpus linguist
can use information about the individual occurrence of features and ele-
ments to predict their frequency of co-occurrence and compare the pre-
diction to the actual observations, then a speaker faced with linguistic in-
put should be able to do the same. It does not seem implausible to assume
that for any given configuration of linguistic categories, speakers are able
to (subconsciously) calculate a likelihood of occurrence based on the
known frequencies of occurrence of the individual categories. The higher
the frequency of occurrence of a particular configuration is with respect to
this baseline expectation, the more likely it is to become represented as a
unit in Langacker's sense. Assuming such a statistically-driven model
of entrenchment, the availability of negative evidence is a natural con-
sequence: the stronger an expectation of co-occurrence is, the more
noticeable its absence will be. The continued non-occurrence of a given
expected configuration of linguistic categories would thus lead to a
negative entrenchment of the combination of features in question. This
negative entrenchment could serve as direct negative evidence for con-
straints on rules or schemas. Since it is statistical rather than categorical
in nature, it would also explain why speakers seem to be able to assign
different degrees of unacceptability to sentences instantiating ungrammatical
combinations of features and elements.
4. Preemption, negative evidence and acceptability
4.1. Aims and methods
The aim of the pilot study reported in this section was to test, first,
whether speakers assign different degrees of acceptability to ungrammatical
sentences and second, and crucially, whether these degrees of acceptability
correlate with the degree of preemption discussed in Section 2 or
the degree of negative entrenchment discussed in Section 3, or both.
Twenty verbs were selected that seemed to be either categorically
banned from occurring in the ditransitive while occurring freely in the
dative, or vice versa. These twenty verbs show varying degrees of preemption
(i.e., association with a potentially preempting construction) and
negative entrenchment (i.e., degree of significant absence from the
construction), as shown in Table 3.
For each verb, a short text was constructed consisting of a sentence set-
ting up a context, followed by a grammatical target sentence containing
the verb in question in the construction in which it does occur. The
sentences were constructed to reflect typical uses of these verbs as determined
by browsing naturally occurring examples on .uk and .us web pages.
These were the control items. The experimental items were then con-
structed by simply switching the construction of the target sentence. A
full list of the stimuli is given in the Appendix.
Table 3. Degrees of preemption and negative entrenchment
PREEMPTION                          NEGATIVE ENTRENCHMENT
Rank  Verb        Coll. Strgth      Rank  Verb        Coll. Strgth
1.    win         4.81E-01          1.    devote      8.21E-01
2.    suggest     3.96E-01          2.    transfer    7.85E-01
3.    provide     9.69E-02          3.    address     6.49E-01
4.    reveal      4.72E-02          4.    issue       6.20E-01
5.    describe    1.23E-03          5.    reveal      6.11E-01
6.    admit       2.39E-05          6.    earn        4.97E-01
7.    wish        3.42E-06          7.    admit       4.90E-01
8.    mention     1.04E-06          8.    introduce   4.28E-01
9.    earn        2.31E-07          9.    relate      4.21E-01
10.   report      3.40E-08          10.   return      4.02E-01
11.   relate      2.21E-08          11.   report      3.38E-01
12.   issue       3.54E-10          12.   explain     2.77E-01
13.   allow       1.35E-10          13.   mention     2.33E-01
14.   address     1.14E-10          14.   describe    1.51E-01
15.   explain     5.36E-14          15.   suggest     1.49E-01
16.   return      6.08E-17          16.   wish        1.42E-01
17.   introduce   1.48E-17          17.   win         6.90E-02
18.   put         2.80E-20          18.   provide     5.85E-02
19.   transfer    1.62E-28          19.   allow       1.60E-02
20.   devote      5.08E-36          20.   put         2.96E-03

The acceptability judgments were collected on-line. Seventeen undergraduate
students at a US-American university participated in the
experiment. They logged in at their convenience and from a location of
their own choice. They then received written instructions that they would
be presented with a series of short texts containing one sentence in bold
type and that they should rate this sentence in terms of how acceptable
it sounded (see Appendix). The subjects were then presented with the
experimental items one by one, always followed by an unnumbered ten-point
scale ranging from fully acceptable to not at all acceptable.
There was no time limit and the subjects were able to change their ratings
until they were satisfied. They then submitted their judgment by clicking
a button that would bring up the next item. The items were presented to
each subject in one of four pseudo-random orders. After subjects had
rated all ungrammatical sentences, they were presented with the control
items in the same order to provide a control judgment.
4.2. Results
For three of the verbs, transfer, issue and provide, it turned out that the
ungrammatical version was rated as equally or more acceptable than
the grammatical version. This may be due to a dialectal difference between
British and American English, or the subjects may have been confused
by the existence of a third grammatical alternative with with (They
issued/provided him with a new passport) that shares the meaning and the
constituent order of the ditransitive. In the terminology of the usage-based
model, we could say that they received partial sanction from this
grammatical construction, causing subjects to misjudge their grammaticality.
These items were discarded from further analysis.³
Next, all items were discarded for each subject where the grammatical
version did not receive an acceptability rating of 1 (fully acceptable) or
2. This was intended to control for problems that subjects may have
had with the texts/sentences that were not due to the ungrammatical
combinations of verb and construction.⁴

3. One of the reviewers criticized the fact that these items were discarded from the analysis,
arguing that surely, partial sanction is going to be a factor in processing. Their point
is well taken, but since partial sanction was not controlled for systematically in this
study, there is nothing that can be said about this issue here. In the present study, the
items had to be discarded, as the method used here would have assigned negative
unacceptability scores to these items. Negative unacceptability is hard to interpret (it would
mean that these strings were more than acceptable) and it certainly does not appear to
capture any actual property of the strings in question. Thus, the influence of partial
sanction on judgments of unacceptability will have to be investigated separately in
future research.
For the remaining items, the acceptability rating of the grammatical
sentence was subtracted from that of the ungrammatical sentence. The
average of the resulting difference scores was then taken to be the final
acceptability score, shown in Table 4.
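The scoring procedure just described (discard a subject's item unless the grammatical control was rated 1 or 2, subtract the control rating from the rating of the ungrammatical version, then average) can be sketched as follows. The ratings in the example are invented purely for illustration and are not the study's data; the function name and the data layout are likewise assumptions.

```python
from statistics import mean, stdev

def unacceptability_scores(ratings, max_control=2):
    """ratings maps each verb to a list of (ungrammatical, grammatical)
    rating pairs, one pair per subject, on a scale where 1 means fully
    acceptable. Returns {verb: (mean difference, standard deviation)}."""
    scores = {}
    for verb, pairs in ratings.items():
        # Keep only subjects who accept the grammatical control sentence.
        diffs = [ungram - gram for ungram, gram in pairs if gram <= max_control]
        if len(diffs) >= 2:  # a standard deviation needs at least two values
            scores[verb] = (mean(diffs), stdev(diffs))
    return scores

# Hypothetical ratings, for illustration only.
ratings = {
    "earn": [(9, 1), (8, 2), (7, 1), (10, 6)],  # last pair: control rated 6, discarded
    "put": [(2, 1), (3, 1), (1, 1)],
}

scores = unacceptability_scores(ratings)
for verb, (avg, sd) in scores.items():
    print(f"{verb}: mean = {avg:.2f}, sd = {sd:.2f}")
```

Filtering on the control rating before taking the difference keeps a low-rated control sentence from making its ungrammatical counterpart look nearly acceptable.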
Given that they were rated on a ten point scale, the differences in the
acceptability scores are relatively small. Together with the relatively large
standard deviations, this means that the ranking cannot be taken too
literally.
Table 4. Acceptability scores
Rank Verb Average St. Dev.
1. earn 6.78 2.07
2. admit 6.75 2.65
3. win 6.56 1.81
4. devote 6.50 2.31
5. report 6.00 3.00
6. mention 5.39 3.09
7. relate 5.25 2.65
8. address 5.20 2.28
9. reveal 5.00 2.65
10. describe 4.72 2.60
11. introduce 4.56 3.28
12. explain 4.50 2.58
13. suggest 4.22 2.56
14. wish 4.11 2.53
15. allow 3.47 3.30
16. return 2.89 2.78
17. put 1.44 2.90
4. Discarding these items ensures that only those judgments remain that were given by
speakers for whom the theoretically grammatical strings are actually acceptable. In
other words, it controls for dialectal or idiolectal variation concerning the baseline
sentences. Clearly, if a speaker feels that a sentence like She returned his essay to him is not
(almost) fully grammatical, then any judgment they make about She returned him his
essay would be different in quality from the judgments of the majority of speakers of
English for whom the first sentence is fully grammatical. There is a second problem with
those stimuli for which the grammatical variant receives a low acceptability rating: in
these cases, the method used here creates the illusion that the ungrammatical versions
are close to acceptable: for example, if the grammatical version is rated 6 and the
ungrammatical version is rated 8, the difference score will be 2 (almost fully acceptable).
Keeping in mind this caveat, there is a large and statistically significant
correlation between acceptability judgments and negative evidence
(Spearman's rho = 0.53, p = 0.03, *). In contrast, there is no statistically
significant correlation between acceptability judgments and preemption
(Spearman's rho = 0.30, p = 0.23, n.s.).⁵
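Both coefficients can be recomputed from the published tables. The sketch below takes the verb orders of Table 4 and of the two columns of Table 3 (with transfer, issue and provide removed, as described above) and applies the textbook rank-difference formula for Spearman's rho, which is valid here because none of the rankings contains ties. Whether the original analysis was run in exactly this way is an assumption.

```python
# Verb orders read off Tables 3 and 4, first row first, after removing
# the three discarded verbs (transfer, issue, provide).
ACCEPTABILITY = ["earn", "admit", "win", "devote", "report", "mention",
                 "relate", "address", "reveal", "describe", "introduce",
                 "explain", "suggest", "wish", "allow", "return", "put"]
NEG_ENTRENCHMENT = ["devote", "address", "reveal", "earn", "admit",
                    "introduce", "relate", "return", "report", "explain",
                    "mention", "describe", "suggest", "wish", "win",
                    "allow", "put"]
PREEMPTION = ["win", "suggest", "reveal", "describe", "admit", "wish",
              "mention", "earn", "report", "relate", "allow", "address",
              "explain", "return", "introduce", "put", "devote"]

def spearman_rho(order_a, order_b):
    """Spearman's rho for two tie-free rankings of the same items."""
    assert sorted(order_a) == sorted(order_b), "rankings must cover the same items"
    rank_b = {item: rank for rank, item in enumerate(order_b, start=1)}
    n = len(order_a)
    d_squared = sum((rank_a - rank_b[item]) ** 2
                    for rank_a, item in enumerate(order_a, start=1))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

rho_neg = spearman_rho(ACCEPTABILITY, NEG_ENTRENCHMENT)
rho_pre = spearman_rho(ACCEPTABILITY, PREEMPTION)
print(f"negative entrenchment: rho = {rho_neg:.2f}")  # 0.53
print(f"preemption:            rho = {rho_pre:.2f}")  # 0.30
```

The output matches the coefficients reported in the text, which suggests that the correlations were computed over the 17 verbs remaining in Table 4.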
4.3. Discussion
The results, though preliminary, suggest that, at least for the combinability
of verbs with argument structure constructions, preemption does not
have an influence on acceptability judgments while negative entrenchment
does.
The first result is not too surprising as there are strong a priori arguments
against a role for preemption in grammar, but the results lend
additional weight to these arguments and show that preemption is an
unlikely source of negative evidence in the acquisition of grammatical
constructions even if it is reconceptualized in statistical rather than
categorical terms. This would not, of course, mean that preemption could not
play a role at linguistic levels that have a more categorical structure, such
as phonology and morphology, and it would not mean that more sophisticated
approaches to preemption could not play a role at the level of
syntax.

Figure 1. Regression lines for acceptability judgments and negative evidence (dashed line)
vs. preemption

5. One of the reviewers inquired whether the difference between the two correlation
coefficients was significant. This is not the case: the standard method used to assess the
significance of the difference between two correlation coefficients (cf., e.g., Blalock 1972:
406–407) yields a z-value of 0.74 (p = 0.46). However, this does not seem to me to be
crucially important: the main point of this paper was not to decide between preemption
and negative evidence but to assess independently for each of the two factors whether
they might play a role in shaping judgments of unacceptability.
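The comparison of the two correlation coefficients reported in note 5 can be retraced with Fisher's r-to-z transformation. The sketch below is an assumption about which "standard method" is meant: it uses the formula for two independent samples, although both coefficients here share the acceptability scores. The unrounded coefficients 0.5319 and 0.3039 are not given in the text; they are recomputed here from the ranks in Tables 3 and 4.

```python
from math import atanh, erfc, sqrt

def compare_correlations(r1, r2, n1, n2):
    """z statistic and two-tailed p for the difference between two
    correlation coefficients, via Fisher's r-to-z transformation
    (formula for independent samples)."""
    z = (atanh(r1) - atanh(r2)) / sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    p = erfc(abs(z) / sqrt(2))  # two-tailed probability under the normal curve
    return z, p

# Unrounded coefficients (assumed, recomputed from the tables) for the
# 17 verbs of Table 4; they round to the reported 0.53 and 0.30.
z, p = compare_correlations(0.5319, 0.3039, 17, 17)
print(f"z = {z:.2f}, p = {p:.2f}")  # z = 0.74, p = 0.46
```

That this reproduces the reported z-value of 0.74 and p-value of 0.46 suggests, without proving, that a calculation of this kind underlies the note.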
The second result, if substantiated in larger-scale follow-up studies,
would suggest that speakers do indeed make use of statistically significant
absences in determining constraints on grammatical constructions. Note
that the point here is not simply that significantly absent combinations
are judged as unacceptable but that degrees of significant absence
correlate with degrees of unacceptability.
5. General discussion
The notion of negative entrenchment deserves a place in the usage-based
model even if it were not needed to account for constraints on rules/
schemas. The fact alone that degrees of negative entrenchment correlate
with degrees of unacceptability would be sufficient to include it unless
some other usage-based mechanism could be shown to account for the
gradedness of acceptability judgments.
However, it seems plausible that negative entrenchment is not just in-
volved in creating a sense of unacceptability, but that it is actually used
by speakers in identifying constraints on schemas, i.e. in figuring out
ways in which a particular schema cannot be put to use. Speakers might
uncover certain semantic motivations for these constraints (for example,
the narrow-class rules suggested in some lexicalist approaches, e.g.,
Pinker 1989), but those semantic motivations are not necessary for learning
the constraint in the first place. In other words, negative entrenchment
is a mechanism that allows the learner to acquire both motivated and
arbitrary restrictions.
Thus, negative entrenchment would be a valuable addition to the usage-
based model, serving a function that no currently known mechanism can
fulfill. At the same time, it does not require a major reconceptualization of
the usage-based model, since negative entrenchment is simply an additional
aspect of the notion of entrenchment as discussed by Langacker. Thus, it is
interpretable relatively straightforwardly within the existing framework.
Received 22 November 2007 Universität Bremen, Germany
Revision received 30 January 2008
Appendix: Stimuli
The audience welcomed the speaker with a round of applause.
a) He addressed some opening remarks to them.
b) He addressed them some opening remarks.
Tony could not hide his terrible past from his wife any longer.
a) He admitted his crimes to her.
b) He admitted her his crimes.
Jack's team was losing the game when it was his turn at bat.
a) His home run allowed them a spectacular comeback.
b) His home run allowed a spectacular comeback to them.
As an eyewitness, Emma was questioned by the police.
a) She described the accident to them.
b) She described them the accident.
Richard was a huge fan of his local football team.
a) He devoted a whole website to them.
b) He devoted them a whole website.
Sara is a very successful author.
a) Her books have earned her a fortune.
b) Her books have earned a fortune to her.
The professor wanted the students to perform an experiment.
a) He explained the procedure to them.
b) He explained them the procedure.
John needed capital to develop his invention.
a) His agent introduced some investors to him.
b) His agent introduced him some investors.
The embassy was very helpful when Grace lost her purse.
a) They issued a new passport to her.
b) They issued her a new passport.
Emily's dad called to ask how things were going.
a) She mentioned her back pain to him.
b) She mentioned him her back pain.
The architect presented his plan for the new lecture hall.
a) The students provided valuable feedback to him.
b) The students provided him valuable feedback.
The famous singer made a rare appearance on a talk show.
a) Viewers were invited to put their questions to him.
b) Viewers were invited to put him their questions.
Olivia's friends wanted to know everything about her trip to Nepal.
a) She related her experiences to them.
b) She related them her experiences.
The committee asked Diane about her research.
a) She reported her results to them.
b) She reported them her results.
The teacher was very pleased with Ron's written work.
a) She returned his essay to him.
b) She returned him his essay.
The undercover agent needed police backup.
a) He revealed his true identity to them.
b) He revealed them his true identity.
Robert and Susan asked the hotel manager for a good place to eat.
a) He suggested an excellent restaurant to them.
b) He suggested them an excellent restaurant.
Laura called her parents from Italy because she needed money.
a) They transferred the money to her.
b) They transferred her the money.
The senator ran a very honest re-election campaign.
a) His honesty won him widespread support.
b) His honesty won widespread support to him.
Mary gave a farewell party before leaving for Italy.
a) Everyone wished her a safe journey.
b) Everyone wished a safe journey to her.
References
Baker, Charles L.
1979 Syntactic theory and the projection problem. Linguistic Inquiry 10, 533–581.
Berwick, Robert
1985 The Acquisition of Syntactic Knowledge. Cambridge, MA: MIT Press.
Berwick, Robert and Amy Weinberg
1984 The Grammatical Basis of Linguistic Performance. Language Use and Acquisition.
Cambridge, MA: MIT Press.
Blalock, Hubert
1972 Social Statistics. New York: McGraw-Hill, 406–407.
Bowerman, Melissa
1988 The 'no negative evidence' problem: how do children avoid constructing an
overly general grammar? In Hawkins, John A. (ed.), Explaining Language
Universals. Oxford and Cambridge, MA: Blackwell, 73–101.
Briscoe, Ted and Ann Copestake
1999 Lexical rules in constraint-based grammar. Computational Linguistics 25(4),
487–526.
Brooks, Patricia J. and Michael Tomasello
1999 How young children constrain their argument structure constructions.
Language 75, 720–738.
Brooks, Patricia J. and Otto Zizak
2002 Does preemption help children learn verb transitivity? Journal of Child
Language 29, 759–781.
Chomsky, Noam
1957 Syntactic Structures. The Hague: Mouton.
Dąbrowska, Ewa
2000 From formula to schema: The acquisition of English questions. Cognitive
Linguistics 11, 83–102.
Erteschik-Shir, Nomi
1979 Discourse constraints on dative movement. In Givón, Talmy (ed.), Discourse
and Syntax (Syntax and Semantics 12). New York: Academic Press,
441–467.
Goldberg, Adele
1995 Constructions. A Construction Grammar Approach to Argument Structure.
Chicago, IL and London: Chicago University Press.
Gries, Stefan Th.
2003 Towards a corpus-based identification of prototypical instances of
constructions. Annual Review of Cognitive Linguistics 1, 1–27.
Hauser, Marc D., Noam Chomsky, and W. Tecumseh Fitch
2002 The faculty of language: what is it, who has it, and how did it evolve?
Science 298, 1569–1579.
Langacker, Ronald W.
1987 Foundations of Cognitive Grammar. Volume 1: Theoretical Prerequisites.
Stanford: Stanford University Press.
1991 Concept, Image and Symbol. Berlin and New York: Mouton de Gruyter.
2000 A dynamic usage-based model. In: Barlow, Michael and Suzanne Kemmer
(eds.), Usage-Based Models of Language. Stanford: CSLI, 1–63.
Lieven, Elena, Heike Behrens, Jenny Speares, and Michael Tomasello
2003 Early syntactic creativity: a usage-based approach. Journal of Child
Language 30, 333–370.
McEnery, Tony and Andrew Wilson
2001 Corpus Linguistics. Second edition. Edinburgh University Press.
Pinker, Steven
1989 Learnability and cognition: the acquisition of argument structure. Cambridge,
MA: Harvard University Press.
Stefanowitsch, Anatol
2005 New York, Dayton (Ohio), and the raw frequency fallacy. Corpus Linguistics
and Linguistic Theory 1(2), 295–301.
Stefanowitsch, Anatol
2006 Negative evidence and the raw frequency fallacy. Corpus Linguistics and
Linguistic Theory 2(1), 61–77.
Stefanowitsch, Anatol
2007 Linguistics beyond grammaticality. Corpus Linguistics and Linguistic Theory
2(3), 57–71.
Stefanowitsch, Anatol, and Stefan Th. Gries
2003 Collostructions: investigating the interaction of words and constructions.
International Journal of Corpus Linguistics 8(2), 209–243.
Tomasello, Michael
2003 Constructing a Language: A Usage-based Theory of Language Acquisition.
Cambridge, MA and London: Harvard University Press.
Manner of motion saliency:
An inquiry into Italian
FILIPPO-ENRICO CARDINI*
Abstract
The present study reports the findings of an empirical investigation aimed at
testing how salient the domain of manner of motion is in Italian. Hitherto,
Italian has been assumed to be low-manner-salient simply on the grounds
that it is a Romance language. But, unlike for languages such as
French or Spanish, hardly any empirical evidence has been
produced that can prove the validity of that assumption. In this study, the
degree of manner of motion saliency in Italian has been investigated by
contrasting it to that exhibited by a typical high-manner-salient language,
namely English, which has been extensively studied from this point of view.
The investigation consisted of: (a) a dictionary-based lexical survey aimed
at comparing the number of Italian manner of motion verbs with that of
English; (b) experimental trials with Italian and English native speakers,
aimed at measuring the manner of motion salience manifest in their linguistic
behaviour. The experimental work with speakers involved three tests of
ease of lexical access, intended to establish how quickly speakers can
retrieve manner of motion verbs from memory, and one test on spontaneous
narration, designed to test the frequency with which manner of motion
verbs are used during free speech. The results of the study provide the
first empirical evidence that Italian does indeed show typical traits of
low-manner-salience.
Cognitive Linguistics 19–4 (2008), 533–569
DOI 10.1515/COGL.2008.021
0936–5907/08/0019–0533
© Walter de Gruyter
* Author's e-mail: <f.cardini@lancaster.ac.uk>. I wish first of all to thank Paul Chilton
and Jan McAllister for their helpful comments and suggestions. I also want to express
my gratitude to all the participants in my study, who took part without any economic
compensation. In particular, I would like to acknowledge the support I received from
the secondary schools that gave permission to test some of their pupils. These are:
Bungay High School (Bungay, Suffolk, England); Scuola Media Schiaparelli-Marconi
(Savigliano, Cuneo, Italy); Liceo Classico Giuseppe Mazzini (Genoa, Italy). Finally, I
thank the reviewers for their useful criticism.
Keywords: manner of motion semantics; manner of motion salience;
Italian.
Introduction
Talmy's (1985, 1991, 2000) bipartite typology of satellite-framed-languages
(or simply S-languages: e.g., Germanic languages) as opposed
to verb-framed-languages (or simply V-languages: e.g., Romance
languages) has, over recent years, triggered increasing research into
semantic domains such as that of manner of motion. When describing
some motion event, a Germanic S-language such as English is expected
to express manner of motion through the main verb of the clause,
whereas path of motion is indicated by means of a verb particle (a satellite
to that verb), as in 'The man ran (manner of motion) across (path
of motion) the street'. By contrast, a Romance V-language such as
Italian is expected to encode path information in the main verb, while
manner may be expressed at the end of the clause by means of an adverb,
gerundive form or prepositional phrase, as in 'L'uomo attraversò (path
of motion) la strada correndo (manner of motion)' ('The man crossed
[path of motion] the street running [manner of motion]').
Interestingly, the difference between the two types of expression seems
to produce effects on the ease with which manner information is delivered.
Slobin (e.g., 2003, 2004) has repeatedly observed that because the
encoding of manner through the satellite-framed construct does not
require any addition to the phrase, S-language speakers are facilitated and
encouraged to express manner information, at least when compared to
V-language speakers. The latter tend to provide such information only when
it is important to the context. Slobin further claims that this is ultimately
the reason why a considerable amount of empirical work has found the
semantic domain of manner of motion significantly more prominent in
S-languages than in V-languages. In particular, it has been shown that
S-language speakers exhibit a wider and more fine-grained array of manner
verb types in their everyday speech, and that the rate of use of such verbs
is remarkably higher than that displayed by V-language speakers. This
and other related phenomena have therefore led some linguists to call
S-languages high-manner-salient languages and V-languages
low-manner-salient languages.
With regard to Italian, however, very little research has been done in
this connection, and the alleged low-manner-salience is simply pre-
sumed on the grounds that it is a Romance language. An empirical
check on this presumption is particularly needed also because some
analyses made on the Italian lexicalisation patterns for the expression
of motion events have actually found that these do not tightly conform
to the pure verb-framed typology, and that Italian might in some respects
even be regarded as a mixed language. For example, in a brief
contrastive analysis made between French, Italian and German locative
adverbials, Schwarze (1985: 362) argues that '. . . , with regard to the
descriptions of change of location, Italian is as much Romance as it
is French, . . . , but it makes a more systematic use of an alternative
construct, the one we have called Germanic' (author's italics; my
translation). He notes that the Italian system of locative adverbials which can
combine with motion verbs (thus giving rise to a construction similar to
the satellite-framed one) suffers fewer restrictions when compared to the
French system: (i) French does not have any equivalent for via ('away');
(ii) many French adverbial forms are composite, whereas Italian forms are
simple (thus, probably, quicker to use); (iii) in general, the use of such
adverbials in combination with motion verbs is more limited; (iv) pleonastic
forms such as uscire fuori ('to exit out') are not acceptable in
French. Arguing along similar lines, Koch (2000: 109) claims that '. . .
Italian features among the verb-framed languages. One has to acknowledge,
however, that, differently from other Romance languages, Italian
undoubtedly shows satellite-framing tendencies . . .' (author's italics; my
translation). For example, Italian often seems to be rather comfortable
in employing the satellite-framed construct even in types of motion
events which typical V-languages are expected to express exclusively
with path information encoded in the main verb of the clause. This is
the case of so-called boundary-crossing-events, that is, motion events
involving the actions of entering, exiting or crossing (Slobin and Hoiting
1994: 498). Although prototypical V-languages are said to license the use
of a manner verb as a main verb of a clause only when no boundary-crossing
is predicated (Slobin 2004: 225), it is by no means a rare occurrence
to hear Italian speakers say sentences such as 'Corse fuori di casa'
('S/he ran out of the house') instead of 'Uscì di casa correndo' ('S/he
exited the house running'), or 'Salta in macchina!' ('Jump into the car!')
instead of 'Entra in macchina con un salto!' ('Enter the car with a
jump!').
In view of the above, the aim of this research is to collect empirical
data for Italian in order to clarify whether and to what extent it can actu-
ally be viewed as a low-manner-salient language. The role of English,
whose high-manner-salient character has clearly emerged in many contrastive
studies (e.g., see Naigles and Terrazas 1998; Naigles et al. 1998;
Özçalışkan and Slobin 1999; Slobin 2000), will be that of a control, giving
us a measure against which Italian can be contrasted.
The work reported here consists of two main parts: Study 1 and Study
2. Study 1 is a vocabulary survey carried out in order
to see how large the vocabulary of manner of motion verbs is in both English
and Italian. As a matter of fact, the size of a manner of motion vocabulary
appears to be one of the main indices of manner saliency for
some languages: 'In High-manner-salient languages there is a rich lexicon
of manner morphemes' (Slobin 2004: 251); '[S-languages] . . . have
developed large lexicons with many fine-grained distinctions of manner,
in comparison with smaller and less differentiated manner lexicons in
V-languages' (Slobin 2003: 163). The report of the findings is preceded by a
section that sets up criteria for identifying the
features a verb should have in order to be categorised as a manner of
motion verb. Such criteria are needed in order to count
manner verbs, but they also answer a fundamental research need: despite
the numerous studies concerning manner of motion verbs, the term
appears to have been used rather loosely in the literature, with no particular
concern about what it identifies precisely. The same point is made by
Zlatev and his coauthors (forthcoming).
Study 2 provides information about the experimental work conducted
on some English and Italian speakers on features related to speech production.
One kind of test which was carried out was measuring how
quickly speakers could retrieve manner of motion verbs from memory.
This sort of task has been already tried out by other researchers on, for
example, French vs. English (see Slobin 2003: 164–5) and Basque vs. English
(Ibarretxe-Antuñano 2004: 324). Those results were consistent in
showing that manner of motion concepts are less readily available to
speakers of V-languages. Another issue investigated in Study 2 is the
frequency with which manner of motion verbs are used by the two linguistic
groups in spontaneous speech. As already noted, this is a kind of
frequency with respect to which speakers of high- and low-manner salient
languages are supposed to show a significant difference: 'In
High-manner-salient languages, speakers regularly and easily provide information
about manner when describing motion events', whereas in
'Low-manner-salient languages manner information is only provided when
manner is foregrounded for some reason' (Slobin 2004: 251).
1. Study one: How many manner of motion verbs?
Previous inquiries into the size of the manner of motion domain in different
languages have shown that, in S-languages, the semantic domain of
manner of motion verbs is considerably larger and more fine-grained
than that displayed by V-languages. Thus, for example, a study of 115
English manner of motion verbs found only 79 French counterparts,
many of them of low-frequency use when compared to the English ones
(Jovanovic and Kentfield 1998). However, an analogous study conducted
on English vs. Russian showed that these two S-languages are equally
saturated on this dimension (Dukhovny and Kaushanskaya 1998). More
generally, Slobin (2004: 251) reports that: 'Work with dictionaries and
consultants . . . suggests that the Romance languages, Turkish and Hebrew
[V-languages] have no more than about 75 intransitive manner
verbs in regular use, whereas the Germanic and Slavic languages, Hungarian
and Mandarin [S-languages] have upwards of 150'. The first step
of the present research will be to see how the size of the Italian semantic
domain of manner of motion verbs compares to that of an S-language
like English.
To carry out a count of manner of motion verbs, criteria are needed to
determine what is and is not a manner of motion verb. It is necessary to
indicate which features a verb should exhibit if it is to be classified as one
of manner of motion. The features will mainly be semantic. As we will see,
the only clearly syntactic feature required is that the verb must be intransitive.
This is in order to conform to the only known criterion used in previous
vocabulary studies of this kind (see Slobin [2004: 251] quoted
above).
In seeking to identify those features able to characterise a verb as one
of manner of motion, the two dimensions of the term will be dealt with
separately: the dimension of motion will be examined rst. Then the dis-
cussion will turn to manner.
1.1. Required features for motion
This section will discuss what is here understood under the term motion,
but, also, what kind of motion will be of concern in this paper. To enter
the list of motion verbs used here, a verb will have to show all the follow-
ing features:
A) The verb meaning must clearly express a change of location in space
Motion implies a change of location in space of some entity. The occurrence
of this change is what distinguishes MOVE from BE/located, '. . .
the only two motive states, which are structurally distinguished by language'
(Talmy 2000: 25). Therefore, the first requirement for a verb to
be included in the list is that it must clearly express a change of location
in space. To ensure that this semantic component is a sufficiently salient
characteristic of the verb's meaning, some restrictive conditions will have
to be met.
Firstly, it will be required that change of location is expressed by the
verb root, and not acquired through adjoining verb particles. To give an
example, the phrasal verb make off ('leave hurriedly, especially in order
to avoid duty or punishment') contains a clear motive semantics only because
of the particle off; otherwise, the verb make taken in itself, though
certainly of dynamic character, does not really tell us about a change of
location in space.
Secondly, the motion component must receive the direct and primary
focus within the verb's semantics. That is, verbs which only indirectly signal
movement while primarily focusing on some other kind of semantic
content will not feature in the list. This is the case for verbs denoting
kinds of searching, pursuing, hunting, etc., where the main semantic concern
seems to fall on the purpose of the movement rather than on the
movement itself. For this reason the verb chase, for example, has been included
in the list by virtue of its meaning 'rush in a specific direction'
('He chased down the motorway'), and not by virtue of its predominant
meaning 'pursue in order to catch up with' ('The police chased the stolen
car through the city').
Thirdly, it will be required that the motive component is not confined
to the onset or offset of some described action. Thus, verbs that indicate
only the beginning of some movement will not be part of the list (e.g.,
incamminarsi ['start walking']). The same ban will apply to verbs indicating
the end point of some movement, such as verbs of collision (e.g.,
crash, cannon, cozzare ['bang into']). Verbs of collision mainly focus on
the violent nature of the impact rather than on the motion prior to it. In
fact, in Snell-Hornby's (1983: 80) categorisation of descriptive verbs, they
do not fall under movement, but under static verbs.
B) The verb meaning must involve translational movement
Since most studies on manner of motion (e.g., Naigles et al. 1998;
Özçalışkan and Slobin 1999; Papafragou et al. 2002) are based on the model of
motion event proposed by Talmy (2000: 25), the idea of motion offered
here will have to address the claims of that model. One of those claims is
that the motion verb must be of translational nature: 'The Motion component
refers to the occurrence (MOVE) or non-occurrence (BE_LOC)
specifically of translational motion' (author's emphasis) (Talmy 2000:
25). Talmy (1985: 141) distinguishes between two fundamentally different
kinds of motion: translational and self-contained. In the former, an object's
basic location shifts from one point to another in space. In the
latter, the object keeps to the same basic, or average location. So, for
example, a verb like kneel simply causes the kneeler to change his or her
posture, but not his or her basic position in space. By contrast, a movement
like that denoted by run can indeed shift the fundamental position
of the entity that is running in space (e.g., from one end of a corridor to
the other end), thus constituting a translational motion. In Sablayrolles'
(1995: 281–2) terms, the first of the two verbs, i.e., kneel, would be a
verb denoting Change of postures (CoPtu), while the second, i.e., run, a
verb denoting Change of position (CoPs). CoPtu verbs merely denote
a change of the relations between the parts of an entity. CoPs verbs, instead,
shift the whole entity from some part to another part of some location,
changing its position. Probably, one can safely equate Sablayrolles'
CoPtu verbs with Talmy's self-contained motions, and Sablayrolles' CoPs
verbs with Talmy's translational motions. It may also be possible to define
Talmy's notion of average location of an entity as the portion of space
lying within the reach of any part of that entity. This definition could
often be useful for establishing whether or not some verb of movement
denotes translation, although, at the same time, one should also be aware
of some of its limitations. One major weakness of the proposed definition
is that the notion of containment there expressed is based too much on
physical rather than on conceptual boundaries. The movement expressed
by the Italian dondolare ('rhythmically move back and forth in oscillation')
that someone can perform while being seated on a swing, for example,
presupposes that the subject on the swing constantly shifts his/her
physical average location as defined above (provided that the swinging
movement is sufficiently long). However, to claim that dondolare really
expresses translational motion is questionable because it leaves the moving
entity shifting its position within one and the same restricted portion
of space.
1
1. The English for dondolare is swing, for which an analogous observation can be made.
However, swing also has another meaning ('move by grasping a support from below
and leaping'), one in which some Figure moves from A to B following the same curving
trajectory expressed by dondolare, with the difference that it does not move back to A to
then start new cycles of the same back and forth movement. One can argue that, at least
conceptually, it is only this second meaning of swing that has a true translational character.
This is because, even in the case that the length of a motion from some A to some B
point were the same for an entity that swings (in its second meaning) and another that
dondola between those two points, only the second entity keeps finding itself in the same
range of locations (those comprised between A and B): this despite the fact that that
range may be wide enough to constantly shift its physical average position in space.
The reason why Talmy's theoretical frame of motion event requires the
nature of the motion verb to be translational can be readily understood if
one looks at the definition of such an event: 'The basic Motion event . . .
is analysed as having four components: besides Figure and Ground, there
are Path and Motion' (author's emphasis). The Figure is the moving entity;
the Ground is the locative reference object in relation to which the
Figure moves; Path is the direction followed by the Figure; Motion is
the presence per se of motion in the event (e.g., 'The dog [F] is running
[M] into [P] the building [G]'). The definition necessarily entails that the
presence in some entity of some motion per se is not sufficient to give rise
to a motion event (whether or not mannered): the movement of the Figure
must be able, if required, to refer to a Ground through some Path
information.
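The four components just listed can be laid out as a small data structure. The following Python sketch is only an illustration and not part of Talmy's model: the class name `MotionEvent`, its field names and the method `refers_to_ground` are hypothetical, and the check reduces the requirement that the Figure 'must be able, if required, to refer to a Ground' to the simple presence of Path and Ground in a given description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MotionEvent:
    """One described motion event, split into the four components.

    Field names are illustrative assumptions, not terminology from the source.
    """
    figure: str                    # the moving entity
    motion: str                    # the verb signalling the presence of motion
    path: Optional[str] = None     # direction followed by the Figure
    ground: Optional[str] = None   # reference object the Figure moves in relation to

    def refers_to_ground(self) -> bool:
        # Simplification: the description counts as a full motion event here
        # only when both a Path and a Ground are actually expressed.
        return self.path is not None and self.ground is not None


# 'The dog [F] is running [M] into [P] the building [G]'
DOG = MotionEvent(figure="the dog", motion="is running",
                  path="into", ground="the building")

# The same motion with no Path or Ground expressed.
BARE = MotionEvent(figure="the dog", motion="is running")

print(DOG.refers_to_ground())
print(BARE.refers_to_ground())
```

The sketch records what a particular sentence expresses; whether a verb could combine with Path information at all is a separate question that has to be judged verb by verb.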
Motion verbs that have the translation element, such as walk or crawl
for example, are perfectly able to meet such a requirement, as they can
readily combine with path adverbs specifying their direction in relation
to the Ground ('I walked across the street'; 'The baby crawled out of
the kitchen'). Such verbs will therefore be included in the list.
By contrast, some contained motion verbs such as those relating to vibrations,
for example, can hardly be used in motion event descriptions as
defined above. Sentences like 'I am quivering' or 'I am trembling'
(whose verbs certainly indicate movement; quiver: 'tremble or shake
with short rapid motions'; tremble: 'shake involuntarily, typically as
a result of anxiety, excitement or frailty') are perfectly all right, but a
sentence like 'I am quivering/trembling towards/into/etc. the room'
sounds unusual. It appears that the rigidly contained nature of these motion
verbs does not even allow possible combinations with path adverbs
for the formation of translational motion frames. Such verbs will not enter
the list.
There is a further type of motion verbs that perhaps could be described
as potentially translational, which are not strictly of translational character
themselves but which can easily acquire such a characteristic by conjoining
with path adverbs. This kind of verb will be included in the list
mainly because some of the most typical manner of motion verbs belong
to this category, and to exclude them would be rather odd. For example,
because of the observation made earlier about the conceptual boundaries
within which some motion possesses self-contained character, certain instances
of a verb like bounce/rimbalzare, namely those instances in which
some entity repeatedly hits a surface on the same spot by vertical oscillation,
cannot be strictly considered translational. The same applies to
jump/saltare, which does not always involve a motion outside conceptual
boundaries. However, both these verbs regularly appear in constructional
frames in which the element of translation is readily attained ('The ball is
bouncing towards the exit'/'La palla sta rimbalzando verso l'uscita';
'The cat jumped off the table'/'Il gatto saltò via dal tavolo').
C) The motion verb must be intransitive
The literature on manner of motion usually treats a good number of transitive
verbs as manner of motion verbs. As already mentioned, however,
the present investigation is confined to intransitive verbs in order to
conform to the only known criterion used by previous vocabulary studies
on S- and V-languages for manner of motion verb categorisation.
The criterion is applied rather stringently, as the ban on transitive
verbs also extends to composite constructs such as that in which certain
motion verbs combine with one's way (e.g., 'to wend one's way').
Note that verbs such as drive and climb, which are very often or predominantly
used transitively, feature in my list only because they do have
intransitive forms as well ('He drove home late'; 'She started to climb
out of the front seat').
The only exception that is made to the exclusion of transitive verbs
concerns reflexive forms made up of a transitive verb (e.g., to launch)
plus a reflexive pronoun (oneself). This form enables the transitive motive
action to fall onto the grammatical subject of the phrase itself (the Figure),
enabling it to perform a translational motion along a path. In English,
the reflexive pronoun is -self/selves ('I launched myself out of
bed'); in Italian it is the pronominal particle mi/ti/si/ci/vi ('Mi lanciai
fuori dal letto'), which, in the infinitive form, is attached to the end of
the verb in the form -si (lanciarsi).
2
The reason for this exception is
that Italian appears to make a widespread use of reflexive motion verbs
(according to this research on manner of motion verbs, their number
turns out to be about 15 percent of the total). To ignore them would
have probably meant to overlook a significant area of vocabulary habitually
used in conversation by Italians.
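Features A to C work as a conjunctive filter: a verb enters the list only if it passes all three. A minimal Python sketch of that filter follows, assuming that each verb has already been hand-coded for the relevant properties. The class `VerbEntry`, its field names and the sample codings are hypothetical illustrations of the criteria above and are not data from this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerbEntry:
    """Hand-coded properties of one verb (all field names are assumptions)."""
    lemma: str
    root_expresses_change_of_location: bool  # feature A, first condition
    motion_is_primary_focus: bool            # feature A, second condition
    limited_to_onset_or_endpoint: bool       # feature A, third condition
    translational: bool                      # feature B
    potentially_translational: bool          # feature B, e.g. bounce, jump
    has_intransitive_use: bool               # feature C
    reflexive_of_transitive: bool = False    # feature C, the one exception


def qualifies_as_motion_verb(verb: VerbEntry) -> bool:
    """Return True only if the verb satisfies features A, B and C together."""
    feature_a = (verb.root_expresses_change_of_location
                 and verb.motion_is_primary_focus
                 and not verb.limited_to_onset_or_endpoint)
    feature_b = verb.translational or verb.potentially_translational
    feature_c = verb.has_intransitive_use or verb.reflexive_of_transitive
    return feature_a and feature_b and feature_c


# Illustrative codings, following the examples discussed in the text.
WALK = VerbEntry("walk", True, True, False, True, False, True)
BOUNCE = VerbEntry("bounce", True, True, False, False, True, True)
LANCIARSI = VerbEntry("lanciarsi", True, True, False, True, False, False,
                      reflexive_of_transitive=True)
MAKE = VerbEntry("make", False, False, False, False, False, True)
INCAMMINARSI = VerbEntry("incamminarsi", True, True, True, True, False, True)
KNEEL = VerbEntry("kneel", False, True, False, False, False, True)

for entry in (WALK, BOUNCE, LANCIARSI, MAKE, INCAMMINARSI, KNEEL):
    print(entry.lemma, qualifies_as_motion_verb(entry))
```

Keeping each condition as a separate field makes it possible to see which requirement excludes a given verb, for instance the onset restriction in the case of incamminarsi.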
1.2. Required features for manner
We now need to consider which kinds of features actually characterize
how some motion occurs. For a motion verb to count as mannered,
it must carry at least one of the following kinds of semantic
information:
2. Not all those Italian verbs ending with the suffix -si in the infinitive are reflexive forms of
transitive verbs like the above lanciarsi (trans. lanciare + -si). Some other verbs such
as inerpicarsi ('clamber'), for example, are actually intransitive motion verbs which exist
integrated with the pronominal particle -si. Such verbs, called intransitive pronominal
verbs, do not have any corresponding transitive form.
A) Information about aspects of motion directly referring to input mate-
rial perceived by our senses
A1) Information about some fundamental movements which can be
performed both by animate and inanimate entities interacting with
surfaces during translation
There are some general kinds of movements within which almost any
type of more specic motion performed on some surface seems to fall.
At least three of these movements can be identied: oscillation, rotation,
continuous friction. Depending on the particular physical characteristics
of the moving entity and of the underlying surface (e.g., shape, material),
any body moving on a surface will show one of these three fundamental
manners of motion: round-shaped bodies will best perform rotation;
highly elastic materials will increase the likelihood of oscillation by the
moving body (i.e., bouncing); smooth or slippery surfaces will increase
that of continuous friction.
Translational or potentially translational motion verbs that best ex-
press such general kinds of movements are, respectively, roll/rotolare,
bounce/rimbalzare, slide/scivolare, which can therefore be viewed as typi-
cal representatives of this kind of manner information. Notice, however,
that other sorts of verbs that highlight more specic features of motion on
surfaces (therefore listed later under different kinds of manner information)
are nothing more than particular instances of such general kinds of
manners. Some kinds of jumping can actually be viewed as a vertical os-
cillation; cycle involves rotation; ski/sciare and skate/pattinare involve
continuous friction.
A2) Information about kinds of body movements peculiar to some living
entities, performed during translation
Typical of this type of information are body movements propelling humans
and animals into such a translation, e.g., walk/camminare ('move
. . . by lifting and setting down each foot in turn, never having both feet
off the ground at once'); trot/trottare ('[horse or quadruped] proceed or
cause to proceed at a pace faster than a walk, lifting each diagonal pair of
legs alternately').
Still belonging to this type of information are body movements which
simply add complexity to some translational event. For example, in a
verb such as the Italian sculettare ('wiggle one's hips and bottom while
walking'), the self-contained body movement consisting in the oscillation
of hips and bottom does not provide the element of translation to the
global motion performed by the Figure (the translation is provided by
the walking movements), but it does load that motion with further man-
ner characterisation.
A3) Information about particular trajectories traced in space by some en-
tity during translation
Examples of such semantic information can be found in zigzag ('have or
move along in a zigzag course' [zigzag: 'a line or course having abrupt
alternate right and left turns']); arc ('move with a curving trajectory').
This kind of information regarding trajectory might raise some problems,
since it could be easily identified with the notion of path as this is usually
understood in the literature of motion events. The latter is traditionally
kept distinct from manner (e.g., Slobin 2000: 109; Talmy 1985: 62–72).
In this respect, it is important to note that the term trajectory used above
does not stand for what is usually understood by the term path. In the present
argument, path denotes the fundamental direction followed by some
moving entity independently of details regarding possible patterned shifts
of position in space on the part of that moving entity. It is to handle such
path details that the term trajectory is adopted here. To see the difference
between the two concepts we can compare some path verbs not providing
any manner information against what might be called trajectory
verbs providing manner information. Verbs like enter, exit, cross, ascend,
descend, advance, retreat, orbit, etc. are path verbs since they inform about
the basic direction followed by the Figure.
3
In fact, in each of them, the
idea of motion is conated with the semantics of directional adverbs (in
the above verbs with into, out of, across, up, down, forward, back, around,
respectively) so that any possible combination between a path verb and a
path adverb (e.g., ?to advance forward) would result in a redundant
form. This is not true for the trajectory verbs as dened here, whose seman-
tics does not contain any clue about the fundamental direction followed
by the Figure. In fact, when used in conjunction with path adverbs
(The ball arced across the room) no redundant form arises: the combi-
nation will express both the basic direction (through the path adverb) and
the particular trajectory (through the main verb) followed by the Figure.
A4) Information about vehicles used for translational motion, or about
actions required for propelling them into such a motion
3. Here, basic direction denotes the capacity of some verb to provide some approximate
information as to where some entity is going. A verb like ascend (= 'go up') does not tell
us precisely where the ascending entity is going; however, it tells us at least that the en-
tity will not reach any point in space below, behind, in front of the entity, etc., and that
it will instead reach some points in space located above it.
Examples of the first kind are: cycle ('ride a bicycle'); canoe ('travel in
or paddle a canoe'). Examples of the second kind are: pedal ('move by
working the pedals of a bicycle'); paddle ('move through the water in a
boat using a paddle or paddles'). In some instances the vehicle used for
motion remains unspecified, left to be inferred through context: ride
('travel on horse or other animal; travel in or on a vehicle').
A5) Information about particular sounds associated with translational
motion
Information of this kind can be found, for example, in verbs such as rattle
('[of a vehicle and its occupants] move or travel with a knocking
sound'); whistle ('produce a high-pitched sound by moving rapidly
through the air or a narrow opening'); zoccolare ('make noise with one's
clogs while walking').
B) Information about aspects of motion evoking fundamental concepts
The definitions of many verbs found in this research suggest the existence
of some fundamental concepts which can be evoked by particular motions (note 4).
In many motion verbs the manner component appears to be ultimately
related to such concepts. It is proposed that the following concepts
are among those involved in a certain type of manner of motion:
B1) SPEED
fast: zoom ('move or travel very quickly'); hurtle ('move or cause to move at high speed'); filare ('move at high speed').
slow: drift ('be carried slowly by a current of air or water').
B2) ENERGY/FORCE
forceful, violent: barge ('move forcefully or roughly'); prorompere ('come out with vehemence, violence').
weak, feeble: totter ('move in a feeble or unsteady way').
4. The term fundamental refers here to concepts which might be universal to all humans.
This is not to say that all cultures must have linguistic labels for them. Although some
scholars indeed claim that the existence of any presumed universal concept can be dem-
onstrated only by showing that all languages have a label for it (Wierzbicka 1999), some
studies actually claim to have proved the existence of universal categories which are
not coded linguistically by all cultures (e.g., Berlin and Kay 1969). Thus, although
none of the proposed fundamental concepts actually features in the table of universal
lexical items so far identified (Goddard and Wierzbicka 2002), they are presented here
nonetheless.
B3) WEIGHT
heavy: trundle ('[with reference to a wheeled vehicle or its occupants] move or cause to move slowly and heavily').
light: trip ('walk, run or dance with quick light steps').
B4) EFFORT
easy, effortless: coast ('[of a person or vehicle] move easily without using power'); fluire ('flow easily and in abundance').
difficult, laborious: clamber ('climb or move in an awkward and laborious way, typically using both hands and feet'); arrancare ('to advance with effort').
B5) CONTINUITY
continuous, steady: flow ('[of a liquid, gas or electricity] move steadily and continuously in a current or stream'); pace ('walk at steady speed').
abrupt, jerky: joggle ('move or cause to move with repeated small bobs or jerks').
B6) HARMONY
elegant, co-ordinated: ballare ('to perform co-ordinated movements following the rhythm of music or of singing').
clumsy, awkward: lollop ('move in an ungainly way in a series of clumsy paces or bounds').
B7) STEADINESS
controlled, steady: march ('walk in a military manner, with a regular, measured tread').
uncontrolled: stagger ('walk or move unsteadily as if about to fall'); barcollare ('[of people or things] move unsteadily, as if about to fall, swaying from one side to the other').
C) Information about aspects of motion evoking emotional states
Many motion verbs are loaded with certain different emotional characterisations.
They seem to indicate the attitude underlying the motion of
some entity. Without attempting to come up with an inventory of such
attitudes as was done for fundamental concepts in information type B,
some examples can be given: HASTE (hurry/affrettarsi: 'move or act
with great haste'); FEAR (sneak/sgattaiolare: 'move or go in a furtive
stealthy manner'); CONFIDENCE/ARROGANCE (swagger: 'walk
or behave in a very confident and arrogant or self-important way');
CALM/RELAXATION (stroll/passeggiare: 'walk in a leisurely way');
GAIETY (caper: 'skip or dance about in a lively or playful way'; gambol:
'run or jump about playfully').
Contrary to the information belonging to the previous category, this
one is usually to be found in connection with living entities only.
This is because emotional states can only exist in entities with a psychological
reality. Both animals and non-living entities can move with speed,
energy, etc., but it is only animals that can be in a hurry, be fearful, angry,
confident, relaxed, etc. Bullets move fast and whiz past, for example,
but they do not rush or hurry. To some extent, the same can be said for
vehicles such as cars or trains when travelling at high speed (note 5).
The important observation that must be made after having outlined the
three fundamental kinds of information providing a motion verb with
manner content is that the picture is not as rigid as it might at first appear.
Manner of motion verbs very rarely fall into only one of the proposed
categories of manner information, but normally contain different
types of manner information combined together. Thus, the picture is a
very flexible one. Although we can have motion verbs whose semantics
involves only one type of manner information (e.g., the previously mentioned
hurtle [B1]), the majority of manner of motion verbs encapsulate
several kinds of manner information. In this respect, many of the verbs offered
as examples of our manner categories should not be viewed as verbs
exclusively containing the type of manner they have been put in relation
to, but simply as predominantly focusing on that kind of manner. For
example, the semantics of a verb like run, which was classified as information
type A2, also carries some information related to the B1 type, as it
is somehow associated with the idea of a quick motion, at least when
compared with walk. A verb like stomp ('tread heavily, noisily, typically
in order to show anger') seems to combine types A2 (visual perception),
B3 (fundamental concepts), A5 (hearing perception), C (emotional
characterisations) (note 6).
5. However, vehicles can often acquire emotional characterisations transferred by metonymy
from the animate entity co-involved with their motion (i.e., their human driver). For
example, it is possible to hear of a car rushing past.
6. The coming together of such very different types of representation into one and the same
concept is somehow reminiscent of the observations made by Damasio and Damasio
(1992) about the word as a kind of convergence zone which ties together aspects of
thought that may be stored separately in different areas of the brain.
Along with what were deemed to be the three principal types of infor-
mation which can provide an element of manner to motion verbs, there
are other types of information some might want to add to the list, but
which, on close scrutiny, do not seem to possess the necessary require-
ments. These kinds of information are:
a) Information about a particular location (more or less specied) with
which the motion expressed by the verb is somehow connected
This may be the case for verbs like slot ('be placed or able to be
placed into a long narrow aperture'), scollinare ('to cross hills'). It
may sound obvious that locative information relates to the question
where, and not how, so that it should not have anything to do with
manner. If so, however, one could argue that information type A4
(vehicles for motion) should not have been included in the list of
manner information types either, since vehicles can equally be viewed
as a particular location to which some motion is connected. The
objection has a point: travel in a canoe (to canoe), certainly gives in-
formation about where some motion occurs. However, locations rep-
resented by vehicles are of a special kind in that they move together
with the Figure. They enable and give rise to the motion, which, as a
result, becomes dependent on and closely connected to them. This is
shown by the fact that many vehicles inevitably bring along features
of motion inherent to their own nature. Motions performed on skis or
skates will display sliding features, for example. Motions in/on any
wheeled vehicle will show rolling features: those performed on large
wheeled vehicles (e.g., lorries) will show heavier and less agile features
than those performed on small ones (e.g., scooters). By contrast, locations
that do not move together with the Figure can only rarely affect
its motion with any kind of manner. An exception was made for the
intransitive use of squeeze ('manage to get into or through a narrow
or restricted space') and the intransitive use of squash ('make one's
way into a small or restricted space'), which strongly evoke the contained
movements possibly made by the Figure trying to make itself
smaller in order to get in or through a narrow space.
Information about locations includes physical places constituting
the aim of some motion (note 7). Locative aim can probably be pursued
7. More abstract kinds of aim coincide with the notion of purposes, discussed above in the
section dedicated to motion. Just as locative information cannot be viewed as an element
of manner because it is informative not of how but rather of where some motion takes
place, in the same way purposes cannot be viewed as an element of manner because they
are informative not of how but rather of why some motion takes place.
only by living entities with attributed intentionality. So, a verb like
disgorge ('[of a river] empty into a sea') does not express the idea
of locative aim, whereas verbs like rincasare ('go back home') or to
earth ('run [of a fox] to its underground lair') do. Amongst this
latter kind of verbs one should also list those that indicate a lack of
aim such as roam, rove, etc. (and the very similar Italian errare, vagare,
vagabondare, etc.). Of this type of verbs, only those which also
contain elements of manner discussed earlier were included in the list.
Thus, a verb like wander ('walk or move in a leisurely or aimless
way'), does feature in the record of verbs by virtue of the leisurely
attitude (= unhurried, relaxed) with which the motion can occur
(manner type C). Likewise, the Italian girellare ('to go about lazily
and aimlessly here and there') was listed because of the indolent attitude
characterising the motion (still manner type C). By contrast, a
verb like roam ('move about or travel aimlessly or unsystematically
especially over a wide area'), which only seems to indicate the
aimlessness of the motion, has not been included; nor has the
Italian errare ('go here and there without any precise aim').
b) Information about some particular character of the moving entity
rather than of the motion itself
This may be the case of the verb swarm ('move somewhere in large
numbers'), or flock ('move or go together in a crowd') where, if
some manner can be spotted at all, this may reside in the particular
quantity/numbers of the moving entity/entities: as to how the motion
itself occurs, no clue is actually offered.
Information regarding the moving entity rather than its motion
may also refer to its particular form, and not quantity. Take drip
('[of liquid] fall in small drops'), and gocciolare ('to exit in small
drops') for example: although the information 'in small drops' could
well answer the question 'how is the water falling/exiting?', the how
does not seem to refer to the motion of the water but rather to the
water itself, namely to its shape.
One last consideration regards what can be described as low manner
content motion verbs, that is, motion verbs showing an almost negligible
load of manner of whichever kind this may be. Probably, the most notable
case concerns fall (note 8). This verb seems to contain a manner component
8. I only refer to those instances of fall involving translational motion (e.g., free fall
of some entity through the air: I fell down from the cliff), not to those involving
self-contained motion (e.g., the sudden loss of the erect position: I fell down onto the
floor).
in that the motion is 'typically rapid and without control' (Oxford Dictionary
of English, second edition, 2003). At the same time, this component
must be very weak. The Shorter Oxford English Dictionary, third
edition, 1973, does not report this aspect of manner: 'to descend (primarily
by gravity); to drop from a high or relatively high position; to
descend, to sink, to decline'. The same observation about low manner
load can be made for the Italian equivalent cadere. Its concept of descent
is neutral even with regard to the speed involved in the motion, which
does not have to be necessarily rapid; the only characterisation regards
the absence of any support for the falling entity: 'go from a high to a
low position without any prop, either slowly or rapidly' (Lo Zingarelli,
Vocabolario della lingua italiana, dodicesima edizione, 1999).
1.3. Dictionaries used and related issues
Two monolingual dictionaries (Oxford Dictionary of English, second edition,
2003; Lo Zingarelli, Vocabolario della lingua italiana, dodicesima
edizione, 1999), and one bilingual dictionary (Il Ragazzini, dizionario
inglese/italiano–italiano/inglese, terza edizione, 1995) were used for the
research. Some further help was then provided by Grande Dizionario della
Lingua Italiana, 1970, by the Shorter Oxford English Dictionary, third
edition, 1973, and by the Oxford Paravia, il dizionario inglese/italiano–
italiano/inglese, 2001.
Manner of motion verbs were first selected from the two monolingual
dictionaries which provided a source for definitions (by contrast, bilingual
dictionaries predominantly offer only a translation of some lexeme into
the closest one available in the other language). It was by looking at these
dictionary definitions that the semantics of verbs could be checked against
the criteria previously set.
In order to ensure a good degree of balance across the vocabulary size
of the two languages investigated, a bilingual dictionary was also used,
since bilingual dictionaries must necessarily provide such a balance. After
all those verbs deemed to possess manner of motion semantics had been
identified in the two monolingual dictionaries, only those items from that
subset which were also found in the bilingual dictionary (smaller in size
than the two monolingual dictionaries) were selected.
Although the size of the bilingual dictionary used was large enough to
ensure a good degree of validity to the research (74,500 entries for English,
63,500 for Italian), the fact that, at the same time, it was smaller
than that of the monolingual dictionaries may have helped identify those
items whose frequency of use among speakers is not negligible. To the
same end, terms marked by the dictionaries as archaic, poetic, literary
and regional were not selected. With regard to English regional forms,
only those terms to be found in all English speaking countries should
therefore have been included in the list (e.g., British English, North
American English, Australian English, etc.). However, since further
work with native speakers (reported in the second part of the study) involved
British subjects only, those lexical items that the dictionary labelled
as specific to British English were accepted too. By contrast, terms
specific to any other kind of English were excluded.
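The selection steps described in this section can be read as a sequence of filters. The following is only a minimal sketch of that reading: the `Entry` record, its field names and the sample entries (including their labels and flags) are invented for illustration and do not reproduce the actual dictionary data.

```python
from dataclasses import dataclass, field
from typing import Optional

# Register labels that led to exclusion according to the text.
EXCLUDED_LABELS = {"archaic", "poetic", "literary"}


@dataclass
class Entry:
    lemma: str
    meets_criteria: bool            # manner of motion semantics per 1.1 and 1.2
    in_bilingual: bool              # also listed in the smaller bilingual dictionary
    labels: set = field(default_factory=set)  # register labels from the dictionary
    region: Optional[str] = None    # regional label, None if unrestricted


def select(entries, accepted_regions=frozenset()):
    """Keep only entries that pass every step of the selection procedure."""
    kept = []
    for e in entries:
        if not e.meets_criteria:
            continue
        if not e.in_bilingual:
            continue
        if e.labels & EXCLUDED_LABELS:
            continue
        if e.region is not None and e.region not in accepted_regions:
            continue
        kept.append(e)
    return kept


# Hypothetical records: flags and labels are assumptions, not dictionary facts.
sample = [
    Entry("stroll", True, True),
    Entry("roam", False, True),                     # aimlessness only, no manner
    Entry("hie", True, True, labels={"archaic"}),
    Entry("lollop", True, True, region="British"),
    Entry("mosey", True, True, region="North American"),
    Entry("sashay", True, False),
]
selected = [e.lemma for e in select(sample, accepted_regions={"British"})]
print(selected)
```

For the Italian list the same function would be called with no accepted regions, since all regional terms were excluded there.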
1.4. Results and discussion
Following the criteria set in 1.1. and 1.2., 251 English and 138 Italian
manner of motion verbs (i.e., Italian size = 55 percent of English
size) were found. The difference is strongly significant (chi-square test:
χ² = 17.40; p < 0.0001), (test of comparison of proportions: z = 4.25;
p < 0.001). Interestingly, the proportion between the numbers of the two
languages here investigated is very close to that (roughly 2 to 1) regularly
found between S- and V-languages in general (see quote in section 1.,
p. 5). The list of verbs can be found in the Appendix. To check the reliability
of the classification system, the verbs were categorised by an independent
bilingual rater who had no knowledge of the research question.
The independent rater was asked first to read the criteria used in this
paper, and then to indicate which verbs of the list met such criteria. For
each verb, the rater was provided with: (a) the definition the researcher
had found for that verb in the dictionaries used; (b) one example of its
use. When the rater did not regard some verb as one of manner of motion,
he was also asked to indicate whether the problem lay in the motion
criteria or in the manner ones. The results of the inter-rater reliability test can
be seen in Table 1 below. Instances of disagreement mostly involved motion
criteria. Cohen's Kappa could not be calculated: the Kappa is
never computable when one of the raters (in this case, the researcher) categorises
all items of a sample as belonging to one and the same category
(in this case, manner of motion verbs).
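The text does not spell out how the two significance tests were set up. One reading that comes out within rounding of the reported values is sketched below, under the assumption that the verb counts were compared against the sizes of the bilingual dictionary given in 1.3 (74,500 English and 63,500 Italian entries): a goodness-of-fit chi-square with expected counts proportional to dictionary size, and a two-proportion z statistic with an unpooled standard error. This is a reconstruction, not necessarily the computation actually performed.

```python
import math

# Counts of manner of motion verbs reported in the text.
verbs = {"English": 251, "Italian": 138}
# Bilingual dictionary sizes from section 1.3 (assumed to be the baseline).
entries = {"English": 74_500, "Italian": 63_500}

total_verbs = sum(verbs.values())
total_entries = sum(entries.values())

# Goodness-of-fit chi-square: expected counts proportional to dictionary size.
chi2 = 0.0
for lang in verbs:
    expected = total_verbs * entries[lang] / total_entries
    chi2 += (verbs[lang] - expected) ** 2 / expected

# Two-proportion z statistic with an unpooled standard error.
p_en = verbs["English"] / entries["English"]
p_it = verbs["Italian"] / entries["Italian"]
se = math.sqrt(p_en * (1 - p_en) / entries["English"]
               + p_it * (1 - p_it) / entries["Italian"])
z = (p_en - p_it) / se

# Italian list size as a share of the English one.
ratio = verbs["Italian"] / verbs["English"]

print(f"ratio = {ratio:.2f}, chi2 = {chi2:.2f}, z = {z:.2f}")
```

A plain comparison of 251 against 138 with equal expected counts would give a much larger chi-square, which is why the dictionary-size baseline is assumed here.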
Table 1. Results of the inter-rater reliability test
                                                    Rater 1    Rater 2    Agreement rate
Number of verbs found to meet motion criteria       389/389    363/389    93.3 percent
Number of verbs found to meet manner criteria       389/389    386/389    99.2 percent
Number of verbs found to meet both motion
and manner criteria                                 389/389    360/389    92.5 percent
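The agreement rates in Table 1 follow directly from the counts, and the same counts show why Kappa is of no use here. The sketch below applies the textbook formula κ = (p_o − p_e)/(1 − p_e); the function and variable names are illustrative. Because Rater 1 accepted every verb, chance agreement p_e equals observed agreement p_o, so the numerator is zero whatever the second rater does (and the ratio is undefined if both raters are constant).

```python
from fractions import Fraction

N = 389  # verbs judged by both raters
# Verbs accepted by the independent rater (Rater 2), from Table 1.
rater2_yes = {"motion": 363, "manner": 386, "both": 360}


def agreement_and_kappa(yes2, n=N):
    """Observed agreement and kappa when Rater 1 accepted all n items."""
    p_o = Fraction(yes2, n)                   # agreement = Rater 2 acceptance rate
    p1_yes, p1_no = Fraction(1), Fraction(0)  # Rater 1 marginals (constant)
    p2_yes, p2_no = Fraction(yes2, n), Fraction(n - yes2, n)
    p_e = p1_yes * p2_yes + p1_no * p2_no     # chance agreement
    kappa = (p_o - p_e) / (1 - p_e) if p_e != 1 else None
    return p_o, kappa


results = {name: agreement_and_kappa(yes2) for name, yes2 in rater2_yes.items()}
for name, (p_o, kappa) in results.items():
    print(f"{name}: agreement = {float(p_o):.1%}, kappa = {kappa}")
```

Exact fractions are used so that the zero numerator is not blurred by floating-point rounding.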
This result could have been refined by an inquiry into the frequency of
use in the population of the listed verbs (note 9). However, the data available in
various frequency lists (e.g., Brown Verbal Frequency for English; Lessico
di frequenza dell'italiano parlato [De Mauro et al. 1993] for Italian) are
inadequate to the aims pursued here. One major obstacle is that the
frequency lists consulted do not make a distinction between different
meanings that one and the same lexical item may have. To give an example,
frequency figures for the verb run include uses such as run a risk, run
a business, etc., which do not seem to indicate any manner of motion of
some moving entity.
2. Study two: Experimental work with speakers
The vocabulary research reported in the previous section has shown how
the number of English manner of motion verb types is significantly higher
than the number for Italian. Further possible aspects of the different
degree of manner of motion saliency shown by the two languages were
investigated through experimental work on native speakers. Two different
kinds of experiment were carried out: (a) three tests on ease of lexical
access; (b) one test on spontaneous narration.
2.1. Tests on ease of lexical access
Ease of lexical access tests measure how easily speakers can retrieve a
particular category of lexical concepts from memory. Such tests consist
in having participants write/utter as many relevant lexical concepts as
possible within a short time frame. As already noted in the introduction,
previous such tests on V-languages like French (Slobin 2003: 164) and
Basque (Ibarretxe-Antuñano 2004: 324) indicated that their speakers
produce a significantly lower number of manner of motion verbs than
English speakers do, thus showing that such verbs are in some way
more salient in English speakers' minds. The question was therefore to
see how Italian would fare in this respect. In the first set of tests (A), English
and Italian speakers of two different age groups were tested on manner
of motion verbs. Subsequently, for reasons that will be explained
later, a complementary test (B) on motion verbs in general was also carried
out.
9. This would have allowed us to see to what extent the gap found between the vocabulary
size of the two languages is actually paralleled by one concerning the variety (types) and
frequency (tokens) of verbs used by the average speaker.
A) Test on manner of motion verbs
Aim of the test
The aim of the test was to see how salient manner of motion concepts are
to English and to Italian speakers by measuring how readily available
such concepts are to these speakers.
Participants
The test investigated separately two different age groups. The first set of
data was collected from 35 English and 35 Italian native speakers of both
sexes aged 14. All participants were monolingual and were recruited in
two secondary schools (one in England and one in Italy).
The second set of data was collected from 35 English and 35 Italian
adult native speakers of both sexes (21 to 60 years old; the average age
of both groups was about 35) of different educational and socioeconomic
backgrounds. They were interviewed separately in libraries and other
public places such as cafes and sports centres only when the environment
was sufficiently quiet. All English participants were tested in England.
Some of them reported they had some knowledge of French but regarded
themselves as monolingual speakers. As for the Italian participants, they
were all tested in Italy; many of them said they had a rudimentary knowledge
of English, but all of them regarded themselves as monolingual
speakers.
Task and procedure
The 14-year-old participants performed the task simultaneously in their
classrooms. The task consisted in writing down on a piece of paper as
many manner of motion verbs as they could think of within a time frame
of 90 seconds. The time given in previous instances of the same kind of
test to undergraduates was 60 seconds (Slobin 2003: 164), but because of
the young age of the participants, it was decided to lengthen the time
frame. The researcher first explained to the participants what a manner
of motion verb is by making a contrast between the verb go/andare
(which does not provide information about manner) and the verb run/correre
(which does provide information about manner): 'There are
some verbs like go that indicate motion without indicating the manner in
which the motion occurs. By contrast, there are other verbs like run,
which do not only indicate motion but also the manner in which the
motion occurs: we can call the latter "manner of motion verbs"'; 'Ci
sono verbi come andare che indicano movimento senza però indicare la
maniera in cui il movimento avviene. Al contrario, ci sono verbi come
correre che non solo indicano movimento, ma anche la maniera in cui il
movimento avviene: possiamo chiamare questi ultimi "verbi di maniera di
movimento"' (note 10). After making sure that every participant had grasped
what a manner of motion verb is, the researcher explained their task:
'When you hear "start", try to write down as many manner of motion
verbs as you can'; 'Al "via!" provate a scrivere quanti più verbi di maniera
di movimento riuscite'. The test was started immediately after the
task explanation, since it was crucial not to leave the participants any
time to start thinking about instances of manner of motion verbs in
advance.
With regard to the adult participants, the test was carried out on a one-to-one
basis. The time frame given for the task was 60 seconds as in Slobin's
previous tests on undergraduates (2003: 164). The explanation of the
task to the adult participants was analogous to that given to the teenage
participants, although, alongside run/correre, this time the researcher also
offered zoom/sfrecciare as a further example of manner of motion verb.
This was done in order to prevent participants from focusing only on
types of manner related to kinds of bodily movements peculiar to humans
and to some animals (note 11).
Results (14-year-old participants)
The mean for English manner of motion verbs per participant was 7.63
(std. dev.: 2.46); that for Italian was 3.23 (std. dev.: 2.02). An independent
10. The concept of manner of motion verb is admittedly not immediately grasped by all
people, and some detailed explanation of it was needed before the test. This, however,
was not ideal, because the informants might not fully understand the explanation. Indeed,
in the list of manner of motion verbs produced by some participants there are items
not in the least related to manner of motion. The writing of incorrect verbs occurred
in Slobin's test too, although this was not seen as a problem but rather as a further
means for investigating manner of motion salience in English vs. French speakers,
with the latter apparently unable to focus on this semantic area as precisely as the
former: 'French speakers found it hard to limit themselves to manner verbs, listing
non-manner verbs such as descendre "descend, to go down", traverser "cross, traverse".
English speakers showed no such intrusions' (Slobin 2003: 164). The results of the
present study showed a similar pattern, with Italian speakers making more errors
than English speakers. However, one should be careful about interpreting this as evidence
for less manner salience. It could be simply a chance result, whereby errors on both sides
were caused by problems with task understanding.
11. Because run/correre and zoom/sfrecciare were used as examples for explaining what a
manner of motion verb is (run/correre both in the test on teenagers and in that on
adults; zoom/sfrecciare in the test on adults only), instances of these verb types pro-
duced by the participants during the task were not counted. More precisely, instances
of run/correre were not counted in either of the two tests; instances of zoom/sfrecciare
were only counted in the test run on teenagers.
samples t-test provides strong evidence for a significant difference between
the two means (t = 8.18; p < 0.001). This result is illustrated in Figure 1
below.
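The reported t value can be checked from the summary statistics alone. The sketch below assumes the classical pooled-variance form of the independent samples t-test (the text does not say which variant was used; with equal group sizes the unequal-variance form yields the same statistic). The function name is illustrative.

```python
import math


def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance independent samples t statistic from means and SDs."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# Teenage groups: English 7.63 (SD 2.46) vs. Italian 3.23 (SD 2.02), 35 each.
t, df = t_from_summary(7.63, 2.46, 35, 3.23, 2.02, 35)
print(f"t({df}) = {t:.2f}")
```

Small discrepancies in the second decimal are to be expected in checks of this kind, since the published means and standard deviations are themselves rounded.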
Overall, the English group produced 37 types of manner of motion
verbs, the Italian group 25. The elicited types (tokens in brackets) are
shown below.
English:
sprint (31); walk (30); jog (26); jump (25); hop (22); skip (20); drive (16); fly
(14); swim (12); crawl (10); bike (9); leap, roll (7); dance (4); climb, wobble
(3); bounce, dive, paddle, skate, slide, spring, stagger (2); accelerate, creep,
glide, pace, pedal, rattle, ride, sail, shoot, ski, sneak, stroll, stumble, zoom
(1).
Italian:
camminare (22); saltare (16); strisciare (7); ballare, nuotare, volare (6);
arrampicar-e/si, rotolare, saltellare, scappare (5); pattinare, pedalare (4);
gattonare, scorrere (3); danzare, galoppare, scattare, scivolare, zoppicare
(2); barcollare, precipitare, ruzzolare, sgambettare, trottare, trotterellare
(1).
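The two lists above can be tallied mechanically to confirm the type counts and per-participant means given in the results. The following sketch parses the "verb (tokens)" notation used in the lists; the parser itself is an illustrative helper, and a group such as "leap, roll (7)" is read as seven tokens for each verb.

```python
import re

english = (
    "sprint (31); walk (30); jog (26); jump (25); hop (22); skip (20); "
    "drive (16); fly (14); swim (12); crawl (10); bike (9); leap, roll (7); "
    "dance (4); climb, wobble (3); bounce, dive, paddle, skate, slide, spring, "
    "stagger (2); accelerate, creep, glide, pace, pedal, rattle, ride, sail, "
    "shoot, ski, sneak, stroll, stumble, zoom (1)"
)
italian = (
    "camminare (22); saltare (16); strisciare (7); ballare, nuotare, volare (6); "
    "arrampicar-e/si, rotolare, saltellare, scappare (5); pattinare, pedalare (4); "
    "gattonare, scorrere (3); danzare, galoppare, scattare, scivolare, zoppicare "
    "(2); barcollare, precipitare, ruzzolare, sgambettare, trottare, "
    "trotterellare (1)"
)


def parse(listing):
    """Map each verb type to its token count."""
    counts = {}
    for group in listing.split(";"):
        match = re.fullmatch(r"\s*(.+?)\s*\((\d+)\)\s*", group)
        verbs, n = match.group(1), int(match.group(2))
        for verb in verbs.split(","):
            counts[verb.strip()] = n
    return counts


PARTICIPANTS = 35
summary = {}
for name, listing in (("English", english), ("Italian", italian)):
    counts = parse(listing)
    tokens = sum(counts.values())
    summary[name] = (len(counts), tokens, tokens / PARTICIPANTS)
    print(f"{name}: {len(counts)} types, {tokens} tokens, "
          f"mean {tokens / PARTICIPANTS:.2f} per participant")
```

The tallies agree with the 37 and 25 types and the means of 7.63 and 3.23 reported for the teenage groups.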
Figure 1. Lexical access: Mean for manner of motion verbs per teenage participant
Results (adults)
The mean for English manner of motion verbs per participant was 6.89
(std. dev.: 3.12); that for Italian was 3.51 (std. dev.: 2.01). Again, an independent
samples t-test provides strong evidence for a significant difference
between the two means (t = 5.37; p < 0.001). These results are summarised
in Figure 2 below.
Overall, the English group produced 63 different types of manner of motion
verbs, whereas the Italian group produced 36. The figure regarding
the English number of types is encouragingly similar to that of 74
found by Slobin (personal communication) in a comparable test made
on 37 undergraduates. The elicited types (tokens in brackets) are shown
below.
English:
walk (29); fly, jump (17); hop, skip (13); crawl, drive, swim (10); jog (9);
slide (6); ride (5); cycle, gallop, glide, speed, stroll, wander (4); bounce,
dash, leap, race, rush, saunter, shoot, slither, sprint, trot (3); canter, climb,
dance, hurry, row, stagger, step, stomp, stride, swing, tiptoe, zip (2); accel-
erate, amble, belt, creep, dive, ease, gambol, limp, mince, pace, paddle,
potter, pounce, rocket, scamper, scurry, scuttle, shuffle, ski, slip, trudge,
waddle, wobble, zigzag (1).
Italian:
camminare (16); saltare (12); nuotare (10); volare (8); passeggiare, sciare
(6); galoppare, pedalare, strisciare (5); marciare, rotolare, scivolare, trot-
tare (4); cavalcare, navigare, pattinare, precipitarsi (3); arrampicare,
schettinare, vogare (2); balzare, caracollare, fiondarsi, montare, remare,
rimbalzare, saettare, saltellare, scappare, serpeggiare, sfarfallare, slittare,
spaziare, svolazzare, trotterellare, veleggiare (1).
Figure 2. Lexical access: Mean for manner of motion verbs per adult participant
Discussion
Adult groups from both languages showed, as expected, a better performance
compared with their respective teenager groups. That is, the adult
groups scored a similar number of verb tokens and a higher number of
verb types in a shorter time frame. What is relevant to the present inquiry,
however, is that, when compared to their Italian counterparts,
both English groups produced a statistically significantly greater number
of manner of motion verb tokens, and also displayed a wider repertoire of
types. In other words, the data are remarkably consistent in showing a
greater manner of motion salience for English speakers. Some reservations,
however, may be made in relation to the methodology used in the
task, which required a formal explanation of the meaning of manner of
motion verb. For practical reasons, the explanation could only provide a
general indication of what was meant, and, further, it may not have been
fully understood by all informants. It was therefore thought that the
validity of the data obtained through this test should be verified by a
complementary experiment, similar to this one and yet free from the
drawbacks just mentioned. This second kind of experiment is reported in
section B below.
B) Test on motion verbs in general
Aim of the test
The aim of the test was the same as that of the previous test, namely to
measure how readily available manner of motion concepts are to speakers
from the two languages. Again, it was inquired how easily speakers could
retrieve manner of motion verbs from memory within a short time frame.
As we will see in the task description, this test investigated the issue
slightly less directly, but had the advantage of eliminating possible problems
with task understanding. At any rate, the experiment offered the
possibility of checking whether or not the previous participants' performance
on manner of motion verbs would change if a different testing
method was adopted.
Participants
The experiment was run on 35 undergraduates of both sexes per linguistic
group. All participants were tested in two university libraries (one in
England and one in Italy). They were all monolingual; many Italian
speakers said that they had some knowledge of English but that they
were not fluent.
Task and procedure
The participants were tested on a one-to-one basis. The task consisted in
writing down on a piece of paper as many motion verbs as the participant
could think of within a time frame of 60 seconds. Manner of motion verbs
specifically could still be investigated by checking how many of them
could be found within the list of motion verbs produced by the
participants. The inquiry was therefore not as direct as in the previous
task, but had the advantage of not needing any formal explanation of the
verbs the informants were required to list (see related problems again in
note 10), since the concept of "verbs of motion" is much easier to grasp
intuitively. The task instructions were very simple: "When you hear
'start', please write down as many verbs indicating motion as you can";
"Al via scrivi quanti più verbi puoi indicanti movimento".
Results
With regard to the manner of motion verbs found within the larger pool
of motion verbs in general, the English mean was 8.40 (std. dev.: 2.60).
The mean for Italian respondents was 4.43 (std. dev.: 2.10). An
independent-samples t-test provides strong evidence for a significant
difference between the two means (t = 7.02; p < 0.001). As for the
remaining non-manner motion verbs, the English mean this time was lower
than the Italian mean (4.03 [std. dev.: 2.60] vs. 6.80 [std. dev.: 3.00]
respectively) and significantly different (t = 4.14; p < 0.001). Figure 3
shows, for each language, the speakers' mean for motion verbs produced,
split into manner and non-manner motion verbs.

Figure 3. Lexical access: Mean for motion verbs per speaker with split into manner of
motion verbs and non-manner motion verbs
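
The t values reported above can be approximated from the published means, standard deviations and group sizes alone. The following is a minimal sketch, not the procedure actually used in the study: it assumes Welch's form of the independent-samples t statistic (with equal group sizes it coincides with the pooled-variance form), and the function name `welch_t` is introduced here purely for illustration. Because the inputs are rounded summary figures, the results should only be expected to come close to the reported values, and the sign simply reflects which group is entered first.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return Welch's t statistic and its approximate degrees of freedom,
    computed from group means, sample standard deviations and group sizes."""
    v1 = sd1 ** 2 / n1          # squared standard error of the first mean
    v2 = sd2 ** 2 / n2          # squared standard error of the second mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Manner of motion verbs: English vs. Italian, 35 participants per group
t_manner, df_manner = welch_t(8.40, 2.60, 35, 4.43, 2.10, 35)
# Non-manner motion verbs: English vs. Italian
t_other, df_other = welch_t(4.03, 2.60, 35, 6.80, 3.00, 35)

print(f"manner:     t = {t_manner:.2f}, df = {df_manner:.1f}")
print(f"non-manner: t = {t_other:.2f}, df = {df_other:.1f}")
```

Working from summary statistics keeps the check independent of the raw participant lists, which are not reproduced in the text.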
Overall, the English group produced 61 types of manner of motion
verbs, whereas the Italian group produced 33. The elicited types (tokens
in brackets) of manner of motion verbs are shown below.
English:
run (34); jump (29); walk (28); swim (17); dance, skip (16); hop, jog (14); fly
(12); leap, sprint (9); drive (7); crawl (6); bounce, climb, roll (5); cycle, slide
(4); canter, skate, spin, step, swing (3); dive, glide, hurry, race, ride, row,
shimmy, stroll (2); back-flip, bound, chase, flick, flow, gallop, hike, jive,
limp, pace, pounce, prance, rattle, rush, saltate, shuffle, sidle, ski, slither,
splash, spring, stagger, stamp, stumble, stunt, surf, trot, tumble, wander,
wobble (1).¹²
Italian:
correre (31); camminare (25); saltare (22); nuotare (15); volare (12); ballare
(6); scivolare (5); rotolare, sciare (4); accelerare, passeggiare (3); fuggire,
pattinare, saltellare, scappare (2); affrettarsi, arrampicarsi, circumnavigare,
deambulare, galoppare, marciare, pedalare, rimbalzare, sbandare, scattare,
schizzare, scorrere, sfrecciare, slittare, strisciare, trottare, trotterellare,
zoppicare (1).
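
Lists in the "types (tokens in brackets)" format used above can be tallied mechanically. The sketch below is an illustration only; the helper name `parse_verb_list` is hypothetical, and the sketch assumes that every verb in a comma-separated group shares the bracketed count that follows the group. It is applied to the Italian list, with affrettarsi in its standard spelling.

```python
import re

def parse_verb_list(text):
    """Parse 'verb, verb (n); verb (m); ...' into a {verb: tokens} dict.
    Every verb in a comma-separated group shares the count that follows."""
    counts = {}
    for group in text.rstrip(". \n").split(";"):
        match = re.fullmatch(r"\s*(.+?)\s*\((\d+)\)\s*", group, flags=re.DOTALL)
        if match is None:
            raise ValueError(f"unparsable group: {group!r}")
        tokens = int(match.group(2))
        for verb in match.group(1).split(","):
            counts[verb.strip()] = tokens
    return counts

italian = """correre (31); camminare (25); saltare (22); nuotare (15);
volare (12); ballare (6); scivolare (5); rotolare, sciare (4); accelerare,
passeggiare (3); fuggire, pattinare, saltellare, scappare (2); affrettarsi,
arrampicarsi, circumnavigare, deambulare, galoppare, marciare, pedalare,
rimbalzare, sbandare, scattare, schizzare, scorrere, sfrecciare, slittare,
strisciare, trottare, trotterellare, zoppicare (1)."""

italian_counts = parse_verb_list(italian)
# The number of dictionary keys is the number of types for the group.
print(len(italian_counts), "types")
```

The number of keys gives the type count for the group, which can be compared with the figure of 33 stated in the text.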
Discussion
As in the previous test on adults, the English speakers were able to
produce significantly more manner of motion verb tokens than Italian
speakers, and in a very similar proportion (in the test on manner of
motion verbs the Italian figure was 51 percent of the English figure, while
in this test it was 52.7 percent). Likewise, the number of types per group
was similar to that found earlier (in the first experiment the Italian
figure was 57.1 percent of the English total; in this second experiment it
was 54.1 percent). Thus, the data of this last experiment corroborated,
and provided additional evidence for, what was found in the first one. It
is also interesting to notice that, when compared with Italian, the roughly
double number of English manner of motion verbs was not paralleled by
an analogous double amount of English non-manner motion verbs. If
that had been the case, the significantly greater figure scored by English
subjects for manner of motion verbs should have been viewed simply as
an instance of a wider phenomenon involving a larger area of vocabulary,
namely that of motion verbs in general. In fact, the number of English
non-manner motion verbs was significantly lower than the number found
for Italian.¹³ These data therefore seem to suggest that the greater
salience shown for English subjects with regard to manner of motion verbs
does not extend to other kinds of motion verbs. Or, if it does, the salience
of manner of motion verbs relative to the rest of motion verbs is higher
than the Italian one.
One last observation relevant to all three tests of ease of lexical access
needs to be made concerning the significantly greater number of manner
of motion verb tokens produced by the English speakers throughout. It is
certainly the case that the greater size of the English manner of motion
verb vocabulary (see Study 1) must have played a considerable role in
determining that phenomenon. The English participants could and did
pick from a larger number of available verb types, which of course had
an effect on the overall number of tokens elicited. However, when one
looks at the figures for some semantically corresponding pairs of
verbs of wide use, it is interesting to notice that the number of English
tokens is consistently higher than that for Italian. For example, across
the three tests, 87 tokens were elicited for walk against 63 for the Italian
equivalent camminare; 71 for jump against 50 for saltare; 43 for fly
against 26 for volare; 39 for swim against 31 for nuotare; 34 for run
against 31 for correre (for this last pair, the figures can only refer to
the last test on motion verbs in general; see again note 11). This
suggests that the greater number of tokens of manner of motion verbs in
English cannot be accounted for only in terms of the wider repertoire
of types that were available to its speakers. At least to some extent,
the phenomenon must also have been brought about by the greater
entrenchment with which such verbs are represented in English speakers'
minds, presumably because of their higher frequency of use in everyday
speech.
12. A couple of English verbs produced by the speakers in this test do not feature in my list
of English manner verbs. This is because the dictionaries used for the present research
did not report such verbs. However, they do exist (they were found in the Shorter
Oxford) and therefore must be accepted. This was the case of back-flip and saltate, which
the Shorter Oxford defines as "to perform a backward somersault" and "leap, jump,
skip" respectively.
13. Because the higher number of English manner of motion verbs was counterbalanced,
to some extent, by the higher Italian number for other kinds of motion verbs, the
overall number of verbs listed by each group was not very different. The English mean for
motion verbs per subject was 12.43 (std. dev.: 2.33), whereas the Italian mean was
11.23 (std. dev.: 3.19). An independent-samples t-test provides only slight evidence
for a significant difference between English and Italian (t = 1.80; p < 0.1).
2.2. Spontaneous narration of the Frog story
Aim of the test
The test was meant to check the frequency and variety of manner of motion
verb production during spontaneous narration, that is, narration
where people's natural flow of language is not subject to any kind of
constraint imposed by the researcher.
Material
The test material was the picture book Frog, where are you? (Mayer
1969), which consists of a sequence of 24 wordless pictures relating a
short story, and which the informants had to narrate freely in their
own words. The book has been used in previous experiments to elicit
narrations from speakers of many languages (Berman and Slobin 1994).
Information has also been gathered for Italian speakers (see Slobin 2004:
225), who, in the case of one particular picture of the book (the "Owl's
Exit"), obtained a score for manner mention which was similar to that
of all other V-languages and significantly lower than that of any of the
S-languages. However, full published data referring to all episodes of
the story do not seem to be available for this language.
Participants
The participants were 23 English and 29 Italian native speakers of both
sexes between 17 and 19 years of age. They were recruited in two
secondary schools (one in England and one in Italy). The participants were
all monolingual; the Italian participants had some limited knowledge of
English acquired at school, but none of them regarded themselves as
fluent speakers of this language.
Task and procedure
Participants were interviewed on a one-to-one basis and were given the
same instructions as were used in previous linguistic tests based on the
Frog story (see Berman and Slobin 1994: 223). They were first asked to
look through the entire book and then to tell the story while looking at
the pictures. The participant sat side by side with the researcher, who
was the only listener. During the participant's narration, the researcher
kept verbal feedback to a minimum in order not to influence the chosen
form of expression. Still following the procedures used in previous tests
on the Frog story, the prompts used during the narrations were (1) silence
or nod of head, (2) "uh-huh", "okay", "yes", (3) "Anything else?",
(4) "and . . . ?", (5) "Go on". Each session was audio-recorded, then
orthographically transcribed.
Analysis of the samples and results (manner of motion verbs)
In order to obtain a reliable comparison between the two linguistic
groups on manner of motion verb production, the elicited manner of motion
verbs had to be counted, for each language, relative to the whole
amount of speech recorded in that language. Since no constraints were
imposed on the length of the narration, samples from the two groups
varied in this respect. To control for length effects, the procedure outlined
by Berman and Slobin (1994) was adopted here. Speech length was
measured by counting the number of clauses produced. Essentially, each
clause gravitates around a verb, so that the count of recorded verbs
should correspond to that of the clauses. However, modal and aspectual
verbs were counted together with their main verbs. Thus, the following
constitute single clauses: "he wants to find the frog"; "the boy went
searching for the frog"; "the dog starts following". The analysis
excluded personal comments relating to the task (e.g., ". . . don't know
what else to say"), or queries about object identification, such as asking
the investigator what an animal is called (e.g., "I don't know what that's
called").
To give an example of how the target verbs were counted relative to the
amount of speech and compared across the two linguistic groups, we can
look at the number of tokens of manner of motion verbs recorded. For
each informant, the number of elicited tokens was divided by the number
of clauses produced, giving the mean value for manner of motion verb
tokens per clause. With that information it was then possible to calculate
the mean value for each entire linguistic group. The English group scored
a mean of 0.0675 manner of motion verbs per single clause (std. dev.:
0.036); the Italian group scored a mean of 0.0378 (std. dev.: 0.033). An
independent-samples t-test indicates that the difference between the two
means is significant (t = 3.05; p < 0.01).
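
The normalisation just described (one rate per informant, then a group mean over those rates) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the four (tokens, clauses) pairs are hypothetical and are not the study's data, and the sample standard deviation is assumed to be the one intended by "std. dev.".

```python
import statistics

def per_clause_rate(target_count, clause_count):
    """Target verbs (tokens or types) per clause for a single informant."""
    if clause_count <= 0:
        raise ValueError("an informant must have produced at least one clause")
    return target_count / clause_count

def group_summary(informants):
    """Mean and sample standard deviation of the per-informant rates."""
    rates = [per_clause_rate(count, clauses) for count, clauses in informants]
    return statistics.mean(rates), statistics.stdev(rates)

# Hypothetical (tokens, clauses) pairs, one per informant; NOT the study's data.
toy_group = [(3, 50), (5, 40), (2, 60), (4, 45)]
mean_rate, sd_rate = group_summary(toy_group)
print(f"mean = {mean_rate:.4f} (std. dev.: {sd_rate:.4f})")
```

Averaging per-informant rates, rather than dividing the group's total tokens by its total clauses, gives each narrator equal weight regardless of how long his or her narration was.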
The figures regarding tokens of target verbs refer to all tokens actually
recorded. It must be pointed out, however, that in some cases certain
verbs were repeated more than once to describe what was a single action
depicted in the book (e.g., ". . . a big deer, who ran and ran . . ."; "Billy
then climbed on top of the rock . . . Unfortunately when he was climbing
onto the rock . . ."). It is possible that the repetition of a verb in the
description of one and the same event was simply due to the temporary
influence of the activation of the verb when pronounced the first time,
rather than to a phenomenon related to its permanent salience in the
speaker's mind.¹⁴ This kind of influence may even have increased the
chances of item repetition across similar yet distinct events portrayed in
the story. One practical example to make the point: in the Frog story
there are two separate pictures which can both evoke the action of
climbing. One is the picture of the boy who has climbed a tree, the other
is that of the same boy who has climbed a rock. Many interviewed
participants described both distinct events with the verb
climb/arrampicarsi. Now, was a participant who used the verb
climb/arrampicarsi to describe the first event more likely to use it again
in the description of the second event than s/he would have been if s/he
had used another verb instead (e.g., get on/salire) the first time? If so,
then the only figure free from this kind of influence is that regarding the
number of types of manner of motion verbs per single participant. In other
words, contrary to the figure regarding frequency of use (number of tokens
per single participant) of some language category, the figure indicating
variety of use (number of types per single participant) gives an index of
salience which cannot be biased by the temporary influence of the language
produced by the informant during the specific time of the task. When
checking the number of types of manner of motion verbs per single
participant, it was found that the English group scored a mean of 0.0534
(std. dev.: 0.031) per clause; for the Italian group, the corresponding
mean was 0.0288 (std. dev.: 0.024). An independent-samples t-test indicates
that the difference between the two means is significant (t = 3.130;
p < 0.01). The results referring to the mean number of manner of motion
verbs found per single clause are shown in Figure 4a (tokens) and in
Figure 4b (types).
The types (tokens in brackets) elicited from the two groups are listed
below:
English: climb (23); run (18); fly, jump (8); walk (6); creep (4); crawl, slip
(3); bound, chase, dance, flap, hop, sneak, spring, swim, tumble, wander,
wriggle (1).
Italian: arrampicarsi, correre (18); scappare (12); saltare (7); volare (5);
fuggire (3); scivolare (4); camminare, intrufolarsi, nuotare, scagliarsi,
sgusciare (1).
14. In other words, we cannot exclude the possibility that the repetition of some verb was
facilitated by repetition priming. Wheeldon and Monsell (1992), for example, found
that production of a word in response to a definition had a large and long-lasting
facilitatory effect on latency for later production of the same word to name a pictured
object.
Notice that although the English group was smaller than the Italian
group (23 informants the former, 29 the latter), the number of types per
linguistic group (the same kind of data shown with regard to types in the
tests on ease of lexical access previously reported) was 19 for English and
12 for Italian.
Analysis of the samples and results (path and neutral motion verbs)
Although manner of motion verbs were the actual target of the
investigation, it was necessary to examine other kinds of motion verbs in
order to check that the difference detected between English and Italian
subjects with regard to manner of motion verb use was not also found in
other types of motion verbs, in which case the different trend noticed
between English and Italian would not have been meaningful to the present
inquiry. A check was therefore carried out on a kind of verb which could
well have been used for describing the various motion events in place of
manner of motion verbs. All intransitive, translational motion verbs
were counted which do not give any indication of manner, whether these
were path verbs (e.g., exit/uscire, ascend/salire) or simply verbs that
denote neutral translational motion (come/venire, go/andare,
move/muoversi). Here, this group of verbs will be referred to as
"path/neutral motion verbs". The analysis of the samples in relation to
path/neutral motion verbs followed the same criteria previously used for
the analysis of manner of motion verbs. With regard to tokens of
path/neutral verbs per single clause, the English mean was 0.1104
(std. dev.: 0.047), while the Italian mean was 0.1471 (std. dev.: 0.060).
With regard to types per single clause, the English mean was 0.0527
(std. dev.: 0.016), while the Italian mean was 0.0659 (std. dev.: 0.025).
This time the English means were therefore lower than the Italian means.
The results referring to the quantity of path/neutral motion verbs found
per single clause are given in Figures 5a (tokens) and 5b (types).

Figure 4a. Spontaneous narration: Mean for tokens of manner of motion verbs per clause
Figure 4b. Spontaneous narration: Mean for types of manner of motion verbs per clause
Below is the list of types (tokens in brackets) of path/neutral verbs pro-
duced by the two linguistic groups. The number of types per linguistic
group was 8 for English and 14 for Italian.
English: fall (67); go (28); come (22); get on/off/onto/out (9); move (6);
leave (3); swarm (2); head (1).
Italian: cadere (105); uscire (79); andare (25); (ri)tornare (24); salire (21);
avvicinarsi (11); allontanarsi (7); dirigersi, infilarsi (3); entrare, muoversi,
scendere (2); provenire, recarsi (1).
Discussion
The English group produced significantly more manner of motion verbs
than the Italian group. Indeed, the English mean for tokens and types of
manner of motion verbs per clause was significantly greater than the
Italian mean (Figures 4a and 4b). It is also worth noting that the English
participants produced more types per group even though the number of
subjects was lower than that of the Italians.
What is also relevant to our inquiry is that the results obtained for
path/neutral motion verbs across the two linguistic groups did not
replicate the pattern found for manner of motion verbs. As a matter of
fact, they showed the opposite tendency, since this time all the relevant
figures regarding English participants were actually lower than the
relevant Italian ones (Figures 5a and 5b). Because the greater use of manner
of motion verbs in English was therefore not paralleled by a similar trend
with path/neutral motion verbs, the greater English salience for motion
verbs appears to be restricted to the former kind of verbs.¹⁵
In this respect, one can say that the results of this experiment are
congruent with those obtained earlier in the test regarding ease of lexical
access of motion verbs in general. There too, the greater salience of manner
of motion verbs among English speakers compared with the Italian
speakers did not extend to other kinds of motion verbs.

Figure 5a. Spontaneous narration: Mean for tokens of path/neutral motion verbs per clause
Figure 5b. Spontaneous narration: Mean for types of path/neutral motion verbs per clause
Conclusions
Looking at the overall body of data collected in the two studies, it ap-
pears that Italian can indeed be categorised as a low-manner-salient lan-
guage. Although grammatical analyses agree that Italian must be consid-
ered one of the least typical V-languages, the prominence of the semantic
domain of manner of motion in the knowledge and use of its speakers
does not seem to be near that exhibited by speakers of prototypical S-
languages such as English.
The vocabulary investigation, aimed at inquiring into the size of the
semantic domain of manner of motion, showed that the number of
Italian manner of motion verbs is significantly lower than the number
for English. With regard to the extent of such a difference, results
matched fairly closely previous comparisons made between other V- and
S-languages.
The experimental work carried out with speakers also pointed to a
significantly lower salience of the Italian manner of motion domain, this
time in terms of linguistic behaviour shown by such speakers. That is to
say, Italian speakers were slower in retrieving manner of motion verbs
from memory, and showed a significantly lower frequency and variety of
mention of this kind of verb in their spontaneous speech.
With regard to future research questions related to the above findings,
it would now be of interest to see whether the linguistic phenomena
detected in this study give rise to Whorfian effects, that is, whether
the significantly different degree of linguistic salience found between
the speakers of these two languages has a coherent counterpart in
non-linguistic cognitive areas.
15. The fact that the Italian number of path/neutral verbs was higher than the English
quantity was, to some extent, to be expected. If some speaker described some motion
event without using a manner of motion verb, s/he must perforce have used some other
kind of translational motion verb (i.e., path or neutral motion verb) instead. Thus,
at least some of the Italian lesser amount of manner of motion verbs must have been
compensated by some greater amount of path/neutral verbs.
Appendix: List of manner of motion verbs
English:
accelerate, amble, arc, bank, barge, beat, beetle, belt, (bi)cycle/bike,
billow, boat, bob, bobble, bolt, bounce, bound, bowl, breeze, bumble,
bump, bus, bustle, buzz, canoe, canter, caper, careen, career, cascade,
charge, chase, chug, clamber, climb, coach, coast, collapse, course, crawl,
creep, cruise, dance, dart, dash, dive, dodder, dodge, drag oneself, drift,
drive, ease, edge, flap, flash, flee, fleet, flick, fling, flit, flitter, float, flounder,
flow, flutter, fly, foxtrot, freewheel, fumble, gallop, gambol, gimp, glide,
gush, gust, hare, hasten, hike, hitch(-hike), hobble, hop, hurry, hurtle,
inch, irrupt, jerk, jet, jink, jive, jog, joggle, jolt, jump, kite, launch oneself,
leap, leapfrog, limp, loiter, lollop, lope, lurch, march, mince, mizzle, moon,
mosey, motor, nose, pace, pad, paddle, parachute, parade, patter, pedal,
pelt, plane, plod, plop, plough, plummet, plunge, potter, pounce, pour,
prance, promenade, prowl, pump, race, rack, raft, rattle, reel, ride, ripple,
roar, rocket, roll, roller-skate/blade, row, rumble, run, rush, rustle, sail,
saunter, scamper, scoot, scramble, scud, scuffle, scull, scurry, scuttle, shamble,
shimmy, shin, shoot, shuffle, side-slip, sidle, skate, ski, skid, skim, skip,
skitter, sledge/sleigh, slew, slide, slink, slip, slither, slog, slosh, sneak, soar,
somersault, speed, spin, splash, splosh, spout, spring, sprint, spurt, squash,
squeeze, squirt, stagger, stalk, stamp, stampede, steam, step, stomp, storm,
stream, stride, stroll, strut, stumble, stump, stunt, surf, surge, swagger,
swan, sweep, swim, swing, swoop, tango, teeter, thrash, throw oneself,
thrust, tiptoe, toboggan, toddle, toe in/out, totter, trail, traipse, tramp,
trek, trip, trot, trudge, trundle, tumble, vault, volplane, waddle, wade,
walk, wallow, waltz, wander, whiffle, whirl, whiz, wing, wobble, worm,
wriggle, zigzag, zing, zip, zoom.
Italian:
accelerare, affrettarsi, ambiare, ancheggiare, arrampicar-e/si, arrancare,
avventarsi, ballare, ballonzolare, balzare, balzellare, barcollare, bighello-
nare, bordeggiare, brancolare, buttarsi, calarsi, camminare, capitombolare,
caracollare, cascare, catapultarsi, cavalcare, (ac)ciabattare, claudicare,
correre, crollare, danzare, deambulare, derapare, divincolarsi, erompere,
filare, fiondarsi, fluire, franare, frullare, fuggire, galoppare, gattonare, gettarsi,
girellare, gironzolare, grufolarsi, guizzare, incedere, inciampicare,
inerpicarsi, intrufolarsi, irrompere, lanciarsi, marciare, molleggiarsi, mon-
tare, navigare, nuotare, pagaiare, paracadutarsi, passeggiare, pattinare,
pedalare, piombare, planare, precipitare, precipitarsi, prorompere, remare,
rimbalzare, rinculare, rinsaccare, ronzare, rotolare, rovinare, ruzzare, ruz-
zolare, saettare, saltare, salte(re)llare, sbalzare, sbandare, sbarellare, sca-
gliarsi, scalpicciare, scantonare, scapicollarsi, scappare, scaraventarsi,
scarpinare, scarrozzare, scattare, schettinare, schizzare, sciare, scivolare,
scodinzolare, scorrazzare, scorrere, sculettare, sdrucciolare, sfarfallare, sfilare,
sfrecciare, sgambettare, sgattaiolare, sgorgare, sgusciare, slanciarsi,
slittare, smottare, sobbalzare, spaziare, sprizzare, sprofondare, squagliar-e/sela,
strascicarsi, strisciare, svicolare, svignar-e/sela, svolazzare, tacchettare,
telare, tombolare, traballare, trascinarsi, tronfiare, trottare, trotterellare,
tuffarsi, veleggiare, vogare, volare, volteggiare, (av/ri)voltolarsi,
zampettare, zampillare, zigzagare, zoccolare, zoppicare.
Received 18 December 2006 Lancaster University
Revision received 24 February 2008
References
Berlin, Brent and Paul Kay
1969 Basic Color Terms: Their Universality and Evolution. Berkeley: University of
California Press.
Berman, Ruth A. and Dan I. Slobin
1994 Relating Events in Narrative: A Crosslinguistic Developmental Study. Hills-
dale, NJ: Lawrence Erlbaum Associates.
Damasio, Antonio R. and Hanna Damasio
1992 Brain and language. Scientific American 267(3), 88–95.
De Mauro, Tullio, Federico Mancini, Massimo Vedovelli, and Miriam Voghera
1993 Lessico di frequenza dell'italiano parlato (LIP). Milan: Etaslibri.
Dukhovny, E. and M. Kaushanskaya
1998 Russian verbs of motion. Unpublished paper, Department of Psychology,
University of California, Berkeley.
Goddard, Cliff and Anna Wierzbicka
2002 Semantic primes and universal grammar. In Goddard, Cliff and Anna
Wierzbicka (eds.), Meaning and Universal Grammar: Theory and Empirical
Findings, vol. 1. Amsterdam: John Benjamins, 41–85.
Ibarretxe-Antuñano, Iraide
2004 Language typologies in our language use: The case of Basque motion events
in adult oral narratives. Cognitive Linguistics 15(3), 317–349.
Jovanovic, J. and M. Kentfield
1998 Manifold manner: An exploratory analysis of French and English verbs
of motion. Unpublished paper, Department of Psychology, University of
California, Berkeley.
Koch, Peter
2000 Indirizzi cognitivi: per una tipologia lessicale dell'italiano. Italienische
Studien 21, 99–117.
Mayer, Mercer
1969 Frog, Where are You? New York: Dial Press.
Naigles, Letitia R. and Paula Terrazas
1998 Motion-verb generalizations in English and Spanish: Influences of language
and syntax. Psychological Science 9, 363–369.
Naigles, Letitia R., A. R. Eisenberg, E. T. Kako, M. Highter, and N. McGraw
1998 Speaking of motion: Verb use in English and Spanish. Language and
Cognitive Processes 13, 521–549.
Özçalışkan, Şeyda and Dan I. Slobin
1999 Learning how to search for the frog: Expression of manner of motion in
English, Spanish, and Turkish. In Greenhill, A., H. Littlefield, and C. Tano
(eds.), Proceedings of the 23rd Annual Boston University Conference on
Language Development, vol. 2. Somerville, MA: Cascadilla Press, 541–552.
Papafragou, Anna, C. Massey, and Lila Gleitman
2002 Shake, rattle, 'n' roll: The representation of motion in language and
cognition. Cognition 84, 189–219.
Sablayrolles, Pierre
1995 The semantics of motion. Proceedings of the EACL'95 Dublin, 281–283.
Schwarze, Christoph
1985 "Uscire" e "andare fuori": struttura sintattica e semantica lessicale. Società
di Linguistica Italiana 24, 355–371.
Slobin, Dan I.
2000 Verbalized events: A dynamic approach to linguistic relativity and
determinism. In Niemeier, S. and R. Dirven (eds.), Evidence for Linguistic Relativity.
Amsterdam: John Benjamins, 107–138.
2003 Language and thought online: Cognitive consequences of linguistic
relativity. In Gentner, D. and S. Goldin-Meadow (eds.), Language in Mind.
Cambridge, MA: MIT Press, 157–191.
2004 The many ways to search for a frog: Linguistic typology and the expression
of motion events. In Strömqvist, S. and L. Verhoeven (eds.), Relating Events
in Narrative, vol. 2, Typological and Contextual Perspectives. Mahwah, NJ:
Lawrence Erlbaum, 219–257.
Slobin, Dan I. and Nini Hoiting
1994 Reference to movement in spoken and signed languages: Typological
considerations. Proceedings of the Berkeley Linguistic Society 20, 487–505.
Snell-Hornby, Mary
1983 Verb-descriptivity in German and English: A contrastive study in semantic
fields. Heidelberg: Carl Winter.
Talmy, Leonard
1985 Lexicalization patterns: Semantic structure in lexical forms. In Shopen,
T. (ed.), Language Typology and Lexical Description, vol. 3, Grammatical
Categories and the Lexicon. Cambridge: Cambridge University Press,
36–149.
1991 Path to realization: A typology of event conflation. Proceedings of the
Berkeley Linguistic Society 17, 480–519.
2000 Toward a Cognitive Semantics, vol. 2, Typology and Process in Concept
Structuring. Cambridge, MA: MIT Press.
Wheeldon, L. R. and S. Monsell
1992 The locus of repetition priming of spoken word production. The Quarterly
Journal of Experimental Psychology 44(4), 723–761.
Wierzbicka, Anna
1999 Universals of colour from a linguistic point of view. Behavioural and Brain
Sciences 22, 723–733.
Zlatev, Jordan, Johan Blomberg, and Caroline David
forthcoming Translocation, language and the categorisation of experience. In Evans,
Vyvyan and Paul Chilton (eds.), vol. 1, Language and Cognition in Space:
The State of the Art and New Directions. London: Equinox.
Sequential and summary scanning: A reply
RONALD W. LANGACKER*
Abstract
It is quite legitimate for Broccias and Hollmann (2007) to question the
characterization of verbs in terms of sequential scanning. However, they
have not advanced any cogent arguments against it. Although experimental
evidence is certainly to be desired, there is no doubt about the psychological
status of sequential scanning: it amounts to nothing more than the
sequentiality inherent in the real-time experience of events. Its sequentiality should
be most fully manifested when a verb is directly apprehended as a grounded
clausal head (hence most salient); owing to general effects of compression,
a more holistic view should prevail when a verb is subordinated to other
elements. The arguments made by B&H concerning the English auxiliary,
causatives, and path prepositions are discussed and found to be invalid.
Nonetheless, their critique has shown the need for clarification and
refinement and has directed attention to important issues.
Keywords: auxiliary; distribution; grammatical category; grounding;
mental simulation; path preposition; summation; verb.
In a recent article, Broccias and Hollmann (2007), B&H hereafter,
question my conceptual characterization of verbs in terms of sequential
scanning (Langacker 1987a, 1987b). Certainly it is legitimate for them to
raise this issue. The description, having been proposed over two decades
ago, is ripe for reexamination; at the very least it stands in need of
clarification and elucidation. While I do not believe that B&H have raised
any valid objections to it, they have performed a most useful service by
subjecting it to needed scrutiny and directing attention to other important
matters.
Cognitive Linguistics 19–4 (2008), 571–584
DOI 10.1515/COGL.2008.022
0936–5907/08/0019–0571
© Walter de Gruyter
* Department of Linguistics, University of California, San Diego; author's e-mail:
<rlangacker@ucsd.edu>.
The point they emphasize the most is that the putative distinction
between summary and sequential scanning is not supported by independent
psychological evidence (in contrast, say, to figure/ground organization)
and is thus to be regarded with suspicion. I disagree with this assessment
of its status. Rather than being exotic or mysterious, the two kinds of
scanning are basic aspects of moment-to-moment experience. Sequential
scanning is our primary mode of experience in the real-time observation
of events. As we watch an event unfold, e.g., a ball rolling down an
incline, we observe it at just a single position at a given point in time. In
terms of what is directly and immediately apprehended, the component
spatial configurations are accessible and available only in the temporal
sequence of their occurrence. Sequential scanning is nothing more than
this fundamental aspect of dynamic experience.
Of course, our apprehension of events is not limited to direct, real-time
observation. We can also recall or imagine events through processes of
mental simulation. Evident as well is the ability to summarize sequen-
tially experienced events and apprehend them holistically. This happens,
for example, when we watch something move and draw a line to represent
its trajectory; the successive configurations constituting a dynamic
occurrence are then all captured in a single, static diagram. This summing
capacity is directly reflected in a device commonly used in visual presentations,
e.g., to indicate the route followed by migratory birds or the
changing price of gold: starting from a point, a line or arrow is shown as
growing across the screen until it occupies the full extent of the path it
represents. Two phases or levels of organization are thus involved: a
growth or build-up phase, which captures directionality and temporal
sequencing; and the final, persistent stage that results, in which the
successive configurations are compressed into a single, simultaneously
available gestalt. That is precisely how I characterize summary scanning.
It goes without saying that experimental investigation of these scanning
modes would be most welcome. One would like a detailed, empirically
grounded understanding of their nature, their scope, their relation to one
another and to other cognitive phenomena, as well as the specifics of their
implementation. We can hope for the convergence of linguistic and psychological
evidence, as exemplified by recent experimental work indicating
that fictive motion really is conceived in terms of motion (e.g.,
Matlock 2004), or that a participant's status as trajector actually does
involve the focusing of attention (Tomlin 1995, 1997; Forrest 1996).
However, the mere existence of sequential and summary scanning as con-
ceptual phenomena is hardly problematic.
To be sure, one can acknowledge their existence without accepting the
proposals made in Cognitive Grammar (CG) concerning their application
to grammatical description. Resistance to their acceptance is possible
at several different levels. There is first the standard doctrine that basic
grammatical categories are simply not susceptible to semantic characterization.
So firmly entrenched is this attitude that the CG proposals are
often rejected out of hand without being considered on their merits. I
would claim, however, that the usual arguments for this doctrine con-
fuse separate issues and make invalid assumptions (Langacker 2005,
2008). In fact, I can only wonder why, from the perspective of cognitive
linguistics based on conceptual semantics, the semantic definability
of fundamental notions would not be the default expectation. Alternatively,
it might be objected that phenomena like modes of scanning are
inappropriate as the basis for their description. I take this attitude as be-
ing not just gratuitous but wrong-headed. If general characterizations
are indeed possible, they have to be highly schematic. The only viable
candidates, I suggest, are mental capacities independent of any specific
conceptual content (e.g., the directing and focusing of attention, figure/ground
organization, conceptual grouping and reification, and the invocation
of mental spaces). Sequential and summary scanning are clearly
of this nature. We can reasonably expect them to have some linguistic
manifestation.
The critique by B&H is less concerned with these basic issues than with
the specifics of the CG analysis. Are sequential and summary scanning really
necessary for describing grammatical categories and constructions?
Are the proposed descriptions optimal linguistically and viable from the
processing standpoint? To properly assess these matters, we need a realis-
tic and more elaborate view of these modes of scanning and their imple-
mentation in online processing.
It would not be realistic, for example, to view sequential scanning
as consisting in a series of discrete experiences with each component
state (the relationship obtaining at a single point in time) being fully
activated at just that instant and not at all at either the preceding or the
following instant. Our primary experience is one of continuity, where
each component state morphs seamlessly into the next and thus provides
the basis for its emergence. In processing terms, the component configurations
(e.g., conceptions of the ball at particular positions along the incline)
are neither individuated nor instantaneous. Each is projected from
those which closely precede it and contributes to projecting those which
follow (cf. Grush 2007). To the extent that we distinguish them analyti-
cally, their activation thus has a temporal contour, where it rises to a
peak and then falls. The scanning is nonetheless sequential in that the
configurations reach their peak activation only momentarily and in the
order of their temporal manifestation.
It would likewise not be realistic to assume that sequential and sum-
mary scanning are mutually exclusive, so that only one can occur during
a given span of processing time. If my interest lies specifically in the trajectory
followed by a rolling ball, I can summarize over its positions,
building up a holistic conception of its path, even while watching the
event unfold. If summation is done through memory, a simulation of the
original sequential experience may well be necessary as a means of accessing
the component configurations. It is not implausible to suppose that
sequentiality and summation coexist in event conceptions as two aspects
or levels of processing activity, their relative salience depending on the
higher-level task. On this view, linguistic characterizations based on the
modes of scanning would indicate the extent to which the sequentiality
of direct experience is preserved in mental simulations.
Of course, mental simulations are only partial recreations of primary
experience. Relative to the original, a simulated experience is generally
less intense, less detailed, and realized during a shorter span of processing
time. This naturally holds for the simulations that figure in linguistic
meanings, including the sequential scanning claimed to be inherent in the
meanings of verbs. If a verb like swim incorporates perceptual and motor
images of this activity, their sequentiality is bound to be less salient than
that of the primary experience, if only due to temporal compression: it
takes a while to swim, but apprehension of the verbal meaning is mea-
sured in milliseconds. Other well-known factors also diminish its salience.
The first is automatization (or entrenchment), allowing a mental or physical
action to be executed in streamlined fashion without close attention
to specific details or temporal phases. Also, an element subordinated as
part of a larger structure tends to be compressed and attenuated com-
pared to when it occurs independently. We can thus expect the actual
manifestation of sequential scanning to be a matter of degree. Indeed,
apprehending a sequenced occurrence holistically may simply consist in
its sequential aspect falling below the threshold of awareness.
These matters bear on a key argument made by B&H concerning
my analysis of the English auxiliary. As schematic verbs, the auxiliaries
have and be involve sequential scanning, whereas the participial ele-
ments impose a summary view on the process they apply to. In com-
plex expressions, sequential and summary scanning therefore alternate at
successive levels of composition, as each derivational element (shown in
bold) imposes its construal on the structure already assembled: follow >
followed > be followed > being followed > be being followed > been being
followed > have been being followed. B&H suggest that this oscillation
between the scanning modes is inappropriate as a cognitively plausible
processing representation (506).
And it no doubt is. The question, though, is whether this implausible
oscillation is actually entailed by the characterization of verbs and
participles in terms of modes of scanning. The argument depends on the
following assumptions: (i) CG descriptions of grammatical structure are
intended as direct representations of what actually occurs in online language
processing. (ii) The metaphor of composition, whereby an expression
is constructed by successively combining smaller structures
into larger ones, is to be taken literally. (iii) Successive levels of composition
represent distinct temporal phases in online processing; the
structures at one level are fully implemented (i.e., the processing activity
comprising them is fully executed) before the next level is initiated. (iv) A
structure is fully implemented in all its occurrences; there is no attenuation
or compression due to factors like automatization or subordination
to other elements.
But I do not make any of these assumptions. The question of how lin-
guistic structures relate to online processing is a vital one that I have long
pondered and continue to wrestle with. While I do envisage an integrated
account, this remains a long-term goal owing to the sheer complexity
of the matter. Processing occurs simultaneously in many dimensions of
structure, at multiple levels of organization, and on vastly different time
scales. A complex expression cannot be reduced to any single representa-
tional format, type of cognitive activity, or path of mental access. So if
CG descriptions succeed in capturing certain aspects of semantic and
grammatical structure, they are not offered as direct or stand-alone accounts
of actual online processing. In particular, reference to composition
and structures at successively higher levels of organization should not be
interpreted as claiming that they are processed in temporal sequence from
bottom to top. To counteract this entailment of the standard composi-
tion metaphor, I often emphasize that a composite structure is an
entity in its own right, and that component structures serve only to cat-
egorize and motivate certain facets of it. Rather than being constructed
out of building blocks, an expression consists in an assembly of symbolic
structures linked by correspondences. What matters is an assembly's overall
coherence based on how the structures relate to one another, whether
they are accessed in top-down, bottom-up, or left-to-right fashion (or any
combination of these).
Regardless of sequence of access, can it plausibly be maintained that
multiple instances of sequential and summary scanning coexist within
the verb group of a single clause, as in have been being followed? Two
general considerations bear on the matter. First, we do have an evident
capacity for dealing with structures at numerous hierarchical levels, each
with its own intrinsic organization. In this respect the auxiliary sequences
are actually not dissimilar to a nominal expression such as a row of stacks
of plates. Whatever one claims about constituency or sequence of access
in online processing, it would seem to be undeniable that our overall
apprehension of this expression involves multiple structures with different
kinds of profiles: bounded things (plate, stack, row), plural masses
(plates, stacks), and relationships (of plates, of stacks). Moreover, certain
structures impose their organization on others: as a whole, for instance,
stacks of plates refers to the stacks, and the entire expression, to a row.
I do not see, then, that there is anything inherently implausible about
positing complex symbolic assemblies of this sort, where certain struc-
tures override the construal imposed by others.
Perhaps the modes of scanning are different in this regard. But in making
this assessment, we need to consider what happens in general when a
structure is subordinated as part of a symbolic assembly in which its own
organization is overridden at a higher level. The basic effect is one of
compression and attenuation. When subordinated as part of plates, the
conception symbolized by plate is much diminished compared to when it
occurs independently. The notion of a single bounded entity is evoked to
some extent, being inherent in the conception of a plural mass, but has
lesser salience due to the mass being focused as the profile. Likewise, a
syllable pronounced independently, with full stress, has fuller manifesta-
tion than when incorporated in unstressed position within a word, where
it has lesser amplitude, is shorter in duration, and is more susceptible to
phonetic reduction (e.g., vowel centralization).
We ought to expect a comparable effect when a verb is subordinated
to other elements. Presumably a processual conception is fully manifested
when it is profiled and grounded by a finite clause, thereby being
directly apprehended as the focused onstage entity. It is subject to com-
pression and attenuation when it has a background role as part of a
larger structure, thereby being apprehended in relation to other elements
that impose their own organization on the overall expression. By explicitly
overriding a verb's processual nature, elements like infinitival to and
the participial inflections allow its content to be invoked for functions
other than its primary one of heading a finite clause (e.g., for noun
modification or non-finite complementation). A verb's subordination is
bound to diminish its inherent sequentiality, if only by compressing it
into the smaller time-frame allocated to background structures. Its sequential
aspect may indeed be effectively or even wholly suppressed,
given that the very purpose of these constructions is to impose a summary
view.
I conclude that B&H, in their critique of the CG analysis of auxiliary
sequences, do not provide a valid argument against sequential and sum-
mary scanning. They do however raise a number of considerations that
were not addressed in the original formulation (Langacker 1987b: 78,
1991: ch. 5). Prompted in part by their comments, I am presently working
toward a revised description that is more comprehensive and more ame-
nable to an integrated account of structure and online processing. Al-
though it is too complex for detailed presentation here, in terms of both
analysis and the view of grammar it presupposes, I will briefly mention
some basic features.
I now consider the layering exhibited by auxiliary sequences, e.g.,
(have (-en (be (-ing (be (-ed (follow))))))), to be a matter of conceptual
organization (semantic scope) rather than grammatical constituency. The
same conceptual structure can be packaged and expressed symbolically in
different ways. Even using the same basic elements, it can often be coded
by symbolic assemblies providing alternate means of access to it. Constit-
uency is a non-essential aspect of such assemblies: the hierarchical
arrangement of symbolic composition. It is partial, variable, and non-exhaustive
of the groupings and relationships that figure in semantic
and grammatical structure (Langacker 1997, 2008). More important are
semantic correspondences specifying how component elements relate to
one another as overlapping facets of the composite conception (regardless
of order of composition), as well as the construal imposed on the overall
expression (notably its profile) for higher-level purposes. In particular, the
auxiliary sequence in a finite clause profiles the process representing the
outermost layer in terms of scope (e.g., have been being followed profiles
the have relation). This is the process profiled and grounded by the clause
as a whole.
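Purely as a notational aid, and not as a claim about online processing, the scope layering (have (-en (be (-ing (be (-ed (follow))))))) can be unpacked mechanically. The Python sketch below is a hypothetical illustration: the lookup table `INFLECT` and the functions `apply_layer` and `derive` are invented for this purpose and are not part of the CG analysis. It applies each element from the innermost layer outward and lists the intermediate forms.

```python
# Toy illustration only. INFLECT, apply_layer and derive are hypothetical
# names; the table covers just the forms needed for this one example.
INFLECT = {
    ("follow", "-ed"): "followed",
    ("be", "-ing"): "being",
    ("be", "-en"): "been",
}


def apply_layer(words, element):
    """Apply one scope layer to the word list built so far."""
    if element.startswith("-"):
        # A participial inflection attaches to the leftmost verb,
        # i.e. the head of the structure assembled up to this point.
        return [INFLECT[(words[0], element)]] + words[1:]
    # An auxiliary verb is placed in front as the new outermost layer.
    return [element] + words


def derive(stem, layers):
    """Return every intermediate form, from the bare stem outward."""
    words = [stem]
    steps = [stem]
    for element in layers:
        words = apply_layer(words, element)
        steps.append(" ".join(words))
    return steps


# Layers listed from innermost to outermost scope.
LAYERS = ["-ed", "be", "-ing", "be", "-en", "have"]

if __name__ == "__main__":
    print(" > ".join(derive("follow", LAYERS)))
```

Listing the layers innermost first is only a bookkeeping choice; nothing in the sketch implies that the forms are accessed in that order, or that each intermediate form is separately activated.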
Grammar is not conceptual structure, but rather a means of symboli-
cally expressing it. There is no expectation in CG that the organization
of symbolic assemblies should mirror that of the conceptual structures
coded by them. Assemblies exhibit multiple dimensions of organization,
with elements being grouped in different ways for different purposes.
One dimension comprises groupings based on semantic function. Within
a finite clause, for instance, the entire sequence have been being followed
functions semantically as the grounded structure (connected to the
ground by tense and modality). Also established as groups, on the basis
of their perspectival functions (perfect, progressive, and passive), are the
combinations (have -en), (be -ing), and (be -ed), which cross-cut the
groupings based on scope relations. A second dimension consists in com-
positional relationships, where component symbolic structures serve to
motivate (not, strictly speaking, to constitute) composite structures of
greater complexity. Within an auxiliary sequence, each word represents a
minimal grouping of this sort: ((have) (been) (being) (followed)). Observe
that this way of grouping basic elements is non-congruent with the other
two. A third dimension is temporal sequencing (linear order). Because
it defines a salient path of mental access (though certainly not the only
path), I regard the order of presentation as an aspect of linguistic meaning
in its own right, above and beyond its grammatical significance.
The various factors cited by B&H all have their place in a full descrip-
tion of auxiliary sequences recognizing these (and other) dimensions of
organization. Left-to-right processing is indeed a central feature. Structures
of different sizes (e.g., be Ved, be Ving, being Ved, have been Ving,
have been being Ved) do indeed achieve the status of well-rehearsed units
which can thus be invoked as familiar gestalts. To a significant extent
these units retain their analyzability (a matter of degree); have been being
Ved has not been reinterpreted as a single morpheme. Still, owing to
entrenchment and compression the constitutive elements do have lesser
salience. An account of this sort thus eliminates (or at least greatly
mitigates) any problem of oscillation between sequential and summary
scanning. The scope relations in question are not mirrored at the level of
symbolic expression, where only a modest degree of compositional layer-
ing is posited for auxiliary sequences. The words and phrases of these
sequences provide serial access to overlapping portions of the implied
conceptual structure. Since these words and phrases are mostly fixed expressions
in which verbs are subordinated to other elements, sequential
scanning is largely suppressed. Only the grounded process, being directly
apprehended and profiled by the clause as a whole, is manifested without
compression and with full realization of its inherent sequentiality.
The other main arguments advanced by B&H concern the issue of
whether the distinction between sequential and summary scanning is
descriptively necessary. For two cases, alternatives are proposed that do
not rely on this distinction. The first involves causative constructions,
where to-infinitives (e.g., with cause) are supposedly scanned in summary
fashion, and bare infinitives (e.g., with make) in sequential fashion. A
proposed alternative based on Givón's (1980) notion of binding is considered
in both synchronic and diachronic perspective. While I do not have
any quarrel with the B&H analysis, I fail to see its relevance, since I do not
in fact employ the scanning modes to distinguish the two constructions.
To my recollection, my only published analysis of bare infinitives regards
their occurrence with perception verbs, where they contrast with
active participles, e.g., I saw the ship sink vs. I saw the ship sinking (Lan-
gacker 1999: Ch. 7, 5). In that account, I posited for the former a zero
subordinator directly analogous to -ing, the only difference being that
-ing restricts the profile to some internal portion of the verbal process.
Hence in both constructions that process is said to be viewed holistically.
Now in retrospect I would avoid positing a zero subordinator. The imposition
of a summary view is better ascribed to the subordinating construction
itself, where the complement event is specifically apprehended in
relation to the matrix predicate (rather than being grounded and directly
apprehended in its own right). Be that as it may, I do not take V as being
sequentially scanned in either make V or cause to V, and I quite agree
that the constructions differ in degree of binding and other semantic factors,
such as the general future orientation of to (Wierzbicka 1988: Ch. 1).
B&H also question my appeal to modes of scanning in order to distin-
guish verbs from path prepositions. The discussion is based on my de-
scription of enter vs. into (Langacker 1992), which (at least in certain
uses) would seem to be a minimal pair, contrasting solely in this property.
Each profiles a complex relationship in which the trajector occupies all
the locations constituting a spatial path. In each case, moreover, it typi-
cally occupies these locations successively through time. Hence the verb
and the preposition are not consistently distinguished in terms of conceptual
content, profiling, or trajector/landmark alignment. I thus proposed
that the basic difference resides in whether the component states (the
relationships obtaining at each successive point in conceived time) are
accessed through processing time in sequential or summary fashion.
The alternative proposed by B&H supposedly accords a greater role to
distributional factors. Since path prepositions typically occur in combina-
tion with motion verbs, they incorporate a schematic motion event as an
unprofiled element in their base. The trajector of the preposition corresponds
to the trajector of this event (the mover). B&H do not say explicitly
what into profiles, but they represent its profile as a straight arrow,
using a squiggly arrow for the unprofiled motion event. As for enter, they
merely say that it foregrounds (i.e., profiles) the motion event without
featuring an extra component as part of its meaning. They represent
this event with the same straight arrow employed for the profile of into. I
confess to being mystified by these notations, which imply that enter is
simpler than into and is also non-eventive, being what remains when the
unprofiled motion event is subtracted from the latter. This is certainly not
their intent, and indeed, what they evidently have in mind is eminently
reasonable. I suggest, however, that when this proposal is spelled out in
more careful detail, it is not a true alternative to the CG analysis.
I fully agree that the basic sense of into incorporates a schematic motion
event as an unprofiled part of its base. It is not sufficient, however,
to say that the preposition's trajector corresponds to the mover; it must
further be specified that the path profiled by into is none other than the
path traversed by the mover in the motion event. Also in agreement with
B&H, I take the schematic motion event evoked by into as corresponding
to the specific event profiled by the verb it combines with; walk into the
garage thus indicates that the into path is the path followed in walking.
But it is no less true that a motion verb incorporates the schematic notion
of a path as a central part of its meaning: there is no motion without a
path of motion. This path functions as elaboration site for the verb's
combination with the prepositional phrase, and since the former functions
as profile determinant, the latter is a complement rather than a modifier.
Enter, of course, is slightly different in that the path is more specific, so
enter into the garage seems redundant and somewhat marginal (we would
normally say enter the garage, with the final location focused as enter's
landmark). This is apparently what B&H are alluding to when they say
that enter lacks an extra component of meaning. But enter does incor-
porate the notion of a path, just as into incorporates that of a motion
event.
I believe B&H would accept the following rephrasing of their analysis:
while they both indicate motion along a path, enter profiles the motion
(including a specification of the path), whereas into profiles the path, the
motion event being unprofiled (hence less salient). The difference thus
resides in the relative prominence of the motion component within the
overall conception. Although this is certainly correct so far as it goes, I
suggest that it does not go far enough. I noted that B&H do not explicitly
say what into profiles. More fundamentally, they offer no explicit characterization
of the central notions: motion and (presumably) path. This is
crucial, since the very reason for examining enter and into is that these
particular expressions allow us to see how minimally different these
notions are.
How might one describe the conception of motion (as in walk, enter,
etc.)? Its essential content is that the trajector, through time, occupies a
series of locations; it thus comprises a temporally ordered sequence of
spatial relationships (each potentially coded by a prepositional phrase,
like in the garage). What about the notion of a path? The issue is not the
meaning of the English noun path (which is merely used here for conve-
nience), but rather the shared conceptual import of prepositions like into,
along, and through. These, I presume, do not profile things but rather spatial
relationships. The distinctive property of these relationships is their
complexity: they do not consist in just a single configuration, where the
trajector occupies a single, point-like location, but in an ordered series of
such relationships. In typical uses, this ordering (the source of a path
preposition's inherent directionality) has a temporal basis, the component
configurations being realized successively through time.
Granted these characterizations, the two notions are equivalent in
terms of their essential content: both involve the trajector occupying a
temporally ordered series of locations. At least in some cases, therefore,
enter and into are not distinguished by their conceptual content (nor by
profiling or choice of trajector). How, then, do they differ? Along with
B&H, I would say that the motion component is less salient in the
case of into. But I have just argued that the motion and path components
are really the same, comprising an ordered series of locative relationships.
It would seem that the difference can only reside in the degree
of prominence accorded to the temporal basis of the ordering: whether
the profiled relationship is viewed primarily as developing through time,
or whether its temporal evolution remains in the background.
What is the actual conceptual import of evolution through time being
foregrounded rather than backgrounded? I suggest that this has the effect
of emphasizing an event's inherent sequentiality. Events unfold through
time, each component configuration obtaining at just a single instant. In
direct experience, this serial exposure to an event's component states
represents the primary mode of apprehending it. So to the extent that
meanings are based on simulations of such experience, foregrounding an
event's evolution through time ensures that its inherent sequentiality is
fully manifested. By the same token, it tends to be diminished or sup-
pressed when time is relegated to the background. In that case there is
greater emphasis on the component configurations per se, viewed in
abstraction from their temporal evolution. The balance thus shifts from
sequentiality to summation.
In sum, I see no intrinsic conflict between my own analysis of enter vs.
into and the one proposed by B&H. When the latter is spelled out in more
explicit detail (in a way that I believe conforms to their intent), it relies
on a distinction either equivalent or closely akin to sequential vs. sum-
mary scanning. I am not too concerned about the precise manner in
which the distinction is drawn. More important, from the CG standpoint,
is that there be some consistent conceptual basis for distinguishing verbs
from other categories.
B&H seem doubtful that we can or should even expect to find conceptual
characterizations of this sort. They suggest that such distinctions
can be regarded first and foremost as by-products of distributional facts,
i.e., of grammar as a usage-based model (509). I certainly agree with the
importance of distributional facts, in accordance with the usage-based
approach (Barlow and Kemmer 2000; Bybee and Hopper 2001; Lan-
gacker 1987a, 1988, 2000). Indeed, my analysis of enter and into is based
on distribution in just the way that B&H indicate. It is on the basis of
their grammatical distribution that enter and into are assigned to different
categories, by both the linguist and the language learner. Like B&H,
moreover, I presume that path prepositions, owing to their frequent
occurrence with motion verbs, incorporate a schematic motion event as
part of their meaning. (Of course, the converse also holds.) The issue is
whether distribution alone suffices, or is at least the primary factor (being
first and foremost). Here I can only suggest that distributional and conceptual
factors have to be considered in tandem. Without the latter, we
cannot explain why the observed distributional asymmetries (supporting
the postulation of verbs and a preposition-type category in language after
language) should arise in the first place. Could we explain this just by
positing semantic prototypes? I tend to doubt it, but in any case it behooves
us at least to look for general definitions. In my own view, the existence
of schematic characterizations of the sort proposed in CG ought
to be our default expectation. This fits in well with an overall approach
viewing grammar as a product of subjectification, whereby mental operations
inherent in concrete experience come to be applied in abstraction
from such experience as the basis for grammatical phenomena (Lan-
gacker 2004, 2008).
I have profited from this occasion to reexamine, refine, and hopefully
clarify my original description of the modes of scanning and their linguistic
application. In response to the question posed by B&H in their title, I
would still answer in the affirmative. But as is so often the case in linguistic
disagreements, we are dealing here with different presuppositions concerning
what is reasonable to assume as the basis for analysis. B&H start
from the supposition that the scanning modes are something outlandish
whose existence is to be doubted and whose application in linguistic de-
scription ought to be resisted. I start from the opposite view. We exhibit
the capacity for summation whenever we use a static shape to represent
the course of a movement event. Sequential scanning amounts to nothing
more than mental simulation of the sequentiality inherent in real-time
event experience. In different guises and under different labels (including
image schemas), simulation has come to be widely accepted as an
essential component of linguistic meaning (Johnson 1987; Barsalou 1999;
Matlock 2004; Hampe 2005; Bergen 2005; Svensson, Lindblom, and
Ziemke 2007). It is also widely agreed that simulations are extended to
circumstances beyond those giving rise to them, and that this plays a
major role in grammar and mental construction. Whatever its ultimate
outcome, the issue raised by B&H needs to be considered in this broader
context.
Received 15 February 2008 University of California, San Diego
Revision received 11 April 2008
References
Barlow, Michael and Suzanne Kemmer (eds.)
2000 Usage-Based Models of Language. Stanford: CSLI Publications.
Barsalou, Lawrence W.
1999 Perceptual symbol systems. Behavioral and Brain Sciences 22, 577–660.
Bergen, Benjamin
2005 Mental simulation in literal and figurative language understanding. In
Coulson, Seana and Barbara Lewandowska-Tomaszczyk (eds.), The Literal
and Nonliteral in Language and Thought. Frankfurt am Main: Peter Lang,
255–278.
Broccias, Cristiano and Willem B. Hollmann
2007 Do we need summary and sequential scanning in (cognitive) grammar?
Cognitive Linguistics 18, 487–522.
Bybee, Joan and Paul Hopper (eds.)
2001 Frequency and the Emergence of Linguistic Structure. Amsterdam/
Philadelphia: John Benjamins.
Forrest, Linda B.
1996 Discourse goals and attentional processes in sentence production: The
dynamic construal of events. In Goldberg, Adele E. (ed.), Conceptual
Structure, Discourse and Language. Stanford: CSLI Publications, 149–161.
Givón, Talmy
1980 The binding hierarchy and the typology of complements. Studies in Language
4, 333–377.
Grush, Rick
2007 Agency, emulation and other minds. Cognitive Semiotics 0, 49–67.
Hampe, Beate (ed.)
2005 From Perception to Meaning: Image Schemas in Cognitive Linguistics.
Berlin/New York: Mouton de Gruyter.
Johnson, Mark
1987 The Body in the Mind: The Bodily Basis of Meaning, Imagination, and Rea-
son. Chicago/London: University of Chicago Press.
Langacker, Ronald W.
1987a Foundations of Cognitive Grammar, vol. 1, Theoretical Prerequisites. Stan-
ford: Stanford University Press.
1987b Nouns and verbs. Language 63, 53–94.
1988 A usage-based model. In Rudzka-Ostyn, Brygida (ed.), Topics in Cognitive
Linguistics. Amsterdam/Philadelphia: John Benjamins, 127–161.
1991 Foundations of Cognitive Grammar, vol. 2, Descriptive Application. Stanford:
Stanford University Press.
1992 Prepositions as grammatical(izing) elements. Leuvense Bijdragen 81, 287–309.
1997 Constituency, dependency, and conceptual grouping. Cognitive Linguistics 8,
1–32.
1999 Grammar and Conceptualization. Berlin/New York: Mouton de Gruyter.
2000 A dynamic usage-based model. In Barlow, Michael and Suzanne Kemmer
(eds.), Usage-Based Models of Language. Stanford: CSLI Publications, 1–63.
2004 Possession, location, and existence. In Soares da Silva, Augusto, Amadeu
Torres, and Miguel Gonçalves (eds.), Linguagem, Cultura e Cognição: Estudios
de Linguística Cognitiva, vol. I. Coimbra: Almedina, 85–120.
2005 Construction grammars: Cognitive, radical, and less so. In Ruiz de
Mendoza Ibáñez, Francisco J., and M. Sandra Peña Cervel (eds.), Cognitive
Linguistics: Internal Dynamics and Interdisciplinary Interaction. Berlin/New
York: Mouton de Gruyter, 101–159.
2008 Cognitive Grammar: A Basic Introduction. New York: Oxford University
Press.
Matlock, Teenie
2004 Fictive motion as cognitive simulation. Memory and Cognition 32, 1389–1400.
Svensson, Henrik, Jessica Lindblom, and Tom Ziemke
2007 Making sense of embodied cognition: Simulation theories of shared neural
mechanisms for sensorimotor and cognitive processes. In Ziemke, Tom,
Jordan Zlatev, and Roslyn M. Frank (eds.), Body, Language and Mind,
vol. 1, Embodiment. Berlin/New York: Mouton de Gruyter, 241–269.
Tomlin, Russell S.
1995 Focal attention, voice, and word order. In Downing, Pamela and Michael
Noonan (eds.), Word Order in Discourse. Amsterdam/Philadelphia: John
Benjamins, 517–554.
1997 Mapping conceptual representations into linguistic representations: The role
of attention in grammar. In Nuyts, Jan and Eric Pederson (eds.), Language
and Conceptualization. Cambridge: Cambridge University Press, 162–189.
Wierzbicka, Anna
1988 The Semantics of Grammar. Amsterdam/Philadelphia: John Benjamins.
Metaphor and phonological reduction
in English idiomatic expressions
DANIEL SANFORD*
Abstract
Gibbs and O'Brien (1990) argue that the high degree of consistency in
speakers' mental images for common idioms indicates that conceptual
metaphors play a vital role in constraining the meaning of such expressions.
The experiment reported here, which monitors speakers' reading of the
stimuli used by Gibbs and O'Brien, seeks to uncover a phonological correlate
of the same phenomenon. It is argued here that the highly constrained
nature of the source domains from which metaphorically motivated idioms
draw their meanings (evidenced, in the Gibbs and O'Brien study, by the
consistency of speakers' mental images) causes words internal to such idioms
to be highly predictable. This predictability is associated with words
carrying a low semantic load, licensing their phonological reduction. The
hypothesis that words internal to idiomatic expressions are reduced in duration
is confirmed, and this is taken as further support for Gibbs and
O'Brien's findings that metaphors are to at least some extent active in the
on-line processing of idioms.
Keywords: metaphor; idiom; predictability; word length; reduction.
Cognitive Linguistics 19–4 (2008), 585–603
DOI 10.1515/COGL.2008.023
0936–5907/08/0019–0585
© Walter de Gruyter
* A version of this paper was presented at the 36th Annual Meeting of the Linguistic
Association of the Southwest in Denver, CO, and I am grateful for comments I
received there. Sincere thanks as well to Evan Ashworth and Joan Bybee of the
University of New Mexico, and to two anonymous reviewers for their highly helpful
comments and contributions. Contact address: Department of Linguistics, Humanities
Bldg. 526, University of New Mexico, Albuquerque, NM 87131-1196; author's e-mail:
<dsanford@unm.edu>.
1. Introduction
Idiomatic expressions, generally defined as utterances the meaning of
which can't be entirely derived from the words comprising them, have
long been problematic to linguists in that they seem to represent exceptions
to the principles by which language has long been assumed to
operate. Simple utterances such as he let the cat out of the bag are used
without difficulty or ambiguity by native speakers in casual conversation.
Such expressions, however, have proven extremely difficult to account for
in models of language that assume a clear distinction between the lexicon
(which contains words and morphemes) and rules (the morphological and
syntactic processes which operate on these entities). The reaction to idiom-
aticity on the part of the community of language researchers has ranged
between, on the one hand, the grudging acceptance that some multiple-word
collocations do exist in the lexicon, but only as exceptions that prove
the more general rule that morphology and syntax operate according to
rules of compositionality (Chomsky 1965, 1980; Jackendoff 1992), and,
on the other, idioms being embraced as central, rather than peripheral,
to how grammar operates, and the lexicon reconceptualized as incorpo-
rating elements from all levels of linguistic structure (Barlow and
Kemmer 1994; Goldberg 1995; Croft 2001; Wray 2002). According to
this second view of idiomaticity (the framework within which this paper
is situated), idiomatic expressions represent one end of a continuum
which places highly analyzable and semantically decomposable utterances
at one end, and highly specified, semantically opaque utterances (idioms)
at the other.
This view of idiomatic expressions as entrenched instances of constructions
is explicitly usage-based, the storage of constructions dependent
on the act of communication. Bybee and Hopper (2001: 19) state
that "the constituent structure [of a construction] is determined by frequency
of co-occurrence (Bybee and Scheibman 1999): the more two elements
occur together in a sentence the tighter will be their constituent
structure." The more a particular combination of elements (a construction)
is used, the more the construction is conventionalized as a linguistic
unit in itself.
Idiomatic expressions have proven a thorny issue, as well, in the field of
metaphor research, where they raise a different host of issues. While it
seems certain that an utterance such as the one given above is in some
sense figurative, it's less clear that it's predicated on a metaphor. If, indeed,
it is, then more questions arise: what metaphor? Does this metaphor
only exist in the past, fossilized in this one expression, or is it active in
the conceptual systems of speakers of English, activated whenever this
utterance is processed? The traditional answer to these questions is that
idiomatic expressions are merely the fossils of dead metaphors: whatever
metaphorical motivation might underlie them exists only in the history
of the language, long forgotten by speakers to whom the connection
between an idiom and its meaning is purely arbitrary (Aitchison 1987;
Cooper 1986; Cruse 1986; Strässler 1982).
In an attempt to challenge these assumptions, a landmark study on the
role of idiom in metaphor comprehension by Gibbs and O'Brien (1990)
presented subjects with the task of describing the mental images with
which they associate particular idiomatic expressions, and answering specific
questions regarding these mental images. In the case, for example, of
spill the beans, subjects were asked questions such as "How big is the
container?", "Is the spilling accidental or intentional?" and "After the beans
are spilled, are they easy to retrieve?" A systematic comparison between
subjects' responses for idiomatic expressions and subjects' responses for
closely related literal expressions (for example, spill the peas was the control
item corresponding to spill the beans) revealed two major findings.
First, subjects' responses to questions were markedly more consistent for
the idiomatic expressions than the literal ones. Second, within a group of
idioms sharing a particular figurative meaning but differing notably in
their surface forms (for example, blow your stack and hit the ceiling), subjects'
responses showed a high degree of consistency.
The authors interpret these results as strong support for the proposition
that the meanings associated with idiomatic expressions are predicated
upon the metaphors which underlie them, and that these metaphors,
far from being "dead", are quite active in the on-line processing of idiomatic
expressions. These findings are consistent with the view of metaphor
that has emerged out of the cognitive school, that highly conventionalized
metaphors reflect precisely those cross-domain mappings (in
which one cognitive domain is used as a framework whereby another is
conceptualized) which are most active (Fauconnier 1997; Lakoff 1987;
Sweetser 1992).
The experiment reported in this paper undertakes to shed light on the
intersection of these two views of idiom. It is exploratory as to the role
of metaphors which underlie common idioms in interacting with an effect
generally associated with frequency: phonological reduction, and, in
particular, word length (see Jurafsky et al. 2001 and Berkenfield 2000,
among others, for the effect of frequency on word length).
Specifically, this project addresses the question of whether metaphor
has an effect, aside from frequency, in the production of idiomatic expressions,
and if so, how this interacts with frequency-based effects. Participants
in the study completed a task in which they read, out loud, from a
list containing matched pairs of idiomatic and non-idiomatic expressions
(the same used in Gibbs and O'Brien 1990) differing from one another by
a single word, and the duration of the verb in each utterance was measured
using audio analysis software. It is hypothesized that metaphor has
an effect, beyond frequency, in causing the reduction of elements internal
to idiomatic expressions. This study posits, as the mechanism whereby
underlying metaphors effect the phonological reduction of words internal
to idioms, the high degree of predictability (caused by the motivating
metaphor) for semantic elements internal to the idiom.
According to the predictability-based account of word-level reduction,
words which are highly predictable based on their context carry a low semantic
load, which is reflected in production by shortening (Bolinger
1991; Fowler and Housum 1987). Elaborating on this view, Gregory et
al. (1999) assert that frequency-based and probability-based effects are in
fact different facets of the same phenomenon: speakers' probabilistic
knowledge of their own language is reflected in their production of it, such
that highly probable words (whether said probability be due to high frequency
or predictability) are reduced in duration. Metaphors, by their
nature, use the systematicity of highly cognitively structured domains to
structure the conceptualization of less concrete domains. The image
schema which forms the basis for an idiom such as let the cat out of the
bag is highly structured to the extent that let is highly predictable based
on the remainder of the string. The same conceptual metaphors, then,
that Gibbs and O'Brien (1990) view as governing the high degree of consistency
in speakers' mental images for idioms, should also be expected to
provide a high degree of contextual probability which, Gregory et al. assert,
will license word shortening. Such a finding would be consistent with
research which has linked the predictability of words in idiomatic expressions
to the same idioms' ease of processing (Cacciari and Tabossi 1988),
an effect described by Cronk, Lima and Schweigert (1993: 69) as "the biasing
context of the phrase itself".
It is hypothesized, then, that the duration of verbs in the idiomatic expressions
will be generally shorter than those in their literal counterparts,
as a result of their being highly predictable due to the effects of underlying
metaphors. In that, however, frequency effects are a well-attested
cause of word-level reduction, and that idioms have been advanced as instances
of frequent collocations, this effect will be taken as support of the
hypothesis only if the observed reduction in duration can be demonstrated
(using several statistical measures) to go significantly beyond
what might be expected from frequency effects alone. A second component
of the analysis, taking advantage of a survey method used to
independently assess the extent to which each stimulus instantiates a
metaphor, will aim to isolate the effect of metaphor from that of idiomaticity,
demonstrating that it is not simply the lexicalization of idiomatic
collocations which causes the reduction of verbs within the idiomatic
stimuli.
2. Methods
Twenty students from the University of New Mexico (11 males and 9
females) participated in the experiment. The participants were native
speakers of English between the ages of 18 and 30 and were offered a
small amount of extra credit for their participation by the instructors of
their introductory linguistics classes. The stimuli used in this experiment
(see Appendix 1) are taken directly from Gibbs and O'Brien (1990): 50
utterances comprising 25 matched pairs of idioms and literal, semantically
coherent utterances, each pair contrasting by a single word (blow
your stack vs. blow your tire, crack the whip vs. crack the glass). The 25
idioms represent five categories with respect to their figurative meanings:
Anger, Exerting Control/Authority, Secretiveness, Insanity, and Revelation.
For the purposes of this study, each utterance was altered slightly so
as to be a complete sentence, capable of being uttered with some degree
of naturalness by a speaker. Blow one's stack, for example, became Don't
blow your stack, while keep x in the dark became Don't keep him in the
dark. Both members of each linked pair were changed in the same way,
so that the altered stimuli remain different from one another by only a
single word. The 50 stimuli were presented to each participant in one of
four random orders.
Participants in the study, advised that there was no right or wrong
speed or way to read the stimuli and that the researcher was interested
only in how they, as native speakers, said each utterance, were directed
to look at each item, see what it said, and then say it out loud, moving
in this manner through the list of 50 items. Subjects' responses were recorded
using a digital recording device, and waveforms were generated
for the resulting audio files using the audio analysis program Sound
Forge 4.0. For each utterance, the verb (where relevant, the verb of the
subordinate clause, the portion containing the idiom in the 25 idiomatic
utterances: spill, for example, in Don't spill the beans and Don't spill the
peas) was isolated and measured in milliseconds.
Isolated words were chosen as the units being measured with respect to
duration (rather than the phrase as a whole) because 1) the hypothesized
effect applies to individual words, due to their level of predictability, rather
than to entire phrases, and 2) because measuring the entire phrase would
include the element being controlled, the nominal which conditions either
a literal or figurative reading. The verb, specifically, was isolated because
the nominals are the element being manipulated, and because grammatical
words are already subject to extensive frequency effects. The verb is
the remaining element which is common to all of the stimuli, and which
can therefore serve as a measure of the indicated effect. The nature of the
action, moreover, is an element expected to be highly constrained in the
image schema underlying each idiom, and thus the verb should be especially
subject to the predicted effect.
Several issues present themselves with respect to measuring the duration
of individual words in natural speech. Most critically, it can be
difficult to establish a clear dividing line between a word and the ones preceding
and following it. While Blow the lid/water off, for example, is relatively
straightforward, with blow beginning with the onset of the [b] and
ending with the onset of frication from the initial consonant of the word
which follows, others entail considerably more ambiguity. In natural production
of Hold your tongue/arm, for example, the [d] of hold and [j] of
your merge, in many cases, into a single voiced alveolar affricate. In Go
off your rocker/path, it's no mean task to decide where the final vowel of
go ends and the initial one of off begins. These issues have been, to some
extent, sidestepped here: in that the stimuli come in linked pairs, the measurement
of duration serves the purposes of the study as long as it takes
place according to the same criteria for each member of the pair. In the
examples cited above, the release of the dental stop [d] (into
either frication or a glide) was treated as the end of hold for both items in
the matched pair, and a characteristic trough in the waveform treated as
the end of go, for all participants.
3. Results
Table 1 presents the results (averaged by stimulus, with standard devia-
tions reported in parentheses to the right of each average) obtained
from the study, along with frequency data for each item. The frequency
figure given for each item reflects the frequency of the phrase in a corpus
of approximately 360 million words. The methods used in collecting frequency
data are outlined in Appendix 2.
In Table 1, the third column presents the duration of the target word
in each of the idiomatic utterances, while the entries in the fifth represent
the duration of the same verb in the corresponding, non-idiomatic
control utterances. The final column gives the difference between the
two, a positive number indicating that the verb in the idiom was (as predicted)
shorter than the verb in the non-idiom, a negative number the
inverse.
Table 1. Experiment data (frequency per approx. 360 million words; duration in ms, standard deviation in parentheses; Difference = Control - Idiom)
Stimulus (idiom/control)                       Idiom   Idiom            Control   Control          Diff.   Diff.
                                               freq.   duration (ms)    freq.     duration (ms)    freq.   dur. (ms)
Don't blow your stack/tire.                    6       151.65 (27.12)   0         187.65 (42.96)   -6      36
She hit the ceiling/table.                     22      170.9 (29.53)    35        172.25 (28.55)   13      1.35
He lost his cool/wallet.                       18      336.85 (58.62)   8         333.2 (54.42)    -10     -3.65
Try not to foam at the mouth/top.              10      244.25 (65.77)   1         278.4 (53.88)    -9      34.15
Don't flip your lid/coin.                      1       269.05 (65.05)   0         257.8 (60.74)    -1      -11.25
Let's crack the whip/glass.                    26      281.6 (36.36)    4         298.95 (48.46)   -22     17.35
It's time to lay down the law/tile.            49      176.05 (42.49)   0         170.15 (35.77)   -49     -5.9
I call the shots/police.                       153     209.9 (35.97)    783       223.55 (30.57)   630     13.65
You wear the pants/scarf.                      13      190.85 (33.89)   3         199.05 (43.96)   -10     8.2
Try to keep the ball rolling/steady.           14      203.85 (47.02)   0         218.85 (71.33)   -14     15
Just keep it under your hat/bed.               2       193.25 (38.83)   0         199.7 (37.44)    -2      6.45
Will you button your lips/shirt?               1       256.25 (42.42)   110       258.45 (52.3)    109     2.2
Please hold your tongue/arm.                   34      188.95 (41.41)   6         195.85 (38.38)   -28     6.9
He went behind my back/chair.                  4       195.95 (66.82)   4         180.95 (33.3)    0       -15
Don't keep him in the dark/garage.             4       232.5 (54.05)    0         222.9 (27.15)    -4      -9.6
Try not to go off your rocker/path.            1       127.85 (28.97)   0         153.25 (44.19)   -1      25.4
You're going to lose your marbles/hair.        2       221.45 (35.97)   19        252.65 (43.95)   17      31.2
Are you going to go to pieces/church?          36      151.35 (59.13)   602       107.35 (80.18)   566     -43.7
I suppose you'll lose your grip/keys.          6       239.25 (45.31)   6         274.25 (122.5)   0       35
They're going to bounce off the walls/floor.   15      281 (37.21)      2         279.35 (20.64)   -13     -1.65
Don't spill the beans/peas.                    47      279.55 (49.12)   0         324.35 (52.02)   -47     44.8
Try not to let the cat out of the bag/house.   18      168.95 (39.53)   0         178.25 (33.41)   -18     9.3
Let's blow the whistle/bubble.                 142     180.9 (46.81)    0         192.15 (32.81)   -142    11.25
Blow the lid/water off.                        16      202.35 (49.22)   1         221.1 (70.09)    -15     18.75
She has loose lips/teeth.                      31      229.65 (60.24)   12        236.85 (50.8)    -19     7.2
Average                                        26.84   215.4 (45.47)    63.84     224.7 (48.39)    37      8.9
As a measure of inter-rater reliability for verb-duration scores, data
from two randomly selected participants (10% of the data overall) were
coded by a second reader, who did not know the hypothesis of the experiment,
according to the same method. Pearson's correlation between the
sets of scores (all of the data for each of the two participants, including
both idiomatic and control utterances) yielded by the author and by the
second reader was r = .85, indicating a satisfactory degree of inter-rater
reliability, and allowing for a high degree of confidence in the observed
experimental effect.
Two approaches were taken to analyzing the data. The first follows the
original study (Gibbs and O'Brien 1990) in treating idiomaticity as a categorical
variable which is co-occurrent with metaphoricity, while the second
treats metaphoricity as a continuous variable, based on the results
of a survey. Both approaches are outlined below.
3.1. Idiomaticity as a categorical variable
Comparing results across the two levels of the independent variable of
idiomaticity, it's clear that the verbs in the idiomatic utterances were generally
shorter in duration than those in the controls, with this pattern
bearing out in 18 of the 25 matched pairs. Averaging across stimuli, the
verbs in the idiomatic expressions averaged 215.4 ms and the verbs in
the controls 224.7 ms. The verbs in the idioms, then, were on average 9.3
ms shorter than those in the controls. These results were statistically significant,
t(24) = 2.4, p < .05.
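The by-stimulus comparison can be re-derived from the rounded per-stimulus averages in Table 1. The following is a minimal sketch and not the author's own analysis script: it assumes that the test in question is a paired (dependent-samples) t-test over the 25 matched pairs, and the names PAIRS and paired_t are introduced here purely for illustration.

```python
import math
import statistics

# (idiom_ms, control_ms): mean verb duration per matched pair, copied from Table 1.
PAIRS = [
    (151.65, 187.65), (170.9, 172.25), (336.85, 333.2), (244.25, 278.4),
    (269.05, 257.8), (281.6, 298.95), (176.05, 170.15), (209.9, 223.55),
    (190.85, 199.05), (203.85, 218.85), (193.25, 199.7), (256.25, 258.45),
    (188.95, 195.85), (195.95, 180.95), (232.5, 222.9), (127.85, 153.25),
    (221.45, 252.65), (151.35, 107.35), (239.25, 274.25), (281.0, 279.35),
    (279.55, 324.35), (168.95, 178.25), (180.9, 192.15), (202.35, 221.1),
    (229.65, 236.85),
]


def paired_t(pairs):
    """Return (t, df) for a paired t-test on the control - idiom differences."""
    diffs = [control - idiom for idiom, control in pairs]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


t_value, df = paired_t(PAIRS)
shorter_idioms = sum(1 for idiom, control in PAIRS if idiom < control)
print(f"idiom verb shorter in {shorter_idioms} of {len(PAIRS)} pairs")
print(f"t({df}) = {t_value:.2f}")
```

Run on the Table 1 averages, this sketch counts 18 pairs in which the idiom verb is shorter and gives a t value of roughly 2.4 on 24 degrees of freedom; small deviations from the reported figures are to be expected because the table entries are rounded.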
One candidate for a confounding variable in the data is the possibility
that certain of the idiomatic items could be interpreted as literal: blow the
whistle, for example, could potentially be read as referring to the act of
blowing a whistle, rather than to the figurative meaning of calling attention
to a misdeed. A t-test on the data which removed those items which
were most likely to be interpreted literally was no more significant (i.e., the
likelihood that the observed effect was due to chance was no lower) than
the score indicated above, indicating that this was not an important
variable.¹
The possibility of the length of the nominal being manipulated affecting
the length of the rest of the phrase (and, therefore, the length of the
word being measured) was investigated using a t-test which looked for
a difference between the two experimental groups in the length, measured
in syllables, of the manipulated nominal. The results of this test,
t(24) = .37, p = .71, indicate that there was not a significant difference
between the two experimental groups in the number of syllables in the
word being manipulated, and therefore that this factor did not have a significant
effect on the duration of the verb.
1. Items removed in the control for literalness: Are you going to go to pieces/church?, I call
the shots/police, He went behind my back/chair, Let's blow the whistle/bubble, Don't spill
the beans/peas.
Figure 1 compares, for each participant, the averages of their responses
for the idiomatic and control utterances.
In Figure 1, idiomatic expressions correspond to the black line, and
the controls to gray. For 17 of the 20 participants, responses for idioms
were shorter than responses for controls, and the average (across participants)
difference between the verbs in the idiomatic and control utterances
is 8.9 ms. The results of the analysis by subjects also indicated
a significant difference between the two groups, t(19) = 3.8, p < .005.
Figure 1. Average target word duration by participant
In that the average duration of target words in the idioms is shorter
than the average duration of target words in the controls, despite the
fact that average frequency was considerably higher for the control utterances
than for the idioms (63.84 and 26.84, respectively), this initial look
at the data suggests that frequency does not wholly account for the observed
difference between the two groups. Two statistical methods were
used towards the goal of more fully understanding the relationship between
the two variables, both attempts to separate reduction due to idiomaticity
from reduction due to frequency effects. First, as a rough control for
frequency, the 20% of matched pairs which had the greatest difference, in
frequency, between the idiom and control utterances were removed from
the analysis.² This control for frequency, rough as it is, increased the effect
to t(19) = 3.3, p < .005 for the by-stimulus analysis, indicating
that the more variation due to frequency is removed, the greater the effect
due to idiomaticity.
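This rough frequency control can be sketched as follows. It is an illustrative reconstruction under stated assumptions, not the author's procedure: it assumes that "greatest difference in frequency" means the largest absolute difference in phrase frequency within a pair, and that the same paired t-test is then rerun on the remaining pairs. The names ROWS, paired_t and trim_by_frequency are hypothetical.

```python
import math
import statistics

# (idiom_freq, control_freq, idiom_ms, control_ms) per matched pair, from Table 1.
ROWS = [
    (6, 0, 151.65, 187.65), (22, 35, 170.9, 172.25), (18, 8, 336.85, 333.2),
    (10, 1, 244.25, 278.4), (1, 0, 269.05, 257.8), (26, 4, 281.6, 298.95),
    (49, 0, 176.05, 170.15), (153, 783, 209.9, 223.55), (13, 3, 190.85, 199.05),
    (14, 0, 203.85, 218.85), (2, 0, 193.25, 199.7), (1, 110, 256.25, 258.45),
    (34, 6, 188.95, 195.85), (4, 4, 195.95, 180.95), (4, 0, 232.5, 222.9),
    (1, 0, 127.85, 153.25), (2, 19, 221.45, 252.65), (36, 602, 151.35, 107.35),
    (6, 6, 239.25, 274.25), (15, 2, 281.0, 279.35), (47, 0, 279.55, 324.35),
    (18, 0, 168.95, 178.25), (142, 0, 180.9, 192.15), (16, 1, 202.35, 221.1),
    (31, 12, 229.65, 236.85),
]


def paired_t(rows):
    """Return (t, df) for a paired t-test on the control - idiom durations."""
    diffs = [control_ms - idiom_ms for _, _, idiom_ms, control_ms in rows]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n)), n - 1


def trim_by_frequency(rows, share=0.2):
    """Drop the given share of pairs with the largest absolute frequency gap."""
    n_drop = round(len(rows) * share)
    ranked = sorted(rows, key=lambda r: abs(r[0] - r[1]), reverse=True)
    return ranked[n_drop:]


kept = trim_by_frequency(ROWS)
t_trim, df_trim = paired_t(kept)
print(f"pairs kept: {len(kept)}")
print(f"t({df_trim}) = {t_trim:.2f}")
```

With the Table 1 values, the five pairs dropped are those with frequency gaps of 630, 566, 142, 109 and 49, and the t value on 19 degrees of freedom comes out at roughly 3.3.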
Second, an Analysis of Covariance (ANCOVA), which tests whether
the difference between two groups of observations (here, the duration
of the verbs in each of the two experimental conditions) is significant
after controlling for the effect of a pre-existing, confounding
variable (frequency), was used in an attempt to fully control for
the effect of frequency. A test for homogeneity of regression yielded
F(1, 46) = .28, p = .59, indicating that the assumption of homogeneity
of regression is satisfied (i.e., that the covariate, frequency, is
affecting each of the two groups equally) and that an ANCOVA can
be performed. The results of the ANCOVA itself indicate a statistically
significant main effect of idiomaticity on duration following a control
for frequency: F(1, 47) = 4.67, p < .05, r = .26.³ The model R-square
is .07, indicating that a relatively low level of variation is accounted
for by the two factors combined. The adjusted means for the experimental
and control group are .97 and 1.05, respectively, and η², a
measure of the size of the experimental effect, was calculated
at .09.
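For readers who want to see the shape of such an analysis, the sketch below runs a one-covariate ANCOVA on the Table 1 averages using plain least squares. It is an illustration under stated assumptions rather than the author's computation: it assumes that each duration is expressed as a ratio to the other member of its pair (as described in footnote 3), that raw phrase frequency is the covariate, and that the group effect is tested by comparing a model with the idiom/control factor against one without it. All function and variable names are hypothetical, and because the table reports rounded averages, the resulting F need not match the reported value.

```python
# (idiom_freq, control_freq, idiom_ms, control_ms) per matched pair, from Table 1.
ROWS = [
    (6, 0, 151.65, 187.65), (22, 35, 170.9, 172.25), (18, 8, 336.85, 333.2),
    (10, 1, 244.25, 278.4), (1, 0, 269.05, 257.8), (26, 4, 281.6, 298.95),
    (49, 0, 176.05, 170.15), (153, 783, 209.9, 223.55), (13, 3, 190.85, 199.05),
    (14, 0, 203.85, 218.85), (2, 0, 193.25, 199.7), (1, 110, 256.25, 258.45),
    (34, 6, 188.95, 195.85), (4, 4, 195.95, 180.95), (4, 0, 232.5, 222.9),
    (1, 0, 127.85, 153.25), (2, 19, 221.45, 252.65), (36, 602, 151.35, 107.35),
    (6, 6, 239.25, 274.25), (15, 2, 281.0, 279.35), (47, 0, 279.55, 324.35),
    (18, 0, 168.95, 178.25), (142, 0, 180.9, 192.15), (16, 1, 202.35, 221.1),
    (31, 12, 229.65, 236.85),
]


def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def residual_ss(columns, y):
    """Residual sum of squares after an ordinary least-squares fit of y."""
    k = len(columns)
    xtx = [[sum(a * b for a, b in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(a * v for a, v in zip(columns[i], y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(beta[i] * columns[i][row] for i in range(k))
              for row in range(len(y))]
    return sum((v - f) ** 2 for v, f in zip(y, fitted))


# One observation per utterance: duration ratio, group dummy (1 = idiom), frequency.
ratio, group, freq = [], [], []
for idiom_freq, control_freq, idiom_ms, control_ms in ROWS:
    ratio += [idiom_ms / control_ms, control_ms / idiom_ms]
    group += [1.0, 0.0]
    freq += [float(idiom_freq), float(control_freq)]

ones = [1.0] * len(ratio)
rss_full = residual_ss([ones, group, freq], ratio)   # intercept + group + covariate
rss_reduced = residual_ss([ones, freq], ratio)       # intercept + covariate only
df_error = len(ratio) - 3
f_group = (rss_reduced - rss_full) / (rss_full / df_error)
print(f"F(1, {df_error}) = {f_group:.2f}")
```

The model-comparison form is used here because it makes the logic of the test visible: the F statistic measures how much residual variation the idiom/control factor removes once frequency is already in the model. With 50 observations and three fitted parameters, the error term has 47 degrees of freedom.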
Each of the between-groups analyses presented in Section 3.1 indicates that
the observed experimental effect cannot be wholly traced to the effect of
the frequency of the phrases in which the target words are embedded.
These analyses are dependent, however, on the assumption that each of
the idiomatic utterances is metaphorical, and that each of the control
utterances is not (and, indeed, that all of the utterances in each category
are metaphorical to the same extent: either entirely, or not at all). Section 3.2
elaborates on these analyses by independently assessing the degree to
which each utterance used in the experiment is metaphorical, allowing
for a more fine-tuned approach to isolating the effect of metaphor from
that of frequency.
2. Items removed in the control for frequency: It's time to lay down the law/tile, I call the
shots/police, Will you button your lips/shirt?, Are you going to go to pieces/church?, Let's
blow the whistle/bubble.
3. In the ANCOVA, the durations of the target words were converted to a standard format
by expressing each measurement as a ratio of each item relative to the duration of the
other item in each matched pair (e.g., the duration of let in let the cat out of the bag /
the duration of let in let the cat out of the garage).
3.2. Metaphoricity as a continuous variable
One weakness of the experiment outlined here, inherited from the Gibbs
and O'Brien study on which it is based, is the conflation of metaphoricity
and idiomaticity. Despite the fact that, for each of the idioms used in this
experiment, idiomaticity and metaphoricity are closely interconnected
(the conventionality of the expression tightly linked to the conventionality
of the metaphor underlying it), the two cannot be treated as referring
to the same property: not all idioms are figurative, and of those that are,
not all are metaphorical. An argument addressing the causative factors of
the reduction taking place must make a clear distinction between the two.
In order to preserve the study's basis in the original Gibbs and O'Brien
(1990) experiment and allow for a direct comparison of results, the same
categorical variable of idiomaticity was used. A subsequent analysis,
however, attempts to look more directly at the effect of metaphor itself,
independently of idiomaticity, on reduction. In order to provide an independent
measure, for each utterance used in the experiment, of the degree
to which it instantiates a metaphor, a survey method (following Hoffman
1984; Gentner and Bowdle 2001; Coulson and Van Petten 2002) was
used. The methodology used for the survey is outlined in Appendix 3.
The results of the survey, averaged across participants and rounded to
the nearest tenth, are indicated in Table 2 (standard deviations are, again,
reported in parentheses to the right of the mean figure for each stimulus).
The numbers indicated are the mean of subjects' metaphoricity ratings for
each stimulus on a scale of 1 to 5, with a lower number indicating a more
metaphorical stimulus.
A glance at Table 2 demonstrates that survey participants, averaging
across the data, ranked idiomatic utterances about 1.8 points lower than
the controls on a 5-point scale with respect to the degree to which each
utterance is metaphorical. This indicates that the idiomatic utterances
are, as intended, clearly more metaphorical than the controls, a prediction
borne out by a t-test which shows a statistically significant difference
between the two groups, t(24) = 11.2, p < .005. This test supports the overall validity
of the categorical approach taken in Section 3.1 (and in Gibbs and O'Brien
1990), in that the idioms are, as intended, not only more idiomatic but
also more metaphorical than the controls. It also demonstrates the validity
of the survey method, in that variation between respondents' ratings
for items does not outweigh variation between the two experimental
groups.
In analyzing the survey data in connection to the experimental data,
all analyses use, as the unit of comparison, the difference between the
Idiom and Control (Idiom–Control, for frequency, metaphoricity, and
duration) for each matched pair (the values found in the rightmost columns
of Tables 1 and 2). The survey rating serves as an independent measure
of the difference, for each matched pair, between the metaphoricity
of the idiom and of the control, while the experimental data provides
the difference in reduction, for the target word, between the idiom and
control. A clear relationship between these two figures, across the set
of matched pairs used in the survey and experiment, indicates that
phonological reduction is related to metaphoricity. A test of the hypothesis
that metaphoricity correlates significantly with reduction yielded
t(23) = 2.16, p < .05, indicating that, at least for this preliminary analysis
of the relationship between metaphoricity and reduction, the null
hypothesis that there is no meaningful relationship between the two can
be rejected.
Table 2. Average metaphoricity rating by stimulus (lower number = higher metaphoricity)

                                              Idioms     Controls   Difference
                                                                    (Control - Idiom)
Don't blow your stack/tire.                   1.9 (1.2)  3.2 (1.4)   1.2
She hit the ceiling/table.                    1.9 (1.1)  4.1 (1.2)   2.2
He lost his cool/wallet.                      2.3 (1.3)  4.8 (.9)    2.5
Try not to foam at the mouth/top.             2.2 (1.3)  3.3 (1.6)   1.1
Don't flip your lid/coin.                     2.1 (1.1)  3.2 (1.3)   1.2
Let's crack the whip/glass.                   1.7 (.9)   3.4 (1.4)   1.8
It's time to lay down the law/tile.           2 (1.1)    4.2 (1.3)   2.2
I call the shots/police.                      2.6 (1.4)  4.8 (.5)    2.3
You wear the pants/scarf.                     2.4 (1.4)  4.3 (1.1)   1.9
Try to keep the ball rolling/steady.          1.8 (1)    3.1 (1.5)   1.4
Just keep it under your hat/bed.              2.5 (1.3)  3.9 (1.3)   1.4
Will you button your lips/shirt?              1.8 (1.1)  4.7 (.9)    2.9
Please hold your tongue/arm.                  2.3 (1.2)  4 (1.2)     1.7
He went behind my back/chair.                 2.5 (1.2)  4.6 (.9)    2.1
Don't keep him in the dark/garage.            2.2 (1)    4.2 (1.1)   2
Try not to go off your rocker/path.           2 (1.1)    2.6 (1.3)   0.7
You're going to lose your marbles/hair.       1.5 (.9)   3.5 (1.4)   2
Are you going to go to pieces/church?         2 (1.2)    4.9 (.7)    2.8
I suppose you'll lose your grip/keys.         3 (1.3)    4.2 (1.1)   1.2
They're going to bounce off the walls/floor.  1.6 (.7)   3.3 (1.6)   1.7
Don't spill the beans/peas.                   1.6 (.9)   3.7 (1.4)   2.1
Try not to let the cat out of the bag/house.  1.5 (.9)   4.4 (1.1)   2.9
Let's blow the whistle/bubble.                3.3 (1.3)  2.9 (1.4)  -0.5
Blow the lid/water off.                       2.3 (1.2)  3.2 (1.5)   0.9
She has loose lips/teeth.                     2.2 (1.3)  3.9 (1.4)   1.7
Average                                       2.1 (.6)   3.9 (.6)    1.7

In order to isolate, in this analysis, the effect of metaphoricity from that
of frequency, a multiple regression analysis was used. Multiple regression
analyzes the effects of each of a number of independent variables
(regressors) on a single dependent variable. This analysis, then, analyzed
the effect of the regressors frequency and metaphoricity on the dependent
variable of duration. The regression, overall, approached but failed to
reach statistical significance, F(2, 22) = 3.24, p = 0.058, which is
unsurprising given that the two factors of phrase frequency and
metaphoricity can't be expected, of themselves, to wholly account for word
duration. Notably, however, adjusted R², a measure of the portion of
overall variation which is accounted for by the model, increases from 13%,
in a regression which looks at frequency alone, to 16% when metaphoricity
is included.
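The two adjusted R² figures can be cross-checked against the reported F statistics. The sketch below assumes ordinary least-squares regressions with an intercept over the 25 matched pairs, and takes the frequency-only test, F(1, 23) = 4.49, from the regression described later in this section. The function names are illustrative only.

```python
def r_squared_from_f(f_stat, df_model, df_resid):
    """Recover R^2 from the overall F statistic of a regression."""
    return (f_stat * df_model) / (f_stat * df_model + df_resid)


def adjusted_r_squared(r_squared, n_obs, n_regressors):
    """Penalize R^2 for the number of regressors in the model."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_regressors - 1)


N_PAIRS = 25  # matched idiom/control pairs

# Reported tests: F(1, 23) = 4.49 for frequency alone and
# F(2, 22) = 3.24 for frequency plus metaphoricity.
r2_freq = r_squared_from_f(4.49, 1, 23)
r2_both = r_squared_from_f(3.24, 2, 22)

adj_freq = adjusted_r_squared(r2_freq, N_PAIRS, 1)
adj_both = adjusted_r_squared(r2_both, N_PAIRS, 2)

print(f"frequency alone:              adjusted R^2 = {adj_freq:.0%}")
print(f"frequency plus metaphoricity: adjusted R^2 = {adj_both:.0%}")
```

Under these assumptions the recovered values round to the 13% and 16% given in the text.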
One output of a regression analysis is a formula which can be used to
predict a set of values for the y-variable given a set of values for one or
more x-variables. In this case, the multiple regression yielded a formula
for predicting the difference in duration for the target word between
the idiomatic and control utterances, given the difference in frequency
and metaphoricity between the idiomatic and control utterance in each
matched pair. Another regression was completed in order to derive a
formula for predicting the difference in duration between the target in the
idioms and controls using frequency alone (F(1, 23) = 4.49, p < 0.05).
Both of these formulas were applied to the data, yielding two sets of
figures: expected values for difference in duration based on frequency
alone, and expected values based on both metaphoricity and frequency.
The average difference between the expected and actual figures using
frequency alone was 15.19 ms, and there was a .4 correlation between
the two sets of figures. The average difference between the expected and
actual figures using both metaphoricity and frequency was 14.3 ms, and
the correlation between the two sets of figures was .48. This analysis
provides direct verification of the hypothesis that metaphoricity and
frequency predict word duration more accurately than frequency alone.
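For a least-squares regression with an intercept, the correlation between fitted and observed values equals the square root of R², so the two correlations reported above can be checked against the F statistics. This is a sketch under the assumption that both regressions were ordinary least-squares fits over the 25 matched pairs; the function name is illustrative only.

```python
import math


def multiple_r(f_stat, df_model, df_resid):
    """Correlation between fitted and observed values implied by an overall F test."""
    r_squared = (f_stat * df_model) / (f_stat * df_model + df_resid)
    return math.sqrt(r_squared)


r_freq = multiple_r(4.49, 1, 23)  # frequency alone
r_both = multiple_r(3.24, 2, 22)  # frequency plus metaphoricity

print(f"frequency alone:              r = {r_freq:.2f}")
print(f"frequency plus metaphoricity: r = {r_both:.2f}")
```

The implied values agree with the reported .4 and .48. Note that adding a regressor can never lower this correlation, so the comparison is informative mainly alongside the adjusted R² figures.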
4. Discussion
The nature of the experiments reported here precludes, for several reasons
(most notably, the aforementioned high degree of interconnectedness
between idiomaticity and figuration, and the assumed imperfection of the
relationship between perceived and actual metaphoricity), an expectation
of an extremely robust experimental effect. These data, in combination
with the findings reported in Gibbs and O'Brien (1990) which attest to
the reality of the mental images underlying common figurative idioms,
are cautiously interpreted as consistent with the view outlined in §1. The
highly constrained nature of the metaphorical image schemas underlying
many common figurative idiomatic expressions causes words internal to
the idioms to be highly predictable, licensing their phonological
reduction. Both analyses, with their differing approaches to controlling
for frequency, indicate that the observed differences in the duration of
verbs in the experimental and control items are due neither to chance nor
wholly to frequency effects, bearing out the prediction that metaphoricity
has a direct effect on word shortening.
An alternate account for the findings presented here (and one also
highly consistent with the functionalist approach framing this research)
might posit the observed reduction in duration to be a function of the
holistic storage (Bybee 1995; Alegre and Gordon 1999; Titone and Connine
1999; Jurafsky et al. 2001) of idiomatic expressions. The loss of internal
complexity which accompanies the holistic storage of frequently occurring
multi-word collocations is accompanied by the reduction of internal
elements. This causes individual words within the utterance to be
reduced, in duration, during production. Holistic storage, however, is a
function of frequency, and the analyses presented here have demonstrated
that the observed effect is not wholly due to frequency.
This interpretation, moreover, would seem to run contrary to research
which has challenged the view of idioms as unanalyzable wholes. Nunberg,
Sag, and Wasow argue, with respect to the internal complexity of
phrasal idioms such as the ones employed here, that the conventionality
of such expressions does not necessitate a corresponding loss of
analyzability. They assert, rather, that the meanings of most idioms have
recognizable parts, which are associated with the constituents of the idiom
(1994: 531). Billig and MacMillan's (2005) study on the use of the
expression "smoking gun" in political discourse, similarly, demonstrates
that the use of conventionalized linguistic metaphors doesn't necessarily
become any more automatic as they become conventionalized, but rather that
the metaphor underlying such idioms can continue to be negotiated in
discourse. The predictability-based view of the observed reduction better
allows for some (if reduced) semantic load to be associated with words
internal to the idiom, and is therefore more consistent with findings such
as these.
Future studies on the role of metaphor in licensing the reduction of
words internal to idioms would do well to seek other independent measures
of the metaphoricity of phrases, to which word duration can be
compared. A study replicating both the original Gibbs and O'Brien
experiment, and the experiment reported here, would be well situated to
compare the duration of words internal to idioms to the consistency of
the mental images which speakers have for such idioms, potentially
providing very strong evidence both for the connection between
metaphoricity and duration, and for the metaphoricity of idioms.
5. Conclusion
The study reported in Gibbs and O'Brien (1990) indicates that, for common
idioms with figurative meanings, their meanings are not arbitrary,
but are constrained by conceptual metaphors. Nor, according to the
results of their study, can the metaphors that have assigned them their
meanings be relegated to the history of the language. Metaphorically
motivated idioms aren't the fossils of dead metaphors, with meanings now
determined on a purely lexical basis, but rather continue to draw their
meaning from the underlying metaphor, with speakers' covert awareness
of these metaphors accounting for the high degree of consistency in their
mental images for such expressions.
The results presented here demonstrate, for the same idioms, a
phonological correlate of the same underlying cause: the conceptual
metaphors which underlie many common idiomatic expressions cause the
shortening of elements within the idiom which are, via the highly
structured nature of the source domains of the metaphors, highly
predictable based on their context. These findings add yet another factor
to the growing host of factors which have been put forward as contributing
to word shortening, and point in hopeful directions as to a diagnostic tool
which might be used to identify metaphorically motivated idioms. They
offer support for the hypothesis, advanced in Gibbs and O'Brien (1990) and
elsewhere (Nayak and Gibbs 1991; Pfaff, Gibbs, and Johnson 1997; Cacciari
and Levorato 1998), that the meanings of figurative idioms are predicated
on underlying conceptual metaphors, and, should they be corroborated by
further research, point to the extension of the effects of metaphor into
the domain of phonology, underscoring the impossibility of treating
different levels of linguistic structure as operating largely
independently of one another.
Received 14 May 2007 University of New Mexico
Revision received 20 March 2008
Appendix 1: Experimental stimuli

Idioms                                   Control items

Anger
Don't blow your stack.                   Don't blow your tire.
She hit the ceiling.                     She hit the table.
He lost his cool.                        He lost his wallet.
Try not to foam at the mouth.            Try not to foam at the top.
Don't flip your lid.                     Don't flip your coin.

Exerting Control/Authority
Let's crack the whip.                    Let's crack the glass.
It's time to lay down the law.           It's time to lay down the tile.
I call the shots.                        I call the police.
You wear the pants.                      You wear the scarf.
Try to keep the ball rolling.            Try to keep the ball steady.

Secretiveness
Just keep it under your hat.             Just keep it under your bed.
Will you button your lips?               Will you button your shirt?
Please hold your tongue.                 Please hold your arm.
He went behind my back.                  He went behind my chair.
Don't keep him in the dark.              Don't keep him in the garage.

Insanity
Try not to go off your rocker.           Try not to go off your path.
You're going to lose your marbles.       You're going to lose your hair.
Are you going to go to pieces?           Are you going to go to church?
I suppose you'll lose your grip.         I suppose you'll lose your keys.
They're going to bounce off the walls.   They're going to bounce off the floor.

Revelation
Don't spill the beans.                   Don't spill the peas.
Try not to let the cat out of the bag.   Try not to let the cat out of the house.
Let's blow the whistle.                  Let's blow the bubble.
Blow the lid off.                        Blow the water off.
She has loose lips.                      She has loose teeth.
Appendix 2: Corpus methodology
In order to control for the effect of frequency causing reduction in
duration for the experimental items, the frequency of each utterance used
in the experiment was assessed in a corpus. The BYU Corpus of American
English (Davies 2008), a 360-million-word corpus which is equally
divided among genres of discourse (including spoken) and among years
from 1990 to the present, was used for this study.
The frequency figures used in the analysis (which appear in Table 1)
were gleaned on February 22, 2008. These figures report the frequency
only of that part of each utterance which comes from Gibbs and O'Brien
(1990), not of the overall string resulting from placing these items into a
complete utterance (e.g., the frequency of "let the cat out of the bag" was
assessed, not the frequency of "Try not to let the cat out of the bag"),
and it is the frequency of the exact string (not allowing for variations in
tense and person) which is reported here.
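The exact-string criterion can be illustrated with a short sketch. The figures in this study came from the corpus's own search interface; the function below, `exact_phrase_count`, is a hypothetical local stand-in that assumes plain-text input and case-insensitive matching (neither detail is specified above).

```python
import re


def exact_phrase_count(text, phrase):
    """Count occurrences of the exact word sequence `phrase` in `text`.

    Matching is case-insensitive (an assumption) and whole-word only, so
    inflected variants such as "lets" or "letting" are not counted.
    """
    words = [re.escape(word) for word in phrase.split()]
    pattern = r"(?<!\w)" + r"\s+".join(words) + r"(?!\w)"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


# Made-up sample text, not corpus data.
sample = (
    "Try not to let the cat out of the bag. "
    "She let the cat out of the bag, then he lets the cat out of the bag."
)
print(exact_phrase_count(sample, "let the cat out of the bag"))
```

The third clause of the sample is not counted because "lets" differs in person from the queried string, which mirrors the restriction described above.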
Appendix 3: Survey methodology
All 50 of the items used in the experiment (both idioms and controls)
were presented, in one of four different random orders, to a group of 34
participants who did not take part in the experiment but were drawn from
the same population as the experimental participants. Participants were
asked to rate each item on a scale running between 1, corresponding to
utterances which are very metaphorical, and 5, corresponding to
utterances which are not at all metaphorical. Metaphorical expressions were
defined, in the instructions for the survey, as expressions for which the
figurative meaning is not the same as the literal meaning, but for which
the literal meaning contributes to the figurative meaning. Participants
were not otherwise given direction as to what constitutes a metaphor, the
survey relying for the most part on their previous understanding of the
term. While the link between perceived and actual metaphoricity (used here
to indicate the extent to which an utterance activates an underlying
metaphor) is assumed to be imperfect, the method is anticipated to be
sufficiently fine-grained to provide a basis for making judgments as to the
link between metaphoricity and reduction.
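A minimal sketch of how per-item summaries of the kind shown in Table 2 might be derived from raw ratings follows. The ratings are hypothetical placeholders rather than survey data, and the use of the sample standard deviation is an assumption. The difference is taken before rounding, which would explain why some differences in Table 2 do not equal the difference of the two rounded means.

```python
import statistics


def summarize(ratings):
    """Return (mean, sample SD) of a list of 1-5 ratings, unrounded."""
    return statistics.mean(ratings), statistics.stdev(ratings)


# Hypothetical ratings for one matched pair (1 = very metaphorical,
# 5 = not at all metaphorical). These are NOT values from the survey.
idiom_ratings = [1, 2, 2, 1, 3, 2]
control_ratings = [4, 5, 3, 4, 5, 4]

idiom_mean, idiom_sd = summarize(idiom_ratings)
control_mean, control_sd = summarize(control_ratings)

# Difference computed from the unrounded means, then rounded for display.
difference = control_mean - idiom_mean

print(f"Idiom:      {idiom_mean:.1f} ({idiom_sd:.1f})")
print(f"Control:    {control_mean:.1f} ({control_sd:.1f})")
print(f"Difference: {difference:.1f}")
```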
References
Aitchison, Jean
1987 Words in the Mind: An Introduction to the Mental Lexicon. London:
Blackwell.
Alegre, Maria and Peter Gordon
1999 Frequency effects and the representational status of regular inflections.
Journal of Memory and Language 40, 41–61.
Barlow, Michael and Suzanne Kemmer
1994 A schema-based approach to grammatical description. In Lima, Susan,
Roberta Corrigan, and Gregory Iverson (eds.), The Reality of Linguistic
Rules. John Benjamins, 19–42.
Berkenfield, Catie
2000 The role of syntactic constructions and frequency in the realization of
English that. Master's thesis, University of New Mexico, Albuquerque, NM.
Billig, Michael and Katie MacMillan
2005 Metaphor, idiom and ideology: The search for "no smoking guns" across
time. Discourse and Society 16, 459–480.
Bolinger, Dwight
1981 Two Kinds of Vowels, Two Kinds of Rhythm. Bloomington, IN: Indiana
University Linguistics Club.
Bybee, Joan
1995 Regular morphology and the lexicon. Language and Cognitive Processes 10,
425–455.
Bybee, Joan and Paul Hopper (eds.)
2001 Frequency and the Emergence of Linguistic Structure. Amsterdam: John
Benjamins.
Bybee, Joan and Joanne Scheibman
1999 The effect of usage on degrees of constituency: The reduction of don't in
English. Linguistics 37, 575–596.
Cacciari, Cristina and Patrizia Tabossi
1988 The comprehension of idioms. Journal of Memory and Language 27,
668–683.
Cacciari, Cristina and Maria Chiara Levorato
1998 The effect of semantic analyzability of idioms in metalinguistic tasks.
Metaphor and Symbol 13, 159–177.
Chomsky, Noam
1965 Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
1980 Rules and Representations. New York: Columbia University Press.
Cooper, David
1986 Metaphor. London: Blackwell.
Coulson, Seana and Cyma Van Petten
2002 Conceptual integration and metaphor: An event-related potential study.
Memory and Cognition 30, 958–968.
Croft, William
2001 Radical Construction Grammar: Syntactic Theory in Typological Perspective.
Oxford: Oxford University Press.
Cronk, Brian, Susan Lima, and Wendy Schweigert
1993 Idioms in sentences: Effects of frequency, literalness, and familiarity.
Journal of Psycholinguistic Research 22, 59–82.
Cruse, D. Alan
1986 Lexical Semantics. Cambridge: Cambridge University Press.
Davies, Mark
2008 BYU Corpus of American English (360 million words, 1990–2007).
<http://www.americancorpus.org/>.
Fauconnier, Gilles
1997 Mappings in Thought and Language. Cambridge: Cambridge University
Press.
Fowler, Carol and Jonathan Housum
1987 Talkers' signalling of "new" and "old" words in speech and listeners'
perception and use of the distinction. Journal of Memory and Language 25,
489–504.
Gentner, Dedre and Brian Bowdle
2001 Convention, form, and figurative language processing. Metaphor and
Symbol 16, 223–247.
Gibbs, Raymond and Jennifer O'Brien
1990 Idioms and mental imagery: The metaphorical motivation for idiomatic
meaning. Cognition 36, 35–68.
Goldberg, Adele
1995 Constructions: A Construction Grammar Approach to Argument Structure.
Chicago: Chicago University Press.
Gregory, Michelle, William Raymond, Alan Bell, Eric Fosler-Lussier, and Daniel Jurafsky
1999 The effects of collocational strength and contextual predictability in lexical
production. Papers from the Regional Meetings, Chicago Linguistic Society
35, 151–166.
Hoffman, Robert
1984 Recent psycholinguistic research on figurative language. Annals of the New
York Academy of Sciences 433, 137–166.
Jackendoff, Ray
1992 Babe Ruth homered his way into the hearts of America. In Stowell, Tim and
Eric Wehrli (eds.), Syntax and the Lexicon. San Diego, CA: Academic Press,
155–178.
Jurafsky, Daniel, Alan Bell, Michelle Gregory, and William Raymond
2001 Probabilistic relations between words: Evidence from reduction in lexical
production. In Bybee, Joan and Paul Hopper (eds.), Frequency and the
Emergence of Linguistic Structure. Amsterdam: John Benjamins, 229–254.
Lakoff, George
1987 The death of dead metaphor. Metaphor and Symbolic Activity 2, 143–147.
Nayak, Nandini and Raymond Gibbs
1991 Conceptual knowledge in the interpretation of idioms. Journal of
Experimental Psychology: General 119, 315–330.
Nunberg, Geoffrey, Ivan Sag, and Thomas Wasow
1994 Idioms. Language 70, 491–538.
Pfaff, Kerry, Raymond Gibbs, and Michael Johnson
1997 Metaphor in using and understanding euphemism and dysphemism. Applied
Psycholinguistics 18, 59–83.
Strässler, Jürg
1982 Idioms in English: A Pragmatic Analysis. Tübingen: Gunter Narr Verlag.
Sweetser, Eve
1992 English metaphors for language: Motivations, conventions, and creativity.
Poetics Today 13, 705–724.
Titone, Debra and Cynthia Connine
1999 On the compositional and noncompositional nature of idiomatic
expressions. Journal of Pragmatics 31, 1655–1674.
Wray, Alison
2002 Formulaic Language and the Lexicon. Cambridge: Cambridge University
Press.
Contents Volume 19 (2008)
Cognitive Linguistics 19–4 (2008), 605–607
DOI 10.1515/COGL.2008.024
0936–5907/08/0019–0605
© Walter de Gruyter
Ben Ambridge and Adele E. Goldberg
The island status of clausal complements: Evidence in
favor of an information structure explanation (19–3) 357–389
Robert Botne and Tiffany L. Kershner
Tense and cognitive space: On the organization of
tense/aspect systems in Bantu languages and beyond
(19–2) 145–218
Filippo-Enrico Cardini
Manner of motion saliency: An inquiry into Italian
(19–4) 533–569
William Croft
On iconicity of distance (19–1) 49–57
Ewa Dąbrowska
Questions with long-distance dependencies:
A usage-based perspective (19–3) 391–425
Holger Diessel
Iconicity of sequence: A corpus-based analysis of the
positioning of temporal adverbial clauses in English
(19–3) 465–490
Gilberto Gomes
Three types of conditionals and their verb forms in
English and Portuguese (19–2) 219–240
John Haiman
In defence of iconicity (19–1) 35–48
Martin Haspelmath
Frequency vs. iconicity in explaining grammatical
asymmetries (19–1) 1–33
Martin Haspelmath
Reply to Haiman and Croft (19–1) 59–66
Martin Hilpert
New evidence against the modularity of grammar:
constructions, collocations, and speech perception (19–3) 491–511
Wolfram Hinzen and Michiel van Lambalgen
Explaining intersubjectivity. A comment on Arie
Verhagen, Constructions of Intersubjectivity (19–1) 107–123
Zhuo Jing-Schmidt
Much mouth much tongue: Chinese metonymies and
metaphors of verbal behaviour (19–2) 241–282
Ronald W. Langacker
Sequential and summary scanning: A reply (19–4) 571–584
Fey Parrill
Subjects in the hands of speakers: An experimental study
of syntactic subject and speech-gesture integration (19–2) 283–299
Karen Roehr
Linguistic and metalinguistic categories in second
language learning (19–1) 67–106
Daniel Sanford
Metaphor and phonological reduction in English
idiomatic expressions (19–4) 585–603
Anatol Stefanowitsch
Negative entrenchment: A usage-based approach to
negative evidence (19–3) 513–531
Arie Verhagen
Intersubjectivity and explanation in linguistics: A reply
to Hinzen and van Lambalgen (19–1) 125–143
Daniel Wiechmann
Initial parsing decisions and lexical bias: Corpus
evidence from local NP/S-ambiguities (19–3) 447–463
Arne Zeschel
Introduction (19–3) 349–355
Arne Zeschel
Lexical chunking effects in syntactic processing (19–3) 427–446
Book reviews
Vyvyan Evans and Melanie Green. Cognitive Linguistics.
An Introduction.
Reviewed by René Dirven (19–2) 338–348
Verena Haser. Metaphor, Metonymy and Experientialist
Philosophy: Challenging Cognitive Semantics.
Reviewed by Dominik Lukes (19–2) 313–324
Catherine E. Travis. Discourse Markers in Colombian
Spanish: A Study in Polysemy.
Reviewed by Natalya I. Stolova (19–2) 301–313
Dirk Geeraerts (ed.), Cognitive Linguistics: Basic Readings.
Reviewed by Thora Tenbrink (19–2) 325–338