
Identifying Metaphor in Language: a cognitive approach

Style, Fall, 2002 (9/22/02) by Gerard J. Steen

1. The Cognitive Paradigm and Linguistic Metaphor Identification

One paradoxical effect of the cognitive turn in metaphor studies has been the neglect of the linguistic analysis
of metaphorical language. Many metaphor scholars have concentrated on fleshing out the presumed conceptual
connections between related metaphorical expressions, but they have not really turned back to examine how
and why which conceptual metaphors are expressed in the way they are in which contexts of language use. In
recent years there has been some improvement of this situation, but the balance has still not quite been
adjusted (cf. Cameron and Low; Deignan; Goatly Language of Metaphors; and Gibbs and Steen).

One of the reasons for this slow development of an examination of linguistic metaphor is that it involves a
transition from one type of metaphor research to another. Most cognitive linguistic research on metaphor may
be characterised as theory building, in which concepts and hypotheses are developed about the nature of
conceptual metaphor. To be sure, such theories have empirical underpinnings, in that their authors are careful
to collect many linguistic examples that corroborate their theoretical constructs. To put this slightly differently,
these are theories meant to be put to the test in empirical research. In that respect, they are not like the
hermeneutic theories of philosophers like Ricoeur or the analytical theories of philosophers like Davidson.

However, when such cognitive-linguistic ideas are actually tested in linguistic research, practical problems arise
which require their own solutions. In particular, the most urgent problem is the reliable identification of
metaphors in on-going discourse. If cognitive metaphor theories are to be tested or applied to authentic
language use, the reliable identification of linguistic metaphors is a conditio sine qua non. For linguistic research
does not set out with a preconceived set of conceptual metaphors, but instead has to deal with spontaneous
metaphorical expressions as they are encountered in concrete uncontrolled language use. There is a decided
difference between the postulation of conceptual metaphors such as LIFE IS A JOURNEY, LOVE IS A JOURNEY,
HAPPY IS UP, and so on, as well as their illustration by well-chosen examples, on the one hand, and the
technical identification in on-going discourse of expressions presumably related to such postulated conceptual
metaphors, on the other hand.

Of course, many researchers have attempted to bridge the gap between the cognitive theory of metaphor and
the identification of linguistic metaphor in analyses of individual texts and discourse, or, indeed, in their
development of textual materials for experimental purposes. However, these researchers have all had to rely on
their own solution to the problem of formulating an operational definition of linguistic metaphor. A
meta-analysis of such studies might well be able to show that there is a good deal of variation between these

An explanation of this lack of a reliable instrument for metaphor identification may lie in the two main traditions
dealing with cognitive metaphor. Psychologists do believe in the importance of reliable measuring instruments,
but, as far as metaphor identification is concerned, they have not really needed one for their experimental
purposes. They can simply adopt the same strategy as the cognitive linguists who are in the business of theory
formation, and restrict themselves to relatively clear and pre-selected cases. Cognitive anthropologists,
linguists, and poeticians, by contrast, do not work in a tradition where the reliability of a measuring instrument
has to be evaluated and reported. Scholars who work on literary texts or on the analysis of metaphor in
conversation, in particular, are used to supporting their method with argumentation, often celebrating the
combination of analysis and interpretation rather than separating them out in the service of achieving reliability.
As a result, neither of the two main traditions of cognitive metaphor research has devoted much time to the
reliability of linguistic metaphor identification.

Two linguistic traditions that generally do have an interest in reliable measuring instruments are corpus
linguistics and applied linguistics (Barnden, Lee, and Markert; Cameron and Low). When large bodies of
discourse are to be examined for metaphorical language, it is of paramount importance to have a reliable
identification procedure. And when linguistic insights are to be applied to issues in language learning,
translation, text design, and so on, we are dealing with a type of research which is also accustomed to looking
at the reliability of the data. It is this basically social-scientific attitude towards research that has led to a
project on the feasibility of a metaphor identification procedure in discourse (Crisp; Crisp, Heywood, and Steen;
Heywood, Semino, and Short; Semino, Heywood, and Short; Steen "Five Steps," "Reliable Procedure," "Rhetoric
of Metaphor," and "Towards a Procedure"). It should be noted that such a procedure is not limited to ordinary
discourse: it may also be helpful in the empirical study of literature, which employs corpus-linguistic and
experimental techniques in the study of literary discourse (Steen "Rhetoric of Metaphor"). And even less
empirically minded scholars of literature may find it useful in their critical analyses and interpretations of single
literary texts (cf. Steen "Analyzing Metaphor").

The metaphor identification project is based on the cooperation between ten linguists covering such areas as
linguistics, stylistics, cognitive linguistics, psycholinguistics, and applied linguistics. The members of the group
are, in alphabetical order, Lynne Cameron, Alan Cienki, Peter Crisp, Alice Deignan, Ray Gibbs, Joe Grady, Zoltan
Kovecses, Graham Low, Elena Semino, and myself. We have coined the name "Pragglejaz" for our group and
project, consisting of the first letters of our first names. We have done joint research on metaphor identification
in five poems, one newspaper article, and two stretches of verbal interaction and have had several meetings to
discuss our results and program for the future. We have achieved some moderate success in attaining
satisfactory reliability for our small sample of texts, about which more will be said in the next sections. For
further information, I refer to the group's Web site at

It is the purpose of this contribution to present the most important results of our work on the identification
procedure. I will first deal with the background to the procedure. Then I will summarize the main findings of our
reliability studies. I will go into the general patterns of our results and continue with discussing some practical
and substantial details. To do this, I will concentrate on a poem by Tennyson that has been part of our sample,
"Now Sleeps the Crimson Petal."

2. General Assumptions about Metaphor, Cognition, and Language

The general goal of the identification project is simple: to enable researchers to reach agreement about what
counts as a metaphorical expression in any piece of discourse. In order to achieve this goal, a general
theoretical framework about metaphor has to be assumed in order to derive an operational definition of
metaphor in discourse. Another requirement is a method of dividing natural discourse up into smaller units of
analysis, so that these may be used for the application of the operational definition in order to inspect the
discourse for the incidence of metaphor. Theoretical framework, operational definition, and method of discourse
analysis all have to be reasonably compatible with the cognitive orientation of contemporary metaphor
research, while also being compatible with some of the more general assumptions regarding the cognitive
analysis of discourse. This is in the spirit of the two basic commitments of cognitive linguistics, the cognitive
commitment and the generalization commitment (e.g., Steen and Gibbs).

2.1. Theoretical Framework

The main motive for a metaphor identification procedure is to minimize measurer bias. Suppose you wish to
compare the properties of metaphors in one sample as opposed to another, as in two works of one author, or
one author as opposed to another, or one genre opposed to another, or even literary as opposed to nonliterary
language. If different analysts use different measures for metaphor identification, it will not be clear whether
any resulting differences (or similarities, for that matter) between the two samples compared are due to the
nature of the materials or to the bias of the researchers. It is simply good scientific practice to exclude
measurer bias as much as possible, and a standardized procedure that produces demonstrably reliable results is
one of the best means to do so.

Moreover, there is an intrinsic interest in developing such a procedure. Cognitive linguistics promises a uniform
and precise view of the nature and function of conceptual metaphor in language. However, if we do not have
one generally accepted means to establish what counts as metaphorical language in spontaneous discourse, we
shall never be able to fulfil the promise of cognitive linguistics by proceeding to systematic tests of the
program. In that case, the cognitive paradigm will remain a theoretical position, justly criticized by empirical
researchers of other persuasions, such as Murphy.

An important consequence of adopting this standpoint is the emergence of a functional difference between
technical metaphor identification by the analyst and spontaneous (or elicited) metaphor recognition by the
language user. This distinction has bemused some cognitive linguists and poeticians, since "we" are all
supposed to understand metaphor in the same way. However, a procedure for metaphor identification serves as
a technical instrument of linguistic analysis, providing data on metaphorical expressions in different contexts of
discourse for the purpose of their further scientific study in relation to some research question. This bears a
tenuous relation to how people understand metaphor. For that is an issue which may precisely be the object of
the more encompassing research question, for which the linguistic metaphors have to be collected in the first
place. How people understand metaphors involves technical metaphor identification data plus other types of
information about an aspect of people's understanding of these metaphors, such as additional metaphor
analysis in context, informant judgements or behavior relating to the metaphors, and so on. Metaphor
identification is an instrument to study such phenomena as metaphor recognition.

Given these preliminary considerations, the general theoretical framework is predominantly cognitive-linguistic
in nature (e.g., Lakoff "Contemporary Theory"), but with a decidedly behavioral or social-scientific orientation
(Steen and Gibbs). Here are the most pertinent assumptions that we have adopted for our work (see Crisp, and
Steen "Towards a Procedure," for details):

* Meaning is grounded in knowledge

* Literal meaning is direct meaning, metaphorical meaning is indirect meaning (in the sense of Lakoff
"Meanings of Literal," not in the sense of Searle)

* Metaphor is primarily a matter of conceptual structure, and derivatively a matter of language

* Metaphor is a set of correspondences between two concepts in two different knowledge domains (Lakoff
"Contemporary Theory")

* Metaphor may be conventional, systematic, and familiar, or not

* Metaphor, whether conventional or not, may be deliberate or "emergent" (Cameron)

* Metaphor may be signaled as such, or not (see Goatly Language of Metaphors)

* Metaphor may be expressed at various levels of linguistic organization and in various rhetorical forms

Most of these assumptions are widely shared among cognitively oriented researchers of metaphor today.

2.2. Operational Definition

The operational definition of metaphor we have adopted aims to get a concrete handle on the linguistic
expression of conceptual metaphor. This means that we need a level of discourse analysis that provides a link
between linguistic and conceptual metaphor. We have found the cognitive-psychological notion of the
proposition to be most useful in this context (Crisp; Steen "Towards a Procedure").

Following the work by Kintsch, propositions are minimal idea units representing the conceptual content of
linguistic expressions. They consist of a conceptual predicate and one or more conceptual arguments, in the
form of predicate calculus. They have been used in text psychology to get from the linguistic surface structure
of the text to the conceptual text base, a linearly ordered and hierarchically organized list of propositions
expressing the content of a text in a form that closely corresponds to the language itself. The difference
between the two, however, is that the text base makes explicit all kinds of notions that have been left implicit in
the surface structure of the text, by means of substitution, ellipsis, presupposition, and so on.

For instance, consider the following line by Tennyson, the opening line from one of the poems in our sample:

Now sleeps the crimson petal, now the white

A propositional analysis of this line would look like this:







The capitals are used to indicate that we are dealing with concepts, not words. The division of the line into two
parts will be explained below. And the underlining indicates that SLEEP is used metaphorically.

There is a stark difference between the analysis of the two main clauses in this first line. The propositional
analysis of the second main clause has filled in the concept for the ellipted verb "sleep" and it has explicated
the substitution "the white" by a complete conceptual structure, "THE WHITE PETAL." This means that the text
base contains completely explicit thoughts in the form of conceptual propositions.

The example may also demonstrate the importance of using a propositional text analysis. A strictly formal
linguistic analysis would not be able to recognize the second main clause of this line as metaphorical. The
explication of the presupposed, recoverable meanings is a necessary act of discourse analysis to show that we
are dealing with metaphor, and, what is more important, it may be more or less problematic and controversial,
depending on the structure to be explicated. A more complicated case will be presented in the last section.

It is also important that propositions may be seen as miniature ideas or thoughts. Since thought may be
metaphorical, propositions may come in at least two kinds: metaphorical and nonmetaphorical propositions.
Our operational definition of metaphor hence turns on the detection of the metaphorical usage of one or more
concepts inside a proposition. Thus, in the above example, the first proposition of 1a and of 1b is metaphorical,
and expresses a metaphorical thought, that a petal sleeps. This is because the use of the concept SLEEP is not
literal but metaphorical. However, the third proposition of 1a and 1b is not metaphorical, for it expresses the
literal idea that a petal is crimson or white. All of the concepts in both P3s directly refer to the referents in the
text world they are presumably linked to.
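The propositional format can be made concrete with a small sketch. It is an illustration only, not the notation used in the project: the class name `Proposition`, the `metaphorical` field, and the helper `metaphorical_labels` are hypothetical, and writing the colour proposition as a predicate over PETAL is an assumption about format. Only the two propositions that the text describes explicitly (the first and the third of each unit) are encoded.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple


@dataclass(frozen=True)
class Proposition:
    """A minimal idea unit: one conceptual predicate plus its arguments."""
    predicate: str
    arguments: Tuple[str, ...]
    # Concepts the analyst judges to be used metaphorically
    # (shown by underlining in the printed analysis).
    metaphorical: FrozenSet[str] = frozenset()

    def is_metaphorical(self) -> bool:
        # A proposition counts as metaphorical if at least one of its
        # concepts is used metaphorically.
        return bool(self.metaphorical)


# Unit 1a: "Now sleeps the crimson petal"
unit_1a: Dict[str, Proposition] = {
    "P1": Proposition("SLEEP", ("PETAL",), frozenset({"SLEEP"})),
    "P3": Proposition("CRIMSON", ("PETAL",)),  # assumed format
}

# Unit 1b: "now the white", with the ellipted verb and noun made explicit
unit_1b: Dict[str, Proposition] = {
    "P1": Proposition("SLEEP", ("PETAL",), frozenset({"SLEEP"})),
    "P3": Proposition("WHITE", ("PETAL",)),  # assumed format
}


def metaphorical_labels(unit: Dict[str, Proposition]) -> List[str]:
    """Return the labels of the metaphorical propositions in a unit."""
    return [label for label, prop in unit.items() if prop.is_metaphorical()]
```

On this encoding, `metaphorical_labels` picks out P1 in both units, even though the second unit contains no metaphorically used word on the surface. That is the point made above about explicating ellipsis before metaphor can be detected.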

One advantage of propositions is that they may be used to build mental models, such as situation models
(Kintsch). Concepts in propositions, both literal and metaphorical, provide entry to much richer knowledge
schemes, which include other concepts that play a linguistically implicit role in the mental model. The
explication of these other concepts leads the way to the construction of a complete mental mapping between
the literally and metaphorically used concepts in the proposition. For instance, the concept of sleeping in the
Tennyson line may activate a complete sleeping scenario, in which information about the action, purpose,
location, and props of sleeping are filled by default values. These pieces of information may be exploited in the
construction of the mapping between the source and target domain, as will be illustrated in the next section.
This means that propositions can form a bridge between linguistic and conceptual metaphor (Steen "Five

One hidden assumption is that we concentrate on stretches of discourse as parts of messages between
language users. We do not look at metaphor in the language system as an abstract entity (cf. Kovecses). We
focus on metaphor as nonliteral expression in concrete messages that have a linguistic and a conceptual
structure. It is especially important that we work with a conceptual-referential approach, in which words
activate concepts which play a role in more encompassing and possibly abstract mental models (cf. Goatly
"Text-Linguistic Comments"). This is also why a propositional analysis does not have to be at odds with the
analysis of image metaphor (Crisp). And, more importantly, it is the reason why metaphor identification has to
be situated in the framework of discourse analysis. I have explained my general view of such a
message-oriented approach to discourse, and its relation to literature as discourse, in "Genres of Discourse" and in
"Poetics and Linguistics Again."

2.3. Units of Analysis

This leads on to the matter of units of analysis. For the operational definition of metaphor as propositions needs
to be bounded in terms of its domain of application. Propositions are found at all levels of linguistic expression,
and the question arises whether each metaphorical proposition should be seen as a metaphor in its own right.
Different answers may be given to this question in different research contexts, as we have discussed in Crisp,
Heywood, and Steen. However, in the Pragglejaz metaphor identification project we have adopted one specific
level of abstraction for the analysis of metaphorical language, and that is one which I shall refer to as the
discourse unit. Metaphorical propositions within discourse units are treated differently than metaphorical
propositions between discourse units.

The discourse unit is that level of abstraction in discourse which splits up all relatively independent utterances
in a message. It is identical to the notion of minimal text span utilized in Rhetorical Structure Theory (Mann and
Thompson), and may be roughly described as all semi-independent or nondowngraded clauses. This also
happens to correspond with at least two forms of propositional analysis, the ones advanced by Bovair and
Kieras and by Kintsch. Another way of explaining this is to say that all utterances in a message with their own
illocutionary force and reference to a state of affairs are discourse units.

The effect of adopting the discourse unit as the basic level of analysis for linguistic metaphor may now be
explained with the help of our previous example. As will have been noticed in our discussion of the propositional
analysis, the first line of the poem was not treated as one whole, but broken up into two units, each consisting
of a main clause. These are the discourse units used as units of analysis. Each of these units displays its own
metaphorical proposition and moreover contains only one metaphorically used concept. They thereby
instantiate one particular class of metaphor out of many (see the taxonomy in Crisp, Heywood, and Steen).

The pattern with only one metaphorically used word in a discourse unit seems to be the most frequent pattern
in discourse. Examples of more complex patterns will be given below. What is more interesting at present is the
fact that our framework leads to the conclusion that line 1 contains two metaphors, one in each unit. Another
approach might say that line 1 contains only one metaphor, namely the one of a mapping between the domains
of petals and sleeping. I prefer the two metaphor approach, because it facilitates a dynamic perspective on
metaphorical language as it unfolds from one discourse unit to another: I would like to call the first metaphor
the beginning of a serialized metaphor, and the second the end of a serialized metaphor. This analysis captures
the two-fold nature of the metaphors in question, namely that they can be read as relatively self-contained and
self-explanatory expressions, on the one hand, but that they may also be seen as part of a more encompassing
discursive pattern, which will affect their isolated readings. I will come back to this issue at the end of this

Another advantage of our approach is that it makes it possible to quantify metaphor at a level of discourse units
in messages which has been independently recognized by discourse analysts and reading psychologists. This is
especially important for the application of our instrument to corpus linguistics and experimental psychology. A
proposal for handling these distinctions in annotating texts in corpora has been made by Semino and Steen.

To sum up, we have seen how research on the feasibility of a metaphor identification procedure requires a
number of assumptions. The general theoretical framework of our project may be characterized as cognitive
linguistic. The operational definition of metaphor as propositions comes from the field of reading psychology
and may seem at odds with Lakoff's view in his "Contemporary Theory" that metaphors are not propositions.
However, propositions may actually form a bridge between linguistic and conceptual metaphor and have proven
to be a very useful tool in our approach. The last basic assumption that we have made is that we analyze
metaphor at the level of discourse units, defined in the tradition of Mann and Thompson as all
semi-independent clauses or utterances with their own illocutionary force and reference to some state of affairs.
Given these assumptions, a first sketch of a metaphor identification procedure has been provided in my "Five
Steps," which will now be illustrated in the next section.

3. The Five-Step Procedure for Metaphor Identification

When linguists identify metaphor in discourse, they arguably make a number of steps, from linguistic metaphor
to conceptual metaphor. I have labeled these steps as follows:

Identification of

1. metaphorical focus

2. metaphorical idea

3. metaphorical comparison

4. metaphorical analogy

5. metaphorical mapping

This series of steps is a logical reconstruction of what presumably happens when researchers assert that a word
or some words are used metaphorically. In actual practice, most researchers have made the jump from step
one to step five without any explication of the intermediate steps. However, it should usually be possible to
reconstruct what happens in between, and these reconstructions may serve as a check on the quality of the
mapping postulated for the verbal expression. (For the relation with the psychology of metaphor recognition,
see "Five Steps.")

Let us illustrate and explain what happens to the Tennyson line discussed above, in order to give a general
impression of the procedure. There are still many details to be resolved, and different criticisms have been
voiced about different aspects of the procedure by Crisp, Gibbs, Goatly in "Text-Linguistic Comments,"
Kovecses, Low and Cameron, and Semino, Heywood, and Short. However, this is not the place to go into these
issues, as we need our space to present some of the results coming out of the overall approach in the next

The procedure begins with metaphorical focus identification. This label alludes to the terminology introduced by
Max Black, where the metaphorically used word is called the focus, which stands out against the background of
a literal frame. Thus the word sleep in the first unit of the Tennyson line "Now sleeps the crimson petal" is the
metaphor focus, and it would be picked out as the metaphorically used word by most analysts. The ostensibly
trivial nature of this step is immediately called into question when we consider the second unit of the first line,
which does not display a metaphorically used word, as we have seen above. An approach that simply works
with a focus/frame distinction is not sufficient for a rigorous cognitive-linguistic methodology of metaphor

The need for the second step, metaphorical idea identification, is illustrated by the same token. Its nature has
been discussed above, in that both units of line 1 give rise to a proposition containing a metaphorically used
concept, which thereby leads to the identification by the analyst of two metaphorical ideas, that a crimson petal
sleeps and a white petal sleeps. If propositions may be regarded as one representational form of human
thought, then these metaphorical propositions are one representational form of metaphorical thought.

However, such a metaphorical thought, by its very nature, presents a kind of semantic puzzle. In cognitive
semantics, it needs to be resolved by transforming it into a mapping. The reason why this has to be done has
not received much attention. However, what is actually at stake when people use linguistic metaphor is the
referential coherence of the stretch of discourse: a solution has to be found to the presence of the indirect form
of reference that is the very nature of a metaphorically used concept. A mapping is one solution to that
problem, in that it provides a fully-fledged referential baseline (the target domain) which is motivated and
understood by elaborating a source domain that closely corresponds with it.

Since mappings are sets of connections between two domains, and since metaphorical mappings additionally
involve comparison, the first condition for setting up a mapping is to separate out the elements of the two
domains in an open comparison (Miller). The open comparison in this case involves the assumption of a
construed similarity between some activity of the petals in the target domain and the sleeping of some entity in
the source domain. A formal notation of this comparison, first proposed by Miller, looks like this:

(∃F) (∃y) {SIM[F (PETAL), SLEEP (y)]}

This formula should read as follows: there is some activity F and some entity y for which it may be asserted
that there is a similarity between petals doing F and y's sleeping. As can be seen, an open comparison contains
two incomplete propositions that are asserted to exhibit a relation of similarity if their open slots can be filled.

The filling of the slots is the task for step four. By filling the slots, we move away from an open and
indeterminate comparison statement to a completed and determinate nonliteral analogy. It consists of two
references to two states of affairs by means of two complete propositions, which suggest that their elements
fulfil analogous functions in the two similar domains. In our example, the analogy might look like this:


The nature of the concepts to be inserted into the analogy is problematic. Semantically speaking, sleeping may
be done by animals apart from people, and the choice for personification is not self-evident. Indeed, the old
approach to metaphor by means of selection restrictions might actually have preferred the widest possible
scope for the argument coming with the verb sleep, that is, [+animal]. However, we typically think of sleeping
as a human activity, and that may be one cognitively motivated reason why the metaphor should be
constructed in this way.

This part of the analysis is called vehicle interpretation by Reinhart. Its problematic aspects have been
discussed at length by Semino, Heywood, and Short, also in connection with the difficulties exhibited by the
other side of step four, tenor interpretation. Thus, in the present instance, the choice of "inactive" instead of
"quiet" or "still" or "hang down" begs just as many questions. However, these caveats are not intended to
undermine the potential of the procedure. On the contrary, what they point out is that the validity of one
solution may have to be compared with the validity of another, and that these alternatives have different effects
on the further fleshing out of the analogy into a full-blown mapping (step five).
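The move from the open comparison of step three to the completed analogy of step four can be sketched as slot filling. The following is a minimal illustration under stated assumptions, not the project's own formalism: the class `Comparison` and its method names are hypothetical, and the concept labels `BE-INACTIVE`, `BE-QUIET`, `BE-STILL`, and `PERSON` are placeholders chosen to echo the tenor and vehicle options discussed above.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Comparison:
    """SIM[target_predicate(target_argument), source_predicate(source_argument)].

    A slot holding None is still open, as in the open comparison of step three.
    """
    target_predicate: Optional[str]
    target_argument: Optional[str]
    source_predicate: Optional[str]
    source_argument: Optional[str]

    def is_open(self) -> bool:
        return None in (self.target_predicate, self.target_argument,
                        self.source_predicate, self.source_argument)

    def fill(self, **slots: str) -> "Comparison":
        """Return a copy with the named open slots filled (step four)."""
        for name in slots:
            if getattr(self, name) is not None:
                raise ValueError(f"slot {name!r} is already filled")
        return replace(self, **slots)

    def render(self) -> str:
        # Open slots are printed as variables, filled slots as concepts.
        tp = self.target_predicate or "F"
        ta = self.target_argument or "x"
        sp = self.source_predicate or "G"
        sa = self.source_argument or "y"
        return f"SIM[{tp}({ta}), {sp}({sa})]"


# Step three: the open comparison for "Now sleeps the crimson petal".
open_comparison = Comparison(None, "PETAL", "SLEEP", None)

# Step four: alternative tenor interpretations yield alternative analogies
# (hypothetical labels; the vehicle is interpreted as a person throughout).
candidate_analogies = [
    open_comparison.fill(target_predicate=tenor, source_argument="PERSON")
    for tenor in ("BE-INACTIVE", "BE-QUIET", "BE-STILL")
]
```

Keeping the alternatives side by side in this way makes explicit that each analogy is one interpretative choice among several, whose consequences for the final mapping can then be compared.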

Let us briefly consider this fifth step of identifying the metaphorical mapping. Adopting the format of Lakoff and
Johnson, we can come up with a list of correspondences that are entailed by the analogy constructed under
step four:

Being inactive corresponds to sleep

Petals correspond to persons

The function of being inactive corresponds to the function of being asleep: resting from tiredness

The quality of being inactive corresponds to the quality of being asleep: it is typically deep and long

The spatio-temporal location of being inactive corresponds to the location of being asleep: it is typically at night
and in a bedroom

Each of these entailments and their elaborations might have to be considered as possible components of the
complete mapping, some of the entailments having greater plausibility (or "strength," as in Relevance Theory)
than others. A comparison of this list of correspondences with the lists derived from the alternative analogies
suggested above might be helpful in explaining different interpretations of the metaphor by different readers.

The five-step procedure is a logical reconstruction of what has to happen if analysts want to get from linguistic
to conceptual metaphors. There are problems with the procedure, but these are problems that have to do with
the nature of metaphor analysis more than with the nature of the procedure itself. What the procedure helps to
do is increase the awareness of analysts of those moments when they make interpretative steps. It may thus
be useful in revealing differences between analysts, which may then be scrutinized for unwarranted or less
warranted assumptions, so that disagreement may be decreased.

4. Two Reliability Studies

4.1. General Findings

The assumptions sketched in section 2 and the procedure summarized in section 3 were the background to two
reliability studies carried out by some of the members of the Pragglejaz group. Both studies were aimed at
exploring the difficulties in achieving sufficient inter-analyst agreement in performing metaphor identification
according to the general approach outlined above. The more specific instruction for both studies was to mark
every content word that was metaphorically used according to the definition given above. In particular, a
content word is used metaphorically if it can give rise to a proposition, comparison statement, analogy and
finally mapping that is deemed to involve two distinct domains of knowledge. The restriction to content words
was a deliberate research strategy, aimed at beginning with clear cases before turning to more complicated
forms of linguistic metaphor. Moreover, the concentration on words rather than concepts involves another
intentional limitation: we have tested our approach at the level of the first step of the five-step procedure only,
for the same reason.

The first study was carried out by four members of the group, who looked at five nineteenth-century English
poems, including the Tennyson poem. A more detailed report of this study may be found in my "Reliable
Procedure." About half of the words of the poems were content words (N=410). Each of these words eventually
received a score ranging between 0 and 4, indicating the number of times it had been marked as
metaphorically used by the four analysts. The statistical measure Cochran's Q was computed to establish whether
there were reliable differences between the analysts, which of course is an undesirable outcome of the study.
However, this did indeed turn out to be the case.
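For readers unfamiliar with the statistic, the following sketch shows how Cochran's Q can be computed from a table of binary judgements. It uses the standard textbook formula and is not the software used in the study; the function names and the toy data are hypothetical. Each row is one content word and each column one analyst, with 1 meaning "marked as metaphorically used."

```python
from typing import List, Sequence, Tuple


def word_scores(marks: Sequence[Sequence[int]]) -> List[int]:
    """Number of analysts who marked each word (0 to k)."""
    return [sum(row) for row in marks]


def cochrans_q(marks: Sequence[Sequence[int]]) -> Tuple[float, int]:
    """Return Cochran's Q and its degrees of freedom (k - 1).

    marks[i][j] is 1 if analyst j marked word i as metaphorical, else 0.
    Q is referred to a chi-square distribution with k - 1 degrees of freedom.
    """
    k = len(marks[0])
    if any(len(row) != k for row in marks):
        raise ValueError("every word needs one judgement per analyst")
    column_totals = [sum(row[j] for row in marks) for j in range(k)]
    row_totals = word_scores(marks)
    grand_total = sum(row_totals)
    denominator = k * grand_total - sum(r * r for r in row_totals)
    if denominator == 0:
        raise ValueError("Q is undefined when all words are judged unanimously")
    numerator = (k - 1) * (k * sum(c * c for c in column_totals)
                           - grand_total ** 2)
    return numerator / denominator, k - 1


# Toy data: six words judged by four analysts (invented for illustration).
toy_marks = [
    [1, 1, 1, 1],
    [1, 1, 0, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 1],
]
q, df = cochrans_q(toy_marks)
```

A large Q relative to the chi-square critical value for k - 1 degrees of freedom indicates that the analysts differ reliably in how often they mark words as metaphorical. Words judged unanimously contribute nothing to the denominator, so the statistic is driven entirely by the contested cases.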

Therefore, a controlled round of discussion was held between the four analysts to see whether the degree of
disagreement could be reduced. This discussion led to the explication of guidelines and the correction of
obvious mistakes, producing a new set of metaphor identification data "after discussion." As a result of the
discussion, a total of 62 scores were changed, out of a potential of 4x410=1640 scores. This means that only
4% of the total number of scores were changed because of the discussion. There were 8 changes from
[+metaphorical] to [-metaphorical], and 54 changes in the opposite direction. The average total number of
words marked by the four analysts as metaphorically used, out of the total number of 410 content words, rose
from 139 to 150.
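
The arithmetic behind these figures can be checked directly. The sketch below uses only the numbers reported in this paragraph; the variable names are illustrative, and the reading that the reported averages of 139 and 150 are rounded values is an assumption.

```python
analysts = 4
content_words = 410
total_scores = analysts * content_words                # 1640 individual judgements

changed = 62
to_nonmetaphorical = 8                                 # [+metaphorical] -> [-metaphorical]
to_metaphorical = 54                                   # [-metaphorical] -> [+metaphorical]
assert to_nonmetaphorical + to_metaphorical == changed

share_changed = changed / total_scores                 # about 0.038, reported as 4%
net_new_marks = to_metaphorical - to_nonmetaphorical   # 46 additional marks overall
mean_rise_per_analyst = net_new_marks / analysts       # 11.5 words per analyst

print(f"{share_changed:.1%} of all scores changed")
print(f"mean rise per analyst: {mean_rise_per_analyst}")
```

A net gain of 46 marks spread over four analysts amounts to 11.5 words per analyst, which agrees with the reported rise from 139 to 150 if those averages were rounded.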

New statistics were computed after discussion, and they turned out to be much better. When all content words
were included in the test, there were no reliable differences between the analysts anymore. When the sample
was divided according to word class, there was reliable agreement for the nouns, adverbs, and verbs, in that
order. It should be observed here that there were twice as many nouns as verbs in the sample. Moreover,
there was a higher proportion of metaphorically used verbs than nouns. The difference between word classes in
metaphor identification is one of the most important issues on the agenda for future linguistic research.

After the first round had been brought to a modestly successful conclusion, we set up a second round in which
we wished to test our method on different materials. We also wanted to find out whether our experience with
the first round led to an improvement of the reliability statistics before and/or after discussion. To this end, the
same set of analysts plus one new member looked at three new pieces of discourse: one newspaper article, one
piece of elicited dialogue, and one piece of classroom interaction. The total number of content words in these
three stretches of discourse was more than double the number of the first round: N=901. The same
instructions and procedures were followed as in the first round.

The results of this round were as follows. The new member had an average performance, ranking right in the
middle of the other four analysts. The analysis by all five analysts of the classroom interaction was reliable
before discussion. The analysis of the elicited dialogue was reliable after discussion. And the analysis of the
newspaper article was reliable for four of the five analysts after discussion, the analyst causing difficulties not
being the new analyst. The order of the reliability scores for the different word classes was different from the
first round, possibly because of their different distribution. This time, adjectives performed best, followed at
a considerable distance by nouns and verbs, which were close to each other.

There are many details about these studies that deserve further treatment, but they have to wait until another
occasion. Our general conclusion is that what is in effect the first step of the method, focus identification, works
across a range of discourse classes; its applicability is therefore not limited to poetic language. This means that
the method may be used for comparative studies of the use of metaphor across genres. Another conclusion is
that it is possible to make explicit many instructions regarding metaphor identification, facilitating a high degree
of agreement between analysts who perform their work independently of each other. And finally, the method is
sensitive to metaphors expressed in different word classes, and this may be an empirical finding which is of
some import for further theoretical and empirical work on metaphor as such.

4.2. Practical Details, Illustrated with Reference to Tennyson's "Now Sleeps the Crimson Petal"

We shall now have a look at some of the details of our work by examining a single text, the Tennyson poem
from which the earlier examples of linguistic metaphor in this article were taken. The text of the complete poem
is given below, and is reproduced from Helen Gardner's New Oxford Book of English Verse.

Now Sleeps the Crimson Petal

Now(0) sleeps(4) the crimson(0) petal(0), now(0) the white(0);
Nor waves(3,4) the cypress(0) in the palace(0) walk(0);
Nor winks(4) the gold(3) fin(0) in the porphyry(0) font(0);
The fire-fly(0) wakens(0): waken(1) thou with me.

Now(0) droops(1) the milkwhite(2) peacock(0) like a ghost(4),
And like a ghost(3,4) she glimmers(4) on to me.

Now(0) lies(2) the Earth(0) all Danae to the stars(0),
And all thy heart(1,2) lies(3,4) open(4) unto me.

Now(0) slides(4) the silent(0) meteor(0) on, and leaves(1)
A shining(0) furrow(4), as thy thoughts(1) in me.

Now(0) folds(2) the lily(0) all her sweetness(3,4) up,
And slips(2,4) into the bosom(4) of the lake(0):
So fold(4) thyself, my dearest(0), thou, and slip(3,4)
Into my bosom(1,0) and be lost(4) in me.

Each content word carries its metaphor identification score, given as either a single figure or a pair of figures. The latter
indicates a change in the score between before and after discussion. To illustrate, "waves" in line 2 had three
marks before discussion, but then the analyst who had not independently marked it for metaphorical usage
changed the score after the discussion. This lifted the total number of marks to four after discussion. The
discussion site contains the following comments by the analysts involved in the change:

A: The question is whether "waving" prototypically takes an animate subject or not.

B: I did not originally include "waves," but now I would since it could be construed metaphorically (waving as
greeting), even if it could also be taken nonmetaphorically (waving as simple motion).

By comparison, the fact that "sleeps" from line 1 only has one figure indicates that all analysts independently
identified it as metaphorical before discussion, and that nobody changed the identification during the discussion.

A few details need to be pointed out, in order to prevent misunderstandings. First of all, the title has been
excluded from our analysis, for the reason that it may contain block language, which is a special case.
Secondly, the present analysis has also excluded proper names as special cases, such as "Danae," but it is clear
that these can be used in expressions bordering on the metaphorical (cf. "do a Nixon on me"). And thirdly, the
exclusion of the adverbial particles in the phrasal verbs "slide on" and "folds up" emphasizes the grammatical
particle aspect of these words. Since it was not clear whether all analysts had followed the same rule for this
word class, adverbial particles were left aside in the statistical analysis. In future analyses, their adverbial
aspect will be emphasized and they will then be included in our samples, as will the other two excluded categories.

Another point that is worth making bears on the value of the scores and comments themselves. Our discussion
of the scores was not aimed at resolving all disagreements by relating them to their fundamental principles.
This would have taken much more time than we had available. Instead, we used a rather pragmatic approach
to our discussion of the original identification scores. We all took one more round of going over the data and
simply focused on all cases where we had got either a single mark or three marks. Our task was to see whether
they could be changed to zeros or fours, in the following manner.

If a word has three marks, this may suggest that one of the analysts has simply made a mistake. Upon closer
consideration, the analyst in question may then be persuaded to change his or her score. As can be noticed
from the poem, there are no fewer than five cases which may be seen as such: "waves" (l. 2), "ghost" (l. 6),
"lies" (l. 8), "sweetness" (l. 11), "slip" (l. 13). This number rises to six cases if we include the changes in "slips"
in line 12. In itself it is an interesting result that highly trained analysts may still miss what seem to be
relatively clear cases. It is just as interesting a finding that they may also be quickly convinced of their error
once it is pointed out to them by others. And finally, it is even more interesting to conclude that the reparation
of these (relatively few) mistakes has caused the reliability of the overall metaphor identification score to
become sufficient (but this is also due to the small size of the sample).

If a word has only one mark out of four, this may be suggestive of two situations: either an analyst may have
made an error (again), or an analyst may have been unusually sensitive to a metaphorical usage that has
escaped the notice of the other analysts. A clear case of the first situation may be found in line 14, where
"bosom" had originally been marked as metaphorical by one of the analysts. It was retracted after discussion,
when it was pointed out that "bosom" is more likely to be based on metonymy than on metaphor. It is hard to
find a ground for a mapping that is based in similarity, as is required by our overall approach. Step four (and
following) in the five-step procedure would be highly problematic for such a view of "bosom."

Other words that had a single mark before discussion were "droops" in line 5, "heart" in line 8, "leaves" in line
9, and "thoughts" in line 10. These are less easy to classify as error. The first two of these four words were
discussed, resulting in a change for "heart" but no change for "droops." The other two words were not
discussed. In an ideal world, all of these cases should receive further reflection and a final decision. The fact
that they have been left as they are may look odd when the Tennyson poem is looked at in isolation. However,
the fact that the other changes to the scores led to sufficient improvement of the reliability analysis was the
most important aspect of our study, which was why we did not continue to solve all of the remaining problems.
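
The selection rule used in this round of discussion, which revisits only words with one or three marks, can be illustrated with a short sketch. It assumes the `word(score)` and `word(before,after)` notation of the annotated poem; the names `Scored`, `parse_scores`, and `discussion_candidates` are invented for the illustration and do not describe software used by the group.

```python
import re
from typing import List, NamedTuple


class Scored(NamedTuple):
    word: str
    before: int
    after: int      # equals `before` when only one figure is given


# Matches word(3) or word(3,4); hyphenated words such as fire-fly are allowed.
TOKEN = re.compile(r"([A-Za-z][A-Za-z-]*)\s*\((\d)(?:,(\d))?\)")


def parse_scores(line: str) -> List[Scored]:
    """Extract every annotated content word from one line of the poem."""
    result = []
    for word, first, second in TOKEN.findall(line):
        before = int(first)
        after = int(second) if second else before
        result.append(Scored(word, before, after))
    return result


def discussion_candidates(tokens: List[Scored]) -> List[Scored]:
    """Words with exactly one or three marks before discussion."""
    return [t for t in tokens if t.before in (1, 3)]


excerpt = [
    "Nor waves(3,4) the cypress(0) in the palace(0) walk(0);",
    "Now(0) droops(1) the milkwhite(2) peacock(0) like a ghost(4),",
    "Into my bosom(1,0) and be lost(4) in me.",
]
for line in excerpt:
    for token in discussion_candidates(parse_scores(line)):
        status = "changed" if token.after != token.before else "unchanged"
        print(f"{token.word}: {token.before} -> {token.after} ({status})")
```

On these three lines the sketch singles out "waves," "droops," and "bosom," and reports that only "droops" kept its original score.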

This leads to a final comment about the practical aspect of our approach. Our primary goal in these studies has
been to develop a measuring instrument. This means that we have left details of content aside. If the
instrument turns out to be useful, one may follow two paths. One may use it to mark up texts until complete
agreement has been reached regarding all cases. But this does not seem very realistic. Or one may apply the
instrument in order to give all words in the texts a metaphor identification score. The more analysts are
involved in such a project, the subtler the distinctions between well-identifiable and less-identifiable cases
might become, because a range of metaphor identification scores may suggest a cline of metaphor
identifiability. These scores might be used as additional data in the more encompassing study that one wishes
to perform on the basis of the metaphor identification data.

5. Application and Prospects

Having cleared the ground for a better understanding of the kinds of data we are dealing with, it is now finally
time to address some of the issues of content. I have only space to deal with a couple of observations here.
This is especially frustrating in the light of the list of issues that we are collecting on the basis of our
discussions. The list features such diverse issues as metaphorical use of auxiliaries, simile, extended metaphor,
and personification. We are planning to publish on these issues as soon as they have received more concrete

However, a few comments may be made anyway. The first comment concerns the large number of zeros in the
poem. According to our current analysis, the poem has fifty-two content words, twenty-seven of which have
received zero hits as metaphorically used. The other words either have one, two, three, or four marks. This
means that there is no doubt in the minds of the analysts that half of the content words of the poem are not
used metaphorically, the bulk of them being literal. Another way of saying this is that there is a clearly literal
setting against which the metaphorical usage of other words may be detected. This is representative of our
other results in other types of discourse.
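
These counts can be reproduced from the annotated poem in section 4.2. The sketch below is only an illustration: it assumes the `word(score)` notation used there, takes the last figure of each annotation as the score after discussion, and uses invented variable names.

```python
import re
from collections import Counter

ANNOTATED_POEM = """
Now(0) sleeps(4) the crimson(0) petal(0), now(0) the white(0);
Nor waves(3,4) the cypress(0) in the palace(0) walk(0);
Nor winks(4) the gold(3) fin(0) in the porphyry(0) font(0);
The fire-fly(0) wakens(0): waken(1) thou with me.
Now(0) droops(1) the milkwhite(2) peacock(0) like a ghost(4),
And like a ghost(3,4) she glimmers(4) on to me.
Now(0) lies(2) the Earth(0) all Danae to the stars(0),
And all thy heart(1,2) lies(3,4) open(4) unto me.
Now(0) slides(4) the silent(0) meteor(0) on, and leaves(1)
A shining(0) furrow(4), as thy thoughts(1) in me.
Now(0) folds(2) the lily(0) all her sweetness(3,4) up,
And slips(2,4) into the bosom(4) of the lake(0):
So fold(4) thyself, my dearest(0), thou, and slip(3,4)
Into my bosom(1,0) and be lost(4) in me.
"""

# Matches word(3) or word(3,4); hyphenated words such as fire-fly are allowed.
TOKEN = re.compile(r"([A-Za-z][A-Za-z-]*)\s*\((\d)(?:,(\d))?\)")

# Keep the last figure of each annotation, i.e. the score after discussion.
final_scores = [
    int(second or first) for _, first, second in TOKEN.findall(ANNOTATED_POEM)
]
distribution = Counter(final_scores)

print(len(final_scores), "content words")                 # 52 content words
print(distribution[0], "content words with zero marks")   # 27 content words with zero marks
```

The figure of twenty-seven zeros is obtained with the scores after discussion; before discussion "bosom" in line 14 still carried one mark.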

The literal background of the poem may seem a trivial issue, but it is noteworthy in the context of the many
claims that all of language and all of literature are fundamentally metaphorical. This type of sweeping
statement may be rejected on the basis of a careful and systematic analysis that is grounded in a general
theory of language and language use. Indeed, I believe that most language use is fundamentally literal, and
that there are patches of metaphor that can only be understood with reference to that literal basis. The role of
metaphor in language and literature may actually have been considerably overestimated during the last few
decades. One of the benefits of developing a reliable measuring instrument is that it may be used to refine our
insights on the true status of metaphor in language and thought.

The second substantial point I wish to make has to do with the verbal patterns of metaphorical expression. We
have coded linguistic metaphor at the level of words, but this does not mean that we see metaphor only as a
matter of single words. Sometimes it is, as in line 1, which was discussed above. To be more precise, in that
case there is one word that is used metaphorically, and it gives rise to a metaphorical utterance with one
metaphorical element. However, there are also patterns of metaphorical expression that encompass more
words, and we shall now examine some of these.

Consider, for example, "lies open" in line 8. It is clear that these two words cannot literally be predicated of the
intended (metonymic) meaning of "heart." They are two metaphorically used words activating two
metaphorically used concepts which will be two independent parts of the same source domain of the eventual
mapping. This may also be discerned from the open predication that results from step three of the five-step
procedure:

(** F,x) (** y) (SIM[F (HEART, x), LIE (y, OPEN)])

The words "lies" and "open" activate concepts that jointly construct a domain of physical state. We say that
such discourse units contain a multiple metaphorical expression, as opposed to the singular metaphorical
expression in the two units in line 1 (Crisp, Heywood, and Steen). (I have tacitly added an extra argument in
the form of an x here, and will do the same in the form of a y' in the next example. They should be read as
operators indicating additional elements in the propositions that are being compared.)

One might think that a similar situation would hold for line 11, if it is granted that both "fold up" and
"sweetness" are metaphorical. However, this is precisely where our word-by-word approach shows that the two
lines are not completely comparable. My step three analysis would look like this:

(** F) (** y, y') (SIM[F (LILY, SWEETNESS), FOLD-UP(y, y')])

In other words, the nonliteral open comparison that underlies line 11 concerns an activity which relates the
concepts of "lily" and "sweetness," and two referents related by the activity concept of "folding up." In its turn,
"sweetness" is a metaphorically used word, but it should be realized that it is not metaphorical because it would
belong to the same source domain as the activity of folding up. Instead, it is metaphorical because it involves a
(highly conventional) cross-domain mapping between taste and smell. In this context, it is therefore more
closely related to the domain of the lily than to the neutral, physical activity of folding something up. What is
required for the metaphorical nature of "sweetness" is a separate analysis, which is embedded in the analysis of
the complete line. In fact, "sweetness" is an implicit metaphor (Steen "Analyzing Metaphor"), but its detailed
analysis would lead us too far away from our present concerns. Let me therefore simply make the general point
that the analysis of linguistic metaphor at word level does not preempt a consideration of metaphor at higher
levels of linguistic organization. Indeed, the study of such higher levels becomes more precise and richer if it
can be based in an analysis at word level.
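
For readers who prefer typeset notation, the two step-three analyses for lines 8 and 11 might be rendered as follows. This is a tentative reconstruction: it assumes that the operator printed as "**" in this text is the existential quantifier familiar from comparison-statement notation, which the present copy does not confirm, and the curly brackets are a typesetting choice rather than part of the original.

```latex
% Assumption: the unreadable operator "**" stands for the existential quantifier.
% Requires the amsmath package for align* and \text.
\begin{align*}
\text{line 8:}  &\quad (\exists F, x)\,(\exists y)\;
  \{\mathrm{SIM}[F(\mathrm{HEART}, x),\ \mathrm{LIE}(y, \mathrm{OPEN})]\} \\
\text{line 11:} &\quad (\exists F)\,(\exists y, y')\;
  \{\mathrm{SIM}[F(\mathrm{LILY}, \mathrm{SWEETNESS}),\ \mathrm{FOLD\text{-}UP}(y, y')]\}
\end{align*}
```

On this reading, line 8 leaves both the literal property F and its second argument x open, whereas line 11 supplies both arguments of F and leaves open only the two referents of the folding activity.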

Even though it is possible to have such multiple and other complex linguistic structures when it comes to the
verbal expression of metaphor, it is striking that the present poem does not make much use of them. Our two
examples are the only clear cases, with the exception of "leaves a shining furrow" in lines 9-10 (which has an
inexplicably low identification score for "leaves"). Apart from these cases, however, the bulk of the metaphors in
this poem consists of verbs used as finite predicates, as may be observed from the following list: "sleeps" (1),
deleted "sleeps" (1), "waves" (2), "winks" (3), "droops" (but questionable, in line 5), "glimmers" (6), "lies" (7),
"lies" (8), "slides on" (9), "leaves" (9), "folds up" (11), "slips" (12), "fold" (13), "slip" (13), "lost" (14). In fact,
there is hardly a predication in the poem that does not have a metaphorical predicate.

The variation regarding linguistic metaphor in this poem derives, then, from the addition of other metaphorical
elements to the predominantly metaphorical verbs, such as the adjective "open" in line 8, or the adverbial
adjunct "like a ghost" in lines 5 and 6. Moreover, there is further variation on account of the words that have
word-internal metaphoricity, as in "milk white" or "gold fin." (I am leaving aside problems of orthography and
conventionalized compounds here.) But these are not the most interesting patterns of linguistic metaphor in the
poem, as may now be finally shown by a more stylistically and rhetorically oriented analysis (cf. my "Analyzing
Metaphor" and "Rhetoric of Metaphor").

There is even more formal variation in the metaphorical alignments of complete discourse units that also
happen to have metaphorical structure inside them. This is a pattern that becomes increasingly prominent as
the poem progresses. And the comparison is always between an element of the natural world that functions as
a setting to the poem on the one hand and some aspect of the beloved on the other, in that order. The poem
may be viewed as a progressively explicit and increasingly complex series of analogies.

In the first stanza, there is only a literal comparison between units, and within one line at that, in line 4: "The
fire-fly wakens: waken thou with me." The next couplet contains a metaphorical comparison between the
drooping of a peacock in line 5 and the glimmering of the female in line 6, but its metaphorical nature is leveled
because both lines contain the identical expression "like a ghost." Lines 7 and 8 contain another metaphorical
comparison between two units, which this time, however, is complicated by the novel expression "lies all
Danae." The first complete and typical metaphorical comparison between two units, therefore, is only to be
found in the third couplet, where we have left the earth of the previous couplet and are looking at the sky. But
here, too, there is a formal complication, in that the second member of the analogy is expressed by deletion:
"Now slides the silent meteor on, and leaves / a shining furrow, as thy thoughts [slide on and leave a shining
furrow] in me." This is a case where the propositional reconstruction at step 2 is more hazardous than the one
for the elliptical metaphor in line 1, discussed at the beginning of this paper. The formal climax of the poem,
therefore, lies in the last stanza, which takes two lines to express the natural point of reference and then needs
two more lines to suggest the personal point of reference that is to be compared with it.

These are not gratuitous formal observations that merely serve to make distinctions between classes of
metaphor. They may have a bearing on the interpretation of the poem as well. And they certainly have
consequences for the reading of the poem. They have definite cognitive potential, as may be briefly illustrated
by imagining the nature of the reading process for some of these lines. What I am about to suggest here is
rather speculative, but it may help to model how meaning, including metaphorical meaning, gets incrementally

Consider how the reader has to build a mental representation of the first utterance, in line 1. This involves
constructing a situation in which a crimson petal does something like sleeping. How the activity of the petal is
actually imagined is a matter of the individual reader. However, at the end of this first unit, a situation model of
a "sleeping" crimson petal has been constructed, and the reader moves on to the next utterance, in which it
turns out that a similar activity is performed by a white petal. The construction of the mental representation for
this line should not be very different from the one of the previous utterance. The result is an update of the
situation model, where two different petals are now incorporated in the same scene.

Now consider what happens in lines 5 and 6, repeated here for the sake of convenience:

Now droops the milkwhite peacock like a ghost,

And like a ghost she glimmers on to me.

In line 5, the reader has to construct a mental image of a milkwhite peacock that is drooping, and this image
has to be shaped on the model of a ghost that is drooping.

However the reader pulls off this metaphorical trick, there is a sense of closure at the end of this line, and the
reader should have integrated the peacock within the situation model that has developed since line 1. Then the
reader has to move on to the next line, in which the poet shifts our attention to a woman, presumably the same
person as the "thou" in line 4. Here, too, the reader has to construct a basically metaphorical image, of a
woman glimmering like a ghost. This has to be stored in another part of the model for the text, which records
the information about the second theme, the woman or beloved.

However, the crucial thing about the reading process is this. Even though each of the two units in line 1 as well
as the ones in lines 5 and 6 require unit-internal metaphorical mappings, the difference between them should
also be conspicuous. For in line 1, there is no further metaphorical mapping between the two units, whereas in
lines 5 and 6, there is. The reading process of line 6 has to go on beyond an understanding of the line itself,
and perform a mapping between the glimmering like a ghost by the she on the one hand and the drooping like
a ghost by the milkwhite peacock on the other hand. The relational coherence between lines 5 and 6 is one of
matching (cf. Hoey). This process becomes even more elongated and complex in the last stanza of the poem.

These are only some of the substantial issues that arise from our analysis of the poem. As may be seen,
metaphor identification may be used for further work on text and reading analysis. It is not an automatic and
dry application of some instrument, but it has a function in doing research on texts and their understanding.
The most important result that should come out of this work is a firmer grasp of our linguistic materials in
performing such studies. I hope that the present section has gone some way in suggesting the potential of such
an approach.


I wish to thank Alan Cienki, Zoltan Kovecses, and Graham Low for their comments and suggestions on an
earlier version.

Works Cited

Barnden, John, Mark Lee, and Katja Markert, eds. Proceedings of the Workshop on Corpus-Based and
Processing Approaches to Figurative Language--Held in Conjunction with Corpus Linguistics 2001. Lancaster:
University Centre for Computer Corpus Research on Language, 2001.

Black, Max. "More about Metaphor." Metaphor and Thought, Second Edition. Ed. Andrew Ortony. Cambridge:
Cambridge UP, 1993. 19-41.

Bovair, Susan, and David Kieras. "A Guide to Propositional Analysis for Research on Technical Prose."
Understanding Expository Text. Eds. Bruce Britton and John Black. Hillsdale, NJ: Erlbaum, 1985. 315-62.

Cameron, Lynne. "Metaphorical Use of Language in Educational Discourse: A Theoretical and Empirical
Investigation." Diss. U of London, 1997.

Cameron, Lynne, and Graham Low, eds. Researching and Applying Metaphor. Cambridge: Cambridge UP, 1999.

Crisp, Peter. "Metaphorical Propositions: A Rationale." Language and Literature 11 (2002): 7-16.

Crisp, Peter, John Heywood, and Gerard Steen. "Identification and Analysis, Classification and Quantification."
Language and Literature 11 (2002): 55-69.

Davidson, Donald. "What Metaphors Mean." Critical Inquiry 5 (1978): 31-47.

Deignan, Alice. Metaphor. London: Harper Collins, 1996.

Gardner, Helen, ed. The New Oxford Book of English Verse. Oxford: Oxford UP, 1972.

Gibbs, Raymond W., Jr. "Stalking Metaphor in the Wild: Psycholinguistic Comments on the Metaphor
Identification Project." Language and Literature 11 (2002): 78-84.

Gibbs, Raymond W., Jr., and Gerard J. Steen, eds. Metaphor in Cognitive Linguistics. Amsterdam: John
Benjamins, 1999.

Goatly, Andrew. The Language of Metaphors. London: Routledge, 1997.

_____. "Text-Linguistic Comments on the Metaphor Identification Project." Language and Literature 11 (2002):

Heywood, John, Elena Semino, and Mick Short. "Linguistic Metaphor Identification in Two Extracts from Novels."
Language and Literature 11 (2002): 35-54.

Hoey, Michael. Textual Interaction: An Introduction to Written Discourse Analysis. London: Routledge, 2001.

Kintsch, Walter. Comprehension. A Paradigm for Cognition. Cambridge: Cambridge UP, 1998.

Kovecses, Zoltan. "Cognitive-Linguistic Comments on the Metaphor Identification Project." Language and
Literature 11 (2002): 74-78.

Lakoff, George. "The Contemporary Theory of Metaphor." Metaphor and Thought, Second Edition. Ed. Andrew
Ortony. Cambridge: Cambridge UP, 1993. 202-51.

_____. "The Meanings of Literal." Metaphor and Symbolic Activity 1:4 (1986): 291-96.

Lakoff, George, and Mark Johnson. Metaphors We Live By. Chicago: Chicago UP, 1980.

Low, Graham, and Lynne Cameron. "Applied-Linguistic Comments on the Metaphor Identification Project."
Language and Literature 11 (2002): 84-90.

Mann, William, and Sandra Thompson. "Rhetorical Structure Theory: Toward a Functional Theory of Text
Organization." Text 8:3 (1988): 243-81.

Miller, George. "Images and Models, Similes and Metaphors." Metaphor and Thought, Second Edition. Ed.
Andrew Ortony. Cambridge: Cambridge UP, 1993. 357-400.

Murphy, Greg. "On Metaphoric Representations." Cognition 60 (1996): 173-204.

Reinhart, Tanya. "On Understanding Poetic Metaphor." Poetics 5 (1976): 383-402.

Ricoeur, Paul. The Rule of Metaphor. London: Routledge & Kegan Paul, 1978.

Searle, John. "Metaphor." Metaphor and Thought, Second Edition. Ed. Andrew Ortony. Cambridge: Cambridge
UP, 1993. 83-111.

Semino, Elena, and Gerard Steen. "A Method for Annotating Metaphors in Corpora." Proceedings of the
Workshop on Corpus-Based and Processing Approaches to Figurative Language--Held in Conjunction with
Corpus Linguistics 2001. Ed. John Barnden, Mark Lee, and Katja Markert. Lancaster: U Centre for Computer
Corpus Research on Language, 2001. 59-66.

Semino, Elena, John Heywood, and Mick Short. "Methodological Problems in the Analysis of Metaphors in a
Corpus of Conversations about Cancer." Unpublished essay, 2002.

Steen, Gerard J. "Analyzing Metaphor in Literature: With Examples from William Wordsworth's 'I Wandered
Lonely as a Cloud.'" Poetics Today 20:3 (1999): 499-522.

_____. "From Linguistic to Conceptual Metaphor in Five Steps." Metaphor in Cognitive Linguistics. Ed. Raymond
W. Gibbs Jr., and Gerard Steen. Amsterdam: John Benjamins, 1999. 57-77.

_____. "Genres of Discourse and the Definition of Literature." Discourse Processes 28:2 (1999): 109-20.

_____. "Poetics and Linguistics Again: The Role of Genre." Textual Secrets: The Message of the Medium. Ed.
Szylvia Csabi and Judith Zerkowitz. In press.

_____. "A Reliable Procedure for Metaphor Identification." Proceedings of the Workshop on Corpus-Based and
Processing Approaches to Figurative Language--Held in Conjunction with Corpus Linguistics 2001. Eds. John
Barnden, Mark Lee, and Katja Markert. Lancaster: U Centre for Computer Corpus Research on Language, 2001.

_____. "A Rhetoric of Metaphor: Conceptual and Linguistic Metaphor and the Psychology of Literature." The
Psychology and Sociology of Literature. Ed. Dick Schram and Gerard Steen. Amsterdam: John Benjamins, 2001.

_____. "Towards a Procedure for Metaphor Identification." Language and Literature 11 (2002): 17-34.

Steen, Gerard J., and Raymond W. Gibbs Jr. "Introduction." Metaphor in Cognitive Linguistics. Ed. Raymond W.
Gibbs Jr., and Gerard Steen. Amsterdam: John Benjamins, 1999. 1-8.

Gerard J. Steen is assistant professor in English at the Free University Amsterdam. He is
associate editor of Metaphor and Symbol and author of Understanding Metaphor in Literature (1994). He has
co-edited Metaphor in Cognitive Linguistics (1999, with Ray Gibbs), The Psychology and Sociology of Literature
(2001, with Dick Schram), a special issue of Language and Literature on metaphor identification (2002), and
Cognitive Poetics in Practice (in press, with Joanna Gavins).

*[Text unreadable in original source]

COPYRIGHT 2002 Northern Illinois University

COPYRIGHT 2002 Gale Group