Compare and Contrast the Speech Production Models of Pim Levelt,
Alfonso Caramazza and Don MacKay.

On the surface, there appears to be a great deal of agreement
between psychologists over the process of language production.
This is because the basic structure of language production is fairly
self-evident. All sensible theories agree that semantic, syntactic and
lexical form properties are distinct levels of representation;
semantics concerns meaning, syntax concerns structure and lexical
form concerns inflection (Caramazza, 1997). It is also widely agreed
that these processes occur sequentially, as each one builds upon
the one before it (Caramazza, 1997). Furthermore, there is
widespread agreement that language production occurs in two
distinct phases: first, selection of the required lexical
representation or lemma, as specified semantically and
syntactically; second, selection of the lexical-phonological
representation or lexeme (Caramazza, 1997; Caramazza et al.,
2004). This is because the linguistic expression of a concept requires
the association of concept to sound, lemma to lexeme.
However, there is very little agreement over the deeper
processes of language production (Caramazza, 1997). The limited
scope of this essay prevents a full analysis of the theoretical
similarities and disagreements between the models of Levelt,
Caramazza and MacKay; it therefore focuses on three major
theoretical concerns: whether meanings are
represented holistically or componentially, whether the stages of
processing are discrete or interactive, and the function of priming in
language production.
Theories of language production must be able to solve two
problems: the ‘hyperonym’ and ‘hyponym’ problems (Levelt, 1992).
Language production rarely, if ever, results in the production of
inaccurate hyperonyms or hyponyms; for example, the hyponym
‘tree’ will not be produced when the speaker means to produce the
hyperonym ‘plant’, and vice versa. Levelt (1992) argues that these
problems entail that representations of meaning cannot be
componential, as they would require a ‘principle of specificity, which
says that of all lemmas whose conditions are satisfied by the
concept-to-be-expressed the most specific one (the most entailing
one) should be selected’ (Levelt, 1992, p. 7). Whilst this principle
may be sound in theory, it is not obvious how it can be implemented
in a network model of language production (Levelt, 1992). Thus,
Levelt postulates that representations are holistic: there is an
individual node for every lexical item in the language,
the meanings of which are represented by lexical-concept nodes
and labelled connections between the concept nodes (Levelt, 1992).
However, MacKay and Caramazza both agree that meanings
are represented componentially (Caramazza, 1997; Levelt, 1992;
MacKay, 1987). Caramazza (1997) argues that there is no need
for a principle of specificity once three sensible assumptions are
made, each of which follows logically from the basic
assumptions of componentiality. Firstly, ‘the amount of activation
passed onto the next level by any one feature is a weighted
proportion of the number of selected features’ (p. 200); thus, for a
concept node to be activated, all of the component meaning nodes
of the word must be activated. Secondly, ‘the amount of activation
normally needed by the activated lexemes to reach threshold is the
full unit of activation propagated from the lexical-semantic network’
(p. 200), and finally, ‘the maximum amount of activation
contributed by a single link to a node is a direct function of the
number of links that feed into the node’ (p. 201). These final two
assumptions entail that the amount of activation one node receives
from another is weighted in proportion to the number of links
between the two nodes in the case of the activating node, and to
the number of links between the target node and all other nodes in
the case of the node being activated. This solves the hyperonym
problem and the hyponym problem respectively, as the hyperonym and
hyponym lexemes will, by definition, receive activation from only a
few of the selected nodes, not enough to reach the activation threshold.
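The way such weighting could exclude hyperonyms and hyponyms can be
illustrated with a minimal Python sketch. It is an illustration under
stated assumptions rather than Caramazza's own implementation: the
feature sets, the names `activation` and `produce`, and the reading of
a link's contribution as the smaller of 1/(number of selected features)
and 1/(number of links feeding the lexeme) are all hypothetical.

```python
from fractions import Fraction

# Hypothetical feature sets, invented purely for illustration.
LEXEMES = {
    "plant": {"living", "grows"},
    "tree": {"living", "grows", "has_trunk"},
    "oak": {"living", "grows", "has_trunk", "bears_acorns"},
}

# Assumption 2: a lexeme needs the full unit of activation.
THRESHOLD = Fraction(1)


def activation(lexeme_features, selected):
    """Activation a lexeme receives from the selected semantic features."""
    # Assumption 1 (as read here): each selected feature passes on an
    # equal share of one unit of activation.
    per_feature = Fraction(1, len(selected))
    # Assumption 3 (as read here): a single link can contribute at most
    # 1/(number of links feeding into the lexeme).
    per_link_cap = Fraction(1, len(lexeme_features))
    shared = lexeme_features & selected
    return len(shared) * min(per_feature, per_link_cap)


def produce(selected):
    """Return the lexemes that reach threshold for a set of features."""
    return [word for word, features in LEXEMES.items()
            if activation(features, selected) >= THRESHOLD]


print(produce({"living", "grows", "has_trunk"}))  # concept TREE
print(produce({"living", "grows"}))               # concept PLANT
```

Under these assumptions, only the lexeme whose features exactly match
the selected set collects the full unit of activation: a hyperonym
shares too few of the selected features, and a hyponym has too many
incoming links for each one to contribute enough.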
Unlike MacKay, both Levelt and Caramazza argue that the
stages of processing are discrete rather than interactive
(Caramazza, 1997; Levelt, 1992; MacKay, 1987; James & MacKay, 2004).
Experimental evidence for discrete systems comes from brain-
damaged subjects found to have selective difficulties when
producing lemmas of a single grammatical class through a single
modality of output (Caramazza, 1997). That these difficulties occur
in only a single modality suggests that the lexical-semantic system
still functions correctly. This, combined with the fact that the
difficulties are confined to a single grammatical class, suggests
further that the difficulties must arise only within the syntactic
level of representation (Caramazza, 1997). Thus, lexical-
semantic and syntactic information must be represented
independently, and therefore their systems must be discrete
(Caramazza, 1997). Furthermore, experimental evidence from
anomic subjects who are able to provide information about syntactic
features of words they are unable to produce suggests that the
syntactic features of a word and its form must also be represented
independently, and are thus also discrete (Caramazza, 1997).
However, MacKay (James & MacKay, 2004) argues that
experimental evidence from research into phonological and
morphological speech errors has shown that phonology has at least
some ‘retroactive’ (p. 104) effects on lexical retrieval, effects that
could not occur if the systems were discrete. This entails that the
systems must be both autonomous and interactive, which is a problem
for modular theories because, according to Fodor, modules cannot be
interactive in any way, as modularity requires ‘encapsulation of
processing’ (cited by MacKay, 1987, p. 411). However, MacKay (1987)
argues that modules can exist in an interactive system, so long as
priming is correctly distinguished from activation. Priming is
both automatic and unencapsulated within modules, whilst
activation occurs through sequencing and timing nodes
encapsulated within modules (MacKay, 1987); thus priming allows
interaction between modules whilst sequencing keeps them distinct.
There is such a wealth of experimental evidence for the effects
of priming that few would dispute its existence as a
phenomenon; however, Levelt and Caramazza disagree with MacKay on
exactly how important the process is during language production
(Caramazza, 1997; Levelt, 2001; MacKay, 1987). They argue that
node activation can be facilitated by priming, but is in no way
dependent on it (Caramazza, 1997; Levelt, 2001). In their models
the lexical selection system selects the appropriate nodes ‘under
competition’ (Levelt, 2001, p. 13464) with other syntactically
identical but semantically different nodes, without the need for
separate processes of priming and sequencing. Activation spreads
unidirectionally, for example, to the phonological nodes connected
to the lemma node only, rather than indiscriminately to all other
connected nodes; the phonological nodes of related lemmas are not
activated at all (Levelt, 2001). The creation of a lexical selection
network provides a group of relevant lemmas, and perspective
taking obtains the most relevant of the group (Levelt, 2001). The
target lemma node will most likely be the first to reach its activation
threshold and become fully activated, thus activating the
phonological nodes connected to the lemma node.
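The discrete character of this account can be sketched in a few lines
of Python. This is only a toy illustration: the `FORMS` table, the
activation values and the 'highest activation wins' rule are
hypothetical stand-ins for competitive selection, not the selection
mechanism of either model.

```python
# Hypothetical links from lemmas to placeholder phonological segments.
FORMS = {
    "cat": ["k", "a", "t"],
    "dog": ["d", "o", "g"],
}


def select_lemma(activations):
    """Pick the most highly activated lemma (a stand-in for competition)."""
    return max(activations, key=activations.get)


def phonological_activation(activations):
    """Pass activation on to the form of the selected lemma only."""
    winner = select_lemma(activations)
    # Competing lemmas may be active, but their forms receive nothing.
    return {lemma: (FORMS[lemma] if lemma == winner else [])
            for lemma in activations}


print(phonological_activation({"cat": 0.9, "dog": 0.6}))
```

The point of the sketch is the ordering: selection is completed before
any phonological node is touched, so the form of the related lemma
'dog' is never activated.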
However, MacKay’s model of language production requires
that every node involved in the utterance of a specific word
undergo separate processes of priming and activation; priming is
the sole process by which a node is prepared for activation (Burke et
al., 1991). This preliminary or subthreshold stimulation spreads from
an individual node to all other nodes connected to it, regardless of
their place in the system hierarchy (Burke et al., 1991).
Priming spreads quickly amongst the nodes regardless of their
content and thus causes the priming of many irrelevant nodes;
however, the most relevant nodes will experience the highest level
of subthreshold stimulation as they are connected to the greatest
number of other primed nodes (Burke et al., 1991). The
primed nodes are then activated by a separate system of domain-
specific sequence nodes that repeatedly multiply the levels of
priming across an entire domain (Burke et al., 1991); the
node with the highest level of priming within each domain, the most
relevant node, will be the first to reach the activation threshold and
be expressed. Thus, MacKay’s theory produces the same outcome
as that of Caramazza and Levelt, the activation of the target lemma
node, but by different means.
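The two-step process described above, indiscriminate priming followed
by multiplication within a domain, can be sketched in Python. This is a
simplified illustration, not MacKay's implementation: the network, the
function names, and the `threshold` and `gain` values are all
hypothetical, and priming spreads only one link deep.

```python
def spread_priming(links, sources, amount=1.0):
    """Prime every node connected to a source, whatever its content."""
    priming = {}
    for source in sources:
        for neighbour in links.get(source, ()):
            priming[neighbour] = priming.get(neighbour, 0.0) + amount
    return priming


def activate_domain(priming, domain, threshold=10.0, gain=2.0):
    """Mimic a sequence node: multiply priming across a whole domain
    until the most primed node crosses the activation threshold."""
    if gain <= 1.0:
        raise ValueError("gain must exceed 1 for priming to grow")
    levels = {node: priming.get(node, 0.0) for node in domain}
    if max(levels.values(), default=0.0) <= 0.0:
        return None  # nothing in this domain was primed
    while max(levels.values()) < threshold:
        levels = {node: level * gain for node, level in levels.items()}
    return max(levels, key=levels.get)


# Hypothetical network: semantic nodes linked to noun nodes.
LINKS = {
    "furry": ["cat", "dog"],
    "meows": ["cat"],
    "pet": ["cat", "dog", "goldfish"],
}

priming = spread_priming(LINKS, ["furry", "meows", "pet"])
print(priming)
print(activate_domain(priming, ["cat", "dog", "goldfish"]))
```

In this toy network the irrelevant nodes 'dog' and 'goldfish' are
primed as well, but 'cat' receives priming from the most sources and is
therefore the first to reach threshold once the whole domain is
multiplied.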
Thus it can be seen that models of language
production, although similar in their general structure, can vary a
great deal in their more detailed underlying processes.
Levelt’s theory argues that meanings are represented holistically due
to the hyperonym/hyponym problem, whilst MacKay concurs with
Caramazza’s arguments that simple assumptions
following naturally from those of componentiality itself can solve the
problem much more sensibly. Both Levelt and Caramazza argue that
the stages of processing are discrete, but MacKay points out
evidence for interaction between the stages and is able to integrate
the evidence into his theory. Unlike those of Levelt and Caramazza,
MacKay’s theory argues that the separate processes of priming and
activation are very important in language production. Indeed, it is
this distinction that divides the three theories most fully. Of the
three, it would appear
that MacKay’s theory is the strongest. Not only is his theory the only
one of the three that is consistent with all the experimental
evidence mentioned here, but it is also the only theory that allows
activation to spread from a node to all its connected nodes, and thus
does not require that each node perform selective activation of
other connected nodes, a much more intuitive proposition.

Burke, D., MacKay, D., Worthley, J. & Wade, E. (1991). ‘On the Tip of
the Tongue: What Causes Word Finding Failures in Young and Older
Adults?’ Journal of Memory and Language, 30, pp. 542-579.
Caramazza, A. (1997). ‘How many Levels of Processing are there in
Lexical Access?’ Cognitive Neuropsychology, 14(1), pp. 177-208.
Caramazza, A., Costa, A. & Miozzo, M. (2004). ‘What Determines
the Speed of Lexical Access: Homophone or Specific-Word
Frequency? A Reply to Jescheniak et al. (2003).’ Journal of
Experimental Psychology: Learning, Memory, and Cognition, 30(1),
pp. 278-282.
James, L. & MacKay, D. (2004). ‘Sequencing, Speech Production,
and Selective Effects of Aging on Phonological and Morphological
Speech Errors.’ Psychology and Aging, 19(1), pp. 93-107.
Levelt, W. (1992). ‘Accessing Words in Speech Production: Stages,
Processes and Representations.’ Cognition, 42, pp. 1-22.
Levelt, W. (2001). ‘Spoken Word Production: A Theory of Lexical
Access.’ Proceedings of the National Academy of Sciences of the
United States of America, 98(23), pp. 13464-13471.
MacKay, D. (1987). ‘Constraints on Theories of Sequencing and
Timing in Language Perception and Production.’ Language
Perception and Production, Academic Press Inc. pp. 407-429.