

This page intentionally left blank


Edited by Scott P. Johnson

Oxford University Press, Inc., publishes works that further
Oxford University’s objective of excellence
in research, scholarship, and education.

Oxford New York

Auckland Cape Town Dar es Salaam Hong Kong Karachi
Kuala Lumpur Madrid Melbourne Mexico City Nairobi
New Delhi Shanghai Taipei Toronto
With offices in
Argentina Austria Brazil Chile Czech Republic France Greece
Guatemala Hungary Italy Japan Poland Portugal Singapore
South Korea Switzerland Thailand Turkey Ukraine Vietnam

Copyright © 2010 by Scott P. Johnson

Published by Oxford University Press, Inc.

198 Madison Avenue, New York, New York 10016

Oxford is a registered trademark of Oxford University Press.

All rights reserved. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted, in any form or by any means,
electronic, mechanical, photocopying, recording, or otherwise,
without the prior permission of Oxford University Press.

Library of Congress Cataloging-in-Publication Data

Neoconstructivism : the new science of cognitive development / edited by Scott P. Johnson.

p. cm.
Includes bibliographical references and index.
ISBN 978-0-19-533105-9
1. Cognition in infants. 2. Attention in infants. 3. Perception in infants.
I. Johnson, Scott P., 1959–
BF720.C63N46 2009
155.4′13—dc22 2009007210

9 8 7 6 5 4 3 2 1
Printed in the United States of America
on acid-free paper.

What is Neoconstructivism?
Nora S. Newcombe
Temple University

Piaget was supposed to have solved the problem of the origins of knowledge, to have bridged the nativist–empiricist split. This goal was one of his aims in beginning to study children’s cognitive development, and, at the end of his life, he regarded this goal as accomplished (Chapman, 1988). Yet, even before his death, criticisms of Piaget were mounting (Gelman & Baillargeon, 1983). They now form a familiar litany: no clear evidence for stages and structuralism, insufficient attention to the gradualisms and localisms of cognitive progress, excessive emphasis on verbal justifications of judgments. Among these limitations, two issues stand out: an overly lean delineation of the starting points for cognitive development, and a description of the mechanism of cognitive change that was little more than a re-naming of the phenomenon as accommodation.

So, how do we characterize starting points and developmental change? These two questions were addressed in a bold way by the resurgence of nativism, which came roaring back on the intellectual scene in the 1950s with Chomsky’s (1959) critique of Skinner and which had become the dominant paradigm for thinking about the origins of knowledge by 1980 or so. Nativism solves the first of Piaget’s problems by definition—by postulating the richest starting points imaginable, ones that encompass all the “core knowledge” required to understand the world. An important addition to this philosophically classic position, appearing in force in Fodor (1983), was the idea that the mind/brain is organized into modules that are not only neurally specialized and present from the beginning but that do not accept information from each other. Indeed, nativists have argued that evolution could not work to create human intelligence without such modular organization—there must be something for evolution to select or weed out (Cosmides & Tooby, 1994). And nativism solved the second of Piaget’s main problems (one that had shrunk to minuscule size by the hypothesized existence of so much innate knowledge) essentially by fiat—by postulating simple “triggers” that led children to select parameters or fill content into slots. More recently, they have added the hypothesis that change occurs when human language bridges the gap across modules of core knowledge (Spelke, 2003).

None of these postulates of nativism are, however, supported by the evidence. Starting points are strong, but infants are not tiny adults with insufficient control over their arms and legs. There is much more conceptual change than nativists envision and strong evidence that environmental input is integral to cognitive development in complex ways that go far beyond triggering (Newcombe, 2002). There is also good reason to think that language, while helpful to human thought, is not the sine qua non of cognitive flexibility (Newcombe & Ratliff, 2007). So, what’s the alternative to nativism? Information-processing theorists of the 1970s and 1980s retained lean starting points and used production system modeling to address the problem of


change (Klahr & Wallace, 1976). However, there seems to be more initial competence than most such modelers were willing to contemplate. In addition, these modeling efforts, like the connectionist modeling that succeeded production systems, have often failed to use empirical information about the kind and sequence of environmental information to constrain the models (Newcombe, 1998). Vygotsky has sometimes been presented as an alternative to Piaget, but his work concentrates too exclusively on social and cultural interaction to seem to provide a satisfying overall framework for many aspects of cognitive development (Newcombe & Huttenlocher, 2000).

The need for new approaches to cognitive development became increasingly evident by 1990, and several books began to fill the need for nonnativist approaches to cognitive development: Elman et al. (1996); Gopnik and Meltzoff (1997); Karmiloff-Smith (1992); Siegler (1996); Thelen and Smith (1994). However, each of these approaches also has some limitations, for example, of scope or specificity. In addition, they competed with each other, so that, for example, connectionism and dynamic systems theorists spent much time debating whether their efforts were similar or different (Spencer, Thomas & McClelland, in press). The end result is that there is not, as yet, a dominant theoretical framework within which to situate the large volume of exciting recent empirical work on cognitive development—research that sometimes seems to define theory better than debate that is self-consciously theoretical (Oakes, Newcombe, & Plumert, in press). Research on cognitive development has gained steadily in interest for several reasons—better techniques and methods, established phenomena with richly detailed data that allow for finely tuned competing explanations to be pitted against each other, and better and better contact with insights and methods from cognitive science, neuroscience, computer science, and comparative psychology. But it is missing an “ism” to define it.

Into this healthy and hopeful intellectual ferment comes this book. It is forthrightly titled Neoconstructivism, returning us to consider Piaget in a new light. Perhaps his life work did after all come close to his goal of reconciling nativism and empiricism. He was wrong about many things—the viability of structuralism, the leanness of starting points, the lack of a need for close study of input and mechanism. He was also living at a time when he could not follow up thoroughly on his nods to social interaction, or to what might have been his fascination with the architecture and processes in the physical substrate (the brain that is the mind). Nevertheless, his fundamental idea seems now to have been absolutely right: that a biologically prepared mind interacts in biologically evolved ways with an expectable environment that nevertheless includes significant variation. The chapters in this book collectively show us the promise of neoconstructivism.

Here are some tenets that I think unite the neoconstructivist approach.

• Everyone is a Darwinist. That is, all theorizing in cognitive development is situated in a context in which we must consider the adaptive value of thinking, and how it developed over evolutionary as well as developmental time. There is no need to cede the Darwinian high ground to the modularity theorists or to the nativists in general.
• Experience expectancy is a key concept. Keeping this valuable concept (delineated by Greenough, Black, & Wallace, 1987) firmly in mind, we can see how nature’s solution to the problem of the construction of knowledge could as easily—arguably more easily—have been the selection of neural abilities that will inevitably learn from their expectable input what needs to be learned. There is no a priori need for specific content to be wired in—although of course some may be.
• The world is richly structured and well equipped with perceptual redundancies and correlations that support experience-expectant learning. This idea is a fundamentally Gibsonian one, although it acquires new resonance in contemporary theorizing, in which we can specify how the information is “picked up” rather than simply asserting that it is.

• Humans (and perhaps other species as well) bring to the task of learning about their world a rich endowment in computing probabilities, lying at the heart of the work on statistical learning discussed in this book, as well as of Bayesian approaches to cognition and its development. These abilities go a long way to solve the problem of profligate association that nativism is fond of using to attack more balanced approaches that include empiricist elements.
• A richly structured world and a strong capacity for probabilistic reasoning interact, within the experience-expectancy framework, to select among and/or to integrate the multiple cues typically available to draw conclusions about causality, to remember spatial location, and so forth.
• Action plays a key role in learning and development, just as Piaget thought, not only because it creates the occasion for experiment but also because it allows for situations that are more replete with information than observation.
• Development and learning are closely intertwined concepts but not quite the same. Development is learning as the learner changes. For example, the learner acquires a shape bias or the idea that words are reliable cues to categories. As another and different example, perceptual tuning, especially in the first year of life, works by pruning capacities, not by adding to capacity, to create a fundamentally altered learner.
• Developmental change can be quantitative, qualitative, or both at the same time, depending on the granularity of observation. The oft-cited dichotomy between quantitative and qualitative change that is supposed to distinguish theories of development should be consigned to the dustbin of history; see Thelen and Smith’s (1994) elegant discussion of the “view from above” and the “view from below.”
• Analyses of the causes and mechanisms of developmental change need to proceed on all four of Aristotle’s fronts—looking for formal, material, final, and efficient causes. Formal cause is analogous to developmental description (and “thick” description [see Geertz, 1973] comes close to being a cause); material cause is analogous to the neural substrate; final cause is analogous to putting development in an evolutionary and adaptive context; efficient cause is analogous to an analysis of the interactions of input with the neural substrate and the current cognitive state of the learner.

The chapters in this book cover many domains (although not all) and, more important, they agree in many ways, subscribing either explicitly or implicitly to the list of key ideas listed above. But within the species of neoconstructivists, there are also dimensions of variation, just as cats differ in their markings, eye color, or even in the possession versus absence of a tail. The two most important differences among the chapter authors are the following:

• How strongly domain-general is human cognition and cognitive development? Some investigators in the neoconstructivist tradition embrace domain generality while others clearly work within a domain-specific framework. Note, however, that, importantly, domain specificity does not entail either nativism or modularity.
• How bottom-up versus top-down is human cognition and cognitive development? Some chapter authors seem to think that bottom-up approaches are necessary to avoid the extremes of nativism while others are more comfortable with top-down influences—recognizing that those influences may themselves be constructed.

Going back to the issue of a new “ism” to replace nativism as the framework for thinking about cognitive development—is neoconstructivism just one more “ism” that can be added to the list of contenders for a contemporary alternative? Does it vie with connectionism, or dynamic systems thinking, or emergentism, or overlapping wave theory, or small-p piagetianism, or other terms or schools of thought? Very importantly, I think the simple answer is No. The eight tenets listed above establish a neoconstructivist big tent that can cover all of the specific schools

of thought mentioned above and more. What can then ensue is the sorting out of the specific issues in empirical description, theory making and modeling that are the normal business of a mature science. Piaget’s biggest idea, if not his many smaller ones, has turned out to be right after all.

REFERENCES

Chapman, M. (1988). Constructive evolution: Origins and development of Piaget’s thought. New York, NY: Cambridge University Press.
Chomsky, N. (1959). A review of B. F. Skinner’s Verbal Behavior. Language, 35, 26–58.
Cosmides, L., & Tooby, J. (1994). Origins of domain specificity: The evolution of functional organization. In L. A. Hirschfeld & S. A. Gelman (Eds.), Mapping the mind: Domain specificity in cognition and culture (pp. 85–116). New York: Cambridge University Press.
Elman, J. L., Bates, E. A., Johnson, M. H., Karmiloff-Smith, A., Parisi, D., & Plunkett, K. (1996). Rethinking innateness: A connectionist perspective on development. Cambridge, MA: MIT Press.
Fodor, J. A. (1983). The modularity of mind. Cambridge, MA: MIT Press.
Geertz, C. (1973). Thick description: Toward an interpretive theory of culture. In The Interpretation of Cultures: Selected Essays (pp. 3–30). New York: Basic Books.
Gelman, R., & Baillargeon, R. (1983). A review of some Piagetian concepts. In J. H. Flavell and E. Markman (Eds.), Cognitive development: Vol. 3, Handbook of child development (pp. 167–230). New York: Wiley.
Gopnik, A., & Meltzoff, A. (1997). Words, thoughts and theories. Cambridge, MA: MIT Press.
Greenough, W. T., Black, J. E., & Wallace, C. S. (1987). Experience and brain development. Child Development, 58, 539–559.
Karmiloff-Smith, A. (1992). Beyond modularity: A developmental perspective on cognitive science. Cambridge, MA: MIT Press.
Klahr, D., & Wallace, J. G. (1976). Cognitive development: An information-processing view. Oxford: Lawrence Erlbaum.
Newcombe, N. S. (1998). Defining the radical middle. (Essay review of J. Elman et al., Rethinking Innateness). Human Development, 41, 210–214.
Newcombe, N. S. (2002). The nativist-empiricist controversy in the context of recent research on spatial and quantitative development. Psychological Science, 13, 395–401.
Newcombe, N. S., & Huttenlocher, J. (2000). Making space: The development of spatial representation and reasoning. Cambridge, MA: MIT Press.
Newcombe, N. S., & Ratliff, K. R. (2007). Explaining the development of spatial reorientation: Modularity-plus-language versus the emergence of adaptive combination. In J. Plumert & J. Spencer (Eds.), The emerging spatial mind (pp. 53–76). Oxford University Press.
Oakes, L. M., Newcombe, N. S., & Plumert, J. M. (2009). Are dynamic systems and connectionist approaches an alternative to “good old-fashioned cognitive development”? In J. P. Spencer, M. Thomas & J. McClelland (Eds.), Toward a new grand theory of development? Connectionism and dynamic systems theory re-considered (pp. 268–284). Oxford: Oxford University Press.
Siegler, R. S. (1996). Emerging minds: The process of change in children’s thinking. New York: Oxford University Press.
Spelke, E. S. (2003). What makes us smart? Core knowledge and natural language. In D. Gentner and S. Goldin-Meadow (Eds.), Language in mind: Advances in the investigation of language and thought. Cambridge, MA: MIT Press.
Spencer, J. P., Thomas, M., & McClelland, J. (Eds.) (in press). Toward a new grand theory of development? Connectionism and dynamic systems theory re-considered. Oxford: Oxford University Press.
Thelen, E., & Smith, L. B. (1994). A dynamic systems approach to the development of cognition and action. Cambridge, MA: MIT Press.

Contributors
Introduction
Scott P. Johnson

1. Attention in the Brain and Early Infancy
John E. Richards
2. All Together Now: Learning through Multiple Sources
Natasha Kirkham
3. Perceptual Completion in Infancy
Scott P. Johnson
4. Numerical Identity and the Development of Object Permanence
M. Keith Moore and Andrew N. Meltzoff

5. Connectionist Explorations of Multiple-Cue Integration in Syntax Acquisition
Morten H. Christiansen, Rick Dale, and Florencia Reali
6. Shape, Action, Symbolic Play, and Words: Overlapping Loops of Cause and Consequence in Developmental Process
Linda B. Smith and Alfredo F. Pereira
7. Musical Enculturation: How Young Listeners Construct Musical Knowledge through Perceptual Experience
Erin E. Hannon

8. Integrating Top-down and Bottom-up Approaches to Children’s Causal Inference
David M. Sobel
9. What Is Statistical Learning, and What Statistical Learning Is Not
Jenny R. Saffran
10. Processing Constraints on Learning
Rebecca Gómez
11. Mixing the Old with the New and the New with the Old: Combining Prior and Current Knowledge in Conceptual Change
Denis Mareschal and Gert Westermann

12. Development of Inductive Inference in Infancy
David H. Rakison and Jessica B. Cicchino
13. The Acquisition of Expertise as a Model for the Growth of Cognitive Structure
Paul C. Quinn
14. Similarity, Induction, Naming, and Categorization: A Bottom-up Approach
Vladimir M. Sloutsky

15. Building Intentional Action Knowledge with One’s Hands
Sarah Gerson and Amanda Woodward
16. A Neoconstructivistic Approach to the Emergence of a Face Processing System
Francesca Simion and Irene Leo

17. A Bottom-up Approach to Infant Perception and Cognition: A Summary of Evidence and Discussion of Issues
Leslie B. Cohen

Author Index
Subject Index

Morten H. Christiansen
Department of Psychology
Cornell University

Jessica B. Cicchino
Department of Psychology
Carnegie Mellon University

Leslie B. Cohen
Department of Psychology
University of Texas at Austin

Rick Dale
Department of Psychology
University of Memphis

Sarah Gerson
Department of Psychology
University of Maryland, College Park

Rebecca Gómez
Department of Psychology
University of Arizona

Erin E. Hannon
Department of Psychology
University of Nevada, Las Vegas

Scott P. Johnson
Department of Psychology
University of California, Los Angeles

Natasha Kirkham
Centre for Brain and Cognitive Development
Birkbeck College, University of London

Irene Leo
Department of Psychology
University of Padova

Denis Mareschal
Centre for Brain and Cognitive Development
Birkbeck College, University of London

Andrew N. Meltzoff
Institute for Learning and Brain Sciences
University of Washington

M. Keith Moore
Institute for Learning and Brain Sciences
University of Washington

Nora S. Newcombe
Department of Psychology
Temple University

Alfredo F. Pereira
Department of Psychological and Brain Sciences
Indiana University—Bloomington

Paul C. Quinn
Department of Psychology
University of Delaware

David H. Rakison
Department of Psychology
Carnegie Mellon University

Florencia Reali
Department of Psychology
University of California, Berkeley

John E. Richards
Department of Psychology
University of South Carolina

Jenny R. Saffran
Department of Psychology
University of Wisconsin—Madison

Francesca Simion
Department of Psychology
University of Padova

Vladimir M. Sloutsky
Center for Cognitive Science
Ohio State University

Linda B. Smith
Department of Psychological and Brain Sciences
Indiana University—Bloomington

David M. Sobel
Department of Cognitive and Linguistic Sciences
Brown University

Gert Westermann
Department of Psychology
Oxford Brookes University

Amanda Woodward
Department of Psychology
University of Maryland, College Park

The term neoconstructivism was generated by combining neo, taken from the Greek neos, meaning
“new,” and constructivism, taken from (among other sources) the pioneering theorist and researcher
Jean Piaget. Piaget’s constructivist theory holds that cognitive development is a continual process of
building knowledge on previous skills (e.g., perception, memory, and action repertoires) and existing
knowledge structures, from a foundation at birth consisting largely of reflexes and sensory
impressions. It seems to me a first principle of any theory of development that development happens—every
human being that ever existed or ever will started out as a fertilized egg and grew from there. This
is hardly an insightful observation, yet it is rarely mentioned in the literature, and consequently our
understanding of the “growth” of cognition is woefully incomplete. Piaget’s constructivism, and now
neoconstructivism, represent attempts to address this problem.
The origins of this book are rooted in an idea that came to me in 2003. The idea was motivated
by the following observations. Research on cognitive development, particularly in infancy, consists
largely of demonstration studies—experiments designed to show some cognitive skill at a particular
age, often with little or no consideration of limitations characteristic of young infants’ perceptual
skills and cortical immaturity, and often with little or no consideration of development.
Demonstration studies can be contrasted with process studies—experiments designed to examine
mechanisms underlying performance or development that support the skill in question or bring it about.
Back in 2003, I thought that the balance of the field was weighted heavily toward demonstration
studies, which tend to grab attention and headlines (“Infants Are Smarter Than You Think” and
such). These studies have an important place in the literature, and my colleagues and I have produced
a few ourselves. Yet progress in the field relies also on an understanding of process, in particular
developmental mechanisms, because an understanding of development is required for a complete
characterization of any psychological phenomenon. My idea in 2003 was to organize a symposium
focusing on process studies as a theme for a major developmental conference. I began attending
conferences in 1992 (the International Conference on Infant Studies, or ICIS, in Miami), and was
not aware of any such symposium having been organized previously. So I asked around, got agreements
from four principal researchers in the area (most of whom are represented by chapters in this
book), and submitted it to ICIS for the 2004 meeting in Chicago under the title “The Big Questions in
Infant Cognition: Trenchant Debate, Tentative Answers.” Talks were presented on object perception,
categorization, word learning, and dorsal/ventral visual processing.
The reviewers accepted the symposium, and the reception at the conference itself far exceeded
any of our expectations. The room was packed and overflowing; some audience members sat
on the floor, stood at the back, and stacked up deep outside the doors. At that time, I thought either


(a) we got lucky in terms of the conference schedule, (b) this is a fluke, or (c) there is pent-up demand
for discussion of developmental mechanisms at our conferences. To find out, I tried it a second
time, submitting a symposium titled “Origins and Ontogenesis of Human Cognition” (all of whose
contributors have chapters in this book) to the 2007 meeting of the Society for Research in Child
Development in Boston. This time, the talks centered on memory development, grammar learning,
and social cognition. Again, the room, though substantially larger than the one used for the 2004
symposium, was full to overflowing.
Finally, I organized a smaller meeting, held in November 2006 in New York City and generously
funded by the National Science Foundation, bringing together 12 of the 24 authors who appear in
this book. Participants in the meeting and authors in this book are all active researchers in cognitive
development whose work, though involving a wide range of methods and approaches, coheres
in a common framework: an explicit focus on developmental mechanisms of human cognition. The
meeting was productive, enlightening, and encouraging, and suggested to me (and, I think, the other
participants) that we are really onto something worthwhile.
I think the fields of developmental and cognitive science need this book and others like it. The
range of methods and approaches is not necessarily representative of relevant research as a whole,
but it is representative of some of the questions that are being asked and of some of the important
findings that have been yielded in the past several decades. I hope you find it useful.
I would like to thank the authors for their hard work; Catharine Carlin at Oxford University
Press for her enthusiasm and patience; the National Science Foundation for funding the meeting in
2006; the NSF, National Institute of Child Health and Human Development, Economic and Social
Research Council, and Nuffield Foundation for supporting my own work; and my family, in particular
Kerri Johnson.

Scott P. Johnson
Los Angeles

Objects and Space

Attention in the Brain and Early Infancy

John E. Richards


Attention shows dramatic changes over the period of infancy. At birth, there is little intrinsic control of behavior and attention is affected mainly by salient physical characteristics of the infant’s environment. By the age of 2 years, the infant’s executive control systems are functioning and infants voluntarily direct information processing flow by allocating attention on the basis of well-defined goals and tasks. These changes in attention affect a wide range of cognitive, social-emotional, and physiological processes.

The attention changes in young infants occur simultaneously with substantial changes in the brain. At birth, the structure, myelination, connectivity, and functional specialization of the brain are relatively primitive. Much of the brain’s structural development occurs between birth and 2 years. Many brain areas showing these changes are closely linked in adult participants to cognitive processes such as attention. A natural inclination is to hypothesize that the changes in attention development are caused by the changes in these brain areas.

One example of the brain changes is the axonal myelination of neurons. Myelin is a fatty substance that in adult brains covers the axons of many neurons. Figure 1.1 shows a “typical” neuron with an unmyelinated portion (cell body, dendrites) and whose axon is completely covered by the myelin sheath. Myelin appears as “white” when viewed in the brain (fatty tissue reflects light). Thus, in autopsied brains, there are large areas called “white matter” that consist of long myelinated axons. Myelination is seen in magnetic resonance imaging (MRI) T1-weighted scans as long channels of white matter surrounded on the edges by gray matter. Figure 1.1 shows MRIs from a newborn, 6-month-old infant, 15-month-old infant, 10-year-old child, and an adult. The changes in the myelination appear to be rapid from birth to the 15-month MRI scan, then slower afterwards. Myelination of the axon results in less noisy and quicker transmission, making the communication between neurons more efficient. It often is used as an explanatory mechanism for how changes in the brain affect cognitive development (e.g., Klingberg, 2008; Yakolev & Lecours, 1967). The changes in myelin have been documented in several publications, most notably in the work of Yakolev and Lecours (1967), Kinney and colleagues (Kinney, Brody, Kloman, & Gilles, 1988; Kinney, Karthigasan, Borenshteyn, Flax, & Kirschner, 1994), and Conel (1939 to 1967; also see Johnson, 1997; Klingberg, 2008; Sampaio & Truwit, 2001).

The relationship between brain development and attention development has been hypothesized by several models. I recently reviewed my own view of the relationship between brain centers controlling eye movement, brain development in these areas, and developmental changes



[Figure 1.1 labels: Cell body; Axon terminal; Newborn; 6 Months; 15 Months; 10 Years]

Figure 1.1 The left figure is a cartoon drawing of a “typical” neuron with unmyelinated dendrites, cell
body, and axon terminal, and the myelin sheath covering the axon. The MRIs are T1-weighted slices
taken at the same anatomical level (anterior commissure) from participants from birth to adult. The
myelination of the long axons in the brain is seen as “white” matter in the adults and there are large
changes from the newborn to the adult period.

in attention (Richards, 2008). I will briefly summarize my observations here.

There are three types of eye movements used to track visual stimuli and each eye movement type is controlled by areas of the brain that show different developmental trajectories. “Reflexive saccadic” eye movements occur in response to the sudden onset of a peripheral stimulus, are controlled largely by subcortical brain areas, and are largely intact by 3 months of age. “Voluntary saccadic” eye movements are under voluntary or planned control, involve several parts of the cortex (occipital, fusiform gyrus, parietal cortex, frontal eye fields), and show rapid development from 3 to 9 months of age. “Smooth pursuit” eye movements occur either voluntarily or involuntarily toward smoothly moving objects, involve several cortical areas of the brain involved in voluntary saccadic eye movements and some areas not involved in voluntary saccadic eye movements (medial temporal, parietal), and show changes beginning at 3 months and lasting throughout the period of infancy.

There have been several models theorizing how the areas of the brain controlling the eye movement develop, and how these changes affect attention-controlled eye movements, including models by Bronson (1974, 1997), Maurer and Lewis (1979, 1991, 1998), Johnson and colleagues (Johnson, 1990, 1995; Johnson et al., 1991, 1998, 2003, 2007), Hood (Hood, 1995; Hood, Atkinson, & Braddick, 1998), and Richards (Richards, 2002, 2008; Richards & Casey, 1992; Richards & Hunter, 1998). Iliescu and Dannemiller (2008) review several pertinent “neurodevelopmental” models. These models hypothesize that changes in the brain areas controlling the eye movements result in the overt changes in the eye movements. With respect to attention-directed eye movements, particularly in the first few months, the

voluntary saccadic system is most relevant. It has been hypothesized that connections within primarily visual areas of the cortex (e.g., primary and secondary visual cortex) and myelination/connectivity to other areas of the cortex (e.g., parietal area PG) show growth spurts at about 3 to 6 months. These brain changes are accompanied by changes in the voluntary tracking of objects in the visual field (Richards & Holley, 1999), changes in the attention-directed eye movements toward peripheral visual targets (Hunter & Richards, 2003, submitted; Richards & Hunter, 1997), and changes in the ability to shift attention “covertly” without making an eye movement (Richards, 2000a, 2000b, 2001, 2005, 2007b; also see Richards, 2004b, and the section “Brain and Attention: Spatial Orienting”).

The previous section presented the hypothesis that brain changes in young infants are responsible for the changes seen in psychological processes, with an emphasis on attention-directed eye movements. There are many aspects of infant cognitive development that have been explained as a function of brain development; the field of “developmental cognitive neuroscience” uses this as a basic explanatory mechanism (Johnson, 1997; Nelson & Luciana, 2008). However, these models are severely limited in their description of what the actual brain is like for infants at a specific age or a specific infant at a specific age. They also are limited in their measurement of brain function (see the section “How to Measure Brain Activity in Infants”). I will review two ways in which brain development has been modeled in past research. Then, I will assert that structural MRI techniques should be used for this purpose.

The primary information about brain development in infants comes from nonhuman animal models of brain development, primarily primates. For example, our knowledge of the patterns of myelination, synaptogenesis, and neurochemical development comes primarily from study of normally developing nonhuman animals. Nonhuman animals may be studied by sacrificing the animal at a specific age and performing a brain dissection, or with invasive neuroscientific techniques such as direct neural recording or lesions. These techniques may be applied to individuals who also participate in tasks measuring behavioral performance and psychological processes. An example of this approach is work on infant memory by Bachevalier (2008). She has shown that changes in memory in young monkeys are closely related to the development of the brain areas that are the basis for this type of memory in adults. Lesioning these areas in the infant monkey disrupts the onset or occurrence of this type of memory. Bachevalier draws a parallel between the age-related monkey performance and human infant performance on analog versions of the visual preference procedure and visual discrimination tasks. She concludes that her experiment suggests that the basic neural systems underlying these memory tasks in monkeys and humans are parallel, and that studies of infant monkeys inform us about comparable memory development in infant humans.

There are several assumptions necessary for the study of infant nonhuman animals to be relevant to human infants. First, this study requires that a correlation can be made between ages of the nonhuman animals and human infants. For example, in a study of changes in synaptogenesis in visual areas, Bourgeois (1997) showed changes in rat, cat, macaque monkeys, and humans. Figure 1.2 shows comparable changes in the primary visual cortices of four different species. These changes show some similarity in the overall pattern of change. However, the pattern of changes often is not isomorphic across species (e.g., compare the prolonged decay in synaptic density in humans in Figure 1.2), and a comparison of development between species in one brain area might not be the same for other brain areas. Second, one can relate the changes in the brain in the nonhuman animals, the importance of these brain areas for a specific psychological process, and the changes in these psychological processes in human infants. This analysis presumes that there is an unequivocal relationship between the psychological process and the brain area in development, which is a questionable assumption.

[Figure 1.2 plot. Title: Synaptogenesis in primary visual cortex. Vertical axis: density of synapses. Horizontal axis: days after conception (log scale, 10 to 10^5).]

Figure 1.2 Changes in the relative density of synapses in the primary visual cortex for four species.
The increase in synaptic density indicates synaptogenesis occurring, and the decline represents syn-
aptic pruning. Note the differences in the pattern of development relative to specific developmental
events (birth, puberty, death) in the four species. From Bourgeois (1997).
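
The horizontal axis of Figure 1.2 places all four species on a common scale of days after conception. A minimal sketch of that conversion follows. The gestation lengths are rounded, approximate values supplied here as assumptions; they are not taken from Bourgeois (1997), and the function names are hypothetical.

```python
import math

# Approximate gestation lengths in days. These are rounded illustrative
# values (assumptions), not figures reported in the chapter.
GESTATION_DAYS = {"rat": 21, "cat": 65, "macaque": 165, "human": 270}


def days_after_conception(species, postnatal_days):
    """Convert a postnatal age in days to days after conception."""
    return GESTATION_DAYS[species] + postnatal_days


def log10_conceptional_age(species, postnatal_days):
    """Position of an age on a log10 'days after conception' axis."""
    return math.log10(days_after_conception(species, postnatal_days))


# The same postnatal age corresponds to different conceptional ages.
human_3_months = days_after_conception("human", 90)
macaque_3_months = days_after_conception("macaque", 90)
print(human_3_months, macaque_3_months)
```

Placing ages on a shared conceptional scale only aligns the time axis. As the text notes, it does not make the developmental patterns themselves comparable across species.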

This type of analysis does not take into account different developmental patterns for other brain areas and the influence of these brain areas on human infant psychological activity. Third, this type of study assumes that brain–behavior relations in nonhuman animals are comparable to those in humans. This assumption is doubtful due to the complexity of human behavior relative to animals, the extremely large differences in brain size between nonhuman animals and humans, and the relative size of brain areas in humans and nonhuman animals (e.g., prefrontal cortex; occipital cortex). Finally, these types of studies use analogical reasoning as a basis for nonhuman animal models being relevant for human infants. They cannot apply the invasive methods directly to human participants and cannot inform us about the developmental status of the brain of individual infants.

A second way in which information about human brain development has been obtained is from postmortem studies of young infants. Infants who die of neural-related causes and/or other causes have been studied for a wide range of neuroanatomical, neurochemical, and cytoarchitectural processes (synaptogenesis

[Figure 1.3 panels. Top left: neuron number per 0.001 mm3 by postnatal age (0 to 72 months) for layer I (dashed lines) and layer II (solid lines). Top right: number of myelinated fibers per 0.005 mm2 by postnatal age (0 to 72 months) for layer II (solid lines) and layer VI (dotted lines). Bottom: inputs to cortical layers 1 to 6 (with sublayers of layers 4 and 5) at newborn, 1 month, and 3 to 4 months, labeled M, P, LGN, S.C., B.G., MT, and V2, V3, etc.]

Figure 1.3 Work from Conel’s postmortem neuroanatomical studies (Conel, 1939 to 1967). The top
graphs show neural density in layers I and II (left figure) and the myelinated fiber density in layers II
and VI (right figure), both shown as a function of postnatal age (from Shankle et al., 1998). The bot-
tom figures show innervations of the layers in the primary visual cortex by different cell types at birth,
1 month, and 3 to 4 months of age.

in Huttenlocher, 1990, 1994; myelination in Kinney et al., 1988, 1994). The most well-known version of this kind of study is a series of studies by Conel (1939–1967). Conel studied the neuroanatomy and cytoarchitecture of the layers of the human cerebral cortex in autopsied individuals. He measured six anatomical features, including cortical layer thickness, cell density, numbers of cell types, and myelinated fiber density (Shankle, Romney, Landing, & Hara, 1998). Figure 1.3 shows two types of results from these studies. The top figures display neuron density and number of myelinated fibers in two cortical layers as a function of age. Notice that layer 6, which receives input from noncortical areas and is involved in simple cortical functioning, shows myelination changes very early. In contrast, layer 2, which has communications within layers and with other cortical areas and is involved in more complex cortical functioning, shows an extended time course for myelination. The bottom figures show the types of neurons that innervate the cortical layers at different ages.

Conel’s work has been popular in developmental cognitive neuroscience because of the specificity of the information about the neurophysiological processes being studied and its large age range. For example, Johnson (1990, 1995; Johnson et al., 1991, 1998) posits that development in the layers of the primary visual cortex (e.g., Figure 1.3, bottom figures) acts as a limiting factor for visual behavior and visual attention controlled by brain systems. Such developmental changes in the layers of the primary cortex from birth to about 6 months act as a gateway for the onset of these eye movements in young infants.

There are limitations of these postmortem studies for developmental cognitive

neuroscience. An implicit assumption is that the individuals measured at different ages are representative of that age and provide knowledge about the brain status of specific individuals of that age. It is likely true that the changes across age are large enough that groups of infants at one age will share neuroanatomical and cytoarchitectural features that distinguish them from groups at different ages. However, the developmental status of the brain of an individual is likely to show idiosyncratic differences from that of other individuals at the same age. Second, these studies typically are limited to small samples (single individual at each age) and restricted to infants who died. If the reasons for the death are related to the characteristics being studied, the results will not be applicable to a wide range of participants. Without some independent verification of the generality of the findings, these surely cannot be applied to determine the status of individual participants. These studies typically have used extremely limited samples for young infants and children.

One technique that may be used to examine the brains of individual infants is MRI. MRI has been described in several publications (e.g., Huettel, Song, & McCarthy, 2004; Thomas & Tseng, 2008) and I will provide a brief overview emphasizing its use for studying the brain. MRI applies a very large magnetic field to the head. The head’s media (skull, cerebrospinal fluid [CSF], brain) have magnetic properties such that the magnetic field aligns spinning protons in the same direction as the field. Radiofrequency (RF) energy pulses cause disruption of the magnetic fields and disrupt the alignment of the protons. MRI measures the disruption in alignment due to the RF pulse and return of the alignment to the strong magnetic field. Different body tissue types have differing times to return to alignment, and these differing times (or different resonance frequencies and/or differing on/off durations of the RF pulses) may be used to identify where different types of media are located in the head. This allows the identification and visualization of skull, skin, CSF, white and gray matter, myelin, vascularization, and other components in the head. MRI measurement has been used to measure developmental changes in myelination (Sampaio & Truwit, 2001), distribution of white and gray matter in the cortex (O’Hare & Sowell, 2008), and biochemical characteristics of the developing brain (Sajja & Narayana, 2008). An interesting use of the MRI is the application of “diffusion tensor imaging,” which details the connectivity of axons between different areas in the brain (Wozniak, Mueller, & Lim, 2008).

The use of structural MRI for determining brain developmental status is relatively new but well underway. The most comprehensive study of brain structural development with MRI is currently ongoing. The “NIH MRI Study of Normal Brain Development” (Almli, Rivkin, & McKinstry, 2007; Evans, 2006; NIH, 1998) is a multicenter research project sponsored by the National Institutes of Health to perform anatomical scans of about 800 children ranging in age from birth to 18 years. This study uses 1.5T scanners to get T1- and T2-weighted images, proton density excitation, DTI, and other scans. An interesting aspect of this study is the collection of a large battery of neuropsychological and developmental tests. This will allow the correlation of psychological processes, neuropsychological status, and developmental level with the brain status of individual participants. The study will provide individual MRIs to researchers interested in these aspects, and likely will provide standardized or stereotaxic scans in the “MNI” framework (Montreal Neurological Institute brain atlas, Mazziotta, Toga, Evans, Fox, & Lancaster, 1995; Evans, Collins, & Milner, 1992; Evans et al., 1993) or Talairach space (Talairach & Tournoux, 1988; also see Talairach Atlas Database Daemon, Fox & Uecker, 2005; Lancaster, Summerlin, Rainey, Freitas, & Fox, 1997; Lancaster et al., 2000). Currently (late 2007) MRIs are available for children ranging in age from 4 to 18 years. Most of the MRIs for the infant participants have been collected but are undergoing quality control.

A second approach I am using in my current work is to acquire structural MRIs on infant participants who also participate in studies of attention, and relate the information found about a specific individual’s brain developmental

status to performance in attention tasks. Parents of infants are contacted in the normal course of our contact system for psychological experiments. The parents who agree to participate in the MRI have several visits. First, the parent(s) and infant come to the MRI center located at a local community/teaching hospital. The parent is shown the equipment and the process is described. Second, the parents return to the MRI center for the scan. The MRI recording is done while the infant sleeps (Almli, Rivkin, & McKinstry, 2007; Evans, 2006; NIH, 1998). The infant and parent come to the MRI center in the evening at the infant’s normal bedtime. The infant and parent go into the darkened room with the MRI and the infant is put to sleep. Then the infant is placed on the MRI table, earplugs and headphones are put on, and the recording is done. Figure 1.4 shows an infant lying on the MRI bed; the headphones and cloths surrounding the infant can be seen. When the infant is in the MRI tube, a research assistant reaches in and keeps a hand on the infant to see if the baby moves or wakens. Getting the baby to sleep usually takes about 45 to 60 min, and the MRI recording itself has scan sequences lasting in total about 20 min. Previous studies from several laboratories have described procedures for performing MRI recording of nonsedated infants with success rates that range from 66% to 90% (Almli et al., 2007; Dehaene-Lambertz, Dehaene, & Hertz-Pannier, 2002; Evans, 2006; Gilmore et al., 2004; Paterson, Badridze, Flax, Liu, & Benasich, 2004; Sury, Harker, Begent, & Chong, 2005). We have had 100% success after the infant is sleeping; several infants have not been able to get to sleep. The parent is in the scanner room during the scan, along with a pediatric nurse. Finally, the infant and parent then come to a psychophysiological laboratory for studies of attention (see the section “Brain and Infant Attention: Spatial Orienting”).

Figure 1.4 An infant lying on the MRI bed going into the MRI tunnel. The infant is covered with a sheet and has a restraining strap lightly placed across its body. The headphones and cloths surrounding the infant can be seen in this picture. A research assistant (left side of picture) and the parent (right side of picture) are close to the baby during the scan.

Several procedures are followed to ensure the infant’s safety, obtain a good recording, and minimize the amount of time in the scanner. Potential risks of MRI recording include scanner noise, the magnetic fields, and magnetic gradients. We use earplugs and earphones to minimize scanner noise. The scans in our 3T magnet are optimized for the lowest sound levels and fastest recording. The infants are placed on the bed on “memory foam,” covered snugly with sheets, and have rolled washcloths around the head for restricting head movement comfortably. The U.S. FDA considers MRI recording in infants to be a “nonsignificant” risk when used within FDA-specified parameters (USFDA, 2003, 2006). This assessment is based on over 20 years of MRI recording in neonates and infants (e.g., Barkovich, Kjos, Jackson, & Norman, 1988; Rivkin, 1998) with no reports of deleterious long-term effects. Outcome studies of such effects show that the magnetic field or the magnetic gradients do not threaten the concurrent physiological stability of the infant during scanning (Battin, Maalouf, Counsell, Herlihy, & Hall, 1998; Taber, Hayman, Northrup, & Maturi, 1998; Stokowski, 2005) and there are several studies showing no short-term or long-term effects from this type of recording (Baker, Johnson, Harvey, Gowland, & Mansfield, 1994; Clements, Duncan, Fielding, Gowland, Johnson, & Baker, 2000; Kangarlu, Burgess, & Zu, 1999; Kok, de Vries, Heerschap, & van den Berg, 2004; Myers, Duncan, Gowland, Johnson, & Baker, 1998; Schenck, 2000). These risks are discussed in detail in several sources (Barkovich,

2005; Dehaene-Lambertz, 2001; Evans, 2006; Stokowski, 2005).

Three scans are done on each infant. First, we do a localizer sequence (45 s) to orient the subsequent high-resolution slices. This orients the scan so that the longitudinal fissure is parallel with the sagittal plane and perpendicular to the coronal and axial planes, and the line between the anterior commissure (AC) and posterior commissure (PC) lies on the center MRI slice. This will orient the scan so that the MRI may be oriented relative to the origin for the stereotaxic space defined by Talairach (Talairach & Tournoux, 1988). Second, the localizer scan is followed by a 3D MPRAGE T1-weighted scan. The T1-weighted (T1W) scan results in an MRI volume that shows maximal distinction between gray matter and white matter. The MPRAGE employs a TI of 960 ms, a delay of 3000 ms between shots and an 8° flip angle, with a very short TE (4.9 ms). Using this sequence, we can collect a 1 mm isotropic volume (150 × 256 × 256 mm FOV) in about 9 min. The gray matter, white matter, and CSF can be segmented from the T1W scan, but a subsequent T2-weighted (T2W) scan helps to discriminate white matter and CSF with automatic segmenting routines. The T2W scan emphasizes liquid in the brain, so CSF is very bright (high voxel values) relative to other brain matter. This is acquired as a dual contrast proton density and T2-weighted sequence with a dual echo fast spin echo sequence. The dual echo acquisition takes 4 min to acquire (256 × 256 matrix with rectangular FOV, 1 × 1 mm pixel size, 2 mm slice thickness, no slice gap, 50 sagittal slices in two interleaved packages, echo train length of 10, first TE = 13 ms, second TE = 101 ms, TR = 4640 ms, parallel imaging factor = 2). Any of the sequences may be repeated if degraded by motion artifacts; however, all infants we have tested have been sleeping and still during the entire scanning protocol. If only the T1W scan is available, some automatic classification/segmentation routines are replaced with manual segmentation routines of the T1W scan.

What are the MRI scans used for? Briefly, the purpose of the attention studies is to identify areas of the brain involved in sustained attention and how they affect specific cortical/cognitive processes. Neural activity produces electrical currents that pass through the media of the head and can be recorded on the surface of the scalp. This electrical activity is the electroencephalogram (EEG). We use the EEG for cortical source analysis. One aspect of this analysis is the necessity to have a realistic head model for the quantification. We obtain this from the MRI. Figure 1.5 shows the MRI recorded from the infant pictured in Figure 1.4. The middle section shows the media identified inside this baby’s head (i.e., gray matter, white matter, CSF, skull). The right section shows a tetrahedral wireframe model of the identified sections. This wireframe model is used in computer programs such as BESA, EMSE, and my own computer program (Richards, 2006, submitted) to identify the current sources of the EEG on the head. I will describe the details of the EEG analysis in the next section (section on “How to Measure Brain Function”) and some results from the study in a following section (section on “Brain and Infant Attention: Spatial Orienting”). Whereas our use of these MRIs is primarily as an adjunct to current source analysis of EEG signals, specific aspects of the MRI can be used such as amount of myelination in brain areas, structure of brain areas, and perhaps connectivity between brain areas.

Why is this approach important? The first and perhaps obvious answer is that “cognitive neuroscience” requires measurement of both the cognitive and the neural aspects of cognitive neuroscience. A developmental cognitive neuroscience study without measures of brain development will be extremely limited. Second, the structural MRI of individual infants is necessary to relate the findings for particular participants to the functional data of particular infants. This is especially important given the wide range of topographical changes over this age range, and the possibility that large individual differences might occur in infant brains. An example of this problem is shown in Figure 1.6. This shows MRIs at a common stereotaxic position (axial level of anterior commissure) across a wide range of ages. The wide range of head sizes across this age range, and some variability within
Figure 1.5 The MRI from the infant shown in Figure 1.4. The left scan is a sagittal view of the
T1-weighted scan, with the cross-hairs indicating the position of the anterior commissure. The middle
figure is the brain segmented into skin and muscle (white), gray matter (red), white matter (green),
CSF (cerebrospinal fluid, yellow), dura (pink), skull (blue), and nasal cavity (purple). The right figure
is a representation of the tetrahedral wireframe used in EEG source analysis programs.

Newborn (3.0T), 1 & 3 Months (1.5T), 3.5 & 4.5 Months (3.0T)

6 & 6.5 Months (3.0T), 12 Months (1.5T), 12 Years (3.0T)

Figure 1.6 Axial T1-weighted MRI scans for participants ranging from birth to 12 years. Each scan
is presented at the axial level of the anterior commissure (blue cross-hairs). Note the large change in
shape and size, differences in the type of brain underlying similar skull locations, and the changes in
myelination across these ages.


ages, is clear in these examples. Additionally, the exact type of brain media under the same skull location differs across the infants. It is important for the study of brain development to have structural information on particular infants.

A third reason why this approach is important is that this allows specific comparisons across age in related structures. Figure 1.7 shows brains of infants at 3 and 6 months, and a 10-year-old child. The MNI brain is also shown (Montreal Neurological Institute brain atlas, Mazziotta et al., 1995; Evans et al., 1992, 1993). The MNI brain consists of the average of 152 college-age participants. Several brain areas have been identified for the MNI brain. For example, Figure 1.7 (bottom right panel) shows various anatomical structures that have been identified on the MNI brain. The MNI brain and associated brain areas may be used as a stereotaxic atlas to identify structures in children at younger ages. The bottom part of the figure shows the structure identified on the MNI brain (bottom right panel), which is then transformed to the head size and shape of the participants at the other ages (three bottom panels on left). This may allow the direct comparison of the development of specific brain areas across a wide range of participants.

3 Months 6 Months 10 Years MNI

Figure 1.7 Axial T1-weighted MRI scans for participants at 3 and 6 months, 10 years, and the “MNI”
brain, located on the axial level of the anterior commissure (cross-hairs). The bottom right MRI has
the anatomical locations overlaid in color derived from the MNI brain, and structures such as the pre-
frontal cortex (blue), temporal (red) and occipital (green) cortex, and several subcortical structures
may be seen. The three figures on the bottom left are the single participant brains with the stereotaxic
anatomical areas translated from the MNI brain to the individual participant.
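
The idea of carrying atlas locations over to an individual head can be illustrated with a deliberately simplified sketch. The chapter does not specify the transformation used for Figure 1.7, so the example below assumes a plain per-axis scaling about a shared landmark (such as the anterior commissure), which is far cruder than a real registration. The function name, head dimensions, and coordinates are all hypothetical.

```python
def make_scaling_transform(atlas_size, subject_size,
                           atlas_origin=(0.0, 0.0, 0.0),
                           subject_origin=(0.0, 0.0, 0.0)):
    """Build a function mapping atlas (x, y, z) mm coordinates to a subject.

    Assumption: the subject head is the atlas head stretched independently
    along each axis about a shared landmark (the two origins).
    """
    scales = [s / a for s, a in zip(subject_size, atlas_size)]

    def transform(point):
        return tuple(
            so + k * (p - ao)
            for p, ao, so, k in zip(point, atlas_origin, subject_origin, scales)
        )

    return transform


# Hypothetical head extents in mm (left-right, front-back, bottom-top).
atlas_size = (140.0, 180.0, 120.0)
infant_size = (105.0, 135.0, 90.0)

to_infant = make_scaling_transform(atlas_size, infant_size)

# A hypothetical atlas location, expressed relative to the shared landmark.
atlas_point = (40.0, -60.0, 20.0)
infant_point = to_infant(atlas_point)
print(infant_point)
```

Because head shape as well as size changes with age, a uniform or per-axis scaling like this would only be a first approximation; the differences in underlying tissue shown in Figure 1.6 are the reason individual structural scans are needed.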

There are multiple aspects of brain development that might be important in the development of cognitive processes. The previous section gave some illustrative examples of the development of brain anatomy (brain structure). The field of cognitive neuroscience is interested in the relationship between brain functioning and cognitive and psychological processes. The development of brain functioning is as important for developmental cognitive neuroscience as is the development of brain structure.

The measurement of brain activity during psychological tasks in infant participants has been as difficult as the measure of brain structure. Most of the neurodevelopmental models of infant attention mentioned in section 1 were based on the brain function of nonhuman animals, so-called “marker tasks,” or speculative relations between overt behavioral measures and putative brain markers. I will briefly review the older measurement techniques and then comment on three new techniques.

Most neurodevelopmental models have relied either on measurement of brain function in animal models, or the measurement of overt behavior putatively linked to brain activity. Johnson (1997) calls the latter measures “marker tasks.” Marker tasks are behavioral activities that can be measured overtly but which are thought to be controlled by specific brain areas. Johnson proposes that such tasks may be used in infants with the understanding that development in these tasks implies brain development in the associated areas. I have discussed this proposal previously (Richards, 2002, 2008; Richards & Hunter, 2002). Similarly, there is a wide range of studies using physiological indices in the infant in psychological tasks (Richards, 2004c; Reynolds & Richards, 2007). These psychophysiological measures (e.g., heart rate, EEG) have known physiological processes that cause their activity and thus may show changes in these processes linked to experimental manipulations or cognitive processing. Like the marker tasks, psychophysiological measures are indirect measures of brain function and do not tell us about the developmental status of the brain for an individual participant. With proper caution, marker tasks and psychophysiological measures allow inferences to be made about brain development and help to inform a developmental cognitive neuroscience approach to attention.

I have been using the EEG and scalp-recorded event-related potentials (ERP) as measures of brain activity. The EEG is electrical activity located on the scalp that is generated by neural activity occurring in cell bodies or extracellular neural tissue. ERP are EEG activity that is time-locked either to experimental events or to cognitive events. Recently I have been advocating the use of these measures with “cortical source analysis” (Reynolds & Richards, 2007, 2009; Richards, 2003b, 2005, 2006, 2007a, 2007b, submitted). Cortical source analysis uses high-density EEG recording (Johnson et al., 2001; Reynolds & Richards, 2009; Tucker, 1993; Tucker, Liotti, Potts, Russell, & Posner, 1994) to hypothesize cortical sources of the electrical activity and to identify the location of the brain areas generating the EEG or ERP (Huizenga & Molenaar, 1994; Michel et al., 2004; Nunez, 1990; Scherg, 1990, 1992; Scherg & Picton, 1991; Swick, Kutas, & Neville, 1994). The activity of these cortical sources may be directly linked to ongoing behavioral manipulations or psychological processes. This results in a description of the functional significance of the brain activity, i.e., functional cognitive neuroscience. Greg Reynolds and I have been using this technique to study infant recognition memory in the paired-comparison visual-preference procedure (Reynolds, Courage, & Richards, 2006; Reynolds & Richards, 2005) and have recently reviewed our use of this technique (Reynolds & Richards, 2009). I will present a brief introduction to this work but we have reviewed it elsewhere (Reynolds & Richards, 2007, 2009).

The basic outlines of this technique are as follows. Recording of electrical activity on the head (EEG, ERP) is made. The cortical source analysis hypothesizes electrical dipoles generating current inside the head as the sources of the EEG (ERP) changes measured on the scalp. The source analysis estimates the location and

amplitude of the dipole. Figure 1.8 (top left MRI slices) shows the dipoles for an ERP component known as the “Nc” that occurs in young infants in response to brief familiar and novel stimuli (from Reynolds & Richards, 2005; also see Reynolds et al., 2006, and Richards, 2003a). The spatial resolution of EEG for localizing brain activity is typically believed to be about 5 cm (Huettel et al., 2004), whereas source analysis with realistic models has spatial resolutions closer to 1 cm (Richards, 2006).

The activity of the dipoles can be estimated over time and in relation to psychological events. Figure 1.8 (bottom figures) shows the activity of these dipoles for stimuli that were novel or familiar, and which elicited attention or did not. The activity of the dipoles distinguishes the type of stimuli (experimental manipulation), the attention state of the infant (psychological process), and the temporal unfolding of the brain activity. Since neural activity is generating the EEG, the temporal resolution of this procedure is on the same time scale as neural activity (1 ms). Our conclusion from this analysis is that we have identified the brain areas that generate the scalp-recorded

[Figure 1.8 plots: prefrontal dipole activity (vertical axis from –10 to 10) over 0 to 1000 ms.]

Figure 1.8 The sequence of MRI slices shows the dipole locations (yellow circles) for an ERP compo-
nent known as the Nc (topographical scalp potential maps on upper right figures). The activity of the
dipoles is shown in the bottom figures as a function of experiment condition (familiar stimuli on left,
novel on right), psychological process (attentive dark line, inattentive solid line), and temporal pattern
(0 to 1000 ms following stimulus onset).
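
Two steps described in the text, time-locking the EEG to events to obtain an ERP and estimating the amplitude of a hypothesized dipole, can be sketched in a few lines. This is a toy illustration under stated assumptions, not the analysis used in the studies cited: it uses one simulated channel for the averaging step, and for the amplitude step it assumes the dipole location is already fixed so that its scalp pattern (the hypothetical `lead_field`) is known. In practice that pattern would come from a head model such as the MRI-derived wireframe, and the location would be estimated as well.

```python
def average_epochs(eeg, onsets, n_samples):
    """Average EEG segments time-locked to event onsets (sample indices)."""
    epochs = [eeg[t:t + n_samples] for t in onsets if t + n_samples <= len(eeg)]
    return [sum(samples) / len(epochs) for samples in zip(*epochs)]


def dipole_amplitude(scalp, lead_field):
    """Least-squares amplitude a minimizing sum((scalp - a * lead_field)**2).

    Assumption: a single dipole with fixed location and orientation, whose
    unit-strength scalp pattern is given by lead_field.
    """
    numerator = sum(g * v for g, v in zip(lead_field, scalp))
    denominator = sum(g * g for g in lead_field)
    return numerator / denominator


# Simulated single-channel EEG: the same response follows each event,
# shifted by a trial-specific offset that cancels across the three trials.
response = [0.0, 1.0, 2.0, 1.0, 0.0]
onsets = [3, 12, 21]
offsets = [0.5, -0.5, 0.0]
eeg = [0.0] * 30
for onset, offset in zip(onsets, offsets):
    for i, r in enumerate(response):
        eeg[onset + i] = r + offset

erp = average_epochs(eeg, onsets, len(response))

# Made-up four-electrode scalp pattern and lead field for the amplitude fit.
lead_field = [1.0, 0.5, -0.5, -1.0]
scalp = [2.0, 1.0, -1.0, -2.0]
amplitude = dipole_amplitude(scalp, lead_field)
print(erp, amplitude)
```

Repeating the amplitude estimate at each time point after stimulus onset would give a dipole activity time course of the kind plotted in the bottom panels of Figure 1.8.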

Nc ERP component. The effects of attention on this brain activity are to enhance the amplitude of the brain activity and make it occur more quickly to novel stimuli. This technique offers a noninvasive tool for the measurement of brain activity, with spatial resolution sufficient to locate anatomical areas in the brain and temporal resolution occurring on the same time frame as activity in neurons (Reynolds & Richards, 2009; Richards, 2006).

I will mention two other techniques that might be useful to measure brain activity in young infants. Both measure the change in blood flow that occurs following neural activity. Neural activity occurring in a localized area of the brain results in changes in brain tissue resulting from the neural activity (e.g., neurotransmitter release, ionic exchange between neuron and surrounding media). These local changes affect arterial capillaries and arteries to effect the transport of oxygen and nutrients to the area. Some of these changes occur over wide areas of the brain whereas some are limited to the area in which the neural activity occurred.

Two techniques have been used in adult participants in cognitive neuroscience to measure these blood flow changes. The most familiar is “functional MRI” (fMRI; Huettel et al., 2004; Thomas & Tseng, 2008). Oxygenated and deoxygenated blood have differing magnetic properties that may be distinguished in MRI recording. When the blood flow (and resulting oxygenation) change occurs immediately after neural activity, the MRI may be used to localize these areas and show their time course. This signal may then be related to the experimental manipulations or cognitive processes, i.e., functional neuroimaging. This procedure has been used extensively in young children and adolescents (Thomas & Tseng, 2008), but has been applied only rarely in infant participants.

The less-familiar technique for measuring activity-dependent blood flow is near-infrared optical spectroscopy (NIRS) or optical topography (OT). An infrared emitter placed on the skull can send an infrared signal that penetrates a few centimeters (2 to 3 cm) into the skull. Infrared light of differing wavelengths is differentially absorbed/reflected by oxygenated and deoxygenated hemoglobin. The reflected light can be measured with a detector placed near the emitter, and the time course of the oxygenated and deoxygenated blood flow can be measured. This procedure is being applied to infant participants routinely (Mehler, Gervain, Endress, & Shukla, 2008).

Both techniques have been applied to infant participants. The recording of fMRI was done in young infants of 2–3 months of age to study speech perception (Dehaene-Lambertz et al., 2002). Infants were presented with speech and backward speech in 20 s blocks, alternating with 20 s periods of silence. MRI sequences were recorded about every 2 s (usually done in 3 mm slices). The fMRI technique examines the blood flow in the brain during the presentation of the sounds, subtracting the blood flow measurements during the periods of silence. Figure 1.9 shows where in the brain these changes occurred. The MRI slices on the top left are from three different levels. Brain activity during speech was larger than during the silent periods in the colored areas on the MRI slices. This occurred over a wide range of areas in the left temporal cortex. The time course of this activity is shown in Figure 1.9 (top left figure). There was a gradual increase in blood flow to these areas that peaked about 6 to 7 s after sound onset, continued during the presentation of the sound, and lasted for nearly 10 s into the silent period.

The NIRS procedure has been used to study functional brain activity in newborns and young infants (Mehler et al., 2008). For example, one study investigated language perception in newborns (Peña et al., 2003). Figure 1.9 (bottom left figure) shows the location on the scalp on which the emitters and detectors were placed. The numbers indicate the area on the scalp under which the blood flow is reflected. The middle figure shows the same locations overlaid on the brain. This study presented newborn infants with forward and backward speech, with 15 s stimulus periods and 25 to 35 s silent periods. The bottom left figure shows the changes in the total blood flow as a function of time for the 12 recording locations. The blood flow changes for

[Figure 1.9 image. Recoverable labels: color scale t(19 d.f.), 0 to 6; time-course axis % signal
change, –2 to 1; NIRS recording locations numbered 1 to 12.]

Figure 1.9 Blood-flow-based neuroimaging techniques in infant participants. The upper figures show
the fMRI activations for activity at three levels of the temporal cortex, and the time course of this
activity is seen in the upper right figure (about 6 scans per 10 s). The bottom left figure shows the posi-
tioning of the NIRS detector (blue) and emitter (red) probes, and the lines and numbers between the
probes are the location of the scalp under which blood flow is measured. The middle figure shows the
putative brain locations being measured, and the right figure shows the time course of total hemoglo-
bin activity for forward speech (red), backward speech (green) and silent periods (blue).
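
The block-design subtraction described for the fMRI study (signal during 20 s sound blocks minus signal during 20 s silent blocks, with a scan about every 2 s) can be illustrated with a minimal sketch. This is not the analysis used by Dehaene-Lambertz et al. (2002); the function names, the synthetic signal values, and the choice to ignore the hemodynamic delay are all assumptions made for illustration.

```python
from statistics import mean

SCAN_INTERVAL_S = 2.0  # "about every 2 s" in the text
BLOCK_S = 20.0         # 20 s sound blocks alternating with 20 s silence


def block_labels(n_scans, scan_interval=SCAN_INTERVAL_S, block=BLOCK_S):
    """Label each scan 'sound' or 'silence', assuming the run starts with sound."""
    labels = []
    for i in range(n_scans):
        onset = i * scan_interval
        in_sound_block = int(onset // block) % 2 == 0
        labels.append("sound" if in_sound_block else "silence")
    return labels


def percent_signal_change(signal, labels):
    """Mean signal during sound minus mean during silence, as % of the silence mean."""
    sound = mean(s for s, lab in zip(signal, labels) if lab == "sound")
    silence = mean(s for s, lab in zip(signal, labels) if lab == "silence")
    return 100.0 * (sound - silence) / silence


# Hypothetical voxel time series: 40 scans (80 s), arbitrary units.
labels = block_labels(40)
signal = [101 if lab == "sound" else 100 for lab in labels]
print(percent_signal_change(signal, labels))
```

Because the measured response peaks several seconds after sound onset and outlasts the sound, a real analysis would shift or model the expected response rather than compare raw blocks as this sketch does.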

the forward speech (red lines) were different in the recordings over the left temporal cortex than
those for backward speech (green) or silent periods (blue). Comparable regions on the right temporal
cortex (not shown) were not different for forward, backward, and silent periods. The onset of the
maximal peak was 10 to 15 s following stimulus onset and the blood flow changes lasted 10–15 s after
sound offset. These results show that infants are sensitive to the properties of speech at birth.
Areas of the brain similar to those in older infants (i.e., Figure 1.9, top MRI figures) respond
differentially to forward and backward speech.

I will comment briefly on the relative advantages of these three techniques (EEG source
localization, fMRI, NIRS) for the measurement of brain function in infant participants. First, the
fMRI technique provides the most direct measure of brain structure and function since the procedure
is directly measuring blood flow in the location of the cortical activity. The EEG source analysis
procedure uses quantitative inferential techniques to estimate such locations. The NIRS is limited
to the analysis of the scalp-recorded optical changes. This restricts its value for localizing brain
activity. The exact type of brain material under the same skull location differs across infants at
the same age and across ages (Figure 1.6). Second, the NIRS and fMRI have a slower temporal
resolution than EEG/source analysis. The underlying measurement phenomenon in the former is the
change in blood flow, which occurs over seconds (6–7 s for fMRI, 10–15 s for NIRS, Figure 1.9) and
continues to respond for several seconds after stimulation. Alternatively, the EEG changes are
caused by neural electrical activity occurring around the time of the synaptic potential changes of
the neuron and are responsive to short-latency changes in this neural activity (e.g., 100–200 ms in
Figure 1.8). Third, the spatial resolutions of the three techniques vary. MRI recording has <1 mm
resolution for structural scans, and fMRI uses 3 mm slices and needs to perform averaging over a
wide area. Motion artifacts may also degrade the spatial resolution of fMRI, especially for infant
participants. The EEG techniques typically have resolution in the 5 cm range, though EEG source
analysis with realistic head models probably lowers this to about 1 cm (10 mm). The NIRS technique
has the poorest spatial resolution, since its measurement technique demands that emitter/detector
distance be about 2–3 cm. It also may have its resolution blurred by larger arterial vascular
changes occurring on the surface of the cortex carrying blood to intracortical capillaries. Fourth,
the three techniques vary in “ease of use.” The NIRS and EEG recordings are noninvasive and can be
done easily on infant participants behaving in relatively unrestrained situations. The fMRI
recording is extremely sensitive to motion artifacts and is best done in sleeping infants. This
restricts its use for the study of a wide range of psychological processes in infant participants.
The NIRS has the fewest quantitative requirements; EEG source analysis and fMRI require extensive
and sophisticated modeling with computer programs. The most used measure of the three for infant
cognitive neuroscience is EEG, and NIRS is beginning to be used. The fMRI technique has rarely been
used. Whereas I prefer the EEG source analysis technique as a temporally relevant and spatially
appropriate method for infant cognitive neuroscience, the three techniques offer complementary
information about infants’ developmental cognitive neuroscience. They provide measurement of brain
activity (and structure) in individual participants, rather than relying on brain measurement in
other participants (i.e., nonhuman animals, postmortem or autopsy studies) or on techniques that at
best only indirectly measure brain activity (marker tasks, indirect psychophysiological recording).

BRAIN AND INFANT ATTENTION: SPATIAL ORIENTING

The previous sections have outlined the proposal that brain development and attention development
were closely related (section “Hypothesis: Infant Attention Development is Controlled by Infant
Brain Development”), and elaborated on methods used to measure brain structure (section “What’s
Inside a Baby’s Head?”) and function (“How to Measure Brain Function in Infants”). The current
section details a type of attention, “covert attention” or “covert orienting,” that has been
studied behaviorally, psychophysiologically, and with the functional brain measurement described in
the previous two sections.

Studies with adult participants have shown that attention may be moved around our environment
flexibly. This is shown by the voluntary movement of the eyes from one location to another, which
requires disengaging fixation (and attention) at one location, moving fixation (attention) to
another location, and engaging attention in the new location. The flexibility of spatial attention
is shown most dramatically in “covert attention” or “covert orienting.” Michael Posner first studied
this type of flexibility with the spatial cueing procedure (Posner, 1980; Posner & Cohen, 1984). In
this procedure, a participant is directed to pay attention to a location in space where a target
will occur. The target identifies some action needed to be done by the participant. The target
occurs in the periphery and target identification is done without moving the eyes to the location,
either during the target or in the time preceding target onset. The participant’s response to the
target is affected by several factors, which show that attention may be moved about in space
covertly. The cueing procedure can use a cue in the same location as the target, in which case the
psychological process is called “covert orienting.” Alternatively, when a cue is in a different
location than the target, or cues are based on simple directions to “pay attention to the right
side,” the resulting psychological processes are called “covert attention.”

Behavioral studies of this type of spatial orienting have been done in infant participants.

The spatial cueing procedure developed by Posner was first adapted by Hood (1995) to study covert
orienting in infant participants. Hood presented 3- to 6-month-old infants with an interesting
(color and movement) pattern on the center of a video monitor. When the infant began to fixate on
this pattern, a stimulus was presented on the right or left side of the center; the center pattern
remained on. Infants at this age will not shift fixation from a center pattern that is engaging
fixation to the peripheral pattern. Thus, any differential response to the side on which the cue was
presented, or the cue on the side, would indicate that the infant was able to covertly orient toward
the peripheral pattern in the absence of overt eye movements. Note that this procedure differs from
the typical Posner-type spatial cueing procedure in which verbal instructions are given to the
participant to keep fixation oriented toward the center of the display.

There are a number of behavioral findings for infants in this spatial cueing procedure. A common
variant is to present the peripheral pattern when the central stimulus is present, then turn both
stimuli off, then present a pattern, functioning as a target, to which an eye movement will be made.
The target can be presented on the same side as the cue (“valid trials”), on the opposite side
(“invalid trials”), not presented (“no-target control”), or can be presented on a trial without the
cue being presented (“neutral”). A number of studies show that in 2- and 3-month-old infants, the
time to move the eyes from the center location to the target is shorter when the cue and target are
on the same side (valid trials) than when no cue was presented or the cue and target were on
opposite sides (Hood, 1993, 1995; Hood & Atkinson, 1992; Johnson & Tucker, 1996; Richards, 2000a,
2000b, 2001, 2004a, 2004b, 2006, 2007). Figure 1.10 shows this finding for 14-, 20-, and 26-week-old
infants (Richards, 2000a). The left-hand figures show the time to move the eyes from the center
location to the target when the stimulus-onset asynchrony (SOA) was 350 ms. This time was

[Figure 1.10 image. Recoverable labels: bars for “Valid cue” and “Invalid cue & no-cue control”;
“Inhibition of return” marked on bars; horizontal axis Age (14, 20, 26 weeks) within SOA (450, 875,
1300); vertical axis tick at 1000.]

Figure 1.10 Reaction time in the spatial cueing procedure for infants at 14, 20, and 26 weeks of age, as
a function of stimulus onset asynchrony (SOA) and cueing type. The short SOA shows faster response
times for the valid cued trials than the invalid and neutral trials for the three testing ages, whereas
the medium and long SOA show inhibition of return only for the 20 and 26 week old infants. From
Richards (2000a).
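
The comparison summarized in Figure 1.10 can be expressed as a difference score: mean reaction time on invalid and neutral trials minus mean reaction time on valid trials. A positive score corresponds to response facilitation and a negative score to inhibition of return. The sketch below is illustrative only; the function names and every reaction time in it are hypothetical, and the labeling uses the sign of the difference rather than the statistical tests a real study would apply.

```python
from statistics import mean


def cueing_effect(rts_ms):
    """Comparison-trial mean RT minus valid-trial mean RT, in ms.

    rts_ms maps 'valid', 'invalid', and 'neutral' to lists of reaction times.
    """
    comparison = mean(rts_ms["invalid"] + rts_ms["neutral"])
    return comparison - mean(rts_ms["valid"])


def label_effect(effect_ms):
    """Name the direction of the cueing effect (sign only, no significance test)."""
    if effect_ms > 0:
        return "facilitation"
    if effect_ms < 0:
        return "inhibition of return"
    return "no difference"


# Hypothetical reaction times (ms) for one infant at a short and a long SOA.
short_soa = {"valid": [400, 420], "invalid": [480, 500], "neutral": [470, 490]}
long_soa = {"valid": [560, 580], "invalid": [480, 500], "neutral": [470, 490]}

for name, rts in (("short SOA", short_soa), ("long SOA", long_soa)):
    effect = cueing_effect(rts)
    print(name, effect, label_effect(effect))
```

Pooling invalid and neutral trials follows the figure, which plots them as a single comparison category.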

shorter on the valid trials for the three testing ages. This facilitation shows that the presence of
the cue registered in the infant’s cognitive system even though fixation continued on the central
stimulus; i.e., covert orienting.

An additional finding comes from the 20- and 26-week-old (4.5 and 6 months) infants. On trials when
the time between the cue and target was relatively brief (e.g., 350 ms in Richards, 2000a), infants
showed the facilitation of the response to valid targets. Alternatively, if the SOA was large enough
(e.g., 700 to 1000 ms), the opposite occurred. The movement of the eye from the center position to
the target took longer on the valid trials than on either the invalid trials or neutral trials with
no cue stimulus. This longer response time may be seen in Figure 1.10 for the middle and right sets
of bars. The 20- and 26-week-old infants showed lengthened reaction times for the valid trials at
the 750 and 1300 ms SOA in this figure. Posner labeled this slowing of the response at intermediate
SOA levels “inhibition of return.” Interestingly, the inhibition of return occurs at late ages only
for “covert orienting.” If the cue is shown and fixation is moved to the cue, and then back to the
center pattern, “overt orienting,” young infants and newborns take longer and are less likely to
move fixation back to the cued location (Butcher, Kalverboer, & Gueze, 1999; Clohessy, Posner,
Rothbart, & Vecera, 1991; Simion, Valenza, Umilta, & Barba, 1995; Valenza, Simion, & Umilta, 1994).

The effects showing covert shifts of attention (covert orienting, covert attention) in adult
participants have been studied with methods to determine the brain bases of these responses. Study
methods have included fMRI, ERP, study of pathological populations, and invasive studies in animal
preparations. The response facilitation that occurs at short SOAs is hypothesized to be due to the
enhancement of sensory processing of information occurring in the attended portion of visual space
(Hillyard, Luck, & Mangun, 1994; Hillyard, Mangun, Woldroff, & Luck, 1995). This has been shown with
ERP studies that find enhanced amplitude of the early components of the ERP elicited by the target
(Hillyard et al., 1994, 1995), and by studies linking these ERP changes with fMRI recording
(Martinez et al., 1999). For example, in response to a target occurring in one part of the visual
space, there is an enhanced ERP component labeled “N1” that occurs on the posterior scalp on the
contralateral side, i.e., where the occipital brain areas for the opposite visual field are. This
enhanced N1 seems to be caused by areas in the extrastriate occipital cortex and the fusiform gyrus
(Martinez et al., 1999). The inhibition of return effect is thought to be mediated by the superior
colliculus (Posner, Rafal, Choate, & Vaughan, 1985; Rafal, 1998; Rafal, Calabresi, Brennan, &
Sciolto, 1989). It is thought that the activation of pathways in the superior colliculus
responsible for fixation shifts, and the inhibition of those pathways during the spatial cueing
procedure, result in inhibition of return.

Researchers studying infants in the spatial cueing procedure have adopted this neurophysiological
perspective (Hood, 1993, 1995; Johnson & Tucker, 1996; Richards, 2000a, 2000b, 2001, 2004b, 2005,
2007b; Richards & Hunter, 2002). The spatial cueing effects have three putative developmental phases
for infants. First, the superior colliculus is relatively mature at birth and should support
inhibition of return. One can find in newborns, using the procedure in which there are overt shifts
of fixations, examples of inhibition of return (Simion et al., 1995; Valenza et al., 1994). Second,
the facilitation of response times at short SOAs must occur in cortical areas supporting visual
processing. Only by 3 or 4.5 months is this area mature enough to support such response
facilitation; thus the emergence of shortened response times to valid targets occurs by about 3
months of age (Richards, 2000a, 2000b, 2001, 2005, 2007b). Finally, the emergence of inhibition of
return following covert attention shifts by 4.5 or 6 months of age must be due to the increasing
influence of cortical systems on fixation in this task. Perhaps these cortical systems inhibit
fixation to the peripheral stimulus during the presentation of the cue, leading to an inhibition of
return of the attention system to the cued area. The changes in covert attention shifts found
between 3 and 6 months of age must therefore be due to cortical changes in areas such as the
parietal cortex and

frontal eye fields involving saccadic planning and attention shifting. This interpretation is
consistent with the general view that there is an increase in the first 6 months of life of cortical
control over eye movements that occur during attention and increasing cortical control over general
processes involved in attention shifting (e.g., Hood, 1995; Richards, 2008; Richards & Hunter, 1998,
2002).

I have studied the areas of the brain involved in covert orienting effects in infants using ERPs and
cortical source analysis. I briefly presented the use of scalp-recorded EEG for the measure of brain
activity (section “How to Measure Brain Activity in Infants”). An EEG recording is made of the
electrical activity occurring on the scalp. The EEG is generated by neural activity occurring in
neural tissue inside the head. The infant is placed in the experimental situation with the spatial
cueing procedure and changes in EEG are measured that are linked in time to the experimental
presentations, i.e., ERP. The ERP thus is a measure of brain activity, recorded on the scalp, which
is synchronized with the experimental manipulations or the psychological processes occurring in the
spatial cueing procedure. The link between the scalp-recorded activity and the experimental
manipulations is therefore a functional neuroscience method.

The studies I have done have tested infants at 14, 20, and 26 weeks of age (e.g., Richards, 2000a,
2000b, 2001, 2005, 2007b). The spatial cueing procedure adapted for infants was used and the ERP was
measured at the beginning of target onset or immediately before saccade onset. Figure 1.11 shows the
ERP changes occurring at target onset for the occipital electrode that was contralateral to the
target (Richards, 2000a). This contralateral occipital electrode is interesting because visual
information from the eye first reaches the cortex in the contralateral occipital cortex, which is
just underneath the scalp near this electrode. A large positive deflection in the ERP occurred about
135 ms following target onset. This potential was the same size for the 14-week-old infants for the
valid and other conditions, slightly larger for the valid condition for the 20-week-old infants, and
largest for the valid condition for the 26-week-old infants. This ERP component occurred about the
same time and has similar morphology to the “P1” ERP component often found in adults. This enhanced
P1 is often found in response to a valid target in adult participants, and has been labeled the “P1
validity effect” (Hillyard et al., 1994, 1995). The study suggests that areas of the brain that
control this response are developing over this age range. Presumably this brain development is
related to the behavioral changes occurring in response facilitation or inhibition of return.

The cortical locations that generate the P1 validity effect were further examined with

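[Figure 1.11 image. Recoverable labels: panels for 14, 20, and 26 weeks; occipital (Oc) electrode;
vertical scale +20 to –20; time scale 500 ms; P1 marked in the 26-week panel.]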

Figure 1.11 The ERP changes occurring at target onset in the occipital electrode contralateral to the
target side. The valid (solid), invalid (dotted), and neutral (dashed) targets produced the same ERP
response in the 14-week-old infants. The P1 ERP component was larger for the valid trials in the 20-
and 26-week-old infants. From Richards (2000a).
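
The ERP measure behind Figure 1.11 rests on two steps described in the text: averaging EEG segments that are time-locked to target onset, and comparing the size of the early positive peak (P1) between valid and other trials. The following is a minimal sketch of those two steps, not the processing pipeline of Richards (2000a). The function names, sampling interval, search window, and all voltage values are hypothetical.

```python
def average_erp(epochs):
    """Average equal-length epochs sample by sample.

    Each epoch is a list of voltages whose first sample is at target onset.
    """
    length = len(epochs[0])
    if any(len(epoch) != length for epoch in epochs):
        raise ValueError("all epochs must have the same number of samples")
    return [sum(epoch[i] for epoch in epochs) / len(epochs) for i in range(length)]


def peak_in_window(erp, sample_ms, start_ms, end_ms):
    """Return (latency_ms, amplitude) of the largest value between start_ms and end_ms."""
    candidates = [
        (value, i * sample_ms)
        for i, value in enumerate(erp)
        if start_ms <= i * sample_ms <= end_ms
    ]
    amplitude, latency = max(candidates)
    return latency, amplitude


def validity_effect(valid_epochs, other_epochs, sample_ms, start_ms, end_ms):
    """P1 peak amplitude on valid trials minus the peak on comparison trials."""
    _, valid_peak = peak_in_window(average_erp(valid_epochs), sample_ms, start_ms, end_ms)
    _, other_peak = peak_in_window(average_erp(other_epochs), sample_ms, start_ms, end_ms)
    return valid_peak - other_peak


# Hypothetical epochs sampled every 50 ms (real recordings sample far more densely).
valid = [[0, 2, 10, 2], [0, 4, 14, 0]]
invalid = [[0, 2, 6, 2], [0, 2, 8, 0]]
print(peak_in_window(average_erp(valid), 50, 50, 150))
print(validity_effect(valid, invalid, 50, 50, 150))
```

Averaging is what makes the larger trial counts discussed later in the chapter matter: activity unrelated to the target tends to cancel across trials, so more trials give a cleaner estimate of the time-locked response.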

cortical source analysis (Richards, 2005, 2007b). The section “How to Measure Brain Activity in
Infants” introduced cortical source analysis. In this analysis, electrical dipoles that can generate
the current resulting in the ERP component may be identified. The dipoles represent the location of
the source of the cortical activity that is related to the experimental manipulations or
psychological processes, i.e., functionally localized brain sources. Activity in these dipoles
changes over time so that cortical activity that generates the temporal characteristics of the ERP
can be shown.

I first will discuss the change over time occurring in several locations. Cortical sources for ERP
recording were found in several areas of the cortex, including the posterior occipital cortex,
extrastriate occipital cortex (including fusiform gyrus), and temporal cortex. Figure 1.12 shows the
activity of these areas over time. A significant difference between the valid and the
invalid/neutral trials is highlighted with the hatched bars. The posterior occipital cortex and the
temporal cortex showed a large negative activity (brain activity resulting in negative scalp
recordings), whereas the extrastriate occipital cortical areas

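[Figure 1.12 image. Recoverable labels: panels for ipsilateral temporal, contralateral temporal,
ipsilateral extrastriate occipital, and contralateral extrastriate occipital sources; vertical scale
10 μV to –10 μV; time scale 300 ms; traces labeled “Neutral” and “Early Neg”.]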




Figure 1.12 The time course of activity for the dipoles located with cortical source analysis. The valid
trials (solid line) produced a significantly larger response in the posterior occipital cortex, extrastriate
occipital (including lateral occipital, lateral-medial occipital, fusiform gyrus), and temporal regions
(hatched areas). The cortical source analysis identifies only “activity” of the sources, and the direction
of the source is determined from the direction of the ERP occurring in these locations.

showed the positive activity. This latter activity was similar in time course and electrical current
direction to the P1 ERP component.

The activity for the contralateral extrastriate occipital areas will be examined further. Figure
1.13 (top left figures) shows a topographical scalp potential map for the ERP activity occurring in
this area. The cortical sources for this brain area may be seen in the figure on the top right.
These sources occurred in middle and superior occipital areas (Brodmann areas 18, 19) and in the
fusiform gyrus. These areas are pathways that lead from the primary visual area to the object
identification areas in the temporal cortex (“ventral processing stream”). Figure 1.13 also shows
these activations in a bar graph separately for 14- and 20-week-old infants, and separately for the
valid, invalid, and neutral conditions. The largest response was for the 20-week-old infants. This
parallels the earlier finding of the gradual increase in the P1 validity effect over this age
(Figure 1.11; Richards, 2000a). This implies that changes occurring in this area of the cortex
underlie the P1 validity effect changes and perhaps some of the behavioral changes occurring in this
task.

An interesting comparison can be made between the findings that areas of the contralateral
extrastriate occipital cortex are the brain sources for the P1 validity effect in infants and a
combined ERP/fMRI study in adults (Martinez et al., 1999). The Martinez study used a spatial cueing
procedure in which participants were instructed to direct fixation to either the right or left side,
and then targets were presented in either the attended or the unattended side. This procedure was
done separately in a psychophysiological session using ERP and in a functional MRI session. They
found the typical P1 validity effects in the ERP and localized the cortical sources to extrastriate
occipital areas. These areas showed enhanced blood oxygen level–dependent (BOLD) activity in the
fMRI experiment when attention was directed to the contralateral visual field. Alternatively, areas
of the primary visual cortex did not show the ERP validity effect.

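[Figure 1.13 image. Recoverable labels: ECD cluster at –26, –84, –7 (middle and superior occipital
gyrus, fusiform gyrus; areas 18, 19); panels “Mean activations” and “Activations for ECD cluster”,
50 to 150 ms post-target; vertical axis μV change; time scale 300 ms; horizontal axis testing age
(14, 20 weeks).]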

Figure 1.13 Topographical scalp potential maps for the contralateral extrastriate occipital areas (top
left) and the cortical sources located in the brain (top right). The activity in this area was largest for the
20-week-olds in the valid cueing condition (bottom left figure).

Figure 1.14 compares the findings from the Martinez et al. (1999) study to the Richards (2005)
results. Figure 1.14 shows the fMRI areas in the Martinez et al. study (upper left corner) and the
cortical sources plotted on MRI slices from the Richards (2005) study. The green and red arrows
point from the fMRI areas where an attention effect was observed to the comparable areas in the
cortical sources of Richards, representing the contralateral extrastriate occipital areas that were
the basis for the infant P1 validity effect. These areas are very similar in both studies. The
yellow arrows point to the areas in the primary visual cortex in the studies of Martinez et al.
(1999) and Richards (2005) where no validity effects occurred.

The Martinez et al. (1999) and the Richards (2005) results are compared further in Figure 1.15. The
average MRI from several infants in the range from 3 to 6 months is shown as the MRI in Figure 1.15.
Superimposed upon the MRI in the cross-hatched area are the middle and superior occipital cortex
(top figures) or lateral-occipital cortex/occipital-fusiform gyrus (bottom figures). These areas
were identified from the relevant anatomical areas derived from the MNI brain (Figure 1.7). The
small yellow circles represent individual source dipoles from the cortical sources found in Richards
(2005), both areas which show a P1 validity effect in their activation. The larger filled circles
are the average Talairach locations for the superior occipital

[Figure 1.14 image. Recoverable labels: coordinates 25, –86, –5 and 0, –92, –4.]

Figure 1.14 Comparison of results from a study of adults with fMRI (Martinez et al., 1999) and infants
with source analysis of ERP (Richards, 2005). The upper left panels show the sources that were active
in the fMRI study. The green lines are areas in the lateral occipital cortex and fusiform gyrus that were
more active when attention was shifted to the contralateral side (fMRI) and the lateral occipital and
fusiform gyrus locations that showed a P1 validity effect (infant ERP). The red arrows point to middle
and superior occipital areas showing attention effects in both studies. The yellow areas are occipital
cortex areas representing the primary visual cortex, showing attention effects in the adult fMRI but
not in the infant ERP.

[Figure 1.15 image. Recoverable panel labels: “Middle occipital from fMRI” and “Ventral fusiform
from fMRI”.]

Figure 1.15 The distribution of cortical sources in Richards (2005) which were the cortical sources
for the P1 validity effect in the ERP (yellow circles), compared with the cortical source locations found
with adult ERP for this effect (large circles). The light purple hatched areas represent the anatomical
locations of these areas translated from the MNI stereotaxic atlas (Figure 1.7) to an average of infant
MRIs from 3 to 6 months of age.

and fusiform gyrus locations identified in the Martinez et al. study as showing the P1 validity
effect in the ERP. Figure 1.15 shows how similar the locations of the cortical sources found in
these two studies are.

There are two implications of the work showing the ERP components accompanying covert orienting in
young infants and their cortical bases. The first implication is that the brain areas involved in
the control of sensory processing and the effects of attention on these brain areas may form the
basis for the changes in attention to peripheral stimuli in young infants. These techniques may be
useful in showing how brain changes in these areas parallel developmental changes in the behavioral
components of these tasks. The use of scalp-recorded ERP and the source analysis allows a
“functional cognitive neuroscience” for infant participants. Second, the brain changes found in the
P1 validity effect (Figures 1.11 and 1.13) parallel those findings of the inhibition of return
rather than those of the response facilitation (Figure 1.10). This suggests that the development of
the inhibition of return may be linked to the enhanced processing of stimuli occurring initially in
the sensory systems. I have argued that cortical areas involved with saccade planning (presaccadic
ERP; frontal eye fields; see Richards, 2000a, 2001, 2005, 2007b) may be more closely related to the
inhibition of return effect. This may occur as attention-based saccade planning and fixation control
come to inhibit the movement of the eyes from the center to the cued location. These areas are
likely in prefrontal cortex rather than posterior regions.

This chapter reviewed the hypothesis that changes in brain areas controlling attention strongly
influence the development of attention in infant participants. A considerable portion of the chapter
examined the methodological advances in imaging showing what is inside the infant’s head and how to
measure brain activity in infant participants. I have focused on my work using cortical source
analysis of ERP in the spatial cueing procedure as an example of how this might be done. The goal of
research in this area is to link measures of infant brain development and measures of attention
development.

There are aspects of this work that require further advances. Greg Reynolds and I (Reynolds &
Richards, 2009) describe in more detail the application of cortical source analysis to infant
participants. One limitation we note is that the cortical source analysis has been based on
parameters for use with adult participants. The forward solution used impedance values for the
matter inside the head (gray matter, white matter, CSF, skull) that are derived from adult
participants. We know these are incorrect: adult skin has higher impedance than infant skin because
of the accumulation of dead skin cells in adults, and infant skulls are less dense and thinner than
adult skulls so that adult skulls have higher impedance than infant skulls. This is being addressed
by taking individual participant MRIs (section “What’s Inside a Baby’s Head?”) from infants and
using source analysis based on that infant’s head topography and infant-based values of the
impedance of head materials (Reynolds & Richards, 2009; Richards, 2006, 2007a, submitted).

A second aspect of this work that requires advancement is the association of specific cortical
changes with specific behavior changes in individual infant participants. The work discussed in the
previous section (section “Brain and Infant Attention: Spatial Orienting”) relied on average change
in the group on the measures of behavior (Figure 1.10), ERP validity effect (Figure 1.11), and
source activation and brain change (Figures 1.12 to 1.15). An individual approach would be to first
identify aspects of brain development in individual participants and relate those aspects to that
infant’s behavioral performance. For example, perhaps extent of myelination (Figure 1.1) of the
occipital areas (Figures 1.13 to 1.15) in an infant would be related to the existence of response
facilitation but not inhibition of return for that infant, and the inhibition of return would be
closely related to the size of the P1 validity effect. Alternatively, myelination in frontal areas
occurring in this age range may be closely related to ERP changes indicating attention-directed
saccade planning and the presence of inhibition of return. Such analyses would show directly the
relation between brain development and attention development. An example of this kind of work is
that of Klingberg (2008) showing in children and adolescents a close relation between myelination
characteristics and cognitive and linguistic status.

Finally, I am working on several improvements to the spatial cueing procedure to make it more
amenable to EEG and ERP analysis. One advance I have made is to create a testing protocol that
results in a large number of presentations. In prior studies (Richards, 2000a, 2000b, 2001, 2005),
the infants were presented with a single presentation of center stimulus, cue, target, and reaction
time, interspersed with intertrial intervals with no stimulus present. This took from 5 to 15 s and
resulted in 20 to 40 trials per participant. This allowed us to obtain numbers of trials sufficient
for ERP analysis, but not optimal for relating individual performance with brain areas active in the
task. Currently, I am using a procedure that presents a variegated background with continuous
presentation of a stimulus that is foveated, cue, target, response, and then continued
presentations. Between 75 and 200 trials can be obtained with this procedure and the infants are
very cooperative. This allows for more manipulations in a single participant, larger numbers of
trials for ERP averages, and examination of the relation between ERP characteristics, brain source
activation, and different behavior patterns on a trial-by-trial basis. I have presented some results
from this procedure (Richards, 2007b) and am continuing with

other studies using this procedure. We also are Butcher, P. R., Kalverboer, A. F., & Gueze, R. H.
using this procedure to test individual par- (1999). Inhibition of return in very young
ticipants who have had anatomical MRIs. This infants: A longitudinal study. Infant Behavior
allows the correlation between the structural and Development, 22, 303–319.
characteristics of the individual infant’s brain Clements, H., Duncan, K. R., Fielding, K.,
Gowland, P.A., Johnson, I. R., & Baker, P. N.
and its performance in the task.
(2000). Infants exposed to MRI in utero have
a normal paediatric assessment at 9 months of
ACKNOWLEDGMENT age. British Journal of Radiology, 73, 190–194.
Clohessy, A. B., Posner, M. I., Rothbart, M. K., &
This research was supported by grants from the Vecera, S. P. (1991). The development of inhi-
National Institute of Child Health and Human bition of return in early infancy. Journal of
Development, R01-HD18942. Cognitive Neuroscience, 3, 345–350.
Conel, J. L. (1939). Postnatal development of
the human cerebral cortex: The cortex of the
REFERENCES newborn (Vol. 1). Cambridge, MA: Harvard
Almli, C. R., Rivkin, M. J., & McKinstry, R. C. University Press.
(2007). The NIH MRI study of normal brain Conel, J. L. (1941). Postnatal development of
development (Objective-2): Newborns, infants, the human cerebral cortex: The cortex of the
toddlers, and preschoolers. Neuromage, 35, one-month infant (Vol. 2). Cambridge, MA:
308–325. Harvard University Press.
Bachevalier, J. (2008). Non-human models of pri- Conel, J. L. (1947). Postnatal development of
mate memory development. In C.A. Nelson the human cerebral cortex: The cortex of the
& M. Luciana (Eds.), Developmental cognitive three-month infant (Vol. 3). Cambridge, MA:
neuroscience (pp. 499–508). Cambridge, MA: Harvard University Press.
MIT Press. Conel, J. L. (1951). Postnatal development of
Baker, P. N., Johnson, I. R., Harvey, p. R., Gowland, the human cerebral cortex: The cortex of the
P. A., & Mansfield, P. (1994). A three-year six-month infant (Vol. 4). Cambridge, MA:
follow-up of children imaged in utero with Harvard University Press.
echo-planar magnetic resonance. American Conel, J. L. (1955). Postnatal development of
Journal of Obstetrics and Gynecology, 170, 32–33. the human cerebral cortex: The cortex of the
Barkovich, A. J. (2005). Pediatric neuroimag- fifteen-month infant (Vol. 5). Cambridge, MA:
ing. Philadelphia, PA: Lippincott Williams & Harvard University Press.
Wilkins. Conel, J. L. (1959). Postnatal development of
Barkovich, A.J., Kjos, B.O., Jackson, D.E., & the human cerebral cortex: The cortex of the
Norman, D. (1988) Normal maturation of the twenty-four-month infant (Vol. 6). Cambridge,
neonatal and infant brain: MR imaging at 1.5 MA: Harvard University Press.
T. Radiology, 166, 173 Philadelphia, PA 180. Conel, J. L. (1963). Postnatal development of the
Battin, M., Maalouf, E.F., Counsell, S., Herlihy, A., human cerebral cortex: The cortex of the forty-
& Hall A. (1998). Physiologic stability of pre- eight-month infant (Vol. 7). Cambridge, MA:
term infants during magnetic resonance imag- Harvard University Press.
ing. Early Human Development, 52, 101–110. Conel, J. L. (1967). Postnatal development of the
Bourgeois, J. P. (1997). Synaptogenesis, heterochrony human cerebral cortex: The cortex of the sev-
and epigenesis in the mammalian neocortex. enty-two-month infant (Vol. 8). Cambridge,
Acta Paediatrica Supplement, 422, 27–33. MA: Harvard University Press.
Bronson, G. W. (1974). The postnatal growth Dehaene-Lambertz, G. (2001). Practical and
of visual capacity. Child Development, 45, ethical aspects of neuroimaging research in
873–890. infants
Bronson, G. W. (1997). The growth of visual capac- php?page=InfantEthics. Accessed 2009.
ity: Evidence from infant scanning patterns. Dehaene-Lambertz, G., Dahaene, S., & Hertz-
In C. Rovee-Collier & L.P. Lipsitt, Advances Pannier, L. (2002). Functional neuroimaging
in infancy research (Vol. 11, pp. 109–141). of speech perception in infants. Science, 298,
Greenwich, CT: Ablex. 2013–2015.

Evans, A. C. (2006). The NIH MRI study of normal brain development. NeuroImage, 30, 184–202.
Evans, A. C., Collins, D. L., & Milner, B. (1992). An MRI-based stereotactic atlas from 250 young normal subjects. Journal of the Society for Neuroscience Abstracts, 18, 408.
Evans, A. C., Collins, D. L., Mills, S. L., Brown, E. D., Kelly, R. L., & Peters, T. M. (1993). 3D statistical neuroanatomical models from 305 MRI volumes. Proceedings of the IEEE-Nuclear Science Symposium and Medical Imaging Conference, 1813–1817.
Fox, M., & Uecker, A. (2005). Talairach daemon client. University of Texas Health Sciences Center, San Antonio, TX. projects/talairachdaemon.html. Accessed 2005.
Gilmore, J. H., Zhai, G., Wilber, K., Smith, J. K., Lin, W., & Gerig, G. (2004). 3 Tesla magnetic resonance imaging of the brain in newborns. Neuroimaging, 132, 81–85.
Hillyard, S. A., Luck, S. J., & Mangun, G. R. (1994). The cueing of attention to visual field locations: Analysis with ERP recordings. In H. J. Heinze, T. F. Munte, & G. R. Mangun (Eds.), Cognitive electrophysiology (pp. 1–25). Boston: Birkhauser.
Hillyard, S. A., Mangun, G. R., Woldorff, M. G., & Luck, S. J. (1995). Neural systems mediating selective attention. In M. S. Gazzaniga (Ed.), Cognitive neurosciences (pp. 665–682). Cambridge, MA: MIT Press.
Hood, B. M. (1993). Inhibition of return produced by covert shifts of visual attention in 6-month-old infants. Infant Behavior and Development, 16, 245–254.
Hood, B. M. (1995). Shifts of visual attention in the human infant: A neuroscientific approach. Advances in Infancy Research, 10, 163–216.
Hood, B. M., & Atkinson, J. (1991). Shifting covert attention in infants. Paper presented at the meeting of the Society for Research in Child Development, Seattle, WA, April, 1990.
Hood, B. M., Atkinson, J., & Braddick, O. J. (1998). Selection-for-action and the development of orienting and visual attention. In J. E. Richards (Ed.), Cognitive neuroscience of attention: A developmental perspective (pp. 219–250). Hillsdale, NJ: Lawrence Erlbaum Press.
Huettel, S. A., Song, A. W., & McCarthy, G. (2004). Functional magnetic resonance imaging. Sunderland, MA: Sinauer Press.
Huizenga, H. M., & Molenaar, P. C. M. (1994). Estimating and testing the sources of evoked potentials in the brain. Multivariate Behavioral Research, 29, 237–262.
Hunter, S. K., & Richards, J. E. (2003). Peripheral stimulus localization by 5- to 14-week-old infants during phases of attention. Infancy, 4, 1–25.
Hunter, S. K., & Richards, J. E. Characteristics of eye movements to a “Sesame Street” movie from 8 to 26 weeks of age. Manuscript submitted for publication.
Huttenlocher, P. R. (1990). Morphometric study of human cerebral cortex development. Neuropsychologia, 28, 517–527.
Huttenlocher, P. R. (1994). Synaptogenesis, synapse elimination, and neural plasticity in human cerebral cortex. In C. A. Nelson (Ed.), Threats to optimal development, the Minnesota Symposia on Child Psychology (Vol. 27, pp. 35–54). Hillsdale, NJ: Lawrence Erlbaum.
Iliescu, B. F., & Dannemiller, J. L. (2008). Brain–behavior relationships in early visual development. In C. A. Nelson & M. Luciana (Eds.), Developmental cognitive neuroscience (pp. 127–146). Cambridge, MA: MIT Press.
Johnson, M. H. (1990). Cortical maturation and the development of visual attention in early infancy. Journal of Cognitive Neuroscience, 2, 81–95.
Johnson, M. H. (1995). The development of visual attention: A cognitive neuroscience perspective. In M. S. Gazzaniga (Ed.), The cognitive neurosciences (pp. 735–747). Cambridge, MA: MIT Press.
Johnson, M. H. (1997). Developmental cognitive neuroscience. London: Blackwell.
Johnson, M. H., de Haan, M., Oliver, A., Smith, W., Hatzakis, H., Tucker, L. A., et al. (2001). Recording and analyzing high-density event-related potentials with infants using the Geodesic Sensor Net. Developmental Neuropsychology, 19, 295–323.
Johnson, M. H., Gilmore, R. O., & Csibra, G. (1998). Toward a computational model of the development of saccade planning. In J. E. Richards (Ed.), Cognitive neuroscience of attention: A developmental perspective (pp. 103–130). Hillsdale, NJ: Lawrence Erlbaum Press.
Johnson, M. H., Posner, M. I., & Rothbart, M. K. (1991). Components of visual orienting in early infancy: Contingency learning, anticipatory looking and disengaging. Journal of Cognitive Neuroscience, 3, 335–344.
Johnson, M. H., Posner, M. I., & Rothbart, M. K. (1994). Facilitation of saccades toward a covertly attended location in early infancy. Psychological Science, 90–93.
Johnson, M. H., & Tucker, L. A. (1996). The development and temporal dynamics of spatial orienting in infants. Journal of Experimental Child Psychology, 63, 171–188.
Kangarlu, A., Burgess, R. E., & Zu, H. (1998). Cognitive, cardiac and physiological studies in ultra high field magnetic resonance imaging. Magnetic Resonance Imaging, 17, 1407–1416.
Kinney, H., Brody, B., Kloman, A., & Gilles, F. (1988). Sequence of central nervous myelination in human infancy: Pattern of myelination in autopsied infants. Journal of Neuropathology and Experimental Neurology, 47, 217–234.
Kinney, H., Karthigasan, J., Borenshteyn, N., Flax, J., & Kirschner, D. (1994). Myelination in the developing human brain: Biochemical correlates. Neurochemistry Research, 19, 983–996.
Klingberg, T. (2008). Development of white matter as a basis for cognitive development during childhood. In C. A. Nelson & M. Luciana (Eds.), Developmental cognitive neuroscience (pp. 237–244). Cambridge, MA: MIT Press.
Kok, R. D., de Vries, M. M., Heerschap, A., & van den Berg, P. P. (2004). Absence of harmful effects of magnetic resonance exposure at 1.5 T in utero during the third trimester of pregnancy: A follow-up study. Magnetic Resonance Imaging, 22, 851–854.
Lancaster, J. L., Summerlin, J. L., Rainey, L., Freitas, C. S., & Fox, P. T. (1997). The Talairach Daemon, a database server for Talairach Atlas Labels. NeuroImage, 5, S633.
Lancaster, J. L., Woldorff, M. G., Parsons, L. M., Liotti, M., Freitas, C. S., Rainey, L., et al. (2000). Automated Talairach Atlas labels for functional brain mapping. Human Brain Mapping, 10, 120–131.
Martinez, A., Anllo-Vento, L., Sereno, M. I., Frank, L. R., Buxton, R. B., Dubowitz, D. J., et al. (1999). Involvement of striate and extrastriate visual cortical areas in spatial attention. Nature Neuroscience, 2, 364–369.
Maurer, D., & Lewis, T. L. (1979). A physiological explanation of infants’ early visual development. Canadian Journal of Psychology, 33, 232–252.
Maurer, D., & Lewis, T. L. (1991). The development of peripheral vision and its physiological underpinnings. In M. J. S. Weiss & P. R. Zelazo (Eds.), Newborn attention: Biological constraints and the influence of experience (pp. 218–255). Norwood, NJ: Ablex.
Maurer, D., & Lewis, T. L. (1998). Overt orienting toward peripheral stimuli: Normal development and underlying mechanisms. In J. E. Richards (Ed.), Cognitive neuroscience of attention: A developmental perspective (pp. 51–102). Hillsdale, NJ: Lawrence Erlbaum.
Mazziotta, J. C., Toga, A. W., Evans, A., Fox, P., & Lancaster, J. (1995). A probabilistic atlas of the human brain: Theory and rationale for its development. NeuroImage, 2, 89–101.
Mehler, J., Gervain, J., Endress, A., & Shukla, M. (2008). Mechanisms of language acquisition: Imaging and behavioral evidence. In C. A. Nelson & M. Luciana (Eds.), Developmental cognitive neuroscience (pp. 325–336). Cambridge, MA: MIT Press.
Michel, C. M., Murray, M. M., Lantz, G., Gonzalez, S., Spinelli, L., & Grave de Peraltz, R. (2004). EEG source imaging. Clinical Neurophysiology, 115, 2195–2222.
Myers, C., Duncan, K. R., Gowland, P. A., Johnson, I. R., & Baker, P. N. (1998). Failure to detect intrauterine growth restriction following in utero exposure to MRI. British Journal of Radiology, 71, 549–551.
Nelson, C. A., & Luciana, M. (2008). Developmental cognitive neuroscience. Cambridge, MA: MIT Press.
NIH (1998). Pediatric study centers (PSC) for a MRI study of normal brain development. NIH RFP NIHNINDS-98–13, sponsored by National Institute of Neurological Disorders and Stroke, National Institute of Mental Health, National Institute of Child Health and Human Development.
Nunez, P. L. (1990). Localization of brain activity with electroencephalography. Advances in Neurology, 54, 39–65.
O’Hare, E. D., & Sowell, E. R. (2008). Imaging developmental changes in grey and white matter in the human brain. In C. A. Nelson & M. Luciana (Eds.), Developmental cognitive neuroscience (pp. 23–38). Cambridge, MA: MIT Press.
Paterson, S. J., Badridze, N., Flax, J. F., Liu, W.-C., & Benasich, A. A. (2004). A method for structural MRI scanning of non-sedated infants. Chicago: International Conference for Infancy Studies.
Peña, M., Maki, A., Kovacic, D., Dehaene-Lambertz, G., Koizumi, H., Bouquet, F., et al. (2003). Sounds and silence: An optical topography study of language recognition at birth. Proceedings of the National Academy of Sciences, 100, 11702–11705.
Posner, M. I. (1980). Orienting of attention. Quarterly Journal of Experimental Psychology, 32, 3–25.
Posner, M. I., & Cohen, Y. (1984). Components of visual orienting. In H. Bouma & D. G. Bouwhis (Eds.), Attention and performance X (pp. 531–556). Hillsdale, NJ: Erlbaum.
Posner, M. I., Rafal, R. D., Choate, L. S., & Vaughan, J. (1985). Inhibition of return: Neural basis and function. Cognitive Neuropsychology, 2, 211–228.
Rafal, R. D. (1998). The neurology of visual orienting: A pathological disintegration of development. In J. E. Richards (Ed.), Cognitive neuroscience of attention: A developmental perspective (pp. 181–218). Hillsdale, NJ: Lawrence Erlbaum.
Rafal, R. D., Calabresi, P. A., Brennan, C. W., & Sciolto, T. K. (1989). Saccade preparation inhibits reorienting to recently attended locations. Journal of Experimental Psychology: Human Perception and Performance, 15, 673–685.
Reynolds, G. D., Courage, M., & Richards, J. E. (2006). Infant visual preferences within the modified-oddball ERP paradigm. Poster presented at the International Conference on Infant Studies, Kyoto, Japan.
Reynolds, G. D., & Richards, J. E. (2005). Familiarization, attention, and recognition memory in infancy: An ERP and cortical source localization study. Developmental Psychology, 41, 598–615.
Reynolds, G. D., & Richards, J. E. (2007). Infant heart rate: A developmental psychophysiological perspective. In L. A. Schmidt & S. J. Segalowitz (Eds.), Developmental psychophysiology (pp. 106–117). Cambridge: Cambridge University Press.
Reynolds, G. D., & Richards, J. E. (2009). Cortical source analysis of infant cognition. Developmental Neuropsychology, 34, 312–329.
Richards, J. E. (2000a). Localizing the development of covert attention in infants using scalp event-related-potentials. Developmental Psychology, 36, 91–108.
Richards, J. E. (2000b). The development of covert attention to peripheral targets and its relation to attention to central visual stimuli. Paper presented at the International Conference for Infancy Studies, Brighton, England, July 2000.
Richards, J. E. (2001). Cortical indices of saccade planning following covert orienting in 20-week-old infants. Infancy, 2, 135–157.
Richards, J. E. (2002). Development of attentional systems. In M. De Haan & M. H. Johnson (Eds.), The cognitive neuroscience of development. East Sussex, UK: Psychology Press.
Richards, J. E. (2003a). Attention affects the recognition of briefly presented visual stimuli in infants: An ERP study. Developmental Science, 6, 312–328.
Richards, J. E. (2003b). Cortical sources of event-related-potentials in the prosaccade and antisaccade task. Psychophysiology, 40, 878–894.
Richards, J. E. (2004a). Recovering cortical dipole sources from scalp-recorded event-related-potentials using component analysis: Principal component analysis and independent component analysis. International Journal of Psychophysiology, 54, 201–220.
Richards, J. E. (2004b). Development of covert orienting in young infants. In L. Itti, G. Rees, & J. Tsotsos (Eds.), Neurobiology of attention (Chap. 14, pp. 82–88). London: Academic Press/Elsevier.
Richards, J. E. (2004c). The development of sustained attention in infants. In M. I. Posner (Ed.), Cognitive neuroscience of attention (Chap. 25, pp. 342–356). Guilford Press.
Richards, J. E. (2005). Localizing cortical sources of event-related potentials in infants’ covert orienting. Developmental Science, 8, 255–278.
Richards, J. E. (2006). Realistic cortical source models of ERP. Unpublished manuscript. RealisticSourceModels.pdf. Accessed 2009.
Richards, J. E. (2007a). Realistic head models for cortical source analysis in infant participants. Society for Research in Child Development, Boston.
Richards, J. E. (2007b). Infant sustained attention affects brain areas controlling covert orienting. Society for Research in Child Development, Boston.
Richards, J. E. (2008). Attention in young infants: A developmental psychophysiological perspective. In C. A. Nelson & M. Luciana (Eds.), Developmental cognitive neuroscience (pp. 479–497). Cambridge, MA: MIT Press.
Richards, J. E. (2009). Cortical sources of ERP in the prosaccade and antisaccade task using realistic source models based on individual MRIs. Manuscript submitted for publication.
Richards, J. E., & Casey, B. J. (1992). Development of sustained visual attention in the human infant. In B. A. Campbell, H. Hayne, & R. Richardson (Eds.), Attention and information processing in infants and adults: Perspectives from human and animal research (pp. 30–60). Hillsdale: Lawrence Erlbaum Associates.
Richards, J. E., & Holley, F. B. (1999). Infant attention and the development of smooth pursuit tracking. Developmental Psychology, 35, 856–867.
Richards, J. E., & Hunter, S. K. (1997). Peripheral stimulus localization by infants with eye and head movements during visual attention. Vision Research, 37, 3021–3035.
Richards, J. E., & Hunter, S. K. (1998). Attention and eye movement in young infants: Neural control and development. In J. E. Richards (Ed.), Cognitive neuroscience of attention: A developmental perspective (pp. 131–162). Mahwah, NJ: Erlbaum.
Richards, J. E., & Hunter, S. K. (2002). Testing neural models of the development of infant visual attention. Developmental Psychobiology, 40, 226–236.
Rivkin, M. J. (1998). Developmental neuroimaging of children using magnetic resonance techniques. Mental Retardation and Developmental Disabilities Research Reviews, 6, 68–80.
Sajja, B. R., & Narayana, P. A. (2008). Magnetic resonance spectroscopy of developing brain. In C. A. Nelson & M. Luciana (Eds.), Developmental cognitive neuroscience (pp. 337–350). Cambridge, MA: MIT Press.
Sampaio, R. C., & Truwit, C. L. (2001). Myelination in the developing human brain. In C. A. Nelson & M. Luciana (Eds.), Developmental cognitive neuroscience (pp. 35–44). Cambridge, MA: MIT Press.
Schenck, J. F. (2000). Safety of strong, static magnetic fields. Journal of Magnetic Resonance Imaging, 12, 2–19.
Scherg, M. (1990). Fundamentals of dipole source potential analysis. In F. Grandori, M. Hoke, & G. L. Romani (Eds.), Auditory evoked magnetic fields and potentials (pp. 40–69). Basel: Karger.
Scherg, M. (1992). Functional imaging and localization of electromagnetic brain activity. Brain Topography, 5, 103–111.
Scherg, M., & Picton, T. W. (1991). Separation and identification of event-related potential components by brain electrical source analysis. In Brunia, C. H. M., Mulder, G., & Verbaten, M. N. (Eds.), Event-related brain research (pp. 24–37). Amsterdam: Elsevier Science.
Shankle, W. R., Romney, A. K., Landing, B. H., & Hara, J. (1998). Developmental patterns in the cytoarchitecture of the human cerebral cortex from birth to 6 years examined by correspondence analysis. Proceedings of the National Academy of Sciences, 95, 4023–4028.
Simion, F., Valenza, E., Umilta, C., & Barba, B. D. (1995). Inhibition of return in newborns is temporo-nasal asymmetrical. Infant Behavior and Development, 18, 189–194.
Stokowski, L. A. (2005). Ensuring safety for infants undergoing magnetic resonance imaging. Advances in Neonatal Care, 5, 14–27.
Sury, J., Harker, H., Begent, J., & Chong, W. K. (2005). The management of infants and children for painless imaging. Clinical Radiology, 60, 731–741.
Swick, D., Kutas, M., & Neville, H. J. (1994). Localizing the neural generators of event-related brain potentials. In A. Kertesz (Ed.), Localization and neuroimaging in neuropsychology. Foundations of neuropsychology (pp. 73–121). San Diego: Academic Press.
Taber, K. H., Hayman, L. A., Northrup, S. R., & Maturi, L. (1998). Vital sign changes during infant magnetic resonance examinations. Journal of Magnetic Resonance Imaging, 8, 1252–1256.
Talairach, J., & Tournoux, P. (1988). Co-planar stereotaxic atlas of the human brain. New York: Thieme Medical Publishers.
Thomas, K. M., & Tseng, A. (2008). Functional MRI methods in developmental cognitive neuroscience. In C. A. Nelson & M. Luciana (Eds.), Developmental cognitive neuroscience (pp. 311–324). Cambridge, MA: MIT Press.
Tucker, D. M. (1993). Spatial sampling of head electrical fields: The geodesic sensor net. Electroencephalography and Clinical Neurophysiology, 87, 154–163.
Tucker, D. M., Liotti, M., Potts, G. F., Russell, G. S., & Posner, M. I. (1994). Spatiotemporal analysis of brain electrical fields. Human Brain Mapping, 1, 134–152.
U.S. Food and Drug Administration (2003). Criteria for significant risk investigations of magnetic resonance diagnostic devices. http:// w w Accessed 2009.
U.S. Food and Drug Administration (2006). Information Sheet Guidance For IRBs, Clinical Investigators, and Sponsors Significant Risk and Nonsignificant Risk Medical Device Studies. isk.pdf. Accessed 2009.
Valenza, E., Simion, F., & Umilta, C. (1994). Inhibition of return in newborn infants. Infant Behavior and Development, 17, 293–302.
Wozniak, J. R., Mueller, B. A., & Lim, K. O. (2009). Diffusion tensor imaging. In C. A. Nelson & M. Luciana (Eds.), Developmental cognitive neuroscience (pp. 301–310). Cambridge, MA: MIT Press.
Yakolev, P. I., & Lecours, A. R. (1967). The myelogenetic cycles of regional maturation of the brain. In A. Mankowski (Ed.), Regional development of the brain in early life (pp. 3–69). Philadelphia: Davis.

All Together Now: Learning through Multiple Sources

Natasha Kirkham

The exciting promise of developmental psychology is that it can produce coherent explanations of development instead of merely describing behavior. It is, therefore, our business as developmental psychologists to find mechanistic explanations, which can be traced back to their origins, as opposed to delineating behavior at only one time slice. We must account for how infants and children build complex representations and acquire sophisticated knowledge, and then how they go on to use that knowledge to define and modulate their behavior. And we must show how all this occurs in real time. This is a daunting task, which requires analyzing the cognitive system from the bottom up, as learning occurs over time.

Such a study of learning is, in part, a study of perception, since the building blocks of complex knowledge must presumably be extracted from the world. Fortunately for the infant, the world is rich with perceptual redundancies that help guide attention (e.g., Bahrick & Lickliter, 2000; Lewkowicz, 2000). The same information, such as the location of an event or the number of an event’s occurrences, can be conveyed simultaneously in visual, auditory, and tactile cues: We hear the voice of our partner coming from the direction of our front door, we see his face come through the doorway, and we reach out to greet him. The ability to integrate perceptual information across different modalities (as well as within them) is a key element in learning about events and objects.

In this chapter, I shall discuss the importance and the usefulness of such cross-modal integration in learning about basic structure, suggesting that it provides the foundations for more complex knowledge, and outline a theory of multiple cue integration. Data from a variety of paradigms will show the effect that richer, more numerous, and cross-modal cues have on infants’ ability to learn visual sequences and events. As the infant develops, her tolerance for coherence between these cues changes, as she is able to acquire more sophisticated representations for a narrower set of cues. It may be that rich scenery, with the many cues available that highlight environmental structure, is processed more efficiently by infants than a scene containing sparser features. If this is so, then emergent sensitivities to probabilistic spatiotemporal information might require more contextual support in order to glean structure from the environment.

In order for humans to effectively use probabilistic information (i.e., to draw conclusions about the likelihood of potential events), they must be sensitive to the relative frequencies of various events. Indeed, there is much evidence to suggest that humans are very sensitive to frequencies of occurrence (and co-occurrence; Jonides & Jones, 1992; Jonides & Naveh-Benjamin, 1987). Some examples include the ability to accurately judge the relative frequencies of encounters with other people (e.g., Saegert, Swap, & Zajonc, 1972), the ability to judge the frequency of lethal events (e.g., Lichenstein, Slovic, Fischoff, Layman, & Combs, 1978), and in the language domain, judgments of the frequency of words (e.g., Shapiro, 1969), syllables (e.g., Rubin, 1974), letters (e.g., Attneave, 1953), and position of letters (e.g., Sedmeier, Hertwig, & Gigerenzer, 1998). Zajonc (1968) showed that this sensitivity to frequency can implicitly bias judgments of liking, the “mere exposure” effect that has been used widely by experimental psychologists across numerous fields.

Infants are sensitive not just to the frequency of events, but also to their probabilistic structure. As I shall discuss below, it has become evident that infants and young children have access to a powerful domain-general learning device. In laboratory experiments they can quickly learn statistically defined (or probabilistic) patterns in both auditory and visual domains (Fiser & Aslin, 2003; Kirkham, Slemmer, & Johnson, 2002; Saffran, Aslin, & Newport, 1996). Young children, as well, are quickly capable of understanding probabilistic information implicitly (Meulemans, Van der Linden, & Perruchet, 1998) or with explicit feedback (Kirkham & Shohamy, 2007). I posit that this learning device, coupled with a rich and accessible environment, provides the building blocks for mature knowledge.

What happens when there are multiple probabilistic information sources available? From basic vision to language acquisition, multiple perceptual cues have been cited as providing important information to the system. How these cues are weighted, and what parameters determine differential weighting, has been of great interest in the perception literature. Weighted sum models of perception have been proffered as explanations for how adults perform many different tasks, including detecting an auditory signal with two frequency components that activate different auditory channels (e.g., Green, 1958), combining redundant stimulus properties in complex figures (e.g., Kinchla, 1977), combining multiple depth cues (e.g., Landy, Maloney, Johnston, & Young, 1995), and combining information across different senses (e.g., Ernst, Banks, & Bülthoff, 2000). Research on depth perception has shown the benefit of differential visual cues such as binocular disparity (e.g., Mayhew & Frisby, 1980), texture gradients (e.g., Todd & Akerstrom, 1987), and motion (e.g., Sperling, Landy, Dosher, & Perkins, 1989). Object motion, binocular disparity, occlusion and texture gradients, to name a few, are integrated to produce coherent visual percepts of objects, surface layout, and scenes (e.g., Bruno & Cutting, 1988; Mayhew & Frisby, 1980; Sperling et al., 1989; Todd & Akerstrom, 1987). In the adult vision literature, theories of multiple cue integration, or weighted sum models, have been offered as explanations for how observers combine stimulus property cues (e.g., Kinchla, 1977; Landy et al., 1995; Massaro, 1999).

In the cognitive domain too, multiple cues are often integrated in complex decision making, such as medical diagnosis (Brehmer & Qvarnstrom, 1976). A doctor uses all the pieces of information available to her, not just one. The information will be weighted differentially depending on the context and the nature of the information, but each bit will be used. This instance is an example of a complex cognitive problem, but multiple cue use is evident in the simplest, most mundane situations. When you decide to go to a restaurant for dinner, for example, there are a great many bits of information that will help produce a decision of the specific restaurant: What kind of food do you want to eat? Who else is going, and what food do they want to eat? Who is paying? Is either the time spent at the restaurant or the distance from it an issue for anyone attending the dinner? No one bit will be the conclusive bit, but a combination of all of them will get you to the correct decision. And these cue weightings are done in a matter of minutes, or even seconds.

So, given that multiple sources of information are available to solve problems in both perception and cognition, and that these sources tend to be probabilistic in nature, reliance on multiple cues should therefore produce greater success than reliance on any one individual cue. Furthermore, since many problem domains are characterized by probabilistic solutions,
sensitivity to variables such as frequency and correlation should be a domain-general ability, and perhaps a fundamental characteristic of the developing mind. In order for infants to make use of this information available in the environment, they must be sensitive to probabilistic patterns. This sensitivity then creates the foundation upon which higher-order knowledge is based.

Infants show great sensitivity to multimodal regularities. Similar to adults, infants use multiple cues to organize their visual perception: For example, when making decisions about an occluded object, infants may exploit such cues as edge alignment, synchronous motion, and depth to support perception of unity (Johnson, 1997; Smith, Johnson, & Spelke, 2003). In addition, investigations of language acquisition have led Christiansen and colleagues (Christiansen, Allen, & Seidenberg, 1998; Christiansen & Dale, 2001; Christiansen & Monaghan, 2006; see also the chapter in this volume by Christiansen, Dale, & Reali) to put forward a multiple cue hypothesis. They propose that the mechanism underlying language acquisition has the capacity to extract and store various statistical properties of language, and integrate different sources of information. They go on to suggest that it might be the conjunctions of these cues that provide evidence about aspects of linguistic structure that is not available from any single source of information.

A great deal of our perceptual understanding (as well as a great many of our responses to the environment) requires correctly correlating events across modalities. Gibson (1969) suggested that the responsiveness to invariant intersensory relations is a necessary part of the development of perception and learning. Indeed, during the first 6 months of life, infants develop many intersensory capacities, which allow them to perceive correlations across modalities (Lewkowicz, 2000; see also Richardson & Kirkham, 2004). Newborns bind a visual stimulus with an auditory stimulus to the extent that they expect the sound to move with the associated object (Morrongiello, Fenwick, & Chance, 1998). Lewkowicz and Turkewitz (1980) showed that very young infants (3 weeks of age) bind sound and vision together by attending to the intensity of the stimuli. By 4 months of age, infants can not only perceive the bimodal nature of objects (Spelke, 1979, 1981), but they can also perceive speech bimodally (Kuhl and Meltzoff, 1982; Rosenblum, Schmuckler, & Johnson, 1997). Infants of 5 months, when habituated to a bimodal presentation of rhythm (e.g., an audiovisual movie of a hammer tapping out a rhythm), dishabituated to a unimodal presentation of a novel rhythm (e.g., just the visual of a hammer tapping, without the sound; Bahrick & Lickliter, 2000). By 5–7 months, infants can match faces with voices based on the age, gender, and affective expression of the speaker (Bahrick, Netto, & Hernandez-Reif, 1998).

It seems clear from the evidence presented in the earlier section that infant learning in the natural environment would exploit multiple cues across different modalities. Certainly, infants’ sensitivity to cross-modal information stands in contrast to the sparse, unimodal presentations of many laboratory experiments. If laboratory studies do not fully exploit the cross-modal sensitivity of infants, then perhaps they risk underestimating the full capacity of their learning abilities. For example, Bahrick and Lickliter and colleagues have presented beautiful evidence that “intersensory redundancy,” the overlap of information provided by amodal stimuli, drives selective attention (e.g., Bahrick & Lickliter, 2000; Bahrick, Lickliter, & Flom, 2004). Thus, it seems to follow nicely that intersensory redundancy should also drive basic perceptual learning. Broadly speaking, this theory predicts that there are three factors that will affect an infant’s ability to learn a particular sequence or set of events: (1) the availability of multiple cues, (2) the coherence of the cues presented, and (3) the age of the infant. Thus, learning of basic visuospatial patterns or events should be facilitated if the same information is present in multiple

cues within or across modalities. For example, infants will be better able to learn the statistical structure of a sequence of events when that pattern is conveyed by the shape, color, position, and sound of the events, rather than one of those cues alone. Cues can interfere with learning when they do not coherently co-occur. So if particular sounds are associated with particular locations, but the visual features that appear in that location are random or constant, learning will suffer. At younger ages, infants will require a higher number of cues and a higher degree of coherence between them. At later ages, infants will be able to learn statistical structure from a smaller set of cues and be able to tolerate a degree of incoherence between multiple cues.

Both adults and infants are sensitive to the statistical structure of perceptual events. Adults are extremely competent at exploiting complex spatiotemporal sequences in order to guide behavior (Chun & Jiang, 1998; Howard, Mutter, & Howard, 1993). In serial reaction time studies, for example, adult observers view a single repetitive stimulus presented sequentially at different locations and respond to each position by pressing a corresponding key (e.g., Nissen & Bullemer, 1987). Stimulus locations follow a particular spatial and temporal pattern that a participant may be unable to describe explicitly, yet reaction times typically decrease reliably across trials (Cohen, Ivry, & Keele, 1990; Curran & Keele, 1993; Nissen & Bullemer, 1987). There is evidence that such learning is independent of the specific motor response: Observation of a sequential pattern can also lead to knowledge of serial order (Howard et al., 1993), and there appears to be no special benefit to learning imparted by manual responses, relative to oculomotor responses (Heyes & Foster, 2002). Perception of scenes can be guided by statistical information (Chun, 2000). Fiser and Aslin (2001, 2002) presented adults with probabilistically structured sequences of single shapes and shape arrays and found that observers were sensitive to the statistical correlations among multipart objects presented simultaneously, as well as to the joint and conditional probabilities of successive shape pairs.

Are infants also sensitive to the statistical structure of events? Research concerned with the development of sequence learning has revealed a capacity to pick up temporal patterns under many conditions. Saffran, Aslin, and colleagues found that 8-month-old infants parse a stream of auditory stimuli based solely on the transitional probabilities within and between the syllables (Aslin, Saffran, & Newport, 1998; Saffran et al., 1996; see also Gomez’s and Saffran’s chapters in this book for more discussion of infant auditory sequence learning). Gomez and Gerken (1999) exposed 12-month-olds to a subset of strings produced by one of two artificial grammars and then tested the infants on their ability to discriminate new strings from both the familiar and the unfamiliar grammar. Infants preferred to listen to new strings from their training set relative to strings from the novel grammar. These grammars differed only in terms of the ordering of word pairs: Individual words in the two sets, and the starting and ending words, were always the same. The only cues to provide recognition, therefore, were contained in word order, implying that the infants encoded the temporal patterns of word co-occurrences. Infants’ ability to extract regularities in sequential input does not seem to be a language-specific mechanism, but exists broadly across audition. Infants parse auditory streams based on statistical probabilities even when the stimuli are tones (Saffran, Johnson, Aslin, & Newport, 1999), and at least one species of nonhuman primates, cotton-top tamarins (which never develop humanlike language skills), learns statistically structured sounds (Hauser, Newport, & Aslin, 2001).

There is evidence from other paradigms, however, that infants show some sensitivity to visual spatial relations among repetitive events under certain conditions. For example, young infants learn simple (two-location), predictable spatial sequences in the visual expectation paradigm, which uses oculomotor anticipation as the index of learning (Haith, 1993). Infants also show sensitivity to spatial contingency in temporal sequences: Wentworth, Haith, and Hood

(2002) presented 3-month-old infants with a spatiotemporal sequence in which a stimulus appeared on the left, in the center, or on the right of a computer monitor. Infants viewed either a fixed or a random pattern of locations, and in some cases, there was a contingent relation between the identity of the central stimulus and the location of the next peripheral picture. The fixed sequence of three locations resulted in more eye movement anticipations, and there were more anticipatory saccades to the correct location when there was a contingent relation between central and peripheral events. In displays of greater complexity than the simple two- and three-location events described previously, infants are still responsive to statistical structure. Upon exposure to a sequence of static multielement scenes, for example, 9-month-olds appeared to acquire the underlying statistical structure of the scene layout, attending longer following habituation to isolated element pairs that had co-occurred with a higher frequency within the familiar scenes (Fiser & Aslin, 2003).

There is clear and compelling evidence that infants are sensitive to patterns of events. Much of the work discussed has focused on the types of abstract statistical structure that infants are capable (and not capable) of learning at specific ages. This leaves open the question of the effect of the range and type of stimuli that infants are exposed to in such learning paradigms. For example, in earlier work, my colleagues and I tested the hypothesis that infants’ ability to learn the statistical structure of event sequences was not limited to the auditory domain.

We examined visual statistical learning in infants using a habituation/dishabituation technique (Kirkham et al., 2002). Two-, 5-, and 8-month-olds were familiarized with a series of six discrete colored shapes that loomed from the center of a display monitor. Presentation order was defined in part by statistical regularities: The shapes were organized into pairs, and the pairs were ordered randomly (see Figure 2.1). That is, the first shape in a pair reliably predicted the second, but the next shape to appear could be any of the first members of a pair. Following habituation, infants viewed six test displays alternating between familiar sequences, composed of the same three pairs of shapes, and novel sequences, produced by randomly ordering the same shapes. The only difference between familiar and novel sequences was the transitional probabilities between the shapes. We hypothesized that visual statistical learning would be evinced by a consistent preference for the random posthabituation sequence (i.e., a novelty preference). Our prediction was supported: Infants in all three age-groups exhibited a reliable preference for the random (novel) sequence. Indeed, there were no statistically significant age differences in the strength of novelty preferences, indicating that statistical learning under these conditions is available even to very young infants.

Figure 2.1. Schematic of Kirkham et al. (2002): An example of the visual sequence shown to the infants (NB. In the actual experiment, the shapes were different colors and not black-and-white). [Transitions in the figure are labeled p = 1.0 and p = .33 along a time axis.]

In the Kirkham et al. (2002) study, sequence information was contained in two cues, shape and color. Although the information conveyed by the cues was completely redundant, a multiple cue integration theory would suggest that such a redundancy supports learning. In the first test of a multiple cue integration theory, this prediction was supported by several experiments in which infants of ages 5 and 8 months failed to learn the same sequences when they consisted of either monochrome looming shapes or different colors of the same shape (Kirkham & Wagner, in preparation).
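
The statistical structure of the habituation sequence in Kirkham et al. (2002) can be made concrete with a short simulation. The following is a minimal sketch, not the stimulus code used in the study: it assumes three pairs chosen at random with replacement, and the shape labels, sequence length, random seed, and function names are all illustrative. It builds a sequence of randomly ordered pairs and estimates the transitional probability of each adjacent transition from simple counts.

```python
import random
from collections import Counter

# Hypothetical labels for the six shapes, grouped into three fixed pairs.
PAIRS = [("A1", "A2"), ("B1", "B2"), ("C1", "C2")]


def make_pair_sequence(pairs, n_pairs, rng):
    """Concatenate n_pairs randomly chosen pairs into one flat sequence."""
    sequence = []
    for _ in range(n_pairs):
        sequence.extend(rng.choice(pairs))
    return sequence


def transitional_probabilities(sequence):
    """Estimate P(next item | current item) from adjacent-item counts."""
    transition_counts = Counter(zip(sequence, sequence[1:]))
    # Count each item only where it has a successor (all but the last item).
    current_counts = Counter(sequence[:-1])
    return {
        (current, nxt): count / current_counts[current]
        for (current, nxt), count in transition_counts.items()
    }


rng = random.Random(0)  # fixed seed (an arbitrary choice) for repeatability
sequence = make_pair_sequence(PAIRS, 3000, rng)
tp = transitional_probabilities(sequence)

# Within a pair, the first shape always predicts the second.
within = tp[("A1", "A2")]

# Across a pair boundary, any first member may follow, so each
# estimate should come out close to 1/3.
between = {first: tp.get(("A2", first), 0.0) for first in ("A1", "B1", "C1")}

print("within-pair:", within)
print("between-pair:", between)
```

A sequence produced by randomly ordering the same six shapes would contain the same items at the same overall frequencies but would flatten these transitional probabilities, which is the only difference between the familiar and novel test sequences described above.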



The world will only seem coherent if one can process an object’s spatial location and understand what its present location might predict about future events. Acquisition of this type of knowledge is essential for motion perception and for the production of action sequences, because one has to learn not only which actions are appropriate, but also where and when they should be performed. In recognition of the importance of location information, Kirkham, Slemmer, Richardson, and Johnson (2007) adapted the Kirkham et al. (2002) visual statistical learning paradigm to examine spatiotemporal statistical learning. Multiple cue integration theory was tested by isolating location from visual feature cues and examining learning with the support of different cue combinations across the first year of life. Could infants extract purely spatiotemporal correlations, and if so, at what age and under what conditions?

Infants were familiarized to stimuli appearing in one of six different locations on a grid and then were shown the familiar spatial pattern alternating with a novel spatial sequence. These spatial patterns mirrored the structure of the randomized shape pairs used by Kirkham et al. (2002). In Experiment 1, a red circle appeared in a statistically defined spatial pattern, and 11-month-olds, but not 8-month-olds, exhibited significantly greater interest in the novel spatial sequence (Figure 2.2A). In Experiment 2, six different color/shape stimuli were presented (Figure 2.2B). There was a statistical structure to both the features and the locations of the stimuli, but crucially, only the spatial sequence was violated during the test phase. Eight-month-olds, but not 5-month-olds, showed a novelty preference for an altered spatial sequence. Although they provided only redundant information, these multiple visual cues allowed 8-month-olds to learn the spatiotemporal sequence as the 11-month-olds had. Experiment 3 suggested that perhaps 5-month-olds were simply not able to encode spatiotemporal information sufficiently, since they showed renewed interest in the novel sequence when the sequence of locations but not the sequence of color/shape pairings was held constant in the test trials.

Figure 2.2. (A) Experiment 1 in Kirkham et al. (2007): An example of the location statistics. (B) Experiment 2 in Kirkham et al. (2007): An example of the location statistics. [Grid locations in the figure are labeled A1, A2, B1, B2, C1, and C2.]

There is an alternative explanation for 8-month-olds’ performance in Experiment 2. It is possible that the infants were not sensitive to a probabilistic pattern of location changes: They merely picked up on individual shape–location pairings and showed a novelty preference when those pairings were violated. To rule out this explanation, Experiment 4 used an “on-line learning” paradigm, in which 8-month-olds’ eye movements were recorded as they watched the habituation sequence employed in Experiment 2. Saccadic latencies to the newly appearing shapes were recorded as a measure of sequence learning. After exposure to the sequence, latencies to the first shape in a pair were longer on average than latencies to the second shape in a pair. This is because the second shape in a pair

could be predicted once the first had appeared. If, as the alternative explanation holds, the 8-month-olds in Experiment 2 only learnt associations between a particular shape and a particular location, they would have no information about how one location can predict the next and should not show any saccade latency decrease in this experiment. In contrast, the decrease in saccade latencies that was found revealed the role multiple cue integration can play in generating expectations about the world.

These four experiments present evidence concerning a fundamental cognitive skill in infancy: the ability to learn probabilistic event sequences across space and time. Evidence was also obtained of an important developmental limitation in learning: Only the oldest infants observed (11-month-olds) responded solely on the basis of location statistics, showing a posthabituation novelty preference to a display in which the positions of stimulus elements were randomly placed. The youngest infants (5-month-olds), in contrast, appeared largely insensitive to location statistics, although they were able to detect probabilistic sequences based on a combination of color and shape. Infants at an intermediate age (8 months) provided evidence of learning location statistics only when color and shape contributed additional (redundant) cues for the spatiotemporal sequence. This suggests that temporal order statistics that involve spatial relations may become available to infants over the course of the first year after birth and that integrating multiple cues can bolster such learning.

SPATIAL INDEXING: COLOR, SHAPE, LOCATION, AND SOUND

At a certain stage of their development, infants can learn spatiotemporal sequences by exploiting redundant information of multiple cues. The cues used in Kirkham et al. (2007), however, were all visual. Infants do not live in a silent world, as a trip to any toy store will show. Do the sounds of particular objects act as redundant, cross-modal cues that can help infants to learn?

Dynamic spatial indexing is defined as the ability to encode the locations of multimodal events, track the locations as they move, and later, look back to the correct location when a particular event is relevant (Richardson & Kirkham, 2004). This behavior has been shown in adult subjects. For example, in Experiment 1 of Richardson and Kirkham (2004), adult participants looked at a spinning cross that appeared in two square ports on a computer screen. While the cross spun, adults heard a piece of factual information. After two facts, the ports moved around the screen. Participants then answered a question relating to one of the facts. While answering, they looked at the empty port that had previously been associated with the fact (see also Richardson & Spivey, 2000). We hypothesized that a propensity for dynamic spatial indexing is not just a feature of the mature adult visual system, but emerges by 6 months along with some of the first uses of adult-like spatial reference frames. At that age, infants are still learning to orientate their attention properly (Colombo, 2001) and are only beginning to represent spatial locations egocentrically (Gilmore & Johnson, 1997).

Consistent Visual Cues

In Experiment 2 (Richardson & Kirkham, 2004), infants saw movies of two brightly colored toys that moved in time to two different sounds (see Figure 2.3). The toys appeared in square ports on a computer screen. Test phases consisted of the two empty ports and the auditory element of one of the movies. We found that infants looked longer at the empty port that had previously been associated with the toy, even when the ports had moved round the screen in between the presentation and test phases (Experiment 3). We argued that the ability to spatially index under these challenging circumstances was supported by the rich, multimodal nature of the events. So, how does the relative coherence of these three types of cues (visual features, location, and sound) affect dynamic spatial indexing?

Uninformative Visual Cues

The multiple cue integration theory predicts that even though the visual features

provide information that is redundant for spatial indexing, their presence will support the infants’ behavior. In contrast, it has been argued that infants below 9 months of age have an “object concept” governed by spatial–temporal continuity and that changes in visual features can go unnoticed (e.g., Xu & Carey, 1996). This account predicts that visual features would be irrelevant to dynamic spatial indexing.

Figure 2.3. An example of what was presented to infants in Experiment 2 of Richardson and Kirkham (2004) with crosshair showing fixation. Note: In Experiment 3, the location of the ports during the familiarization phase was at the top and bottom of the screen; during test the empty ports translated to the horizontal positions.

Kirkham, Richardson, Johnson, and Wu (in preparation) tested these predictions in a series of experiments that modified Richardson and Kirkham’s (2004) paradigm (see Figure 2.3). In the inconstant visual cues condition, the infants saw a different object on each presentation trial. In the constant visual cues condition, infants saw the same object on every presentation trial. In other words, the same auditory stimulus and the same synchronized motion were always associated with a certain location, but the visual features either changed on every trial (inconstant cues) or stayed the same (constant cues). To replicate the original finding, an informative cue condition was included where two objects seemed to be reliably associated with two locations. In every case, on test trials, only the auditory element was presented, and infants’ looking times to empty frames were measured.

Infants of 6 and 10 months were run in the three conditions. Three-month-olds were run only in the original condition. The original dynamic spatial indexing finding (informative cue) was replicated in all three age groups; even the 3-month-olds performed well. Infants looked longer at the empty port previously associated with the toy after the ports had moved around the screen. The behavior is different, however, in the inconstant and constant visual cues conditions. Six-month-old infants failed to look significantly longer at either port in both the inconstant and constant visual cues conditions. Changing the visual features of the multimodal events on each presentation, or keeping them the same, reduced the infants’ ability to discriminate locations during the test trial. When we presented 10-month-old infants with the same stimuli, however, they successfully looked at the critical location in both conditions. As

predicted by a multiple cue integration theory, without the support of reliable and informative (though still redundant) multiple cues, younger infants find it harder to learn about their world. Older infants, however, are capable of learning these associations from a narrower set of reliable cues.

So far in this chapter, I have discussed the usefulness of multiple cues on various forms of pattern detection. I would like to introduce the idea that multiple cues are useful in supporting more complex representations. This can be shown in investigations of young infants’ understanding of objects as enduring across space and time, regardless of temporary occlusion. In most experimental settings, this experiment is unimodal: Infants watch a ball move across a screen, disappearing briefly behind a rectangular occluder. Any accompanying sound tends to be stationary, coming at the infant from both sides of the screen, and is used to keep the infants focused. Successful understanding of the object’s trajectory is determined by the number of anticipatory saccades directed toward the emerging ball. The typical results are that 4-month-olds appear to construe partly occluded trajectory events in terms of disconnected trajectories on either side of the occluder (suggesting that they are not expecting its reemergence), whereas by 6 months of age, infants are beginning to perceive the continuity of the trajectory behind the occluder (Johnson et al., 2003). This skill is not on a fixed path, however: Johnson, Amso, and Slemmer (2003) showed that 4-month-old infants, who are right at the beginning of a transition toward success at perceptual completion in the ball-and-box display, benefit greatly from “training.” When exposed to an unoccluded trajectory prior to viewing an occluded trajectory, 4-month-olds showed a reliable increase in anticipatory saccades (relative to a control group of untrained 4-month-olds). This visual experience with the object’s complete trajectory facilitated a representation of the persistent motion of the object even when interrupted by occlusion. Given that outside of the laboratory infants have more than just visual experience of moving objects, perhaps

the addition of another source of information would help. Kirkham, Johnson, and Wagner (in preparation) incorporated a continuous moving sound into the ball-and-box paradigm, such that the sound traveled with the object from one side of the occluder to the other side (see Figure 2.4). Results from this study showed that 4-month-olds integrate multiple auditory and visual cues to infer a continuous trajectory. Control conditions examining whether static lateralized sounds would help or whether a sound moving in the opposite direction of the object would produce anticipations in the opposite direction support the initial finding. Four-month-olds show more anticipations only when the sound travels with the object. Indeed, when given these multiple, cross-modal cues, 4-month-old infants appear to be anticipating trajectories as well as 6-month-olds in the unimodal condition.

Figure 2.4. An example of one of the trajectories presented to the babies in Kirkham, Johnson, and Wagner (in preparation).

CONCLUSION

In this chapter, I have tried to describe a theory of multiple cue integration, supporting the acquisition of complex knowledge. As we now know, the “great blooming, buzzing confusion” (James, 1890, p. 462) of the perceptual world actually holds a wealth of information for an infant in the form of statistical structure and cross-modal regularities. It also contains many distractions, in terms of noise and extraneous information. When learning about the world, how do infants pay attention to the right set of multiple cues at the right time? This theory posits a developmental trajectory out of this problem: At early ages, infants rely on multiple, cross-modal cues in learning about sequences and locations; later on, they are better able to ignore unhelpful information and learn from a narrower set of cues.

In conclusion, I would like to suggest that the impact of multiple cue integration on infant learning in a probabilistic environment remains largely unexplored. How does multiple cue integration interact with statistical learning? What kinds of expectations about objects can be supported by this type of learning? The theory outlined in this chapter is an attempt to unify research from statistical learning and cross-modal perception perspectives and ascertain the capabilities of infants’ hypothesized statistical learning mechanism.

ACKNOWLEDGMENTS

The research discussed in this chapter was supported, in part, by Grant RO3 HD050613–01 from the National Institutes of Health. I would like to thank all the parents and children who came to our laboratory and participated in our research. I would also like to thank Daniel Richardson for helpful comments on this chapter.

REFERENCES

Aslin, R. N., Saffran, J. R., & Newport, E. L. (1998). Computation of conditional probability statistics by 8-month-old infants. Psychological Science, 9, 321–324.
Attneave, F. (1953). Psychological probability as a function of experienced frequency. Journal of Experimental Psychology, 46, 81–86.
Bahrick, L. E., & Lickliter, R. (2000). Intersensory redundancy guides attentional selectivity and perceptual learning in infancy. Developmental Psychology, 36, 190–201.
Bahrick, L. E., Lickliter, R., & Flom, R. (2004). Intersensory redundancy guides the development of selective attention, perception, and cognition in infancy. Current Directions in Psychological Science, 13, 99–102.
Bahrick, L. E., Netto, D., & Hernandez-Reif, M. (1998). Intermodal perception of adult and child faces and voices by infants. Child Development, 69, 1263–1275.
Brehmer, B., & Qvarnstrom, G. (1976). Information integration and subjective weights in multiple-cue judgments. Organizational Behavior & Human Performance, 17, 118–126.
Bruno, N., & Cutting, J. E. (1988). Minimodularity and the perception of layout. Journal of Experimental Psychology: General, 117, 161–170.
Christiansen, M. H., Allen, J., & Seidenberg, M. S. (1998). Learning to segment speech using multiple cues: A connectionist model. Language and Cognitive Processes, 13, 221–268.
Christiansen, M. H., & Dale, R. A. C. (2001). Integrating distributional, prosodic and phonological information in a connectionist model of language acquisition. Proceedings of the 23rd Annual Conference of the Cognitive Science

Society (pp. 220–225). Mahwah, NJ: Lawrence Erlbaum.
Christiansen, M. H., & Monaghan, P. (2006). Discovering verbs through multiple-cue integration. In K. Hirsh-Pasek & R. M. Golinkoff (Eds.), Action meets words: How children learn verbs (pp. 88–107). New York: Oxford University Press.
Chun, M. M. (2000). Contextual cueing of visual attention. Trends in Cognitive Sciences, 4, 170–178.
Chun, M. M., & Jiang, Y. (1998). Contextual cueing: Implicit learning and memory of visual context guides spatial attention. Cognitive Psychology, 36, 28–71.
Cohen, A., Ivry, R. I., & Keele, S. W. (1990). Attention and structure in sequence learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 17–30.
Colombo, J. (2001). The development of visual attention in infancy. Annual Review of Psychology, 52, 337–367.
Curran, T., & Keele, S. W. (1993). Attentional and nonattentional forms of sequence learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19, 189–202.
Ernst, M. O., Banks, M. S., & Bülthoff, H. H. (2000). Touch can change visual slant perception. Nature Neuroscience, 3, 69–73.
Fiser, J., & Aslin, R. N. (2001). Unsupervised statistical learning of higher-order spatial structures from visual scenes. Psychological Science, 12, 499–504.
Fiser, J., & Aslin, R. N. (2002). Statistical learning of higher-order temporal structure from visual shape sequences. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 458–467.
Fiser, J., & Aslin, R. N. (2003). Statistical learning of new visual feature combinations by infants. Proceedings of the National Academy of Sciences of The United States of America, 99, 15822–15826.
Gibson, E. J. (1969). Principles of perceptual learning and development. New York: Appleton Century Crofts.
Gilmore, R. O., & Johnson, M. H. (1997). Egocentric action in early infancy: Spatial frames of reference for saccades. Psychological Science, 8, 224–230.
Gomez, R. L., & Gerken, L. (1999). Artificial grammar learning by 1-year-olds leads to specific and abstract knowledge. Cognition, 70, 109–135.
Green, D. M. (1958). Detection of multiple component signals in noise. Journal of the Acoustical Society of America, 30, 904–911.
Haith, M. M. (1993). Future-oriented processes in infancy: The case of visual expectations. In C. Granrud (Ed.), Visual perception and cognition in infancy (pp. 235–264). Hillsdale, NJ: Erlbaum.
Hauser, M. D., Newport, E. L., & Aslin, R. N. (2001). Segmentation of the speech stream in a non-human primate: Statistical learning in cotton-top tamarins. Cognition, 78, B53–B64.
Heyes, C. M., & Foster, C. L. (2002). Motor learning by observation: Evidence from a serial reaction time task. The Quarterly Journal of Experimental Psychology, 55A, 593–607.
Howard, J. H., Mutter, S. A., & Howard, D. V. (1993). Serial pattern learning by event observation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 1029–1039.
James, W. (1890/1981). The principles of psychology. Cambridge, MA: Harvard University Press.
Johnson, S. P. (1997). Young infants’ perception of object unity: Implications for development of attentional and cognitive skills. Current Directions in Psychological Science, 6, 5–11.
Johnson, S. P., Amso, D., & Slemmer, J. A. (2003). Development of object concepts in infancy: Evidence for early learning in an eye tracking paradigm. Proceedings of the National Academy of Sciences USA, 100, 10568–10573.
Johnson, S. P., Bremner, J. G., Slater, A., Mason, U., Foster, K., & Cheshire, A. (2003). Infants’ perception of object trajectories. Child Development, 74, 94–108.
Jonides, J., & Jones, C. M. (1992). Direct coding of frequency of occurrence. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 368–378.
Jonides, J. M., & Naveh-Benjamin, M. (1987). Estimating frequency of occurrence. Journal of Experimental Psychology: Learning, Memory, and Cognition, 13, 230–240.
Kinchla, R. A. (1977). The role of structural redundancy in the perception of visual targets. Perception and Psychophysics, 22, 19–30.
Kirkham, N. Z., Johnson, S. P., & Wagner, J. Moving sounds: The role of inter-modal perception in solving the problem of occlusion. Manuscript in preparation.
Kirkham, N. Z., & Shohamy, D. (2007). Feedback modulation of probabilistic learning: A developmental perspective. Poster presented at the

Annual Meeting of the Cognitive Neurosciences Eye movements of adults and 6-month-olds
Society, May 2007. reveal dynamic spatial indexing. Journal
Kirkham, N. Z., Richardson, D. C., & Johnson, S. P., of Experimental Psychology: General, 133,
& Wu, R. The importance of ‘what’ The usefulness 46–62.
of multiple redundant cues across the first year of Richardson, D. C., & Spivey, M. J. (2000).
life. Manuscript in preparation. Representation, space, and Hollywood Squares:
Kirkham, N. Z., Slemmer, J. A., & Johnson, S. P. Looking at things that aren’t there anymore.
(2002). Visual statistical learning in infancy. Cognition, 76, 269–295.
Cognition, 83, B35–B42. Rosenblum, L. D., Schmuckler, M. A., & Johnson,
Kirkham, N. Z., Slemmer, J. A., Richardson, D. C., J. A. (1997). The McGurk effect in infants.
& Johnson, S. P. (2007). Location, location, loca- Perception & Psychophysics, 59, 347–357.
tion: Development of spatiotemporal sequence Rubin, D. C. (1974). The subjective estima-
learning in infancy. Child Development, 78, tion of syllable frequency. Perception and
1559–1571. Psychophysics, 16, 193–196.
Kuhl, P. K., & Meltzoff, A. N. (1982). The bimodal Saegert, S., Swap, W., & Zajonc, R. B. (1973).
perception of speech in infancy. Science, 218, Exposure, context, and interpersonal attrac-
1138–1140. tion. Journal of Personality and Social
Landy, M. S., Maloney, L. T., Johnston, E. B., & Psychology, 25, 234–242.
Young, M. (1995). Measurement and modelling Saffran, J. R., Aslin, R. N., & Newport, E. L. (1996).
of depth cue combination: In defense of weak Statistical learning by 8-month-old infants.
fusion. Vision Research, 35, 389–412. Science, 274, 1926–1928.
Lewkowicz, D. J. (2000). The development of Saffran, J. R., Johnson, E. K., Aslin, R. N., &
intersensory temporal perception: An epige- Newport, E. L. (1999). Statistical learning of
netic systems/limitations view. Psychological tone sequences by human infants and adults.
Bulletin, 126, 281–308. Cognition, 70, 27–52.
Lewkowicz, D. J., & Turkewitz, G. (1980). Cross- Sedmeier, P., Hertwig, R., & Gigerenzer, G.
modal equivalence in early infancy: Auditory– (1998). Are judgments of the positional
visual intensity of matching. Developmental frequencies of letters systematically biased
Psychology, 16, 597–607. due to availability? Journal of Experimental
Lichenstein, S. P., Slovic, P., Fischoff, B., Layman, Psychology: Learning, Memory, and Cognition,
M., & Combs, B. (1978). Judged frequency 24, 754–770.
of lethal events. Journal of Experimental Shapiro, B., J. (1969). The subjective estimation
Psychology: Human Learning and Memory, 4, of relative word frequency. Journal of Verbal
551–578. Learning and Verbal Behavior, 8, 248–251.
Massaro, D. W. (1999). Speechreading: Illusion Smith, W. C., Johnson, S. P., & Spelke, E. S.
or window into pattern recognition. Trends in (2003). Motion and edge sensitivity in percep-
Cognitive Science, 3, 310–317. tion of object unity. Cognitive Psychology, 46,
Mayhew, J. E., & Frisby, J. P. (1980). The computa- 31–64.
tion of binocular edges. Perception, 1, 69–86. Spelke, E. S. (1979). Perceiving bimodally specified
Meulemans, T., Van der Linden, M., & Perrouchet, events in infancy. Developmental Psychology,
P. (1998). Implicit learning in children. Journal 15, 626–636.
of Experimental Child Psychology, 69, 199–221. Spelke, E. S. (1981). The infant’s acquisition of
Morrongiello, B. A., Fenwick, K. D., & Chance, knowledge of bimodally specified events.
G. (1998). Cross modal learning in new- Journal of Experimental Child Psychology, 31,
born infants: Inferences about properties of 279–299.
auditory–visual events. Infant Behavior and Sperling, G., Landy, M. S., Dosher, B. A., &
Development, 21, 543–554. Perkins, M. E. (1989). Kinetic depth effect
Nissen, M. J., & Bullemer, P. (1987). Attentional and identification of shape. Journal of
requirements of learning: Evidence from per- Experimental Psychology: Human Perception
formance measures. Cognitive Psychology, 19, and Performance, 15, 826–840.
1–32. Todd, J. T., & Akerstrom, R. A. (1987). Perception of
Richardson, D. C., & Kirkham, N. Z. (2004). three-dimensional form from patterns of opti-
Multi-modal events and moving locations: cal texture. Journal of Experimental Psychology:

Human Perception and Performance, 13, Xu, F. & Carey, S. (1996). Infants’ metaphys-
242–255. ics: the case of numerical identity. Cognitive
Wentworth, N. Haith, M. M., & Hood, R. (2002). Psychology, 30, 111–153.
Spatiotemporal regularity and interevent con- Zajonc, R. B. (1968). Attitudinal effects of mere
tingencies as information for infants’ visual exposure. Journal of Personality and Social
expectations. Infancy, 3, 303–321. Psychology Monograph Supplement, 9, 1–28.
Perceptual Completion in Infancy

Scott P. Johnson

Sensory systems provide information about the environment so that we might prepare and enact actions appropriate for the context. Visual perception, in particular, is useful in acquiring information about near and distant objects in our surroundings. Consider, for example, the scene in Figure 3.1, adjacent to the beach in Venice, California. This scene is typical of what we might encounter when we move about in the world: the ground extends into the distance and consists of different materials (concrete, grass, sand) that might dictate our route, and there are many objects that might either be avoided or approached as we move about, depending on our goals.

In everyday contexts, we hold certain commonsense expectations about the objects we see. We expect most objects, for example, to be solid and three-dimensional, persistent across space and time, and we plan our actions around these expectations. How are these commonsense expectations about objects achieved? Object perception and action planning typically seem so effortless and rapid that we may fail to appreciate their complexity. But there are many steps involved in perceiving objects:

• Segmentation of a scene into its component surfaces based on differences in color, luminance, motion, shape, orientation, distance, and so forth.
• Assembly of the surfaces into units (units, or collections of units, constitute objects).
• Perception of units as complete across space and time despite gaps in perception due to occlusion and movement of objects.
• Deduction of 3D shape from limited views due to objects’ self-occlusion.
• Recognition of objects encountered before.
• Tracking the identity of previously encountered objects over time.
• Categorization of similar objects.
• Detecting affordances for action.

This way of decomposing object perception underscores its complexity, and also provides hints as to how investigations into its mechanisms and development might proceed. In this chapter, I will describe a program of research whose goal is to elucidate the developmental origins of object perception, with particular focus on perceptual completion, which constitutes a subset of the steps just described: assembly of visible surfaces into units despite gaps in perception due to occlusion by other objects, and perception of 3D shape despite self-occlusion. I will describe three kinds of perceptual completion and consider the developmental mechanisms involved in each: spatial completion, perceiving the unity of partly occluded surfaces; spatiotemporal completion, perceiving the unity of partly occluded trajectories; and 3D object completion, perceiving the 3D shape of objects seen from a limited vantage point.

In principle, perceptual completion might be accomplished innately—in the absence of visual


Figure 3.1 A scene in Venice, California.

experience—if the developmental processes that build those parts of the visual system that support it were complete at birth. In the mature primate visual system, for example, spatial completion is accomplished in part by interactions among neurons in relatively low levels of the visual system (V1 and V2), which, when firing in response to a visible edge, connect with other neurons that are tuned to similar edge orientations (Peterhans & von der Heydt, 1991). In this way, activation “spreads” among neurons coding for a specific edge orientation when that orientation is detected—so-called “local” circuits—and activation can propagate across a spatial gap. As it happens, connections among low-level local circuits in the human visual system are likely first formed at about 2 postnatal months (Burkhalter, 1993; Burkhalter, Bernardo, & Charles, 1993), and consistent with this timing, infants begin to provide evidence of perceptual completion at about 2 months after birth (Johnson, 2004). It is unknown if there is a one-to-one correspondence between these developmental events, as might be predicted on a strictly maturational account (e.g., if growth of neural connections causes spatial completion), but maturation is likely an important part of the overall developmental story. The evidence is clearer, however, for a direct and important role for learning as a principal means of development of perceptual completion. I will describe this evidence subsequently.

The remainder of this chapter is organized as follows. In the next section, I will describe two theoretical views of object perception in infancy, one of Jean Piaget and the other of theorists espousing a nativist view. In the following sections, I will describe in more detail the three kinds of perceptual completion mentioned previously (spatial, spatiotemporal, and 3D object completion), and some of the developmental mechanisms involved. The chapter will conclude by considering implications of our research for constructivist theory, which of course is the theme of this book.

Piaget (1954) described the first systematic investigations of how infants, beginning at birth and extending across the next several years, respond to objects and their spatial relations. According to Piagetian theory, the development of object knowledge consists of construction of two related concepts: first, that objects persist across time and space, and second, that objects exist external to the self in a particular spatial arrangement. Piaget devised a series of clever activities involving objects, often occluded objects, which he presented to infants and children, and he recorded the children’s responses. These observations revealed a developmental sequence of behaviors thought to reflect underlying knowledge of objects and their spatial relations. Initially, infants provided no responses to indicate knowledge of object permanence, though there was recognition of familiar objects. Between 3 and 6 months, infants begin to show evidence of a sense of objects as having boundaries that extend beyond the visible, such as searching for (whole) objects that are only partly visible, and directing gaze toward the expected point of emergence of a moving object that became hidden. Later, at around 8 months of age, infants search for fully hidden objects; still later (around 18–24 months), infants solve more complex hiding tasks involving more than one location, at which point Piaget ascribed them full object permanence.

At birth, therefore, the infant experiences not objects per se but rather surfaces that appear and disappear erratically—fleeting images that are arbitrary and subjective, rather than substantial, predictable, and permanent. Piaget suggested, furthermore, that the principal mechanisms of development were rooted in the infant’s own behavior: engaging with objects, following their motions, and taking note of the consequences of self-directed actions.

Many of Piaget’s observations have been replicated repeatedly, but his interpretation of infants’ behavior has been disputed on the basis of more recent experiments employing methods claimed to be more sophisticated and sensitive than reaching in tapping infants’ cognitive constructs. Some of these methods capitalize on infants’ tendency to show clear visual preferences for particular stimuli over others; one such method, known as the “violation of expectation” method, is claimed to demonstrate infants’ knowledge of event sequences that are physically impossible. For example, in an experiment described by Baillargeon, Spelke, and Wasserman (1985), 5-month-old infants viewed a box and a screen arranged such that the screen rotated and appeared to move through the space occupied by the box, a so-called “impossible” event, accomplished with one-way mirrors. Infants were reported to look longer at this event than at a “possible” version in which the screen stopped at an appropriate place in its rotation, before moving through the box, therefore providing evidence for object permanence earlier than acknowledged by Piagetian theory.

A second example comes from an experiment described by Bower (1967) in which 1-month-old infants viewed a triangle made of wire, its center occluded by a rectangular object. Bower did not record looking times toward the object; instead, he recorded rates of sucking on a pacifier during viewing and reasoned that a change in sucking rates upon presentation of a novel stimulus would indicate perceptual discrimination of the new stimulus from the original. Following the occluded object, infants viewed four new stimuli presented individually: a whole triangle, a triangle with crossed lines in the center, and two different kinds of incomplete triangles, each with a gap in the center. All four test stimuli were consistent with the visible portions of the triangle seen during training. The infants maintained sucking rates in response to the complete object, and reduced sucking rates most when viewing the two incomplete forms. This implies generalization of the complete form to the partly occluded object, and response to the incomplete forms as novel.

Noting the inconsistencies between the Bower study and Piaget’s observations, Kellman and Spelke (1983) undertook a more thorough examination of young infants’ perception of partly occluded objects. In one such experiment, infants were first presented with a moving, partly occluded rod until habituation of looking occurred; that is, looking times declined according to a predetermined criterion. The infants were then tested with incomplete (or broken) and complete versions of this display without the occluder (see Figure 3.2). The test stimulus that attracted the most attention was assumed to be perceived as most novel relative to the initial stimulus. Four-month-old infants showed reliable preferences for a broken rod test display when the partly occluded rod (the habituation stimulus) was seen to move laterally relative to the background and occluder (as depicted in Figure 3.2). (In this study, and the others described in this chapter that use habituation methods, a separate age-matched control group was observed for evidence of spontaneous preference for one of the test stimuli; in all cases, none were found.) There was no evidence of unity perception, however, in displays in which the rod remained stationary, in contradiction to the results reported by Bower. The discrepancies between the findings of the Kellman and Spelke study concerning unity perception and the Bower experiment are difficult to reconcile, but it is noteworthy that the Kellman and Spelke finding has been replicated on multiple occasions.

These studies and others have been suggested to provide support for a nativist theory of cognitive development stressing an innate knowledge component (i.e., independent of experience) that guides infant responses to objects. This theory is founded principally on the assumption that object knowledge is innate because responses to occlusion are observed at an early age, and there is inadequate opportunity for infant learning of foundational concepts, such as persistence and permanence (Baillargeon, 1995). On a nativist account, Piaget may have underestimated infants’ object knowledge and the age at which competence at object permanence can be elicited because of the insensitivity of his methods. Piaget’s methods are thought to rely too heavily on overt manual responses, such as coordinated reaching. Difficulties with reaching may mask latent cognitive capacities that can be revealed with methods that rely on relatively simpler action sequences, such as looking and sucking.

Yet the nativist account has three fundamental flaws with respect to infants’ responses to partly and fully occluded objects. First, findings from experiments using measures that do not require manual search skills, such as looking times, in reality are broadly consistent with Piaget’s observations: Infants provide evidence of representing partly occluded objects a few months after birth, and fully occluded objects by about the middle of the first year. Some have claimed that infants have a sense of object permanence on the basis of evidence from looking time studies (e.g., Baillargeon et al., 1985), but

Figure 3.2 Schematic depictions of displays used to assess infants’ perception of object unity, or spatial
completion. Left: The display shown during habituation consists of two aligned rod parts moving in
tandem above and below an occluder. Center: Broken test display. Right: Complete rod test display.
Note: Real, 3D objects were used in the original Kellman and Spelke (1983) experiments; displays
depicted here are adapted from 2D versions (Johnson & Náñez, 1995).

in fact the best that can be said about infant performance in most such paradigms is that there is a short-term representation of an object that is maintained for a matter of seconds (cf. Haith, 1998; Kagan, 2008). This is a far cry from Piaget’s criteria for full object permanence, which requires accurate search in multiple locations for a desired hidden object, accessing knowledge of both persistence and the spatial relations between the object, the hiding locations, and the infant.

Second, nativist theories contribute little if any substantive knowledge about developmental mechanisms underlying object perception. Even on a strictly maturational account stipulating developments that are exclusive of visual experience, there are means by which the infant comes to perceive and represent objects, and these means may be amenable to empirical investigation.

Third, there is substantial evidence that newborn infants do not perceive moving, partly occluded objects as having hidden parts. Instead, neonates appear to construe such stimuli solely in terms of their visible parts, failing to achieve spatial completion (Slater, Johnson, Brown, & Badenoch, 1996; Slater et al., 1990; but see Valenza, Leo, Gava, & Simion, 2006, for evidence against this possibility). This and other findings show clear evidence for a developmental progression in perceptual completion and call for an explanation of underlying mechanisms of change. This is the goal of the experiments described in the following sections of this chapter.

Adults and 4-month-old infants construe the occlusion display depicted in Figure 3.2 as consisting of two parts, a single rod or bar moving back and forth behind an occluding rectangle (Kellman & Spelke, 1983). Neonates, by contrast, perceive this display as consisting of three separate parts: two disjoint rod parts and an occluder (Slater et al., 1990, 1996). These conclusions arise from looking time experiments described previously in which posthabituation looking patterns are thought to reflect a novelty preference; the 4-month-olds and neonates show opposite patterns of preference. This has led to the more general conclusion that neonates are unable to perceive occlusion, and that occlusion perception emerges over the first several postnatal months (Johnson, 2004). That is, “piecemeal” or fragmented perception of the visual environment extends from birth through the first several months afterward, implying a fundamental shift in the infant’s perceptual experience: “During the first few months of existence the child’s universe is really lacking permanent objects . . . this means that perceived figures simply appear and disappear like moving tableaux and exhibit a series of changing shapes in between” (Piaget & Inhelder, 1956, p. 9).

Because neonates and 4-month-olds appear to regard rod-and-box displays differently—as separate surfaces and as occluded objects, respectively—an important step in understanding development of spatial completion is investigations of performance in infants between these ages. In the first such investigation, 2-month-olds were found to show an “intermediate” pattern of performance—no reliable posthabituation preference—implying that spatial completion is developing at this point but not yet complete (Johnson & Náñez, 1995). A follow-up study examined the possibility that 2-month-olds will perceive unity if given additional perceptual support. The amount of visible rod surface revealed behind the occluder was enhanced by reducing box height and by adding gaps in it, and under these conditions 2-month-olds provided evidence of unity perception (Johnson & Aslin, 1995). With newborns, however, this manipulation failed to reveal similar evidence: Even in enhanced displays, newborns seemed to perceive disjoint rather than unified rod parts (Slater et al., 1996; Slater, Johnson, Kellman, & Spelke, 1994). These experiments served to pinpoint more precisely the time of emergence of spatial completion in infancy: the first several weeks or months after birth under typical circumstances.

Additional experiments explored the kinds of visual information infants use to perceive spatial completion. Our starting point was the Gestalt cues of good continuation and common

Figure 3.3 Schematic depictions of displays used to assess 2-month-olds’ spatial completion. Infants
perceived completion only when rod parts were aligned and the occluder was relatively narrow (left).
Infants provided evidence of perceiving disjoint surfaces in the other displays. Adapted from Johnson

motion, also known as common fate. The Kellman and Spelke (1983) experiments provided evidence that 4-month-olds perceived object unity only when the rod parts moved in tandem behind a stationary occluder. We replicated and extended this finding, showing in addition that 4-month-olds provided evidence of completion only when the rod parts were aligned (Johnson & Aslin, 1996). Later experiments revealed similar patterns of performance in 2-month-olds when tested using displays with different occluder sizes and edge arrangements, as seen in Figure 3.3 (Johnson, 2004). Perceptual completion obtained only when rod parts were aligned across a narrow occluder; in the other displays, infants provided evidence of disjoint surface perception.

One possible interpretation of these findings is that alignment, motion, and occluder width (i.e., the spatial gap) are interdependent contributions to spatial completion, such that common motion is detected most effectively when rod parts are aligned (Kellman & Arterberry, 1998). I evaluated this possibility in experiments probing 2-month-olds’ discrimination of different patterns of rod motion with varying orientations of rod parts and occluder widths. Under all tested conditions, infants discriminated the motion patterns, implying that motion discrimination was neither impaired nor facilitated by misalignment or occluder width. The precise contributions of motion to spatial completion in infants remain unknown; one possibility is that motion serves multiple functions, first serving to segment the scene into its constituent surfaces, and then serving to bind moving surfaces into a single object (Johnson, Davidow, Hall-Haro, & Frank, 2008).

In summary, experiments that examine development of spatial completion provide support for the possibility that young infants analyze the motions and arrangements of visible surfaces and initially (at birth) perceive them as separate from one another and the background. Only later do infants integrate these surfaces into percepts of coherent, partly occluded objects. According to this view, therefore, development of object knowledge begins with perception of visible object components, and proceeds with increasing proficiency at representation of those object parts that cannot be discerned directly.

Learning to Perceive Spatial Completion

In this section, I will describe experiments designed to elucidate developmental mechanisms of spatial completion. An important part of the developmental explanation revealed by these experiments is the strong relation between oculomotor scanning patterns—eye movements—and unity perception. Amso and Johnson (2006) and Johnson, Slemmer, and Amso (2004) observed 3-month-old infants in a spatial completion task using the habituation paradigm described previously. Infants’ eye movements were recorded with a corneal reflection eye tracker during the habituation phase of the experiment. We found systematic differences in scanning patterns between infants

Figure 3.4 Scan patterns of a “perceiver” (left) and a “nonperceiver” (right), both 3-month-olds, from
Amso and Johnson (2006). The leftmost and rightmost positions of the rod during its motion are
shown. Scan paths are depicted as lines between points, which represent fixations.

whose posthabituation test display preferences indicated unity perception and infants who provided evidence of perception of disjoint surfaces: “Perceivers” tended to scan more in the vicinity of the two visible rod segments, and to scan back and forth between them (Figure 3.4). In a somewhat younger sample (58 to 97 days), Johnson et al. (2008) found a reliable correlation between posthabituation preference—our index of spatial completion—and targeted visual exploration, the proportion of eye movements directed toward the moving rod parts, which we reasoned was the most relevant aspect of the stimulus for perception of completion. Spatial completion was not predicted by other measures of oculomotor performance, including mean number of eye movements per second, mean distance between fixations, and the “dispersion” of visual attention, an assessment of “global” versus “local” scanning activity. Spatial completion was best predicted by saccades directed toward the vicinity of the moving rod parts. This can be a challenge for a developing oculomotor system, attested by the fact that targeted scans almost always followed the rod as it moved, rarely anticipating its position.

How targeted visual exploration emerges in infancy to maximize effective uptake of visual information is not yet known. Very young infants’ ability to perceive occlusion may be challenged by difficulties in accessing visual information for unity such as alignment and common motion of edges across a spatial gap. Alternatively, insufficient information acquisition may lead to a “default” response to the visible surfaces only, characteristic of neonates, yielding a novelty preference for the complete rod at test. Either of the possibilities is consistent with the idea that efficient visual exploration is an important mechanism of development in object perception.

How do these findings provide evidence for learning as a mechanism of spatial completion? A nativist view stressing innate mechanisms that are independent of experience might posit that spatial completion stems exclusively from maturation of neural structures responsible for object perception, such as connections in V1 and V2 mentioned previously. Then, as infants begin to perceive occlusion, their eye movement patterns support or confirm this percept. Unequivocal evidence for a direct role for targeted visual exploration in development of spatial completion would come from experiments in which individual differences in oculomotor patterns were observed in both spatial completion and some other visual task, and this was recently reported by Amso and Johnson (2006). We found that both spatial completion and scanning patterns were strongly related to performance in an independent visual search task in which targets were selected amongst distracters. This finding is inconsistent with the possibility that scanning patterns were tailored specifically to perceptual completion, and instead suggests that a general facility with targeted visual behavior leads to improvements across multiple tasks.
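As a rough illustration of the logic behind relating targeted visual exploration to posthabituation preference, and not the procedure used in the studies cited, the following Python sketch computes three quantities: a targeted visual exploration score for one infant (the proportion of saccade endpoints that land on a visible rod part), a posthabituation preference score (the share of test looking directed at the broken rod), and the Pearson correlation between the two scores across infants. The data layout, the rectangular regions around the rod parts, the function names, and the preference formula are all assumptions made for this example; the text does not specify them.

```python
from math import sqrt


def in_box(point, box):
    """Return True if an (x, y) point lies inside an (x_min, y_min, x_max, y_max) box."""
    x, y = point
    x_min, y_min, x_max, y_max = box
    return x_min <= x <= x_max and y_min <= y <= y_max


def targeted_proportion(saccades):
    """Proportion of saccades whose endpoint lands on a visible rod part.

    `saccades` is a list of (endpoint, rod_boxes) pairs. `rod_boxes` holds the
    (hypothetical) boxes around the rod parts at the moment the saccade landed,
    since the rod moves during the display.
    """
    if not saccades:
        return 0.0
    hits = sum(
        1
        for endpoint, rod_boxes in saccades
        if any(in_box(endpoint, box) for box in rod_boxes)
    )
    return hits / len(saccades)


def broken_rod_preference(look_broken, look_complete):
    """Share of test looking time spent on the broken rod (assumed index)."""
    total = look_broken + look_complete
    return look_broken / total if total > 0 else 0.5


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)
```

With one pair of scores per infant, calling `pearson_r` on the list of exploration scores and the list of preference scores gives the kind of across-infant association described above. Under this assumed index, a preference above 0.5 would correspond to longer looking at the broken rod.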

How might developing object perception systems benefit from targeted scans? Eye movements may serve as a vital binding mechanism due to the relatively restricted visual field and poor acuity characteristic of young infants’ vision. Visual information in the periphery is more difficult to access with a single glance, increasing the need to scan between features to ascertain their relations to one another. Infants who are more likely to do this will increase their processing of relevant features and their correspondences, as irrelevant features are ignored.

In summary, the individual differences in targeted visual exploration that we have observed suggest that scanning patterns make a vital contribution to the emergence of veridical object perception. Evidence suggests that as scanning patterns develop, they enable learning of relevant visual features of the environment, and support binding of these features into coherent percepts of unified objects.

A number of studies using different methods (e.g., looking times, reaching in the dark) have shown that young infants can maintain representations for hidden objects across brief delays (e.g., Aguiar & Baillargeon, 1999; Clifton, Rochat, Litovsky, & Perris, 1991; Spelke, Breinlinger, Macomber, & Jacobson, 1992). Yet, as mentioned previously, newborns provide little evidence of perceiving partly occluded objects—that is, spatial completion—findings that raise the question of how perception of complete occlusion emerges during the first few months after birth. Apart from Piaget’s theory, this question has received relatively little serious attention until recently, in favor of accounts that stress innate object concepts (e.g., Baillargeon, 2008; Spelke, 1990).

To address this gap in our knowledge, my colleagues and I have conducted experiments with computer-generated displays in which objects moved on a trajectory, disappeared behind an occluder, reappeared on the far side, and reversed direction, repeating the cycle. We reasoned that manipulation of spatial and temporal characteristics of the stimuli, and the use of different age groups, might provide insights into development of spatiotemporal completion, as they did in the case of spatial completion.

These investigations revealed a fragmented-to-holistic developmental pattern, and revealed spatial and temporal processing constraints as well. Both sets of results are in parallel with the investigations of spatial completion described in the previous section. Spatiotemporal completion was tested using similar methods: habituation to an occlusion display (Figure 3.5), followed by broken and complete test displays, and different versions of the partly hidden trajectory seen during habituation. At 4 months, infants treat the ball-and-box display depicted in Figure 3.5 as consisting of two disconnected trajectories, rather than a single, partly hidden

Figure 3.5 Schematic depictions of displays used to assess infants’ perception of occluded trajectories,
or spatiotemporal completion. Left: The display shown during habituation consists of two segments of
object trajectory. Center: Discontinuous trajectory test display. Right: Continuous trajectory test
display. Adapted from Johnson, Bremner, et al. (2003).

path (Johnson, Bremner, et al., 2003); evidence comes from a reliable preference for the continuous version of the test trajectory. By 6 months, infants perceived this trajectory as unitary, as revealed by a reliable preference for the discontinuous trajectory test stimulus. When occluder size was narrowed, however, reducing the spatiotemporal gap across which the trajectory had to be interpolated, 4-month-olds’ posthabituation preferences (and thus, by inference, their percepts of spatiotemporal completion) were shifted toward the discontinuous test display, partway by an intermediate width, and fully by a narrow width, so narrow as to be only slightly larger than the ball itself. In 2-month-olds, this manipulation appeared to have no effect.
Reducing the spatiotemporal gap, therefore, facilitates spatiotemporal completion. Reducing the temporal gap during which an object is hidden, independently from the spatial gap, also supports spatiotemporal completion. Increasing the ball size (Figure 3.6) can minimize the time out of sight as it passes behind the occluder, and this led 4-month-olds to perceive its trajectory as complete. Accelerating the speed of a smaller ball as it passed behind the occluder (and reappeared more quickly) had a similar effect (Bremner et al., 2005). On the other hand, altering the orientation of the trajectory impaired path completion (Figure 3.6), unless the edges of the occluder were orthogonal to the path; these findings are similar to outcomes of experiments on edge misalignment described in the previous section (Bremner et al., 2007).
This work leads to two conclusions. First, there may be a lower age limit for trajectory completion (between 2 and 4 months), just as there appears to be for spatial completion (between birth and 2 months). Second, young infants’ spatiotemporal completion is based on relatively simple parameters. Either a short time or short distance out of sight leads to perception of continuity, and this may occur because the processing load is reduced by these manipulations. The fragile nature of emerging spatiotemporal completion is underscored as well by results showing its breakdown when either occluder or path orientation is nonorthogonal.

Learning to Perceive Spatiotemporal Completion
Piaget (1954) described a series of infants’ behaviors providing evidence for an emerging ability to track objects that became occluded. Before the development of skilled manual search at about 4 to 6 months, search for hidden objects was exclusively visual. For example, Piaget’s son Laurent, at 2 months, was reported to maintain gaze at a point where Piaget had been seen previously, a passive expectation of his father’s reappearance. More active visual search behavior emerged after 4 months, such as visual “accommodation,” as when an infant would respond to a dropped object by looking down toward the floor, a behavior observed in Laurent at 6.5 months. On Piaget’s theory, visual accommodation or anticipation becomes more consistent as the infant learns from self-directed manipulation of objects, providing

Figure 3.6 Schematic depictions of displays used to examine conditions under which infants show
evidence of spatiotemporal completion. Adapted from Bremner et al. (2005, 2007).

direct experience with dropping and retrieval, and develops alongside reconstruction of partly occluded objects from visible fragments, which I have referred to in this chapter as spatial completion. Piaget suggested that increasing visual, tactile, and manual experience is a vital developmental mechanism in developing more complex concepts of objects.
More recent research has shown that by 6 months, infants’ representations of hidden objects are sufficiently robust to guide reaching prospectively to intercept objects on occluded trajectories (Clifton et al., 1991; von Hofsten, Vishton, Spelke, Feng, & Rosander, 1998). Researchers have built on these ideas by recording infants’ eye movements as they view repetitive events in which objects move behind an occluder and subsequently reemerge, such as that shown in Figure 3.5. The question is the extent to which infants produce anticipatory eye movements toward the place of reemergence while the object is out of view, implying a functional representation of the object that guides oculomotor behavior.
At 4 months, prospective behavior (anticipatory eye and head movements to the place of reappearance of an object seen to move behind an occluder) is adapted to variations in occluder width and object speed, implying that under some conditions, infants may track with their “mind’s eye” (von Hofsten, Kochukhova, & Rosander, 2007). At 6 months, infants begin to respond to nonlinear trajectories, showing spatially accurate predictive eye movements when a target moves on a partly occluded circular path (Gredebäck & von Hofsten, 2004; Gredebäck, von Hofsten, & Boudreau, 2002). Yet the evidence from habituation experiments described previously indicates that 4-month-olds process partly occluded trajectories in terms of visible components only, not complete paths, when tested under conditions that challenge spatiotemporal completion (Johnson, Bremner, et al., 2003), and, notably, the lower bound for predictive tracking is about 12 weeks (Rosander & von Hofsten, 2004). These results indicate that representations of occluded objects in 4-month-olds may be rather fragile and not completely established, and, more broadly, that representations of occluded objects are weak if not nonexistent in very young infants, and gradually strengthen across the first year after birth (Munakata, 2001; Piaget, 1954).
To examine the possibility that learning plays an important role in development of spatiotemporal completion, my colleagues and I presented ball-and-box displays to 4- and 6-month-olds as we recorded their eye movements (Johnson, Amso, & Slemmer, 2003). The stimulus was identical to the displays used by Johnson, Bremner, et al. (2003) (Figure 3.5). Because 6-month-olds provided evidence of spatiotemporal completion in these displays when tested with a habituation paradigm, we predicted that oculomotor anticipations would be more frequent in the older age group. This prediction was supported. A higher proportion of 6-month-olds’ eye movements was classified as anticipatory (i.e., initiated prior to the ball’s emergence from behind the occluder) relative to 4-month-olds, corroborating the view that spatiotemporal completion strengthens between 4 and 6 months.
As noted previously, 4 months is a time of transition toward spatiotemporal completion, raising questions about the role of experience in oculomotor anticipation performance at this age. There is mixed evidence for short-term gains in predictive performance, gains that hypothetically might arise from repeated exposure to a target object that moves in a perfectly predictable manner (to adults). In other words, we might expect infants to show more reliable anticipation as they view multiple instances of an object emerging from behind an occluder. Rosander and von Hofsten (2004) found that predictive performance of the oldest infants they observed (21-week-olds) improved with repeated exposure to four complete cycles of motion, in terms of decreasing eye movement latencies as a function of trial. By contrast, Gredebäck et al. (2002) and Johnson, Amso, and Slemmer (2003) found that oculomotor anticipations declined across trials in infants ranging from 4 to 9 months. In these studies, infants were exposed to several dozen opportunities to learn about the repetitive event during the sequence of trials, yet performance declined

consistently. In general, therefore, infants do not seem to capitalize on the predictable nature of occlusion stimuli in producing predictive eye movements, implying that infants do not acquire spatiotemporal completion solely by means of this kind of experience.
But of course the visual environment in the real world is not limited to dynamic occlusion events. There are many instances of objects that move in full view, and infants will track moving objects from the first opportunities to do so, at birth (Slater, 1995). What do infants learn by watching such events? We addressed this question with a “training” paradigm: presenting infants with unoccluded object trajectory displays for 2 min immediately preceding the occlusion stimulus seen in Figure 3.5 (Johnson, Amso, & Slemmer, 2003). Following training, the 4-month-olds’ performance was statistically indistinguishable from that of the 6-month-olds, providing evidence for rapid learning of spatiotemporal completion. Data from a second training condition indicate that we were not simply facilitating horizontal eye movements that carried over from training to test. This second training condition used a vertical unoccluded trajectory for training, followed as before by the (horizontal) occlusion display. Again, performance was statistically indistinguishable from that of the older infants.
Brief training, therefore, brought 4-month-olds to a level of predictive performance similar to that of older infants. How might this work in infants’ everyday environment? In the real world, infants are exposed to many different objects moving in different ways, presenting multiple opportunities for learning. It may be that repeated exposure to moving objects, and repeated viewing of objects as they move in and out of view due to occlusion, leads to an associative link between the two scenarios. These associations accrue gradually in everyday learning, attested by the relatively long span between improvements in performance (between 4 and 6 months), but the learning mechanisms themselves are remarkably efficient, attested by the very brief training interval (2 min) necessary to bridge this gap in performance in the laboratory.
For associative learning about occlusion to be a viable means of dealing with real-world events, associations between visible and partly occluded paths must be committed to memory. How long do such rapidly acquired associations last? To address this question, we replicated the Johnson, Amso, and Slemmer (2003) methods and observed a nearly identical pattern of anticipatory behaviors by 4-month-olds in baseline and training conditions (Johnson & Shuwairi, 2009). A third group received a half-hour break between training and test, and performance reverted to baseline, implying that memory for the association was lost during the delay. But a fourth group, provided with a single “reminder” trial after an identical delay, showed a recovery of oculomotor anticipations equivalent to the no-delay training condition. (A fifth group, provided only a single training trial, showed no benefit in the form of anticipatory looking.) These findings suggest that accumulated exposure to occlusion events may be an important means by which spatiotemporal completion arises in infancy.
In summary, research that examines infants’ oculomotor anticipations as they view repetitive dynamic occlusion events is broadly consistent with Piaget’s original descriptions of infant performance: There is little evidence of systematic predictive behavior prior to 4 months, after which time anticipations become robust and flexible, and performance continues to improve with age. It seems unlikely, however, that direct manual experience with objects is a principal developmental mechanism driving the emergence of this predictive behavior, because oculomotor anticipations begin to become established prior to the onset of functional goal-directed reaching and manual object manipulation in developmental time. Experiments described in the next section, however, provide evidence for a vital role for manual experience in development of perception of objects as coherent in 3D space.

Spatial and spatiotemporal completion consist of filling in the gaps in object surfaces that have

been occluded by nearer ones. Solid objects also occlude parts of themselves such that we cannot see their hidden surfaces from our present vantage point, yet our experience of most objects is that of filled volumes rather than hollow shells. Perceiving objects as solid in 3D space despite limited views constitutes 3D object completion. In contrast to spatial and spatiotemporal completion, little is known about development of 3D object completion. We recently addressed this question with a looking time paradigm similar to those described previously (Soska & Johnson, 2008a). Four- and 6-month-olds were habituated to a wedge rotating through 15° around the vertical axis such that the far sides were never revealed (Figure 3.7). Following habituation, infants viewed two test displays in alternation, one an incomplete, hollow version of the wedge, and the other a complete, whole version, both undergoing a full 360° rotation revealing the entirety of the object shape. Four-month-olds showed no consistent posthabituation preference, but 6-month-olds looked longer at the hollow stimulus, indicating perception of the wedge during habituation as a solid, volumetric object in 3D space.
In a follow-up study (Soska & Johnson, 2008b), we used these same methods with a more complex stimulus: a solid “L”-shaped object with eight faces and 12 vertices, as opposed to the five faces and six vertices in the wedge-shaped object described previously. We tested 4-, 6-, and 9.5-month-olds. As in the Soska and Johnson (2008a) study with the wedge stimulus, we found a developmental progression in 3D object completion: 4-month-olds’ posthabituation looking times revealed no evidence for completion, whereas 9.5-month-olds consistently looked longer at the hollow test display, implying perception of the habituation object as volumetric in 3D space. At 6 months, interestingly, only the male infants showed this preference; females looked about equally at the two test displays. (At 9.5 months, the male advantage had disappeared: both males and females looked longer at the hollow shape.)
Data from the Soska and Johnson (2008a, 2008b) studies provide evidence for a developmental progression in infants’ 3D object completion abilities, and for a sex difference in these abilities that is revealed at a transitional period in this skill (6 months), but only when infants’

Habituation: pivots through 15°

Test: rotates through 360°


Figure 3.7 Schematic depictions of displays used to investigate 3D object completion in infants.
Adapted from Soska and Johnson (2008a).

views were tested with relatively complex stimuli. It may be that the infants who were successful at 3D object completion engaged in mental rotation in this task: manipulation of a mental image of the object and imagining it from a different perspective. Mental rotation is a cognitive skill for which men have an advantage relative to women (Shepard & Metzler, 1971; Zacks, 2008), and two recent reports have provided evidence of a male advantage in young infants as well (Moore & Johnson, 2008; Quinn & Liben, 2008). It remains to be determined definitively whether mental rotation is involved in 3D object completion; at present, it is clear that it develops early in postnatal life, alongside spatial and spatiotemporal completion.

Learning to Perceive 3D Object Completion
How does 3D object completion arise? One possibility is that emerging motor skills support perception of objects as coherent volumes. Two types of motor skills, both of which undergo dramatic improvements between 4 and 6 months, may be particularly important: self-sitting ability and coordinated visual–manual object exploration. Independent sitting frees the hands for play and promotes gaze stabilization during manual actions (Rochat & Goubet, 1995). Thus, self-sitting might spur coordination of object manipulation (e.g., rotating and transferring objects hand-to-hand) with visual inspection, providing infants with multiple views of objects. Stroking, poking, turning, and transferring objects hand to hand may promote learning about object form by supplying tactile information and would provide multiple views at the same time.
To examine these possibilities, we tested infants between 4.5 and 7.5 months in a replication of the Soska and Johnson (2008a) habituation experiment with the rotating wedge stimuli (Soska, Adolph, & Johnson, in press). In addition, we assessed infants’ manual exploration skills by observing their spontaneous object manipulation in a controlled setting and obtained parental reports of the duration of infants’ sitting experience. We reasoned that infants who showed a greater tendency to explore objects from multiple viewpoints would also have had more opportunities to learn about objects’ 3D forms outside the laboratory. Thus, within this age range, individual differences in coordinated visual–manual exploration (rotations, fingerings, and transfers with looking at toys) and self-sitting experience should predict individual differences in infants’ looking preferences to the complete and incomplete object displays, our index of 3D object completion.
These predictions were supported. We found strong and significant relations between both self-sitting and visual–manual coordination (from the motor skills assessment) and our measure of 3D object completion (from the habituation paradigm). (Other motor skills we recorded, such as holding skill and manual exploration without visual attention to the objects, did not predict 3D object completion.) Self-sitting experience and coordinated visual–manual exploration were the strongest predictors of performance on the visual habituation task, but it seems that the role of self-sitting was indirect, influencing 3D completion chiefly because of its support of infants’ visual–manual exploration. Self-sitting infants performed more manual exploration while looking at objects than did nonsitters, and visual–manual object exploration is precisely the skill that provides active experience viewing objects from multiple viewpoints, thereby facilitating perceptual completion of 3D form. These results provide evidence for a cascade of developmental events following from the advent of visual–motor coordination, including learning from self-produced experiences.
In principle, 3D object completion might develop from more passive perceptual experiences, but the findings yielded by the Soska et al. (in press) experiment indicate that passive experience may be insufficient to learn about 3D object form. Active exploration provides information to the infant about her own control of an event while simultaneously generating multimodal information to inform developing object perception skills. Coordinating visual inspection with manual exploration seems to be critical: Only the visual–manual skills involved in generating changes in object

viewpoint (rotating, fingering, and transferring while looking) were related to 3D object completion.

I have described a set of object perception skills that develop early in infancy, focusing on three ways in which observers fill in the gaps in perception imposed by occlusion: spatial completion, spatiotemporal completion, and 3D object completion. And I have described experiments designed to examine developmental mechanisms of perceptual completion in infants. The best evidence to date suggests that newborn infants do not fill in the gaps in perception, and therefore, do not perceive objects as do adults. Instead, the visual world of neonates seems to consist solely of surface fragments that have no substance, volume, or continuity. Infants initially appear to process complex stimuli as simple isolated units, and subsequently integrate them into higher-order patterns.
This developmental progression is consistent with a constructivist view of cognitive development, and consistent with developmental patterns in other domains, as described in the remainder of the chapters in this book. A constructivist view attempts to understand development in terms of information processing principles (Cohen & Cashon, 2006). According to this approach, infants process information at the highest level of complexity possible, but this level depends on the infant’s individual capacity and on task demands. Processing capacity expands with development, which helps to explain why older infants and children are better able to see configurations and relations among parts, and it is constrained by task complexity, which helps to explain why perception of intricate, multielement stimuli may break down when perceptual systems become “overloaded” (see the chapter by Cohen in this volume).
Each of the three kinds of filling-in I have described appears to proceed on a separate developmental timetable with a unique set of developmental mechanisms, yet they have in common a prominent role for learning: learning by doing, evinced by the importance of oculomotor and manual action systems in spatial and 3D object completion, respectively, and learning by observation, evinced by the importance of associating views of fully visible and partly occluded trajectories in spatiotemporal completion. The potential role of learning fundamental object concepts and their developmental antecedents is becoming increasingly clear from the experiments I have described in this chapter, in their suggestion that infants use developing perceptual, cognitive, and action systems to learn about the visual environment, and to assemble its constituent parts into coherent wholes.

ACKNOWLEDGMENT
Preparation of this article was supported by NIH grants R01-HD40432 and R01-HD48733.

REFERENCES
Aguiar, A., & Baillargeon, R. (1999). 2.5-month-old infants’ reasoning about when objects should and should not be occluded. Cognitive Psychology, 39, 116–157.
Amso, D., & Johnson, S. P. (2006). Learning by selection: Visual search and object perception in young infants. Developmental Psychology, 42, 1236–1245.
Baillargeon, R. (1995). A model of physical reasoning in infancy. In C. Rovee-Collier & L. P. Lipsitt (Eds.), Advances in infancy research (Vol. 9, pp. 305–371). Norwood, NJ: Ablex.
Baillargeon, R. (2008). Innate ideas revisited: For a principle of persistence in infants’ physical reasoning. Perspectives on Psychological Science, 3, 2–13.
Baillargeon, R., Spelke, E. S., & Wasserman, S. (1985). Object permanence in five-month-old infants. Cognition, 20, 191–208.
Bower, T. G. R. (1967). Phenomenal identity and form perception in an infant. Perception & Psychophysics, 2, 74–76.
Bremner, J. G., Johnson, S. P., Slater, A., Mason, U., Cheshire, A., & Spring, J. (2007). Conditions for young infants’ failure to perceive trajectory continuity. Developmental Science, 10, 613–624.
Bremner, J. G., Johnson, S. P., Slater, A. M., Mason, U., Foster, K., Cheshire, A., et al. (2005). Conditions for young infants’ perception of object trajectories. Child Development, 76, 1029–1043.
Burkhalter, A. (1993). Development of forward and feedback connections between areas V1 and V2 of human visual cortex. Cerebral Cortex, 3, 476–487.
Burkhalter, A., Bernardo, K. L., & Charles, V. (1993). Development of local circuits in human visual cortex. The Journal of Neuroscience, 13, 1916–1931.
Clifton, R. K., Rochat, P., Litovsky, R. Y., & Perris, E. E. (1991). Object representation guides infants’ reaching in the dark. Journal of Experimental Psychology: Human Perception and Performance, 17, 323–329.
Cohen, L. B., & Cashon, C. H. (2006). Infant cognition. In D. Kuhn & R. S. Siegler (Eds.), Handbook of child psychology: Vol. 2. Cognition, perception, and language (6th ed., pp. 214–251). New York: Wiley.
Gredebäck, G., & von Hofsten, C. (2004). Infants’ evolving representations of object motion during occlusion: A longitudinal study of 6- to 12-month-old infants. Infancy, 6, 165–184.
Gredebäck, G., von Hofsten, C., & Boudreau, J. P. (2002). Infants’ visual tracking of continuous circular motion under conditions of occlusion and non-occlusion. Infant Behavior & Development, 25, 161–182.
Haith, M. M. (1998). Who put the cog in infant cognition? Is rich interpretation too costly? Infant Behavior & Development, 21, 167–179.
Johnson, S. P. (2004). Development of perceptual completion in infancy. Psychological Science, 15, 769–775.
Johnson, S. P., Amso, D., & Slemmer, J. A. (2003). Development of object concepts in infancy: Evidence for early learning in an eye tracking paradigm. Proceedings of the National Academy of Sciences, USA, 100, 10568–10573.
Johnson, S. P., & Aslin, R. N. (1995). Perception of object unity in 2-month-old infants. Developmental Psychology, 31, 739–745.
Johnson, S. P., & Aslin, R. N. (1996). Perception of object unity in young infants: The roles of motion, depth, and orientation. Cognitive Development, 11, 161–180.
Johnson, S. P., Bremner, J. G., Slater, A., Mason, U., Foster, K., & Cheshire, A. (2003). Infants’ perception of object trajectories. Child Development, 74, 94–108.
Johnson, S. P., Davidow, J., Hall-Haro, C., & Frank, M. C. (2008). Development of perceptual completion originates in information acquisition. Developmental Psychology, 44, 1214–1224.
Johnson, S. P., & Náñez, J. E. (1995). Young infants’ perception of object unity in two-dimensional displays. Infant Behavior & Development, 18, 133–143.
Johnson, S. P., Slemmer, J. A., & Amso, D. (2004). Where infants look determines how they see: Eye movements and object perception performance in 3-month-olds. Infancy, 6, 185–201.
Kagan, J. (2008). In defense of qualitative changes in development. Child Development, 79, 1606–1624.
Kellman, P. J., & Arterberry, M. E. (1998). The cradle of knowledge: The development of perception in infancy. Cambridge, MA: MIT Press.
Kellman, P. J., & Spelke, E. S. (1983). Perception of partly occluded objects in infancy. Cognitive Psychology, 15, 483–524.
Moore, D. S., & Johnson, S. P. (2008). Mental rotation in human infants: A sex difference. Psychological Science, 19, 1063–1066.
Munakata, Y. (2001). Graded representations in behavioral dissociations. Trends in Cognitive Sciences, 5, 309–315.
Peterhans, E., & von der Heydt, R. (1991). Subjective contours - bridging the gap between psychophysics and physiology. Trends in Neurosciences, 14, 112–119.
Piaget, J. (1954). The construction of reality in the child (M. Cook, Trans.). New York: Basic Books. (Original work published 1937).
Piaget, J., & Inhelder, B. (1956). The child’s conception of space. New York: Routledge. (Original work published 1948).
Quinn, P. C., & Liben, L. S. (2008). A sex difference in mental rotation in young infants. Psychological Science, 19, 1067–1070.
Rochat, P., & Goubet, N. (1995). Development of sitting and reaching in 5- to 6-month-old infants. Infant Behavior & Development, 18, 53–68.
Rosander, K., & von Hofsten, C. (2004). Infants’ emerging ability to represent occluded object motion. Cognition, 91, 1–22.
Shepard, R. N., & Metzler, J. (1971). Mental rotation of three-dimensional objects. Science, 171, 701–703.
Slater, A. (1995). Visual perception and memory at birth. In C. Rovee-Collier & L. P. Lipsitt (Eds.), Advances in infancy research (Vol. 9, pp. 107–162). Norwood, NJ: Ablex.
Slater, A., Johnson, S. P., Brown, E., & Badenoch, M. (1996). Newborn infants’ perception of partly occluded objects. Infant Behavior & Development, 19, 145–148.
Slater, A., Johnson, S. P., Kellman, P. J., & Spelke, E. S. (1994). The role of three-dimensional depth cues in infants’ perception of partly occluded objects. Early Development and Parenting, 3, 187–191.
Slater, A., Morison, V., Somers, M., Mattock, A., Brown, E., & Taylor, D. (1990). Newborn and older infants’ perception of partly occluded objects. Infant Behavior & Development, 13, 33–49.
Soska, K. C., Adolph, K. E., & Johnson, S. P. (in press). Systems in development: Motor skill acquisition facilitates 3D object completion. Developmental Psychology.
Soska, K. C., & Johnson, S. P. (2008a). Development of 3D object completion in infancy. Child Development, 79, 1230–1236.
Soska, K. C., & Johnson, S. P. (2008b). Infants’ 3D object completion in complex displays. Manuscript in preparation.
Spelke, E. S. (1990). Principles of object perception. Cognitive Science, 14, 29–56.
Spelke, E. S., Breinlinger, K., Macomber, J., & Jacobson, K. (1992). Origins of knowledge. Psychological Review, 99, 605–632.
Valenza, E., Leo, I., Gava, L., & Simion, F. (2006). Perceptual completion in newborn human infants. Child Development, 77, 1810–1821.
von Hofsten, C., Kochukhova, O., & Rosander, K. (2007). Predictive tracking over occlusions in 4-month-old infants. Developmental Science, 10, 625–640.
von Hofsten, C., Vishton, P., Spelke, E. S., Feng, Q., & Rosander, K. (1998). Predictive action in infancy: Tracking and reaching for moving objects. Cognition, 67, 255–285.
Zacks, J. M. (2008). Neuroimaging studies of mental rotation: A meta-analysis and review. Journal of Cognitive Neuroscience, 20, 1–19.

Numerical Identity and the Development
of Object Permanence

M. Keith Moore and Andrew N. Meltzoff

Numerical identity refers to an object being the selfsame individual over time. Our principal way of knowing an object’s numerical identity is by tracing its spatial history. This is how we find “our” Coke can on a table full of Coke cans that all look alike. Numerical identity allows us to construe the changes in an object’s appearance, location, motion, orientation, and visibility, as different manifestations of a single object rather than as many objects. It enables us to differentiate an encounter with a new object from a reencounter with the same one again. In this chapter, we propose that infants’ developing understanding of numerical identity underlies their discovery of object permanence. We also suggest a mechanism for developmental change that derives from this view.
Object permanence refers to the fact that material objects are preserved over breaks in perceptual contact. When an occluder moves in front of an object, adults understand that the occluder blocks visual access to it. They know that the object still exists in a specific location in the world for every moment it is occluded. This understanding is what we mean when we say that objects are permanent over occlusion events. Thus conceived, object permanence provides a powerful tool for extracting structural regularity from experience.
Psychologists have been fascinated by infant object permanence ever since Piaget (1954) described the curious fact that young infants would not search for a highly desired object when it was hidden. For Piaget, a central task for infants was to extract an independent, enduring concept of objects from the infants’ sensorimotor experience with them. Piaget’s first key theoretical assumption was the primacy of the role of action. In early infancy, to “know” an object was to act upon it. Development derived from relating actions to one another and to consequences in the perceptual world (sensory–motor connections). The second key assumption was that a lack of sensory contact, especially invisibility, was an insurmountable problem for young infants. When sensory contact with objects was lost, objects ceased to exist for the infant (“out of sight is out of mind”). The development of representation around 18 months of age was postulated as the way infants transcended a purely sensorimotor world to realize that objects were permanent over all occlusion events. An object was not deemed to be fully independent of perception and action, and thus permanent, until infants could represent the invisible movements of an object that was stationary when it was occluded, an invisible displacement in Piaget’s terms, at 18–24 months of age. His age-ordered search tasks were seen as measures of progress toward that end (see Table 4.1).
Many studies have replicated Piaget’s stages of search for hidden objects. Yet, modern empirical research has largely undermined his key theoretical assumptions. We are left then with


Table 4.1 Summary of Piaget’s Stages of Object Permanence Development

Stage | Age (months) | Sensorimotor Level | Object Permanence | Manual Search Behavior
1 | 0–1 | Reflex repetition | None | None
2 | 1–4 | Reflexes are coordinated by action on a common O | Search is only an extension of current action; visual tracking | None
3 | 4–8 | Action is differentiated from its result; acts to prolong the result | Manual acts on part to conserve visible whole O | Finds partially hidden O, but not one totally hidden
4 | 8–12 | Actions can be coordinated to achieve results; means–ends acts | O’s existence is dependent on the last action on O | Finds totally hidden O in one location, but returns there if O is hidden in a new location (the A-not-B error)
5 | 12–18 | Explores all variations of a new means act discovered by chance | O’s existence depends on prior perception, but not prior action | Finds O where last displaced visibly, but not if moved invisibly
6 | 18–24 | Representation allows invention of new means; hidden causes | O is independent of action and perception because represented | Finds O after invisible displacement from its last visible location

Note. O indicates object.

the puzzle of his ordered sequence in infants’ manual search for hidden objects. A comprehensive theory of object permanence should explain the development of manual search for occluded objects and the invariant ordering of these steps.

Here we offer a solution to this puzzle that does not rely on Piagetian theory. In our view, the fundamental issue of object permanence is how infants use the visible transformations of their perceptual world, such as an object’s occlusion and disocclusion, to develop an understanding of an invisible world that links these visible events. The infant’s primary data are their encounters with objects disappearing and reappearing, which immediately poses a question about numerical identity. Thus, we propose that the origins and development of object permanence are preceded by development in infants’ understanding of how to determine and trace numerical identity. We call this view the identity development (ID) account of object permanence (Moore & Meltzoff, 1999).

Identity Development Account’s Relation to Other Theoretical Positions

Several strands of contemporary research, including ours, have been influenced by Bower’s (1967, 1971) assertion that infants’ notion of object identity influences their behavior. Studies building on this insight have explored how infants individuate different objects to determine how many are involved in a visual event (e.g., two objects seen simultaneously in different locations are different objects; Wilcox & Baillargeon, 1998; Xu & Carey, 1996). Leslie and colleagues (e.g., Leslie, Xu, Tremoulet, & Scholl, 1998) demonstrated that object identification (i.e., distinguishing which objects are involved in an event) is a related but more difficult task than individuation (for a review, see Krøjgaard, 2004). Still another strand of research on object identity has focused on how infants determine that the object before them is the same unique individual that they encountered previously, that is, that it

is the same one again (e.g., Moore, Borton, & Darby, 1978; Moore & Meltzoff, 1978). In this strand, infants’ notions of numerical identity are said to develop and change as they experience objects in the world.

Structure of the Argument

As required of all developmental theories, the ID account has the burden of specifying (a) the foundational primitives underlying the earliest notions of object identity, (b) the principles that determine the course of successive developments, and (c) a mechanism of change accounting for how the transition from having no concept of permanence to having permanence occurs. This is a substantial challenge, and few developmental theories have met it.

We turn first to the theoretical assumptions and hypotheses of the ID account, and then take up the empirical methods needed to test it using manual search, and new evidence obtained with such methods. We then propose a detailed mechanism of change for the transition from treating occluded objects as impermanent to treating them as permanent. We conclude by evaluating four theories of object permanence: Piagetian, dynamical systems, nativist, and the ID account.

The ID account utilizes three theoretical terms that are often conflated: representation, identity, and permanence. We wish to differentiate them and show the resulting implications for describing infants’ understanding of object permanence.

Identity and Permanence

The first fundamental assumption of the ID account, and one that cannot be overstressed, is that the infant’s notion of the relation between permanence and numerical identity is radically different from that of adults. For adults, permanence entails identity, and identity entails permanence. Adults do not interpret an object as being permanent over a disappearance–reappearance unless they got the same object back. The permanence judgment depends on identity. Conversely, an adult does not interpret such an event as two encounters with the same object unless it continued to exist between encounters. The identity judgment depends on permanence. We hypothesize that as infants begin to understand permanence, it is only understood for certain kinds of disappearances and not others. We capture this by saying that permanence is constrained to the kinds of disappearance events that the infant can construe as preserving the numerical identity of the object. Thus, permanence depends on identity, but not the other way round.

In our view, the infants’ prepermanence world is stranger still. They can determine object identity but do not treat objects as permanent. To illustrate this by analogy, infants’ unusual cognitive representation of their prepermanence world would be like projecting an adult’s 3-D perceptual world onto a 2-D TV screen. All of the interactions of objects would be visible because there are no invisible dimensions. Objects disappear at edges by deletion and reappear by accretion; there is no image overlap, so nothing is hidden. In this 2-D world, an individual image can be reidentified after absences on the basis of its place or trajectory of motion, without requiring that it be somewhere between appearances. It is spatiotemporally the same image, but it does not exist constantly, because there is nowhere for it to exist out of sight.

Identity and Representation

A second assumption of the ID account is that the infant representational system can relate a currently perceived object to a stored representation of that object. Identity criteria provide a means of linking the currently perceived object to its previously formed representation (i.e., the criteria describing the object representation match the criteria of the perceived object). We have argued that infants have such a representational system from birth, and that it is sufficient to maintain the numerical identity

of visible objects participating in events with visible outcomes in a steady-state world (for example, reidentifying objects after looking away from them) and to enable infants to learn to predict object appearances after disappearances (for details, see Meltzoff & Moore, 1998).

Representation and Permanence

Another tenet of the ID account is that a further change in the representational system is needed to account for permanence. Object permanence, as we define it, is not simply maintaining a representation in mind, no matter how long it lasts. Nor is it reidentifying the object as the same one again after it disappears and then reappears. Object permanence is the understanding that an individual object, while it is still invisible, continues to exist in a hidden location in the external world. To encompass permanence, the representational system has to link the representation of the object and the representation of its location, while neither object nor location is currently visible. When this is achieved, the infant can be said to know where the object is while it is out of sight. Such understanding is necessary to support intentional, permanence-directed search for an occluded object.

IDENTITY DEVELOPMENT ACCOUNT

In this section, we elucidate a series of 10 interlocking hypotheses that comprise the ID account. The series describes the development and interrelationship of infants’ notions of identity and permanence over the first 2 years of life. Because they are hypotheses, we cite relevant evidence where available. Finally, we propose a theoretically appropriate way to describe occlusion events.

Identity and Permanence Development: 10 Hypotheses

1. The fundamental criteria for numerical identity are spatiotemporal parameters. This idea draws on “quantitative” or “numerical” identity as described by philosophers (e.g., Strawson, 1959). The primary way of knowing that an object at one point in time is numerically identical to an object perceived at another point in time is by tracing the object’s spatiotemporal history between these points of contact: If it is in the right place in space at the right time whenever it is seen, it is numerically the same object. The psychological reality of this analysis has been demonstrated by the use of spatiotemporal coordinates to address “object files” in studies of adult attention (Kahneman, Treisman, & Gibbs, 1992; Treisman, 1992) and object identity and indexing in infants (Bower, 1982; Carey & Xu, 2001; Leslie et al., 1998).

2. Infants are innately prepared for a Newtonian world operating according to the first law of kinematics: Objects at rest remain at rest; objects in motion continue in motion. Infants are evolutionarily prepared for interacting with objects in a Newtonian steady-state world, and the first spatiotemporal distinction is whether the object is at rest or in motion. The spatiotemporal parameters that capture this distinction are its place in space for a stationary object or its trajectory of motion for a moving object. Neuroscientists have shown that the location of objects in space and their trajectories of motion can be established by perceptual processing (Haxby et al., 1991; Köhler, Kapur, Moscovitch, Winocur, & Houle, 1995; Watamaniuk & McKee, 1995; Watamaniuk, McKee, & Grzywacz, 1995).

To be “evolutionarily prepared” does not mean that infants are born with an adult-like notion of trajectory, for example, but rather that they are predisposed to detect a trajectory of visual motion (the constant movement of a visual feature in a particular direction) from a background of random-direction noise. These “trajectory detectors” are thought to be higher-level units in the visual system extracting coherent signals in space and time from lower-level motion detectors (Grzywacz, Watamaniuk, & McKee, 1995). Such evolutionary preparedness underlies the development of smooth pursuit visual tracking, the perception of object trajectories, and their representation (Aslin, 1981;

Bremner et al., 2005; Johnson, Bremner et al., 2003).[1]

[1] A trajectory of motion can be described as a vector specifying direction and speed. As an example of the predisposition to detect trajectories, 4-month-old infants visually extrapolated the left to right order in the sequential illumination of a linear array of lights into the space beyond the array. This occurred even though there was no “object” in motion and the pattern of illumination did not continue in that direction (Haith, Kessen, & Collins, 1969). A broad range of infant behavior is consonant with the idea that a notion of trajectory underlies it: The prospective control of head tracking and reaching for visibly moving objects (von Hofsten, Vishton, Spelke, Feng, & Rosander, 1998); learning to extrapolate the trajectory of a moving object across an occluder to predict its reappearance (Johnson, Amso, & Slemmer, 2003; Rosander & von Hofsten, 2004); the facilitation of predictive tracking over occlusions by unoccluded trajectory experience, and subsequent generalization to a new trajectory direction (Johnson, Amso, et al., 2003); and adjusting the time of an object’s expected appearance over varying occluder widths by the trajectory’s velocity (von Hofsten, Kochukhova, & Rosander, 2007). Conversely, when a constant velocity is not maintained, reappearances are not predicted (e.g., if the object is decelerating as it disappears; Rosander & von Hofsten, 2004) and overtrial learning to do so does not occur (e.g., if the object moves constantly while behind an occluder on some trials and delays behind it on others; Bertenthal, Longo, & Kenny, 2007).

3. The spatiotemporal parameters of an object’s place and/or trajectory act as identity criteria, allowing the object to be identified as the same one again after breaks in perceptual contact. The earliest identity logic used by infants is that a stationary object encountered in the same place as one seen previously in that place is the same object again. Similarly, a moving object encountered on the same trajectory of motion as one seen previously on that trajectory is the same object again. Spatiotemporal parameters initially override featural appearance in judgments of numerical identity. Young infants do not treat a pre- to postdisappearance change of object features as specifying a different object as long as the altered object reappears in the place or on the trajectory established by their first encounter with it (Bower, Broughton, & Moore, 1971; Krøjgaard, 2007; Newcombe, Huttenlocher, & Learmonth, 1999; Van de Walle, Carey, & Prevor, 2000; Wilcox & Baillargeon, 1998; Xu & Carey, 1996).

4. Experience from repeated encounters with visible objects allows infants to learn which object properties are preserved over events (so long as the object can be construed as the same one, i.e., in the same place or on the same trajectory). Initially, an object’s properties play no role in judgments of numerical identity (#3 above). Thus, the utility of object properties to confirm or disconfirm numerical identity is learned (although there is some disagreement over the age at which this learning occurs; see Krøjgaard, 2007; Van de Walle et al., 2000).

5. Young infants use the spatiotemporal parameters to reidentify an object as the numerically same individual over a disappearance–reappearance event, without implying that the object was located anywhere in the external world during the period of occlusion. Initially, infants are using the spatiotemporal parameters to identify individual objects over changes in the visible world, and even to anticipate where the same one is likely to be seen again (e.g., extrapolating an object’s visible trajectory across an occluder to anticipate its next appearance in the visible world). These spatiotemporal identity criteria provide an overarching structure, allowing young infants to extract predictable regularities from visible events. However, unlike those of adults, the criteria do not specify the object’s location while it is invisible, which is consistent with young infants’ failure to search for occluded objects (Meltzoff & Moore, 1998). There is broad consensus for such failure before 8 or 9 months of age, despite infants’ ability to anticipate reappearances.

6. Object permanence is the understanding that a particular object continues to exist in an invisible location or on an invisible trajectory in the external world during the period of occlusion or break in perceptual contact, that is, while it is still invisible. Permanence refers to a state of affairs that is beyond the infant’s perception. It is the basis for infants’ prediction of an object’s occluded location after disappearance and during the time when it is still invisible. Such predictions about the object’s location while it cannot be seen provide the goals for infants’ intentional search acts and are the hallmarks of permanence-governed search.

7. Object permanence develops from numerical identity. An infant must be able to construe

the disappearance and reappearance of an object as involving a single individual, a numerical identity, before the answer to where the object was located during the period of occlusion can be obtained. Unless numerical identity can be established, objects appearing after an occlusion are new and different ones, rather than reappearances of the same one again. And, if new and different objects are popping into view after occlusions, the question of what happens to a single object between appearances, while it is invisible, never arises and could not be learned “from experience.” Numerical identity renders this problem solvable.

Disappearance events can be described in spatiotemporal terms relevant to numerical identity as the places and trajectories of objects and their occluders over the time course of an occlusion. Rather than describe disappearances in terms of the recovery actions needed for search (as Piagetian theory did), the ID account describes them in terms of the places and trajectories of all the objects involved. For example, “a stationary object’s occlusion in place by the movement of an occluder” specifies one type of disappearance transform and implies that the object can be reidentified as the same one again by its place of disappearance–reappearance (Moore & Meltzoff, 1999).

Since the spatiotemporal parameters serve as criteria for reidentifying the reappearing object as the same one again (see #5 above), the nature and development of these spatiotemporal parameters for numerical identity provide the skeleton underlying permanence development. In other words, the age ordering of disappearance transforms over which infants treat objects as permanent depends on the order of disappearance transforms for which the numerical identity of an object can be maintained. When infants can understand a disappearance transform as one in which “the same object has come back,” they can then use subsequent experience to learn that the object is permanent over this disappearance transform. Thus, we say infants’ understanding of permanence is dependent on the type of disappearance transform involved, or “permanence is transformationally dependent knowledge.”

8. Initially, object permanence understanding is an interpretation infants make of observed physical events that satisfy two conditions: the object participating in an occlusion event is identified as a single individual, and both the object and its occluded location can be independently represented. Permanence understanding unites the object and its hidden location by an interpretation of the occlusion event (a deduction based on the occlusion) that links the now-hidden but represented object with the now-hidden but represented location. This is the representational basis for infants’ knowing where that particular object is after it disappears. We hypothesize that development proceeds by infants at first reinterpreting the event after the reappearance of the object (“the same object was there before it reappeared”), and only with further experience is the interpretation prompted by the occlusion event itself.

9. Once objects as wholes are interpreted as permanent over a particular class of disappearance events (a disappearance transform), further experience with that same transformation allows infants to learn which object properties are also preserved over that disappearance transform. This learning process is parallel to the one in hypothesis #4 above except that now it is based on the object’s permanence over the disappearance transform, and the preserved properties are taken to be permanent properties of the object in its occluded state.

Infant cognition is conservative. Infants do not assume that all properties of a predisappearance object are preserved in that same object postdisappearance. Conservation of the whole has priority, and initially infants accept a reappearing object satisfying the spatiotemporal identity criteria regardless of its visual features or function. They then learn that some properties, such as orientation and perspective, often are not preserved over occlusions; and that others, such as shape and functional properties like sound, usually are (referred to here as the object’s “distinctive features and functions”).

10. Once some object features and functions are also known to be permanent over a particular disappearance transform, they can play an independent role in determining an object’s

numerical identity for that transformation. Object features and functions that are permanent over a particular transform allow infants to use three identity criteria (spatiotemporal, featural, and functional) to determine numerical identity. Now infants do not have to accept a featurally or functionally different object as the same one again just because its reappearance satisfies the spatiotemporal identity criteria. For example, an object reappearing in the expected place with the wrong (unexpected) features or functions given the disappearance transform could lead to further search for the original object. Similarly, if an object was moved to a new location when infants were not watching, they can weigh whether the one they see is the same as the one that disappeared (because it looks the same but is in the wrong place). The answer is not completely determined by its location; all three of the identity criteria can be taken into account in decision-making.

Describing an Occlusion Event

An occlusion event can be characterized by three components, all of which bear on permanence understanding: (a) the psychophysics of the transition to invisibility, (b) the degree of object occlusion, and (c) the type of disappearance transform. All three components are incorporated in the ID account.

(a) Psychophysics of transition: Michottean disappearance events. Different types of visual events are specified psychophysically by the nature of the transition to invisibility (Michotte, 1962) and have been shown to differentially affect looking, sucking, predictive tracking, and electroencephalogram (EEG) responses in young infants (e.g., Bertenthal, Longo, & Kenny, 2007; Bower, 1967; Kaufman, Csibra, & Johnson, 2005). For example, a progressive deletion of the visible portion of an object at an edge is a necessary, though not always sufficient, condition to perceptually specify that the object slipped behind/under the edge during the transition, and has not been destroyed by the disappearance (Bremner et al., 2007; Gibson, Kaplan, Reynolds, & Wheeler, 1969). In the ID account, this innate, Michottean perceptual mechanism serves as a filter, separating out disappearance transitions that destroy the object (e.g., implosions, dissolutions, instantaneous disappearances, etc.) from ones that do not. Only events that survive this filtering engage the next two components and feed into infants’ determination of identity and permanence.

(b) Degree of object occlusion. The degree of object occlusion refers to the extent of occlusion, that is, how much of the whole object is occluded (totally, partially, or not obscured at all; Moore & Meltzoff, 2008).

(c) The disappearance transform. Both descriptions of the Michottean transition and the degree of occlusion apply to disappearance transforms, but a transform is not reducible to them. As we define the term, the “disappearance transform” describes the spatiotemporal arrangement of object(s) and occluder(s) over the entire course of the occlusion event (e.g., the occlusion of a stationary object in place by the movement of an occluder). A disappearance transform refers to a class of equivalent events; they are spatiotemporally equivalent. Thus, any total occlusion of a stationary object in place by the movement of an occluder is the same disappearance transform; the objects, locations, and occluders can all vary. This means that many events, which are different on the surface, can be grouped as the same abstract disappearance transform. In the ID view, it should not matter to infants whether a cloth covers a stationary object or a vertical barrier is placed in front of it; both are occlusions of a stationary object in place.

Testing the ID account of infants’ object permanence development presents two major empirical challenges. Object permanence refers to infants’ understanding of a postocclusion state of affairs. The first challenge is how to assure that infants’ search acts are actually launched on the basis of the object in its occluded state, rather than on some other basis. We call this the “occluded object standard.” The second challenge is how to assess whether infants represent

the object in a specific, invisible location. We call this the “invisible location standard.” The point of this section is to provide the logic of an empirical method that can meet both standards and to explain why it is necessary to adopt these safeguards in order to be sure one is tapping infants’ object permanence understanding rather than some lower-order action.

Occluded Object Standard

In order to force infants to act on their representation of the occluded object while it is invisible, we hide the object while it is out of reach. Infants are thus prevented from initiating search until after the occlusion is complete and they are brought back within reach. This procedure protects against one kind of artifact: continuations of search action already in progress before the disappearance is complete. From that point on, any action taken toward the hidden object would have to be governed by their representation of the object in its occluded state.

There are other potential artifacts that must also be prevented: (a) acts based on prior practice with occluder removal in the test situation (e.g., extensive warm-up trials in a study); (b) acts based on clues from the experimenter such as drawing attention to an occluder by “touching it last” (Diamond, Cruttenden, & Neiderman, 1994; Smith, Thelen, Titzer, & McLin, 1999); or (c) acts based on contingencies set up by the experimenter accidentally or by training (e.g., continued testing after chance success has uncovered the object in a particular place). Search based on any of these does not meet the occluded object standard because it need not be based on the object’s disappearance (Moore & Meltzoff, 2008).

Invisible Location Standard

Under the conditions above, correct manual search coupled with spatially directed visual anticipation of the object’s reappearance locus is evidence about where infants think the object is located while it is out of sight. Such behavior implies that the location of the object is represented while both the object and its location are occluded.

In short, if infants’ search acts are initiated after the object is fully occluded, and if infants are looking to where the object should reappear as a consequence of their acts, before the object is visible, then such acts are valid evidence of object permanence. Permanence measured this way is called the strong form of object permanence for clarity, because it meets both standards. These more rigorous requirements lead to slightly more conservative age estimates than studies that use “occluder removal” alone as a direct measure of permanence. We believe that these precautions allow a more valid measure of infants’ object permanence understanding, and will use them in assessing the ID account.

EMPIRICAL EVIDENCE

Of the 10 ID account hypotheses, those numbered 1–4 have substantial empirical support. There is less evidence bearing on hypotheses 5–10, because few studies have assessed the strong form of object permanence until recently. We turn now to consider such evidence.

Is a Strong Form of Object Permanence Needed to Account for Search Behavior in Infancy?

Studies of object permanence, whether using visual habituation or manual search methods, are typically conducted within one spatial setting, with the infant situated in one position within it. Thus, an object’s permanence may be put in doubt by an occlusion, but permanence of the spatial setting is preserved through unbroken perceptual contact. Many other circumstances in infants’ lives lead to an object’s disappearance where the setting is changed before they ever see it reappear or get an opportunity to act. Infants are removed from the setting, they go to sleep, they travel, and objects are moved to new settings unobserved by them. Obviously, the adult notion of permanent objects is rich enough to encompass these situations. For the adult, absent objects continue to exist in hidden locations after all forms of perceptual contact with the original setting have been severed. And, if another agent has moved

the objects, the adult believes they continue to exist in some new location. Do infants view the world in this way? Most studies of object permanence, regardless of method, are silent on this fundamental point.

In a recent study, 14-month-old infants watched an object being hidden, left the test environment, and returned 24 h later. The results showed that when they were brought back to the same room the next day, they searched successfully (Moore & Meltzoff, 2004). This test satisfies both standards for strong object permanence. Successful search under these conditions shows that in addition to representing the object, a representation of the hiding place was also set up at the disappearance event (concordant with hypothesis #8). For at least one basic disappearance transform, “the occlusion of an object in place,” 14-month-olds’ search after a 24-h break suggests that the object’s existence in the world is not dependent on maintaining any kind of perceptual contact with the disappearance locale.

Does Numerical Identity Play a Role in This Strong Form of Infant Object Permanence?

The ID account holds that the aim of infants’ permanence-governed search is to recover exactly the same object that disappeared. Leaving the locale of an object’s disappearance and returning after 24 h poses a question of numerical identity. If one returns to the same locale, the object hidden on day 1 could be found here; but, if this is a different locale, then the expectation should be that the original object could not be found here. To test this idea, we instituted a “room change” condition. The findings were that the 14-month-old infants in the room-change group did not search while the same-room infants searched successfully (Moore & Meltzoff, 2004). This result comports with the idea that infants were seeking the original object and supports hypothesis #7 that numerical identity underlies permanence-governed search.

A new behavior was also discovered that points up the importance of numerical identity to object permanence. In these experiments, no object was in the hiding place on day 2, so no infants found it there. When the original object was later shown in the middle of the room to infants who had seen it hidden on day 1, they engaged in “verifying search.” They went across the room to the hiding place and looked inside, even though the object was in full view (Moore & Meltzoff, 2004). Despite the fact that the features and functions of the visible object matched the one they saw hidden, they checked in the hiding place before playing with it. Our interpretation of this behavior is that the 14-month-olds were searching the disappearance place to verify that the original object was not there. This would help them determine if the visible object was the numerically correct individual or merely one that looked and acted like it. This behavior supports hypothesis #10 on the interplay of the three identity criteria, because the object’s features and functions were sufficient to tentatively identify it (as the same one), and the spatiotemporal information (the place of its expected reappearance) was used to confirm or disconfirm that provisional identity.

Taken together, the room-change results and the verifying search behavior suggest that these infants were seeking, in the same hiding place, within the same disappearance locale, the selfsame object that they saw hidden on day 1. Violating the global spatiotemporal criterion for the identity of a stationary object (the room) led to no search at all if it was the wrong locale. Violating the local spatiotemporal criterion (the place in the room) led to verifying search if the object was in the wrong place within the correct global locale. The role of numerical identity in object permanence understanding provides an explanatory concept for both behavior patterns.

Is a Strong Form of Object Permanence Present from Birth?

If the strong form of object permanence is present at some time during infancy, is it present at all times? The classic argument has been that attempts to answer this question using manual search tend to underestimate competence because of “performance constraints” (e.g., Baillargeon, Graber, DeVos, & Black, 1990).

A recent study investigated whether four commonly cited performance constraints presumed to limit infant search actually caused failures: motor skills, means–ends coordination, spatial understanding, and memory span (Moore & Meltzoff, 2008).

A new partial occlusion task was used to assess whether 8.75-month-old infants had the means–ends coordination and motor skills needed to remove an occluder (see also Johnson, this volume, for more on partial occlusions). In the standard Piagetian task, the visible part extends toward the infants, and they typically pull on the visible part because it is close and easy to reach. In the new task, the object’s visible part projected laterally from the occluder, so both part and occluder were equally available. The first question was whether the infants would recover the object by removing the occluder. If infants do this, their lifting or displacing of the occluder demonstrates the same motor and means–ends skills needed to remove it on total occlusions. The next question was whether the infants who removed the occluder on partial occlusions also removed it on total occlusions, as would be expected if they understood permanence. The findings showed they did not: Fully half of the 32 infants tested at 8.75 months had the requisite skills, but only two used them to remove the occluder from a totally hidden object (Moore & Meltzoff, 2008, Experiment 1).

If motor skills and means–ends coordination were not the limiting factors, what was the impediment? Bower (1982) has argued that a source of difficulty is the spatial relationship between an occluder and the object it occludes. He predicted that when some distance separates a stationary object and a totally occluding vertical screen, the object is perceived as behind the occluder. However, if there is no spatial separation between object and occluder (e.g., under cloths or inside cups), the search task is more difficult, because the occluder appears to be taking the place of the object rather than hiding it during the disappearance event. Therefore, he argued that using cloth occluders would underestimate infants’ understanding of object permanence. When we tested this idea with 8.75- and 10-month-old infants, there were no differences in search success whether the object was behind an occluder or under an occluder: The younger infants failed even when the object was behind an upright occluder (Moore & Meltzoff, 2008).

Another commonly cited performance constraint for young infants concerns memory. If the memory span required by a total occlusion were too great, infants might forget the object before they could search (Diamond, 1985; Harris, 1987). This limitation was addressed by hiding an object that emitted a continuous sound to prevent forgetting. Even with this memory aid, 8.75-month-olds did not succeed. Older infants were also tested. The introduction of the sounding object more than doubled the success rate for the 10-month-olds, but it did not help the younger infants (Moore & Meltzoff, 2008, Experiment 2).

In sum, these findings show that infants at 8.75 months of age possess the requisite skills to search for the hidden object. But they did not search. Taken together with the fact that 14-month-olds demonstrate the strong form of object permanence for this same disappearance transform, we infer that a notion of permanence begins to develop between 8.75 and 10 months of age and is quite robust by 14 months.

If Object Permanence Develops, Is the Change Once and for All or a Series of Steps?

In the work discussed thus far, only the degree of the object’s occlusion has been manipulated: partial hidings are easier to solve than total hidings. On the ID account, however, changes in the type of disappearance transform should also affect search success even when the degree of occlusion is exactly the same. A study testing this idea compared two types of total occlusions in which the same object was hidden in the same place behind the same screen (Moore & Meltzoff, 1999). If infants solved one task but not the other, this task differentiation could not be attributed to the types of performance constraints previously mentioned, because the same search response to the same totally

[Figure 4.1 panel titles: Occlusion-in-place (left column); Occlusion-on-a-carrier-to-a-place (right column)]

Figure 4.1 Two types of disappearance transforms. The left column depicts occlusion-in-place and
the right column depicts occlusion-on-a-carrier-to-a-place. Occlusion-in-place begins with the exper-
imenter carrying the object to a place on the table next to the folded cloth and depositing it there. The
occlusion occurs by unfolding the cloth over the object. The experimenter’s hand then returns to the
starting point in the center of the table. Occlusion-on-a-carrier-to-a-place begins with the experi-
menter carrying the object toward the cloth. The occlusion occurs as the object goes under the cloth;
it is then deposited on the table. The experimenter’s hand returns to the starting point in the center of
the table. Adapted from Moore & Meltzoff (1999).
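The contrast between the two transforms in Figure 4.1 can be made concrete with a small sketch. It is only an illustration of the place-rule reading discussed in this chapter, not a model the authors propose: the names `Transform`, `predict_search`, and the field `place_seen_empty` are hypothetical, and the rule is reduced to "expect the object where it disappeared."

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transform:
    name: str
    disappearance_place: str  # surface the object rests on as it goes out of sight
    resting_place: str        # where the object actually is after the event
    place_seen_empty: bool    # does the disappearance place re-emerge visibly empty?


def predict_search(t):
    """Toy place rule (an assumption): the object is expected where it disappeared."""
    expected = t.disappearance_place
    if t.place_seen_empty:
        # The only place the same object could be is visibly empty: nowhere to search.
        return "no search", False
    return f"search {expected}", expected == t.resting_place


IN_PLACE = Transform("occlusion-in-place", "table", "table", place_seen_empty=False)
ON_CARRIER = Transform("occlusion-on-a-carrier-to-a-place", "carrier", "table",
                       place_seen_empty=True)

for t in (IN_PLACE, ON_CARRIER):
    print(t.name, predict_search(t))
```

Under these assumptions the same rule yields successful search for the first transform and no search for the second, even though the object ends up in the same location in both.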

occluded object, in the same spatial location was needed in both (Figure 4.1). In one occlusion, a stationary object in place on the table was totally hidden by the movement of a screen. In the other, the object was carried under the stationary screen by hand, occluding it on the carrier/hand, and deposited on the table; then the carrier emerged empty.

Three age groups were tested and the occlusion-in-place was significantly easier than the occlusion-on-a-carrier-to-a-place at 10 and 12 months of age; only the 14-month-old group was equally successful on both tasks. Even though a majority of the younger infants succeeded on the total occlusion-in-place, only a small minority succeeded on the total occlusion-on-a-carrier-to-place. Importantly, therefore, the permanence understanding that enables success on one total occlusion is insufficient to solve the other. In fact, about 4 months elapse before a majority of infants can solve both tasks. These findings suggest that object permanence is not a once-and-for-all attainment: The nature of the disappearance transform matters.

Identity Development Interpretation of the Empirical Data

On the ID account, the occlusion-in-place is a total disappearance of an object at rest on the table by the movement of a cloth occluder. In an occlusion-on-a-carrier-to-place, an object at rest on the carrier is moved under the cloth. At this point in the occlusion-on-a-carrier, both tasks occlude an object that is at rest relative to the surface it is on (table or carrier). For both, the object would be identified by the spatiotemporal criterion of place of disappearance and expected to reappear there. For infants who understand permanence for the occlusion-in-place transform, the object in its invisible state continues to exist in that place (table or carrier) for both tasks. However, in the occlusion-on-a-carrier, the object is deposited on the table under the cloth and the carrier is withdrawn empty. No object is present on the carrier where that same one would be expected. If infants use their permanence understanding to uncover the place of disappearance in order to find the hidden object, on the occlusion-in-place task

they succeed because they uncover the place on the table; but on the occlusion-on-a-carrier task they fail, because the disappearance place (the carrier) is empty. Thus, the identity criterion or rule that underlies infants’ comprehension of one task leads to noncomprehension of the other.²

In terms of numerical identity for the 10- and 12-month-old infants, the original object has to reappear on the carrier to be the same object; there is no other place in the external world where that object could be identified as the same one. If numerical identity guides search, it also follows that there is nowhere else to search for that same object. The 10-month-olds who failed this task provide support for this interpretation because the overwhelming majority of them did not search at all (even when they searched correctly on the other occlusion task).

In terms of permanence, if there is no place in the world for that same object to be after disappearance except on the carrier, then, when the carrier is empty, it is evidence for the infant that the object is not permanent. In this sense, one transform preserves the object over an occlusion (permanent for occlusion-in-place) and the other transform does not (impermanent for occlusion-on-a-carrier-to-place). This provides support for a key claim of the ID account: An infant can understand objects as permanent for one type of disappearance transform, but still think that objects are not permanent and are nowhere to be found under a different transform. This is what we mean when we say object permanence understanding is transformation-dependent; it is not an all-or-none attainment. The pattern of these findings and their interpretation suggests that permanence development proceeds in a transformationally dependent manner and also that the steps can be described by the spatiotemporal parameters for numerical [identity].

Violations of Strong Object Permanence Cause Negative Emotion in Infants

Adults can be driven to distraction when something important is inexplicably lost. If object permanence were an equally fundamental understanding of the world for infants, then violations of permanence should generate strong negative affect: conflict, upset, and avoidance. The 10-month-olds’ response to the occlusion-on-a-carrier-to-place in the Moore and Meltzoff (1999) study provides a test of this idea. As argued above, the empty carrier emerging from under the occluder violates the place rule for where the hidden object should be. By contrast, the place rule is not violated when the object disappears by an occlusion-in-place, because a majority of infants at this age understand that the object still resides in the now invisible place where it disappeared. Therefore, there should be a difference in their emotional reactions to the two tasks.

Infants’ active avoidance was the measure of affect used. The results were that avoidance was strongly associated with the occlusion-on-a-carrier-to-place. The avoidance of the occlusion-on-a-carrier did not simply reflect infants’ frustration at not finding the object. This was examined using infants who failed both tasks. Significantly more of these infants avoided occlusion-on-a-carrier but did not avoid occlusion-in-place than infants who did the converse (Moore & Meltzoff, 1999). Thus, infants’ avoidance of occlusion-on-a-carrier-to-place appears to be a reaction to the disappearance transform, rather than to search consequences.

This differential avoidance pattern suggests that infants were treating the “empty hand” as a violation of their understanding of permanence, which is apparently important enough to produce conflict when violated, and as argued above, for which their identity rules provide no alternative understanding. These findings suggest that the strong form of object permanence reflects a fundamental understanding of the infants’ world as early as 10 months of age. When this understanding is violated, there is a strong emotional response (not just increases in looking time, but avoidance and even upset).

² For ease of exposition, we often characterize these spatiotemporal criteria as “rules” because the spatiotemporal parameters yield a rule-governed pattern of operation (e.g., a place rule for permanence, a trajectory rule for identity, a place-to-place rule, etc.). They are functional descriptions; we are not speculating on the underlying neurophysiology.

A MECHANISM OF CHANGE FOR DEVELOPING OBJECT PERMANENCE

We have argued that the strong form of object permanence is not innately specified, but develops. We sketched the ID account that permanence develops from infants’ understanding of numerical identity and reviewed the new empirical evidence bearing on this claim. The results suggest that permanence is the understanding that allows infants to make sense of what happens between encounters with objects that can be reidentified as the same one again. Object permanence fills the spatiotemporal gap between an object’s disappearance and its reappearance.

The theoretical problem is now sharply posed. If permanence is a discovery that arises from a precondition in which objects are not permanent, how does the concept develop? This raises the classic nativist challenge to all developmental theories and all claims for conceptual change (Fodor, 1981). In particular, how can a concept of object permanence evolve from precursors that do not already entail a notion of permanence?

Genesis of Object Permanence for the Occlusion of an Object in Place

The crux of the developmental problem is the transition from impermanence to permanence. Here, we will describe a mechanism of permanence development for a particular case of occlusion. In the next section, we extend these ideas to provide a generative mechanism of permanence discovery and development. Two key findings for theory construction emerged from the Moore and Meltzoff (2008) study. First, the infants’ pattern of success established an invariant ordering: Many infants solved partial occlusions by removing the occluder and failed total occlusions, but none of the infants failed partial occlusions and solved total occlusions. This suggests that understanding the easier, partial occlusion serves as a foundation for understanding the more difficult, total occlusion. Second, as noted above, 8.75-month-olds were no more successful searching for a sounding object than a silent one. However, the sounding object markedly improved the success rate of the 10-month-olds. This change in the use of sound suggests that a developmental transition occurs between these ages.

Our specific mechanism of permanence development has two interwoven parts. First, infants’ understanding of partial occlusions is a necessary precursor to locating stationary objects that are totally occluded and to establishing their identity when they are out of sight. Second, infants’ discovery of permanence for a total occlusion is a process of reinterpreting the occlusion event based on their existing understanding of the precursor, the partial occlusion.

Transition From Impermanence to Permanence: The Crucial Role of Partial Occlusions

When an object is hidden on a table, the total occlusion of the object is only a partial occlusion of the table surface on which it sits. The occluded place on the partially occluded table continues to exist after occlusion (for infants who understand partial occlusions) and provides an invisible location for the totally occluded object to reside while it is out of sight. Thus, infants could understand that there is somewhere in the external world for the totally occluded object to be. Moreover, if that invisible place continues to exist, then it could satisfy the place criterion that identifies the object in that place, when it is out of sight, as the same one that disappeared there. We suggest that this development in spatial cognition lays the groundwork for discovering permanence over total occlusions in place. Infants can use the permanence of the partially hidden portion of the table supporting the object to provide an invisible, but still existing, location for the object to reside after it is totally occluded, and also to provide a place criterion identifying it as the same one while it is invisible and when it is disoccluded.

Sounding Objects as a Window on the Process of Discovery

There are at least three ways that sound from an occluded object could help infants search. First, sound from the object could aid in remembering and localizing the object. If this were true,

then sound should help both age groups, but the younger group more than the older. Second, infants might not be able to interpret sound as coming from a hidden object unless they knew that the object still existed in the hidden location (i.e., the object is already permanent). If this were true, then only infants who could solve occlusions with silent objects would solve them with sounding ones. Third, sound could function as a catalyst, triggering a new way to understand the occlusion that was not accessible when the object was silent. The data showed that a sounding object was of no help to the younger infants, but significantly more of the older infants succeeded with the sounding object than with the silent one (Moore & Meltzoff, 2008). This pattern suggests that sound acted as a catalyst. How might that work?

The fact that partial occlusions appear to be a precursor to solving total occlusions suggests that infants who understand partial occlusions are developmentally poised to discover how to understand total occlusions from experience. Once infants have this framework, a characteristic sound from the object could provide additional spatial and identity information about how to interpret an object’s disappearance. Sound from the hidden object, localized as coming from the partially occluded surface, could help catalyze a reinterpretation: the same object that disappeared is the source of this sound and remains unseen on that partially occluded surface. Based on this view, permanence is an interpretation infants make of the occlusion event. The logic is concordant with hypothesis # 8: the object is interpreted as permanent over a transform if it is identified as a single individual, and both the object and the occluded location are independently represented.

We are not arguing that the role of sound is a general explanation for how infants acquire object permanence for total occlusions in place. Nor are we arguing that the 10-month-olds who did not succeed on total occlusions with silent objects, but did with sounding objects, acquired object permanence from that experience alone. Rather, we think that the results with the sounding object give us a window on the process of how permanence is acquired for total occlusions: It is a deductive inference that restructures what the infants already know about partial occlusions to yield the new understanding.

How did sound from the object help? We think it provided an interpretive aid. Hypothesis # 8 suggests that the normal developmental course would be for infants to first reinterpret the disappearance as conserving the object in place after reappearance (because they can confirm the reappearing object’s identity after disocclusion); subsequently, they begin to make that interpretation after the object’s disappearance but before it reappears. On this view, the characteristic sound from the object provides a shortcut enabling the interpretation to be made at disappearance, because it allowed infants to confirm the object’s identity by its sound from the represented place before it reappeared. Thus, the auditory provision of identity and localization information before disocclusion fostered interpretation by infants who were already able to represent the hidden place.

In sum, search for the sounding object is a special case, but illustrates a process of interpretation that could be applied more generally. According to the specific mechanism of change described here, permanence arises only when an existing means for determining numerical identity and a developing understanding of partial hidings have prepared the ground. On this foundation, an occlusion event that was previously interpreted as not preserving the object can be interpreted in a new way, and confirmed by subsequent experience, as actually preserving the object in a precise hidden location: it is now permanent over this disappearance transform, a total occlusion in place.

Generalizing the Mechanism of Change: From Transform T to Transform T+1

We have suggested that the process of developmental change is one in which an understanding of permanence and the experience gained with a simpler disappearance transform make a harder transform amenable to reinterpretation so long as the numerical identity of the object can be maintained. In the case of total occlusions, the advance occurred because infants could use the permanence of the partially hidden portion of

the table supporting the object to provide (a) an invisible, but still existing, location for the object to reside after it was totally occluded, and (b) a continuously existing “place” criterion identifying it as the same individual from disappearance to reappearance. In this context, a reinterpretation of total occlusions became possible. In what follows, we utilize this analysis and our new findings to extend the developmental process toward a more general mechanism of change and development.

There are two major problems confronting a general mechanism. One is how to explain the step-like progression of occlusion tasks that infants can solve as they develop. We have reviewed data on two ordered steps here: the partial to total occlusion transition, and the occlusion-in-place to occlusion-on-a-carrier-to-place transition. Other steps have been suggested by previous longitudinal studies (Kramer, Hill, & Cohen, 1975; Piaget, 1954). A second problem arises when infants find that applying their current permanence rule to a disappearance transform does not preserve the object. This obstacle was illustrated in the Moore and Meltzoff (1999) study. Infants, who had a place rule to solve an occlusion-in-place, found that applying it to an occlusion-on-a-carrier resulted in an empty reappearance place, and an apparently upsetting violation of permanence.

The general process of developmental change focuses on what else develops once a transform is understood as conserving an occluded object as a whole. The major claim is that infants learn which features and functions of an object are themselves permanent over that transform and can bear independently on identity determination (hypotheses # 9 & 10). Thus, an object’s permanent features and functions could also serve as identity criteria for distinguishing that object when spatiotemporal parameters are absent, neutral, or even in disagreement. We term these criteria the object’s “distinctive” features and functions. This discovery offers new developmental leverage because the spatiotemporal parameters, the object’s properties, and its permanence can all interact in interpreting occlusion events.

The phenomenon of “verifying search” provides relevant evidence (Moore & Meltzoff, 2004). Infants treated the properties of an object seen hidden on day 1 as bearing on numerical identity because, when the object was presented in a new location on day 2, infants searched in the original hiding place before playing with it. Even though it was in the wrong place, the featural and functional identity criteria conflicted with the spatiotemporal criterion and raised the question of its identity. For this disappearance transform then, the distinctive features and functions of the object implied its numerical identity at 14 months of age.

This suggests a general mechanism of change that could account for the ordering of search tasks found in the longitudinal studies and how infants might use apparent violations of permanence. We will state the hypothesis in its most abstract form and illustrate it with a simple transform that violates the place rule for permanence and identity. The general problem is how “rule R” for the identity and permanence of an object and its features over transform T changes to “rule R+1” for a new disappearance transform. The proposed mechanism is shown schematically in Figure 4.2. Here it is applied to a task in which the object is moved after disappearance in place X to a second place Y by means of the screen (e.g., a cup covers an object and is then pushed to a new location with the object still underneath). According to rule R, the infant searches place X on the table and finds it empty, which violates the spatiotemporal logic of rule R. Meanwhile, the object reappears at place Y, which confirms the featural logic of rule R: it can be interpreted provisionally as the same one again. This produces conflict because rule R is both confirmed and violated. The infant has to weigh the apparent violation of permanence for the object at X and the appearance of a featurally identical object at Y against the validity of rule R. This conflict is resolved by reinterpreting the spatiotemporal logic of rule R to encompass the change of location (X → Y). This reorganization provides the new spatiotemporal logic of rule R+1, maintaining an object’s identity and permanence over a new transform T+1.

In the terms we have been using, the example above captures the process of reinterpreting a place rule for identity and permanence to yield

[Figure 4.2 flowchart; box labels as recovered from the original, top to bottom, with lost text marked by ellipses]
Object disappears place X: engages rule R for disappearance transform T
Object displaced to place Y by screen | Infant searches place X according to spatiotemporal logic of rule R
Object reappears at Y: confirming featural logic of rule R | Place X empty: violates spatial logic of rule R
... of rule R
... logic of rule R to include object at X → Y
New spatiotemporal logic of rule R + 1 for new transform T + 1

Figure 4.2 A mechanism of change for developing object permanence. An object disappears at place X
and reappears at place Y. The infant expects the object to reappear at X according to permanence rule
R. The flowchart illustrates the hypothesized process for changing the spatiotemporal component of
rule R to rule R+1. Conflict occurs between confirmation of the featural component of rule R, which
is satisfied by the object appearing at Y, and disconfirmation of the spatiotemporal component, which
is violated by the object’s failure to appear at X. Changing the spatiotemporal component of rule R to
rule R+1 resolves the conflict. See text for details. Adapted from Moore (1975).
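The process summarized in Figure 4.2 can also be written as a short sketch. This is a hedged illustration only, assuming that a rule can be reduced to one spatiotemporal flag plus a featural match; the names `Rule`, `check`, and `update_repertoire` are hypothetical and do not come from the original studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    allows_displacement: bool  # spatiotemporal component: may the object reappear elsewhere?


def check(rule, vanished_at, reappeared_at, features_match):
    """Return (spatiotemporal_ok, featural_ok) for one disappearance event."""
    spatiotemporal_ok = rule.allows_displacement or vanished_at == reappeared_at
    return spatiotemporal_ok, features_match


def update_repertoire(repertoire, current, new_transform,
                      vanished_at, reappeared_at, features_match):
    """Add a revised rule for the new transform when the current rule is
    confirmed featurally but violated spatiotemporally (the conflict case)."""
    rule = repertoire[current]
    spatiotemporal_ok, featural_ok = check(rule, vanished_at, reappeared_at, features_match)
    if featural_ok and not spatiotemporal_ok:
        # Resolve the conflict by widening only the spatiotemporal component.
        revised = Rule(rule.name + "+1", allows_displacement=True)
        # The old rule is kept: each rule is engaged by its own transform.
        return {**repertoire, new_transform: revised}
    return repertoire


repertoire = {"T": Rule("R", allows_displacement=False)}
repertoire = update_repertoire(repertoire, "T", "T+1",
                               vanished_at="X", reappeared_at="Y", features_match=True)
print(sorted(repertoire))
```

The design choice worth noting is that the update returns a larger repertoire rather than replacing the entry for transform T, which mirrors the claim that the revised rule supplements the earlier one instead of overwriting it.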

a new place-to-place rule. An object disappearing in place X can be the same one again when it reappears in place Y. The identity and permanence of the whole object are preserved over the new transform and the process of learning which properties of an object are also preserved can begin again for the R+1 transform. Note that rule R+1 does not overwrite rule R. Rather, rule R is engaged by observing transform T (i.e., after occlusion the occluder remains at the place of disappearance) and rule R+1 is engaged by transform T+1 (i.e., after occlusion, the occluder moves from the place of disappearance to a new place). The particular rules are engaged by the spatiotemporal structure of the various disappearance transforms. We think that this kind of mechanism of development would account for the stepwise progression of infant success on search tasks because the steps are generated by the order of the underlying spatiotemporal criteria for identity and the resulting understandings of object permanence.

More broadly, when an object is occluded, the goal of search is quite specific: Infants are seeking the same object that disappeared; no other object will do. Successful search reconnects the infant with the same predisappearance object and maintains order in the infant’s cognitive world; failed search confronts infants with disorder, which can have affective consequences. Infants’ striving to preserve order and coherence in their world is the motivation for permanence development, and tracing an object’s identity over transformational events is a means to achieve it.

Infant object permanence has been the focus of attention for seven decades. Four basic approaches have been articulated. We consider them in light of the evidence and arguments presented here and make suggestions for future research.

Piagetian Approaches

According to Piaget (1952, 1954), infants develop a concept of objects as permanent by increasingly separating the object itself from the matrix of actions upon it, culminating in an object’s representation independent of both perception and action (Table 4.1). A number of theorists have broadened his theoretical terms to include a “gradual strengthening of representation” or “movement of the observer either by actions of the infant or by being carried through space” as the sources of development while narrowing their focus to explaining how infants first solve a total occlusion or overcome the A-not-B search error (Bremner, 1989; Campos et al., 2000; Mareschal, Plunkett, & Harris, 1999; Munakata, McClelland, Johnson, & Siegler, 1997; Newcombe & Huttenlocher, 2000). Piaget’s manual search tasks are differentiated in terms of three major factors: (a) the actions required for recovery, (b) the degree of occlusion, and (c) the number of hiding locations. The research reviewed here showing that infants can solve one type of total occlusion (an occlusion-in-place) 4 months before solving a different type of total occlusion (an occlusion-on-a-carrier-to-a-place) demonstrates that Piaget’s developmental sequence is incomplete. The same search action at the same location was required to find the object in both tasks. For Piaget (1954), there is no easy explanation for how total hidings in one place, solved by the same recovery act, can be developmentally different.

Moreover, there is further evidence that does not comport with Piagetian theory. Piaget is correct in claiming that success on partial occlusions precedes success on total occlusions. But his theory provides no explanation for the new data showing that infants fail to remove the occluder of a totally hidden object when they have the means–ends coordination to do so. Such coordination characterizes Piaget’s stage 4, and infants who uncovered the partially occluded object should have uncovered the totally occluded object, but they did not.

Taken together, the new studies suggest that Piaget’s diagnosis of infants’ problems in developing object permanence was off target, and his action-based theory of development does not fit the evidence. The infant’s conceptual problem is not separating objects from the matrix of action, representing them in mind, or positioning them in visible space; the early perceptual and representational systems do all three.

Dynamic Systems Theory

Dynamic systems theorists believe that infants’ initial appreciation of objects is embedded in the dynamics of their acts of attending, reaching, and remembering (Thelen & Smith, 1994). They think that objects are so inextricably bound up in attention and action that a concept of permanence is not developed in infancy (Thelen, Schöner, Scheier, & Smith, 2001), and in that sense, postulate an even less cognitive, more action-bound infant than Piaget. The study of 14-month-olds’ search after a 24-h delay tests this assertion (Moore & Meltzoff, 2004). Infants observed an object’s disappearance with no familiarization play with the hiding places and immediately left the laboratory. Upon return 24 h later, no attention was drawn to the hiding place, yet infants successfully found the object.

From the dynamic systems perspective, there were no practiced acts to repeat, no directing of
infants’ attention to the hiding place. Infants searched based on a stored representation of
the absent object and its location in space. This suggests that some conception of permanence
is needed to guide search on the second day, contrary to the dynamic systems model of infancy.

Nativist Theory

Object permanence nativists claim that it is logically impossible for infants to learn that
objects are permanent from the chaos of sensory experience (Spelke, 1994). In this approach,
object is an innate conception resulting from perceptual processing that entails permanence
(Baillargeon, 2008; Spelke, 1990). Permanence does not develop; it is present from the
beginning and part of what it means to perceive an object.

Nativists claim that search necessarily underestimates infants’ competence. Instead, increased
looking time to events in which occluded objects do not reappear when/where they are expected
to reappear is the appropriate measure. This method leads to the paradox that infants’ putative
knowledge of permanence at birth, as inferred from looking time measures, does not guide
action: Infants fail to search manually for hidden stationary objects (for review: Marcovitch &
Zelazo, 1999), and they fail to “catch” moving objects that briefly disappear and reappear
(Berthier et al., 2001; Jonsson & von Hofsten, 2003; Spelke & von Hofsten, 2001) until about
9 months of age. This age discrepancy is usually explained as a result of performance
constraints on the innate knowledge.

However, the new finding of a developmental difference in search for occluded objects, when
performance factors are controlled and search skills are available, casts doubt on this
explanation for the failure to search. Infants who succeeded on partial hidings by removing
the occluder should also succeed in removing the occluder on total hidings if they understood
permanence. But they did not succeed. We interpret this to mean that the 8.75-month-olds do
not understand permanence for a total occlusion. The results showed that by 10 months of age,
infants could understand permanence for a total occlusion-in-place. Importantly, that was not
the end of development. Recall that 10-month-olds, who solved this form of total occlusion,
did not succeed on a total occlusion-on-a-carrier to-a-place until 14 months of age. These data
suggest there are at least two steps in permanence development, unexplained by performance
constraints, which challenge nativist theory. Permanence seems to be neither innate nor a
once-and-for-all acquisition.

Identity Development Theory

On the ID account, object permanence develops from a prior understanding of numerical identity:
the spatiotemporal criteria infants use to reidentify an object as the same one again after a
break in perceptual contact. When infants can parse a particular disappearance transform as
maintaining the identity of the object, they are in a position to discover what happens to it
between appearances. Then, experience with object reappearances can be understood in a new way,
allowing infants to reinterpret the occlusion as preserving the object in a hidden place.

Once infants understand the total occlusion of an object in place as preserving it invisibly in
that place, they still do not understand all total occlusions (as argued above). The ID claim is
that the interplay of an object’s spatiotemporal parameters and its permanent features and
functions affords a mechanism of developmental change, which enables the discovery of other
disappearance transforms over which the object’s identity and its permanence are preserved.
Hence, permanence understanding is constrained to specific types of disappearance transforms,
and develops one transform at a time. The ID account holds that object permanence develops in
ordered steps, and that search tasks, properly conducted, can assess this development.

On the surface, the ID account resembles Piaget’s in arguing for a step-like development in
object permanence and for the validity of manual search as a measure of it. However, this
resemblance is more apparent than real. There
are profound differences. Empirically, the ID account encompasses additional steps in
permanence development and in visual search not included in Piaget’s account. Theoretically,
the ID account bases infant development on an initial capacity for representation and
numerical identity rather than seeing representation as the culmination of development at
18 months of age. Moreover, the ID engine of development is cognitive, and stems from infants’
striving to understand which objects in the external world are the same ones encountered
previously, rather than from Piaget’s hierarchical coordination of sensorimotor action schemes.
This striving for a coherent understanding of the appearance and disappearance of the same
object, which leads to the discovery of permanence and orders the course of development, is
more objective and independent of action than envisaged by Piaget.

Future Directions in Object Permanence Research

At present, the field has two methods that yield diametrically opposed results, yet both claim
to measure the same concept: object permanence. This dichotomy poses deep difficulties.
According to the nativists, permanence is an innately perceived property of objects, and
increased looking time to events incompatible with permanence demonstrates this implicit
knowledge of permanence. According to ID theory, permanence is a function of the disappearance
transforms that infants understand, and the order of the manual search tasks solved by infants
demonstrates the development of this understanding. A paradox arises because the innate
knowledge shown by looking time measures does not lead to search before 9 months of age. This
paradox still engenders considerable debate (e.g., Cohen & Cashon, 2006; Kagan, 2008; Meltzoff
& Moore, 1998; Newcombe & Huttenlocher, 2006; Quinn, 2008).

In this chapter, we have tried to narrow the gap to some extent. On the one hand, we have
clarified the definition of object permanence: it refers to a prereappearance understanding
that an occluded object continues to exist in a particular hidden location in the external
world. Permanence is knowledge of an invisible state of affairs. This contrasts with the
nativists’ equating object permanence with continuity in space and time (Spelke, Breinlinger,
Macomber, & Jacobson, 1992), but being unable, using preferential looking or habituation
methods, to assess infants’ reactions to continuity/discontinuity while the object is still
invisible. Essentially, what is measured by looking time is postreappearance knowledge. That
is, what is directly measured is whether the visible, preocclusion state of affairs is
consistent with or discrepant from the visible, postocclusion state. Looking time measures are
retrospective and based on the visible structure of the entire disappearance–reappearance
cycle. By contrast, the search measures described here (i.e., incorporating spatially directed
visual anticipation of the object’s reappearance locus) are predictive. Search success under
these conditions shows that infants are seeking the object where it is located while it is
still in its occluded state. In light of this distinction, one resolution of the dilemma is
that we are not in fact assessing the same concept at all.

On the other hand, we have identified some markers of the strong form of object permanence that
nativists could use to demonstrate that the two approaches are measuring the same concept. It
would be interesting if the looking time methods could be adapted to allow young infants to
leave the locale where the object disappeared and return later for assessment. Or, could the
violation-of-expectation method be adapted to show strong emotion for a violation of permanence
(rather than simply increased looking time), as shown by search tests with 10-month-olds? One
corollary of the nativists’ view seems less persuasive in light of our new data: It is
difficult to maintain that infants fail to search due to known performance constraints. We
found no evidence for this in two independent studies and three different ages.

Conversely, one might wonder then whether the ID account has any explanation for the
looking-time phenomena demonstrated by the nativist approach. We have taken up this challenge
for some of the phenomena (Meltzoff & Moore, 1998). Essentially, our argument is that many of
the looking time effects result
from the early representational system operating to maintain an object’s numerical identity
even in the absence of object permanence (hypothesis #5). The ID account contends that this
representational system allows infants to learn and retain the spatiotemporal structure of
disappearance and reappearance events in the visible world, and to form expectations about the
visible outcomes of such events (Johnson, Amso, et al., 2003; Kochukhova & Gredebäck, 2007).
When discrepancies from expected outcomes occur, they will recruit increased attention. Thus,
on the ID account, such discrepancies could explain increased looking times to the events that
have been studied, without implying actual infant knowledge of permanence (for details, see
Meltzoff & Moore, 1998).

In the end, a full developmental theory has to be compatible with the facts: Infants learn,
develop, and change seemingly based on input; they solve problems; they care about consequences
and show emotion; and they act. They do not just sit and perceive, receive, and parse events in
the world. They go out and change it. We have begun to provide an account of object permanence
development compatible with these facts.

We have proposed an ID account of object permanence that locates the origins and development of
permanence in infants’ notions of how to determine and trace numerical identity. The arguments
and evidence generated from this approach suggest a number of conclusions: (a) object
permanence understanding is not an all-or-none attainment; (b) permanence is understood for
some disappearance transforms but not others; (c) the development of infants’ spatiotemporal
criteria for numerical identity provides the form and ordering of the disappearance transforms
over which they understand permanence; (d) apparent violations of permanence can cause negative
emotion; and (e) taking seriously the conceptual distinctions between representation, identity,
and permanence offers considerable theoretical power. Finally, we proposed a mechanism of
change to account for the transition from having no concept of permanence to having permanence.

ACKNOWLEDGMENTS

This work was supported by NIH (HD-22514), NSF (SBE-0354453), and the Tamaki Foundation. Any
opinions, findings, and conclusions expressed in the paper are those of the authors and do not
necessarily reflect the views of these agencies. We thank Calle Fisher and Craig Harris for
their assistance and S.J. for very helpful comments on an earlier version of this chapter.

REFERENCES

Aslin, R. (1981). Development of smooth pursuit in human infants. In D. F. Fischer, R. A. Monty, & E. J. Senders (Eds.), Eye movements: Cognition and visual perception (pp. 31–51). Hillsdale, NJ: Erlbaum.
Baillargeon, R. (2008). Innate ideas revisited: For a principle of persistence in infants’ physical reasoning. Perspectives on Psychological Science, 3,
Baillargeon, R., Graber, M., DeVos, J., & Black, J. (1990). Why do young infants fail to search for hidden objects? Cognition, 36, 255–284.
Bertenthal, B. I., Longo, M. R., & Kenny, S. (2007). Phenomenal permanence and the development of predictive tracking in infancy. Child Development, 78, 350–363.
Berthier, N. E., Bertenthal, B. I., Seaks, J. D., Sylvia, M. R., Johnson, R. L., & Clifton, R. K. (2001). Using object knowledge in visual tracking and reaching. Infancy, 2, 257–284.
Bower, T. G. R. (1967). The development of object-permanence: Some studies of existence constancy. Perception & Psychophysics, 2, 411–418.
Bower, T. G. R. (1971). The object in the world of the infant. Scientific American, 225, 30–38.
Bower, T. G. R. (1982). Development in infancy (2nd ed.). San Francisco: Freeman.
Bower, T. G. R., Broughton, J. M., & Moore, M. K. (1971). Development of the object concept as manifested in changes in the tracking behavior of infants between 7 and 20 weeks of age. Journal of Experimental Child Psychology, 11,
Bremner, G. (1989). Development of spatial awareness in infancy. In A. Slater & G. Bremner (Eds.), Infant Development (pp. 123–141). Hillsdale, NJ: Erlbaum.
Bremner, J. G., Johnson, S. P., Slater, A., Mason, U., Cheshire, A., & Spring, J. (2007). Conditions for young infants’ failure to perceive
trajectory continuity. Developmental Science, 10, 613–624.
Bremner, J. G., Johnson, S. P., Slater, A., Mason, U., Foster, K., Cheshire, A., et al. (2005). Conditions for young infants’ perception of object trajectories. Child Development, 76, 1029–1043.
Campos, J. J., Anderson, D. I., Barbu-Roth, M. A., Hubbard, E. M., Hertenstein, M. J., & Witherington, D. (2000). Travel broadens the mind. Infancy, 1, 149–219.
Carey, S., & Xu, F. (2001). Infants’ knowledge of objects: Beyond object files and object tracking. Cognition, 80, 179–213.
Cohen, L. B., & Cashon, C. H. (2006). Infant cognition. In D. Kuhn, R. S. Siegler, W. Damon, & R. M. Lerner (Eds.), Handbook of child psychology (Vol. 2, pp. 214–251). Hoboken, NJ: Wiley.
Diamond, A. (1985). Development of the ability to use recall to guide action, as indicated by infants’ performance on AB̄. Child Development, 56, 868–883.
Diamond, A., Cruttenden, L., & Neiderman, D. (1994). AB̄ with multiple wells: 1. Why are multiple wells sometimes easier than two wells? 2. Memory or memory + inhibition? Developmental Psychology, 30, 192–205.
Fodor, J. A. (1981). Representations: Philosophical essays on the foundations of cognitive science. Cambridge, MA: MIT Press.
Gibson, J. J., Kaplan, G. A., Reynolds, H. N., & Wheeler, K. (1969). The change from visible to invisible: A study of optical transitions. Perception & Psychophysics, 5, 113–116.
Grzywacz, N. M., Watamaniuk, S. N. J., & McKee, S. P. (1995). Temporal coherence theory for the detection and measurement of visual motion. Vision Research, 35, 3183–3203.
Haith, M. M., Kessen, W., & Collins, D. (1969). Response of the human infant to level of complexity of intermittent visual movement. Journal of Experimental Child Psychology, 7, 52–69.
Harris, P. L. (1987). The development of search. In P. Salapatek & L. Cohen (Eds.), Handbook of infant perception (Vol. 2, pp. 155–207). New York: Academic Press.
Haxby, J. V., Grady, C. L., Horwitz, B., Ungerleider, L. G., Mishkin, M., Carson, R. E., et al. (1991). Dissociation of object and spatial visual processing pathways in human extrastriate cortex. Proceedings of the National Academy of Sciences, USA, 88, 1621–1625.
Johnson, S. P., Amso, D., & Slemmer, J. A. (2003). Development of object concepts in infancy: Evidence for early learning in an eye-tracking paradigm. Proceedings of the National Academy of Sciences, USA, 100, 10568–10573.
Johnson, S. P., Bremner, J. G., Slater, A., Mason, U., Foster, K., & Cheshire, A. (2003). Infants’ perception of object trajectories. Child Development, 74, 94–108.
Jonsson, B., & von Hofsten, C. (2003). Infants’ ability to track and reach for temporarily occluded objects. Developmental Science, 6, 86–99.
Kagan, J. (2008). In defense of qualitative changes in development. Child Development, 79, 1606–1624.
Kahneman, D., Treisman, A., & Gibbs, B. J. (1992). The reviewing of object files: Object-specific integration of information. Cognitive Psychology, 24, 175–219.
Kaufman, J., Csibra, G., & Johnson, M. H. (2005). Oscillatory activity in the infant brain reflects object maintenance. Proceedings of the National Academy of Sciences, USA, 102, 15271–15274.
Kochukhova, O., & Gredebäck, G. (2007). Learning about occlusion: Initial assumptions and rapid adjustments. Cognition, 105, 26–46.
Köhler, S., Kapur, S., Moscovitch, M., Winocur, G., & Houle, S. (1995). Dissociation of pathways for object and spatial vision: A PET study in humans. Neuroreport, 6, 1865–1868.
Kramer, J. A., Hill, K. T., & Cohen, L. B. (1975). Infants’ development of object permanence: A refined methodology and new evidence for Piaget’s hypothesized ordinality. Child Development, 46, 149–155.
Krøjgaard, P. (2004). A review of object individuation in infancy. British Journal of Developmental Psychology, 22, 159–183.
Krøjgaard, P. (2007). Comparing infants’ use of featural and spatiotemporal information in an object individuation task using a new event-monitoring design. Developmental Science, 10, 892–909.
Leslie, A. M., Xu, F., Tremoulet, P. D., & Scholl, B. J. (1998). Indexing and the object concept: Developing ‘what’ and ‘where’ systems. Trends in Cognitive Sciences, 2, 10–18.
Marcovitch, S., & Zelazo, P. D. (1999). The A-not-B error: Results from a logistic meta-analysis. Child Development, 70, 1297–1313.
Mareschal, D., Plunkett, K., & Harris, P. (1999). A computational and neuropsychological account of object-oriented behaviors in infancy. Developmental Science, 2, 306–317.
Meltzoff, A. N., & Moore, M. K. (1998). Object representation, identity, and the paradox of early
permanence: Steps toward a new framework. Infant Behavior & Development, 21, 201–235.
Michotte, A. (1962). Causalité, permanence, et réalité phénoménales: Études de psychologie expérimentale. Louvain: Publications Universitaires.
Moore, M. K. (1975, April). Object permanence and object identity: A stage-developmental model. In M. K. Moore (Chair), Object identity: The missing link between Piaget’s stages of object permanence. Symposium conducted at the meeting of the Society for Research in Child Development, Denver, CO.
Moore, M. K., Borton, R., & Darby, B. L. (1978). Visual tracking in young infants: Evidence for object identity or object permanence? Journal of Experimental Child Psychology, 25, 183–198.
Moore, M. K., & Meltzoff, A. N. (1978). Object permanence, imitation, and language development in infancy: Toward a neo-Piagetian perspective on communicative and cognitive development. In F. D. Minifie & L. L. Lloyd (Eds.), Communicative and cognitive abilities: Early behavioral assessment (pp. 151–184). Baltimore: University Park Press.
Moore, M. K., & Meltzoff, A. N. (1999). New findings on object permanence: A developmental difference between two types of occlusion. British Journal of Developmental Psychology, 17, 623–644.
Moore, M. K., & Meltzoff, A. N. (2004). Object permanence after a 24-hr delay and leaving the locale of disappearance: The role of memory, space, and identity. Developmental Psychology, 40, 606–620.
Moore, M. K., & Meltzoff, A. N. (2008). Factors affecting infants’ manual search for occluded objects and the genesis of object permanence. Infant Behavior & Development, 31, 168–180.
Munakata, Y., McClelland, J. L., Johnson, M. H., & Siegler, R. S. (1997). Rethinking infant knowledge: Toward an adaptive process account of successes and failures in object permanence tasks. Psychological Review, 104, 686–713.
Newcombe, N. S., & Huttenlocher, J. (2006). Development of spatial cognition. In D. Kuhn, R. S. Siegler, W. Damon, & R. M. Lerner (Eds.), Handbook of child psychology (Vol. 2, pp. 734–776). Hoboken, NJ: Wiley.
Newcombe, N. S., & Huttenlocher, J. (2000). Making space: The development of spatial representation and reasoning. Cambridge, MA: MIT Press.
Newcombe, N., Huttenlocher, J., & Learmonth, A. (1999). Infants’ coding of location in continuous space. Infant Behavior & Development, 22, 483–510.
Piaget, J. (1952). The origins of intelligence in children (M. Cook, Trans.). New York: International Universities Press.
Piaget, J. (1954). The construction of reality in the child (M. Cook, Trans.). New York: Basic Books.
Quinn, P. C. (2008). In defense of core competencies, quantitative change, and continuity. Child Development, 79, 1633–1638.
Rosander, K., & von Hofsten, C. (2004). Infants’ emerging ability to represent occluded object motion. Cognition, 91, 1–22.
Smith, L. B., Thelen, E., Titzer, R., & McLin, D. (1999). Knowing in the context of acting: The task dynamics of the A-not-B error. Psychological Review, 106, 235–260.
Spelke, E. S. (1990). Principles of object perception. Cognitive Science, 14, 29–56.
Spelke, E. S. (1994). Initial knowledge: Six suggestions. Cognition, 50, 431–445.
Spelke, E. S., Breinlinger, K., Macomber, J., & Jacobson, K. (1992). Origins of knowledge. Psychological Review, 99, 605–632.
Spelke, E. S., & von Hofsten, C. (2001). Predictive reaching for occluded objects by 6-month-old infants. Journal of Cognition and Development, 2, 261–281.
Strawson, P. F. (1959). Individuals: An essay in descriptive metaphysics. London: Methuen.
Thelen, E., Schöner, G., Scheier, C., & Smith, L. B. (2001). The dynamics of embodiment: A field theory of infant perseverative reaching. Behavioral and Brain Sciences, 24, 1–86.
Thelen, E., & Smith, L. B. (1994). A dynamic systems approach to the development of cognition and action. Cambridge, MA: MIT Press.
Treisman, A. (1992). Perceiving and re-perceiving objects. American Psychologist, 47, 862–875.
Van de Walle, G. A., Carey, S., & Prevor, M. (2000). Bases for object individuation in infancy: Evidence from manual search. Journal of Cognition and Development, 1, 249–280.
von Hofsten, C., Kochukhova, O., & Rosander, K. (2007). Predictive tracking over occlusions by 4-month-olds. Developmental Science, 10, 625–640.
von Hofsten, C., Vishton, P., Spelke, E. S., Feng, Q., & Rosander, K. (1998). Predictive action
in infancy: Tracking and reaching for moving objects. Cognition, 67, 255–285.
Watamaniuk, S. N. J., & McKee, S. P. (1995). Seeing motion behind occluders. Nature, 377, 729–730.
Watamaniuk, S. N. J., McKee, S. P., & Grzywacz, N. M. (1995). Detecting a trajectory embedded in random-direction motion noise. Vision Research, 35, 65–77.
Wilcox, T., & Baillargeon, R. (1998). Object individuation in infancy: The use of featural information in reasoning about occlusion events. Cognitive Psychology, 37, 97–155.
Xu, F., & Carey, S. (1996). Infants’ metaphysics: The case of numerical identity. Cognitive Psychology, 30, 111–153.

Words, Language, and Music

Connectionist Explorations of Multiple-Cue
Integration in Syntax Acquisition

Morten H. Christiansen, Rick Dale, and Florencia Reali

Among the many feats of learning that children showcase in their development, syntactic
abilities appear long before many other skills, such as riding bikes, tying shoes, or playing a
musical instrument. This is achieved with little or no direct instruction, making it both
impressive and even puzzling, because mastering natural language syntax is one of the most
difficult learning tasks that humans face. One reason for this difficulty is a
“chicken-and-egg” problem involved in acquiring syntax. Syntactic knowledge can be
characterized by constraints governing the relationship between grammatical categories of words
(such as noun and verb) in a sentence. At the same time, the syntactic constraints presuppose
the grammatical categories in terms of which they are defined; and the validity of grammatical
categories depends on how they support those same syntactic constraints. A similar
“bootstrapping” problem faces a student learning an academic subject such as physics:
understanding momentum or force presupposes some understanding of the physical laws in which
they figure; yet these laws presuppose these very concepts. The bootstrapping problem solved by
very young children seems much more daunting, both because the constraints governing natural
language are so intricate, and because these children do not have the intellectual capacity or
explicit instruction present in conventional academic settings. Determining how children
accomplish the astonishing feat of language acquisition remains a key question in cognitive
science.

By 12 months, infants are attuned to the phonological and prosodic regularities of their native
language (Jusczyk, 1997; Kuhl, 1999). This perceptual attunement may provide an essential
scaffolding for later learning by biasing children toward aspects of language input that are
particularly informative for acquiring grammatical knowledge. In this chapter, we hypothesize
that integrating multiple probabilistic cues (phonological, prosodic, and distributional) by
perceptually attuned general-purpose learning mechanisms may hold promise for explaining how
children solve the bootstrapping problem. Multiple cues can provide reliable evidence about
linguistic structure that is unavailable from any single source of information.

In the remainder of this chapter, we first review empirical evidence suggesting that infants
may use a combination of phonological, prosodic, and distributional cues to bootstrap into
syntax. We then report a series of simulations demonstrating the computational efficacy of
multiple-cue integration within a connectionist framework (for modeling of other aspects of
cognitive development, see the chapter by Mareschal & Westermann, this volume). Simulation 1
shows how multiple-cue integration results in better, faster, and more uniform learning.
Simulation 2 uses this initial model to mimic the effect of grammatical and prosodic
manipulations in a sentence comprehension study with 2-year-olds (Shady & Gerken, 1999).
Simulation 3 uses an idealized representation of prenatal exposure
to gross-level phonological and prosodic cues, leading to facilitation of postnatal learning of
syntax by the model. Simulation 4 demonstrates that adding additional distracting cues,
irrelevant to the syntactic acquisition task, does not hinder learning. Finally, Simulation 5
scales up these initial simulations, showing that connectionist models can acquire aspects of
syntactic structure from cues present in actual child-directed speech.

In this section, we identify three kinds of constraints that may serve to help the language
learner solve the syntactic bootstrapping problem. First, innate constraints in the form of
linguistic universals may be available to discover to which grammatical category a word
belongs, and how they function in syntactic rules. Second, language-external information,
concerning observed semantic relationships between language and the world, could help map
individual words onto their grammatical function. Finally, language-internal information, such
as aspects of phonological, prosodic, and distributional patterns, may indicate the relation of
various parts of language to each other, thus bootstrapping the child into the realm of
syntactic relations. We discuss each of these potential constraints below, and conclude that
some form of language-internal information is needed to break the circularity.

Although innate constraints likely play a role in language acquisition, they cannot solve the
bootstrapping problem. Even with genetically prescribed abstract knowledge of grammatical
categories and syntactic rules (e.g., Pinker, 1984), the problem remains: Innate knowledge
requires building in universal mappings across languages, but the relationships between words
and grammatical categories clearly differ cross-linguistically (e.g., the sound /su/ is a noun
in French (“sou”) but a verb in English (“sue”)). Even with rich innate knowledge, children
still must assign sound sequences to appropriate grammatical categories while determining the
syntactic relations between these categories in their native language. Recently, a wealth of
compelling experimental evidence has accumulated, suggesting that children do not initially use
abstract linguistic categories. Instead, they seem to employ words at first as concrete
individuals (rather than instances of abstract kinds), thereby challenging the usefulness of
hypothesized innate grammatical categories (Tomasello, 2000). Whether we grant the presence of
extensive innate knowledge or not, it seems clear that other sources of information are
necessary to solve the bootstrapping problem.

Language-external information, such as correlations between the environment and semantic
categories, may contribute to language acquisition by supplying a “semantic bootstrapping”
solution (Pinker, 1984). However, because children learn linguistic distinctions that have no
semantic basis (e.g., gender in French: Karmiloff-Smith, 1979), semantics cannot be the only
source of information involved in solving the bootstrapping problem. Other sources of
language-external constraints include cultural learning, indicated by a child’s imitation of
linguistic forms in socially conventional contexts (Tomasello, Kruger, & Ratner, 1993). For
example, a child may perceive that the idiom “John let the cat out of the bag,” used in the
appropriate context, means that John has revealed some sort of secret, and not that he released
a feline from captivity. Important as both of these language-external sources are, it appears
that correlation and cultural learning must be coupled with language-internal information to
break down the linguistic forms into relevant units.

We do not challenge the important role that the two foregoing sources of information play in
language acquisition. We would argue, however, that language-internal information is
fundamental to bootstrapping the child into syntax. Because language-internal input is rich in
potential cues to linguistic structure, we offer a requisite feature of this information for
syntax acquisition: Cues may only be partially reliable individually, and a learner must
integrate an array of these cues to solve the bootstrapping problem. For example, a learner
could use the tendency for English nouns to be longer than verbs to conjecture that “bonobo” is
a noun, but the same strategy would
fail for ingratiate. Likewise, although speakers tend to pause at syntactic phrase boundaries in a sentence, pauses also occur elsewhere during normal language production. And although it is a good distributional bet that the definite article the will precede a noun, so might adjectives, such as silly. The child therefore needs to integrate a great diversity of probabilistic cues to language structure. Fortunately, as we review in the next section, there is now extensive evidence that multiple probabilistic cues are available in language-internal input, that children are sensitive to them, and that they facilitate learning through integration.

Bootstrapping through Multiple Language-Internal Cues

We explore three sources of language-internal cues: phonological, prosodic, and distributional. Phonological information includes stress, vowel quality, and duration, and may help distinguish grammatical function words (e.g., determiners, prepositions, and conjunctions) from content words (nouns, verbs, adjectives, and adverbs) in English (e.g., Cutler, 1993; Gleitman & Wanner, 1982; Monaghan, Chater & Christiansen, 2005; Monaghan, Christiansen & Chater, 2007; Morgan, Shi, & Allopenna, 1996; Shi, Morgan, & Allopenna, 1998). Phonological information may also help separate nouns and verbs (Monaghan et al., 2005, 2007; Onnis & Christiansen, 2008). For example, English disyllabic nouns tend to receive initial-syllable (trochaic) stress whereas disyllabic verbs tend to receive final-syllable (iambic) stress, and adults are sensitive to this distinction (Kelly, 1988). Acoustic analyses have also shown that disyllabic words that are noun–verb ambiguous and have the same stress placement can still be differentiated by syllable duration and amplitude cue differences (Sereno & Jongman, 1995). Even 3-year-old children are sensitive to this stress cue, despite the fact that few multisyllabic verbs occur in child-directed speech (Cassidy & Kelly, 1991, 2001). Additional noun/verb cues in English likely include differences in word duration, consonant voicing, and vowel types, and many of these cues may be cross-linguistically relevant (see Kelly, 1992; Monaghan & Christiansen, 2008, for reviews).

Prosodic cues help word and phrasal/clausal segmentation and may reveal syntactic structure (e.g., Gerken, Jusczyk & Mandel, 1994; Gleitman & Wanner, 1982; Kemler-Nelson, Hirsh-Pasek, Jusczyk, & Wright Cassidy, 1989; Morgan, 1996). Acoustic analyses find that pause length, vowel duration, and pitch all mark phrasal boundaries in English and Japanese child-directed speech (Fisher & Tokura, 1996). Perhaps in utero (Mehler et al., 1988) and beyond, infants seem highly sensitive to such language-specific prosodic patterns (Gerken et al., 1994; Kemler-Nelson et al., 1989; for reviews, see Gerken, 1996; Jusczyk & Kemler-Nelson, 1996; Morgan, 1996). Prosodic information also improves sentence comprehension in 2-year-olds (Shady & Gerken, 1999). In experiments using adult participants, artificial language learning is facilitated in the presence of prosodic marking of syntactic phrase boundaries (Morgan, Meier & Newport, 1987; Valian & Levitt, 1996). Neurophysiological evidence in the form of event-related brainwave potentials (ERP) in adults shows that prosodic information has an immediate effect on syntactic processing (Steinhauer, Alter, & Friederici, 1999), suggesting a rapid, on-line role for this important cue. While prosody is influenced to some extent by a number of nonsyntactic factors, such as breathing patterns, resulting in an imperfect mapping between prosody and syntax (Fernald & McRoberts, 1996), infants’ sensitivity to prosody argues for its likely contribution to syntax acquisition (Fisher & Tokura, 1996; Gerken, 1996; Morgan, 1996).

Distributional characteristics of linguistic fragments at or below the word level may also provide cues to grammatical category. Morphological patterns across words may be informative; e.g., English words that are observed to have both –ed and –s endings are likely to be verbs (Maratsos & Chalkley, 1980). In artificial language learning experiments, adults acquire grammatical categories more effectively when they are cued by such word-internal patterns (Brooks, Braine, Catalano & Brody, 1993; Frigo & McDonald, 1998). Corpus analyses reveal that word co-occurrence also gives useful cues to grammatical categories in child-directed speech (e.g., Mintz, 2003; Monaghan et al., 2005, 2007; Redington, Chater, & Finch, 1998). Given that function words primarily occur at phrase boundaries (e.g., initially in English and French and finally in Japanese), they can also help the learner by signaling syntactic structure. This idea has received support from corpus analyses (Mintz, Newport & Bever, 2002) and artificial language learning studies (Green, 1979; Morgan et al., 1987; Valian & Coulson, 1988). Finally, artificial language learning experiments indicate that duplication of morphological patterns across related items in a phrase (e.g., Spanish: Los Estados Unidos) facilitates learning (Meier & Bower, 1986; Morgan et al., 1987).

It is important to note that there is ample evidence that children are sensitive to these multiple sources of information. After just 1 year of language exposure, the perceptual attunement of children likely allows them to make use of language-internal probabilistic cues (for reviews, see Jusczyk, 1997, 1999; Kuhl, 1999; Pallier, Christophe & Mehler, 1997; Werker & Tees, 1999). Through early learning experiences, infants already appear sensitive to the acoustic differences between function and content words (Shi, Werker & Morgan, 1999) and the relationship between function words and prosody in speech (Shafer, D. W. Shucard, J. L. Shucard & Gerken, 1998). Young infants are able to detect differences in syllable number among isolated words (Bijeljac, Bertoncini & Mehler, 1993). In addition, infants exhibit rapid distributional learning (e.g., Gómez & Gerken, 1999; Saffran, Aslin, & Newport, 1996; see Gómez & Gerken, 2000; Saffran, 2003 for reviews), and importantly, they are capable of multiple-cue integration (Mattys, Jusczyk, Luce, & Morgan, 1999; Morgan & Saffran, 1995). When facing the bootstrapping problem, children probably also benefit from characteristics of child-directed speech, such as the predominance of short sentences (Newport, Gleitman & Gleitman, 1977) and exaggerated prosody (Kuhl et al., 1997).

In summary, phonological information helps to distinguish function words from content words and nouns from verbs. Prosodic information helps word and phrasal/clausal segmentation, thus serving to uncover syntactic structure. Distributional characteristics aid in labeling and segmentation, and may provide further cueing of syntactic relations. Despite the value of each source, none of these cues in isolation suffices to solve the bootstrapping problem. The learner must integrate these multiple cues to overcome the limited reliability of each individually. This review has indicated that a range of language-internal cues is available for language acquisition, that these cues affect learning and processing, and that mechanisms exist for multiple-cue integration. What is yet unknown is how far these cues can be combined to solve the bootstrapping problem (Fernald & McRoberts, 1996). Here we present connectionist simulations to demonstrate that efficient and robust computational mechanisms exist for multiple-cue integration (see also the chapters in this volume by Hannon, Kirkham, and Saffran, for evidence from human infant learning).

Although the multiple-cue approach is gaining support in developmental psycholinguistics, its computational efficacy still remains to be established. The simulations reported in this chapter are therefore intended as a first step toward a computational approach to multiple-cue integration, seeking to test its potential value in syntax acquisition. Based on our previous experience with modeling multiple-cue integration in speech segmentation (Christiansen, Allen, & Seidenberg, 1998), we used a simple recurrent network (SRN; Elman, 1990) to model the integration of multiple cues. The SRN is a feed-forward neural network equipped with an additional copy-back loop that permits the learning and processing of temporal regularities in the stimuli presented to it (see Figure 5.1). This makes it particularly suitable for exploring the acquisition of syntax, an inherently temporal phenomenon.

The networks were trained on corpora of artificial child-directed speech generated by a grammar that includes three probabilistic cues to grammatical structure: word length, lexical stress, and pitch. The grammar (described further below) was motivated by considering frequent constructions in child-directed speech in

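[Figure 5.1 diagram: the lexical input at time t feeds the hidden layer, which feeds the lexical layer at time t + 1; the hidden layer is copied back 1:1 to a context layer that holds the activation at time t – 1.]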

Figure 5.1 The general architecture of the simple-recurrent network (SRN) employed across simula-
tions. An input layer representing information relevant for individual words along with an utterance
boundary marker feeds into a hidden layer, and then to an output that predicts information relevant
to the following word in a corpus. The hidden layer copies itself to a context layer, which supplies a
limited memory for past words.
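
The architecture in Figure 5.1 can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation: the class name, the logistic activation, the absence of bias units, and the zero-valued starting context are all assumptions made for illustration, and no learning is included. Only the layer sizes, the 1:1 copy-back, and the [–0.1, 0.1] weight interval come from the chapter.

```python
import math
import random


def logistic(x):
    """Standard logistic squashing function (an assumed activation)."""
    return 1.0 / (1.0 + math.exp(-x))


class SimpleRecurrentNetwork:
    """Forward pass of an Elman-style SRN; training is deliberately omitted."""

    def __init__(self, n_input, n_hidden, n_output, seed=0):
        rng = random.Random(seed)

        def weights(rows, cols):
            # Initial weights in [-0.1, 0.1], the interval reported for Simulation 1.
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.w_input = weights(n_hidden, n_input)
        self.w_context = weights(n_hidden, n_hidden)
        self.w_output = weights(n_output, n_hidden)
        # Assumption: the context layer starts empty (all zeros).
        self.context = [0.0] * n_hidden

    def step(self, x):
        """Process one input vector and return the prediction for the next item."""
        hidden = [
            logistic(
                sum(w * v for w, v in zip(w_in, x))
                + sum(w * c for w, c in zip(w_ctx, self.context))
            )
            for w_in, w_ctx in zip(self.w_input, self.w_context)
        ]
        output = [logistic(sum(w * h for w, h in zip(row, hidden))) for row in self.w_output]
        # 1:1 copy-back: the hidden state becomes the context for the next time step.
        self.context = hidden[:]
        return output
```

Presenting the same localist word vector twice in a row yields different outputs, because the second presentation also sees the copied-back hidden state; this is the limited memory for past words that the caption describes.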

the CHILDES database (MacWhinney, 2000). Simulation 1 demonstrates how the integration of these three cues benefits the acquisition of syntactic structure by comparing performance across the eight possible cue combinations ranging from the absence of cues to the presence of all three.

Method

Networks

Ten networks were trained per condition, with an initial randomization of network connections in the interval [–0.1, 0.1]. Learning rate was set to 0.1, and momentum to 0. Each input to the networks contained a localist representation of a word (one unit = one word) and a set of cue units depending on cue condition. Words were presented one by one, and networks were required to predict the next word in a sentence along with the corresponding cues for that word. With a total of 44 words (see below) and a pause marking boundaries between utterances, the networks had 45 input units. Networks in the condition with all available cues had an additional five input units. The number of input and output units thus varied between 45 and 50 across conditions. Each network had 80 hidden units and 80 context units.

Materials

We constructed an idealized but relatively complex grammar based on independent analyses of child-directed speech corpora (Bernstein-Ratner, 1984; Korman, 1984) and a study of child-directed speech by mother–daughter pairs (Fisher & Tokura, 1996). As illustrated in Table 5.1, the grammar included three primary sentence types: declarative, imperative, and interrogative sentences. Each type consisted of a variety of common utterances reflecting the child’s exposure. For example, declarative sentences most frequently appeared as transitive or intransitive verb constructions (the boy chases the cat, the boy swims), but also included predication using be (the horse is pretty) and second person pronominal constructions commonly found in child-directed corpora (you are a boy). Interrogative sentences were composed of wh-questions (where are the boys?, where do the boys swim?), and questions formed by using auxiliary verbs (do the boys walk?, are the cats pretty?). Imperatives were the simplest class of sentences, appearing as intransitive or transitive verb phrases (kiss the bunny, sleep). Subject–verb agreement was upheld in the grammar, along with appropriate determiners accompanying nouns (the cars vs. *a cars).

Table 5.1 The Stochastic Phrase-Structure Grammar Used to Generate Training Corpora for Simulations 1–4
S → Imperative [0.1] | Interrogative [0.3] | Declarative [0.6]
Declarative → NP VP [0.7] | NP-ADJ [0.1] | That-NP [0.075] | You-P [0.125]
NP-ADJ → NP is/are adjective
That-NP → that/those is/are NP
You-P → you are NP
Imperative → VP
Interrogative → Wh-Question [0.65] | Aux-Question [0.35]
Wh-Question → where/who/what is/are NP [0.5] | where/who/what do/does NP VP [0.5]
Aux-Question → do/does NP VP [0.33] | do/does NP wanna VP [0.33] | is/are NP adjective [0.34]
NP → a/the N-sing/N-plur
VP → V-int | V-trans NP
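
As a rough illustration of how a grammar like the one in Table 5.1 can generate a corpus, the sketch below samples sentences using the table's rule probabilities. Several details are assumptions rather than part of the source: the function names are hypothetical, nouns, verbs, and adjectives are left as category symbols because the 44-word vocabulary is not listed here, the two VP expansions and the singular/plural choice are given equal odds, and number agreement is enforced in the simplest possible way.

```python
import random


def noun_phrase(rng):
    """NP -> a/the N-sing/N-plur, never pairing 'a' with a plural noun."""
    number = rng.choice(["sing", "plur"])  # assumption: equal odds
    determiner = rng.choice(["a", "the"]) if number == "sing" else "the"
    return [determiner, "N-" + number], number


def verb_phrase(rng):
    """VP -> V-int | V-trans NP (equal odds assumed; the table gives none)."""
    if rng.random() < 0.5:
        return ["V-int"]
    obj, _ = noun_phrase(rng)
    return ["V-trans"] + obj


def generate_sentence(rng):
    """Sample one sentence top-down with the probabilities of Table 5.1."""
    np, number = noun_phrase(rng)
    be = "is" if number == "sing" else "are"
    do = "does" if number == "sing" else "do"
    kind = rng.choices(
        ["Imperative", "Interrogative", "Declarative"], weights=[0.1, 0.3, 0.6]
    )[0]
    if kind == "Imperative":
        return verb_phrase(rng)
    if kind == "Declarative":
        form = rng.choices(
            ["NP VP", "NP-ADJ", "That-NP", "You-P"],
            weights=[0.7, 0.1, 0.075, 0.125],
        )[0]
        if form == "NP VP":
            return np + verb_phrase(rng)
        if form == "NP-ADJ":
            return np + [be, "adjective"]
        if form == "That-NP":
            return ["that" if number == "sing" else "those", be] + np
        return ["you", "are"] + np
    wh = rng.choice(["where", "who", "what"])
    if rng.random() < 0.65:  # Wh-Question
        if rng.random() < 0.5:
            return [wh, be] + np
        return [wh, do] + np + verb_phrase(rng)
    form = rng.choices(["do", "wanna", "be"], weights=[0.33, 0.33, 0.34])[0]
    if form == "do":
        return [do] + np + verb_phrase(rng)
    if form == "wanna":
        return [do] + np + ["wanna"] + verb_phrase(rng)
    return [be] + np + ["adjective"]
```

Calling `generate_sentence(random.Random(0))` repeatedly produces a list of token lists; mapping the category symbols onto actual lexical items would be a separate step.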

Each word was assigned a unit for input into the model, and we added a number of units to represent cues. Two basic cues were available to all networks. The fundamental distributional information inherent in the grammar could be exploited by all networks in this simulation. As a second basic cue, utterance-boundary pauses signaled grammatically distinct utterances with 92% reliability (Broen, 1972). This was encoded as a single unit that was activated at the end of all but 8% of the sentences. Other semireliable prosodic and phonological cues accompanied the phrase-structure grammar: word length, stress, and pitch. Network groups were constructed using different combinations of these three cues. Cassidy and Kelly (1991) demonstrated that syllable count is a cue available to English speakers to distinguish nouns and verbs. They found that the probability that a single-syllable word is a noun rather than a verb is 38%. This probability rises to 76% at two syllables, and 92% at three. We selected verb and noun tokens that exhibited this distinction, whereas the length of the remaining words was typical for their class (i.e., function words tended to be monosyllabic). Word length was represented in terms of three units using thermometer encoding; that is, one unit would be on for monosyllabic words, two for bisyllabic words, and three for trisyllabic words. Pitch change is a cue associated with syllables that precede pauses. Fisher and Tokura (1996) found that these pauses signaled grammatically distinct utterances with 96% accuracy in child-directed speech, allowing pitch to serve as a cue to grammatical structure. In the networks, this cue was a single unit that would be activated at the final word in an utterance. Finally, we used a single unit to encode lexical stress as a possible cue to distinguish stressed content words from the reduced, unstressed form of function words. This unit would be on for all content words.

Eight groups of networks, one for each combination of cues (all cues, 2 cues, 1 cue, or none), were trained on corpora consisting of 10,000 sentences generated from the grammar. Each network within a group was trained on a different randomized training corpus. Training consisted of 200,000 input/output presentations (words), or approximately 5 passes through the training corpus. Each group of networks had cues added to its training corpus depending on cue condition. Networks were expected to predict the next word in a sentence, along with the appropriate cue values. A corpus consisting of 1,000 novel sentences was generated for testing. Performance was measured by assessing the networks’ ability to predict the next set of grammatical items given prior context. Importantly, this measure did not include predictions of cue information, and all network conditions were thus evaluated by exactly the same performance measure.

To provide a statistical benchmark with which to compare network performance, we trained bigram and trigram models on the same corpora as the networks. These finite-state models, borrowed from computational linguistics, provide a simple prediction method based on strings of two (bigrams) or three (trigrams) consecutive words. Comparisons with these simple models provide an indication of whether the networks are learning more than simple two- or three-word associations.

Results

After training, SRNs trained with localist output representations will produce a distributional pattern of activation closely corresponding to a probability distribution of possible next items. In order to assess the overall performance of the SRNs, we made comparisons between network output probabilities and the full conditional probabilities given the prior context. For example, the full conditional probabilities given the context of “The boy chases . . . ” can be represented as a vector containing the probabilities of being the next item in this sentence for each of the 44 words in the vocabulary and the pause. To ensure that our performance measure can deal with novel test sentences not seen during training, we estimate the prior conditional probabilities based on lexical categories rather than individual words (Christiansen & Chater, 1999). Suppose, in the example above, that every continuation of this sentence fragment in the training corpus always involved the indefinite determiner “a” (as in “The boy chases a cat”). If we did not base our full conditional probability estimates on lexical categories, we would not be able to assess SRN performance on novel sentences in which the definite determiner “the” followed the example fragment (as in “The boy chases the cat”). Formally, we thus have the following Equation 5.1, with $c_i$ denoting the category of the $i$th word in the sentence:

\[
P(c_p \mid c_1, c_2, \ldots, c_{p-1}) \cong \frac{\mathrm{Freq}(c_1, c_2, \ldots, c_{p-1}, c_p)}{\mathrm{Freq}(c_1, c_2, \ldots, c_{p-1})} \tag{5.1}
\]

where the probability of getting some member of a given lexical category as the $p$th item, $c_p$, in a sentence is conditional on the previous $p-1$ lexical categories. Note that for the purpose of performance assessment, singular and plural nouns are assigned to separate lexical categories throughout Simulations 1–4, as are singular and plural verbs. Given that the choice of lexical items for each category is independent, and that each word in a category is equally frequent, the probability of encountering a particular word $w_n$, which is a member of a category $c_p$, is simply inversely proportional to the number of items, $C_p$, in that category. So, overall, we have the following equation:

\[
P(w_n \mid c_1, c_2, \ldots, c_{p-1}) \cong \frac{\mathrm{Freq}(c_1, c_2, \ldots, c_{p-1}, c_p)}{\mathrm{Freq}(c_1, c_2, \ldots, c_{p-1})\, C_p} \tag{5.2}
\]

If the networks are performing optimally, then the vector of output unit activations should exactly match these probabilities. We evaluate the degree to which each network performs successfully by measuring the mean squared error between the vectors representing the network’s output and the conditional probabilities (with 0 indicating optimal performance).

All networks achieved better performance than the standard bigram/trigram models (p-values < .0001), suggesting that the networks had acquired knowledge of syntactic structure beyond the information associated with simple pairs or triples of words. Figure 5.2A illustrates the best performance achieved by the trigram model as well as SRNs provided with no cues (the baseline network), a single cue (length, stress, or pitch), and three cues. The nets provided with one or more phonological/prosodic cues achieved significantly better performance than baseline networks (p-values < .02). Using trigram performance as criterion, all multiple-cue networks surpassed this level of performance faster than the baseline networks as shown in Figure 5.2B (p-values < .002). Moreover, the three-cue networks were significantly faster than the single-cue networks (p-values < .001). Finally, using Brown-Forsythe tests for variability in the final level of performance, we found that the three-cue networks also exhibited significantly more uniform learning than the baseline networks (F(1,18) = 5.14, p < .04), as depicted in Figure 5.2C.
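
The category-based benchmark of Equations 5.1 and 5.2, together with the mean squared error used to score the networks, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, and Freq is taken to be the number of training sentences that begin with the given sequence of lexical categories.

```python
from collections import Counter


def prefix_counts(category_sentences):
    """Count how many sentences start with each category prefix (Freq in Eq. 5.1)."""
    counts = Counter()
    for categories in category_sentences:
        for length in range(len(categories) + 1):
            counts[tuple(categories[:length])] += 1
    return counts


def word_probability(counts, context, category, category_size):
    """Eq. 5.2: probability of one particular word of `category` after `context`.

    Setting category_size to 1 gives the category probability of Eq. 5.1.
    """
    context = tuple(context)
    seen = counts[context]
    if seen == 0:
        return 0.0  # unseen context: no estimate available in this sketch
    return counts[context + (category,)] / (seen * category_size)


def mean_squared_error(output, target):
    """Mean squared error between network output and target probabilities."""
    return sum((o - t) ** 2 for o, t in zip(output, target)) / len(target)
```

For example, if two of three training sentences that begin with a determiner, a noun, and a verb continue with a determiner, and the determiner category holds two words, each determiner receives a target probability of (2/3)/2 = 1/3 in that context.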

[Figure 5.2 panels: (A) final mean squared error for the Trigram, Baseline, Length, Stress, Pitch, and All cues conditions; (B) amount of training to trigram mean (x 1000) for the Baseline, Length, Stress, Pitch, and All cues conditions; (C) variance (Z-score) for the Baseline, Length, Stress, Pitch, and All cues conditions.]

Figure 5.2 Comparison of learning performance for different cue combinations in Simulation 1,
showing that multiple-cue integration leads to (A) better learning (as measured by the lowest error
obtained on the test corpus), (B) faster learning (measured in terms of the amount of training needed
to surpass the performance of the trigram model), and (C) more uniform learning (as indicated by less
variance across the performance of the different instances of the network). (Error bars = S.E.M.)

SIMULATION 2: SENTENCE COMPREHENSION IN 2-YEAR-OLDS

Simulation 1 provides evidence for the general feasibility of multiple-cue integration for supporting syntax learning. To further demonstrate the relevance of the model to language development, closer contact with human data is needed (Christiansen & Chater, 2001). In the current simulation, we demonstrate that the three-cue networks from Simulation 1 are able to accommodate experimental data showing that 2-year-olds can integrate grammatical markers (function words) and prosodic cues in sentence comprehension (Shady & Gerken, 1999: Experiment 1). In this study, children heard sentences, such as (1) [see below], in one of three prosodic conditions depending on pause location: early natural [e], late natural [l], and unnatural [u]. Each sentence moreover involved one of three grammatical markers: grammatical (the), ungrammatical (was), and nonsense (gub).

1. Find [e] the/was/gub [u] dog [l] for me.

The child’s task was to identify the correct picture corresponding to the target noun (dog). Children performed the task best when the pause location delimited a phrasal boundary (early/late), and with the grammatical marker the. Simulation 2 models these data by using comparable stimuli and assessing noun unit activations.

Method

Twelve three-cue networks of the same architecture and training used in Simulation 1 were used in each prosodic condition in the infant experiment. This number was chosen to match the number of infants in the Shady and Gerken (1999) experiment. An additional unit was added to the networks to encode the nonsense word (gub) in Shady and Gerken’s experiment.

We constructed a sample set of sentences from our grammar that could be modified to match the stimuli in Shady and Gerken. Twelve sentences for each prosody condition (pause location) were constructed. Pauses were simulated by activating the utterance-boundary unit. Because these pauses probabilistically signal grammatically distinct utterances, the utterance-boundary unit provides an approximation of what the children in the experiment would experience. Finally, the nonsense word was added to the stimuli for the within group condition (grammatical vs. ungrammatical vs. nonsense). Adjusting for vocabulary differences, the networks were tested on comparable sentences, such as (2):

2. Where does [e] the/is/gub [u] dog [l] eat?

Procedure

Each group of networks was exposed to the set of sentences corresponding to its assigned pause location (early vs. late vs. unnatural). No learning took place, since the fully trained networks were used. To approximate the picture selection task in the experiment, we measured the degree to which the networks would activate the groups of nouns following the/is/gub. The two conditions were expected to affect the activation of the nouns.

Results

The human results for the prosody condition in Shady and Gerken (1999) are depicted in Figure 5.3A. They reported a significant effect of prosody on the picture selection task. The same was true for our networks (F(2,33) = 1,253.07, p < .0001), and the pattern of noun activations closely resembles that of the toddlers’ correct picture choice as evidenced by Figure 5.3B. The late natural condition elicited the highest noun activation, followed by the early natural condition, and with the unnatural condition yielding the least activation. The experiment also revealed an effect of grammaticality as can be seen from the human data shown in Figure 5.3C. We similarly obtained a significant grammaticality effect for our networks (F(2,70) = 69.85, p < .0001), which, as illustrated by Figure 5.3D, produced the highest noun activation

[Figure 5.3 panels: (A) percent correct picture identification for the Early, Late, and Unnatural pause conditions; (B) accumulated noun activation for the Early, Late, and Unnatural pause conditions; (C) percent correct picture identification for the Grammatical, Nonsense, and Ungrammatical markers; (D) accumulated noun activation for the Grammatical, Nonsense, and Ungrammatical markers.]

Figure 5.3 The effect of prosody and grammatical markers on human and SRN sentence processing.
(A) Percent correct picture identification by 2-year-olds in the prosody condition of the Shady and
Gerken (1999) experiment, with pauses inserted early, late, or in the unnatural position between the
determiner and the noun. (B) Total activation of nouns by the SRN when exposed to the same pro-
sodic manipulation as the human children. (C) Picture identification performance in the grammatical
marker condition in Shady and Gerken (1999), involving a grammatical, nonsense, or ungrammatical
word before the target noun. (D) Matching SRN activation of nouns for the same three types of gram-
matical markers. (Error bars = S.E.M.)

following the determiner, followed by the nonsense word, and lastly for the ungrammatical word. Again, the network results match the pattern observed for the toddlers. One slight discrepancy is that the networks are producing higher noun activation following the nonsense word compared to the ungrammatical marker. This result is however consistent with the results from a more sensitive picture selection task, showing that children were more likely to end up with a semantic representation of the target following nonsense syllables compared to incorrectly used morphemes (Carter & Gerken, 1996). Thus, the results suggest that the syntactic knowledge acquired by the networks mirrors the kind of sensitivity to syntactic relations and prosodic content observed in human children. Together with Simulation 1, the results also demonstrate that multiple-cue integration may both facilitate syntax acquisition, and underlie some patterns of linguistic skill observed early on in human performance. In the next simulation, we show that the multiple-cue perspective can simulate possible prosodic scaffolding that occurs much earlier in development: prenatal attunement to prosody.
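
The stimulus manipulation and the noun-activation measure of Simulation 2 can be sketched roughly as below. The details are assumptions for illustration only: the function names and the `#` pause token are hypothetical, the pause is treated as its own item in the input sequence, and the measure is taken to be a plain sum over the output units that stand for nouns.

```python
PAUSE = "#"  # hypothetical token standing for the utterance-boundary unit


def build_stimulus(marker, pause_location):
    """Build a test item modeled on sentence (2): Where does [e] the/is/gub [u] dog [l] eat?"""
    words = ["where", "does", marker, "dog", "eat"]
    # Early pause: before the marker; unnatural: between marker and noun; late: after the noun.
    insert_at = {"early": 2, "unnatural": 3, "late": 4}[pause_location]
    return words[:insert_at] + [PAUSE] + words[insert_at:]


def accumulated_noun_activation(output, noun_units):
    """Sum the output activations over the units that represent nouns."""
    return sum(output[i] for i in noun_units)
```

Under these assumptions, `build_stimulus("gub", "unnatural")` returns `["where", "does", "gub", "#", "dog", "eat"]`, and the activation measure would be read off the network's output vector after the marker has been presented.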

SIMULATION 3: THE ROLE OF PRENATAL EXPOSURE

Studies of 4-day-old infants suggest that the attunement to prosodic information may begin prior to birth (Mehler et al., 1988). We suggest that this prenatal exposure to language may provide a scaffolding for later syntactic acquisition by initially focusing learning on certain aspects of prosody and gross-level properties of phonology (such as word length) that later will play an important role in postnatal multiple-cue integration. In the current simulation, we test this hypothesis using the connectionist model from Simulations 1 and 2. If this scaffolding hypothesis is correct, we would expect that prenatal exposure corresponding to what infants receive in the womb would result in improved acquisition of syntactic structure.

Method

Ten SRNs were used in both prenatal and non-prenatal groups, with the same initial conditions and training details as Simulation 1. Each network was supplied with the full range of cues used in Simulation 1.

A set of “filtered” prenatal stimuli was generated using the same grammar as previously (Table 5.1), with the exception that input/output patterns now ignored individual words and only involved the units encoding word length, stress, pitch change, and utterance boundaries. The postnatal stimuli were the same as in Simulation 1.

The networks in the prenatal group were first trained on 100,000 input/output filtered presentations drawn from a corpus of 10,000 new sentences. Following this prenatal exposure, the nets were then trained on the full input patterns exactly as in Simulation 1. The non-prenatal group only received training on the postnatal corpora. As previously, networks were required to predict the following word and corresponding cues. Performance was again measured by the prediction of following words, ignoring the cue units.

Results

Both network groups exhibited significantly higher performance than the bigram/trigram models (F(1,18) = 25.32, p < .0001 for prenatal, F(1,18) = 12.03, p < .01 for non-prenatal), again indicating that the networks are acquiring complex grammatical regularities that go beyond simple adjacency relations. We compared the performance of the two network groups across different degrees of training using a two-way analysis of variance with training condition (prenatal vs. non-prenatal) as the between-network factor and amount of training as within-network factor (five levels of training measured in 20,000 input/output presentation intervals). There was a main effect of training condition (F(1,18) = 12.36, p < .01), suggesting that prenatal exposure significantly improved learning. A main effect of degrees of training (F(9,162) = 15.96, p < .001) reveals that both network groups benefited significantly from training. An interaction between training conditions and degrees of training indicates that the prenatal networks learned significantly better than the non-prenatal networks (F(1,18) = 9.90, p < .01). Finally, as illustrated by Figure 5.4, prenatal input also resulted in faster learning (measured in terms of the amount of training needed to surpass the trigram model; F(1,18) = 9.90, p < .01). The exposure to prenatal input, void of any information about individual words, promotes better performance on the prediction task as well as faster learning overall. This provides computational support for the prenatal scaffolding hypothesis, derived as a prediction from the multiple-cue perspective on syntax acquisition.

So far, simulations have demonstrated the importance of cue integration in syntax acquisition, that integration can match data obtained in infant experiments, and that this perspective can provide novel predictions in language


[Figure 5.4 plot: amount of training to trigram mean (x 1000) for the No prenatal and Prenatal groups.]

Figure 5.4 Speed of learning for networks trained with or without prenatal exposure to prosody and gross-level properties of phonology. (Error bars = S.E.M.)

development. A possible objection to these simulations is that our networks succeed at multiple-cue integration because they are only provided with cues that are at least partially relevant for syntax acquisition. Consequently, performance might drop significantly if the networks themselves had to discover which cues were partially relevant and which were not. Simulation 4 therefore tests the robustness of our multiple-cue approach when faced with additional, uncorrelated distractor cues. Accordingly, we added three distractor cues to the previous three reliable cues. These new cues encoded the presence of word-initial vowels, word-final voicing, and relative (male/female) speaker pitch, all of which are acoustically salient in speech but do not appear to cue syntactic structure.

Networks, groups, and training details were the same as in Simulation 3, except for three additional input units encoding the distractor cues.

The three distractor cues were added to the stimuli used in Simulation 3. Two of the cues were phonetic and therefore available only in postnatal training. The word-initial vowel cue appears in all words across classes. The second distractor cue, word-final voicing, also does not provide useful distinguishing properties of word classes. Finally, as an additional prenatal and postnatal cue, overall pitch quality was added to the stimuli. This was intended to capture whether the speaker was female or male. In prenatal training, the probability of a female speaker was set to be extremely high (90%), and lower in postnatal training (60%). In the womb, the mother’s voice naturally provides most of the input during the final trimester when the infant’s auditory system has begun to function (Rubel, 1985). The probability used here was intended to capture the likelihood that some experience would derive from other speakers as well. In postnatal training, this probability drops, representing exposure to male members of the linguistic community, but still favoring mother–child interactions.

Prenatal stimuli included the three previous semireliable cues, and only the additional prosodic distractor cue encoding relative speaker pitch. In the postnatal stimuli, all three distractor cues were added. Training and testing details were the same as in Simulation 3.

As in Simulations 1 and 3, both groups performed significantly better than the bigram/trigram models (F(1,18) = 18.95, p < .0001 for prenatal, and F(1,18) = 14.27, p < .001 for non-

[Figure 5.5 plot: amount of training to trigram mean (x 1000) for the Good cues only and Including distractors conditions.]

Figure 5.5 Speed of learning for networks trained with or without distractor cues. (Error bars = S.E.M.)

prenatal). We repeated the two-factor analysis artificial corpora. Next, we scale up the model
of variance computed for Simulation 3, reveal- to deal with naturalistic child-directed speech.
ing a main effect for training condition (F(1,18)
= 4.76, p < .05) and degrees of training (F(9,162)
= 13.88, p < .0001). This indicates that the pres-
ence of the distractor cues did not hinder the
improved performance following prenatal lan-
guage exposure. As in Simulation 3, the prenatal In this final simulation, we take a further step
networks learned comparatively faster than the toward describing the computational underpin-
non-prenatal networks (F(1,18) = 5.31, p < .05). nings of multiple-cue integration. The previous
To determine how the distractor cues affected series of simulations have demonstrated that
performance, we compared the prenatal con- SRNs provide a suitable model for integrating
dition in Simulation 3 with that of the current multiple cues when exposed to input generated
simulation. There was no significant difference by a psychologically motivated artificial gram-
in performance across the two simulations mar. Here we further show that the SRN scales
(F(1,18) = 0.13, p = .72). Moreover, as shown in up to deal with real child-directed speech. In
Figure 5.5, there was no difference in the speed particular, we seek to determine the extent to
of learning between the SRNs trained only which these networks are sensitive to the lex-
with good cues and those whose input included ical category information present in the set of
distractor cues (F(1,18) = .57, p = .46). A fur- phonological cues. To accomplish this task, we
ther comparison between these non-prenatal set up two identical groups of networks, each
networks and the bare networks in Simulation provided with a different encoding of the cor-
1 showed that the networks trained with cues pus. The encoding of the first corpus was based
of mixed reliability significantly outperformed on 16 phonological cues, previously shown by
networks trained without any cues (F(1,18) = Monaghan et al. (2005) to provide information
14.27, p < .001). This indicates that the uncor- useful for syntax acquisition. The second set
related cues did not prevent the networks from of input was encoded using the same cue vec-
integrating the partially reliable ones toward tors but randomized across lexical categories.
learning grammatical structure. Together with Possible performance differences in networks
the first three simulations, Simulation 4 dem- trained with these different input sets would be
onstrates that SRNs can integrate multiple cues due to lexical category information revealed by
efficiently when exposed to relatively complex the multiple phonological cues.
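The randomized control encoding can be illustrated with a minimal sketch. It assumes, following the Method description of this simulation, that cue vectors are reassigned one-to-one among word types and that every token of a word keeps the same reassigned vector. The function name, the toy lexicon, and the seed handling are hypothetical and are not taken from the original simulations. A plain shuffle is used, so a word may occasionally keep its own vector; a stricter reassignment would need an extra check.

```python
import random


def shuffle_cue_vectors(cue_vectors, seed=0):
    """Return a new word -> vector mapping with the vectors permuted
    across word types (hypothetical helper, not the authors' code).

    Each vector ends up on exactly one word, and the mapping is fixed,
    so every token of a word receives the same reassigned vector.
    """
    words = sorted(cue_vectors)            # fixed order for reproducibility
    vectors = [cue_vectors[w] for w in words]
    rng = random.Random(seed)              # local generator, seeded
    rng.shuffle(vectors)                   # permute vectors across words
    return dict(zip(words, vectors))


# Toy lexicon with made-up two-unit cue vectors (illustration only).
toy = {"dog": (1.0, 0.0), "run": (0.0, 1.0), "the": (0.5, 0.5)}
randomized = shuffle_cue_vectors(toy, seed=1)
```

Because only the pairing of words and vectors changes, the set of input vectors is identical in both conditions; any performance difference then reflects the link between phonological form and lexical category.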

Method

Ten SRNs were used for the phonetic-input condition and the random-input condition, with an initial weight randomization in the interval [–0.1, 0.1]. A different random seed was used for each simulation. Learning rate was set to 0.1 and momentum to 0.7. Each input to the network contained a thermometer encoding for each of the 16 phonological cues from Monaghan et al. (2005), listed in Table 5.2. This encoding required 43 units (each of them in a range from 0 to 1) and a pause marking boundaries between utterances, resulting in the networks having 44 input units. Each output was encoded using a localist representation consisting of 14 different lexical categories and a pause marking boundaries between utterances, resulting in networks with 15 output units. Each network furthermore was equipped with 88 hidden units and 88 context units.

We trained and tested the network on a corpus of child-directed speech (Bernstein-Ratner, 1984). This corpus contains speech recorded from nine mothers speaking to their children over a 4- to 5-month period when the children were between the ages of 1 year and 1 month to 1 year and 9 months. The corpus includes 1,371 word types and 33,035 tokens distributed over 10,082 utterances. The sentences incorporate a number of different types of grammatical structures, showing the varied nature of the linguistic input to children. Utterances range from declarative sentences (Oh you need some space) to wh-questions (Where’s my apple) to one-word utterances (“Uh” or “hello”). Each word in the corpus corresponded to one of the 14 following lexical categories: nouns (19.5%), verbs (18.5%), adjectives (4%), numerals (<0.1%), adverbs (6.5%), articles (6.5%), pronouns (18.5%), prepositions (5%), conjunctions (4%), interjections (7%), complex contractions (8%), abbreviations (<0.1%), infinitive markers (1.2%), and proper names (1.2%). The training set consisted of 9,072 sentences (29,930 word tokens) from the original corpus. A separate test set consisted of 963 additional sentences (2,930 word tokens).

Each word was encoded in terms of the following 16 phonological cues from Table 5.2: number of phonemes (1–11), number of syllables (1–5), stress position (0 = no stress, 1 = 1st syllable stressed, etc.), proportion of reduced vowels (0–1), proportion of coronal consonants

Table 5.2 Phonological Cues that Distinguish between Lexical Categories

Nouns and Verbs

Nouns have more syllables than verbs (Kelly, 1992)
Bisyllabic nouns have 1st syllable stress, verbs tend to have 2nd syllable stress (Kelly & Bock, 1988)
Inflection -ed is pronounced /d/ for verbs, /@d/ or /Id/ for adjectives (Marchand, 1969)
Stressed syllables of nouns have more back vowels than front vowels. Verbs have more front vowels than
back vowels (Sereno & Jongman, 1990)
Nouns have more low vowels, verbs have more high vowels (Sereno & Jongman, 1990)
Nouns are more likely to have nasal consonants (Kelly, 1992)
Nouns contain more phonemes per syllable than verbs (Kelly, 1996)

Function and Content Words

Function words have fewer syllables than content words (Morgan, Shi & Allopenna, 1996)
Function words have minimal or null onsets (Morgan, Shi & Allopenna, 1996)
Function word onsets are more likely to be coronal (Morgan, Shi & Allopenna, 1996)
/D/ occurs word-initially only for function words (Morgan, Shi & Allopenna, 1996)
Function words have reduced vowels in the first syllable (Cutler, 1993)
Function words are often unstressed (Gleitman & Wanner, 1982)
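The input encoding described in this section (a single unit for binary and proportional cues, a thermometer code for cues with a wider range) can be sketched as follows. This is a minimal illustration under stated assumptions: it covers only three of the 16 cues, and the function names and the ordering of units are hypothetical rather than taken from the original simulations.

```python
def thermometer(value, n_units):
    """Thermometer code: the first `value` of `n_units` units are on."""
    if not 0 <= value <= n_units:
        raise ValueError("value must lie between 0 and n_units")
    return [1.0 if i < value else 0.0 for i in range(n_units)]


def encode_word(n_syllables, any_stress, prop_nasal):
    """Encode a hypothetical three-cue subset of a word's cue vector.

    n_syllables: 1-5, thermometer code over five units
    any_stress:  binary cue, one unit
    prop_nasal:  proportion in [0, 1], one unit holding a decimal value
    """
    vector = thermometer(n_syllables, 5)
    vector.append(1.0 if any_stress else 0.0)
    vector.append(float(prop_nasal))
    return vector


# A bisyllabic, stressed word in which a quarter of the consonants are nasal.
print(encode_word(2, True, 0.25))
# prints [1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.25]
```

A thermometer code keeps numerically close values close in the input space: a two-syllable word shares one active unit with a one-syllable word, which a one-of-n code would not preserve.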

(0–1), number of consonants in onset (1–3), consonant complexity (0–1), initial /D/ (1 if begins /D/, 0 otherwise), reduced first vowel (1 if first vowel is reduced, 0 otherwise), any stress (0 if no stress, 1 otherwise), final inflection (0 if none, /@d/ or /Id/, 1 if present), stress vowel position (from front to back, 1–3), vowel position (mean position of vowels, from front to back, 1–3), final consonant voicing (0: vowel, 1: voiced, 2: unvoiced), proportion of nasal consonants (0–1) and mean height of vowels (0–3). The cues that assume only binary values were encoded using a single unit (e.g., “any stress”, “initial /D/”). The cues that take on values between 0 and 1 (e.g., proportion of vowel consonants) were also encoded using a single unit with a decimal number, whereas the cues that assume values in a broader range (e.g., number of syllables) were represented using a thermometer encoding; for example, one unit would be on for monosyllabic words, two for bisyllabic words, and so on. Finally, we used a single unit that would be activated at pauses between utterances.

The random-input networks were trained using input for which we randomly distributed the multiple-cue vectors among all the words in the corpus. Thus, the vector encoding for a given word would be randomly reassigned to a different word in the corpus regardless of its lexical category. Each phonological vector was assigned to only one word. Moreover, each token of a word was represented using the same random vector for all occurrences of that word in the test and training sets.

Procedure

Ten networks were trained on phonological cues and 10 control networks were trained on the random vectors. Training consisted of one pass through the training corpus. We used the same 10 random seeds for both simulation conditions. The networks were trained to predict the lexical category of the next word. The task of mapping phonological cues onto lexical categories may seem somewhat artificial because children are not provided directly with the lexical categories of the words to which they are exposed. However, children do learn early on to use pragmatic and other cues to discover the meaning of words. Given that the networks in our simulations only have access to linguistic information, we see lexical categories as a “stand-in” for more ecologically valid cues that we hope to be able to include in future work.

Results

We recorded the output vectors for the two groups of networks. Because the output consisted of localist representations for each lexical category (one unit = one lexical category) along with the utterance-final pause, we could use Equation 5.1 to estimate the full conditional probabilities, comparing network predictions to the full conditional probabilities for the next lexical category using the mean cosine of the angle between the two vectors (with 1 corresponding to optimal performance). We compared the predictions of the phonetic-input networks with those of the random-input networks. Figure 5.6A shows a comparison of test-set performance for the phonetic-input networks with that of the random-input networks. The phonetic-input networks were significantly better than the random-input networks at predicting the next combination of lexical categories (p-values < .00005). These results suggest that distributional information is generally a stronger cue than phonological information, even though the latter does lead to better learning overall. However, phonological information may provide the networks with a better basis for processing novel lexical items. Next, we probe the internal representations of the two sets of networks in order to gain further insight into their performance differences.

Probing the Internal Representations

Simulation 5 indicated that the phonetic-input networks did not benefit as much as one perhaps would have expected from the information provided by the phonological cues. However, the networks may nonetheless use this information to develop internal representations that better encode differences between lexical categories. This may allow them to go beyond the phonetic input and integrate it with the distributional

[Figure 5.6 panels: (A) y-axis, final mean cosine; (B) y-axis, percent correct classification; x-axis in both panels, Distrib cues only vs. Phon + Distrib cues.]

Figure 5.6 Performance of the network models trained on full-blown child-directed speech. (A) Test
performance for networks provided only with distributional cues and networks provided with both
phonological and distributional cues. (B) Results of the discriminant analyses, comparing the ability
of the two types of networks to place themselves in a “noun state” and a “verb state” when processing
novel nouns and verbs, respectively. (Error bars = S.E.M.)
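The performance measure plotted in panel A, the mean cosine between network output vectors and target probability vectors, can be sketched as below. This is only an illustration: Equation 5.1 is not reproduced in this section, so the target vectors are assumed to be given, and the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0.0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot / norm


def mean_cosine(predictions, targets):
    """Average cosine over paired prediction and target vectors.

    A value of 1 means every prediction points in exactly the same
    direction as its target (optimal performance).
    """
    pairs = list(zip(predictions, targets))
    if not pairs:
        raise ValueError("need at least one prediction/target pair")
    return sum(cosine(p, t) for p, t in pairs) / len(pairs)
```

The cosine ignores vector length and compares only direction, so a network whose output activations do not sum to 1 can still be scored against a probability distribution.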

information derived from the sequential order in which these vectors were presented. To investigate these possibilities, we carried out a series of discriminant analyses of network hidden unit activations as well as of the phonetic input vectors, focusing on the representations of nouns and verbs.

Informally, a linear discriminant analysis allows us to determine the degree to which it is possible to separate a set of vectors into two (or more) groups based on the information contained in those vectors. In effect, we attempt to use a linear plane to split the hidden unit space into a group of noun vectors and a group of verb vectors. Using discriminant analyses, we can statistically estimate the degree to which this split can be accomplished given a set of vectors.

We recorded the hidden unit activations from the two sets of networks in Simulation 5. The hidden unit activations were recorded for 200 novel nouns and 200 novel verbs occurring in unique sentences taken from other CHILDES corpora (MacWhinney, 2000). The hidden unit activations were labeled such that each corresponded to the particular lexical category of the input presented to the network (though the networks did not receive this information as input). For example, a vector would be labeled a noun vector when the hidden unit activations were recorded for a noun (phonetic) input vector. We also included a condition in which the noun/verb labels were randomized with respect to the hidden unit vectors for both sets of networks, in order to establish a random control.

We first compared the categorization performance of the two sets of networks, as illustrated in Figure 5.6B. The phonetic-input networks had developed hidden unit representations that allowed them to correctly separate 80.30% of the 400 nouns and verbs. This was significantly better than the random-input networks, which only achieved 73.15% correct separation (t(8) = 5.89, p < .0001). Both sets of networks surpassed their respective randomized controls (phonetic-input control: 69.05% – t(8) = 11.51, p < .0001; random-input control: 68.20% – t(8) = 3.92, p < .004). The controls for the two sets of networks were not significantly different from each other (t(8) = 0.82, p > .43). As indicated by our previous analyses of phonetic cue information in child-directed speech (Monaghan et al., 2005), the phonetic input vectors contained a considerable amount of information about lexical categories, allowing for 67.25% correct separation of nouns and verbs, but still significantly below the performance of the
phonetic-input networks (t(4) = 25.97, p < .0001). The random-input networks also surpassed the level of separation afforded by their input vectors (59.00% – t(4) = 12.80, p < .0001).

The results of the hidden-unit discriminant analyses suggest that not only did the phonetic-input networks develop internal representations better suited for distinguishing between nouns and verbs, but they also went beyond the information afforded by the phonetic input and integrated it with distributional information. Crucially, the phonetic-input networks were able to surpass the random-input networks, even though the latter were also able to use distributional information to go beyond the input. Consistent phonological information thus appears to be important for network generalization to novel nouns and verbs.

As described in an earlier part of this chapter, children who are learning syntax face a complex “chicken-and-egg” bootstrapping problem. A growing body of evidence from developmental cognitive science has suggested that a solution may come from a process of integrating multiple sources of probabilistic information, each of which is individually unreliable, but jointly advantageous (cf. Smith & Pereia chapter in this volume). What has so far been lacking is a demonstration of the computational feasibility of this approach, and the series of simulations reported here takes a first step toward accomplishing this. We have demonstrated that providing SRNs with prosodic and phonological cues significantly improves their acquisition of syntactic structure (Simulation 1), and that the three-cue networks can mimic children’s sensitivity to both prosodic and grammatical cues in sentence comprehension (Simulation 2). The model illustrates the potential value of prenatal exposure (Simulation 3) and provides evidence for the robustness of multiple-cue integration, since highly unreliable cues did not interfere with the integration process (Simulation 4). Finally, we expanded these results by showing that SRNs can also utilize highly probabilistic information found in 16 phonological cues in the service of syntactic acquisition when trained on a naturalistic corpus of child-directed speech (Simulation 5). Analysis of the networks’ hidden unit activations provided further evidence that the integration of phonological and distributional cues during learning leads to more robust internal representations of lexical categories, at least when it comes to distinguishing between the two major categories of nouns and verbs.

Overall, the simulation results presented in this chapter provide support not only for the multiple-cue integration approach in general, but also for using neural network architectures to explore the integration of distributional, prosodic, and phonological information in language acquisition. Some researchers have challenged the value of multiple probabilistic cues (e.g., Fernald & McRoberts, 1996), but we have computationally demonstrated that their integration results in faster, better, and more uniform learning, even in the face of distracting information. Our simulations, along with artificial language learning experiments (Billman, 1989; Brooks et al., 1993; McDonald & Plauche, 1995; Morgan et al., 1987), underscore multiple-cue integration as a means of facilitating the complex task of syntax acquisition.

We have elsewhere explored the evolutionary emergence of phonological cues in agent-based simulations (Christiansen & Dale, 2004). In these evolutionary simulations, languages were mutated slightly across generations of randomized SRN learners. For any given generation, the languages best learned by the networks were allowed to be passed down to the next generation. Results showed that cross-linguistic variation emerges in stable linguistic cues. Nevertheless, observed stable cue systems were consistent in that syntactic categories were marked by phonological cues, as found in English, French, Japanese, and other languages (as reviewed above). This stability was particularly strong when languages had larger lexicons, indicating that multiple-cue integration may have contributed to language evolution by aiding a learner’s acquisition of a growing set of lexical items and classes.

Because different natural languages employ different constellations of cues to signal syntactic
distinctions, an important question for further research is exactly how a child’s learning mechanisms discover which cues are relevant and for which aspects of syntax. This problem is compounded by the fact that the same cue may work in different directions across different languages. A case in point is that nouns tend to contain more vowels and fewer consonants than verbs in English, whereas nouns and verbs in French show the opposite pattern (Monaghan et al., 2007). So how can the child learn which cues are relevant and in which direction? One possibility may be to encode the correlations between cues in the linguistic environment. This view is supported by related mathematical analyses based on the Vapnik-Chervonenkis (VC) dimension (Abu-Mostafa, 1993), showing that the integration of multiple “hints” or cues of correlated information reduces the number of hypotheses a learning system has to entertain. The VC dimension specifies an upper bound for the amount of input needed by a learning process that starts with a set of hypotheses about a task solution. Cue information may lead to a reduction in the VC dimension by weeding out unhelpful hypotheses and thus lowering the number of examples needed to find a solution. In other words, the integration of multiple cues may reduce learning time by reducing the number of steps necessary to find an appropriate function approximation, as well as reduce the set of candidate functions considered, thus potentially ensuring better generalization.

More generally, the development of computational multiple-cue integration models is still in its infancy. There now exists a wealth of support for the usefulness of multiple probabilistic cues for language acquisition, and although theoretical models abound (e.g., Gleitman & Wanner, 1982; and contributions in Morgan & Demuth, 1996; Weissenborn & Höhle, 2001), only a few psychologically plausible computational models for multiple-cue integration are on offer (e.g., Cartwright & Brent, 1997). Extant models tend to capture the end-state of learning rather than the developmental process itself. This approach cannot identify the time course of different cues as they become important for acquisition. For example, the ability to use visual context information to resolve a syntactically ambiguous sentence does not appear until about 8 years of age, considerably later than the knowledge of constraints on constructions that may follow specific verbs (Snedeker & Trueswell, 2004). To reveal cue integration and its development, models must capture the developmental trajectory of cue use across different phases of language acquisition. We anticipate that the availability of so-called “dense” corpora, which sample the child’s input at a higher frequency (e.g., Behrens, 2006; Maslen, Theakston, Lieven, & Tomasello, 2004), will help the development of such constructivist-oriented models of language acquisition.

Future work should therefore provide more detailed analysis of the developmental trajectory of multiple-cue integration. Most work on cue availability in the child’s environment makes the simplifying assumption that all information is available to the child simultaneously. This is an oversimplification: Children’s productions indicate that the whole of language is not acquired in one step, but that overlapping phases of acquisition occur, where learning progress at any one time relies on progress that preceded it. Attempts to explain and exploit these learning phases in computational models have been successful in accounting for early processing constraints that facilitate later learning of complex syntactic structures (Elman, 1993), phrasal productions and errors in young children (Freudenthal, Pine, & Gobet, 2005), and the development of the lexicon (Steyvers & Tenenbaum, 2005). Such approaches could equally be applied to the computational simulation of multiple-cue integration reported in this chapter: The reliability of phonological, prosodic, or distributional cues could be based on the most frequent or earliest-learned words and constructed incrementally. Such a constructivist approach would enhance the cognitive plausibility of the availability and process of use of such cues by the developing child.

The wide array of phonological, prosodic, and distributional information sources in primary linguistic input may make the child’s learning task substantially easier than it might seem when we consider only the complexities
of syntax that they acquire. A domain-general learning mechanism, such as the SRN architecture used here, can capitalize on this rich information to acquire deep domain-specific knowledge that emerges through developmental time. Along with this language-internal information, surely innate and language-external constraints also contribute to the task, and future work should aim to integrate all three fundamental sources of constraints. We have nevertheless shown that even with relatively simple domain-general assumptions about the learner, multiple-cue integration can facilitate the complex task of syntax acquisition. Theories of the language learner therefore should not overburden innate and language-external constraints where language-internal multiple-cue integration can help.

ACKNOWLEDGMENTS

This research was supported in part by a Human Frontiers Science Program Grant (RGP0177/2001-B) to M.H.C. Some of the material in this chapter was adapted from Christiansen, M. H., & Dale, R. (2001), Integrating distributional, prosodic and phonological information in a connectionist model of language acquisition, in Proceedings of the 23rd Annual Conference of the Cognitive Science Society (pp. 220–225), Mahwah, NJ: Lawrence Erlbaum, and Reali, F., Christiansen, M. H., & Monaghan, P. (2003), Phonological and distributional cues in syntax acquisition: Scaling up the connectionist approach to multiple-cue integration, in Proceedings of the 25th Annual Conference of the Cognitive Science Society (pp. 970–975), Mahwah, NJ: Lawrence Erlbaum.

REFERENCES

Abu-Mostafa, Y. S. (1993). Hints and the VC dimension. Neural Computation, 5, 278–288.
Behrens, H. (2006). The input–output relationship in first language acquisition. Language and Cognitive Processes, 21, 2–24.
Bernstein-Ratner, N. (1984). Patterns of vowel modification in motherese. Journal of Child Language, 11, 557–578.
Bijeljac, R., Bertoncini, J., & Mehler, J. (1993). How do 4-day-old infants categorize multisyllabic utterances? Developmental Psychology, 29, 711–721.
Billman, D. (1989). Systems of correlations in rule and category learning: Use of structured input in learning syntactic categories. Language and Cognitive Processes, 4, 127–155.
Broen, P. (1972). The verbal environment of the language-learning child. ASHA Monographs, No. 17. Washington, DC: American Speech and Hearing Society.
Brooks, P. J., Braine, M. D., Catalano, L., & Brody, R. E. (1993). Acquisition of gender-like noun subclasses in an artificial language: The contribution of phonological markers to learning. Journal of Memory and Language, 32, 76–95.
Carter, A., & Gerken, L. A. (1996). Children’s use of grammatical morphemes in on-line sentence comprehension. In E. Clark (Ed.), Proceedings of the Twenty-Eighth Annual Child Language Research Forum (Vol. 29). Palo Alto, CA: Stanford University Press.
Cartwright, T. A., & Brent, M. R. (1997). Syntactic categorization in early language acquisition: Formalizing the role of distributional analysis. Cognition, 63, 121–170.
Cassidy, K. W., & Kelly, M. H. (1991). Phonological information for grammatical category assignments. Journal of Memory and Language, 30, 348–369.
Cassidy, K. W., & Kelly, M. H. (2001). Children’s use of phonology to infer grammatical class in vocabulary learning. Psychonomic Bulletin and Review, 8, 519–523.
Christiansen, M. H., Allen, J., & Seidenberg, M. S. (1998). Learning to segment speech using multiple cues: A connectionist model. Language and Cognitive Processes, 13, 221–268.
Christiansen, M. H., & Chater, N. (1999). Toward a connectionist model of recursion in human linguistic performance. Cognitive Science, 23,
Christiansen, M. H., & Chater, N. (2001). Connectionist psycholinguistics: Capturing the empirical data. Trends in Cognitive Sciences, 5, 82–88.
Christiansen, M. H., & Dale, R. (2004). The role of learning and development in the evolution of language. A connectionist perspective. In D. Kimbrough Oller & U. Griebel (Eds.), Evolution of communication systems: A comparative approach. The Vienna Series in Theoretical Biology (pp. 90–109). Cambridge, MA: MIT Press.

Cutler, A. (1993). Phonological cues to open- and closed-class words in the processing of spoken sentences. Journal of Psycholinguistic Research, 22, 109–131.
Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14, 179–211.
Elman, J. L. (1993). Learning and development in neural networks: The importance of starting small. Cognition, 48, 71–99.
Fernald, A., & McRoberts, G. (1996). Prosodic bootstrapping: A critical analysis of the argument and the evidence. In J. L. Morgan & K. Demuth (Eds.), From Signal to syntax (pp. 365–388). Mahwah, NJ: Lawrence Erlbaum Associates.
Fisher, C., & Tokura, H. (1996). Acoustic cues to grammatical structure in infant-directed speech: Cross-linguistic evidence. Child Development, 67, 3192–3218.
Freudenthal, D., Pine, J. M., & Gobet, F. (2006). Modelling the development of children’s use of optional infinitives in English and Dutch using MOSAIC. Cognitive Science, 30, 277–310.
Frigo, L., & McDonald, J. L. (1998). Properties of phonological markers that affect the acquisition of gender-like subclasses. Journal of Memory and Language, 39, 218–245.
Gerken, L. A. (1996). Prosody’s role in language acquisition and adult parsing. Journal of Psycholinguistic Research, 25, 345–356.
Gerken, L. A., Jusczyk, P. W., & Mandel, D. R. (1994). When prosody fails to cue syntactic structure: Nine-month-olds’ sensitivity to phonological vs. syntactic phrases. Cognition, 51, 237–265.
Gleitman, L., & Wanner, E. (1982). Language acquisition: The state of the state of the art. In E. Wanner & L. Gleitman (Eds.), Language acquisition: The state of the art (pp. 3–48). Cambridge, UK: Cambridge University Press.
Gómez, R. L., & Gerken, L. A. (1999). Artificial grammar learning by 1-year-olds leads to specific and abstract knowledge. Cognition, 70, 109–135.
Gómez, R. L., & Gerken, L. A. (2000). Infant artificial language learning and language acquisition. Trends in Cognitive Sciences, 4, 178–186.
Green, T. R. G. (1979). The necessity of syntax markers: Two experiments with artificial languages. Journal of Verbal Learning and Verbal Behavior, 18, 481–496.
Jusczyk, P. W. (1997). The discovery of spoken language. Cambridge, MA: MIT Press.
Jusczyk, P. W. (1999). How infants begin to extract words from speech. Trends in Cognitive Sciences, 3, 323–328.
Jusczyk, P. W., & Kemler-Nelson, D. G. (1996). Syntactic units, prosody, and psychological reality during infancy. In J. L. Morgan & K. Demuth (Eds.), Signal to syntax: Bootstrapping from speech to grammar in early acquisition (pp. 389–408). Mahwah, NJ: Lawrence Erlbaum Associates.
Karmiloff-Smith, A. (1979). A functional approach to child language: A study of determiners and reference. Cambridge, UK: Cambridge University Press.
Kelly, M. H. (1988). Phonological biases in grammatical category shifts. Journal of Memory and Language, 27, 343–358.
Kelly, M. H. (1992). Using sound to solve syntactic problems: The role of phonology in grammatical category assignments. Psychological Review, 99, 349–364.
Kemler-Nelson, D. G., Hirsh-Pasek, K., Jusczyk, P. W., & Wright Cassidy, K. (1989). How the prosodic cues in motherese might assist language learning. Journal of Child Language, 16, 55–68.
Korman, M. (1984). Adaptive aspects of maternal vocalization in differing contexts at ten weeks. First Language, 5, 44–45.
Kuhl, P. K. (1999). Speech, language, and the brain: Innate preparation for learning. In M. Konishi & M. Hauser (Eds.), Neural mechanisms of communication (pp. 419–450). Cambridge, MA: MIT Press.
Kuhl, P. K., Andruski, J. E., Chistovich, I. A., Chistovich, L. A., Kozhevnikova, E. V., Ryskina, V. L., et al. (1997). Cross-language analysis of phonetic units in language addressed to infants. Science, 277, 684–686.
MacWhinney, B. (2000). The CHILDES project: Tools for analyzing talk (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
Maratsos, M., & Chalkley, M. A. (1980). The internal language of children’s syntax: The ontogenesis and representation of syntactic categories. In K. Nelson (Ed.), Children’s language (Vol. 2, pp. 127–214). New York: Gardner Press.
Maslen, R., Theakston, A., Lieven, E., & Tomasello, M. (2004). A dense corpus study of past tense and plural overregularization in English. Journal of Speech, Language and Hearing Research, 47, 1319–1333.
Mattys, S. L., Jusczyk, P. W., Luce, P. A., & Morgan, J. L. (1999). Phonotactic and prosodic effects on word segmentation in infants. Cognitive Psychology, 38, 465–494.
McDonald, J. L., & Plauche, M. (1995). Single and correlated cues in an artificial language learning paradigm. Language and Speech, 38, 223–236.
Mehler, J., Jusczyk, P. W., Lambertz, G., Halsted, N., Bertoncini, J., & Amiel-Tison, C. (1988). A precursor of language acquisition in young infants. Cognition, 29, 143–178.
Meier, R. P., & Bower, G. H. (1986). Semantic reference and phrasal grouping in the acquisition of a miniature phrase structure language. Journal of Memory and Language, 25, 492–505.
Mintz, T. H. (2003). Frequent frames as a cue for grammatical categories in child directed speech. Cognition, 90, 91–117.
Mintz, T. H., Newport, E. L., & Bever, T. G. (2002). The distributional structure of grammatical categories in speech to young children. Cognitive Science, 26, 393–424.
Monaghan, P., Chater, N., & Christiansen, M. H. (2005). The differential contribution of phonological and distributional cues in grammatical categorisation. Cognition, 96, 143–182.
Monaghan, P., & Christiansen, M. H. (2008). Integration of multiple probabilistic cues in syntax acquisition. In H. Behrens (Ed.), Trends in corpus research: Finding structure in data (pp. 139–163) (TILAR Series). Amsterdam: John Benjamins.
Monaghan, P., Christiansen, M. H., & Chater, N. (2007). The phonological–distributional coherence hypothesis: Cross-linguistic evidence in language acquisition. Cognitive Psychology, 55, 259–305.
Morgan, J. L. (1996). Prosody and the roots of parsing. Language and Cognitive Processes, 11, 69–106.
Morgan, J. L., Shi, R., & Allopenna, P. (1996). Perceptual bases of grammatical categories. In J. L. Morgan & K. Demuth (Eds.), Signal to syntax: Bootstrapping from speech to grammar in early acquisition (pp. 263–283). Mahwah, NJ: Lawrence Erlbaum Associates.
Newport, E. L., Gleitman, H., & Gleitman, L. R. (1977). Mother, I’d rather do it myself: Some effects and non-effects of maternal speech style. In C. E. Snow & C. A. Ferguson (Eds.), Talking to children: Language input and acquisition (pp. 109–149). Cambridge, UK: Cambridge University Press.
Onnis, L., & Christiansen, M. H. (2008). Lexical categories at the edge of the word. Cognitive Science, 32, 184–221.
Pallier, C., Christophe, A., & Mehler, J. (1997). Language-specific listening. Trends in Cognitive Sciences, 1, 129–132.
Pinker, S. (1984). Language learnability and language development. Cambridge, MA: Harvard University Press.
Redington, M., Chater, N., & Finch, S. (1998). Distributional information: A powerful cue for acquiring syntactic categories. Cognitive Science, 22, 425–469.
Rubel, E. W. (1985). Auditory system development. In G. Gottlieb & N. A. Krasnegor (Eds.), Measurement of audition and vision in the first year of postnatal life (pp. 53–89). Norwood, NJ: Ablex.
Saffran, J. R. (2003). Statistical language learning: Mechanisms and constraints. Current Directions in Psychological Science, 12, 110–114.
Saffran, J. R., Aslin, R. N., & Newport, E. L. (1996). Statistical learning by 8-month-old infants. Science, 274, 1926–1928.
Sereno, J. A., & Jongman, A. (1995). Acoustic correlates of grammatical class. Language and
Morgan, J. L., & Demuth, K. (1996). Signal to syn- Speech, 38, 57–76.
tax: Bootstrapping from speech to grammar Shady, M., & Gerken, L. A. (1999). Grammatical
in early acquisition. Mahwah, NJ: Lawrence and caregiver cues in early sentence comprehen-
Erlbaum Associates. sion. Journal of Child Language, 26, 163–175.
Morgan, J. L., Meier, R. P., & Newport, E. L. (1987). Shafer, V. L., Shucard, D. W., Shucard, J. L., &
Structural packaging in the input to language Gerken, L. A. (1998). An electrophysiologi-
learning: Contributions of prosodic and mor- cal study of infants’ sensitivity to the sound
phological marking of phrases to the acqui- patterns of English speech. Journal of Speech,
sition of language. Cognitive Psychology, 19, Language, and Hearing Research, 41, 874–886.
498–550. Shi, R., Morgan, J., & Allopenna, P. (1998).
Morgan, J. L., & Saff ran, J. R. (1995). Emerging Phonological and acoustic bases for earliest
integration of sequential and suprasegmental grammatical category assignment: A cross-
information in preverbal speech segmentation. linguistic perspective. Journal of Child
Child Development, 66, 911–936. Language, 25, 169–201.

Shi, R., Werker, J. F., & Morgan, J. L. (1999). Tomasello, M., Kruger, A. C., & Ratner, H. H.
Newborn infants’ sensitivity to perceptual cues (1993). Cultural learning. Behavioral and Brain
to lexical and grammatical words. Cognition, Sciences, 16, 495–552.
72, B11–B21. Valian, V., & Coulson, S. (1988). Anchor points in
Snedeker, J., & Trueswell, J. (2004). The devel- language learning: The role of marker frequency.
oping constraints on parsing decisions: The Journal of Memory and Language, 27, 71–86.
role of lexical-biases and referential scenes in Valian, V., & Levitt, A. (1996). Prosody and adults’
child and adult sentence processing. Cognitive learning of syntactic structure. Journal of
Psychology, 49(3), 238–299. Memory and Language, 35, 497–516.
Steinhauer, K., Alter, K., & Friederici, A. D. (1999). Weissenborn, J., & Höhle, B. (Eds.) (2001).
Brain potentials indicate immediate use of Approaches to bootstrapping: Phonological, lex-
prosodic cues in natural speech processing. ical, syntactic and neurophysiological aspects of
Nature Neuroscience, 2, 191–196. early language acquisition. Philadelphia, PA:
Steyvers, M., & Tenenbaum, J. (2005). The large John Benjamins.
scale structure of semantic networks: Statistical Werker, J. F., & Tees, R. C. (1999). Influences
analyses and a model of semantic growth. on infant speech processing: Toward a new
Cognitive Science, 29, 41–78. synthesis. Annual Review of Psychology, 50,
Tomasello, M. (2000). The item-based nature of 509–535.
children’s early syntactic development. Trends
in Cognitive Sciences, 4, 156–163.
Shape, Action, Symbolic Play, and Words: Overlapping Loops of Cause and Consequence in Developmental Process

Linda B. Smith and Alfredo F. Pereira

Human beings are remarkably inventive, possessing the ability to solve problems and to create novel things. This chapter is about one early form of inventiveness that has long intrigued developmentalists: what is sometimes called symbolic play but is more narrowly known as “object substitution in play.” The specific phenomenon consists of young children using some object not for what it is but as a “stand in” for something else in play: a banana as a phone, a box as a doll bed, a shoe as a toy car. Piaget (1962) considered object substitution in play (using a banana as a phone, for example) “symbolic” because the substituted object could be interpreted as “standing for” the real thing. This view that object substitution is a form of symbolizing (whatever precisely that means) has been disputed (Namy, 2002; Perner, 1991). Regardless of different opinions on this issue, object substitution in play remains a signal of developmental achievement, emerging at the same time (18 to 24 months) as children’s spoken vocabulary expands. Perhaps most critically, object substitution in play is strongly linked to individual children’s language development (see McCune, 1995; McCune-Nicolich, 1981; Shore, O’Connell, & Bates, 1984; Veneziano, 1981), with the lack of this behavior being a strong predictor of significant subsequent language delay (e.g., McCune-Nicolich, 1981; Weismer, 2007). This chapter is about how and why these object substitutions may be linked to language learning through developmental changes in visual object recognition.

At a broader level, this chapter is about the fundamentally constructive nature of developmental process itself: how development creates new forms of behaviors and abilities from the interaction of multiple processes, engaged in different assemblies in overlapping tasks; how every developmental cause is itself a consequence of developmental process; how development is made of weird loops of causes and consequences with far-reaching and unexpected developmental dependencies.

A summary of the developmental story we will narrate is provided by Figure 6.1: Learning object names increases children’s attention to shape, which in turn speeds up object name learning. Learning object names also changes how children perceive object shape, which facilitates learning and generalizing object names and actions. Acting on objects, in turn, refines and tunes the representation of object shape, making it even more abstract. Along the way, we will suggest and provide evidence for the idea that the abstract representation of object shape is the critical link between object name learning and object substitutions in play.

Common object categories, categories such as chair, cup, spoon, house, and dog are (by adult
judgment) well organized by shape (Biederman, 1987; Rosch, Mervis, Gray, Johnson, & Boyes-Braem, 1976; Samuelson & Smith, 1999). More critically, in the vocabulary of typically developing 2-year-olds, over 70% of the nouns are for objects similar in shape (Samuelson & Smith, 1999). Accordingly, a large literature has been concerned with children’s attention to three-dimensional object shape in the context of early word learning. Landau, Smith, and Jones (1988) reported one key result: They showed 2- and 3-year-old children a novel wooden object of a particular shape and named it with a novel count noun, “This is a dax.” The children were then presented with test objects that matched the exemplar in shape, size, or texture and were asked about each of those objects “Is this a dax?” Children generalized the name to test objects that were the same in shape as the exemplar but not to test objects that were different in shape. The degree of children’s selective attention to shape was considerable: for example, they extended the name “dax” to same-shaped test objects that were 100 times the size of the original. This “shape bias” in novel noun generalization tasks has been demonstrated in many different studies and by different experimenters using a variety of both specially constructed and real objects (e.g., Gathercole & Min, 1997; Imai, Gentner, & Uchida, 1994; Keil, 1994; Soja, 1992) and is evident in children learning a variety of languages (Colunga & Smith, 2005; Gathercole & Min, 1997; Yoshida & Smith, 2003). This substantial area of research has generated a large number of well-replicated results important to the first loop.

Figure 6.1 Loops of causes and consequences in the development of visual object recognition. (Diagram nodes: learning object names, attention to shape, perception of object shape, play/action.)

First, attention to shape increases as children learn object names. Children initially (12- to 18-month-olds, see Gershkoff-Stowe & Smith, 2004; Rakison & Butterworth, 1998a, 1998b) do not systematically attend to object shape in naming and categorization tasks, but increasingly do so in the period between 18 and 30 months. Moreover, longitudinal studies show that in individual children, the emergence of the shape bias is temporally linked to a measurable spurt in the growth of object name vocabulary (Gershkoff-Stowe & Smith, 2004). Attention to shape also predicts developmental delays. Late talkers (children delayed in their early noun acquisitions) show systematic deficits in attention to object shape in naming tasks (Jones, 2003; Jones & Smith, 2005).

Second, attention to shape is causally related to object name learning. Teaching children to attend to shape facilitates novel noun acquisitions and accelerates the rate of real-world vocabulary development (Smith, Jones, Landau, Gershkoff-Stowe & Samuelson, 2002; see also, Samuelson, 2002). This was a 9-week longitudinal study. The children were 17 months of age at the start, too young to show a shape bias in novel noun generalization tasks and also on the early side of the increasing rate of new object names that begins around 18 to 22 months for most children. The children in the “Experimental” condition came to the laboratory once a week for 7 weeks. During that time, they played with 4 pairs of objects. The objects in each pair matched in shape but differed markedly in all other properties: color, texture, material, and size. The objects in each pair were named by the same novel name (e.g., “dax, riff, zup, toma”) and during each weekly play session, the experimenter named each of these training objects by its designated name at least 20 times.

On week 8, children’s ability to generalize these trained names was tested. If children learned that two particularly shaped objects were “daxes,” would they judge a novel object of a new size, new color, and new material but the same
shape to also be “a dax”? The answer is yes; the children learned the category. On week 9, children were tested in a novel noun generalization task, with all new objects and all new names. If shown a new never-before-seen thing and told its name, would these children know how to generalize the name to new instances by shape? Again, the answer is yes. The children learned not just about particular categories and the importance of shape, but also that shape in general matters for naming objects. A variety of control conditions were run in this series of training studies, including grouping without naming, playing with objects with neither grouping nor naming, or learning names for groups organized by texture or color. None of the children in these training groups generalized novel names for novel things on week 9 by shape.

The most dramatic result from these studies is the finding that training children to attend to shape in this laboratory task increased their rate of new object name acquisitions outside of the laboratory. Figure 6.2 shows the number of object names in children’s productive vocabulary (as measured by the MCDI, Fenson et al., 1994) at the start and end of the experiment. There was a marked increase in new noun learning for children in the Experimental but not the Control conditions, and training influenced the rate with which children in the Experimental condition added new object names to their vocabulary but not the rate with which they added other words. Learning names for things in shape-based categories teaches children to attend to shape when generalizing names for things, and doing so accelerates the learning of new object names, lexical categories that are in general well-organized by shape.

Figure 6.2 Object names and other words in children’s productive vocabulary at the start and end of training in the Smith et al. (2002) study. (Two panels: mean number of object names and mean number of other words, each on a scale from 0 to 80, at Pre (17 months) and Post (19 months).)

These training experiments may be microgenetic and targeted versions of what happens in everyday development. In the real world, young children more slowly learn names for many different things at the same time. Although these categories are not as well organized by shape as the training categories, many are categories of things that are mostly similar in shape. As children learn these categories, attention to shape in the context of naming things may increase, and, as a consequence, children may learn object names more rapidly, which should further tune attention to object
shape. Every word learned sets the stage for and constrains future learning. The shape bias in early noun learning is in this way both a cause and a consequence of learning object names.

There is one critical unexplained aspect of Loop 1. In order for a shape bias to work to help children learn names for everyday object categories, children must be able to recognize sameness in shape across different instances of a category. This is trivial in artificial noun-learning tasks since all objects are relatively simple and “same-shaped” (objects are the exact same shape). But this is not trivial in the real world. In order for children to learn, for example, that chairs are “chair-shaped” and to use that knowledge to recognize a new chair, they must be able to abstract the common shape from a whole array of chairs that have been experienced, each with its own unique and detailed shape. Real world instances of common noun categories, though judged by adults to be the “same shape” (e.g., Samuelson & Smith, 1999), are not exactly the same shape, but only similar in shape at some appropriate level of abstraction. This then is the critical next question: What is the proper description of shape for common object categories? When and how do children discover that description? Before answering this question, we step back to consider what is known about adult shape representations and visual object recognition.

The first key fact about human visual object recognition is that it is impressive: it is fast, seemingly automatic, robust under degraded viewing conditions, and capable of recognizing novel instances of a very large number of common categories (Biederman & Gerhardstein, 1993; Cooper, Biederman, & Hummel, 1992; Fize, Fabre-Thorpe, Richard, Doyon, & Thorpe, 2005; Pegna, Khateb, Michel, & Landis, 2004). The second key fact is that object recognition is not a single skill but a consortium of abilities. For example, in their everyday lives, people routinely recognize the dog whose nose is sticking out from the blanket, the highly unique modernistic chair, and the cup on the table as a particular and favorite cup. Although competing theories of object recognition (Biederman, 1987; Edelman, 1999; Ullman, 2000) often pit different kinds of hypothesized processes and representations against each other, it is likely that human object recognition is dependent on a multitude of partially distinct and partially overlapping processes (Hayward, 2003; Hummel, 2000; Marr, 1982; Peissig & Tarr, 2007; Peterson, 1999). That is, no single mechanism is likely to explain the full range of contexts in which people recognize objects as individuals and as instances of categories.

Theories of object recognition that concentrate on how people rapidly recognize instances of novel categories (and approaches to machine vision that attempt to build devices that can recognize novel instances of categories) often propose processes of shape recognition that depend on abstract, sparse representations of the global shapes of things. There are two general classes of such theories. According to “view-based” theories, people store representations of specific views of experienced instances and use prototypes (a kind of average simplified shape that captures the global structure) to recognize new category instances (Edelman, 1995; Edelman & Duvdevani-Bar, 1997; Edelman & Intrator, 1997). In Edelman’s (1995) account, category learning plays the critical role in creating prototypes of the holistic shape of category members. Novel instances are subsequently categorized by their overall similarity to these representations. “Object-based” theories such as Biederman’s (1987) recognition-by-components (RBC) account present another idea about what constitutes “sameness in shape.” This theory proposes that objects are perceptually parsed, represented, and stored as configurations of geometric volumes (“geons”). Within this account, object shape is defined by 2 to 4 geometric volumes in the proper spatial arrangement, an idea supported by the fact that adults need only 2 to 4 major parts to recognize instances of common categories (Biederman, 1987; Hummel & Biederman, 1992) as illustrated at the bottom of Figure 6.3. This account thus posits sparse and impoverished representations that, through their high level of abstraction, can gather all
variety of highly different things into a “same shape” category. Both classes of theories fit aspects of the adult data, which include strong view dependencies in object recognition and also knowledge of part structure and relations. Accordingly, there is a growing consensus that both kinds of theories may capture important but different processes in mature object recognition (Hayward, 2003; Peissig & Tarr, 2007; Peterson, 1999; Stankiewicz, 2003; Tarr & Vuong, 2002). For this chapter, the important point is that both approaches posit that sparse and abstract representations of object shape support the recognition of instances of common object categories. The question we want to answer is when and how children develop these representations. Despite the importance of object recognition to many domains of cognitive development, there was, until recently, extraordinarily little developmental research in this area (see Kellman, 2001).

Figure 6.3 Pictures of some of the three-dimensional objects, richly detailed and shape caricatures, used in Smith (2003).

The first study (Smith, 2003) asked whether young children (18 to 24 months) could recognize instances of common object categories from sparse representations of the geometric structure as can adults (e.g., Biederman, 1987). The experiment specifically contrasted richly detailed typical examples with “shape caricatures” as shown in Figure 6.3. The task was name comprehension (“get the camera”) and the 18- to 24-month-old participants were grouped by developmental level according to the number of object names in their productive vocabulary. The main results were that children with smaller and larger vocabularies (below 100 object names versus more than 100 object names) recognized the richly detailed instances equally well. However, children with smaller noun vocabularies performed at chance levels when presented with the shape caricatures, whereas the children with high noun vocabularies recognized the shape caricatures as well as they did the richly detailed and typical instances.

These results have been replicated in two further studies (Son, Smith, & Goldstone, 2008; Pereira & Smith, 2009). Further, a study of older late talkers (children whose productive vocabulary is below the 20th percentile for
their age) found a deficit in the recognition of shape caricatures but not richly detailed typical instances (Jones & Smith, 2005). Altogether, these results suggest a potentially significant change in how young children represent and compare object shape that is developmentally linked to the learning of object names. In particular, sparse representations of object shape appear to emerge between 18 and 24 months.

One other line of research suggests that these developmental changes may also involve a shift in the kind of stimulus information used to categorize and recognize objects. In particular, a number of studies suggest that children younger than 20 months attend to the individual parts or local details of objects rather than overall shape (Rakison & Butterworth, 1998a). In a series of programmatic studies, Rakison and colleagues (Rakison & Butterworth, 1998b; Rakison & Cohen, 1999; Rakison & Cicchino, 2008) showed that 14- and 22-month-old children based category decisions on highly salient parts (such as legs and wheels) and not on overall shape. For example, when children were presented with cows whose legs had been replaced by wheels, they classified the cows with vehicles rather than animals; likewise they categorized a vehicle as an animal when it had cow legs. Similarly, Colunga (2003) showed that 18-month-olds tended to look at only a small part of any pictured object, using clusters of local features such as the face when recognizing animals, or the grill and headlights when recognizing vehicles.

These results raise the possibility that very young children, perhaps before they develop sparser representations of object structure, recognize objects via what Cerella (1986) called “particulate perception,” concentrating on local components unintegrated into the whole. Such “part”-based object recognition is also suggestive of an approach to object recognition that has emerged in the machine vision literature: in particular, Ullman has developed a procedure through which objects are successfully recognized via stored representations of category-specific fragments (Ullman & Bart, 2004; Ullman, Vidal-Naquet, & Sali, 2002). If adults possess multiple distinct processes of object recognition that are used in different contexts or for different kinds of tasks, perhaps these processes each have their own developmental trajectories, with more “fragment” or “feature” processes developing earlier and sparse representations of global geometric structure emerging as a consequence of learning many shape-based object categories (see also the chapter in this volume by Johnson, for a similar developmental trajectory in young infants’ perceptual completion).

Pereira and Smith (2009) provide support for this idea in a direct comparison of young children’s ability to recognize objects given local featural details versus global geometric structure. The stimulus sets that they used resulted from a 2 × 2 design: the presence or absence of global information about geometric structure (labeled +Shape Caricature and –Shape Caricature) and the presence or absence of localized, finely detailed information predictive of the category (labeled +Local Details and –Local Details). Examples of the four stimulus conditions are in Figure 6.4. The +Shape Caricatures, structured as in the Smith (2003) experiment, were made from 1 to 4 geometric components in the proper spatial relations. The –Shape Caricatures were alterations of the +Shape Caricatures: the shapes of at least two component volumes were altered and, if possible, the spatial arrangements of two volumes relative to each other were rearranged. The presence of detailed local information was achieved by painting surface details on these volumes that were predictive of the target category, for example, the face of a dog, wheels, and so forth.

Figure 6.4 Photos of the stimuli used in Pereira and Smith’s (2009) experiment 2. Each set of four pictures contains clockwise from the upper left: (–Local Details, +Shape Caricature), (–Local Details, –Shape Caricature), (+Local Details, –Shape Caricature), (+Local Details, +Shape Caricature). The black line close to each object is one inch in length.

Figure 6.5 shows the main result; the darker bars indicate performance when local details were present (+Local Details) and the solid bars indicate performance given the +Shape Caricatures, that is, when the appropriate though sparse global shape structure was present. The children in the lowest vocabulary group show their highest level of performance (the darker bars) when the stimuli present local details, and for these stimuli, the presence or absence of appropriate shape structure does not matter. The children in the most advanced vocabulary group perform best given the appropriate sparse
representations of global shape. In brief, there is increasing recognition of the shape caricatures with increasing vocabulary size and a greater dependence on local features earlier in children’s vocabulary development. These results add to the growing number of findings suggesting significant changes in visual object recognition in the second year of life (Rakison & Lupyan, 2008; Smith, 2003; Son, Smith, & Goldstone, 2008).

The very idea that object recognition may change substantially during this developmental period is not commonly considered in studies of categorization and concepts in infancy and early childhood. This is so even though we know that there is at least one domain in which recognition undergoes significant changes as a function of development and experience. Specifically, face recognition is characterized

Figure 6.5 Mean proportion of objects correctly categorized (out of 6 trials) across the three vocabulary-level groups (Group I, Group II, Group III) for the Local Details × Shape Caricature interaction (Pereira & Smith, 2009). Between-subjects conditions –Local Details and +Local Details are shown in white and black bars, respectively. Within-subjects conditions +Shape Caricature and –Shape Caricature are shown in solid and patterned bars, respectively. Bars show means; error bars show mean ± 1.0 SE. Asterisks mark means that differ from chance (t-test comparing the mean with the average performance of 33% expected for random choice, p < 0.05).
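
The asterisks in Figure 6.5 rest on a comparison of each group mean with chance. As a minimal sketch of that kind of comparison, assuming a three-alternative choice (so that chance is 1/3) and using made-up per-child proportions rather than the study’s data, a one-sample t statistic can be computed as follows. The function name and the sample scores are hypothetical.

```python
import math
import statistics


def one_sample_t(scores, chance):
    """Return (t, df) for a one-sample t-test of mean(scores) against chance."""
    n = len(scores)
    mean = statistics.mean(scores)
    # Standard error of the mean, from the sample standard deviation (n - 1).
    se = statistics.stdev(scores) / math.sqrt(n)
    return (mean - chance) / se, n - 1


# Hypothetical data: proportion correct out of 6 trials for 8 children.
# These numbers are illustrative only and are not taken from the study.
scores = [k / 6 for k in (4, 3, 5, 2, 4, 3, 5, 4)]
CHANCE = 1 / 3  # assumed chance level for a three-alternative choice

t, df = one_sample_t(scores, CHANCE)
print(f"mean = {statistics.mean(scores):.3f}, t({df}) = {t:.2f}")
```

The resulting t value would then be compared with the critical value for the given degrees of freedom (for example, from a t table) to judge whether performance exceeds chance at p < 0.05.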

by strong early sensitivities in infancy yet also shows a slow and protracted course of development with adult-like expertise not achieved until adolescence (e.g., Mondloch, Le Grand, & Maurer, 2002). In this context, the idea of significant changes in object recognition and a possibly protracted course of development seems less surprising (as also suggested by Abecassis, Sera, Yonas, & Schwade, 2001).

It seems likely that the visual system develops the kinds of representations that support the task that needs to be done (Biederman & Kalocsai, 1997; Nelson, 2001). The task of object recognition for many different categories with many different potential instances in each category may demand a more abstract and geometric description of object shape, one in which, for example, a chair is a horizontal surface (to sit on) and a vertical surface (to support one’s back). The period in which children learn the names for many different categories of things could be a driving force behind these developments. And certainly, more abstract representations of object shape may foster more rapid category learning and generalization (see especially, Son, Smith, & Goldstone, 2007).

At the very least, there is growing evidence for significant changes in visual object recognition during the developmental period in which children’s object name learning is rapidly expanding. Multiple kinds of information may be used to recognize objects, and it appears that very young children, at the start of a period of rapid category learning, mostly rely on detailed local information to recognize instances of common categories but not on more abstract information about geometric structure. Children who are only slightly more advanced, however, do recognize common objects from such shape caricatures. This period of rapid developmental change seems crucial to understanding the nature of human object recognition and may also provide a crucial missing link in our understanding of the developmental trajectory in early object name learning, a trajectory of vocabulary growth that begins slow but
progresses to quite rapid learning characterized by the fast-mapping of object names to categories of things alike in shape.

Unexpected Connections: Symbolic Play

These changes in visual object recognition raise a new hypothesis about the developmental origins of object substitutions in play. Although these inventive actions seem likely to involve many interacting processes (including wanting to engage in thematic play), they may also depend critically on abstract descriptions of object shape. Using a banana as a phone, a shoe as a car, a stick as a bottle, and a pot as a hat all suggest sensitivity to high-level structural properties of shape. Accordingly, in a recently completed study, we examined whether children’s recognition of shape caricatures might predict the likelihood of object substitutions in play (Smith & Pereira, 2008). The participants were children 17 to 22 months of age. There were three dependent measures: (1) the number of nouns in the children’s productive vocabulary (by MCDI parent report, see Fenson et al., 1993), (2) children’s recognition of common categories from shape caricatures (given that they could recognize richly detailed instances of the same thing), and (3) performance in a symbolic play task.

To encourage symbolic play, children were given a set of toys organized around a theme but with one key object missing. A sample set is shown in Figure 6.6. For this set, the child was given a doll, a blanket, and a pillow (three objects suggesting a “going to bed” theme) but no bed. Instead the fourth object was a block. The question was whether the child would engage in thematic play, using the block as a bed. There were four such sets: the bed theme, an eating theme (doll, plate, spoon, pompoms to be potentially used as food), a car theme (road, bridge, stoplight, wooden shoe to be potentially used as a car), and a house theme (house, table, chair, and stick to be potentially used as a person). Because children might inadvertently use the object in a way consistent with its targeted role, we required that children do two successive acts involving the target object in thematic play (e.g., laid the doll on the block then immediately put the blanket on the doll) to score it as an instance of symbolic play.

Figure 6.6 Sample stimulus set from Smith and Pereira (2008) for the symbolic play task.

Figure 6.7 shows the performances of children with fewer than and more than 100 object names in their vocabulary: the proportion of shape caricatures they recognized (given they recognized the richly detailed instances) and the proportion of trials on which they engaged in symbolic play (as defined above). Children with fewer than 100 object names in their productive vocabulary were much less likely to use the target objects in thematic play and also much less likely to recognize the shape caricatures for

[Figure 6.7: bar chart. Y-axis: mean proportion of trials; x-axis: vocabulary level (under 100 nouns, over 100 nouns); bars: shape caricature, symbolic play task; error bars: ± 1 SE.]

Figure 6.7 Mean proportion of trials on which children with fewer and with more than 100 object
names in productive vocabulary recognized the shape caricatures in the caricature recognition task
and used the target object for the missing object in the symbolic play task (Smith & Pereira, 2008).
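
The group means and error bars plotted in Figure 6.7, and the correlation reported in the text, could be computed from per-child scores along the following lines. This is only a sketch in Python. The record fields (`nouns`, `caricature`, `play`), the function names, the treatment of a child with exactly 100 nouns, and the four example records are all assumptions invented for illustration; they are not data or procedures from Smith and Pereira (2008).

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_se(values):
    """Return (mean, standard error of the mean) for a list of proportions."""
    m = mean(values)
    se = stdev(values) / sqrt(len(values)) if len(values) > 1 else 0.0
    return m, se


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


def summarize(children, threshold=100):
    """Split children by productive noun vocabulary and summarize each group.

    Assumption: a child with exactly `threshold` nouns goes in the "under"
    group; the chapter does not say how that boundary case was handled.
    """
    groups = {"under": [], "over": []}
    for child in children:
        key = "over" if child["nouns"] > threshold else "under"
        groups[key].append(child)
    summary = {}
    for key, members in groups.items():
        if not members:
            continue  # skip an empty group rather than fail
        summary[key] = {
            "caricature": mean_and_se([c["caricature"] for c in members]),
            "play": mean_and_se([c["play"] for c in members]),
        }
    return summary


# Hypothetical records, invented purely to exercise the functions.
children = [
    {"nouns": 40, "caricature": 0.25, "play": 0.00},
    {"nouns": 75, "caricature": 0.50, "play": 0.25},
    {"nouns": 150, "caricature": 0.75, "play": 0.25},
    {"nouns": 220, "caricature": 1.00, "play": 0.75},
]

summary = summarize(children)
r = pearson_r([c["caricature"] for c in children],
              [c["play"] for c in children])
```

Each `(mean, se)` pair in `summary` corresponds to one bar and its error bar, and `r` plays the role of the across-children correlation between caricature recognition and symbolic play.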

common nouns. In contrast, children with over 100 object names in their productive vocabulary both used the target object for the missing object and also readily recognized the shape caricatures for common categories. Across the 36 children who participated in this study, there was a strong correlation between recognition of shape caricatures and symbolic play (r = .66, p < .001). Further, a control study, which replaced the target object with richly detailed instances (e.g., replaced the block with a toy bed, the stick with a person), yielded neither vocabulary-related differences in symbolic play nor a reliable correlation with shape caricature recognition. Finally, although vocabulary and age are correlated, size of the object name vocabulary is a better predictor of both symbolic play and shape caricature recognition than is age (see also Pereira & Smith, 2009).

Although many abilities seem likely to be important to the development of thematic play and children’s inventive object substitutions in that play, these results suggest that these object substitutions may be, at least in part, linked to language through developmental changes in visual object recognition, which are influenced by learning names for things.

The link between object substitution in play and the recognition of three-dimensional caricatures of the geometric structure of common objects suggests that an abstract and sparse description of object structure invites the generalization of actions and potentially classes of actions. This makes sense as there is a strong causal link between the shape of things, how they are held, how they feel while being held, and the actions those objects afford.

Contemporary research in cognitive neuroscience also indicates a coupling between brain regions involved in visually recognizing objects and in producing action that may be particularly relevant to an understanding of the developmental relationship between the abstraction of sparse descriptions of object structure and action. In particular, perceptual-motor interactions have been shown in behavioral paradigms and in recordings of brain activation (Christou

& Bulthoff, 1999; Craighero et al., 1996; Freyd, 1983; Harman, Humphrey, & Goodale, 1999; James, Humphrey, & Goodale, 2001; James et al., 2002; Tong et al., 1995; Wexler & van Boxtel, 2005; A. Wohlschlager & A. Wohlschlager, 1998). Specifically, there appear to be automatic links among visual systems used for object perception and recognition, and motor systems used to act on objects (Arbib, 1981; Chao & Martin, 2000; Grezes & Decety, 2002; James et al., 2006; Longcamp, Boucard, Gilhodes, & Vely, 2005; Paillard, 1991; Vivani & Stucchi, 1992) such that, upon visual perception of an object, motor areas are automatically activated. These studies have demonstrated that activation in motor cortices emerges upon visual presentation of manipulable objects such as tools (Chao & Martin, 2000) and kitchen utensils (Gerlach et al., 2000; Grezes & Decety, 2002; Grezes et al., 2003; Mecklinger, Gruenewald, Besson, Magnie, & Cramon, 2002), and manually created objects such as letters (James & Gauthier, 2006; Longcamp et al., 2005). That is, visual presentations of objects with which we have had extensive motor interactions appear to automatically activate the motor areas responsible for those actions.

There are several open questions about these links, including why and how they are constructed, and whether they play a role in visual recognition (James & Gauthier, 2006; Mecklinger et al., 2002; Wexler & van Boxtel, 2005). One possibility is that such links are neural correlates of mere associations, so that although the motor regions are activated in response to visual stimuli, perhaps as preparation for action, they play no role in the visual recognition of the objects. A second possibility is that these motor activations feed back on and actually influence and help select activations in visual regions. More radically, a third possibility is that a developmental history of the dynamic coupling of visual and motor activations constitutes the stored representation of the object, with the history of activations in visual regions influencing stored representations in motor regions, and with the history of activations in motor regions influencing stored representations in visual regions.

[Figure 6.8: diagram linking visual processing (shape, motion) with motor plan, proprioception, and haptic system.]

Figure 6.8 A schematic representation of the inter-relating of multiple simultaneous representations across modalities.

Figure 6.8 provides a schematic representation of this possibility, of what Edelman (1987) calls re-entry, the explicit interrelating of multiple simultaneous representations across modalities. For example, when a child is given a toy to hold and look at, both the visual and motor systems are simultaneously engaged. Together, they yield a constellation of sensations and movements associated with various actions on the toy and their consequences. Importantly, these multimodal experiences are time-locked and correlated. Changes in the way the hand feels when it moves the toy are time-locked with the changes the infant sees as the toy is moved. The time-locked correlations potentially create a powerful learning mechanism, as illustrated in the figure, which shows five related mappings. One map is between the physical properties of the toy and the neuronal activity in the visual system. Another map is between the physical properties of the toy and neuronal activity in the motor planning, proprioceptive, and haptic systems. A third map is between the motor systems and actions on the object. The fourth and fifth maps are what Edelman calls the re-entrant maps: activity in the visual system is mapped to the motor (and haptic and proprioceptive) systems, and activity in the motor system is mapped to the visual system. Thus independent mappings of the stimulus (the sight of it and the action on it) provide qualitatively different takes on the world.

By being correlated in real time, these different takes can potentially educate each other. At the same time as the visual system is activated by time-varying changes in visual information about shape and collinear movement of points on the toy, the motor and proprioceptive system is activated by action and felt movements. At every step in real time, the activities in these heterogeneous processes are mapped to each other, potentially enabling the coupled systems through their own activity to discover higher-order regularities that transcend the individual systems considered alone. Again, developmental relations may be bidirectional: changes in the perception of object shape may foster the generalizations of actions, and action, in turn, may promote attention to or the representation of relevant geometric properties of shape.

As a first step in considering how action may educate a sparse structural description of shape (the kind of category-encompassing description that enables one to recognize the sameness in shape across all varieties of shape), we have concentrated on two structural properties that are related to each other, commonly considered important across a wide variety of different theories of human and machine visual object recognition, and potentially informed by action: (1) planar and nonplanar views and (2) the major axes of an object. We consider each of these in turn, presenting background on why they are likely structural properties for seeking a relation between action and object recognition and also new evidence suggesting marked developmental change during the period between 18 and 24 months.

Preferred Views

There is ample evidence that not all viewpoints of objects are equal in terms of the ease with which an object is recognized from that viewpoint. Two views, off-axis versus on-axis views, have generated considerable interest in the adult object recognition literature. These two perspectives, illustrated in Figure 6.9, are often called the “3/4” and “planar” (front on and side on) views. These are the terms we will use here. With familiar objects, adults can recognize an object faster from a single image if that image is a 3/4 or off-axis view than if it is a planar view (Blanz, Tarr, & Bulthoff, 1999; Humphrey & Jolicoeur, 1993; Lawson & Humphreys, 1998; Newell & Findlay, 1997; Palmer, Rosch, & Chase, 1981). In addition, when asked to pick the “best” view of an object, adults will usually pick a 3/4 view (Blanz et al., 1999; Palmer et al., 1981). Critically, these results are specific to the

Figure 6.9 A planar and nonplanar view of an object.


recognition of pictures of well-known category instances. When adults are asked to pick the “best” view of a novel object, the planar views are picked as often as the 3/4 views (Blanz et al., 1999), perhaps because these views are less likely to occlude relevant object features that cannot be inferred for novel things. Similarly, when adults dynamically explore novel objects prior to visual recognition tasks, they actually prefer planar over 3/4 views. That is, during active exploration, adults spend a significantly greater amount of time looking at views of objects where axes are foreshortened (front) or elongated (side) (Harman et al., 1999; James et al., 2001, 2002; Perrett & Harries, 1988; Perrett et al., 1992). The suggestion from the above work is that the preferred view of an object depends upon whether the task is one of recognition (retrieval of information) or exploration (encoding of information), on whether the object in question is highly familiar or novel, and on whether it is a static, two-dimensional representation versus an actively perceived three-dimensional thing.

In a series of studies particularly relevant to our developmental work, James and colleagues asked adult subjects to view, for later tests of recognition, computer-rendered virtual 3-D objects on a computer monitor by rotating them using a trackball device (Harman et al., 1999; James et al., 2001). Subjects rotated the objects in any dimension (x, y, and z) for a total of 20 s. The subjects spent most of their viewing time on the planar views (see also James et al., 2002; Perrett & Harries, 1988; Perrett, Harries, & Looker, 1992). In a separate experiment, James et al. (2001) also showed that dynamically viewing mostly planar views facilitated subsequent recognition when compared with the dynamic viewing of mostly 3/4 views. Thus, not only do adults prefer to study planar views of novel objects, but also when these views are controlled experimentally, dwelling on and around the planar views promotes the formation of more robust memories of object shape.

When and how do these preferences emerge developmentally? We have new results (Pereira, James, Smith, & Jones, 2007) that suggest that this preference emerges between 18 and 24 months, the very same period in which children learn many object names, in which the shape bias emerges, in which children begin to recognize objects from sparse representations of geometric structure, and in which they first show shape-based object substitutions in thematic play. One cannot ask 18- to 24-month-old children to manipulate computer-rendered images with a trackball (the method used by James et al.). Instead, we asked children to manually and visually explore objects while wearing a head-camera, a methodology for tapping the first-person view developed by Yoshida and Smith (2008). Children were given novel and familiar objects to explore as they sat in a chair with no table so that each object could only be held by one or two hands. The child was given one object at a time to look at and explore for up to 20 s.

The data were coded using a custom-made software application that allowed a coder to compare an image taken from the camera to an image of a computer-rendered object. Here, we will report two analyses. The first is at a coarser grain, and just asked whether the view was “near planar” (within 15°) for each of the 6 possible planar views compared to a random object manipulation. The second analysis yields a detailed continuous presentation of the actual x, y, and z coordinates, a map of dwell times. The 18-month-old children showed no preference for the planar views compared to baseline when they explored either the novel or the familiar objects. In contrast, the 24-month-old children showed marked preferences for the planar views for known but not for novel objects, as shown in Figure 6.10. These results suggest an emerging preference for planar views in the active exploration of objects that begins with known object shapes and thus potentially may develop from experience with those particular objects. An important detail to consider here is that in the James et al. study, adults spent around 70% of their dwell time around planar views. This is considerably higher than the values here, even for the older group, so it seems that we have identified the beginning of this phenomenon.
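
The coarser of the two analyses can be pictured as a per-frame test followed by a simple proportion. The Python sketch below is an illustration under several assumptions that the chapter does not state: each head-camera frame has already been coded as a viewing direction in the object's own coordinate frame, the six planar views are the directions along the object's three principal axes (positive and negative), and frames are sampled at a constant rate so that the proportion of frames equals the proportion of dwell time. All function names are hypothetical. `chance_planar_proportion` gives the level expected if viewing directions were spread uniformly over the sphere, which may differ from the random-manipulation baseline the authors actually used.

```python
import math


def angle_to_nearest_planar_view(view_dir):
    """Angle in degrees between a viewing direction and the closest of the
    six axis-aligned (planar) viewing directions."""
    x, y, z = view_dir
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0:
        raise ValueError("viewing direction must be a nonzero vector")
    # The cosine of the angle to the nearest axis is the largest absolute
    # component of the normalized direction.
    cos_angle = max(abs(x), abs(y), abs(z)) / norm
    return math.degrees(math.acos(min(1.0, cos_angle)))


def is_near_planar(view_dir, tolerance_deg=15.0):
    """True if the view lies within tolerance_deg of a planar view."""
    return angle_to_nearest_planar_view(view_dir) <= tolerance_deg


def planar_dwell_proportion(frames, tolerance_deg=15.0):
    """Proportion of frames (hence of dwell time, at a constant frame rate)
    spent near a planar view."""
    if not frames:
        return 0.0
    hits = sum(1 for f in frames if is_near_planar(f, tolerance_deg))
    return hits / len(frames)


def chance_planar_proportion(tolerance_deg=15.0):
    """Fraction of the sphere covered by six non-overlapping caps of
    half-angle tolerance_deg (valid for tolerances below 45 degrees)."""
    return 3.0 * (1.0 - math.cos(math.radians(tolerance_deg)))
```

Under these assumptions, a child's dwell proportion for one object is `planar_dwell_proportion(frames)`, to be compared against whatever baseline is adopted.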

[Figure 6.10: bar chart. Y-axis: proportion dwell time on planar views; x-axis: under 100 object names, over 100 object names.]

Figure 6.10 Dwell times on planar views given exploration of novel and known objects for children with fewer and with more than 100 object names in productive vocabulary as compared to that expected (the solid line) by random object rotations (Pereira et al., 2007).

To the best of our knowledge, there are no prior developmental studies directly comparing children’s recognition of objects from 3/4 and planar views and no prior studies of the views of objects that children actively generate for themselves as they explore objects. This is a critical gap in our understanding of the development of visual object recognition. Children’s first-hand views of objects are the visual experiences on which they must build their object recognition systems and their representations of specific objects (see chapter by Johnson in this volume). That these views change with development (that younger children present different views to themselves than do older children) suggests a link between perceptual development and action, with developmental changes in object representations perhaps driving changes in the views that children present themselves. These self-generated views, in turn, seem likely to be driving forces in the development of object representations. Although we have not made the direct link yet, the fact that these changes in preferred views begin at the same time that children learn many object category names, begin to build abstract and category-encompassing representations of object structure, and engage in shape-based object substitutions in play suggests that they may be connected. At the very least, all the results reviewed thus far suggest that fundamentally important changes in visual object recognition, evident in a variety of task domains, are occurring between 18 and 24 months.

Axes of Elongation

Planar and 3/4 views are defined by the relation of the object’s axis of elongation to the viewer; that is, in the planar view, the axis of elongation is parallel or perpendicular to the viewer’s line of sight. Many theorists of visual object recognition have posited that axes of elongation (and/or symmetry, which is often correlated with the axis of elongation) play a particularly critical role in a current input’s activation of stored object representations (Biederman, 1987; Marr & Nishihara, 1992). This is because the object’s major axes are proposed to play an important role in parsing objects into their main parts and for the comparison of sensory inputs to stored representations (Biederman, 1987; Jolicoeur, 1985, 1990; Marr & Nishihara, 1992; Ullman, 1996). This makes sense: the principal axes are enduring characteristics of objects and provide a systematic means for transforming and comparing images by defining a common reference frame for alignment. Axes of elongation and their relation to the body are also important determiners of how objects are picked up, held, viewed, and used (Goodale & Humphrey, 1998; Jeannerod, 1988, 1997; Jones & Lederman, 2006; Milner & Goodale, 1995; Turvey, Park, Dumais, & Carello, 1998).

Studies of the influence of axes of elongation in adult object recognition mainly involve the presentation of objects from various viewpoints that differ in the relation of the axes of elongation to the viewer. Put together, these studies have yielded mixed results. Studies that have used pictures of highly familiar objects rotated in the picture plane typically find, at best, small effects (Large, McMullen, & Hamm, 2003; Sekuler, 1996). As Large et al. note, strong top-down effects in adults’ recognition of prototypical pictures of things may overwhelm any role for variations in the main axes. Consistent with this idea, Liu and Cooper (2001) found strong axis of symmetry effects in judgments about

nonsense objects. Thus, the principal axes may be particularly important in setting up one’s initial representations of an object, and perhaps in integrating multiple images into a coherent whole. In support of this idea, Jolicoeur (1985) found that axes of elongation were important in recognizing novel objects, but played a decreasing role with increasing object familiarity.

There is also evidence that object shape, and perceived axes of elongation and symmetry, depend on (and also influence) the perceived frame of reference (Quinlan & Humphrey, 1993; Rock, 1973; Sekuler, 1996; Sekuler & Swimmer, 2000). For example, the same pattern can be seen as a square or a diamond, depending on how one assigns the reference frame and the main axis of symmetry (Rock, 1973). Particularly relevant to our interest in the relation between action and perception, the perceived axes of elongation and symmetry in adults are also influenced by motion (Bucher & Palmer, 1985; Rock, 1973; Sekuler & Swimmer, 2000). Adults are biased to see both the main axis of symmetry and the main axis of elongation as parallel to the path of movement (Morikawa, 1999; Sekuler & Swimmer, 2000).

There is almost no evidence on how children perceive axes of elongation, on how principal axes relate to the development of object recognition, or on how action (holding and moving objects) may be related to the perceptual definition of the principal axes of an object (but see E.J. Gibson, 1969; Turvey et al., 1998), even though object shape and axes of elongation strongly influence not only how we hold and grasp objects, but also how objects may be used functionally (Goodale & Humphrey, 1998; Jeannerod, 1988; Jones & Lederman, 2006; Milner & Goodale, 1995). However, if the axes of elongation are important for setting up frames of reference for the perception of shape, as many theories of visual object recognition suggest, and if axes of elongation determine how we hold, grasp, and use objects, then children’s actions on objects may play an important developmental role in their discovery and representation of these axes, and thus in visual object recognition. There are many well-documented demonstrations of how action organizes perceptual development in other domains (e.g., Amso & Johnson, 2006; Bushnell & Bourdreau, 1993; Gibson, 1969; Needham, Barrett, & Peterman, 2002; Ruff & Rothbart, 1996). Two recent developmental studies in our laboratory suggest that this may also be the case in children’s representation of the major axes of an object.

The first relevant study (Smith, 2005) demonstrates how action may promote the discovery of deeper regularities concerning three-dimensional object structure, particularly, the definition of an object’s axes of elongation and symmetry. The participants were 24- to 30-month-old children. In the first experiment, the children were given a three-dimensional object to hold in one hand that is shown in Figure 6.11A. This nearly sphere-like exemplar object did not have a single main axis of elongation. In one condition, children moved the object up and down along a 1-m vertical path. In a second condition, they moved the object back and forth on a 1-m horizontal path. Immediately following, children were asked to group the exemplar object with other like things. No movement was involved in this categorization task. Children who had acted on the exemplar by moving it vertically grouped it with objects elongated on their vertical axes (Figure 6.11B), but children who had moved the exemplar horizontally grouped it with objects elongated on their horizontal axes. These categorization choices emerged only as a consequence of action and not when children merely observed someone else move the exemplar along the same path. The path of action thus selected or highlighted the corresponding visual axis, altering the perceived similarity of the exemplar to the test objects.

The second experiment in this study used an exemplar like that shown in Figure 6.11C, an exemplar not quite symmetrical around its center axis. The actions are illustrated in Figure 6.11D. Children who held the exemplar in one hand by one part and moved it back and forth subsequently grouped the exemplar with test objects (Figure 6.11E) that were less symmetrical in shape than the exemplar itself, as if they saw the exemplar as composed of two unequal parts. Children who held the exemplar in the two

Figure 6.11 Exemplars and test objects used in Smith (2005).

hands and rotated it about a central axis subsequently grouped the exemplar with objects more symmetrical in shape than the exemplar, as if they saw the exemplar as composed of two comparable and symmetric parts. Again, these results only obtained when children acted on the objects, not when they watched someone else do the action. The enacted action appears to have selected compatible visual descriptions of object shape.

Axes of elongation and symmetry are higher-order dimensions of object shape fundamental to processes of human object recognition (e.g., Marr, 1982). These results suggest that they may be developmentally defined not by vision alone but by the in-task coordination of

visual and motor processes. This is potentially of considerable importance. Theories of object recognition are for the most part theories of static object recognition (see also Liu & Cooper, 2003). Yet how we act on objects is intimately related to their shapes, and may even developmentally be defining of them. Every time the child lays a doll in a doll bed, or perhaps on top of a block as a pretend bed, the child acts in ways that may help define the major axes of an object and the frame of reference for comparing one shape to another. There are physical and biological constraints on how we can hold and move objects of different shapes and thus highly constrained associations between symmetry, elongation, and paths of movement that may bootstrap these developments. Related to this idea is Morikawa’s (1999) proposal that adults are biased to perceive movement parallel to an object’s long axis and that this bias derives from a regularity in the world, that objects in general move on paths parallel to their long axis (there are obvious exceptions: people, e.g., move orthogonally to their long axis). Still, a person’s movements of objects (rather than, or as well as, how objects move on their own) may well be systematically related to shape in ways that matter to the development of object recognition. And critically, the visual information young learners receive about objects varies systematically with their own actions on those objects. Thus it seems likely that changes in visual object recognition support developmental changes in action (including object substitution in play) and that those activities in turn help define and refine structural descriptions of shape. This, then, is another potential loop of codeveloping processes, of causes as consequences and consequences as causes. At the very least, the present results show that action has a strong influence on the range of shapes 2-year-olds take as being similar and appears to do so by defining axes of elongation and symmetry.

Our most recent work on this topic (Street, Smith, James, & Jones, 2008) uses a task that is commonly used to diagnose developmental delays and is included in many assessment procedures. This is a shape-sorting task in which children are presented with objects of various shapes and asked to fit them into a container through holes specific to those shapes (Wyly, 1997). Although there are normative standards for preschool children’s success in these tasks (and their perseveration in the task), there is remarkably little empirical study of the processes and skills that underlie success. We have preliminary evidence in a version of a shape-sorting task designed to measure children’s ability to abstract the axis of elongation of shapes of various complexities.

Our approach is based on the “posting” studies of Efron (1969) with adults and neuropsychological patients (see also Goodale & Milner, 1992; Milner, Perrett, & Johnston, 1991; Warrington, 1985). In these studies, subjects were given a range of “Efron rectangles”: flat, simple plaques that differ in their height–width ratio. Their task was to insert them in a slot aligned at a particular orientation. The critical dependent measure was whether subjects oriented the handheld object to match the slot. We use a much simpler version of this “posting task” to ask whether, given the goal of inserting an object in a slot, children align that object’s axis of elongation to the axis of elongation of the slot. This task thus provides a good measure of children’s ability to abstract the axis of elongation and to make use of that information in action. The participants are 30 children in two age groups, 17–18 and 23–24 months of age. Children are presented with a box with a quite large slot (7 by 21 cm) oriented either horizontally or vertically. They are then given objects, one at a time, and asked to put them into the slot. All the objects can be easily fit into the slot, either by aligning the axis of orientation or by tilting the object so that the foreshortened end goes in first. The key independent variables were the Orientation of the objects on the table (Matching or Mismatching the orientation of the slot) and the Complexity of the objects. Complexity of shape was manipulated in three ways: Shape Matches, solid rectangular blocks whose shape matched the slot; Simple Shapes, novel forms with height–width ratios comparable to the rectangular blocks; and Known Shapes, complex real objects with height–width ratios comparable to the

rectangular blocks but with multiple parts and a canonical axis of orientation (e.g., a tiger versus a rocket). Children wear the head-camera in this study so that we can record the alignment of object and slot from their point of view.

This is a highly enjoyable and engaging task and on virtually every trial the children inserted the object into the slot in one way or another. The first main result, however, is that this skill undergoes considerable developmental change in this period. Eighteen-month-old children struggle in this task, often making many wrong attempts (see Figure 6.12). In contrast, 24-month-old children are nearly perfect, aligning and inserting the object rapidly and almost without error. Our main dependent measure is degree of alignment, measured from the head-camera view as shown in Figure 6.12. For the younger children, this angle averages 33° across all objects; that is, these children were typically off the mark, and their error was greater for complex than for simple objects. In contrast, older children’s alignment error was less than 10° for all objects. How the objects were presented did not matter, perhaps because the children held and rotated them, exploring them, before attempting insertion. These results again suggest marked growth during this developmental period in children’s representation and use of the structural dimensions of three-dimensional shape.

This chapter began with a phenomenon often known as symbolic play, an extremely interesting behavior that has been strongly linked to language learning, to social interactions in collaborative play, and to developing tool use (see Rakoczy, Tomasello, & Striano, 2006). The program of research reviewed in this chapter in no way explains symbolic play, since that explanation will likely consist of a cascade of interacting processes beyond those involved in perceiving and representing object shape. The findings reviewed here, however, do suggest that one component of that larger developmental story will be changes in fundamental processes of visual object recognition, which is the main focus of this chapter. The entire pattern of results reviewed here strongly suggests that there are significant and consequential changes

Figure 6.12 Head-camera views from Street et al. (2008) of an 18-month-old infant inserting objects
into slots. The smaller images show the view from an additional camera. As shown in the third image,
alignment at first attempt is measured by the angle between the major axis of the slot and the major
axis of the object to be inserted.
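
The alignment measure described in the caption can be sketched as the angle between two undirected lines in the image plane. The Python below is a minimal illustration, assuming each major axis has been marked by a coder as a pair of image points; that coding scheme and the function names are hypothetical and are not taken from Street et al. (2008). Because an axis has no preferred direction, the error is folded into the range 0° to 90°.

```python
import math


def axis_orientation_deg(p1, p2):
    """Orientation of the line through p1 and p2, in degrees within [0, 180).
    Points are (x, y) image coordinates."""
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    if dx == 0 and dy == 0:
        raise ValueError("axis endpoints must be distinct")
    return math.degrees(math.atan2(dy, dx)) % 180.0


def alignment_error_deg(object_axis, slot_axis):
    """Smallest angle between the object's major axis and the slot's major
    axis; 0 means perfectly aligned and 90 means perpendicular."""
    diff = abs(axis_orientation_deg(*object_axis)
               - axis_orientation_deg(*slot_axis))
    return min(diff, 180.0 - diff)


def mean_alignment_error_deg(trials):
    """Average first-attempt alignment error over (object_axis, slot_axis)
    pairs, one pair per trial."""
    errors = [alignment_error_deg(obj, slot) for obj, slot in trials]
    return sum(errors) / len(errors)
```

With a horizontal slot coded as `((0, 0), (1, 0))`, an object axis coded as `((0, 0), (1, 1))` is treated as 45° out of alignment, and the same axis marked in the opposite direction gives the same error.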

in how children perceive, represent, and compare three-dimensional object shape, a shift from more piecemeal emphasis on local details to a sparse, and thus category encompassing, description of shape in terms of global geometric structure. These changes are seen in (1) the recognition of instances of common categories, (2) in object substitution in pretend play, (3) in active exploration of objects, and (4) in actions that make use of structural properties.

From this first set of studies, we cannot know with any certainty what causes what, but it may well be, as suggested by the opening Figure 6.1, that there are causal influences in all directions. Development, after all, occurs in real time, in incremental steps, across a number of interleaved real-time experiences. The child hears a new object named (say, an oddly shaped mug), uses it to drink from, sees and imitates his older brother pretend to use it as a hat. All these experiences influence what the child sees, what the child feels, and how this one experience is connected to other experiences. The inventiveness of human cognition, its adaptability and power, may, quite literally, be constructed from the overlapping, mutually influencing, interactions of many different tasks involving and educating the same component processes.

This research was supported by National Institute for Child Health and Development (R01HD 28675); Portuguese Ministry of Science and Higher Education PhD scholarship SFRH/BD/13890/2003 and a Fulbright fellowship to A.F.P.

REFERENCES

Abecassis, M., Sera, M. D., Yonas, A., & Schwade, J. (2001). What’s in a shape? Children represent shape variability differently than adults when naming objects. Journal of Experimental Child Psychology, 78, 213–239.
Amso, D., & Johnson, S. P. (2006). Learning by selection: Visual search and object perception in young infants. Developmental Psychology, 42(6), 1236–1245.
Arbib, M. A. (1981). Visuomotor coordination: From neural nets to schema theory. Cognition and Brain Theory, 4, 23–39.
Biederman, I. (1987). Recognition-by-components: A theory of human image understanding. Psychological Review, 94(2), 115–147.
Biederman, I., & Gerhardstein, P. C. (1993). Recognizing depth-rotated objects: Evidence and conditions for three-dimensional viewpoint invariance. Journal of Experimental Psychology, 19(6), 1162–1182.
Biederman, I., & Kalocsai, P. (1997). Neurocomputational bases of object and face recognition. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 352(1358), 1203–1219.
Blanz, V., Tarr, M. J., & Bulthoff, H. H. (1999). What object attributes determine canonical views? Perception, 28(5), 575–599.
Bucher, N. M., & Palmer, S. E. (1985). Effects of motion on perceived pointing of ambiguous triangles. Perception and Psychophysics, 38(3), 227–236.
Bushnell, E. W., & Boudreau, J. P. (1993). Motor development and the mind: The potential role of motor abilities as a determinant of aspects of perceptual development. Child Development, 64(4), 1005–1021.
Cerella, J. (1986). Pigeons and perceptrons. Pattern Recognition, 19, 431–438.
Chao, L. L., & Martin, A. (2000). Representation of manipulable man-made objects in the dorsal stream. NeuroImage, 12(4), 478–484.
Christou, C. G., & Bulthoff, H. H. (1999). View dependence in scene recognition after active learning. Memory and Cognition, 27(6), 996–1007.
Colunga, E. (2003, April). Where is the cow hiding? A new method for studying the development of features. Paper presented at the Biennial meeting of the Society for Research on Child Development, Tampa, FL.
Colunga, E., & Smith, L. B. (2005). From the lexicon to expectations about kinds: A role for associative learning. Psychological Review, 112(2), 342–382.
Cooper, E. E., Biederman, I., & Hummel, J. E. (1992). Metric invariance in object recognition: A review and further evidence. Canadian Journal of Psychology, 46(2), 191–214.
Craighero, L., Fadiga, L., Umilta, C. A., & Rizzolatti, G. (1996). Evidence for visuomotor

priming effect. Cognitive Neuroscience and Neuropsychology, 8(1), 347–349.
Edelman, G. (1987). Neural darwinism. The theory of neuronal group selection. New York: Basic Books.
Edelman, S. (1995). Representation, similarity, and the chorus of prototypes. Minds and Machines, 5(1), 45–68.
Edelman, S., & Duvdevani-Bar, S. (1997). Similarity, connectionism, and the problem of representation in vision. Neural Computation, 9(4), 701–720.
Edelman, S., & Intrator, N. (1997). Learning as extraction of low-dimensional representations. In R. L. Goldstone, D. L. Medin, & P. G. Schyns (Eds.), Perceptual learning. The psychology of learning and motivation (pp. 353–380). San Diego, CA: Academic Press.
Edelman, S., & Intrator, N. (2003). Towards structural systematicity in distributed, statically bound representations. Cognitive Science, 27, 73–109.
Efron, R. (1969). What is perception? Boston Studies of the Philosophical Society, 4, 137–173.
Fenson, L., Marchman, V. A., Thal, D. J., Dale, P. S., Reznick, S., & Bates, E. (1993). MacArthur-Bates communicative development inventories (2nd ed.). Baltimore, MD: Brookes Publishing.
Fize, D., Fabre-Thorpe, M. I., Richard, G., Doyon, B., & Thorpe, S. J. (2005). Rapid categorization of foveal and extrafoveal natural images: Associated ERPs and effects of lateralization. Brain and Cognition, 59(2), 145–158.
Freyd, J. J. (1983). Representing the dynamics of static form. Memory & Cognition, 11, 342–346.
Gathercole, V. C. M., & Min, H. (1997). Word meaning biases or language-specific effects? Evidence from English, Spanish and Korean. First Language, 17, 31–56.
Gerlach, C., Law, I., Gade, A., & Paulson, O. B. (2000). Categorization and category effects in normal object recognition: A PET study. Neuropsychologia, 38, 1693–1703.
Gershkoff-Stowe, L., & Smith, L. B. (2004). Shape and the first hundred nouns. Child Development, 75(4), 1098–1114.
Gibson, E. J. (1969). Principles of perceptual learning and development. New York: Appleton Century Crofts.
Gibson, J. J. (1979). The ecological approach to visual perception. Hillsdale, NJ: Erlbaum.
Goodale, M. A., & Milner, A. D. (1992). Separate visual pathways for perception and action. Trends in Neurosciences, 15, 20–25.
Goodale, M. A., & Humphrey, G. K. (1998). The objects of action and perception. Cognition, 67, 181–207.
Grezes, J., & Decety, J. (2002). Does visual perception of object afford action? Evidence from a neuroimaging study. Neuropsychologia, 40, 212–222.
Hayward, W. G. (2003). After the viewpoint debate: Where next in object recognition? Trends in Cognitive Sciences, 7(10), 425–427.
Harman, K. L., Humphrey, G. K., & Goodale, M. A. (1999). Active manual control of object views facilitates visual recognition. Current Biology, 9(22), 1315–1318.
Hummel, J. E. (2000). Where view-based theories break down: The role of structure in shape perception and object recognition. In E. Dietrich & A. Markman (Eds.), Cognitive dynamics: Conceptual change in humans and machines (pp. 157–185). Hillsdale, NJ: Erlbaum.
Hummel, J. E., & Biederman, I. (1992). Dynamic binding in a neural network for shape recognition. Psychological Review, 99(3), 480–517.
Humphrey, G. K., & Jolicoeur, P. (1993). An examination of the effects of axis foreshortening, monocular depth cues, and visual field on object identification. The Quarterly Journal of Experimental Psychology, 46(1), 137–159.
Imai, M., Gentner, D., & Uchida, N. (1994). Children’s theories of word meaning: The role of shape similarity in early acquisition. Cognitive Development, 9(1), 45–75.
James, K. H., Humphrey, G. K., & Goodale, M. A. (2001). Manipulating and recognizing virtual objects: Where the action is. Canadian Journal of Experimental Psychology, 55(2), 111–120.
James, K. H., Humphrey, G. K., Vilis, T., Baddour, R., Corrie, B., & Goodale, M. A. (2002). Learning three-dimensional object structure: A virtual reality study. Behavioral Research Methods, Instruments and Computers, 34(3), 383–390.
James, K. H., & Gauthier, I. (2006). Letter processing automatically recruits a sensory-motor brain network. Neuropsychologia, 44, 2937–2949.
Jeannerod, M. (1988). The neural and behavioral organization of goal-directed movements. New York: Oxford University Press.
Jeannerod, M. (1997). The cognitive neuroscience of action. Cambridge, MA: Blackwell.
Jolicoeur, P. (1985). The time to name disoriented natural objects. Memory and Cognition, 13(4), 289–303.

Jolicoeur, P. (1990). Identification of disoriented objects: A dual-systems theory. Mind and Language, 5(4), 387–410.
Jones, S. S. (2003). Late talkers show no shape bias in object naming. Developmental Science, 6(5), 477–483.
Jones, L. A., & Lederman, S. J. (2006). Human hand function. New York: Oxford University Press.
Jones, S. S., & Smith, L. B. (2005). Object name learning and object perception: A deficit in late talkers. Journal of Child Language, 32(1), 223–240.
Keil, F. C. (1994). The birth and nurturance of concepts by domains: The origins of concepts of living things. In L. A. Hirschfeld & S. A. Gelman (Eds.), Mapping the mind: Domain specificity in cognition and culture (pp. 234–254). New York: Cambridge University Press.
Kellman, P. J. (2001). Separating processes in object perception. Journal of Experimental Child Psychology. Special Issue: Reflections, 78(1), 84–97.
Landau, B., Smith, L., & Jones, S. (1988). The importance of shape in early lexical learning. Cognitive Development, 3, 299–321.
Large, M. E., McMullen, P. A., & Hamm, J. P. (2003). The role of axes of elongation and symmetry in rotated object naming. Perception and Psychophysics, 65(1), 1–19.
Lawson, R., & Humphreys, G. W. (1998). View-specific effects of depth rotation and foreshortening on the initial recognition and priming of familiar objects. Perception and Psychophysics, 60(6), 1052–1066.
Longcamp, M., Boucard, C., Gilhodes, J. C., & Velay, J. L. (2005). Remembering the orientation of newly learned characters depends on the associated writing knowledge: A comparison between handwriting and typing. Human Movement Science, 25(4–5), 646–656.
Liu, T., & Cooper, L. A. (2003). Explicit and implicit memory for rotating objects. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 554–562.
Marr, D. (1982). Vision: A computational investigation into the human representation and processing of visual information. New York, NY: Henry Holt and Co., Inc.
Marr, D., & Nishihara, H. K. (1978). Representation and recognition of the spatial organization of three-dimensional shapes. Proceedings of the Royal Society of London, Series B, 200(1140), 269–294.
McCune-Nicolich, L. (1981). Toward symbolic functioning: Structure of early pretend games and potential parallels with language. Child Development, 52(3), 785–797.
McCune, L. (1995). A normative study of representational play in the transition to language. Developmental Psychology, 31(2), 198–206.
Mecklinger, A., Gruenewald, C., Besson, M., Magnie, M. N., & Cramon, D. Y. (2002). Separable neuronal circuitries for manipulable and non-manipulable objects in working memory. Cerebral Cortex, 12(11), 1115–1123.
Milner, D., & Goodale, M. (1995). The visual brain in action. New York: Oxford University Press.
Milner, A. D., Perrett, D. I., & Johnston, R. S. (1991). Perception and action in ‘visual form agnosia’. Brain, 114, 405–428.
Mondloch, C. J., Le Grand, R., & Maurer, D. (2002). Configural face processing develops more slowly than featural face processing. Perception, 31(5), 553–566.
Morikawa, K. (1999). Symmetry and elongation of objects influence perceived direction of translational motion. Perception and Psychophysics, 61, 134–143.
Namy, L. (2002). Symbol use and symbolic representation: Developmental and comparative perspectives. New York: Routledge.
Needham, A., Barrett, T., & Peterman, K. (2002). A pick-me-up for infants’ exploratory skills: Early simulated experiences reaching for objects using “sticky mittens” enhances young infants’ object exploration skills. Infant Behavior and Development, 25(3), 279–295.
Nelson, C. A. (2001). The development and neural bases of face recognition. Infant and Child Development. Special Issue: Face Processing in Infancy and Early Childhood, 10(1–2), 3–18.
Newell, F. N., & Findlay, J. M. (1997). The effect of depth rotation on object identification. Perception, 26(10), 1231–1257.
Paillard, J. (1991). Brain and space. New York: Oxford University Press.
Palmer, S. E., Rosch, E., & Chase, P. (1981). Canonical perspective and the perception of objects. In J. Long & A. Baddeley (Eds.), Attention and performance (pp. 135–151). Hillsdale, NJ: Erlbaum.
Pegna, A. J., Khateb, A., Michel, C. M., & Landis, T. (2004). Visual recognition of faces, objects, and words using degraded stimuli: Where and when it occurs. Human Brain Mapping, 22(4), 300–311.

Pereira, A., & Smith, L. B. (2009). Developmental changes in visual object recognition between 18 and 24 months of age. Developmental Science, 12(4), 67–80.
Pereira, A., James, K. H., Smith, L. B., & Jones, S. S. (2007). Preferred views in children’s active exploration of objects. Society for Research in Child Development, Annual Meeting, Boston, MA.
Perner, J. (1991). Understanding the representational mind. Cambridge, MA: MIT Press.
Perrett, D. I., & Harries, M. H. (1988). Characteristic views and the visual inspection of simple faceted and smooth objects: Tetrahedra and potatoes. Perception, 17(6), 703–720.
Perrett, D. I., Harries, M. H., & Looker, S. (1992). Use of preferential inspection to define the viewing sphere and characteristic views of an arbitrary machined tool part. Perception, 21, 497–515.
Peissig, J. J., & Tarr, M. J. (2007). Visual object recognition: Do we know more now than we did 20 years ago? Annual Review of Psychology, 58(1), 75–96.
Peterson, M. (Ed.) (1999). The MIT encyclopedia of the cognitive sciences. Cambridge, MA: The MIT Press.
Piaget, J. (1962). Play, dreams, and imitation in childhood. New York: Norton.
Quinlan, P. T., & Humphreys, G. W. (1993). Perceptual frames of reference and two-dimensional shape recognition: Further examination of internal axes. Perception, 22(11), 1343–1364.
Rakison, D. H., & Butterworth, G. E. (1998a). Infants’ attention to object structure in early categorization. Developmental Psychology, 34(6), 1310–1325.
Rakison, D. H., & Butterworth, G. E. (1998b). Infant’s use of object parts in early categorization. Developmental Psychology, 34(1), 49–62.
Rakison, D. H., & Cicchino, J. B. (2008). Induction in infancy. In S. Johnson (Ed.), A neo-constructivist approach to early development. New York: Oxford University Press.
Rakison, D. H., & Cohen, L. B. (1999). Infants’ use of functional parts in basic-like categorization. Developmental Science, 2, 423–431.
Rakison, D. H., & Lupyan, G. (2008). Developing object concepts in infancy: An associative learning perspective. Monographs of the SRCD, 73(1).
Rakoczy, H., Tomasello, M., & Striano, T. (2006). The role of experience and discourse in children’s developing understanding of pretend play actions. British Journal of Developmental Psychology, 24(2), 305–335.
Rock, I. (1973). Orientation and form. New York, NY: Academic Press.
Rosch, E., Mervis, C. B., Gray, W. D., Johnson, D. M., & Boyes-Braem, P. (1976). Basic objects in natural categories. Cognitive Psychology, 8, 382–439.
Ruff, H. A., & Rothbart, M. K. (1996). Attention in early development. New York: Oxford University Press.
Samuelson, L. K. (2002). Statistical regularities in vocabulary guide language acquisition in connectionist models and 15–20-month-olds. Developmental Psychology, 38(6), 1016–1037.
Samuelson, L. K., & Smith, L. B. (1999). Early noun vocabularies: Do ontology, category structure and syntax correspond? Cognition, 73, 1–33.
Sekuler, A. B. (1996). Axis of elongation can determine reference frames for object perception. Canadian Journal of Experimental Psychology, 50(3), 270–279.
Sekuler, A. B., & Swimmer, M. B. (2000). Interactions between symmetry and elongation in determining reference frames for object perception. Canadian Journal of Experimental Psychology, 54(1), 42–56.
Sekuler, E. B. (1996). Perceptual cues in pure alexia. Cognitive Neuropsychology, 13(7), 941–974.
Shore, C., O’Connell, B., & Bates, E. (1984). First sentences in language and symbolic play. Developmental Psychology, 20(5), 872–880.
Smith, L. B. (2003). Learning to recognize objects. Psychological Science, 14(3), 244–251.
Smith, L. B. (2005). Action alters perceived shape. Cognitive Science, 29, 665–679.
Smith, L. B., Jones, S. S., Landau, B., Gershkoff-Stowe, L., & Samuelson, S. (2002). Early noun learning provides on-the-job training for attention. Psychological Science, 13, 13–19.
Smith, L. B., & Pereira, A. F. (2008). Symbolic play links to language through object recognition. Unpublished manuscript, Department of Psychological and Brain Sciences, Indiana University Bloomington, Bloomington, IN.
Soja, N. N. (1992). Inferences about the meanings of nouns: The relationship between perception and syntax. Cognitive Development, 7, 29–45.
Son, J. Y., Smith, L. B., & Goldstone, R. L. (2008). Simplicity and generalization: Short-cutting abstraction in children’s object categorizations. Cognition, 108, 626–638.

Stankiewicz, B. J. (2003). Just another view. Trends in Cognitive Sciences, 7, 526.
Street, S., Smith, L. B., James, K. H., & Jones, S. S. (2008). Posting ability and object recognition in 18–24 month old children. Unpublished manuscript, Department of Psychological and Brain Sciences, Indiana University Bloomington, Bloomington, IN.
Tarr, M., & Vuong, Q. C. (2002). Visual object recognition. In S. Yantis (Ed.), Stevens’ Handbook of Experimental Psychology: Vol. 1. Sensation and Perception. New York, NY: John Wiley & Sons, Inc.
Tong, F. H., Marlin, S. G., & Frost, B. J. (1995). Cognitive map formation in a three-dimensional visual virtual world. Poster presented at the IRIS/PRECARN Workshop, Vancouver, BC.
Turvey, M. T., Park, H., Dumais, S. M., & Carello, C. (1998). Nonvisible perception of segments of a hand-held object and the attitude spinor. Journal of Motor Behavior, 30(1), 3–19.
Ullman, S. (1996). High level vision. Cambridge, MA: MIT Press.
Ullman, S. (2000). High-level vision: Object recognition and visual cognition. Cambridge, MA: MIT Press.
Ullman, S., & Bart, E. (2004). Recognition invariance obtained by extended and invariant features. Neural Networks, 17(5/6), 833–848.
Ullman, S., Vidal-Naquet, M., & Sali, E. (2002). Visual features of intermediate complexity and their use in classification. Nature Neuroscience, 5(7), 682–687.
Viviani, P., & Stucchi, N. (1992). Biological movements look uniform: Evidence for motor-perceptual interactions. Journal of Experimental Psychology, 18, 603–623.
Warrington, E. K. (1985). Agnosia: The impairment of object recognition. In P. J. Vinken, G. W. Bruyn, & H. L. Klawans (Eds.), Handbook of clinical neurology. Amsterdam: Elsevier.
Weismer, S. E. (2007). Typical talkers, late talkers, and children with specific language impairment: A language endowment spectrum? Mahwah, NJ: Lawrence Erlbaum Associates.
Wexler, M., & van Boxtel, J. (2005). Depth perception by the active observer. Trends in Cognitive Sciences, 9, 431–438.
Wohlschlager, A., & Wohlschlager, A. (1998). Mental and manual rotation. Journal of Experimental Psychology: Human Perception and Performance, 24(2), 397–412.
Wyly, M. V. (1997). Infant assessment. Boulder, CO: Westview Press.
Yoshida, H., & Smith, L. B. (2003). Shifting ontological boundaries: How Japanese- and English-speaking children generalize names for animals and artifacts. Developmental Science, 6, 1–34.
Yoshida, H., & Smith, L. B. (2008). What’s in view for toddlers? Using a head camera to study visual experience. Infancy, 13(3), 229–248.
Musical Enculturation: How Young Listeners
Construct Musical Knowledge through
Perceptual Experience

Erin E. Hannon

Human musical capacities have recently become the focus of an exploding quantity of theoretical and empirical contributions in the psychological and brain sciences (Avanzini, Lopez, Koelsch, & Majno, 2006; Compston, 2006; Peretz, 2006; Spiro, 2003; Wallin, Merker, & Brown, 2000; Zatorre & Peretz, 2003). This marks a significant change in the status of musical behavior and cognition as the topic of empirical investigation. Until recently, it was widely assumed that musical skills and knowledge were possessed only by an elite, highly trained minority, a view bolstered by characterizations of music as frivolous, devoid of adaptive significance, and subject to insurmountable individual and cross-cultural variability (Pinker, 1997, 2002). Current approaches refute this view by emphasizing biological substrates of music (e.g., Zatorre & Peretz, 2003), potential survival advantages of music (Huron, 2003; Miller, 2000), and the ubiquity of many basic musical skills. For example, most adults have sufficient musical knowledge to sing a familiar tune (Dalla Bella, Giguere, & Peretz, 2007), detect “wrong notes” in a musical sequence (Drayna et al., 2001; Hyde & Peretz, 2004; Trainor & Trehub, 1992), recognize and experience emotions communicated by music (Grewe, Kopiez, & Altenmüller, 2005; Juslin & Laukka, 2003), and move or dance in synchrony with music (Large, Fink, & Kelso, 2002; McAuley, Jones, Holub, Johnstone, & Miller, 2006; Snyder & Krumhansl, 2001). Thus, just as most children eventually learn to understand their native language, every member of a particular culture, with or without formal musical training, can be expected to acquire basic musical competence and a working knowledge of familiar musical structures (Bigand & Poulin-Charronnat, 2006; Hannon & Trainor, 2007).

The question of exactly how this musical knowledge is structured and acquired during development is of great interest but as yet unanswered. One proposal is that such knowledge arises from a “music faculty” that is modular and built from innate, “core” capacities (Hauser & McDermott, 2003; Peretz & Coltheart, 2003; Peretz & Morais, 1989). Proponents of the modular approach have cited evidence of music-specific neural circuitry in adults (Janata et al., 2002; Koelsch & Siebel, 2005), cases of double dissociations between music and other cognitive or linguistic abilities in patients with brain damage (Peretz et al., 1994; Vignolo, 2003), and parallels between the musical abilities of adults and young infants (Trehub, 2003). According to the modular view, musical knowledge and behavior arise from an innate set of music-specific evolutionary adaptations.

An alternative to the modular approach, embraced here, is that highly specialized knowledge of music in adulthood arises through simple perceptual learning mechanisms that build increasingly specific representations from domain-general capacities (Trehub & Hannon, 2006). According to this view, the appearance


of domain-specificity and even encapsulation in adult music processing does not necessarily signify the existence of modularity in the initial state; rather, such specialization could be the result of developmental processes (Karmiloff-Smith, 1992; McMullen & Saffran, 2004). Thus, observed parallels between adult and infant performance in music cognition tasks may arise from general properties of the nervous system that are not necessarily specific to music or even to humans. Such initial abilities constrain musical enculturation, the process through which everyday exposure to the statistics of music drives the acquisition of culture-specific musical knowledge.

This chapter explores the question of how infants and children build musical representations, with particular focus on perception and knowledge of temporal structure in music, such as rhythm and meter. Rhythm and meter are fundamental to most socially significant and universal musical behaviors, such as synchronous dancing or ensemble performance (Brown, 2003). In this chapter, I will review published and new evidence that infants can perceive rhythm and meter by attending to the same statistical properties that underlie adults’ perception, that representations of rhythm and meter undergo reorganization as a result of culture-specific perceptual experience, and that infants and adults share some basic temporal processing constraints despite infants’ initial flexibility. In addition to examining development of music-specific knowledge, a parallel goal is to understand the emergence of domain-specific representations in auditory cognition. If we assume that early representations of music are primarily domain-general and become culture-specific through perceptual experience, then a question of great interest is whether overlapping structures are present and detected in the musical and linguistic input available to infants and children. I will briefly review some new evidence suggesting that this may be the case.

Numerous studies suggest that infants and adults use domain-general statistical learning mechanisms to infer structure across a number of domains (Conway & Christiansen, 2006; Kirkham, Slemmer, & Johnson, 2002; Saffran, Aslin, & Newport, 1996; see Kirkham, Saffran, and Gomez chapters, this volume). Statistical learning depends on the perceiver’s simple capacity to track the frequency with which certain units or combinations of units occur. Many examples of statistical learning have been documented in the language domain. For instance, infants infer phonemic distinctions between syllables only when the prototypical speech sounds occur more frequently than nonprototypical sounds (Maye, Werker, & Gerken, 2002). Transitional probabilities between units (i.e., the likelihood that one unit will be preceded or followed by another) enable infants to segment unfamiliar sequences of syllables or tones into groups or “words” after brief exposure (Saffran, Johnson, Aslin, & Newport, 1999). Multiple and correlated statistical cues provide a powerful means by which infants can rapidly build increasingly complex representations from simple learning mechanisms (Christiansen, Allen, & Seidenberg, 1998; Christiansen, Dale, & Reali, this volume; Thiessen & Saffran, 2007).

Relatively little is known about the role of statistical learning in the development of musical representations. Transitional probabilities and frequency of occurrence likely provide information about the hierarchical pitch organization of Western music or tonality, since tonally prominent pitches tend to occur more frequently than do other pitches, and the sequential structure of pitch sequences is often highly constrained by harmonic cadences (Krumhansl, 2004). When presented with an unfamiliar pitch sequence, adults can use frequency of occurrence to infer tonal prominence (Creel & Newport, 2002), and a self-organizing neural network exposed to the statistics of Western music can simulate tonal expectations (Tillmann, Bharucha, & Bigand, 2000). Despite the presence of such statistical information, however, it does not appear that tonality is learned during infancy. On the contrary, most evidence suggests that adult-like knowledge of tonality does not emerge until after at least 5 years of age (Cuddy & Badertscher, 1987; Koelsch, Fritz, Schulze, Alsop, & Schlaug,

2005; Krumhansl & Keil, 1982; Schellenberg, 2005; Trainor & Trehub, 1992). By contrast, statistical information about temporal structure in music is available and used by infants and adults alike.

Musical meter is the hierarchical temporal structure of music. The ability to dance or move in synchrony with music depends on a listener’s ability to infer the underlying meter in the auditory input, which in turn guides temporal expectations and gives rise to the subjective experience of a primary pulse and alternating patterns of strong and weak beats. Although the pulse of a simple metronomic or isochronous sequence is obvious from the acoustic input (i.e., every tone onset corresponds to a pulse), real music presents a greater challenge to the listener because any number of amplitude peaks or event onsets could mark multiple and sometimes conflicting primary pulse rates. It is nevertheless trivial for most adults to perceive and move in synchrony with music. Evidence suggests that adult listeners infer the meter by attending to periodically occurring statistics. For example, in Western music (both classical and children’s), event onsets tend to occur more frequently at strong than at weak metrical positions (Palmer & Krumhansl, 1990; Palmer & Pfordresher, 2003). Events that are accented or made salient through changes in amplitude, length, pitch, or grouping also tend to occur more frequently at strong metrical positions in music (Huron & Royal, 1996). Frequencies of event and accent occurrence predict when adults will tap to music (Snyder & Krumhansl, 2001) and their perception of meter in unfamiliar melodic patterns (Hannon, Snyder, Eerola, & Krumhansl, 2001). Figure 7.1 illustrates how a given set of events and accents might support duple or triple meters by virtue of their frequency distribution at metrically weak or strong locations over time.

Infants use frequency of occurrence to infer the meter in simple rhythmic patterns. In Hannon and Johnson (2005), 7-month-old infants were habituated to three unique rhythms containing events and accents that were more likely to occur every three units (i.e., “triple” meter) or every two and four units (i.e., “duple” meter) (see Figure 7.1). After habituation, they were presented with two novel rhythms that were otherwise matched but differed only in the extent to which the frequency distribution of events and accents supported the meter induced during habituation. Infants dishabituated to the stimulus with a novel meter, which demonstrates that they not only detected changes in the temporal statistics of the rhythms but that they categorized rhythms on the basis of the underlying meter. Interestingly, a third experiment suggested that infants could also learn to associate particular pitches with strong and weak metrical positions. Because tonally prominent pitches tend to occur at strong metrical positions in music (Järvinen & Toiviainen, 2000; Meyer, 1973; Palmer & Pfordresher, 2003), this finding suggests that at least in principle infants’ ability to infer meter could provide a foundation for learning about tonality through similar statistical learning processes.

Not only do infants infer meter from statistics in auditory input, but recent findings suggest they also integrate metrically relevant information across sensory modalities. When presented with an ambiguous rhythm in which events and accents support either triple or duple meter, 7-month-old infants use movement cues to infer the meter (Phillips-Silver & Trainor, 2005). In this set of experiments, infants were familiarized with an ambiguous rhythm while being bounced or while watching an experimenter bounce on every two or three beats. After familiarization, infants preferred listening to a version of the rhythm containing disambiguating cues that matched the meter to which they were bounced, but showed no preference when they had only watched someone bouncing. Thus, infants use information from their own movement patterns to structure their metrical interpretations. This result underscores the importance of movement for meter perception, and it converges with numerous adult studies documenting robust associations between movement and time perception using behavioral measures (Ivry & Hazeltine, 1995; Meegan, Aslin, & Jacobs, 2000; Phillips-Silver & Trainor, 2007; Todd, Cousins, & Lee, 2007; Trainor, 2007) and brain responses (Platel
[Figure 7.1 graphic. Top panel, triple meter: metrical stress S W W S W W S W W S W W S, with a bar graph of the frequency of occurrence of events and accents every 2, every 3, and every 4 units. Bottom panel, duple meter: metrical stress S W S W S W S W S W S W S, with the corresponding bar graph.]

Figure 7.1 Patterns that support duple and triple meters. Both rhythmic patterns have potentially
13 temporal units during which events could occur, but the frequency distribution of event onsets
(depicted by squares) and accents (depicted by larger squares) differs dramatically depending on
whether the rhythm supports a triple meter (i.e., sww or strong weak weak) or a duple meter (i.e., swsw
or strong weak strong weak). Specifically, events and accents occur most frequently every three units
in the triple-meter rhythm but every two or four units in the duple-meter pattern. The bar graphs
represent the average frequency of events and accents for duple- and triple-meter rhythms used in
Hannon and Johnson (2005).
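
The frequency-of-occurrence idea behind Figure 7.1 can be made concrete with a small sketch. It is an illustration only, not the analysis used in Hannon and Johnson (2005): the two 13-unit rhythms are invented for the example, the function names are hypothetical, and accents are ignored so that only event onsets are counted.

```python
from typing import Dict, List, Sequence


def periodic_frequency(onsets: List[int], period: int) -> float:
    """Proportion of positions 0, period, 2*period, ... that contain an event."""
    positions = range(0, len(onsets), period)
    return sum(onsets[i] for i in positions) / len(positions)


def meter_support(onsets: List[int], periods: Sequence[int] = (2, 3, 4)) -> Dict[int, float]:
    """Frequency of event occurrence at each candidate periodicity."""
    return {p: periodic_frequency(onsets, p) for p in periods}


def infer_meter(onsets: List[int]) -> str:
    """Label a rhythm 'triple' if every-3 positions are the most event-dense,
    otherwise 'duple' (every-2 or every-4 positions dominate)."""
    support = meter_support(onsets)
    return "triple" if support[3] > max(support[2], support[4]) else "duple"


# Invented 13-unit rhythms (1 = event onset, 0 = no event); not the original stimuli.
triple_like = [1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1]
duple_like = [1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1]

print(meter_support(triple_like), infer_meter(triple_like))
print(meter_support(duple_like), infer_meter(duple_like))
```

In the first rhythm every third unit carries an event, so the every-3 frequency is highest; in the second, events cluster on every second and fourth unit. A fuller version would tally accents separately, as the figure does.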


et al., 2007; Schubotz, Friederici, & Yves von

Cramon, 2000).
Do infants ignore visual information related
to meter? Although infants in Phillips-Silver and Throughout the first several months after birth,
Trainor (2005) were not able to use visual infor- exposure to the statistics of the environment
mation to infer meter in an ambiguous rhythm, begins to alter infants’ basic perceptual pro-
infants may nevertheless perceive congruence cesses in a number of seemingly disparate
of metrical information across auditory and domains. One well-documented example of
visual modalities and use this information to this comes from speech perception, where
further their music learning. In a recent experi- infants develop language-specific biases during
ment (Hannon, in preparation-a), 10-month-old the second half of the first year after birth. At
infants were habituated to a movie of a woman only a few months of age, infants discriminate
dancing to a fast or a slow song. After habitua- speech sounds from virtually all spoken lan-
tion, infants saw two movies containing a novel guages, even languages they have never heard
segment of the same song they had already (Eimas, 1974, 1975; Eimas, Siqueland, Jusczyk,
heard, but the visual stimulus (i.e. the dancing) & Vigorito, 1971; Trehub, 1976). These early
that accompanied that segment differed for each abilities change dramatically by the end of the
movie. For the synchronous movie, the dancing first year, when infants only discriminate speech
matched the song and for the asynchronous sounds that demarcate meaning in their native
movie the dancing matched a separate song language, presumably because of their expo-
with a contrasting tempo. Infants dishabituated sure to linguistic input containing dispropor-
only to the asynchronous video, presumably tionately frequent exemplars of native-language
because they noticed that the dancing did not categories (Kuhl, Williams, Lacerda, Stevens,
match the music. In a control condition, infants & Lindblom, 1992; Kuhl et al., 2006; Maye
showed no preference after habituation to the et al., 2002; Werker & Lalonde, 1988; Werker &
same visual stimuli presented without sound, Tees, 1984). Other recent findings suggest that
suggesting that discrimination in the experi- a similar progression occurs for face identifi-
mental condition was based on true intersen- cation, where accurate discrimination of indi-
sory perception. Although previous studies vidual monkey and human faces is robust early
have shown that infants perceive audiovisual in infancy (i.e., before 6 months) but declines
synchrony when both auditory and visual stim- between 6 and 9 months, when infants continue
uli contain discrete events, such as the sight and to accurately discriminate only individuals of
sound of a ball bouncing (Lewkowicz, 1996), their own species (Pascalis, de Haan, & Nelson,
the above task requires inference of metrical 2002) or race (Kelly et al., 2007). Even intersen-
structure from rich and complex information sory perception may undergo comparable devel-
in both modalities. Thus, infants may be sensi- opmental changes, such as discriminating the
tive to visual temporal information that is cor- visual head and lip movements of one’s native
related with musical meter. Future studies will language from those of a foreign language, an
investigate the relative contributions of both ability that declines between 4 and 8 months of
vestibular/movement and visual cues in infants’ age (Weikum et al., 2007).
music learning. The above examples demonstrate percep-
To summarize, the findings described above tual tuning for socially significant, frequently
strongly support the claim that infants can infer encountered stimuli in multiple domains. Given
temporal structure in music from the same basic the prominence of music in early caregiving
statistics that are known to influence adults’ contexts (Trehub & Trainor, 1998), it is not sur-
perception of such structures. Moreover, they prising that enculturation to musical structures
do this not only in the auditory modality, but is also characterized by a similar developmental
also make use of multiple and redundant cues trajectory, where young infants discriminate
available through movement and vision. musical structures that elude their parents but

begin to exhibit culture-specific declines by the end of the first year.

Developmental Changes in Perception of Meter

It is widely assumed that when listeners infer the meter in music, they not only experience a
primary pulse that is isochronous (i.e. composed of equal duration intervals), but they also
perceive additional isochronous levels that are subdivisions or multiples of the primary pulse, all
of which are integral to the metrical hierarchy (Lerdahl & Jackendoff, 1983; Palmer & Krumhansl,
1990). Figure 7.2 (top) illustrates a typical Western, isochronous metrical hierarchy, with three
levels of isochronous structure giving rise to weak, strong, and stronger metrical positions.
Because perception of meter is dependent on the regular occurrence of events at strong metrical
positions, isochronous metrical hierarchies tend to constrain the pattern of interonset intervals
in music by requiring primarily simple-integer duration ratios. For example, in order for the
rhythm depicted in Figure 7.2 (top) to support an isochronous metrical hierarchy, its event onsets
must occur primarily at strong metrical positions, which naturally leads to 1:1 and 2:1 duration
ratios.

The challenge of inferring a primary pulse in music seems particularly daunting when one considers
the fact that live music is rarely isochronous but instead tends to contain interonset intervals
that vary continuously as a function of the performer’s expressive intentions (Repp, 1992). For
example, when a performer begins to slow down at the end of a phrase, the interonset intervals will
become incrementally longer, but this does not necessarily lead the listener to reinterpret the
meter. Rather, the listener ignores the subtle differences in interval size and consequent interval
ratios, and categorizes intervals according to the metrical hierarchy that he or she has inferred
(Desain & Honing, 2003).

Abundant evidence of metrical categorization can be found in studies of perception and


[Figure 7.2 graphic. Top panel: isochronous metrical hierarchy, metrical stress S w s w S w s w S w s w S,
interval ratios 2 : 1 : 1 : 2. Bottom panel: nonisochronous metrical hierarchy, metrical stress
S w w s w s w S w w w s w, interval ratios 3 : 2 : 2 : 3 : 2.]

Figure 7.2 Metrical hierarchies for isochronous Western (top) and nonisochronous Balkan (bottom)
meters. In an isochronous metrical hierarchy, typical of Western music, multiple levels of
isochronous structure can be felt simultaneously, giving rise to simple duration ratios in the set
of interonset intervals comprising the rhythm. In a nonisochronous metrical hierarchy, typical in
Balkan music, multiple levels can also be felt but the intermediate level, which tends to be the
primary pulse for dancing and movement, is made up of alternating long and short intervals having a
3:2 ratio. These ratios are also evident in the rhythm that supports that meter.
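
The notions of interonset interval, duration ratio, and metrical categorization can be made concrete with a small Python sketch. It derives interonset intervals from a list of onset times, forms the ratios between successive intervals, and assigns each ratio to the nearest member of a set of category ratios on a logarithmic scale. The nearest-category rule, the onset times, and the two category sets are illustrative assumptions, not a model proposed in the studies cited in this chapter.

```python
import math


def interonset_intervals(onsets):
    """Durations between successive event onsets."""
    return [b - a for a, b in zip(onsets, onsets[1:])]


def categorize(ratio, categories):
    """Return the category ratio closest to `ratio` on a log scale (an assumed rule)."""
    return min(categories, key=lambda c: abs(math.log(ratio / c)))


# Hypothetical onset times in ms: a long-short-short-long pattern with expressive timing.
onsets = [0, 410, 600, 805, 1230]
iois = interonset_intervals(onsets)
ratios = [a / b for a, b in zip(iois, iois[1:])]

# Assumed category sets: simple ratios only, versus simple ratios plus 3:2 and 2:3.
simple_only = [1 / 2, 1.0, 2.0]
with_three_two = simple_only + [2 / 3, 3 / 2]

print(iois)
print([categorize(r, simple_only) for r in ratios])
print(categorize(1.5, simple_only), categorize(1.5, with_three_two))
```

Under these assumptions the slightly uneven performed intervals fall into the 2:1, 1:1, and 1:2 categories, and a true 3:2 ratio is pulled toward 2:1 unless 3:2 is itself available as a category.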

production of rhythms. In spontaneous rhythm production tasks, French adults tend to produce
exclusively 1:1 and 2:1 ratios (Fraisse, 1978) even when asked to tap irregularly (Fraisse, 1982).
When attempting to reproduce or synchronize with target rhythms containing complex interonset
interval ratios, most adults ignore instructions and instead tend to produce 2:1 and 1:1 ratios
(Cummins & Port, 1998; Essens, 1986; Essens & Povel, 1985; Povel, 1981; Repp, London, & Keller,
2005; Snyder, Hannon, Large, & Christiansen, 2006). Likewise, the brain activity that accompanies
rhythmic reproduction tasks is qualitatively different for simple- versus complex-ratio sequences,
with more automatic processing associated with simple ratios (Sakai et al., 1999). In perception
tasks, listeners will label or transcribe (i.e. put into musical notation) a rhythm according to a
simple-ratio category despite a continuous range of interval ratios in the physical stimulus
(Clarke, 1987; Desain & Honing, 2003; Large, 2000). These findings probably reflect listeners’
tendency to assume that rhythms fit into metrical hierarchies and thus assimilate complex-integer
ratios toward simple-integer ratios that support familiar Western meters (the section “Constraints
on Music Learning” will discuss additional explanations of simple-ratio biases).

If biases toward simple-integer ratios arise from a tendency to assimilate patterns to isochronous
metrical hierarchies, then a listener’s musical experience and knowledge would be expected to exert
at least some influence on performance. In particular, listeners who are accustomed to
nonisochronous meters should not have difficulty reproducing or identifying ratios other than 2:1
or 1:1 if such ratios exist in the music of their culture. Both isochronous and nonisochronous
meters are very common in traditional music from throughout the world, such as Africa, the Middle
East, Eastern Europe, and South Asia (Clayton, 2000; London, 1995, 2004). Figure 7.2 (bottom)
provides an example of a typical Balkan rhythm and its metrical hierarchy. Note that in addition to
having frequent 1:1 ratios, the rhythm also contains 3:2 ratios that are usually challenging for
Western listeners to perceive and produce (Essens, 1986).

This is reflected in the performance of North American adults (“Western Adults”), who notice
superthreshold temporal disruptions of a folk tune only when that tune has an isochronous meter
(with 2:1 ratios) but not when it has a nonisochronous meter (with 3:2 ratios, see Figure 7.3)
(Hannon & Trehub, 2005a). By contrast, adults from Macedonia and Bulgaria (“Balkan Adults”) perform
equally well in both isochronous and nonisochronous conditions, presumably because both are equally
familiar to these subjects.

Western adults may fail to detect disruptions to nonisochronous meters because their encoding of
the original stimulus is compromised by a strong tendency to assimilate all patterns toward a
familiar metrical template. The tendency to assimilate can thus be interpreted as evidence of
acquired culture-specific knowledge of meter, which may begin to emerge as early as infancy. After
familiarization with the same folk tunes described above, 6-month-old infants exhibit a novelty
preference for disrupted versions, regardless of whether the familiarization stimulus was
isochronous or nonisochronous (Figure 7.3) (Hannon & Trehub, 2005a). By 12 months, however, this
pattern changes, and Western infants fail to discriminate rhythmic variations in the nonisochronous
condition, even though they continue showing a novelty preference in the isochronous condition
(Hannon & Trehub, 2005b). Thus, enculturation to musical rhythms, more specifically acquisition of
culture-specific metrical categories, rapidly changes infants’ behavior and closely parallels
trends observed in other domains, where initial discrimination abilities are maintained for
familiar structures but decline for unfamiliar structures by the end of one year.

The Role of Everyday Music Listening in Perceptual Reorganization

The observed developmental changes in Western infants’ musical rhythm perception are presumably
driven by exposure to Western music, where simple ratios are much more frequent than complex
ratios. Thus, infants may build their culture-specific musical representations by simply listening
to music, in the same way

[Figure 7.3 graphic. Left y-axis: adults’ accuracy, scaled from –3 to 3. Right y-axis: novelty preference,
scaled from 0.2 to 0.8. Groups on the x-axis: Western adults, Balkan adults, 6-month-olds, 12-month-olds.]

Figure 7.3 Perception of isochronous and nonisochronous meters by infants and adults. Adults’
accuracy for the perceptual judgment task (i.e., tendency to state that a disrupted version of a
folk tune is dissimilar relative to a standard) is on the left y-axis, while infants’ novelty
preference (amount of orienting to the disrupted variation divided by total looking time) is on the
right y-axis. Dashed line indicates chance performance. Balkan adults and 6-month-old infants
accurately differentiate rhythmic variations in isochronous and nonisochronous contexts, while
Western adults and 12-month-olds only perform accurately in the isochronous metrical context. Data
replotted from Hannon and Trehub (2005a, 2005b).
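
The novelty preference defined in the caption is a simple proportion, and a minimal Python sketch may make the measure and its chance level concrete. The looking times below are invented for illustration and are not data from Hannon and Trehub; the function and variable names are likewise hypothetical.

```python
def novelty_preference(look_novel, look_familiar):
    """Looking time to the disrupted (novel) version divided by total looking time."""
    total = look_novel + look_familiar
    if total <= 0:
        raise ValueError("total looking time must be positive")
    return look_novel / total


CHANCE = 0.5  # equal looking to the novel and familiar versions

# Hypothetical (novel, familiar) looking times in seconds for three infants.
looking_times = [(12.0, 8.0), (9.5, 9.5), (6.0, 10.0)]
scores = [novelty_preference(n, f) for n, f in looking_times]
mean_score = sum(scores) / len(scores)

print([round(s, 3) for s in scores])
print("mean above chance:", mean_score > CHANCE)
```

A score above 0.5 indicates longer looking to the disrupted version, a score of 0.5 indicates no preference, and group-level claims would of course rest on an appropriate statistical test rather than on a comparison of the mean alone.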

that listening to native language and watching familiar faces leads to declines in perception of
unfamiliar speech and faces (Kelly et al., 2007; Pascalis et al., 2002; Werker & Tees, 1984).
Simple training paradigms have been used to demonstrate the effects of perceptual experience, by
testing discrimination in older infants after exposure to unfamiliar structures. For example, after
American 9-month-old infants are exposed to Mandarin Chinese through a series of interactions with
native speakers over a 4- to 6-week period, they successfully discriminate Mandarin speech
contrasts that their age-matched American counterparts do not (Kuhl, Tsao, & Liu, 2003). Similarly,
when 9-month-old infants are sent home with picture books containing monkey faces, they
subsequently discriminate individual monkey faces, unlike 9-month-olds without such exposure
(Pascalis et al., 2005).

A brief period of at-home exposure to foreign music also reverses the decline in the performance of
older infants in the nonisochronous condition (Hannon & Trehub, 2005b). In this study, parents of
12-month-old infants were sent CDs containing 10 minutes of Balkan folk dance music having
nonisochronous meters, which they were asked to play for their infants twice per day for 2 weeks
prior to coming into the laboratory. During testing, infants were then presented with the same
stimuli as described above. Importantly, the specific recordings heard at home were completely
different from the stimuli presented in the laboratory aside from sharing nonisochronous meters.
Nevertheless, after exposure to the Balkan music, 12-month-old infants successfully discriminate
nonisochronous rhythms on the basis of metrical disruptions. As can be seen from Figure 7.4, novelty
preferences in the nonisochronous condition after exposure are indistinguishable from those
obtained from Western 12-month-olds in the isochronous condition.

It is tempting to conclude from these results that by 12 months of age, infants have achieved
adult-like knowledge of meter. However,
[Figure 7.4 graphic. Top panel: Infants (12 months), novelty preference, No Exposure versus After Exposure.
Middle panel: Children (4–8 years), Session 1 versus Session 2. Bottom panel: Adults (18+ years), Session 1
versus Session 2. Legend: Isochronous, Nonisochronous.]

Figure 7.4 Effects of perceptual experience at different ages. For 12-month-old infants (top),
at-home exposure to nonisochronous meters results in significant improvement in discrimination
performance, where a novelty preference is only obtained in the nonisochronous condition after
exposure. For children (middle), comparable results are obtained using a perceptual judgment
paradigm. Preexposure performance reflects accurate performance in the isochronous condition but
chance performance in the nonisochronous condition. After exposure, children’s performance in the
nonisochronous condition is above chance and indistinguishable from performance in the isochronous
condition. By contrast, the performance of adults (bottom) in the nonisochronous condition does
change after at-home exposure, but never reaches above-chance levels or levels of accuracy obtained
in the isochronous condition. Infant and adult data are replotted from Hannon and Trehub (2005b).
Child data are from Hannon and Soley (in preparation).


exposure to foreign music does not have the same effect on adults, who improve after the same
amount of exposure but do not discriminate nonisochronous rhythms above chance levels after 2 weeks
of at-home exposure (Figure 7.4) (Hannon & Trehub, 2005b). It is not clear exactly why the effects
of training differ across ages, but such disparities may indicate that infants’ musical
representations have greater flexibility and are more susceptible to being influenced by perceptual
experience than are those of adults. In particular, it should be possible to document developmental
changes not only in a listener’s tendency to assimilate unfamiliar rhythms to culture-specific
categories, but also in the extent to which those assimilative tendencies can be modified by
passive exposure to unfamiliar music.

Recent research on this topic (Hannon & Soley, in preparation) suggests that adult-like
representations of musical meter may not emerge until after age 8. North American children aged 4–8
undertook the same basic training regimen as described above for adults. Testing occurred at
repeated sessions 2 weeks apart, using a game-like procedure adapted from the adult task in which
they judged the similarity of two musical stimuli by adjusting the position of a game piece. During
the 2-week training period, children listened at home to 10-min recordings of Balkan folk music
twice every day. Figure 7.4 shows that data from children replicate the classic findings obtained
with Western adults and 12-month-olds prior to exposure; 4- to 8-year-old children successfully
distinguish rhythms in the isochronous condition but perform at chance levels in the nonisochronous
condition. Importantly, performance in the isochronous and nonisochronous conditions differs
significantly. After exposure, however, children perform at above-chance levels in both isochronous
and nonisochronous conditions and their performance in the two conditions is indistinguishable. As
can be seen in Figure 7.4, children’s accuracy is lower than that of adults in all conditions, but
it is nevertheless striking that passive exposure to foreign music gives rise to native-like levels
of performance in children (and 12-month-olds) but not in adults. Despite the fact that Western
adults do improve in the nonisochronous condition, a considerable gap remains between performance
in isochronous versus nonisochronous contexts even after exposure. Thus, by investigating the
extent to which a representation is susceptible to modification, we learn that culture-specific
representations of musical rhythm and meter may continue to undergo developmental change throughout
childhood.

Mechanisms of Perceptual Development and Reorganization

Across speech, face, and music perception, we see a strikingly similar developmental picture. What
drives these developmental changes during infancy and why are similar patterns observed across such
disparate domains? One proposal is that repeated exposure to particular sounds and sights leads
neural circuits underlying perception to become increasingly committed to the statistics of the
input (Kuhl, 2004). Thus, initial abilities to discriminate unfamiliar structures arise from
immature, uncommitted circuitry, which is then “warped” by experience, whereby representations of
more frequently encountered stimuli expand as prototypes. These modified representations in turn
influence the extent to which future learning is possible because novel stimuli will tend to be
assimilated toward the prototype if they share its properties.

This account is generally consistent with patterns of early brain development. Infancy is
characterized by a proliferation of synapses followed by pruning, a process driven by Hebbian
learning through which repeated use (e.g., exposure to particular types of faces) leads to a
strengthening of neural circuits and disuse (e.g., lack of exposure to other-species or other-race
faces) leads to deterioration (Huttenlocher & Dabholkar, 1997; Scott, Pascalis, & Nelson, 2007). It
is also consistent with studies of second language learning, in which adult learners have greater
difficulty perceiving and producing foreign speech contrasts than do younger learners, especially
when the target contrasts interfere with the phonology of the subjects’ native language (Flege,
Yeni-Komshian, & Liu,

1999; Iverson et al., 2003; McCandliss, Fiez, Protopapas, Conway, & McClelland, 2002). Such a model
might also describe the formation of culture-specific metrical categories, where frequent exposure
to simple ratios and isochronous meters leads to the formation of metrical prototypes that
subsequently influence how listeners interpret events and form expectations while listening to
music.

Importantly, the observed declines in discrimination across modalities have not been interpreted as
signifying a loss of ability. Rather, they are seen to reflect an enhancement of representations
that code familiar structures and a rise in the ability to ignore irrelevant information. In the
speech domain, recent evidence suggests that declines in discrimination of nonnative contrasts are
accompanied by improvements in perception of native contrasts (Kuhl et al., 2005, 2006). Similarly,
individual variability in discrimination of native and nonnative speech contrasts is correlated
with language skills during childhood, such as language production, comprehension, and reading
(Burnham, 2004; Kuhl et al., 2005; Tsao, Liu, & Kuhl, 2004). Thus, declines in discrimination of
foreign meters are probably not due to a worsening of music perception, but rather the result of
strengthened culture-specific music knowledge.

Moreover, although adults and older infants show poor discrimination of unfamiliar structures in
behavioral tasks, paradigms using eye-tracking or event-related brain potentials reveal that they
do in fact, at a preattentive level, respond to subtle distinctions in foreign speech and in
nonhuman faces (McMurray & Aslin, 2005; Rivera-Gaxiola, Csibra, Johnson, & Karmiloff-Smith, 2000;
Rivera-Gaxiola, Silva-Pereyra, & Kuhl, 2005; Scott, Shannon, & Nelson, 2006). At some level, older
infants, children, and adults must also remain sensitive to complex ratios and meters in music
despite their poor performance in the behavioral task, otherwise it would not be possible for
discrimination to improve as a result of exposure. Thus, developmental declines across all
modalities may be less indicative of perceptual deterioration and more indicative of a
reorganization in which the system automatically overcomes or suppresses perceptual distinctions
that are not relevant to a particular individual’s experience (Rivera-Gaxiola et al., 2000).

Consistent with this view is the notion that developmental changes across these multiple domains
arise from domain-general aspects of cognitive development, such as the ability to inhibit
irrelevant information. Indeed, when 8- to 10-month-old infants’ nonnative consonant discrimination
is compared with their performance in object search (A not B) and visual categorization tasks,
developmental changes across tasks appear to occur in synchrony (Werker & Lalonde, 1995). In other
words, infants who perform most accurately in a nonspeech task are also poorest at nonnative
consonant discrimination. It is not currently known whether performance on nonmusical cognitive
tasks would correlate with perception of unfamiliar musical structures.

Maturational changes in neural plasticity and cognitive control may account for the occurrence of
similar patterns of perceptual development across music, speech, and face perception, and may
explain why adults have greater difficulty learning unfamiliar, culture-specific structures than do
infants. It is also worth considering, however, the ways in which perceptual experience itself
could function to change the nature of the learner, independent of maturation. One possibility is
that age-related changes in the ability to learn novel structures arise from the amount of
interference between culturally unfamiliar and familiar structures (Flege et al., 1999). As
representations of familiar structures become more elaborate and more entrenched, whether for
own-species faces, native language speech contrasts, or familiar musical meters, the impact of
unfamiliar structures on the existing representation will diminish because the unfamiliar
structures will be perceptually assimilated or ignored. Studies of second language learning are
largely consistent with this account (Iverson et al., 2003; McCandliss et al., 2002), but controlled
studies are made difficult by the fact that age and acquisition of knowledge are typically
confounded throughout development. Evidence from animal models

suggests that after being deprived of patterned input to sensory cortex, cortical organization in
deprived but older animals closely resembles that of younger animals (Chang & Merzenich, 2003). This
strongly supports the notion that by preventing experience-related tuning from occurring, we can
reverse or postpone expected developmental changes independent of age or maturation.

To summarize, multiple mechanisms may account for parallel developmental changes across the music,
face, and language domains, ranging from brain maturation and pruning to cognitive control to
experience-based interference. An important goal for future research will be to describe in more
detail the developmental trajectory of culture-specific knowledge in multiple domains and to
compare individual development across domains to understand why such change occurs. For example, if
experience-based changes are solely responsible for developmental change, we should see little
correlation across domains and individual differences that vary according to amount of exposure in
a specific domain. If, however, more general maturational factors also play a role, we would expect
to see more congruence across domains and relatively small individual differences as a function of
specific experience. Music may provide a unique opportunity to address these questions, because
exposure to music even within a culture is subject to much greater individual variability (i.e.,
different listening practices across families and individuals) than exposure to other structures
such as speech and faces.

Constraints on Music Learning

The above sections focus on the role of perceptual experience and learning mechanisms in acquiring
music knowledge. It is also important, however, to understand the constraints that define the
starting point for learning and limit what can be learned. Because newborns are faced with a virtual
cacophony of structures across all sensory modalities, we might expect the learning process to be
easily derailed as infants voraciously and indiscriminately absorb all information encountered. Yet
infants somehow manage to focus on the statistics that lead to mastery of appropriate structures in
appropriate domains. In this section, I will discuss possible constraints for learning musical
rhythm and meter.

What are the starting points for music learning? It is sensible to assume that all learning will be
constrained to some extent by intrinsic properties of the nervous system, such as the structure of
sensory organs. The phenomenon of musical consonance and dissonance provides a classic example of
this in the auditory domain. Simultaneous pitches that stand in simple integer ratios (such as 2:1,
3:2, and 4:3, corresponding to the octave, fifth, and fourth) are termed consonant, while pitches
standing in more complex ratios (such as 16:15 or 45:32, corresponding to the minor second and the
tritone) are termed dissonant. Throughout the world and over the course of history, consonant
intervals tend to occur more frequently in music and give rise to positive affective responses,
whereas the opposite is true of dissonant intervals (Cross, 2001; Dowling & Harwood, 1984; Kilmer,
Crocker, & Brown, 1976; Koelsch, Fritz, Cramon, Müller, & Friederici, 2006). The distinction between
consonance and dissonance likely originates in the structure of the ear: the frequencies of
dissonant intervals tend to be too close to be resolved on the basilar membrane, so their resulting
vibration patterns give rise to beating and perception of roughness (Fishman et al., 2001; Tramo,
Cariani, Koh, Makris, & Braida, 2003). It is presumably for this reason that human infants and even
nonhuman animals are able to discriminate and categorize sounds on the basis of consonance and
dissonance (Schellenberg & Trainor, 1996; Watanabe, Uozumi, & Tanaka, 2005).

Humans not only discriminate consonance and dissonance, but they find beating and roughness
aversive, which is probably why robust listening preferences for consonant over dissonant intervals
have been observed in very young infants, including hearing newborns of deaf mothers who may have
reduced prenatal exposure to musical intervals (Masataka, 2006; Trainor & Heinmiller, 1998; Trainor,
Tsang, & Cheung, 2002; Zentner & Kagan,

1996). Interestingly, encoding and memory for musical patterns can be compromised by dissonance. For
example, adults and 6-month-old infants have greater difficulty detecting subtle frequency changes
in patterns containing dissonant intervals than in patterns containing consonant intervals (Acker,
Pastore, & Hall, 1995; Schellenberg & Trehub, 1994, 1996). The phenomenon of consonance and
dissonance thus illustrates how peripheral properties (i.e., frequency resolution in the ear) can
give rise to a cascade of effects shaping discrimination, esthetic preferences, and efficiency of
encoding in infants and adults, which may ultimately determine which structures humans prefer in
music throughout the world.

Do similar constraints affect temporal structures in music? Although the ear is not a likely
candidate for constraining perception of rhythm and meter, domain-general mechanisms in the nervous
system, such as those underlying prediction and movement, may give rise to intrinsic biases for
temporal regularity. Unpredictable auditory sequences result in more anxiety-like behavior and
sustained amygdala activity than do predictable sequences, suggesting that at some level, listeners
find temporal irregularity aversive and may therefore seek out regularity (Herry et al., 2007).
Adult listeners also have greater difficulty discriminating temporal intervals, patterns, or
individual pitches when the preceding context is unpredictable than when it is predictable (Barnes
& Jones, 2000; Drake & Botte, 1993; Jones, Johnston, & Puente, 2006; Jones, Moynihan, Mackenzie, &
Puente, 2002). As reviewed above, the simplicity of serial interval ratios predicts how well adults
can reproduce, identify, remember, and synchronize with rhythmic patterns (Collier & Wright, 1995;
Desain & Honing, 2003; Essens, 1986; Essens & Povel, 1985; Hannon & Trehub, 2005a; Large, 2000;
Povel, 1981; Repp et al., 2005; Snyder et al., 2006). Abundant evidence also suggests that
production and perception of parallel ratios (i.e., the ratios between two simultaneous periodic
patterns, such as polyrhythms) is affected by ratio simplicity (Deutsch, 1983; Klapp, 1981; Klapp
et al., 1985; Peper, Beek, & van Wieringen, 1995; Treffner & Turvey, 1993).

Biological Basis for Biases toward

Why are irregular rhythms so challenging for listeners? Most explanations rely on the assumption
that internal timekeeping mechanisms, such as a grid, clock, or bank of oscillators, constrain
rhythmic perception and behavior such that regular sequences are more efficiently processed than
irregular sequences. Clock or grid models assume that the listener deduces a maximally efficient
description of rhythmic patterns where individual events line up with the period of one or more
internal clocks (Povel, 1984). Rhythms containing event onsets that do not consistently support a
single clock (or that simultaneously support many clocks) are not easily described, and thus force
the listener to rely on explicit memory of each interval instead of iterated interval categories
(Janata & Grafton, 2003; Povel, 1984; Semjen & Ivry, 2001).

Dynamical systems approaches describe rhythmic pattern coordination using the mathematics of
nonlinear oscillators (Large, 2001; Treffner & Turvey, 1993). One model of musical meter proposes
that temporal patterns are represented by a bank of internal oscillators that entrain to
periodicities in the stimulus and compete for activation through inhibition (Large, 2001; Large &
Jones, 1999). The intrinsic dynamics of oscillators give rise to greater stability for simple
ratios and greater instability for complex ratios. Importantly, the behavioral output of coupled
oscillators need not rely on complex or highly specialized neural substrates. The ratios that are
most stable for human interlimb coordination (Peper et al., 1995) are also most stable for
synchronous behavior in singing birds (Laje & Mindlin, 2003) and courting fireflies (Buck, 1988),
suggesting that in principle, similar mechanisms could be responsible for movement coordination
across species. Thus, basic human interlimb coordination, such as walking, running, and other forms
of movement, may give rise to temporal processing mechanisms that bias listeners toward regularity
(Summers, 2001).

By claiming that simple-integer temporal ratios are intrinsically easier for timekeeping

mechanisms to represent, the above accounts are seemingly at odds with findings reviewed in the
section “Building Musical Knowledge through Enculturation,” which underscore the role of culture
and experience by showing that one’s prior exposure to nonisochronous musical structures
dramatically shapes the extent to which irregularity disrupts performance (Hannon & Trehub, 2005a,
2005b). Learning and experience are also implicated by individual differences in production of
complex-ratio sequences, because 3:2 ratios can pose great difficulty for most subjects (Povel,
1981) but pose minimal difficulty for subjects with extensive music training (Collier & Wright,
1995; although this was not found by Hannon & Trehub, 2005a or Repp et al., 2005). Training studies
also show that practice can dramatically improve the production of complex-ratio polyrhythms
(Krampe, Kliegl, Mayr, Engbert, & Vorbert, 2000; Zanone & Kelso, 1992). Any explanation of
intrinsic biases toward regularity must therefore account for the effects of experience and
learning.

The Importance of Temporal Regularity for Infants

To the extent that basic timing mechanisms of the nervous system are responsible for biases toward
temporal regularity in adults, we should expect infants and adults to have equal difficulty
processing irregular patterns. Instead, as described above, 6-month-olds outperform adults at
discriminating patterns containing 3:2 ratios, suggesting that either (1) infants in these
experiments are not processing rhythmic patterns in an entirely adult-like fashion but are using
some alternative strategy such as remembering interval sequences or (2) infants and adults process
patterns using the same mechanisms, which can accommodate slightly complex ratios such as 3:2 but
not highly complex ratios, with enculturation processes during infancy then suppressing ratios not
used in familiar musical styles. The mechanisms underlying infants’ perception of rhythm and meter
are not yet understood, but recent findings lend support to the latter hypothesis.

Like adults, young infants exhibit listening preferences for simple over complex rhythmic patterns
(Nakata & Mitani, 2005; Soley & Hannon, submitted), and they appear to have difficulty in
processing and remembering patterns having unconventional rhythmic structure (Trehub & Hannon,
2009). In this study, adults listened to a corpus of rhythmic arrangements of a 12-note pitch
sequence and labeled each arrangement as either “good” or “bad.” The arrangement most frequently
labeled bad and the one most frequently labeled good were selected for use in a detection task,
where adults and 6-month-old infants were trained to respond to subtle disruptions of either the
good or bad arrangement (adults responded by raising a hand while infants made a head-turn).
Although changes were successfully detected in all conditions, both infants and adults were
significantly better at detecting a 260-ms rhythmic change to the good arrangement than to the bad
arrangement, even though the serial position and size of the change were identical in both
conditions. Thus, some aspect of rhythm in the sequence most preferred by adults may have afforded
better perceptual processing regardless of experience and culture. The basis for this processing
advantage is difficult to determine, however, because the good and bad arrangements were unique
and thus differed from each other in multiple ways. One potential explanation is that the good
rhythm implied a consistent underlying pulse whereas the bad rhythm implied multiple pulses at
different points in the pattern. Without controlled manipulation of specific structures, however,
it is impossible to know why the good arrangement gave rise to superior performance.

The above finding implies that infants’ rhythm perception is not infinitely flexible, but might
instead be constrained by at least some of the limitations that apply to adults. If infants and
adults rely on the same basic temporal mechanisms, we should see even very young infants struggle
with highly complex ratios that are relatively rare in music. If, however, infants use an
alternative strategy to discriminate rhythmic patterns, such as memorizing a sequence of specific
intervals, then the complexity of interval ratios should not matter as long as the serial
structure of rhythms is simple.
Hannon (in preparation-b) recently addressed this question by examining how 4- to 6-month-old
infants perceive rhythmic patterns having varying levels of ratio complexity. Three rhythmic
variations of the same folk tune were created containing simple Western (2:1), complex Balkan
(3:2) and highly complex “Alien” (7:4) ratios (see Figure 7.5). In all three conditions, the song
cycled through a sequence of Long–Short–Short intervals with the duration of the long interval set
at 756 ms; the only difference between conditions was the size of the short interval and the
resulting ratio. After habituation to the standard version of one of the three variations, infants
were alternately presented with the standard and a changed version containing a 200-ms increase in
the duration of the long interval. Thus, across the three conditions, the absolute size of the
target interval and the change were held constant. Infants showed a novelty preference for the
disrupted stimulus in the simple and complex conditions, but showed no preference in the highly
complex condition, suggesting that ratio complexity does influence how listeners perceive rhythms
even prior to enculturation.

In summary, although young infants are culture-general music listeners, they are nevertheless
influenced by the regularity of rhythmic patterns and the simplicity of temporal interval ratios,
presumably because there is some continuity between infants and adults in the nature and
limitations of basic temporal processing mechanisms. Although such mechanisms are not fully
understood, they may derive from dynamic behaviors such as anticipatory attending and the
coordination of various types of movement. Models of rhythmic timing suggest that perfectly
isochronous meters are optimal, but they also suggest that there is a continuum of complexity with
ratios such as 3:2 positioned at the simpler end. In this light, it is interesting to consider
features of non-isochronous musical meters throughout the world, such as those common in India,
the Balkans, and throughout Africa, which tend to be restricted, at least in practice, to
alternating patterns of 2 and 3 (London, 2004; Powers & Widdess, 2001). By better understanding
early temporal processing constraints, future research can develop more thorough and complete
accounts of musical rhythm learning.

Music and speech unquestionably depend on specialized perceptual processes that are often
associated with distinct and separable brain areas (Binder et al., 2000; Narain et al., 2003;
Peretz et al., 1994; Vouloumanos, Kiehl, Werker & Liddle, 2001), a fact that has contributed to

Condition              Standard (ms)      Change, L+200 (ms)
                       Long    Short      Long    Short
Simple (2:1)           756     378        956     378
Complex (3:2)          756     504        956     504
Highly complex (7:4)   756     432        956     432

Figure 7.5 Simple, complex, and highly complex versions of a folk tune used in three conditions of
a habituation experiment with young infants (Hannon, in preparation-b). The standard rhythm in
the simple condition contained a repeating cycle of long and short intervals having a 2:1 ratio. The
complex Balkan rhythm contained 3:2 ratios, whereas the highly complex rhythm contained a 7:4
ratio, which is relatively rare in music. In all three conditions, the standard and a changed version
were presented after habituation. The change consisted of a 200-ms increase in the duration of the
long interval. Note that the absolute size of the long interval was identical across conditions for the
standard and change stimulus.
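
The interval durations shown for Figure 7.5 follow directly from the fixed 756-ms long interval and each condition's long:short ratio. The following Python sketch reproduces that arithmetic. The function names `short_interval_ms` and `condition_table` and the dictionary layout are illustrative assumptions, not part of the original study materials.

```python
from fractions import Fraction

LONG_MS = 756    # long interval in the standard rhythm (value given in the text)
CHANGE_MS = 200  # increase applied to the long interval in the changed version

# Long:short ratios for the three conditions described in the text.
CONDITIONS = {
    "simple": (2, 1),
    "complex": (3, 2),
    "highly complex": (7, 4),
}


def short_interval_ms(long_ms, long_part, short_part):
    """Return the short interval (ms) implied by a long:short ratio."""
    return Fraction(long_ms) * Fraction(short_part, long_part)


def condition_table():
    """Map each condition to (standard long, short, changed long), all in ms."""
    table = {}
    for name, (long_part, short_part) in CONDITIONS.items():
        short = short_interval_ms(LONG_MS, long_part, short_part)
        # The three ratios divide 756 exactly, so int() loses nothing here.
        table[name] = (LONG_MS, int(short), LONG_MS + CHANGE_MS)
    return table


if __name__ == "__main__":
    for name, (long_ms, short_ms, changed_ms) in condition_table().items():
        print(f"{name}: standard {long_ms}/{short_ms} ms, changed {changed_ms}/{short_ms} ms")
```

Because the changed long interval is 956 ms in every condition, the size of the change is the same across conditions, while the ratio it disrupts differs.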

the widely held assumption that music and language abilities arise from modular, innate
adaptations (Fodor, 1983; Liberman & Mattingly, 1985; Peretz & Coltheart, 2003; Peretz & Hyde,
2003). However, these assumptions have been challenged by evidence that individuals with
music-specific deficits also exhibit impairments in speech perception (Patel, Foxton, & Griffiths,
2005) and that individuals with language-specific deficits also show impairment on music tasks
(Alcock, Passingham, Watkins, & Vargha-Khadem, 2000). Studies using brain-imaging in normal adult
listeners have provided additional evidence that purportedly language-specific brain regions, such
as Broca’s area, are also involved in music processing (Maess, Koelsch, Gunter, & Friederici,
2001). Thus, evidence of at least some shared processes for music and speech contradicts a
strictly modular account.

Nevertheless, there are undoubtedly many speech- and music-specific structures in the adult brain,
although the origins of such specialization are not clear. Some behavioral evidence suggests that
speech-specific processes may be functional very early. For example, 2-month-old infants prefer
speech to nonspeech stimuli, even when the acoustic structure of speech and nonspeech is very
similar (Vouloumanos & Werker, 2004), and infants may employ a type of rule-learning that is
optimally suited to language (Marcus, Fernandes, & Johnson, 2007, but see Saffran, Pollak, Seibel,
& Shkolnik, 2007 for an example of rule-learning in vision). It is not clear, however, that fully
lateralized, language- and music-specific structures exist in the brain of the newborn
(Dehaene-Lambertz, 2000). In fact, recent studies suggest that language-specific brain areas do
not emerge until late in infancy or after infancy (Imada et al., 2006; Minagawa-Kawai, Mori, Naoi,
& Kojima, 2007). Thus, it is worth considering the possibility that domain-specificity develops,
that is, that infants might initially approach both speech and music with one set of basic
auditory perceptual skills and learning mechanisms.

As representations of music become increasingly culture-specific, so too might representations of
sound become increasingly domain-specific. Enculturation is defined here as the process through
which culture-specific knowledge is built through everyday listening experiences, as infants and
children attempt to actively predict and interpret patterns that unfold over time. Because speech
and music are both complex and dynamic acoustic structures, young listeners likely build music-
and language-specific representations in parallel. Thus, if the culture-specific structures of
child-directed music and speech contain a high degree of overlap, this might have implications for
early representations of music and speech.

Rhythm is not unique to music. Rhythmic structure is also fundamental for speech comprehension and
in fact probably plays a vital role in infants’ early responses to language. Within days of birth,
newborn infants can discriminate native from foreign utterances of speech (Bahrick & Pickens,
1988; Mehler et al., 1988; Moon, Cooper, & Fifer, 1993). The preference is maintained even when
utterances are low-pass filtered, preserving only the rhythmic properties of speech and mimicking
the quality of sound in utero (Abrams et al., 2000; Mehler et al., 1988).

Language-specific rhythmic structures probably form the basis for these early native-language
preferences. The speech rhythm of a language has historically been defined by linguists as arising
from the way languages divide time; syllable-timed languages such as French and Spanish use the
syllable to mark equal time units whereas stress-timed languages such as English and Dutch use
stressed syllables to mark equal units of time (Cutler, 1994; Jusczyk, 2002). These rhythm-based
classifications also map onto newborns’ discrimination of languages: newborns fail to discriminate
languages from the same rhythmic class but easily discriminate languages from separate rhythmic
classes (Nazzi, Bertoncini, & Mehler, 1998). Acoustic measures have been identified that can
classify languages based on the amount of variability that characterizes adjacent vocalic and
intervocalic intervals (Grabe & Low, 2002; Ramus, Nespor, & Mehler, 1999). One such measure, the
normalized pairwise variability index (nPVI), has been successfully used to distinguish
stress-timed languages such as English and German,
which have higher vocalic interval variability, from syllable-timed languages such as French and
Spanish, which have lower vocalic interval variability.

Recent evidence suggests that there may be a link between rhythmic structure in language and
music. Inspired by musicologists’ speculations that a culture’s language influences its music,
Patel and Daniele (2003) used nPVI to examine the variability of musical note durations in the
themes of instrumental art music written by English- and French-speaking composers. They
discovered that, like actual utterances of English and French speech, English musical themes
contained higher durational contrast than did French themes. Subsequent studies have further
verified a relationship between speech prosody and musical structures (Huron & Ollen, 2003; Patel,
Iversen, & Rosenberg, 2006; Sadakata, Desain, Honing, Patel, & Iversen, 2004). Given that
culture-specific differences in rhythmic structure appear to exist in both speech and music, a
natural question to ask is whether such differences exist in musical input directed toward young
listeners and if so whether the differences can be perceived.

To address the first question, 140 songs from French- and English-speaking cultures were selected
from children’s music anthologies and analyzed for their rhythmic properties (Hannon, in
preparation-c). Consistent with prior findings, English-language songs contained higher nPVI
values than did French-language songs (see Figure 7.6). Interestingly, the magnitude of the
difference in children’s music was somewhat larger than has been obtained in prior studies using
instrumental (Huron & Ollen, 2003; Patel & Daniele, 2003) and popular music (Sadakata et al.,
2004). This could arise if child-directed musical input exaggerates culture-specific rhythmic
properties, but it could also result from the presence of words in all children’s songs examined,
which might have maximized the influence of the native language on musical rhythm over more


Figure 7.6 Analysis of rhythmic variability in French and English songs for children and adults
(Hannon, in preparation-c). The y-axis shows the normalized pairwise variability index (nPVI) for
adjacent musical notes, and the x-axis shows whether songs were selected from children’s
compilations or from adults’ compilations (i.e., traditional folk songs with text). Dark bars
represent English songs and light bars represent French songs. For songs taken from adult
compilations, the nPVI for English was slightly higher (M = 41.6, N = 70) than for French
(M = 40.01, N = 70). For songs taken from children’s compilations, the nPVI for English was much
higher (M = 42.03, N = 70) than for French (M = 33.54, N = 70).
Error bars represent standard error.
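
For readers unfamiliar with the measure, nPVI is commonly computed (following Grabe & Low, 2002) as the mean absolute difference between successive durations, each normalized by the mean of the pair, and scaled by 100. The sketch below assumes that standard definition. The function name `npvi` and the example durations are illustrative and are not the data summarized in Figure 7.6.

```python
def npvi(durations):
    """Normalized pairwise variability index for a sequence of positive durations.

    nPVI = 100 / (m - 1) * sum(|d_k - d_(k+1)| / ((d_k + d_(k+1)) / 2))
    where m is the number of durations.
    """
    if len(durations) < 2:
        raise ValueError("nPVI needs at least two durations")
    total = 0.0
    for current, following in zip(durations, durations[1:]):
        pair_mean = (current + following) / 2
        total += abs(current - following) / pair_mean
    return 100 * total / (len(durations) - 1)


if __name__ == "__main__":
    even = [1, 1, 1, 1]    # equal note durations: no durational contrast
    uneven = [3, 1, 3, 1]  # long-short alternation: high durational contrast
    print(npvi(even))
    print(npvi(uneven))
```

Higher values indicate greater contrast between neighboring durations, which is the sense in which English songs are described as having higher nPVI than French songs.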

instrumental samples. To test this possibility, a second set of 140 French and English songs with
text was selected from traditional folk and popular music anthologies. This analysis revealed the
same trend but much more modest differences (see Figure 7.6). Future research will further explore
differences between child- and adult-directed songs, but these initial findings point toward the
presence of culture-specific rhythm differences in music directed toward young listeners.

In order for such differences to affect infants’ learning of music and speech, speech-based
rhythmic structures must actually be meaningful to listeners in the context of music. Although it
is clear that infants and adults can perceive and categorize utterances of spoken language on the
basis of nPVI (Nazzi et al., 1998; Ramus et al., 1999), comparable structures in music may simply
exist as a by-product of setting text to music and not necessarily be perceived by listeners. To
address this question, adult listeners were required to categorize instrumental versions of French
and English children’s songs as belonging to one of two fictional languages (Hannon, in press).
During the training phase, they received feedback following every trial, but during the test
phase, they had to categorize an entirely novel set of French and English children’s songs.
Results showed that listeners were highly accurate (77% correct) in their categorization of novel
songs after training. Moreover, performance was unchanged after pitch information was removed from
the songs, suggesting that rhythm was the primary cue and not culture-specific pitch structure or
familiarity. Thus, adult listeners were able to perceive and use language-based rhythmic
differences to categorize novel songs, suggesting that in principle, infants may also be sensitive
to these structures.

In summary, a burgeoning research area is beginning to explore similarities in music and language
processing from a developmental perspective. By examining the nature of auditory input in the
environment of the young listener, we may better understand developmental changes in
culture-specific representations across music and speech domains.

SUMMARY

This chapter has outlined a research program that aims to understand how individuals build musical
representations throughout development. Instead of framing the question of musical knowledge and
behavior in terms of modular and unique capacities evolved through natural selection, this
approach emphasizes the role of perceptual experience and statistical learning that is
domain-general, operating in tandem with simple constraints that arise from properties of the
sensory organs and the nervous system. Infants learn about music as they actively attempt to
predict and interpret musical events that unfold over time, and these experiences in turn
influence the nature of subsequent perception and learning. Investigations of musical knowledge
acquisition therefore have the potential not only to shed light on the nature and origins of human
musical behavior but also to inform our understanding of perceptual development across a number of
domains.

REFERENCES

Abrams, R. M., Gerhardt, K. J., Huang, X., Peters, A. J. M., & Langford, R. G. (2000). Musical experiences of the unborn baby. Journal of Sound and Vibration, 231, 253–258.
Acker, B. E., Pastore, R. E., & Hall, M. D. (1995). Within-category discrimination of musical chords: Perceptual magnet or anchor? Perception & Psychophysics, 57, 863–874.
Alcock, K. J., Passingham, R. E., Watkins, K., & Vargha-Khadem, F. (2000). Pitch and timing abilities in inherited speech and language impairment. Brain and Language, 75, 34–46.
Avanzini, G., Lopez, L., Koelsch, S., & Majno, M. (Eds.). (2006). The neurosciences and music II: From perception to performance. New York: The New York Academy of Sciences.
Bahrick, L. E., & Pickens, J. N. (1988). Classification of bimodal English and Spanish passages by infants. Infant Behavior and Development, 11,
Barnes, R., & Jones, M. R. (2000). Expectancy, attention, and time. Cognitive Psychology, 41, 254–311.
Bigand, E., & Poulin-Charronnat, B. (2006). Are we “experienced listeners”? A review of the
musical capacities that do not depend on formal music training. Cognition, 100, 100–130.
Binder, J. R., Frost, J. A., Hammeke, T. A., Bellgowan, P. S. F., Springer, J. A., Kaufman, J. N., et al. (2000). Human temporal lobe activation by speech and nonspeech sounds. Cerebral Cortex, 10, 512–528.
Brown, S. (2003). Biomusicology, and three biological paradoxes about music. Bulletin of Psychology and the Arts, 4, 15–17.
Buck, J. (1988). Synchronous rhythmic flashing of fireflies. II. The Quarterly Review of Biology, 63, 265–289.
Burnham, D. (2004). Language specific speech perception and the onset of reading. Reading and Writing, 16, 573–609.
Chang, E., & Merzenich, M. (2003). Environmental noise retards auditory cortical development. Science, 300(5618), 498–502.
Christiansen, M. H., Allen, J., & Seidenberg, M. S. (1998). Learning to segment speech using multiple cues: A connectionist model. Language and Cognitive Processes, 13, 221–268.
Clarke, E. F. (1987). Levels of structure in the organization of musical time. Contemporary Music Review, 2, 211–239.
Clayton, M. (2000). Time in Indian music. New York: Oxford University Press.
Collier, G. L., & Wright, C. E. (1995). Temporal rescaling of simple and complex ratios in rhythmic tapping. Journal of Experimental Psychology: Human Perception and Performance, 21, 602–627.
Compston, A. (Ed.). (2006). Music and the brain [Special Issue]. Brain, 129.
Conway, C. M., & Christiansen, M. H. (2006). Statistical learning within and between modalities: Pitting abstract against stimulus specific representations. Psychological Science, 17, 905–912.
Creel, S. C., & Newport, E. L. (2002). Tonal profiles of artificial scales: Implications for music learning. In C. Stevens, D. Burnham, G. McPherson, E. Schubert, & J. Renwick (Eds.), Proceedings of the 7th International Conference on Music Perception and Cognition, Sydney. Adelaide: Causal Productions.
Cross, I. (2001). Music, cognition, culture and evolution. Annals of the New York Academy of Sciences, 930, 28–42.
Cuddy, L. L., & Badertscher, B. (1987). Recovery of the tonal hierarchy: Some comparisons across age and levels of musical experience. Perception & Psychophysics, 41, 609–620.
Cummins, F., & Port, R. (1998). Rhythmic constraints on stress timing in English. Journal of Phonetics, 26, 145–171.
Cutler, A. (1994). Segmentation problems, rhythmic solutions. Lingua, 92, 81–104.
Dalla Bella, S., Giguere, J., & Peretz, I. (2007). Singing proficiency in the general population. Journal of the Acoustical Society of America, 121, 1182–1189.
Deutsch, D. (1983). The generation of two isochronous sequences in parallel. Perception & Psychophysics, 34, 331–337.
Dehaene-Lambertz, G. (2000). Cerebral specialization for speech and non-speech stimuli in infants. Journal of Cognitive Neuroscience, 12, 449–460.
Desain, P., & Honing, H. (2003). The formation of rhythmic categories and metric priming. Perception, 32, 341–365.
Dowling, W. J., & Harwood, D. L. (1986). Music cognition. Orlando, FL: Academic Press.
Drake, C., & Botte, M. (1993). Tempo sensitivity in auditory sequences: Evidence for the multiple look model. Perception & Psychophysics, 54, 277–286.
Drayna, D., Manichaikul, A., de Lange, M., Snieder, H., & Spector, T. (2001). Genetic correlates of musical pitch recognition in humans. Science, 291, 1969–1972.
Eimas, P. D. (1974). Auditory and linguistic processing of cues for place of articulation by infants. Perception & Psychophysics, 16, 513–521.
Eimas, P. D. (1975). Auditory and phonetic coding of the cues for speech: Discrimination of the [r-l] distinction by young infants. Perception & Psychophysics, 18, 341–347.
Eimas, P. D., Siqueland, E. R., Jusczyk, P. W., & Vigorito, J. (1971). Speech perception in infants. Science, 171, 303–306.
Essens, P. (1986). Hierarchical organization of temporal patterns. Perception & Psychophysics, 40, 69–73.
Essens, P., & Povel, D. (1985). Metrical and nonmetrical representations of temporal patterns. Perception & Psychophysics, 37, 1–7.
Fishman, Y. I., Volkov, I. O., Noh, M. D., Garell, P. C., Bakken, H., Arezzo, J. C., et al. (2001). Consonance and dissonance of musical chords: Neural correlates in auditory cortex of monkeys and humans. The Journal of Neurophysiology, 86, 2761–2788.
Flege, J. E., Yeni-Komshian, G. H., & Liu, S. (1999). Age constraints on second-language acquisition. Journal of Memory and Language, 41, 78–104.
Fodor, J. A. (1983). The modularity of mind: An essay on faculty psychology. Cambridge, MA: MIT Press.
Fraisse, P. (1978). Time and rhythm perception. In E. C. Carterette & M. P. Friedman (Eds.), Handbook of perception (Vol. 8, pp. 203–254). New York: Academic Press.
Fraisse, P. (1982). Rhythm and tempo. In D. Deutsch (Ed.), The psychology of music (pp. 149–180). New York: Academic Press.
Grabe, E., & Low, E. L. (2002). Durational variability in speech and the rhythm class hypothesis. In C. Gussenhoven & N. Warner (Eds.), Laboratory phonology (pp. 515–546). Berlin: Mouton de Gruyter.
Grewe, O., Kopiez, R., & Altenmüller, E. (2005). How does music arouse “chills”? Investigating strong emotions, combining psychological, physiological, and psychoacoustic methods. Annals of the New York Academy of Sciences, 1060, 446–449.
Hannon, E. E. Infants know bad dancing when they see it: Audiovisual synchrony perception of dancing to music in 10-month-old infants. Manuscript in preparation-a.
Hannon, E. E. Infants’ perception of musical rhythm is culture-general but constrained by ratio simplicity. Manuscript in preparation-b.
Hannon, E. E. Speech rhythms are exaggerated in children’s music. Manuscript in preparation-c.
Hannon, E. E., & Soley, G. Children learn to perceive foreign musical structures after passive listening experience. Manuscript in preparation.
Hannon, E. E. (in press). Perceiving speech rhythm in music: Listeners categorize instrumental songs according to language of origin. Cognition.
Hannon, E. E., & Johnson, S. P. (2005). Infants use meter to categorize rhythms and melodies: Implications for musical structure learning. Cognitive Psychology, 50, 354–377.
Hannon, E. E., Snyder, J. S., Eerola, T., & Krumhansl, C. L. (2004). The role of melodic and temporal cues in perceiving musical meter. Journal of Experimental Psychology: Human Perception and Performance, 30, 956–974.
Hannon, E. E., & Trainor, L. J. (2007). Music acquisition: Effects of enculturation and formal training on development. Trends in Cognitive Sciences, 11, 466–472.
Hannon, E. E., & Trehub, S. E. (2005a). Metrical categories in infancy and adulthood. Psychological Science, 16, 48–55.
Hannon, E. E., & Trehub, S. E. (2005b). Tuning in to rhythms: Infants learn more readily than adults. Proceedings of the National Academy of Sciences (USA), 102, 12639–12643.
Hauser, M. D., & McDermott, J. (2003). The evolution of the music faculty: A comparative perspective. Nature Neuroscience, 6, 663–668.
Herry, C., Bach, D. R., Esposito, F., Di Salle, F., Perrig, W. J., Scheffler, K., et al. (2007). Processing of temporal unpredictability in human and animal amygdala. The Journal of Neuroscience, 27, 5958–5966.
Huron, D. (2003). Is music an evolutionary adaptation? In I. Peretz & R. J. Zatorre (Eds.), The cognitive neuroscience of music (pp. 57–75). New York: Oxford University Press.
Huron, D., & Ollen, J. (2003). Agogic contrasts in French and English themes: Further support for Patel and Daniele (2003). Music Perception, 21, 267–271.
Huron, D., & Royal, M. (1996). What is melodic accent? Converging evidence from musical practice. Music Perception, 13, 489–516.
Huttenlocher, P. R., & Dabholkar, A. S. (1997). Regional differences in synaptogenesis in human cerebral cortex. Journal of Comparative Neurology, 387, 167–178.
Hyde, K., & Peretz, I. (2004). Brains that are out of tune but in time. Psychological Science, 15, 356–360.
Imada, T., Zhang, Y., Cheour, M., Taulu, S., Ahonen, A., & Kuhl, P. K. (2006). Infant speech perception activates Broca’s area: A developmental magnetoencephalography study. Neuroreport, 17, 957–962.
Iverson, P., Kuhl, P. K., Akahane-Yamada, R., Diesch, E., Tohkura, Y., Kettermann, A., et al. (2003). A perceptual interference account of acquisition difficulties for non-native phonemes. Cognition, 87, B47–B57.
Ivry, R. B., & Hazeltine, R. E. (1995). Perception and production of temporal intervals across a range of durations: Evidence for a common timing mechanism. Journal of Experimental Psychology: Human Perception and Performance, 21, 3–18.
Janata, P., Birk, J. L., Van Horn, J. D., Leman, M., Tillman, B., & Bharucha, J. J. (2002). The cortical topography of tonal structures underlying western music. Science, 298, 2167–2170.
Janata, P., & Grafton, S. T. (2003). Swinging in the brain: Shared neural substrates for behaviors related to sequencing and music. Nature Neuroscience, 6, 682–687.
Järvinen, T., & Toiviainen, P. (2000). The effect of metre on the use of tones in jazz improvisation. Musicae Scientiae, 4, 55–74.
Jones, M. R., Johnston, H. J., & Puente, J. (2006). Effects of auditory pattern structure on anticipatory and reactive attending. Cognitive Psychology, 53, 59–96.
Jones, M. R., Moynihan, H., Mackenzie, N., & Puente, J. (2002). Temporal aspects of stimulus-driven attending in dynamic arrays. Psychological Science, 13, 313–319.
Juslin, P. N., & Laukka, P. (2003). Communication of emotions in vocal expression and music performance: Different channels, same code? Psychological Bulletin, 129, 770–814.
Jusczyk, P. W. (2002). How infants adapt speech-processing capacities to native-language structure. Current Directions in Psychological Science, 11, 15–18.
Karmiloff-Smith, A. (1992). Beyond modularity: A developmental perspective on cognitive science. Cambridge, MA: MIT Press.
Kelly, D. J., Quinn, P. C., Slater, A. M., Lee, K., Ge, L., & Pascalis, O. (2007). The other-race effect develops during infancy: Evidence of perceptual narrowing. Psychological Science, 18, 1084–1089.
Kilmer, A. D., Crocker, R. L., & Brown, R. R. (1985). Sounds from silence: Recent discoveries in ancient Near Eastern music. Berkeley, CA: Bit Enki Publications.
Kirkham, N. Z., Slemmer, J. A., & Johnson, S. P. (2002). Visual statistical learning in infancy: Evidence for a domain general learning mechanism. Cognition, 83, B35–B42.
Koelsch, S., Fritz, T., Cramon, Y., Müller, K., & Friederici, A. D. (2006). Investigating emotion with music: An fMRI study. Human Brain Mapping, 27, 239–250.
Koelsch, S., Fritz, T., Schulze, K., Alsop, D., & Schlaug, G. (2005). Adults and children processing music: An fMRI study. NeuroImage, 25, 1068–1076.
Koelsch, S., & Siebel, W. A. (2005). Towards a neural basis of music perception. Trends in Cognitive Sciences, 9, 578–584.
Klapp, S. T. (1981). Temporal compatibility in dual motor tasks II: Simultaneous articulation and hand movements. Memory & Cognition, 9, 398–401.
Klapp, S. T., Hill, M. D., Tyler, J. G., Martin, Z. E., Jagacinski, R. J., & Jones, M. R. (1985). On marching to two different drummers: Perceptual aspects of the difficulties. Journal of Experimental Psychology: Human Perception and Performance, 11, 814–827.
Krampe, R., Kliegl, R., Mayr, R., Engbert, R., & Vorbert, D. (2000). The fast and the slow of skilled bimanual rhythm production: Parallel vs. integrated timing. Journal of Experimental Psychology: Human Perception and Performance, 26, 206–233.
Kuhl, P. K. (2004). Early language acquisition: Cracking the speech code. Nature Reviews Neuroscience, 5, 831–843.
Kuhl, P. K., Conboy, B. T., Padden, D., Nelson, T., & Pruitt, J. (2005). Early speech perception and later language development: Implications for the “critical period.” Language Learning and Development, 3–4, 237–264.
Kuhl, P. K., Stevens, E., Hayashi, A., Deguchi, T., Kiritani, S., & Iverson, P. (2006). Infants show a facilitation effect for native language phonetic perception between 6 and 12 months. Developmental Science, 9, F13–F21.
Kuhl, P. K., Tsao, F-M., & Liu, H-M. (2003). Foreign-language experience in infancy: Effects of short-term exposure and social interaction on phonetic learning. Proceedings of the National Academy of Sciences (USA), 100, 9096–9101.
Kuhl, P. K., Williams, K. A., Lacerda, F., Stevens, K. N., & Lindblom, B. (1992). Linguistic experience alters phonetic perception in infants by 6 months of age. Science, 255, 606–608.
Krumhansl, C. L. (2004). The cognition of tonality: As we know it today. Journal of New Music Research, 33, 253–268.
Krumhansl, C. L., & Keil, F. C. (1982). Acquisition of the hierarchy of tonal function in music. Memory & Cognition, 10, 243–251.
Laje, R., & Mindlin, G. B. (2003). Highly structured duets in the song of the South American Hornero. Physical Review Letters, 91, 1–4.
Large, E. W. (2000). Rhythm categorization in context. In C. Woods, G. B. Luck, R. Brochard, S. A. O’Neill, & J. A. Sloboda (Eds.), Proceedings of the 6th International Conference on Music Perception and Cognition. Keele, Staffordshire, UK: Department of Psychology.
Large, E. W. (2001). Periodicity, pattern formation, and metric structure. Journal of New Music Research, 30, 173–185.

Large, E. W., Fink, P., & Kelso, J. A. (2002). Tracking McMurray, B., & Aslin, R. D. (2005). Infants are
simple and complex sequences. Psychological sensitive to within-category variation in speech
Research, 66, 3–17. perception. Cognition, 95, B15–B26.
Large, E. W., & Jones, M. R. (1999). The dynam- Mehler, J., Jusczyk, P. W., Lambertz, G., Halsted,
ics of attending: How we track time varying N., Bertoncini, J., & Amiel-Tison, C. (1988).
events. Psychological Review, 106, 119–159. A precursor of language acquisition in young
Lerdahl, F., & Jackendoff, R. (1983). A generative infants. Cognition, 29, 143–178.
theory of tonal music. Cambridge, MA: MIT Meegan, D. V., Aslin, R. N., & Jacobs, R. A. (2000).
Press. Motor timing learned without motor training.
Lewkowicz, D. (1996). Perception of auditory- Nature Neuroscience, 3, 860–862.
visual temporal synchrony in human infants. Meyer, L. B. (1973). Explaining music. Chicago:
Journal of Experimental Psychology: Human University of Chicago Press.
Perception and Performance, 22, 1094–1106. Miller, G. F. (2000). Evolution of human music
Liberman, A. M., & Mattingly, I. G. (1985). The through sexual selection. In N. L. Wallin,
motor theory of speech perception revised. B. Merker, & S. Brown (Eds.), The origins of music
Cognition, 21, 1–36. (pp. 329–360). Cambridge, MA: MIT Press.
London, J. (1995). Some examples of complex Minagawa-Kawai, Y., Mori, K., Naoi, N., & Kojima,
meters and their implications for models of S. (2007). Neural attunement processes in
metric perception. Music Perception, 13, 59–77. infants during the acquisition of a language-
London, J. (2004). Hearing in time: Psychological specific phonemic contrast. The Journal of
aspects of musical meter. New York: Oxford Neuroscience, 10, 315–321.
University Press. Moon, C., Cooper, R. P., & Fifer, W. P. (1993). Two-
Maess, B., Koelsch, S., Gunter, T. C., & Friederici, day-olds prefer their native language. Infant
A. D. (2001). Musical syntax is processed Behavior and Development, 16, 495–500.
in Broca’s area: An MEG study. Nature Nakata, T. & Mitani, C. (2005). Influences of tem-
Neuroscience, 4, 540–545. poral fluctuation on infant attention. Music
Marcus, G. F., Fernandes, K. J., & Johnson, S. Perception, 22, 401–409.
P. (2007). Infant rule learning facilitated by Narain, C., Scott, S. K., Wise, R. J. S., Rosen, S.,
speech. Psychological Science, 18, 387–391. Leff, A., Iversen, S. D., et al. (2003). Defining
Masataka, N. (2006). Preference for consonance a left-lateralized response specific to intelligi-
over dissonance by hearing newborns of deaf ble speech using fMRI. Cerebral Cortex, 13,
parents and of hearing parents. Developmental 1362–1368.
Science, 9: 46–50. Nazzi, T., Bertoncini, J., & Mehler, J. (1998).
Maye, J., Werker, J. F., & Gerken, L. (2002). Infant Language discrimination by newborns: Toward
sensitivity to distributional information can an understanding of the role of rhythm. Journal
affect phonetic discrimination. Cognition, 82, of Experimental Psychology: Human Perception
B101–B111. and Performance, 24, 756–766.
McAuley, D., Jones, M. R., Holub, S., Johnstone, H. Palmer, C., & Krumhansl, C. L. (1990). Mental
M., & Miller, N. S. (2006). The time of our lives: representations for musical meter. Journal of
Life span development of timing and event Experimental Psychology: Human Perception
tracking. Journal of Experimental Psychology: and Performance, 16, 728–741.
General, 135, 348–367. Palmer, C., & Pfordresher, P. Q. (2003). Incremental
McCandliss, B. D., Fiez, J. A., Protopapas, A., planning in sequence production. Psychological
Conway, M., & McClelland, J. L. (2002). Success Review, 110, 683–712.
and failure in teaching the [r]-[l] contrast to Pascalis, O., de Haan, M., & Nelson, C. A. (2002).
Japanese adults: Tests of a Hebbian model of Is face-processing species-specific during the
plasticity and stabilization in spoken language first year of life? Science, 296, 1321–1323.
perception. Cognitive, Affective, and Behavioral Pascalis, O., Scott, L. S., Kelly, D. J., Shannon,
Neuroscience, 2, 89–108. R. W., Nicholson, E., Coleman, M., et al.
McMullen, E., & Saff ran, J. R. (2004). Music and (2005). Plasticity of face processing in infancy.
language: A developmental comparison. Music Proceedings of the National Academy of
Perception, 21, 1–23. Sciences, USA, 102, 5297–5300.

Patel, A. D., & Daniele, J. R. (2003). An empirical Povel, D. (1984). A theoretical framework for
comparison of rhythm in language and music. rhythm perception. Psychological Research, 45,
Cognition, 87, B35–B45. 315–337.
Patel, A. D., Foxton, J. M., & Griffiths, T. D. Powers, H. S., & Widdess, R. (2001). Theory and
(2005). Musically tone-deaf individuals have practice of classical music: Rhythm and tala.
difficulty discriminating intonation contours In S. Sadie & J. Tyrrell, (Eds.), The new Grove
extracted from speech. Brain and Cognition, dictionary of music and musicians (2nd ed., pp.
59, 310–333. 195–202). London: MacMillan.
Patel, A. D., Iversen, J. R., & Rosenberg, J. C. Ramus, F., Nespor, M., & Mehler, J. (1999).
(2006). Comparing the rhythm and melody of Correlates of linguistic rhythm in the speech
speech and music: The case of British English signal. Cognition, 73, 265–292.
and French. Journal of the Acoustical Society of Repp, B. H. (1992). Diversity and commonality
America, 119, 3034–3047. in music performance: An analysis of timing
Phillips-Silver, J. & Trainor, L. J. (2005). Feeling microstructure in Schumann’s “Träumerei”.
the beat: Movement influences infant rhythm Journal of the Acoustical Society of America,
perception. Science, 308, 1430. 92, 2546–3568.
Phillips-Silver, J., & Trainor, L. J. (2007). Hearing Repp, B. H., London, J., & Keller, P. E. (2005).
what the body feels: Auditory encoding Production and synchronization of uneven
of rhythmic movement. Cognition, 105, rhythms at fast tempi. Music Perception, 23,
533–546. 61–78.
Peper, C. E., Beek, P. J., van Wieringen, P. C. W. Rivera-Gaxiola, M., Csibra, G., Johnson, M.
(1995). Multifrequency coordination in H., & Karmiloff-Smith, A. (2000). Electro-
bimanual tapping: Asymmetrical coupling physiological correlates of cross-linguistic
and signs of supercriticality. Journal of speech perception in native English speakers.
Experimental Psychology: Human Perception Behavioral and Brain Research, 111, 11–23.
and Performance, 21, 1117–1138. Rivera-Gaxiola, M., Silva-Pereyra, J., & Kuhl, P.
Peretz, I. (Ed.). (2006). The nature of music [Special K. (2005). Brain potentials to native and non-
Issue]. Cognition, 100. native speech contrast in 7- and 11-month-old
Peretz, I., & Coltheart, M. (2003). Modularity American infants. Developmental Science, 8,
of music processing. Nature Neuroscience, 7, 162–172.
688–691. Sadakata, M., Desain, P., Honing, H., Patel, A. D., &
Peretz, I., & Hyde, K. L. (2003). What is specific Iversen, J. R. (2004). A cross-cultural study of the
to music processing? Insights from congeni- rhythm in English and Japanese popular music.
tal amusia. Trends in Cognitive Sciences, 7, Proceedings of the International Symposium on
362–367. Musical Acoustics (ISMA), 41–44. Nara.
Peretz, I., Kolinsky, R., Tramo, M., Labrecque, Saff ran, J. R., Aslin, R. N., & Newport, E. L. (1996).
R., Hublet, C., Demeurisse, G. et al. (1994). Statistical learning by 8-month-old infants.
Functional dissociations following bilat- Science, 274, 1926–1928.
eral lesions of auditory cortex. Brain, 117, Saff ran, J. R., Johnson, E. K., Aslin, R. N., &
1283–1301. Newport, E. L. (1999). Statistical learning of
Peretz, I., & Morais, J. (1989). Music and modular- tone sequences by human infants and adults.
ity. Contemporary Music Review, 4, 277–291. Cognition, 70, 27–52.
Pinker, S. (1997). How the mind works. New York: Saff ran, J. R., Pollak, S. D., Seibel, R. L., & Shkolnik,
Norton. A. (2006). Dog is a dog is a dog: Infant rule
Pinker, S. (2002). The blank slate: The modern learning is not specific to language. Cognition,
denial of human nature. New York: Penguin. 105, 669–680.
Platel, H., Price, C., Baron, J., Wise, R., Lambert, J., Sakai, K., Hikosaka, O., Miyauchi, S., Takino, R.,
Frackowiak, R. S. J., et al. (1997). The structural Tamada, T., Iwata, N., et al. (1999). Neural rep-
components of music perception: A functional resentation of a rhythm depends on its inter-
anatomical study. Brain, 120, 229–243. val ratio. The Journal of Neuroscience, 15,
Povel, D. (1981). Internal representation of sim- 10074–10081.
ple temporal patterns. Journal of Experimental Schellenberg, E. G. (2005). Children’s implicit
Psychology: Human Perception and Performance, knowledge of harmony in Western music.
7, 3–18. Developmental Science, 8, 551–566.

Schellenberg, E. G., & Trainor, L. J. (1996). Sensory of rhythm. Empirical Musicology Review, 2,
consonance and the perceptual similarity of 1–13.
complex-tone harmonic intervals: Tests of adult Trainor, L. J., & Heinmiller, B. M. (1998). The
and infant listeners. Journal of the Acoustical development of evaluative responses to music:
Society of America, 100, 3321–3328. Infants prefer to listen to consonance over dis-
Schellenberg, E. G. & Trehub, S. E. (1994). sonance. Infant Behavior and Development, 21,
Frequency ratios and the discrimination of pure 77–88.
tone sequences. Perception & Psychophysics, Trainor, L. J. & Trehub, S. E. (1992). A com-
56, 472–478. parison of infants’ and adults’ sensitivity
Schellenberg, E. G., & Trehub, S. E. (1996). Natural to Western musical structure. Journal of
musical intervals: Evidence from infant listen- Experimental Psychology: Human Perception
ers. Psychological Science, 7, 272–277. and Performance, 18, 394–402.
Schubotz, R. I., Friederici, A. D., & Yves von Trainor, L. J., Tsang, C. D., & Cheung, V. H. W.
Cramon, D. (2000). Time perception and motor (2002). Preference for sensory consonance in
timing: A common cortical and subcortical 2- and 4-month-old infants. Music Perception,
basis revealed by fMRI. NeuroImage, 11, 1–12. 20, 187–194.
Scott, L. S., Pascalis, O., & Nelson, C. A. (2007). A Tramo, M. J., Cariani, P. A., Koh, C. K., Makris,
domain-general theory of the development of N., & Braida, L. D. (2003). Neurobiology of har-
perceptual discrimination. Current Directions mony perception. In I. Peretz and Zatorre, R.J.
in Psychological Science, 16, 197–201. (Eds.), The cognitive neurosciences of music (pp.
Scott, L. S., Shannon, R. W., & Nelson, C. A. (2006). 121–151). Oxford: Oxford University Press.
Neural correlates of human and monkey face Treff ner, P. J., & Turvey, M. T. (1993). Resonance
processing by 9-month-old infants. Infancy, 10, constraints on rhythmic movement. Journal of
171–186. Experimental Psychology: Human Perception
Semjen, A., & Ivry, R. B. (2001). The couled oscil- and Performance, 19, 1221–1237.
lator model of between-hand coordination in Trehub, S. E. (1976). The discrimination of foreign
alternate-hand tapping: A reappraisal. Journal speech contrasts by infants and adults. Child
of Experimental Psychology: Human Perception Development, 47, 466–472.
and Performance, 27, 251–265. Trehub, S. E. (2003). The developmental origins of
Snyder, J. S., Hannon, E. E., Large, E. W., & musicality. Nature Neuroscience, 6, 669–673.
Christiansen, M. H. (2006). Synchronization Trehub, S. E., & Hannon, E. E. (2006). Infant
and continuation tapping to complex meters. music perception: Domain-general or domain-
Music Perception, 24, 135–146. specific mechanisms? Cognition, 100, 73–99.
Snyder, J. S., & Krumhansl, C. L. (2001). Tapping Trehub, S. E., & Hannon, E. E. (2009).
to ragtime: Cues to pulse-fi nding. Music Conventional rhythms enhance infants’ and
Perception, 18, 445–489. adults’ perception of musical patterns. Cortex,
Soley, G. & Hannon, E.E. Infants prefer music of 45, 110–118.
their own culture: A cross-cultural compari- Trehub, S. E., & Trainor, L. J. (1998). Singing to
son. Manuscript submitted for publication. infants: Lullabies and play songs. Advances in
Spiro, J. (Ed.). (2003). Music and the brain [Special Infancy Research, 12, 43–77.
Issue]. Nature Neuroscience, 6. Tsao, F.-M., Liu, H.-M., & Kuhl, P. K. (2004).
Summers, J. (2001). Practice and training in Speech perception in infancy predicts
bimanual coordination tasks: Strategies and language development in the second year of
constraints. Brain and Cognition, 48, 1–13. life: A longitudinal study. Child Development,
Thiessen, E. D., & Saff ran, J. R. (2007). Learning 75, 1067.
to learn: Infants’ acquisition of stress-based Vignolo, L. A. (2003). Music agnosia and auditory
strategies for word segmentation. Language agnosia. Annals of the New York Academy of
Learning and Development, 3, 73–100. Sciences, 999, 50–57.
Tillman, B. A., Bharucha, J. J., & Bigand, E. (2000). Vouloumanos, A., Kiehl, K. A., Werker, J. F., &
Implicit learning of tonality: A self-organized Liddle, P. F. (2001). Detection of sounds in the
approach. Psychological Review, 107, 885–913. auditory stream: Event-related fMRI evidence
Todd, N. P. M., Cousins, R., & Lee, C. S. (2007). for differential activation to speech and non-
The contribution of anthropometric factors speech. Journal of Cognitive Neuroscience, 13,
to individual differences in the perception 994–1005.

Vouloumanos, A., & Werker, J. F. (2004). Tuned Werker, J. F. & Lalonde, C. E. (1995). Cognitive
to the signal: The privileged status of speech influences on cross-language speech perception
for young infants. Developmental Science, 7, in infancy. Infant Behavior and Development,
270–276. 18, 459–475.
Wallin, N. L., Merker, B., & Brown, S. (2000). The Werker, J. F., & Tees, R. C. (1984). Cross-language
origins of music. Cambridge, MA: MIT Press. speech perception: Evidence for perceptual
Watanabe, S., Uozumi, M., & Tanaka, N. (2005). reorganization during the first year of life.
Discrimination of consonance and dissonance Infant Behavior and Development, 7, 49–63.
in Java sparrows. Behavioural Processes, 70, Zanone, P. G., & Kelso, J. A. S. (1992). Evolution
203–208. of behavioral attractors with learning:
Weikum, W. M., Vouloumanos, A., Navarra, Nonequilibrium phase transitions. Journal of
J., Soto-Faraco, S., Sebastian-Galles, N., & Experimental Psychology: Human Perception
Werker, J. F. (2007). Visual language discrim- and Performance, 18, 403–421.
ination in infancy. Science, 316, 1159. Zatorre, R. J., & Peretz, I. (Eds.). (2003). The neu-
Werker, J. F., & Lalonde, C. E. (1988). Cross- rosciences and music. New York: The New York
language speech perception: Initial capabili- Academy of Sciences.
ties and developmental change. Developmental Zentner, M. R., & Kagan, J. (1996). Perception of
Psychology, 24, 672–683. music by infants. Nature, 383, 29.

Learning Mechanisms
Integrating Top-down and Bottom-up Approaches
to Children’s Causal Inference

David M. Sobel

Several chapters in this volume are dedicated to describing how children learn conceptual structure from the data available to them (e.g., Kirkham, this volume; Rakison & Cicchino, this volume; Sloutsky, this volume). My plan for this chapter is to focus this discussion on a particular piece of conceptual knowledge: understanding the causal relations among events. Piaget, on whom the constructivist approach to cognitive development is based, recognized the importance of causality in children’s cognitive development (Piaget, 1929, 1930). However, he failed to attribute significant causal reasoning abilities to young children, with preoperational children often receiving the label “precausal” based on their verbal explanations of behaviors in the world. The first goal of this chapter is to highlight young children’s sophisticated causal reasoning abilities.

There are two approaches to causal learning that are critical to the present discussion in this volume. First, there is a long tradition of research in causal learning and inference that has focused on how causal knowledge is acquired from observing events: algorithms that construct a mental model of causal knowledge from patterns of correlational information. One might consider such a tradition more “bottom-up.” But there is a second tradition of research in cognitive development, dating back to Piaget, describing how children use their prior knowledge or contextual information in the environment to learn new causal information. One might consider this tradition more “top-down.” The second goal of this chapter is to consider how to integrate these two approaches for describing children’s causal reasoning abilities.

In particular, following a set of proposals developed by Josh Tenenbaum and Tom Griffiths (Griffiths & Tenenbaum, 2005, 2007; Sobel, Tenenbaum, & Gopnik, 2004; Tenenbaum & Griffiths, 2001, 2003), I will suggest a description of children’s causal inference. This approach offers a way of considering how causal principles are acquired from data to learn representations of causal structure. I will first present some background information, then some empirical work consistent with this description as well as what might be developing. Finally, I will consider some limitations of this mechanism, focusing on other information that might be available to the child to facilitate their causal inference and learning.

There have been numerous accounts of causal learning in which a representation of causal structure is built from observing data in the environment. On these accounts, children use little prior knowledge to learn about causal relations. Often, they only have the ability to translate associations among events into a causal representation. The simplest such account is


that children associate causes and effects in the same way that animals associate conditioned and unconditioned stimuli in classical conditioning (e.g., Mackintosh, 1975; Rescorla & Wagner, 1972).

But since associative models only output strength relations, they do not appear to make predictions about how learners use causal knowledge to generate interventions to elicit effects. It appears that even rats are capable of causal reasoning in a manner that reflects more than just an associative mechanism (Blaisdell, Sawa, Leising, & Waldmann, 2006). As a result, several independent research programs have suggested that to generalize an associative approach, causal learning occurs by transforming a measure of associative strength into a measure of causal strength. Measures of causal strength are then used to make inferences or generate interventions. Some of these models were based on the Rescorla-Wagner equation (see e.g., Cramer et al., 2002). Other accounts emerged as researchers discovered a set of learning paradigms that this model has trouble explaining (e.g., Kruschke & Blair, 2000; Van Hamme & Wasserman, 1994; Wasserman & Berglan, 1998). One advantage of these accounts is that they allow a way to describe how a learner might generate interventions on the world: actions (usually intentional) that change the value of an event exogenously (without affecting other variables in the model directly). To use a traditional example, some associative mechanisms were designed to describe classical conditioning paradigms, in which the learner passively observed the environment. These accounts of human learning also take operant paradigms into account, in which the learner also generates actions, which have varying degrees of efficacy, and must learn the strength of the existing causal relation (see e.g., Dickinson & Shanks, 1995, for a detailed discussion).

Still other endeavors have considered more complex relations among events beyond stimulus, response, and reinforcement (e.g., Allan, 1980; Cheng, 1997; Shanks, 1995). These models estimate the strength of a particular causal model using the probability that an effect occurs given a cause and some background information. The critical difference between these models and the ones mentioned above is that they estimate the strength of a fixed representation of causal structure, and do so accurately only given sufficiently large quantities of data (see Tenenbaum & Griffiths, 2001, for further discussion of this issue). Most of these models, however, are agnostic as to how that causal structure is fixed, with a potential exception being the Power PC model (Cheng, 1997; Novick & Cheng, 2004), which suggests ways of discerning cause from effect (see e.g., Cheng & Novick, 1990).

The models described above focus on deriving the strength of a set of known causal relations. Informally, here is the first way in which traditionally bottom-up accounts of causal learning can be integrated with prior knowledge: If the learner is determining the strength of a known cause and effect, then there must be some knowledge in addition to the data that identifies cause from effect. While there are several theories of causal inference that place such mechanistic information central to understanding causality (Ahn, Kalish, Medin, & Gelman, 1995; Shultz, 1982), such knowledge might also be entirely minimal, perhaps limited to only priority, contiguity, and contingency (e.g., Hume, 1978/1739; Michotte, 1962).

However, there are some contemporary accounts of causal learning that consider how causal structure is learned: How do children (and adults) recognize that an event is a cause or effect of another event (in addition to considering the strength of that causal relation)? Most of the psychological investigation on this approach has concentrated on adult causal learning (Griffiths & Tenenbaum, 2005; Lagnado & Sloman, 2004; Steyvers, Tenenbaum, Wagenmakers, & Blum, 2003; Tenenbaum & Griffiths, 2001; Waldmann & Hagmayer, 2005; see Lagnado, Waldmann, Hagmayer, & Sloman, 2007, for a review). There are also investigations that suggest that children construct an abstract representation of the causal structure among a set of variables (Gopnik, Sobel, Schulz, &

Glymour, 2001; Schulz & Gopnik, 2004; Sobel, Tenenbaum, & Gopnik, 2004; see Gopnik et al., 2004, for a review).

This description of causal learning and inference has been grounded in the literature on causal graphical models, which have been developed in computer science and statistics (Pearl, 2000; Spirtes, Glymour, & Scheines, 2001). Causal graphical models are representations of a joint probability distribution: the probability that each possible combination of events occurs. These representations embody conditional probability information among events. Events are represented as nodes, and causal relations are represented as edges between nodes.

Making inferences from this account relies on a set of assumptions. One assumption is that any edge represents a causal relation between two nodes, specifically in the form of a mechanism that can be either observed or unobserved (following Pearl, 2000). As such, any graph is consistent with a set of probabilistic models that specify the nature of the relation among the variables. A unique causal structure is formed by defining the probability distribution for each variable conditioned on its parents (called parameterizing a graph). Parameterizing a graph can be thought of as assigning weights to each edge that represent the strength of the corresponding causal relations. A graph’s parameterization can reflect the nature of the mechanism(s) by which causes produce effects.

Causal graphical models support reasoning about interventions: actions that change the value of variables in the graph (without directly influencing the other variables; see Pearl, 2000). Consider the simple graph X→Y. In this graph, the probability that event Y takes a particular value given that event X takes a particular value is the same when you observe that X has that value as when you act to make X have that value. Such interventions are represented by Pearl (2000) and others as the do(X) operator. Note that the opposite is not true in this graph: The probability that X has a particular value given that you observe Y has a particular value is not necessarily the same as the probability that X has that value given that you force Y to take on the same value. To use a classic philosophical example, if I make the rooster crow at 2 am, I should not expect the sun to rise; the causal relation between sunrises and roosters crowing runs in the opposite direction (see Woodward, 2003, for further discussion). Children clearly learn causal structure from observing (and generating) these interventions (see e.g., Schulz, Gopnik & Glymour, 2007).

A second assumption that underlies causal graphical models is the faithfulness assumption. Faithfulness specifies that data are indicative of the causal structure in the world. Suppose that three events are related in the following manner: X→Y←Z and that X has a generative relation with Y (i.e., the occurrence of X raises the probability that Y will occur) and that Z has a preventative relation with Y (i.e., the occurrence of Z lowers the probability that Y will occur). Faithfulness states that the causal relations among X, Y, and Z will never be such that X and Z exactly cancel each other’s effects on Y, so that the three events appear independent. I do not know of a psychological investigation dedicated to faithfulness; however, most psychologists investigating children’s causal learning assume this to be true.

A third assumption is the Markov assumption, which is a way of translating between causal relations and conditional probability information (Pearl, 2000). The Markov assumption states that the value of an event (i.e., a node in the graph) is independent of all other events except its children (i.e., its direct effects) conditional on its parents (i.e., its direct causes). For example, consider the causal model A→B→C. In this model, the values of events A and C are dependent. The Markov assumption states that these values become independent conditional on the value of event B. C has no children, and B is its only parent. If you want to predict the value of C and know the value of B, additional knowledge about the value of A does not help: the only influence that A has on C is through B.

In the next section, I will consider evidence that suggests children engage in causal reasoning in a manner consistent with the Markov assumption. Specifically, this evidence suggests that young children can recognize dependencies among events as well as when events are independent based on the presence of a third event. Such inference is tantamount to recognizing the difference between correlations due to causal relations and correlations due to spurious associations.
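
As an illustration only, the Markov assumption for the chain A→B→C can be checked numerically. The Python sketch below assumes binary events and uses made-up probabilities; the names P_A, P_B_GIVEN_A, and P_C_GIVEN_B, their values, and the helper functions are hypothetical and are not taken from any study discussed in this chapter. It builds the joint distribution from the graph's parameterization and compares predictions about C with and without knowledge of B.

```python
from itertools import product

# Hypothetical parameterization of the chain A -> B -> C (illustrative values only).
P_A = 0.3                        # P(A = 1)
P_B_GIVEN_A = {0: 0.1, 1: 0.8}   # P(B = 1 | A = a)
P_C_GIVEN_B = {0: 0.2, 1: 0.9}   # P(C = 1 | B = b)


def bernoulli(p_true, value):
    """Probability that a binary event takes `value` when P(event = 1) = p_true."""
    return p_true if value == 1 else 1.0 - p_true


def joint(a, b, c):
    """P(A=a, B=b, C=c), factored along the graph as P(A) P(B|A) P(C|B)."""
    return (bernoulli(P_A, a)
            * bernoulli(P_B_GIVEN_A[a], b)
            * bernoulli(P_C_GIVEN_B[b], c))


def prob_c_given(**evidence):
    """P(C = 1 | evidence), computed by summing over the joint distribution."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c in product((0, 1), repeat=3):
        world = {"a": a, "b": b, "c": c}
        # Skip combinations of events that contradict the evidence.
        if any(world[name] != value for name, value in evidence.items()):
            continue
        p = joint(a, b, c)
        denominator += p
        if c == 1:
            numerator += p
    return numerator / denominator


# Without knowledge of B, learning A changes the prediction about C.
print(prob_c_given(a=0), prob_c_given(a=1))
# Once B is known, learning A makes no further difference.
print(prob_c_given(a=0, b=1), prob_c_given(a=1, b=1))
```

Under these assumed values, the first pair of predictions differs while the second pair is identical, which is the pattern of dependence and conditional independence that the Markov assumption describes for a causal chain.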


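The contrast between observing and intervening in the simple graph X→Y can be illustrated with a small enumeration. The Python sketch below is a minimal example under stated assumptions: events are binary, the probabilities P_X and P_Y_GIVEN_X are invented for illustration, and the functions `joint` and `prob` are hypothetical helpers rather than part of any published model. An intervention is modeled by forcing a variable to a value and ignoring its usual causes.

```python
from itertools import product

# Hypothetical parameterization of the graph X -> Y (illustrative values only).
P_X = 0.2                        # P(X = 1), e.g., the sun is rising
P_Y_GIVEN_X = {0: 0.05, 1: 0.9}  # P(Y = 1 | X = x), e.g., the rooster crows


def joint(x, y, do=None):
    """P(X=x, Y=y); `do` maps a variable name ("x" or "y") to a forced value."""
    do = do or {}
    # A forced variable takes its assigned value regardless of its causes.
    if "x" in do:
        p_x = 1.0 if x == do["x"] else 0.0
    else:
        p_x = P_X if x == 1 else 1.0 - P_X
    if "y" in do:
        p_y = 1.0 if y == do["y"] else 0.0
    else:
        p_y = P_Y_GIVEN_X[x] if y == 1 else 1.0 - P_Y_GIVEN_X[x]
    return p_x * p_y


def prob(target, value, given=None, do=None):
    """P(target = value | given, do(...)) by enumerating all four worlds."""
    given = given or {}
    numerator = 0.0
    denominator = 0.0
    for x, y in product((0, 1), repeat=2):
        world = {"x": x, "y": y}
        if any(world[name] != v for name, v in given.items()):
            continue
        p = joint(x, y, do)
        denominator += p
        if world[target] == value:
            numerator += p
    return numerator / denominator


# Observing X = 1 and forcing X = 1 give the same prediction about Y.
print(prob("y", 1, given={"x": 1}), prob("y", 1, do={"x": 1}))
# Observing Y = 1 raises the probability of X; forcing Y = 1 leaves it at P_X.
print(prob("x", 1, given={"y": 1}), prob("x", 1, do={"y": 1}))
```

With these assumed numbers, hearing the rooster crow is evidence that the sun is rising, whereas making the rooster crow leaves the probability of sunrise at its baseline, mirroring the asymmetry described for the do(X) operator.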

In order to investigate whether children recognize the difference between dependence and conditional independence information, we need a method that presents a novel causal property to children wherein researchers can control the amount of prior knowledge they possess. Much of the research I will describe uses a blicket detector (shown in Figure 8.1), a machine that lights up and plays music (controlled by the experimenter) when certain objects are placed upon it. The blicket detector presents a novel, nonobvious causal property, which any object might possess.

Figure 8.1 A blicket detector (specifically the detector used in Gopnik and Sobel, 2000, and elsewhere). In this case, an object is placed on the detector, and it is enabled, so that the object is activating the detector. This particular detector lights up red and plays Für Elise.

Gopnik et al. (2001) trained 3- and 4-year-olds that objects that activated the detector were labeled “blickets.” Children quickly learned this relation. Then, children observed a set of trials in which objects either independently activated the machine, or did so only in the presence of another object. Specifically, on the one-cause trials, children were shown two objects. Children observed one object (A) activate the detector by itself. Then, they saw that the other object (B) did not activate the detector by itself. Finally, they saw objects A and B activate the detector twice together. Children were asked whether each object was a blicket. Three- and 4-year-olds labeled only object A as a blicket (although this was more likely for the older children), recognizing that object B only activated the detector in the presence of object A.

Performance on these trials was compared with performance on two-cause trials, in which the same children were shown two objects that activated the detector individually with the same frequency as the objects in the one-cause trials. Specifically, children saw two new objects (C and D). Object C was placed on the machine three times and activated it all three times. Object D was placed on the machine three times, and activated it two out of three times. Children categorized both objects as blickets. Both objects individually activated the detector; they just did so with different frequencies.

These data suggest that children recognize the difference between two events that are dependent because of a causal relation and two events that are dependent because of the presence of a third (causal) event.¹ This procedure generalizes beyond reasoning about physical events: Schulz and Gopnik (2004) demonstrated that 3- and 4-year-olds make similar inferences across a variety of domains. Using some slight manipulations to the procedure, Gopnik et al. (2001) demonstrated that 30-month-olds also made these inferences.

¹ This kind of inference has often been called “explaining away.” It is consistent with the Markov assumption, but is a different inference from the example provided in the previous section (in which an individual reasons about a causal chain). There is evidence that under some conditions, children can learn causal chains from these patterns of data (Sobel & Sommerville, 2009), but most of the evidence suggesting children reason according to the Markov assumption asks them to make this “explaining away” inference.

The trouble with simply concluding that children reason according to the Markov assumption is that the data presented above are analogous to blocking, a phenomenon from the animal conditioning literature (Kamin, 1969). In a blocking

procedure, a learner is shown an association between a conditioned and unconditioned stimulus (e.g., that a tone predicts the occurrence of food). This association is trained until asymptote, and then the learner is shown a novel stimulus which, presented in compound with the established conditioned stimulus, will predict the same unconditioned stimulus (e.g., that the same tone paired with light will predict food). In most cases, learners do not learn that the second stimulus is predictive. Various models of associative reasoning (e.g., Rescorla & Wagner, 1972) were designed to explain this phenomenon. In the blicket detector paradigm described above, one might consider object A to be analogous to the first stimulus, object B the second stimulus, and the detector’s activation the unconditioned stimulus. Children’s performance, thus, is analogous to that of animal learners.
It is necessary to consider alternate procedures that models of associative reasoning have difficulty explaining. One such example involves considering how children reason retrospectively about ambiguous events (following Shanks, 1985, 1995). Sobel et al. (2004) introduced 3- and 4-year-olds to the blicket machine in the same manner as Gopnik et al. (2001), and then presented them with two types of trials. In their one-cause trials, children saw that two objects (A and B) activated the machine together, and then that object A did not activate the machine by itself. In their backwards blocking trials, children saw two new objects (C and D) activate the machine together, and then that one of those objects (C) did activate the machine by itself (note that this procedure is analogous to the reverse of Kamin’s blocking procedure described above, hence its name). Children were asked whether each of these objects was a blicket.
The blicket status of objects A and C is unambiguous given these data, but one’s intuitions about objects B and D should differ. Object B should be a blicket in the one-cause trial; this is consistent with the Markov assumption: object A only activates the machine dependent on the presence of object B, so object B must be the causal factor. Object D’s status is uncertain; the data (under a few assumptions, which I will make clear in subsequent sections) are equally consistent with it being a blicket and not being one. However, these intuitions differ from the associative relations that objects B and D have with the machine’s activation. In both cases, children observe the object activate the machine in conjunction with another object. That other object (A or C) then activates or fails to activate the machine, but this piece of information should not change the associative relation between objects B and D and the machine’s activation. If children were responding on the basis of these associative calculations, they should treat these objects the same. Three- and 4-year-olds responded in a manner consistent with the intuitions, not the associative relations: object B was almost always judged to be a blicket, while object D was judged to be so approximately 35% of the time. Preschoolers reasoned in a manner consistent with the Markov assumption, and less consistent with at least some models of causal reasoning based on calculations of associative strength (e.g., Rescorla & Wagner, 1972).

An open question raised by the previous section is whether younger children would reason in a similar manner. A variety of researchers have suggested that children’s causal reasoning abilities develop during the preschool years (e.g., Bullock, Gelman, & Baillargeon, 1982; Das Gupta & Bryant, 1989; Goswami & Brown, 1990; Gottfried & Gelman, 2005). Further, children’s ability to relate causal inferences to perceptions of time develops after the preschool years (e.g., McCormack & Hoerl, 2005).
But these findings are mostly concerned with how children understand causal mechanisms, that is, how events are related to each other in a causal manner. In subsequent sections, I will outline a description of a domain-general causal learning mechanism, and then demonstrate how domain-specific knowledge influences children’s use of this mechanism. But before we consider those ideas, the issue of whether younger children’s causal reasoning is consistent with the Markov assumption is still open.
Sobel and Kirkham (2006) considered this question by investigating 19- and 24-month-olds’

inferences using a similar procedure to the one-cause and backwards blocking trials described above. In their one-cause condition, they placed two objects (A and B) on the machine together, which activated, and then showed the children that object A failed to activate the machine by itself. The objects and detector were slid over to the child, who was asked to “make it go.” In their backwards blocking condition, they placed three objects on the table (C, D, and E). Objects C and D made the machine go together, and then object C made the machine go by itself. Object C was removed, and the child was given objects D and E with the detector to “make it go.” Twenty-four-month-olds used object B to activate the machine in the one-cause trial, more often than they used object D to do so in the backwards blocking trial. However, the younger children responded at chance levels, and did not discriminate between objects B and D.
A difficulty with asking children this young to make manual responses consistent with their causal knowledge (i.e., put objects on the detector to make it go), is that there are cases in which 18-month-olds fail to engage in simple imitative “means–ends” behaviors (e.g., Uzgiris & Hunt, 1975; see also Gopnik & Meltzoff, 1992). In the one-cause trial, children had to inhibit an event they observed activate the machine (placing both objects on it) in favor of a novel intervention (placing only object B on it). The demand characteristics of this experiment might have overwhelmed the toddlers, and prevented them from producing the appropriate responses.
There is reason to believe that 18-month-olds, and even younger infants, have the ability to detect conditional probabilities among events. Saffran, Aslin, and colleagues found that 8-month-old infants could parse a stream of auditory stimuli based solely on the transitional probabilities within and between syllables (i.e., the likelihood that one syllable would predict the next syllable; Aslin, Saffran, & Newport, 1998; Saffran, Aslin, & Newport, 1996). Infants’ statistical learning abilities extend beyond learning word boundaries. Infants are capable of recognizing and discriminating between complex grammars relating words together (e.g., Gomez & Gerken, 1999). They also parse auditory streams based on statistical probabilities even when the stimuli are tones (Saffran, Johnson, Aslin, & Newport, 1999). This suggests that the ability to perceive statistical structure is perhaps not language-specific.
Further evidence for this position comes from work on infants’ parsing of visuospatial sequences of events. Using a procedure analogous to Saffran et al. (1996), Kirkham, Slemmer, and Johnson (2002) demonstrated that infants as young as 2 months old register statistical relations among sequences of visual events (see Kirkham, this volume, for a detailed description of this literature). Similarly, Fiser and Aslin (2002) demonstrated that 9-month-olds can recognize conditional probability relations between the spatial positions of visual events. Both of these projects have their origins with work by Haith and colleagues (Haith, 1993; Wentworth, Haith, & Hood, 2002), who demonstrated that young infants (3–4-month-olds) learn simple, two-location spatial sequences of events. Haith and colleagues used a visual expectation paradigm, in which infants’ eye gaze to a particular spatial location represented where the infants thought an event would appear.
Sobel and Kirkham (2006) modified this technique to investigate whether infants reasoned about sequences of events in a manner similar to preschoolers’ causal inference. Eight-month-olds were shown a video screen similar to what is seen in Figure 8.2. Four frames were always

Figure 8.2 A screenshot shown to infants in Sobel and Kirkham (2006, 2007). In this shot, events A and B are presented together.
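
The transitional-probability idea behind the statistical learning studies discussed on this page can be illustrated with a short Python sketch. It estimates, for each pair of adjacent syllables in a stream, how often the second follows the first. The syllables, the three-"word" vocabulary, the fixed word order, and the function name are all hypothetical choices made for illustration; they are not the stimuli or analyses of the cited studies.

```python
from collections import Counter

def transitional_probabilities(stream):
    """Estimate P(next | current) for every adjacent pair in the stream."""
    pair_counts = Counter(zip(stream, stream[1:]))
    # Count each element only where it has a successor (all but the last).
    current_counts = Counter(stream[:-1])
    return {pair: n / current_counts[pair[0]] for pair, n in pair_counts.items()}

# Hypothetical three-syllable "words" (illustrative only).
words = [("pa", "bi", "ku"), ("ti", "bu", "do"), ("go", "la", "tu")]
# A fixed word order keeps the example reproducible.
order = [0, 1, 2, 0, 2, 1, 0, 1, 2, 1, 0, 2]
stream = [syllable for i in order for syllable in words[i]]

tp = transitional_probabilities(stream)
within_word = tp[("pa", "bi")]    # transition inside a word
across_words = tp[("ku", "ti")]   # transition spanning a word boundary
print(within_word, across_words)
```

In this toy stream the within-word transition is perfectly predictable while the transition across a word boundary is not, which is the kind of contrast a learner could in principle use to segment the stream.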

present on the screen. I will refer to the top center frame as A and the bottom center frame as B, the right frame as C and the left frame as D, but this was counterbalanced in the experiment. Infants observed sequences of events appear in their respective frames. During familiarization, infants saw a sequence of three types of events (the first event was randomly chosen from the three). One was that the A and B events could appear in their respective frames together. These were two grayscale events that silently rotated in space for 8 s. The AB compound predicted the occurrence of the C event (which occurred in the C frame) with 100% certainty. The C event was a more interesting color event, which moved around in the frame and was accompanied by a piece of cartoonish music. It also lasted for 8 s. The C event, however, did not predict any event. Half of the time it was followed by the AB compound (in which case, the subsequent event would again be C), and half of the time it was followed by the D event, which was the same color event occurring in the D frame for 8 s accompanied by the same music. The D event also did not predict anything. Half of the time it was followed by the AB compound, and half of the time it was followed by C.
Infants observed this sequence of events until they saw four occurrences of the AB→C pairing (usually 11 events total). Immediately after the last AB→C pairing, infants who had been assigned to the indirect screening-off condition observed only the A event appear on the screen by itself, followed by the D event. This sequence was shown twice. Infants in the backwards blocking condition observed only the A event followed by the C event (twice). Immediately after this, infants were shown the B event, and then the screen went blank. The music that had accompanied the C and D events began to play, and at this point infants’ eye gaze was measured for 8 s.
One might think of this sequence of events in a manner similar to objects being placed on a blicket detector. Events A and B correspond to the two objects being placed on the detector, and events C and D correspond to the detector activating or not, respectively. In the indirect screening-off condition (analogous to the one-cause condition in the previous experiments described above), events A and B together predict C, and then A alone does not. Events A and C are dependent in the presence of B, but independent, conditioned on the absence of B; this makes B predictive of C. The expectation is that infants will look more to the C frame, expecting an event to appear there. In the backwards blocking condition, the associative relation between events B and C is the same as in the indirect screening-off condition, but since A and C are not independent, conditioned on the absence of B, B is not necessarily predictive of C. We would expect to find an interaction between the amount of time spent looking at each frame and the condition the infant was assigned to. This was exactly what was found (see Figure 8.3). Eight-month-olds spent more time looking

[Figure 8.3: bar chart of mean looking times to the C and D frames. Y-axis: Time (ms). X-axis: Condition (backwards blocking, indirect screening-off). Panels: 5-month-olds (Sobel & Kirkham, 2007) and 8-month-olds (Sobel & Kirkham, 2006).]

Figure 8.3 Amount of time spent looking to the C and D frames in the indirect screening-off and backwards blocking conditions by 8-month-olds (Sobel & Kirkham, 2006) and 5-month-olds (Sobel & Kirkham, 2007).
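
The claim that the associative relation between events B and C is the same in both conditions can be checked with a small simulation. The Python sketch below applies the standard Rescorla-Wagner update, in which only cues present on a trial change strength, to a simplified version of the two sequences: four AB→C pairings followed by two A-alone trials that either do or do not lead to C. The learning rate, the coding of the outcome as 1.0 or 0.0, and all names are assumptions made for illustration; this is not the analysis reported in the cited studies.

```python
def rescorla_wagner(trials, cues=("A", "B"), rate=0.3):
    """Return associative strengths after a sequence of (present_cues, outcome) trials."""
    strength = {cue: 0.0 for cue in cues}
    for present, outcome in trials:
        # Prediction error: observed outcome minus summed strength of present cues.
        error = outcome - sum(strength[cue] for cue in present)
        for cue in present:  # absent cues are left unchanged
            strength[cue] += rate * error
    return strength

compound = [(("A", "B"), 1.0)] * 4           # AB followed by C, four times
screening_off = compound + [(("A",), 0.0)] * 2   # then A alone, C does not follow
backwards_blocking = compound + [(("A",), 1.0)] * 2  # then A alone, C follows

v_screen = rescorla_wagner(screening_off)
v_block = rescorla_wagner(backwards_blocking)
print(v_screen["B"], v_block["B"])  # identical strength for B in both conditions
print(v_screen["A"], v_block["A"])  # A's strength differs between conditions
```

Because B is absent on the A-alone trials, its strength is untouched by them, so this simple associative model predicts no difference in how B is treated across the two conditions.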

in the C frame in the indirect screening-off condition than the D frame, and more time looking in the D frame than the C frame in the backwards blocking condition. Moreover, they spent more time looking in the C frame in the indirect screening-off condition than backwards blocking condition.
These data suggest that infants’ statistical learning abilities appear to be consistent with the Markov assumption. However, this inferential ability may not be available to younger infants. In a follow-up, Sobel and Kirkham (2007) found that 5-month-olds’ pattern of looking time in response to the same procedure was quite different (see Figure 8.3). Five-month-olds looked longer at the C frame in the backwards blocking condition, and equally long at the two frames in the indirect screening-off condition, inconsistent with the Markov assumption. One interpretation of these data is that infants might be developing a mechanism for causal and statistical reasoning that moves from recognizing associations among events to one that incorporates the Markov assumption. However, the 5-month-olds’ responses were inconsistent with an associative mechanism as well. An alternative interpretation is that when events A and B occur together, younger infants might simply treat them as the same event. If infants are treating event A alone, event B alone, and the AB compound as the same event, then their pattern of performance is consistent with both associative reasoning mechanisms and reasoning mechanisms consistent with the Markov assumption. More research is necessary to discriminate between these possibilities.
Second, the research on infancy presented so far has focused on infants’ statistical reasoning, and not necessarily their understanding of cause and effect. These data do not demonstrate that 8-month-olds register that event B causes event C in the indirect screening-off condition. Rather, they suggest that infants’ statistical reasoning is consistent with the Markov assumption, and may form the building block for a representation of causal knowledge. An open question is to consider how to convert this procedure to one in which infants’ causal reasoning can be measured.
Natasha Kirkham and I have begun several investigations focused on this question. One involves infants watching videos of objects placed on a blicket detector, consistent with the data shown in the one-cause and backwards blocking conditions described above. Using a violation-of-expectation procedure, we should be able to discern infants’ expectations about the causal efficacy of individual objects. Further, using the anticipatory eye-gaze paradigm, we are attempting to train children that looks to particular locations of a screen actually cause events to occur. Using eye gaze to allow infants to generate interventions might allow them to respond in a causal manner to sequences similar to the one used in our previous work. These investigations are currently underway.

An objection that one might have to the lines of research described above is that the difference between responses in the indirect screening-off or one-cause condition and responses in the backwards blocking condition is problematic for certain models of associative reasoning (e.g., Rescorla & Wagner, 1972), but not others. Several contemporary accounts of associative reasoning were designed with the backwards blocking paradigm in mind (e.g., Kruschke & Blair, 2000; Van Hamme & Wasserman, 1994; Wasserman & Berglan, 1998). For instance, Wasserman and Berglan (1998) use a derivative of the Rescorla-Wagner equation, in which the strength of a relation changes positively when a potential cause and effect occur and negatively when the effect occurs without a potential cause. Similarly, models of causal reasoning that rely on the estimation of causal parameters based on the frequency with which events co-occur also explain the backwards blocking data (e.g., Cheng, 1997; Shanks, 1995). These models categorize events as causes or effects and then calculate the probability that an effect occurs given a cause and some background information. For example, Cheng’s (1997) Power PC model makes a clear prediction about the causal efficacy of the objects in the one-cause conditions,

but generates an undefined value in the backwards blocking case, which can be interpreted as consistent with the present findings.
Is there a method of distinguishing among all of these competing options as explanations of children’s causal reasoning? One difficulty with considering the majority of these algorithms is that they rely on multiple pieces of data (i.e., large sample sizes) in order to make rational inferences. What we have observed in children’s causal reasoning is that they appear capable of making such inferences based on small amounts of data. Following researchers in adult cognition and cognitive science (e.g., Griffiths & Tenenbaum, 2005; Steyvers et al., 2003; Tenenbaum & Griffiths, 2001, 2003), Sobel et al. (2004, see also Griffiths, Sobel, Tenenbaum, & Gopnik, submitted) proposed that children’s causal learning and inference was better described by a model that relies on Bayesian inference.
On this view, causal reasoning can best be described by inference over a set of hypotheses (H). Hypotheses take the form of a causal graphical model with a particular parameterization. Each hypothesis (h1, h2, . . . hn) is assigned a prior probability, p(h), before observing any data. These priors reflect the learner’s causal knowledge about possible causal structures as well as any other information the learner gleans from the environment before observing the data. Given the data, d (values for the variables in the hypotheses), the learner computes the posterior probability that each hypothesis is the actual causal structure of the system, p(h | d). This is done using Bayes’ rule:

\[ p(h \mid d) = \frac{p(d \mid h)\, p(h)}{\sum_{h'} p(d \mid h')\, p(h')} \]

The prior p(h) is the probability that each hypothesis is the hypothesis that actually generated the data. The value p(d | h) is the likelihood of the observed data being generated if that particular hypothesis was the actual causal structure in the world. For example, if A→B with a deterministic parameterization (i.e., A always causes B) is one of the hypotheses, and the data consists of trials of A occurring in the absence of B, then the p(d | h) = 0 for this particular hypothesis. This hypothesis requires B to occur whenever A occurs, and that is not the case.
To see this computational description in action, consider the backwards blocking sequences in which two objects activate the blicket detector together and then one of those two objects activates the detector by itself. There are four hypotheses potentially consistent with these data:

h1: that neither object is a blicket
h2: that only the first is a blicket
h3: that only the second is a blicket
h4: that both are blickets

The data are equally inconsistent with hypotheses h1 and h3 (i.e., p(d | h) = 0), since the first object has to be a blicket (it activates the machine by itself, but more on this in the subsequent sections). The data, however, are equally consistent with the other two hypotheses (h2 and h4), and as such the p(d | h) = 1 for both. But this description allows for another piece of information to influence causal inference, namely the prior probabilities, and these priors might affect children’s inferences.
A rational way in which these priors might be assigned is through observing the base rate of objects with causal efficacy, that is, the frequency of blickets in the world. If there are few blickets in the world, then the prior probability of hypothesis h2 should be higher than that of h4, since h2 posits fewer blickets. Similarly, if blickets are relatively common, then the reverse should be true. Using this logic, Sobel et al. (2004) presented 3- and 4-year-olds with a version of the backwards blocking procedure in which they initially manipulated the base rate of blickets. Children were shown the blicket detector, and taught that blickets make the machine go. The experimenter then brought out a box of identical blocks. In one condition (the rare condition), 2 out of the first 12 blocks shown to the child activated the detector, and were categorized as blickets. In the other condition (the common condition), 10 out of the first 12 blocks activated the detector, and were blickets. Then the experimenter brought out two more blocks (A and B), and proceeded with the backwards blocking

demonstration: these blocks together activated the detector, and then object A activated the detector by itself.
The causal status of object A is unambiguous (it is a blicket), and all of the children categorized it as such. The causal status of object B is ambiguous given the data, but if children relied on the observed base rates, they should treat this object differently between the rare and common conditions. In the common condition, both 3- and 4-year-olds claimed that object B was a blicket, consistent with children recognizing prior probabilities when evaluating ambiguous data. In the rare condition, the 4-year-olds claimed that object B was not a blicket, again consistent with recognizing priors, but the 3-year-olds did not. They judged that the B object was a blicket regardless of the base rate of blickets in the environment.
There are two conclusions from these data. The first is that 4-year-olds’ inferences were consistent with the Bayesian description insofar as they could recognize priors from the environment and use that information to make rational inferences about ambiguous data. The second is that there was a developmental difference between 3- and 4-year-olds’ inferences. It is possible that a system for causal inference develops between these ages. However, there is another possibility, which involves considering what information is necessary for the child to possess in order to formulate a hypothesis space accurately.

In the previous section, I asserted that there were four hypotheses consistent with the backwards blocking data. What knowledge was necessary to form this hypothesis space? Do children possess this knowledge?
Some spatiotemporal knowledge appears necessary. First, placing an object on the blicket detector activates it; the detector’s activation should not cause the experimenter to place an object on it. Second, an object’s location in space should be independent of another object’s location in space. Given research on infants’ causal perception (e.g., Leslie & Keeble, 1987; Oakes & Cohen, 1990), and preschoolers’ causal knowledge (e.g., Bullock et al., 1982; Sophian & Huber, 1984), it seems reasonable to assume that young children reason according to these two principles. Such knowledge limits the hypothesis space to the four models described above.
But children also need to understand that there is a particular parameterization between objects and the detector activating, which Tenenbaum and Griffiths (2003) and Sobel et al. (2004) called the activation law. Do children recognize that there is something about a blicket that makes the machine go? The activation law specifies that children recognize that there is some mechanism that relates blickets to the detector’s activation in a deterministic (or near-deterministic) manner. This information allows the learner to recognize that the data in the backwards blocking procedure are ambiguous. Without this information (i.e., if children believed that blickets only sometimes made the machine go), the data are more consistent with object B having the capacity to activate the detector than not. To illustrate this, suppose that blickets only activated the detector 80% of the time. Even though object A clearly is a blicket (by virtue of activating the machine alone), it might have failed to be responsible for activating the machine when it was placed together with object B on the machine; there would be a nontrivial chance that the detector’s activation was uniquely caused by object B having the efficacy to activate the machine.
While there is some good evidence that suggests 4-year-olds treat causal relations, including causal relations involving the blicket detector, as deterministic (Bullock et al., 1982; Schulz & Sommerville, 2006), it is not clear whether younger children do so as well. Further, even this work does not suggest that children recognize that deterministic data are related to particular causal mechanisms. As such, in a series of investigations, my colleagues and I considered how 3- and 4-year-olds reasoned about the relation between the causal properties of artifacts and nonobvious, internal properties. Our question was whether 3- and 4-year-olds recognized that insides of objects could act as mechanisms

for those objects’ causal properties, and whether children might understand such mechanisms differently across domains of knowledge.
In one set of experiments (Sobel, Yoachim, Gopnik, Meltzoff, & Blumenthal, 2007), we presented children with the blicket detector (although we simply labeled it as a machine, so that children were not influenced by object label information) and a set of objects such as those shown in Figure 8.4. Two objects were externally identical, and another was unique in appearance. All three objects had holes drilled into them, covered by dowels, which could reveal whether each contained an internal part. Four-year-olds were shown the insides of each toy: one of the identical objects and the unique object contained an internal part (a white map pin), while the third member of the set was empty inside. Children were then shown that the member of the pair with the internal part activated the detector, and they were asked to show the experimenter another object that would activate the machine. The majority of children chose the other object with the internal part (66% of the time), significantly more often than chance. Four-year-olds also claimed that objects that shared internal parts were more likely to share causal properties (i.e., activate the detector) than objects that shared external parts (e.g., had stickers on them).
This inference also worked in the other direction. In another experiment 3- and 4-year-olds were shown the same sets of objects and the blicket detector (again, without it being labeled as such), and were shown the causal efficacy of the three objects. One member of the pair and the unique object activated the detector, while the other member of the pair did not. The

Figure 8.4 Stimulus set used to measure whether children appreciated the relation between objects’
causal properties and insides (Sobel et al., 2007).
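
The Bayesian account of backwards blocking and the activation law discussed earlier in this chapter can be made concrete with a short Python sketch. It enumerates the four hypotheses about objects A and B, weights each by a prior derived from the base rate of blickets, and scores each against the data (A and B activate the detector together, then A activates it alone). Two modelling choices here are assumptions of this sketch and are not taken from the cited papers: each object is treated as independently a blicket with probability equal to the base rate, and a non-deterministic detector is modelled as a noisy-OR in which each blicket activates it with probability `strength`.

```python
from itertools import product

def posterior_b_is_blicket(base_rate, strength=1.0):
    """P(B is a blicket | A and B activate together, then A activates alone)."""
    def p_activate(n_blickets):
        # Noisy-OR (assumed): each blicket independently activates the
        # detector with probability `strength`; strength=1.0 is deterministic.
        return 1.0 - (1.0 - strength) ** n_blickets

    joint = {}
    # Hypotheses coded as (A is a blicket, B is a blicket): h1=(0,0) ... h4=(1,1).
    for a, b in product([0, 1], repeat=2):
        prior = (base_rate if a else 1 - base_rate) * (base_rate if b else 1 - base_rate)
        likelihood = p_activate(a + b) * p_activate(a)
        joint[(a, b)] = prior * likelihood
    evidence = sum(joint.values())  # denominator of Bayes' rule
    return (joint[(0, 1)] + joint[(1, 1)]) / evidence

rare = posterior_b_is_blicket(2 / 12)     # base rate from the rare condition
common = posterior_b_is_blicket(10 / 12)  # base rate from the common condition
noisy = posterior_b_is_blicket(0.5, strength=0.8)  # the 80% illustration
print(rare, common, noisy)
```

Under these assumptions, the deterministic case leaves the posterior for object B equal to the base rate, so the rare and common conditions pull in opposite directions. With an 80% activation strength and an even prior, the same data favor B being a blicket, in line with the point that the ambiguity of the backwards blocking data depends on a (near-)deterministic activation law.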

member of the pair that activated the detector was opened to reveal that it contained an internal part, and children were asked which other object also contained such an inside. A striking developmental difference was found: 3-year-olds chose the other object that activated the detector 31% of the time, significantly lower than what would be expected by chance. Four-year-olds chose this object with significantly greater frequency (72%), and more often than chance responding. Importantly, Sobel et al. (2007) also ran a condition in which the association between the detector’s activation and each object was held constant, but the object was not causally related to the machine. Each object was held over the detector, and the experimenter pressed a button on the detector for the objects that would have activated the machine. Here, both 3- and 4-year-olds made causal responses less than 30% of the time, significantly lower than what would be expected by chance.
What these data suggest is that 4-year-olds, but not 3-year-olds, recognize that there is a relation between an object’s causal and internal properties. However, these data do not demonstrate that 4-year-olds understand an activation law, that is, that there is something about the internal part that is responsible for activating the detector. Four-year-olds do integrate some amount of correlational information together when making inferences about causal mechanisms: they only respond on the basis of the machine’s activation when the spatiotemporal connection between the object and machine warrants a causal relation. A stronger argument would be to demonstrate that 4-year-olds, but not younger children, interpret an object’s internal parts as being necessary and sufficient for the detector’s activation.
To test this, Emily Blumenthal and I introduced 3- and 4-year-olds to the blicket detector and provided children with (what we thought was) the strongest possible information about its efficacy. We told children that the machine was a “blicket machine” and “things with blickets inside made the machine go.” We then showed children that a set of objects with internal parts (labeled blickets) all activated the machine, and that a set of objects without internal parts did not. We also showed children that the detector activated if at least one object with a blicket inside was on it. After receiving this training, children were shown two objects (A and B), which activated the machine together. The door on object A was opened to reveal it was empty. Children had no trouble inferring whether each object contained an internal part. The critical question was an intervention question: children were asked to make the machine go. The child had observed the experimenter generate an intervention that activates the machine: placing both objects on it. But this imitative response is not the most efficient way of activating the machine. If children recognize that the internal part is responsible for the object’s causal property, then they should recognize there is no need to put object A on the detector, and when asked to generate an intervention that activates the machine, place only object B on the machine. This was the response generated by the majority of 4-year-olds, significantly more often than the younger children. The younger children were more likely to imitate, and place A and B on the machine together (Sobel & Blumenthal, in preparation).
These data suggest that 4-year-olds tie together the correlational information they observe between each object and the detector’s activation, and mechanism information about what is necessary for each object to activate the detector: namely, a nonobvious property. Three-year-olds have a harder time integrating this information. This development might relate to children’s developing use of a Bayesian mechanism of causal inference. Correlational information is reflected in how the data generate posterior probabilities of each hypothesis being correct. Mechanism information is reflected in how those hypotheses are formed: what causal structures come under consideration and how those causal structures are parameterized.
Specifically, what we would like to show is that children who recognize the activation law, by virtue of connecting objects’ causal and internal parts together, are more likely to engage in inferences consistent with recognizing the prior probability information they observe. The 3-year-olds who failed to discriminate between

the rare and common conditions in Sobel et al.’s (2004) procedure might have lacked the understanding that the way an object can produce its causal properties can be related to its insides. Lacking this knowledge might indicate that they lacked an activation law relating objects with the detector, which would make their failure to respond like the older children rational.

In order to test this hypothesis, we need to consider how we might facilitate 3-year-olds’ understanding of an activation law. One possibility is to consider how children reason about such causal relations in another domain of knowledge; all of the experiments mentioned so far have been exclusive to the domain of blicket detectors and aspects of physical causality. Can similar manipulations be performed in another domain? There is some reason to suspect that some causal inference abilities, such as inferences consistent with the Markov assumption, are domain-general (e.g., Schulz & Gopnik, 2004). However, these inferences involved general logical principles, not specific pieces of causal mechanism information.

Research in theory of mind tells us that young children understand a particular mental state, the results of an agent’s desires, at very young ages. Eighteen-month-olds recognize that others can have desires different from their own (Repacholi & Gopnik, 1997). Two- and 3-year-olds have a good understanding of the outcomes of fulfilled and unfulfilled desires (Wellman & Woolley, 1990). Three-year-olds can also keep track of their own and others’ desires over time and changes in the environment (e.g., Gopnik & Slaughter, 1991). Fawcett and Markson (2005) asked 2-year-olds to make inferences about their own preferences based on another’s desires. They showed children that one person consistently played with toys that matched the child’s preferences, and that another person consistently played with toys that did not match the child’s preferences. They then presented two novel (equally preferential) toys, and each person played with one. When those two toys were given to the children, they preferred to play with the toy associated with the first person. This suggests that children attribute a nonobvious property to the toy based on another’s desires, since that property must be responsible for those desires (see also Perner, 1991; Yuill, 1984).

Sobel and Munro (2009) manipulated the blicket detector to attempt to introduce it to 3-year-olds as a psychological agent. They placed a set of cardboard eyes on the machine (shown in Figure 8.5) and introduced it to children as “Mr. Blicket.” The experimenter conducted a dialogue with the machine, which activated spontaneously in response to questions and comments (this procedure was modeled after Johnson, Slaughter, and Carey (1998) and Johnson, Booth, and O’Hearn (2001), who used a similar procedure to study agent gaze-following in infants). The children were then told that they were going to play a game in which Mr. Blicket would tell them whether he liked an object. Sobel and Munro then repeated the procedure used by Sobel et al. (2007) to study whether 3-year-olds linked the internal parts of objects with their causal property (in this case, whether Mr. Blicket liked the object). Three-year-olds did link the causal property with the object’s insides in this condition, significantly more often than in another condition, in which the same procedure was performed with a machine that spontaneously activated during the warm-up of the procedure, with the same temporal contiguity as Mr. Blicket’s activation (70% vs. 41% of the time).

Figure 8.5 Mr. Blicket.

These data suggest that 3-year-olds might integrate the correlational data they observe about an agent’s desires toward objects with mechanism information: that there must be something about those particular objects responsible for Mr. Blicket’s desires. If this is the case, then 3-year-olds might have an activation law about Mr. Blicket’s desire, and reason more consistently with the Bayesian description than they do in the physical domain. To test this, we (Sobel & Munro, 2009) introduced 3-year-olds to Mr. Blicket in the same manner as described above, and then gave them the same rare or common training as in Sobel et al. (2004), followed by the same backwards blocking trial. Two new objects (A and B) were placed on Mr. Blicket together, and he activated. Then object A alone was placed on him with the same result. All children claimed that Mr. Blicket liked object A, and the question was how they categorized object B.

Three-year-olds claimed that Mr. Blicket liked object B 44% of the time when trained that he liked relatively few things (recall that the base rate in this condition was 1/6). By contrast, when Mr. Blicket liked many things (a base rate of 5/6), children responded that he liked object B 93% of the time, a significant difference between the conditions. Performance in the rare condition, however, could have been influenced by a number of factors. One possibility is that children were influenced by the spontaneous activation of the box, and would respond in a similar manner to a blicket machine that they observed spontaneously activate. This was not the case. Another group of 3-year-olds was shown a blicket machine that spontaneously activated during the initial part of the procedure. They were trained that objects that activate the machine (and hence, are blickets) were rare, and were given the same procedure. In this condition, 3-year-olds categorized object B as a blicket 72% of the time, more often than in the desire condition.

Similarly, another possibility is that children were simply more interested in Mr. Blicket than in the blicket machine. There are cases where children’s interest level clearly mediates their cognition (e.g., Renninger & Wozniak, 1985). To consider this possibility, we gave another group of 3-year-olds the same Mr. Blicket procedure, except that we labeled his activation as indicating what he was thinking about, instead of what he liked. In contrast to their understanding of desire, 3-year-olds have little knowledge of others’ belief states (e.g., Wellman, Cross, & Watson, 2001), the role of thinking in other mental activities (Flavell, Green, & Flavell, 1995; Johnson & Wellman, 1982; Lillard, 1996), or the possibility that thoughts could be related to objects or other thoughts (Eisbach, 2004). It seemed likely that only a few 3-year-olds would recognize that an agent’s thoughts could be based on an internal property of objects, which would provide them with a causal mechanism equivalent to the activation law. This also appeared to be the case. In this condition, 3-year-olds categorized object B as something Mr. Blicket was thinking about 72% of the time, more often than in the desire condition.

Further, in the three conditions in which children were trained that the causal power (Mr. Blicket’s desire, thoughts, or the activation of a spontaneously activating machine) was rare, we also gave children a set of unrelated cognitive measures as well as a measure in which they were asked to relate the causal power of objects to those objects’ insides (analogous to the procedure used in Sobel et al., 2007). Across all of these conditions, the ability to relate the causal property of objects to those objects’ insides predicted whether children claimed that object B did not have causal efficacy (i.e., a response consistent with the Bayesian description), even when age and other measures of general cognition were considered.

These data indicate that children are not specifically developing causal inference abilities between the ages of 3 and 4. Rather, children appear to have such an inferential mechanism in place at the age of three, and lack the particular domain-specific knowledge necessary to use that mechanism appropriately. The Bayesian description I am suggesting here (following similar proposals by Griffiths & Tenenbaum, 2005; Griffiths et al., submitted; Tenenbaum & Griffiths, 2001, 2003; Tenenbaum, Griffiths, & Niyogi, 2007) offers a rational way of considering how children’s developing prior knowledge influences their causal reasoning abilities.
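
To make the role of the base rate in the backwards blocking trial concrete, the following is a minimal sketch of one simple reading of the Bayesian description, not the model as fitted in the studies cited. It assumes (these are illustrative assumptions, as are the function and variable names) that each object independently has the causal power with prior probability equal to the trained base rate, and that the activation law is deterministic: the detector, or Mr. Blicket, activates exactly when at least one object with the power is placed on it.

```python
from fractions import Fraction
from itertools import product


def posterior_b_has_power(base_rate):
    """P(B has the power | A and B together activate, then A alone activates).

    Hypothetical helper: assumes independent priors equal to the base rate
    and a deterministic activation law.
    """
    rho = Fraction(base_rate)
    numerator = Fraction(0)
    evidence = Fraction(0)
    # Enumerate the four hypotheses about which of A and B has the power.
    for a, b in product([False, True], repeat=2):
        prior = (rho if a else 1 - rho) * (rho if b else 1 - rho)
        # Deterministic law: each event has probability 1 or 0.
        likelihood = int(a or b) * int(a)  # A+B activates, then A alone activates
        joint = prior * likelihood
        evidence += joint
        if b:
            numerator += joint
    return numerator / evidence


rare = posterior_b_has_power(Fraction(1, 6))
common = posterior_b_has_power(Fraction(5, 6))
print(f"rare condition: {rare}, common condition: {common}")
```

Under these assumptions the observed events say nothing about object B once A is known to have the power, so the posterior for B equals the base rate. The sketch is meant only to show why a learner who holds a deterministic activation law should judge B differently in the rare and common conditions; it does not attempt to reproduce the percentages children actually produced.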

An open question is how such causal knowledge might be acquired. In the final sections, I want to consider two possibilities. The first is an extension of the Bayesian mechanism. The second attempts to integrate other pieces of information from the environment.

So far we have considered how children recover a representation of the causal environment based on the data they observe. This learning mechanism is guided by a particular set of causal principles, which potentially constrain the hypothesis space children consider and the parameterization of those hypotheses. An open question is how children develop knowledge of these causal principles.

Consider the mechanism that underlies the blicket detector. The previous sections argued that preschoolers develop a conception that the mechanism that underlies the detector’s activation is deterministic. This knowledge is what allows us (and young children) to make inferences based on small samples of data. In almost all of the experiments described above, children are never shown data that contradict a deterministic mechanism. What happens when they are shown such data?

In Gopnik et al. (2001), children were shown cases in which objects sometimes made the machine go and sometimes did not. In their two-cause trials, children inferred that an object that activated the blicket detector two out of three times was a blicket most of the time. This trial provides evidence that the detector is not deterministic and might activate based on a more probabilistic mechanism. How might seeing this trial first affect children’s inferences on other trials, in which a deterministic mechanism is required?

Like the backwards blocking procedure, Gopnik et al.’s (2001) one-cause procedure relies on children understanding that there is a deterministic mechanism that relates blickets to the blicket detector (recall that on a one-cause trial, object A activates the machine by itself, object B does not by itself, then both objects activate the machine together twice). If the detector is probabilistic, then there should be the possibility that object B is a blicket; object B might have failed to be effective when it was placed on the machine alone, but demonstrated its efficacy when placed on the machine with object A. If the detector is deterministic, then this is not the case: object A should be a blicket, and object B should not be, by virtue of its failing to activate the detector independently. Gopnik et al. (2001) found that overall, children (particularly 4-year-olds) who were shown these data responded consistently with the deterministic interpretation. Tom Griffiths and I reanalyzed performance on the one-cause trials as a function of whether children observed a two-cause trial first (recall that in the two-cause trial, one object activates the detector probabilistically; it fails to activate the machine the first time it is placed on it, and does so the next two times). Four-year-olds were more likely to say that object B was a blicket in the one-cause trial if they saw a two-cause trial first.

Griffiths et al. (submitted) considered more systematically whether children and adults can extract mechanism information from the data they observe. Specifically, if learners first observe evidence that the detector is deterministic, will they make different inferences about the same data than if they first observe evidence that the detector is not deterministic? This question can also be formulated as one of Bayesian inference, although the hypotheses are about the principles that govern how hypotheses about causal models are formulated. In this example, the hypotheses include the nature of the activation law (the mechanism that relates objects to the detector) in addition to the specific causal structures (following Tenenbaum, Griffiths, & Kemp, 2006; Tenenbaum et al., 2007). For purposes of space, I will only describe the psychological investigation with young children, but we have done similar investigations on adults.

Griffiths et al. (submitted) showed 4-year-olds the blicket detector, and trained them that the detector was either deterministic or probabilistic. In the deterministic condition, children were introduced to the detector as in Gopnik et al. (2001). They then observed six objects, each placed on the machine three times. Five of the
six objects activated the machine all three times, and were labeled blickets; the other object failed to activate the machine all three times, and was labeled as not a blicket. In the probabilistic condition, children received the same introduction, and saw the same six objects. But here, the objects that activated the machine perfectly in the previous condition did so with some noise. Objects either activated the machine perfectly (one object), two out of three times (two objects), or one out of three times (two objects), and any of the objects that activated the machine was labeled a blicket. The object that failed to activate the machine all three times was still labeled as not a blicket, keeping the base rate of blickets the same across the conditions.

Children then observed a set of trials similar to the one-cause condition in Gopnik et al. (2001). The critical part of the trial involved them observing two new objects (A and B). Object A activated the machine by itself once. Object B failed to activate the machine by itself once, and then A and B together activated the machine twice. Children were asked whether each was a blicket. In the deterministic condition, performance paralleled Gopnik et al. (2001): children stated that object A was a blicket (100% of the time), and rarely stated that object B was one (only 9% of the time). In the probabilistic condition, children stated that object A was a blicket (92% of the time), but were significantly more likely to state that object B was as well (79% of the time).

These data offer preliminary evidence that 4-year-olds not only can recover information about causal models from the data that they observe, but that they also recover the principles necessary to learn causal structure from those data. Given the same correlational information, their inferences were different depending on the nature of the mechanism they were exposed to. Children’s understanding of these mechanisms might not be terribly deep; they might not have explicit understanding of the mechanism, but rather just be aware that some kind of mechanism exists, which constrains inference in certain ways. This seems consistent with the work on relating causes and insides: the internal parts of the objects in Sobel et al. (2007) are dummy mechanisms, but the older children treat them as if they were the mechanism for the detector’s activation, without (apparently) a real conception of how such mechanisms function.

INTEGRATING TOP-DOWN AND BOTTOM-UP LEARNING

Appealing to a Bayesian description of a causal learning mechanism, specifically one that might be able to extract such mechanism knowledge from observed data, does not imply that all causal learning is “bottom-up.” Instead, the Bayesian description seems more integrative: “top-down” principles for constraining causal learning can be derived from data, but this should not be considered the only way causal learning works. Below, I suggest several additional ways children might be able to acquire information about the principles for causal learning.

More likely than not, the best way in which children learn new causal structures (or new causal principles) is through direct instruction, what Harris and Koenig (2006) call learning from “testimony.” Harris, Pasquini, Duke, Asscher, and Pons (2006), for example, demonstrated that children made strong ontological commitments about different nonobservable scientific and endorsed entities (e.g., vitamins vs. Santa Claus). Further, the degree of their commitment to these entities varied with the exposure that they received about them.

More generally, one could imagine that children learn a great deal of causal structure simply by being told about that structure (something that might be particularly important in learning science, see Klahr & Nigam, 2004). This is evident in the introduction to most blicket detector experiments, in which children are told that the machine is a “blicket machine,” and that objects that make it go are “blickets.” The fact that children learn this readily (established in the pretests of almost all of these experiments) suggests that they can learn causal principles directly from the language they hear, but this is a topic for further investigation.
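
Returning briefly to the one-cause data from the deterministic and probabilistic training conditions of Griffiths et al. (submitted): the dependence of the inference about object B on the activation law can be illustrated with a small sketch. It is not the authors’ model. It assumes a noisy-OR detector in which each blicket independently activates the machine with a fixed probability `strength`, takes the prior that any object is a blicket to be the trained base rate of 5/6, and simply fixes `strength` by hand (1.0 for a deterministic detector, 2/3 as an arbitrary illustrative value for a probabilistic one) rather than inferring it from the training trials. All names are hypothetical.

```python
from itertools import product


def posterior_b_is_blicket(strength, base_rate=5 / 6):
    """P(B is a blicket | A alone activates once, B alone fails once,
    A and B together activate twice), under an assumed noisy-OR detector."""

    def p_activate(n_blickets):
        # Noisy-OR: each blicket on the machine independently activates it
        # with probability `strength`; no blickets means no activation.
        return 1 - (1 - strength) ** n_blickets

    numerator = 0.0
    evidence = 0.0
    # Enumerate the four hypotheses about which of A and B is a blicket.
    for a, b in product([0, 1], repeat=2):
        prior = (base_rate if a else 1 - base_rate) * (base_rate if b else 1 - base_rate)
        likelihood = (
            p_activate(a)              # A alone activates the machine
            * (1 - p_activate(b))      # B alone fails to activate it
            * p_activate(a + b) ** 2   # A and B together activate it twice
        )
        joint = prior * likelihood
        evidence += joint
        numerator += joint * b
    return numerator / evidence


deterministic = posterior_b_is_blicket(strength=1.0)
probabilistic = posterior_b_is_blicket(strength=2 / 3)
print(f"deterministic: {deterministic:.2f}, probabilistic: {probabilistic:.2f}")
```

With a deterministic law, B’s single failure rules it out, whereas with a noisy law the same failure is weak evidence against a high prior, so the posterior for B stays well above one half. The exact number depends on the assumed `strength`; the qualitative contrast between the two conditions is the point of the sketch.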

Analogy

Numerous investigations suggest that young children can make inferences from analogies (e.g., Brown & Kane, 1988; Gentner, 1977), and this is especially true when reasoning about causal relations (e.g., Goswami & Brown, 1989; Goswami, Leevers, Pressley, & Wheelwright, 1998; Ratterman & Gentner, 1998). This suggests that children can come to make new causal inferences from analogous information, or learn new information faster or more accurately if the analogy is mapped out for them. Emily Hopkins and I (Hopkins & Sobel, 2007; Sobel & Hopkins, submitted) have recently considered this possibility by looking at a particular type of causal inference: reasoning about enabling conditions. Specifically, we found that 4-year-olds struggled to understand enabling conditions in a decontextualized environment (where the part of an object that acted as the enabling condition was labeled an “inside”). However, young children do appear to understand enabling conditions in a particular setting: a Child Language Data Exchange System (CHILDES; MacWhinney, 2000) analysis revealed that children talk about how batteries are necessary to make machines and toys function. Four-year-olds were able to make proper inferences about enabling conditions in a condition in which the part that acted in this manner was labeled as a battery.

Contextual Information in Data

A potential limitation of the causal graphical model framework is that it does not easily describe a way in which contextual cues can influence learning. For example, active construction of knowledge in the world is a hallmark of both classic (e.g., Montessori, 1912; Piaget, 1952) and certain contemporary (e.g., Gopnik & Meltzoff, 1997) approaches to cognitive development. The computational approaches described here do not consider whether the child has an active hand in constructing its knowledge as opposed to recovering causal structure from simply observing the environment.

The ability to control what data one observes, and to generate interventions consistent with those data, appears to facilitate learning over observing similar data in adult participants (Lagnado & Sloman, 2004; Steyvers et al., 2003; Waldmann & Hagmayer, 2005), in child participants (Schulz et al., 2007), and in animals (Blaisdell et al., 2006). Moreover, young children appear to treat their own data as more informative than if the same data were generated by another person (Kushnir & Gopnik, 2005).

Jessica Sommerville and I investigated how young children’s causal learning was affected by particular contextual demands (Sobel & Sommerville, 2009b). We found that 4-year-olds whose free play with a system allowed them to discover causal structure learned that structure better than children whose free play with a system came after they observed an experimenter generate a small number of interventions on the system (enough to discover the structure). Further, when children were shown identical intervention data, which were sufficient to learn a causal structure, children who were given an inappropriate rationale for why the experimenter had generated those data failed to learn the system; children given an appropriate rationale learned at above-chance levels (Sobel & Sommerville, 2009a). These contextual factors are not part of the computational description I have described so far, and must be accounted for therein.

In this chapter, I have suggested a description of causal inference based on Bayesian inference, which illustrates how children engage in causal learning (for a more detailed description of this model, see, e.g., Griffiths & Tenenbaum, 2007). This description is meant at the computational level of analysis (following Marr, 1982), which means that an obvious limitation of this approach is that it should not be taken for the actual algorithm by which children learn causal knowledge, nor as an account of how the brain instantiates such inference. However, in describing the way in which children learn causal knowledge, we provide insight into these questions.

I want to conclude by emphasizing that computational models are a good way to focus an
investigation, but a psychological description of human causal learning should not be completely model-dependent (whether that model be bottom-up, top-down, or something in between). One should integrate models with human workings to describe psychological accounts of reasoning (what Lagnado et al., 2007, call a “heuristic-based” approach). Here, it should be emphasized that young children possess considerable causal reasoning abilities, starting at a very young age. The goal of future research is to describe these abilities in more detail, and potentially to develop algorithmic- and implementational-level accounts of children’s causal inference.

ACKNOWLEDGMENTS

I was supported by NSF (DLS-0518161 to D.M.S.) during the writing of this chapter. I would like to thank all of the parents and children who participated in the research described here. I would also like to thank Scott Johnson, David Buchanan, Claire Cook, Tom Griffiths, Natasha Kirkham, and the members of the NYBUG workshop for helpful discussions about material in this chapter.

REFERENCES

Ahn, W., Kalish, C. W., Medin, D. L., & Gelman, S. A. (1995). The role of covariation versus mechanism information in causal attribution. Cognition, 54, 299–352.
Allan, L. G. (1980). A note on measurement of contingency between two binary variables in judgment tasks. Bulletin of the Psychonomic Society, 15, 147–149.
Aslin, R. N., Saffran, J. R., & Newport, E. L. (1998). Computation of conditional probability statistics by 8-month-old infants. Psychological Science, 9, 321–324.
Blaisdell, A. P., Sawa, K., Leising, K. J., & Waldmann, M. R. (2006). Causal reasoning in rats. Science, 311, 1020–1022.
Blumenthal, E. J., & Sobel, D. M. (2008). Preschoolers’ developing knowledge of causal and internal properties of artifacts. Manuscript in preparation, Brown University.
Brown, A. L., & Kane, M. J. (1988). Preschool children can learn to transfer: Learning to learn and learning from example. Cognitive Psychology, 20, 493–523.
Bullock, M., Gelman, R., & Baillargeon, R. (1982). The development of causal reasoning. In W. J. Friedman (Ed.), The developmental psychology of time (pp. 209–254). New York: Academic Press.
Cheng, P. W. (1997). From covariation to causation: A causal power theory. Psychological Review, 104, 367–405.
Cheng, P. W., & Novick, L. R. (1990). A probabilistic contrast model of causal induction. Journal of Personality and Social Psychology, 58, 545–567.
Cramer, R. E., Weiss, R. F., Williams, R., Reid, S., Nieri, L., & Manning-Ryan, B. (2002). Human agency and associative learning: Pavlovian principles govern social process in causal relationship detection. Quarterly Journal of Experimental Psychology: Comparative and Physiological Psychology, 55B, 241–266.
Das Gupta, P., & Bryant, P. E. (1989). Young children’s causal inferences. Child Development, 60, 1138–1146.
Dickinson, A., & Shanks, D. (1995). Instrumental action and causal representation. In D. Sperber, D. Premack, & A. J. Premack (Eds.), Causal cognition: A multidisciplinary debate (pp. 5–25). New York: Clarendon Press/Oxford University Press.
Eisbach, A. O. D. (2004). Children’s developing awareness of diversity in people’s trains of thought. Child Development, 75, 1694–1707.
Fawcett, C., & Markson, L. (2005). Developing ideas about other’s preferences. Poster presented at the 2005 meeting of the Cognitive Development Society, San Diego, CA.
Fiser, J., & Aslin, R. N. (2002). Statistical learning of new visual feature combinations by infants. Proceedings of the National Academy of Sciences, USA, 99, 15822–15826.
Flavell, J. H., Green, F. L., & Flavell, E. R. (1995). Young children’s knowledge about thinking. Monographs of the Society for Research in Child Development, 60(1, Serial No. 243).
Gentner, D. (1977). Children’s performance on a spatial analogies task. Child Development, 48, 1034–1039.
Gomez, R. L., & Gerken, L. A. (1999). Artificial grammar learning by 1-year-olds leads to specific and abstract knowledge. Cognition, 70, 109–135.
Gopnik, A., Glymour, C., Sobel, D. M., Schulz, L. E., Kushnir, T., & Danks, D. (2004). A theory of causal learning in children: Causal maps and Bayes nets. Psychological Review, 111, 3–32.

Gopnik, A., & Meltzoff, A. N. (1992). Categorization and naming: Basic-level sorting in eighteen-month-olds and its relation to language. Child Development, 63, 1091–1103.
Gopnik, A., & Meltzoff, A. N. (1997). Words, thoughts, and theories. Cambridge, MA: MIT Press.
Gopnik, A., & Slaughter, V. (1991). Young children’s understanding of changes in their mental states. Child Development, 62, 98–110.
Gopnik, A., & Sobel, D. M. (2000). Detecting blickets: How young children use information about novel causal powers in categorization and induction. Child Development, 71, 1205–1222.
Gopnik, A., Sobel, D. M., Schulz, L., & Glymour, C. (2001). Causal learning mechanisms in very young children: Two, three, and four-year-olds infer causal relations from patterns of variation and co-variation. Developmental Psychology, 37, 620–629.
Goswami, U., & Brown, A. L. (1989). Melting chocolate and melting snowmen: Analogical reasoning and causal relations. Cognition, 35, 69–95.
Goswami, U., & Brown, A. L. (1990). Higher-order structure and relational reasoning: Contrasting analogical and thematic relations. Cognition, 36, 207–226.
Goswami, U., Leevers, H., Pressley, S., & Wheelwright, S. (1998). Causal reasoning about pairs of relations and analogical reasoning in young children. British Journal of Developmental Psychology, 16, 553–569.
Gottfried, G. M., & Gelman, S. A. (2005). Developing domain-specific causal-explanatory frameworks: The role of insides and immanence. Cognitive Development, 20, 137–158.
Griffiths, T. L., Sobel, D. M., Tenenbaum, J. B., & Gopnik, A. (2008). Bayesian reasoning in adults’ and children’s causal inferences. Manuscript in preparation, University of California at Berkeley.
Griffiths, T. L., & Tenenbaum, J. B. (2005). Structure and strength in causal induction. Cognitive Psychology, 51, 334–384.
Griffiths, T. L., & Tenenbaum, J. B. (2007). Two proposals for causal grammars. In A. Gopnik & L. E. Schulz (Eds.), Causal learning: Psychology, philosophy, and computation (pp. 323–346). New York: Oxford University Press.
Haith, M. M. (1993). Future-oriented processes in infancy: The case of visual expectations. In C. Granrud (Ed.), Visual perception and cognition in infancy (pp. 235–264). Hillsdale, NJ: Erlbaum.
Harris, P. L., & Koenig, M. A. (2006). Trust in testimony: How children learn about science and religion. Child Development, 77, 505–524.
Harris, P. L., Pasquini, E. S., Duke, S., Asscher, J. J., & Pons, F. (2006). Germs and angels: The role of testimony in young children’s ontology. Developmental Science, 9, 76–96.
Hopkins, E. J., & Sobel, D. M. (2007, March). Children’s causal inferences about enabling conditions in the physical and psychological domains. Poster presented at the 2007 Biennial meeting of the Society for Research in Child Development, Boston, MA.
Hume, D. (1978). A treatise of human nature. Oxford: Oxford University Press. (Original work published 1739)
Johnson, C. N., & Wellman, H. M. (1982). Children’s developing conceptions of the mind and brain. Child Development, 53, 222–234.
Johnson, S., Slaughter, V., & Carey, S. (1998). Whose gaze will infants follow? The elicitation of gaze-following in 12-month-olds. Developmental Science, 1, 233–238.
Johnson, S. C., Booth, A., & O’Hearn, K. (2001). Inferring the goals of a nonhuman agent. Cognitive Development, 16, 637–656.
Kamin, L. J. (1969). Predictability, surprise, attention, and conditioning. In B. A. Campbell & R. M. Church (Eds.), Punishment and aversive behavior (pp. 279–296). New York: Appleton-Century-Crofts.
Kirkham, N. Z., Slemmer, J. A., & Johnson, S. P. (2002). Visual statistical learning in infancy. Cognition, 83, B35–B42.
Klahr, D., & Nigam, M. (2004). The equivalence of learning paths in early science instruction. Psychological Science, 15, 661–667.
Kruschke, J. K., & Blair, N. J. (2000). Blocking and backward blocking involve learned inattention. Psychonomic Bulletin and Review, 7, 636–645.
Kushnir, T., & Gopnik, A. (2005). Young children infer causal strength from probabilities and interventions. Psychological Science, 16, 678–683.
Lagnado, D. A., & Sloman, S. (2004). The advantage of timely intervention. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 856–876.
Lagnado, D. A., Waldmann, M. R., Hagmayer, Y., & Sloman, S. A. (2007). Beyond covariation: Cues
to causal structure. In A. Gopnik & L. E. Schulz (Eds.), Causal learning: Psychology, philosophy, and computation (pp. 154–172). New York: Oxford University Press.
Leslie, A. M., & Keeble, S. (1987). Do six-month-old infants perceive causality? Cognition, 25, 265–288.
Lillard, A. S. (1996). Body or mind: Children’s categorizing of pretense. Child Development, 67, 1717–1734.
Mackintosh, N. J. (1975). A theory of attention: Variations in the associability of stimuli with reinforcement. Psychological Review, 82, 276–298.
MacWhinney, B. (2000). The CHILDES project: Tools for analyzing talk (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
Marr, D. (1982). Vision. New York: Henry Holt.
McCormack, T., & Hoerl, C. (2005). Children’s reasoning about the causal significance of the temporal order of events. Developmental Psychology, 41, 54–63.
Michotte, A. (1962). Causalité, permanence et réalité phénoménales. Oxford, England: Publications Universitaires.
Montessori, M. (1912/1964). The Montessori method. New York: Schocken.
Novick, L. R., & Cheng, P. W. (2004). Assessing interactive causal influence. Psychological Review, 111, 455–485.
Oakes, L. M., & Cohen, L. B. (1990). Infant perception of a causal event. Cognitive Development, 5, 193–207.
Pearl, J. (2000). Causality: Models, reasoning, and inference. New York: Cambridge University Press.
Perner, J. (1991). Understanding the representational mind. Cambridge, MA: MIT Press.
Piaget, J. (1929). The child’s conception of the world. London: Routledge and Kegan Paul.
Piaget, J. (1930). The child’s conception of physical causality (M. Gabain, Trans.). London: Lund Humphries.
Piaget, J. (1952). The origins of intelligence in children (M. Cook, Trans.). Madison, WI: International Universities Press.
Ratterman, M. J., & Gentner, D. (1998). More evidence for a relational shift in the development of analogy: Children’s performance on a causal-mapping task. Cognitive Development, 13, 453–478.
Renninger, K. A., & Wozniak, R. H. (1985). Effect of interest on attentional shift, recognition, and recall in young children. Developmental Psychology, 21, 624–632.
Repacholi, B. M., & Gopnik, A. (1997). Early reasoning about desires: Evidence from 14- and 18-month-olds. Developmental Psychology, 33, 12–21.
Rescorla, R. A., & Wagner, A. R. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In A. H. Black & W. F. Prokasy (Eds.), Classical conditioning II: Current theory and research (pp. 64–99). New York: Appleton-Century-Crofts.
Saffran, J. R., Aslin, R. N., & Newport, E. L. (1996). Statistical learning by 8-month-old infants. Science, 274, 1926–1928.
Saffran, J. R., Johnson, E. K., Aslin, R. N., & Newport, E. L. (1999). Statistical learning of tone sequences by human infants and adults. Cognition, 70, 27–52.
Schulz, L. E., & Gopnik, A. (2004). Causal learning across domains. Developmental Psychology, 40, 162–176.
Schulz, L. E., Gopnik, A., & Glymour, C. (2007). Preschool children learn causal structure from conditional independence. Developmental Science, 10, 322–332.
Schulz, L. E., & Sommerville, J. (2006). God does not play dice: Causal determinism and children’s inferences about unobserved causes. Child Development, 77, 427–442.
Shanks, D. R. (1985). Forward and backward blocking in human contingency judgment. Quarterly Journal of Experimental Psychology, 37B, 1–21.
Shanks, D. R. (1995). Is human learning rational? Quarterly Journal of Experimental Psychology A, 48, 257–279.
Shultz, T. R. (1982). Rules of causal attribution. Monographs of the Society for Research in Child Development, 47(1, Serial No. 194).
Sobel, D. M., & Blumenthal, E. J. (2009). Children’s causal inferences about ambiguous evidence reflect their developing understanding of mechanisms. Manuscript in preparation, Brown University.
Sobel, D. M., & Hopkins, E. (2009). Batteries not included: Children’s causal inferences about enabling conditions. Manuscript in preparation, Brown University.
Sobel, D. M., & Kirkham, N. Z. (2006). Blickets and babies: The development of causal reasoning in toddlers and infants. Developmental Psychology, 42, 1103–1115.

Sobel, D. M., & Kirkham, N. Z. (2007). Bayes nets Annual Conference on the Advances in Neural
and blickets: Infants developing representa- Information Processing Systems.
tions of causal knowledge. Developmental Tenenbaum, J. B., Griffiths, T. L., & Kemp, C.
Science, 10, 298–306. (2006). Theory-based Bayesian models of
Sobel, D. M., & Munro, S. A. (2009). Domain gen- inductive learning and reasoning. Trends in
erality and specificity in children’s causal infer- Cognitive Science, 10, 309–318.
ences about ambiguous data. Developmental Tenenbaum, J. B., Griffiths, T. L., & Niyogi, S. (2007).
Psychology, 45, 511–524. Intuitive theories as grammars for causal infer-
Sobel, D. M., & Sommerville, J. A. (2009a). ence. In A. Gopnik & L. E. Schulz (Eds.), Causal
Rationales in children’s causal learning from learning; Psychology, philosophy and computa-
other’s actions. Cognitive Development, 24, tion (pp. 301–322). New York: Oxford.
70–79. Uzgiris, I. C., & Hunt, J. M. V. (1975). Assessment
Sobel, D. M. & Sommerville, J. A. (2009b). The in infancy: Ordinal scales of psychological devel-
importance of discovery for children’s causal opment. Oxford, England: University of Illinois
learning from interventions. Manuscript in Press.
preparation, Brown University. Van Hamme, L. J., & Wasserman, E. A. (1994). Cue
Sobel, D. M., Tenenbaum, J. B., & Gopnik, A. competition in causality judgments: The role of
(2004). Children’s causal inferences from nonpresentation of compound stimulus ele-
indirect evidence: Backwards blocking and ments. Learning and Motivation, 25, 127–151.
Bayesian reasoning in preschoolers. Cognitive Waldmann, M. R., & Hagmayer, Y. (2005). Seeing
Science, 28, 303–333. versus doing: Two modes of accessing causal
Sobel, D. M., Yoachim, C. M., Gopnik, A., knowledge. Journal of Experimental Psychology:
Meltzoff, A. N., & Blumenthal, E. J. (2007). The Learning Memory and Cognition, 31, 216–227.
blicket within: Preschoolers’ inferences about Wasserman, E. A., & Berglan, L. R (1998). Backward
insides and causes. Journal of Cognition and blocking and recovery from overshadowing in
Development, 8, 159–182. human causal judgment: The role of within-
Sophian, C., & Huber, A. (1984). Early develop- compound associations. Quarterly Journal
ments in children’s causal judgments. Child of Experimental Psychology: Comparative &
Development, 55, 512–526. Physiological Psychology, 51, 121–138.
Spelke, E. S., Breinlinger, K., Macomber, J., & Wellman, H. M., Cross, D., & Watson, J. K. (2001).
Jacobson, K. (1992). Origins of knowledge. A meta-analysis of theory of mind: The truth
Psychological Review, 99, 605–632. about false belief. Child Development, 72,
Spirtes, P., Glymour, C., & Scheines, R. (2001). 655–684.
Causation, prediction, and search (Springer Wellman, H. M., & Woolley, J. D. (1990). From sim-
Lecture Notes in Statistics, 2nd ed., Rev.). ple desires to ordinary beliefs: The early devel-
Cambridge, MA: MIT Press. opment of everyday psychology. Cognition, 35,
Steyvers, M., Tenenbaum, J. B., Wagenmakers, E. 245–275.
J., & Blum, B. (2003). Inferring causal networks Wentworth, N. Haith, M. M., & Hood, R. (2002).
from observations and interventions. Cognitive Spatiotemporal regularity and interevent con-
Science, 27, 453–489. tingencies as information for infants’ visual
Tenenbaum, J. B., & Griffiths, T. L. (2001). expectations. Infancy, 3, 303–321.
Structure learning in human causal induction. Woodward, J. (2003). Making things happen: A the-
Proceedings of the 13th Annual Conference ory of causal explanation. New York: Oxford.
on the Advances in Neural Information Yuill, N. (1984). Young children’s coordina-
Processing Systems. tion of motive and outcome in judgments of
Tenenbaum, J. B., & Griffiths, T. L. (2003). Theory- satisfaction and morality. British Journal of
based causal inference. Proceedings of the 14th Developmental Psychology, 2, 73–81.
What Is Statistical Learning, and
What Statistical Learning Is Not

Jenny R. Saffran

Over the past decade, researchers in developmental cognitive science and psycholinguistics have become increasingly interested in the role that statistical learning may play in perceptual and cognitive development. Despite the term being weighty, the idea is quite simple. To the extent that structure in the environment is patterned, learners with appropriate learning mechanisms can make use of that patterning to discover underlying structure. This idea has a long history, across myriad domains. For example, linguists in the first half of the twentieth century routinely examined the distribution of sounds, words, and categories of words in novel languages to infer the structures that generated those distributions, including phonemes, words, morphemes, and rudimentary syntax (e.g., Bloomfield, 1933; Harris, 1955). Similarly, researchers studying the behavior of nonhuman animals, both in the laboratory and in their natural habitats, discovered that nonhuman animals skillfully track environmental regularities to increase the probabilities of reward (for an extensive review, see Gallistel, 1990).

The term “statistical learning” itself originated in the area of computer science. Statistical learning algorithms are typically used for pattern recognition processes and are applied across numerous domains from face recognition to speech processing. In this field, statistical learning is sometimes used to refer to algorithms that themselves learn from data. In the subfield of computer science known as natural language processing, statistical learning typically refers to a set of processes or procedures for parsing, part-of-speech tagging (e.g., discovering lexical categories), or induction of grammatical structures using such procedures as Hidden Markov Models (e.g., Charniak, 1993). The broad idea behind this body of research is that bottom-up processes tracking joint frequencies, conditional probabilities, prior probabilities, mutual information, and/or entropy (among many other possible computations) may efficiently discover structure in complex domains, at least given relatively constrained search spaces.

Consistent with but largely separate from these developments, computational modelers in cognitive science were developing novel techniques for learning via ‘dumb’ algorithms operating en masse, intended to mimic the operation of neural structures (e.g., Hebbian learning). Again, working within relatively constrained domains and/or toy corpora, researchers demonstrated discovery procedures using neural network models that essentially capitalized on statistical properties of the input to parse sentences, discover lexical categories, and discern structure via learning (e.g., Allen & Seidenberg, 1999; Chater & Redington, 1999; Curtin et al., 2001; Mintz, Newport, & Bever, 2002; Reali & Christiansen, 2005; Seidenberg & MacDonald, 2001). These models have been highly effective in demonstrating the information content available in various types of input, as they permit researchers to determine what types of cues are,
in principle, available to actual human learners and to test specific hypotheses concerning how learning might operate (e.g., the debate between proponents of rules vs. statistics: cf. Altmann, 2002; Christiansen & Curtin, 1999; Marcus, 1999a, 1999b, 2001; McClelland & Plaut, 1999; Rohde & Plaut, 1999; Seidenberg & Elman, 1999).

Connectionist models have had a tremendous impact on theoretical and empirical work in the cognitive sciences. Interestingly, because these are learning models, they are arguably more relevant to developmental psychology than to any other branch of cognitive science. Perhaps because of this fact, connectionist models have been especially vulnerable to the kinds of critiques and concerns raised most prominently by developmentalists. For example, each model is typically tailored to handle just one type of task. That is, a model trained to learn the past tense of English verbs cannot turn around and do object segregation as well (the input/output representations are wrong; even given overlapping representations, new learning would lead to catastrophic interference, erasing the prior learning). These kinds of observations raise questions about how domain-specific such models need to be. What is “innate” or prespecified in the representations and/or architectures of individual models? What about types of learning—e.g., more abstract structures—that do not appear to be captured by statistical models? Perhaps most importantly, do developing humans really learn this way?

Perhaps because of these developments and debates, the initial findings that infants can track the probabilities of sequential elements (syllables and/or phonemes) in speech received a great deal of attention (Aslin, Saffran, & Newport, 1998; Goodsitt, Morgan, & Kuhl, 1993; Hauser, Newport, & Aslin, 2001; Saffran, Aslin, & Newport, 1996), both positive and negative. On the positive side, these results unambiguously demonstrated that infants can track regularities in rapid speech, without the benefit of added linguistic cues, social cues, or external reinforcement; learning itself appears to be reinforcing in these tasks. This is not to say that other sorts of information are not beneficial—indeed they are, as discussed below. These data were intended to provide an existence proof that infants can, at least under certain circumstances, track types of information that are relevant to linguistic structure (according to the computational and linguistic literatures). Importantly, the initial claims never suggested that statistical learning could account for all of language acquisition, or even all of any subset of language (e.g., discovering words in fluent speech). The intent was to develop methods, using artificial languages, which would allow subsequent researchers to test specific hypotheses concerning the role of statistical learning in language acquisition in infancy and beyond. And to a large extent, this is exactly what has happened.

However, perhaps because of its relatively simple beginnings, the area of statistical language learning has been caricatured as focusing on a single computation (pairwise transitional probabilities) performed between physically observable entities (syllables) in a highly artificial language. Given this view of statistical language learning, it is indeed the case that the theoretical oomph of statistical learning is extremely limited. To acquire natural languages, learners need to detect more complex relationships between more abstract entities in a far richer set of input. The insufficiency of statistical learning is even true for a task like detecting word boundaries in fluent speech. While tracking sequential probabilities is demonstrably useful, given corpus analyses (Swingley, 2005), other sources of information are absolutely necessary to achieve good learning outcomes. These might range from tracking additional regularities in speech, such as the degree of stress carried by a syllable (Curtin, Mintz, & Christiansen, 2005), to potential innate knowledge concerning universal phonological regularities (Yang, 2004).

Importantly, the initial reports concerning statistical learning attempted to be very explicit on this point, repeatedly pointing out the fact that sequential statistics alone are not enough to fully solve any real language learning problems. For example, Saffran et al. (1996) noted that “although experience with speech in the real world is unlikely to be as concentrated as it was in these studies, infants in more natural
settings presumably benefit from other types of cues correlated with statistical information” (p. 1928). In an adult study of this phenomenon also published in 1996, these investigators explicitly manipulated an additional cue to word boundaries, vowel lengthening, to investigate additive effects of multiple cues (Saffran, Newport, & Aslin, 1996). It has become increasingly clear that sequential statistical cues (such as transitional probabilities) operate in tandem with other types of regularities in the service of infant word segmentation, including lexical stress (e.g., Curtin et al., 2005; Johnson & Jusczyk, 2001; Jusczyk, 1999; Jusczyk, Houston, & Newsome, 1999; Thiessen & Saffran, 2003, 2007), known words (Bortfeld, Morgan, Golinkoff, & Rathbun, 2005), and other relevant cues in speech to infants (for a recent review, see Saffran, Werker, & Werner, 2006). Moreover, use of sequential statistics appears to be enhanced by the presence of attention-grabbing infant-directed speech (Thiessen, Hill, & Saffran, 2005).

It is thus very much not the case that sequential statistics operate in a vacuum; this point is both implicitly and explicitly made throughout this burgeoning literature. Moreover, it is important to note that many of the types of information usually considered in opposition to statistics are themselves nondeterministic. For example, the lexical stress information upon which English-learning 9-month-olds rely to segment bisyllabic words (first syllable stress) is itself probabilistic (Cutler & Carter, 1987). An issue for the field, then, is to decide what “counts” as statistical. Are probabilistic regularities that are not sequential still statistical? It seems that the logical answer is yes, in which case the question becomes how to characterize the myriad different types of statistical regularities in the input, including phonological and social cues (Goldstein, King, & West, 2003; Kuhl, 2007; Kuhl, Tsao, & Liu, 2003).

With these considerations in mind, let us return to the caricature: statistical learning consists of a single computation (pairwise transitional probabilities, either adjacent or nonadjacent) over simple elements (e.g., phonemes or syllables) in a highly artificial language. To the extent that this caricature is accurate, the potential role for statistical learning in language acquisition is limited at best. For the remainder of this chapter, I will focus on three issues that are implicit in this characterization of statistical learning: the nature of the computations, the complexity of the learning problem, and the role of artificial languages.

The short answer to this question is that we do not yet know. There are (at least) three different ways to approach this problem. The first is to analyze language corpora to determine which statistics, in principle, might be useful/necessary to capture language structure. The second is to create carefully designed experiments to determine whether appropriately aged learners can make use of the statistics in question. These two approaches, in tandem, have been quite useful, in that they have shown that infants can keep track of, at minimum, the adjacent pairwise probabilities (e.g., Aslin et al., 1998) that corpus analyses suggest would be useful for word segmentation (e.g., Swingley, 2005), nonadjacent pairwise probabilities (e.g., Gomez, 2002), and histogram frequencies (e.g., Maye, Werker, & Gerken, 2002). The latter case is particularly interesting, as it suggests that infants not only track the individual frequencies of occurrence of elements, but the distribution of those frequencies, distinguishing unimodal and bimodal functions from one another. Importantly, these functions are useful for discovering speech categories given the statistics of real speech samples (Vallabha, McClelland, Pons, Werker, & Amano, 2007; Werker et al., 2007). These sorts of statistics, along with others currently described in the adult literature—e.g., clustering words into categories (Mintz, 2002); tracking probabilities of individual word pairs (Thompson & Newport, 2007)—suggest that infants may have access to a powerful tool set for exploiting the distributional regularities of human languages. The exact nature of this tool set, however, remains underspecified.
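
To make the simplest of these statistics concrete, the following is a minimal sketch of adjacent pairwise transitional probabilities computed over a syllable stream, with word boundaries posited wherever the probability dips. Everything in it is an illustrative assumption rather than a description of any published study: the three nonsense words, the fixed word order, the 0.9 threshold, and the function names `transitional_probabilities` and `segment` are all hypothetical.

```python
from collections import Counter

def transitional_probabilities(syllables):
    """Adjacent forward TP: count(x followed by y) / count(x in non-final position)."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {(x, y): n / first_counts[x] for (x, y), n in pair_counts.items()}

def segment(syllables, tps, threshold):
    """Posit a word boundary wherever the TP between neighbors falls below threshold."""
    words, current = [], [syllables[0]]
    for x, y in zip(syllables, syllables[1:]):
        if tps[(x, y)] < threshold:
            words.append("".join(current))
            current = []
        current.append(y)
    words.append("".join(current))
    return words

# Hypothetical toy lexicon: three invented trisyllabic "words".
WORDS = {"A": ["ka", "lu", "mi"], "B": ["so", "te", "pa"], "C": ["ne", "ri", "go"]}
# Hypothetical fixed word order with no immediate repetitions.
ORDER = "ABCACBABCBACBCA"

stream = [syllable for word in ORDER for syllable in WORDS[word]]
tps = transitional_probabilities(stream)

print(tps[("ka", "lu")])          # within-word transition
print(tps[("mi", "so")])          # transition spanning a word boundary
print(segment(stream, tps, 0.9))  # threshold of 0.9 is an arbitrary choice
```

In this toy stream every within-word transition is perfectly predictable, whereas transitions that span a word boundary are not, so a simple threshold recovers the words. Natural speech offers no such clean separation, which is one reason the additional cues discussed above matter.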

The third approach is somewhat different and is perhaps best exemplified by connectionist models. Rather than trying to specify which statistics are used by infant learners, the problem can be turned on its head by asking what task infants are attempting to perform. Computational models such as simple recurrent networks (SRNs) take as their task the problem of trying to predict what is coming up downstream in the input (Elman, 1990). This idea—learning by predicting—is consistent with a body of recent work in adult language comprehension that has focused on predicting during skilled language processing (for a recent review, see MacDonald & Seidenberg, 2006). For example, event-related potential (ERP) evidence suggests that adults use the phonology of a determiner (e.g., a versus an) to generate expectations concerning what noun will come next: consonant-initial nouns follow a, while vowel-initial nouns follow an (DeLong, Urbach, & Kutas, 2005). Semantic information in the verb (e.g., eats) constrains adults’ anticipatory eye movements, such that listeners look toward object pictures that are consistent with the verb (e.g., cake) well before the noun itself occurs (Altmann & Kamide, 1999; Kamide, Altmann, & Haywood, 2003); this effect is maintained even when the noun pictures are removed (Altmann, 2004). Similarly, adults and young children can use the gender of a determiner (e.g., le vs. la in French) to generate expectations concerning the following noun (Dahan, Swingley, Tanenhaus, & Magnuson, 2000; Lew-Williams & Fernald, 2007).

It is thus possible that infants, along with adults and children, exploit patterns in their linguistic environment—including statistics—to make predictions about what will come next, as well as to interpret what has already occurred (which is necessary given the vast amount of ambiguity in natural language). The question then is not which statistics do infants compute, but which statistics inform infants’ expectations about subsequent input? Efficient tracking of relevant regularities would allow infants to become skilled at rapid language processing, as is observed over the course of typical language development (in the absence of language disorders). Making use of expectations is even more critical in language production, where decisions about future speech acts constantly inform current motor actions.

We can thus consider statistical learning as a component of language processing and use. By anticipating what will come next, infants can potentially increase the efficiency of their language comprehension—which is necessary given the amazing rapidity of speech. Patterns in the input thus can serve to influence this comprehension, by biasing perceivers toward likely outcomes. Sequential patterns at numerous grains of analysis could be construed as providing such biasing information. For example, predicting which syllable will come next provides information about where a word might end, facilitating word segmentation and, eventually, lexical access. Predicting which word will come downstream facilitates lexical access and subsequent sentence-level parsing. Category-level predictions (e.g., that nouns follow adjectives, or that certain types of verbs are followed by the complementizer “that”) are likely to be particularly germane to syntactic structure; that is, predicting which word class should follow another word class.

Moreover, such predictions might serve as an important learning signal. Connectionist networks frequently take this approach, learning by predicting the next element in the input and assessing the match between the prediction and what actually occurs. Differences between predicted input and actual input serve as an implicit error signal, such that weights can be updated to reflect actual occurrences (e.g., Elman, 1990). Note that this idea is quite different from the classic concept of “negative evidence,” wherein learners are provided with explicit corrections; such evidence is argued to be rarely available to children, and often not useful even when it does occur (Brown & Hanlon, 1970; Marcus, 1993; Morgan & Travis, 1989). Implicit negative evidence, based on prediction, could provide corrective information at far more time points, facilitating learning relative to explicit negative evidence. This intuition is supported by modeling results contrasting different approaches to learning a toy grammar (Spivey-Knowlton & Saffran, 1995).
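
The notion of an implicit error signal can be illustrated with a deliberately small sketch. It is not a simple recurrent network: there is no hidden layer and no recurrence, only one weight per (current word, next word) pair, adjusted by the difference between what occurred and what was predicted. The word sequence, the learning rate, the number of passes, and the names `train_predictor` and `predict` are all assumptions made for illustration.

```python
import math

def softmax(row):
    """Turn a row of weights into a probability distribution over next words."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def train_predictor(sequence, vocab, epochs=50, lr=0.1):
    """Learn next-word predictions from prediction error alone (no explicit correction)."""
    index = {word: i for i, word in enumerate(vocab)}
    weights = [[0.0] * len(vocab) for _ in vocab]
    for _ in range(epochs):
        for current, following in zip(sequence, sequence[1:]):
            row = weights[index[current]]
            predicted = softmax(row)
            for j in range(len(vocab)):
                occurred = 1.0 if j == index[following] else 0.0
                # Implicit error signal: what occurred minus what was predicted.
                row[j] += lr * (occurred - predicted[j])
    return weights, index

def predict(weights, index, current):
    """Return the learned distribution over the word that follows `current`."""
    probs = softmax(weights[index[current]])
    return {word: probs[i] for word, i in index.items()}

# Hypothetical input: one word has a single continuation, another has two.
sequence = ["little", "kitty", "pretty", "doggie",
            "little", "fish", "pretty", "doggie"] * 10
vocab = sorted(set(sequence))
weights, index = train_predictor(sequence, vocab)

print(predict(weights, index, "pretty"))  # most of the mass should sit on "doggie"
print(predict(weights, index, "little"))  # mass should be split between "kitty" and "fish"
```

Every transition in the input supplies a correction, with no teacher marking anything as wrong. That is the sense in which prediction could provide corrective information at far more time points than explicit negative evidence.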

Despite extensive evidence suggesting that adults generate predictions while comprehending language, and a few exciting studies showing similar abilities in early childhood, no studies have yet tested the hypothesis that infants generate predictions concerning sequences of sound. We know that infants are of course highly attuned to sequential information in language, including the statistical knowledge described above. However, these tasks are essentially off-line, with measurement after learning has already occurred. We also know that in nonlinguistic tasks, infants appear to generate on-line predictions. For example, infants generate anticipatory eye movements when exposed to patterns of shapes (e.g., Canfield & Haith, 1991; Canfield, Smith, Brezsnyak, & Snow, 1997; Haith, Hazan, & Goodman, 1988; Haith, Wass, & Adler, 1997), and can even do so when the patterns are removed and the screen is blank (Richardson & Kirkham, 2004). Infants compute trajectories, anticipating where objects will appear (e.g., Johnson, Amso, & Slemmer, 2003), and their hand movements suggest computation of expectancies (e.g., von Hofsten, Vishton, Spelke, Feng, & Rosander, 1998). However, to date, the only infant studies examining on-line anticipation in a linguistic task have involved speech perception (McMurray & Aslin, 2004): infants who have learned an arbitrary sound/object correspondence can use the sound to predict the object’s location, as indexed by their anticipatory eye movements.

We have recently begun to develop a methodology to ask whether infants generate anticipations on-line during sequential linguistic events. To do so, we have borrowed from prior studies that use eye movements to interrogate infant predictions. As a starting point, we began by examining infant processing of extremely simple grammatical structures: determiner–adjective–noun sequences. Of course, our studies concern spoken language, but the dependent variable concerns infants’ eyes, so the method requires that the infant link spoken words with visually presented objects. In our first study, we decided to simply ask whether infants show different levels of anticipation from word to word when the word sequence is deterministic—that is, 100% probability from one word to the next—than when the word sequence is probabilistic—50% probability from one word to the next (Romberg & Saffran, 2009). The targets were names of animals that our 16-month-old participants would already likely know; the names were paired with pictures shown on a large computer screen. These noun targets were preceded by adjectives. Our principal manipulation concerned the distribution of the adjective/noun pairs. Some pairs were deterministic: e.g., pretty always preceded doggie. Other pairs were probabilistic: e.g., little preceded kitty on half the trials and fish on the other half; the specific pairing of nouns and adjectives was counterbalanced. Infants listened to several minutes of speech in which this distributional information was provided: e.g., “This little kitty and one pretty doggie and one little fish and . . .”. During this training phase, infants also watched pictures of animals on the screen. Each of the animals always appeared in the same position—right or left—on the screen and flashed when it was spoken in the accompanying audio stream. This gave the infants the opportunity to learn the positions of the animal pictures while also learning about the adjective/noun distributions.

Following this brief training, infants were tested to see if they generated anticipatory looks as a function of the distributions of the adjective/noun pairs. Unlike the visual stimuli presented during training, the screen was blank during the test until the onset of the noun when the matching picture would flash on the screen. However, the auditory materials were the same as those presented during exposure. We hypothesized that if infants were using the adjectives to predict the upcoming nouns, we should see anticipatory looks to the position where the noun picture would occur, prior to the noun/picture event. In particular, we predicted a difference between deterministic pairs and probabilistic pairs. In the deterministic pairs, infants could—in principle—generate an expectation regarding the upcoming noun based on the adjective. However, in the probabilistic pairs, no information was available to tell the infant which noun would follow the adjective. Indeed,
as hypothesized, infants only made reliable anticipatory eye movements to the target location (on the blank screen) for the deterministic pairs using the adjective as a cue to the upcoming target noun. For the probabilistic pairs, infants did not reliably fixate the target location until the onset of the noun itself. These results suggest that the statistics of the input influenced infants’ eye movements in response to sequential linguistic stimuli, as we would expect if infants generate on-line expectations about upcoming linguistic events.

While this is just an initial study, and the methodology is still under development, these results suggest that like adults and young children, infants can use distributional information in speech to generate expectations about what might come next. On the basis of this view, infants are not static computers, using a set of algorithms to perform some set of computations. Instead, infants may be engaged in a dynamic process of using whatever information they can amass that might help to generate informative predictions, thereby facilitating comprehension and, eventually, production. The goal of our ensuing research will be to determine how infants assess which aspects of linguistic input are informative and which are not, and to test the related hypothesis that part of the learning process entails input-driven error correction.

The initial reports about statistical learning in infants concerned word segmentation: tracking the probabilities of syllable co-occurrences. At the time of the Saffran et al. (1996) Science paper, it was unclear how broad the claims should be. While the paper itself restricted the claims to the domain of word segmentation, an accompanying “Perspective” piece broadened the implications to include critiques of nativist approaches to syntax acquisition (Bates & Elman, 1996), and a furor erupted in the “Correspondence” pages of subsequent issues. Was the claim simply that statistical learning played a role in word segmentation—a relatively uncontroversial idea, given that on any theory, word boundary information must be learned? Or was the intent to argue that statistical learning functioned for tasks up the linguistic food chain, including the acquisition of syntax?

At the time, we viewed this to be rightly an empirical question and the decade that followed has seen the publication of numerous adult studies focusing on the role of statistics in syntax learning (Lany, Gomez, & Gerken, 2007; Saffran, 2001a, 2002; Thompson & Newport, 2007). The broad claim is that, analogous to the work in segmentation and other phonological processes, learners can track statistical regularities across words and/or word classes that afford detection of syntactic patterns. Importantly, the presence of multiple correlated cues, such as prosodic or phonological regularities, appears to facilitate this process (Gerken, Wilson, & Lewis, 2005; Gomez & Lakusta, 2004; Kaschak & Saffran, 2006; Morgan, Meier, & Newport, 1987), just as observed in word segmentation.

Nevertheless, much remains unknown about the manner in which these regularities are learned. Some of these issues revolve around the long-standing rules versus statistics debates, whose roots lie in the fracas that emerged following the publication of the original Parallel Distributed Processing volume (Pinker & Prince, 1988; Rumelhart & McClelland, 1987). More recent incarnations of this debate have surrounded artificial language learning studies involving different types of syllable-level computations tracked over streams of speech. To what extent do generalizations in these tasks require rule-level processes (Pena, Bonatti, Nespor, & Mehler, 2002) as opposed to statistical processes (Perruchet, Tyler, Galland, & Peereman, 2004; Seidenberg, MacDonald, & Saffran, 2002)? That is, can statistical learning operate over primitives that are more abstract than physically available syllables or words, or is this necessarily the purview of algebraic rules (Marcus, 2000, 2001)?

In order to be useful for grammar acquisition, the relevant computation(s) must be able to operate over classes or categories. As observed 50 years ago by Chomsky (1959), the acceptability of nonsensical sentences like “Colorless green ideas sleep furiously” means that our
representations must supersede word-to-word transitions to include transitions between categories of words (e.g., nouns, verbs, adverbs, etc.). The question, again, becomes what “counts” as statistical. On some views, only observable elements can be tracked in this fashion (Marcus & Berent, 2003). There are many ways to address these representational questions, ranging from computational models designed to implement one process or another to adult studies that carefully manipulate structures in artificial grammars. Our approach has largely involved studies of infant learners, in an attempt to ascertain which sorts of regularities they detect.

In one set of studies, we exposed 12-month-old infants to two types of grammatical structures (Saffran et al., 2008). One grammar structure contained statistical cues to phrase structure, providing infants with information concerning the types of words that typically co-occurred together. In the other grammar structure, these particular cues were absent, though the grammar was otherwise equally complex. Across several experiments manipulating the size of the grammar structure, we consistently found the same result: learning only occurred when statistical dependencies within phrases were present. These findings mirrored previous results with adults and children (Saffran, 2002), which suggested that learners’ focus on these statistical regularities was not specific to language.

In infant artificial grammar studies, the critical test contrasts compare infants’ responses to familiar (grammatical) versus novel (ungrammatical) sentences, which violate one or more of the regularities in the language. In the case of infant artificial grammar learning studies, while it is clear that infants have learned something about the patterns that differ between the grammatical and ungrammatical sentences, the nature of this knowledge remains unclear. One possibility is that infants have learned the rules of the language, which are then violated in the ungrammatical sentences. Another possibility is that infants have learned the (high) probability sequences of the language, which differ from the low probability sequences in the ungrammatical sentences. A third possibility is a hybrid system, in which infants learn the probabilities with which rules occur, and detect violations as a function of the rules’ likelihood.

These possibilities are extremely challenging to disentangle empirically, particularly given the relatively simple structures typically used in infant language studies—and also given that patterns can be tracked across various levels of representation, from individual exemplars (e.g., words) to categories (e.g., grammatical classes). In a recent study, we began to investigate this issue by manipulating the types of ungrammatical sentences used at test (Saffran, 2009). Twelve-month-old infants were first exposed to a small grammar, written over small lexical categories. Critically, some of the transitions in the grammar were high probability (100%) while others were not (50%). Infants were then tested on grammatical sentences versus two different types of ungrammatical sentences (between-subjects manipulation); all test sentences were novel. In one group, infants were tested on grammatical sentences versus ungrammatical sentences that violated a high-probability transition. In the second group, infants were tested on grammatical sentences versus ungrammatical sentences that violated a low-probability transition. Importantly, both types of transitions were equally frequent in the exposure corpus. We hypothesized that if infants were responding solely to grammaticality, then discrimination performance should be equivalent across the two groups, as the ungrammatical sentences both violated the same number of transitions and were matched for frequency of transitions. However, if infants were attuned to the probability of the transitions presented during training, we hypothesized that violations of the high-probability transitions should be more readily detected than the violations of the low-probability transitions.

The data were consistent with the latter prediction: only those infants tested on violations of high-probability transitions discriminated between the grammatical and ungrammatical sentences. Importantly, data from a no-exposure control group confirmed that these results were due to language exposure, not idiosyncratic features of the test items. These data thus support the hypothesis that at least when learning

artificial grammars, infants are sensitive to the probabilities with which words (or possibly
word classes) co-occur. To the extent that infant language learning is subserved by the same
mechanisms as adult language processing, these results are consistent with myriad results
from sentence processing, suggesting that adult comprehenders are attuned to the statistical
properties of syntactic structures at multiple grains of analysis, from concrete to abstract
(e.g., MacDonald & Seidenberg, 2006; Seidenberg & MacDonald, 2001).

One major critique of studies focused on statistical learning is the artificial nature of
the experimental tasks. These methods originated from a tradition in cognitive psychology
and experimental psycholinguistics, which involved the use of artificial language materials
(Gomez & Gerken, 2000). Dating back to the 1970s, researchers used these miniature languages
to test specific hypotheses—with adults—concerning the types of information used by language
learners (Braine, 1987; Moeser & Bregman, 1973; Morgan et al., 1987; Morgan, Meier, &
Newport, 1989; Morgan & Newport, 1981; Smith, 1966). While these methods were clearly very
artificial, like those used throughout cognitive psychology at the time, they permitted
researchers to isolate particular cues hypothesized to influence language learning, while
controlling other potential confounding variables. At the same time, the field of implicit
learning developed parallel methodologies using artificial grammars, but with a focus
primarily on learning and memory rather than on the acquisition of structures found in
natural languages (Allen & Reber, 1980; Reber, 1967, 1993). Despite largely separate
literatures during the 1980s and 1990s, these two literatures are beginning to converge in
an exciting way, generating testable predictions that constrain theories of both implicit
learning and language acquisition (Perruchet & Pacton, 2006).

Nevertheless, one can critique both literatures—artificial language learning and implicit
learning—for their use of highly artificial, simplified materials. This issue is gaining
attention; indeed, a summer workshop in 2007 was dedicated to the topic of “Current Issues
in Language Acquisition: Artificial and Statistical Language Learning.” The question is
whether the mechanisms that appear to subserve learning in simplified laboratory tasks are
also operating “in the wild”; that is, given more naturalistic types of language input.
Again, it is necessarily the case that infant language learners exploit cues beyond
sequential statistics. Indeed, it may be the case that sequential statistics help learners
to discover other cues in the input, which may be specific to an individual language and
thus require some initial learning. For example, infants in a word segmentation task can use
transitional probability statistics to discover an additional word boundary cue in the
input, even one that is entirely novel, which can then be used for subsequent segmentation
(e.g., Sahni, Seidenberg, & Seidenberg, 2009).

Across several lines of research, we have begun to ask whether statistical learning in these
artificial tasks translates into actual language learning skill. We know that infants do
appear to be sensitive to natural language statistics, at least in some limited domains. For
example, infants are known to track phonotactic probability information in their native
language: the likelihood that certain phonemes will co-occur in particular positions within
a word (Friederici & Wessels, 1993; Jusczyk, Friederici, Wessels, & Svenkerud, 1993;
Jusczyk, Luce, & Charles-Luce, 1994). Laboratory tasks have demonstrated that infants can
make use of these phonotactic probabilities when segmenting novel words from fluent speech
(Mattys & Jusczyk, 2001; Mattys, Jusczyk, Luce, & Morgan, 1999) and when mapping novel
labels to objects (Graf Estes, Edwards, & Saffran, 2009). Do artificial language statistics
operate in a similar way?

To date, we have pursued three different approaches to this question. The first approach has
involved experiments in which we ask whether the “words” from our artificial speech streams
are treated in a word-like fashion by infants. For example, 8-month-old infants appear to
integrate the nonsense words from our speech streams into English test sentences

(Saffran, 2001b). In related work, we found that typical sample, performance on the artificial
18-month-olds more readily mapped these language task was correlated with standardized
nonsense words to novel meanings (objects) measures of English vocabulary, providing an
than part-words, which violated the statistics of additional link between statistical learning skill
the artificial speech stream (Graf Estes, Evans, and native language proficiency (Evans et al.,
Alibali, & Saffran, 2007). Infants can also go 2009).
from segmentation to syntax, first finding non- The third source of evidence required us
sense words in fluent speech and then learning to move away from artificial language meth-
about their order, as required for natural lan- odologies. To what extent is statistical learn-
guage learning (Saffran & Wilson, 2003). Across ing performance tied to the use of these highly
all these studies, the results support the hypoth- unnatural structures? The statistical learning
esis that infants are doing the very things with studies performed in our laboratory and else-
these sound sequences that we would expect where typically entail speech that is monotone,
infants acquiring language to do. isochronous (that is, devoid of variations in
The second line of research is focused on rhythm), and synthesized. We have manipulated
individual differences in natural language these features to some extent, but always within
acquisition. Approximately 5%–10% of elemen- the purview of artificially devised systems. For
tary school-aged children are diagnosed with example, we manipulated the pitches of our arti-
specific language impairment (SLI): despite ficial speech streams to mimic infant-directed
nonverbal IQ in the normal range, their lan- pitch contours versus adult-directed pitch
guage skills lag significantly behind their peers. contours; the results suggested that statistical
We hypothesized that if individual differences learning was facilitated by the expa