This book was originally selected and revised to be included in the World Theses Series
(Holland Academic Graphics, The Hague), edited by Lisa L.-S. Cheng.
LANGUAGE ACQUISITION AND THE FORM OF THE GRAMMAR

DAVID LEBEAUX
NEC Research Institute

JOHN BENJAMINS PUBLISHING COMPANY
PHILADELPHIA/AMSTERDAM
The paper used in this publication meets the minimum requirements of
American National Standard for Information Sciences: Permanence of
Paper for Printed Library Materials, ANSI Z39.48-1984.
Library of Congress Cataloging-in-Publication Data
Lebeaux, David.
Language acquisition and the form of the grammar / David Lebeaux
p. cm.
Includes bibliographical references and index.
1. Language acquisition. 2. Generative grammar. I. Title.
P118.L38995 2000
401.93--dc21 00-039775
ISBN 90 272 2565 6 (Eur.) / 1 55619 858 2 (US)
© 2000 John Benjamins B.V.
No part of this book may be reproduced in any form, by print, photoprint, microfilm, or any
other means, without written permission from the publisher.
John Benjamins Publishing Co., P.O. Box 75577, 1070 AN Amsterdam, The Netherlands
John Benjamins North America, P.O. Box 27519, Philadelphia, PA 19118-0519, USA
Table of Contents

Acknowledgments
Preface
Introduction

Chapter 1. A Re-Definition of the Problem
1.1 The Pivot/Open Distinction and the Government Relation
    1.1.1 Braine's Distinction
    1.1.2 The Government Relation
1.2 The Open/Closed Class Distinction
    1.2.1 Finiteness
    1.2.2 The Question of Levels
1.3 Triggers
    1.3.1 A Constraint
    1.3.2 Determining the Base Order of German
        1.3.2.1 The Movement of NEG (Syntax)
        1.3.2.2 The Placement of NEG (Acquisition)

Chapter 2. Project-α, Argument-Linking, and Telegraphic Speech
2.1 Parametric Variation in Phrase Structure
    2.1.1 Phrase Structure Articulation
    2.1.2 Building Phrase Structure (Pinker 1984)
2.2 Argument-Linking
    2.2.1 An Ergative Subsystem: English Nominals
    2.2.2 Argument-Linking and Phrase Structure: Summary
2.3 The Projection of Lexical Structure
    2.3.1 The Nature of Projection
    2.3.2 Pre-Project-α Representations (Acquisition)
    2.3.3 Pre-Project-α Representations and the Segmentation Problem
    2.3.4 The Initial Induction: Summary
    2.3.5 The Early Phrase Marker (Continued)
    2.3.6 From the Lexical to the Phrasal Syntax
    2.3.7 Licensing of Determiners
    2.3.8 Submaximal Projections

Chapter 3. Adjoin-α and Relative Clauses
3.1 Introduction
3.2 Some General Considerations
3.3 The Argument/Adjunct Distinction, Derivationally Considered
    3.3.1 RCs and the Argument/Adjunct Distinction
    3.3.2 Adjunctual Structure and the Structure of the Base
    3.3.3 Anti-Reconstruction Effects
    3.3.4 In the Derivational Mode: Adjoin-α
    3.3.5 A Conceptual Argument
3.4 An Account of Parametric Variation
3.5 Relative Clause Acquisition
3.6 The Fine Structure of the Grammar, with Correspondences: The General Congruence Principle
3.7 What the Relation of the Grammar to the Parser Might Be

Chapter 4. Agreement and Merger
4.1 The Complement of Operations
4.2 Agreement
4.3 Merger or Project-α
    4.3.1 Relation to Psycholinguistic Evidence
    4.3.2 Reduced Structures
    4.3.3 Merger, or Project-α
    4.3.4 Idioms
4.4 Conclusion

Chapter 5. The Abrogation of DS Functions: Dislocated Constituents and Indexing Relations
5.1 Shallow Analyses vs. the Derivational Theory of Complexity
5.2 Computational Complexity and the Notion of Anchoring
5.3 Levels of Representation and Learnability
5.4 Equipollence
5.5 Case Study I: Tavakolian's Results and the Early Nature of Control
    5.5.1 Tavakolian's Results
    5.5.2 Two Solutions
    5.5.3 PRO as Pro, or as a Neutralized Element
    5.5.4 The Control Rule, Syntactic Considerations: The Question of C-command
    5.5.5 The Abrogation of DS Functions
5.6 Case Study II: Condition C and Dislocated Constituents
    5.6.1 The Abrogation of DS Functions: Condition C
    5.6.2 The Application of Indexing
    5.6.3 Distinguishing Accounts
5.7 Case Study III: Wh-Questions and Strong Crossover
    5.7.1 Wh-Questions: Barriers Framework
    5.7.2 Strong Crossover
    5.7.3 Acquisition Evidence
    5.7.4 Two Possibilities of Explanation
    5.7.5 A Representational Account
    5.7.6 A Derivational Account, and a Possible Compromise

References
Index
There are two ways of painting two trees together. Draw
a large tree and add a small one; this is called fu lao
(carrying the old on the back). Draw a small tree and add
a large one; this is called hsieh yu (leading the young by
the hand). Old trees should show a grave dignity and an
air of compassion. Young trees should appear modest and
retiring. They should stand together gazing at each other.
Mai-mai Sze
The Way of Chinese Painting
Acknowledgments
This book had its origins as a linguistics thesis at the University of Massachusetts.
First of all, I would like to thank my committee: Tom Roeper, for scores
of hours of talk, for encouragement, and for his unflagging conviction of the
importance of work in language acquisition; Edwin Williams, for the example of
his work; Lyn Frazier, for an acute and creative reading; and Chuck Clifton, for
a psychologist's view. More generally, I would like to thank the faculty and
students of the University of Massachusetts, for making it a place where creative
thinking is valued. The concerns and orientation of this book are very much
molded by the training that I received there.
Further back, I would like to thank the people who got me interested in all
of this in the first place: Steve Pinker, Jorge Hankamer, Jane Grimshaw, Annie
Zaenen, Merrill Garrett and Susan Carey. I would also like to thank Noam
Chomsky for encouragement throughout the years.
Since the writing of the thesis, I have had the encouragement and advice of
many fine colleagues. I would especially like to thank Susan Powers, Alan
Munn, Cristina Schmitt, Juan Uriagereka, Anne Vainikka, Ann Farmer, and Ana-
Teresa Perez-Leroux. I am also indebted to Sandiway Fong, as well as Bob
Krovetz, Christiane Fellbaum, Kiyoshi Yamabana, Piroska Csuri, and the NEC
Research Institute for a remarkable environment in which to pursue the research
further.
I would also like to thank Mamoru Saito, Hajime Hoji, Peggy Speas,
Juergen Weissenborn, Clare Voss, Keiko Muromatsu, Eloise Jelinek, Emmon
Bach, Jan Koster, and Ray Jackendoff.
Finally, I would like to thank my parents, Charles and Lillian Lebeaux, my
sister, Debbie Lebeaux, and my sons, Mark and Theo. Most of all, I would like
to thank my wife Pam, without whom this book would have been done badly, if
at all. This book is dedicated to her, with love.
Preface
What is the best way to structure a grammar? This is the question that I started
out with in the writing of my thesis in 1988. I believe that the thesis had a
marked effect in its answering of this question, particularly in the creation of the
Minimalist Program by Chomsky (1993) a few years later.
I attempted real answers to the question of how to structure a grammar, and
the answers were these:
(i) In acquisition, the grammar is arranged along the lines of subgrammars.
These grammars are arranged so that the child passes from one to the next,
and each succeeding grammar contains the last. I shall make this clearer
below.
(ii) In addition, in acquisition, the child proceeds to construct his/her grammar
from derivational endpoints (Chapter 5). From the derivational endpoints,
the child proceeds to construct the entire grammar. This may be forward or
backward, depending on what the derivational endpoint is. If the derivational
endpoint, or anchorpoint, is DS, then the construction is forward; if the
derivational endpoint or anchorpoint is S-structure or the surface, then the
construction proceeds backwards.
The above two proposals were the main proposals made about the
acquisition sequence. There were many proposals made about the syntax. Of
these, the main architectural proposals were the following.
(iii) The acquisition sequence and the syntax (in particular, the syntactic
derivation) are not to be considered in isolation from each other, but
rather are tightly yoked. The acquisition sequence can be seen as the result
of derivational steps or subsequences (as can be seen in Chapters 2, 3, and
4). This means that the acquisition sequence gives unique purchase onto the
derivation itself, including the adult derivation.
(iv) Phrase structure is not given as is, nor is it derived top-down, but rather is
composed (Speas 1990). This phrase structure composition (Lebeaux 1988)
is not strictly bottom-up, as in Chomsky's (1995) Merge, but rather
(a) involves the intermingling of units, (b) is grammatically licensed, and not
simply geometrical (bottom-up) in character (in a way which will become
clearer below), and (c) involves, among other transformations, the transformation
Project-α (Chapter 4).
(v) Two specific composition operations (and the beginnings of a third) are
proposed. Adjoin-α (Chapter 3) is proposed, adding adjuncts to the basic
nuclear clause structure (Conjoin-α is also suggested in that chapter). This
is quite similar to the Adjunction operation of Joshi and Kroch, and the
Tree Adjoining Grammars (Joshi 1985; Joshi and Kroch 1985; Frank 1992),
though the proposals are independent and not exactly the same. The second
new composition operation is Project-α (Chapter 4), which is an absolutely
new operation in the field. It projects open class structure into a closed
class frame, and constitutes the single most radical syntactic proposal of
this book.
(vi) Finally, composition operations, and the variance in the grammar as a
whole, are linked to the closed class set (elements like the, a, to, of, etc.).
In particular, each composition operation requires the satisfaction of a
closed class element, and a closed class element is implicated in each
parameter.
These constitute some of the major proposals that are made in the course of this
thesis. In this preface I would like to both lay out these proposals in more detail,
and compare them with some of the other proposals that have been made since
the publication of this thesis in 1988. While this thesis played a major role in the
coming of the Minimalist Program (Chomsky 1993, 1995), the ideas of the thesis
warrant a renewed look by researchers in the field, for they have provocative
implications for the treatment of language acquisition and the composition of
phrase structure.
Let us start to outline the differences of this thesis with respect to later
proposals, not with respect to language acquisition, but with respect to syntax. In
particular, let us start with parts (iv) and (v) above: that the phrase marker is
composed from smaller units.
A similar proposal is made with Chomskys (1995) Merge. However, here,
unlike Merge:
(1) The composition is not simply bottom-up, but involves the possible
intermingling of units.
(2) The composition is syntactically triggered in that all phrase structure
composition involves the satisfaction of closed class elements
(Chapters 3 and 4), and is not simply the geometric putting together
of two units, as in Merge, and
(3) The composition consists of two operations among others (these are
the only two that are developed in this thesis), Adjoin-α and
Project-α.
With respect to the idea that all composition operations are syntactically triggered
by features, let us take the operation Adjoin-α. This takes two structures and
adjoins the second into the first.
(1) Adjoin-α:
    s1: the man met the woman
    s2: who loved him
    →   the man met the woman who loved him
This shows the intermingling of units, as the second is intermeshed with the first.
However, I argue here (Chapter 4) that it also shows the satisfaction of closed
class elements, in an interesting way. Let us call the wh-element of the relative
clause, who here, the relative clause linker.
It is a proposal of this thesis that the adjunction operation itself involves the
satisfaction of the relative clause linker (who) by the relative clause head (the
woman), and it is this relation, which is the relation of Agreement, which composes
the phrase marker. The relative clause linker is part of the closed class set. This
relative clause linker is satisfied in the course of Agreement; thus the composition
operation is put into a 1-to-1 relation with the satisfaction of a closed class
head. (This proposal, so far as I know, is brand new in the literature.)
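The adjunction in (1), together with the linker requirement just described, can be sketched computationally. The following is a minimal Python illustration in my own formalization, not the book's notation: trees are nested lists, `adjoin_alpha` Chomsky-adjoins the relative clause into the host NP, and an assertion models the requirement that the adjunct carry a closed class linker for the operation to apply. The function names and tree encoding are hypothetical.

```python
# Trees are nested lists: [label, child, ...]; leaves are plain strings.

RELATIVE_LINKERS = {"who", "which", "that"}  # part of the closed class set

def yield_of(tree):
    """Return the terminal string (list of words) of a tree."""
    if isinstance(tree, str):
        return [tree]
    return [w for child in tree[1:] for w in yield_of(child)]

def adjoin_alpha(host, is_target, adjunct):
    """Chomsky-adjoin `adjunct` inside the first node of `host` satisfying
    `is_target`, copying the target node: [NP ...] -> [NP [NP ...] CP]."""
    # The closed class linker must be satisfied for the operation to apply:
    assert adjunct[1] in RELATIVE_LINKERS, "relative clause linker required"
    if isinstance(host, str):
        return host
    if is_target(host):
        return [host[0], host, adjunct]
    return [host[0]] + [adjoin_alpha(c, is_target, adjunct) for c in host[1:]]

# s1, the nuclear clause, and s2, the relative clause, as in (1):
s1 = ["S", ["NP", "the", "man"],
           ["VP", "met", ["NP", "the", "woman"]]]
s2 = ["CP", "who", ["VP", "loved", "him"]]

# Adjoin s2 into the NP headed by "woman" (the relative clause head):
composed = adjoin_alpha(s1, lambda t: t[0] == "NP" and "woman" in yield_of(t), s2)
print(" ".join(yield_of(composed)))  # the man met the woman who loved him
```

Note that the second structure ends up intermeshed inside the first, rather than being attached at the root, which is the intermingling property at issue.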
(2) Agree (relative clause head/relativizer) ↔ Adjoin-α
This goes along with the proposal (Chapter 4), which was taken up in the
Minimalist literature (Chomsky 1992, 1995), that movement involves the
satisfaction of closed class features. The proposal here, however, is that
composition, as well as movement, involves the satisfaction of a closed class
feature (in particular, Agreement). In the position here, taken up in the
Minimalist literature, the movement of an element to the subject position is put
into a 1-to-1 correspondence with agreement (Chapter 4 again).
(3) Agree (Subject/Predicate) ↔ Move NP (Chapter 4)
The proposal here is thus more thoroughgoing than that in the minimalist
literature, in that both the composition operation and the movement operation are
triggered by Agreement, and the satisfaction of closed class features. In the
minimalist literature, it is simply movement which is triggered by the satisfaction
of closed class elements (features); phrase structure composition is done simply
geometrically (bottom-up). Here, both are done through the satisfaction of
Agreement. This is shown below.
(4)                        Minimalism                Lebeaux (1988)
    Movement               syntactic (satisfaction   syntactic (satisfaction
                           of features)              of features)
    Phrase Structure       asyntactic (geometric)    syntactic (satisfaction
    Composition                                      of features)
This proposal (Lebeaux 1988) links the entire grammar to the closed class set:
both the movement operations and the composition operations are linked to
this set.
The set of composition operations discussed in this thesis is not intended to
be exhaustive, merely representative. Along with Adjoin-α, which Chomsky-adjoins
elements into the representation (Chapter 3), let us take the second, yet
more radical phrase structure composition operation, Project-α. This is not
equivalent to Speas's (1990) Project-α, but rather projects an open class structure
into a closed class frame. The open class structure also represents pure thematic
structure, and the closed class structure, pure Case structure.
This operation, for a simple partial sentence, looks like (5) (see Lebeaux
1988, 1991, 1997, 1998 for further extensive discussion).
The operation projects the open class elements into the closed class (Case)
frame. It also projects up the Case information from Determiner to DP, and
unifies the theta information, from the theta subtree, into the Case Frame, so that
it appears on the DP node.
The Project-α operation was motivated in part by the postulation of a
subgrammar in acquisition (Chapters 2, 3, and 4), in part by the remarkable
speech error data of Garrett (Chapter 4, Garrett 1975), and in part by idioms
(Chapter 4). This operation is discussed at much greater length in further
developments by myself (Lebeaux 1991, 1997, 1998).
I will discuss the subgrammar underpinnings of the Project-α approach in
more detail later in this preface. For now, I would simply like to point to
the remarkable speech error data collected by Merrill Garrett (1975, 1980), the
MIT corpus, which anchors this approach.
(5) [Tree diagram: Project-α. The theta subtree (open class), (woman (see man)),
    with woman marked +agent and man marked +patient, is projected into the
    Case Frame (closed class), (the __ (see (a __))), with the Determiner the
    marked +nom and the Determiner a marked +acc. The open class elements fill
    the empty NP slots, and the theta and Case information is unified on the DP
    nodes (+agent +nom; +patient +acc), yielding the woman see a man.]
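The unification performed by Project-α in (5) can be sketched as follows. This is a rough Python illustration in my own terms (the dictionary encoding and function name are hypothetical choices, not the book's notation): the theta subtree supplies open class words with their theta roles, the Case Frame supplies closed class determiners with Case features, and the operation fills the frame's slots while unifying both kinds of features on each DP.

```python
# Theta subtree (open class): pure thematic structure for "woman see man".
theta_subtree = {
    "verb": "see",
    "args": {"agent": "woman", "patient": "man"},
}

# Case Frame (closed class): determiners with Case features and open slots,
# (the __ (see (a __))).
case_frame = {
    "verb": "see",  # each copy of "see" carries partial information
    "slots": {"agent": {"det": "the", "case": "+nom"},
              "patient": {"det": "a", "case": "+acc"}},
}

def project_alpha(theta, frame):
    """Project the open class arguments into the Case frame's slots,
    unifying theta and Case information on each resulting DP node."""
    assert theta["verb"] == frame["verb"]  # the two partial verbs unify
    dps = {}
    for role, noun in theta["args"].items():
        slot = frame["slots"][role]
        dps[role] = {"det": slot["det"], "noun": noun,
                     # unified feature set on the DP node:
                     "features": {"+" + role, slot["case"]}}
    return dps

dps = project_alpha(theta_subtree, case_frame)
sentence = " ".join([dps["agent"]["det"], dps["agent"]["noun"],
                     theta_subtree["verb"],
                     dps["patient"]["det"], dps["patient"]["noun"]])
print(sentence)  # the woman see a man
```

The point of the sketch is that neither input alone determines the output: the open class words come from one structure, the determiners and Case features from the other, and the DP nodes carry the union of both.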
Garrett and Shattuck-Hufnagel collected a sample of 3400 speech errors. Of
these, by far the most interesting class is the so-called morpheme-stranding
errors. These are absolutely remarkable in that they show the insertion of open
class elements into a closed class frame. Thus, empirically, the apparent importance
of open class and closed class items is reversed: rather than open class
items being paramount, closed class items are paramount, and guide the derivation.
Open class elements are put into slots provided by closed class elements, in
Garrett's remarkable work. A small sample of Garrett's set is shown below.
(6) Speech errors (stranded morpheme errors), Garrett (personal
    communication) (permuted elements underlined in the original)

    Error                             Target
    my frozers are shoulden           my shoulders are frozen
    that just a back trucking out     a truck backing out
    McGovern favors pushing busters   favors busting pushers
    but the cleans twoer              two's cleaner
    his sink is shipping              ship is sinking
    the cancel has been practiced     the practice has been cancelled
    she's got her sets sight          sights set
    a puncture tiring device          tire puncturing device
As can be seen, these errors can only arise at a level where open class elements
are inserted into a closed class frame. The insertion does not take place correctly
(a speech error), so that the open class elements end up in permuted slots
(e.g. a puncture tiring device).
Garrett summarizes this as follows:

    why should the presence of a syntactically active bound morpheme be
    associated with an error at the level described in [(6)]? Precisely because the
    attachment of a syntactic morpheme to a particular lexical stem reflects a
    mapping from a functional level [i.e. grammatical functional, i.e. my theta
    subtree, D. L.] to a positional level of sentence planning
This summarizes the two phrase structure composition operations that I propose
in this thesis: Adjoin-α and Project-α. As can be seen, these involve (1) the
intermingling of structures (and are not simply bottom-up), and (2) the satisfaction
of closed class elements. Let us now turn to the general acquisition side of the
problem.
It was said above that this thesis was unique in that the acquisition sequence
and the syntax (in particular, the syntactic derivation) were not considered
in isolation, but rather in tandem. The acquisition sequence can be viewed as the
output of derivational processes. Therefore, to the extent to which the derivation
is partial, the corresponding stage of the acquisition sequence can be seen as a
subgrammar of the full grammar. The yoking of the acquisition sequence and the
syntax is therefore the following:
(7) Acquisition: subgrammar approach
    Syntax: phrase structure composition from smaller units
The subgrammar approach means that children literally have a smaller grammar
than the adult. The grammar increases over time by adding new structures (e.g.
relative clauses, conjunctions), and by adding new primitives of the representational
vocabulary, as in the change from pure theta composed speech to theta
and Case composed speech.
The addition of new structures (e.g. relative clauses and conjunctions)
may be thought of as follows. A complex sentence like that in (8) may be
thought of as a triple: the two units, and the operation composing them (8b).
(8) a. The man saw the woman who loved him.
b. (the man saw the woman (rooted), who loved him, Adjoin-α)
Therefore a subgrammar, if it is lacking the operation joining the units, may be
thought of as simply taking one of the units (let us say the rooted one) and
letting go of the other unit (plus letting go of the operation itself). This is
possible and necessary because it is the operation itself which joins the units: if
the operation is not present, one or the other of the units must be chosen. The
subgrammar behind (8a), but lacking the Adjoin-α operation, will therefore generate
the structure in (9) (assuming that it is the rooted structure which is chosen).
(9) The man saw the woman.
This is what is wanted.
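The triple in (8b), and the way a subgrammar falls out of it, can be sketched as follows. This is an illustrative Python fragment under my own encoding (the function names are hypothetical, and the adjunction is simplified to the string level): a grammar that contains the composing operation outputs the full sentence, while the subgrammar lacking it can only output the rooted unit, as in (9).

```python
def generate(triple, available_ops):
    """triple = (rooted unit, other unit, operation name, operation).
    A grammar containing the operation composes the two units; a
    subgrammar lacking it takes the rooted unit and lets go of the
    other unit (and of the operation itself)."""
    rooted, other, op_name, op = triple
    if op_name in available_ops:
        return op(rooted, other)
    return rooted

# (8b): (the man saw the woman (rooted), who loved him, Adjoin-alpha)
triple = ("the man saw the woman".split(),
          "who loved him".split(),
          "Adjoin-alpha",
          lambda rooted, adjunct: rooted + adjunct)  # string-level simplification

grammar2 = {"Adjoin-alpha"}   # the containing grammar of (10)
grammar1 = set()              # the subgrammar, lacking Adjoin-alpha

print(" ".join(generate(triple, grammar2)))  # the man saw the woman who loved him
print(" ".join(generate(triple, grammar1)))  # the man saw the woman
```

On this picture the difference between the two grammars is exactly one operation, which is the sense in which they form nested circles.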
Note that the subgrammar approach (in acquisition) and the phrase structure
composition approach (in syntax itself) are in perfect parity. The phrase structure
composition approach gives the actual operation dividing the subgrammar from
the supergrammar. That is, with respect to this operation (Adjoin-α), the
grammars are arranged in two circles: Grammar 1 containing the grammar itself,
but without Adjoin-α, and Grammar 2 containing the grammar including Adjoin-α.
(10) [Diagram: two concentric circles, Grammar 1 inside Grammar 2 (with Adjoin-α)]
The above is a case of adding a new operation.
The case of adding another representational primitive is yet more interesting.
Let us assume that the initial grammar is a pure representation of theta relations.
At a later stage, Case comes in. This is the hypothesis of the layering of
vocabulary: one type of representational vocabulary comes in, and does not
displace, but rather is added to, another.
(11) Stage I: theta  →  Stage II: theta + Case
The natural lines along which this representational addition takes place are
precisely given by the operation Project-α. The derivation may again be thought
of as a triple: the two composing structures, one a pure representation of theta
relations and one a pure representation of Case, and the operation composing them.

(12) ((man (see woman)), (the __ (see (a __))), Project-α)
     (the two instances of see, in the theta tree and the Case frame, each
     contain partial information, which is unified in the Project-α operation)
The subgrammar is one of the two representational units: in this case, the unit
(man (see woman)). That is a sort of theta representation, or telegraphic speech.
The sequence from Grammar 0 to Grammar 1 is therefore given by the addition
of Project-α.
(13) [Diagram: two concentric circles, Grammar 0 inside Grammar 1 (with Project-α)]
The full pattern of stage-like growth is shown in the chart below:

(14) Acquisition: Subgrammar Approach
     Add construction operations      Relative clauses,
     to simplified tree               Conjunction (not discussed here)
     Add primitives to                Theta → Theta + Case
     representational vocabulary
As can be seen, the acquisition sequence and the syntax (syntactic derivation)
are tightly yoked.
Another way of putting the arguments above is in terms of distinguishing
accounts. I wish to distinguish the phrase structure operations here from Merge,
and the acquisition subgrammar approach here from the alternative, which is the
Full Tree, or Full Competence, Approach (the full tree approach holds that the
child does not start out with a substructure, but rather has the full tree at all
stages of development). Let us see how the accounts are distinguished, in turn.
Let us start with Chomsky's Merge. According to Merge, the (adult) phrase
structure tree, as in Montague (1974), is built up bottom-up, taking individual
units and joining them together, and so on. The chief property of Merge is that
it is strictly bottom-up. Thus, for example, in a right-branching structure like see
the big man, Merge would first take big and man and Merge them together,
then add the to big man, and then add see to the resultant.
(15) [Tree diagrams: successive applications of Merge, starting from the units
     V (see), Det (the), Adj (big), N (man). First big and man are Merged to
     form the N big man; then the is added to form the DP the big man; finally
     see is added to form the VP see the big man.]
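The strictly bottom-up character of Merge in (15), and the consequence that matters for the argument later in this preface, can be sketched as follows (a schematic Python illustration; the pair encoding and helper names are my own, hypothetical choices):

```python
def merge(a, b):
    """Merge, schematically: combine two units into a new node,
    always extending the structure at the root."""
    return (a, b)

def subparts(tree):
    """All substructures occurring in the derivation of `tree`."""
    parts = {tree}
    if isinstance(tree, tuple):
        for child in tree:
            parts |= subparts(child)
    return parts

# see the big man, built strictly bottom-up:
step1 = merge("big", "man")    # (big, man)
step2 = merge("the", step1)    # (the, (big, man))
step3 = merge("see", step2)    # (see, (the, (big, man)))

# The government relation (see man) never arises as a substructure:
print(("see", "man") in subparts(step3))   # False
print(("big", "man") in subparts(step3))   # True
```

The last two lines are the crux: under strictly bottom-up Merge, (big man) is a subpart of the derivation but the nuclear unit (see man) never is, whereas under the substructure approach (see man) is the starting unit.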
The proposal assayed in this thesis (Lebeaux 1988) would, however, have a
radically different derivation. It would take the basic structure as being the basic
government relation: (see man). This is the primitive unit (unlike with Merge).
To this, the the and the big may be added, by separate transformations, Project-α
and Adjoin-α, respectively.
(16) [Tree diagrams:
     a. Project-α: the theta subtree (see man) is projected into the Case
        Frame, with Det the and an empty NP slot, yielding (see (the man)),
        with V and DP.
     b. Adjoin-α: the adjective (big) is adjoined into the NP, yielding
        (see (the (big man))), with V, DP, and NP.]
How can these radically distinct accounts (Lebeaux 1988 and Merge) be
empirically distinguished? I would suggest in two ways. First, conceptually, the
proposal here (as in Chomsky 1975 [1955], 1957, and Tree Adjoining Grammars,
Kroch and Joshi 1985) takes information nuclei as its input structures, not
arbitrary pieces of string. For example, for the structure The man saw the
photograph that was taken by Stieglitz, the representation here would take the
two clausal nuclear structures, shown in (17) below, and adjoin them. This is not
true for Merge, which does not deal in nuclear units.
(17) Adjoin-α:
     s1: the man saw the photograph
     s2: that was taken by Stieglitz
     →   the man saw the photograph that was taken by Stieglitz
Even more interesting nuclear units are implicated in the transformation
Project-α, where the full sentence is decomposed into a nuclear unit which is the
theta subtree, and the Case Frame.
(18) The man saw the woman
     Theta subtree: (man (see woman))
     Case Frame:    (the __ (see (a __)))

The structure in (18), the man saw the woman, is composed of a basic nuclear
unit, (man (see woman)), which is telegraphic speech (as argued for in
Chapter 2). No such nuclear unit exists in the Merge derivation of the man saw the
woman: that is, in the Merge derivation, (man (see woman)) does not exist as
a substructure of ((the man) (saw (the woman))).
This is the conceptual argument for preferring the composition operation
here over Merge. In addition, there are two simplicity arguments, of which I will
give just one here.
The simplicity argument has to do with a set of structures that children
produce which are called replacement sequences (Braine 1976). In these sequences,
the child is trying to reach (output) some structure which is somewhat too
difficult for him/her. To make it, therefore, he or she first outputs a substructure,
and then the whole structure. Examples are given below: the first line is the first
outputted structure, and the second line is the second outputted structure, as the
child attempts to reach the target (which is the second line).
(19) see ball (first output)
     see big ball (second output and target)
(20) see ball (first output)
     see the ball (second output and target)
What is striking about these replacement sequences is that the child does not
simply first output random substrings of the final target, but rather that the first
output is an organized part of the second. Thus in both (19) and (20), what the
child has done is first isolate out the basic government relation, (see ball), and
then added to it: with big and the, respectively.
The particular simplifications chosen are precisely what we would expect with
the substructure approach outlined here, and crucially not with Merge. With the
substructure approach outlined here (Chapters 2, 4), what the child (or adult) first
has in the derivation is precisely the structure (see ball), shown in example (21).
(21) [Tree: V, dominating V (see) and N +patient (ball)]
To this structure other elements are then added, by Project-α or Adjoin-α. Thus,
crucially, the first structure in (19) and (20) actually exists as a literal substructure
of the final form (line 2), and thus could help the child in deriving the
final form. It literally goes into the derivation.
By contrast, with Merge, the first line in (19) and (20) never underlies the
second line. It is easy to see why. Merge is simply bottom-up: it extends the
phrase marker. Therefore, the phrase structure composition derivation underlying
(20) line 2 is simply the following (Merge derivation).
(22) Merge derivation underlying (20) line 2
     (N ball)
     (DP (D the) (N ball))
     (see (DP (D the) (N ball)))
However, this derivation crucially does not have the first line of (20), (see
(ball)), as a subcomponent. That is, (see (ball)) does not go into the making of
(see (the ball)) in the Merge derivation, but it does in the substructure derivation.
But this is a strong argument against Merge. For the first line of the
outputted sequence of (20), (see ball), is presumably helping the child in
reaching the ultimate target (see (the ball)). But this is impossible with Merge,
for the first line in (20) does not go into the making of the second line, according
to the Merge derivation.
That is, Merge cannot explain why (see ball) would help the child get to the
target (see (the ball)), since (see ball) is not part of the derivation of (see (the
ball)) in the Merge derivation. It is part of the sub-derivation in the substructure
approach outlined here, because of the operation Project-α.
The above (see Chapters 2, 3, and 4) differentiates the sort of phrase
structure composition operations found here from Merge. This is in the domain
of syntax, though I have used language acquisition argumentation. In the
domain of language acquisition proper, the proposal of this thesis (the
hypothesis of substructures) must be contrasted with the alternative, which
holds that the child is outputting the full tree, even when the child is potentially
just in the one word stage: this may be called the Full Tree Hypothesis. These
differential possibilities are shown below. (For much additional discussion, see
Lebeaux 1991, 1997, 1998, in preparation.)
(23)                        Lebeaux (1988)                 Distinguished From

     Syntax                 phrase structure composition   both: (1) no composition;
                                                           (2) Merge
     Language Acquisition   subgrammar approach            Full Tree Approach
Let us now briefly distinguish the proposals here from the Full Tree Approach.
In the Full Tree Approach, the structure underlying a child sentence like ball
or see ball might be the following, in (24). In contrast, the substructure
approach (Lebeaux 1988) would assign the radically different representation
given in (25).
(24) Full Tree Approach:
     [IP e [TP e [AgrSP e [AgrOP e [VP [V e] [DP [D e] [NP ball]]]]]]]
     (all nodes other than the noun empty, e)
(25) Substructure Approach:
     [V′ V [N(+patient) ball]]
How can these approaches be distinguished? That is, how can a choice be made
between (25), the substructure approach, and (24), the Full Tree approach? I
would suggest briefly at least four ways (for full argumentation, consult
Lebeaux 1997, to appear; Powers and Lebeaux 1998).
First, the subgrammar approach, but not the full tree approach, has some
notion of simplicity in representation and derivation. Simplicity is a much-used
notion in science, for example in deciding between two equally empirically
adequate theories. The Full Tree Approach has no notion of simplicity: in
particular, it has no idea of how the child would proceed from simpler structures
to more complex ones. On the other hand, the substructure theory has a strong
proposal to make: the child proceeds over time from simpler structures to those
which are more complex. Thus the subgrammar point of view makes a strong
proposal linked to simplicity, while the Full Tree hypothesis makes none.
A second argument has to do with the closed class elements, and may be
broken up into two subarguments. The first of these arguments is that, in the Full
Tree Approach, there is no principled reason for the exclusion of closed class
elements in early speech (telegraphic speech). That is, both the open class and
closed class nodes exist, according to the Full Tree Hypothesis, and there is no
principled reason why initial speech would simply be open class, as it is. That is,
given the Full Tree Hypothesis, since the full tree is present, lexical insertion
could take place just as easily in the closed class nodes as the open class nodes.
The fact that it doesn't leaves the Full Tree approach with no principled reason
why closed class items are lacking in early speech.
A second reason having to do with closed class items concerns the
special role that they have in structuring an utterance, as shown by the work of
Garrett (1975, 1980) and Gleitman (1990). Since the Full Tree Approach gives
open and closed class items the same status, it has no explanation for why closed
class items play a special role in processing and acquisition. The substructure
approach, with Project-α, on the other hand, faithfully models the difference, by
having open class and closed class elements initially on different representations,
which are then fused (for additional discussion, see Chapter 4, and Lebeaux
1991, 1997, to appear).
A third argument against the Full Tree Approach has to do with structures
like see ball (natural) vs. see big (unnatural), given below.

(26) see ball (natural and common)
     see big (unnatural and uncommon)
Why would an utterance like see ball, maintaining the government relation,
be natural and common for the child, while see big is unnatural and uncommon?
There is a common sense explanation for this: see ball maintains the
government relation (between a verb and a complement), while see and big
have no natural relation. While this fact is obvious, it cannot be accounted for
with the Full Tree Approach. The reason is that the Full Tree Approach has all
nodes potentially available for use, including the adjectival ones. Thus there
would be no constraint on lexically inserting see and big (rather than see
and ball). On the substructure approach, on the other hand, there is a marked
difference: see and ball are on a single primitive substructure (the theta
tree), while see and big are not.
A fourth argument against the Full Tree Approach and for the substructure
approach comes from a paper by Laporte-Grimes and Lebeaux (1993). In this
paper, the authors show that the acquisition sequence proceeds almost sequentially
in terms of the geometric complexity of the phrase marker. That is, children
first output binary branching structures, then double binary branching, then triply
binary branching, and so on. This complexity result would be unexpected with
the Full Tree Approach, where the full tree is always available.
This concludes the four arguments against the Full Tree Approach, and for
the substructure approach in acquisition. The substructure approach (in acquisition)
and the composition of the phrase marker (in syntax) form the two main
proposals of this thesis.
Aside from the main lines of argumentation, which I have just given, there
are a number of other proposals in this thesis. I just list them here.
(1) One main proposal, which I take up in all of Chapter 5, is that the acquisition
sequence is built up from derivational endpoints. In particular, for some purposes,
the child's derivation is anchored in the surface, and only goes part of the
way back to DS. The main example of this can be seen with dislocated constituents.
In examples like (27a) and (b), exemplifying Strong Crossover and a
Condition C violation respectively, the adult would not allow these constructions,
while the child does.
(27) a. *Which manᵢ did heᵢ see t? (OK for child)
     b. *In Johnᵢ's house, heᵢ put a book t. (OK for child)
It cannot be simply said, as in (27b), that Condition C does not apply in the
child's grammar, because it does, in nondislocated structures (Carden 1986b).
The solution to this puzzle (and there exist a large number of similar puzzles
in the acquisition literature; see Chapter 5) is that Condition C in general applies
over direct c-command relations, including at D-Structure (Lebeaux 1988, 1991,
1998), and that the child analyzes structures like (27b) as if they were dislocated
at all levels of representation, thus never triggering Condition C (a similar
analysis holds of Strong Crossover, construed as a Condition C type constraint
at DS; van Riemsdijk and Williams 1981). That is, the child derivation, unlike
the adult's, does not have movement, but starts out with the element in a dislocated
position, and indexes it to the trace. This explains the lack of Condition C and
Crossover constraints (shown in Chapter 5). It does so by saying that the child's
derivation is shallow: anchored at SS or the surface, and the dislocated item is
never treated as if it were fully back in the DS position.
This is the shallowness of the derivation, anchored in SS (discussed in
Chapter 5).
(2) A number of proposals are made in Chapter 2. One main proposal concerns
the theta tree. In order to construct the tree, one takes a lexical entry, and does
lexical insertion of open class items directly into that. This is shown in (28).
(28) [V″ [N ] [V′ [V see] [N(+patient) ]]]
     with open class items (man, woman) inserted directly into the N positions
This means that the sequence between the lexicon and the syntax is in fact a
continuum: the theta subtree constitutes an intermediate structure between those
usually thought to be in the lexicon, and those in the syntax. This is a radical
proposal.
A second proposal made in Chapter 2 is that X′ projections project up as far
as they need to. Thus if one assumed the X-bar theory of Jackendoff (1977) (as I
did in this thesis; recall that Jackendoff had three bar levels), then an element
might project up to the single bar level, double bar level, or all the way up to the
triple bar level, as needed.
(29) N‴
     |
     N″
     |
     N′
     |
     N
[Figure: ordering relations (1 < 2, 2 < 5, 3 < 4, …) over cyclic domains, contrasted with argument skeleton domains (S, NP, S′, VP) and the ordering relations they induce.]
102 LANGUAGE ACQUISITION AND THE FORM OF THE GRAMMAR
Unlike the case with cyclic domains, there is no strict inclusion relation with the
argument skeleton domains. Rather, it is as if the upper bar level complements
of Jackendoff (1977) were not present in the original scanning, and are only
added later, in the course of the derivation.
3.3.3 Anti-Reconstruction Effects
Van Riemsdijk and Williams (1981) note a peculiar hole in the operation of
Condition C, as it applies to structures involving moved constituents. Consider
the data in (26).
(26) a. *Heᵢ likes those pictures of Johnᵢ.
     b. *Heᵢ likes the pictures that Johnᵢ took.
     c. ?*Which pictures of Johnᵢ does heᵢ like?
     d. Which pictures that Johnᵢ took does heᵢ like?
As expected, both (26a) and (26b) are ungrammatical; he c-commands the
coreferent John, and is out by Condition C. The interesting divergence occurs in
(26c) and (26d). Here, where John is contained in a fronted noun phrase,
Condition C applies differentially in the two cases. Where John is the object of
a picture-noun phrase, the sentence retains the ungrammaticality of the original
(26a); but when it is contained inside a relative clause, and this is fronted, then
the ungrammaticality suddenly disappears: (26d) is perfect.
At first glance, it may appear that this contrast can be handled by locating the
application of Condition C at a particular level, say, D-structure or S-structure.
Yet it is clear that this device will not work. If Condition C were located at DS,
then all the sentences above, (26a, b, c and d), would be expected to be bad.
This is not the case: (26d) is fine. On the other hand, if Condition C were
located at S-structure, and applied directly on structures (rather than using a
derived notion of c-command such as is found in Williams 1987), then the
grammar would allow in too much: both (26c, d) would be expected to be good.
But neither of these locations for Condition C would allow for the true result: the
grammaticality of (26d) and the ungrammaticality of (26c).
Van Riemsdijk and Williams themselves take a different tack. They suggest
that the degree of embedding of the name in the dislocated constituent is the
crucial factor in creating the contrast. In particular, they suggest that the name in
the relative clause in (26d) is embedded under an S′, while the name in (26c) is
embedded in a PP. This lack of embedding in (26c) is related, they suggest, to
the comparative grammaticality of that construction. The formulation that they
give is the following.
ADJOIN-α AND RELATIVE CLAUSES 103
(27) In a structure where NP is part of a dislocated constituent, NP is
     exempted from reconstruction if it is deeply embedded enough: part
     of an S′ or genitive phrase.
While the van Riemsdijk and Williams observation is extremely interesting, there
is some reason to believe that their statement of the constraint may be improved
upon, most particularly by examining the function of the structure containing the
name, rather than the degree of embedding per se. Let us consider a somewhat
more inclusive range of data.
Note first the following contrast:
(28) a. *Heᵢ believes the claim that Johnᵢ is nice.
     b. *Heᵢ likes the story that Johnᵢ wrote.
     c. *Whose claim that Johnᵢ is nice did heᵢ believe?
     d. Which story that Johnᵢ wrote did heᵢ like?
The contrast in (28) is rather striking. All constructions evince the same degree
of embedding: the name is embedded in an S′. As expected, the non-dislocated
structures show a Condition C violation. However, there is a clear distinction in
the sentences with dislocated NPs. In (28c), where the name John is contained
in an S′ which is a complement of the head noun claim, the ungrammaticality of
the initial undislocated structure is retained with full force. In (28d), where the
name is likewise contained in an S′, but where the S′ is part of an adjunct
relative clause associated with the dislocated head, the output becomes perfect.
In (28), then, it is the adjunct status of the containing structure, rather than the
degree of embedding of the name, which is associated with the difference in
grammaticality.
The same can be seen by an appropriate choice of PPs. As noted before, if
the name is contained in the PP complement of a picture-noun phrase, and
fronted, the resultant is ungrammatical. As suggested earlier, the internal
argument of a picture-noun is a sort of direct complement (Jackendoff 1977).
Consider what happens when the name appears in an indisputable adjunct.
(29) a. *Heᵢ destroyed those pictures of Johnᵢ.
     b. *Heᵢ destroyed those pictures near Johnᵢ.
     c. ?*Which pictures of Johnᵢ did heᵢ destroy?
     d. Which pictures near Johnᵢ did heᵢ destroy?
As expected, (29a) and (b) are ungrammatical: they violate Condition C. The
interesting contrast appears when the fronted NPs are in dislocated position.
When the name is part of a picture-noun phrase, and fronted, the output retains
the ungrammaticality of the base (29c). However, when it is part of a (locative)
adjunct, the ungrammaticality disappears (29d), though it is still present in the
putative D-structure (29b). Degree of embedding is not an issue, being held
constant (this same observation, that adjuncthood is what matters, not degree of
embedding per se, was made by Freidin 1986, independently: a fact brought to
the author's notice only after this was written).
We note the same contrast between (29c) and (d) in (30), using this time
derived nouns in their verbal vs. nominal uses (Lebeaux 1984, 1986).
(30) a. ?*Whose examination of Johnᵢ did heᵢ fear?
     b. Which examinations near Johnᵢ did heᵢ peek at?
The deverbal noun examination is being used in (30) in either a
simple referential sense (30b), or as a nominalized process (Lebeaux 1984,
1986). It is plausible to consider that the argument structures of these usages
differ. The data in (30) support that: the true argument John in (30a) violates
Condition C when dislocated, while the adjunct near John does not.
The data in (28)–(30), then, support the following conclusion: it is the
grammatical function or character of the structure within which the name is
contained which determines whether a Condition C violation occurs when it is
dislocated. Yet this grammatical function or character (the argument/adjunct
distinction) is irrelevant if the structure within which the name is contained is in
place.
3.3.4 In the Derivational Mode: Adjoin-α
One way of accounting for the data above, the anti-reconstruction facts,
would be to simply stipulate it as part of the primitive basis:

(31) If α, a name, is contained within a fronted adjunct, then Condition C
     effects are abrogated; otherwise not.
However, this is hardly an intuitive solution to the problem: stipulation (31), as
a primitive specification in UG (it would have to be in UG; there is not
sufficient evidence in the data to set this as a parameter or possibility from low-level
learning), is hardly satisfactory. Further, this sort of stipulation would
leave unexplained, in a rather a priori fashion, the relation of the anti-reconstruction
constraint to the more standard oddities associated with adjuncts, the
Condition on Extraction Domains (Huang 1982), however reconstructed. While
the solution proposed below will not directly relate the anti-reconstruction facts
to the Condition on Extraction Domains, it does, I believe, clear the way for such
a relation to be made: something which would not be the case if (31) were simply
adopted per force.
Let us return to the earlier theoretical construct: the argument skeleton. It
may be assumed that the Projection Principle requires that heads and their
arguments, and the arguments of these heads, and so on, must be present in the
base. That is, the entire argument skeleton must be present, insofar as it is a pure
instantiation of the relation argument-of. However, adjuncts need not be
present in the base. They may then be added later by a rule. Let us call this
Adjoin-α. Adjoin-α takes two tree structures, and adjoins the second into the first.
Let us assume that this always involves Chomsky-adjunction, copying the node
in the adjoined-to structure. Like Move-α, Adjoin-α applies perfectly freely, with
ungrammatical results ruled out by general principles, interpretive or otherwise.
(32) A: [XP [YP [WP [UP ]]]]    B: [ZP ]

     Adjoin-α output: [XP [YP [YP [WP [UP ]]] [ZP ]]]
Here, the subtree ZP has been adjoined into the phrase marker A, copying the
YP node. Relative clause adjunction would look like the following.
(33) A: [S … [VP [V ] [NP ]]]    B: [S′ ]

     Adjoin-α output: [S … [VP [V ] [NP [NP ] [S′ ]]]]
And the adjunction of a locative NP-modifying PP would look like this, if the
locative is adjoined to the object:
(34) A: [S … [VP [V ] [NP ]]]    B: [PP ]

     Adjoin-α output: [S … [VP [V ] [NP [NP ] [PP ]]]]
Here, the subtree B has been adjoined into A, copying the NP node.
We are left, then, with the base generating a set of phrase markers (one
specified as the root). The rule Adjoin-α is defined over pairs of phrase markers;
the rule Move-α is defined over a single phrase marker. Given absolutely
minimal assumptions, Move-α would be expected to apply both prior to, and
posterior to, any given adjunction operation, since it is simply defined as
movement within a phrase marker, and phrase markers exist both prior to, and
posterior to, Adjoin-α. There is thus no level at which Adjoin-α takes place; it
is simply an operation joining phrase markers, given minimal assumptions. We
will see below that there are empirical reasons as well to assume the free
ordering of Move-α and Adjoin-α.
I will assume that each individual substructure prior to Adjoin-α is well-formed.
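Adjoin-α, as just defined, can be sketched as an operation over pairs of phrase markers (a toy rendering under my own simplified encoding, not the book's formal definition; `node` and `adjoin_alpha` are invented names):

```python
# A toy rendering of Adjoin-alpha (my sketch): phrase markers are
# (label, children) pairs, and Chomsky-adjunction copies the targeted node,
# making the adjunct the sister of the copied segment.

def node(label, *children):
    return (label, list(children))

def adjoin_alpha(tree, target, adjunct):
    # Adjoin `adjunct` at the subtree `target`: the target node is doubled,
    # and the adjunct hangs under the new upper segment.
    if tree == target:
        return (tree[0], [target, adjunct])
    if not isinstance(tree, tuple):
        return tree
    label, children = tree
    return (label, [adjoin_alpha(c, target, adjunct) for c in children])

# (33): adjoining a relative clause S' into the object NP of phrase marker A.
obj = node('NP', 'which pictures')
A = node('S', node('NP', 'he'),
          node('VP', node('V', 'liked'), obj))
B = node("S'", 'that John took')

out = adjoin_alpha(A, obj, B)
# The object NP now consists of two NP segments, the lower one intact:
# ('NP', [('NP', ['which pictures']), ("S'", ['that John took'])])
```

Note that, as in (32)–(34), the adjoined-to node is copied rather than replaced, so the original phrase marker survives as a literal subpart of the output.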
Assuming a derivation of this type, where both Move-α and Adjoin-α are
available as operations, a solution is at hand for the anti-reconstruction effects
discussed above. Let us assume that Condition C is not earmarked for any
particular level: it applies throughout the derivation, and marks as ungrammatical
any configuration which it sees in which a name is c-commanded by a
coindexed pronoun.² Let us further assume that it applies directly over structures,
not using any derived or re-defined notion of c-command (Chapter 5). Assume
further, as discussed above, that the full argument skeleton must be present at all
levels of representation, by the Projection Principle, but that adjuncts need not
be.

2. Like Lasnik (1986), I will assume that Condition C is actually split into two separate conditions,
one which bars the c-commanding of a name by a pronoun, which is much stronger, and one which
bars the c-commanding of a name by another name, which is much weaker. As Lasnik notes, some
languages, e.g. Thai, allow c-command of the second sort, but not the first. I will discuss the first
constraint here, and restrict the term Condition C to that. The statement that Condition C applies
throughout the derivation may be too strong, given that sentences like (1b) are grammatical (consider
what the DS would be).

(1) a. *It seems to himᵢ that Johnᵢ's mother is nice.
    b. Johnᵢ's mother seems to himᵢ t to be nice.

One way to account for this is to restrict Condition C to apply at any point after NP movement. A
more radical, but more principled, solution, I believe, is to maintain that Condition C applies
everywhere, but to argue that the lexical insertion of names applies after NP movement. This
assumption is fairly natural given the theory in Chapter 4, but obviously has widespread implications.
Consider, now, the two relevant structures.
(35) a. Which pictures that Johnᵢ took did heᵢ like?
     b. ?*Whose claim that Johnᵢ took pictures did heᵢ deny?
The full DS for (35b) must be the following.
(36) *Heᵢ denied the claim that Johnᵢ took pictures.
(36) is the full argument skeleton. Deny subcategorizes for the internal argument
claim, and claim itself takes the clause that John took pictures as a complement
(not an adjunct). We must assume, then, that the full structure is present at DS,
by the Projection Principle.
This full structure, however, violates Condition C, since the name is
c-commanded by a coindexed pronoun. Therefore the sentence is marked as
ungrammatical at that level. Making the usual assumption that starred sentences
may not be saved by additional operations, this means that the grammar,
correctly, disallows the sentence.
Consider now the unexpectedly grammatical (35a). The corresponding non-
question for (35a) is (37).
(37) Heᵢ liked which pictures that Johnᵢ took.
Under standard assumptions, this sentence would be marked as ungrammatical at
DS. However, the corresponding SS (35a) is fully grammatical.
Under the theory proposed here, however, the deep structure underlying (35a)
is not (37). Rather, it is (38) (i.e. the full phrase structure trees corresponding
to the argument skeletons; I suppress PS detail for convenience).
(38) Argument skeleton 1: (S (NP He) (VP liked which pictures)).
     Argument skeleton 2: (S′ that John took).
     The rooted structure is 1.
To each of these argument skeletons Move-α may apply; Adjoin-α also applies,
adjoining argument skeleton 2 into argument skeleton 1. Move-α may also apply
to the resulting, full, sentence structure.
There are two possible derivations for (35a). In one, Adjoin-α applies prior
to Move-α.
(39) Derivation 1:
     a. Heᵢ liked which pictures.
     b. that Johnᵢ took.
       Adjoin-α →
     *Heᵢ liked which pictures that Johnᵢ took.
       Move-α →
     *Which pictures that Johnᵢ took did heᵢ like?

In this derivation, if Adjoin-α applies first, then Condition C will apply to the
intermediate structure, ruling it out.
There is, however, another derivation, given in (40).
(40) Derivation 2:
     a. Heᵢ liked which pictures.
     b. that Johnᵢ took.
       Move-α →
     a. Which pictures did heᵢ like?
     b. that Johnᵢ took.
       Adjoin-α →
     Which pictures that Johnᵢ took did heᵢ like?
In (40), Move-α, applying in argument skeleton 1, applies before Adjoin-α. This
derivation gives rise to the appropriate s-structure as well. However, unlike the
derivation in (39), as well as the standard derivation, there is no structure in (40)
which violates Condition C. This is because the adjunct clause containing John
has been adjoined into the representation after movement has taken place, and
after the fronted NP has been removed from the position in which it is
c-commanded by the pronoun he.
This is possible only for adjuncts: direct complements, like the complement
of claim, must be present at all levels (i.e. part of the rooted argument structure
at all levels).
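The logic of the two derivations can be sketched computationally (again my illustration, under simplified assumptions: trees as nested tuples, c-command read off sisterhood, and the indices written into the leaf strings):

```python
# A toy check of Condition C applied throughout the derivation (my sketch).

def subtrees(t):
    yield t
    if isinstance(t, tuple):
        for c in t:
            yield from subtrees(c)

def c_commands(root, a, b):
    # a c-commands b iff some node has a as one child and b inside a sister.
    for n in subtrees(root):
        if isinstance(n, tuple) and a in n:
            for sister in n:
                if sister is not a and b in subtrees(sister):
                    return True
    return False

def condition_c_ok(root, pronoun, name):
    return not c_commands(root, pronoun, name)

he, john = 'he_i', 'John_i'
rc = ('that', john, 'took')                       # the adjunct relative clause

# Derivation 1: Adjoin-alpha before Move-alpha. The intermediate structure
# "he_i liked [which pictures [that John_i took]]" is seen by Condition C:
d1_intermediate = (he, ('liked', ('which pictures', rc)))

# Derivation 2: Move-alpha first, then Adjoin-alpha into the fronted NP.
d2_stages = [
    (he, ('liked', 'which pictures')),                # base: no name present
    ('which pictures', ('did', (he, 'like'))),        # after Move-alpha
    (('which pictures', rc), ('did', (he, 'like'))),  # after Adjoin-alpha
]

print(condition_c_ok(d1_intermediate, he, john))            # False: ruled out
print(all(condition_c_ok(s, he, john) for s in d2_stages))  # True: survives
```

Only the Move-α-first derivation passes the throughout-the-derivation check, since the name enters the tree after the pronoun has ceased to c-command its position.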
The same analysis as given for relative clauses holds for locative adjuncts
in NPs. Recall the contrast in (30).
(41) a. ?*Whose examination of Johnᵢ did heᵢ fear?
     b. Which examinations near Johnᵢ did heᵢ peek at?
The DS of (41a) is given in (42); the DS of (41b) is given in (43).
(42) Root: (S (NP He) (VP feared (NP whose examination of John))).

(43) Root: (S (NP He) (VP peeked (at (NP which examinations))))
     Argument Structure 2: (PP near John)
In (42) only one transformation may apply: Move-α. Move-α fronts the
wh-phrase whose examination of John. However, coreference is disallowed
between he and John since he c-commands John at D-structure.
In (43), two transformations apply: Move-α and Adjoin-α. These may be
ordered in either fashion: Move-α applying in the root prior to the adjunction
operation, or after it. If Move-α applies after the adjunction operation, then
coreference between John and the pronoun is impossible, because Condition C
would be violated. However, there is still the derivation in which Move-α applies
in the root prior to Adjoin-α. This would look as follows.
(44) A: [S [NP He] [VP [V peeked] [PP [P at] [NP which examinations]]]]
     B: [PP near John]

       Move-α (in A) →
     A: [CP [SpecC which examinations] [C′ [C did] [S(=IP) [NP he] [VP [V peek] [PP [P at] [NP t]]]]]]
     B: [PP near John]

       Adjoin-α →
     [CP [SpecC [NP [NP which examinations] [PP near John]]] [C′ [C did] [S(=IP) [NP he] [VP [V peek] [PP [P at] [NP t]]]]]]
Here the +wh-taking verb, know, would select for the saturated +wh Comp. The
parametric difference between a language which requires (or allows) wh-movement
in the syntax, English, and a language which apportions it to LF, Chinese, then
comes down to the following specification on the wh-feature.

(64) +wh must be satisfied in the syntax (English)
     +wh need not be satisfied in the syntax (Chinese)
The Move-α rule itself, as it applies to wh-words, is put into 1-to-1 correspondence
with the satisfaction of that feature, exactly in the same way that the
Adjoin-α rule was put into 1-to-1 correspondence with the satisfaction of the RC
Linker. We may take the following to be equivalent:

(65) Move-α (as it applies to wh-words) ↔ Satisfy +wh feature

It is the satisfaction of the +wh feature which initiates Move-α.
This supports the finiteness claim of Chomsky (1981). Movement itself is
put into 1-to-1 correspondence with a contrast in a bit associated with a closed
class element.
Note an additional consequence. While Move-wh and Move-NP (Chomsky
1977b, van Riemsdijk and Williams 1981) are not specified as distinct operations
via the movement rule itself (just as is in general the case in most recent
work, where a single rule Move-α is assumed), they are differentiated in terms
of the lexical feature which must be satisfied. In the case of wh-movement this
is the wh-feature itself; we leave it open for now what it is with respect to
NP-movement.
With respect to parametric variation, then, we are led to the following
picture. Adjoin-α either need or need not apply in a language for relative clauses.
If it does not apply, a default operation of simple conjunction occurs. The
parametric situation in terms of the operation is the following.

(66) UG: (Adjoin-α)
     Default: Conjunction

We may conceive of these two operations as being in a bleeding relation in a
derivation. If Adjoin-α does not apply, then Conjunction will.
(67) English:              DS → SS by Adjoin-α
                           (Conjunction always bled)

     Co-relative language: DS → SS with no Adjoin-α
                           (Conjunction always applies)
The order of operation is the same in English and in a co-relative language. The
difference is that in English, Adjoin-α is always specified as applying (and so
conjunction, as a universal default, is always bled), while in a co-relative language,
Adjoin-α will never apply, and the default will take over.
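The bleeding relation can be pictured as simple control flow (my sketch; the function name and tuple encoding are invented):

```python
# The bleeding relation in (66)-(67) as a control-flow sketch (my
# illustration): when Adjoin-alpha is specified as applying, the universal
# default, Conjunction, is bled; when it is not, the default takes over.

def form_relative(matrix, relative, adjoin_alpha_applies):
    if adjoin_alpha_applies:
        # English-type grammar: Adjoin-alpha applies, bleeding Conjunction.
        return ('adjoined', matrix, relative)
    # Co-relative language: Adjoin-alpha never applies; default Conjunction.
    return ('conjoined', matrix, relative)

english = form_relative('the sheep kissed the monkey',
                        'who tickled the rabbit', True)
corelative = form_relative('the sheep kissed the monkey',
                           'who tickled the rabbit', False)
print(english[0], corelative[0])   # adjoined conjoined
```

The point of the sketch is only that the default branch is reachable exactly when the marked operation is absent, which is what "bleeding" amounts to here.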
This same difference may be looked at with respect to the specification of
the satisfaction of the relative clause linker. The linker either will have to be, or
will not need to be, satisfied in a given language.
(68) UG:
     RC linker must be satisfied  /  RC linker need not/must not be satisfied (Default)
The grammar, having made the decision in (68), will adopt one or the other sort
of relative clause.
The situation is similar, though subtly different, in the case of wh-movement.
At first, it appears to be identical. We have a cross-linguistic difference in
movement, depending on the satisfaction of the +wh element in Comp at S-structure.
(69) UG:
     +wh element must be satisfied at SS (English)  /  +wh element need not be satisfied at SS (Chinese)
If we assume that the nonspecification of information is always the default, then
the Chinese case would be the default case, with the wh-element in place, in
spite of the fact that it appears to be cross-linguistically less common.
However, there is an additional possible distinction in the wh-data, which
has been brought home most forcefully by recent work in Japanese (Hoji 1985),
and which also echoes a position that Joan Bresnan took further back (Bresnan
1977). Namely, given a dislocated topic or wh-word, it appears that there are two
derivations, in a language like Japanese, which could have produced it. In one,
the dislocated element (perhaps a topic) has been generated in its theta position
at D-structure, and is moved to the dislocated position in the course of the
derivation, by the rule Move-α. In the other derivation, the dislocated element is
generated in place at D-structure, this dislocated element is assigned a theta role,
and is somehow linked to the gap. The most plausible means by which this could
be done would be via operator movement plus ultimate co-indexing of the
operator with the dislocated phrase (as in Chomsky's 1977a analysis of tough-movement),
though it is possible that there is some other sort of indexing
procedure altogether. Note that in this case, one might wish to assume that there
is some sort of auxiliary theta role associated with the base-generated D-structure
position (perhaps in tough-constructions in English), so that the element gets a
theta role at DS, and that it additionally gets a theta role from the operator. See
also Chapter 5 (end of chapter), where I suggest that this possibility may hold for
wh-questions in early speech.
(70) Dislocated element:
     coindexed by Move-α  /  generated in place, linked to gap by Move-α of operator
These possibilities may be put together as follows:
(71) Wh-Dependencies:
       Dislocated:
         Movement of Element
         Base-generation of Element
       Not Dislocated (+wh feature need not be satisfied at SS)
In terms of parameter-setting, the parametric situation here would be determined,
first, by the necessity for the wh-feature to be satisfied by SS (or not), and
second, by the possibility of theta assignment to dislocated positions. There does
seem to be some evidence in the acquisition sequence for a switch between the
two left branches in (71) (see Chapter 5), but no evidence for the child adopting
the right branch in initial grammars, i.e. that the +wh feature need not be satisfied
at SS in English. This may be due to the fact that the right branch is not
a default option cross-linguistically, or due to the fact that there is no positive
evidence for that option, or due to some other factor. See Chapter 5 for discussion.
I should note that a Barriers-type analysis suggests the possibility that it is
not the wh-element itself which binds the trace, as suggested in traditional
analyses of wh-movement, nor an index associated with the phrasal category, as
suggested in van Riemsdijk and Williams (1981), but the wh-feature itself in Comp.
(72) traditional analysis:
     whoᵢ did John see tᵢ?

(73) van Riemsdijk and Williams analysis:
     I don't know whoᵢ (ᵢ John saw tᵢ)
(74) [S [NP I] [VP don't know [CP whoᵢ [C′ [Comp +whᵢ] [S John saw eᵢ]]]]]
     (after Chomsky 1986)
There is some evidence for this position; see Chapter 5 for discussion.
3.5 Relative Clause Acquisition
Let us now take a look at the acquisition of relative clauses. If the above analysis
of parametric variation is illuminating, it should carry over, ceteris paribus, to the
acquisition of relative clauses as well.
The major work on the acquisition of relative clauses is Tavakolian (1978);
previous work had been done by Sheldon (1974) as well as many others;
subsequent work has been done by Goodluck (1978), Solan and Roeper (1978),
Hamburger and Crain (1982), and again many others. The basic contention of
Tavakolian (1978) is the following: children attempt to parse RCs with the rules
(and computational resources) present in the grammar; to the extent that these
fail, they adopt a conjoined clause analysis of the relative clause. The relevant
structures, then, would be the following.
(75) Subject/Object relative (adult grammar):
     [S [NP The sheep] [VP [V kissed] [NP [NP the monkey] [S′ who tickled the rabbit]]]]
(76) Subject/Object relative (child grammar, when it fails):
     [S [S [NP The sheep] [VP kissed the monkey]] [S who [VP tickled the rabbit]]]
Tavakolian notes that children interpret this in the same manner as the corresponding
conjoined clause structure: The sheep kissed the monkey and tickled
the rabbit. Assuming, as she assumes, that there is a null element in the
conjoined clause (i.e. the relative), and assuming that there is the high attachment,
then the propensity for young children to interpret the relative clause as if
it were modifying the first subject (the sheep above) is explained. It is treated as
a sort of co-ordinate construction. This hypothesis differed from an earlier
hypothesis due to Amy Sheldon, who suggested that children attempt to maintain
parallelism in grammatical function between elements in the matrix and the
relative clause: thus, in Sheldon's view, subject-subject relatives would be well
understood (i.e. RCs where the subject had been relativized and associated with
the main clause subject), and object-object relatives, but not subject-object
relatives or object-subject relatives.
Tavakolian adduced a number of pieces of evidence for her position. The most interesting have to do with the difference in the comprehension of RCs with relative subjects, depending on whether they hang off of a subject or an object NP. For subject-subject relatives, the comprehension data is the following (Tavakolian 1978):
(77) SS relatives:
     The sheep that hit the rabbit kisses the lion.
         1               2                3

(78) Response category

     Age        Correct (12,13)   12,23   21,23   12,32   other
     3.0–3.6          18            2       1       0       3
     4.0–4.6          16            5       1       0       2
     5.0–5.6          22            0       0       2       0
     Totals           56            7       2       2       5

Note: A 12,13 response means that the child acts out 2 actions, one in which the first NP acts on the second (12), and one in which the first NP acts on the third (13). Similarly for all the number pairs.
It is clear that children do very well in the comprehension of this relative. For contrast, now, consider the object/subject relative: i.e., the subject relative off of an object. Note that according to the usual notions of parsing complexity, these structures should be easier than the subject relative off of a subject, since they involve right branching rather than left branching. The results are the following.
(79) Object/Subject relatives
     The lion kissed the duck that hit the pig.

     Age        Correct (12,23)   12,13   12,31   12,32   21,23   Other
     3.0–3.6           1            17      1       2       1       2
     4.0–4.6           4            15      3       1       0       1
     5.0–5.6           9            13      1       0       1       0
     Totals           14            45      5       3       2       3
The clear result, lessening over time, is that children choose a response in which the subject of the relative clause is the matrix subject, not the matrix object: i.e. the child takes the 12,13 response (as if "the lion kissed the duck and hit the pig"), not the 12,23 response ("the lion kissed the duck and the duck hit the pig"). This is a remarkable contrast with the Subject/Subject response, where the child's response is largely appropriate. Note also that it would be unexpected given the usual parsing theory of greater complexity in left-branching structures.
Given the analysis of relative clauses suggested in this chapter, however, the Tavakolian data follows immediately. Let us modify Tavakolian's conjoined clause analysis, so that it becomes not simply a parsing principle, but is integrated into the general structure of the grammar. Let us assume, in particular, as above, that relative clauses are not present in the base, but rather are added in in the course of the derivation. There are two ways for a language to do so. It may have recourse to the rule Adjoin-α, which adjoins a relative to its nominal head. This is shown in (80).
(80) structure 1: [S NP [VP V NP]]     structure 2: [S ...] (the relative)
     Output of Adjoin-α (relative adjoined to the nominal head): English-type relative
Or, it may conjoin the structure (Conjoin-α). For convenience, let us assume that this involves daughter adjunction under S. These are the parametric possibilities.
Let us now make the simplest assumption about the immature grammar: that it may, under conditions of computational complexity, have recourse not to the actual rule in the target grammar of the language to be learned, but rather to the default rule allowed by UG.
(81) The immature grammar may have recourse to the default rule.
In such a case, the grammar would be un-English, but it would not be un-UG. It would simply be displaying an option available in UG, but unavailable in the language to be learned. Note that this gives a rather different view of parameter-setting than is conventionally understood. Rather than the child first setting the parameter at a default, and attempting to learn the actual value, the child has the actual value as a target at all times. When the grammar/computational system fails, it takes the default, as a default. It is not, however, vacillating between two choices. A physical analogy would be: the grammar is a 3-space, and in the 3-space are hills to be climbed; these are the parameters to be set. In times of computational difficulty, the grammar may fall into a hollow. These hollows are the default settings. Both the hilltops and the hollows are specified possibilities of UG: the system, however, is trying to hill-climb. It is only under conditions of computational complexity that recourse is had to a setting not true to the target language.
Returning from these general considerations, let us consider how recourse to the default setting will explain the Tavakolian data. Let us take as a point of departure the assumption that children are having recourse to the default setting: Conjoin-α. Let us also, more tentatively, assume that this involves daughter adjunction of the RC S-node underneath S.

(82) Default grammar (children): Conjoin-α

(83) Conjoin-α is daughter adjunction under S.
What now happens in the case at hand? Assuming the default grammar in (82)
for both the subject and object relatives, the child would have the following
structural analyses.
(84) Relative off of subject:
     structure 1: [S NP [VP V NP]]     structure 2: [S ...]
     Output after Conjoin-α: [S [S NP VP] [S ...]] (the relative daughter-adjoined under S)

(85) Relative off of object:
     structure 1: [S NP VP]     structure 2: [S ...]
     Output after Conjoin-α: [S [S NP VP] [S ...]] (again daughter-adjoined under S)
In both cases, the RC is daughter adjoined to the S. However, this gives us precisely the result that we want with respect to interpretation. The relative clause does not form a constituent in relation to any NP: in all cases it hangs off of S. There is, therefore, no natural relative interpretation. However, the RC does lack a subject: it is a sort of predicate with an unsaturated subject position. This means that it may be construed with any sister NP to form a proposition. In the case of the relative clause off of a subject, the sister NP will be the subject of the sentence, and the appropriate interpretation will be given "accidentally," so to speak. In the case of the relative clause off of an object, the RC will again be daughter-adjoined under S, and the relevant sister will be the subject of the sentence: the object of the sentence would not c-command the relative clause. The interpretation that will be given, therefore, will be one in which the relative clause is construed as interpreted with the subject of the sentence: the wrong interpretation.
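The construal logic just described can be rendered as a small computational sketch. The encoding below is my own illustration, not the author's formalism: trees are nested tuples, the label "S-rel" marks the daughter-adjoined relative, and the assumption is that the unsaturated relative can only be saturated by a sister (c-commanding) NP, which under daughter adjunction to S is always the matrix subject.

```python
# Illustrative model (my own encoding, not the author's formalism):
# trees are (label, *children) tuples; a relative daughter-adjoined under S
# can only be construed with a sister NP, i.e. an NP that c-commands it.

def controller_of_relative(s_node):
    """Return the NP sister that the unsaturated relative is construed with."""
    children = s_node[1:]
    np_sisters = [c for c in children if c[0] == "NP"]
    rel_present = any(c[0] == "S-rel" for c in children)
    if not rel_present:
        return None
    # The relative lacks a subject; any sister NP can saturate it.
    # Under daughter adjunction to S, the only NP sister is the matrix subject.
    return np_sisters[0]

# Object relative, daughter-adjoined high (child grammar):
# "The lion kissed the duck that hit the pig"
child_tree = ("S",
              ("NP", "the lion"),
              ("VP", ("V", "kissed"), ("NP", "the duck")),
              ("S-rel", "that hit the pig"))

print(controller_of_relative(child_tree))  # ('NP', 'the lion')
```

The object "the duck" is buried inside VP and so is never a sister of the relative: the subject misconstrual (the 12,13 response) falls out of the geometry alone.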
By assuming that the child has recourse to the default operation, then, we
are able to account for the pattern of data in the misconstrual of these relative
clauses by children. Strikingly, no separate parsing principle is needed, but what
is needed is a radical restructuring of our understanding of the genesis of relative
clauses. In this way, the acquisition theory may actually lead the syntactic theory
to a novel analysis.
Is there any additional evidence that this sort of analysis is correct? In fact, there is. In Tavakolian's analysis, the difficulty for children in interpretation is linked to a structural difference in the phrase markers between children and adults: the Object/Subject relative is attached high by the children but not by adults. Solan and Roeper (1978) distinguish Tavakolian's account from the parallel structures account in the following way.
Solan and Roeper constructed sentences in which the relative clause, a subject relative, is attached off of an object. This is similar to the Object/Subject sentences above. However, they provided a crucial test to determine whether the high attachment analysis (conjoined clause analysis) is correct. Namely, they chose sentences which contained, in addition to the direct object, an obligatory prepositional object. These were sentences using the verb put.

(86) The lion put the turtle that saw the fish in the trunk.

The adult analysis of (86) would have the RC adjoined to the head noun the turtle. This is shown with the full line in the diagram in (87). Suppose, however, that because of computational difficulties the child cannot use the rule Adjoin-α. Then the default Conjoin-α should appear (the Tavakolian analysis). However, in this case, Conjoin-α, interpreted as S-conjunction, also must fail, because it would result in crossing branches (Solan and Roeper 1978). Hence there is only one further possibility: that the relative clause remains entirely unattached into the structure. Now if we make the additional obvious assumption that only rooted structures may be interpreted, this would mean that the relative clause would not be erroneously interpreted by the child as conjoined (i.e., as having the subject as its antecedent) for these put constructions, but rather would not be interpreted at all. In fact, this seems to be the case (88).
(87) [S [NP The lion] [VP [V put] [NP [NP the turtle] [S that saw the fish]] [PP in the trunk]]]
     (the "?" in the original diagram marks the high, daughter-adjoined attachment, which would cross the PP branch)
(88)
                            Conjoined Clause Response   Failure to Interpret RC
     Sentences with put                0                          42
     Sentences with push              40                           6
The Roeper and Solan data show clearly that Tavakolian's analysis, and the analysis here, are correct. The child first attempts to have recourse to the adjunction structure: Adjoin-α. If that fails, he or she attempts to conjoin the structure: Conjoin-α. If that fails, the relative clause must remain unrooted, and so uninterpretable.

(89) a. Adjoin-α (Correct Interpretation)
     b. If fails, Conjoin-α (Conjoined Clause Interpretation),
     c. If fails, remains unrooted (No interpretation)
The acquisition data and the syntactic analysis involving a syntactic rule of
adjunction are then in perfect parity.
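The three-step sequence in (89) can be sketched as a program. This is a deliberately simplified model on my own assumptions (not the author's): Adjoin-α "fails" when computational resources are exceeded, and Conjoin-α additionally fails when daughter adjunction under S would cross branches, as in the put-sentences with an obligatory PP.

```python
# Sketch of the fallback sequence in (89); the parameters and their
# interpretation are my own simplification of the discussion in the text.

def analyze_relative(resources_ok, conjunction_crosses_branches):
    """Return the interpretation the child assigns to the relative clause."""
    if resources_ok:
        return "Adjoin-α: correct interpretation"
    if not conjunction_crosses_branches:
        return "Conjoin-α: conjoined-clause interpretation"
    return "unrooted: no interpretation"

# push-sentence under resource failure: the Conjoin-α default applies
print(analyze_relative(False, False))  # Conjoin-α: conjoined-clause interpretation
# put-sentence under resource failure: Conjoin-α is also blocked, RC uninterpreted
print(analyze_relative(False, True))   # unrooted: no interpretation
```

The point of the sketch is the strict ordering: each option is tried only if the preceding one fails, mirroring the adjunction/conjunction/unrooted cascade.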
3.6 The Fine Structure of the Grammar, with Correspondences: The General
Congruence Principle
I wish now to present one view as to how the theory of levels, the theory of
parametric variation, and the theory of acquisition relate.
In the sections above, I have suggested that there is a rule adding relative clauses, and in general adjuncts, in the course of the derivation. However, this rule itself, Adjoin-α, has a certain substructure with respect to the derivation. We may consider it as an optional rule in UG ordered before the default associated with it, Conjoin-α, which it completely bleeds, if present.
(90) UG Specification:
     DS → SS, by: (Adjoin-α)
                   Conjoin-α

(91)  DS → SS          DS → SS              DS → SS
      Adjoin-α         (Adjoin-α)           Conjoin-α
      Conjoin-α        Conjoin-α
                       (Universal Grammar)
This general point of view, a parameter-setting approach, has the structure in (92), with G1 being a relative clause-head language like English, and G2 being a co-relative type language.

(92)  Universal Grammar

       G1        G2
         \      /
           G0
As noted in the previous section, this view, while of a parameter-setting type, differs from a standard parameter-setting view in at least two ways. First, the initial grammar, G0, is not a possible final grammar. The initial grammar has Adjoin-α as an optional rule, and this presumably is not an option, at least generally, for the final grammar. A more usual parameter-setting approach might have the child's grammar originally set at either G1 or G2, and changing over, if necessary, to the other type.
Second, this view diverges from a standard parameter-setting view with respect to the notion of setting a parameter. In the standard view, a parameter is a sort of cognitive switch: the child starts with the switch set in a particular direction, and the setting of the switch may change. Each position is stable. In the representation in (92), however, with an operation/default type arrangement, the original setting is neither of the final two settings, and the parameter is not so much a switch to be set as a hill to be climbed, a target of the system. Thus when the child fails to apply the rule Adjoin-α in his/her grammar, or in his or her analysis of a sentence, the default rule Conjoin-α is fallen into. It is as if the former (Adjoin-α) were a local maximum surrounded by a local minimum (Conjoin-α). Moreover, the grammar itself must be organized in such a fashion: that when the attempt at a target rule fails, a local minimum must exist to fall into, which itself is a possible specification of UG. This at least is an obvious conclusion to draw from such an approach.
There is, finally, an interesting property of this system in (92) which requires note. It supports the general sort of philosophical framework that has been set forth by Chomsky (1986a) and Fodor (1975, 1981) with respect to the nature of learning. The instance of parameter-setting given in (91) is very strange from the point of view of traditional learning-theoretic and behaviorist notions, or even more commonly held man-in-the-street views, according to which learning is an accretion of knowledge or information. What is actually happening in (91) is the reverse of that. Each of the two final grammars in (91), G1 and G2, actually has less information than the initial grammar G0 in terms of the number of bits of information in them. The initial grammar has two pieces of information associated with the operation Adjoin-α: Adjoin-α itself, and the parentheses ( ) surrounding it. The final grammars have less information than that. In the English type grammar, the parentheses surrounding Adjoin-α have been erased: this means that the single piece of information, Adjoin-α, as an operation, exists. The co-relative language contains even less information, containing neither Adjoin-α nor the parentheses. This means that both of the final grammars have fewer pieces of information in them than the initial grammar: the process of learning involves the erasure of information specified in UG, at least for this central case. This is very much in line with the sort of view of learning that Chomsky/Fodor propose, and much against an accretionist view.
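The counting argument can be made concrete. The encoding below is my own illustration: the UG specification is a set of two pieces of information (the operation Adjoin-α and the parentheses around it), and each final grammar is derived by erasure.

```python
# Counting "pieces of information," following the text's count: learning
# shrinks, rather than grows, the grammar. The set encoding is my own.

UG          = {"Adjoin-α", "( )"}              # initial UG specification
english     = UG - {"( )"}                     # erase parentheses: Adjoin-α operative
co_relative = UG - {"( )", "Adjoin-α"}         # erase both: only the default remains

assert len(english) < len(UG) and len(co_relative) < len(english)
print(len(UG), len(english), len(co_relative))  # 2 1 0
```

Both final grammars are proper "erasures" of the initial state, which is the formal sense in which learning here removes rather than accretes information.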
Let us return to the main problem. The general structure of the choice situation in (92) can be used both to describe the parametric situation cross-linguistically, and the child's acquisition problem. The child's undecided language may be associated with G0, and she may choose either of the two options, G1 or G2. There is an asymmetry in the choice of options, in that G2 is the default (cf. the section above): if the child is aiming for G1 as the target grammar she may fall into G2 under conditions of computational complexity, but not the reverse. Further, there are three welcome (or at least interestingly different) features of the analysis which recommend it: i) it introduces a developmental aspect, in that the initial grammar in UG, G0, is not a final grammar, ii) it views parameter-setting not so much as setting a switch, as attempting to reach a target grammar (climbing a hill): the default grammar is therefore fallen into, rather than initially specified, and iii) it views learning as the erasure, rather than the accretion, of information.
All this appears well and good. But there is a hidden difficulty at this point for the thesis of this work. According to the General Congruence Principle (Chapter 2) there is some congruence relation between the acquisition sequence and the organization of operations in the grammar. But it seems fairly clear that this is not the case for the analysis of relative clauses presented so far. The particular format of the parameter-setting approach given in (92), repeated below, can hold both for the structure of parameters cross-linguistically and for the child's setting of a parameter in her language.

(93)  UG: Structure of choice of grammars (parameter-setting)

       G1        G2
         \      /
           G0

The structure of operations in the grammar, in UG, has so far been presented as rather different: it involved the optional specification of Adjoin-α followed by the obligatory specification of Conjoin-α.
(94) UG: Structure of operations
     DS → SS, by: (Adjoin-α)
                   Conjoin-α
A language like English would erase the brackets in (94), while a co-relative language would erase the entire specification (Adjoin-α). However, if a principle like the General Congruence Principle is to be correct, suggesting that there is a deep correspondence between the structure of operations within a grammar (94), and the parametric-acquisitional choice (93), then the particular organizations in (93) and (94) cannot possibly both be correct: there is no isomorphism between them, as can be seen by simple inspection. Rather, either (93) or (94) must be mistaken, and there must be a common format for the two aspects of Universal Grammar.
Let us therefore proceed in this manner. Change the format in (93) from that in which a pure choice exists between grammars G1 and G2 to something of the format in (95).

(95) Parametric Specification in UG

     G0 (( --Adjoin-α--> G1 ) --Conjoin-α--> G2 )

The interpretation of the parentheses in (95) will be peculiar; I return to this below. And let us keep the format of the operations in the grammar virtually the same, changing it slightly in typography.
(96)  DS (( --O1: Adjoin-α--> s1 ) --O2: Conjoin-α--> s2 )

Ignoring parentheses, the figure in (96) is to be read as follows: the operation O1, Adjoin-α, maps the DS into structure s1; the operation O2 maps the representation into structure s2. s2 may be identified with SS in the case where no other operations have applied.
It is apparent that the structure of the grammars in (95) and the structure of
the operations in (96) are identical.
More on notation. The parentheses are not to be read as optionality. Rather, they are to be read as invisibility. The material inside the parentheses is invisible to the grammar/acquisitional device. The grammar develops by removing parentheses, allowing for the instantiation of operations already specified in UG. Second, the particular numbering on the grammars, operations, and structures (e.g., s1 vs. s2) is of no ultimate significance: s2, for example, may come directly after DS in a particular language's grammar.
Let us start now with (96). I suggested earlier that a child's initial grammar had recourse to a default rule of Conjoin-α to analyze relative clauses. Prior to this, however, the child's grammar filters out relative clauses altogether, only understanding the main proposition. The full developmental sequence is the following.

(97) Stage I: relative clause not understood at all (filtered out)
     Stage II: relative clause understood as generated by the rule Conjoin-α
     Stage III: relative clause understood as generated by the rule Adjoin-α

This full developmental sequence is represented in the diagram in (96) if we assume that: i) operations within the parentheses at time t are unavailable to the grammar at that time, and ii) the child progresses by removing parentheses in the UG representation, starting from the most external set and proceeding inward.
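The outermost-in removal of parentheses can be modeled directly. On my own encoding (not the author's notation), the UG specification is a nested list, material inside any level of bracketing is invisible, and development peels one layer of bracketing at a time.

```python
# Toy model of (97): visible operations are those outside all brackets
# (here: lists); development removes one layer of bracketing at a time,
# outermost first. The list encoding is my own illustration.

def visible_ops(spec):
    """Operations outside all brackets are visible to the grammar."""
    return [x for x in spec if isinstance(x, str)]

def remove_outermost(spec):
    """Peel one layer: contents of each embedded bracket surface one level up."""
    out = []
    for x in spec:
        out.extend(x if isinstance(x, list) else [x])
    return out

# (( Adjoin-α ) Conjoin-α ), schematically:
stage1 = [[["Adjoin-α"], "Conjoin-α"]]
stage2 = remove_outermost(stage1)
stage3 = remove_outermost(stage2)

print(visible_ops(stage1))  # []                          Stage I: RC filtered out
print(visible_ops(stage2))  # ['Conjoin-α']               Stage II: conjoined construal
print(visible_ops(stage3))  # ['Adjoin-α', 'Conjoin-α']   Stage III: adult English
```

The three printed states are exactly Stages I–III of (97): nothing visible, the default alone, then both operations.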
Consider how this would work. The initial grammar for the child would simply be the following:

(98)  DS (( --O1: Adjoin-α--> s1 ) --O2: Conjoin-α--> s2 )

The operations in parentheses would be unavailable to the child in the initial grammar; that is, both the operations Adjoin-α and Conjoin-α would be unavailable. This means that the representation in the grammar with respect to these operations would be simply that in (99).
(99) DS
Since by earlier assumption the DS representation is a pure representation of the rooted argument-of relation with no adjuncts present, this means that the child's initial analysis of a sentence like (100a) would be simply (100b), without Adjoin-α or Conjoin-α applying.

(100) a. The man saw the woman who knew Bill.
      b. The man saw the woman.

That is, the child's grammar at the stage in (99) would be doing the adjunct filtering that was noted earlier. This goes along with the observation that in initial stages, relative clauses are simply dropped by the child.
What then is the next stage? According to the above, parentheses are erased, starting from the outermost inward. Erasing the outermost parentheses in (98) would give rise to the following grammar:

(101)  DS ( --O1: Adjoin-α--> s1 ) --O2: Conjoin-α--> s2

Assuming that the material inside the parentheses is invisible to the child, this is simply equivalent to the grammar in (102), where the numbering on the operation (O2) is not significant.
(102)  DS --O2: Conjoin-α--> s2

Conjoin-α will be the operative operation in the grammar at this point. This means that the child will interpret a sentence with the actual bracketing in (103a) as having the bracketing in (103b), and one with the actual bracketing of (103c) as having the bracketing in (103d). This is because only Conjoin-α, not Adjoin-α, is part of the grammar at this point. This explains the Tavakolian result.
(103) a. (S The man saw (NP (NP the woman) (S who knew Bill)))
      b. (S The man (VP saw (NP the woman)) (S who knew Bill))
      c. (S (NP (NP The man) (S who knew Bill)) (saw the woman))
      d. (S (NP The man) (S who knew Bill) (saw the woman))
In (103b) the relative off of the object has been attached high, daughter-adjoined under S. This means that the subject is the only possible controller, i.e., coreferent item, with the subject variable who. This is indeed the mistake that children make, choosing the subject of the sentence as the subject of the RC. In (103d), the relative clause is again attached high. However, in this case, with the subject as controller, the correct interpretation is gotten even though the structural analysis is faulty. So the Tavakolian facts follow.
In the final stage in the acquisition of the construction, the innermost brackets are removed (e.g., in the acquisition of English). This gives rise to the following derivational representation.

(104)  DS --O1: Adjoin-α--> s1 --O2: Conjoin-α--> s2
This was exactly the sequence of operations that we noted earlier as the appropriate one for English ((91) above), with Adjoin-α continually bleeding Conjoin-α for the appropriate choice of structures.
Thus the sequence of grammars that the child passes through is accounted for if we assume the following representation in UG, together with a rule which removes outermost brackets on the basis of positive evidence.
(105)  DS (( --O1: Adjoin-α--> s1 ) --O2: Conjoin-α--> s2 )
Consider now the parametric situation. I suggested earlier that there was a
congruence between the structure of operations in levels, and the structure of
parameter-setting itself. This requires that the parametric structure of grammars
is the following:
(106)  G0 (( --Adjoin-α--> G1 ) --Conjoin-α--> G2 )
The parentheses are given the same interpretation as above: as invisibility, if present.
The first grammar would therefore be that in (107a); this would be read simply as (107b).
(107) a. G0 (( --Adjoin-α--> G1 ) --Conjoin-α--> G2 )
      b. G0
Neither the adjunction nor the conjunction operation would be an attribute of this grammar. While no human language apparently has this property (i.e., is a pure representation of argument structure with no adjunctual possibilities allowed: all are too rich for this), it is possible that certain subparts of natural language have precisely this property. I am thinking in particular of the lexicon, or lexical representation, which is often thought to represent argument structure, but not adjunctual structure or conjunction (Bresnan 1982; Zubizarreta 1987). The idea that the most primitive grammar is a pure representation of argument structure, and that this is the type of the lexicon, also goes along with the idea presented in Chapter 2, that the original grammar is the lexicon, where the lexicon itself has a tree-like structure.
The next ordered grammar would involve the removal of the outermost parentheses in (106). This would be the representation in (108a), which would be read as (108b) (recall that the subscripts bear no absolute significance).

(108) a. G0 ( --Adjoin-α--> G1 ) --Conjoin-α--> G2
      b. G0 --Conjoin-α--> G2
This would demarcate precisely the co-relative languages with no head-adjoined clause structure. These grammars are less rich structurally than those containing Adjoin-α, and are the first to be reached temporally.
The next and final grammar would be the one reached after the final, innermost, parentheses had been removed.
(109)  G0 --O: Adjoin-α--> G1 --O: Conjoin-α--> G2
       (arrows to be read as the addition of the operation to the grammar)
What is the significance of this representation? At first, it appears quite unreasonable. It states that the child, having started from an original G0, passes to a grammar which is characterized by having in addition the operation Adjoin-α. However, from there, he or she has the additional operation of Conjoin-α added into the grammar. But since Conjoin-α will not even be relevant now for the structures under consideration, what sense does this make?
I would suggest, however, that it is precisely this organization that is needed to allow for the fact that when the child fails with Adjoin-α, he or she falls into a grammar that is characterized by the rule Conjoin-α. If we assume the bifurcationist structure above, repeated again below, then there is no reason in the format of the grammars themselves why the child should fall into G2, failing G1. This is represented by the arrow, but that has no formal significance.
(110)
       G1  -- falls into -->  G2
         \                   /
          \                 /
                  G0
But given the format in (109), there is such an organization. Under normal conditions, the rule Conjoin-α is always ready to be added to the grammar in (109). However, for the relevant RC structures it is always bled, since the rule of Adjoin-α is known to apply. Consider now what happens upon the failure of Adjoin-α, let us say on a construction-by-construction basis. To represent this we may simply cross out the operation, and the resulting grammar will look like (111b) (recall that the numbering on grammars has no significance).
(111) a. G0 --(Adjoin-α, crossed out)--> G1 --Conjoin-α--> G2
      b. G0 --Conjoin-α--> G2
This grammar, however, is simply the grammar in which Conjoin-α holds: exactly the default grammar that was wanted. Thus by this particular arrangement of grammar, the retreat of a grammar into a default is accounted for formally, notationally, and not simply by fiat.
Rather than crossing out the operation (Adjoin-α), we may consider a failure under conditions of computational complexity as tantamount to the re-insertion of parentheses in a format in which they have already been removed. This would be equivalent to a regression to a state which is less specified, and closer to the original Universal Grammar representation. The grammar would thus regress to the grammar in (112a), which is read as (112b).
(112) a. G0 ( --Adjoin-α--> G1 ) --Conjoin-α--> G2
      b. G0 --Conjoin-α--> G2
Namely, the default grammar, in cases of computational complexity, would be
the grammar in which conjunction held. This is precisely the needed result.
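The retreat-by-re-insertion idea can be sketched in the same spirit. The encoding is again my own illustration: bracketed (here, tuple-wrapped) operations are invisible, the first visible operation bleeds the rest, and failure re-inserts the parentheses around the failed operation.

```python
# Sketch of (112): on computational failure, the parentheses around
# Adjoin-α are re-inserted, making it invisible again, so the grammar
# regresses to the Conjoin-α default. The tuple encoding is my own.

def operative(grammar):
    """Ops not wrapped in a bracket (tuple) are visible; the first bleeds the rest."""
    visible = [op for op in grammar if isinstance(op, str)]
    return visible[0] if visible else None

def retreat(grammar, failed_op):
    """Re-insert parentheses around the failed operation."""
    return [(op,) if op == failed_op else op for op in grammar]

english = ["Adjoin-α", "Conjoin-α"]
print(operative(english))                   # Adjoin-α
fallback = retreat(english, "Adjoin-α")
print(operative(fallback))                  # Conjoin-α
```

The fallback state is a well-formed grammar (the Conjoin-α default), not an arbitrary parse: the retreat is a move toward the UG representation, not out of the space of grammars.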
To summarize: in this section, I have argued that there is a principle, the General Congruence Principle, which relates structures of operations in a grammar, and the structure of parameters themselves. These are equivalent up to isomorphism, at least for the analysis of relative clauses. Second, the child proceeds by removing parentheses in a representation. This supports the Chomsky/Fodor position with respect to learning: in this case, at least, learning is not the accretion of information, but the removal of information, representing the removal of possibilities from a universally specified set. Third, under conditions of computational complexity the child falls into a default grammar involving conjunction rather than adjunction. This, however, is not due to a separate parsing principle (as Tavakolian suggests), but rather due to a retreat to a grammatical format which is closer to the UG format, with all parentheses included. Finally, the nature of parameter setting is taken to involve not so much the flipping of a switch, but the climbing of a hill. This represents a local maximum (the target grammar) surrounded by local minima (the default options). Both the target grammar and the fall-back grammars must be represented in UG.
3.7 What the Relation of the Grammar to the Parser Might Be
A common position in the acquisition literature has been that the grammar remains constant over time, and is hidden by the exercise of parsing strategies. That is, when the grammar/computational system fails to come up with an analysis, an exogenous parsing strategy enters in and returns an analysis not countenanced by the current grammar: the child's analysis, so to speak, falls out of the grammar, and returns a value which is not one of the possible permissible targets. In the following, I would like to take the position that that sort of masking of the grammar does not in fact occur: i.e., that it is not the case that the grammar remains constant and is masked by an autonomous system of parsing, production, etc., with its own separate principles. Rather, to the extent to which parsing and performance considerations matter, they do so via the grammar itself, either directly, in the sense that possible restrictions on left-to-right computations are taken up into and stated in the grammar, or indirectly, where if the child's analysis fails, it falls into another permissible grammar, as in the discussion above.
Clearly a complete argument for this position would be impossible at
present, given that so little is known about the parser. What I will do instead is
outline some possible relations of the parser to the grammar, with special
attention to the sorts of claims that have been made in the acquisition literature.
Let us imagine what a parsing account of ungrammaticality might be, by imagining two instances of it. The first is taken from Frazier (1979), in which she suggests that there may be a parsing ground for the that which is required with sentential subjects.

(113) a. That John loves horseradish is obvious.
      b. *John loves horseradish is obvious.
Frazier notes the following: suppose that there is a parsing principle like that of
minimal attachment. Then the sentential subject without a that complementizer
would, in the course of the parsing derivation, be immediately attached to the root.
(114) Parsing representation: left-segment
      [S(root) [NP John] [VP loves horseradish]]

This representation would then have to be reanalyzed at a later point, so that it was subordinated, at a later stage of the parse. Suppressing details:

(115) [S [S John loves horseradish] [VP is obvious]]   (reanalyzed as subordinated)
Such a reanalysis would not have to be done for the sentential subject marked with that, a subordination marker. Suppose that we assume that such a reanalysis is either: i) costly, or ii) impossible under these conditions. If the latter, then we would have the basis for a direct parsing account of the ungrammaticality in (113b). If the former, which is what Frazier assumes, then the necessity for the that-complementizer is a parsing-based necessity, but this is encoded in the grammar in a way which may not be based on a parsing vocabulary at all, for example in terms of proper government (Stowell 1981). By parsing vocabulary in the last sentence, I mean minimally an explanation which depends on left-to-rightness.
Let us take a second example, a direct parsing-grammatical account of
that-trace effects. This particular account is my own, and it is intended for
demonstration purposes only (though it may turn out that something like it is
true, if facts like the que/qui alternation in French could be handled). Assume
that the partial computations in the left-to-right parse must be grammatically
well-formed, fulfilling X' theory, proper government of null categories, etc.
Assume further that null closed class heads need not be postulated until the end
of selected domains of the parse, and that a limited amount of reanalysis, in
terms of addition of phrasal categories, is allowed (exactly how this is done, I
will leave unspecified).
Consider now the that-trace effect in (116).
(116) a. Who do you believe e is here?
b. *Who do you believe that e is here?
In (116a), the null category could be parsed as part of the matrix in a left-to-
right partial computation, if we assume that null categories in argument position
must be projected immediately, and that categories like the embedded IP are
projected at the time that their heads are encountered.
(117) Partial parse:

      [CP Who do you [VP [V believe] [NP e] ...]]

      e properly governed (in the partial parse)
On the other hand, no such partial parse exists for the construction with that. The
null category, if posited, will not be properly governed during the intermediate
parse, prior to the uncovering of Infl.
(118) [CP Who do you [VP [V believe] [CP [C that] [IP [NP e] ...]]]]

      e not properly governed
This would then constitute a parsing-grammatical explanation for that-trace
effects. It would make predictions as well: e.g. that that-trace effects should not
be collapsed with a general inability of extraction from subjects.
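The left-to-right character of this demonstration account can be made concrete with a toy simulation. In the sketch below, the lexicon and the predicate names are invented for illustration and are not part of the theory itself: an empty category posited at a given point of the partial computation counts as licensed only if the immediately preceding overt element is a properly governing head, so that an intervening that blocks licensing.

```python
# Toy simulation of the demonstration account of that-trace effects.
# The lexicon and predicate names are invented for illustration.

PROPER_GOVERNORS = {"believe"}   # heads that properly govern a null sister

def gap_licensed_after(prefix):
    """In a left-to-right partial computation, an empty category posited
    at the end of this prefix is properly governed only if the last
    overt element is a governing head (no complementizer intervenes)."""
    return prefix[-1] in PROPER_GOVERNORS

# (116a): "Who do you believe e is here" -- gap posited right after 'believe'
print(gap_licensed_after(["who", "do", "you", "believe"]))          # True
# (116b): "*Who do you believe that e is here" -- 'that' intervenes
print(gap_licensed_after(["who", "do", "you", "believe", "that"]))  # False
```

The point of the sketch is only the general type: well-formedness is checked on the partially computed object, not on the completed phrase marker.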
As noted above, this explanation is for demonstration purposes only: what
I would like to concentrate on is not the particular explanations above, but their
general type. These would constitute genuine parsing-theoretic explanations of
types of ungrammaticality. Frazier's explanation would restrict the set of
grammars by developing constraints on the possible reanalysis of a partially
parsed tree; the constraint directly above would restrict the set of grammars by
stating well-formedness conditions on partially computed objects. In this latter
case, these well-formedness conditions would be exactly the same conditions
which characterized the full phrase marker.
The claim is often made that the early linguistic system is more dependent
upon, or is masked by, the parser, but it is not quite clear what this means. On
this view, would it mean that there are more constraints of this type on the early
grammar? That they are stricter?
Further, it is unclear, in a terminological way, that one would want to call
the above constraints parsing constraints. Let us call a constraint a left-linear
constraint if it is a constraint on the building up of a tree (from a string), in a
left-linear fashion. The two example constraints above would be instances of left-linear
constraints. The parsing theorist, insofar as he or she is making claims
about the parser directly determining the properties of the grammar, is making
well-formedness claims about the formal, partially computed object in a left-to-right
analysis. Yet these constraints on left-linear analysis characterize the
speaker as well as the hearer. But then, insofar as such constraints exist, they
should simply be considered part of the grammar: i.e. a part of the grammar
concerned with the well-formedness of certain subtrees. That is, the grammatical
theory should be expanded to consider these as part of the grammar: they would
be part of some future grammar, they would be left-linear constraints in such a
grammar, and would not have a different ontological character than simple
grammatical constraints.
The following sorts of relations seem possible (focusing on the failure of
the grammatical system in acquisition).
(119) Role of the parser (ranking of hypotheses):
      a. Parsing considerations determine the grammar directly (at least
         partly): left-linear constraints (Frazier)
      b. Parsing considerations determine the grammar indirectly: on
         failure, the system falls into a less mature grammar (this book)
      c. Parsing considerations do not determine the grammar; the parser
         returns a value not in the grammar: the grammar is masked, the
         correct value returned but unavailable (Hamburger and Crain)
The parsing theory implicitly adopted in this chapter, and in the work as a
whole, is that parsing considerations indirectly determine the grammar, in the
sense that computational difficulties cause the system as a whole to fall into a
less mature system, where this system is both grammatically prior and computationally
simpler. That is, it is not so much that the parser determines the form of
the grammar, but that the parser (i.e. computational considerations) partly
determines what grammar one is in, out of a sequence of successive grammars
in acquisition. In this particular addendum, I have suggested that there may also
be ways by which the linking is more direct, to the extent that left-linear
constraints are directly stated in the grammar (see Weinberg 1988, for such a
view). Unfortunately, there have been very few grammatical-parsing theories of
this left-linear type, so it is difficult to gauge their range of application: see
Marcus, Hindle, and Fleck (1983) for an exception. In general, I would tend to
favor either of the first two sorts of approaches in (119), which may be broadly
distinguished from those which, when confronted with a nonadult structure in
acquisition, hold that the grammar is fully adult, but that it is not reached by the
parser: i.e. that the parser returns a value not in its permissible range. Rather, it
seems preferable to assume that the grammar is organized in such a way that
when the child is confronted with a structure too (computationally) difficult for
him or her to analyze, the grammar/computational system falls back as a unit to
a grammar/computational system in which the child can analyze the string and
return a permissible value in the less advanced system, even if some elements in
the string must be ignored (and part of the meaning may be ignored or erroneously
construed). This would be the case if the following were true.
(120) Property of Smooth Degradation
      The child's analysis degrades smoothly when faced with a not fully
      understood input.
(121) Principle of Representability
      All analyses by the child are generated by the child's grammar.
These two assumptions may appear to be obvious, but their mutual adoption has,
it seems to me, far-reaching effects in the grammar. One would expect the
Property of Smooth Degradation to hold of any truly robust learning system.
When a failure occurs in the analysis of the input, it ensures that the child has
some sort of analysis of an incoming string, and thus ensures that when the child
hears a sentence that cannot be completely analyzed, a partial analysis will still
be able to be given, so that i) the meaning can be partially recovered, and ii) the
elements which are not understood can be isolated. The Property of Smooth
Degradation distinguishes, I believe, the particular sort of failure that one finds
in intermediate stages of child language from those failures which occur in
deficits due to injury or stroke, i.e. various types of aphasia, and of course from
other sorts of simple, non-redundant input systems like radio receivers.
The Principle of Representability requires the grammar to be reached by
the parser at all times: all values in the parser's range are in the grammar.
How should the Property of Smooth Degradation be modeled (if it in fact
is the case, as it seems to be)? It suggests that there is a sort of redundancy in the
system. However, this redundancy is not the formal redundancy of identical
elements, nor the overlaying of constraints (see Chomsky's 1980 comments on
the Tensed Sentence Condition and the Specified Subject Condition), but the
overlaying of a more articulated and richer system over one which is less so.
Part of the way in which this could be accomplished would be by having
recourse to operation/default organization within the grammar; another would be
to have two (or more) systems operating in distinct vocabularies over the same
input string: in particular, the Case and theta systems (perhaps several systems
within Case); see Chapter 4. With respect to vocabulary, the redundancy is a
functional redundancy, not a formal redundancy: the two systems are distinct in
their primitives.
But then, by the Principle of Representability, the partial analysis by the
child must itself be represented in the grammar. That is, the child may be viewed
as passing through a sequence of distinct and gradually enriching grammars, the
simpler grammars acting as a back-up for the more complex ones.
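The back-up organization can be sketched computationally. In the toy code below, the grammar functions and the word-count thresholds are invented stand-ins for grammatical and computational complexity; what the sketch shows is only the architecture: the device tries its most mature grammar first and, on failure, degrades smoothly through earlier grammars, so that every analysis returned is sanctioned by some grammar in the sequence.

```python
# Sketch of the fall-back architecture: a sequence of grammars, each a
# back-up for the next. The grammar functions and word-count thresholds
# are invented stand-ins for grammatical/computational complexity.

def g3(sentence):
    """G3, regular attachment: succeeds only under light load (toy limit)."""
    return "RC adjoined to head NP" if len(sentence.split()) <= 8 else None

def g2(sentence):
    """G2, high attachment: tolerates a heavier load."""
    return "RC attached high" if len(sentence.split()) <= 12 else None

def g1(sentence):
    """G1, relative clause filtered out: always yields a partial analysis."""
    return "RC ignored; matrix clause only"

GRAMMARS = [("G3", g3), ("G2", g2), ("G1", g1)]   # most mature first

def analyze(sentence):
    """Smooth Degradation: try the most mature grammar first, falling back
    on failure. Representability: every value returned is sanctioned by
    some grammar in the sequence."""
    for name, grammar in GRAMMARS:
        analysis = grammar(sentence)
        if analysis is not None:
            return name, analysis

print(analyze("the dog that the cat chased barked"))
# ('G3', 'RC adjoined to head NP')
print(analyze("the dog that the cat that the girl owns chased barked loudly"))
# ('G2', 'RC attached high')
```

Note that the parser never returns a value outside the range of every grammar in the sequence; failure is retreat, not masking.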
Many of the traditional findings in the psycholinguistic literature may be
viewed in precisely this way: not as the exercise of autonomous parsing
principles, but as the falling back into an earlier grammatical analysis. For
example, a traditional finding due to Bever (1970) is that children initially
misanalyze passives, when they do, as equivalent to the corresponding active form.

(122) John was V-ed by Mary.
      Child's interpretation: John V-ed Mary. (active)
Bever interpreted this as involving the exercise of an autonomous parsing
strategy: namely, the child tries to fit the structure NP-V-NP over the input
string, where the first NP is an agent, and the second, a patient. Yet this same
finding may be viewed not as implicating a parsing strategy, but as the fitting of
the direct lexical form of the verb, or its form after Project-α, to the input. As
such, part of the string would be (mis-)analyzed, and the rest would be ignored.
That is, instead of viewing the misanalysis as due to the intercession of a
separate system, one may view it as the retreat to a former system, respecting
the Principle of Representability above.
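On this view, the misanalysis of the passive is the fitting of the verb's direct lexical frame to the open class material of the string, with the closed class elements ignored. A toy rendering (the mini-lexicon below is invented for the example):

```python
# Toy retreat-to-lexical-frame analysis of (122): the child fits the
# verb's NP-V-NP frame to the open class words, ignoring closed class
# material ('was', 'by'). The mini-lexicon is invented for illustration.

NOUNS = {"John", "Mary"}
VERBS = {"hit"}

def child_analysis(sentence):
    open_class = [w for w in sentence.split() if w in NOUNS | VERBS]
    if len(open_class) == 3 and open_class[1] in VERBS:
        agent, verb, patient = open_class
        return {"agent": agent, "verb": verb, "patient": patient}
    return None   # no partial analysis available in this toy version

print(child_analysis("John was hit by Mary"))
# {'agent': 'John', 'verb': 'hit', 'patient': 'Mary'}  (the active reading)
```

The same behavior is thus derivable without a separate strategy component: the earlier system (the bare lexical frame) is simply applied to the string.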
Let me go into somewhat more detail about what the indirect approach in
(119) above would be for relative clauses. Let us suppose that, with respect to a
given construction type, e.g. relative clauses, the child successively adopts over
the course of development one of three analyses: i) the first, in which relative
clauses are entirely filtered out, ii) the second, in which there is high attachment
(Tavakolian 1978), and iii) the third, in which the relative clause is correctly
adjoined to the head, and construed with it. The second of these analyses corresponds
roughly to the correlative construction in the world's languages; the third
corresponds to a grammar in which the phrase marker is a pure representation
of the argument-of relation. We may list the successive grammars below.
(123) Grammar for Relative Clause
G1 RC not attached
G2 High Attachment
G3 Regular Attachment
Suppose now that we reach a particular situation in acquisition. Namely, a child
sometimes chooses a high attachment for the relative clause, and sometimes
chooses the correct NP-S' or Det-N-S' analysis. One possibility is that the
child's grammar picks out the right analysis, but the child's parser incorrectly
returns a different analysis. That is, the parser returns a value (an erroneous
phrase marker) which is not one of the permissible values in its range, the set
of structures countenanced by the grammar. The parser masks the grammar.
The indirect possibility is the following. At a given time, the entire computational
device, the grammar/parser, is at a certain stage of development. There
is a particular grammar that the device is located at, say G3. Together with these
grammars are paired the analyses sanctioned by them.
(124) Time    Grammar    Analysis
      t1      G1         A1
      t2      G2         A2
      t3      G3         A3
The claim is the following. If the grammar at a particular point fails, then it is
not masked, but retreats back to the analysis associated most directly with some
previous grammar. Thus at time t3, the child is normally at grammar G3, which
is associated with analysis A3. However, given a particularly difficult sentence,
the child may fall back to analysis A2, associated with grammar G2, or possibly
even A1, though the last would be unlikely. The situation in which the child is
sometimes returning values of A2 (high attachment), and sometimes those of A3
(the correct analysis), corresponds to the time in which a child may vary in his
or her analysis, according to other factors (computational load, pragmatic
considerations, etc.). However, the child never falls out of a grammar specified
in UG. That is, the parser never returns a value which is not countenanced by
any grammar that the child has ever adopted. This means that even the mistakes
of the child are fully subject to grammatical analysis: they show, in fact,
the geological layering of the grammar.
This type of analysis has an additional prediction to make. Namely, the
simpler grammars that the child falls into must also be computationally simpler.
For it would do the child little good to fall into a grammatically simpler system,
if the system were computationally more complex. The full position should
therefore be the following, where G1 is simpler than G2, which is simpler than
G3, on grammatical grounds, and P1 is simpler than P2, which is simpler than
P3, in terms of parsing operations.
(125) Time    Grammar    Analysis    Parsing Operations
      t1      G1         A1          P1
      t2      G2         A2          P2
      t3      G3         A3          P3

      where G1 < G2 < G3
            P1 < P2 < P3
      and < denotes degree of simplicity, along some metric
I have argued in this work so far precisely for such a grammatical difference in
simplicity. The grammar itself has, for any particular operation, both a place with
respect to a lexical sequence of values (the values for the closed class elements,
choosing operations), and a place in the operational sequence. Along the latter of
these, a very simple sequencing exists: remove external parentheses. The later
grammars are more advanced than the former in that a larger number of external
parentheses have been removed (the mind employs the art of the sculptor, at
least here). To retreat to an earlier grammar, all that is necessary is to reinsert
the most external parenthesis.
Chapter 4
Agreement and Merger
The organization of operations that I will assume is within the general framework
of Government-Binding theory, but extended to include certain composition
operations. While composition operations will be used, the primitives are those
of GB theory, including Case theory, theta theory, Move-α, and so on. Moreover,
the composition is not strictly bottom-up. This chapter will introduce the relevant
notions.
There are three basic questions that I wish to focus on:

I. What is the primitive vocabulary of the grammar (NP, VP, …; agent,
   patient, …; nominative, accusative, …; subject, object, …)?
II. What is the set of primitive operations?
III. How do the distinct vocabularies (Case theory, theta theory, etc.) enter
   into the description of the phrase marker? What is the organization of rules
   or operations in the grammar?
These questions are intended to be answered in such a way as to guarantee the
following:

IV. That finiteness is a necessary part of the grammar (see Chomsky 1981, and
   Chapter 1),
V. That differences in vocabulary type are modelled in an adequate way in the
   grammar; in particular, that the grammar makes the same sensible cuts as
   the vocabulary types themselves do, by means of its organization (e.g. Case
   theory vs. theta theory; open class vs. closed class elements),
VI. That the acquisition sequence bears a congruence relation to the structure of
   the grammar (Chapters 2 and 3).
Each of the questions I–III may be looked at in either of two aspects: with
respect to Universal Grammar, or with respect to the mapping in the derivation
itself. It is a thesis of this work that there exists a deep isomorphism between the
two: i.e. that the structure of choices in UG is isomorphic to the structure of
operations in the derivation.
4.1 The Complement of Operations
I take there to be two basic processes in the grammar, the second divided into
two subparts. These are the following:

(1) Assignment of features (theta assignment)
(2) Copying of features
    a. Unidirectionally (Case government)
    b. Bidirectionally (Agreement)

The unidirectional copying of features is government. Government takes place
under strict sisterhood. The canonical relation of this type would be Case
government, where a verb or preposition copies abstract Case features to its
right or to its left.
(3)  hit  Bill
          accusative

(4)  Input:   hit (+acc. feature)   Bill
     Output:  hit   Bill (+acc. feature)
The examples in (3) and (4) show two different ways of conceptualizing the
operation. The logic is clearer in (4), where an actual feature, +accusative Case,
is transferred from the head to the Case-governed element.
There is also the bidirectional copying of features (not the same features).
This is an instance of agreement. Agreement also takes place under strict
sisterhood. The canonical case of agreement is agreement under predication for
the subject-predicate relation (Williams 1980).
(5) Predication (mutual feature copying):

    Before:  NP (a features, d features)    VP (b features, e features)
    After:   NP (a, d, b features)          VP (b, e, a features)
The mutual copying of features is shown in (5). The relevant categories, NP and
VP, each have features associated with them (the labelling a, b, d, etc. is simply
conventional: the labels have no significance). In (5), each category is associated
with a set of features: the features which will ultimately be copied from them (a
features for NP, b features for VP), and a residue (d features for NP, e features
for VP). After the mutual copying operation has taken place, NP and VP share
certain features (a and b features), and do not share others (d and e features).
In the theory of Williams (1980), predication involves the copying of an
index from the NP onto the VP. According to the discussion here, this cannot be
the case, since predication is actually an agreement relation, agreement between
an NP subject and a VP, and such relations are bidirectional, not unidirectional.
Rather, predication involves copying the number of the subject NP onto the VP
(where it percolates down to the head), and the copying of the external theta role
associated with the VP onto the NP subject.
(6) Predication and Agreement:

    NP  --number-->      VP
    NP  <--theta role--  VP
This is then a typical instance of an agreement relation, with information passing
in both directions. Note that if agreement is essentially a bi-directional operation,
the general reduction of the Subject-Predicate relation to one of government
of the Subject by Infl is erroneous. Rather, this is an instance of agreement,
a symmetrical operation, unlike government. This would then constitute a distinct
primitive relation in the grammar.
Along with the unidirectional (Case government) and bi-directional (agreement)
copying of features, there is another operation, which I have called feature
assignment. A better name might be feature sanctioning or licensing. This is
different from the copying of features, because in the latter case the features
actually do originate with the head, and once they are copied the head no longer
retains them. With the assignment or sanctioning of features, there is no copying,
but simple licensing in a configuration. Thus I take (7) to be an instance of
feature licensing, but not (8).
(7) hit Bill
patient
(8) *Input:   hit (+patient)   Bill
     Output:  hit   Bill (+patient)
Of (7) and (8), the first, (7), is the more accurate. This is because there is
no time prior to the licensing of the theta role: it is not ordered in the derivation;
at every point the theta role is already assigned. Feature assignment or licensing
in this sense is not a temporal operation, as feature copying is, but rather a
continuous process. The configuration in (7) is continuously licensed in the course of
a derivation. Since there is no copying of information from the head to the complement,
feature licensing may continually apply. It is for this reason that the Projection
Principle holds. Namely, this relation is not of a copying type, and such relations
may take place continuously or constantly over the course of a derivation.
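The difference between the two modes of application can be made concrete. In the sketch below, the class, function, and feature names are my own invention for illustration: feature copying mutates the representation once, while licensing merely checks a configuration, consuming nothing, and so can be re-checked at every point of a derivation (cf. the Projection Principle).

```python
# Sketch of the operation types: one-time feature copying vs. continuous
# feature licensing. Names and feature labels are invented for illustration.

class Node:
    def __init__(self, label, features=()):
        self.label = label
        self.features = set(features)

def copy_features(src, dst, feats):
    """Unidirectional copying (Case government): a one-time transfer under
    sisterhood. (Whether the source retains the feature is abstracted away.)"""
    dst.features |= feats & src.features

def agree(a, b, a_feats, b_feats):
    """Bidirectional copying (agreement): information passes both ways, once."""
    copy_features(a, b, a_feats)
    copy_features(b, a, b_feats)

def licensed(head, role):
    """Feature licensing (theta): no transfer of information, just a check
    that the configuration sanctions the role; nothing is consumed, so it
    may apply continuously throughout the derivation."""
    return role in head.features

v = Node("V:hit", {"+acc", "theta:patient"})
obj = Node("NP:Bill")
copy_features(v, obj, {"+acc"})    # Case government: Bill now bears +acc

subj = Node("NP", {"num:sg"})
vp = Node("VP", {"theta:agent"})
agree(subj, vp, {"num:sg"}, {"theta:agent"})
# subj now bears the external theta role; the VP bears the subject's number

print("+acc" in obj.features)                                       # True
print(licensed(v, "theta:patient"), licensed(v, "theta:patient"))   # True True
print("theta:agent" in subj.features, "num:sg" in vp.features)      # True True
```

The repeated call to `licensed` illustrates the point of the text: a licensing relation can hold at every derivational point precisely because it transfers nothing.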
To summarize the foregoing, I assume the following operations.
(9)  Operation Type    Example     Structural Cond.   Informat. Flow
     Feature lic.      theta       sisterhood         (continuous)
     Feature copy
     a) Unidirect.     Case ass.   sisterhood         head to compl. (once)
     b) Bidirect.      subj-pred.  sisterhood         bi-directional (once)
We have, then, different types of information flow: feature assignment vs. two types
of feature copying (unidirectional and bidirectional). Following from this, there
is a difference in type or mode of application. If a process involves a transfer of
a piece of information (feature copying), then it must take place at a single time
in the derivation. If it involves what I have called feature assignment or licensing,
then it may apply throughout. Note that this use of the terms feature
assignment or feature licensing is more restrictive than the usual sense, which
would include such things as Case assignment (which I am calling an instance of
feature copying). For feature assignment or licensing, but not feature copying,
there may be constancy principles holding, such as the Projection Principle.
Finally, these different types of operations are associated with different vocabularies.
Feature assignment or licensing (in this restrictive sense) involves theta
roles, unidirectional feature copying is exemplified by Case assignment, and
bidirectional feature copying is associated with agreement. In the NP-VP case,
this involves the copying of the number onto the VP from the subject, and the
copying of the VP-associated theta role onto the subject.
All the processes so far discussed have been assumed to apply under strict
sisterhood. There are further distinctions which might be made: for example, it
may be that Case assignment requires adjacency as well, while theta assignment
does not. This would be the case if the prepositional object to Mary in cases like
(10) were assigned a theta role by the verb; I will assume that it is.
(10) John gave a book to Mary.
4.2 Agreement
I would now like to introduce a second type of Case assignment. Following
earlier work (Lebeaux 1987), I will call this phrase structural Case. I will assume
that, like theta assignment, phrase structural Case assignment takes place throughout
the derivation. It is closest to the operation Assign GF in Chomsky (1981), but
may also be spelled out as a particular case in the case system. Phrase structural
Case, unlike structural case, is dependent on mother-daughter relations, not
head-sister relations (Lebeaux 1987). In Lebeaux (1987), I argue that the first
instances of Case assignment to the subject position by the child are actually
instances of the assignment of phrase structural Case, not structural case (see
Chomsky 1981: structural case is sister-assigned Case). Thus in examples like
(11), phrase structural Case is assigned to the subject position.
(11) My/me did it.
I will assume that there are three major places in which phrase structural Case is
assigned in adult English.

(12) a. subject of S: (NP, S)
     b. subject of NP: (NP, NP)
     c. topic position: (NP, S')

If we consider the topic position to be the subject of the utterance, or perhaps the
subject of (what used to be known as) S' or S'', then its appearance would be
regularized to the other two.

Note that phrase structural Case, like structural case, may take different
forms. The subject of NP is marked genitive, while the topic position is marked
accusative.
Note as well that all of these positions are either islands ((b) and (c)), or
partial islands ((a)), from the point of view of extraction.

Phrase structural Case assignment differs from simple structural case
assignment in a few central ways. First, unlike structural case, it is
assigned optionally. Thus the subject position of an NP need not be assigned any
case: e.g. if no lexical element is in that position. Second, it would be assigned
under the mother-daughter relation. It would not be an instance of feature
copying, since this would require that information be copied from a mother node
to one of its daughters. Rather, it would be an instance of feature assignment or
feature licensing, which has the technical restricted sense given to it above. This
fact has a further consequence: phrase structural Case may apply several times
throughout the derivation. This follows from the fact that the assignment of such
Case does not involve the transfer of information from one node to another, but
rather the scanning of a tree to see if the structural condition has been met. In
this sense, as in many others, it is similar to the Assign GF relation in
Chomsky (1981), the relation from which function chains are formed (and which
must apply throughout the derivation). The full set of operations, then, is the
following.
(13) Operation Type     Example     Structural Cond.  Informat. Flow    Application
     Feature licensing
     a.                 theta       sisterhood        head to sister    continuous
     b.                 PS case     mother-daughter   mother to daugh.  continuous
     Feature copying
     a. Unidirect.      struct. c.  sisterhood        head to sister    single time
     b. Bidirect.       agreement   sisterhood        both directions   single time
Thus there are two continuous processes, theta assignment and phrase structural
Case assignment, and two one-time processes, the assignment of structural case,
and agreement. How are the two operations known or postulated to exist in the
grammar, Move-α and Adjoin-α, incorporated into this scheme? Move-α moves
an NP into an A or A' position: the subject position of S, or the Spec C' position
of C'. The latter movement may be thought of as movement into the subject
position of C'. Thus both types of movement are movement into a subject
position, broadly construed (Pustejovsky 1984). If we conceive of movement in
this way, then movement itself may be conceived of as a sort of by-product of a
more primitive necessity. That necessity would be to saturate the +/- wh feature
in the case of wh-movement, and to saturate Infl in the case of NP-movement.
These both would seem to fall under the rubric of agreement: i.e. the necessity
for agreement would initiate NP movement.

The table above suggests that movement of both types may be considered
a result of, or more exactly, in a 1-to-1 relation with, feature satisfaction. Let
us adopt the following terminology: an operation O is initiated by a feature F iff
the satisfaction of F requires that O take place. This would work in the obvious
way for structures like (14). Given an input structure like that in (14), the
satisfaction of the closed class RC linker would initiate the relative clause
adjoining operation.
(14) Input structure (shown as a labeled bracketing; the relative clause
     is not yet adjoined):

     [S [NP the man] [VP [V saw] [NP the woman]]] ... [S' [Comp who] [S I knew]]
The two operations, Saturate-RC-Linker and Adjoin-α, are therefore in a 1-to-1
relationship; the necessity for saturation involves, or initiates, the adjoining
operation.

Similarly, wh-movement may be considered to be in a 1-to-1 relationship
with the satisfaction of the +/- wh feature in the Comp of the clause in which
the wh-element finally appears. Note that this differentiates the final target of
wh-movement from any of its intermediate landing sites. The situation here is
therefore somewhat more complex than that with the Adjoin-α operation, because
the wh-element may move several times in the derivation, with only the last
movement into the Spec C' satisfying the +/- wh feature in Comp. We might
assume that the full set of movements is initiated once the ultimate Comp
feature requires satisfaction, or alternatively, that the intermediate movements are
free, and only the last movement is regulated by the necessity for feature
satisfaction. I leave this open.
If we adopt such a solution, then instead of thinking of a derivation as being
composed of primitive operations (Move-α, Adjoin-α), we may consider it to
consist of the specifications of closed class elements, which must be satisfied.
For the operations above this would be the following:

(15) Move-NP    satisfy Infl/Agr
     Move-wh    satisfy +/- wh feature
     Adjoin-α   satisfy RC linker
This provides for an interestingly different way of conceiving of the operations
of the derivation: that they are equivalent to the satisfaction of the specifications
of certain closed class elements. This would satisfy the finiteness characteristic
noted in Chapter 1 (and in Chomsky 1981). While there would be ordered relations
between the necessities for satisfaction in a particular derivation, there would be
nothing in the scheme in (15) to require that actual levels be picked out.

Move-NP and Move-wh would therefore be differentiated in the following
way: not by the specification of the movement rule itself (operationally), but in
the satisfaction of the differing closed class elements, or equivalently, in the
differing agreement relations which take place. There are two operations: Agree:
Subj./Pred. and Agree: Spec C'/C'. The first of these applies to the structure in (16).
(16) [S [NP e] [VP was [V hit] [NP John]]]
     →  [S [NP John] [VP was [V hit] [NP e]]]
Then it is actually the Subject/Predicate agreement operation itself which forces
movement. Similarly, wh-movement may apply in a 1-to-1 correspondence with the
relation which satisfies the +/- wh feature in Comp. Call this Spec C'/C' agreement.
(17) [S' [Spec C' e] [C' [C e] [S John saw who]]]
     →  [S' [Spec C' who] [C' [C did] [S John see e]]]
Since this is another sort of agreement relation, agreement actually underlies both
wh-movement and NP-movement. The two operations, then, are the following:

(18) Agree: Subject/Predicate    Move-NP
     Agree: Spec C'/C'           Move-wh
These operations initiate movement. By their application, wh-movement and
NP-movement take place. The third operation which has been introduced so far
is the adjunction of the relative clause into the NP. So far, I have suggested that
this involves saturation of the relative clause linker. This by itself would make
the operation of Adjoin-α a unidirectional operation. However, as Chomsky
(1982) notes, the relation of the relative clause to the head is also one of
Predication. If this is correct, then relative clause formation (adjunction) is also
an instance of a bidirectional operation: the head N satisfies the relative clause
linker, but at the same time the relative clause itself is predicated of the head N.
This would mean that relative clause adjunction would also be an instance of
agreement.

(19) Agree: Subject/Predicate       Move-NP
     Agree: Spec C'/C'              Move-wh
     Agree: Rel head/relativizer    Adjoin-α

This is shown above. In fact, this would mean that all of the operations which
involve a radical change of any type (both forms of movement, relative clause
adjunction) are initiated by the action of agreement. Other types of information
relations, theta assignment and structural case assignment for example, do not
radically change the structure of the tree, or the position of elements in it.
The brunt of this section, then, has been to introduce a new primitive
operation into the grammar: agreement. Unlike Case assignment, agreement is
intrinsically bi-directional. It should not be reduced to some other primitive
operation (e.g. government). Agreement has as well two other attributes. It can
always be put in a 1-to-1 correspondence with the satisfaction of a closed class
element, as in (19) above. This guarantees finiteness. Second, it is agreement
itself (so far) which composes substructures into a whole.
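The 1-to-1 correspondence in (19) can be rendered as a finite table, which is what underwrites the finiteness claim: a fixed inventory of initiating closed class elements fixes the inventory of structure-changing operations. The sketch below (the dictionary simply transcribes (19); the function name is my own) is illustrative only:

```python
# The agreement relations of (19), each initiating exactly one
# structure-changing operation. Finiteness follows from the table
# itself being a fixed, finite object.

INITIATES = {
    "Infl/Agr": "Move-NP",         # Agree: Subject/Predicate
    "+/- wh": "Move-wh",           # Agree: Spec C'/C'
    "RC linker": "Adjoin-alpha",   # Agree: Rel head/relativizer
}

def derivation_steps(unsatisfied_features):
    """Return the operations initiated by the unsatisfied closed class
    features of an input, in order of encounter."""
    return [INITIATES[f] for f in unsatisfied_features if f in INITIATES]

print(derivation_steps(["+/- wh", "Infl/Agr"]))   # ['Move-wh', 'Move-NP']
```

On this conception a derivation is not a sequence of freely chosen primitive operations, but the working-off of the satisfaction requirements of the closed class elements present in the input.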
4.3 Merger or Project-α
4.3.1 Relation to Psycholinguistic Evidence
Let us look at another sort of phenomenon, apparent in acquisition. It is a
commonplace in the acquisition literature that children in the earliest stages of
language just use open class elements in production. This is the famous stage of
telegraphic speech, where the closed class morphemes have dropped out (Brown
1973). The phenomenon of telegraphic speech is well known to every linguist,
as well as every parent, yet very little has been made of it in the literature. Nor
has much been made of the closed class/open class distinction as a significant
demarcation in adult speech. Indeed, in Aspects (Chomsky 1965), as well as in
recent work by Joseph Emonds (Emonds 1985), some attempt has been made to
model the open class/closed class distinction in a derivational way. The proposal
in Chomsky (1965) was to allow for late S-structure insertion of closed class
elements; a similar proposal is made in Emonds (1985). These two have perhaps
been the main line proposals in the syntactic literature, but neither has been
substantively followed up. Yet the existence of telegraphic (i.e. open class)
speech by children suggests that the open class representation would have
considerable significance in development; the General Congruence Principle
would direct that it have repercussions on representations in adult speech as well.
Examples of telegraphic speech are given below.
(20) see ball
here Mommy
want orange juice
make castle
etc.
In spite of the paucity of proposals about the open class/closed class distinction
in syntactic theory proper, this lack of attention does not seem to be principled.
One reason for the lack of interest has to do with the fact that closed class
morphemes do not belong to a single category (like NP), but rather to any of a
number of types. There are Determiners (the), auxiliary verbs (may), inflectional
elements (to), prepositions (to, of), and nouns (him). Since the majority of
generalizations in linguistic theory are stated in terms of category types (e.g.
lexical NPs need Case), the lack of a coherent categorization for closed class
elements has perhaps drawn investigation away from this class.
A second reason bears more directly on acquisition. While in general GB
theorists have shown a useful suspicion of functionalist proposals, within the
AGREEMENT AND MERGER 155
realm of telegraphic speech such proposals have reigned supreme, without
substantive criticism. The functionalist proposal for the absence of closed class
elements in early speech would be simply the following: the child has limited
memory and computational resources in early stages. Given such a limitation,
morphemes are at a premium. And since open class morphemes are information-
rich compared to closed class morphemes, it is hardly surprising that the child
has recourse to the former rather than the latter.
There are, however, a number of difficulties with this functionalist account.
First, even given limited resources, one would expect that the closed class
morphemes would appear sometime, if the child had command of them. The fact
that they do not appear at all, in this early stage, suggests that their exclusion is
principled, not simply functional in character. More exactly, while there may be
a functional reason why closed class morphemes are not generally used, it is
reasonable to believe that this functionalist reason has been grammaticalized:
i.e. realized in the grammar in a principled and meaningful way. Otherwise, one
would expect occasional outcroppings of closed class elements even in the
earliest stages, something which does not occur (except for pronouns).
Evidence from quite a different area suggests as well that there are real
differences in the adult computational system in the handling of open class and
closed class elements. I am thinking here of a quite complex paper by Garrett
(1975). Garrett analyzed a large corpus of speech errors, gathered by Shattuck-
Hufnagel and himself, the so-called MIT Corpus (approximately 3400 errors).
Exchange errors fell within two basic types: those which occurred between
independent words, and those which occurred in what Garrett calls "combined
forms", essentially involving the stranding of bound affixes, as the free
morphemes were interchanged.
(21) Independent form exchanges (examples):
a. I broke a dinghy in the stay yesterday.
b. I've got to go home and give my bath a hot back.
(22) Combined form exchanges (examples):
a. McGovern favors pushing busters.
b. It just sounded to start.
c. Oh, that's just a back trucking out.
(exchanged elements underlined)
The independent forms and the combined forms apparently operate differently
in exchanges: in particular, the former, but not the latter, obey form class (i.e.
syntactic category), according to Garrett, and this constraint is stronger in
156 LANGUAGE ACQUISITION AND THE FORM OF THE GRAMMAR
between-clause exchanges. He thus suggests that there are two independent levels
of syntactic processing (see also Lapointe 1985, for discussion):
a. Exchanged words that are (relatively) widely separated in the intended
output or that are members of distinct surface clauses will serve similar
roles in the sentence structures underlying the intended utterance, and, in
particular, will be of the same form class. These exchange errors represent
interactions of elements at a level of processing for which functional
relations are the determinant of computational simultaneity
b. Exchanged elements that are (relatively) near to each other and which
violate form class represent interactions at a level of processing for
which the serial order of an intended utterance is the determinant of
computational simultaneity
from Garrett (1975)
Yet more interesting, from the point of view of the theory advocated here, is the
following comment on the stranding of closed class morphemes (Garrett 1975):
The errors we have been referring to as combined form exchanges are errors
of a rather remarkable sort. They might, as a matter of fact, have been more
aptly described as morpheme stranding errors, for not only are the permuted
elements nearly always free forms, but the elements left behind are as often
bound morphemes
(30) I'm not in the read for mooding
(31) he made a lot of money intelephoning stalls
(32) She's already trunked two packs.
Why should the presence of a syntactically active bound morpheme be
associated with an error at the level described in (b)? Precisely because the
attachment of a syntactic morpheme to a particular lexical item reflects a
mapping from the functional level to the positional level of sentence
planning
It is examples like those in (30)-(32) that lead Garrett to propose that syntactic
production is divided into two levels: a functional level, and a positional level,
and that the former is mapped into the latter.
This whole line of research might be "sanitized" from the point of view of
linguistics by assuming that what it really pertains to is the theory of the
language producer and language acquirer (in the case of telegraphic speech). This
would then require that these be part of a separate theory, of unclear extent,
which does not have to have any correspondence with syntactic theory per se.
Let us nonetheless not take such a position, and instead reach to integrate the
Garrett-Shattuck proposals, and the phenomenon of telegraphic speech, within
linguistic theory proper. This is exactly the position which was taken in the last chapter
AGREEMENT AND MERGER 157
with regard to relative clauses, where it was argued that the high attachment of
RCs noted by Tavakolian (1978) was not a separate parsing principle, but an
instantiation of a possibility open in UG: namely, of having a co-relative
construction. By taking such a position, a sort of synthesis was achieved between
the (now standard) parameter setting approach, and the thesis of this work, that
real development takes place. I will concentrate here more on the telegraphic
speech, with the Garrett data forming a sort of backdrop.
There is a final observation which suggests that the stage of development of
telegraphic speech is an organized stage, and that it should be taken account of
in adult speech as well. This is the simple, but meaningful observation that
adults, as well as children, can speak telegraphic speech. If we viewed such
speech as simply the direct result of a computational deficit by the child, we
would expect that adults would no longer be able to produce such speech, at
least insofar as this would require mimicking a computational deficit that the
adult no longer had. Given the fact that adults can speak telegraphically, there is
a strong implication, though of course no sure proof, that telegraphic speech is
an actual subgrammar of the full grammar, and that adults using such speech are
gaining access to that subgrammar. This in turn would be very much in line with
the General Congruence Principle, which suggests that the acquisitional stage
exists in the adult grammar in something like the same sense that a particular
geological layer may underlie a landscape: it therefore may be accessed.
4.3.2 Reduced Structures
But what would this subgrammar look like? It was noted above that the open
class/closed class distinction had been mentioned, and partly modelled, by such
early works as Aspects (1965), where it was assumed that closed class elements
were a late spell-out of certain types of information. The Garrett and Shattuck-
Hufnagel data, however, suggest that something like the opposite ordering holds:
namely, that there exists a grid or template of closed class elements, and the
open class elements are projected into it. This is perhaps counter-intuitive
from the point of view of actual speech production, yet the logic of the grammar
supports it. It is also in line with certain conclusions that were reached in
Chapter 1. A constraint was suggested there, the Fixed Specifier Constraint,
which would bar the independent movement of closed class specifiers (i.e. unless
they were part of a whole constituent which was moved). This was made
necessary by the fact that under conditions of extensive movement, there must
remain certain stable elements, the grid around which the others are moved, for
the child to be able to induce a grammar at all. The closed class specifier
elements seemed to be just such a set. (Note that this form of the proposal, while
based in part on considerations that Garrett raises, differs in content.)
Let us adopt the same conceptual device that was used in the analysis of
relative clauses earlier. In that chapter a full sentence was put through a
conceptual filter, which filtered adjuncts out of the representation. This created the
argument skeleton on the one hand (the rooted structure which was a pure
representation of the argument-of relation), and a set of adjuncts which would
later be added into the representation. If we adopt the same device here, we
would get a reduction of a full sentence, together with a set of closed class
elements. Let us ignore the latter set for now, concentrating on the reduction
itself. The term reduction here is used with a rather different meaning than that
in Bloom (1970).
(23) I saw the ball.
reduction: see ball
(24) Mommy left the room.
reduction: Mommy leave room
(25) I put the ball on the table
reduction: put ball (on) table
In fact, what representations like (23)-(25) show us is that we had not gone far
enough in Chapter 3 in attempting to isolate a pure representation of argument
structure. The reductions in (23)-(25) are a purer isolate yet. And, if the General
Congruence Principle is to hold, it must be the case that these reductions are not
simply spoken by the child, but underlie adult speech as well.
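The filtering operation behind reductions like (23)-(25) can be caricatured in a few lines of code. This is only an illustrative sketch: the closed class list and the lemma table are toy stand-ins of my own, not a claim about the contents of the actual lexicon.

```python
# A toy model of the "reduction" of a full sentence to its open class
# (telegraphic) core, as in (23)-(25). The word lists below are illustrative
# stand-ins, not an actual inventory of the closed class.
CLOSED_CLASS = {"i", "the", "a", "an", "on", "in", "to", "of", "did", "am"}
LEMMA = {"saw": "see", "left": "leave"}   # toy table: strip inflection

def reduction(sentence: str) -> str:
    """Filter out closed class items, keeping open class stems in order."""
    words = sentence.lower().rstrip(".").split()
    return " ".join(LEMMA.get(w, w) for w in words if w not in CLOSED_CLASS)

print(reduction("I saw the ball."))              # -> see ball
print(reduction("I put the ball on the table"))  # -> put ball table
```

Note that the filter drops *on* entirely, whereas (25) shows it as optionally retained; the sketch errs on the side of simplicity.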
The term reduction to describe (23)-(25) is intended purely descriptively.
Assuming that the reduction in (23)-(25) is what the child would say if he or she
wished to express the full meaning directly above it, we will call the child's
utterance a reduction of the full phrase marker. This still leaves undetermined
what the nature of this reduction is. There are three central possibilities:
I. that the reduced phrase marker is directly generated as such by the child's
grammar,
II. that there is a reduction transformation of some sort from a fuller structure
(where this reduction may be done by the parser rather than the grammar),
and
III. that the actual phrase marker is relatively more developed even at SS, and
null elements fill the determiner and other closed class positions.¹

¹ This possibility was suggested to me by Joung-Ran Kim.
Of these three possibilities, I wish to adopt the first, and to some degree the
third. This has a further effect. Given the General Congruence Principle, such
a reduced mode of representation must also underlie the more complete adult
representation. That is, telegraphic speech is actually generated by a subgrammar
of the adult grammar, a modular and unified subgrammar, and this enters into the
full phrase marker.
The three logical possibilities underlying the child's reduction of adult
speech are the following:
(26) Simple reduced phrase marker:

     [V [V see ] [N(theme) ball ]]
(27) Deletion account:

     Deep Structure:
     [S [NP I ] [I [Tns e ]] [VP [V see ] [NP [Det the ] [N ball ]]]]

     Surface Structure:
     [S [VP [V see ] [NP/N ball ]]]
(28) Null lexical items:

     [S [NP e ] [I [Tns e ]] [VP [V see ] [NP [Det e ] [N ball ]]]]
     (Deep Structure and Surface Structure)
The arguments for the first proposal over the deletion transformation account are
conceptual in nature. Suppose that we assume that there is some measure of
complexity of a phrase marker. This would be a function of the complexity of
the tree, the licensing relations in it, and so on, and would no doubt differ to
some degree from production to comprehension. It would be natural, given such
an analysis, to suppose that the phrase marker itself (though not the universal
principles underlying it) complicates itself over time, in the sense that the
complexity with respect to that metric increases. That is, the analyses allowed by
the grammar become more complex over time, though the universal principles do
not. This was the case with the analysis of relative clauses earlier, where the UG
information was, in fact, lessened over time (as parentheses were removed),
while the analysis itself was to some degree made more complex.
Given such an assumption, there is something extremely odd about the
deletion account, Proposal II. Such an account requires that there be an original
full representation, together with an operation, a reduction transformation which
operates on it. The child grammar (or system) and the adult grammar generating
the syntactic representation underlying "see the ball" would therefore be the
following:
(29) Adult grammar: rules underlying full phrase marker.
Child grammar: i) rules underlying full phrase marker
ii) reduction transformation or operation
But this is surely odd, if the grammar has any sort of computational reflex at all.
The child's grammar here contains more material in it than the adult grammar,
and these operations must all work: precisely the opposite of what might be
expected. One might expect, given the grammar in (29), that a more complex
structure (i.e. one containing more NPs) would be less reduced, because the rules
underlying the full phrase marker would already be stretched to the limit, and it
would be difficult for the child to apply (ii) in addition: i.e. (i) would be
instantiated at the cost of (ii).
There is a second way in which the reduction analysis is odd. Namely, such
an operation would not exist in UG, but would simply be present at a particular
stage in acquisition, the telegraphic stage. This would make it look very unlike
the situation with relative clauses discussed earlier, where the possibility of high
attachment was reduced to an actual alternative specification in UG: the
possibility of a corelative construction. The theory under construction in the last
chapter would require that when an appropriate structure is not reached by the
child (in this case, the full phrase marker), the child falls into another grammar
specified by UG. This would not be the case with a reduction transformation,
since the reduction transformation itself is not specified by UG. The child,
therefore, would be falling into a grammar which is not specified by UG. In
keeping with the strictures above, this possibility is unavailable to us. I will
therefore assume that there is no reduction operation of this type.
With respect to the third possibility, the situation is more interesting. I limit
myself here to some preliminary comments.
In Chapter 2, I suggested that the analysis of the early string by the child
took place in two stages. The child first fitted the lexical subtree over the open
class part of the incoming string.
(30) [V [(N) man ] [V [V saw ] [N woman ]]]
     (the closed class items "the … the" remain outside the fitted subtree)
This gave rise to telegraphic speech: an analysis in which the closed class
elements dropped out.
162 LANGUAGE ACQUISITION AND THE FORM OF THE GRAMMAR
(31) [V [(N) (man) ] [V [V saw ] [N woman ]]]
At a logically slightly later stage, the closed class elements, marked simply X⁰,
were incorporated into the structure, according to the Principal Branching
Direction of the language.
(32) [V [N [X⁰ the ] [N man ]] [V [V saw ] [N [X⁰ the ] [N woman ]]]]
This would correspond to a derivational stage in which Project-α occurs. Some
evidence for this second stage would be the presence of schwa in the output,
corresponding to the X⁰ elements.
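The two-stage picture in (30)-(32) can be sketched computationally. The category labels, the toy open class lexicon, and the rule attaching each closed class X⁰ to the following open class word are all simplifying assumptions of mine; the text ties the attachment direction to the Principal Branching Direction of the language, which is not modelled here.

```python
# Sketch of the two-stage analysis: the lexical subtree is fitted over the
# open class words first; closed class items are then incorporated as X0
# elements attached to the next open class word. The lexicon and the
# attachment rule are simplifying assumptions for illustration only.
OPEN = {"man": "N", "saw": "V", "woman": "N"}   # toy open class lexicon

def analyze(words):
    tree, pending = [], []
    for w in words:
        if w in OPEN:
            tree.append((OPEN[w], pending + [w]))  # host plus its X0 items
            pending = []
        else:
            pending.append(("X0", w))              # closed class, awaiting host
    return tree

print(analyze("the man saw the woman".split()))
```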
If this is in fact the progression, then both the null category and the simple
structures view would be expected to be correct, though at slightly different
stages: the latter slightly less advanced than the former, and logically prior to it.
(This assumes that the Pre-Project-α stage is phonologically realizable: if not,
true telegraphic speech in the "simple structures" sense above would exist only
in the initial analysis, and as a subgrammar of the final grammar (see later
discussion), and not in exteriorized speech at the telegraphic stage. I leave this
odd possibility aside.)
Either of the two views above would have an advantage over a fourth view
(in GB-theory): that the initial NP is a full phrasal node, with no determiner.
(33) [S NP [VP [V see ] [NP [N ball ]]]]
The reason has to do with extendibility of the grammar (in Marcus et al.'s
sense). The following generalization must be expressed somewhere in the
grammar of English.
(34) In the phrasal syntax, definiteness is marked with "the" (or perhaps,
building up to the N level: Chapter 2)
The representation in (33) would violate the restriction in (34), while the
representation in (31) would not, since the phrasal syntax had not yet been
entered. That is, by assuming a different type of representation, the thematic
representation, one arrives at the position that the child's grammar is at this stage
not incorrect, but simply incomplete.
Let me turn now to another consideration in syntactic description at this
stage. At the stage at which children are saying things like "see ball", their
behavior suggests that they are using something quite different than that simple
sentence for the information structure which is input to semantic interpretation.
In particular, while ball in "see ball" is determinerless in early childhood speech,
and while determinerless nouns in adult speech generally have a generic or class
interpretation, the child speaks reduced phrases like "see ball" in contexts where
ball must be regarded as specific in reference. Thus the child interprets "see
ball" as something like "I see the ball". But this simple fact creates difficulties
for the "simple structures" account. If the DS and SS representations are indeed
((see)V (ball)N)V, i.e. extremely simplified, and without a determiner (pure
lexical representations), then the child must still have a way of rendering the
fact that such structures are not generic. Assuming that at LF the structure is
interpreted, this means that by LF, the representation must be something like
((V see) (NP the ball)). But if this is so, then structure-building operations must be
available at LF, for the child but not the adult. This is surely not to be desired.
Further, the postulation of structure-building operations at LF mimics the
possibility of a reduction transformation already rejected, although in a reverse
direction. The null lexical item account avoids these problems because there is
already a slot for the determiner element. We might even assume that the slot
itself is marked for definiteness or indefiniteness:
(35) [VP [V see ] [NP [Det(+Definite) e ] [N ball ]]]
In this case, the child would not need to structure-build at all at LF, but would
simply use the structure as given: definiteness is already correctly marked,
though the lexical item is missing.
This difficulty with the thematic structure account also appears to put it at
a disadvantage with respect to an acquisition theory like that in LFG (see Pinker
1984). In Pinker's theory, a relatively complete f-structure may be paired with an
incomplete c-structure. The representation of "see ball" might therefore be the
following.
(36) c-structure:
     [S [VP [V see ] [NP [N ball ]]]]

     f-structure:
     SUBJ (Pred I)
     see (SUBJ, OBJ)
     OBJ +definite
         Pred ball
     TNS Present
Potential levels of analysis:
DS
DS′
SS
PF   LF
Thus for the example given in (11), the child's analysis may generally have the
wh-element in dislocated position at the deepest computed level, but it may
occasionally have the wh-element originating in the (adult) DS object position as well.
(11) Surface: Who did you see (t)?
     DS (Computation 1): Whoᵢ did you see tᵢ?
     DS (Computation 2): Whoᵢ did you see tᵢ?
     DS (Computation 3): You saw who?
     SS: Whoᵢ did you see tᵢ?
     LF: For which x, you saw x?
This is in accord with the psycholinguistic fact that the possibility of analysis
will change under differing conditions. The claim above would be that the
child's analysis is shallow, not his or her grammar.
As noted above, this fact might be taken to be a purely computational fact.
Or it might be that it has parametric effects: in particular, for those structures in
which the wh-element is analyzed as base-generated in dislocated position, it is
also analyzed as being in a theta position (or quasi-theta position) even at the
deepest level of analysis. This would mean that a computational effect would
have parametric repercussions: the child would fall into a different type of
grammar. Some evidence for this is discussed in 5.7.6. The crucial point
throughout the chapter, however, will be the light that the shallow analysis sheds
on the structure of the grammar.
5.3 Levels of Representation and Learnability
While shallowness of analysis in this case need not be considered a property of
the grammar per se (but may rather be of the computational-acquisition device
as it computes a representation), it does provide a unique clue into the structure
of the grammar. Namely, a prediction is made: insofar as the analysis is shallow
(i.e. extends backwards from SS to DS′), the set of grammatical functions
associated with the not-present levels (DS to DS′) would be expected to not be
present as well. Suppose that a particular grammatical module (e.g. part of the
Binding Theory) applies at DS. Given that DS is not available, in a particular
analysis of a string, to the child, the part of the Binding Theory which was stated
over DS would also be expected to be not present. Thus the set of structures
which underwent some rule at DS (being marked for coreference or obligatory
disjoint reference) would be expected to be treated differently in the child's
grammar than in the adult's grammar. This is shown in (12) below:
(12)
     DS:  set of operations or rules which apply at DS
     DS′: shallow analysis
     SS:  anchor of the child's analysis
     PF   LF
We may say that the child's analysis is anchored at a particular level: SS, or the
surface. (For convenience, I will henceforth simply use SS as the anchoring level
rather than the surface. No theoretical point is intended thereby, and I use it for
convenience since the properties of that level, but not of the surface, are
relatively well-explored. The contrast is with D-structure anchored
representations.) Over time, the derivation fills out backwards, and the analysis
becomes less shallow. At any particular time, however, the analysis is shallow,
not encompassing the adult DS. This, however, has a consequence. The set of
grammatical functions associated with the adult DS will not be present in the
child's grammar. Borrowing, and changing, terminology from Williams (1986),
the set of grammatical functions associated with DS would be unavailable: this
is the abrogation of DS functions.
THE ABROGATION OF DS FUNCTIONS 193
(13)
     DS:  functions unavailable
     DS′: deepest level computed
     SS:  anchor of the analysis
To say that SS is the anchor of the analysis is to say that the computation
proceeds backwards from that level, at least in part. This fact may then be used
in a positive way by the linguist: to determine the structure of levels or
organization of rules in the grammar. Insofar as particular grammatical functions
can be shown to not be available to the child (e.g. some aspects of Binding
Theory), they are earmarked as belonging to the missing levels: i.e. in the
domain DS to DS′. The shallowness of analysis would thus give us insight into
the structure of the grammar, and where therein particular operations apply.
There is a second property of this type of analysis which is worthy of note.
I have suggested somewhat tentatively that this aspect of shallowness of analysis
(with the anchoring at SS) may not be general, but rather associated with a
particular process: comprehension. What about the child as speaker in the
speaker/hearer duality? Here I will mention a possibility that will remain quite
speculative. Let us suppose that the child as speaker again adopts (computes) a
shallow analysis. However, this analysis is shallow in the opposite direction:
anchored at DS, but shallow with respect to SS. The result would be the following:
(14)
     Representation 1: Anchored at DS
       DS:   production anchored at DS
       SS′:  shallow S-structure
       (SS): corresponding adult S-structure
       PF   LF

     Representation 2: Anchored at SS
       DS:  corresponding adult d-structure
       DS′: computed structure
       SS:  analysis anchored at SS
       PF   LF
Such a system would then give the following relation of the computational
system to the structure of the grammar, where comprehension and production are
viewed with a suitable degree of abstractness.
(15) i. Comprehension: shallow in the upwards direction, anchored at
SS, functions uniquely associated with DS not available.
ii. Production: shallow in the downwards direction, anchored at
DS, functions uniquely associated with SS not available.
While the grammar is of the Chomsky-Lasnik type, the anchorings for
comprehension and production would be distinct. Let us define a
grammar/computational system as equipollent if it has the following property.
(16) A (Grammar, Comput. System) is equipollent if it is anchored at all
levels for all operations (comprehension/production).
(17) The adult grammar, but not the developing child's, is equipollent.
(16) and (17) together give us a characterization of a developing grammar.
Moreover, this characterization is not only available to us as linguists, but to the
child him/herself. This suggests a solution, a general solution, to the
problem of overgeneration and negative evidence discussed in detail in Pinker
(1984). Recall that Pinker faced a serious problem with respect to the
intermediate grammars that the child adopted. Namely, the child, at early stages,
produces sentences like the following:
(18) Me give ball Mommy (I gave the ball to Mommy)
I walk the table (I am walking on the table)
In Chapters 2 and 4, I suggested a particular solution to the ungrammaticality of
these utterances: namely, that they are not ungrammatical at all, but rather
correspond to subrepresentations in the adult grammar: the theta representations.
Let us consider a second possibility here, which will appear incompatible with
the other approach. (In the genesis of this thesis, I considered the approach here
first, and later abandoned it in favor of the approach in Chapters 2 and 4; I will
attempt to synthesize them here.) This approach originates more directly out of
an attempt to come to grips with certain problems in Pinker's approach. It
depends crucially on the notion of anchoring at a level.
A natural assumption given Pinker's approach is that the sentences in (18)
would correspond to the following phrase markers (note that this would not be
the representation given in Chapters 2 and 4, where they would be part of the
sub-phrasal syntax).
(19) a. [S [NP Me ] [VP [V give ] [NP ball ] [NP Mommy ]]]
     b. [S [NP I ] [VP [V walk ] [NP table ]]]
But while this assumption is natural, it leads, as Pinker notes, to a serious
difficulty. If we assume that lexical heads have subcategorization frames
corresponding to these phrase markers, these would have to be the following.
(20) a. give: ____ NP NP
theme goal
b. walk: ____ NP
location
The problem is that these subcategorization frames are impossible for the adult
grammar. That is, the goal for give must either be marked with to or must
precede the theme; the locative object of walk is the object of an on preposition
(in this usage). How does the child then get rid of the erroneous subcategorization
information? (Note that this problem, a delearnability problem, does not arise in
the representation in Chapters 2 and 4, because the representations would be
taken to be accurate, though in a subgrammar: let us put aside this solution for
now.) This is the problem of negative evidence, in its strongest form.
It might at first be thought that a uniqueness principle would be appropriate.
For example, if it could be argued that every lexical item has a unique
deployment of thematic relations and category types, then the later acquisition
of a subcategorization frame such as the following would knock out the
subcategorization frame in (20a).
(21) give: ___ NP PP
theme goal
While it can indeed be argued for the subclass of give verbs that only one DS
deployment exists, NP(theme) PP(goal), with a possible movement operation
producing the double object form (see Stowell 1981; Baker 1985, for a theory
along these lines), and thus while a uniqueness principle may in fact be used for
this set of examples to exclude the erroneous entry, this cannot in general be the case.
The spray/load class of verbs, for example, allows two realizations of objects.
(22) a. spray the wall with paint
        spray paint on the wall
     b. spray: ___ NP (with NP)
                loc      inst
        spray: ___ NP (on NP)
                inst     loc
So the existence of two lexical entries per se, for a given verb, cannot in general
help the child in excluding initial erroneous entries. However, this leaves the question
of how the child eliminates the offending entries in (18)-(20) from the grammar.
In an important contribution, Pinker (1984) adopts one possible solution.
Namely, he suggests that until the final grammar is set, lexical entries (and
phrase structure rules) are given provisional status, by the device of "orphaned
nodes". This means that the phrase structure rule, and the corresponding
subcategorization frame, that the child uses is marked with a question mark to
indicate its provisional status; the phrase marker itself contains an "orphan" and a
possible mother node (or set of such nodes).
The lexical entries in the grammar corresponding to the PS expansions in
(19) would then be:
THE ABROGATION OF DS FUNCTIONS 197
(23) a. ?give: ___ NP NP
                   theme goal
     b. ?walk: ___ NP
                   location
Since these entries are not given full-fledged status in the grammar, the problem
of the lack of evidence to eliminate the entries in (23) does not arise. How, then,
is the correct grammar reached? According to Pinker, the intermediate entries are
assigned some provisional probability of occurring. The actual sanctioning of a
lexical entry is not all-or-none, but rather with respect to a learning curve. In the
long run, not enough evidence is gotten from a late enough stage to allow for the
erroneous entry to be permanently listed.
While the above solution is interesting, it has a peculiar property. The entire
grammar is up for grabs at every intermediate point, and there is, in addition,
no definite way of knowing for certain when the end point has been reached,
i.e. when the question mark has been erased. This means that the entire
grammar has, at every intermediate point, a rather provisional status. This
character might be argued to be not a failing, but a virtue: that this is precisely
what occurs in a learning system. That is, the all-or-none idealizations of
linguistics are false precisely on this point, and the notion of a learning curve,
and a learning system, must explicitly allow for the notion of a question-marked
entry, where the question mark ultimately fades into oblivion.
However, there is one good reason to suppose that this solution is less than
optimal. This is because a linguistic system is not simply a group of isolated
facts, but has itself a deductive structure. Certain pieces of information must be
used as the basis for determining other pieces of information. This means, in
turn, that the pieces of information which are used as such a basis must be
known with almost exact certainty; otherwise, the degree of uncertainty in the
initial entry infects the rest of the grammar. In fact, to the degree to which more
than one piece of information enters into a deduction, the certainty of the result
decreases as the product of the certainties of the elements in the basis: two
elements, with certainties of .8 and .7, give rise to a deduction of certainty
only .56.
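The arithmetic behind this point can be checked in a few lines of code (a sketch; the function name and the assumption that the premises are probabilistically independent are mine, not the text's):

```python
def deduction_certainty(premise_certainties):
    # Certainty of a conclusion deduced from independent premises:
    # the product of the certainties of the elements in the basis.
    result = 1.0
    for c in premise_certainties:
        result *= c
    return result

# Two premises with certainties .8 and .7 yield a deduction of
# certainty only .56; adding a third premise degrades it further.
print(round(deduction_certainty([0.8, 0.7]), 2))       # 0.56
print(round(deduction_certainty([0.8, 0.7, 0.9]), 3))  # 0.504
```

On these assumptions, only elements whose certainty is very close to 1 can serve in long deductive chains without the product collapsing, which is the force of the argument in the text.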
What would this mean, in terms of the characterization of the abstract,
learnable, deductive system? It would mean, I believe, the following. The ideal
learnable deductive system should not so much have a normal distribution in
terms of the certainty of the elements of the grammar at intermediate
points, but rather something closer to a bi-modal distribution. Certain
elements would be known with almost exact certainty: say, probability .98 or
above. Certain other elements would be provisional in status: clustered at some
markedly lower probability (say, .60), and marked as such. Given such a
distribution, the function of the elements in the grammar would differ. Namely,
the elements clustered around surety would act as the deductive basis for further
pieces of information in the grammar. The elements clustered at the lower
probability would not. Further, the elements at the lower probability would be
checked for exactness by the application of the deductive structure inherent in the
system to the elements which are known with relative certainty: that is, the
certain elements would act in tandem to weed out the less certain. Further, in
the course of development, elements would move fairly rapidly from an unsure
characterization as to accuracy to a quite sure one.
There does seem to be some evidence for such a distribution. Consider a
standard description of learning, as it applies to the natural learning of some
element of an articulated system such as language. It goes something like the
following. The child, at the initial stage, uses a piece of information sporadically
and often wrongly. This stage may last years. There then appears a stage in
which the element is used over and over again, often incorrectly but with greater
and greater frequency of appropriateness. This stage is comparatively very quick,
often lasting only a month or two or three. Finally, the construction is mastered,
and frequency of use again drops down. The intermediate stage is much shorter
in duration than either of the two flanking stages. Labov and Labov (1976) note
exactly such a process in the mastering of questions.
For about fourteen months after Jessie's first wh-questions, she showed only
sporadic uses of this syntactic form, less than one a day. In the first three
months of 1975 (3:4–3:8), there was a small but significant increase to two or
three a day. Then there was a sudden increase to an average of 30 questions a
day in April, May, and June, and another sudden jump to an average of 79
questions a day in July, with a peak of 115 questions on July 16th. After the
peak of mid-July, the average frequency fell off slowly over the next two
months (to 4:0), then fell more sharply through October and December to a
stable plateau of 14–18 a day for the next seven months.
This pattern would correspond exactly to one in which an element began as
unlearned (i.e. with a low subjective assignment of probability of accuracy: prior
to 1975 here), and ended as learned (from October 1975 onward); in between
was the time of discovery, relatively short. Such descriptions are of course
characteristic of most instances of learning, and are obvious from simple
observation. Note that if this description is correct, the all-or-none characterization
of learning in linguistic theory is in fact close to the truth. While some
variance must at both points be allowed, what is crucial is that the distribution is
bimodal, where the modes correspond to the long periods of time over which the
element is viewed as not learned and learned, the two points of temporal
stability. Furthermore, we may expect that the elements in the two groups differ
in function: the elements which are known are the basis for further deduction,
while the elements which are not known are not.
5.4 Equipollence
Assuming the above as a general characterization of the system, there must be
some way by which particular entries are marked as certain, while other entries
are not. Rather than assuming that this is done simply quantitatively, and tagged
onto the system, let us take it as a working assumption that this is done in the
representational system itself. If this were the case, then we may have a reason
for the necessity of the bimodality in certainty: a necessity linked to the repre-
sentational system.
The above-mentioned paradigm suggests a way in which this might be done.
Suppose that the child saves two representations of a given lexical item or
syntactic substring. One is anchored in the more surfacy level, i.e. S-structure
or the surface. This representation extends to the other levels (DS, LF, etc.),
but is anchored in a particular, surfacy level: say, S-structure. The other
representation also extends to all levels, but is anchored at another level: say, DS or
underlying representation. The child therefore has two representations, each with
a full complement of levels (and thus fully formed), but with a different
anchoring level.
How does the child then know when his or her final grammar has been
reached? Suppose that, rather than relying on a device such as question-marking
all intermediate entries, the child uses the notion of equipollence (equal
anchoring) defined above. In particular, the following holds:
(24) When the representation underlying a construction is equipollent
     (a single representation anchored at all levels), the representation is
     final and correct.
(25) When the representation underlying a construction is not equipollent,
     i.e. consists of either two representations or a single non-equipollent
     representation, it is provisional.
Learning, then, would be the process of converting representations in the form of
(25) to the form in (24). Note that this faithfully represents the idea that there is
a basic two-way distinction in the form of the knowledge, and that this is encoded
in the representational system: either a piece of information is in the form (24),
in which case it is learned, or it is in the form (25), in which case it is not yet
learned. Further, the child knows, by the representational system itself, which
element falls into which group: the non-learned elements have multiple,
non-equipollent entries, while the learned elements have a single, equipollent entry.
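The two-way classification in (24) and (25) can be stated as a simple predicate over a child's stored entries for an item (a sketch; the Entry class, the level inventory, and the frame strings are illustrative stand-ins, not the author's notation):

```python
LEVELS = frozenset({"DS", "SS", "PF", "LF"})

class Entry:
    # One representation of a lexical item: a subcategorization
    # frame plus the levels at which that representation is anchored.
    def __init__(self, frame, anchored_at):
        self.frame = frame
        self.anchored_at = frozenset(anchored_at)

def is_learned(entries):
    # (24): final and correct iff equipollent, i.e. a single
    # representation anchored at all levels.
    # (25): otherwise (two entries, or one non-equipollent entry),
    # the representation is provisional.
    return len(entries) == 1 and entries[0].anchored_at == LEVELS

child = [Entry("give: ___ NP NP", {"DS"}),   # anchored at DS only
         Entry("give: ___ NP PP", {"SS"})]   # anchored at SS only
adult = [Entry("give: ___ NP PP", LEVELS)]   # single, equipollent

print(is_learned(child), is_learned(adult))  # False True
```

Learning, on this sketch, is whatever process moves an item from the first state to the second.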
How might this work in practice? Suppose that we take the overgenerations
noted earlier:
(26) a. Me give book John ('I gave a book to John')
     b. I walk table ('I am walking on the table')
Rather than supposing a provisional lexical entry for these constructions, let us
imagine a real one:
(27) a. give: ___ NP NP
theme goal
b. walk: ___ NP
location
Clearly, however, this cannot be all that is said, or the entry could never be
driven out. Let us seek help from the Projection Principle, which states (roughly)
that representations at all syntactic levels are projections from the lexicon.
This means, however, that we may view S-structure as a projection from the
lexicon as well. Let us suppose that this is the case, and that the child's grammar
contains a representation of the lexical representation underlying S-structure, as
well as that underlying DS: so much is implicit in the Projection Principle. Let
us, however, put this together with the notion of anchoring. Suppose that the
child has two representations, not one, of a single lexical item (in a single
usage). One is anchored at DS; it is the one that underlies the sentences that the
child produced in (26), and is to be found in (27a).
(28) give: ___ NP NP (Me give doll mommy)
theme goal
The second is anchored at S-structure, and consists of the actual heard form:
(29) give: ___ NP PP
theme goal
The child's full representation, of a single lexical entry, therefore consists of the
following subentries, both distributed throughout the grammar (recall that there
are two entries, not one), but anchored at different points.
(30) a. Subentry 1, anchored at DS:
        DS: give: ___ NP NP
                      theme goal
        SS: give: ___ NP NP
                      theme goal
        (PF and LF entries same as DS and SS)
     b. Subentry 2, anchored at SS:
        DS: give: ___ NP PP
                      theme goal
        SS: give: ___ NP PP
                      theme goal
        (PF and LF entries same as DS and SS)
Notice that this solution immediately explains one problem which is puzzling in
Pinker's account. Namely, if the child's grammar only contains the first entry in
(28) (with the question mark attached), how is the child able to comprehend
sentences such as John gave a ball to Mary? That is, at the same time that the
child is generating (in the nontechnical sense) erroneous forms, the appropriate
representation must somehow be available to account for comprehension. The
representation in (30) supplies it, and anchors it appropriately.
More centrally, the outline of a way to handle the negative evidence
problem can be seen. Recall the problem: the child is exhibiting constructions
which appear to be ungrammatical from the point of view of the adult.
Furthermore, the lexical subcategorization frames underlying them are overgenerated,
but appear not to be eliminable by uniqueness principles alone. Rather than
marking broad sections of the grammar provisional per se, two entries are listed
in the grammar, anchored in different places. Since the representation is not
equipollent, being neither single nor anchored at all levels, the child knows that
the form, which he may be using, is not the final form appropriate for the
grammar, i.e. the child knows that learning must take place.
This method has one great advantage. Aside from marking particular entries
as provisional, it allows other entries to be unequivocally marked as final: i.e.,
complete and accurate. These are the entries which are equipollent. Thus,
suppose that at a later stage, a different representation of give were internalized.
(31) give: ___ NP PP (anchored in DS)
             theme goal
     give: ___ NP NP (anchored in SS)
             theme goal
At this stage, this part of the grammar would be equipollent, anchored both in
DS and SS (we may ignore LF and PF for current purposes). More exactly, the
syntactic structure corresponding to the projected lexical entry would be
equipollent, and so the lexical entry would be as well. This would mean, however,
that this part of the grammar could be considered by the child to be complete and true:
namely, a full and complete lexical entry. This is important, because certain
sections of the grammar must be known as certain, and not simply provisionally,
in order for the child to use them to make other judgments. That is, in situations
of partial information, it is important that certain of the pieces of information
be known absolutely (or nearly absolutely) as true. It is these points which act as
the basis for further inference.
This presents a picture different from that in Pinker (1984). Rather than the
grammar being provisional en masse, and gradually achieving certitude,
particular parts of the grammar, namely those which are equipollent, are known
to be accurate. It is these which exert their force over the rest of the grammar
from the point of view of inference. This is an advantage over Pinker's
assumption, because an entry marked as provisional does not act as the basis for
further inferences in the grammar, and hence cannot infect the rest of the grammar.
Consider now briefly how this sort of system may be accommodated to the
approach sketched in Chapters 2 and 4. A single equipollent entry may be
created from two non-equipollent entries in the following ways:
i) By retaining the D-structure-anchored entry and removing the S-structure-anchored
entry from the grammar (allowing the D-structure entry to be
anchored at all levels).
ii) By retaining the S-structure-anchored entry and removing the D-structure-anchored
entry from the grammar (allowing the S-structure-anchored entry
to be anchored at all levels).
iii) By retaining both the D-structure and S-structure entries, and positing an
operation which mediates between them.
The difference between the approach in Chapters 2 and 4 and that given directly
above lies in which mechanism is adopted. In the chapters above, the
third mechanism (iii above) is adopted, assuming that the initial representation
(me go Mommy) is actually retained in the final grammar, and the problem for
the child is to mediate between this and the adult representation, which he/she
does by means of the rule Project-α.
In the approach directly above, the D-structure-anchored entries posited by the
child are assumed to be simply false, and the child ultimately projects back the
S-structure-anchored entry, using mechanism ii). The problem there was for the
false entries to be eliminated without the entire grammar falling into disrepute
(by means of question-marking entries). This was done by allowing two
entries by the child, but not allowing them to be equipollent.
Let us assume henceforth that the approach in Chapters 2 and 4 is correct,
rather than the one outlined directly above. In this case, it will still be necessary
to allow for two entries, one anchored at DS (or, more exactly, thematic
structure), and another anchored at SS. Thus the crucial notion of anchoring still
holds. Further, it will be necessary to mediate between these two representations.
This may be done by the following rule:
(32) a. Entry merger: If α is the entry for a word anchored at DS, and
        β is an entry anchored at SS, and there is some operation T
        existing in UG mediating between α and β, then Merge (α, β)
        with the operation T.
     b. If (a) is impossible, choose α or β as the anchored entry and
        create an equipollent entry from it.
This creates a single equipollent entry, or allows a transformation to mediate
between the two known levels.
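Rule (32) can be rendered as a small procedure (a sketch; the representation of entries as frame strings and the `ug_operations` table, mapping an operation's name to a test relating two frames, are hypothetical stand-ins for the UG-given repertoire of operations):

```python
def entry_merger(ds_entry, ss_entry, ug_operations):
    # (32a): if some UG operation T mediates between the DS-anchored
    # entry and the SS-anchored entry, merge them: keep both, related
    # by T.
    for name, mediates in ug_operations.items():
        if mediates(ds_entry, ss_entry):
            return ("merge", name, ds_entry, ss_entry)
    # (32b): otherwise choose one entry (here the SS-anchored one)
    # and make it equipollent, anchoring it at all levels.
    return ("equipollent", ss_entry)

# Illustrative UG table: a dative-shift-like operation relating the
# prepositional and double-object frames of 'give'.
ops = {"dative shift":
       lambda a, b: {a, b} == {"___ NP PP", "___ NP NP"}}

print(entry_merger("___ NP NP", "___ NP PP", ops)[0])  # merge
print(entry_merger("___ NP", "___ NP PP", ops)[0])     # equipollent
```

The first call models the give case, where a UG operation mediates between the two anchored frames; the second models a case with no mediating operation, where one entry is simply promoted to equipollence.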
A number of questions at the theoretical level remain; for the remainder of
this chapter, however, I will simply concentrate on empirical evidence supporting
the notion of anchoring.
5.5 Case Study I: Tavakolian's Results and the Early Nature of Control
In the preceding section I suggested that a general solution to the problem posed
by negative evidence was to be found in the concept of equipollence. In the
rest of the chapter, I would like to investigate a particular instantiation of the
notion of shallowness of a derivation, in particular with respect to
anchoring in S-structure, with the derivation shallow with respect to DS. That is,
the situation in (33) obtains:
(33)  DS
       |
      DS′
       |       shallow derivation
      SS
This would mean that the particular grammatical functions associated with the
real DS would be unavailable to the child at the point at which DS′ were the
deepest computed level.
Such a general prospect would have three effects. First, to the extent to
which grammatical functions associated with DS are screened out in early
derivations (which only proceed back to DS′), there is evidence that the grammar
really is leveled, rather than DS being simply an aspect of SS. Second, the
indeterminacy as to where particular operations apply in the adult grammar may
be solved, or at least moved toward a solution, by looking at the child
grammar. To the extent to which development is truly organized as this model
suggests, and not simply helter-skelter, the actual content of DS, and the
principles which apply there, may be determined by noting which operations fail
to apply at the stage at which a shallow derivation is computed. Finally, the
notion of development is given real status in the grammar, both in terms of the
structure of levels in the grammar, and in the development of particular
constructions from non-equipollence to equipollence.
I discuss now three phenomena indicating the presence of a shallow
derivation: Susan Tavakolian's (1978) data concerning control into sentential
subjects, Guy Carden's (1986b) thoughtful analysis of Condition C effects in
dislocated constituents, and Roeper, Akiyama, Mallis and Rooth's (1986) paper
concerning wh-movement, Strong Crossover, and quantificational binding.
5.5.1 Tavakolian's Results
Susan Tavakolian, in an extremely interesting set of experiments (Tavakolian
1978), has investigated aspects of the acquisition of sentences with clausal
subjects. In particular, she tested children's competence on infinitival clausal
complements, including those both with and without a pronominal subject, the
latter being control structures. It is these control structures, often called instances
of nonobligatory control, after Williams (1980), which are of interest here.
Examples of the control structures tested are the following.
(34) a. To stand on the rabbit would make the duck happy.
b. To bump into the pig would make the sheep sad.
c. To walk around the pig would make the duck glad.
d. To kiss the lion would make the duck happy.
e. To hit the duck would make the horse sad.
f. To jump over the duck would make the rabbit happy.
Control properties of structures with subjectless infinitival complements (i.e.
apparently subjectless) are extremely various. In particular, the following sorts of
control seem to be possible.
(35) a. Arbitrary control, where the PRO refers to an arbitrary
        unspecified element, similar (in English) to the meaning of
        one.
     b. Control from the object or subject of the predicate of which the
        controlled clause is the complement.
     c. Control from a controller up-the-tree, so-called long distance
        control.
     d. Control or interpretation by a discourse or extra-sentential
        referent, definite in reference.
     e. Control by the prepositional object of a restricted class of
        predicates, mostly psychological predicates, into a sentential
        subject.
Additional refinements are possible: for example, between thematic and
pragmatic control (Nishigauchi 1984), and between control as it takes place in the want
vs. try class (Rosenbaum 1967). These are irrelevant for what follows.
Examples of the type of control in (35) are given below.
(36) a. To know oneself is difficult. (Arbitrary Control)
     b. John persuaded Mary to leave. (Control by Matrix Object)
     c. Bill said that shaving himself was a drag. (Long Distance Control)
     d. Have you seen Bill? Shooting himself in the foot must really
        have hurt. (Discourse Control)
     e. To know himself is difficult for Bill. (Control by Object into
        Sentential Subject)
The theoretical problem is to try to get a unified account out of this apparent
diversity. See Williams (1980), Chomsky (1981), Manzini (1983), and Clark
(1986) for further discussion. This variance notwithstanding (a topic which
will be discussed shortly), it appears that in the Tavakolian sentences, given in
(34), and in a large class of like constructions, if an object is present (including
a for object), it must control the PRO subject. Thus while (37a) is perfect with
the lion as controller, it is considerably worse with an arbitrary one reading, and
(I believe) impossible with control outside the clause, by a discourse referent.
(37) a. PRO to kiss the duck would make the lion happy.
        (the lion kisses the duck)
     b. ??PRO to kiss oneself would make the lion happy.
        (test for the one interpretation via the (in-)ability to take the
        reflexive oneself as object)
     c. Did you see the pigs? PRO to kiss the duck would make the
        cow happy.
        (Impossible under the interpretation in which the pig is kissing
        the duck)
The same appears to be the case with all of Tavakolian's examples, which are
identical to (37a) in form, except for the choice of NPs, representing different
animals. Discourse control, either by an explicitly mentioned referent (37c) or
by a pragmatically accessible entity, is impossible for adult speakers, for this
class of examples.
However, the situation with children is different. Children consistently and
systematically allow a discourse referent, one which was not even mentioned in
a previous sentence but is pragmatically available in the set of animals with
which the child was told to act out each sentence. Tavakolian's results were the
following (Tavakolian 1978: 187):
Table 1. Distribution of Responses to Sentential Subjects with Missing Complement Subjects
(To kiss the lion would make the duck happy)

Age          Matrix NP   Extrasentential NP   Other
3.0–3.6          7               12             5
4.0–4.6          8               13             3
5.0–6.6          3               19             2
Total           18               44            10
Percentage      25%              61%           14%
Thus 61% of the children allowed an extrasentential referent for PRO. For the
sentence given, this might be, for example, a pig which was included in the set
of farm animals. This percentage did not significantly change over the ages under
investigation, though it was, in fact, slightly higher for 5-year-olds than 3-year-olds.
This is clearly at variance with the adult response. Moreover, unlike other
somewhat similar cases, it is not liable to the methodological strictures of Lasnik
and Crain (1985) and Crain and McKee (1985), who note correctly that with
respect to backwards anaphora with overt pronouns, the predisposition for
children to allow an extrasentential reading (Solan 1983) does not make a
grammatical point. The authors above note that for sentences such as the following:
(38) After he left, John went to the store.
Children tend to either take he as an extrasentential referent, or, if they allow
coreference, to transpose the name and the pronoun in the following manner:
(39) After John left, he went to the store.
But as Lasnik and Crain note, the tendency for coreference in (39) is not a
grammatical phenomenon in any case (but purely pragmatic), and, even more
crucially, if the child is able to transpose the pronoun and the name in (39) in a
repetition task, this means that the two elements in (38) in the comprehension
part of the task must have been co-indexed: i.e. known to be coreferent. So the
bias, on this type of backwards anaphora, must not be part of the grammar.
The situation with the Tavakolian sentences is quite different. First, the
necessity for object coreference in sentences such as (40) seems to be a matter
of the grammar, not simply pragmatics. Adults do not allow extrasentential
referents for the control clause in examples such as the following:
(40) To kiss the duck would make the lion happy.
Second, when children are asked to repeat these examples, they may supply a
pronoun for the PRO (Tavakolian 1978), but that pronoun is taken as referring
not to the matrix object, but to an extrasentential referent. Assuming that
they are paraphrasing their semantic representation in some sense, this means
that in that representation, and thus in the output of the comprehension task, the
missing subject in the control clause is marked for extrasentential reference.
Thus the phenomenon appears to be grammatical in character.
5.5.2 Two Solutions
There are two possible avenues to take in attempting to isolate the aspect of the
child's grammar here which is different from an adult's grammar. The difference
might fall in the control rule itself (or, more exactly, in how control
interacts with levels of representation), or it might be traced to a difference in
the categorization of the null element.
If the PRO, for example, were interpreted as a simple pronoun at this stage
of development, then one might expect free extra-sentential reference. This is, in
fact, the explanation that Tavakolian herself offers: that PRO is interpreted as a
simple pronoun. A more sophisticated version of this same idea might be the
following. All null categories are, at some stage, neutralized along some
dimension. Thus if we assume that null categories, in their feature sets of ±anaphor,
±pronominal (Chomsky 1981), are stored in a paradigm, and the paradigm itself
must be learned, then there might be some intermediate stage in the articulation
of the paradigm which would predict extrasentential reference. For a recent
version of this sort of theory, see Drozd (1987, 1994). Of course, any theory
which posits a change in the categorization of PRO, either in the simple version
(namely, that PRO is construed as pro) or in the more complex version (that the null
category paradigm is learned, and neutralizations exist in it in early stages), must
give some account of how the final setting arises from the initial setting.
The second possibility is that there is some change in the control rule over
time: more exactly, in the interaction of control with levels of representation.
Suppose that, for the constructions under investigation, PRO has its normal adult
category (e.g. +anaphor, +pronominal), with standard characteristics. Suppose
that, for some reason, the Control rule coindexing the PRO with an antecedent
does not apply in the usual manner. An uncontrolled (i.e. unindexed) PRO, we
may suppose, is able to pick up a general referent from discourse, or is taken to
be arb. In this way, the same set of data could arise.
In the next two subsections, I will further discuss these two possibilities,
coming to the conclusion that of the two possible failings (a failure in
category typing, or in the control rule as it interacts with levels of representation),
the latter is the more likely, though with some real possibility of a mediated
view. Finally, I will attempt to justify the difference in the control rule in terms
of the general structure of the grammar.
5.5.3 PRO as Pro, or as a Neutralized Element
One solution to the puzzle that external reference provides for the analysis of
acquisition is simply to assume that PRO has different referential properties in
the child's grammar, and thus, in GB theory, that it has a different feature
composition at this stage of development than it does in the adult's grammar.
The propensity of children to take an external referent in cases like (40), repeated
below, would follow if PRO were simply being interpreted as little pro.
(41) PRO to kiss the duck would make the lion happy.
     (External reference: 61%)
There is some interesting empirical evidence from Tavakolian which supports
this view. First, as she notes, the propensity for external control in these
structures is very similar to that for similar constructions with overt pronouns.
Examples like (42) are given an external referent reading about 55% of the time
(Tavakolian 1978).
(42) For him to kiss the duck would make the lion happy.
This similarity in response would be explained if PRO were simply being
interpreted as a pronominal.
Such a misconstrual would be similar to the sort of theory that Hyams
(1985, 1986, 1987) proposes with respect to the acquisition of null subjects: that
they are misconstrued as little pro.
In spite of the simplicity of the proposal, there is empirical data which
severely undercuts it. This data is of one basic type: instances in which the
relevant null category seems to be acting like PRO, not pro, in the child's
grammar. To the extent to which these cases are convincing (and they appear
to be), there is no way that we can assume a general identification of PRO
with pro in the child's grammar: that is, it is not the case that PRO is simply
misconstrued as pro by the child.
The clearest empirical counterevidence to this hypothesis is to be found in
Goodluck's (1978) thesis. First, there is a class of control structures studied by
Goodluck, but not by Tavakolian, which very clearly operated as if they
contained a controlled PRO. These were purposives and rationale clauses, and
temporal adjuncts. In the case of in order to clauses, children show a clear
c-command constraint on the choice of the controller.
Table 2. Control in Purposives (Goodluck)

           Percentage Subject Control
           In sentences with     In sentences with
           Direct Object NPs     locative PPs
4 years         56.7%                90.1%
5 years         63.4%                90.1%
These are sentences like the following:
(43) a. Daisy hits Pluto PRO to put on the watch.
        (56.7% subject control, age 4;
        63.4% subject control, age 5)
     b. Daisy stands near Pluto PRO to do a somersault.
        (90.1% subject control, age 4;
        90.1% subject control, age 5)
The correct choice for adults for these constructions would of course be subject
control, for both instances. What is interesting here is that even though subject
control is not uniform for the children here, it is obligatory (or nearly so) for the
constructions involving a locative: those in (43b). This may be explained simply
if the child is not able to control PRO out of a locative PP. This c-command
constraint would not be expected with little pro, which is allowed free
coreference, like any pronoun. If we use Tavakolian's paraphrase test, we see in
addition that an overt pronoun allows reference to either the subject or the
locative NP.
(44) Daisy stands near Pluto for him to do a somersault.
If anything, the more pragmatically biased reading of (44) would be that in
which Pluto is the antecedent of him. The fact that the child does not allow
coreference with Pluto when the subject of the complement clause is nonovert
suggests that it is not acting as little pro, which should allow Pluto as antecedent,
as him does.
The second body of data weighing against the general interpretation of PRO
as pro comes again from Goodluck, and involves temporal adjuncts. These do not
allow extrasentential reference, but rather must refer back to the main clause subject.
(45) Daisy hit Pluto after putting on the watch.
The percentage of subject coreference for these was the following (Goodluck
does not provide the degree of extra-sentential reference):
Table 3. Subject Control with Temporal Adjuncts

Age   Percentage Subject Control
4              66.7%
5              63.4%
THE ABROGATION OF DS FUNCTIONS 211

These data again stand in strong contrast with the original data on control into
sentential subjects, where extrasentential control was the usual case (PRO to
kiss the duck would make the lion happy; controller: extrasentential). The
uniform assumption that PRO was simply misconstrued as pro at this stage of
the child's grammar would predict uniform results in the two cases: this is not
what we find. Again, supplying an overt pronoun, which presumably should operate
similarly to pro, easily allows object coreference, to the extent to which the
construction is grammatical at all.
(46) Daisy hit Pluto after him putting on the watch.
(Only semi-grammatical, but either main clause NP may be coreferent)
All this suggests that the simple misconstrual hypothesis cannot be maintained:
while one might suppose, on the basis of Tavakolian's initial results, that PRO in
sentential subjects was being interpreted simply as pro, as a result of a general
tendency for the null element to take such an interpretation, this solution would
overgenerate pro in positions in which something much closer to the adult
control rule was operative. These are purposives and temporal adjuncts.
So it cannot be the case that in early grammars PRO is simply uniformly
interpreted as pro.
There is a second possibility that we might consider, given the basic
miscategorization hypothesis: namely, that the paradigm of null categories is
underdifferentiated in initial stages, so that the relevant null category is neither
pro nor PRO, but something antedating either. This might be done, for example,
by supposing that only the ±pronominal feature, and not the ±anaphoric
feature, was operative in initial stages. The reduced paradigm would look like the
following.
(47) Full Paradigm (Adult)

                 +Pronominal   -Pronominal
    +Anaphor         e              e
    -Anaphor         e              e

(48) ?Reduced Paradigm (Child)?

                 +Pronominal   -Pronominal
    +/-Anaphor       e              e
In the adult paradigm, the null category will be interpreted as one of the four
major types depending on the slot that the element fills in the paradigm: e in
[+pronominal, +anaphor] = PRO; e in [+pronominal, -anaphor] = pro; e in
[-pronominal, +anaphor] = NP-trace; e in [-pronominal, -anaphor] = wh-trace.
Suppose that the paradigm is neutralized along the anaphoric dimension: the
anaphoric feature has not yet been discovered by the child. The result would be a
collapsed paradigm in which the anaphoric feature played no role: the +pronominal
null category would be a hybrid between PRO and pro; the -pronominal null
category would be a hybrid between NP-trace and wh-trace. (Needless to say,
this is not the only sort of neutralized paradigm for null categories that might be
imagined: I choose this as illustrative.)
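The paradigm logic lends itself to a small computational sketch (purely illustrative; the feature labels follow (47)-(48), but the function names and the "hybrid" labels are mine, not part of the analysis):

```python
# Adult paradigm: a null category's type is read off its binding-theoretic features.
ADULT_PARADIGM = {
    ("+pronominal", "+anaphor"): "PRO",
    ("+pronominal", "-anaphor"): "pro",
    ("-pronominal", "+anaphor"): "NP-trace",
    ("-pronominal", "-anaphor"): "wh-trace",
}

def classify_adult(pronominal, anaphor):
    """Map a feature pair to one of the four adult null-category types."""
    return ADULT_PARADIGM[(pronominal, anaphor)]

def classify_child(pronominal, anaphor):
    """Hypothesized child paradigm: the anaphoric feature is not yet
    operative, so PRO and pro (and likewise NP-trace and wh-trace)
    are not distinguished."""
    if pronominal == "+pronominal":
        return "PRO/pro hybrid"            # neutralized +pronominal cell
    return "NP-trace/wh-trace hybrid"      # neutralized -pronominal cell

# The child's classification ignores the anaphoric feature entirely:
assert classify_child("+pronominal", "+anaphor") == classify_child("+pronominal", "-anaphor")
```

The collapse consists simply in the second feature making no difference to the output, which is the formal content of the neutralized paradigm in (48).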
The general properties of such a paradigm collapse in acquisition (more
exactly, underdifferentiation: Pinker and Lebeaux 1982; Pinker 1984) are quite
interesting, and bear further investigation: see Drozd (1987, 1994) for further
discussion. Nonetheless, there is reason to believe that for the particular set of
data under discussion, even this sophisticated version of the misinterpretation of
the early null category is insufficient. The reason is the following. The logical
difficulty with the hypothesis that PRO is, in early grammars, uniformly
interpreted as pro (i.e. a simple, though null, pronoun) resided in the fact that
different constructions operated differently. Control structures of the type
directly studied by Tavakolian, those involving control of an object into a
sentential subject, did indeed allow the null category in the subject of that control
clause to freely choose extrasentential referents. One way of accounting for this
would be to suppose that it was operating as a null pronominal, and that this
followed from some deficiency in the interpretation of PRO. However, this
deficiency is hardly viable, given that other instances of PRO operate in a
standard way, namely as an obligatorily controlled element. That is, the difficulty
cannot reside simply in the misinterpretation of PRO, since this difficulty
would then be expected to be general: but it is not.
The same logic would rule out the undifferentiated paradigm explanation,
at least if it is considered alone to be the root cause. The underdifferentiated
paradigm would allow us to posit a new null category, not yet differentiated
between PRO and pro. But whatever this set of properties were, we would
expect them to be consistent, just as the properties of PRO and pro are. However,
this is precisely what we do not find: sometimes the null category is acting
like a small free pronominal, and sometimes as controlled PRO. That is, while
the underdifferentiated paradigm idea would introduce a difference between the
adult grammar and the child's in the interpretation of the element, it does not
seem to be the right sort of difference: what is needed is a difference which will
allow the null element in the sentential subject construction to have different
properties from the adult PRO, but the null category in (say) temporal adjuncts
not to have such different properties. An underdifferentiated paradigm, by itself,
would not capture this difference (but see Drozd 1987, 1994, for a somewhat
different point of view).
5.5.4 The Control Rule, Syntactic Considerations: The Question of C-command

Given the failure of the theory that initial PRO is analyzed as pro (or as a
neutralized category), we are driven to look elsewhere for an answer. The
questions appear to be:

(49) a. Why are certain control complements (purposives, temporals)
        behaving differently than others (control into sentential subjects)
        for children?
     b. Why are children treating clausal subject complements differently
        than adults, with respect to control?

Underlying this, we might wish for a unified theory of control, at least at some
level.
One possibility for the difference in (49a) is that this difference is associated
with the distinction between OC (Obligatory Control) and NOC (Nonobligatory
Control), in the sense of Williams (1980). Goodluck (1978) makes such a
suggestion. This might appear to be on the right track, yet recent work (Lebeaux
1984, 1984–1985; Sportiche 1983) suggests that the distinction between the two
sorts of control, while existent, is not of the primitive sort posited by Williams.
If the analysis of arbitrary control of Lebeaux (1984) and Epstein (1984) is correct,
then so-called PRO_arb is not an unbound element (i.e., a free variable), but rather
operator-bound. Evidence for this is found in double binding constructions
(Lebeaux 1984).
(50) a. PRO to know him is PRO to love him.
b. PRO to get a nice apartment requires PRO getting a higher
paying job.
(*PRO to get a nice apartment requires PRO getting trustworthy
tenants.)
c. PRO to become a movie star involves PRO becoming well-known.
(*PRO to become a movie star involves PRO recognizing you.)
In such constructions, each PRO is arbitrary in reference, but the two PROs must
be linked in reference. Thus (50a) means that for some arbitrary person to know
him, that same arbitrary person will love him. (50b) means that for some person
x to get a nice apartment, that same person x must get a higher paying job. The
logical representations of the structures in (50) are therefore the following.
(51) a. O_x ((PRO_x to know him) is (PRO_x to love him))
     b. O_x ((PRO_x to get a nice apartment) requires (PRO_x getting a
        higher paying job))
     c. O_x ((PRO_x to become a movie star) involves (PRO_x becoming
        well-known))
Further, the operator binding must take place quite locally, since the double
binding effect disappears when one of the open sentences is further embedded.
(52) PRO being from the Old World means that stories about PRO
winning the West are unlikely to be thrilling.
(53) PRO being from the Old World means PRO hearing stories about
PRO winning the West.
In (52), the two arbitrary PROs are unlinked: since the latter is embedded in an
NP, it is not close enough to the other arbitrary PRO for them to be bound by
the same operator. In (53) the first two PROs are close enough, and they are
linked in reference (bound by the same operator); the third arbitrary PRO is
unlinked, being embedded in an additional NP.
In earlier work (Lebeaux 1984), I suggested: i) that PRO (including arbitrary
PRO) must always be bound (to account for the above facts), ii) that the binding
is local (to account for the above facts and the crossing effects), and iii) that the
binding element was a universal quantifier in an operator position.
I wish to retain the first two of these assumptions, and may do so via the
following specification:

(54) PRO must be bound in the minimal maximal NP or S' dominating
     the controlled S', where the controlled S' is the S' most immediately
     dominating PRO.
With respect to the third assumption, I will change it here in the following
way: the binding element is not a universal quantifier, but rather a
simple abstractor. This abstraction, however, does not take place locally, i.e.
within the predicate, but at the S or S' level (compare Chierchia 1984). This
element will continue to be represented with O, simply meaning a null category
(assuming that null categories in particular positions may have operator status).
Second, in part in response to considerations raised in Browning (1987), I will
no longer assume the binder to be in Comp, but in a topic position, or some
(quasi-)theta position peripheral to S. The reason for this will appear below.
The idea that arbitrary PRO and long distance control PRO are in fact
operator bound (see the above mentioned work for additional discussion of Long
Distance control PRO) means that another issue, which might initially be thought
to be decided by incontrovertible evidence, is thrown into high relief. Namely, is
c-command necessary for all instances of control? It would be well if it could be,
since this would regularize it to other co-indexing operations in the grammar. Yet
both Williams (1980) and Chomsky (1981) are driven to answer the question in
the negative: Williams uses the lack of c-command as a criterion for his
classification of Non-Obligatory Control, while Chomsky correctly notes the
existence of sentences such as this:
(55) PRO to learn math is necessary for John's development.
Indeed, if John is the direct controller of PRO here, this would directly
counterexemplify the need for c-command. With the possibility of a nonovert
topic operator, however, PRO could get its reference from that operator, with the
topic itself taking the reference of John from discourse, similar to the situation
noted in Huang (1982) for Chinese.
(56) O_i ((PRO_i to learn math) is necessary for John's development).
One argument against such an analysis is that there would be a Condition C
violation with respect to the coindexed operator and John, which have the same
referent. However, this violation is weak in pseudo-topic-type constructions in
English.

(57) ?As for John, to learn math is necessary for John's development.

Thus one would not expect Condition C to rule out sentences such as (56).
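Since the argument turns on whether John c-commands PRO in sentences like (55), it may help to make the c-command computation explicit. The following sketch uses the standard first-branching-node definition; the bracketing assumed for (55) is simplified and hypothetical, and none of this code is part of the original analysis:

```python
# A phrase marker as a nested list: [label, child, ...]; leaves are strings.

def contains(node, target):
    """True if node dominates (or is) target."""
    if node is target:
        return True
    return isinstance(node, list) and any(contains(c, target) for c in node[1:])

def find_path(node, target, path=()):
    """Ancestor chain from the root down to target, or None."""
    if node is target:
        return path + (node,)
    if isinstance(node, list):
        for child in node[1:]:
            p = find_path(child, target, path + (node,))
            if p is not None:
                return p
    return None

def c_commands(tree, x, y):
    """x c-commands y iff the lowest branching node properly dominating x
    also dominates y, and neither x nor y dominates the other."""
    if contains(x, y) or contains(y, x):
        return False
    path = find_path(tree, x)
    for ancestor in reversed(path[:-1]):
        if len(ancestor) > 2:               # branching node
            return contains(ancestor, y)
    return False

# A simplified, hypothetical bracketing for (55):
PRO = ["NP", "PRO"]
JOHN = ["NP", "John"]
SENT = ["S",
        ["S'", PRO, ["VP", "to learn math"]],
        ["VP", "is necessary", ["PP", "for", ["NP", JOHN, "'s development"]]]]

# John's first branching ancestor is the possessive NP, which does not
# dominate the sentential subject, so John does not c-command PRO.
assert not c_commands(SENT, JOHN, PRO)

# A sentence-peripheral null topic c-commands everything, including PRO.
TOPIC = ["O", "null topic"]
SENT_WITH_TOPIC = ["S'", TOPIC, SENT]
assert c_commands(SENT_WITH_TOPIC, TOPIC, PRO)
```

On this (assumed) bracketing, the structural gap the null-operator analysis fills is visible directly: the name inside the PP cannot c-command PRO, but a peripheral topic operator can.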
There does seem to be some quite suggestive evidence for just such an
analysis. It is of the following form: there are widespread similarities between the
patterns of grammaticality and ungrammaticality for sentences which allow (or
do not allow) a bound reading for control, and the grammaticality and ungrammaticality
of sentences with an overt as-for topic. Consider first the following
sentences:
(58) a. As for John_i, this shows that he_i is a liar.
     b. ?*As for John_i, this shows that John_i is a liar.

(59) a. As for John_i, this sort of thing is important for John_i's development.
     b. ?*As for John_i, this sort of thing is important to John_i's mother.
The contrast in (58) is expected: the Condition C violation, while weaker for
these constructions than that usually found, is still present (it is considerably
weaker when a name c-commands a name than when a pronoun c-commands a
name, as Lasnik 1986 notes). The contrast in (59) is the really interesting one.
When a name is part of a nominal like John's development, it causes a much
weaker Condition C violation than when it is part of a nominal like John's mother.
From (59) it is not clear whether this is because the nominal is deverbal in the
case of John's development, or because John's mother is an animate referent; for
present purposes, this does not matter. Quite likely, the contrast in (59) is not
due to Condition C at all, but to the aboutness relationship which must hold
between the as-for topic and the rest of the sentence, which is (for some reason)
easier to contrive in (59a) than in (59b).
Consider now the corresponding cases with control. Corresponding to (58a)
and (b) is the following contrast from Bresnan (1982). (My judgements; Bresnan
finds a stronger contrast yet, OK vs. *.)
(60) a. PRO_i contradicting himself demonstrates that he_i is a liar.
        (Bresnan (51))
     b. ??PRO_i contradicting himself demonstrates that Mr. Jones_i is a
        liar. (Bresnan (52))
If one assumed that there were some rule of control by which the PRO were
co-indexed with the element in the lower clause, or one assumed an
arb-rewriting rule, this contrast would be inexplicable: such a rule would not be
expected to be sensitive to the pronoun vs. name distinction. However, this
contrast would be explained if there were a null topic in the cases in (60): in
(60b), but not in (60a), a name would be c-commanded by another name.
(61) a. As for John_i, this sort of thing is important for John_i's development.
     b. *As for John_i, this sort of thing is important to John_i's mother.
The contrast in (61) is the really interesting one. When a name is part of a
nominal like John's development, it causes a much weaker violation than when
it is part of a nominal like John's mother. Quite likely, the contrast in (61) is not
due to Condition C at all, but to the aboutness relationship which must hold
between the as-for topic and the rest of the sentence, which is (for the same
reason) easier to contrive in (61a) than in (61b). Consider now the following set
of sentences. Here, again, the apparent control contrast is strongly in parallel
with the aboutness relation contrast.
(62) a. As for John, this is important for John's development.
     b. *As for John, this is important to John's mother.
     c. PRO_i to learn math is important for John_i's development.
     d. *PRO_i to learn math is important to John_i's mother.
The contrast between (62a) and (b) parallels that between (62c) and (d), though
the latter is a case of control and the former is not. This is explained if we
assume that it is not the control rule which is sensitive to such predicates as
part-of-a-possible-controller, but rather that the control rule, like other co-indexing
rules, requires c-command by some minimally local element: in this case a null
topic. The contrast between (62c) and (d) would then not be stated in the control
rule itself; rather, such a contrast would be factored into the possible constraints on
an aboutness rule, which is independently needed to explain the contrast in (62a)
and (b), and a structure-dependent control rule.
Theoretically, this allows the control rule, a coindexing rule, not to be
sensitive to pragmatic information, and thus regularizes it to other coindexing
rules in the grammar, to some degree.
The same sort of parallel holds for cases of embedded object vs. embedded
subject, as the following quadruplet shows:
(63) a. ?As for Bill, this shows that Bill is really smart.
     b. *As for Bill, this shows that Mary is right about Bill.
     c. ?PRO_i winning the Nobel Prize shows that Bill_i is really smart.
     d. *PRO_i winning the Nobel Prize shows that Mary was right about
        Bill_i.
The same comments apply.
To summarize: by choosing a null topic analysis, the pragmatically sensitive
features of control are, to some degree, teased out, and taken to be part of the
aboutness relation between the null topic and the following clause. This
specification is independently needed, as shown above. While this does not identify
obligatory and nonobligatory control (differences exist between them: if
Lebeaux 1984–1985 is correct, the former and not the latter is bound by predication;
other differences are found in Koster 1984 and Franks and Hornstein
1990), it does regularize control, even so-called arbitrary control, to other
c-commanding relations in the grammar. Directly relevant here: it would mean
that c-command characterizes all cases of control.
Pending further analysis, then, I will assume that the operator-type structure
given in (56) is the correct analysis for sentences such as (55). We are still left
with the problem of explaining the acquisition facts, and a major linguistic
problem as well. In a large number of constructions with control into sentential
subjects, it is not possible to take a discourse antecedent, as already noted with
the Tavakolian sentences; further, in others, a long distance antecedent is
not viable. Examples are given below.
(64) a. *Did you see the pig_i? PRO_i to kiss the duck would make the
        lion happy.
     b. *John_i thinks that PRO_i to sleep more would be pleasing to his
        father.
     c. *Bill_i said that PRO_i knowing himself was difficult for his wife.
This is in spite of the fact that in many cases an as-for topic would be sanctioned
by the following context. Thus in the sentences in (65) and (66) the as-for topic
would seem to be licensed by the possessor identical to it. Yet it is still not
possible to have control occur.
(65) a. As for John, this would please his father.
b. As for Mary, this made her mother angry.
(66) a. Do you know John_i? *PRO_i to succeed in business would
        please his_i father.
     b. *I've met Mary_i, and PRO_i to have a mohawk haircut makes
        her_i mother angry.
Therefore, in cases like (65) and (66), it is necessary to prevent the PRO from
being controlled by a topic or a discourse element.
We thus appear to have two sets of conflicting data. On the one hand, it
appears that an operator-type analysis fulfills three functions: i) it explains the
linked reading for arbitrary PROs, and the locality involved, ii) it explains the
crossing effects for long-distance binding of PRO, and iii) it allows c-command
to be maintained for examples such as (55). On the other hand, the possibility of
an operator-type reading which is associated with Long Distance control and
discourse control is strictly limited. In none of the examples in (64) is such
a reading available.
Let us first note again an observation made earlier: the class of control-into-subject
constructions is associated with a restricted group of predicates. These are
of differing types: i) psychological predicates (please, disgusts, excites, etc.), ii)
tough-predicates (tough, easy, etc.), iii) predicates involving necessity (is
necessary, requires, etc.), and iv) predicates involving causation (make, etc.). While
these predicate-types differ from each other in a number of ways, the latter three
types allow an expletive subject.
(67) It is tough (for John) to do that.
(68) It is necessary (for John) to do that.
(69) It makes Mary happy to do that.
Psychological predicates normally take an NP subject, but if the argument is
clausal, this may also appear in an extraposed position.
(70) It pleases Mary {to do that / that Jeff is so handsome}.
It has recently been argued, convincingly, that psychological predicates have, in
at least some of their uses, two internal arguments at DS (Belletti and Rizzi
1986; Johnson 1986). Accepting the basic position of Belletti/Rizzi and Johnson,
this means that the D-structure of (71b) is (71a), and that the NP is moved into
subject position.

(71) a. e please John pictures of himself.
     b. Pictures of himself please John.

(The authors above differ in their DS assignments: Belletti and Rizzi assume
that the s-structure subject starts off in the most internal position in the VP,
while Johnson assumes that it originates in a 2nd NP position. For concreteness,
I have used Johnson's structure.)
A piece of supporting evidence for the Belletti/Rizzi and Johnson analysis
for English is the placement of arguments in nominals. In nominals, where case
is not a consideration, both arguments appear in internal position.
(72) the pleasure of John in Mary's company
(Mary's company pleases John)
The two-internal-arguments analysis allows a long-standing difficulty with the
binding theory to be resolved. Given the standard analysis in which reflexivization
requires c-command, structures such as (73) constitute a puzzle.

(73) a. Pictures of himself please John.
     b. Each other's choice of friends baffles the two boys.
(74) ((e)_NP (please (John)_NP (pictures of himself)_NP)_VP)_S
Suppose that we extend the Belletti and Rizzi analysis to the (somewhat more
problematic) cases of control. The DS of (75a) would then be (75b); the DS of
(76a) would be (76b).
(75) a. PRO to kiss the duck would please the lion.
b. e would please the lion (PRO to kiss the duck).
(76) a. PRO to kiss the duck would make the lion happy.
b. e would make the lion happy PRO to kiss the duck.
Further, assume that the small clause complement originates as part of a complex
verb (Chomsky 1957, 1975 [1955]; Bach 1979).

(77) e would make-happy the lion (PRO to kiss the duck).
The strict c-command of the PRO by the controller is in all cases preserved.
Such an analysis, in conjunction with the operator-type analysis, allows us
to retain strict c-command as a necessary condition for control. The
near-obligatoriness of control by the object in these constructions is tied to the deep
structure position of the control clause, before it is fronted. Suppose that control
at DS is generally obligatory, modulo the data discussed above. Suppose that for
some predicate it does not take place at DS. Then the clause is fronted to an
S-initial position. Assuming that PRO always looks to its nearest c-commanding
antecedent, and that this relation is local, then at this point null topic insertion
will take place, if the clause has escaped control by the DS object, closing off the
sentence. In effect, the necessity of object control for most predicates when the
clause is in subject position is a bleeding phenomenon, in which the object
co-indexes with the controlled PRO. Only in cases where the clause has escaped
control is reference up the tree possible.
This account traces both the possibility and the near-obligatoriness of object
control to the deep structure position of the clause as sister to the object. It is at
this point that object control occurs, if it does. The solution is the following:

(78) a. The controlled subject clause is an internal argument at DS.
     b. The control of PRO is defined directly, rather than using a
        derived notion of c-command.
     c. External reference, or reference up the tree, takes place after
        the clause has moved to its S-structure position, and only then.
5.5.5 The Abrogation of DS Functions
We are now in a position to trace the difference in the child's interpretation of
sentential subjects to a very simple difference in the control rule, as it interacts
with levels of representation. The difference is simply this: for the child, but
not for the adult, the control clause is always in the fronted position, both at
S-structure and at the deepest computed level, D-structure. (When I say
dislocated, above and throughout the chapter, I am purposely being vague as to
whether I mean moved or base-generated in a dislocated position, unless
otherwise specified.) When the controlled clause originates in a VP-internal
position at D-structure in the adult grammar, the following sequence of operations
applies in the adult grammar (see (78)): i) application of control, dependent on
direct c-command; ii) movement of the control clause to the fronted position.
(79) Adult grammar

     a. DS:
        [S [NP e] [I' would [VP [V make happy] [NP the lion] [S' PRO to kiss the duck]]]]

        (Control applies)

     b. [S [NP e] [I' would [VP [V make happy] [NP the lion]_i [S' PRO_i to kiss the duck]]]]

        (Fronting applies)

     c. [S' [S' PRO_i to kiss the duck]_j [S [NP e] [I' would [VP [V make happy] [NP the lion]_i e_j]]]]
The control rule applies in (near-)obligatory fashion to the representation in
which the control clause is an internal complement. It demands direct c-command,
and has it. Following control, the clause is fronted. The result is the coindexed
representation, with PRO coindexed with the controller (here: the lion). Since an
element may only be indexed once, this control is final: there is no possibility of
extra-sentential reference (in the adult grammar).
Consider now what happens with the child. The control rule is constant: it
applies in the child grammar exactly as in the adult system, and requires direct
c-command. PRO also has the same status in the child's grammar as in the
adult's. The only difference is the following: the child's initial representation is
shallow: DS' rather than DS. At DS', the deepest level the child computes, the
controlled clause is already in the fronted position.
(80) Child's representation (DS' and SS):

     [S' [S' PRO to kiss the duck]_j [S [NP e] [I' would [VP [V make happy] [NP the lion] e_j]]]]
Given that the clause originates in the surface fronted position, it is never in the
c-command domain of the object (the lion). Since control is stated over direct
c-command relations, this means that control cannot apply to coindex PRO with
the object. Instead, unindexed, it looks up the tree, to the inserted topic, for an
antecedent. This is precisely Tavakolian's result. (Since the Projection Principle
and Theta theory must be satisfied, the fronted element binds a null category in
the theta position. This null category, however, is not a trace in the derivational
sense: the dislocated control clause never originated in that position.)
In effect, the application of the control rule in the adult grammar bleeds the
possibility of external reference. Since the clause does not originate in an internal
position in the child's grammar (the deepest representation being shallower), the
co-indexing, and the bleeding, do not occur. Hence extrasentential reference is
expected.
The structure of the adult and child grammars is therefore the following:

(81)
           Adult Grammar              Child Grammar

    DS     Control-by-Object          Control-by-Object
            |  Fronting                |  Fronting
    DS'                               <- Deepest computed level by child
            |                          |
    SS     Control of                 Control of
           as-yet-unindexed           as-yet-unindexed
           elements (by topic)        elements (by topic)

    (Control applies throughout, in both grammars.)
The control rule and the fronting rule apply identically in the child's grammar
and the adult's. The only difference is that the deepest computed level for the
child is post-fronting, while that for the adult is pre-fronting. As such, the
operation which is the default for adults, namely control of as-yet-unindexed
elements, applies as a matter of course for children. This involves operator-binding,
allowing reference extrasententially, or up the tree.
This allows for an explanation not only of the difference in binding into
sentential subjects found by Tavakolian, but also of the instances in which control
applies identically in the child's grammar and in the adult's. These are found in (82).
(82) a. Daisy hits Pluto PRO to put on the watch.
b. Daisy stands near Pluto PRO to do a somersault.
c. Daisy hits Pluto after PRO putting on a watch.
As noted earlier, the children's results in these constructions do not allow
extrasentential reference. Given the structure of the grammar in (81), and the
retention of the standard control rule by children, this result is expected. In the
instances of control with the sentential subject, the actual control rule in the adult
grammar applies when the clause is in an internal position. Since the child's
grammar is shallow, the clause is never in that position, and escapes standard
control. There is no fronting, however, in the examples in (82). Hence the
control rule applies, as it should, and the child's interpretation is identical with
the adult's.
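The derivational account of this section can be summarized in a toy procedural model (a sketch only: the level names follow (81), while the function and its return values are hypothetical conveniences):

```python
def run_derivation(deepest_level):
    """Derive the controller of PRO in 'PRO to kiss the duck would make
    the lion happy', given the deepest level the grammar computes.

    'DS'       -- adult: the control clause starts as the object's sister.
    'DS-prime' -- child: the clause is already in the fronted position.
    """
    controller = None
    if deepest_level == "DS":
        # At DS the object c-commands PRO, so (near-obligatory) control
        # coindexes PRO with the object; indexing is final.
        controller = "the lion"
    # Fronting applies (vacuously for the child, whose deepest
    # representation already has the clause in the fronted position).
    if controller is None:
        # At SS an as-yet-unindexed PRO looks up the tree to the inserted
        # null topic, permitting extrasentential reference.
        controller = "null topic"
    return controller

# Adult grammar: object control at DS bleeds external reference.
assert run_derivation("DS") == "the lion"
# Child grammar: control by the object never had a chance to apply.
assert run_derivation("DS-prime") == "null topic"
```

The bleeding relation is just the ordering visible in the code: once the object has coindexed PRO at the deep level, the later topic-insertion step finds nothing left to index.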
5.6 Case Study II: Condition C and Dislocated Constituents
In this section, I would like to look at a different range of experimental evidence
supporting the conclusion advanced above: namely, that if a particular operation
(e.g. control) applies at DS (as well as elsewhere), and the child's analysis is
shallow, then the child's grammar will not show evidence of the operation
applying. This amounts to the abrogation of that particular DS function. In the
previous section, the form that the abrogation took was the lack of an
obligatory rule: the indexing function between internal arguments at DS, a
positive condition. In this section, the abrogation is of a negative condition:
namely, Condition C as it applies at DS. As such, the child's grammar will
appear to overgenerate. This is a result, again, of the shallowness of the child's
analysis.
The empirical data that I will draw upon are taken largely from a review
article by Guy Carden (Carden 1986b), which in turn analyzes, and re-analyzes,
a variety of sources. I take Carden's proposal to be particularly acute, and follow
it in a number of respects (though for some counter-discussion, see Lust 1986).
Carden (1986b) explores in some detail the differences which follow from
what he calls a Surface vs. an Abstract Model of Noncoreference. The
progenitors of the Abstract Model he gives as Carden (1986a), Carden and
Dietrich (1981), and McCawley (1984); the Surface Model is that advanced by
Reinhart (1983). Recent models, not discussed by Carden, have fallen under the
rubric of Reconstruction, whether real Reconstruction or quasi-Reconstruction
where no actual reconstruction is found: see, e.g., Higginbotham (1985),
Williams (1987), and Barss (1985, 1986) for proposals along these lines. Carden,
and the discussion here, draw on both the adult grammar and the acquisition
evidence.
The relevant data are examples such as these:
(83) a. *Near John_i, he_i saw a snake e.
     b. *In John_i's bag, he_i put some tennis shoes e.
     c. Near him_i, John_i saw a snake e.
     d. In his_i bag, John_i put some tennis shoes e.
In (83c) and (d), coreference between the pronoun contained in the preposed PP
and the subject is possible; in (83a) and (b), the name in the preposed PP may
not be coreferent with the subject pronoun. In Carden's account, these facts may
be accounted for in two distinct ways: by an abstract account which states the
condition on disjoint reference at deep structure, or by Reinhart's surface
model, where such conditions are stated on s-structure. For Reinhart, this does
not involve reference to the trace as well. To the two possibilities outlined by
Carden, we may add a third: the possibility of using a level of Reconstruction
Structure, or using derived c-command relations at a level like LF. This is more
like the D-structure approach in making use of the original DS position.
The D-structure model of Carden would state the conditions on disjoint
reference at D-structure; the sentences in (83a), (83b) would then be related to
their DS counterparts.
(84) a. *He_i saw a snake near John_i.
     b. *He_i put some tennis shoes in John_i's bag.
The sentences in (84) would then be marked ungrammatical at DS, as violating
Condition C. They retain their ungrammaticality throughout the derivation. This
is identical to the position suggested in Chapter 3.
In Reinhart (1983), Condition C is stated over S-structure, without using the
position from which the dislocation occurred. This is done by extending the
notion of c-command so that an element c-commands all relevant elements in its
maximal projection, as in (85) (as in Aoun and Sportiche 1981). (Or it may be
done by positing a structure in which the preposed PP hangs off S, as in (86).)
(85) *[S [PP Near John_i] [S [NP he_i] [VP saw a snake]]]

(86) *[S [PP Near John_i] [NP he_i] [VP saw a snake]]
Using some modified notion of c-command, the relevant coreference relations in (85),
(86) may be expected to be impossible, on the derived position (Reinhart 1983).
While a number of arguments may be broached against this S-structure
solution to the problem of disjointness, perhaps the strongest argument has never
been mentioned in the literature, to my knowledge. This is simply that the
necessity for disjointness holds even under additional embedding.
(87) *Near John_i, Bill said he_i saw a snake.
Reference to simple S-structure position, without reference to some trace,
clearly will not work for this sentence, since there is no manipulation of the
phrase marker by which he directly c-commands John here. While it might be the
case that disjoint reference (i.e. Condition C) applies to an intermediate
structure, e.g. when it has been fronted in the lower clause but not the upper,
it seems clear that Reinhart's general solution, in terms of S-structure
c-command without reference to traces, is not really viable.
5.6.1 The Abrogation of DS Functions: Condition C
Let us now consider some acquisition evidence. The following pattern of results
is from Carden 1986b, summarizing a large body of experiments (for more
detailed evidence, see that article). The names of the experimental conditions
have been changed to reflect current GB-style terminology.
(88) Question-Answering Interpretation Task (Age: 3.5–7.0). (italicized
     elements coreferent)
     a. Pronominal Coreference
        i)  _Mickey_ is afraid that _he_ might fall down.
            (78% coreference: Ingram and Shaw)
        ii) _Ken's_ mother said that _he_ was sick.
            (96% coreference: Taylor-Browne)
     b. Condition C: Dislocated Constituent
        i)  Under _Mickey_, _he_ found a penny.
            (78% coreference: Ingram and Shaw)
        ii) Near _Barbara_, _she_ dropped the earring.
            (76% coreference: Taylor-Browne)
     c. Pronominal Coreference: Dislocated Constituent
        Near _him_, _Wayne_ found the programme.
        (69% coreference: Taylor-Browne)
     d. Condition C
        i)  _He_ was glad that _Donald_ got the earring.
            (24% coreference: Ingram and Shaw)
        ii) _He_ was glad that _Wayne_ was coming.
            (13% coreference: Taylor-Browne)
The data may be summarized as follows. First, simple pronominal coreference is
of course possible (88a). Second, contrary to the sometimes posited linearity
conditions on children's grammars, backwards coreference also appears possible,
with the coreferent pronoun preceding the name (as long as it doesn't c-command it).
This is shown by examples like (88c): Near him, Wayne saw a snake. This is in
line with the Solan (1983) conclusion. Third, children at this age do appear to
have Condition C, as they rightly reject coreference in examples like those in
(88d): He was glad that Wayne was coming. In all these respects, i.e. in
examples (88) (a), (c), and (d), children are behaving identically to adults.
Finally, however, they do diverge from the adult data in (88b). Coreference is
allowed by children in examples in which the fronted PP contains a name which
is coreferent with a pronoun which c-commands it in D-structure: Under
Mickey, he found a snake. That is, instances of Condition C with a dislocated
name are not blocked for the child.
Carden draws the correct conclusion with respect to the consequence this
data has for the D-structure account vs. Reinhart's direct c-command account.
(Reconstruction accounts here would pair with the D-structure account.) Reinhart
unifies the instances of obligatory disjointness in (89a) and (b) at a single level,
and under a single condition: the c-command condition (Condition C) applying
at S-structure.
(89) a. *He was angry that Wayne was there.
b. *Under Wayne, he put a dime.
Insofar as such a unification is appropriate, one would expect it to appear
uniformly in the developmental sequence as well: either the c-command condition
on coreference holds or it does not, at any given developmental stage.
However, this is not the case: examples like (89a) are correctly rejected by the
child (under the coreferent interpretation), while coreference is possible in (89b).
But there is no value of the c-command condition as Reinhart states it which
could change over time to allow (89b) in, while (89a) is out.
The D-structure account, and the Reconstruction account as well, would
distinguish the data in the appropriate way, by allowing for two distinct
factors: Condition C itself, and the fact of dislocation. Condition C holds for
the child, but not under movement. This suggests that it is not Condition C
which is at fault at all, but rather that the D-structure representation which would
act as the target for that condition is not computed by the child, under conditions
of dislocation.
In particular, the data follows from the following shallow derivation. The
acquisition sequence is shallow, going back only to DS′ rather than to DS. This
is shown in (90) and (91).
(90) Adult analysis.
     DS: *He_i saw a snake near John_i.
     SS: *Near John_i, he_i saw a snake t. (retains star)

(91) Child analysis.
     DS′: Near John_i, he_i saw a snake e.
     SS:  Near John_i, he_i saw a snake e.
(92) Full derivation: DS → DS′ → SS (Condition C applies throughout)
     Shallow analysis: DS′ → SS
Condition C applies throughout, and it applies directly to structurally defined
c-command, rather than through some derived notion using chains as an equiva-
lence class. But this means that Condition C will not apply if the child has a
shallow structure such as that in (91) at the deepest level of analysis. This is
precisely Carden's result.
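The logic of (90)–(92) can be sketched computationally. The following toy Python illustration is my own (not from the text): it treats Condition C as a negative condition checked at every computed level. The adult derivation includes the DS level, at which the pronoun c-commands the name, so the sentence is starred; the child's shallow derivation begins from the fronted structure, so the condition's structural description is never met.

```python
# Toy model: trees are nested tuples (label, child, ...); lexical leaves are
# tagged ("pron", index), ("name", index), or ("word", form).

LEAF_TAGS = ("pron", "name", "word")

def nodes(tree, path=()):
    """Yield (path, subtree) for every node; a path is a tuple of child slots."""
    yield path, tree
    if tree[0] not in LEAF_TAGS:
        for i, child in enumerate(tree[1:], 1):
            yield from nodes(child, path + (i,))

def dominates(p, q):
    return len(p) < len(q) and q[:len(p)] == p

def c_commands(p, q):
    # p c-commands q iff p's mother dominates q and p does not dominate q
    return bool(p) and p != q and not dominates(p, q) and dominates(p[:-1], q)

def condition_c_violation(tree):
    """True iff some pronoun c-commands a coindexed name in this tree."""
    all_nodes = list(nodes(tree))
    return any(
        a[0] == "pron" and b[0] == "name" and a[1] == b[1] and c_commands(p, q)
        for p, a in all_nodes for q, b in all_nodes
    )

# DS of (90): He_i saw a snake near John_i -- the pronoun c-commands the name.
ds = ("S", ("pron", "i"),
           ("VP", ("word", "saw"), ("word", "a snake"),
                  ("PP", ("word", "near"), ("name", "i"))))

# SS: Near John_i, he_i saw a snake t -- no c-command of the name any more.
ss = ("S", ("PP", ("word", "near"), ("name", "i")),
           ("S", ("pron", "i"),
                 ("VP", ("word", "saw"), ("word", "a snake"), ("word", "t"))))

# The negative condition: a violation at ANY computed level stars the sentence.
adult_levels = [ds, ss]   # full derivation: DS is computed, so the star is assigned
child_levels = [ss, ss]   # shallow derivation: the deepest level is already fronted

assert any(condition_c_violation(t) for t in adult_levels)      # adult: *coreference
assert not any(condition_c_violation(t) for t in child_levels)  # child: coreference allowed
```

The asymmetry between the two derivations falls out of nothing but the missing level: the violation-checking code is identical for adult and child.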
In general, both this analysis and the above analysis of control suggest that
the grammatical functions associated with a level will be abrogated, if that level
itself is not computed, or is only partially computed, by the child. In the case of
control, the abrogated function is the positive indexing rule, where the sentential
element is indexed with its object controller in its DS position. Since the child
does not analyze the dislocated clause as ever being in that position in the course
of the derivation, the grammatical function associated with that position, the
indexing to the Direct Object controller, is abrogated, since its structural
condition is not met. Hence the default rule of null topic control applies instead.
But there is no change in the control rule (or principle) itself; it is simply that
one of the set of structures feeding it has not been supplied.
Similarly here, with respect to the negative (contra-)indexing rule of
Condition C: its structural condition is not met, so it does not apply, given the
shallow analysis.
(93) DS → DS′ → SS (deepest computed level: DS′)
     Condition C: structural condition not met
     Control (object control): structural condition not met
5.6.2 The Application of Indexing
While Carden correctly notes the advantage of the movement account for the
above data, there is no sense in which the abrogation of Condition C under
movement would logically follow in his account. Under the sort of analysis
suggested in this chapter, however, it is not simply the case that the data is cut
naturally, but that the particular sort of failing by the child, the failure of
Condition C only under conditions of movement, would be predicted. This is
because the failure is not a failure of Condition C at all (which applies correctly
at all the relevant levels), but rather of shallowness of analysis. Since Condition
C, and all the binding conditions, apply directly on structures, and since the child
is not computing a full derivation from DS to SS, but rather a shallow derivation
from DS′ to SS, the relevant level at which the pronoun c-commands the name is
not available to the child. Rather, at the deepest level of analysis computed by the
child, the preposed element (generally a PP) is already in fronted position,
binding a trace in argument position. This means, however, that the name is not
c-commanded by the pronoun at any level of analysis. So Condition C, while
operative in the child's grammar, finds no level at which its structural condition
(i.e. a name c-commanded by another name or a pronoun) is satisfied.
So none of the relevant structures are marked *; the grammar overgenerates.
The analysis above therefore supports the following set of propositions:
a) that there is a D-structure (i.e. that the grammar is in the derivational, rather
than the representational, mode),
b) that binding principles, in particular Condition C and Control, apply
throughout the derivation,
c) that the binding principles apply to a structurally defined c-command
relation, and that therefore
d) if DS is not computed, or is only partially computed, in a particular analysis
in acquisition, then the positive or negative principles associated with it will
be abrogated.
Let us assume the following metatheoretical condition on the indexing rules.
(94) Metatheoretical Condition on Indexing
     a. If a positive condition applies, it must be satisfied somewhere
        in the course of a derivation.
     b. If a negative condition applies, it must be satisfied nowhere in
        the course of a derivation.
For the present, we may assume that the conditions in (94) apply particularly to
the Binding Theory, and in general to binding, i.e. to the marking of co- and
disjoint reference, though there may be other areas in which it would be applicable
as well. A positive condition would be the marking of coreference; a negative
condition would be the marking of disjoint reference. Further, I will assume,
following the discussion in Chapter 3, that the Binding Theory applies so that
c-command is defined directly, rather than derivatively, or by some further level
of reconstruction.
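The somewhere/nowhere asymmetry in (94) is easy to state procedurally. The following minimal Python sketch is my own illustration, not the author's formalism: a positive condition is an existential quantification over the computed levels of a derivation, while a negative condition is a universal prohibition.

```python
# Minimal sketch (my own) of the quantificational asymmetry in (94).

def positive_ok(levels, met_at):
    # (94a): a positive condition must be satisfied SOMEWHERE in the derivation
    return any(met_at(level) for level in levels)

def negative_ok(levels, violated_at):
    # (94b): a negative condition must be satisfied NOWHERE in the derivation
    return not any(violated_at(level) for level in levels)

# A raising-style derivation: the anaphor is unbound at DS but bound at SS,
# and the Condition C configuration never arises at any level.
levels = [{"A_met": False, "C_violated": False},   # DS
          {"A_met": True,  "C_violated": False}]   # SS

assert positive_ok(levels, lambda level: level["A_met"])
assert negative_ok(levels, lambda level: level["C_violated"])
```

The design point is that the two conditions differ only in the quantifier over levels, which is exactly the asymmetry (94) states.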
Consider how (94a) would work for positive conditions of coreference, for
example, for anaphoric binding. The anaphoric element would enter the deriva-
tion with no index. Throughout the derivation the element could be indexed with
an antecedent, if the locality conditions of the binding theory were met with
respect to that antecedent. Finally, at LF (or LF′), all elements are checked for
an index: structures having unindexed elements are thrown out. This means that
if the element satisfies the binding conditions at any point, the derivation will be
sound, if the binding condition is a positive one: i.e. one which requires or
involves the assignment of an index. This sort of theory is close in spirit to the
Assign-γ mechanism of Lasnik and Saito (1984).
With respect to alternative notions of the Binding Theory, the view proposed
in (94) must be defended in two ways. On the one hand, one might
propose that the Binding Theory, or binding, applies at some particular level: for
example, NP-structure or LF. On the other hand, one might view the binding
theory as applying at Reconstruction Structure, where reconstruction structure
is not a level in the traditional sense, but something like the union of information
from the previous levels (see, e.g., Williams 1987, for such a conception). The
latter view is of course rather difficult to distinguish from the position given
above, since in both cases a union of information is being taken, but it is possible
to distinguish between them.
A complete discussion of the Principle in (94) goes beyond the scope of this
book; see Lebeaux (1991) for further discussion (see also Barss 1985, 1986, for
a relatively thorough discussion of reconstruction: I was able to see that work
only after the following was written). In the following several paragraphs, I will
try to briefly indicate some effects of the relevant positions, particularly with
respect to reconstruction.
Let us first consider the possibility of Binding Theory applying within the
derivation. In cases of NP movement, binding must at least be allowed after the
movement has occurred, to account for examples like (95).

(95) The boys seemed to each other t to be very nice.

Examples like (95) also show that the positive condition, Condition A, need not
be satisfied everywhere, since it is not satisfied at DS. Condition A must also be
allowed to apply post wh-movement, to account for the pit stop property of
Steve Weisler (p.c.): namely, that a moved wh-phrase may contain a reflexive
bound to any of the NPs in the intervening clauses (possible antecedents
italicized).

(96) a. Which pictures of himself did _John_ say Bill liked e?
     b. Which pictures of himself did John say _Bill_ liked e?

(97) John wondered which pictures of himself Bill liked.

In (96), the reflexive appears to be bound to John from its position in the
intervening Comp. This, together with simpler data like that in (97), suggests that
anaphoric binding must at least apply after wh-movement.
It might be suggested, then, that S-structure is the place. Note that even here
some sort of union of indexing must be involved (either throughout the
derivation or by an equivalence class in chains), since both of the indexings in
(98) are possible (coreferent items italicized).

(98) a. _John_ wondered which pictures of himself Bill liked e.
     b. John wondered which pictures of himself _Bill_ liked e.
There is some interesting evidence, however, from T. Daniel Seeley (Seeley
1989) which suggests that the Binding Theory must also apply at LF (the
interpretation here is mine, not Seeley's). Consider the behavior of stressed
reflexives in a discourse (Seeley 1989).
(99)  A: Does John like MARY?
      B: No, John likes HIMSELF.

(100) A: Do the boys believe John to like STEVE?
      B: ?No, the boys believe John to like EACH OTHER.

(101) A: Do the boys expect Steve to believe John to have seen BILL?
      B: ?*No, the boys expect Steve to believe John to have seen EACH OTHER.

(102) A: Did Bill leave because Sue saw MARY?
      B: *No, Bill left because Sue saw HIMSELF.
The judgements in (99)–(102) are by no means crystal-clear, but I believe that
they capture the intuitions of a number of native speakers. The judgement on
(101) is the most variable; many speakers, including myself, find it ungram-
matical, but some find it acceptable. If the judgements are correct, two facts
emerge: stressed reflexives may escape their immediate clause, but not further,
and a stressed reflexive in an adjunct may not take an element in the main clause
as an antecedent (102). These facts may be captured with a simple analysis. The
stressed reflexive, like other focussed elements, is fronted to an S-adjoined
position at LF (Chomsky 1977b). This means that it escapes from its binding
category at LF, and may take a higher-clause element as antecedent.
(103) LF of (100b):
      No, the boys_i believe (each other_i (John to like e_i))
Consequently, (100) would be grammatical because Condition A would be satisfied
at LF. (99) would be grammatical because Condition A would be satisfied at
S-structure (or earlier). However, neither (101) nor (102) would be grammatical
because the output after focus fronting would not satisfy the binding conditions.
(104) LF of (101):
      ?*No, the boys_i expect Steve to believe (each other_i (John to have
      seen e_i)).

(105) LF of (102):
      *Bill_i left because (himself_i (Sue saw e_i)).
The binding in (104) would violate the binding conditions, as would the binding
in (105).
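The clause-escape logic behind (103)–(105) can be put in toy arithmetic form. The sketch below is my own model, not Seeley's or the author's formalism, and it deliberately ignores the adjunct case (102), which requires the adjunct island in addition to clause-counting.

```python
# Toy sketch (my own): Condition A is a positive condition, so it may be
# satisfied at SS *or* at LF. At LF the stressed reflexive adjoins one
# clause up, shrinking its distance from the antecedent by one clause.
# The adjunct case (102) is NOT modelled here.

def condition_a_met(embedding_depth):
    # depth 0: the reflexive is in the same binding domain as its antecedent
    return embedding_depth == 0

def grammatical(depth_at_ss):
    depth_at_lf = max(depth_at_ss - 1, 0)   # LF focus-fronting escapes one clause
    return condition_a_met(depth_at_ss) or condition_a_met(depth_at_lf)

# (99)  John likes HIMSELF:                        0 clauses of embedding
# (100) the boys believe John to like EACH OTHER:  1 clause
# (101) ...expect Steve to believe John to have
#       seen EACH OTHER:                           2 clauses
assert grammatical(0)       # satisfied at S-structure already
assert grammatical(1)       # rescued by the LF fronting
assert not grammatical(2)   # no computed level satisfies Condition A
```

The one-clause escape hatch is the whole analysis: a positive condition needs only a single level at which it is met, and LF fronting supplies exactly one extra chance.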
Seeley's data from stressed reflexives, as well as the even more clear-cut
evidence from wh-movement, suggest that positive binding conditions cannot be
stated at a single level, without reference either to other levels (as in the cumula-
tive-derivational approach suggested above, where the actual indexing is done
throughout) or to a reconstruction-type approach, which defines a derived notion
of c-command or equivalence classes of chains. Certain facts, in particular those
which have come under the rubric of chain-binding (Barss 1985, 1986), seem to
be problematic for the cumulative-derivational approach.
(106) Chain-binding
a. Those pictures of himself are the ones that I think that John
really likes.
b. Those pictures of each other are the kinds of things that Bill
thought that those men really liked.
Here, the reflexive seems to require reference to the embedded trace, though
presumably no movement has occurred. Of course, the entire binding theory
cannot be stated over an equivalence class of chains (if one wished to define a
chain having the left-most member of the copular sentence in (106) as its head,
the lexical relative clause head as a middle member, and the trace as the tail),
since Condition C does not apply obligatorily.
(107) Those pictures of John are the ones that he really enjoys.
One empirical oddity about the structures in (106) seems to me to be the
following. These constructions, unlike other long-distance binding into nominals
in standard sentences, require the nominal bound into to have an implicit
possessor identical to the anaphor. My judgements are the following:
(108) a. Those pictures of each other are the ones that the boys really
like.
(must be their pictures of each other)
b. The boys really like those pictures of each other.
(need not be their pictures of each other)
(109) a. Those stories about each other are the ones that the boys
believe to be true.
(must be their stories about each other)
b. The boys believe those stories about each other to be true.
(need not be their stories about each other)
Oddly, this implicit possessor reading does not seem to be required of the
corresponding pseudo-cleft type.
(110) What the boys like are stories about each other.
If these judgements are correct, then the theoretical problem associated with the
copular sentences in (109) dissolves, though not the one for (110).
More generally, it would seem to me preferable to retain direct notions of
c-command and chains, and revise the relevant notion of phrase structure, than
to do the reverse. See Lebeaux (1991, 1998) for more discussion.
5.6.3 Distinguishing Accounts
In this chapter, I have been following a particular type of proposal in order to
account for certain facts in acquisition: namely, that indexing applies throughout
the derivation, that it applies directly in terms of structurally defined c-command,
and that certain differences between the child's grammar and the adult's may
then follow from the fact that the child's analysis is, in certain respects, shallow.
The grammatical functions associated with the partially missing or partially
computed levels would therefore be abrogated. In the last section, I suggested
that this view would follow from a general metatheoretical condition on index-
ing, namely that positive indexing applies throughout the derivation (i.e. positive
conditions must be satisfied somewhere), while negative conditions may never be
satisfied. While the issues are complex, I would like to indicate briefly in this
section the differences between this account and those which fall under the
rubric of Reconstruction. Two broad types of reconstruction accounts may be
distinguished: those which involve actual reconstruction of the moved element,
and those which define c-command relations or equivalence classes of elements
in chains in terms of the dislocated structure. The latter type of account I will
call quasi-Reconstruction.
In spite of the similarities between the cumulative notion of indexing above
and the reconstruction approaches in general (both allow for the union of
certain types of information), differences would be expected between them. In
particular: (1) to the extent to which elements are added in the course of the
derivation (Chapter 3), various conditions may be taken not to apply to the added
element, in the cumulative-derivational view, while this result may only be gotten
with difficulty using (quasi-)reconstruction; (2) the cumulative-derivational view
would allow for an ordering of operations within the grammar, and hence for
bleeding or blocking possibilities, a result which would, again, only be possible
with difficulty using quasi-reconstruction. To the extent to which (1) and (2)
hold, the cumulative-derivational view is supported (recall that the cumulative-
derivational view holds that binding possibilities, e.g. positive indexing as in
Condition A, apply cumulatively throughout the derivation, adding indexings).
I will examine (1) and (2) here only with respect to the acquisition evidence
above, in particular with respect to the abrogation of DS functions; see Lebeaux
(in preparation) for a more complete syntactic discussion.
Let us consider an instance of the two accounts. The positive conditions
everywhere/negative conditions nowhere account would have the following form:

(111) DS → SS → LF
      Binding operations apply throughout (e.g. Condition A);
      indices checked (at LF)

(112) DS → SS → LF
      Negative condition may not be met (anywhere)
And assume a Reconstruction account of the following form, where Reconstruction
structure, either as a set of structures or a set of defined relations, holds at LF.

(113) DS → SS → LF ~ Reconstruction Structure
Now consider the acquisition data that we have been examining, as well as the
earlier syntactic data, to see which is preferable. The cumulative-derivational
account above can account for the lack of Condition C effects for dislocated
constituents (Carden's data), by assuming that DS is not computed, insofar as the
dislocation is concerned. The Reconstruction-type account apparently can account
for the (lack of) Condition C effects as well. Suppose that a parallel
reconstruction-type account has the following form:

(114) Condition C is stated over R-structure, a set of structures derived
      from LF by reconstruction.
(115) In the child's grammar, no separate level of R-structure exists at a
      particular stage, because the derivation is shallow in the LF direction.

Then the lack of Condition C effects for the Carden-type sentences above is
explained:

(116) In _Mickey's_ wallet, _he_ put a penny e. (OK for child: coreferent items
      italicized)
Sentence (116) would not have an R-structure corresponding to the structure in
which In Mickey's wallet has been put back into place (however exactly this
is done). Hence it would invoke no Condition C violation by the child: the actual
result. This is shown in (117).

(117) DS → SS → LF (actual computed structure)
      [LF → LF′ ~ R-structure: not computed]
However, we earlier noted syntactic considerations which militated against
Condition C being stated over R-structure: the anti-Reconstruction effects of van
Riemsdijk and Williams (1981) (see Chapter 3). Given the existence of these
effects, and the argument/adjunct distinction in the presence of a Condition C
violation for dislocated constituents, syntactic considerations alone advance a
Positive Conditions Everywhere/Negative Conditions Nowhere type approach.
We are left with the following.
(118) Type of Binding into Dislocated Constituent
      (cumulative approach vs. R-structure)

                             Condition C
      acquisition evidence   indeterminate
      syntactic evidence     cumulative approach
                             (Positive Conditions Everywhere/
                              Negative Conditions Nowhere)
This summarizes the set of data considered in Chapter 3: the Condition C
binding. What about the data considered in the last section, that bearing on
control? Here, the syntactic evidence is indeterminate about the type of approach
which is supported, but the acquisition evidence is not, supporting a cumulative-
derivational approach, though weakly, over one involving a level of R-structure.
The full chart, then, for the two instances discussed will be the following.
(119) Type of Binding into Dislocated Constituent
      (cumulative-derivational approach vs. R-structure)

                             Condition C          Control
      acquisition evidence   indeterminate        cumulative approach
      syntactic evidence     cumulative approach  indeterminate

      cumulative approach = Positive Conditions Everywhere/Negative
      Conditions Nowhere
Consider why the acquisition evidence does support the cumulative-derivational
(direct) approach, given the general analysis of Non-Obligatory Control given in
the previous section. The crucial data were the Tavakolian-type sentences given
in (120).
(120) a. PRO to kiss the duck would make the lion happy.
b. PRO to leave the room would make the skunk happy.
As noted there, children, but not adults, allow extra-sentential reference in such
constructions. I suggested that this was due to the interaction of the following
three factors:
(121) a. Non Obligatory Control clausal subjects originate in internal-to-
VP position
b. Control requires direct c-command; when this applies at DS,
the other internal argument is the controller
c. The sentential clause is fronted; if the PRO is unindexed, it is
operator-bound, and gets its index from the operator.
In this theory, it is the presence of the control clause in internal position, and the
application of control there, which bleeds the later possibility of operator-
binding and external reference. As noted earlier, if the analysis is shallow, i.e. if
the deepest computed level is DS′, not DS, and at DS′ the control clause is
already in fronted position, then control by the internal-to-VP element will not
apply. So operator binding will take place, and extrasentential reference will
occur. This accounts for the Tavakolian results.
(122) Adult analysis:
      a. DS
         e would make the lion happy (PRO to kiss the duck).
      b. DS after Control
         e would make the lion_i happy (PRO_i to kiss the duck).
      c. SS
         (PRO_i to kiss the duck)_j would make the lion_i happy e_j.

(123) Child's analysis:
      a. DS′
         (PRO to kiss the duck)_j would make the lion happy e_j.
      b. SS
         O_i (PRO_i to kiss the duck)_j would make the lion happy e_j.
It is the shallowness of the derivation which allows the operator to be inserted.
(Note that in the child's analysis, an empty category is present in the second
object position at all levels of representation, as required by the Projection
Principle, etc.; it is just not the trace of a real movement operation.)
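The bleeding relationship between object control and default operator binding, as it plays out in (122)–(123), can be sketched as follows. This is my own toy rendering of the ordering, not the author's formal rule:

```python
# Toy sketch (my own): object control is tried at the deepest computed
# level first; if the control clause is never in its VP-internal (DS)
# position, PRO stays unindexed and the default operator (topic) binding
# applies instead -- extrasentential reference.

def interpret_pro(levels):
    """levels: derivation levels, deepest first; each records whether the
    control clause is in its VP-internal position at that level."""
    for level in levels:
        if level["clause_vp_internal"]:
            return "object control"           # indexing rule applies here,
                                              # bleeding operator binding
    return "operator (topic) binding"         # default: extrasentential reference

adult_levels = [{"clause_vp_internal": True},    # DS: clause inside VP
                {"clause_vp_internal": False}]   # SS: clause fronted
child_levels = [{"clause_vp_internal": False}]   # shallow DS': already fronted

assert interpret_pro(adult_levels) == "object control"
assert interpret_pro(child_levels) == "operator (topic) binding"
```

Note that the control rule itself is unchanged between the two grammars; only the set of levels fed to it differs, which is the claim of the text.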
(124) DS → DS′ → SS → LF
      normal control rule (at DS); default topic insertion and
      interpretation (at LF)
      shallow analysis: DS′ → SS → LF
Suppose that we tried to do it with a Reconstruction-type account.

(125) Adult Grammar:
      DS → SS → LF ~ R-structure
      (operator (topic) interpretation or insertion)

(126) Child Grammar:
      DS → SS → LF [LF′ ~ R-structure: not computed]
      (operator (topic) interpretation or insertion)
The corresponding control rule, using reconstruction, would be the following.
The dislocated constituent is placed back into its pre-dislocation position (or
read as if placed back) at R-structure. This operation is optional. If it applies,
control by the object NP is possible; otherwise a peripheral topic operator is
inserted. The key syntactic fact would be the ordering of the reconstruction
operation and the topic interpretation.
Consider now what would happen in acquisition. As noted above for
Condition C effects, the parallel way of accomplishing this end would be by
supposing that R-structure was not relevant for the child's grammar, the grammar
being shallow in that direction. But this situation is not as symmetrical as it
seems. The control rule and the operator insertion rule are on opposite sides of
the grammar for the cumulative-derivational formulation of control, but not for
the reconstruction-type approach. A shallow analysis (lacking in some of the
operations of D-structure) for the first case would eliminate object control, but
would retain operator insertion, which is occurring on the other side of the
grammar. In the second case, however, the shallowness of analysis at LF would
eliminate (abrogate) both the reconstruction rule and the default rule of operator
insertion or interpretation. The result would be a structure which would be ill-
defined: not the actual result.
5.7 Case Study III: Wh-Questions and Strong Crossover
In the previous two cases under investigation, I have dealt with two areas in
which it appears that the child's grammar is shallow: the analysis of control,
and the analysis of constructions which should apparently be ruled out by
Condition C. In each case, it was found that the child's grammar diverged from
the adult's. However, this was taken not as evidence that the condition itself was
different in the two grammars (Non-Obligatory Control, Condition C), nor that
there was a difference in category type (i.e. that the initial PRO was pro, or
some neutralized null category), but rather that the child's analysis was shallow:
anchored in S-structure, and extended only part of the way back toward DS, to
DS′. As such, the constraints and operations which would have applied to the DS
representation did not. As a consequence: (i) in the case of Control, since the
c-command condition for control was not met, a default operation of operator
insertion applied, and extrasentential reference was gotten; (ii) in the case of
Condition C, with dislocated constituents at the deepest level of analysis,
Condition C did not apply because its structural condition was not met (the name
did not have a c-commanding coreferent name or pronoun).
In the next sections, I would like to discuss a third area in which the child
adopts a shallow analysis: the analysis of wh-questions. The data here are drawn
from an important paper: Roeper, Akiyama, Mallis, and Rooth (1986), "Cross-
over and Binding in Children's Grammars". The data are quite complex, and the
paper itself is not as well known as it should be. Accordingly, I will discuss here
first the analysis of wh-questions which I will adopt, then summarize the paper,
and then present an analysis of how the acquisition data can be accounted for
within the general levels-of-representation conception that this work argues for.
5.7.1 Wh-questions: Barriers framework
While I am generally assuming the extension of GB found in Barriers (Chomsky
1986), the analysis of wh-questions in acquisition is more specifically tied to
elements of the analysis of Barriers than other aspects of this thesis. (Indeed, it
strongly supports particular aspects of that analysis, and would not be workable
without it.) A quick review of the relevant points is therefore in order.
With respect to X′-theory, the domain of elements falling outside of S (=
I″ = IP) is more articulated than in earlier versions (Chomsky 1981). In particular, S
is the maximal projection of Infl, and, crucially, wh-movement is not into Comp, but
into the Spec C′ position. That is, the full structure of the clause is as follows.
(127) [C″ Spec [C′ Comp [S(=IP) NP [I′ Infl [VP V NP]]]]]
The movement of a fronted NP is no longer into Comp (or an adjunction in
Comp), but rather into the specifier position of C′ (it will become clearer later
why I am reviewing this).
A second innovation of the Barriers-type approach is that the movement
operation is a substitution operation (into Spec), rather than an adjunction. A
general consequence of that is that Infl, including do, may now move into the
head position of Comp. The entire clause then becomes a projection of Infl. The
movement of Infl into Comp, and the overlay of Comp by Infl, is shown in (128).
(128) [I″ Spec [I′ Infl [S(=IP) NP [I′ Infl [VP V NP]]]]]
A third innovation has to do with the locality of the movement. Chomsky (1986),
following the analysis of anaphor movement in Lebeaux (1983), assumes that
movement is highly local; Chomsky (1986b) assumes likewise for wh-movement,
involving adjunction to intermediate nodes, including VP. While I accept this
part of the analysis, it will not be crucial in what follows.
We note immediately one consequence of the Barriers analysis, which has
already been commented on in the foregoing (Chapter 1). Assuming that
categorial selection is selection of the head (Belletti and Rizzi 1988), a wh-clause
must be selected in terms of its Comp feature, not the element in Spec C′. This
means, in turn, that Spec C′ must agree with the +/− wh-feature in Comp, so that
the selectional process in the grammar knows that a wh-element has been
moved into Spec, and can differentiate (129a) from (129b). I assume, simply,
that there is, still, such a +/− wh-feature in Comp, and that it agrees with Spec C′.

(129) a. I wonder ((who) ((+wh) (John saw)))
      b. *I wonder ((e) ((+wh) (John saw who)))

The reason for the ungrammaticality of (129b) is that the +wh-feature is selected
by the verb, and this is unsatisfied at S-structure. Such satisfaction is required at
S-structure, for English.
Before proceeding, let us note three facts which support Chomsky's
analysis. First, the analysis of Infl movement as overlaying Comp is supported
by selectional facts. Assuming that complement-taking verbs (believe, wonder)
select for a particular type of Comp (+wh, −wh, respectively), and assuming that
this selection must be satisfied at all levels of representation, we have an
immediate explanation, given Chomsky's analysis, of why Subject/Aux inversion
is impossible in subordinate clauses. If Aux moved into Comp, the clause itself
would be a projection of Aux (or Infl): i.e., it would be I″. This would mean,
however, that the selected clause was C″ at DS, but I″ at SS: an impossibility,
given the assumptions above. Hence no selected clause may have Subject/Aux
inversion, the correct result.
A second fact supporting Chomsky's analysis is conceptual, but I believe
powerful. In traditional Extended Standard Theory and GB analyses (e.g.
Chomsky and Lasnik 1977, Chomsky 1981), Comp is a sort of "garbage"
category. It contains mostly closed class elements (that, if, etc.), but also those
of radically different character, open class NPs (whose hat, etc.). This made
Comp very difficult to treat as a unified element, and very difficult to probe the
properties of. Given the current theory, Comp again makes sense categorially: it
is a position in which a particular set of closed morphemes may appear.
A third fact has to do with wh-island effects. It has sometimes been noted,
though usually just in passing, that there is a considerable contrast in
grammaticality between examples (130a) and (130b).

(130) a. *Who do you wonder which books John gave e to e?
      b. Who do you wonder if John gave books to e?

In the example in (130a), the dependencies are nested, so a crossing constraint
cannot be the cause of the ungrammaticality.
While other candidates for explanation of the difference are available in
pre-Barriers type frameworks, Barriers does provide a ready explanation for the
difference. While a +wh Comp does exist in the complement clause in both
cases, in the former case, but not the latter, the Spec C position is filled (with
which books). This suggests that the long distance extraction in (130b) is through
that position, and the ungrammaticality of (130a) should be traced precisely to
the fact that that position is unavailable. This proposal might be instantiated in
a number of ways, which I will not try to go into here. This sort of explanation
is not nearly as available if one assumes that extraction is through Comp, since
the complementizer if is an obligatory element (though variants may be tried,
using particular indexing algorithms).
5.7.2 Strong Crossover
Before turning to the acquisition evidence, let us deal with another aspect of wh-
questions: namely, the existence of (Strong) Crossover effects. Such effects, first
noted by Postal (1974), essentially forbid the crossing of a moved wh-item
over a coreferent pronoun or name, in configurations in which the pronoun or
name c-commands the trace of the wh-element. Contrasts such as (131) were
noted by Postal.
(131) a. *Whoᵢ did heᵢ say that John liked eᵢ?
      b. Whoᵢ did the man that saw himᵢ say that John liked eᵢ?
In (131b), who may be construed as coreferent with the pronoun, but not in
(131a). Similarly, and yet more clearly, there is a difference in the possibility of a
bound reading in (132) depending on whether a crossover or a noncrossover
configuration underlies it.
(132) a. Whoᵢ eᵢ ate hisᵢ hat?
      b. Whoᵢ did heᵢ say eᵢ ate hisᵢ hat?
(132a) easily allows a bound reading. However, (132b) does not allow such a
reading, even though it would be structurally identical to (132a) (with the
addition of an element) if one simply looked at the indexing and ignored the
lexical/nonlexical distinction.
(133) a. Whoₓ (x ate x's hat)?
      b. Whoₓ did (x say x ate x's hat)? (not allowed as reading)

The contrast between (132a) and (132b), schematized in (133), is then strong
evidence for the role of Strong Crossover in the adult grammar.
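The c-command configuration that defines Strong Crossover can be made concrete in a short sketch. The following Python fragment is my own illustration, not anything from the text: the tree encodings for (131a) and (131b) are simplified, and the category labels are assumptions. It checks whether a coindexed pronoun c-commands the position of the wh-trace.

```python
# Toy sketch (illustrative, not from the text): trees are nested
# lists ["LABEL", child, ...]; leaves are dicts built by leaf().

def leaf(word, cat, index=None):
    return {"word": word, "cat": cat, "index": index}

def leaves(tree):
    """All leaf dicts dominated by this (sub)tree."""
    if isinstance(tree, dict):
        return [tree]
    out = []
    for child in tree[1:]:
        out.extend(leaves(child))
    return out

def path_to(tree, target):
    """List of subtrees from the root down to the leaf `target`."""
    if tree is target:
        return [tree]
    if isinstance(tree, dict):
        return None
    for child in tree[1:]:
        sub = path_to(child, target)
        if sub is not None:
            return [tree] + sub
    return None

def c_commands(tree, a, b):
    """a c-commands b iff the lowest branching node properly
    dominating a also dominates b (and a and b are distinct)."""
    if a is b:
        return False
    for node in reversed(path_to(tree, a)[:-1]):
        if len(node) - 1 > 1:                    # branching node
            return any(x is b for x in leaves(node))
    return False

def condition_c_violation(tree):
    """True if a pronoun c-commands a coindexed trace, treating
    the wh-trace as a name (one of the accounts discussed below)."""
    ls = leaves(tree)
    return any(p["cat"] == "pron" and t["cat"] == "trace"
               and p["index"] == t["index"] and c_commands(tree, p, t)
               for p in ls for t in ls)

# (131a) *Who_i did he_i say that John liked e_i?
he, e1 = leaf("he", "pron", 1), leaf("e", "trace", 1)
tree_131a = ["CP", leaf("who", "wh", 1),
             ["C'", leaf("did", "aux"),
              ["IP", he,
               ["VP", leaf("say", "v"),
                ["CP", leaf("that", "comp"),
                 ["IP", leaf("John", "name"),
                  ["VP", leaf("liked", "v"), e1]]]]]]]

# (131b) Who_i did the man that saw him_i say that John liked e_i?
him, e2 = leaf("him", "pron", 1), leaf("e", "trace", 1)
tree_131b = ["CP", leaf("who", "wh", 1),
             ["C'", leaf("did", "aux"),
              ["IP",
               ["NP", leaf("the man", "n"),
                ["CP", leaf("that", "comp"),
                 ["VP", leaf("saw", "v"), him]]],
               ["VP", leaf("say", "v"),
                ["CP", leaf("that", "comp"),
                 ["IP", leaf("John", "name"),
                  ["VP", leaf("liked", "v"), e2]]]]]]]

print(condition_c_violation(tree_131a))   # True: the crossover case
print(condition_c_violation(tree_131b))   # False: coreference possible
```

On this encoding, (131a) comes out as a violation and (131b) does not, because in (131b) the pronoun is buried inside the relative clause, so its lowest branching ancestor does not dominate the trace.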
Strikingly, and remarkably, such a contrast does not exist in the child
grammar, except for some peripheral structures (Roeper, Akiyama, Mallis and
Rooth 1986). It is to this acquisition evidence, and the theoretical consequence
of that evidence, that I will return below.
While the Strong Crossover fact is quite uncontroversial, the theoretical
explanation of the fact is a good deal less so. At least three proposals exist in the
literature. Postal (1974) suggests that the condition is actually a condition on a
transformation, i.e., that the crossing of a wh-element over a c-commanding
name or pronoun is disallowed. Chomsky (class lectures 1985) has at least
contemplated a similar idea. A second alternative, perhaps the most widely
accepted, is that the trace left by wh-movement is a name (Chomsky 1981), and
that Condition C bars the relevant configuration because a name would be
c-commanded by a coreferent element in an A-position, at S-structure.
(134) *Whoᵢ did heᵢ visit eᵢ?
      (the pronoun heᵢ c-commands the trace eᵢ, which counts as a name)
A third possibility has been suggested by van Riemsdijk and Williams (1981).
This is that the condition is neither on a movement operation, nor on the trace as
a name, but rather on the pre-movement structure. If the wh-element itself is
considered a name, and one adopts the position advocated earlier in this work
(that positive conditions must be satisfied everywhere, and negative conditions
nowhere violated), then Condition C will rule out the pre-movement
structures at DS.
(135) a. *Heᵢ didn't know whoᵢ? (DS)
      b. *Whoᵢ didn't heᵢ know eᵢ? (SS; retains * from DS)

(135a) is ruled out at DS, and the full derivation in (135) retains the
ungrammaticality of (135a).
The van Riemsdijk and Williams proposal has certain attractive features,
though they are hardly decisive. First, it allows the rather natural proposal of
Joseph Aoun (that wh-trace, as a necessarily locally dependent element, is an
anaphor) to be straightforwardly instantiated. Given that it is the wh-element
itself which is the name, and that Condition C is stated in terms of that, the wh-
trace is freed to be an anaphor. Second, the van Riemsdijk and Williams
proposal does not require recourse to layered traces. As van Riemsdijk and
Williams note, in an important way, the strong crossover effect holds not only
over the whole moved phrasal node, but over all the material that it dominates.
(136) *Whoseᵢ hat did heᵢ eat e?
Whose in (136) cannot be coreferent with he. This constraint cannot be stated
over the maximal null phrasal category, but must be stated in terms of layered
traces. While layered traces would, under certain renditions of movement, even
be expected, they have certain characteristics, aside from complexity, which are
somewhat unattractive. It must not only be the case that the trace is layered, but
also that each individual subnode is individually co-indexed with its antecedent
in the moved item. Further, the government relation must be defined between a
set of elements, all null. More problematic, from the point of view of this thesis,
is that syntactic elements which correspond to a phonologically null segment of
the string are no longer closed class (i.e. necessarily finite in character), but open
class. This is because a layered trace may contain arbitrarily complex syntactic
material.
In the following, I will argue that acquisition evidence supports the van
Riemsdijk/Williams proposal (or possibly Postal's original proposal) over the
alternative, that variables act as names. This supports, or allows support to
develop for, Joseph Aoun's proposal: that wh-traces are anaphors. It also further
supports, and allows articulation for, the proposal with which this chapter is
concerned: that the derivation is real, and may be construed, by the child under
certain circumstances, as shallow.
5.7.3 Acquisition Evidence
The basic finding of the Roeper et al. experiments is that Strong Crossover does
not exist for children, for a majority of constructions. (The exceptions, those
constructions in which Strong Crossover does exist for the child, also play a
crucial role in the following analysis.) Why should this be?
One possibility is that the constraint itself is not available at an early stage,
and pops into the grammar at some later stage. Recall, however, that a similar
solution was found wanting for the lack of Condition C effects in constructions
like the following:
(137) In John'sᵢ room, heᵢ put a book. (OK for kids, * for adults)
As noted earlier in this chapter (see also Carden 1986a, 1986b), it is not the case
that Condition C has disappeared at the point at which constructions like (137)
are wrongly accepted. Rather, Condition C is present in its simple form, but just
doesn't seem to be operating in dislocated structures. That pattern of judgements,
it was argued, was not diagnostic of the lack of Condition C, as a condition, at
all, but rather due to the fact that its structural condition was not met, due to the
shallowness of analysis by the child. A similar explanation was found for the
Tavakolian data.
Let us exclude the possibility of a principle suddenly appearing as follows:

(138) Universal Application of Principles
      Any (universal) principle P in the adult grammar applies at all stages
      of development, if the vocabulary satisfying that principle is present.

By the parenthetical "universal" in (138), I do not mean to restrict the application
unnecessarily, but simply to allow for the fact that a parameterized principle
would not have to apply in its language-specific form at all stages of
development. The proviso in (138) would require for each principle P in UG, that i) it
either hold in the appropriate language-specific form at all stages of development
for a given language L, or ii) it be given a different parametric form than the
one in the language L, but still be part of the specification in UG, or iii) that the
vocabulary over which the principle holds not be present in the child's grammar.
This would then require that the binding principles apply as soon as the
vocabulary defining them (presumably the +referential features on nouns) was defined.
An explanation of the Jakubowicz (1984) and Wexler and Chien (1985, 1987a)
data would therefore have to be found which would be in accord with this
principle. The earlier case, where only theta theory applied in initial representations,
would not be a counterexample, since the vocabulary over which Case
assignment was defined would not be present.
With respect to the data at hand here, the question is whether a similar,
levels-of-representation type analysis can be found for the strong crossover data.
As noted above, children allow both (139a) and (b) as well-formed structures at
first approximation, according to Roeper et al.

(139) a. Whoᵢ eᵢ thinks heᵢ likes hisᵢ hat?
      b. Whoᵢ does heᵢ think eᵢ likes hisᵢ hat?
Indeed, the full set of data is quite complex and apparently somewhat confusing,
the result of a complex array of experiments performed by Roeper and his
colleagues. Considered in full, the theoretical problem becomes quite intricate. I
will present here the data from experiment #7, which is the most detailed set of
data which Roeper et al. provide, and characteristic of the whole.
                                                   Percent coreferent or bound
I.   Noncrossover configuration: single clause
     a. Who is V-ing himself?                                    100.0
     b. Who is V-ing him?                                         27.0
     c. Who is V-ing his N?                                       36.9
II.  Crossover configuration: single clause
     a. What is he V-ing?                                         15.9
     b. Who is he V-ing?                                          25.9
     c. Whose N is he V-ing?                                       3.6
III. Noncrossover configuration: 2 clauses
     a. Who thinks NP is V-ing him?                               40.5
     b. Who thinks he is V-ing NP?                                38.1
IV.  Crossover configuration: 2 clauses
     a. Who does he think NP is V-ing?                            29.8
     b. Who does he think is V-ing NP?                            19.0
     c. Who does he think he is V-ing?                            35.2

Figure 1. Percentage of Coreferent or Bound Responses
A word about the notation is in order. V stands for any of a number of the verbs
chosen, and NP for any of a number of NPs. An exemplar of "Who does he
think NP is V-ing?" would be "Who does he think Big Bird is pushing e?"
The data in I is of theoretical interest only as a basis of comparison. Note
that: i) children have 100% bound responses when the bound element is a reflexive
(Ia), ii) coreference or binding is allowed, at least marginally, for single clause
structures with a coreferent pronoun (Ib, 27%). This result is already familiar
from work by Jakubowicz (1984) and Wexler and Chien (1985, 1987a).
The first striking result comes in IIb. This is a classic crossover configuration,
and unlike IIa, has who as the questioned word, so coreference would be
possible without violating animacy requirements. For this question, it appears that
25.9% of the children allow coreference, a clear violation of the adult rule. More
striking, and yet also comforting for the accuracy of the original result, is
the fact that the result in IIb contrasts with that in IIc. Children do not allow
coreference between the fronted wh-element and the crossed over pronoun if that
wh-element is part of a containing wh-phrase (*Whoseᵢ hat did heᵢ like e? for
children, as well as adults). The fact that children obey the crossover constraint here
is crucial, since it shows that there is not simply a total breakdown in the grammar,
or that the allowance of strong crossover in the simple wh- cases is due to the
complexity of the task. Rather, in the slightly more complex case where whose
hat has been fronted, strong crossover is obeyed, even if it is not in the simpler
cases (only 3.6% coreference in IIc). This is then the second puzzle to account for,
along with the original puzzle of the lack of strong crossover in cases like IIb.
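The two crossover percentages driving this second puzzle can be read directly off the Figure 1 data. The encoding below is a minimal sketch of my own (the dictionary keys are my shorthand for the condition labels; the values are the percentages reported above):

```python
# Experiment #7 percentages (Figure 1): percent coreferent or bound.
# Keys are shorthand for the condition labels in the figure.
percent = {
    "Ia":  100.0,  # Who is V-ing himself?
    "Ib":   27.0,  # Who is V-ing him?
    "Ic":   36.9,  # Who is V-ing his N?
    "IIa":  15.9,  # What is he V-ing?
    "IIb":  25.9,  # Who is he V-ing?
    "IIc":   3.6,  # Whose N is he V-ing?
    "IIIa": 40.5,  # Who thinks NP is V-ing him?
    "IIIb": 38.1,  # Who thinks he is V-ing NP?
    "IVa":  29.8,  # Who does he think NP is V-ing?
    "IVb":  19.0,  # Who does he think is V-ing NP?
    "IVc":  35.2,  # Who does he think he is V-ing?
}

# The second puzzle: in one-clause crossover configurations,
# coreference survives with bare "who" (IIb) but collapses with
# the full wh-phrase "whose N" (IIc).
gap = percent["IIb"] - percent["IIc"]
print(round(gap, 1))   # 22.3
```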
Finally, there is a third result to be explained, which does not show up so
strongly in this set of data, but does in other data sets gathered by Roeper et al.
Roeper et al. note that the lack of a strong crossover condition is not uniform
across ages with respect to clauses. Rather, they note that there is a
developmental pattern of the following sort.
(140) Stage I: No Strong Crossover Condition
Stage II: Strong Crossover Condition for 1 clause sentences;
no Strong Crossover for 2 clause sentences.
Stage III: Strong Crossover Condition generally
This result, noted by Roeper et al., does not rise clearly out of the data in Figure
1, but perhaps its outlines can be seen by comparing the 25.9% strong crossover
violation in IIb with the 29.8% and 35.2% strong crossover violations in IVa and
IVc. See Roeper et al. for more extensive discussion of this result.
To summarize, there are three problems or puzzles which must be answered
by a legitimate acquisition account:

(i) How does one account for the fact that Strong Crossover does not seem to
be operating (or not nearly as strongly) in the child's grammar as in
the adult's, for examples like (i)?

    (i) Whoᵢ is heᵢ V-ing eᵢ?

(ii) How does one account for the fact that, at the same time that Strong
Crossover is not respected by the child in constructions like (i), it is
respected for (ii), where a full NP is fronted?

    (ii) Whoseᵢ hat is heᵢ V-ing e?
(iii) How does one account for a particular lag in acquisition? Namely, that
children first learn the strong crossover constraint in one clause
constructions, while at that stage still making their initial mistake of not having
strong crossover in two clause structures. Such a construction-sensitive difference
would not be expected in any simple parameter-setting account.
5.7.4 Two possibilities of explanation
Given the basic bifurcation of grammars into those in the representational mode
vs. those in the derivational mode, two major possibilities arise in the explanation
of puzzles (i)-(iii) above, if one excludes the possibility that strong crossover has
suddenly popped into the grammar at the relevant stage (this latter possibility is
in any case made unlikely by the data in (ii) above). On the one hand, one might
assume that there has been some change in the representation (say, the S-structure
representation). For example, there could be a change in the category type
of the element corresponding to the wh-trace in the adult grammar. This is the
position that Roeper et al. take: that the initial wh-trace is actually little pro. As
such, the Roeper et al. explanation is grounded, as it were, in the representational
mode. Though the grammar as a whole contains a derivation for Roeper et al.,
the particular acquisition explanation is not dependent on that, just as it is not in
Hyams (1985, 1986, 1987). On the other hand, we might suppose that the difference
over time for children is not representationally based, but derivationally
based: for example, that the derivation is shallower for children than adults, or
somehow different. This type of acquisition account underlies the analysis of the
data above: i.e. the analysis of the Tavakolian and Carden data. However, this
account may be deepened and made more subtle in a number of ways. For
example, it needn't be the case that if the explanation is derivationally based, the
representational system is exactly the same as it is in the adult grammar.
Rather, another possibility presents itself: that the child's derivation is different,
and because of this, as an effect, the representation is different as well; say, the
representation at S-structure. Under this view, while it may well be true that
some aspect of the representation has been changed over time (for example,
the null element has changed from pro to wh-trace), this is not the deepest
level of analysis. This is to be found instead in the derivation itself, and it is the
change in the derivation which has given rise to the definitional properties which
mean that the representation is read differently. In a sense, the parametric
change, while real, is not the cause of the acquisitional change, but an effect.
That is the position taken in the analysis below.
5.7.5 A Representational Account
Roeper et al. take a purely representational view. They suggest that the basic
difference between the adult grammar and that of the child is in the category
type of the null element corresponding to the wh-trace in the adult grammar.
They argue, in essence, that the null category corresponding to the wh-trace in
the adult grammar is not a wh-trace for the child at all, but rather an indexed
little pro. The representation of (i) above is then the following:
(141) Whoᵢ is he V-ing proᵢ?
      (e.g. Whoᵢ is he following proᵢ?)
Although I will argue against this view as the ultimate basis for the acquisition
facts, this sort of explanation has much to recommend it. First, and most
crucially, it allows for the explanation of the lack of Strong Crossover by
children. Assuming that the strong crossover effect is really dependent on the
fact that wh-traces, as names, cannot be c-commanded by coindexed pronouns,
if one changes the category type to an element which can be A-bound
(namely, to small pro), the lack of Strong Crossover is explained.
Second, the assumption that the initial trace is specifically little pro could
help to explain the one clause/two clause contrast noted above: namely, that the
strong crossover constraint seems to come into the grammar first for one clause
constructions, and only later for two clause constructions. This might be traced,
not to the application of Condition C to a name, but to the application of
Condition B to little pro. Since little pro would obey Condition B, to the extent
to which this condition is operative in the child's grammar at all (see Jakubowicz
1985; Wexler and Chien 1985, 1987a), it would be expected to disallow single
clause coindexed structures before it would disallow double clause structures.
This would give the appearance of the strong crossover constraint operating in
single clause structures.
In spite of the interest of the representational view above, there are
considerable difficulties, if this is taken as the ultimate level of explanation. These
seem to me to support a theory that is derivational in character, or at root
derivational, though the change in derivation may have (as always)
representational effects.
Perhaps the most significant difficulty with the approach outlined above has
to do with the difference in strong crossover effects for children depending on
whether a full noun phrase has been fronted (whose hat), or a simple wh-element
(who). The Roeper et al. data strongly show a contrast between the two.
(142) a. Whoᵢ is heᵢ V-ing eᵢ? (25.9% coreferent)
      b. Whoseᵢ N is heᵢ V-ing eᵢ? (3.6% coreferent)
This is perhaps the most significant statistical result in the whole experiment. Yet
this distinction is not really covered by the basic motif that Roeper et al. follow:
that wh-trace is read by the child as little pro. Of course other possibilities of
explanation may be advanced, as they are in the paper, but it would be
preferable if such a significant result could be a result of the basic framework.
A second difficulty in the account has to do with the one clause vs. two
clause contrast. As noted above, children begin to respect strong crossover in one
clause structures before they respect it in two clause structures. They have the
following developmental path:
(143) a. Coreference allowed everywhere (no Strong Crossover constraint)
b. Coreference allowed in 2 clause structures; not in 1 clause
structures (Strong Crossover constraint for 1 clause only)
c. Coreference not allowed (adult Strong Crossover constraint)
It might at first be thought that this divergence in one clause and two clause
structures could be traced simply to the fact that the initial wh-trace is treated as
little pro by the child, and that this obeys Condition B. However, things cannot
be this simple. That is because there are two changes in the grammar given in
(143), but just one parameter to manipulate: pro → wh-trace. If the change in
parameter (pro → wh-trace) is supposed to account for the first change in the
data, i.e. the transition from (143a) to (143b), then it cannot also account for the
second change, from (143b) to (143c). If it is supposed to account for the second
change, then it cannot also be at the root of the first. In short, there are two
changes in the developmental path, but only one parameter: both cannot be
linked to a single change in the grammar.
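The counting argument can be put in a few lines of code. This is a sketch of the logic only, under my own encoding of the stages (the dictionaries below record whether the child allows coreference in crossover configurations at each clause depth): three observed stages require two grammar changes, while a single binary parameter supplies at most one.

```python
# Stage encodings are illustrative, keyed to (143a-c) in the text:
# True = the child allows coreference in a crossover configuration.
stages = [
    {"1-clause": True,  "2-clause": True},    # (143a)
    {"1-clause": False, "2-clause": True},    # (143b)
    {"1-clause": False, "2-clause": False},   # (143c), adult
]
changes_needed = sum(1 for s, t in zip(stages, stages[1:]) if s != t)

# A single binary parameter (null element read as pro vs. as
# wh-trace) distinguishes at most two grammars, so it can drive
# at most one change along the developmental path.
def grammar(trace_is_pro):
    allow = trace_is_pro     # pro is not a name, so Condition C is silent
    return {"1-clause": allow, "2-clause": allow}

distinct = {tuple(sorted(grammar(v).items())) for v in (True, False)}
changes_available = len(distinct) - 1

print(changes_needed, changes_available)   # 2 1: the mismatch
```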
There is indeed a way around this, which rather stretches the conceptual
grounding of the notion "parameter". This is to say that there is a simple
parametric change (pro → wh-trace), but that this operates more than once. In
particular, it is construction-sensitive, first operating in one clause structures, and
later in two clause structures. This is odd, however, from the point of view of
what is normally meant by "parameter": i.e., an independently specified piece of
information in the grammar. Such a piece of information would not normally be
thought of as construction-specific: if wh-trace were being read as little pro by
the child, one would assume that it was listed in the grammar as such. And the
particular change that was noted (that it would first change from little pro to wh-
trace in one clause structures, and only later in two clause structures) seems
equally inexplicable, given that the only possible basis for such a divergence,
some general change in computational complexity, operates in the opposite
direction with respect to the extraction of simple wh-elements vs. full wh-phrases
(who vs. whose hat).
5.7.6 A Derivational Account, and a Possible Compromise
As I have just noted, it would be odd, if one assumes that the deepest level of
explanation is a representationally based parametric one, to assume that
wh-trace begins as little pro in one and two clause constructions, then becomes
wh-trace in one clause constructions only and retains its status as little pro in two
clause constructions, and finally becomes wh-trace generally. However, this is
odd not because this developmental course is odd per se, but rather because it
does not fit in well with the notion of parameter-setting as conventionally
understood. Suppose that, instead of assuming that the categorial change were the
root level of explanation, the difference between the adult grammar and the
child's were to be traced instead to some change in the derivation occurring over
time. Suppose further that null categories, and in particular wh-trace, are
derivationally, and not representationally, defined: i.e. that wh-trace is the null
element left by wh-movement. Then the derivational difference would have an
immediate effect on the definition of elements at any particular level. This
difference might well be on a construction-by-construction basis, rather than
parametric in the conventional sense. It is this sort of line of inquiry that I will follow.
Let us first adopt the van Riemsdijk and Williams account of Strong
Crossover: that it is an instance of Condition C applying at DS to structures
such as (144).

(144) *Heᵢ liked whoseᵢ hat.
The van Riemsdijk and Williams approach allows for an explanation of the
acquisition facts on exactly the same grounds that had been advanced earlier: the
shallowness of the grammar. Assuming that S-structure or the surface is the
(computational) anchor for comprehension, and assuming a shallow analysis in
which the wh-element is base generated in place by the child, but not by the
adult, then the child's representation for sentences like (145) will be given in
(146), while the adult's representation will be that in (147).

(145) Whoᵢ did heᵢ say that Bill liked eᵢ?

(146) Child's analysis:
      DS: Who did he say that Bill liked e?
      SS: Whoᵢ did heᵢ say that Bill liked eᵢ?

(147) Adult's analysis:
      DS: *Heᵢ said that Bill liked whoᵢ.
      SS: *Whoᵢ did heᵢ say that Bill liked eᵢ?
Assuming that the deepest level of analysis computed by the child is DS
(where the wh-element is already in fronted position), and assuming that Strong
Crossover really is a condition on the c-command of a wh-trace by a coreferent
name or pronoun, then the child's grammar would not be expected to exhibit
strong crossover effects. That is, if we assume shallowness of analysis, then the
strong crossover facts (i.e. the lack of strong crossover effects for children)
are explained without any additional stipulation given the van Riemsdijk/
Williams account, since the wh-element would not be c-commanded by a
coindexed name or pronoun at any level of representation. The acquisition
account therefore supports the van Riemsdijk and Williams proposal.
Consider now the second contrast noted by Roeper et al.: that between a
fronted simple wh-element and a full fronted NP.

(148) a. Who is he V-ing t? (25.9% coreferent)
      b. Whose N is he V-ing t? (3.6% coreferent)

As noted earlier, this contrast is striking. These facts are also important in
showing that the child is not simply behaving randomly in examples like (148a)
due to confusion, because in the equally complex (148b) Strong Crossover is
maintained.
Of course, no such contrast exists in the adult grammar.
As noted earlier, Chomsky (1986b) allows two left peripheral positions
outside of S (=IP), a direct Comp position, and a position which is Spec C.

(149) [CP SpecC [C′ Comp(±wh) ]]

The former contains a limited set of closed class features and words (±wh, if,
that, etc.), while the latter contains full NPs. As noted earlier, this allows the
"garbage category" character of Comp to be avoided. Let us make use of this
contrast here in the following way:
(150) Who, what, and other closed class wh-elements are optionally
generated in Comp by the child, as spell-outs of the +wh feature;
which man, whose book, and other full wh-NPs are found in Spec C.
The distinction in (150) is reasonable, on two grounds. First, since simple +wh
elements (who, what, etc.) contain barely more information than the fact that they
are wh-elements themselves (plus some information as to humanness, etc.), they
could be simple spell-outs of the +wh feature. This is of course impossible with
the full NP phrase. Second, there is some evidence from early acquisition which
is quite suggestive as to the placement of the initial wh-words. A traditional
finding is that there is a stage of development in which children allow either the
fronting of the wh-word, or the fronting of the auxiliary, but not both (though see
Pinker 1984, for some critical discussion).
(151) Possible: Did John see Mary?
Possible: Who John can see?
Impossible: Who can John see?
This has generally been suggested to be due to a computational deficit of some
type: that both wh-movement and Subj/Aux inversion take some sort of
computational power, so the mutual application of both is impossible. The comments
above suggest instead a structural explanation: if the wh-word is generated
initially as a spell-out of a wh-feature in Comp, and if Subj/Aux inversion is a
substitution operation into Comp (Chomsky 1986b), then the configuration of
data in (151) is explained. Let us, therefore, accept the premise of (150), that in
wh-questions at this stage, the child is able optionally to spell out the wh-element
from the +wh feature in Comp (if it is simple). This allows us immediately to
explain the lack of Strong Crossover for such questions: the wh-word is generated
in place, so it is never c-commanded by a coreferent pronoun. But what now
of the fact that Strong Crossover always holds for the full fronted wh-phrases?
I believe that this calls for the revision of an assumption that I have been
making throughout this chapter. I have been assuming that in the shallow
analysis, the element starts off in a dislocated position at DS, and is able to do
so because it gets its theta role from a null category in an associated GF-position.
While this allows for the element to get a theta role in its deepest level
position via the trace, it does generate an element in a θ-bar position at the
deepest level. This would still not violate the Projection Principle if one assumed
that the adult DS, but not the child's computed DS, were a pure instantiation
of GF-θ. That is, if one assumed the following:

(152) DS (i.e. the adult DS) is the representation where pure theta relations
      are expressed (i.e. is a direct projection of the thematic structure).
Let us suppose instead that the following holds, a more reasonable assumption:

(153) The deepest computed level (at any stage of development) must be
      where pure theta relations are expressed.

The assumption in (153) would require that the dislocated elements in (154a) and
(b) (i.e. the two types of dislocated elements discussed earlier) actually be
in some sort of theta position of their own at DS.
(154) a. Near John's house, he put a case.
      b. To kiss the pig would make the horse happy.

This is certainly not implausible for (154b), where the element may be in a
subject position (Subj/Aux inversion can apply), and assigned an auxiliary theta
role; let us assume that it is also the case in (154a), where the element may be
in some sort of topic position. In effect this states that there is a parametric
difference in the child's grammar which allows certain positions to be base-
generated theta positions.
Given the above assumption, an appropriate derivational distinction may be
made. The dislocated full wh-phrase in sentences like (155) is in the Spec C
position, and this is in no instance a theta position.
(155) Whose hat did he buy?
Hence the wh-element must come from an actual DS object position, even in the children's derivation (given revised assumption (153)).
(156) Child's analysis:
DS: He saw whose hat.
SS: Whose hat did he see?
Since the wh-element is coming from its DS position, there is a Condition C
violation for the child.
On the other hand, the simple wh-element is a spell-out of the +wh feature.
It is generated in Comp, perhaps spelled-out at some later level, and never
appears in a non-dislocated position. The child's derivation is therefore the following:
(157) Child's analysis:
DS: Who (did) he see?
SS: Who (did) he see?
Because the wh-element is never c-commanded by a name, no Condition C
violation results.
Let us turn now to the third finding of Roeper et al.: that there is a sequencing effect in the appearance of the Strong Crossover Constraint. Namely, the child first allows both 1-clause and 2-clause structures violating the constraint, then correctly rules out 1-clause structures while still allowing Strong Crossover to be violated in 2-clause structures, and finally allows no Strong Crossover violations at all.
256 LANGUAGE ACQUISITION AND THE FORM OF THE GRAMMAR
(158) Stage I:
        Who is he_i V-ing t?                  OK for children
        Who does he_i think NP is V-ing t?    OK for children
      Stage II:
        Who is he_i V-ing t?                  * for children
        Who does he_i think NP is V-ing t?    OK for children
      Stage III:
        Who is he_i V-ing t?                  * for children
        Who does he_i think NP is V-ing t?    * for children
      (from Roeper et al.)
If this degree of complexity in the data is to be trusted, then what we have here is not indicative of parameter-setting in the representational mode as commonly understood. If one assumed a single change in the course of development (pro → wh-trace), then it would be unexpected that this change would apply in a context-sensitive way, depending on the one-clause vs. two-clause contrast. Nor would the data be explained if one assumed that Strong Crossover suddenly popped into the grammar, a possibility which is in any case excluded in principle above.
On the other hand, if the Strong Crossover Constraint were dependent on the possibility of wh-movement, then the outlines of the solution for the sequencing effect are clear, since the emergence of Strong Crossover is dependent on the amount of wh-movement which is occurring. The data would follow from the following:
(159) a. Stage I:
         Wh-movement does not occur at all for simple wh-elements; the elements are spell-outs of features in Comp.
      b. Stage II:
         Wh-movement does occur for simple wh-elements, but only within CP (CP (= S-bar) acts as an absolute barrier).
      c. Stage III:
         Wh-movement applies as in the adult grammar.
The first stage is one in which there is no Strong Crossover Effect at all for the
simple moved wh-elements. The second stage is the crucial one for current
purposes. While there is apparent extraction over two clauses, the second stage
would require that the wh-element be simply base-generated as a spell-out of
+wh features in the matrix, rather than moved from the lower clause. This in turn
would require the movement of a null operator in the lower clause, and the
indexing of this element with the matrix wh-element. In the third stage the
grammar would be identical to the adult one.
(160) a. Stage I:
         Who did he see t?
         (spell-out of wh-features in Comp; no movement)
      b. Stage II:
         i)  1 clause: Who did he see t?
             (movement)
         ii) 2 clause: Who did he think that O_i NP saw t?
             (spell-out of the matrix wh-element; movement of the null operator; additional indexing between the matrix wh-element and O_i)
      c. Stage III:
         Who did he think that NP saw t t?
         (movement)
movement
Stage I and Stage III have already been extensively discussed. The crucial fact about Stage II is that, by assuming that CP is an absolute barrier to wh-movement at this stage, the wh-element is forced to be generated in the matrix Comp node rather than moved from below. This would then mean that Strong Crossover would not occur in these 2-clause structures, while at the same time it would for 1-clause structures, which is Roeper et al.'s result. The crucial point is that the stage-like progression in the acquisition data can find a parallel in the stage-like progression of the extraction domain of wh-elements. A similar sort of solution is not available if one assumes that the change is in category type: pro → wh-trace.
To summarize, I have suggested in this chapter that the grammar is essentially in the derivational mode, and that reflexes of this may be seen in the intermediate grammars that the child adopts. Let me close, however, by considering a point of contact of the above analysis with that of Roeper et al. Let us suppose that, in the derivational mode, the notion of wh-trace is not defined in terms of contextual features (e.g. a Case-assigned empty category), but rather in terms of its history. Something is a trace, in this sense, if it is a null category left by movement: it is a wh-trace if it is a null category left by movement to an A-bar position. This is of course different from the contextual definition sometimes adopted or suggested (e.g. Chomsky 1981, 1982), but is itself viable and rather
straightforward. Consider now what happens under the conditions that we have
been suggesting: that the child computes, in some instances, a shallow derivation,
anchored in S-structure, but only receding back to DS.
(161) Child's analysis:
      DS: Who did he say Bill liked.
      SS: Who_i did he_i say Bill liked e_i?
The derivation is clear, but what is the nature of the null category? According to the definition above, it could not be a wh-trace, even though it looks like a wh-trace in the adult grammar, simply because a wh-trace is an element left by wh-movement, and no wh-movement has taken place in the derivation.
What then is it? In fact, it is not clear what the null category would be. One possibility, and a fairly likely one, is that the category is simply little pro, at least at DS. For it is Case-marked, as little pro is, and it is not derived by wh-movement, as is also the case for little pro. If we assume that elements retain their character over a derivation (Brody 1984), and we also allow the possibility of an A-bound little pro (Roeper et al. 1986), then the element would also be little pro at S-structure: the A-bound pro of Roeper et al.
The theoretical ramifications of this sort of account are, I believe, quite interesting. It would mean that both the derivational and representational views were correct: the derivation is shallow, and so the element is defined as little pro. However, unlike the pure representational account, or even a representational-based account like that of Roeper et al., the representational assignment as little pro is based on the analysis in the derivational mode. That is, it is the shallowness of the derivation which requires the child to analyze the null element as little pro (or at least not as wh-trace), since wh-trace is defined as the position from which A-bar movement has taken place. In this way it is possible to hold to the basic insight of the Roeper et al. approach, namely that initial wh-trace is not treated as such, without holding to the view that this is the deepest level of explanatory analysis. That has to do instead with the shallowness of the computed derivation, and it is because of this shallowness that the null category is analyzed as little pro. This analysis would also have the following effect: that when the child's grammar fails, perhaps for computational reasons, it falls into another grammar in UG. In this case the grammar would be one in which the element was a bound little pro. While the details in such a case require further research, the general architecture seems clear.
References
Abney, S. (1987a). The English Noun Phrase in Its Sentential Aspect. Ph.D. dissertation,
Massachusetts Institute of Technology.
Abney, S. (1987b). Licensing and Parsing. Proceedings of NELS 17.
Abney, S. and J. Cole (1985). A Government-Binding Parser. Proceedings of NELS 15.
Akmajian, A., S. Steele, and T. Wasow (1979). The Category Aux in Universal Grammar. Linguistic Inquiry 10.1, 1–64.
Anderson, M. (1979). Noun Phrase Structure. Ph.D. dissertation, University of Connecticut.
Aoun, Y. (1985). A Grammar of Anaphora. Cambridge, Mass.: MIT Press.
Aoun, Y. and D. Sportiche (1981). On the Formal Theory of Government. The Linguistic Review 2.3, 211–236.
Aoun, Y., N. Hornstein, D. Lightfoot, and A. Weinberg (1987). Two Types of Locality. Linguistic Inquiry 18.4, 537–577.
Bach, E. (1962). The Order of Elements in a Transformational Grammar of German. Language 38, 263–269.
Bach, E. (1977). The Position of the Embedding Transformation in the Grammar Revisited. Linguistic Structures Processing, edited by A. Zampolli. New York: North-Holland.
Bach, E. (1979). Control and Montague Grammar. Linguistic Inquiry 10.4, 515–531.
Bach, E. (1983). On the Relationship between Word Grammar and Phrase Grammar. Natural Language and Linguistic Theory 1, 65–89.
Baker, C. L. (1977). Comments on Culicover and Wexler. Formal Syntax, edited by P. Culicover, T. Wasow, and A. Akmajian. New York: Academic Press.
Baker, C. L. (1979). Syntactic Theory and the Projection Problem. Linguistic Inquiry 10.4, 533–581.
Baker, C. L. and J. J. McCarthy, eds. (1981). The Logical Problem of Language Acquisition. Cambridge, Mass.: MIT Press.
Baker, M. (1985). Incorporation: A Theory of Grammatical Function Changing. Ph.D. dissertation, Massachusetts Institute of Technology.
Bar-Adon, A. and W. Leopold, eds. (1971). Child Language: A Book of Readings. New York: Prentice-Hall.
Barss, A. (1985). Chain-Binding. Presentation given at West Coast Conference on Formal Linguistics.
260 REFERENCES
Barss, A. (1986). Chains and Anaphoric Dependencies. Ph.D. dissertation, Massachusetts Institute of Technology.
Barss, A. and H. Lasnik (1986). A Note on Anaphora and Double Objects. Linguistic Inquiry 17.2, 347–354.
Belletti, A. and L. Rizzi (1988). Psych-Verbs and θ-Theory. Natural Language and Linguistic Theory 6.3, 291–352.
Berwick, R. (1985). The Acquisition of Syntactic Knowledge. Cambridge, Mass.: MIT Press.
Berwick, R. and A. Weinberg (1984). The Grammatical Basis of Linguistic Performance. Cambridge, Mass.: MIT Press.
Bever, T. (1970). The Cognitive Basis for Linguistic Structures. Cognition and the Development of Language, edited by J. R. Hayes, 279–352. New York: Wiley.
Bierwisch, M. (1963). Grammatik des Deutschen Verbs. Studia Grammatica. Berlin, GDR.
Bloom, L. (1970). Language Development: Form and Function in Emerging Grammars. Cambridge, Mass.: MIT Press.
Bloom, L., P. Lightbown, and L. Hood (1975). Structure and Variation in Child Language. Monographs of the Society for Research in Child Development 40.
Borer, H. (1984). The Projection Principle and Rules of Morphology. Proceedings of NELS 14.
Borer, H. (1985). The Lexical Learning Hypothesis and Universal Grammar. Boston University Conference on Language Development, Boston, Mass.
Borer, H. and K. Wexler (1987). The Maturation of Syntax. Parameter-Setting, edited by T. Roeper and E. Williams. Cambridge, Mass.: MIT Press.
Bouchard, D. (1984). On the Content of Empty Categories. Dordrecht: Foris.
Bowerman, M. (1973). Early Syntactic Development. Cambridge, England: Cambridge University Press.
Bowerman, M. (1974). Learning the Structure of Causative Verbs. Papers and Reports on Child Language Development 8, edited by E. Clark, 142–178. Stanford, Calif.: Stanford University.
Bowerman, M. (1982). Reorganizational Processes in Lexical and Syntactic Development. Language Acquisition: The State of the Art, edited by E. Wanner and L. Gleitman, 319–347. Cambridge, England: Cambridge University Press.
Bradley, D. (1979). Computational Distinctions of Vocabulary Type. Ph.D. dissertation, Massachusetts Institute of Technology.
Braine, M. D. S. (1963). The Ontogeny of the English Phrase Structure: The First Phase. Reprinted in Child Language: A Book of Readings, edited by A. Bar-Adon and W. Leopold. New York: Prentice-Hall.
Braine, M. D. S. (1963). On Learning the Grammatical Order of Words. Reprinted in Child Language: A Book of Readings, edited by A. Bar-Adon and W. Leopold. New York: Prentice-Hall.
Braine, M. D. S. (1965). Learning the Positions of Words Relative to a Marker Element. Journal of Experimental Psychology 72.4, 532–540.
Braine, M. D. S. (1976). Children's First Word Combinations. Monographs of the Society for Research in Child Development 41, 1–96.
Bresnan, J. (1977). Variables in the Theory of Transformations. Formal Syntax, edited by P. Culicover, T. Wasow, and A. Akmajian. New York: Academic Press.
Bresnan, J. (1978). A Realistic Transformational Grammar. Linguistic Theory and Psychological Reality, edited by M. Halle, J. Bresnan, and G. Miller. Cambridge, Mass.: MIT Press.
Bresnan, J., ed. (1982). The Mental Representation of Grammatical Relations. Cambridge, Mass.: MIT Press.
Brody, M. (1984). On Contextual Definitions and the Role of Chains. Linguistic Inquiry 15.3, 355–380.
Brown, R. (1973). A First Language: The Early Stages. Cambridge, Mass.: Harvard University Press.
Brown, R. and U. Bellugi (1964). Three Processes in the Child's Acquisition of Syntax. Reprinted in Child Language: A Book of Readings, edited by A. Bar-Adon and W. Leopold. New York: Prentice-Hall.
Browning, M. (1987). Null Operator Constructions. Ph.D. dissertation, Massachusetts Institute of Technology.
Burzio, Luigi (1986). Italian Syntax. Dordrecht: Reidel.
Carden, G. (1986a). Blocked Forwards Coreference and Unblocked Forwards Anaphora: Evidence for an Abstract Model of Coreference. Papers from the Regional Meeting of CLS 22, 262–276.
Carden, G. (1986b). Blocked Forwards Coreference: Theoretical Implications of the Acquisition Data. Studies in the Acquisition of Anaphora, Vol. I, edited by B. Lust. Dordrecht: Reidel.
Carden, G. and T. Dietrich (1981). Introspection, Observation, and Experiment. Proceedings of the 1980 Biennial Meeting of the Philosophy of Science Association in East Lansing, MI, edited by R. Giere, 583–597.
Chierchia, G. (1984). Topics in the Syntax and Semantics of Infinitives and Gerunds. Ph.D. dissertation, University of Massachusetts.
Chomsky, N. (1957). Syntactic Structures. The Hague: Mouton.
Chomsky, N. (1965). Aspects of the Theory of Syntax. Cambridge, Mass.: MIT Press.
Chomsky, N. (1970). Remarks on Nominalization. Studies in Semantics in Generative Grammar, Papers by N. Chomsky. The Hague: Mouton.
Chomsky, N. (1972). Studies in Semantics in Generative Grammar. The Hague: Mouton.
Chomsky, N. (1973). Conditions on Transformations. A Festschrift for Morris Halle, edited by S. Anderson and P. Kiparsky, 223–286. New York: Holt, Rinehart, and Winston.
Chomsky, N. (1975 [1955]). The Logical Structure of Linguistic Theory. New York: Plenum Press.
Chomsky, N. (1975). Reflections on Language. Pantheon.
Chomsky, N. (1977a). On Wh-Movement. Formal Syntax, edited by P. Culicover, T. Wasow, and A. Akmajian, 71–132. New York: Academic Press.
Chomsky, N. (1977b). Essays on Form and Interpretation. New York: North-Holland.
Chomsky, N. (1980). On Binding. Linguistic Inquiry 11.1, 1–46.
Chomsky, N. (1981). Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, N. (1982). Some Concepts and Consequences of the Theory of Government and Binding. Cambridge, Mass.: MIT Press.
Chomsky, N. (1986a). Knowledge of Language: Its Nature, Origins, and Use. Praeger.
Chomsky, N. (1986b). Barriers. Cambridge, Mass.: MIT Press.
Chomsky, N. (1993). A Minimalist Program for Linguistic Theory. The View from Building 20: Essays in Linguistics in Honor of Sylvain Bromberger, edited by Ken Hale and S. J. Keyser, 1–52. Cambridge, Mass.: MIT Press.
Chomsky, N. (1995). The Minimalist Program. Cambridge, Mass.: MIT Press.
Chomsky, N. and H. Lasnik (1977). Filters and Control. Linguistic Inquiry 8.3, 425–504.
Clark, E. (1973). What's in a Word? On the Child's Acquisition of Semantics in His First Language. Cognitive Development and the Acquisition of Language, edited by T. E. Moore, 65–110. New York: Academic Press.
Clark, H. and E. Clark (1977). Psychology and Language: An Introduction to Psycholinguistics. New York: Harcourt, Brace, Jovanovich.
Clark, R. (1986). Boundaries and the Treatment of Control. Ph.D. dissertation, UCLA.
Cooper, R. (1978). Montague's Semantic Theory and Transformational Grammar. Ph.D. dissertation, University of Massachusetts.
Crain, S. and C. McKee (1985). Acquisition of Structural Constraints on Anaphora. Proceedings of NELS 16.
Culicover, P. (1967). The Treatment of Idioms within a Transformational Framework. IBM Technical Report.
Culicover, P., T. Wasow, and A. Akmajian, eds. (1977). Formal Syntax. New York: Academic Press.
Culicover, P. and K. Wexler (1977). A Degree-2 Theory of Learnability. Formal Syntax, edited by P. Culicover, T. Wasow, and A. Akmajian. New York: Academic Press.
Davis, H. (1985). Syntactic Undergeneration in the Acquisition of English: Wh-constructions and the ECP. Proceedings of NELS 16.
Dowty, D., R. Wall, and S. Peters (1979). Introduction to Montague Semantics. Dordrecht: Reidel.
Drozd, K. (1987). Minimal Syntactic Structures in Child Language. Manuscript, Tucson, Ariz.: University of Arizona.
Drozd, K. (1994). A Unification Categorial Grammar of Child English Negation. Ph.D. dissertation, University of Arizona.
Emonds, J. (1975). A Transformational Approach to English Syntax. New York: Academic Press.
Emonds, J. (1985). A Unified Theory of Syntactic Categories. Dordrecht: Foris.
Epstein, S. (1984). Quantifier-pro and the LF Interpretation of PRO_arb. Linguistic Inquiry 15.3, 499–505.
Farmer, A. (1984). Modularity in Syntax: A Study of Japanese and English. Cambridge, Mass.: MIT Press.
Fiengo, R. (1980). Surface Structure: The Interface of Autonomous Components. Cambridge, Mass.: Harvard University Press.
Fiengo, R. and J. Higginbotham (1981). Opacity in NP. Linguistic Analysis 7.4, 395–421.
Finer, D. and E. Broselow (1985). The Acquisition of Binding in Second Language Learning. Proceedings of NELS 15.
Fodor, J. A. (1975). The Language of Thought. New York: Crowell.
Fodor, J. A. (1981). Methodological Solipsism as a Research Strategy in Psycholinguistics. Representations, edited by J. Fodor. Cambridge, Mass.: MIT Press.
Fodor, J. A., T. G. Bever, and M. F. Garrett (1974). The Psychology of Language. New York: McGraw-Hill.
Fodor, J. D. (1977). Semantics: Theories of Meaning in Generative Grammar. New York: Crowell.
Frank, R. (1992). Syntactic Locality and Tree Adjoining Grammar. Ph.D. dissertation, University of Pennsylvania.
Franks, S. and N. Hornstein (1990). Governed PRO. McGill Working Papers in Linguistics 6.3, 167–191.
Frazier, Lyn (1979). Some Notes on Parsing. University of Massachusetts Occasional Papers in Linguistics. Amherst, Mass.: University of Massachusetts.
Freidin, R. (1978). Cyclicity and the Theory of Grammar. Linguistic Inquiry 9.4, 519–549.
Freidin, R. (1986). Fundamental Issues in the Theory of Binding. Studies in the Acquisition of Anaphora, edited by B. Lust, 151–191. Dordrecht: Kluwer.
Fukui, N. and M. Speas (1986). Specifiers and Projections. MIT Working Papers in Linguistics 8, 128–172.
Garrett, M. F. (1975). The Analysis of Sentence Production. The Psychology of Learning and Motivation 9, edited by G. H. Bower, 133–177. New York: Academic Press.
Garrett, M. F. (1980). Levels of Processing in Speech Processing. Language Production, vol. 1, edited by B. Butterworth, 177–220. New York: Academic Press.
Gleitman, L. and H. Gleitman (1990). Structural Sources of Verb Meaning. Language Acquisition 1, 3–55.
Goodall, G. (1984). Parallel Structures in Syntax. Ph.D. dissertation, University of California at San Diego.
Goodall, G. (1985–1986). Parallel Structures in Syntax. The Linguistic Review 5.2, 173–184.
Goodluck, H. (1978). Linguistic Principles in Children's Grammar of Complement Subject Interpretation. Ph.D. dissertation, University of Massachusetts.
Goodluck, H. and S. Tavakolian (1982). Competence and Processing in Children's Grammar of Relative Clauses. Cognition 16, 1–28.
Grimshaw, J. (1981). Form, Function, and the Language Acquisition Device. The Logical Problem of Language Acquisition, edited by C. L. Baker and J. J. McCarthy. Cambridge, Mass.: MIT Press.
Grimshaw, J. (1986). Nouns, Arguments, and Adjuncts. MIT Working Papers in Linguistics. Cambridge, Mass.: Massachusetts Institute of Technology.
Gruber, J. (1967). Topicalization in Child Language. Child Language: A Book of Readings, edited by A. Bar-Adon and W. Leopold. New York: Prentice-Hall.
Guilfoyle, E. (1985). The Acquisition of Tense and the Emergence of Lexical Subjects in Child Grammars of English. McGill Working Papers in Linguistics.
Hale, K. (1979). On the Position of Walpiri in a Typology of the Base. Bloomington, Ind.: Indiana University Linguistics Club.
Hale, K. (1983). Warlpiri and the Grammar of Nonconfigurational Languages. Natural Language and Linguistic Theory 1, 5–48.
Hale, K. and J. Keyser (1986a). On the Syntax of Argument Structure. Lexicon Project Working Papers 34. MIT Working Papers in Linguistics. Cambridge, Mass.: Massachusetts Institute of Technology.
Hale, K. and J. Keyser (1986b). Some Transitivity Alternations in English. Lexicon Project Working Papers 7. MIT Working Papers in Linguistics. Cambridge, Mass.: Massachusetts Institute of Technology.
Hamburger, H. and S. Crain (1982). Relative Acquisition. Language Development: Volume 1, edited by S. Kuczaj II, 245–274. Mahwah, NJ: Lawrence Erlbaum.
Hayes, B. (1980). A Metrical Theory of Stress Rules. Ph.D. dissertation, Massachusetts Institute of Technology.
Higginbotham, J. (1985). On Semantics. Linguistic Inquiry 16, 547–594.
Hoji, H. (1983). X^n (YP) X^(n-1) and the Bound Variable Zibun. MIT Workshop on Japanese Linguistics. Cambridge, Mass.: Massachusetts Institute of Technology.
Hoji, H. (1985). Logical Form Constraints and Configurational Structures in Japanese. Ph.D. dissertation, University of Washington.
Hoji, H. (1986). Empty Pronominals in Japanese and the Subject of NP. Proceedings of NELS 17.
Hornstein, N. (1985–1986). Restructuring and Interpretation in a T-model. The Linguistic Review 5.4, 301–334.
Hornstein, N. (1987). Levels of Meaning. Modularity in Representation and Natural Language Understanding, edited by J. Garfield. Cambridge, Mass.: MIT Press.
Huang, J. (1982). Logical Relations in Chinese and the Theory of Grammar. Ph.D. dissertation, Massachusetts Institute of Technology.
Huang, J. (1993). Reconstruction and the Structure of the VP: Some Theoretical Consequences. Linguistic Inquiry 24.
Hyams, N. (1985). Language Acquisition and the Theory of Parameters. Ph.D. dissertation, City University of New York.
Hyams, N. (1986). Language Acquisition and the Theory of Parameters. Dordrecht: Reidel.
Hyams, N. (1987). Parameter-Setting. Parameter-Setting, edited by T. Roeper and E. Williams, 1–22. Dordrecht: Reidel.
Jackendoff, R. (1972). Semantic Interpretation in Generative Grammar. Cambridge, Mass.: MIT Press.
Jackendoff, R. (1977). X-bar Syntax. Cambridge, Mass.: MIT Press.
Jackendoff, R. (1983). Semantics and Cognition. Cambridge, Mass.: MIT Press.
Jackendoff, R. (1988). Consciousness and the Computational Mind. Bradford Books. Cambridge, Mass.: MIT Press.
Jaeggli, O. (1986). Passive. Linguistic Inquiry 17.4, 587–622.
Jakubowicz, C. (1984). On Markedness and Binding Principles. Proceedings of NELS 14, 154–182.
Jelinek, E. (1984). Empty Categories, Case, and Configurationality. Natural Language and Linguistic Theory 2.1, 39–76.
Jelinek, E. (1985). The Projection Principle and the Argument Type Parameter. Paper presented at the Linguistic Society of America, winter meeting.
Johnson, K. (1986). Subjects and Theta Theory. Manuscript, Cambridge, Mass.: Massachusetts Institute of Technology.
Joshi, A. (1985). Tree-Adjoining Grammars: How Much Context Sensitivity Is Required to Provide Reasonable Descriptions? Natural Language Parsing, edited by D. Dowty, L. Karttunen, and A. Zwicky. Cambridge, Eng.: Cambridge University Press.
Joshi, A., L. Levy, and M. Takahashi (1975). Tree Adjunct Grammars. Journal of Computer and System Sciences 10.
Joshi, A. and A. Kroch (1985). The Linguistic Relevance of Tree-Adjoining Grammars. MS-CS-8516, Department of Computer and Information Sciences, University of Pennsylvania.
Kayne, R. (1981). ECP Extensions. Linguistic Inquiry 12, 93–133.
Kayne, R. (1983). Connectedness. Linguistic Inquiry 14, 223–249.
Kayne, R. (1984). Connectedness and Binary Branching. Dordrecht: Foris.
Kayne, R. (1985). Principles of Particle Constructions. Grammatical Representations, edited by J. Guéron, H.-G. Obenauer, and J. Pollock, 101–140. Dordrecht: Foris.
Kegl, J. and J. Gee (undated). ASL Structure: Toward a Theory of Abstract Case. Manuscript, Cambridge, Mass.: Massachusetts Institute of Technology.
Keyser, J. and T. Roeper (1984). On the Middle and Ergative Constructions in English. Linguistic Inquiry 15, 381–416.
Kiparsky, P. (1982a). From Cyclic Phonology to Lexical Phonology. The Structure of Phonological Representations, edited by H. van der Hulst and N. Smith, 131–177. Dordrecht: Foris.
Kiparsky, P. (1982b). Lexical Morphology and Phonology. Linguistics in the Morning Calm, edited by I.-S. Yang, 3–93. Seoul: Hansin.
Kitagawa, Y. (1986). Subjects in English and Japanese. Ph.D. dissertation, University of Massachusetts.
Klein, S. (1982). Syntactic Theory and the Developing Grammar: Reestablishing the Relationship between Linguistic Theory and Data from Language Acquisition. Ph.D. dissertation, University of California at Los Angeles.
Klima, E. and U. Bellugi (1966). Syntactic Regularities in the Speech of Children. Psycholinguistic Papers, edited by J. Lyons and R. Wales, 183–208. Edinburgh: Edinburgh University Press.
Koopman, H. (1984). The Syntax of Verbs. Dordrecht: Foris.
Koster, J. (1975). Dutch as an SOV Language. Linguistic Analysis 1, 111–136.
Koster, J. (1978). Locality Principles in Syntax. Dordrecht: Foris.
Koster, J. (1982). Class lectures. Salzburg Institute of Summer Linguistics, Salzburg, Austria.
Koster, J. (1984). On Binding and Control. Linguistic Inquiry 15, 417–459.
Koster, J. (1987). Domains and Dynasties. Dordrecht: Foris.
Kroch, A. and A. Joshi (1988). Analyzing Extraposition in a Tree Adjoining Grammar. Syntax and Semantics 20, edited by G. Huck and A. Ojeda. New York: Academic Press.
Labov, W. and T. Labov (1976). The Learning of Syntax from Questions. Zeitschrift für Literaturwissenschaft und Linguistik 6, 47–82.
Ladusaw, W. (1985). A Proposed Distinction between Levels and Strata. Paper presented at the Linguistic Society of America, winter meeting.
Lapointe, S. (1978). A Theory of Grammatical Agreement. Ph.D. dissertation, University of Massachusetts.
Lapointe, S. (1985a). A Model of Syntactic Phrase Combination in Speech Production. Proceedings of NELS 15.
Lapointe, S. (1985b). A Theory of Verb Form Use in the Speech of Agrammatic Aphasics. Brain and Language 24.1, 100–155.
Laporte-Grimes, L. and D. Lebeaux (1993). Complexity Considerations in Early Speech. Manuscript, University of Connecticut and University of Maryland.
Lasnik, H. (1986). Two Types of Condition C. Presentation at Princeton Conference on Linguistic Theory, Princeton, NJ.
Lasnik, H. and S. Crain (1985). On the Acquisition of Pronominal Reference. Lingua 65, 135–154.
Lasnik, H. and M. Saito (1984). On the Nature of Proper Government. Linguistic Inquiry 15, 235–289.
Lebeaux, D. (1981). The Acquisition of the Passive. M.A. thesis, Harvard University.
Lebeaux, D. (1982). Submaximal Projections. Manuscript, Amherst, Mass.: University of Massachusetts.
Lebeaux, D. (1983). A Distributional Difference Between Reciprocals and Reflexives. Linguistic Inquiry 14.
Lebeaux, D. (1984). Anaphoric Binding and the Definition of PRO. Proceedings of NELS 14.
Lebeaux, D. (1984–1985). Locality and Anaphoric Binding. The Linguistic Review 4, 343–363.
Lebeaux, D. (1986). The Interpretation of Derived Nominals. Papers from the Regional Meeting of the Chicago Linguistic Society 22, 231–247.
Lebeaux, D. (1987). Comments on Hyams. Parameter-Setting, edited by T. Roeper and E. Williams. Dordrecht: Reidel.
Lebeaux, D. (1987). The Composition of Phrase Structure. Presentation, Tucson, Ariz.: University of Arizona.
Lebeaux, D. (1988). The Feature +Affected and the Formation of the Passive. Thematic Relations [Syntax and Semantics 21], edited by W. Wilkins. New York: Academic Press.
Lebeaux, D. (1988). Language Acquisition and the Form of the Grammar. Ph.D. dissertation, University of Massachusetts, Amherst.
Lebeaux, D. (1991). Relative Clauses, Licensing, and the Nature of the Derivation. Perspectives on Phrase Structure: Heads and Licensing [Syntax and Semantics 25], edited by S. Rothstein. New York: Academic Press.
Lebeaux, D. (1997). Determining the Kernel II: Prosodic Form, Syntactic Form, and Phonological Bootstrapping. NEC Technical Report 97094.
Lebeaux, D. (1998). Where Does the Binding Theory Apply? NEC Technical Report 98015. Princeton, NJ: NEC Research Institute.
Lebeaux, D. (to appear). A Subgrammar Approach to Language Acquisition. NEC Technical Report.
Levin, B. (1983). On the Nature of Ergativity. Ph.D. dissertation, Massachusetts Institute of Technology.
Lust, B. and L. Mangione (1984). The Principal Branching Direction Constraint in First Language Acquisition of Anaphora. Proceedings of NELS 14.
Lust, B., ed. (1986). Studies in the Acquisition of Anaphora. Dordrecht: Reidel.
Manzini, R. (1983). On Control and Control Theory. Linguistic Inquiry 14, 421–446.
Marantz, A. (1980). Whither Move NP. MIT Working Papers in Linguistics. Cambridge, Mass.: Massachusetts Institute of Technology.
Marantz, A. (1982). On the Acquisition of Grammatical Relations. Linguistische Berichte: Linguistik als Kognitive Wissenschaft 80/82, 32–69.
Marantz, A. (1984). On the Nature of Grammatical Relations. Cambridge, Mass.: MIT Press.
Maratsos, M., S. Kuczaj II, D. Fox, and M. Chalkley (1979). Some Empirical Studies in the Acquisition of Transformational Relations: Passives, Negatives, and the Past Tense. Minnesota Symposium on Child Psychology 12, edited by W. Collins. Mahwah, NJ: Lawrence Erlbaum.
Maratsos, M., D. Fox, J. Becker, and M. Chalkley (1985). Semantic Restrictions in Children's Passives. Cognition 19, 167–192.
Marcus, M. , D. Hindle, and M. Fleck (1983). D-theory: Talking about Talking about
Trees. Association for Computational Linguistics 21, 129136.
May, R. (1985). Logical Form. Cambridge, Mass.: MIT Press.
McCawley, J. (1984). Anaphora and Notions of Command. Proceedings of the Tenth
Annual Meeting of the Berkeley Linguistic Society. Berkeley, Calif.: University of
California.
McNeill, D. (1970). The Acquisition of Language: The Study of Developmental Psycho-
linguistics. New York: Harper and Row.
Miller, G. and K. McKean (1964). A Chronometric Study of Some Relations between
Sentences. Quarterly Journal of Experimental Psychology 16, 297308.
Mohanon, K. P. (1982). Lexical Phonology. Ph.D. dissertation, Massachusetts Institute of
Technology.
Montague, R. (1974). Formal Philosophy, edited by R. Thomason. New Haven, Conn.:
Yale University Press.
Morgan, J. , R. Meier, and E. Newport (1987). Structural Packaging in the Input to
Language Learning: Contributions of Prosodic and Morphological Marking of
Phrases to the Acquisition of Language. Cognitive Psychology 19. 4, 498550.
Newport, E. , L. Gleitman, and H. Gleitman (1977). Mother, Id Rather Do It Myself:
Some Eects and Non-eects of Maternal Speech Style. Talking to Children:
Language Input and Acquisition, edited by C. E. Snow and C. Ferguson. Cambridge:
Cambridge University Press.
Nishigauchi, T. (1984). Control and Thematic Domain. Language 60, 215–250.
Partee, B. H. (1979). Montague Grammar and The Well-Formedness Constraint. Selections
from the Third Groningen Round Table [Syntax and Semantics 10], edited by F. Heny
and B. Schnelle, 275–313. New York: Academic Press.
Partee, B. H. (1984). Compositionality. Varieties of Formal Semantics, Proceedings of the
4th Amsterdam Colloquium, edited by F. Landman and F. Veltman, 281–311.
Dordrecht: Foris.
Pesetsky, D. (1982). Paths and Categories, Ph.D. dissertation, Massachusetts Institute of
Technology.
Pesetsky, D. (1985). Morphology and Logical Form. Linguistic Inquiry 16, 193–246.
Pinker, S. (1979). A Theory of the Acquisition of Lexical Interpretive Grammars. MIT
Lexicon Project.
Pinker, S. (1984). Language Learnability and Language Development, Cambridge, Mass.:
Harvard University Press.
Pinker, S. and D. Lebeaux (1982). A Learnability-Theoretic Approach to Language
Acquisition. Manuscript, Cambridge, Mass.: Harvard University.
Postal, P. (1984). On Raising. Cambridge, Mass.: MIT Press.
Powers, S. and D. Lebeaux (1998). Data on DP Acquisition. Issues in the Theory of
Language Acquisition, edited by N. Dittmar and Z. Penner, 37–76. Bern: Peter
Lang.
Pustejovsky, J. (1984). Studies in Generalized Binding. Ph.D. dissertation, University of
Massachusetts.
Radford, A. (1981). Transformational Syntax. Cambridge, Eng.: Cambridge University
Press.
Randall, J. (1985). Morphological Structure and Language Acquisition. New York:
Garland Press.
Reinhart, T. (1983). Anaphora and Semantic Interpretation. Chicago: University of
Chicago Press.
Riemsdijk, H. van and E. Williams (1981). NP-structure. The Linguistic Review 1.
Rizzi, L. (1982). Issues in Italian Syntax. Dordrecht: Foris.
Rizzi, L. (1986a). On Chain Formation. The Syntax of Pronominal Clitics [Syntax and
Semantics 19], edited by H. Borer, 65–95. New York: Academic Press.
Rizzi, L. (1986b). Null Objects in Italian and the Theory of pro. Linguistic Inquiry 17,
501–558.
Roberts, I. (1986). Implicit and Dethematized Subjects. Ph.D. dissertation, University of
Southern California.
Roeper, T. (1974). Ph.D. dissertation, Harvard University.
Roeper, T. and M. Siegel (1978a). A Lexical Transformation for Verbal Compounds.
Linguistic Inquiry 9, 199–260.
Roeper, T. (1978b). Linguistic Universals and the Acquisition of Gerunds. University of
Massachusetts Occasional Papers 4, edited by H. Goodluck and L. Solan. Amherst,
Mass.: University of Massachusetts.
Roeper, T. (1982). The Role of Universals in the Acquisition of Gerunds. Language
Acquisition: The State of the Art, edited by E. Wanner and L. Gleitman, 267–288.
Cambridge, Eng.: Cambridge University Press.
Roeper, T. (1983). Implicit Theta Roles in the Lexicon and Syntax. Manuscript, Amherst,
Mass.: University of Massachusetts.
Roeper, T. (1987). Implicit Arguments and the Head Complement Relation. Linguistic
Inquiry 18.2, 267–310.
Roeper, T., S. Akiyama, L. Mallis, and M. Rooth (1986). The Problem of Empty
Categories and Bound Variables in Language Acquisition. Manuscript, Amherst,
Mass.: University of Massachusetts.
Roeper, T. and J. Keyser (1984). On the Middle and Ergative Constructions in English.
Linguistic Inquiry 15, 381–416.
Roeper, T. and E. Williams (1986). Parameter Setting, Dordrecht: Reidel.
Rosenbaum, P. (1967). The Grammar of English Predicate Complement Constructions.
Cambridge, Mass.: MIT Press.
Ross, J. R. (1968). Constraints on Variables in Syntax. Ph.D. dissertation, Massachusetts
Institute of Technology.
Rothstein, S. (1983). The Syntactic Forms of Predication. Ph.D. dissertation, Massachu-
setts Institute of Technology.
Rozwadowska, B. (1986). Thematic Relations in Derived Nominals. Thematic Relations
[Syntax and Semantics 21], edited by W. Wilkins. New York: Academic Press.
Safir, K. (1982). Inflection-Government and Inversion. The Linguistic Review 1.4,
417–467.
Safir, K. (1987). The Syntactic Projection of Lexical Thematic Structure. Natural
Language and Linguistic Theory 5.4, 561–611.
Safir, K. (1987). Comments on Manzini and Wexler. Parameter-Setting, edited by T.
Roeper and E. Williams. Dordrecht: Reidel.
Saito, M. and H. Hoji (1983). Weak Crossover and Move-α in Japanese. Natural
Language and Linguistic Theory 1, 245–260.
Schlesinger, I. M. (1971). Production of Utterances and Language Acquisition. The
Ontogenesis of Grammar, edited by D. Slobin, 63–101. New York: Academic Press.
Seely, T. D. (1989). Anaphoric Relations, Chains, and Paths. Ph.D. dissertation,
University of Massachusetts.
Selkirk, E. (1984). Phonology and Syntax: The Relation between Sound and Structure.
Cambridge, Mass.: MIT Press.
Shattuck, S. R. (1974). Speech Errors: An Analysis. Ph.D. dissertation, Massachusetts
Institute of Technology.
Sheldon, A. (1974). The Role of Parallel Function in the Acquisition of Relative Clauses.
Journal of Verbal Learning and Verbal Behavior 13, 272–281.
Solan, L. (1983). Pronominal Reference: Child Language and the Theory of Grammar.
Dordrecht: Reidel.
Solan, L. and T. Roeper (1978). Children's Use of Syntactic Structure in Interpreting
Relative Clauses. Papers in the Structure and Development of Child Language,
University of Massachusetts Occasional Papers 4, edited by H. Goodluck and L.
Solan, 105–126. Amherst, Mass.: University of Massachusetts.
Speas, M. (1990). Phrase Structure in Natural Language. Dordrecht: Reidel.
Sproat, R. (1985). On Deriving the Lexicon. Ph.D. dissertation, Massachusetts Institute of
Technology.
Sproat, R. (1985). The Projection Principle and the Syntax of Synthetic Compounds.
Proceedings of NELS 16.
Sportiche, D. (1983). Structural Invariance and Symmetry in Syntax. Ph.D. dissertation,
Massachusetts Institute of Technology.
Sportiche, D. (1987). Unifying Movement Theory. Manuscript, Los Angeles: University
of Southern California.
Sportiche, D. (1988). A Theory of Floated Quantifiers, and its Consequences for
Constituent Structure. Linguistic Inquiry 19, 425–450.
Steele, S. (in preparation). A Grammar of Luiseño.
Stowell, T. (1981). The Origins of Phrase Structure. Ph.D. dissertation, Massachusetts
Institute of Technology.
Stowell, T. (1981/1982). A Formal Theory of Configurational Phenomena. Proceedings
of NELS 12.
Stowell, T. (1983). Subjects across Categories. The Linguistic Review 2, 285–312.
Stowell, T. (1988). Small Clause Restructuring. Manuscript, Los Angeles: University of
California at Los Angeles.
Tavakolian, S. (1978). Structural Principles in the Acquisition of Complex Sentences. Ph.D.
dissertation, University of Massachusetts.
Tavakolian, S. (1981). The Conjoined Clause Analysis of Relative Clauses. Language
Acquisition and Linguistic Theory, edited by S. Tavakolian. Cambridge, Mass.: MIT
Press.
Tavakolian, S., ed. (1981). Language Acquisition and Linguistic Theory. Cambridge,
Mass.: MIT Press.
Thiersch, C. (1978). Topics in German Syntax. Ph.D. dissertation, Massachusetts Institute
of Technology.
Travis, L. (1984). Parameters and Eects of Word Order Variation. Ph.D. dissertation,
Massachusetts Institute of Technology.
Vainikka, A. (1985). The Acquisition of English Case. Presented at Boston University
Conference on Language Development 10, later appeared as Vainikka, A. (1993/
1994). Case in the Development of English Syntax. Language Acquisition 3,
257–324.
Vainikka, A. (1986). Case in Finnish and Acquisition. Manuscript, Amherst, Mass.:
University of Massachusetts.
Vainikka, A. (1986). Nominative Signals Movement. Presentation at 3rd Workshop in
Comparative Germanic Syntax, Turku, Finland.
Vainikka, A. (1988). Manuscript, Amherst, Mass.: University of Massachusetts.
Vergnaud, J.-R. (1985). Dépendances et niveaux de représentation en syntaxe.
Amsterdam: John Benjamins.
Wanner, E. and L. Gleitman, ed. (1982). Language Acquisition: The State of the Art,
Cambridge, Eng.: Cambridge University Press.
Wasow, T. (1977). Adjectival and Verbal Passive. Formal Syntax, edited by P. Culicover,
T. Wasow, and A. Akmajian. New York: Academic Press.
Weinberg, A. (1988). Locality Principles in Syntax and in Parsing. Ph.D. dissertation,
Massachusetts Institute of Technology.
Wexler, K. and P. Culicover (1980). Formal Principles of Language Acquisition. Cam-
bridge, Mass.: MIT Press.
Wexler, K. (1982). A Principled Theory for Language Acquisition. Language Acquisition:
The State of the Art, edited by E. Wanner and L. Gleitman. Cambridge, Eng.:
Cambridge University Press.
Wexler, K. and Y. Chien (1985). The Development of Lexical Anaphors and Pronouns.
Papers and Reports on Child Language Development 24, 138–149. Stanford, Calif.:
Stanford University.
Wexler, K. and Y. Chien (1987a). Children's Acquisition of Locality Conditions for
Reflexives and Pronouns. Papers in Linguistics 26, 30–39. Irvine, Calif.:
University of California.
Wexler, K. and R. Manzini (1987b). Parameters and Learnability in Binding Theory.
Parameter-Setting, edited by T. Roeper and E. Williams. Dordrecht: Reidel.
Williams, E. (1978). Across-the-Board Rule Application. Linguistic Inquiry 9.1, 31–43.
Williams, E. (1980). Predication. Linguistic Inquiry 11.1, 203–238.
Williams, E. (1981). Argument Structure and Morphology. The Linguistic Review 1,
81–114.
Williams, E. (1982). The NP-Cycle. Linguistic Inquiry 13.2, 277–296.
Williams, E. (1987). Reassignment of Functions at LF. Linguistic Inquiry 17.2, 265–299.
Zubizarreta, M.-L. (1987). Levels of Representation in the Lexicon and in the Syntax.
Dordrecht: Foris.
Index
A
Abney, 71, 72, 77, 84, 96
Abrogation of deep structure functions,
183–258
Adjoin-α, xv, xix, 91–144
Agreement, 145–153
Akiyama, 204, 239–258
Akmajian, 88
Anchoring
and deepest computed level, 188–194
derivation anchored at DS, 184–194
derivation anchored at SS, 184–194
Anti-Reconstruction Effects, 102–112
Aoun, 21, 43, 44, 244
Argument/Adjunct distinction, 94–136
Argument-linking, 38–51
and learnability, 38–41
and the Projection Principle, 104–112
and the Structure of the Base,
104–112
and ergative languages, 38–41
and the derivation, 94–136
B
Bach, 17, 100, 101, 220
Baker, 51
Barss, 224–239
Base Order, 17–29
Determining the base order, 17–29
Belletti, 219, 241
Bellugi, 26, 27
Bever, 9, 142, 185, 186
Bierwisch, 17
Bloom, 69–70, 79–80, 158
Borer, 15
Bowerman, 11, 28, 92
Bradley, 12, 92
Braine, xxiii, 7–12, 19
Bresnan, 54, 118, 216
Brody, 258
Brown, 70, 92, 154
Browning, 214
C
Canonical structural realization, 36
Carden, xxviii, 204, 224–239
Case representation, 178, 179
Chien, 245, 249
Chierchia, 23
Chomsky, xiii, xiv, xv, xxii, 2, 4, 10,
13–15, 23, 25, 31, 41, 46, 47, 52,
70, 74, 80, 84–86, 93, 96, 97, 100,
114, 116, 118, 128, 141, 145, 149,
150, 152–154, 165, 183, 206, 208,
215, 220, 240–242, 253
Clark, E., 92
Clark, H., 92
Clark, R., 206
Closed class elements, 1–5, 7, 11–16,
151–153
and set of governors, 12
and finiteness, 13–14
and open class elements, 11–15
link with grammatical operations,
151–153
Cole, 96
Composition of phrase structure, 91–144
and saturation of closed class
elements, 114–120
Condition C, 102–112, 224–239
and Dislocated Constituents, 224–239
constraint of direct c-command,
224–229
Conjoin-α, 112–115, 120–136
Constancy principles, 91–94
Control, 203–224
and abrogation of DS functions,
188–194, 220–224
c-command constraint, 213–220
double-binding constructions,
213–220
early stages, 204–213
Goodluck's result, 210–211
representation in early grammar,
220–224
Tavakolian's result, 206–207
Cooper, 113
Crain, 120, 207
Culicover, 7, 185
D
Deductive system, modelling, 195–203
Deep structure, 104–112
Derivational Endpoints, xiii, xxvii
Derivational model, 91–136
Derivational Theory of Complexity,
184–194
Detecting Movement, 19–22
Dietrich, 224–239
Dislocated constituents and indexing
functions, 183–258
Dowty, 100
Drozd, 208–213
E
Early phrase structure
building, 31–36, 47–53, 56–84
from lexical to phrasal syntax, 68–80
lexical representation, 67
pivot/open sequence, 58–60, 68–69,
72–79
thematic representation, 67
Emonds, 17, 154
English nominals
as ergative, 41–45
Epstein, 213
Equipollence, 194–203
Ergative languages, 38–45
F
Farmer, 68
Fiengo, 25, 88
Finiteness, 2, 13–14
and closed class elements, 13–14
Fixed Specifier Constraint, 19–23
Fleck, 140
Fodor, 9, 128, 185, 186
Frank, xiv
Franks, 217
Frazier, 137
Freidin, 101, 104
Fukui, 27, 28, 72, 84
Full vs. Reduced Paradigm, 211
Functor/Pivot, 68–69, 71–75
G
Garrett, xvi, xxvi, 9, 12, 92, 155–157,
185, 186
General Congruence Principle, 47,
126–136
and setting of parameters, 126–136
Gleitman, H., 67
Gleitman, L., xxvi, 67
Goodluck, 120, 209211, 213
Governor, Canonical, 12
Grammatical operations, 145–153
Grammatical Sequence, relative clauses,
142–144
Grimshaw, 32, 35, 47, 51, 95
H
Hale, 32, 49, 52, 54, 55
Hamburger, 120
Higginbotham, 88, 96, 224
Hindle, 140
Hoji, 52, 77, 118
Hornstein, 21, 43, 44, 217
Huang, 13, 42, 104
Hyams, 16, 82, 209, 215, 249
I
Idioms, 165–182
and passive, 165–182
Level I idioms, 178–181
Level II idioms, 178–181
J
Jackendoff, xxviii, 22, 33, 61, 85, 86,
95, 97, 102, 103, 172, 183
Jakubowicz, 245, 247, 249
Jelinek, 38, 40, 41, 46, 52
Johnson, 219
Joshi, xiv, xxii, 78
K
Keyser, 54, 64
Kitagawa, 27, 28
Klein, 26
Klima, 26, 27
Koopman, 12, 18, 55, 91, 166
Koster, 17, 93, 217
Kroch, xiv, xxii, 78
L
Labov, T., 198
Labov, W., 198
Lapointe, 156
Laporte-Grimes, xxvii
Lasnik, 21, 106, 207, 215, 230, 242
Lebeaux, xvi, xxi, xxii, xxv, xxvi, xxvii,
xxviii, 32, 34, 81, 82, 85, 86, 88,
92, 149, 174, 212, 213–220, 241
Levels of Representation
and parametric variation, 14–15
Levin, 38
Levy, 78
Lexical entry/representation, 50, 53–60,
63, 66–78
structure of, 50, 53–60, 63, 66–78
with lexical insertion into open slots,
57–58
Lightfoot, 21, 43, 44
Link of closed class item with
grammatical operations, 151–153
Lust, 76
M
Mallis, 204, 239258
Mangione, 76
Manzini, 206
Marantz, 38, 92
Marcus, 140
Marker, 19–22
McCawley, 224
McKean, 165, 185, 186
McKee, 207
McNeill, 185, 186
Meier, 19
Merge, xxii–xxiv
Merger, 154–182
Metatheoretical Constraint on Indexing,
231
and chain-binding, 230–234
Miller, 165, 185, 186
Minimalist Program, the, xiii–xxix
Montague, xxi
Morgan, 19
N
Newport, 19, 68
Nishigauchi, 205
O
Open class/closed class distinction, 7–16
P
Parametric variation in phrase structure,
3, 14–15, 16–18, 31–41
amount of, 3
and levels, 14–15
and triggers, 16–18
Parametric variation in Relative Clauses
and saturation of closed class
elements, 114–120
and structure of parameters, 112–120,
126–136
Peters, 100
Phrase Structural Case, 150
Phrase structure composition, xiii–xxix
Phrase Structure,
Building, 31–36, 47–53, 56–84
Pinker, 7, 10, 23, 31–36, 45, 48, 51, 63,
65, 70, 86, 164, 194–197, 212
Pivot/Open Constructions, 58–60, 68–69,
72–75
Pivot/Open distinction, 7–11
and government relation, 9, 10
Postal, 243
Powers, xxvi
Predication, 146–147
Pre-Project-α representation, 51–83
Principle of Representability, 141
Processing considerations, 136–144
and ECP, 138–139
and grammar, 136–144
Project-α, xvi–xxvii, 154–182
Projection of Lexical Structure, 47–53
Projection Principle, 104–112
Property of Smooth Degradation, 141
Pustejovsky, 150
R
Radford, 88
Reconstruction, 234–239
quasi-Reconstruction, 234–239
vs. direct approach, 224–229
Reduced structures
deletion account of, 159
null item account of, 160
subgrammar account of, 165–182
Reinhart, 110, 224–226
Relative clauses, 91–144
Relative Clauses, Acquisition
conjoined clause analysis, 120–126
default grammar, 123
Replacement Sequences, 69
Rizzi, 219, 241
Roeper, 16, 17, 19, 20, 24, 54, 56,
120–126, 184, 204, 239–258
Rooth, 204, 239–258
Rosenbaum, 205
Ross, 88
Rothstein, 96
S
Safir, 17
Saito, 21, 72, 230
Schlesinger, 11
Seely, 231–235
Selkirk, 88
Semantic bootstrapping, 31–38, 45–47
Sequence of Structures, 69
Shallow analysis/derivation, 184–194
and Derivational Theory of
Complexity, 184–194
Shattuck-Hufnagel, xxvi, 12, 92,
155–157
Sheldon, 120, 121
Siegel, 54
Solan, 120, 125, 126, 184, 207, 227
Speas, xvi, 27, 28, 72, 84
Specified Determiner Generalization,
170–175
Speech errors, xviii, 155–156, 165–169
Sportiche, 21, 27, 28, 213
Steele, 88
Stowell, 10, 12, 32, 37, 49, 52, 54, 55,
95, 99, 114, 138
Strong Crossover, 239–258
acquisition evidence, 245–258
and van Riemsdijk and Williams
proposal, 243–244
and wh-questions, 239–258
derivational account, 247–249,
251–258
representational account, 247–251
Structure of the Base, 104–112
Subgrammar Approach, xiii–xxix, 53, 67,
75–78, 165–182
Submaximal Projections, 86–87
T
Takahashi, 78
Tavakolian, 120–126, 142, 184, 203–224
Telegraphic speech, 154–182
The placement of Neg
syntax, 24–26
acquisition, 26–29
Thematic Representation, 67
Theta representation, 178, 179
Thiersch, 18
Travis, 55, 91
Triggers, 16–29
U
Universal Application of Principles, 245
V
Vainikka, 82
van Riemsdijk, xxviii, 4, 46, 102, 103,
116, 119, 183, 236, 243–258
W
Wall, 100
Wasow, 88
Weinberg, 21, 43, 44, 140
Weisler, 231
Wexler, 7, 185, 245, 247, 249
Williams, xxviii, 5, 46, 56, 66, 96, 102,
103, 111, 113, 116, 119, 146, 147,
183, 205, 206, 213, 215, 224, 236,
243–258
Z
Zubizarreta, 67