
LANGUAGE ACQUISITION AND THE FORM OF THE GRAMMAR

This book was originally selected and revised to be included in the World Theses Series
(Holland Academic Graphics, The Hague), edited by Lisa L.-S. Cheng.
LANGUAGE ACQUISITION
AND THE FORM
OF THE GRAMMAR

DAVID LEBEAUX
NEC Research Institute

JOHN BENJAMINS PUBLISHING COMPANY
PHILADELPHIA/AMSTERDAM
The paper used in this publication meets the minimum requirements of
American National Standard for Information Sciences Permanence of
Paper for Printed Library Materials, ANSI Z39.48-1984.
Library of Congress Cataloging-in-Publication Data
Lebeaux, David.
Language acquisition and the form of the grammar / David Lebeaux
p. cm.
Includes bibliographical references and index.
1. Language acquisition. 2. Generative grammar. I. Title.
P118.L38995 2000
401.93--dc21 00-039775
ISBN 90 272 2565 6 (Eur.) / 1 55619 858 2 (US)
© 2000 John Benjamins B.V.
No part of this book may be reproduced in any form, by print, photoprint, microfilm, or any
other means, without written permission from the publisher.
John Benjamins Publishing Co. P.O.Box 75577 1070 AN Amsterdam The Netherlands
John Benjamins North America P.O.Box 27519 Philadelphia PA 19118-0519 USA
Table of Contents
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Chapter 1
A Re-Definition of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.1 The Pivot/Open Distinction and the Government Relation . . . . . . . . 7
1.1.1 Braine's Distinction . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.1.2 The Government Relation . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.2 The Open/Closed Class Distinction . . . . . . . . . . . . . . . . . . . . . . . . 11
1.2.1 Finiteness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.2.2 The Question of Levels . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.3 Triggers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.3.1 A Constraint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.3.2 Determining the base order of German . . . . . . . . . . . . . . . . 17
1.3.2.1 The Movement of NEG (syntax) . . . . . . . . . . . . . . . 24
1.3.2.2 The Placement of NEG (Acquisition) . . . . . . . . . . . . 26
Chapter 2
Project-α, Argument-Linking,
and Telegraphic Speech . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.1 Parametric variation in Phrase Structure . . . . . . . . . . . . . . . . . . . . . 31
2.1.1 Phrase Structure Articulation . . . . . . . . . . . . . . . . . . . . . . . 31
2.1.2 Building Phrase Structure (Pinker 1984) . . . . . . . . . . . . . . . 32
2.2 Argument-linking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.2.1 An ergative subsystem: English nominals . . . . . . . . . . . . . . 41
2.2.2 Argument-linking and Phrase Structure: Summary . . . . . . . . 45
2.3 The Projection of Lexical Structure . . . . . . . . . . . . . . . . . . . . . . . . 47
2.3.1 The Nature of Projection . . . . . . . . . . . . . . . . . . . . . . . . . . 51
2.3.2 Pre-Project-α representations (acquisition) . . . . . . . . . . . . 56
2.3.3 Pre-Project-α representations and the Segmentation Problem . 60
2.3.4 The Initial Induction: Summary . . . . . . . . . . . . . . . . . . . . . 65
2.3.5 The Early Phrase Marker (continued) . . . . . . . . . . . . . . . . . 66
2.3.6 From the Lexical to the Phrasal Syntax . . . . . . . . . . . . . . . . 75
2.3.7 Licensing of Determiners . . . . . . . . . . . . . . . . . . . . . . . . . . 84
2.3.8 Submaximal Projections . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
Chapter 3
Adjoin-α and Relative Clauses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
3.2 Some general considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
3.3 The Argument/Adjunct Distinction, Derivationally Considered . . . . . 94
3.3.1 RCs and the Argument/Adjunct Distinction . . . . . . . . . . . . . 94
3.3.2 Adjunctual Structure and the Structure of the Base . . . . . . . . 98
3.3.3 Anti-Reconstruction Effects . . . . . . . . . . . . . . . . . . . . . . 102
3.3.4 In the Derivational Mode: Adjoin-α . . . . . . . . . . . . . . . . 104
3.3.5 A Conceptual Argument . . . . . . . . . . . . . . . . . . . . . . . . . . 110
3.4 An Account of Parametric Variation . . . . . . . . . . . . . . . . . . . . . . . 112
3.5 Relative Clause Acquisition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
3.6 The Fine Structure of the Grammar, with Correspondences: The
General Congruence Principle . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
3.7 What the Relation of the Grammar to the Parser Might Be . . . . . . . 136
Chapter 4
Agreement and Merger . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
4.1 The Complement of Operations . . . . . . . . . . . . . . . . . . . . . . . . . . 146
4.2 Agreement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
4.3 Merger or Project-α . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
4.3.1 Relation to Psycholinguistic Evidence . . . . . . . . . . . . . . . . . 154
4.3.2 Reduced Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
4.3.3 Merger, or Project-α . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
4.3.4 Idioms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
4.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
Chapter 5
The Abrogation of DS Functions:
Dislocated Constituents and Indexing Relations . . . . . . . . . . . . . . . . . 183
5.1 Shallow Analyses vs. the Derivational Theory of Complexity . . . . 184
5.2 Computational Complexity and The Notion of Anchoring . . . . . . . . 188
5.3 Levels of Representation and Learnability . . . . . . . . . . . . . . . . . . . 192
5.4 Equipollence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
5.5 Case Study I: Tavakolian's Results and the Early Nature of Control . . 203
5.5.1 Tavakolian's Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
5.5.2 Two Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
5.5.3 PRO as Pro, or as a Neutralized Element . . . . . . . . . . . . . . . 208
5.5.4 The Control Rule, Syntactic Considerations: The Question of
C-command . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
5.5.5 The Abrogation of DS functions . . . . . . . . . . . . . . . . . . . . . 220
5.6 Case Study II: Condition C and Dislocated Constituents . . . . . . . . . 224
5.6.1 The Abrogation of DS Functions: Condition C . . . . . . . . . . . 226
5.6.2 The Application of Indexing . . . . . . . . . . . . . . . . . . . . . . . 229
5.6.3 Distinguishing Accounts . . . . . . . . . . . . . . . . . . . . . . . . . . 234
5.7 Case Study III: Wh-Questions and Strong Crossover . . . . . . . . . . . . 239
5.7.1 Wh-questions: Barriers framework . . . . . . . . . . . . . . . . . . . 240
5.7.2 Strong Crossover . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
5.7.3 Acquisition Evidence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
5.7.4 Two possibilities of explanation . . . . . . . . . . . . . . . . . . . . . 248
5.7.5 A Representational Account . . . . . . . . . . . . . . . . . . . . . . . . 249
5.7.6 A Derivational Account, and a Possible Compromise . . . . . . 251
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
There are two ways of painting two trees together. Draw
a large tree and add a small one; this is called fu lao
(carrying the old on the back). Draw a small tree and add
a large one; this is called hsieh yu (leading the young by
the hand). Old trees should show a grave dignity and an
air of compassion. Young trees should appear modest and
retiring. They should stand together gazing at each other.
Mai-mai Sze
The Way of Chinese Painting
Acknowledgments
This book had its origins as a linguistics thesis at the University of Massachu-
setts. First of all, I would like to thank my committee: Tom Roeper, for scores
of hours of talk, for encouragement, and for his unflagging conviction of the
importance of work in language acquisition; Edwin Williams, for the example of
his work; Lyn Frazier, for an acute and creative reading; and Chuck Clifton, for
a psychologists view. More generally, I would like to thank the faculty and
students of the University of Massachusetts, for making it a place where creative
thinking is valued. The concerns and orientation of this book are very much
molded by the training that I received there.
Further back, I would like to thank the people who got me interested in all
of this in the first place: Steve Pinker, Jorge Hankamer, Jane Grimshaw, Annie
Zaenen, Merrill Garrett and Susan Carey. I would also like to thank Noam
Chomsky for encouragement throughout the years.
Since the writing of the thesis, I have had the encouragement and advice of
many fine colleagues. I would especially like to thank Susan Powers, Alan
Munn, Cristina Schmitt, Juan Uriagereka, Anne Vainikka, Ann Farmer, and Ana-
Teresa Perez-Leroux. I am also indebted to Sandiway Fong, as well as Bob
Krovetz, Christiane Fellbaum, Kiyoshi Yamabana, Piroska Csuri, and the NEC
Research Institute for a remarkable environment in which to pursue the research
further.
I would also like to thank Mamoru Saito, Hajime Hoji, Peggy Speas,
Juergen Weissenborn, Clare Voss, Keiko Muromatsu, Eloise Jelinek, Emmon
Bach, Jan Koster, and Ray Jackendoff.
Finally, I would like to thank my parents, Charles and Lillian Lebeaux, my
sister, Debbie Lebeaux, and my sons, Mark and Theo. Most of all, I would like
to thank my wife Pam, without whom this book would have been done badly, if
at all. This book is dedicated to her, with love.
Preface
What is the best way to structure a grammar? This is the question that I started
out with in writing my thesis in 1988. I believe that the thesis had a marked
effect on the answering of this question, particularly in the creation of the
Minimalist Program by Chomsky (1993) a few years later.
I attempted real answers to the question of how to structure a grammar, and
the answers were these:
(i) In acquisition, the grammar is arranged along the lines of subgrammars.
These grammars are arranged so that the child passes from one to the next,
and each succeeding grammar contains the last. I shall make this clearer
below.
(ii) In addition, in acquisition, the child proceeds to construct his/her grammar
from derivational endpoints (Chapter 5). From the derivational endpoints,
the child proceeds to construct the entire grammar. This may be forward or
backward, depending on what the derivational endpoint is. If the derivational
endpoint, or anchorpoint, is DS, then the construction is forward; if the
derivational endpoint or anchorpoint is S-structure or the surface, then the
construction proceeds backwards.
The above two proposals were the main proposals made about the
acquisition sequence. There were many proposals made about the syntax. Of
these, the main architectural proposals were the following.
(iii) The acquisition sequence and the syntax (in particular, the syntactic
derivation) are not to be considered in isolation from each other, but
rather are tightly yoked. The acquisition sequence can be seen as the result
of derivational steps or subsequences (as can be seen in Chapters 2, 3, and
4). This means that the acquisition sequence gives unique purchase onto the
derivation itself, including the adult derivation.
(iv) Phrase structure is not given "as is", nor derived top-down, but rather is
composed (Speas 1990). This phrase structure composition (Lebeaux 1988)
is not strictly bottom-up, as in Chomsky's (1995) Merge, but rather involves
(a) the intermingling of units; (b) grammatical licensing, rather than
simply geometrical (bottom-up) combination (in a way which will become
clearer below); and (c) among other transformations, the transformation
Project-α (Chapter 4).
(v) Two specific composition operations (and the beginnings of a third) are
proposed. Adjoin-α (Chapter 3) is proposed, adding adjuncts to the basic
nuclear clause structure (Conjoin-α is also suggested in that chapter). In
further work, this is quite similar to the Adjunction operation of Joshi and
Kroch, and to Tree Adjoining Grammars (Joshi 1985; Joshi and Kroch
1985; Frank 1992), though the proposals are independent and not exactly
the same. The second new composition operation is Project-α (Chapter 4),
an absolutely new operation in the field. It projects open class structure
into a closed class frame, and constitutes the single most radical syntactic
proposal of this book.
(vi) Finally, composition operations, and the variance in the grammar as a
whole, are linked to the closed class set: elements like the, a, to, of, etc.
In particular, each composition operation requires the satisfaction of a
closed class element, and a closed class element is implicated in
each parameter.
These constitute some of the major proposals that are made in the course of this
thesis. In this preface I would like to both lay out these proposals in more detail,
and compare them with some of the other proposals that have been made since
the publication of this thesis in 1988. While this thesis played a major role in the
coming of the Minimalist Program (Chomsky 1993, 1995), the ideas of the thesis
warrant a renewed look by researchers in the field, for they have provocative
implications for the treatment of language acquisition and the composition of
phrase structure.
Let us start to outline the differences of this thesis with respect to later
proposals, not with respect to language acquisition, but with respect to syntax. In
particular, let us start with parts (iv) and (v) above: that the phrase marker is
composed from smaller units.
A similar proposal is made with Chomsky's (1995) Merge. However, here,
unlike Merge:
(1) The composition is not simply bottom-up, but involves the possible
intermingling of units.
(2) The composition is syntactically triggered in that all phrase structure
composition involves the satisfaction of closed class elements
(Chapters 3 and 4), and is not simply the geometric putting together
of two units, as in Merge, and
(3) The composition consists of two operations among others (these are
the only two that are developed in this thesis), Adjoin-α and
Project-α.
With respect to the idea that all composition operations are syntactically triggered
by features, let us take the operation Adjoin-α. This takes two structures and
adjoins the second into the first.
(1)  s1: the man met the woman
     s2: who loved him
     Adjoin-α  →  the man met the woman who loved him
This shows the intermingling of units, as the second is intermeshed with the first.
However, I argue here (Chapter 4) that it also shows the satisfaction of closed
class elements, in an interesting way. Let us call the wh-element of the relative
clause (here, who) the relative clause linker.
It is a proposal of this thesis that the adjunction operation itself involves the
satisfaction of the relative clause linker (who) by the relative clause head (the
woman), and it is this relation, the relation of Agreement, which composes
the phrase marker. The relative clause linker is part of the closed class set. This
relative clause linker is satisfied in the course of Agreement; thus the composi-
tion operation is put into a 1-to-1 relation with the satisfaction of a closed class
head. (This proposal, so far as I know, is brand new in the literature.)
(2) Agree (relative head, relativizer) ↔ Adjoin-α
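The licensing relation in (2) can be sketched in code. This is an illustrative model under assumed representations (word lists, a small linker inventory, the function name `adjoin_alpha`), not Lebeaux's own formalism: adjunction goes through only when the relative clause opens with a closed-class linker that the head can satisfy by Agreement.

```python
# Illustrative sketch of Adjoin-alpha as a licensed composition operation.
# The list encoding, function name, and linker inventory are assumptions
# for exposition, not Lebeaux's own formalism.

CLOSED_CLASS_LINKERS = {"who", "which", "that"}

def adjoin_alpha(nuclear_clause, relative_clause, head_noun):
    """Adjoin relative_clause to head_noun inside nuclear_clause.
    Adjunction is licensed only if the clause opens with a closed-class
    linker that the head can satisfy via Agreement."""
    linker = relative_clause[0]
    if linker not in CLOSED_CLASS_LINKERS:
        raise ValueError("Adjoin-alpha not licensed: no closed-class linker")
    composed = []
    for word in nuclear_clause:
        composed.append(word)
        if word == head_noun:
            # Agreement between head and linker licenses the adjunction;
            # the two units are intermingled, not edge-concatenated.
            composed.extend(relative_clause)
    return composed

s1 = ["the", "man", "met", "the", "woman"]
s2 = ["who", "loved", "him"]
print(" ".join(adjoin_alpha(s1, s2, "woman")))
# -> the man met the woman who loved him
```

Note how the composition fails, rather than applying geometrically, when no closed-class linker is present to be satisfied; this is the intended contrast with Merge.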
This goes along with the proposal (Chapter 4), which was taken up in the
Minimalist literature (Chomsky 1992, 1995), that movement involves the
satisfaction of closed class features. The proposal here, however, is that composi-
tion, as well as movement, involves the satisfaction of a closed class feature (in
particular, Agreement). In the position here, taken up in the Minimalist literature,
the movement of an element to the subject position is put into a 1-to-1 corre-
spondence with agreement (Chapter 4 again).
(3) Agree (subject, predicate) ↔ Move NP (Chapter 4)
The proposal here is thus more thoroughgoing than that in the minimalist
literature, in that both the composition operation and the movement operation are
triggered by Agreement, and the satisfaction of closed class features. In the
minimalist literature, it is simply movement which is triggered by the satisfaction
of closed class elements (features); phrase structure composition is done simply
geometrically (bottom-up). Here, both are done through the satisfaction of
Agreement. This is shown below.
(4)                      Minimalism                Lebeaux (1988)
    Movement             syntactic (satisfaction   syntactic (satisfaction
                         of features)              of features)
    Phrase Structure     asyntactic (geometric)    syntactic (satisfaction
    Composition                                    of features)
This proposal (Lebeaux 1988) links the entire grammar to the closed class set:
both the movement operations and the composition operations are linked to
this set.
The set of composition operations discussed in this thesis is not intended to
be exhaustive, merely representative. Along with Adjoin-α, which Chomsky-
adjoins elements into the representation (Chapter 3), let us take the second, yet
more radical phrase structure composition operation, Project-α. This is not
equivalent to Speas's (1990) Project-α, but rather projects an open class structure
into a closed class frame. The open class structure also represents pure thematic
structure, and the closed class structure, pure Case structure.
This operation, for a simple partial sentence, looks like (5) (see Lebeaux
1988, 1991, 1997, 1998 for further extensive discussion).
The operation projects the open class elements into the closed class (Case)
frame. It also projects up the Case information from Determiner to DP, and
unifies the theta information from the theta subtree into the Case Frame, so that
it appears on the DP node.
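The projection-and-unification step just described can be sketched as follows. This is a toy model under assumed data structures (tuples for the theta subtree, dicts for the Case frame slots), meant only to show how Case and theta information end up unified on the same DP node; it is not Lebeaux's notation.

```python
# Toy sketch of Project-alpha: open-class items from the theta subtree are
# projected into the empty NP slots of the closed-class Case frame, and the
# theta role is unified with the Case feature on the resulting DP node.
# The data structures here are assumptions for exposition.

theta_subtree = [("woman", "agent"), ("see", None), ("man", "patient")]

case_frame = [
    {"det": "the", "case": "+nom", "np": None},  # subject DP slot
    {"verb": None},                              # V slot
    {"det": "a", "case": "+acc", "np": None},    # object DP slot
]

def project_alpha(theta, frame):
    slots = iter(theta)
    result = []
    for node in frame:
        word, role = next(slots)
        if "verb" in node:
            result.append({"verb": word})
        else:
            # Unification: the DP carries both Case and theta information.
            result.append({"det": node["det"], "np": word,
                           "features": [node["case"], "+" + role]})
    return result

for node in project_alpha(theta_subtree, case_frame):
    print(node)
# -> {'det': 'the', 'np': 'woman', 'features': ['+nom', '+agent']}
# -> {'verb': 'see'}
# -> {'det': 'a', 'np': 'man', 'features': ['+acc', '+patient']}
```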
The Project-α operation was motivated in part by the postulation of a
subgrammar in acquisition (Chapters 2, 3, and 4), in part by the remarkable
speech error data of Garrett (Chapter 4; Garrett 1975), and in part by idioms
(Chapter 4). This operation is discussed at much greater length in further
developments by myself (Lebeaux 1991, 1997, 1998).
I will discuss the subgrammar underpinnings of the Project-α approach in
more detail later in this preface. For now, I would simply like to point to
the remarkable speech error data collected by Merrill Garrett (1975, 1980), the
MIT corpus, which anchors this approach.
(5) [Tree diagram] Project-α
    Theta subtree (open class):
        [V [N woman (+agent)] [V [V see] [N man (+patient)]]]
    Case Frame (closed class):
        [VP [DP (+nom) [Det the (+nom)] [NP e]] [V [V see] [DP (+acc) [Det a (+acc)] [NP e]]]]
    Output of Project-α:
        [VP [DP (+agent, +nom) [Det the] [NP woman]] [V [V see] [DP (+patient, +acc) [Det a] [NP man]]]]
Garrett and Shattuck-Hufnagel collected a sample of 3400 speech errors. Of
these, by far the most interesting class is the so-called morpheme-stranding
errors. These are absolutely remarkable in that they show the insertion of open
class elements into a closed class frame. Thus, empirically, the apparent impor-
tance of open class and closed class items is reversed: rather than open class
items being paramount, closed class items are paramount, and guide the deriva-
tion. Open class elements are put into slots provided by closed class elements, in
Garrett's remarkable work. A small sample of Garrett's set is shown below.
(6) Speech errors (stranded morpheme errors), Garrett (personal communication); permuted elements underlined in the original

    Error                            Target
    my frozers are shoulden          my shoulders are frozen
    that just a back trucking out    a truck backing out
    McGovern favors pushing busters  favors busting pushers
    but the cleans twoer             twos cleaner
    his sink is shipping             ship is sinking
    the cancel has been practiced    the practice has been cancelled
    she's got her sets sight         sights set
    a puncture tiring device         tire puncturing device
As can be seen, these errors can only arise at a level where open class elements
are inserted into a closed class frame. The insertion does not take place correctly
(a speech error), so that the open class elements end up in permuted slots
(e.g. a puncture tiring device).
Garrett summarizes this as follows:
why should the presence of a syntactically active bound morpheme be
associated with an error at the level described in [(6)]? Precisely because the
attachment of a syntactic morpheme to a particular lexical stem reflects a
mapping from a functional level [i.e. grammatical functional, i.e. my theta
subtree, D. L.] to a positional level of sentence planning.
This summarizes the two phrase structure composition operations that I propose
in this thesis: Adjoin-α and Project-α. As can be seen, these involve (1) the inter-
mingling of structures (they are not simply bottom-up), and (2) the satisfaction of
closed class elements. Let us now turn to the general acquisition side of the
problem.
It was said above that this thesis was unique in that the acquisition sequence
and the syntax (in particular, the syntactic derivation) were not considered
in isolation, but rather in tandem. The acquisition sequence can be viewed as the
output of derivational processes. Therefore, to the extent to which the derivation
is partial, the corresponding stage of the acquisition sequence can be seen as a
subgrammar of the full grammar. The yoking of the acquisition sequence and the
syntax is therefore the following:
(7) Acquisition: subgrammar approach
    Syntax: phrase structure composition from smaller units
The subgrammar approach means that children literally have a smaller grammar
than the adult. The grammar increases over time by adding new structures (e.g.
relative clauses, conjunctions), and by adding new primitives of the representa-
tional vocabulary, as in the change from purely theta-composed speech to theta-
and Case-composed speech.
The addition of new structures (e.g. relative clauses and conjunctions)
may be thought of as follows. A complex sentence like that in (8) may be
thought of as a triple: the two units, and the operation composing them (8b).
(8) a. The man saw the woman who loved him.
    b. (the man saw the woman (rooted), who loved him, Adjoin-α)
Therefore a subgrammar lacking the operation joining the units may be
thought of as simply taking one of the units (let us say the rooted one) and
letting go of the other unit (plus letting go of the operation itself). This is
possible and necessary because it is the operation itself which joins the units: if
the operation is not present, one or the other of the units must be chosen. The
subgrammar behind (8a), but lacking the Adjoin-α operation, will therefore generate
the structure in (9) (assuming that it is the rooted structure which is chosen).
(9) The man saw the woman.
This is what is wanted.
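The triple in (8b) can be sketched minimally in code. The representations here (strings, an operation set, the function name `generate`) are illustrative assumptions; the point is only that removing the composing operation from the grammar leaves exactly the rooted unit, as in (9).

```python
# Sketch of the triple in (8b): (rooted unit, adjoined unit, composing
# operation). A subgrammar lacking the operation keeps only the rooted
# unit, as in (9). Representations here are illustrative assumptions.

triple = (("the man saw the woman", "rooted"),
          "who loved him",
          "Adjoin-alpha")

def generate(triple, grammar_ops):
    rooted, adjunct, op = triple
    sentence = rooted[0]
    if op in grammar_ops:
        # The full grammar has the composing operation available.
        sentence = sentence + " " + adjunct
    return sentence

print(generate(triple, grammar_ops={"Adjoin-alpha"}))
# -> the man saw the woman who loved him
print(generate(triple, grammar_ops=set()))   # subgrammar without Adjoin-alpha
# -> the man saw the woman
```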
Note that the subgrammar approach (in acquisition) and the phrase structure
composition approach (in syntax itself) are in perfect parity. The phrase structure
composition approach gives the actual operation dividing the subgrammar from
the supergrammar. That is, with respect to this operation (Adjoin-α), the
grammars are arranged in two circles: Grammar 1 containing the grammar itself,
but without Adjoin-α, and Grammar 2 containing the grammar including Adjoin-α.
(10) [Diagram: two nested circles, Grammar 1 contained inside Grammar 2 (with Adjoin-α)]
The above is a case of adding a new operation.
The case of adding another representational primitive is yet more interesting.
Let us assume that the initial grammar is a pure representation of theta relations.
At a later stage, Case comes in. This is the hypothesis of the layering of vocabu-
lary: one type of representational vocabulary comes in, and does not displace,
but rather is added to, another.
(11) theta (Stage I) → theta + Case (Stage II)
The natural lines along which this representational addition takes place are
precisely given by the operation Project-α. The derivation may again be thought
of as a triple: the two composing structures, one a pure representation of theta
relations and one a pure representation of Case, and the operation composing them.
(12) ((man (see woman)), (the __ (see (a __))), Project-α)
     (the see in the theta tree and the see in the Case frame each contain
     partial information, which is unified in the Project-α operation)
The subgrammar is one of the two representational units: in this case, the unit
(man (see woman)). That is a sort of pure theta representation, or telegraphic speech.
The sequence from Grammar 0 to Grammar 1 is therefore given by the addition
of Project-α.
(13) [Diagram: two nested circles, Grammar 0 contained inside Grammar 1 (with Project-α)]
The full pattern of stage-like growth is shown in the chart below:

(14) Acquisition: Subgrammar Approach
     Add construction operations to the simplified tree:  relative clauses, conjunction (not discussed here)
     Add primitives to the representational vocabulary:   theta → theta + Case
As can be seen, the acquisition sequence and the syntax (syntactic derivation)
are tightly yoked.
Another way of putting the arguments above is in terms of distinguishing
accounts. I wish to distinguish the phrase structure operations here from Merge,
and the acquisition subgrammar approach here from the alternative, which is the
Full Tree, or Full Competence, Approach. (The full tree approach holds that the
child does not start out with a substructure, but rather has the full tree, at all
stages of development.) Let us see how the accounts are distinguished, in turn.
Let us start with Chomsky's Merge. According to Merge, the (adult) phrase
structure tree, as in Montague (1974), is built up bottom-up, taking individual
units and joining them together, and so on. The chief property of Merge is that
it is strictly bottom-up. Thus, for example, in a right-branching structure like see
the big man, Merge would first take big and man and merge them together,
then add the to big man, and then add see to the resultant.
(15) Application of Merge to see the big man:
     Step 1: merge big and man  →  [N [Adj big] [N man]]
     Step 2: add the            →  [DP [Det the] [NP [Adj big] [N man]]]
     Step 3: add see            →  [VP [V see] [DP [Det the] [NP [Adj big] [N man]]]]
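The strictly bottom-up character of Merge just described can be sketched minimally. The tuple encoding and step names are assumptions for illustration; the point is that each step simply pairs two units geometrically, always extending the tree at its edge.

```python
# Minimal sketch of strictly bottom-up Merge for "see the big man":
# each step joins two units geometrically, always extending the tree at
# its edge; no closed-class licensing is involved. The tuple encoding is
# an assumption for illustration.

def merge(left, right):
    # Merge is purely geometric: it just pairs its two inputs.
    return (left, right)

step1 = merge("big", "man")   # (big man)
step2 = merge("the", step1)   # (the (big man))
step3 = merge("see", step2)   # (see (the (big man)))
print(step3)
# -> ('see', ('the', ('big', 'man')))
```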
The proposal assayed in this thesis (Lebeaux 1988) would, however, have a
radically different derivation. It would take the basic structure as being the basic
government relation: (see man). This is the primitive unit (unlike with Merge).
To this, the and big may be added, by separate transformations, Project-α
and Adjoin-α, respectively.
(16) a. Project-α: the theta subtree [V [V see] [N man]] is projected into the
        Case Frame [V [V (see)] [DP [Det the] [NP e]]], yielding
        (V see (DP the man))
     b. Adjoin-α: (ADJ big) is adjoined, yielding
        (V see (DP the (NP big man)))
How can these radically distinct accounts (Lebeaux 1988 and Merge) be
empirically distinguished? I would suggest two ways. First, conceptually, the
proposal here (as in Chomsky 1975 [1955], 1957, and Tree Adjoining Grammars,
Kroch and Joshi 1985) takes information nuclei as its input structures, not
arbitrary pieces of string. For example, for the structure The man saw the
photograph that was taken by Stieglitz, the representation here would take the
two clausal nuclear structures, shown in (17) below, and adjoin them. This is not
true for Merge, which does not deal in nuclear units.
(17) s1: the man saw the photograph
     s2: that was taken by Stieglitz
     Adjoin-α  →  the man saw the photograph that was taken by Stieglitz
Even more interesting nuclear units are implicated in the transformation
Project-α, where the full sentence is decomposed into a nuclear unit, which is the
theta subtree, and the Case Frame.
(18) The man saw the woman
     theta subtree: (man (see woman))
     Case Frame:    (the __ (see (a __)))
The structure in (18), the man saw the woman, is composed of a basic nuclear
unit, (man (see woman)), which is telegraphic speech (as argued for in Chap-
ter 2). No such nuclear unit exists in the Merge derivation of the man saw the
woman: that is, in the Merge derivation, (man (see woman)) does not exist as
a substructure of ((the man) (saw (the woman))).
This is the conceptual argument for preferring the composition operation
here over Merge. In addition, there are two simplicity arguments, of which I will
give just one here.
The simplicity argument has to do with a set of structures that children
produce which are called replacement sequences (Braine 1976). In these sequenc-
es, the child is trying to reach (output) some structure which is somewhat too
difficult for him/her. To make it, therefore, he or she first outputs a substructure,
and then the whole structure. Examples are given below: the first line is the first
outputted structure, and the second line is the second outputted structure, as the
child attempts to reach the target (which is the second line).
(19) see ball      (first output)
     see big ball  (second output and target)
(20) see ball      (first output)
     see the ball  (second output and target)
What is striking about these replacement sequences is that the child does not
simply first output random substrings of the final target, but rather that the first
output is an organized part of the second. Thus in both (19) and (20), what the
child has done is first isolate out the basic government relation, (see ball), and
then add to it: with big and the, respectively.
The particular simplifications chosen are precisely what we would expect with
the substructure approach outlined here, and crucially not with Merge. With the
substructure approach outlined here (Chapters 2 and 4), what the child (or adult) first
has in the derivation is precisely the structure (see ball), shown in example (21).
(21) [V [V see] [N (+patient) ball]]
To this structure other elements are then added, by Project-α or Adjoin-α. Thus,
crucially, the first structure in (19) and (20) actually exists as a literal substruc-
ture of the final form (line 2), and thus could help the child in deriving the
final form. It literally goes into the derivation.
By contrast, with Merge, the first line in (19) and (20) never underlies the
second line. It is easy to see why. Merge is simply bottom-up: it extends the
phrase marker. Therefore, the phrase structure composition derivation underlying
(20), line 2, is simply the following (Merge derivation).
(22) Merge derivation underlying (20), line 2
     (N ball)  →  (DP (D the) (N ball))  →  (see (DP (D the) (N ball)))
However, this derivation crucially does not have the first line of (20), (see (ball)),
as a subcomponent. That is, (see (ball)) does not go into the making of (see
(the ball)) in the Merge derivation, but it does in the substructure derivation.
But this is a strong argument against Merge. For the first line of the
outputted sequence of (20), (see ball), is presumably helping the child in
reaching the ultimate target (see (the ball)). But this is impossible with Merge,
for the first line in (20) does not go into the making of the second line, accord-
ing to the Merge derivation.
That is, Merge cannot explain why (see ball) would help the child get to the
target (see (the ball)), since (see ball) is not part of the derivation of (see (the
ball)), in the Merge derivation. It is part of the sub-derivation in the substructure
approach outlined here, because of the operation Project-α.
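The contrast can be sketched by simply listing the intermediate units each derivation of "see the ball" makes available and asking whether the child's first output, (see ball), is among them. The string encoding is an assumption for illustration.

```python
# Sketch of the simplicity argument: list the intermediate units each
# derivation of "see the ball" makes available, then ask whether the
# child's first output, (see ball), is among them. The string encoding
# is an assumption for illustration.

# Merge: strictly bottom-up, extending the phrase marker at its edge.
merge_units = ["ball", "the ball", "see the ball"]

# Substructure approach: the theta subtree (see ball) exists first and is
# projected into the closed-class frame by Project-alpha.
substructure_units = ["see ball", "see the ball"]

print("see ball" in merge_units)         # -> False
print("see ball" in substructure_units)  # -> True
```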
The above (see Chapters 2, 3, and 4) differentiates the sort of phrase
structure composition operations found here from Merge. This is in the domain
of syntax, though I have used language acquisition argumentation. In the
domain of language acquisition proper, the proposal of this thesis (the
hypothesis of substructures) must be contrasted with the alternative, which
holds that the child is outputting the full tree, even when the child is potentially
just in the one word stage: this may be called the Full Tree Hypothesis. These
PREFACE xxv
dierential possibilities are shown below. (For much additional discussion, see
Lebeaux 1991, 1997, 1998, in preparation.)
(23)                       Lebeaux (1988)                 Distinguished from
     Syntax                phrase structure composition   both: (1) no composition,
                                                         (2) Merge
     Language acquisition  subgrammar approach            Full Tree Approach
Let us now briefly distinguish the proposals here from the Full Tree Approach.
In the Full Tree Approach, the structure underlying a child sentence like ball
or see ball might be the following, in (24). In contrast, the substructure
approach (Lebeaux 1988) would assign the radically different representation
given in (25).
(24) Full Tree Approach
     [IP [TP [AgrSP [AgrOP [VP [V e] [DP [D e] [NP ball]]]]]]]
     (the full functional skeleton, with all remaining heads and specifiers
     present but empty, e)
(25) Substructure Approach
     [V [V] [N(+patient) ball]]
How can these approaches be distinguished? That is, how can a choice be made
between (25), the substructure approach, and (24), the Full Tree Approach? I
would briefly suggest at least four ways (for full argumentation, see
Lebeaux 1997, to appear; Powers and Lebeaux 1998).
First, the subgrammar approach, but not the Full Tree Approach, has some
notion of simplicity in representation and derivation. Simplicity is a much-used
notion in science, for example in deciding between two equally empirically
adequate theories. The Full Tree Approach has no notion of simplicity: in
particular, it has no account of how the child would proceed from simpler structures
to more complex ones. The substructure theory, on the other hand, has a strong
proposal to make: the child proceeds over time from simpler structures to those
which are more complex. Thus the subgrammar point of view makes a strong
proposal linked to simplicity, while the Full Tree Hypothesis makes none.
A second argument has to do with the closed class elements, and may be
broken up into two subarguments. The first of these is that, in the Full
Tree Approach, there is no principled reason for the exclusion of closed class
elements in early speech (telegraphic speech). That is, both the open class and
closed class nodes exist, according to the Full Tree Hypothesis, and there is no
principled reason why initial speech would be simply open class, as it is. Since
the full tree is present, lexical insertion could take place just as easily in the
closed class nodes as in the open class nodes. The fact that it doesn't leaves the
Full Tree Approach with no principled reason why closed class items are
lacking in early speech.
The second subargument concerns the special role that closed class items
have in structuring an utterance, as shown by the work of Garrett (1975, 1980)
and Gleitman (1990). Since the Full Tree Approach gives open and closed class
items the same status, it has no explanation for why closed class items play a
special role in processing and acquisition. The substructure approach, with
Project-α, on the other hand, faithfully models the difference, by having open
class and closed class elements initially in different representations, which are
then fused (for additional discussion, see Chapter 4, and Lebeaux 1991, 1997,
to appear).
A third argument against the Full Tree Approach has to do with structures
like see ball (natural) vs. see big (unnatural), given below.
(26) see ball (natural and common)
     see big (unnatural and uncommon)
Why would an utterance like see ball, which maintains the government
relation, be natural and common for the child, while see big is unnatural and
uncommon? There is a common-sense explanation for this: see ball maintains
the government relation (between a verb and a complement), while see and big
have no natural relation. While this fact is obvious, it cannot be accounted for
within the Full Tree Approach. The reason is that the Full Tree Approach has all
nodes potentially available for use, including the adjectival ones. Thus there
would be no constraint against lexically inserting see and big (rather than see
and ball). On the substructure approach, on the other hand, there is a marked
difference: see and ball are on a single primitive substructure, the theta
tree, while see and big are not.
A fourth argument against the Full Tree Approach and for the substructure
approach comes from a paper by Laporte-Grimes and Lebeaux (1993). In this
paper, the authors show that the acquisition sequence proceeds almost sequential-
ly in terms of the geometric complexity of the phrase marker. That is, children
first output binary branching structures, then double binary branching, then triple
binary branching, and so on. This complexity result would be unexpected under
the Full Tree Approach, where the full tree is always available.
This concludes the four arguments against the Full Tree Approach, and for
the substructure approach in acquisition. The substructure approach (in acquisi-
tion) and the composition of the phrase marker (in syntax) form the two main
proposals of this thesis.
Aside from the main lines of argumentation, which I have just given, there
are a number of other proposals in this thesis. I just list them here.
(1) One main proposal, which I take up in all of Chapter 5, is that the acquisition
sequence is built up from derivational endpoints. In particular, for some purpos-
es, the child's derivation is anchored in the surface, and only goes part of the
way back to DS. The main example of this can be seen with dislocated constitu-
ents. In examples like (27a) and (b), exemplifying Strong Crossover and a
Condition C violation respectively, the adult disallows these constructions,
while the child allows them.
(27) a. *Which man_i did he_i see t? (OK for child)
     b. *In John's_i house, he_i put a book t. (OK for child)
It cannot simply be said that Condition C does not apply in the child's
grammar, as in (27b), because it does apply in nondislocated structures
(Carden 1986b). The solution to this puzzle (and there exist a large number of
similar puzzles in the acquisition literature; see Chapter 5) is that Condition C
in general applies over direct c-command relations, including at D-Structure
(Lebeaux 1988, 1991, 1998), and that the child analyzes structures like (27b) as
if they were dislocated at all levels of representation, thus never triggering
Condition C (a similar analysis holds of Strong Crossover, construed as a
Condition C type constraint at DS, van Riemsdijk and Williams 1981). That is,
the child's derivation, unlike the adult's, does not have movement, but starts out
with the element in a dislocated position, and indexes it to the trace. This
explains the lack of Condition C and Crossover constraints (shown in Chapter
5). It does so by saying that the child's derivation is shallow: anchored at SS or
the surface, the dislocated item is never treated as if it were fully back in the
DS position.
This is the shallowness of the derivation, anchored in SS (discussed in
Chapter 5).
(2) A number of proposals are made in Chapter 2. One main proposal concerns
the theta tree. In order to construct the tree, one takes a lexical entry, and does
lexical insertion of open class items directly into it. This is shown in (28).
(28) [V [N man] [V [V see] [N(+patient) woman]]]
This means that the sequence between the lexicon and the syntax is in fact a
continuum: the theta subtree constitutes an intermediate structure between those
usually thought to be in the lexicon, and those in the syntax. This is a radical
proposal.
A second proposal made in Chapter 2 is that X′ projections project up only as far
as they need to. Thus if one assumes the X′-theory of Jackendoff (1977), as I
did in this thesis (recall that Jackendoff had three X′ levels), then an element
might project up to the single-bar level, the double-bar level, or all the way up
to the triple-bar level, as needed.
(29) [N′′′ [N′′ [N′ N]]]
This was called the hypothesis of submaximal projections.
A final proposal of Chapter 2 is that the English nominal system is ergative.
That is, a simple intransitive noun phrase like those in (30), with the subject in
the subject position (of the noun phrase), is always derived from a DS in which
the subject is a DS object. Crucially, this includes not simply unaccusative verbs
(i.e. nominals from unaccusative verbs) but unergative verbs as well (such as
sleeping and swimming).
(30) a. John's sleeping
        derived from: the sleeping of John (subject internal)
     b. John's swimming
        derived from: the swimming of John (subject internal)
This means that the English nominal system is actually ergative in character, a
startling result.
Some final editorial comments. For space reasons in this series, Chapter 5 in the
original thesis has been deleted, and Chapter 6 has been re-numbered Chapter 5.
Second, I have maintained the phrase structure nodes of the original trees, rather
than trying to update them with the more recent nodes. The current IP is
therefore generally labelled S (sentence), the current DP is generally labelled NP
(noun phrase), and the current CP is sometimes labelled S′ (S-bar, the old name
for CP). Third, the term dislocation in Chapter 5 is intended to be neutral by
itself between moved and base-generated; the argument of that section is that
wh-elements which are moved by the adult are base-generated in dislocated
positions by the child. Finally, I would like to thank Lisa Cheng and Anke de
Looper for helpful editorial assistance.
Introduction
This work arose out of an attempt to answer three questions:
I. Is there a way in which the Government-Binding theory of Chomsky (1981)
can be formulated so that the leveling in it is more essential than in the
current version of the theory?
II. What is the relation between the sequence of grammars that the child adopts
and the basic formation of the grammar, and is there such a relation?
III. Is there a way to anchor Chomsky's (1981) finiteness claim, that the set of
possible human grammars is finite, so that it becomes a central explanatory
factor in the grammar itself?
The work attempts to accomplish the following:
I. To provide for an essentially leveled theory, in two ways: by showing that
DS and SS are clearly demarcated, by positing operations additional to
Move-α which relate them, and by suggesting that there is an ordering in
addition by vocabulary, the vocabulary of description (in particular, Case
and theta theory) accumulating over the derivation.
II. To relate this syntactically argued-for leveling to the acquisition theory,
again in two ways: by arguing that the external levels (DS, the Surface, PF)
may precede S-structure with respect to the induction of structure, and by
positing a general principle, the General Congruence Principle, which relates
acquisition stages and syntactic levels.
III. To give the closed class elements a crucial role to play: with respect to
parametric variation, they are the locus of the specification of parametric
difference; and with respect to the composition of the phrase marker, it is
the need for closed class (CC) elements to be satisfied which gives rise to
phrase marker composition from more primitive units, and which initiates
Move-α as well.
In terms of syntactic content, Chapters 2–4 deal with phrase structure, both the
acquisition and the syntactic analysis thereof, and Chapter 5 deals with the
interaction of indexing functions, Control and Binding Theory, with levels of
representation, particularly as displayed in the acquisition sequence.
Thematically, a number of concerns emerge throughout. A major concern is
with closed class elements and finiteness. With respect to parametric variation,
I suggest that closed class elements are the locus of parametric variation. This
guarantees finiteness of possible grammars in UG, since the set of possible
closed class elements is finite.¹ With respect to phrase structure composition, it
is the closed class elements, and the necessity for their satisfaction, which require
the phrase marker to be composed, and initiate movement as well (e.g. Move-wh
is in a one-to-one correspondence with the lexical necessity: satisfy the +wh feature).
The phrase marker composition bears some relation to the traditional generalized
transformations of Chomsky (1957), and it may apply (in the case of
Adjoin-α) after movement. But the composition that occurs is of a strictly
limited sort, where the units are demarcated according to the principles of GB.
Finally, closed class elements form a fixed frame into which the open class (OC)
elements are projected (Chapters 1, 2, and 4). More exactly, they form a Case
frame into which a theta subtree is projected (Chapter 4). This rule I call
Merger (or Project-α).
A second theme is the relation of stages in acquisition to levels of grammat-
ical representation. Given the apparent difficulty of any theory which involves
the learning of transformations,² the precise nature of the relation of the acquisi-
tion sequence to the structure of the grammar has remained murky, without a
theory of how the grammatical acquisition sequence interacts with, or displays,
the structure of the grammar, and with, perhaps, many theoreticians believing
that any correspondence is otiose. Yet there is considerable reason to believe that
there should be such a correspondence. On theoretical grounds, this would be
expected for the following reason: the child in his/her induction of the grammar
is not handed information from all levels of the grammar at once, but rather from
particular picked-out levels: the external levels of Chomsky (class lectures, 1985),
namely DS, LF, and PF or the surface.
These are contrasted with the internal level, S-structure. Briefly, information
from the external levels is available to the child: about LF because of the paired
meaning interpretation, from the surface in the obvious fashion, and from DS,
construed here simply as the format of lexical forms, which are presumably
given by UG. As such, the child's task (still!) involves the interpolation of
operations and levels between these relatively fixed points. But this then means
that the acquisition sequence must build on these external levels, and display the
structure of the levels, perhaps in a complex fashion.

1. Modulo the comments in Chapter 1, footnote 1.
2. Because individual transformations are no longer sanctioned in the grammar. I do not believe,
however, that the jury is yet in on the type of theory that Wexler and Culicover (1980) envisage.
A numerical argument leads in the same direction: namely, that the acquisi-
tion theory, in addition to being a parametric theory, should contain some
essential reference to, and reflect, the structure of the grammar. Suppose that, as
above, the closed class elements and their values are identified with the possible
parameters. Let us (somewhat fancifully) set the number at 25, and assume that
they are binary. This would then give 2²⁵ target grammars in UG (over 30 million),
a really quite small finite system. But consider the range of acquisition sequenc-
es involved. If parameters are independent, a common assumption, then any
of these 25 parameters could be set first, then any of the remaining 24, and so
on. This gives 25! possible acquisition sequences for the learning of a single
language (about 1.5 × 10²⁵), a truly gigantic number. That is, the range of acquisition
sequences would be much larger than the range of possible grammars, and
children might be expected to display widely divergent intermediate grammars on
their path to the final common target, given independence. Yet they do nothing
of the sort; acquisition sequences in a given language look remarkably similar.
All children pass through a stage of telegraphic speech, and similar sorts of
errors are made in structures of complementation, in the acquisition of Control,
and so on. There is no wide fecundity in the display of intermediate grammars.
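The arithmetic behind this argument is easy to verify (a quick check, not part of the original text):

```python
from math import factorial

n_parameters = 25                  # binary parameters, as assumed above

# Number of target grammars: one per setting of 25 binary parameters.
n_grammars = 2 ** n_parameters
print(n_grammars)                  # 33554432: a bit over 30 million

# Number of possible acquisition sequences, if the parameters are
# independent and may be set in any order: 25! orderings.
n_sequences = factorial(n_parameters)
print(f"{n_sequences:.3e}")        # 1.551e+25

# The space of acquisition sequences dwarfs the space of grammars:
print(n_sequences // n_grammars)   # roughly 4.6 * 10**17 sequences per grammar
```

The last figure makes the point vivid: under independence, the orderings outnumber the grammars by some seventeen orders of magnitude, yet attested acquisition sequences cluster tightly.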
The way that has been broached in the acquisition literature to handle this
has been the so-called linking of parameters, whereby the setting of a single
parameter leads to another being set. This could restrict the range of acquisition
sequences. But the theories embodying this idea have tended to have a rather
idiosyncratic and fragmentary character, and have not been numerous. The
suggestion in this work is that there is substructuring, but not in the
lexical-parametric domain itself (conceived of as the set of values for the closed
class (CC) elements); rather, it is in the operational domain with which this lexical
domain is associated. An example of this association was given above with the
relation of wh-movement to the satisfaction of the +wh feature; another
example would be the satisfaction of the relative clause linker (the wh-element
itself), which either needs or does not need to be satisfied in the syntax. This
gives rise either to languages in which the relative forms a constituent with the
head (English-type languages), or to languages in which it is splayed out after the
main proposition, correlative languages.
(1) Lexical Domain                        Operational Domain
    +wh must be satisfied by SS           Move-wh applies in syntax
    +wh may not be satisfied by SS        Move-wh applies at LF

    Relative clause linker must be        English-type language
    satisfied by SS
    Relative clause linker may not be     Correlative language
    satisfied by SS
The theory of this work suggests that all operations are dually specified, in the
lexical domain (requiring satisfaction of a CC lexical element) and in the
operational domain.
The acquisition sequence reflects the structure of the grammar in two ways:
via the General Congruence Principle, which states that the stages in acquisition
are in a congruence relation with the structure of parameters (see Chapter 3 for
discussion), and via the use of the external levels (DS, PF, LF) as anchoring
levels for the analysis, essentially as the inductive basis. The General
Congruence Principle is discussed in Chapters 2–4; the possibility of broader
anchoring levels, in Chapter 5. The latter point of view is somewhat distinct
from the former, and (to be frank) the exact relation between them is not yet
clear to the author. It may be that the General Congruence Principle is a special
case, when the anchoring level is DS, or it may be that these are autonomous
principles. I leave this question open.
The third theme of this work has to do with levels or precedence relations
in the grammar, in particular with respect to two issues: (a) Is it possible to
make an argument that the grammar is essentially derivational in character, rather
than in the representational mode (cf. Chomsky's 1981 discussion of Move-α)?
(b) Is there any evidence of intermediate levels, of the sort postulated in van
Riemsdijk and Williams (1981)? I believe that considering a wider range of
operations than Move-α may move this debate forward. In particular, I propose
two additional operations of phrase structure composition: Adjoin-α, which
adjoins adjuncts in the course of the derivation, and Project-α, which relates the
lexical syntax to the phrasal. With respect to these operations, two types of
precedence relations do seem to hold. First, operation/default organization holds
within an operation type: in the case of Adjoin-α and its corresponding default,
Conjoin-α, two of the types of generalized transformations in Chomsky
(1957) are organized as a single operation type, with an operation/default relation
between them. The other precedence relation is vocabulary layering, and this
holds between different operations, for example Case and theta theory (see
Chapters 2, 3, and 4 for discussion). Further, operations like Adjoin-α may follow
Move-α, and this explains the anti-Reconstruction facts of van Riemsdijk and
Williams (1981); such facts cannot easily be explained in the representational
mode (see Chapter 3).
In general, throughout this work I will interleave acquisition data and theory
with pure syntactic theory, since I do not really differentiate between them.
Thus, the proposal having to do with Adjoin-α was motivated by pure syntactic
concerns (the anti-Reconstruction facts, and the attempt to get a simple descrip-
tion of licensing), but was then carried over into the acquisition sphere. The
proposal having to do with the operation of Project-α (or Merger) was formulat-
ed first in order to give a succinct account of telegraphic speech (and, to a lesser
degree, to account for speech error data), and was then carried over into the
syntactic domain. To the extent to which this type of work is successful, the two
areas, pure syntactic theory and acquisition theory, may be brought much closer,
perhaps identified.
Chapter 1
A Re-Definition of the Problem
1.1 The Pivot/Open Distinction and the Government Relation
For many years language acquisition research has been a sort of weak sister in
grammatical research. The reason for this, I believe, lies not so much in its own
intrinsic weakness (for a theoretical tour de force, see Wexler and Culicover
1980; see also Pinker 1984), but rather, as in other unequal sibships, in relation.
This relation has not been a close one; moreover, the lionizing of the theoretical
importance of language acquisition as the conceptual ground of linguistic
theorizing has existed in uneasy conscience alongside a real practical lack of
interest. Nor is the fault purely on the side of theoretical linguistics: the acquisi-
tion literature, especially on the psychological side, is notorious for having
drifted further and further from the original goal of explaining acquisition, i.e.
the sequence of mappings which take the child from G₀ to the terminal grammar
Gₙ, to the study of a different sort of creature altogether, Child Language (see
Pinker 1984, for discussion and a diagnostic).
1.1.1 Braine's Distinction
Nonetheless, even in the psychological literature, especially early on, there were
a number of proposals of quite far-reaching importance which would, or could,
have (had) a direct bearing on linguistic theory, and which pointed the way to
theories far more advanced than those available at the time; for example,
Braine's (1963a) postulation of pivot-open structures in early grammars. Braine
essentially noticed and isolated three properties of early speech. For a large
number of children, the vocabulary divided into two classes, which he called
pivot and open. The pivot class was closed class, partly in the sense that it
applies in the adult grammar (e.g., containing prepositions, pronouns, etc.), but
partly also in a broader sense: it was a class that contained a small set of
words which couldn't be added on to, even though these words corresponded to
those which would ordinarily be thought of as open class (e.g. come); these
words operated on a comparatively large number of open class elements. An
example of the Braine data is given below.
(1) Steven's word combinations
want baby see record
want car see Stevie
want do
want get whoa cards
want glasses whoa jeep
want head
want high more ball
want horsie more book
want jeep
want more there ball
want page there book
want pon there doggie
want purse there doll
want ride there high
want up there momma
want byebye car there record
there trunk
it ball there byebye car
it bang there daddy truck
it checker there momma truck
it daddy
it Dennis that box
it X etc. that Dennis
that X etc.
get ball
get Betty here bed
get doll here checker
here doll
see ball here truck
see doll
bunny do
daddy do
momma do
The second property of the pivot/open distinction noticed by Braine was that
pivot and open are positional classes, occurring in a specified position with
respect to each other, though the positional ordering was specific to the pivot
element itself (P1 + Open, Open + P2, etc.) and hence not to be captured by a
general phrase structure rewrite rule: S → Pivot + Open. This latter fact was used
by critical studies of the time (Fodor, Bever, and Garrett 1974, for example) to
argue that Braine's distinction was somehow incoherent, since the one means of
capturing such a distinction, phrase structure rules, required a general collapse
across elements in the pivot class which was simply not available in the data.
The third property of the pivot/open distinction was that the open class
elements were generally optional, while the pivot elements were not.
1.1.2 The Government Relation
What is interesting from the perspective of current theory is just how closely
Braine managed to isolate analogs not of the phrase structure rule descriptions
popular at that time, but of the central relational primitives of the current theory.
Thus the relation of pivot to open classes may be thought of as that between
governor and governed element, or perhaps more generally that of head to
complement; something like a primitive predication or small clause structure (in
the extended sense of Kayne 1984) appears to be in evidence in these early
structures as well:
(2) Steven's word utterances:
it ball that box
it bang that Dennis
it checker that doll
it X, etc. that Tommy
that truck
there ball here bed
there book here checker
there doggie here doll
there X, etc. here X, etc.
Andrew's word combinations:
boot off          airplane all gone
light off         Calico all gone
pants off         Calico all done
shirt off         salt all shut
shoe off          all done milk
water off         all done now
                  all gone juice
clock on there    all gone outside
up on there       all gone pacifier
hot in there
X in/on there, etc.
Gregory's word combinations:
byebye plane allgone shoe
byebye man allgone vitamins
byebye hot allgone egg
allgone lettuce
allgone watch
etc.
The third property that Braine notes, the optionality of the open constituent with
respect to the pivot, may also be regularized to current theory: it is simply the
idea that heads are generally obligatory while complements are not.
The idea that the child, very early on, is trying to determine the general
properties of the government relation in the language (remaining neutral for now
about whether this is Case or theta government) is supported by two other facts
as well: the presence of what Braine calls groping patterns in the early data,
and the presence of what he calls formulas of limited scope. The former can be
seen in the presence of the allgone constructions in Andrew's speech. The
latter refers simply to the fact that in the very early two-word grammars, the set
of relations possible between the two words appears limited in terms of the
semantic relations which hold between them. This may be thought of as showing
that the initial government relation is learned with respect to specific lexical
items, or cognitively specified subclasses, and is then collapsed between them.
See also later discussion. The presence of groping patterns, i.e. the presence,
in two-word utterances, of patterns in which the order of elements is not fixed for
lexically specific elements, corresponds to the original experimentation in
determining the directionality of government (Chomsky 1981; Stowell 1981). The
presence of groping patterns is problematic for any theory of grammar which
gives a prominent role to phrase structure rules in early speech, since the order
of elements must be fixed for all elements in a class. See, e.g., the discussion in
Pinker (1984), which attempts, unsuccessfully I believe, to naturalize this set of
data. To the extent to which phrase structure order is considered to be a deriva-
tive notion, and the government-of relation the primitive one, the presence of
lexically specific order differences is not particularly problematic, as long as the
directionality of government is assumed to be determined at first on a word-by-
word basis.
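This word-by-word learning story can be given a toy operational form (entirely my own illustrative model; the observation data and the collapsing criterion are hypothetical, not drawn from Braine's corpora):

```python
from collections import defaultdict

# Each observed two-word utterance pairs a governor with a governed
# element in some linear order. The learner first records directionality
# per governor (word by word), and only later collapses across the class.
observations = [
    ("want", "ball", "right"),       # governor precedes: governs rightward
    ("want", "car", "right"),
    ("see", "ball", "right"),
    ("allgone", "shoe", "right"),    # "allgone shoe"
    ("allgone", "shoe", "left"),     # "shoe allgone": a groping pattern
]

direction = defaultdict(set)
for governor, governed, side in observations:
    direction[governor].add(side)

# Word-by-word stage: "allgone" is still unstable (a groping pattern),
# while "want" and "see" each have a fixed direction.
assert direction["want"] == {"right"}
assert direction["allgone"] == {"right", "left"}

# Collapse across lexical items only once each is individually stable.
stable = {word for word, sides in direction.items() if len(sides) == 1}
assert stable == {"want", "see"}
```

On this toy model, a phrase-structure-rule learner would have to fix one order for the whole pivot class at once, whereas the governor-by-governor record tolerates exactly the lexically specific instability the groping patterns show.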
1.2 The Open/Closed Class Distinction
Braine's prescient analysis was attacked in the psychological literature on both
empirical and especially theoretical grounds; it was ignored in the linguistic
literature. The basis of the theoretical attack was that the pivot/open distinction,
being lexically specific with respect to distribution, would not be accommodated
in a general theory of phrase structure rules (as already mentioned above);
moreover, the particular form of the theory adopted by Braine posited a radical
discontinuity in the form of the grammar as it changed from a pivot/open
grammar to a standard Aspects-style PS grammar. This latter charge we may
partly defuse by noting that there is no need to suppose a radical discontinuity
in the form of the grammar as it changed over time: the pivot/open grammar is
simply contained as a subgrammar in all the later stages. However, we wish to
remain neutral, for now, on the general issue of whether such radical discontinu-
ities are possible. The proponents of such a view, especially the holders of the
view that the original grammar was essentially semantic (i.e. thematically
organized), held the view in either a more or a less radical form. The more
extreme advocates (Schlesinger 1971) held not simply that there was a radical
discontinuity, but that the primitives of the later stages, syntactic primitives like
Case and syntactic categories like noun or noun phrase, were constructed out
of the primitives of the earlier stages: a position one may emphatically reject.
Other theoreticians, however, in particular Melissa Bowerman (Bowerman 1973,
1974), held that there was such a discontinuity, but without supposing any
construction of the primitives of the later stages from those of the earlier. We
return, in detail, to this possibility below.
More generally, however, the charge that the pivot/open class stage presents
a problem for grammatical description appears to dissolve once the government-
of relation is taken to be the primitive, rather than the learning of a collection of
(internally coherent) phrase structure rules.
However, more still needs to be said about Braine's data. For it is not
simply the case that a rudimentary government relation is being established, but
that this is overlaid, in a mysterious way, with the open/closed class distinction.
Thus it is not simply that the child is determining the government-of and
predicate-of relations in his or her language, but also that the class of governing
elements is, in some peculiar way, associated with a distributional class: namely,
that of closed class elements.
While the central place of the government-of relation in current theory gives
us insight into one half of Braine's data, the role of the closed class/open class
distinction, though absolutely pervasive both in Braine's work and in all the
psycholinguistic literature (see Garrett 1975; Shattuck-Hufnagel 1974; Bradley
1979, for a small sample), has remained totally untouched. Indeed, even the
semantic literature, which has in general paid much more attention to the
specifier relation than transformational-generative linguistics, does not appear to
have anything to say that would account for the acquisition facts.
What could we say about the initial overlay of the closed class elements and
the set of governors? The minimal assumption would be something like this:
(3) The set of canonical governors is closed class.
While this is an interesting possibility, it would involve, for example, including
prepositions and auxiliary verbs in the class of canonical governors, but not main
verbs. Suppose that we strengthen (3), nonetheless.
(4) Only closed class elements may govern.
What about verbs? Interestingly, a solution already exists in the literature: in fact,
two of them. Stowell (1981) suggests that it is not the verb per se which governs
its complements, but rather the theta grid associated with it. Thus the complements
are theta-governed under coindexing with positions in the theta grid. And
while the class of verbs in a language is clearly open class and potentially
infinite, the class of theta grids is equally clearly finite: each is a member of a
closed, finite set of elements. Along the same lines, Koopman (1984) makes the
interesting, though at first glance odd, suggestion that it is not the verb which
Case-governs its complements, but Case-assigning features associated with the
verb. She does this in the context of a discussion of Stowell's Case adjacency
requirement for case assignment, a proposal which appears to be immediately
falsified by the existence of Dutch, a language in which the verb is VP-final but
the accusative-marked object is at the left periphery of the VP. Koopman saves
Stowell's proposal by supposing that the Case-assigning features of the verb are
at the left periphery, though the verb itself is at the right. This idea that the two
aspects of the verb are separable in this fashion will be returned to, and
supported, below. What is crucial for present purposes is simply to note that the
Case-governing properties of the verb are themselves closed class, though the set of
verbs is not. Thus both the Case-assigning and theta-assigning properties of the
verb are closed class, and we may assume that these, rather than some property
of the open class itself, enter into the government relation.
There is a second possibility, less theory-dependent. This is simply that, as
has often been noted, there is within the open part of the vocabulary of a
language a subset which is potentially closed: this is the so-called basic vocabulary
of the language, used in the teaching of Basic English and of other languages.
The verb say would presumably be part of this closed subset, but not the verb
mutter, as would their translations. The child's task may be viewed as centering on
the closed class elements in the less abstract sense of lexical items, if these are
included in the set.
1.2.1 Finiteness
While the syntactic conjecture that the Case features on the verb govern
its object has often enough been made, the theoretical potential of such a
proposal has not been realized. In essence, this proposal reduces a property of an
open class of elements, namely verbs, to a property of a closed class of elements
(the Case features on verbs). Insofar as direction of government is treated as a
parameter of variation across languages, reducing government directionality
to a property of a closed class set joins together the two sorts of finiteness, lexical and
syntactic. The finiteness of syntactic variation (Chomsky
1981) is tied, in the closest possible way, to the necessary finiteness of a lexical
class (and the specifications associated with it).
Let us take another example. English allows wh-movement in the syntax;
Chinese, apparently, apportions it to LF (Huang 1982). This is a parametric
difference in the level of derivation at which a particular operation applies.
However, this may well be reducible to a parametric difference in a closed class
element. Let us suppose, following Chomsky (1986), that wh-movement is
movement into the specifier position of C.
Ordinarily it is assumed that lexical selection (of the complement-taking
verb) is of the head. Let us assume likewise that the matrix verb must select for
a +/− wh feature in Comp. This, in turn, must regulate the possible appearance
of the wh-word in the specifier position of C. We may assume that
some agreement relation holds between these two positions, in direct analogy to
the agreement relation which exists generally between specifier and head
positions, e.g. with respect to case. Thus the presence of the overt wh-element in
Spec C is necessary to agree with, or saturate, the +wh feature which is base-generated
in Comp. What then is the difference between English and Chinese? Just
this: the agreeing element in Comp must be satisfied at S-structure in English,
while it need only be satisfied at LF in Chinese. This difference, in turn, may
be traced, we might hope, to some intrinsic property of agreement in the two
languages.
(5) I wonder [CP who [C′ Comp [IP [NP John] [I′ Infl [VP [V saw] [NP e]]]]]]
If this sketch of an analysis, or something like it, is correct, then the
parametric difference between English and Chinese with respect to wh-movement
is reduced to a difference in the lexical specification of a closed class
element.¹ Since the possible set of universal specifications associated with a
closed class set of elements is of necessity finite, the finiteness conjecture of
Chomsky (1981) would be vindicated in the strongest possible way. Namely, the
finiteness in parametric variation would be tied, and perhaps only tied, to the
finiteness of a group of necessarily finite lexical elements, and the information
associated with them.
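The conjecture can be given a minimal computational sketch. The following is an illustration only, under assumptions of my own (the names CLOSED_CLASS_SCHEMA, Language, and overt_wh_movement are hypothetical, as is the flat feature encoding); it is not a claim about how a grammar is actually implemented.

```python
# Hedged sketch: parametric variation as a choice among the finite
# specifications available to a closed class schema. All names here
# are hypothetical, for illustration only.

# The universal schema of closed class elements; each feature has a
# finite set of possible values.
CLOSED_CLASS_SCHEMA = {
    "Comp[+wh]": {"satisfied_at": {"S-structure", "LF"}},
}

class Language:
    def __init__(self, name, settings):
        # Every setting must instantiate a value drawn from the
        # finite schema: finiteness of variation is thus guaranteed.
        for item, spec in settings.items():
            for feature, value in spec.items():
                assert value in CLOSED_CLASS_SCHEMA[item][feature]
        self.name, self.settings = name, settings

english = Language("English", {"Comp[+wh]": {"satisfied_at": "S-structure"}})
chinese = Language("Chinese", {"Comp[+wh]": {"satisfied_at": "LF"}})

def overt_wh_movement(lang):
    # Overt (syntactic) wh-movement iff the +wh feature in Comp must
    # already be saturated at S-structure.
    return lang.settings["Comp[+wh]"]["satisfied_at"] == "S-structure"
```

The large surface difference (overt versus LF wh-movement) thus falls out of a single closed class specification, filtered through the level at which it must be satisfied.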
1.2.2 The Question of Levels
There is a different aspect of this which requires note. The difference between
Chinese and English with respect to wh-movement is perhaps associated with
features on the closed class morpheme, but this shows up as a difference in the
appearance of the structure at a representational level. I believe that this is in
general the case: namely, that while information associated with a closed class
element is at the root of some aspect of parametric variation, this difference
often evidences itself in the grammar as a difference in the representational level at
which a particular operation applies. We may put this in the form of a proposal:
(6) The theory of UG is the theory of the parametric variation in the
specifications of closed class elements, filtered through a theory of
levels.
1. I should note that the term closed class element here is being used in a somewhat broader sense
than usual, to encompass elements like the +wh feature. The finiteness in the closed class set cannot
reside in the actual lexical items themselves, since these may vary from language to language, but
in the schema which defines them (e.g. definite determiner, indefinite determiner, Infl, etc.).
I will return throughout this work to more specific ways in which the conjecture
in (6) may be fleshed out, but I would like to turn at this point to two aspects
which seem relevant. First is the observation made repeatedly by Chomsky
(1981, 1986a) that while the set of possible human languages is (at least
conjecturally) finite, they appear to show a wide scatter in terms of surface
features. Why, we might ask, should this be the case? If the above conjecture (6)
is correct, it is precisely because of the interaction of the finite set of specifications
associated with the closed class elements with the rather huge surface
differences which would follow from having different operations apply at
different levels. The information associated with the former would determine the
latter; the latter would give rise to the apparent huge differences in the description
of the world's languages, but would itself be tied to a parametric variation
in a small, necessarily finite set.
How does language acquisition proceed under these circumstances? Briefly,
it must proceed in two ways: by determining the properties of lexical specifications
associated with the closed class set, the child determines the structure of the
levels; by determining the structure of the levels, he or she determines the
properties of the closed class morphemes. The proposal that the discovery of
properties associated with closed class lexical items is central obviously owes a
lot to Borer's (1985) lexical learning hypothesis: that what the child learns, and
all that he or she learns, is associated with properties of lexical elements. It constitutes,
in fact, a (fairly radical) strengthening of that proposal, in the direction of
finiteness. Thus while the original lexical learning hypothesis would not guarantee
finiteness in parametric variation, the version adopted in (6) would, and thus
may be viewed as providing a particular sort of grounding for Chomsky's
finiteness claim.
However, the proposal in (6) contains an additional claim as well: that the
difference in the specifications of closed class elements cashes out as a difference
in the level at which various operations apply. Thus it provides an outline of the way
in which the gross scatter of languages may be associated with a finite range.
1.3 Triggers
1.3.1 A Constraint
The theory of parametric variation or grammatical determination has often been
linked with a different theory: that of triggers (Roeper 1978b, 1982, Roeper and
Williams 1986). A trigger may be thought of, in the most general case, as a
piece of information in the surface string which allows the child to determine
some aspect of grammatical realization. The idea is an attractive one, in that it
suggests a direct connection between a piece of surface data and the underlying
projected grammar; it is also in danger, if left further undefined, of becoming
nearly vacuous as a means of grammatical description. A trigger, as the term is
commonly used, may apply to virtually any property of the surface string which
allows the child to make some determination about his or her grammar.
There is, as is usual in linguistic theory, a way to make an idea more
theoretically valuable: that is, by constraining it. This constraint may be either
right or wrong, but it should, in either case, sharpen the theoretical issues involved.
In line with the discussion earlier in the chapter, let us limit the content of
"trigger" in the following way:
(7) A trigger is a determination of a property of a closed class element.
Given the previous discussion, the differences in the look of the output grammar
may be large once a trigger has been set. The trigger-setting itself, however, is
aligned with the setting of the specification of a closed class element.
There are a number of proposed triggers in the input which must be re-examined
given (7) above; there are, however, at least two very good instances
of triggers in the above sense which have been proposed in the literature. The
first is Hyams' (1985, 1986, 1987) analysis of the early dropping of subjects in
English. Hyams suggests that children start off with a grammar which is
essentially pro-drop, and that English-speaking children then move to an English-type
grammar, which is not. These correspond to developmental stages in which
children initially allow subjects to drop, filter out auxiliaries, and so on (as a first
step), and then move to one in which they do not do so (as the second step). The means by
which children pass from the first grammar to the second, Hyams suggests, is
the detection of expletives in the input. Such elements are generally
assumed not to exist in pro-drop languages; the presence of such elements would
thus allow the child to determine the type of the language that he or she was facing.
1.3.2 Determining the Base Order of German
The other example of a trigger, in the sense of (7) above, is found in Roeper's
(1978b) analysis of German. While German sentences are underlyingly verb-final
(see Bierwisch 1963, Bach 1962, Koster 1975, and many others), the verb may
show up in either second or final position.
(8) a. Ich sah ihn.
I saw him.
b. Ich glaube dass ich ihn gesehen habe.
I believe that I him seen have
Roeper's empirical data suggest that the child analyzes German as verb-final at
a very early stage. However, this leaves the acquisition question open: how does
the child know that German is verb-final?
Roeper proposes two possible answers:
(9) i. Children pay attention to the word order in embedded, not
matrix clauses.
ii. Children isolate the deep structure position of the verb by
reference to the placement of the word not, which is always
at the end of the sentence.
At first glance, solution (i) appears far preferable. It is much more general,
for one thing, and it also allows a natural tie-in with theory, namely Emonds'
(1975) conception that various transformations apply in root clauses but are
barred from applying in embedded contexts. However, more recent work by Safir
(1982) suggests that Emonds' generalization follows from other principles, in
particular that of government; and even if it were the case that Safir's particular
proposal were not correct, it would certainly be expected, in the context of
current theory, that the difference between root and embedded clauses would not
be stated as part of the primitive basis, but would follow from more primitive
specifications.
A different line of deduction, not available to Roeper in 1974, appears
more promising: namely, for the child to deduce the DS position from a property
of the government relation. Given Case Adjacency (Stowell 1981), and given a
theory in which Case assignment applies prior to verb movement, and
given the assumption that the accusative-marked element has not moved (all
these assumptions are necessary), the presence of the accusative-marked
object, with other material preceding it in the VP, would act as a
legitimate marker for the presence of the DS verb following it:
(10) a. Ich habe dem Mann das Buch gegeben.
I have (to) the man the book given.
b. Ich gebe dem Mann das Buch t.
I give the man the book.
This is one way out of the problem. However, certain of these assumptions
appear questionable, or at least not easily determinable by the child on the basis
of surface evidence. For example, the accusative will appear adjacent to the
phrase-final verb if both objects in the double object construction are definite full
NPs; but if the direct object is a pronoun, the order of the two complements
reverses, obligatorily (Thiersch 1978).
(11) a. *Ich hatte dem Mann es gegeben.
I had the man it given
b. Ich hatte es dem Mann gegeben.
I had it the man given
Assuming that accusative is a case assigned by the verb but that dative is not, the
interposition of the dative object between the verb and the direct object would
create learnability problems for the child; in particular, the presence of the
accusative object would not be an invariable marker of the presence (at DS) of
the verb beside it. Of course, additional properties or rules (e.g. with respect to
the possibility of cliticization) may be added to account for the adult data, but
this would complicate the learnability problem, in the sense that the child would
already have to have access to this information prior to his or her positing of the
verb-final base.
A second and equally serious difficulty with using the presence of the
accusative object as a sure marker of the presence of the DS verb (under Case
Adjacency) is simply that in quite closely related languages, e.g. Dutch, such a
strict adjacency requirement does not seem to hold. Thus in Dutch, the
accusative object may appear at the beginning of the verb phrase, while the verb is
phrase-final.
(12) a. Jan plant bomen in de tuin.
John plants trees in the garden.
b. Jan be-plant de tuin met bomen.
John plants the garden with trees.
Of course, it is possible to take account of this theoretically, along the lines that
Koopman (1984) suggests, where the Case-assigning features are split off from
the verb itself. But the degree of freedom necessitated in this proposal, while
quite possible from the point of view of a synchronic description of the grammar,
makes it unattractive as a learnability trigger (in the sense of (7) above). In
particular, with the abstract Case-assigning features now separated from the verb,
the presence of an accusative-marked object could no longer serve as the
invariable marker of the verb itself, and thus could not allow the child to determine the
deep structure position of the verb within the VP.
While not rejecting outright the possibility that the presence of
accusative case may, in conjunction with other factors, act as the marker for the verb
for the child (since, in current theory, it is the interaction of modules like
Case theory and Theta theory which allows properties of a construction to be determined:
why should it be any different for the child?), let us turn to the third option for
learnability, which is the second option outlined by Roeper (1974). This is that the position of
Neg, or the negative-like element, marks the placement of the verb for the child.
Not following (yet) from any general principle, this may appear to be the most
unpromising proposal of the lot. Let us first, however, suitably generalize it:
(13) The child locates the position of the head by locating the position of
the closed class specifier of the head; the latter acts as a marker for
the presence of the former.
If we assume that Neg or not is in the specifier of V′ or V″, the generalization
in (13) is appropriate.
Does (13) hold? Before turning to specifically linguistic questions, it should
be noted that there exists a body of evidence in the psycholinguistic
literature, dating from the mid-sixties, which bears on this question (Braine 1965,
Morgan, Meier, and Newport 1987). This is the learning of artificial languages
by adults. Such languages may be constructed to have differing properties and,
in particular, to either have or not have a closed class subset in them. Such
languages are learned, not by direct tuition, but rather by direct exposure to a
large number of instances of well-formed sentences in the language, large enough
that the subject cannot have recourse to nonlinguistic, general problem-solving
techniques. What is interesting about this line of research is that the closed class
morphemes seem to play an absolutely crucial role in the learnability of the
language (Braine 1965, Morgan, Meier, and Newport 1987). In particular, with
such morphemes the grammar of the language is fairly easily deducible, but
without them the deduction is very much more difficult. Certain questions
relevant to the issue at hand have not been dealt with in this literature (for
example, in a language in which the surface string may have the head in
one of two places, how is this placement determined?), but the general upshot
of this line of research seems clear: such elements are crucial in the language's
acquisition. Of course, it must still be noted that this is language-learning by
adults, not children; but the fact that the learning occurs under conditions
of input-flooding, rather than direct tuition, makes it correspond much more
closely to the original conditions under which learning takes place.
Let us return to (13). The claim in (13) is that the closed class specifiers
associated with a head are better markers of that head's DS position than the
head itself. Given that the child must in general determine whether movement
has taken place, it is necessary that there be some principle, gatherable from the
surface itself, which so determines it. With respect to direct complements of a
head, we may assume that the detection of movement (i.e. the construction of the
DS and SS representations from the surface) is done with respect to characteristics
of the head: for example, that the head is missing one of its obligatorily
specified complements. What if the head itself is moved? In this case, any of
three sorts of information would suffice: the stranding of a (on the surface,
ungoverned) complement; the stranding of a specifier; or, if the head is
subcategorized by another head which is thereby left without a complement, the
detection of movement could proceed from that higher head. This last would be
the case, in English, in instances where a full clause was fronted.
With respect to the movement of the verbal head in German, the first
proposal, that the DS position of the verb is determined with respect to an
obligatorily present complement, corresponds to the proposal that the DS position
of the verb is detected by reference to the accusative-marked object. The second
possibility, that the stranding of the specifier marks the DS position of the head,
is essentially Roeper's proposal with respect to the placement of Neg.
What if the specifier itself is moved? This could be detected by the
placement of the head, if the head itself were assumed to be rigid in such a
configuration. A grave difficulty would arise, however, if both the specifier and
the head-complement complex were moved from a single DS position, since the
DS position would not then be determinable.
(14) [XP Spec [X′ X YP]]
We are left with the following logical space of possibilities.
[Recall that at the time that this work was written, the specifier was viewed
differently than it is now. It constituted the set of closed class elements associated
with an open class head, among other things. For example, the was considered
the specifier in the picture of John; and an auxiliary verb like was was considered
the specifier of the verb phrase, in a verb phrase like was talking to Mary. Thus
the and was would be considered specifiers, rather than independently projecting
heads. The following discussion only makes sense with this notion of specifier
in mind. D. L.]
(15) Moved element    Detected by
     complement       head
     head             complement/specifier/subcategorizing head
     specifier        head/subcategorizing head (?)
By "subcategorizing head" I mean the head which, if it exists, selects for the
embedded head and its arguments. The idea that the subcategorizing head may
determine the presence of the specifier, as well as the head of the selected
phrase, may be related to the proposal, found in Fukui and Speas (1986) as well
as in categorial grammar, that, for some purposes at least, the specifier may
act as the head of a given element (e.g. of an NP). I return to this
possibility in detail below.
The chart in (15) gives the logical space in which the problem may be
solved, but leaves almost all substantive issues unresolved; more disturbingly, the
process of detection, as the child faces it in (15), does not bear any
obvious and direct relation to current linguistic theory.
The linking of dislocated categories and their DS positions takes place, in
current theory, under two relations: antecedent government and lexical government
(Lasnik and Saito 1985, Chomsky 1986, Aoun, Hornstein, Lightfoot and
Weinberg 1987). Let us go further and, in the spirit of Aoun, Hornstein,
Lightfoot, and Weinberg (1987), associate lexical government with the detection
of the existence of the null element (and perhaps its category), while antecedent
government determines the properties of that element: both constitute, in the
broadest sense, a sort of recoverability condition on the placement of
dislocated elements. We might take the detection of existence to take place
at a particular level (e.g. PF or the surface), while the detection of properties
takes place at another (e.g. LF).
It was suggested earlier that, in spite of its theoretical attractiveness, the
possibility that the child detects the DS position of the verb via the position of
the accusative-marked object and Case Adjacency seemed unlikely, as too
difficult an empirical problem (given the possible splitting up of Case-assigning
features from the verb, etc.). Let us suppose that this difficulty is principled, in
the sense that in the child grammar, as in the adult grammar, the movement of
heads is never detected by the governed element of the head, but rather by the
governor. Thus the child, even though he is constructing the grammar, is using
the same principles as the adult. This radically reduces the logical space of (15)
to that in (16):²
(16) Type of element moved    Detected by
     complement               governing head
     head                     governing head
     specifier                ?/governing head
The question mark next to the specifier in (16) is due partly to an embarrassment
of riches (it could be either the higher head or the category head itself
which governs the specifier) and partly to a lack: it is not clear that
either governs the specifier in the same way that a complement (e.g. a
subcategorized NP) is governed by a head.
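The charts in (15) and (16), and the reduction from the one to the other, can be rendered as a small sketch; the dictionary encoding and the function name restrict_to_governors are assumptions for illustration only, not part of the theory itself.

```python
# Hedged illustration: the logical spaces of charts (15) and (16) as
# lookup tables from a moved element's type to the elements that may
# detect its DS position. The encoding is an expository assumption.

SPACE_15 = {
    "complement": {"head"},
    "head": {"complement", "specifier", "subcategorizing head"},
    "specifier": {"head", "subcategorizing head"},
}

def restrict_to_governors(space):
    # The principled restriction: movement is detected only by the
    # governor, never by an element the moved item governs. Every
    # entry then collapses to the governing head, yielding (16).
    return {moved: {"governing head"} for moved in space}

SPACE_16 = restrict_to_governors(SPACE_15)
```

The residual question mark in (16) for the specifier row is precisely what the table cannot encode: whether the "governing head" of a specifier is the category head or a higher head, and whether government is the right relation there at all.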
One thing seems quite clear: closed class specifiers are fixed, with respect
to certain movement operations, in a way that other elements are not. There is,
for example, no way to directly question the cardinality of the specifier in
English without moving the entire NP along with it:
(17) *A did you see (e man)?
Nor, as Chierchia (1984) points out, is there a way to directly question prepositions,
suggesting a similar constraint:
(18) *To did you talk (e the man)? (Was it "to" that you talked to the
man?)
While prepositions are not normally thought of as specifiers of NP, but rather as
projecting their own X-bar system (see Jackendoff 1977), there is a strong and
persistent undercurrent that certain PPs, even in English, are simply a spell-out
of Case-marking and perhaps theta-marking features: something which is
arguably part of the specifier, which later gets spelled out onto the head. If this
is the case, then the data in (17) and (18), which seem to fall together in terms
of pre-theoretical intuitions, may ultimately be collapsed.
But why should (17) and (18) be so bad? The chart in (16), which suggests
essentially that all elements are detected (i.e. lexically governed, from the point of
view of recovery) by their governor, gives no clue. While one might argue that
there is some sort of constraint that extraction of a head necessarily
drags other material with it, and that this accounts for the ungrammaticality of
(18), there is no way to extend this constraint to (17) under normal assumptions
about the headedness of the phrase. Moreover, even in its own terms such a
constraint is dubious, since in, e.g., German and Dutch there is verb movement
without the complement of the verb being moved as well.
2. Recall again that the notion of specifier used is that current in 1988 [D.L. in 1999]. It includes
elements like the determiner the in the picture of Mary, and was in was talking to John: closed class
items specifying the head.
Chierchia himself suggests that the ungrammaticality of sentences like (17)
and (18) is due to a (deep) property of the type system: namely, that the system
is strictly second-order (with a nominalization operator), and that no variable
categories exist of a high enough type to correspond to the traces of the determiner
and preposition.
While Chierchia's solution is coherent, and indeed exciting, from the point
of view of the theory that he advocates, there are obvious problems in transposing
the solution to any current version of GB. Indeed, even if the constraint did
follow from some deep semantic property of the system, we would still be
licensed in asking whether there is some constraint in the purely syntactic system
which corresponds to it. To the extent that constraints are placed over
purely syntactic forms, as well as (perhaps) the semantic system corresponding
to them, we arrive at a position of an autonomous syntax which, while perhaps
constructed over a semantic base, retains its own set of properties distinct from
the conceptual core on which it was formed. For discussion, from quite different
points of view, see Chomsky (1981), where it is argued that the core relation of
government is "grammaticalized" in a way which might not be determinable
from its conceptual content alone, and that this sort of formal extension is a deep
property of human language; see also Pinker (1984), where the notion of semantic
bootstrapping plays a similar role.
Returning to the problem posed by the ungrammaticality of (17) and (18),
we would wish to propose a syntactic constraint which would bar the unwanted
sentences and at the same time help the acquisition system to operate:
(19) Fixed specifier constraint:
The closed class specifier of a category is fixed (may not move
independently of the category itself).
It is clear why something like (19) would be beneficial from the point of view
of the acquisition system. The problem for that system, under conditions of
extensive movement, is that there is no set of fixed points from which to
determine the D-structure. Of course, a trace-enriched S-structure would be
sufficient, but the child is given no such structure, only the surface. The
fixed specifier constraint suggests that there is a set of fixed points, loci from
which the structure of the string can be determined. Further, this seems to be
supported by the grammatical evidence of lack of extractability. The alternative
is that the D-structure, and the fact that movement has taken place, is determinable
from a complex set of principles that the child has reference to (Case theory,
theta theory, etc.), but without any particular set of elements picked out as the
fixed points around which others may move. This possibility cannot be rejected
out of hand, but a system which contained (19) would quite clearly be
simpler. (19) itself is a generalization of Roeper's initial proposal that it is the
element Neg which plays a crucial role; we will see later on that there is overt
evidence for the role of this element in the acquisition of English.
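The anchoring idea can be sketched in toy form. Everything below is an expository assumption of my own (the flat word-list representation, the function name posit_head_position, the adjacency convention); the child's actual procedure is of course far richer than this.

```python
# Hedged toy sketch: if closed class specifiers are fixed points (19),
# the surface position of a specifier such as Neg can be used to posit
# the DS slot of its head, even when the head itself has moved.

def posit_head_position(words, specifier):
    # Toy convention: the DS head slot is taken to be the position
    # immediately after the fixed specifier. Returns that index, or
    # None if the specifier does not occur in the string.
    if specifier not in words:
        return None
    return words.index(specifier) + 1

# German-like surface string with the finite verb in second position,
# stranding the negation at the end of the clause:
surface = ["ich", "sehe", "ihn", "nicht"]
slot = posit_head_position(surface, "nicht")  # → 4, the clause-final slot
```

On this sketch the stranded nicht marks a clause-final verb slot even though the surface verb sits in second position, which is the content of Roeper's Neg-based proposal in generalized form.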
The learnability point notwithstanding, it may be asked whether a constraint as
strong as (19) can be empirically upheld. The closed class specifiers of NP
clearly do not move in English; and it is curiously supportive as well that, if we
do think of prepositions as in some sense reanalyzed as the specifier of the NP
that they precede, particle movement applies just in case the preposition does not
have an NP associated with it (i.e. cannot be reanalyzed as a specifier). There do,
however, seem to be instances in which specifiers move at the sentential level.
Not may apparently move in Neg-raising contexts (20a), and do fronts in
questions, sometimes in conjunction with a cliticized not (20b).
(20) a. I don't believe that he is coming. (= I believe that he isn't
coming.)
b. Didn't he leave already?
Thus, while it is in general the case that complements are more mobile than
heads, and heads are more mobile than specifiers, it is by no means clear that
specifiers form the "grid" necessary to determine the basic underlying structure
of the language for the child.
1.3.2.1 The Movement of NEG (Syntax)
The syntactic problem posed by (20) for the general idea that specifiers constitute
a fixed "grid" from which the child posits syntactic order is a difficult one,
but perhaps not insuperable. The status of the Neg-raising case is in any case
unclear, e.g. as to whether movement has taken place at all. The problem posed by
(20b) is more difficult. Given that movement has taken place, examples such as
(20b) would seem to provide a straightforward violation of the fixed specifier
constraint (and thus leave the learnability problem untouched).
However, the movement operation in the example given is one which affects
both Neg and the auxiliary verb: examples such as (21) are ungrammatical.
(21) *Not he saw Mary.
That is, the movement operation does not move Neg per se, but rather the
category under which it is adjoined. If we consider this category itself to be Infl,
which is not a specifier category but rather the head of IP, then it is not the case
that the closed class specifier itself has been moved to the front of the sentence,
but rather Infl, a head constituent. The fixed specifier constraint is therefore not
violated by Subject/Aux inversion.
(22) [S [NP he] [I′ [Infl [Infl is] [Neg not]] [VP going]]]
Derivation: (He ((is not) (going))) → (Isn't he ((e) (going)))
(23) Movement types:
a. Move-α. Potentially unbounded; applies to major categories,
maximal projections.
b. Hop-C. String adjacent; applies to minor categories, closed
class elements, minimal projections.
The set of properties listed under the movement types is intended as a pre-theoretical characterization only, with the formal status of this division to be
determined in detail. We might include other differing properties as well: e.g.,
perhaps Hop-C, but not Move-α, is restricted to particular syntactic levels.
Further, the exact empirical extension of Hop-C is left undetermined. In the
original account (Chomsky 1975 [1955], 1957), Hop-C was restricted (though not
in principle) to the movement of affixes, i.e. closed class selected morphemes,
onto the governed element, in particular the governed verb. In Fiengo's interesting extension, Hop-C may be applied to other string-adjacent operations involving closed class elements.³
Assuming a division of movement types such as that given in (23), the Neg
movement operation adjoining not to Infl may be considered a movement
3. For a somewhat different view of Hop-C, see Chomsky (1986b).
26 LANGUAGE ACQUISITION AND THE FORM OF THE GRAMMAR
operation of a particular type: namely, an instance of Hop-C, not Move-α. As
such, the fixed specifier constraint may still be retained, but modified in the
following way:
(24) Fixed specifier constraint (modified form):
A closed class specifier may not be moved by Move-α.
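The division of labor between the two movement types in (23) and the constraint in (24) can be made concrete in a small sketch. This is purely illustrative: the attribute names, the boolean encoding, and the classification of the example operations are my own assumptions, not the text's formalism.

```python
# Toy encoding of the movement types in (23) and the modified
# fixed specifier constraint in (24). Attribute names are illustrative.

from dataclasses import dataclass

@dataclass
class Operation:
    name: str
    kind: str                        # "move-alpha" or "hop"
    moved_category: str              # e.g. "NP", "Infl", "Neg"
    is_closed_class_specifier: bool  # does it front a closed class specifier?

def violates_fixed_specifier_constraint(op):
    """(24): a closed class specifier may not be moved by Move-alpha."""
    return op.kind == "move-alpha" and op.is_closed_class_specifier

# Subject/Aux inversion fronts Infl, a head constituent -- no violation:
sai = Operation("Subject/Aux inversion", "move-alpha", "Infl", False)
# Neg adjunction to Infl is an instance of Hop-C, hence exempt from (24):
neg_hop = Operation("Neg adjunction to Infl", "hop", "Neg", True)
# Hypothetical fronting of bare "not" (cf. *Not he saw Mary) would violate it:
bare_neg = Operation("bare Neg fronting", "move-alpha", "Neg", True)
```

On this encoding, only the last, ungrammatical operation is excluded, which is the intended effect of the modified constraint.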
1.3.2.2 The Placement of NEG (Acquisition)
While the revision of the Fixed Specifier Constraint in (24) allows the syntactic
system to retain a set of nearly fixed points, and thus simplify the acquisition
problem at a theoretical level, a very interesting body of literature on the
acquisition sequence remains, which appears to directly undermine the claim that this
information is in fact used by the child. This is the set of papers due to Ursula
Bellugi and Edward Klima from the mid-60s (Klima and Bellugi 1966), recently
re-investigated by Klein (1982), in which it appears that Neg is initially analyzed
by the child as a sentential, not a VP, operator, and hence appears prior to the
modified clause. Of course, if the child himself allows negation to move in his
grammar, it can hardly be the case that he is using it as a fixed point from which
to determine the placement of other elements. Bellugi and Klima distinguish
three major stages in the acquisition of negation.
(25) a. Stage 1: Negation appears in pre-sentential position.
b. Stage 2: Negation appears contracted with an auxiliary, at a stage
prior to that at which the auxiliary appears alone.
c. Stage 3: Negation is used correctly.
Bellugi and Klima suggest that in the intermediate stage the auxiliary does not
have the same status that it has in the adult grammar, since it appears only in the
case that negation also occurs contracted on it. They suggest, rather, that the
negation and the auxiliary themselves form a constituent headed by Neg:
[NEG can [NEG not]]. Thus the fact that the negated auxiliary appears prior to any
occurrences of the non-negated auxiliary is accounted for by supposing that no
independent auxiliary node exists as such; the initial negative auxiliary is a
projection of Neg.
The data corresponding to Stages 1–3 above are given in (26).
(26) a. Stage 1: no see Mommy
no I go
no Bobby bam-bam
etc.
b. Stage 2: I no leave
I no put dress
me cant go table
etc.
c. Stage 3: comparable to adult utterances
The problematic utterances from the current point of view are given in (26a).
Given the assumption that the structure of these utterances is that in (27), the
child appears to be lowering the negation into the VP in the transition between
Stage 1 and Stage 2. This, in turn, is problematic for any view on which such
elements are fixed for the child.
(27) [S [Neg no] [S [NP me] [VP like spinach]]]
(Klima and Bellugi 1966: analysis of Stage 1 negation)
The analysis given in (27), however, is the sentential analysis of Bellugi and
Klima. Recently, a new analysis has been given for the basic structure of S
(Kitagawa 1986, Sportiche 1988, Fukui and Speas 1986). Kitagawa, Sportiche,
and Fukui and Speas argue, on the basis of data from Italian and Japanese, that the
basic D-structure of S has the subject internal to the VP, though outside the V′.
The D-structure of (28a) is therefore given in (28b).
(28) a. John saw Mary.
b. [S [NP e] [I′ I [VP [NP John] [V′ saw Mary]]]]
The internal-to-VP analysis allows theta assignment to take place internal to the
maximal projection VP; it also provides for a variable position in the VP, as a
predication-type structure results: NPᵢ [VP eᵢ (saw Mary)] (predication-type
analysis). Kitagawa also argues that certain differences in the analysis of English
and Japanese follow from this analysis. En route to S-structure, the DS subject
is moved out of the VP, to the subject position. Note that such a movement is
necessary, if the subject is to be assigned Case by Infl.
A Kitagawa/Sportiche/Fukui–Speas type analysis of the basic structure of S
receives striking confirmation from the acquisition data, if we assume that
negation is fixed throughout the acquisition sequence, and that, throughout Stage
1 speech, it is direct theta role assignment, rather than assignment of (abstract)
Case, which is regulating the appearance of arguments. That is, in Stage 1 speech
the negation is, as expected, adjoined off of VP as the Spec of VP. However,
the subject is internal to the VP: that is, in its D-structure position. The relevant
structure then is that in (29).
(29) [VP [Spec no] [VP [NP me] [V′ [V go] [NP mommy]]]]
In this structure, if we assume that theta role assignment to the subject is via the
V′, and further, that abstract Case has not yet entered the system (see later
discussion), then the resultant structure is precisely what would be expected,
given the fixity of the specifier and the lack of subject raising. The apparent
sentential scope is actually VP scope, with a VP-internal subject. At the point at
which abstract Case does enter the system, the subject must be external, and
appears, moved, prior to negation: Stage 2.
I will return to this analysis, and to a fuller analysis of the role of Case and
theta assignment in early grammars, in later chapters. For the moment, we may
simply note two properties of the above analysis: first, that the (very) early
system is regulated purely by theta assignment, rather than the assignment of
abstract Case. This is close to the traditional analysis in much of the psycholinguistic literature (e.g. Bowerman 1973) that the early grammar is semantic,
i.e. thematic. The second property of this analysis is in the relation of syntactic
levels, in the adult grammar, to stages in the acquisition sequence. Namely, there
is a very simple relation: the two stages in the acquisition sequence correspond
to two adjacent levels of representation in the synchronic analysis of the adult
grammar. That is, the geological pattern of surface forms L₁–L₂, corresponding
to adjacent grammars in the child's acquisition sequence, corresponds to
adjacent levels of representation in the adult grammar. This sort of congruence,
while natural, is nonetheless striking, and suggests that rather deep properties of
the adult grammar may be projected from the acquisition sequence, i.e. the fact
of development.
Chapter 2
Project-α, Argument-Linking,
and Telegraphic Speech
2.1 Parametric variation in Phrase Structure
In the last chapter, I suggested that the range of parametric variation across
languages was tied to the difference in the specifications associated with closed
class elements. This strengthened the finiteness claim of Chomsky (1981), by
linking the finiteness of variation in possible grammars with another sort of
finiteness: that of the closed class elements and their specifications. However,
this very small range of possible parametric variation still had to be reconciled
with a very different fact: the apparent scatter of the world's languages with
respect to their external properties, so that radically different surface types
appear to occur. It was suggested that this scatter was due to the interaction of
the (finite and localized) theory of parametric variation in lexical items with a
different aspect of the theory: that of representational levels. Slightly different
parametric settings of the closed class set would give rise to different, perhaps
radically different, organizations of grammars. This would include the postulate
that different operations might apply at different grammatical levels cross-linguistically, as in the earlier discussion, which suggested that the different role
that wh-movement played in English and Chinese (Huang 1982), a levels
difference, should be tied to some property of the agreement relation which
held between the fronted wh-element and the +/−wh feature in Comp, and that
this, in turn, could be related to the differing status of agreement, a closed class
morpheme, in the two languages. If this general sort of approach is correct, it
may be supposed that large numbers of differences may be traced back in this way.
2.1.1 Phrase Structure Articulation
One difference cross-linguistically, then, would be traceable to a difference in
the level at which a particular operation applied. If the foregoing is correct, then
this should not simply be stated directly, but instead in terms of the varying
property of some element of the closed class set. What about differences in
phrase structure? In the general sort of framework proposed in Stowell (1981),
and further developed by many others, phrase structure rules should not have the
status of primitives in the grammar, but should be replaced by, on the one hand,
lexical specifications (e.g. in government direction), and, on the other, general
licensing conditions. Within a theory of parametric variation, one would therefore
expect that languages would differ in these two ways.
On the other hand, really radical differences in phrase structure articulation
cross-linguistically may be possible, at least if the theory of Hale (1979) is
correct. Even if one did not adopt the radical bifurcationist view implicit in the
notion of W* languages (see Hale's appendix), one might still adopt the view
that degrees of articulation are possible with respect to the phrase structure of a
language. The flatter and less articulated a language in phrase structure, the
closer it would approximate a W* language. Of course, the question still
arises of how a child would learn these cross-linguistic differences in degree of
articulation, particularly if true W* languages existed alongside languages which
were not W*, but exhibited a large degree of scrambling.
2.1.2 Building Phrase Structure (Pinker 1984)
Pinker and Lebeaux (1982) and Pinker (1984) made one sort of proposal to deal
precisely with this problem: how might the child learn the full range of phrase
structure articulation, in the presence of solely positive evidence? The answer
given relied on a few key ideas. First, following Grimshaw (1981), relations
between particular sets of primitives were assumed to contain a subset of
canonical realizations. The possibility of such realizations was assumed to be
directly available to the child, and in fact used by him/her in the labelling of the
string. Thus, in the first place, the child has access to a set of cognitively based
notions: thing, action, property, and so on. These correspond, in a way that is
obviously not one-to-one, to the set of grammatical categories: NP, Verb,
Adjective Phrase, and so on. What is the relation, if not one-to-one? According
to Grimshaw, the grammatical categories, while ultimately fully formal in
character, are nonetheless centered in the cognitive categories, so that membership in the latter acts as a marker for membership in the former: a noun phrase
is the canonical grammatical category corresponding to the cognitive category
thing; a verb (or verb phrase) is the canonical grammatical category corresponding
to the cognitive category action; a clause is the canonical grammatical category
corresponding to the cognitive category event or proposition; and so on. This
assumes an ontology rich enough to make the correct differentiations; see
Jackendoff (1983) for a preliminary attempt to construct such an ontology.
Crucially, the canonical realizations are implicational, not bidirectional;
further, once the formal system is constructed over the cognitive base, it is freed
from its reliance on the earlier set of categories.
Consider how this would work for the basic labelling of a string. The simple
three-word sentence in (1) would have cognitive categories associated with each
of its elements.
(1) thing act thing (cognitive category)
John saw Mary.
These would be associated with their canonical structural realizations in phrasal
categories.
(2) NP V NP (canonical grammatical realization)
thing act thing (cognitive category)
John saw Mary
On the other hand, sentences to which the child was exposed which did not
satisfy the canonical correspondences would not be assigned a structure:
(3) ? ? ? (cognitive category)
This situation resembles a morass.
A number of questions arise here, as throughout. For example, could the child be
fooled by sequences which not only did not satisfy the canonical correspondences, but positively defied them? Deverbal nominals would be a good example:
(4) VP (canonical grammatical realization)
event (cognitive category)
The examination of the patient
In (4), the deverbal nominal recognizably names an event. Given the canonical
correspondences, this should be labelled a VP, or some projection of V. But this,
in turn, would severely hamper the child in the correct determination of the basic
structure of the language.
One way around this problem would be simply to note that deverbal
nominals are not likely to be common in the input. A more principled solution, I
believe, would be to further restrict the base on which the canonical correspondences
are drawn. For example, within each language there is not simply a class of
nouns, roughly labelling things, but a distinguished subset of proper names. Of
course, if the general theory in Lebeaux (1986) is correct, then derived nominals
of the sort in (4), i.e. nominalized processes or actions, actually are
projections of V: namely nominalized Vs or V′s, with -tion acting as a nominalizing affix, though they achieve this category only at LF, after affix raising (see
Lebeaux 1986 for such an analysis).
The second property of the analysis, along with the idea that one set of
primitives is related to another by being a canonical realization of the first in the
different system, is that the second set is constructed over the first and is itself
autonomous, freed from its reliance on the first in the fully adult system. It
is this which allows strings like that given above (e.g. This situation resembles
a morass), which do not obey the canonical correspondences, to be generated by
the grammar. We may imagine a number of possibilities in how the systems may
be overlaid: it may be that the original set of primitives, while initially used by
the grammatical system in the acquisition phase, is entirely eliminated in the
adult grammatical system. This presumably would be the case in the above
labelling of elements as thing, action, etc., which would not be retained in
the adult (or older child's) grammar. On the other hand, certain sets of primitives
might be expected to be retained in the adult system. In the framework of Pinker
(1984), this would include the set of grammatical relations, which were used to
build up the phrase structure.
In Pinker and Lebeaux (1982) and Pinker (1984), the labelled string allowed the
basic structure of S to be built over it in the following way: (i) particular
elements of the string were thematically labelled in a way retrievable from
context (Wexler and Culicover 1980); (ii) particular grammatical functions
corresponded to the canonical structural realizations of thematically labelled
elements (agent → subject, patient → object, goal → oblique object, etc.); (iii)
grammatical relations were realized, according to their Aspects definition, as
elements in a phrase marker: subject (NP, S), object (NP, VP), oblique object
(NP, PP), and so on; (iv) the definitions in (iii) were relaxed as required, to
avoid crossing branches in the PS tree.
The proviso in (iv) was intended to provide for languages exhibiting a range
of hierarchical structuring. Rather than specifically including each degree of
hierarchical structuring as a setting in UG as a substantive universal (i.e. a
possible setting for a substantive universal), the highly modular approach of
(i)–(iv) allows for the interaction of the substantive universal in (iii) and the
formal universal in (iv) to introduce the degree of hierarchical relaxation necessary, without specific provision having to be made for a series of differing
grammars.
The provisions (i)–(iv) may be put in a procedural format:
(5) Building phrase structure:
a. Thematic labelling:
(i) Label agent of action: agent
(ii) Label patient of action: patient
(iii) Label goal of action: goal
etc.
b. Grammatical Functional labelling:
(i) Label agent: subject
(ii) Label patient: object
(iii) Label goal: oblique object
etc.
c. Tree-building:
(i) Let subject be (NP, S)
(ii) Let object be (NP, VP)
(iii) Let oblique object be (NP, XP), XP not VP, S
etc.
d. Tree-relaxation:
If (a)–(c) requires crossing branches, eliminate offending nodes
as necessary, from the bottom up. Allow default attachment to
the next highest node.
The combination of (c) and (d) assumes maximum structure, and then relaxes
that assumption as necessary. The principle (5a) meshes with the general
cognitive system, as does the node-labelling mentioned earlier. The other
principles are purely linguistic, but even here the question arises of whether they
are permanent properties of the linguistic system (i.e. UG as it describes the
adult grammar), or localized in the acquisition system per se, as Grimshaw
(1981) suggests. We leave this question open for now.
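The procedure in (5) can be sketched as a toy program. The tuple encoding of nodes, the fixed role tables, and the omission of the tree-relaxation step (5d) are my own simplifications; this is a sketch of the flow from thematic labels to a phrase marker, not Pinker's actual model.

```python
# Toy sketch of the phrase-structure-building procedure in (5).
# Step (5a), thematic labelling, is taken as given in `roles`;
# (5d), tree-relaxation, is not modeled. Labels are illustrative.

THEMATIC = {"agent": "subject", "patient": "object",
            "goal": "oblique object"}                      # (5b)
POSITION = {"subject": ("NP", "S"), "object": ("NP", "VP"),
            "oblique object": ("NP", "PP")}                # (5c), Aspects-style

def build_structure(words, roles):
    """words: the segmented string; roles: the thematic labelling
    per (5a), e.g. {"John": "agent", "Bill": "patient"}."""
    tree = {"S": [], "VP": []}
    for w in words:
        if w in roles:
            gf = THEMATIC[roles[w]]            # theta role -> grammatical function
            cat, mother = POSITION[gf]         # function -> position in marker
            tree.setdefault(mother, []).append((cat, w, gf))
        else:
            tree["VP"].insert(0, ("V", w, None))  # assume the verb heads VP
    tree["S"].append(("VP", tree["VP"]))          # attach VP under S
    return tree["S"]

s = build_structure(["John", "hit", "Bill"],
                    {"John": "agent", "Bill": "patient"})
```

Once such a structure is built, the general rule (here, S → NP VP) can be read off it independently of the particular instantiating sentence, which is the point Pinker makes about rule extraction.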
I have considered above how the basic labelling would work; consider now
the general analysis. The string in (6) is segmented by the child.
(6) John hit Bill.
From the general cognitive system, the segmented entities may be labelled for
their semantic content.
(7) thing (name) action thing (name)
John hit Bill
These cognitive terms take their canonical grammatical realization in node labels.
(8) NP           V       NP
    thing(name)  action  thing(name)
    John         hit     Bill
These nodes, in turn, may be thematically labelled.
(9) NP           V       NP
    thing(name)  action  thing(name)
    agent                patient
    John         hit     Bill
The canonical realization of these theta roles is as particular grammatical
relations.
(10) NP           V       NP
     thing(name)  action  thing(name)
     agent                patient
     subject              object
     John         hit     Bill
These grammatical relations have, as in the Aspects definition, particular
structural encodings.
(11) [S [NP John (thing(name), agent, subject)]
     [VP [V hit] [NP Bill (thing(name), patient, object)]]]
And the phrase structure tree is complete.
As Pinker (1984) notes, once the structure in (11) is built, the relevant
general grammatical information (e.g. S → NP VP) may be entered in the
grammar. The PS rule is available apart from its particular instantiating instance,
and the system itself is freed from its reliance on conceptual or notional categories. Rather, it may analyze new instances which do not satisfy the canonical
correspondences: any pre-verbal NP, regardless of its conceptual content (e.g.
naming an event, rather than a thing), will be analyzed as an NP. It is precisely
this property, representing the autonomous character of the syntactic system
after its initial meshing with the cognitive/semantic system, which gives the
proposal its power.
I have included in the phrase structure tree not purely grammatical
information, e.g. the node labels, but also other sorts of information: thematic
and grammatical relational, as well as cognitive. Should such information be
included? There is some reason to believe not, at least for the cognitive categories
above. Thus, while grammatical processes require reference to node labels like NP, they
do not seem to require reference to cognitive categories like thing or proper
name. Giving such labels equal status in the grammatical representation implies,
counterfactually, that grammatical processes may refer to them. This suggests, in
turn, that they should not be part of the representation per se, but part of the
rules constructing the representation.
The situation is more complex with respect to the other information in the
phrase structure tree. Thus in the tree in (11), it is assumed that thematic
information (the thematic labels agent, patient, etc.) is copied onto the node
labels directly, as is grammatical-functional information (Subj, Obj, etc.).
Presumably, in a theory such as GB, the latter intermediate stage of grammatical
functions would be discarded in favor of a theory of abstract Case. The question
of whether the thematic labels are copied directly onto the phrasal nodes can also
not be answered a priori, and is associated with the question of how thematic
assignment occurs exactly in the adult grammar. In traditional analyses, theta
roles were thought of as being copied directly onto the associated NP: i.e. the
relevant argument NP was considered to have the thematic role as part of its
feature set directly. In the theory of Stowell (1981), the NP does not have the
theta role copied onto it, but rather receives its theta role by virtue of a
(mini-)chain which coindexes the phrasal node with the position in the theta grid.
(12) [VP [V see (Ag, Patⱼ)] [NP Maryⱼ]]
While Stowell's system is in certain respects more natural (in particular, it
captures the fact that case, but not theta roles, seems to show up morphologically
on the relevant arguments, and hence may be viewed as a direct spell-out of the
feature set), the acquisition sequence given here suggests that theta roles
actually are part of the feature set of the relevant NP. Since I will assume here,
as throughout, that the representations adopted in the course of acquisition are
very closely aligned with those in the adult grammar, this suggests that theta
roles actually are assigned to the phrasal NP, at least in the case of agent and
patient (which are the central roles for this part of the account).
2.2 Argument-linking
The plausibility and efficacy of the above approach in learning cross-linguistic
variation in phrase structure depend in part on the outcome of unresolved
linguistic questions. In particular: (i) to what degree do languages actually differ
in degree of articulation; (ii) to what degree may elements directly associated
with the verbal head or auxiliary, the pronominal arguments of the head (Jelinek
1984), be construed as the direct arguments, with the auxiliary NPs considered to
be simply adjuncts or ad-arguments; and (iii) what is the precise characterization of the
difference between nominative/accusative and ergative/absolutive languages, or
the range of languages that partially have ergative/absolutive properties (e.g. split
ergative languages)? The existence of so-called true ergative languages has, in
particular, been used to critique the above notion that there are canonical
(absolute) syntactic/thematic correspondences, and that these may be used
universally by the child to determine the Grammatical Relational or abstract Case
assignment in the language. Thus it is often noted that while nominative/
accusative languages use the mapping principles in (13a), ergative/absolutive
languages use those in (13b) (Marantz 1984; Levin 1983).
(13) a. subject of transitive → nominative
subject of intransitive → nominative
object of transitive → accusative
b. subject of transitive → ergative
subject of intransitive → absolutive
object of transitive → absolutive
And in fact the true ergative language Dyirbal is assumed to have the
following alignment of theta roles and grammatical relations (Marantz 1984;
Levin 1983):
(14) agent → object
patient → subject
The mapping principles in (13) are stated in terms of grammatical relations, but
it is clear that even if the principles were re-stated in other terms (e.g. those
relating to abstract Case), a serious problem would arise for the sort of acquisition system envisioned above, if no more were said. The reason is that the set of
canonical correspondences must be assumed to be universal, with one set of
primitives (case-marking) centered in another set (thematic), perhaps through the
mediation of a third set (abstract Case). The linking rules must be assumed to be
universal, since if they were not so assumed, they would not give a determinate
value in the learning of any language, for the child faced with a language of
unknown type. It is easy to see why. Let us call the (canonical) theta role
associated with the subject position of intransitives t₁, the canonical theta role
associated with the subject position of transitives t₂, and the canonical theta role
associated with the object position of transitives t₃.
Then nominative/accusative languages use the following grouping into the
case system (15).
(15) theta system: t₁ t₂ t₃
     case system: {t₁, t₂} → nominative; {t₃} → accusative
The ergative/absolutive languages use the following grouping:
(16) theta system: t₁ t₂ t₃
     case system: {t₁, t₃} → absolutive; {t₂} → ergative
This is perfectly adequate as a description, but if the child is going to use theta
roles to determine the (grammatical relation and) case system of the language in
which he has been placed, then the position is hopeless, since the child would
have to know what language-type community he was in antecedent to his
postulation of where subjects were in the language. Otherwise he will not be able
to tell where, e.g., subjects are in the language. But it is precisely this that the
radical break in the linking principles evidenced in (15) and (16) would not
allow. The division of language types in this fashion would thus create insuperable difficulties for the child, since there would be no coherent set of theta–Grammatical-Relational or theta–Case mappings that he or she could use to
determine the position of the subject in the language. Of course, once the
language-type was determined, the mapping rules themselves would be as well,
but it is precisely this information that the child has to determine.
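The learnability point can be made concrete: the groupings in (15) and (16) map the same theta-role inventory onto different case systems, so the mapping is determinate only given the language type, which is exactly what the child does not yet know. The dictionary encoding below is illustrative only.

```python
# Toy illustration: the two case groupings of (15)/(16) over the same
# theta-role inventory t1, t2, t3 (labels follow the text).

NOM_ACC = {"t1": "nominative", "t2": "nominative", "t3": "accusative"}  # (15)
ERG_ABS = {"t1": "absolutive", "t2": "ergative",   "t3": "absolutive"}  # (16)

def case_of(role, language_type):
    """Determinate only GIVEN the language type -- the information
    the child is trying to acquire in the first place."""
    return {"nom/acc": NOM_ACC, "erg/abs": ERG_ABS}[language_type][role]

# The agent of a transitive (t2) is grouped with t1 in one system
# but isolated in the other, so t2 alone cannot reveal the system:
assert case_of("t2", "nom/acc") == case_of("t1", "nom/acc")
assert case_of("t2", "erg/abs") != case_of("t1", "erg/abs")
```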
There is, however, an unexamined assumption in this critique. It is that the
situation is truly symmetrical: that is, that the child is faced with the choice of
the following two linking patterns:
(17)                G₀
     nom/acc pattern        erg/abs pattern
Given such an assumption, there is no way for this acquisition system to
proceed. However, if the situation is not truly symmetrical, i.e. if there are
other differences between the languages exhibiting ergative and those exhibiting
nominative/accusative linking patterns, and if these differences are determined by the
child prior to his/her determination of the linking pattern in (17), then the critique
itself is without force. We would wish to discover, rather, how this prior determination occurs, and how it and the adoption of a particular linking pattern mesh.
In fact, there appears to be evidence for just this lack of symmetry:
evidence that the majority (and perhaps the vast bulk) of ergative/absolutive
languages are associated with a different sort of argument structure. I rely here
on the work of Jelinek (1984, 1985), and associated work. Jelinek proposes a
typological difference between broadly configurational languages, which take
their associated NPs as direct arguments (e.g. English, French), and languages
which she designates pronominal argument languages. In the latter type (she
argues), the pronominal clitics associated with the verbal head or auxiliary are
actually acting as direct arguments of the main predicate, and the lexical noun
phrases are adjuncts or ad-arguments, further specifying the content of the
argument slot. The sentential pattern of phrasally realized arguments in such
languages, then, would roughly resemble the nominal pattern in English
picture-noun phrases, where all arguments may be considered to be adjuncts in
relation to the head.
While it is not the case that all ergative languages reveal this sort of
optionality of arguments (and in particular Dyirbal does not), it does seem that
the bulk do (Jelinek 1984). If we take this as the primary characteristic of these
languages, then the choice for the child is no longer the irresolvable choice of
(17), but rather the following:
(18)                G₀
     arguments obligatory        arguments optional
                                 (i.e. not arguments but ad-arguments)
     nominative/accusative       ergative/absolutive
The choice matrix in (18) is undoubtedly very much simplified. Nonetheless,
it appears to be the basic cut made in the data. If this is so, however, then the
original decision made by the child is not that of the linking pattern used, but
rather the determination of the argument status of the phrasal arguments; this,
in turn, may cue the child into the sort of language that he/she is facing.
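The basic cut in (18) can be sketched as a decision procedure that precedes the choice of linking pattern. The field names and the all-or-nothing test below are illustrative assumptions, not part of Jelinek's proposal; real input is noisier, and pro-drop in nominative/accusative languages would have to be factored out.

```python
# Toy sketch of the choice in (18): the argument-status decision
# precedes, and cues, the choice of linking pattern.

def classify_language(clauses):
    """clauses: observed clauses, each recording whether the verb's
    lexical NP arguments were overtly present (an idealization)."""
    if all(c["overt_arguments"] for c in clauses):
        return "arguments obligatory -> nominative/accusative linking"
    return "arguments optional (ad-arguments) -> ergative/absolutive linking"

# Hypothetical observations for the two language types:
english_like = [{"overt_arguments": True}, {"overt_arguments": True}]
pronominal_argument_like = [{"overt_arguments": True},
                            {"overt_arguments": False}]
```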
2.2.1 An ergative subsystem: English nominals
The general pattern suggested in (18) gets support from a rather remarkable
source: English nominals. It was noted above that simple nominals in English
(e.g. picture) have pure optionality of arguments. What has been less commented on is that deverbal English nominals have an ergative linking pattern as
well, in the sense of (13b) above.
Deverbal nominals from a transitive stem base have the same linking
patterns as their verbal counterparts (Chomsky 1970):
(19) John's destruction of the evidence
John destroyed the evidence
I assume, with Chomsky (1970), that in cases in which the DS object is in
subject position (the city's destruction) Move-α has applied.
What about deverbal nominals from an intransitive base? Here the data are
more complex. Appearance, from the unaccusative verb appear, allows the
argument to appear in either N-internal or N-external position, with genitive
marking on the latter.
(20) the appearance of John shocked us all
John's appearance shocked us all
The second example in (20) is actually ambiguous between a true argument
reading and a relation-R type reading (the fact that John appeared vs. the way he
looked, respectively); the latter reading does not concern us here.
What about other intransitives? Surprisingly, the internal appearance of
the single argument of intransitive verbs is not limited to unaccusatives, but
occurs with other verbs as well. For example, sleep and swim, totally unexceptionable unergative (i.e. pure intransitive) verbs, allow their argument to appear
both internal and external to the N.
(21) the sleeping of John
John's sleeping
(22) the swimming of John
John's swimming
This possibility of taking the subject argument as internal is not limited to
deverbal nominals simply taking one argument, but extends to those taking other
internal arguments as well, as long as those would be realized as prepositional
objects (rather than direct objects) in the corresponding verbal form.
(23) a. the talking of John to Mary
(John talked to Mary)
b. the reliance of John on Bill
(John relied on Bill)
c. the commenting of John on her unkindness
(John commented on her unkindness)
Of course, deverbal nominals formed from simple transitive verbs do not allow
the subject of the verbal construction to appear in the internal-to-N position in
an of-phrase:
(24) a. *the destruction of John of the city
(John destroyed the city)
b. *the examination of Bill of the students
(Bill's examination of the students)
It appears, then, that the linking pattern for subjects in nominals differs
according to whether the underlying verbal form is transitive or not. In all
deverbal nominals formed from intransitive verbs, not simply those formed
from unaccusatives, the subject of the corresponding verb may be found in the
nominal in internal-to-N position, in the of-phrase which marks direct arguments
(see (23)), and without genitive marking in that position. This is totally impossible in deverbal nominals formed on a transitive verbal base.
PROJECT-α, ARGUMENT-LINKING, AND TELEGRAPHIC SPEECH 43
This clearly is an ergative-like pattern, but before drawing that conclusion
directly, a slightly broader pattern of data must be examined. It is not simply the
case that nominals formed from intransitive verbal bases allow their (single)
direct argument to appear internal to the N; it may appear in the external
position as well, as noted above.
(25) a. the sleeping of John
John's sleeping (in derived nominal, not gerundive, reading)
b. the talking of John to Mary
John's talking to Mary
c. the reliance of John on Mary's help
John's reliance on Mary's help
d. the love of John for his homeland
John's love for his homeland
Second, genitives may appear post-posed, in certain cases:
(26) a. John's pictures of Mary
the pictures of Mary of John's
b. John's stories
the stories of John's
However, we may note that when the subject is postposed, it must appear with
genitive marking (or in a by-phrase).
(27) a. the pictures of Mary of John's
*the pictures of Mary of John
b. John's examination
the examination of John's (possessive reading)
the examination of John (OK, but only under patient/theme reading)
A reasonable possibility for analysis is that the NP with genitive marking which
appears post-head is postposed from the subject/genitive slot: this accounts for
the genitive marking (see Aoun, Hornstein, Lightfoot, and Weinberg 1987, for a
different analysis, in which the genitive is actually associated with a null N
category). What appears to be the case, however, is that elements which have
moved into the subject position may no longer postpose.
(28) a. John's picture (ambiguous between possessor and theme reading,
the latter derived from movement)
b. the picture of John's (only possessor reading)
(29) a. John's examination (ambiguous between possessor and theme
reading)
b. the examination of John's (only possessor reading)
Thus the nominal in (28a) is ambiguous between the moved and unmoved
interpretations while the postposed genitive in (28b) is not; similarly for (29a)
and (29b). Thus the following constraint appears to hold, whatever its genesis
(see Aoun, Hornstein, Lightfoot, and Weinberg 1987, which makes a similar
empirical observation, though it gives a different analysis).
(30) Posthead genitives may only be associated with the deep structure
subject interpretation; derived genitive subjects may not appear
posthead with genitive marking.
The constraint in (30) may now be used to help determine the argument structure
of the intransitive nominals under investigation. It was earlier noted that the
subject in such nominals appeared on either side of the head.
(31) a. the appearance of John
John's appearance
b. the sleeping of John
John's sleeping
Which position is the DS position? Given (30), the genitive subject should count
as a deep structure subject if it may postpose with genitive marking, but not
otherwise. In fact, it cannot postpose:
(32) a. the appearance of John (startled us all)
John's appearance
*the appearance of John's
b. the sleeping of John
John's sleeping
*the sleeping of John's
c. the talking of John to Mary
John's talking to Mary
*the talking to Mary of John's
The inability of the genitive subject of the intransitive to postpose then constitutes
evidence that the subject position is not the DS position of the single direct
argument, but rather the internal-to-N position, and the constructions in which
that element does appear in subject position are themselves derived by preposing.
The argument-linking pattern for English nominals would then not be some
sort of mixed system, but a true ergative-style system, given in (33).
(33) English nominals:
θ₁ (subject of intransitives) → internal position
θ₂ (object of transitives) → internal position
θ₃ (subject of transitives) → external position
This, in turn, strongly suggests that the argument-linking split found
between the nominative/accusative languages and the ergative/absolutive
languages should not itself be the primary cut faced by the child, but rather that
between languages (and sub-languages) which obligatorily take arguments, and
those which do not. Even in English, a strongly nominative/accusative language,
a sub-system exists which is basically ergative in character: this, not coincidentally,
corresponds to the subsystem in which the arguments are optional, the nominal system.
2.2.2 Argument-linking and Phrase Structure: Summary
To summarize: I have suggested (following Pinker 1984) the need for maximally
general linking rules, associating thematic roles with particular grammatical
functions, or abstract case. Pinker has suggested the term semantic bootstrapping
to apply to such cases; it would be preferable, perhaps, to consider this a
specific instance of a more general concept of analytic priority.
(34) Analytic priority: A set of primitives a₁, a₂, …, aₙ is analytically
dependent on another set b₁, b₂, …, bₙ iff bᵢ must be applied to the
input in order for aᵢ to apply.
(35) The set of theta-theoretic primitives is analytically prior to the set of
Case-theoretic primitives.
Semantic bootstrapping would thus be a particular instance of analytic priority;
another such example would be Stowell's (1981) derivation of phrase structure
ordering from principles involving Case.
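As a schematic illustration (mine, with made-up role-to-case pairings, not the author's rules), analytic priority in the sense of (34)–(35) amounts to the Case primitives being computable only over input to which the theta primitives have already applied:

```python
# The theta labelling (the analytically prior set b) must run before the Case
# labelling (the dependent set a), which is defined only over theta-marked input.

def apply_theta(words):
    """Prior set: pair each word with its (assumed known) thematic role."""
    return [{"word": w, "theta": r} for w, r in words]

def apply_case(theta_marked):
    """Dependent set: Case is read off the theta label, via the linking rules."""
    link = {"agent": "NOM", "patient": "ACC", "goal": "OBL"}
    return [dict(e, case=link[e["theta"]]) for e in theta_marked]
```

The point of the sketch is only the ordering: `apply_case` cannot be stated over the raw word list at all, which is what (34) requires of an analytically dependent set.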
In order for a set of primitives to be analytically prior in the way suggested
above, and for this to aid the child in acquisition, it must be the case that the
analytic dependence is truly universal, since if this were not so, the child would
not be able to determine the crucial features of his or her language (in particular,
the alignment of the analytically dependent primitives a₁, a₂, …, aₙ) on the basis
of the analytically prior set. It is for this reason that the existence of ergative
languages constitutes an important potential counterexample to the idea of
analytic priority, and its particular instantiation here, since it would mean that
there would be no linking regularities that the child could antecedently have
access to.
However, there is fairly strong evidence that the languages differ not only in
their linking patterns, but in their argument structure in general: in particular, that
ergative/absolutive languages are of a pronominal argument structure type, with
elements in the verb itself satisfying the projection principle (Jelinek 1984). The
choice faced by the child would then be that given in (18), repeated here below.
(36) G₀
obligatory arguments → nominative/accusative
optional arguments → ergative/absolutive
The association of optionality with ergativity was supported by the fact that such
languages do in general show optionality in the realization of arguments, and,
further, in the extensive existence of ergative splits. Most remarkably of all, the
English nominal appears to show an ergative-like pattern in argument-linking:
that is, the ergative pattern shows up precisely in that sub-system in which
arguments need not be obligatorily realized.
This suggests, then, that the idea of analytic priority, and in general the
priority of a given set of primitives over another set, is viable. This general idea,
however, has a range of application far more general than the original application
in terms of acquisition; moreover, there is some reason to believe that while the
original proposals involving semantic bootstrapping or analytic priority were
made in an LFG framework, they would be able to accommodate themselves in
a more interesting form in a multi-leveled framework like Government-Binding
theory. While the proposal that the grammar (and in particular, the syntax)
is levelled has been common at least since the inception of generative grammar
(Chomsky 1975 [1955]), the precise characterization of these levels has remained
somewhat vague and underdetermined by evidence (van Riemsdijk and Williams
1981). Within the Government-Binding theory of Chomsky (1981), this problem
has become more interesting and yet more acute, since while it is crucial that
particular subtheories (e.g. Binding Theory) apply at particular levels (e.g.
S-structure), the general principles which would force a particular module to be
tied to a particular level are by no means clear, nor is there any general characterization
which would lead one to expect that a particular module or subtheory
should apply at a single level (Control theory, Binding theory), while another
subtheory must be satisfied throughout the derivation (theta theory, according to
the Projection Principle). The problem is intensified by the fact that, given that
a single operation, Move-α, is the only relation between the levels, and the
content of this operation is for the most part (perhaps entirely) retrievable
from the output, the levels themselves may be collapsed, at least conjecturally,
so that the entire representation is no longer in the derivational mode, but rather
contains all its information in a single-level representation, S-structure (the
representational mode). Chomsky (1981, 1982), while opting ultimately for a
derivational description, notes that in a framework containing only Move-α as
the relation between the levels, the choice between these modes of the grammar
is quite difficult, and appears to rest on evidence which is hardly central.
This indeterminacy of representational style may appear to be quite unrelated
to the other problem noted in Chapter 1, the lack of any general understanding
of how acquisition, and the acquisition sequence in its specificity, fits into
the theory of the adult grammar. I believe, however, that the relation between the
two problem areas is close, and that an understanding of the acquisition
sequence provides a unique clue to the theory of levels. In particular, at a first
approximation, the levels of representation simply correspond, one-to-one, to the
stages of acquisition. That is:
(37) General Congruence Principle:
Levels of grammatical representation correspond to (the output of)
acquisitional stages.
We will return to more exact formulations, and general consequences, throughout.
For the present, we simply note that if the General Congruence Principle is
correct, then the idea of analytic priority, and the possibility that the Case system
in some way bootstraps off of the thematic system, would be expected to be
true not only of the acquisition sequence, but reflected in the adult grammar as
well. Before turning to the ramifications of (37), and its fuller specification, a
different aspect of phrase structure variance must be considered.
2.3 The Projection of Lexical Structure
In the section above, one aspect of grammatical variance was considered, from
the point of view of a learnability theory: namely, the possibility that languages
used radically different argument-linking patterns. Such a possibility would run
strongly against Grimshaw's (1981) hypothesis, that particular (sets of) primitives
were centered in other sets, being their canonical structural realizations in a
different syntactic vocabulary (e.g. cognitive type → phrasal category, thematic
role → grammatical relation, and so on). It was argued, however, that Grimshaw's
hypothesis could be upheld upon closer examination, and that the divergence in
linking pattern was not the primary cut in the data: that being rather tied to the
difference in optionality and obligatoriness of arguments. Since this divergence
could be learned by the child antecedently to the linking pattern itself, the
learnability problem would not arise in its most extreme form, for this portion of
the data.
Moreover, bootstrapping-type accounts were found to be a subcase of a
broader set: those involving analytic priority, one set of primitives being applied
to the data antecedently to another set. This notion would appear to be quite
natural from the point of view of a multi-leveled theory like Government-
Binding theory.
We may turn now to another aspect of the structure building rules in (5),
where structure is assumed to be maximal, and then relaxed as necessary to
avoid crossing branches. The structure-building and structure-labelling rules are
repeated below:
(38) Building phrase structure:
a. Thematic labelling:
i) Label agent of action: agent
ii) Label patient of action: patient
iii) Label goal of action: goal
(Note: the categories on the left are cognitive, those on the right
are linguistic.)
b. Grammatical Functional labelling:
i) Label agent: subject
ii) Label patient: object
iii) Label goal: oblique object
etc.
c. Tree-building:
i) Let subject be (NP, S)
ii) Let object be (NP, VP)
iii) Let oblique object be (NP, XP), XP not VP, S
d. Tree-relaxation:
If (a)–(c) require crossing branches, eliminate offending nodes
as necessary, from the bottom up. Allow default attachment to
the next highest node.
This degree of relaxation may be assumed to occur either on a language-wide or
on a construction-by-construction basis (Pinker 1984). To the extent that the
rules in (38) are accurate, they would allow the learner to determine a
range of possible structural configurations for languages, on the basis of evidence
readily accessible to the child: surface order. Finally, while it is not the case that
the grammars at first adopted would be a direct subset of those adopted later on
(assuming some relaxation has occurred), it would be the case that the
languages generated by these grammars would be in a subset relation.
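The rule interaction in (38) can be sketched computationally. The sketch below is my own simplification (the role labels, the crossing test, and the function names are invented, not the author's definitions):

```python
# (38a-b): cognitive role -> thematic label -> grammatical function.
GF = {"agent": "subject", "patient": "object", "goal": "oblique object"}

# (38c): grammatical function -> (category, mother node).
ATTACH = {"subject": ("NP", "S"), "object": ("NP", "VP"),
          "oblique object": ("NP", "XP")}

def attach(utterance):
    """Propose one attachment (surface index, category, mother) per word."""
    plan = []
    for i, (word, role) in enumerate(utterance):
        if role == "action":
            plan.append((i, "V", "VP"))
        else:
            cat, mother = ATTACH[GF[role]]
            plan.append((i, cat, mother))
    return plan

def crosses(plan):
    """True if the VP's daughters are discontinuous in surface order, so that
    drawing the VP node would cross branches."""
    vp = [i for i, _, m in plan if m == "VP"]
    return any(m != "VP" for i, _, m in plan if min(vp) < i < max(vp))

def build(plan):
    """(38d): if maximal structure would cross branches, eliminate the offending
    VP node and attach its daughters to the next highest node, S."""
    if crosses(plan):
        return ("S", [cat for _, cat, _ in plan])          # flattened
    s = [cat for _, cat, m in plan if m == "S"]
    vp = [cat for _, cat, m in plan if m == "VP"]
    return ("S", s + [("VP", vp)])
```

On this toy, SVO input yields a nested VP, while VSO input (where the subject intervenes between verb and object) triggers the flattening step, in the spirit of Pinker's construction-by-construction relaxation.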
These attractive features notwithstanding, there are potential empirical and
theoretical complications, if a theory of the above sort is to be fleshed out and
made to work. The problem is perhaps more acute if the theory is to be transposed
from an LFG to a GB framework. This is because LFG maintains
grammatical functions as nonstructurally defined primitives; the lack of articulation
in phrase structure in flat languages is still perfectly compatible with a
built-up and articulated functional structure. Relevant constraints may then be
stated over this structure. This possibility is not open in GB (though see Williams
1984, for a discussion which apparently allows scrambling in the widest
sense, perhaps including flattening, as a relation between S-structure and the
surface). This difficulty reaches its apex in the syntactic description of nonconfigurational
languages.
In the general GB framework followed by Hale (1979), and also adopted
here, no formal level of f(unctional)-structure exists. Nonetheless, the descriptive
problem remains: if there is argument structure in such languages, and rules
sensitive to asymmetries in it, then there must be some level or substructure
which has the necessary degree of articulation. If we assume with Hale that there
are languages in which this degree of syntactic articulation does not exist at any
level of phrasal representation (though this set of languages need not necessarily
include languages such as Japanese, where it may be the case that Move-α has
applied instead; see Farmer 1984; Saito and Hoji 1983 for relevant divergent
opinions), then this articulation must still be present somewhere.
Hale's original proposal (1979) was that the necessary structure was
available at a separate level, lexical structure, at which elements were coindexed
with positions in the theta structure of the head. It was left unclear in Hale's
proposal what precisely the nature of lexical structure would be in configurational
languages. Because of this, in part, the proposal was sharply questioned by
Stowell (1981, 1981/1982), essentially on learnability grounds. Stowell noted the
difficulty in supposing that languages represented their argument structure in two
radically different ways, phrase-structurally or in lexical structure: how could the
child tell the difference? Would lexical structure, as Hale defined it, just remove
itself in languages which didn't use it?
Let us sketch one way through this, then step back to consider the consequences.
We may recast Hale's original proposal in such a way as to escape
some of these questions, using a mechanism introduced by Stowell himself: the
theta grid. Suppose we identify Hale's lexical structure simply with the theta grid
itself. Whatever degree of articulation is necessary to describe languages of this
type would then have to be present on the theta grid itself; the deployment of the
phrasal nodes would themselves be flat. Rather than representing the lexical-thematic
information on a grid, as Stowell suggests, let us represent it on a small
(lexical) subtree.
(39) hit: (lexical structure)
[V [N agent] [V [V hit] [N patient]]]
It is these positions, not positions in a grid, which are linked to different
arguments in a full clausal structure:
(40) [S [NPᵢ The man] [VP [V [Nᵢ (agent)] [V [V hit] [Nⱼ (patient)]]] [NPⱼ the boy]]]
With respect to theta assignment, this representation and Stowell's would behave
identically. However, there are key differences between the two representations.
First, the representation of thematic positions as actual positions on a subtree
gives them full syntactic status: this is not clearly the case with the grid representation.
Second, the tree is articulated in a way that the grid is not: in particular,
the internal argument is represented as inside a smaller V than the external
argument. This means that the internal/external difference is directly and
configurationally represented, at the level of the grid itself (no longer a grid,
rather, a subtree). There is some reason to think that theta positions may have the
real status given to them here. It would allow a very clear representation of
clitics, for example: they would simply be lexicalized theta tree positions
(perhaps binding phrasal node empty categories). Similarly, the pronominal
arguments of pronominal argument languages would be lexically realized
elements of these positions on the small theta subtree. And, finally, the possibility
of such a lexical subtree provides a solution to a puzzle involving noun
incorporation. In the theory of Baker (1985), noun incorporation involves the
movement of the nominal head (N) of an NP into the verb itself, incorporating
that head but obligatorily not retaining specifiers, determiners, and other material.
(41) a. I push the bark.
b. I bark-push. (incorporated analysis)
c. *I the bark-push.
Assuming the basics of Baker's theory, we might ask why the incorporated noun
is restricted to the head, and specifically the N⁰ category: why, for example, it
cannot include specifiers. Baker suggests that this is because movement is
restricted to either maximal or minimal projections. This representation, however,
suggests a different reason: the noun incorporation movement is a substitution
operation, rather than an adjunction operation. As such, the moved category must
be of the right type to land inside the word: in particular, it must not be of
category bar-level greater than that available as a landing site. Since the landing
site is a mere N, not an N′ or NP, the only possible incorporating elements are
of the X⁰ level.
I will henceforth use the theta subtree rather than the theta grid, except for
reasons of expository convenience.
2.3.1 The Nature of Projection
The Grimshaw (1981) and Pinker (1984) proposal contained essentially two
parts. The first was that certain aspects of argument-linking are universal, and it
is this which allows the child to pick out the set of subjects in his or her
language. I have attempted above to adopt this proposal intact, and in fact
expand on it, so that the principles figure not simply in the mapping principles
allowing the child to determine his or her initial grammar, but in the synchronic
description of the grammar as well. The second part of the proposal is concerned
not so much with argument-linking, but with the building of phrase structure. Here I
would like to take a rather different position from that considered in the
Grimshaw/Pinker work. In particular, I would like to argue that the Projection
Principle, construed as a continually applying rule Project-α, plays a crucial role.
Recall the substance of the proposal in (38), where arguments are first
thematically labelled, then given grammatical functions, and then given PS
representations, relaxed as necessary to avoid crossing branches. From the point
of view of the current theory (GB), there are aspects of this proposal which are
conceptually odd or unexpected, and perhaps empirical problems as well. First,
the tree-building rules given above require crucial reference to grammatical
relations. In most versions of Government-Binding Theory, grammatical relations
play a rather peripheral role, e.g. as in the definition of function chains (Chomsky
1981). Second, the tree representation itself is given an unusual prominence,
in that it is precisely the crossing of branches which forces a flattening of
structure. This extensive reliance on tree geometry runs against the general
concept of theories like Stowell (1981). Further, there is an empirical problem.
Given the possibility that two different sorts of languages exist, those which are
truly flat, and those which are not but have extensive recourse to Move-α
(such at least would be the conclusion to be drawn by putting together the
proposals of Hale (1979) and Jelinek (1984), on the one hand, and Saito and Hoji
(1983) on the other), the simple recourse to flattening in the case in which the
canonical correspondences are not satisfied cannot possibly be sufficient. Finally,
it is odd again that the Projection Principle, which plays such a large role in the
adult grammar, should play no role at all in the determination of early child
grammars.
We may pose the same questions in a more positive way. The child starts
out with a one-word lexical-looking grammar. From that, he or she must enter
into phrasal syntax. How is that done?
Let us give the Projection Principle, construed as an operation as well as a
condition on representations, central place. In particular, let us assume the
following rule:
(42) Project-α
Project-α holds at all levels of representation; it is a condition on representations.
However, it is also a rule (or principle) which generates the phrasal syntax
from the lexical syntax, and links the two together. Thus, looking at the representation
from the head outward, we may assume that the phrasal structure enveloping
it actually is projected from the lexical argument structure. This relies
crucially, of course, on the theta subtree representation introduced above.
(43) lexical form:
[V [N agent] [V [V hit] [N patient]]]
→ Project-α →
(44) phrasal projection:
[Vx NP₁ [Vx [V [N agent₁] [V [V hit] [N patient₂]]] NP₂]]
In the representation in (43)–(44), the lexical representation of the verbal head
has projected itself into the phrasal syntax, retaining the structure of the lexical
entry. In effect, it has constructed the phrasal representation around it, projecting
out the structure of the lexical head. Into this phrasal structure, lexical
insertion may take place. I have left it intentionally vague, for now, what the
level of the phrasal projections which are the output of Project-α is, calling
them simply Vx. Similarly, the question of whether the thematic role (agent,
patient, etc.) projects has been left undetermined (no projection is shown above).
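The operation can also be sketched computationally. The encoding below is my own (`Node`, `lexical_entry`, and `project_alpha` are invented names), intended only to show a lexical subtree as in (43) projecting the phrasal envelope of (44):

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)
    role: Optional[str] = None       # theta role on argument positions

def iter_nodes(n):
    """Walk the tree top-down, left to right."""
    yield n
    for c in n.children:
        yield from iter_nodes(c)

def lexical_entry(head, external, internal):
    """Lexical subtree as in (43): [V [N external] [V [V head] [N internal]]]."""
    return Node("V", [Node("N", role=external),
                      Node("V", [Node("V", [Node(head)]),
                                 Node("N", role=internal)])])

def project_alpha(lex):
    """Project-α: wrap the lexical subtree in phrasal structure, opening one
    phrasal NP slot per theta-marked position, as in (44)."""
    roles = [n.role for n in iter_nodes(lex) if n.label == "N" and n.role]
    inner = Node("Vx", [lex, Node("NP", role=roles[-1])])   # internal argument
    return Node("Vx", [Node("NP", role=roles[0]), inner])   # external argument
```

Lexical insertion then targets the NP slots, each linked to the theta position it was projected from.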
There are four areas in which the above representation may be queried, or its
general characteristics further investigated:
i) Given that the phrase structure is already articulated, does not the addition
of an articulated lexical representation, in addition to the PS tree, introduce
a massive redundancy into the system?
ii) Assuming that Project-α does occur, is all information associated with the
lexical structure projected, or just some of it? Further, are there any
conditions under which Project-α is optional, or need not be fully realized?
iii) Given that Project-α is an operation, as well as a condition on ultimate
representations, is there any evidence for pre-Project-α representations,
either in acquisition or elsewhere?
iv) How might the typology of natural languages, in particular Hale's (1979)
conjecture that nonconfigurational languages are flatter in structure (i.e. less
articulated with respect to X-bar theory), be associated with differences in
Project-α?
In the following sections, I will be concentrating on questions (ii)–(iv). The
question or challenge posed in (i), I would like to consider here. It
requires a response which is essentially two-pronged. The first part concerns the
nature of the lexical representation that is employed: that is, as a subtree with
articulated argument structure, including potential order information (e.g. the
internal argument is on the right), rather than the more standard formats with the
predicate first, and unordered sets of arguments (Bresnan 1978, 1982; Stowell
1981). In part, the problem posed here is simply one of unfamiliarity of notation
(an unordered set of arguments may be converted into a less familiar tree
structure with no loss of information), but in part the most natural theoretical
commitments of the notations will be different, and the notation adopted here
requires a defense. I have suggested above one line of reasoning for it: namely,
by allowing the theta grid real syntactic status, the placement of elements like
clitics (and perhaps noun-incorporation structures) can be accounted for. Two
other consequences are: the notion of Move-α in the lexicon (Roeper and
Siegel 1978; Roeper and Keyser 1984) would be given a natural format if tree
structures are assumed; perhaps less so in the more standard Bresnan-Stowell
type notation. Moreover, making the usual assumptions about c-command and
unbound traces (namely, that traces must be bound by a c-commanding antecedent),
the tree notation makes a prediction: externalization rules in the lexicon
should be possible, but internalization rules should not. This is because the latter
rule would be a lowering rule, leaving an unbound trace; the externalization rule
would not.
(45) Externalization:
[V [N agentᵢ] [V [V melt] [N patientᵢ]]]
(movement from the internal to the external position; the trace is c-commanded by its antecedent)
(46) Internalization:
*[V [N agentᵢ] [V [V melt] [N patientᵢ]]]
(movement from the external to the internal position; the trace would be left unbound)
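The asymmetry between externalization and internalization reduces to c-command of the trace, which a toy check can make concrete. The encoding below is my own (the subtree and position names are invented for the sketch):

```python
# The lexical subtree [V [N ext] [V [V melt] [N int]]], with argument positions
# named by their content. Externalization leaves a trace in "int" bound from
# "ext"; internalization would leave a trace in "ext" unbound from "int".

TREE = ("V", [("N", "ext"), ("V", [("V", "melt"), ("N", "int")])])

def dominates(node, target):
    """True if node contains the position named target."""
    label, content = node
    if content == target:
        return True
    return isinstance(content, list) and any(dominates(c, target) for c in content)

def c_commands(tree, a, b):
    """a c-commands b iff some sister of a (under a's mother node) dominates b."""
    label, content = tree
    if not isinstance(content, list):
        return False
    if any(k[1] == a for k in content):            # found a's mother
        return any(dominates(s, b) for s in content if s[1] != a)
    return any(c_commands(k, a, b) for k in content)
```

The external position c-commands the internal one, so an externalized argument binds its trace; the internal position does not c-command the external one, which is the sense in which the lowering rule in (46) leaves its trace unbound.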
Aside from this empirical consequence, which falls out automatically from the
notation and requires verification or not, the tree notation has a further consequence,
bearing on a theoretical proposal of Travis (1984) and Koopman (1984).
Travis and Koopman suggest that there are two distinct processes, Case assignment
(or government) and theta assignment (or government), and that each of
these processes is directional. Thus, in the theory of Travis (1984), these
processes may be opposite in directionality for some categories (e.g. V in
Chinese), with Move-α occurring obligatorily as a consequence. The notion that
Case assignment is a directional process is a natural one; the notion that theta
assignment is directional is perhaps less so, at least under the version of theta
assignment outlined in Stowell (1981), where theta roles are assigned under
coindexing with positions in the grid. Note, however, that given the tree-type
representation of lexical entries above, where order information is included, the
idea that theta government is directional becomes considerably more natural. This
is because, to the extent to which Project-α faithfully projects all the information
associated with the lexical entry, the order information encoded in the lexical
entry would be expected to be projected as well. Thus the Koopman-Travis
proposal, and the proposal for the format of lexical entries suggested here,
mutually reinforce each other, if it is assumed that Project-α is a true projection
of the information present in the lexical representation.
This partially answers the question posed by (i): the existence of redundancy
between the chosen lexical representation and that given in the syntax. While the two
representations are close to each other in format and information given, this is
precisely what would be expected under interpretations of the Projection
Principle in which the projected syntactic information is faithfully projected from
the lexicon. In fact, we may use the syntactic information (so the Projection
Principle would tell us) to infer information about the structure of the lexical
entry, given a strict interpretation of that principle.
There is, however, a different aspect of the problem. The above argument
serves to weaken criticisms about redundancy based on the similarity between
the lexical representation and the syntactic: in fact, just such redundancy or
congruence might be expected, given the Projection Principle. It might still be
the case that both sorts of information, while present in the grammar, are not
present in the syntactic tree itself. This would represent a different sort of
redundancy. The output of Project-α given above includes both the lexical
information, and the syntactic (as does Stowell's theta grid).
(47) [V [N agent] [V [V] [N patient]]]
→ Project-α →
[VP NP [V [N agent] [V [V] [N patient]]] NP]
The arguments for this sort of reduplication of information are essentially
empirical. They depend ultimately on the position that implicit arguments play in
the grammar. In many recent formulations (e.g. Williams 1982), implicit
arguments are conceived of as positions in the theta grid not bound by (coindexed
with) any phrasal node. As such, they may be partially active in a
syntactic representation: coindexed with the PRO subject of a purposive, or an
adverbial adjunct (following Roeper 1986), even though the position associated
with them is not phrasally projected.
(48) a. The boatᵢ [VP was [V agⱼ [V sunk patientᵢ]] tᵢ PROⱼ to collect the
insurance].
b. The pianoᵢ [VP was [V agⱼ [V played themeᵢ]] tᵢ (PROⱼ nude)].
Of course, if such positions are available in the syntax, even partially, they must
be present in the representation.
2.3.2 Pre-Project-α representations (acquisition)
The theory outlined above has two central consequences. First, the Pre-Project-α
representation has a structure which is fully syntactic, a syntactically represented
subtree. Second, this subtree is available in principle prior to the application of
Project-α, since the latter is interpreted both as an operation, and as a constraint
on the output representations. In the normal course of language use, no
instances of Pre-Project-α representations would be expected to be visible, since
one utters elements which are sentences, or at least part of the phrasal syntax,
and not bare words. However, this need not be the case with the initial stages of
acquisition, if we assume, as in fact seems to be the case, that the child starts out
with a lexical-looking syntax, and only later moves into the phrasal syntax.
The idea that the child may be uttering a single lexical item, even when he
or she is in the two- (or perhaps three-) word stage, becomes plausible if we adopt
the subtree notation suggested earlier. It was suggested above that a verb like
want, in its lexical representation, consists of a small subtree, with the positions
intrinsically theta-marked.
(49) [V [N agent] [V [V want] [N theme]]]
A number of constructions from Braine (1963, 1976), drawn from very early
speech, were shown above. Among them were the want constructions from Steven:
(50) want baby
want car
want do
want get
want glasses
want head
want high
etc.
These appear directly after the one-word stage. One possible analysis is that
these are (small) phrasal collocations. Suppose we assume instead that they are
themselves simply words: the lexical subtree in (49), with the terminal nodes (or
the object terminal node) filled in.
(51) [V [N agent] [V [V want] [N theme: baby]]]
Then the child is, at this point, still only speaking words, though with the
terminals partly filled in.
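The subtree notation lends itself to a direct computational rendering. The following is a minimal sketch (the class name, field names, and method are my own illustrative choices, not part of the proposal): a lexical entry is a small tree whose argument positions carry theta roles, and a two-word utterance like want baby is that same tree with one terminal filled in.

```python
# A minimal sketch of the lexical subtrees in (49) and (51).
# Node structure, role names, and fillers are illustrative.

class Node:
    def __init__(self, category, role=None, filler=None, children=None):
        self.category = category      # e.g. "V", "N"
        self.role = role              # theta role: "agent", "theme", ...
        self.filler = filler          # lexical terminal, if filled in
        self.children = children or []

    def terminals(self):
        """Return the filled-in terminals, left to right."""
        if not self.children:
            return [self.filler] if self.filler else []
        out = []
        for child in self.children:
            out.extend(child.terminals())
        return out

# (49): the lexical entry for 'want' -- argument slots still open.
want_entry = Node("V", children=[
    Node("N", role="agent"),
    Node("V", children=[Node("V", filler="want"),
                        Node("N", role="theme")]),
])

# (51): the two-word utterance 'want baby' is the same subtree
# with the theme terminal filled in.
want_baby = Node("V", children=[
    Node("N", role="agent"),
    Node("V", children=[Node("V", filler="want"),
                        Node("N", role="theme", filler="baby")]),
])

print(want_entry.terminals())  # ['want']
print(want_baby.terminals())   # ['want', 'baby']
```

On this rendering, the child's two-word utterance is literally the stored lexical object with a terminal filled in, rather than a phrasal combination of two words.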
Can other constructions be analyzed similarly? It appears so for the noun-
particle constructions (Andrew's word combinations), if we assume that a particle
subcategorizes for its subject.
(52) [P [N theme: boot] [P off]]
A different, interesting set is the collocations of a closed class referring element
and a following noun. These have often been called structures of nomination:
(53) that Dennis     it bang
     that doll       it checker
     that Tommy      etc.
     that truck
     etc.
A similar set involves a locative closed class subject, followed by various
predicative nouns.
(54) Steven's word combinations:
there ball here bed
there book here checker
there doggie here doll
etc. etc.
From the point of view of the current proposal, these examples are potentially
problematic on at least two grounds. First, we have been assuming, essentially
following Braine, that the initial structures were pivot-open (order irrelevant),
where this distinction is close to that of head-complement or subcategorizer-
subcategorized. The set of pivots (heads) is small compared to the set of
elements that they operate on. However, under normal assumptions, it is the
predicate of a simple copular sentence (which the sentences in (53) and (54)
correspond to) which is the head and subcategorizer, at least in the semantic
sense (though INFL may be the head of the whole clause).
(55) [S [NP John] [H is a fool]]
The data in (53) and (54), however, suggest the reverse. In all these cases it is
the subject which is closed class and fixed (the pivot), and the predicative
element freely varies. Trusting the acquisition data, we must then say that it is
the subject which is acting as the head or operator, operating on the open
element, for simple subject-copula-predicative phrase constructions, though
not generally:
(56) i) here __
ii) there __
iii) that __
iv) it __
This suggests in turn that simple sentences may differ with respect to their
semantic and syntactic headedness, depending on whether they are copular or
eventive, headed by a main verb predicate. In the latter structures (e.g. want X),
the main verb seems to act as the pivot or functor category; in the former (e.g.
here X, it X), the subject seems to act in a similar manner. This syntactic fact
from early speech is in accord with a semantic intuition: namely, that eventive
sentences are in some sense about the event that they designate, while copular
sentences are about the subject. Precisely how these semantic intuitions may be
syntactically represented, I leave open.
In effect, the structures in (56) are like small clause structures, without the
copula. In the literature, two sorts of small clauses are often distinguished:
simple small clauses and resultatives. The former are shown in (57a), the latter
in (57b).
(57) a. John ate the meat (PRO nude)
I consider that a man
b. I painted the barn black
I want him out
Interestingly, both of these structures are available in the acquisition data above,
but with different classifications of what counts as the pivot. Corresponding to
the simple small clauses are the structures of nomination in (58), where the
subject acts as pivot. Corresponding to the resultatives are the noun-particle
constructions given in (52).
(58) a. that __
        it __
        here __
     b. __ off
The former have their pivot (head) to the left: it is the subject. The latter have
their pivot (head) to the right. This suggests that, semantically, the subject
operates on (or syntactically licenses) the predicate in the constructions in
(58a) in the child grammar (and, ceteris paribus, in the adult grammar as well),
while the reverse occurs in (58b). This is additional evidence for the speculations
directly above about differences in argument structure in these types of sentences,
both at a broad semantic level, and syntactically.
2.3.3 Pre-Project-α representations and the Segmentation Problem
The section above provides a plausibility argument that the initial representations
are lexical in character, and that the pivot is to be identified with the head of a
Pre-Project-α representation, while the open element is a direct realization of the
position in the argument structure. I will consider here some additional data, and
a constraint relevant to that determination.
One aspect of the analytical side of the child's task of acquisition in early
speech must be the problem of segmentation. The child is faced with a string of
elements, largely undetermined as to category and identity, and must, from this,
segment the string sufficiently to label the elements. Some part of this task may
be achieved by hearing individual words in isolation (and thus escaping the
segmentation task altogether), but it is clear that the vast majority of the child's
vocabulary must be learned in syntactic context, through some sort of segmenta-
tion procedure.
It has often been assumed that the segmentation task requires extensive
reliance on phrase structure rules. Thus, given the phrase structure rule in (59),
and given the string in (60), with the verb analyzed but the NP object not yet
analyzed by the child, the application of the phrase structure rule (59) to the
partially labelled string in (60) would be sufficient to determine the category of
the object.
(59) VP → V NP
(60) V ?            →   V NP
     see the cat        see the cat
Similarly, given the phrase structure rule in (61), and the partially analyzed string
in (62), the identity of the second element may be determined.
(61) NP → Det N
(62) Det ?           →   Det N
     an albatross        an albatross
While these simple examples seem to allow for a perspicuous identification of
categories on a phrase structure basis, both empirical and theoretical complica-
tions arise with this solution when the problem is considered at a more detailed
level. The empirical problems are of two sorts: the indeterminacy of category
labelling, given the optionality of categories in the phrase structure rule, and the
mislabelling of categories, with potentially catastrophic results.
An example of the former can be seen very easily at both the VP and NP
level. A traditional, Jackendoffian expansion of the VP might be something like
the following:
(63) VP → V (NP) (NP) (PP) (S)
And that of the NP is given in (64).
(64) NP → (Det) (Adj) N (PP) (PP) (S)
Consider the child's task. He or she has isolated the verb in some construction.
A complement appears after it. What is the complement's type?
The expansion rule in (63) is nearly hopeless for this task. Suppose that the
child has a relatively complete VP expansion. This is then laid over the partially
segmented string:
(65) V (NP) (NP) (PP) (S)
see beavers
(66) beavers: NP?
PP?
S?
The result, given in (66), is nowhere near determinate. In fact, the unknown
element may be placed under any of the optional categories; this extreme
indeterminacy makes the outcome of the procedure quite unclear.
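The degree of indeterminacy can be made concrete with a small sketch, under the assumption that matching proceeds by laying the expansion over the partially analyzed string (the rule encoding and function name are mine, for illustration only):

```python
# Sketch of the indeterminacy in (65)-(66): matching one unknown
# complement against VP -> V (NP) (NP) (PP) (S).
# The rule encoding and the function name are illustrative.

VP_EXPANSION = [("V", False), ("NP", True), ("NP", True),
                ("PP", True), ("S", True)]   # (category, optional?)

def possible_categories(expansion, analyzed_prefix):
    """Categories the PS rule would allow for a single unknown
    element following the already-analyzed prefix."""
    rest = expansion[len(analyzed_prefix):]
    # any optional slot in the remainder could host the unknown element
    return sorted({cat for cat, optional in rest if optional})

# 'see beavers': the verb is analyzed, 'beavers' is not.
print(possible_categories(VP_EXPANSION, ["V"]))   # ['NP', 'PP', 'S']
```

The rule leaves three live candidates for a single unknown word, which is the indeterminacy at issue.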
The same holds for the nominal expansion. Suppose that the child hears the
following:
(67) American arrogance
This has two analyses under his/her grammar, perhaps with additional
specification of the determiner in (68b).
(68) a. ((American)_A (arrogance)_N)_NP
     b. ((American)_Det (arrogance)_N)_NP
Again, the category assigned by the phrase structure rule is indeterminate, and
perhaps misleading.
It may be objected at this point that the above criticism, while correct as far
as it goes, does not take account of the fact that additional evidence may be
brought to bear to lessen the indeterminacy of the analysis. For example, in (66),
the category of beavers may be initially placed as (NP, PP, S), but the choice
between these is drastically narrowed by the fact that beavers name things, and
the canonical structural realization of things is as NPs. However, while this
assertion is correct, it allows a mode of analysis which is too powerful, if the
phrase structure rule itself is still assumed to play a central role. If the child is
able to segment the substring beavers from the larger string, and if the child
knows that this labels a thing, and is able to know that things are NPs (in
general), then the category of beavers can itself be determined by this procedure,
without reference to the phrase structure analysis at all.
In fact, things are even more difficult for the phrase structure analysis than
has so far appeared. We have been assuming thus far that the child has a full
phrase structure analysis including all optional categories, and has applied that to
the input. Suppose that, as seems more likely, the initial phrase structure
grammar is not complete, but only has a subset of the categories. The VP, let us
say, has only the object position; the NP has a determiner, but no adjective. Let
us further assume that these categories have so far been analyzed as obligatory:
no intransitive verbs (or an insufficient number of them) have appeared in the
input, and similarly for determinerless nouns.
(69) VP → V NP
     NP → Det N
While it is obviously not the case that every child will have the phrase structure
rules in (69) at an early stage, the postulation of these rules would not seem out
of place for at least a subset of the children.
Consider now what happens when the child is faced with the following two
instances.
(70) a. go to Mommy
b. Tall men (are nice)
The application of the PS rules in (69) to the data in (70) gives the following
result:
(71) ((go)_V ((to)_D (Mommy)_N)_NP)
(72) ((tall)_D (men)_N)_NP
To is mislabelled as a determiner, and to Mommy as an NP; tall is also misla-
belled as a determiner. Of course, this misidentification of categories is not
recoverable on the basis of positive evidence (the category would just be
considered to be ambiguous); worse, the misidentification of a category would
not be local, but would insinuate itself throughout the grammar. Thus the
misidentification of to as a determiner might result in the following misidentifi-
cation of the true determiner category following it, and in the following misiden-
tification of the selection of other verbs:
(73) a. to this store
     b. ((to)_D (this)_A (store)_N)_NP
(74) ((talk) ((to)_D (Mary)_N)_NP)
It is precisely this indeterminacy of analysis, and these possibly misleading
analyses, which led Grimshaw (1981) and Pinker (1984) to adopt the notion that
canonical structural realizations play a crucial role in early grammars, with the
analytic role of phrase structure rules reduced. However, the Grimshaw/Pinker
analysis does not go far enough. For the later stages of grammars, the optionality
of elements in the phrase structure rule would make the output of analytic
operations involving them useless (giving multiple analyses), while in the early
stages, such analyses could be positively misleading, with erroneous analyses
triggering other misanalyses in the system. Let us therefore eliminate their role
entirely:
(75) No analytic operations occur at the phrase structure level.
A generalization such as (75) would be more or less expected, given Stowell's
(1981) elimination of phrase structure rules from the grammar. It is nonetheless
satisfying that the elimination of a component in the adult grammar is not paired
with its ghostly continuance in the acquisition system.
The segmentation problem, however, still remains. Here I would like to
adopt a proposal, in essence a universal constraint, from Lyn Frazier (personal
communication):
(76) Segmentation Constraint:
All analytic segmentation operations apply at the word level.
Let us suppose that (76) is correct. Suppose that the child hears a sentence like
the following:
(77) I saw the woman.
Suppose that the child has already identified the meaning of see, and thus the
lexical thematic structure associated with it. According to the discussion above,
this is a small lexical sub-tree.
(78) [V [(N) agent] [V V [N theme]]]
The subtree in (78) is applied to the input. In the resultant, the closed class
determiner, which causes the N to project up to the full maximal node (see
later discussion), drops out.
(79) [V [(N) agent] [V [V saw] [N theme: woman]]]
I have included the subject as optional in (78) partly for syntactic reasons (it is
by no means clear that the subject is part of the lexical representation in the
same sense as the object; see Hale and Keyser 1986a, 1986b), and partly because
of the acquisition data (the subject appears to be optional in child language, but
we may conceive of this as due to some property of the lexical entry, or the case
system, rather than due to the existence of pro-drop: see later discussion). The
dropping of the determiner element in the child's speech falls out immediately if
we assume that the child is speaking words (which contain simple X0-level
nominal constituents); this assumption is being carried over from the discussion
of the initial utterances. The isolation of the nominal category (woman, here)
occurs at the word or word-internal level.
Frazier's segmentation constraint therefore fits in well with the fact that the
early grammars drop determiner elements. If such elements are part of the
phrasal syntax (i.e. project up to the XP, or cause the head to so project), and if
such syntax is not available in these very early stages, then the dropping of the
determiner elements is precisely what would be expected. Further, the segmenta-
tion of early strings becomes a morphological operation, which is surely natural.
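A minimal sketch of this word-level analysis, assuming a hand-listed closed class vocabulary (the word list and function name are illustrative choices of mine, not a claim about the child's actual inventory):

```python
# Sketch of the analysis in (77)-(79): the lexical frame of the head
# is matched against the input; open class words fill its slots, and
# closed class material drops out of the retrieved representation.
# The closed class list and the function name are illustrative.

CLOSED_CLASS = {"the", "a", "an", "i", "he", "she", "it", "is", "was"}

def parse_with_frame(head, sentence):
    """Return the open class words recovered from the input, in order;
    the frame applies only if its head is present in the string."""
    words = sentence.lower().rstrip(".").split()
    assert head in words, "frame does not match this input"
    return [w for w in words if w not in CLOSED_CLASS]

print(parse_with_frame("saw", "I saw the woman."))  # ['saw', 'woman']
```

The residue of the parse (here the and the pronoun subject) is exactly the closed class material that telegraphic speech omits.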
2.3.4 The Initial Induction: Summary
I will discuss further aspects of the analytic problem, and the role that the open
class/closed class distinction may play in the identification of categories and in
the composition of phrase structure itself, below. For the moment, I would like
to summarize the differences between the proposals above and the Grimshaw/
Pinker proposals with which the chapter began, since the current comments
suggest a difference in orientation not only with respect to the building and
relaxation of phrase structure (the second part of the proposal), but with respect
to the nature of the generalization involved in argument-linking as well.
In the Grimshaw/Pinker system, the following holds:
i) The subject and object are identified by virtue of their associated thematic
relations, agent and patient; similarly for other thematic roles. This identifi-
cation is universal (the assumption of canonical grammatical-functional
realizations).
ii) Cross-linguistic variation in articulation of phrase structure follows from a
constraint on the phrase structure system: no crossing branches are allowed.
Given this constraint, and given the apparent permutation of subject and
object arguments, the phrase structure tree of the target language will be
flattened.
In the proposal above, the Projection Principle, in the form Project-α, is given
pride of place. In addition, it is claimed i) that lexical representations are
articulated (sub-)tree representations, whose information Project-α faithfully
represents, and ii) that very early representations are lexical in character. The
proposals corresponding to the Grimshaw/Pinker proposals would then take the
following form. First, the primitives subject and object would be replaced with
Williams' (1981) categorization of internal and external argument. This would be
present structurally in terms of the articulation of the phrase structure tree, with
the external argument more external than the internal argument, though perhaps
not external in Williams' sense of external to a maximal projection (at least at
D-structure; see Chapter 1, discussion of Kitagawa and Sportiche). The external/
internal distinction would also be present in the lexical sub-tree. Proposal i) of
Grimshaw/Pinker would therefore correspond to the following:
(80) Agents are more external than patients (universally).
If we assume (80) to hold both over lexical representations, and, by virtue of the
Projection Principle, over syntactic representations as well, then the child may,
upon isolation of elements carrying the thematic roles agent and patient,
determine the following sub-tree.
(81) [V [(N) agent] [V V [N patient]]]
This has the same content as the original proposal in terms of the articulation of
the initial trees, but eliminates the primitives subject and object from the system.
2.3.5 The Early Phrase Marker (continued)
I have suggested above that the initial grammar is lexical, in the sense of its
content being determined by X0 elements. In general, the phrasal system is
entered into at the point at which determiner elements, i.e. closed class specifier
elements, are met in the surface string. This general picture, together with the
fact that the lexicon and the syntax use the same type of formal representation,
namely tree-structures (acyclic, directed, labelled graphs, with precedence
defined), suggests that the demarcation between the lexicon and the syntax is not
as sharp as has sometimes been claimed.
Let me now expand a bit on the type of the representation, since the claim
that the child is still operating with lexical representations at (say) the two-word
stage is unnecessarily dramatic, though still significant. The actuality is that the
representation is pre-Project-α, and that more than one level characterizes
representations of this type.
Let us divide the pre-Project-α representation into two relevant levels (there
may be others, cf. Zubizarreta 1987; they do not concern us now). The first will
be called the lexical representation proper; the second is the thematic representa-
tion. Both of these are tree representations. The difference is that OC (open
class) lexical insertion has taken place in the latter, but not the former.
(82) Lexical representation:
        [V [N agent] [V [V hit] [N patient]]]
             |
             |  OC insertion
             v
     Thematic representation:
        [V [N agent: man] [V [V hit] [N patient: dog]]]
             |
             |  Project-α
             v
     [S [NP The man] [VP [V hit] [NP the dog]]]
In the sequence of posited mappings, open class lexical insertion (and perhaps
other operations) distinguishes the thematic representation from the lexical
representation. The rule Project-α projects the thematic representation into the
phrasal syntax. The claim that the child is speaking lexical representations
when he/she is speaking telegraphic speech may now be recast as: the child is
speaking thematic representations. This is a pre-Project-α representation, but does
not correspond to any single accepted level in current theories (e.g. GB or LFG).
It has rather an intermediate character between what are generally assumed to be
lexical and syntactic levels.
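The two mappings can be sketched as a pair of functions, using a flat dictionary in place of the tree structure (the encoding and the category-projection table are simplifying assumptions of mine):

```python
# Sketch of the mapping sequence in (82): OC insertion fills the open
# class slots of the lexical representation; Project-alpha maps the
# resulting thematic representation into phrasal categories.
# The dictionary encoding and the projection table are illustrative.

def oc_insertion(lexical, fillers):
    """Lexical representation -> thematic representation:
    pair each theta-marked slot with an open class word."""
    return {role: (cat, fillers.get(role)) for role, cat in lexical.items()}

def project_alpha(thematic):
    """Thematic representation -> phrasal syntax: each lexical
    category projects to its phrasal counterpart."""
    PROJECT = {"N": "NP", "V": "VP", "A": "AP", "P": "PP"}
    return {role: (PROJECT[cat], word)
            for role, (cat, word) in thematic.items()}

# hit: agent and patient slots around a verbal head.
lexical = {"agent": "N", "head": "V", "patient": "N"}
thematic = oc_insertion(lexical,
                        {"agent": "man", "head": "hit", "patient": "dog"})
phrasal = project_alpha(thematic)
print(thematic)  # theta slots now carry open class words
print(phrasal)   # the same slots, projected to NP/VP
```

Telegraphic speech, on this view, corresponds to uttering the intermediate value (the thematic representation) before Project-α has applied.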
Let us turn now back to the segmentation problem. We may simply view
this as the determination of the set of primitives that the child employs at PF
(where the segmentation occurs).
One obvious fact about acquisition is that at the stage where the child
commands telegraphic speech, he or she is able to isolate out the open class parts
of the vocabulary from parental speech. Assuming that Motherese does not play
a crucial role in this process (though it may speed it up; Gleitman, Gleitman, and
Newport 1977), we may view this as involving the child parsing the initial full
sentence which is heard with a lexical representation of the head, a tree with
precedence and externality/internality defined, giving rise to the thematic
representation. For the case of (83), this would be the following:
(83) Lexical representation: [V N [V [V see] N]]
     Input: The bureaucrat saw the monster
     Retrieved representation: (V bureaucrat (V see (N monster)))
The representation which would be applied to the input would be the lexical
representation in the sense suggested above: a headed subtree, with precedence
and externality structurally defined. The non-head nodes correspond to lexical,
not phrasal, categories.
As such, only part of the input string would be analyzed: that corresponding
to the full lexical categories. Thus there is no need for a separate parsing
structure here apart from that already implicated in the grammar itself: the
parsing is done with a representation which is part of the grammar, the lexical
representation (and it returns the thematic representation).
A second example is provided by the structures of nomination. These are
given in (84), where, as already noted above, the subject appears to act as a
fixed functor category, taking a series of predicates as arguments. These occur
in copular constructions, not elsewhere.
(84) a. that ball
that ___
b. it mouse
it ___
These may be formally represented as involving a tree structure with a fixed
subject and an open predicate, applied successively to different inputs. The
resultant is a partially analyzed string.
(85) [[N that] V]                    (lexical/thematic representation)
(86) that ball is a …   →   [[N that] [N ball]]
The copular element and the determiner drop out of the analyzed representation;
as in the case of the headed eventive structures, it is the lexical/thematic
structures themselves which parse the string.
In the adult grammar, these initial representations are licensing conditions:
the that __ above licenses the predicate that follows it, at the thematic level of
representation (the thematic representation exists in the adult grammar as well).
This mode of analysis suggests that the complexity of initial structures
should be viewed as a function of the complexity of the primitive units, and of
the relations between them. As such they may be used to gain knowledge of
what the primitive units are. As noted by Bloom (1970), children's ability to
construct sentences does not proceed sequentially, as the following hypothetical
paradigm might suggest:
bridge
big bridge
a big bridge
build a big bridge
This suggests that the child's grammar, while built of more primitive units, is not
built purely bottom-up; see also the discussion in the next chapter. More natural
sequences are those given below:
(87) I see
see bridge
see big bridge
(88) like Mary
I like Mary
These sequences tend to isolate the government relation, and then add elements
to it (e.g. in (87)). One prediction of the account above would be that in the two-
word stage, the government relation, and perhaps other types of licensing
relations (e.g. modification), would be paramount; two-word utterances not
evidencing these relations would be relatively rare. That this is so is hardly news
in the acquisition literature (see Bloom 1970; Brown 1973, and a host of others),
but the difficulty has been to model this fact in grammatical theory. Phrase
structure representations do not do well; the idea of primitive licensing relations,
and compounds of these (see also Chomsky 1975 [1955]), would do much better.
This would give one notion of the complexity of the initial phrase marker.
Another point of reference would be the differences in complexity between
subjects and objects. Much early work in acquisition (by Bloom and others)
suggested that the initial subjects were different from objects in category type:
less articulated, and much less likely to be introduced with a determiner or
modifier. This was encoded in Bloom (1970) by introducing subjects with the
dominating node Nom rather than NP in the phrase structure rule, since it
showed different distributional characteristics. This is shown in (89).
(89) S → Nom VP
     VP → V NP
If there is a distinction like this (though see the critical discussion in Pinker
1984), then it would follow quite naturally from the sort of theory discussed in
this chapter. Given a rule of Project-α, one might ask: is there an ordering in
how arguments are projected? In particular, are arguments projected simulta-
neously, or one by one? Projection here would be close to the inverse of the
operation of argument-taking.
The Bloom data suggest an answer, assuming, as always, a real conver-
gence between the ordering of stages in acquisition and the ordering of opera-
tions in the completed grammar. Suppose the following holds:
(90) The projection of internal arguments applies prior to the projection
of external arguments (in both the completed grammar and in
acquisition).
Then the Bloom data and the difference between the subject and object would
follow. I leave this open.
Let us turn back to the segmentation problem (which may be simply
identical to determining the set of primitive structures at PF). I suggested earlier
that the parsing of initial strings was done by application of the thematic
representation, a pre-Project-α representation, to the phrasal input.
(91) [V [N bureaucrat] [V [V saw] [N monster]]]
     applied to: the bureaucrat saw the monster
     Returns: see bureaucrat monster
This returns the open class representation given in (91). Ultimately, the category
of the determiner (and its existence) must be determined as well. I suggested
above that all segmentation takes place at the lexical level. This means that for
the determiner to be segmented out, it must first be analyzed as part of a single
word with the noun that it specifies. This means that one of two representations
of it must hold.
(92) a. [N [Det the] [N man]]     (at PF)
  or b. [Det [Det the] [N man]]   (at PF)
For the time being, I will assume the first of these representations, the more
traditional one. This goes against the position of Abney (1987a), for example.
There is a reason for this, however. With the notion that the noun is the head of
the NP, we are able to keep a coherent semantic characterization of the specifier
element (in general): it contains closed class elements, and not other elements.
The notion that all elements are projecting heads in the NP no longer allows us
to keep this semantic characterization.
The notion of functor category that I am using here, deriving from Braine's
work on pivots, is different from that of Abney (1987a) and Fukui and Speas
(1986), and different as well from the notion of functor in categorial grammar.
I have taken the following categories as functors, in the child grammar and
presumably in the adult grammar as well: verbs in eventive structures, preposi-
tions in constructions like boot off, and the (deictic) subject in sentences like
that ball. This is clearly not a single lexical class, nor is it all closed class
elements. It is closer to the notion of governing head, but here as well differenc-
es appear: that in that ball would not normally be taken as a governing head.
Let us define a functor or pivot as follows:
(93) G is a functor or pivot iff there is a lexical representation containing
G and an open variable position.
As such, the extension of the term functor is an empirical matter, to be deter-
mined by the data. In these terms, there is some reason to think that closed class
determiners are not functors or pivots in the sense of (93). While the child does
exhibit replacement sequences like those in (94) (with the verbal head taking the
nominal complement), he/she does not exhibit structures like those in (95).
(94) see ball
see man
see big ball
(95) the ball (not present in output data)
the man
the table
The presence of the former type of sequences may be viewed as a result of the
child inserting a nominal into the open position in the lexical frame in (96).
(96) [V [V see] N]
The fact that the latter does not occur may be taken to suggest that there is no
comparable stored representation such as (97) in the grammar.
(97) [Det or N [Det the] N]
Note that a pragmatic explanation (that such lists are unnecessary for the child
in the cases in (97)) would be incorrect. The child needs to identify the set of
cardinal nouns in his language, and the formation in (97) would be a way to do it.
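Definition (93) can be rendered as a predicate over a stored inventory of frames. The inventory below is an illustrative stand-in of my own, chosen to reflect the data in (94)-(95); it is not a claim about the actual contents of any child's lexicon.

```python
# Sketch of definition (93): G is a functor (pivot) iff some stored
# lexical representation contains G together with an open variable
# position.  The frame inventory is illustrative.

# Each frame: (fixed elements, number of open variable positions).
FRAMES = [
    ({"want"}, 1),   # want __   (eventive head as pivot)
    ({"see"}, 1),    # see __
    ({"off"}, 1),    # __ off    (particle as pivot)
    ({"that"}, 1),   # that __   (deictic subject as pivot)
    ({"here"}, 1),   # here __
    # no frame 'the __': determiners are not stored as pivots, per (95)
]

def is_pivot(g, frames=FRAMES):
    return any(g in fixed and open_slots > 0 for fixed, open_slots in frames)

print(is_pivot("see"))   # True
print(is_pivot("that"))  # True
print(is_pivot("the"))   # False
```

The empirical content of (93) then reduces to which frames are in fact stored: see and that qualify, the does not.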
Curiously, the phenomenon of lexical selection, and its direction, may have
some light shed on it by second language acquisition, particularly by the format
of grammatical drills. A common approach is to retain a single head, and vary
the complement.
(98) a. (ich) sah den Mann. (verb repeated, object alternated)
        I saw the man.
     b. (ich) sah das Mädchen.
        I saw the girl.
     c. (ich) sah die Frau.
        I saw the woman.
     etc.
Much less common would be lists in which the object is held constant and the
complement-taking head varied.
(99) a. (ich) sah den Mann. (object repeated, verb alternated)
        I saw the man.
     b. (ich) tötete den Mann.
        I killed the man.
     c. (ich) küsste den Mann.
        I kissed the man.
     etc.
This may be viewed as a means of inducing the head-centered thematic represen-
tations suggested above, while replacing the complement with a variable term.
The induced representation for (98) is given in (100):
(100) [V [V sah] N]
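The induction that such drills invite can be sketched as follows (the tokenization into a head slot and a complement slot is assumed as given, and the function name is mine):

```python
# Sketch of the induction behind (98)-(100): across drill sentences
# that hold one position constant and vary the other, the learner keeps
# the invariant word as the fixed element of the frame and replaces the
# varying position with a variable.  Encoding choices are illustrative.

def induce_frame(pairs):
    """Given (head_slot, complement_slot) pairs, return the induced
    frame: the constant element plus a variable '_', or None if
    neither position is held constant."""
    heads = {h for h, _ in pairs}
    comps = {c for _, c in pairs}
    if len(heads) == 1:
        return (heads.pop(), "_")   # head fixed, complement open: (100)
    if len(comps) == 1:
        return ("_", comps.pop())   # the rarer inverse drill, as in (99)
    return None

drill_98 = [("sah", "den Mann"), ("sah", "das Maedchen"),
            ("sah", "die Frau")]
print(induce_frame(drill_98))  # ('sah', '_')
```

Applied to drill (98), the procedure returns the verb-headed frame of (100); the rarity of the inverse drill type corresponds to the rarity of object-headed frames.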
Note that this answers the question asked in Aspects (Chomsky 1965): should the
subcategorization frame be a piece of information specified with the verbal head,
or should the subcategorization (e.g. __ NP) be freely generated with the
complement, and a context-sensitive rule govern the insertion of the verb? Only
the former device would be appropriate, given the data here. (The Projection
Principle derives the result from first principles.)
Another type of grammatical drill is one in which the subject is kept
constant, and the copular predicate varied:
(101) a. Der Tisch ist grau. (subject constant, copular predicate varied)
         the table is gray.
      b. Der Tisch ist blau.
         the table is blue.
      c. Der Tisch ist grün.
         the table is green.
Much less common is one in which the subject of an intransitive is kept constant
and the verb varied:
(102) a. Der Mann schläft.
         the man sleeps
      b. Der Mann tötet.
         the man kills
      etc.
This suggests again that there is a difference in thematic headedness, with the
simple copular predication structure being about the subject, and forming a
nominal-variable structure like that in (103), while the eventive intransitive does
not form a subject-headed structure (104).
(103) [N [N Der Tisch] A]
(104) [V [N Der Mann] [V V]]
Finally, we note that lists in which the determiner is held constant and the
nominal varied are not common at all.
(105) der Mann
der ___
etc.
This suggests, again, that these do not form the sort of headed lexical structures
(with the determiner as head) noted above. The data from second language drills
and first language acquisition (first stages) are therefore much in parity.
While it is true that second language learning grammars have the disadvan-
tage of attempting to teach language by direct tuition, the particular area of
language drills is not one for which a (possibly misleading) prescriptive tradition
exists. Rather, linguistic intuitions are tapped. It is therefore striking that the
variable positions in such drills correspond to what may be viewed as the open
position in a functor-open analysis (or governed-governee in a broad sense).
I have suggested, then, the following: there is a set of lexically centered
frames which are applied by children to the input data; the head or functor
element corresponds to a fixed lexical item. These are in addition part of the
adult grammatical competence. Let us return now to the question of how closed
class elements are analyzed in the initial parse.
2.3.6 From the Lexical to the Phrasal Syntax
The result of applying the verbal lexical entry to the full phrase structure string
is an instance of parsed telegraphic speech.
(106) [V [N The bureaucrat] [V [V saw] [N the monster]]]
The residue of this analysis consists of the closed class elements: the subject
determiner, Infl, and the object determiner. In the earliest speech, these are simply
filtered out. At a later stage, these are not part of active speech, but are understood
at least partly in the input, before being mastered. The acquisition problem is
therefore the following: how, and when, are these elements incorporated into the
phrase marker?
I have suggested already the first step: the lexical representation of the head
(see, in this case) is projected over the entire structure, parsing the open class
part of the structure. The problem with the closed class elements is that they
must be analyzed in the phrase marker antecedent to their identification as
markers of categories of given types (this would follow if telegraphic speech
really is a type of speech, and shows the primary identification of the open class
heads). I will assume at this point a uniform X⁰ for all the closed class elements.
That is, they must be attached first on simply structural grounds. Let us attach
determiners and closed class elements into the structure as follows.
(107) a. Segment the independent closed class elements.
      b. Identify the Principal Branching Direction of the language (Lust
         and Mangione 1984; Lust 1986).
      c. Attach each closed class element in accordance with the Principal
         Branching Direction (and the already established segmentation).
Proceeding from right to left, the child would, apart from the strictures in (107),
be allowed either of two attachments for the, actually associated with the
object (assuming binary branching as a first approximation):
(108) Two possible attachments:
      a. Right attachment: [V [N The bureaucrat] [V [V saw] [the [N monster]]]]
      b. Left attachment: [V [N The bureaucrat] [V [[V saw] the] [N monster]]]
Given (107), only the structure in (108a) would be the appropriate one.
Three questions arise: How is the Principal Branching Direction determined by
the child? Why is the parsing done right-to-left? Why are the elements only
added in a binary-branching fashion?
Three answers: 1) The Principal Branching Direction is either given to the
child on phonological grounds, or is perhaps identified with the branching
direction given by the transitive verb (with the internal/external division). 2)
Parsing of the closed class elements is done in the direction opposite to the
Principal Branching Direction (so, right-to-left in English). 3) While it need not
be the case that all structures in the language are binary branching, I will assume
that there is necessarily binary branching in one of the following two situations:
a) where the selecting category is a functional head in the sense of Abney
(1987a, b); b) possibly, in cases in which the language is uniformly right-headed
(as in the case of Japanese; Hoji 1983, 1985).
Continuing with the parse, the left-peripheral element is arrived at. In line
with (107c), this would give the following (erroneous) parse (if the were attached
low, there would be a left branch).
(109) [V the [V [N bureaucrat] [V [V saw] [N the monster]]]]
Let us exclude this in principle, in the following way:
(110) a. α is extra-syntactic iff it is an unincorporated element on the
         periphery of the domain, on the border opposite from that with
         which the parsing has begun.
      b. Extra-syntactic elements need not be parsed in the initial parsing.
The result of (107) together with (110) would therefore be the following, with
the initial the extrasyntactic.
(111) [V [N (The) bureaucrat] [V [V saw] [N the monster]]]
The parentheses indicate extra-syntacticity.
The tree in (111) must now be labelled. I have suggested earlier that the
closed class elements, while segmented, have not been labelled for distinguishing
form class. Let us therefore label them all X⁰, or X_CC (X closed class). In addition,
there is the question of what the constituent is that is composed of an unspecified
X⁰ and an N. Let us adopt the following labeling conventions or principles:
(112) Labelling:
a. If a major category (N, V, A) is grouped with a minor element,
the resultant is part of the phrasal syntax.
b. The major category projects over an unspecified minor category, or
c. The adjoined category never projects up.
d. No lexical category may dominate a phrasal category.
(112a) is the labelling principle corresponding to the suggestion above that it is
precisely when the closed class determiner elements are added that the phrasal
syntax is entered into: this part of the parsing procedure corresponds, then, to the
rule of Project-α. (112d) is an unexceptional assumption of much current work
(though problems potentially arise for the analysis of idioms). (112b) is more
tentatively assumed than the others; see later discussion. None of these are
intended to be special parsing principles; all are simply part of the grammar.
The addition of these elements to the phrase marker stretches it, rather than
building it up. The attachment operations are roughly equivalent to the sort of
transformations envisioned in Joshi, Levy, and Takahashi (1975), Kroch and
Joshi (1985), Joshi (1985).
The resultant of applying (112) is the following tree:
(113) [VP [N bureaucrat] [VP [V saw] [NP [X⁰ the] [N monster]]]]
The second the is labelled X⁰ and attached to a constituent with the following
noun. By (112a) this forms a phrasal category; by (112b) or (c) this phrasal
category is an NP. Since a lexical category may not dominate a phrasal category
((112d)), the type of the category dominating the verb and the following elements
is not a V, but a VP. By (112c), though not (112b), the entire category would be
labelled VP. Finally, the initial the is still not incorporated.
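The attachment and labelling steps just walked through, (107), (110), and (112), can be sketched as a toy right-to-left parser. This is a minimal illustration of my own, not part of the theory proper: the function name, the tuple encoding of trees, and the special case for V (which simply hard-codes the fact that the verb is the thematic head of the clause, as given by the lexical projection in (106)) are all assumptions made for concreteness.

```python
def attach(tokens):
    """Toy sketch of (107)/(110)/(112): scan right-to-left (opposite the
    rightward Principal Branching Direction of English), attaching each
    token into a binary-branching tree. Tokens are (word, category) pairs;
    unanalyzed closed class elements carry the category "X0"."""
    tree, extras = None, []
    for i in range(len(tokens) - 1, -1, -1):
        word, cat = tokens[i]
        if cat == "X0":
            # (110): an unincorporated element on the border opposite the
            # starting edge of the parse is extra-syntactic; leave it out.
            if i == 0:
                extras.append(word)
                continue
            # (112a,b): an X0 grouped with a major category yields a
            # phrasal projection of that category (e.g. X0 + N -> NP).
            label = tree[0] if tree[0].endswith("P") else tree[0] + "P"
            tree = (label, [("X0", word), tree])
        elif tree is None:
            tree = (cat, word)
        elif cat == "V":
            # (112d): a lexical category may not dominate a phrasal one,
            # so the verb projects to VP over its phrasal complement.
            tree = ("VP", [(cat, word), tree])
        else:
            # (112c): the adjoined category (here the subject N) never
            # projects up; the existing phrasal label is kept.
            tree = (tree[0], [(cat, word), tree])
    return tree, extras
```

Run on The bureaucrat saw the monster, this yields the structure in (113), with the initial the left unattached as in (111).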
This initial representation in fact makes a prediction: that the child may
initially know that there is a slot before the head noun, and in other places, but
have this slot unspecified as to content and category. Interestingly, in production
data, it often appears that the overt competence of closed class determiner items,
and other sorts of items, is preceded by a stage in which a simple schwa appears
in the position in which the determiner would be due. This seems to be the case
(from Bloom 1970).
(114) a. This ə slide.
      b. That ə baby.
      c. ə take ə nap.
      d. ə put ə baby.
      e. Here comes ə chine.
More strikingly, however, the schwa appears in two other places: in a preverbal
(or perhaps copular) position, and in a position at the beginning of the sentence
(from Bloom 1970).
(115) a. This ə tiger books. (could be determiner or copula)
      b. This ə slide. (could be determiner or copula)
      c. ə pull. (sentence initial or subject)
      d. ə pull hat. (sentence initial or subject)
      e. ə write. (sentence initial or subject)
      f. ə sit. (sentence initial or subject)
      g. ə fit. (sentence initial or subject)
Bloom makes the following comment (pp. 74-75):

The transformation for placing the /ə/ in preverbal position accounted for the
31 occurrences of /ə/ before verbs (for example, "ə try," "ə see ball") and
for the fact that the schwa was not affected by the operation of the reduction
transformation. The preverbal occurrence of /ə/ was observed in the texts of all
three children and deserves consideration particularly since such occurrence
precluded the specification of /ə/ in the texts as an article or determiner
exclusively.
The problem of ordering the /ə/ placement transformation in relation to the
reduction rules [the set of deletion rules that Bloom assumes deriving the
child's phrase marker from a relatively more expansive adult-like one, D. L.]
has not been solved satisfactorily. The order in which they appear in the
grammar was chosen because /ə/ was not affected by reduction; that is, it was
not deleted with increased complexity within the sentence, and its occurrence
did not appear to operate to reduce the string in the way that a preverbal noun
(as sentence-subject) did. This order is attractive because /ə/ can be viewed as
standing in for the deleted element and appears to justify the /ə/ as a
grammatical place holder.
The reduction transformation that Bloom posits is of course not carried over into
the present theory. Interestingly, the set of places in which the schwa occurs is
precisely where one would expect the X⁰ categories to appear in the current
theory.
Three questions again arise: How and when is the sentence-initial the
incorporated into the representation? What about the inflection on the verb
(children's initial verbs are uninflected)? And, crucially, what constitutes the
domain over which extra-syntacticity is defined? The second of these questions
also has repercussions for what the category of the entire resultant is (VP or IP).
For the first of these questions, let us assume that the answer is the same as
that given in phonology for extrametricality:
If α is extrasyntactic, group it locally with the most nearly adjacent category.
This would give rise to the initial the being grouped, correctly, in the initial
NP. The question of Infl is more complex, and partly hangs on the domain of
extrasyntacticity. Let us assume, as I believe we must, that the child initially is
able to analyze the verb as composed of the verb and some additional element.
I will assume that this additional element is simply analyzed as an X⁰, i.e., an
additional closed class element of uncertain category, and is simply factored out
of the produced representation. That is, the child's telegraphic speech consists of
bare infinitival stems (or the citation form), not of correct (or incorrect!) inflected
verbs. If we assume that Infl is also adjoined to the VP in accordance with the
Principal Branching Direction of the language, we arrive at a representation which
is, except for categoricity, identical to the representation of the phrase marker
given (for example) in Chomsky (1986). This corresponds, presumably, to the
DS representation.
On the other hand, if we assume that the domain of extrasyntacticity is the
VP, as well as the S, and that Infl is therefore regarded as extrasyntactic in its
domain, its attachment site would be low, with the verb. This would correspond
to the representation after affix-hopping, in the standard theory (though see,
again, Chomsky 1986).
(116) a. Placement of Infl, if S is the only extrasyntactic node:
         [S [NP [X⁰ The] [N bureaucrat]] [VP [X⁰ Infl] [VP [V see] [NP [X⁰ the] [N monster]]]]]
      b. Placement of Infl, if S and VP are extrasyntactic nodes:
         [S [NP [X⁰ The] [N bureaucrat]] [VP [V [X⁰ Infl] [V see]] [NP [X⁰ the] [N monster]]]]
These representations correspond not only to two different representations of the
phrase marker, but to what might be assumed to be two different levels of the
phrase marker: at DS, Infl takes scope over the entire VP, while at PF, it is
attached to the verb. This process itself may be simply a subcase of a more
general process in the grammar where a closed class element appears to take
wider scope semantically than it does at PF: in morphology, where constructions
like transformational grammarian have the -ian take wide scope semantically,
but comparatively narrow scope syntactically; and perhaps relative clause
constructions, where the determiner must appear outside the N′ or N″ for a
perspicacious construal of scope relations (the Det-Nom analysis), while for
other purposes it appears that the NP-S analysis may be preferable (where the
determiner has been lowered).
With respect to Infl, in earlier work (Lebeaux 1987) I took the position that
the initial Infl was adjoined to V, hence did not govern the subject position, and
hence the latter could stay unrealized in early speech. This was in contradistinction
to Hyams's analysis of early pro-drop (Hyams 1985, 1986, 1987). The present
representation suggests other alternatives for the early representation of subjects.
One alternative, which I will simply mention without taking a position on: that
in the early grammar, the subject position is, initially, purely a thematic position
(a pure instantiation of theta relations). It might be assumed that such positions,
if external, need not be realized on purely thematic grounds. This would account
for the non-obligatoriness of deep structure subjects in passive constructions;
the actual obligatoriness of subjects in active constructions would then have to
be due to some other principle.
More strikingly, extra-syntacticity may give an account for the placement of
early subjects. I suggested, again in Lebeaux (1987), that early pronominal
subjects may not be in subject position, but rather adjoined to the verb, as verbal
clitics (see also Vainikka 1985). This would follow if such elements were extra-
syntactic, and adjoined low, late.
(117) [VP [V [N my] [V did]] [NP [N it]]]
Other elements, in particular the agreement elements in Italian noted by Hyams
(1985, 1986, 1987) in her discussion of the early phrase marker, would presumably
have the same analysis, and hence not be counterexamples to the thesis of
this chapter (see also Chapter 4).
In the representation above, I have begun with a thematic representation,
i.e., a projection of the pure thematic argument structure of the head, and then
incorporated the closed class determiner elements by stretching the representation
to incorporate them, rather than building strictly bottom-up. This is what the
operation corresponds to in the analytic mode. In the derivational mode, this
corresponds to the basic rule of Project-α: these are simply aspects of the same
operation. This is the means by which the phrasal syntax is entered into, from
the lexical syntax. Since the closed class elements are not initially analyzed as to
category, and their attachment direction is also not given, these choices must
initially be made blindly, in accordance with the Principal Branching Direction
of the language (or perhaps the governing direction of eventive verbs), and the
projection of categorial information. Crucial as well was the notion of extra-
syntacticity at the periphery of a domain: it was this, and only this, which
allowed the appropriate attachment of the initial determiner.
One might wonder how far these initial assumptions would go in building
the phrase marker by the child, and how they are revised. For example, a
sentence like that in (118a) would have by the child an initial (telegraphic)
analysis like that in (118b).
(118) a. Mommy is on the roof.
      b. [NP [N Mommy] [N-loc [X⁰ is] [X⁰ on] [NP [X⁰ the] [N roof]]]]
The representation in (118b), while correct so far as constituency is concerned,
is obviously incorrect so far as categoricity: the PP is misanalyzed, and worse,
given the uniform analysis of the closed class elements as all being of the form
X⁰, one would expect, wrongly, that they would freely permute in the child
grammar.
One might of course immediately reply that while the initial grouping is
given by the X⁰ term, the more specific categoricity is determined later, far prior
to the point at which the closed class determiner elements play a role in the
grammar, and thus that the representation in (118b) should not play any formal
role in the analysis, being merely a stepping-off stage: in the child grammar, as
well as in the adult. Nonetheless, it has shed light before, and will continue to
do so in the rest of this work, to suppose that the child's representations have
a real status both in themselves, and in their relations to the adult grammar. The
question would then be what representations of the form in (118b) would mean.
2.3.7 Licensing of Determiners
One advantageous aspect of the analysis in (118) is that it supposes the closed
class elements to be of a uniform category. While this is false with respect to
distributional characteristics in the syntax, it may well be close to true in the
phonology: i.e., at PF. To the extent to which the acquisition system in its analytic
mode is operating on PF structures (see Chapter 5), this assumption is correct.
A deeper question, with respect to the format of the adult grammar, is the
following. In the traditional phrase structure rule approach (Chomsky 1965),
categories were licensed in relation to their mother node, via the phrase
structure rules.
(119) VP → V NP
The rewrite rule in (119) may be taken to mean that the node VP licenses the
two dominated categories, V and NP, in that order. In Chomsky (1981) this view
is revised in a number of ways, most particularly by assuming that the basic
complement is a projection of the lexical entry (of heads), and the element is
licensed in that way.
This leaves open a number of licensing possibilities for those elements
which are not projections of a head: for example, adjuncts and determiners (or
specifiers). The situation with adjuncts is discussed in the following chapter.
Insofar as the current literature is concerned (see especially the suggestive and
important work of Fukui and Speas 1986; Abney 1987a), a particular position has
been taken with respect to the specifier-of relation. Namely, in the case of
nominals at least, these have been taken to be a projection of the determiner,
with the determiner taking the (old) Noun Phrase as a complement.
(120) DetP → Det NP
This answers, though obliquely, a particular question that might be raised
(namely, what is the relation between the specifier and the head, or head plus
complement?), and it answers it by assuming that that relation is modelled in
the grammar in essentially the same way that a simple head (e.g. a verbal
head) takes its complement. In essence, both are types of complementation, the
head projecting up.
I would like to suggest here that the specifier-of relation should be modelled
in a way which is different from complementation, as has already been suggested
by the treatment of the closed class categories in acquisition above. This involves
(at least) two specifications: i) what is the projecting head?, and ii) how is the
licensing done? The projecting head is the major category. The licensing is done
in the following way:
(121) a. Projection: Let M, a major category, project up to any of the
         bar levels associated with it (for concreteness, I will assume
         three: thus, N, N′, N″, N‴).
      b. Attachment: Attach X⁰ to the appropriate bar level.
It is the attachment rule here which is distinguished from the type of licensing
which occurs for direct complements. Categories traditionally analyzed as
determiners would therefore have different attachment sites associated with
them. If Jackendoff (1977) is correct, and definite and indefinite determiners
differ in their attachment site in English, the former being N‴ and the latter
being N″, then their lexical specifications would be the following.
(122) a. the: X⁰, (X⁰, N‴)
      b. a: X⁰, (X⁰, N″)
The notation on the left is the category; the notation on the right is simply the
structural definition of the element, in terms of the mother-daughter relation (cf.
the structural definition of grammatical relations in Aspects). It was suggested in
earlier work of mine (Lebeaux 1987; see also Lebeaux 1985) that NP elements
may be licensed in more than one way: via their relation to sister elements,
which is associated with what is called structural case in Chomsky (1981), and
via their relation to their mother nodes. An example of the latter would be the
licensing of the genitive subject of nominals in English: that is licensed via its
relation (NP, NP). In that article, I argued that this differential type of licensing
was associated with a different type of case assignment, namely what I called
phrase-structural case, and that this is associated with a different type of theta
assignment as well (where the NP genitive element is not given its theta role by
the N′ following it, but rather a variable relation, relation R, is supplied at the
full NP node, relating both to each other; see Lebeaux 1985).
By adopting the format in (122) for closed class determiner elements, and
other closed class elements, I am adopting the position that these elements are
licensed not via their relation to the N with which they are associated, but rather
via their relation to the mother element (the N″ or N‴ or whatever). The
category which they close off (the variable of) is the one which they are licensed
by. They neither license the following N nor are licensed by it: that is, a
complement-of type relation with respect to licensing is not adopted in either
direction. Rather, a different type of licensing relation, that of the element to the
mother node, is adopted.
By supposing that the NP-genitive and the determiner are both licensed in
a common way, i.e. via the structural condition on insertion with respect to the
mother node, an intuition behind the position of Chomsky (1970) is perhaps
explained. In that work, Chomsky supposed that the genitive element was
generated directly under the specifier, in the same position as the closed class
determiner. This is not obviously correct. Nonetheless, if the above is correct,
these elements are licensed in a common way.
2.3.8 Submaximal Projections
As the foregoing suggests, categories may be built up not simply to the maximal
projection, but to what might be called submaximal projections. By a submaximal
projection, I mean a projection which is maximal within its phrase structure
context (i.e., its dominating category is of a different type), but which is not the
maximal projection allowed for that category in the grammar. For nominals, for
example, assuming provisionally the three-bar system of Jackendoff (1977)
(nothing hangs on the number of bar levels here), a nominal might be built up to
simply N, or to N′, or to N″, or to N‴: N‴ would be maximal for the grammar,
the others would be submaximal (though perhaps maximal for that particular
instantiation). So far as I know, the syntactic proposal was first made in an
unpublished paper by myself (Lebeaux 1982); it had earlier been suggested in
acquisition work by Pinker and Lebeaux (1982) and Pinker (1984), and has
recently been independently formulated, in a rather different way, by Abney,
Fukui, and Speas. Though some of the terminology and many of the concerns
will be similar to those of Abney, Fukui, and Speas, there are differences both
of principle and detail; I will build here on my own work (Lebeaux 1982, 1987).
Some consequences of assuming submaximal projections are the following:
I. Subcategorization is not in terms of a particular phrasal category (e.g. N‴), but
rather in terms of a category of a given type, e.g. a nominal category N^x, where
x may vary over bar levels.
II. Crucially, the definite/indefinite contrast may be gotten without reference to
features, or to percolation from the nonhead node (which would be necessary
otherwise if one assumes that N is the head of the full category NP).
This is done as follows: the nominal is built up as far as necessary.
Jackendoff (1977) argues quite forcefully that the definite determiner is attached
under N‴ in English, while the indefinite is attached under N″. The two
representations are therefore the following.
(123) [N″ [Det a] [N′ [N picture] [PP of Mary]]]
(124) [N‴ [Det the] [N″ [N′ [N picture] [PP of Mary]]]]
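The two attachment sites can be made concrete with a small illustrative encoding of my own (the function names and the tuple representation of trees are assumptions, not the author's notation): the determiner fixes the level to which the nominal projects, and the semantics then reads definiteness off the maximal projected node rather than off a +/−definite feature.

```python
# Bar level whose daughter the determiner is, per the Jackendoff (1977)
# attachment sites adopted in the text: "the" under N''', "a" under N''.
DET_LEVEL = {"the": 3, "a": 2}

def project_nominal(det, n_head):
    """Build the nominal up exactly as far as the determiner requires,
    then attach the determiner under the maximal projected node."""
    node = ("N", n_head)                       # the lexical head, N
    for level in range(1, DET_LEVEL[det] + 1):
        node = ("N" + "'" * level, [node])     # project one bar level up
    node[1].insert(0, ("Det", det))            # Det as daughter of the top node
    return node

def is_definite(nominal):
    # Definiteness is encoded structurally: only an N''' nominal is definite.
    return nominal[0] == "N'''"
```

So project_nominal("a", "picture") tops out at N″ and project_nominal("the", "picture") at N‴, mirroring (123) and (124) (minus the PP complement); no feature is consulted.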
Thus, the semantics need not refer to a feature such as +/−definite: this is
encoded in the representation. Other types of processes sensitive to the definite/
indefinite contrast, e.g., there-insertion in English and dative shift in German,
would presumably refer to this same structural difference.
III. Other types of binding and bounding processes may refer to the maximal
projection, rather than to the presence or absence of the definite determiner.
Here, the relevant contrasts are those in (125) and (126):
(125) a. Who would you like a picture of t?
      b. Who do you like pictures of t?
      c. *Who do you like the picture of t?
(126) a. Every man_i thinks that a picture of him_i would be nice.
      b. Every man_i thinks that pictures of him_i would be nice.
      c. ?*Every man_i thought that the picture of him_i was nice.
Assuming the same general theory as that above, these contrasts would be
specified not by the feature +/−definite, but by the maximal projected node.
Of course, the means of capturing the generalization here is simply as strong
as the original generalization, and there is some reason to think that it is name-
hood or specificity which matters in determining these binding relations
(Fiengo and Higginbotham 1981), not the presence or absence of the determiner
itself. If so, there is no purely structural correlate of the opacity.
(127) a. *Which director do you like the picture of t?
      b. What director do you like the first picture of t?
IV. Assuming that there is the possibility of submaximal projections, we find a
curious phenomenon: for each major category, there are cases which appear to
be degenerate with respect to projection (Lebeaux 1982). That is, they simply
project up to the single bar level, no further. These are the following.
(128) Basic category    Degenerate instance
      verb              auxiliary verb
      noun              pronoun
      preposition       particle
      adjective         (prenominal adjective in English)
The first three of these are clear-cut; the fourth is more speculative. The sense in
which a pronoun would be a degenerate case of a noun is that it allows neither
complements nor specifiers.
(129) a. *the him
      b. *every him
      c. every one
      d. the friend of Mary, and the one/*him of Sue
This may be accounted for if we assume that the pronoun is inherently degenerate
with respect to the projections that these elements would be registered under.
Similarly, while the issues and data are complex (see the data in Radford
(1981) for interesting puzzles), an auxiliary verb does not appear to take
complements, specifiers, and adjuncts in the same way that verbs do. Indeed, to
account for some of the mixed set of data in Radford, it may make sense to
allow for both a Syntactic Structures-type analysis and one along the lines of
Ross (1968) or Akmajian, Steele, and Wasow (1979), where the auxiliary takes
the different structures at different levels of representation.
Aside from facts having to do with complement-taking, there are those
having to do with cliticization, and with the rhythm rule (Selkirk 1984), which in
general support the hypothesis of bare, nonprojecting lexical categories. In
general, the rhythm rule does not apply across phrasal categories. However, it
may apply in prenominal adjectival phrases.
(130) a. That solution is piecemeal.
      b. A piecemeal solution.
(131) a. The tooth is impacted.
      b. An impacted tooth.
In (130b) the main adjectival stress has retracted onto the first syllable; in (131b)
it has done likewise, optionally.
Assuming that the rhythm rule does not apply across phrasal boundaries,
this means that the structure must be one in which the adjective is non-phrasally
projecting, i.e., an A, not an A′ or A″:
(132) [NP [Det an] [A impacted] [N tooth]]
This would also account for the impossibility of phrasal APs in prenominal
position in English: it suggests that it is the degeneracy of this category in this
position which is the point to be explained, rather than something along the lines
of the Head Final Filter of Williams.
The other area in which phonological evidence is relevant is with respect to
cliticization. In English, a pronoun, but not a full noun, nor the pro-nominal one,
may lose its word beat and retract onto the governing verb.
(133) a. I saw 'm.
      b. I saw one.
      c. *I saw 'n.
Similarly, the auxiliary may do so, and cliticize onto the subject, or onto
another auxiliary.
(134) a. I c'n go.
      b. He may've left.
The generalization in these cases seems to be the following.
(135) If i) α governs β, and
         ii) β is an X⁰ category,
      then β may cliticize onto α.
But this generalization requires the presence of submaximal nodes: i.e., degenerate
projections.
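As a toy illustration of my own (the category labels and function name are assumptions, and government is taken for granted rather than computed), the generalization in (135) is a two-condition check over the degenerate categories in (128):

```python
# Degenerate (bare, nonprojecting X0) instances of the major categories,
# from the table in (128); prenominal adjectives are left out as the
# speculative case.
DEGENERATE = {"pronoun", "auxiliary", "particle"}

def may_cliticize(beta_category, alpha_governs_beta):
    """(135): beta may cliticize onto alpha iff alpha governs beta
    and beta is a bare X0 (degenerate) category."""
    return alpha_governs_beta and beta_category in DEGENERATE
```

Thus a governed pronoun may cliticize ("I saw 'm"), while the pro-nominal one, a projecting noun, may not (*"I saw 'n"), even though it is likewise governed by the verb.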
V. Finally, we may note that the device of submaximal projections gives the
possibility of differentiating null categories without assuming a primitive feature
set of +/−anaphor, +/−pronominal. Namely, the categories may be differentiated
by the node to which they project up. These would be as follows.
(136) Null category    Category type
      PRO, pro         N
      wh-trace         N′
      NP-trace         N‴
The particular assignments are given for the following reasons: PRO and pro
because of their similarity to simple pronominals in interpretation (though not, in
the case of PRO, in their dependency domains); NP-trace because it may also
simply be regarded semantically as a surrogate for the NP which binds it: this is
not true for either wh-trace or PRO (or pro). The reason that wh-trace is
identified with N′ will be apparent in the next chapter.
This identification of null categories with nominal projection levels is
intended as suggestive: a full proposal along these lines would go far beyond the
scope of this thesis.
Chapter 3
Adjoin-α and Relative Clauses
3.1 Introduction
In the previous chapter I dealt with some aspects of argument structure, phrase
structure, and the realization rule, Project-α, which relates lexical representations
to the syntactic system strictly speaking. Aside from particular points that were
made, a few central conclusions were reached: i) that Project-α is a relation
relating the lexical entry to the phrase marker in a very faithful way (in
particular, the internal/external distinction used in the syntax, and even
directionality information as to theta marking (Travis 1984; Koopman 1984), is
found in the lexical entry), ii) that telegraphic speech is a representation of pure
theta relations, and iii) that there is a close relation between acquisition stages
and representational levels, a relation of congruence (the General Congruence
Principle). In this chapter, I take up the third point more carefully, modifying it
as necessary.
What is the nature of the relation between acquisitional stages and represen-
tational levels? Are there true levels, and, if not, are there at least ordered
sequences of operations (where these sequences do not, however, pick out
levels)? If there are levels, are they ordered: i.e. are DS, SS, PF, LF, and perhaps
others, to be construed as a set, or is there a (partial) ordering relation between
them? Finally, what is the way that levels may be ordered: on what grounds?
Note that to the extent to which there are real acquisitional stages, and these
are in a correspondence relation with representational levels, a strong syntactic
result would be possible: that the grammar is essentially leveled, and the leveling
follows from, is projected from, the course of development.
A geological metaphor is apt: the sedimentation pattern over a period of
time is essentially leveled. The sedimentation layers are distinct and form strata;
moreover, they have distinct vocabularies. The course of the geological history
is projected into the final structure: a cross-section reveals the geological type of
the resultant.
In the next three chapters, I would like to discuss three areas of grammatical
research. These are intended to shed light on the question of representational
mode. One of these areas is well mapped out in current syntactic work, though
the appropriate analysis is still not clear: the argument/adjunct distinction. This
I will discuss in this chapter. Here I will be looking at, in particular, the
representation of relative clauses, the paramount case of an element in an
adjunctual relation to a head. The general conclusion will be that the adjunct
relation should be modeled in a derivational way: namely, by supposing that
adjuncts are added into the representation in the course of a derivation. Both
syntactic and acquisition evidence will be presented to support this view.
A second question has been asked more extensively in the acquisition
literature than in syntactic research per se, though with notable exceptions
(Chomsky 1980, on Case assignment; Marantz 1982, on early theta assignment).
It divides into two parts. First, is there a semantic stage in acquisition, such as
suggested by much early work in psycholinguistics (Bowerman 1973, 1974;
Brown 1973)? Such a stage would use purely semantic, i.e. thematic, descriptors
in its vocabulary, without reference to, e.g., grammatical relations or abstract
Case. The same question may be asked with respect to Case assignment: are
there different types of Case assignment, are these ordered in the derivation, and
may they be put into some correspondence with acquisitional stages? N. Chomsky
in On Binding answered the first and second questions in the affirmative
about case assignment (though this was before work on abstract Case), and in
earlier work (Lebeaux 1987), I suggested that Hyams' data about the dropping
of early subjects might fall under the rubric not of pro-drop, but of the lack of
analysis of the verb + Infl combination, together with two types of Case assignment
(structural and phrase-structural) having a precedence relation in the
acquisition sequence, and operating in different manners. My analysis, then,
answered the third question in a positive manner, with respect to Case. I will not
discuss the possibility of precedence within types of Case assignment in this
book, but I will discuss the thematic issue, in Chapter 4.
A third question has to do with the nature of the open class/closed class
distinction, and how this might be modelled in a grammatical theory. This
question, again, has received more attention in the psycholinguistic literature (see
Shattuck-Hufnagel 1974; Garrett 1975; Bradley 1979; Clark and Clark 1977, and
a panoply of other references) than in grammatical theory proper. I propose, in
Chapter 4, the beginnings of a way to accommodate the two literatures.
The current chapter, however, will be concerned with the argument/adjunct
distinction.
ADJOIN-α AND RELATIVE CLAUSES 93
3.2 Some general considerations
As has often been noted (Koster 1978; Chomsky 1981), given a particular string,
say that in (1), there are two ways of modelling that string, and the dependencies
within it, within a GB-type theory.
(1) Who_j did_i John e_i see e_j?
On the one hand, these informational dependencies may be viewed as an aspect
of a single level of representation. Thus in Chomsky (1981) it is suggested that
the operation Move-α may be viewed as a set of characteristics of the S-structure
string, involving boundedness, single assignment of theta role, and so on.
On the other hand, the string in (1) may be viewed not as a representation
at a single level, but as being the output of a particular derivation, that in (2).
(2) DS: [CP C [IP [NP John] [I' Infl [VP [V see] [NP who]]]]]
    ⇒ Move-α ⇒
    SS: [CP [NP who] C [IP [NP John] [I' Infl [VP [V see] [NP e]]]]]
Under this view, the sentence in (1) is just a sectioning of a larger figure. The
full figure is multi-leveled.
The representation in (1) retains a good deal, and perhaps all, of the
information necessary to derive the full representation in (2) back again. It is
precisely this character, forced in part by the projection principle, that makes the
distinction between representational and derivational modes noted in
Chomsky (1981) so difficult. Indeed, in the old-style Aspects derivations, where
no traces were left, such a question could not arise, since the surface (corresponding
to the present S-structure) patently did not contain all the information
present in the DS form: it did not define, for example, the position from which
the movement had taken place.
However, to the extent to which constancy principles hold, i.e. principles
like the Projection Principle which force information to be present at all levels
of the same derivation, the problem of the competition in analysis between the
representational and derivational modes becomes more vexed. It is therefore
natural, and necessary, to see what sort of information in principle might help
decide between them.
Basically, the information may be of two types. Either i) there will be
information present in the representational mode which is not present in the
derivational mode, or can only be present under conditions of some unnaturalness,
or ii) there is information present in the derivational mode which is not
present in the representational mode or, again, may only be represented under
conditions of some unnaturalness. It is possible to conceive of more complex
possibilities as well. For example, it may be that the grammar is stored in both
modes, with each used for particular purposes. I do not wish to examine
this third, more complex, possibility here.
3.3 The Argument/Adjunct Distinction, Derivationally Considered
In the next few sections, I would like to argue for a derivational approach, both
from the point of view of the adult system, and from the point of view of
acquisition. The issue here is the formation of relative clauses and the modelling
of the argument/adjunct distinction in a derivational approach.
3.3.1 RCs and the Argument/Adjunct Distinction
Let us consider the following sentences:
(3) a. The man near Fred joined us.
b. The picture of Fred amazed us.
c. We enjoyed the stories that Rick told.
d. We disbelieved the claim that Bill saw a ghost.
e. John left because he wanted to.
The following examples give the same sentences with the adjuncts marked by
brackets.
(4) a. The man [near Fred] joined us.
b. The picture of Fred amazed us.
c. We enjoyed the stories [that Rick told].
d. We disbelieved the claim that Bill saw a ghost.
e. John left [because he wanted to].
I have differentiated in the sentences above between the modifying phrases near
Fred and that Rick told in (4a, c), and the phrases of Fred and that Bill saw a
ghost in (4b, d), which have intuitively more the force of direct arguments. See
Jackendoff (1977) for structural arguments that the two types of elements should
be distinguished. It is sometimes claimed that the latter phrases are adjuncts as
well (Stowell 1981; Grimshaw 1986), but it seems clear that, whatever the
extension of the term adjunct, there is some difference between the complements
in (4b, d) vs. those in (4a, c). It is likely, therefore, that there is a three-way
difference between pure arguments like obligatory arguments in verbal
constructions (I like John), the optional arguments in instances like picture-noun
constructions and in the complement of denominals like claim (the picture
of John, the claim that Bill saw a ghost), and true adjuncts like relative
clauses and locative modifiers (near Fred and that Rick told above). For
present purposes, what matters is the distinction between the second type of
complement and the third, and it is here that I will locate the argument/adjunct
distinction.
There is no unequivocal way to determine the adjunctual status of a given
phrase, at least pre-theoretically. One commonly mentioned criterion is optionality,
but that will not work for the complements above, since all the nominal
complements are optional yet we still wish to make a distinction between the
picture-noun case (as nominal arguments), and the locative phrases or relative
clauses (as adjuncts). Nonetheless, if the intuition that linguists have is correct,
the property of optionality is somehow involved. Note that there is still a
difference in the two nominal cases: namely, that if the nominal construction is
transposed to its verbal correlate in the cases that we would wish to call argument,
then the complement is indeed obligatory, while the locative complement
remains not so.
(5) a. the photograph (of Fred)
b. the photograph (near Fred)
(6) a. We photographed Fred.
       #We photographed. (not same interpretation)
b. We photographed near Fred.
       We photographed. (same interpretation)
This suggests that the difference between (5a) and (6a) may not reside so much
in theta theory as in Case theory (Norbert Hornstein, p.c.). Let us therefore
divide the problem in this way. There are two sorts of optionality involved. The
first is an optionality across the subcategorization frames of an element. The
nominal head of a construction like photograph, or picture, is optional across
subcategorization frames, while the corresponding verbal head is not.
(7) a. photograph (V): __NP
b. photograph (N): __ (NP)
It is this sort of optionality which may, ultimately, following Hornstein, be
attributed to the theory of Case: for example, that the verbal head photograph
assigns case and hence requires an internal argument at all levels, while the
nominal head photograph does not.
Over against this sort of optionality, let us consider another sort: namely,
that of licensing in a derivation. Since the work of Williams (1980), Chomsky
(1982), Rothstein (1983), Abney and Cole (1985), and Abney (1987b), as well
as traditional grammatical work, it is clear that elements may be licensed in a
phrase marker in different ways. In particular, there exist at least two different
sorts of licensing: licensing by theta assignment and licensing by predication. Let
us divide these into three subcases: the licensing of an object by its head, which
is a pure case of theta licensing, the licensing involved in the subject-predicate
relation, which perhaps involves both theta licensing and licensing by predication,
and the relation of a relative clause to its head, which is again a pure
instance of predication licensing (according to Chomsky 1982). These sorts of
combinatorial relations may themselves be part of a broader theory of theta
satisfaction along the lines sketched by Higginbotham (1985), which treats things
like integration of adjectival elements into a construction; the refinements of
Higginbotham's approach are irrelevant for the moment.
(8) a. John hit Bill. (licensed by theta theory)
b. John hit Bill. (licensed by theta theory and predication theory)
c. The man that John saw (licensed by predication theory)
Let us now, following the basic approach of Chomsky (1982) and Williams
(1980), note that predication licensing, i.e. predication indexing in Williams'
sense, need not take place throughout the derivation, but may be associated
with a particular level, Predication Structure in Williams' theory. Let us weaken
Williams' position that there is a particular level at which predication need apply,
and adopt instead the following division, which still maintains an organizational
difference between the two licensing conditions:
(9) a. If α is licensed by theta theory, it must be so licensed at all
levels of representation.
b. If α is not licensed by theta theory, it need not be licensed at
all levels of representation (but only at some point).
Predication licensing in Chomsky's (1982) broad sense (and possibly in Williams'
1980 sense as well) would fall under (9b), while the licensing of direct internal
arguments would fall under (9a). However, (9a) itself is just a natural consequence
of the Projection Principle, while (9b) simply reduces to the instances
over which the Projection Principle holds no domain, which needs no special
statement. The strictures in (9) may therefore be reduced to (10), which is
already known.
(10) a. The Projection Principle holds.
b. All categories must be licensed.
In terms of the two types of optionality noted above, the optionality of (9) is
the optionality in licensing conditions for adjuncts at DS.
(11) Arguments must be licensed at DS; adjuncts are optionally licensed
at DS.
With respect to the constructions discussed earlier, the picture-noun complements
and complements of claim, this means that the complements in such constructions,
as arguments, must be assigned a theta role and licensed at DS, when they appear.
(12) the picture [of Mary: theme] (licensed at DS)
(13) the claim [that Rick saw a ghost: theme] (licensed at DS)
These complements need not appear; they are optional for the particular head
(picture, claim). However, when they appear, they must be licensed and theta
marked at DS, by the Projection Principle. This distinguishes them from true
adjuncts, which need not be licensed at DS.
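The asymmetry in (9)/(11) can be given a toy computational rendering (my own illustrative sketch, not part of the theory's formal apparatus): arguments must be present at every level of a derivation, while adjuncts need only be present at some level.

```python
# Toy model of the licensing asymmetry in (9)/(11). A "derivation" is a
# list of levels, each level a set of licensed elements. Arguments must
# appear at every level; adjuncts need only appear at some level.
def licensing_ok(derivation, arguments, adjuncts):
    args_ok = all(arg in level for arg in arguments for level in derivation)
    adjs_ok = all(any(adj in level for level in derivation) for adj in adjuncts)
    return args_ok and adjs_ok

# "the picture of Mary (that Bill liked)": "of Mary" is an argument of
# picture, so it is present at DS; the relative clause is an adjunct
# added only later in the derivation.
ds = {"picture", "of Mary"}
ss = {"picture", "of Mary", "that Bill liked"}
print(licensing_ok([ds, ss], arguments={"of Mary"}, adjuncts={"that Bill liked"}))  # True
# If the relative clause were an argument, its absence at DS would fail:
print(licensing_ok([ds, ss], arguments={"that Bill liked"}, adjuncts=set()))        # False
```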
The optionality in the licensing of adjuncts at DS, but not arguments, is one
way of playing out the argument/adjunct distinction which goes beyond a simple
representational difference such as is found in Jackendoff (1977), where
arguments and adjuncts are attached under different bar-levels. However, there
is a more profound way in which the argument/adjunct distinction, and the
derivational optionality associated with it, may enter into the construction of the
grammar. It is to this that I turn in the next section.
3.3.2 Adjunctual Structure and the Structure of the Base
In the sentences in (4) above, the adjuncts were marked, picking them out.
Suppose that, rather than considering the adjuncts in isolation, we consider the
rest of the structure, filtering out the adjuncts themselves. (The (b) sentences are
after adjunct filtering.)
(14) a. Bill enjoyed the picture of Fred.
b. Bill enjoyed the picture of Fred.
(15) a. He looked at the picture near Fred.
b. He looked at the picture.
(16) a. We disbelieved the claim that John saw a ghost.
b. We disbelieved the claim that John saw a ghost.
(17) a. We liked the stories that Rick told.
b. We liked the stories.
(18) a. John left because he wanted to.
b. John left.
Comparing the (a) and (b) structures, what is left is the main proposition,
divested of adjuncts. Let us suppose that we apply this adjunct filtering operation
conceptually to each string. The output will be a set of structures, in which the
argument-of relation holds in a pure way within each structure (i.e. the subject-
of, object-of, or prepositional-object-of is purely instantiated), but the relation of
adjunct-of holds between structures. In addition, one substructure is specially
picked out as the root.
(19) (15a) after adjunct filtering:
Argument structure 1: He looked at the picture.
Argument structure 2: near Fred
The rooted structure is 1.
(20) (16a) after adjunct filtering:
Argument structure 1: We disbelieved the claim that
John saw a ghost.
The rooted structure is 1.
(21) (17a) after adjunct filtering:
Argument structure 1: We liked the stories.
Argument structure 2: that Rick told.
The rooted structure is 1.
(22) (18a) after adjunct filtering:
Argument structure 1: John left.
Argument structure 2: because he wanted to
The rooted structure is 1.
Each of the separate argument structure elements is a pure representation of the
argument-of relation; no adjuncts are included. They may be called the Argument
Skeletons of the phrase marker. In this sense, each phrase marker is composed of
a set of argument skeletons, with certain embedding relations between them
(which haven't been indicated above), and one element picked out as the root.
(23) [Diagram: a single phrase marker decomposed into its component argument skeletons]
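The decomposition in (19)-(22) can likewise be sketched as a toy data structure (the encoding is hypothetical and purely illustrative): a set of argument skeletons, a distinguished root, and a record of where each non-root skeleton adjoins.

```python
# Toy representation of (21): a phrase marker as a set of argument
# skeletons, one distinguished as the root, plus an adjunction map
# recording where each non-root skeleton attaches.
phrase_marker = {
    "skeletons": {
        1: "We liked the stories",   # pure argument-of relations only
        2: "that Rick told",         # relative clause, a separate skeleton
    },
    "root": 1,
    "adjoined_to": {2: (1, "the stories")},  # skeleton 2 modifies an NP in 1
}

def root_skeleton(pm):
    """Return the skeleton picked out as the root."""
    return pm["skeletons"][pm["root"]]

print(root_skeleton(phrase_marker))  # prints: We liked the stories
```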
Can anything be made of such a conceptual device? Before considering data, let
us note one aspect of current formulations of the base. According to Stowell
(1981), there is no independent specification of the base. Rather, its properties
follow from that of other modules: the theory of the lexicon, Case theory, theta
theory, and so on. Let us take this as a point of departure: all properties of the
base follow from general principles in the grammar. What about the actual
content of the base: of the initial phrase marker? Here we note (as was noted
above) that a duality arises in licensing conditions: elements may either be
directly licensed by selection by a head (i.e. subcategorized, perhaps in the
extended sense of theta selection), or they may not be obligatorily licensed at all,
but may be optionally present, and, if so, need not be licensed at DS, but simply
at some point in the derivation: the case of adjuncts (Chomsky 1982, and others).
Suppose that we adopt the following constraint on D-structures:
(24) (Every) D-structure is a pure representation of a single licensing
condition.
Then the duality noted in the licensing conditions would be forced deeper into
the grammar. The consequence of (24) would be that arguments, licensed by a
head, and adjuncts, licensed in some other way, would no longer be able to both
be present in the base.¹ The base instead would be split up into a set of
sub-structures, each a pure representation of a single licensing condition (argument-of
or assigned-a-theta-role-by), with certain adjoining relations between
them. That is, if (24) is adopted, the argument skeletons above (arg. skeleton 1,
arg. skeleton 2, etc.) are not simply conceptual divisions of the phrase marker,
but real divisions, recorded as such in the base. Ultimately, they must be put
together by an operation: Adjoin-α.
By adopting a position such as (24), we arrive, then, at a position in some
ways related to that of Chomsky (1957) (see also Bach 1977): there is a (limited)
amount of phrase marker composition in the course of a derivation. Yet while
phrase markers are composed (in limited ways), they are not composed in the
manner that Chomsky (1957) assumes. Rather, the Projection Principle guides the
system in such a way that the substructures must respect it.
There is, in fact, another way of conceiving of the argument structures
picked out in (19)-(23). They are the result of the Projection Principle operating
in the grammar, and, with respect to the formulation of the base, only the
Projection Principle. If the Projection Principle holds, then there must be the
argument structures recorded in (19)-(23), at all levels of representation.
However, there need not be other elements in the base, there need not be
adjuncts. If we assume that the Projection Principle holds, and (with respect to
this issue) only the Projection Principle, then it would require additional stipula-
tion to actually have adjuncts present in the base: the Projection Principle
requires that arguments be present, but not adjuncts. It is simpler to assume that
only the Projection Principle holds, and the adjuncts need not be present.
The sort of phrase structure composition suggested above differs from both
the sort suggested in varieties of categorial grammar (e.g. Dowty, Wall, and
Peters 1979), and from the domains of operation of traditional cyclic transformations
(Chomsky 1965). With respect to categorial grammar, since the ultimate
phrase marker or analysis tree is fully the result of composition operations, there
are no subunits which respect the Projection Principle. The analysis may start off
with the transitive verb (say, of category S/NP/NP), compose it with an object
creating a transitive verb phrase, and compose that TVP with a subject. The
original verb, however, starts out naked, unlike the representation in Chapter 2,
and the argument skeletons above. And the composition operation, being close
1. This is a stronger condition than that which simply follows from the Projection Principle, because
it requires that something actually force the adjuncts in, for them to be present. In other words,
elements cannot be hanging around in the structure, before they are licensed there (or if they are
not licensed there).
to the inverse of the usual phrase structure rule derivation (with the possible
difference of extensions like Right- and Left- Wrap, Bach 1979), would not
add adjuncts like relative clauses in the course of the derivation, but rather would
compose a relative clause directly with its head, and then let the resultant be
taken as an argument: exactly the reverse of the order of expansion in the phrase
marker. The operation above, however, takes two well-formed argument skele-
tons and embeds one in the other.
The difference between the domains scanned in the theory proposed above,
and that found in standard early versions of cyclic theories is perhaps more
subtle. Cyclic theories (pre-Freidin 1978) scan successive sequences of sub-domains,
the least inclusive sub-domains first. A partial ordering exists between
the sub-domains, where the possibility of multiple branching requires that the
ordering be merely partial rather than complete. This is also true with the
argument skeleton approach above. However, the domains which are in such an
inclusion relation are different. This is shown in (25) below.
(25) [Diagram: the tree [S NP [VP V [S NP VP]]] annotated in two ways.
Cyclic domains: the embedded S and the matrix S form nested sub-domains,
partially ordered with the least inclusive domain first. Argument skeleton
domains: the matrix skeleton and the adjoined skeleton are separate domains,
with an ordering relation between them but no strict inclusion.]
Unlike the case with cyclic domains, there is no strict inclusion relation with the
argument skeleton domains. Rather, it is as if the upper bar level complements
of Jackendoff (1977) were not present in the original scanning, and are only
added later, in the course of the derivation.
3.3.3 Anti-Reconstruction Effects
Van Riemsdijk and Williams (1981) note a peculiar hole in the operation of
Condition C, as it applies to structures involving moved constituents. Consider
the data in (26).
(26) a. *He_i likes those pictures of John_i.
b. *He_i likes the pictures that John_i took.
c. ?*Which pictures of John_i does he_i like?
d. Which pictures that John_i took does he_i like?
As expected, both (26a) and (26b) are ungrammatical; he c-commands the
coreferent John, and is out by Condition C. The interesting divergence occurs in
(26c) and (26d). Here, where John is contained in a fronted noun phrase,
Condition C applies differentially in the two cases. Where John is the object of
a picture-noun phrase the sentence retains the ungrammaticality of the original
(26a), but when it is contained inside a relative clause, and this is fronted, then
the ungrammaticality suddenly disappears: (26d) is perfect.
At first glance, it may appear that this contrast can be handled by locating the
application of Condition C at a particular level, say, D-structure or S-structure.
Yet it is clear that this device will not work. If Condition C were located at DS,
then all the sentences above, (26a, b, c and d), would be expected to be bad.
This is not the case: (26d) is fine. On the other hand, if Condition C were
located at S-structure, and applied directly on structures (rather than using a
derived notion of c-command such as is found in Williams 1987), then the
grammar would allow in too much: both (26c, d) would be expected to be good.
But neither of these locations for Condition C would allow for the true result: the
grammaticality of (26d) and the ungrammaticality of (26c).
Van Riemsdijk and Williams themselves take a different tack. They suggest
that the degree of embedding of the name in the dislocated constituent is the
crucial factor in creating the contrast. In particular, they suggest that the name in
the relative clause in (26d) is embedded under an S, while the name in (26c) is
embedded in a PP. This lack of embedding in (26c) is related, they suggest, to
the comparative grammaticality of that construction. The formulation that they
give is the following.
(27) In a structure where NP is part of a dislocated constituent, NP is
exempted from reconstruction if it is deeply embedded enough: part
of an S or genitive phrase.
While the van Riemsdijk and Williams observation is extremely interesting, there
is some reason to believe that their statement of the constraint may be improved
upon, most particularly by examining the function of the structure containing the
name, rather than the degree of embedding per se. Let us consider a somewhat
more inclusive range of data.
Note rst the following contrast:
(28) a. *He_i believes the claim that John_i is nice.
b. *He_i likes the story that John_i wrote.
c. *Whose claim that John_i is nice did he_i believe?
d. Which story that John_i wrote did he_i like?
The contrast in (28) is rather striking. All constructions evince the same degree
of embedding: the name is embedded in an S. As expected, the non-dislocated
structures show a Condition C violation. However, there is a clear distinction in
the sentences with dislocated NPs. In (28c), where the name John is contained
in an S which is a complement of the head noun claim, the ungrammaticality of
the initial undislocated structure is retained with full force. In (28d), where the
name is likewise contained in an S, but where the S is part of an adjunct
relative clause associated with the dislocated head, the output becomes perfect.
In (28), then, it is the adjunct status of the containing structure, rather than the
degree of embedding of the name, which is associated with the difference in
grammaticality.
The same can be seen by an appropriate choice of PPs. As noted before, if
the name is contained in the PP complement of a picture-noun phrase, and
fronted, the resultant is ungrammatical. As suggested earlier, the internal
argument of a picture-noun is a sort of direct complement (Jackendoff 1977).
Consider what happens when the name appears in an indisputable adjunct.
(29) a. *He_i destroyed those pictures of John_i.
b. *He_i destroyed those pictures near John_i.
c. ?*Which pictures of John_i did he_i destroy?
d. Which pictures near John_i did he_i destroy?
As expected, (29a) and (b) are ungrammatical: they violate Condition C. The
interesting contrast appears when the fronted NPs are in dislocated position.
When the name is part of a picture-noun phrase, and fronted, the output retains
the ungrammaticality of the base (29c). However, when it is part of a (locative)
adjunct, the ungrammaticality disappears (29d), though it is still present in the
putative D-structure (29b). Degree of embedding is not an issue, being held
constant (this same observation, that adjuncthood is what matters, not degree of
embedding per se, was made by Freidin 1986, independently: a fact brought to
the author's notice only after this was written).
We note the same contrast between (29c) and (d) in (30), using this time
derived nouns in their verbal vs. nominal uses (Lebeaux 1984, 1986).
(30) a. ?*Whose examination of John_i did he_i fear?
b. Which examinations near John_i did he_i peek at?
The deverbal noun examination is used in (30) in either a simple referential
sense (30b), or as a nominalized process (Lebeaux 1984, 1986). It is plausible
to consider that the argument structures of these usages differ. The data in (30)
supports that: the true argument John in (30a) violates Condition C when
dislocated, while the adjunct near John does not.
The data in (28)-(30), then, supports the following conclusion: it is the
grammatical function or character of the structure within which the name is
contained, which determines whether a Condition C violation occurs when it is
dislocated. Yet this grammatical function or character (the argument/adjunct
distinction) is irrelevant if the structure within which the name is contained is in
place.
3.3.4 In the Derivational Mode: Adjoin-α
One way of accounting for the data above, the anti-reconstruction facts,
would be to simply stipulate it as part of the primitive basis:
(31) If α, a name, is contained within a fronted adjunct, then Condition C
effects are abrogated; otherwise not.
However, this is hardly an intuitive solution to the problem: stipulation (31), as
a primitive specification in UG (it would have to be in UG; there is not
sufficient evidence in the data to set this as a parameter or possibility from low-level
learning), is hardly satisfactory. Further, this sort of stipulation would
leave unexplained, in a rather a priori fashion, the relation of the anti-reconstruction
constraint to the more standard oddities associated with adjuncts, the
Condition on Extraction Domains (Huang 1982), however reconstructed. While
the solution proposed below will not directly relate the anti-reconstruction facts
to the Condition on Extraction Domains, it does, I believe, clear the way for such
a relation to be made: something which is not the case if (31) were simply
adopted perforce.
Let us return to the earlier theoretical construct: the argument skeleton. It
may be assumed that the Projection Principle requires that heads and their
arguments, and the arguments of these heads, and so on, must be present in the
base. That is, the entire argument skeleton must be present, insofar as it is a pure
instantiation of the relation argument-of. However, adjuncts need not be
present in the base. They may then be added later by a rule. Let us call this
Adjoin-α. Adjoin-α takes two tree structures, and adjoins the second into the first.
Let us assume that this always involves Chomsky-adjunction, copying the node
in the adjoined-to structure. Like Move-α, Adjoin-α applies perfectly freely, with
ungrammatical results ruled out by general principles, interpretive or otherwise.
(32) [Diagram: Adjoin-α. Tree A contains XP dominating YP, with WP and UP
below; tree B is ZP. In the output, ZP is Chomsky-adjoined to YP in A, copying
the YP node.]
Here, the subtree ZP has been adjoined into the phrase marker A, copying the
YP node. Relative clause adjunction would look like the following.
(33) [Diagram: tree A is the argument skeleton [S ... [VP V NP]]; tree B is the
relative clause S'. In the output, S' is Chomsky-adjoined to the object NP,
copying the NP node.]
And the adjunction of a locative NP-modifying PP would look like this, if the
locative is adjoined to the object:
(34) [Diagram: tree A is the argument skeleton [S ... [VP V NP]]; tree B is the
locative PP. In the output, the PP is Chomsky-adjoined to the object NP,
copying the NP node.]
Here, the subtree B has been adjoined into A, copying the NP node.
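As a rough computational sketch (the tuple encoding of trees and the helper `adjoin_alpha` are my own, purely illustrative), Adjoin-α can be modeled as Chomsky-adjunction over bracketed trees: the target node is copied, and the adjoined subtree becomes the sister of the original node under the copy, as in (33).

```python
# Trees as (label, child, child, ...) tuples; leaves are strings.
# adjoin_alpha Chomsky-adjoins tree B at the node of A equal to `target`:
# that node is copied, with the original node and B as its daughters.
def adjoin_alpha(a, target, b):
    if a == target:
        label, *kids = a
        return (label, (label, *kids), b)   # copy the node, attach B
    if isinstance(a, str):
        return a                            # leaf: nothing to do
    label, *kids = a
    return (label, *(adjoin_alpha(k, target, b) for k in kids))

# (33): adjoin the relative clause S' to the object NP of the skeleton
# "we liked the stories".
skeleton = ("S", ("NP", "we"), ("VP", ("V", "liked"), ("NP", "the stories")))
rel = ("S'", "that Rick told")
out = adjoin_alpha(skeleton, ("NP", "the stories"), rel)
# out == ("S", ("NP", "we"),
#         ("VP", ("V", "liked"),
#          ("NP", ("NP", "the stories"), ("S'", "that Rick told"))))
```

Note that the target is a whole subtree rather than a bare label, so the subject NP is untouched: only the object NP is copied and receives the adjunct as its sister.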
We are left, then, with the base generating a set of phrase markers (one
specified as the root). The rule Adjoin-α is defined over pairs of phrase markers;
the rule Move-α is defined over a single phrase marker. Given absolutely
minimal assumptions, Move-α would be expected to apply both prior to, and
posterior to, any given adjunction operation, since it is simply defined as
movement within a phrase marker, and phrase markers exist both prior to, and
posterior to, Adjoin-α. There is thus no level at which Adjoin-α takes place; it
is simply an operation joining phrase markers, given minimal assumptions. We
will see below that there are empirical reasons as well to assume the free
ordering of Move-α and Adjoin-α.
I will assume that each individual substructure prior to Adjoin-α is well-formed.
Assuming a derivation of this type, where both Move-α and Adjoin-α are
available as operations, a solution is at hand for the anti-reconstruction effects
discussed above. Let us assume that Condition C is not earmarked for any
particular level: it applies throughout the derivation, and marks as ungrammatical
any configuration which it sees, in which a name is c-commanded by a
coindexed pronoun.² Let us further assume that it applies directly over structures,
2. Like Lasnik (1986), I will assume that Condition C is actually split into two separate conditions,
one which bars the c-commanding of a name by a pronoun, which is much stronger, and one which
bars the c-commanding of a name by another name, which is much weaker. As Lasnik notes, some
languages, e.g. Thai, allow c-command of the second sort, but not the first. I will discuss the first
constraint here, and restrict the term Condition C to that. The statement that Condition C applies
throughout the derivation may be too strong, given that sentences like (1b) are grammatical (consider
what the DS would be).
(1) a. *It seems to him_i that John's_i mother is nice.
b. John's_i mother seems to him_i t to be nice.
One way to account for this is to restrict Condition C to apply at any point after NP movement. A
more radical, but more principled solution, I believe, is to maintain that Condition C applies
everywhere, but to argue that the lexical insertion of names applies after NP movement. This
assumption is fairly natural given the theory in Chapter 4, but obviously has widespread implications.
ADJOIN-α AND RELATIVE CLAUSES 107
not using any derived or re-defined notion of c-command (Chapter 5). Assume
further, as discussed above, that the full argument skeleton must be present at all
levels of representation, by the Projection Principle, but that adjuncts need not
be.
Consider, now, the two relevant structures.
(35) a. Which pictures that Johnᵢ took did heᵢ like?
     b. ?*Whose claim that Johnᵢ took pictures did heᵢ deny?
The full DS for (35b) must be the following.
(36) *Heᵢ denied the claim that Johnᵢ took pictures.
(36) is the full argument skeleton. Deny subcategorizes for the internal argument
claim, and claim itself takes the clause that John took pictures as a complement
(not an adjunct). We must assume, then, that the full structure is present at DS,
by the Projection Principle.
This full structure, however, violates Condition C, since the name is
c-commanded by a coindexed pronoun. Therefore the sentence is marked as
ungrammatical at that level. Making the usual assumption that starred sentences
may not be saved by additional operations, this means that the grammar,
correctly, disallows the sentence.
Consider now the unexpectedly grammatical (35a). The corresponding non-
question for (35a) is (37).
(37) Heᵢ liked which pictures that Johnᵢ took.
Under standard assumptions, this sentence would be marked as ungrammatical at
DS. However, the corresponding SS (35a) is fully grammatical.
Under the theory proposed here, however, the deep structure underlying (35a)
is not (37). Rather, it is (38) (i.e. the full phrase structure trees corresponding
to the argument skeletons; I suppress PS detail for convenience).
(38) Argument skeleton 1: (S (NP He) (VP liked which pictures)).
     Argument skeleton 2: (S that John took).
     The rooted structure is 1.
To each of these argument skeletons Move-α may apply; Adjoin-α also applies,
adjoining argument skeleton 2 into argument skeleton 1. Move-α may also apply
to the resulting, full, sentence structure.
There are two possible derivations for (35a). In one, Adjoin-α applies prior
to Move-α.
(39) Derivation 1:
     a. He liked which pictures.        that John took.
            Adjoin-α ↓
     b. *Heᵢ liked which pictures that Johnᵢ took.
            Move-α ↓
        *Which pictures that Johnᵢ took did heᵢ like?
In this derivation, if Adjoin-α applies first, then Condition C will apply to the
intermediate structure, ruling it out.
There is, however, another derivation, given in (40).
(40) Derivation 2:
     a. He liked which pictures.        that John took.
            Move-α ↓
     b. Which pictures did he like?     that John took.
            Adjoin-α ↓
        Which pictures that Johnᵢ took did heᵢ like?
In (40), Move-α, applying in argument skeleton 1, applies before Adjoin-α. This
derivation gives rise to the appropriate s-structure as well. However, unlike the
derivation in (39), as well as the standard derivation, there is no structure in (40)
which violates Condition C. This is because the adjunct clause containing John
has been adjoined into the representation after movement has taken place, and
after the fronted NP has been removed from the position in which it is
c-commanded by the pronoun he.
This is possible only for adjuncts: direct complements, like the complement
of claim, must be present at all levels (i.e. part of the rooted argument structure
at all levels).
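The role of the two derivation orders can likewise be made concrete. In the sketch below (a toy encoding of my own: names and pronouns are index-bearing tuples, and c-command is approximated at branching nodes), Condition C is tested at every structure the derivation produces: the Adjoin-α-first derivation (39) is caught at its intermediate step, while the Move-α-first derivation (40) never contains a violating configuration.

```python
def leaves(t):
    """Yield the tuple-encoded terminals (names, pronouns) of a tree."""
    if isinstance(t, tuple):
        yield t
    elif isinstance(t, list):
        for child in t[1:]:
            yield from leaves(child)

def violates_condition_c(tree):
    """True if a pronoun NP c-commands a coindexed name: an ["NP", (pro, i)]
    child of some node has a sister containing a ("name", ..., i) leaf."""
    if not isinstance(tree, list):
        return False
    kids = tree[1:]
    for k, child in enumerate(kids):
        if (isinstance(child, list) and child[0] == "NP" and len(child) == 2
                and isinstance(child[1], tuple) and child[1][0] == "pro"):
            i = child[1][1]
            siblings = kids[:k] + kids[k + 1:]
            if any(leaf[0] == "name" and leaf[-1] == i
                   for s in siblings for leaf in leaves(s)):
                return True
    return any(violates_condition_c(c) for c in kids)

he = ["NP", ("pro", 1)]
rc = ["S'", "that", ("name", "John", 1), "took"]

# Derivation 1: Adjoin-alpha first -- `he` c-commands `John` at the
# intermediate step, so Condition C marks the derivation as ungrammatical.
step_39b = ["S", he, ["VP", "liked", ["NP", ["NP", "which pictures"], rc]]]

# Derivation 2: Move-alpha first, then Adjoin-alpha into the fronted NP --
# `John` is never c-commanded by `he` at any step.
step_40b = ["C''", ["NP", ["NP", "which pictures"], rc],
            ["C'", "did", ["S", he, ["VP", "like", "t"]]]]

print(violates_condition_c(step_39b))  # True
print(violates_condition_c(step_40b))  # False
```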
The same analysis as given for relative clauses holds for locative adjuncts
in NPs. Recall the contrast in (30).
(41) a. ?*Whose examination of Johnᵢ did heᵢ fear?
     b. Which examinations near Johnᵢ did heᵢ peek at?
The DS of (41a) is given in (42); the DS of (41b) is given in (43).
(42) Root: (S (NP He) (VP feared (NP whose examination of John))).
(43) Root: (S (NP He) (VP peeked (at (NP which examinations))))
     Argument Structure 2: (PP near John)
In (42) only one transformation may apply: Move-α. Move-α fronts the
wh-phrase whose examination of John. However, coreference is disallowed
between he and John since he c-commands John at D-structure.
In (43), two transformations apply: Move-α and Adjoin-α. These may be
ordered in either fashion: Move-α applying in the root prior to the adjunction
operation, or after it. If Move-α applies after the adjunction operation, then
coreference between John and the pronoun is impossible, because Condition C
would be violated. However, there is still the derivation in which Move-α applies
in the root prior to Adjoin-α. This would look as follows.
(44) A: (S (NP He) (VP (V peeked) (PP (P at) (NP which examinations))))
     B: (PP near John)

         Move-α ↓

     A: (C″ (SpecC Which exams) (C′ (C did) (S(=IP) (NP he) (VP (V peek) (PP (P at) (NP t))))))
     B: (PP near John)

         Adjoin-α ↓

     (C″ (SpecC (NP (NP Which exams) (PP near John))) (C′ (C did) (S he peek at t)))
The PP containing the name is added into the representation after Move-α has
applied. Hence there is no structure in which Condition C is violated. The
sentence is, correctly, marked as grammatical.
Finally, let us note that the same contrast with respect to the abrogation of
Condition C holds for sentential adjuncts vs. complements when these are not
part of a more inclusive NP. By itself this contrast could have been due to a
number of other factors (see Reinhart 1983, where it is traced to the attachment
site), but given the data already noted, it seems likely that it should be traced
to the same cause as above. The contrast is given in (45).
(45) a. In John'sᵢ neighborhood, heᵢ jogs.
     b. *In John'sᵢ neighborhood, heᵢ resides.
     c. In John'sᵢ home, heᵢ smokes dope.
     d. ??In John'sᵢ home, heᵢ placed a new Buddha.
     e. ?Under Johnᵢ, heᵢ noticed that a general hubbub was occurring.
     f. *Under Johnᵢ, heᵢ placed a blanket.
        (cf. Under himᵢ, Johnᵢ placed a blanket.)
These types of sentences are familiar from Reinhart's work, where the fronted
adjunct does not initiate a Condition C violation, while the fronted argument
does ((45a, c, e) vs. (45b, d, f)). The point here is that the same set of data would
follow from the theory advocated above.
3.3.5 A Conceptual Argument
One consequence of the analysis above has to do with the question with which
the chapter began: the question of mode of representation. Grossly, to the extent
to which little happens in a derivation, the derivation may be conceptually
collapsed: the derivational mode falling into the representational mode. If
information is gained, or lost, then such a collapse becomes commensurately
more difficult. In the traditional Aspects analysis of NP-movement, information
is lost: the position from which an element has moved is eliminated. Real
movement was needed. In the wake of the advent of trace theory (Chomsky
1973), the necessity for movement became less clear, the derivational distance
between DS and SS having been lessened. The type of analysis above increases
the distance between DS and SS again by increasing the disparity between them,
this time not because information is lost between DS and SS as in the early
versions of movement, but because information is gained: the adjunct is added in
the course of the derivation.
It might still be argued, correctly, that the statement in (31), repeated below,
is perfectly sufficient to maintain the appropriate distinction.
(46) If α, a name, is contained in a fronted adjunct, then Condition C
     effects are abrogated; otherwise not.
Generalizing (46) somewhat to a Reconstruction-type account (or a lambda-
conversion type account, à la Williams 1987), and restricting ourselves to the
case where the adjunct is contained in a nominal, we would have the following.
(47) Reconstruct only N′ obligatorily; all other bar-levels of N optionally
     (or define as-if reconstructed, etc.)
What is the matter with (47)?
It seems clear that empirically there is nothing wrong with (47), at least
over the range of data that we are considering. (47), as well as the postulation of
Adjoin-α, will account for the following data set, assuming that of John is under
N′ and near John is under N″.
(48) a. *Whose examination of Johnᵢ did heᵢ fear?
     b. Which examinations near Johnᵢ did heᵢ peek at?
However, there is one rather striking difference between the formulation in (47)
and the Adjoin-α account: the latter analysis, but not the former, is intimately
bound up with the properties of the Projection Principle. In fact, assuming the
Projection Principle, and nothing else, the possibility that adjuncts may be added
in the course of the derivation, but not arguments, follows as a consequence. The
data in (48) could not be the reverse:
(49) a. Whose examination of Johnᵢ did heᵢ fear?
     b. *Which examinations near Johnᵢ did heᵢ peek at?
Given the account in (47), however, there is nothing which would favor the
formulation in (47) over the alternative in (50), for example.
(50) Reconstruct only the elements under the N″ node obligatorily; all
     other elements optionally.
The alternative in (50), in which only adjunctual elements are added back into
the representation at Reconstruction Structure, would not violate any constraint.
In particular, it would not violate the Projection Principle, so far as I can see.
Yet it would give rise precisely to the (erroneous) data set in (49).
In effect, the palette of possibilities allowed by the Reconstruction-type
approach (or chain-binding approach, if construed as (50)) is too large; the child
must be assumed to be given one sort of reconstruction statement (47) instead of
another (50), though there is no general principle to choose between them. This
is not the case with the Adjoin-α analysis, where the possibility of adjunction
follows directly from the Projection Principle, and no inverse is possible.³
3.4 An Account of Parametric Variation
Let us assume, then, that the grammar consists of at least two rules:
(51) Move-α
     Adjoin-α
Either may apply in the course of the derivation, in either order. Is there any
substructure in either?
3. There is another area of data here that might provide evidence that adjuncts must be added after
D-structure. This is based on an observation on NP-NP anaphora. The peculiar fact is that NP-NP
anaphora, with both names, is in general possible, but not if the second NP has part of its reference
specified by an adjunct. Some examples follow: the (a) example allows anaphora, but the (b) example
does not.

(1) a. The old man tried to catch the marlin, but the big fish was tough to catch.
    b. *The old man tried to catch the marlin, but the fish that was big was tough to catch.
(2) a. I looked at the Corvette, during which time the wheelless car really impressed me.
    b. *I looked at the Corvette, during which time the car without wheels really impressed me.
(3) a. I was reading Ulysses, while my baby was trying to eat the story of Bloom.
    b. ?*I was reading Ulysses, while my baby was trying to eat the story about Bloom.
    c. *I was reading Ulysses, while my baby was trying to eat the story that was about Bloom.
(4) a. I was examining the Stieglitz photograph, while my wife tried to buy the picture of the clouds.
    b. *I was examining the Stieglitz photograph, while my wife tried to buy the picture near the entrance.
I suggest that there is, and that, in particular, Adjoin-α is stated in this manner:

(52) Adjoin-α
     Default: Conjunction
In effect, this states that the grammar attempts to adjoin a second structure into
a first, but that if this fails, there is still a default operation. This is simple
conjunction: the linearization rule of Williams (1978). If (52) is correct, then
linearization is not so much a special rule or device, but just a default operation
which applies, when two argument skeletons exist, as they often do, and one
has not been embedded in the other. That is, additional information is needed to
adjoin one structure into another; if no such information is available, the
grammar simply conjoins the two structures as a matter of default.
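This "try Adjoin-α, else conjoin" logic can be sketched as follows (again an illustrative encoding of my own, not the text's formalism): if information identifying an adjunction site is available, the second structure is adjoined there; otherwise the two structures are simply conjoined, as in the co-relative construction.

```python
import copy

def adjoin(host, adjunct, target):
    """Chomsky-adjoin `adjunct` at the node equal to `target`, copying it."""
    if host == target:
        return [host[0], copy.deepcopy(host), adjunct]
    if isinstance(host, list):
        return [host[0]] + [adjoin(c, adjunct, target) for c in host[1:]]
    return host

def compose(root, second, target=None):
    """Sketch of (52): Adjoin-alpha when a site is licensed; otherwise the
    default, simple conjunction (Williams' 1978 linearization)."""
    if target is not None:
        return adjoin(root, second, target)       # English-type relative
    return ["S''", root, second]                  # co-relative: conjoined

head = ["NP", "the man"]
root = ["S", ["NP", "John"], ["VP", "saw", head]]
rc = ["S'", "who Bill liked"]

english = compose(root, rc, target=head)     # RC adjoined to its head
corelative = compose(root, rc)               # RC conjoined after the clause
```

The default branch requires no information beyond the two structures themselves, which is what makes it a plausible fallback when the adjunction site cannot be identified.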
Consider now the exhibition of parametric variation as it relates to the
statement in Chapter 1 (Chapter 1, (6)):

(53) The theory of UG is the theory of parametric variation in the specifications
     of closed class elements, filtered through a theory of levels.
Some languages (e.g. English) contain a relative clause that is structurally
adjacent to its head. Other languages, e.g. ancient Hittite (Cooper 1978) and
Austronesian languages, possess an adjoined relative clause, the so-called
corelative construction, which does not structurally form a constituent with the
head, but apparently exists in juxtaposition with the full clause, following it. The
analyses I have seen have not been exact about the precise phrase structure in
these cases; I will assume that the contrast is that below:
(54) English type:
     (NP(=N″) NP(=N″) S)

(55) Co-relative:
     (S′ (S NP (VP V NP)) RC)
If one wished to give a phrase structure analysis of the two languages, they
would differ in the rewrite rules below:

(56) English, Japanese, etc.:
     NP → (NP, S) (either order)
(57) Adjoined RC language:
     S → (S, S) (where S is of RC type)
Given the strictures of Stowell (1981), such an analysis is not available to us;
there is, however, something better. A relative clause must agree with its head
(Chomsky 1982). Let us assume that the rule doing this, a predication rule,
indexes the head with the element in Comp: the index gets copied from the head
to the element in Comp.
(58) (the man)ᵢ (who (I knew t))
     (the man)ᵢ (whoᵢ (I knew t))
It is, in fact, this rule of construal which allows the relative clause to be inter-
preted: syntactically, we may conceive that the saturation of the head of the
relative clause (who above) with an index is what licenses it. Let us call this
relative clause head the "relative linker": it is the link between the RC itself and
the NP head which it modifies.
What, then, is the difference between a language like English (or French or
Italian) and a language in which the co-relative construction is employed?
Just this: in English, the wh-head of the relative clause, the relative linker, must
be saturated in the syntax, while in any language employing a co-relative it need
not be. That is, the parametric difference is the following:

(59) a. English: Relative clause linker must be saturated at s-structure.
     b. Corelative language: Relative clause linker need not be saturated
        at s-structure.
Notice now what has happened. Rather than positing a difference in the phrase
structure rule generating relative clauses, the difference has been located
elsewhere: in the saturating of the relative clause linker. But this linker itself is
simply part of the closed class set. That is, the difference between the two
language types is isolated to a single difference in the specification of a closed
class element; an element which is part of a closed class set, and which is of
necessity finite. Further, this difference in the specification of an element in the
closed class set translates into a difference of representation at a particular level.
The difference, then, is reduced to something of the form of (53) above: the
relative clause differences are reduced to differences in the parametric specifications
of closed class elements, filtered through a theory of levels.
Further, the statement in (59) may itself be re-stated in line with the notions
that we have been assuming. When does Adjoin-α take place; what causes it?
The answer is at hand: Adjoin-α takes place at the point at which the relative
clause linker is saturated. More exactly, it is the saturation operation itself which
composes the two structures: the adjunct into the argument skeleton. There is a
1-to-1 relation between them.

(60) Adjoin-α ↔ Saturate RC linker
The adjunction operation thereby is in a 1-to-1 relation with the saturation
operation itself: it is the necessity for saturation in the syntax, in languages like
English, which composes the two (clausal) representations, and builds the relative
clause structure. In languages which do not require saturation of the relative
clause linker in the syntax (though perhaps they do by LF, in which case
adjunction will take place later in the derivation: I will remain neutral on this
point), the relative clause is simply conjoined at the end of the clausal construc-
tion: the co-relative construction.
The English case is the following.
(61) A1: John saw the man.
     A2: who Bill liked.
         Adjoin-α / Saturate linker ↓
     John saw the manᵢ whoᵢ Bill liked.
Thus the adjunction operation is actually put in a 1-to-1 correspondence with the
change in the specification of a single closed class element, the relative clause
linker.
We might, in fact, say the same thing for Move-α. The primitive
operation is generally taken to be the movement operation itself. However, this
may be taken to follow from, or, more exactly, to be put in 1-to-1 correspon-
dence with, the saturation of the Comp feature. Recall the parametric analysis
in Chapter 1. In that chapter, it was suggested that Move-α, as movement to
SpecC (Chomsky, 1986b), still needed to agree with a +/− wh feature in Comp.
This was necessary since the grammar needs to detect whether wh-movement has
taken place in the syntax, and this detection (i.e. selection) must be in terms of
the head position: Comp, not SpecC.
(62) a. *John didn't know Bill saw who.
     b. John didn't know who Bill saw.
If we assume that the coindexing and saturation of the +/− wh feature is attendant
upon movement, the following representation would result:

(63) John didn't know (C″ (SpecC whoᵢ) (C′ (C +whᵢ) (S Bill saw t)))
Here the +wh-taking verb, know, would select for the saturated +wh Comp. The
parametric difference between a language which requires (or allows) wh-movement
in the syntax, English, and a language which apportions it to LF, Chinese, then
comes down to the following specification on the wh-feature.

(64) +wh must be satisfied in the syntax (English)
     +wh need not be satisfied in the syntax (Chinese)
The Move-α rule itself, as it applies to wh-words, is put into 1-to-1 correspon-
dence with the satisfaction of that feature, exactly in the same way that the
Adjoin-α rule was put into 1-to-1 correspondence with the satisfaction of the RC
Linker. We may take the following to be equivalent:

(65) Move-α (as it applies to wh-words) ↔ Satisfy +wh feature
It is the satisfaction of the +wh feature which initiates Move-α.
This supports the finiteness claim of Chomsky (1981). Movement itself is
put into 1-to-1 correspondence with a contrast in a bit associated with a closed
class element.
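The parametric contrast in (64) can be rendered as a toy sketch (the encoding is mine, for illustration only): movement of the wh-word is nothing but the reflex of the requirement that the +wh feature in Comp be satisfied in the overt syntax.

```python
def surface_form(words, wh_satisfied_in_syntax):
    """Sketch of (64)/(65): front the wh-word iff the +wh feature in Comp
    must be satisfied in the syntax (English: True; Chinese: False,
    deferring satisfaction to LF and leaving the wh-word in situ)."""
    wh = [w for w in words if w.startswith("wh:")]
    if wh_satisfied_in_syntax and wh:
        rest = [w for w in words if not w.startswith("wh:")]
        return [wh[0]] + rest          # fronted, satisfying +wh in Comp
    return list(words)                 # wh-in-situ; +wh satisfied at LF

clause = ["Bill", "saw", "wh:who"]
print(surface_form(clause, True))   # ['wh:who', 'Bill', 'saw']  (English-type)
print(surface_form(clause, False))  # ['Bill', 'saw', 'wh:who']  (Chinese-type)
```

Note that the single boolean here plays the role of the one-bit contrast in the specification of a closed class element.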
Note an additional consequence. While Move-wh and Move-NP (Chomsky
1977b, van Riemsdijk and Williams 1981) are not specified as distinct opera-
tions via the movement rule itself (just as is in general the case in most recent
work, where a single rule Move-α is assumed), they are differentiated in terms
of the lexical feature which must be satisfied. In the case of wh-movement this
is the wh-feature itself; we leave it open for now what it is with respect to
NP-movement.
With respect to parametric variation, then, we are led to the following
picture. Adjoin-α either applies or does not apply in a given language for relative clauses.
If it does not apply, a default operation of simple conjunction occurs. The
parametric situation in terms of the operation is the following.

(66) UG: (Adjoin-α)
     Default: Conjunction

We may conceive of these two operations as being in a bleeding relation in a
derivation. If Adjoin-α does not apply, then Conjunction will.
(67) English: DS → Adjoin-α; Conjunction (always bled) → SS
     Co-relative language: DS → no Adjoin-α; Conjunction (always applies) → SS
The order of operation is the same in English and in a co-relative language. The
difference is that in English, Adjoin-α is always specified as applying (and so
conjunction, as a universal default, is always bled), while in a co-relative lan-
guage, Adjoin-α will never apply, and the default will take over.
This same difference may be looked at with respect to the specification of
the satisfaction of the relative clause linker. The linker either will have to be, or
will not need to be, satisfied in a given language.
(68) UG:
     a. RC linker must be satisfied
     b. RC linker need not/must not be satisfied (Default)
The grammar, having made the decision in (68), will adopt one or the other sort
of relative clause.
The situation is similar, though subtly different, in the case of wh-movement.
At first, it appears to be identical. We have a cross-linguistic difference in
movement, depending on the satisfaction of the +wh element in Comp at S-structure.
(69) UG:
     a. +wh element must be satisfied at SS (English)
     b. +wh element need not be satisfied at SS (Chinese)

If we assume that the nonspecification of information is always the default, then
the Chinese case would be the default case, with the wh-element in place, in
spite of the fact that it appears to be cross-linguistically less common.
However, there is an additional possible distinction in the wh-data, which
has been brought home most forcefully by recent work in Japanese (Hoji 1985),
and which also echoes a position that Joan Bresnan took further back (Bresnan
1977). Namely, given a dislocated topic or wh-word, it appears that there are two
derivations, in a language like Japanese, which could have produced it. In one,
the dislocated element (perhaps a topic) has been generated in its theta position
at D-structure, and is moved to the dislocated position in the course of the
derivation, by the rule Move-α. In the other derivation, the dislocated element is
generated in place at D-structure, this dislocated element is assigned a theta role,
and is somehow linked to the gap. The most plausible means by which this could
be done would be via operator movement plus ultimate co-indexing of the
operator with the dislocated phrase (as in Chomsky's 1977a analysis of tough-
movement), though it is possible that there is some other sort of indexing
procedure altogether. Note that in this case, one might wish to assume that there
is some sort of auxiliary theta role associated with the base-generated D-structure
position (perhaps in tough-constructions in English), so that the element gets a
theta role at DS, and that it additionally gets a theta role from the operator. See
also Chapter 5 (end of chapter), where I suggest that this possibility may hold for
wh-questions in early speech.
(70) Dislocated element:
     a. coindexed by Move-α
     b. generated in place, linked to gap by Move-α of operator
These possibilities may be put together as follows:
(71) Wh-Dependencies:
     a. Dislocated:
        i. Movement of Element
        ii. Base-generation of Element
     b. Not Dislocated (+wh feature need not be satisfied at SS)
In terms of parameter-setting, the parametric situation here would be determined,
first, by the necessity for the wh-feature to be satisfied by SS (or not), and
second, by the possibility of theta assignment to dislocated positions. There does
seem to be some evidence in the acquisition sequence for a switch between the
two left branches in (71) (see Chapter 5), but no evidence for the child adopting
the right branch in initial grammars, i.e. that the +wh feature need not be dis-
located at SS in English. This may be due to the fact that the right branch is not
a default option cross-linguistically, or due to the fact that there is no positive
evidence for that option, or due to some other factor. See Chapter 5 for discussion.
I should note that a Barriers-type analysis suggests the possibility that it is
not the wh-element itself which binds the trace, as suggested in traditional
analyses of wh-movement, nor an index associated with the phrasal category, as
suggested in van Riemsdijk and Williams (1981), but the wh-feature itself in Comp.
(72) traditional analysis:
     whoᵢ did John see tᵢ?

(73) van Riemsdijk and Williams analysis:
     I don't know whoᵢ (ᵢ John saw tᵢ)
(74) (S (NP I) (VP don't know (CP whoᵢ (C′ (Comp +whᵢ) (S John saw eᵢ)))))
     (after Chomsky 1986)

There is some evidence for this position; see Chapter 5 for discussion.
3.5 Relative Clause Acquisition
Let us now take a look at the acquisition of relative clauses. If the above analysis
of parametric variation is illuminating, it should carry over, ceteris paribus, to the
acquisition of relative clauses as well.
The major work on the acquisition of relative clauses is Tavakolian (1978);
previous work had been done by Sheldon (1974) as well as many others;
subsequent work has been done by Goodluck (1978), Solan and Roeper (1978),
Hamburger and Crain (1982), and again many others. The basic contention of
Tavakolian (1978) is the following: children attempt to parse RCs with the rules
(and computational resources) present in the grammar; to the extent that these
fail, they adopt a conjoined clause analysis of the relative clause. The relevant
structures, then, would be the following.
(75) Subject/Object relative (adult grammar):
     (S (NP The sheep) (VP (V kissed) (NP (NP the monkey) (S who tickled the rabbit))))

(76) Subject/Object relative (child grammar, when it fails):
     (S (S (NP The sheep) (VP kissed the monkey)) (S (NP who) (VP tickled the rabbit)))
Tavakolian notes that children interpret this in the same manner as the corre-
sponding conjoined clause structure: The sheep kissed the monkey and tickled
the rabbit. Assuming, as she assumes, that there is a null element in the
conjoined clause (i.e. the relative), and assuming that there is the high attach-
ment, then the propensity for young children to interpret the relative clause as if
it were modifying the first subject (the sheep above) is explained. It is treated as
a sort of co-ordinate construction. This hypothesis differed from an earlier
hypothesis due to Amy Sheldon, who suggested that children attempt to maintain
parallelism in grammatical function between elements in the matrix and the
relative clause: thus, in Sheldon's view, subject-subject relatives would be well-
understood (i.e. RCs where the subject had been relativized and associated with
the main clause subject), and object-object relatives, but not subject-object
relatives or object-subject relatives.
Tavakolian adduced a number of pieces of evidence for her position. The most
interesting have to do with the difference in the comprehension of RCs with relative
subjects, depending on whether they hang off of a subject or an object NP. For
subject-subject relatives, the comprehension data is the following (Tavakolian 1978):
(77) SS relatives:
     The sheep that hit the rabbit kisses the lion.
         1            2               3

(78) Response category

     Age        Correct (12,13)   12,23   21,23   12,32   Other
     3.0–3.6          18            2       1       0       3
     4.0–4.6          16            5       1       0       2
     5.0–5.6          22            0       0       2       0
     Totals           56            7       2       2       5

Note: A 12,13 response means that the child acts out 2 actions, one in which the first
NP acts on the second (12), and one in which the first NP acts on the third (13).
Similarly for all the number pairs.
It is clear that children do very well in the comprehension of this relative. For
contrast, now, consider the object/subject relative: i.e., the subject relative off of
an object. Note that according to the usual notions of parsing complexity, these
structures should be easier than the subject relative off of a subject, since they
involve right branching rather than left branching. The results are the following.
(79) Object/Subject relatives
     The lion kissed the duck that hit the pig.

     Age        Correct (12,23)   12,13   12,31   12,32   21,23   Other
     3.0–3.6           1            17      1       2       1       2
     4.0–4.6           4            15      3       1       0       1
     5.0–5.6           9            13      1       0       1       0
     Total            14            45      5       3       2       3
The clear result, lessening over time, is that children choose a response in which
the subject of the relative clause is the matrix subject, not the matrix object: i.e.
the child takes the 12,13 response (as if "the lion kissed the duck and hit the
pig"), not the 12,23 response ("the lion kissed the duck and the duck hit the
pig"). This is a remarkable contrast with the Subject/Subject response, where the
child's response is largely appropriate. Note also that it would be unexpected
given the usual parsing theory of greater complexity in left-branching structures.
Given the analysis of relative clauses suggested in this chapter, however, the
Tavakolian data follows immediately. Let us modify Tavakolian's conjoined
clause analysis, so that it becomes not simply a parsing principle, but is
integrated into the general structure of the grammar. Let us assume, in particular,
as above, that relative clauses are not present in the base, but rather are added in
in the course of the derivation. There are two ways for a language to do so. It
may have recourse to the rule Adjoin-α, which adjoins a relative to its nominal
head. This is shown in (80).
(80) Structure 1: (S NP (VP V NP))    Structure 2: S
     Output: English-type relative
Or, it may conjoin the structure (Conjoin-α). For convenience, let us assume that
this involves daughter adjunction under S. These are the parametric possibilities.
Let us now make the simplest assumption about the immature grammar: that
it may, under conditions of computational complexity, have recourse not to the
actual rule in the target grammar of the language to be learned, but rather to the
default rule allowed by UG.
(81) The immature grammar may have recourse to the default rule.
In such a case, the grammar would be "un-English," but it would not be "un-
UG." It would simply be displaying an option available in UG, but unavailable
in the language to be learned. Note that this gives a rather different view of
parameter-setting than is conventionally understood. Rather than the child first
setting the parameter at a default, and attempting to learn the actual value, the
child has the actual value as a target at all times. When the grammar/computational
system fails, it takes the default, as a default. It is not, however, vacillating
between two choices. A physical analogy would be: the grammar is a 3-space,
and in the 3-space are hills to be climbed; these are the parameters to be set.
In times of computational difficulty, the grammar may fall into a hollow. These
hollows are the default settings. Both the hilltops and the hollows are specified
possibilities of UG: the system, however, is trying to hill-climb. It is only under
conditions of computational complexity that recourse is had to a setting not true
to the target language.
Returning from these general considerations, let us consider how recourse
to the default setting will explain the Tavakolian data. Let us take as a point of
departure the assumption that children are having recourse to the default setting:
Conjoin-α. Let us also, more tentatively, assume that this involves daughter
adjunction of the RC S-node underneath S.

(82) Default grammar (children): Conjoin-α
(83) Conjoin-α is daughter adjunction under S.
What now happens in the case at hand? Assuming the default grammar in (82)
for both the subject and object relatives, the child would have the following
structural analyses.
(84) Relative off of subject:
     Structure 1: (S NP (VP V NP))    Structure 2: S (the relative)
     Output after Conjoin-α: the relative S daughter-adjoined under the matrix S

(85) Relative off of object:
     Structure 1: (S NP VP)    Structure 2: S (the relative)
     Output after Conjoin-α: the relative S daughter-adjoined under the matrix S
In both cases, the RC is daughter adjoined to the S. However, this gives us
precisely the result that we want with respect to interpretation. The relative
clause does not form a constituent in relation to any NP: in all cases it hangs off
of S. There is, therefore, no natural relative interpretation. However, the RC does
lack a subject: it is a sort of predicate with an unsaturated subject position. This
means that it may be construed with any sister NP to form a proposition. In the
case of the relative clause off of a subject, the sister NP will be the subject of
the sentence, and the appropriate interpretation will be given "accidentally," so
to speak. In the case of the relative clause off of an object, the RC will again be
daughter-adjoined under S, and the relevant sister will be the subject of the
sentence: the object of the sentence would not c-command the relative clause.
The interpretation that will be given, therefore, will be one in which the relative
clause is construed as interpreted with the subject of the sentence: the wrong
interpretation.
ADJOIN-α AND RELATIVE CLAUSES 125
By assuming that the child has recourse to the default operation, then, we
are able to account for the pattern of data in the misconstrual of these relative
clauses by children. Strikingly, no separate parsing principle is needed, but what
is needed is a radical restructuring of our understanding of the genesis of relative
clauses. In this way, the acquisition theory may actually lead the syntactic theory
to a novel analysis.

Is there any additional evidence that this sort of analysis is correct? In fact,
there is. In Tavakolian's analysis, the difficulty for children in interpretation is
linked to a structural difference in the phrase markers between children and
adults: the Object/Subject relative is attached high by the children but not by
adults. Solan and Roeper (1978) distinguish Tavakolian's account from the
parallel structures account in the following way.

Solan and Roeper constructed sentences in which the relative clause, a
subject relative, is attached off of an object. This is similar to the Object/Subject
sentences above. However, they provided a crucial test to determine whether the
high attachment analysis (conjoined clause analysis) is correct. Namely, they
chose sentences which contained, in addition to the direct object, an obligatory
prepositional object. These were sentences using the verb put.
(86) The lion put the turtle that saw the fish in the trunk.
The adult analysis of (86) would have the RC adjoined to the head noun the
turtle. This is shown with the full line in the diagram in (87). Suppose, however,
that because of computational difficulties the child cannot use the rule Adjoin-α.
Then the default Conjoin-α should appear (the Tavakolian analysis). However,
in this case, Conjoin-α, interpreted as S-conjunction, also must fail, because it
would result in crossing branches (Solan and Roeper 1978). Hence there is only
one further possibility: that the relative clause remains entirely unattached in
the structure. Now if we make the additional obvious assumption that only rooted
structures may be interpreted, this would mean that the relative clause would not
be erroneously interpreted by the child as conjoined (i.e., as having the subject
as its antecedent) for these put constructions, but rather would not be interpreted
at all. In fact, this seems to be the case (88).
(87) [tree diagram: The lion put the turtle that saw the fish in the trunk. The
     full line shows the adult attachment of the RC to the turtle inside the
     object NP; attaching the RC under S (Conjoin-α) would cross the
     obligatory PP in the trunk, marked "?"]
(88)                       Conjoined Clause Response   Failure to Interpret RC
     Sentences with put               0                          40
     Sentences with push             42                           6
The Solan and Roeper data show clearly that Tavakolian's analysis, and the
analysis here, are correct. The child first attempts to have recourse to the
adjunction structure: Adjoin-α. If that fails, he or she attempts to conjoin the
structure: Conjoin-α. If that fails, the relative clause must remain unrooted, and
so uninterpretable.

(89) a. Adjoin-α (Correct Interpretation)
     b. If fails, Conjoin-α (Conjoined Clause Interpretation)
     c. If fails, remains unrooted (No Interpretation)

The acquisition data and the syntactic analysis involving a syntactic rule of
adjunction are then in perfect parity.
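The fallback cascade in (89) can be sketched procedurally. This is a toy illustration, not part of the proposal itself: the function and the two well-formedness predicates (`adjoin_ok`, `conjoin_ok`) are hypothetical stand-ins for the grammatical conditions on Adjoin-α and Conjoin-α.

```python
# Sketch of (89): try Adjoin-alpha; failing that, Conjoin-alpha; failing
# both, the RC stays unrooted and receives no interpretation.
def analyze_rc(matrix, rc, adjoin_ok, conjoin_ok):
    """Return (structure, interpretation) for a relative clause."""
    if adjoin_ok(matrix, rc):
        # Adjoin-alpha: the RC forms a constituent with its head NP.
        return ("adjoined", "correct")
    if conjoin_ok(matrix, rc):
        # Conjoin-alpha: RC daughter-adjoined under S, so it is construed
        # with the c-commanding subject (the conjoined-clause reading).
        return ("conjoined", "subject construal")
    # Unrooted structures are uninterpretable (the put cases).
    return ("unrooted", None)

# Push-type sentence: Adjoin-alpha fails, Conjoin-alpha succeeds.
print(analyze_rc("push clause", "RC", lambda m, r: False, lambda m, r: True))
# Put-type sentence: conjoining would cross branches, so both fail.
print(analyze_rc("put clause", "RC", lambda m, r: False, lambda m, r: False))
```

On this sketch, the Solan and Roeper put/push contrast reduces to whether `conjoin_ok` holds: with put, the obligatory PP blocks S-conjunction, so the cascade bottoms out at "unrooted".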
3.6 The Fine Structure of the Grammar, with Correspondences: The General
Congruence Principle
I wish now to present one view as to how the theory of levels, the theory of
parametric variation, and the theory of acquisition relate.

In the sections above, I have suggested that there is a rule adding relative
clauses, and in general adjuncts, in the course of the derivation. However, this
rule itself, Adjoin-α, has a certain substructure with respect to the derivation. We
may consider it as an optional rule in UG ordered before the default associated
with it, Conjoin-α, which it completely bleeds, if present.
(90) UG Specification:
     DS → SS, via (Adjoin-α), Conjoin-α

(91) Universal Grammar:        DS → SS, via (Adjoin-α), Conjoin-α
     English-type grammar:     DS → SS, via Adjoin-α, Conjoin-α
     Co-relative-type grammar: DS → SS, via Conjoin-α
This general point of view, a parameter-setting approach, has the structure in
(92), with G1 being a relative clause-head language like English, and G2 being a
co-relative type language.

(92)  Universal Grammar
          G0
         /  \
       G1    G2
As noted in the previous section, this view, while of a parameter-setting type,
differs from a standard parameter-setting view in at least two ways. First, the
initial grammar, G0, is not a possible final grammar. The initial grammar has
Adjoin-α as an optional rule, and this presumably is not an option, at least
generally, for the final grammar. A more usual parameter-setting approach
might have the child's grammar originally set at either G1 or G2, and changing
over, if necessary, to the other type.

Second, this view diverges from a standard parameter-setting view with
respect to the notion of setting a parameter. In the standard view, a parameter
is a sort of cognitive switch: the child starts with the switch set in a particular
direction, and the setting of the switch may change. Each position is stable. In
the representation in (92), however, with an operation/default type arrangement,
the original setting is neither of the final two settings, and the parameter is not
so much a switch to be set as a hill to be climbed, a target of the system. Thus
when the child fails to apply the rule Adjoin-α in his/her grammar, or in his or
her analysis of a sentence, the default rule Conjoin-α is fallen into. It is as if the
former (Adjoin-α) were a local maximum surrounded by a local minimum
(Conjoin-α). Moreover, the grammar itself must be organized in such a fashion
that when the attempt at a target rule fails, a local minimum must exist to fall
into, which itself is a possible specification of UG. This at least is an obvious
conclusion to draw from such an approach.
There is, finally, an interesting property of this system in (92) which
requires note. It supports the general sort of philosophical framework that has
been set forth by Chomsky (1986a) and Fodor (1975, 1981) with respect to the
nature of learning. The instance of parameter-setting given in (91) is very strange
from the point of view of traditional learning-theoretic and behaviorist notions,
or even more commonly held man-in-the-street views, according to which
learning is an accretion of knowledge or information. What is actually happening
in (91) is the reverse of that. Each of the two final grammars in (91), G1 and G2,
actually has less information than the initial grammar G0 in terms of the number
of bits of information in them. The initial grammar has two pieces of information
associated with the operation Adjoin-α: Adjoin-α itself, and the parentheses ( )
surrounding it. The final grammars have less information than that. In the
English-type grammar, the parentheses surrounding Adjoin-α have been erased:
this means that the single piece of information, Adjoin-α, as an operation, exists.
The co-relative language contains even less information, containing neither
Adjoin-α nor the parentheses. This means that both of the final grammars have
fewer pieces of information in them than the initial grammar: the process of
learning involves the erasure of information specified in UG, at least for this
central case. This is very much in line with the sort of view of learning that
Chomsky/Fodor propose, and much against an accretionist view.
Let us return to the main problem. The general structure of the choice
situation in (92) can be used both to describe the parametric situation cross-
linguistically, and the child's acquisition problem. The child's undecided
language may be associated with G0, and she may choose either of the two
options, G1 or G2. There is an asymmetry in the choice of options, in that G2 is
the default (cf. the section above): if the child is aiming for G1 as the target
grammar she may fall into G2 under conditions of computational complexity, but
not the reverse. Further, there are three welcome (or at least interestingly
different) features of the analysis which recommend it: i) it introduces a
developmental aspect, in that the initial grammar in UG, G0, is not a final
grammar, ii) it views parameter-setting not so much as setting a switch, as
attempting to reach a target grammar (climbing a hill): the default grammar is
therefore fallen into, rather than initially specified, and iii) it views learning as
the erasure, rather than the accretion, of information.
All this appears well and good. But there is a hidden difficulty at this point
for the thesis of this work. According to the General Congruence Principle
(Chapter 2) there is some congruence relation between the acquisition sequence
and the organization of operations in the grammar. But it seems fairly clear that
this is not the case for the analysis of relative clauses presented so far. The
particular format of the parameter-setting approach given in (92), repeated below,
can hold both for the structure of parameters cross-linguistically and for the
child's setting of a parameter in her language.
(93)  UG: Structure of choice of grammars (parameter-setting)
          G0
         /  \
       G1    G2
The structure of operations in the grammar, in UG, has so far been presented as
rather different: it involved the optional specification of Adjoin-α followed by
the obligatory specification of Conjoin-α.
(94)  UG: Structure of operations
      DS → SS, via (Adjoin-α), Conjoin-α
A language like English would erase the brackets in (94), while a co-relative
language would erase the entire specification (Adjoin-α). However, if a principle
like the General Congruence Principle is to be correct, suggesting that there is
a deep correspondence between the structure of operations within a grammar
(94) and the parametric-acquisitional choice (93), then the particular organiza-
tions in (93) and (94) cannot possibly both be correct: there is no isomorphism
between them, as can be seen by simple inspection. Rather, either (93) or (94)
must be mistaken, and there must be a common format for the two aspects of
Universal Grammar.

Let us therefore proceed in this manner. Change the format in (93) from
that in which a pure choice exists between grammar G1 and G2 to something of
the format in (95).
(95)  Parametric Specification in UG
      G0 ((Adjoin-α → G1) Conjoin-α → G2)
The interpretation of the parentheses in (95) will be peculiar; I return to this
below. And let us keep the format of the operations in the grammar virtually the
same, changing it slightly in typography.
(96)  DS ((O1: Adjoin-α → s1) O2: Conjoin-α → s2)
Ignoring parentheses, the figure in (96) is to be read as follows: the operation O1,
Adjoin-α, maps the DS into structure s1; the operation O2 maps the representation
into structure s2. s2 may be identified with SS in the case where no other
operations have applied.
It is apparent that the structure of the grammars in (95) and the structure of
the operations in (96) are identical.
More on notation. The parentheses are not to be read as optionality. Rather,
they are to be read as invisibility. The material inside the parentheses is invisible
to the grammar/acquisitional device. The grammar develops by removing
parentheses, allowing for the instantiation of operations already specified in UG.
Second, the particular numbering on the grammars, operations, and structures
(e.g., s1 vs. s2) is of no ultimate significance: s2, for example, may come
directly after DS in a particular language's grammar.
Let us start now with (96). I suggested earlier that a child's initial grammar
had recourse to a default rule of Conjoin-α to analyze relative clauses. Prior to
this, however, the child's grammar filters out relative clauses altogether, only
understanding the main proposition. The full developmental sequence is the following.

(97) Stage I: relative clause not understood at all (filtered out)
     Stage II: relative clause understood as generated by the rule Conjoin-α
     Stage III: relative clause understood as generated by the rule Adjoin-α

This full developmental sequence is represented in the diagram in (96) if we
assume that: i) operations within the parentheses at time t are unavailable to the
grammar at that time, and ii) the child progresses by removing parentheses in the
UG representation, starting from the most external set and proceeding inward.
Consider how this would work. The initial grammar for the child would
simply be the following:
(98)  DS ((O1: Adjoin-α → s1) O2: Conjoin-α → s2)
The operations in parentheses would be unavailable to the child in the initial
grammar; that is, both the operations Adjoin-α and Conjoin-α would be unavail-
able. This means that the representation in the grammar with respect to these
operations would be simply that in (99).
(99) DS
Since by earlier assumption the DS representation is a pure representation of the
rooted argument-of relation, with no adjuncts present, this means that the child's
initial analysis of a sentence like (100a) would be simply (100b), without
Adjoin-α or Conjoin-α applying.

(100) a. The man saw the woman who knew Bill.
      b. The man saw the woman.

That is, the child's grammar at the stage in (99) would be doing the adjunct
filtering that was noted earlier. This goes along with the observation that in
initial stages, relative clauses are simply dropped by the child.
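This Stage I filtering can be sketched as a bracket filter over a toy encoding of (100a); the `[RC ... ]` bracketing and the function name are illustrative assumptions, not a claim about the child's actual input representation.

```python
# Stage I: with neither Adjoin-alpha nor Conjoin-alpha available, only the
# rooted argument-of skeleton survives; adjunct clauses are dropped.
def filter_adjuncts(words):
    """Drop any span enclosed in [RC ... ] brackets, keeping the matrix."""
    kept, depth = [], 0
    for w in words:
        if w == "[RC":
            depth += 1          # entering an adjunct clause
        elif w == "]" and depth > 0:
            depth -= 1          # leaving it
        elif depth == 0:
            kept.append(w)      # matrix material only
    return " ".join(kept)

# (100a) -> (100b): the relative clause is filtered out entirely.
sentence = "The man saw the woman [RC who knew Bill ]".split()
print(filter_adjuncts(sentence))  # The man saw the woman
```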
What then is the next stage? According to the above, parentheses are erased,
starting from the outermost inward. Erasing the outermost parentheses in (98)
would give rise to the following grammar:
(101)  DS (O1: Adjoin-α → s1) O2: Conjoin-α → s2
Assuming that the material inside the parentheses is invisible to the child, this is
simply equivalent to the grammar in (102), where the numbering on the opera-
tion (O2) is not significant.
(102)  DS  O2: Conjoin-α → s2
Conjoin-α will be the operative operation in the grammar at this point. This
means that the child will interpret a sentence with the actual bracketing in (103a)
as having the bracketing in (103b), and one with the actual bracketing of (103c)
as having the bracketing in (103d). This is because only Conjoin-α, not Adjoin-α, is
part of the grammar at this point. This explains the Tavakolian result.
(103) a. (S The man saw (NP (NP the woman) (S who knew Bill)))
      b. (S The man (VP saw (NP the woman)) (S who knew Bill))
      c. (S (NP (NP The man) (S who knew Bill)) (saw the woman))
      d. (S (NP The man) (S who knew Bill) (saw the woman))
In (103b) the relative off of the object has been attached high, daughter-
adjoined under S. This means that the subject is the only possible controller,
i.e., coreferent item, with the subject variable who. This is indeed the mistake
that children make, choosing the subject of the sentence as the subject of the RC.
In (103d), the relative clause is again attached high. However, in this case, with
the subject as controller, the correct interpretation is gotten even though the
structural analysis is faulty. So the Tavakolian facts follow.
In the final stage in the acquisition of the construction, the innermost
brackets are removed (e.g., in the acquisition of English). This gives rise to the
following derivational representation.
(104)  DS  O1: Adjoin-α → s1,  O2: Conjoin-α → s2
This was exactly the sequence of operations that we noted earlier as the appro-
priate one for English ((91) above), with Adjoin-α continually bleeding
Conjoin-α for the appropriate choice of structures.
Thus the sequence of grammars that the child passes through is accounted
for if we assume the following representation in UG, together with a rule
which removes outermost brackets on the basis of positive evidence.
(105)  DS ((O1: Adjoin-α → s1) O2: Conjoin-α → s2)
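The bracket-removal rule just stated can be sketched by encoding the specification in (105) as nested lists, with a list layer standing in for a pair of parentheses; this encoding is an illustrative assumption.

```python
# Learning as parenthesis erasure over DS ((Adjoin-alpha) Conjoin-alpha):
# material inside a list ("parentheses") is invisible to the grammar, and
# the learner peels layers away from the outside in.
def visible(spec):
    """Operations visible at this stage (anything not still parenthesized)."""
    return [x for x in spec if not isinstance(x, list)]

def erase_outermost(spec):
    """Remove the outermost remaining layer of parentheses."""
    out = []
    for x in spec:
        if isinstance(x, list):
            out.extend(x)   # open this layer of parentheses
        else:
            out.append(x)
    return out

g = [[["Adjoin-alpha"], "Conjoin-alpha"]]  # Stage I: all parenthesized
print(visible(g))                          # [] -- adjuncts filtered out
g = erase_outermost(g)                     # Stage II
print(visible(g))                          # ['Conjoin-alpha']
g = erase_outermost(g)                     # Stage III
print(visible(g))                          # ['Adjoin-alpha', 'Conjoin-alpha']
```

The three printed states line up with Stages I-III of (97): no analysis of the relative clause, the conjoined-clause analysis, and the adult adjunction analysis.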
Consider now the parametric situation. I suggested earlier that there was a
congruence between the structure of operations in levels, and the structure of
parameter-setting itself. This requires that the parametric structure of grammars
is the following:
(106)  G0 ((Adjoin-α → G1) Conjoin-α → G2)
The parentheses are given the same interpretation as above: as invisibility, if
present.

The first grammar would therefore be that in (107a); this would be read
simply as (107b).
(107) a. G0 ((Adjoin-α → G1) Conjoin-α → G2)
      b. G0
Neither the adjunction nor the conjunction operation would be an attribute of this
grammar. While no human language apparently has this property, i.e. is a pure
representation of argument structure with no adjunctual possibilities allowed (all
are too rich for this), it is possible that certain subparts of natural language
have precisely this property. I am thinking in particular of the lexicon, or lexical
representation, which is often thought to represent argument structure, but not
adjunctual structure or conjunction (Bresnan 1982; Zubizarreta 1987). The idea
that the most primitive grammar is a pure representation of argument structure,
and that this is the type of the lexicon, also goes along with the idea presented
in Chapter 2, that the original grammar is the lexicon, where the lexicon itself
has a tree-like structure.
The next ordered grammar would involve the removal of the outermost
parentheses in (106). This would be the representation in (108a), which would be
read as (108b) (recall that the subscripts bear no absolute significance).
(108) a. G0 (Adjoin-α → G1) Conjoin-α → G2
      b. G0  Conjoin-α → G2
This would demarcate precisely the co-relative languages, with no head-adjoined
clause structure. These grammars are less rich structurally than those containing
Adjoin-α, and are the first to be reached temporally.

The next and final grammar would be the one reached after the final,
innermost, parentheses had been removed.
(109)  G0  O: Adjoin-α → G1  O: Conjoin-α → G2
       (arrows to be read as the addition of the operation to the grammar)
What is the significance of this representation? At first, it appears quite unrea-
sonable. It states that the child, having started from an original G0, passes to a
grammar which is characterized by having in addition the operation Adjoin-α.
However, from there, he or she has the additional operation of Conjoin-α added
into the grammar. But since Conjoin-α will not even be relevant now for the
structures under consideration, what sense does this make?
I would suggest, however, that it is precisely this organization that is needed
to allow for the fact that when the child fails with Adjoin-α, he or she falls into
a grammar that is characterized by the rule Conjoin-α. If we assume the
bifurcationist structure above, repeated again below, then there is no reason in the
format of the grammars themselves why the child should fall into G2, failing G1.
This is represented by the arrow, but that has no formal significance.
(110)      G0
          /  \
        G1 → G2   ("falls into")
But given the format in (109), there is such an organization. Under normal
conditions, the rule Conjoin-α is always ready to be added to the grammar in
(109). However, for the relevant RC structures it is always bled, since the rule
of Adjoin-α is known to apply. Consider now what happens upon the failure of
Adjoin-α, let us say on a construction-by-construction basis. To represent this
we may simply cross out the operation, and the resulting grammar will look like
(111b) (recall that the numbering on grammars has no significance).
(111) a. G0  Adjoin-α (crossed out) → G1  Conjoin-α → G2
      b. G0  Conjoin-α → G2
This grammar, however, is simply the grammar in which Conjoin-α holds:
exactly the default grammar that was wanted. Thus by this particular arrangement
of the grammar, the retreat of a grammar into a default is accounted for formally,
notationally, and not simply by fiat.

Rather than crossing out the operation (Adjoin-α), we may consider a
failure under conditions of computational complexity as tantamount to the re-
insertion of parentheses in a format in which they have already been removed.
This would be equivalent to a regression to a state which is less specified, and
closer to the original Universal Grammar representation. The grammar would
thus regress to the grammar in (112a), which is read as (112b).
(112) a. G0 (Adjoin-α → G1) Conjoin-α → G2
      b. G0  Conjoin-α → G2
Namely, the default grammar, in cases of computational complexity, would be
the grammar in which conjunction held. This is precisely the needed result.
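The regression in (112) can be given the same sort of toy encoding: failure on a construction re-inserts the parentheses around Adjoin-α, leaving Conjoin-α as the operative rule. The function names and list encoding are hypothetical illustrations.

```python
# Fallback as parenthesis re-insertion: parenthesized operations (nested
# lists) are invisible, so re-bracketing Adjoin-alpha yields the default
# grammar in which only Conjoin-alpha applies.
def operative(ops):
    """Visible (unparenthesized) operations of the grammar."""
    return [op for op in ops if not isinstance(op, list)]

def regress(ops, name):
    """Re-insert parentheses around the named operation upon failure."""
    return [[op] if op == name else op for op in ops]

adult = ["Adjoin-alpha", "Conjoin-alpha"]      # target grammar G1
fallback = regress(adult, "Adjoin-alpha")      # (Adjoin-alpha) Conjoin-alpha
print(operative(adult))     # ['Adjoin-alpha', 'Conjoin-alpha']
print(operative(fallback))  # ['Conjoin-alpha']
```

The design point is that the fallback grammar is not stipulated separately: it is what remains visible once the failed operation is re-parenthesized, i.e. a state closer to the original UG representation.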
To summarize: in this section, I have argued that there is a principle, the
General Congruence Principle, which relates the structure of operations in a
grammar and the structure of parameters themselves. These are equivalent up to
isomorphism, at least for the analysis of relative clauses. Second, the child
proceeds by removing parentheses in a representation. This supports the
Chomsky/Fodor position with respect to learning: in this case, at least, learning
is not the accretion of information, but the removal of information, representing
the removal of possibilities from a universally specified set. Third, under
conditions of computational complexity the child falls into a default grammar
involving conjunction rather than adjunction. This however is not due to a
separate parsing principle (as Tavakolian suggests), but rather to a retreat to
a grammatical format which is closer to the UG format, with all parentheses
included. Finally, the nature of parameter setting is taken to involve not so much
the flipping of a switch, but the climbing of a hill. This represents a local
maximum (the target grammar) surrounded by local minima (the default options).
Both the target grammar and the fall-back grammars must be represented in UG.
3.7 What the Relation of the Grammar to the Parser Might Be

A common position in the acquisition literature has been that the grammar
remains constant over time, and is hidden by the exercise of parsing strategies.
That is, when the grammar/computational system fails to come up with an
analysis, an exogenous parsing strategy enters in and returns an analysis not
countenanced by the current grammar: the child's analysis, so to speak, falls out
of the grammar, and returns a value which is not one of the possible permissi-
ble targets. In the following, I would like to take the position that that sort of
masking of the grammar does not in fact occur: i.e., that it is not the case that
the grammar remains constant and is masked by an autonomous system of
parsing, production, etc., with its own separate principles. Rather, to the extent
to which parsing and performance considerations matter, they do so via the
grammar itself, either directly, in the sense that possible restrictions on left-to-
right computations are taken up into and stated in the grammar, or indirectly,
where if the child's analysis fails, it falls into another permissible grammar, as in
the discussion above.

Clearly a complete argument for this position would be impossible at
present, given that so little is known about the parser. What I will do instead is
outline some possible relations of the parser to the grammar, with special
attention to the sorts of claims that have been made in the acquisition literature.
Let us imagine what a parsing account of ungrammaticality might be, by
imagining two instances of it. The first is taken from Frazier (1979), in which
she suggests that there may be a parsing ground for why that is required for
sentential subjects.
(113) a. That John loves horseradish is obvious.
b. *John loves horseradish is obvious.
Frazier notes the following: suppose that there is a parsing principle like that of
minimal attachment. Then the sentential subject without a that complementizer
would, in the course of the parsing derivation, be immediately attached to the root.
(114)  Parsing representation (left-segment):
       [S(root) [NP John] [VP loves horseradish]]
This representation would then have to be reanalyzed at a later point, so that it
was subordinated, at a later stage of the parse. Suppressing details:
(115)  [S [S John loves horseradish] [VP is obvious]]
       (reanalyzed as subordinated)
Such a reanalysis would not have to be done for the sentential subject marked
with that, a subordination marker. Suppose that we assume that such a reanalysis
is either: i) costly, or ii) impossible under these conditions. If the latter,
then we would have the basis for a direct parsing account for the ungrammat-
icality in (113b). If the former, which is what Frazier assumes, then the necessity
for the that-complementizer is a parsing-based necessity, but this is encoded in
the grammar in a way which may not be based on a parsing vocabulary at all,
for example in terms of proper government (Stowell 1981). By parsing vocabu-
lary in the last sentence, I mean minimally an explanation which depends on left-
to-rightness.
Let us take a second example, a direct parsing-grammatical account of
that-trace effects. This particular account is by myself, and it is intended for
demonstration purposes only (though it may turn out that something like it is
true, if facts like the que/qui alternation in French could be handled). Assume
that the partial computations in the left-to-right parse must be grammatically
well-formed, fulfilling X-bar theory, proper government of null categories, etc.
Assume further that null closed class heads need not be postulated until the end
of selected domains of the parse, and that a limited amount of reanalysis, in
terms of addition of phrasal categories, is allowed (exactly how this is done, I
will leave unspecified).

Consider now the that-trace effect in (116).
(116) a. Who do you believe e is here?
b. *Who do you believe that e is here?
In (116a), the null category could be parsed as part of the matrix in a left-to-
right partial computation, if we assume that null categories in argument position
must be projected immediately, and that categories like the embedded IP are
projected at the time that their heads are encountered.
(117)  Partial parse:
       [CP Who do you [VP [V believe] [NP e] ...]]
       (e properly governed in the partial parse)
On the other hand, no such partial parse exists for the construction with that. The
null category, if posited, will not be properly governed during the intermediate
parse, prior to the uncovering of Infl.
(118)  [CP Who do you [VP [V believe] [CP [C that] [IP [NP e] ...]]]]
       (e not properly governed)
This would then constitute a parsing-grammatical explanation for that-trace
effects. It would make predictions as well: e.g. that that-trace effects should not
be collapsed with a general inability of extraction from subjects.
As noted above, this explanation is for demonstration purposes only: what
I would like to concentrate on is not the particular explanations above, but their
general type. These would constitute genuine parsing-theoretic explanations of
types of ungrammaticality. Frazier's explanation would restrict the set of
grammars by developing constraints on the possible re-analysis of a partially
parsed tree; the constraint directly above would restrict the set of grammars by
stating well-formedness conditions on partially computed objects. In this latter
case, these well-formedness conditions would be exactly the same conditions
which characterized the full phrase marker.

The claim is often made that the early linguistic system is more dependent
upon, or is masked by, the parser, but it is not quite clear what this means. For
this type, would it mean that there are more constraints of this type on the early
grammar? That they are stricter?
Further, it is unclear, in a terminological way, that one would want to call
the above constraints parsing constraints. Let us call a constraint a left-linear
constraint if it is a constraint on the building up of a tree (from a string), in a
left-linear fashion. The two example constraints above would be instances of left-
linear constraints. The parsing theorist, insofar as he or she is making claims
about the parser directly determining the properties of the grammar, is making
well-formedness claims about the formal, partially computed object in a left-to-
right analysis. Yet these constraints on left-linear analysis characterize the
speaker as well as the hearer. But then, insofar as such constraints exist, they
should simply be considered part of the grammar: i.e. a part of the grammar
concerned with the well-formedness of certain subtrees. That is, the grammatical
theory should be expanded to consider these as part of the grammar: they would
be part of some future grammar, they would be left-linear constraints in such a
grammar, and would not have a different ontological character than simple
grammatical constraints.
The following sorts of relations seem possible (focussing on the failure of
the grammatical system in acquisition).

(119)  Role of Parser
       A. Parsing considerations determine grammar (at least partly)
          - Directly: Left-Linear Constraints (Frazier)
          - Indirectly: Fall into less mature grammar (this book)
       B. Parsing considerations do not determine grammar
          (parser returns value not in grammar)
          - Grammar Masked (Hamburger and Crain): correct value
            returned; unavailable
          - Ranking Hypotheses
The parsing theory implicitly adopted in this chapter, and in the work as a
whole, is that parsing considerations indirectly determine the grammar, in the
sense that computational difficulties cause the system as a whole to fall into a
less mature system, where this system is both grammatically prior and computa-
tionally simpler. That is, it is not so much that the parser determines the form of
the grammar, but that the parser (i.e. computational considerations) partly
determines what grammar one is in, out of a sequence of successive grammars
in acquisition. In this particular addendum, I have suggested there may as well
be ways by which the linking is more direct, to the extent to which left-linear
constraints are directly stated in the grammar (see Weinberg 1988, for such a
view); unfortunately there have been very few grammatical-parsing theories of
this left-linear type, so it is difficult to gauge their range of application: see
Marcus, Hindle, and Fleck (1983) for an exception. In general, I should tend to
favor either of these two sorts of approaches on the left, which may be broadly
distinguished from those which, when confronted with a nonadult structure in
acquisition, hold that the grammar is fully adult, but that it is not reached by the
parser: i.e. that the parser returns a value not in its permissible range. Rather, it
seems preferable to assume that the grammar is organized in such a way that
when the child is confronted with a structure too (computationally) difficult for
him or her to analyze, the grammar/computational system falls back as a unit to
a grammar/computational system in which the child can analyze the string and
return a permissible value in the less advanced system, even if some elements in
the string must be ignored (and part of the meaning may be ignored or errone-
ously construed). This would be the case if the following were true.
(120) Property of Smooth Degradation
The child's analysis degrades smoothly when faced with a not fully
understood input.
(121) Principle of Representability
All analyses by the child are generated by the child's grammar.
These two assumptions may appear to be obvious, but their mutual adoption has,
it seems to me, far-reaching effects in the grammar. One would expect the
Property of Smooth Degradation to hold of any truly robust learning system.
When a failure occurs in the analysis of the input, this property ensures that the
child has some sort of analysis of an incoming string, and thus ensures that when
the child hears a sentence that cannot be completely analyzed, a partial analysis
will still be able to be given, so that i) the meaning can be partially recovered,
and ii) the elements which are not understood can be isolated. The Property of
Smooth Degradation distinguishes, I believe, the particular sort of failure that one
finds in intermediate stages of child language, from those which occur in deficits
due to injury or stroke, i.e. various types of aphasia, and of course from other
sorts of simple, non-redundant input systems like radio receivers.
The Principle of Representability requires the grammar to be reached by
the parser at all times: all values in the parser's range are in the grammar.
How should the Property of Smooth Degradation be modeled (if it is in
fact the case, as it seems to be)? It suggests that there is a sort of redundancy in
the system. However, this redundancy is not the formal redundancy of identical
elements, nor the overlaying of constraints (see Chomsky's (1980) comments on
the Tensed Sentence Condition and the Specified Subject Condition), but the
overlaying of a more articulated and richer system over one which is less so.
Part of the way in which this could be accomplished would be by having
recourse to operation/default organization within the grammar; another would be
to have two (or more) systems operating in distinct vocabularies over the same
input string: in particular, the Case and theta system (perhaps several systems
within Case); see Chapter 4. With respect to vocabulary, the redundancy is a
functional redundancy, not a formal redundancy: the two systems are distinct in
their primitives.
But then by the Principle of Representability, the partial analysis by the
child must itself be represented in the grammar. That is, the child may be viewed
as passing through a sequence of distinct and gradually enriching grammars, the
simpler grammars acting as a backup for the more complex ones.
Many of the traditional findings in the psycholinguistic literature may be
viewed in precisely this way: not as the exercise of autonomous parsing
principles, but as the falling back into an earlier grammatical analysis. For
example, a traditional finding due to Bever (1970) is that children initially
misanalyze passives, when they do, as equivalent to the corresponding active form.
(122) John was V-ed by Mary.
Child's interpretation: John V-ed Mary. (active)
Bever interpreted this as involving the exercise of an autonomous parsing
strategy: namely, the child tries to fit the structure NP-V-NP over the input
string, where the first NP is an agent, and the second, a patient. Yet this same
finding may be viewed not as implicating a parsing strategy, but as the fitting of
the direct lexical form of the verb, or its form after Project-α, to the input. As
such, part of the string would be (mis-)analyzed, and the rest would be ignored.
That is, instead of viewing the misanalysis as due to the intercession of a
separate system, one may view it as the retreat to a former system, respecting
the Principle of Representability above.
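Bever's misreading can be made concrete with a toy sketch (the word list and frame-fitting function below are illustrative assumptions, not Bever's formalism or the one defended here): fitting the bare NP-V-NP frame over a passive string, while ignoring unanalyzable closed class material, yields exactly the active misinterpretation in (122).

```python
# Toy illustration (assumed lexicon and frame): fitting the direct
# lexical frame NP-V-NP to an input, ignoring closed class material
# the child cannot yet analyze, yields Bever's active misreading.

CLOSED_CLASS = {"was", "were", "by", "the", "a"}  # assumed toy inventory

def fit_np_v_np(words):
    """Strip closed class items; read the residue as agent-V-patient."""
    content = [w for w in words if w not in CLOSED_CLASS]
    if len(content) != 3:
        return None  # the frame cannot be fitted; no analysis returned
    agent, verb, patient = content
    return {"agent": agent, "verb": verb, "patient": patient}

# "John was hit by Mary" -> misanalysis: John = agent, Mary = patient
print(fit_np_v_np("John was hit by Mary".split()))
```

Note that the misparse is itself a well-formed analysis of the (reduced) string, as the Principle of Representability requires; only the closed class residue is lost.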
Let me go into somewhat more detail about what the indirect approach in
(119) above would be for relative clauses. Let us suppose that, with respect to a
given construction type, e.g. relative clauses, the child successively adopts over
the course of development one of three analyses: i) the first, in which relative
clauses are entirely filtered out, ii) the second, in which there is high attachment
(Tavakolian 1978), and iii) the third, in which the relative clause is correctly
adjoined to the head, and construed with it. The second of these analyses corresponds
roughly to the co-relative construction found in the world's languages; the third
corresponds to a grammar in which the phrase marker is a pure representation
of the argument-of relation. We may list the successive grammars below.
(123) Grammars for Relative Clauses
G1 RC not attached
G2 High Attachment
G3 Regular Attachment
Suppose now that we reach a particular situation in acquisition. Namely, a child
sometimes chooses a high attachment for the relative clause, and sometimes
chooses the correct NP-S or Det-N-S analysis. One possibility is that the
child's grammar picks out the right analysis, but the child's parser incorrectly
returns a different analysis. That is, the parser returns a value, an erroneous
phrase marker, which is not one of the permissible values in its range, the set
of structures countenanced by the grammar. The parser masks the grammar.
The indirect possibility is the following. At a given time, the entire compu-
tational device, the grammar/parser, is at a certain stage of development. There
is a particular grammar that the device is located at, say G3. Together with these
grammars are paired analyses sanctioned by them.
(124) Time    Grammar    Analysis
      t1      G1         A1
      t2      G2         A2
      t3      G3         A3
The claim is the following. If the grammar at a particular point fails, then it is
not masked, but retreats back to the analysis associated most directly with some
previous grammar. Thus at time t3, the child is normally at grammar G3, which
is associated with analysis A3. However, given a particularly difficult sentence,
the child may fall back to analysis A2, associated with grammar G2, or possibly
even A1, though the last would be unlikely. The situation in which the child is
sometimes returning values of A2 (high attachment), and sometimes those of A3
(the correct analysis), corresponds to the time in which a child may vary in his
or her analysis, according to other factors (computational load, pragmatic
considerations, etc.). However, the child never falls out of a grammar specified
in UG. That is, the parser never returns a value which is not countenanced by
any grammar that the child has ever adopted. This means that even the mistakes
of the child are fully subject to grammatical analysis: they show, in fact,
the geological layering of the grammar.
This type of analysis has an additional prediction to make. Namely, the
simpler grammars that the child falls into must also be computationally simpler.
For it would do the child little good to fall into a grammatically simpler system,
if the system were computationally more complex. The full position should
therefore be the following, where G1 is simpler than G2, which is simpler than
G3, on grammatical grounds, and P1 is simpler than P2, which is simpler than
P3, in terms of parsing operations.
(125) Time    Grammars    Analyses    Parsing Operations
      t1      G1          A1          P1
      t2      G2          A2          P2
      t3      G3          A3          P3
      where G1 < G2 < G3
            P1 < P2 < P3
      and < denotes degree of simplicity, along some metric
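The architecture in (124)-(125) can be sketched as a small program. The cost values and the resource parameter below are purely illustrative assumptions, standing in for whatever the simplicity metric behind "<" turns out to be; the point of the sketch is only the joint effect of Smooth Degradation (failure at a grammar falls back to an earlier one) and Representability (every returned analysis belongs to some grammar in the sequence).

```python
# Illustrative sketch (assumed names and costs): a sequence of grammars
# ordered by simplicity. Analysis always comes from SOME grammar in the
# sequence (Representability); failure at Gn retreats to Gn-1 rather
# than returning a value outside every grammar's range (Smooth
# Degradation).

GRAMMARS = [
    ("G1", "A1: RC filtered out", 1),    # (name, analysis, parsing cost Pn)
    ("G2", "A2: high attachment", 2),
    ("G3", "A3: regular attachment", 3),
]

def analyze(available_resources):
    """Return the analysis of the most advanced grammar the current
    computational load permits, degrading smoothly to earlier ones."""
    for name, analysis, cost in reversed(GRAMMARS):
        if cost <= available_resources:
            return name, analysis
    return None  # no analysis at all, but never a value outside the sequence

print(analyze(3))  # the mature analysis, G3
print(analyze(2))  # fallback under load: G2, high attachment
```

On this sketch, the child who vacillates between high and regular attachment is simply being observed at different points of available resources, never outside UG.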
I have argued in this paper so far precisely for such a grammatical difference in
simplicity. The grammar itself has, for any particular operation, both a place with
respect to a lexical sequence of values (the values for the closed class elements,
choosing operations), and a place in the operational sequence. Along the latter of
these, a very simple sequencing exists: remove external parentheses. The later
grammars are more advanced than the earlier ones in that a larger number of
external parentheses have been removed (the mind employs the art of the
sculptor: at least here). To retreat to an earlier grammar, all that is necessary is
to reinsert the most external parenthesis.
Chapter 4
Agreement and Merger
The organization of operations that I will assume is within the general frame-
work of Government-Binding theory, but extended to include certain composition
operations. While composition operations will be used, the primitives are those
of GB theory, including Case theory, theta theory, Move-α, and so on. Moreover,
the composition is not strictly bottom-up. This chapter will introduce the relevant
notions.
There are three basic questions that I wish to focus on:
I. What is the primitive vocabulary of the grammar (NP, VP, …; agent,
patient, …; nominative, accusative, …; subject, object, …)?
II. What is the set of primitive operations?
III. How do the distinct vocabularies (Case theory, theta theory, etc.) enter
into the description of the phrase marker? What is the organization of rules
or operations in the grammar?
These questions are intended to be answered in such a way as to guarantee the
following:
IV. That finiteness is a necessary part of the grammar (see Chomsky 1981, and
Chapter 1),
V. That differences in vocabulary type are modelled in an adequate way in the
grammar; in particular, that the grammar makes the same sensible cuts as
the vocabulary types themselves do, by means of its organization (e.g. Case
theory vs. theta theory; open class vs. closed class elements),
VI. That the acquisition sequence bears a congruence relation to the structure of
the grammar (Chapters 2 and 3).
Each of the questions I-III may be looked at in either of two aspects: with
respect to Universal Grammar, or with respect to the mapping in the derivation
itself. It is a thesis of this work that there exists a deep isomorphism between the
two: i.e. that the structure of choices in UG is isomorphic to the structure of
operations in the derivation.
4.1 The Complement of Operations
I take there to be two basic processes in the grammar, the second divided into two
subparts. These are the following:
(1) Assignment of features (theta assignment)
(2) Copying of features
a. Unidirectionally (Case government)
b. Bidirectionally (Agreement)
The unidirectional copying of features is government. Government takes place
under strict sisterhood. The canonical relation of this type would be Case
government, where a verb or preposition copies abstract Case features to its
right or to its left.
(3) hit  Bill
         (accusative)
or
(4) Input:  hit (+acc. feature)  Bill
    Output: hit  Bill (+acc. feature)
The examples in (3) and (4) show two different ways of conceptualizing the
operation. The logic is clearer in (4), where an actual feature, +accusative Case,
is transferred from the head to the Case-governed element.
There is also the bidirectional copying of features (not the same features).
This is an instance of agreement. Agreement also takes place under strict
sisterhood. The canonical case of agreement is agreement under predication for
the subject-predicate relation (Williams 1980).
(5) Predication
    Before copying:  NP [a features, d features]    VP [b features, e features]
    After copying:   NP [a, d, b features]          VP [b, e, a features]
The mutual copying of features is shown in (5). The relevant categories, NP and
VP, each have features associated with them (the labelling a, b, d, etc. is simply
conventional: the labels have no significance). In (5), each category is associated
with a set of features: the features which will ultimately be copied from them (a
features for NP, b features for VP), and a residue (d features for NP, e features
for VP). After the mutual copying operation has taken place, NP and VP share
certain features (a and b features), and do not share others (d and e features).
In the theory of Williams (1980), predication involves the copying of an
index from the NP onto the VP. According to the discussion here, this cannot be
the case, since predication is actually an agreement relation, agreement between
an NP subject and a VP, and such relations are bidirectional, not unidirectional.
Rather, predication involves copying the number of the subject NP onto the VP
(where it percolates down to the head), and the copying of the external theta role
associated with the VP onto the NP subject.
(6)
number
theta role
Predication and Agreement
NP VP
This is then a typical instance of an agreement relation, with information passing
in both directions. Note that if agreement is essentially a bi-directional operation,
the general reduction of the Subject-Infl-Predicate relation into one of govern-
ment of the Subject by Infl is erroneous. Rather, this is an instance of agreement,
a symmetrical operation, unlike government. This would then constitute a distinct
primitive relation in the grammar.
Along with the unidirectional (Case govt.) and bi-directional (agreement)
copying of features, there is another operation, which I have called feature
assignment. A better name might be feature sanctioning or licensing. This is
different from the copying of features, because in the latter case the features
actually do originate with the head, and once they are copied the head no longer
retains them. With the assignment or sanctioning of features, there is no copying,
but simple licensing in a configuration. Thus I take (7) to be an instance of
feature licensing, but not (8).
(7) hit  Bill
         patient
(8) *Input:  hit (+patient)  Bill
     Output: hit  Bill (+patient)
Of (7) and (8), the first of these, (7), is more accurate. This is because there is
no time prior to the licensing of the theta role: it is not ordered in the derivation;
at every point the theta role is already assigned. Feature assignment or licensing
in this sense is not a temporal operation, as feature copying is, but rather a
continuous process. The configuration in (7) is continuously licensed in the course of
a derivation. Since there is no copying of information from the head to the comple-
ment, feature licensing may continually apply. It is for this reason that the Projection
Principle holds. Namely, this relation is not of a copying type, and such relations
may take place continuously or constantly over the course of a derivation.
To summarize the foregoing, I assume the following operations.
(9) Operation Type     Example     Structural Cond.   Informat. Flow
    Feature lic.       theta       sisterhood         (continuous)
    Feature copy
      a) Unidirect.    Case ass.   sisterhood         head to compl. (once)
      b) Bidirect.     subj-pred.  sisterhood         bi-directional (once)
We have, then, different types of information flow: feature assignment vs. two types
of feature copying (unidirectional and bidirectional). Following from this, there
is a difference in type or mode of application. If a process involves a transfer of
a piece of information (feature-copying), then it must take place at a single time
in the derivation. If it involves what I have called feature assignment or licens-
ing, then it may apply throughout. Note that this use of the terms feature
assignment or feature licensing is more restrictive than the usual sense, which
would include such things as Case assignment (which I am calling an instance of
feature copying). For feature assignment or licensing, but not feature copying,
there may be constancy principles holding, such as the Projection Principle.
Finally, these different types of operations are associated with different vocabu-
laries. Feature assignment or licensing (in this restrictive sense) involves theta
roles, unidirectional feature copying is exemplified by Case assignment, and
bidirectional feature copying is associated with agreement. In the NP-VP case,
this involves the copying of the number onto the VP from the subject, and the
copying of the VP-associated theta role onto the subject.
All the processes so far discussed have been assumed to apply under strict
sisterhood. There are further distinctions which might be made: for example, it
may be that Case assignment requires adjacency as well, while theta assignment
does not. This would be the case if the prepositional object to Mary in cases like
(10) were assigned a theta role by the verb; I will assume that it is.
(10) John gave a book to Mary.
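The three regimes in (9), continuous licensing versus one-time unidirectional and bidirectional copying, can be sketched as follows. The dictionary encoding and function names are assumptions made purely for illustration; they are not part of the formal theory.

```python
# Sketch of the operation types in (9) (names and encoding assumed).

def license_theta(head, role):
    """Continuous licensing: no information is transferred; the
    configuration is merely checked, so this can re-apply at any
    point (hence a constancy principle like the Projection
    Principle can hold of it)."""
    return role in head.get("theta_grid", [])

def copy_case(head, sister):
    """One-time unidirectional copy (Case government): the feature
    originates with the head and, once copied, is not retained."""
    sister["case"] = head.pop("case")

def agree(np, vp):
    """One-time bidirectional copy (subject-predicate agreement):
    number goes NP -> VP; the external theta role goes VP -> NP."""
    vp["number"] = np["number"]
    np["theta_role"] = vp["external_theta"]

verb, obj = {"case": "accusative", "theta_grid": ["patient"]}, {}
copy_case(verb, obj)                   # obj now bears accusative; verb does not
subj, pred = {"number": "sg"}, {"external_theta": "agent"}
agree(subj, pred)                      # mutual: number to VP, theta role to NP
print(license_theta(verb, "patient"))  # licensing is still checkable afterwards
```

The design point the sketch makes concrete is that only the copying operations consume information, which is why they are confined to a single moment in the derivation.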
4.2 Agreement
I would now like to introduce a second type of Case assignment. After earlier
work (Lebeaux 1987), I will call this phrase structural Case. I will assume that,
like theta assignment, phrase structural Case assignment takes place throughout
the derivation. It is closest to the operation Assign GF in Chomsky (1981), but
also may be spelled out as a particular case in the case system. Phrase structural
Case, unlike structural case, is dependent on mother-daughter relations, not head-
sister relations (Lebeaux 1987). In Lebeaux (1987), I argue that the first instances
of Case assignment to the subject position by the child are actually instances
of the assignment of phrase-structural Case, not structural case (see Chomsky
1981: structural case is sister-assigned Case). Thus in examples like (11), phrase
structural Case is assigned to the subject position.
(11) My/me did it.
I will assume that there are three major places in which phrase structural case is
assigned in adult English.
(12) a. subject of S: (NP, S)
     b. subject of NP: (NP, NP)
     c. topic position: (NP, S′)
If we consider the topic position to be the subject of the utterance, or perhaps the
subject of (what used to be known as) S′ or S″, then its appearance would be
regularized to the other two.
Note that phrase structural Case, like structural case, may take different
forms. The subject of NP is marked genitive, while the topic position is marked
accusative.
Note as well that all of these positions are either islands ((b) and (c)) or
partial islands ((a)), from the point of view of extraction.
Phrase structural Case assignment differs from simple structural case
assignment in a few central ways. First, unlike structural case assignment, it is
assigned optionally. Thus the subject position of an NP need not be assigned any
case: e.g. if no lexical element is in that position. Second, it would be assigned
under the mother-daughter relation. It would not be an instance of feature
copying, since this would require that information be copied from a mother node
to one of its daughters. Rather, it would be an instance of feature assignment or
feature licensing, which has the technical restricted sense given to it above. This
fact has a further consequence: phrase structural case may apply several times
throughout the derivation. This follows from the fact that the assignment of such
case does not involve the transfer of information from one node to another, but
rather the scanning of a tree to see if the structural condition has been met. In
this sense, as in many others, it is similar to the Assign GF relation in
Chomsky (1981), the relation from which function chains are formed (and which
must apply throughout the derivation). The full set of operations, then, is the
following.
(13) Operation Type     Example     Structural Cond.   Informat. Flow     Application
     Feature licensing
       a.               theta       sisterhood         head to sister     continuous
       b.               PS case     mother-daughter    mother to daugh.   continuous
     Feature copying
       a. Unidirect.    struct. c.  sisterhood         head to sister     single time
       b. Bidirect.     agreement   sisterhood         both directions    single time
Thus there are two continuous processes, theta assignment and phrase-structural
Case assignment, and two one-time processes, the assignment of structural case,
and agreement. How are the two operations known or postulated to exist in the
grammar, Move-α and Adjoin-α, incorporated into this scheme? Move-α moves
an NP into an A or A′ position: the subject position of S, or the Spec C position
of C′. The latter movement may be thought of as movement into the subject
position of C′. Thus both types of movement are movement into a subject
position, broadly construed (Pustejovsky 1984). If we conceive of movement in
this way, then movement itself may be conceived of as a sort of by-product of a
more primitive necessity. That necessity would be to saturate the +/− wh feature
in the case of wh-movement, and to saturate Infl in the case of NP-movement.
These both would seem to fall under the rubric of agreement: i.e. the necessity
for agreement would initiate NP movement.
The table above suggests that movement of both types may be consid-
ered a result of, or more exactly, in a 1-to-1 relation with, feature satisfaction. Let
us adopt the following terminology: an operation O is initiated by a feature F iff
the satisfaction of F requires that O take place. This would work in the obvious
way for structures like (14). Given an input structure like that in (14), the
satisfaction of the closed class RC linker would initiate the relative clause
adjoining operation.
(14) S1: [S [NP The man] [VP [V saw] [NP the woman]]]
     S2: [S′ [Comp who] [S I knew]]
The two operations, Saturate-RC Linker and Adjoin-α, are therefore in a 1-to-1
relationship; the necessity for saturation involves or initiates the adjoining
operation.
Similarly, wh-movement may be considered to be in a 1-to-1 relationship
with the satisfaction of the +/− wh feature in the Comp for the clause in which
the wh-element finally appears. Note that this differentiates the final target of
wh-movement from any of its intermediate landing sites. The situation here is
therefore somewhat more complex than that with the Adjoin-α operation, because
the wh-element may move several times in the derivation, with only the last
movement into the Spec C satisfying the +/− wh feature in Comp. We might
assume that the full set of movements is initiated once the ultimate Comp
feature requires satisfaction, or alternatively, that the intermediate movements are
free, and only the last movement is regulated by the necessity for feature
satisfaction. I leave this open.
If we adopt such a solution, then instead of thinking of a derivation as being
composed of primitive operations (Move-α, Adjoin-α), we may consider it to
consist of the specifications of closed class elements, which must be satisfied.
For the operations above this would be the following:
(15) Move-NP    satisfy Infl/Agr
     Move-wh    satisfy +/− wh feature
     Adjoin-α   satisfy RC linker
This provides for an interestingly different way of conceiving of the operations
of the derivation: that they are equivalent to the satisfaction of the specifications
of certain closed class elements. This would satisfy the finiteness characteristic
noted in Chapter 1 (and in Chomsky 1981). While there would be ordering relations
between the necessities for satisfaction in a particular derivation, there would be
nothing in the scheme in (15) to require that actual levels be picked out.
Move-NP and Move-wh would therefore be differentiated in the following
way: not by the specification of the movement rule itself (operationally), but in
the satisfaction of the differing closed class elements, or equivalently, in the
differing agreement relations which take place. There are two operations: Agree:
Subj./Pred. and Agree: Spec C/C. The first of these applies to the structure in (16).
(16) Before: [S [NP e] [VP was [V hit] [NP John]]]
     After:  [S [NP John] [VP was [V hit] [NP e]]]
Then it is actually the Subject/Predicate operation itself which forces movement.
Similarly, wh-movement may apply in a 1-to-1 correspondence with the relation
which satisfies the +/− wh feature in Comp. Call this Spec C/C agreement.
(17) Before: [S′ [Spec C e] [C′ [C e] [S [NP John] [VP saw who]]]]
     After:  [S′ [Spec C who] [C′ [C did] [S John see e]]]
Since this is another sort of agreement relation, agreement actually underlies both
wh-movement and NP-movement.
The two operations, then, are the following:
(18) Agree Subject/Predicate   Move NP
     Agree Spec C/C            Move wh
These operations initiate movement. By their application, wh-movement and
NP-movement take place. The third operation which has been introduced so far
is the adjunction of the relative clause into the NP. So far, I have suggested that
this involves saturation of the relative clause linker. This by itself would make
the operation of Adjoin-α a unidirectional operation. However, as Chomsky
(1982) notes, the relation of the relative clause to the head is also an operation
of Predication. If this is correct, then relative clause formation (adjunction) is also
an instance of a bidirectional operation: the head N satisfies the relative clause
linker, but at the same time the relative clause itself is predicated of the head N.
This would mean that relative clause adjunction would also be an instance of
agreement.
(19) Agree Subject/Predicate      Move NP
     Agree Spec C/C               Move-wh
     Agree Rel head/relativizer   Adjoin-α
This is shown above. In fact, this would mean that all of the operations which
involve a radical change of any type (both forms of movement,
relative clause adjunction) are initiated by the action of agreement. Other types
of information relations, theta assignment and structural case assignment for
example, do not radically change the structure of the tree, or the position of
elements in it.
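The view in (15) and (19), on which each structure-changing operation stands in a 1-to-1 relation with the satisfaction of a closed class specification, can be sketched as follows. The table and encoding are illustrative assumptions only; the finiteness of the closed class inventory is what bounds the set of operation types.

```python
# Illustrative sketch (assumed encoding): a derivation viewed not as a
# list of primitive operations but as the satisfaction of closed class
# specifications, as in (15)/(19). Each unsatisfied feature initiates
# exactly one structure-changing operation (the 1-to-1 relation).

INITIATES = {                 # assumed feature -> operation table
    "Infl/Agr": "Move-NP",
    "+/- wh": "Move-wh",
    "RC linker": "Adjoin-alpha",
}

def run_derivation(unsatisfied_features):
    """Apply the operation initiated by each closed class feature, in
    order; the derivation converges only when every specification in
    the (finite) inventory has been satisfied."""
    return [INITIATES[feature] for feature in unsatisfied_features]

# e.g. a passive containing a relative clause might demand:
print(run_derivation(["RC linker", "Infl/Agr"]))
```

Since `INITIATES` is drawn from a finite closed class inventory, only finitely many operation types are ever available, which is the finiteness property at issue.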
The brunt of this section, then, has been to introduce a new primitive
operation into the grammar: agreement. Unlike Case assignment, agreement is
intrinsically bi-directional. It should not be reduced to some other primitive
operation (e.g. government). Agreement has as well two other attributes. It can
always be put in a 1-to-1 correspondence with the satisfaction of a closed class
element, as in (19) above. This guarantees finiteness. Second, it is agreement
itself (so far) which composes substructures into a whole.
4.3 Merger or Project-α
4.3.1 Relation to Psycholinguistic Evidence
Let us look at another sort of phenomenon, apparent in acquisition. It is a
commonplace in the acquisition literature that children in the earliest stages of
language just use open class elements in production. This is the famous stage of
telegraphic speech, where the closed class morphemes have dropped out (Brown
1973). The phenomenon of telegraphic speech is well known to every linguist,
as well as every parent, yet very little has been made of it in the literature. Nor
has much been made of the closed class/open class distinction as a significant
demarcation in adult speech. Indeed, in Aspects (Chomsky 1965), as well as in
recent work by Joseph Emonds (Emonds 1985), some attempt has been made to
model the open class/closed class distinction in a derivational way. The proposal
in Chomsky (1965) was to allow for late S-structure insertion of closed class
elements; a similar proposal is made in Emonds (1985). These two have perhaps
been the main line proposals in the syntactic literature, but neither has been
substantively followed up. Yet the existence of telegraphic (i.e. open class)
speech by children suggests that the open class representation would have
considerable significance in development; the General Congruence Principle
would direct that it have repercussions on representations in adult speech as well.
Examples of telegraphic speech are given below.
(20) see ball
here Mommy
want orange juice
make castle
etc.
In spite of the paucity of proposals about the open class/closed class distinction
in syntactic theory proper, this lack of attention does not seem to be principled.
One reason for the lack of interest has to do with the fact that closed class
morphemes do not belong to a single category (like NP), but rather to any of a
number of types. There are Determiners (the), auxiliary verbs (may), inflectional
elements (to), prepositions (to, of), and pronouns (him). Since the majority of
generalizations in linguistic theory are stated in terms of category types (e.g.
lexical NPs need Case), the lack of a coherent categorization for closed class
elements has perhaps drawn investigation away from this class.
A second reason bears more directly on acquisition. While in general GB
theorists have shown a useful suspicion of functionalist proposals, within the
realm of telegraphic speech such proposals have reigned supreme, without
substantive criticism. The functionalist proposal for the absence of closed class
elements in early speech would be simply the following: the child has limited
memory and computational resources in early stages. Given such a limitation,
morphemes are at a premium. And since open class morphemes are information-
rich compared to closed class morphemes, it is hardly surprising that the child
has recourse to the former rather than the latter.
There are, however, a number of difficulties with this functionalist account.
First, even given limited resources, one would expect that the closed class
morphemes would appear sometime, if the child had command of them. The fact
that they do not appear at all, in this early stage, suggests that their exclusion is
principled, not simply functional in character. More exactly, while there may be
a functional reason why closed class morphemes are not generally used, it is
reasonable to believe that this functionalist reason has been grammaticalized:
i.e. realized in the grammar in a principled and meaningful way. Otherwise, one
would expect occasional outcroppings of closed class elements even in the
earliest stages, something which does not occur (except for pronouns).
Evidence from quite a different area suggests as well that there are real
differences in the adult computational system in the handling of open class and
closed class elements. I am thinking here of a quite complex paper by Garrett
(1975). Garrett analyzed a large corpus of speech errors, gathered by Shattuck-
Hufnagel and himself, the so-called MIT Corpus (approximately 3400 errors).
Exchange errors fell into two basic types: those which occurred between
independent words, and those which occurred in what Garrett calls combined
forms, essentially involving the stranding of bound affixes, as the free mor-
phemes were interchanged.
(21) Independent form exchanges (examples):
a. I broke a dinghy in the stay yesterday.
b. I've got to go home and give my bath a hot back.
(22) Combined form exchanges (examples):
a. McGovern favors pushing busters.
b. It just sounded to start.
c. Oh, that's just a back trucking out.
(exchanged elements underlined)
The independent forms and the combined forms apparently operate differently
in exchanges: in particular, the former, but not the latter, obey form class (i.e.
syntactic category), according to Garrett, and this constraint is stronger in
between-clause exchanges. He thus suggests that there are two independent levels
of syntactic processing (see also Lapointe 1985, for discussion):
a. Exchanged words that are (relatively) widely separated in the intended
output or that are members of distinct surface clauses will serve similar
roles in the sentence structures underlying the intended utterance, and, in
particular, will be of the same form class. These exchange errors repre-
sent interactions of elements at a level of processing for which functional
relations are the determinant of computational simultaneity.
b. Exchanged elements that are (relatively) near to each other and which
violate form class represent interactions at a level of processing for
which the serial order of an intended utterance is the determinant of
computational simultaneity.
from Garrett (1975)
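Garrett's combined-form ("morpheme stranding") exchanges can be given a toy rendering; the (stem, affix) encoding below is an assumption made for illustration, not Garrett's model. The free stems trade places while each bound affix stays in its original slot, as in (22a).

```python
# Toy illustration (assumed encoding) of a combined-form exchange:
# free stems are swapped, bound affixes are stranded in place.

def stranding_exchange(pairs, i, j):
    """Given (stem, affix) pairs, swap the free stems at positions i
    and j while each bound affix remains in its original slot."""
    stems = [stem for stem, _ in pairs]
    stems[i], stems[j] = stems[j], stems[i]
    return [stems[k] + affix for k, (_, affix) in enumerate(pairs)]

# intended "busting pushers" surfaces as the attested error:
print(stranding_exchange([("bust", "ing"), ("push", "ers")], 0, 1))
# -> ['pushing', 'busters']
```

An independent-form exchange, by contrast, would simply swap the whole words, which is why only the combined forms diagnose a level at which stems and closed class affixes are handled separately.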
Yet more interesting, from the point of view of the theory advocated here is the
following comment on the stranding of closed class morphemes (Garrett 1975):
The errors we have been referring to as combined form exchanges are errors
of a rather remarkable sort. They might, as a matter of fact, have been more
aptly described as morpheme stranding errors, for not only are the permuted
elements nearly always free forms, but the elements left behind are as often
bound morphemes
(30) Im not in the read for mooding
(31) he made a lot of money intelephoning stalls
(32) Shes already trunked two packs.
Why should the presence of a syntactically active bound morpheme be
associated with an error at the level described in (b)? Precisely because the
attachment of a syntactic morpheme to a particular lexical item reflects a
mapping from the functional level to the positional level of sentence
planning
It is examples like those in (30)-(32) that lead Garrett to propose that syntactic
production is divided into two levels: a functional level, and a positional level,
and that the former is mapped into the latter.
This whole line of research might be sanitized from the point of view of
linguistics by assuming that what it really pertains to is the theory of the
language producer and language acquirer (in the case of telegraphic speech). This
would then require that these be part of a separate theory, of unclear extent,
which does not have to have any correspondence with syntactic theory per se.
Let us nonetheless not take such a position, and instead reach to integrate the
Garrett-Shattuck proposals, and the phenomenon of telegraphic speech within
linguistic theory proper. This is exactly the position which was taken in the last chapter
with regard to relative clauses, where it was argued that the high attachment of
RCs noted by Tavakolian (1978) was not a separate parsing principle, but an
instantiation of a possibility open in UG: namely, of having a co-relative
construction. By taking such a position, a sort of synthesis was achieved between
the (now standard) parameter setting approach, and the thesis of this work, that
real development takes place. I will concentrate here more on the telegraphic
speech, with the Garrett data forming a sort of backdrop.
There is a nal observation which suggests that the stage of development of
telegraphic speech is an organized stage, and that it should be taken account of
in adult speech as well. This is the simple, but meaningful observation that
adults, as well as children, can speak telegraphic speech. If we viewed such
speech as simply the direct result of a computational decit by the child, we
would expect that adults would no longer be able to produce such speech, at
least insofar as this would require mimicking a computational decit that the
adult no longer had. Given the fact that adults can speak telegraphically, there is
a strong implication, though of course no sure proof, that telegraphic speech is
an actual subgrammar of the full grammar, and that adults using such speech are
gaining access to that subgrammar. This in turn would be very much in line with
the General Congruence Principle, which suggests that the acquisitional stage
exists in the adult grammar in something like the same sense that a particular
geological layer may underlie a landscape: it therefore may be accessed.
4.3.2 Reduced Structures
But what would this subgrammar look like? It was noted above that the open
class/closed class distinction had been mentioned, and partly modelled, by such
early works as Aspects (1965), where it was assumed that closed class elements
were a late spell-out of certain types of information. The Garrett and Shattuck-
Hufnagel data, however, suggest that something like the opposite ordering holds.
Namely, that there exists a grid or template of closed class elements, and the
open class elements are projected into them. This is perhaps counter-intuitive
from the point of view of actual speech production, yet the logic of the grammar
supports it. It is also in line with certain conclusions that were reached in
Chapter 1. A constraint was suggested there, the Fixed Specifier Constraint,
which would bar the independent movement of closed class specifiers (i.e. unless
they were part of a whole constituent which was moved). This was made
necessary by the fact that under conditions of extensive movement, there must
remain certain stable elements, the grid around which the others are moved, for
the child to be able to induce a grammar at all. The closed class specifier
elements seemed to be just such a set. (Note that this form of the proposal, while
based in part on considerations that Garrett raises, diers in content.)
Let us adopt the same conceptual device that was used in the analysis of
relative clauses earlier. In that chapter a full sentence was put through a
conceptual filter, which filtered adjuncts out of the representation. This created the
argument skeleton on the one hand (the rooted structure which was a pure
representation of the argument-of relation), and a set of adjuncts which would
later be added into the representation. If we adopt the same device here, we
would get a reduction of a full sentence, together with a set of closed class
elements. Let us ignore the latter set for now, concentrating on the reduction
itself. The term reduction here is used with a rather different meaning than that
in Bloom (1970).
(23) I saw the ball.
reduction: see ball
(24) Mommy left the room.
reduction: Mommy leave room
(25) I put the ball on the table
reduction: put ball (on) table
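The mapping from full sentence to reduction in (23)-(25) can be sketched computationally as a filter that retains only open-class material. The closed-class word list and the small lemmatization table below are illustrative assumptions, not a claim about the contents of the child's lexicon:

```python
# Illustrative sketch: telegraphic "reduction" as a filter that drops
# closed class items and keeps the open class (theta-bearing) skeleton.
# Both word lists are hypothetical stand-ins for a real lexicon.
CLOSED_CLASS = {"the", "a", "an", "i", "is", "on", "in", "will"}

# toy lemmatizer for the inflected forms in (23)-(25)
LEMMAS = {"saw": "see", "left": "leave"}

def reduce_utterance(sentence: str) -> str:
    """Return the open class skeleton of a sentence."""
    words = sentence.lower().rstrip(".").split()
    kept = [LEMMAS.get(w, w) for w in words if w not in CLOSED_CLASS]
    return " ".join(kept)

print(reduce_utterance("I saw the ball."))       # see ball
print(reduce_utterance("Mommy left the room."))  # mommy leave room
```

The filter is deliberately parallel to the adjunct filter used for relative clauses: a full representation goes in, and a pure sub-representation comes out.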
In fact, what representations like (23)-(25) show us is that we had not gone far
enough in Chapter 3 in attempting to isolate a pure representation of argument
structure. The reductions in (23)-(25) are a purer isolate yet. And, if the General
Congruence Principle is to hold, it must be the case that these reductions are not
simply spoken by the child, but underlie adult speech as well.
The term reduction to describe (23)-(25) is intended purely descriptively.
Assuming that the reduction in (23)-(25) is what the child would say if he or she
wished to express the full meaning directly above it, we will call the child's
utterance a reduction of the full phrase marker. This still leaves undetermined
what the nature of this reduction is. There are three central possibilities:
I. that the reduced phrase marker is directly generated as such by the child's
grammar,
II. that there is a reduction transformation of some sort from a fuller structure
(where this reduction may be done by the parser rather than the grammar),
and
III. that the actual phrase marker is relatively more developed even at SS, and
null elements fill the determiner and other closed class positions.¹

1. This possibility was suggested to me by Joung-Ran Kim.
Of these three possibilities, I wish to adopt the first, and to some degree the
third. This has a further effect. Given the General Congruence Principle, such
a reduced mode of representation must also underlie the more complete adult
representation. That is, telegraphic speech is actually generated by a subgrammar
of the adult grammar, a modular and unified subgrammar, and this enters into the
full phrase marker.
The three logical possibilities underlying the child's reduction of adult
speech are the following:
(26) Simple reduced phrase marker:

[V [V see] [N:theme ball]]

(27) Deletion account:

Deep Structure:    [S [NP I] [I [Tns e]] [VP [V see] [NP [Det the] [N ball]]]]
Surface Structure: [S [VP [V see] [NP (or N) ball]]]
(28) Null lexical items:

Deep Structure and Surface Structure:
[S [NP e] [I [Tns e]] [VP [V see] [NP [Det e] [N ball]]]]
The arguments for the first proposal over the deletion transformation account are
conceptual in nature. Suppose that we assume that there is some measure of
complexity of a phrase marker. This would be a function of the complexity of
the tree, the licensing relations in it, and so on, and would no doubt differ to
some degree from production to comprehension. It would be natural, given such
an analysis, to suppose that the phrase marker itself (though not the universal
principles underlying it) complicates itself over time, in the sense that the
complexity with respect to that metric increases. That is, the analyses allowed by
the grammar become more complex over time, though the universal principles do
not. This was the case with the analysis of relative clauses earlier, where the UG
information was, in fact, lessened over time (as parentheses were removed),
while the analysis itself was to some degree made more complex.
Given such an assumption, there is something extremely odd about the
deletion account, Proposal II. Such an account requires that there be an original
full representation, together with an operation, a reduction transformation which
operates on it. The child grammar (or system) and the adult grammar generating
the syntactic representation underlying see the ball would therefore be the
following:
(29) Adult grammar: rules underlying full phrase marker.
Child grammar: i) rules underlying full phrase marker
ii) reduction transformation or operation
But this is surely odd, if the grammar has any sort of computational reflex at all.
The child's grammar here contains more material in it than the adult grammar,
and these operations must all work: precisely the opposite of what might be
expected. One might expect, given the grammar in (29), that a more complex
structure (i.e. one containing more NPs) would be less reduced, because the rules
underlying the full phrase marker would already be stretched to the limit, and it
would be difficult for the child to apply (ii) in addition: i.e. (i) would be
instantiated at the cost of (ii).
There is a second way in which the reduction analysis is odd. Namely, such
an operation would not exist in UG, but would simply be present at a particular
stage in acquisition, the telegraphic stage. This would make it look very unlike
the situation with relative clauses discussed earlier, where the possibility of high
attachment was reduced to an actual alternative specification in UG: the
possibility of a corelative construction. The theory under construction in the last chapter
would require that when an appropriate structure is not reached by the child (in
this case, the full phrase marker), the child falls into another grammar specified
by UG. This would not be the case with a reduction transformation, since the
reduction transformation itself is not specified by UG. The child, therefore,
would be falling into a grammar which is not specified by UG. Keeping with
the strictures above, this possibility is unavailable to us. I will therefore assume
that there is no reduction operation of this type.
With respect to the third possibility, the situation is more interesting. I limit
myself here to some preliminary comments.
In Chapter 2, I suggested that the analysis of the early string by the child
took place in two stages. The child first fitted the lexical subtree over the open
class part of the incoming string.
(30)

[V [(N) man] [V [V saw] [N woman]]]   (the closed class elements the, the remain outside the analysis)
This gave rise to telegraphic speech: an analysis in which the closed class
elements dropped out.
(31)

[V [(N) (man)] [V [V saw] [N woman]]]
At a logically slightly later stage, the closed class elements, marked simply X°,
were incorporated into the structure, according to the Principle Branching
Direction of the language.
(32)

[V [N [X° the] [N man]] [V [V saw] [N [X° the] [N woman]]]]
This would correspond to a derivational stage in which Project-α occurs. Some
evidence for this second stage would be the presence of schwa in the output,
corresponding to the X° elements.
If this is in fact the progression, then both the null category and the simple
structures view would be expected to be correct, though at slightly different
stages: the latter slightly less advanced than the former, and logically prior to it.
(This assumes that the Pre-Project-α stage is phonologically realizable: if not,
true telegraphic speech in the simple structures sense above would exist only
in the initial analysis, and as a subgrammar of the final grammar (see later
discussion), and not in exteriorized speech at the telegraphic stage. I leave this
odd possibility aside.)
Either of the two views above would have an advantage over a fourth view
(in GB-theory): that the initial NP is a full phrasal node, with no determiner.
(33)

[S NP [VP [V see] [NP [N ball]]]]
The reason has to do with extendibility of the grammar (in Marcus et al.'s
sense). The following generalization must be expressed somewhere in the
grammar of English.
(34) In the phrasal syntax, definiteness is marked with the (or perhaps,
building up to the N level: Chapter 2)
The representation in (33) would violate the restriction in (34), while the
representation in (31) would not, since the phrasal syntax had not yet been
entered. That is, by assuming a different type of representation, the thematic
representation, one arrives at the position that the child's grammar is at this stage
not incorrect, but simply incomplete.
Let me turn now to another consideration in syntactic description at this
stage. At the stage at which children are saying things like see ball, their
behavior suggests that they are using something quite different than that simple
sentence for the information structure which is input to semantic interpretation.
In particular, while ball in see ball is determinerless in early childhood speech,
and while determinerless nouns in adult speech generally have a generic or class
interpretation, the child speaks reduced phrases like see ball in contexts where
ball must be regarded as specific in reference. Thus the child interprets see
ball as something like I see the ball. But this simple fact creates difficulties
for the simple structures account. If the DS and SS representations are indeed
((see)V (ball)N)V, i.e. extremely simplified, and without a determiner (pure
lexical representations), then the child must still have a way of rendering the
fact that such structures are not generic. Assuming that at LF the structure is
interpreted, this means that by LF, the representation must be something like
((V see) (NP the ball)). But if this is so, then structure-building operations must be
available at LF, for the child but not the adult. This is surely not to be desired.
Further, the postulation of structure-building operations at LF mimics the
possibility of a reduction transformation already rejected, although in a reverse
direction. The null lexical item account avoids these problems because there is
already a slot for the determiner element. We might even assume that the slot
itself is marked for definiteness or indefiniteness:
(35)

[VP [V see] [NP [Det:+Definite e] [N ball]]]
In this case, the child would not need to structure-build at all at LF, but would
simply use the structure as given: definiteness is already correctly marked,
though the lexical item is missing.
This difficulty with the thematic structure account also appears to put it at
a disadvantage with respect to an acquisition theory like that in LFG (see Pinker
1984). In Pinker's theory, a relatively complete f-structure may be paired with an
incomplete c-structure. The representation of see ball might therefore be the
following.
(36)

c-structure: [S [VP [V see] [NP [N ball]]]]

f-structure: SUBJ [Pred 'I']
             Pred 'see <SUBJ, OBJ>'
             OBJ [+definite, Pred 'ball']
             TNS Present
In the LFG account, it is possible to allow for a severely reduced c-structure,
while the f-structure is still full. Since there is only a realization mapping
between f-structure and c-structure, and each realization operation presumably
comes at a computational cost, in LFG the simpler c-structure would be associated
with less computational cost. This is presumably not the case for Government-
Binding Theory, insofar as a null or identity transformation, i.e. one not deforming
the D-structure at all, would presumably be the computationally cheapest. (This
is an old idea, cf. Miller and McKean 1964, but I will retain it.) Because
D-structures can be directly realized as S-structures in that way (unlike f-structures
and c-structures, which use distinct vocabularies), the reduction transformation,
or the inverse in terms of LF structure-building, would presumably be more
computationally expensive than doing nothing at all to the input structure.
I will give one response to this line of attack here, reserving fuller
discussion for later in the chapter, and future work. If it is in fact the case that the
child has reference to the notion of definiteness, when producing sentences like
see ball, then this fact must be registered somewhere. However, in a theory
like that of Chomsky (1977b), this need not take place at LF. Rather there is a
different representation, SI-1 and SI-2, at which linguistic information interfaces
with information about the world, and the general cognitive system. It is quite
unclear what the format of the information would be at that point. However, if
definiteness were marked then, and not sooner, then there would be no need for
structure building operations at LF. In a sense, the greater computation load
would be placed back further in the grammar, as a default. The actual operation
would be viewed as closing off the free variables in the representation, via an
iota operator: see(x) & ball(x) ⇒ ιx(see(x) & ball(x)).
Thus by locating the variable binding operations at SI-2 for children, the
difficulty with the simple structures account of see ball is avoided.
4.3.3 Merger, or Project-α
Let us work out one way by which this proposal for early thematic speech, i.e.
the reduced phrase marker (and the Garrett-Shattuck-Hufnagel findings about
word exchanges), would be instantiated in adult speech. Other ways are possible
as well. The particular proposal here does not hew unusually closely to the
original Garrett proposals, but instead tries to integrate the idea of an open class
and closed class representation in ways more closely modelled on those available
in linguistic theory. In particular, rather than supposing that the distinction or the
cut is between open class and closed class representations, let us assume that the
two main representational theories, Case theory and theta theory, are implicated.
In particular:
(37) The reductions in (23)-(25), repeated below, and telegraphic speech
in general, are a pure representation of theta relations.
(38) I saw the ball.
reduction: see ball
(39) Mommy left the room.
reduction: Mommy leave room
(40) I put the ball on the table.
reduction: put ball (on) table
What then creates the fully structured phrase marker out of the pure theta
relations in (38)-(40)? This, I will assume, is caused by the projection of the
theta representation into a different representation, a pure representation of Case
relations. There are therefore two separable representations comprising the VP
saw the woman. One is a representation of theta relations, the other is a pure
representation of Case.
(41) Theta representation:

[V [V see] [N woman]]

(42) Case representation:

[VP [V Case-assigning features of verb] [NP [Det:+acc the] [N e]]]
The Case theory representation includes closed class elements, in particular
the closed class determiner the. It also includes the Case-assigning features of the
verb, but not the verb itself. (For an argument that these should be separated, see
Koopman 1984.) Case is not assigned to the NP as a whole, nor to the nominal
(N) head, but rather to the closed class determiner position in the NP. Thus Case
and theta are actually assigned to different elements in the object NP. Theta
roles are assigned to the nominal head, N, in the theta representation. Case is
assigned to the determiner position in the Case representation. Ultimately, both
get spread over the NP, but in different ways. The theta role assigned to the
nominal head percolates to the NP node after a Merger, or Project-α, operation
merges the case and theta representations. The accusative feature on the
determiner gets copied onto the head N, again in the operation of Merger, where the
determiner and the head must agree in case assignment.
It is thus the operation of Merger which merges the Case and the theta
representations. Note that each of these representations is a pure representation
of the particular vocabulary that it invokes (Case theory vs. theta theory), and
indeed, the crucial categories that are mentioned in each theory (determiner vs.
nominal head) are distinct as well.
(43)

Theta theory representation: [V [V see] [N:theme woman]]
Case theory representation:  [VP [V Case-assigning features] [NP [Det:+acc the] [N e]]]

Merger, or Project-α, yields:
[VP [V see] [NP:+theme [Det:+acc the] [N:+acc woman]]]
The theta representation above has already been put forward in the discussion of
the lexicon. The vocabulary of the representation consists of theta roles (already
assigned), and category labels of the zero bar level. This representation is
somewhere in between an enriched lexical representation and a truly syntactic
one: enriched because, along with the representation of theta argument structure,
it includes the filled terminal nodes with lexical items in them.
What I have called the Case theory representation is a good deal stranger.
It factors out the closed class aspect of the V-NP representation in a principled
way. What it contains is the following: (1) a subtree in the phrasal syntax (it
projects up to at least V), (2) where the Case-assigning features of the verb are
present, but not the verb, and (3) in which Case has been assigned to the determiner.
The position that Case is assigned not to the nominal head, but rather to the
determiner, is reasonable given the fact that in languages like German, it is
actually the determiner system (especially the definite determiner) which shows
the demarcations according to Case.
(44) nominative der Mann
accusative den Mann
dative dem Mann
genitive des Mannes
The operation of Merger has three main effects. First, it inserts two lexical items
into the slots provided by the case frame: the head verb and the theta-governed
noun. Second, it percolates the theta relation already assigned to the noun to the
NP node (theme, in this case). Third, it copies the Case that was originally
associated with the determiner position onto the head noun. This means that ball,
as well as the, is marked for (abstract) accusative Case. However, other parts of
the noun phrase in a complex noun phrase (e.g. the pictures of Mary) will not be
so marked. This stands in contrast to theta assignment, which percolates up to
the NP node, and so characterizes the whole NP.
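The three effects of Merger just listed can be sketched computationally. The dictionary encoding of the theta and Case representations below is purely an illustrative convenience, not part of the proposal itself; a minimal sketch under those assumptions:

```python
# Hypothetical sketch of Merger (Project-alpha): compose a theta
# representation with a Case frame. The data structures are illustrative
# stand-ins for the bracketed representations in (43).

def merger(theta_rep, case_frame):
    """Merge a theta representation into a Case frame.

    theta_rep:  {"verb": ..., "noun": ..., "theta": ...}
    case_frame: {"det": ..., "case": ...}  # Case is borne by the determiner
    """
    return {
        "cat": "VP",
        # effect 1: insert the head verb and the theta-governed noun
        "V": theta_rep["verb"],
        "NP": {
            # effect 2: percolate the noun's theta role to the NP node
            "theta": theta_rep["theta"],
            "Det": {"form": case_frame["det"], "case": case_frame["case"]},
            # effect 3: copy the determiner's Case onto the head noun
            "N": {"form": theta_rep["noun"], "case": case_frame["case"]},
        },
    }

vp = merger({"verb": "see", "noun": "ball", "theta": "theme"},
            {"det": "the", "case": "acc"})
print(vp["NP"]["N"])  # {'form': 'ball', 'case': 'acc'}
```

Note that, as in the text, Case characterizes only the determiner and head noun, while the theta role sits on the NP node as a whole.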
The two representations above underlie adult speech, as well as the child's.
This allows for a very compact description of telegraphic speech: telegraphic
speech is simply a pure representation of the theta structure, itself a
sub-representation generated by the adult grammar. There are a few other properties of the
above representational system which require note:
(i) The determiner and full NP categories play something like the same role in
Case and theta assignment to the simple NP object that the subject NP and
S play in ECM-type constructions (e.g. the complement of believe). Just as
a theta role is assigned to the full S in such constructions, while Case is
assigned only to the subject NP, in this representation, Case is assigned only
to the determiner, while a theta role is ultimately inherited by the full NP.
(ii) The conception of the grammar is not simply modular, but separable in
character. That is, it is not simply the case that Case theory applies at a
different point than theta theory, but that the two primitives actually pick
out different representations which purely instantiate them. These
representations are then composed by the operation of Merger.
(iii) It allows for a clear demarcation to be made between the lexical and
syntactic passive. They apply to different structures: the syntactic passive to
the Case representation, and the lexical passive to the theta representation.
(iv) The fixed character of the closed class elements (Chapter 1) is modelled by
having such elements be the frame into which the theta representation is
projected.
The operation of Merger here (or Project-α) is unlike that of Agreement,
discussed earlier in the chapter, in that the latter requires that information pass
in both directions. The paradigm case of an agreement relation, in this view, is
Subject-Predicate agreement, where number information passes in one direction,
and theta information in the other.
(45)

[S NP VP]   (number information passes from NP to VP; theta information from VP to NP)
Perhaps the most interesting area of evidence outside of acquisition for the above
theory is in idioms.
4.3.4 Idioms
Let us consider another area of evidence. I have suggested above that there are
two central sorts of representations which have open class elements in them. The
first is the simple theta representation, a sort of extended lexical representation
with terminal leaves. The other representation which has open class elements in
it is the post-merger representation: the representation after the OC (open class)
representation has merged with the Case grid.
If this conception is correct, then there should be certain set phrases, idioms,
which show where the lines of demarcation are. In particular, since idioms
require sequences of set items, any particular idiom should be frozen with
respect to the level of elements that it is stipulated at. An idiom could be
stipulated at the level of the OC representation: the theta representation. Or it
could be stipulated at the post-merger level: what ordinarily would be the result
of a projection operation, but, since idioms are set sequences, may be the furthest
back that the grammar could go.
In essence, one would expect two types of idioms: Level I, theta-type
idioms, and Level II, post-merger idioms. To the extent to which such a division
exists, and has syntactic consequences, the case for a separate theta representation,
and post-merger representation, becomes stronger.
Consider the sample idiom sets in (46).
(46) OC Idioms Idioms with definite determiner
break bread kick the bucket
make tracks buy the farm
take advantage climb the walls
turn tricks break the ice
mean business smoke the (peace) pipe
turn tail hit the ceiling
give two cents make the grade
keep tabs give the lie to
make strides bite the dust
make much of bite the bullet
I have broken the idioms into two basic types: those which take an object with
a definite determiner, and those which obligatorily take a simple N, and allow
for a free specifier.
I would like to introduce now a generalization.
(47) Determiner Generalization:
Object idioms with a specified determiner do not allow passivization.
By a specified determiner, I mean (and this is crucial) that the determiner
itself is a necessary part of the idiom (e.g. kick the bucket; *kick a bucket). For
example, the is part of the idiom in kick the bucket, but no determiner is part of
the idiom in take advantage of (thus I mean specified, not that it is specific): an
indefinite determiner is specified, in this sense, as long as it is part of the idiom.
The Determiner Generalization is basically correct. A minimal contrast supported
by the generalization would be the following:
(48) a. John took advantage of Bill.
b. Advantage was taken of Bill.
(49) a. John kicked the bucket.
b. *The bucket was kicked by John.
The idiom in (48), with the bare N in the object slot, passivizes freely. However,
the idiom in (49), with the fixed definite determiner the, does not passivize at all.
Consider another minimal pair.
(50) a. We broke bread over the agreement.
b. Bread was broken over the agreement.
(51) a. John climbed the walls.
b. *The walls were climbed by John.
And a third example.
(52) a. We made (great) strides.
b. (Great) strides were made by us.
(53) a. The boss hit the ceiling.
b. *The ceiling was hit by the boss.
These three examples show sharply the distinction in passivization. They also do
something equally important: they exclude the possibility that the reason that the
OC idioms should passivize more freely is simply that the object is more
interpretable in isolation. Thus (it might be thought) the individual subparts of
the OC idiom are themselves interpretable, and this is the reason that the DS
object can appear in the subject position in idioms, but not the post-merger
object. However, it would be difficult to make such a case for the examples
here. Bread in break bread need not be interpreted literally in the passive:
Bread was broken over the agreement by drinking a glass of wine. Yet the
idiom more or less retains its idiomatic usage in the passive (certainly does,
compared to the contrasting climb the walls). Even more striking, (great) strides
cannot be taken any more literally in Great strides were made by us in getting
the new contract than the ceiling can be in *The ceiling was hit by the
President. Yet in the former case, the simple N passivizes freely, while in the
latter case the fully specified NP does not passivize at all. These contrasts
suggest that a proposal having to do simply with the individual interpretability of
the idiom-parts will not go through: they suggest, in fact, that the apparent free
interpretability of the idiom parts in the OC (open class) idiom is an effect of
passivizability, not a cause. The actual reason for the contrast in passivization
has to do with the specification of the determiner.
I have simply included an idiom list above. Let us now label the list, where
Y stands for passivizes freely, and N for not. The regularity is striking, though
not universal.
(54) Sample of idioms
OC Idioms Idioms with definite determiner
Y break bread N kick the bucket
Y make tracks N buy the farm
Y take advantage N climb the walls
Y turn tricks Y break the ice
N mean business N smoke the (peace) pipe
N turn tail N hit the ceiling
Y give two cents N make the grade
Y keep tabs Y give the lie to
Y make strides N bite the dust
Y make much of N bite the bullet
Totals: theta representation: 8 yes, 2 no
        definite determiner:  2 yes, 8 no
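The tally in (54) can be checked mechanically. A minimal sketch, encoding the acceptability judgments reported in the text as booleans (the encoding itself is an illustrative device, not a new claim):

```python
# Judgments from (54): does the idiom passivize (True) or not (False)?
OC = {"break bread": True, "make tracks": True, "take advantage": True,
      "turn tricks": True, "mean business": False, "turn tail": False,
      "give two cents": True, "keep tabs": True, "make strides": True,
      "make much of": True}

DEF_DET = {"kick the bucket": False, "buy the farm": False,
           "climb the walls": False, "break the ice": True,
           "smoke the (peace) pipe": False, "hit the ceiling": False,
           "make the grade": False, "give the lie to": True,
           "bite the dust": False, "bite the bullet": False}

def tally(judgments):
    """Return (number passivizing, number not passivizing)."""
    yes = sum(judgments.values())  # True counts as 1
    return yes, len(judgments) - yes

print(tally(OC))       # (8, 2)
print(tally(DEF_DET))  # (2, 8)
```

The near-reversal of the two tallies is what the Determiner Generalization predicts; the handful of exceptions are taken up immediately below.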
The counter-examples fall into two types: theta representation idioms which
unexpectedly do not passivize (mean business, turn tail), and fully specified idioms
which do (break the ice, give the lie to). Of those which unexpectedly do not
passivize, mean business falls under a different generalization: namely, mean
does not seem to passivize in any of its forms (*A great deal was meant by John
to Mary). Turn tail may have thematic reasons as well (Jackendoff 1972) for not
passivizing.
For the fully specified idioms, break the ice, if investigated more fully, actually
supports the generalization as an absolute. This is because break the ice has
actually freed up structurally with respect to its determiner. Consider the following.
(55) break the ice
break a lot of ice
?break some ice (with that remark)
(56) kick the bucket
*(some men) kicked some buckets
*kick a lot of buckets
Compared to an idiom where the determiner is definitely specified, such as kick
the bucket, the determiner element in break the ice is quite free. This suggests
that it truly does not belong in the specified determiner list at all.
Similarly, other marginally passivizeable idioms not mentioned here, for
example, toe the line, seem to have a similar property: the determiner has freed
up (to some degree), and to that degree the idiom is passivizeable.
(57) a. He toed the line.
b. ??The line was toed by him.
c. ??A narrow line was toed by him.
It appears, then, that the Determiner Generalization may be taken as a principle,
rather than as a simple correlation. However, at this point an empirical problem
arises. I have so far restricted myself to a discussion of idioms with definite
determiners (in the object slot). Curiously, some idioms with an indefinite
determiner do seem to allow passivization.
(58) a. take a fancy to:
A fancy was taken to Jeff by Mary.
b. take a shine to:
A shine was taken to Jeff by Mary.
(59) take a bath:
A bath was taken in the stock market (by John).
(60) leave a lot to be desired:
A lot to be desired was left by John's behavior.
There is, however, a fascinating contrast in these idioms. If they are given the
passive progressive form, the possibility of passivization disappears.
(61) a. take a fancy to:
*While we talked in the kitchen, a fancy was being taken to Jeff
by Mary in the dining room.
b. take a shine to:
*While we talked in the kitchen, a shine was being taken to Jeff
by Mary in the dining room.
(62) take a bath:
*A bath was being taken in the stock market by Jeff.
(63) leave a lot to be desired:
?*A lot to be desired was being left by John's behavior.
This restriction on the progressive appears only in the passive, and hence it
cannot be viewed as arising out of some semantic property of the verbs
themselves (e.g. that they are stative).
(64) a. *A shine was being taken to Jeff by Mary.
b. Mary was taking a shine to Jeff.
The question, then, is: why are certain idioms with specified determiners freed
from the general ban on passivization, but not in the passive progressive?
In fact, a clear and ready answer is available. In general, while the syntactic
passive is either progressive or stative in character, the lexical passive is only
stative (with respect to the lexical class of the predicate). Thus while the
nonprogressive passive in (65) allows either of two readings, the progressive
passive is restricted to just one.
(65) a. The toy was broken.
Reading 1: Someone broke the toy.
Reading 2: The toy is in pieces.
b. The toy was being broken.
Reading 1: Someone (was) break(ing) the toy.
Reading 2: None
It is natural to assume that the "toy is in pieces" reading, where a simple
property is being predicated of the toy, is in fact the lexical passive reading.
Similarly for the actional predicate in (66).
(66) a. The glass was shattered.
Reading 1: Someone shattered the glass.
Reading 2: The glass is in pieces.
b. The glass was being shattered.
Reading 1: Someone (was) shatter(ing) the glass.
Reading 2: None
If we assume that the progressive filters out the possibility of the lexical passive,
then the disappearance of the second reading in (65) and (66) is explained. Note
that when a predicate is chosen which (is generally taken to) only allow lexical
passives, the actional reading is filtered out.
(67) The toy appears broken.
Reading 1: None
Reading 2: The toy is in pieces.
Finally, let us note another generalization: while the syntactic passive, associated
with Reading 1, has a true affected patient surface subject (the glass is affected
by the action in (65) and (66)), the surface subject in the lexical passive, associated
with Reading 2, is not affected, but is a simple theme (the toy is in the state of
being broken). In general, a syntactic passive, but not a lexical passive, allows a
+affected surface subject.
But now the reason for the peculiar behavior of the idioms in (58)-(63) is
apparent. The passives which are allowed are not syntactic passives, but lexical
passives. Hence they are possible in the simple past form, but not in the past
progressive (A fancy was taken to Jeff/*A fancy was being taken to Jeff).
Hence the Specified Determiner Constraint holds in full force ("specified" here
means determined in the idiom, not specific; the specified determiner is in the
object itself).
(68) Specified Determiner Constraint:
An object idiom with a specified determiner (i.e. a determiner specified
as part of the idiom) does not allow syntactic passivization.
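For concreteness, the constraint in (68) can be pictured as a simple filter over a listed idiom lexicon. The Python sketch below is purely illustrative: the toy lexicon and the function name are my own inventions, not part of the grammar being proposed.

```python
# Toy model of the Specified Determiner Constraint (68): an object idiom
# whose determiner is specified as part of the idiom does not allow
# syntactic passivization. The lexicon entries are illustrative only.
IDIOMS = {
    # idiom: is the determiner specified as part of the idiom?
    "kick the bucket": True,
    "hit the ceiling": True,
    "take advantage": False,
    "break bread": False,
    "make headway": False,
}

def allows_syntactic_passive(idiom: str) -> bool:
    """An idiom passivizes syntactically iff its determiner is unspecified."""
    return not IDIOMS[idiom]

print(allows_syntactic_passive("kick the bucket"))  # False: *The bucket was kicked.
print(allows_syntactic_passive("take advantage"))   # True: Advantage was taken (of John).
```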
Finally, let us note that two other possible generalizations which might be made
about the data are insufficient. A possible alternate explanation is that the fixity
of the object in these idioms with the specified determiner has to do not
with the passive, but with a general impossibility of movement of the element
with the specified determiner. However, this is counter-exemplified by subject-to-
subject raising, which occurs in subject idioms, even in the more rigid definite
determiner type.
(69) a. The cat seems to be out of the bag.
b. The shit seems to have hit the fan.
c. The roof seems to have caved in on George.
Second, one might suppose that an idiom element without reference may not be
moved into a semantically transparent position. But this explanation has problems
on two grounds. First, idiom chunks do appear to show up in semantically
transparent positions like the head of relative clauses:
(70) The headway that we made was remarkable.
Also, the subject position in a passive construction presumably need not be
construed as necessarily semantically transparent (or if it is, it allows movement
of elements into it anyway), since movement of elements into it is allowed.
(71) A unicorn was being sought.
Finally, there is a syntactic subgeneralization, or perhaps an independent
generalization, which does seem to hold over the idioms and which is different
from the Specified Determiner Constraint. This is that idioms which have
agentive predicates as their heads do not allow passivization, while those which
do not have agentive predicates do. The example below, which was used to argue for
the Specified Determiner Constraint, might equally be used to show that agentivity
in a predicate blocked passivization (i.e. where the predicate is generally an
agentive one, whether it seems to be used agentively in the idiom or not).
(72) a. *The bucket was kicked.
b. Advantage was taken (of John).
Looking back at the predicates listed above, we note that the agentive/nonagentive
contrast will account for a large part of the data as well. In fact, I initially took
this to be the basic generalization governing the blocking of passivization.
(73) Agent/Patient Generalization
An object idiom may not be passivized if the main predicate is agentive.
There is some reason to take the Agent/Patient Generalization to be the
subsidiary one, and the Specified Determiner Constraint as more central. One
reason is conceptual. Given the view of passive suggested above, any instance of
passive may be either lexical or syntactic. The possibility of a lexical passive is
screened out in two ways: by placing the construction in the progressive, or by
allowing the derived subject to be affected (a patient). Thus the lexical version
of A toy is broken has a simple theme subject; in the case of an affected
subject, the passive is syntactic. If this is correct, then agentive predicates cannot
undergo lexical passivization and allow their derived subject to be affected. But
if that is correct, then the passives which are being allowed in the idioms are
simply lexical passives, hence not agent/patient, hence the generalization in (73).
That is, (73) would fall out of the properties of the lexical passive.
The second reason is more empirical. If (73) were correct, then one would
expect that, holding the main verb constant, there would be no effect of varying
the determiner status of the object. But consider the following set of idioms for
take, a common idiom-taking verb.
(74) Idioms with take
Idiom            Passive?
take the cake    no
take a leak      no
take a hike      no
take a bath      no
take a fancy to  only non-progressive (lexical)
take a shine to  only non-progressive (lexical)
take five        no
take heed        yes
take issue       yes
Aside from take five, the division according to the presence of the determiner
is perfect. Those idioms which require a specified determiner do not passivize;
those which do not require a specified determiner, do. In this case, of course, the
verb is constant, so agentivity is not an issue. I will therefore assume that the
Specified Determiner Constraint is correct.²
The Specified Determiner Constraint, that idioms with a specified determiner
cannot passivize, would appear to be totally baffling from the point of
view of current versions of Government-Binding Theory (as well as other
grammatical theories). Case is generally assumed to be assigned to the full NP
object, and is inherited by various elements in that NP, in particular, by the head
noun and the determiner. Assuming that Case is assigned in that way, there
would be no reason to expect that anything about the internal structure of the NP
object, even in the special case of an idiom, should have anything to do with
passivizability.
On the other hand, given the division of labor suggested here, where Case
is assigned to the determiner, not to the whole NP, and there is a merger
operation of the Case and theta frames, the unpassivizability of the full idioms,
but not the OC (open class) idioms, is rather expected. Consider how the passive
would be stated. It is necessary here, as in the usual statement of passive,
to prevent accusative case from being accidentally assigned to the object prior
to affix hopping.³ In terms of the two representations assumed here, this means
that the Case-absorbing morphology must itself be part of the Case frame.
2. The situation with wh-movement is interesting. In brief, it appears that there is a constraint on
wh-movement in idioms, but that it is not the same constraint which applies to NP-movement. For
example, it appears that some idioms that are not frozen for NP-movement are frozen for
wh-movement, presumably because one cannot quantify over the elements involved.
(1) a. The cat seems to be out of the bag.
b. ?*Which cat is out of the bag?
On the other hand, some idioms which are frozen for NP-movement (in the syntactic passive) are not
frozen for wh-movement.
(2) a. *A fancy was being taken of George by Mary.
b. How much of a fancy did Mary take of George?
This suggests that the constraints are different.
3. Thus, I am not assuming a surface filter on Case.
(75) Case frame for passive:
[V′ [V V + -ed] [NP Det]]
(-ed absorbs the Case-assigning features of the verb; the determiner receives no Case)
Now consider how idioms must be stated. An idiom is a specified chunk of
linguistic material corresponding to a specified meaning, which must be listed.
All idioms must be fixed and listable, and let us assume that they are listed at
one specific level. Now the open class theta representation idioms may simply be
listed as such, in the theta representation itself, together with their meaning.
(76) take advantage
meaning: ...
(listed in the theta representation)
The post-merger idioms are also listed. However, since these are unitary chunks,
this means that the deepest format for post-merger idioms is the post-merger
representation itself. The idiom is not divisible into the theta and the Case
representations.
(77) Deepest level for Level II idioms: post-merger:
[VP [V kick] [NP [Det the] [N bucket]]]
meaning: ...
The representation in (77) is the deepest level that the post-merger idiom may
take. It does not exist in the usual subunits of a freely composable theta repre-
sentation and a freely composable Case representation: if it did, then the idiom
would have to be specified at two places at once, and this was ruled out above.
AGREEMENT AND MERGER 179
Thus the following picture holds in the formation of idioms:
(78) Theta representation + Case representation
(deepest level for OC (pure theta) idioms)
|
Merger or Project-α
|
full representation
(deepest level for specified determiner idioms)
We may conceive of idioms, then, as pieces of listed structure. These
pieces of listed structure may be specified at different levels. The idioms without
a specified determiner above were listed in the thematic representation; those
with a specified determiner were listed post-merger. We may call these Level I
and Level II idioms (referring not to the actual levels in lexical phonology, but
to comparable types of specified levels in syntax).
Consider now the problem of passivization. Given the general structure in
(78), it would be impossible for the post-merger idioms to passivize. This is
because the syntactic passive is a manipulation of the Case representation: one
which adds the Case-absorbing morpheme, -ed, into that representation. The
passivization of Level I idioms would therefore look like the following:
(79) Level I idioms:
Independent theta representation: take advantage (of)
Independent Case representation: [V′ V [NP Det N]], with V bearing +acc-assigning features
Passive (applies to the Case frame): in the theta representation, nothing
happens; in the Case frame, -ed is added, absorbing the +acc-assigning
features, and the object NPᵢ is displaced, leaving a trace: [V′ [V V + -ed] [NP tᵢ]]
Passive form: Advantage was taken
(80) Level II idioms:
Independent theta representation: not applicable
Independent Case representation: not applicable
Deepest level of syntactic representation (post-merger):
[VP [V kick] [NP [Det the] [N bucket]]]
Passive: not applicable (there is no separate Case frame for it to apply to)
The Case representation exists as a separate representation only for the OC (open
class) idioms. No such separate representation exists for the post-merger idioms,
which are specified only in the full representation. Therefore passive, which applies
to the Case frame, is impossible for post-merger idioms: exactly the wanted result.
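The asymmetry between the two listing levels can be sketched computationally. In the toy model below (my own illustration; the class and function names are invented), a Level I idiom retains a separate Case frame for passive to manipulate, while a Level II idiom, listed only post-merger, gives passive nothing to apply to.

```python
# Illustrative sketch: passive manipulates the Case frame. Level I idioms
# (listed in the theta representation) still have an independent Case frame;
# Level II idioms (listed only post-merger) do not, so passive cannot apply.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Idiom:
    form: str
    level: int  # 1 = listed in theta representation, 2 = listed post-merger

def syntactic_passive(idiom: Idiom) -> Optional[str]:
    """Return a (crudely formed) passive string, or None if passive cannot apply."""
    if idiom.level == 2:
        return None  # no separate Case frame: passive is inapplicable
    verb, *rest = idiom.form.split()
    obj = " ".join(rest)
    return f"{obj} was {verb}-ed"  # toy morphology, for illustration only

print(syntactic_passive(Idiom("take advantage", 1)))  # advantage was take-ed (~ 'was taken')
print(syntactic_passive(Idiom("kick the bucket", 2)))  # None: *The bucket was kicked.
```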
4.4 Conclusion
I have suggested in this chapter two new primitive operations: agreement and
merger (or Project-α). These are not, I would argue, reducible to the operation
of Case or theta government, but must simply be viewed as primitive operations
that the grammar has recourse to. Agreement differs from government in being
bi-directional. More strikingly, perhaps, agreement operations can be put into
1-to-1 correspondence with basic movement and adjunct-adding rules (Adjoin-α,
Move-NP, Move-wh). The latter may themselves be put into 1-to-1 correspon-
dence with the satisfaction of closed class elements (the +wh feature in Comp,
Infl, and the Relative Clause linker). Thus while agreement would encompass
certain other more primitive operations, it would of necessity be a finite subpart
of the grammar. The other operation which was introduced was merger. Merger
merged together an open class representation, the theta representation, and a
closed class frame. The stage of telegraphic speech, in this view, is not simply
the result of an extraneous computational deficit, nor the result of a reduction
transformation, but is itself a well-formed subpart of the grammar. However, it
is only partial, not having undergone merger with a closed class representation.
By the General Congruence Principle, a similar structure of telegraphic speech
must underlie adult utterances as well. I suggested that there was evidence for
this because of the existence of both pre-merger and post-merger idioms, with
differing properties. Pre-merger idioms (take advantage, break bread) were a
pure representation of theta structure, and passivized readily. Post-merger idioms
(kick the bucket, hit the ceiling) were an instance of post-merger speech.
Chapter 5
The Abrogation of DS Functions:
Dislocated Constituents and Indexing Relations
I have argued above for a grammar which contains, along with the primitive rule
Move-α, at least one rule of Adjunction, Adjoin-α, and a rule of PS merger, or
Projection, Project-α. The first of these corresponds roughly to the classic
generalized transformations of Chomsky (1975 [1955], 1957), but applies only to
adjuncts. The restriction of the operation to adjuncts would be forced theoretical-
ly by the Projection Principle; further, the minimal specification of the grammar
according to the Projection Principle (namely, that the lexical specifications of
elements must be satisfied at all levels, but other properties of the phrase
structure tree need not be) would predict that such an operation would be
possible. The treatment in Chapter 3 would therefore be what would be expected,
under such minimal assumptions.
Further, on purely theoretical grounds, it would be well if X′ theory held in
some pure form at least at D-Structure. One of the most interesting claims of that
theory (Jackendoff 1977) is that X′-syntax involves a smooth gradation of bar
levels, with the head of a construction connected to its maximal projections by
a sequence of bar levels obligatorily descending by 1 (Jackendoff 1977).
(1) Xⁿ → Xⁿ⁻¹ ...
Let us suppose that D-Structure is a pure representation of X′-theory in
Jackendoff's sense, and that it obeys the restriction in (1). Let us assume further
that one of the classical analyses of relative clauses is correct, where the
structure is of a Chomsky-adjoined type: that is, with one of the node labels (NP
or N′) repeated. It follows as a consequence that RCs must be adjoined in the
course of a derivation, for X′-theory not to be violated at DS.
These theoretical boons would be vacuous if interesting empirical conse-
quences did not hold as well. It was argued in Chapter 3 that they do hold, both
in the analysis of the adult syntactic structure, and with respect to acquisition.
The so-called anti-Reconstruction facts of van Riemsdijk and Williams (1981),
reconstrued as involving the argument/adjunct distinction rather than degree of
embedding, can be accounted for by assuming that adjuncts, but not arguments,
may be embedded in the course of the derivation. This means that adjunct-
embedding may follow wh-movement in the sequence of operations, and thus
allows fronted names in adjuncts, but not in direct complements (arguments), to
escape Condition C violations.
The acquisition data pointed strongly in the same direction. Tavakolian
(1978) has shown that, in initial stages, the child may have recourse to a high
attachment of the relative clause: the conjoined clause analysis. This means that a
relative clause construction like that in (2a) may receive an analysis like (2b) by
the child, where the relative clause is not embedded under the relevant NP, but
simply conjoined.
(2) a. The duck saw (the lion (that hit the sheep)).
b. ((The duck saw the lion) (that hit the sheep)).
The empirical consequence of this is that the child allows (and in fact virtually
requires) the surface subject of the matrix clause (the duck, here) to control
the subject of the relative clause. The resultant analysis is that the matrix subject
(the duck) is treated as the subject of both clauses (Tavakolian 1978).
While Tavakolian suggests that this high attachment analysis is due to the
presence of a separate parsing principle in the grammar, the analysis above
suggests that, whether there is a parsing principle or not, the high attachment
would follow from an embedding analysis, together with the assumption
that, in cases where embedding failed, a default analysis of conjunction is
superimposed by the child on the data. Thus the child first attempts to analyze
the relative clause as embedded with the noun phrase that it modifies. If this
analysis succeeds, the correct analysis, and correct modification possibilities,
follow. If the analysis does not succeed (perhaps for computational reasons), the
child adopts the default of a conjunction analysis. This accounts for the modifica-
tion possibilities that the child mistakenly adopts (Tavakolian 1978); it also
explains the Solan and Roeper data (Solan and Roeper 1978), that when the
conjoined clause analysis is impossible, as in the put constructions, the clause
remains unrooted, and hence is not analyzed at all. Thus while the conjoined
clause analysis can be thought of as a parsing principle in some sense, it is not
exclusively that, but is rooted in the general conception of the computational
organization of the grammar.
5.1 Shallow Analyses vs. the Derivational Theory of Complexity
The discussion above allows for one way by which the child grammar may
diverge from the adult grammar, by which a computational weakness in the
child's grammar may be viewed as giving rise to an analysis which is itself
parametrically possible: an adjoined RC analysis. I believe that this is in general
the case: the failures in the child grammar give rise to analyses which are
parametrically possible, and, in fact, the grammar is organized so that this
is the case. That is, the child, while not speaking grammatically according to the
adult grammar, must nonetheless be speaking grammatically according to some
grammar, some option in Universal Grammar. This constitutes a sort of well-
formedness constraint over the intermediate grammars that the child adopts.
The present chapter considers another way by which the intermediate
grammars through which the child passes may be well-formed in this broad
sense. In this case, I will argue that, for three separate constructions, the child
adopts an analysis which is "shallow" with respect to the representations
computed in the adult grammar: rooted in S-structure or possibly the surface, but
extending back only part of the way toward D-structure. I will remain rather
neutral throughout this chapter on whether this effect is essentially computational
in character and has no repercussions on the actual grammar, or whether it does
affect the way that parameters are initially set. That is, if we believe that the
derivation DS → SS involves computational operations in some broad sense, and
we believe further that the child's computational resources are more limited than
the adult's, then it would be expected that the set of operations, s₁, s₂, ..., sₙ,
which relate the two levels in the child's grammar would not be as full as in the
adult's grammar. Whether this would occur on a construction-by-construction
basis, perhaps isolated to constructions involving dislocations, or holds overall,
is a question yet to be answered. Similarly for the question of the direction
of the shallowness. In the classical Derivational Theory of Complexity (Miller
and McKean 1964; McNeill 1970; Fodor, Bever, and Garrett 1974; see also
Wexler and Culicover 1980), the child's grammar was assumed to be lacking in
specific transformations, which would be added onto D-structure. The child's
grammar would thus have the structure in (3a), while the adult's grammar had
the structure in (3b).
(3) a. Child's derivation: DS → SS′ (shallow derivation)
b. Adult's derivation: DS → SS
The shallowness of the derivation, construed as a lack of optional transforma-
tions in the mapping from DS to SS, would give rise to different surface structures.
The child would speak structures of the form SS′, while adults would speak
fully processed representations, SS. Differences in the child's and the adult's
grammar could be traced to this fact (Miller and McKean 1964; McNeill 1970).
It is interesting to re-evaluate the Derivational Theory of Complexity in light
of more recent theories of the grammar than that adopted at the time of its first
proposal (essentially, early versions of the Standard Theory). While the current
theory retains transformations, these are not the lexically governed and specific
transformations of the Standard Theory, but rather instances of a single move-
ment rule, Move-α. The possibilities for output are, again, not specified in the
operation itself, but by a system of principles which forces the products of any
particular operation to take a particular form: that is, the interaction of Case
theory, Theta theory, Binding theory, Control theory, the Projection Principle,
and so on. Thus, while Move-α applies, the child does not actually learn
individual transformations: rather, he or she learns (or has triggered in him/her)
the principles which govern the full set of possible derivations.
Given these facts, it may appear that the derivational theory of complexity,
in any form, is irrelevant to the current view. Before addressing that question
directly, however, we may note that a number of the empirical arguments which
were given against the Derivational Theory of Complexity in Fodor, Bever, and
Garrett (1974) (which essentially adopted a negative view of it) would no
longer hold given current analyses. Thus one of the arguments against the
Derivational Theory of Complexity was that children did not exhibit the full
counterpart of sentences which underwent an Equi transformation (Equivalent
Noun Phrase Deletion, an early rule in which coreferent nouns were deleted: now
known as Control). That is, sentences of the form (4a) were not present in the
grammar prior to sentences of the form (4b). (The D-structures of the sentences in
(4b) used to be thought to be those in (4a), with a coreferent noun phrase in the
lower subject position. This noun phrase was later deleted to create the surface.
This rule was known as Equi: Equivalent Noun Phrase Deletion.)
(4) a. *John tried (for) John to leave.
John wanted him/himself to leave.
b. John tried to leave.
John wanted to leave.
While it might be argued that the sentence which underwent the obligatory
transformation in (4a) (John tried for John to leave) would not be expected to
be present in the output in any case, since the transformation is obligatory, this
would not be so for the second sentence: John wanted him/himself to leave. The
input is itself fully well-formed in one of these variants, and so, given the
optionality of transformations, would be expected to be present in the surface
first, if the Derivational Theory of Complexity were correct, that is, if
computationally more complex constructions surfaced later.
I believe that the logic of this argument against the Derivational Theory of
Complexity would be correct and convincing if it were the case that an Equi
transformation existed. In current work, however, no such transformation is
assumed. Rather, the subject of the embedded clause is at all levels null, and is
coindexed with the matrix subject (or object) by the coindexing rule of Control.
(5) a. Johnᵢ tried PRO to leave. → (Control) Johnᵢ tried PROᵢ to leave.
b. Johnᵢ wanted PRO to leave. → (Control) Johnᵢ wanted PROᵢ to leave.
Given the assumption that control applies rather than equi, the Fodor, Bever, and
Garrett argument does not go through simply on empirical grounds. There is no
D-structure which has a full noun phrase in the embedded subject position, hence
the fact that such elements do not show up on the surface antedating the
appearance of the truncated form does not constitute a computational argument
against the Derivational Theory of Complexity.
It might be suggested, nonetheless, that a variant of the Fodor, Bever, and
Garrett argument may be resurrected, but simply applied to the rule of Control
itself, rather than to Equi. Suppose that the relevant DS, as in (5a) and (5b) above,
simply involves a null subject in the controlled clause. Assume, as above, that a
rule of Control applies, coindexing the embedded subject with an element in the
matrix. Then a computationally weak system might be assumed not to undergo
an instance of Control: the output would simply be the DS form, with the surface
embedded subject not coindexed.
(6) Revised argument against the Derivational Theory of Complexity:
a. DS: Johnᵢ wants PRO to leave. (no Control) SS: Johnᵢ wants PRO to leave.
b. DS: Johnᵢ tried PRO to leave. (no Control) SS: Johnᵢ tried PRO to leave.
Given a surface structure like that in (6a) and (6b), it might be suggested that the
child would adopt a default analysis for the unindexed PRO: say, that it is
interpreted as arbitrary in reference. Further, while it may be argued that the
control rule applies obligatorily in the case of predicates like try, it does not
apply obligatorily for want, and thus an unindexed PRO, and a consequent errone-
ous reading, would be expected in that case. But such a reading does not appear.
This revised argument against the Derivational Theory of Complexity is of
course more powerful, given current assumptions. However, it is itself suscepti-
ble to question. The main issue would revolve around the status of Control as a
rule vs. Control as a principle. If control were simply an (optional) rule in these
cases, then the lack of indexing on PRO (and a consequent erroneous analysis)
would indeed be expected. However, if Control is a principle, and if such
principles are indeed part of the genetic basis of language, then it might be
expected that control would apply immediately. Moreover, the fact that control
is not necessary in cases like John wanted for himself to win is irrelevant,
since these structures do not match the domain over which the control principle
is stated: namely, over structures of the form NP ... PRO ..., with some
minimal distance-type principle ensuring locality. Thus, given the assumption that
Control is a principle, the revised argument against the Derivational Theory of
Complexity would not go through either.
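The behavior of Control as a principle over configurations of the form NP ... PRO can be illustrated with a deliberately crude sketch, in which linear precedence stands in for the structural minimal-distance condition (all the names here are my own hypothetical choices, and c-command is ignored):

```python
# Sketch of Control as a minimal-distance coindexing principle: PRO is
# coindexed with the nearest preceding NP; with no such NP available,
# PRO defaults to arbitrary reference (PRO_arb).
def control(words):
    """Return the word list with each PRO coindexed to the nearest preceding NP."""
    indexed = []
    last_np_index = None
    for i, w in enumerate(words):
        if w == "PRO":
            indexed.append(f"PRO_{last_np_index}" if last_np_index is not None
                           else "PRO_arb")
        else:
            if w[0].isupper():  # treat capitalized words as NPs, for illustration
                last_np_index = i
            indexed.append(w)
    return indexed

print(control(["John", "tried", "PRO", "to", "leave"]))
# ['John', 'tried', 'PRO_0', 'to', 'leave']  (PRO coindexed with John)
```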
5.2 Computational Complexity and The Notion of Anchoring
These points notwithstanding, I do not wish to undertake a resurrection of the
Derivational Theory of Complexity. Yet I do believe that the basic conception
behind it is correct: that there is some fairly straightforward relation between the
set of computations undertaken by the child and the complexity of the analysis
that the child arrives at: the set of computations used by the child is simply less
rich. In earlier chapters, I outlined how this would work with respect to tele-
graphic speech and relative clauses: in the former case, the child was computing
a partial structure (the thematic representation); in the latter case, the child was
using an earlier specified, default rule (conjunction rather than adjunction).
With respect to wh-movement, there is evidence for the shallowness of the
child's grammar, but not in the direction that the Derivational Theory of Complexity
supposes. Rather, the grammatical analysis is shallow in the other direction:
anchored in SS or perhaps the surface, with a shallowly generated DS, DS′. If
we imagine the child, or the adult, being handed a level of representation, SS or
more exactly the surface, for the purposes of comprehension, the other levels of
grammatical representation, DS and LF, must be computed from that. To the
extent to which the computational system is immature and not fully strong, it
will not be able to go backwards enough to compute DS (and possibly will
be lacking in some aspects of LF as well). Thus the grammar will be shallow in
comprehension, but not in the direction that the Derivational Theory of Complex-
ity supposes. Rather, it will be anchored in S-structure, and the D-structure which
is computed will not be the D-structure of the adult grammar. Rather it will be
some other D-structure, a deepest computed structure, D-structure′.
Let us consider this in slightly more detail. The adult grammar generates
quadruplets of structures: (DS, SS, Surface, LF). Any particular sentence is
given, at least, a representation at each of these levels.
(7) Surface: Who did you see (t)?
DS: You Infl see who?
SS: Whoᵢ did you see tᵢ?
LF: For which x, you saw x?
Further, these representations have a number of other characteristics. DS is
ordered before SS, and SS is ordered before LF, in the sense that Move-α (and
possibly other rules) applies to DS to derive SS, and applies to SS to derive LF.
Move-α thus induces an ordering over the levels. In addition, particular con-
straints hold over the individual levels: the Projection Principle holds at all
levels, and individual constraints or modules (Case Theory, for example, or the
Binding Theory) are earmarked for particular levels, though in a way which is as
yet not fully clear (see discussion above). Move-α may be defined in either of
two ways: as a particular relation between levels, or as a projection of a certain
sort of information, chain information, on a particular level.
Let us now make a supposition: the child's computational system computes
shallow analyses for particular constructions (perhaps particularly for comprehen-
sion). These analyses are rooted or anchored at SS or the surface (what the child
hears), but not all aspects of DS are recovered. In particular, the operation of the
rule Move-α is not fully undone. Thus while the computed representation may
recede part of the way back to the adult DS, it is not fully such a level, but is
rather an intermediate level: DS′. The result would be the following:
190 LANGUAGE ACQUISITION AND THE FORM OF THE GRAMMAR
(8)   Child's derivation          Adult's derivation

          (DS)                          DS
            ⋮                            |
           DS′                           |
            |                            |
           SS                           SS
          /    \                       /    \
        PF      LF                   PF      LF
That is, the child would still compute 4 levels of analysis. However, one of these
levels, the deepest level, DS′, would differ from the adult computation of DS. In
particular, it would be closer to the surface than the comparable adult DS. With
respect to the sentence given above, for example, the child's full representation
might be the following.
(9) Surface: Who did you see (t)?
    DS′: Who_i did you see t_i?
    SS: Who_i did you see t_i?
    LF: For which x, you saw x?
The child's DS′ representation, in this instance, would match the SS
representation. The wh-element, rather than being generated, at the deepest level, in the
object position, would be generated in Comp, or in Spec C (Chomsky 1986b).
The consequence of the shallowness of computation would be a representational
difference in the computed deepest structure.
Before proceeding, an additional comment is necessary. In the earlier
chapters, I have suggested that the grammar changes for the child with respect to
his or her analysis of relative clauses or telegraphic speech. The situation is
considerably less clear with respect to the analysis of dislocated elements. In
particular, it is unlikely that we would wish to say that the D-structure is actually
constructed as a matter of the grammar, in the course of an attempt to understand
sentences with dislocated structures. (I owe this point to Janet Fodor.) Instead,
we may assume that the comprehension process, starting from S-structure or the
surface, computes back toward D-structure, but perhaps only part of the way.
Thus, under different conditions, the computed D-structure (that is, D-structure′1,
D-structure′2, etc.) may be distinct depending upon computational load,
receding toward the adult DS in cases where the load is light, or pragmatic
information has intervened to make the computation more possible.
THE ABROGATION OF DS FUNCTIONS 191
For a given child, at a particular stage and thus with a particular given grammar,
a variety of DS′ may be computed, depending on computational load.
(10)      DS
           |
          DS′
           ⋮        potential levels of analysis
          DS′
           |
          SS
         /    \
       PF      LF
Thus for the example given in (11), the child's analysis may generally have the
wh-element in dislocated position at the deepest computed level, but it may
occasionally have the wh-element originating in the (adult) DS object position as well.
(11) Surface: Who did you see (t)?
     DS′ (Computation 1): Who_i did you see t_i?
     DS′ (Computation 2): Who_i did you see t_i?
     DS′ (Computation 3): You saw who?
     SS: Who_i did you see t_i?
     LF: For which x, you saw x?
This is in accord with the psycholinguistic fact that the possibility of analysis
will change under differing conditions. The claim above would be that the
child's analysis is shallow, not his or her grammar.
As noted above, this fact might be taken to be a purely computational fact.
Or it might be that it has parametric effects: in particular, for those structures in
which the wh-element is analyzed as base-generated in dislocated position, it is
also analyzed as being in a theta position (or quasi-theta position) even at the
deepest level of analysis. This would mean that a computational effect would
have parametric repercussions: the child would fall into a different type of
grammar. Some evidence for this is discussed in 5.7.6. The crucial point
throughout the chapter, however, will be the light that the shallow analysis sheds
on the structure of the grammar.
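The picture sketched here, a deepest computed level that recedes toward the adult DS as processing load lightens, can be given a toy computational rendering. Everything in it (the level inventory, the linear load rule, the function name) is my own illustrative assumption, not a proposal of the text:

```python
# Toy model: levels ordered outward from the anchoring level SS.
# Deeper analyses are reached only when processing load permits.
LEVELS = ["SS", "DS'3", "DS'2", "DS'1", "DS"]  # hypothetical intermediate DS' levels

def deepest_computed_level(load):
    """Deepest level reached under a load between 0.0 (light) and 1.0 (heavy)."""
    steps_back = int((1.0 - load) * (len(LEVELS) - 1))  # lighter load -> more steps
    return LEVELS[steps_back]

# A light load lets the computation recede all the way to the adult DS;
# a heavy load leaves the analysis anchored at SS.
print(deepest_computed_level(0.0))  # -> DS
print(deepest_computed_level(1.0))  # -> SS
```

On this sketch, a single grammar yields different DS′ analyses of one and the same string as conditions vary, which is the sense in which the analysis, not the grammar, is shallow.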
5.3 Levels of Representation and Learnability
While shallowness of analysis in this case need not be considered a property of
the grammar per se (but may rather be of the computational-acquisition device
as it computes a representation), it does provide a unique clue into the structure
of the grammar. Namely, a prediction is made: insofar as the analysis is shallow
(i.e. extends backwards only from SS to DS′), the set of grammatical functions
associated with the not-present levels (DS′ to DS) would be expected not to be
present as well. Suppose that a particular grammatical module (e.g. part of the
Binding Theory) applies at DS. Given that DS is not available to the child in a
particular analysis of a string, the part of the Binding Theory which was stated
over DS would also be expected not to be present. Thus the set of structures
which underwent some rule at DS (being marked for coreference or obligatory
disjoint reference) would be expected to be treated differently in the child's
grammar than in the adult's grammar. This is shown in (12) below:
(12)      DS     <- set of operations or rules which apply at DS
           |
          DS′
           |        (DS′ to SS: the shallow analysis)
          SS     <- anchor of the child's analysis
         /    \
       PF      LF
We may say that the child's analysis is anchored at a particular level: SS, or the
surface. (For convenience, I will henceforth simply use SS as the anchoring level
rather than the surface. No theoretical point is intended thereby, and I use it for
convenience since the properties of that level, but not of the surface, are
relatively well-explored. The contrast is with D-structure anchored
representations.) Over time, the derivation fills out backwards; the analysis becomes less
shallow. At any particular time, however, the analysis is shallow, not
encompassing the adult DS. This, however, has a consequence. The set of grammatical
functions associated with the adult DS will not be present in the child's
grammar. Borrowing, and changing, terminology from Williams (1986), the set of
grammatical functions associated with DS would be unavailable: this is the
abrogation of DS functions.
(13)      DS     <- functions unavailable
           |
          DS′    <- deepest level computed
           |
          SS     <- anchor of the analysis
To say that SS is the anchor of the analysis is to say that the computation
proceeds backwards from that level, at least in part. This fact may then be used
in a positive way by the linguist: to determine the structure of levels or
organization of rules in the grammar. Insofar as particular grammatical functions can be
shown not to be available to the child (e.g. some aspects of Binding Theory),
they are earmarked as belonging to the missing levels: i.e. to the domain
between DS′ and DS. The shallowness of analysis would thus give us insight into the
structure of the grammar, and where therein particular operations apply.
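The prediction just stated can be put as a small lookup: a module earmarked for a level deeper than the deepest computed level is simply unavailable. The level ordering and the module-to-level assignments below are illustrative assumptions of mine, not claims of the text:

```python
# Depth of each level, counting inward from the anchoring level SS.
DEPTH = {"SS": 0, "DS'": 1, "DS": 2}

# Hypothetical earmarking of modules to levels, for illustration only.
MODULE_LEVEL = {
    "Case Theory": "SS",
    "Binding Theory (DS-stated part)": "DS",
}

def available_modules(deepest_computed):
    """Modules whose level falls within the computed span from SS inward."""
    return {m for m, lvl in MODULE_LEVEL.items()
            if DEPTH[lvl] <= DEPTH[deepest_computed]}

# A shallow, DS'-deep analysis screens out the DS-earmarked part of Binding Theory;
# once the derivation fills out back to DS, that part becomes available.
print(available_modules("DS'"))  # -> {'Case Theory'}
```

Run in the other direction, the same table is the linguist's tool: a function observed to be missing at the DS′ stage is thereby earmarked for the span between DS′ and DS.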
There is a second property of this type of analysis which is worthy of note.
I have suggested somewhat tentatively that this aspect of shallowness of analysis
(with the anchoring at SS) may not be general, but rather associated with a
particular process: comprehension. What about the child as speaker in the
speaker/hearer duality? Here I will mention a possibility that will remain quite
speculative. Let us suppose that the child as speaker again adopts (computes) a
shallow analysis. However, this analysis is shallow in the opposite direction:
anchored at DS, but shallow with respect to SS. The result would be the following:
(14)  Representation 1: Anchored at DS

          DS      <- production anchored at DS
           |
          SS′     <- shallow S-structure
           ⋮
         (SS)     <- corresponding adult S-structure
         /    \
       PF      LF

      Representation 2: Anchored at SS

         (DS)     <- corresponding adult D-structure
           ⋮
          DS′     <- computed structure
           |
          SS      <- analysis anchored at SS
         /    \
       PF      LF
Such a system would then give the following relation of the computational
system to the structure of the grammar, where comprehension and production are
viewed with a suitable degree of abstractness.
(15) i. Comprehension: shallow in the upwards direction, anchored at
SS, functions uniquely associated with DS not available.
ii. Production: shallow in the downwards direction, anchored at
DS, functions uniquely associated with SS not available.
While the grammar is of the Chomsky-Lasnik type, the anchorings for compre-
hension and production would be distinct. Let us define a grammar/computational
system as equipollent if it has the following property.
(16) A (Grammar, Comput. System) is equipollent if it is anchored at all
levels for all operations (comprehension/production).
(17) The adult grammar, but not the developing childs, is equipollent.
(16) and (17) together give us a characterization of a developing grammar.
Moreover, this characterization is not only available to us as linguists, but to the
child him/herself. This suggests a solution, indeed a general solution, to the
problem of overgeneration and negative evidence discussed in detail in Pinker
(1984). Recall that Pinker faced a serious problem with respect to the
intermediate grammars that the child adopted. Namely, the child, at early stages, produces
sentences like the following:
(18) Me give ball Mommy (I gave the ball to Mommy)
I walk the table (I am walking on the table)
In Chapters 2 and 4, I suggested a particular solution to the ungrammaticality of
these utterances: namely, that they are not ungrammatical at all, but rather
correspond to subrepresentations in the adult grammar: the theta representations.
Let us consider a second possibility here, which will appear incompatible with
the other approach. (In the genesis of this thesis, I considered the approach here
first, and later abandoned it in favor of the approach in Chapters 2 and 4; I will
attempt to synthesize them here.) This approach originates more directly out of
an attempt to come to grips with certain problems in Pinker's approach. It
depends crucially on the notion of anchoring at a level.
A natural assumption given Pinker's approach is that the sentences in (18)
would correspond to the following phrase markers (note that this would not be
the representation given in Chapters 2 and 4, where they would be part of the
sub-phrasal syntax).
(19) a. [S [NP Me] [VP [V give] [NP ball] [NP Mommy]]]
     b. [S [NP I] [VP [V walk] [NP table]]]
But while this assumption is natural, it leads, as Pinker notes, to a serious
difficulty. If we assume that lexical heads have subcategorization frames
corresponding to these phrase markers, these would have to be the following.

(20) a. give: ___ NP_theme NP_goal
     b. walk: ___ NP_location

The problem is that these subcategorization frames are impossible for the adult
grammar. That is, the goal for give must either be marked with to or must
precede the theme; the locative object of walk is the object of an on preposition
(in this usage). How does the child then get rid of the erroneous subcategorization
information? (Note that this problem, a delearnability problem, does not arise in
the representation in Chapters 2 and 4, because the representations would be
taken to be accurate, though in a subgrammar: let us put aside this solution for
now.) This is the problem of negative evidence, in its strongest form.
It might at first be thought that a uniqueness principle would be appropriate.
For example, if it could be argued that every lexical item has a unique deploy-
ment of thematic relations and category types, then the later acquisition of a
subcategorization frame such as the following would knock out the subcategori-
zation frame in (20a).
(21) give: ___ NP_theme PP_goal
While it can indeed be argued for the subclass of give verbs that only one DS
deployment exists, NP_theme PP_goal, with a possible movement operation producing
the double object form (see Stowell 1981; Baker 1985, for a theory along these
lines), and thus while a uniqueness principle may in fact be used for this set
of examples to exclude the erroneous entry, this cannot in general be the case.
The spray/load class of verbs, for example, allows two realizations of objects.
(22) a. spray the wall with paint
        spray paint on the wall
     b. spray: ___ NP_loc (with NP_inst)
        spray: ___ NP_inst (on NP_loc)
So the existence of two lexical entries per se, for a given verb, cannot in general
help the child in excluding initial erroneous entries. However, this leaves the question
of how the child eliminates the offending entries in (18)–(20) from the grammar.
In an important contribution, Pinker (1984) adopts one possible solution.
Namely, he suggests that until the final grammar is set, lexical entries (and
phrase structure rules) are given provisional status, by the device of orphaned
nodes. This means that the phrase structure rule, and the corresponding
subcategorization frame, that the child uses is marked with a question mark to indicate
its provisional status; the phrase marker itself contains an orphan and a
possible mother node (or set of such nodes).
The lexical entries in the grammar corresponding to the PS expansions in
(19) would then be:
(23) a. ?give: ___ NP_theme NP_goal
     b. ?walk: ___ NP_location
Since these entries are not given full-fledged status in the grammar, the problem
of the lack of evidence to eliminate the entries in (23) does not arise. How, then,
is the correct grammar reached? According to Pinker, the intermediate entries are
assigned some provisional probability of occurring. The actual sanctioning of a
lexical entry is not all-or-none, but rather with respect to a learning curve. In the
long run, not enough evidence is gotten from a late enough stage to allow for the
erroneous entry to be permanently listed.
While the above solution is interesting, it has a peculiar property. The entire
grammar is up for grabs at every intermediate point, and there is, in addition,
no definite way of knowing for certain when the end point has been reached,
i.e. when the question mark has been erased. This means that the entire
grammar has, at every intermediate point, a rather provisional status. This
character might be argued to be not a failing, but a virtue: that this is precisely
what occurs in a learning system. That is, the all-or-none idealizations of
linguistics are false precisely on this point, and the notion of a learning curve,
and a learning system, must explicitly allow for the notion of a
question-marked entry, where the question mark ultimately fades into oblivion.
However, there is one good reason to suppose that this solution is less than
optimal. This is because a linguistic system is not simply a group of isolated
facts, but has itself a deductive structure. Certain pieces of information must be
used as the basis for determining other pieces of information. This means, in
turn, that the pieces of information which are used as such a basis must be
known with almost exact certainty; otherwise, the degree of uncertainty in the
initial entry infects the rest of the grammar. In fact, to the degree to which more
than one piece of information enters into a deduction, the certainty of
the result decreases as the product of the certainties of the elements in the
basis: two elements, with certainties of .8 and .7, give rise to a
deduction of certainty only .56.
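The arithmetic behind this point is just multiplication of certainties; a one-line check (the function name is mine):

```python
def deduction_certainty(*premise_certainties):
    """Certainty of a conclusion as the product of the certainties of its premises."""
    result = 1.0
    for c in premise_certainties:
        result *= c
    return result

# The text's figures: premises of certainty .8 and .7 support a conclusion of only .56.
print(round(deduction_certainty(0.8, 0.7), 2))  # -> 0.56
```

Each additional uncertain premise degrades the conclusion further, which is why the deductive basis must sit near certainty.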
What would this mean, in terms of the characterization of the abstract,
learnable, deductive system? It would mean, I believe, the following. The ideal
learnable deductive system should not so much have a normal distribution in
terms of the certainty of the elements of the grammar therein at intermediate
points, but rather have something closer to a bi-modal distribution. Certain
elements would be known with almost exact certainty: say, probability .98 or
above. Certain other elements would be provisional in status: clustered at some
markedly lower probability (say, .60), and marked as such. Given such a
distribution, the function of the elements in the grammar would differ. Namely,
the elements clustered around surety would act as the deductive basis for further
pieces of information in the grammar. The elements clustered at the lower
probability would not. Further, the elements at the lower probability would be
checked for exactness by the application of the deductive structure inherent in the
system to the elements which are known with relative certainty: that is, the
certain elements would act in tandem to weed out the less certain. Further, in
the course of development, elements would move from an unsure characterization
as to accuracy to a quite sure characterization fairly rapidly.
There does seem to be some evidence for such a distribution. Consider a
standard description of learning, as it applies to the natural learning of some
element of an articulated system such as language. It goes something like the
following. The child, at the initial stage, uses a piece of information sporadically
and often wrongly. This stage may last years. There appears then a stage in
which the element is used over and over again, often incorrectly but with greater
and greater frequency of appropriateness. This stage is comparatively very quick:
often lasting only a month or two or three. Finally, the construction is mastered,
and frequency of use again drops down. The intermediate stage is much shorter
in duration than either of the two flanking stages. Labov and Labov (1976) note
exactly such a process in the mastering of questions.
For about fourteen months after Jessie's first wh-questions, she showed only
sporadic uses of this syntactic form, less than one a day. In the first three
months of 1975 (3:4–3:8), there was a small but significant increase to two or
three a day. Then there was a sudden increase to an average of 30 questions a
day in April, May, and June, and another sudden jump to an average of 79
questions a day in July with a peak of 115 questions on July 16th … After the
peak of mid-July, the average frequency fell off slowly over the next two
months (to 4:0), then fell more sharply through October and December to a
stable plateau of 14–18 a day for the next seven months.
This pattern would correspond exactly to one in which an element began as
unlearned (i.e. with a low subjective assignment of probability of accuracy: prior
to 1975 here), to one in which it had a high one (from October 1975 onward); in between
was the time of discovery, relatively short. Such descriptions are of course
characteristic of most instances of learning, and are obvious from simple
observation. Note that if this description is correct, the all-or-none characteriza-
tion of learning in linguistic theory is in fact close to the truth. While some
variance must at both points be allowed, what is crucial is that the distribution is
bimodal, where the modes correspond to the long periods of time over which the
element is viewed as not learned and learned, the two points of temporal
stability. Furthermore, we may expect that the elements in the two groups differ
in function: the elements which are known are the basis for further deduction,
while the elements which are not known are not.
5.4 Equipollence
Assuming the above as a general characterization of the system, there must be
some way by which particular entries are marked as certain, while other entries
are not. Rather than assuming that this is done simply quantitatively, and tagged
onto the system, let us take it as a working assumption that this is done in the
representational system itself. If this were the case, then we may have a reason
for the necessity of the bimodality in certainty: a necessity linked to the
representational system.
The above-mentioned paradigm suggests a way in which this might be done.
Suppose that the child saves two representations of a given lexical item or
syntactic substring. One is anchored in the more surfacy level, i.e. S-structure
or the surface. This representation extends to the other levels (DS, LF, etc.),
but is anchored in a particular, surfacy level: say, S-structure. The other
representation also extends to all levels, but is anchored at another level: say, DS or
underlying representation. The child therefore has two representations, each with
a full complement of levels (and thus fully formed), but with a different
anchoring level.
How does the child then know when his or her final grammar has been
reached? Suppose that, rather than relying on a device such as questioning all
intermediate entries, the child uses the notion of equipollent (equally anchored)
defined above. In particular, the following holds:

(24) When the representation underlying a construction is equipollent
     (single representation anchored at all levels), the representation is
     final and correct.

(25) When the representation underlying a construction is not equipollent,
     i.e. consists of either two representations, or a single
     non-equipollent representation, it is provisional.
Learning, then, would be the process of converting representations in the form of
(25) to the form in (24). Note that this faithfully represents the idea that there is
a basic two-way distinction in the form of the knowledge and that this is encoded
in the representational system: either a piece of information is in the form (24),
in which case it is learned, or it is in the form (25), in which case it is not yet
learned. Further, the child knows, by the representational system itself, which
element falls into which group: the non-learned elements have multiple,
nonequipollent entries, while the learned elements have a single, equipollent entry.
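The criterion in (24)–(25) lends itself to a small data-model sketch. The class, the level inventory as a flat set, and the predicate name are my own stand-ins:

```python
# A (sub)entry records the level(s) at which it is anchored.
LEVELS = {"DS", "SS", "PF", "LF"}

class Representation:
    def __init__(self, anchors):
        self.anchors = set(anchors)

def is_equipollent(representations):
    """(24): exactly one representation, anchored at all levels -> final and correct."""
    return len(representations) == 1 and representations[0].anchors == LEVELS

# Two competing entries, one anchored at DS and one at SS: provisional, per (25).
provisional = [Representation({"DS"}), Representation({"SS"})]
# A single entry anchored at every level: learned, per (24).
learned = [Representation(LEVELS)]
print(is_equipollent(provisional), is_equipollent(learned))  # -> False True
```

The point of the sketch is only that learned vs. not-yet-learned is read off the representational system itself, with no numerical tag needed.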
How might this work in practice? Suppose that we take the
overgenerations noted earlier:

(26) a. Me give book John (I gave a book to John)
     b. I walk table (I am walking on the table)

Rather than supposing a provisional lexical entry for these constructions, let us
imagine a real one:

(27) a. give: ___ NP_theme NP_goal
     b. walk: ___ NP_location
Clearly, however, this cannot be all that is said, or the entry could never be
driven out. Let us seek help from the Projection Principle, which states (roughly)
that representations from all syntactic levels are projections from the lexicon.
This means, however, that we may view S-structure as a projection from the
lexicon as well. Let us suppose that this is the case, and that the child's grammar
contains a representation of the lexical representation underlying S-structure, as
well as that underlying DS: so much is implicit in the Projection Principle. Let
us, however, put this together with the notion of anchoring. Suppose that the
child has two representations, not one, of a single lexical item (in a single
usage). One is anchored at DS; it is the one that underlies the sentences that the
child produced in (26), and is to be found in (27a).
(28) give: ___ NP_theme NP_goal   (Me give doll mommy)

The second is anchored at S-structure, and consists of the actual heard form:

(29) give: ___ NP_theme PP_goal
The child's full representation, of a single lexical entry, therefore consists of the
following subentries, both distributed throughout the grammar (recall that there
are two entries, not one), but anchored at different points.
(30) a. Subentry 1: Anchored at DS
        DS: give: ___ NP_theme NP_goal
        SS: give: ___ NP_theme NP_goal
        (PF and LF entries same as DS and SS)

     b. Subentry 2: Anchored at SS
        DS: give: ___ NP_theme PP_goal
        SS: give: ___ NP_theme PP_goal
        (PF and LF entries same as DS and SS)
Notice that this solution immediately explains one problem which is puzzling in
Pinker's account. Namely, if the child's grammar only contains the first entry in
(28) (with the question mark attached), how is the child able to comprehend
sentences such as John gave a ball to Mary? That is, at the same time that the
child is generating (in the nontechnical sense) erroneous forms, the appropriate
representation must be available somehow to account for comprehension. The
representation in (30) does so, and anchors it appropriately.
More centrally, the outline of a way to handle the negative evidence
problem can be seen. Recall the problem: the child is exhibiting constructions
which appear to be ungrammatical from the point of view of the adult.
Furthermore, the lexical subcategorization frames underlying them are overgenerated,
but appear not to be eliminable from uniqueness principles alone. Rather than
marking broad sections of the grammar provisional per se, two entries are listed
in the grammar, anchored in different places. Since the representation is not
equipollent, being neither single, nor anchored at all levels, the child knows that
the form, which he may be using, is not the final form that it is appropriate to
have for the grammar, i.e. the child knows that learning must take place.
This method has one great advantage. Aside from marking particular entries
as provisional, it allows other entries to be unequivocally marked as final: i.e.,
complete and accurate. These are the entries which are equipollent. Thus,
suppose that at a later stage, a different representation of give was internalized.

(31) give: ___ NP_theme PP_goal   (anchored in DS)
     give: ___ NP_theme NP_goal   (anchored in SS)
At this stage, this part of the grammar would be equipollent, anchored both in
DS and SS (we may ignore LF and PF for current purposes). More exactly, the
syntactic structure corresponding to the projected lexical entry would be
equipollent, so the lexical entry would be. This would mean, however, that this part
of the grammar could be considered by the child to be complete and true:
namely, a full and complete lexical entry. This is important, because certain
sections of the grammar must be known as certain, and not simply provisionally,
in order for the child to use them to make other judgments. That is, in situations
of partial information, it is important that certain of the pieces of information
be known absolutely (or nearly absolutely) as true. It is these points which act as
the basis for further inference.
This presents a picture different from that in Pinker (1984). Rather than the
grammar being provisional en masse, and gradually achieving certitude,
particular parts of the grammar, namely, those which are equipollent, are known
to be accurate. It is these which exert their force over the rest of the grammar
from the point of view of inference. This has an advantage over that which
Pinker assumes, because if an entry is marked as provisional it would not act as
the basis for further inferences in the grammar, and hence would not infect the
rest of the grammar.
Consider now briefly how this sort of system may be accommodated to the
approach sketched in Chapters 2 and 4. A single equipollent entry may be
created from two nonequipollent entries in the following ways:

i)  By retaining the D-structure anchored entry and removing the S-structure
    anchored entry from the grammar (allowing the D-structure entry to be
    anchored at all levels).
ii) By retaining the S-structure anchored entry and removing the D-structure
    anchored entry from the grammar (allowing the S-structure anchored entry
    to be anchored at all levels).
iii) By retaining both the D-structure and S-structure entries, and positing an
     operation which mediates between them.

The difference between the approach in Chapters 2 and 4 and that given directly
above is a difference in which of these mechanisms is adopted. In the chapters
above, the third mechanism (iii) is adopted, assuming that the initial representation
(me go Mommy) is actually retained in the final grammar, and the problem for
the child is to mediate between this and the adult representation, which he/she
does by means of the rule Project-α.
In the assumption directly above, the D-structure anchored entries by the
child are assumed to be simply false, and the child ultimately projects back the
S-structure anchored entry, using mechanism ii). The problem there was for the
false entries to be eliminated, without the entire grammar falling into disrepute
(by the means of questioning entries). This was done by means of allowing two
entries by the child, but not allowing them to be equipollent.
Let us assume henceforth that the approach in Chapters 2 and 4 is correct,
rather than the one outlined directly above. In this case, it will still be necessary
to allow for two entries, one anchored at DS (or, more exactly, thematic
structure), and another anchored at SS. Thus the crucial notion of anchoring still
holds. Further, it will be necessary to mediate between these two representations.
This may be done by the following rule:

(32) a. Entry merger: If α is the entry for a word anchored at DS, and
        β is an entry anchored at SS, and there is some operation T
        existing in UG mediating between α and β, then Merge (α, β)
        with the operation T.
     b. If (a) is impossible, choose α or β as the anchored entry and
        create an equipollent entry from it.

This creates a single equipollent entry, or allows a transformation to mediate
between the two known levels.
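Rule (32) can be rendered procedurally. The entry format, the inventory of UG operations, and the tie-breaking choice in clause (b) are all hypothetical stand-ins of mine; the dative-shift relation echoes the give frames in (31):

```python
LEVELS = {"DS", "SS", "PF", "LF"}

def entry_merger(ds_frame, ss_frame, ug_ops):
    """(32a): merge the DS- and SS-anchored entries if some UG operation mediates;
    (32b): otherwise take one entry and make it equipollent (here, the SS one)."""
    for name, op in ug_ops.items():
        if op(ds_frame) == ss_frame:                     # clause (a)
            return {"anchors": LEVELS, "DS": ds_frame, "SS": ss_frame, "op": name}
    return {"anchors": LEVELS, "DS": ss_frame, "SS": ss_frame, "op": None}  # clause (b)

# Toy UG operation relating the prepositional frame to the double-object frame.
ug = {"dative-shift": lambda f: ["NP", "NP"] if f == ["NP", "PP"] else f}

merged = entry_merger(["NP", "PP"], ["NP", "NP"], ug)
print(merged["op"])  # -> dative-shift
```

Either branch yields a single entry anchored at all levels, which is just the equipollence that marks the entry as final under (24).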
A number of questions at the theoretical level remain; for the remainder of
this chapter, however, I will simply concentrate on empirical evidence supporting
the notion of anchoring.
5.5 Case Study I: Tavakolian's Results and the Early Nature of Control
In the preceding section I suggested that a general solution to the problem posed
by negative evidence was to be found with the concept of equipollence. In the
rest of the chapter, I would like to investigate a particular instantiation of the
notion of shallowness of a derivation, in particular, with respect to the
anchoring in S-structure, with the derivation shallow with respect to DS. That is,
the situation in (33) abides:
(33)      DS
           |
          DS′        shallow derivation
           |
          SS
This would mean that the particular grammatical functions associated with the
real DS would be unavailable to the child at the point at which DS′ was the
deepest computed level.
Such a general prospect would have three effects. First, to the extent to
which grammatical functions associated with DS are screened out in early
derivations (which only proceed back to DS′), there is evidence that the grammar
really is leveled, rather than DS being simply an aspect of SS. Second, the
indeterminacy as to where particular operations apply in the adult grammar may
be solved, or at least moved toward a solution, by looking at the child
grammar. To the extent to which development is truly organized as this model
suggests, and not simply helter-skelter, the actual content of DS, and the
principles which apply there, may be determined by noting which operations fail
to apply at the stage at which a shallow derivation is computed. Finally, the
notion of development is given real status in the grammar, both in terms of the
structure of levels in the grammar, and in the development of particular
constructions from non-equipollence to equipollence.
I discuss now three phenomena indicating the presence of a shallow
derivation: Susan Tavakolian's (1978) data concerning control into sentential
subjects, Guy Carden's (1986b) thoughtful analysis of Condition C effects in
dislocated constituents, and Roeper, Akiyama, Mallis and Rooth's (1986) paper
concerning wh-movement, Strong Crossover, and quantificational binding.
5.5.1 Tavakolian's Results
Susan Tavakolian, in an extremely interesting set of experiments (Tavakolian
1978), has investigated aspects of the acquisition of sentences with clausal
subjects. In particular, she tested children's competence on infinitival clausal
complements, including those both with and without a pronominal subject, the
latter being control structures. It is these control structures, often called instances
of nonobligatory control, after Williams (1980), which are of interest here.
Examples of the control structures tested are the following.
(34) a. To stand on the rabbit would make the duck happy.
b. To bump into the pig would make the sheep sad.
c. To walk around the pig would make the duck glad.
d. To kiss the lion would make the duck happy.
e. To hit the duck would make the horse sad.
f. To jump over the duck would make the rabbit happy.
Control properties of structures with subjectless innitival complements (i.e.
apparently subjectless) are extremely various. In particular, the following sorts of
control seem to be possible.
(35) a. Arbitrary control, where the PRO refers to an arbitrary
unspecified element, similar (in English) to the meaning of
one.
b. Control from the object or subject of the predicate which the
controlled clause is the complement of.
c. Control from a controller up-the-tree, so-called long distance
control.
d. Control or interpretation by a discourse or extra-sentential
referent, definite in reference.
e. Control by the prepositional object of a restricted class of
predicates, mostly psychological predicates, into a sentential
subject.
Additional refinements are possible: for example, between thematic and pragmatic
control (Nishigauchi 1984), and between control as it takes place in the want
vs. try class (Rosenbaum 1967). These are irrelevant for what follows.
Examples of the type of control in (35) are given below.
(36) a. To know oneself is difficult. (Arbitrary Control)
b. John persuaded Mary to leave. (Control by Matrix Object)
c. Bill said that shaving himself was a drag. (Long Distance Control)
d. Have you seen Bill? Shooting himself in the foot must really
have hurt. (Discourse Control)
e. To know himself is difficult for Bill. (Control by Object into
Sentential Subject)
The theoretical problem is to try to get a unified account out of this apparent
diversity. See Williams (1980), Chomsky (1981), Manzini (1983), and Clark
(1986) for further discussion. This variance notwithstanding (a topic which
will be discussed shortly), it appears that in the Tavakolian sentences, given in
(34), and in a large class of like constructions, if an object is present (including
a for object), it must control the PRO subject. Thus while (37a) is perfect with
the lion as controller, it is considerably worse with an arbitrary one reading, and
(I believe) impossible with control outside the clause: a discourse referent.
(37) a. PRO to kiss the duck would make the lion happy.
(the lion kisses the duck)
b. ??PRO to kiss oneself would make the lion happy.
(test for one interpretation by (in-)ability to take reflexive
oneself object)
c. Did you see the pigs? PRO to kiss the duck would make the
cow happy.
(Impossible under interpretation in which the pig is kissing the
duck)
The same appears to be the case with all of Tavakolian's examples, which are
identical to (37a) in form, except for the choice of NPs, representing different
animals. Discourse control, either by an explicitly mentioned referent (37c), or
by a pragmatically accessible entity, is impossible for adult speakers, for this
class of examples.
However, the situation with children is different. Children consistently and
systematically allow a discourse referent, one which was not even mentioned in
a previous sentence but is pragmatically available in the set of animals with
which the child was told to act out each sentence. Tavakolian's results were the
following (Tavakolian 1978: 187):
Table 1. Distribution of Responses to Sentential Subjects with Missing Complement Subjects
(To kiss the lion would make the duck happy)

                          Response Type
Age           Matrix NP   Extrasentential NP   Other
3.0–3.6           7               12              5
4.0–4.6           8               13              3
5.0–6.6           3               19              2
Total            18               44             10
Percentage      25%              61%            14%
Thus 61% of the children allowed an extrasentential referent for PRO. For the
sentence given, this might be, for example, a pig which was included in the set
of farm animals. This percentage did not significantly change for the ages under
investigation, though it was, in fact, slightly higher for 5 year olds than 3 year
olds. This is clearly at variance with the adult response. Moreover, unlike other
somewhat similar cases, it is not liable to the methodological strictures of Lasnik
and Crain (1985) and Crain and McKee (1985), who note correctly that with
respect to backwards anaphora with overt pronouns, the predisposition for
children to allow an extrasentential reading (Solan 1983) does not make a
grammatical point. The authors above note that for sentences such as the following:
(38) After he left, John went to the store.
Children tend to either take he as an extrasentential referent, or, if they allow
coreference, to transpose the name and the pronoun in the following manner:
(39) After John left, he went to the store.
But as Lasnik and Crain note, the tendency for coreference in (39) is not a
grammatical phenomenon in any case (but purely pragmatic), and, even more
crucially, if the child is able to transpose the pronoun and the name in (39) in a
repetition task, this means that the two elements in (38) in the comprehension
part of the task must have been co-indexed: i.e. known to be coreferent. So the
bias, on this type of backwards anaphora, must not be part of the grammar.
The situation with the Tavakolian sentences is quite different. First, the
necessity for object coreference in sentences such as (40) seems to be a matter
of the grammar, not simply pragmatics. Adults do not allow extrasentential
referents for the control clause in examples such as the following:
(40) To kiss the duck would make the lion happy.
Second, when children are asked to repeat these examples, they may supply a
pronoun for the PRO (Tavakolian 1978), but that pronoun is not taken as
referring to the matrix object, but to an extrasentential referent. Assuming that
they are paraphrasing their semantic representation in some sense, this means
that in that representation, and thus in the output to the comprehension task, the
missing subject in the control clause is marked for extrasentential reference.
Thus the phenomenon appears to be grammatical in character.
5.5.2 Two Solutions
There are two possible avenues to take in attempting to isolate the aspect of the
child's grammar here which is different from an adult's grammar. The difference
might fall in the control rule itself (or, more exactly, in how control
interacts with levels of representation), or it might be traced to a difference in
the categorization of the null element.
If the PRO, for example, were interpreted as a simple pronoun in this stage
of development, then one might expect free extra-sentential reference. This is, in
fact, the explanation that Tavakolian herself offers: that PRO is interpreted as a
simple pronoun. A more sophisticated version of this same idea might be the
following. All null categories are, at some stage, neutralized along some dimension.
Thus if we assume that null categories, in their feature sets of +/−anaphor,
+/−pronominal (Chomsky 1981), are stored in a paradigm and the paradigm itself
must be learned, then there might be some intermediate stage in the articulation
of the paradigm which would predict extrasentential reference. For a recent
version of this sort of theory, see Drozd (1987, 1994). Of course, any theory
which posits a change in the categorization of PRO, either in the simple version
(namely, that PRO is interpreted as pro) or in the more complex version (that the null
category paradigm is learned, and neutralizations exist in it in early stages), must
give some account of how the final setting appears from the initial setting.
The second possibility is that there is some change in the control rule over
time: more exactly, in the interaction of control with levels of representation.
Suppose that, for the constructions under investigation, PRO is its normal adult
category (e.g. +anaphor, +pronominal), with standard characteristics. Suppose
that, for some reason, the Control rule coindexing the PRO with an antecedent
does not apply in the usual manner. An uncontrolled (i.e. unindexed) PRO, we
may suppose, is able to pick up a general referent from discourse, or to be taken
to be arb. In this way, the same set of data could arise.
In the next two subsections, I will further discuss these two possibilities,
coming to the conclusion that of the two possible failings (a failure in
category typing, or in the control rule as it interacts with levels of representation)
the latter is the more likely, though with some real possibility of a mediated
view. Finally, I will attempt to justify the difference in the control rule in terms
of the general structure of the grammar.
5.5.3 PRO as Pro, or as a Neutralized Element
One solution to the puzzle that external reference provides for the analysis of
acquisition is simply to assume that PRO has different referential properties in
the child's grammar, and thus, in GB theory, that it has a different feature
composition in this stage of development than it does in the adult's grammar.
The propensity of children to take an external referent in cases like (40), repeated
below, would follow if PRO were simply being interpreted as little pro.
(41) PRO to kiss the duck would make the lion happy.
(External reference: 61%)
There is some interesting empirical evidence from Tavakolian which supports
this view. First, as she notes, the propensity for external control in these
structures is very similar to that for similar constructions with overt pronouns.
Examples like (42) are given an external referent reading about 55% of the time
(Tavakolian 1978).
(42) For him to kiss the duck would make the lion happy.
This similarity in response would be explained if PRO were simply being
interpreted as a pronominal.
Such a misconstrual would be similar to the sort of theory that Hyams
(1985, 1986, 1987) proposes with respect to the acquisition of null subjects: that
they are misconstrued as little pro.
In spite of the simplicity of the proposal, there is empirical data which
severely undercuts it. This data is of one basic type: instances in which the
relevant null category seems to be acting like PRO, not pro, in the child's
grammar. To the extent to which these cases are convincing (and they appear
to be), there is no way that we can assume a general identification of PRO
with pro in the child's grammar: that is, it is not the case that PRO is simply
misconstrued as pro by the child.
The clearest empirical counterevidence to this hypothesis is to be found in
Goodluck's (1978) thesis. First, there is a class of control structures studied by
Goodluck, but not by Tavakolian, which very clearly operated as if they con-
tained a controlled PRO. These were purposives and rationale clauses, and
temporal adjuncts. In the case of in order to clauses, children show a clear
c-command constraint on the choice of the controller.
Table 2. Control in Purposives (Goodluck)

                     Percentage Subject Control
Age        In sentences with Direct Object NPs   In sentences with locative PPs
4 years                  56.7%                              90.1%
5 years                  63.4%                              90.1%
These are sentences like the following:
(43) a. Daisy hits Pluto PRO to put on the watch.
(56.7% subject control, age 4; 63.4% subject control, age 5)
b. Daisy stands near Pluto PRO to do a somersault.
(90.1% subject control, age 4; 90.1% subject control, age 5)
The correct choice for adults for these constructions would of course be subject
control, for both instances. What is interesting here is that even though subject
control is not uniform for the children here, it is obligatory (or nearly so) for the
constructions involving a locative: those in (43b). This may be explained simply
if the child is not able to control PRO out of a locative PP. This c-command
constraint would not be expected with little pro, which is allowed free
coreference, like any pronoun. If we use Tavakolian's paraphrase test, we see in
addition that an overt pronoun allows reference to either the subject or the
locative NP.
(44) Daisy stands near Pluto for him to do a somersault.
If anything, the more pragmatically biased reading in (44) would be that in
which Pluto were the antecedent of him. The fact that the child does not allow
coreference with Pluto when the subject of the complement clause is nonovert
suggests that it is not acting as little pro, which should allow Pluto as antecedent
as him does.
The second body of data weighing against the general interpretation of PRO
as pro comes again from Goodluck, and involves temporal adjuncts. These do not
allow extrasentential reference, but rather must refer back to the main clause subject.
(45) Daisy hit Pluto after putting on the watch.
The percentage of subject coreference for these was the following (Goodluck
does not provide the degree of extra-sentential reference):
Table 3. Subject Control with Temporal Adjuncts

Age   Percentage Subject Control
4              66.7%
5              63.4%
This data is again very strongly in contrast with the original data of control into
sentential subjects, where extrasentential control was the usual case (PRO to
kiss the duck would make the lion happy, Controller: extrasentential). The
uniform assumption that PRO was simply misconstrued as pro in this stage of
the child's grammar would predict uniform results in the two cases: this is not
the case. Again, supplying an overt pronoun, which presumably should operate
similarly to pro, easily allows object coreference, to the extent to which the
construction is grammatical at all.
(46) Daisy hit Pluto after him putting on the watch.
(Only semi-grammatical, but either main clause NP may be coreferent)
All this suggests that the simple misconstrual hypothesis cannot be maintained:
while one might suppose on the basis of Tavakolian's initial results that PRO in
sentential subjects was being interpreted simply as pro, as a result of a general
tendency for the null element to take such an interpretation, this solution would
overgenerate pro in positions in which something much closer to the adult
control rule was operative. These are in purposives and also in temporal adjuncts.
So it cannot be the case that in early grammars PRO is simply uniformly
interpreted as pro.
There is a second possibility that we might consider given the basic
miscategorization hypothesis. Namely, that the paradigm of null categories is
underdifferentiated in initial stages, so that the relevant null category is neither
pro nor PRO, but something antedating either. This might be done, for example,
by supposing that only the +/−pronominal feature and not the +/−anaphoric
feature was operative, in initial stages. The reduced paradigm would look like the
following.
(47) Full Paradigm (Adult)

              +Pronominal   −Pronominal
 +Anaphor          e             e
 −Anaphor          e             e

(48) Reduced Paradigm (Child)?

              +Pronominal   −Pronominal
+/−Anaphor         e             e
In the adult paradigm, the null category will be interpreted as one of the four
major types depending on the slot that the element fills in the paradigm: e in
+pronominal, +anaphor = PRO, e in +pronominal, −anaphor = pro, e in −pronominal,
+anaphor = NP-trace, e in −pronominal, −anaphor = wh-trace. Suppose
that the paradigm is neutralized along the anaphoric dimension: the anaphoric
feature had not been discovered yet by the child. The result would be a
collapsed paradigm in which the anaphoric feature played no role: the +pronominal
null category would be a hybrid between PRO and pro; the −pronominal null
category would be a hybrid between NP-trace and wh-trace. (Needless to say,
this is not the only sort of neutralized paradigm for null categories that might be
imagined: I choose this as illustrative.)
The general properties of such a paradigm collapse in acquisition (more
exactly, underdifferentiation: Pinker and Lebeaux 1982; Pinker 1984) are quite
interesting, and bear further investigation: see Drozd (1987, 1994), for further
discussion. Nonetheless, there is reason to believe that for the particular set of
data under discussion, even this sophisticated version of the misinterpretation of
the early null category is insufficient. The reason is the following. The logical
difficulty with the hypothesis that PRO is, in early grammars, uniformly
interpreted as pro (i.e. a simple, though null, pronoun) resided in the fact that
different constructions operated differently. Control structures of the type
directly studied by Tavakolian, those involving control of an object into a
sentential subject, did indeed allow the null category in the subject of that control
clause to freely choose extrasentential referents. One way of accounting for this
would be to suppose that it was operating as a null pronominal, and that this
followed from some deciency in the interpretation of PRO. However, this
deciency is hardly viable, given that other instances of PRO operate in a
standard way, namely as an obligatorily controlled element. That is, the dicul-
ty cannot reside simply in the misinterpretation of PRO, since this diculty
would then be expected to be general: but it is not.
The same logic would rule out the underdifferentiated paradigm explanation,
at least if it is considered alone to be the root cause. The underdifferentiated
paradigm would allow us to posit a new null category, not yet differentiated
between PRO and pro. But whatever this set of properties were, we would
expect them to be consistent, just as the properties of PRO and pro are. However,
this is precisely what we do not find: sometimes the null category is acting
like a small free pronominal, and sometimes as controlled PRO. That is, while
the underdifferentiated paradigm idea would introduce a difference between the
adult grammar and the child's in the interpretation of the element, it does not
seem to be the right sort of difference: what is needed is a difference which will
allow the null element in the sentential subject construction to have different
properties from the adult PRO, but the null category in (say) temporal adjuncts
not to have such different properties. An underdifferentiated paradigm, by itself,
would not capture this difference (but see Drozd 1987, 1994, for a somewhat
different point of view).
5.5.4 The Control Rule, Syntactic Considerations: The Question of C-command
Given the failure of the theory that initial PRO is analyzed as pro (or as a
neutralized category), we are driven to look elsewhere for an answer. The
questions appear to be:
(49) a. Why are certain control complements (purposives, temporals)
behaving differently than others (control into sentential
subjects) for children?
b. Why are children treating clausal subject complements differently
than adults, with respect to control?
Underlying this, we might wish for a unified theory of control, at least at some
level.
One possibility for the difference in (49a) is that this difference is associated
with the distinction between OC (Obligatory Control) and NOC (Nonobligatory
Control), in the sense of Williams (1980). Goodluck (1978) makes such a
suggestion. This might appear to be on the right track, yet recent work (Lebeaux
1984, 1984–1985; Sportiche 1983) suggests that the distinction between the two
sorts of control, while existent, is not of the primitive sort posited by Williams.
If the analysis of arbitrary control of Lebeaux (1984) and Epstein (1984) is correct,
then so-called PROarb is not an unbound element (i.e., a free variable), but rather
operator-bound. Evidence for this is found in double binding constructions
(Lebeaux 1984).
(50) a. PRO to know him is PRO to love him.
b. PRO to get a nice apartment requires PRO getting a higher
paying job.
(*PRO to get a nice apartment requires PRO getting trustwor-
thy tenants.)
c. PRO to become a movie star involves PRO becoming well-known.
(*PRO to become a movie star involves PRO recognizing you.)
In such constructions, each PRO is arbitrary in reference, but the two PROs must
be linked in reference. Thus (50a) means that for some arbitrary person to know
him, that same arbitrary person will love him. (50b) means that for some person,
x, to get a nice apartment, that same person x, must get a higher paying job. The
logical representation of the structures in (50) is therefore the following.
(51) a. Oₓ ((PROₓ to know him) is (PROₓ to love him))
b. Oₓ ((PROₓ to get a nice apartment) requires (PROₓ getting a
higher paying job))
c. Oₓ ((PROₓ to become a movie star) involves (PROₓ becoming
well-known))
Further, the operator binding must take place quite locally since the double
binding effect disappears when one of the open sentences is further embedded.
(52) PRO being from the Old World means that stories about PRO
winning the West are unlikely to be thrilling.
(53) PRO being from the Old World means PRO hearing stories about
PRO winning the West.
In (52), the two arbitrary PROs are unlinked: since the latter is embedded in an
NP, it is not close enough to the other arbitrary PRO so that they are bound by
the same operator. In (53) the first two PROs are close enough, and they are
linked in reference (bound by the same operator); the third arbitrary PRO is
unlinked, being embedded in an additional NP.
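The locality pattern just described can be made explicit in the operator notation of (51). The indexed renderings below are my own sketch of the text's description, not annotations from the original:

```latex
% (52): O_x binds the first PRO; the PRO inside the NP
% ("stories about PRO winning the West") is too deeply embedded to share
% that operator and must be bound by a separate O_y, so the two are unlinked.
O_x\,((\mathrm{PRO}_x\ \text{being from the Old World})\ \text{means that}\
\text{stories about}\ O_y\,(\mathrm{PRO}_y\ \text{winning the West})\
\text{are unlikely to be thrilling})

% (53): the first two PROs are local enough to share O_x (linked reference);
% the third, again embedded in an NP, is bound by a separate O_y.
O_x\,((\mathrm{PRO}_x\ \text{being from the Old World})\ \text{means}\
(\mathrm{PRO}_x\ \text{hearing stories about}\ O_y\,(\mathrm{PRO}_y\ \text{winning the West})))
```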
In earlier work (Lebeaux 1984), I suggested that: i) PRO (including arbitrary
PRO) must always be bound (to account for the above facts), ii) that the binding
is local (to account for the above facts and the crossing effects), and iii) that the
binding element was a universal quantifier in an operator position.
I wish to retain the first two of these assumptions, and may do so via the
following specification:
(54) PRO must be bound in the minimal maximal NP, S dominating
controlled S, where controlled S is the S most immediately dominating
PRO.
With respect to the third assumption, I will change that here in the following
way. Namely, the binding element is not a universal quantifier, but rather a
simple abstractor. This abstraction, however, does not take place locally, i.e.
within the predicate, but at the S or S′ level (compare to Chierchia 1984). This
element will continue to be represented with O, simply meaning a null category
(assuming that null categories in particular positions may have operator status).
Second, in part in response to considerations raised in Browning (1987), I will
no longer assume the binder to be in Comp, but in a topic position, or some
(quasi-)theta position peripheral to S. The reason for this will appear below.
The idea that arbitrary PRO and long distance control PRO are in fact
operator bound (see the above-mentioned work for additional discussion of Long
Distance control PRO) means that another issue, which might initially be thought
to be decided by incontrovertible evidence, is thrown into high relief. Namely, is
c-command necessary for all instances of control? It would be well if it could be,
since this would regularize it to other co-indexing operations in the grammar. Yet
both Williams (1980) and Chomsky (1981) are driven to answer the question in
the negative, Williams using the lack of c-command as a criterion for his
classification of Non-Obligatory Control, while Chomsky correctly notes the
existence of sentences such as this:
(55) PRO to learn math is necessary for John's development.
Indeed, if John is the direct controller of PRO here, this would counterexemplify
the need for c-command directly. With the possibility of a nonovert topic
operator, however, PRO could get its reference from that, with the topic itself
taking the reference of John from discourse, similar to the situation noted in
Huang (1982) for Chinese.
(56) Oᵢ ((PROᵢ to learn math) is necessary for John's development).
One argument against such an analysis is that there would be a Condition C violation
with respect to the coindexed operator and John, which have the same referent.
However, this violation is weak, in pseudo-topic type constructions in English.
(57) ?As for John, to learn math is necessary for John's development.
Thus one would not expect Condition C to rule out sentences such as (56).
There does seem to be some quite suggestive evidence for just such an
analysis. It is of the following form: that there are widespread similarities in the
patterns of grammaticality and ungrammaticality for sentences which allow (or
do not allow) bound reading for control, and the grammaticality and
ungrammaticality of sentences with an overt as-for topic. Consider first the
following sentences:
(58) a. As for Johnᵢ, this shows that heᵢ is a liar.
b. ?*As for Johnᵢ, this shows that Johnᵢ is a liar.
(59) a. As for Johnᵢ, this sort of thing is important for John'sᵢ
development.
b. ?*As for Johnᵢ, this sort of thing is important to John'sᵢ mother.
The contrast in (58) is expected: the Condition C violation, while weaker for
these constructions than that usually found, is still present (it is considerably
weaker when a name c-commands a name, than when a pronoun c-commands a
name, as Lasnik 1986 notes). The contrast in (59) is the really interesting one.
When a name is part of a nominal like John's development, it causes a much
weaker Condition C violation than when it is part of a name like John's mother.
From (59) it is not clear whether this is because the nominal is deverbal in the
case of John's development, or because John's mother is an animate referent; for
present purposes, this does not matter. Quite likely, the contrast in (59) is not
due to Condition C at all, but to the aboutness relationship which must hold
between the as-for topic and the rest of the sentence, which is (for some reason)
easier to contrive in (59a) than in (59b).
Consider now the corresponding cases with control. Corresponding to (58a)
and (b) is the following contrast from Bresnan (1982). (My judgements; Bresnan
finds a stronger contrast yet, OK vs. *.)
(60) a. PROᵢ contradicting himself demonstrates that heᵢ is a liar.
(Bresnan (51))
b. ??PROᵢ contradicting himself demonstrates that Mr. Jonesᵢ is a
liar. (Bresnan (52))
If one assumed that there were some rule of control by which the PRO were
coindexed with the element in the lower clause, or one assumed an
arb-rewriting rule, this contrast would be inexplicable: such a rule would not be
expected to be sensitive to the pronoun vs. name distinction. However, this
contrast would be explained if there were a null topic in the cases in (60): in
(60b), but not in (60a), a name would be c-commanded by another name.
(61) a. As for Johnᵢ, this sort of thing is important for John'sᵢ
development.
b. *As for Johnᵢ, this sort of thing is important to John'sᵢ mother.
The contrast in (61) is the really interesting one. When a name is part of a
nominal like John's development, it causes a much weaker violation than when
it is part of a name like John's mother. Quite likely, the contrast in (61) is not
due to Condition C at all, but to the aboutness relationship which must hold
between the as-for topic and the rest of the sentence, which is (for the same
reason) easier to contrive in (61a) than in (61b). Consider now the following set
of sentences. Here, again, the apparent control contrast is strongly in parallel
with the aboutness relation contrast.
(62) a. As for John, this is important for John's development.
b. *As for John, this is important to John's mother.
c. PROᵢ to learn math is important for John'sᵢ development.
d. *PROᵢ to learn math is important to John'sᵢ mother.
The contrast between (62a) and (b) parallels that between (62c) and (d), though
the latter is a case of control and the former is not. This is explained if we
assume that it is not the control rule which is sensitive to such predicates as
part-of-a-possible-controller, but rather that the control rule, like other coindexing
rules, requires c-command by some minimally local element: in this case a null
topic. The contrast between (62c) and (d) would then not be stated in the control
rule itself, but such a contrast would be factored into the possible constraints on
an aboutness rule, which is independently needed to explain the contrast in (62a)
and (b), and a structure-dependent control rule.
Theoretically, this allows for the control rule, a coindexing rule, to not be
sensitive to pragmatic information, and thus regularizes it to other coindexing
rules in the grammar, to some degree.
The same sort of parallel holds for cases of embedded object vs. embedded
subject, as the following quadruplet shows:
(63) a. ?As for Bill, this shows that Bill is really smart.
b. *As for Bill, this shows that Mary is right about Bill.
c. ?PROᵢ winning the Nobel Prize shows that Billᵢ is really smart.
d. *PROᵢ winning the Nobel Prize shows that Mary was right about
Billᵢ.
The same comments apply.
To summarize: by choosing a null topic analysis, the pragmatically sensitive
features of control are, to some degree, teased out, and taken to be part of the
aboutness relation between the null topic and the following clause. This
specification is independently needed, as shown above. While this does not identify
obligatory and nonobligatory control (differences exist between them: if
Lebeaux 1984–1985 is correct, the former and not the latter is bound by predication;
other differences are found in Koster 1984 and Franks and Hornstein 1990),
this does regularize control, even so-called arbitrary control, to other
c-commanding relations in the grammar. Directly relevant here: it would mean
that c-command characterizes all cases of control.
Pending further analysis, then, I will assume that the operator-type structure
given in (56) is the correct analysis for sentences such as (55). We are still left
with the problem of explaining the acquisition facts, and a major linguistic
problem as well. In a large number of constructions with control into sentential
subjects, it is not possible to take a discourse antecedent (as already noted with
the Tavakolian sentences) and, further, in others, a long distance antecedent is
not viable. Examples are given below.
(64) a. *Did you see the pigᵢ? PROᵢ to kiss the duck would make the
lion happy.
b. *Johnᵢ thinks that PROᵢ to sleep more would be pleasing to his
father.
c. *Billᵢ said that PROᵢ knowing himself was difficult for his wife.
This is in spite of the fact that in many cases an as-for topic would be sanctioned
by the following context. Thus in the sentences in (65) and (66) the as-for topic
would seem to be licensed by the possessor identical to it. Yet it is still not
possible to have control occur.
(65) a. As for John, this would please his father.
b. As for Mary, this made her mother angry.
(66) a. Do you know Johnᵢ? *PROᵢ to succeed in business would
please hisᵢ father.
b. *I've met Maryᵢ, and PROᵢ to have a mohawk haircut makes herᵢ
mother angry.
Therefore, in cases like (65) and (66), it is necessary to prevent the PRO from
being controlled by a topic or a discourse element.
We thus appear to have two sets of conflicting data. On the one hand, it
appears that an operator-type analysis fulfills three functions: i) it explains the
linked reading for arbitrary PROs, and the locality involved, ii) it explains the
crossing effects for long-distance binding of PRO, and iii) it allows c-command
to be maintained for examples such as (57). On the other hand, the possibility of
an operator-type reading which is associated with Long Distance control and
discourse control is strictly limited. In none of the examples in (64) is such
a reading available.
Let us first note again an observation made earlier: the class of control into
subject constructions is associated with a restricted group of predicates. These are
of differing types: i) psychological predicates (please, disgusts, excites, etc.), ii)
tough-predicates (tough, easy, etc.), and iii) predicates involving necessity (is
necessary, requires, etc.), iv) predicates involving causation (make, etc.). While
these predicate-types differ from each other in a number of ways, the latter three
types allow an expletive subject.
(67) It is tough (for John) to do that.
(68) It is necessary (for John) to do that.
(69) It makes Mary happy to do that.
Psychological predicates normally take an NP subject, but if the argument is
clausal, this may also appear in an extraposed position.
THE ABROGATION OF DS FUNCTIONS 219
(70) It pleases Mary {to do that / that Jeff is so handsome}.
It has recently been convincingly argued that psychological predicates have, in
at least some of their uses, two internal arguments at DS (Belletti and Rizzi
1986; Johnson 1986). Accepting the basic position of Belletti/Rizzi and Johnson,
this means that the D-structure of (71b) is (71a), and that the NP is moved into
subject position.
(71) a. e please John pictures of himself.
b. Pictures of himself please John.
(The authors above differ in their DS assignments: Belletti and Rizzi assume
that the s-structure subject starts off in the most internal position in the VP,
while Johnson assumes that it originates in a 2nd NP position. For concreteness,
I have used Johnson's structure.)
A piece of supporting evidence for the Belletti/Rizzi and Johnson analysis
for English is the placement of arguments in nominals. In nominals, where case
is not a consideration, both arguments appear in internal position.
(72) the pleasure of John in Mary's company
(Mary's company pleases John)
The two-internal-arguments analysis allows a long-standing difficulty with the
binding theory to be resolved. Given the standard analysis in which reflexivization
requires c-command, structures such as (73) constitute a puzzle.
(73) a. Pictures of himself please John.
b. Each other's choice of friends baffle the two boys.
Given the DS posited by the theorists above, however, these structures are no
longer a puzzle. At DS, c-command does hold, and one internal argument acts as
an antecedent for the other.
(74) ((e)_NP (please (John)_NP (pictures of himself)_NP)_VP)_S
Suppose that we extend the Belletti and Rizzi analysis to the (somewhat more
problematic) cases of control. The DS of (75a) would then be (75b); the DS of
(76a) would be (76b).
(75) a. PRO to kiss the duck would please the lion.
b. e would please the lion (PRO to kiss the duck).
(76) a. PRO to kiss the duck would make the lion happy.
b. e would make the lion happy PRO to kiss the duck.
220 LANGUAGE ACQUISITION AND THE FORM OF THE GRAMMAR
Further, assume that the small clause complement originates as part of a complex
verb (Chomsky 1957, 1975 [1955]; Bach 1979).
(77) e would make-happy the lion (PRO to kiss the duck).
The strict c-command of the PRO by the controller is in all cases preserved.
Such an analysis, in conjunction with the operator-type analysis, allows us
to retain strict c-command as a necessary condition for control. The near-obligatoriness
of control by the object in these constructions is tied to the deep
structure position of the control clause, before it is fronted. Suppose that control
at DS is generally obligatory, modulo the data discussed above. Suppose that for
some predicate it does not take place at DS. Then the clause is fronted to an
S-initial position. Assuming that PRO always looks to its nearest c-commanding
antecedent, and that this relation is local, null topic insertion will then take place,
if PRO has escaped control by the DS object, closing off the sentence. In effect,
the necessity of object control for most predicates when the clause is in subject
position is a bleeding phenomenon, where the object co-indexes with the
controlled PRO. Only in cases where PRO has escaped control is reference up
the tree possible.
This account traces both the possibility and near-obligatoriness of object
control to the deep structure position of the clause as sister to its object. It is at
this point that object control occurs, if it does. The solution is the following:
(78) a. The controlled subject clause is an internal argument at DS.
b. The control of PRO is defined directly, rather than using a derived notion of c-command.
c. External reference, or reference "up the tree", takes place after the clause has moved to its S-structure position, and only then.
5.5.5 The Abrogation of DS Functions
We are now in a position to trace the difference in the child's interpretation of
sentential subjects to a very simple difference in the control rule, as it interacts
with levels of representation. The difference is simply this: that for the child, but
not for the adult, the control clause is always in the fronted position, both at
S-structure and at the deepest computed level of the derivation (when I say
"dislocated", above and throughout the chapter, I am purposely being vague as to
whether I mean moved or base-generated in a dislocated position, unless
otherwise specified). When the controlled clause originates in a VP-internal
position at D-structure in the adult grammar, the following sequence of operations
applies in the adult grammar (see (78)): i) application of control, dependent on
direct c-command, ii) movement of the control clause to fronted position.
(79) Adult grammar
a. DS:
[S [NP e] [I would] [VP [V make happy] [NP the lion] [S PRO to kiss the duck]]]
   | Control
b. [S [NP e] [I would] [VP [V make happy] [NP the lion]_i [S PRO_i to kiss the duck]]]
   | Fronting
c. SS:
[S [S PRO_i to kiss the duck]_j [S [NP e] [I would] [VP [V make happy] [NP the lion]_i t_j]]]
The control rule applies in (near-)obligatory fashion to the representation in
which the control clause is an internal complement. It demands direct c-command,
and has it. Following control, the clause is fronted. The result is the coindexed
representation, with PRO coindexed with the controller (here: the lion). Since an
element may only be indexed once, this control is final: there is no possibility of
extra-sentential reference (in the adult grammar).
Consider now what happens with the child. The control rule is constant: it
applies in the child grammar exactly as in the adult system, and requires direct
c-command. PRO also has the same status in the child's grammar as in the
adult's. The only difference is the following: the child's initial representation is
shallow: DS′ rather than DS. At DS′, the deepest level computed by the child,
the controlled clause is already in the fronted position.
(80) Child's representation (DS′ and SS):
[S [S PRO to kiss the duck]_j [S [NP e] [I would] [VP [V make happy] [NP the lion] t_j]]]
Given that the clause originates in the surface fronted position, it is never in the
c-command domain of the object (the lion). Since control is stated over direct
c-command relations, this means that control cannot apply to coindex PRO with
the object. Instead, unindexed, it looks up the tree, to the inserted topic, for an
antecedent. This is precisely Tavakolian's result. (Since the Projection Principle
and Theta theory must be satisfied, the fronted element binds a null category in
the theta position. This null category, however, is not a trace in the derivational
sense: the dislocated control clause never originated in that position.)
In effect, the application of the control rule in the adult grammar bleeds the
possibility of external reference. Since the clause does not originate in an internal
position in the child's grammar (the deepest representation being shallower), the
co-indexing, and the bleeding, do not occur. Hence extrasentential reference is
expected.
The structure of the adult and child grammars is therefore the following:
(81)
     Adult Grammar                     Child Grammar

     DS   Control-by-Object            DS   Control-by-Object
      |   Fronting                      |   Fronting
     DS′                               DS′  <- deepest computed level by child
      |                                 |
     SS                                SS
          Control of as-yet-                Control of as-yet-
          unindexed elements                unindexed elements
          (by topic)                        (by topic)

     (Control applies throughout the derivation, in both grammars.)
The control rule and the fronting rule apply identically in the child's grammar
and the adult's. The only difference is that the deepest computed level for the
child is post-fronting, while that for the adult is pre-fronting. As such, the
operation which is the default for adults, namely control of as-yet-unindexed
elements, applies as a matter of course for children. This involves operator-binding,
allowing reference extrasententially, or "up the tree".
This allows for an explanation not only of the difference in binding into
sentential subjects found by Tavakolian, but also of the instances in which control
applies identically in the child's grammar and in the adult's. These are found in (82).
(82) a. Daisy hits Pluto PRO to put on the watch.
b. Daisy stands near Pluto PRO to do a somersault.
c. Daisy hits Pluto after PRO putting on a watch.
As noted earlier, the children's results in these constructions do not allow
extrasentential reference. Given the structure of the grammar in (81), and the
retention of the standard control rule by children, this result is expected. In the
instances of control with the sentential subject, the actual control rule in the adult
grammar applies when the clause is in an internal position. Since the child's
grammar is shallow, the clause is never in that position, and escapes standard
control. There is no fronting, however, in the examples in (82). Hence the
control rule applies, as it should, and the child's interpretation is identical with
the adult's.
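The interaction just described is essentially procedural, and can be rendered as a small sketch. The following Python fragment is purely illustrative (the level names and attributes are my own, not part of the theory's formal apparatus): the control rule applies at every computed level, succeeding only where its structural condition (direct c-command of PRO by the object) is met, and a PRO still unindexed at the end of the derivation falls to default null-topic control. The child's derivation differs from the adult's only in omitting the pre-fronting level.

```python
# Illustrative sketch (hypothetical names): a derivation is the sequence of
# levels actually computed; each level records whether the structural
# condition for object control (direct c-command of PRO) holds there.

def control(level, pro, controller):
    """Coindex PRO with its c-commanding controller, if the structural
    condition holds at this level and PRO is not yet indexed
    (an element may only be indexed once)."""
    if pro["index"] is None and level["object_c_commands_pro"]:
        pro["index"] = controller["index"]

def run_derivation(levels, pro, controller):
    # The control rule itself is constant: it applies at every computed level.
    for level in levels:
        control(level, pro, controller)
    # Default: a still-unindexed PRO is bound by an inserted null topic.
    if pro["index"] is None:
        pro["index"] = "TOPIC"
    return pro["index"]

controller = {"index": "lion"}

# Adult derivation: DS (clause VP-internal, c-command holds), then the
# post-fronting level, then SS.
adult_levels = [
    {"object_c_commands_pro": True},   # DS
    {"object_c_commands_pro": False},  # post-fronting
    {"object_c_commands_pro": False},  # SS
]

# Child derivation: shallow -- the deepest computed level is already
# post-fronting, so the pre-fronting level is simply absent.
child_levels = adult_levels[1:]

print(run_derivation(adult_levels, {"index": None}, controller))  # lion
print(run_derivation(child_levels, {"index": None}, controller))  # TOPIC
```

On this sketch, removing the deepest level is the only change needed to flip the outcome from object control to topic control, mirroring (81).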
5.6 Case Study II: Condition C and Dislocated Constituents
In this section, I would like to look at a different range of experimental evidence
supporting the conclusion advanced above: namely, that if a particular operation
(e.g. control) applies at DS (as well as elsewhere), and the child's analysis is
shallow, then the child's grammar will not show evidence of the operation
applying. This amounts to the abrogation of that particular DS function. In the
previous section, the abrogation took the form of the non-application of an
obligatory rule: the indexing function between internal arguments at DS, a
positive condition. In this section, the abrogation is of a negative condition:
namely, Condition C as it applies at DS. As such, the child's grammar will
appear to overgenerate. This is a result, again, of the shallowness of the child's
analysis.
The empirical data that I will draw upon is taken largely from a review
article by Guy Carden (Carden 1986b), which in turn analyzes, and re-analyzes,
a variety of sources. I take Carden's proposal to be particularly acute, and follow
it in a number of respects (though for some counter-discussion, see Lust 1986).
Carden (1986b) explores in some detail the differences which follow from
what he calls a "Surface" vs. an "Abstract" Model of Noncoreference. The
progenitors of the Abstract Model he gives as Carden (1986a), Carden and
Dietrich (1981), and McCawley (1984); the Surface Model is that advanced by
Reinhart (1983). Recent models, not discussed by Carden, have fallen under the
rubric of "Reconstruction", whether real Reconstruction or quasi-Reconstruction,
where no actual reconstruction is found: see, e.g., Higginbotham (1985),
Williams (1987), and Barss (1985, 1986) for proposals along these lines. Carden,
and the discussion here, draw on both the adult grammar and the acquisition
evidence.
The relevant data are examples such as these:
(83) a. *Near John_i, he_i saw a snake e.
b. *In John_i's bag, he_i put some tennis shoes e.
c. Near him_i, John_i saw a snake e.
d. In his_i bag, John_i put some tennis shoes e.
In (83c) and (d), coreference between the pronoun contained in the preposed PP
and the subject is possible; in (83a) and (b), the name in the preposed PP may
not be coreferent with the subject pronoun. In Carden's account, these facts may
be accounted for in two distinct ways: by an abstract account, which states the
condition on disjoint reference at deep structure, or by Reinhart's "surface"
model, where such conditions are stated on s-structure. For Reinhart, this does
not involve reference to the trace as well. To the two possibilities outlined by
Carden, we may add a third: the possibility of using a level of Reconstruction
Structure, or using derived c-command relations at a level like LF. This is more
like the D-structure approach in making use of the original DS position.
The D-structure model of Carden would state the conditions on disjoint
reference at D-structure; the sentences in (83a) and (83b) would then be related to
their DS counterparts.
(84) a. *He_i saw a snake near John_i.
b. *He_i put some tennis shoes in John_i's bag.
The sentences in (84) would then be marked ungrammatical at DS, as violating
Condition C. They retain their ungrammaticality throughout the derivation. This
is identical to the position suggested in Chapter 3.
In Reinhart (1983), Condition C is stated over S-structure, without using the
position from which the dislocation occurred. This is done by extending the
notion of c-command so that an element c-commands all relevant elements in its
maximal projection, as in (85) (following Aoun and Sportiche 1981). (Or it may be
done by positing a structure in which the preposed PP hangs off S, as in (86).)
(85) *[S [PP Near John_i] [S [NP he_i] [VP saw a snake]]]
(86) *[S [PP Near John_i] [NP he_i] [VP saw a snake]]
226 LANGUAGE ACQUISITION AND THE FORM OF THE GRAMMAR
Using some such modified notion of c-command, the relevant coreference relations
in (85) and (86) may be expected to be impossible in the derived position (Reinhart 1983).
While a number of arguments may be broached against this S-structure
solution to the problem of disjointness, perhaps the strongest argument has never
been mentioned in the literature, to my knowledge. This is simply that the
necessity for disjointness holds even under additional embedding.
(87) *Near John_i, Bill said he_i saw a snake.
Reference to simple S-structure position, without reference to some trace, clearly
will not work for this sentence, since there is no manipulation of the phrase
marker by which he directly c-commands John here. While it might be the case
that disjoint reference (i.e. Condition C) applies to an intermediate structure, e.g.
when the PP has been fronted in the lower clause but not the upper, it seems clear
that Reinhart's general solution, in terms of S-structure c-command without
reference to traces, is not really viable.
5.6.1 The Abrogation of DS Functions: Condition C
Let us now consider some acquisition evidence. The following pattern of results
is from Carden 1986b, summarizing a large body of experiments (for more
detailed evidence, see that article). The names of the experimental conditions
have been changed to reflect current GB-style terminology.
(88) Question-Answering Interpretation Task (Age: 3.5–7.0). (Italicized elements coreferent.)
a. Pronominal Coreference
i) Mickey is afraid that he might fall down. (78% coreference: Ingram and Shaw)
ii) Ken's mother said that he was sick. (96% coreference: Taylor-Browne)
b. Condition C: Dislocated Constituent
i) Under Mickey, he found a penny. (78% coreference: Ingram and Shaw)
ii) Near Barbara, she dropped the earring. (76% coreference: Taylor-Browne)
c. Pronominal Coreference: Dislocated Constituent
Near him, Wayne found the programme. (69% coreference: Taylor-Browne)
d. Condition C
i) He was glad that Donald got the earring. (24% coreference: Ingram and Shaw)
ii) He was glad that Wayne was coming. (13% coreference: Taylor-Browne)
The data may be summarized as follows. First, simple pronominal coreference is
of course possible (88a). Second, contrary to the sometimes-posited linearity
conditions on children's grammar, backwards coreference also appears possible,
with the coreferent pronoun preceding the name (as long as it does not c-command it).
This is shown by examples like (88c): Near him, Wayne found the programme. This is in
line with the Solan (1983) conclusion. Third, children at this age do appear to
have Condition C, as they rightly reject coreference in examples like those in
(88d): He was glad that Wayne was coming. In all these respects, i.e. in
examples (88a), (c), and (d), children are behaving identically to adults.
Finally, however, they do diverge from the adult data in (88b). Coreference is
allowed by children in examples in which the fronted PP contains a name which
is coreferent with a pronoun that c-commands it in D-structure: Under
Mickey, he found a penny. That is, instances of Condition C with a dislocated
name are not blocked for the child.
Carden draws the correct conclusion with respect to the consequences this
data has for the D-structure account vs. Reinhart's direct c-command account.
(Reconstruction accounts would pattern here with the D-structure account.) Reinhart
unifies the instances of obligatory disjointness in (89a) and (b) at a single level,
and under a single condition: the c-command condition (Condition C) applying
at S-structure.
(89) a. *He was angry that Wayne was there.
b. *Under Wayne, he put a dime.
Insofar as such a unification is appropriate, one would expect it to appear
uniformly in the developmental sequence as well: either the c-command condition
on coreference holds or it does not, at any given developmental stage.
However, this is not the case: examples like (89a) are correctly rejected by the
child (under the coreferent interpretation), while coreference is possible in (89b).
But there is no value of the c-command condition, as Reinhart states it, which
could change over time to let (89b) in while keeping (89a) out.
The D-structure account, and the Reconstruction account as well, would
distinguish the data in the appropriate way, by allowing for two distinct factors:
Condition C itself, and the fact of dislocation. Condition C holds for
the child, but not under movement. This suggests that it is not Condition C
which is at fault at all, but rather that the D-structure representation which would
act as the target for that condition is not computed by the child under conditions
of dislocation.
In particular, the data follow from a shallow derivation: the acquisition
sequence goes back only to DS′ rather than to DS. This is shown
in (90) and (91).
(90) Adult analysis.
DS: *He_i saw a snake near John_i.
SS: *Near John_i, he_i saw a snake t. (retains star)
(91) Child analysis.
DS′: Near John_i, he_i saw a snake e.
SS: Near John_i, he_i saw a snake e.
(92)
     DS
      |         Condition C applies throughout
     DS′  <- shallow analysis
      |
     SS
Condition C applies throughout, and it applies directly to structurally defined
c-command, rather than through some derived notion using chains as an equivalence
class. But this means that Condition C will not apply if the child has a
shallow structure such as that in (91) at the deepest level of analysis. This is
precisely Carden's result.
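Carden's pattern can be sketched in the same illustrative style (the representation of levels is my own simplification, not the book's formalism): Condition C is a negative condition, so a single computed level at which a pronoun c-commands a coreferent name stars the sentence. The adult derivation includes the pre-fronting level and so retains the star; the child's shallow derivation never contains the offending configuration.

```python
# Illustrative sketch (hypothetical names): Condition C as a negative
# condition, checked at every level a given grammar actually computes.

def condition_c_violated(level):
    """True if, at this level, a pronoun c-commands a coreferent name."""
    return level["pronoun_c_commands_name"]

def coreference_possible(computed_levels):
    # A negative condition may be satisfied nowhere: a single offending
    # level stars the coreferent reading for the whole derivation.
    return not any(condition_c_violated(l) for l in computed_levels)

# "Near John, he saw a snake" (coreferent reading):
ds      = {"pronoun_c_commands_name": True}   # DS: He saw a snake near John.
fronted = {"pronoun_c_commands_name": False}  # post-fronting: Near John, he saw a snake.
ss      = {"pronoun_c_commands_name": False}  # SS: same fronted configuration.

print(coreference_possible([ds, fronted, ss]))  # False: adult derivation retains the star
print(coreference_possible([fronted, ss]))      # True: the shallow derivation overgenerates

# "He was glad that Wayne was coming": no dislocation, so the offending
# configuration is present at every level; adult and child alike reject it.
print(coreference_possible([{"pronoun_c_commands_name": True}] * 2))  # False
```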
In general, both this analysis and the above analysis of control suggest that
the grammatical functions associated with a level will be abrogated if that level
itself is not computed, or is only partially computed, by the child. In the case of
control, the abrogated function is the positive indexing rule, where the sentential
element is indexed with its object controller in its DS position. Since the child
does not analyze the dislocated clause as ever being in that position in the course
of the derivation, the grammatical function associated with that position, the
indexing to the Direct Object controller, is abrogated, since its structural
condition is not met. Hence the default rule of null topic control applies instead.
But there is no change in the control rule (or principle) itself; it is simply that
one of the set of structures feeding it has not been supplied.
Similarly here, with respect to the negative (contra-)indexing rule of
Condition C: its structural condition is not met, so it does not apply, given the
shallow analysis.
(93)
     DS   Condition C: structural condition not met
      |   Control: object-control structural condition not met
     DS′  <- deepest computed level
      |
     SS
5.6.2 The Application of Indexing
While Carden correctly notes the advantage of the movement account for the
above data, there is no sense in which the abrogation of Condition C under
movement would logically follow in his account. Under the sort of analysis
suggested in this chapter, however, it is not simply the case that the data is cut
naturally; rather, the particular sort of failing by the child, the failure of
Condition C only under conditions of movement, would be predicted. This is
because the failure is not a failure of Condition C at all (which applies correctly
at all the relevant levels), but rather of shallowness of analysis. Since Condition
C, and all the binding conditions, apply directly on structures, and since the child
is not computing a full derivation DS–SS, but rather a shallow derivation DS′–SS,
the relevant level at which the pronoun c-commands the name is not
available to the child. Rather, at the deepest level of analysis computed by the
child, the preposed element (generally a PP) is already in fronted position,
binding a trace in argument position. This means, however, that the name is not
c-commanded by the pronoun at any level of analysis. So Condition C, while
operative in the child's grammar, finds no level at which its structural condition
(a name c-commanded by another name or a pronoun) is satisfied.
So none of the relevant structures are marked *; the grammar overgenerates.
The analysis above therefore supports the following set of propositions:
a) that there is a D-structure (i.e. that the grammar is in the derivational, rather
than the representational, mode),
b) that binding principles, in particular Condition C and Control, apply
throughout the derivation,
c) that the binding principles apply to a structurally defined c-command
relation, and that therefore
d) if DS is not computed, or is only partially computed, in a particular analysis
in acquisition, then the positive or negative principles associated with it will
be abrogated.
Let us assume the following metatheoretical condition on the indexing rules.
(94) Metatheoretical Condition on Indexing
a. If a positive condition applies, it must be satisfied somewhere
in the course of a derivation.
b. If a negative condition applies, it must be satisfied nowhere in
the course of a derivation.
For the present, we may assume that the conditions in (94) apply particularly to
the Binding Theory, and in general to binding, i.e. to the marking of co- and
disjoint reference, though there may be other areas in which they would be
applicable as well. A positive condition would be the marking of coreference; a
negative condition would be the marking of disjoint reference. Further, I will assume,
following the discussion in Chapter 3, that the Binding Theory applies so that
c-command is defined directly, rather than derivatively, or by some further level
of reconstruction.
Consider how (94a) would work for positive conditions of coreference, for
example, for anaphoric binding. The anaphoric element would enter the derivation
with no index. Throughout the derivation, the element could be indexed with
an antecedent, if the locality conditions of the binding theory were met with
respect to that antecedent. Finally, at LF (or LF′), all elements are checked for
an index: structures having unindexed elements are thrown out. This means that
if the element satisfies the binding conditions at any point, the derivation will be
sound, provided the binding condition is a positive one, i.e. one which requires or
involves the assignment of an index. This sort of theory is close in spirit to the
"Assign gamma" feature of Lasnik and Saito (1984).
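The positive/negative asymmetry of (94) can be made concrete with a small sketch (again my own illustration; the set-based encoding of structural conditions is an expository assumption): a positive condition need only find one computed level at which its structural description is met, while a negative condition is fatal if its structural description is met at any level.

```python
# Illustrative sketch (my own encoding): each level is represented as the set
# of binding conditions whose structural descriptions are met there.

def derivation_sound(levels, positive=(), negative=()):
    # (94a): each positive condition (e.g. Condition A for some anaphor)
    # must be met at at least one computed level.
    ok_positive = all(any(p in level for level in levels) for p in positive)
    # (94b): each negative condition (e.g. Condition C for some name)
    # must be met at no computed level.
    ok_negative = all(not any(n in level for level in levels) for n in negative)
    return ok_positive and ok_negative

# An anaphor licensed only after NP-movement: Condition A unmet at DS,
# met at SS. Satisfied somewhere, so the derivation is sound.
print(derivation_sound([set(), {"A"}], positive=["A"]))   # True

# An anaphor whose locality condition is met at no level: the structure
# is thrown out at the final check.
print(derivation_sound([set(), set()], positive=["A"]))   # False

# A name c-commanded by a coreferent pronoun at DS only: the negative
# condition is met somewhere, so the derivation is starred.
print(derivation_sound([{"C"}, set()], negative=["C"]))   # False
```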
With respect to alternative notions of the Binding Theory, the view proposed
in (94) must be defended in two ways. On the one hand, one might
propose that the Binding Theory, or binding, applies at some particular level: for
example, NP-structure or LF. On the other hand, one might view the binding
theory as applying at "Reconstruction Structure", where reconstruction structure
is not a level in the traditional sense, but something like the union of information
from the previous levels (see, e.g., Williams 1987, for such a conception). The
latter view is of course rather difficult to distinguish from the position given
above, since in both cases a union of information is being taken, but it is possible
to distinguish between them.
A complete discussion of the principle in (94) goes beyond the scope of this
book; see Lebeaux (1991) for further discussion (see also Barss 1985, 1986, for
a relatively thorough discussion of reconstruction: I was able to see that work
only after the following was written). In the next several paragraphs, I will
briefly indicate some effects of the relevant positions, particularly with
respect to reconstruction.
Let us first consider the possibility of the Binding Theory applying within the
derivation. In cases of NP-movement, binding must at least be allowed after the
movement has occurred, to account for examples like (95).
(95) The boys seemed to each other t to be very nice.
Examples like (95) also show that the positive condition, Condition A, need not
be satisfied everywhere, since it is not satisfied at DS. Condition A must also be
allowed to apply post-wh-movement, to account for the "pit stop" property of
Steve Weisler (p.c.): namely, that a moved wh-phrase may
contain a reflexive bound to any of the NPs in the intervening clauses (possible
antecedents indicated).
(96) a. Which pictures of himself did John say Bill liked e? (himself = John)
b. Which pictures of himself did John say Bill liked e? (himself = Bill)
(97) John wondered which pictures of himself Bill liked.
In (96a), the reflexive appears to be bound to John from its position in the
intervening Comp. This, together with simpler data like that in (97), suggests that
anaphoric binding must at least apply after wh-movement.
It might be suggested, then, that S-structure is the place. Note that even here
some sort of union of indexing must be involved, either throughout the
derivation or by an equivalence class in chains, since both of the indexings in
(98) are possible (coreferent items indicated).
(98) a. John wondered which pictures of himself Bill liked e. (himself = John)
b. John wondered which pictures of himself Bill liked e. (himself = Bill)
There is some interesting evidence, however, from T. Daniel Seeley (Seeley
1989) which suggests that the Binding Theory must also apply at LF (the
interpretation here is mine, not Seeley's). Consider the behavior of stressed
reflexives in a discourse (Seeley 1989).
(99) A: Does John like MARY?
B: No, John likes HIMSELF.
(100) A: Do the boys believe John to like STEVE?
B: ?No, the boys believe John to like EACH OTHER.
(101) A: Do the boys expect Steve to believe John to have seen BILL?
B: ?*No, the boys expect Steve to believe John to have seen EACH OTHER.
(102) A: Did Bill leave because Sue saw MARY?
B: *No, Bill left because Sue saw HIMSELF.
The judgements in (99)–(102) are by no means crystal-clear, but I believe that
they capture the intuitions of a number of native speakers. The judgement of
(101) is the most variable: many speakers, including myself, find it ungrammatical,
but some find it acceptable. If the judgements are correct, two facts
emerge: stressed reflexives may escape their immediate clause, but not further,
and stressed reflexives in an adjunct may not take an element in the main clause
as an antecedent (102). These facts may be captured with a simple analysis. The
stressed reflexive, like other focussed elements, is fronted to an S-adjoined
position at LF (Chomsky 1977b). This means that it escapes from its binding
category at LF, and may take a higher-clause element as antecedent.
(103) LF of (100b):
No, the boys_i believe (each other_i (John to like e_i))
Consequently, (100) would be grammatical because Condition A would be satisfied
at LF. (99) would be grammatical because Condition A would be satisfied at
S-structure (or earlier). However, neither (101) nor (102) would be grammatical,
because the output after focus fronting would not satisfy the binding conditions.
(104) LF of (101):
?*No, the boys_i expect Steve to believe (each other_i (John to have seen e_i)).
(105) LF of (102):
*John_i left because (himself_i (Sue saw e_i)).
The binding in (104) would violate the binding conditions, as would the binding
in (105).
Seeley's data from stressed reflexives, as well as the even more clear-cut
evidence from wh-movement, suggest that positive binding conditions cannot be
stated at a single level without reference to other levels, either via the
cumulative-derivational approach suggested above, where the actual indexing is done
throughout, or via a reconstruction-type approach, which defines a derived notion
of c-command or equivalence classes of chains. Certain facts, in particular those
which have come under the rubric of chain-binding (Barss 1985, 1986), seem to
be problematic for the cumulative-derivational approach.
(106) Chain-binding
a. Those pictures of himself are the ones that I think that John
really likes.
b. Those pictures of each other are the kinds of things that Bill
thought that those men really liked.
Here, the reflexive seems to require reference to the embedded trace, though
presumably no movement has occurred. Of course, the entire binding theory
cannot be stated over an equivalence class of chains (if one wished to define a
chain having the left-most member of the copular sentence in (106) as its head,
the lexical relative clause head as a middle member, and the trace as the tail),
since Condition C does not apply obligatorily.
(107) Those pictures of John are the ones that he really enjoys.
One empirical oddity about the structures in (106) seems to me to be the
following. These constructions, unlike other long-distance binding into nominals
in standard sentences, require that the nominal bound into have an implicit
possessor identical to the anaphor. My judgements are the following:
(108) a. Those pictures of each other are the ones that the boys really
like.
(must be their pictures of each other)
b. The boys really like those pictures of each other.
(need not be their pictures of each other)
(109) a. Those stories about each other are the ones that the boys
believe to be true.
(must be their stories about each other)
b. The boys believe those stories about each other to be true.
(need not be their stories about each other)
Oddly, this implicit possessor reading does not seem to be required of the
corresponding pseudo-cleft type.
(110) What the boys like are stories about each other.
If these judgements are correct, then the theoretical problem associated with the
copular sentences in (109) dissolves, though not the one for (110).
More generally, it would seem to me preferable to retain direct notions of
c-command and chains, and revise the relevant notion of phrase structure, than
to do the reverse. See Lebeaux (1991, 1998) for more discussion.
5.6.3 Distinguishing Accounts
In this chapter, I have been following a particular type of proposal in order to
account for certain facts in acquisition: namely, that indexing applies throughout
the derivation, that it applies directly in terms of structurally defined c-command,
and that certain differences between the child's grammar and the adult's may
then follow from the fact that the child's analysis is, in certain respects, shallow.
The grammatical functions associated with the partially missing or partially
computed levels would therefore be abrogated. In the last section, I suggested
that this view would follow from a general metatheoretical condition on indexing,
namely that positive indexing applies throughout the derivation (i.e. positive
conditions must be satisfied somewhere), while negative conditions may never be
satisfied. While the issues are complex, I would like to indicate briefly in this
section the differences between this account and those which fall under the
rubric "Reconstruction". Two broad types of reconstruction accounts may be
distinguished: those which involve actual reconstruction of the moved element,
and those which define c-command relations or equivalence classes of elements
in chains in terms of the dislocated structure. The latter type of account I will
call quasi-Reconstruction.
In spite of the similarities between the cumulative notion of indexing above
and the reconstruction approaches in general (both allow for the union of
certain types of information), differences would be expected between them. In
particular: (1) to the extent to which elements are added in the course of the
derivation (Chapter 3), various conditions may be taken not to apply to the added
element, in the cumulative-derivational view, while this result may only be gotten
with difficulty using (quasi-)reconstruction; (2) the cumulative-derivational view
would allow for an ordering of operations within the grammar, and hence for
bleeding or blocking possibilities, a result which would, again, only be possible
with difficulty using quasi-reconstruction. To the extent to which (1) and (2)
hold, the cumulative-derivational view is supported (recall that the cumulative-
derivational view holds that binding possibilities, e.g. positive indexing as in
Condition A, apply cumulatively throughout the derivation, adding indexings).
I will examine (1) and (2) here only with respect to the acquisition evidence
above, in particular with respect to the abrogation of DS functions; see Lebeaux
(in preparation) for a more complete syntactic discussion.
THE ABROGATION OF DS FUNCTIONS 235
Let us consider an instance of the two accounts. The positive conditions
everywhere/negative conditions nowhere account would have the following form:

(111) DS → SS → LF
      Binding operations apply throughout (e.g. Condition A); indices checked.

(112) DS → SS → LF
      Negative Condition may not be met (anywhere).

And assume a Reconstruction account of the following form, where Reconstruction
structure, either as a set of structures or a set of defined relations, holds at LF.

(113) DS → SS → LF ~ Reconstruction Structure
Now consider the acquisition data that we have been examining, as well as earlier
syntactic data, to see which is preferable. The cumulative-derivational account
above can account for the lack of Condition C effects for dislocated constituents
(Carden's data), by assuming that DS is not computed, insofar as the dislocation
is concerned. The Reconstruction-type account apparently can account for the
(lack of) Condition C effects as well. Suppose that a parallel reconstruction-type
account has the following form:

(114) Condition C is stated over R-structure, a set of structures derived
      from LF by reconstruction.

(115) In the child's grammar, no separate level of R-structure exists at a
      particular stage, because the derivation is shallow in the LF direction.

Then the lack of Condition C effects for the Carden-type sentences above is
explained:

(116) In Mickey's wallet, he put a penny e. (OK for child; coreferent items
      italicized)
Sentence (116) would not have an R-structure corresponding to the structure in
which In Mickey's wallet has been put back into place (however exactly this
is done). Hence it would invoke no Condition C violation by the child: the actual
result. This is shown in (117).

(117) DS → SS → LF′ [→ LF ~ R-structure]
      (actual computed structure: DS through LF′)
However, we earlier noted syntactic considerations which militated against
Condition C being stated over R-structure: the anti-Reconstruction effects of van
Riemsdijk and Williams (1981) (see Chapter 3). Given the existence of these
effects, and the argument/adjunct distinction in the presence of a Condition C
violation for dislocated constituents, syntactic considerations alone advance a
Positive Conditions Everywhere/Negative Conditions Nowhere type approach.
We are left with the following.
(118) Type of Binding into Dislocated Constituent
      (cumulative approach vs. R-structure)

                             Condition C
      acquisition evidence   indeterminate
      syntactic evidence     cumulative approach
                             (Positive Conditions Everywhere/
                             Negative Conditions Nowhere)
This summarizes the set of data considered in Chapter 3: the Condition C
binding. What about the data considered in the last section, that bearing on
control? Here, the syntactic evidence is indeterminate about the type of approach
which is supported, but the acquisition evidence is not, supporting a cumulative-
derivational approach, though weakly, over one involving a level of R-structure.
The full chart, then, for the two instances discussed will be the following.
(119) Type of Binding into Dislocated Constituent
      (cumulative-derivational approach vs. R-structure)

                             Condition C           Control
      acquisition evidence   indeterminate         cumulative approach
      syntactic evidence     cumulative approach   indeterminate

      cumulative approach = Positive Conditions Everywhere/Negative
      Conditions Nowhere
Consider why the acquisition evidence does support the cumulative-derivational
(direct) approach, in light of the general analysis of Non-Obligatory Control in
the previous section. The crucial data were the Tavakolian-type sentences given
in (120).
(120) a. PRO to kiss the duck would make the lion happy.
b. PRO to leave the room would make the skunk happy.
As noted there, children, but not adults, allow extra-sentential reference in such
constructions. I suggested that this was due to the interaction of the following
three factors:
(121) a. Non Obligatory Control clausal subjects originate in internal-to-
VP position
b. Control requires direct c-command; when this applies at DS,
the other internal argument is the controller
c. The sentential clause is fronted; if the PRO is unindexed, it is
operator-bound, and gets its index from the operator.
In this theory, it is the presence of the control clause in internal position, and the
application of control there, which bleeds the later possibility of operator-binding
and external reference. As noted earlier, if the analysis is shallow, i.e. if
the deepest computed level is DS′, not DS, and at DS′ the control clause is
already in fronted position, then control by the internal-to-VP element will not
apply. So operator binding will take place, and extrasentential reference will
occur. This accounts for the Tavakolian results.
(122) Adult analysis:
      a. DS
         e would make the lion happy (PRO to kiss the duck).
      b. DS after Control
         e would make the lion_i happy (PRO_i to kiss the duck).
      c. SS
         (PRO_i to kiss the duck)_j would make the lion_i happy e_j.
(123) Child's analysis:
      a. DS′
         (PRO to kiss the duck)_j would make the lion happy e_j.
      b. SS
         O_i (PRO_i to kiss the duck)_j would make the lion happy e_j.
It is the shallowness of the derivation which allows the operator to be inserted.
(Note that in the child's analysis, an empty category is present in the second
object position at all levels of representation, as required by the Projection
Principle, etc.; it is just not the trace of a real movement operation.)
(124) DS → DS′ → SS → LF
      normal control rule: at DS
      shallow analysis: DS′ → SS → LF
      default topic insertion and interpretation: at LF
Suppose that we tried to do it with a Reconstruction-type account.

(125) Adult Grammar:
      DS → SS → LF ~ R-structure
      Operator (topic) interpretation or insertion

(126) Child Grammar:
      DS → SS → LF′ [→ LF ~ R-structure]
      Operator (topic) interpretation or insertion
The corresponding control rule, using reconstruction, would be the following.
The dislocated constituent is placed back into its underlying position (or read as if
placed back) at R-structure. This operation is optional. If it applies, control by
the object NP is possible; otherwise a peripheral topic operator is inserted. The
key syntactic fact would be the ordering of the reconstruction operation and the
topic interpretation.
Consider now what would happen in acquisition. As noted above for
Condition C effects, the parallel way of accomplishing this end would be by
supposing that R-structure was not relevant for the child's grammar, the grammar
being shallow in that direction. But this situation is not as symmetrical as it
seems. The control rule and the operator insertion rule are on opposite sides of
the grammar for the cumulative-derivational formulation of control, but not for
the reconstruction-type approach. A shallow analysis (lacking in some of the
operations of D-structure) for the first case would eliminate object control, but
would retain operator insertion, which is occurring on the other side of the
grammar. In the second case, however, the shallowness of analysis at LF would
eliminate (abrogate) both the reconstruction rule and the default rule of operator
insertion or interpretation. The result would be a structure which would be ill-
defined: not the actual result.
5.7 Case Study III: Wh-Questions and Strong Crossover
In the previous two cases under investigation, I have dealt with two areas in
which it appears that the child's grammar is shallow: the analysis of control,
and the analysis of constructions which should apparently be ruled out by
Condition C. In each case, it was found that the child's grammar diverged from
the adult's. However, this was taken not as evidence that the condition itself was
different in the two grammars (Non-Obligatory Control, Condition C), nor that
there was a difference in category type (i.e. that the initial PRO was pro, or
some neutralized null category), but rather that the child's analysis was shallow:
anchored in S-structure, and extended only part of the way back to DS, to DS′.
As such, the constraints and operations which would have applied to the DS
representation did not apply. As a consequence: (i) in the case of Control, since the
c-command condition for control was not met, a default operation of operator
insertion applied, and extrasentential reference was gotten; (ii) in the case of
Condition C, with dislocated constituents at the deepest level of analysis,
Condition C did not apply because its structural condition was not met (the name
did not have a c-commanding coreferent name or pronoun).
In the next sections, I would like to discuss a third area in which the child
adopts a shallow analysis: the analysis of wh-questions. The data here are drawn
from an important paper, Roeper, Akiyama, Mallis, and Rooth (1986), "Crossover
and Binding in Children's Grammars". The data are quite complex, and the
paper itself is not as well known as it should be. Accordingly, I will first discuss
here the analysis of wh-questions which I will adopt, then summarize the paper,
and then present an analysis of how the acquisition data can be accounted for
within the general levels-of-representation conception that this work argues for.
5.7.1 Wh-questions: Barriers framework
While I am generally assuming the extension of GB found in Barriers (Chomsky
1986), the analysis of wh-questions in acquisition is more specifically tied to
elements of the analysis of Barriers than are other aspects of this thesis. (Indeed, it
strongly supports particular aspects of that analysis, and would not be workable
without it.) A quick review of the relevant points is therefore in order.
With respect to X-bar theory, the domain of elements falling outside of S (=
I″ = IP) is more articulated than in earlier versions (Chomsky 1981). In particular, S
is the maximal projection of Infl, and, crucially, wh-movement is not into Comp, but
into the Spec C position. That is, the full structure of the clause is as follows.
(127) [C″ Spec C [C′ Comp [S (= IP) NP [I′ Infl [VP V NP]]]]]
The movement of a fronted NP is no longer into Comp (or an adjunction in
Comp), but rather into the specifier position of C (it will become clearer later
why I am reviewing this).
A second innovation of the Barriers-type approach is that the movement
operation is a substitution operation (into Spec), rather than an adjunction. A
general consequence of that is that Infl, including do, may now move into the
head position of Comp. The entire clause then becomes a projection of Infl. The
movement of Infl into Comp, and overlay of Comp by Infl, is shown in (128).
(128) [I″ Spec I [I′ Infl [S (= IP) NP [I′ Infl [VP V NP]]]]]
A third innovation has to do with the locality of the movement. Chomsky (1986),
following the analysis of anaphor movement in Lebeaux (1983), assumes that
movement is highly local; Chomsky (1986b) assumes likewise for wh-movement,
involving adjunction to intermediate nodes, including VP. While I accept this
part of the analysis, it will not be crucial in what follows.
We note immediately one consequence of the Barriers analysis, which has
already been commented on in the foregoing (Chapter 1). Assuming that
categorial selection is selection of the head (Belletti and Rizzi 1988), a wh-clause
must be selected in terms of its Comp feature, not the element in Spec C. This
means, in turn, that Spec C must agree with the +/−wh-feature in Comp, so that
the selectional process in the grammar "knows" that a wh-element has been
moved into Spec, and can differentiate (129a) from (129b). I assume, simply,
that there is, still, such a +/−wh-feature in Comp, and that it agrees with Spec C.

(129) a. I wonder ((who) ((+wh) (John saw)))
      b. *I wonder ((e) ((+wh) (John saw who)))

The reason for the ungrammaticality of (129b) is that the +wh-feature is selected
by the verb, and this is unsatisfied at S-structure. Such satisfaction is required at
S-structure, for English.
Before proceeding, let us note three facts which support Chomsky's
analysis. First, the analysis of Infl movement as overlaying Comp is supported
by selectional facts. Assuming that complement-taking verbs (believe, wonder)
select for a particular type of Comp (−wh, +wh, respectively), and assuming that
this selection must be satisfied at all levels of representation, we have an
immediate explanation, given Chomsky's analysis, of why Subject/Aux inversion
is impossible in subordinate clauses. If Aux moved into Comp, the clause itself
would be a projection of Aux (or Infl): i.e., it would be I″. This would mean,
however, that the selected clause was C″ at DS, but I″ at SS: an impossibility,
given the assumptions above. Hence no selected clause may have Subject/Aux
inversion, the correct result.
A second fact supporting Chomsky's analysis is conceptual, but I believe
powerful. In traditional Extended Standard Theory and GB analyses (e.g.
Chomsky and Lasnik 1977, Chomsky 1981), Comp is a sort of "garbage"
category. It contains mostly closed class elements (that, if, etc.), but also those
of radically different character, open class NPs (whose hat, etc.). This made
Comp very difficult to treat as a unified element, and very difficult to probe the
properties of. Given the current theory, Comp again makes sense categorially: it
is a position in which a particular set of closed class morphemes may appear.
A third fact has to do with wh-island effects. It has sometimes been noted,
though usually just in passing, that there is a considerable contrast in grammaticality
between examples (130a) and (130b).
(130) a. *Who do you wonder which books John gave e to e?
b. Who do you wonder if John gave books to e?
In the example in (130a), the dependencies are nested, so a crossing constraint
cannot be the cause of the ungrammaticality.
While other candidates for explanation of the difference are available in
pre-Barriers type frameworks, Barriers does provide a ready explanation for the
difference. While a +wh Comp does exist in the complement clause in both
cases, in the former case, but not the latter, the Spec C position is filled (with
which books). This suggests that the long distance extraction in (130b) is through
that position, and the ungrammaticality of (130a) should be traced precisely to
the fact that that position is unavailable. This proposal might be instantiated in
a number of ways, which I will not try to go into here. This sort of explanation
is not nearly as available if one assumes that extraction is through Comp, since
if is an obligatory element (though variants may be tried, using particular
indexing algorithms).
5.7.2 Strong Crossover
Before turning to the acquisition evidence, let us deal with another aspect of wh-
questions: namely, the existence of (Strong) Crossover effects. Such effects, first
noted by Postal (1974), essentially forbid the crossing of a moved wh-item
over a coreferent pronoun or name, in configurations in which the pronoun or
name c-commands the trace of the wh-element. Contrasts such as (131) were
noted by Postal.
(131) a. *Who_i did he_i say that John liked e_i?
      b. Who_i did the man that saw him_i say that John liked e_i?
In (131b), who may be construed as coreferent with the pronoun, but not in (131a).
Similarly, and yet more clearly, there is a difference in the possibility of a
bound reading in (132) depending on whether a crossover or a noncrossover
configuration underlies it.
(132) a. Who_i e_i ate his_i hat?
      b. Who_i did he_i say e_i ate his_i hat?
(132a) easily allows a bound reading. However, while (132b) would be structurally
identical (with the addition of an element), if one simply looked at the
indexing and ignored the lexical/nonlexical distinction, it does not allow such a
reading.
(133) a. Who_x (x ate x's hat)?
      b. Who_x did (x say x ate x's hat)? (not allowed as reading)
The contrast between (132) and (133), then, is strong evidence for the role of
Strong Crossover in the adult grammar.
Strikingly, and remarkably, such a contrast does not exist in the child
grammar, except for some peripheral structures (Roeper, Akiyama, Mallis and
Rooth 1986). It is to this acquisition evidence, and the theoretical consequence
of that evidence, that I will return below.
While the Strong Crossover fact is quite uncontroversial, the theoretical
explanation of the fact is a good deal less so. At least three proposals exist in the
literature. Postal (1974) suggests that the condition is actually a condition on a
transformation: i.e., that the crossing of a wh-element over a c-commanding
name or pronoun is disallowed. Chomsky (class lectures 1985) has at least
contemplated a similar idea. A second alternative, perhaps the most widely
accepted, is that the trace left by wh-movement is a name (Chomsky 1981), and
that Condition C bars the relevant configuration because a name would be
c-commanded by a coreferent element in an A-position, at S-structure.
(134) Who_i did he_i visit e_i?
      (he_i: c-commanding pronoun; e_i: name)
A third possibility has been suggested by van Riemsdijk and Williams (1981).
This is that the condition is neither on a movement operation, nor on the trace as
a name, but rather on the pre-movement structure. If the wh-element itself is
considered a name, and one adopts the position advocated earlier in this work
(that positive conditions must be satisfied everywhere, and negative conditions
nowhere violated), then Condition C will rule out the pre-movement
structures at DS.
(135) a. *He_i didn't know who_i? DS
      b. *Who_i didn't he_i know e_i? SS (retains * from DS)

(135a) is ruled out at DS, and the full derivation in (135) retains the ungrammaticality
of (135a).
The van Riemsdijk and Williams proposal has certain attractive features,
though they are hardly decisive. First, it allows the rather natural proposal of
Joseph Aoun, that wh-trace is an anaphor (as a locally, necessarily dependent
element), to be straightforwardly instantiated. Given that it is the wh-element
itself which is the name, and that Condition C is stated in terms of that, the wh-
trace is freed to be an anaphor. Second, the van Riemsdijk and Williams
proposal does not require recourse to layered traces. As van Riemsdijk and
Williams note, in an important way, the strong crossover effect holds not only
over the whole moved phrasal node, but over all the material that it dominates.
(136) *Whose_i hat did he_i eat e?
Whose in (136) cannot be coreferent with he. This constraint cannot be stated
over the maximal null phrasal category, but must be stated in terms of layered
traces. While layered traces would, under certain renditions of movement, even
be expected, they have certain characteristics, aside from complexity, which are
somewhat unattractive. It must not only be the case that the trace is layered, but
also that each individual subnode is individually co-indexed with its antecedent
in the moved item. Further, the government relation must be defined between a
set of elements, all null. More problematic, from the point of view of this thesis,
is that syntactic elements which correspond to a phonologically null segment of
the string are no longer closed class (i.e. necessarily finite in character), but open
class. This is because a layered trace may contain arbitrarily complex syntactic
material.
In the following, I will argue that acquisition evidence supports the van
Riemsdijk/Williams proposal (or possibly Postal's original proposal) over the
alternative, that variables act as names. This supports, or allows support to
develop for, Joseph Aoun's proposal that wh-traces are anaphors. It also further
supports, and allows articulation of, the proposal with which this chapter is
concerned: that the derivation is real, and may be construed, by the child under
certain circumstances, as shallow.
5.7.3 Acquisition Evidence
The basic finding of the Roeper et al. experiments is that Strong Crossover does
not exist for children, for a majority of constructions. (The exceptions, those
constructions in which Strong Crossover does exist for the child, also play a
crucial role in the following analysis.) Why should this be?
One possibility is that the constraint itself is not available at an early stage,
and pops into the grammar at some later stage. Recall, however, that a similar
solution was found wanting for the lack of Condition C effects in constructions
like the following:

(137) In John's_i room, he_i put a book. (OK for kids, * for adults)
As noted earlier in this chapter (see also Carden 1986a, 1986b), it is not the case
that Condition C has disappeared at the point at which constructions like (137)
are wrongly accepted. Rather, Condition C is present in its simple form, but just
doesn't seem to be operating in dislocated structures. That pattern of judgements,
it was argued, was not diagnostic of the lack of Condition C, as a condition, at
all, but rather due to the fact that its structural condition was not met, due to the
shallowness of analysis by the child. A similar explanation was found for the
Tavakolian data.
Let us exclude the possibility of a principle suddenly appearing as follows:

(138) Universal Application of Principles
      Any (universal) principle P in the adult grammar applies at all stages
      of development, if the vocabulary satisfying that principle is present.

By the parenthetical "universal" in (138), I do not mean to restrict the application
unnecessarily, but simply to allow for the fact that a parameterized principle
would not have to apply in its language-specific form at all stages of development.
The proviso in (138) would require, for each principle P in UG, that (i) it
either hold in the appropriate language-specific form at all stages of development
for a given language L, or (ii) it be given a different parametric form than the
one in the language L, but still be part of the specification in UG, or (iii) the
vocabulary over which the principle holds not be present in the child's grammar.
This would then require that the binding principles apply as soon as the vocabulary
defining them (presumably the +referential features on nouns) were defined.
An explanation of the Jakubowicz (1984) and Wexler and Chien (1985, 1987a)
data would therefore have to be found which would be in accord with this
principle. The earlier case, where only theta theory applied in initial representations,
would not be a counterexample, since the vocabulary over which Case assignment
was defined would not be present.
With respect to the data at hand here, the question is whether a similar,
levels-of-representation type analysis can be found for the strong crossover data.
As noted above, children allow both (139a) and (139b) as well-formed structures at
first approximation, according to Roeper et al.
(139) a. Who_i e_i thinks he_i likes his_i hat?
      b. Who_i does he_i think e_i likes his_i hat?
Indeed, the full set of data is quite complex and apparently somewhat confusing,
the result of a complex array of experiments performed by Roeper and his
colleagues. Considered in full, the theoretical problem becomes quite intricate. I
will present here the data from experiment #7, which is the most detailed set of
data which Roeper et al. provide, and characteristic of the whole.
                                                Percent coreferent or bound
I.   Noncrossover configuration: single clause
     a. Who is V-ing himself?                   100.0
     b. Who is V-ing him?                        27.0
     c. Who is V-ing his N?                      36.9
II.  Crossover configuration: single clause
     a. What is he V-ing?                        15.9
     b. Who is he V-ing?                         25.9
     c. Whose N is he V-ing?                      3.6
III. Noncrossover configuration: 2 clauses
     a. Who thinks NP is V-ing him?              40.5
     b. Who thinks he is V-ing NP?               38.1
IV.  Crossover configuration: 2 clauses
     a. Who does he think NP is V-ing?           29.8
     b. Who does he think is V-ing NP?           19.0
     c. Who does he think he is V-ing?           35.2

Figure 1. Percentage of Coreferent or Bound Responses
A word about the notation is in order. V stands for any of a number of the verbs
chosen, and NP for any of a number of NPs. An exemplar of "Who does he
think NP is V-ing?" would be "Who does he think Big Bird is pushing e?"
The data in I are of theoretical interest only as a basis of comparison. Note
that: (i) children have 100% bound responses when the bound element is a reflexive
(Ia); (ii) coreference or binding is allowed, at least marginally, for single clause
structures with a coreferent pronoun (Ib, 27%). This result is already familiar
from work by Jakubowicz (1984) and Wexler and Chien (1985, 1987a).
The first striking result comes in IIb. This is a classic crossover configuration,
and unlike IIa, has who as the questioned word, so coreference would be
possible without violating animacy requirements. For this question, it appears that
25.9% of the children allow coreference, a clear violation of the adult rule. More
striking, and yet comforting as well for the accuracy of the original result, is
the fact that the result in IIb contrasts with that in IIc. Children do not allow
coreference between the fronted wh-element and the crossed-over pronoun, if that
wh-element is part of a containing wh-phrase (*Whose_i hat did he_i like e? for
children, as well as adults). The fact that children obey the crossover constraint here
is crucial, since it shows that there is not simply a total breakdown in the grammar,
or that the allowance of strong crossover in the simple wh- cases is due to the
complexity of the task. Rather, in the slightly more complex case where whose
hat has been fronted, strong crossover is obeyed, even if it is not in the simpler
cases (only 3.6% coreference in IIc). This is then the second puzzle to account for,
along with the original puzzle of the lack of strong crossover in cases like IIb.
Finally, there is a third result to be explained, which does not show up so
strongly in this set of data, but does in other data sets gathered by Roeper et al.
Roeper et al. note that the lack of a strong crossover condition is not uniform
across ages with respect to clauses. Rather, they note that there is a developmental
pattern of the following sort.
(140) Stage I: No Strong Crossover Condition
Stage II: Strong Crossover Condition for 1 clause sentences;
no Strong Crossover for 2 clause sentences.
Stage III: Strong Crossover Condition generally
This result, noted by Roeper et al., does not rise clearly out of the data in Figure
1, but perhaps its outlines can be seen by comparing the 25.9% strong crossover
violation in IIb with the 29.8% and 35.2% strong crossover violations in IVa and
IVc. See Roeper et al. for more extensive discussion of this result.
To summarize, there are three problems or puzzles which must be answered
by a legitimate acquisition account:
(i) How does one account for the fact that Strong Crossover does not seem to
be operating (or not nearly as strongly) in the child's grammar as in
the adult's, for examples like (i)?

     (i) Who_i is he_i V-ing e_i?
(ii) How does one account for the fact that, at the same time that Strong
Crossover is not respected by the child in constructions like (i), it is
respected for (ii), where a full NP is fronted?
     (ii) Whose_i hat is he_i V-ing e?
(iii) How does one account for a particular lag in acquisition? Namely, that
children first learn the strong crossover constraint in one clause constructions,
and then repeat their initial mistake of not having strong crossover in
two clause structures. Such a construction-sensitive difference would not be
expected in any simple parameter-setting account.
5.7.4 Two possibilities of explanation
Given the basic bifurcation of grammars into those in the representational mode
vs. those in the derivational mode, two major possibilities arise in the explanation
of puzzles (i)–(iii) above, if one excludes the possibility that strong crossover has
suddenly popped into the grammar at the relevant stage (this latter possibility is
in any case made unlikely by the data in (ii) above). On the one hand, one might
assume that there has been some change in the representation (say, the S-structure
representation). For example, there could be a change in the category type
of the element corresponding to the wh-trace in the adult grammar. This is the
position that Roeper et al. take: that the initial wh-trace is actually little pro. As
such, the Roeper et al. explanation is grounded, as it were, in the representational
mode. Though the grammar as a whole contains a derivation for Roeper et al.,
the particular acquisition explanation is not dependent on that, just as it is not in
Hyams (1985, 1986, 1987). On the other hand, we might suppose that the difference
over time for children is not representationally based, but derivationally
based: for example, that the derivation is shallower for children than adults, or
somehow different. This type of acquisition account underlies the analysis of the
data above: i.e. the analysis of the Tavakolian and Carden data. However, this
account may be deepened and made more subtle in a number of ways. For
example, it needn't be the case that, if the explanation is derivationally based, the
representational system is exactly the same as it is in the adult grammar.
Rather, another possibility presents itself: that the child's derivation is different,
and because of this, as an effect, the representation is different as well (say, the
representation at S-structure). Under this view, while it may well be true that
some aspect of the representation has been changed over time (for example,
the null element has changed from pro to wh-trace), this is not the deepest
level of analysis. This is to be found instead in the derivation itself, and it is the
change in the derivation which has given rise to the definitional properties which
mean that the representation is read differently. In a sense, the parametric
change, while real, is not the cause of the acquisitional change, but an effect.
That is the position taken in the analysis below.
5.7.5 A Representational Account
Roeper et al. take a purely representational view. They suggest that the basic
difference between the adult grammar and that of the child is in the category
type of the null element corresponding to the wh-trace in the adult grammar.
They argue, in essence, that the null category corresponding to the wh-trace in
the adult grammar is not a wh-trace for the child at all, but rather an indexed
little pro. The representation of (i) above is then the following:
250 LANGUAGE ACQUISITION AND THE FORM OF THE GRAMMAR
(141) Who_i is he V-ing pro_i?
      (e.g. Who_i is he following pro_i?)
Although I will argue against this view as the ultimate basis for the acquisition
facts, this sort of explanation has much to recommend it. First, and most
crucially, it allows for the explanation of the lack of Strong Crossover by
children. Assuming that the strong crossover effect is really dependent on the
fact that wh-traces, as names, cannot be c-commanded by coindexed pronouns,
then if one changes the category type to an element which can be A-bound
(namely, to small pro), the lack of Strong Crossover is explained.
Second, the assumption that the initial trace is specifically little pro could
help to explain the one clause/two clause contrast noted above: namely, that the
strong crossover constraint seems to come into the grammar first for one clause
constructions, and only later for two clause constructions. This might be traced,
not to the application of Condition C to a name, but to the application of
Condition B to little pro. Since little pro would obey Condition B, to the extent
to which this condition is operative in the child's grammar at all (see Jakubowicz
1985; Wexler and Chien 1985, 1987a), it would be expected to disallow single
clause coindexed structures before it would disallow double clause structures.
This would give the appearance of the strong crossover constraint operating in
single clause structures.
In spite of the interest of the representational view above, there are consid-
erable difficulties if this is taken as the ultimate level of explanation. These
seem to me to support a theory that is derivational in character, or at root
derivational, though the change in derivation may have (as always) representa-
tional effects.
Perhaps the most significant difficulty with the approach outlined above has
to do with the difference in strong crossover effects for children depending on
whether a full noun phrase has been fronted (whose hat) or a simple wh-element
(who). The Roeper et al. data show a strong contrast between the two.
(142) a. Who_i is he_i V-ing e_i? (25.9% coreferent)
      b. Whose_i N is he_i V-ing e_i? (3.9% coreferent)
This is perhaps the most significant statistical result in the whole experiment. Yet
this distinction is not really covered by the basic motif that Roeper et al. follow:
that wh-trace is read by the child as little pro. Of course, other possibilities of
explanation may be advanced (as they are in the paper), but it would be
preferable if such a significant result followed from the basic framework.
A second difficulty in the account has to do with the one clause vs. two
THE ABROGATION OF DS FUNCTIONS 251
clause contrast. As noted above, children begin to respect strong crossover in one
clause structures before they respect it in two clause structures. They have the
following developmental path:
(143) a. Coreference allowed everywhere (no Strong Crossover constraint)
b. Coreference allowed in 2 clause structures; not in 1 clause
structures (Strong Crossover constraint for 1 clause only)
c. Coreference not allowed (adult Strong Crossover constraint)
It might at first be thought that this divergence in one clause and two clause
structures could be traced simply to the fact that the initial wh-trace is treated as
little pro by the child, and that this obeys Condition B. However, things cannot
be this simple. That is because there are two changes in the grammar given in
(143), but just one parameter to manipulate: pro → wh-trace. If the change in
parameter (pro → wh-trace) is supposed to account for the first change in the
data, i.e. the transition from (143a) to (143b), then it cannot also account for the
second change, from (143b) to (143c). If it is supposed to account for the second
change, then it cannot also be at the root of the first. In short, there are two
changes in the developmental path, but only one parameter: both cannot be
linked to a single change in the grammar.
There is indeed a way around this, which rather stretches the conceptual
grounding of the notion parameter. This is to say that there is a simple
parametric change (pro → wh-trace), but that this operates more than once. In
particular, it is construction-sensitive, first operating in one clause structures, and
later in two clause structures. This is odd, however, from the point of view of
what is normally meant by parameter: i.e., an independently specified piece of
information in the grammar. Such a piece of information would not normally be
thought of as construction-specific: if wh-trace were being read as little pro by
the child, one would assume that it was listed in the grammar as such. And the
particular change that was noted, that it would first change from little pro to wh-
trace in one clause structures, and only later in two clause structures, seems
equally inexplicable, given that the only possible basis for such a divergence,
some general change in computational complexity, operates in the opposite
direction with respect to the extraction of simple wh-elements vs. full wh-phrases
(who vs. whose hat).
5.7.6 A Derivational Account, and a Possible Compromise
As I have just noted, it would be odd, if one assumes that the deepest level of
explanation is a representationally based parametric one, to assume that
wh-trace begins as little pro in one and two clause constructions, then becomes
wh-trace in one clause constructions only, while retaining its status as little pro in two
clause constructions, and finally becomes wh-trace generally. However, this is
odd not because this developmental course is odd per se, but rather because it
does not fit in well with the notion of parameter-setting as conventionally
understood. Suppose that, instead of assuming that the categorial change were the
root level of explanation, the difference between the adult grammar and the
child's were to be traced instead to some change in the derivation occurring over
time. Suppose further that null categories, and in particular wh-trace, are
derivationally, and not representationally, defined: i.e. that wh-trace is the null
element left by wh-movement. Then the derivational difference would have an
immediate effect on the definition of elements at any particular level. This
difference might well be on a construction-by-construction basis, rather than para-
metric in the conventional sense. It is this sort of line of inquiry that I will follow.
Let us first adopt the van Riemsdijk and Williams account of Strong
Crossover: that it is an instance of Condition C applying at DS to structures
such as (144).
(144) *He_i liked whose_i hat.
The van Riemsdijk and Williams approach allows for an explanation of the
acquisition facts on exactly the same grounds that had been advanced earlier: the
shallowness of the grammar. Assuming that S-structure or the surface is the
(computational) anchor for comprehension, and assuming a shallow analysis in
which the wh-element is base-generated in place by the child, but not by the
adult, then the child's representation for sentences like (145) will be given in
(146), while the adult's representation will be that in (147).
(145) Who_i did he_i say that Bill liked e_i?
(146) Child's analysis:
      DS: Who did he say that Bill liked e?
      SS: Who_i did he_i say that Bill liked e_i?
(147) Adult's analysis:
      DS: *He_i said that Bill liked who_i.
      SS: *Who_i did he_i say that Bill liked e_i?
Assuming that the deepest level of analysis computed by the child is DS,
where the wh-element is already in fronted position, and assuming that Strong
Crossover really is a condition on the c-command of a wh-trace by a coreferent
name or pronoun, then the child's grammar would not be expected to exhibit
strong crossover effects. That is, if we assume shallowness of analysis, then the
strong crossover facts (i.e. the lack of strong crossover effects for children)
are explained without any additional stipulation given the van Riemsdijk/
Williams account, since the wh-element would not be c-commanded by a
coindexed name or pronoun at any level of representation. The acquisition
account therefore supports the van Riemsdijk and Williams proposal.
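The logic of this account can be made concrete in a small computational sketch. No such implementation appears in the original analysis: the nested-list tree encodings, the function names, and the restriction of the Condition C check to the wh-element itself are all illustrative assumptions of mine.

```python
# Toy sketch of the van Riemsdijk/Williams view as applied here: Strong
# Crossover falls out if Condition C bars a pronoun from c-commanding a
# coindexed wh-element (a name-like expression) at any computed level.
# Trees are nested lists; indexed words are (word, index) tuples.

def nodes(tree):
    """Yield every node (constituent or word) of a nested-list tree."""
    yield tree
    if isinstance(tree, list):
        for child in tree:
            yield from nodes(child)

def dominates(tree, target):
    return any(n is target for n in nodes(tree))

def c_commands(tree, x, y):
    """x c-commands y iff some sister of x dominates y."""
    for n in nodes(tree):
        if isinstance(n, list) and any(child is x for child in n):
            if any(dominates(sib, y) for sib in n if sib is not x):
                return True
    return False

def crossover_violation(levels):
    """Violation iff, at some computed level, a coindexed pronoun
    c-commands the wh-element itself."""
    for tree in levels:
        words = [n for n in nodes(tree) if isinstance(n, tuple)]
        pronouns = [w for w in words if w[0] == "he"]
        whs = [w for w in words if w[0] == "who"]
        for p in pronouns:
            for wh in whs:
                if p[1] == wh[1] and c_commands(tree, p, wh):
                    return True
    return False

who, he, e = ("who", 1), ("he", 1), ("e", 1)

# Child's shallow derivation (146): the wh-element is base-generated in
# fronted position, so the deepest computed level already has it on top.
child_levels = [[who, [he, ["say", ["Bill", ["liked", e]]]]]]

# Adult derivation (147): at DS the wh-element sits in object position,
# c-commanded by the coindexed pronoun.
adult_levels = [[he, ["said", ["Bill", ["liked", who]]]],
                [who, [he, ["say", ["Bill", ["liked", e]]]]]]

print(crossover_violation(child_levels))  # False: no crossover for the child
print(crossover_violation(adult_levels))  # True: Condition C violated at DS
```

On this encoding, the asymmetry reduces entirely to whether any computed level places the wh-element below the pronoun, which is the shallowness claim in derivational form.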
Consider now the second contrast noted by Roeper et al.: that between a
fronted simple wh-element and a full fronted NP.
(148) a. Who is he V-ing t? 25.9% coreferent
b. Whose N is he V-ing t? 3.6% coreferent
As noted earlier, this contrast is striking. These facts are also important in
showing that the child is not simply behaving randomly in examples like (148a)
due to confusion, because in the equally complex (148b) Strong Crossover is
maintained.
Of course, no such contrast exists in the adult grammar.
As noted earlier, Chomsky (1986b) allows two left peripheral positions
outside of S (=IP), a direct Comp position, and a position which is Spec C.
(149) [CP Spec C [C′ Comp (+/− wh) … ]]
The former contains a limited set of closed class features and words (+/− wh, if,
that, etc.), while the latter contains full NPs. As noted earlier, this allows the
"garbage category" character of Comp to be avoided. Let us make use of this
contrast here in the following way:
(150) Who, what, and other closed class wh-elements are optionally
generated in Comp by the child, as spell-outs of the +wh feature;
which man, whose book, and other full wh-NPs are found in Spec C.
The distinction in (150) is reasonable, on two grounds. First, since simple +wh
elements (who, what, etc.) contain barely more information than the fact that they
are wh-elements themselves (plus some information as to humanness, etc.), they
could be simple spell-outs of the +wh feature. This is of course impossible with
the full NP phrase. Second, there is some evidence from early acquisition which
is quite suggestive as to the placement of the initial wh-words. A traditional
finding is that there is a stage of development in which children allow either the
fronting of the wh-word, or the fronting of the auxiliary, but not both (though see
Pinker 1984, for some critical discussion).
(151) Possible: Did John see Mary?
Possible: Who John can see?
Impossible: Who can John see?
This has generally been suggested to be due to a computational deficit of some
type: that both wh-movement and Subj/Aux inversion take some sort of computa-
tional power, so the mutual application of both is impossible. The comments
above suggest instead a structural explanation: if the wh-word is generated
initially as a spell-out of a wh-feature in Comp, and if Subj/Aux inversion is a
substitution operation into Comp (Chomsky 1986b), then the configuration of
data in (151) is explained. Let us, therefore, accept the premise of (150): that in
wh-questions at this stage, the child is able optionally to spell out the wh-element
from the +wh feature in Comp (if it is simple). This allows us immediately to
explain the lack of Strong Crossover for such questions: the wh-word is generat-
ed in place, so it is never c-commanded by a coreferent pronoun. But what now
of the fact that Strong Crossover always holds for the full fronted wh-phrases?
I believe that this calls for the revision of an assumption that I have been
making throughout this chapter. I have been assuming that in the shallow
analysis, the element starts off in a dislocated position at DS, and is able to do
so because it gets its theta role from a null category in an associated GF-θ
position. While this allows for the element to get a theta role in its deepest level
position via the trace, it does generate an element in a θ-bar position at the
deepest level. This would still not violate the Projection Principle if one assumed
that the adult DS, but not the child's computed DS (=DS), were a pure instantia-
tion of GF-θ. That is, if one assumed the following:
(152) DS (i.e. the adult DS) is the representation where pure theta relations are expressed (i.e. is a direct projection of the thematic structure).
Let us suppose instead that the following holds, a more reasonable assumption:
(153) The deepest computed level (at any stage of development) must be
where pure theta relations are expressed.
The assumption in (153) would require that the dislocated elements in (154a) and
(b), i.e. the two types of dislocated elements discussed earlier, actually be
in some sort of theta position of their own at DS.
(154) a. Near Johns house, he put a case.
b. To kiss the pig would make the horse happy.
This is certainly not implausible for (154b), where the element may be in a
subject position (Subj/Aux inversion can apply), and assigned an auxiliary theta
role; let us assume that it is also the case in (154a), where the element may be
in some sort of topic position. In effect, this states that there is a parametric
difference in the child's grammar which allows certain positions to be base-
generated theta positions.
Given the above assumption, an appropriate derivational distinction may be
made. The dislocated full wh-phrase in sentences like (155) is in the Spec C
position, and this is in no instance a theta position.
(155) Whose hat did he buy?
Hence the wh-element must come from an actual DS object position, even in the
children's derivation (given revised assumption (153)).
(156) Child's analysis:
DS: He saw whose hat.
SS: Whose hat did he see?
Since the wh-element is coming from its DS position, there is a Condition C
violation for the child.
On the other hand, the simple wh-element is a spell-out of the +wh feature.
It is generated in Comp, perhaps spelled out at some later level, and never
appears in a non-dislocated position. The child's derivation is therefore the
following:
(157) Child's analysis:
DS: Who (did) he see?
SS: Who (did) he see?
Because the wh-element is never c-commanded by a name, no Condition C
violation results.
Let us turn now to the third finding of Roeper et al.: that there is a sequenc-
ing effect in the appearance of the Strong Crossover Constraint. Namely, the
child first allows both 1 and 2 clause structures violating the constraint, then
correctly rules out 1 clause structures, while still allowing Strong Crossover to be
violated in 2 clause structures, and finally allows no Strong Crossover violations at all.
(158) Stage I:
          Who is he_i V-ing t?                  OK for children
          Who does he_i think NP is V-ing t?    OK for children
      Stage II:
          Who is he_i V-ing t?                  * for children
          Who does he_i think NP is V-ing t?    OK for children
      Stage III:
          Who is he_i V-ing t?                  * for children
          Who does he_i think NP is V-ing t?    * for children
      (from Roeper et al.)
If this degree of complexity of data is to be trusted, then what we have here is
not indicative of parameter-setting in the representational mode as commonly
understood. If one assumed a single change in the course of development
(pro → wh-trace), then it would be unexpected that this would apply in a context-sensitive
way, depending on the one clause vs. two clause contrast. Nor would this data be
explained if one assumed that Strong Crossover suddenly popped into the
grammar, a possibility which is in any case excluded in principle above.
On the other hand, if the Strong Crossover Constraint were dependent on
the possibility of wh-movement, then the outlines of the solution for the sequenc-
ing effect are clear, since the emergence of Strong Crossover is dependent on
the amount of wh-movement which is occurring. The data would follow from the
following:
(159) a. Stage I:
         Wh-movement does not occur at all for simple wh-elements; the
         elements are spell-outs of +wh features in Comp.
      b. Stage II:
         Wh-movement does occur for simple wh-elements, but only
         within CP (CP (=S′) acts as an absolute barrier).
      c. Stage III:
         Wh-movement applies as in the adult grammar.
The first stage is one in which there is no Strong Crossover Effect at all for the
simple moved wh-elements. The second stage is the crucial one for current
purposes. While there is apparent extraction over two clauses, the second stage
would require that the wh-element be simply base-generated as a spell-out of
+wh features in the matrix, rather than moved from the lower clause. This in turn
would require the movement of a null operator in the lower clause, and the
indexing of this element with the matrix wh-element. In the third stage the
grammar would be identical to the adult one.
(160) a. Stage I:
         Who did he see t?   (spell-out of +wh features; no movement)
      b. Stage II:
         i)  1 clause: Who did he see t?   (movement)
         ii) 2 clause: Who did he think that O_i NP saw t?
             (spell-out in the matrix Comp; movement of the null
             operator O_i in the lower clause; additional indexing of
             O_i with the matrix wh-element)
      c. Stage III:
         Who did he think t that NP saw t?   (movement throughout,
         with an intermediate trace)
Stage I and Stage III have already been extensively discussed. The crucial fact
about Stage II is that, by assuming that CP is an absolute barrier to wh-movement
at this stage, the wh-element is forced to be generated in the matrix Comp node
rather than moved from below. This would then mean that Strong
Crossover would not occur in these structures, while at the same time it would
for 1 clause structures, which is Roeper et al.'s result. The crucial fact is that the
stage-like progression in the acquisition data can find a match in a stage-like
progression in the extraction domain of wh-elements. A similar sort of solution is not available
if one assumes that the change is in category type: pro → wh-trace.
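The stage-by-stage predictions can be set out schematically. This is a toy sketch of mine summarizing (158) and (159), not part of the original analysis; the stage table and function names are illustrative.

```python
# Sketch of how the staged extraction domains in (159) derive the judgment
# pattern in (158). For each stage we record whether a simple wh-element is
# actually moved (vs. spelled out in Comp) in 1- and 2-clause questions;
# Strong Crossover arises only where wh-movement from a position below the
# pronoun has applied.

STAGES = {
    "I":   {1: "spell-out", 2: "spell-out"},   # no wh-movement at all
    "II":  {1: "movement",  2: "spell-out"},   # CP an absolute barrier
    "III": {1: "movement",  2: "movement"},    # adult grammar
}

def coreference_ok(stage, clauses):
    """Coreference (he_i ... t_i) survives only if the wh-element was
    base-generated in Comp rather than moved across the pronoun."""
    return STAGES[stage][clauses] == "spell-out"

for stage in ("I", "II", "III"):
    for clauses in (1, 2):
        verdict = "OK" if coreference_ok(stage, clauses) else "*"
        print(f"Stage {stage}, {clauses} clause: {verdict} for children")
```

The point of the sketch is simply that one derivational setting per stage (the extraction domain) yields both rows of judgments at once, where a single representational parameter (pro → wh-trace) cannot.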
To summarize, I have suggested in this chapter that the grammar is
essentially in the derivational mode, and that reflexes of this may be seen in the
intermediate grammars that the child adopts. Let me close, however, by consider-
ing a point of contact of the above analysis with that of Roeper et al. Let us
suppose that, in the derivational mode, the notion of wh-trace is not defined in
terms of contextual features (e.g. a Case-assigned empty category), but rather in
terms of its history. Something is a trace, in this sense, if it is a null category left
by movement: it is a wh-trace if it is a null category left by movement to an A′-
position. This is of course different from the contextual definition sometimes
adopted or suggested (e.g. Chomsky 1981, 1982), but is itself viable and rather
straightforward. Consider now what happens under the conditions that we have
been suggesting: that the child computes, in some instances, a shallow derivation,
anchored in S-structure, but only receding back to DS.
(161) Child's analysis:
      DS: Who did he say Bill liked e?
      SS: Who_i did he_i say Bill liked e_i?
The derivation is clear, but what is the nature of the null category? According to
the definition above, it could not be a wh-trace (even though it looks like a wh-
trace in the adult grammar), simply because a wh-trace is an element left by
wh-movement, and no wh-movement has taken place in the derivation.
What then is it? In fact, it is not clear what the null category would be. One
possibility, and a fairly likely one, is that the category is simply little pro,
at least at DS. For it is Case-marked, as little pro is, and it is not derived by wh-
movement, as is also the case for little pro. If we assume that elements retain
their character over a derivation (Brody 1984), and we also allow the possibility
of an A-bound little pro (Roeper et al. 1986), then the element would also be
little pro at S-structure: the A-bound pro of Roeper et al.
The theoretical ramifications of this sort of account are, I believe, quite
interesting. It would mean that both the derivational and representational views
were correct: the derivation is shallow, and so the element is defined as little
pro. However, unlike the pure representational account, or even a representation-
ally based account like that of Roeper et al., the representational assignment as
little pro is based on the analysis in the derivational mode. That is, it is the
shallowness of the derivation which requires the child to analyze the null element
as little pro (or at least not wh-trace), since wh-trace is defined as the position
from which A′-movement has taken place. In this way it is possible to hold to
the basic insight of the Roeper et al. approach, namely that initial wh-trace is not
treated as such, without holding to the view that this is the deepest level of
explanatory analysis. That has to do instead with the shallowness of the computed
derivation, and it is because of this shallowness that the null category is, as a
result, analyzed as little pro. This analysis would also have the following effect:
that when the child's grammar fails, perhaps for computational reasons, it falls
into another grammar in UG. In this case the grammar would be one in which
the element was a bound little pro. While the details in such a case require
further research, the general architecture seems clear.
References
Abney, S. (1987a). The English Noun Phrase in Its Sentential Aspect. Ph.D. dissertation,
Massachusetts Institute of Technology.
Abney, S. (1987b). Licensing and Parsing. Proceedings of NELS 17.
Abney, S. and J. Cole (1985). A Government-Binding Parser. Proceedings of NELS 15.
Akmajian, A., S. Steele, and T. Wasow (1979). The Category Aux in Universal Grammar.
Linguistic Inquiry 10.1, 1–64.
Anderson, M. (1979). Noun Phrase Structure. Ph.D. dissertation, University of Connecticut.
Aoun, Y. (1985). A Grammar of Anaphora. Cambridge, Mass.: MIT Press.
Aoun, Y. and D. Sportiche (1981). On the Formal Theory of Government. The Linguistic
Review 2.3, 211–236.
Aoun, Y., N. Hornstein, D. Lightfoot, and A. Weinberg (1987). Two Types of Locality.
Linguistic Inquiry 18.4, 537–577.
Bach, E. (1962). The Order of Elements in a Transformational Grammar of German.
Language 38, 263–269.
Bach, E. (1977). The Position of the Embedding Transformation in the Grammar
Revisited. Linguistic Structures Processing, edited by A. Zampolli. New York: North-
Holland.
Bach, E. (1979). Control and Montague Grammar. Linguistic Inquiry 10.4, 515–531.
Bach, E. (1983). On the Relationship between Word Grammar and Phrase Grammar.
Natural Language and Linguistic Theory 1, 65–89.
Baker, C. L. (1977). Comments on Culicover and Wexler. Formal Syntax, edited by P.
Culicover, T. Wasow, and A. Akmajian. New York: Academic Press.
Baker, C. L. (1979). Syntactic Theory and the Projection Problem. Linguistic Inquiry
10.4, 533–581.
Baker, C. L. and J. J. McCarthy, eds. (1981). The Logical Problem of Language Acquisi-
tion. Cambridge, Mass.: MIT Press.
Baker, M. (1985). Incorporation: A Theory of Grammatical Function Changing. Ph.D.
dissertation, Massachusetts Institute of Technology.
Bar-Adon, A. and W. Leopold, eds. (1971). Child Language: A Book of Readings. New
York: Prentice-Hall.
Barss, A. (1985). Chain-Binding. Presentation given at the West Coast Conference on Formal
Linguistics.
260 REFERENCES
Barss, A. (1986). Chains and Anaphoric Dependencies. Ph.D. dissertation, Massachusetts
Institute of Technology.
Barss, A. and H. Lasnik (1986). A Note on Anaphora and Double Objects. Linguistic
Inquiry 17.2, 347–354.
Belletti, A. and L. Rizzi (1988). Psych-Verbs and θ-Theory. Natural Language and
Linguistic Theory 6.3, 291–352.
Berwick, R. (1985). The Acquisition of Syntactic Knowledge. Cambridge, Mass.: MIT
Press.
Berwick, R. and A. Weinberg (1984). The Grammatical Basis of Linguistic Performance.
Cambridge, Mass.: MIT Press.
Bever, T. (1970). The Cognitive Basis for Linguistic Structures. Cognition and the
Development of Language, edited by J. R. Hayes, 279–352. New York: Wiley.
Bierwisch, M. (1963). Grammatik des deutschen Verbs. Studia Grammatica. Berlin, GDR.
Bloom, L. (1970). Language Development: Form and Function in Emerging Grammars.
Cambridge, Mass.: MIT Press.
Bloom, L., P. Lightbown, and L. Hood (1975). Structure and Variation in Child
Language. Monographs of the Society for Research in Child Development 40.
Borer, H. (1984). The Projection Principle and Rules of Morphology. Proceedings of
NELS 14.
Borer, H. (1985). The Lexical Learning Hypothesis and Universal Grammar. Boston
University Conference on Language Development, Boston, Mass.
Borer, H. and K. Wexler (1987). The Maturation of Syntax. Parameter-Setting, edited by
T. Roeper and E. Williams. Cambridge, Mass.: MIT Press.
Bouchard, D. (1984). On the Content of Empty Categories. Dordrecht: Foris.
Bowerman, M. (1973). Early Syntactic Development. Cambridge, England: Cambridge
University Press.
Bowerman, M. (1974). Learning the Structure of Causative Verbs. Papers and Reports on
Child Language Development 8, edited by E. Clark, 142–178. Stanford, Calif.:
Stanford University.
Bowerman, M. (1982). Reorganizational Processes in Lexical and Syntactic Development.
Language Acquisition: The State of the Art, edited by E. Wanner and L. Gleitman,
319–347. Cambridge, England: Cambridge University Press.
Bradley, D. (1979). Computational Distinctions of Vocabulary Type. Ph.D. dissertation,
Massachusetts Institute of Technology.
Braine, M. D. S. (1963). The Ontogeny of the English Phrase Structure: The First Phase.
Reprinted in Child Language: A Book of Readings, edited by A. Bar-Adon and W.
Leopold. New York: Prentice-Hall.
Braine, M. D. S. (1963). On Learning the Grammatical Order of Words. Reprinted in
Child Language: A Book of Readings, edited by A. Bar-Adon and W. Leopold. New
York: Prentice-Hall.
Braine, M. D. S. (1965). Learning the Positions of Words Relative to a Marker Element.
Journal of Experimental Psychology 72.4, 532–540.
Braine, M. D. S. (1976). Children's First Word Combinations. Monographs of the Society
for Research in Child Development 41, 1–96.
Bresnan, J. (1977). Variables in the Theory of Transformations. Formal Syntax, edited by
P. Culicover, T. Wasow, and A. Akmajian. New York: Academic Press.
Bresnan, J. (1978). A Realistic Transformational Grammar. Linguistic Theory and
Psychological Reality, edited by M. Halle, J. Bresnan, and G. Miller. Cambridge,
Mass.: MIT Press.
Bresnan, J., ed. (1982). The Mental Representation of Grammatical Relations. Cambridge,
Mass.: MIT Press.
Brody, M. (1984). On Contextual Definitions and the Role of Chains. Linguistic Inquiry
15.3, 355–380.
Brown, R. (1973). A First Language: The Early Stages. Cambridge, Mass.: Harvard
University Press.
Brown, R. and U. Bellugi (1964). Three Processes in the Child's Acquisition of Syntax.
Reprinted in Child Language: A Book of Readings, edited by A. Bar-Adon and W.
Leopold. New York: Prentice-Hall.
Browning, M. (1987). Null Operator Constructions. Ph.D. dissertation, Massachusetts
Institute of Technology.
Burzio, L. (1986). Italian Syntax. Dordrecht: Reidel.
Carden, G. (1986a). Blocked Forwards Coreference and Unblocked Forwards Anaphora:
Evidence for an Abstract Model of Coreference. Papers from the Regional Meeting
of CLS 22, 262–276.
Carden, G. (1986b). Blocked Forwards Coreference: Theoretical Implications of the
Acquisition Data. Studies in the Acquisition of Anaphora, Vol. I, edited by B. Lust.
Dordrecht: Reidel.
Carden, G. and T. Dietrich (1981). Introspection, Observation, and Experiment. Proceed-
ings of the 1980 Biennial Meeting of the Philosophy of Science Association in East
Lansing, MI, edited by R. Giere, 583–597.
Chierchia, G. (1984). Topics in the Syntax and Semantics of Infinitives and Gerunds. Ph.D.
dissertation, University of Massachusetts.
Chomsky, N. (1957). Syntactic Structures. The Hague: Mouton.
Chomsky, N. (1965). Aspects of the Theory of Syntax. Cambridge, Mass.: MIT Press.
Chomsky, N. (1970). Remarks on Nominalization. Studies in Semantics in Generative
Grammar, Papers by N. Chomsky. The Hague: Mouton.
Chomsky, N. (1972). Studies in Semantics in Generative Grammar. The Hague: Mouton.
Chomsky, N. (1973). Conditions on Transformations. A Festschrift for Morris Halle,
edited by S. Anderson and P. Kiparsky, 223–286. New York: Holt, Rinehart, and
Winston.
Chomsky, N. (1975 [1955]). The Logical Structure of Linguistic Theory. New York: Plenum
Press.
Chomsky, N. (1975). Reflections on Language. New York: Pantheon.
Chomsky, N. (1977a). On Wh-Movement. Formal Syntax, edited by P. Culicover, T.
Wasow, and A. Akmajian, 71–132. New York: Academic Press.
Chomsky, N. (1977b). Essays on Form and Interpretation. New York: North-Holland.
Chomsky, N. (1980). On Binding. Linguistic Inquiry 11.1, 1–46.
Chomsky, N. (1981). Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, N. (1982). Some Concepts and Consequences of the Theory of Government and
Binding. Cambridge, Mass.: MIT Press.
Chomsky, N. (1986a). Knowledge of Language: Its Nature, Origin, and Use. New York:
Praeger.
Chomsky, N. (1986b). Barriers. Cambridge, Mass.: MIT Press.
Chomsky, N. (1993). A Minimalist Program for Linguistic Theory. The View from
Building 20: Essays in Linguistics in Honor of Sylvain Bromberger, edited by K.
Hale and S. J. Keyser, 1–52. Cambridge, Mass.: MIT Press.
Chomsky, N. (1995). The Minimalist Program. Cambridge, Mass.: MIT Press.
Chomsky, N. and H. Lasnik (1977). Filters and Control. Linguistic Inquiry 8.3, 425–504.
Clark, E. (1973). What's in a Word? On the Child's Acquisition of Semantics in His First
Language. Cognitive Development and the Acquisition of Language, edited by T. E.
Moore, 65–110. New York: Academic Press.
Clark, H. and E. Clark (1977). Psychology and Language: An Introduction to Psycho-
linguistics. New York: Harcourt, Brace, Jovanovich.
Clark, R. (1986). Boundaries and the Treatment of Control. Ph.D. dissertation, UCLA.
Cooper, R. (1978). Montague's Semantic Theory and Transformational Grammar. Ph.D.
dissertation, University of Massachusetts.
Crain, S. and C. McKee (1985). Acquisition of Structural Constraints on Anaphora.
Proceedings of NELS 16.
Culicover, P. (1967). The Treatment of Idioms within a Transformational Framework.
IBM Technical Report.
Culicover, P., T. Wasow, and A. Akmajian, eds. (1977). Formal Syntax. New York:
Academic Press.
Culicover, P. and K. Wexler (1977). A Degree-2 Theory of Learnability. Formal Syntax,
edited by P. Culicover, T. Wasow, and A. Akmajian. New York: Academic Press.
Davis, H. (1985). Syntactic Undergeneration in the Acquisition of English: Wh-construc-
tions and the ECP. Proceedings of NELS 16.
Dowty, D., R. Wall, and S. Peters (1979). Introduction to Montague Semantics. Dor-
drecht: Reidel.
Drozd, K. (1987). Minimal Syntactic Structures in Child Language. Manuscript, Tucson,
Ariz.: University of Arizona.
Drozd, K. (1994). A Unification Categorial Grammar of Child English Negation. Ph.D.
dissertation, University of Arizona.
Emonds, J. (1975). A Transformational Approach to English Syntax. New York: Academic
Press.
Emonds, J. (1985). A Unified Theory of Syntactic Categories. Dordrecht: Foris.
Epstein, S. (1984). Quantier-pro and the LF Interpretation of PRO
arb
. Linguistic Inquiry
15. 3, 499505.
Farmer, A. (1984). Modularity in Syntax: A Study of Japanese and English. Cambridge,
Mass.: MIT Press.
Fiengo, R. (1980). Surface Structure: The Interface of Autonomous Components. Cam-
bridge, Mass.: Harvard University Press.
Fiengo, R. and J. Higginbotham (1981). Opacity in NP. Linguistic Analysis 7. 4, 395421.
Finer, D. and E. Broselow (1985). The Acquisition of Binding in Second Language
Learning. Proceedings of NELS 15.
Fodor, J. A. (1975). The Language of Thought. New York: Crowell.
Fodor, J. A. (1981). Methodological Solipsism as a Research Strategy in Psycholinguistics.
Representations, edited by J. Fodor. Cambridge, Mass.: MIT Press.
Fodor, J., T. G. Bever, and M. F. Garrett (1974). The Psychology of Language. New York:
McGraw-Hill.
Fodor, J. D. (1977). Semantics: Theories of Meaning in Generative Grammar. New York:
Crowell.
Frank, R. (1992). Syntactic Locality and Tree Adjoining Grammar. Ph.D. dissertation, University of Pennsylvania.
Franks, S. and N. Hornstein (1990). Governed PRO. McGill Working Papers in Linguistics 6.3, 167–191.
Freidin, R. (1978). Cyclicity and the Theory of Grammar. Linguistic Inquiry 9.4, 519–549.
Freidin, R. (1986). Fundamental Issues in the Theory of Binding. Studies in the Acquisition of Anaphora, edited by B. Lust, 151–191. Dordrecht: Kluwer.
Frazier, Lyn (1979). Some Notes on Parsing. University of Massachusetts Occasional
Papers in Linguistics. Amherst, Mass.: University of Massachusetts.
Fukui, N. and M. Speas (1986). Specifiers and Projections. MIT Working Papers in Linguistics 8, 128–172.
Garrett, M. F. (1975). The Analysis of Sentence Production. The Psychology of Learning and Motivation 9, edited by G. H. Bower, 133–177. New York: Academic Press.
Garrett, M. F. (1980). Levels of Processing in Sentence Production. Language Production, vol. 1, edited by B. Butterworth, 177–220. New York: Academic Press.
Gleitman, L. and H. Gleitman (1990). Structural Sources of Verb Meaning. Language Acquisition 1, 3–55.
Goodall, G. (1984). Parallel Structures in Syntax. Ph.D. dissertation, University of
California at San Diego.
Goodall, G. (1985–1986). Parallel Structures in Syntax. The Linguistic Review 5.2, 173–184.
Goodluck, H. (1978). Linguistic Principles in Children's Grammar of Complement Subject Interpretation. Ph.D. dissertation, University of Massachusetts.
Goodluck, H. and S. Tavakolian (1982). Competence and Processing in Children's Grammar of Relative Clauses. Cognition 16, 1–28.
Grimshaw, J. (1981). Form, Function, and Language Acquisition Device. The Logical
Problem of Language Acquisition, edited by C. L. Baker and J. J. McCarthy.
Cambridge, Mass.: MIT Press.
Grimshaw, J. (1986). Nouns, Arguments, and Adjuncts. MIT Working Papers in Linguis-
tics. Cambridge, Mass.: Massachusetts Institute of Technology.
Gruber, J. (1967). Topicalization in Child Language. Child Language: A Book of Readings,
edited by A. Bar-Adon and W. Leopold, New York: Prentice-Hall.
Guilfoyle, E. (1985). The Acquisition of Tense and the Emergence of Lexical Subjects in
Child Grammars of English. McGill Working Papers in Linguistics.
Hale, K. (1979). On the Position of Walpiri in a Typology of the Base. Bloomington,
Ind.: Indiana University Linguistic Club.
Hale, K. (1983). Warlpiri and the Grammar of Nonconfigurational Languages. Natural Language and Linguistic Theory 1, 5–48.
Hale, K. and J. Keyser (1986a). On the Syntax of Argument Structure. Lexicon Project
Working Papers 34. MIT Working Papers in Linguistics. Cambridge, Mass.: Massa-
chusetts Institute of Technology.
Hale, K. and J. Keyser (1986b). Some Transitivity Alternations in English. Lexicon
Project Working Papers 7. MIT Working Papers in Linguistics. Cambridge, Mass.:
Massachusetts Institute of Technology.
Hamburger, H. and S. Crain (1982). Relative Acquisition. Language Development: Volume 1, edited by S. Kuczaj II, 245–274. Mahwah, NJ: Lawrence Erlbaum.
Hayes, B. (1980). A Metrical Theory of Stress Rules. Ph.D. dissertation, Massachusetts
Institute of Technology.
Higginbotham, J. (1985). On Semantics. Linguistic Inquiry 16, 547–594.
Hoji, H. (1983). Xⁿ (YP) Xⁿ⁻¹ and the Bound Variable Zibun. MIT Workshop on Japanese Linguistics. Cambridge, Mass.: Massachusetts Institute of Technology.
Hoji, H. (1985). Logical Form Constraints and Configurational Structures in Japanese.
Ph.D. dissertation, University of Washington.
Hoji, H. (1986). Empty Pronominals in Japanese and the Subject of NP. Proceedings of
NELS 17.
Hornstein, N. (1985–1986). Restructuring and Interpretation in a T-model. The Linguistic Review 5.4, 301–334.
Hornstein, N. (1987). Levels of Meaning. Modularity in Representation and Natural Language Understanding, edited by J. Garfield. Cambridge, Mass.: MIT Press.
Huang, J. (1982). Logical Relations in Chinese and the Theory of Grammar. Ph.D.
dissertation, Massachusetts Institute of Technology.
Huang, J. (1993). Reconstruction and the Structure of the VP: Some Theoretical Conse-
quences. Linguistic Inquiry 24.
Hyams, N. (1985). Language Acquisition and the Theory of Parameters. Ph.D. dissertation,
City University of New York.
Hyams, N. (1986). Language Acquisition and the Theory of Parameters. Dordrecht: Reidel.
Hyams, N. (1987). Parameter-Setting. Parameter-Setting, edited by T. Roeper and E. Williams, 1–22. Dordrecht: Reidel.
Jackendoff, R. (1972). Semantic Interpretation in Generative Grammar. Cambridge, Mass.: MIT Press.
Jackendoff, R. (1977). X-bar Syntax. Cambridge, Mass.: MIT Press.
Jackendoff, R. (1983). Semantics and Cognition. Cambridge, Mass.: MIT Press.
Jackendoff, R. (1988). Consciousness and the Computational Mind. Bradford Books, Cambridge, Mass.: MIT Press.
Jaeggli, O. (1986). Passive. Linguistic Inquiry 17.4, 587–622.
Jakubowicz, C. (1984). On Markedness and Binding Principles. Proceedings of NELS 14, 154–182.
Jelinek, E. (1984). Empty Categories, Case, and Configurationality. Natural Language and Linguistic Theory 2.1, 39–76.
Jelinek, E. (1985). The Projection Principle and the Argument Type Parameter. Paper
presented at Linguistic Society of America, winter meeting.
Johnson, K. (1986). Subjects and Theta Theory. Manuscript, Cambridge, Mass.: Massa-
chusetts Institute of Technology.
Joshi, A. (1985). Tree-Adjoining Grammars: How Much Context Sensitivity is Required
to Provide Reasonable Descriptions? Natural Language Parsing, edited by D. Dowty,
L. Karttunen, and A. Zwicky. Cambridge, Eng.: Cambridge University Press.
Joshi, A. , L. Levy, and M. Takahashi (1975). Tree Adjunct Grammars. Journal of
Computer and System Sciences 10.
Joshi, A. and A. Kroch (1985), The Linguistic Relevance of Tree-Adjoining Grammars,
MS-CS-8516, Department of Computer and Information Sciences, University of
Pennsylvania.
Kayne, R. (1981). ECP Extensions. Linguistic Inquiry 12, 93–133.
Kayne, R. (1983). Connectedness. Linguistic Inquiry 14, 223–249.
Kayne, R. (1984). Connectedness and Binary Branching. Dordrecht: Foris.
Kayne, R. (1985). Principles of Particle Constructions. Grammatical Representations, edited by J. Guéron, H.-G. Obenauer, and J. Pollock, 101–140. Dordrecht: Foris.
Kegl, J. and J. Gee (undated). ASL Structure: Toward a Theory of Abstract Case.
Manuscript, Cambridge, Mass.: Massachusetts Institute of Technology.
Keyser, J. and T. Roeper (1984). On the Middle and Ergative Constructions in English.
Linguistic Inquiry 15, 381–416.
Kiparsky, P. (1982a). From Cyclic Phonology to Lexical Phonology. The Structure of
Phonological Representations, edited by H. van der Hulst and N. Smith, 131–177.
Dordrecht: Foris.
Kiparsky, P. (1982b). Lexical Morphology and Phonology. Linguistics in the Morning
Calm, edited by I.-S. Yang, 3–93. Seoul: Hansin.
Kitagawa, Y. (1986). Subjects in English and Japanese, Ph.D. dissertation, University of
Massachusetts.
Klein, S. (1982). Syntactic Theory and the Developing Grammar: Reestablishing the
Relationship between Linguistic Theory and Data from Language Acquisition. Ph.D.
dissertation, University of California at Los Angeles.
Klima, E. and U. Bellugi (1966). Syntactic Regularities in the Speech of Children.
Psycholinguistic Papers, edited by J. Lyons and R. Wales, 183–208. Edinburgh:
Edinburgh University Press.
Koopman, H. (1984). The Syntax of Verbs. Dordrecht: Foris.
Koster, J. (1975). Dutch as an SOV Language. Linguistic Analysis 1, 111–136.
Koster, J. (1978). Locality Principles in Syntax. Dordrecht: Foris.
Koster, J. (1982). Class lectures. Salzburg Institute of Summer Linguistics, Salzburg,
Austria.
Koster, J. (1984). On Binding and Control. Linguistic Inquiry 15, 417–459.
Koster, J. (1987). Domains and Dynasties. Dordrecht: Foris.
Kroch, A. and A. Joshi (1988). Analyzing Extraposition in a Tree Adjoining Grammar.
Syntax and Semantics 20, edited by G. Huck and A. Ojeda. New York: Academic
Press.
Labov, W. and T. Labov (1976). The Learning of Syntax from Questions. Zeitschrift für Literaturwissenschaft und Linguistik 6, 47–82.
Ladusaw, W. (1985). A Proposed Distinction between Levels and Strata. Paper presented
at Linguistic Society of America, winter meeting.
Lapointe, S. (1978). A Theory of Grammatical Agreement. Ph.D. dissertation, University
of Massachusetts.
Lapointe, S. (1985a). A Model of Syntactic Phrase Combination in Speech Production.
Proceedings of NELS 15.
Lapointe, S. (1985b). A Theory of Verb Form Use in the Speech of Agrammatic
Aphasics. Brain and Language 24.1, 100–155.
Laporte-Grimes, L. and D. Lebeaux (1993). Complexity Considerations in Early Speech.
Manuscript. University of Connecticut and University of Maryland.
Lasnik, H. (1986). Two Types of Condition C. Presentation at Princeton Conference on
Linguistic Theory, Princeton, NJ.
Lasnik, H. and S. Crain (1985). On the Acquisition of Pronominal Reference. Lingua 65,
135–154.
Lasnik, H. and M. Saito (1984). On the Nature of Proper Government. Linguistic Inquiry
15, 235–289.
Lebeaux, D. (1981). The Acquisition of the Passive. MA Thesis, Harvard University.
Lebeaux, D. (1982). Submaximal Projections. Manuscript. Amherst, Mass.: University of
Massachusetts.
Lebeaux, D. (1983). A Distributional Difference Between Reciprocals and Reflexives.
Linguistic Inquiry 14.
Lebeaux, D. (1984). Anaphoric Binding and the Denition of PRO. Proceedings of NELS
14.
Lebeaux, D. (1984–1985). Locality and Anaphoric Binding. The Linguistic Review 4, 343–363.
Lebeaux, D. (1986). The Interpretation of Derived Nominals. Papers from the Regional Meeting of the Chicago Linguistic Society 22, 231–247.
Lebeaux, D. (1987). Comments on Hyams. Parameter-Setting, edited by T. Roeper and
E. Williams, Dordrecht: Reidel.
Lebeaux, D. (1987). The Composition of Phrase Structure. Presentation, Tucson, Ariz.:
University of Arizona.
Lebeaux, D. (1988). The Feature +Affected and the Formation of the Passive. Thematic Relations [Syntax and Semantics 21], edited by W. Wilkins. New York: Academic
Press.
Lebeaux, D. (1988). Language Acquisition and the Form of the Grammar. Ph.D. dissertation, University of Massachusetts, Amherst.
Lebeaux, D. (in preparation). Appeared as Lebeaux (1991). Relative Clauses, Licensing,
and the Nature of the Derivation. Perspectives on Phrase Structure: Heads and
Licensing [Syntax and Semantics 25], edited by S. Rothstein. New York: Academic
Press.
Lebeaux, D. (1997). Determining the Kernel II: Prosodic Form, Syntactic Form, and
Phonological Bootstrapping. NEC Technical Report 97094.
Lebeaux, D. (1998). Where does the Binding Theory Apply? NEC Technical Report
98015. Princeton, NJ: NEC Research Institute.
Lebeaux, D. (to appear). A Subgrammar Approach to Language Acquisition. NEC
Technical Report.
Levin, B. (1983). On the Nature of Ergativity, Ph.D. dissertation, Massachusetts Institute
of Technology.
Lust, B. and L. Mangione (1984). The Principal Branching Direction Constraint in First
Language Acquisition of Anaphora. Proceedings of NELS 14.
Lust, B. , ed. (1986). Studies in the Acquisition of Anaphora. Dordrecht: Reidel.
Manzini, R. (1983). On Control and Control Theory. Linguistic Inquiry 14, 421–446.
Marantz, A. (1980). Whither Move NP. MIT Working Papers in Linguistics. Cambridge,
Mass.: Massachusetts Institute of Technology.
Marantz, A. (1982). On the Acquisition of Grammatical Relations. Linguistische Berichte:
Linguistik als Kognitive Wissenschaft 80/82, 32–69.
Marantz, A. (1984). On the Nature of Grammatical Relations. Cambridge, Mass.: MIT
Press.
Maratsos, M. , S. Kuczaj II, D. Fox, and M. Chalkley (1979). Some Empirical Studies in
the Acquisition of Transformational Relations: Passives, Negatives, and the Past
Tense. Minnesota Symposium on Child Psychology 12, edited by W. Collins,
Mahwah, NJ: Lawrence Erlbaum.
Maratsos, M., D. Fox, J. Becker, and M. Chalkley (1985). Semantic Restrictions in Children's Passives. Cognition 19, 167–192.
Marcus, M. , D. Hindle, and M. Fleck (1983). D-theory: Talking about Talking about
Trees. Association for Computational Linguistics 21, 129–136.
May, R. (1985). Logical Form. Cambridge, Mass.: MIT Press.
McCawley, J. (1984). Anaphora and Notions of Command. Proceedings of the Tenth
Annual Meeting of the Berkeley Linguistic Society. Berkeley, Calif.: University of
California.
McNeill, D. (1970). The Acquisition of Language: The Study of Developmental Psycho-
linguistics. New York: Harper and Row.
Miller, G. and K. McKean (1964). A Chronometric Study of Some Relations between
Sentences. Quarterly Journal of Experimental Psychology 16, 297–308.
Mohanan, K. P. (1982). Lexical Phonology. Ph.D. dissertation, Massachusetts Institute of
Technology.
Montague, R. (1974). Formal Philosophy, edited by R. Thomason. New Haven, Conn.:
Yale University Press.
Morgan, J. , R. Meier, and E. Newport (1987). Structural Packaging in the Input to
Language Learning: Contributions of Prosodic and Morphological Marking of
Phrases to the Acquisition of Language. Cognitive Psychology 19.4, 498–550.
Newport, E., L. Gleitman, and H. Gleitman (1977). Mother, I'd Rather Do It Myself:
Some Effects and Non-effects of Maternal Speech Style. Talking to Children:
Language Input and Acquisition, edited by C. E. Snow and C. Ferguson. Cambridge:
Cambridge University Press.
Nishigauchi, T. (1984). Control and Thematic Domain. Language 60, 215–250.
Partee, B. H. (1979). Montague Grammar and The Well-Formedness Constraint. Selections
from the Third Groningen Round Table [Syntax and Semantics 10], edited by F. Heny
and B. Schnelle, 275–313. New York: Academic Press.
Partee, B. H. (1984). Compositionality. Varieties of Formal Semantics: Proceedings of the 4th Amsterdam Colloquium, edited by F. Landman and F. Veltman, 281–311.
Pesetsky, D. (1982). Paths and Categories, Ph.D. dissertation, Massachusetts Institute of
Technology.
Pesetsky, D. (1985). Morphology and Logical Form. Linguistic Inquiry 16, 193–246.
Pinker, S. (1979). A Theory of the Acquisition of Lexical Interpretive Grammars. MIT
Lexicon Project.
Pinker, S. (1984). Language Learnability and Language Development, Cambridge, Mass.:
Harvard University Press.
Pinker, S. and D. Lebeaux (1982). A Learnability-Theoretic Approach to Language
Acquisition. Manuscript, Cambridge, Mass.: Harvard University.
Postal, P. (1984). On Raising. Cambridge, Mass.: MIT Press.
Powers, S. and D. Lebeaux (1998). Data on DP Acquisition. Issues in the Theory of
Language Acquisition, edited by N. Dittmar and Zvi Penner, 37–76. Bern: Peter
Lang.
Pustejovsky, J. (1984). Studies in Generalized Binding. Ph.D. dissertation, University of
Massachusetts.
Radford, A. (1981). Transformational Syntax. Cambridge, Eng.: Cambridge University
Press.
Randall, J. (1985). Morphological Structure and Language Acquisition. New York:
Garland Press.
Reinhart, T. (1983). Anaphora and Semantic Interpretation. Chicago: University of
Chicago Press.
Riemsdijk, H. van and E. Williams (1981). NP-structure. The Linguistic Review 1.
Rizzi, L. (1982). Issues in Italian Syntax. Dordrecht: Foris.
Rizzi, L. (1986a). On Chain Formation. The Syntax of Pronominal Clitics [Syntax and
Semantics 19], edited by H. Borer, 65–95. New York: Academic Press.
Rizzi, L. (1986b). Null Objects in Italian and the Theory of pro. Linguistic Inquiry 17, 501–558.
Roberts, I. (1986). Implicit and Dethematized Subjects. Ph.D. dissertation, University of
Southern California.
Roeper, T. (1974). Ph.D. dissertation, Harvard University.
Roeper, T. and M. Siegel (1978a). A Lexical Transformation for Verbal Compounds.
Linguistic Inquiry 9, 199–260.
Roeper, T. (1978b). Linguistic Universals and the Acquisition of Gerunds. University of
Massachusetts Occasional Papers 4, edited by H. Goodluck and L. Solan. Amherst,
Mass.: University of Massachusetts.
Roeper, T. (1982). The Role of Universals in the Acquisition of Gerunds. Language
Acquisition: The State of the Art, edited by E. Wanner and L. Gleitman, 267–288.
Cambridge, Eng.: Cambridge University Press.
Roeper, T. (1983). Implicit Theta Roles in the Lexicon and Syntax. Manuscript, Amherst,
Mass.: University of Massachusetts.
Roeper, T. (1987). Implicit Arguments and the Head Complement Relation. Linguistic
Inquiry 18.2, 267–310.
Roeper, T. , S. Akiyama, L. Mallis, and M. Rooth (1986), The Problem of Empty
Categories and Bound Variables in Language Acquisition. Manuscript, Amherst,
Mass.: University of Massachusetts.
Roeper, T. and J. Keyser (1984). On the Middle and Ergative Constructions in English.
Linguistic Inquiry 15, 381–416.
Roeper, T. and E. Williams (1986). Parameter Setting, Dordrecht: Reidel.
Rosenbaum, P. (1967). The Grammar of English Predicate Complement Constructions.
Cambridge, Mass.: MIT Press.
Ross, J. R. (1968). Constraints on Variables in Syntax. Ph.D. dissertation, Massachusetts
Institute of Technology.
Rothstein, S. (1983). The Syntactic Forms of Predication. Ph.D. dissertation, Massachu-
setts Institute of Technology.
Rozwadowska, B. (1986). Thematic Relations in Derived Nominals. Thematic Relations
[Syntax and Semantics 21], edited by W. Wilkins. New York: Academic Press.
Safir, K. (1982). Inflection-Government and Inversion. The Linguistic Review 1.4, 417–467.
Safir, K. (1987). The Syntactic Projection of Lexical Thematic Structure. Natural Language and Linguistic Theory 5.4, 561–611.
Safir, K. (1987). Comments on Manzini and Wexler. Parameter-Setting, edited by T. Roeper and E. Williams. Dordrecht: Reidel.
Saito, M. and H. Hoji (1983). Weak Crossover and Move-α in Japanese. Natural Language and Linguistic Theory 1, 245–260.
Schlesinger, I. M. (1971). Production of Utterances and Language Acquisition. The
Ontogenesis of Grammar, edited by D. Slobin, 63–101. New York: Academic Press.
Seely, T. D. (1989). Anaphoric Relations, Chains, and Paths. Ph.D. dissertation, University of Massachusetts.
Selkirk, E. (1984). Phonology and Syntax: The Relation between Sound and Structure.
Cambridge, Mass.: MIT Press.
Shattuck, S. R. (1974). Speech Errors: An Analysis. Ph.D. dissertation, Massachusetts
Institute of Technology.
Sheldon, A. (1974). The Role of Parallel Function in the Acquisition of Relative Clauses. Journal of Verbal Learning and Verbal Behavior 13, 272–281.
Solan, L. (1983). Pronominal Reference: Child Language and the Theory of Grammar.
Dordrecht: Reidel.
Solan, L. and T. Roeper (1978). Children's Use of Syntactic Structure in Interpreting
Relative Clauses. Papers in the Structure and Development of Child Language,
University of Massachusetts Occasional Papers 4, edited by H. Goodluck and L.
Solan, 105–126. Amherst, Mass.: University of Massachusetts.
Speas, M. (1990). Phrase Structure in Natural Language. Dordrecht: Reidel.
Sproat, R. (1985). On Deriving the Lexicon. Ph.D. dissertation, Massachusetts Institute of
Technology.
Sproat, R. (1985). The Projection Principle and the Syntax of Synthetic Compounds.
Proceedings of NELS 16.
Sportiche, D. (1983). Structural Invariance and Symmetry in Syntax. Ph.D. dissertation,
Massachusetts Institute of Technology.
Sportiche, D. (1987). Unifying Movement Theory. Manuscript, Los Angeles: University
of Southern California.
Sportiche, D. (1988). A Theory of Floated Quantifiers, and its Consequences for Constituent Structure. Linguistic Inquiry 19, 425–450.
Steele, S. (in preparation). A Grammar of Luiseño.
Stowell, T. (1981). The Origins of Phrase Structure. Ph.D. dissertation, Massachusetts
Institute of Technology.
Stowell, T. (1981/1982). A Formal Theory of Configurational Phenomena. Proceedings
of NELS 12.
Stowell, T. (1983). Subjects across Categories. The Linguistic Review 2, 285–312.
Stowell, T. (1988). Small Clause Restructuring. Manuscript, Los Angeles: University of
California at Los Angeles.
Tavakolian, S. (1978). Structural Principles in the Acquisition of Complex Sentences. Ph.D.
dissertation, University of Massachusetts.
Tavakolian, S. (1981). The Conjoined Clause Analysis of Relative Clauses. Language
Acquisition and Linguistic Theory, edited by S. Tavakolian, Cambridge, Mass.: MIT
Press.
Tavakolian, S., ed. (1981). Language Acquisition and Linguistic Theory. Cambridge,
Mass.: MIT Press.
Thiersch, C. (1978). Topics in German Syntax. Ph.D. dissertation, Massachusetts Institute
of Technology.
Travis, L. (1984). Parameters and Eects of Word Order Variation. Ph.D. dissertation,
Massachusetts Institute of Technology.
Vainikka, A. (1985). The Acquisition of English Case. Presented at Boston University Conference on Language Development 10; later appeared as Vainikka, A. (1993/1994). Case in the Development of English Syntax. Language Acquisition 3, 257–324.
Vainikka, A. (1986). Case in Finnish and Acquisition. Manuscript, Amherst, Mass.:
University of Massachusetts.
Vainikka, A. (1986). Nominative Signals Movement. Presentation at 3rd Workshop in
Comparative Germanic Syntax, Turku, Finland.
Vainikka, A. (1988). Manuscript, Amherst, Mass.: University of Massachusetts.
Vergnaud, J.-R. (1985). Dépendances et niveaux de représentation en syntaxe. Amsterdam: John Benjamins.
Wanner, E. and L. Gleitman, ed. (1982). Language Acquisition: The State of the Art,
Cambridge, Eng.: Cambridge University Press.
Wasow, T. (1977). Adjectival and Verbal Passive. Formal Syntax, edited by P. Culicover,
T. Wasow, and A. Akmajian. New York: Academic Press.
Weinberg, A. (1988). Locality Principles in Syntax and in Parsing. Ph.D. dissertation,
Massachusetts Institute of Technology.
Wexler, K. and P. Culicover (1980). Formal Principles of Language Acquisition. Cam-
bridge, Mass.: MIT Press.
Wexler, K. (1982). A Principled Theory for Language Acquisition. Language Acquisition:
The State of the Art, edited by E. Wanner and L. Gleitman, Cambridge, Eng.:
Cambridge University Press.
Wexler, K. and Y. Chien (1985). The Development of Lexical Anaphors and Pronouns.
Papers and Reports on Child Language Development 24, 138–149. Stanford, Calif.:
Stanford University.
Wexler, K. and Y. Chien (1987a). Children's Acquisition of Locality Conditions for Reflexives and Pronouns. Papers in Linguistics 26, 30–39. Irvine, Calif.: University of California.
Wexler, K. and R. Manzini (1987b). Parameters and Learnability in Binding Theory. Parameter-Setting, edited by T. Roeper and E. Williams. Dordrecht: Reidel.
Williams, E. (1978). Across-the-Board Rule Application. Linguistic Inquiry 9.1, 31–43.
Williams, E. (1980). Predication. Linguistic Inquiry 11.1, 203–238.
Williams, E. (1981). Argument Structure and Morphology. The Linguistic Review 1, 81–114.
Williams, E. (1982). The NP-Cycle. Linguistic Inquiry 13.2, 277–296.
Williams, E. (1987). Reassignment of Functions at LF. Linguistic Inquiry 17.2, 265–299.
Zubizarreta, M.-L. (1987). Levels of Representation in the Lexicon and in the Syntax.
Dordrecht: Foris.
Index
A
Abney, 71, 72, 77, 84, 96
Abrogation of deep structure functions,
183–258
Adjoin-α, xv, xix, 91–144
Agreement, 145–153
Akiyama, 204, 239–258
Akmajian, 88
Anchoring
and deepest computed level, 188–194
derivation anchored at DS, 184–194
derivation anchored at SS, 184–194
Anti-Reconstruction Effects, 102–112
Aoun, 21, 43, 44, 244
Argument/Adjunct distinction, 94–136
Argument-linking, 38–51
and learnability, 38–41
and the Projection Principle, 104–112
and the Structure of the Base,
104–112
and ergative languages, 38–41
and the derivation, 94–136
B
Bach, 17, 100, 101, 220
Baker, 51
Barss, 224–239
Base Order, 17–29
Determining the base order, 17–29
Belletti, 219, 241
Bellugi, 26, 27
Bever, 9, 142, 185, 186
Bierwisch, 17
Bloom, 69–70, 79–80, 158
Borer, 15
Bowerman, 11, 28, 92
Bradley, 12, 92
Braine, xxiii, 7–12, 19
Bresnan, 54, 118, 216
Brody, 258
Brown, 70, 92, 154
Browning, 214
C
Canonical structural realization, 36
Carden, xxviii, 204, 224–239
Case representation, 178, 179
Chien, 245, 249
Chierchia, 23
Chomsky, xiii, xiv, xv, xxii, 2, 4, 10,
13–15, 23, 25, 31, 41, 46, 47, 52,
70, 74, 80, 84–86, 93, 96, 97, 100,
114, 116, 118, 128, 141, 145, 149,
150, 152–154, 165, 183, 206, 208,
215, 220, 240–242, 253
Clark, E., 92
Clark, H., 92
Clark, R., 206
Closed class elements, 1–5, 7, 11–16,
151–153
and set of governors, 12
and finiteness, 13–14
and open class elements, 11–15
link with grammatical operations,
151–153
Cole, 96
274 INDEX
Composition of phrase structure, 91–144
and saturation of closed class
elements, 114–120
Condition C, 102–112, 224–239
and Dislocated Constituents, 224–239
constraint of direct c-command,
224–229
Conjoin-α, 112–115, 120–136
Constancy principles, 91–94
Control, 203–224
and abrogation of DS functions,
188–194, 220–224
c-command constraint, 213–220
double-binding constructions,
213–220
early stages, 204–213
Goodluck's result, 210–211
representation in early grammar,
220–224
Tavakolian's result, 206–207
Cooper, 113
Crain, 120, 207
Culicover, 7, 185
D
Deductive system, modelling, 195–203
Deep structure, 104–112
Derivational Endpoints, xiii, xxvii
Derivational model, 91–136
Derivational Theory of Complexity,
184–194
Detecting Movement, 19–22
Dietrich, 224–239
Dislocated constituents and indexing
functions, 183–258
Dowty, 100
Drozd, 208–213
E
Early phrase structure
building, 31–36, 47–53, 56–84
from lexical to phrasal syntax, 68–80
lexical representation, 67
pivot/open sequence, 58–60, 68–69,
72–79
thematic representation, 67
Emonds, 17, 154
English nominals
as ergative, 41–45
Epstein, 213
Equipollence, 194–203
Ergative languages, 38–45
F
Farmer, 68
Fiengo, 25, 88
Finiteness, 2, 13–14
and closed class elements, 13–14
Fixed Specifier Constraint, 19–23
Fleck, 140
Fodor, 9, 128, 185, 186
Frank, xiv
Franks, 217
Frazier, 137
Freidin, 101, 104
Fukui, 27, 28, 72, 84
Full vs. Reduced Paradigm, 211
Functor/Pivot, 68–69, 71–75
G
Garrett, xvi, xxvi, 9, 12, 92, 155–157,
185, 186
General Congruence Principle, 47,
126–136
and setting of parameters, 126–136
Gleitman H., 67
Gleitman L., xxvi, 67
Goodluck, 120, 209211, 213
Governor, Canonical, 12
Grammatical operations, 145153
Grammatical Sequence, relative clauses,
142–144
Grimshaw, 32, 35, 47, 51, 95
H
Hale, 32, 49, 52, 54, 55
Hamburger, 120
Higginbotham, 88, 96, 224
Hindle, 140
Hoji, 52, 77, 118
Hornstein, 21, 43, 44, 217
Huang, 13, 42, 104
Hyams, 16, 82, 209, 215, 249
I
Idioms, 165–182
and passive, 165–182
Level I idioms, 178–181
Level II idioms, 178–181
J
Jackendoff, xxviii, 22, 33, 61, 85, 86,
95, 97, 102, 103, 172, 183
Jakubowicz, 245, 247, 249
Jelinek, 38, 40, 41, 46, 52
Johnson, 219
Joshi, xiv, xxii, 78
K
Keyser, 54, 64
Kitagawa, 27, 28
Klein, 26
Klima, 26, 27
Koopman, 12, 18, 55, 91, 166
Koster, 17, 93, 217
Kroch, xiv, xxii, 78
L
Labov, T., 198
Labov, W., 198
Lapointe, 156
Laporte-Grimes, xxvii
Lasnik, 21, 106, 207, 215, 230, 242
Lebeaux, xvi, xxi, xxii, xxv, xxvi, xxvii,
xxviii, 32, 34, 81, 82, 85, 86, 88,
92, 149, 174, 212, 213–220, 241
Levels of Representation
and parametric variation, 14–15
Levin, 38
Levy, 78
Lexical entry/representation, 50, 53–60,
63, 66–78
structure of, 50, 53–60, 63, 66–78
with lexical insertion into open slots,
57–58
Lightfoot, 21, 43, 44
Link of closed class item with
grammatical operations, 151–153
Lust, 76
M
Mallis, 204, 239258
Mangione, 76
Manzini, 206
Marantz, 38, 92
Marcus, 140
Marker, 19–22
McCawley, 224
McKean, 165, 185, 186
McKee, 207
McNeill, 185, 186
Meier, 19
Merge, xxii–xxiv
Merger, 154–182
Metatheoretical Constraint on Indexing,
231
and chain-binding, 230–234
Miller, 165, 185, 186
Minimalist Program, the, xiii–xxix
Montague, xxi
Morgan, 19
N
Newport, 19, 68
Nishigauchi, 205
O
Open class/closed class distinction, 7–16
P
Parametric variation in phrase structure,
3, 14–15, 16–18, 31–41
amount of, 3
and levels, 14–15,
and triggers, 16–18
Parametric variation in Relative Clauses
and saturation of closed class
elements, 114–120
and structure of parameters, 112–120,
126–136
Peters, 100
Phrase Structural Case, 150
Phrase structure composition, xiii–xxix
Phrase Structure,
Building, 31–36, 47–53, 56–84
Pinker, 7, 10, 23, 31–36, 45, 48, 51, 63,
65, 70, 86, 164, 194–197, 212
Pivot/Open Constructions, 58–60, 68–69,
72–75
Pivot/Open distinction, 7–11
and government relation, 9, 10
Postal, 243
Powers, xxvi
Predication, 146147
Pre-Project-α representation, 51–83
Principle of Representability, 141
Processing considerations, 136–144
and ECP, 138–139
and grammar, 136–144
Project-α, xvi–xxvii, 154–182
Projection of Lexical Structure, 47–53
Projection Principle, 104–112
Property of Smooth Degradation, 141
Pustejovsky, 150
R
Radford, 88
Reconstruction, 234–239
quasi-Reconstruction, 234–239
vs. direct approach, 224–229
Reduced structures
deletion account of, 159
null item account of, 160
subgrammar account of, 165–182
Reinhart, 110, 224–226
Relative clauses, 91–144
Relative Clauses, Acquisition
conjoined clause analysis, 120–126
default grammar, 123
Replacement Sequences, 69
Rizzi, 219, 241
Roeper, 16, 17, 19, 20, 24, 54, 56,
120–126, 184, 204, 239–258
Rooth, 204, 239–258
Rosenbaum, 205
Ross, 88
Rothstein, 96
S
Safir, 17
Saito, 21, 72, 230
Schlesinger, 11
Seely, 231–235
Selkirk, 88
Semantic bootstrapping, 31–38, 45–47
Sequence of Structures, 69
Shallow analysis/derivation, 184–194
and Derivational Theory of
Complexity, 184–194
Shattuck-Hufnagel, xxvi, 12, 92,
155–157
Sheldon, 120, 121
Siegel, 54
Solan, 120, 125, 126, 184, 207, 227
Speas, xvi, 27, 28, 72, 84
Specified Determiner Generalization,
170–175
Speech errors, xviii, 155–156, 165–169
Sportiche, 21, 27, 28, 213
Steele, 88
Stowell, 10, 12, 32, 37, 49, 52, 54, 55,
95, 99, 114, 138
Strong Crossover, 239–258
acquisition evidence, 245–258
and van Riemsdijk and Williams
proposal, 243–244
and wh-questions, 239–258
derivational account, 247–249,
251–258
representational account, 247–251
Structure of the Base, 104–112
Subgrammar Approach, xiii–xxix, 53, 67,
75–78, 165–182
Submaximal Projections, 86–87
T
Takahashi, 78
Tavakolian, 120–126, 142, 184, 203–224
Telegraphic speech, 154–182
The placement of Neg
syntax, 24–26
acquisition, 26–29
Thematic Representation, 67
Theta representation, 178, 179
Thiersch, 18
Travis, 55, 91
Triggers, 16–29
U
Universal Application of Principles, 245
V
Vainikka, 82
van Riemsdijk, xxviii, 4, 46, 102, 103,
116, 119, 183, 236, 243–258
W
Wall, 100
Wasow, 88
Weinberg, 21, 43, 44, 140
Weisler, 231
Wexler, 7, 185, 245, 247, 249
Williams, xxviii, 5, 46, 56, 66, 96, 102,
103, 111, 113, 116, 119, 146, 147,
183, 205, 206, 213, 215, 224, 236,
243–258
Z
Zubizarreta, 67