Syntactic Structures and Morphological Information


Interface Explorations 7

Artemis Alexiadou
T. Alan Hall

Syntactic Structures
and Morphological Information

edited by
Uwe Junghanns
Luka Szucsich

Introduction vii
Luka Szucsich and Uwe Junghanns

Metagrammar of systematic relations: a study with special

reference to Slavic morphosyntax 1
Tania Avgustinova

On-line morphology: The morphosyntax of Hungarian

verbal inflection 25
Huba Bartos

Verbal morphology and agreement in Urdu 57

Miriam Butt and Louisa Sadler

Particles and sentence structure: a historical perspective 101

Gisella Ferraresi and Maria Goldbach

Subject Case in Turkish nominalized clauses 129

Jaklin Kornfilt

On the licensing of null subjects in Old French 217

Esther Rinke

Periphrastic paradigms in Bulgarian 249

Andrew Spencer

Transparent, restricted and opaque affix orders 283

Barbara Stiebels

Direction marking as agreement 317

Jochen Trommer
vi Contents

On the semantics of cases 341

Ilse Zimmermann

Index 381
Luka Szucsich and Uwe Junghanns

The present volume contains a selection of papers originating in the

workshop entitled "Clause Structure and Models of Grammar from
the Perspective of Languages with Rich Morphology" at the 23 rd
annual meeting of the German Linguistic Society / 23. Jahrestagung
der Deutschen Gesellschaft fur Sprachwissenschaft (DGfS-23) which
took place in February 2001 at the University of Leipzig.
The relation between morphology and syntax has been one of the
most debated topics in linguistics. In most cases "words" do not
appear in one single form (phonological matrix) irrespective of the
syntactic environment in which they are embedded, but rather exhibit
additional morphological markers depending on the "slot" they
occupy in a complex syntactic structure. It is commonly assumed that
these morphological markers, as a rule, do not belong to the
substantial lexical meaning of the respective word form. Apart from
that, in many languages there exists a set of morphological words
(some of them are not necessarily phonological words) which merely
represent grammatical/functional meanings of substantial lexical
entries (auxiliaries and other free functional/synsemantic words).
Morphological markers (bound or free) serve different purposes.
They "identify" grammatical categories like tense, aspect, mood,
sentence type, number, person, gender, case, definiteness, etc. which
are connected to certain interpretations of complex linguistic
expressions and they are correlated with the combinatorial potential
of the respective word form, i.e. they "embed" a lexical entry into its
syntactic environment. Here are just two examples: (i) Certain verbal
morphology (e.g., active/passive or tense/finiteness morphology)
influences the realization of the argument structure of the verbal
lexical entry; (ii) a case marker indicates the "place" of a nominal
expression within a more complex linguistic expression (in a
minimalist approach it indicates the target the nominal constituent
viii Luka Szucsich & Uwe Junghanns

merges with; in a functionalist approach it determines the syntactic

function of a nominal expression).
In recent theories, morphology has been treated rather diversely,
either directly determining syntactic derivations or having no
immediate relation to syntactic structures. In some models,
morphological information appears as a specification or sub-matrix
of nodes (e.g., Lexical-Functional Grammar, cf. Bresnan (ed.) 1982;
Head-Driven Phrase-Structure Grammar, cf. Pollard and Sag 1994;
Functional Generative Description, cf. Sgall, Hajicovä, and Panevovä
1986). In recent generative Principle and Parameter models,
alongside the morphological characterization of word forms and the
syntactic structures induced by them a varying number of special
syntactic categories—so-called functional categories—are assumed
that represent morphological and/or semantic information or that
provide potential syntactic positions for argument expressions and
adverbial phrases (cf. Pollock 1989; Chomsky 1995, 2000, 2001;
Rizzi 1997; Cinque 1999).
In view of these developments it is necessary to discuss the place
of morphology in models of grammar and determine in a principled
way which possible correlates morphological information has in the
syntactic representation of the clause. The papers in this volume do
not focus on languages with relatively impoverished morphology
like, e.g., modern English, but rather on languages with rich
morphology (inflecting, agglutinating, polysynthetic languages).
Many of the contributors consider in their papers universal as well as
language-specific aspects of the relation between morphology and
clausal syntax.
This volume was designed to assemble papers by researchers from
different linguistic backgrounds and, thus, to mirror the on-going
discussion concerning the relation between the morphological and
the syntactic component of the grammar, taking rather different
perspectives. The central issue of the workshop and the present
volume was/is to come closer to a principled linguistic treatment of
clause structure in morphologically rich languages, i.e. to discuss
whether there is a generalized relation between morphological and
Introduction ix

syntactic operations and, if this is the case, how this relation could be
modeled in the respective framework.
In the following we will give short summaries of the papers
contained in this volume.
In her paper "Metagrammar of systematic relations: a study with
special reference to Slavic morphosyntax" Tania Avgustinova
develops a standardized (universal) taxonomy for systematic
relations in grammar (the so-called metagrammar) with a hierarchy
of relational types and a system of admissible cross-classifications of
different relational types. Those cross-classifications amount to a
broad array of grammatical relations including marginal ones which
are instantiated in various ways in different natural languages. Thus,
one aim of the proposed metagrammar is to provide this inventory of
possible syntactic combinations and (morpho-)syntactic relations
which should be interchangeable between different syntactic and
morphological theories. The practical side of this work is to
formulate a system of grammatical relatedness which could be
implemented in language tools and which should determine the
design of shared grammatical resources for Slavic languages.
Although one of the tasks is to develop a metagrammar serving
different syntactic and morphological theories, the design of the type
hierarchy of grammatical relations is based on type hierarchies
known from the Head Driven Phrase Structure Grammar (HPSG)
with higher- and lower-level relational types and multi-dimensional
inheritance of relational properties from higher-level to lower-level
types. All constraints associated with a particular relational type are
consequently also inherited to lower-level types.
In the introductory section Avgustinova lays out the fundamental
assumption concerning the metagrammar of systematic relations. The
second and central section is devoted to a description of the
systematic relations. Avgustinova above all discusses a subpart of
systematic relations which she calls "observable syntagmatics".
These relations are connected to the overt linguistic form, in contrast
to covert linguistic function ("structural syntagmatics"). Moreover,
within the observable syntagmatics she mainly concentrates on
combinatorial relations which largely correspond to morphological or
χ Luka Szucsich & Uwe Junghanns

morphosyntactic relations. Avgustinova claims that the alignment

dimension of observable syntagmatics determining the linear
distribution of syntactic items is less relevant for Slavic languages
(which means that the ordering of constituents in Slavic languages is
not rigidly fixed but determined by information-structural
To exemplify the abovementioned relations, Avgustinova has
chosen Russian and Bulgarian examples which she presents in
section 3, employing so-called "relational charts". These charts (in
the shape of diagrams) consist of all lexical items of the actual
sentence and additional cells for labeling the systematic relations
between the individual lexical items or complexes of items (in
section 1 the author explains the format of the charts—a certain
relational chart comprises all metagrammatical relations of the
respective sentence). Languages with a considerably rich
morphology like Russian and Bulgarian call for a detailed
elaboration of combinatorial relations.
This type of metagrammar, of course, does not explain the gaps
within the inventory of syntactic combinations for a particular
language or from a cross-linguistic perspective. The author herself
states that constraints that block certain cross-classifications of
relational types are to be developed. These constraints, of course,
cannot be theory-independent. Therefore, the aim of the paper is
rather to provide tools for systematically classifying grammatical
relations not least against the background of an application to
automatic language processing.
In the context of a derivational theory of morphology Huba
Bartos investigates phenomena of Hungarian verbal morphology
which at first sight seem to violate the Mirror Principle developed by
Baker (1985). The aim of the paper is to save this principle without
resorting to dubious mechanisms like morphological reordering or
theoretically undesirable concepts like covert movement in
morphology. One of the main theoretical claims of the paper is that
morphology "shadows" syntax, but certain principles in morphology
may cause orderings of morphemes that look like deviations from
Introduction xi

strict mirroring in light of the scope properties of the respective

In the second section, following the introduction, Bartos presents
the relevant Hungarian data. In Hungarian, inflectional morphology
obeys a strict ordering V-Mod(ality)-T(ense)-M(ood). However,
certain affix orderings may be associated with different scope orders
(e.g. if only Mod- and T-morphemes are present, both readings, Mod
> Τ and Τ > Mod, are possible). Assuming that semantic scope is
determined syntactically via c-command, these data pose a problem
for a strict interpretation of the Mirror principle.
In the third section Bartos lays out the fundamentals of his
account. Following recent syntactic theories he assumes that
syntactic structures are built up derivationally and cyclically
(obeying strict locality conditions). The morphological component
has access to the syntactic structure at any point of the derivation, i.e.
morphology strictly parallels syntax. If a derivational step has a
consequence for morphology (word structure), the respective
morphological operations have to be carried out obligatorily, e.g.
what is known as head movement is always relevant for morphology.
The relevant binary operation is called Morphosyntactic Merger. A
central assumption is that as soon as a morphological operation is
carried out, the newly built word-domain is opaque for further
morphological operations, i.e. further operations always involve just
the edges of word-domains. There is no such thing as restructuring
within word-domains.
In the following section, Bartos presents his account of the
Hungarian data. He assumes three functional categories potentially
filled with "genuine" morphemes (viz. Mod, T, and M). These
categories, however, also provide templatic slots, if the respective
functional category is "contentless" (i.e., a merely categorical frame
which Bartos calls proxy). These proxies are available for features of
lower functional categories to get scope over intermediate
projections. The ordering of the morphemes, however, does not
change, because the "lower" morpheme has already been
morphosyntactically merged with the verbal root forming an opaque
word-domain. This phenomenon produces the alleged violation of
xii Luka Szucsich & Uwe Junghanns

the Mirror Principle. Bartos' account predicts that there should be

no scope inversion where all functional categories are "contentfull",
i.e. where Mod, T, and Μ contain interpretable formal features which
in most cases are morphologically spelled out. This prediction is
indeed borne out by the Hungarian data.
In the fifth section, as a "by-product", Bartos provides an account
for so-called verbal complexes in Hungarian by employing the
concept of morphosyntactic merger. He derives both the ordering
patterns within the verbal complexes (the so-called "roll-up" and the
non-"roll-up" order) and the fact that these complexes display certain
properties of words, e.g., they serve as inputs to derivational
processes like other word-level units.
The paper by Miriam Butt and Louisa Sadler is concerned with
the general issue of the status of morphology within the theory of
grammar, and in particular its relation to syntax. The points at issue
are the interaction between morphology and syntax, on the one hand,
and, on the other, the mechanisms and data structures that should be
assumed as appropriate for the description of morphological systems.
More specifically, the paper explores aspects of the morphology
of case and agreement in Urdu within the framework of Lexical-
Functional Grammar (LFG). The lexicalist hypothesis is taken to
hold. Accordingly, syntactic and morphological processes belong to
differing modules of grammar. However, quite a complex interaction
between syntax and morphology is permitted due to the specific form
in which the lexicalist hypothesis is embodied in LFG. The authors
assume a finite-state morphological analyzer as used in
computational work in LFG. While it preserves the separation of
(external) syntax and morphology, it does not exclude contact
between morphology and syntax entirely. Interaction comes about at
an interface allowing syntactic functional information to filter
through by means of tags or features which abstract away from the
surface realization of the morphemes.
Butt and Sadler claim that their model of the morphology-syntax
interface is superior to a morpheme-based word syntax approach that
suffers from a number of insufficiencies as they show.
Introduction xiii

The paper consists of seven sections. The introductory section is

followed by a section concerned with Lexical-Functional Grammar
in which Butt and Sadler provide a brief sketch of the basic design
principles of LFG. They introduce and exemplify an aspect of the
formalism (known as constructive morphology, Nordlinger 1998)
which permits a natural and straightforward approach to the ability of
morphological elements (such as case markers) to define and project
the relational structures which contain them. In section 3, Butt and
Sadler briefly introduce case in Urdu and its relation to verbal
agreement patterns. They sketch out a treatment of Urdu case
marking in LFG. Section 4 deals with agreement. Specifically, Butt
and Sadler present the facts of verbal agreement in Urdu. They
formulate a relatively simple generalization concerning verbal
agreement and show how constraints associated with verb forms will
capture this generalization. Sections 5 and 6 are devoted to the
details of the morphological analysis. In section 5, Butt and Sadler
explore a word-syntax, or morpheme-based, implementation
presenting several unwanted side-effects and drawbacks of this
approach. In section 6, they examine an encoding of the same set of
agreement data using a finite-state morphological analyzer interfaced
to the syntax, showing how the difficulties encountered in the word
syntax approach are resolved. Section 7 concludes the paper.
Gisella Ferraresi and Maria Goldbach examine the syntactic
and phonological status of the Old French sentence particle si 'thus'
and its loss in the 17th century. The basic hypothesis of the paper is
that a syntactic change emerges, if conditions of either the interface
to the C-I system or the interface to the A-P system undergo changes
which affect semantic and phonological properties of lexical items.
The authors take functional categories as combinations of subsets of
the set of formal features selected by a specific language. In different
languages, functional categories may differ with respect to which
subset of formal features is assembled in a particular functional
category. Syntactic phenomena in a given language correspond to
particular representations of functional categories at the semantic and
phonological interface. Ferraresi and Goldbach show how the
disappearance of the sentence particle si is related to its phonological
xiv Luka Szucsich & Uwe Junghanns

status and to changes concerning the system of the syllabic structure

of the phonological word and the clitic group in Late Latin and Old
In section 1 the authors discuss the morphophonological shape of
Old French si. By extensively examining iambic, decasyllabic Old
French metrical poetry which strictly obeys the rhythmical structure
of the involved lexemes, they reach the conclusion that si is a
phonologically weak element. It may be part of clitic clusters, but
unlike object and adverbial clitics it is not an obligatory clitic.
In the second section Ferraresi and Goldbach provide evidence
against a prevailing analysis according to which si occupies a Spec-
position within the C-domain like sentential adverbs. In contrast to
the latter, si can occupy a position between a subject DP and the
main verb. This fact cannot be explained for a V2-language like Old
French, if si were an XP located in a Spec-position higher than the
main verb (if preverbal sentential adverbs are present, subject DPs
obligatorily occur postverbally). In addition, the distribution of si
largely patterns with that of Welsh particles (e.g. only clitics may
intervene between a particle and the main verb). These facts lead the
authors, who adopt Rizzi's (1997) split CP analysis, to the
conclusion that si is spelled out in a functional category, viz. Fin0.
The prosodic facts and the fact that si may be split from the main
verb only by object clitics are captured by the claim that the verbal
root together with the clitics moves through the T°-node to the Fin°-
node where the complex merges with the particle si forming a word
level unit which also constitutes a prosodic unit, viz. a clitic group
(which in Old French is identical to a phonological word). Analyzing
si as a functional category within the C-domain allows for an
explanation of the fact that it does not occur in subordinate clauses
where the same position is occupied by a different element (que).
Section 3 is devoted to explaining the loss of the particle si in
Modern French. From Old French to Early Modern French the initial
syllable of a clitic group lost its secondary stress similar to the
process in phonological words from Late Latin to Old French (the
former, thus, represents a similar systematic change). This
development went hand in hand with the apocope of the word-final
Introduction xv

schwa making the right edge of phonological words and clitic groups
prosodically strong. Since si was the initial syllable in a clitic group
subject to the described prosodic changes, it became prosodically
weak. This phonological reduction process is also attested for other
initial clitics in Modern French (e.g., [il y a] > y'a 'there is'). This
fact eventually led to the total loss of si in the 17th century. It is
likely that the disappearance of the sentence particle si was
facilitated by the existence of five homonymous particles in Old
French, especially those who had a similar surface distribution in Old
French, viz. the adverb of manner si 'so' (Modern French ainsi) and
the subordinating conjunction si 'if which is also an element of the
C-domain, although the latter does not directly compete with the
sentence particle si.
In her paper "Subject Case in Turkish Nominalized Clauses",
Jaklin Kornfilt investigates the asymmetry between adjuncts and
arguments, claiming that the argument-adjunct distinction can also
play a role in determining the Case on the subject of a particular
syntactic domain. She assumes that it is a clause's status as an
adjunct versus as an argument which can determine the type of
subject Case.
The paper is also a case study in the interactions of morphology
and syntax, as it claims that overt ^gT-(eement) determines subject
Case (but only where Agr is licensed itself in this capacity). She
shows another aspect of the morphology-syntax interaction, viz. the
absence of a one-to-one relationship between syntactic and
morphological Case: while morphological Genitive indeed reflects
licensed nominal subject Case, morphological Nominative (possibly
by virtue of being phonologically null) reflects both licensed verbal
subject Case and default Case.
Kornfilt makes the following specific proposals:
(i) Turkish has three types of overt subjects: Those that bear
genuine subject Case, those that bear default Case, and those which
are Case-less.
"Genuine subject Case" is licensed by a designated Case licenser;
for Turkish, this is the overt ^gr(eement) marker. Such subject Case
xvi Luka Szucsich & Uwe Junghanns

can be Nominative or Genitive in Turkish, depending on the

categorial features of Agr.
Default Case is possible as a last resort strategy, when subject
Case is not licensed for an overt subject, and when no other licenser
can license another appropriate Case (e.g. an ECM verb licensing
Case-less subjects are non-specific, and they are less mobile than
the other two types of subjects.
(ii) The proposed interaction between the argument-adjunct
asymmetry and the designated subject Case licenser, i.e. overt Agr, is
implemented as follows:
Agr needs to be licensed itself in order to function as a subject
Case licenser. This can happen in three ways:
A. Categorially, i.e. via matching category features: A verbal Agr
is licensed in a fully verbal extended projection, and a nominal Agr is
licensed in a fully nominal extended projection.
B. However, where there is a categorial mismatch, Agr must be
licensed differently. This is when the argument-adjunct asymmetry
comes into play:
An argument domain bears a thematic index (cf. the proposal in
Rizzi 1994 that arguments bear a "referential" index, while adjuncts
don't); this index is inherited by the Agr (if there is one) that heads
the argument domain in question. Kornfilt assumes that it is such
indexation which licenses a categorially unlicensed Agr as a subject
Case licenser. Thus, if Agr does not match its clause categorially, it is
only where that clause is an argument that Agr will be able to license
subject Case; where the domain is an adjunct, a categorially
mismatched Agr cannot license subject Case.
Thus, Kornfilt correctly predicts the existence of argument-
adjunct asymmetries with respect to subject Case in categorially
hybrid clauses, as well as the absence of such asymmetries in
categorially homogeneous clauses.
C. There is another way for a categorially mismatched Agr to
receive an index and thus to get licensed as a subject Case licenser:
via predication with an external head, i.e. when the domain headed
by that Agr receives an index via predication (in headed operator-
Introduction xvii

variable constructions like relative clauses and comparatives), and

when, once again, the Agr head inherits the index of the clause in
(iii) In all other instances (i.e. where there is no Agr, or where an
existing, but categorially unlicensed Agr cannot receive an index by
either "referential" theta-marking or under predication), no genuine
subject Case is possible. The clause will have either a PRO subject
or, if it has an overt subject, that subject will be in a default Case
rather than in a genuine subject Case. Kornfilt discusses the issue of
default Case and proposes criteria determining when default Case is
possible and when it is not. She further proposes that the
morphological realization of default Case may differ across
languages; e.g. it is Accusative in English, while it is Nominative in
(iv) Subject Case is licensed locally within the extended
functional projection of the clause; no clause-external nominal
element is involved in this licensing—at least not directly, as the
licenser of subject Case.
(v) The account is compatible with approaches where AgrΡ is an
independent projection (Pollock 1989, Kornfilt 1984), but also with
approaches where Agr is positioned within the head of another
functional projection, e.g. of the head of a Fm(iteness)jP (cf. Rizzi
1997), as long as Agr is housed in a projection separate from ΤΑΜ
(i.e. Tense, Aspect, Mood).
(vi) Kornfilt's paper is, at the same time, a case study concerning
the two most widely used nominalization types in Turkish, with
respect to genuine subject Case. The argument-adjunct asymmetry
mentioned in (ii) is observed in one type of nominalization only (i.e.
the indicative type) and not the other (i.e. the subjunctive type). The
account proposed claims that, while both types of subordinate
domains are DPs, only indicatives are also CPs. This explains the
sensitivity of indicatives to "CP-level" phenomena and to theta-
marking, and the lack of such sensitivity in non-indicative
Kornfilt's paper consists of nine sections. The introductory section
is followed by section 2 in which Kornfilt presents the two main
xviii Luka Szucsich & Uwe Junghanns

asymmetries and establishes the relevance of Agr for subject Case.

Section 3 offers a basic account of subject Case. Section 4 extends
that account to predication. Section 5 draws preliminary conclusions.
Section 6 discusses the nature of default Case. Section 7 proposes an
explanation for when default Case may or may not be allowed.
Section 8 discusses two rival approaches to the first asymmetry (i.e.
the asymmetry between arguments and adjuncts in nominalized
factive clauses) and presents counterarguments. Section 9
summarizes the conclusions and mentions some speculations.
The paper is written in a general Principles and Parameters
In her paper, Esther Rinke discusses the licensing conditions for
the omission of referential subjects in Old French (13th c.). She
observes that the distribution of null subjects is constrained in the
following way: Null subjects occur predominantly in main clauses
with an initial non subject constituent and in conjunctional
subordinate clauses which contain a preverbal topic. This has to be
To account for the facts Rinke draws on work by Mary Kato and
Luigi Rizzi.
Kato (1999) assumes that a [+pronominal] agreement system is
the prerequisite for the licensing of null subjects. The agreement
morphemes have the same grammatical status as a pronominal
subject. As a result, they are able to check the EPP-feature of a given
sentence via verb movement to AGRS°. This account has two
important consequences: (i) pro can be eliminated, (ii) The
projection of a specifier of AGRSP for EPP checking is excluded in
null subject languages for reasons of economy.
Rinke adopts Rizzi's (1997) split-CP hyphothesis. Thus, she
assumes a number of functional layers in the C-domain, viz. ForceP,
TopP, FocP, and FinP. The functional node Fin has an impact on the
IP-system in that it selects the IP. Since the specification of the IP-
system in turn is crucial for the licensing of null subjects, Rinke
proposes that the licensing of null subjects is contingent on the
realization of the functional category Fin. This category is only
available in main clauses and in conjunctional subordinate clauses
Introduction xix

with a preverbal topic, but not in other types of subordinate clauses.

This explains the distribution of null subjects in Old French.
Rinke's crucial claim is that C is split into Force and Fin only in
some types of clauses, e.g., in main clauses and conjunctional
subordinate clauses with a preverbal adverbial topic, whereas in
other types of clauses, e.g., in other subordinate clauses, Force and
Fin collapse into one node. From this she goes further to claim that
the licensing of null subjects in Old French is contingent on the
realization of Fin. She finds additional evidence in the fact that null
subjects in main clauses tend to co-occur with the sentence particle
si. Following Ferraresi and Goldbach (2001) she assumes that si is
placed in Fin.
Rinke's account supports Rizzi's claim that Fin is the functional
layer within the CP-system which selects the IP system and which
may be endowed with agreement features. However, the nature of
Fin and its precise feature composition are left for future research.
In his paper, Andrew Spencer explores the notion of "paradigm"
against the background of periphrastic or analytic constructions, i.e.
multi-word combinations that express grammatical properties such as
tense or aspect. He suggests treating them as reflexes of paradigmatic
organization within a framework of paradigm-based morphosyntax
(following Ackerman and Webelhuth 1998, Sadler and Spencer
2001, Spencer 2001), claiming that it is only by adopting the
paradigm-based approach that the correct factoring of functions can
be achieved and only a paradigm-based approach will eventually lead
to an insightful account of these constructions.
According to the paradigmatic view on periphrastic constructions,
individual function words of multi-word combinations are not
considered as lexical entries projecting their own set of features.
Instead, they represent simply formatives which bear at most
syntactic category features.
Spencer presents two sorts of evidence for the paradigm-based
view of periphrases, based mainly on the unusually rich periphrastic
system of Bulgarian: (i) There are the kinds of gaps in these
constructions that often show up in inflectional paradigms but which
should not occur in genuinely compositional syntax, (ii) There are
xx Luka Szucsich & Uwe Junghanns

instances of superexhaustivity, in which the paradigm extends

beyond what would be expected from the normal combinatorial
The author argues that Bulgarian periphrastic constructions (e.g.
those containing the /-participle of 'be') would be extremely difficult
to describe if we insisted on listing featural properties in lexical
entries for function words.
Spencer points out as an important problem to be tackled in the
course of future investigation the question of how paradigm-driven
mapping rules relate to other aspects of morphosyntax, such as
linearization, clitic placement, agreement, ellipsis, etc.
The paper consists of eight sections. Section 1 is the introduction.
In section 2, Spencer introduces the idea that periphrastic or analytic
constructions are best regarded as a kind of idiom, in which neither
of the component parts has a meaning ("constructional idioms").
Section 3 discusses the fact that morphological paradigms are
sometimes incomplete ("underexhaustivity"). In section 4, Spencer
sketches the distinction between morphological and syntactic
features (m-features vs. s-features). Section 5 introduces Ackerman's
and Webelhuth's (1998) conception of "expanded predicate". In
section 6, Spencer applies some of the discussion of section three to
periphrastic paradigms in Bulgarian and develops a new concept of
"superexhaustivity", under which periphrastic paradigms produce
more forms than would be expected from the basic combinatorics of
the syntax. Section 7 deals with the grammaticalization of clause
structure in Bulgarian. Spencer demonstrates that future tense
constructions in Bulgarian consist of constructional idioms that form
paradigms. These paradigmatic periphrases contain subordinate
clauses. Section 8 presents the conclusions.
Barbara Stiebels defines as the goal of her paper to provide a
programmatic and semantically based overview of possible affix
orders within the domain of diathesis morphology. Two main
questions need to be answered: (i) Which diathesis markers may be
combined in principle? (ii) To what extent is the resulting
morphological structure compositional, i.e. reflects the semantic
composition and structural generation of forms?
Introduction xxi

Included in the discussion are examples from the following

languages: Chichewa, Quechua, Kinyarwanda (a Bantu language),
Classical Nahuatl (an Uto-Aztecan language), Yucatec Maya, West
Greenlandic, Chamorro, and Tukang Besi.
Stiebels shows that Baker (1988) makes false predictions
concerning possible diathesis combinations. According to her, the
Mirror Principle is a violable constraint.
She discusses the various versions of the Mirror Principle and its
correlation with syntactic configurations as well as scope (Baker
1985,1988, Muysken 1986).
Stiebels introduces the proposal made by Rice (2000) who puts
emphasis on the availability of affix combinations that may receive
different scope readings. Rice distinguishes three cases of affix
combination: (i) Two affixes A and Β do not exhibit a scope relation;
therefore, no affix order concerning A and Β is preferred. Both affix
orders may be possible, or a language may arbitrarily choose one
option, (ii) Each of the two affixes may take the other one into its
scope. Therefore, both affix orders are relevant because they differ in
their scopal interpretations, (iii) The scope relation is fixed such that
only affix A may take affix Β into its scope; thus, only the order with
A being the outer morpheme is possible. It is case (ii), the
availability of two affix orders, that Siebels is most interested in. She
assumes that differences in affix orders may result from semantic or
syntactic properties.
The paper consists of seven sections.
After the introductory section, Stiebels discusses in section 2 the
compositionality of affix orders and introduces the notion of
transparent, restricted and opaque affix orders. Given that a particular
combination of two morphemes A and Β has the universal potential
for free order of application, and, hence, for the two affix orders A-B
and B-Α, one must distinguish three subcases with respect to the
resulting structures:

(i) Transparent affix order

Both affix orders occur and transparently reflect the
underlying scope relations.
xxii Luka Szucsich & Uwe Junghanns

(ii) Restricted affix order

If, due to a language-specific constraint, only one affix order
occurs which receives a surface-true, i.e. compositional
interpretation, this affix combination is restricted.
Restricted affix orders show a complete gap for a certain
morpheme combination.
(iii) Opaque affix order
A given affix order has both the compositional and the non-
compositional interpretation. The latter violates the Mirror
Principle. These affix orders are opaque.
Opaque affix orders only lack a distinct PF for one of the
two readings.
An even stronger case of opacity occurs if only one of the
potential affix orders is allowed and if this has the
interpretation of the inverse affix order, hence violates the
Mirror Principle.

Concerning the question of why non-transparent affix orders arise,

Stiebels says: "One may speculate that different types of constraints
are responsible for non-transparent affix orders: restricted affix
orders presumably result from semantic and syntactic constraints,
whereas opaque affix orders result from phonological and
morphological surface constraints that dominate a constraint such as
the Mirror Principle, or have to be explained in terms of language-
specific conditions on grammaticalization."
Section 3 presents Baker's (1988) predictions for possible
diathesis combinations and Stiebels' own analysis of diathesis
operations. She develops a new version of the Mirror Principle·. "The
affix order must mirror semantic composition." According to this
principle, the semantics of a morphologically complex word is built
up step by step from the innermost component to the outermost
component. Unlike Baker, Stiebels assumes that the Mirror Principle
is a violable constraint: opaque affix orders violate it due to some
higher-ranked constraint. Following Wunderlich (1997b) and Dixon
and Aikhenvald (2000), Stiebels distinguishes three types of
Introduction xxiii

diathesis: (a) argument extension, (b) argument reduction, and

(c) diatheses that bring about alternative argument realizations.
Section 4 is concerned with diathesis combinations that yield an
identical semantic output. Section 5 deals with diathesis
combinations that differ in semantic terms. Section 6 treats diathesis
combinations in which one of the possible orders subsumes the
inverse one.
Section 7 concludes the paper.
Diathesis markers are treated as operations that change the
Semantic Form and/or theta-grid of the base verb. The analysis does
not invoke syntactic head-to-head movement. Stiebels argues in
favour of an autonomous morphology in its interplay with semantics.
She concludes that, in principle, diathesis markers can be
combined in any order. Restricted affix orders mostly result from
language-specific constraints on linking. The only invariant diathesis
combination (causative and assistive or iteration of causatives)
exhibits the predicted transparency. Only two cases of opaque affix
orders have been attested: the combination of causative and
applicative and multiple applicatives. In both cases, argument-
extending diatheses are combined, which poses a challenge for
structural linking in most languages.
Stiebels points out that opacity and subsumptive affix orders are
problematic. It needs to be shown whether opacity is strictly local,
involving only adjacent morphemes, in order to define the way
semantic processing works. A possible solution to the problem of
subsumptive affix orders might be provided within the framework of
Bidirectional Optimality Theory (Blutner 2000).
Her final conclusion is that a typology of possible affix orders is
not easily available within Optimality Theory because diathesis
combinations interface with different modules of the grammar
(syntax, semantics, morphology, discourse factors), which might not
be evaluated in parallel.
Jochen Trommer presents an analysis of cross-linguistic person
agreement marking with transitive verbs and of direction markers in
languages like Menominee and Turkana employing a constraint-
based morphological theory, the so-called Distributed Optimality
xxiv Luka Szucsich & Uwe Junghanns

(DO). Direction markers do not correspond directly to animacy

hierarchies but are determined by universal constraints which
correspond to prominence hierarchies. The constraints are defined in
an OT-style way (they are violable and ranked).
In the second and third section, Trommer develops the basic
assumptions concerning the DO framework where he formulates the
following principles: Morphology operating at the word-level spells
out the output of syntax (locality); morphological morphemes have to
be syntactically licensed (inclusiveness); all rankings of constraints
are possible (free ranking). In DO word forms are defined as bundles
of morphosyntactic features. The basic constraints which bring
morphology to life are the so-called PARSE constraints requiring
that certain feature combinations have to be realized by lexical items.
On the other hand, there are constraints competing with PARSE
constraints, first of all BLOCK constraints which suppress the spell
out of morphemes. Additionally, PARSE constraints are linked to
prominence scales determining which affix has to be realized with
respect to a hierarchy of person features (e.g. [-3]/[+3], requiring that
a 1st or 2nd person feature has to be spelled out by an affix if
adjacent to a 3rd person head) and with respect to a syntactic (case)
hierarchy ([+Nom]/[+Acc] where subject agreement is more
prominent than object agreement). Subsequently, the author shows
how particular rankings of universal constraints determine different
patterns of person agreement markings in different languages like
Menominee, Turkana, Dumi and Quechua.
In section 4, Trommer turns to direction markers with a direct
form where the subject of a transitive verb is higher on a prominence
scale and an inverse form where the object is more prominent on that
scale. He proposes that direction markers are in fact agreement
markers. Trommer assumes a PARSE [Case] constraint as the trigger
for the morphological spell out of direction marker, since these
morphemes occur in contexts where both [+Nom] and [+Acc]
features are present in a verb form. In Menominee the distribution of
the direction markers is determined by further features (e.g.
[ianimate], [±specified], [±obviative], i.e. these features are part of
the feature specification of direction markers. Similar to the situation
Introduction xxv

with person markers, the respective PARSE constraint is linked to a

prominence hierarchy of the abovementioned features.
There are, however, languages where particular transitive
predications are not marked by direction markers. This zero marking
is asymmetrical: If there are direction markers at all, there have to be
inverse markers, but not direct markers; there are no languages with
direct markers only. In section 5, the author captures this fact by
introducing an IMPOVERISH constraint which again corresponds to
a prominence scale. Depending on the ranking of this IMPOVERISH
constraint and the PARSE [Case] constraint with respect to each
other the desirable outcome is either a language with inverse markers
only or a language with inverse and direct markers.
In the 6th section Trommer discusses alternative analyses. In
contrast to functional approaches he does not have to assume
language-specific or even construction-specific feature hierarchies to
capture problematic data where different person hierarchies seem to
be involved (cf. the selection of prefixes and direction markers in
Blackfoot). In his approach, these data are accounted for by
assuming freely ranked constraints which are linked to a prominence
scale. The correspondence relation to a prominence scale also
accounts for the fact that certain constraints, at first glance, do not
seem to be freely rankable (as assumed in other OT-style
The paper by Ilse Zimmermann is concerned with the semantics
of cases in Modern Standard Russian. Zimmermann argues that
structural cases of complements can be characterized by abstract
semantico-syntactic features which correspond to the semantic
hierarchy of argument expressions and which are systematically
interrelated with their morpho-syntactic realizations. Out of the
various adjuncts to be found in Russian, she chooses the ones with
instrumental case to illustrate her ideas. In particular, Zimmermann
shows how morpho-syntactic case features of adjuncts are
semantically interpreted and how one can account for the polysemy
of the instrumental case by assuming context-dependent
specifications of semantic parameters.
xxvi Luka Szucsich & Uwe Junghanns

Zimmermann puts special emphasis on a strict differentiation

between universal semantico-syntactic and language-specific
morpho-syntactic case features. She assumes regular
correspondences between the two types of case features.
The paper consists of six sections.
In section 1, Zimmermann outlines her objectives. The main aim
of her paper is to bring together recent developments of the two-level
semantics that has been developed by Manfred Bierwisch, the
Linking Theory of Lexical Decomposition Grammar (Wunderlich,
Stiebeis), and Roman Jakobson's case characterizations. She
addresses the following questions:

• How are case forms of complements and adjuncts interrelated

with the semantics of these constituents?
• Which types of cases must be distinguished?
• Which types of configurations and of case features are involved?
• Which complements of lexical categories count as structural
• Which rules guarantee the correct case realizations of argument
• How do adjuncts get their case and how are they interpreted
• How can one cope with the polysemy of adjunct cases?

In section 2, Zimmermann characterizes the theoretical

framework, in particular the division of labour between morphology,
syntax, and semantics and the interface role played by the argument
structure of lexical entries of the functor expressions. The theoretical
background that Zimmermann adopts comprises a minimalist
framework of sound-meaning correlation (Chomsky 1995), a
lexicalist conception of morphology (Stiebels and Wunderlich 1994;
Wunderlich and Fabri 1995; Wunderlich 1997a), and the
differentiation between Semantic Form and Conceptual Structure
(two-level semantics; cf. Bierwisch 1983; 1987, 1997; Bierwisch and
Schreuder 1992; Lang 1987,1990,1994; Dölling 1997).
Introduction xxvii

In section 3, Zimmermann demonstrates the far-reaching

parallelism of verbal constructions and their nominalizations, with
systematic case variation of structural arguments.
Section 4 is concerned with case licensing. It presents a system of
rules correlating abstract semantico-syntactic case features in the
argument structure of lexical governors and the morpho-syntactic
case features of corresponding argument expressions.
Section 5 deals with the semantics of the instrumental as adjunct
case. In subsection 5.1., Zimmermann discusses instrumental phrases
as adverbial modifiers. Sub-section 5.2. is devoted to instrumental
phrases as secondary predicates.
Section 6 summarizes the paper.
Zimmermann claims that cases of noun phrases can have semantic
import. No case is a structural or semantic case per se. DPs with
structural case are systematically related to the semantics of their
governor. DPs as adnominal or adverbial modifiers and NPs as
secondary predicates are associated with very abstract case
In her analysis, structural cases are those predictable case
realizations of complements which are correlated with the abstract
case features +/-h(igher) r(ole), +/-l(ower) r(ole) in the argument
structure of lexical governors and mirror the semantic hierarchy of
DPs as argument expressions.
In order to demonstrate DPs as adverbial modifiers, Zimmermann
analyses Russian instrumental phrases. The semantic interpretation
of the instrumental expression draws from a variety of possible
meanings. Zimmermann invokes a semantic template that provides
the morpho-syntactic case features at the level of S(emantic) F(orm)
with a semantic parameter that is specified only at the level of
C(onceptual) Structure).
Semantic templates apply in the case of another type of
instrumental expressions too—instrumental NPs that function as
secondary predicates. In this case, the template brings in a semantic
parameter to be specified at CS, depending on the context.
Zimmermann concludes her paper by stating that the
differentiation between SF and CS as two levels of semantic
xxviii Luka Szucsich & Uwe Junghanns

interpretation turns out to be fruitful with respect to the semantics of



* All papers except of those by Tania Avgustinova and Esther Rinke derive from
presentations at the abovementioned workshop organized by the editors. We
take this opportunity to thank Tibor Kiss, Jürgen Pafel and Anita Steube for
encouraging us to organize the workshop as well as all participants of the
workshop for fruitful discussion of current issues in syntax and morphology
(the volume's contributers, as well as David Adger, Artemis Alexiadou, John
Bailyn, Ursula Bredel, Gisbert Fanselow, Daniel Harbour, Andrej Kibrik, Horst
Lohnstein, Rosemarie Lühr, Gereon Müller, Jamal Ouhalla, Jean-Yves Pollock,
Adam Przepiörkowski and Dieter Wunderlich). We are indebted to Gerhild
Zybatow for her support as the main organizer of the DGfS-23 meeting and to
Artemis Alexiadou and Tracy A. Hall for including this volume in the Interface
Explorations series. Special thanks to David Adger, Piotr Banski, Martin
Haspelmath, Christian Huber, Tracy Holloway King, Iliyana Krapova, Andrd
Meinunger, Eduardo Raposo, Marga Reis, Maaike Schoorlemmer, Michal
Starke, Anita Steube, Misha Yadroff and Ilse Zimmermann for reviewing the
papers submitted for this volume. We are grateful to Sigrid Lipka, Andrew
Mclntyre and Evan Mellander for helping us with the proofreading.


Metagrammar of systematic relations: a study
with special reference to Slavic morphosyntax

Tania Avgustinova

1. Introduction

A motivation for the present study is the assumption that shared

grammatical resources for Slavic languages should, by design, take
grammatical relatedness seriously.1 While the categorisation of primi-
tive linguistic entities tends to be language-specific or even construc-
tion-specific, the relationships observable in linguistic constructions
between these entities allow various degrees of abstraction correspon-
ding to important metagrammatical generalisations. The notion of
construction is commonly assumed to refer to the syntactic arrange-
ment of patterning within a grammatical unit. Cross-linguistically,
the key role in building syntactic constructions is played by what is
interpreted as structural syntactic dependencies. This essential gene-
ralisation tends to be implicit both in modern and in traditional lin-
guistic research, and certainly deserves attention as a feasible linguis-
tic universal. But structural syntactic dependencies are linguistic
abstractions. Being not directly observable, they are externalised by
various presumably general mechanisms which compete or comple-
ment each other.
Abstract relations are locally distributed as partial information
across syntactic items, i.e. words and larger fragments ("constituents")
- consider the following statement from (Comrie and Corbett 1993):
Syntactic relationships deal with the interdependence of words in sentences
or in segments of sentences (syntactic constructions). These relationships
may be purely semantic (for example, agent, patient, beneficiary) or they
may represent different levels of linguistic structure: syntagmatic (subject,
direct or indirect object, predicate, complement) or paradigmatic (case,
gender, person). The latter enter into larger classes of morphosyntactic
relationships, known as grammatical categories ... (p. 107)
2 Tania Avgustinova

While paradigmatic relations are functional contrasts and involve dif-

ferentiation, syntagmatic relations are possibilities of combination of
interacting items chosen within a framework of rules and conventions
(both explicit and implicit), e.g., on the basis of a grammar system.
They link together elementary constituent segments or minimal signifi-
cant units which themselves belong to various paradigms. Mature lin-
guistic theories develop nowadays powerful ontologies for linguistic
objects and categories (like words, phrases, sentences) as well as rich
inventories of relations among properties of these objects (e.g., agree-
ment, subcategorisation, long-distance dependencies, etc.). Yet, no
consistent account of grammatically relevant relations can be found.
The key hypothesis put forward here is that systematic relations mo-
tivate shared patterns of variation cross-linguistically as well as across
constructions. To adequately encode the metagrammar of systematic
relations, we need an ontological level of grammatical abstraction. This
ontological level can be encoded as compatible multidimensional hier-
archies of types of linguistic entities, with constraint inheritance from
more general to more specific types. In short, the hierarchy allows for
the expression of (possibly cross-cutting) generalisations and a specifi-
cation of (in)compatibility between types. The generalisations are
captured by factoring them out as constraints on super-types, as well
as allowing types to have more than one super-type. It is worth not-
ing, however, that it is rare in such multiple inheritance hierarchies to
find all possible combinations of the super-types instantiated via mu-
tual sub-types. And this is one sense in which the type hierarchy
represents linguistically relevant (sub)generalisations.
Attempts to derive the range of possible syntactic combinations of
any two constituents from a multidimensional inheritance network of
basic relational grammatical types open new ways of classifying rela-
tions that are non-standard or marginal in most theories. A number of
these relations fall out from the proposed cross-classification in
natural and theoretically rewarding ways. The classification presented
here is originally designed to systematise the inventory of syntactic
relationships found across Slavic languages. A far-reaching outcome
of this study is promoting the view that systematic relations holding
Metagrammar of systematic relations: Slavic morphosyntax 3

between the components of syntactic constructions must be treated as

research objects in their own right.
As soon as the relations holding in syntactic constructions are or-
ganised in an HPSG-style type hierarchy, the type subsumption can
be interpreted as modelling a continuum from general - and pre-
sumably universal - systematic relations to more and still more spe-
cific instances of these relations resulting from admissible cross-
classifications. Every type in such a multidimensional hierarchy is
associated with a set of constraints which is appropriate for this par-
ticular type and is inherited by all its immediate and non-immediate
sub-types. Thus, linguistic generalisations of different nature can be
captured in a theoretically elegant way, allowing immediate formali-
sation (e.g., in HPSG) and straightforward implementation.
For the actual linguistic material, we need a representation that
would be close to the overt perceptible expressions of language (syn-
tactic items) and to the abstract relational information that they di-
rectly express (arrays of systematic relations). Therefore, in visualis-
ing the systematic relations in syntactic constructions, relational
charts will be used, a representation of linguistic examples originally
employed by (Avgustinova and Uszkoreit 2000). A relational chart is
a diagram of the type illustrated in (Figure 1) for a sentence consist-
ing of the items α, β, γ, and δ (exactly in this order). The array of
systematic relations holding between any two selected items occu-
pies, thus, the respective "crossing" cell.

Figure 1. Relational chart diagram

| item a systematic relations (α β) systematic relations (α γ) systematic relations (α δ)

item β systematic relations (β γ) systematic relations (β δ)
item γ systematic relations (γ δ)
item δ

Information of various types of constituency with respect to non-

minimal syntactic items, i.e. multiword phrases, can be encoded in
the relational chart too. This requires merging the respective cells as
sketched in (Figure 2) for the systematic relations holding between
4 Tania Avgustinova

the item α and the complex item βγ, or between the complex item βγ
and the item δ.

Figure 2. Relational chart with merged cells

item a (a( 3 γ)) (α δ)
(α β) (α γ)
item β (ßr) (β δ) ((ß?) δ)
item γ (γδ)
item δ

Yet, merging cells with respect to particular relation types does not
block the encoding of other relation types that require the same cells
to be separated - i.e. relations holding between α and β, between
α and γ, between β and δ, and between γ and δ.
In order to better illustrate the approach developed here and for
convenient reference throughout the main discussion in Section 2, all
relational charts of actually considered linguistic examples will be
given in Section 3.

2. Systematic relations

The major dimensions of classification introduced for (the arrays of)

systematic relations discernible in syntactic constructions are
sketched in (Figure 3).2 The focus of our attention will be on segmen-
tal systematic relations in terms of syntagmatics, as they play a con-
stitutive role in syntax. In accord with the traditional "form-function"
perspective in theoretical linguistics, it is important to distinguish
dimensions of observable syntagmatics (which is concerned with the
overt linguistic form) and structural syntagmatics (which is con-
cerned with the covert linguistic function). Structural syntagmatics is
crucial in interpreting the observable syntagmatic relations which, in
turn, can be classified as combinatorial (i.e. morphosyntactic) and
alignment (i.e. configurational).
The intonational and prosodic aspect is certainly important too in
establishing instant connections between linguistic entities in various
Metagrammar of systematic relations: Slavic morphosyntax 5

constructions, but due to the morphosyntactic orientation of the pre-

sent work I will not consider it here. Nevertheless, the integration of
supra-segmental systematic relations into the ontology, as well as the
accommodation of any further relevant dimension, is immediately
possible and can be performed in the same principled way.

Figure 3. Systematic relations: dimensions of classification

systematic relation

'segmental' 'supra-segmentar ;

syntagmatics intonation / prosody

<f "function" "form"

structural observable

'morphosyntactic' 'configurational'

combinatorial alignment

2.1. Observable syntagmatics

Morphosyntactic relations concern the overt realisation of structural

dependencies in various syntactic constructions.
The combinatorial dimension in the taxonomy (Figure 4) largely
corresponds, in my understanding, to what (Schmidt and Lehfeldt
1995) regard as morphological signalling of direct syntactic rela-
tions.3 It encompasses observable relations of assembling (or
"valence" in a broader sense) and co-variation (or "agreement" in a
broader sense). Assembling includes what is traditionally considered
government and juxtaposition. The former is understood as the de-
termination by one element of the inflectional form of the other (i.e.
form government; a classical instance thereof is case government),
while the latter, in contrast, presupposes no overt morphological in-
dication (its classical instance is case adjunction). The systematic co-
variation of linguistic forms is typically realised as feature congruity,
i.e. compatibility of values of identical grammatical categories of
syntactically combined linguistic items.
6 Tania Avgustinova

The alignment dimension, in turn, is responsible for the actual lin-

ear distribution of syntactically relevant items, I assume that at least
the continuity of syntactic units, the directionality of the head (or, more
generally, of a certain distinguished, syntactically significant entity)
as well as the periphery of a syntactically determined domain are
relevant dimensions of classification in (Figure 4). The continuity of
syntactic units can be realised as immediate constituency (i.e. of type
continuous) or as long-distance constituency (i.e. of type discontinu-
ous). The directionality, for example, will account for situations
where the head either follows the dependent or precedes it. In turn,
the periphery of a syntactically determined domain can be left or right.

Figure 4. Observable syntagmatics


combinatorial alignment

government juxtaposition continuous discontinuous nonX-X X-nonX left right

The configurational alignment dimension of observable syntagmatics

is of primary relevance for languages with rigid word order and im-
poverished morphology (also known as "configurational languages").
In Slavic it tends to play a secondary role, as the word order is fairly
flexible.4 Nevertheless, in certain phenomena areas the alignment
factor is crucial for Slavic too, e.g., cliticisation (Avgustinova 1997,
2000, 2002; Avgustinova and Oliva 1997) or "partial agreement"
strategies discussed for Slavic in (Corbett 1998,2000b).

2.2. Structural syntagmatics

Providing an interpretation of observable syntagmatic relations, the

structural dimension of the ontology opens a novel typological per-
Metagrammar of systematic relations: Slavic morphosyntax 7

spective, ensuring that related classes of phenomena are generalised

over as well as kept apart in a principled way. Structural syntagmatics
is encoded with respect to centricity as centric or acentric, and taxis
as hypotaxis or parataxis (Figure 5) The centricity of a structural syn-
tagmatic relation (between, e.g., α and β) presupposes that one of the
syntactic items involved in this relation (e.g., a) plays a prominent
role. In contrast, the acentricity of a structural syntagmatic relation
presupposes no assumptions in this respect, hence, it can be viewed
as the unmarked member of the centricity opposition. The hypotaxis
means that there is a dependency of subordination between the in-
volved syntactic items, while the parataxis is neutral in this respect
and is regarded as the unmarked member of the taxis opposition.
The admissible cross-classifications in the structural syntagmatics
result in distinguishing four major types of relations in (Figure 5).

Figure 5. Structural syntagmatics: centricity and taxis



centricity taxis

centric acentric hypotaxis

(07) GTT) (7T)

centric hypotaxis acentric hypotaxis centric parataxis acentric parataxis

ο 6
α Ρ

The centric hypotaxis is an 'endocentric' relational type representing

the most structurally marked option because there is a designated
(central, or leading) element as well as a subordination relation be-
tween the items involved. The most structurally unmarked option, in
turn, is the acentric parataxis which can be interpreted as an 'exo-
8 Tania Avgustinova

centric' relational type. The other possibilities include the centric

parataxis which is an 'only-centric' relational type presupposing a
designated element but no subordination, and the acentric hypotaxis
which is an 'only-hypotactic' relational type involving subordination
although none of the items is unambiguously interpretable as central.5
Due to the fact that there always is a principal or leading element
in the centric relations, different linguistic theories typically agree on
how to interpret these relations structurally. But there is no consensus
- often even within the same linguistic theory - on the structural in-
terpretation of the acentric relations. Additional factors are usually
taken into consideration as supporting the introduction of particular
conventions. The latter, however, are not always linguistically moti-
vated. Sometimes the choice is arbitrary and is often due to theory-
specific technical considerations.
Now, let us look at the syntactic notions of "dependency" and
"constituency" from the outlined perspective. It is well-known that
the dependency grammar tradition highlights the formalisation of
hypotaxis (or syntactic subordination) by virtue of incorporating es-
sential endocentricity that is independent of word order, while the
phrase structure grammar is based on a structural formalisation of
alignment, prototypically conceived of as a part-whole relation. Be-
sides, it is commonly accepted that the fundamental notions of these
two major frameworks in modern linguistics - namely, dependency
and constituency - are only appropriately interpreted as complement-
ing each other, inasmuch as the former reflects the functional mean-
ing, while the latter being directly observable is related to the form.
In the relational taxonomy, this traditional assumption is explicitly
encoded. As sketched in (Figure 6), constitutive syntactic notions like
valence (encoding the local combinatorial potential of a category),
filler-gap dependency (encoding long-distance selection), free ad-
junction, or extraposition, actually involve all dimensions of syntag-
matics that have been distinguished up to now. The HPSG interpreta-
tion of valence and free modification, for instance, presupposes con-
tinuous constituency. Discontinuity or long-distance constituency, on
the other hand, is typically handled by special extraction mechanisms.
Metagrammar ofsystematic relations: Slavic morphosyntax 9

Figure 6. Some traditional syntactic notions

systematic relation

structural observable

combinatorial alignment

centricity taxis Γ I
assembling continuity directionality peripheiy

centric acentric hypo

centric hypotaxis acentric hypotaxis government nonHd-Hd Hd-nonHd

\ _ _ _ _ _ _ _ - — — c o n t i n u o u s discontinuous / left
subcategorisation modification

' - - * I

valence free adjunction 'filler-gap' extraposition

2.3. Combinatorial syntagmatics: assembling

Looking at the ways structural syntagmatics is externalised by com-

binatorial syntagmatics helps us reveal various classes of phenomena.
The admissible cross-classifications of the structural syntagmatic
types with the assembling types gives us the result in (Figure 7) and
(Figure 8).
The traditional notion of subcategorisation can thus be viewed as
a centric hypotaxis (endocentricity) that is realised via government.
Two general options are usually available across languages for exter-
nalising the governed centric hypotactic selection (i.e. subcategorisa-
tion) of nominal categories in actual syntactic constructions. A typi-
cal definition of the first one - relational case (Ex. 1, Ex. 2, Ex. 9,
Ex. 10, Ex. 11, Ex. 13) - can be found in (Blake 1994): "Case in its
most central manifestation is a system of marking dependent nouns
for the type of relationship they bear to their heads". The second op-
tion to externalise subcategorisation of nominal categories is to cross-
reference the syntactic function of the dependent at the head. It is
10 Tania Avgustinova

actually confined to certain core grammatical relations and typically

amounts to some kind of pronominal representation of these gram-
matical relations at the head. As (Blake 1994) observes, the cross-
referencing pronominal elements serve as an alternative to case in
signalling grammatical relations. He also makes an important meth-
odological remark:
... in a language with cross-referencing pronominal representation in the
verb it makes more sense to characterise the verb as requiring certain
categories of noun phrase to match the representation on the verb ... the so-
called agreement can be described in terms of the head (the verb) controlling
the dependent (the subject noun phrase), (p. 91)

In Slavic, there are two candidates for the second type of externalis-
ing a subcategorisation. On the one hand, the verb inflection can pos-
sibly be interpreted as cross-referencing the subject function, espe-
cially in Bulgarian6 where no relational case is realised on the depen-
dent. On the other hand, pronominal clitics can cross-reference the
direct and the indirect object in Bulgarian verb complex (Avgus-
tinova 1997) as well as the possessor relation in Bulgarian noun
phrases. Therefore, the systematic relation of object cliticisation (Ex.
3, Ex. 4, Ex. 5) can be viewed as a more specific instance of cross-
referencing. In general, a nominal category representing a grammati-
cal relation that is cross-referenced at the head selecting this nominal
category need not be overtly realised. The cross-referenced noun
phrase can typically be omitted. Particularly for Bulgarian, additional
means are needed to encode cross-referencing by pronominal clitics,
inasmuch as they can be either arguments or lexical formants - cf.
(Avgustinova 1997: 38-44). An instance of Bulgarian clitic replica-
tion (also known as "clitic doubling") is analysed in (Ex.5) as involv-
ing a cross-referencing relation.
As no "case agreement" can generally be assumed, the regular
compatibility of case specifications between two syntactic items is
due to acentric hypotaxis in a governed environment. Note that the
relational case explicating case government stands in clear opposition
to the so-called concordial case in (Figure 7) which is regarded as a
typical instance of a governed modification - for an illustration cf.
(Ex. 1, Ex. 9, Ex. 10, Ex. 11, Ex. 13); note that in Bulgarian the more
Metagrammar of systematic relations: Slavic morphosyntax 11

general relation type is appropriate (Ex. 6, Ex. 8). Another instance

of governed modification is the relation of ascriptive predication
(Ex. 10), with more specific instances attributive (Ex. 12) and classi-
ficational (Ex. 13).

Figure 7. A typology of government obtained by applying multiple inheritance

systematic relation

structural observable

centricity combinatorial
centric acentric hypotaxis parataxis assembling

centric government

subcategorisation governed modification governed centric parataxis governed acentric parataxis

relational case concordial case control co-predication co-dependence

cross-referencing ascriptive predication identificational predication coupling

object cliticisation attributive classificational

The systematic relation of syntactic control - (Ex. 2, Ex. 3, Ex. 4, Ex.

5, Ex. 11, Ex. 12) is an instance of governed centric parataxis. In Bul-
garian (Ex. 3), a personal ί/α-construction corresponds to the con-
trolled infinitive in Russian (Ex 2). Another instance of governed
centric parataxis is the relation of identificational predication (Ex. 9).
The relation of co-dependence (Ex. 1) plays a key role in a number
of constructions, relating syntactic items that have a common syntac-
tic head. In the taxonomy, this relation is distinguished as an instance
of governed acentric parataxis. For another instance of the latter, the
term co-predication used to refer to the assembling relation between
co-controlled predicates, i.e. between predicative categories that are
controlled by the same item. This can be observed exclusively in lan-
12 Tania Avgustinova

guages with infinitive like Russian - e.g., in (Ex. 2) it holds between

the subcategorised infinitive (prijti 'to-come') and the secondary predi-
cative (umytymi 'washed') - but not in Bulgarian. In (Ex. 3), the secon-
dary predicative (maskirani 'disguised') is actually controlled by the
finite verb (dojdat 'they-come') of the subcategorised da-construction.
Finally, the assembling relations in non-verbal predicative construc-
tions may involve what is distinguished here as coupling (Ex. 8).
The systematic relation of marking in (Figure 8) corresponds to
centric hypotaxis that is realised via juxtaposition. It involves various
functional categories like auxiliaries (Ex. 7), particles (Ex. 7), deter-
miners, prepositions (Ex. 8, Ex. 9), copular items (Ex. 8), and con-
junctions (Ex. 3, Ex. 4, Ex. 6). In (Avgustinova 1997) it is used for
modelling the syntagmatic relations between the main verb as a syn-
tactic (and semantic) head and the possibly multiple auxiliary verbs
as markers specifying it in Bulgarian periphrastic verb forms (Ex. 7).

Figure 8. A typology ofjuxtaposition obtained by applying multiple inheritance

systematic relation

structural observable
centricity taxis combinatorial

centric acentric hypotaxis parataxis assembling

centric acentric centric acentric juxtaposition

secondary predication (case) adjunction co-mariung relativising coordination

piedicative case adjunction

Instances of juxtaposed modification are the systematic relations of

(case) adjunction (Ex. 8, Ex. 9) and secondary predication (Ex. 3,
Metagrammar of systematic relations: Slavic morphosyntax 13

Ex. 4, Ex. 5), with the predicative case adjunction (Ex. 2, Ex. 11)
being a more specific instance of the latter.
What I propose to distinguish as co-marking (Ex. 9) is a subtype
ofjuxtaposed centric parataxis. It contrasts with the systematic rela-
tion of marking presented in (Figure 8) along the taxis dimension of
structural syntagmatics, as there is no subordination relation between
the involved syntactic items. Another distinguished subtype is the
relativising relation in relative-clause constructions (Ex. 8).
A prototypical instance ofjuxtaposed acentric parataxis is the co-
ordination relation (Ex. 6) To better understand this possible view of
coordination, however, let us consider the string "Χ, Y and Z", which
is an example of an awd-coordination involving three conjuncts. On
the assumption that the coordinating conjunction and the last (right-
most) conjunct are in a marking relation, the coordination actually
holds between the following syntactic items: X [MARKING none] - Y
[MARKING none] - Ζ [MARKING and]. All we need then is a constraint
ensuring that marked conjuncts (e.g., Z) always linearly follow the
unmarked ones (e.g., X or Y). This can be encoded trivially, e.g., as a
part of the specification of the type coordination.

2.4. Combinatorial syntagmatics: co-variation

Agreement phenomena are instances of co-variation of linguistic

forms which is typically realised as feature congruity, i.e. compatibil-
ity of values of identical grammatical categories of syntactically
combined linguistic items. Agreement is a relatively well-researched
topic, especially in Slavic linguistics, cf. (Corbett 2000a). However,
the investigations have mainly concentrated on the linguistic items
themselves (as agreement sources) and on the relevant properties of
these items (in terms of agreement features and conditions). The na-
ture of the relations holding between the "agreeing" items has not
received proper attention yet. As a result of admissible cross-
classifications in the proposed ontology, all known forms of agree-
ment are obtained automatically as well as novel concepts of co-
variation are predicted.
14 Tania Avgustinova

A directional (trigger-target) understanding of co-variation char-

acterises the approach of (Corbett 1998):
We shall call the element which determines the agreement (say the subject
noun phrase) the controller. The element whose form is determined by
agreement is the target. ... As these terms suggest, there is a clear intuition
that agreement is asymmetric ...
Indeed, the centricity dimension of co-variation appears to be essen-
tial in classifying observable agreement phenomena. Taking into con-
sideration how the sources of co-variation (i.e. the 'agreeing' items)
are related to each other, we can distinguish two major types of co-
variation: asymmetric and balanced (distributed).

Figure 9. Typology of co-variation obtained by applying multiple inheritance

systematic relation

structural observable

(concord) (accord)

The asymmetric co-variation is centric, which actually corresponds to

the traditional directional concept. One of the two co-variation sources
is unambiguously interpretable as the trigger and the other one as the
target of this relation. More specifically, the trigger-target configura-
tion can be unidirectional, if all co-varying grammatical categories
are triggered at the same item, or unstipulated, if the items involved
trigger different co-varying grammatical categories. The balanced
(distributed) co-variation, in contrast, is acentric and, therefore, not
Metagrammar of systematic relations: Slavic morphosyntax 15

interpretable in such directional terms. Intuitively, both co-variation

sources are often interpretable as co-targets of an external trigger.
As a result of admissible cross-classifications with the structural
taxis dimension of the hierarchy in (Figure 9), six classes of co-
variation phenomena are predicted.
Agreement 1: this is hypotactic unidirectional co-variation. It
holds, e.g., in number and gender between the verb (okazalas' 'turned
out') and its subject (ona 'she'), or just in number between the same
verb and its complement (rebenkom 'child') in (Ex. 1). Co-variation
in person, number and gender of the same type also holds between
the verbal clitic pronoun (ja 'her') cliticized on the verb (vidjaxa
'saw') and the object (Maria 'Mary') cross-referenced by this clitic in
(Ex. 5). The trigger of the discussed co-variation is the nominal ele-
ment, and the target is the verb or the clitic pronoun, respectively.
Agreement 2 (concord): this is a hypotactic unstipulated co-
variation. Its prototypical instance can be found within nominal
phrases, e.g., holding in number and gender between the adjective
(zdorovym 'healthy') and the noun (rebenkom 'child') in (Ex. 1). The
trigger is the noun and the target is the adjective.
Co-reference: this is a paratactic unidirectional co-variation. In
(Ex. 5) it holds in number and gender between the object (Maria
'Mary') and the predicative adjective controlled by it (maskirana
'disguised'), but also between the verbal clitic (ja 'her') cross-
referencing the object and the predicative adjective. The co-variation
trigger here is the object noun or the verbal object clitic, respectively,
while the target in both cases is the predicative adjective.
Agreement 3 (accord): this is a paratactic unstipulated co-
variation. It holds in number between the subject (ona 'she') and the
complement (rebenkom 'child') which are co-dependents of the same
verb (okazalas' 'turned out') in (Ex. 1). The trigger of co-variation is
the subject, while the complement presents the co-variation target.
Matching: this is a hypotactic balanced co-variation. Its prototypi-
cal instance is the compatibility between the auxiliaries and the main
verb in periphrastic forms). As discussed in (Avgustinova 1997), the
person-number-gender information in Bulgarian analytic (i.e. peri-
phrastic) verb forms can be distributed among several components,
16 Tania Avgustinova

namely, the main verb itself and a set of auxiliaries functioning as

markers to it. The analytic verb form in (Ex. 7) consists of two
auxiliaries, a particle and a main verb (si stjala da dojdes
'come.FUTURE.RENARRATiVE.2SG.F'). In fact, the balanced co-
variation relation of matching holds in all three grammatical catego-
ries between the 2nd person singular auxiliary (si) and the singular
feminine auxiliary participle (stjala), as well as between this combi-
nation of auxiliaries (si stjala) and the 2nd person singular main verb
(dojdes 'come').
Correlation: this is a paratactic balanced co-variation. It is typi-
cally observed in relative clause constructions. In (Ex. 8) it holds be-
tween the relative pronoun (kojato 'which') and the noun (knigata
'the book') modified by the relative clause. The observed compatibil-
ity encompasses all three grammatical categories, i.e. person, number
and gender.

2.5. Formal co-occurrence of combinatorial relations

In a given array of systematic relations relating the same two units,

there can be convergence of different structural relation types which
typically results in tied assembling.7
Two government relations cannot be combined in the same array
only if they are both acentric or they are both paratactic. The following
combinations can be extensively exemplified by in the Slavic data:
- subcategorisation and governed acentric parataxis, e.g., controlled
subcategorisation (Ex. 2, Ex. 3);
- subcategorisation and governed modification, e.g., relational case
in ascriptive predication (Ex. 10, Ex. 13);
- governed modification and governed centric parataxis, e.g., con-
cordial case in identificational predication (Ex. 9), controlled
concordial case (Ex. 11), or controlled attributive ascriptive
predication (Ex. 12).
On the other hand, when a government relation is tied with a juxtapo-
sition relation, the only requirement is that the former must be centric
and the latter hypotactic. This in particular means that subcategorisa-
Metagrammar of systematic relations: Slavic morphosyntax 17

tion can be tied with marking, e.g., preposition-noun combinations

(Ex. 9), while juxtaposed modification can co-occur with governed
centric parataxis, e.g., control secondary predication (Ex. 4) or con-
trolled predicative case adjunction (Ex. 2).
Cross-linguistically and across constructions, an affinity of assem-
bling and co-variation relations is attested. It appears to depend on
structural centricity, since either both relations involved are centric or
at least one of them is. The actual co-occurrence in a given array of
systematic relations, however, requires that both relations have the
same type of taxis. So, unidirectional asymmetric co-variation (cen-
tric relation) co-occurs with centric government as agreement 1 and
subcategorisation (hypotactic relations), or co-reference and gov-
erned centric parataxis (paratactic relations). On the other hand, un-
stipulated asymmetric co-variation (centric relation) co-occurs with
acentric government as agreement 2 (concord) and governed modifi-
cation (hypotactic relations), or agreement 3 (accord) and governed
acentric parataxis (paratactic relations). Finally, balanced co-
variation (acentric relation) co-occurs with centric juxtaposition as
matching and marking (hypotactic relations), or correlation and jux-
taposed acentric parataxis (paratactic relations).
Note that with tied assembling relations, such affinities are
resolved in favour of a more general common type, e.g., co-variation
in (Ex. 2, Ex. 3, Ex. 4), and asymmetric co-variation in (Ex. 9, Ex.
11, Ex. 12, Ex. 13).

3. Sample relational charts

For the sake of illustration, I will take two Slavic languages which, in
many respects, are traditionally regarded as incorporating existent
morphosyntactic extremes within the Slavic language family: Russian
and Bulgarian.
18 Tania Avgustinova

Ex. 1 'She turned out a healthy child.' (Russian)

Ona rel-case [NOM] co-dependence

she.NOM.3SG.F agrl [SG.F] agr3 (accord) [SG]
okazalas' rel-case [INST]
turned. SG.F agrl [SG]
zdorovym con-case [INST]
healthy.lNST.SG.M agr2 (concord) [SG.M]

Ex. 2 'They promised her to come washed.' (Russian)

Oni rel-case [NOM] control control

they.NOM.3PL agrl [PL] co-reference [PL]
obeScali rel-case [DAT] subcat [INF] pred-case-adiunction [INST]
promised.PL control control
co-variation [PL]
prijti co-predication

Ex. 3 'They promised me to come disguised.' (Bulgarian)

Obestaxa obj-cliticisation [ACC] subcat [da-construction]

promised.3PL control
co-reference [3PL]
da marking [da]
dojdat secondary predication
come.3PL control
co-variation [PL]
Metagrammar of systematic relations: Slavic morphosyntax 19

Ex. 4 'They told me to come disguised.' (Bulgarian)

Kazaxa obj-cliticisation [DAT] subcat [da-construction]

mi control
DAT.lSG co-reference [lSG]
da marking [da]
dojda secondary predication
come.lSG control
co-variation [SG.F]

Ex. 5 '(They) saw Mary disguised.' (Bulgarian)

Maria cross-referencing subcat control

Maria.3SG.F agrl [SG.F] co-reference [SG.F]
ja obj-cliticisation [ACC] control
ACC.3SG.F co-reference [SG.F]
vidjaxa secondary predication

Ex. 6 'The small andfluffy kitten sleeps.' (Bulgarian)

Malkoto coordination governed modification

Small-DEF.SG.N agr2 (concord) [SG.N]
i marking [i] subcat
and agrl [3SG]
puxkavo governed modification
flufly.SG.N agr2 (concord) [SG.N]
sleep. 3 SG
20 Tania Avgustinova

Ex. 7 'You would come, reportedly.' (Bulgarian)

Ti subcat
you.2sG agrl [2SG
si marking
AUX.2SG matching [2SG.F] marking
stjala matching [2SG.F]
da marking [da]

Ex. 8 'The book which we bought is on the upper bookshelf.' (Bulgarian)

Knigata, adjunction subcat coupling

book.3sG.F agrl f3SGl
correlation [3SG.F]
kojato subcat
bought. 1 PL
e marking
na marking [na]
gornija governed modification
upper, SG.M agr2 (concord) [SG.M]
Metagrammar of systematic relations: Slavic morphosyntax 21

Ex. 9 'We suffer for you as for a son.' (Russian)

Stradaem subcat [prep-ACC] adjunction

suffer. 1 PL
za marking [zal co-marking
for rel-case [ACC]
tebja identificational-prd
you.ACC.2sG con-case TACCI
asymmetric co-variation [SG.M]
kak marking
za marking fzal
for rel-case [ACC]

Ex. 10 'They brought a whole box of books.' (Russian)

Knig ascriptive-prd
book.GEN PL rel-case [GEN]
privezli rel-case [NOM] rel-case [ACC]
brought.PL agrl [PL]
celuju con-case [ACC]
whole.ACC.SG.F agr2 (concord) [SG.F]

Ex. 11 Ί found him alone.' (Russian)

Ja rel-case [NOM]
I.NOM. lSG agrl [PL]
zastal rel-case [ACC] pred-case-adjunction [ACC]
ego con-case fACCl
he.ACC.3sG.M control
asymmetric co-variation [SG.M]
22 Tania Avgustinova

Ex. 12 'Wepainted the wall white.' (Bulgarian)

Stenata subcat ascriptive-prd-attributive

Wall.SG.F.DEF control
asymmetric co-variation fSG.F]
bojadisaxme secondary predication
painted. 1 PL
white, SG.F

Ex. 13 7vanov is a proud father.' (Russian)

Ivanov ascriptive-prd-classificational
Ivanov.NOM.SG.M rel-case fNOM]
asymmetric co-variation [SG.M]
gordyj con-case [NOM]
proud.NOM.SG.M agr2 (concord) [SG.M]

4. Conclusions and outlook

The idea of a metagrammar of systematic relations promoted in this

contribution offers a novel perspective on grammar modelling in
general. Languages with rich morphology can be distinguished from
those with impoverished morphology in a principled way, and yet,
linguistic generalisations of various degrees of abstraction are directly
expressed. Languages differ typologically with respect to the prevai-
ling means of externalisation of structural syntagmatics. On the basis
of Slavic data, I have shown how a domain ontology can concep-
tualise morphosyntactic "building blocks" and thus shared patterns of
variation cross-linguistically as well as across constructions.
Admissible co-occurrences of relations holding between the same
two items have been considered in section 2.5. Still, a systematic inves-
tigation is called for to determine constraints that prevent certain cross-
classifications among the members of the proposed type hierarchy.
Traditional grammatical notions like subcategorisation, modifi-
cation, marking, agreement, control, coordination, etc. are consis-
tently interpretable on the basis of the proposed ontology. Therefore,
Metagrammar of systematic relations: Slavic morphosyntax 23

a challenging topic for further investigation is the potential of such a

metagrammar to serve as an interlingua between different theories of
morphology and syntax.


1. The research presented here is part of the author's Habilitationsprojekt funded

by the German sciences foundation (DFG). I am grateful to the anonymous
reviewer of this paper for helpful comments and suggestions.
2. The different shapes of edges connecting types in the graphical representation
of hierarchies are significant. The 'square' edges indicate conjunction of types
partitioning their super-type along various dimensions. The 'direct' edges
indicate disjunction of types within the respective dimension. Cross-
classifications encoding multiple inheritance are permitted with disjunctive but
not with conjunctive types.
3. The cited work offers a critical evaluation of existing approaches - both in
contemporary general linguistics and in Russian grammar tradition - to the
concepts of agreement, government and juxtaposition. The traditional Russian
terminology for these relation types is soglasovanie, upravlenie and primykanie.
4. Also for space reasons, I will concentrate on the combinatorial (rather than
alignment) relations when discussing the linguistic examples below.
5. Hypotaxis is a key notion in X-bar syntax. Note that from the outlined
perspective, the bar-level promoting relations are centric, while the bar-level
preserving relations are acentric. Parataxis, in turn, is crucial for what can be
called "mediation scheme". The X-bar mechanism is irrelevant for parataxis, as
the latter is generally not interpretable in terms of subordination.
6. All observations and generalisations about Bulgarian, on which our analysis is
based, hold in full extent for the closely related standard Macedonian language.
7. The co-occurring assembling relations are underlined in the relational charts of
the relevant examples in Section 3.


Avgustinova, Tania
1997 Word Order and Clitics in Bulgarian. (Saarbrücken Dissertations
in Computational Linguistics and Language Technology 5.)
Saarbrücken: Universität des Saarlandes / DFKI.
24 Tania Avgustinova

Avgustinova, Tania
2000 Gaining the perspective of language-family oriented grammar
design: predicative special clitics in Slavic. In: Piotr Banski and
Adam Przepiörkowski (eds.) First Conference on Generative
Linguistics in Poland GLiP-1, 5-14, Warsaw: Polish Academy of
Avgustinova, Tania
2002 Clustering clitics in Bulgarian nominal constituents. In: Peter
Kosta and Jens Frasek (eds.) Current Approaches to Formal
Slavic Linguistics, 63-72 (Linguistik international 9.) Frankfurt
am Main: Peter Lang.
Avgustinova, Tania and Karel Oliva
1997 On the nature of the Wackernagel position in Czech. In: Uwe
Junghanns and Gerhild Zybatow (eds.) Formale Slavistik, 25-47,
Frankfurt am Main: Vervuert.
Avgustinova, Tania and Hans Uszkoreit
2000 An ontology of systematic relations for a shared grammar of
Slavic. In: Proceedings of the 18th International Conference on
Computational Linguistics COLING'2000, Saarbrücken, 28-34.
Blake, Barry J.
1994 Case. (Cambridge Textbooks in Linguistics). Cambridge:
Cambridge University Press.
Comrie, Bernard and Greville G. Corbett (eds.)
1993 The Slavonic Languages. London/New York: Routledge.
Corbett, Greville G.
1998 Agreement in Slavic (position paper). Ms. (Workshop on
Comparative Slavic Morphosyntax, Indiana University) [website: linguistics/download.html]
Corbett, Greville G.
2000a Agreement in the Slavonic Languages: A Provisional Biblio-
graphy. [website:]
Corbett, Greville G.
2000b Number. (Cambridge Textbooks in Linguistics). Cambridge:
Cambridge University Press.
Schmidt, Peter and Werner Lehfeldt
1995 Kongruenz - Rektion - Adjunktion. [Agreement - Government -
Adjunction] (Specimina Philologiae Slavicae) München: Otto
On-line morphology: The morphosyntax of
Hungarian verbal inflection*
Huba Bartos

1. Introduction

1.1. Aims and claims

This paper investigates apparent violations of the generalization

widely known as the Mirror Principle (Baker 1985) in the domain of
Hungarian verbal inflection, and proposes a model of the syntax-
morphology interface to deal with the problems, in which the
morphological derivation runs on-line, parallel with the syntactic
one, building morphological structures in a strictly cyclic fashion,
following the sequence of syntactic operations step by step. In certain
key respects, it is similar to the model of Frampton and Gutmann
(1998, 1999), although the two models have been devised along
different lines, with crucially different considerations and purposes in
mind, treating entirely unrelated data.
The Mirror Principle is seemingly violated in cases where a single
affix order is paired with two different scope orders of the functional
categories represented by the relevant affixes, and on the usual
assumptions about the isomorphy of syntactic and semantic
composition and constituency this means that two different syntactic
structures correspond to one and the same morphological structure. It
will be shown, however, that a derivational conception of
morphology mirroring syntax, as defined in the model proposed, can
account for the problematic data, and predicts such apparent
mismatches precisely in those cases where they are attested in
Hungarian. Therefore one main claim of this paper is that the valid
effects of the Mirror Principle follow from the proposed architecture
of the grammar, with morphology "shadowing" syntax to the extent
that its internal properties (different from those of syntax) allow this.
26 Huba Bartos

So all deviations from strict mirroring result from the different

internal properties of the two modules of grammar: syntax and
Another claim of the paper is that the model extends to verbal
complex formation: the very same machinery used for inflectional
affixation underlies the creation of the so-called "roll-up" verb clusters
of Hungarian, analysed in detail by Koopman and Szabolcsi (2000).

1.2. Theoretical background assumptions

The major assumptions that will be at stake when considering the

problematic data, given in (la-c) below, are shared by most theories
falling within the Chomskyan tradition of generative syntax' - in fact,
the first two count as standard for the vast majority of these theories:


Scope is represented syntactically by c-command: Xis in
the scope of Y iff Y c-commands X. Asymmetric scope
relations are due to asymmetric c-command.
The ordering of functional projectional layers is
invariable within a language.
(Perhaps also invariable universally, cf. the Universal
Base hypothesis of Cinque 1999)
c. THE MIRROR PRINCIPLE (Baker 1 9 8 5 )
Morphological derivations directly reflect syntactic
derivations and vice versa.

(la, b, c) together predict that when we find invariant affix order,

there must be an invariant projectional hierarchy behind it, so there
should be no space for scopal ambiguities — modulo covert syntactic
rearrangement, which has a disputed, even dubious, status within
current minimalist theory, cf. the ever-changing view of the nature of
covert movement (full copy without phonological content vs. pure
feature movement vs. long-distance agreement) in Chomsky's most
recent works: Chomsky (1995,1998,1999,2001), as well as Kayne's
On-line Morphology: Hungarian Verbal Inflection 27

(1998) claim that (at least scope-oriented) covert movement does not
exist at all. We will therefore seek a solution that does not make use
of covert movement (as distinct from overt movement)2 of the
functional categories displaying scope variance, although
morphology creates the illusion of covert movement by not letting
certain rearrangement information pass through to the PF interface.
The next section will present data from Hungarian which do not
appear to conform to the above prediction.

2. Hungarian data

In Hungarian, there are five inflectional categories which can be

marked on a verb form by affixes:

- mood (M) - declarative (unmarked), conditional

(-nA), imperative/subjunctive (-_/)
- tense (T) - present (unmarked), past (-t(t))
- modality (Mod) - epistemic/deontic possibility (-hAt)
- subject agreement (Agrs) - α number β person (full paradigm of
suffixes; 3sg is usually unmarked)
- object agreement (Agr0) - indefinite (unmarked), definite (~(j)a/

Of these five, the first three are contentful, interpretable

categories, and they will be the subject matter of our investigation
henceforth. The two agreement categories do not participate in scope
relations at all, and they are morphologically peripheral, too,
occurring outside the other three affixes on a single verb form. They
will only be treated to the extent that it is necessary for the
presentation of the analyses. Moreover, since the unmarked items are
impossible to detect on the surface phonological forms, we will use
marked values, and since the imperative/subjunctive never cooccurs
with overt tense marking (i.e. it is confined to the unmarked present
tense), the only full set of items we can use is: possibility modality
(poss), past tense (past), conditional mood (cond). The invariable
affix order is V-poss-past-cond-Agr0-Agrs, but when past and cond
28 Huba Bartos

cooccur, the latter must be borne by a dummy V-stem (vol-), for

phonological reasons to be discussed below, so the verb form will be
split into two: V-poss-past-Agr0-Agrs Vdummy-cond.
Now we examine all the possible combinations of the three con-
tentful inflectional categories, with illustrative example sentences.
First, (2) shows poss and past cooccurring, while (3) and (4) give the
two scopally different readings associated with the verb form in (2):

(2) Vär-hat-t-ak.
'They could/were allowed to wait.' or 'They may (possibly)
have waited.'

(3) a. It WAS [POSSIBLE [for them to wait]. PAST [POSS [...]]

b. Az eliteltek csak az udvaron vdrhattak
the convicts only the yard-on wait-poss-past-3pl
α lätogatökra.
the visitors-for
'The convicts could / were allowed to wait for the visitors
only in the yard.'

(4) a. It is POSSIBLE [that they waitED]. POSS [PAST [...]]

Mod > Τ
b. k talän α mäsik kapunäl vdrhattak,
they perhaps the other gate-at wait-poss-past-3pl
ezert kerültük el egymäst.
so miss-past-1 pi asp each-other
'They may perhaps have waited at the other gate, that's
why we missed each other.'

Clearly, the form in (2) has two scope interpretation possibilities.

If, as generally assumed, the morpheme order is the mirror image of
the order of syntactic projections, then the Τ > Mod scope relation is
the straight order, while the Mod > Τ scope relation is the inverse, i.e.
unexpected one.
On-line Morphology: Hungarian Verbal Inflection 29

These data pose an immediate problem for the combined validity

of (la, b, c). A single morpheme order presupposes a single, invariant
projectional hierarchy by (lc), in accordance with (lb), but variable
scope order necessitates variable syntactic hierarchy by (la), not
reflected here in morpheme order, in defiance of (lc). In the
particular case of (2), for example, the V-poss-past morpheme order
suggests that the projectional hierarchy is "T [ Mod [ V ]]", but this
warrants only the Τ > Mod scope relation, while the existence of the
Mod>T scope order suggests some syntactic configuration where
Mod c-commands Τ at some level. Since (la) enjoys the widest
currency of the three, and it is often taken to be axiomatic (assumed
as a principle, cf. Aoun and Li's (1993) Scope Principle), and (lb) is
empirically well-motivated,3 we must loosen up (lc), or more
precisely, since (lc) is "merely" a generalization (which may or may
not be universally valid), we must derive (lc) in such a way as to
make room for the observed variability. The alternatives would be (i)
to resort to covert rearrangement, dispreferred on theoretical grounds
(see above), or (ii) to seek a semantic solution, relegating the issue of
scope inversion to semantics. But the inspection of further data
suggests that such an approach may be even more problematic.
Next, consider the case of cond and past cooccurring. (5) shows
the verb form, and (6, 7) the two possible scope orders:

(5) Vdr-t-ak vol-na.

wait-past-3pl Vdum-cond
'They would have waited.' or 'They wished to wait.'

(6) a. It WOULD be the case [that they waitED].

COND [PAST [...]] M>T
b. Az rök vdrtak volna, ha mondtad
the guards wait-past-3pl Vdum-cond if tell-past-2sg
volna nekik.
Vdum-cond to-them
'The guards would have waited if you had told them to.'

(7) a. It WAS the case [that they WOULD wait].

PAST [COND [...]] Τ> Μ
30 Huba Bartos

b. A vendegek igazän värtak volna meg,

the guests really wait-past-3pl Vdum-cond still
de tul kes volt.
but too late was-3sg
'The guests really would have waited / wished to wait
still, but it was too late.'

This, again, looks very much like the case of Τ and Mod: both the
"straight" and the "inverted" scope order is available, for the same
morphemic sequence. However, as it turns out, while this too is an
example of scope variance, it is still unlike the situation we have just
seen for Τ and Mod — this is not a simple case of scope inversion, in
the sense that while (6) presents a clear case of Μ scoping over T, (7)
is a rather more complicated matter, where what we perceive at first
sight as narrow-scope mood is in fact some sort of modality.
Nevertheless, I wish to extend my general analysis to this type, as
well. More details about the data will be presented, along with the
proposed treatment, in section 4.3.2.
Consider the third possible pair: conditional mood and possibility
combined, as in (8). (9) gives the straight scope reading, but (10)
shows that scope inversion is unavailable here, unlike in the former
two cases.4

(8) Vär-hat-nä-nak.
'They could wait.' / *'They possibly wish to wait.'

(9) a. It WOULD be [POSSIBLE[for them to wait]].

COND [POSS [...]] M>Mod
b. Ha sziikseges volna, värhatnänak idebent.
if necessary would-be-3sg wait-poss-cond-3pl in-here
'They could wait in here if it was necessary.'

(10) a. *It is POSSIBLE [that they WOULD wait].

*POSS [COND [...]] *Mod > Μ
On-line Morphology: Hungarian Verbal Inflection 31

b. #Nem ertem, miert nem mennek haza;

not understand-lsg why not go-3pl home
talän vdrhatndnak?
perhaps wait-poss-cond-3pl
Ί don't see why they don't go home; maybe they would

If Mod can invert with T, and Τ can invert with M, why can Mod
not invert with M? One might find some semantic factor, though, to
rule out the inversion on semantic grounds. However, what is the
simplest and most straightforward case for a syntactic solution (as
will be shown presently) is the worst case for the semanticist. When
all the three contentful inflectional categories are marked on a verb
form, only one scope order is attested, the full straight order
(Μ > Τ > Mod), even though in principle there could be as many as
six scopal variants, and even if we take into account the lack of scope
inversion between Μ and Mod, in the light of the attested inversions
in (4) and (7), we would still expect four different readings:

(11) Vdr-hat-t-ak vol-na.

wait-poss-past-3pl Vdum-cond
'They could have waited.'

(12) a. It WOULD be the case [that it WAS [POSSIBLE [for them to

wait]]]. Μ > Τ > Mod
b. A firik vdrhattak volna, ha valaki
the boys wait-poss-past-3pl Vdum-cond if someone
szolt volna nekik.
tell-past-3sg Vdum-cond to-them
'The boys could have waited if someone/anyone had told
them to.'

(13) a. Μ > Τ > Mod b. *M > Mod > Τ

c. * T > M > Mod d. *T > Mod > Μ
e. *Mod > Μ > Τ f. *Mod > Τ > Μ
32 Huba Bartos

It is obviously not an easy task to find a semantic explanation for

the loss of scope variance in this case, as compared to the cases in (2)
and (5). In the model presented and defended here, however, this case
will simply fall out as a natural outcome. For this to be seen, the next
sections will introduce the model of morphosyntax, and provide the
analyses for the above data.
As a final note to this section, let me dwell briefly on the nature of
modality represented by -hAt. Kiefer (1981) gave a detailed inventory
of the types of modality associated with this suffix. They can
nevertheless be classified under the two widely used terms: root vs.
epistemic modality (the former comprising deontic, circumstantial,
dispositional, etc. modalities). Now, as the few illustrative examples
above also show, generally speaking, the epistemic reading of -hAt
and wide scope Mod (i.e. Mod > T) go hand in hand, while the root
reading corresponds to the narrow scope case. This might lead us to
posit two distinct Mod domains, one above Τ (call it Modepist), and
another one below Τ ( M o d ^ . However, on the one hand this would
still beg the question of why/how the same affix order arises with
both types of Mod, and on the other hand, the generalization is not
fully true. While root modality never scopes over T, epistemic
possibility may have narrow scope with respect to T, as e.g. in (14):

(14) St. Germainben tegnap hovihar volt, tehdt Pdrizsban

St. Germain-in yesterday blizzard was so Paris-in
is havaz-hat-ott.
also snow-poss-past-3sg
'There was a blizzard yesterday in St. Germain, so it could be
snowing in Paris, too.'

Weather predicates cannot normally be interpreted with root

modality, since natural phenomena do not conform to obligations/
permissions, do not have wishes or dispositions, etc., so any sort of
root reading is out. Then this must be epistemic modality, but it is
embedded under past tense: the primary interpretation of the second
clause is not "It is POSSIBLE [that it snowED in Paris]" but rather:
"it WAS a case of [being POSSIBLE [for it to snow in Paris]]".
Although the distinction is subtle, the two cases can still be kept
On-line Morphology: Hungarian Verbal Inflection 33

apart, and havazhatott in (14) is interpreted with Τ > Mod.3 Therefore

I will not pursue here the line of splitting Mod apart in syntax.

3. On-line morphology

The model of syntax and morphology advocated here is essentially

the same as the one presented in Bartos (2000)6 — in essence, it is the
seemingly somewhat strange marriage of a derivational, minimalist
system of structure building (Chomsky 1995, 1999, 2001) and the
antilexicalist, late-insertion model of Distributed Morphology (Halle
and Marantz 1993, Marantz 1997). I assume, in line with mainstream
minimalism, that syntactic structure building proceeds bottom-up,
cyclically, by the operation Merge, and displacement of elements
(Move) is triggered by attraction, the aim of which is the satisfaction
(though not necessarily the checking) of some feature. But I follow
Distributed Morphology in assuming that words are assembled in the
course of derivation, rather than lexically, and that the actual sound-
shapes are inserted in/after morphology, so they are not present in
syntax. Moreover, I propose that the morphological derivation runs
on-line with syntax: each step of the syntactic derivation is imme-
diately scanned by morphology, and if that step has any consequence
for word-structure, the corresponding morphological operation is
carried out promptly, before the next step is taken in syntax. So for
instance if two heads are combined in syntax, e.g. by adjunction, a
word-domain comprising the morphological exponents of the two
heads is immediately established in morphology. Feedback from
morphology to syntax, and concomitant morphologically driven
syntactic repair is possible, too.
Beyond the general minimalist operations, I introduce an opera-
tion called morphosyntactic merger, which resembles (and replaces)
Halle and Marantz's (1993) morphological merger. The crucial
difference is that morphosyntactic merger includes chain formation
between the participant items, so it also has syntactic consequences,
beside the obvious word-formational ones, and therefore it takes
place in syntax, rather than at the level of morphology. The
introduction of this operation should not be seen as extra cost, further
34 Huba Bartos

complication of syntax, however, since it replaces head movement in

all those cases where it would be carried out to attach affixes to a
stem (or, equivalently, where the checking of an affixal feature would
be the only motivation of head movement). In fact, this should be
seen as an improvement on frameworks applying adjunctive head-
movement, because head-movement has certain problematic
properties, excluding this type of movement from the general family
of movement operations — for a discussion see Brody (1997).
The exact definition of morphosyntactic merger is as follows:


A [+affix] category X can morphosyntactically merge with a
potential stem Y under structural adjacency. This yields a
word domain {x, y} (where χ realizes X, and y realizes Y) at
the level of morphology, and a chain <X, Y> in syntax.

(16) STRUCTURAL ADJACENCY (cf. also Frampton and Gutmann

1998, 1999)
X and Y are structurally adjacent iff
(i) X c-commands Y, and
(ii) there is no Z, such that the projectional status of Ζ is
identical to that of X, and X c-commands Z, and Ζ c-
commands Y

As can be seen from these definitions, morphosyntactic merger is

strictly local, thereby incorporating the effects of the head movement
constraint even in the absence of movement as such. The reference to
"projectional status" in the definition of structural adjacency
concerns the contextually determined maximality/minimality of
projections, in the sense of Chomsky (1995): a projection is minimal,
if it contains only word-level terms, and it is maximal if it is
immediately dominated by a categorially distinct projection. The
status identity requirement in (16ii) ensures that items which are both
minimal and maximal projections at the same time (such as certain
adverbs), do not count as blockers of structural adjacency for purely
minimal or purely maximal projections.7
On-line Morphology: Hungarian Verbal Inflection 35

Once morphosyntactic merger (or in fact any other word-forming

operation) joins two elements, the order of their morphological
exponents will be determined by their lexical properties: if for
instance in a word-domain {x, y} Λ: is lexically specified as a suffix,
it will follow y (the stem) in the surface phonological form and if χ is
a prefix, it will precede y. Furthermore, since morphosyntactic
merger is a binary operation, items are joined pairwise, and each step
is a new cycle, so in the word structure every new item can only be
added to the edge" of the word-form created by the previous step. In
other words, bracket erasure obtains after every step in the
morphological derivation (a standard assumption about morphology
in general), so once a word-domain w, is embedded into a larger
word-domain w2, the inner structure of wl becomes unavailable for
any later morphological operation.'
Finally, the spell-out locus of the morphosyntactically merged
word-form is determined by the position of the root, i.e. the whole
inflected word will be pronounced in the position of the root
category. That is, the position of the morphophonologically indepen-
dent element is what counts, which is natural, since this element is
the one that "picks up" the dependent elements: the affixes.

4. The analyses

On the basis of the scope relations in (2-13), and the affix order facts
(assuming that the Mirror Principle is largely valid), we can establish
that the likely order of the functional projections under examination
is (Agrs > Agr0 >) Μ > Τ > Mod.10 Agr suffixes always appear after
other inflectional suffixes, and Agr-categories do not enter into scope
relations at all, so we have every reason to place them to the
periphery in syntax, too. Let us now treat the combinations of the
contentful inflectional categories one by one, starting with cases
where only two of them are marked.
36 Huba Bartos

4.1. Tense and Modality

The figure in (17) depicts the syntactic structure created up to the

insertion and projection of some Agr category; the relevant feature
content of the inflectional heads is given on the right-hand side:

(17) AgrP

Agr [+aff, +V,...]

Τ [+aff, +V, +past]

Mod VP Mod [+aff, +V, +poss]

Here the morphosyntactic derivation is obvious and straight-

forward: the word-domain {{{V, Mod}, T}, Agr}word and the <Agr,
T, Mod, V> chain are created by a sequence of applications of
morphosyntactic merger, and the order of the affixes follows from
the order of assembly plus their lexical property of being all suffixes.
As regards the LF interpretation, the Τ > Mod scope relation is self-
evident: Τ asymmetrically c-commands Mod.
The more interesting question concerns the derivation of inverse
scope between Τ and Mod. For this to obtain, the derivation must
reach a point where Mod gets to c-command T, but it must be
ensured that the affix order remains intact. Suppose that at the point
when the projection of TP is complete, we have an option of merging
in an entirely contentless Μ — the V-form has not been specified for
this category, so it is potentially available, but if inserted, it will be
inert (thus V will be interpreted with unmarked, indicative mood).
This Μ is a mere categorial frame, totally void of feature content,
including categorial underspecification — let us call it a proxy (in a
sense remotely reminiscent of, though in fact markedly different
from, the notion of proxy in Nash and Rouveret 1998). Alternatively,
we may look at this proxy "M" as a templatic slot, assuming a
On-line Morphology: Hungarian Verbal Inflection 37

(possibly universal, or universally grounded) template of functional

projections, much like that presented in Cinque (1999). It is a general
property of Hungarian clause structure (as a particular instantiation of
the universal template) that there are exactly three contentful inflec-
tional categories available: Μ, T, and Mod, thus there are exactly
three templatic slots. And whenever one of these is not specified by a
particular item drawn from the (narrow) lexicon, it can be made use
of as a proxy, if merged into the structure, which is an option in all
such cases.
As a matter of fact, though, in Hungarian only the templatic slot
for Μ is ever used as a proxy, since Τ is always specified (there are
no tenseless clauses — infinitivals being marked, too, for null tense),
and while it is true that Mod can be left unspecified, it is the lowest
slot of this domain, so there is no other inflectional category further
down in the clause structure that could access it.
So, coming back to the particular case under scrutiny, the proxy
"M" will act as a hosting site for some other verbal inflectional
category to fill it, i.e. to flesh it out featurally, and the movement due
to the "vacuum-effect" of the unspecified proxy will have to be
substitution, rather than adjunction, for it to gain proper specification.
Otherwise it will cause a crash at LF, being uninterpretable. In
principle, there are two categories that may fill it: Τ and Mod. But
there are at least two factors that single out Mod as the proper filler
of the proxy.
For one thing, mood and modality have generally been assumed to
be closely related notions. And while any detailed discussion of even
just the Hungarian facts would go far beyond the scope of this paper,
it is worth mentioning that there are obvious links of compatibility
between the two categories. For example, while deontic modality is
compatible with either realis or irrealis mood (cf. 0KJenny had to eat
the pasta, and she did / but she didn V), epistemic modality is not
compatible with either — it obviously only tolerates unspecified
mood (cf. "(For all we know,) Jenny may have eaten the pasta, and in
fact she did / but in fact she didn 7). And the modality of volition is
compatible only with irrealis mood; see 4.3.2 below. In view of this
close relation between Μ and Mod, it is likely that Mod is the
expected filler of a proxy M.
38 Huba Bartos

Moreover, although at first sight Τ seems to be closer to the proxy

Μ than Mod, recall that Τ and Mod are links of the same chain now,
so they plausibly count as equidistant for the attractor. This is
"backwards equidistance", the reversal of Chomsky's (1995, chapter
3) notion of equidistance, in order to fit the attraction-based
conception of movement, as in Chomsky (1995, chapter 4)." So we
have two options to satisfy the attracting needs of M: raising T, or
raising Mod. The former yields an output that is not distinct in any
significant way from the output of the alternative derivation in which
no proxy Μ is accessed: neither the phonological string, nor the
semantic interpretation is affected. Therefore inserting a proxy M,
then raising Τ to it constitute redundant steps, so this derivation is
barred on grounds of economy (cf. Fox 1995, Reinhart 1997,
Chomsky 1998). We are then left with the second option: raising
Mod to M. This is a case of substitution, so the category label of the
proxy will be provided by the raised element, Mod. This yields (18):

(18) ModP

Mod, TP Μ [] Mod [+V, +poss]

Τ [+aff, +V, +past]

Mod [+aff, +V, +poss]

Now Mod c-commands T, which has two welcome consequences

for us. First: this makes the inverse scope reading Mod > Τ available,
second: this leads to an output to semantics that is distinct from the
one arrived at by the simplest derivation, i.e., without inserting the
proxy M, so economy will not block this more complex derivation.
Let us turn our attention now to what happens in morphology. The
word-form up to Τ is built as in the derivation of the straight scope
order: {{V, Mod}, T}word. Next comes the insertion of the empty M,
which has no morphological reflex whatsoever. Then we raise Mod
to M. But now morphology will do nothing: specifically, it cannot lift
On-line Morphology: Hungarian Verbal Inflection 39

the Mod-affix (-hAt) out of its place between the V-stem and the T-
affix, because of strict cyclicity / bracket erasure. The internals of the
deeper domain, {V, Mod} have become inaccessible by now. In other
words: once Τ is added to the word-form, the Mod-affix is frozen in
its place, in the innermost domain. Morphology attempts to follow
syntax, but it has its own principles, hence its own limits. This is why
and how the two distinct syntactic structures, corresponding to the
two distinct scope readings, have the same morphological correlate:
[V-Mod-T-Agr ...]word. So the moral of this subsection is the
following: Phrase structure can be built in accordance with the
invariant functional projectional hierarchy hypothesis (lb), yet scope
relations can subsequently be changed, provided there is space (a
proxy) for such moves; and identical morphological sequences can be
arrived at via different derivations — thus the Mirror Principle is
maintained derivationally, up to the separate limitations of the
grammatical modules affected: morphology follows syntax to the
extent that it can. Also, the mirroring generalization is understood as
unidirectional, rather than bidirectional (as in Baker (1985)):
morphology strives to mirror (i.e. follow) syntax, but not the other
way round.12

4.2. Conditional Mood and Modality

Recall that we found no scope inversion effect for this pair: Mood
always scopes over Modality in the verb forms investigated, see (8-
10). On the assumptions laid out in the previous sections, this follows
naturally: the ordering of projections is as before (Μ > Τ > Mod), so
the emergence of the attested [word V-Mod-M-Agr] morphological
string is evident, but the only category potentially available as a
proxy is T, which however is not higher than M, so Mod cannot get
to c-command Μ in any way (remember that Agr is not available: it
is always filled with some content, it is never unspecified). Raising
Mod to Τ is interpretationally vacuous, thus blocked. This way the
only derivable scope order is the "default" Μ > Mod.
40 Huba Bartos

4.3. Conditional Mood and Tense

Apart from our prime concern, the emergence of inverse scope, there
is yet another issue here: Why and how is the dummy V-stem
inserted to carry the M-affix, as seen above in (6) and (11)? Let us
begin with this latter question.

4.3.1. The insertion of the dummy V

The dummy V-root vol- is clearly an extra item, with no semantic

contribution, so its occurrence must be constrained by economy, i.e.
it can only appear when it must, to ensure convergence. It is the
appropriate root form of the verb van 'be', this bound allomorph
appears immediately before past tense or conditional mood marking.
In the more optimal case, its insertion would be a purely
morphological matter: it has to be inserted to repair an otherwise ill-
formed string, at the point where (post-syntactic) vocabulary
insertion takes place. However, there are reasons to attribute
syntactic status to it, so its insertion must be seen as morphologically
induced syntactic repair." The particular motivation for its insertion
is the following. Recall that since Τ is lower in the projectional
hierarchy than M, it gets inserted sooner. Then morphosyntactic
merger associates Τ with V. In the next step, Μ is introduced into the
structure, so a second application of morphosyntactic merger puts
together Μ and [V+T], both syntactically, extending the chain, and
morphologically, placing Μ into the same word-domain: {{V, T},
M}word. But this leads to ill-formedness at vocabulary insertion: both
the past tense and the conditional mood affixes are analytic (in the
sense of Kaye (1995)), as regards their morphophonological proper-
ties. And there is a templatic filter in Hungarian morphophonology
prohibiting any sequence of adjacent analytic affixes (Rebrus 2000).
This means that the past and the conditional suffixes cannot be
adjacent. This is what necessitates repair, in the form of inserting the
dummy V-root, to host the mood affix.14
But notice now that [M vol-na] is now in an appropriate position to
host the next higher affix: Agr, via morphosyntactic merger, without
On-line Morphology: Hungarian Verbal Inflection 41

any further movement — contrary to fact, since the Agr-affixes

appear on the [V-T] bloc:

(19) a. [Agr [V+T]+Agr] [MP [M Vdum+M]]

var-t-unk vol-na
wait-past-1 pi Vdum-cond
'We would have waited.'
b- * [[Vdum+M]+Agr] [V+T]]
* vol-nä-nk vär-t
Vdum-cond-lpl wait-past

The derivation leading to (19b), by attaching Agr after M, though,

should block the one leading to (19a), which is longer since it has to
raise V over M, to pick up Agr. (Morphosyntactic merger will not
work: it must be strictly local.) But there is an additional factor here:
the Case assignment/checking of the subject DP, to be performed by
the contentful verb. The cheaper derivation, with [M Vdum+M] forming
a chain with Agr, inevitably leads to a crash: the dummy V lacks
both theta- and Case-features, so it is not a proper Case-assigner/
checker for the subject. Agr therefore must form a chain with the
bloc including T, which is usually assumed to carry the nominative
Case-feature in minimalist models (e.g. Chomsky 1995). This is
possible only by raising V to Agr, since Τ is in the morphosyntactic
bloc (i.e., chain and word-domain) built around V, so it is not
independent, not free to move on its own any more.15 The raising of V
all the way up would violate the Head Movement Constraint, were it
a valid restriction on movement. But given the general problems of
adjunctive head-movement, alluded to in 3 above, it is quite
conceivable that the usual effects of the HMC can be put down to the
strictly local nature of morphosyntactic merger, as long as it can be
taken to replace head-movement in a large number of cases, while
genuine head-movement may fall under somewhat looser locality
constraints (cf. also the cases of long head-movement, as e.g. in
Rivero 1991, 1994). More specifically, if we follow Chomsky (1995,
chapter 4) in building the notion of minimality into the definition of
Attraction" (itself a supplant for the definition of Move), and assume
that head-movement obeys the same kind of minimality as phrasal
42 Huba Bartos

movement, then we can say that Agr will attract the closest item that
can check some of its features, and this will presumably be T, which,
however, is part of the morphosyntactic V-bloc, and thus the raising
of V itself will be induced. This way the Agr-affix will also be
picked up by the [V+T] bloc. Observe that the raising of V will place
it to the left of volna in the linearized structure, and for spellout it
"carries with itself' its morphosyntactically merged affix (past), since
they are one unbreakable word-domain. This is why we get the string
in (19a), rather than *volna värtunk, or anything else.17 Choosing this
more costly derivation over the cheaper but crashing one with Agr
morphosyntactically merged to the [M Vdum+M] bloc is validated by a
basic tenet of derivational economy: crashing derivations, however
cheap they may be, may never block costlier but convergent
derivations (Chomsky 1995) — unless they compete in minimality,
which is not the case here (see fn. 16)."

4.3.2. Scope variance between Μ and T?

By now it must be obvious that the account of scope inversion

between Τ and Mod, and its absence between Μ and Mod will not
help explain scope inversion between Μ and T. There is no proxy for
Τ to raise to, so as to c-command M. And on closer scrutiny, as was
already hinted at in section 2, this scope inversion turns out to be
only apparent. Consider again the two readings of (4):

(20) Vdr-t-unk vol-na.

a. Μ > Τ — 'We would have waited / Had we waited
b. Τ > Μ — 'We wished / were willing to wait.'

Clearly, in (20b) Μ is not interpreted as conditional mood proper

— instead, it represents one kind of modality·, that of volition or
disposition. But if so, then (20b) is not a genuine case of scope
inversion: it is not the case that Τ and Μ invert relative scope. What
Τ has scope over is some modality, conceivably associated with the
syntactic category Mod. Nevertheless, both the fact that this modality
is encoded by a mood affix, and the ensuing mismatch between
syntactic and morphological structure need an explanation. It seems
On-line Morphology: Hungarian Verbal Inflection 43

that the simplest way of accounting for these facts is to assume, on

the one hand, that in (7) and (20b) there is a lexical item VOLITION,
instantiating the category Mod. (Note that no other Mod is, or even
could be, present in these structures.) However, this VOLITION exists
merely as a syntactic-semantic feature bundle (a morpheme in the
narrow lexicon, in the sense of Marantz (1997)), with no matching
sound-shape in the vocabulary of Hungarian, which could be inserted
in morphology. On the other hand, there seems to be a strong
coocurrence restriction on this modal item: it may only occur in the
scope of [+irrealis] mood (i.e. it is like an "irrealis polarity item"),
which probably has to do with its property of being incompatible
with realis mood: if someone has the volition/disposition of making
an event happen, that event must belong to the realm of not (yet)
realized desires, i.e. to a world not identical to the real one.
So once again, the category of mood is called for, and hence its pho-
nological exponent, the suffix -nA will become part of the structure.
An alternative, "brute force" solution might be to list -nA in the
vocabulary as an exponent for some inflectional element marked as
[+V, +affix, +VOLITION], apart from being the regular exponent of
conditional mood. But it would then be rather difficult to explain
why this does not surface between the V-stem and the tense suffix
(resulting in a complex form [V-M-Agr] + [Vdum-T], something like
*vär-nä-nk vol-t). Thus this kind of account should not be seriously

4.4. Mood, Tense, and Modality all together

As was shown in (11-13), when all these inflectional categories are

simultaneously marked on a verb form, there is absolutely no scope
variability: the affix order, as well as the only attested scope order,
correspond directly to the hierarchy of the functional projections
involved. And whereas in a semantic account something special
should be said about the unexpected loss of scope inversion vis-ä-vis
the cases in (2) and (5), on our morphosyntactic account the unique
scope order simply falls out from the analyses of the other cases.
There is just no room for scope variation. Since all three inflectional
44 Huba Bartos

categories are marked, no proxy is available. And the overtly visible

presence of a (root) POSSIBILITY Mod excludes the occurrence of the
VOLITION Mod, which might be linked to the conditional mood affix
in the way suggested in the previous subsection. So it comes as no
surprise that nothing is variable in this case.
This elegantly concludes the analyses of the "scope variance vs.
unique affix order" dichotomy: I have shown that given the proper
semantic classification, and the right model of the syntax-
morphology interface, everything falls into its place, and even the
Mirror Principle is maintained, as a unidirectional (syntax -»
morphology), derivational generalization.
The next section is devoted to showing how morphosyntactic
merger can be put to use in accounting for certain verbal complexes
in Hungarian.

5. Extended words by morphosyntactic merger

Hungarian displays two basic orders (and their combinations) of

verbs in stacked infinitival constructions, i.e. where several infinitival
clauses are serially embedded one under the other (Kenesei 1989,
Brody 1997, E. Kiss 1999, Koopman and Szabolcsi 2000). The
straight order corresponds to the order of embedding from left to
right: the main predicate is the leftmost item, and the most deeply
embedded clause is the rightmost one, as in (21a). The inverted, so-
called "roll-up" order is the exact opposite: the most deeply
embedded verb begins the string, and the last item to be rolled up
(sometimes the highest verb) is at the end: (21b). The name "roll up"
is suggestive of how the lowest V raises to the next higher one,
picking it up, then the complex of those two raises to the next higher
V again, and so each of the verbs is picked up in this ever-growing
"roll up".

(21) a. Most fogok akarni kezdeni enekelni.

now will-lsg want-inf begin-inf sing-inf
'Now will I want to begin to sing.'
b. Most fogok [enekelni kezdeni akarni].
On-line Morphology: Hungarian Verbal Inflection 45

There have been two major types of approaches to the analysis of

roll-up structures. The first one is in terms of XP-movement, as
proposed by Koopman and Szabolcsi (2000): remnant VPs are
moved successive cyclically to create the roll-up sequence." The
second one assumes that roll-up structures result from some head-
type relation: they are built by recursive V-incorporation, as in E.
Kiss (1999), or are parts of a single extended word, as in Brody
(1997). I will now propose that roll-ups can be created by
morphosyntactic merger - thus my analysis falls in the second type,
and is closest to Brody's model - , and that this is the cheapest
solution, since almost every important property of the roll-up
structure follows automatically from such an analysis, unlike in the
other models.
The key assumption underlying this analysis is that the rolled-up
infinitives behave like suffixes on the most deeply embedded verb.
As has been observed in the relevant literature, the lowest verb is the
only fully thematic verb, while the higher ones are thematically weak
or deficient to a certain extent. Although a clear thematic
characterization is still not available, most authors agree that the
rolled-up verbs are light in some sense. Let us take this to be
reflected in the assumption that they can optionally be assigned a
[+suffix] feature on entering the derivation. Morphosyntactic merger
will therefore apply to them, as if they were "real" affixes: they will
form a chain, and a single word-domain with the lowest verb, which
acts as their "hosting stem".
Morphosyntactic merger picks up these infinitives one by one, in a
strictly local fashion, as follows from the definition of this operation,
given in (15-16) above. This strict locality property immediately
explains why inekelni kezdeni akarni in (21b) only means 'to want to
begin to sing', and not 'to begin to want to sing', as observed by
Brody (1997): the roll-up cannot skip any V, as would be needed to
join 'sing' with 'want' across 'begin' in (21b). The process is
schematized in (22):
46 Huba Bartos

(22) a. ... akarni [ kezdeni [enekelni]]]

[+suff] [+suff]

morphosyntactic merger 1

morphosyntactic merger 2

b. [enekelni-kezdeni-akarni]w<ml
'root' 'suffix' 'suffix'

As the resulting construct, the roll-up cluster, is one

morphological word, it also immediately follows that no other words
may intervene between the rolled-up infinitives:

(23) a. Most fogok akarni el- kezdeni enekelni.

now will-lsg want-inf PV begin-inf sing-inf
'Now will I want to begin to sing.'
b. * Most fogok [enekelni el-kezdeni akarni].

(24) a. Mari fogja akarni holnap kezdeni

Mary will-3sg want-inf tomorrow begin-inf
szet- szedni a rddiot.
apart- take-inf the radio-acc
'Mary will want to begin tomorrow to take the radio apart.'
b. * Mari fogja [szetszedni {holnap} kezdeni {holnap}
akarni] a rddiot.

Roll-up can stop anywhere in the middle, i.e. it does not have to
reach up to the highest verb — the part above the roll-up retains the
straight order in such cases. This is captured by our analysis, as well:
if the optional [+suffix] feature is only assigned to some of the
infinitives, then if they form a continuous sequence reaching down to
the lowest verb, roll-up will proceed as far up as there is an item with
the [+suffix] feature, but if there is an intervening infinitive lacking
the optional feature, the derivation will crash, since the [+suffix]-
marked items cannot be picked up across a non-[+suffix] item. On
the other hand, the roll-up cannot start in the middle: the only verb
On-line Morphology: Hungarian Verbal Inflection 47

that may start rolling up the 'light' verbs is the lowest, fully thematic
one. This can be accounted for by the assumption that the 'light'
verbs are also light in the phonological sense, i.e., they cannot serve
as hosting roots for heavy affixes — they can bear inherently affixal
items, such as inflection, but not whole words with an occasional
affixal nature, like the other light verbs or auxiliaries.
The topmost, tensed V cannot be involved in the roll-up, however:

(25) * Man [ enekelni kezdeni akarni fog].

Mary sing-inf begin-inf want-inf will-3sg
'Mary will want to begin to sing.'

This is explainable in terms of a uniformity requirement on

chains: morphosyntactic merger creates a chain, and members of a V-
chain must be uniform with respect to finiteness (i.e. tense- and/or
agreement-marking). This requirement, apart from explaining the
restriction illustrated in (25), also underlies the contrast in the
following examples:20

(26) a. Nem kell akarnod kezdeni lebontani

not must want-inf-2sg begin-inf down-pull-inf
a sätrat.
the tent-acc
'You don't have to want to begin to pull down the tent.'
b. Nem kell akarnod [lebontani kezdeni] α sätrat.
c. * Nem kell [lebontani kezdeni akarnod] α sätrat.

Here each verb is untensed, but chain-formation up to the topmost

verb is blocked because it is an inflected infinitive, bearing
agreement, while the others are plain infinitives.
Finally, the extended words created by morphosyntactic merger
can serve as inputs to derivational processes, which is only expected
if they are indeed word-level units:

(27) a. Peter ers [ enekelni kezdeni akar] -äs -a

Peter strong sing-inf begin-inf want -ing(N) -poss
'Peter's strong wanting to begin to sing'
48 Huba Bartos

b. az [ enekelni kezdeni akar] -ό fiü

the sing-inf begin-inf want -ing(A) boy
'the boy wanting to begin to sing'

In sum: morphosyntactic merger is an appropriate device for

creating the Hungarian roll-up verb clusters, because it correctly
derives the major syntactic and morphological properties of these
clusters. This, in turn, provides further support to the
morphosyntactic model presented in the previous sections.

6. Conclusion

This paper has examined the morphosyntax of verbal inflectional

affixation in Hungarian, from the perspective of a cyclic derivational
model that combines syntactic and morphological derivations in such
a way that the two run parallel with each other. This model has been
shown to be adequate both for capturing the empirical facts and
generalizations emerging from the data, and for resolving the conflict
between the syntactic encoding of scope, and the generalization
known as the Mirror Principle in view of the data investigated.
Moreover, it has been done in such a way that we did not have to
resort to covert movement in the standard minimalist sense: syntax is
monocyclic, with homogenous ("overt") operations, including a
newly proposed chain- and word-forming operation, morphosyntactic
merger, which replaces head movement in a great number of cases.
Scope inversion among inflectional categories has been proposed to
involve the use of a proxy head in some cases, while it has turned out
to be only apparent in others.
The final section demonstrated that the model can be extended to
cover the so-called roll-up cluster formation process in Hungarian:
these clusters have been proposed to be regarded as words in an
extended sense, built by morphosyntactic merger.
On-line Morphology: Hungarian Verbal Inflection 49


This paper is a considerably reworked and extended version of Bartos (2000). I

am grateful to Michael Brody, Anna Szabolcsi, and Grete Dalmi for detailed
discussions, and to Katalin E. Kiss, Istvan Kenesei, Mark Newson, and David
Adger for various comments and criticism, as well as to an anonymous
reviewer for important objections and suggestions. I also wish to thank Grete
Dalmi for calling my attention to the parallel with Polish data. The research
underlying this paper was made possible by a Bolyai Fellowship of the
Hungarian Academy of Sciences.
1. For a model where the mirror principle is used as an axiom, with morpho-
logical structure determining syntactic structure, see Brody (1997).
2. The overt-covert distinction in the sense of movement preceding vs. following
the spellout point (i.e. belonging to the first vs. the second cycle), or in the
sense of category movement vs. feature movement, is absent from this model.
What is still found is the epiphenomenon: syntactic movement is visible at PF
is it is traceable through morphology, i.e. when morphology can perform the
related rearrangements, or does not interfere at all.
3. If the order of projections follows from selectional properties, as generally
assumed in minimalist theories, then it is obvious that given two functional
categories X and Y, if X selects Y, then Y does not select X, or else the
possibility of infinite regress would arise, by X and Y endlessly selecting each
other in turns, while even a single instance of recursion is hardly ever attested
in the inflectional domain (and it is entirely absent in Hungarian). Thus the
possibility of X and Y occurring in either order in a given language is ruled out.
4. This does not mean that (8) is not ambiguous — in fact, it is: it also has a
volitional/dispositional reading ("They would like to wait."), but this reading is
not scopally different from the one in (9). It is as if the modal and mood
content combined to form one modal meaning. Therefore this second reading is
irrelevant to our main discussion of scope variance.
5. The validity of the distinction (and thus the Τ > Mod scope interpretation of the
example) can be highlighted by adding a clause creating semantic anomaly in
an alleged Mod > Τ reading, while leaving the valid Τ > Mod reading intact.

(i) Tegnap havazhatott Parizsban, es tudjuk, hogy

yesterday snow-poss-past-3sg Paris-in and know-pres-lpl that
havazott is.
snow-past-3sg too Τ > Mod
'Yesterday it could be snowing in Paris, and we know that it did, in fact.'
Mod > Τ
#'It may have snowed in Paris yesterday, and we know that it did, in fact.'
50 Huba Bartos

(ii) St. Germainben tegnap hovihar volt, tehät Parizsban is

St. Germain-in yesterday blizzard was, so Paris-in also
havazhatott, es tudjuk, hogy havazott is.
snow-poss-past-3sg and know-pres-lpl that snow-past-3sg too
'There was a blizzard in St. Germain yesterday, so it could be snowing in
Paris, too, and we know that it did, in fact.'

(i) is anomalous on the Mod > Τ reading: we do not normally speak of mere
possibility when we have factual knowledge, in the present, of the past state of
affairs in question. On the other hand, (ii) is not anomalous on the scope
reading preferred for (14), i.e., Τ > Mod, exactly because we are not
considering a possible past (Mod > T) here, but a possibility that was available
at a past time (T > Mod).
6. Note that the model was devised to handle various aspects of Hungarian
morphosyntax, including nominal inflection (Bartos 1999), so it is not specific
to the particular analyses presented here.
7. Some requirement to this effect is necessary for all models capitalizing on ways
of morphologically associating items other than movement, cf. the notions of
structural adjacency in Distributed Morphology, and linear adjacency in
Frampton & Gutmann's (1998) model. Frampton & Gutmann (1998, in. 5)
allude to the possible application of some multi-tier structure, where e.g.
adverbs are represented on a separate tier, so that they will not interfere with
head-head relations on the main projection line of what are sometimes referred to
as extended projections. Another possibility, not explored here in detail, would
be the countercyclic insertion of adverbs into the clause structure, after the
morphological associations discussed in this paper have been established.
8. Unless it is an infix, but then again this property is lexically specified. There
are no infixes in Hungarian, though, so the exact details need not be our
concern here.
9. A reviewer raises the question of how this model would account for the well-
known bracketing paradoxes discussed in the literature, especially those treated
in Pesetsky (1985). The essential property of these cases is that the
interpretationally and selectionally relevant word-structure is different from
their appropriate morphophonological composition. E.g., unhappier should be
[[un + happy] + er] as far as its semantics ('more unhappy', rather than 'not
happier') is concerned, but should be [un + [happy + er]]
morphophonologically, since the comparative suffix -er does not attach to
words longer than two syllables. To overcome this problem, Pesetsky proposes
that the morphophonologically relevant structure can be altered by a covert
process he terms 'affix QR', which raises an affix to a higher position (in the
particular case of unhappier, it raises -er to a position c-commanding the whole
of [un + happy]), on condition that this transformation is string-vacuous. This
On-line Morphology: Hungarian Verbal Inflection 51

way, the semantically interpreted structure will diverge from the

phonologically interpreted one, exactly as needed. The model advocated here
achieves the same effects with ease, being rather similar in spirit to Pesetsky's
proposal. Although I would not call this transformation 'QR\ and my analyses
only concern the inflectional domain, my model could allow the raising of the
relevant affixal categories (NB unlike in Pesetsky's framework, what move
here are affixal feature-bundles, not actual affixes with sound-shapes) in
syntax, but these rearrangements would not be replicated in morphology for
exactly the same reason as in the Hungarian inflectional cases: morphology is
strictly stepwise cyclic (see also Frampton & Gutmann (1999: 3)), so once un-
has been added to happy+er, yielding {un, {happy, er}}word, the constituents of
the inner domain, happy and er, become frozen in their places, and cannot be
dislocated any more. I.e., the syntacic (LF-oriented) displacement of the lower
affixal category will not be reflected in the phonological output. And although
the string-vacuity constraint would preclude cases of linear affix reordering
anyway, rebracketing of the same linear string is not possible either.
10. A note is in order here about how this hierarchy relates to Cinque's (1999)
universal hierarchy of functional projections, a template which was assembled
relying heavily on the Mirror Principle. There the order of these three particular
categories is just the reverse: Mod^^, > Tpast > Mood^,^, while Mod„,ot is
below each of them. But Hungarian affix order provides prima facie evidence
that if the Mirror Principle is a valid generalization, then this partial hierarchy
is not accurate, especially as far as the placement of irrealis Mood below past
Tense is concerned. For details, see the analyses in section 4.
11. The head movement constraint is not an independent principle in Chomskyan
minimalism, its effects have been built into the definition of Move/Attract
instead (Chomsky 1995), so skipping Τ when raising Mod to Μ does not
violate locality.
12. There is another set of data in the domain of combining Τ and Mod, which
militates against a semantic account, but fits neatly into our analysis. Future
tense is often expressed in Hungarian by an auxiliary: fog, which, unlike
English will, is entirely void of any modal content. So e.g. a simple future V-
form would look like (i):

(i) fog olvas-ni

FUT read-infinitive
'will read'

For some reason, it is impossible to combine this future form with modality
represented by the modal affix -hAt, i.e. *fog-hat olvas-ni and *fog olvas-hat-ni
are both ill-formed. The purported meaning can only be expressed peri-
phrastically, by a biclausal structure: (iia), or by a simple present form plus a
future time adverbial: (iib).
52 Huba Bartos

(ii) a. Lehet, [ hogy fog olvasni].

be-poss that FUT read-inf
'It may be that he will read'
b. Majd olvas-hat
later read-poss-3sg
'Later he can/will be allowed to read.'

This is a fact, again, which is not likely to have a semantic explanation, but
follows from the analyses presented here. Since fog carries subject agreement
(e.g. fog-ok 'will-lsg'), it must originate below Agr. Its most likely status is
that of T, since it obviously expresses future tense. Thus fog may be seen as the
exponent of T, and what appears as the infinitival V-form after fog (as in fog
olvas-ni, see above) is in fact a bare V-form (monomorphemic, but bimorphic,
by fission occurring in morphology, splitting the single bare V into a V-root
and a -ni marker — which is presumably related to the fact that what appears
on the surface as a bare V-root, like olvas 'read' is in fact an inflected 3sg
form, the marker of 3sg agreement being null, and this is also the dictionary
form of Vs in Hungarian, so the infinitive is fissioned to be overtly marked as
distinct from the 3sg, non-bare, form).
Fog olvasni is clearly grammatical: the verb olvas 'read' picks up the infinitive
marker, and fog occupies T. *Fog olvas-hat-ni is totally out if olvas-ni spells
out a bare V, because this whole form is inserted under V° as a monolithic
form, so -hAt in Mod cannot get in between the V-root and -ni. Finally, "'fog-
hat olvas-ni is ruled out, too: if olvas-ni is a bare V spellout, then -hAt could
only get onto fog if it raised over Τ (cf. the definition of morphosyntactic
merger in (15)), to the proxy/empty Μ slot, but this time it cannot do so,
because the proxy "M" only attracts fog in Τ as a filler — Mod is not
equidistant with T, because now they are not in the same head-chain, since fog
in Τ is a separate word, blocking chain-formation between Mod and M. This
also yields the fact that Mod cannot scope over future T, either.
13. For a detailed discussion of the nature of vo/-insertion, see Bartos (2000). The
essential argument for regarding it as syntactic, rather than morphological, is
that it participates in / is visible for syntactic processes, such as right-node
raising, and that it can be separated (though not moved away) from the real V-
bloc by certain particles (e.g. the yes-no question particle -e, or is 'too').
14. This further indicates that irrealis (conditional) Mood is higher than (past) Τ in
Hungarian sentence structure (contra Cinque (1999)), or else the conditional
affix could attach first to the real V-stem, and the dummy V-root would host
the past affix.
15. The nominative assigning/checking Τ and the subject DP cannot meet in any
lower domain, e.g. TP, either, on the assumption that the specifiers of the
contentful inflections are retained exclusively for adverbials (Cinque (1999)).
On-line Morphology: Hungarian Verbal Inflection 53

16. Κ attracts F if F is the closest feature that can enter into a checking relation
with a sublabel of Κ (Chomsky 1995: 297). Note that Chomsky takes
minimality as inviolable this way (the very definition of movement makes
reference to minimality), but since the [Vduinmy+M] bloc has no checking to
offer for Agr, it is not a competitor for [V+T] in this respect, so these two
alternative derivations are legitimate competitors on economy, and the crashing
nature of one renders the other grammatical.
17. It is interesting to draw a parallel with Polish conditionals here: they are also
composite forms, consisting of the thematic V-stem carrying some sort of participial
marking plus another stem, the "conditional auxiliary", bearing subject agreement
(Borsley and Rivero 1994), as shown in (i). (Irrelevantly, the conditional auxiliary
can surface in any other position to the left of the main verb, too.)

(i) Ty jego widzia by.

you him see-prt cond-2sg
'You would see him.'

At first sight, it seems to be the equivalent of the Hungarian (19b), modulo the
order of the two verb forms: as if in Polish the conditional auxiliary could
assume subject agreement by virtue of being capable of assigning/checking
nominative to/with the subject, thereby making use of the shortest, most
economical derivation. This may well be true: proper auxiliaries often
assign/check nominative, so the difference between Polish and Hungarian boils
down to the fact that in Hungarian volna is not an auxiliary, just a dummy verb.
In fact, in earlier stages of Hungarian volna was probably a genuine auxiliary,
taking a participial complement, and the participial verb form coincided with
the past tense form, with the agreement marker analyzable as some sort of
"possessive" agreement (characteristic of present-day Hungarian infinitives,
too) on the participle: (iia). This is confirmed by the fact that in old Hungarian
the auxiliary could also bear its own past tense, forming a complex past form of
the scheme "V-participle-Agr + Aux-past": (iib).

(ii) a. % vdr-t-unk vol-na

wait-part-lpl beaux-cond
'We would have waited.' (lit.: '(There) would be our (having)
b. % var-t-unk vol-t
wait-part-lpl beaux-past
'We (had) waited.' (lit.: '(There) was our (having) waited.')

But in present-day Hungarian there is no good reason to attribute auxiliary

status to volna — it is a mere dummy, a minimal root inserted to carry the
mood affix.
54 Huba Bartos

18. Alternatively, the fronting of V might be analysed as a case of phrasal

movement, and then the issue of HMC and minimality would not even arise.
We do not pursue this line here, though.
19. Here I do not argue against the XP-movement approach: for counterarguments
with respect to the verbal complexes, see Brody (1997) and E. Kiss (1999),
while Bartos (forthcoming) shows that Koopman and Szabolcsi's XP-
movement-based model of affixation is unable to capture the scope phenomena
in sections 2-4.
20.1 am grateful to Ildiko Toth for pointing me to these data.

Aoun, Joseph and Yen-hui Audrey Li
1993 Syntax of Scope. Cambridge, MA: The MIT Press.
Baker, Mark
1985 The Mirror Principle and morphosyntactic explanation. Linguistic
Inquiry 16: 373-415.
Bartos, Huba
1999 Morfoszintaxis es interpretacio: A magyar inflexiös jelens6gek
szintaktikai härtere. [Morphosyntax and interpretation: The syntactic
aspects of Hungarian inflectional phenomena.] Doctoral dissertation,
ELTE, Budapest.
Bartos, Huba
2000 Affix order in Hungarian, and the Mirror Principle. In: Gabor Alberti
& Istvan Kenesei (eds.), Approaches to Hungarian vol.7.
(JATEPress, Szeged), 54-70.
Bartos, Huba
forthc. Verbal complexes and morphosyntactic merger. Ms., Research
Institute for Linguistics, Budapest.
Borsley, Robert D. and Maria Luisa Rivero
1994 Clitic auxiliaries and incorporation in Polish. Natural Language and
Linguistic Theory 12: 373-422.
Brody, Michael
1997 Mirror theory. Ms., University College, London & Research Institute
for Linguistics, Budapest.
Chomsky, Noam
1995 The Minimalist Program. Cambridge, MA: The MIT Press.
Chomsky, Noam
1998 Minimalist inquiries: The framework. In: Juan Uriagereka et al.
(eds.), Step by Step: Essays in Honor of Howard Lasnik, Cambridge,
MA: The MIT Press.
Chomsky, Noam
1999 Derivation by Phase. Ms., MIT.
On-line Morphology: Hungarian Verbal Inflection 55

Chomsky, Noam
2001 Beyond explanatory adequacy. Ms., MIT.
Cinque, Guglielmo
1999 Adverbs and Functional Heads: A Cross-Linguistic Perspective.
Oxford University Press, Oxford.
E. Kiss, Katalin
1999 Strategies of complex predicate formation and the Hungarian verbal
complex. In: Istvan Kenesei (ed.), Crossing Boundaries, 91-114
Amsterdam & Philadelphia: John Benjamins.
Fox, Danny
1995 Economy and scope. Natural Language Semantics 3: 283-341.
Frampton, John and Sam Gutmann
1998 Distributed morphological interface and V-to-I raising. Ms.,
Northeastern University.
Frampton, John and Sam Gutmann
1999 Cyclic computation, a computationally efficient minimalist syntax.
Syntax 2: 1-27.
Halle, Morris and Alec Marantz
1993 Distributed Morphology and the Pieces of Inflection. In: Samuel Jay
Keyser & Ken Hale (eds.), The View from Building 20, 111-176,
Cambridge, MA: The MIT Press.
Kaye, Jonathan D.
1995 Derivations and interfaces. In: Jacques Durand and Francis Katamba
(eds.), Frontiers of Phonology: Atoms, Structures, Derivations, 289-
332, London: Longman.
Kayne, Richard S.
1998 Overt vs. covert movement. Syntax 1: 128-191
Kenesei, Istvan
1989 Logikus-e a magyar szorend? [Is Hungarian word order logical?]
Altalänos Nyelveszeti Tanulmänyok 17: 105-152.
Kiefer, Ferenc
1981 What is possible in Hungarian? Acta Linguistica Academiae
Scientiarium Hungaricae, 31: 147-185.
Koopman, Hilda and Anna Szabolcsi
2000 Verbal Complexes. Cambridge, MA: The MIT Press.
Marantz, Alec
1997 No escape from syntax: Don't try morphological analysis in the
privacy of your own lexicon. UPenn Working Papers in Linguistics
4.2: 201-226.
Nash, Lea and Alain Rouveret
1998 Feature fission and the syntax of argument DPs and clitics. Paper
presented at GLOW '98, Tilburg.
Pesetsky, David
1985 Morphology and Logical Form. Linguistic Inquiry 16: 193-246.
56 Huba Bartos

Rebrus Pöter
2000 Morfofonolögiai jelensögek. [Morphophonological phenomena]. In:
Ferenc Kiefer (ed.), Strukturälis magyar nyelvtan 3. — Morfolögia
[A structural grammar of Hungarian vol.3. — Morphology], 763-
947, Budapest: Akadömiai Kiadö.
Reinhart, Tanya
1997 Wh-in-situ in the framework of the Minimalist Program. Natural
Language Semantics 6: 29—56.
Rivero, Maria Luisa
1991 Long Head Movement and Negation: Serbo-Croatian vs. Slovak and
Czech. The Linguistic Review 8: 319-351.
Rivero, Maria Luisa
1994 Clause structure and V-movement in the Languages of the Balkans.
Natural Language and Linguistic Theory 12: 63-120.
Verbal morphology and agreement in Urdu

Miriam Butt and Louisa Sadler

1. Introduction

The status of morphology within the theory of grammar, and in particu-

lar its relation to syntax, remains somewhat controversial.1 A number
of important issues can be distinguished, concerning the interaction
between morphology and syntax and the mechanisms and data struc-
tures appropriate for the description of morphological systems.
On the relation between syntax and morphology, one major line of
thinking, first proposed by Chomsky (1970), is often referred to as
'lexicalist' and holds that syntactic and morphological processes
belong to differing modules of grammar. In the tradition of derivational
syntax (e.g., DiSciullo and Williams 1987), these modules only interact
at the point at which lexical insertion of a morphologically fully
formed word into the syntax takes place. A radically different view
blurs the distinction between syntax and morphology in that in-
flectional morphology is taken to result essentially from head-to-head
movement, a syntactic operation. As such, inflectional morphology
forms a subtheory of syntax and morphological processes are predicted
to be subject to the same constraints as syntactic processes (e.g., Baker
1985, 1988). The most recent formulation of this position is Distribu-
tive Morphology (Halle and Marantz 1993).
The issue of mechanisms and data structures for morphological de-
scription is hardly less controversial. In broad terms, we may distin-
guish inferential-realizational approaches, including word-and-para-
digm models, in which forms are viewed as exponents of sets of fea-
tures or paradigmatic cells (e.g., Anderson 1992; Aronoff 1994; Stump
1991, 2001) from 'lexical' or morpheme-based models in which the
mapping between meaning and form is given in the lexical entry for a
morpheme. On this view, an inflectional affix is in many ways granted
a status similar to that of a lexeme, and it is natural to regard word
58 Miriam Butt and Louisa Sadler

structures as very similar to syntactic structures (as, for example, in

Selkirk's 1982 word syntax approach).
This paper explores aspects of the representation of morphology
in Lexical-Functional Grammar (LFG) by examining the treatment of
verbal morphology and agreement phenomena in the South Asian lan-
guage Urdu.2 The theory of Lexical-Functional Grammar (Bresnan
1982, 2001; Dalrymple 2001) is lexicalist in the former sense, that
of the lexicalist hypothesis. The specific form in which the lexicalist
hypothesis is embodied in LFG does, however, permit quite a complex
interaction between syntax and morphology, as we shall see.
The precise nature of the morphological component itself has re-
ceived relatively little attention in LFG. While most theoretical work
in LFG has assumed a morpheme-based approach, often for expository
convenience, there is no reason in principle why LFG should espouse a
lexical morphology of this sort, and indeed several papers propose the
integration of a realizational or constructional morphology with LFG.
Computational work in LFG has generally used finite-state morpholog-
ical analyzers (Karttunen, Kaplan and Zaenen 1992, Kaplan and Kay
1994), with an interface defined between the morphological and syn-
tactic analysis (e.g., Butt, King, Nino and Segond 1999). Finite state
technology is not committed to a morphemic or sign-based view of
morphological structure, and preserves the required separation of (ex-
ternal) syntax and morphology. From the point of view of the syntactic
module, the morphological analyzer is a 'black box'. The processes by
which words are formed are entirely opaque to the syntax and the only
point of contact is the syntactic functional information which filters
through the interface between morphology and syntax.
In this paper, we explore some complex morphosyntactic phenom-
ena in Urdu, showing how LFG permits a natural treatment of the in-
teraction between syntax and morphology. We then explore some defi-
ciencies of the morpheme-based view on the basis of a computational
implementation of the grammar fragment under discussion and go on
to examine an alternative model of the morphology-syntax interface.
The rest of this paper is structured as follows. In section 2 we first
provide a brief sketch of the basic design principles of LFG for those
Verbal Morphology and Agreement in Urdu 59

readers who are unfamiliar with the theory. Section 2.2 introduces and
exemplifies an aspect of the formalism (known as constructive mor-
phology, Nordlinger 1998) which permits a natural and straightforward
approach to the ability of morphological elements (such as case mark-
ers) to define and project the relational structures which contain them.
Section 3 briefly introduces case in Urdu and its relation to verbal
agreement patterns and sketches out a treatment of Urdu case marking
in LFG. In section 4 we move on to verbal agreement in Urdu. We for-
mulate a relatively simple generalization concerning verbal agreement
and show how constraints associated with verb forms will capture this
generalization. In the next two sections we turn to the details of the
morphological analysis, exploring first a word-syntax, or morpheme-
based, implementation in section 5 and presenting several unwanted
side-effects and drawbacks of this approach. The final section exam-
ines an encoding of the same set of agreement data using a finite-state
morphological analyzer interfaced to the syntax and shows how the
difficulties encountered in the word syntax approach are resolved.

2. Lexical Functional Grammar

LFG is a non-derivational syntactic theory which posits two levels of

syntactic representation, each with its own distinct vocabulary and data
structures. External syntax, or constituent structure (c-structure), is
modelled in terms of (phrase structure) trees of the familiar sort, based
on categories decomposable as simple feature bundles. These struc-
tures are directly motivated by considerations of surface variability:
that is, where the forms of exponence differ, so do the external (c-
structure) forms. Thus while a fully elaborated X' theory with fully
projective lexical and functional categories is appropriate for the de-
scription of highly configurational, endocentric constructions in lan-
guages such as English, alternative, non-configurational modes of or-
ganization for external syntactic structure are appropriate in other lan-
guages, and still others may exhibit a mix of configurational and non-
configurational external structure.
60 Miriam Butt and Louisa Sadler

LFG uses a second syntactic level of representation for internal, re-

lational structure. This level of representation distinguishes predicates
and their syntactic arguments, with relations such as SUBJECT, OB-
JECT, ADJUNCT and OBLIQUE being primitive notions. Internal syn-
tax is represented at the level of f(unctional)-structure. Functional
structures are represented as AVMS (attribute-value matrices) and are
finite functions from attributes to values, which may themselves be
functions. While languages differ quite radically in their external (sur-
face) structure, f-structures are intended to be 'largely invariant' across
languages (Butt, King, Nino and Segond 1999; Bresnan 2001).
C-structure and f-structure represent different dimensions of syn-
tactic structure and are related by a mapping function φ, from nodes
in the c-structure to f-structures. Thus for example, the annotations to
the nodes in (1) relate these nodes to their image in the domain of f-
structures: t abbreviates the expression φ(Μ(*)) and may be read as
"the f-structure of my mother node" and I abbreviates the expression
φ(*) and may be read as "the f-structure of the present node". Thus
the annotation on the DP node in (1), (fsUBJ) = I, states that the value
of the SUBJ attribute in the f-structure of the IP node is the f-structure
associated with the DP node itself, and the annotation t = I states an
identity between the f-structures of the mother and daughter nodes in

(1) IP

(tSUBJ) = 4, t=l

he t=;

is t=;

Verbal Morphology and Agreement in Urdu 61


" PRED 'pro' 1

PRED 'leave < SUBJ > '

Further equations specifying f-structure information are associated

with lexical items in the lexicon. Finally it should be noted that the set
of annotations appropriate for tree fragements are largely defined by
general principles, but discussion of these principles would take us too
far afield (see Bresnan 2001 for extensive discussion).

2.1. LFG morphosyntax

The strict separation of external syntactic structure and word-internal

structure is encapsulated in LFG in the principle of Lexical Integrity,
a recent statement of which is that: "morphologically complete words
are leaves of the c-structure tree and each leaf corresponds to one and
only on c-structure node" (Bresnan 2001). This ensures that there is no
syntacticization of morphological phenomena—in particular, affixes
do not enter into syntactic structural relations with elements of the
external syntax.
Note that the principle of lexical integrity is stated in such a way
as to preserve the invisibility of morphological structure to c-structure
while permitting what we might think of as the internal f-structure of
words to be visible to f-structure. In fact, as is clear from morpho-
logically rich languages, (external) syntax and morphology are equal,
interacting and competing contributors in the functional domain.
The direct contribution of morphology to the functional domain
can be easily seen in the analysis of phenomena such as head mark-
ing and pronominal incorporation in LFG. For example, Bresnan and
62 Miriam Butt and Louisa Sadler

Mchombo (1987) show that both subject and object markers in Chi-
chewa are word internal elements, using standard tests for lack of sep-
arability and the occurrence of allomorphic variation. They show that
the optional object marker always fulfills the argument function, op-
tionally doubled by a full noun phrase topic. On the other hand the
obligatory subject marker is sometimes an agreement marker (dou-
bled by an overtly expressed subject noun phrase) and sometimes ful-
fills the argument function. These properties are expressed by associ-
ating the appropriate functional information with the subject and ob-
ject inflections of the verb (denoted here by SM- and OM-). Adopting a
morpheme-based morphology, Bresnan and Mchombo provide the fol-
lowing entries for the affixes, as input to the morphological word build-
ing rules: notice that the equation which associates a pronominal PRED
value with the subject marker is optional, accounting for the observed
alternation between subject coding morphology as agreement and as
incorporated pronominal. The verbal affixes thus (partially) define the
f-structures of the subject and object—the contribution to f-structure
by the verb zi-nä-wä-lüm-a in (7) is shown in (9) below:

(3) OM-:V i n / / :(tOBJ)=|

(4-agr) = a
(4.PRED) = 'pro'

(4) SM-:V i M / z :(tsUBJ)=;

(4,agr) = a
((|pred) = 'pro')

(5) Njuchi zi-nd-lum-a a-lenje

lO.bee 10.S-PST-bite-FV 2-hunter
'The bees bit the hunters.' (Chichewa, Bresnan 2001: 150)

(6) zi-nä-lüm-a a-lenje

10-S-PST-bite-FV 2-hunter
'They bit the hunters.' (Chichewa, Bresnan 2001: 150)
Verbal Morphology and Agreement in Urdu 63

(7) Njüchi zi-nä-wä-lüm-a a-lenje

lO.bee 10.S-PST-2.O-bite-FV 2-hunter
'The bees bit them, the hunters.'
(Chichewa, Bresnan 2001: 150)

( 8 ) Njüchi zi-nä-wä-lu.m-a
lO.bee 10.s-PST-2.o-bite-FV
'The bees bit them.' (Chichewa, Bresnan 2001: 150)

" PRED 'PRO' 1
PRED 'bite < (SUBJ), (OBJ) > '

Although the formal framework of LFG naturally accommodates

the direct contribution of morphology to the definition of f-structres,
a number of open issues remain. F-structure is defined quite indepen-
dently of exponence, but many analyses involve the appearance of ex-
ponence related features at f-structure. This may be problematic, for
when f-structure is too closely dependent on details of exponence, the
invariant part may be 'submerged' in a wealth of language-dependent
morphosyntactic features. In some cases the presence of morphosyn-
tactic features (like VFORM in periphrastic constructions, e.g., Böqars,
Vincent and Chapman 1997) in f-structures will cause (otherwise cor-
rect) analyses to fail (Frank 1996), and there is evidence that both mor-
phological and syntactic features sets are required in some domains
(Sadler and Spencer 2001; Sells 2001). As more work is done on a
range of languages, and especially on those with richer morphology,
it is becoming clear that the interface between syntax and morphol-
ogy is often more complex than a simple form of the lexical integrity
64 Miriam Butt and Louisa Sadler

hypothesis might at first suggest. A number of architectural develop-

ments and/or extensions to the formalism have been suggested partly in
response to these issues, including the postulation of a separate projec-
tion for morphosyntactic features (Butt et al. 1996; Frank and Zaenen
2000) and the positive restriction architecture (Andrews and Manning
1999), but space precludes detailed discussion of these proposals.

2.2. Constructive morphology

As we have seen, LFG can easily capture the direct contribution of

head marking morphology in projecting the f-structure for a verbal
head and its syntactic arguments. In recent work, Nordlinger ( 1 9 9 8 )
has explored the contribution of dependent marking morphology to
the projection of f-structure. She shows how the formalism can di-
rectly capture the role of case markers in identifying the grammatical
functions borne by the noun phrases which they mark, in languages in
which such grammatical functions are not configurationally defined.
In the approach known as constructive case, she associates inside-out
constraints (Halvorsen and Kaplan 1988; Dalrymple 1993, see also
Andrews 1 9 9 6 : 4 1 - 4 3 ) with morphological elements, allowing case
morphology on the noun or noun phrase to define the larger syntactic
(f-structure) context in which they are embedded. Consider for exam-
ple, the sentence in (10) from the radically non-configurational lan-
guage Wambaya. Here the subject big dog is a discontinous constituent
in which both parts are marked with ergative case.

(10) galalarrinyi-ni gini-ng-a dawu bugayini-ni

dog.I-ERG 3 S G . M A S C . A - 1 . 0 - N F U T bite big.I-ERG
"The big dog bit me.' (Wambaya, Nordlinger 1998: 96)

Nordlinger assigns the following morphological structure to gala-

larrinyi-ni (the functional specifications are associated with the stem
and affix in (sub-)lexical entries):
Verbal Morphology and Agreement in Urdu 65

(11) Ν

t= I t= I
Ν Aff

galalarrinyi ni
(tPRED) = dog (tCASE) = ERG
((SUBJ t) OBJ)

As shown above, it is the ergative case which specifies that the nom-
inal element is a SUBJ in a larger f-structure. (12) specifies (a) that
the attribute:value pair CASE:ERG is defined for the f-structure of the
case marked nominal (that is, the f-structure denoted by and (b) that
this f-structure (that of the case-marked nominal) is the value of the
SUBJ attribute in a containing f-structure (i.e., the f-structure of the
sentence—denoted by (SUBJ t)X which also contains an OBJ:

(12) ni: ERG


Together the affix and the stem galalarrinyi, 'dog' then define the
following f-structure:


PRED 'dog'

The treatment of the (discontinuous) adjunct is more complicated.

The stem bugayini 'big', specifies that it is the value of an ADJUNCT
attribute in a containing f-structure (see annotations in (14)), while the
case affix specified that the f-structure containing the ADJ is an ergative
case SUBJ. The relevant morphological tree structure is as follows:
66 Miriam Butt and Louisa Sadler


(tPRED) = big ((ADJ t) CASE) = ERG
(ADJ t) ((SUBJ ADJ t) OBJ)

The f-description associated with the ergative case affix in (14) dif-
fers from that in (11) in that it specifies (adJ t) where the latter spec-
ified t · This local substitution of designators is due to the Principle of
Morphological Composition which (roughly) embeds the functional
designator of the stem under that of the affix. As we are not concerned
with this composition here, we have nothing more to say about it.3
The f-structure of bugayini-ni is as in (15). The f-structures (13)
and (15) combine gracefully as (16) to give the f-structure of the dis-
continuous subject (f 1) within the f-structure of the sentence (JO).

SUBJ r lι . « ι



/o SUBJ fx PRED 'dog'

ADJUNCT [ PRED 'big' ]

Nordlinger (1998) shows how inside out statements associated with

case morphology can successfully describe a range of phenomena in-
cluding case stacking, case marking on discontinuous nominal con-
stituents (as shown here) and so-called modal case, introducing sen-
tential tense and aspect features. Although Nordlinger (1998) uses a
Verbal Morphology and Agreement in Urdu 67

morpheme-based morphology in her formulation of constructive case,

she explicitly states that the use of inside-out statements is indepen-
dent of a morphological model: "while I use a morpheme-based mor-
phology here, the overall analysis does not depend on this view of
morphology and could equally be adapted into a rule-driven analysis"
(Nordlinger 1998: 63).

3. Exploring the syntax-morphology interface: Urdu case

In the following sections, we attempt to give the preceding material

a more concrete form by dicussing some aspects of two different but
interrelated phenomena in Urdu morphosyntax: case and agreement.

3.1. Case, agreement and pro-drop

South Asian languages have in common a number of areal character-

istics which are also found in Urdu. These include the widespread use
of non-nominative subjects (see section 3.3 for some Urdu examples)
and the existence of relatively free word order among nominal argu-
ments of a clause, together with the ability to pro-drop quite freely. On
the other hand, the agreement patterns of these languages differ sig-
nificantly, irrespective of these shared areal characteristics, something
which is rather unexpected on the view (which we do not espouse)
which posits a very direct structural relationship between patterns of
agreement and case marking. The work of Mahajan (1990, 1992) on
Hindi is fairly representative of approaches which see structural Case
assignment as involving functional agreement heads.
The pretheoretical generalization with respect to Urdu verbal agree-
ment is as follows (cf. Mohanan 1994). The verb agrees only with an
unmarked (i.e. nominative) direct argument (subject or object). Verbal
agreement in Urdu involves person, number and gender. If the subject
is unmarked, agreement is with the subject (see (17a)). If the subject
is marked, but the object is unmarked, the verb agrees with the object
(see (17b)). If both subject and object are unavailable for agreement
68 Miriam Butt and Louisa Sadler

(that is, are overtly case-marked), the verbal complex reverts to third
person, masculine singular morphology ((17c)). In summary, one can
say that overt case repels verbal agreement.
(17) a. nadya gari cala-t-i h-εϊ
Nadya.F.Nom car.F.Nom drive-Impf-F.Sg be-Pres.3.Sg
'Nadya drives a car.'

b. nadya=ne/adnan=ne gari cala-yi h-εί

N.F=Erg/A.M=Erg car.F.Nom drive-i'erf.F.Sg be-Pres.3.Sg
'Nadya/Adnan has driven a car.'

c. nadya=ne gari=ko cala-ya h-εί

Nadya.F=Erg car.F=Acc drive-Perf.M.Sg be-Pres.3.Sg
'Nadya has driven the car.'
Unlike Urdu, some other South Asian languages do exhibit (ver-
bal) agreement with overtly case marked nominale. Nepali and Gu-
jarati have free word order among nominale, the ability to pro-drop
and also allow ergative and dative (psych predicates) subjects, and are
thus similar to Urdu in these respects. However, Nepali differs from
Urdu in that the verb agrees with the ergative subject, shown in (18).
(18) mai-le mero lugä dho-en
I-Erg my clothes.Nomwash-Past.l.Sg
Ί washed my clothes.' (from Deo and Sharma 2002)
In Gujarati, on the other hand, as shown in (19), the overt presence
of an accusative has no effect on verb agreement with the object (Butt
and Deo 2001). Again, this is unlike Urdu/Hindi, where overt case
blocks agreement.
(19) a. mai-le mero lugga dho-en
I-Erg I.M.Sg.Gen clothes.M.Pl.Nom wash-Perf.3.Pl
Ί washed my clothes.'

b. ram-e gadi-ne jo-yi

Ram.M.Sg-Erg car.F.Sg-Acc see-Perf.F.Sg
'Ram has seen a/the car.'
Verbal Morphology and Agreement in Urdu 69

In a recent typological study of agreement in South Asian languages

Subbarao (1999) finds what amounts to a singular lack of correlation
between the distribution of case and agreement. That is, in some lan-
guages the presence of a lexical case marker blocks agreement (e.g.,
Urdu/Hindi), while in others the presence of a lexical case marker is
obligatory for agreement (e.g., Maithili), and in still others the pres-
ence or absence of a case marker is of no consequence for agreement
(e.g., Mizo, Hmar, Paite). Subbarao postulates null agreement check-
ing when there are no strong agreement features. In our view, this es-
sentially divests the postulated connection between case and agree-
ment of any empirical consequences, and therefore amounts to giving
up on the strong connection between case and agreement postulated
in much derivational generative syntax (cf. Bhatt 2002 who comes to
essentially the same conclusion).
Another issue that arises in connection to the relation between case
and agreement is pro-drop; Urdu, like all South Asian languages, al-
lows rampant pro-drop. An example is shown in (20).

(20) a. tum=ne nadya=ko Ifana di-yal

you=Erg Nadya.F=Dat food.M.Sg.Nom give-Perf.M.Sg
'Did you give Nadya (some) food?'

b. ji, di-ya
yes.Polite give-Perf.M.Sg
'Yes, gave.'

One standard view is that pro-drop is correlated with rich verb agree-
ment (e.g., Rizzi 1986). As we have seen, Urdu does have rich verb
agreement and also permits pro-drop. However, as the monologue in
(21) from the Hindi movie Dilwale Dulhania Le Jayenge shows, both
agreement and case are orthogonal to the possibility of pro-drop. Note
in particular that since Urdu does not have indirect object agreement,
the permitted absence of the indirect object in (20) cannot be explained
in terms of licensing by agreement.
The first sentence in (21a) begins the monologue by referring to
some pigeons who are seen pecking at seeds outside. There is no pro-
70 Miriam Butt and Louisa Sadler

drop here. The topic of the utterance is nominative 'they'(=some pi-

geons). In (21b) and (21c) the 'they'(=pigeons) has been pro-dropped.
In (21b) the overt realization of 'they'(=pigeons) would be ergative.
This means that the verb is not showing agreemeent with the pro-
dropped element. Rather, as indicated, it agrees with the nominative
(unmarked) object. In (21c), on the other hand, the overt realization of
'they'(=pigeons) would be nominative. Consequently there is agree-
ment with the pro-dropped element here.4

(21) a. \ye\T bbi mer-i=ki tdräh-εϊ

Pron.3.Sg also I.Gen-F.Sg=Gen.F.Sg like be-Pres.Pl
'They fopic are also like me.' (Dilwale Dulhania Le Jayenge)

b. jahä dana delf-a

where seed.M.Sg.Nom see-Perf.M.Sg
'where (theycon, .topic) s e e a s e e d '

c. udar ga-ye or pet bar kar

there go-Perf.M.Pl and stomach.M.Sg.Nom fill having
ur ga-ye
rise go-Perf.M.Pl
'there (they contJopic ) go and having filled (their) stomach
fl awa
cont.topic^ y y·'
We thus conclude that case and agreement appear to function rela-
tively independently in South Asian languages. In the following sub-
section we explore the nature of the syntax-morphology interface by
taking a closer look at case and agreement patterns.

3.2. Case in Urdu

Case is marked in Urdu by a set of case clitics which derive diachron-

ically from nouns or participles in an earlier stage of the language
(see Butt and King 2002a). There are two cases which are not marked
overtly by means of case clitics. One of these is the so-called bare
Verbal Morphology and Agreement in Urdu 71

locative, which is nevertheless often identifiable by the occurence of

the head noun in an oblique form. The other one is the 'direct' case
found on subjects and direct objects. In this paper, we consistently
gloss this case as the nominative. See Butt and King (2002a, 2002b)
for an extensive discussion.
The distribution of case markers in Urdu is relatively complex. Urdu
is generally described as a split ergative language, with clausal tense
and aspect features (partly) determining the case marking pattern. The
ergative must appear on the subjects of transitives when the verb is
marked with perfect morphology (-a/-i/-e) as in (22). Subjects of un-
accusative intransitives are always nominative, whereas subjects of
unergative intransitives are optionally ergative (Davison 1999, see sec-
tion 3.3). Furthermore, the ergative alternates with a dative subject in
a be+infinitive construction (see section 3.3).

(22) a. ram gari cala-t-a (h-εί)

Ram.M.Sg.Nom car.F.Sg.Nom drive-Impf-M.Sg be-Pres.3.Sg
'Ram drives a car.'

b. ram=ne gari cala-yi (h-εί)

Ram.M.Sg=Erg car.F.Sg.Nom drive-Perf.F.Sg be-Pres.3.Sg
'Ram has driven a/the car.'

Note that (22) also illustrates the relation between verbal agreement
and the lack of case marking discussed above.

3.3. Case Alternations

The patterns of case alternations in the language provide evidence that

Urdu cases are not mere morphological reflexes of structural position
or agreement relations, but are 'constructive' in the sense of Nordlinger
(1998), serving to define the f-structure grammatical functions born by
the nominals they mark, and directly contributing to, and interact with,
clause level functional and semantic information.
72 Miriam Butt and Louisa Sadler

Consider, for example, the clitic ko. This clitic expresses both dative
and accusative case in Urdu. As a dative it is associated with goals (in-
direct objects, subject of pysch-verbs, etc.). As an accusative it marks
specificity of objects, much as in Turkish (Εης 1991), and alternates
with a nominative or unmarked object, as shown in (23).

(23) a. ram=ne jiraf defd-i

Ram=Erg giraffe.F.Nom see-Perf.F.Sg
'Ram saw a/some giraffe.'

b. ram=ne jiraf=ko delf-a

Ram=Erg giraffe.F=Acc see-Perf.M.Sg
'Ram saw the (particular) giraffe.'
Because the specificity of an object has an effect on the clausal se-
mantic interpretation in terms of further properties such as telicity or
aspectual interpretation (e.g., Krifka 1992; Tenny 1994), the case clitic
must be seen as interacting with the clausal syntax and semantics.
Mahajan (1992) attempts to correlate specificity with object agree-
ment, but his analysis makes exactly the wrong prediction with regard
to the interpretation of specificity. Under his analysis, the unmarked
object moves to an agreement position to receive Case. In this posi-
tion, the unmarked (nominative in our terms) object is also associated
with specificity features. However, as seen above, it is precisely the
unmarked agreeing object which allows for non-specific readings. The
ko (accusative) marked non-agreeing object, on the other hand, must
be interpreted as specific (see Butt 1993).
Another example of a productive case alternation is found with in-
transitive unergative clauses. As shown in the minimal pair in (24), the
ergative alternates with the unmarked nominative to produce a mean-
ing difference in terms of intentionality/volitionality. Again, the clear
meaning difference is marked solely by the case alternation on the
(subject) with this class of verbs.

(24) a. ram l&äs-a

Ram.M.Nom cough-Perf.M.Sg
'Ram coughed.'
Verbal Morphology and Agreement in Urdu 73

b. ram=ne l&äs-a
Ram.M=Erg cough-Perf.M.Sg
'Ram coughed (purposefully).'

Similarly, with the be+infinitive in (25) the choice of ergative or da-

tive case determines sentential features. While the ergative introduces
a desiderative modality, the dative functions more as a default marker
which could be interpreted either in terms of obligational force or with
the desiderative modality, depending on the particular context of the
utterance. Note that the verbal complex itself is once again invariant
across this pair of sentences.

(25) a. nadya=ne zu ja-n-a h-εϊ

Nadya.F=Erg zoo.M.Loc go-Inf-M.Sg be-Pres.3.Sg
'Nadya wants to go to the zoo.'

b. nadya-ko zu ja-n-a h-ei

Nadya.F=Dat zoo.M.Loc go-Inf-M.Sg be-Pres.3.Sg
'Nadya has/wants to go to the zoo.'

3.4. Constructive case in Urdu

In the above sentences, the semantic contrasts are directly related to the
choice of nominal case. This is consonant with the constructive view
of case (and other 'functional' features) taken in Nordlinger (1998),
but is at odds with the standard view in derivational approaches, in
which case is seen as a mere spell-out of functional features.
Furthermore, as word order is largely free in Urdu (e.g., Maha-
jan 1990; Butt 1995; Kidwai 1997), and neither case nor grammatical
functions can be associated with any particular structural position (Butt
1995), it is evident that the case markers themselves play an important
role in determining the grammatical functions of the noun phrases they
are attached to. This entails an essentially constructive view of case cl-
itics and can be stated in a simple and intuitive fashion by means of
the lexical entry for the dative/accusative ko shown in (26).
74 Miriam Butt and Louisa Sadler

(26) ko:Κ
[ (tCASE) = ACC
(fSPECIFIC) = + ]

[ (tCASE) = DAT
( ° B W t ) V (SUBJt) ]
The entry for ko allows for 3 possibilities. As an accusative marker,
it 'constructs' an object: the statement (OBJ f ) has existential force
and specifies that the f-structure of the accusative case marked nomi-
nal (t) is the value of the OBJ attribute in an immediately containing
f-structure. The first disjunct also states that the object will have ac-
cusative case and that it is to be interpreted as specific. The second
disjunct covers the dative uses of ko. Datives can be either indirect
objects (OBJg 0 ) or subjects, as in (25b) above. Again, the inside-out
statements (OBJ go f ) and (SUBJ ΐ ) specify that the f-structure of the
case marked nominal is the OBJ go or the SUBJ in the f-structure corre-
sponding to the clause. This entry captures the distribution and seman-
tic effect of the accusative/dative ko efficiently and accurately without
further recourse to syntactic rules.
We stated above that Urdu case markers are syntactic clitics. As
such, the natural treatment of them in LFG is as co-heads of the NP
they mark, possibly as members of a (functional) category K, as shown
in (28) (see Butt and King 2002a for a discussion of the use of KP with
respect to Urdu).
The nominative is phonological null in Urdu and therefore cannot
receive an entry on a par with the other (phonologically substantial)
case markers.5 We assume that in the absence of overt case particles,
nominative case (i.e., (fCASE) = NOM) and the associated constructive
identification of the grammatical function is assigned via default rules.
For a detailed analysis of this and the case alternations presented above
see Butt and King (2002a, 2002b).
(27) larke=ko
Verbal Morphology and Agreement in Urdu 75

(28) KP

t=4 t=l

Ν ko


3.5. Case clitics vs. case inflection

A fundamental notion in LFG is that while morphology and syntax

may make the same type of functional contribution to the analysis
of a sentence (in different languages or, sometimes within the same
language, e.g., synthetic and analytic expression of temporal and as-
pectual clausal features in Latin, Sanskrit, Italian, French, Welsh and
many other languages), this does not imply that syntax and morphol-
ogy share an expression structure. As we have noted above, the prin-
ciple of strict lexicalism itself requires that morphological structures
are opaque to the syntax. If syntax and morphology are, in this sense,
equal partners, it is expected that the two means of expression may
co-occur and even compete within the same language.
A particular instance of syntax and morphology as alternative means
of expression occurs in Urdu where pronominals may bear inflectional
case marking, while lexical noun phrases, as we have seen, are gener-
ally marked by case clitics, which are syntactic elements. The inflec-
tional case marking on pronominals is a vestige of the earlier Sanskrit
case system. In some cases, pronouns permit either a case clitic or an
inflectional case marker, as shown in (29b).
(29) a. nadya=ko dakxane ja-n-a
Nadya.F=Dat/I.Obl-Dat post offic.M.Obl.Loc go-Inf-M.Sg
'Nadya has/wants to go to the zoo.'
76 Miriam Butt and Louisa Sadler

b. muf=ko/muf-e dakxane ja-n-a

I.Obl=Dat/I.Obl-Dat post offic.M.Obl.Loc go-Inf-M.Sg

Ί have/want to go to the zoo.'

While the dative noun in (29a) can only be marked with a case
clitic, the pronoun in (29b) is more permissive. The inflectional affix
-e contributes exactly the same information to the syntactic and se-
mantic analysis as the ko case clitic in (26). Again, see Butt and King
(2002a) for a more in-depth discussion of the pronominal case marking
paradigm in Urdu.6

3.6. Summary

The brief discussion of case marking in this section has illustrated how
the role of case can be captured simply and straightforwardly in LFG.
Case markers are treated as syntactic elements introducing a set of
constraints over f-structure (and contributing semantic information).
Case-inflected pronominals are associated with an identical set of (f-
structure) constraints. Urdu case is 'constructive' in the sense that the
cases themselves project the grammatical functions. In Urdu, the rela-
tion between case marking and grammatical function is complex and
there is no simple structural relation between grammatical relation and
position. Furthermore, case alternations are almost exclusively seman-
tically motivated. All these aspects of the role of case in Urdu can be
captured straightforwardly in LFG. In the following section, we turn
to verbal agreement. The verbal agreement pattern in Urdu is wholy
inflectional and involves person number and gender agreement dis-
tributed over parts of periphrastic expressions. We do not deal with the
details of the formation of periphrasis here. While both case clitics and
case inflections can be analysed straightforwardly in LFG, the analysis
of verbal agreement uncovers some challenges.
Verbal Morphology and Agreement in Urdu 77

4. Agreement

4.1. The verbal tense/aspect system in Urdu

The modern South Asian languages display various differing agree-

ment systems, most of which involve number, person and gender (but
not case) agreement in some distribution over auxiliaries and former
participles. These different systems are descended from the Sanskrit,
which had person and number agreement in the verbal domain and
number, gender and case agreement on adjectives and participles. In
this section, we confine our discussion to verbal agreement.
With respect to verbal agreement, South Asian languages differ
considerably. The modern Indo-Aryan language Bengali, for exam-
ple, has dispensed entirely with gender and number agreement in the
verb, while other languages like Marathi have re-introduced person
agreement. Despite these differences, the Urdu/Hindi verbal paradigm
is fairly representative of South Asian (Indo-Aryan) languages.
Urdu has a mixed inflectional and periphrastic tense/aspect system.
The indicative pattern summarized in (30) is illustrative for the verb
mar- 'hit' (the equivalent of the English simple present is rendered
periphrastically in Urdu as shown in several of the examples below).

(30) The Basic Modern Urdu Tense/Aspect System

Impf Perf Prog
Pres Past Fut Pres/Past Pres/Past Pres/Past
mara marega marta mara mar raha
+ Aux (be) + Aux (be) + Aux (be)
mar- 'hit' — 3.Sg.M

Amongst these forms, the future is the only form which inflects for
number and person.
78 Miriam Butt and Louisa Sadler

(31) Urdu Future Paradigm

Singular Plural Respect (ap) Familiar (tum)
1st mar-ü-g-a/i mar-e-g-e/i
2nd mar-e-g-a/i mar-e-g-e/i mar-o-g-e/i
3rd mar-e-g-a/i mar-e-g-e/i
mar-' hit'
The vowel encoding number and person agreement (immediately
following the stem) might be a vestige of the auxiliary h-εi 'be' (Mc-
Gregor 1968:161), or may derive historically from the original present
inflections (Ashwini Deo, p.c., August 2000). Compare the paradigm
for the present tense of the modern Urdu verb ho- 'be'.
(32) Present of Urdu be
Singular Plural Respect (ap) Familiar (tum)
1st h-ü het
2nd h-εί he! h-o
3rd h-ei he!
ho- 'be'
This verb is the only verb that has a present tense. When this same
morphology (u, e/ε, ο) appears on the other verbs in the language, it is
interpreted as subjunctive or imperative. These tenses are thus the only
other forms which show person and number agreement. They are also,
like the forms of the verb ho 'be', remnants of the original Sanskrit
system. Present tense readings for other verbs are expressed via the
progressive and imperfect periphrastic constructions, as illustrated in
(37) and (39).
The other parts of the verbal morphological paradigm involve only
number and gender agreement. Table (33) shows the 'imperfect' (ha-
bitual) forms for mar 'hit'.
(33) Urdu Imperfect
Singular Plural Respect (ap) Familiar (tum)
Μ mar-t-a mar-t-e mar-t-e mar-t-e
F mar-t-i mar-t-i mar-t-i mar-t-i
Verbal Morphology and Agreement in Urdu 79

All the forms which show number and gender agreement derive from
participles. Table (34) shows the "perfect" forms for mar 'hit', which
also inflects for number and gender.

(34) Urdu Perfect

Singular Plural Respect (ap) Familiar (tum)
Μ mar-a mar-e mar-e mar-e
F mar-i mar-i mar-i mar-i
mar- 'hit'

The folllowing table summarizes the distribution of number, person

and gender agreement across the verbal forms in Urdu.

Verb Form Number and Number and

Gender Person
Past, Perfect,
Imperfect, Progressive, —

Past 'be'
Subjunctive, —
Non-past 'be'
Future y/ y/

The core periphrastic tenses in Urdu are the present and past im-
perfects, present and past perfects and present and past progressives.
These analytic tenses take the form of a non-finite (aspectually marked)
form of the verb (inflected for gender and number) combined with a
form of the auxiliary be. Hence the present imperfect arises from a
combination of the forms in (32) and (33).
The verb ho 'be' lacks past morphology (though it can appear in
the present (32), future, imperfect and perfect). It therefore forms a
suppletive paradigm with another 'be' verb: f - (based on a participle
form of Sanskrit sthä 'stand'). This verb only has a past form, as shown
in (36). The past imperfect thus combines the forms in the table (33)
with those in (36), which are inflected for number and gender.
80 Miriam Butt and Louisa Sadler

Past of Urdu be
Singular Plural Respect (ap) Familiar (tum)
h h
1st t -a/i t -e/i
2nd h
t -a/i th-e/i th-e/r
3rd th-a/i th-e/i

An example of the periphrastic imperfect is given in (37).

(37) a. nadya gari cala-t-i h-ei

Nadya.F.Nom car.F.Nom drive-Impf-F.Sg be-Pres.3.Sg
'Nadya drives a car.'

b. nadya gari cala-t-i P-i

Nadya.F.Nom car.F.Nom drive-Impf-F.Sg be.Past.F.Sg
'Nadya used to drive a car.'

Similarly the present perfect combines the forms in (34) with (32) and
the past perfect the forms in (34) with (36).7

(38) a. nadya=ne gari cala-yi h-ei

Nadya.F=Erg car.F.Nom drive-Perf.F.Sg be-Pres.3.Sg
'Nadya has driven a car.'

b. nadya=ne gari cala-yi f-i

Nadya.F=Erg car.F.Nom drive-Perf.F.Sg be.Past.F.Sg
'Nadya had driven a car.'

The progressive combines forms of rah 'stay' with the above forms.
This verb is inflected just like mar 'hit' in tables (31), (33) and (34)
above. As the progressive does not concern us any further here, we
simply provide some examples in (39).

(39) a. nadya gari cala rah-i h-ei

Nadya.F.Nom car.F.Nom drive stay-Perf.F.Sg be-Pres.3.Sg
'Nadya is driving a car.'
Verbal Morphology and Agreement in Urdu 81

b. nadya gari cala rah-i

Nadya.F.Nom car.F.Nom drive-Perf.F.Sg stay-Perf.F.Sg
'Nadya was driving a car.'

c. nadya gari cala-ti rah-t-i

Nadya.F.Nom car.F.Nom drive-Impf.F.Sg stay-Impf-F.Sg
'Nadya keeps driving a car.'

d. nadya gari cala-ti rah-t-i

Nadya.F.Nom car.F.Nom drive-Impf.F.Sg stay-Impf-F.Sg
'Nadya used to keep driving a car.'

As is clear from the above, the tense/aspect system of Urdu is quite

complex and a full description would take us beyond the scope of this
paper. For example, in addition to the forms illustrated above, morpho-
logical reduplication can be used to expression repetitive action.

4.2. Formalizing the agreement generalizations

To illustrate how verbal agreement is analysed, we deal only with the

number and gender agreement exemplified above—it should be clear
that the person/number agreement (found with the present of 'be' and
in the (synthetic) future tense) can be expressed along the same lines. A
fully inflected word such as calayi 'drive-Perf.F.Sg' will be associated
with the following functional information in the syntax (the predicate-
argument structure has been left out for purposes of illustration):
82 Miriam Butt and Louisa Sadler

(40) cala-yi
( t A S P ) = PERF
[ (tSUBJ CASE) = c NOM

The agreement information in the entry in (40) is within the dis-

junction (notated by means of [ ] V [ ]). Recall that the choice of this
(non-default) verb form is only appropriate so long as either the sub-
ject or the object is a feminine singular nominative nominal. The first
disjunct covers the case where the subject is nominative: it may be
read as assigning the attribute value pairs GEN:FEM and NUM:SG to
a subject on condition that the subject is nominative. The constrain-
ing equation, notated by means of the constrained equality =c, does
not assign CASE:NOM to the subject but checks for the presence of
this attribute value pair (this is thus a filter rather than an informa-
tion defining statement). In similar fashion, the second disjunct assigns
the attribute value pairs GEN:FEM and NUM:SG to the object provided
that the subject is not nominative and on condition that the object is
nominative.8 As is evident, agreement (a) is seen essentially as a lexi-
cal constraint and (b) involves the verbal element directly constraining
values of the index features of the f-structure of the nominal, a view
which is quite standard in LFG.
By way of illustration, consider the syntactic analysis of the sen-
tence in (41), in which the verb agrees with the (nominative) object.

(41) gari adnan=ne cala-yi h-ei

car.F.Nom Adnan.F=Erg drive-Perf.F.Sg be-Pres.3.Sg
'Adnan has driven a car.'

As noted above, the order of nominal constituents within the clause

is syntactically rather free in Urdu (see Mahajan 1990; Butt 1995; Kid-
Verbal Morphology and Agreement in Urdu 83

wai 1997). This is modelled in LFG by means of a nonprojecting en-

docentric category S which directly dominates nominal constituents
mapped to clause internal grammatical functions (cf. King 1995; Bres-
nan 2001). The precise determination of the grammatical function (GF)
associated with each KP is rendered via an interaction of information
from the case clitics and the verbal predicate.9

(42) S —• KP*, V'

(tGF)=+ t=l

(43) S

(TGF) = ; (|GF) = ^



For the example at hand, these jointly determine that adnan ne is the
SUBJ and gari 'car' is the OBJ. Consider now the lexical information
associated with calayi (40). Since the SUBJ is not nominative and the
OBJ is nominative, the first disjunct does not apply, but the second
does, requiring that the agreement features born by the OBJ itself are
consistent with the values contributed by the verb, namely, FEM and
SG. The corresponding f-structure analysis is shown in (44).
As can be seen in (44), agreement is dealt with in LFG at the level of
f-structure by permitting the verbal head to directly constrain the agre-
meent features of the argument, constraints which are not mediated by
the level of constituent-structure representation.
84 Miriam Butt and Louisa Sadler


PRED d r i v e < (tSUBJ ) (tOBJ) > '
PRED 'Adrian
PRED 'car' "


5. Α simple morpheme-based morphology

As a syntactic theory which incorporates a strong lexicalist hypoth-

esis, LFG is in principle compatible with any theory of morphology
from lexical to inferential realizational (e.g., Stump 2001). However,
where the main focus has been on primarily syntactic phenomena,
work in LFG has generally assumed, largely for expository conve-
nience, a version of a word syntax approach, in which morphologi-
cal structures are phrase structure trees of the familiar sort, with stems
and affixes as leaves, associated with their own separate lexical en-
tries. For our examples, this involves defining sub-lexical entries for af-
PERS:3), and so on, and for the verb stems, and a set of sublexical
phrase structure rules to combine these elements. We explore a num-
ber of ways in which this is less that ideal.
Verbal Morphology and Agreement in Urdu 85

5.1. Default agreement

The disjunctions (number of possibilities) needed for the treatment of

masculine singular verb forms, such as calaya 'drive', are more com-
plex than the example shown with feminine singular agreement in the
previous section in (40). The reader will recall from earlier discussion
that verb agrees with the highest nominative argument in the clause
(where SUBJ is higher than OBJ) and in the absence of such an argu-
ment defaults to the masculine singular. Thus masculine singular forms
have both an agreement and a non-agreement (that is, default) usage.
A partial sample entry is shown in (45).
(45) - y a , V ^ C T A S P ) = PERF [ ( | S U B J CASE) = C NOM

This (very standard) approach certainly guarantees the correct syn-

tactic behavior—the constraints expressed in sub-lexical entries for the
verbal agreement suffixes (such as (45) and (40) ensure that feminine
singular verbs co-occur with feminine singular nominative arguments,
masculine plural verbs with masculine plural nominative arguments,
and so on, as well as ensuring that the (default) masculine singular
is the only form available in the absence of a nominative argument.
However what this fails to express is the fact that forms suffixed by
-ya such as calaya, diya, and so on, are morphologically masculine
singular forms. The obvious generalization, namely: if there is no ap-
propriate argument to agree with, use the masculine singular form, is
not directly expressed. This cannot be expressed in terms of f-structure
features, for adding agreement information to the third (default case)
disjunct as in (46) above would simply and inappropriately specify the
clause itself as masculine singular.
86 Miriam Butt and Louisa Sadler

(46) (fSUBJ CASE) φ NOM

(|NUM) = SG
What seems necessary is to recognise a distinction between mor-
phological and syntactic features, which is not captured in this most
simple approach in which affixes are directly associated with f-structure
features. Several proposals exist for an additional, morphological or
morphosyntactic projection (the μ projection) which permits a clearer
separation of purely morphological from syntactic features. The pos-
tulation of an additional projection is, of course, independent of the
choice between a morpheme-based or a realizational model of mor-
phology. The μ projection or level of representation was first pro-
posed by Butt, Nino and Segond (1996) in order to deal with mor-
phological wellformedness dependencies in German and English ver-
bal complexes. The subsequent proposal by Frank and Zaenen (2000)
assumes a morpheme-based morphology, while Sadler and Spencer
(2001) combine the use of an m-structure for morphological features
with a realizational model of morphology. The proposals by Frank and
Zaenen (2000) and Sadler and Spencer (2001) use an architecture in
which m-structure is projected from f-structure as shown in (47).
(47) c-str —> f-str m-str
This is in many ways counterintuitive, and in section 6 we discuss
another possible model of the morphology-syntax interface—one that
is instantiated and exemplified by the use of finite-state morphological
analyzers in conjunction with syntactic representations. But first, we
return to a further discussion of the word-syntax approach.

5.2. Controlling combinations—the role of paradigms

In a word syntax approach, there is no role for the notion of paradigm.

Affixes themselves are treated as meaning/form pairs. One of the prob-
lems that this entails is that of controlling adequately the combina-
tion of stems and affixes. This is often possible only at the cost of the
Verbal Morphology and Agreement in Urdu 87

introduction of additional, otherwise unmotivated, features (in the f-

structure) and/or the introduction of otherwise unnecessary categorial
distinctions. A case in point is provided by the inflectional possibilities
for the two verbs 'be': one of which occurs only in past tense forms
(inflecting for number and gender) and the other of which occurs in
the other forms (inflecting for person and number).

(48) a. tum yahäh-o

you.Nom here be-Imp.2.Sg
'You be here!'
a. tum yahä f-i
you.Nom here be.Past-F.Sg
'You were here.'

It is clear that these two 'be' verbs form part of the same paradigm:
both verbs function as auxiliaries in exactly the same way and both
verbs have the same copula uses. The f - is restricted to the past and
thus forms a suppletive paradigm with the ho form. As such, one would
like to be able to deal with both the 'be' verbs by means of a sin-
gle unifying sublexical rule such as in (49). This rule produces tensed
auxiliaries by taking a base form of an auxiliary and affixing a tense
morpheme such as -εί or -i above in (48a) and (48b), respectively (a
sample sub-lexical tree produced by this rule is shown in (50)).

(49) COPtns —» COPtns-BASE V-TNS-AFF

(50) COPtns

t=l t=4

However, if no further specifications are added, this rule will also

produce the incorrect sentence in (51), where the past 'be' auxiliary
is invested with person and number agreement morphology, rather than
number and gender morphology.
88 Miriam Butt and Louisa Sadler

(51) *tum yahäf-o

you.Nom here be.Past-2.Sg
'You were here.'
On the word syntax view, the person-number affix is a sub-lexical
leaf of the same kind as the number-gender affix, and this makes it
difficult to control the combinatorial possibilities. The possibility of
(51) must be ruled out via cumbersome constraints in the (sublexical)
lexion, whereby the person-number affix for the 'be' verb is prohib-
ited from appearing with the past tense and the number-gender affix is
restricted to the past tense.10
Explicit recognition of a suppletive 'be' paradigm in which the past
tense stem is and all other cells involve the stem ho- would permit us
to capture this restriction directly (rather than by doing what amounts
to engineering an 'accidental' distributional gap) for both copula and
auxiliary uses of 'be'.
Furthermore, in directly pairing content and form in lexical en-
tries for inflectional affixes, the word syntax approach can easily fail
to make the distinctions which we wish to make on morphological
grounds from those which have syntactico-semantic content. Thus on
main verbs the use of present tense inflections signals subjunctive mood
(recall that present tense does not occur otherwise on lexical verbs—
present time reference is signalled periphrastically).
(52) nadya a-yel
Nadya.F.Nom come-Pres.3.Sg
'Should Nadya come?
On the other hand, the present morphology on 'be' in both its cop-
ula and auxiliary use does indeed signal present tense. This would
seem to be a case where the verbs share a set of morphological forms,
but these map differently to f-structure information (or syntactic feat-
ural content). This could be stated simply if we separated morphology
(and morphological features) cleanly from the syntax (cf. Sadler and
Spencer 2001), but again, in the present approach, cumbersome stipu-
lation is required by means of an explicit rule which refers only to the
non-past copula.
Verbal Morphology and Agreement in Urdu 89

5.3. Allomorphy and homophony

Finally, the (word syntax) grammar is considerably complicated by

the existence of a high degree of allomorphy and homophony in Urdu.
Allomorphy causes a duplication of lexical entries—indicating clearly
that a more adequate approach would need at least to factor out pre-
dictable morphophonological alternations. The degree of homophony
in the Urdu inflectional system is striking, with the inflection -e (a
mid vowel which many originally differing morphemes have devel-
oped into in the course of language change), for example correspond-
ing to at least the following functions: (i) number and gender with
some verbal forms (perfect, imperfect, future); (ii) person and number
with other verbal forms (future); (iii) oblique inflection on infinitives
and nouns; (iv) adjective agreement; (v) nominal agreement.
Treating this by means of a single sub-lexical entry for the affix
requires a complicated disjunction of conditional statements with sev-
eral substatements (to deal with the suppletive 'be' forms). This is dif-
ficult to write, error prone, and more importantly, does not capture
any useful linguistic generalization. Note that this type of homophony
is crosslinguistically extremely common. It is therefore imperative to
adopt an approach to the morphology-syntax interface which can deal
with these kinds of phenomena elegantly and correctly.

6. Finite-state morphology

In this section we provide a very brief illustration of how some of these

issues can be addressed within a computational environment using fi-
nite state technology (Karttunen, Kaplan and Zaenen 1992; Kaplan and
Kay 1994) to encode morphological descriptions, interfaced appropri-
ately to the syntax (e.g., see Butt, King, Nino and Segond 1999). The
use of finite state technology does not, of course, constitute a theoret-
ical model of morphology, but it sheds an interesting (practical) light
on concerns which interest theoretical and descriptive morphologists.
The standard application of finite-state technology to morphologi-
cal analysis and generation involves transduction between two levels or
90 Miriam Butt and Louisa Sadler

strings—a surface form (that is, the actual word form itself) and what
is known as a lexical form, which is typically the canonical dictionary
form (lemma) and a string of features or morphological subcategories.
These levels are related by transducers which directly encode morpho-
logical alternations, and relate all inflected forms of the same word to
the same canonical dictionary form (lemma), accompanied by differ-
ent features. Clearly, moving from a lemma or root and set of features
to a surface form is very much like looking up a surface form in a
paradigm listing, a point of similarity between finite state morphology
and theoretical realizational models. The examples below show some
surface/lexical pairings for simple words, illustrating the sorts of mor-
phological features found in lexical forms.

(53) dogs
1. dog+Noun+Pl
2. dog+Verb+Pres+3sg

(54) kutte 'dogs'

1. kutt+Noun+Masc+Pl
2. kutt+Noun++Masc+Sg+Obl

As can be seen, an English form such as dogs is analyzed as be-

ing one of two possible forms: a plural noun or a present singular
verb. The base form (lemma) 'dog' is returned in the morphological
analysis, but the morpheme -s finds no representation beyond the list
of features. The Urdu example illustrates the same thing, except that
Urdu nouns are often marked for gender. The form kutte in Urdu can
either be analyzed as the simple unmarked plural form, or as a singular
form with oblique inflection. This oblique inflection is a prerequisite
for case marking, as in kutte=ko 'to the dog'.
The relation between a surface (realized) string and an abstract anal-
ysis into a lemma and one or more tags may be thought of as the mor-
phology proper—in analysis mode, the output is a lemma and string
of (morphological) features, in generation mode, the output is a sur-
face string. The finite state morphological component is entirely sep-
arate from the syntactic component. However, the abstract tags (string
Verbal Morphology and Agreement in Urdu 91

of morphological features) can be used to define an interface to the

syntax. This interface consists of two parts: 1) a sub-lexical grammar
which parses the tags; 2) an association of syntactically relevant infor-
mation with the tags.
The sub-lexical grammar which parses the tags takes the form of
sub-lexical rules as in (55), which takes care of tensed copulas.

(55) COPtns — • COPtns-S V-T V-F*

At first glance, this rule seems identical to the rule presented in the pre-
vious section in (49). However, whereas the rule in (49) was designed
to parse morphemes, this rule is designed to parse the abstract features
or tags provided by the morphological analyzer. Consider the analyses
of the 'be' verbs hei and te, respectively.

(56) a. hei
1. be+Verb+Pres+3P+Sg
2. be+Verb+Pres+2P+Sg

b. the
1. be+Verb+Past+Masc+Pl

The rule in (55) parses the tags in (56): it expects a base form (be)
followed by a verbal tag (V-T) in the rule, (+Verb is the tag), followed
by any or no number of verbal features. This allows for the tense, gen-
der, number and person features such as +Past, +P1, +Masc. Due to the
fact that the morphological analyzer itself only ever allows the +Past
tag in association with the t - forms, the problem of having to constrain
the grammar so that only the right kinds of morphemes appear with the
right kind of stem does not arise. The well-formedness of sequences of
tags is guaranteed by the definition of continuation classes in the finite
state morphology, which specify what tags can be consumed/output in
transition from one internal state to the next. In (57) the first state is
'be', the next state is '+Verb\ etc. These continuation classes can be
defined in simple and general terms.
So, for example, it should have been clear from section 4 on agree-
ment that there are a number of verb forms which take number and
92 Miriam Butt and Louisa Sadler

gender affixes of the same kind. The verb f - is one of these. In fact,
this verb only allows this morphology and is inherently past. This basic
property can be taken care of by defining the tag +Past as part of the
entry for f - , as shown in (57). The 'GendV' defines the continuation
class for this verb as containing form: content pairs signaling number
and gender, as shown in (58). These morphs have no further continua-
tion class (indicated by the '#'), and thus no other affixal processes.

(57) be+Verb+Past:th GendV;

(58) GendV
+Masc+Sg:a #;
+Fem+Sg:i #;
+Masc+Pl:e #;
+Fem+Pl:i #;

The 'be' verb ho, on the other hand, is defined with a different
continuation class because this verb, unlike P-, can occur in a vari-
ety of tenses and aspects. One of its possible continuation classes is to
take person and number morphology, as we have seen in the examples
above. This continuation class, called 'PN' is defined as in (59).

(59) PN
+Pres+lP+Sg:ü #
+Pres+2P+Sg:ei #
+Pres+3P+Sg:ei #
+Pres+lP+Pl:gT #
+Pres+2P+Resp: εϊ #
+Pres+3P+Pl:£i #
+Pres+2P+Fam:o #

Note that there is no one-to-one correspondence between tags such

as +Past or +Pres and forms. The stem f - is itself associated with the
Past feature, while in the case of ho, on the other hand, the tense may
differ. If the verb is followed by only 'person and number' morphol-
ogy, the interpretation has to be present.
Verbal Morphology and Agreement in Urdu 93

In effect them, whereas the word-syntax approach of the previous

section dealt with the combinatorial possibilities of stem and affixes
with sub-lexical rules which also provided the f-structure information,
the current approach separates out the morphology proper from the
morphology-syntax interface, and the combinatorial possibilities are
dealt with within the morphological component which relates surface
forms to an abstract system of tags/features.
The morphological features are integrated into the LFG analysis via
a mapping between the sets of morphological tags and f-structure at-
tribute value pairs. In the simplest case, the tag +Pres, for example,
gives rise to the f-structure information TENSE:PRES, as shown in (60).

(60) +Pres: V-F ( | T E N S E ) = PRES

It is this mapping which introduces the necessary separation be-

tween morphological and syntactic features to provide an intuitive treat-
ment of both default and agreeing uses of masculine singular verb
forms, as discussed in the previous section: morphologically, all uses
are masculine singular, but syntactically, f-structure constraints over
the SUBJ or OBJ are introduced only in the non-default usage.
The finite-state analysis we have briefly sketched is not a morpheme-
based analysis of the type explored in the first implementation. The
mapping between surface forms and functional (f-structure) or well-
formedness (m-structure) information is not modeled via a simple re-
lation between surface morphology and abstract information. Rather,
several levels of analysis are presupposed. The first is a mapping from
a surface string to a more abstract representation into a lemma and tags.
This level of analysis includes a treatment of morphophonologically
predictable allomorphy via the integration of phonological rules into
the morphological analysis. All this is invisible to the morphology-
syntax interface. The morphologically driven information becomes vis-
ible to the syntax only through the association of the abstract tags with
f-structure information, as shown in (60).
This more complex view of the morphology-syntax interface fur-
thermore has the advantage that paradigms can be identified within the
morphological component, rather than being the accidental by-product
94 Miriam Butt and Louisa Sadler

of sublexical rules within the syntactic component (cf. the two 'be'
verbs in (58) and (59), which can now be considered part of the same
paradigm, though allowing for differing continuation classes).
Within this architecture, the initial morphological analysis is arrived
at within an independent module whose syntax is of a very differ-
ent kind than that of a grammar (e.g., continuation classes and mor-
phophonological rules rather than phrase structure trees). The architec-
ture permits a separation of strictly morphological information from
syntactic information and preserves lexical integrity: at the point of
lexical insertion the abstract morphological information encoded in the
tags is mapped into information relevant for the f-structural analysis of
the sentence. Although this mapping between tags and LFG features
sets will often be quite trivial, the many-to-one nature does permit a
sophisticated and clearly defined morphology-syntax interface.

7. Conclusion

This paper has presented some aspects of LFG morphosyntax, dis-

cussing in particular case and agreement in Urdu, and showing how
both can be given a straightforward analysis in LFG. We then turned
to the nature of the morphological component itself, looking more
closely at issues of morphological representation. After sketching a
morpheme-based word syntax approach to the generalizations previ-
ously formulated, we discussed a number of insufficiencies in this ap-
proach and showed how they can be largely overcome under the as-
sumption of a different model of the morphology-syntax interface.
LFG'S stance on morphology can perhaps be compared to its stance
on semantics: in principle, several differing theories are compatible
with the fundamental architectural assumptions of LFG. For example,
Discourse Representation Theory (Kamp and Reyle 1993) is in princi-
ple just as compatible with LFG as a semantic analysis based on linear
logic (Dalrymple 2001). Similarly, morpheme-based analyses are in
principle just as compatible with the fundamental tenets of LFG as are
realizational models or finite state morphologies. The basic require-
ments that LFG does impose on the morphological component is that it
Verbal Morphology and Agreement in Urdu 95

respect lexical integrity (as such Distributive Morphology is not com-

patible with LFG) and that the functional information contributed by
morphological elements be able to play a role in the syntactic and se-
mantic analysis of a clause on a par with phrasal information.
After exploring two differing approaches to the morpho-syntax in-
terface, we concluded that the method traditionally used by theoretical
linguists (often more as a matter of convenience), namely, morpheme-
or word-based morphology, has several undesirable side-effects. The
architecture assumed forfinite-statemorphological analysis (employed
primarily within computational circles), on the other hand, would seem
to yield the desired results. For one, the morphological component op-
erates according to rules and principles which are distinct from that
of the syntax. For another, the interface to the syntactic and semantic
analysis abstracts away from the surface realization of the morphemes,
but still allow functional information contributed by the morphology
to enter the syntactic analysis in a well-defined and regular manner.
This is precisely the kind of syntax-morphology interface required for
an analysis of languages with rich morphology.


1. This paper was first presented as part of the Workshop on Clause Structure and
Models of Grammar from the Perspective of Languages with Rich Morphol-
ogy at the DGfS in Leipzig. We would like to thank Uwe Junghans and Luka
Szucsich for organizing the workshop. The issues discussed in this paper arose
partly out of a workshop organized by Louisa Sadler and Andrew Spencer on
Morphology in LFG during the LFG conference held at Berkeley in 2000. We
would like to thank an anonymous reviewer, Mary Dalrymple, Anette Frank,
Ron Kaplan, Lauri Karttunen, Rachel Nordlinger, Andrew Spencer and Annie
Zaenen for some very stimulating discussion. Miriam Butt's contribution to this
paper was made possible by financial support from the DFG (the German Sci-
ence Foundation) via the SFB 471 at the University of Konstanz.
2. The South Asian languages Urdu and Hindi are closely related. Both are among
the official languages of India and are spoken primarily in the north of India.
Urdu is the national language of Pakistan.
3. The Principle of Morphological Composition is given as (i), where χ is a string
of attributes:
96 Miriam Butt and Louisa Sadler

i. Stem Äff =ί> Stem Äff

( ( G F ^ t j ) ((GFm ( t ) ) x) (GF"t) ((GFm (GFn f))*)

This principle is defined in terms of annotated trees in the morphology. In a

realizational model, it would seem to require an operation over f-descriptors to
be associated with the application of certain morphological operations.
4. A reviewer notes that a similar combination of lack of verb agreement and ex-
tremely rampant pro-drop also characterises a number of other South Asian
languages, thus reinforcing our point.
5. Bresnan (2001) and Falk (2001) make limited use of empty categories in un-
bounded dependency constructions in some coniigurational languages. Some
arguments in favour of this analysis are adduced from weak crossover—however
Dalrymple, Kaplan and King (2001) argue that no empty categories are needed
in LFG even in view of the weak crossover data.
6. The form *muf-e=ko is blocked due to syntactic status of the pronoun muf-e
as a Κ heading a KP: the case clitics take NPs as sisters.
7. The auxiliary is required when past (that is, in the past imperfect, past perfect
and the past progressive), but is optional when present (in the present imperfect,
perfect and progressive).
8. Of course, there is no relevance to the order in which the disjuncts appear in the
lexical entry—they could as well be in the opposite order.
9. For more details see Butt (1995), Butt and King (2001,2002a, 2002b).
10. This is a simple illustration of a much larger problem of controlling combinato-
rial possibilities. The issues much more tricky in languages with more complex
morphology, for example where there are dependencies between affixes.


Anderson, Stephen R.
1992 A-morphous Morphology. Cambridge: Cambridge University Press.
Andrews, Avery
1996 Semantic case-stacking and inside-out unification. Australian Journal
of Linguistics 16(1): 1-55.
Andrews, Avery and Christoper Manning
1999 Complex Predicates and Information Spreading in LFG. Stanford, Cal-
ifornia: CSLI Publications.
Aronoff, Mark
1994 Morphology by itself: Stems and inflectional classes. Cambridge, Mas-
sachusetts: The ΜΓΓ Press.
Verbal Morphology and Agreement in Urdu 97

Börjars, Kersti, Nigel Vincent and Carol Chapman

1997 Paradigms, pronominal inflection and periphrasis: a feature-based ac-
count. In Geert Booij and Jaap van Marie (eds.) Yearbook of Morphol-
ogy, 155-180. Dordrecht: Kluwer Academic Publishers.
Baker, Mark
1985 The Mirror Principle and Morphosyntactic Explanation. Linguistic In-
quiry 16: 373-416.
1988 Incorporation: A Theory of Grammatical Function Changing. Chicago,
Illinois: University of Chicago Press.
Bresnan, Joan (ed)
1982 The Mental Representation of Grammatical Relations. Cambridge, Mas-
sachusetts: The ΜΓΓ Press.
Bresnan, Joan
2001 Lexical-Functional Syntax. Oxford: Blackwell.
Bresnan, Joan and Sam Mchombo
1987 Topics, Pronoun and Agreement in Chichewa. Language 63:741-782.
Bhatt, Rajesh
2002 Long Distance Agreement in Hindi-Urdu. Manuscript, The University
of Texas, Austin.
Butt, Miriam
1993 Object Specificity and Agreement in Hindi/Urdu. In Papers from the
29th Regional Meeting of the Chicgo Linguistic Society, 80-103.
1995 The Structure of Complex Predicates. Stanford, California: CSLI Pub-
Butt, Miriam and Ashwini Deo
2001 Ergativity in Indo-Aryan. In KURDICA Newsletter for Kurdish Lan-
guage and Studies.
Butt, Miriam, Maria-Eugenia Nino and Frederique Segond
1996 Multilingual Processing of Auxiliaries in LFG. In D. Gibbon (ed.),
Natural Language Processing and Speech Technology: Results of the
3rd KONVENS Conference, Bielefeld, October, 111-122. Berlin: Mou-
ton de Gruyter.
Butt, Miriam, Tracy Holloway King, Maria-Eugenia Nino and Frederique Segond
1999 A Grammar Writer's Cookbook. Stanford, California: CSLI Publica-
Butt, Miriam and Tracy Holloway King
2002a The Status of Case. In Veneeta Dayal and Anoop Mahajan (eds.),
Clause Structure in South Asian Languages. To Appear. Dordrecht:
Kluwer Academic Publishers.
2002b Case Systems: Beyond Structural Distinctions. In Ellen Brandner and
Heike Zinsmeister (eds.), New Perspectives on Case Theory. To Ap-
pear. Stanford, California: CSLI Publications.
98 Miriam Butt and Louisa Sadler

2001 Non-Nominative Subjects in Urdu: A Computational Analysis. In Pro-

ceedings of the International Symposium on Non-nominative Subjects,
525-548. Tokyo: ILCAA.
Chomsky, Noam
1970 Remarks on Nominalization. In R. A. Jacobs and RS. Rosenbaum (eds.),
Readings in English Transformational Grammar. Waltham, Massachuset
Ginn. Also in N. Chomsky. 1972. Studies on Semantics in Generative
Grammar. The Hague: Mouton.
Dalrymple, Mary
1993 The Syntax ofAnaphoric Binding. Stanford, California: CSLI Publica-
2001 Lexical Functional Grammar. Syntax And Semantics Volume 34. New
York: The Academic Press.
Dalrymple, Mary, Ronald M. Kaplan and Tracy Holloway King
2001 Weak Crossover and the Absence of Traces. In Miriam Butt and Tracy
Holloway King (eds.), The Proceedings of the LFG01 Conference.
Hong Kong.
Davison, Alice
1999 Ergativity: Functional and Formal Issues. In Michael Darnell, Edith
Moravcsik, Frederick Newmeyer, Michael Noonan and Kathleen Wheat-
ley (eds.), Functionalism and Formalism in Linguistics, Volume I: Gen-
eral Papers, 177-208. Amsterdam: John Benjamins.
Deo, Ashwini and Devyani Sharma
2002 Typological Variation in the Ergative Morphology of Indo-Aryan Lan-
guages. Manuscript, Stanford University.
DiSciullo, Anne-Marie and Edwin Williams
1987 On the Definition of Word. Cambridge, Massachusetts: The MIT Press.
Εης, Mürvet
1991 The Semantics of Specificity. Linguistic Inquiry 22(1): 1-25.
Falk, Yehuda
2001 Lexical-Functional Grammar: An Introduction to Parallel Constraint-
Based Syntax. Stanford, California: CSLI Publications.
Frank, Anette
1996 A Note on Complex Predicate Formation: Evidence from Auxiliary
Selection, Reflexivization, and Past Participle Agreement in French
and Italian. In Miriam Butt and Tracy Holloway King (eds.), The Pro-
ceedings of the LFG '96 Conference. Rank Xerox, Grenoble.
Frank, Anette and Annie Zaenen
2000 Tense in LFG: Syntax and Morphology. In Hans Kamp and Uwe Reyle
(eds.), How we say WHEN it happens: Contributions to the theory of
temporal reference in natural language.To Appear. Niemeyer, Tübingen.
Verbal Morphology and Agreement in Urdu 99

Halle, Morris and Alec Marantz

1993 Distributed Morphology and Pieces of Inflection. In K. Hale and S.J. Keyser
(eds.), The View from Building 20: Essays in Linguistics in Honor of
Sylvain Bromberger, 111-176. Cambridge, Massachusetts: The MIT
Halvorsen, Per-Kristian and Ronald M. Kaplan
1988 Projections and Semantic Description in Lexical-Functional Gram-
mar. Proceedings of the International Conference on Fifth Generation
Computer Systems, 1116-1122. Tokyo.
Kaplan, Ronald M. and Martin Kay
1994 Regular Models of Phonological Rule Systems. Computational Lin-
guistics 20(3): 331-378.
Kamp, Hans and Uwe Reyle
1993 From Discourse to Logic. Dordrecht: Kluwer Academic Publishers.
Karttunen, Lauri, Ronald M. Kaplan and Annie Zaenen
1992 Two-level Morphology with Composition. In Proceedings of the 14th
International Conference on Computational Linguistics (COLING-92),
Kidwai, Aeysha
1997 Scrambling and Binding in Hindi-Urdu PhD dissertation, Jawaharlal
Nehru University, New Delhi.
King, Tracy Holloway
1995 Configuring Topic and Focus in Russian. Stanford, California: CSLI
Krifka, Manfred
1992 Thematic relations as links between nominal reference and temporal
constitution. In Ivan Sag and Anna Szabolcsi (eds.), Lexical Matters,
29-53. Stanford, California: CSLI Publications.
Mahajan, Anoop
1990 The A/A-Bar Distinction and Movement Theory PhD dissertation, MIT.
1992 The Specificity Condition and the CED. Linguistic Inquiry 23(3): 510-
McGregor, R.S.
1968 The Language oflndrajit ofOrchä. Cambridge: Cambridge University
Mohanan, Tara
1994 Argument Structure in Hindi. Stanford, California: CSLI Publications.
Nordlinger, Rachel
1998 Constructive Case: Evidence from Australia. Stanford, California: CSLI
100 Miriam Butt and Louisa Sadler

Rizzi, Luigi
1986 Null Objects in Italian and Theory of pro. Linguistic Inquiry 17(3):
Sadler, Louisa and Andrew Spencer
2001 Syntax as an exponent of morphological features. In Geert Booij (ed.),
Yearbook of Morphology 1999.
Sells, Peter
2001 Syntactic Information and its Morphological Expression. In Louisa
Sadler and Andrew Spencer (eds.), Relating Morphological and Syn-
tactic Structure: Morphology in LFG. To Appear. Stanford, California:
CSLI Publications.
Selkirk, Elisabeth O.
1982 The Syntax of Words. Cambridge: Massachusetts: The MIT Press.
Subbarao, Κ. V.
1999 Agreement in South Asian Languages. In the Proceedings of the South
Asian Language Analysis Roundtable. To Appear. Chicago, Illinois:
University of Illinois at Urbana-Champaign.
Stump, Gregory, T.
1991 A Paradigm-theory of morphosemantic mismataches. Language 67(4):
Stump, Gregory, T.
2001 Inflectional Morphology: A Theory of Paradigm Structure. Cambridge:
Cambridge University Press.
Tenny, Carol
1994 Aspectual Role and the Syntax-Semantics Interface. Dordrect: Klu wer
Academic Publishers.
Particles and sentence structure: a historical
Gisella Ferraresi and Maria Goldbach

1. Preliminaries

We sketch here a small fragment of the syntax of Old French which

shows that syntactic change should be viewed as the interplay of dif-
ferent micro factors, phonology and lexical semantics.
As far as methodology is concerned, we follow Keenan (2001)
and Longobardi (2001), who base themselves on the hypothesis that
syntactic change occurs only as a consequence of changes of the
interface conditions, in Keenan's terminology as the consequence of
semantic and/or phonological 'decay' (where semantic erosion often
co-occurs with semantic decay). Our results support this hypothesis.
That means that certain syntactic phenomena depend on the semantic
and phonological representation of corresponding functional ele-
ments. We believe the analysis of syntactic parameters in generative
terms to be a new kind of approach. Our research is a first try at Old
French Verb-second (V2) phenomena, while, however, we do not
look any further into the nature of a V2 parameter.
In Minimalism it is assumed that a system comprising a set of
formal features is available and that a language makes a selection
from these features. Parameterisation is the way the different features
are realised in individual lexical items. For example, in one language
Τ and Agr may be realised in one and the same lexical item, whereas
in another language Τ and Agr may have distinct lexical realisations.
So, the task of the child is to distinguish what the subset of these fea-
tures is and how these features are combined. In other words, which
features are assembled into a given category. However, some strong
restrictions are imposed by UG. For example, Tense will never select
a C, only the contrary will be possible. Some restrictions however
could simply be a question of implementation, e.g. in the phonology
102 Gisella Ferraresi and Maria Goldbach

the sequence lateral-rhotic (*lr) is excluded, but this may be a

question of articulation rather than of Universal Grammar.
So we take parameters to be the way in which formal features are
realised in different categories. Drawing on proposals for the syntax
from Lightfoot (1999) and for the phonology from Dresher (1998),
we assume that a child uses 'grammar fragments' (in Lightfoot's
terminology) from the input for inferring the nature of the features in
her grammar. These fragments are the cues from which the child
identifies and assembles the syntactic/formal features of her particu-
lar grammar. Thus we consider that parameters are not located in the
grammar as individual properties of a language, rather they are a con-
sequence of the particular feature make-up in the language. Cluster-
ing effects, such as those properties associated with the so-called pro-
drop parameter, are merely an epiphenomenon.
The case study we present here is the loss of the sentence particle
si in French. We believe that macro changes which in the past have
been explained through reanalysis, are not falsifiable; the aim, as
Longobardi (2001:278) points out, is to find 'deep explanations for a
portion of even small fragments of grammatical history of a lan-
guage, rather than ... general principles immediately applicable to the
totality of superficially observable changes'.
In the Old French period the sentence particle si is in an interme-
diate stage: we will show that it is in the process of grammaticalisa-
tion, on its way to becoming a bound morpheme. But instead of
becoming morphologized it disappeared in the 17th century. We will
illustrate which factors contributed to the vanishing of si in Early
Modern French. In addition, we want to show how the presence of
particles in a language can throw light on sentence structure. In much
of the generative literature on Old French si has been neglected, and
in those works which do mention it, si has been treated as an adverb
occupying a specifier position (of CP or of AgrP). In the following
we will demonstrate that this view has to be abandoned and that si
instead is a functional head.
Particles and sentence structure: a historical perspective 103

1. The morphophonological shape of Old French si

Old French (OF) had several sentence particles (puis 'then', or

'now', ainz 'but') the most wide-spread of which is si 'thus'. It oc-
curs in direct speech, in literary texts (poetry), in charters and in
chronicles (narration). The other particles are less frequent and they
are nearly absent from certain text types (e.g. ainz occurs only very
rarely in charters, cf. Reenen and Schesler 2001). We focus therefore
in this article on si.
Si has different spellings: it appears as si, se, ci or ce in OF texts.
Concerning its lexical accent, we observe that si functions as the
(phonological) host of OF object clitics (me 'me', te 'you', le 'him',
les 'them', etc.) and adverbial clitics (en 'of it', i 'there') as in (la),
but there are examples where si is contextually reduced and cliticizes
onto another element as in (lb).

(1) a. Sin [< si en] vois vedeir alques de sun semblant (Roland
19, 270).
Si of it will (I) see something of his appearance
'(I) will see something of his appearance.'
b. Car lavez, s'alez asseoir (Charrete, 1028, quoted in
Buridant 2000:507).
Thus wash (you), Si come sit down
'Thus wash (your hands), come sit down.'

On the other hand, si can stand alone, i.e. cliticization is not

obligatory, and is even permitted in sentence initial position.

(2) Si prisent le palais ä force, si y menerent I'empereeur; ...

(Clari 33,40-41).
Si took (they) the palace by force, SI there led (they) the
'(They) took the palace by force, (they) led the emperor

In this respect si differs from true clitics (OF object and adverbial
clitics) which (in Old French) are excluded from sentence initial
104 Gisella Ferraresi and Maria Goldbach

position and which are always phonologically dependent on a host.

Hence, si takes part in clitic clusters where it functions as a host of
right adjoined clitics. But unlike true clitics it may occur in absolute
first position and it is not obligatorily subject to external sandhi rules
(e.g. by apocope of its final '/'). This means that unlike object and
adverbial clitics si has a lexical accent which may be reduced in
certain contexts. In this respect si resembles exponents of functional
categories such as auxiliaries, complementizers and determiners
which can appear stressless depending on their position in the
sentence (cf. Selkirk 1996 and references cited there).
In order to obtain more evidence about the prosodic character of si
we investigated its status in OF poetry, in particular in the Chanson
de Roland (11th century). OF poetry had a fixed rhythmical frame into
which the lexemes were inserted in accordance with their inherent
rhythmical structure. Along these lines the canonical syntax could be
altered in verse, i.e. a lexical item could surface in a syntactically
inappropriate position, whereas poetry avoided its insertion into a
rhythmically inappropriate position. Therefore we can deduce from
metrical poetry little (if nothing) about OF syntax, instead we can infer
the metrical character of OF lexical items (cf. Klausenburger 1970).

χ χ
χ χ χ χ χ
(3) a. Si vait fe- rir || Ge- rin par sa grant force
'(He) was about to attack Gerin with his enormous force.'
(Roland 122,1618).
χ χ
b. Ven- get Ii reis || si nus pur- rat ven- ger
'The king will come and (he) will be able to avenge us.'
(Roland 132,1744).

The Chanson de Roland is a decasyllable rhyme, that is each line

contains ten syllables, hence ten metrically relevant positions. By a
rhythmical caesura (marked by the double slash || in (3)) a line is
divided in two hemistichs. In early poetry the caesura in the
decasyllable appears after the fourth syllable. The fourth and the
Particles and sentence structure: a historical perspective 105

tenth syllables always bear major stress (they receive the primary
ictuses). The basic rhythmical pattern of the decasyllable is iambic,
that is, a weak syllable is followed by a strong one (in example (3)
above marked by superscribed x). Metrical rules - such as inversion
- may change the iambic rhythm, but these rules never operate across
the caesura nor do they involve the tenth syllable (see Nespor and
Vogel 1986 and references cited there). Thus the fourth and the tenth
syllables never lose their primary ictuses. As far as si is concerned,
we never found it to occur in the fourth or in the tenth position. This
cannot be due to a syntactic constraint since syntax in verse is much
less rigid than in prose. From this we conclude that si cannot bear
major stress. Moreover, in the great majority of verses we found si in
the first position of a hemistich, that is, line-initially or in postcaesura
position. The only other option where si can be hosted is the second
Apart from the fourth and the tenth positions, si never surfaces in
the third or in the sixth to ninth positions. Considering that the basic
rhythm is iambic, the first and the fifth position - the main locations
of si - are weak positions. We deduce from this that in principle si
occurs in a metrically unstressed position. Interestingly, this is not a
peculiarity of the Chanson de Roland, rather our results coincide with
the findings of Marchello-Nizia (1985) who investigated the St.
Alexis poem, also a decasyllabus rhyme from the 11th century.
Summing up so far, the phonologically and metrically weak
character of si suggests that it is a functional category and not a
lexical one. In the next section we shall look at the syntactic behavior
of OF si.

2. The syntax of si

Like the Germanic languages, Old French has been analysed as a V2-
language in the pre-generative as well as in the generative literature
(cf. Meyer-Lübke 1899, Adams 1987, Roberts 1993, Lemieux and
Dupuis 1995, Vance 1997). Clearly, in such an analysis, the element
si is viewed as an ordinary adverb, i.e. an XP in a specifier position;
Lemieux and Dupuis (1995), for example, arguing against V-to-C
106 Gisella Ferraresi and Maria Goldbach

movement for Middle French, propose that si is generated in

SpecAgrP. In Vance (1997) si occupies SpecCP. Thus, si is
considered to be a lexical category. These analyses, however, fail to
account for sentences like the following.

(4) [La damoisele a qui tu as parle] si est li anemis (Queste

113,1; from Vance 1997).
'The maiden to whom you have spoken si is the enemy.'

If si is in SpecCP, what is the position of the subject-DP? Further-

more, we do not find other (VP) adverbials in the position occupied
by si nor can si occur in the position that they occupy.

(5) a. * [La damoisele a qui tu as parle] vraiement est li anemis.

'The maiden to whom you have spoken surely is the
b. [La damoisele a qui tu as parle] est vraiement li anemis.
c. * [La damoisele a qui tu as parle] est si li anemis.

Sentential adverbs or adverbial expressions like apres gou 'after

that' or adonc 'then' do not obligatorily trigger the postposition of the
subject (5'a,b), although they appear adjacent to the verb, too (5'c,d).

(5') a. Adonc li messageprisent congie ... (Clari 19,42).

'Then the messengers said good-by ...'
b. Apres il sait que vos avez mis le vostre (Villeh. 113,6-7).
'After all he knows that you have given (all) yours (/all
your goods)'
c. Adonc atira li marquis son oirre ... (Clari 20,3)
'Then took the marquis his road
d. Apres cele quinzaine vint li marquis Bonifaces de
Monferrat (Villeh. 112,30-31).
'Behind these fifteen came the marquis Boniface de

These findings suggest that si occupies a position of its own not

shared by other adverbials. We propose that si does not occupy a
Particles and sentence structure: a historical perspective 107

specifier position but heads its own phrase. To ascertain what the
appropriate syntactical position of si is we compared the OF data
with Celtic particles.
Welsh has two different sets of particles, fe and mi which
introduce affirmative main clauses and a and y which belong to a
general focus strategy and are in complementary distribution with
felmi (cf. Roberts 2000: 39).

(6) a. Bore'ma, mi glywes i' rnewyddion ary radio.

Morning this PRT heard I the news on the radio
'This morning, I heard the news on the radio.'
b. Ydynion a werthodd yd.
the men PRT sold the dog.
'It's the men who have sold the dog.'

Celtic particles and the finite verb are always adjacent to each
other and adverbs can only precede both. The only elements
intervening between the particles and the finite verb are so called
"infixed pronouns" (Roberts 2000:38).

(7) Mi' ch gwelais i.

PRT you (pi.) saw I
Ί saw you' (Roberts 2000:38).

In sentence (6a) the adverbial phrase bore'ma precedes the

particle mi and the finite verb glywes, the subject i is postverbal and
is followed by the object-DP r newyddion. This pattern is parallel to
the one we found in OF particle sentences, such as (8).

(8) a. Adonc si manda li Dux tous les haus conseils de la

adv. PRT Vfm subj. obj.
'Thus the duke summoned all the municipal councils'
(Clari 21,17).
108 Gisella Ferraresi and Maria Goldbach

b. Et si vous metrons pro cinquante galies ä no coust.

obj.pron. Vfm obj.
'And we will assign you 50 galleys at our cost' (Clari 21,

In (8b) the clitic pronoun vous is inserted between the particle si

and the finite verb metrons. No other type of object DP nor a subject
pronoun can occur in this position. This mirrors the picture of the
Welsh structure in (7) where we observe that the phonological host of
the pronoun ch is the particle mi. In Old French it is generally the
finite verb that is the phonological and the syntactic partner of the
clitic, however, si may also fulfil this function.

(9) Sil [< si le] saluerentpar amur epar bien (Roland 121).
'(They) welcome him amicably and seemly.'

The structural similarities between Old French and Welsh

observed so far are represented in (10).

(10) XP - si/fe, mi - Clitics - Verbfin - Subj .-Pron. - Obj .-DP

We notice that the relation between the particle, the object clitic
and finite verb in Old French is close to such a degree that no non-
clitical element can intervene and that the whole complex is subject
to specific phonological rules (below we will show that these items
constitute one single prosodic unit). Turning now to the elements
preceding si, we see that the initial XP can be the subject as in (11) or
an adverb as in (12).'

(11) [La damoisele a qui tu as parle] si est li anemis (Queste

113,1; from Vance).
'The maiden to whom you have spoken si is the enemy.'

(12) Adonc si manda li Dux tous les haus conseils de la vile (Clari
'Thus the duke summoned all the municipal councils.'
Particles and sentence structure: a historical perspective 109

Firstly, subjects preceding si must be non-pronominal and must

have definite, specific reference, as in (13).

(13) a. [Cil de la ville de Jadres] ... si eurent molt grant peur

(Clari 26,2-4).
'Those from Jadres had great fear.'
b. [Cil qui en Blakie estoit fuis] si y fu si povres ... (Clari
'The one who took refuge in Blakie (he) was so poor
c. [Jqffrois Ii mareschaus de Champaigne et Alarz Make-
riaus] si s 'en alerent droit en France (Villeh. 102, 10-11)
'Joffrois li mareschaus de Champaigne and Alarz
Makeriaus went straight to France'

Though subjects of definite specific reference also occur

postverbally, indefinite subjects are excluded from the position
preceding si. Instead, they surface in postverbal position.

(14) Apres si avint un jour que ... (Clari 36, 36).

'Then there came one day that....'

Subject pronouns are excluded from occurring in front of si (15a).

Neither in Villehardouin nor in Clari we found them in this position.
This means that despite of their definite specific reference they
cannot be ahead of si.2 Nevertheless they may be topicalised as
evidenced by (15b).

(15) a. et si n'avons nous mie tous ceus nommes qui portoient

banieres (Clari 18, 31-32)
'And we did not mention all those who bore banners.'
b. Et il, par son sens et par son engin, ..., les mist en ce que
... (Villeh. 100, 36-37)
'And he because of his spirit and his ability manipulated
them that...'
110 Gisella Ferraresi and Maria Goldbach

Secondly, the XP preceding si cannot be the direct or the indirect

object. There seems to be a restriction against fronting of internal
arguments. Again, if si is missing object DPs may be topicalised as in

(16) a. [Des paroles que li dux dist bones et belles] ne vos puis
tout raconter (Villeh. 101,35-36)
'Of the good and nice words that the duke said I cannot
tell you all'
b. [Une autre partie] commanda Ii cuens de son avoir a
retenir... (Villeh. 102, 8-9)
'The lord ordered to hold back another part of his goods'

To summarize: if si is present only subject DPs of definite specific

reference and sentence adverbs may go before it. If si is not realised,
also subject pronouns and direct and indirect object DPs may be
topicalised. From this we conclude that there is a topic position in
front of si where adverbs such as apres and subjects are placed. This
position is not available for fronted internal arguments and for
subject pronouns.
Going back now to the comparison with Welsh, we see that here
also the position preceding the particle is subject to some restriction.
But in this case it is a restriction on focussed elements. Only one XP
may be fronted over the verb and the corresponding particles are a
(associated with fronted subjects, direct objects and VPs) and y
(associated with all other XPs) (see Roberts 2000:39).

(17) a. Y dynion a werthodd y ci.

'(It's) the men PRT (who) have sold the dog.'
b. Ym Mangory siaradais i llynedd.
'(It was) in Bangor PRT I spoke last year.'

These findings suggest that the quantity and quality of the position
preceding the particle is restricted in some way, presumably in a
language-specific way.3
In the following we will propose a descriptive model of OF
sentence structure. Parallel to Roberts' analysis for Celtic languages,
Particles and sentence structure: a historical perspective 111

we adopt for Old French the proposal from Rizzi 1997 where on the
basis of data from Italian and English he proposes to split the CP into
several distinct syntactic positions. Taking into consideration the
distributional properties of si so far discussed we take the structure in
(18) as a basis for future investigation. We hope eventually to
motivate our description by specifying which kind of features are
grouped into the respective categories and which kind of operations
they perform.

(18) The Split-CP analysis (Rizzi 1997)


Force TopP

Top FocP

Foe TopP

Top FinP

Fin TP

In Roberts' analysis, which follows Rizzi's 1997 system, Welsh

particles are merged in Fin, and verbs do not move into the C-system
but stay in AgrS. Now, let us see where OF subjects are. In contrast
to Roberts 2000 we suppose that the finite verb has moved to Fin
where si is merged. Si, object clitics, and the finite Verb form a
complex head as a consequence of head movement. This way we
model their intimate relationship since together these items constitute
- as we will illustrate in the next section - a single prosodic
constituent. The following structure summarises our proposal.4
112 Gisella Ferraresi and Maria Goldbach

(19) TopP

si ne obj-cl root tense

t ... V'

V° obj-DP

OF adverbs and subjects preceding si are in some kind of TopP,

seemingly in different ones as subjects are governed by a reference
constraint. What is it, then, that justifies si to be the head of Fin? We
have noted that si occurs in neither finite nor infinite subordination
(cf. ftn 3) If si really is the head of Fin - as we think it to be - how is
Fin then realised in Old French subordinates, where there is no si?
Let us have a look at an Old French subordinate construction. For its
descriptive analysis we adopt Rizzi's (1997) proposal, according to
which there are two positions for complementisers, Force and Fin.
The following Old French construction hints at Force and Fin in the
embedded clause not being realised syncretically but separately and
Particles and sentence structure: a historical perspective 113

at the same time. Examples like (20) can be found regularly in Old
French texts.

(20) a. ... et si leur disons [que] [s'il nous veulent rendre ces
trente six mile marcs] ... [que] nous les metrons outre
mer. (Clari 24,11-15)
'and si them tell we that if they us want to-give-back
these 36000 marks that we them take to overseas (to
North Africa).'
b. Si avoient pourchacie unes lettres de Rome, [que] [trestout
cil qui les guerroieroient ou qui leur feroient nul
domage] [qu] 'il fuessent escommunie. (Clari 26, 13-15)
'Si have (they) letters sent to Rome that all those who
would fight them or would cause them any damage that
they would be excommunicated.'

In Rizzi's sense we assume the second occurrence of que in the

sentence in (20) is in Fin, the first one, on the other hand, in Force.
This construction gives a hint at why si does not occur in an
embedded context: the embedded Fin head is reserved for que. As a
further argument for the assumption that si is in the head of Fin we
consider the fact that si regularly appears adjacent to other preceding
particles (like lors 'now', or 'now', ensi 'so') and to other preceding
(mostly temporal) scene-setting adverbial phrase.

(21) a. [Quant Theodore Lascaris οϊ la novele] [/orc] si manda

plus efforciement quanque il pot de gent, ... (Villeh. 194,
'When Theodore Lascaris heard the news then (she) send
by supreme effort as many people she could send'
b. [Au matin] si fu li parlemenz en un vergier ä I 'abate
madame Sainte Marie de Soissons. (Villeh. 103,36-38)
'In-the morning si there-was an assembly in a garden of
the abbey of Our Lady St. Mary of Soissons'

We find, among others, the combinations puis si 'then si' (Clari

30, 34), lors si 'then si' (cf. (21a)), apres si 'after that si' (Clari 30,
114 Gisella Ferraresi and Maria Goldbach

35; 25, 36), lors apres si 'now after that si' (Villeh. 161,25-27), or si
'now si' (Villeh. 100,29), adonc si 'now si' (Clari 31,28-29; 35, 22),
done si 'now si' (Clari 29, 39), puis apres si 'now after that si' (Clari
43, 38), apres quant-CP si 'after that when-CP Si' (Clari 48, 14-15)
etc. (cf. Reenen and Schesler 2001 for a comprehensive list). We now
assume that the scene-setting adverbial phrases and the particle-like
adverbials (i.e. those that always trigger V2) puis, or, lors, ensi are
distributed between two positions, Top and SpecFin. For a detailed
semantic reason for scene-setting adverbials to be in a TopP we refer
to Maienborn 2001. Thus we arrive at the following description.

(22) TopP

scene- FinP
setting ___
adverbials Spec Fin'
puis Fin0
lors si
ensi ainz

We can, at the present state of our survey, only notice desciptively

that SpecFin can also be occupied by quant-CPs, but that quant-CPs
can also be in TopP.
Now that we have briefly outlined the OF sentence structure we
will come to the question of V2-effects in Old French. Given that si
disappeared in the 17th century along with the loss of the V2-
phenomena we believe that these two incidents are underlyingly
connected. We assume that si can be superficially absent in two
different ways. In the first case si (respectively, any particle) is
missing because Fin is not projected, hence V3-structures arise. In
the second case si is realised in syntax, i.e. Fin is projected, but it
disappears by a postsyntactic reduction process which belongs to PF
(we deal with this PF-process in section 3). In the latter case the fea-
tural effects of Fin result - quasi epiphenomenal - in V2-structures.
So we see that the featural material in Fin, of which si and or are
Particles and sentence structure: a historical perspective 115

exponents, restricts the realisation of the topic position: if Fin is

realised by si a particular topic position (to be defined in future
research) is not available. On the other hand if Fin is not projected
this topic position is accessible for subject pronouns as well as for
object DPs. The benefit of our approach is that we need not invoke
CP-recursion in order to account for V3-structures in Old French.
Accordingly we ascribe the following structure to V3-sentences.

(23) a. [TopP A Vendemain, auques matin], [Spec Lanselos]

[T se lieve]] (...) (Tristan 63,17-18).
'The day after, in the morning, Lancelot gets up ...'
b. [TopP Apres] [χρ [Spec la gent de la vile] [x alerent] au palais
(...)] (Clari 33, 39-40).
'Then the people of the city went to the palace ...'

If Fin is not projected the head movement stops in Τ and V3-

configurations occur.


Top TP

quant-CP vP
se- CP
maintenant nominal or
apres gou pronominal

However, in our approach we cannot capture the fact that fronted

direct and indirect objects produce apparent V2-effects. Remember
the observation that topicalisation of internal arguments is bared if
Fin is realised (mainly by si). Our account predicts that topicalised
direct and indirect objects do only occur in V3-sentence since we
argue that V2-phenomena result from PF-reduction of si and that the
syntactic consequences of Fin-realisation take place prior to PF-
116 Gisella Ferraresi and Maria Goldbach

reduction. Clearly, this prediction is not empirically borne out in Old


(25) grant chose nos ont requise ... (Villeh. 100, 2).
'They asked us for big things ...'

For the time being we have no solution to this problem. But note
that our proposal reduces the explanatory task to explaining the
mechanisms of topicalisation. Our model does not need to reconcile
OF V3-sentences with a V2-syntax.
Beside V3 order also VI challenges the analysis of Old French as
a V2-language (although in most cases the verb is not in absolute
initial position but preceded by et 'and').

(26) a. Et sejournerent li pelerin en I'isle de Corfu ... (Clari

'And the pilgrims stayed on the island of Corfu ...'
b. Et chevauchierentpar lor jornees ... (Villeh. 18,4-6).
'And (they) rode in daylong marches ...'
c. Conseillierent soi et parlerent ensemble cele nuit
(Villeh. 100, 32-33)
'They discussed with each other and they talked together
in this night'
d. Querons lor qu'il le nos aident a conquerre (Villeh. 107,
'We ask them that they help us to conquer him'

Our account treats these sentences as analogous to the V2-

structures: the particle is removed by PF-reduction and no Top-node
is projected. Now that we have analysed the syntactic behaviour of
OF si we discuss the aforementioned PF-reduction which caused its
loss in early Modern French.
Particles and sentence structure: a historical perspective 117

3. The loss of the particle si in Modern French

3.1. Prosodic Factors

The particle si was lost during the 17th century and we can trace its
demise back to prosodic and semantic factors. First, we consider the
prosodic aspects. The theoretical basis of our proposal rests on
Nespor and Vogel's (1986) prosodic hierarchy illustrated in (27).

(27) Syllable > Foot > Phonological Word (ω) > Clitic Group (C)
> Phonological Phrase (Φ) > Intonational Phrase (I) >
Phonological Utterance

In Old French the main word accent falls on the final syllable
unless it contains a schwa. In this case the penultimate syllable gets
main stress.5 Home (1990) has shown that in this period initial
syllables of phonological words were weak. In open initial syllables ο
and ο raised to u and ε and e reduced to schwa (cf. Home 1990: 6):

(28) ο, ο > u; ε, e > a

Gallo-Roman Old French

nepotem > neveu 'nephew'
debere > devoir 'to have to'
dolorem > duleur 'sorrow'
moriri > murir 'to die'

Despite the fact that the orthographic shape of the initial vowels in
neveu and devoir is the letter e, their phonetic realisation is schwa.
This is confirmed by transliterations of OF words in Hebrew
characters, e.g. in a Vatican elegy from the end of the 13th century
(cf. Darmesteter 1874). In this text, the Latin letter e in the initial
unstressed syllable is transcribed as Hebrew 'sheva' [.], subscribed
under the preceding consonant, while e in open stressed syllable
appears in the Hebrew text as 'tsere' [ ] (Home 1990:6). The
phonetic value of sheva is schwa [a], that of tsere is closed [e].
118 Gisella Ferraresi and Maria Goldbach

(29) *sheva' {.} =: schwa [β], 'tsere' { } =: [e]

I'apelet 'called him' - , DN.1?
perir 'perish' - Τ . Ί . D

Thus, the Hebrew manuscript makes a systematic distinction

between the closed [e] and the initial syllable [a] which is not
possible in the Latin spelling system since it does not have distinct
characters to mark the different pronunciations. However, the
Hebrew text sufficiently identifies the OF pronunciation of the first
unstressed e.
In contrast to OF phonological words, the initial element of a clitic
group (into which phonological words are organized) had a
secondary stress, hence it was not weak. This is indicated by the fact
that clitics intervening between the initial member and the main
stress exponent optionally undergo syncope.

(30) de le cor del cor 'of the horn'

ne me vidrent nem vidrent 'they did not see me'
jo le pert -> jolpert Ί lose it'

If there are two pretonic clitics, only the first clitic is reduced.

(31) ne le te dit nel te dit Ί do not tell it to you'

se les te donet ses te donet 'if he gives them to you'

In this respect the OF clitic group mirrors the rule of syncope

operative in the Late Latin phonological word (4th - 7th century) (cf.
Home 1990:4).

(32) a. One pretonic syllable

Latin OF
bönitäte > honte 'kindness'
liberdre > livrer 'deliver'
mänducäre > mangier 'to eat'
lepordriu > levrier 'greyhound'
Particles and sentence structure: a historical perspective 119

b. Two pretonic syllables

Latin OF
subitamente > sotement 'suddenly'
äntecessore > >
ancessor 'ancestor'
auctoricäre > otreiier 'to concede'
ärcuballista arbaleste 'crossbow'

In (32a) the syllable between the first one, which bears secondary
stress, and that which bears main stress is reduced by syncope. If
there are two syllables between the initial and the accented syllable
as in (32b) only the first pretonic syllable is deleted. Thus by the OF
period, Late Latin syncope operating on phonological words applied
to the clitic group. Home draws attention to a further parallel
between the Late Latin phonological word and the OF clitic group: as
we have illustrated in (28) and (29) above, the secondary stress on
the initial syllable extant in Late Latin phonological words had
disappeared in OF phonological words. Similarly, the OF clitic group
lost its initial secondary stress in Early Modern French. The reason
for this development is the apocope of wordfinal schwa during the
16th century, which caused general oxytony not only at the level of
the phonological word but also at the level of the clitic group and the
phonological phrase (cf. Klausenburger 1970). That is, the right edge
of these prosodic categories became strong and they became strictly
right-headed.6 We think of this process in terms of scales which lose
their equilibrium when more weight is thrown onto one side. Turning
now to OF si, we conclude that it is organized into a clitic group with
adjacent clitics. Evidence for this can be gleaned from the following

(33) a. [Sil saluerent c] par amur epar bien (Roland 121).

'(They) welcome him amicably and seemly.'
b. [Si I 'en dunat c] s 'espee e s 'escarbuncle (Roland 1531).
'(He) gave him his sword and his carbuncle.'

Compare (32) with the clitic groups in (33). Clitics intervening

between si and the main stress exponent (i.e. the finite verb) are
reduced. If there are two clitics in between, the first one is reduced.
120 Gisella Ferraresi and Maria Goldbach

Since the initial element of the OF clitic group weakened by

apocope of word final schwa in the 16th century, it lost its secondary
accent and was subsequently subject to reduction processes. In this
way, we can explain the following reduction processes showing up in
Modern French.

(34) il [ne le veut c] pas > il le veutpas 'he doesn't want it'
[ilya c ] . . . >y'a ... 'there is ...'
[tu as vu c] ... > t'as vu ... 'you have seen ...'

(35) Reduction Rule

[[CL]w(eak) - [CL]W - ... - [<a]s(trong) J [ 0 - [CL]W - ... -
[ω], J
We assume that by rule (35) OF si became prosodically weak and
this gave rise to its eventual loss. In the next section, we will briefly
sketch some observations regarding the lexical semantics of the
sentence particle si compared with the lexical semantics of its OF

3.2. Distributional evidence

We want to argue that the existence of another five homonymous

particles in Old French (Fleischman 1992:262) contributed to the
semantic bleaching of this particular element:

- si (also se, s1) 'if as a subordinating conjunction;
- si 'so' as an adverbial intensifier (ModFr. tant, tellement);
- si 'thus, so1 as an adverb of manner (ModFr. ainsi);
- si as an adversative affirmation particle (ModFr. si);
- se (also s1) 3sg/pl reflexive pronoun (ModFr. se).

However, not all of the five homonyms are relevant to this

process. In fact, only those which have a similar surface distribution
can have had a role in the semantic bleaching of the sentence particle:
Particles and sentence structure: a historical perspective 121

that is, the adverb of manner and the subordinating conjunction

which are also elements competing for positions in the C-system, but
not the reflexive pronoun. The distinction is particularly difficult
when the position of si is immediately before the finite verb
(Marchello-Nizia 1985: 221):

(37) Je ne me puis mes sostenir, si sui atainz et sormenez (Yvain,

Ί can no longer hold myself up, so tired and exhausted I am.'

(38) Etpor ce, bele, siplorez? (Perceval, 794)

'And for that, my pretty one, are you crying so much?'

As Marchello-Nizia (1985:199) points out, between the 11th and

the 13th centuries si does not differ significantly in frequency across
different texts. However, an increased use of this particle is found in
the prose texts of the 13th century, such as romances and chronicles.
From the end of the 13 th century and the beginning of the 14th
century the frequency of si decreases both in prose and in poetry
texts. The end of the 15th century marks the end of the use of si in
most of the contexts where this particle can appear, and what
remaining usage there is, is marked as archaic or dialectal.
However, one can still find si in individual authors, although not
in all the structures it was previously used in, but by the end of 17th
century si has disappeared altogether. Foulet (1998:303) comments
thus on its loss: "De toute fa^on, grace ä la faculte qu'il a d'attenuer
son sens jusqu'ä le faire presque disparaitre, si offre une ressource
precieuse aux versificateurs mediocres.... II ne semble pas qu'il y ait
lieu de regretter la disparition de cette trop commode particule." [Be
that as it may, thanks to its propensity for undergoing semantic
attenuation to the point of near vacuity, si offers a precious resource
to mediocre poets ... It seems there is no point in regretting the
disappearance of this all too handy particle.]
We already mentioned that the OF V2-effects disappeared along
with the loss of sentence particles. We assume that these phenomena
are interrelated. Adopting the Inertial Theory from Longobardi
(2001) we have to pursue two questions: to what extent did the
122 Gisella Ferraresi and Maria Goldbach

extinction of OF sentence particles contribute to a syntactic

reorganisation in the earlier stages of Modern French, and in what
way did these elements serve as primary cues for feature composition
in individual syntactic categories, in particular in Fin? As regards the
first questions we offer the following preliminary answer: given that
the PF-reduction rule in (35) diachronically increased its application,
the main exponent of Fin - si - lessened. Consequently language
learners failed to detect evidence for the realisation of Fin-features as
an independent syntactic node. Therefore they opted for realising
Fin-features in one syncretic category with Tense. The answer to the
question why Fin is collapsed into one single node with Tense rather
than with some other node of the articulated C-system during the
diachronic development of French we leave for future research. We
believe that any solution of this issue is interconnected with the
answer to the second abovementioned question. At any rate a deeper
examination of the topic-area in Old French will further contribute to
the detection of the featural composition of Fin.


This study has been carried out as part of the research project "Multilingualism
as cause and effect of language change", directed by Jürgen Μ. Meisel (Jürgen
Μ. Meisel 1999). This project is one of currently 13 funded by the Deutsche
Forschungsgemeinschaft (German Science Foundation) within the
Collaborative Research Center on Multilingualism, established at the
University of Hamburg. For helpful comments we are indebted to our
colleagues Susanne J. Jekat, Jürgen Μ. Meisel and Tessa Say. We also want to
thank Dominique Billy and Jürgen Klausenburger as well as an anonymous
reviewer. Responsibility for errors remains with the authors.
1. Remember that approaches taking Old French to be a V2-language and
simultaneously assuming si to be a lexical adverb suffer from lacking an
adequate description for (11) and (12). See also Kaiser 1998.
An anonymous reviewer, referring to Zwart 1993, has proposed a comparison
with Dutch 'reduced adverbs', which have the property of appearing between
the first position and the finite verb. Zwart 1993:45 FN 18 mentions that V3
sentences with the adverbials nu (non-temporal) 'now', dan (non-temporal)
'then', echter 'however', daarentegen 'in contrast' and immers 'as is known'
appear in Dutch exceptionally. Zwart assumes these adverbials to be part of the
Particles and sentence structure: a historical perspective 123

first constituent. A comparison with the Dutch data, however, seems to be less
suited for the analysis of the syntax of Old French particles than the Welsh
examples, for Old French, in contrast to Dutch, shows V3 structure not only
with particles but also in other constructions (cf. the following examples):

(i) et [devant vostre conseil] [nos\ [vcw dirons] ce que nostre seignor vos
mandent, ... (Villeh. 99,25-26)
'and before your council we you will-tell that what our lords ask you ...'

(ii) [Maintenant] [// six message] [s'agenoillent] a lor piez mult plorant;
(Villeh. 101,22-23)
'Now the six messengers kneel down at their feet crying very much'

Thus Old French does not display the V2 properties typical for Dutch and
German in other structures either. We therefore hold the view that Old French
in its V2 structures is profoundly different from the German or Dutch V2
syntax. A sound explanation of the difference between the Dutch and German
V2 syntax on the one hand and the Old French V2 phenomena on the other
presupposes, however, a deeper understanding of the V2 parameter we do
without with, as already mentioned, within the scope of this paper.
2. Examples like the following:

(i) Et il sifirent mult volentiers. (Villeh. 101,4)

'And they did so very willingly'

are quasi formulaic. They do not contain the sentence particle si but the
homonym VP-adverb si - 'so'.
3. Our anonymous reviewer gave us the advice to bring into play Rizzi's
(1997:3lOff.) anti-adjacency effects as a further argument for sentence adverbs
in Topic position behaving differently from internal arguments. Consider the
following examples from Rizzi.
4. We do not have an explanation for the need to assume both right and left
adjunction with the head adjunction in (19). Apart from Tense everything is
left-adjoined. Morphologically, however, this assumption is not necessarily
problematical, since there is at least a guarantee that external material does not
influence internal material as to its morphological form (i.e., strict cyclicity is
5. According to Wartburg's (1962:184) calculation, Old French had one-third
paroxytones and two-thirds oxytones. Therefore it is argued that stress is
already predictable and has ceased to be distinctive in OF lexical items (cf.
Klausenburger 1970 and references cited there).
6. Klausenburger (1970:20) underlines the fact that in Old French liaison only
took place in the clitic group but did not affect the phonological phrase.
124 Gisella Ferraresi and Maria Goldbach

Evidence for non-liaison comes from lexeme-final devoicing of voiced

obstruents, e.g. Late Latin grande OF grant ('big'), a phenomenon to be
interpreted as word boundary marking process. By the 17th century, perhaps as
a result of the strict right-headedness of ω, C and Φ, the phonological phrase
became subject of liaison, levelling the difference between clitic group and
phonological phrase. In Modern French there are no phonological rules giving
evidence for the clitic group as an independent prosodic constituent. Hence
Home (1990:11) proposes that at this stage phonological phrase and clitic
group are co-extensive. According to Dell (1984) the Modern French
accentuation rule is operative at a phrasal level which could be identified with
Nespor and Vogel's phonological phrase.


Adams, Marianne
1987 Old French, null subjects, and verb second phenomena, Ph.D.
Dissertation. University of California, Los Angeles.
Buridant, Claude
2000 Grammaire nouvelle de l'ancien franqais [New grammar of Old
French], Paris: Sedes.
Darmesteter, Arsene
1874 Deux elegies du Vatican [Two Elegies of the Vatican], Romania
Dell, Francis
1984 L'accentuation dans les phrases en franfais [The accentuation in
the phrases of French], In: Francis Dell, Daniel Hirst and Jean-
Roger Vergnaud (eds.), Forme sonore du language. Structure
des representations en phonologie [Sound form of language:
Structure of representation in phonology], 65-122 Paris:
Dresher, Elan B.
1998 Charting the learning path: cues to parameter setting. Linguistic
Inquiry 30.1, 27-67.
Ferraresi, Gisella and Maria Goldbach
2001 Topicalisation and left dislocation in Old French, Ms. University
of Hamburg.
Fleischman, Suzan
1992 Discourse and diachrony: the rise and fall of Old French si, In:
Marinel Gerritsen and Dieter Stein (eds.), Internal and External
Factors in Syntactic Change, 432-473, Berlin: Mouton de
Particles and sentence structure: a historical perspective 125

Foulet, Lucien
1998 Petite syntaxe de l'ancien frangais [Short syntax of Old French],
Paris: Champion.
Home, Merle
1990 The clitic group as a prosodic category in Old French, Lingua 82,
Kaiser, Georg
1998 Verb-Zweit-Effekte in der Romania. Eine diachronische Studie
mit besonderer Berücksichtigung des Französischen [Verb-
second effects in Romance. A diachronic study with special
reference to French], Habilitation, University of Hamburg.
Keenan, Edward L.
2001 Explaining the creation of reflexive pronouns in English, Ms.
Klausenburger, Jürgen
1970 French Prosodies and Phonotactics. A Historical Typology.
Tübingen: Niemeyer.
Lemieux, Monique and Fernande Dupuis
1995 The locus of verb movement in non-asymmetric verb second
languages: the case of Middle French. In: Adrian Battye and Ian
Roberts (eds.), Clause Structure and Language Change, 80-109,
New York: Oxford University Press.
Lightfoot, David
1999 The Development of Language. Oxford: Blackwell.
Longobardi, Giuseppe
2001 Formal syntax, diachronic minimalism, and etymology: the
history of French chez. Linguistic Inquiry 32.2, 275-302.
Maienbom, Claudia
2001 On the position and interpretation of locative modifiers, Natural
Language Semantics 9.2,191-240.
Marchello-Nizia, Christiane
1985 Dire le vrai: L'adverbe "si" en frangais medieval [To tell the
truth: The adverb « si» in Medieval French]. Geneva: Droz.
Meisel, Jürgen Μ.
1999 Multilingualism as cause and effect of language change:
historical syntax of Romance languages. In: Finanzierungsantrag
zum SFB Mehrsprachigkeit, 455-477, University of Hamburg.
Meyer-Lübke, Wilhelm
1899 Romanische Syntax [Romance syntax]. Leipzig: Reisland.
Nespor, Marina and Irene Vogel
1986 Prosodic Phonology, Dordrecht: Foris.
126 Gisella Ferraresi and Maria Goldbach

Pollock, Jean Yves

1989 Verb movement, universal grammar, and the structure of IP,
Linguistic Inquiry 20.3, 365-424.
Reenen, Pieter van and Lene Schesler
1993 The thematic structure of the main clause in Old French: OR
versus SI, In: Henning Andersen (ed.), Historical Linguistics.
Selected Papers from the 11th International Conference on
Historical Linguistics, Los Angeles, 16-20 August 1993,401-419,
Amsterdam: Benjamins.
Reenen, Pieter van and Lene Schesler
2001 The pragmatic functions of the Old French particles AINZ, APRES,
DONC, LORS, OR, PUIS, and SI. In: Susan C. Herring, Pieter van
Reenen and Lene Schesler (eds.), Textual Parameters in Older
Languages, 59-105, Amsterdam: Benjamins.
Rizzi, Luigi
1997 The fine structure of the left periphery. In: Liliane Haegeman
(ed.), Elements of Grammar. Handbook in Generative Syntax,
281-337, Dordrecht: Kluwer.
Roberts, Ian
1993 Verbs and Diachronic Syntax. A Comparative History of English
and French. Dordrecht: Kluwer.
Roberts, Ian
2000 Principles and parameters in a VSO language. Ms. University of
Rouveret, Alain
1994 Syntaxe du gallois: principes generaux et typologie [The syntax
of Gaulish: General Principles and typology]. Paris: CNRS
Selkirk, Elisabeth
1996 The prosodic structure of function words. In: James L. Morgan
and Katherine Demuth (eds.), Signal to Syntax: Bootstrapping
from Speech to Grammar in Early Acquisition, 187-213,
Mahwah NJ: Lawrence Erlbaum Associates.
Skärup, Povl
1975 Les premieres zones de la proposition en ancien frangais [The
left periphery of the Old French proposition]. Kopenhagen:
Akademisk Forlag.
Vance, Barbara
1997 Syntactic Change in Medieval French. Verb-Second and Null
Subjects. Dordrecht: Kluwer.
Particles and sentence structure: a historical perspective 127

Wartburg, Walter von

1962 Einfiihrung in Problematik und Methodik der Sprachwissenschaft
[Introduction to the area of problems an methods of linguistics].
Tübingen: Niemeyer.
Zwart, Cornelius J.W.
1993 Dutch Syntax. A Minimalist Approach. Diss. Groningen.

Primary Sources

Chretien de Troyes, Lancelot ou le chevalier de la charrete (The knight of the

cart), (from Buridant 2000) vol. IV, (ed.) M. Roques, Paris:
Champion, 1967.
Robert de Clari, La Conquete de Constantinople (The conquest of
Constantinople). In: Albert Pauphilet (ed) Historiens et
chroniqueurs du Moyen Age (Historians and chroniclers of the
Middle Ages), nouvelle edition, Paris: Gallimard, 1986.
Li bestiaires d'amours di maistre Richart de Fornival e li response du bestiaire
(Master Richard's bestiary of love and response) (ed.) Cesare
Segre, Milan: Riccardo Ricciardi, 1957.
Chretien de Troyes, Perceval ou le conte du graal(Perceval or the story of the
Grail), (from Marchello-Nizia 1985) (ed.) F. Lecoy, Paris:
Champion, 1972-75.
La Queste del Saint Graal. Roman du XIIF siecle (The quest of the Holy Grail.
Romance of the 13th century), (from Vance 1997) (ed.) Albert
Pauphilet (ed), Paris: Champion, 1949.
La Chanson de Roland (The song of Roland), (ed.) Leon Gautier, Tours: Mame
etfils, 1881.
Le Roman de Tristan en Prose (The romance of Tristan), vol. 1, (ed.) Philippe
Menard, Geneva: Droz S.A., 1987.
Geoffroy de Villehardouin, La Conquete de Constantinople (The conquest of
Constantinople). 2 vol. (ed. and translated) Edmond Faral, Paris:
Les Belles-Lettres, 21961.
Chretien de Troyes, Yvain ou le Chevalier au lion (Yvain or the knight with the
lion), (from Marchello-Nizia 1985) vol. V-VI, (ed.) M. Roques,
Paris: Champion, 1967.
Subject Case in Turkish nominalized clauses*
Jaklin Kornfilt

1. Introduction and summary

It is well-known that the distinction between adjuncts and arguments

plays an important role in syntax. For example, arguments can be
extracted more easily than adjuncts out of syntactic islands. Further-
more, adjunct domains tend to be syntactic islands, while argument
clauses tend not to have island properties (abstracting away from
syntactic subjects). In this paper, I claim that the argument-adjunct
distinction can also play a role in determining the Case on the subject
of a particular syntactic domain. It is the status as an adjunct versus
as an argument of that domain which can determine, I claim, the type
of subject Case.
This paper is also a case study in the interactions of morphology
and syntax, as it claims that overt Agr(eement)1 determines subject
Case (but only where Agr is licensed itself in this capacity). Another
aspect of the morphology-syntax interaction shown here is absence
of a one-to-one relationship between syntactic and morphological
Case: while morphological Genitive indeed reflects licensed nominal
subject Case, morphological Nominative (possibly by virtue of being
phonologically null) reflects both licensed verbal subject Case and
default Case.
The specific proposals made in this paper are listed below:
1. I claim that Turkish has three types of overt subjects: Those
that bear genuine subject Case, those that bear default Case, and
those which are Case-less. "Genuine subject Case" is licensed by a
designated Case licenser; for Turkish, this is the overt Agr(eement)
marker. Such subject Case can be Nominative or Genitive in Turkish,
depending on the categorial features of Agr.
130 Jaklin Kornfilt

Default Case is possible as a last resort strategy, when subject

Case is not licensed for an overt subject, and when no other licenser
can license another appropriate Case (e.g. an ECM verb licensing
Case-less subjects will be discussed briefly, as well; these are
non-specific, and they are less mobile than the other two types of
2. The proposed interaction between the argument-adjunct
asymmetry and the designated subject Case licenser, i.e. overt Agr, is
implemented in the following way:
Agr needs to be licensed itself in order to function as a subject
Case licenser. This can happen in three ways:
A. Categorially, i.e. via matching category features: A verbal Agr
is licensed in a fully verbal extended projection, and a nominal Agr is
licensed in a fully nominal extended projection.2
B. However, where there is a categorial mismatch, Agr must be
licensed differently. This is when the argument-adjunct asymmetry
comes into play:
An argument domain bears a thematic index (cf. the proposal in
Rizzi 1994 that arguments bear a "referential" index, while adjuncts
don't); this index is inherited by the Agr (if there is one) that heads
the argument domain in question. 3 1 assume that it is such indexation
which licenses a categorially unlicensed Agr as a subject Case
licenser. Thus, if Agr does not match its clause categorially, it is only
where that clause is an argument that Agr will be able to license
subject Case; where the domain is an adjunct, a categorially mis-
matched Agr cannot license subject Case.
We thus correctly predict the existence of argument-adjunct
asymmetries with respect to subject Case in categorially hybrid
clauses, as well as the absence of such asymmetries in categorially
homogeneous clauses.
C. There is another way for a categorially mismatched Agr to
receive an index and thus to get licensed as a subject Case licenser:
via predication with an external head, i.e. when the domain headed
by that Agr receives an index via predication (in headed operator-
variable constructions like relative clauses and comparatives), and
Subject case in Turkish nominalized clauses 131

when, once again, the Agr head inherits the index of the clause in
3. In all other instances (i.e. where there is no Agr, or where an
existing, but categorially unlicensed Agr cannot receive an index by
either "referential" θ-marking or under predication), no genuine
subject Case is possible. The clause will have either a PRO subject
or, if it has an overt subject, that subject will be in a default Case
rather than in a genuine subject Case. The paper discusses the issue
of default Case and proposes criteria determining when default Case
is possible and when it is not. It further proposes that the
morphological realization of default Case may differ across
languages; e.g. it is Accusative in English, while it is Nominative in
4. Coming back to subject Case, it is licensed locally within the
extended functional projection of the clause; no clause-external
nominal element is involved in this licensing—at least not directly,
as the licenser of subject Case.
5. The account proposed is compatible with approaches where
AgrP is an independent projection (Pollock 1989, Kornfilt 1984), but
also with approaches where Agr is positioned within the head of
another functional projection, e.g. of the head of a Fm(iteness)P (cf.
Rizzi 1997), as long as Agr is housed in a projection separate from
ΤΑΜ (i.e. Tense, Aspect, Mood).
6. This paper is, at the same time, a case study concerning the
two most widely used nominalization types in Turkish, with respect
to genuine subject Case. The argument-adjunct asymmetry mention-
ed in 2. is observed in one type of nominalization only (i.e. the indic-
ative type) and not the other (i.e. the subjunctive type).
The account proposed claims that, while both types of subordinate
domains are DPs, only indicatives are also CPs. This explains the
sensitivity of indicatives to "CP-level" phenomena and to θ-marking,
and the lack of such sensitivity in non-indicative subordination.
The paper is organized as follows: Section 2 presents the two
main asymmetries and establishes the relevance of Agr for subject
Case. Section 3 offers a basic account of subject Case. Section 4
extends that account to predication. Section 5 draws preliminary
132 Jaklin Kornfilt

conclusions. Section 6 discusses the nature of default Case. Section 7

proposes an explanation for when default Case may or may not be
allowed. Section 8 discusses two rival approaches to the first
asymmetry (i.e. the asymmetry between arguments and adjuncts in
nominalized factive clauses) and presents counterarguments. Section
9 summarizes this study's conclusions and mentions some
The paper is written in a general Principles and Parameters
framework without focussing on formalistic issues, to enhance read-
ability by an audience of the kind that attended the workshop where
this work was presented. For the same reason, I have not formulated
my account in strictly Minimalistic terms.

2. Basic facts: Different types of clauses

I now turn to an exposition of the basic facts, starting with different

types of subordinate clauses.
Embedded clauses in Turkish typically are not tensed, and are
traditionally said to be nominalized to varying degrees. However,
some subordination, even of the head-final type, is fully verbal; I
start my discussion of subject Case with that type.
The main point of this section will be to show that genuine subject
Case is licensed by Agr, and that ΤΑΜ morphology does not play a
role in this regard.
More specifically, I shall claim that there is only one kind of
genuine subject Case, irrespective of its morphological realization as
Nominative or Genitive: subject Case licensed by Agr. Depending on
the categorial features of this Agr as [+N] or [+V], this subject Case
will be realized as Genitive (=nominal subject Case) or Nominative
(=verbal subject Case), respectively.
Subject case in Turkish nominalized clauses 133

2.1. "Verbal clauses"—"verbal" to the fullest

2.1.1. Indicatives

"Regular" indicative root clauses exhibit a rich array of ΤΑΜ

markers, as well as (predicate-subject) agreement markers. The latter
come from a particular agreement paradigm which is the most widely
used one for verbal predicates4 in the language.5 (1) exemplifies this
type, with the ΤΑΜ marker as the future tense:

(1) Sen yarin ak$am ev -de yemek

you(SG) (NOM) tomorrow evening home -LOC food
pi§ir-ecek -sin
cook-FUT -2.SG
'You will cook food at home tomorrow evening.'

Notice that the subject is in the Nominative. (In Turkish, there is

no phonologically realized Nominative morpheme; I assume here
that the syntactic Nominative corresponds to a null morpheme.)
Identical clauses can be found as subordinate clauses when used
as quotations, but also as "regular" subordinate clauses with a
number of matrix verbs; the following two examples illustrate these
two situations, in the order mentioned:

(2) a .[Sen yarin ak§am ev -de yemek

you(SG) (NOM) tomorrow evening home -LOC food
ρϊξίΓ -ecek -sin ] diye duy -du -m
cook-FUT-2.SG 'saying' hear-PAST-l.SG
Ί heard "you will cook food at home tomorrow evening".'
b.[Sen yarin akqam ev -de yemek
you(SG) (NOM) tomorrow evening home -LOC food
pi$ir -ecek -sin ] san -lyor -urn
cook-FUT -2.SG believe-PRSPROG -l.SG
Ί believe you will cook food at home tomorrow evening.'
134 Jaklin Kornfilt

Exceptional Case Marking [ECAi]-constructions provide evidence

that it is not ΤΑΜ—morphology which is responsible for licensing
of the subject and of its Case.6
This can be seen clearly in the contrast between fully verbal
subordinate clauses like (2)b., where we just saw a subordinate
clause exhibiting fully verbal ΤΑΜ as well as fully verbal Agr
morphology, and a corresponding subordinate clause also exhibiting
fully verbal ΤΑΜ morphology, but no Agr morphology. (3)a. is
similar to (2)b. in showing that a fully verbal subordinate clause with
verbal ΤΑΜ morphology and with verbal Agr morphology has a
Nominative subject, this time with a subordinate clause that exhibits
progressive aspect and simple past tense:

(3) a .[Sen diin sabah ev -de yemek

you(SG)(NOM) yesterday morning home -LOC food
pi§ir -iyor -du -η ] san -di -m
cook-PROG -PAST -2.SG believe-PAST -l.SG
Ί believed (that) you were cooking food at home yesterday

The next example exhibits an interesting pattern: the subordinate

clause has the identical (verbal) aspect and tense combination, but it
lacks Agr marking:

(3) b. [Sen -i diin sabah ev -de yemek

you(SG) -ACC yesterday morning home -LOC food
pi§ir -iyor -du ] san -di -m
cook-PROG -PAST (no Agr) believe-PAST -l.SG
Ί believed you to have been cooking food at home yesterday

This last example shows that when Agr is absent, the subject
cannot show up in the appropriate subject Case, which would be the
verbal subject Case in this instance, i.e. in the Nominative. Instead,
where the matrix verb is one of a small number of ECM verbs,
Accusative is licensed by that verb.
Subject case in Turkish nominalized clauses 135

It should be noted that many speakers accept ECM-like

constructions with overt Agr, as well:

(3) c .[Sen -i dün sabah ev -de yemek

you(SG) -ACC yesterday morning home -LOC food
pi§ir -iyor -du -η ] san -di -m
cook-PROG -PAST -2.SG believe-PAST -l.SG
Ί believed you to have been cooking food at home yesterday

I shall not, in the context of this paper, address the issue of the
nature of ECM in Turkish in detail, nor in the status of (3)c. It is
possible, for example, that while (3)b. is a genuine instance of ECM,
(3)c. exemplifies a phonologically empty subject (i.e. pro) copy in
the subordinate clause, with the Accusative DP actually raised into
the matrix (cf. Moore 1998).
For the purposes of this paper, the important point is the
following: all speakers accept (3)b., with an Accusative subject
under absence of Agr in the verbal subordinate clause, and no
speaker would accept (3)d., where the Agr element is missing, yet
where the embedded subject is in the Nominative:

(3) d.*[[Sen dün sabah ev -de yemek

you(SG) (NOM) yesterday morning home -LOC food
pi$ir -iyor -du ] san -di -m
cook-PROG -PAST (no Agr) believe-PAST -l.SG
Intended reading: Ί believed (that) you [Nom.] were
yesterday morning.'cooking food at home yesterday

Note that both in the fully grammatical (3)b. and in the

completely ungrammatical (3)d., the embedded predicate bears its
regular ΤΑΜ morphology, i.e. in this instance, markers for progres-
sive aspect and for past tense. Therefore, the licenser of Nominative
subjects, i.e. of the verbal subject Case, cannot be the verbal ΤΑΜ
136 Jaklitt Kornfilt

The second part of our conclusion must therefore be as follows:

the licenser of the Nominative in root as well as embedded verbal
clauses is the (verbal) Agr marker.
Our general observation so far, then, is as follows: 1. There is a
strict correlation between (verbal) Agr and (verbal) subject Case, i.e.
Nominative; 2. There is no correlation at all between (verbal) ΤΑΜ
morphology and the (verbal) subject Case.
From a cross-linguistic point of view, it is not a novel observation
that ECM is not limited to infinitival subordinate clauses (while it is
so limited in some languages, e.g. English). There are some
languages where ECM can apply to subjunctive clauses, and some
where ECM is possible even into clauses with tense and agreement;
the latter is the case, for example, in Modem Greek.
The Turkish facts are of special interest nevertheless. First of all,
the subordinate clauses into which ECM may apply are indicative,
not subjunctive. It is well-known that subjunctive clauses are more
"transparent" than indicative ones with respect to a number of
syntactic phenomena, ECM being only one of them. The same is true
with respect to anaphoric binding, for example.
Secondly, in Modern Greek, the morphological infinitival has
been lost. Tensed forms of verbs are therefore used instead of the
infinitival; in such instances, they can be said to be "fake" tenses. For
example, the citation form of verbs is tensed, with subject agreement.
Therefore, it is not too surprising that ECM should be able to apply
into a tensed subordinate clause that also exhibits predicate-subject
In contrast, the tenses in Turkish verbal subordinate clauses with
ECM are genuine. Turkish does have a morphological infinitive;
therefore, tensed forms of the verb are not used in place of the
infinitive, and are genuinely tensed.
Thus, the fact that Nominative subjects are only possible in fully
verbal subordinate clauses in the presence of verbal Agr, and that
verbal genuine Tense forms do not license overt Nominative subjects
when Agr is absent, is significant cross-linguistically, as well as for
Turkish individually.
Subject case in Turkish nominalized clauses 137

I shall make the further assumption that (a licensed) Agr must be

in C so as to act as a Case licenser. That the level of CP is involved
can be seen by the contrast between extractions out of ECM-clauses
versus fully finite (in the sense of George and Kornfilt 1981) verbal

(4) a. ??/*[Ali-nin [sen -i e{ yaz -di ]

Ali-GEN you -ACC write-PAST
san -dig-i ] mektupi
believe-FN-3.SG letter
Intended reading: **the letter which Ali believes you to have
b.[Ali-nin [sen ei yaz -di -n ]
AH-GEN you write -PAST -2.SG
san -dig-i ] mektupi
believe-FN-3.SG letter
'the letter which Ali believes you wrote'

I now turn to another type of fully verbal subordinate clause,

namely to subjunctives.

2.1.2. Subjunctives

There is a predicate form in root clauses which is called the Optative

or Subjunctive; I shall use the second form. This form takes different
predicate-subject Agr forms than Indicatives; however, these forms
are verbal, as well, and thus differ from nominal Agr forms. The
subject is in the Nominative. These clauses are illustrated by the next

(5) a .Ben bugiin yemek pi§ir -e -yim

I (NOM) today food cook -SUBJNCT -l.SG
Ί should/ought to cook food today;
Let me cook food today.'
138 Jaklin Kornfilt

Just as we saw for Indicatives, Subjunctive clauses can also be

embedded and show up in a form completely identical to a root

(5) b. [Ben bugiin yemek pi$ir -e -yim ]

I (NOM) today food cook-SUBJNCT-l.SG
isti -yor -urn
want-PRSPROG -l.SG
1 want to cook [that I should cook] food today;
I want for myself to cook food today.'

The facts here are just as expected; whether in root or embedded

clauses, the Agr form is verbal, and it licenses verbal subject Case,
i.e. the Nominative, on a subject.

In the next subsection, I turn to nominalized embedded clauses. In

that subsection, I shall aim at establishing the same correlation
between Agr and subject Case for such clauses that we saw in fully
verbal clauses. Another aim of the discussion will be to establish a
categorial difference between the two main types of embedded
nominal clauses—a difference which I claim plays a central role in
the licensing of nominal Agr as a subject Case licenser. More
specifically, I will claim that while nominal subjunctive clauses are
homogeneously nominal, nominal indicative clauses are categorially
hybrid, with a nominal Agr sandwiched between a verbal ΤΑΜ layer
and a verbal CP (or Force Phrase) layer. Therefore, while nominal
Agr is fully licensed within the fully nominal subjunctive clause, it is
not so licensed within the hybrid indicative clause and therefore
needs another licensing mechanism.
Subject case in Turkish nominalized clauses 139

2.2. Non-tensed argument clause types

2.2.1. Similarities and differences

Turkish has a few different "nominalization" types. For the purposes

of this paper, the term "nominalization" is being used to refer to an
extended clausal projection with some nominal functional layers that
represent "nominalization" (for which diagrams will be shown later
in the paper), and not to lexically derived deverbal nouns.9
I illustrate the two main types of syntactic nominalization here,
namely the "factive" (i.e. indicative) and "non-factive" (i.e. subjunc-
tive) types.
"Factive" (indicative) nominalized embedded clause:

(6) a. [Sen -in diin sabah ev -de yemek

you -GEN yesterday morning home -LOC food
piqir -dig -in ]-i duy -du -m
cook-FN-2.SG-ACChear -PAST-l.SG
(san -di -m
/believe -PAST-l.SG
Ί heard/believed that you had been/were cooking/cooked
/had cooked food at home yesterday morning.'

"Non-factive" (subjunctive) nominalized clause:

(6) b.[Sen -in yarin ev -de yemek

you -GEN tomorrow home -LOC food
pi$ir -me -n ]-i isti -yor -urn
cook-NFN -2.SG-ACC want-PRSPROG -l.SG
Ί want for you to cook food at home tomorrow;
I want that you should cook food at home tomorrow.'

These two types of nominalized embedded clauses exhibit some

similarities as well as differences.
140 Jaklin Kornfilt

I start with the similarities: the subject in both types is in the

Genitive, i.e. in what I have been calling the nominal subject Case.
This is just as expected under the correlation I have posited here,
because Agr is also nominal in both.
Both types are ultimately, i.e. in their highest functional layer(s),
DPs which need Case just like any DP. Such Case is licensed by a
structurally higher Case licenser and is realized, in both clausal
types, as the last morpheme in the morphological sequence of the
nominalized predicate.
A further point of similarity concerns the morphological sequence
within the predicate. In both types, the "factive" and the "non-
factive" nominal morphemes appear in the morphological slot in
which ΤΑΜ morphemes show up in fully verbal clauses; this can be
seen by comparing the examples presented so far for fully verbal
versus nominal clauses. These nominal morphemes share not just the
morphological slot, but also certain semantic properties with the
corresponding ΤΑΜ morphemes: mood properties like indicativity
versus subjunctivity are similar (hence my terms of nominal
indicative and subjunctive). Furthermore, as we shall see presently,
the indicative nominal marker can also express a vestige of tense.
Yet another property in which the two types of clauses are similar
is in exhibiting the full argument structure of their respective
predicates. From this point of view, these two types of nominalized
clauses are not different from fully verbal clauses; similar adjuncts
can show up, as well.
I now turn to the differences between these two nominal clauses.
The non-factive clauses are more nominal than the factive ones in a
number of ways. I shall therefore be claiming in this paper that non-
factive, i.e. subjunctive nominal clauses are homogeneously DPs;
factive, i.e. indicative nominal clauses, on the other hand, are, at the
same time, CPs (or, in the terminology of Rizzi 1997, Force
Phrases), i.e. they have at least one "high" functional layer with
verbal features within a nominal functional projection, i.e. within a
DP (in addition to a "low" verbal functional layer, i.e. the ΤΑΜ
Subject case in Turkish nominalized clauses 141

1. I take Tense to be part of a verbal property. As we shall see

presently, nominal indicatives can be overtly marked for future
versus non-future tense (-DIK: non-future, -(y)AcAK: future11). This
verbal property is congruent with positing a higher CP-layer, which
has verbal features, as well. In contrast, Subjunctive nominal clauses
have only one marker, -raA, and are thus neutral for tense.
The non-future nominal indicative was exemplified above; the
next example illustrates the future tense nominal indicative:

(6) c.[Sen -in ev -de yemek

you -GEN home -LOC food
pi$ir -eceg -in ]-i duy -du -m
cook-FUTN -2.SG-ACC hear -PAST-l.SG
Ί heard that you will cook food at home.'

As the translation makes clear, the embedded nominal indicative

clause is independent from the root clause with respect to tense. In
other words, the embedded clause has its own tense features.
This contrasts with embedded nominal subjunctive clauses; with
respect to tense, these depend on the clause they are embedded

(6) d. [Sen -in ev -de yemek

you -GEN home -LOC food
pi§ir -me -n ]-i isti -yor -urn
cook-NFN -2.SG-ACC want-PRSPROG -l.SG
liste -di -m /isti -yeceg -im
/want-PAST-l.SG /want-FUT -l.SG
Ί want/wanted/will want for you to cook food at home.'

These examples have shown us that nominalized indicatives have

tense (albeit by far not as richly so as in fully verbal clauses) and
have thus verbal properties, in contrast to nominal subjunctive
clauses that lack tense completely and thus lack corresponding verbal
142 Jaklin Kornfilt

2. In addition to lacking (functional) verbal properties, subjunc-

tive nominals have certain nominal properties which are absent in
nominal indicatives. Subjunctive nominalizations can, with varying
degrees of success, be pluralized and can also co-occur with certain
determiners, e.g. with demonstratives, while neither is possible with

(6) e. **[Hasan -in bu durmadan kumarhane -ye

Hasan -GEN this constantly casino -DAT
kag -tik -lar -in ] -i
escape -FN -PL-3.SG-ACC
duy -ma -mi§ -ti -m
Intended reading: Ί hadn't heard (about) these constant
runnings (away) of Hasan to the gambling casino.'

(6) f. ?(?)[Hasan-in bu durmadan kumarhane-ye

Hasan-GEN this constantly casino -DAT
kag -ma -lar -in ] -dan
escape-NFN-PL-3.SG -ABL
ho§lan -mi -yor -um
Ί don't like these constant runnings (away) of Hasan to the
gambling casino (i.e. that Hasan should run to the casino

3. There is very suggestive evidence showing that nominal

indicatives are CPs, while nominal subjunctives are not: non-factive
nominalized clauses cannot host WH-operators, i.e. they can neither
act as embedded WH- or Yes/No-questions, nor can they function as
modifying clauses in relative clause constructions. Factive
nominalized clauses can be used in all of those functions, arguing
that they are CPs (and thus have a Spec, CP position that can host an
operator), albeit dominated by DP, while non-factive nominalized
clauses are homogeneously DPs and consequently don't have a
qualifying Specifier position for the operators in question.12
Subject case in Turkish nominalized clauses 143

The following examples illustrate the contrasting properties of

nominal indicative versus subjunctive embedded clauses with respect
to embedded (i.e. narrow-scope) WH-questions and with respect to
relative clauses, in this order.

(7) [yemeg -i kirn -in pi§ir -dig-in ]-i

food -ACC who -GEN cook -FN-3.SG -ACC
sor -du -m /duy -du -m
ask -PAST-l.SG/hear-PAST-l.SG
Isöyle-di -m
/tell -PAST-l.SG
Ί asked/heard/told who had cooked the food.'

(8) *[yemeg -i kirn -in pi$ir -me -sin ]-i

food -ACC who-GEN cook-NFN-3.SG-ACC
söyle-di -m
tell -PAST-l.SG
Intended reading: Ί said who should cook the food.'

The indicative nominal clause in (7), by virtue of being a CP, has

a position in which a WH-operator is licensed: the Spec, CP position.
Please note that in recent approaches (such as the one proposed in
Rizzi 1997) in which the CP is layered further into a number of
distinct functional projections, this position could plausibly be the
Specifier position of a Force Phrase. What's important here is that
nominal indicative clauses would include such a projection, and thus
its Specifier position, as a position which is qualified to host a WH-
operator, while subjunctive nominal clauses, by virtue of not having
such a functional projection, cannot host a WH-operator. (Please
note that this statement will be generalized soon, so as to include any
Note also that there is nothing wrong with the nominal
subjunctive clause in combination with the matrix predicate in (8)
per se; this is illustrated in the next example, where the
corresponding declarative nominalized subjunctive is fine in the
same syntactic context:
144 Jaklin Kornfilt

(9) [yemeg -i Ali-nirt pi§ir -me -sin J-i

food -ACC Ali -GEN cook -NFN-3.SG-ACC
söyle -di -m
teU -PAST -l.SG
Ί said that Ali should cook the food.'

The ungrammatically of (8) is thus clearly due to the fact that the
subjunctive clause is an embedded interrogative and that there is a
question operator there for which the clause does not offer an appro-
priate position.
It is instructive to observe that the desired reading in (8) can be
expressed, but with some additional means—namely involving the
indicative: it is necessary to embed the subjunctive under an appro-
priate nominal indicative clause; e.g.:

(10) [[yemeg -i kim -in pi§ir -me -si ]

food -ACC who-GEN cook-NFN-3.SG
gerek -tig -in ]-i söyle-di -m
(be) necessary-FN-3.SG-ACC tell -PAST-l.SG
Ί said for whom it was necessary to cook the food.'

Now, the embedded interrogative has become larger, i.e. it is the

nominal indicative clause that dominates the subjunctive clause. As
we said earlier, the indicative clause does, by virtue of being a CP (or
a Force Phrase) have the necessary specifier position for the
interrogative operator, and the result is fine. Similar facts hold for
Y/N questions, as well; I shall not illustrate those, due to constraints
on space.
Finally, it is interesting to note that the issue is not just one of
licensing a [+WH] operator via [+WH] features of a qualifying
functional head. This is because a similar dichotomy between
nominal indicative versus subjunctive clauses can be observed with
respect to relative clauses, as well. The operator in relative clauses is
not [+WH]. Therefore, the issue is not merely one of [+WH] features
being licensed by a particular functional head (or not being so
licensed), but rather of having a functional projection whose
Subject case in Turkish nominalized clauses 145

Specifier position is able to host any operator that can enter into an
operator-variable relationship.
The following examples illustrate the contrast between the two
types of nominal embedded clauses with respect to relative clause
I start with indicative RCs:

( 1 1 ) [Ali-nin ei pi§ir-dig-i ] yemeki

AH-GEN cook-FN-3.SG food
'The food Ali cooked'

Subjunctive RCs don't exist (with the exception of irrealis RCs, to

be discussed later):

(12) *[Ali -nin ei pi§ir -me -si ] yemeki

Ali -GEN cook-NFN-3.SG food
Intended reading: 'The food Ali should cook'

Embedding the subjunctive RC under an appropriate indicative

"saves" the utterance:

( 1 3 ) [[Ali-nin ei pi^ir-me -sin]-i söyle-dig-im ]

Ali -GEN cook-NFN -3.SG -ACC tell -FN-l.SG
'The food which I said Ali should cook'

The explanation for the ungrammaticality of (12), as well as the

reason for why embedding the ungrammatical subjunctive modifier
clause under an indicative saves the utterance in (13) carry over from
the discussion of embedded interrogatives—under the proposed
extension from [+WH] operators to any operator, with the corre-
sponding extension from a functional head with [+WH] features to
one whose categorial features enable it to AGREE with any operator,
i.e. an extension to a C-head or a Force-head.
146 Jaklin Kornfilt

The following diagrams integrate the proposals made in the pre-

vious discussion, starting with a subjunctive nominalized clause.

(14) a.


AgrP Κ

DPi Agr

MP Agr

Ali nini ti ti kitab -i oku -ma -sin -i

Ali-GEN book -ACC read -NFN -3.SG -ACC
'for Ali to read the book' [='for Ali's reading the book']
(as a direct object)
(Adapted from Borsley and Kornfilt 2000:108)
Subject case in Turkish nominalized clauses 147

The following is a rough representation for an indicative nominal-

ized clause:

(14) b.


CP κ

AgrP C
D Agr

MP Agr




Ali nin[ ti tj kitab -i oku dug -un -u

Ali-GEN book -ACC read -FN -3.SG -ACC
'(that) Ali read the book' (as a direct object)

These rough representations are very similar, with the exception

of a CP- (or Force-Phrase)-layer between the AgrP (or Finiteness
Phrase) and the KP, the highest layer of the nominal clause (i.e. a
Case Phrase) in the representation for the nominal indicative clause
148 Jaklin Kornfilt

—a layer which is missing in the representation of the fully nominal

subjunctive clause, as discussed.
As a consequence, the nominal Agr finds itself between two
verbal phrasal layers in the nominal indicative clause, while it is
surrounded by fully nominal layers in the subjunctive clause. The
significance of this difference for the proposed account of subject
Case will be central for my account; a discussion of this significance
will be initiated in section 3.
Is it a coincidence that there is a CP (or a Force Phrase) in a
clause where there are also Tense features? The answer is no. As
mentioned in the earlier discussion, Tense features are verbal, and so
are the features of the CP (or of the Force Phrase); thus, there is a
categorial agreement between these layers. This analysis predicts that
we would not get lack of Tense if a CP-layer is present which
exhibits CP-related syntactic properties. This prediction is indeed
fulfilled in Turkish, as we saw. I expect it also to hold cross-linguis-
tically, and, as far as I know, it does.

2.2.2. The importance of nominal Agr for genuine subject Case

and infinitival clauses

I now turn to the importance of nominal Agr in licensing subject

A subset of matrix predicates that subcategorize for subjunctive
argument clauses also co-occur with infinitival argument clauses.
Such clauses share with the previously illustrated nominalized
clauses the property of being Case-marked:

(15) Beni[PROi karanlik -ta sokak-lar-da

I darkness -LOC street -PL-LOC
yürü-mek]-ten kork -ar -im
walk-INF -ABL fear -AOR-l.SG
Ί am afraid to walk in the streets in the dark.'
Subject case in Turkish nominalized clauses 149

Such infinitival clauses cannot bear overt Agr markers. Note that
there is no such marker between the infinitival marker and the Case
marker on the clause.
Overt subjects are not possible in infinitivals, no matter what their
Case is:

(16) *Ben [kiz -im Ikiz -im -in

I daughter -l.SG[NOM] /daughter-l.SG -GEN
karanlik-ta solcak -lar -da yiirii -mek] -ten
darkness-LOC street-PL-LOC walk-INF -ABL
kork -ar -im
fear -AOR-l.SG
Intended reading: Ί am afraid for my daughter to walk in the
streets in the dark.'

I claim here that these two observations are linked to each other;
in other words, infinitival clauses have no Agr, and it is therefore that
the only possible subject in such clauses is PRO. (This statement will
be refined later.)
For an utterance like (16) to be grammatical with an overt subject,
the embedded predicate must be marked with the non-factive
nominalization marker, instead of with the infinitive (thus preserving
subjunctive Mood), and it must also bear overt Agr morphology:

(17) Ben [kiz -im -in karanlik -ta

I daughter-l.SG-GEN darkness-LOC
sokak-lar -da yiirii -me -sin ]-den
street -PL -LOC walk -NFN -3.SG -ABL
kork -ar -im
fear -AOR-l.SG
Ί am afraid for my daughter to walk in the streets in the dark.'

Again, the observations are linked: Where there is Agr, there is

also an overt subject, and that subject appears in the appropriate
genuine subject Case. This is the Genitive, i.e. the nominal subject
Case, because the Agr itself has nominal features. PRO cannot show
150 Jaklin Kornfilt

up in the presence of Agr. For the purposes of this paper, I shall not
be concerned with the specifics of why this is not possible; the two
main types of answers would be either that this is due to some
implementation of the original PRO-Theorem (cf. Chomsky 1981),
or else to an inappropriate Case being licensed for the subject
position if that position is occupied by PRO (cf. Chomsky and
Lasnik 1991). Either way, the presence of Agr would preclude the
presence of a PRO-subject.
We thus explain the two correlations we have observed in this
discussion of nominal clauses:
Correlation A: when there is infinitival morphology, there is no
Agr, no overt subject possible (because no Case of any type—or a
Case of an inappropriate type—is licensed); the only possible subject
is PRO.
Correlation B: when there is instead nominal subjunctive
morphology (which has the same Mood as the infinitival), there also
is overt (nominal) Agr; now, an overt subject with nominal subject
Case (i.e. Genitive) shows up; no PRO subject is possible.
We are now able to collapse the correlations we have set up for
verbal and for nominal clauses into one overall correlation:
For both nominal and fully verbal clauses: where overt Agr shows
up, the overt subject is licensed via the corresponding (i.e. nominal
or verbal) subject Case (i.e. Genitive or Nominative, respectively),
depending on the nominal versus verbal features of the Agr. Without
Agr, no genuine subject Case of any sort is possible.
Having thus concluded a preliminary discussion of verbal as well
as nominal argument clauses, I turn to adjunct clauses.

2.3. Adjunct clauses

2.3.1. Indicative adjunct clauses with nominal Agr

Both the nominal indicative and the nominal subjunctive clause types
which we discussed as argument clauses can also appear as adjuncts.
Those are usually complements of postpositions, but they can also
Subject case in Turkish nominalized clauses 151

occur without a postposition. The two asymmetries mentioned in the

introduction surface when we compare these with the corresponding
argument clauses just discussed.
I start the discussion of adjunct clauses with examples of nominal
indicative clauses as objects of postpositions, followed by examples
of the same type of clause without any postposition, but still as
adjuncts. Comparison of these adjunct nominal indicatives (whether
with or without postpositions) with their argumental counterparts
will establish the first asymmetry.
The main property to notice about these adjunct nominal
indicatives is that their subjects are not in the Genitive, as expected,
but rather in the Nominative—or, if we don't want to prejudge the
issue at this point, we can say that the subjects are bare. The first
three examples illustrate this for nominal indicatives that are
postpositional objects, and the latter two exhibit the same fact for the
same clause type, used as adjuncts, but without a postposition:

(18) [[Sen yemek pi$ir -dig-in ] igin ] ben

you(SG) (NOM) food cook -FN-2.SG because I
konser -e gid -ebil -di -m
concert-DAT go-ABIL-PAST-l.SG
'Because you cooked, I was able to go to the concert.'

(19) [[Sen yemek pi§ir -dig-in ] -e

you(SG) (NOM) food cook -FN-2.SG -DAT
göre ] hepiniz ev -de kal -acak -siniz
according to all+you home -LOC stay-FUT -2.PL
'Given that you cooked, all of you will stay at home.'

(20) [[Ben yemek pi$ir -dig-im ] -den dolayi]

I (NOM) food cook -FN-l.SG -ABL because
konser -e gid -e -me -di -m
concert -DAT go -NegABIL-NEG-PAST-l.SG
'Because I cooked, I was unable to go to the concert.'
152 Jaklin Kornfilt

(21) [[Ben yemek pi§ir -dig-im ] -den konser -e

I(NOM) food cook -FN-l.SG -ABL concert-DAT
gid -e -me -di -m
'Because I cooked, I was unable to go to the concert.'

(22) [Sen konser-e git-tig-in ] -de ben

you(SG) (NOM) concert-DAT go -FN-2.SG -LOC I
ev -e dön -iiyor -du -m
home -DAT return -PROG -PAST - l.SG
'When you were going to the conceit (at your going to the
concert), I returned home.'

This appears to be a problem for the account I proposed so far.

Note that in all of these examples, the nominal indicative clause does
include a nominal Agr marker, and that all of these clauses do carry
some kind of θ-role, albeit an adjunct θ-role. Thus, one would
assume that some sort of thematic index would be assigned to the
clause and be inherited by the nominal Agr, thus enabling it to
license subject Case. However, this would incorrectly predict
Genitive subjects here, as this would be the licensed subject Case.
I would like to claim that this problem is only apparent. I shall
propose in this paper that the subjects of adjunct indicative clauses
are in a default Case (and not in a genuine, licensed subject Case),
because the Agr element is not licensed by a primary θ-index. I
follow Grimshaw (1990) in distinguishing primary from secondary
θ-roles—a distinction which directly corresponds to the one in Rizzi
(1994) between referential and non-referential indexation, which is a
distinction drawn between arguments and adjuncts. In line with this
distinction, I refine the proposal I made earlier about licensing a
nominal Agr which is not categorially licensed. I had proposed that
such an Agr needs to carry a θ-index to be licensed as a subject Case
licenser. I now constrain that proposal: this θ-index must be that of a
primary θ-role in Grimshaw's sense, i.e. a "referential" index in
Rizzi's sense. While any θ-index might be sufficient to license Agr
as a subject Case marker in some other languages, it is clear that for
Subject case in Turkish nominalized clauses 153

Turkish, there must be such a constraint imposed on the type of

I shall return to the issue of default Case in subsection 2.3.3., by
showing that it applies to adjunct clauses without any Agr element,
as well, thus providing independent motivation for this mechanism in
the grammar. An account of my overall approach to subject Case will
be offered in section 3.
The present subsection has illustrated the first asymmetry
mentioned in the introduction; the next subsection illustrates the
second asymmetry, i.e. the fact that the first asymmetry is found only
in the categorially hybrid nominal indicatives but not in the fully
nominal subjunctives.

2.3.2. Subjunctive adjunct clauses with nominal Agr

Subjunctive nominal adjunct clauses contrast with indicative

nominalized adjunct clauses with respect to their subjects: those
subjects are in the Genitive, just as they are in corresponding
argument clauses:

(23) [[Sen -in yemek pi§ir -me -n ] igin ] ben

you(SG)-GEN food cook-NFN-2.SG for I
ev -de kal-di -m
house -LOC go -PAST -1 .SG
Ί stayed at home so that you should cook (for you to cook).'

The Genitive subject here contrasts with the corresponding

Nominative subjects in the examples (18) through (22), where
nominal indicatives were exemplified as adjuncts. The contrast
between (23) and (18) is particularly instructive in this regard, as the
same postposition, i.e. igin, shows up in both (albeit with different
semantics, due to the different facti vi ty and Mood differences
between the embedded clauses); the nominal Agr morphology is the
same in both, as well. Yet, the subject of the embedded indicative
clause in (18) is in the Nominative (i.e. default) case, while the
154 Jakliti Kornfilt

subject of the embedded subjunctive clause is in the Genitive (i.e.

genuine nominal subject) case in (23).
I propose that the reason for this contrast is the one mentioned
earlier, e.g. in the introduction: the nominal Agr element is licensed
via the categorially matching nominal features within its own clause
in nominal subjunctive clauses such as in (23). I have argued in
subsection 2.1.2. that these clauses are indeed homogeneously nomi-
nal, and that especially their ΤΑΜ morphology is [+N, -V], thus
"harmonizing" with the corresponding feature values of the nominal
Agr morphology. As a consequence, the nominal Agr in nominal sub-
junctives always licenses genuine (nominal) subject Case, i.e. Gen-
itive, irrespective of the argument or adjunct status of the clause. In
other words, an Agr licensed categorially within its clause does not
need further licensing via any sort of indexation, thematic or other-
wise; this is why nominal subjunctive clauses don't exhibit sensitivi-
ty to the adjunct versus argument distinction with respect to subject
In contrast, Agr is not categorially licensed clause-internally in
nominalized indicative clauses. This is why it needs licensing via
indexation, as discussed in the previous subsection, and why nominal
indicative clauses show sensitivity to their adjunct versus argument
There is some independent evidence for my proposal that the lack
of sensitivity observed for nominal subjunctive clauses with respect
to the argument/adjunct asymmetry (due to the ability of its Agr to
license genuine nominal subject case irrespective to that asymmetry)
is made possible by the homogeneously nominal categorial features
of the domain headed by such nominal Agr. This evidence comes
from regular possessive phrases.
Possessive phrases are similar to nominalized clauses with
respect to the nominal Agr morphology on the head, i.e. on the
nominal with the semantics of "possessed". Interestingly, the
specifier of such phrases (i.e. the nominal with the semantics of
"possessor"), which is the nominal that corresponds to the subject of
nominal phrases, bears Genitive marking, irrespective of the
argument or adjunct status of the entire possessive phrase. In the
Subject case in Turkish nominalized clauses 155

following pair of examples, the possessive phrase is an argument in

(24), and an adjunct in (25). Please note that the possessor, Ali, is
marked with the Genitive in both instances:

(24) Hasan [Ali -ttin kitab-ιη ] -i oku -du

Hasan Ali -GEN book-3.SG -ACC read -PAST
'Hasan read Ali's book.'

(25) Hasan kitab-ι [[Ali -nin kiz -i ] igin]

Hasan book-ACC Ali -GEN daughter-3.SG for
al -di
'Hasan bought the book for Ali's daughter.'

This fact is exactly as predicted by the approach outlined above.

The nominal Agr morphology is licensed by the nominal features
within its own phrase; obviously, the nominal head of a possessive
phrase is unambiguously and fully nominal. Therefore, such Agr
morphology does not need any other licensing and is thus insensitive
to any indexing that would originate from a thematic or predicational
In the following section, I discuss adjunct clauses without an overt
Agr element and show that such clauses independently motivate the
assumption of a default Case mechanism for subjects.

2.3.3. Adjunct clauses without any Agr

Typically, adjunct clauses that lack overt Agr morphology on their

predicates also don't have an overt subject; these embedded subjects
have the properties of PRO. 14 Note that in the following two
examples with PRO-subjects, the subject of the adjunct clause takes
on the reference of the overt subject in the main clause obligatorily:
156 Jaklin Kornfilt

(26) a.Oyai dün bütiin gün ςαΙι§ -ti.

Oya yesterday all day work-PAST
[PROj/*i makale-yi yaz -ar -ken ]
article -ACC write-PRES.PART -'while'
Ahmetj islik ςαΐ -iyor -du
Ahmet whistle play -PROG -PAST
'While writing the article, Ahmet was whistling.'
(The only person writing the article can be Ahmet, even
though Oya was mentioned in the discourse, and would be
pragmatically the likelier antecedent for PRO in this
b. [PROi makale-yi yaz -ar -ken ]
article -ACC write-PRES.PART -'while'
beni islik ςαΐ -acag -im
I whistle play -FUT -l.SG
'While (I will be) writing the article, I will be whistling.'

In contrast, the pro-subject in corresponding clauses with overt

Agr morphology may also take on the reference of other antecedents:

(27) Oyai gok özverili bir insan.[[proi/j dün yemek

Oya very selfless a person yesterday food
piqir -dig-i ] igin ] Ahmetj konser -e
cook-FN-3.SG because Ahmet concert-DAT
gid -ebil -di
'Oya is a very selfless person. Because she/he cooked
yesterday, Ahmet was able to go to the concert.'

Although syntactically, Ahmet is a closer antecedent to the pro-

subject, the discourse-antecedent Oya is pragmatically the likelier
antecedent and is thus the preferred indexer. (Without the first
sentence about Oya, the indexer of pro is, of course, Ahmet.)
For our purposes, these sentences establish the difference between
PRO- and pro-subjects. Note, at the same time, that PRO-subjects
co-occur with predicates that lack overt Agr, while pro-subjects need
Subject case in Turkish nominalized clauses 157

overt Agr to be licensed and identified. (Kornfilt 1996b discusses

additional criteria to distinguish these two empty categories in Turk-
An interesting observation about Agr-less adjunct clauses is that
an overt subject can show up in the position of PRO in such exam-

(28) [Meral makale-yi yaz -ar -ken ] ben islik

Meral article -ACC write-AOR-'while' I whistle
ςαΐ -iyor -du -m
play -PROG-PAST-l.SG
'While Meral (was) writing the article, I was whistling.'

Similar facts are found with other morphologies found in such

tenseless adjunct clauses, as well:

(29) [sen konser-e gid -ince ] ben ev -e

you(SG) concert-DAT go -'when' I home-DAT
dön -dii -m
return -PAST-l.SG
'When you went to the concert, I returned home.'

(30) [sen konser -e gid -eli ] be$ saat ol -du

you(SG) concert-DAT go -'since' five hour be-PAST
'It's been five hours since you went to the concert.'

The factive nominalization morpheme -DIK can, albeit in-

frequently, also serve as part of the predicate of such an adjunct
clause with either a PRO-subject or an overt subject, when followed
(in its Agr-less form) by the locative morpheme -DA:

(31) a.[PRO her gel -dik-te ] Ali ben-im -le

each come -FN-LOC Ali I -GEN -with
kavga ed -er
quarrel do -AOR
'Ali quarrels with me every time [he] come[s].'
158 Jaklin Kornfilt

b. [sen her gel -dik-te ] Ali ben -im -le

you(SG) each come -FN-LOC Ali I -GEN-with
kavga ed -er
quarrel do-AOR
'Ali quarrels with me every time you come.'
(Adapted from Lewis 1967: 183)

The interesting question that arises here is: how can it be that both
PRO-subjects and overt subjects are possible here? We saw earlier
that Agr-less infinitival clauses that are arguments (rather than
adjuncts) allow only PRO-subjects, while nominalized clauses with
overt Agr allow only overt (or pro-) subjects. In other words, for
argument clauses, PRO- and overt subjects are in complementary
distribution. However, this complementary distribution obviously
breaks down for adjunct clauses. In other words, PRO- and overt
subjects are in free variation for Agr-less adjunct clauses.
In order to explain this observation, we have to address the basic
issue of how overt subjects receive Case in these adjunct clauses that
lack overt Agr.
My proposal is that such Case is due to a mechanism of default
Case that applies as a last resort. In other words, when no Case
licenser is available for an overt subject, this last resort mechanism
applies. Note that in these Agr-less adjunct clauses, there indeed is
no Case licenser for an overt subject: there is neither overt Agr as
such a licenser, nor Tense (even if we had not ruled out Tense
previously in such capacity), as these clauses have no independent
tense and take on the tense interpretation of the root clause.
I shall come back to the issue of default Case; at present, it is
sufficient to say that such Agr-less clauses with overt subjects that do
not bear genuine, licensed subject Case establish the necessity of
default Case in Turkish. Given that the grammar of the language
needs this mechanism, default Case can also be appealed to when
accounting for the overt subjects in categorially hybrid clauses that
are adjuncts and which do have overt Agr. The common denominator
of both types of clauses, i.e. hybrid clauses with and without overt
Agr, is that they are adjuncts, i.e. that they lack primary θ-roles.
Subject case in Turkish nominalized clauses 159

Thus, I propose that there is a correlation between that lack and the
necessity of default subject Case in both types of clauses.
I now turn to an overall account of licensed subject Case in

3. An account of subject Case

3.1. A sketch of a proposal, and one previous proposal

In the previous discussion, I have proposed that: 1. there is one single

licensed subject Case (with the possibility of default Case when there
is no licenser available; necessary constraints on default Case shall
be discussed later); I further claimed that this single subject Case
may have different morphological realizations (in Turkish, those
would be Nominative and Genitive); 2. that the licenser of the
genuine, licensed subject Case is the overt Agr morphology, and that
3. the Agr morphology has to be licensed itself in order to function as
a licit subject Case licenser.
The correlations between overt Agr of both categorial types and
the corresponding subject Case pointed out so far show us that the
first two claims above are convincing. What about the third claim?
In the Minimalist Program, elements that don't have semantic
features ("interpretable features"), or else which have them
redundantly, are imperfections (cf. Chomsky 2002). Agr is one such
element, since whatever semantic features it has are already
contained in the co-indexed subject DP. The importance of such an
element—and of the syntactic position AGR in which the Agr
morpheme is situated—has therefore to be motivated. This is what I
have tried to establish in this paper so far— i.e. I have ruled out
alternative sources for licensed subject Case, thus motivating the
existence of Agr as a subject Case licenser.
In the same spirit as in the Minimalist Program, I would argue that
because an entity like AGR (and its corresponding morphology, i.e.
Agr) with its uninterpretable features is undesirable in general, its
existence must be licensed all the more, when its categorial features
160 Jaklin Kornfilt

are in conflict with those of its syntactic environment. Thus, I have

proposed a primary source of licensing via matching categorial
features; where there is categorial mismatch, I have proposed
licensing via referential indexing (in Rizzi's sense). This, in turn, is
made possible either via thematic indexing, as we have seen so far,
or via predicational indexing, as we shall see later in the paper.
The idea that an Agr element, categorially mismatched locally
within its own clause, needs to be licensed from the outside, has a
predecessor, albeit not an identical one. Raposo (1987) proposes for
inflected infinitives in European Portuguese (EP) the following
generalization: nominal Agr needs Case itself. I offer here one
citation to this effect:
"...Agreement (Agr) in [the inflected infinitive's] Infi node must
be Case-marked, if it is to assign nominative Case to the subject of
its clause." (Raposo 1987: 85.)
Raposo mentions the nominal nature of Agr in these instances as a
motivating factor for his claim, just presented, that such Agr has to
be "Case-marked" from outside in order to have its own "Case-
assignment" potential be activated. But why should a nominal Agr be
any less able to license subject Case than a verbal Agrl Raposo
analyzes inflected infinitives in EP as CPs; in such a syntactic
domain, a nominal Agr would indeed cause conflict of categorial
features and thus would need licensing itself, if we look at the EP
facts from the perspective of the approach I have suggested.
The licensing of the nominal Agr, Raposo proposes, takes place
through the Case on the whole infinitival clause, and via subsequent
percolation of that Case down to the nominal Agr that heads the
inflected infinitival, and which, in Raposo's account, has risen to the
It is important to note that the Case "assigned" to the overt subject
in these EP inflected infinitives is not the same as the Case on the
"switched on" Agr in most instances: while the overt subject bears
subject Case (in EP, this is the Nominative, despite the syntactically
nominal nature of Agr15), the external Case on the CP (and thus on its
Agr-head) is, in most instances, the Accusative (abstracting away
Subject case in Turkish nominalized clauses 161

from inflected infinitives as sentential subjects); this is illustrated by

the following example:

(32) Eu lamento[os deputados ter -em trabalhado pouco]

Ί regret the deputies to-have-Agr worked little.'
(Raposo 1987: 87, his [7]a.)

In other words, the subject Case licensing mechanism proposed

for EP by Raposo is not Case transmission, as suggested for some
English gerunds in Reuland (1983). Instead, we have here an Agr,
licensed by any Case, in turn licensing subject Case.
Given that no Case transmission takes place here, the question
arises as to why the licensing of Agr as a Case licenser should be due
to an external Case, and whether other ways of such external
licensing of Agr might be conceivable.
In the following sections, I shall show that at least for Turkish,
external Case as the factor that activates an Agr element (i.e. an Agr
not activated internally within a clause or a phrase) does not work.
Therefore, a different external factor is needed, and I have proposed
indexation by a primary θ-role (and shall add indexation by
predication) as such a factor.
Note, however, that Raposo's and my proposals are motivated by
a similar way of thinking: where Agr is not "legitimate" within its
own local domain, it must get legitimized by virtue of heading a
syntactic domain which, in turn, is a necessary, even obligatory,
constituent in its own domain. In Turkish, such a constituent would
be an argument of a verb or of a noun, as opposed to an adjunct of a
We shall see that adjunct clauses of nouns, i.e. modifier clauses in
relative clause (RC) constructions (as opposed to adjuncts of verbs)
are also treated as "necessary" constituents in Turkish. This might be
a parametric dimension along which languages might differ and
whose investigation I leave for future research.
Returning to a formalization of the general ideas just discussed,
the "necessary" constituents are, I propose, indexed: either by a
(primary) θ-role for argument clauses, or by a predicational index,
162 Jaklin Kornfilt

for the modifying clauses in RCs. This index percolates down to the
Agr-head, thus activating it as a subject Case licenser.16
I now turn to showing that in Turkish, it isn't the Case on the Agr
that activates it as a subject Case licenser.

3.2 Problems with licensing of Turkish Agr via Case

Three types of problems have to be acknowledged that a Case-based

account of the kind proposed by Raposo for EP would have to face
with respect to subject Case in Turkish:

3.2.1. Instances of licensed nominal Agr without structural Case

Turkish categorially hybrid indicative clauses appear in noun-

complement constructions:

(33) [Ali-nini [proi aile -sin ] -i

Ali-GEN family -3.SG -ACC
terket -tig -i ] söylenti -si
abandon-FN-3. SG rumor -CMPM
'the rumor that Ali abandoned his family'

A noun does not check the Case of its complement (or at least not
structural Case); in this respect, it is different from either verbs or
adpositions. This is also shown here by the fact that there is no overt
Case on the nominalized complement clause of the noun. But, just
like a verb, a noun assigns primary θ-roles. This explains the
Genitive on the subject of the categorially hybrid clausal N-
complements. At the same time, such examples motivate the
approach to licensed subject Case proposed here, i.e. one as based on
indexation via primary θ-roles, and against an approach based on
licensing of Agr via Case on that element.
Subject case in Turkish nominalized clauses 163

3.2.2. Instances of licensed verbal Agr without structural Case

Further motivation for Agr as the subject Case licenser, as well as for
its own licensing via θ-role-based indexation, is offered by the
existence of tensed complement clauses of nouns:

(34) [[beni [proi aile -m ] -i

I [NOM] family -l.SG -ACC
terket -ti -m ] söylenti -si ]
abandon -PAST-l.SG rumor -CMPM
'the rumor that I abandoned my family'

Given the fully verbal, tensed nature of the noun-complement

clause, it is clear that neither the clause nor its (verbal) Agr-head
receive any Case from the external noun. Nevertheless, the overt
subject of that clause has subject Case, namely Nominative. This
shows that the local, verbal Agr element is licensed as a subject Case
licenser; given the verbal nature of that local Agr, the appropriate
subject Case is the Nominative, and this is what we find here.
These facts about noun-complement clauses are problematic for
any account like Raposo's, where Agr as a licenser is activated by
external Case on that Agr.

3.2.3. Instances of unlicensed nominal Agr with adpositional Case

Another type of problem is posed by the existence of categorially

hybrid clauses as complements of postpositions. Such examples were
discussed previously, in section 2.3.1., where we saw that in such
constructions, the subject does not receive the expected subject Case,
despite the existence of an Agr element. (See examples [18] through
[22] in that section.) Note that in such examples, postpositions do
assign Case to their complement clauses, and thus to the Agr element
heading such clauses. This is especially obvious in examples (19)
through (22), where the clause bears overt Case, irrespective of the
presence or absence of an overt postposition. Thus, (clause-)external
164 Jaklin Kornfilt

Case on Agr and the clause that it heads does not license that Agr as
a subject Case marker, making default Case necessary.
On the other hand, the common denominator between all the
examples in 2.3.1. (i.e. [18] through [22]) is the fact that the
categorially hybrid clauses do not bear any primary θ-role. I
therefore submit that the account I have proposed here is
corroborated by these examples, while a Case-based account is
refuted by them.

4. An additional subject Case licensing mechanism: Indexation

by predication in headed operator-variable constructions

4.1. Overtly headed relative clauses

Agr has an additional option of receiving a "referential" index (cf.

Rizzi 1994), if θ-role assignment is not an option. This is through
predication. Relevant examples are relative clauses (overtly headed
as well as Free Relatives) and comparatives (which, formally, are
similar to Free Relatives in Turkish).
In order to show that a (somewhat) separate treatment of RCs is
necessary, I would first like to demonstrate that the structure of RCs
is different from that of noun-complement constructions just
First, I would like to show that the modifying clause in RCs is an
adjunct of the head, not a complement of the head:

(35) [Ali-nin gegen gün dükkän-dan al -dig-ι ] bu

Ali -GEN past day shop -ABL buy -FN-3.SG this
$ahane vazo
magnificent vase
'this magnificent vase which Ali bought at the store the other
Subject case in Turkish nominalized clauses 165

Note that the modifying clause precedes the demonstrative, and

compare this property to a corresponding clause as a complement of
a head noun:

(36) §u [[Ali -nini [proi aile -sin ] -i

that Ali-GEN family-3.SG-ACC
terket -tig -i ] söylenti -si ]
abandon-FN-3.SG rumor -CMPM
'that rumor that Ali abandoned his family'

Here, the clause follows the demonstrative. Note that the order
found in RCs is not possible in noun-complement constructions:

(37) *[Ali -nini [proi aile -sin ] -i

Ali -GEN family -3.SG -ACC
terket -tig -i ] §u söylenti -si
abandon -FN-3.SG that rumor -CMPM
Intended reading: 'that rumor that Ali abandoned his family'

This shows that the modifying clause in RCs is merged higher

than the corresponding clause in noun-complement constructions.
Thus, these examples motivate the analysis of the noun-complement
clauses in a way appropriate to their label, i.e. the clause is the
complement of the noun and is therefore merged closer to that noun
as compared to the modifying clause in RCs which is an adjunct
rather than a complement of the head noun and is therefore merged
higher in the structure.
But if the modifying clause in RCs is not a complement of the
head noun, then it also does not receive a primary θ-role from that
noun, and thus the Agr heading the clause does not receive an
appropriate index. Yet, the subject of the clause does show up in the
appropriate subject Case, i.e. it is in the Genitive. This means that the
Agr is indexed, after all, but not via a marking based on a primary
θ-role. Instead, I suggest that the indexation needed is achieved via a
predication relation between the modifying clause and the head noun.
Early mention of such a predication rule in RCs and in Left
166 Jaklin Kornfilt

Dislocation constructions can be found in Chomsky (1977), where

such a rule is taken to express a general notion of "aboutness" (cf.
Chomsky 1977: 81).
The same predication relation, I suggest, holds in the following
two examples:

(38) a. the [ sad ] man

b.the man [who[ is sad]]

In turn, this predication relation is, in a sense, similar to that we

find within a clause between a predicate, i.e. a VP (or an I') and the

(39) a. The man [coughed]

b. The man [is sad ]

There is some recent work in which at least some instantiations of

this kind of predication is related to θ-role assignment, as in
Williams (1994). There, predication between a predicate and a
subject as well as predication between a nominal head and its
adjectival modifier is taken to involve θ-role assignment.
It is interesting to see an approach where the relationship between
a modifier (i.e. an adjunct) of a nominal head and that head (as in
[38]a.) is viewed as one involving θ-role assignment. If this view is
on the right track, then we have exactly the type of natural class of
constructions that we have been aiming for in this paper:
complements of verbs and of nouns, along with (for Williams, only
adjectival) adjuncts of nouns, to the exclusion of adjuncts of verbs.
The task that remains is to also introduce relative clause modifiers
into that natural class. Williams (1994) excludes them, for reasons
that it would take us too far afield to discuss here. I would like to
suggest that modifier clauses of nominal heads should have the same
relationship to those heads as modifier adjectives', in other words, if
there is a predication relationship between the modifier and the head
in (38)a. i.e. a relationship based on θ-role assignment, in
accordance with the suggestions in Williams 1994, then the same
Subject case in Turkish nominalized clauses 167

relationship should also hold between the modifying clause and the
head in (38)b. As a matter of fact, the traditional labelling of such
modifier clauses as adjective clauses goes along with this idea.
The similarity between these two kinds of noun modification is
even more obvious in other languages, Turkish being one of them:

( 4 0 ) a . [iizgiin] adam
sad man
'the sad man'
b . [[ei iizgiin ol -an ] Op(] adami
sad be -REL.PART man
'the man who is sad'

If my suggestion is correct, there is predication between modifier

and nominal head in both of (40) a. and b., and in both of them
(following Williams 1994 at least in spirit), this predication is based
on θ-role assignment. I further suggest that this sort of θ-role
assignment is "primary" in a sense similar to Rizzi's primary
θ-roles, because it restricts the reference of the DP-head; therefore,
the indexation that encodes the predication relationship is a
"referential" one. This is where we find the ultimate similarity
between those instances where an argument clause receives a
"referential" index, namely via a primary θ-role, and those instances
where an adjunct clause likewise receives a referential index via
predication, the latter also based on (primary) θ-role assignment
between head and modifier.
The examples of RCs just discussed have subject targets and thus
don't possess subjects in need of Case. But note that once we insure
such indexation on the modifier clause in an RC, the account
developed here for subject Case applies to those RCs that do have a
subject, i.e. in RCs whose target is a non-subject, as the one in (35),
repeated here for convenience as (41):
168 Jaklin Kornfilt

(41) [Ali-nin gegen gün dükkän-dan al -dig -i ]j

Ali-GEN past day shop -ABL buy-FN 17 -3.SG
[bu §ahane vazojj
this magnificent vase
'this magnificent vase which Ali bought at the store the other

I have encoded the predication relation at issue via indexation.

The index on the modifying clause would, as outlined earlier for
argument clauses, percolate down to the (nominal) Agr element
which is not licensed internal to the clause, as it conflicts with the
"verbal" features of the predicate in this categorially hybrid clause.
Agr, now licensed via its predication index, licenses the appropriate
subject Case on the subject; due to the nominal nature of this Agr,
subject Case is realized as the Genitive.
A similar account would hold for comparative constructions
whose head would be an overt or covert quantificational phrase.
Before turning to those and to Free Relatives, I would like to point
out that indexation between modifying clause and nominal head is a
necessary but not sufficient condition for genuine, licensed subject
Case to be realized; the presence of overt Agr is crucial. To see this, I
show examples of nominalized irrealis relative clauses without an
Agr marker; relative clauses with nominal future tense morphology
are such an example.
It is important to note that in such instances, no overt subject is
possible any longer; the only possible subject is PRO:

(42) a .[[[PRO san -a e{ ver -ecek] Opi] bir

you(sg.)-DAT give -FUTN a
vazo] bul -a -mi -yor -urn
vase find -NegABIL -NEG -PROG -1 .SG
Ί am unable to find a vase to give you.'
Subject case in Turkish nominalized clauses 169

b. *[[[ben Iben -im san -a e(

I [NOM] / 1 -GEN you(sg.)-DAT
ver -ecek] Opi] bir vazo]
give -FUTN a vase
bul -a -mi -yor -urn
find-NegABIL -NEG-PROG -l.SG
Intended reading: Ί am unable to find a vase for me to give

Whether the overt subject is in the Nominative or in the Genitive,

the last example remains ungrammatical in the absence of Agr.
The ungrammatical example can be rescued by having the future
tense in the categorially hybrid modifying clause be followed by
nominal Agr:

(43) [[[Ben-im san -a ei ver -eceg -im ] Opi]

I -GEN you(sg.)-DAT give -FUTN-1.SG
bir vazo] -yu dim bul -du -m
a vase -ACC yesterday find-PAST-l.SG
Ί found yesterday a vase I'll give you.'

These sets of examples offer a clear illustration of the significant

role of Agr in the licensing of overt subjects, and we find the same
correlation in relative clauses between Agr and overt subjects with li-
censed subject Case as we found in argument hybrid clauses; when
Agr is present, we find overt subjects with licensed subject Case;
when Agr is absent, only PRO is possible as a subject.

4.2. Indicative adjunct clauses with Genitive subjects: Free relatives

and comparatives

Note that nominalized indicative clausal postpositional complements

do exist whose subjects are carriers of Genitive, i.e. of the genuine
nominal subject Case:
170 Jaklin Kornfilt

(44) [[Ay§e -nin duy -dug -un ] -a göre ] Sare

Ay§e-GEN hear -FN -3.SG -DAT according to Sare
deprem -de vefat et -mi$
earthquake-LOC death do-REP.PAST
'According to what Ay§e heard, Sare died in the earthquake.'

(45) Piyanist bu parga -yi [[Pollini -nin

pianist this piece -ACC Pollini -GEN
göster -dig-i ] gibi] ςαΐ -di
show -FN -3.SG like play -PAST
'The pianist played this piece like Pollini showed
(i.e. in the way in which P. showed it to be played).'

(46) Ali [[baba-sin -in iste -dig-i ] kadar ]

AH father-3.SG-GEN want-FN-3.SG as-much-as
baqari -li ol -a -ma -mi$
success -with become -NegABIL -NEG-REP.PAST
'(It is said that) Ali wasn't able to become as successful as his
father wanted.'

Note that the subjects of the nominalized factive clausal comple-

ments of the postpositions is in the Genitive rather than in the default
Nominative Case, in contrast with what we saw in (apparently)
similar adjunct clauses in section 2.3.1. What is the difference?
All of the postpositions in the last three examples have either
comparative semantics, or else the construction can be interpreted as
a (free) RC. More specifically, I suggest that (44) and (45) are Free
Relatives (FRs), while (46) is a comparative construction.
Among a number of competing analyses for comparatives, one
widely accepted analysis has been to view comparative constructions
as involving an operator, in a sense similar to relative clauses (cf.
Bresnan 1973 and 1975; for an account of Turkish comparatives
along these lines, cf. Knecht 1976). The translations of these last
three examples are suggestive: (44), a Free Relative: 'According to
what (i.e. on the basis of the things that) Ay§e heard, ...'; (45),
another Free Relative: 'The pianist played this piece like the way
Subject case in Turkish nominalized clauses 171

which Pollini showed'; (46), a comparative construction: 'Ali wasn't

successful as much as, i.e. to the extent that his father wished'. Note
also that similar facts hold for other comparatives: -DAnfazla 'more
than', -DAn az 'less than' etc.; due to space limitations I shall not
illustrate those.
It is particularly interesting to compare (44) with (19), since the
same postposition (göre) is used in both, yet the subject of the
postposition's clausal object is bare in (19), but has Genitive Case in
(44). The reason is, I claim, that we have a Free Relative in
(44)—and therefore crucially—an operator and a (phonologically
empty) head18, leading to the presence of the Genitive. In (19), there
is no reason to assume the presence of an operator, nor that of a
nominal head. The most appropriate translation of the postposition
göre in (19) is 'given that', rather than 'according to X', as in (44).
All we have in (19) is the clausal complement within a Postpositional
Phrase, with the whole PP being an adjunct of the matrix
verb—hence the lack of Genitive, despite nominal, rich Agr; instead,
the default, bare Case is found on the subject.
Two common denominators of relative clauses and comparatives
are the presence of an operator and the presence of a head of the
whole construction. I suggest that it is a predication relationship
between the head and the clause which activates the Agr heading the
clause (by co-indexing the external nominal or quantifier phrase
head of the construction with the clause, and with inheritance of the
index on the clause by the nominal Agr- head of the clause).
In other words, what we see here is exactly the same kind of
"activation" of nominal Agr in categorially hybrid indicative clauses
which we observed in overtly headed relative clauses and which we
accounted for by proposing referential indexation on the nominal
Agr, based on predication between the head of the RC and the
modifying clause. This predication is made possible by the moved
operator (relative or comparative), which turns the clause into an
"open" clause. The only difference is that here, the respective heads
of the constructions are phonologically empty. The Genitive marking
on the subjects is therefore just as expected.19
172 Jaklin Kornfilt

From a typological perspective, it is interesting to note that a

somewhat similar proposal has been made in the literature for
Japanese. Watanabe (1996) proposes to analyze some of the
Japanese "Ga-No conversion" contexts as relative clauses and
comparative constructions, and he claims that the operator movement
in such constructions makes Genitive marking on the subject
While this is a different account from mine, as it does not appeal
to predication and to referential indexing, it is nonetheless very
suggestive that, when observed from a particular perspective,
Japanese and Turkish should have rather similar phenomena. It is
possible that indexation via predication, with concomitant subject
Case licensing, is a particular parameter, while indexation via Θ-
marking is another one. Japanese might have a positive value for the
first, European Portuguese for the second, and Turkish for both.
Clearly, this is a fascinating area for further research.

5. Preliminary conclusions

My account has been based on the following proposals: 1. Genuine

subject Case is licensed by Agr which is itself licensed. 2. The
primary type of Agr-licensing is via matching categorial feature in
the local domain. In absence of this licensing, we may have: 3.
Licensing by (referential) indexation, instantiated by either primary
θ-role marking or by predication (the latter also based on θ-role
assignment, if Wilhams 1994 is correct). 4. The type of subject Case
is determined by the category of the licensing Agr (Nominative for
verbal Agr and Genitive for nominal Agr). 5. Where Agr is absent, or
where it is not licensed as a subject Case licenser, no subject Case is
licensed, even where conditions for indexation are met otherwise.
Overt subjects receive default Case instead.
Subject case in Turkish nominalized clauses 173

6. What is the Case of the non-Genitive subjects in adjunct


The question now arises about the nature of the default Case I have
proposed. Is this simply a morpho-phonologically unrealized general
Case, or is it the Nominative? Given that the Nominative in Turkish
has no overt realization, this is a legitimate query. I shall conclude
that the default Case is indeed the Nominative.
There is some independent evidence for my conclusion. One type
of such evidence is provided by Left-Dislocation constructions, and
especially in non-Case matched contexts.
In Left-Dislocation constructions, the dislocated element can
either exhibit the same Case as the corresponding constituent in the
clause, or the default Case, i.e. it can be bare (in other words, I am
claiming that the "bare" dislocated constituent is in the Nominative);
but it cannot be in the Accusative, if the corresponding constituent in
the clause is not Accusative:

(47) Ali (-yi) mi? Ben kendisin-i

Ali(-ACC) Y/N I himself-ACC
iig ay -dir gör -me -di -m
three month-since see -NEG-PAST-l.SG
'(About) Ali, I haven't seem him for (the last) three months.'

(48) Ali(*-yi) mi? Ben kendisin-den gok

Ali (-ACC) Y/N I himself -ABL very
kork -ar -im
fear -AOR-l.SG
'(About) Ali, I am very much afraid of him.'

This is in contrast to English, where the default Case appears to

be Accusative:

(49) a. Who's there?—It's me.

b. Who's there?—*It's /.20
174 Jaklin Kornfilt

Chomsky (2001) offers a typology of Case which is, in part, simi-

lar to his older proposals (cf. Chomsky 1981) in including structural
and inherent Case. But an important addition is the notion of default
Case, i.e. Case licensed not by any particular licenser, but rather
assigned independently of such licensing relationships. Examples
like those in (49) are offered as illustrations of this notion.21 My
proposal to analyze the dislocated subjects in (47) and (48) (where
they are phonologically "bare") as well as the "bare" subjects of
adjuncts without operators and nominal heads (i.e. overt subjects
which I have claimed bear default Case) as being in the Nominative
Case accords well with this recent approach. The basic default Case
assignment/checking mechanism would be the same in English and
Turkish; the only difference would be in the actual morphological
realization of the default Case: Accusative in English, Nominative in
Turkish. The fact that Nominative is morphologically realized as a
zero morpheme makes it, I suggest, even more plausible as a default
Another source of missing overt Case marking on nominals is
lack of specificity: a non-specific nominal does not bear the expected
structural Case morpheme (i.e. Accusative or Genitive). This phe-
nomenon has been discussed in the literature; discussion and further
sources can be found in Εης (1991), Dede (1986), Tura (1986), Ergu-
vanli-Taylan (1984), as well as Kornfilt (1984) and (1995a). Such
non-specific, morphologically "bare" nominals must usually be im-
mediately pre-verbal; they cannot scramble away from that position,
although Turkish is otherwise rather word-order free. I will illustrate
with Genitive subjects and their "bare", non-specific counterparts:

(50) [Araba -nin yol -dan geg -tig -in J -i

car -GEN road -ABL pass -FN-3.SG -ACC
gör-dü -m.
see -PAST -l.SG
Ί saw that the car went by on the road.'

In this example, we find the Genitive subject in its canonical, sen-

tence-initial position. A corresponding non-specific subject, "bare"
Subject case in Turkish nominalized clauses 175

morphologically, cannot show up in this canonical subject position;

instead, it must be in immediate pre-verbal position:

(51) a.[yol -dan bir araba geg -tig-in ] -i

road -ABL a car pass -FN-3.SG -ACC
gör -dii -m.
see -PAST -l.SG
Ί saw that a car (non-specific, non-referential) went by on
the road.' (The subject may be focussed, but it does not have
to be.)
b. *[bir araba yol -dan geg -tig -in ] -i
a car road -ABL pass -FN-3.SG -ACC
gör -dii -m.
see -PAST-l.SG
Intended reading: Ί saw that a car (non-specific, non-
referential) went by on the road.'

Similar facts hold in existentials—this is expected, as the

"semantic" subjects of existentials are obviously non-specific:

(51) c .[Garaj-da be§ araba ol -dug-un ] -u

garage-LOC five car be -FN -3.SG -ACC
bil -iyor -um
know -PRSPROG -l.SG
Ί know that there are five cars in the garden.'
d. *[Be§ araba garaj -da ol -dug-un J -u
five car garage -LOC be -FN -3.SG - ACC
bil -iyor -urn
know -PRSPROG -l.SG
Intended reading: Ί know that there are five cars in the

In all of these examples, we would expect the subjects to show up

in the Genitive, but they are "bare", i.e. Case-less, instead.
176 Jaklin Kornfilt

Specific subjects which have, in my account, undergone default

Case marking (and are, morpho-phonologically speaking, "bare" as
well), behave differently with respect to word order, i.e. they can
show up in canonical subject position:

(52) [bu gocuk ev -de kal -dig-ι ] ίςΐη Ali

this child house-LOC stay-FN-3.SG because Ali
ϊξ -e gid -ebil -di.
work-DAT go -ABEL -PAST
'Ah could go to work because this child stayed at home.'

In this respect, they pattern with Genitive subjects (cf. example

[50]) as well as with Nominative subjects:

(53) bu gocuk ev -de kal -di

this child house-LOC stay-PAST
This child stayed at home.'

Conclusion: there are different types of morphologically bare

subjects. While bare non-specific subjects lack (structural) Case (in
all of the instances just observed, this would be the Genitive) and are,
probably due to that reason, fixed in their pre-verbal position,
subjects that are bare but carry default Nominative Case behave like
regular, genuinely Nominative subjects in verbal clauses as well as
their Genitive counterparts in nominal clauses. The issue of non-
specific, bare DPs is orthogonal to the issue of licensed versus
default Case. Any nominal phrase with non-lexical Case, irrespective
of the licensed or default nature of such Case, is (usually) realized in
a Case-less, bare form, as in (51) a. or c.

7. When default Case may or may not be licensed

It is important to set up constraints on the application of default

Case; otherwise, default Case would apply in all instances when
licensed Case does not, and we would lose our explanations of, for
Subject case in Turkish nominalized clauses 177

example, the complementary distribution between overt subjects and

PRO-subjects in infinitival clauses. I therefore start with those.

7.1. Infinitivais

I propose that the infinitival marker -mAK is actually -mA + K, with

-mA in M(ood) position, and -K in AGR (=Fin) position. This
proposal captures the morpho-phonological similarity between the
inflected nominal subjunctives (i.e. -mA + Agr) and the non-inflected
infinitivals (-mA+K). It further captures the fact that, semantically,
the Mood of infinitives and that of the inflected non-factives is
similar, i.e. they are both subjunctive. The analysis further accounts
for the fact that the verbs selecting for infinitival clauses are similar
to those that select subjunctive clauses.
Note that -K never expresses φ-features; I analyze it,
consequently, as [-Agr]. This licenses "null Case" (cf. Lasnik and
Chomsky 1991), i.e. the Case for licensing PRO, but not for
licensing overt DPs. Now, default Case is blocked: default Case can't
apply when any Case (of the type that needs to be licensed by a
designated licenser) is actually licensed; this holds for "genuine
subject Case" (e.g. for the Genitive in nominal clauses) as well as for
the "null Case" of Lasnik and Chomsky, i.e. the Case special for
PRO-subjects. In this way, the complementary distribution between
overt subjects and PRO in infinitivals and the correlation of this
distribution with the presence versus absence of an appropriate Agr
element can be maintained.22'23
I now turn to other syntactic domains that lack Agr—domains
where it is desirable to block application of default Case.

7.2. Other Age-less domains

7.2.1. In ECM-contexts

When discussing ECM-constructions, I suggested that there is no "C-

system" dominating the embedded clause. Hence there also is no
178 Jaklin Kornfilt

"Fin"-head (corresponding to AGR), not even negatively specified in

the way I just proposed for infinitives (cf. George and Kornfilt 1981,
where, for Turkish, Agr determines Finiteness). Consequently,
Accusative Case is licensed by the higher (ECM-) verb. We now
have a licensed Case; therefore, default Case is not needed and hence
not allowed, given that it is a strictly last resort mechanism.

7.2.2. In adjunct domains headed by other forms than the

infinitive, but no Agr

We discussed adjunct clauses that lack Agr. The CP-status of such

clauses is unclear: extraction judgements are murky. I would like to
make the following assumption about such clauses: "Fin" exists in
their clausal architecture, but it is underspecified, as it lacks Agr
features. No other Case licensing is possible (the way it is in
infinitivals as well as in ECM-constructions). Consequently, default
Case applies, as a last resort.

7.3. Adjunct domains with Agr (and no predicational indexation)

No primary θ-role is assigned to this syntactic domain. If such a

domain is not involved in predication (i.e. if such an adjunct clause is
not a relative clause or if it is not a comparative), and if there also is
no clause-internal categorial congruence between Agr and the rest of
the predicate, Agr will not be licensed as a subject Case licenser. No
other licensing is possible. As a consequence, default Case applies as
a last resort.
Subject case in Turkish nominalized clauses 179

8. Other treatments of the argument/adjunct asymmetry based

on subject Case

There have been very few discussions of the phenomena presented in

this paper. I am aware of two: one earlier and one later than my
work. I start with the earlier one.

8.1. Kennelly (1990)

She proposes that it is Tense that assigns Nominative to the subject

of a clause, rather than Agr. For her, indicative nominal markers (i.e.
the markers which I have glossed as F[active] N[ominal] and which I
analyzed as giving rise to categorially hybrid predicates) equal
Tense. She claims that external θ-role assignment blocks
Nominative assignment; motivated by the need to receive Case, the
subject raises to Spec, DP (i.e. the DP which Kennelly assumes is the
proper category of nominal clauses); in that higher position, the
subject receives Genitive, simply as a consequence of being the
specifier of a DP.
My main objection to this account is conceptual: why should the-
matic marking block regular subject Case? To my knowledge, this
phenomenon, i.e. θ-marking as blocking Case checking, has never
been observed or referred to for other languages and/or for other
phenomena. If anything, θ-marking either presupposes or makes
possible Case assignment/checking; this has been usually assumed
for inherent Case, for example.24 As for structural Case, I am not
aware of an instance where thematic marking blocks such Case. We
saw earlier that for Raposo, it is in "Case-marked", i.e. largely Θ-
marked, domains that a nominal clausal head is able to assign a
subject Case to its subject. Thus, the main proposal is not motivated
and goes against general assumptions and cross-linguistic facts.
In addition, the data Kennelly uses to bolster her analysis are
problematic and, to my knowledge, are not shared by many native
speakers. Due to space considerations, I shall not discuss her
proposals further, given the gravity of the conceptual problems with
180 Jaklin Kornfilt

it. It is probably due to these reasons, and I suspect especially due to

the problems with the data, that this proposal has seldom, if ever,
been referred to or used as the basis of later work. In any case, given
the problems with it that I just sketched, I think we can safely reject
this approach.

8.2. Aygen (2002)

This work postdates the presentation of my paper at the Leipzig

workshop related to this volume, as well as Kornfilt (2001) and
(2002). Given certain similarities between it and my work, as well as
between it and Lees (1965), I shall devote some space to a discussion
of its main aspects.
Aygen (2002) devotes attention to factive "nominalizations" only,
i.e. the type which I have claimed to be categorially hybrid.
She notes the main contrast between argument and adjunct factive
clauses in Turkish—the contrast that has been the focus of this paper,
too: the subjects of the former are Genitive, but those of the latter are
Nominative. Note that in terms of adjunct clauses, Aygen considers
only those that are marked with the factive marker, and which do
have Agr. (In other words, she does not discuss the Agr-less predi-
cates in adjunct clauses; in terms of argument clauses as well as ad-
junct clauses, she considers only the factive nominal clauses, i.e.
there is no account of the non-factive, fully nominal clauses.) She
reasons that, given that Agr is present in both argument and adjunct
clauses, yet the Case on the subjects is different, Agr couldn't possi-
bly be involved in Case assignment to subjects.
This first premise is faulty; the conclusion does not necessarily
follow. As we have seen in the present paper, an approach is possible
where Agr (or any representation of φ-features) checks for subject
Case only in particular configurations. This is the reasoning I have
followed here. Raposo's approach to European Portuguese takes this
same direction of reasoning, as well, as did Reuland's (1983) earlier
one to English.
Subject case in Turkish nominalized clauses 181

However, although Ay gen's premise is faulty, it is nevertheless

true that the initial observation of the asymmetry just mentioned
might indeed be open to an approach that does not involve Agr at all
in subject Case assignment—especially if all the arguments in favor
of Agr in this respect (and which were discussed earlier in the present
paper) are not considered, due to a very narrow focus on just the
asymmetry at hand. I shall therefore briefly sketch Ay gen's account.
Aygen (2002) proposes that in Turkish, Case on the subject is
licensed by neither the [+Tense] features of Τ nor the φ-features of
AGR, but by a Case feature on C. This feature is responsible for
Genitive subjects: in relative clauses and in noun-complement
clauses, some agreement relationship between either the external
nominal head of such DPs or perhaps the D-head of the (high) DP
and the CP licenses a Genitive feature on the CP, and, indirectly, on
the subject of the CP.
Indicative argument clauses are claimed by Aygen to be noun-
complement clauses with an abstract nominal external head; adjunct
indicative clauses are claimed not to be externally headed.
To illustrate the latter claim, I use the following two examples;
(54) would be claimed to have a structure similar to (55):

(54) Ben[Hasan -in gel -dig-in ] -i

I Hasan -GEN come -FN-3.SG -ACC
bil -iyor -urn
know -PRSPROG -l.SG
Ί know that Hasan came.'

(55) Ben [[Hasan-in gel -dig-i ]

I Hasan-GEN come -FN-3.SG
gergeg-in ] -i bil -iyor -urn
fact -CMPM-ACC know -PRSPROG -l.SG
Ί know the fact that Hasan came.'

The internal argument of the verb in (55) is a complex DP, i.e. a

noun-complement clause with its head noun. According to Aygen,
the Genitive is licensed by indirect agreement (C-N agreement) with
182 Jaklin Kornfilt

a nominal head. This indirect agreement is mediated via the CP, i.e.
the complement clause, which "agrees" with the head. She claims
that in (54), the object clause is actually an instance of a complex
DP, as well, with a phonologically unrealized nominal head. Hence,
the Genitive subject of that clause is similarly accounted for.
The idea that nominalized argument clauses are actually
complements of phonologically unrealized nominal heads was, to my
knowledge, first proposed in Lees (1965) for Turkish. There, both
factive and non-factive nominalizations were analyzed in this way,
although slightly different phrase structures were attributed to each
construction. This interesting proposal has some drawbacks,
however, and some of the criticism I shall raise against Aygen's
approach to Genitive subjects in factives will concern Lees's original
proposal, as well.
First, concerning Aygen's proposals, i.e. the licensing of a
Genitive subject via an "agreement" relation between the C-head of a
complement (or, more generally, argument) CP and the nominal head
(or perhaps even the D-head) of a dominating DP: this proposal
would make sense for languages where "concord", i.e. agreement
between the head noun (or D) and the complement in terms of certain
features (e.g. φ-features, or Case) obtains. Turkish, however, has no
concord. The features of a nominal head do not spread within the
DP—neither to complements, nor to modifiers.
The latter point is relevant with respect to relative clauses, where
the modifier clause is not a complement, but an adjunct of the nomi-
nal head. For those, too, it is not plausible to assume an "agreement"
relationship between the nominal head and the modifier clause, given
that no agreement between modifier and nominal head is ever found
elsewhere, either. Also, as the examples in the current study show,
there is no overt agreement between the nominal head and either a
complement clause (as in noun-complement constructions) or
between a nominal head and a modifier clause in a relative clause
Furthermore, while the proposal appears to unify relative clauses
and noun-complement clauses by positing this "agreement" relation
between a nominal head (overt or covert) and a (complement or
Subject case in Turkish nominalized clauses 183

adjunct) clause, this unification is only apparent. Complements of

nouns are structurally in a different position from adjuncts of nouns,
as we saw in some detail earlier in the present paper; we saw that the
modifier clause in a relative clause is attached higher with respect to
the head than is a complement clause in a noun-complement con-
struction. Thus, even if some sort of (abstract) "agreement" did
obtain, we would be looking at different "agreement" relations for
these two constructions.
One way out of this would be to posit raising of the subject to
Spec, DP of the higher DP, i.e. of the DP associated with the external
head N. While there is no evidence in Turkish for such raising in the
syntax, LF-raising is a possibility. For Japanese, this has been
proposed by Miyagawa (1993), with considerable explanatory
success. I shall offer some evidence to counter any analysis imputing
Genitive subject Case licensing directly to an external nominal for
Turkish, thus suggesting a property of grammar which is open to
parametric variation.
In addition to these conceptual and empirical problems, there are
problems concerning the evidence offered for parts of the analysis,
namely that nominalized complement clauses are externally headed
DPs. I shall consider only a few of those that are particularly clear-
cut and shall start by considering one type of such evidence, namely
scrambling to post-verbal positions.

8.2.1. Problems for post-verbal scrambling

It is well-known that Turkish allows backgrounded constituents to

scramble to post-verbal positions. Such scrambling from out of an
embedded clause to the very end of the root clause is not too bad for
most speakers:
184 Jaklin Kornfilt

(56) ?[Hasan -in ti nihayet kag -tig -in ] -i

Hasan -GEN finally escape -FN-3.SG -ACC
duy -du -m kari -sin -dani
hear -PAST -l.SG wife -3.SG -ABL
Ί heard that Hasan finally ran away from his wife.'

Given that the scrambled constituent has not attached to the

argument clause, but rather elsewhere (i.e. to the root clause), there is
no reason (other than subjacency, with mild effects) to predict
ungrammatically, especially given that I am not assuming that the
argument clause has a nominal head.
This contrasts with overtly headed factive clauses. In such con-
structions, scrambling to root-final position deteriorates:
This contrasts with overtly headed factive clauses:

(57) ??/*[[Hasan -in ti nihayet kag -tig -i ]

Hasan -GEN finally escape -FN -3.SG
söylenti -sin ] -i duy -du -m
rumor -CMPM-ACC hear -PAST-l.SG
kari -sin -dani
wife -3.SG -ABL
Ί heard the rumor that Hasan finally ran away from his wife.'

Aygen would wrongly predict the same status of acceptability for

both, given that the host of scrambling in (56) is externally headed in
her approach, just as its counterpart is headed in (57). The clear
contrast between these two examples sheds further doubt on her
approach. (Note also that this last contrast is problematic for Lees
1965, as well.)
Similarly revealing are examples where the whole argument
clause has been scrambled to verb-final position in the root clause:
Subject case in Turkish nominalized clauses 185

(58) tj Duy -du -m [[Hasan-in nihayet

hear -PAST -1 .SG Hasan -GEN finally
kari -sin -dan kag -tig -in] -i ]j
wife -3.SG-ABL escape -FN-3.SG-ACC
Ί heard that Hasan finally ran away from his wife.'

In such examples, post-verbal scrambling of a constituent of the

subordinate clause is fine:

(59) tj Duy -du -m [[Hasan-in ti nihayet

hear-PAST-l.SG Hasan-GEN finally
kag -tig -in ] -i ]j kari -sin -dani
escape-FN-3.SG-ACC wife-3.SG-ABL
Ί heard that Hasan finally ran away from his wife.'

This is just as expected in any approach in which this type of sub-

ordinate clause is not headed. The fact that it is an argument clause is
not a problem, either, given that the clause is not in argument posi-
tion, similarly to extraction facts in English, where a syntactic island
like a sentential subject does not exhibit island effects when it is ex-
traposed, as shown in Ross (1967). (For discussion of Turkish facts
of postverbal scrambling out of subordinate clauses in different posi-
tions, cf. Kornfilt 1998.)
However, the full grammaticality of (59) for many speakers is a
serious problem for Aygen's approach, which would predict it to be
ungrammatical, for the reasons already discussed: the scrambled
subordinate clause is, in that approach, headed. This problem is
compounded by the ill-formedness of corresponding examples where
there is an overt head:
186 Jaklin Kornfilt

(60) ??/*tj Duy -du -m [[Hasan-in t{ nihayet

hear -PAST-l.SG Hasan-GEN finally
kag -tig -i ] söylenti -sin -i ]j
escape-FN-3.SG rumor -CMPM-ACC
kar -sin -dani
wife -3.SG -ABL
Intended reading: Ί heard the rumor that Hasan finally ran
away from his wife.'

For Aygen, there should be no difference between the perfectly

fine (59) and the ill-formed (60); again, this is a problem for Lees
(1965), as well.

8.2.2. Problems for distribution

Nominalized clauses can differ in their distribution according to

whether they have an external nominal head or not. Only two syste-
matic differences (among a number of similar subcategorizational
differences) are considered here: factive versus non-factive nomi-
nalized clauses as objects versus subjects of psychological predi-

1. Psychological predicates allow both the factive and the non-

factive nominalization types as complements, without any difference
in semantics.

(61) a.[Ali-nin ev -den kag -ma -sin ] -a

Ali-GEN home -ABL flee -NFN -3.SG -DAT
iiziil -dii -m
Ί was saddened at Ali's running away from home.'
Subject case in Turkish nominalized clauses 187

b.[Ali-nin ev -den kag -tig -in ] -a

Ali-GEN home -ABL flee -FN-3.SG -DAT
iiziil -dü -m
Same translation as in the previous example.

However, when an external noun shows up, only the factive

gerund is well-formed for factive semantics:

( 6 2 ) &.??/*[[Ali-nin ev -den kag -ma (-si)]

Ali-GEN home-ABL flee-NFN -3.SG
söylenti -sin ] -e üzül -dü -m
rumor -CMPM -DAT sadden-PAST-l.SG
Intended reading: Ί was saddened at the rumor of Ali's
running away from home.'
b.[Ali-nin ev -den kag-tig-i ]
Ali-GEN home -ABL flee -FN-3.SG
söylenti -sin -e üzül -dü -m
rumor -CMPM-DAT sadden-PAST-l.SG
Ί was saddened at the rumor of Ali's running away from

2. With the same type of predicates, only the non-factive

gerundive is well-formed as subject, despite indicative semantics;
however, when such a sentential subject is externally headed, only
the factive gerund is well-formed for indicative semantics:

( 6 3 ) a..[Ali-nin ev -den kag-ma -si ] ben-i

Ali-GEN home-ABL flee-NFN-3.SG I -ACC
üz -dü
sadden -PAST
'Ali's running away from home saddened me.'
188 Jaklin Kornfilt

b. *[Ali-nin ev -den kag -tig -i ] ben -i

Ali-GEN home -ABL flee -FN-3.SG I -ACC
iiz -dii
sadden -PAST
Intended reading: 'Ali's running away from home saddened

(64) a.??/*[[Ali-nin ev -den kag-ma (-si)]

Ali-GEN home-ABL flee-NFN -3.SG
söylenti -si ] ben -i iiz -dii
rumor -CMPM I -ACC sadden-PAST
Intended reading: 'The rumor of Ali's running away from
home saddened me.'
b. [[Ali -nin ev -den kag -tig -i ]
Ali -GEN home -ABL flee -FN-3.SG
söylenti -si ] ben -i iiz -dii
rumor -CMPM I -ACC sadden-PAST
'The rumor of Ali's running away from home saddened me.'

Once again, these examples are problematic for both Aygen

(2002) and Lees (1965), as they clearly show that the distribution of
nominalized clauses with external nominal heads is different from
the distribution of their counterparts without external nominal heads.

8.2.3. Existing correlations not captured

There are clear correlations that hold between subject Case types and
local Agr types. These hold at an observational level and are inde-
pendent from any analytical bias: 1. Nominative subjects in argument
clauses (as well as root clauses) are possible only when verbal Agr is
present locally; 2. Genitive subjects in both argument and adjunct
clauses are possible only when nominal Agr is present locally. The
present study has offered illustrations of both generalizations. In both
instances, the presence or absence of external nouns is completely ir-
relevant. Therefore, any approach to the first asymmetry (i.e. the
Subject case in Turkish nominalized clauses 189

argument-adjunct asymmetry in categorially hybrid nominalized

factive clauses) that rejects Agr as an important factor in determining
subject Case in Turkish is problematic, as is any approach that
attributes primary importance to an external nominal head as deter-
mining Genitive Case.

8.2.4. Correlations posited that do not exist

Aygen (2002) claims that Genitive subjects are possible only when
the clause has an external nominal head, or else, where there is no
such overt head, where a nominal head is potentially possible,
because the position is there structurally.
In consequence, she claims that when a Genitive subject is not
possible, an external nominal head is not possible, either. Likewise,
when there is an external nominal head, the subject of the clause
should always be Genitive.
Both correlations are counterexemplified by a variety of construc-
tions: Indicative nominalized existentials with non-Genitive


I shall start with one type of construction which is discussed in

Aygen (2002) as furnishing support for her analysis (and as
supposedly being problematic for one aspect of my approach—an
aspect of the analysis which I had presented in Leipzig as well as in
Kornfilt 2001). This is the existential construction in nominaliza-
In Turkish existentials, the subject is to the immediate left of the
verb; in this respect, it is similar to other non-specific, non-referential
subjects which we saw earlier in this paper. Another similarity is that
such a subject cannot be marked with the Genitive; instead, it has to
be morphologically bare with respect to Case. While there is a
special existential verb in fully verbal clauses, the predicate in
190 Jaklin Kornfilt

nominalized existentials is the "light verb" ol 'be', which takes the

regular nominalization inflections we have discussed in this study.
An indicative nominalized existential follows as an illustration:

(65) Ali [bahge -de bir ejderha ol -dug -un ] -u

Ali garden-LOC a dragon be-FN -3.SG-ACC
duy -du
hear -PAST
'Ali heard that there is a dragon in the garden.'

Aygen (2002) claims that, because the subject of existential

subjects in indicative nominalizations cannot be in the Genitive, such
clauses cannot show up with an external nominal head. She claims
that examples of the following sort are ungrammatical:

(66) Ali [bahge -de bir ejderha ol -dug -u ]

Ali garden-LOC a dragon be-FN -3.SG
söylenti-sin -i duy -du
rumor -CMPM-ACC hear -PAST
'Ali heard the rumor that there is a dragon in the garden.'

Aygen (2002) further claims that the supposed ungrammaticality

of (66) is a problem for my approach. This is because she imputes to
that approach the prediction that the subject in (66) should be
Genitive: due to the θ-marking which the gerund clause would
receive from the external noun, the agreement would receive a
referential index and thus, so she claims, would license Genitive
subject Case, contrary to fact.
First of all, my approach does not make this prediction. The
nominal Agr in (66) does not AGREE with the subject, but rather
with an expletive pro. In other words, this is a "fake", default Agr.
Secondly, both in the current study and in its precursor presentations,
I made clear that non-specific subjects (which include existential
subjects) cannot be morphologically marked for structural Case (and
thus for Genitive), even if such a structural Case should be licensed
Subject case in Turkish nominalized clauses 191

A third, and even more important, point is that Ay gen's idiolect

appears to be exceptional in rejecting examples like (66). Such
examples are perfect for me, and they are perfect for all the native
speakers whom I have consulted. (All the individuals in the list of
speakers in the footnote of acknowledgements were consulted about
these examples and similar ones, and they all found them to be
flawless.) In the absence of Agr, only PRO is licensed as a subject; an

overt DP is not, irrespective of its Case

Turning to problematic examples not mentioned in Aygen (2002), we

saw earlier that irrealis relative clauses cannot have any overt
subject, Genitive or otherwise; the only possible subject is PRO. The
next two examples illustrate this point once again, for the readers'

(67) a .[[PRO san-a ver -ecek] bir vazo]

you-DAT give -FUTN a vase
bul -du -m
Ί found a vase to give you.'
b.*[[Ben /Ben -im san-a ver -ecek]
I (NOM) / 1 -GEN you -DAT give -FUTN
bir vazo] bul -du -m
a vase find-PAST-l.SG
Intended reading: Ί found a vase for me to give you.'

In the presence of Agr, the same FUTN marker expresses Future/

Indicative; a Genitive subject is licensed—due to the presence of the
Agr element:
192 Jaklin Kornfilt

(68) [[Ben-im san -a ver -eceg -im] vazo] -yu

I -GEN you-DAT give -FUTN-l.SG vase -ACC
bul -du -m
find-PAST -l.SG
Ί found the vase I am going to give you.'

Remember that Ay gen claims that the nominal head of the relative
clause licenses Genitive subjects, and that the Agr element of a
clause is irrelevant for the subject being licensed via Case. She
would therefore predict that the version of (67)b. with a Genitive
subject should be grammatical.
Thus, both the fact that irrealis relative clauses cannot have overt
subjects (as shown by the examples in [67]) and the contrast with
future tense relative clauses that do have Genitive subjects (as shown
by [68]) are problematic for Aygen (2002) but just as expected under
the approach developed in the present study, as comparison of these
constructions shows the importance of Agr for subject Case licensing
as well as the irrelevance of an external nominal head. Determining factor not Mood by itself (in general); e.g.

noun-complement clauses with the same Mood, but
different categorial features

Aygen (2002) mentions in passing that Mood plays a role in

determining the subject Case, too, but does not make explicit in what
way this would interact with the subject Case determination via the
external nominal head. But suppose that we do pursue this idea. We
would have to say that irrealis mood somehow blocks the Genitive
Case licensed by the external nominal head, while indicative mood
does not do so.
Vague and unlikely as this proposal is, it would draw the correct
distinction between indicative/future and irrealis relative clauses.
However, it is clear that Mood does not determine subject Case
marking in Turkish in general. This can be seen clearly by
contrasting (69) with (70):
Subject case in Turkish nominalized clauses 193

(69) [Ben-irn aile -m -i terket -tig -im ]

I -GEN family-l.SG-ACC abandon-FN-l.SG
söylenti -si
rumor -CMPM
'the rumor that I abandoned my family'

(70) [Ben aile -m -i terket -ti -m ]

I(NOM) family-l.SG-ACC abandon-PAST-l.SG
söylenti -si
rumor -CMPM
'the rumor that I abandoned my family' (i.e. same as in the
previous example)

(69) illustrates a nominalized indicative noun-complement clause,

while (70) exemplifies a fully verbal, but also indicative noun-com-
plement clause. Both have obviously an external nominal head. But
only (69) has a Genitive subject, while (70) has a Nominative sub-
ject. There is no difference in Mood. However, there is a difference
in the local marker for Agr. it is nominal in (69), thus licensing Geni-
tive subject Case, and verbal in (70), thus licensing Nominative sub-
ject Case. Thus, these facts and their contrast are just what my ap-
proach predicts.
However, this contrast, and especially the Nominative (rather than
Genitive) subject in (70) are problematic for Aygen (2002). Her
analysis predicts Genitive subjects for both constructions, due to the
external nominal head in both. Since there is no Mood difference
between the two, her analysis cannot take recourse to a Mood-based
determination of subject Case, either.
We thus see that what matters for subject Case and its overt reali-
zation are the categorial features of Agr.25 Another relevant construction: The nominalized indicative

clause as a postpositional complement

Yet another type of construction that is discussed in Aygen (2002) in

the context of the supposed correlation Genitive subject—external
194 Jaklin Kornfilt

nominal head is the nominalized indicative clause as a postpositional

complement. We saw that such clauses have Nominative rather than
Genitive subjects (unless they are in a predicational relationship).
Aygen gives examples showing that no nominal head is possible
when the subject of such a clause is in the Nominative; to be well-
formed, the subject of such a clause must be in the Genitive:

(71) *[[Hasan anla -dig-ι ] §ey -e göre ]

Hasan understand -FN-3.SG thing-DAT according to
herkes anla -yacak
everybody understand -FUT
Intended reading: 'According to the thing that Hasan
understood, everybody will understand.'
(Aygen 2002: example [15 a.]; glosses and translation slightly

(72) [[Hasan-in anla -dig-ι ] §ey -e

Hasan-GEN understand-FN-3.SG thing-DAT
göre ] herkes anla -yacak
according to everybody understand -FUT
'According to the thing that Hasan understood, everybody will
(Aygen 2002: example [16]; glosses and translation slightly

(73) [[Hasan haber-i anla -dig-in ] -a

Hasan news -ACC understand -FN-3.SG -DAT
göre ] herkes anla -yacak
according to everybody understand -FUT
'Given that Hasan understood the news, everybody will.'
(Aygen 2002: example [17]; glosses and translation slightly

This triplet does not establish that the Genitive subject is due to
the external noun in (72). The account I have proposed in this study
explains these facts too, and without all the problems that go along
Subject case in Turkish nominalized clauses 195

with Aygen's analysis. In (71) and (72), we have a relative clause.

Predication between §ey 'thing' and the modifier clause would
referentially index the clause and thus turn the nominal Agr into a
licenser of Genitive. This is why (72) is well-formed, but (71),
without the Genitive, is ill-formed. In (73), the clause does not
receive a referential index from anywhere. Since this is an indicative,
and thus categorially hybrid, clause, its nominal Agr is not licensed
to be a Genitive licenser clause-internally, either. Therefore, no
genuine subject Case is licensed, and default Nominative applies
A further problem with Aygen's claim that her analysis explains
the existence of such triplets is the following: why should verbs
subcategorize for noun-complement constructions (whether overtly
or covertly headed by an external noun), while postpositions
subcategorize for nominalized clauses without external nominal
heads? Why should any kind of (factive) nominalized clause reject
an external nominal head when the clause is an adjunct—whether as
an adjunct itself, or as part of an adjunct, when it is subcategorized
by a postposition? Aygen (2002) does not attempt to explain or
motivate this difference, while basing her (as shown here,
problematic) analysis on the assumption of this difference.

8.2.5. Problems with scope facts

If the Genitive subject in indicative nominalized clauses somehow

AGREEs with an external head noun, one would expect for suitable
subjects to be able to take scope over that head noun, at least option-
Miyagawa (1993) shows that in Japanese, this is indeed an option
for Genitive subjects, but not for Nominative subjects of nominal
complement clauses.
Turkish examples differ in this respect:
196 Jaklin Kornfilt

(74) Ali veya Veli -nin parti-ye gel -eceg -i

Ali or Veli -GEN party-DAT come -FUTFN -3.SG
ihtimal -i yüz -de elli -den yüksek.
probability-CMPM hundred-LOC fifty -ABL high
'The probability that Ali or Veli will come to the party is
greater than fifty per cent.'

(75) Ali veya Veli -nin parti-ye gel -dig-i

AH or Veli -GEN party-DAT come -FN-3.SG
ihtimal -i yüz -de elli -den yüksek.
probability-CMPM hundred-LOC fifty -ABL high
'The probability that Ali or Veli came to the party is greater
than fifty per cent.'

For those speakers who accept these constructions as well-formed,

the head noun has scope over the Genitive subject; the reverse scope
is not possible. In other words, for those speakers who accept them,
these examples can mean: 'The probability that either Ali or Veli
came/will come to the party is greater than fifty per cent.' However,
they cannot mean: 'Either the probability that Ali came to the party is
greater than fifty per cent, or the probability that Veli came to the
party is greater than fifty per cent.'
This strongly suggests that the Genitive subject is not in a non-
local (or indirect) AGREE relation with the external head ihtimal
'probability', nor has it risen to the specifier position of that external
head or of its associated D. In Japanese, such raising might be possi-
ble or necessary, as the complement clauses of nouns (or the modi-
fier clauses of relative clauses) are not nominalized themselves. They
neither have a nominal Agr, nor do they have ΤΑΜ morphology with
nominal features. Therefore, an analysis imputing Genitive licensing
capabilities to the external head is plausible for Japanese, with the
concomitant scope effects.
In Turkish, however, the clause has local morphology with nomi-
nal categorial features, and thus Genitive subject Case can be licens-
ed by closer nominal elements than an external noun. As a conse-
quence, attempts to impute Genitive licensing to such an external
Subject case in Turkish nominalized clauses 197

nominal head fail when confronted with syntactic challenges, as we

have seen.

9. Conclusions and some speculations

I have claimed in this paper that the subject-predicate agreement

morphology of subordinate clauses in Turkish may license overt
subjects in the following ways: 1. If the Agr element is itself licensed
categorially within its clause, it also licenses genuine subject Case
(and thus an overt subject). An Agr element is licensed if it occurs in
a morphological sequence with categorially fully specified ΤΑΜ
markers and is of the same categorial type as those and as the higher
functional projections. 2. If the Agr is not licensed in this way (which
typically is the case when a nominal Agr shows up after a categor-
ially underspecified ΤΑΜ marker, i.e. the factive future or non-future
markers and under a CP=ForceP, i.e. a verbal functional projection),
an overt subject can be licensed in one of two ways: A. If the Agr
bears a referential index (in the sense of Rizzi 1994), it is licensed
itself and can license genuine subject Case. Such referential index
can be inherited by Agr in one of two ways: 1. Via a primary θ-role
that the clause, headed by Agr, receives; 2. Via predication between
such a clause and a co-indexed head. B. Where neither categorial
homogeneity nor indexation enables an Agr to assign subject Case
(and where there also is no negative Agr element that licenses null
Case), default Case is assigned instead. In Turkish, default Case is
In contrast to two studies on the same issue (but limited to factive
subordination only), one preceding the current study temporally and
the other following its earlier incarnations, this larger perspective on
subordination and subjects in subordinate domains shows that it is
unnecessary to make otherwise unmotivated assumptions, e.g. the
assumption that subject Case is blocked under θ-marking, or that
nominalized argument clauses have all an abstract nominal head and
are noun-complement clauses. Furthermore, subject Case is not
198 Jaklin Kornfilt

licensed by Tense/Aspect or Mood per se—the latter only together

with Agr, which itself determines, in Turkish, finiteness.
This study, then, shows us that in a morphologically rich language
like Turkish, not only does Agr express φ-features, but also catego-
rial distinctions which are reflected in the subject Case it licenses.
Another result is that θ-marking, whose importance for extraction
has long been established theoretically and cross-linguistically, has
been claimed here to also play a central role in determining subject
Case. It would be interesting to find additional cross-linguistic
evidence; the facts of European Portuguese are very suggestive in
this regard.
It was important, when referring to the Case of subjects, to distin-
guish between genuine and default subject Case. Genuine subject
Case (whether Nominative or Genitive) depends on the category of
the Agr that licenses it. Subjects with genuine subject Case are in
complementary distribution with PRO or Accusative subjects (de-
pending on the presence versus absence of Agr), while subjects with
default Case are in free variation with PRO in the absence of Agr
(other than in infinitives and irrealis relative clauses, where there is
negative Agr), and such subjects are independent from the presence
of Agr with respect to Case. It is important, when categorizing Case,
not to draw the lines simply according to the overt appearance of
Case (i.e. Nominative versus Genitive), but according to the way in
which Case is licensed—something which may, but not always does,
coincide with the overt realization of Case.
There is one potential problem with the account I proposed:
For the factive nominalized, categorially hybrid clauses, I
proposed an analysis based on the claim that they are CPs—in
contrast with the fully nominal, non-hybrid, non-factive clauses
which I claimed lack a CP-level.
I further claimed that in factive nominalized clauses, the nominal
Agr raises to the C-head of the clause and thus inherits the index that
the entire CP receives; it is this index that enables Agr to license
genuine subject Case.
The potential problem mentioned is typological: Sabel (1996)
points out that in languages with raising of a verb or of a verbal
Subject case in Turkish nominalized clauses 199

element to C in non-finite clauses, no WH-movement is possible, as

the raised element obliterates the [+WH] features in C, making it
impossible for the Spec, CP to host WH-elements (cf. Sabel 1996:
297). Yet, I showed that in Turkish, it is the factive nominalized
clause where raising Agr takes place and where RCs and embedded
WH-questions are possible, i.e. in whose Spec,CP the WH-operator
would be hosted.
I suggest that the problem is only apparent. In all of the languages
listed by Sabel in this context—all of them Indo-European
languages—there is obviously complementary distribution between a
C-position occupied by a complementizer and one into which a verb
or other predicational morphology has been raised; thus, it is
plausible to suggest, as Sabel does, that raising erases [+WH] fea-
tures otherwise expressed by a complementizer.
However, in a language like Turkish, there is never an overt
complementizer introducing nominalized clauses. The nominalized
predicate, raised into C with the Agr, includes the Mood marker—i.e.
the factive marker, and it is this marker that encodes the hybrid
category and the associated CP-status of the clause, thus acting as a
clause-typing marker of the sort exhibited by Indo-European
languages in the shape of a complementizer. Therefore, Agr (+Mood)
raising to C in Turkish does not obliterate [+WH] features in C; such
raising might in fact be motivated by the need to actually activate
those features.26
In the body of the paper, I have suggested that the approach to
licensed subject Case, based on referential (in the sense of Rizzi
1994) indexing (or lack thereof) on a categorially hybrid clause may
be extended to other languages, under appropriate parametrization.
For example, while in Turkish, the indexing must indeed be
referential in the appropriate sense, i.e. must express a primary θ-role
or predication (following Williams 1994, for whom predication is
linked to θ-marking), in European Portuguese, the indexing can
express a secondary θ-role, as well, but (probably) cannot express
predication. In both languages, the subject Case licenser is Agr,
appropriately indexed, and raised to C. In other languages, too, it has
been proposed that an inflectional element like Agr (or, depending on
200 Jaklin Kornfilt

the language, Tense) may license subject Case if it is raised to C.

This was mentioned, in passing, for Modern Greek. Bayer (1983-84)
has proposed such an account for Bavarian; similar proposals exist
for other languages, too.
This means that the raised Agr can reach into the lower functional
projection, i.e. into the AgrP (or TenseP, depending on the
language). Thus, we have a configuration and mechanism somewhat
similar to ECM-constructions, where it is a designated verb that
reaches into a "deficient" clause—in Turkish, the deficiency being
expressed by the lacking Agr. Here, in the indexed clauses with their
Agr (or Tense) raised to C, the clause is similarly deficient, due to
the raising of its head to C.27
In Japanese, there is no inflectional element such as Agr or Tense.
I won't speculate here on the nature of the regular subject Case, i.e.
the Nominative, in Japanese. However, perhaps due to the lack of an
appropriate inflectional subject Case licenser, Japanese has the
possibility of an external nominal head of clauses licensing a nomi-
nal subject Case, i.e. the Genitive.
Turkish, on the other hand, by virtue of having nominal as well as
verbal inflectional heads, can have subject Case licensed locally, i.e.
within the clause. This is particularly instructive when the Agr is
nominal, as (under appropriate indexation) a nominal subject Case
can be licensed locally, without needing recourse to an external
nominal head, in contrast with Japanese, where such an external head
is needed for Genitive licensing.
We may say, then, that the nominal Agr in Turkish acts, in a
sense, as the Japanese external nominal head. Thus, in turn, we may
reach a new understanding of why Agr must, or even is able to, bear
referential indexing in Turkish: it is a (small) noun.28
Subject case in Turkish nominalized clauses 201


* This paper corresponds to the presentation at the Workshop on

morphologically rich languages, held within the DGfS conference that took
place in Leipzig, in February/March 2001. It is also related to Kornfilt
(2002), but is very different from that paper: the coverage of the present
paper is larger, as it looks at non-factive as well as factive nominalizations.
Also, the approach taken here is different: while here, indexation of Agr is
limited to referential indexation, the just mentioned work does not do so
and thus runs into problems. Furthermore, (referential) indexation of Agr in
operator-variable constructions is performed here via predication between
the clause and a nominal (phrasal) head, while indexation of Agr was done
via Spec-Head agreement within CP in the aforementioned work. The
approach based on categorial features is new here, as is discussion and
criticism of some other work that addresses one of the asymmetries studied
here. I would like to thank the workshop organizers, Uwe Junghanns and
Luka Szucsich, for inviting me to the workshop and for their patience with
the drawn-out progress of this paper. I owe a particular debt of gratitude to
both organizers (and editors of this volume) for their close reading of a
preliminary draft of this paper, and for their comments; two anonymous
referees also provided insightful and useful comments, and I thank them. I
also wish to thank Noam Chomsky for discussion of this material, and
especially of the issue of default Case in adjunct domains. I am grateful to
the DGfS for providing travel funds that made my participation at the
workshop possible. I owe a special debt of gratitude to a number of Turkish
native speakers for adding to the pool of native speakers' judgements:
£igdem Balim, Akgiil Baylav, Cemal Be§karde§, Demir Dinf, Cem
Mansur, Alp Otman, Bengisu Rona, Mehmet Yanilmaz, Ay§e Yazgan. I
further thank Mark Brown for his help with the formatting of this paper. In
addition, I would like to thank the audiences of a number of related
presentations: in 1999 at the MPI EVA, at the University of Jena, the
University of Venice, the University of Paris at Jussieu, and at Bogazi^i
University; in 2001, in addition to the DGfS workshop in Leipzig, at the
Altaic workshop at ΜΓΓ, and at the CUNY Graduate Center; in 2002, at the
In the Mood conference at Frankfurt University, at ZAS Berlin, at Cornell
University, and at the MPI EVA in Leipzig. Among those audiences, I
would like to thank, in particular, the following individuals: Artemis
Alexiadou, Josef Bayer, John Bowers, Guglielmo Cinque, Peter Cole, Chris
Collins, Marcel den Dikken, Günther Grewendorf, Jacqueline Gueron,
Gabriella Hermon, Sumru Özsoy, Shigeru Miyagawa, Jean-Yves Pollock,
Luigi Rizzi, Joachim Sabel, Eser Erguvanli Taylan, and John Whitman. All
shortcomings of the resulting study are to be blamed on the author.
202 Jaklin Kornfilt

1. In this paper, I shall use Agr for the overt morphological agreement marker,
and AGR for the related syntactic position.
2. I am using the term "extended projection" in the sense of Grimshaw (1991).
However, contrary to that work, I do allow categorially mixed extended
projections, especially for nominal predicates; for a discussion, see Borsley
and Kornfilt (2000).
3. See also Stowell (1981), Sabel (2002) for indexation of arguments via Θ-
marking. In the concluding section, I speculate that in Turkish, the Agr ele-
ment is a true nominal, similar to an external noun, and as such it is ex-
pected that it can and will inherit the θ-index of the domain that it heads.
4. "Verbal predicate" refers to predicates whose functional projections are
fully verbal (rather than mixed, i.e. including nominal layers, as is the case
in the nominalized subordinate clauses which we will be discussing
shortly). Thus, predicate adjectives and predicate nouns fall under the term
"verbal predicate", as they include either a copula, or else some sort of
auxiliary, e.g. ol 'be, become', et 'make", etc., whereby these "light" verbs
have their own verbal functional projections—unless they are nominalized,
in which case such adjectival or nominal predicates would, of course, fall
under the term "nominal predicate".
5. For agreement paradigms in Turkish, the reader is referred to reference
grammars of Turkish, e.g. Lewis (1967), Kornfilt (1997).
6. For discussion, see Kornfilt (1977) and Kornfilt (1996a). For an account of
Turkish ECM, proposing distinct derivations for clauses with versus
without overt Agr, see Moore (1998).
7. "Genuine" tense does seem to be the Nominative Case licenser in Modern
Greek. In ECM-constructions, only the present tense, i.e. the citation form,
can show up; alternation with other tenses is not possible, and neither is a
Nominative subject. Thus, non-alternating present tense is "fake"—and so
is Agr in these forms that mimic the infinitive. Iatridou (1993) suggests that
in Classical Greek, which did have an infinitive form, Agr was the subject
Case assigner, while Tense is the subject Case assigner in Modern Greek.
The correlation is suggestive for a possible parametrization, as Greek seems
to have undergone a change from a Turkish-type language (i.e. with an Agr-
less, special infinitive form, and with Agr as the subject Case licenser) to a
language without a dedicated infinitive form, where Agr, in those
constructions where it does show up, is "fake" for purposes of subject Case
8. One of the anonymous reviewers raises the objection that if this analysis of
ECM constructions is correct for Turkish, it would incorrectly predict
corresponding sentences to be ungrammatical in English. But, as a matter of
fact, most native speakers I have consulted indeed judged examples like the
translation of (4)a. to be ill-formed. Also, crucially, those speakers, while
Subject case in Turkish nominalized clauses 203

not allowing for WH-movement out of ECM-infinitivals, do allow for WH-

extraction out of Control infinitivals, i.e. from infinitivals that do have CP-
status. For such speakers, then, the analysis carries over to English straight-
forwardly. There were, to be sure, also some speakers who accepted such
examples in English. I shall not attempt offering an account of the idiolects
of those latter speakers. Note also that Iatridou (1993) suggests that Tense,
the subject Case licenser in Modern Greek, must be in C so as to fulfill its
Case-licensing function.
9. Kornfilt and Greenberg (2000) give examples for both lexically and
syntactically derived nominalizations as well as criteria to distinguish these
two types. One criterion is argument structure. I assume that lexical
nominalization can change the argument structure of a predicate, while
syntactic nominalization cannot. Consequently, externalizing an internal
argument can happen only via passive morphology in syntactic
nominalizations, while lexical nominalization can effect such
externalization directly, i.e. without passive morphology. Furthermore,
sentence-level adverbs are possible with such predicates, in contrast with
lexical nominalizations. Further discussion and examples can be found in
Kornfilt (2000a) and in Borsley and Kornfilt (2000). An early work where
criteria distinguishing lexical and syntactic nominalization for English is
Chomsky (1970). For a concise, but very insightful and influential early
generative treatment of nominalizations in Turkish, see Appendix C of Lees
10. The gloss NFN (nominal non-factive) here refers not to the semantics, but to
the morphology of this marker. For example, as will be shown later (in
section 8), psychological predicates can take both "factive" and "non-
factive" nominalized forms with factive/indicative semantics.
11. Following general Turkological practice, I use capital letters to represent
phonologically determined alternations. Vowel alternations are determined
by Vowel Harmony, and consonant alternations by a number of phenomena,
e.g. devoicing of obstruents, and conversion of /A7 into /g/.
12. There is another way to accommodate the extraction-based contrasts
sketched here: both types of nominal embedded clauses would be analyzed
as CPs (embedded within DPs), but only the indicatives would have a
[+WH] or, more generally, [+Operator] feature, while the nominal subjunc-
tives would be [-Operator]. I have proposed such analyses in the past; see,
e.g., Kornfilt (1993) and (1995b). Sabel (1996) makes a similar proposal for
a number of other languages. While this analysis covers the data, I think
that the proposal made in the present paper is less ad-hoc and more con-
vincing, as it is in line with a number of other facts, i.e. those presented ear-
lier in this section, showing categorial differences between the two nominal
clause types. However, the question arises as to how an operator can be
204 Jaklin Kornfilt

extracted out of a non-factive nominal clause, given lack of a CP-layer. Ex-

amples like (10) and (13) illustrate the possibility of such extraction out of
NFN-clauses when embedded under FN-clauses (which do have a CP-
layer). I suggest that the AgrP-layer (i.e. the Finiteness layer) provides an
escape hatch for the extracted operator; however, it is only the CP-layer
(i.e. the Force layer) that provides the target position, i.e. the "resting
place", for such an operator (as opposed to a mere escape hatch position).
13. Another apparent problem concerning adjunct clauses are posed by
examples like the two last ones, where the adjunct clause is marked with a
Case whose provenance is not immediately clear: if the superordinate
predicate does not assign a θ-role to the clause, it also does not check for its
Case. I propose that the Locative and Ablative here are licensed not via the
regular mechanisms, but semantically. Following Larson (1985), I have
claimed in Kornfilt (2000b) that not all Case is assigned by a θ-role assigner
or via specifier-head agreement, but rather that some configurations re-
quire, for semantic reasons, certain Cases. The Locative and Ablative are
required here to convey the semantics of something like "at a specific point
in time", "for a particular reason". These Cases have to be overt, because
otherwise the appropriate semantic interpretation could not be assigned to
the respective clauses.
14. For detailed discussion of the properties of PRO as contrasted with other
empty categories in Turkish, see Kornfilt (1996b). For present purposes, the
most important property of PRO is that of Control; pro, in contrast, is not
15. European Portuguese evidently doesn't make an overt, morphological
distinction between nominal versus verbal Agr. Consequently, the subject
Case licensed by Agr is of only one single shape, i.e. that of Nominative.
16. The question arises here whether European Portuguese might also benefit
from the index-based approach proposed here, thus enabling us to abandon
the Case-based approach proposed by Raposo, and bringing it closer to the
approach being adopted here for Turkish.
One anonymous reviewer discusses a paradigm in European Portuguese
which suggests that for this language, at least, it is Case rather than
indexing due to θ-marking which licenses nominal Agr as a subject Case
licenser. S/he shows that the preposition antes 'before' cannot assign Case
and therefore needs the presence of de 'of', as a Case assigner for nominal
prepositional complements. S/he further shows that antes may take fully
tensed clausal complements, introduced by the complementizer que,
without de. In contrast, when antes takes inflected infinitival clauses as its
complement, then de becomes obligatory. The reviewer interprets these
facts as meaning that in the latter instance, de is needed to license the
Subject case in Turkish nominalized clauses 205

nominal Agr of the inflected infinitive as a subject Case licenser via Case,
because indexation of Agr via θ-marking is ensured even without de.
Without knowing how uninflected infinitival clauses without overt subjects
behave in such contexts, I am not sure how to evaluate such facts. It is clear
that EP differs from Turkish in—among other properties—allowing non-
primary θ-roles (like those assigned by prepositions) to activate Agr in
general, in contrast to Turkish, where such activation by non-primary
θ-roles is not possible. Such a non-primary θ-role would be assigned to the
inflected infinitive that is the complement of antes. It is possible that de is
needed here not for the activation of Agr, but as a Case licenser for the
entire infinitival clause, made nominal by its nominal Agr-head, and that
activation of Agr is achieved by indexation as I have proposed—with the
parametric difference to Turkish that a non-primary θ-role can achieve this
indexation in EP. Further investigations along this line of inquiry must be
left to future research.
17. Relative clauses with non-subject targets bear the regular factive nominal-
ization morphology, as expected (cf. Kornfilt 1984, among others, versus
the established usage of terming this morphology object relativization). RCs
whose targets are subjects have a special nominalization marker, as have
RCs whose targets are contained within larger subjects, and RCs with
targets in impersonal constructions. These facts have been discussed, with
different proposals, by Underhill (1972), Hankamer and Knecht (1976), and
Kornfilt (2000a), among others. Coindexation between the moved operator
and the associated C is exploited in Kornfilt (2000a) to explain the
occurrence of this special nominalization marker in these particular
18. Kornfilt (1995b) proposes an analysis of Free Relative Clauses in Turkish
such that there is a nominal head position in these constructions to which
the relativization operator moves and to which, if the construction does
have an Agr element, this element adjoins. Thus, Free RCs in Turkish are
actually not headless. To be more exact than the formulation in the text, the
head isn't phonologically empty, either, when there is overt Agr, as the head
is occupied with some phonological material (i.e. that of the Agr element),
if the analysis in this older work is on the right track.
19. Aygen (2002) analyzes examples like (44) and (45) as Free Relatives, as
well. (She does not discuss comparatives.) She refers to Kornfilt (2001) as
having made such a proposal, but also refers to additional sources as having
offered the same analysis. Those references among her list which I could
locate (leaving out an MA-thesis by B. Öztürk, which I have been unable to
locate so far) do not offer such an analysis. The items in question are:
Hankamer (1972), Sezer (1991) and an earlier version of Sezer (2002), and
Kennelly (1996). Öztürk (2002), which appears to include relevant parts of
206 Jaklin Kornfilt

her thesis, does mention the relevant Genitive/Nominative contrasts in

nominalized indicative clauses as postpositional complements, but does not
offer any account of that contrast, other than claiming that it shows the
irrelevance of nominal Agr in determining subject Case—a claim which my
current study argues against.
It should be mentioned that Kornfilt (2001) was a presentation similar to the
one at the earlier Leipzig workshop which forms the basis of the current
20. This example is ill-formed in most stylistic registers but very formal ones.
21. For a detailed study of default Case (under a similar, but not identical, view
of this notion), see Schütze (2001). He offers a characterization which is
similar to the one I have proposed in the text: "The default case forms of a
language are those that are used to spell out nominal expressions (e.g., DP)
that are not associated with any case feature assigned or otherwise
determined by syntactic mechanisms." (Schütze 2001: 206.) Another recent
application of this notion is to be found in Szucsich (2002), who proposes
the Instrumental as a default Case for adjuncts in certain constructions in
Russian and other Slavic languages.
22. This analysis of the infinitival morphology predicts that not only argument
clauses, but also adjunct clauses that are headed by infinitival morphology
and lack relevant indexation should display only PRO subjects and never
overt subjects. This prediction is correct.
23. This analysis also extends to irrealis relative clauses, illustrated in (42). The
marker -(y)AcAK, glossed as FUTN, would now be analyzed as two mor-
phemes: -(y)AcA under Mood, and -Sunder Agr, with the latter bearing
negative value for agreement, just as with infinitives. This analysis would
also draw a distinction between -(y)AcAK as the genuine Future Tense
Nominal with indicative mood, and -(y)AcA(-K) as a (nominal) irrealis
24. In addition to more recent sources like Chomsky (2001), early sources of
this assumption are Chomsky (1986) and Pesetsky (1982).
25. The fact that nominal possessive phrases (which clearly lack any kind of
Mood) are not sensitive to the argument-adjunct distinction and always
exhibit Genitive specifiers, as we saw earlier, further shows that it is
categorial features and not Mood features that determine the choice of
licensed subject Case.
26. Hiraiwa (2001) assumes raising of a verbal complex to C, as well. He does
so for Japanese, in so called 'Ga-No conversion' instances, i.e. for instances
where a Nominative subject can optionally show up in the Genitive. He also
briefly discusses Genitive subjects in Turkish, for which he proposes
similar raising. Our proposals were obviously made independently of each
Subject case in Turkish nominalized clauses 207

other, as I have been made aware of that study only recently. I would like to
thank John Whitman for having drawn my attention to it.
Note, however, that our proposals, similar as they are, do differ from each
other. I am assuming the raising of a verbal complex to C/D, but not only in
those instances where Genitive is licensed (as Hiraiwa does), but
everywhere. Thus, Nominative as a genuine subject Case is licensed by a
raised verb, too. It is referential indexation on the raised verbal complex
that includes an overt Agr element which licenses the appropriate subject
Case, not raising per se.
Furthermore, Hiraiwa is wrong in claiming that in Turkish, an overt comple-
mentizer blocks verb raising and thus Genitive subject Case. While Turkish
does have (right-branching) subordination introduced by complementizers
and with Nominative subjects, we have seen in this paper that left-
branching subordination without complementizers and with fully verbal
predicates is possible, too. Lack of complementizer should make raising
possible, according to Hiraiwa, and thus license Genitive subjects. Instead,
the subject is Nominative. This shows that it is not raising of the verb to C
per se that licenses a particular subject Case, but rather the category of the
inflected verb complex, and, in particular, the category of the Agr.
I would also like to point out that raising of a predicate to C in Turkish was
proposed , as far as I know, for the first time in Kural (1993), with different
motivation than my proposal in this paper.
27. I am grateful to Marcel den Dikken for pointing out the similarity between
the subject Case licensing mechanism proposed here and that found in ECM
constructions, den Dikken raised this similarity as a problem. However,
given the widely assumed nature of a Case-licensing predicate as having
raised to C (and thus licensing subject Case in ECM-like configurations) in
widely differing languages such as in Bavarian, European Portuguese, and
Modern Greek, I view this aspect of my approach as unproblematic.
28. I am grateful to Chris Collins for a suggestion along similar lines, after a
presentation of this material at Cornell University. If Agr is a nominal head,
what does it mean to say that Agr can be verbal, in those instances where I
posited a verbal Agrl "Verbal" Agr would then simply mean a nominal
agreement element which AGREEs with the verbal predicate in category
features. Likewise, what I have called a "nominal" Agr is a nominal head
which AGREEs with the categorial features of its phonological host, i.e. a
nominal predicate or a nominal head of a domain. In categorially hybrid
clauses, the Agr bears [+N] categorial agreement features which are in con-
gruence with the higher K, but these features conflict with the verbal
features of the Tense and C-layers of the clausal architecture.
208 Jaklin Kornfilt


1. First person
2. Second person
3. Third person
ABIL Abilitative
ABL Ablative
ACC Accusative
ADV Adverbial
AGR Agreement as a syntactic node
Agr Agreement (as a morpheme);
agreement in general
AgrΡ Agreement Phrase
AOR Aorist
CAUS Causative
CMPM Compound marker
DAT Dative
DP Determiner phrase
DVN Deverbal noun
FN Factive nominalization
FUT Future
FUTN Future nominalization
GEN Genitive
Κ Case as a syntactic node
KP Case Phrase
(as a functional syntactic projection)
LOC Locative
Μ Mood
MP Mood Phrase
MST Modern Standard Turkish
Ν Noun; nominal as a distinctive feature
NP Noun Phrase
NegABIL Negative abilitative
NEGN Negative nominalizer
NF Nominal functional category
NFN Non-factive nominalization
NP Noun phrase
Op Operator
PASS Passive
PL Plural
PRES .PART Present participle
PROF Professional suffix
Subject case in Turkish nominalized clauses

PROG Progressive
PRSPROG Present progressive
REL.PART Relativization participle
REP.PAST Reported past
RES Resultative
SUBJNCT Subjunctive
SG Singular
ΤΑΜ Tense/Aspect/Mood
V Verb; verbal as a distinctive feature
VBL.CONJ Verbal conjunction
VF Verbal functional category
VP Verb phrase
210 Jaklin Kornfilt

Aygen, Giil§at
2002 Subject case in Turkic subordinate clauses: Kazakh, Turkish and
Tuvan. In: Masako Hirotani (ed.), Proceedings of NELS 32,
563-580. Amherst: University of Massachusetts, GLSA.
Bayer, Josef
1983-84 COMP in Bavarian syntaxTAe Linguistic Review 3: 209-274.
Borsley, Robert D. and Jaklin Kornfilt
2000 Mixed extended projections. In: Robert D. Borsley (ed.), The Na-
ture and Function of Syntactic Categories, 101-131. New York/
San Diego: Academic Press.
Bresnan, Joan
1973 The syntax of comparative clause constructions in English. Lin-
guistic Inquiry 4: 275-343.
Bresnan, Joan
1975 Comparative deletion and constraints on transformations. Lin-
guistic Analysis 1:25-74.
Chomsky, Noam
1970 Remarks on nominalization. In: Roderick A. Jacobs and Peter S.
Rosenbaum (eds.), Readings in English Transformational Gram-
mar, 184-221. Boston: Ginn.
Chomsky, Noam
1977 On Wh-Movement. In: Peter W. Culicover, Thomas Wasow and
Adrian Akmajian (eds.), Formal Syntax, 71-132. New York: Aca-
demic Press.
Chomsky, Noam
1981 Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, Noam
1986 Knowledge of Language: Its Nature, Origin, and Use. New York:
Chomsky, Noam
2001 Derivation by phase. In: Michael Kenstowicz (ed.), Ken Hale: A
Life in Language, 1-52. Cambridge, Mass.: ΜΓΓ Press.
Chomsky, Noam
2002 On Nature and Language. (Adriana Belletti and Luigi Rizzi, eds.)
Cambridge: Cambridge University Press.
Chomsky, Noam and Howard Lasnik
1991 Principles and parameters theory. In: Joachim Jacobs, Armin von
Stechow, Wolfgang Sternefeld and Theo Vennemann (eds.), Syn-
tax: an International Hand-book of Contemporary Research. Ber-
lin: de Gruyter.
Subject case in Turkish nominalized clauses 211

Dede, Miijerref
1986 Definiteness and referentiality in Turkish verbal sentences. In: Dan
I. Slobin and Karl Zimmer (eds.), Studies in Turkish Linguistics,
147-164. Amsterdam /Philadelphia: John Benjamins.
Εης, Mürvet
1991 The semantics of specificity. Linguistic Inquiry 22: 1-25.
Erguvanli-Taylan, Eser
1984 The Function of Word Order in Turkish grammar. [University of
California publications in Linguistics, vol. 106.] Berkeley and Los
Angeles: University of California Press.
George, Leland and Jaklin Kornfilt
1981 Finiteness and boundedness in Turkish. In: Frank Heny (ed.), Bind-
ing and Filtering, 104-127. London: Croom Helm, and Cam-
bridge, Mass.: ΜΓΓ Press.
Grimshaw, Jane
1990 Argument Structure. Cambridge, Mass.: ΜΓΤ Press.
Grimshaw, Jane
1991 Extended Projections. Ms., Brandeis University.
Hankamer, Jorge
1972 Analogical Rules in Syntax. Proceedings of CIS 8: 111-123.
Hankamer, Jorge and Laura Knecht
1976 The role of the subject/non-subject distinction in determining the
choice of relative clause participle in Turkish. Proceedings of
NELS VI, Montreal WPL 6: 123-135.
Hiraiwa, Ken
2002 On Nominative-Genitive conversion. In: A Few from Building
E39: ΜΓΓ Working Papers in Linguistics 39: 67-126.
Iatridou, Sabine
1993 On Nominative Case assignment and a few related things. In: Pa-
pers on Case & Agreement II: ΜΓΓ Working Papers in Linguistics
19: 175-196.
Kennelly, Sarah
1990 Theta control in Turkish. In: GLOW Newsletter 24: 72-73; ab-
Kennelly, Sarah
1996 Turkish subordination: [-Tense, -CP, +Case], In: Modern Studies
in Turkish: Proceedings of the 6th International Conference on
Turkish Linguistics. Eskigehir University, Turkey; 55-75. (Copy-
right 1996; conference held in 1992.)
Knecht, Laura
1976 Turkish comparatives. In: Judith Aissen and Jorge Hankamer
(eds.), Harvard Studies in Syntax and Semantics 2, 279-358.
212 Jaklin Kornfilt

Kornfilt, Jaklin
1977 A note on subject raising in Turkish. Linguistic Inquiry 8:4.
Kornfilt, Jaklin
1984 Case marking, agreement, and empty categories in Turkish. Ph.D.
dissertation, Harvard University.
Kornfilt, Jaklin
1993 Infinitival WH-constructions and complementation in Turkish. In:
Eurotyp Working Papers, Group 3: Subordination and Comple-
mentation, vol. 4, 66-83. Manchester: University of Manchester, as
a European Science Foundation Working Papers volume.
Kornfilt, Jaklin
1995 a Scrambling and incorporation in Turkish. In: Artemis Alexiadou,
Nanna Fuhrhop, Paul Law and Sylvia Löhken (eds.), FAS Papers
in Linguistics 1: 56-65. Berlin: FAS.
Kornfilt, Jaklin
1995 b Constraints on free relative clauses in Turkish. In: Artemis Alexia-
dou, Nana Fuhrhop, Paul Law and Sylvia Löhken (eds.), FAS Pa-
pers in Linguistics 4, 36-57. Berlin: FAS.
Kornfilt, Jaklin
1996 a NP-Movement and 'Restructuring'. In: Robert Freidin (ed.), Cur-
rent Issues in Comparative Grammar, 121-147. Dordrecht: Klu-
Kornfilt, Jaklin
1996 b Turkish and configurationality. In: Bengisu Rona (ed.), Current is-
sues in Turkish linguistics 1: 111-125. Ankara: Hitit. (Copyright:
London: SOAS, 1996; related conference held in 1990.)
Kornfilt, Jaklin
1997 Turkish. London: Routledge.
Kornfilt, Jaklin
1998 On rightward movement in Turkish. In: Lars Johanson (ed.), The
Mainz Meeting: Proceedings of the Seventh International Con-
ference on Turkish Linguistics, 107-123.Wiesbaden: Harrassowitz
Kornfilt, Jaklin
2000 a Some syntactic and morphological properties of relative clauses in
Turkish. In: Andre Alexiadou, Paul Law, Andre Meinunger and
Chris Wilder (eds.), The Syntax of Relative Clauses, 121-159.
Amsterdam/Philadelphia: John Benjamins.
Subject case in Turkish nominalized clauses 213

Kornfilt, Jaklin
2000 b Postpositions and adverbs: a case study in syntactic categories. In:
Cigdem Balim and Colin Imber (eds.), The Balance of Truth:
Essays in Honour of Professor Geoffrey Lewis, 217-237. Istanbul:
The Isis Press.
Kornfilt, Jaklin
2001 Subjects and their Case in Turkish/Turkic embeddings. Paper pre-
sented at the First Altaic Workshop, ΜΓΓ.
Kornfilt, Jaklin
2002 Functional projections and their subjects in Turkish clauses. In:
Eser Erguvanli-Taylan (ed.), The Verb in Turkish, 183-212.
Amsterdam: Benjamins.
Kornfilt, Jaklin and Gerald Greenberg
2000 Changing argument structure without voice morphology: A con-
crete view. In: Asli Göksel and Celia Kerslake (eds.), Studies on
Turkish and Turkic Languages, 51-56. Wiesbaden: Harrassowitz
Kural, Murat
1993 V-to(-I-to)-C in Turkish. In: Filippo Beghelli and Murat Kural
(eds.), UCLA Occasional Papers in Linguistics 11: 1-37.
Larson, Richard
1985 Bare-NP adverbs. Linguistic Inquiry 16: 595-621.
Lees, Robert B.
1965 Turkish nominalizations and a problem of ellipsis. Foundations of
Language 1-2;112-121.
Lees, Robert B.
1968 The Grammar of English Nominalizations. Bloomington/The
Hague: Indiana University/Mouton. [Fifth printing; original publi-
cation in 1963.]
Lewis, Geoffrey L.
1967 Turkish Grammar. Oxford: Oxford University Press.
Miyagawa, Shigeru
1993 Case-checking and minimal link condition. In: Case and Agree-
ment II, ΜΓΓ Working Papers in Linguistics 19: 213-254.
Moore, John
1998 Turkish copy-raising and Α-chain locality. Natural Language and
Linguistic Theory 16: 149-189.
Öztürk, Balkiz
2002 Turkish as a non-pro-drop language. In: Eser Erguvanli-Taylan
(ed.), The Verb in Turkish, 239-259. Amsterdam: Benjamins.
Pesetsky, David
1982 Paths and Categories. Doctoral dissertation, ΜΓΓ.
214 Jaklin Kornfilt

Pollock, Jean-Yves
1989 Verb movement, universal grammar, and the structure of IP. Lin-
guistic Inquiry 20: 365-424.
Raposo, Eduardo
1987 Case theory and Infl-to-Comp: The inflected infinitive in European
Portuguese. Linguistic Inquiry 18: 85-109.
Reuland, Eric
1983 Governing -ing. Linguistic Inquiry 14: 101-136.
Rizzi, Luigi
1994 Argument/Adjunct (a)symmetries. In: Guglielmo Cinque, Jan Kö-
ster, Jean-Yves Pollock, Luigi Rizzi, and Raffaella Zanuttini (eds.),
Paths Towards Universal Grammar: Studies in Honor of Richard
S. Kayne, 361-376. Washington, D.C.: Georgetown University
Rizzi, Luigi
1997 The fine structure of the left periphery. In: Liliane Haegeman (ed.),
Elements of Grammar, 281-337. Dordrecht: Kluwer.
Sabel, Joachim
1996 Restrukturierung und Lokalität. Universelle Beschränkungen für
Wortstellungsvarianten. Berlin: Akademie Verlag, Studia Gram-
matica 42.
Sabel, Joachim
2002 A minimalist analysis of syntactic islands. The Linguistic Review
Schütze, Carson T.
2001 On the nature of default case. Syntax 4: 205-238.
Sezer, Engin
1991 Issues in Turkish Syntax. Ph.D. dissertation, Harvard University.
Sezer, Engin
2002 Yagayan Türi^e [Living Turkish], Unpublished ms., Harvard Uni-
versity; version 4.2.
S to well, Tim
1981 Origins of phrase structure. Ph.D. dissertation, ΜΓΓ.
Szucsich, Luka
2002 Nominale Adverbiale im Russischen: Syntax, Semantik und Infor-
mationsstruktur. Munich: Otto Sagner.
Tura, Sabahat Sansa
1986 Definiteness and referentiality in Turkish nonverbal sentences. In:
Dan I. Slobin and Karl Zimmer (eds.), Studies in Turkish Linguis-
tics, 165-194. Amsterdam /Philadelphia: John Benjamins.
Underhill, Robert
1972 Turkish participles. Linguistic Inquiry 3: 87-99.
Subject case in Turkish nominalized clauses 215

Watanabe, Akira
1996 Nominative-Genitive conversion and Agreement in Japanese: A
Cross-linguistic perspective. Journal of East Asian Linguistics
Williams, Edwin
1994 Thematic Structure in Syntax. Cambridge, Mass.: The ΜΓΓ Press.
On the licensing of null subjects in Old French1
Esther Rinke

0. Introduction

It is a well known fact that the omission of referential subjects was

licensed in Old French. Therefore, the grammar of Old French had to
meet the relevant licensing conditions for null subjects. This licens-
ing has been described for null subject languages such as Spanish
and Italian in terms of an internal characterisation of the Agreement
system which has to be either rich or strong depending on the defini-
tion of its content.
I will base my argumentation on Kato's (1999) proposal that a
[+pronominal] agreement system is the prerequisite for the licensing
of null subjects. I follow her assumption that in null subject lan-
guages the agreement morphemes have the same grammatical status
as a pronominal subject. As a result, they are able to check the EPP-
feature of a given sentence via verb movement to AGRS°. This
account has two interesting consequences: First, it leads to an elimi-
nation of pro, since it is assumed that the agreement morpheme itself
takes over the functions which were attributed to the empty pronomi-
nal element; second, it makes the prediction that, for economy
reasons, the projection of a specifier of AGRSP for EPP checking is
excluded in null subject languages.
However, any approach which is based on morphophonological
criteria only fails to account for the licensing of null subjects in Old
French in the 13th century, since their distribution is restricted to
certain structural environments. Null subjects occur predominantly in
main clauses with an initial non subject constituent and in conjunc-
tional subordinate clauses which contain a preverbal topic. In order
to capture this generalisation, one has to find out which additional
218 Esther Rinke

structural condition beyond the pure morphophonological licensing

of null subjects is operating.
I base my analysis on the assumption of a split CP system with
different functional layers, namely ForceP, TopP, FocP, and FinP as
proposed by Rizzi (1997). I will propose that the licensing of null
subjects is contingent on the realisation of the functional category
Fin. This category is only available in main clauses and in conjunc-
tional subordinate clauses with a preverbal topic, but not in other
types of subordinate clauses.

1. Null subjects in main clauses2

1.1. The distribution of null subjects in main clauses

It is commonly agreed among researchers that the most likely

environment for null subjects in Old French are main clauses in
which a non subject constituent occupies the initial position and a
finite verb the second position. This finding is supported by our data
base. The initial position is occupied by temporal adverbials such as
puis (si) 'thereafter', apres (si) 'thereafter' and I'andemain 'the day
after' (examples (l)a-c). It is also occupied by a topicalized object or
adjunct phrase as shown in the examples (l)d and e.

(1) a. ...puis si se department, si s'en ala chascuns en son pays.

... thereafter (they) left,...
'Thereafter they left and everyone went to his land.'
(Clari 19,15-16)
b. Apres si nommerons les evesques qui y furent.
thereafter (we) name the bishops that there were
'Thereafter we name the bishops that were there.'
(Clari 17,19-20)
On the licensing of null subjects in Old French 219

c. Et Vandemain vindrent devant le due et dirent...

and the day after (they) came in front of the army leader
and said ...
'And the day after they came to the army-leader and said

(Vileharduyn, 024/04)
d. Les paroles ... ne vouspuis toutes raconter.
the words ... not you can (I) all tell
Ί cannot tell you all the words.'
(Vileharduyn, chap. 30/01)
e. ... et en audience fu devise que en iroit outremer.
and in public (it) was decided that one leaves to oversea
'And it was decided in public to leave to oversea'
(Vileharduyn, chap. 30/03)

The most frequent element introducing null subject clauses is si.

Si has been analysed as an adverbial in the literature, since it may
trigger inversion (Vance 1997). However, I will adopt Ferraresi and
Goldbach's (2001) analysis of si as a sentence particle. Their main
arguments are given in section 2.3. In null subject main clauses, si
appears in initial position or following a temporal adjunct clause
introduced by quant 'when'.

(2) a. Si commencierent la plus riche navie que onques fist

si (they) started the most considerable navy that ever was
'So they started (to build up) the most considerable navy
(Clari 22, 1-2)
220 Esther Rinke

b. Quant Ii marquis fu venus ä Soissons. si demanda aus

barons pour quoi il l'avoient mande.
when the marquis had come to Soissons, si (he) asked the
barons why they him have sent
'When the marquis arrived at Soissons, he asked the
barons why they had sent for him.'
(Clari 20, 8-9)

1.2. Where is the empty subject?

Since the distribution of null subjects in Old French is primarily

restricted to the aforementioned types of main clauses in which a
realised subject pronoun would occur postverbally, Foulet (1928)
claims that subject omission correlates with subject inversion in Old
French: "C'est la un point fondamental de la syntaxe du vieux
fran9ais: 1'inversion du sujet entraine facilement dans le cas du
pronom personnel l'omission du sujet." [Here is a fundamental
aspect of Old French syntax: the inversion of the subject easily
causes subject omission in the case of a personal pronoun.]
This view, albeit in a different shape, has essentially been adopted
in generative analyses (see Vanelli, Renzi and Beninca 1985, Adams
1987, 1988, Roberts 1993, Vance 1997). These authors agree that the
Old French morphology was able to identify the content of an empty
pronominal category. With regard to the licensing conditions, they
establish a correlation between the restricted distribution of null
subjects in main clauses and the verb second property, which is
assumed to be a property of Old French. Leaving aside subtle
differences in the analysis, the assumption is that the verb moves
regularly to the complementizer position in clauses where a non
subject constituent occupies the initial position. Since an overt
pronominal subject would be placed in postverbal position, it is
concluded that an empty pronominal subject pro is licensed in this
position. The licensing condition for pro is related to government by
the finite verb with its respective agreement features. This licensing
condition has been formulated in different ways. Adams (1987,
On the licensing of null subjects in Old French 221

1988) proposes a directional head government parameter. This

parameter would fix the direction in which an empty element may be
governed (and licensed) by a possible governor. Roberts (1993)
explains the distributional restriction of null subjects in Old French
basing his approach on Koopman and Sportiche's (1991) proposal of
a parametrization of nominative case assignment. Roberts assumes
that the possibility of nominative case assignment in a government
relation in Old French was the relevant condition for the licensing of
pro in postverbal position in main clauses. Vance (1997) assumes a
"Condition on nominative pro" entailing government of pro by a
nominative case licensing head hosting AGR° as the relevant require-
ment for its licensing.
To sum up so far: Null subjects are primarily restricted to main
clauses in which a non subject constituent occupies the initial posi-
tion. Generative analyses correlate the licensing of null subjects in
this configuration to government of the empty pronominal element
pro in postverbal position. This government relation is related to the
verb-second property of Old French, where movement of the finite
verb to the complementizer position takes place.
The analyses reported so far are based on the assumption that an
empty pronominal subject has to be placed in postverbal position in
so-called verb-second main clauses. I will argue against this view.
The first, empirical, argument is that subject inversion is not
obligatory and that pronominal subjects do occur pre- as well as
postverbally. The second, theoretical, argument is that from a mini-
malist point of view, pro may be eliminated altogether.
Not all main clauses with an initial non subject constituent show
the subject in inverted position. As Kaiser (2002)3 notes, there exist
verb-third structures with a subject constituent in preverbal position,
as shown in (3). These verb-third sentences are not only incompatible
with a verb-second grammar, they are also never found in the cor-
responding German and Icelandic texts which Kaiser examines in
222 Esther Rinke

(3) Le matin Ureisfist faire un brief

in the morning the king let make a letter
'In the morning let the king write a letter.'
(Li quatre livre des Reis 78: 2 Sam 11,14; Kaiser 2002: 176)

In our database, we also find verb-third structures, in which pro-

nominal or non-pronominal subjects occupy the second structural po-
sition after an adverbial expression or an adjunct. Some examples are
given in (4).

(4) a. et de ce iour que nous nommerons en un an, nous vous

metrons en quel terre ...
and this day that we tell in one year, we you take to that
'And on this date that we will announce in one year's
time, we will take you to this country.'
(Clari 21,25-29)
b. Et des barons, qui ν furent, nous ne les savons mie
and of the lords that were there, we not them know not to
'And we are not able to name all the lords that were
(Clari 17,29-31)
c. Adonc limessage prisent congie, si s 'en revinrent;...
now the messengers took advise, si they returned ...
'As soon as the messengers were informed, they returned

(Clari 19,42-43)
d. Apres U message prisent congii, si s 'en revinrent;...
after the messengers took advise, si they returned ...
'After the messengers were informed, they returned ...'
(Clari 21, 38-39)
On the licensing of null subjects in Old French 223

In the constructions mentioned so far, inversion is optional. But

when a subordinated clause occupies the sentence-initial position, in-
version is never found.4

(5) a. ... et, se li Latin vousissent. il m'eussent tot decop0 en

and, if Latins wanted, they me would have all cut in
'And if the Latins wanted, they would have cut me in
(Clari 29,17-18)
b. Et se vous me voulez croire. nous, irons cest yver se-
journer j usque vers la Pas que;...
and if you me want believe, we will this winter sojourn
until Eastern
'And if you want to believe me, we will sojourn this
winter until Easter.'
(Clari 24, 39-40)
c. Et quant il vint pres d'eus. il leur manda qu' il s 'en
alaissent esraument et qu' il vidaissent sa tere;...
and when he came near to them, he them ordered that
they would go out of the kingdom and that they would
leave his land
'And when he came to them, he ordered them that they
would go out of the kingdom and that they would leave
his land.'
(Clari 28, 30-31)
d. quant il s'en furent αΐέ, li empereres manda sa gent
toute, si les suivi.
... and when they had left, the emperor ordered his people
all, si him followed
'And when they had left, the emperor ordered all his peo-
ple, so they followed him.'
(Clari 28,28-30)
224 Esther Rinke

So far we have observed that overt pronominal and non-pronomi-

nal subjects may appear pre- and postverbally. It can therefore not be
concluded without further evidence that an empty subject would have
to be placed in postverbal position. This not only casts doubt on the
conclusion that an empty pronominal element would have to appear
obligatorily in postverbal position, it may also be evidence against
the assumption of a verb second property of Old French. I will not go
into further detail here. For an extensive discussion see Kaiser
The question where the empty subject is located remains. How-
ever, from a minimalist perspective, this question is not relevant. I
follow Kato (1999), who claims that the empty pronominal element
pro may be eliminated from the syntax altogether. In my argumenta-
tion, I follow primarily Kato (1999), though similar, but not identical
proposals have been made by Speas (1994), Barbosa (1995), Roberts
and Roussou (2002), Manzini and Savoia (1997), Roberts (1998),
Alexiadou and Anagnostopoulou (1998).
Kato assumes that only a language which possesses ^pronomi-
nal] agreement allows for empty referential subjects. Pronominal
agreement is characterised by the fact that the agreement morphology
has the same grammatical status as a pronominal subject. Kato
argues on the basis of Everett (1996), that the agreement morphology
in null subject languages has to be integrated into the system of weak
pronominale (in the sense of Cardinaletti and Starke 1998). As such,
agreement morphemes appear as independent items in the numera-
tion in a null subject language. Having the same status as pronominal
subjects, they are able to check the EPP-feature of a given sentence
via movement of the finite verb [+agreement morphology] to
AGRS°. Since the EPP feature has to be checked only once, no
further checking takes place. For economy reasons, the projection of
a specifier of AGRSP for EPP checking should be excluded.
This approach has various consequences. First, no specifier
position of AGRS is available in null subject languages. Overt
pronominal and non-pronominal preverbal subjects have to be placed
in a Σ-Phrase above AGRSP, which functions as a topic position.6 In
a non null subject language, however, the projection of a specifier of
On the licensing of null subjects in Old French 225

AGRS is obligatory for EPP checking and preverbal subjects should

be placed there. Second, Kato (1999) argues that pro has to be elimi-
nated for economy reasons, since its existence is not relevant for the
syntactic processes in question. The difference between null subject
languages and non null subject languages does not consist in the
availability or non-availability ofpro, but follows from the character-
isation of the agreement complex itself and from economy considera-
This theory implies that the licensing of null subjects is an
inherent property of the agreement system itself and does not follow
from structural requirements on an empty pronominal category, like
e.g. government by the finite verb or by AGR. Since null subjects are
restricted to certain types of clauses, the question how the structural
restriction on null subjects in Old French may be captured needs still
to be answered. I will come to this issue now by taking into
consideration the distribution of null subjects in subordinate clauses.

2. Null subjects in subordinate clauses

2.1. The distribution of null subjects in subordinate clauses

In comparison to their distribution in main clauses, null subjects are

rare in subordinate clauses.7 The only type of subordinate clauses
with null subjects which we found in our data base shows the word
order conjunction - topicalised constituent — finite verb. It is also
regularly reported in the literature that null subjects occur mainly in
this type of subordinate clauses (e.g. Vance 1997: 203f.). In (6), I
give some examples from the Clari text and from the Vileharduyn
226 Esther Rinke

(6) a. Cil est vaillans et hardis, quant si grant hardement

entreprist ä faire.
this one is powerful and courageous, when so great
audacy undertakes to do
'This one must be powerful and courageous if he under-
takes an audacy like this.'
(Clari 32, 35-40)
b. Seigneur, or vees la grant mervelle de la grant honneur
que Dieus m'a donnee, que en eel iour mesme, que on me
devoit prendre et essillier, en eel ior mesme sui corones a
Sir, so see the great miracle of the great honour that God
me has given, that on this day same, that one me has to
arrest and expel, on this day same am crowned to emper-
'Sir, see the great miracle of the great honour that God
has given to me; that on the same day when one has to
arrest and expel me, I will be crowned to emperor.'
(Clari 34, 2-6)
c. ... et si li dist qu' il n'auroit jamais point de son hiretage,
separ l'aide de Dieu et des croisies ne I'avoit.
... and so him told that he not had never something of his
heritage, if with the help of God and the crusaders (he)
not it had.
'...and so he told him that he would never have anything
of his heritage if he will not have it by the help of God
and the crusaders.'
(Clari 37, 31-35)
On the licensing of null subjects in Old French 227

d. Et jut tel le conseil acorde entr'euls que a Venice

cuidoient trover plus grant joison de vessiax que a nul
autre port.
and was this the council agreed upon them that in Venice
think find more great number of vessels than in no other
'And they agreed that in Venice they would find a greater
number of vessels than in any other port.'
(Vileharduyn 014, 02)

According to the literature and to our empirical findings, this is

the only type of subordinate clauses with null subjects which is
regularly found in Old French prose texts in the 13th century (cf.
Vance 1997 and Roberts 1993). Other types of subordinate clauses
(e.g. conjunctional verb-first clauses) did not show up in our data
base until this point and are reported to be very rare (see Hirschbüh-
ler 1990, Roberts 1993, Vance 1997 for a discussion). I will come
back to these cases in section 3.2.
Both Roberts (1993) and Vance (1997) analyse subordinate
clauses of the type conjunction - topic - finite verb as double CP-
structures. This means that these structures "... consist of a root CP
embedded under a higher C° containing a complementizer ..."
(Vance 1997: 206). This account is based on a verb-second analysis
of Old French. Verb movement to C° in this type of subordinate
clauses is assumed in a parallel fashion to root clauses. However, as
discussed in section 1.2., the Old French grammar seems to provide
at least two structural positions preceding the finite verb. If the evi-
dence that the verb moves to the complementizer position in main
clauses is not unambiguous, why then, should this assumption be
made for subordinate clauses? Moreover, in a parallel fashion to verb
third main clauses, we find instances of conjunctional subordinate
clauses which show the subject and an adverbial phrase in preverbal
position. This construction occurs with preverbal pronominal sub-
jects as in (7)a. as well as with non-pronominal subjects as in (7)b.8
228 Esther Rinke

(7) a. Si vous prions pour Dieu que vous soiez nostre sire, et
que vous pour I'amour Damedieu preniez la croix.
so you (we) ask by God that you would be our master,
and that you for the love of God take the cross
'So we ask you in God's name that you will be our mas-
ter and that, for the love of God, you take the cross.'
(Clari 20,15-17)
b. Et tout Ii haut homme, et clerc et lai, et petit et grant,
demenerent si grant joie ä l'esmouvoir, que onques
encore sifaitejoie nesifaite estoire ne fu veue ne oie.
and all the noble men, both clerical and laymen, both
short and tall, showed such a great joy by departing, that
never again such a joy or such a story was seen or heard
'And all the noble men, both clerical and laymen, both
short and tall, showed such a great joy when departing
that never again such a joy or such a story was seen or
(Clari 25,12-15)

In the following section, I will come back to the question which

additional structural condition beyond the internal characterisation of
the agreement morphology has to be met for the licensing of null
subjects in Old French. I will argue that the realisation of an empty
complementizer which is endowed with Agr features is the pre-
requisite for null subject licensing in the 13th century. Whereas it is
available in subordinate clauses which contain a preverbal topic, it is
not available in other types of subordinate clauses. The reason is that
the realisation of a preverbal topic in this context may lead to a split
in the CP system, whereas otherwise the different CP-layers collapse
into one node in conjunctional subordinate clauses. Since this pro-
posal is based on a split CP system, as it has been suggested by Rizzi
(1997), I will first summarise the main assumptions of this model.
On the licensing of null subjects in Old French 229

2.2. The split CP system

Rizzi (1997) assumes that the CP-layer consists of a Force-Finite-

ness and a Topic-Focus system. The relevant structure is given in (8).

(8) ForceP

Force0 TOPP*

Top° FocP

Foc° TOPP*

Top° FinP


The Force-Finiteness system expresses selectional relations be-

tween the C system and the immediately higher and lower structural
systems. The functional head Force is responsible for the specifica-
tion of the clause type and relates to the higher clause, whereas the
functional head Fin determines the selection of the IP system. Rizzi's
definition of the nature of Fin is given in (9).
230 Esther Rinke

(9) "... the C system expresses a specification of finiteness, which

in turn selects an IP system with the familiar characteristics of
finiteness: mood distinctions, overt tense distinctions, subject
agreement licensing nominative case." (Rizzi 1997: 284)

Whereas the Force-Finiteness system is an essential part of the

C system, the Topic-Focus system is present when needed to ensure
that so-called P-features (Topic or Focus features on constituents) are
checked and deleted. Rizzi (1997) assumes that the realisation of the
functional nodes is language-specific and structure-dependent. He
proposes that an economy principle restricts the structure-building
process so that structure is avoided whenever it may be omitted.
As it becomes clear from the structure in (8) and the definition of
Fin in (9), it is the functional node Fin within this system which has
an impact on the IP system since it is, following Rizzi, responsible
for its selection. Since the specification of the IP system in turn is
crucial for the licensing of null subjects, I want to suggest that the
possibility of subject omission in Old French correlates with the rea-
lisation of Fin and the resulting formation of a Fin/Agreement com-
plex. The first argument for this assumption derives from the oc-
currence of null subjects in subordinate clauses of the type conjunc-
tion - topic - finite verb as discussed in section 2.1.

2.3. The realisation of Fin in subordinate clauses

Rizzi (1997) suggests that Force and Fin collapse into one node in
conjunctional subordinate clauses. The realisation of a preverbal
topic may nevertheless activate the Topic-Focus field which in turn
triggers the split of the Force-Finiteness system. He shows this with
regard to subject extraction structures in English.
More specifically, Rizzi shows that the realisation of a preposed
adverbial may allow an agreeing Fw-node to co-occur with the com-
plementizer that. In (10)a, subject extraction is not possible, while in
(10)b, which contains a topicalised adverbial, subject extraction leads
to a grammatical construction.
On the licensing of null subjects in Old French 231

(10) a. * An amendment which they say that t will be law next

b. An amendment which they say that, next year, t will be

In subordinate clauses in English, Rizzi (1997: 312) observes an

alternation of an overt complementizer that and a phonetically not
realised O-complementizer (example (1 l)a), which he assumes to be
consistent with AGR (i.e., it may be endowed with AGR). If the that
form is selected, like in (ll)b, subject extraction is not possible,
since the trace in subject position is not properly governed (ECP vio-
lation). Only when the O-complementizer is selected, subject extrac-
tion is possible, since 0 is turned into a governor of the subject trace /
by the Agr specification, (cf. (1 l)c). So, the availability of a null
complementizer in subordinate clauses is identified as a prerequisite
for subject extraction.

(11) a. I think that10 John will win the prize.

b. * Who do you think [t' that [ t will win the prize]]?
c. Who do you think [t' 0 [ t will win the prize]]

Whereas Force and Fin collapse into one node in finite comple-
ment clauses without an activated Topic-Focus field, they are sepa-
rated into two heads in those cases when the Topic-Focus field is
activated. Within this system, Force is realised by that and Fin is
realised by 0.

(12) ... [that [next year Top0 [0 [John will win the prize]]

Leaving aside the technical details about subject extraction in

English for the moment, I would like to extend this analysis to the
cases of null subject licensing in Old French. First, it must be shown
that Old French makes available a phonologically empty comple-
mentizer like English. Evidence for the existence of a null comple-
mentizer in Old French derives from Foulet's (1928: 333) ob-
servation that the conjunction que is frequently omitted in comple-
232 Esther Rinke

ment clauses of epistemic and declarative verbs (cf. also Roberts

1993: 133).

Le role de que pour relier deux ou plusieurs phrases est aussi itendu dans la
vieille langue qu'il Test de nos jours. Mais alors que nous sommes toujours
tenus de l'exprimer, le vieux fran9ais le sous-entend assez fröquemment:
1° Apr£s les verbes signifiant «promettre», «jurer» et surtout «savoir»,
«penser», «vouloir», on trouve souvent comme compliment une phrase que
ne prec£de aucun que... (Foulet 1928: 333)
[The role of que in connecting two or more phrases is present in the old lan-
guage as well as nowadays. But while we must always express it, the old
French quite frequently implies it: 1. After the verbs meaning 'to promise',
'to swear', and especially 'to know', 'to think', 'to want', one often finds a
phrase which is not preceded by que in complement position...]

I therefore conclude that Old French makes available a

phonologically empty complementizer 0 like English. In the next
step, I want to extend the structural description given in (12) to the
cases of null subject subordinate clauses with a preverbal adverbial
phrase (13)a. Analogous to the English counterpart in example (12), I
assume that Fin is realised by an empty complementizer in these con-
texts. The structure I adopt for the Old French subordinate clause is
given in (13)b.

(13) a. Et jut tel le conseil acorde entr'euls que a Venice

cuidoient trover plus grant joison de vessiax ...
'And they agreed upon the council that in Venice they
think to find a greater number of vessels ...'
(= (6)d; Vileharduyn 014/02)
On the licensing of null subjects in Old French 233


Force0 TOPP


Top0 FocP

Foe0 FinP


AGR0 ... VP

hor<x° que] \a Venice1 [Fin+AgrO] [cuidoiettt] [vp trover

plus grant
foison de

As for subordinate clauses, the contrast between subordinate

clauses with a topicalised element and subordinate clauses which do
not exhibit a preverbal topic can now be accounted for: As argued by
Rizzi (1997), the Force-Finiteness system is collapsed into a single
head in subordinate clauses with an overt complementizer that do not
contain a topic. Fin+Agr is not available and empty referential sub-
jects are not licensed.
234 Esther Rinke

3. Extending the analysis

3.1. Fin in main clauses

In the following section, I argue that the above explanation for the
structural restriction of null subjects in subordinate clauses extends to
null subject main clauses as discussed in section 1. Additional evi-
dence for the analysis is derived from the distribution of sentence
particles in Old French.
Based on Rizzi (1997), Ferraresi and Goldbach (2001) show that
Old French possesses a declarative/assertive particle si, which they
compare to a set of Welsh particles analysed by Roberts (2000).
They demonstrate that Old French si displays crucial properties
similar to its Welsh counterpart: it is always adjacent to the finite
verb (see example (14)a) and it may operate as a phonological and
syntactic partner of a clitic in some cases (see example (14)b). They
also show that si immediately follows the complementizer que (see
example (14)c) (cf. Ferraresi and Goldbach 2001: 4ff.).

(14) a. Adonc si manda li Dux tous les haus conseils de la vile.

'Thus the duke summoned all the municipal councils.'
(Clari from Ferraresi and Goldbach 2001)
b. Sil [< si le] saluerentpar amur e par bien.
'(They) welcome him amicably and seemly.'
(Roland 121 from Ferraresi and Goldbach 2001)
c. (...) et li rois dist [que siferoit il volontiers]
'(...) and the king said that he would do so willingly.'
(Queste 34,10)

Ferraresi and Goldbach (2001) adopt Roberts' (2000) analysis for

the Celtic particles. They assume that si is merged in Fin like its
Welsh counterparts, which Roberts analyses as PF realisations of
finite [Fm], If we adopt this analysis for si, it becomes clear that null
subjects occur predominantly in contexts where Fin is merged with
this sentence particle. Statistical data of the first 300 clauses in the
On the licensing of null subjects in Old French 235

Clari data base show that 26 out of 28 null subject clauses are of this
type. This is shown in table 1:

Table 1. Distribution of null subjects in the Clari data base

Nullsubjects subordinate (adverb+) temporal

total clause - si - si - finite adverbs -
finite verb verb finite verb
28 15 11 2
% 53.6 39.3 7.1

3.2. Problematic cases

There remain some problematic cases for this analysis. Vance (1997)
cites embedded clauses like (15) where the complementizer is imme-
diately followed by the finite verb.

(15) Et quant vit la tombe qui ardoit si merveilleusement,...

and when saw-3sg the tomb which burnt so marvellously
'And when he saw the tomb burning so marvellously...'
(Queste 264,7; Vance 1997: 227)

It may be observed that all examples cited by Vance (1997: 227f.)

consistently involve a temporal interpretation (quant 'when', devant
que 'before', tant que 'so long that').

(16) a. ... devant que I 'aie trouve ...

... before that him-may-find ...
'... until I have found him ...'
(Queste 72,28; Vance 1997: 227f.)
236 Esther Rinke

b. ... et erra tant que encontra par aventure Agloval et

... and travelled so much that met by chance Agloval and
'... and travelled so far that he met by chance Agloval
and Girflet'
(Queste 55,27; Vance 1997: 228)

This observation leads to the speculation that these elements are

somehow related to Fin, because Fin is the functional layer within
the C system which is related to the tense and finiteness specifica-
tions of the subordinate clause. However, the exact nature of this re-
lation needs to be worked out in more detail. Roberts (1993: 137) for
example, on the basis of Hirschbühler (1990), analyses these sen-
tences as belonging to a "conservative system", in which a "double
Agr" system is at work. Within this system, a null subject in Agr2 is
licensed under government by the finite verb in Agrl. Vance (1997:
229) regards the examples in (16)a. and b. as "truly exceptional
clauses, possibly archaic in nature, in which the productive 13th
century licensing conditions for pro are not met". Be that as it may,
null subject conjunctional verb first clauses may not be considered
regular cases of subject omission in Old French.
A second case which is problematic for my analysis concerns
main clauses which are introduced by the coordinating conjunction et
with the finite verb immediately following it like in example (17).
On the licensing of null subjects in Old French 237

(17) [Et Ii dus leur respondi que il queroit respit au quart jour.]
Et adont auroit son conseil assamble. Et porront dire ce que il
[And the chief answered that he wanted time of rest on the
fourth day.]
And therefore (he) had his council assembled. And (they) could
say what they wanted.
'And the chief answered that he wanted to rest on the fourth
day. Therefore he assembled his council, and they could say
what they wanted.'
(Vileharduyn 017/05)

These sentences, which are distinct from conjoined clauses with

identical subjects because they do not share the subject with the pre-
vious phrase, represent a regular context for the distribution of null
subjects. Given that Force and Fin only collapse into one node in
embedded conjunctional clauses, it may be speculated that Fin is uni-
versally realised in main clauses.

4. Consequences of the analysis

In the previous sections, I have argued that the licensing of null

subjects in Old French is due to the interplay of two factors: the
existence of a pronominal agreement system and the additional
condition that Fin bearing agreement features must be present. The
assumption of a pronominal agreement system in Old French is
supported by the finding of Ferraresi and Goldbach (2001) that the
grammar of Old French does not provide a preverbal subject position
(SpecAGRSP). This has been identified as one of the consequences
of a pronominal agreement system suggested by Kato (1999) and di-
scussed in section 1.2.: Since the EPP feature has been checked by
the agreement morpheme, no further checking takes place. For econ-
omy reasons, the projection of a specifier of AGRSP must be exclud-
ed. Overt pronominal and non-pronominal preverbal subjects have to
be placed in a topic phrase above AGRSP.
238 Esther Rinke

From a diachronic perspective, the loss of the sentence particle si

indicates that the realisation of Fin as an additional licensing con-
dition for null subjects was no longer active within the Middle
French period. Nevertheless, null subjects were licensed in Middle
French and they occurred in a greater variety of structural contexts
than in the late Old French period (13th century).

4.1. No SpecAGRSP is available

The non availability of a preverbal subject position in Old French

may be illustrated by Ferraresi and Goldbach's (2001) observation
that only object clitics, which form a complex head with the finite
verb, intervene between the sentence particle si in Fin and the finite
verb in AGRS0. They draw the following conclusion: "This close
vicinity of Fin and AgrS models the intimate relationship between si,
object clitics and finite verb that together constitute ... a single
prosodic constituent." For realised subjects, Ferraresi and Goldbach
(2001) identify three possible positions: firstly, a topic position
preceding si that is available for definite specific subjects, secondly,
the postverbal position SpecVP for non-pronominal subjects, and
thirdly, SpecTP for pronominal subjects. The last two generalisations
are derived from the asymmetric behaviour of full subject DPs and
pronominal subjects with regard to negative specifiers as mie and pas
(cf. Vance 1997). Whereas full subject DPs follow mie and pas,
which are assumed to mark the left edge of the VP, pronouns precede
it. Example (18)a is taken from Ferraresi and Goldbach (2001), (18)b
from Vance (1997: 68).

(18) a. (...) ce ne feraije mie.

this I will not do
(Tristan en prose, nach Ferraresi and Goldbach 2001: 5)
b. einsi ne le comande pas nos^e ordre
our order does not command it in this way
(Queste 120,6; Vance 1997)
On the licensing of null subjects in Old French 239

Ferraresi and Goldbach (2001: 7) interpret their findings as evi-

dence that "there is no SpecAgrP projected ..." - as it is predicted for
a null subject language.

4.2. The diachronic development

Ferraresi and Goldbach (2001) also show that the frequently oc-
curring Old French sentence particles are prosodically weakened in
the Middle French period. As a result of this weakening, sentence
particles like si disappear from the language altogether. Ferraresi and
Goldbach argue that this loss goes along with the loss of the struc-
tural position Fin. They date the completion of this development to
the 17th century. Interestingly, the weakening of Fin seems to have
led to a freer distribution of null subjects in the Middle French period
(cf. Vance 1997). One possible interpretation of this fact is that the
realisation of Fin is not only a necessary condition for the licensing
of null subjects, but also a structural restriction for their licensing.
The question how the null subject property got lost in the history
of French remains. Within the theory of pronominal agreement,
however, it appears that agreement did not cease to be pronominal.
What actually changed is the morphophonological material which
fulfills the EPP in French: namely the paradigm of clitic subject pro-
nouns which developed in the history of French. This diachronic de-
velopment, the emergence of a paradigm of clitic pronouns, may be
illustrated by comparing the behaviour of Old French preverbal sub-
ject pronouns with their Modern French counterparts. There is gene-
ral agreement that Old French exhibits a paradigm of strong nomina-
tive pronouns. In contrast to their clitic Modern French counterparts,
they may be contrastively stressed ((19)a and (20)a), coordinated
((19)b versus (20)b) and modified ((19)c versus (20)c); they may be
separated from the verb by non-clitic elements ((19)d versus (20)d)
and appear in an isolated position ((19)e versus (20)e), cf. also
Adams (1987), Roberts (1993: 112ff.), Skärup (1975: 430ff.).
240 Esther Rinke

(19) a. Etjequesai?
and I what know
'And what know I?'
(Tristan, 1.4302; nach Roberts 1993: 112)
b. Cil de la ville nous ont molt mefait, etje.etmeshommes,
nous voulons vengier d'eus se nous povons.
that of the town us have much harmed and I and my men
we want revenge on them if we can
'That ones of the town have harmed us much and me and
my men, we want to take revenge on them if we can.'
(Clari 24, 37-39)
c. Seje meismes ne li di...
If I self not him say
'If I don't tell him myself...'
(Franzen 1939: 20; in Roberts 1993: 114)
d. Si vous prions pour Dieu que vous soiez nostre sire, et
que vous pour Vamour Damedieu preniez la croix.
... and that you for the love of god take the cross
'... and that, for the love of god, you take the cross.'
(Clari 20,15-17)
e. et qui i sera? jou et tu
and who there will-be? I and you
'And who will be there? I and you.'
(Price 1971: 145; in Roberts 1993: 113)

Modern French (cf. Kaiser 1992):

(20) a. *IL partira le premier.

HE will leave first
b. *Jean et il partiront.
Jean and he will leave
c. *Elles deux / toutes partiront bientot.
they (fem.) two / all will leave soon
On the licensing of null subjects in Old French 241

d. Je ne le lui ai pas encore dit. /

*Je souvent vais au cinema.
I not it him have yet said /1 often go to the cinema.
Ί haven't told him about it yet.'
e. Qui est venu? *//.
who has come? he
'Who came?'

This contrast shows that Old French exhibits a system of strong

nominative pronouns in preverbal position. In Modern French,
however, clitic pronouns appear in subject position. These
observations may be interpreted in different ways. In the light of the
theory of null subject licensing that was presented here, which is
essentially based on parametrization of EPP-checking, the argumen-
tation depends essentially on the analysis of subject pronouns in
Modern French. One possible analysis is that the development of a
paradigm of clitic pronouns and the possibility of subject doubling
provides evidence for the hypothesis that French changed from a null
subject language to a non null subject language (cf. Roberts 1993,
Vance 1997, Kato 1999, among others). On the other hand, Kaiser
(1992) provides evidence in favour of the view that weak subject
pronouns in Old French share crucial properties with affixes. One of
these properties is their high degree of selection, because they are
always adjacent to the finite verb and phonologically and syntactical-
ly bound to it. Kaiser argues that the clitic pronouns are positioned in
1°. Based on the proposals by Roberge (1986), Saltarelli (1989) and
Kaiser and Meisel (1991), he argues that Modern French has to be
regarded a null subject language, where the clitic pronouns took over
the role of the agreement morphology. Yet, this analysis depends on
the exact status of subject clitics in French, which is the issue of an
ongoing debate.
242 Esther Rinke

5. Summary and conclusion

In this paper, I have argued that the distributional constraints on null

subjects in the Old French period, namely in the 13th century, may
be accounted for in terms of a realisation of the functional category
Fin. This analysis is based on a split CP system as proposed by Rizzi
(1997). The main argument for the assumption that the licensing of
null subjects in Old French is contingent on the realisation of Fin
derives from the fact that they may occur in conjunctional
subordinate clauses with a preverbal adverbial topic, since only in
this type of embedded clauses the CP system is split into the two
functional layers Force and Fin, whereas it collapses into one node
in other subordinate clauses. This analysis is also supported by the
observation that null subjects in main clauses tend to co-occur with
the sentence particle si, for which Ferraresi and Goldbach (2001)
assume that it is placed in Fin. From a theoretical point of view, this
account supports Rizzi's claim that Fin is the functional layer within
the CP system which selects the IP system and which may be en-
dowed with agreement features. However, the nature of Fin and its
precise feature composition still need to be studied in more detail.
From a diachronic perspective, the sentence particle si was first
weakened and finally disappeared from the language in the 17th cen-
tury. This development indicates that Fin was probably no longer
present as a governor of Agr and as an additional licensing condition
for null subjects. Nevertheless, null subjects occurred more freely in
the Middle French period. Simultaneously, the subject pronouns be-
came clitic elements and took over the function of the agreement


1. This study has been carried out as part of the research project "Multilingualism
as cause and effect of language change", directed by Jürgen Μ. Meisel. This
project is one of currently thirteen funded by the Deutsche Forschungsgemein-
schaft (German Science Foundation) within the Collaborative Research Center
on Multilingualism, established at the University of Hamburg.
On the licensing of null subjects in Old French 243

For comments and help I wish to thank an anonymous reviewer and the editors
as well as my colleagues at the SFB 'Mehrsprachigkeit': Matthias Bonnesen,
Gisella Ferraresi, Maria L. Goldbach, Marc Hinzelin, Imme Kuchenbrandt,
Pilar Larranaga, Jürgen Μ. Meisel, Anja Mehring, and Kathrin Schmitz.
Thanks for support and fruitful exchange of ideas to Georg A. Kaiser and Ana
Maria Martins. Thanks to Nicole Gozdek and Tobias Schepelmann for their
help in setting up the data base. Thanks to Tanja Kupisch and Sophia Voulgari
for correcting my English. Needless to say that I alone am responsible for any
remaining errors.
2. The empirical investigation is primarily based on two Old French prose texts.
Both are chronicles which report on the Conquest of Constantinople and both
are dated to the beginning of the 13th century. The first one is written by
Robert de Clari, the second one by Josfroi de Vileharduyn. The respective ref-
erences are given in the bibliography. For the examples from the Clari text, I
will give the number of the page and the line in the edition as a reference, for
the examples from the Villeharduyn text, I take the chapter and the line as an
indication, since they are marked in the edition.
3. Kaiser investigates the Old French text Li quatre livres des Reis from the 14th
4. This observation has already been made by Diez (1882). Note however, when
an additional element like a resumptive adverbial appears, inversion takes
5. In addition to the structure-related arguments against a verb-second analysis of
Old French, Kaiser (2002) puts forward quantitative evidence. He shows that
the Old French text he investigates provides clear evidence in favour of the
verb second property nearly to the same extent as against a V2 analysis, namely
around 12% clear V2 main clauses and around 11 % sentences with more than
one preverbal constituent.
6. This position corresponds to SpecTOPP in Rizzi's (1997) structure. I do not
exclude the possibility that some null subject languages show subjects in
SpecTP. However, movement of the subject to this position should not be
driven by EPP checking.
7. In the Clari text, we found only three clear instances of null subjects in
subordinate clauses within a data base containing 1439 instances of a finite
8. Vance (1997) integrates examples like (7)a into her analysis by assuming that
the preverbal subject pronoun is a clitic on the complementizer que in C°.
9. Vance translates tant que as "so much" respectively "so far". However, I do not
adopt this translation. Rather, I assume that tant que has a temporal interpreta-
tion in this context (for the compatibility of tant que with a temporal interpreta-
tion cf. also Greimas 1997, s.v.).
244 Esther Rinke


On the licensing of null subjects in Old French

Periphrastic paradigms in Bulgarian
Andrew Spencer

1. Introduction

This paper explores the notion of 'paradigm' against the background

of periphrastic or analytic constructions.1 These are constructions
which express grammatical properties such as tense or aspect, but
which consist of several word forms in the syntax (multi-word com-
binations). For instance, English verbs have morphological para-
digms which distinguish, amongst other properties, Past and Non-
Past forms (wrote, writes)? However, in addition, English has
grammaticalized Progressive/Perfect aspect (is writing, has written)
and Passive voice (was written). I shall argue that it makes sense to
follow the practice of many traditional descriptive accounts of such
systems in treating them as reflexes of paradigmatic organization
within a framework of paradigm-based morphosyntax (following
Ackerman and Webelhuth 1998, Sadler and Spencer 2001, Spencer
The next section introduces the idea that constructions such as has
written are best regarded as a kind of idiom, in which neither the has
nor the -en has a meaning. Section three briefly discusses the im-
portant fact that morphological paradigms are sometimes incomplete.
Section four sketches the important distinction between morphologi-
cal and syntactic features. Section five sketches Ackerman and We-
belhuth's (1998) conception of 'expanded predicate'. Section six ap-
plies some of the discussion of section three to periphrastic para-
digms in Bulgarian and develops a new concept of 'superexhaustiv-
ity', under which periphrastic paradigms produce more forms than
would be expected from the basic combinatorics of the syntax. Sec-
tion seven is a fairly detailed demonstration that future tense con-
structions in Bulgarian consist of constructional idioms that form
paradigms. The surprising thing about these paradigmatic periphrases
250 Andrew Spencer

is that they contain subordinate clauses. The final section presents

summary conclusions.

2. Periphrases as constructional idioms

The term 'paradigm' in morphology tends to be restricted to sets of

word forms which realize morphosyntactic properties or features.
However, it is relatively uncommon for the set of morphosyntactic
features of a language to be realized solely by single word forms.
Even in morphologically complex languages we often find that some
features are expressed by multi-word combinations. It is a rather ob-
vious descriptive observation that the features expressed periphrasti-
cally are frequently in paradigmatic opposition to those expressed
morphologically. However, it is rather rare for this fact to be re-
flected in theoretical treatments and indeed, it can be surprisingly
difficult to capture the fact straightforwardly in some contemporary
frameworks. Thus, in grammatical frameworks based on the struc-
turalist morpheme concept, periphrastic expressions pose an inter-
esting analytical problem. Consider the Perfect aspect forms of Eng-
lish, as exemplified in (1):

(1) Harriet has left

There seems to be no agreement on how best to analyse sentences

such as (1). Let's assume that the sentence realizes the morphosyn-
tactic properties of TENSE PRESENT and ASPECT PERFECT, and that the
tense marking is found on the auxiliary. Which component conveys
the property ASPECT PERFECT, the auxiliary (2) or the participle (3)?

(2) 'perfect auxiliary' analysis:

Auxperfect V-en

Harriet has left

Periphrastic paradigms in Bulgarian 251

(3) 'perfect participle' analysis:

AuXhave Vperfect

Harriet has left

The question is, of course, wrongly put. Neither the auxiliary nor
the participle on its own serve as 'the' realization of aspect. Rather,
the specification aspect perfect is signalled by the entire construction.
Compound tenses of this sort are constructional idioms, in which
neither component has a 'meaning', in just the same way that neither
turn nor down have a meaning in a phrasal verb such as turn down
(the offer). This analysis is sketched in (4):

(4) Constructional idiom analysis of Perfect:

ASPECT PERFECT is expressed by combining
the appropriate form of auxiliary have with
the -en participle ('past participle').

This construction has to obey various syntactic constraints, of

course, some of which are peculiar to English auxiliary construc-
tions.3 The crucial point, however, is that the grammar of English
should contain some kind of statement of the form (4) as part of the
definition of morphosyntactic properties such as aspect.
In the light of this we can interpret (4) in the following way. The
grammar makes available structures in which have (as an auxiliary
verb with special properties) collocates with the -en participle form
of a (lexical or auxiliary) verb. At the same time the grammar in-
cludes a declaration of morphosyntactic features including ASPECT
grammar includes a set of mapping principles which effectively state
that it is the have Verb-en construction which realizes the ASPECT
PERFECT features. This leaves open the possibility that have and
Verb-en can participate in totally different constructions and realize
entirely different morphosyntactic properties. For instance, have can
figure as a modal semi-auxiliary and Verb-en can figure in the pas-
252 Andrew Spencer

sive. In principle it would even be possible for have + Verb-en to

realize entirely different features from ASPECT PERFECT (we will see
examples of this in the Bulgarian system).
Appeal to feature ensembles including ASPECT SIMPLE, ASPECT
PERFECT implies that constructions such as the Perfect can stand in a
paradigmatic opposition to Simple aspect forms. This is tantamount
to saying that the system of auxiliary + lexical verb structures real-
izes cells in a paradigm. That is the crucial result I wish to establish
and explore in this paper. Before I can do this, however, it will be
necessary to review some of the salient properties of the more famil-
iar type of morphological (inflectional) paradigm.

3. Properties of paradigms - 'underexhaustivity'

A paradigm is nothing more than a set of forms defined by a set of

oppositions. We can speak of a paradigm space generated by a set of
features. To consider a simple example, suppose that the nouns in a
language inflect for four Cases (Nominative, Accusative, Genitive,
Dative) and two Numbers (Singular, Plural). Then we expect all
nouns to have 4 x 2 forms. I shall call a paradigm which conforms in
this manner an 'exhaustive paradigm'.
The importance of the notion of 'exhaustive paradigm' lies in the
fact that morphological paradigms are frequently less than exhaus-
tive. Morphological paradigms sometimes suffer inexplicable gaps.
For instance, the Russian lexeme MECTA 'dream' lacks a Genitive
Plural form. All speakers of the language know that the form 'ought'
to be meet but this word form doesn't exist, and speakers are obliged
to use the Genitive Plural of a different lexeme, MECTANLE. On the
other hand, the lexeme MAÖTA 'mast' does have a regularly formed
Genitive Plural, mact. More importantly, paradigms often exhibit
systematic gaps for all lexemes. The Chukotko-Kamchatkan lan-
guages provide an interesting illustration. In the Chawchuwen dialect
of Koryak (2ukova 1972: 233, 307—8, Spencer 2000: 205) verbs
agree with subjects and objects and distinguish Singular, Dual and
Plural number for intransitive subjects and objects, but only distin-
Periphrastic paradigms in Bulgarian 253

guish Singular and Plural number for transitive subjects. In the Palan
dialect intransitive verbs in the Indicative agree with the subject in
Singular, Dual and Plural (Zukova 1980: 87—96). In the Imperative,
however, the use of Dual number forms is 'irregular' and often re-
placed by the Plural agreement forms (though the Dual forms of the
personal pronouns are used, (2ukova 1980: 98, 99). Only Singular
and Plural agreements are found in the Conditional mood. There is
no particular morphological reason for this difference between the
two dialects, since each uses essentially the same array of affixes.
Moreover, the very closely related language Aljutor manages to dis-
tinguish all three numbers for both subject and object in 1st and 2nd
persons using exactly the same affixes as Koryak but distributed in a
slightly different way. However, in transitive verbs there appear to be
no special Dual 3rd person subject forms (see Kibrik, Kodzasov and
Muraveva 2000: 210, Mal'ceva 1998: 61—63,206).
I shall call systematically incomplete paradigms such as this 'un-
derexhaustive'. Their existence is important because they demand a
set of feature cooccurrence restrictions to be defined over the po-
tential space of forms implied by the basic feature set and its com-
binatorics. The complete set of paradigms defined over the features
SubjAgr, ObjAgr, Person{l, 2, 3} and Number{Sg, Du, PI} properly
includes all of the actual paradigm sets found in the various dialects
of Koryak and Aljutor so that specific and essentially arbitrary rules
are required along the lines "SubjAgr [Number{Sg, PI}] if the verb is
transitive" and so on. In effect such cooccurrence restrictions provide
justification for appealing to the notion of 'paradigm'.
Paradigms often show other important properties, including that of
syncretism or homophony between the forms occupying distinct
cells: certain parts of the paradigm are expressed by one and the
same word form. There is a considerable literature on this (see Stump
1993 for one theoretical proprosal), but the essential point is that
such systematic homophonies in paradigms are ubiquitous.
254 Andrew Spencer

4. Two types of feature: m-features and s-features

All models of morphology have to appeal to morphology-specific

features which govern the construction of words. Examples are in-
flectional class features such as '3rd conjugation' or 'Strong verb'.
These features never have any reflex in the syntax (see Aronoff 1994
for extensive justification). I shall refer to them as 'morphological
features' or 'm-features'.
At the same time many languages express grammatical features by
means of function words, word order and other non-morphological
devices. The grammar of English has to include a property or feature
specification [DEFINITE +] to account for DP structure and the distri-
bution of the definite article. I shall call such a property a 'syntactic
feature' or 's-feature' (even if in some formal models it isn't actually
represented as a feature structure).
Now, some feature names actually conflate these two notions.
Since English nouns distinguish Singular/Plural number, and since
this is represented in the syntax (by agreement, for instance) the m-
feature [Number: {Singular, Plural}] serves as the realization of the s-
feature NUMBER {SINGULAR, PLURAL}. This is the usual function of
morphology, but it's important to recognise the distinction between
the two feature types. One illustration of this comes from mis-
matches. In spoken French (and other languages) the s-feature TENSE
PAST is expressed by a periphrasis combining the Present tense form
of the auxiliary with a tenseless (nonfinite) participle, e.g. elle a ecrit
'she has written'. The problem here is that elsewhere the
[Tense:Present] form of the auxiliary realizes TENSE PRESENT, as one
would expect: elle a une letter 'she has a letter'. More extended justi-
fication of the distinction is given in Sadler and Spencer (2001).
Once we recognise the m-/s-feature distinction we can simplify
the way we view morphology. For instance, we may wonder exactly
what the feature content is of the two participles as traditionally de-
scribed for English: writing and written. For instance, is written pri-
marily the perfect participle or the passive participle? Or should we
say that there are two homophonous participles? Given the m-/s-
feature distinction we can see that this is a spurious notational prob-
Periphrastic paradigms in Bulgarian 255

lem. The participle is simply a (meaningless) form of the verb lex-

eme, a 'morphome' in Aronoff s (1994) terminology, bearing the m-
feature [Vformren], and serving as a partial exponent of s-features
Given the m-/s-feature distinction we can return to the construc-
tional idiom analysis of the periphrastic perfect. We can now re-write
(4) as (5a) and provide (5b) as a partial definition of the passive

(5) a. Constructional idiom analysis of Perfect:

ASPECT PERFECT is expressed by combin-
ing the appropriate form of auxiliary HAVE
with [Vform:en] of the lexical verb.
b. Constructional idiom analysis of Passive:
VOICE PASSIVE is expressed by combining
the appropriate form of auxiliary 'be' with
[Vform:en] of the lexical verb.

In other words, we can characterise the perfect construction as in

(6) (taking the verb leave for concreteness):



[Cat:Aux] [Vform:en]

Specific examples are given in (7):

256 Andrew Spencer

(7) a. ASP PERF HAVE + V

TENSE PRES [Cat:aux] [Vform:en]

has left


TENSE PAST [Cat:aux] [Vformren]

had left

will have left

Given this analysis, neither have nor the ending -t of left 'means'
ASPECT PERFECT in has left. Notice in particular that there is no m-
feature [Aspect:Perfect] in the first place. A further point to note is
that the perfect is expressed by a construction which has its own
morphosyntactic properties. In particular, the auxiliary verb can ex-
press its own tense forms, and in the case of the future tense this is
itself done periphrastically. For this reason, when we realize a given
tense form of the perfect or progressive aspect or the passive voice
we have to treat the auxiliary like any other finite verb and realize the
appropriate TENSE s-feature on that auxiliary. This is represented di-
rectly in (6).

5. Periphrastic paradigms

Implicit in what we've said so far is the claim that s-features such as
ASPECT PERFECT as well as TENSE PAST are properties of individual
(verb) lexemes. In other words, has left is the perfect form of the
verb 'leave', in much the same way that left is the past tense form
Periphrastic paradigms in Bulgarian 257

and leaves is the 3sg form. Nonetheless, expressions such as has left
have their own syntactic properties, too, which they share with other
auxiliary constructions. Ackerman and Webelhuth (1998: 143) de-
scribe expressions such as has left as 'expanded predicates'. Their
characterization of this notion is given in (8):

(8) Form and function of (expanded) predicate:

Function Form
the contentive aspect of the its categorial core, the auxiliaries
predicate, i.e., its meaning and particles needed to express
and its function inventory the predicate in the syntax

Thus, a phrase such as will have been making up (a story) can be

thought of as the Future Perfect Progressive form of a lexeme 'in-
vent, concoct' whose categorial core is the verb form make, whose
full lexical form is a multi-word combination (lexical root + particle,
make up) and whose contentive aspect is additionally expressed by
the three auxiliaries.
The state of affairs shown in (8) is represented directly in peda-
gogic and traditional grammars. However, the status of periphrastic
constructions is less clear in many recent models of generative
grammar, because there is generally no level at which to state the fact
that the expression types in (8) form a paradigm. The paradigmatic
organization of English auxiliaries is rather complex so to illustrate
the problem I'll consider the slightly less controversial case of Rus-
sian. In the Russian verb system it is common to distinguish a three
tense system (Past, Present, Future) with two aspects (Perfective,
Imperfective). These are expressed as in (9):

(9) Basic paradigm system for RASPISAT' 'to write out'

Perfective Imperfective
Past raspisal raspisyval
Present — raspisyvaet
Future raspiset budet raspisyvat'
258 Andrew Spencer

In the Future Tense forms the single word form in Perfective as-
pect, raspiset, is in opposition to the Imperfective Future, a peri-
phrastic construction formed from the Future Tense form of the verb
'to be' and the (Imperfective) infinitive. The Imperfective Future
likewise stands in paradigmatic opposition to the single word form
Present and Past tense forms of the Imperfective. Notice that there is
no Present tense form of the Perfective aspect in Russian. Given this,
a definition of the set of s-features for Russian will therefore contain
the declarations given in (10):

(10) Partial statement of s-feature paradigm for Russian



The feature cooccurrence restriction given in (10b) ensures that

the paradigm is not complete, in that the 'virtual' cell corresponding
In addition to definining the s-feature paradigm space, a grammar
must specify a mapping which defines how each cell is filled. Where
the paradigms are realized by words formed by an agglutinating
morphology or by compositional syntactic constructions with func-
tion words the mapping will be fairly trivial. However, in the general
case we will find deviations from a one:one mapping (Ackerman and
Webelhuth 1998, Sells 2000, Sadler and Spencer 2001). The point of
paradigm-based models of morphosyntax is to develop formal ma-
chinery for expressing such mappings and integrating them into the
rest of the morphology and syntax.
The paradigm-based perspective on periphrastic constructions
leaves open an important question:

(11) How do we distinguish paradigmatic periphrasis (expanded

predicates) from 'ordinary' syntactic constructions involving
two separate lexemes?
Periphrastic paradigms in Bulgarian 259

In other words, how do we know that the English expression

wants to leave is an infinitival complement of LEAVE subordinated to
the main verb WANT and isn't the desiderative mood form of the lex-
eme LEAVE? This amounts to the problem of grammaticalization:
how do we know when grammaticalization has taken place (and been
Equivalently, from the synchronic point of view, (11) is equiva-
lent to (12):

(12) How do we individuate grammatical features (as opposed to

lexical meanings)?

In other words how do we know that we are dealing with, say, a

feature such as TENSE FUTURE or MOOD INTENTIONAL rather than a
verb lexeme DESIRE(x, y)? Now, (12) is a question which any the-
ory of grammar, in any theoretical framework, needs to provide an
explicit answer to. However, as far as I know this is a question which
is almost uniformly ignored. Clearly, however, this question is cru-
cial to all current accounts of morphosyntax, and certainly isn't pe-
culiar to paradigm-driven models such as that which I am proposing.
However, the paradigm-based approach does have the virtue of
throwing the problem into relief and making it less easy to sweep it
under the carpet.
In the rest of this paper I address the questions in (11, 12) by pro-
posing a set of diagnostics for determining when we have a paradig-
matic system. The thrust of the discussion will be the observation
that periphrastic paradigms tend to have certain properties which
make them look more like morphological paradigms than syntactic
260 Andrew Spencer

6. Properties of periphrastic paradigms

6.1. Quasi-morphological behaviour

In Spencer (2001) I outline a variety of ways in which periphrastic

constructions in Slavic languages resemble morphological systems,
in that they exhibit cumulation, zero exponence, meaningless morphs
and so on. I argue there that we can capture the regularities ade-
quately only if we make use of sets of rules or constraints mapping s-
feature complexes to syntactic constructions organized in the manner
of realization rules advocated by morphologists such as Stump
(2001). In this section I extend those observations by highlighting
two important properties of periphrastic paradigms which distinguish
them from the 'syntactic ideal' of common-or-garden compositional
and regular constructions.

6.2. Incomplete paradigms

An underexhaustive periphrastic paradigm is the most obvious de-

viation from the 'syntactic ideal', as illustrated by the gap in the Rus-
sian verb paradigm shown in (9). There is no Present Perfective form
in Russian, yet there is no particular reason, morphological, syntactic
or semantic why Russian couldn't have developed the pseudo-
paradigm in (13):

(13) Exhaustive pseudo-paradigm system for RASPISAT' 'to write

Perfective Imperfective
Past raspisal raspisyval
Present raspiset raspisyvaet
Future budet raspisat' budet raspisyvat'

It might be thought that there are good semantic reasons for dis-
allowing the exhaustive paradigm, in that the semantics of perfectiv-
ity seem to be incompatible with Present tense meaning. However,
Periphrastic paradigms in Bulgarian 261

this would be a mistake (borne in part of taking traditional semanti-

cally based morphological labels too seriously). Bulgarian, a Slavic
language with a similar system of verbal aspect, is quite happy to al-
low Present tense and Perfective aspect to cooccur (and even permits
an Imperfect tense of the Perfective aspect and an Aorist tense of the
Imperfective aspect). Thus, the reason why Russian has the underex-
haustive system in (9) and not the exhaustive system in (13) is effec-
tively an accident of linguistic history.

6.3. Superexhaustive paradigms

The second type of deviation is one which seems to have escaped

theoretical notice hitherto, but which provides strong motivation for
distinguishing between compositional, productive syntax and gram-
maticalized paradigmatic periphrasis. Just as there are 'underexhaus-
tive' paradigms, so, I shall argue, there are cases in which a paradigm
has more forms than would be expected from the normal principles
of compositional syntax. I shall call such paradigms 'superexhaus-
tive'. They are systems in which the paradigm develops forms which
are mandated neither by existing paradigmatic oppositions nor by the
'inherent' syntax of the periphrasis. Such phenomena pose an inter-
esting puzzle for grammaticalization theory, but they clearly pose
serious problems for a syntactic, non-paradigmatic account of periph-
Superexhaustive paradigms are not particularly common, but Bul-
garian provides a particularly interesting example of the phenome-
non, the Emphatic Renarrated mood. To understand how this works
we need to look at Bulgarian morphosyntax in more detail.
Bulgarian distinguishes three tenses morphologically, the Present,
Imperfect and Aorist. In addition, it has a set of compound Perfect
tenses, formed by combining tensed forms of the verb 'be' with the /-
participle. The /-participle is formed by adding - I to the verb stem,
thus from the (Aorist) stem pisa 'write' we obtain pisal. This partici-
ple agrees with its subject in number and gender but not in person.
The Perfect can appear in various tense forms, including the Present
262 Andrew Spencer

Perfect and Past Perfect or Pluperfect: säm pisal Ί have written',

bjax pisal Ί had written'. The latter is formed by putting the auxil-
iary 'be' into the Imperfect or the Aorist tense (these tenses of 'be'
only differ in the 2/3 sg forms).
Bulgarian has developed a special Renarrated mood by reinter-
preting the Perfect tense series. This is a type of evidential mood in-
dicating that the speaker didn't witness the event described but learnt
of it by hearsay ('They say that', Ί gather that', 'Apparently'). In
(14) I give the basic forms for the Renarrated mood in 3sg forms (ig-
noring Future tense based forms, which we will return to).

(14) Bulgarian Renarrated mood (preizkazno naklonenie), non-

future forms (Scatton 1984: 331), Pi§E 'write'
Indicative Renarrated

Present pise 'writes'

Imperfect pisese 'was writing'

Aorist pisa 'wrote' pisal

Present Perfect pisal e 'has written'

bil pisal
Past Perfect be(se) pisal 'had written'

First notice that the Present/Past distinction is neutralized in the

Renarrated mood, except that the Aorist has its own unique Renar-
rated form. This form is identical to the Present Perfect Indicative in
lst/2nd person forms (an instance of the common paradigm property
of syncretism) and consists of the Present tense of the auxiliary 'be'
and the /-participle form of the verb, pisal. However, the Perfect In-
dicative retains the auxiliary in 3rd person forms, while in the 3rd
person Renarrated forms the auxiliary is dropped (an intriguing ex-
ample of zero exponence discussed in more detail in Spencer 2001:
294-95). The same /-participle form, pisal, is the basis of the Pres-
Periphrastic paradigms in Bulgarian 263

ent/Past Perfect Renarrated. However, the /-participle for the Pres-

ent/Imperfect Renarrated is slightly different, pis el. This is the Im-
perfect /-participle (minalo nesvarSeno dejatelno pricastie), based on
the Present/Imperfect stem and is found only in the Renarrated
forms, not in Perfect Indicative forms. This morphological innova-
tion is necessary to ensure that a complete paradigm of Renarrated
forms is possible, otherwise there would only be a single form for the
Present, Imperfect and Aorist tenses. The innovation represents an
extension to the normal morphological paradigm of verb forms, and
is an instance of what I shall call 'morphological superexhaustivity'.
As a result the verb paradigm is now skewed, because it includes a
form which can't be used for any Indicative tense/aspect construc-
Of greater interest is an extension of the Renarrated mood system.
Bulgarian has a paradigm of Emphatic Renarrated forms. The basic
scheme is shown in (15):

(15) Bulgarian Emphatic Renarrated forms, lsg (Scatton 1984:

33If), PISE 'write'
Renarrated Emphatic Renarrated

Present/Imperfect säm pisel bil säm pisel

Aorist säm pisal bil säm pisal

Present Perfect bil pisal —

The Emphatic Renarrated forms are derived systematically from

the Renarrated forms by creating an additional Renarrated form for
the Present tense auxiliary forms (shown here in the lsg form). This
creates a construction which is otherwise unlicensed. In some Slavic
languages this would be the way to create a Pluperfect from a Present
Perfect (because the expression bil säm would be the Past tense of
the auxiliary). However, the Pluperfect in Bulgarian is formed by
taking a synthetic past tense of the auxiliary (Imperfect or Aorist).
264 Andrew Spencer

Outside of the system of Renarration, a form such as bil säm pisal

could only be the Perfect of a Perfect, and this is no more motivated
for Bulgarian than the corresponding English expression would be:
*has had written (the letter). As it is we seem to have a Renarrated
form of a Renarrated form, but this too is semantic nonsense. In ef-
fect, we have here a non-compositional extension of a construction
which is already pretty non-compositional.
The pressure to express further grammatical meanings has led to
an extension of the periphrastic paradigm in a way that is very diffi-
cult to reconcile with a picture on which the auxiliary verb has its
own lexically defined set of grammatical features. There seems to be
no alternative but to treat this as a constructional idiom which has
been extended in the way that synthetic morphological paradigms are
occasionally extended. I shall call this type of phenomenon 'peri-
phrastic superexhaustivity'. Some examples are given below (where
'L' in morphemic glosses indicates 7-participle'):

(16) a. Aorist Renarrated:

Ti si napisala pismoto
you are write.L the.letter
'You wrote the letter (reportedly)'
b. Emphatic Aorist Renarrated:
Ti si bila napisala pismoto
you are be.L write.L the.letter
'You wrote the letter (reportedly, emphatic)'

(17) a. Present/Imperfect Renarrated:

Ti si pisela pismoto
you are write.iMPF.L the.letter
'You are/were writing the letter (reportedly)'
b. Present/Imperfect Emphatic Renarrated:
Ti si bila pisela pismoto
you are be.L write.iMPF.L the.letter
'You are/were writing the letter (reportedly,
Periphrastic paradigms in Bulgarian 265

Bulgarian linguists often distinguish a further modal category

called variously the 'Presumptive' (predpolozitelna forma), 'Inferen-
tial' (umozakljucitelna forma), 'Conclusive' (konkluziv) and other
terms (Bojadziev, Kucarov and Pencev 1998: 401, 410—413). This
is identical to the Renarrated mood, except that the 3rd person auxil-
iary is always present. This means that the forms of the Aorist Con-
clusive coincide with those of the Present Perfect. The one form
which is therefore unique to this mood is the 3rd person forms with
the Imperfect /-participle, such as e pis el 'was writing (it seems)'
(see Kucarov 1994 for a detailed discussion of this mood).
According to Bojadziev et al. (1998: 423f) the Emphatic or Dou-
bled Renarrated form is in fact the Conclusive of the Renarrated form
(konkluzivna preizkazna forma). Thus we can set up the following
correspondences, in which the Emphatic Renarrated is derived by
'renarrating' the Conclusive forms, that is, by replacing the 'indica-
tive form' auxiliary of the Conclusive with a 'renarrated' form.
Compare their table shown in (18) with the table in (15) above:

(18) Conclusive and Emphatic Renarrated forms

Conclusive Emphatic Renarrated

pisel e bil pisel

bil e pisal no renarrated form
stjal e da pise stjal bil da pise
stjal e da e pisal stjal bil da e pisal
pisal e bil pisal

If this is true then we may ask (with an anonymous referee)

whether the Emphatic Renarrated paradigm really does constitute an
instance of superexhaustivity, as opposed to just the operation of
regular formation rules. The answer remains 'yes'. Now, the mean-
ing of the Emphatic Renarrated is itself a matter of controversy, but
what it cannot possibly mean is the Renarration of the Conclusive or
the Conclusive of the Renarrated. In other words, bil pisel can't mean
'it is said that the speaker infers that he wrote' or 'the speaker infers
266 Andrew Spencer

that it is said that he wrote'. But the fact that it's possible to relate the
Emphatic Renarrated to two existing morphosyntactic processes at
the purely formal level without preservation of the conventional se-
mantics for those two processes is precisely what gives rise to super-
exhaustivity. The Emphatic Renarrated form, whatever its meaning
and however it is formed, is non-compositional. The discussion in
Bojadziev et al. (1998) highlights this aspect of the construction very
clearly. The Emphatic Renarrated paradigm, therefore, is mandated
purely by morphological form, not by the meaning normally associ-
ated with its components.
Superexhaustive paradigms provide no less evidence for gram-
maticalization and paradigmatic organization than underexhaustive
paradigms. In each case we are dealing with a deviation from syn-
tactic compositionality which is entirely unmotivated from the point
of view of syntactic representations. In the next section we look at
Future tense forms in Bulgarian, to examine the full extent to which
syntax can be commandeered in the service of functional categories.

7. Grammaticalization of clause structure in Bulgarian

7.1. Bulgarian subordinate clauses

Before we can consider Bulgarian Future forms we need to know

something about Bulgarian complement clauses. Bulgarian lacks in-
finitive forms. Where many other Slavic languages have infinitival
complements, Bulgarian has a variety of modal and other verbs
which take finite subordinate clause complements introduced by DA.
An interesting peculiarity of these clauses is that they permit fronting
of arguments and adjuncts before the subordinator.

(19) a. Ivan iska pisma da pise

Ivan wants letters DA writes
'Ivan wants to write letters'
Periphrastic paradigms in Bulgarian 267

Ivan iska Marija da pise pisma

Ivan wants Marija DA writes letters
'Ivan wants Marija to write letters'

(20) a. Ivan iska da

Ivan wants DA
otgovarja pravilno na vaprosite
answers correctly to the.questions

Ivan iska pravilno da

Ivan wants correctly DA
otgovarja na vaprosite
answers to the.questions
'Ivan wants to answer the questions correctly'

The DA-clause starts its own clitic domain (clitics are shown in

(21) Ne sie Ii vie iskali da mu gi pokaza?

NEG AUX Q you want DA to.him them
'Don't you (reportedly) want me to show them to him?'
'Haven't you been wanting me to show them to him?'

Example (21) also illustrates the basic way in which negation is

effected in Bulgarian, by means of a negative particle, NE, which
then heads the clitic cluster. If there are no clitics, NE attaches to the
first full verb form (the lexical verb or non-clitic forms of the 'be'
There is controversy in the literature as to how best to describe the
DA-element. In particular, it isn't clear whether we should regard DA
as a complementizer in all its uses or whether it is really a modal
particle of some kind in some of its uses (see Rudin 1986: 54f for a
survey of views). This question is irrelevant to our present concerns.
What is crucial is that the DA-element has a wide variety of morpho-
syntactic uses and some, at least, of its morphosyntactic properties
are preserved across those uses.
268 Andrew Spencer

7.2. Grammaticalization of OA-clauses - Imperatives and Futures

The DA-clause is used to realize certain types of imperative (Scatton

1984: 339). In examples (22b) and (23) we see that the subject can
come before the DA-element, as we saw in (19b):

(22) a. Da dojde Ivan

DA come Ivan
b. Ivan da dojde
Ivan DA come
'Ivan should come!'

(23) a. Neka Ivan da dojde

let Ivan DA come
'Ivan should come'
b. Neka Ivan da ne idva
let Ivan DA NEG come.iMPFV
'Ivan shouldn't come'

These examples show that, in this type of imperative at least, the

DA-clause construction (viewed as a piece of 'pure' syntax) can
sometimes serve as an exponent (or even the principal exponent) of a
The Imperative is not the only category to use the DA-clause. This
is also the structure of two of the four compound tenses based on the
Future (see Scatton 1984: 320 and Bojadziev et al. 1998: 382 for a
basic description of the tense/aspect system). The basic Future tense
construction (bädeäte) is relatively simple. It is formed by taking an
uninflecting particle ste and combining this (as a clitic) with the Pre-
sent tense form of the verb, either in the simple aspect or in the Per-
fect (bädeste predvaritelno). Thus, we have (24,25):

(24) a. Az ste pisa pismoto

I FUT write. lSG the.letter
Ί will write the letter'
Periphrastic paradigms in Bulgarian 269

Te ste pisat pismoto

they FUT write.3PL the.letter
'They will write the letter'

(25) a. Az ste säm napisal pismoto

I FUT be.LSG write.L.SG the.letter
Ί will have written the letter'
b. Te ste sa napisali pismoto
they FUT be.3PL write.L.PL the.letter
'They will have written the letter'

The prefixed forms napisal/napisali represent the Perfective as-

pect, not to be confused with the Perfect tense/aspect. Perfective as-
pect is more natural than Imperfective aspect in Perfect tense forms
such as these. The verb 'be' has two equivalent Future tense forms,
one using the ste + the Present tense of 'be' and the other with a spe-
cial form of 'be' based on the root bäd-, which was historically a
Future tense form. The Perfect aspect forms in (25) can be based on
either type of Future, e.g. Az ste bäda napisal pismoto, Te ste bädat
napisali pismoto.
A number of other tense/mood/aspect expressions are possible
based on this design, but all of them involve a DA-clause. Tradition-
ally, these are all taken as part of the verbal paradigm. First, in the
Indicative mood there are the Past Future (bädeste ν minaloto) and
the Past Future Perfect (bädeste predvaritelno ν minaloto). In the Past
Future the future auxiliary takes the Person/Number inflections of the
Imperfect (past) tense, and which I shall gloss as §TA (the traditional
lsg citation form). This then combines with a DA-clause whose lexi-
cal verb is either in the Present tense (Past Future, (26)) or Present
Perfect (Past Future Perfect, (27)):

(26) a. Az stjax da pisa pismoto

I STA.ISG DA write. ISG the.letter
Ί would write the letter'
270 Andrew Spencer

b. Te stjaxa da pisat pismoto

they STA.3PL DA write.3PL the.letter
'They would write the letter'

(27) a. Az stjax da säm

napisal pismoto
write.L.SG the.letter
Ί would have written the letter'
b. Te stjaxa da sa
they §TA.3PL DA be.3PL
napisali pismoto
write.L.PL the.letter
'They would have written the letter'

The commonest interpretation of the Past Future is, perhaps, that

of a Conditional (as seen in the glosses for (26,27)).
Next I consider the Negative Future forms. These are constructed
with the verb IMA 'have'. Now, as a lexical verb IMA has a special
fused negated form as seen in examples (28,29):

(28) a. Toj ima pari

he have.3SG.PRES money
'He has money'
b. Toj njama pari
he NEG.have.3sG.PRES money
' H e doesn't have any money'

(29) a. Te imaxa pari

they have.3PL.IMPF money
'They had/used to have money'
b. Te njamaxa pari
they NEG.have.3PL.LMPF money
'They didn't (used to) have money'
Periphrastic paradigms in Bulgarian 271

The verb IMA is also used impersonally (in 3sg forms) in existen-
tial sentences of the kind 'there is/are X':

(30) a. Tuk ima pari

here have.3sG money.PL
'There's money here'
b. Tuk njama pari
here NEG.have.3SG money.PL
'There's no money here'

Notice that PARI 'money' is a plurale tantum noun, but it doesn't

trigger agreement on the verb. The examples in (28-30) show that
the fused root njam- is used uniformly instead of the expected ne +
ima (which is ungrammatical).
Exactly the same idiosyncratic allomorphy is found when IMA is
used to form the negated future tenses. In the Present tense, the im-
personal 3sg form, njama (glossed as NJAMA), is used for all persons
followed by a DA-clause in the Present or Present Perfect:

(31) a. Az njama da pisa pismoto

I NJAMA DA write. ISG the.letter
Ί will not write the letter'
Te njama da pisat pismoto
they NJAMA DA write.3PL the.letter
'They will not write the letter'

(32) a. Az njama da säm

napisal pismoto
write.L.SG the.letter
Ί will not have written the letter'
272 Andrew Spencer

b. Te njama da sa
they NJAMA DA be.3PL
napisali pismoto
write.L.PL the.letter
'They will not have written the letter'

For the Past Future we use the 3sg Negative Imperfect form of
IMA, njamase (glossed NJAMASE), with the DA-clause:

(33) a. Az njamase da pisa pismoto

I NJAMASE DA write. ISG the.letter
Ί would not write the letter'
Te njamase da pisat pismoto
they NJAMASE DA write.3PL the.letter
'They would not write the letter'

(34) a. Az njamase da säm

napisal pismoto
write.L.SG the.letter
Ί would not have written the letter'
Te njamase da sa
they NJAMASE DA be.3PL
napisali pismoto
write.L.PL the.letter
'They would not have written the letter'

Finally, we can form Renarrated and Emphatic Renarrated Fu-

tures, based on the /-participles of §TA 'Future', NJAMA 'Negative
Future' and SÄM 'be', stjal, njamal, and bil:

(35) a. Tja stjala

da e napisala pismoto
DA be.3sG write the.letter
'She will/would have written the letter (reportedly)'
Periphrastic paradigms in Bulgarian 273

Tja bila stjala

da e napisala pismoto
DA be.3sG write the.letter
'She will/would have written the letter
(reportedly, emphatic)'

(36) a. Tja njamalo

da e napisala pismoto
DA be.3SG write the.letter
'She will/would not have written the letter (reportedly)'
Tja njamalo bilo
da e napisala pismoto
DA be.3sG write the.letter
'She will/would not have written the letter
(reportedly, emphatic)'

According to Scatton (1984: 332) and Bojadziev et al. (1998:

460—462) there are also forms with ne instead of njamalo, which
would give Tja ne stjala da e napisala pismoto for (36a) and Tja ne
bila stjala da e napisala pismoto for (36b). In those alternative forms
it appears that we are negating the 'be' auxiliary and not the ste form,
though this is only visible in non-3rd person forms: Az ne säm stjal
da säm napisal pismoto Ί would not have written the letter (report-
The DA-clause in these compound Future tenses is syntactically
still a DA-clause. Thus, it creates its own clitic domain, so that those
clitics which relate to the lexical verb (pronominal object clitics and
the perfect auxiliary) are found in that clause, while 'main clause'
clitics, that is the interrogative LI and perfect auxiliaries relating to
Renarrated forms are found outside the DA-clause, just as though the
Future auxiliary were a matrix verb selecting a subordinate clause
complement. This is seen in examples such as the following
(Avgustinova 1997: 74):
274 Andrew Spencer

(37) Stjaxte miIi da gi dadete?

PAST.FUT.2PL Q DA them you. give
'Would you give them to me?'

(38) Njamase li da ste

mi gi dali? them give.L
'Would you not have given them to me?'

(39) Steli li ste (bili)

da mi gi badete prestavili?
DA them FUT.2PL introduced
'Were you (reportedly) going to introduce them to me?'

(40) Njamalo li (bilo)

da ste mi bili predstaveni?
DA AUX.2PL AUX.L introduced
'Weren't you (reportedly) going to be introduced to me?'

Similarly, elements such as objects and adverbials can climb out

of the DA-clause and appear in a preposed position between the aux-
iliary complex and the DA complementizer (Avgustinova 1997: 51):

(41) Njamase li pravilno da e

NJAMASE Q correctly DA be.3SG
otgovoril na väprosite?
reply to the.questions
'Would he not have answered the questions correctly?'
Periphrastic paradigms in Bulgarian 275

(42) Ti stjal li si predvaritelno

you STJAL Q AUX.2SG in.advance
da si ni ja pokazal?
DA AUX.2SG her show
'Will you (reportedly) have shown her to us in advance?'

7.3. Theoretical implications

Clearly, the data in the previous section illustrate periphrastic para-

digms in the sense that I have been using this term. A reasonably full
range of constructions is found (even if some of the more baroque
ones are very seldom used) and the forms of the more complex
structures are largely predictable from those of the simpler structures
from which they are built up. Thus, we have a situation in which a
DA-clause construction is used as part of a verbal paradigm, while
retaining most of the syntax of the original syntactic construction.
This is theoretically significant. The Future constructions with DA
illustrate in a particularly graphic fashion the way in which syntax
can lose its compositionality and can serve as the (partial) exponent
of morphosyntactic features organized paradigmatically.
I propose to account for these paradigms by setting up a family of
s-feature mappings similar to those proposed in (6) for the English
Perfect. These are shown in (43) (with obvious abbreviations):

(43) Where ' V' stands for any lexical verb, and where SÄM stands
for the auxiliary verb ('be'):




TENSE IMPF [Vform:l]
276 Andrew Spencer


Ste + V


ste + V


sta + da + V

sta + da + V

(44) Where Ύ ' stands for any lexical verb:


njama + da + V


njama + da + V


njamase + da + V
Periphrastic paradigms in Bulgarian 277


njamase + da + V

Obviously, these rules would have to be supplemented by further

principles regulating agreement and so on. This is an interesting ex-
ercise in itself, since agreement for person, number, gender can be
spread across the entire construction. The four rules for negated
forms will override the default negation rule which just adds NE to
the left edge of the verb or clitic cluster. There are various notation
conventions one could imagine to collapse these two sets of rules
into a single rule schema. For example, we may wish to write (44) as

(45) V

njama + da + V

Similarly, one can expand these rules in obvious ways to accom-

modate the Renarrated and Emphatic Renarrated forms.
Crucial to this way of looking at things is the assumption that
formatives such as sta and njama are grammatical elements serving
as exponents of mapping rules, and not fully fledged lexical entries.
The syntactic idiom chunks illustrated in (43, 44) are self-contained
morphosyntactic units which serve to realize morphosyntactic prop-
erties, they are not inflected forms of lexical entries. Thus, none of
the forms of §TE or STA is treated as a lexical entry bearing the fea-
ture TENSE FUTURE (much less [Tense:Future]!) Even njama isn't
treated as a listed negation element. Rather, we have a variety of
grammatical formatives realizing morphosyntactic properties. It's
278 Andrew Spencer

just that some of those formatives happen to be inflecting words in

their own right.
To be sure, one could imagine an analysis in which sta, njama and
the like were treated as verbs in the lexicon bearing a set of features
(e.g. [Tense:Future], [Polarity:Negation]) and having selectional
features (e.g. \+da\). The syntax of the compound Futures would
then be essentially that of any lexical verb which selects a DA-clause.
Indeed, Krapova (1999) has recently argued for precisely this posi-
tion. She argues that STA, when conjugated, is a lexical modal verb
like ISKAM 'want'. However, her argument rests on a faulty charac-
terization of the semantics of STA. She claims that STA always assigns
its subject a volitional/intentional semantic role, much like ISKAM
(Krapova 1999: 82). This is simply false. Casual inspection of these
constructions shows that the traditional characterization as Future-in-
the-Past (shifting to Conditional) is correct.
Krapova explicitly states that it is more 'principled' to relate all
tokens of a grammatical formative such as English HAVE or Bulgar-
ian IMAM 'have' to a single underlying meaning. In other words, she
denies the possibility of homonymy or polysemy. Given Krapova's
assumptions we would have to concede that the lexemes EAR (for
hearing) and EAR (of corn) were one and the same (underspecified)
lexeme. A particularly telling observation is the fact that the negative
form of the conjugated STA constructions is formed with NJAMA, just
like the straightforward Future Indicative with the particle STE
(which Krapova agrees is simply a functional element and not a lexi-
cal verb). But by Krapova's 'No homonymy' principle njama must
be a form of IMA and not of STA (or STE). Thus, she would be forced
to argue that there simply is no Negative Future form in Bulgarian.
Another important and related point which is overlooked by
Krapova is the fact that the grammatical formatives do not have full
paradigms of their own. Since STA even when conjugated is itself the
exponent of TENSE PAST FUTURE it cannot have a Future or a Past
Future form of its own. If sta, njama and so on really did represent
forms of autonomous lexical entries it would be a complete mystery
why they lack precisely the inflections which they themselves are
exponents of. Thus, why is there no Future form of either sta or
Periphrastic paradigms in Bulgarian 279

njama, along the lines of *ste sta or *ste njamal It is no more possi-
ble to construct such strings than it is to have a Perfect aspect form of
the Perfect auxiliary HAVE: *I have had left early. Those that believe
that auxiliaries such as English HAVE or Bulgarian §TA are fully
fledged lexical entries have to explain this otherwise mysterious
complementary distribution. Notice that homophonous verbs do have
Perfect forms, e.g. I have had an idea or I have had to leave early. It
is difficult to see how such contrasts can be explained without void-
ing the 'No homophony' principle of any content.
Clearly we must reject Krapova's bizarre 'No homonymy' as-
sumption out of hand, but her discussion is useful as a reductio ad
absurdum of the strongest version of the view that functional words
have their own lexical entries. Is there a satisfactory weakening of
the lexical entry thesis? No doubt it is possible to set up special lexi-
cal entries, with a plethora of selectional features guaranteeing that
just the right collocations are generated. It is easy to see that such
entries would still retain the mysterious property of complementary
distribution pointed out above: these would be lexical entries which
themselves lacked those forms which correspond to the features of
which they are exponents.

8. Conclusions

I have argued for a paradigmatic perspective on periphrastic con-

structions. On this approach when we consider multi-word combina-
tions such as English has been writing or Bulgarian njamase da e
pisal '(he) would not have written' we do not regard individual func-
tion words such as has been or njamase da e as lexical entries pro-
jecting their own set of features. Instead, we regard them as simply
formatives which bear at most syntactic category features. I pre-
sented two sorts of evidence for this paradigm-based view of periph-
rases, based mainly on the unusually rich periphrastic system of Bul-
garian: first, we find the kinds of gaps in these constructions that we
often see in inflectional paradigms but which should not occur in
genuinely compositional syntax; second, we find instances of super-
280 Andrew Spencer

exhaustivity, in which the paradigm takes on a life of its own, so to

speak, and extends beyond what would be expected from the normal
combinatoric syntax. Constructions such as the Bulgarian Emphatic
Renarrated are extremely difficult to describe if we insist on listing
featural properties in lexical entries for function words. What con-
ceivable lexical entry could we posit for bil, the /-participle of 'be',
that would account for its use as an exponent of the Present/Aorist
Emphatic Renarrated form bil säm cetjal Ί am reading/read (report-
edly, emphatic)', while simultaneously accounting for its appearance
as the marker of Perfect Renarrated in bil säm napisal Ί have written
(reportedly)' or Present Perfect Indicative and Present/Aorist Renar-
rated of bil säm Ί have been, I am (reportedly), I was (reportedly)'?
There remain many interesting questions. In particular, it is neces-
sary to explain exactly how paradigm-driven mapping rules such as
those of (43, 44) relate to other aspects of morphosyntax, such as
linearization, clitic placement, agreement, ellipsis and so on. These,
of course, are problems for all current theories of morphosyntax. I
claim that it is only by adopting the paradigm-based approach that
the correct factoring of functions can be achieved and only a para-
digm-based approach will eventually lead to an insightful account of
these constructions.


1. Parts of this paper have been presented to audiences at the Workshop on Con-
structions, Linguistics Association of Great Britain, 5 April 2001, University of
Leeds, and the Workshop on Historical Morphosyntax, 6 June 2001, Universität
Konstanz, as well as to the Arbeitsgruppe 12 of the DGfS23 meeting, Univer-
sität Leipzig, 1 March 2001.1 am grateful to Guergana Popova, the editors, and
an anonymous reviewer for helpful comments.
2. Except where it would be fussy to do so, I write forms of lexemes in italics and
the name of the lexeme itself in SMALL CAPITALS.
3. Naturally, this account presupposes a theory of constructions within the gram-
matical architecture. For preliminary discussion of this see Ackerman and We-
belhuth 1998, Sells 2000.
Periphrastic paradigms in Bulgarian 281


Transparent, restricted and opaque affix orders

Barbara Stiebels

1. Introduction*

Cross-linguistic research on the morphological structure of words has

revealed two tendencies for possible affix orders: whereas functional
categories (e.g. tense-aspect-mood systems) show a strong tendency
for fixed affix orders (see Bybee 1985 and Wunderlich 1993), which
only exhibit a small range of cross-linguistic variation, adverbial af-
fixes and diathesis markers surface in variable orders that correlate
with systematic differences in meaning. The behavior of both classes
of morphemes can be motivated semantically; the current literature
on affix order, though, is mainly dominated by syntactic approaches
(e.g. Baker 1985, Pesetsky 1985, Muysken 1986, Speas 1991, Alsina
The research on affix order has been stimulated by Baker's (1985)
Mirror Principle, which states that affix orders should mirror syn-
tactic derivations:

(1) Mirror Principle (Baker 1985: 375)

Morphological derivations must directly reflect syntactic
derivations (and vice versa).

Whereas in the original paper, Baker's proposals concerning the

nature of the relevant operations are quite vague, Baker (1988) pro-
poses a system where the affix order results from underlying syntac-
tic configurations by head movement. In most cases, a given affix
order can only receive a unique interpretation. Gaps in potential affix
orders result from violations of syntactic principles (e.g. Case Filter,
Empty Category Principle).
284 Barbara Stiebeis

Muysken (1986) interprets the Mirror Principle in terms of scope:

if an affix A has scope over affix B, it must be external with respect
to B, which may be illustrated as follows:

(2) a. Affix order: V-AFFI-AFF 2 -... VS. V-AFF2-AFF1-...

b. Semantic scope: AFF2(AFFI(V)) VS. AFFI(AFF2(V))

The representations in (2) are meant to also include the mirror im-
age, where all affixes are realized as prefixes.
In case where the relevant affixes do not attach at the same side of
the verbal stem, affix orders by themselves normally do not indicate
their order of application. Therefore, the following structures are

(3) [AFF Ι - [Verb-AFF 2 ] ] vs. [ [ AFF Ι - Verb] - AFF 2 ]

However, in some languages, such affix orders can be distin-

guished due to structural properties (e.g. linking patterns such as case
distributions) or due to certain allomorphies. I will provide evidence
for this in the following sections.
A recent proposal by Rice (2000) puts emphasis on the availabil-
ity of affix combinations that may receive different scope readings.
According to Rice, three cases of affix combination have to be dis-
tinguished: first, two affixes A and Β do not exhibit a scope relation;
therefore, no affix order concerning A and Β is preferred. Both affix
orders may be possible, or a language may arbitrarily choose one
option. The combination of the Chichewa (Bantu) intensifier (INT)
-its 'do V well, intensively' with various diathesis markers is a case
in question:

(4) Position of the intensifier morpheme in Chichewa

(Hyman and Mchombo 1992)
Transparent, restricted and opaque affix orders 285

With the applicative (APPL) -Ir and the passive (PASS) -Idw, the
intensifier may only occur as inner morpheme; thus, the affix order is
arbitrarily fixed. However, with the reciprocal (REC) -an, it may
show up in both orders, yielding no interpretational difference.
Secondly, each of the two affixes may take the other one into its
scope. Therefore, both affix orders are relevant because they differ in
their scopal interpretations. Thirdly, the scope relation is fixed such
that only affix A may take affix Β into its scope; thus, only the order
with A being the outer morpheme is possible. The first two cases are
instances of local variability, i.e., there may be language-internal or
cross-linguistic variation regarding the actual affix orders, whereas
the third case is predicted to show global uniformity, i.e., all lan-
guages should display the relevant affix order. The second case is the
one I am most interested in: the availability of two affix orders. The
notion of scope, proposed by Muysken and Rice, will be clarified by
considering explicit semantic representations.
Differences in affix orders may result from semantic or syntactic
properties. If, for instance, a causative affix (CAUSE) is combined
with an adverbial affix (MOD), the readings in (5a/b) obtain: In (5a)
the (outer) adverbial affix modifies the complex situation of causa-
tion, whereas in (5b), it only modifies the subevent expressed by the
base verb.1 (5c-e) show the simplified representations for a transitive
base verb, the verb extended by an adverbial affix and the causativ-
ized variant of the verb. Following the tradition of Lexical Decompo-
sition Grammar (Joppen and Wunderlich 1995, Wunderlich 1997b,
Stiebeis 1999), I represent the argument structure of a lexical item as
a sequence of λ-abstractors (abstracting over the argument variables
in Semantic Form [SF]): the referential argument of the verb, i.e. the
situational variable s, is considered to be the highest argument and
written as right-most argument on the theta-grid. The other argu-
ments are written to its left according to their depth of embedding in
SF and, thus, to their rank on the argument hierarchy.
286 Barbara Stiebeis

(5) Combination of causative and adverbial morpheme

λy λχ λιι Xs' 3s [[ACT(u) & V(x,y)(s)](s') & MOD(s')]
λy λχ Xu λδ" 3s [ACT(U) & [V(x,y)(s) & MOD(S)]](S')
c. V Xy λχ λβ V(x,y)(s)
d. V-MOD Xy λχ Xs [V(x,y)(s) & MOD(S)]
e. V-CAUSE λy λχ Xu Xs' 3s [ACT(U) & V(x,y)(s)](s')

Structural differences of affix orders often depend on the accessi-

bility of arguments. Certain adverbial affixes, for instance, if com-
bined with an applicative, may access the applied argument only as
the outer morpheme. Wechsler (1989) has shown that adverbial af-
fixes such as 'again' can only take direct arguments into their scope,
which requires the applicative to apply before the affixation of the
adverbial morpheme. In the following examples from Chichewa, the
clitic nso 'again' can take the instrumental phrase into its scope only
if the latter has been integrated as structural argument via applica-
tivization as in (6b); in (6a), the instrumental phrase is realized as
oblique adjunct.

(6) Repetitive in Chichewa (Wechsler 1989: 429)

a. mu-lembe=nso chimangirizo ndi nthenga
2SG-write=again essay with feather
'you write the essay again, with a quill (this time)'
b. mu-lembe-re-nso nthenga chimangirizo
2SG-write-APPL=again feather essay
'you write the essay with a quill again'

It is the goal of this paper to provide a programmatic and semanti-

cally based overview of possible affix orders within the domain of
diathesis morphology: which diathesis markers may be combined in
principle and to which extent is the resulting morphological structure
compositional, i.e. reflects the semantic composition and structural
generation of forms? I will show that Baker (1988) makes wrong
Transparent, restricted and opaque affix orders 287

claims concerning possible diathesis combinations and that the Mir-

ror Principle is a violable constraint.
In the following section, I will discuss the compositionality of
affix orders and introduce the notion of transparent, restricted and
opaque affix orders. Section 3 briefly presents Baker's (1988) pre-
dictions for possible diathesis combinations and my analysis of dia-
thesis operations. Section 4 is concerned with diathesis combinations
that yield an identical semantic output, whereas section 5 is con-
cerned with those that differ in semantic terms. Section 6 finally
treats diathesis combinations in which one of the possible orders sub-
sumes the inverse one.

2. Compositionality of affix orders

Given that a particular combination of two morphemes A and Β has

the universal potential for free order of application, and, hence, for
the two affix orders Α-B and B-Α, one must distinguish three sub-
cases with respect to the resulting structures: The most unproblem-
atic case is the one in which both affix orders occur and transparently
reflect the underlying scope relations. I will call these cases trans-
parent affix orders. The following example from Bolivian Quechua
shows the transparency of the combination of hortative and assistive.
The assistive adds an assister argument to the base verb, which is
realized as subject. The hortative, some kind of intensifier, expresses
that the action denoted by the verb is executed with a certain amount
of energy.

(7) Assistive/hortative in Quechua (van de Kerke 1996: 198)

a. p'acha-ta t'aqsa-ysi-rqu-wa-rqa
cloth-ACC wash-ASS-HORT-1.A-3 SG.PAST
'she helped me wash the clothes energetically'
b. p'acha-ta t'aqsa-rqu-ysi-wa-rqa
cloth-ACC wash-HORT-ASS-1. A-3 SG.PAST
'she helped me energetically wash the clothes'
288 Barbara Stiebeis

(7a) has the expected interpretation that the assisting action is

executed energetically, whereas (7b) denotes the situation of ener-
getic washing.
If due to a language-specific constraint, only one affix order oc-
curs, which receives a surface-true, i.e. compositional interpretation,
this affix combination is restricted. Quechua, for instance, allows the
repetitive affix -kipa 'again' only to be internal to the causative af-
fix. The inverse order is not possible. The interpretation is composi-
tionally fixed to the repetition of the situation expressed by the base

(8) Causative/repetitive in Quechua (van de Kerke 1996: 176)

mama-y p'acha-ta t'aqsa-kipa-chi-wa-rqa
mother-ISG.P cloth-ACC wash-REP-CAUSE-l.A-3SG.PAST
'my mother made me rewash the clothes'
#'again my mother made me wash the clothes'

The most problematic case regarding the realization of a particular

morpheme combination is found in languages in which a given affix
order has both the compositional and the non-compositional inter-
pretation. The latter violates the Mirror Principle. These affix orders
are opaque. Whereas restricted affix orders show a complete gap for
a certain morpheme combination, opaque affix orders only lack a
distinct PF for one of the two readings. The combination of hortative
and causative in Quechua is an example for an opaque affix order:
the surface order HORT-CAUSE has the additional non-compositional
interpretation that the causing event is executed energetically.

(9) Hortative/causative in Quechua (van de Kerke 1996: 177)

Maria-wan p'acha-ta t'aqsa-rqu-chi-na-yki tiya-n
Maria-COM cloth-ACC wash-HORT-CAUSE-NOML-2sG be-3SG
a. 'you should make Maria wash the clothes with energy'
b. 'you must energetically make Maria wash the clothes'

An even stronger case of opacity occurs if only one of the poten-

tial affix orders is allowed and if this has the interpretation of the
Transparent, restricted and opaque affix orders 289

inverse affix order, hence violates the Mirror Principle. This case is
illustrated in (lOd): the first line shows the morphological orders
(with V being the verbal stem), the second line the underlying scopal

(10) Schema of attested affix orders in multiscopal contexts

a. transparent b. restricted
order V-A-B V-B-A V-A-B *V-B-A

scope B(A(V)) A(B(V)) B(A(V)) *A(B(V))

c. opaque\ d. opaquej
V-A-B *V-B-A V-A-B *V-B-A

B(A(V)) A(B(V)) *B(A(V)) A(B(V))

These few examples from Quechua have already illustrated that a

language may display transparent, restricted and opaque affix orders
within the same domain of morphology, and that some affixes may
even surface in both transparent and opaque affix orders (e.g. the
One may speculate that different types of constraints are responsi-
ble for non-transparent affix orders: restricted affix orders presuma-
bly result from semantic and syntactic constraints, whereas opaque
affix orders result from phonological and morphological surface con-
straints that dominate a constraint such as the Mirror Principle, or
have to be explained in terms of language-specific conditions on

3. Order of diathesis markers

The most elaborate proposal concerning possible diathesis combina-

tions has been made by Baker (1988). He analyzes diathesis markers
as affixal heads that need to be incorporated into a governing head.
Baker distinguishes three types of complex incorporation: whereas
290 Barbara Stiebeis

cyclic incorporation involves consistent movement of affixal heads

into governing heads, acyclic incorporation means that an intermedi-
ate head is skipped and incorporated separately. Separate incorpora-
tion consists of parallel head movement of the heads of sister catego-
ries into the governing head. Acyclic incorporation is excluded in
principle by the Empty Category Principle. Among the diathesis
combinations that are based on cyclic or separate incorporation,
some are excluded by the Stray Affix Filter (affixes should be at-
tached to a stem) and the Case Filter. According to this analysis, the
possible diathesis combinations should pattern as follows:

(11) Possible diathesis combinations according to Baker (1988)

Diathesis markers derivation affix order
separate CAUSE-APPL
separate * APPL-CAUSE
cyclic CAUSE-PASS (type 1/2)
cyclic PASS-CAUSE (type 2)
cyclic APPL-PASS
acyclic *PASS-APPL

According to Baker, only causative and passive may be combined

in both orders - at least in type 2 languages, whereas in type 1 lan-
guages, PASS-CAUSE violates the Stray Affix Filter. ANTIPASS-APPL
and APPL-CAUSE violate the Case Filter under Baker's assumptions.
Moreover, antipassive and applicative should not combine in any
case. However, as cross-linguistic studies reveal, Baker's approach is
far too restrictive. I will provide the relevant counter-evidence in the
following sections.
Within the framework I would like to propose, all diathesis mark-
ers can be combined in principle in both orders but may be restricted
due to language-specific constraints on linking, i.e. the morphosyn-
Transparent, restricted and opaque affix orders 291

tactic realization of arguments. I assume that the Mirror Principle

should be formulated in semantic terms (see also Muysken 1986):

(12) Mirror Principle (own version)

'The affix order must mirror semantic composition.'

This version of the Mirror Principle requires that the order of se-
mantic integration of morphemes corresponds to their position in
morphological structure, i.e. their relative distance to the stem. Un-
like Baker, I assume that the Mirror Principle is a violable con-
straint: opaque affix orders violate it due to some higher-ranked con-
In the following I will discuss to what extent the various combi-
nations of diathesis markers yield affix orders that need to be distin-
guished in syntactic or semantic terms and to what extent transparent,
restricted, and opaque affix orders occur.
Following Wunderlich (1997b) and Dixon and Aikhenvald (2000)
I distinguish three types of diathesis: (a) argument extension such as
causative, assistive or applicative, (b) argument reduction as found
with agentless passive, 'patientless' antipassive and reflexivization,
and (c) diatheses that bring about alternative argument realizations
such as agentive passive, antipassive with oblique realization of the
internal argument, dative shift and locative alternation. I will not
consider dative shift and locative alternation in the following.
The representation of the causative has already been given in
footnote 1. The assistive also introduces a highest argument but must
be represented as an object control verb: it takes a verbal predicate,
adds an assister argument and identifies the 'assisted' with the high-
est argument of the base verb; since there is no evidence in van de
Kerke's data that these verbs may express indirect assistence, I do
not assume that a new situational variable is introduced:

(13) Representation of assistive (Quechua)

ASS λ Ρ λ χ λη Xs ASSIST(U,X,P(X))(S)
292 Barbara Stiebeis

In contrast to causative and assistive, the applicative introduces a

lowest (or second-to-lowest) argument, namely the applied argument,
which is realized as direct object. The following example from the
Bantu language Kinyarwanda shows a benefactive applicative, in
which a beneficiary ('boy') is added.

(14) Applicative in Kinyarwanda (Kimenyi 1980: 32)

umukoöbwa a-ra-som-er-a umuhuungu igitabo
girl 3SG.N-PRES-read-APPL-ASP boy book
'the girl is reading a book for the boy'

The argument extension found in the applicative is triggered by

the integration of a semantic predicate, which I will simplify as
APP(s,u), a place-holder for more specific predicates that integrate a
beneficiary, instrument and so on (see (15a)). The applicative cannot
be represented as a functor on verbs because this would yield incon-
sistencies between the argument hierarchy predicted from the process
of semantic composition via Functional Composition and the argu-
ments' depth of embedding in SF (Stiebels 1996, Wunderlich 1997a).
I assume that the base verb undergoes argument extension as in
(15b), i.e. it is extended by a predicative argument, and that the ap-
plicative is integrated via Functional Composition as shown in (15c)
so that the arguments of the applicative are inherited to the base verb.

(15) Representation and derivation of applicative

a. APPL Xu Xs APP(S,U)
with APP G {INSTR(S,Z), LOC(S,Z), POSS(U,V), ...}
b. V λy λχ hi V(x,y)(s)
λΡ Xy λχ Xs [V(x,y)(s) & P(s)]
c. V-APPL Xz Xy λ χ λβ [V(x,y)(s) & APP(S,Z)]

I assume that the agentless passive is represented as a functor that

existentially binds the highest argument of the base verb (see (16a)).
With agentive passive, the highest argument is marked as oblique
(see (16b)).
Transparent, restricted and opaque affix orders 293

(16) Representation of passive

a. λΡ λβ 3x P(x)(s) [agentless passive]
b. λΡ λχ XsP(x)(s) [agentive passive]

Antipassive functions as the mirror image of passive. It either ex-

istentially binds the lowest argument of the base verb as shown in
(17a) for a transitive verb, or marks this argument as oblique as in

(17) Representation of antipassive

a. λΡ λχ Xs 3y P(x,y)(s) [patientless antipassive]
b. λΡ λy λχ Xs P(x,y)(s) [oblique antipassive]

Finally, reflexivization involves either co-indexation of θ-roles if

it takes place in syntax (see (18a)), or multiple λ-abstraction if it is
encoded morphologically (see (18b)).2

(18) Representation of reflexivization (transitive base verb)

a. λ ^ λχϊ λβ V(x,y)(s) [syntax]
b. λχ V(x,x)(s) [morphology]

In this paper, I am concerned with morphological reflexives/recip-

The various combinations of diathesis markers show a varying
tendency toward transparent, restricted and opaque affix orders, as I
will show in the following. In principle, combinations of diathesis
markers may be restricted due to semantic/conceptual factors (e.g.
the role of specified agent arguments, the potential ambiguity of
forms) and structural factors such as the maximal number of struc-
tural linkers and structural arguments in the particular language, the
linker inventory, the symmetry or asymmetry of objects (Bresnan
and Moshi 1993) and the obligatoriness of morphological marking of
argument saturation (e.g. by means of pronoun or noun incorpora-
tion); these parameters constitute the linking profile of the language.
294 Barbara Stiebeis

Recall that languages with symmetric objects allow both internal

arguments to be alternatively realized as the subject of a passive verb
- besides other symmetries. Further restrictions are attested: in many
languages, diathesis operations that follow argument extensions must
not affect the structural realization of arguments that have been in-
troduced into the base verb, whereas diathesis operations that follow
argument reductions may be affected by the lack of structurally ac-
cessible arguments.
In the following, I will first discuss diathesis combinations that
yield an identical semantic output; then I will discuss those combi-
nations that differ in their semantic output. Finally I will show to
what extent affix orders may be in a subsumption relation. Apart
from one exception (see section 5.1.), all diathesis combinations are
affected by the language-specific linking profile and thus expected to
show cross-linguistic variation (see also Alsina 1999).

4. Diathesis combinations with identical semantic output

Diathesis combinations that have an identical semantic output, i.e.

have an identical SF, may still differ in their θ-grid. Therefore I will
distinguish two cases: diathesis combinations with identical SF and
identical θ-grid and diathesis combinations with identical SF but dis-
tinct θ-grid. Only the first type is predicted to be either realized by a
single affix order or to show free variation.

4.1. Diathesis combinations with identical θ-grid

A diathesis combination that yields an identical output both for SF

and θ-grid in any order is the combination of passive and reflexive,
as shown for a transitive base verb:

(19) Combination of passive and reflexive (transitive base verb)

a. V-PASS-REFL Xy Xs Ξχ V(x,y)(s) ->• Xs Ξχ V(x,x)(s)
b. V-REFL-PASS λχ Xs V(x,x)(s) -» Xs Ξχ V(x,x)(s)
Transparent, restricted and opaque affix orders 295

As with all similar cases, the two affix orders differ, however, in
their intermediate step. In (19a) the possible antecedent of the re-
flexive is bound prior to reflexivization, whereas in (19b) it is bound
after reflexivization, which might lead to a slight preference for
(19b). Alsina (1999) claims that (19a) is universally excluded. The
order V-PASS-REFL could be impossible in languages that require
antecedents to be structurally realized. In principle, both combina-
tions of passive and reflexive are ungrammatical with 2-place verbs
in languages that do not allow impersonal passives. Moreover, with
3- and 4-place verbs, V-REFL-PASS is only possible in languages with
symmetric objects (Alsina 1999) because only then can one of the
remaining internal arguments be promoted to subject position.
In Classical Nahuatl, an Uto-Aztecan language, the order of pas-
sivization and reflexivization can be determined on the basis of the
actual reflexive allomorphs. In general, a 'specific reflexive' (with
person and number agreement) is used if the argument in question is
bound by the highest argument as in (20a). If the antecedent is not
realized structurally as highest argument, the 'unspecific' reflexive
ne- is used as in (20b): here, the highest argument is existentially
bound and thus not accessible.

(20) Passive/reflexive in Classical Nahuatl (Launey 1979: 61)

a. ni-no-tlätia
1 SG.N-1 SG.REFL-hide Ί hide myself
b. ne-tläti-lo
usp.REFL-hide-PASS 'People hide'

In order to account for the reflexive allomorphy, one must assume

that reflexivization applies after passivization, which contradicts Al-
sina's (1999) claim. The order V-REFL-PASS is not attested in Na-
Identical SFs are also generated with the combination of passive
and antipassive:

(21) Combination of passive and antipassive

296 Barbara Stiebeis

It is, however, dubious whether languages should make use of

both argument reductions; this only seems plausible if multiple ar-
gument extensions apply.

4.2. Diathesis combinations with different θ-grids

There are two cases in which diathesis combinations result in the

same SF, but differ in their θ-grid: the combination of causative and
passive on the one hand and the combination of antipassive and ap-
plicative on the other hand. Concerning the combination of causative
and passive, the causer is existentially bound in the order V-CAUSE-
PASS, whereas the causee is bound in the inverse order:

(22) Combination of causative and passive

λy λ χ Xs' 3u 3s [ACT(u) & V(x,y)(s)](s')
Xy λιι Xs' 3x 3s [ACT(U) & V(x,y)(s)](s')

The combination of causative and passive depends on constraints

on structural linking and the requirement for morphologically en-
coded binding of arguments. V-PASS-CAUSE is superfluous in lan-
guages with optional (oblique) causees because there is no need to
bind the causee. This affix order, however, is highly relevant in lan-
guages with obligatory (morphological) argument saturation or in
languages with restrictions on structural linking: with the latter, cau-
sativization may be restricted to intransitive or transitive verbs. In
Yucatec Maya, only two structural arguments are allowed (Krämer
and Wunderlich 1999); therefore, causativization is restricted to in-
transitive verbs. In order to causativize an underlyingly transitive
verb, argument reduction must take place.
Transparent, restricted and opaque affix orders 297

(23) Causative/passive in Yucatec Maya (Bricker 1978: 22)

a. k=u kd?an-s-ik
'he is teaching him'
b. k=u kä?an-s-ä?al
'he is being taught'

In (23a), the verb 'learn' is passivized before its argument struc-

ture is extended by a causer argument. (23b) shows that a causativ-
ized verb may undergo passivization. Therefore, both orders are at-
tested in Yucatec Maya. The order V-CAUSE-PASS may be ungram-
matical in languages that do not allow new arguments to be existen-
tially bound or realized obliquely.
Depending on the order of application of antipassive and applica-
tive, different arguments are existentially bound or realized
obliquely. In this respect, the combination of antipassive and appli-
cative is a mirror image of the combination of causative and passive.
If antipassive precedes the applicative, the base object is existentially
bound or realized obliquely as in (24a). Such an order of application
is often used if the language exhibits restrictions on structural link-
ing: the antipassive reduces the number of structural arguments thus
allowing subsequent argument extension. If the antipassive follows
the applicative, the applied argument is existentially bound or real-
ized obliquely as in (24b).

(24) Combination of antipassive and applicative

a. V-ANTIPASS-APPL λ χ Xs 3y V(x,y)(s)
λζ λχ Xs By [V(x,y)(s) & APP(s,z)]
b. V-APPL-ANTIPASS λ ζ Xy λχ Xs [V(x,y)(s) & APP(S,Z)]
- » λy λχ Xs Bz [V(x,y)(s) & APP(S,Z)]

Languages that do not allow the existential binding of new argu-

ments, should not display affix orders such as (24b); therefore, the
combination of antipassive and applicative may be restricted. In
298 Barbara Stiebeis

West Greenlandic, applicative and antipassive may be iterated, thus

transparently showing both affix orders:

(25) Antipassive/applicative in West Greenlandic

(Fortescue 1984: 270)
a. am-vuq 'he went out' (V-3SG)
b. anni-p-paa 'he went out with it' (V-APPL-3SG/3SG)
c. anni-s-si-vuq 'he went out with something'
d. anni-s-si-vig-aa 'he went out with something to him'

Note that West Greenlandic also exhibits several applicative vari-

ants and that the surface form is subject to many morphophonologi-
cal processes.
The resulting verb forms of the combination of causative and an-
tipassive differ in their linking patterns - at least in languages with
asymmetric objects. Depending on the linking conditions in causa-
tivized transitive verbs (oblique causee vs. oblique base object), the
antipassive existentially binds the structural internal argument of the
causativized verb (compare (26a/b)); therefore, only the causer ar-
gument remains structural (str).

(26) Combination of causative and antipassive

λχ Χα Xs' 3y 3s [ACT(U) & V(x,y)(s)](s') [obi. causee]
obi str
b. Xy Xu Xs' 3x 3s [ACT(U) & V(x,y)(s)](s') [obi. base obj.]
obi str
λχ Χα Xs' 3y 3s [ACT(u) & V(x,y)(s)](s')
str str

If, however, the antipassive applies first, the base object must be
existentially bound; therefore, the causee argument can be realized
structurally. Note that Baker (1988) predicts both orders to be un-
Transparent, restricted and opaque affix orders 299

grammatical. The following examples from Chamorro provide

counter-evidence to his claim (West Greenlandic would also be a
case in question). In (27a) the antipassive applies prior to causativi-
zation; as expected, the causee häm 'us' is realized structurally (as
NOM-marked pronoun), whereas the base object is oblique. In (27b)
the antipassive follows causativization (umlauting the causative mor-
pheme); here, both causee and base object are oblique.3

(27) Causative/antipassive in Chamorro (Gibson 1992: 175/150)

a. ha=na '-fan-aitai häm / ma'estrak-ku
3SG.E=CAUS-ANTIPASS-read lPL.EX.N the teacher-ISG.P
ni esti na lebblu
OBL this LINK book
'my teacher made us read the book'
b. man-nä'-eksamina häm i doktu
PL-ANTIPASS.CAUS-examine IPL.EX.N the doctor
as nana-n-mami
OBL mother-n-LPL.EX.P
'we had the doctor examine our mother'

5. Diathesis combinations that differ semantically

Since argument extensions are triggered by the integration of further

predicates into the SF of the base verb, combinations of argument
extensions yield outputs that differ according to their order of appli-
cation. In addition, the combination of diathesis markers with re-
flexives or reciprocals may yield outputs that differ in their binding
relations. I will begin with the discussion of the diathesis combina-
tions that yield different SFs and different θ-grids.

5.1. Diathesis combinations with different θ-grids

Since some structures and processes universally single out the high-
est argument of verbs ('logical subject'), the order of application of
300 Barbara Stiebeis

diathesis markers that introduce a highest argument is highly rele-

vant. It is the combination of such diathesis markers that exhibits the
strongest requirement and tendency for transparent affix orders. (28)
represents the differences between the orders of application of assis-
tive and causative. If the assistive applies first as in (28a), the causer
u is the highest argument and, hence, realized as subject. If the
causative applies first, the assister ν is the highest argument, and the
causer is identified with the assisted.

(28) Combination of assistive and causative

λ y λ χ λ ν λιι Is' 3s [ACT(u) & ASSlST(v,x,V(x,y))(s)](s')
λy λχ Xu λν λβ1 3s [ASSIST(V,U,[ACT(U) & V(x,y)(s)]](s')

The following example from Quechua shows the predicted trans-

parency. (29a) represents the order V-ASS-CAUSE, (29b) the order V -

(29) Causative/assistive in Quechua (van de Kerke 1996: 179)

a. Maria-wan wawa-s-ta maylla-ysi-chi-wa-n
Maria-COM child-PL-ACC wash-ASS-CAUSE-l.A-3SG
'she makes Maria help me wash the children'
b. Maria-wan wawa-s-ta maylla-chi-ysi-wa-n
Maria-COM child-PL-ACC wash-CAUSE-ASS-l.A-3SG
'she helps me to make Maria wash the children'

Independent of the order of application, the causee is realized by

the comitative and the assisted by object agreement, which indicates
a certain asymmetry between the causative and the assistive, requir-
ing further elaboration.
Additional examples from Quechua and other languages (e.g. the
iteration of causatives) confirm the prediction that the combination of
diathesis markers introducing a highest argument should always be
Transparent, restricted and opaque affix orders 301

The combination of causative and assistive is partly mirrored by

the combination of applicatives. Depending on the order of applica-
tion, the resulting verbs differ in their SF and their θ-grid.

(30) Combination of Applicatives

a. V-APP1-APP2
λν λιι λy λχ Xs [V(x,y)(s) & APPI(s,u) & APP2(S,V)]
b. V-APP2-APP1
λιι λν λy λχ Xs [V(x,y)(s) & APP2(S,V) & APPI(s,u)]

In contrast to the combination of diathesis markers that introduce

a highest argument, the combination of applicatives underlies lan-
guage-specific linking constraints and, thus, does not exhibit global
uniformity. In languages with asymmetric objects, the applied argu-
ment introduced last is predicted to be realized as structural object,
whereas the internal argument introduced by the first applicative is
oblique. With these languages, affix orders should be clearly distin-
guished due to their structural effects. In languages with symmetric
objects, the order of application does not play a role: both internal
arguments are structural and may be accessed likewise. Despite its
structural relevance one hardly finds examples for multiple applica-
tives in which both orders are attested. Many languages - even those
with asymmetric objects - have a strong preference for one of the
possible orders of applicatives. In Tukang Besi, for instance, the
combination of locative and comitative applicative is restricted: only
the order LOC-COM is possible and the comitative argument 'with my
younger sister' is realized structurally (i.e. NOM), as expected.4

(31) Locative/Comitative applicative in Tukang Besi

(Donohue 1999: 249)
ku-wil(a)-isi-ngkene-'e na iai-su
1 SG-g0-L0C-C0M-3 .A NOM younger.sister-lSG.P
(di ompu-su)
OBL grandparent-lSG.P
Ί visited my grandmother with my younger sister'
302 Barbara Stiebeis

In contrast, the combination of comitative and benefactive appli-

cative is opaque: the morphological order is restricted to COM-BEN
(compare (32a/b)); however, only the comitative argument 'with her
friend' can be realized structurally (compare (32a/c)), suggesting a
scope relation COM over BEN.

(32) Benefactive/comitative applicative in Tukang Besi

(Donohue 1999: 248/252)
a. no-homoru-ngkene-ako-'e te iaku na kene-no
3.REAL-weave-COM-BEN-3.A CORE LSG NOM friend-3.P
te wurai te ompu-su
CORE sarong CORE grandparent-lSG.P
'my grandmother w o v e a sarong for me with her friend'
b. * no-homoru-ako-ngkene
c. * no-homoru-ngkene-ako-aku te kene-no
3.REAL-weave-C0M-BEN-LSG.A CORE friend-3.p
te wurai na ompu-su
CORE sarong NOM grandparent-1 SG.P
'my grandmother wove a sarong for me with her friend'

5.2. Diathesis combinations with identical θ-grids

Among the combinations of diathesis markers that differ in their se-

mantic output are some that still have an identical θ-grid. This is due
to the fact that the two orders only differ in the predicate's argument
variables. A clear case is given by the combination of causative and
reflexive/reciprocal. The order V-CAUSE-REFL is predicted to allow
two readings in principle: one in which the causer binds the causee as
in (33a.i) and one in which the causer binds the base object as in
(33a.ii). Note, however, that this binding may violate locality con-
straints of particular languages because the causee is a potential in-
terfering binder. The affix order V-REFL-CAUSE only allows the
reading in which the causee binds the base object as in (33b).
Transparent, restricted and opaque affix orders 303

(33) Combination of causative and reflexive/reciprocal5

a. V-CAUSE-REFL (i) Xy Xu Xs' Bs [ACT(U) & V(u,y)(s)](s')
(ii) λχ Xu Xs' 3s [ACT(U) & V(x,u)(s)](s')
b. V-REFL-CAUSE λχ Χα Xs' Bs [ACT(u) & V(x,x)(s)](s')

Some dialects of Quechua show the predicted affix orders and

their corresponding interpretations:

(34) Combination of causative and reflexive in Quechua

(van de Kerke 1996: 180)
a. maylla-chi-ku-n
wash-c AUSE-REFL-3 SG
(i) 'he lets himself be washed'
(ii) 'he causes himself to wash someone'
b. maylla-ku-chi-n
'he causes someone to wash himself

Other Quechuan dialects (van de Kerke 1996) display a restriction

disallowing the affix order V-REFL-CAUSE, which might be explained
by the fact that these dialects require the antecedent to be the highest
argument of the verb.
Again, Classical Nahuatl exhibits the expected reflexive allomor-
phy. In (35b), the reflexive verb form is causativized so that the
causee binds the internal argument, whereas in (35c) reflexivization
operates on the causative verb form so that the causer binds the re-
flexive. In the first case, the unspecific reflexive is used, in the sec-
ond case the specific reflexive.

(35) Reflexive/causative in Classical Nahuatl (Launey 1979: 186)

a. mo-tlaso'tla-'
'they love one another'
b. ni-kin-ne-tlaso'tlal-tia [REFL-CAUSE]
Ί cause them to love one another'
304 Barbara Stiebeis

c. ni-k-no-tti-tia [CAUSE-REFL]
(i) Ί show myself to him'
(ii) Ί make him see me'

There are also languages that show restricted affix orders for
CAUSE/REFL: Kinyarwanda (Kimenyi 1980) exhibits only the mor-
phological structure [REFL-V-CAUSE], which is structurally ambigu-
ous. However, the interpretation based on the order V-REFL-CAUSE is
blocked; the reflexive must be bound by the highest argument. In
contrast, Tukang Besi (Donohue 1999) does not allow the order V-
CAUSE-REC in the combination of causative and reciprocal.
The combination of causative and applicative also yields two affix
orders that differ in semantic terms. With the order V-CAUSE-APPL,
the applied argument is expected to be related to the complex situa-
tion of causation as in (36a), whereas with the order V-APPL-CAUSE,
the applied argument should be related to the subevent denoted by
the base verb, as shown in (36b).

(36) Combination of causative and applicative

λζ λy λχ λ\ι Xs* 3s [[ACT(u) & V(x,y)(s)](s') & APP(s\z)]
λζ λy λχ Xu Is' 3s [ACT(u) & [V(x,y)(s) & APP(s,z)]](s')

The interpretational differences become evident with instrumental

and locative phrases: is the instrument part of the causing event or
part of the subevent denoted by the base verb? Likewise, does the
locative refer to the place of the causing event or to the place where
the action denoted by the base verb is situated?
However, the order V-APPL-CAUSE is rarely attested, which led
Baker to conclude that this order is ungrammatical in any case. Evi-
dence for such an order is found, for instance, in Chamorro (and with
certain verbs in Tukang Besi, see Donohue 1999). Overtly, the two
affix orders cannot be distinguished because the causative is realized
as prefix and the applicative as suffix. The linking patterns indicate
Transparent, restricted and opaque affix orders 305

the underlying derivation: (37a) corresponds to the order V-CAUSE-

APPL because the applied argument is realized by the nominative
(NOM). Every derivation subsequent to the applicative would render
the applied argument oblique, which is not the case in (37a). In
(37b), the causee is realized by the nominative, which can only be
explained with respect to the order V-APPL-CAUSE. Note that the ap-
plied argument 'Joaquin' is related to the subevent of telling a story.

(37) Causative/applicative in Chamorro (Gibson 1992: 110/122)

a. hu=na'-puni'-i yu' nu i bäbui as Juan
1 SG.E=CAUS-kill-APPL ISG.N OBL the p i g OBL Juan
Ί made Juan kill me the pig' [CAUS-APPL]
b. si tata-hu ha=na'-sasngan-i yu'
NOM father-ISG.P 3sG.E=CAUs-tell-APPL ISG.N
as Joaquin nu i estoria-mu
OBL J OBL the story-2sG.P
'my father made me tell Joaquin your story' [APPL-CAUS]

As overwhelming tendency, the combination of causative and ap-

plicative is realized by means of the opaque affix order V-CAUSE-
APPL. This is true, for instance, for Quechua, as the following exam-
ple illustrates:

(38) Causative/applicative in Quechua (van de Kerke 1996: 192)

mama-y Ana-wan chompa-ta ruwa-chi-pu-wa-n
mother-1SG.P Ana-COM sweater-ACC make-CAUSE-APPL-l-3SG
a. 'in my place my mother made Ana make a sweater'
b. 'my mother made Ana make a sweater in my place'
c. 'my mother made Ana make me a sweater'

Reading (38a) is compositional, whereas the other two readings in

which the beneficiary is related to the subevent denoted by the base
verb are not. Similarly, Chichewa and Nahuatl only exhibit opaque
affix orders with CAUS/APPL. The Chichewa sequence lir-its-ir (cry-
CAUSE-APPL) occurs for an instrumental applicative in which the in-
strument is used in the causing event as well as for a benefactive ap-
306 Barbara Stiebeis

plicative in which the beneficiary is related to the crying event (Hy-

man and Mchombo 1992).
The fact that the combination of causative and applicative shows
the strongest tendency for opaque affix orders among all diathesis
combinations suggests that the difference in meaning is not very cru-
cial and, hence, is not reflected in morphology. The sortal properties
of the applied argument determine to which situation argument it is

6. Diathesis combinations with a potential subsumption


There are also cases in which one of the two affix orders may be se-
mantically or structurally ambiguous such that it subsumes the inter-
pretation or the linking pattern of the inverse order.

6.1. Potential semantic subsumption

Potential semantic subsumption is found with the combination of

applicative and reflexive/reciprocal. The order V-APPL-REFL has two
possible interpretations, namely those indicated in (39a): the highest
argument binds the applied argument as in (39a.i) or the base object
as in (39a.ii).6 The inverse order can only have an interpretation that
is identical to (39a.ii).

(39) Combination of applicative and reflexive/reciprocal

a. V-APPL-REFL (i) Xy λ χ Xs [V(x,y)(s) & APP(s,x)]
(ii) λ ζ λ χ Xs [V(x,x)(s) & APP(S,Z)]
b . V-REFL-APPL λ ζ λ χ Xs [V(x,x)(s) & APP(s,z)]

Therefore, V-APPL-REFL subsumes V-REFL-APPL in principle. Such

a subsumption may result in two compensation strategies: V-REFL-
APPL may either block interpretation (ii), perhaps due to a high-
ranked ambiguity constraint, or it may be blocked morphologically
Transparent, restricted and opaque affix orders 307

by the order V-APPL-REFL because some kind of economy constraint

rules out superfluous morphological structure: the affix order with
the wider extension is preferred.
In Chichewa both affix orders occur and show the full range of
possible interpretations: the order V-APPL-REC is ambiguous, as
(40a/c) show, and does not blockη the inverse order (see (40b)), a fact
that needs further investigation. If the reciprocal is the inner mor-
pheme, a wellformedness constraint requires that a copy of the recip-
rocal is added to the following diathesis marker. 8

(40) Applicative/reciprocal in Chichewa

(Hyman and Mchombo 1992, Alsina 1999: 12)
a. mang-ir-an- 'tie for each other'
b. mang-an-ir-an- 'tie each other for/with/at'
c. alenje a-na-meny-er-an-ά mikondo
hunters.2 CL.2-PAST-hit-APPL-REC-FV spears.4
'the hunters hit each other with spears'

Classical Nahuatl distinguishes the two readings due to the selec-

tion of the reflexive allomorph: if the applied argument is bound, the
specific reflexive is used as in (41a), if, however, the base object is
bound, the unspecific reflexive is used as in (41b).

(41) Applicative/reflexive in Classical Nahuatl (Launey 1979: 196)

a. ni-k-no-kwi-tl-s
1 SG.N-3 SG. A-1 SG.REFL-take-APPL-FUT
Ί will take it for m y s e l f
b. ni-k-ne-tläfi-lia
Ί hide myself from him'

Morphological structures such as those of Nahuatl do not allow

any conclusion with respect to the underlying affix order in case of
subsumptive relations.
308 Barbara Stiebeis

6.2. Potential structural subsumption

Depending on the symmetry of objects, the orders of combining ap-

plicative and passive must be distinguished. Generally, the order V-
PASS-APPL only allows the internal argument of the base verb to be
realized as the subject of a passive verb because the applied argu-
ment is not locally accessible (see (42a)). In languages with asym-
metrical objects, V-APPL-PASS only displays a structure with the ap-
plied argument being the subject of the passive verb (see (42b)). In
languages with symmetrical objects both internal arguments may be
alternatively realized as subject. In this case, the order V-APPL-PASS
subsumes the inverse order with respect to its linking potential.

(42) Combination of applicative and passive

λζ Xy Xs 3x [V(x,y)(s) & APP(s,z)]
λζ λy Xs 3 x [V(x,y)(s) & APP(s,z)]

Note, however, that V-PASS-APPL is excluded in languages in which

applicatives require the presence of a specified structural highest
argument; this is, for instance, relevant in Tukang Besi (Donohue
1999: 297).
In Chichewa, instrumental and benefactive applicatives may only
occur in the order V-APPL-PASS. AS in Tukang Besi, these applicatives
require a specified structural agent argument. With the locative appli-
cative, both orders occur:
Transparent, restricted and opaque affix orders 309

(43) Passive/locative applicative in Chichewa

(Alsina 1999: 10/11, Alsina and Mchombo 1993: 42)
a. ukönde u-ku-lük-ir-idw-ά ρά-mchenga
net. 14 CL.14-PRES-weave-APPL-PASS-FV CL.16-sand.3
(ndi äsödzi)
by fishermen.2
'the net is being woven on the sand (by fishermen)'
b. ukönde u-ku-luk-idw-ir-ά ρά-mchenga
net. 14 CL.14-PRES-Weave-PASS-APPL-Fv CL.16-sand.3
(ndi äsödzi)
by fishermen.2
'the net is being woven on the sand (by fishermen)'
c. pa-mchenga pa-ku-lvk-ir-idw-ά mikeka
CL.16-sand.3 CL.16-PRES-weave-APPL-PASS-FV mats.4
'the beach is being woven mats on'

The order V-APPL-PASS is structurally ambiguous (see (43a/c)) be-

cause this applicative licences object symmetry; hence, both the ap-
plied argument and the base object may be promoted to subject posi-
tion. The fact that both (43 a) and (43b) are acceptable although a
blocking effect is expected needs to be clarified.

7. Conclusions

The preceding discussion has shown that, in principle, diathesis

markers can be combined in any order. In the case of passive and
reflexive and passive and antipassive, an identical output is gener-
ated. It is the intermediate step that might favor one affix order over
the other, depending on the linking constraints of the relevant lan-
guage. The following table summarizes the findings along the fol-
lowing dimensions: (a) whether an identical SF is generated, (b)
whether the two orders yield an identical θ-grid, (c) whether there
may still be differences in the resulting linking patterns despite an
identical θ-grid and (d) whether the diathesis combination is influ-
enced by the specific linking conditions of the language in question.
310 Barbara Stiebeis

The last column indicates the tendency with respect to the actual af-
fix orders: t - transparent, r - restricted, ο - opaque. However, fur-
ther typological studies are necessary to validate the observed pat-

(44) Properties of diathesis combinations

same same same parameterized affix
SF Θ- linking according to order
grid pattern linking profile
PASS/REFL + + + + r
CAUSE/ANTIPASS + -/+ + + t/r
APPL/REFL +/- + + + t/r
PASS/APPL + + +/- + r
CAUSE/PASS + - - + t/r
ANTIPASS/APPL + - - + t/r
CAUSE/REFL - + + + t/r
CAUSE/APPL - + + + o/t
CAUSE/ASS - — - — t
APPL/APPL - - -/+ + r/o

Restricted affix orders mostly result from language-specific con-

straints on linking. Since almost all diathesis combinations interact
with the linking profile of the language, restrictions are expected.
The only invariant diathesis combination (causative and assistive or
iteration of causatives) exhibits the predicted transparency.
Up to now, only two cases of opaque affix orders have been at-
tested: the combination of causative and applicative and multiple
applicatives. In both cases, argument-extending diatheses are com-
bined, which pose a challenge for structural linking in most lan-
guages. Apart from the fact that the factors that trigger opacity need
to be determined, opacity in itself is a serious problem for mor-
pheme-based approaches. If one does not want to make use of late-
insertion models (Distributed Morphology, Halle and Marantz 1993),
post-syntactic filters, morphological circumscriptions (Hyman and
Transparent, restricted and opaque affix orders 311

Mchombo 1992) or covert LF-movements, which are all very power-

ful mechanisms, the question arises as to which alternatives are
available. Moreover, one must ask whether the semantics is proc-
essed at each step of morphological concatenation, which is desirable
from isomorphism, or whether semantic processing may be post-
poned. The latter alternative may be plausible if it can be con-
strained. Therefore, further studies must show whether opacity is
strictly local, involving only adjacent morphemes.
Another challenge is given by subsumptive affix orders. To what
extent do the predicted blocking effects occur? How may they be
modelled? A possible solution might be provided within the frame-
work of Bidirectional Optimality Theory (Blutner 2000).
Finally, a typology of possible affix orders is not easily available
within Optimality Theory because diathesis combinations interface
with different modules of the grammar (syntax, semantics, morphol-
ogy, discourse factors), which might not be evaluated parallel in one


* This paper is based on research that has been conducted within the Sonderfor-
schungsbereich 'Theory of the Lexicon', funded by the German Science Foun-
dation (DFG). I would like to thank Dieter Wunderlich, the audience in Leipzig
and the anonymous reviewer for helpful comments. Throughout the paper, I
will make use of the following abbreviations: '=': clitic boundaiy, '#': deviant
semantic interpretation; A: object agreement, ACC: accusative, ANTIPASS: anti-
passive, APPL: applicative, ASP: aspect, ASS: assistive, BEN: benefactive appli-
cative, CAUSE: causative, CL: class marker, COM: comitative (case/ applicative),
CORE: core case, E: ergative agreement, EX: exclusive, FUT: future tense, FV: fi-
nal vowel, HORT: hortative, IMPF: imperfective, INCOMP: incompletive aspect,
INT: intensifier, LINK: linker, LOC: locative applicative, MOD: modifier, N: sub-
ject agreement, ΝΟΜ: nominative, NOML: nominalization, OBL: oblique, P: pos-
sessor agreement, PASS: passive, PAST: past tense, PL: plural, PRES: present
tense, REC: reciprocal, REFL: reflexive, REP: repetitive, SG: singular, USP: un-
1. I assume that the causative morpheme is a functor on the verb with the follow-
ing Semantic Form: λΡ Xu Xs' 3s [ACT(u) & P(s)](s')
312 Barbara Stiebeis

The causative integrates a verbal predicate Ρ via functional composition, binds

its situational variable and adds the causer argument u and the complex situ-
ational variable s' (Wunderlich 1997b). ACT denotes an unspecified activity.
The causal relation is inferred from conceptual coherence constraints (Kauf-
mann 1995). Modifiers that do not add arguments can also be represented as
functors on verbs.
2. Given that ditransitives should be included in the discussion of reflexive bind-
ing, I do not see a possibility to represent the reflexive morpheme as a functor
on the verb. Therefore, I assume that it might be represented as a template that
operates on the base verb's SF (with consequences for λ-abstraction).
3. Although the causee 'the doctor' does not receive an oblique marker, its posi-
tion (following the subject) renders it oblique in (27b). In (27a) the causee pre-
cedes the subject.
4. Tukang Besi has an unusual linking system. The subject of intransitive verbs is
marked by ΝΟΜ. With transitive verbs, the object is marked by NOM if the verb
exhibits object agreement; otherwise the subject is marked by NOM. Structural
arguments that are not realized by NOM are marked by the 'core marker' te.
5. Note that (33a.i) is preferred with reciprocals and (33a.ii) with reflexives.
6. Interpretation (39a.ii) may be blocked if a language requires the reflexive to
correspond to a structural argument; in languages with asymmetric objects, the
base object is often not structural and, hence, possibly not accessible to reflex-
7. One might speculate that the order V-REC-APPL is chosen in cases in which
base object and applied object both qualify as target for anaphoric binding, i.e.
with the benefactive applicative, and in which the binding of the base object is
to be ensured. V-APPL-REC would then be used for the binding of the applied
argument. With the other applicatives, such an ambiguity is less likely and V-
REC-APPL is avoided as a superfluous form.
8. The numbers in (40c) indicate noun class.


Alsina, Alex
1999 Where's the mirror principle. The Linguistic Review 16: 1-42.
Alsina, Alex and Sam Mchombo
1993 Object asymmetries and the Chichewa applicative construction. In:
Sam Mchombo (ed.), 17-45.
Baker, Mark
1985 The Mirror Principle and morphosyntactic explanation. Linguistic
Inquiry 16: 373-415.
Transparent, restricted and opaque affix orders 313

Baker, Mark
1988 Incorporation: a theory of grammatical function changing. Chi-
cago: The University of Chicago Press.
Blutner, Reinhard
2000 Some aspects of optimality in natural language interpretation.
Journal of Semantics 17: 189-216. [Special volume edited by Petra
Hendriks, Helen de Hoop and Henriette de Swart]
Bresnan, Joan and Lioba Moshi
1993 Object asymmetries in comparative Bantu syntax. In: Sam
Mchombo (ed.), 47-91.
Bricker, Victoria R.
1978 Antipassive constructions in Yucatec Maya. In: Nora C. England,
Collette C. Craig and Louanna Furbee-Losee (eds.), Papers in Ma-
yan linguistics, 3-23. Columbia: Museum of Anthropology, Uni-
versity of Missouri.
Bybee, Joan
1985 Morphology: A study of the relation between meaning and form.
Amsterdam: Benjamins.
Dixon, R. M. W. and Alexandra Y. Aikhenvald
2000 Introduction. In: R. M. W. Dixon and Alexandra Y. Aikhenvald
(eds.), Changing valency: case studies in transitivity, 1-29. Cam-
bridge: Cambridge University Press.
Donohue, Mark
1999 A grammar ofTukang Besi. Berlin: Mouton de Gruyter.
Fortescue, Michael
1984 West Greenlandic. London: Croom Helm.
Gibson, Jeanne D.
1992 Clause union in Chamorro and in Universal Grammar. New York:
Halle, Morris and Alec Marantz
1993 Distributed Morphology and the pieces of inflection. In: Kenneth
Hale and Samuel J. Keyser (eds.), The view from building 20: Es-
says in linguistics in honor of Sylvain Bromberger, 111-176. Cam-
bridge, Mass.: MIT Press.
Hyman, Larry M. and Sam Mchombo
1992 Morphotactic constraints in the Chichewa verb stem. Berkeley Lin-
guistic Society 18, 350-364.
Joppen, Sandra and Dieter Wunderlich
1995 Argument linking in Basque. Lingua 97: 123-169.
314 Barbara Stiebeis

Kaufmann, Ingrid
1995 Konzeptuelle Grundlagen semantischer Dekompositionsstrukturen:
Die Kombinatorik lokaler Verben und prädikativer Komplemente.
Tübingen: Niemeyer.
Kerke, Simon van de
1996 Affix order and interpretation in Bolivian Quechua. Ph.D. Disser-
tation, University of Amsterdam.
Kimenyi, Alexandre
1980 Α relational grammar of Kinyarwanda. Berkeley: University of
California Press.
Krämer, Martin and Dieter Wunderlich
1999 Transitivity alternations in Yucatec, and the correlation between
aspect and argument roles. Linguistics 37: 431-479.
Launey, Michel
1979 Introduction a la langue et a la litterature Azteques. Vol. 1, Gram-
maire. Paris: L'Harmattan.
Mchombo, Sam (ed.)
1993 Theoretical aspects of Bantu grammar. Stanford: CSLI publica-
Muysken, Pieter
1986 Approaches to affix order. Linguistics 24: 629-643.
Pesetsky, David
1985 Morphology and logical form. Linguistic Inquiry 16: 193-246.
Rice, Keren
2000 Morpheme order and semantic scope. Cambridge: Cambridge
University Press.
Speas, Margaret.
1991 Functional Heads and Inflectional Morphemes. Linguistic Review
8: 389-417.
Stiebels, Barbara
1996 Lexikalische Argumente und Adjunkte: Zum semantischen Beitrag
von verbalen Präfixen und Partikeln. (Studia grammatica 39.) Ber-
lin: Akademie Verlag.
Stiebels, Barbara
1999 Noun-verb symmetries in Nahuatl nominalizations. Natural Lan-
guage and Linguistic Theory 17: 783-836.
Wechsler, Stephen
1989 Accomplishments and the prefix re-. Proceedings of the North-
Eastern Linguistic Society 19,419-438.
Transparent, restricted and opaque affix orders 315

Wunderlich, Dieter
1993 Funktionale Kategorien im Lexikon. In: Frank Beckmann and
Gerhard Heyer (eds.), Theorie und Praxis des Lexikons, 54-73.
Berlin: Walter de Gruyter.
Wunderlich, Dieter
1997a Argument extension by lexical adjunction. Journal of Semantics
14: 95-142.
Wunderlich, Dieter
1997b Cause and the structure of verbs. Linguistic Inquiry 28: 27-68.
Direction marking as agreement
Jochen Trommer

1. Introduction

The typological literature (e.g. Croft 1990) assumes that certain

languages mark the (un-)naturalness of predication types with respect
to animacy hierarchies by special affixes. For example, in the
Algonquian language Menominee (Bloomfield 1962), verbs mark
predications which involve 1st or 2nd person subjects and third
person objects by the affix -a- and predications with 3rd person
subjects and lst/2nd person objects by -eko!

(1) a. ke-nan-a-w-aw (kenanawaw)

2-fetch-D- [+3 ]- [-1 +pl]
'you (pi.) fetch him' (Bloomfield 1962: 153)
b. ke-nan-eko-w-aw (kenanekowaw)
'he fetch you (pi.)' (Bloomfield 1962: 154)

In terms of the Algonquianist literature (Hockett 1966), -a marks

a "direct" situation since a speech act participant is supposed to be a
more "natural" subject than a 3rd person argument while -eko, which
appears in the inverse case, marks an "inverse" constellation. Both
types of marking are referred to by the term "direction marking". A
statement from Comrie (1980: 62) describes how direction marking
is related to animacy hierarchies: "Languages which have an
opposition between direct and inverse verb forms build directly upon
the animacy hierarchy: the direct forms are used when the subject of
the transitive verb is higher on the scale of animacy than the direct
object ... The inverse form is used when the subject is lower in
animacy than the object..." The animacy hierarchy typically has the
form in (2a) or (2b):
318 Jochen Trommer

(2) The Animacy Hierarchy:

a. 2 > 1 > 3 > inanimate, or
b. 1 > 2 > 3 > inanimate

In this article, I propose to analyze direction marking in a much

simpler way, namely as transitive agreement. Under this account,
direction markers do not refer to feature hierarchies as in (2), but are
governed by universal markedness constraints which correspond to
prominence hierarchies in a well-defined way. Constraints are taken
to be violable and ranked in the sense of Optimality Theory (OT,
Prince and Smolensky 1993). Crosslinguistic variation in direction
marking, which has been interpreted as evidence for language-
specific prominence hierarchies (e.g. Croft 1990), follows from
different ranking of these constraints.
The claim that direction marking is agreement receives additional
support by the fact that another aspect of agreement in the discussed
languages is also regulated by constraints referring to feature
hierarchies: the competition for feature realization by case-less
agreement affixes. This phenomenon will be analyzed in section 3
using the version of OT-morphology introduced in section 2. The
account is then extended to direction markers themselves (section 4).
Section 5 deals with the distribution of zero direction marking. The
article concludes with a comparison of the analysis with alternative
analyses of direction marking (section 6) and a short summary
(section 7).

2. The Framework

The formal framework I will adopt in this paper is Distributed

Optimality (DO, Trommer 2001), a constraint-based morphological
framework based on Optimality Theory (OT, Prince and Smolensky
1993). DO assumes a "minimalist" conception of OT-morphology
embracing the restrictive assumptions in (3):
Direction Marking as Agreement 319

(3) a. Locality: Morphology interprets the output of syntax.

Morphological constraints can only refer to small word-
like units, not to syntactic phrases.
b. Inclusiven ess: All morphemes in the morphological
output have to be licensed by a syntactic item whose
features they subsume. There is no insertion of features.
c. Free Ranking: All rankings of the assumed constraints
yield a possible grammar.

All these assumptions are potentially problematic for an adequate

formalization of direction marking. If direction markers express the
relation of subject and object, they must have access to at least the
clause level. If direction markers evaluate the relation of subject and
object with respect to a feature hierarchy, the morphosyntactic featu-
res of these markers cannot simply subsume the features of syntactic
structures which do not contain feature hierarchies in any sense.
Free Ranking is a standard claim in OT, but it is systematically
violated by Harmonic Alignment (see section 6.) which plays a
prominent role in OT-formalizations of markedness hierarchies.
Since feature hierarchies seem crucial to direction marking, it is a
special challenge to test whether fixed constraints hierarchies can be
dispensed with and Free Ranking can be maintained.
In DO, word forms are characterized by bundles of morphosyn-
tactic features which derive from syntactic derivations. For example,
for (la) we can assume the following representation2:

"+NOM" "+ACC "

(4) [+V] +2-1 +3
+pl +sg

Vocabulary items, such as na n:[+Y], or Ae:[+2] associate

morphosyntactic features with pieces of sound and are used to spell
out these feature bundles. PARSE constraints require that certain
feature combinations are realized by vocabulary items of the output
form. If a feature is not realized in the output, the corresponding
constraint is violated.
320 Jochen Trommer

(5) OT-tableau for the input in (4)


1 [CAT] [P] [NUM]
a. «y ke- na n -a· -w -a w *
[+2] r+Vl D [+31 [-1+pl]
b. na n -a- -w -a w *
r+vi d [+3] r-i+pii
c. ke- na n -a· -w **
[+2] [+V] D [+3]

Thus, PARSE [NUM] in (5) is violated by each number feature

from the input (+sg, +pl) that is not realized in the output, and
PARSE [P] correspondingly for person features (+2,+3, -1). Each
violation is depicted by a star in the tableau.
Following the principles of OT, that candidate is optimal (indica-
ted by ts·) which induces the least serious constraint violations. In
(5), this is kena na wa w because the only constraint it violates once
(PARSE [NUM] for +sg) is also violated once or twice by the other
candidates which violate additional constraints. Crucially, even the
optimal candidate is not perfect: it violates PARSE [NUM]. The rea-
son is that there is no vocabulary item in Menominee that expresses
[+sg]. Feature realization can also be prevented by other constraints.
Such constraints will be introduced in the following sections.

3. Hierarchy-Based Competition

The term "hierarchy-based competition" is intended to cover cases

where person features compete for realization and asymmetries
between the features determine which one surfaces. In Turkana
(Dimmendaal 1983), finite verbs agree with subjects and objects, and
the same person markers are used for subject and object agreement.
However, object agreement is suppressed if both arguments are
participants (6a), or both are non-participants, and subject agreement
is suppressed if the subject is 3rd person and the object [-3] (6b):
Direction Marking as Agreement 321

(6) a. k-a-ram-i
Ί will beat you' (Dimmendaal 1983: 122)
b. k-ä-mn-ä
'he loves me' (Dimmendaal 1983: 123)

In effect, person agreement is always marked by one single affix.

Thus Turkana person agreement exhibits a phenomenon that has been
one of the major motivations for OT: a rule conspiracy. Different
processes (suppression of subject and object agreement) "conspire"
to reach the same goal: the restriction of agreement to a single person
affix. I propose to capture this fact by the constraint in (7), which is
violated by any word form that contains more than one non-
portmanteau affix marked for person.


As we will see in the following, many languages exhibit the same

or related constraints. Thus, BLOCK [PERSON] is not a language-
particular stipulation, but seems to be part of the universal constraint
set of Universal Grammar. Note that the restriction of (7) to person
affixes is necessary, since other inflectional affixes like aspect,
direction and number marking can coocur with the person affixes3.
What remains to be done is to explicate the choice for which affix is
actually suppressed to satisfy (7). Given the prominence scales in
(8a,b), there is a simple principle behind the affix choice in these
cases (8c):

(8) a. Subject > Object

b. 1/2 > 3
c. Choose the affix that corresponds to the higher scale

However, (8c) cannot be maintained in its most general form since

it leads to a contradiction for 3 1 predications, where (8b) seems to
322 Jochen Trommer

outrank (8a). Thus, I propose to replace (8c) by (9a), which gives us

the constraints in (9b,c).

(9) a. If there is a prominence scale A > Β

there is a PARSE constraint PARSE [ P ] ^
b. PARSE [P]["3W+31
c. PARSE [p][+nom]/[+acc]

PARSE [P]*® is to be read as follows: Realize the person features

of a syntactic head containing A if this is adjacent to a head
containing B. Thus, PARSE [P][_3J/[+3] requires that the person features
of a 1st or 2nd person head are spelled out by an affix, if it is
neighboured by a 3rd person head. Now it is crucial how these
constraints are ranked, since for the evaluation procedure of OT
optimization for higher constraints is always more important than
optimization for lower constraints. For Turkana, I will assume that
the ranking is BLOCK [P] » PARSE [Ρ]ι"3]/[+3] » PARSE
P>][+nom]/[+acc] I f o n e a r g u m e n t i s [ + 3 ] and the other [-3], we get the
following tableau:

(10) Mixed:[+Nom +3], [+Acc +1]2


[pjHM+3] |pj[+Nom]/[+Acc]
[PI [P]
a. ts- r+1], * *

b. Γ+31, *!
c. Γ+1Ί, Γ+31, *!

Spell-out of both heads would violate BLOCKING, therefore

(10c) is discarded (depicted by "!" after the relevant violation mark).
Suppression of the [-3] head (10b) would violate PARSE [P]["3J/[+3],
which is also discarded. The only remaining and hence optimal
candidate is (10a).
If both arguments are [-3], PARSE [Ρ][_3Μ+3] becomes irrelevant,
and PARSE [P][+N0MW+ACC] favours the appearance of the nominative
Direction Marking as Agreement 323

(11) Only SAP Arguments: [+Nom +2], [+Acc +1]2


[PI rp][-3]/[+3] j-pj[+Nom]/[+Acc]

.. „

a. is- [+21,
b. r+n, *!
c. [+21, [+11, *l
The same is true if both agreement heads are [+3]. While subject
and object agreement do not differ in morphological expression, the
account predicts that the surfacing marker is coindexed with the
According to the principles of OT, all possible rankings of
constraints should yield an attested or at least plausible language
type. In the following, I will show that this indeed holds for the
proposed constraints. If PARSE [P] is ranked above BLOCK [P],
both Agrs + Agr 0 are realized (12):

(12) PARSE [P] » BLOCK [P]

Otherwise, there are three possibilities: If PARSE [P][+NOMJ/[+ACC]

and BLOCK [P] are ranked above PARSE [Ρ]Η]/[+3] (13a), only
subject agreement is realized. This can be observed in a standard
Indo-European language such as English. If PARSE [Ρ][-3Μ+3] and
BLOCK [P] are above PARSE [P][+N0MW+ACC], we get the distribution
of Turkana (13b). The third possibility is that PARSE [P]h3]/[+3] and
PARSE [pfNOMM+Acc] b o t h dominate BLOCK [P] (13c) (PARSE
constraints are abbreviated by the respective superscripts, the {}
brackets enclose constraints whose ranking with respect to each other
is irrelevant, "&" is used to combine different subrankings, i.e., each
of the rankings in a., b. and c. must be combined separately with
324 Jochen Trommer

(13) [BLOCK [P] » PARSE [Ρ]]

[ n c 1
a· - [ { ; l ä }

b &
· [{ BLOCK m } " [^«""I'I+H

- [ { Ä L ] } »"««in]
In languages of the type of (13c), subject agreement should always
be realized, but object agreement should be suppressed, unless the
object is higher on the person hierarchy than the subject. This corre-
sponds closely to the analysis of Quechua proposed in Lakämper and
Wunderlich (1989: 127):

(14) a. Object-Subject Constraint (OSC): The object may be

marked separately from the subject only if it refers to
person that is higher on the hierarchy of person than the
person to which the subject refers
b. Hierarchy of person: 1 > 2 > 3

What distinguishes Quechua from (14c) is then only the relevant

hierarchy (1 > 2 > 3 instead of 1/2 > 3). Since the latter type of
hierarchy is also well-documented, (14c) is also a plausible language.

3.1. Capturing different Hierarchies

Indeed, there are languages which exhibit similar blocking pheno-

mena as Turkana, but according to slightly different hierarchies:

(15) a. Turkana: 1/2 > 3 | Nom > Acc

b. Nocte: 1 > 2 > 3
c. Menominee: 2 > 1 > 3
Direction Marking as Agreement 325

In contrast to Turkana, in these languages, competition under

blocking is resolved exclusively with reference to person features.
However, in some cases 2nd person agreement wins over 1st person,
and in others it is the other way around. This can be integrated in the
proposed account by assuming the more elaborated hierarchy in (16a)
and replacing (9a) by (16b):

(16) r Μ ι
l [+2] J > [+3]
b. If A is distinct from B, and A Β on a prominence scale
S then there is a PARSE constraint PARSE [P]*®

This licenses the PARSE constraints in (17):

(17) a. PARSE PER[+l]/[+3]

b. PARSE PER[+2W+3]
c. PARSE PER[+1]/t+2]
d. PARSE PER[+2]/[+1]

Assuming that PARSE PER[+Noml/[+Acc] is dominated by BLOCK

[P], we can now account for all the patterns in (15). Turkana (18a),
Dumi (18b), Menominee (18c) and Quechua (18d):

(18) [BLOCK [P] » PARSE [P]] &

Γ [+1M+3] 1
a. [+2]/[+3] » [+Nom]/[+Acc] » { [ ™ }
r [+i]/[+3] ι
b. [+2]/[+3] » i\ [+Nom]/[+Acc] J1
f [+1M+3] 1
c. < [+2]/[+3] »J 1
[+2]/[+l] \ [+Nom]/[+Acc] j
326 Jochen Trommer

[+2]/[+3] » BLOCK [P] » [+2]/[+l]

3.2. Further parameters of competition

For implementing the effect of the person hierarchy in Turkana, I

have assumed constraints such as (19a), but equally well we could
take the slightly different (19b):

(19) a. PARSE [P]™+2>

Realize agreement of a [+1] head in the context of a
[+3] head
b. PARSE [+l]/[+2]
Realize [+1] agreement of a head in the context of a
[+3] head

The outcome is identical in both cases, but it is not in others.

Thus, to account for the fact that person agreement is with the
nominative argument, we have to choose (20a) not (20b) since the
agreement markers itself can realize person features of subjects and
objects, and hence cannot be marked for a case feature:

(20) a. PARSE [P]t+Nom]/[+Accl

Realize person agreement of a [+Nom] head in the
context of a [+Acc] head
b. PARSE [+Nom]/[+Acc]
Realize [+Nom] agreement of a head in the context of a
[+Acc] head

The null assumption is now that constraints for person as in (19)

work in the same way, i.e., the correct formulation is (19a) not (19b).
However, there is strong evidence against this assumption, as can be
seen if we look at the Menominee person prefixes ne-:[+1], ke-\[+2]
and o-:[+3]. If there is a [+2] argument (and no [+1] argument), ke-
Direction Marking as Agreement 327

appears: In a parallel fashion, ne- appears if one of the arguments is

[+1] (and none [+2]).
Now, there are two situations where both items would be licensed.
In transitive forms where one argument is 2nd and the other 1st
person and in forms with an inclusive ([+1 +2]) plural. In both cases,
ke- appears:

(21) ke-pose-q
'we (inc.) embark' (Bloomfield 1962: 150)

However, (22b) in this case will not lead to the correct results
since for [+1 +2] ke- as well as ne- realize agreement with a [+1]
head. The requirement that [+1] (and hence ne-) appears is only
captured by (22a):

(22) a. PARSE [+2]/[+l]


Thus I conclude that different features of prominence hierarchies

involve slightly different types of PARSE constraints and (16b) has
to be formulated more liberally:

(23) If A is distinct from B, and A Β on a prominence scale S then

there is a PARSE constraint PARSE [ P ] ^ or PARSE A/B

Also BLOCK constraints seem not to be uniform. Thus, in

Turkana, only person affixes seem to be subject to blocking while in
Warlpiri only number affixes are involved. In Menominee, apart
from the person prefixes, there are other affix types (person suffixes
and number suffixes), which also involve blocking, but crucially
there is no blocking between different affix types. Thus, there seem
to be different BLOCK constraints referring to different affix classes.
Ideally, all types of blocking should be reduced to a single constraint,
but I leave this open for future research.4
328 Jochen Trommer

3.3. Advantages of the account

Hierarchy-driven competition has only scarcely been treated in the

literature. Other formal accounts such as Wunderlich (1996) for the
Menominee prefixes, rely heavily on lexical stipulation. The
typological literature only notes the phenomenon in passing. For
example Croft (1990: 113) writes: "In a number of languages found
scattered around the world, the transitive verb agrees not with the
subject (A), or the absolutive (P), but whichever of A and Ρ is higher
on the person hierarchy."5 The account proposed in this section
improves in several respects over this rough characterization: First, it
seems not to be true that blocking in general involves agreement as a
whole.6 As has been shown for Turkana, it often targets only specific
types of agreement. Second, the constraint-based account relates
hierarchy-based competition to more common systems. Both
agreement types simply emerge from different constraint rankings.
Finally, systems that show partial violation of BLOCKING, such as
Quechua can also be accounted for by constraint ranking.

4. Competition among Direction Markers

In this section, I take Menominee to show how the distribution of

direct and inverse markers in a single language also follows from
hierarchy-based competition. For reasons of space, I will only treat
two of the five direction markers in Menominee, but the analysis
extends straightforwardly to the missing markers (see Trommer 2001
for details).
Recall from (1) in section 1. that -a- is used if the subject is 1st or
2nd person, and the object is third person, while -eko is used in the
converse constellations.
The first question is now how to represent direction markers.
Recall the assumption from section 2. that affixes encode a subset of
the syntactic features they interpret. Since the typical distribution of
direction markers is in transitive verb forms with person/number
affixes that are not specified for case, it is natural to assume that
Direction Marking as Agreement 329

direction markers express just the case features left unexpressed by

other affixes and have roughly the form in (24):

(24) [+Nom ...][+Acc ...]

Assuming the constraint PARSE [Case], this explains why

direction affixes must appear. The presence of [+Acc] also ensures
that they only appear in transitive contexts. But additional feature
specifications are needed to characterize different direction markers
in a given language such as -a- and -eko in Menominee.
Note first that the distribution of these markers is actually much
more complex than stated above, as shown in (25). Both markers
appear in combinations with an "unspecified actor'" ([3 -spec +an]),
and in combinations of inanimates ([3 -an]) with other 3rd person
arguments. Further, if both arguments of the verb are 3rd person
animate, direction marking is sensitive to the contrast between
proximate ([3 -obv +an]) and obviative ([3 +obv +an]) NPs, where
"proximate" corresponds roughly to NPs referring to topic
information and "obviative" to NPs introducing new discourse
referents. In transitive predications, either the subject or the object
(but not both) are obviative. Apart from the unspecified actor case,
which has no corresponding patient category, -a- represents the
mirror image of -eko.

(25) Distribution of -a- and -eko.

-a- -eko
[1/2 +an] -> [3] [3] - > [1/2 +an]
[3 -spec +an] -> [3 +spec] [3 -spec +an] - > [1/+2 +an]
[3 -obv +an] -> [3 +obv +an] [3 +obv +an] — > [3 +obv +an]
[3 -obv +an] -> [3 -an] [3-an] — > [3 -obv +an]
[3 +obv +an] ->· [3 -an] [3 -an] [3 +obv +an]

A crucial generalization emerges from (25): Whenever -a- is used,

the subject is [+an]; if -eko appears, the object is [+an]. Since this
feature is not realized by any other agreement affix in Menominee, it
330 Jochen Trommer

is plausible that it is also part of the specification of the direction

markers as in (26):

(26) -a- <-> [+Nom +an] [+Acc]

-eko <-» [+Nom] [+Acc +an]

This still does not account completely for the distribution of -a-
and -eko since for many cases both markers would be licensed. For
example, if one argument is 1st person and the other proximate/ani-
mate, both arguments are animate; hence, both markers should be

(27) a. [+Nom +1 +an] [+Acc +3 -obv +an]

b. [+Nom +3 -obv +an] [+Acc +1 +an]

But recall that the feature [+an] is only realized by the direction
markers. Hence, PARSE constraints referring to this feature will have
an immediate effect on the distribution of these markers. The basic
idea is now that for certain categories the feature [+an] is more
typical than for others. For example, non-third person arguments are
typically animate, while this is only true to a much more restricted
degree for 3rd person arguments. To translate this observation in
terms of constraints, we can assume the following PARSE constraint:

(28) PARSE [+an][+1M+3]

This ranking has the effect to favour -a for (27a), and -eko for
(27b). Note that the case features of the feature structures in the
direction markers do not allow for any other coindexing than the ones
in the depicted candidates:

(29) Input: [+Nom +1 H-an^ [+Acc + 3 -obv +an]2

PARSE [+anf1J/[+3]
a. w -a- [+Nom+an], [+Acc]2
ύ. -eko [+Nom], [+Acc +an]2 *!
Direction Marking as Agreement 331

(30) Input: [+Nom +3 -obv +an]! [+Acc +1 +an]2

PARSE f+an][+1]/t+31
a. -a- [+Nom+an], [+ACC] 2 *!
b. «τ -eko [+Nom], [+Acc +an]2

Now, (28) reflects a prominence hierarchy for [+an] in the same

way as the constraints in (16) do for person. Thus we can account for
the other cases where both markers are licensed in a completely
parallel fashion:

ο» - {i®>[y
b. If A is distinct from B, and A Β on a prominence scale
S then there is a PARSE constraint PARSE f+an]^

This leads among others to the PARSE constraints PARSE [+an][+3

-specM+obv] a n d Ρ A R S E [+an][+1M+3 "specl, which ensure that -a- is chosen
in unspecified actor constructions with another 3rd person argument,
but -eko, if the other argument is 1st person. In a similar way (31a)
ensures the correct distribution of both markers in all relevant cases."

5. Zero Direction Marking

Even direction-marking languages do not necessarily mark all

transitive predications by direction markers, i.e. for many
constellations the direction markers are zero. Interestingly, zero
marking shows an asymmetry between direct and inverse markers:
There are languages with inverse markers and without direct markers
but no languages with direct markers only (cf. Croft 1990: 137).
Thus, in Turkana no direction marking obtains for 1/2 3
332 Jochen Trommer

(32) a. ä-mm-ä
Ί love her' (Dimmendaal 1983: 69)
b. k-ä-mn-ä
'she loves me' (Dimmendaal 1983: 123)

Now, assuming that k- is specified maximally simply, as [+Nom]

[+Acc], and that its appearance is favoured in all transitive forms by
PARSE constraints, it is natural to assume a counter-constraint that
prevents its appearance in direct contexts. This cannot be a
BLOCKING constraint, since all affixes that could block k- in (31a)
should also block it in (31b). Hence I propose to use a different
constraint type, which I call IMPOVERISH according to the
impoverishment rules from Halle and Marantz (1993):

(33) IMPOVERISH [+Nom][+Acc] / [+Nom -3][+Acc +3]

This means that, no [+Nom][+Acc] affix should occur in a form

which corresponds to the underlying feature structures [+Nom -3]
and [+Acc +3].
If all relevant IMPOVERISHMENT constraints for [+Nom]
[+Acc] do as (33) and refer to direct constellations, this account
predicts the observation that there are direction-marking languages
with only inverse markers but none without. To see this, assume for
the moment that (33) is the only relevant IMPOVERISHMENT
constraint, abbreviated in the following as "IMPOVERISH CASE",
and look at the two possible rankings for this and PARSE CASE. If
IMPOVERISH CASE is ranked higher than PARSE CASE, we get
an inverse language (only inverse markers):

(34) Inverse language/inverse: [+Nom +3][+Acc -3]


a. «y I-V
b. V *!
Direction Marking as Agreement 333

(35) Inverse language/direct: [+Nom -3][+Acc +3]


a. D-V *!
b. V Ä *

If the ranking is reversed, a full direction marking language

emerges (inverse and direct markers):

(36) Direction language/inverse: [+Nom +3][+Acc -3]


a. <r I-V
b. V *!

(37) Direction language/direct: [+Nom -3][+Acc +3]


a. er D-V ®-S ι β ·: 11
ψΙ ' &M mi* -' •
b. V *!

Since direction marking again refers to different parts of feature

prominence scales, I propose the following correspondence relation
between IMPOVERISH constraints and scales.
(38) a
· {[«]}>t+3i

b. If A is distinct from B, and A Β on a prominence scale

S then there is a IMPOVERISH constraint
IMPOVERISH [+Nom][+Acc] / [+Nom A][+Acc B]

If all resulting constraints are ranked below PARSE [CASE] we

get a language where all transitive predicates have a direction marker
(such as Menominee, see section 4). If single IMPOVERISHMENT
constraints are ranked higher, only the corresponding inverse
configurations will be marked. This predicts i.a. languages where
334 Jochen Trommer

direction is marked in 3 -> 1/2 and 2 -» 1 predications, but not in any

other. This seems to be the case in Nocte (Gupta 1971). Other
languages make reference to different prominence scales. Thus, in
Kutenai (Dryer 1994), direction marking appears only for different 3rd
person arguments (obviative and proximative, see section 4. for
discussion of these features). Note that also markers not traditionally
thought of as direction markers fall under this characterization. Thus
in Ancash Quechua (Lakämper and Wunderlich 1989) there is one
portmanteau marker for 2 -> 1. In my account there is no formal
difference between portmanteau affixes for subject and object
agreement and direction markers. Thus, this instantiates the case
CASE/[+3][+l/+2] are ranked above while IMPOVERISH
CASE/[+2][+l] is ranked below PARSE [CASE],
There is ample evidence that direction markers mark agreement.
In the most extreme form this leads to large paradigms of direction
markers as in Arizona Tiwa (Klaiman 1993), with different markers
for specific number and person features. In other languages (e.g.
many Algonquian languages) there are special direction markers for 2
-> 1 and 1 -> 2 predications bleeding the direction markers used
elsewhere in these languages.
Thus it is obvious that direction markers encode agreement
features. The question is, whether they additionally refer to feature
hierarchies. I suggest that they do not and that the hierarchy effects
are captured by IMPOVERISHMENT constraints, which are
independently necessary to account for other aspects of agreement
morphology Thus in Menominee, the [-3] marker -m which marks
the presence of 1st or 2nd person arguments is suppressed in lsg ->
2sg forms:

(39) a. ke-natom-enene-m-enaw
'we call you (sg./pl.)' (Bloomfield 1962: 157)
b. ke-na tom-enene (kena tomen)
Ί call you (sg.)' (Bloomfield 1962: 157)
Direction Marking as Agreement 335

Again, this cannot be the effect of a surface filter since the context
of (39b) is a subset of the contexts of (39a); thus, everything that is
blocked in a. should also be blocked in b. Assuming IMPOVER-
ISHMENT, the data can be accounted for by a high-ranked IMPO-
VERISH [-3]/[+l +sg][+2+sg].

6. Alternative Analyses

Functional approaches to direction marking assume that direction

markers in some way encode the "well-behavedness" of the
alignment between Subject/Object and animacy features. As Aissen
(1999) notes, this implies - given the cross-linguistic diversity of
direction markers - the problematic assumption of language-specific
hierarchies. What is worse, hierarchy-based competition is often
driven by slightly different hierarchies than direction marking. Thus
in Blackfoot (DeLancey 1985: 643), 2nd person is ranked higher than
1st person for prefix selection, while 2 1 verbs carry the inverse
marker. This suggests the hierarchy ... 2 > 1 ... for prefixes and ... 1
> 2 ... for direction markers. Hence, in an account that relies on
direct reference to feature hierarchies in the grammar of single
languages, the hierarchies must in principle be not only language-
specific, but also construction-specific. A constraint-based account
avoids this un-desirable consequence, while incorporating the insight
that prominence hierarchies govern the distribution of direction
Aissen's OT-analysis of direction marking (Aissen 1999) is much
more limited in empirical domain than the one proposed here and
problematic in several ways. To see this, I briefly discuss, how she
would treat a language similar to Turkana which marks 3 -> 1,2
predicates through an inverse marker, and leaves all other
predications unmarked. At the heart of an Aissen-style analysis of
such a language is the fixed constraint ranking in (40):

(40) *0D & *Subj/3 & *Obj/l,2 » *0 D & *Subj/l,2 & *Obj/3
336 Jochen Trommer

*0 D marks non-realization of the direction category, *Subj/3 a 3rd

person subject and *Obj/l,2 a 1st or 2nd person object. Thus the first
constraint in (40) (*0 D & *Subj/3 & *Obj/l,2) which is formed from
these three constraints by local conjunction (indicated by "&") marks
the situation, where the subject is 3rd person, the object lst/2nd and
there is no direction marking. (40) can hence be paraphrased as in

(41) Mark Direction for 3 1,2 » Mark Direction for 1,2 —> 3

These two constraints interact with the economy constraint

*StructD which marks direction marking in general. There are three
possible situations:

(42) a. *StructD » 3:1,2 » 1,2:3 => no direction marking

b. 3:1,2 » *StructD » 1,2:3 => Inverse, but no direct marking
c. 3:1,2 » 1,2:3 » *StructD => Inverse and direct marking

As long as 3 -> 1,2 and 1,2 -> 3 are not reranked, this derives the
same typology as the one proposed in section 5. The ranking of the
constraints in (41) is systematically related to hierarchies by a
technique called harmonic alignment, which derives fixed constraint
rankings from prominence hierarchies. Thus the role of harmonic
alignment is roughly the same as for the statements in (16) and (38)
in my approach.
While this account partially derives the same results as my ana-
lysis it comprises several limitations and problems. First, as Aissen
notes herself (Aissen 1999: fri. 21), her account does not extend to
languages with inverse and direct marking, such as Menominee.
Second, in her account, it is completely unclear what a direction
marker formally is. In the functionalist literature a direction marker is
supposed to be a marker that encodes inverse or direct configurations
with respect to language specific feature hierarchy, but if this would
be correct the constraints on its distribution would be unnecessary.
Third, Aissen does not capture the systematic relation between direc-
tion markers and the caseless nature of other agreement morphology
Direction Marking as Agreement 337

in direction marking languages. This of course follows from the

uncertain formal nature of direction markers in her analysis. Finally,
Aissen has to stipulate a fixed order of constraints. While this is
common practice in the OT literature, it goes against the spirit of OT,
where constraints are supposed to be freely rankable. Of course, it
remains to be seen, if other analyses stated in terms of harmonic
alignment can be rephrased without fixed constraint orders.

7. Summary

In this article, I have analyzed direction markers as agreement

affixes. The distribution of direction markers crosslinguistically was
accounted for by different rankings of universal but violable
constraints, which were linked to feature hierarchies. All assumed
constraint types were independently motivated by their usefulness to
account for the behaviour of simple agreement markers in direction
marking languages. The proposed approach has been shown to be
superior to other accounts of direction marking. Note that the
proposed account obeys all the restrictive assumptions outlined in
section 2: Thus, direction marking is formalized locally referring
only to adjacent agreement heads, (locality), all assumed affixes en-
code subsets of the relevant syntactic features (inclusiveness), and all
assumed constraints can be reranked leading to other possible
grammars. The analysis hence supports a restricted version of OT-


1. The following abbreviations are used: Agr = agreement, Acc = accusative case,
an = animate, D = direction marker, I = inverse marker, inc. = inclusive
(plural), Nom = nominative case, Num = number, obv = obviative, Ρ = person,
pi = plural, sg = singular, spec = specified (actor).
2. Agreement with subjects of transitive clauses will be represented in the
following by [+Nom] and object agreement by [+Acc]. Obviously, this notation
has to be refined to extend to ergative languages.
338 Jochen Trommer

3. A similar constraint type ("the monosuffix constraint") is assumed by Aronoff

and Fuhrhop (2002) to account for coocurrence restrictions in English
derivation and inflection.
4. Note that in standard Indoeuropean languages, where person and number
agreement is fused, and there is no direction markings, different blocking
constraint will conspire to reduce agreement to one single affix.
5. With reference to Tangut, which functions roughly along the lines of Turkana.
6. But see footnote 4.
7. In "unspecified actor forms", the subject is unspecified in a passive like-
manner. Bloomfield indeed calls these forms passives, and I will follow him
here in the translations.
8. Note that 1 —> 2 and 2 -> 1 predicates have different direction markers (cf.
Trommer 2001 for discussion).
9. An anonymous reviewer notes that the functional approach predicts correctly
that inverse marker are phonologically longer than direct markers, while the
constraint-based account does not. While this observation seems to be correct
in many cases, there are some problems. Thus the shortest direction marker in
Menominee is -e which is used in forms with 1st person objects. Since at least
part of these forms (3 1) are clearly inverse, and also 2 -> 1 forms are at
least ambiguous between direct and inverse (DeLancey 1985: 644), -e should
be at least as long or longer as the direct marker -a . But obviously it is the
other way around.


Aissen, Judith
1999 Markedness and subject choice in optimality theory. Natural
Language and Linguistic Theory 17: 673-711.
Aronoff, Marc and Nanna Fuhrhop
2002 Restricting suffix combinations in German and English. To
appear in: Natural Language and Linguistic Theory.
Bloomfield, Leonard
1962 The Menomini Language. New Haven: Yale University Press.
Comrie, Bernard
1980 Inverse verb forms in Siberia. Folia Linguistica Historica 1: 61-
Croft, William
1990 Typology and Universals. Cambridge: Cambridge University
Direction Marking as Agreement 339

DeLancey, Scott
1985 An interpretation of split ergativity and related patterns.
Language 51: 626-657.
Dimmendaal, Gerrit Jan
1983 The Turkana Language. Dordrecht: Foris Publications.
Dryer, Matthew S.
1994 The discourse function of the Kutenai inverse. In: Givon, Talmy
(ed.), Voice and Inversion, 65-100. Amsterdam: John Benjamins.
Gupta, Das
1971 An Introduction to the Node Language. Shillong: North-East
Frontier Agency.
Halle, Morris and Alec Marantz
1993 Distributed Morphology and the pieces of inflection. In: Hale,
Kenneth and Keyser, Samuel Jay (eds.), The View from Building
20,111-176. Cambridge, M.A.: MIT Press.
Hockett, Charles F.
1966 What Algonquian is really like. International Journal of
American Linguistics 321: 59-73.
Klaiman, Miriam H.
1993 The relationship of inverse voice and head-marking in Arizona
Tewa and other Tanoan languages. Studies in Language 17: 343-
Lakämper, Renate and Dieter Wunderlich
1989 Person marking in Quechua - a constraint-based minimalist
analysis. Lingua 105: 113-148.
Prince, Alan and Paul Smolensky
1993 Optimality theory: Constraint interaction in generative grammar.
Ms. Technical reports of the Rutgers University Center of
Cognitive Science, New Brunswick.
Trommer, Jochen
2001 Distributed Optimality. Ph.D. dissertation, University of
Wunderlich, Dieter
1996 A minimalist model of inflectional morphology. In: Chris
Wilder, Manfred Bierwisch and Hans-Martin Gärtner (eds.), The
Role of Economy Principles in Linguistic Theory, 267-298.
Berlin: Akademie-Verlag.
On the semantics of cases*
Ilse Zimmermann

The present study is concerned with the semantics of cases in Rus-

Within a minimalist framework of sound-meaning correlation, it
is argued that structural cases of complements can be characterized
by abstract semantico-syntactic features which correspond to the se-
mantic hierarchy of argument expressions and which are systemat-
ically interrelated with their morpho-syntactic case realizations.
As to cases of adjuncts, the analysis focuses on the semantics of
the instrumental. It is shown how morpho-syntactic case features of
adjuncts get a semantic interpretation and how one can account for
the polysemy of the instrumental by assuming context-dependent
specifications of semantic parameters.

1. Objectives

On the basis of data from contemporary noncolloquial Russian, this

paper is concerned with the semantics of cases. It tries to bring to-
gether recent developments of the two-level semantics, the linking
theory of Lexical Decomposition Grammar (LDG), as elaborated by
Dieter Wunderlich and his colleagues (Wunderlich 1997a, Stiebels
1996, 2000a) and Roman Jakobson's case characterizations (Jakob-
son 1936, 1958). The main concern will be complements of verbs
and of event nominalizations with structural cases and adjuncts in the
instrumental. The particular questions to be raised are the following:
342 Ilse Zimmermann

- How are case forms of complements and adjuncts interrelated

with the semantics of these constituents?
- Which types of cases are to be discriminated?
- Which types of configurations and of case features are in-
- Which complements of lexical categories count as structural
- Which rules guarantee the correct case realizations of argument
- How do adjuncts get their cases and how are they interpreted
- How can one cope with the polysemy of adjunct cases?

An argument will be made for a strict differentiation of universal

semantico-syntactic and language-specific morpho-syntactic case
features and for the existence of regular correspondences between
them. As for the semantics of cases, it will be shown that structural
cases of complements have a systematic semantic background and
that for the morpho-syntactic cases of adjuncts we have to assume
semantic parameters which are specified at the level of Conceptual
The paper is organized as follows: Section 2 characterizes the
theoretical framework, first of all the division of labour between
morphology, syntax and semantics and the interface role played by
the argument structure of lexical entries of functor expressions.
Section 3 demonstrates the far-reaching parallelism of verbal
constructions and their nominalizations, with systematic case
variation of structural arguments. Section 4 is concerned with case
licensing. It presents a system of rules correlating abstract semantico-
syntactic case features in the argument structure of lexical governors
and the morpho-syntactic case features of the corresponding argu-
ment expressions. Section 5 concentrates on the semantics of the
instrumental as adjunct case. Section 6 is a summary.
Sections 2-4 are a shortened version of Zimmermann (2002).
On the semantics of cases 343

2. The framework

Within a minimalist framework of sound-meaning correlation

(Chomsky 1995), the analysis follows a lexicalist conception of mor-
phology (Stiebels and Wunderlich 1994, Wunderlich and Fabri 1995,
Wunderlich 1997b) and the differentiation of Semantic Form and
Conceptual Structure (Bierwisch 1983, 1987, 1997, Bierwisch and
Schreuder 1992, Lang 1987, 1990, 1994, Dölling 1997). I assume
Phonetic Form (PF), Logical Form (LF) and Semantic Form (SF) as
relevant grammatically determined levels of representation.
My conception of syntax is very restrictive (cf. Jacobs 1995). For
sentences and DPs, I presuppose the structural layers in (1) and (2),

(1) CP MoodP TP NegP vP* VP

(2) DP FP nP* NP

In the base structure, argument expressions with structural cases

of verbs and of the corresponding deverbal nouns are placed in
Spec VP, SpecvP or in SpecNP, SpecnP, respectively. DPs with non-
structural cases and PPs appear in the complement position. The verb
raises to Mood or to C (Zimmermann 1999a) and—in parallel to
sentence structures—the deverbal noun overtly moves to a high
functional projection F (Alexiadou 1999), so that all argument ex-
pressions of Ν will be to its right (Haider 1992). I will not discuss the
nature of the category F. Possibly, it is a further n. Adjuncts can be
integrated at all projections (Maienborn 1996, 1997, 2000).
The syntactic configurations at the level of LF are the input for se-
mantic interpretation. This implies that syntactic movements of con-
stituents can have an effect in SF (Zimmermann 1999a). For functor
expressions like verbs and their nominalizations this means that they
are combined with their arguments semantically on the basis of LF
configurations where chains with traces of moved argument expres-
sions must be taken into consideration (see (3)). In such derived
structures, the head of the chain, the case bearing argument
344 Ilse Zimmermann

expression DPj occupies some derived position whereas the tail of

the chain tj is in the complement or specifier position of V, ν, Ν or n.

(3) (DPi, ..., tO

The lexical entries for functor expressions like verbs and their
nominalizations include in their argument structure grammatical re-
quirements which must be fulfilled by the respective argument ex-
pressions. I call these requirements grammatical argument addresses
Gj. They are associated with lambda operators λχί which represent
the argument positions of the respective functor expression.

(4) λχ η ... λχί [ . . . χι ... x n ... ]

Gn Gi
argument structure predicate-argument structure

The argument positions λχί are ordered from right to left

according to the relative depth of embeddedness of the arguments Xj
in the predicate-argument structure. The highest argument xi of
verbs and event nominalizations constitutes the referential argument
(Williams 1981, Bierwisch 1989, Bischof 1991). For mnemonic
reasons, I will represent it for verbs and event nominalizations as s
(referring to situations).1 The other arguments are participant, propo-
sitional or predicate arguments.

(5) λχ η ... λχχ Xs [ . . . s ... χι ... x n ... ]

Gn Gi
with s e e, x, e {e, t <e,t>}

λχί in (5) represents the argument position of the external argu-

ment, λχ η is the argument position of the lowest internal argument.
For DP arguments, the grammatical features Gi are case requirements
(Zimmermann 1967) which must be fulfilled by the corresponding
DPs as heads in L F chains.
So we have the following hierarchy of argument positions:
On the semantics of cases 345

(6) internal < external < referential

This hierarchy corresponds to the relative adjacency of the argu-

ment expressions to the verb or its nominalization, from left to right.
The lowest internal argument expression or its trace is the immediate
neighbour of the functor expression in its base position. Higher argu-
ment expressions are not adjacent to the governor in the underived
The hierarchy of arguments is also valid for the classification of
argument expressions with respect to their predictable or nonpredic-
table case forms. Typically, the lowest internal argument expression
can have idiosyncratic (lexically determined) case marking.2
As regards the nature of case requirements of argument expres-
sions, I assume the following: Cases are understood as analysable
morpho-syntactic characteristics of inflected nouns, pronouns, adjec-
tives, determiners and quantifiers. These characteristics are repre-
sented by a small number of morpho-syntactic case features, which
allow morphological cross-classification and underspecification (see
section 4). I assume that idiosyncratic case requirements of argument
expressions are directly referred to by these case features as instan-
tiations of Gj in the argument structure of lexical governors (see the
examples in (7)-(ll)). In contrast, predictable case forms of
argument expressions are understood as structural cases. They are
conditioned by the semantic argument hierarchy. Wunderlich (1997a,
1999), Stiebels (1996, 1997, 2000a, 2000b, 2001) and Wunderlich
and Lakämper (2000) posit configurational case features for comple-
ments of verbs and nouns in general, without discriminating between
semantico-syntactic and morpho-syntactic feature systems. Their fea-
tures are +/-hr (there is a/no higher role), +/-lr (there is a/no lower
role), as proposed similarly by Kiparsky (1992). I will restrict the use
of these features to the characterization of argument positions for
structural arguments in the argument structure of lexical governors
and correlate them by correspondence rules to the morpho-syntactic
case features of the pertinent argument expressions. Also with re-
spect to German, I regard morphological case classifications in the
same way as, for instance, Bierwisch (1967) or Gallmann (1998),
346 Ilse Zimmermann

namely as autonomous morpho-syntactic qualifications which are not

reducible to a system of semantico-syntactic case features.
It seems necessary to distinguish the following case types:

• Structural cases are predictable case forms of DPs as argument

expressions. They correspond to semantically conditioned ab-
stract case features +/-hr, +/-lr as argument addresses in the ar-
gument structure of functor expressions.
• Lexical cases are unpredictable case forms of DPs idiosyncrat-
ically required for an argument expression by the respective
governor. They are represented by language-specific morphol-
ogically conditioned features.
• Semantic cases are morpho-syntactic case feature bundles of
DPs which receive a semantic interpretation.

3. Verbs and event nominalizations as governing heads and

their structural arguments

In the following, verbs and event nominalizations are analysed with

respect to the case form of their DP arguments. Of main concern are
DPs with structural (i.e., predictable) cases, in short structural argu-
ments. It will be shown which case requirements of argument expres-
sions remain unchanged in nominalizations and which arguments re-
ceive alternative case realizations.

3.1. The argument structure of verbs and their nominalizations

Words as syntactic atoms are fully inflected items. They enter

syntactic representations with all affixes of word formation and
inflection. With Bierwisch (1989) and Bischof (1991), I assume that
nominalizations of verbs—at least in German and in Russian—are
derived morphologically and do not constitute products of syntactic
Verbs and event nominalizations have the same semantic charac-
On the semantics of cases 347

terization. The respective nominalizing suffix simply converts the

verb into a noun without changing the SF.4
In (7)-(ll), I give some lexical representations of verbs and event
nominalizations with semantic and morpho-syntactic information
which is relevant for the case realization of argument expressions.
The predicate-argument structure is unanalysed. Positions for
structural arguments are associated with abstract case features +/-hr,
+/-lr, which predict the admissible systematic case forms of argu-
ment expressions, depending on the syntactic category of the respec-
tive governor. Idiosyncratic case requirements are represented by
morpho-syntactic features, which will be analysed in section 4. The
case forms of the corresponding argument expressions are indi-
cated—for convenience—by traditional case names.
In (7)-(9), case information is systematic, redundant and therefore
omissible. In contrast, the internal argument of the lexical entries in
(10)-(11) idiosyncratically shows up in the dative and in the instru-
mental, respectively. Here one has to do with unsystematic lexical
case which must be learnt.

(7) vyzdorovet' 'recover' / vyzdorovlenie 'recovery'

vozniknut' 'emerge' / vozniknovenie 'emergence'
λχ λβ [ . . . s ... χ ... ]
V: nom
N: gen

(8) usvoit' 'acquire' / usvoenie 'acquisition'

znat 'know' / znanie 'knowledge'
λy λχ λβ [ . . . s ... χ ... y ... ]
+hr -hr
-lr +lr
V: acc nom
N: gen instr
348 Ilse Zimmermann

(9) 'inform' / soobscertie 'information'

'hand in' / vrucenie 'handing in'
λζ λy λχ Xs [ . . . s ... χ ... y ... ζ ... ]
+hr +hr -hr
-lr +lr +lr
V: acc dat nom
N: gen dat instr

(10) izmenit' 'betray' / izmena 'betrayal'

pomoc' 'help' / pomosc 'help'
Xy λχ λβ [ . . . s ... χ ... y ... ]
+R -hr
+P -lr
V: dat nom
N: dat gen

(11) obmenjat'sja 'exchange' / obmen 'exchange'

zanimat'sja 'be engaged' / zanjatie 'engagement'
λy λχ λβ [ . . . s ... χ ... y ... ]
+Ρ -hr
V: instr nom
N: instr gen

In the last two examples, the external argument position λχ is

associated with the abstract case features -hr, -lr although there is a
lower argument. Due to its idiosyncratic case requirements, the argu-
ment position λy is invisible for the feature specifications +/-hr and
+/-lr. So λχ is characterized as the lowest structural argument posi-
tion. Furthermore, the two examples illustrate that idiosyncratic case
requirements of argument expressions of verbs are inherited by event
nominalizations. The same is true for the structural dative, as illus-
trated by (9).
Bayer, Bader and Meng (2001) observe that dative complements
behave like complements with idiosyncratic case marking insofar as
On the semantics of cases 349

they do not alternate with nominative or genitive phrases in passive,

middle and nominalised constructions, in contrast to argument ex-
pressions in the accusative. This case resistance of dative comple-
ments shows up in Russian nominalizations of ditransitive verbs
even more consistently than in German where the dative phrases
alternate with PPs. In Russian, the dative is preserved in event nomi-
nalizations (see (9) and (20)). Bayer, Bader and Meng (2001) classify
only the nominative and the accusative as true structural cases, which
characteristically undergo alternations in function changing opera-
tions. The corresponding phrases are categorized as DPs (or as QPs)
whereas dative and genitive phrases are analysed as KPs (case
phrases) constituting barriers for various syntactic operations. I
cannot see whether this categorial differentiation could serve as a
possible explanation for the different case forms of argument expres-
sions in function changing operations. I regard the dative of comple-
ments of ditransitive verbs and of their nominalization and also the
genitivus subjectivus and the genitivus objectivus in nominalizations
as structural cases and represent the respective phrases as DPs.
Alternating case realizations of structural arguments are connect-
ed with the external argument position λχ in (7)-(ll) and with the
lowest position for a structural argument, λy in (8) and λζ in (9). The
accusative complement of transitive and ditransitive verbs corre-
sponds to the genitive complement of the deverbal noun whereas the
nominative argument of these verbs shows up in the instrumental of
event nominalizations. The nominative argument of intransitive
verbs as in (7), (10) and (11) appears in the genitive, in nominaliza-
tions. These regularities concerning alternating case forms of struc-
tural arguments must be captured by the rules which interrelate the
abstract semantico-syntactic case features +/-hr, +/-lr of structural ar-
gument positions with the morpho-syntactic case features of the re-
spective DPs as argument expressions (see section 4).
350 Ilse Zimmermann

3.2. Reflexive verbs and their nominalizations

As can be seen from (11) and (12), Russian nominalizations do not

allow the combination with the reflexive morpheme -sja, in contrast
to Polish (cf.formowac (sie) / formowanie (sie) 'form / formation')
and to Nahuatl (Stiebeis 1997).5

(12) a. Vertolet prizemlilsja.

helicopter-NOM landed
'The helicopter landed.'

b. Vertolet prizemljalsja / byl prizemlen

helicopter-NOM was landed
(odnim passazirom).
one passenger-instr
'The helicopter was landed (by one of the passengers).'

c. prizemlenie vertoleta (odnim passazirom)

landing helicopter-GEN one passenger-INSTR
'the landing of the helicopter (by one of the passengers)'

The deverbal noun prizemlenie 'landing' in (12c) is ambiguous. It

corresponds to the reflexive verb prizemljat'sja / prizemlit'sja 'land'
in (12a) and to the transitive verb prizemljat' / prizemlit\ which in
(12b) is passivised, without any marking of reflexivity. I assume that
the reflexive formative -sja is added at the right end of verb forms on
the basis of a morpho-syntactic feature +refl. This feature corre-
sponds to the reflexive pseudoargument in German which, too, is
restricted to verbal constructions. What is crucial for the present in-
vestigation of structural arguments is the observation that reflexive
elements are always correlated with the lowest structural argument
position, i.e., with an absent true argument in the structural accu-
sative, and that in passive, middle and anticausative constructions
they are accompanied by the absence of the external argument.6
In addition to (12a) with an anticausative reflexive verb, (13) is a
case of reflexivum tantum, which characteristically does not combine
On the semantics of cases 351

with an accusative argument. (14) illustrates the reflexive passive of

an imperfective verb, and (15) the middle.

(13) Mal'cik smeetsja.

boy-NOM laughs
'The boy is laughing.'

(14) Plan razrabatyvaetsja.

plan-NOM is worked out
'The plan is being worked out.'

(15) Kniga xoroso prodaetsja.

book-NOM well sells
'The book sells well.'

In all these constructions, the reflexive exponent is added on the

basis of the morpho-syntactic feature +refl of the respective verb. In
nominalizations, this marking does not apply.
In (16)-(17), I give complex lexical representations for two causa-
tive/anticausative verb pairs and their event nominalizations.7 These
lexical entries respect the far-reaching correspondence of reflexivity
and transitivity of verbs. The presence of the reflexive formative -sja
on the verb corresponds to the absence of a structural accusative
argument. As in German, the deverbal noun of such pairs is syste-
matically ambiguous.

(16) prizemlit'(sja)-a 'land' /prizemlenie 'landing'

Xy (λχ) α λβ [ . . . s ...(... χ ...) α ... y ... ]
ahr -hr
-lr +lr
V: acc nom
N: gen instr
V: nom
N: gen
352 Ilse Zimmermann

(17) obucit'(sja).a 'teach, learn' / obucenie 'teaching, learning'

λζ λγ (λχ) α λβ [ ... s ... (... χ ...)α ··· Υ ··· ζ ··· ]
+R ahr -hr
+Ρ -lr +lr
V: dat acc nom
N: dat gen instr
V: dat nom
N: dat gen

3.3. Examples

The following noun phrases with deverbal heads illustrate the case
realizations of the pertinent argument expressions, in contrast to in-
finitival phrases. The examples are given with normal word order.
Derived word order variations, I do not consider here. It is important
to notice that Russian nominalizations preserve the order of the argu-
ment expressions relative to the lexical governor in its base position.
In contrast to German, the genitival complement need not be adja-
cent to the noun. In the nominalizations, the de verbal noun precedes
the highest structural argument expression. This results from head
movement of Ν to F (see section 2, (2)).

(18) a. vyzdorovlenie pacienta

recovery patient-GEN
'the recovery of the patient'

b. vyzdorovet'
On the semantics of cases 353

(19) a. znanie rebenkom jazyka

knowledge child-INSTR language-GEN
'the knowledge of the language by the child'

b. znat' jazyk
know language-ACC
'know the language'

(20) a. nemedlennoe soobscenie institutami firme

immediate informing institutes-INSTR firm-DAT
svoix zakazov
their orders-GEN
'the institutes' immediate informing the firm of their

b. nemedlenno soobscit' firme svoi zakazy

immediately inform firm-DAT their orders-ACC
'inform the firm immediately about their orders'

(21) a. obmen tovariscej opytom

exchange comrades-GEN experience-INSTR
'the exchange of experience by the comrades'

b. obmenjat'sja opytom
exchange experience-INSTR
'exchange experience'
354 Ilse Zimmermann

(22) a. obucenie (mater'ju) rebenka

teaching / learning mother-INSTR child-GEN
'the teaching of reading to the child by the mother' /
'the learning of reading by the child'

b. obucit' rebenka cteniju

teach child-ACC reading-DAT
'teach the child reading'

c. obucit'sja cteniju
learn reading-DAT
'learn reading'

3.4. The structural instrumental

As is apparent from the examples, the instrumental of the external

argument in nominalizations is accompanied by the argument in the
genitivus objectivus. With Bischof (1991), I regard this instrumental
as structural case. It should not be confused with the agentive instru-
mental phrase in passive constructions. (19a) illustrates a nominal-
ization of a Stative verb which does not have any passive. Neverthe-
less, the external argument appears in the instrumental. Moreover,
(20a) demonstrates that the argument in the instrumental binds the
reflexive pronoun svoj-, which is not possible for instrumental
phrases in passive constructions. Cf.:

(23) Nemedlenno soobscalis' firme institutami

immediately were informed firm-DAT institutes-INSTR
* svoi zakazy.
their orders-NOM
'The firm was informed immediately by the institutes about
their orders.'
On the semantics of cases 355

Thus, I differentiate between the structural instrumental of the ex-

ternal argument in event nominalizations as in (19a), (20a) and (22a),
the lexical instrumental of the internal argument as in (21a, b) and
the semantic instrumental of modifiers including the so-called
argument adjunct of passive constructions.8

3.5. The structural genitive

There are various kinds of genitives. Argument expressions and

adjuncts can be marked by the genitive. Here, I will concentrate on
the structured adnominal genitive of event nominalizations.9
As is evident from the lexical entries in 3.1. and 3.2. and from the
examples in 3.2. and 3.3., the structural accusative of transitive and
ditransitive verbs and the structural nominative of intransitive verbs
correspond to the genitive of the pertinent argument expressions of
the event nominalizations. This systematic correspondence of struc-
tural cases shows up in many languages. I assume that the adnominal
structural genitive is associated with the following types of argument
structures of nouns as functor expressions:

(24) a. (...) λχ Xs [ . . . s ... χ ... ]


b. (...) ky λχ Xs [ . . . s ... χ ... y ... ]

+hr -hr
-lr +lr
356 Ilse Zimmermann

c. λζ λy λχ λβ [ . . . s ... x ... y ... z ... ]

+hr +hr -hr
-lr +lr +lr

d. λy λχ [ . . . χ ... y ... ]
+hr -hr

(24a) represents deverbal nouns of intransitive verbs like vyzdoro-

vlenie 'recovery', pomosc' 'help' and deadjectival nouns like zavi-
simost' 'dependence'. (24b) and (24c) characterize deverbal nouns of
transitive resp. ditransitive verbs like usvoenie 'acquisition', soob-
scenie 'information', and (24d) represents relational nouns like mat'
'mother' and semantically enriched sortal nouns with a 'possessor'
argument as in dom vraca 'house of the doctor'.
I depart from Stiebels' assumptions on the argument structure of
nouns (Stiebels 1997, 2000a, 2001) in following Bierwisch (1989)
and Bischof (1991) with regard to the treatment of event nominaliza-
tions. I do not accept Stiebels' proposal to treat relational, seman-
tically enriched sortal nouns and event nominalizations alike. In
Stiebels' system, all non-highest argument positions of nouns are
characterized as +hr, without any further differentiation. Thus, event
nominalizations of ditransitive verbs get the following representa-

(25) λζ λy λχ λβ [ . . . s ... χ ... y ... ζ ... ]

+hr +hr +hr -hr

Since the adnominal structural genitive is classified by Stiebels as

+hr linker, all non-highest argument expressions of nouns could be
genitive phrases. Several additional principles of case realization
must help to avoid this. I do not need such additions. I assume that
the argument structure of verbs is preserved in nominalizations,
including complete information on structural cases (see (24a-c)).
On the semantics of cases 357

4. Case realizations and case licensing

DPs can emerge with morpho-syntactic case information of the lan-

guage-specific case system.10 It must be guaranteed that the case
forms of DP-argument expressions fulfil the case requirements of
their respective governor. Some licensing must take place. I will
adopt the devices of Optimality and Correspondence Theory to
account for this (Stiebeis 2000a, Wunderlich and Lakämper 2000).
It must be emphasized that my conception of syntax does not
assume any movements of a DP in order to get its case licensed.
Movements of DPs obey requirements of scope and of information
structure. Nevertheless, some principles must be at work to check the
admissibility and co-occurrence of case forms of DPs as they are pre-
scribed in the argument structure of the governing lexical head of the
construction. I assume that correspondence rules and very general
principles of argument realization relate the language-specific case
characterizations of DPs to the pertinent requirements in the argu-
ment structure of the governor.

4.1. Case systems

Without going into the details of Russian nominal inflection, I pre-

suppose the case system of Jakobson (1936) by giving his case quali-
fications the status of +/-valued features and by adding the feature
obl(ique), whose specifications I borrow from Franks (1995):11
358 Ilse Zimmermann

(26) Jakobson's (1936) enriched case system

R Ρ u G obi
acc +
dat + + +
instr + +
genl + +
loci + + +
gen2 + + +
loc2 + + + +

with the correlations12

R= Bezug (directional: "signalizing the goal of the


Ρ= Rand (marginal: "assigning the entity an accessory

place in the message")

U = Umfang (quantified: "focusing upon the extent to

which the entity takes part in the message")

G = Gestaltung (cases of shaping)13

As the features of Jakobson's system serve morpho-syntactic and

semantic generalizations, it is no coincidence that LDG's features
+/-hr, +/-lr roughly coincide with Jakobson's features +/-R, +/-P,
respectively. Cf.:
On the semantics of cases 359

(27) The case system of LDG

hr lr GEN dir instr

acc +
dat + +
gen + +

It is essential to keep in mind that Wunderlich and his colleagues

do not assume a special system of features for the morpho-syntactic
case realizations of DPs. In their system of case linkers, the
semantico-syntactic features +/-hr, +/-lr figure together with further
differentiations of diverse origin. Characteristically, the additional
features do not serve cross-classification. This is in sharp contrast to
the system proposed in (26).
Whereas the features +/-hr, +/-lr correspond to the semantic hier-
archy of complements and allow us to characterize the nominative,
accusative, dative and the ergative, the genitive needs further
qualifications in order to be differentiated from the accusative. I put
GEN in (27) (cf. Wunderlich 2000) provisionally as an additional
characterization of the genitive. In Stiebels (2000a, 2001), the con-
textual categorial features -articulated, -dependent (i.e. N) proposed
by Wunderlich (1996) make the feature specification +hr of the
structural genitive dependent on a nominal governor. But the genitive
as lexical case cannot be characterized in this way. It occurs with V,
A and Ρ as governor in German, Russian and in other languages.
Furthermore, Russian genitive phrases systematically alternate with
the accusative of the direct object and the nominative of the subject
in negated sentences (Jakobson 1936) and should be regarded as
structural arguments. In view of this and of the correspondence of the
structural genitive in event nominalizations with the nominative and
360 Ilse Zimmermann

accusative of verb complements, it seems questionable to have the

qualification +hr (or +R of the system in (26)) for the genitive.
The additional features dir, instr and others in (27) characterize
so-called semantic cases in richer case systems, as, for instance, Hun-
In sum, I cannot see how such an enlarged system of case features
could deliver homogeneous criteria for cross-classification and gen-
eralizations. It seems more enlightening to keep morpho-syntax and
semantics apart and to try to capture the existing correlations be-
tween the various subdomains of grammar by special rules and prin-
My high estimation of Jakobson's (1936, 1958) case features first
of all concerns the general approach and less the details. Further-
more, his semantic generalizations nowadays can be captured more
adequately by assumptions about the semantic decomposition of
functor expressions and about the semantic hierarchy of comple-
ments. Nevertheless, there are very fundamental and subtle observa-
tions in Jakobson's semantic characterizations of differences
expressed by cases which deserve attention and recognition (see
Demjjanow and Strigin 2000a, 2000b). The same is true with respect
to the morphological generalizations offered by Jakobson's system of
case features. It takes into account case inflection of nouns,
adjectives, pronouns and numerals of Russian.
According to the case system (26), all cases—except for the nomi-
native as the most unmarked case—are identifiable by positively
specified features. Thus, the markedness constraint (28) for lexical
representations of morphemes, which has been proposed by Stiebels
(2000a, 2000b), can be observed.

(28) *-: Avoid the negative specification of a morpho-syntactic

feature in the lexical representation of morphemes.

Furthermore, at least some syncr