CROSS-CULTURAL PERSPECTIVE
STUDIES IN BILINGUALISM (SiBil)
EDITORS
EDITORIAL BOARD
Volume 2
edited by
KEES DE BOT
University of Nijmegen
RALPH B. GINSBERG
University of Pennsylvania
&
National Foreign Language Center
CLAIRE KRAMSCH
University of California, Berkeley
1991
The publication of this volume has been supported by a subsidy from
the European Cultural Foundation.
Foreword
Richard D. Lambert
Preface
Kees de Bot, Claire Kramsch & Ralph B. Ginsberg
Richard D. Lambert
differences in orientation which did seem to distinguish the American from the
European attendees. For instance, the Americans were more inclined to focus on
individual language learners, non-classroom learning environments, the role of
learner variation, and the importance of research on policy. The Europeans
were more inclined to take the classroom as given, treat teaching and learning
together, and give preference to research that would help the teacher perform
better. A number of the American scholars were particularly interested in the
languages of Asia and Africa while the Europeans were primarily concerned
with the teaching of English or the languages of Western Europe.
By and large, however, the differences of opinion that emerged in the
discussion did not follow nationality lines. Sharp disagreements there were, but it
was difficult to predict on which side of the Atlantic the contending parties or
individuals would fall. For instance, there was disagreement as to whether
theoretical significance was a necessary definer of research priorities or whether
evaluation of existing programs and the solution of concrete problems should be
paramount; whether the need for narrowly defined, rigorously controlled
experiments should take precedence over semi-ethnographic observation of
natural phenomena; what the appropriate scale of studies is; whether the validity of
the measures of learning outcomes is still indeterminate or appropriate
measures are available once the goals of research are set; what the contribution
of theoretical linguistics to research on second language learning is; how
advanced technologies should best be used; whether culture should be taught
directly or left to emerge as a by-product of language learning. These and many
other issues are represented in the papers that follow.
We wish to thank the European Cultural Foundation for sharing in the
support for this conference. The Rockefeller Foundation's Conference Center at
Bellagio, Italy, on Lake Como, is an ideal place for just such transnational
dialogues. The informal discussions that eddied around the edges of the formal
presentations were especially helpful in facilitating discussions across national
boundaries. We are especially thankful to the Rockefeller Foundation for its
hospitality and support.
5 January 1989
National Foreign Language Center
Washington, D.C.
Preface
In recent years research on foreign language teaching and learning has
increased substantially on both sides of the Atlantic. At the same time differences
in perspectives on what should be investigated and what paradigm should
prevail have also grown, to the point that there is serious concern, both in Europe
and in the United States, that a split in the field is not inconceivable. In an
attempt to narrow this gap, Richard D. Lambert of the National Foreign Language
Center and Theo van Els of the Department of Applied Linguistics of the
University of Nijmegen, the Netherlands, took the initiative to organize a
small-scale conference on empirical research in foreign language pedagogy, bringing
together scholars from Europe and the US. The aim of the conference was to
unearth both commonalities and differences in viewpoints and paradigms. The
conference was sponsored by the NFLC, the Rockefeller Foundation, and the
European Cultural Foundation, and was held at the Rockefeller Foundation's
Bellagio Study and Conference Center in June 1988. The present volume is the
outcome of this conference.
The editors are indebted to the authors for their cooperation through
several revisions, to Albert Cox for technical support in producing the manuscript,
to the National Foreign Language Center for support in getting this volume to
press, and to Yola de Lusenet of Benjamins Publishers for her help and
patience.
Section I—Priorities in the US and in Europe
Foreign Language Instruction and Second Language
Acquisition Research in the United States
Foreign language (FL) instruction and the related research on second
language acquisition (SLA) in the United States can be understood only in the
context of the role of English, of American education, and of speech and language
research and educational research in the United States. Any part of an
educational system is, after all, both a result of historical processes and a response to
current needs and values.
The most salient part of the language situation in the United States is surely
the overall dominance of English. Not only is English by far the most common
mother tongue, it is also by far the language most often learned as a second
language and is overwhelmingly the language of participation in U.S. economic,
4 CHARLES A. FERGUSON & THOM HUEBNER
political, and social life. Moreover, Americans perceive their nation as even
more monolingual than it is. In 1975, for example, when the U.S. Bureau of the
Census conducted a special sample survey of non-English languages, almost 18
percent of the population aged 14 years or older claimed a mother tongue other
than English (seven out of ten of them native-born Americans), and one person
out of eight aged four or older lived in a household in which a language other
than English was spoken (Waggoner 1981). Although not the national or official
language of the United States by constitution, statute, or regulation, English is
the de facto national language, its status maintained by powerful social
pressures, and non-English-speaking immigrant groups have generally experienced
relatively rapid attrition of mother tongue competence and corresponding shift
to English (Fishman et al. 1966; Veltman 1983). In spite of this pattern of
linguistic assimilation, the visibility of large numbers of Hispanics and the
relatively recent influx of Asians have resulted in movements advocating some kind of
legal status for English, both at state and national levels. The outcome of such
movements is unclear, but the dominance of English is likely to persist no
matter what the outcome.
Another feature of the language situation in the United States that is
relevant to our understanding of the learning and teaching of FLs is the existence of
four different language professions, each with its own occupational goals,
education or special training, and attitudes on language education issues: FL teachers,
bilingual education specialists, teachers of English as a second language, and
teachers of English as a native language. These groups, who could be strong
allies if they shared important aspects of their educational perspectives and saw
complementary roles for themselves in the American educational system,
generally see one another as adversaries or, at best, as professionally unrelated. We
will not attempt here to address the relation between the study of literature and
FL instruction as such, a problematic issue in most European and American
educational systems.
Finally, let us emphasize an aspect of the language situation that is not often
treated explicitly: attitudes and beliefs about language widely held by
Americans. We assume that the members of any speech community, even such a large
and complex one as the United States, share to a considerable degree a set of
such attitudes and beliefs, so-called myths about language (Ferguson and Heath
1981: xxvii-xxx). We assume further that these myths may sometimes be of
critical importance for understanding the activities of FL learning and teaching as
well as the SLA research efforts of the community. These myths vary
considerably by region, social class, and other categories, and they have not been
investigated as much as the evaluative attitudes toward languages and their speakers
(cf. Ryan and Giles 1982). Some of them, however, merit notice.
First, Americans tend to regard competence in an FL as a kind of all-or-none
personal attribute not particularly related to the process of acquisition or
the nature and level of proficiency. People have the competence or they don't:
"Does so-and-so speak Chinese?" "I don't know Spanish". Americans generally
assume (with some justification, of course) that there is little connection
between having studied a language and "knowing" it or being able to use it. The
research corrective to this myth is the current concern with proficiency testing and
other forms of measurement of language competence. Richard Lambert has
called for a "common metric measuring in an objective, consistent fashion the
degree of proficiency a person... has in foreign language" (Lambert 1987: 13).
Related to this failure to connect the processes of acquisition to the level of
competence is the notion that there are only a few "real" (one might almost say
"magical") ways to learn a language. Many people have assured us at one time
or another that the only way to learn a foreign language is to be exposed to it in
childhood, or to live in a country where it is spoken, or (usually said with a
smile) to have a mate or lover who speaks the language. The widespread belief
that living in the appropriate country will produce fluency in a language is
evidenced, for example, in the disappointment that many Stanford undergraduate
students feel after one or two quarters at a Stanford overseas campus, when they
find that they have not automatically reached full fluency. American students
typically do not expect to learn to use a language by studying it in school (and
neither do their teachers or the surrounding community), but they do expect to
learn it by being in the country, having no inkling of the time, effort, and
communicative strategies required. When Americans are faced with a need to
acquire some FL competence and the options just discussed are not available, they
want the fastest, most efficient, most painless method, preferably one that
features some new technology. The research counterpart to this view is the
perennial concern to test different "methods" to see which one is best, that is, most
efficient.
A third myth concerns the way people differ in their ability to learn
languages. Americans believe that aptitude is very important. Although many
assume that their compatriots in general have low language aptitude, they assume
just as strongly or more so that individuals differ greatly in language aptitude.
Many individual Americans claim that they themselves have no aptitude for
languages and could never learn one, whereas some people they know are, as they
say, "good at languages". Several first-rate American universities make
provision to waive their language requirement if a test shows that a particular student
has poor language aptitude.
In this connection, it is interesting to compare attitudes toward foreign
competence in English with those toward American competence in FLs. An
American's lack of competence in an FL is often attributed to low aptitude. In
contrast, a foreigner's lack of competence in English may be attributed to lack of
opportunity, clannishness, laziness, or other explanatory factors, but rarely to
lack of aptitude. Incidentally, an attitude not often verbalized but apparent from
incidental comments and behavior is that a foreigner with an excellent command
of English is somehow more intelligent and more competent in other ways than
one whose command of English is less good.
In addition to the emphasis on aptitude, Americans hold conventionalized
notions, almost stereotypes, about the relative difficulty of languages. They
assume that there is some kind of absolute scale of difficulty such that Spanish is
easier to study or to learn than French, or a more nuanced scale such that
Spanish is easier in the first year but harder in the second year. This view contrasts
with the implicit assumption of most American linguists that all languages are
roughly equal in difficulty for the newborn and that differences in difficulty in SLA,
if they exist, are due to the nature of the structural differences between L1 and L2
(shades of contrastive analysis!). Linguistic theories that make allowance for
measurement along these lines, such as those involving markedness or
parameter-setting, could contribute to the understanding of these questions.
On the theory side, SLA research in the United States has tended to be tied
either to linguistics or to psychology, and the tendency has often been to "apply"
a theoretical model derived from quite different contexts of language use rather
than to deal with SLA phenomena as the source for theory construction.
Interestingly, the USSR (and prerevolutionary Russia) has had the same pattern of
theory application from linguistics and psychology (Pitthan 1988) and has
experienced the same failure to construct theories that start from SLA, although the
patterns of teaching and learning FLs in the Soviet Union are dramatically
different from those in the United States.
We do not mean to say that research on SLA should not be theory driven.
But Shulman raises an important caveat against the potential trivialization of the
field by a single paradigmatic view. While theory drives much of research (some
would say it should drive all research), there are many kinds of theory that need
to be taken into account in SLA.
The name of the field of inquiry itself suggests need for both a theory of
language and a theory of learning. Given the current state of linguistic theory in
the United States, one can find any number of competence and performance
models. The same could be said of learning theory, although any theory of
learning would necessarily include some specification of an initial state, a motivation
to learn, a specification of input, an acquisition procedure, and a description of a
desired state. In addition, researchers who deal in tutored contexts need a model
of teaching. Closely related to all of these areas is a theory of research design. In
the following sections, we review some research on learning contexts, on the
nature of language, on the acquisition process, and on teaching behaviors believed
to facilitate learning.
Several taxonomies for the contexts of teaching and learning second
languages are common in the literature. One involves the labels assigned to teaching
methodology. Some years ago, researchers hoped that a comparison of
"methods" would lead to an optimal one for language learning. That kind of
research, which takes method as the unit of analysis, has proven not very fruitful.
Several authors (Brumfit this volume; Larsen-Freeman this volume; Long this
volume) critique this line of research; we will not review their arguments here.
Other taxonomic distinctions, however, persist in contemporary research.
One is that between tutored and untutored language learning. Another divides
the second language learning field into second language, foreign language, and
bilingual education. Both distinctions implicitly reflect differences in degree, if
not in kind, of the processes and products under investigation. We do not
disparage the practical worth of these taxonomies, but they are useful only so long as
the contextual features used to form the bases of the taxonomies differ
significantly across categories and are sufficiently uniform within them.
One danger is that these taxonomic distinctions may obfuscate both cultural
and individual differences. For example, DeKeyser's (1986) description of the
learning strategies of a group of American students in a one-semester study
abroad course in Spain will ring familiar to anyone who has had experience with
American students in similar programs, regardless of the host country. At the
same time, individual differences within the group were striking, even though
they were in the same FL program.
Within the North American context, research on these issues has tended to
concentrate north of the U.S.-Canada border. In his review of social psychology
and SLA, Gardner argues that, among the various individual differences
examined in the SLA literature, an integrative motive (broadly defined) and "language
aptitude are the only two individual differences which have been well
documented to date as being implicated in the language learning process" (1985: 83). He
argues further that changes in social attitudes may be affected by second
language learning experiences and that these changes are perhaps greatest when
programs involve novel experiences of rather short duration, such as intensive
bicultural experiences among students who maximize contacts with native
speakers or in short intensive programs.
From this perspective, if parents and community play a role in socialization
and the formation of attitudes, they also influence the SLA process. Gardner
(1985: 146) states:
"Second language acquisition takes place in a particular cultural context...
[T]he beliefs in the community concerning the importance and meaningfulness
of learning the language, the nature of the skill development expected,
and the particular role of various individual differences in the language
learning process will affect second language acquisition".
To the extent that Americans hold various "myths about language",
researchers would want to know what communities expect of foreign language
Other theories, such as Lexical Functional Grammar (LFG), have not yet
been applied to SLA, although Pinker's work (1984) within an LFG framework
on first language acquisition portends that it will. Rosen (1987) explores the
implications of Relational Grammar for SLA. While Newmeyer (1987)
points out that many of the assumptions of these frameworks are converging, the
bulk of the work on SLA within formal theories of grammar reflects a strong
commitment to government-binding, and has focused solely on linguistic aspects
of initial and final state. A clear articulation of this position is found in Gregg
(1989).
The argument about SLA theory seems to be as follows. Since they don't
have a complete theory of language, researchers can't look at language
acquisition. Instead they should look at the acquisition of linguistic or grammatical
competence (the terms are used interchangeably throughout our paper).
Grammatical competence is defined as our intuitive knowledge of the syntax,
phonology, and to some extent semantics of the language in question. One assumption
within this framework is that grammatical competence is independent of
language use and involves a mental system that is quite separate from pragmatic
knowledge, conceptual knowledge, perception, and other human faculties. This
has been called the autonomous nature of grammar. At the same time, one
sense in which language is perceived to be modular is that its use results from
the interaction of linguistic competence with other mental faculties or modules,
involving, for example, pragmatic knowledge, conceptual knowledge, and
perception.
Gregg's rather strong position is that SLA should be centrally concerned
with the acquisition of linguistic competence. In addition to providing a sense of
direction to the field of SLA, such an orientation would bring other advantages
to the field, he maintains: a "rigor" inherent in formal approaches and a
knowledge of what is innate in language and what is acquired.
These apparent advantages can also be seen as problematic areas for formal
approaches. To date, agreement on the relevant parameters and their
levels of expansion is far from universal. For example, working within a GB
framework, Huang (1982) and Koopman (1984) offer differing explanations for
head direction in Chinese, which, as has been pointed out in the literature
(Eubank 1988; Bley-Vroman and Chaudron 1987; Klein 1987), have different
effects on the interpretation of SLA data.
A second problem involves the tapping of a learner's intuitions about a
second language. Coppieters (1987) argues that the linguistic competence of even
very fluent second language speakers differs in unexpected ways from that of
native speakers. Furthermore, Birdsong (1988) points out that, while such research
intends to describe the learner's grammatical competence at any given point in
time as evidenced by intuitions about the second language, the interaction of
multiple cognitive mechanisms (modularity) makes it difficult to base judgments
about underlying linguistic competence on performance data such as imitation
tasks.
A final problem to which formalist theories have given little attention is the
process of acquisition, either in the sense of accounting for how a learner is
"driven" from one stage of knowledge to another, or in the sense of providing a
theory of the actual time course of acquisition. As Marshall (1979) points out
and Berwick and Weinberg (1986) reiterate, "No one has seriously attempted to
specify a mechanism that "drives" language acquisition through its "stages" or
along its continuous function" (Marshall 1979: 443). That is, it is not always clear
what the learning process includes, how learners' linguistic competence changes
from time 1 to time 2. For example, in distinguishing between the acquisition of
"Japanese is a pro-drop language, and knowing that, I drop pronouns left and
right, including at times when a native speaker would not. That is to say that I
don't yet know the discourse restraints (at least) on pronoun-dropping in
Japanese, and thus my 'communicative competence' is not up to native
standards".
write: "The forms of natural languages are created, governed, constrained,
acquired, and used in the service of communicative functions".
From this perspective, any explanation of linguistic phenomena cannot
exclude semantic and pragmatic considerations. Silva-Corvalán makes this claim
most explicit in her discussion of Muysken's (1981) hierarchy of markedness for
tense as applied to data on language attrition: "In my view of language as a
system of human communication, to be explanatory, a markedness hierarchy needs
to be justified with reference to factors which lie outside the linguistic system,
namely cognitive and interactional factors" (1987: 14).
These assumptions have implications for what is deemed legitimate terrain
for second language acquisition research. Rather than an overriding concern
with abstract formulations of linguistic competence, SLA researchers working,
either explicitly or implicitly, within this framework have been concerned with
the production of discourse rather than clause-length phenomena (e.g. Hatch
1978; Tomlin 1984), with intra-speaker variation (e.g. Tarone 1984; Ellis 1985b),
with changes over time as exemplified by learner production of naturally
occurring speech (e.g. Huebner 1983; Sato 1985), with the nature of linguistic input
(e.g. Chaudron 1985), and with strategies employed for comprehension and
production (e.g. Faerch and Kasper 1987; Chamot et al. 1988).
This more general approach also has its problems. Its emphasis on language
in use has often resulted in a failure to tap the full range of what a learner
"knows" about the language being acquired. In addition, often research of this
type has not clearly articulated the relationship between aspects of language use
and acquisition of specific features of a given linguistic system. Finally, as Gregg
(1989) justifiably points out, it has often failed to distinguish between what
learners do because they are not fully proficient in the target language and what they
do by virtue of being human.
Given the current state of affairs of all linguistic theories, the prospects are
as promising for SLA to contribute to them as vice versa. While one finds
numerous claims that SLA is in fact doing so, to date the research in this field
has been more of a confirmatory nature (cf. Huebner 1987).
3 Models of Learning
Another large body of SLA research on the American scene has focused on
the learning and teaching of second languages. Work in social psychology, such
as Gardner's (1985) and Giles and Byrne's (1982), looks at motivation and
larger social variables in second language learning; other research has drawn
heavily on interactional models of discourse to isolate those features of
interaction that presumably facilitate learning. The most comprehensive published
review is Chaudron's Second Language Classrooms: Research on Teaching and
Learning (1988). Here we highlight some conclusions that can be drawn from it.
First, while correlations can be found (for example, between (1) modifications
in teacher talk and in-class versus out-of-class interaction; (2) input generation
and proficiency; (3) task type and type or amount of interaction; (4) amount of
teacher talk and language proficiency of learners; (5) learner production and
achievement test scores; (6) learners' negotiation behaviors and proficiency),
there is little study of the causal relationship between the members of these
pairs. Second, the vast majority of the studies cited in Chaudron, and
presumably the bulk of the research in this area, look at English as a second language
classrooms. Few studies focus on the range of teacher and student behaviors and
interaction patterns in FL classes in the United States. Third, the bulk of the
studies cited in Chaudron are of the process-product, or more accurately the
pseudo-process-product, variety. Very few classroom-centered qualitative
studies of SLA, and virtually none of FL acquisition, exist.
Finally, there are few studies that take a programmatic look at instructional
programs, especially with respect to FL teaching and learning in the United
States. For example, most university-level FL programs offer courses such as
"Advanced Conversation" and "Grammar Review", which are usually offered to
students at specific junctures in their language learning careers. Yet little
research of which we are aware carefully examines either instructional goals and
outcomes in these "specialized" language courses or the assumptions about FL
learning that motivate their inclusion at those junctures.
4 Conclusions
References
Berwick, R.C. and A.S. Weinberg. 1986. The grammatical basis of linguistic performance: Language
use and acquisition. Cambridge: MIT Press.
Birdsong, D. 1988. Second-language acquisition theory and the logical problem of the data. Paper
presented at the eighth Second Language Research Forum, University of Hawaii, Manoa,
March.
Bley-Vroman, R. and C. Chaudron. 1987. A critique of Flynn's parameter setting model of second
language acquisition. Unpublished manuscript, University of Hawaii, Manoa.
Bresnan, J. 1982. The mental representation of grammatical relations. Cambridge: MIT Press.
Brumfit, C. This volume. "Problems in defining instructional methodologies."
Campbell, R.N. and S. Schnell. 1987. "Language conservation." Annals of the American Academy
of Political and Social Science 490.177-185.
Chamot, A.U., J.M. O'Malley and L. Kupper. 1988. Learner strategies for listening
comprehension in English as a second language. Paper presented at the American Educational
Research Association Annual Meetings, New Orleans, April.
Chaudron, C. 1985. "Intake: on models and methods for discovering learners' processing of
input." Studies in Second Language Acquisition 7/1.1-14.
Chaudron, C. 1988. Second language classrooms: Research on teaching and learning. Cambridge:
Cambridge University Press.
Chomsky, N. 1981. Lectures on government and binding. Dordrecht: Foris Publications.
Coppieters, R. 1987. "Competence differences between native and near-native speakers."
Language 63/3.544-573.
DeKeyser, R.M. 1986. From learning to acquisition? Foreign language development in a U.S.
classroom and during a semester abroad. Ph.D. thesis. Stanford University.
Ellis, R. 1985a. Understanding second language acquisition. Oxford: Oxford University Press.
Ellis, R. 1985b. "Sources of variability in interlanguage." Applied Linguistics 6/2.118-131.
Eubank, L. 1988. Parameters in L2 learning: Flynn revisited. Paper presented at the eighth
Second Language Research Forum, University of Hawaii at Manoa, March.
Faerch, C. and G. Kasper. 1987. "The role of comprehension in second-language learning."
Applied Linguistics 7/3.251-274.
Ferguson, C.A. and S.B. Heath. 1981. "Introduction." Language in the USA ed. by C.A. Ferguson
and S.B. Heath. Cambridge: Cambridge University Press.
Feyerabend, P. 1974. "How to be a good empiricist: a plea for tolerance in matters
epistemological." The philosophy of science ed. by P.H. Nidditch, 12-39. Oxford: Oxford University Press.
Fishman, J.A. 1980. "Ethnic community mother tongue schools in the USA: Dynamics and
distributions." International Migration Review 14.235-247.
Fishman, J.A., V. Nahirny, J. Hoffman and R. Hayden. 1966. Language loyalty in the United States.
The Hague: Mouton.
Gardner, R.C. 1985. Social psychology and second language learning: The role of attitudes and
motivation. Baltimore: Edward Arnold.
Gazdar, G. et al. 1985. Generalized phrase structure grammar. Oxford: Basil Blackwell.
Giles, H. and J.L. Byrne. 1982. "An intergroup approach to second language acquisition." Journal
of Multilingual and Multicultural Development 3/1.17-40.
Gregg, K.R. 1989. "Linguistic perspectives on second language acquisition: What could they be,
and where can we get some?" Linguistic Perspectives on Second Language Acquisition ed. by
S.M. Gass and J. Schachter, 15-40. Cambridge: Cambridge University Press.
Hatch, E.M. 1978. "Discourse analysis and second language acquisition." Second language
acquisition: A book of readings ed. by E.M. Hatch. Rowley, MA: Newbury House.
Huang, C.J. 1982. Logical relations in Chinese and the theory of grammar. Ph.D. thesis.
Massachusetts Institute of Technology.
Huebner, T. 1983. A longitudinal analysis of the acquisition of English. Ann Arbor: Karoma.
Huebner, T. 1987. SLA: a litmus test for linguistic theory? Paper presented at the conference on
Second Language Acquisition: Contributions and Challenges to Linguistic Theory, Stanford
University, July.
Klein, W. 1986. Second language acquisition. Cambridge: Cambridge University Press.
Klein, W. 1987. SLA theory: prolegomena to a theory of language acquisition and implications for
theoretical linguistics. Paper presented at the conference on Second Language Acquisition:
Contributions and Challenges to Linguistic Theory, Stanford University, July.
Koopman, H. 1984. The syntax of verbs. Dordrecht: Foris Publications.
Kuhn, T.S. 1962. The structure of scientific revolutions. Chicago: University of Chicago Press.
Kuno, S. 1987. Functional syntax: Anaphora, discourse and empathy. Chicago: University of
Chicago Press.
Lambert, R.D. 1987. "The improvement of foreign language competence in the United States."
Annals of the American Academy of Political and Social Science 490.9-19.
Larsen-Freeman, D. This volume. "Research on language teaching methodologies: a review of the
past and an agenda for the future."
Long, M. 1985. Theory construction in second language acquisition. Paper presented at the sixth
Second Language Research Forum, University of California at Los Angeles, February.
Long, M. This volume. "Focus on form: a design feature in language teaching methodology."
Long, M. and C. Sato. 1984. "Methodological issues in interlanguage studies: an interactionist
perspective." Interlanguage ed. by A. Davies, C. Criper and A.P.R. Howatt, 253-279.
Edinburgh: Edinburgh University Press.
MacWhinney, B., E. Bates and R. Kliegl. 1984. "Cue validity and sentence interpretation in
English, German, and Italian." Journal of Verbal Learning and Verbal Behavior 23/1.127-150.
Marshall, J.C. 1979. "Language acquisition in a biological frame of reference." Language
Acquisition ed. by P. Fletcher and M. Garman, 437-453. New York: Cambridge University Press.
McLaughlin, B. 1987. Theories of second language learning. London: Edward Arnold.
Muysken, P. 1981. "Creole tense/mood/aspect systems: the unmarked case?" Generative studies on
Creole languages ed. by P. Muysken, 181-199. Dordrecht: Foris Publications.
Newmeyer, F.J. 1987. "The current convergence in linguistic theory: Some implications for second
language acquisition research." Second Language Research 3/1.1-19.
Oxford, R.L. and N.C. Rhodes. 1988. "U.S. foreign language instruction: Assessing needs and creating
an action plan." ERIC/CLL News Bulletin 11/2.1 + 6-7.
Perlmutter, D. 1983. Studies in relational grammar 1. Chicago: University of Chicago Press.
Pinker, S. 1984. Language learnability and language development. Cambridge: Harvard University
Press.
Pitthan, I.M. 1988. A history of Russian/Soviet ideas about language: Background to Soviet
foreign language pedagogy. Unpublished Ph.D. thesis. Stanford University.
Rosen, C. 1987. Relational grammar and SLA. Paper presented at the conference on Second Language
Acquisition: Contributions and Challenges to Linguistic Theory, Stanford University,
July.
Rutherford, W.E. 1984. "Description and explanation in interlanguage syntax: state of the art."
Language Learning 34/3.12-55.
FL INSTRUCTION AND SLA RESEARCH IN THE US 19
Ryan, E.B. and H. Giles, eds. 1982. Attitudes toward language variation: Social and applied con-
texts. London: Edward Arnold.
Sato, C.J. 1985. The syntax of conversation in interlanguage development. Unpublished Ph.D.
thesis. University of California at Los Angeles.
Shulman, L. 1986. "Paradigms and research programs in the study of teaching: A contemporary
perspective." Handbook of research on teaching (3rd ed.) ed. by M.C. Wittrock, 3-?. New
York: MacMillan Publishing Company.
Silva-Corvalán, C. 1987. Cross-generational bilingualism: theoretical implications of language attrition.
Paper presented at the conference on Second Language Acquisition: Contributions
and Challenges to Linguistic Theory, Stanford University, July.
Tarone, E. 1984. "On the variability of interlanguage systems." Universals of second language ac-
quisition ed. by F.R. Eckman, L.H. Bell and D. Nelson, 3-23. Rowley, MA: Newbury House.
Tomlin, R.S. 1984. "The treatment of foreground-background information in the on-line descriptive
discourse of second language learners." Studies in Second Language Acquisition 6/2.115-142.
Veltman, C. 1983. Language shift in America. Berlin etc.: Mouton.
Waggoner, D. 1981. "Statistics on language use." Language in the USA ed. by C.A. Ferguson and
S.B. Heath, 486-515. Cambridge: Cambridge University Press.
Empirical Foreign Language Research in Europe
The purpose of this paper is not to present a full survey of past and ongoing
empirical research in Europe on foreign language teaching (FLT), even if there
may well be a great need for such a survey. An authoritative source of information
on educational research like the Handbook of Research on Teaching (Merlin
C. Wittrock, ed., 1986, 3rd ed.), a project of the American Educational Research
Association, does not exist in Europe; in the European context such a handbook
would certainly have an article on foreign language teaching besides, or instead
of, one on 'teaching bilingual learners' (Wong Fillmore and Valadez 1986). Nor
are there many good and systematic incidental treatments of empirical research
in any number of the relevant sub-fields of our field of action. The scope of this
overview is a much more limited one; the main questions will concern the following
aspects of FLT research in Europe:
In this way we hope to provide some insight into past and ongoing developments
in the European scene of FLT research, and to suggest some directions
that future research might take.
22 THEO VAN ELS, KEES DE BOT & BERT WELTENS
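Table 1. Analysis I: Number of publications per country and period.

Country                      Period 1  Period 2  Period 3  Period 4     Total
Belgium                             -                             4         4
FRG                                 4         9        19        47        79
France                              1         -         2         6         9
Great-Britain                       1         4         3         5        13
Netherlands                         -         3         7        15        25
Scandinavia                         3         8         2         6        19
Other W&E. Eur. countries           -         2         3         4         9
USA/Canada                          6        11         7        31        55
Other countries                                         2         3         5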
For the first two analyses, which were adapted from Van Els (1988), we
used the fairly representative collection of books and journals in the field of applied
linguistics at our department. All the important international journals are
represented and there are about 5000 volumes: handbooks, monographs,
proceedings and readers, not including foreign language teaching materials, of
course. All the journals, from their first issues, all the books acquired since
about 1976 and some of the books from before 1976, have been systematically
catalogued in a fully computerized bibliographical system. For analysis I, separate
lists were printed, for four consecutive periods of 5 or 6 years, of all books
and articles to which the key-word 'foreign language teaching', and also either
the key-word 'empirical research' or the key-word 'research report' had been attributed.
The total number of items found was 218.
In table 1 these publications have been categorized according to the country
where the research was carried out. What one sees is first of all a steady increase
in the number of publications dealing with FLT research over the past twenty
years; the increase is particularly striking for the fourth period. Secondly, Europe
appears to have shown a steadier increase than North America.
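The "steadier increase" can be checked directly against the counts in table 1. The following is a minimal sketch, not the authors' procedure: the names are illustrative, a dash in the table is read as zero, and Belgium's single entry of 4 is assumed to belong to the fourth period.

```python
# Counts per period copied from Table 1; "-" in the table is read as 0.
# ASSUMPTION: Belgium's single entry (4) falls in the fourth period.
EUROPE = {
    "Belgium": [0, 0, 0, 4],
    "FRG": [4, 9, 19, 47],
    "France": [1, 0, 2, 6],
    "Great-Britain": [1, 4, 3, 5],
    "Netherlands": [0, 3, 7, 15],
    "Scandinavia": [3, 8, 2, 6],
    "Other W&E. Eur. countries": [0, 2, 3, 4],
}
NORTH_AMERICA = [6, 11, 7, 31]  # the USA/Canada row


def period_totals(rows):
    """Sum the rows column by column, giving one total per period."""
    return [sum(column) for column in zip(*rows)]


def rises_every_period(series):
    """True if each period's count exceeds that of the period before it."""
    return all(later > earlier for earlier, later in zip(series, series[1:]))


EUROPE_TOTALS = period_totals(EUROPE.values())

if __name__ == "__main__":
    print("Europe:       ", EUROPE_TOTALS, rises_every_period(EUROPE_TOTALS))
    print("North America:", NORTH_AMERICA, rises_every_period(NORTH_AMERICA))
```

Under these assumptions the European totals rise in every period, whereas the North American counts dip in the third period before rising sharply in the fourth.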
In this count, the share that individual countries take in the total output
varies a great deal. Particularly low is the share of both France and Great-Britain.
While there is an overall increase of the output for all countries over the
period, Scandinavia is an exception to the rule: the number for 1972/76 reflects
the special activities in connection with the well-known GUME-project (see
Von Elek and Oskarsson 1975). Another striking point is the fact that the FRG
has produced a great number of more 'general' works, i.e. works discussing research
planning, design, or policy, most of them in the last few years.
Table 2. Analysis II: Number of publications dealing with research on FL/L2 learning
and teaching.

Country                      Period 1  Period 2  Period 3  Period 4     Total
Belgium                             -         2         3        11        16
FRG                                13        44        85        94       236
France                              3         4         3        14        24
Great-Britain                      11         8        11        19        49
Netherlands                         2        17        48        78       145
Scandinavia                         4         8        10        15        37
Other W&E. Eur. countries           6        10        11        19        46
In order to validate the figures in table 1, a second analysis was carried out.
In this second analysis all those documents that had been assigned either 'empirical
research' or 'research report' were again selected, as had been done in
the first, but instead of just adding 'foreign language teaching' as a selection
term, 'foreign language teaching or foreign language learning or second language
teaching or second language learning' was added. This led to a total of 892
publications being selected. They were categorized according to country and
period in the same way, with the results shown in table 2.
As can be seen in table 2, the overall tendencies are comparable to those in
table 1: a general increase over the years, and a relatively minor contribution
from France and Great-Britain. Note, also, that the share from the USA and Canada
has risen remarkably (from 25% in table 1 to 35% in table 2), mainly as a
result of the wealth of Canadian publications on L2 learning and teaching.
Nevertheless, this increase does not bridge the gap between Europe and North
America.
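The 25% figure follows directly from table 1 (55 of 218 items). The sketch below reproduces that share and, for table 2, computes only what the surviving rows allow: the European sub-total and the non-European remainder. The variable names are illustrative, and the remainder is an upper bound on the USA/Canada share rather than a reported figure, since it also includes the 'other countries'.

```python
# Figures taken from the text and from the Total columns of Tables 1 and 2.
TABLE1_TOTAL = 218
TABLE1_USA_CANADA = 55

TABLE2_TOTAL = 892
# Row totals of the seven European rows in Table 2.
TABLE2_EUROPE_ROWS = [16, 236, 24, 49, 145, 37, 46]


def percent(part, whole):
    """Share of `part` in `whole`, as a percentage."""
    return 100.0 * part / whole


SHARE_TABLE1 = percent(TABLE1_USA_CANADA, TABLE1_TOTAL)

TABLE2_EUROPE = sum(TABLE2_EUROPE_ROWS)
TABLE2_NON_EUROPE = TABLE2_TOTAL - TABLE2_EUROPE
# Upper bound on the USA/Canada share in Table 2 (includes other countries).
NON_EUROPE_SHARE_TABLE2 = percent(TABLE2_NON_EUROPE, TABLE2_TOTAL)

if __name__ == "__main__":
    print(f"USA/Canada share, table 1:   {SHARE_TABLE1:.1f}%")
    print(f"European sub-total, table 2: {TABLE2_EUROPE}")
    print(f"Non-European share, table 2: {NON_EUROPE_SHARE_TABLE2:.1f}%")
```

The non-European remainder in table 2 is consistent with the 35% quoted for the USA and Canada, leaving a small residue for the other countries.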
In the third analysis, quite a different perspective was taken. We opted for a
count of European publications in a limited set of journals figuring in the Arts &
Humanities Citation Index, which we take as an indication of their scientific impact.
Nine journals were selected from this corpus on the basis of our estimation
of their relevance for the field. They were the following:
1. Applied Linguistics;
2. Canadian Modern Language Review;
3. Foreign Language Annals;
4. International Review of Applied Linguistics;
5. Journal of Multilingual and Multicultural Development;
6. Language Learning;
7. Modern Language Journal;
8. System;
9. TESOL Quarterly.
Table 3. Analysis III: Number of publications per country.

Country                                            Total
FRG                     13    15     3     2     3    36
Great-Britain            7    46    12    15     6    86
Netherl./Belgium         6    14    10     0     1    31
Scandinavia              5    10     8     0     1    24
E. Europe                0     5     0     1     0     6
S. Europe                1    19     3     2     1    26
Table 4. Percentages per country.

FRG                     54    40    37    24
Great-Britain            6    22     8    31
Netherlands/Belgium     22    19    36    18
Scandinavia              7    16     6    18
E. Europe                5     0     8     0
S. Europe                7     3     6     6
analysing the first four volumes of Language Testing (1984-1987). Our results
were confirmed: apart from a remarkably strong Israeli contribution, the FRG
and Great-Britain appeared to be the strongest European contributors to the
language testing literature (6 and 11 articles resp. out of a total of 19; the remaining
two came from the Netherlands).
When we want to compare the data from the three analyses, the best comparison
is the number of empirical studies dealing with FLT (Analysis I), and
FLT or SLT (Analysis II) from the period 1981-1987 on the one hand, and the
same categories of studies from the period 1982-1987 (Analysis III) on the other
hand. This comparison is represented in table 4 in terms of percentages per
country. (For the sake of simplicity, we have left out the North-American articles
from analyses I and II, and computed the percentages on the basis of the
European sub-total.)
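The normalisation described here, percentages computed on the European sub-total only, can be sketched as follows. This is an illustration rather than the authors' own computation: it assumes that the rightmost period column of table 2 is the most recent period, and it merges the Netherlands and Belgium as table 4 does. All names are illustrative.

```python
# Rightmost period column of Table 2, European rows only.
# ASSUMPTION: this column is the most recent period used in the comparison.
LAST_PERIOD = {
    "Belgium": 11,
    "FRG": 94,
    "France": 14,
    "Great-Britain": 19,
    "Netherlands": 78,
    "Scandinavia": 15,
    "Other W&E. Eur. countries": 19,
}


def merge(counts, parts, label):
    """Return a copy of `counts` with the rows in `parts` summed under `label`."""
    merged = {name: n for name, n in counts.items() if name not in parts}
    merged[label] = sum(counts[name] for name in parts)
    return merged


def percentages(counts):
    """Express each count as a percentage of the sub-total of all counts."""
    subtotal = sum(counts.values())
    return {name: 100.0 * n / subtotal for name, n in counts.items()}


GROUPED = merge(LAST_PERIOD, ("Netherlands", "Belgium"), "Netherlands/Belgium")
SHARES = percentages(GROUPED)

if __name__ == "__main__":
    for name, share in sorted(SHARES.items(), key=lambda item: -item[1]):
        print(f"{name:28s}{share:5.1f}%")
```

Because North America is excluded before dividing, the shares sum to 100 within Europe and are not comparable with the overall shares quoted for tables 1 and 2.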
Table 4 shows that all three analyses yield highly comparable results in
many respects, but there are also a few remarkable differences. On the one
hand, analyses I and II overestimate the German and Dutch/Belgian contributions;
this may be attributed to the nature of the database used, which contains a
relatively high proportion of documents written in German and Dutch. On the
other hand, analysis III yields a (much) larger contribution from Great-Britain
and the Scandinavian countries; this may be due to the fact that the preponderance
of the journals selected publish in English, and the fact that one of the
journals (System) is based in Sweden, respectively. In fact, analyses I and II represent
the total research effort within each country, whereas analysis III is
limited to that part of the effort that is likely to have an international impact.
A clear and important finding in analyses I and II was the steady rise in the
number of empirical publications over time. In analysis III, which dealt with a
relatively short period of time, we also looked at this development, with the result
presented in table 5.
Table 5. Analysis III: Number of empirical publications per year.

             81   82   83   84   85   86   87  Total
FLT +Emp      6    3    6    4    5    4    4     32
SLT +Emp      5    3    3   10    4    5    6     36
Total        11    6    9   14    9    9   10     68
The tendency noted in analyses I and II across the years 1966-1987 appears
not to continue within the 1980s: the number of empirical articles on FLT/SLT
fluctuates around 10 per year, and there is no sign of an increase over the years.
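The claim that the yearly totals hover around 10 without any upward trend can be illustrated from the table 5 counts. The sketch below computes the mean per year and an ordinary least-squares slope; fitting a straight line is a choice made here for illustration, not a method used in the chapter.

```python
# Yearly counts copied from Table 5.
YEARS = [81, 82, 83, 84, 85, 86, 87]
FLT_EMP = [6, 3, 6, 4, 5, 4, 4]
SLT_EMP = [5, 3, 3, 10, 4, 5, 6]


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line of ys on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    return s_xy / s_xx


TOTALS = [flt + slt for flt, slt in zip(FLT_EMP, SLT_EMP)]
MEAN_PER_YEAR = sum(TOTALS) / len(TOTALS)
SLOPE = least_squares_slope(YEARS, TOTALS)

if __name__ == "__main__":
    print("Totals per year:", TOTALS)
    print(f"Mean per year:   {MEAN_PER_YEAR:.2f}")
    print(f"Slope per year:  {SLOPE:.3f}")
```

A slope close to zero, set against year-to-year swings of several articles, agrees with the reading that no increase is visible within the 1980s.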
One final comment concerning the figures should be added here. What they
irrefutably show is that there has been an increase of empirical research over the
past two decades. That in itself is very gratifying. What the figures do not show,
however, is how our field compares in this respect to other fields of research.
Whether, therefore, the rate of growth of applied linguistic research is satisfactory
in comparison with that of other fields, or, for that matter, in proportion
to the need for research in the field of foreign language teaching, we do not
know at all. In addition, a recent survey by the Association of Dutch Universities
(VSNU) showed that the scientific output has increased considerably in all fields
over the last decade.
It is surprising to see how in very recent times people in the context of the
European Common Market have been growing acutely aware of the fact that on
1 January 1993 the unification process of the European countries concerned will
take a major step towards doing away with barriers of all kinds between the
countries. A growing number also realize that language
communication, i.e. the efficient use of the languages of Europe within
the Community itself and also of a number of 'outside' languages, will play an
important and critical part in bringing the process of unification to a successful
end.
A major recent development is the establishment of a vast joint programme
for promoting the teaching and learning of foreign languages in the European
Community, called LINGUA. In the preamble of the programme proposal, lack
of foreign language skills is called "the Achilles Heel in the Community-wide effort
to make the free movement of persons and ideas a practical reality" (Document
no. 6614-89 of the 1321st session of the Council of Europe and the
Ministers of Education, May 22, 1989). The central aim of the programme,
which is to start in 1990, and for which a fairly large budget has been set aside,
is "to increase the capacity of the citizens of the Community to communicate
with each other by a quantitative and qualitative improvement in the teaching
and learning of foreign languages within the European Community" (o.c.). Similar
considerations have, also very recently and at very short notice, incited the
Dutch Ministry of Education to commission the writing of a National Action
Programme for foreign language use and teaching in the Netherlands. The European
perspective is to be one of the main issues.
One of the leading principles agreed upon by the European nations is that
the rich and diverse heritage of European cultures and languages should not be
jeopardized in this process. This cannot but mean that the attention for FLT,
which in some of the countries has never been overwhelming so far, will have to
be increased in all the countries. A number of private foundations have been
working towards this end by organizing a series of conferences in the last few
years, bringing together experts in the field of FLT and international communication,
politicians and representatives from the world of business and commerce,
to discuss the immense problems and possible solutions. A major role in
this enterprise has been played by the European Cultural Foundation, which
also co-funded the conference from which the present volume arose. It is interesting
to see that the 'manifesto' drawn up at a previous conference, held in Madrid,
June 1987, not only stresses the great importance of utilizing Europe's
diversity of cultures and languages and of overcoming the difficulties caused by
that very diversity by a major effort to improve FLT, quantitatively and qualitatively,
but also stresses the importance of the promotion of empirical research
into all aspects of the problem area, inclusive of the teaching of foreign languages.
In the European context, therefore, it is very gratifying to see that the demand
from applied linguists for more research into FLT is backed by a growing
awareness in other, also political, circles that a great deal of work desperately
needs doing. If we can come up with the right ideas for empirical research, it is
our conviction that the opportunities for carrying it out will be made available.
What, now, are the right ideas for empirical research? Everybody will agree
that a first requirement is for our research to be more truly empirical. There is
no need to stress here that repeated statements of mere opinions and hunches on
what should be taught and especially on how foreign languages should be taught
will not further the cause of FLT any more than they have done so far. How the
research effort, both as to what should be taught and how, can be made more
truly empirical is one of the main themes in the other contributions to this volume.
Our next point is one on which we may not all agree as wholeheartedly. In
his 1988 paper Van Els stressed, as he had done before, that the main source
of inspiration for applied linguistic research should be sought primarily in FLT
itself. As a 'problem-oriented' discipline applied linguistics should be concerned
with questions originating from the actual teaching of foreign languages, and not
from one of the related source disciplines. In the paper in question he argued
that the sometimes vehement academic dispute in the FRG between Sprachlehrforscher
('language-teaching researchers') like Karl-Richard Bausch on the
one hand, and Zweitspracherwerbsforscher ('SLA researchers') like Henning
Wode on the other, may well find its main explanation in the fear felt by the former
that there is a great danger in the weight given to second language learning
research by Wode that FLT will again be turned into 'the child of fashion' of any
new development in any of the source disciplines, most prominently—of
course—linguistics, but also developmental psychology.
Two further points that we think are of general interest relate to more specific
aspects of research into FLT. First of all, it is not uncommon to set the
goals of FLT, i.e. of any programme serving whatever target group, at the highest
level imaginable, i.e. at native or at least native-like level. Usually this is
done by people who have not given the matter any serious consideration, but it
also happens that it is a point of view taken deliberately and stated in the most
explicit terms. Our point of view is that not only is it, in by far most instances,
fully unrealistic to set one's goals so high, as everyone would agree, but that
different aims set for teaching programmes may also fundamentally affect the
teaching and learning that should lead up to those aims. What may be valuable
procedures and practices in one programme may lose their strengths in programmes
in which one attempts to achieve different sets of goals. What point
would there be in stressing absolute correctness of spoken competence in a programme
that sets out to achieve a high level of reading ability? In this kind of
programme there will be very little need for listening comprehension exercises,
let alone for pronunciation drills. The more we aim at explicitly defining particular
sets of learning goals aimed at satisfying particular learning needs ('learning
units', 'modules'), the more we will have to adopt teaching methodologies tailored
to achieving those goals with a maximum of effect, with the highest
possible level of efficiency. So far, there is very little empirical evidence as to
which teaching methodologies to choose under those different circumstances.
The second point in this connection — also related to goals of teaching — was
elaborated in a paper by Van Els and Weltens (1989: 23). For brevity's sake, we
will simply quote the relevant passage from the paper:
"The (second) point is that FL loss caused by non-use results in less—and,
possibly, also different—language competence from the competence achieved
right at the end of the FL course. It is often the case in our educational systems
that language courses are followed after their completion by a number of
years of non-use, before pupils are expected to apply the language competence
acquired in real-life communicative situations. In such a case the final
objective of a language course cannot be exactly the same as the competence
required later for actual communicative usage, but—in order to make up for
the loss sustained in the meantime—may well have to be higher and, possibly
even, different."
Now that our project into the loss of school-French in the Netherlands has
been completed, we may have to adapt the previous statement somewhat. For,
surprisingly enough, what we found for the written receptive skills, i.e. reading
comprehension, after two years of non-use was an increase rather than a decrease
(cf. Weltens 1989). But, whichever way, our conclusion cannot but be that
effects of FLT programmes should not be measured merely as the direct outcome
of the particular programmes in question, i.e. measured immediately after
the completion of the teaching process.
4 Concluding remarks
References
Brophy, J., and T.L. Good. 1986. "Teacher behavior and student achievement." Wittrock
1986.328-375.
DES (Department of Education and Science). 1983. Foreign languages in the school curriculum.
London: Welsh Office.
Fraser, B.J., H.J. Walberg, W.W. Welch, and J.A. Hattie. 1987. "Syntheses of educational productivity
research." International Journal of Educational Research 11/2.145-252.
Henningsson, B. 1986. "Foreign language teaching in Swedish schools." FIPLV World News 6.3-4.
Van Els, T. 1988. "European developments in applied linguistics." Applied linguistics in society ( =
British Studies in Applied Linguistics, 3) ed. by P. Grunwell, 16-29. London: CILT.
Van Els, T., and B. Weltens. 1989. "Foreign language loss research from a European point of
view." ITL Review of Applied Linguistics 83/84.19-35.
Von Elek, T., and M. Oskarsson. 1975. Comparative method experiments in foreign language teach-
ing: The final report of the GUME/Adults project. Gothenburg: School of Education.
Weltens, B. 1989. The attrition of French as a foreign language ( = Studies on Language Acquisition,
6.) Dordrecht/Providence, RI: Foris.
Wittrock, M.C., ed. 1986. Handbook of Research on Teaching. 3rd Edition. New York: Macmillan.
Wong Fillmore, L., and C. Valadez. 1986. "Teaching bilingual learners." Wittrock 1986.648-685.
Zapp, F.J. 1979. Foreign language policy in Europe. An outline of the problem. Brussels: European
Cooperation Fund.
Section II—Measurement and Research Design
Introduction to the Section Measurement and
Research Design
Ralph B. Ginsberg
Michael H. Long
1 Against methods
At stages 1 and 2, not just Spanish speakers, whose L1 has pre-verbal negation,
but also Japanese learners, whose native system is post-verbal, initially produce
pre-verbally negated utterances in ESL (Gillis and Weber 1976; Stauble
1981), although the Japanese abandon the strategy sooner (Zobl 1982). Pre-verbal
negator placement appears to reflect strong internal pressures, for it is widely
observed in studies of both naturalistic and instructed SLA. Turkish speakers
receiving formal instruction, for example, start with pre-verbal negation in
Swedish, even though both L1 and L2 have post-verbal systems (Hyltenstam
1977).
With minor variations, the evidence to date suggests that the same developmental
sequences are observed in the ILs of children and adults, of naturalistic,
instructed and mixed learners, of learners from different L1 backgrounds, and of
learners performing on different tasks. L1 differences occasionally result in additional
sub-stages and swifter or slower passage through stages, but not in disruption
of the basic sequence by skipping stages (for review, see Ellis 1985;
Larsen-Freeman and Long, in press; Zobl 1982).
Passage through each stage, in order, appears to be unavoidable, and obligatoriness
has been incorporated into the definition of "stage" in SLA (Meisel,
Clahsen and Pienemann 1981; Johnston 1985). As would be predicted if this definition
is accurate, it also seems that developmental sequences are impervious
to instruction. It has repeatedly been demonstrated that morpheme accuracy orders
and developmental sequences do not reflect instructional sequences (Lightbown
1983; Ellis 1989), and tuition in a German SL word order structure beyond
A DESIGN FEATURE IN LANGUAGE TEACHING METHODOLOGY 43
students' current processing abilities has been shown not to result in learning
(Pienemann 1984).
The results for developmental sequences, together with related findings of
common (although not invariant) naturalistic and instructed morpheme accuracy
orders, show that language learning is obviously at least partly governed by
forces beyond a teacher's or textbook writer's control. This realization has in
turn led some theorists to conclude that classrooms are useful to the extent that
they provide sheltered linguistic environments for beginners, but that it does not
help for teachers to focus on linguistic form. An inference that could easily be
drawn from such interpretations is that there are only two options in this area of
course design: either (1) a linear, additive syllabus and methodology whose content
and focus is a series of isolated linguistic forms (sound contrasts, lexical
items, structures, speech acts, notions, etc.), or (2) a program with no overt focus
on linguistic forms at all. While this turns out to be a false dichotomy, focus on
form is a potentially important design feature for distinguishing instructional
methodologies and settings.
Focus on form is a feature which reveals an underlying similarity among a
variety of (a) teaching "methods", e.g. ALM, TPR, Grammar Translation and Silent
Way, (b) syllabus types, e.g. structural, notional-functional, lexical, and (c)
program types, e.g. submersion, immersion, sheltered subject-matter, which on
the surface appear to differ greatly. Groups (a) and (b) all utilize an overt focus
on form; Group (c) does not. It also allows generalizations across traditional
boundaries, identifying a link between the program types in group (c) and, in theory
at least, a linguistically non-isolating teaching "method", such as the Natural
Approach (Krashen and Terrell 1983). At the classroom process level, techniques,
procedures, exercises and pedagogic tasks can also be categorized as to
whether or not they either permit or require a focus on form. Display questions,
repetition drills and error correction, for example, all overtly focus students on
form; referential questions, true/false exercises and two-way tasks do not. Finally,
while many potentially relevant design features will distinguish some
methods, syllabi, tasks and tests from others, few have the valency of focus on
form. It appears to be a parameter one value or another of which characterizes
almost all language teaching options.
Five caveats are in order. First, it is not being suggested that whether or not
a program type, syllabus, method, task or test focuses on form is the only relevant
design characteristic or that important differences will not exist among
members of groups which share the feature, and vice versa. Second, while most
programs, syllabi, methods, tasks and tests either do or do not overtly focus on
form, some within the former group differ in the degree to which they isolate linguistic
structures, not to mention as to how they do so; there are, in other words,
The practice of isolating linguistic items, teaching and testing them one at a
time, was originally motivated by advances in behaviorist psychology and structuralist
linguistics. Combined with the advent of a world war and a sudden need
for fluent foreign language speakers, these events led to the growth of ALM and
its many progeny. As distinct from a focus on form, to which we return below,
structural syllabi, ALM, and variants thereof involve a focus on forms. That is to
say, the content of the syllabus and of lessons based on it is the linguistic items
themselves (structures, notions, lexical items, etc.); a lesson is designed to teach
"the past continuous", "requesting" and so on, nothing else.
Arguments abound against making isolated linguistic structures the content
of a FL course, that is, against a focus on forms. Of the hundreds of studies of in-
terlanguage (IL) development now completed, not one shows either tutored or
naturalistic learners developing proficiency one linguistic item at a time. On the
contrary, all reveal complex, gradual and inter-related developmental paths for
grammatical subsystems, such as auxiliary and negation in ESL (Stauble 1981;
Kelley 1983), and copula and word order in GSL (Meisel, Clahsen and Piene-
mann 1981). Moreover, development is not unidirectional; omission/suppliance
of forms fluctuates, as does accuracy of suppliance.
Although most syllabi and methods assume the opposite, learners do not
move from ignorance of a form to mastery of it in one step, as is attested by the
very existence of developmental sequences like that for ESL negation. Typically,
when a form first appears in a learner's IL, it is used in a non-target-like manner,
and only gradually improves in accuracy of use. It sometimes shifts in function
over time as other new (target-like and non-target-like) forms enter (Huebner
1983). It quite often declines in accuracy or even temporarily disappears altogether
due to a change elsewhere in the IL (see, e.g. Meisel, Clahsen and
Pienemann 1981; Huebner 1983; Lightbown 1983; Neumann 1977), a phenomenon
sometimes describable as U-shaped behavior (Kellerman 1985). Further,
attempts to teach isolated items one at a time fail unless the structure happens
the foreign language is spoken, the cultures of its speakers, and so on — and
overtly draw students' attention to linguistic elements as they arise incidentally
in lessons whose overriding focus is on meaning, or communication. Views
about how to achieve this vary. One proposal is for lessons to be briefly "interrupted"
by teachers when they notice students making errors which are (1) systematic,
(2) pervasive and (3) remediable. The linguistic feature is brought to
learners' attention in any way appropriate to the students' age, proficiency level,
etc. before the class returns to whatever pedagogic task they were working on
when the interruption occurred. (For details and a rationale, see Crookes and
Long 1987; Long, in press).
An example of the probable effect of instruction on ultimate attainment
comes from work on the acquisition of relative clauses in a SL. Several studies
(e.g., for English: Gass 1982; Gass and Ard 1980; Pavesi 1986; Eckman, Bell and
Nelson 1988; for Swedish: Hyltenstam 1984) have shown that both naturalistic
and instructed acquirers develop relative clauses in the order predictable from
the noun phrase accessibility hierarchy (Keenan and Comrie 1977; Comrie and
Keenan 1979; see Figure 1), although with occasional reversals of levels 5 and 6.
Of particular interest in the present context, Pavesi (1986) compared
relative clause formation by instructed and naturalistic acquirers. The former
were 48 Italian high school students, ages 14-18, who had received from 2 to 7
years (an average of 4 years) of grammar-based EFL instruction and who had
had minimal or (in 45 of 48 cases) no informal exposure to English. The untutored
learners were 38 Italian workers (mostly restaurant waiters), ages 19-50,
who had lived in Scotland anywhere from 3 months to 25 years (an average of 6
years), with considerable exposure to English at home and at work, but who had
received minimal (usually no) formal English instruction.
Relative clause constructions were elicited using a set of numbered pictures
and question prompts ("Number 7 is the girl who is running", and so on). Implicational
scaling showed that both groups' developmental sequences correlated
significantly with the noun phrase accessibility hierarchy. There were two
kinds of differences between the groups, however. First, naturalistic learners produced statistically
significantly more full nominal copies than the instructed learners (e.g. "Number
4 is the woman who the cat is looking at the woman"), whereas instructed
learners produced more pronominal copies ("Number 4 is the woman who the
cat is looking at her"). Given that neither English nor Italian allows copies of
either kind, this is further evidence of the at least partial autonomy of IL syntax,
a claim also supported by the developmental sequence itself, of course. Interestingly,
the relative frequencies of the different kinds of copies suggest that the
instructed learners had "grammaticized" more, even in the errors they made, a
result consistent with findings by Pica (1983) and Lightbown (1983). Second,
more instructed learners reached the 80 percent criterion on all of the five lowest
NP categories in the hierarchy, with differences attaining statistical significance
at the second lowest (genitive) level and falling just short (p < .06) at the lowest
(object of a comparative) level. More instructed learners (and very few
naturalistic acquirers) were able to relativize out of the more marked NPs in the
hierarchy. In considerably less average time, that is, instructed learners had reached
higher levels of attainment.
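The scaling logic behind results of this kind can be illustrated with a minimal sketch. It is not Pavesi's own analysis: the learner data are invented, the function names are hypothetical, and scaling errors are counted by one simple convention (comparing each learner's row with the ideal pattern for that learner's total score). The sketch assumes each learner is coded 1 or 0 per NP level according to whether the 80 percent criterion was reached, with levels ordered from least to most marked.

```python
from typing import Sequence

# Levels of the noun phrase accessibility hierarchy, least to most marked.
NPAH_LEVELS = [
    "subject",
    "direct object",
    "indirect object",
    "object of preposition",
    "genitive",
    "object of comparative",
]


def meets_criterion(correct: int, attempts: int, threshold: float = 0.8) -> bool:
    """True if the proportion of correct relativizations reaches the threshold."""
    return attempts > 0 and correct / attempts >= threshold


def scaling_errors(row: Sequence[int]) -> int:
    """Count cells that deviate from the ideal implicational pattern.

    The ideal pattern for a learner who reached k levels is k ones
    followed by zeros (assumed convention for this sketch).
    """
    reached = sum(row)
    ideal = [1] * reached + [0] * (len(row) - reached)
    return sum(actual != expected for actual, expected in zip(row, ideal))


def reproducibility(matrix: Sequence[Sequence[int]]) -> float:
    """Coefficient of reproducibility: 1 - (errors / total cells)."""
    cells = sum(len(row) for row in matrix)
    errors = sum(scaling_errors(row) for row in matrix)
    return 1 - errors / cells


# Invented data: one row per learner, one column per NPAH level.
LEARNERS = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],  # a reversal of two adjacent levels
    [1, 0, 0, 0, 0, 0],
]

if __name__ == "__main__":
    print(round(reproducibility(LEARNERS), 3))
```

A coefficient close to 1 indicates that the learners' patterns are nearly fully implicational, i.e. that success at a more marked level almost always implies success at all less marked levels.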
Pavesi's study is a non-equivalent control groups design, so causal claims
are precluded. There are also no data on whether or not the high school students
were ever actually taught relative clauses, or if so, which ones. We know simply
that they received something like a grammar-translation course. The findings
are nonetheless suggestive of the kind of effects a focus on form may have on
ultimate SL attainment. Two other studies, furthermore, have shown that
structurally focused teaching of relative clause formation can accelerate learning, and
also that, at least as far down as level 4 (object of a preposition) in the hierarchy,
instruction in a more marked structure will generalize back up the implicational
scale to less marked structures (Gass 1982; Eckman et al. 1988; and see also
Zobl 1985).
SLA research findings like those briefly described here would seem to
support two conclusions. (1) Instruction built around a focus on forms is
counterproductive. (2) Instruction which encourages a systematic, non-interfering focus
on form produces a faster rate of learning and (probably) higher levels of
ultimate SL attainment than instruction with no focus on form. If correct, this would
make [+ focus on form] a desirable design feature of FL instruction. Programs
exist which have this feature, alternating in some principled way between a focus
on meaning and a focus on form. (One example is task-based language teaching.
See Long 1985; Crookes and Long 1987; Long and Crookes 1989; Long, in
press.) Programs with a focus on form need to be compared in carefully
controlled studies with programs with a focus on forms and with programs (e.g. the
Natural Approach) with no overt focus on form.
4 Further research
True experiments are needed which compare rate of learning and ultimate
level of attainment after one of three programs: focus on forms, focus on form,
and focus on communication. Preliminary research in this area has produced
mixed results, two studies finding positive relationships between the amount of
class time given to a focus on forms and various proficiency measures
(McDonald, Stone and Yates 1977, for ESL; Mitchell, Parkinson and Johnstone 1981,
for French FL), and a third study of ESL (Spada 1986, 1987) finding no such
effects. (For a detailed review, see Chaudron 1988.) All three studies were
comparisons of intact groups which differed in degree of focus on forms, it should be
noted. Research has yet to be conducted comparing the three distinct program types.
Studies of this kind should be true experiments, employing a pretest/post-test
control group design, and should also include a process component to
monitor implementation of the three distinct treatments. They should utilize multiple
outcome measures, some focusing on accuracy, some on communicative ability
or fluency, thereby avoiding (supposed) bias in favour of one program or
another. The post-tests should include immediate and delayed measures, since at
least one study (Harley 1989) has found that a short-term advantage for students
receiving form-focused instruction had disappeared three months later. Some of the
measures should further reflect known developmental sequences and patterns of
variation in ILs, appropriate for the developmental stages of the subjects as
revealed on the pretests. A distinction should be maintained between
constructions which are in principle learnable from positive instantiation in the input and
constructions which in principle require negative evidence. (For further details
and desirable characteristics of such studies, see Long 1984, forthcoming;
Larsen-Freeman and Long 1989.)
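The core of such a design, random assignment to the three treatments followed by gain scores on immediate and delayed post-tests, can be sketched as follows. This is only an illustration under stated assumptions: the function names, the fixed seed, and the use of simple mean gain scores are hypothetical choices for the sketch, not procedures taken from any of the studies cited.

```python
import random
from statistics import mean
from typing import Dict, Hashable, Iterable, List, Sequence

# The three program types to be compared.
PROGRAMS = ["focus on forms", "focus on form", "focus on communication"]


def assign_at_random(
    learner_ids: Iterable[Hashable],
    programs: Sequence[str] = PROGRAMS,
    seed: int = 0,
) -> Dict[Hashable, str]:
    """Shuffle the learners, then deal them out to the programs in turn,
    so that group sizes differ by at most one."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    shuffled = list(learner_ids)
    rng.shuffle(shuffled)
    return {lid: programs[i % len(programs)] for i, lid in enumerate(shuffled)}


def mean_gains(
    assignment: Dict[Hashable, str],
    pretest: Dict[Hashable, float],
    posttest: Dict[Hashable, float],
) -> Dict[str, float]:
    """Mean (post-test minus pretest) score for each program."""
    gains: Dict[str, List[float]] = {}
    for lid, program in assignment.items():
        gains.setdefault(program, []).append(posttest[lid] - pretest[lid])
    return {program: mean(values) for program, values in gains.items()}
```

Calling `mean_gains` once with the immediate post-test scores and once with the delayed post-test scores would show whether an early advantage for one treatment persists. A real study would add an inferential test and the process measures described above.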
Several additional issues need to be addressed, either as separate studies of
the focus on form design feature or as sub-parts of the basic study outlined
above. Many interesting questions remain unanswered, after all. It will be useful
to ascertain which structures require focus and/or negative evidence, and which
can be left to the care of "natural processes" (White 1987). Other possibilities
include studies motivated by implicational markedness relationships designed to
determine the principles governing maximal generalizability of instruction (see,
e.g. Eckman et al. 1988). Similarly, one can envisage studies inspired by current
models of UG designed to test the claimed potential of certain structures to
trigger instantaneous (re-)setting of a parameter. An example would be Chomsky's
(1981) work on the pro-drop parameter, and the claimed triggering effects of
expletives with it and there as dummy subjects (Hyams 1983; Hilles 1986). Finally,
further theoretically motivated work, like that of Pienemann (1984) and
Pienemann and Johnston (1987), is clearly needed on the timing of instruction.
Research of these and other kinds will establish the validity and scope of focus on
form as a design feature in language teaching methodology.
References
Allwright, R.L. 1977. "Language learning through communication practice." ELT Docs 76/3.2-14.
Breen, M.P. 1987. "Contemporary paradigms in syllabus design." Language Teaching 20/2.81-92,
and 20/3.157-174.
Chaudron, C. 1988. Second Language Classrooms. Research on Teaching and Learning. Cam-
bridge: Cambridge University Press.
Chomsky, N. 1981. Lectures on Government and Binding. Dordrecht: Foris.
Comrie, B. and E.L. Keenan. 1979. "Noun phrase accessibility revisited." Language 55.649-664.
Corder, S.P. 1967. "The significance of learners' errors." International Review of Applied Linguis-
tics 5.161-170.
Crookes, G. 1986. Task classification: a cross-disciplinary review ( = Technical Report, 4.) Honolu-
lu: Center for Second Language Classroom Research, Social Science Research Institute,
University of Hawaii at Manoa.
Crookes, G. and M.H. Long. 1987. "Task-based language teaching. A brief report." Modern
English Teaching (Part 1) 8.26-28 + 61, and (Part 2) 9.20-23.
Dinsmore, D. 1985. "Waiting for Godot in the EFL classroom." ELT Journal 39.225-234.
Dulay, M. and H. Dulay. 1973. "Should we teach children syntax?" Language Learning
24/2.245-258.
Eckman, F.R., L. Bell and D. Nelson. 1988. "On the generalization of relative clause instruction in
the acquisition of English as a second language." Applied Linguistics 9/1.1-20.
Ellis, R. 1984. "The role of instruction in second language acquisition." Language Learning in For-
mal and Informal Contexts ed. by D.M. Singleton and D.G. Little, 19-37. Dublin: IRAAL.
Ellis, R. 1985. Understanding Second Language Acquisition. Oxford: Oxford University Press.
Ellis, R. 1989. "Are classroom and naturalistic acquisition the same? A study of the classroom
acquisition of German word order rules." Studies in Second Language Acquisition 11/3.305-328.
Felix, S.W. 1981. "The effect of formal instruction on second language acquisition." Language
Learning 31/1.87-112.
Gass, S.M. 1982. "From theory to practice." On TESOL '81 ed. by M. Hines and W. Rutherford,
129-139. Washington, DC: TESOL.
Gass, S.M. and J. Ard. 1980. "L2 data: their relevance for language universals." TESOL Quarterly
14/4.443-452.
Gillis, M. and R. Weber. 1976. "The emergence of sentence modalities in the English of
Japanese-speaking children." Language Learning 26/1.77-94.
Harley, B. 1988. "Effects of instruction on SLA: issues and evidence." Annual Review of Applied
Linguistics 9.165-178.
Harley, B. 1989. "Functional grammar in French immersion: a classroom experiment." Applied
Linguistics 10/3.331-359.
Hilles, S. 1986. "Interlanguage and the pro-drop parameter." Second Language Research 2/1.33-
52.
Hoetker, J. and W.P. Ahlbrand. 1969. "The persistence of the recitation." American Educational
Research Journal 6/1.145-167.
Hyams, N. 1983. "The pro-drop parameter in child grammars." Proceedings of the West Coast
Conference on Formal Linguistics ed. by M. Barlow, D. Flickinger and M. Westcoat. Stan-
ford, CA: Stanford University, Department of Linguistics.
Hyltenstam, K. 1977. "Implicational patterns in interlanguage syntax variation." Language Learn-
ing 27/2.383-411.
Hyltenstam, K. 1984. "The use of typological markedness conditions as predictors in second lan-
guage acquisition: the case of pronominal copies in relative clauses." Second Languages. A
Cross-Linguistic Perspective ed. by R.W. Andersen, 39-58. Rowley, MA: Newbury House.
Johnston, M. 1985. Syntactic and morphological progressions in learner English. Canberra, Austra-
lia: Commonwealth Department of Immigration and Ethnic Affairs.
Keenan, E. and Comrie, B. 1977. "Noun phrase accessibility and universal grammar." Linguistic
Inquiry 8.63-99.
Kellerman, E. 1985. "If at first you do succeed..." Input in Second Language Acquisition ed. by S.
Gass and C. Madden, 345-353. Rowley, MA: Newbury House.
Krashen, S.D. and H.W. Seliger. 1975. "The essential contributions of formal instruction in adult
second language learning." TESOL Quarterly 9/2.173-183.
Krashen, S.D. and T. Terrell. 1983. The Natural Approach. New York: Pergamon Press.
Larsen-Freeman, D. and M.H. Long. 1989. Research Priorities in Foreign Language Learning and
Teaching. Washington, DC: Johns Hopkins University, National Foreign Language Center.
Larsen-Freeman, D. and M.H. Long. In press. An Introduction to Second Language Acquisition
Research. London: Longman.
Lightbown, P.M. 1983. "Exploring relationships between developmental and instructional
sequences." Classroom-Oriented Research in Second Language Acquisition ed. by H.W. Seliger and
M.H. Long, 217-243. Rowley, MA: Newbury House.
Long, M.H. 1983. "Does instruction make a difference? A review of research." TESOL Quarterly
17/3.359-382.
Long, M.H. 1984. "Process and product in ESL program evaluation." TESOL Quarterly 18/3.409-
425.
Long, M.H. 1985. "A role for instruction in second language acquisition: task-based language
teaching." Modelling and Assessing Second Language Acquisition ed. by K. Hyltenstam and
M. Pienemann, 77-99. Clevedon, Avon: Multilingual Matters.
Long, M.H. 1988. "Instructed interlanguage development." Issues in Second Language Acquisi-
tion. Multiple Perspectives ed. by L.M. Beebe, 115-141. New York: Newbury House.
Long, M.H. Forthcoming. "The design and psycholinguistic motivation of research on foreign
language learning." To appear in Foreign Language Acquisition Research and the Classroom ed.
by B. Freed. Boston: D.C. Heath.
Long, M.H. In press. Task-Based Language Teaching. Oxford: Basil Blackwell.
Long, M.H., L. Adams, M. McLean and F. Castanos. 1976. "Doing things with words: verbal
interaction in lockstep and small group classroom situations." On TESOL '76 ed. by J.F.
Fanselow and R. Crymes, 137-153. Washington, DC: TESOL.
Long, M.H. and G. Crookes. 1989. Units of analysis in syllabus design. Ms. Department of ESL,
University of Hawaii at Manoa.
Long, M.H. and C.J. Sato. 1983. "Classroom foreigner talk discourse: forms and functions of
teachers' questions." Classroom-Oriented Research in Second Language Acquisition ed. by H.W.
Seliger and M.H. Long, 268-285. Rowley, MA: Newbury House.
McDonald, F.J., M.K. Stone and A. Yates. 1977. The effects of classroom interaction patterns and
student characteristics on the acquisition of proficiency in English as a second language.
Princeton, NJ: Educational Testing Service.
Meisel, J.M., H. Clahsen and M. Pienemann. 1981. "On determining developmental stages in
natural second language acquisition." Studies in Second Language Acquisition 3/2.109-135.
Mitchell, R., B. Parkinson and R. Johnstone. 1981. The foreign language classroom: an observa-
tional study. ( = Stirling Educational Monographs 9.) Stirling: Department of Education,
University of Stirling.
Neumann, R. 1977. An attempt to define through error analysis an intermediate ESL level at UCLA.
M.A. in TESL thesis. Los Angeles, CA: UCLA.
Newmark, L. 1966. "How not to interfere with language learning." International Journal of Ameri-
can Linguistics 32/1.77-83.
Newmark, L. and D.A. Reibel. 1968. "Necessity and sufficiency in language learning."
International Review of Applied Linguistics 6.145-164.
Nunan, D. 1987. "Communicative language teaching: making it work." ELT Journal 41/2.136-145.
Pavesi, M. 1986. "Markedness, discoursal modes, and relative clause formation in a formal and an
informal context." Studies in Second Language Acquisition 8/1.38-55.
Phillips, D. and C. Shettlesworth. 1975. "Questions in the design and implementation of courses in
English for specialized purposes." Proceedings of the 4th International Congress of Applied
Linguistics (Volume 1) ed. by G. Nickel, 249-264. Stuttgart: Hochschule Verlag.
Pica, T. 1983. "Adult acquisition of English as a second language under different conditions of ex-
posure." Language Learning 33/4.465-497.
Pica, T. and M.H. Long. 1986. "The linguistic and conversational performance of experienced and
inexperienced teachers." "Talking to learn": Conversation in Second Language Acquisition
ed. by R.R. Day, 85-98. Rowley, MA: Newbury House.
Pienemann, M. 1984. "Psychological constraints on the teachability of languages." Studies in Sec-
ond Language Acquisition 6/2.186-214.
Pienemann, M. and M. Johnston. 1987. "Factors influencing the development of language profi-
ciency." Applying Second Language Acquisition Research ed. by D. Nunan, 45-141. Adelaide,
SA: National Curriculum Resource Centre.
Prabhu, N.S. 1987. Second Language Pedagogy. Oxford: Oxford University Press.
Ross, S. Forthcoming. Praxis and product in the EFL classroom. To appear in Evaluating Second
Language Education Programs ed. by C. Alderson and A. Beretta. Cambridge: Cambridge
University Press.
Scherer, G. and M. Wertheimer. 1964. A Psycholinguistic Experiment in Foreign Language Teach-
ing. New York: McGraw-Hill.
Schmidt, R.W. 1990. "The role of consciousness in second language learning." Applied Linguistics
11/2.17-45.
Schumann, J.H. 1979. "The acquisition of English negation by speakers of Spanish: a review of the
literature." The Acquisition and Use of Spanish and English as First and Second Languages
ed. by R.W. Andersen, 3-32. Washington, DC: TESOL.
Smith, P. 1970. A Comparison of the Cognitive and Audiolingual Approaches to Foreign Language
Instruction: The Pennsylvania Foreign Language Project. Philadelphia: Center for Curriculum
Development.
Spada, N. 1986. "The interaction between types of content and types of instruction: some effects
on the L2 proficiency of adult learners." Studies in Second Language Acquisition 8/2.181-199.
Spada, N. 1987. "Relationships between instructional differences and learning outcomes: a pro-
cess-product study of communicative language teaching." Applied Linguistics 8.137-161.
Stauble, A.-M. 1981. A comparative study of a Spanish-English and Japanese-English second lan-
guage continuum: verb phrase morphology. Unpublished Ph.D. dissertation, UCLA.
Swaffer, J.K., K. Arens and M. Morgan. 1982. "Teacher classroom practices: redefining method
as task hierarchy." Modern Language Journal 66.24-33.
Von Elek, T. and M. Oskarsson. 1975. Comparative Methods Experiments in Foreign Language
Teaching. Department of Educational Research. Gothenburg, Sweden: Molndal School of
Education.
White, L. 1987. "Against comprehensible input: the Input Hypothesis and the development of sec-
ond-language competence." Applied Linguistics 8/2.95-110.
White, L. 1989. "The principle of adjacency in second language acquisition: do learners observe
the subset principle?" Paper presented at the Child Language Conference, Boston, MA.
March.
Wode, H. 1981. "Language-acquisitional universals: a unified view of language acquisition." Na-
tive Language and Foreign Language Acquisition. ( = Annals of the New York Academy of
Sciences 379) ed. by H. Winitz, 218-234. New York: New York Academy of Sciences.
Zobl, H. 1982. "A direction for contrastive analysis: the comparative study of developmental se-
quences." TESOL Quarterly 16.169-183.
Zobl, H. 1985. "Grammars in search of input and intake." Input in Second Language Acquisition
ed. by S.M. Gass and C. Madden, 329-344. Rowley, MA: Newbury House.
Pros, Cons, and Limits to Quantitative Approaches
in Foreign Language Acquisition Research
W.E. Lambert
are inattentive. But I should be qualified to talk on this topic because, in years
past, I was a statistical assistant for Leon Thurstone and have been a colleague
for years of John Carroll, George Ferguson, and Lee Cronbach, who have tried
to keep me honest, statistically speaking. Furthermore, I have been knee-deep
in quantitative research on language related issues for a long run of years. That
experience has made me a proponent of tight designs and quantitative checkouts
because all other alternatives in language research turn out to be too subjective
and personally biased. The only way I can see to be tough or rigorous on
ourselves and our ideas in this field is to put those ideas to a serious quantitative,
experimental test. This bias of mine, however, has clear limits and what I want to
do here is present what I see as the pros to quantitative approaches as well as
the cons and the limitations.
Complying with this topic assignment meant reading recent reports on
statistical procedures for dealing with performance changes over time when
large-scale evaluation studies are conducted (e.g. the papers of Willett 1988; Bryk and
Raudenbush 1987; and Rogosa, Brandt, and Zimowski 1982); reading through
several large-scale, ongoing empirical studies on foreign language pedagogy in
order to get some idea of what is going on in the North American scene; and
then thinking back on my own involvement in studies of language pedagogy and
attempting to explain what has been going on in these cases, too.
The upshot of all this is that I have three or four macro concerns that will be
the schema for organizing the comments to follow. The concerns are: 1) bigness
versus manageableness in the breadth of empirical studies; 2) the nature of
"process" in the product-versus-process debate in empirical research; and 3) the
tailoring of design and statistics to accommodate more moderately sized
investigations that will be able to explore "deep" processes or underlying mechanisms.
The United States is big and research directors as well as fund suppliers
seem to want to make their studies big, as if one can only keep up the feeling of
national unity if one brings the whole nation or some large region of it into each
empirical test of an educational or social innovation. The common argument is
that if a researcher has a really strong new pedagogical treatment in hand, or a
really important teacher or learner characteristic to examine, its effects should
be robust enough to emerge even when tested across the nation. Consequently,
it is common to hear an administrator in a federal post (e.g. in the United States
Department of Education) say that he/she has two or three somewhat related
empirical studies underway that are national in scope, each at a cost of some five
million dollars. The problem I have with this is that one can convincingly argue
that there are many "nations" all within the U.S.A. For example, one recent
estimate is that only 14% of the American population have Anglo-Saxon
roots, which makes them not much more important than the 13% who
have Germanic origins, the 11% who have African roots, the 11% who have
Hispanic roots, etc. (see Sowell 1983). In fact, I believe that if one were to scale
down distances in North America to a European size, it might well be that there
are equivalent culture differences in a San Diego-Albuquerque-Chicago-Boston
network (to take a random example of sites) as there are in a
Zurich-Milano-Paris-Amsterdam-London network.
The point is that when research projects become too large they are forced
to overlook the socially distinctive characteristics of regional sites, school
districts, schools, and particularly classrooms. To attempt to attend to these
potentially distinctive features usually overtaxes the capacities of the research team,
and in most cases such issues are bypassed in the search for across-site trends.
Researchers usually realize that there are regional, district, and school variations
in their data that are clear and possibly significant, but they normally can't deal
with them; and this usually means that they are "averaged out". For example,
samples of pupils from various schools are amalgamated in a treatment-comparison
investigation, even though obvious school-to-school differences in
"academic atmosphere" exist, i.e. differences in the attention or priority given to
certain subject matters or to learning in general. My argument here is that
language related research should be kept as small as possible so that regional,
district, school, principal, teacher, and student variations can be dealt with
adequately. If one were to combine data collected in London and Amsterdam to
test out some particular pedagogical approach, one would likely have to
overlook enormously different views about language learning in the two sites. But no
more so, I would argue, than would transpire in a Boston-Chicago
amalgamation.
Here are two examples of bigness troubles that I have in mind. The first is
the Baker and De Kanter (1981) review of all methodologically adequate studies
of bilingual education up to the 1980s that were developed for language
minority children and conducted across the United States. The aim of the Baker and
De Kanter report (1981) was to assess the impact of bilingual educational
offerings on math and English achievement scores. The basic criterion for a
successful program was that it showed more learning than would have been the case
without the program.
Setting aside the clear need for either random-assignment controls for
those in or outside a bilingual program, or some quasi-experimental
approximation (e.g. Campbell and Boruch 1975), Baker and De Kanter concluded on the
basis of the studies available that bilingual education didn't have much if any
positive effect. Overall, perhaps that is a relatively true evaluation, but "overall"
in this case covered a multitude of sins. When someone with the patience and
insight of Ann Willig (1985) conducted a meta-analysis of the same studies
reviewed by Baker and De Kanter, she was able to uncover enough of the sins to
come to a much more convincing conclusion and one that was very favorable
toward bilingual education. For instance, Willig found: "In every instance where
there did not appear to be crucial inequalities between experimental and
comparison groups, children in the bilingual programs averaged higher than the
comparison children on criterion instruments" (1985: 312). My point here is that
much of the confusion in the overview of Baker and De Kanter was due to the
fact that they had to look beyond specific cases that were regional, district or
school specific, and it took a Willig not only to consider them but to show their
importance. In her conclusions, she makes direct reference to the bigness factor:
"The cost of the national Title VII evaluation could have financed several
programs that included sound, integrated research in the design. Not only would
such an endeavour have produced additional programs for a number of
students, it would also have produced information useful for both evaluation and
program planning. In discussing the necessity for smaller scale, randomized
experiments of educational programs, Campbell and Erlebacher (1970: 207)
write, 'We are sure that data from 400 children in such an experiment would
be far more informative than 4,000 tested by the best of quasi-experiments, to
say nothing of an ex post facto study'. The results of this synthesis have
confirmed that observation" (Willig 1985: 313).
helping the children juggle two languages and cultures. (Of course, we
Canadians dislike the misuse of the term "immersion" in the first case because
actually it is a reversal of the intent of immersion education, as I will explain later.)
The implementation of this project has been instructive in several respects.
It was a study requested by the government, through the Department of
Education (D.O.E.), and was motivated by a keen interest in the potential of the
Immersion Strategy option. My guess is that the immersion-in-English option was
congruent with the Reagan administration's views about language minorities, i.e.
that it is basically unAmerican to have American citizens or citizens-to-be
jibber-jabbering in home languages in American public schools. Things like that, the
argument goes, stigmatize minorities and slow their progress towards
Americanization. Better to dive right into English and stay away from "maintenance"
bilingual programs or even "traditional" ones if possible, because both
alternatives only stretch out the assimilation period. Since a few districts around the
country were trying out an immersion-in-English option, the D.O.E. indicated in
the original contract that they wanted a large-scale investigation of the
Immersion Strategy classes then starting, and to have them compared with
"comparable" Early Exit programs, the most common form of education available
for minorities. The study would be restricted to Hispanic children only. Thus, the
study was to be big and also expensive because it would follow children for a
four-year period. A small group of experienced research consultants would meet
twice a year, first to help design the study and then to monitor it.
At the first design-planning meeting, the consultants argued fast and
furiously to change the immersion name to "sink or swim", "submersion",
"drowning", "brain washing" or some such alternative. But the D.O.E. kept it as
"immersion strategy". Then we argued for the inclusion of one alternative, the
Late Exit option, to add a bit of sunshine to the project and on this point, the
D.O.E. was persuaded our way.
Then much time was spent on another really exciting approach to the basic
question, and actually this alternative almost worked out. The idea was to run a
real experiment comparing the three alternatives. We realized that few district
supervisors in the United States could differentiate one of these alternatives
from another. In fact, few school principals or teachers in bilingual/bicultural
programs in the United States know about alternative approaches to teaching
minority children, other than the alternative they have been asked to comply with.
Thus, we had the opportunity to work with one or two districts and to set up,
through random placement of pupils, the three alternatives and test their
relative effectiveness. Parents certainly were no better informed about the
alternatives and they would likely have been willing to participate, since, as is, they
take whatever program the district has decided to offer. This possibility got us
researchers excited because it would have satisfied the basic demands of "good"
experiments according to Donald Campbell's and Ann Willig's specifications.
Note that it would have been relatively small in scale, providing control over
district and region effects, the things that a bigger study can't handle properly. As
well, treatment specifications and teacher selection and training could have
been easily undertaken and monitored, and most important of all, pupils from a
common district or community could have been placed at random in one
treatment or another. Later, contrasting communities could be included. But rather
than crying over a missed opportunity, we as consultants could adjust to the
generosity of D.O.E. in permitting the Late Exit alternative to be added to the
contract.
The main point here, however, is that bigness stalks this project, as it does
so many American educational evaluations or surveys. Let me illustrate. (1) In
order to get a nation-wide view of the relative strengths and weaknesses of the 3
options, 5 states are included: California, Texas, Florida, New York, and New
Jersey (likely equivalents to Moscow, Athens, Bucharest, Amsterdam, and
London, in my mind). This spread, it was argued, would give representation to the
major Spanish-speaking groups in the United States. But this approach means
that little or no attention can be given to the differences in program effects for
the various cultural-historical subgroups classified together here as Hispanic, i.e.
Mexican and Chicano in particular regions and Cuban and Puerto Rican in other
regions. (2) Some states have only one or two of the alternatives in operation in
their schools and few districts or school systems available across the country
have all three alternatives in place. This means that the researcher cannot
determine why a particular state, region or district has "inherited" one
alternative or another, and what effect the values and attitudes underlying the choice of
alternative in vogue might have on that program's relative success or failure. (3)
States, regions, districts, and schools within a district vary also with respect to
the socio-economic and educational background of Hispanic families and these
factors affect the salary and social class backgrounds of public school teachers, and
ultimately the achievement scores of pupils. In sum, then, this large-scale
research project, exemplary in many respects and so designed as to circumvent as
many of these potentially confounding variables as possible, has been from its
start too big for its britches. It has had to overlook or work around variables that
are clearly socially significant, i.e. ethnic differences within America's
Spanish-speaking population; state and regional variations in socio-economic status of
families, in demographic clusters of language minorities, and in educational
programs; and school-specific variations in climates or atmospheres that encourage
or discourage learning and teaching. My argument is that less money would be
spent to conduct a coordinated set of real or quasi experiments in different re-
Michael Long (1984) has recently described some important differences
between process-oriented research and product-oriented research, a
differentiation similar in many respects to "formative" versus "summative" evaluations in
research on second language learning (see Scriven 1967). Product research sets
out to answer questions about the effectiveness, reflected in achievement scores,
of one program (approach or "treatment") compared to another program.
Generally each new educational innovation is ultimately tested for its presumed
merits by means of product evaluations, and, theoretically, well conducted
evaluations could test out and, thereby, inform policy makers on the best course
of education possible. One need never know why a particular approach is the
best alternative if one could be confident that the evaluation had been carefully
conducted: the product outcomes would simply determine which alternative is
the most effective. But it is difficult for human researchers to be careful enough
to satisfy all possible critics, and, especially in big studies, unbeknownst to the
evaluator, happenings intervene while the product is being tested. For example,
pupils following an Early Exit option in the example above might perform
poorest at the time of post-testing because all the good ones in that program had
been exited out (the issue of subject "mortality"). Researchers might have big
enough samples to still deal with the "slow" early exiters, but one would begin to
see weaknesses in the evaluation. Or it could be that the supposedly "bilingual
instruction" given to one treatment group was actually reduced to having a
bicultural teacher instruct through English, possibly with non-native command of
the English language. Consequently, researchers are required to be as
"process-oriented" in their evaluations as possible, that is, to find out what actually
transpires in each classroom under each treatment. This includes the details of
teacher-pupil interactions, analyses of the content of instruction and its form, as
well as the pupil variability in receptivity to the instruction. Clearly, there is a
need for some balance here, as Long recognizes:
60 W.E. LAMBERT
"Process evaluations offer many benefits for teachers and administrators alike.
Of these, the most important is that they can document what is going on in
classrooms, as opposed to what is thought to be going on. Using process and
product evaluations in combination, one can then determine not only whether
a program really works, or works better, but if so, why, and if not, why not"
(Long 1984: 422).
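The subject "mortality" problem raised before this quotation can be made concrete with a small numerical illustration. What follows is only a minimal sketch in Python: the scores, the exit threshold, and the function name are hypothetical and are not drawn from any of the studies discussed here.

```python
from statistics import mean


def post_test_mean(scores, exit_threshold=None):
    """Mean post-test score of the pupils still enrolled at post-testing.

    Pupils scoring at or above `exit_threshold` are treated as having been
    exited from the program before the post-test (a hypothetical rule).
    """
    if exit_threshold is None:
        remaining = list(scores)
    else:
        remaining = [s for s in scores if s < exit_threshold]
    if not remaining:
        raise ValueError("no pupils left in the program at post-testing")
    return mean(remaining)


# Hypothetical proficiency scores for one Early Exit cohort.
cohort = [40, 50, 60, 70, 80, 90]

whole_cohort_mean = post_test_mean(cohort)
remaining_mean = post_test_mean(cohort, exit_threshold=75)

print(whole_cohort_mean, remaining_mean)
```

In this toy cohort the pupils who remain score lower on average than the cohort as a whole, although nothing about the instruction differs. A product comparison that ignored who had been exited would therefore understate the program's effect.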
The examples Long gives of what can be done in process research are instructive
and interesting. Consider the issue of teacher-pupil interactions. Suppose
it is agreed to video-tape or audio-tape a sample of classes in an
educational experiment on language pedagogy; some samples would be taken
from an innovative, new approach in one case and from a standard old approach
in the other. The tapes are transcribed, and transcriptions usually take five
minutes per minute of tape. To check on transcription accuracy one might want
two transcribers to work independently and calculate their agreement; but note
how costs can accumulate here. Nonetheless, information about the fine-textured
differences between programs can be made apparent in this fashion, e.g.
one set of classes might be found to stress:
There is no question that researchers would be delighted with such data, because
then they could pinpoint factors that have an effect on product-oriented
achievement measures. There is good common sense here that researchers appreciate.
For instance, Merrill Swain (1987) got such transcripts from classes in
French immersion programs in Canada and found that teachers hardly ever used
the past tense when teaching a history course in French to anglophone students.
And product assessments had noted that these students were not too swift in the
use of the French past tenses! The major point is that researchers like Swain in
this example might miss entirely what was going on in the classroom if they neglect
process concerns in their research.
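The bookkeeping behind the transcription step described earlier (roughly five minutes of work per minute of tape, two independent transcribers, and an agreement check) can be sketched as follows. This is an illustration under stated assumptions: the category codes, the function names, and the choice of Cohen's kappa as the agreement index are mine and are not prescribed by Long.

```python
from collections import Counter


def transcription_minutes(tape_minutes, ratio=5.0, transcribers=2):
    """Total working minutes if each transcriber needs `ratio` minutes
    of work per minute of tape (the five-to-one figure is from the text)."""
    return tape_minutes * ratio * transcribers


def percent_agreement(codes_a, codes_b):
    """Share of coded units on which the two transcribers agree."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("need two equally long, non-empty code lists")
    return sum(a == b for a, b in zip(codes_a, codes_b)) / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    count_a, count_b = Counter(codes_a), Counter(codes_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both transcribers used one and the same code throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes for four utterances: Q = question, S = statement.
first = ["Q", "Q", "S", "S"]
second = ["Q", "S", "S", "S"]

total_minutes = transcription_minutes(60)
agreement = percent_agreement(first, second)
kappa = cohens_kappa(first, second)
print(total_minutes, agreement, kappa)
```

One hour of tape already implies ten hours of transcription when two people work independently, which is the accumulation of costs noted above.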
Process, however, can be overstressed, and clearly a reasonable balance has
to be struck. Here's an example of too much process at the expense of product,
an example that bothers me. The researchers were searching for "significant bilingual
instructional features", i.e. they attempted to "identify, describe, and analyze
significant instructional features in successful bilingual instructional
settings" and to explore the consequences of these features on the progress of
language minority pupils (Tikunoff 1980: 1, 1981; Fisher and Guthrie 1983). For
this purpose they collected detailed information on what went on in the classrooms,
including teaching styles, whether active or not, among many other features,
assuring the reader of their report that process aspects of research were
admirably covered. The trouble is that, for determining which programs were
"successful", they relied on opinions of local people: administrators, teachers,
parents, and former students. No other independent check on "success" is mentioned
and no attention is given to a contrast or comparison group that did not
receive "significant bilingual instruction". In fact, "significant" is presumed to be
the instruction that transpires in "successful" classrooms. Again, there is no introduction
of comparison groups who were not successful. There is much valuable
information in this work. But it was expensive and spread through three
years, and there is no way to determine and no evidence given to convince me
that these instructional features were either significant or successful. The neglect
of product information in this case means that the researchers did not go
after data from matched groups of Limited English Proficient students who received
either one set of instructional features or a comparison set, and who then
were found to be either successful or not in terms of achievement growth or improvement.
To me, it is a shame to have missed this opportunity to, as Long suggests,
combine product and process concerns in the research. A valuable
suggestion for those wanting to explore the process-product issue in more detail
is the recent work by Craig Chaudron (1988) that demonstrates very nicely the
need for researchers to give ample attention to both process and product.
There is, however, another way to consider process and product research, a
way that I think goes deeper and captures the interests of another type of researcher.
The particular other way I have in mind was introduced by Lev Vygotsky
back in 1934, although his book appeared in English only in 1962 (Vygotsky
1962). Here's an example:
"It seems to us that [this] phenomenon has not received a sufficiently convincing
psychological explanation, and this for two reasons: First, investigations
have tended to focus on the contents of the phenomenon and to ignore the
mental operations involved, i.e. to study the product rather than the process;
second, no adequate attempts have been made to view the phenomenon in the
context of other bonds and relationships..." (Vygotsky 1962: 71; emphasis
added by W.L.).
The phenomenon Vygotsky was referring to was the set of changes that transpire
in the normal development of thought from infancy to young adulthood, a progression
from thinking in "complexes" to "pseudo-concepts" or "potential" concepts
to "genuine concepts".
"The processes leading to concept formation develop along two main lines.
The first is 'complex' formation: The child unites diverse objects in groups
under a common 'family name'; this process passes through various stages.
The second line of development is the formation of 'potential concepts', based
on singling out certain common attributes. In both, the use of the word is an
integral part of the developing processes, and the word maintains its guiding
function in the formation of genuine concepts, to which these processes lead"
(Vygotsky 1962: 81).
cess orientation on the part of the researcher. But to get at the deeper levels, the
researcher has to have some relevant theoretical ideas, even if only common-
sense hunches, to orient the long-ranged plan of the research. Let me illustrate
what I mean through three or four examples from the Montreal setting.
In Montreal, French and English school systems are and have been separate;
the administration is separate, the schools are in different sites and, consequently,
students and staff are kept exclusively in their own linguistic worlds.
This separateness is nicely represented in an important Canadian novel on the
two major ethnolinguistic communities in Quebec, entitled Two Solitudes
(MacLennan 1945). Ailie Cleghorn and Fred Genesee (1984) were interested in
what happens when French and English speaking teachers become members of
a common teaching staff in English language schools that have French "immersion
programs" underway. Their hunch was that the social interactions of the
two groups of teachers would likely reflect the social realities of distant, separate
existences. Data were collected, using observational procedures, over a one-year
period. Thus, an observer recorded relevant events in the schools, in classrooms,
in principals' offices, and in teachers' rooms, especially at break times and lunch
times. It was an unusual event, for both the French teachers and the English
schools involved, to have a sizable subgroup of French teachers working in
otherwise all-English schools. At first, the English speaking teachers showed
normal amounts of politeness and welcome. In the common teachers' room at
lunch period, for example, small tables were arranged so as to accommodate all
staff, and suggestions were made that French might be the language of communication
(a type of "French table") from time to time, so that the English teachers
could get some experience using French at the same time, they reasoned,
as the French teachers were made to feel at home. The Cleghorn and Genesee
study is noteworthy because it chronicles in the teacher-to-teacher contacts the
slow but sure emergence of the deep, long-standing conflictual nature of English-French
relations in the general society. For instance, there was a gradual
separation and segregation of social contacts, including the use of separate tables,
separate burners on the common stove, and schedules for French and English
usage of the stove. French teachers slowly switched to English (no matter how
poorly they commanded it) for intergroup contacts which, for generations, had
been the expected thing for French-Canadians to do in the presence of anglophones.
To me, this informative study is a good example of a carefully documented,
standard process-oriented study that was designed to go far beyond
the structure and content of the interaction between teachers. Instead, the basic
process data were used to explore a fundamental social-context process involving
society's impact on the school and on cross-group contacts that take place
in this novel form of mixed-group setting. The impact of this deeper societal
process on anglophone children's progress in French, their reluctance to initiate
French conversations outside school, and their expectations that French people
speak English with anglophones were all evident in the product results of the
immersion classes.
This outcome not only says something important about the causal link between
becoming bilingual and cognitive flexibility, but it also casts light on a very
important underlying mental process that permits one to infer what likely goes
on in the immersion experience, something far below the surface events of
teacher-student interaction patterns. Thus, this study is a clear illustration of a
Vygotsky-type process that was studied through an apparently product type (pre-post)
testing of the performance of the children on a standard, psychometrically
sound measure of cognitive activity.
Robert Gardner and I have had a long-standing interest in the role played
by students' attitudes towards the foreign group whose language they are studying,
whether they are motivated by "instrumental" reasons (those with a practical
pay-off) or "integrative" reasons (e.g. interest in or inquisitiveness about
the foreign people and their culture) (see Gardner and Lambert 1972). Since
our early work, Gardner (1981) has accumulated an impressive array of empirical
studies that explore the ways in which attitudes and motivations affect language
acquisition proficiency, performance in the classroom, and willingness to
take advanced courses in the language. The basic research design is to measure,
as of the start of FL training, the foreign language aptitude, the verbal IQ, the
socioeconomic background, and the attitude-motivational profile of large numbers
of primary and secondary school students and to follow them through one
or more years of FL training, with repeated tests of FL achievement. Thus, a basically
product-oriented approach is followed. Numerous small-scale replications
reveal that measures of attitudes toward the other cultural group and
motivational interest in mastering the FL are correlated, forming a cluster that
stands apart from a second cluster made up of tests of aptitude for learning a FL
and verbal intelligence. Furthermore, each cluster is as closely correlated to FL
achievement as the other. The fact that the attitude-motivation cluster is as good
a predictor of FL achievement as verbal intelligence or language aptitude and
that it is statistically independent from the aptitude-intelligence cluster has great
social significance because it indicates that anyone, even the intellectually and
linguistically non-gifted, can be successful in FL study if they want to and especially
if they want to for the "right" attitudinal reasons.
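The pattern just described, two predictors that are uncorrelated with each other yet equally correlated with achievement, can be shown with a few lines of Python. The numbers below are constructed toy values chosen so that the arithmetic is easy to follow; they are not data from Gardner's studies, and the variable names are my own.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation between two equally long lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant list")
    return sxy / sqrt(sxx * syy)


# Toy standardized scores for four hypothetical students.
attitude = [-1, 1, -1, 1]   # attitude-motivation index
aptitude = [-1, -1, 1, 1]   # aptitude-intelligence index
achievement = [a + b for a, b in zip(attitude, aptitude)]

r_attitude_aptitude = pearson(attitude, aptitude)
r_attitude_achievement = pearson(attitude, achievement)
r_aptitude_achievement = pearson(aptitude, achievement)
print(r_attitude_aptitude, r_attitude_achievement, r_aptitude_achievement)
```

Here the two indices correlate zero with each other while each correlates about .71 with achievement, so a student low on one index can still reach a given achievement level by being high on the other.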
The more recent research of Gardner and his students shows that the attitude-motivation
index is also strongly associated with perseverance in the FL
study (Gardner and Smythe 1975; Gardner 1981), that is, the more integratively
oriented the attitudes and motivation of students are, the more they avail themselves
of opportunities to practice the second language, and the more often they
decide to take advanced level courses at the college level. It is also clear that attitudes
and motivation affect classroom interactions (Glicksman 1981; Gardner
1981). Trained observers of FL classrooms found that the more "integratively"
oriented students (those with favorable attitudes and non-practical motivations)
volunteered more frequently, gave more correct answers publicly, and received
more positive feedback from teachers than did those less integratively oriented.
There were no subgroup differences, however, in asking the teacher questions,
in demonstrating knowledge beyond that solicited, or in indications of classroom
anxiety.
For me, these results indicate that a deeper process, reflected in an attitude-motivation
complex, is at work in FL learning. Furthermore, this deeper
process seems to have an effect on the content and structure of the teacher-student
interaction, the more standard form of classroom process, the type more
commonly dealt with by FL researchers.
My final example is both societal and personal in nature. It deals with small
communities in northern New England in the United States whose residents
have French as a heritage language, being third or fourth generation immigrants
from French Canada, but who function otherwise in an all-English American society.
These "Franco-Americans" have kept French up mainly as an informal social
language, especially with family members, and mainly for oral
communication; there is very little reading or writing in French. As these
families function more and more in English, they gradually lose French. Their
stage of bilinguality reflects the gradual substitution of English for French, what
we refer to as "subtractive" bilingualism, meaning that even though at a certain
time in their lives they are functionally bilingual, French is being eliminated
from their lives and replaced by English (Dubé and Herbert 1975; Lambert,
Giles, and Picard 1975; Lambert, Giles, and Albert 1976). The implied contrast
is with an "additive" form of bilingualism where speakers of a dominant, prestigious
and communicationally useful language (like English in the United States
or French in France) can add a second or foreign language to their linguistic
repertoires with no fear that the first language and its cultural supports will be
upset in any sense. Rather, they experience numerous cognitive, intellectual and
social advantages as they become bilingual. The question that prompted us was:
This example, I suggest, is both small and community based, and it is by design
as carefully controlled and product-oriented as we could make it, and yet it
was much more. It provided us with an opportunity to test out potentially important
underlying processes that help us understand the different meanings that
being bilingual/bicultural can have for both language minority and language majority
families in an American setting.
Considering all four illustrations, what are the essential features of this
"deeper type" of process research or this "Vygotsky style" search for underlying
processes? I see two important features: (1) all such examples are applications
of a hypothetical-deductive research model (cf. Underwood 1957; or Hull 1952)
that makes active use of "hypothetical constructs" or "intervening variables"
(see MacCorquodale and Meehl 1948). These hypothetical constructs are often
simply sophisticated guesses on the part of the researcher. Their importance lies
in the fact that they can be linked, through experiments, with particular input
variables (also known as "independent variables") that are systematically related
to one or several output variables ("dependent variables"). (2) The basic model
also implies multiple hypothetical deductions and testings of the central construct,
and thus there is an implied requirement that the researcher-theoretician
strive for "construct validity" so as to enhance the believability of the basic construct
(see Cronbach and Meehl 1955; Underwood 1957: 117ff). This old, dependable
model gets new names and new twists from time to time, but never any
substantive changes. And as is apparent in the examples, the constructs or basic
processes can be psychological in nature, group or community oriented, and
even culture oriented. This suggested model does imply, however, that valuable
research on foreign or second language learning requires much more than linguistic
or pedagogical training and interest; it requires as well some extensive
experience and interest in one or more of the behavioral sciences, either on the
part of the researcher or on the part of research collaborators. The important
message, however, is that progress in FL or SL research calls for prime attention
to underlying hypothetical constructs or, more simply, to the educated guesses that
experienced teachers are so competent at generating. Progress also calls for
careful and systematic testing using product-type, quantitative research approaches
which incorporate as much process data as is economically feasible.
The smaller the scale of the design and the more local its scope, the greater the
progress is likely to be.
Acknowledgement
This paper was also presented at the Conference on Foreign Language Acquisition
Research and the Classroom, University of Pennsylvania, October 12-15, 1989.
Notes
1. This point is important and worth documenting. Recently, Don Taylor and I conducted a
community study in Detroit, Michigan (see Lambert and Taylor 1988) wherein we worked
with two large school districts over a 3 year period. The superintendent of one district became
our good friend. He was a Polish-American and gave our project his personal backing.
He was happy that he had some 15 teachers hired to teach "bilingually" in such languages as
Arabic, Polish, Albanian, Greek, and Vietnamese. Watching these teachers in action, Taylor
and I noted that none of them used any language other than English except for rare special
moments when the other language was used with a particular child, and in a soft voice. On hearing
about this the superintendent called all the teachers together to confirm the fact and to hear
the reasons why: e.g. the Office of Education for the State of Michigan
had sent a directive that this was the way to do bilingual teaching. The directive clearly bothered
the Arab and Greek teachers but seemed normal and sensible to most of the Polish
teachers. Another more recent example: Taylor and I, continuing the same project in the
Dade County (Florida) Public School System, visited a 1 hour "bilingual" class in science for
high school pupils. All 35 students were Spanish speaking with varying degrees of skill in
English. The point is that not one word of Spanish was used by the teacher! Someone above
had told her that was what she was to do and she was comfortable with the schema, arguing
that "Since I'm obviously Hispanic myself I know how to get these Hispanic youngsters interested".
She was an excellent teacher, but in no way was she teaching bilingually. God only
knows what the limited-English minority child was getting from that class, and valuable opportunities
were lost for the fully bilingual children in that class to realize that the same
teacher could have made science both exciting and Spanish at the same time.
2. Incidentally, Swain's finding makes one wonder about "sheltering" the language of instruction
for minority language students, i.e. being too concerned that the inputs are simple and
"comprehensible". Although this idea is presently in vogue, I'm more inclined to the pedagogical views
of Sir Walter Scott who, in his 1831 book dedicated to his 5 year old grandson, wrote in the
preface:
"These tales were written... for the use of the young relative to whom they are inscribed...
The compiler... after commencing his task in a manner obvious to the
most limited capacity... was led to take a different view of the subject, by finding
that a style considerably more elevated was more interesting to his juvenile
reader. There is no harm, but on the contrary there is benefit, in presenting a
child with ideas somewhat beyond his easy and immediate comprehension. The
difficulties thus offered, if not too great or too frequent, stimulate curiosity, and
encourage exertion" (Scott 1831: iii-iv).
Placing input clearly within versus somewhat beyond the realm of comprehensibility is a
minor point, but one that deserves a series of careful experiments. And we had better not
think too much about the idea that these wonderful stories were enthusiastically read by
five-year-olds in 1831!
References
Baker, K. and A.A. De Kanter. 1981. Effectiveness of bilingual education: A review of the literature.
Washington, DC: Office of Planning, Budget and Evaluation, U.S. Department of Education.
Bryk, A.S. and S.W. Raudenbush. 1987. "Application of hierarchical linear models to assessing
change." Psychological Bulletin 101.147-158.
Campbell, D.T. and R.F. Boruch. 1975. "Making the case for randomized assignment to treatments
by considering alternatives." Evaluation and experiment ed. by C.A. Bennett and A.A.
Lumsdaine, 195-296. New York: Academic Press.
Chaudron, C. 1988. Second language classrooms: Research on teaching and learning. New York:
Cambridge University Press.
Cleghorn, A. and F. Genesee. 1984. "Languages in contact: An ethnographic study of interaction
in an immersion school." TESOL Quarterly 18.595-625.
Cole, M. 1987. Quoted in L.S. Hearnshaw, The shaping of modern psychology, 177. London: Routledge
and Kegan Paul.
Cronbach, L. and P.E. Meehl. 1955. "Construct validity in psychological tests." Psychological Bulletin
52.281-302.
Dubé, N.C. and G. Herbert. 1975. The St. John Valley bilingual education project. Washington,
DC: U.S. Department of Health, Education and Welfare.
Fisher, C.W. and L.F. Guthrie. 1983. Executive summary: The significant bilingual instructional features
study. Document SBIF-83-R.14.
Gardner, R.C. 1981. "Second language learning." A Canadian social psychology of ethnic relations
ed. by R.C. Gardner and R. Kalin. Toronto: Methuen.
Gardner, R.C. and W.E. Lambert. 1972. Attitudes and motivation in second language learning.
Rowley, MA: Newbury House.
Gardner, R.C. and P.C. Smythe. 1975. Second language acquisition: A social psychological approach
(= Research Bulletin, 332.) London, Ontario: University of Western Ontario, Department
of Psychology.
Getzels, J.W. and P.W. Jackson. 1962. Creativity and intelligence. New York: Wiley and Sons.
Glicksman, L. 1981. Improving the prediction of behaviors associated with second language acquisition.
Unpublished doctoral dissertation. London, Ontario: University of Western Ontario.
Hull, C.L. 1952. A behavior system. New Haven: Yale University Press.
Lambert, W.E. 1984. "An overview of issues in immersion education." Studies on immersion education
ed. by Office of Bilingual Bicultural Education. Sacramento: California State Department
of Education.
Lambert, W.E., H. Giles, and A. Albert. 1976. Language attitudes in a rural community in northern
Maine. Unpublished manuscript. Montreal: Psychology Department, McGill University.
Lambert, W.E., H. Giles, and O. Picard. 1975. "Language attitudes in a French-American community."
International Journal of the Sociology of Language 4.127-152.
Long, M. 1984. "Process and product in ESL program evaluation." TESOL Quarterly 18/3.409-425.
MacLennan, H. 1945. Two solitudes. New York: Duell, Sloan and Pearce.
MacCorquodale, K. and P.E. Meehl. 1948. "On a distinction between hypothetical constructs and
intervening variables." Psychological Review 55.95-107.
Peal, E. and W.E. Lambert. 1962. "The relation of bilingualism to intelligence." Psychological
Monographs 76.1-23.
Ramirez, D., S.D. Yuen and D.S. Ramey. 1988. Longitudinal study of immersion, early-exit and
late-exit transitional bilingual education programs for language minority children: Study design
overview. San Mateo, CA: Aguirre International.
Rogosa, D., D. Brandt and M. Zimowski. 1982. "A growth curve approach to the measurement of
change." Psychological Bulletin 92.726-748.
Scott, S. 1973. The relation of divergent thinking to bilingualism: Cause or effect? Unpublished
manuscript. McGill University, Psychology Department.
Scriven, M. 1967. "The methodology of evaluation." Perspectives on curriculum evaluation
(= American Educational Research Association: Monograph Series on Curriculum Evaluation, 1)
ed. by R.W. Tyler, R.M. Gagné and M. Scriven, 39-83. Chicago: Rand McNally.
Sowell, T. 1983. The economics and politics of race. New York: Morrow and Co.
Swain, M. 1987. Personal communication. See also Harley, B., et al. 1987. The development of bilingual
proficiency: Final report. Toronto: Modern Language Center, OISE.
Tikunoff, W.J. 1980. Overview of the significant bilingual instructional features study. San Francisco:
Far West Laboratory, Document SBIF-80-D.1.1.
Tikunoff, W.J. 1981. Significant bilingual instructional features study: A report of the state-of-the-study.
San Francisco: Far West Laboratory, Document SBIF-81-R.8.
Underwood, B.J. 1957. Psychological research. New York: Appleton-Century-Crofts, Inc.
Vygotsky, L. 1962. Thought and language. Cambridge, MA: MIT Press.
Willett, J.B. 1988. Questions and answers in the measurement of change (= Review of Research in
Education, 15.) In press.
Willig, A.C. 1985. "A meta-analysis of selected studies on the effectiveness of bilingual education."
Review of Educational Research 55.269-317.
Ask a Stupid Question...:
Testing Language Proficiency in the Context of
Research Studies
Christine Klein-Braley
formance or that of the learners. Outside the classroom and divorced from any
course of study, tests can be used as qualifying procedures, for instance by
universities to ensure that foreign students know enough of the language to enable
them to follow lectures in their chosen area, or by aviation authorities to
check whether pilots and air traffic controllers can use English to communicate
with each other successfully.
Ideally it would be possible to construct a test which would, simultaneously,
answer all these questions, and any others we might happen to have. Unfortunately
there is no such animal as the diagnostic achievement aptitude test of
general language proficiency. Tests must be carefully designed to answer specific
questions, and often the more suitable a test is for one purpose the less suitable
it is for another. For instance, a test of achievement (i.e. a test designed to provide
feedback on learning progress for teacher and pupil) would begin by surveying
the units to be taught in a given period of time and the test would consist
of a representative sample of these units weighted according to their relative importance.
If both teaching and learning had been maximally effective all pupils
would score 100%. It is obvious that a test of this kind is not suitable as a qualification
procedure since its aim is extremely narrow, and there is no way in which
the test scores can be related to a concept of general language proficiency.
Tests for use in research studies are no exception to the general rule that
tests can only provide answers to the questions they are constructed to investigate.
They need to have special qualities, and this paper will be devoted to summarizing
their most important and desirable characteristics.
1 Preliminary considerations
There are many experimental designs which can be used in research studies,
and naturally one will select a design appropriate for the given question. Often
such designs involve repeated testing of the same individuals to determine
whether changes (= learning!) have occurred in the interval. Sometimes it is
possible to use the same test twice, but it would be preferable to have alternative
versions of the test available for use as required. Statistically, parallel tests are
defined as having the same mean score and the same standard deviation. The
reason for this is obvious: only if the tests are equal in difficulty is it possible to
determine whether a change has taken place between the two test sessions. Conceptually,
it seems desirable that the tests should have the same types of items
and numbers of tasks to make them equivalent in language processing techniques.
Any test format to be used in a research study ought, therefore, to be
one which allows for the relatively effortless production of parallel versions,
which must then be equated by empirical testing.
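The statistical definition of parallel tests given above (same mean, same standard deviation) suggests a first descriptive check on trial data from two versions. The sketch below is only that: the score lists are invented, the function name is hypothetical, and the tolerance of half a score point is an arbitrary assumption. It is no substitute for proper empirical equating.

```python
from statistics import mean, stdev


def look_parallel(scores_a, scores_b, tol=0.5):
    """True if two test versions have nearly equal means and standard
    deviations on trial data. `tol` is an assumed tolerance in score points."""
    same_mean = abs(mean(scores_a) - mean(scores_b)) <= tol
    same_spread = abs(stdev(scores_a) - stdev(scores_b)) <= tol
    return same_mean and same_spread


# Invented trial scores for three candidate versions of one test.
version_a = [10, 12, 14, 16, 18]
version_b = [18, 16, 14, 12, 10]   # same mean and spread as version A
version_c = [12, 14, 16, 18, 20]   # same spread, but two points easier

print(look_parallel(version_a, version_b))
print(look_parallel(version_a, version_c))
```

Version C would make every pupil appear to have learned something between sessions, which is exactly the confusion that equal difficulty is meant to rule out.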
Sometimes it is desirable to investigate the achievement levels of large
groups, and in such cases there is a special technique, matrix sampling (cf. e.g.
Cronbach 1984: 527), which enables a large pool of items to be split into a
number of shorter tests so that no one individual needs to respond to all items.
This saves time without serious loss of information.
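The mechanics of matrix sampling can be sketched in a few lines of Python. This is an illustration under my own assumptions, not Cronbach's procedure: the function names are hypothetical, the item pool is just a list of item numbers, and pupils are assigned to the short forms in simple rotation.

```python
import random


def split_item_pool(items, n_forms, seed=0):
    """Shuffle the item pool reproducibly and deal it out into `n_forms`
    shorter tests, so that every item lands in exactly one form."""
    pool = list(items)
    random.Random(seed).shuffle(pool)
    return [pool[i::n_forms] for i in range(n_forms)]


def assign_forms(pupils, n_forms):
    """Give each pupil one form number, rotating through the forms."""
    return {pupil: index % n_forms for index, pupil in enumerate(pupils)}


# A hypothetical pool of 60 items split into four 15-item tests.
forms = split_item_pool(range(1, 61), n_forms=4)
assignment = assign_forms(["Anna", "Ben", "Carla", "Dirk", "Eva"], n_forms=4)

print([len(form) for form in forms])
print(assignment)
```

Each pupil answers only a quarter of the pool, yet every item is answered by about a quarter of the group, so group-level achievement on the whole pool can still be estimated.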
formation before the tests are administered; for instance if the tests are known
in advance the teachers may well teach for the test, since it is entirely reasonable
for teachers to wish their pupils (and thus indirectly they themselves) to do well.
However, all those involved in the study should be offered full
feedback after the relevant parts of the study have been completed.
Test development should also take into consideration practical constraints
of the school or institutional setting. For instance all tests should be planned to
fit into convenient time slots (e.g. lessons). This might mean using two short
tests rather than one long one. The decision to use media, even such modest
media as cassette recorders or overhead projectors, is an open invitation to
Murphy's Law in rich countries, and may be impossible to implement in poor
countries. The facilities available for any measure which goes beyond traditional
paper and pencil tests (with the researchers providing paper, pencils and pencil
sharpeners!) need to be carefully investigated well in advance. Otherwise test formats
may be developed which then have to be discarded as too ambitious after the
study has already begun.
For the pupils the testing should be as painless as is reconcilable with the
necessity of collecting the data required. If alternative procedures are possible,
then the one which is shortest, least complicated, most interesting and least
likely to provoke test anxiety should be selected. Sensitive personal data should only
be collected if this is really essential. So far as possible pupils too should be
offered feedback about the study in general and their own performance in
particular.
The amount of time teachers or educational supervisors are prepared to
allot to the research testing may be much more restricted than the researchers
originally hoped. Other constraints may appear unexpectedly. It therefore makes
sense to consult those responsible inside the school(s) or institutional system(s)
involved at a very early stage of planning.
One reason why some researchers are reluctant to do all this is that the
process of explaining and defending both the study and the planned tests to
teachers, administrators and outside experts can be both lengthy and painful. Yet
this is an important stage in an investigation. On the one hand the researchers
are forced to formulate their own objectives and methods very clearly in order
to get them across. On the other hand, if they are successful, they will often find that they have not
only gained willing support, but also that the practitioners suggest aspects for
inclusion which might not otherwise have been taken into consideration.
TESTING LANGUAGE PROFICIENCY 77
Like all tests, tests to be used for research purposes need to conform to
basic test standards, that is they should be objective, reliable and valid. Despite
more than 25 years of professional language testing (cf. Carroll 1961; Lado
1961) there still seems to be quite considerable ignorance as to what this means,
even among alleged experts in the field of language testing. One is surprised, for
instance, to find Morrow (1979: 51) claiming "Reliability [in communicative
tests]... will be subordinate to face validity" or Underhill (1987: 105) writing
"Both reliability and validity are rather vague concepts which suffer from a lack
of clear definition about exactly what they are, let alone how they should be
assessed or calculated". In fact these concepts are quite clearly defined and are
used in exactly the same way as they are used in psychological testing (cf. e.g.
APA 1974), and the qualities to which they refer are intuitively and obviously
desirable since they contribute towards making the tests equitable and fair for
all examinees. Furthermore there is general consensus about assessing how far
tests meet the criteria.
2.1.1 Objectivity
A test is objective if the test score obtained by the examinee is not affected
in any way by the experimenter, proctor or scorer. Objectivity can be affected,
for instance, by some test forms being easier to read than others, by inadequate
or inaudible instructions from the test proctor, but perhaps most obviously and
importantly by scorer bias or subjectivity. A multiple-choice test is machine-
scorable and thus, in this sense, entirely objective; an essay or translation
marked by the pupil's own teacher is very likely to be contaminated by scorer
bias.
It is important to realise that this is not a criticism of the inability of some
(possibly incompetent?) language teachers to disregard personal prejudices in
order to assess their pupils objectively. In medical research into the
effectiveness of new drugs it has been found necessary to introduce the double or even
triple blind experiment in order to eliminate experimenter bias. Only if the
patient who takes the drug, the physician who administers it, and the
laboratory staff who perform the blood counts (or whatever) are all kept ignorant of
whether the patient is receiving the new drug or the placebo is it possible to
evaluate how effective the new medication is. Subjectivity is not a matter of will-
power. In terms of language testing this means that if we decide to use item
78 CHRISTINE KLEIN-BRALEY
2.1.2 Reliability
A test is reliable if it measures accurately and exactly. This is investigated
heuristically by calculating whether conditions which should be met if the tests
are measuring properly are, in fact, met. Reliability is thus determined by
finding out (a) whether all the test items correlate with each other (internal
consistency); (b) whether the two halves of the test correlate with each other (split-half
reliability); (c) whether the same test administered twice produces scores which
correlate highly with each other (test-retest reliability) and (d) whether two tests
designed to be parallel to each other have high intercorrelations (parallel
reliability). In fact it is rare for all four types of reliability to be calculated for any one
test although each of them involves a slightly different concept of measurement
stability.
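Two of these checks, internal consistency and split-half reliability, can be illustrated with a short sketch. The choice of Cronbach's alpha as the internal consistency index, the odd/even split, the Spearman-Brown step-up and the five-person data set are all assumptions made for the example; the text itself does not prescribe particular formulas.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Internal consistency; scores holds one row per examinee, one column per item."""
    n_items = len(scores[0])
    item_vars = [pvariance([row[j] for row in scores]) for j in range(n_items)]
    total_var = pvariance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


def pearson(x, y):
    """Product-moment correlation between two equally long score lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def split_half(scores):
    """Correlate odd-item and even-item half scores, then step up to full length."""
    odd = [sum(row[0::2]) for row in scores]
    even = [sum(row[1::2]) for row in scores]
    r = pearson(odd, even)
    return 2 * r / (1 + r)  # Spearman-Brown correction for the halved test length


# Invented right/wrong (1/0) responses of five examinees to four items.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(cronbach_alpha(scores), 2), round(split_half(scores), 2))
```

Test-retest and parallel-form reliability would use the same `pearson` function, applied to total scores from two administrations or from two parallel tests.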
The result of the investigation is a figure between 0 and 1, known as a
reliability coefficient and technically representing the squared correlation between the
postulated "true score" and the observed test score. If the true score and the test
score were identical the correlation between them would be 1.00, but since all
tests are affected by measurement error the difference between 1 and the actual
reliability coefficient is an indicator of the overall accuracy of the test. Obviously
the nearer the reliability coefficient approaches 1, the better the test. If
judgments of or decisions about individuals are to be made, then testers usually
demand a reliability coefficient of at least .9; where groups rather than individuals
are being tested the required level is .7.
Tests intended for research purposes are normally tests which aim at
gathering information about groups: the sort of effects we are likely to be
looking for will probably only reveal themselves in group terms. This means that the
lower reliability level of .7 will usually be acceptable in a research context. It is,
however, important for test reliability to be determined in the preliminary stage
because it is a tragic waste of time, money and resources to embark on large-
scale testing using unreliable tests.
2.1.3 Validity
A test is valid if it measures the thing it is intended to measure. There is no
such thing as an inherently valid test since if a direct indicator of the trait or
quality we wish to measure is available we would use this rather than the test.
Technically it would be possible, for instance, to develop a test which would help
us to decide whether someone is a man or a woman, but nobody has so far taken
the trouble since there are easier ways available of doing this. All tests,
therefore, are indirect ways of trying to get at information which is unavailable
directly or immediately, but then so is a thermometer or a measuring tape. No test is
obviously and of itself valid; validity must always be demonstrated by the test
constructor.
Psychometrics usually defines three types of validity: criterion-referenced,
construct, and content validity.
The easiest type of validity to determine is criterion-referenced, correlative
or empirical validity. The test is valid if the test scores correlate with some other
available measure of the attribute or trait. Often a newer test is developed as a
short cut to acquiring the same information as another, often more complicated,
sometimes merely older, measure. Tests can also have the purpose of predicting
information which will only become available at some time in the future, and
they do this by assuming that a relationship which has been shown to hold
between test and criterion for one set of data will continue to operate for further
data sets. Validity in this approach is therefore demonstrated by agreement
(correlation) between criterion and test, and very often the criterion is itself a
test. Agreement between the old, tried and tested method and the new measure
is viewed as confirmation that both test the same thing, i.e. both are valid. This
approach is obviously problematical because it stands and falls with the
validation of the criterion or the previous measure. Nevertheless language testers have
made frequent use of this type of validity.
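As a rough illustration of this approach, the agreement between a new test and a criterion measure is usually expressed as a product-moment correlation. The sketch below assumes that convention; the function name `validity_coefficient` and both score lists are invented for the example.

```python
def validity_coefficient(test_scores, criterion_scores):
    """Pearson correlation between scores on a new test and on a criterion."""
    n = len(test_scores)
    mean_t = sum(test_scores) / n
    mean_c = sum(criterion_scores) / n
    cov = sum((t - mean_t) * (c - mean_c)
              for t, c in zip(test_scores, criterion_scores))
    var_t = sum((t - mean_t) ** 2 for t in test_scores)
    var_c = sum((c - mean_c) ** 2 for c in criterion_scores)
    return cov / (var_t * var_c) ** 0.5


# Invented scores of five learners on a short new test and on an older,
# longer criterion measure.
new_test = [12, 15, 19, 22, 30]
criterion = [40, 48, 55, 60, 78]
r = validity_coefficient(new_test, criterion)
print(round(r, 3))
```

A high coefficient shows only that the two measures rank the learners in much the same way; it says nothing about whether the criterion itself is valid, which is the weakness noted above.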
A more complex approach to validation is involved in the concept of
construct validation. Here the relationships between a test and the underlying
theories are examined. The theory predicts lawful relationships and if the test can
show that these relationships do in fact exist, then the test is viewed as valid in
terms of the theory. Sometimes test construction proceeds from the theoretical
assumptions (e.g. the Noise Test or the C-Test) and is tested against them (cf.
e.g. Gaies 1988; Klein-Braley 1985). More often in language testing, construct
validation has begun ex post facto when a new type of test — the most obvious
example is cloze tests — turns out to have interesting and unexpected properties
(cf. Oller 1973 but also Alderson 1979 and Klein-Braley 1981), or because a
specific test procedure has become very popular as a criterion measure (cf.
Bachman and Palmer 1981 on the FSI Interview).
The third type of validity is content validity. Here the universe of interest is
defined and then sampled. This sample is converted in some way into test items
and administered to the examinees. From the performance on the sample
conclusions are drawn about examinee performance in the whole area. For instance
Engels (1982), wishing to sample student knowledge of the 2,000 most frequent
words in English, took random samples of successive 500 word groups (50 items
in each). This test is content valid. Similarly, tests which sample from a defined
curriculum to decide whether students have learned what is being taught use the
concept of content validation. Expert judges are brought in to determine how
well the proposed items represent the universe concerned.
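The sampling design attributed to Engels above (50 items drawn at random from each successive group of 500 words) can be sketched as stratified random sampling. The function name `sample_by_band` and the placeholder word list are assumptions for the example; a real study would substitute an actual frequency-ranked list.

```python
import random


def sample_by_band(ranked_words, band_size=500, per_band=50, seed=0):
    """Draw per_band words at random from each successive frequency band."""
    rng = random.Random(seed)
    samples = []
    for start in range(0, len(ranked_words), band_size):
        band = ranked_words[start:start + band_size]
        samples.append(rng.sample(band, per_band))  # sampling without replacement
    return samples


# Placeholder for a frequency-ranked list of the 2,000 most frequent words.
ranked_words = [f"word_{rank}" for rank in range(1, 2001)]
bands = sample_by_band(ranked_words)
print(len(bands), [len(band) for band in bands])
```

Sampling within bands rather than from the whole list guarantees that each frequency range contributes the same number of items, so that performance can also be reported band by band.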
Often, however, the universe is difficult to define, or is infinite, or both.
Language is actually a good example of this. What is language, and what, as
Spolsky (1973) asked, does it mean to know a language? At this point the
conceptual difference between content and construct validation seems to vanish
because the production of any content-valid language test must begin by
constructing a theory of language which can then be used as a basis for sampling.
I shall come back to this point later.
In addition to genuine validity, which must be investigated empirically,
there also exists the concept of face validity. This term is testers' shorthand for
the way a test looks to the naive (i.e. non-expert) user, to the examinee, to the
examinee's friends, parents, relations, even to the teacher who uses tests without
investigation of the assumptions they are based on. Face validity is desirable in
the sense that examinees (and other users) should feel that a test is relevant,
appropriate and fair. But face validity is in no way sufficient unless the test has
been shown to be valid according to one of the psychometric approaches. And if
test validity has been demonstrated psychometrically then face validity can to a
large extent be ignored. If a lawful and regular statistical relationship could be
shown to hold between students' abilities to lob stones and their subsequent
performance as simultaneous interpreters, it would be entirely legitimate to take
cohorts of applicants for the United Nations Translation Training Department
to the nearest sports field for a stone-throwing contest, even though the face
validity of the selection procedure would be zero.
3 Tests
Spolsky (1985) uses the same basic classification, but with a different
nomenclature: structural tests; general language proficiency tests; functional
tests.
rather than that one?"). In large-scale testing, on measures such as the TOEFL,
compromises have to be made between the desirable and the feasible. Tests of
this kind must be machine-scorable if they are to be administered to hundreds of
thousands of students every year. But a research project would presumably not
be confronted with enormous numbers of students to be tested. At any rate,
because of the problems involved in defining the areas to be sampled, selecting
and writing items, weighting subtests, etc., the theoretical problems in
constructing a discrete-point proficiency test are immense, and since this task is
presumably not the aim of the study, this type of test is probably not suitable in this
context.
(I have criticised cloze tests and translation "tests", for instance, because the
question of intercorrelations between two tests from the same "family"
administered to the same subjects has been virtually ignored: Klein-Braley 1983, 1987.)
Relationships can be specified which ought to hold between these tests and
other tests, both of language and other traits, and this can be empirically
investigated (cf. e.g. Raatz 1985).
What do tests of linguistic performance look like? There are, in fact, two
different groups of tests, subjective and objective.
of possible item bias (qualities in the item which favour some examinees but
disadvantage others) more than one text should be used. This also has
advantages for subsequent test analysis since most statistical procedures available can
only legitimately be used if items are independent of each other. Thus the
traditional statistical analysis of cloze tests, for instance, on the basis of individual
blanks in the text, is not legitimate since the items are embedded in the same
text and are thus dependent on each other. What is possible is analysis on the
super-item level using each text or task as an item. This is the approach adopted
with the C-Test (cf. Klein-Braley and Raatz 1985).
The objective procedures have the advantage that it is reasonably easy to
produce highly reliable tests. The test scores are numerical with a fairly wide
range on an interval scale, whereas the subjective procedures are generally
scored on (ordinal level) rating scales, and rarely use more than 5, or at the most
7, categories.
Like all tests, these tests need to be put through test development
procedures, but in most cases it is not difficult to develop acceptable tests. The main
exception seems to be the classical nth word deletion cloze test, which has been
shown to be highly erratic in its performance (cf. Alderson 1979; Klein-Braley
1981) and which is difficult to score: if exact scoring is used (= only
replacement of the original word is counted as correct) then it is often too difficult for
learners of foreign languages (it is often too difficult for mother tongue learners
too! — cf. Klein-Braley 1982), and if acceptable scoring is used then a great deal
of time can be spent in agreeing on what is acceptable, which casts away all the
advantages of an objective procedure.
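The difference between the two scoring methods can be made concrete with a small sketch. The function name `score_cloze`, the three-blank key and the list of acceptable alternatives are all invented for the example; in practice the acceptable list is exactly what the scorers would have to negotiate.

```python
def score_cloze(responses, exact_key, acceptable_key=None):
    """Count correct blanks; acceptable_key maps a blank index to extra allowed words."""
    score = 0
    for position, answer in enumerate(responses):
        allowed = {exact_key[position].lower()}
        if acceptable_key is not None:
            allowed |= {word.lower() for word in acceptable_key.get(position, ())}
        if answer.strip().lower() in allowed:
            score += 1
    return score


# Hypothetical key for a three-blank passage.
exact_key = ["house", "quickly", "big"]
acceptable_key = {0: {"home"}, 2: {"large", "huge"}}
responses = ["home", "quickly", "large"]

exact_score = score_cloze(responses, exact_key)
acceptable_score = score_cloze(responses, exact_key, acceptable_key)
print(exact_score, acceptable_score)
```

Exact scoring needs nothing beyond the original text, whereas every entry in `acceptable_key` represents a judgement that has to be agreed on before scoring can start.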
In the context of a research study the objective tests of linguistic
performance would be the ones to look at first. They are relatively easy to produce,
fairly easy to explain to the test takers, and the scoring is objective, though up to
now it cannot be performed by machine — with the exception of the cloze elide
test where ETS holds a patent for machine-scorable forms.
At the same time it should be pointed out that these are proficiency, not
achievement tests. They are not curriculum-oriented. Their purpose is to place
learners on a continuum from zero to 100% linguistic performance. They are
not designed to reveal small increments of linguistic knowledge, the control of
individual units, the ability to manipulate specific structural rules. Nor are they
diagnostic tests. This means that normally they have no justification as classroom
tests — since in my opinion learners have a right to expect that tests administered
as part of the learning process should in some way be related to what has been
taught. In the context of a research study, on the other hand, their absence from
the normal classroom can probably be viewed as a benefit, since they will be un-
channels for communication? I can only come up with the telephone, the radio
and the station/airport loud speakers. Fair enough. But then we realize that we
also have to throw out all multiple-choice measures — and this leaves us in rather
a quandary so far as testing reading and listening is concerned. What is the
average everyday response to reading a book, or to listening to a radio programme?
Normally there is no visible response at all! Admittedly we could get round this
problem by using specialised materials: a comedy programme, perhaps, and
counting the laughs. But this (a) is unsatisfactory sampling and (b) may be
affected not by the examinee's level of language proficiency but by his or her sense
of humour. Similarly essays go overboard. The only people who regularly write
essays as part of everyday life are schoolchildren and language students. Their
mothers and fathers don't. They write letters, shopping lists, notes for the
cleaning lady, and possibly a variety of texts in their professional capacities. But they
don't write essays.
A second problem with communicative tests is that of judging the outcome
of the test procedures. What is to be judged? The adequacy with which the
student performs the given task? But the task itself, as we have seen, is only
quasi-authentic. And just how is the language used in performing the task to be
assessed? Adequacy? Fluency? Correctness? Amount of foreign accent?
Olshtain and Blum-Kulka (1985: 28) make the following suggestion:
"Since one of the outstanding features of speech act behavior is variability,
expected outcomes on the test given to learners of the language will need to
allow for such variability. The tester needs to relate, therefore, to a range of
acceptable answers. How can this range be established? One possible remedy
might be to follow the principle of administering any functional test to native
speakers of the target language first, in order to establish the acceptable
variation of answers. Accordingly, the tester will be able to evaluate the learner's
answers by comparing them to the native norms of variability on the same test
and within the very same testing item".
4 Conclusions
hand the discrete-point-item test can need several cycles of painstaking test
development, but it can subsequently be used with very large groups and
administered and scored by ancillary personnel because the test development
procedures have made it relatively foolproof. My own preference, speaking
now as a language teacher, would be to invest the effort in test development.
But then I hate marking student papers!
It may seem that in focussing so much attention on the tests I am implying
that much of the effort — and funding — going into the research project needs to
be invested in the tests. This is, in fact, exactly what I am suggesting since in any
piece of research satisfactory, i.e. interpretable and reliable, results can only be
obtained if the measurement procedures are functioning properly. No amount
of effort or statistical manipulation can rescue a research study if the tests have
been designed to answer the wrong questions or if they are not sufficiently
sensitive to detect possible effects.
Acknowledgement
Thanks, as always, are due to my research partner, colleague and friend Ulrich Raatz, Professor
of Clinical Psychology at the University of Duisburg, for his helpful comments and criticism. All
remaining errors are entirely my own work.
References
Alderson, J. Charles. 1979. "The cloze procedure and proficiency in English as a foreign lan-
guage." TESOL Quarterly 11.59-67.
APA: American Psychological Association. 1974. Standards for educational and psychological
tests. Washington: APA.
Bachman, Lyle F. and Adrian S. Palmer. 1981. "The construct validation of the FSI Oral Inter-
view." Language Learning 31/1.67-86.
Bachman, Lyle F. 1981. The trait structure of cloze test scores. Paper presented at the 1981 TESOL
Midwest Regional Conference, Champaign/Urbana.
Campbell, Donald T. and Julian C. Stanley. 1963. "Experimental and quasi-experimental designs
for research on teaching." Handbook of research in teaching ed. by N.L. Gage, 171-246. Chi-
cago: Rand McNally and Co.
Carroll, John B. 1961. "Fundamental considerations in testing for English language proficiency of
foreign students." Testing the English proficiency of foreign students ed. by Center for Applied
Linguistics, 30-40. Washington, DC: Center for Applied Linguistics.
Cronbach, Lee J. 1984. Essentials of psychological testing. New York: Harper and Row.
Engels, Leopold K. 1982. "Testing and mastery learning of English vocabulary at university level."
Practice and problems in language testing III. Studiereeks van het tijdschrift van de Vrije
Universiteit Brussel, 10 ed. by Madeline Lutjeharms and Terry Culhane, 144-157. Brussel:
VUB.
French, John W. 1961. "Schools of thought in judging excellence of English themes." Testing prob-
lems in perspective ed. by Anne Anastasi, 587-596. Washington, DC: American Council on
Education.
Gaies, Stephen J. 1988. "Validation of the Noise Test." In Grotjahn, Klein-Braley and Stevenson
1988.41-74.
Grotjahn, Rüdiger, Christine Klein-Braley and Douglas K. Stevenson, eds. 1988. Taking their
measure: the validity and validation of language tests ( = Quantitative Linguistics, 34.) Bo-
chum: Studienverlag Dr. N. Brockmeyer.
Harris, David P. 1969. Testing English as a second language. New York: McGraw-Hill.
Hauptman, Philip C., R. LeBlanc and M. Bingham Wesche, eds. 1985. Second language
performance testing. Ottawa: University of Ottawa Press.
Henning, Grant, Hudson, G. and Turner, J. 1985. "Item response theory and the assumption of
unidimensionality for language tests." Language Testing 2.141-154.
Klein-Braley, Christine and Ulrich Raatz, eds. 1985. C-Tests in der Praxis. Bochum: Fremdsprache
und Hochschule: AKS Rundbrief 13/14.
Klein-Braley, Christine. 1981. Empirical investigations of cloze tests. Ph. D. Dissertation, University
of Duisburg.
Klein-Braley, Christine. 1982. "On the suitability of cloze tests as measures of reading comprehen-
sion." Lezen in Onderwijs en Onderzoek ( = Toegepaste taalwetenschap in artikelen, 13) ed. by
A.J.M. van der Geest, C.J.Koster and J.F. Matter, 49-61. Amsterdam: VU Boekhandels.
Klein-Braley, Christine. 1983. "A cloze is a cloze is a question." Issues in language testing research
ed. by John W. Oller Jr., 218-228. Rowley, MA: Newbury House.
Klein-Braley, Christine. 1985. "A cloze-up on the C-Test." Language Testing 2.76-104.
Klein-Braley, Christine. 1987. "Fossil at large: translation as a language testing procedure." Grot-
jahn, Klein-Braley and Stevenson 1988.111-132.
Lado, Robert. 1961. Language Testing. London: Longman.
Madsen, Harold S. and Randall L. Jones. 1981. "Classification of oral proficiency tests." The con-
struct validation of tests of communicative competence ed. by Adrian S. Palmer, Peter J.M.
Groot and George A. Trosper, 15-30. Washington, DC: TESOL.
Manning, Winton H. 1986a. "Using technology to assess second language proficiency through
Cloze-Elide tests." Technology and language testing ed. by Charles W. Stansfield, 147-166.
Washington, DC: TESOL.
Manning, Winton H. 1986b. Development of Cloze-Elide tests of English as a second language.
Draft final report submitted to the TOEFL Research Committee. Princeton, NJ: Educa-
tional Testing Service.
Morrow, Keith. 1979. "Communicative language testing: revolution or evolution?" The communi-
cative approach to language teaching ed. by Christopher J. Brumfit and Keith Johnson, 143-
157. Oxford: Oxford University Press.
Oller, John W. Jr. 1973. "Cloze tests of second language proficiency and what they measure."
Language Learning 23.105-118.
Oller, John W. Jr. 1979. Language tests at school. London: Longman.
Olshtain, Elite and Shoshana Blum-Kulka. 1985. "Crosscultural pragmatics and the testing of
communicative competence." Language Testing 2/1.16-30.
Olshtain, Elite and Tamar Feuerstein. 1988. "Computer assisted global textual analysis". Paper
presented at the 13th International LAUD Symposium on Linguistic Approaches to Artifi-
cial Intelligence, Duisburg.
Raatz, Ulrich. 1985. "The factorial validity of C-Tests." Klein-Braley and Raatz 1985.42-54.
1 Theories of Testing
for such items typically include pre-testing of a pilot version of a set of items (a
test), checking of the psychometric properties of the test as a whole and of the
individual items, and selection of satisfactory items for inclusion in the final
product. In classical test theory the focus of empirical checks is on aspects such as
ascertaining suitable difficulty levels and appropriate item discriminative power,
and establishing test reliability and validity properties, usually by means of
correlational methods (resulting in internal consistency measures and inter-
correlations). Not infrequently, statistics calculated according to classical theory
carry meaning only in relation to a given sample of persons who have taken the
test in question and in relation to the particular set of items included in the test.
In other words, the estimates are relative to sampling characteristics. They are
sample-dependent both in respect of the sample of persons involved and in
respect of the sample of items used. There is no way of quantifying the test
statistics in objective or absolute terms, and this must be considered a weakness in
classical theory.
The one-parameter Rasch model is the simplest of the latent trait models
and it is the one that has been most commonly used in language testing as well
as in many other disciplines. It uses only one parameter to describe each item
("difficulty") and only one parameter to describe each person ("ability"), hence
the designation. Further, the model states that the probability of a correct
response to an item is a simple logistic function of these two parameters. By use of
the model it is possible (under certain assumptions, see below) to predict the
likelihood of a correct answer to a given test item on the basis of knowledge of
98 MATS OSCARSON
only two variables, item difficulty and person ability. The mathematical function
that relates the probability of a correct item response to the ability variable (the
latent trait) is described graphically by a so-called item characteristic curve
(ICC), which typically takes the form of an elongated S (see Figure). On the
basis of the item characteristic curve it is possible to make an estimate of the
probability (p) of a correct response to the item at any given student ability
level. For example, for a person at ability level -2 (representing an independent
assessment on a transformed z-scale) the probability of responding correctly to
item j is .4 (i.e. there is a 40% chance that the person will obtain a correct score).
For a person at ability level 1 the probability of success on item k is .7. The value
of p depends of course, as always, on person ability and item difficulty.
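The logistic function behind the item characteristic curve can be written out in a few lines. The sketch assumes the usual form of the one-parameter model, with ability and difficulty expressed on a common logit scale and the probability depending only on their difference; the function names and the conversion of the two readings quoted above into implied item difficulties are illustrative assumptions, since the original figure is not reproduced here.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the one-parameter logistic model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def implied_difficulty(ability, probability):
    """Difficulty at which a person of the given ability succeeds with this probability."""
    return ability - math.log(probability / (1.0 - probability))


# The two readings quoted in the text, re-expressed as item difficulties
# (assumes the ability axis can be treated as a logit scale).
b_j = implied_difficulty(-2.0, 0.4)
b_k = implied_difficulty(1.0, 0.7)

# Tabulate both item characteristic curves over a range of abilities.
for theta in (-3, -2, -1, 0, 1, 2, 3):
    p_j = rasch_probability(theta, b_j)
    p_k = rasch_probability(theta, b_k)
    print(theta, round(p_j, 2), round(p_k, 2))
```

When ability equals difficulty the probability is exactly .5, which is why item difficulty is read off the curve at the point where it crosses p = .5.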
The response pattern observed for a given item can be tested statistically for
goodness-of-fit. For an item that does not fit the model, the item characteristic
curve will deviate more or less markedly from the pattern portrayed in the
figure.
extension of the original theory and is usually referred to as the Partial Credit
method (described in detail in Wright and Masters 1982). A very clear and
instructive demonstration of the usefulness of this form of the Rasch model is
given by Pollitt and Hutchinson (1987), who employed it in the analysis and
calibration of a number of free writing tasks.
The Rasch model (like any other latent trait model) is based on a number
of assumptions concerning the nature of the data under analysis. The most
important of these are (1) the assumption of unidimensionality, which means that
the test must be homogeneous, i.e. each item in the test must measure the same
characteristic and (2) the assumption of local stochastic independence, which
means that performance on one item must not be affected by performance on
other items in the test. (There are, in addition, certain other requirements
associated with the use of the model, but I will not go into them in this context.)
From the assumption of unidimensionality it follows that poor fit to the
model will be obtained if the test is heterogeneous, for instance if some of the
items measure the ability aimed at and little or nothing else, whereas the others
measure a slightly different ability, or the intended ability plus something else.
This is not uncommon in foreign language listening comprehension testing, for
example, where a less relevant variable such as the ability to remember detailed
information may easily become over-represented among the more important
components of the skill one wants to assess.
It is usually important to be able to ascertain unidimensionality of test
measures in the context of measurement of the relative effects of different
instructional treatments, for instance when the same post-test is used to gauge
instructional effects. The reason for this is that there is always a risk that there may
be interactions between treatments and test scores resulting from covariation of
item difficulties and one (but not the other) treatment. What is needed in such
cases is a homogeneous test which measures the same thing in all treatments.
There is probably a good chance of avoiding this potential source of error if
Rasch analysis of test item data is undertaken.
Generally speaking, violation of underlying assumptions will of course
endanger the validity of results obtained by use of the model. However, the degree
to which one may accept departures from the assumptions is sometimes a matter
of practical judgement. Gustafsson (1980: 226) states that "... the fit of the data
to the model is important but the question of fit is nevertheless subordinate to
the solution of concrete measurement problems. This implies that lower
standards of fit can sometimes be set, that all possible deviations from the model
assumptions need not necessarily be considered and that in fact large deviations in
the data from the model assumptions can sometimes be tolerated". Furthermore,
the assumptions under which one may apply the Rasch model are largely
the same as those which make application of classical test theory permissible, as
when a point-biserial correlation is calculated in order to ascertain item
discrimination power or when a reliability index is calculated in order to estimate the
internal consistency of a set of test items.
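For readers unfamiliar with the first of these statistics, the point-biserial correlation between a right/wrong item and the total score can be computed as below. This is a minimal sketch: the function name and the five-person data are invented, and the total score here includes the item itself (an uncorrected index), which is an assumption of the example.

```python
from statistics import pstdev


def point_biserial(item, totals):
    """Correlation between a 0/1 item and total scores (item discrimination)."""
    n = len(item)
    n_pass = sum(item)
    p = n_pass / n  # item facility: proportion answering correctly
    # Assumes at least one examinee passes and at least one fails the item.
    mean_pass = sum(t for i, t in zip(item, totals) if i == 1) / n_pass
    mean_fail = sum(t for i, t in zip(item, totals) if i == 0) / (n - n_pass)
    return (mean_pass - mean_fail) / pstdev(totals) * (p * (1 - p)) ** 0.5


# Invented data: responses of five examinees to one item, and their total scores.
item = [1, 1, 1, 0, 0]
totals = [4, 3, 2, 1, 0]
discrimination = point_biserial(item, totals)
print(round(discrimination, 3))
```

A value near zero would indicate that success on the item is unrelated to the total score, which is the kind of item a pilot study is meant to weed out.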
As heterogeneity of items violates the assumption of unidimensionality, a
test should be checked for this before the Rasch model is employed, for instance
by use of factor analysis. However, application of factor analysis to dichotomous
data is considered to be problematic (Hambleton and Swaminathan 1985: 156).
In a collection of papers edited by Hughes and Porter (1983) it is pointed out
that the exploratory use of factor analysis tends to result in over-estimation of
the magnitude of the first factor, i.e. of the common variance. Instead, increasing
attention is now being paid to confirmatory approaches, i.e. to methods which
allow the researcher to make a statistical comparison between the predictions of
a model and the results obtained empirically (Palmer and Bachman 1981;
Adams, Griffin and Martin 1987).
Finally a note on the interpretation of the notion of unidimensionality. By
means of the latent trait approach to item analysis we reject items which do not
fit the model (which, by the way, only means that those items do not function
well in combination with the other items in the particular test in hand, not that
they are necessarily poor items in a different context). This operation would
seem to constitute a potential risk with respect to test validity. After we have
discarded misfitting items, it would appear that we will be left with a test
instrument which measures a very narrow range of abilities, or indeed just one single
refined ability in accordance with the fundamental requirement of the model,
i.e. that of unidimensionality of scores. The following question might then be
raised, at least in the area of language testing: Doesn't this result in a loss of va
lidity? After all, linguistic competence is a highly complex attribute, even when
we restrict our attention to sub-skills such as comprehension of spoken lan
guage, command of grammar, lexical control etc. Gustafsson (1977: 88) offers a
solution to this perplexing issue by stating that even if only one single variable is
measured with the same test "it does not mean that the latent trait in itself is
unidimensional; it may well be functionally (and factorially) complex and we can
certainly not claim that there is one unitary process underlying test perfor
mance". The view is supported empirically by Henning, Hudson and Turner
(1985), who studied the problem using a 150-item multi-skill language profi
ciency test. Examination of test data (item fit statistics etc.) indicated no viol
ations of the assumption of unidimensionality even though the test consisted of
subtests measuring such diverse skills as listening and reading comprehension,
grammar accuracy, vocabulary recognition, and writing error detection. (Re
search in this area, involving foreign language reading comprehension, has also
been reported by Willmott and Fowles 1974.) Thus it is probably safe to say that
the apparent threat to test validity posed by the assumption of unidimensionality
is not a real one and that there is no conflict between the requirements of
"coverage of domains" and measurement in one dimension.
By means of test linking or test equating using IRT techniques one may
compare scores obtained on different tests (including scores obtained on tests at
different levels of difficulty, so-called vertical equating). Basically, the
comparison is made possible by use of a set of link items common to both tests, or any
pair of tests if more than two are being calibrated (see Wright and Stone 1979
for a full description). This type of application of latent trait procedures is highly
relevant in research into the effects of instructional treatments, for instance in
long-term longitudinal studies which involve measurement at distant time inter
vals and which may therefore necessarily involve updating of parts of the tests
being used. Another obvious area of application is in classical pre-test/post-test
research designs.
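The role of the link items can be sketched as follows. The example assumes that item difficulties have already been estimated separately for two test forms (in logits), and that the two scales differ only by a constant, which is estimated as the mean difference between the link items' difficulties on the two forms. This mean-shift approach is one simple possibility; the function names and the sample values are hypothetical.

```python
from statistics import mean


def link_constant(diff_a, diff_b, link_items):
    """Mean difference (form A minus form B) of link-item difficulties, in logits."""
    return mean(diff_a[item] - diff_b[item] for item in link_items)


def to_scale_a(diff_b, shift):
    """Express all form B difficulties on the scale of form A."""
    return {item: b + shift for item, b in diff_b.items()}


if __name__ == "__main__":
    # Hypothetical difficulty estimates; L1 and L2 appear in both forms.
    form_a = {"L1": 0.5, "L2": 1.5, "A1": -1.0}
    form_b = {"L1": -0.5, "L2": 0.5, "B1": 0.0}
    shift = link_constant(form_a, form_b, ["L1", "L2"])
    print(shift, to_scale_a(form_b, shift))
```

Once the form B items are expressed on the form A scale, person measures derived from either form can be compared directly, which is what makes the procedure useful in pre-test/post-test and longitudinal designs.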
lated to those of any other set of items drawn from the same pool. The relevance
of this facility is particularly obvious in situations which require precise assess
ments in relation to some absolute standard of performance.
It should be added that the feasibility and usefulness of item banking on the
basis of latent trait measurement principles has been a matter of some dispute
among psychometricians (see for instance Woods and Baker 1985). The con
troversy relates primarily to the question of whether test characteristics estab
lished by means of the Rasch model can in fact be assumed to remain constant
over time.
By virtue of the fact that the Rasch model makes provision for "person-in
dependent" item calibration, it is possible to minimize the errors of measure
ment inherent in any set of test scores. Theoretically, the condition of minimal
measurement error obtains when all subjects only take items on which the prob
ability of responding correctly is equal to the probability of responding incor
rectly (i.e. when p = .50). Therefore it is always an advantage if one can
administer different sets of items, each at a suitable level of difficulty in terms of
probability of a correct response, to different groups of examinees, rather than
administering the same set of items to all subjects. As already indicated, the
Rasch one-parameter model provides one way of doing just that, i.e. of tailoring
the test to suit the particular target group in hand. The resulting gain in meas
urement precision and cost-effectiveness is of great interest in many educational
research contexts as well as in institutional test-administration programmes.
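The claim about p = .50 can be made concrete with a small sketch. Under the Rasch model the probability of a correct response depends only on the difference between person ability and item difficulty (both in logits), and the information an item contributes is p(1 - p), which is largest when p = .50. The function names below are hypothetical and the code is an illustration only.

```python
import math


def rasch_p(theta, b):
    """Probability of a correct response for ability theta and difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def item_information(theta, b):
    """Information contributed by one Rasch item: p * (1 - p)."""
    p = rasch_p(theta, b)
    return p * (1.0 - p)


def standard_error(theta, difficulties):
    """Standard error of the ability estimate for a given set of items."""
    total = sum(item_information(theta, b) for b in difficulties)
    return 1.0 / math.sqrt(total)


if __name__ == "__main__":
    # Four items matched to the examinee versus four items that are too easy.
    print(standard_error(0.0, [0.0] * 4))
    print(standard_error(0.0, [-2.0] * 4))
```

With the same number of items, the matched set yields the smaller standard error, which is the rationale for giving different groups of examinees items at a suitable level of difficulty.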
As we have seen, the Rasch model can be used for test dimensionality
"check-up". The model rejects items which measure something other than the
majority of items in the test. The practical implication of such a function is of ut
most interest in the type of research we are considering here (and indeed in any
type of research which aims at measuring a specific ability variable by means of a
test); it is imperative that we know that each measurement instrument quantifies
a single defined ability at a time and not a conglomerate of abilities, of whose
existence we may not even be aware in each case. The reason for
this is of course that an intruding or ill-defined variable (in a post-test) may
easily co-vary with the effects of one (but not some other) treatment under
comparison, thus seriously undermining the validity of the post-test scores. What we
want in experimental educational research are well-defined treatments and
equally well-defined test functions, strictly attuned to the specification of objec
tives common to the experimental treatments. Only then will we be able to draw
the right conclusions about the effects of the treatments under investigation.
Seen in this perspective, the issue of test dimensionality becomes one of pro
found importance. Latent trait theory is one (but not the only) tool that can be
used in order to establish the characteristics of a language test in this respect.
1.9 Summing up
In sum, then, I would like to argue in favour of exploiting the insights that
have been gained in recent decades in the area of statistical item analysis. For
such purposes as we are discussing here, i.e. the scientific evaluation of language
learning and teaching in a broad perspective, item response theory can no doubt
contribute substantially to the validity of the conclusions we are able to draw on
the basis of our research efforts. There is good reason to believe that the largely
inconclusive methodological experiments of the 1960's and 1970's would have
produced less equivocal results had the measurements of treatment effects been
performed with the rigorous control which we are now, some 25 years later, in a
position to exercise.
Further remarks on the significance of item response theory will be given in
the conclusion of the paper.
The use of the cloze procedure in language testing has grown at a phe
nomenal rate in the last 10 to 15 years. Many major proficiency test batteries in
use today include a cloze part of some sort or other and the technique is quite
commonly used in the ordinary foreign language classroom. Its widespread
popularity derives from the fact that it is relatively easy, even for a layman, to
convert a piece of text into a passable cloze test, and also from the fact that the
scoring is usually fairly simple and straightforward (most notably if one uses the
exact-word principle). The technique is, furthermore, extremely well researched,
and studies testifying to its usefulness abound (for surveys on the cloze and its
various modified forms, see for instance Oller 1979; Cohen 1980: 89-110).
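How little is involved in converting a text and scoring it by the exact-word principle can be shown in a brief sketch. It assumes a fixed-ratio deletion in which every n-th word is replaced by a blank; the deletion rate, the whitespace-based notion of a "word" (punctuation stays attached to it) and the function names are all simplifying assumptions made for illustration.

```python
def make_cloze(text, n=7):
    """Replace every n-th word with a blank; return the test and the answer key."""
    output, key = [], []
    for position, word in enumerate(text.split(), start=1):
        if position % n == 0:
            key.append(word)
            output.append("_____")
        else:
            output.append(word)
    return " ".join(output), key


def score_exact(key, answers):
    """Exact-word scoring: one point per blank restored with the original word."""
    return sum(
        given.strip().lower() == original.strip().lower()
        for original, given in zip(key, answers)
    )


if __name__ == "__main__":
    test, key = make_cloze("one two three four five six", n=3)
    print(test)
    print(score_exact(key, ["three", "seven"]))
```

Scoring by acceptable alternatives rather than exact words would require a list of admissible answers for each blank, which is where the simplicity of the exact-word principle shows.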
It is not surprising, therefore, that strong claims have been made for the
value of the cloze procedure. It is sometimes contended, for instance, that a
well-designed cloze measures not only language skills at a relatively low level
(e.g. command of vocabulary, grammar, idioms), but also higher-order skills
such as awareness of "intersentential relationships", global reading comprehen
sion etc. (see for instance Chihara et al. 1977; Bachman 1982; Bensoussan and
Ramraz 1984). Briere and Hinofotis (1979: 12) state that "Regardless of scoring
method, frequency of items deleted, or length of passage, results on a cloze test
correlate highly (usually .70 or better) with overall placement batteries in ESL".
Oller (1979: 357) tells us that "Ever since Taylor's first studies in 1953, it has
been known that cloze scores were good indices of reading comprehension". It
may be added that the cloze was originally devised as a method for assessing the
readability of texts (Taylor 1953).
However, data that may cast doubt on the cloze as a valid assessment instru
ment are not lacking. Some researchers (e.g. Carroll 1972; Lado 1986) have
questioned the notion that successful performance on cloze tests requires ability
to interpret global text meanings, the implication being that cloze items are es
sentially sentence-bound. Other researchers have tried to define the possible
limit of the range of a cloze task to 5-10 words on either side of the blank. If
such an estimate were to be found valid, it would mean, in effect, that cloze tasks
are often insensitive to discourse constraints across sentence boundaries. Mark-
ham (1987: 309), investigating cloze sensitivity to global comprehension, con
cludes that the cloze procedure does not really assess comprehension at the
macro level: "It does not appear necessary to pay attention to the global cues in
order to complete the deletions". Other studies (for instance Hanzeli 1979) have
pointed to a special problem affecting the cloze, i.e. the difficulty of measuring
control of content words. Certain word classes, notably adjectives and certain
adverbs, are very hard to elicit by means of the deletion technique. Function
words are easier, because they are, as the jargon goes, "subject to local determi-
nacy", i.e. their immediate environment provides the necessary clues for their
substitution.
lating to vocabulary mastery and syntactical awareness, and the weight of evi
dence is that it also measures the test-taker's global proficiency in the language
quite well. Therefore, it is applicable in a wide variety of contexts, including lan
guage learning research.
2.3 The C-Test
The classical cloze of the fixed-ratio deletion type has spawned the develop
ment of a large number of cloze-like variants, e.g. the rational deletion cloze
(see above), the partial dictation test, which involves deletion of portions of re
corded speech, and the cloze-elide test, which involves identification of irrele
vant words inserted in a text (Manning 1987). Probably the most intriguing and
innovative of recent additions to the family of cloze techniques is the so-called
C-test, which was introduced in 1982 (for a comprehensive presentation and sur
vey of research, see Klein-Braley and Raatz 1984; for a review of the technique,
see Carroll 1987).
A C-test is constructed by deleting the second half of every second word in
a number of short texts (usually five or six). Each text is regarded as a "super-
item" and item statistics are not calculated on the basis of performance on indi
vidual tasks (blanks) but on the "super-item" level.
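The construction rule can be sketched in a few lines. The sketch starts the deletions with the second word of the text, leaves one-letter words intact, keeps the first half (rounded down) of words with an odd number of letters, and treats trailing punctuation as not belonging to the word. These details, like the function names, are assumptions made for illustration; the rule as stated above does not settle them.

```python
def mutilate(word):
    """Blank out the second half of a word; return the gapped form and the missing part."""
    core = word.rstrip(".,;:!?")  # trailing punctuation is kept outside the gap
    tail = word[len(core):]
    if len(core) < 2:
        return word, None  # nothing sensible to delete
    keep = len(core) // 2
    return core[:keep] + "_" * (len(core) - keep) + tail, core[keep:]


def make_c_test(text):
    """Apply the deletion to every second word; return the test text and the key."""
    output, key = [], []
    for index, word in enumerate(text.split()):
        if index % 2 == 1:  # second, fourth, sixth ... word
            gapped, missing = mutilate(word)
            output.append(gapped)
            if missing is not None:
                key.append(missing)
        else:
            output.append(word)
    return " ".join(output), key


if __name__ == "__main__":
    print(make_c_test("The cat sat on the mat."))
```

A score for each text treated as a "super-item" would then be the number of blanks in that text restored correctly, and item statistics would be computed on those text-level scores.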
The development of the C-test arose out of the above authors' critical ana
lysis of the assumptions underlying the Cloze, e.g. as regards the extent to which
a set of cloze tasks may be viewed as representing a random sample of the ele
ments of the language and also as regards the general validity of the procedure
(for a comprehensive account of the theoretical justification for the C-test, see
Klein-Braley 1985). Both the classical Cloze and the C-test may be described as
pragmatic and authentic tests in the sense that they use authentic materials as
the basis for item construction, but the originators of the latter test claim that a
better operationalization of the principle of random selection of language ele
ments is achieved with the C-test model. They hold that it tends to sample from
the various language elements more evenly than the Cloze procedure does, and
also that a better representation of "the real language" is achieved owing to the
fact that the test format involves the use of a variety of different text types.
Impressive empirical test data have been reported (see particularly Klein-
Braley and Raatz 1984). Cohen et al. (1984), investigating the possibilities of
adapting the C-Test technique to testing in Hebrew, found that the technique
"appears to be both a reliable and valid measure of general language profi
ciency" (p. 225). The evidence is, therefore, that this relatively new, and as yet
not widely employed, variant of deletion test has a great deal to offer in the way
of effective and reliable scoring.
However, the C-Test seems to suffer from at least one disadvantage, namely that
of questionable face validity (a problem which also affects the Cloze, although to
a minor degree). Mutilation of every second word in a text, albeit undertaken
for the sake of as wide coverage of linguistic elements as possible, no doubt re
sults in a product which does not really convey an impression of authentic lan
guage and consequently the researchers' ambition to secure representativity may
in fact prove to be a somewhat self-defeating measure. The question of whether
the C-Test format will be well received in the field, i.e. among teachers and lear
ners, seems to be crucial. In-depth studies of attitudes as well as of test-taking
strategies would seem to be called for (cf. work undertaken by Grotjahn 1986).
Finally, it might be added that further examination of test dimensionality,
for instance by means of latent trait methods (discussed in Section I), will be of
vital importance (some work has already been done, cf. Raatz 1985) in order to
ascertain whether variables other than linguistic ones are at play in C-Test per
formance. Is it possibly the case that some extralinguistic ability (or some very
particular linguistic ability) is helpful in restoring words cut in half, or are the
demands of the task of such a nature that all-round linguistic competence is a
necessary and sufficient prerequisite for successful performance? The test is of
an integrative type and is designed to measure general language proficiency, as
will have become clear from the above account.
3 Concluding Remarks
therefore be appropriate to end this paper by trying to relate the described de
velopments in statistical item analysis and testing to the challenge of communi
cative language testing.
The issue is: Can competencies postulated by existing models of communi
cative ability (e.g. that of Canale and Swain 1980) be appropriately dealt with
within the framework of current language testing theory and practice? Thus
stated, the question seems simple enough. However, the real complexity of the
problem becomes apparent if we remind ourselves of what kinds of components
modern models of language competence usually employ in their descriptions.
Typically they include variables such as grammatical competence (including
not just control of structures and rules, but also control of the phonetic system,
of semantics, of lexicon etc.), sociolinguistic competence (including choice of
register, style, conventions etc.), strategic competence (including verbal as well
as non-verbal communication strategies), etc. Assessing the full range of a per
son's abilities in all such domains is of course very difficult, if not impossible, in
an ordinary testing situation and there still remains a great deal of uncertainty as
to what and how to test. Nevertheless, one would probably be justified in saying
that language testers are actually beginning to come to grips with many of the
problems the task involves. The language tester's repertoire is in fact quite im
pressive and here one might point to such developments as are illustrated above,
in spite of the fact that reduced redundancy testing techniques do not possess
real face validity from a communicative point of view. They do contribute, how
ever, to providing a fuller picture of an individual's ability.
With regard to the question of what precisely one may assess by means of
any given test, for instance whether a single dimension is involved in the meas
urements, we can place considerable trust in item response theory as explained
earlier. IRT provides a promising basis for making inroads into better under
standing of what language tests measure. Whether IRT and the notion of latent
traits can be firmly established in a wider theory of language testing is a question
which may yet have to await its final answer. Communicative ability is an elusive
concept which does not easily lend itself to penetrating inquiry and detailed
quantification, even by sophisticated methods, and the work now being under
taken is far from its completion. Having said that, I would still like to reiterate
the main argument of my paper, namely that we have made headway in a great
many areas of foreign language testing and that we should now be in a position
to attack the perennial question of what effective language teaching looks like
with renewed confidence.
Acknowledgement
References
Adams, R.J., P.E. Griffin and L. Martin. 1987. "A latent trait method for measuring a dimension
in second language proficiency." Language Testing 4/1.9-27.
Alderson, J. Charles. 1979. "The cloze procedure and proficiency in English as a foreign
language." TESOL Quarterly 13/2.219-227.
Bachman, Lyle. 1982. "The trait structure of cloze test scores." TESOL Quarterly 16.612-670.
Bensoussan, M. and R. Ramraz. 1984. "Testing EFL reading comprehension using a multiple-
choice rational deletion cloze." Modern Language Journal 68/3.230-239.
Briere, E.J. and F.B. Hinofotis. 1979. Concepts in Language Testing: Some Recent Studies. TESOL,
Georgetown University, Washington DC. 20057.
Canale, M. and M. Swain. 1980. "Theoretical bases of communicative approaches to second lan-
guage teaching and testing." Applied Linguistics 1/1.1-47.
Carroll, John B. 1972. "Defining Language Comprehension: Some Speculations." Language Com-
prehension and the Acquisition of Knowledge ed. by R.B. Freedle and J.B. Carroll, 1-29.
Washington DC: Winston.
Carroll, John B. 1987. "Review of Klein-Braley, C. and Raatz, E. 1985. 'C-Tests in der Praxis.' in
Fremdsprachen und Hochschule, AKS-Rundbrief 13/14, Bochum: Arbeitskreis Sprachen-
zentrum /AKS/" Language Testing 4/1.99-106.
Chihara, T., J. Oller, K. Weaver and M.A. Chavez-Oller. 1977. "Are cloze items sensitive to
constraints across sentences?" Language Learning 27/1.63-69.
Cohen, Andrew D. 1980. Testing Language Ability in the Classroom. Rowley, MA: Newbury
House.
Cohen, A.D., M. Segal and R. Bar-Siman-Tov. 1984. "The C-Test in Hebrew." Language Testing
1/2.221-225.
De Jong, H.A.L. and C.A.W. Glas. 1987. "Validation of listening comprehension tests using item
response theory." Language Testing 4/2.170-194.
Grotjahn, R. 1986. "Test validation and cognitive psychology: Some methodological consider-
ations." Language Testing 3/2.159-185.
Gustafsson, J.E. 1977. The Rasch model for dichotomous items: Theory, applications and a
computer program. ( = Department of Education and Educational Research, University of
Göteborg, Sweden, Report No. 63.)
Gustafsson, J.E. 1980. "Testing and obtaining fit of data to the Rasch model." British Journal of
Mathematical and Statistical Psychology 33.205-233.
Gustafsson, J.E. 1981. An introduction to Rasch's measurement model. Göteborg, Sweden: Depart-
ment of Education and Educational Research, University of Göteborg.
Hambleton, R.K. and H. Swaminathan. 1985. Item Response Theory: Principles and Applications.
Boston: Kluwer-Nijhoff Publishing.
Hanzeli, Victor E. 1979. "Cloze Tests in French as a Foreign Language: Error analysis." Concepts
in Language Testing: Some Recent Studies ed. by E.J. Briere and F.B. Hinofotis, 3-11.
Washington DC: Teachers of English to Speakers of Other Languages.
Henning, G. 1984. "Advantages of latent trait measurement in language testing." Language Testing
1/2.123-133.
Henning, G. 1987. A Guide to Language Testing: Development, Evaluation, Research. New York:
Newbury House.
Henning, G., T. Hudson and J. Turner. 1985. "Item response theory and the assumption of uni-
dimensionality for language tests." Language Testing 2/2.141-154.
Hughes, A. and D. Porter, eds. 1983. Current Developments in Language Testing. London: Aca-
demic Press.
Klein-Braley, Christine. 1985. "A cloze-up on the C-Test: A study in the construct validation of
authentic tests." Language Testing 2/1.76-104.
Lado, Robert. 1986. "Analysis of native speaker performance on a cloze test." Language Testing
3/2.130-146.
Manning, W.H. 1987. Development of cloze-elide tests of English as a second language ( = TESOL
Research Report, 23.) Princeton, NJ: Educational Testing Service.
Markham, Paul L. 1987. "Rational deletion Cloze processing strategies: ESL and native English."
System 15/3.303-311.
Munby, John. 1979. Communicative Syllabus Design. Cambridge: Cambridge University Press.
Oller, John W. Jr., ed. 1983. Issues in Language Testing Research. Rowley, MA: Newbury House.
Oller, John W. Jr. 1979. Language Tests at School: A pragmatic approach. London: Longman.
Oscarson, Mats. 1986. Native and Non-Native Performance on a National Test in English for Swed-
ish Students: A Validation Study ( = Report No. 1986:03, Department of Education and Edu-
cational Research.), Göteborg, Sweden: University of Göteborg.
Palmer, A.S. and L.F. Bachman. 1981. "Basic concerns in test validation." Issues in Language Test-
ing ed. by J.C. Alderson and A. Hughes, 135-151. London: The British Council.
Perkins, K. and L.D. Miller. 1984. "Comparative analysis of English as a second language reading
comprehension data: Classical theory and latent trait measurement." Language Testing
1/1.21-32.
Pollitt, A. and C. Hutchinson. 1987. "Calibrating graded assessments: Rasch partial credit analysis
of performance in writing." Language Testing 4/1.72-92.
Porter, Don. 1983. "The effect of quantity of context on the ability to make linguistic predictions."
Current Developments in Language Testing ed. by A. Hughes and D. Porter, 63-74. London:
Academic Press.
Raatz, U. 1985. "Better theory for better tests?" Language Testing 2/1.60-75.
Rasch, G. 1960. Probabilistic models for some intelligence and attainment tests. Chicago: The
University of Chicago Press.
Smith, P.D. 1970. A Comparison of the Cognitive and Audiolingual Approaches to Foreign Lan-
guage Instruction: The Pennsylvania Foreign Language Project. Philadelphia: The Center for
Curriculum Development.
Taylor, W.L. 1953. "Cloze procedure: A new tool for measuring readability." Journalism Quarterly
30.415-433.
Willmott, A.S. and D.E. Fowles. 1974. The Objective Interpretation of Test Performance: The Rasch
Model Applied. Slough, Bucks.: NFER Publishing Company.
Woods, A. and R. Baker. 1985. "Item response theory." Language Testing 2/2.118-140.
Wright, B.D. 1977. "Solving measurement problems with the Rasch model." Journal of Educa-
tional Measurement 14/2.97-116.
Wright, B.D. and M.H. Stone. 1979. Best Test Design: Rasch Measurement. Chicago: MESA Press.
Section III—Teaching Environments
Introduction to the Section on Teaching
Environments
Kees de Bot
be given of the learners' activities, makes it clear that the amount of time effec
tively used for learning is quite small. At the same time, we do not know
whether it is at all possible or even effective to have learners focussed on their
learning activity all the time. Maybe gazing out of the window is a perfect way to
digest new information.
In his contribution, Allwright proposes to focus research on this particular
part of the process: the way in which learners define their own, probably idiosyn
cratic learning environment.
The relation between "methods" and "learning" is further weakened by the
link between "classroom activities" and "learning." There is, as yet, simply no
way to get to know to what extent certain activities lead to the changes in cere
bral activity we tend to call learning. Recent work on ERP's (Event related
potentials, low-voltage, but detectable cerebral activity that appears to be re
lated to certain types of activities and stimuli) definitely has some potential, but
this type of research is still in its infancy, and the way in which data are gathered
at present does not lend itself particularly to classroom research: a class full of
adolescents sitting motionlessly with their heads covered with electrodes could
be the ultimate dream of a tired teacher, but it is certainly not an ecologically
valid research environment.
One of the aspects that have not been explored in any detail is what the long
term effects of different methods are. Longitudinal research is still to be done. It
is conceivable that certain methods are relatively successful in having immediate
effects, while other methods may be less successful in the short run, while lead
ing to better retention over the years. As pointed out by Van Els et al. (this vol
ume) the effectiveness of a method has to be related to the goals set for foreign
language teaching. In some cases short term success is sufficient, but in general
the aims are more far reaching in time.
In her paper Mitchell stresses the importance of both the goals of the teach
ing method and the goals of the evaluation of that method. Evaluation does not
take place in a political vacuum, but rather, the research will be interpreted by
those involved: the politicians, the teachers, and sometimes the learners or their
parents. This implies that a certain transparency and face-validity of the metho
dology is called for. Outcomes that are "mere numerals" are unlikely to change
politicians' decision making or parents' attitudes.
In research on foreign language teaching, the same type of methodological
discussions takes place as in other educational research. One of the bones of
contention is the validity of (quasi-)experimental designs. Of the authors in
this subsection, Larsen-Freeman is clearly more in favour of this type of research
than the other authors. She stresses the importance of process-oriented data-
gathering, but does not reject the application of experimental designs as such.
Language Focus
In the Natural Approach, what is important is that the language the teacher
uses is comprehensible. In other words, structural diversity is permitted as long
as what is being transmitted by the teacher is understood. This is not the case
with the Silent Way. With the Silent Way, there is both structural grading3 and a
restricted functional vocabulary, at least at the beginning levels. No such
linguistic constraints are placed on what is taught to the students in a Community
Language Learning class. In fact, it is the students who determine the syllabus,
indicating what it is they wish to learn of the target language (TL) by having
conversations in their native tongue which are subsequently translated into the TL.
Linguistic structures receive little attention from students in a course where the
Communicative Approach is being practiced. Instead, students are engaged in
using the language, and thus practicing the functions to which the language is
put.
tations and applications by practitioners (cf. 1.1 below). It is also doubtful that
empirical research would yield unequivocal results indicating the superiority of
one methodology over another. Certainly this has not been the case to date (cf.
1.1 below). I do believe, however, that there are some instructional practices
which are superior for certain purposes and for certain teachers and students
and where there is divergence from these in practice, we should be able to ex
plain the differences in terms of learning outcomes. I will elaborate on this point
in 2.1 below. First, though, I will summarize what research has been conducted
on teaching methodologies and related matters. Following that review, I will
propose two categories of investigation which I believe should be included on a
future research agenda.
Given the difference between Type A and Type E classrooms, the re
searchers hypothesized about the extent to which the differences would contrib
ute to differences in student knowledge and performance in the eight classrooms
under investigation. These hypotheses were then tested by analyzing student
performance on measures of their grammar, discourse, sociolinguistic com
petence and listening skills in French.
In actual fact, only two of the eight classrooms were determined to have ex
periential orientations according to their overall COLT score and even these
were termed "relatively" experiential as opposed to relatively analytical. None
theless, the researchers report that their most striking finding was the extent to
which the two different types of instruction were indistinguishable. None of the
differences between groups on adjusted post-test scores was significant, al
though the difference between the analytical and experiential groups in favor of
the former nearly reached significance on the grammar multiple-choice written
test. When the two most analytical classrooms were compared with the two ex
periential classrooms, more significant differences emerged; however, on most
of the sub-tests, the two groups performed similarly. Moreover, when the total
gain in proficiency was calculated for each class over the year, the one experien
tial class made the highest gain in overall proficiency and the other experiential
class made the lowest gain of the eight classes. Although these results seem
counter-intuitive, they may be explicable either by pointing to research metho
dological problems, or by considering the fact that there is more to language
learning success than the actual practices which are implemented. More will be
said about this below.
126 DIANE LARSEN-FREEMAN
From the preceding review of the empirical research in the area of teaching
methodologies, it seems clear that in order to promote our understanding of the
teaching/learning process, future research should not attempt to compare meth
odologies on a global level, but rather should focus on more local practices. An
other requirement should be that research designs include both process (what is
actually happening in the classroom) and product (what the learning outcomes
are) with an observational component built in to verify that the former is pro
ceeding as planned. Furthermore, the research should be theoretically moti
vated in order to contribute to a coherent, rather than fragmented, view of the
teaching/learning process.
2.1 Process/Product Studies
cial formal features (morphemes, function words, subclause word order). The
subjects were divided into four groups depending on the orientation of their in
struction: form only, meaning only, both form and meaning and a control group
which was given the pre- and post-tests, but worked on an unrelated task during
the learning time allotted to the other groups. The other three groups each were
given a different task depending upon its focus. For example, the form-focused
group worked on an anagram task, while the meaning-focused group registered
their opinion about the issues raised in the sentences. The subjects were given
cued recall tests and a sentence copying test which was administered both before
and after the experimental treatment.
From the results, Hulstijn was able to determine that attention to form was
sufficient for implicit learning of the structural features to take place. However,
he only obtained modest evidence to support the claim that focus on meaning
inhibits the acquisition of the formal features.
Although this study may not be unique in meeting the characteristics it is
desirable for process/product studies to have, it does address all three. It is
targeted at a sub-global level, it considers both process and product and it is the
oretically motivated. Moreover, it deals with clear intervention points (e.g. at
tention to form) which can make an instructional difference.
One would not expect to find from such studies that certain teaching
practices are intrinsically "good" or "bad" for all learners. Depending upon the
learning outcomes intended, different practices may be exploited. Moreover, for
a particular developmental point, certain practices may be more efficient than
others. As Politzer's (1970) study indicates, there is likely to be a curvilinear, not
linear, correlation between student achievement and teaching practices. Certain
practices may be positively correlated with student achievement at some times,
and neutral or even negatively correlated with it at others.
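The difference between a curvilinear and a linear relationship can be illustrated numerically. What follows is only a minimal sketch in Python: the figures are invented to produce an inverted-U shape and are not taken from Politzer (1970), and the names `frequency` and `achievement` are hypothetical. It shows how a single linear correlation computed over the whole range can be zero even though the practice is strongly related to achievement, positively at low frequencies and negatively at high ones.

```python
# Invented illustration: frequency of use of a practice vs. achievement.
# These numbers are hypothetical; they are not data from any cited study.

def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

frequency = list(range(11))                              # 0, 1, ..., 10
achievement = [25 - (f - 5) ** 2 for f in frequency]     # peaks at f = 5

r_overall = pearson(frequency, achievement)              # whole range
r_low = pearson(frequency[:6], achievement[:6])          # rising half
r_high = pearson(frequency[5:], achievement[5:])         # falling half

print(f"overall r = {r_overall:.2f}")
print(f"low-frequency half r = {r_low:.2f}")
print(f"high-frequency half r = {r_high:.2f}")
```

Under these assumptions, a study that reported only the overall coefficient would conclude that the practice is unrelated to achievement, which is the kind of misreading that a curvilinear relationship invites.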
If optimally-timed and optimally-focused instructional practices do make a
difference, as seems so intuitively obvious, then the type of process-product
study called for here should be illuminative. However, despite their obvious
merit, process-product studies should not be the only nominee for a research
agenda. As has been alluded to several times already, there is more influencing
success in language learning than the actual practices which are employed. Indeed,
as we have seen with some of the studies mentioned earlier, no matter how
worthy the practices are which process-product research supports, teachers do
not always put them into practice in the manner prescribed. Rather than
despairing at such behavior, it would be worth our while to encourage research
initiatives which examine how the agent in the instructional methodology, the
teacher, influences the teaching/learning process.
In addition to the competing demands with which the teacher must cope, a
responsible teacher will alter methodological practices simply to meet the
learners' needs at the time. As frustrated as we might be when the teachers deviate
from what they are supposed to do during our experiments, we would
experience even more frustration if we were students in a class where a teacher
adhered rigidly to a specific methodological practice when we students were
unresponsive, bored or hopelessly lost. Thus, it is fallacious to view teachers as
mere "conveyor belts" (Lim 1988), delivering language through inflexible
practice.
What empirical research has been conducted on the role of the teacher has
been limited primarily to describing the speech teachers use in addressing
learners, questioning them and giving feedback. What this brief list
dramatizes, as Woods (1988) acknowledges, is that we know very little about what
teachers actually do, although there is no dearth of materials telling teachers
what they should do. Another explanation for the teacher's failure to heed the
"shoulds" or to consistently apply methodological principles is that
methodologists (and one could easily include researchers and even language teacher
educators in this group) do not necessarily conceptualize teaching practice in the same
way as teachers do. If we are to generate knowledge that is to have a positive
impact on pedagogical practice, then we must formulate our inquiries in ways that
are more compatible with teachers' perspectives (Bolster 1983).
3 Conclusion
tion of the incongruities has thus far taken place. A research agenda should,
therefore, include process-product studies which attempt to resolve the
contradictions, not through homogenization of practice, but rather by linking a specific
practice with particular learning outcomes, depending upon the audience.
There should also, however, be room on the agenda for investigating the
role of the agents in the teaching/learning process. We cannot assume that
teachers are mere conduits from methodologists to students. We need to
know not only what teachers do, but also why they do it.
Ultimately, of course, we must be able to weave all the strands together:
teaching, learning, teacher, learner, materials, context. Until that time, however,
there is much groundwork to be laid.
Notes
1. The six include The Silent Way, Suggestopedia, Community Language Learning, Total
Physical Response, The Communicative Approach and the Natural Approach.
2. Here the term theoretical is used in a broad and generic sense following Stern (1983: 26)
who views each language teaching methodology as a different theory of teaching.
3. Structural grading does not mean a foreordained sequence. The Silent Way teacher assumes
the responsibility for moving from one structure to the logical next, depending upon the
needs of a particular group of students with whom the teacher is working.
4. It is worth noting, however, that in a replication study in Sweden, significant differences in
favor of the Explicit Method (Cognitive-code approach) over the Implicit Method (ALM)
were found when only adults were the subjects (Oskarsson 1973).
5. Indeed the last thing we would want to do is to chastise the teacher who was not being
methodologically chaste. See 3.2 for further discussion.
6. Of course, I could make (and have made, Larsen-Freeman 1983) the same case for studying
the other agent in the process, the learner, who some might argue has an equal or even more
important role to play in the process than the teacher (see, for example, Breen and Candlin
1980; Allwright 1981). As this paper is supposed to deal with teaching methodologies,
however, I will leave it to others to make that case.
7. It is interesting to note that Chaudron (1988) devotes two chapters to the agents in the
teaching/learning process. One chapter is entitled "Learner Behavior", the other simply
"Teacher Talk". Chaudron, himself, explains that "In general in L2 research, learners have
been conceived of as much more 'whole' persons than teachers..."
References
Agard, F. and Dunkel, H. 1948. An Investigation of Second-Language Teaching. Boston: Ginn.
Allen, J., M. Frohlich and N. Spada. 1984. "The communicative orientation of language teaching:
an observation scheme." On TESOL '83 ed. by J. Handscombe, R. Orem and B. Taylor, 231-
252. Washington, DC: TESOL.
Allwright, R. 1981. "What do we want teaching materials for?" ELT Journal 36/1.5-18.
Asher, J. 1969. "The total response approach to second language learning." The Modern Language
Journal 53/1.3-7.
Asher, J. 1972. "Children's first language as a model for second language learning." The Modern
Language Journal 56/3.133-139.
Asher, J. 1984. "The total physical response: some guidelines for evaluation." Paper presented at
the 1984 Milwaukee Symposium on Current Approaches to Second Language Acquisition.
Bolster, A. 1983. "Toward a more effective model of research on teaching." Harvard Educational
Review 53/3.294-308.
Breen, M. and C. Candlin. 1980. "The essentials of a communicative curriculum in language
teaching" Applied Linguistics 1/2.89-112.
Chaudron, C. 1988. Second Language Classrooms: Research on Teaching and Learning. Cambridge:
Cambridge University Press.
Fanselow, J. 1977. "Beyond Rashomon — conceptualizing and describing the teaching act."
TESOL Quarterly 11/1.17-39.
Gary, J.O. 1975. "Delayed oral practice in initial stages of second language learning." New
Directions in Second Language Learning, Teaching and Bilingual Education ed. by Burt, M. and H.
Dulay, 89-95. Washington, DC: TESOL.
Harley, B., Allen, P., Cummins, J. and M. Swain. 1987. The Development of Bilingual Proficiency,
Final Report, Volume II: Classroom Treatment. Toronto: The Ontario Institute for Studies in
Education.
Hatch, E. and M. Long. 1980. "Discourse analysis, what's that?" Discourse Analysis in Second
Language Research ed. by D. Larsen-Freeman, 1-40. Rowley, MA: Newbury House Publishers.
Hulstijn, J. 1989. "Implicit and incidental second language learning: Experiments in the
processing of natural and partly artificial input." To appear in Interlingual Processes ( = Language in
Performance, 1) ed. by H. Dechert and M. Raupach. Tübingen: Gunter Narr Verlag.
Krashen, S. 1981. Second Language Acquisition and Second Language Learning. Oxford:
Pergamon Press.
Krashen, S. and T. Terrell. 1983. The Natural Approach. Oxford: Pergamon Press.
Larsen-Freeman, D., ed. 1980. Discourse Analysis in Second Language Research. Rowley, MA:
Newbury House Publishers.
Larsen-Freeman, D. 1983. "Second language acquisition: getting the whole picture." Second
Language Acquisition Studies ed. by K. Bailey, M. Long and S. Peck, 3-22. Rowley, MA:
Newbury House Publishers.
Larsen-Freeman, D. 1986. Techniques and Principles in Language Teaching. New York: Oxford
University Press.
Larsen-Freeman, D. 1987. "Recent innovations in language teaching methodology." The Annals of
the American Academy of Political and Social Science 490.51-69.
Larsen-Freeman, D. and M. Celce-Murcia. 1985. "Defining the challenge: an additional choice in
language teaching." A paper presented at the 1985 TESOL Convention, New York City.
Larsen-Freeman, D. and M. Long. 1988. "Research priorities in foreign language learning and
teaching." A paper prepared for the National Foreign Language Center, The Johns Hopkins
School for Advanced International Studies, Washington, DC.
Levin, L. 1972. Comparative Studies in Foreign Language Teaching: The GUME Project.
Stockholm: Almqvist and Wiksell.
Lim, C. 1988. "Producing instructional materials in the Singapore setting." A paper presented at
the 1988 RELC Seminar, 11-15 April 1988, Singapore.
Long, M. 1980. "Inside the 'black box': methodological issues in classroom research on language
learning." Language Learning 30/1.1-42.
Long, M. 1987. "The experimental classroom." The Annals of the American Academy of Political
and Social Science 490.97-109.
Long, M. and G. Crookes. 1986. "Intervention points in second language classroom processes." A
paper presented at the 1986 RELC Seminar, 21-25 April 1986, Singapore.
Moskowitz, G. 1971. "Interactional analysis — a new modern language for supervisors." Foreign
Language Annals 5/2.211-221.
Oskarsson, M. 1973. "Assessing the relative effectiveness of two methods of teaching English to
adults." IRAL 11/3.251-262.
Politzer, R. 1970. "Some reflections on 'good' and 'bad' language teaching behaviors." Language
Learning 20/1.31-43.
Postovsky, V. 1970. "The effects of delay at the beginning of second language teaching."
Unpublished doctoral dissertation. Berkeley, CA: University of California.
Prabhu, N. 1987. Second Language Pedagogy. Oxford: Oxford University Press.
Prabhu, N. 1988. "Materials as support: materials as constraint." A paper presented at the 1988
RELC Seminar, 11-15 April 1988, Singapore.
Scherer, G. and M. Wertheimer. 1964. A Psycholinguistic Experiment in Foreign-Language
Teaching. New York: McGraw-Hill.
Smith, P. 1970. A Comparison of the Cognitive and Audiolingual Approaches to Foreign Language
Instruction: The Pennsylvania Foreign Language Project. Philadelphia: Center for Curriculum
Development.
Spada, N. 1986. "The interaction between type of contact and type of instruction: some effects on
the L2 proficiency of adult learners." Studies in Second Language Acquisition 8/2.181-199.
Stern, H. 1983. Fundamental Concepts of Language Teaching. Oxford: Oxford University Press.
Swaffar, L., Arens, K. and M. Morgan. 1982. "Teacher classroom practice: redefining method as
task hierarchy." Modern Language Journal 66/1.24-33.
Ullman, R. and E. Geva. 1982. Classroom observation in the L2 setting: a dimension of program
evaluation. Modern Language Centre, Ontario Institute for Studies in Education (Mimeo).
Wagner, M. and G. Tilney. 1983. "The effect of 'superlearning techniques' on the vocabulary
acquisition and alpha brainwave production of language learners." TESOL Quarterly 17/1.5-19.
Woods, D. 1988. "Teachers' interpretations of language teaching materials." A paper presented at
the 1988 RELC Seminar, 11-15 April 1988, Singapore.
Problems in Defining Instructional Methodologies
Christopher Brumfit
2 Methods as packages
Education has never been free from polemic, and polemic leads to
oversimplification. Nonetheless, it does seem worthwhile to try to disentangle the basic
principles that are at stake from the fights for franchise or ownership of "new"
procedures and their associated polemic, especially since, as Howatt (1984) shows
clearly, there are only a limited number of basic themes to be drawn upon by
teaching methodologists. Viewing the current scene from a British perspective, I
find it curious that (for example) it was possible for Krashen and Terrell to
market "The Natural Approach" (1983) as some kind of coherent package without
constantly examining the extent to which it overlapped with other traditions in
its recommendations. There are, indeed, serious academic problems about the
notion of "packages", but I have addressed this issue elsewhere, in Brumfit
(1985: 86-93), and shall not repeat those arguments here.
Underlying these different views of "methods" there seem to be two
separate traditions, and it is worth disentangling them.
1. Language teaching has a long history central to the institutionalised
educational process. (Indeed, a recent lengthy encyclopedia article on the history of
teaching methods (Connell 1987) devotes most of its space to the place of
language work in the curriculum.) This tradition was diverted, but did not die, with
the decline of the classics, and is to be found in the general curricular
discussions, and the rationales provided in teacher education, throughout Europe.
2. At the same time, there is a stronger tradition of alternative pedagogies in
language teaching than in other major curricular areas. This is partly because
there is a substantially greater amateur demand for language teaching than there
is for mathematics or other areas of the curriculum: all sorts of people, for
practical rather than academic reasons, need languages at some stage in their
careers, and have done throughout history. It is partly also because alternative
pedagogies in other subject areas were more likely to be repressed: theology
and medicine, to name but two areas, have often been intolerant of alternative
approaches to their fields. Consequently, language teaching has been particularly
open to the claims of inspired outsiders which may or may not have
generalisable value. Furthermore, many non-academic experts learned languages with
great flair, so that bizarre methods that "worked" for individuals could always be
supported by individual testimonies (e.g. Rambert 1972: 45, working with a text
translated into French words with English word order: "It was a simple, but
brilliant system. I was interested, and learnt it all by heart in no time").
Such concerns are clearly relevant to our interests. Let us simply note, for
the moment, that many discussions of "methods" (e.g. Richards 1984) treat them
as entities in their own right, almost as experimental models, rather than as
cultures socially emerging from human practices. To what extent is such a formal
view justified?
number of possible methods, and it is also likely that techniques may be in
principle separable from any method: particular methods are more likely to be
identified with constellations of techniques rather than with particular ones
exclusively.
As an illustration of this, consider the characteristics of Communicative
Language Teaching summarised from a range of sources in a contemporary
survey:
It is at least arguable that the use of the term "method" obscures as much as
it reveals. It is difficult to see that the requirements of Anthony's sketch, or of
Richards and Rodgers' more developed outline, are actually addressing
questions significantly different from those with which a commentator like Clark is
concerned, although he claims only to deal with educational value-systems in
curriculum renewal (Clark 1987: 3). Citing Skilbeck (1982), Clark identifies
three value systems (classical humanism, reconstructionism, and
progressivism) and relates them to foreign language teaching. The first is realised through
Grammar-Translation, the second through a variety of procedures including
audiolingualism, functional-notional syllabuses and graded objectives, and the
third through a number of process-oriented approaches.
This discussion does not use the term "method", but may be seen as dealing
with very similar issues to those addressed by Richards and Rodgers, without
being committed to the notion that methods come in discrete packages that are
readily identifiable, and that can be chosen from a set of conveniently available
options.
At a less abstract level, I have attempted to define the major features of
classroom planning and organisation (summarised in Brumfit 1984: 95-96). This
concentrates on three types of analysis of product (linguistic, interactional and
content or topic analyses). Any piece of genuine linguistic data will be capable of
being analysed in terms of all three, of course, but particular teaching
programmes will tend to concentrate on some dimensions rather than others. That
is to say that teachers will see some of these features as crucial for learning, even
though the goal of learning will inevitably include all of them. However, the
major criterial elements in the classroom process are more important. These are
identified in terms of (i) Communicative Abilities ("conversation/discussion,
comprehension, extended writing, and (possibly) extended speaking" being
preferred as categories to the traditional "four skills" model), (ii) orientation
towards Accuracy or Fluency, and (iii) pedagogical mode ("Individual, Private In
teractional — i.e. pairs or small groups — and Public Interactional — i.e. whole
class or large groups").
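The three criterial elements just listed can be read as a small coding scheme for classroom activities. The following is a minimal sketch in Python of how an observer might record and tally activities along those dimensions. It is an illustration only, not the scheme as published in Brumfit (1984): the class names, the `tally` helper and the sample lesson are all hypothetical, and it assumes that "conversation/discussion" and "comprehension" are separate categories.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Ability(Enum):
    # Assumed reading of the categories quoted in the text.
    CONVERSATION_DISCUSSION = "conversation/discussion"
    COMPREHENSION = "comprehension"
    EXTENDED_WRITING = "extended writing"
    EXTENDED_SPEAKING = "extended speaking"


class Orientation(Enum):
    ACCURACY = "accuracy"
    FLUENCY = "fluency"


class Mode(Enum):
    INDIVIDUAL = "individual"
    PRIVATE_INTERACTIONAL = "pairs or small groups"
    PUBLIC_INTERACTIONAL = "whole class or large groups"


@dataclass(frozen=True)
class ActivityCode:
    """One observed classroom activity, coded on the three dimensions."""
    ability: Ability
    orientation: Orientation
    mode: Mode


def tally(codes):
    """Count activities by (orientation, mode) pair."""
    return Counter((c.orientation, c.mode) for c in codes)


# Hypothetical lesson record, invented for illustration.
lesson = [
    ActivityCode(Ability.COMPREHENSION, Orientation.ACCURACY, Mode.PUBLIC_INTERACTIONAL),
    ActivityCode(Ability.CONVERSATION_DISCUSSION, Orientation.FLUENCY, Mode.PRIVATE_INTERACTIONAL),
    ActivityCode(Ability.CONVERSATION_DISCUSSION, Orientation.FLUENCY, Mode.PRIVATE_INTERACTIONAL),
    ActivityCode(Ability.EXTENDED_WRITING, Orientation.ACCURACY, Mode.INDIVIDUAL),
]

counts = tally(lesson)
for (orientation, mode), n in counts.items():
    print(f"{orientation.value:8s} | {mode.value:28s} | {n}")
```

A record of this kind describes what happened in a lesson after the fact; it does not by itself identify a "method".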
Similarly, the kinds of category systems devised for specific research
purposes, such as that produced at the University of Stirling to characterise modern
language teaching in Scottish schools (Mitchell, Parkinson and Johnstone 1981),
Without these three, language learning cannot take place, so teachers
unavoidably have to take a position on all three. But everything else is a matter of
convention, and conventions are negotiated by all those with interests in the
institutions of education: teachers, learners, parents, government, administrators
and others. Further, it is at least arguable that none of these conventions can be
seen to be static, not only because needs of learners vary and the views we have
of language and the world vary as our knowledge improves, but even more
because the institution of schooling, and the history of language teaching, have
their own dynamic. What is motivating now may not be motivating next year,
simply because next year it is a year older. Language teaching is part of a much
larger system, and the characteristic of a system is that if one element changes,
all the others subtly adjust to accommodate the change. Insofar as language
teaching is part of education, its elements will be subject to change caused by
factors that are totally outside the control of language and language-learning
theory.
Furthermore, since the social network provides its own dynamic, sensitive
teachers may well, in the course of their many centuries' experience of teaching,
have explored all possible permutations of language learning behaviour. What
changes, as research into classroom behaviour continues, is not necessarily the
essential structure of teaching method, but our ability to describe and explain
that structure more sensitively. There is no logical or necessary relationship
between the findings of research and the behaviour of learners or teachers, any
more than our ability to explain evolution more successfully entails changes in
the behaviour of the animals that are evolving.
Now of course to say that no logical relationship is entailed does not mean
that no relationship is possible. It is not my contention that teachers should take
no notice of research, nor that teaching cannot improve as a result of research.
But the argument does suggest that the concerns of researchers must be to try to
understand something that is a given, not to get mixed up with the claims of
those who wish to make money or name or who simply want to improve the
existing system. The dissatisfactions with present practice of present teachers are
data for the researcher, but enormous care must be taken to avoid seeing changes
of convention as somehow to be interpreted as changes of principle.
6 Conclusion
of the topics decided on by the learners. The term "method" as currently used
incorporates a large number of conflicting and ill-defined features.
But the elements within classrooms can clearly be discussed in terms of a
number of key features. It would be perfectly possible to specify the
characteristics of classroom behaviours in terms of the structure of language presented
to a class, the structure of practice opportunities, and the devices for motivation
of students, for example, to use the three key criteria referred to above.
Similarly, the features of any of the categories isolated for mention by others could
likewise be listed and quantified, and the Mitchell, Parkinson and Johnstone (1981)
list does this for some major features of language classrooms.
What seems much harder to sort out is whether there would be value in
demanding an advance specification of "method" as such. Probably "method" is
better seen as a retrospectively-perceived constellation of common features
rather than as something that can be identified and predicted in advance. To
predict it in advance would be to reduce the teacher's and pupils' roles as
determinants of classroom procedures to such an extent that crucial elements of
teaching and learning would almost certainly escape observation.
References
Allen, Patrick and Merrill Swain, eds. 1984. Language Issues and Educational Policies. Exploring
Canada's Multilingual Resources ( = ELT Documents, 119). Oxford: Pergamon.
Annual Review of Applied Linguistics 1987. "Communicative Language Teaching." New York:
Cambridge University Press.
Anthony, Edward M. 1963. "Approach, Method, and Technique." ELT 17.63-67.
Berlitz, M.D. 1907. Berlitz Method for Teaching Modern Languages. New York: M.D. Berlitz.
Brumfit, Christopher. 1984. Communicative Methodology in Language Teaching. Cambridge:
Cambridge University Press.
Brumfit, Christopher. 1985. Language and Literature Teaching: From Practice to Principle. Oxford:
Pergamon.
Clark, John L. 1987. Curriculum Renewal in School Foreign Language Learning. Oxford: Oxford
University Press.
Connell, W.F. 1987. "History of Teaching Methods." Dunkin 1987.201-214.
Dunkin, Michael J., ed. 1987. International Encyclopedia of Teaching and Teacher Education.
Oxford: Pergamon.
Dunkin, Michael J. and Bruce J. Biddle. 1974. The Study of Teaching. New York: Holt, Rinehart
and Winston.
Feiman-Nemser, Sharon and Robert E. Floden. 1986. "The Cultures of Teaching." Wittrock
1986.505-526.
Gordon, Peter and Denis Lawton. 1984. A Guide to English Educational Terms. London:
Batsford.
Hesse, M.G., ed. 1975. Approaches to Teaching Foreign Languages. Amsterdam: North-Holland.
Horton, T. and P. Raggatt, eds. 1982. Challenge and Change in the Curriculum. London: Hodder
and Stoughton and Open University.
Howatt, A.P.R. 1984. A History of English Language Teaching. Oxford: Oxford University Press.
Krashen, S. and T. Terrell. 1983. The Natural Approach. Oxford: Pergamon.
Larsen-Freeman, Diane. 1986. Techniques and Principles in Language Teaching. Oxford: Oxford
University Press.
Mitchell, Rosamond, Brian Parkinson and Richard Johnstone. 1981. The Foreign Language
Classroom: an Observational Study. ( = Stirling Educational Monographs, 9). University of Stirling.
Rambert, Marie. 1972. Quicksilver. Basingstoke: Macmillan.
Richards, Jack C. 1984. "The Secret Life of Methods." TESOL Quarterly 18/1.7-23.
Richards, Jack C., John Platt and Heidi Weber. 1985. Longman Dictionary of Applied Linguistics.
Harlow: Longman.
Richards, Jack C. and Theodore S. Rodgers. 1986. Approaches and Methods in Language Teaching.
Cambridge: Cambridge University Press.
Rivers, Wilga M. 1968. Teaching Foreign-Language Skills. Chicago: University of Chicago Press.
Skilbeck, M. 1982. "Three Educational Ideologies." Horton and Raggatt 1982.
Suppes, Patrick, ed. 1978. Impact of Research on Education. Washington, DC: National Academy
of Education.
Van Lier, Leo. 1988. The Classroom and the Language Learner. Harlow: Longman.
Ullman, Rebecca and Esther Geva. 1984. "Approaches to Observation in Second Language
Classes." Allen and Swain 1984.113-128.
Wittrock, Merlin C., ed. 1986. Handbook of Research on Teaching. Third Edition. New York:
Macmillan.
Evaluation of Foreign Language Teaching Projects
and Programmes
Rosamond Mitchell
Subsequently, the case for broadening the scope of evaluation inquiry
beyond the experimental paradigm has been argued by numerous theorists in the
Anglo-American research community, among whom the most notable are
perhaps House and Cronbach (US) and MacDonald (UK): see, for example, House
(1980), Cronbach et al. (1980), and MacDonald and Walker (1974). (Cronbach
(1982: 324) in particular presents an extended critique of what he calls the
"outmoded recommendation that the program evaluator prefer true experiments".)
The grounds for this shift lie essentially in the recognition that evaluation is an
applied, policy-related activity, with a short-term, "improvement" orientation
rather than fundamental research; as Cronbach (1982: 2) remarks:
"Many kinds of inquiry and pseudoinquiry are called evaluations. I restrict
attention to inquiries that represent serious attempts to improve a program or a
kind of service by developing a clear picture of its operations and the fate of
its clients".
and believe. Payoff comes from the insight that the evaluator's work generates
in others.
A study that is technically admirable falls short if what the evaluator learns
does not enter the thinking of the relevant political community".
The Colorado Project (Scherer and Wertheimer 1964) and the Pennsylvania
Project (Smith 1970) are well-known starting points for discussions of FL
programme evaluation (Long 1980, 1984; Beretta 1986a). These studies each
attempted to compare two FL teaching "methods" (audiolingualism plus a more
"traditional" mode of instruction) using large-scale, field experimental designs
in which classes and their teachers were randomly assigned to either
instructional method, with learners' FL achievement as the dependent variable. Their
findings were inconclusive; this politically inconvenient outcome immediately
provoked extensive critiques of the methodology employed by other members of
the FLT research community (see e.g. the October 1969 issue of the Modern
Language Journal, entirely devoted to a series of critiques of the Pennsylvania
Project). However, as Beretta points out, the thrust of these criticisms was "not
for failing to produce an evaluation that was capable of influencing policy, but
for failing to arrange for the tight controls that would have promoted internal
validity and contributed to a theory of language learning" (1988: 4). Thus for
example the Pennsylvania Project was criticised for failing to ensure the two
instructional "treatments" remained distinct (Otto 1969), and for bias in the tests
used (Valette 1969). Some researchers involved in these critiques went on to
develop models for experimental "comparative methods" research with stronger
internal validity (e.g. Freedman 1976, who substituted pre-recorded
instructional sequences for the undependable live teacher variable). Whatever the
merits of such designs for fundamental research, as Beretta remarks, they "can have
only extremely remote implications for practice" (1986b: 146), and consequently
can have little role to play in user-oriented programme evaluation.
The literature of the late 1970s and 1980s on the evaluation of second and
foreign language programmes shows strikingly uneven levels of awareness of the
debates within mainstream educational research sketched in the introductory
section of this paper. Reports of substantive evaluations of FL/L2 programmes
are more commonly found in the literature than are discussions of evaluation
methodology. The former may include some explicit rationale for the choice of
evaluation procedures, but for most the rationale remains largely implicit and
must be deduced from the account provided of the evaluator's practice.
Among those who do contribute to the substantive discussion of evaluation
methodology, Richards maintains a strikingly strong commitment to "true
experimental design" as the only worthwhile form of programme evaluation
(1984). He commends the small, well-controlled experimental study of Wagner
and Tilney (1983) as an "excellent example" of method evaluation (Richards op.
cit.: 18), and argues that similar principles should be followed in the evaluation
of large scale, long term projects such as the Bangalore "procedural syllabus"
project (Prabhu 1987), which is singled out for special criticism.
Others may feel, however, that the Wagner and Tilney study illustrates well
the problems associated with experimental models in evaluation contexts. The
study concentrated on decontextualised vocabulary acquisition, i.e. one aspect
only of the methodological "package" under investigation. The experimental
"treatment" (vocabulary recited to the sounds of Baroque music) was
delivered via an audiotape, while the control equivalent was delivered by a real
live, "traditional", teacher. (While the behaviour of the latter was well
controlled, the subjective attitudes of the teacher towards the experiment, and of
his/her students towards him/her as a person, could not be controlled away.
Neither of course could the individual learning strategies of the students, in and
out of class, randomly assigned though they were.) The number of subjects
within each condition was small, and the population from which they came an
unusual one (why music students?). As it happened, the experiment produced no
Throughout, a few voices have been raised with reference to the Canadian
studies, to argue for the broadening of the quasi-experimental, "product"
evaluation model to encompass process questions of the kind outlined above. Thus,
Hornby (1980) makes similar suggestions in the context of immersion
programmes in the United States. Ullmann and Geva (1985) argue the case in
Canada, drawing on their own use of systematic classroom observation in the
evaluation of a "core French" programme to exemplify the approach (Ullmann
et al. 1983). However, it would appear that to date, the impact of these
arguments has been slight.
In discussing this "gap" in the Canadian immersion evaluation procedures,
Beretta suggests that the researchers have themselves been aware of the "value
of documenting implementation" (1988: 9), but have failed to argue the case
with sponsoring bodies primarily concerned with public reassurance. This is
perhaps to underestimate the difficulties of collecting process data in politically
sensitive contexts (some of which are narrated in Mitchell forthcoming); it may
be that immersion programme developers and/or teachers resisted judgmental
scrutiny of the classroom "black box". This interpretation may be lent credence
by the existence of studies such as that of Canale et al. (1987), who are able to
report sensitive classroom case study material (including, for example, an
account of children being held up to public ridicule for mother tongue use) in the
context of an advisory rather than an evaluative document. But the international
evaluation community will benefit if the Canadian researchers themselves can
ultimately produce a full account of the rationales and constraints, academic and
political, which have formed their evaluation agenda.
Because of its scope and international influence, the Canadian French-related
evaluation experience has been discussed at some length. The other major
North American L2 evaluation tradition, which has received rather less international
attention, is that associated with federally-funded bilingual education programmes
in the United States. Here too the general emphasis has been on
product evaluation inspired by (though not necessarily rigorously implementing)
experimental and quasi-experimental designs (Baker 1981). There has, however,
been somewhat greater variety in the evaluation strategies adopted, with some
critical commentaries on the "product" model and attempts to explore life inside
bilingual programmes using ethnographically-inspired observational
strategies (see for example the introductions to, and empirical studies reported
in, Cohen et al. 1979).
Apart from these two substantial, government-funded empirical evaluation
traditions, the universe of FL/L2 programme evaluation is relatively fragmented.
The recent methodological contributions of Beretta (1986a, 1986b and 1988) are
exceptional in their depth of acquaintance with the general evaluation literature;
a few other EFL specialists deal more summarily with the area (e.g. White
1988). The likelihood that language educators will interest themselves in evaluation
issues has however some relationship with the degree of accountability expected
in particular professional contexts. Thus for example, teachers of English
for Special Purposes have a clear concern with evaluation, including an awareness
of the general evaluation literature unusual among L2/FL specialists, which
they themselves attribute to a strong sense of accountability to their sponsors,
whether governments or commercial organisations (Mackay 1981; McGinley
1986), and/or to their students (Waters 1987). So, Mackay argues for a model of
evaluation which "has as its purpose the provision of those in authority with information
which can be used in making decisions about improving or modifying
the program" (1981: 107). For this purpose he stresses the need to complement
student achievement data not only with a range of process information, but also
with a theoretical critique of the programme rationale, and provides a case study
of an actual evaluation which exemplifies these principles:
"The explicit purpose of the appraisal was not so much to assert merely that
the project was satisfactory or unsatisfactory, but to provide those responsible
for its future as detailed an account as possible of every factor which might
contribute to the project's success or lack of success. They would then be in a
position to make decisions, which might affect any aspect of the program, on
the basis of comprehensive and objectively gathered information" (op. cit.:
114).
The concluding section of this paper will illustrate some of these points,
with reference to an evaluation study in which the author was recently involved
(Mitchell et al. 1987). The programme to be evaluated was a bilingual (Gaelic-English)
primary school programme in the Western Isles of Scotland. The programme
had already been running for eight years at the time the evaluation was
commissioned. The commission arose out of a conflict between the Western
Isles local education authority and the central Scottish Education Department
regarding the worth of the programme. (The Western Isles had requested funding
from the SED for an extension of the bilingual programme into secondary
education, on the basis of the claimed success of the primary programme; the
SED said it was not satisfied as to the latter, and offered to fund the evaluation
study instead. After considerable internal conflict, leading to the resignation of
key programme developers, the local authority accepted the offer and researchers
from Stirling University, including the author, were invited by the
SED to undertake the study.)
By the time the evaluation study was planned, the bilingual programme had
been extended to all primary schools in the Western Isles. A formal control
group design was thus out of the question, as no "uncontaminated" control
schools were available. However, on the basis of preliminary interviews with
head teachers, it was possible to make a preliminary categorisation of the schools
as having different levels of expressed commitment to the programme. The
evaluation study concentrated mainly on a sample of "high uptake" schools, but
a small number of "low uptake" schools were also included for purposes of less
formal comparison on process and product measures.
The original bilingual education project (BEP) concentrated its efforts
largely on altering the pattern of classroom experience of bilingual children, favouring
use of both languages as media of instruction, the integration of language
arts work with other curriculum areas (notably Environmental Studies),
and the adoption of child-centred methods. While the project aimed to develop
children's bilingual competence to a high level, precise language objectives had
not been formulated.
Following the commitments of the original project, the evaluation study
committed the main part of its resources to a study of classroom processes, employing
two different systematic observation instruments, unstructured observation,
and teacher interviews for the purpose. A second major element in the
evaluation was the assessment of children's writing and speaking skills in both
languages, for selected year groups (Primary 4 and Primary 7). A third, minor
element involved parent interviews, to discover their attitudes towards the pro-
An evaluation study of the kind outlined above could provide its clearest
and most confident answers to the first of these questions. Primarily through the
different structured observational procedures, it proved possible to produce a
rich picture of current classroom activity, indicating a considerable degree of
curriculum integration, and use of both languages as media of instruction across
the curriculum and for a full range of instructional purposes. (Observed differences
between "high" and "low uptake" schools consisted mainly in the much
greater willingness of teachers in the former category to codeswitch, within instructional
episodes. The main overall constraint identified on the use of Gaelic
as a medium of instruction was the teachers' perceptions of differential Gaelic
fluency levels among their pupils. Where individuals were perceived as non-fluent,
teachers addressed them very infrequently in Gaelic; where such individuals
formed a significant proportion of a class, English predominated in
whole-class instruction also, thus also affecting the overall language experience
of even fully-fluent Gaelic speakers in such classes. As indicated later, these
findings strongly influenced the policy recommendations of the evaluators.)
Child-centring and experiential learning, however, were being implemented to a
much more limited degree, comparable with that found in other studies of contemporary
British primary schooling (an explicit point of comparison was Galton
et al. 1980).
The second question was much less capable of being answered satisfactorily,
given the constraints under which the evaluation study operated (most notably,
the absence of data regarding the state of affairs existing in the schools when
BEP was established, and on relations between the BEP development team and
the schools, during the years prior to the evaluation). However, the question
could be tackled indirectly, partly through indirect teacher reports in interview,
and partly through examination of the classroom process data. The BEP had
concentrated its attention on two particular areas of the curriculum: Gaelic Language
Arts, and Environmental Studies. Comparison of Gaelic Language Arts
work with English Language Arts, never a particular focus of BEP attention,
showed striking differences in teaching methodology. The distinctive aspects of
GLA work as compared with ELA were consistently in line with BEP recommendations;
thus for example, oral work and discussion, much favoured by BEP,
were common for GLA but very unusual for ELA. Such evidence suggested that
BEP had indeed been the decisive "delivery mechanism" for a range of "progressive"
methodological ideas, even though the latter may have been being promoted
nationally through other mechanisms also.
The third question was the hardest to respond to in any meaningful way,
given the non-experimental design of the evaluation, and our conclusions were
perforce very tentative. The language assessment procedures adopted involved
eliciting extended samples of speech and writing of different types, in both languages,
from all P4 and P7 pupils in the 10-school sample. Thus the evaluation
contributed to the domain of public discussion a substantial description of children's
bilingual proficiency of a kind not previously available, which allowed for
direct comparisons between levels of achievement in English and Gaelic, at two
different age levels, and also allowed for at least informal comparisons with national
levels of achievement in English (as some assessment tasks were derived
directly from those used in a large scale English study: Gorman et al. 1982).
This aspect of the study provided reassurance that levels of achievement in
English were generally satisfactory, and that a majority of children were also
able to communicate effectively in Gaelic, though few pupils were attaining to
equal levels in both languages. But the thorny question remained, regarding the
extent to which the language skills documented could be attributed to the BEP
itself.
Even partially to answer this question, it was judged necessary to treat oral
and literacy skills separately. Children's performance on the Gaelic oral assessment
tasks correlated very strongly with teachers' independent global ratings of
their Gaelic fluency. All children performed effectively in spoken English, including
those from classrooms where Gaelic was the dominant language of instruction;
however, the older children's English performance surpassed that of
the younger on measures to do with control of longer discourse and sensitivity to
the listener. The older children also outperformed the younger much more substantially
in Gaelic, on all measures. While (being older) they had benefited
from longer experience of bilingual schooling, and while school influences were
detectable in some respects (e.g. growth in technical vocabulary), the evaluators
were not convinced that this was the prime reason for their greater oral Gaelic
ability, given the teachers' consistent reports of community language shift and of
decline in Gaelic proficiency among children entering school. It was concluded
that Gaelic use in school was compensating in part only for community decline
in use of the spoken language.
For literacy skills however, correlations between Gaelic test performance
and teacher fluency ratings were much weaker. That is, many children judged by
their teachers to be non- or partially-fluent were able to produce extended if inaccurate
Gaelic writing; this was true at both age levels, though again the older
pupils generally outperformed the younger. Here, the evaluators could attribute
Gaelic achievement to school experience with much greater confidence.
The absence of attitude measures curtailed the answers which could be provided
to the fourth question somewhat, but it was nonetheless possible to draw
conclusions from the classroom process data of significance not only for the
monoglot English-speaking children but for the future of the bilingual programme
as a whole. So far from being neglected or alienated from classroom
life, these pupils seemed to be extremely influential, in terms of the language experience
available for all pupils. Teachers consistently addressed them individually
in English (thus in effect confirming rather than destabilising their
"monoglot" status), and where they were at all numerous, the use of Gaelic in
whole class instruction was significantly restricted. The evaluators viewed this
pattern as ultimately threatening to the viability of the overall programme, and
recommended that a clear policy decision be taken regarding Gaelic L2 instruction
for this group.
The above discussion illustrates the kinds of answers, more or less definitive,
which can be provided for key questions motivating evaluation studies
through a non-experimental but many-faceted design. Lest evaluators develop
too inflated a view of their likely influence, however, this evaluation study also
illustrates the problematic relationship of research-based evaluation to short-term
decision-making. Cynics committed to bilingual education viewed the commissioning
of the evaluation as a "stall" on the part of the Scottish political
centre, to avoid fostering the cultural distinctiveness of its highland periphery.
Yet before the evaluation project had reported, the SED purse-strings were
loosed, and substantial sums of earmarked money made available for the promotion
of Gaelic educational programmes; again, cynics might attribute this to
the general wish of a Tory government to earn itself goodwill in Labour Scotland,
rather than to a considered educational decision. The job satisfaction of
evaluators will depend then, not so much on seeing the "right", rational decisions
taken in the specific context they have studied, but on feeling they have
operated with sufficient rigour and theoretical grounding to advance general understanding
of the overall workings of L2/FL programmes. That is, there must
be a sense in which broad programme evaluations too can claim the status of
"basic" research on the context and dynamics of L2/FL teaching and learning.
References
Atkin, J.M. and E.R. House. 1981. "The federal role in curriculum development, 1950-80." Educa-
tional Evaluation and Policy Analysis 3/5.5-36.
Atkinson, P. and S. Delamont. 1985. "Bread and dreams or bread and circuses? A critique of
'case study' research in education." Controversies in Classroom Research ed. by M. Hammer-
sley, 238-255. Milton Keynes: Open University Press.
Baker, K. A. 1981. Effectiveness of Bilingual Education: A Review of the Literature. Washington,
DC: Department of Education. ED 215 010.
Barrington, G.V. 1982. English as a Second Language. An Evaluation of Calgary Board of Educa-
tion ESL Services Grades 1-12. Summary Report. Calgary, Alberta: Calgary Board of Educa-
tion.
Barrington, G.V. 1986. "Evaluating English as a second language: a naturalistic model." TESL
Canada Journal 3/2.41-51.
Barrow, G.R. 1986. Foreign Language Proficiency in Action. Calumet, IN: Department of Foreign
Languages and Literatures, Purdue University. ED 283 361.
Beretta, A. 1986a. "A case for field-experimentation in program evaluation." Language Learning
36/3.295-309.
Beretta, A. 1986b. "Toward a methodology of ESL program evaluation." TESOL Quarterly
20/1.144-55.
Beretta, A. 1988. "The program evaluator: the ESL researcher without portfolio." Sultan Qaboos
University, Oman. Mimeo.
California State Department of Education. 1985. Handbook for Planning an Effective Foreign Lan-
guage Program. ED 269 993.
Canale, M. et al. 1987. Programme dans les Ecoles Elémentaires de Langue Française pour les
Elèves de Compétence Inégale en Français. Toronto: Ontario Department of Education. ED
281 377.
Cohen, A.D. et al. 1979. Evaluating Evaluation ( = Bilingual Education Series, 6.) Arlington, VA:
Center for Applied Linguistics.
Cronbach, L.J. 1982. Designing Evaluations of Educational and Social Programs. San Francisco:
Jossey-Bass.
Cronbach, L.J. et al. 1980. Toward Reform of Program Evaluation: Aims, Methods and Institutional
Arrangements. San Francisco: Jossey-Bass.
Freed, B.F. 1987. "Preliminary impressions of the effects of a proficiency-based language require-
ment." Foreign Language Annals 20/2.139-46.
Freedman, E.S. 1976. "Experimentation into foreign language teaching methodology." System
4.12-28.
Galton, M., B. Simon, P. Croll, A. Jasmen and J. Willcocks. 1980. Inside the Primary Classroom.
London: Routledge and Kegan Paul.
Gorman, T.P. et al. 1984. Language Performance in Schools: 1982 Primary Survey Report. London:
Department of Education and Science.
Guba, E.G. and Y.S. Lincoln. 1985. Naturalistic Inquiry. Beverly Hills, CA: Sage Publications.
Hagel Jacobson, P.L. 1982. "Using evaluation to improve foreign language education." Modern
Language Journal 66.284-91.
Hebert, Y. et al. 1984. Native Indian Language Education in the Victoria-Saanich Region: An
Evaluation Report. Mimeo. ED 250 341.
Holland, V.M. et al. 1984. English-as-a-Second-Language Programs in Basic Skills Education Program
I. Washington, DC: American Institutes for Research. ED 254 097.
Hornby, P.A. 1980. "Achieving second language fluency through immersion education." Foreign
Language Annals 13/2.107-13.
House, E.R. 1980. Evaluating with Validity. Beverly Hills, CA: Sage Publications.
Indiana State Department of Public Instruction. 1981. Designing Strengthening and Assessing
School FL Programs. ED 222 040.
Lapkin, S., M. Swain, J. Kamin and G. Hanna. 1983. "Late immersion in perspective: the Peel
study." Canadian Modern Language Review 39/2.182-206.
Lee, K.B. 1982. Evaluation of Foreign Language Program in Urban Community. Mimeo. ED 226
588.
Long, M.H. 1980. "Inside the 'black box': methodological issues in classroom research on lan-
guage learning." Language Learning 30/1.1-42.
Long, M.H. 1984. "Process and product in ESL program evaluation." TESOL Quarterly 18/3.409-
25.
MacDonald, B. and R. Walker, eds. 1974. SAFARI I: Innovation, Evaluation, Research and the
Problem of Control. CARE, University of East Anglia.
MacDonald, B. et al. 1982. Bread and Dreams. ( = CARE Occasional Publications, 12.) CARE,
University of East Anglia.
McGinley, K. 1986. "Coming to terms with evaluation." System 14/3.335-41.
Mackay, R. 1981. "Accountability in ESP Programs." ESP Journal 1/2.107-22.
Mitchell, R. et al. 1987. Report of an Independent Evaluation of the Western Isles' Bilingual Educa-
tion Project. Department of Education, University of Stirling.
Mitchell, R. forthcoming. "Evaluating bilingual primary education." Evaluating Language Educa-
tion Programs ed. by A. Beretta and J.C. Alderson. Cambridge: Cambridge University Press.
Morrison, F. 1984. Speaking French in Five-year-old Kindergarten. Ottawa: Ottawa Board of Edu-
cation. ED 259 591.
Ohio State Department of Education. 1981. A Self-Appraisal Checklist for Fis in Ohio's Secondary
Schools. ED 206 180.
Oklahoma State Department of Education. 1981. Curriculum Review Handbook: Foreign Lan-
guage. ED 205 051.
Otto, F. 1969. "The teacher in the Pennsylvania Project." Modern Language Journal 53/6.411-420.
Parkinson, B. et al. 1982. An Independent Evaluation of 'Tour de France' ( = Stirling Educational
Monographs, 11.) Department of Education, University of Stirling.
Prabhu, N.S. 1987. Second Language Pedagogy. Oxford: Oxford University Press.
Richards, J.C. 1984. "The secret life of methods." TESOL Quarterly 18/1.7-23.
Scherer, A.C. and M. Wertheimer. 1964. A Psycholinguistic Experiment in Foreign Language
Teaching. New York: McGraw-Hill.
Simons, H. 1987. Getting to Know Schools in a Democracy. Lewes, E. Sussex / Philadelphia:
Falmer.
Smith, P.D. Jr. 1970. A Comparison of the Cognitive and Audiolingual Approaches to Foreign Lan-
guage Instruction: the Pennsylvania Foreign Language Project. Philadelphia: Center for Cur-
riculum Development.
Stake, R.E. 1967a. "Toward a technology for the evaluation of educational programs." Perspectives
on Curriculum Evaluation ( = AERA Monograph Series on Curriculum Evaluation, 1) ed. by
R.W. Tyler, R.M. Gagne and M. Scriven, 1-12. Chicago: Rand McNally.
Stake, R.E. 1967b. "The countenance of educational evaluation." Teachers College Record
68/7.523-40.
Stufflebeam, D.L., R.L. Hammond, H.O. Merriman, M.M. Provus, W.J. Foley, W.J. Gephart and
E.G. Guba. 1971. Educational Evaluation and Decision Making. Itasca, IL: Peacock.
Swain, M. 1984. "A review of immersion education in Canada: research and evaluation studies."
Language Issues and Educational Policies ( = ELT Documents, 119) ed. by P. Allen and M.
Swain, 35-51. Oxford: Pergamon/British Council.
Swain, M. and S. Lapkin. 1982. Evaluating Bilingual Education. Clevedon, Avon: Multilingual
Matters.
Ullmann, R. and E. Geva. 1985. "Expanding our evaluation perspective: what can classroom ob-
servation tell us about core French programs?" Canadian Modern Language Review
42/2.307-23.
Ullmann, R. et al. 1983. The York Region Core French Evaluation Project. Toronto: Ontario In-
stitute for Studies in Education.
Valette, R.M. 1969. "The Pennsylvania Project, its conclusions and its implications." Modern Lan-
guage Journal 53/6.396-404.
Wagner, J.J. and G. Tilney. 1983. "The effect of 'superlearning techniques' on the vocabulary ac-
quisition and alpha brainwave production of language learners." TESOL Quarterly 17/1.5-19.
Waters, A. 1987. "Participatory course evaluation in ESP." English for Specific Purposes 6/1.3-12.
White, R.V. 1988. The ELT Curriculum. Oxford: Basil Blackwell.
The Characterization of Teaching and Learning
Environments:
Problems and Perspectives
Dick Allwright
two most common "traditional" distinctions: firstly that between "informal" and
"formal" contexts, and then that between "second language" and "foreign language"
contexts.
When we have, to take up the subtitle of this paper, reviewed the "problems"
inherent in basing research on these gross but familiar and pervasive distinctions,
we will then move on to consider the "perspectives" offered by an
alternative view of learning environments: one that focusses on the nature of the
"learning opportunities" that arise in different contexts. This view will be illustrated
from recent doctoral research at Lancaster which itself reinforces the suggestion
made above that our understanding will be very limited if we do not find
ways of investigating learners' own, probably highly idiosyncratic, processes of
characterizing the learning environments they find themselves in.
First, however, we need to present very briefly the "traditional" distinctions
in our field, consider the purposes that characterizations might be intended to
serve, and, in the light of the obvious interest in using characterizations to investigate
the possible "causes" of learning outcomes, review the different types of
learning outcome we need to bear in mind.
ond language learning, but that in itself does not delimit the field adequately,
given the great variety of possible research interests in this area.
Three broad research areas can be discerned. Firstly there is the interest in
theory-building. Secondly there is the interest in developing what might be
called (perhaps unkindly) "piecemeal understanding" or (more positively) "insight",
rather than any formal theory. And thirdly there is the quite different interest
involved in providing decision-makers with the descriptive information
they may need in a given pedagogic situation. These three are conceptually distinct,
but may well come together in practice.
For present purposes I will assume agreement that theory construction
(whatever we take the word "theory" to mean) is at the heart of the research enterprise,
and that research is essentially about developing our understanding of
whatever phenomena interest us. I also assume agreement that "understanding",
for our purposes, is a matter of becoming less uncertain of the factors that we
can reasonably hold to determine outcomes. (This is clearly a viewpoint that
perpetuates the concern of western science for causes, and as such it is certainly
challengeable, but it probably represents the view of the majority of researchers
in our area.) The phenomena that interest us, presumably, are the processes
whereby a speaker of at least one language becomes a speaker of another language,
or at least moves towards that state (we might also be interested in the
processes whereby a learner attempts, albeit unsuccessfully, to move in that direction,
and in the processes whereby a teacher might try to motivate reluctant
learners, but these are more likely to be considered peripheral concerns). More
particularly, if we are also educators, we will want to know the extent to which
the result of those processes depends on contextual (and therefore potentially
manipulable) factors. And of course this entails being able to discriminate between
those contextual factors (the characteristics of environments) that make a
difference and those that are purely incidental. This can bring us neatly back to
the two commonsense dichotomies we started with. Do they capture factors that
make a difference?
their 1977 study of Mexican workers in the USA, found them becoming less
rather than more positive about the new country, as their linguistic proficiency
developed.
Having reviewed five types of possible difference in learning outcomes we
can now, finally, return to the two dichotomies outlined in the introduction to
this paper and begin to discuss their problematic aspects.
If, on the other hand, we define "teaching" for research purposes at least, as
a matter of "providing learning opportunities" (see Allwright 1986), then we can
immediately see, I suggest, that people paid professionally to be "teachers" are
not the only possible people to do "teaching". Many people, regardless of their
professional designation, may be in a position to provide learning opportunities,
whether deliberately or purely incidentally as a by-product of some other activity.
Such people would of course include other learners in a classroom situation.
This possibility would make it important that any research aimed at measuring
the impact of "teaching" should take into account the extent to which the
"teaching", as now defined, is not purely and simply in the hands of the ostensible
teacher.
With only a small extension of the above thinking we can also see that learners
may also be teachers for themselves, individually, in the sense and to the
extent that they create learning opportunities for themselves (and thereby for
each other, of course, in a class situation). This too would need to be taken into
account in any research on classroom language learning.
something that must emerge from research, rather than something that can be
imposed upon research as a framework of independent value. We need the research
precisely for the purpose of telling us how usefully to characterize teaching
and learning environments.
With the foregoing admonition in mind we should perhaps turn now to our
second commonsense dichotomy — that between "second language" and "foreign
language" contexts.
2 Learning opportunities
ence to language in the above characterizations is deliberate, given that the analysis
is intended to apply regardless of subject matter. Secondly, and more importantly
for our purposes here, it is probably not helpful to think in terms of
different types of opportunity. It may be more helpful to think of "encounter"
and "practice" as two ways of looking at any one opportunity. It may well be the
case that they more often occur in combination with each other rather than isolated
from one another. This perspective also allows us to include affect as a further
aspect of learning opportunities (to comment on the way in which
opportunities might be conducive either to enhanced receptiveness or to enhanced
defensiveness, for example).
2.1 "Encounter"
2.2 "Practice"
"Practice" is now a difficult term to use, because of its associations with behaviourist
approaches to language instruction, but we need some such term to
refer to the mental operations a learner may perform on encountering target
material, and in doing whatever it takes to learn it. Hearing a teacher explain, in
the target language, a particular linguistic concept, offers opportunities to encounter
the explanation and also to encounter the language in which it is expressed.
Beyond that, of course, it also offers opportunities to practice the mental
operations involved in listening comprehension, whatever we take them to
be, and that in turn may itself constitute an act of learning, if we can accept the
view that comprehending is virtually synonymous with acquiring (see Krashen
1985: 4).
All these suggestive findings (and they are no more at present) lead me to
propose that research should pay attention to both the "encounter" and the
"practice" aspects of learning opportunities. Beyond that, they also suggest that
research should pay particular attention to learners' proficiency levels relative to
those of others in the same learning group. Teaching and learning environments,
we might say at this point, differ interestingly in terms of the characteristics of
the learning opportunities they provide, and in terms of the proficiency relationships
they offer learners. We know very little indeed about the impact of proficiency
relationships on any of our five learning outcomes (although Safya
Cherchalli's 1988 Lancaster doctoral thesis throws new light on the issue by
means of a diary and interview study as follow-up to a questionnaire survey). We
know more about what are probably the relevant characteristics of learning opportunities,
even though we have generally discussed the issues in different
terms. For example, we can be reasonably sure that research will need to pay attention
to the source of learning opportunities. The notion of source, however, is
itself a complex one in relation to learning opportunities, because wherever
these are social, interactive events (as they typically are in classroom language
lessons), we are likely to find it very difficult to talk about a single source. For
example, a learner may initiate an enquiry which the teacher responds to. From
one point of view the learner is the source (the originator), but from another
point of view the teacher is clearly the source of the relevant learning material,
no matter who or what prompted its inclusion in the discourse. To complicate
matters even further, we have some reason to believe that the question of who is
the addressee of learning opportunities may also be a relevant factor. For
example, as we saw indicated in Slimani's work (op.cit.), learners seem to find it
less difficult to take something from learning opportunities which are not addressed
to themselves in particular but are addressed either to someone else in
the class, or to the whole class. They are nevertheless likely to say, in a questionnaire
response, that they actually prefer to be the direct addressee (see Lahcen,
in progress). It seems most likely that this phenomenon is closely related to the
issue of relative proficiency, to which we will return below.
5 Conclusions
This paper has been an attempt to throw light on some of the issues involved
in characterising learning and teaching environments. The major points
to have emerged, I believe, are the following.
Firstly, that useful characterizations are necessarily to be seen primarily as
the product of research, rather than as a priori inputs to it.
Secondly, that we cannot yet say with any confidence what the criterial attributes
of learning and teaching environments are, and therefore cannot yet
characterize them in a way known to be systematically related to learning outcomes,
since our research has not advanced that far. In this connection I have illustrated
the point by developing an alternative definition of "teaching", for
research purposes, and explored some of its potential implications via a "learning
opportunities" approach to the analysis of classroom language learning. In
this way I may perhaps appear to be trying to provide the world with yet another
characterization scheme, perhaps eventually to be seen as a rival to Fanselow's
FOCUS (1977), to Ullmann and Geva's TALOS (1983), or to Allen et al.'s
COLT (1984). In self-defence I can only argue that I am well aware of the dangers
of such an enterprise (these are well set out in Chaudron 1988: 21-22), and
am offering my own approach rather as a complement to those current SLA
studies, in the hope that such a multiplicity of viewpoints will serve not as a set
of increasingly constricting straightjackets but as an encouragement to the
broadening of research attempts to develop our understanding of the complexities
of classroom language learning.
One further point remains to be made. The foregoing analysis in terms of
learning opportunities has included the introduction of the potentially highly
productive observation that the characterization of learning environments is not
something done only by researchers. This characterization process is rather part
of the normal business of being a learner. As such it may also be crucial to the
process whereby learners get whatever they do get from being in language lessons.
The natural corollary is that what we need to study, in our research, is the
characterization process itself, among our learners.
We conclude, then, with the proposition that the characterization of
learning and teaching environments is far from being merely a preliminary to
research. It is both an important outcome of research and an important object of
research in its own right, as a process vital to our learners' classroom lives.
References
Allen, J.P.B., M. Fröhlich and N. Spada. 1984. "The Communicative Orientation of Language
Teaching: An Observation Scheme." Handscombe, Orem and Taylor 1984.231-252.
Allwright, D. 1987. "Classroom Observation: Problems and Possibilities." Das 1987.88-102.
Allwright, D. 1988. Observation in the Language Classroom. London: Longman.
Allwright, R.L. 1984a. "The Importance of Interaction in Classroom Language Learning." Applied
Linguistics 5/2.156-171.
Allwright, R.L. 1984b. "Why Don't Learners Learn what Teachers Teach?-The Interaction
Hypothesis." Singleton and Little 1984.3-18.
Allwright, R.L. 1986. "Making Sense of Instruction: What's the Problem?" Papers in Applied Lin-
guistics — Michigan 1/2.1-11.
Aston, G. 1986. "Trouble-Shooting in Interaction with Learners: The More the Merrier?" Applied
Linguistics 7/2.128-143.
Breen, M.P. 1985. "The Social Context for Language Learning-A Neglected Situation?" Studies
in Second Language Acquisition 7.135-158.
Breen, M.P. (forthcoming). Understanding the Language Teacher.
Canale, M. and M. Swain. 1980. "Theoretical Bases of Communicative Approaches to Language
Teaching and Testing." Applied Linguistics 1.1-47.
Chaudron, C. 1988. Second Language Classrooms: Research on Teaching and Learning.
Cambridge: Cambridge University Press.
Cherchalli, S. 1988. Learners' Reactions to their Textbook (with special Reference to the Relation
between Differential Perceptions and Differential Achievement): A Case Study of Algerian
Secondary School Learners. Lancaster: Doctoral Thesis.
Das, B.K., ed. 1987. Patterns of Classroom Interaction in Southeast Asia ( = Anthology Series, 17.)
Singapore: SEAMEO Regional Language Centre.
Doughty, C. and T. Pica. 1986. "Information Gap Tasks: Do they Facilitate Second Language
Acquisition?" TESOL Quarterly 20/2.305-325.
Fanselow, J.F. 1977. "Beyond Rashomon: Conceptualizing and Describing the Teaching Act."
TESOL Quarterly 11/1.17-40.
Handscombe, J., R. Orem and B. Taylor, eds. 1984. ON TESOL '83: The Question of Control.
Washington, DC: TESOL.
Higgs, T.V., ed. 1982. Curriculum, Competence, and the Foreign Language Teacher. Skokie, Illinois:
National Textbook Association.
Higgs, T.V. and R. Clifford. 1982. "The Push Toward Communication." Higgs 1982.57-79.
Krashen, S.D. 1985. The Input Hypothesis: Issues and Implications. London/New York: Longman.
Lahcen, D.B. (In progress.) Attention in Classroom Language Learning. Doctoral Research at the
University of Lancaster.
Long, M.H. 1983. "Does Second Language Instruction Make a Difference? A Review of
Research." TESOL Quarterly 17/3.359-382.
Meara, P., ed. 1989. Beyond Words ( = British Studies in Applied Linguistics, 4.) London: British
Association for Applied Linguistics.
Oller, J.W. Jr., L.L. Baca and A. Vigil. 1977. "Attitudes and Attained Proficiency in ESL: A
Sociolinguistic Study of Mexican Americans in the Southwest." TESOL Quarterly 11/2.173-183.
Selinker, L. 1972. "Interlanguage." International Review of Applied Linguistics in Language
Teaching 10/3.209-231.
Singleton, D.M. and D.G. Little, eds. 1984. Language Learning in Formal and Informal Contexts.
Dublin: Irish Association for Applied Linguistics (IRAAL).
Slimani, A. 1987. The Teaching/Learning Relationship: Learning Opportunities and Learning
Outcomes. An Algerian Case Study. Lancaster: Doctoral Thesis.
Slimani, A. 1989a. "Learning Words from Classroom Discourse." Meara 1989.79-87.
Slimani, A. 1989b. "The Role of Topicalization in Classroom Language Learning." System
17/2.223-234.
Ullmann, R. and E. Geva. 1983. Classroom Observation in the L2 Setting: A Dimension of Program
Evaluation. Ontario: Modern Language Centre, Ontario Institute for Studies in Education.
Wenden, A. and J. Rubin. 1987. Learner Strategies in Language Learning. Englewood Cliffs, NJ:
Prentice/Hall International.
Section IV — Learning Environments
Introduction to the Section Learning Environments
Claire Kramsch
The post-structuralist revolution in the language sciences has given ever
more importance to the notion of context and variability in language acquisition
and use. Foreign language research echoes in this respect the general trend in
language pedagogy both in Europe and in the United States. By shifting its
attention from the structures of language to language learning processes and,
hence, to the person of the learner, research follows the same trend as language
pedagogy, broadening its base from language forms to language use, from the
individual learner to his/her interaction with the environment. The notion of
"environment," a term that originated in ecology and has now returned to education
after a loop via the computer sciences, is broader than that of context or
situation. It evokes global worlds of interconnected networks, "coral gardens" with
their delicate balance of cultures.
Learning environments are defined as either topographically different
settings (e.g. instructional or natural environments, computer microworlds) or
different discourse genres in each of these settings (e.g. dialogue, monologue, oral
or written narrative), or different discourse forms within each genre (e.g.
instructional, communicative, procedural, phatic) or different linguistic contexts of
occurrence. They can refer to persons (teachers, peer tutors), materials
(textbooks, speech, knowledge in various forms) or circumstances (fortuitous or
deliberate). They are always interactional, in that they elicit or facilitate learning
through interaction with the learner.
The question asked by researchers in this last section is: What kind of
learning environments facilitate the acquisition of foreign languages? The first two
chapters give a response to this question from the two opposite ends of the
spectrum: the (natural) mind of the individual learner and the (electronic) mind of
the computer. Recent advances in linguistic theory allow us to speculate about
adult learners' developmental stages in the acquisition of syntactic structures.
Suzanne Flynn (chapter 12) gives a summary of recent thought in Universal
Grammar theory that accounts for the ability of adults to learn a second
language. If the principles of Universal Grammar are still available to them, all they
have to do is reset the parametric switches of UG principles to fit the L2
parameters of the new linguistic environment. At the other end of the cognitive
spectrum, we have the availability of this super-learner/teacher: the computer.
General theories of learning have shed some light on the psychological
aspects of the acquisition of language: cognitive processes, interactional events,
relationship of language to thought. Thus some progress has been made in our
understanding of learners' cognitive interaction with their environment. The
paradigm shift mentioned earlier from linear, product-oriented, structural forms of
knowledge, to relational, process-centered, procedural ones, raises some
interesting epistemological issues, as to how knowledge is represented, transmitted
and internalized. Ralph Ginsberg describes in chapter 13 some of the advances
made in the design of intelligent tutoring systems. These electronic learning
environments challenge the basic tenets of traditional language pedagogy.
Between these two extremes, we have nurseries, streets and classrooms. In
chapter 14, Edmondson explores the cognitive and social dimensions of a
learner's interaction with teacher and peers in classroom settings and he cautions
against a simplistic, reductionist view of classroom interaction as a mere
sequence of turns-at-talk, even if these surface phenomena are the easiest to
research.
In the same sense that the term "teaching environment" included, but was
not limited to, the person of a teacher, the notion of learning environment
implies that, although learning cannot take place without mediation, this mediation
can be either direct or indirect. Moreover, since the learner is part of the
environment, we have to think of the relationship between the two as a process of
mutual creation: a learning environment is by definition a context that is not
only conducive to change, but susceptible to change as well through its
interaction with the learner. Thus the notion of environment has to be seen as a
flexible, variable concept, in which learner and the conditions of his/her learning
define each other mutually in a cybernetic sense. This is most true of the
learning of culture through language, as Kramsch shows in chapter 15. The
development of cross-cultural competence requires an ecological understanding of
social and cultural environments that can only emerge from the contrastive
perspectives of both the source and the target cultures. The best learning
environment seems to be the one which allows itself to be deconstructed for precisely
what it is: an environment that allows the learner to eventually dispense with it and
become more than the sum of its parts.
Some Ins and Outs of Foreign Language Classroom
Research
Willis J. Edmondson
place in classrooms essentially determines what is learnt there (e.g. Hatch 1978:
403; Allwright 1984). The argument in favour of focussing on the classroom as
interaction is in fact surely self-evident: as Allwright remarks "Interaction is the
process whereby lessons are 'accomplished'..." (Allwright 1984: 159). In other
words, teaching is interaction, and classroom learning occurs in and through
interaction. A focus on interaction in the classroom has developed logically in
Second Language Acquisition (SLA) studies focussing on the validity of
Krashen's Input Hypothesis (Krashen 1982). Three reasons justify this
development of focus. Firstly, learner outputs also act as input to that learner's own
processing mechanisms (a point made by Sharwood-Smith 1981). Secondly, learner
outputs also act as input for other learners in the same environment, and thirdly
what learners say may clearly determine what will happen next: in the simplest
case, what form further input from a teaching source will take. So, following
such arguments, one can only adequately analyse "input" as a product of
classroom interaction.
The aim of the paper is to expand on this claim, and consider selected
theoretical concepts and research procedures that may be useful in moving closer to
this goal. While I shall be centrally concerned with foreign as opposed to second
language learning (a blurred distinction, but still a useful one), I shall also be
concerned with some central features of SLA research, assuming that results
established under this rubric should be relevant to foreign language teaching and
learning. Even if this assumption turns out to be optimistic, we may still assume
that an ultimate goal must be a theory of classroom learning that applies to both
types of learning context. The "ins" and "outs" of my title concern then both
learner inputs and learner outputs in classroom interaction, and the problem of
relating learner-internal variables to learner-external factors in seeking to
understand classroom learning. The paper is in four sections. Part 1 briefly
contrasts a concern with internal factors. The second part of the paper raises the
vexed "nature versus nurture" issue with regard to internal learner
characteristics. The third and major part of the paper focusses on classroom
interaction. A brief summary is offered in conclusion.
models, include amongst the former at the very least the following: socio-cultural
setting, educational system, and what observable events actually occur in the
classroom ("input"). One way of characterising different learner-internal factors
is to distinguish between cognitive factors, and affective/personality variables.
The study of cognitive variables has in the past focussed for example on the
construct "aptitude", while the terms "attitude" and "motivation" reflect a concern
with affective factors deemed to be personality-based.
Further, in this brief attempt to offer a terminological framework for what
follows, I shall use the term "cognitive style" to suggest a specific constellation
of intellectual/cognitive factors, while "learning style" will be used more
broadly, on the assumption that both cognitive and context-based or affectively-based
internal variables contribute to different learning styles. This terminological
convention may be justified by some research findings. Thus Naiman et al.
(1978) found "field independence" (a construct concerning cognitive
preferences/skills) to correlate highly with their proficiency measures for learning
French, while Hansen and Stansfield (1981) did not find this construct to
correlate with their measures for communicative competence, suggesting in fact that
"a strong interest in other people and attentiveness to social cues in the
communication task (which are associated with field dependence) perhaps leads to
effective communicative skill" (Hansen and Stansfield 1981: 363). As learning in
classrooms is, centrally, a social activity, and as "interest" and "attentiveness"
are in the classrooms undoubtedly affected by attitudes and motivation, it seems
reasonable to assume that learning style is not exclusively determined by
cognitive skills.
A largely exclusive concern with either external or internal learning
variables is doubtless a function of one's own research interests or background.
Further, either focus can be justified on commonsense grounds. For while it seems
obvious that what observably happens in classrooms determines what is learnt
there, it is equally obvious that ultimately learning takes place, if at all, inside
the heads of those who learn. Again, either focus may well also be strategic: on
the one hand, it is, other things being equal, less difficult to work with
observables than with non-observables; on the other hand, it might well be argued that
it is advisable to diagnose differences before recommending treatments.
If we for example go back to roughly the sixties, we discover two research
paradigms reflecting an overriding concern with either external or internal
learning influences, with each largely ignoring the other. On the one hand, much
research around this period consisted of large and small-scale undertakings
inside the method-comparison paradigm, where external factors were controlled
and manipulated; on the other hand, this was also the period when important
work concerning psychological concepts such as aptitude, attitude and
motivation was carried out. The main thrust of this latter research was aimed at
measuring such internal factors and their relevance to foreign language learning
success.
The former research paradigm was/is implicitly behaviouristic in its
psychological presuppositions, and did not strive to differentiate between learners as
individuals, while the latter is clearly cognitive in psychological set, and sought
(amongst other things) precisely such differentiations. The former was
classroom-based and of immediate potential didactic relevance (when research
revealed which method was superior, it was clearly to be recommended, and
followed), while the latter was to all intents and purposes neither: the practical
relevance was more in terms of screening applicants for special language
training options in institutions where such training is based on selection, and in terms
of general educational sensitivity in for example teacher-training.
Research inside the methods comparison paradigm failed to establish that
teaching method was the sole or indeed major determinant of learning success.
Apart from problems of research design, particularly concerning the
operationalisation of methodological labels such as "audiolingual", the central reason for
this lack of clear results must be the SINGLE FACTOR MYTH. In other words,
the undertaking was simplistic in seeking to isolate one part of the classroom
complex (labelling it "teaching method"), and relate learning outcomes to this
single factor (cf. e.g. Stern 1983, chapter 21; Edmondson 1984).
The latter paradigm again focusses on only part of the complex we are
concerned with, and above all, fails to offer an answer to the practical questions of
language instruction. There are of course continuing doubts as to what, if
anything, the construct "language aptitude" might be, whether it is distinguishable
from general intelligence, and whether it measures competence, as opposed to
grammatical knowledge (e.g. Oller 1981; Krashen 1981).
We have of course moved on since this period. Of major importance have
been the impact of sociopragmatic/communicative perspectives, deriving from
sociolinguistics and the philosophy of language, the general "focus on the
learner" (which was in part a reaction against the methods comparison tradition),
and the cognitive information-processing framework for understanding learning,
deriving from cognitive science and artificial intelligence. We are however far
from having anything approaching a generally accepted research paradigm,
despite the establishment of "second language acquisition" as a term and as a focus
of intensive classroom-based research. One reflection of this, as I understand it,
healthy state of diversity is some uncertainty as to how internal and external
factors can be meshed together in a theory, and indeed how we can engage in
empirical research which does not focus on the one at the expense of the other.
There is still the danger that we repeat the mistakes of the past, using however
"richer" concepts (or simply different ones) developed since then.
of four previously unknown languages for one year in small classes. The bases
for these groupings were two: high versus low scorers on the test battery, and
"logicians" versus "verbalists". The latter opposition may be related to -/+
risk-taking, +/-structure-dependence, serialists vs. globalists, or simply, in Krashen's
sense, natural "learners" versus natural "acquirers". On the argument that
relatively marked cognitive preferences on this latter dimension would tend to lead
to accuracy without fluency (to use Brumfit's distinction; e.g. Brumfit 1984), or
the opposite, the didactic treatment given in the courses was compensatory.
Simplistically, the "learners" were given no rules until they were confident in
speaking, while the "acquirers" were not allowed to speak until they had learnt the
rules. Terminal testing showed no significant differences between the
achievements of the "learners" as opposed to the "acquirers", but a remarkably
significant difference between the highly scoring groups, and the groups with much
lower total scores on the original intelligence structure tests.
The compensatory didactic treatment is premised on the assumption that
learning styles can be changed, that a distinction is possible between the
inter-subject strength of relevant cognitive abilities (high-scorers versus low scorers),
and the relative intra-subject strength of different cognitive skills (the
"serialists" versus the "globalists"). The former learner characteristics are, on the
evidence of this research, not affected by didactic treatment, while the latter
apparently are.
The didactic consequence drawn from pre-testing is then stimulating: it
suggests, in fact, that we should maybe be providing learners with what they don't want.
I think this research raises as many issues as it clarifies, possibly because the
accounts cited are meagre, and a fuller documentation is unfortunately not known
to me. Instead of concluding, as Stasiak does, that this research shows that taking
account of learners' individual cognitive profiles leads to more effective
learning, one might for example suggest that highly intelligent persons learn more
successfully under frustrating teaching conditions than do less highly
intellectually gifted persons. Even so, the didactic consequences drawn from detected
differences between cognitive styles reinforce the simple point I am seeking to
make, namely that we need to clarify our views on posited learner-internal
differences, regarding their universality and the degree to which they are subject to
change via classroom learning experience.
4 Classroom Interaction
The complex interaction inside the learner between internal and external
factors may be linked to classroom interaction by the observation that "The
interaction between external and internal factors is manifest in the actual verbal
interactions in which the learner and his interlocutor participate" (Ellis 1985:
129).
If this link is to be fruitfully exploited for research purposes, however, we
need, I want to suggest, a more process-oriented theory of discourse interaction,
and may fruitfully supplement discourse studies of classroom interaction by
studies which attempt to investigate more directly the cognitive mechanisms
underpinning discourse behaviours.
It seems to me that there are two interpretations of the notion of "discourse
interaction" which one can distinguish. One we might call a "weak"
interpretation of interaction, the other the "full" interpretation. I posit the distinction in
the belief that often in current classroom-based research only the "weak"
interpretation is taken into account. We may assume that interaction occurs between
two subcomponents of some complex when A affects or determines B, and B
affects or determines A (cf. Ellis's formulation of the "interactionist" position on
internal and external learning factors; Ellis 1985: 129).
A first, weak, interpretation of the notion of interaction is essentially linear,
whereby interaction occurs over a sequence of time intervals, such that for
example that which affects (interactant A) is active at time 1, the effects occur at
time 2, while the reciprocal relationship may require two further time units (in
discourse, times 2 and 3 are maybe collapsed, as turn-taking occurs). A second,
stronger, interpretation is bilateral, whereby interaction occurs inside one time
unit, inside which A and B are both determining and being determined.
Let me attempt to illustrate the distinction I am trying to make with
reference to classroom discourse. The "weak" notion of interaction is simply a
reflection of the conventions of turn-taking and sequential relevance in spoken
discourse. The stronger argument is based on the nature of discourse meaning,
and claims that the "meaning" of a discourse contribution produced at time 1
may be subsequently developed or established in the ensuing discourse: what
follows it may have determining retrospective force (Downes 1977; Leech 1980:
79-117; Edmondson 1982). In the context of the classroom, this means, I suggest,
that the very notion of an "input" which is distinct from its discourse
consequences is open to question: learner responses may determine or affect "input"
not only prospectively but retrospectively. I want to develop this argument a
little, and relate it to the notion of classroom "negotiation".
Without needing to develop a specific theory, we can accept that a discourse
unit has a specifically discoursal "meaning" which is more than its semantic
meaning, more than its sentential meaning. Such meanings do not exist
independently of human processing agents. So when X produces a unit of
discourse (an utterance, let us say), one obvious notion would be that the relevant
The suggestion is then that discourse processing and discourse analysis are
necessarily related issues, and that therefore a richer classroom interactional
analysis system will produce analyses which can be related to questions of
learner processing. What would result will be a mode of analysis which could not
easily be reliably applied on a grand scale, generating data for productive
quantitative analysis: indeed, initially case studies might be more useful. Further, the
question of subjectivity in analysis requires attention. The data obtained for
detailed interactional analysis will be usefully supplemented by research strategies
which attempt to tap learner procedures and perceptions more directly. Verbal
reports, i.e. introspective data of various kinds, collected during or consequent
to classroom activity, seem to be the most promising means of achieving this (cf.
e.g. Allwright 1984; Hawkins 1985). House (1986) suggests the interesting
possibility of incorporating retrospective interpretations of previous learning
activities into teaching programmes.
5 Summary
I have tried to make the following points regarding the "ins" and "outs" of
foreign language teaching/learning research:
Both internal and external factors co-determine learning success. Under the
former are to be included both universal cognitive abilities whereby humans are
uniquely equipped to acquire a language or languages, and cognitive and
affective/emotional traits which distinguish learners in terms of their cognitive and/or
learning "styles". Assuming that foreign language learning is part of education, it
seems sensible to assume that both cognitive and learning styles are subject to
influence, i.e. learnable, until such time as the opposite is firmly proven. In other
words, it is necessary both for learning theory and teaching practice to decide
which individual internal factors are subject to change, and which, if any, are not.
"Single cause" hypotheses, whereby the nature of classroom learning is
attributed to one (external or internal) factor, are a priori unlikely to be insightful.
Such hypotheses may, however, be useful, in stimulating research that
establishes their inadequacy and thereby contributes towards a more adequate theory.
As I understand it, something like this is happening to the Input hypothesis.
Studies of for example intake, comprehension and output reflect this
development (e.g. Swain 1985; White 1987; Brown 1986).
It is still relevant to take account of native-speaker behaviours, in
attempting to understand classroom procedures and learner processing. The point is
made above with regard to discourse analysis and the concept of 'negotiation',
but may hold for example for communicative and/or learning strategies, or the
Notes
1. I assume here that on any non-trivial interpretation of the term "interaction", reading and
writing activities are also to be viewed as interactive—cf. e.g. Widdowson (1979, chapter 13),
Edmondson (1981, chapter 7). The point is important in a foreign language
teaching/learning context, given the common view that foreign language classrooms have not only a
training function, in terms of inculcating specific language skills, but also an educational function
in the broadest sense (cf. Widdowson P. and U.). Hence, while oral skills are commonly a
major focus of foreign language teaching courses, work within texts, especially in
non-beginners' language courses, is of no less importance.
2. Cf. the notion of "world-switching" in classroom discourse (Edmondson 1981b, 1985) or the
teacher encouragement of learner error (Edmondson 1986).
References
Allwright, R.L. 1980. "Turns, Topics and Tasks: Patterns of Participation in Language Learning
and Teaching." Discourse Analysis in Second Language Research ed. by D. Larsen-Freeman,
165-187. Rowley, MA: Newbury House.
Allwright, R.L. 1984. "The Importance of Interaction in Classroom Language Learning." Applied
Linguistics 5.156-169.
Aston, G. 1986. "Trouble-shooting in Interaction with Learners: the More the Merrier?" Applied
Linguistics 7.128-143.
Bausch, K.R. and F.G. Königs, eds. 1986. Sprachlehrforschung in der Diskussion. Tübingen:
Gunter Narr Verlag.
Brown, G., ed. 1986. Comprehension ( = Applied Linguistics, 7/3.) Oxford: Oxford University
Press.
Brumfit, C.J. 1984. Communicative Methodology in Language Teaching. Cambridge: Cambridge
University Press.
Chaudron, C. and J. Richards. 1986. "The Effect of Discourse Markers on the Comprehension of
Lectures." Applied Linguistics 7.113-127.
Downes, W. 1977. "The Imperative and Pragmatics." Journal of Linguistics 13.77-97.
Dunkin, M.J. and B.J. Biddle. 1974. The Study of Teaching. New York: Holt, Rinehart and
Winston.
Edmondson, W.J. 1981a. Spoken Discourse. London: Longman.
Edmondson, W.J. 1981b. "Worlds within Worlds-Problems in the Description of Teacher-Learner
Interaction in the Foreign Language Classroom." Proceedings of the 5th AILA Congress ed.
by J.G. Savard and L. Laforge, 127-140. Quebec: Laval University Press.
Edmondson, W.J. 1982. "On the Determination of Meaning in Discourse." Linguistische Berichte
78.33-42.
Edmondson, W.J. 1983. "Diskurs im Fremdsprachenunterricht als Handlungsgeschehen."
Handlungsorientierte Fremdsprachenunterricht ed. by A. Raasch, 39-42. Tübingen: Gunter Narr
Verlag.
Edmondson, W.J. 1984. "Methods, Approaches, Principles and Practices." New Approaches in
Foreign Language Methodology ed. by W. Knibbeler and M. Bernards, 53-62. Brussels:
AIMAV.
Edmondson, W.J. 1985. "Discourse Worlds in the Classroom and in Foreign Language Learning."
Studies in Second Language Acquisition 7.159-168.
Edmondson, W.J. 1987. "'Acquisition' and 'Learning': the Discourse System Integration
Hypothesis." Perspectives on Language in Performance ( = Festschrift Werner Hüllen) ed. by W.
Lörscher and R. Schulze, 1070-1089. Tübingen: Gunter Narr Verlag.
Ellis, R. 1985. Understanding Second Language Acquisition. Oxford: Oxford University Press.
Flanders, N. 1970. Analysing Teaching Behavior. Reading, MA: Addison-Wesley.
Gass, S.M. and C.G. Madden, eds. 1985. Input in Second Language Acquisition. Rowley, MA:
Newbury House.
Hansen, J. and C. Stansfield. 1981. "The Relationship of Field Dependent-Independent Cognitive
Styles to Foreign Language Achievement." Language Learning 31.349-367.
Hatch, E. 1978. "Discourse Analysis and Second Language Acquisition." Second Language
Acquisition ed. by E. Hatch, 401-435. Rowley, MA: Newbury House.
Hawkins, B. 1985. "Is an 'Appropriate Response' always so Appropriate?" Gass and Madden
1985.162-178.
House, J. 1986. "Learning to Talk: Talking to Learn. An Investigation of Learner Performance in
Two Types of Discourse." Kasper 1986.43-57.
Kasper, G., ed. 1986. Learning Teaching and Communication in the Foreign Language Classroom.
Aarhus: University Press.
Krashen, S. 1981. Second Language Acquisition and Second Language Learning. Oxford:
Pergamon.
Krashen, S. 1982. Principles and Practice in Second Language Acquisition. Oxford: Pergamon.
Lafayette, R.C. and M. Buscaglia. 1985. "Students Learn Language via a Civilization Course-A
Comparison of Second Language Classroom Environments." Studies in Second Language
Acquisition 7.323-342.
Suzanne Flynn
English word and that 'sblap' could not be an English word given the rules of the
language. At the morphological level, they know that 'blapped' and 'blapping'
are acceptable transformations of the potential verb to 'blap' without having to
know the meaning of the verb. At the syntactic level, English speakers can
recognize the difference between the grammatical sentence (John is a teacher) and
the ungrammatical (*John a teacher is). At the semantic level, they recognize
anomalous sentences (!The chair thinks a hole in one) and can identify
paraphrases as expressing the same meaning (John wrote the angry rebuttal) and (The
rebuttal was written by John). At the pragmatic level, they can distinguish
between polite questions ("Would you please close the window?") and rude
questions ("Close the window, huh?") in particular contexts (e.g. requesting the
window closed from one's future employer). A language learner's competence
as a speaker or listener reveals an even more profound knowledge of the
properties of her language, a knowledge that is both complex and abstract.
Consider, for example, the complexity of the knowledge that must be
represented in the competence of an English speaker to account for the normal
performance in assigning coreference between a reflexive pronoun and a noun. The
indices indicate coreference assignments. An asterisk means that the
coreference assignment is not possible.
COREFERENCE
(1a) Mary_i saw herself_i.
(1b) Mary_i saw her_*i.
WANNA CONTRACTION
HEAD-DIRECTION PARAMETER
(3a) Head-Initial
English
[The child [who is eating rice]] is crying.
Spanish
[El niño [que come arroz]] llora.
(3b) Head-Final
Japanese
[[Gohan-o tabete-iru] ko-ga] naite-imasu.
'Rice-obj. eating is child-subj. crying is.'
1975: 12). Within this context, "knowledge of grammar, hence of language,
develops in the child through the interplay of genetically determined principles
and a course of experience" (Chomsky 1980: 134).
Informally, we speak of this process as language learning. The mediation of
UG in language learning restricts the infinite number of false leads that could be
provided by random induction from unguided experience of surface structure
data alone (Lust 1986). As a theory of acquisition, UG makes several
predictions.
For example, one prediction is that learners' hypotheses about language are
structure dependent; that is, "early hypotheses about possible grammatical
components are defined on sentences of words analyzed into abstract phrases"
(Chomsky 1975: 32). This means that learners naturally abstract out from what
they hear and organize the language, for example, a sentence, into hierarchies of
phrasal units. In this sense, UG restricts the nature of the hypotheses learners
will consider about the target language they are learning.
More specifically, UG predicts that the relevant properties learners attend
to in acquisition are those isolated by the principles and parameters approach of
UG. For example, if some version of the UG formulation is correct, we should
find evidence that learners will know that languages will instantiate some type of
a head-complement ordering. At the same time, we should find evidence that
they are attempting to establish the correct head-direction for the language they
are learning in acquisition. The theory also predicts that within this context, the
speech environment to which the learner is exposed plays an important but
limited role in acquisition. Its principal function is to specify those ways in which
the open parameters of UG are instantiated in a particular language. That is, the
environment provides the data base necessary for the learner to establish the
values of the parameters associated with UG in order to construct the grammar
of a particular language. The role of the environment in this framework
represents a major departure from traditional behaviorist models in which the
environment provides everything that is needed for language learning.
Current research in theoretical first language acquisition seeks to document
the role of UG in the language learning process (see work represented in, for
example, Lust 1986; Roeper and Williams 1987).
In summary, as a theory of language, UG provides a system of principles
and parameters which of necessity constitute the properties of all languages. As
a theory of biological endowment for language, UG provides an early
schematism that learners apply to languages. This schematism in turn significantly
constrains the nature and range of hypotheses learners will entertain when
acquiring a new language.
LINGUISTIC THEORY AND FL LEARNING ENVIRONMENTS 203
While there are no definitive answers yet available with respect to either
first or adult second language learning, there are a number of issues that the
language acquisition research as well as the theory from which it emerges raise for
the design of effective instructional settings. In particular, both types of research
could improve our understanding about what knowledge is available to the
learner, how this knowledge is used and how learning takes place. In turn, these
insights have consequences for teacher training, classroom composition, as well as
for the development of effective groupings and sequencing of curricular
materials. We will consider these and others in more detail below.
To begin, we know that adult second language learners do not start with
"clean slates". That is, they bring to the language learning context knowledge
not available to the child first language learner. At the same time, we know that
adult second language learners also share with children a certain body of
common linguistic knowledge.
More specifically, we know that adults have at least three distinct bodies of
knowledge available to them:
While the existence of either a knowledge base derived from the first
language or one derived from general cognition may not be surprising, the role of
general properties of UG in the adult language learning process may be. The
existence of this body of knowledge means that, in contrast to the claims of many
traditional approaches, namely CA, and of several more recent
approaches, e.g. the Fundamental Difference Hypothesis (Bley-Vroman 1989),
adult second language acquisition is not restricted by the learner's first language alone
or by unconstrained problem solving strategies.
Through their knowledge of UG, adult second language learners are not
restricted to surface structure facts of a language alone. Their knowledge of UG
involves a capacity that is both complex and abstract. This was briefly illustrated
above in the coreference and wanna examples in 1 and 2. More specifically,
learners bring to the language learning context a set of structural sensitivities
comparable to those that they bring to the first language learning situation. That is,
there is evidence that suggests that learners are prepared to pick up the same
abstract structural properties of the second language grammar that they did for
the first language grammar, for example the head-direction of a language (see
related discussion in Martohardjono and Gair 1989).
Knowledge of the first language means that learners have a fully developed
competence for at least one other language. This means that they have
constructed a specific grammar from the principles and parameters provided by a
theory of UG. More specifically, open parameters have been specified for
particular values. Some of the values of these parameters will match those of the
target second language and some will not. In addition, through their knowledge
of their first language, adult learners know all sorts of idiosyncratic and
non-paradigmatic properties for at least one language. Very few, if any, of these
properties will match those for the target second language.
Knowing that these three bodies of knowledge are available to the adult
learner has several possible consequences for language teaching. Most generally,
it means that we can make certain assumptions about the adult learner's
knowledge. We know that all learners will share knowledge of a certain common
linguistic base, namely UG. We also know that divergences that exist among
learners will principally derive from differences that exist between the first and
second language of the learner, for example where parametric settings between
the first and second language differ. Knowing both of these facts allows us in
turn to establish more precisely what has to be learned: differences in
parameter-settings. At the same time, we know that all or most learners will need to
learn the idiosyncratic properties of a language, e.g. idioms, irregularities
introduced by historical borrowings, individual lexical items (although not general
properties of the lexicon), among others. No theory of UG or any other
knowledge base will give us these facts.
At another level, one consequence of knowing what is available to the
learner is that language instructors need to be linguistically sophisticated; they need
to understand the specifics of each of these knowledge bases. At one level they
need to be familiar with the basic principles and parameters of a theory of UG
in order to understand what general linguistic knowledge all learners share and
what specific linguistic knowledge learners have of their first languages. This
suggests that instructors need to be familiar with the linguistic properties of the
specific first languages represented by the learners in their classes in order to
understand where differences will emerge.
In addition, instructors need to be generally acquainted with the results of
current psycholinguistic research, specifically those that relate to language acquisition
and use. At the same time, they need to be familiar with theories of second
language acquisition that attempt to integrate all of these domains into coherent
meaningful explanations of the second language acquisition process.
With respect to the learners themselves, the availability of these three
bodies of knowledge for all adult learners means that in principle, all adults are
capable of learning new languages. Explanations about why some adults do not
learn second languages will have to appeal to factors not related to the basic
biological capacity for language, e.g. inadequate exposure to the target language
or other complex factors related to issues of motivation.
In terms of classroom composition, these results suggest that a mixed model
consisting of both heterogeneous and homogeneous groupings based on
differences and similarities of parameter-settings of the first language would be
beneficial. We know that there are certain aspects of a new language that all
learners, regardless of their first languages, will have to learn, e.g. the
idiosyncratic and irregular properties, and those which only some learners will have to
learn, e.g. when parametric values differ between the first and the second
language. Dividing the classes up in this way means that in the case of a match in
parameter settings between the first and second language students do not have
to be redundantly taught something they already know. In the case of the
mismatch, it means that students can receive the additional input necessary for
them to assign new values to parameters.
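The grouping idea can be made concrete with a small sketch. It is an illustration only, not a procedure taken from the research discussed here: the names `Learner`, `HEAD_DIRECTION` and `group_by_parameter_match` are hypothetical, and head-direction is the only parameter modelled. The values for English, Spanish and Japanese follow example 3.

```python
from dataclasses import dataclass

# Head-direction values as given in example 3 of the text. This table is a
# hypothetical stand-in; any other language would have to be added by hand.
HEAD_DIRECTION = {
    "English": "head-initial",
    "Spanish": "head-initial",
    "Japanese": "head-final",
}


@dataclass(frozen=True)
class Learner:
    name: str            # hypothetical learner label
    first_language: str  # must be a key of HEAD_DIRECTION


def group_by_parameter_match(learners, target_language):
    """Split learners by whether their L1 head-direction matches the target.

    Returns a dict with two lists: "match" (no new value needed for this
    parameter) and "mismatch" (additional input needed to assign a new value).
    """
    target_value = HEAD_DIRECTION[target_language]
    groups = {"match": [], "mismatch": []}
    for learner in learners:
        l1_value = HEAD_DIRECTION[learner.first_language]
        key = "match" if l1_value == target_value else "mismatch"
        groups[key].append(learner)
    return groups


learners = [Learner("A", "Spanish"), Learner("B", "Japanese")]
groups = group_by_parameter_match(learners, "English")
```

Under these assumptions the Spanish speaker falls in the "match" group and the Japanese speaker in the "mismatch" group. Both groups would still need to work on the idiosyncratic properties of English, which no parameter table of this kind predicts.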
The reason for this difference has to do with differences in the dominance
type of relationships that exist between the pronoun and the noun. In sentence
4, her does not dominate Mary; that is, it is not higher in position than Mary in a
hierarchical tree structure of this sentence. In sentence 5, however, she
dominates Mary; it is higher in the tree. A general rule of language, roughly
paraphrased, states that pronouns cannot dominate their antecedents.
In addition, we know that learners will attempt to apply structure
dependent hypotheses to the new target language. We know that learners will not
commit certain kinds of errors that violate boundaries of abstract phrasal units, for
example by formulating structure independent hypotheses. To illustrate, we do not
find sentences like that in 6 in the speech of adult second language learners (nor
in the speech of child first language learners).
not as yet fully specified, it appears that a lack of a match in properties may
cause problems in learning. For example, Oller and Redding (1971) found
evidence to suggest that the learning of articles (a, an, the) was disrupted for
speakers acquiring English as a second language when the first languages of the
learners did not have article-like categories.
Somewhat paradoxically, we also know that the existence of certain
comparable properties in both the first and second language does not always
facilitate learning. For example, Clahsen (1988) reports that Turkish speakers
learning German as a second language will use an SVO (subject-verb-object)
pattern in spite of the fact that both German and Turkish require clause-final verb
placement in embedded clauses.... "the generalization... holds regardless of the
learner's L1" (op. cit.: 61).
Phonologically, interference from the first language is commonplace. For
example, the observed inability of Japanese speakers to perceive or
produce the /r/ and /l/ distinction in English is argued to result from the fact that
/r/ and /l/ are not phonemically distinguished in Japanese, whereas they are in English.
The lack of this distinction in Japanese is believed to interfere with the
subsequent learning of this distinction in English. It is important to keep in mind,
however, that the interference function of the first language is not necessarily its
dominant role in the second language learning process.
Given the nature of the knowledge available to the adult learner, we know
that there is a strong deductive component involved in language learning. This
means that language learners do not learn the new language by translating word
for word from the first language to the second language. They are capable of
looking for higher order conceptual units and will do so quite naturally when
given the opportunities by abstracting out from what they hear. Essentially, the
construction of the target second language is a grammar-driven process rather
than a data-driven one.
These results also suggest that learners will proceed through a natural
sequence of development guided by innate principles. While developing, these
learners will extrapolate from the language environment what they need when
they need it. How much of this is open to actual learning is still an important
empirical question.
We also know that knowledge of their first languages can serve to facilitate
learning. Where there is a match between the first and second language,
learners will rely upon what is already available to them from their first languages.
For example, a Spanish speaker learning English has more available to her that
can be used when learning English than a Japanese speaker does. One way that
Spanish matches English is in its being a head-initial language. Because both
languages share this property, adult Spanish speakers do not have to re-learn
this fact about English. They can draw upon what they already know from
Spanish when learning English. Japanese, on the other hand, as illustrated above in
sentence 3b, is a head-final language; this means that Japanese speakers need to
assign a new value to the head-direction parameter in order to acquire English.
They also do so in a manner that corresponds to what children do when learning
English as a first language (for extended discussion see Flynn 1987; Flynn and
Lust in press; Flynn forthcoming).
We also know that in contrast to many theories about language learning,
adult second language acquisition does not proceed by random induction from
surface language facts alone. While some inductive learning is involved and
research needs to isolate more precisely where, this learning is also highly
constrained. Of all the possible hypotheses and strategies an adult could use and
formulate when learning a second language given all the knowledge available to
an adult, adults simply do not apply non-linguistic hypotheses to the learning of
a second language. In fact, what is impressive about the adult second language
acquisition process is not the manner in which first and second language
acquisition seem to trivially differ but the significant manner in which the two
processes converge.
In terms of instructional settings, knowing how learning takes place has
several consequences.
For example, as in first language acquisition, the learning environment must
be rich enough to provide the input necessary for the learner to deduce the right
properties of the target language. This suggests, as already documented for first
language acquisition, that the learner needs as much exposure as possible to
natural language. In addition, the language learning environment must be
interactive and directed to individual learners. While it is not always possible in a
language classroom, the goal for language learning contexts should be to simulate
such an environment. Ideally, this interaction should be between two
interlocutors; however, it is also conceivable that other forms of language exchanges can
provide some of this interaction in new and creative ways. For example, one can
imagine developing computer programs that respond immediately and
appropriately to the learner such that they simulate but not necessarily substitute for the
needed one-to-one language "instruction" provided by caretakers with their
young children. Such work is the focus of the Foreign Languages and Literatures
computer projects being developed within the context of Project Athena at MIT,
for example.
The existence of a strong deductive component to second language learning
also strongly suggests that not all corrections are meaningful or useful. We know
from first language acquisition that one can with great effort get a child to
correct a previously ungrammatical utterance only to have the child resort to using
the ungrammatical utterance until she is really ready to change naturally. A
similar phenomenon is also often observed with adult learners. Part of the reason
why these corrections appear useless is that the type of input given to the adult
and perhaps the time at which it was given in development were simply
meaningless to the learner. It seems that the right kind of input is needed and it
must be given at the right time in order for such intervention to have any lasting
effect. The form of this input will also not always be in the form of an
explanation as suggested above. It will more often than not involve more linguistic input
of a particular kind, for example expansions and paraphrases of key utterances
in as many varied syntactic structures as possible. Determining exactly what the
key utterances are is dependent upon the instructor's understanding of the
nature of the error made. Determining when such input is useful is also dependent
upon one's knowledge of what developmental stage the student has attained.
A program to do exactly this could easily be developed
with current technology in computer-aided instruction.
4 Conclusions
In summary, the purpose of this paper was to explore the possible
implications of linguistic theory for language pedagogy. What has been discussed in this
paper is only a fragment of what can ultimately be achieved and tested.
Continued study and dialogue between the two domains explored in this paper can
yield the insights necessary for the continued development of both principled
language learning environments and ultimately principled theories of language.
Acknowledgements
The author wishes to thank Ralph Ginsberg, Claire Kramsch, Charles Ferguson and Jack Carroll
for discussions concerning various aspects of the issues addressed in this paper as well as for the
many suggestions made for revision with respect to an earlier version of this paper. The author
would also like to acknowledge the participants at the Bellagio conference for their helpful
comments and questions.
Notes
1. Important to note is that the discussion in this paper will center principally on the
acquisition of linguistic knowledge. This is not to say that this is all that one needs to learn in order
to become a native or near-native speaker of a language. Discussion of the acquisition of
other necessary properties and components, for example the target culture, can be found in
several other papers in this volume. Discussion of such issues is, however, beyond the scope
of this paper.
2. Technically, we can account for these facts in terms of principles of Binding Theory
proposed within a theory of Universal Grammar. For a detailed discussion of the specific
details, see Chomsky 1981 as well as Lasnik and Uriagereka 1988; Radford 1988.
3. The trace (t) in both sentences 2c and 2e indicates the position from which the wh-word (who in
this case) has been moved in order to form a question. The t is a type of place-holder. In
sentence 2c, the t indicates that the wh-word was the object of the verb,... visit who. In sentence
2e, the t indicates that the wh-word was the subject of the infinitive clause, who to visit Bill.
The asterisk in sentence 2f indicates that this sentence is ungrammatical.
References
Bley-Vroman, R. 1989. "What is the Logical Problem of Foreign Language Learning?" Linguistic
Perspectives on Second Language Acquisition ed. by S. Gass and J. Schachter, 41-68.
Cambridge: Cambridge University Press.
Chomsky, N. 1975. Reflections on Language. New York: Pantheon Press.
Chomsky, N. 1980. Rules and Representations. New York: Columbia University Press.
Chomsky, N. 1981. Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, N. 1986. Knowledge of Language. New York: Praeger Press.
Clahsen, H. 1988. "Parameterized Grammatical Theory and Language Acquisition: A Study of the
Acquisition of Verb Placement and Inflection by Children and Adults." Flynn and O'Neil
1988.47-75.
Cook, V. 1988a. The Relevance of Grammar in the Applied Linguistics of Language Teaching. Ms.
University of Essex, England.
Cook, V. 1988b. Chomsky's Universal Grammar: An Introduction. Oxford, England: Basil
Blackwell.
Dulay, H. and M. Burt. 1974. "Natural Sequences in Child Second Language Acquisition."
Language Learning 24.37-53.
Dulay, H., M. Burt and S. Krashen. 1982. Language Two. Oxford: Oxford University Press.
Felix, S. 1985. "More Evidence on Competing Cognitive Systems." Second Language Research
1.47-72.
Flynn, S. 1987. A Parameter-Setting Model of L2 Acquisition: Experimental Studies in Anaphora.
Dordrecht: Reidel Press.
Flynn, S. Forthcoming. "Eubanks Revisited: Response to Flynn Revisited." To appear in Second
Language Research.
Flynn, S. and W. O'Neil eds. 1988. Linguistic Theory in Second Language Acquisition. Dordrecht:
Kluwer Academic.
Flynn, S. and B. Lust. In press. "A Response to Bley-Vroman and Chaudron." Language Learning
June.
Flynn, S. and J. Carroll. In press. Second Language Acquisition. England: Longman Press.
Fries, C. 1945. Teaching and Learning English as a Foreign Language. Ann Arbor, MI: University
of Michigan Press.
Jenkins, L. 1988. "Second Language Acquisition: A Biolinguistic Perspective." Flynn and O'Neil
1988.109-116.
Klein, W. 1986. Second Language Acquisition. Cambridge: Cambridge University Press.
Lado, R. 1957. Linguistics Across Cultures. Ann Arbor, MI: University of Michigan Press.
Lasnik, H. and J. Uriagereka. 1988. A Course in GB Syntax. Cambridge, MA: MIT Press.
Lust, B. 1986. "Introduction." B. Lust 1986.
Lust, B. ed. 1986. Studies in the Acquisition of Anaphora, Vol. 1: Defining the Constraints.
Dordrecht: Reidel Press.
Martohardjono, G. and J. Gair. 1989. "Apparent Inaccessibility in SLA: Misapplied Principles or
Principled Misapplications?" Paper presented at the 18th Annual Linguistics Symposium.
University of Wisconsin-Milwaukee.
Newmeyer, F. 1983. Grammatical Theory: Its Limits and Its Possibilities. Chicago: University of
Chicago Press.
Newmeyer, F. and S. Weinberger. 1988. "The Ontogenesis of the Field of Second Language
Learning." Flynn and O'Neil 1988.27-34.
Oller, J. and J. Redding. 1971. "Article Usage and Other Language Skills." Language Learning
1.85-95.
Radford, A. 1988. Transformational Grammar: A First Course. Cambridge: Cambridge University
Press.
Roeper, T. and E. Williams, eds. 1987. Parameter-Setting. Dordrecht: Reidel Press.
Rutherford, W. 1987. Second Language Grammar: Language and Teaching. London: Longman
Press.
Sciarone, A. 1970. "Contrastive Analysis: Possibilities and Limitations." International Review of
Applied Linguistics 8/2.115-131.
Sharwood-Smith, M. 1981. "Consciousness Raising and the Second Language Learner." Applied
Linguistics 2.159-68.
Stowell, T. 1981. Origins of Phrase Structure. Ph.D thesis. MIT.
Culture in Language Learning:
A View From the United States
Claire Kramsch
Council for the Teaching of Foreign Languages (ACTFL 1986), although some
suggestions have been made for French.
In the debate surrounding the link between language and culture, I believe,
with Bourdieu (1967), that systems of education breed systems of thought and
that those systems of thought constitute a great deal of what we call the
"culture" of a given society. At the very least, it represents the value attached by a
given society to the phenomenon of language and culture itself.
Since foreign language education is governed in the US by some 1600
school boards across the country, it could be instructive to look at the
Guidelines currently issued by the Boards of Education of the respective states to
see how they suggest integrating the teaching of language and culture.
and cultures, bring about world peace (Hawaii); provide all students with
cultural and linguistic sensitivities necessary for world citizenship (Connecticut).
The political reasons for studying foreign languages, for example English or
German, in public schools in France are stated in sober and realistic terms.
"English is spoken as a first or second language in many countries in the world".
"We cannot overestimate the particular importance of our relations with those
countries where German is the state language, and which constitute a linguistic
and cultural community of major demographic, economic and political
importance. The only one in Europe that serves as a bridge between western and
eastern bloc countries" (Instructions ministérielles 1986).
The goals of foreign language education are threefold:
These goals reflect the traditional French belief that the acquisition of
language is the formation of mental structures, that learning to talk is learning to
think, and that social acceptability in French society and abroad is not only a
question of using grammatically correct sentences, but also of employing the patterns of
thought of the dominant, i.e., educated, discourse of that society (Bourdieu
1982). Of course, this view not only acknowledges social differences in the native
language, but replicates them in the acquisition of a second. By teaching foreign
languages, the French educational system fulfills its mission of furthering
upward social mobility through an awareness of the intellectual value of language
and culture per se.
The question is: Do these conceptual outcomes further the cause
of intercultural understanding any more than the American ones? Does everyone in the world
share the French view of the importance of the intellect? To what extent is
upward mobility always linked to the ability to speak and write well in different
languages and to know other cultures? A more serious question raised by the
French model is the following: Is intercultural understanding linked to a specific
kind of social literacy and thus inevitably class-specific?
the content of information at the expense of its form and structure not present
an incomplete picture of intercultural communication? Intellectual styles
(Galtung 1981) or patterns of thought (Bourdieu 1967) are socially and culturally
determined and are so inseparable from the informational content transmitted,
that communication breakdowns occur more often than not at the level of
discourse, not at the level of the facts presented. Some of these breakdowns are
apparent, for example, when American students of German are given textbooks
written by Germans to teach German as a foreign language and are asked to
adopt a culturally different learning style.
The second question concerns the emphasis put in the Hessian Guidelines
on "demystifying" current ideologies. Since there can be no non-ideological
absolute standpoint, it might be more useful to think in terms of the negotiation
and joint construction of a reality that is agreed upon as a safeguard against
communicative intolerance.
Finally, there is no easy passage from reflection and enlightenment to
action. If the American view might be seen as too much focused on language as a
tool for action, the European view might be considered to be too concerned
with language as an object of linguistic or social reflection. Both views illustrate
two complementary aspects of culture: culture as performance, culture as
competence, to which I will now turn.
The need to account for the cultural dimensions of language forces us to
review the traditional, positivistic conceptions of quantitative, normative, linear
language learning.
There is a noticeable gap, for example, in most of the US boards of
education guidelines, between the intercultural goals of foreign language education
and their behavioristic theory of language. In Virginia, language is seen only as a
set of symbols or tools: "Language is a set of symbols used by people to convey
meaning. ... tools for transmitting thoughts and ideas" (VA vol. 1, p. 5).
"Language learning includes the acquisition of many skills" (PA p. 5). It is not clear
how purposeful, intercultural communication emerges from the acquisition of
isolated skills. In New York, we note the same discrepancy between the goals
and the means. The communicative goals are no less than "the ability to
understand, respect and accept people of a different race, sex, ability, cultural
heritage, national origin, religion and political, economic and social background as
well as their values, beliefs and attitudes" (p. 4). But to reach these goals,
teachers are encouraged to "use principles of mastery learning, and to use informal
In the wake of the President's Commission Report following the 1975
Helsinki agreement, several developments on the national level have served to
make foreign languages part of a general push for the internationalization of
American education.
Realizing the need to meet the demands of a "global society" in which the
U.S has to deal with trade deficits, competitiveness and disarmament issues, and
228 CLAIRE KRAMSCH
realizing in addition that the U.S. is less and less a melting pot, but more
and more a permanently multicultural salad bowl, efforts are being made at the
federal level to internationalize American education. The American Council on
Education surveyed the following aspects of international studies at the
undergraduate level in colleges and universities in the United States:
internationally-oriented majors, minors, certificate programs; foreign language
instruction; study abroad; faculty development, including support for travel
abroad and curriculum development; visiting foreign faculty and lecturers,
foreign students; internationally-oriented library resources;
institution-to-institution linkages overseas.
Its findings draw a sobering picture of the pervasive anglocentric orientation
of American higher education (Lambert 1989).
Under the direction of Richard Lambert, the NFLC convened a special task force
on the teaching of language and culture in September 1987 to explore
pedagogical needs and existing materials. Participants included scholars in
English and comparative literature, anthropology, foreign language acquisition,
linguistics and representatives of governmental agencies and business
corporations. Participants agreed that culture is both something you perform
and something you learn about. The discussion addressed questions related both
to cultural performance and cultural competence. Related to cultural
performance were such issues as: What can best be learned by living in the
country, what is best learned in a domestic instructional setting? At what
stage should the learners learn which features? How can teaching culture be
adapted to the purposes of the learner in a foreign country? What is an
appropriate unit of teaching: the communicative situation? behavioral segments?
speech acts? How should we select which cultural features to teach: their
generalizability across situations? their capacity to be matched with
linguistic features? How can we measure cultural competence: directly,
indirectly? How can recent technological advances help in the teaching of
culture? Eleanor Jorden's interactive video material for teaching Japanese
cultural performance represents a major step in the right direction (Jorden
1989) and so do other efforts, particularly for the less commonly taught
languages, such as Hindi (Gambhir 1987, forthcoming).
With regard to cultural competence, the planning group discussed the
general education goals of the teaching of culture: cultural aspects of discourse
and conversational style, critical reading skills transferred from mother tongue
reading classes, multiple perspectives on C2 read in the learners' native lan-
- sociolinguistic ability: can meet all the demands for survival as a traveller
...; can handle any common social situation with an interlocutor accustomed to
foreigners: make requests politely, offer and receive gifts and invitations,
apologize, make introductions, and discuss some current events or policies, a
field of personal interest, a leisure-time activity of one French-speaking
country; can participate in a conversation if conducted in "français soigné",
perhaps asking to have some expressions repeated or paraphrased; manage to
convey an attitude of good will via tone of voice and nonverbal means.
- knowledge: can interpret simple menus, timetables ...; beyond the survival
level, knows about the phases of "culture shock" and how they may affect
perception; can identify the truth or untruth implied in the stereotypes of his
or her home culture and of French culture; ... can name at least two present
political parties in France, and two or three major contemporary issues; can
describe or give examples of qualities prominently sought in French education,
such as clear expression and organization of ideas, knowledge of French history
and geography, and literature; ... can describe in broad outline the main
geographical regions, the political institutions, the public-education system,
and the mass media of France or another French-speaking country; can produce a
few proverbs or stock phrases which reflect a world view often encountered
there; can say how that country's institutions, regulations, and customs such
as attitudes toward behavior and appearance in public, may affect him or her as
a foreign traveller (or student, trainee, business person); ... can identify,
in a literary or a journalistic text, examples of elevated style and of
familiar and popular expressions, and in reading, can point out some of the
verbal indications of attitudes, hidden quotations or allusions. (op. cit.:
15-16).
- informed attitudes (desirable at the basic level, indispensable at the
superior level): curiosity about discovering similarities and differences
between one's home culture and French culture; ... without losing one's own
identity, a basic desire to accommodate to the norms of the foreign society;
the determination to avoid over-generalization and stereotyping; awareness of
the fact that one's perceptions and judgments are patterned by one's home
culture, and are subject to temporary influences such as the phases of culture
shock; a critical approach to statistics and opinion polls: a concern to know
the date and scope of the evidence, even if one is not able to judge the
credibility of the agency; a fair-minded, relativistic appreciation of cultural
differences to the point of being able to present objectively some judgments
that foreigners make concerning one's home country. (op. cit.: 14)
3 Current obstacles
moted. But through training she acquired responsive modes and solved her
"problem". The article ends with her comment, which "sums up the experience of
many immigrant professionals with pragmatics": "Before I didn't realize what
people were expecting from me. Now I feel free to speak, free to contribute".
The slightly uncomfortable feeling we have in reading these examples of
successful acquisition of cultural performance is that it doesn't seem to be
accompanied by any gain in cultural understanding, either on the part of the
clients or on the part of the American journalist who tells the story. On the
contrary, the closing lines of the piece seem to reinforce the ideological
stereotype of America as the land of freedom symbolized by American "free"
conversational style. A similar ethnocentric result would be achieved if ESL
teachers used the film Crosstalk (Gumperz, Jupp and Roberts 1979) as
behavioristic training for successful job interviews without at the same time
increasing the cross-cultural sensitivity of both parties to the cultural
dimensions of discourse.
Television's ability to bring whole new worlds into one's living room and its
total claim on domestic and foreign reality hide the fundamental
socio-centrality of the medium as a model of meaning production (Ong 1977;
Fiske and Hartley 1978; Gumpert and Cathcart 1982; Geisler 1985; Kozloff 1987).
Television viewers have been socialized into seeing foreign countries and
events through the cultural discourse of their own television or, if satellite
and other reception permit, through the cultural codes of a foreign television
discourse. The fact that a society's television programs reflect a certain
cultural consensus on the way social reality is viewed makes the medium into a
unique tool for teaching foreign cultures as they are presented through foreign
television; but at the same time, because of its appearance of universality,
television can be the greatest obstacle to appreciating and understanding
cultural differences, if it is not critically "deconstructed" and placed in its
own cultural discourse framework. For example, there is a long German tradition
of foregrounding the process of narration in television films: the story may
start with the end and retrace the events that led to it, the
filmmaker/narrator may appear in person to give metacomments on the narrative,
thus destroying the filmic illusion. In addition, because of the traditional
lack of interruptions for commercial purposes, German television viewers have
long attention spans and enjoy slow reflective narrative styles. By contrast,
American cinematic style prefers to obliterate all traces of enunciation:
narration is chronological, events unfold linearly as in "real time", viewers
are able to identify with the characters. Uninformed American viewers of German
television films tend to find the lack of American-type suspense and action
disconcerting, and the pace too slow; they confuse the lack of identification
possibilities with "intellectualism".
3.5 Lack of a theoretical framework for the discussion of culture and for
contrastive cultural analyses
The current efforts and obstacles outlined above prompt us to reassess the
traditional questions asked of foreign language education in instructional
settings.
A question that is often asked by policy makers and administrators in the
United States is: Is a foreign culture learned best in a domestic instructional
setting or by living and studying abroad? Although the question is,
educationally and financially, a valid one, there is to date no conclusive
evidence to show that study abroad per se leads to cross-cultural
understanding, or to the development of the cross-cultural personality. More
needs to be known about what it is exactly students learn when they go abroad
(for a discussion of this topic, see Lambert 1989).
Rather than reduce the issue to either/or dimensions, we should be concerned
about the appropriate balance of cognitive and experiential learning, both of
which are essential to the acquisition of cultural competence and performance.
One of the advantages of studying abroad lies not only in the practice of the
forms of the language, but also in the exposure to other intellectual styles,
other ways of framing questions, other ways of interpreting social, historical,
political facts. But even there, experiencing these different discourse forms
does not make them meaningful without conscious cross-cultural reflection. This
takes place best in instructional settings.
A related question is the following: Is a foreign language course the best
place to teach culture, or is it best taught in a separate course taught in the
students' native language?
This question assumes that a language can be taught without teaching the way in
which that language expresses the world view of the social group or society
that speaks it. It assumes that one first learns skills then content. We must
reframe the question as follows: What is the appropriate balance of the
development of socialization and literacy in the foreign language? The question
is not: Socialization or literacy? but: When and how much should we teach how
to perform social acts in the language, when and how much should we teach how
to interpret oral and written texts? (Kramsch 1987b).
In recent years, much has been made of "content-based" instruction for the
teaching of foreign languages. The most notable experiment has been at the
Lauder Institute School of Management, where advanced students attend lectures
in their fields conducted in the foreign language. These and other immersion
experiences have raised the question: What is the best way to teach the
advanced levels: language courses or content courses taught in the language?
Content courses are obviously an excellent way of using the language for
various academic and professional purposes. The evidence is not yet in
concerning the effect of these courses on the linguistic proficiency of
advanced learners. However, from a cultural point of view, the question is
wrongly posed.
If these courses are to impart not only knowledge of foreign events, but also a
foreign discourse style, one has to ask: From what cultural perspective are
these courses taught? Which point of view is represented in the transmission of
cultural knowledge: that of the base or that of the target culture? the
"business/management" point of view or other intellectual points of view also?
Culture can only be really understood within relational systems of thought,
indeed within "ecological" forms of pedagogy, in the Batesonian sense (Bateson
1982). If the perspective of the lecturer cannot and indeed should not be
avoided, a lecturer, say, in the field of political science, should be able to
convey to students the cultural slant of the discourse of his/her discipline.
As renewed efforts have been made to link the teaching of foreign languages to
practical usages outside the classroom, there has been much concern about which
aspects of the language should be taught. Hence the traditional question: If
only short times of exposure are available, should schools teach basic language
skills or general education competencies? This question again assumes that one
can separate skill from content in the development of communicative competence.
It is a fallacy to believe, for example, that an uneducated language learner
will be successful in achieving the lofty goals of the US President's
Commission and the cultural goals of the US states' guidelines without
additional education. Rather, a cross-cultural approach should abandon the
native speaker as ideal or norm and focus instead on developing the learner's
biculturalism, as it does the learner's bilingualism. It should, therefore,
instill basic language courses with the intellectual excitement generated by
forms of learning that are typically attributed to general education and
academic achievement: relational and critical thinking, observation of and
reflection on interactional processes, interpretive ability (Kramsch 1987a,
1987b; Swaffar 1990).
Finally, given the administrative structure of higher education in the United
States and the traditional differential of prestige between teachers of
literature and teachers of language, one question has gained in importance in
the last few years: How can we break the hammerlock of humanists on the
teaching of foreign languages? or: Should the teaching of language be coupled
to that of literature or not?
Although this question is justified in view of certain academic excesses, it is
nevertheless of too limited a scope to offer a useful response. The issue is
not: By whom or in which domain of knowledge should languages be taught, but
what kind of discourse worlds should be activated and how? Literature is but
one of the many cultural discourses to which foreign language learners should
be exposed; others include everyday conversational discourse, scientific,
technical and political discourse, and the specific discourse of individual
disciplines. To the extent that they are the products of a given culture, works
of foreign literature read by foreign non-intended readers present a special
challenge in cross-cultural communication, one that eminently serves to further
cross-cultural education (Wierlacher 1985; Bredella and Haack 1988; Kramsch
1988a). Language study should expose learners to a variety of discourse forms
that coexist in a given culture (Modern Language Association 1989).
5 Conclusion
Notes
1. The different states' guidelines consulted here are as follows: Foreign Languages Arkansas
Public School Course Content Guide. Little Rock, Arkansas: State Board of Education,
1984; Handbook for Planning an Effective Foreign Language Program. Sacramento, Califor-
nia: California State Department of Education 1985; A Guide to Curriculum Development
in Foreign Languages. Hartford, CT: Connecticut State Board of Education 1981; French
Language Program Guide. Honolulu, Hawaii: Department of Education, Office of Instruc-
tional Services/General Education Branch, Feb. 1979; Designing, Strengthening and Assess-
ing School Foreign Language Programs. A Guideline for Administrators and Teachers.
Bloomington, IN: Indiana Dept of Public Instruction, Division of Curriculum 1981; Ken-
tucky FL/ESL Skills Continuum Frankfort, KY: Foreign Language Education, Kentucky
Dept of Education 1980; Position Paper on Foreign Language Education in Michigan
Schools. Detroit: Michigan State Board of Education 1983; Modern Languages for Com-
munication. New York State Syllabus. Albany, NY: New York State Education Dept 1985;
Suggested Learner Outcomes. French. Oklahoma City: Oklahoma State Dept of Education,
August 1985; Handbook for Foreign Language Educators, Harrisburg, PA: Pennsylvania
Dept of Education 1983; Foreign Language Dept Goal Statement. Materials Guide. Level I
materials. Springfield, MA: Springfield Public Schools 1986; Secondary French Guidelines
for Levels I, II, III. Austin, TX: Foreign Language Section, Texas Educational Agency, Divi-
sion of Curriculum Development 1978; A Course of Study for Foreign Languages in Utah.
Salt Lake City, Utah: Utah State Office of Education, Division of Curriculum and Instruc-
tion 1980; Foreign Languages in Virginia Schools. Richmond, VA: Foreign Language Ser-
vice, Dept of Education. Sept 1977, vol. 1-7; A Guide to Curriculum Planning in Foreign
Language. Madison, WI: Wisconsin Dept of Public Instruction 1985; Instructions ministé-
rielles pour l'enseignement des langues vivantes. Journal officiel 524-6,6, 1986; Rahmenrich-
tlinien, Sekundarstufe 1, Neue Sprachen. Der Hessische Kultusminister (Ed.) Frankfurt
a.M.: Diesterweg 1980.
I am grateful to Laure Borgomano for making available to me her collection of States Gui-
delines.
2. Other U.S. initiatives worth mentioning are the Lauder Institute of Management and Inter-
national Studies, especially their Language and Cultural Perspectives Program at the
University of Pennsylvania. Besides professional proficiency in a foreign language, this pro-
gram provides substantial knowledge of contemporary and traditional culture of educated
native speakers of that language; history, economics, geography, literature, political science
and philosophy, and religion, as well as the arts, media, and sports are taught in the foreign
language; students are given also an understanding of management communication style,
behavior, and cultural protocol in a range of professional and social settings. Another
trend-setting initiative is the Integrative German Studies program at the University of Tübingen
sponsored by the Bosch foundation for graduate students and young scholars from the
United States. This is an interdisciplinary research project with the intention of developing
comparative and interdisciplinary German Studies for Americans. The first seminar run for
students from UCLA took place in summer 1988.
3. At the same time, the Los Angeles Times published an article entitled "Security Threat
Cited in Foreigner Jobs", where the "security threat" posed by the influx of foreign engineers
referred to the dangers inherent in foreign intellectual styles. "Some experts worry that the
traditional American emphasis on practical engineering problems may be eroding in favor of
theoretical engineering sciences, a more prestigious pursuit but one less likely to contribute
to American competitiveness in world markets" (Gillette 1988).
References
Abelson, H. and G.J. Sussman. 1985. Structure and Interpretation of Computer Programs. Cam-
bridge: MIT Press.
ACTFL Proficiency Guidelines. 1986. Hastings-on-Hudson, NY: ACTFL Materials Center.
AATF National Bulletin. 1989. "The Teaching of French. A Syllabus of Competence." AATF Na-
tional Bulletin 15. Special issue.
Attinasi, J. and P. Friedrich. 1988. "Dialogic Breakthrough: Catalysis and Synthesis in
Life-changing Dialogue." Unpublished manuscript.
Bateson, G. 1982. Steps to an ecology of mind. New York: Bantam.
Bourdieu, P. 1967. "Systems of education and systems of thought." International Social Science
Journal (Unesco) 19/3.338-58.
Bourdieu, P. 1982. Ce que parler veut dire. L'économie des échanges linguistiques. Paris: Fayard.
Bredella, L. and D. Haack, eds. 1988. Perceptions and Misperceptions: The United States and Ger-
many. Studies in Intercultural Understanding. Tübingen: Gunter Narr Verlag.
Coste, D. 1980. "Analyse de discours et pragmatique de la parole dans quelques usages d'une
didactique des langues." Applied Linguistics 1/3.244-252.
Fishman, J. 1982. "The Need for Language Planning in the United States." PEALS 20/4.5-6.
(Published by the Colorado Congress of Foreign Language Teachers, a constituent of ACTFL.)
Fiske J. and J. Hartley. 1978. Reading Television. London: Methuen.
Fliegel, D. 1987. "Immigrant Professionals must speak American." The Boston Globe June 16.
Galisson, R. 1987. "Accéder à la culture partagée par l'entremise des mots a C.C.P." Etudes de
Linguistique Appliquée 67.119-40.
Galtung, J. 1985. "Struktur, Kultur und intellektueller Stil." Wierlacher 1985.151-193.
Gambhir, S. et al. 1987. New Directions New People A Video Series for Teaching Hindi as a
Foreign Language. Available from South Asia Regional Studies, U. of Pennsylvania,
Philadelphia, PA 19104-6305.
Gambhir, V. Forthcoming. "A set of culturally sensitive situations for South Asian languages."
Available from Dept of Modern Languages and Linguistics, Cornell University, Ithaca, NY
14853.
Garfinkel, H. 1967. Studies in Ethnomethodology. London: Basil Blackwell, 301-323.
Geisler, M. 1985. "'Heimat' and the German Left: The Anamnesis of a Trauma" New German
Critique 36.25-66.
Gillette, R. 1988. "Threat to Security Cited in Rise of Foreign Engineers." Los Angeles Times
January 20.
Gumpert G. and R. Cathcart, eds. 1982. Inter/Media. Interpersonal Communication in a Media
World. New York: Oxford University Press.
Gumperz, J.J., T.C. Jupp and C. Roberts. 1979. Cross-talk: A Study of Cross-Cultural
Communication. London: National Centre for Industrial Language Training in association with
the BBC.
Habermas, J. 1970. Theorie des kommunikativen Handelns. Frankfurt/Main: Suhrkamp.
Hart D., S. Lapkin and M. Swain. 1987. "Communicative Language Tests: Perks and Perils."
Evaluation and Research in Education 1/2.83-94.
Hofstadter, R. 1963. Anti-intellectualism in American Life. New York: Vintage Books.
Jorden, Eleanor H. and Mari Noda. 1989. Japanese: The Spoken Language. Videotapes of core
conversations. Available from Sales Dept, Sony Video Software, 1700 Broadway, New York,
NY 10019.
Kozloff, S.R. 1987. "Narrative Theory and Television." Channels of Discourse. Television and Con-
temporary Criticism ed. by R.C. Allen. Chapel Hill: University of North Carolina Press.
Kramsch, C.J. 1983. "Culture and Constructs: Communicating Attitudes and Values in the
Foreign Language Classroom." Foreign Language Annals 16.437-448.
Kramsch, C.J. 1987a. "New Directions in the Teaching of Foreign Languages." The Governance of
Foreign Language Teaching and Learning. Proceeding of a Symposium Princeton, New Jersey,
October 1987. ed. by P. Patrikis. New Haven, CT: The Consortium for Language Teaching
and Learning.
Kramsch, C.J. 1987b. "Socialization and Literacy in a Foreign Language: Learning Through
Interaction." Theory into Practice (Ohio State University) 26/4.243-50.
Kramsch, C.J. 1987c. The Missing Link in Vision and Governance: Foreign Language Acquisition
Research. ( = Profession 87.) New York: The Modern Language Association of America.
Kramsch, C.J. 1988. "The Cultural Discourse of Foreign Language Textbooks." Toward a New
Integration of Language and Culture ed. by Alan Singerman. Middlebury, VT: Northeast
Conference.
Lambert, Richard D. 1989. International Studies and the Undergraduate. Washington, DC:
American Council on Education.
McLeod, B. 1976. "The relevance of anthropology to language teaching." TESOL Quarterly
10/2.211-220.
Modern Language Association. 1989. "Language Study in the United States. A Draft Statement."
MLA Newsletter Fall 1989.16
Müller, B.D. 1980. "Zur Logik interkultureller Verstehensprobleme." Jahrbuch Deutsch als
Fremdsprache 6.102-119.
Müller, B.D. 1981. "Bedeutungserwerb. Ein Lernprozeß in Etappen." Konfrontative Semantik ed.
by B.D. Müller, 113-154. Tübingen: Gunter Narr Verlag.
Murray, J.H., G. Furstenberg and D. Morgenstern. 1989. "The Athena Language Learning
Project: Design Issues for the Next Generation of Computer-Based Language Learning."
Modern Technology in Foreign Language Education: Application and Projects ed. by W. Flint
Smith, 97-118. Lincolnwood, IL: National Textbook Co.
Ong, W. 1977. Interfaces of the Word: Studies in the Evolution of Consciousness and Culture. Itha-
ca, NY: Cornell University Press.
Perkins, J. 1980. "Strength through Wisdom: A Critique of U.S. Capability. A Report to the Presi-
dent from the President's Commission on Foreign Languages and International Studies, No-
vember 1979." Modern Language Journal 64.9-57.
Pfeiffer, K.L. 1988. "Implications of the Intellectual Migration: Two Cultures Once Again?" Bre-
della and Haack 1988.37-59.
Porcher, L. 1983. "L'école dans tous ses états I. A la recherche de 'modèles' pédagogiques." Le
Français dans le Monde 179.25-29.
Reddy, M. 1979. "The Conduit Metaphor." Metaphor and Thought ed. by A. Ortony. Cambridge:
Cambridge University Press.
Smith, F. 1985. "A Metaphor for Literacy: Creating Worlds or Shunting Information?" Literacy,
Language and Learning. The Nature and Consequences of Reading and Writing ed. by D.
Olson, N. Torrance and A. Hildyard. Cambridge: Cambridge University Press.
Snow, C. 1987. "Beyond Conversation: Second Language Learners' Acquisition of Description
and Explanation." Research in Second Language Learning: Focus on the Classroom ed. by
J.P. Lantolf and A. Labarca, 3-16. Norwood, NJ: Ablex.
Swaffar, Janet. 1990. "Language learning is more than learning language: Rethinking reading and
writing tasks in textbooks for beginning language study." Foreign Language Research and the
Classroom ed. by B.Freed. Lexington: D.C. Heath.
Wierlacher, A., ed. 1985. Das Fremde und das Eigene. Prolegomena zu einer interkulturellen Ger-
manistik. Munich: Judicium Verlag.
Implications of Intelligent Tutoring Systems for
Research and Practice in Foreign Language
Learning
Ralph B. Ginsberg
In this and two subsequent papers I shall explore some implications of
artificial intelligence for the design and empirical analysis of learning
environments and teaching systems for foreign languages. At the same time,
since artificial intelligence is implemented on computers or in settings in
which computers play a fundamental role (such as hypermedia with interactive
video and voice synthesis), I shall be discussing foreign language learning
that takes place in designed environments that are for the most part neither
classroom-based nor classroom-managed. The issues with which I shall be
concerned here, however, do not depend in any important way on the technology
of computers or classrooms, and accordingly the research I shall review is
pertinent to traditional teaching methods as well as computer-based learning.
Computers are, in one respect, powerful tools which vastly increase our
capacity to perform logical, numerical, and symbolic computations. Most of the
computer-aided instruction (CAI) that is now commonplace in virtually every
area of education uses them in this way. But in another respect computers are
an interactive and potentially intelligent medium within which we can carry out
our most important social and cognitive activities. Artificial intelligence
(AI) is the branch of computer science which tries to exploit this intelligence
and interactiveness. With regard to the transmission and acquisition of
knowledge, this entails addressing two basic questions:
- Learning environments: what are the characteristics of the physical and
social settings in which people learn efficiently and effectively, and how can
such settings be constructed or simulated?
- Knowledge communication: how is knowledge successfully communicated and skill
successfully imparted, and how can those communication processes be emulated
and enhanced?
- STEAMER (Hollan et al. 1984), which makes use of computer graphics and
simulation to help students learn the operation of shipboard steam propulsion
systems;
- SOPHIE (Burton, Brown and De Kleer 1982), designed to teach various
aspects of electronic troubleshooting;
- WEST (Burton and Brown 1982) and WUSOR (Goldstein 1982), coaches
for the computer-based games WEST ("How the West Was Won") and
WUMPUS;
- GUIDON (Clancey 1987), designed to teach medical diagnosis through
case method dialogues; and
- LISP and Geometry tutors (Anderson and Reiser 1985; Anderson, Boyle and Yost
1985), for the AI computer language LISP and high school geometry.
- Several other programs are currently in production.
The goals and aspirations of ITS can, perhaps, best be grasped by contrasting
it with the familiar CAI programs it is meant to improve. ITS's improvements
move it closer to successful human teachers and supportive learning
environments, in and out of the classroom, and in this respect they bear on
general issues of research in foreign language learning.
One potentially important drawback of CAI - one should not exaggerate how
important this or the other drawbacks noted below really are for any particular
pedagogical goal or subject matter - is its rigidity. "Traditional" CAI (the
quotation marks signal caricature of both CAI and ITS) can be thought of as a
directed graph or flowchart, consisting of nodes representing textual,
graphical and audiovisual presentations of material, menus, questions with
their answer categories, error messages and tutorial explanations etc.; and a
set of links connecting each node to the program's next actions
(presentations). The links embody a detailed specification of the flow of
control, as determined by student responses at the originating node or by some
other, prespecified branching mechanism. The course author, perhaps assisted by
an authoring program, must explicitly enter all of the nodes and specify all of
the links (student responses and tutorial reactions). A session then consists
of one of the possible paths through the graph. Such a scheme is satisfactory
if the student only needs drill and practice, exercises, and an occasional
tutorial. But it puts a great burden on the designer to anticipate all
contingencies, a burden that is difficult to bear for rich, nonmechanical
activities, like using a foreign language. By contrast, like human teachers,
ITS tries to plan sessions with the student on the fly (Peachey and McCalla
1986):
ITS, then, tries to respond more flexibly than CAI. But ITS also tries to teach
more complex skills, and this has important consequences for how it must be
designed. For simple skills and highly structured, low level tasks, responses
can be predicted and paths constrained. But for more complex skills and more
integrative tasks (e.g. mathematics problem solving, writing, or carrying on a
conversation in a foreign language) students must play a more active role in
the process, experimenting with various aspects of the domain and determining
their own courses of action. Such student behavior may well be impossible to
anticipate in detail. Thus, if the tutor is to be helpful, it must, like human
tutors, be able to solve problems as they arise, i.e. it must have its own
knowledge of the subject matter and be able to put it to use. Moreover, with
complex skills, inferences from what the learner does to what he knows (can do)
and why he does it are not at all straightforward. Expert knowledge of the
subject matter and more complex, sensitive representations of the student are a
second major difference between ITS and CAI.
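The directed-graph caricature of "traditional" CAI described earlier can be
made concrete in a few lines of Python. This is only an illustrative sketch,
not the design of any system named in this paper: the Node class, the COURSE
table, the run_session function and the three-node French drill are all
hypothetical. It shows that every link must be entered by the course author,
that a session is one path through the graph, and that a response the author
did not anticipate simply has nowhere to go.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One presentation in the course: a prompt plus author-specified links."""
    prompt: str
    # Maps each anticipated student response to the name of the next node.
    links: dict = field(default_factory=dict)


# Hypothetical three-node drill; every node and every link is entered by hand.
COURSE = {
    "q1": Node("Translate 'the book':",
               {"le livre": "praise", "la livre": "gender_hint"}),
    "gender_hint": Node("'livre' (book) is masculine. Try again:",
                        {"le livre": "praise"}),
    "praise": Node("Correct!"),
}


def run_session(course, start, responses):
    """Follow one path through the graph and return the nodes visited."""
    path = [start]
    current = start
    for response in responses:
        links = course[current].links
        if response not in links:
            # The author did not anticipate this response: no link exists.
            path.append("<no link>")
            break
        current = links[response]
        path.append(current)
    return path


print(run_session(COURSE, "q1", ["la livre", "le livre"]))
# ['q1', 'gender_hint', 'praise']
print(run_session(COURSE, "q1", ["un livre"]))
# ['q1', '<no link>']
```

The second call illustrates the rigidity at issue: "un livre" is a sensible
partial answer, but because no link was authored for it the program has no
basis for a helpful reaction.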
A third, more subtle difference has to do with the way knowledge of the
subject matter is represented in ITS and CAI. In CAI the knowledge is contained
in the procedures, i.e. the branching rules, which contain the possible
answers to exercises and drills (including the correct one) as conditions and the
CAI responses as actions. Moreover, in the branching rules subject matter
knowledge and teaching knowledge (condition and action) are tied strongly together.
ITS programming structures follow a different strategy which has proved
successful in other AI applications (e.g. expert systems, computer vision, and
natural language processing). First, the teaching and domain knowledge are
separated into different modules, so that each can be independently modified
and so that they can be more flexibly combined as the planning mechanisms
require. Second, within the teaching and domain modules, expert knowledge is
often represented "declaratively", as a set of facts and rules in a knowledge base,
which can be modified independently of the procedures that use them; inference
rules and other computational mechanisms are then provided to access the
knowledge base and put it to use. It is from the representation of expert
knowledge and the separation of teaching and subject matter knowledge that ITS
derives its power, and indeed these features are what make ITS possible.1
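The contrast between fused branching rules and separated, declarative knowledge
can be made concrete with a small sketch. The Python fragment below is an
illustration only, not code from any system discussed here: the toy domain
(English past tense) and all names in it (CAI_BRANCHES, IRREGULAR_PAST,
expert_past_tense, tutor_respond) are assumptions introduced for the example.

```python
# Illustrative sketch only: the domain (English past tense) and every name
# below are hypothetical, not taken from any CAI or ITS system cited here.

# CAI style: each branching rule fuses subject matter (the anticipated
# answer, as condition) with teaching knowledge (the response, as action).
CAI_BRANCHES = {
    ("go", "went"): "Correct. Next item.",
    ("go", "goed"): "No. 'go' is irregular: went.",
}


def cai_respond(verb: str, answer: str) -> str:
    # Any answer the course author did not anticipate gets a stock reply.
    return CAI_BRANCHES.get((verb, answer), "Incorrect. Try again.")


# ITS style, part 1: domain knowledge held declaratively as a fact base.
IRREGULAR_PAST = {"go": "went", "sing": "sang", "think": "thought"}


def expert_past_tense(verb: str) -> str:
    """Domain procedure: consult the fact base, else apply a toy '-ed' rule."""
    return IRREGULAR_PAST.get(verb, verb + "ed")


# ITS style, part 2: teaching knowledge in a separate module that consults
# the expert instead of storing answers itself.
def tutor_respond(verb: str, answer: str) -> str:
    correct = expert_past_tense(verb)
    if answer == correct:
        return "Correct."
    if verb in IRREGULAR_PAST and answer == verb + "ed":
        return f"'{verb}' is irregular; the regular rule does not apply here."
    return f"Not quite; the expected form is '{correct}'."
```

In this sketch, adding a new entry to IRREGULAR_PAST changes what the tutor can
respond to without any change to tutor_respond, whereas the CAI table needs one
new entry for every answer the author wishes to anticipate.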
A final difference between ITS and CAI has to do with the kinds of things
they try to teach and the instructional principles they use to do so. Largely
through the influence of such ITS researchers as John Seely Brown, Richard
Burton and Allan Collins, and the seminal, closely related work of Seymour
Papert, our views of what does and should go on in classrooms or on computers,
and how the learning process should be organized, are being transformed. In
particular (see Pea and Soloway 1987) a fact-oriented, classroom-based and
classroom-managed "transmission view of knowledge", in which the major
pedagogical activity is the presentation of well-structured material to be learned
248 RALPH B. GINSBERG
based ITS STEAMER where learning sessions are largely directed by the learner
and the computer itself does not seem to do anything that could be called
intelligent. But as Papert (1980, Ch.7) and Hollan, Hutchins and Weitzman (1984)
stress, AI is in fact fundamental to the design and construction of both systems.
First of all, the LOGO language itself is a variant of the AI programming
language LISP; STEAMER's graphic editors and inspectable simulations are based
on LISP and AI object-oriented programming techniques. Neither would be
possible without the power of these languages. Secondly, in both STEAMER and
LOGO the interface through which the learner interacts with the computer
draws heavily on the interactive programming environments developed for AI,
and again neither would be possible without it. But most importantly, the
microworlds themselves derive from an analysis of how people reason about
geometry or dynamic systems, an analysis of the knowledge that is to be learned,
and an analysis of the way that knowledge has to be represented if learning is to
be effective. Such essentially cognitive analyses, and the computational models
that implement them, are AI's greatest achievements and promise. The resulting
knowledge representations are built into the fabric of the microworlds: into the
objects the learner can manipulate, the physical set up, her possibilities for
action, the tasks she can perform, the tools available to do them. Microworlds are
often engaging because they are "realistic" and apparently relevant, as are
simulations and "authentic" materials. They are also, like games and other informal
learning environments, fun and self-motivating. But it is the cognitive analysis
leading to an implicit presentation of knowledge by structuring the possibilities
to be explored by the learner (as contrasted with knowledge presented through
explanations or expository texts), not the motivational effects, that distinguishes
the AI based applications from "traditional" approaches to education.
The educational philosophy lying behind the construction of Papert's
mathematics and physics microworlds is one of "discovery" learning, of
learning-by-doing, of giving the learner the opportunity and tools to learn by
himself, rather than trying to teach him. While Papert's arguments are, in my
view, compelling,
simply letting a learner explore a microworld — or a complex simulation, or a
"natural" environment in which learning might take place, for that matter —
without any "guidance" whatever, has several limitations which should be noted:
- learners may not get into fine structure, even if they have mastered the
grosser features;
- learners may not explore the microworld effectively, e.g. cycling in an
incorrect procedure or fixating on complex problems before their simpler
components have been encountered;
- learners may learn slowly, spending a lot of time on irrelevant activities;
- learners may not see the context of the simplifications in the microworld or
the limitations of the specific tasks or setting.
Considerable progress has been made for several domains, notably
mathematics, physics and engineering, where microworlds have been designed. I shall
return to these questions when "increasingly complex microworlds" (Burton,
Brown and Fischer 1984) and the Collins-Brown-Newman (1987) framework are
discussed in section 3.2.
Intelligent tutors as described in sections 1.1 and 1.2 are composed of
seven interdependent "architectural" components which perform the various
functions necessary for teaching. The architecture can also serve as a convenient
way of organizing discussions of the human tutors, coaches, and even classroom
instructors that the ITSs are meant to emulate, since they, of course, must meet
these functions as well. The components are:
- planning and control mechanisms which, on the basis of the current state of
the student model and the domain and teaching knowledge, determine what
to do;
- an environment, i.e. a set of tasks (activities) that the learner is to perform,
and the tools given him to do so; and
- an interface by which the tutor and the student communicate.
ITSs differ in how these components are actually constructed, how they are
interrelated, and what emphasis is given to each in the research and
development process. Still, to build functioning prototypes on a computer, ITS
designers and researchers have had to be very specific about how each is to be
implemented. Although for expository purposes the components are treated as if
they were separate modules, they must to some extent be designed and
evaluated concurrently, since after all they must work together to produce
effective learning. Whether or not it would be possible, or even desirable, to
replicate the work I shall review by building an ITS for foreign language
instruction, it is my contention that considering how the issues have been
addressed by ITS researchers helps define the interesting research and
instructional design issues for the field. In this section I discuss the
components of ITS architecture and review some of the approaches ITS researchers
have taken in each.
algorithms to compute the optimal move for any given position; and SOPHIE-I
(Burton, Brown and De Kleer 1982), which uses a general purpose circuit
simulator, are examples. In language instruction most current parsers would fall
into this class. At the other extreme are "glass box" experts (Goldstein and Papert
1977), elaborate cognitive models and qualitative process models which
represent the domain knowledge and reason about it in the same way that human
beings do, so that describing and observing the expert's problem solving process
would be a useful component of the instruction. Anderson's LISP and Geometry
tutors (Anderson and Reiser 1985; Anderson, Boyle and Yost 1985) and
SOPHIE-III (Brown and Burton 1987) are examples. It is hard to imagine the
analog of these models for foreign language instruction, since a detailed,
"psychologically real" theory of language production and understanding would be
required, but for some narrowly circumscribed tasks computational models might
be possible. In between are expert systems, developed using knowledge
engineering techniques, which represent knowledge in an explicit way but may not
reason about it the way human beings do. (While they may have the same
knowledge base as cognitive models, they manipulate it differently.) A classic
example is GUIDON (Clancey 1986b, 1987), whose expert system consists of a
knowledge base containing facts and relations in its domain (bacterial
infections), knowledge of diagnostic techniques, and a set of procedures (inference
rules and interpretations) which use the knowledge base and techniques to
diagnose diseases.
While it is not crucial how domain problems are solved, there are,
nevertheless, problems that have to be addressed in one way or another if the
domain knowledge is to be useful for tutoring. In order to help students, tutors
have to be able to explain how the correct answer is derived and guide the students
along a path to it. Glass box experts already do this. The evolution of the
GUIDON case method tutors from classical expert system to a more human-like
model (Clancey 1986b) was motivated by just such considerations. Black box
and classical expert systems in ITS have been made more "articulate" by
augmenting them with devices like Burton and Brown's (1982) "issue
recognizers", or Clancey's (1987) "t-rules", which compare student and expert
performance (however generated) and base tutorial actions on differences ("issues")
they are designed to detect. In this regard issue recognizers automate some
aspects of CAI branching, although in order to do so effectively they may need
more information than the match between correct and incorrect answers.
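The idea of basing tutorial actions on differences between student and expert
performance can be sketched in a few lines. The Python fragment below is a
deliberately simplified illustration and not a reconstruction of Burton and
Brown's or Clancey's implementations; the Move class, the skill labels, and the
function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Move:
    """A solution step, tagged with the skills ("issues") it exercises.

    Hypothetical representation, introduced only for this sketch.
    """
    label: str
    skills: frozenset


def recognize_issues(student: Move, expert: Move) -> frozenset:
    # An "issue" here is any skill the expert's move uses and the student's
    # move does not; this is the difference the recognizer is built to detect.
    return expert.skills - student.skills


def tutorial_action(student: Move, expert: Move) -> str:
    missed = recognize_issues(student, expert)
    if not missed:
        return "no intervention"
    return "coach on: " + ", ".join(sorted(missed))
```

Note that the comparison works on the skills each move exercises, however the
expert move was generated, so the recognizer needs more than a right or wrong
match on the final answer.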
Besides articulateness, tutoring imposes other demands on the domain
knowledge component. A good tutor needs some knowledge of how incorrect or
partial answers are generated if it is to help students get past these blocks. Many
tutors represent incorrect knowledge in the form of "false facts", "mal-rules",
and "buggy" procedures, to deal with this problem. In foreign language the
CALLE tutor (Feuerman, Marshall, Newman and Rypa 1985; Xerox 1985),
which uses the LFG processor developed by Kaplan (Kaplan and Bresnan 1982),
has this happy property of being able to account for ungrammatical as well as
grammatical responses. The problem of knowledge representation is further
compounded by the fact that in many domains there are many correct responses
and many ways to reach a given correct response, each containing useful
pedagogical information. A single algorithm or expert system may not be sufficient
to capture this variation; multiple representations and views of the expert
knowledge may be required (Olsson 1986). Thus, although to a large extent domain
knowledge and teaching knowledge are separated in ITS, there are practical
restrictions on the modularity of the system.
2.3 Diagnostics
The diagnostic component of ITS updates the student model, inferring his
knowledge from his responses; or to put it another way, "the student model is a
data structure, and diagnosis is a process that manipulates it" (VanLehn 1988).
Diagnosis is clearly essential if instruction is to be adapted to the needs and
problems presented by particular students as instruction progresses. Moreover,
if teaching strategies are to be adaptive, their assumptions about a student's
knowledge and behavior have to be tested. Thus diagnosis is intimately linked
not only to the student model but to the planner as well. As Olsson (1986)
stresses, the question is not so much "what is in the student's head", as "what do
we need to know in order to teach?" For complex cognitive activities, detailed
student models, and a rich array of teaching tactics, it would be necessary to
know a good deal.
The diagnostic methods being developed in ITS and cognitive science more
generally have great potential payoff for the foreign language field, where work
on assessment and testing has been dominated by considerations of how much a
student knows, and where test results have been used for purposes of statistical
comparison and certification, rather than for the important didactic goals of
determining what a particular student knows at a particular point in time.
Continuing the contrast between diagnosis in ITS and ability testing, for teaching
it is generally not sufficient to know whether or not a student has mastered a
particular skill or gets a question right or wrong: it is equally important to know
exactly what errors he makes and why he might have made them. A further, closely
related difference in the treatment of errors between diagnosis and ability testing
is that in diagnosis most errors are treated as systematic, not random, and thus to
be accounted for by student models, although allowance is made for
performance lapses due to fatigue, boredom, memory failures, distractions, and the
like. Diagnosis, then, differs from testing in method as well as intent.
VanLehn (1988) distinguishes nine types of diagnostic techniques in ITS
based on his typology of student models, but here it will be sufficient to discuss
the three broad classes suggested by Olsson (1986). All support student models
which at least contain a detailed specification of the skills and subskills being
taught. The simplest diagnostic methods relate to overlay models, where the
student's knowledge is described as a subset of expert knowledge, without any
particular attention paid to misinformation or distortions (i.e. "errors"). Overlay
methods are quite similar to those of CAI and ability testing, with the
stipulation that the skills to be "tested" are highly disaggregated and that
mastery of each subskill is evaluated. As a consequence, overlay methods are
practical for use outside the context of ITS. For example Marshall (1980, 1981),
not herself an ITS researcher but drawing on its methods, has developed several
algorithms for choosing problems for presentation in an adaptive testing
framework which enable overlay models to be fitted efficiently.
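A minimal sketch may help fix the idea of an overlay model. The Python class
below is an assumption-laden illustration, not Marshall's algorithms or any
published ITS code: the mastery threshold, the minimum number of trials, and the
naive rule for choosing the next skill to probe are all invented for the example.

```python
class OverlayModel:
    """Student knowledge as a subset of a disaggregated expert skill list.

    Hypothetical sketch: thresholds and selection rule are assumptions.
    """

    def __init__(self, expert_skills, threshold=0.8, min_trials=3):
        self.threshold = threshold
        self.min_trials = min_trials
        # skill -> [number correct, number attempted]
        self.tally = {skill: [0, 0] for skill in expert_skills}

    def record(self, skill, correct):
        counts = self.tally[skill]
        counts[0] += int(correct)
        counts[1] += 1

    def mastered(self):
        # The overlay: those expert skills the student is judged to possess.
        return {
            skill
            for skill, (right, tried) in self.tally.items()
            if tried >= self.min_trials and right / tried >= self.threshold
        }

    def next_skill(self):
        # Naive stand-in for adaptive problem selection: probe the unmastered
        # skill with the fewest observations (ties broken alphabetically).
        open_skills = sorted(set(self.tally) - self.mastered())
        if not open_skills:
            return None
        return min(open_skills, key=lambda s: self.tally[s][1])
```

The model records only which expert skills have been demonstrated; it has no
place for misconceptions, which is exactly the limitation noted above.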
A second class of models focuses on error descriptions, attempting to account
for specific errors at the behavioral level (e.g. specific incorrect answers to a
problem or exercise) by postulating computational mechanisms that could
produce them. Burton's analysis of "bugs" in multicolumn subtraction is an example
(Burton 1982; Brown and Burton 1978). Bugs, i.e. incorrect procedures which
may or may not produce errors depending on the problem, are represented in
the procedural network (described in the previous subsection) along with
correct procedures. Predictions of responses to problems presented are calculated
for all possible correct and buggy procedures and the best fitting model selected.
For reasons discussed in Burton (1982) this is a very difficult task and the
algorithm that implements it is computationally very intensive. Nevertheless, with
enough computing power, Burton's methods are feasible, and interesting
findings have been obtained.
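The logic of selecting the best fitting procedure can be illustrated on a very
small scale. The Python sketch below is not Burton's algorithm, which searches a
far larger space including combinations of bugs; it assumes just two buggy
procedures, written here from their usual informal descriptions, non-negative
operands with the top number at least as large as the bottom one, and a simple
count of matching answers as the measure of fit.

```python
def digits(n, width):
    # Left-pad with zeros so both operands have one digit per column.
    return [int(d) for d in str(n).zfill(width)]


def columns(top, bottom):
    width = len(str(top))
    return list(zip(digits(top, width), digits(bottom, width)))


def correct(top, bottom):
    return top - bottom


def smaller_from_larger(top, bottom):
    # Bug: in every column subtract the smaller digit from the larger one.
    return int("".join(str(abs(t - b)) for t, b in columns(top, bottom)))


def zero_instead_of_borrow(top, bottom):
    # Bug: write 0 in any column where a borrow would be needed.
    return int("".join(str(t - b if t >= b else 0)
                       for t, b in columns(top, bottom)))


PROCEDURES = {
    "correct": correct,
    "smaller_from_larger": smaller_from_larger,
    "zero_instead_of_borrow": zero_instead_of_borrow,
}


def diagnose(observations):
    """observations: list of ((top, bottom), student_answer) pairs."""
    scores = {
        name: sum(proc(t, b) == answer for (t, b), answer in observations)
        for name, proc in PROCEDURES.items()
    }
    best = max(scores, key=scores.get)  # procedure predicting most answers
    return best, scores
```

A problem such as 64 - 31 needs no borrowing, so all three procedures predict
the same answer; only problems that require a borrow discriminate between them,
which is why the choice of problems matters for diagnosis.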
The third type of diagnostic method in ITS is simulation, of which
Anderson's "model tracing" procedure is perhaps the best example. Because they are
so detailed and require such specific data, Anderson's simulations are very
closely tied to the specific subject matter the tutor is teaching, and his methods
are difficult to extrapolate to other contexts. Olsson and Langley (1988) have,
however, suggested a simulation method, implemented in the computer
program DPF (Diagnostic Path Finder), which can be used in many ITS domains,
like language learning, where the extensive data required by model tracing is
unavailable, but where a cognitive model is entertained. Using task analysis (e.g. a
procedural network), selective search, and machine learning techniques, DPF
predicts both the specific behavioral path taken by the student and the strategy
(rules) used to get from each point on the path to the next. Although it is
promising because of its psychological underpinnings, DPF has yet to be used in an
ITS.
2.4 Teaching Knowledge
tinction, however, because in the ITS literature planning points to different, and
equally interesting, directions for empirical research than simply establishing
what works.
The development of planning mechanisms for tutoring is a relatively
neglected area in ITS and tutors which tackle the issues head on are only now
reaching the prototype stage. As Peachey and MaCalla (1986) point out,
however, planning has a long history in AI in connection with determining the
physical actions of real or simulated robots. They also note that planning is more
difficult in ITS than in robotics, because in the latter the plan is designed to
change the state of the physical world, which can be observed; while in ITS the
state that needs to be changed is the student's knowledge, which is generally not
fully observable. Furthermore, students are intelligent, independent actors and
in this regard, unlike the worlds of many robots, not entirely predictable. Recent
work in ITS by Peachey and MaCalla (1986), Macmillan and Sleeman (1987),
Russell (1987), and Murray (1988) promises to bring general planning methods
to a practical stage.
Attention to results on planning in AI and ITS is important for research in
foreign language learning because, as Macmillan and Sleeman have stressed,
planning is fundamental to the way human tutors look at their work. Recent
empirical work by Leinhardt and Greeno (1986) and by Leinhardt and her
colleagues (Leinhardt, forthcoming; Leinhardt and Smith 1985; and Leinhardt,
Weidman and Hammond 1987) has applied AI planning ideas to an analysis of
the difference between novice and expert teachers. The increased richness of
the characterization of teaching behavior which these concepts allow opens up
many new avenues of research. Methods of inferring plans from behavior — the
obverse of planning, called "plan recognition" in AI — like cognitive modelling
and diagnosis, are difficult, time consuming and largely qualitative. Students too
have intentions and plans which govern their interactions with teachers and
computers; these also must be represented and established if student behavior is
to be understood and, from a pedagogical point of view, if they are to be helped.
The success of Johnson and Soloway's PROUST tutor (1987) in inferring the
plans and intentions of beginning programmers in PASCAL from errors in their
code, and the ability of Wilensky's and his students' Unix Consultant (Wilensky,
Arens and Chin 1984; Wilensky et al. 1986) to infer what the user really wants to
know and do from often vague and ambiguous requests, similarly indicate
interesting empirical research directions.
Another relatively neglected area of concern in ITS is the whole question of
curriculum, i.e. the selection and sequencing of topics for instruction. Because of
its research and development orientation and the difficulty of encoding all of the
knowledge required, ITS designers have been concerned for the most part with
The last two components of ITS architecture, the environment and the
interface, comprise the computer as experienced by the learner. The environment
consists of the tasks the student is given and the tools he is given to perform
them. It shades off into the interface, which determines how the student
interacts with the tutor and with the domain. Many of the issues and alternatives in
the design of ITS environments have already been discussed in the section on
computer-based microworlds. Factors listed by Burton (1988), in an insightful
review, cover many of the issues in ITS in general: the knowledge to be learned
(communicated); the appropriate level of abstraction; the fidelity
(verisimilitude), in various respects, with which the knowledge needs to be
represented; sequencing of tasks, and adaptation of tools and props to the stage of
learning; the amount of structure imposed on exploration of the environment by
task definition; and the help provided (assistance in doing parts of the problem,
aiding the learner to reflect on his own performance and skills, coaching etc.).
Interface issues are closely related to teaching tactics and to those aspects of
the implementation of the teaching plan having to do with dialogue structure,
but different psychological and sociological considerations are involved.
Although the interface has received relatively little systematic attention in the ITS
literature (see, however, Hollan, Hutchins and Weitzman 1984; Frye and
Soloway 1986; and Wenger 1987), this is beginning to change as a result of increased
interest in the general issues of interface design in computer science and AI.
(See Miller 1988, for a thorough review of the state of the art and its
implications for ITS.) One such issue, for example, is effective online help, a common
problem for most computer systems and applications. Only a fine line separates
help from coaching and tutoring, since to achieve their immediate goals users
often need to acquire some understanding of what they really want to do, how
the system works, and what their options are. Some very interesting research,
closely allied to ITS, is being carried out in the design of intelligent help systems
(e.g. Wilensky, Arens and Chin 1984; Wilensky et al. 1986; Fischer 1988). Of
particular note is the empirical and theoretical work of Breuker and his
colleagues (Breuker, forthcoming; Winkels and Sandberg 1987; Winkels, Sandberg
and Breuker, forthcoming) focusing on coaching and teaching strategies.
Intelligent help may be feasible where full-blown intelligent tutoring is not because
the limited nature of the domain makes knowledge representation possible, and
because help requires less by way of explanation and actual problem solving than
tutoring. User modelling, planning, and effective explanatory tactics are still
necessary, however. With the advent of CD-ROM and other multimedia
learning environments, interface issues will become very much more severe. Careful
design will be required if these new technologies are to be effective.
There are important considerations having to do with learning which bear
on the design of the interface. In Anderson's tutors the interface is carefully
designed to hold some of the information about the problems being solved on
screen in order to minimize burdens on short term memory and allow students
to concentrate on acquiring effective procedures (Lewis, Milson and Anderson
1987; Anderson, Boyle and Reiser 1985; Anderson, Boyle, Farrell and Reiser
1987). The interface also provides Anderson's model tracing procedures with
the information needed to infer the learner's state of knowledge and
problem-solving strategies. (In this respect it is the functional equivalent of "think aloud"
and other verbal protocols used in cognitive science; see Ericsson and Simon
1984).
Since the interface in ITS (and CAI) determines the final form of the
communication between the program and the user, the dialogue between the two
could be managed at this level. Whether and to what extent the student or the
ITS communicates in natural language then becomes an important issue, as yet
to be resolved (see Burton, Brown and De Kleer 1982, for discussion and
experience with SOPHIE I, II and III). Other interface issues which should be
systematically explored, for CAI as well as ITS and not only for computer-based
instruction, include: the required speed of response of the tutor for various
actions; screen management and the amount and type of information with which
the user must deal; helping the student keep track of where he has been and
where he is going during a session; and the pedagogical use of computer
graphics and other visual aids.
From a research point of view the fundamental questions about foreign
language learning are "what do people know?" and "how do they learn it?"
Relatively concrete answers to these questions are also essential to good teaching
and to the design of tools to support it. As pointed out above, AI researchers
have had to think carefully about learning and effective teaching because they
have had to build explicit rationales and prescriptions into their didactic
knowledge bases, teaching strategies, and planning mechanisms, on the one hand, and
into the design of computer-based microworlds and the environments and
interfaces of ITS, on the other. Much of the empirical analysis of ITS and
microworlds has been directed toward testing these assumptions. The problems of
applying learning theories developed in the substantive domains studied by ITS
(e.g. mathematics, the maintenance and repair of complex equipment, and
programming) to foreign language learning should certainly not be minimized. One
of the hardest learned lessons of ITS, and AI in general, is how domain-specific
knowledge and successful procedures really are. Nevertheless it is encouraging
that for some complex, cognitive skills, theoretical models can be formulated
which make it possible to study learning in some depth and to design successful
learning environments in a principled way.
In this section I review two approaches to the conceptualization of learning
which have guided the development of ITS and which derive from experiences
with it. As cognitive theories, both stand in stark contrast to the behaviorism that
underlies most CAI. The first, exemplified by Anderson's ACT* theory of skill
acquisition and the tutoring principles derived from it, brings a general theory of
learning and cognition to bear on the design of tutors for very specific skills. The
second, exemplified by the cognitive apprenticeship framework of Collins,
Brown and Newman, synthesizes a wide range of experiences with
apprenticeship, microworlds, coaching, and ITS, in the form of a heuristic guide to the
design of learning environments. While these theories are very different in form
and motivation, each in its own way suggests interesting new avenues of research
and development in foreign language learning.
guide to research, it enumerates the main factors that must be taken into
account in empirical studies of what actually works.
The framework is based on a wide ranging and insightful analysis of the
common conditions associated with successful learning in a number of diverse
settings:
tice by making explicit what is largely implicit; coaching, offering help to bring
the student closer to expert practice, and coaching aimed at executable advice
(advice that can be followed); "scaffolding", providing supports for carrying out
simplified but real tasks, and "fading", removing supports as skill develops;
reflective comparison with experts, other students, and ultimately with the
learner's inner cognitive model; and exploration of interesting subtasks and methods
subsumed under a general goal.
3. Sequencing addresses changing learning needs in different phases of the
acquisition process. The main considerations here are the management of
increasing complexity and increasing diversity in the tasks the student is given; and
coordinating instruction of local vs. global skills. The elements of sequencing are
addressed in detail in the earlier ICM framework. They include: maintaining
motivation through success, avoiding the dangers of oversimplification
(unjustified extrapolation and an unwillingness to try new things), structuring the
environment so that progressively less simple but still realistic versions of the
target expert skill are learned, and using task specification to focus attention on
important factors in the microworld.
4. Sociology refers to the social context of learning. Besides the methods of
situating knowledge so that students come to understand its purposes and
conditions of use, the category directs attention to exposure to examples of expert
practice and active communication about expertise; intrinsic motivation in the
tasks; and exploitation of cooperation and competition in the social situation of
learning. It is here that the framework has the greatest bearing on
classroom-based learning.
While the framework directs us to look in particular directions to
characterize good and bad instruction (and effective and ineffective learning), much
work needs to be done in variable specification and measurement in the foreign
language field before the framework can be used in rigorous research. As with
ACT*, experience with learning environments designed specifically according to
these principles will clearly play an important role. It would be an interesting
exercise for language pedagogues to reformulate the principles of successful
language learning in these terms, if for no other reason than to clarify the
similarities and differences between language learning and the learning of other
cognitive and procedural skills.
4 Conclusion
gard to foreign language learning in the sequel. First and foremost the focus of
all of the work that has been reviewed here has been on learning. Designers of
ITSs and microworlds have looked very concretely at what is to be learned, how
it is to be learned, and how learning can be effectively supported. The artifacts
that they have designed have been explicitly motivated by this analysis. By
contrast many applications of advanced technologies to foreign languages have
taken learning issues for granted, implicitly relying on accepted educational
methodologies and other components of the larger system within which the new
technologies are to be embedded to achieve their instructional goals. It is in the
analysis of learning that the main interest of ITS for the foreign language field
lies.
Second, the orientation of AI research in ITS and microworlds is essentially
cognitive, although motivation is not ignored. The primary concern is with the
acquisition of knowledge, and with the cognitive and metacognitive processes,
and complex procedural skills, that put knowledge to use. Models of expert and
student knowledge are at the heart of ITS. Further, as Papert has so cogently
argued, the design of tasks and microworlds, which contain no specific tutorial
intervention but which allow the student to "discover" the relevant knowledge by
his own natural learning devices, is basically an epistemological enterprise.
Work in foreign language learning has put relatively little emphasis on the
precise specification of cognitive mechanisms involved in learning as compared
with other factors (e.g. presumed individual differences in motivation, learning
and cognitive styles). Redressing this imbalance, I shall argue, is essential for the
design of effective foreign language learning.
Third, as in many other problems in AI, the representation of knowledge
turns out to be the key research and design problem. Many would claim that it is
in the area of knowledge representation that AI has had its greatest successes.
Knowledge is represented in ITS not only in the data structures and
manipulations of the expert model, but in the physical structures of its tasks,
interfaces, and microworlds. Both domain knowledge and goal structure need to be
represented. If ITSs are to be built for foreign language learning it is the problems of
knowledge representation that must first be tackled. The knowledge of language
that can be usefully represented will determine the possibilities for ITS in
foreign language learning, and this must be carefully assessed.
Fourth, while one can usefully talk about the elements of ITS architecture
separately, it is striking how interdependent they are and, accordingly, how
important it is to design them concurrently. Teaching must be adapted to the
nature of the knowledge to be learned, and correlatively knowledge representation
must take into account the demands of teaching. Diagnosis updates student
models and at the same time guides the instructional planner: it is concerned
with what we need to know in order to teach. The interface and the environment
(the medium in which communication, interaction, and learning take place)
not only contain information about the domain, they also shape the way
the learner reflects on her knowledge and on her own learning processes. They
must accordingly be designed with learning considerations in mind.
Fifth, although I do not discuss this at all in the paper, it would be an
interesting exercise to stand many of the criticisms of CAI and ITS on their heads and
look at learning in the classroom in the light of the kinds of considerations that
the builders of ITSs have faced. The first question would be the obvious one:
given the knowledge to be communicated, and given what is known about the
way people learn, would classrooms (a teacher, several students, etc.) be
invented in the first place, and if so how would they be structured? Or to put it
another way, for communicating knowledge, what is the comparative advantage of
the educational technology presently in use, and for what kinds of knowledge is
it particularly effective? With regard to what I have been calling learning
environments and diagnosis, one could ask similar questions about materials and
testing in traditional educational settings. (Of course there might be other
reasons to invent classrooms, textbooks, and testing besides their efficacy in
communicating knowledge.) I realize it is hard to treat such questions as anything
but rhetorical or polemical, but they do have a core of scientific content which
debates on education in our rapidly changing society cannot ignore.
Sixth, with regard to empirical research strategies, the sections on student
modelling and diagnosis describe rigorous computational models and methods
of analyzing data that point in very different directions from the statistical
models and methods which have dominated the educational and evaluational
literature on learning in instructional settings. The difference lies not so much in
the formalisms employed, although these are very different indeed, but in the
kinds of questions that are addressed and the goals that are served. Pace: of
course empirical work on foreign language learning needs both.
Finally, I would not want to exaggerate the extent to which any existing ITSs
or microworlds have reached the educational goals that they have set. That is a
difficult empirical question on which there has been lamentably little systematic
research. As I indicated at the outset, however, most ITSs are prototypes whose
primary purpose is research, and it is the results of this research that I have
stressed. Here, the ideas and methods developed have many fruitful and practical
applications to foreign language learning, in ways that will be the subject of
a subsequent paper.
Notes
1. One might with some justification say that the major achievements of AI as a discipline have
been primarily in the area of knowledge representation. On the declarative vs. procedural
representations see Winograd 1975. With reference to ITS, see Clancey's (1986b, 1987)
discussion of the motivation for NEOMYCIN as a basis for GUIDON; and Anderson (1988).
Although a useful one, the declarative/procedural distinction cannot be pushed too far, as
VanLehn (1988b) has cogently argued.
2. To anticipate a little, using insights generated by ITS research, misleading dichotomies, like
learning vs. acquisition, can be reformulated in a more general theory of learning pertinent
to instructional settings.
3. The word is used in many different senses, even by the same author (see Lawler 1987); some
key phrases, overlapping Pea's, are "limited, simplified slices of reality", "worlds with limited
possibilities", "fixed and limited objects, properties, and relations", "problem spaces", and
"task domains along with the tools to operate in them".
4. How systematic errors really are is, of course, an empirical question. Whether errors are
treated as random depends on how important and how hard it is to characterize them, as
well as the social and psychological aspects of assessment method in the domain.
5. Of course, conditioning, reinforcement, choice probabilities, and the rest of the conceptual
repertoire of behavioristic theories of learning are irrelevant altogether.
References
Abelson, H., and A.A. DiSessa. 1980. Turtle Geometry: The Computer as a Medium for Exploring
Mathematics. Cambridge, MA: MIT Press.
Anderson, J.R. 1983. The Architecture of Cognition. Cambridge, MA: Harvard.
Anderson, J.R. 1986. "Knowledge compilation: the general learning mechanism." Machine
Learning (volume 2) ed. by R. Michalski, J. Carbonell and T. Mitchell, 202-217. Palo Alto: Tioga.
Anderson, J.R. 1987. "Skill acquisition: compilation of weak-method problem solutions."
Psychological Review 94.192-210.
Anderson, J.R. 1988. "The expert module." Polson and Richardson 1988.
Anderson, J.R. and B.J. Reiser. 1985. "The LISP tutor." Byte 10.159-175.
Anderson, J.R., C.F. Boyle and B.J. Reiser. 1985. "Intelligent tutoring systems." Science
228.456-458.
Anderson, J.R., C.F. Boyle and G. Yost. 1985. "The GEOMETRY tutor." Proceedings of the
Ninth International Joint Conference on Artificial Intelligence ed. by A. Joshi. Los Altos:
Kaufmann.
Anderson, J.R., C.F. Boyle, A. Corbett and M. Lewis. 1988. Cognitive modelling and intelligent
tutoring. Draft. Pittsburgh: Carnegie Mellon.
Anderson, J.R., C.F. Boyle, R. Farrell and B.J. Reiser. 1987. "Cognitive principles in the design of
computer tutors." Modelling Cognition ed. by P. Morris. New York: Wiley.
Barchan, J. 1987. Language Independent Grammatical Error Reporter.
Barchan, J., B. Woodmansee and M. Yazdani. 1986. "A prolog-based tool for French grammatical
analysis." Instructional Science 15.21-48.
Barchan, J. and J. Wusterman. 1988. A Prolog-based tool for grammatical analysis of Western
European Languages. Research Report. Exeter, UK: Computer Science Department, University
of Exeter.
Berwick, R. 1985. The Acquisition of Syntactic Knowledge. Cambridge, MA: MIT.
Breuker, J. Forthcoming. "Coaching in help systems." To appear in Intelligent Computer-Aided
Instruction ed. by J. Self. London: Chapman and Hall.
Brown, J.S., and J. Greeno, chairmen. 1984. "Report of the research briefing panel on information
technology in precollege education." Research Briefings 1984. Washington, DC: National
Academy Press.
Brown, J.S. and R.R. Burton. 1978. "Diagnostic models for procedural bugs in basic mathematical
skills." Cognitive Science 2.155-192.
Brown, J.S. and R.R. Burton. 1987. "Reactive learning environments for teaching electronic
troubleshooting." Advances in Man-Machine Systems 3.65-98.
Brown, J.S., A. Collins and P. Duguid. 1988. Cognitive apprenticeship, situated cognition and social
interaction. ( = Institute for Research on Learning Report, 8.) Palo Alto: Tioga.
Brown, J.S., T.P. Moran and M.D. Williams. 1982. The semantics of procedures: a cognitive basis
for maintenance training competency. Palo Alto: Xerox Corporation CIS Working Paper.
Brown, J.S. and K. VanLehn. 1982. "Repair theory: a generative theory of bugs in procedural
skills." Cognitive Science 4.379-426.
Burton, R.R. 1982. "Diagnosing bugs in a simple procedural skill." Sleeman and Brown 1982.157-183.
Burton, R.R. 1988. "The environment module of intelligent tutoring systems." Polson and
Richardson 1988.
Burton, R.R. and J.S. Brown. 1982. "An investigation of computer coaching for informal learning
activities." Sleeman and Brown 1982.79-98.
Burton, R.R., J.S. Brown and G. Fischer. 1984. "Skiing as a model of instruction." Everyday
Cognition ed. by B. Rogoff and J. Lave, 139-150. Cambridge: Harvard.
Burton, R.R., J.S. Brown and J. De Kleer. 1982. "Pedagogical, natural language and knowledge
engineering techniques in SOPHIE I, II, and III." Sleeman and Brown 1982.227-282.
Carbonell, J.R. 1970. "AI in CAI: an artificial intelligence approach to computer-assisted
instruction." IEEE Transactions on Man-Machine Systems 11.190-202.
Carr, B. and I.P. Goldstein. 1977. Overlays: a theory of modeling for computer-aided instruction. ( =
Artificial Intelligence Memo, 406.) Cambridge, MA: MIT Press.
Cerri, S. and J. Breuker. 1981. "A rather intelligent language teacher." Studies in Language
Learning 3.182-192.
Clancey, W.J. 1982. "Tutoring rules for generating case method dialog." Sleeman and Brown
1982.201-225.
Clancey, W.J. 1984. "Methodology for building an intelligent tutoring system." Models and Tactics
in Cognitive Science ed. by W. Kintsch, J. Miller and P. Polson. Hillsdale, NJ: Lawrence
Erlbaum.
Clancey, W.J. 1986a. "Qualitative student models." Annual Review of Computer Science. Palo
Alto: Annual Reviews.
Clancey, W.J. 1986b. "From GUIDON to NEOMYCIN and HERACLES in twenty short lessons
(ONR Final Report 1979-1985)." AI Magazine 7.40-60.
Clancey, W.J. 1987. Knowledge-based Tutoring: the GUIDON Program. Cambridge, MA: MIT.
Clancey, W.J. 1988. "The knowledge engineer as student: metacognitive bases for asking good
questions." Mandl and Lesgold 1988.
Collins, A. and J.S. Brown. 1988. "The computer as a tool for learning through reflection." Mandl
and Lesgold 1988.
Collins, A. and M. Grignetti. 1975. Intelligent CAI ( = BBN Report, 3181.) Cambridge: Bolt
Beranek and Newman.
Collins, A. and A.L. Stevens. 1978. "Goals and strategies of inquiry teachers." Advances in
Instructional Psychology ed. by R. Glaser, 65-119. Hillsdale, NJ: Lawrence Erlbaum.
Collins, A. and A.L. Stevens. 1983. "Cognitive theory of interactive teaching." Instructional Design
Theories and Models: An Overview of their Current Status ed. by C.M. Reigeluth. Hillsdale,
NJ: Lawrence Erlbaum.
Collins, A., J.S. Brown and S.E. Newman. 1987. "Cognitive apprenticeship: teaching the craft of
reading, writing, and mathematics." Cognition and Instruction: Issues and Agendas ed. by
L.B. Resnick. Hillsdale, NJ: Lawrence Erlbaum.
Dede, C.J., P.P. Zodhiates and C.L. Thompson. 1985. Intelligent computer-assisted instruction: a
review and assessment of ICAI research and its potential for education. Cambridge, MA:
Educational Technology Center, Harvard University.
Feuerman, K., C. Marshall, D. Newman and M. Rypa. 1987. The CALLE Project. Technical
Report. Pasadena: Xerox Corporation.
Fischer, G. 1988. "Enhancing incremental learning processes with knowledge-based systems."
Mandl and Lesgold 1988.
Frye, D. and E. Soloway. 1986. Interface design: a neglected issue in educational software. New
Haven: Department of Computer Science, Yale University.
Geesey, R., R. Ginsberg, J. Lancaster, E. Manukian and L. Reeker. 1989. Learning environments
for scientific and technical competency. Unpublished manuscript.
Goldstein, I.P. 1982. "The genetic graph: a representation for the evolution of procedural
knowledge." Sleeman and Brown 1982.51-77.
Goldstein, I.P. and S. Papert. 1977. "Artificial intelligence, language, and the study of knowledge."
Cognitive Science 1.1-21.
Halff, H.M. 1988. "Curriculum and instruction in automated tutors." Polson and Richardson 1988.
Haugeland, . 1984. "First among equals." Models and Tactics in Cognitive Science ed. by W.
Kintsch, J. Miller and P. Polson. Hillsdale, NJ: Lawrence Erlbaum.
Hollan, J.D., E.L. Hutchins and L.M. Weitzman. 1984. "STEAMER: an interactive inspectable
simulation-based training system." AI Magazine 5/2.15-28.
IRL. 1988. The Advancement of Learning. Palo Alto: Institute for Research on Learning.
Johnson, W.B. 1988. "Pragmatic considerations in research, development, and implementation of
intelligent tutoring systems." Polson and Richardson 1988.
Johnson, W.L. and E. Soloway. 1987. "PROUST: an automatic debugger for Pascal programs."
Kearsley 1987.
Kaplan, R.M. and J. Bresnan. 1982. "Lexical-functional grammar: a formal system for grammatical
representation." The Mental Representation of Grammatical Relations ed. by J. Bresnan.
Cambridge, MA: MIT Press.
Kearsley, G.P., ed. 1987. Artificial Intelligence and Instruction: Applications and Methods. Reading:
Addison-Wesley.
Klahr, D., P. Langley and R. Neches, eds. 1987. Production System Models of Learning and
Development. Cambridge, MA: MIT Press.
Lave, J. In preparation. Tailored learning: apprenticeship and everyday practice among craftsmen in
West Africa. Stanford: IRL.
Lawler, R.W. 1987. "Learning environments: now, then, and someday." Lawler and Yazdani
1987.1-25.
Lawler, R.W. and M. Yazdani, eds. 1987. Learning Environments and Tutoring Systems ( = Artificial
Intelligence in Education, 1.) Norwood, NJ: Ablex.
Leinhardt, G. Forthcoming. "Math lessons: a contrast of novice and expert competence." To
appear in Journal of Research in Mathematics Education.
Leinhardt, G. and J.G. Greeno. 1986. "The cognitive skill of teaching." Journal of Educational
Psychology 78.75-95.
Leinhardt, G. and D.A. Smith. 1985. "Expertise in mathematics instruction: subject matter
knowledge." Journal of Educational Psychology 77.247-271.
Leinhardt, G., C. Weidman and K.M. Hammond. 1987. "Introduction and integration of
classroom routines by expert teachers." Curriculum Inquiry 17.135-176.
Lesgold, A. 1988. "Toward a theory of curriculum for use in designing intelligent instructional
systems." Mandl and Lesgold 1988.
Lewis, M.W., R. Milson and J.R. Anderson. 1987. "The TEACHER'S APPRENTICE: designing
an intelligent authoring system for high school mathematics." Kearsley 1987.
Macmillan, S.A. and D.H. Sleeman. 1987. "An architecture for a self-improving instructional
planner for intelligent tutoring systems." Computational Intelligence 3.17-27.
Mandl, H. and A. Lesgold, eds. 1988. Learning Issues for Intelligent Tutoring Systems. New York:
Springer Verlag.
Marshall, S.P. 1980. "Procedural networks and production systems in adaptive diagnosis."
Instructional Science 9.129-143.
Marshall, S.P. 1981. "Sequential item selection: optimal and heuristic policies." Journal of
Mathematical Psychology 23.134-152.
McArthur, D. 1986. "Developing computer tools to support performing and learning complex
cognitive skills." Applications of Cognitive Psychology: Problem Solving, Education and
Computing ed. by K. Pezdek, D. Berger and B. Banks, 183-200. Hillsdale, NJ: Lawrence Erlbaum.
McArthur, D., C. Stasz and J.Y. Hotta. 1987. Learning problem-solving skills in algebra. Santa
Monica: Rand Corporation Note.
Miller, J.R. 1988. "The role of human-computer interaction in intelligent tutoring systems." Polson
and Richardson 1988.
Morgenstern, D. 1986. "The Athena language project." Hispania 69.740-745.
Murray, J.H., D. Morgenstern, and G. Furstenberg. 1987. The Athena language learning project:
design issues for the next generation of language learning tools. Draft. MIT Press.
Murray, W.R. 1988. Personal communication.
Olsson, S. 1986. "Some principles of intelligent tutoring." Instructional Science 14.293-326.
Olsson, S. and P. Langley. 1988. "Psychological evaluation of path hypotheses in cognitive
diagnosis." Mandl and Lesgold 1988.
O'Shea, T. and R. Bornat. 1987. A five component model for computer-based training. Unpublished
manuscript.
O'Shea, T., R. Bornat, B. du Boulay, M. Eisenstadt and I. Page. 1984. "Tools for creating
intelligent computer tutors." Artificial and Human Intelligence ed. by A. Elithorn and R. Banerji,
181-199. Amsterdam: North Holland.
O'Shea, T. and J.A. Self. 1983. Learning and Teaching with Computers. Englewood Cliffs, NJ:
Prentice-Hall.
Palinscar, A.S. and A.L. Brown. 1984. "Reciprocal teaching of comprehension-fostering and
monitoring activities." Cognition and Instruction 1.117-175.
Papert, S. 1980. Mindstorms: Children, Computers, and Powerful Ideas. New York: Basic Books.
Papert, S. 1986. Rethinking mathematics learnability in a computer culture. Unpublished lecture,
Stanford University.
Park, O-C, R.S. Perez, and R.J. Seidel. 1987. "Intelligent CAI: old wine in new bottles, or a new
vintage?" Kearsley 1987.
Pea, R.D. 1987. "Integrating human and computer intelligence." Pea and Sheingold 1987.128-146.
Pea, R.D. and K. Sheingold, eds. 1987. Mirrors of Minds: Patterns of Experience in Educational
Computing. Norwood, NJ: Ablex.
Pea, R.D. and E. Soloway. 1987. Mechanisms for facilitating a vital and dynamic education system:
fundamental roles for education science and technology. Final Report for OTA, US Congress.
Peachey, D.R. and G.I. McCalla. 1986. "Using planning techniques in intelligent tutoring
systems." International Journal of Man-Machine Studies 24.77-98.
Polson, M.C. and J.J. Richardson, eds. 1988. Foundations of Intelligent Tutoring Systems. Hillsdale,
NJ: Lawrence Erlbaum.
Psotka, J., L.D. Massey and S.A. Mutter, eds. 1988. Intelligent Tutoring Systems: Lessons Learned.
Hillsdale, NJ: Lawrence Erlbaum.
Russell, D.M. 1987. "The instructional design environment: Interpreter." Psotka, Massey and
Mutter 1988.
Scardamalia, M. and C. Bereiter. 1983. "Child as co-investigator: helping children gain insight into
their own mental processes." Learning and Motivation in the Classroom ed. by S.G. Paris, G.
Olson and H. Stevenson, 61-82. Hillsdale, NJ: Lawrence Erlbaum.
Scardamalia, M. and C. Bereiter. 1985. "Fostering the development of self-regulation in children's
knowledge processing." Teaching and Learning Skills: Research and Open Questions ed. by S.
Chipman, J.W. Segal and R. Glaser. Hillsdale, NJ: Lawrence Erlbaum.
Schank, R.C. 1984. The Cognitive Computer. Reading, MA: Addison-Wesley.
Schoenfeld, A.H. 1983. Problem solving in the mathematics curriculum: a report, recommendations
and an annotated bibliography. ( = MAA Notes, 1.) The Mathematical Association of America.
Schoenfeld, A.H. 1985. Mathematical Problem Solving. New York: Academic Press.
Sleeman, D. and J.S. Brown, eds. 1982. Intelligent Tutoring Systems. London: Academic Press.
Suppes, P., ed. 1981. University-Level Computer-Assisted Instruction at Stanford: 1968-1980. Palo
Alto: Stanford University, Institute for Mathematical Studies in the Social Sciences.
Uren, J. and M. Yazdani. 1988. Spanish LINGER. Research Report. Exeter, UK: University of
Exeter, Computer Science Department.
VanLehn, K. 1985a. Acquiring procedural skills from lesson sequences. ( = Technical Report ISL,
9.) Palo Alto: Xerox Corporation.
VanLehn, K. 1985b. Learning one subprocedure per lesson. ( = Technical Report ISL, 10.) Palo
Alto: Xerox Corporation.
VanLehn, K. 1988a. "Toward a theory of impasse-driven learning." Mandl and Lesgold 1988.
VanLehn, K. 1988b. "Student modeling." Polson and Richardson 1988.
VanLehn, K. and J.S. Brown. 1980. "Planning nets: a representation for formalizing analogies and
semantic models for procedural skills." Aptitude, Learning and Instruction, vol. 2: Cognitive
Process Analysis of Learning and Problem Solving ed. by R.E. Snow, P.A. Federico and W.E.
Montague. Hillsdale, NJ: Lawrence Erlbaum.
VanLehn, K., J.S. Brown and J.G. Greeno. 1984. "Competitive argumentation in computational
theories of cognition." Models and Tactics in Cognitive Science ed. by W. Kintsch, J. Miller,
and P. Polson. Hillsdale, NJ: Lawrence Erlbaum.
Wenger, E. 1987. Artificial Intelligence and Tutoring Systems: Computational and Cognitive
Approaches to the Communication of Knowledge. Los Altos: Kaufmann.
Westcourt, K., M. Beard and L. Gould. 1977. "Knowledge-based adaptive curriculum sequencing
for CAI: application of a network representation." Proceedings of the National ACM
Conference, Seattle, Washington, 234-240. New York: ACM.
Westcourt, K., M. Beard and A. Barr. 1981. "Curriculum information networks for CAI: research
on testing and evaluation by simulation." Suppes 1981.817-839.
Wexler, K. and P. Culicover. 1980. Formal Principles of Language Acquisition. Cambridge, MA:
MIT Press.
Wilensky, R. 1986. Common LISPcraft. New York: Norton.
Wilensky, R., Y. Arens and D. Chin. 1984. "Talking to UNIX in English: an overview of UC."
Communications of the ACM 27.574-593.
Wilensky, R., et al. 1986. "UC — a progress report." Berkeley: Computer Science Division, Univer-
sity of California, Report no. UCB/CSD 87/303.
Winkels, R. and J. Sandberg. 1987. The EUROHELP coach ( = Memo 94 of the VF Project.)
Amsterdam: Department of Social Science Informatics, University of Amsterdam.
Winkels, R., J. Sandberg and J. Breuker. 1986. Coaching strategies and tactics for IHSs ( = Memo
78 of the VF Project.) Amsterdam: Department of Social Science Informatics, University of
Amsterdam.
Winograd, T. 1975. "Frame representations and the declarative/procedural controversy."
Representation and Understanding: Studies in Cognitive Science ed. by D.G. Bobrow and A.M.
Collins, 185-210. New York: Academic Press.
Winograd, T. and F. Flores. 1985. Understanding Computers and Cognition: A New Foundation for
Design. Norwood: Ablex.
Xerox. 1985. CALLE Project Final Report. Pasadena: Xerox Special Information Systems, Vista
Laboratory.
Yazdani, M., ed. 1984. New Horizons in Educational Computing. New York: Wiley.
Yazdani, M. 1986. "Intelligent tutoring systems: an overview." Expert Systems 3.154-162.
Yazdani, M. 1988. Language tutoring with Prolog. Draft. Exeter, UK: University of Exeter,
Department of Computer Science.