
Key concepts in ELT

Corpus-aided language learning


Li-Shih Huang

A corpus is a large collection or database of machine-readable texts
involving natural discourse in diverse contexts (Bernardini 2000). Such
discourses can be spoken, written, computer-mediated, spontaneous, or
scripted and may represent a variety of genres (for example everyday
conversations, lectures, seminars, meetings, radio and television

Downloaded from http://eltj.oxfordjournals.org/ at :: on December 3, 2013


programmes, and essays). Some readily available corpora include the
British National Corpus (BNC, http://www.natcorp.ox.ac.uk), which
contains 100 million words from written and spoken language in a variety of
contexts, the Michigan Corpus of Academic Spoken English (MICASE,
http://micase.elicorpora.info), which features 1.8 million words of
speech in various academic contexts, and the Corpus of Contemporary
American English (COCA), with 410 million words
(http://www.americancorpus.org).1
Although corpus linguistics (i.e. computer-assisted analysis
techniques for studying texts) is a young specialization, its usefulness
in teaching and learning has received growing attention and
recognition (for example Hunston 2002; Sinclair 2004; Conrad 2005;
O'Keeffe, McCarthy, and Carter 2007; Bennett 2010; Reppen 2010). In
particular, researchers have identified corpus data as resources that provide
descriptive insights relevant to how people use language and as tools that
enable students and instructors to analyse both how people use different
language forms at various levels of formality and how language fulfills
multiple speech functions across contexts. Corpus data suggest that
individuals often do not use language as specified in grammar books and
that word meanings vary across contexts and users (Biber and Reppen
2002).
Over the past ten years, a growing number of studies have shown how
learners can use corpus data to further their language learning (see
Hunston op.cit.; Boulton 2010). Numerous corpus linguists (for example
Gavioli and Aston 2001) have pointed out that learning activities centred on
analysing corpus data are consistent with current principles of language-learning
theory; that is, students develop more autonomy when they receive
guidance about how to observe language and make generalizations. Such
activities promote noticing and grammatical consciousness raising
(Schmidt 1990), which can enhance second language learning and
development. Despite the growing interest in corpora and corpus-aided
learning, however, many teachers believe that incorporating corpora into
their teaching would be too technically challenging or time consuming

ELT Journal Volume 65/4 October 2011; doi:10.1093/elt/ccr031

© The Author 2011. Published by Oxford University Press; all rights reserved.
Advance Access publication May 5, 2011
(Boulton 2010). Yet, while some researchers have suggested substantial
training is necessary (for example Estling Vannestål and Lindquist 2007),
others have provided evidence that only a minimal amount of training is
needed (for example Boulton 2008). Some have also recommended using
paper-based materials generated from corpora as a viable alternative to
accessing corpora via computers (Boulton 2010).
A key pedagogical approach for using corpora in language teaching and
learning is data-driven learning (DDL), which emerged in the mid-1980s.
DDL was defined as 'the use in the classroom of computer generated
concordances to get students to explore the regularities of patterning in the
target language, and the development of activities and exercises based on
concordance output' (Johns and King 1991: iii). As Johns (1994: 297) stated,
'what distinguishes the DDL approach is the attempt to cut out the
middleman as far as possible and to give direct access to the data so that the
learner can take part in building up his or her own profiles of meaning and
uses'. Furthermore, corpus data '[offer] a unique resource for the
stimulation of inductive learning strategies – in particular the strategies of
perceiving similarities and differences and of hypothesis formation and
testing' (ibid.). By extension, the corpus-aided discovery learning (CADL)
approach entails encouraging learners to take the role of language
researchers by systematically engaging in discovery learning (Gavioli 2000)
and in learning how to learn through observations, analyses, interpretations,
and presentations of language-use patterns in corpus data. In the CADL
approach, learning about language use is driven by a process of enquiry that
works toward understanding or problem solving, and corpora are used as
mediational tools (Vygotsky 1978) rather than as the basis for language
teaching and learning. Furthermore, instructors adhering to the CADL
approach play a critical role in facilitating or guiding the process of
discovery, which depends on the learners' needs, stages of learning, and
levels of proficiency.
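
To make the idea of a computer-generated concordance concrete, the following is a minimal keyword-in-context (KWIC) sketch. It is an illustration only, not the software used in any of the studies cited: the function name `kwic`, the `width` parameter, and the four-sentence sample text are all assumptions introduced here, and the sample stands in for a real corpus.

```python
import re


def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per whole-word match of `keyword`.

    `kwic` and `width` are hypothetical names chosen for this sketch.
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        # Take up to `width` characters of context on each side of the match.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Pad both sides so the keyword lines up in a single column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right:<{width}}")
    return lines


# Made-up sample text standing in for a real corpus.
sample = (
    "I depend on my notes. Results depend on the context. "
    "It depends on who you ask. We cannot depend upon luck."
)

for line in kwic(sample, "depend"):
    print(line)
```

Because the keyword is aligned in one column, learners can scan what follows it (here, "on" versus "upon") and form their own hypotheses about the pattern, which is the kind of observation the DDL and CADL approaches aim to encourage.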
Researchers have generally agreed that corpus data enrich our
understanding of language use and are an important resource for language
teaching and learning. The use of corpora in language teaching is not
without controversies, however. Among the debates featured in Seidlhofer
(2003), for example, some scholars have advocated using real examples
only in the classroom (for example Sinclair 1997), while others, in contrast,
wonder whether the discourse in corpora, taken out of its original context,
can still be considered authentic, real, or natural, thereby questioning the
efficacy of analysing displaced language that may not be relevant to learners'
linguistic and sociocultural contexts. In response to Widdowson's (1998)
remark that corpora may provide samples of genuine language produced by
language users with real communication goals but do not necessarily
guarantee that learners can participate in discourse in ways that lead to
learning, researchers such as Gavioli and Aston (op.cit.) note that learners
can still authenticate language samples by adopting an observer's role to
critically analyse the data, which will raise their awareness of lexical,
grammatical, and textual issues as they restructure their views about
language use in real situations. Similarly, Carter (1998: 501) argues that
while real English from corpora can be unrealistic for classroom
instruction and thus modified language used in the classroom that is based
on learners' needs and levels might be more pedagogically viable and
realistic, learners should be provided with opportunities to develop a feel
for the language through corpus data. The validity of analysing corpora to
capture language use across seemingly limitless contexts or to describe the
workings of real English around the world has also been questioned. Some
scholars point out that communicative contexts are not restricted to native
speaker discourse, and, as such, language teaching should not be based
simply on descriptive facts generated from largely native speaker-oriented
corpora (Prodromou 1996).2
Despite these debates, technological advancements have undoubtedly
enhanced language learners' and instructors' access to corpora, and the
plethora of articles and books written for language-teaching researchers and
practitioners published during the past five years suggests that attention to
and interest in using corpora for teaching and learning purposes will
continue for the foreseeable future.

Notes
1 For more examples, visit http://corpus.byu.edu and International Corpus of English: http://ice-corpora.net/ice.
2 The Vienna-Oxford International Corpus of English (VOICE) (http://www.univie.ac.at/voice) is one such corpus that collects English spoken by non-native language users in various contexts. VOICE comprises one million words of naturally occurring, non-scripted, face-to-face interactions by over 1,200 speakers with 50 different first languages.

References
Bennett, G. 2010. Using Corpora in the Language Learning Classroom. Ann Arbor, MI: Michigan University Press.
Bernardini, S. 2000. 'Systematising serendipity: proposals for concordancing large corpora with language learners' in L. Burnard and T. McEnery (eds.). Rethinking Language Pedagogy from a Corpus Perspective. Frankfurt am Main: Peter Lang.
Biber, D. and R. Reppen. 2002. 'What does frequency have to do with grammar teaching?' Studies in Second Language Acquisition 24: 199–208.
Boulton, A. 2008. 'Looking for empirical evidence for DDL at lower levels' in B. Lewandowska-Tomaszczyk (ed.). Corpus Linguistics, Computer Tools, and Applications: State of the Art. Frankfurt am Main: Peter Lang.
Boulton, A. 2010. 'Data-driven learning: taking the computer out of the equation'. Language Learning 60/3: 534–572.
Carter, R. 1998. 'Orders of reality: CANCODE, communications, and culture'. ELT Journal 52/1: 43–56.
Conrad, S. 2005. 'Corpus linguistics and L2 teaching' in E. Hinkel (ed.). Handbook of Research in Second Language Teaching and Learning. Mahwah, NJ: Lawrence Erlbaum Associates.
Estling Vannestål, M. and H. Lindquist. 2007. 'Learning English grammar with a corpus: experimenting with concordancing in a university grammar course'. ReCALL 19/3: 329–50.
Gavioli, L. 2000. 'The learner as researcher: introducing corpus concordancing in the classroom' in G. Aston (ed.). Learning with Corpora. Houston, TX: Athelstan/Bologna: CLUEB.
Gavioli, L. and G. Aston. 2001. 'Enriching reality: language corpora in language pedagogy'. ELT Journal 55/3: 238–46.
Hunston, S. 2002. Corpora in Applied Linguistics. Cambridge: Cambridge University Press.
Johns, T. 1994. 'From printout to handout: grammar and vocabulary teaching in the context of data-driven learning' in T. Odlin (ed.). Perspectives on Pedagogical Grammar. Cambridge: Cambridge University Press.
Johns, T. and P. King. (eds.). 1991. 'Classroom concordancing'. English Language Research Journal 4: 27–45.
O'Keeffe, A., M. McCarthy, and R. Carter. 2007. From Corpus to Classroom: Language Use and Language Teaching. Cambridge: Cambridge University Press.
Prodromou, L. 1996. 'Correspondence'. ELT Journal 50/1: 88–9.
Reppen, R. 2010. Using Corpora in the Language Classroom. Cambridge: Cambridge University Press.
Schmidt, R. 1990. 'The role of consciousness in second language learning'. Applied Linguistics 11/2: 129–58.
Seidlhofer, B. 2003. Controversies in Applied Linguistics. Oxford: Oxford University Press.
Sinclair, J. 1997. 'Corpus evidence in language description' in A. Wichmann, S. Fligelstone, T. McEnery, and G. Knowles (eds.). Teaching and Language Corpora. New York, NY: Longman.
Sinclair, J. 2004. How to Use Corpora in Language Teaching. Amsterdam: John Benjamins Publishing Company.
Vygotsky, L. S. 1978. Mind in Society: The Development of Higher Psychological Processes. Cambridge, MA: Harvard University Press.
Widdowson, H. G. 1998. 'Context, community and authentic language'. TESOL Quarterly 32/4: 705–16.

The author
Li-Shih Huang is an Associate Professor of Applied Linguistics and Learning and Teaching Centre Scholar-in-Residence at the University of Victoria, Canada. Her current research examines academic language learning needs and outcomes assessment, corpus-aided discovery learning, and learner strategies in language learning and language testing contexts.
Email: lshuang@uvic.ca