doi:10.1017/S0272263115000212
Yucel Yilmaz
Indiana University
decade (Goo & Mackey, 2013; Li, 2010; Long, 2007; Lyster & Ranta, 2013;
Lyster & Saito, 2010; Mackey & Goo, 2007; Russell & Spada, 2006). Previous research on corrective feedback has concentrated on the relative
effects of corrective feedback on noticing and/or second language (L2)
development. Recently, researchers have started to investigate the role
of a wide array of factors in moderating the effectiveness of corrective
feedback. Studies contributing to this line of research have focused on
cognitive (e.g., working memory), affective (e.g., anxiety), and task-related
(e.g., contextual support) factors. Another factor that can be expected
to moderate feedback effectiveness is exposure condition. Previous
research has investigated the effectiveness of corrective feedback in
contexts in which learners experience corrective feedback by directly
receiving the feedback on producing inaccurate utterances. However,
this is not the only condition under which learners experience feedback. In classroom settings, learners also experience feedback indirectly by hearing the feedback that is provided to other learners. It
could even be the case that, in a classroom setting, learners hear the
feedback provided to other learners more often than they receive feedback addressed directly to them. Despite this variability in exposure
condition, no studies so far have investigated whether learners who
directly receive feedback on producing an inaccurate utterance benefit
more than learners who are allowed to hear the feedback provided to
other learners.
Exploring this issue has important implications for corrective feedback research and practice. Research on this topic could help maximize
the benefits of feedback in pedagogical practice by providing information about which exposure condition (direct, indirect, or both) should
be promoted for designing various classroom activities. In addition, this
research could contribute to the theory of corrective feedback by shedding light on the cognitive processes learners go through when decoding corrective feedback. Finally, the results of this research could inform
the corrective feedback research methodology as to whether feedback
should be categorized and reported separately depending on exposure
condition.
BACKGROUND LITERATURE
Corrective Feedback in L2 Acquisition
Although there is no dispute over the contribution of positive evidence
(i.e., targetlike exemplars of language) to L2 acquisition, there is disagreement over the role of negative evidence (i.e., information that
indicates what is not targetlike in the L2). A group of SLA researchers
(e.g., Krashen, 1981; Schwartz, 1993) has argued (with some variation
Method
The present study followed a randomized experimental design with an
immediate and delayed posttest (see Figure 1). A pretest was not administered because it was possible to assume that learners had zero knowledge of the target forms, as they were absolute beginners with no previous
exposure to or knowledge of the target language (i.e., Turkish). Prior to
the beginning of the experiment, all learners were asked to study key
vocabulary using an instructional Web site to ensure that they had
enough vocabulary to begin the experiment. Learners were randomly
assigned to one of three groups: receivers, nonreceivers, and control.3
Participants
Participants were recruited at a large university in the midwestern
United States. The study was advertised in two ways: (a) via an electronic advertisement using the online classified ads service of the university and various university-related listservs and (b) via flyers posted
on bulletin boards around campus. Two hundred and twenty participants volunteered for the study on the basis of the following criteria:
(a) being a native speaker of Mandarin Chinese, (b) not having been
Figure 1. Study design. The shaded parts indicate that paired participants from these groups were present during each other's task
performance.
exposed to Turkish previously, and (c) not having taken any linguistics
courses. The researcher explained that they had to learn 39 Turkish
words to qualify for the study and provided them with a link to the instructional module for the vocabulary learning activities (see the Tasks and
Materials section for a description of the preexperimental stage materials). He asked them to contact him again when they had studied the
words and had passed an online vocabulary test by scoring 95%. Forty-two participants (38 females and six males) contacted the researcher
after passing the test and constituted the final pool of participants. The
reason for targeting a group of participants speaking the same first
language (L1; i.e., Mandarin) was to prevent L1 differences from confounding the results. The target population was required to not have any
previous exposure to Turkish for the following reasons: (a) to represent
the learning processes of learners who are exposed to an L2 for the first
time, (b) to better detect differences among the groups (i.e., learners
Target Structures
Two Turkish structures were selected for the study: the plural morpheme
/-lAr/, and the locative case morpheme /-DA/. These morphemes were
selected because of their predicted low form-meaning salience due to
allomorphy. Structures that have allomorphic variation can be a challenge for learners because the learners cannot rely on their tendency to
look for one-to-one form-meaning relationships to learn the structure
(Andersen, 1984). Turkish, being an agglutinating language, is rich in
inflectional morphology, with most suffixes having phonologically conditioned allomorphs. Vowel harmony and devoicing determine the
allomorphs of the morphemes under study. Vowel harmony specifies
that a native Turkish word should include either exclusively nonfront
(i.e., central and back) vowels /a, ɯ, o, u/ or exclusively front vowels
/e, i, ø, y/. As shown in examples (1) and (2), the plural morpheme
/-lAr/ becomes [-ler] or [-lar] depending on vowel harmony; the choice
between /e/ and /a/ is determined by the preceding stem vowel. It is /e/
after front vowels and /a/ after nonfront vowels.
(1) kemer-ler
belt-PL
belts
(2) tabak-lar
plate-PL
plates
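The harmony rule for the plural can be sketched in a few lines of Python. This is an illustration only, not part of the study's materials: the function name `pluralize`, the orthographic vowel sets, and the assumption that stems are supplied in lowercase Turkish spelling are all hypothetical choices, and the sketch takes the last vowel of the stem to be the "preceding stem vowel."

```python
# Orthographic vowel classes (an assumption for illustration; the text
# describes the classes phonologically rather than orthographically).
FRONT_VOWELS = set("eiöü")
NONFRONT_VOWELS = set("aıou")


def pluralize(stem: str) -> str:
    """Attach [-ler] after a front stem vowel and [-lar] after a nonfront one."""
    vowels = [ch for ch in stem if ch in FRONT_VOWELS or ch in NONFRONT_VOWELS]
    if not vowels:
        raise ValueError(f"no vowel found in stem: {stem!r}")
    # The last vowel of the stem decides the vowel of the suffix.
    return stem + ("ler" if vowels[-1] in FRONT_VOWELS else "lar")


for stem in ("kemer", "tabak"):
    print(stem, "->", pluralize(stem))
```

Applied to the stems in examples (1) and (2), the function returns kemerler and tabaklar.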
The suffix /-DA/ expresses the locative case in Turkish, and it has four
allomorphs, as shown in examples (3)–(6). It becomes [-de], [-da], [-te],
or [-ta] depending on vowel harmony and devoicing. The locative
/-DA/ becomes [-te] or [-ta] after voiceless consonants and [-de] or
[-da] after vowels or voiced consonants. The preceding stem vowel
determines the vowel in the suffix, as in the plural morpheme. The
meaning of the locative case corresponds to the English prepositions
in, on, at, and by.
(3) ev-de
house-LOC
in the house
(4) masa-da
table-LOC
on the table
(5) sepet-te
basket-LOC
in the basket
(6) raf-ta
shelf-LOC
on the shelf
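The interaction of devoicing and vowel harmony in the locative can be sketched in the same way. Again, this is a minimal illustration under stated assumptions rather than anything used in the study: the function name `locative` is hypothetical, stems are assumed to be in lowercase Turkish spelling, and the set of voiceless consonants is supplied here for illustration because the text does not list it.

```python
# Orthographic classes assumed for illustration only.
FRONT_VOWELS = set("eiöü")
NONFRONT_VOWELS = set("aıou")
VOICELESS_CONSONANTS = set("çfhkpsşt")


def locative(stem: str) -> str:
    """Choose among [-de], [-da], [-te], and [-ta] for a given stem."""
    vowels = [ch for ch in stem if ch in FRONT_VOWELS or ch in NONFRONT_VOWELS]
    if not vowels:
        raise ValueError(f"no vowel found in stem: {stem!r}")
    # Devoicing: the suffix consonant is "t" after a voiceless consonant,
    # and "d" after a vowel or a voiced consonant.
    consonant = "t" if stem[-1] in VOICELESS_CONSONANTS else "d"
    # Vowel harmony: the last stem vowel decides between "e" and "a".
    vowel = "e" if vowels[-1] in FRONT_VOWELS else "a"
    return stem + consonant + vowel


for stem in ("ev", "masa", "sepet", "raf"):
    print(stem, "->", locative(stem))
```

For the stems in examples (3)–(6), the sketch yields evde, masada, sepette, and rafta.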
recognition test and an oral production test. The recognition test was
designed to measure learners' knowledge about the pattern behind the
allomorphic variation for each morpheme under conditions favorable
for the use of explicit knowledge; these conditions were as follows:
(a) learners had enough time to plan their responses and (b) learners
could give a correct response by attending to the form (Ellis, 2005). The
test included 48 items: 32 distractors and eight critical items for each
target structure. The items were presented randomly through a Web-based testing tool. Each item included a picture and an incomplete two-word sentence (e.g., adam _____), in which one of the words was always
provided. The learners' task was to fill in the gap by selecting from five
options (e.g., a. masada, b. masata, c. masate, d. none, e. masade). The
options presented the orthographic and phonological forms (through
clickable links) of the noun-allomorph combinations. The correct option
included the correct noun-allomorph combination (e.g., masada), whereas
the incorrect options included incorrect noun-allomorph combinations
(e.g., *masata). The items were balanced with respect to which allomorph
was considered correct.
The oral production test was designed to measure learners' ability to
mark nouns for plurality or location when necessary and to use the
correct allomorphs of the morphemes. The task involved a time limit,
and its primary focus was on message creation. In this task, learners
were asked to describe the location of the object(s) they saw in the
picture. In each version, there were 40 items: 16 critical items, creating
contexts for the production of each target form, and eight distractor items.
The test was administered using a stimulus presentation computer program. The presentation of the items was random and automatic with an
8-s delay between the items. The items were balanced with respect to
which allomorph was considered correct. There were two versions for
each of the tests. One version was used on the immediate posttest and
the other was used in the delayed posttest. The versions of the tasks
were counterbalanced across experimental units.
Feedback Treatment
A native Turkish-speaking research assistant (hereafter referred to as
the experimenter) provided explicit correction to the receivers whenever they omitted marking on nouns for plurality and/or location when
it was necessary to do so, marked the incorrect noun, or used incorrect
allomorphs during the treatment. At no point during the treatment
were learners told that they would receive feedback. The feedback
type choice was motivated by the results of previous studies showing
that explicit correction is an effective form of feedback (Li, 2010).4
The receivers were provided feedback on their own errors during the
tasks. The nonreceivers were allowed to hear the feedback provided to
the receivers. The control group learners neither received feedback on
their own errors nor heard the feedback provided to other learners. To
expose the nonreceivers to feedback, each nonreceiver was matched
with a receiver and was allowed to be present in the same room with the
receiver during the receiver's task performance. All groups were matched
on output opportunities: Each learner interacted with the experimenter
by performing three tasks in which he or she described pictures (see
the Procedures section for more information). This ensured that all
Procedures
The procedures of the study were as follows. The researcher set up
three meetings with the learners who had studied the vocabulary items
and passed the online test. All meetings took place in a research lab.
The learners in the control group met with the experimenter individually, whereas the learners in the receiver and nonreceiver groups were
paired and met with the experimenter together. These pairs were kept
intact until the end of the study. During the receiver-nonreceiver treatment sessions, the receiver and the nonreceiver sat side by side, facing
the experimenter. All sessions started with the administration of the
picture-naming task. Learners had to score 100% on the test to proceed
to the treatment tasks. After passing the test, learners carried out the
treatment tasks with the experimenter. In the first session, learners in
all groups carried out one treatment task with the experimenter. The
study was designed such that the receivers carried out the task before
the nonreceivers to match the experimental groups on opportunities to
apply the knowledge gained from feedback. One week after the first
meeting, the groups met with the experimenter again to carry out two
more treatment tasks.7 In the first of these tasks, the nonreceivers
were the first to carry out the task. In the second task, the learners
who performed the task first were counterbalanced across all receiver-nonreceiver pairs. At the end of the session, the learners took the immediate posttest individually in the presence of the experimenter. Two
weeks after the second session, the learners met again with the experimenter alone and took the delayed posttest. The meetings of the learners
in the receiver-nonreceiver pairs with the experimenter were on the
same day and immediately followed each other. In each posttest, the
oral production test was administered before the recognition test. Next, all
participants responded to a background questionnaire, and the receivers
and nonreceivers responded to three manipulation check statements
by rating them on a scale from 1 to 9. The manipulation check statements
were (1) "When my partner was doing the activity with the researcher, I was
paying attention to their conversation"; (2) "I noticed that the experimenter
corrected my errors"; and (3) "I noticed that the experimenter corrected my
partner's errors." Learners' performance in all treatment tasks and oral
production tests was audio recorded.
and 95.1% for the plural. Disagreements in scoring were then discussed
and resolved. Next, for each learner an adjusted target language use
(ATLU) score was calculated per target structure using the formula in
Ono and Witzel (2002):
ATLU =
               Task 1          Task 2          Task 3           Total
             M      SD       M      SD       M      SD       M       SD
Plural      7.43   1.09     6.00   2.48     5.71   2.13    19.14    5.25
Locative    7.64    .63     7.36    .93     6.57   1.55    21.57    2.59
locative = .76) and for the recognition test (immediate plural = .46,
immediate locative = .63; delayed plural = .12, delayed locative = .06)
revealed that the distribution of the scores, especially in the oral production tests, tended to be positively skewed. This means that most of the
scores were at the lower end of the scale. The low scores of the control
group on both tests might have contributed to these skewness values.
The kurtosis values for the oral production test (immediate plural = .53,
immediate locative = 1.84; delayed plural = .74, delayed locative =
1.01) and for the recognition test (immediate plural = .49, immediate
locative = .11; delayed plural = 1.48, delayed locative = .10) revealed
that the distribution was slightly platykurtic in many cases, indicating large variation within scores. Cognitive or affective individual
differences among learners, the difficulty level of the tests, or the
interaction between these two factors may have given rise to these
large variations.
Next, given the nonnormal distribution of the data, various nonparametric tests were carried out to test the hypothesis. First, the performance of each experimental group (i.e., receivers and nonreceivers)
was compared to the performance of the control group. A Kruskal-Wallis
test, the nonparametric equivalent of a one-way ANOVA, was conducted
for each time and outcome measure. Post hoc Mann-Whitney tests were
conducted to find out if each of the groups was different from the control group. Second, the performances of the two experimental groups
were compared against each other. In this analysis, the observations
that came from the receiver-nonreceiver pairs that participated in
the same sessions were treated as related because they were matched
on many different variables (e.g., quantity and type of feedback; see
the Method section). Given that treating the groups as independent
when they are actually related may lead to Type II errors (Field,
2009), a Wilcoxon signed-ranks test, the nonparametric analogue of
the paired-samples t test, was carried out for each time and outcome
measure. The results of the nonparametric tests are reported for
each structure separately.
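For readers less familiar with the Mann-Whitney test, the following sketch shows how the U statistic, its normal approximation, and an effect size of the form r = |z|/√N can be computed. It is a minimal illustration, not the analysis script used in the study: the function names and the example scores are hypothetical, the z value omits the tie correction that statistical packages usually apply, and the effect-size formula is an assumption.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney(group_a, group_b):
    """Return (U, z, r) for two independent samples (no tie correction)."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    u = min(u_a, n_a * n_b - u_a)
    mean_u = n_a * n_b / 2
    sd_u = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - mean_u) / sd_u
    r = abs(z) / sqrt(n_a + n_b)
    return u, z, r


# Hypothetical proportion-correct scores, invented for illustration only.
receivers = [0.50, 0.63, 0.38, 0.75]
control = [0.13, 0.25, 0.00, 0.25]
u, z, r = mann_whitney(receivers, control)
print(f"U = {u:.2f}, z = {z:.2f}, r = {r:.2f}")
```

Because the control group's oral production scores contain many tied values, a tie-corrected z (and hence r) would differ from the uncorrected values produced by this sketch.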
Table 2.
                                           Receivers          Nonreceivers           Control
Structure  Test             Time         M    Mdn   SD       M    Mdn   SD       M    Mdn   SD
Plural     Recognition      Immediate   .49   .50   .25     .28   .25   .23     .15   .06   .20
                            Delayed     .46   .50   .22     .22   .13   .20     .22   .13   .26
           Oral production  Immediate   .43   .42   .35     .32   .30   .33     .00   .00   .00
                            Delayed     .25   .03   .34     .31   .16   .33     .00   .00   .00
Locative   Recognition      Immediate   .27   .25   .17     .27   .25   .13     .23   .25   .11
                            Delayed     .31   .25   .14     .30   .25   .14     .21   .25   .15
           Oral production  Immediate   .47   .59   .32     .44   .60   .30     .00   .00   .00
                            Delayed     .33   .31   .29     .35   .42   .29     .00   .00   .00
Table 3.
                                        Receivers vs. control    Nonreceivers vs. control   Receivers vs. nonreceivers
Structure  Test             Time          U        p       r        U        p       r         Z       p       r
Plural     Recognition      Immediate   29.00    .001*    .61     67.00    .139     .28      2.45    .014*    .46
                            Delayed     47.00    .016*    .46     94.00    .848     .04      2.85    .004*    .54
           Oral production  Immediate   21.00   < .001*   .76     28.00   < .001*   .71      2.67    .008*    .50
                            Delayed     49.00    .003*    .56     28.00   < .001*   .71       .13    .894     .02
Locative   Recognition      Immediate    N/A      N/A     N/A      N/A      N/A     N/A       .32    .975     .06
                            Delayed      N/A      N/A     N/A      N/A      N/A     N/A       .21    .832     .04
           Oral production  Immediate   21.00   < .001*   .76     28.00   < .001*   .71       .97    .331     .18
                            Delayed     28.00   < .001*   .71     28.00   < .001*   .71       .27    .789     .05
tests showed that neither in the recognition test nor in the oral production test did the receivers and nonreceivers differ from each other (see
Table 3).
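The effect sizes reported for the receivers versus nonreceivers comparisons in Table 3 appear to be consistent with the common conversion r = |Z|/√N, where Z is the Wilcoxon test statistic and N is the total number of observations. Both the conversion and the value N = 28 (14 pairs of scores) are assumptions rather than something stated in the text; the sketch below, with a hypothetical helper name, merely checks them against the reported values.

```python
from math import sqrt


def effect_size_r(z: float, n_observations: int) -> float:
    """Convert a z statistic to r, assuming r = |z| / sqrt(N)."""
    return abs(z) / sqrt(n_observations)


# (Z, reported r) pairs copied from the receivers vs. nonreceivers
# comparisons in Table 3; N = 28 is an assumption (14 pairs x 2 scores).
reported = [
    (2.45, 0.46), (2.85, 0.54), (2.67, 0.50), (0.13, 0.02),  # plural
    (0.32, 0.06), (0.21, 0.04), (0.97, 0.18), (0.27, 0.05),  # locative
]

for z, r_reported in reported:
    r_computed = effect_size_r(z, 28)
    print(f"Z = {z:.2f}  computed r = {r_computed:.2f}  reported r = {r_reported:.2f}")
```

Under these assumptions, each computed value rounds to the reported one.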
As explained in the Method section, the receivers and nonreceivers
were allowed to hear each other's attempts to produce the target morphemes in their subsequent turns. One consideration with this design
decision is that an imbalance in the number of attempts (i.e., extra input
for the other learner) between the groups could make it difficult to attribute
any differences between the groups to feedback exposure. To account
for this, the type and amount of input learners could hear from each
other were analyzed. Table 4 shows the descriptive statistics for the
type and amount of input to which each experimental group was exposed.
As can be seen from Table 4, nontargetlike productions (considering
suppliance in nonobligatory contexts and misformations together)
were more frequent than targetlike productions. Paired-samples t tests
conducted for each morpheme in each category revealed no significant
differences between the groups (plural: misformations, t(13) = .29,
p = .77; correct suppliance, t(13) = .78, p = .45; oversuppliance, t(13) = .45,
p = .66; locative: misformations, t(13) = .27, p = .79; correct suppliance,
t(13) = .58, p = .57; oversuppliance, t(13) = 1.10, p = .29). Therefore, it is
possible to assume that the receivers and nonreceivers heard a comparable amount of input from each other.
Table 4.
                                   Receivers         Nonreceivers
Structure                          M      SD          M      SD
Plural     Misformations          1.29   1.45        1.17   1.39
           Correct suppliance     1.55   1.65        1.31   1.49
           Oversuppliance          .86   1.15         .74   1.11
Locative   Misformations          1.45   1.44        1.57   2.04
           Correct suppliance      .93   1.20         .83    .98
           Oversuppliance          .14    .39         .02    .09

Next, various t tests were conducted on the learners' self-ratings of
the extent to which they paid attention to the interaction between their
partner and the experimenter. To determine whether learners' self-ratings
were significantly different from chance, each group's mean score was
compared to the median value of the rating scale (i.e., five) using one-sample t tests. The tests revealed significant differences (nonreceivers:
t(13) = 14.92, p < .001; receivers: t(13) = 6.41, p < .001). Paired-samples
t tests conducted to determine whether the self-ratings of the two
experimental groups differed from each other revealed no differences
between the groups (nonreceivers: M = 8.35, SD = .84; receivers: M = 7.78,
SD = 1.61; t(13) = 1.29, p = .165). These results show that both groups paid
attention to their partners' interaction with the experimenter, and that
the degree to which they paid attention to it did not differ between the
groups. Finally, two paired-samples t tests were conducted on learners'
self-ratings of the two additional manipulation-check statements. For
the statement "I noticed that the experimenter corrected my errors,"
the receivers' ratings were significantly higher than the nonreceivers'
ratings (nonreceivers: M = 5.29, SD = 2.92; receivers: M = 8.14, SD = 1.51;
t(13) = 3.68, p = .003). For the statement "I noticed that the experimenter
corrected my partner's errors," there was a significant difference between
the ratings in favor of the nonreceivers (nonreceivers: M = 6.43,
SD = 2.98; receivers: M = 3.43, SD = 2.62; t(13) = 2.88, p = .013). These
results show that the receivers and nonreceivers not only paid attention to the interaction between their partners and the experimenter but
also noticed that the receivers were the ones that were corrected. Overall, the analyses of the manipulation check statements revealed that it
was unlikely that the difference between the nonreceivers and receivers
in performance is attributable to the nonreceivers' failure to pay attention to the interaction and feedback between the receivers and the
experimenter.
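The one-sample t values for the attention self-ratings can be approximately recovered from the published descriptive statistics, using t = (M − μ0)/(SD/√n) with μ0 = 5 (the scale value used for comparison) and n = 14 (inferred from the 13 degrees of freedom). The sketch below is an illustrative check with a hypothetical helper name, not the original analysis; because the published means and standard deviations are rounded, the recomputed values need not match the reported ones exactly.

```python
from math import sqrt


def one_sample_t(mean: float, sd: float, n: int, reference: float) -> float:
    """t statistic for comparing a sample mean with a fixed reference value."""
    standard_error = sd / sqrt(n)
    return (mean - reference) / standard_error


# Descriptive statistics as reported in the text; n = 14 is inferred from df = 13.
t_nonreceivers = one_sample_t(mean=8.35, sd=0.84, n=14, reference=5)
t_receivers = one_sample_t(mean=7.78, sd=1.61, n=14, reference=5)

print(f"nonreceivers: t(13) = {t_nonreceivers:.2f}")
print(f"receivers:    t(13) = {t_receivers:.2f}")
```

With these inputs, the nonreceiver value agrees with the reported 14.92 to two decimals, whereas the receiver value comes out close to, but not exactly at, the reported 6.41.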
DISCUSSION
The hypothesis of the study predicted that the receivers would outperform the nonreceivers as measured by their recognition and oral production test scores. The hypothesis was confirmed for the plural
morpheme because the receivers outperformed the nonreceivers on all
the tests (immediate oral production and immediate and delayed recognition) except for the delayed oral production test. Additional indirect
evidence confirming the hypothesis came from the comparisons of each
group with the control group. The receivers outperformed the control
group on all the tests, whereas the nonreceivers outperformed the control
group only on the oral production tests, not on the recognition tests.
The hypothesis, however, was not supported for the locative morpheme because on neither of the tests were there differences between
the receivers and nonreceivers. The comparisons between each of the
groups and the control did not provide any support for the hypothesis
either. Contrary to expectations, the finding that neither experimental
group outperformed the control group on the recognition test raises
doubts as to whether any substantial learning took place for the locative morpheme.
At least two factors may have contributed to the finding that the
receivers outperformed the nonreceivers in the plural morpheme.
The first is the communicative pressure the receivers might have
feedback, and the fact that learners learned the morpheme only through
corrective feedback may be some of the factors that contributed to the
ineffectiveness of the treatment. It could be that absolute beginners
need to be exposed to the locative for a longer period of time to distinguish themselves from a control group. It is equally possible that
learners need to be exposed to feedback types that not only provide the
correct form but also explain the rule behind the structure. In addition,
the learners in this study were not provided any additional positive
evidence on the target forms (other than the positive evidence provided
through corrective feedback). Although this is an essential experimental
control feature for attributing the results to exposure condition alone, it
may have increased the difficulty of mapping form and meaning. Similarly, these factors may have contributed to the fact that the effect of
exposure condition in the plural morpheme was less durable. It could
be that these factors decreased the effectiveness of the treatment and
prevented the receivers from keeping their advantages on the delayed
posttest. Future studies could consider increasing the length and number
of treatment sessions as well as the amount of feedback to increase the
effectiveness of feedback in the acquisition of the locative. In addition,
in future first-exposure studies, a controlled amount of positive evidence could be presented prior to the administration of feedback tasks
to facilitate form-meaning mapping. It would be necessary to test
learners' knowledge after this exposure stage to make sure that the
amount of learning across conditions is comparable.
In this study, exposure condition was not isolated from the learners'
attempts to produce the target forms. The learners in each of the feedback groups heard their partner's correct and incorrect attempts to
incorporate the knowledge that he or she had gained from the corrective feedback. Therefore, the feedback exposure variable included the ways that
learners attempted to use the correct form in their subsequent turns.
Nonequivalent numbers of attempts would potentially confound the
results. However, this was not the case, as documented by the follow-up
analysis showing that learners were exposed to comparable amounts of
targetlike and nontargetlike input. Although this allows one to claim
that the effect of exposure condition for the plural morpheme and the
lack of it for the locative morpheme cannot be attributed to differing
amounts of extra input learners heard from each other, it does not allow
one to make any claims about the effect of feedback in the absence of
this extra input. In other words, the results should be considered as
shedding light on receiver-nonreceiver differences in contexts in which
learners are allowed to hear each other's targetlike and nontargetlike
productions.
When interpreting the results, one needs to pay attention to the
specific features of the feedback used in this study. It is generally
accepted that feedback types differ along an implicit-explicit continuum.
In addition, this line of research can help identify the cognitive processes that are involved in L2 learning through feedback. In this study,
the receivers' higher performance in the plural morpheme was attributed to more effective hypothesis testing and increased alertness. If this
interpretation is confirmed by future research, the proposed cognitive
processes could be used to explain how learners take advantage of
feedback when they receive feedback on their own errors. Finally, the
findings of this line of research have important implications for corrective feedback research methodology. A recurrent finding showing an
advantage for the receiver role would suggest that studies in which
both receiver and nonreceiver roles are present should control for the
number of feedback instances learners receive in each role to minimize
the potential confounding effects of feedback exposure condition on the
variables under investigation.
Received 6 February 2014
Accepted 26 September 2014
Final Version Received 17 October 2014
NOTES
1. These studies also had premodified input conditions in which directions were read
to learners from a text containing repetitions and paraphrases of the task items.
2. Unlike in the previous studies, the groups were not called interactors/participators/performers and observers because the groups were not different with respect to
whether they participated in interaction (both groups produced output and interacted
with the researcher). Rather, the groups were called receivers and nonreceivers because
what made these groups different was receiving or not receiving feedback on their own
errors.
3. Following Norris and Ortega's (2000) research design recommendations, a true
control group was used. Norris and Ortega (2000) defined a true control group as a group
receiving neither instruction nor exposure related to the target structure except in pretests and posttest (p. 446). They also argued that a true control group is a more powerful
research tool in identifying effects attributable to instruction than a comparison group in
which learners receive alternative treatments. The members of the control group in this
study were not exposed to the target structure, but they were allowed to carry out the
tasks the receivers and nonreceivers carried out. Therefore, the difference between the
control group and the feedback groups can be attributed to feedback because output
opportunities were controlled for.
4. In addition, it was assumed that a feedback type such as explicit correction would
be relatively less likely to favor one of the conditions because it is unambiguous in its
corrective intent and focus.
5. Note that Turkish does not require a copula in nominal sentences.
6. The experimenter held the floor after the feedback by asking a task-relevant question as quickly as possible.
7. A 1-week interval was given between the first and second sessions for the following reasons: (a) It was assumed that giving an interval between the sessions would
better represent classroom contexts because learners usually receive feedback over
several different lessons rather than in a single lesson, and (b) given that learners were
absolute beginners, it was hypothesized that this interval length would maximize
learning by giving them enough time to fine-tune their working hypotheses about the
target structures.
Gass, S. M., & Mackey, A. (2006). Input, interaction and output: An overview. AILA Review, 19, 3–17.
Gass, S., Mackey, A., & Ross-Feldman, L. (2011). Task-based interactions in classroom and laboratory settings. Language Learning, 61, 189–220.
Goldschneider, J. M., & DeKeyser, R. M. (2001). Explaining the natural order of L2 morpheme acquisition in English: A meta-analysis of multiple determinants. Language Learning, 51, 1–50.
Goo, J. (2012). Corrective feedback and working memory capacity in interaction-driven L2 learning. Studies in Second Language Acquisition, 34, 445–474.
Goo, J., & Mackey, A. (2013). The case against the case against recasts. Studies in Second Language Acquisition, 35, 127–165.
Krashen, S. (1981). Second language acquisition and second language learning. Oxford, UK: Oxford University Press.
Leeman, J. (2003). Recasts and second language development. Studies in Second Language Acquisition, 25, 37–63.
Li, S. (2010). The effectiveness of corrective feedback in SLA: A meta-analysis. Language Learning, 60, 309–365.
Loewen, S., & Nabei, T. (2007). Measuring the effects of oral corrective feedback on L2 knowledge. In A. Mackey (Ed.), Conversational interaction in second language acquisition (pp. 361–377). New York, NY: Oxford University Press.
Loewen, S., & Philp, J. (2006). Recasts in the adult L2 classroom: Characteristics, explicitness and effectiveness. Modern Language Journal, 90, 536–556.
Long, M. H. (1981). Input, interaction, second-language acquisition. In H. Winitz (Ed.), Native language and foreign language acquisition (pp. 259–278). New York, NY: New York Academy of Sciences.
Long, M. (1991). Focus on form: A design feature in language teaching methodology. In K. de Bot, R. Ginsberg, & C. Kramsch (Eds.), Foreign language research in cross-cultural perspective (pp. 39–52). Amsterdam, the Netherlands: Benjamins.
Long, M. H. (1996). The role of the linguistic environment in second language acquisition. In W. Ritchie & T. K. Bhatia (Eds.), Handbook of second language acquisition (pp. 413–468). New York, NY: Academic Press.
Long, M. H. (2007). Problems in SLA. Mahwah, NJ: Erlbaum.
Long, M. H., & Robinson, P. (1998). Focus on form: Theory, research, and practice. In C. Doughty & J. Williams (Eds.), Focus on form in classroom SLA (pp. 15–41). New York, NY: Cambridge University Press.
Lyster, R. (2004). Differential effects of prompts and recasts in form-focused instruction. Studies in Second Language Acquisition, 26, 399–432.
Lyster, R., & Izquierdo, J. (2009). Prompts versus recasts in dyadic interaction. Language Learning, 59, 453–498.
Lyster, R., & Ranta, L. (1997). Corrective feedback and learner uptake. Studies in Second Language Acquisition, 19, 37–66.
Lyster, R., & Ranta, L. (2013). Counterpoint piece: The case for variety in corrective feedback research. Studies in Second Language Acquisition, 35, 167–184.
Lyster, R., & Saito, K. (2010). Oral feedback in classroom SLA: A meta-analysis. Studies in Second Language Acquisition, 32, 265–302.
Mackey, A. (1999). Input, interaction, and second language development: An empirical study of question formation in ESL. Studies in Second Language Acquisition, 21, 557–587.
Mackey, A., Gass, S., & McDonough, K. (2000). How do learners perceive interactional feedback? Studies in Second Language Acquisition, 22, 471–497.
Mackey, A., & Goo, J. (2007). Interaction research in SLA: A meta-analysis and research synthesis. In A. Mackey (Ed.), Conversational interaction in SLA: A collection of empirical studies (pp. 408–452). New York, NY: Oxford University Press.
Mackey, A., & Philp, J. (1998). Conversational interaction and second language development: Recasts, responses, and red herrings? Modern Language Journal, 82, 338–356.
Mackey, A., Philp, J., Egi, T., Fujii, A., & Tatsumi, T. (2002). Individual differences in working memory, noticing of interactional feedback and L2 development. In P. Robinson (Ed.), Individual differences and instructed language learning (pp. 181–209). Philadelphia, PA: Benjamins.
APPENDIX A
LIST OF WORDS
Balon 'balloon', bavul 'suitcase', bere 'hat', beyaz 'white', büyük 'big',
defter 'notebook', ekmek 'bread', elma 'apple', etek 'skirt', gemi
'boat', gri 'grey', inek 'cow', kafes 'cage', kahve 'brown', kama 'knife',
kamyon 'truck', kavun 'melon', kazak 'sweater', kedi 'cat', kemer
'belt', kemik 'bone', kirmizi 'red', kitap 'book', küçük 'small', masa
'table', mavi 'blue', miki 'mickey', motor 'motorbike', sapan 'slingshot', sedir 'armchair', sepet 'basket', sinek 'fly', siyah 'black', sopa
'stick', tabak 'plate', tepsi 'tray', torba 'trash bag', yatak 'bed', yesil
'green'.
APPENDIX B
NORMALITY TEST RESULTS
Structure  Test             Time        Statistic   df   p value
Plural     Recognition      Immediate      .91      42    .002
                            Delayed        .87      42    .000
           Oral production  Immediate      .76      42    .000
                            Delayed        .66      42    .000
Locative   Recognition      Immediate      .90      42    .001
                            Delayed        .93      42    .013
           Oral production  Immediate      .74      42    .000
                            Delayed        .76      42    .000