Available online at www.sciencedirect.com

Journal of Second Language Writing 16 (2007) 194–209

Teaching writing teachers about assessment


Sara Cushing Weigle *
Department of Applied Linguistics & ESL, Georgia State University,
P.O. Box 4099, Atlanta, GA 30302-4099, USA
* Tel.: +1 404 413 5192; fax: +1 404 413 5201. E-mail address: sweigle@gsu.edu.

Abstract
The assessment of student writing is an essential task for writing teachers, and yet many graduate
programs do not require students to take a course in assessment or evaluation, and courses on teaching
writing often devote only a limited amount of time to the discussion of assessment. Furthermore, teachers
frequently need to prepare their students for externally mandated large-scale writing assessments, and thus
they need to have an understanding of the uses and misuses of such tests. This article outlines some of the
essential considerations in classroom and large-scale assessments and provides suggestions for how to
incorporate considerations about assessment into a course on teaching writing or as a stand-alone course.
© 2007 Elsevier Inc. All rights reserved. doi:10.1016/j.jslw.2007.07.004

Keywords: Second language writing; Writing assessment; Teacher education

Assessment of student writing is an essential task for writing teachers. Unfortunately,
however, many graduate programs in TESOL and rhetoric/composition do not require students to
take a course in assessment or evaluation, and courses on teaching writing often devote only a
limited amount of time to the discussion of assessment. Moreover, teachers often feel that
assessment is a necessary evil rather than a central aspect of teaching that has the potential to be
beneficial to both teacher and students. They may believe, rightly or wrongly, that assessment
courses focus too much on statistics and large-scale assessment and have little to offer classroom
teachers. As a result, teachers sometimes avoid learning about assessment or, worse, delay
thinking about how they will assess their students until they are forced to do so, a situation which
unfortunately decreases the chances that assessments will be fair and valid.
At the same time, writing teachers often find themselves in a position of having to prepare
their students for externally imposed assessments such as departmental or university-wide exit
examinations or large-scale high-stakes tests such as the Test of English as a Foreign Language
(TOEFL). Teachers sometimes feel that such assessments have little to do with the skills they are
trying to teach their students; consequently, they may approach these tests with some resistance
and, unfortunately, little understanding of how such tests are constructed or scored and whether
or not they have been validated for the purpose for which they are being used.
It is my belief that writing teachers must be adequately prepared to construct, administer, score,
and communicate the results of valid and reliable classroom tests, and that, similarly, they should
have an understanding of the uses and misuses of large-scale assessments so that they can be critical
users of such tests and effective advocates for their students in the face of mandatory assessments not
of their own making. In this paper, I start by outlining some of the fundamental principles of
assessment in general, and then discuss the process of test development, some of the considerations
that teachers must think about in designing classroom writing assessments, and some suggestions
for how teacher trainers might approach these issues in a course on second language writing issues
or on assessment. Finally, I discuss large-scale assessment and some of the ways in which teachers
can be empowered by a deeper understanding of these assessments that affect their students.

Classroom assessment

For any teacher, the ability to design fair and valid ways of assessing their own students’
progress and achievement is an essential skill. In order to do so, teachers need to understand the
range of possibilities for assessing students, what the essential qualities of a good assessment
instrument are, and how to develop assessments that maximize these essential qualities within the
constraints of time and resources that teachers face.
It may be useful at first to clarify some terminology and to outline various types of
assessments. Assessment is a broad term that encompasses all sorts of activities that teachers
engage in to evaluate their students’ progress, learning needs, and achievements. As Brown
(2004) notes, teachers are constantly evaluating their students in informal ways, and these
informal evaluations are an important part of assessment, just as more formal tests are. Informal
assessments include such things as clarification checks to make sure students understand
particular teaching points, eliciting responses to questions on style and usage from students, or
circulating among students doing peer response work to ensure that they are on task. Formal
assessments can be defined as "exercises or procedures specifically designed to tap into a
storehouse of skills and knowledge" (Brown, 2004, p. 6). For a writing class, formal assessments
may include traditional writing tests, for example, an exercise in which students are required to
generate one or more pieces of connected discourse in a limited time period, which are then
scored on some sort of numerical scale (Hamp-Lyons, 1991a,b), and other activities, in particular,
response to and evaluation of artifacts such as portfolios, homework assignments, or out-of-class
writing assignments. It is important for teachers to recognize that all of these activities – informal
writing assignments. It is important for teachers to recognize that all of these activities – informal
assessments, and various types of formal assessments, including tests – have a place in a teacher’s
assessment toolbox, all are appropriate under certain circumstances, and all need to be evaluated
according to the most important qualities of effective assessments: in particular, reliability,
validity, and practicality. Thorough treatments of these qualities can be found in a variety of
sources, including those listed in the Appendix. I include only a brief discussion of them here.
A good test is reliable; that is, it is consistent. A student should get the same score on a test one
day as on the next day (assuming, of course, that no additional learning has taken place in the
interim) or from one grader/rater as from another. If there is a choice of topics or tasks, they
should be equivalent in difficulty so that a student’s chances of performing optimally do not
depend on which topic they choose. Finally, conditions of administration should be as similar as
possible so that factors not related to the skills being assessed do not affect student performance.
For example, students should all be given the same amount of time to complete the assessment.
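To make the notion of rater consistency concrete, the short sketch below computes three common agreement indices (exact agreement, adjacent agreement, and a Pearson correlation) for two raters' scores on the same set of essays. It is a minimal illustration: the scores, the 6-point holistic scale, and the choice of indices are assumptions for the example, not procedures prescribed in this article.

# Minimal sketch: quantifying inter-rater reliability for essay scores.
# The raters' scores and the 1-6 holistic scale are invented for
# illustration.

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

rater_a = [4, 3, 5, 2, 4, 6, 3, 5]   # hypothetical holistic scores (1-6)
rater_b = [4, 3, 4, 2, 5, 6, 3, 4]

exact = sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)
adjacent = sum(abs(a - b) <= 1 for a, b in zip(rater_a, rater_b)) / len(rater_a)

print(f"Exact agreement:    {exact:.0%}")
print(f"Adjacent agreement: {adjacent:.0%}")
print(f"Pearson r:          {pearson_r(rater_a, rater_b):.2f}")

Low values on such indices usually point to vague scale descriptors or insufficient rater training rather than to a defect in the raters themselves.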
A good test is valid for the purposes for which it is being used. Validity is a complex issue that
is discussed at length in many references on assessment (e.g., Bachman, 1990; Bachman &
Palmer, 1996; Hamp-Lyons, 1991a,b; Hudson & Brown, 2002; McNamara, 1996). In essence,
validity has to do with the appropriateness of decisions that will be made on the basis of the test so
that, for example, students who are capable of demonstrating excellent work in class are able to
do so on the test, and those who are not as capable are not able to pass the test by other means (for
instance, by lucky guessing or by memorizing a response). For most classroom purposes, the
most important validity consideration is that the content of the test is representative of the skill(s)
and knowledge that are being taught in the course, both in terms of covering the range of skills
adequately, and also in terms of not assessing skills that are not being taught in the course.
A good test is practical; that is, it can be developed, administered, and scored within the
constraints of available resources, particularly time. For teachers, practicality is an overriding
concern; writing teachers in particular know how time-consuming it is to grade papers. Teachers
need to have realistic expectations about how much time they can devote to developing
assessments, as well as how long it will take to administer and score any assessment of writing.
Reliability, validity, and practicality are not the only considerations for assessment. For
example, Bachman and Palmer (1996) include interactiveness, authenticity, and impact (the
effect of an assessment on learners, teachers, and other stakeholders) in their model of test
usefulness. For classroom teachers, however, reliability, validity, and practicality are perhaps
the most critical qualities to be familiar with.

The test development process

Whether one is writing a test for an individual classroom or for large-scale administration, the
essential steps are the same. Many books on language testing provide guidance for test
development at the classroom level and for large-scale tests which go into greater detail than is
possible here (see, for example, Alderson, Clapham, & Wall, 1995; Bachman & Palmer, 1996;
Weigle, 2002). For any classroom test, there are four major considerations that go into an
assessment procedure. These are:

• setting measurable objectives,
• deciding on how to assess objectives (formally and informally),
• setting tasks,
• scoring.

Specifying measurable objectives

One of the most fundamental lessons about assessment is that decisions about assessment
should not be left until the end of instruction, but rather should be taken into account from the
very beginning, preferably in the earliest planning stages for a course. Teachers need to learn how
to articulate precisely what it is they hope students will learn in their courses so that they can
develop ways of assessing whether their students have, in fact, mastered the course objectives.
For this reason, it is helpful to state course objectives in terms of observable behaviors or products
so that they can be evaluated appropriately.
Many writing course syllabi contain general objective statements such as "students will learn
the basics of academic writing" or "after completing this class, you will know how to revise and
edit your writing." The problem with statements that are framed in this way is that they do not
provide any guidance for developing assessments that will help teachers judge whether students
have met these objectives. As a teacher, how does one know when a student has "learned the
basics of academic writing"? Does the writing of a student who has accomplished this objective
differ from that of a student who has not? Without a clearer statement of measurable outcomes, it
will be impossible for teachers to know whether they have been successful. This problem of
vaguely worded objectives is compounded in a multi-level program where students need to
progress through two or more levels, and different sections of the same course are taught by
different teachers. Statements such as those above do not provide useful ways of articulating
between levels.
It is, therefore, much more helpful to start out by stating objectives in such a way that it is clear
when the objectives have been met. There are many sources in the educational literature for
writing clear objectives (see, for example, Gronlund, 2004), but my own inspiration in this area
comes from business rather than education. David Allen, in his excellent book Getting Things
Done, provides a three-step model for stating outcomes, which can be applied just as easily to
teaching as to the business world.
The steps are:

1. View the project from beyond the completion date.
2. Envision "WILD SUCCESS."
3. Capture features, aspects, qualities you imagine in place (p. 69).

In terms of teaching writing, these steps can be conceptualized as follows:

1. Imagine the class and the students at the end of the term.
2. Think about the very best piece of writing that could come from this class.
3. Describe its attributes. What does it look like? What makes it stand out? Is it the correct use of
verb tenses? Is it the vivid details or the insightful thinking that went into the writing? Is it the
use of transitions and other cohesive devices? Has the student revised appropriately in response
to instructor and/or peer feedback?

Teachers who can articulate what they imagine their best writers can accomplish at the end of a
term are in a good position to begin developing assessments. Furthermore, by defining one's
objectives in this way at the beginning of instruction, teachers can begin to plan out how they
will assist students in reaching these goals, thus allowing concerns about assessment to inform
instruction from the very beginning.

One activity that can help teachers with articulating their objectives is to have them write an
imaginary endnote to a final draft of a writing assignment from their course, as in Figure 1.

Fig. 1. Imaginary endnote to a final draft of a writing assignment.

Note that the questions cover three main areas: what the student has done well (i.e., the student's
strengths, whether specifically learned in the course or not), ways in which the student has
improved (i.e., what the student has learned from the course), and what the student could focus on
for the future (i.e., what the student may not yet have mastered but is ready to learn). The
questions are flexible enough to cover linguistic, content, rhetorical, or process dimensions of
writing.
Once teachers have determined what a successful paper would look like, they are ready to
write outcome statements that contain measurable objectives. One rule of thumb for specifying
objectives is to include three characteristics: a description of the performance itself, or what the
student is expected to do, the conditions under which the performance will be elicited, and the
level of performance that will be deemed acceptable (Mager, 1975, cited in Ferris & Hedgcock,
2005).
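As a minimal sketch of this rule of thumb, the record below separates an objective into the three components just named. The field names and the sample objective are my own illustrations, not taken from Mager or from Ferris and Hedgcock.

# Sketch: an outcome statement as a three-part record (performance,
# conditions, criterion). The example objective is invented.
from dataclasses import dataclass

@dataclass
class Objective:
    performance: str  # what the student is expected to do
    conditions: str   # circumstances under which the performance is elicited
    criterion: str    # level of performance deemed acceptable

summary_objective = Objective(
    performance="write a one-paragraph summary of a short article",
    conditions="in class, within 30 minutes, without a dictionary",
    criterion="conveys the main idea accurately in the student's own words",
)

# An objective missing any of the three components is not yet measurable:
for field_name, value in vars(summary_objective).items():
    assert value, f"objective is missing its '{field_name}' component"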
Some teachers may object that setting goals in this way is inappropriate for teaching writing,
especially those who view personal expression as the main goal of writing instruction (see
Raimes, 1991, for an overview of different perspectives on the goals of writing courses). Indeed,
one of the dangers of writing objectives in this way is that what is measurable is not always what
is essential, so that the focus often turns to easily quantifiable traits of essays such as error counts.
Teachers need to find a compromise that they can live with between too much specificity, which
can lead to an unhealthy focus on lower level skills to the detriment of the big picture, and too
much generality, which can make it nearly impossible to ascertain how successfully the course
objectives have been met. One example of such a compromise can be found in Figure 2. Note that
the outcome statements cover objectives related to the range of written products, the use of
language, and the writing process, and are written using verbs that describe observable behavior
(uses, writes, etc.).

Fig. 2. Examples of outcome statements.
The benefits of specifying outcomes in this way are numerous. Teachers can use these
outcome statements to make teaching decisions and to design rubrics for evaluating writing, and
students benefit because what is expected of them becomes much clearer.
One assignment that can help students practice writing clear course
objectives is to provide a sample syllabus (see, for example, Ferris & Hedgcock, 2005, pp. 110–
118). The students’ task is to evaluate the course objectives in terms of whether they are specific
and measurable. For ones that are not, students need to rewrite the objectives so that they are
specific and measurable. For those objectives that are already specific, students can discuss how
they would write assignments that measure those objectives.

Deciding on how to assess objectives

Once teachers have a list of objectives, the next step in the process is to decide which
objectives will be assessed informally, which will be assessed formally through tests, and which
will be assessed formally through means other than tests. For example, objectives related to
critical thinking skills might best be assessed through informal means such as observing
participation in class discussions or responding to reading journals, while more specific
language-related objectives such as the correct use of verb tenses might be assessed as part of a
test, either as a controlled exercise or as part of the evaluation of a timed writing assignment.
Teachers need to be aware of the multiplicity of ways in which various objectives might be
tested. The bibliography at the end of this article contains numerous resources for designing
assessment tasks, testing books in particular. Cohen (1994) and Hughes (2002) contain chapters
devoted to various ways of assessing writing skills either holistically or as discrete subskills. In
this next section, I will focus on setting tasks for independent writing (either as timed single-draft
essays or as untimed multiple-draft essays) rather than for testing subskills such as grammatical
knowledge or the ability to paraphrase. Following this, I describe portfolio assessment as a
potentially more valid way of assessing many aspects of writing than can be assessed in a single
test. First, however, I will explore the issue of whether one should test writing at all in a writing
course; that is, under what circumstances is a writing test appropriate?

In-class versus out-of-class writing

Writing teachers frequently face the dilemma of whether to assess in-class as well as out-of-
class writing. Particularly in classes where the writing process is emphasized, many teachers feel
that it is counterproductive to assess students on a single draft of a paper, especially on an
impromptu topic that students may not have had time to think about before the day of the
assessment. If a final examination for a writing course consists of impromptu writing only,
students are given a mixed message about what kind of writing is actually important for them to
be able to master.
Furthermore, most writing outside of testing situations in the real world is not completed
under time pressure. This is particularly true for academic writing. The process of writing
involves reflection, discussion, reading, feedback, and revision, and one’s best work is usually not
produced in a single draft within 30 or 60 minutes.
A final reason for emphasizing out-of-class writing is that some L2 students may have
difficulties on timed writing tests even if they are successful in other academic writing tasks
(Byrd & Nelson, 1995; Johns, 1991). Furthermore, English teachers without ESL training may be
susceptible to basing their evaluations of NNS writing more on sentence-level concerns than on
content or rhetorical concerns (Sweedler-Brown, 1993). NNS writers may not be able to perform as well
under time pressure as their native-speaking peers, and this may be especially noticeable in timed
writing.
However, there are at least three important reasons why teachers would want to include some
sort of in-class writing assessment as part of their assessment of students’ abilities. The first
reason is simply a pragmatic one: timed writing tests are a fact of life for many students. Writing
tests have become standard on large-scale high-stakes tests such as the TOEFL and the GRE, and
such tests can have a profound effect on students’ futures. Furthermore, in content courses such
as history or psychology – at the undergraduate level, at least – students are frequently expected
to write short essays on their examinations (Carson, Chase, Gibson, & Hargrove, 1992). The
ability to compose under time pressure is thus critical for many students, and the writing class can
be a valuable place to learn strategies for timed writing and to practice this skill.
In addition, while collaboration in writing is often seen as an important component of the
writing process, there are times when teachers want to know what students can do on their own
without assistance. In out-of-class writing assignments there is always the danger that students
have received inappropriate amounts and kinds of help from tutors, friends, or roommates. In
particular, second language writers may ask their native speaker friends to proofread their papers
and fix sentence-level errors. Teachers are certainly justified in asking students to produce at least
some writing in class where they are unable to rely on such outside support.
A third reason for testing writing in class under timed conditions comes from second language
acquisition theory. From a psycholinguistic viewpoint, in-class writing can serve as a test of
automatized knowledge of English. In general, adults writing in their first language have
automatic access to lexical and syntactic resources, while for many second language writers,
particularly at lower levels of proficiency, these processes are not yet automatic, so writers need
to focus conscious attention on retrieving words and explicit grammar rules from long-term
memory. This need to pay attention to word and sentence level concerns makes it difficult to
focus on macro-level issues such as overall structure and organization and writing strategies that
they may use in their first language (see Weigle, 2005, for a summary of research in this area).
Furthermore, as Ellis (2005) demonstrates, different tasks evoke implicit and explicit
knowledge. Ellis found that an untimed grammaticality judgment test evoked explicit or rule-
governed knowledge, particularly for those sentences that were ungrammatical, while a timed
test evoked implicit knowledge. One might hypothesize on the basis of these results that timed
and untimed writing assignments would evoke different knowledge types, and, therefore, if one is
interested in knowing how much linguistic knowledge is implicit and automatized, a timed
writing assessment may be an appropriate vehicle for this purpose.
For these reasons, although many writing teachers feel that in-class writing does not allow
students to demonstrate their best ability, one can justify assessing both in-class and out-of-class
writing as complementary sources of information about student abilities, particularly when it
comes to making high-stakes decisions such as passing or not passing a course. In such cases, as
assessment specialists (e.g., Brown & Hudson, 1998) frequently point out, it is particularly
critical to use multiple sources of information, as no single test of an ability is without error.
In assessing in-class or timed writing, however, classroom teachers have advantages that
developers of large-scale tests do not, in that they can modify the timed impromptu essay to take
advantage of the extended time they spend with students before a test. Elsewhere (Weigle, 2002),
I have presented ways of modifying the timed impromptu essay to fit the classroom environment.
Possibilities include strategies such as discussing a topic in class and doing preliminary
brainstorming, allowing students to write an essay outline before writing their drafts in class, and/
or writing an in-class draft for a grade, followed by revising it out of class based on teacher or
peer feedback for a separate grade. Because of the difficulties that second language writers often
have managing both the content and linguistic demands of a writing assignment, giving students
the opportunity to prepare the content in advance of the writing may allow them to demonstrate
their best writing.

Setting tasks

Whether one is evaluating in-class or out-of-class writing, a useful approach to task
development is to begin by drafting test specifications as a way of articulating clearly what one is
attempting to assess. Specifications are particularly important in developing large-scale tests, but
even for an individual teacher, specifications can be helpful as a tool for planning out an
assessment. Specifications can benefit teachers in at least three ways: (1) the process of
developing specifications helps to ensure that teachers have considered the specific aspects of
writing that they are attempting to assess and how those aspects are operationalized in tasks and
scoring procedures; (2) within a given program, teachers can share specifications so that courses
at the same level can maintain the same evaluation standards and procedures; and (3) sharing
specifications with students allows them to know exactly how they will be assessed, in terms of
what sorts of tasks they can be expected to perform and how they will be evaluated (Weigle,
2002).
Specifications can take many forms, but one useful format is that described in detail in
Davidson and Lynch (2002). The main parts of the specification are (a) a general description of
the skill(s) or ability(ies) being tested, including a rationale for why these particular skills are
important for the given testing context; (b) a description of the prompt, or the instructions to
the student about what to write, including a description of any additional stimulus material
such as reading passages, pictures, or graphs; and (c) a description of the scoring guide, or
rating scale.
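To make the three parts concrete, the sketch below encodes a hypothetical specification in the spirit of that format. The structure and all sample content are illustrative assumptions rather than a reproduction of Davidson and Lynch's template.

# Sketch: a writing-test specification with the three main parts
# described above. All sample content is hypothetical.
specification = {
    "general_description": (
        "Assesses the ability to argue a position in writing, a skill "
        "needed for essay examinations in undergraduate courses."
    ),
    "prompt": {
        "instructions": "Take a position on the statement below and "
                        "support it with reasons and examples.",
        "stimulus": "one short opinion statement (no reading passage)",
        "time_allowed_minutes": 40,
    },
    "scoring_guide": {
        "type": "analytic",
        "categories": ["content", "organization", "language use"],
        "scale": "1-5 per category",
    },
}

# Sharing the specification lets colleagues generate parallel tasks:
print(specification["prompt"]["instructions"])

Because the specification is explicit, a colleague could generate a parallel prompt from it, which is what makes shared specifications useful across course sections.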
In my experience, teachers in training are often skeptical of the value of specifications and
sometimes resistant to the notion of spending time on specifications until they actually go
through the process of developing a specification and a test. However, they usually find that
writing a specification is helpful in clarifying their thinking and anticipating potential difficulties,
and that, in the long run, writing a specification saves time. As one student wrote in an online
posting for an assessment course:
When we began discussing test specifications I felt sooo lost and had no clue where to
begin. After reading and discussing in class, I thought that writing the specs would be
difficult but not impossible. Now, I can just look at my specs and create test items with
much more understanding of how the process works. I would just like to advertise for spec
writing and say that they really are the blueprints and make things so much clearer when it
comes to creating a test that is relevant. I finally see the light even though I still have much
to learn and perfect when it comes to writing tests.
As noted above, specifications should include a description of the prompt (instructions to the
student) and of the expected response. Useful guidelines for designing prompts can be found in
Kroll and Reid (1994). Depending on the goals of the assessment, specifications can include any
of the dimensions for writing tasks (from Weigle, 2002) outlined in Table 1. For example, one
might specify that students will write a one-page (length) narrative letter (rhetorical task/genre) to
a close friend (audience) using a series of picture prompts (stimulus) as input, and so on.

Table 1
Dimensions of tasks for writing assessment

Subject matter: Self, family, school, technology, etc.
Stimulus: Text, multiple texts, graph, table
Genre: Essay, letter, informal note, advertisement
Rhetorical task: Narration, description, exposition, argument
Pattern of exposition: Process, comparison/contrast, cause/effect, classification, definition
Cognitive demands: Reproduce facts/ideas; organize/reorganize information; apply/analyze/synthesize/evaluate
Specification of audience: Self, teacher, classmates, general public
Specification of role: Self/detached observer, other/assumed persona
Specification of tone, style: Formal, informal
Length: Less than 1/2 page, 1/2 to 1 page, 2–5 pages
Time allowed: Less than 30 min, 30–59 min, 1–2 h
Prompt wording: Question vs. statement, implicit vs. explicit, amount of context provided
Choice of prompts: Choice vs. no choice
Transcription mode: Hand-written vs. word-processed
Scoring criteria: Primarily content and organization, primarily linguistic accuracy, unspecified

Note. From Weigle (2002); adapted from Purves, Soter, Takala, and Vähäpässi (1984, pp. 397–398) and Hale et al. (1996).
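As a sketch of how such a choice of dimensions might be recorded, the dictionary below restates the example just given, dimension by dimension. The names follow Table 1; the tone/style entry is an added assumption for illustration.

# Sketch: the sample task above, expressed dimension by dimension
# (dimension names follow Table 1). Content is illustrative only.
sample_task = {
    "length": "1 page",
    "rhetorical_task": "narration",
    "genre": "letter",
    "audience": "a close friend",
    "stimulus": "series of picture prompts",
    "tone_style": "informal",        # a plausible addition, not stated above
    "scoring_criteria": "unspecified",
}

for dimension, value in sample_task.items():
    print(f"{dimension:>16}: {value}")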

Scoring

One of the most troublesome aspects of assessing writing for many teachers is assigning letter
grades or numerical scores to their students’ work. One reason for this difficulty is that many
teachers feel much more comfortable in the role of supportive coach than of evaluator. Another
reason is that teachers sometimes begin their assessment with some idea of how many points a
particular assignment is worth, but without a clear notion of how those points should be awarded
or the criteria they should use to grade their student work.
For these reasons, among others, teachers need to have a systematic process for assigning
scores to essays or other written work and some sort of written rubric that outlines the criteria for
grading. Sources for writing rubrics abound in print and online, so there is little need for a teacher
to start from scratch in developing a rubric for grading.
In creating a rubric, teachers need to be familiar with the main types of rubrics. Rubrics vary
along two dimensions: whether they are general (to be used across a variety of assignments/
writing tasks) or specific to an assignment, and whether the scale is holistic (a single overall
score is given) or analytic (i.e., separate points are given for different aspects of writing, such as
content, organization, and use of language). Much has been written about the advantages and
disadvantages of different types of scoring rubrics; see in particular Hamp-Lyons (1991a,b) and
Weigle (2002, chap. 6). While arguments can be made for either type of scoring rubric, research
suggests that, while holistic scales are faster and more efficient, analytic scales tend to be
somewhat more reliable than holistic scales, and certainly provide more useful feedback to
students, as scores on different aspects of writing can tell students where their respective
strengths and weaknesses are.
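As a concrete illustration of the analytic option, the sketch below combines category ratings into a single 100-point total. The weights are loosely modeled on the Jacobs et al. (1981) profile, but the exact numbers and the 0-1 rating scale are illustrative assumptions, not the published instrument.

# Sketch: combining analytic subscores into a total. Weights are
# loosely modeled on the Jacobs et al. (1981) profile but are
# illustrative assumptions.
WEIGHTS = {
    "content": 30,
    "organization": 20,
    "vocabulary": 20,
    "language_use": 25,
    "mechanics": 5,
}  # weights sum to 100

def total_score(subscores):
    """Combine 0.0-1.0 category ratings into a 100-point total."""
    missing = set(WEIGHTS) - set(subscores)
    if missing:
        raise ValueError(f"missing categories: {sorted(missing)}")
    return sum(WEIGHTS[cat] * subscores[cat] for cat in WEIGHTS)

essay = {"content": 0.8, "organization": 0.7, "vocabulary": 0.6,
         "language_use": 0.5, "mechanics": 0.9}
print(f"Total: {total_score(essay):.1f}/100")  # 24+14+12+12.5+4.5 = 67.0

It is the category-level numbers, not the total alone, that give students the diagnostic feedback mentioned above.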
In training teachers, it is useful to have them try out existing scoring rubrics on a set of essays
on a given topic (perhaps one holistic rubric such as the TOEFL writing rubric and one analytic
rubric such as that proposed by Jacobs, Zinkgraf, Wormuth, Hartfiel, and Hughey, 1981) and
compare their answers in small groups. Teachers in training usually learn from this experience
that (a) without exemplars at different levels the various descriptors are difficult to interpret
consistently; (b) they can usually agree on the best and the worst essays, but the ones in the
middle are more difficult to agree on; and (c) different raters read different things into papers and
bring their own values and experiences into the rating process, which highlights the importance
of rater training to clarify how the scale should be used in a given context so that raters can learn
to apply similar standards.
To summarize, there are many things that novice teachers need to learn about developing their
own classroom writing assessments—in particular, how to articulate their course objectives
clearly so that their assessments match their instruction as closely as possible, how to construct
prompts that can elicit reliable samples of writing that are valid indicators of their students’
ability, and how to score writing reliably and efficiently. This article has only scratched the
surface of these issues; the interested reader is referred to the sources listed in the Appendix for
additional information in these areas.

Portfolio assessment

Experienced writing teachers and scholars agree that writing tests such as those described in
the previous section are quite limited in their usefulness for assessing the complete range of a
student’s writing ability. Writing ability is perhaps best conceptualized as the ability to compose
texts in a variety of genres that are appropriate for their audience and purpose, and it is difficult, if
not impossible, to generalize from a single text on a single topic composed under time constraints
to this broader universe of writing. For this reason, many individual teachers and writing
programs have adopted portfolio assessment as a (potentially) more valid approach to writing
assessment. A complete discussion of portfolio assessment is beyond the scope of this paper;
interested readers are referred to Hamp-Lyons and Condon (2000), Mabry (1999), Weigle (2002,
chap. 9), and Wolcott and Leggett (1998) for more thorough treatments of portfolio assessment.
Here, I will briefly define portfolio assessment and provide a brief overview of some of the
advantages and constraints of portfolio assessment.
A portfolio is "a purposeful collection of student works that exhibits to the student (and/or
others) the student's efforts, progress, or achievement in a given area" (Northwest Evaluation
Association, 1991, p. 4, cited in Wolcott & Leggett, 1998). Portfolios vary greatly depending on
the age of students, the purpose of the course, and the learning context, among other variables, but
three essential components of a portfolio are collection, reflection, and selection (Hamp-Lyons
& Condon, 2000). A portfolio is a collection of written products rather than a single writing
sample, but it is the process of selecting and arranging the specific contents through deliberate
reflection that distinguishes a portfolio from a pile of papers or a large folder (p. 119). Another
important component of most portfolio assessment programs is delayed evaluation, which gives
students both motivation and time to revise their papers based on feedback and self-reflection
before turning them in for a final grade.
Portfolio assessment has several advantages over traditional writing tests as a means for
evaluating student growth and achievement in writing. First and foremost, portfolios allow
assessment and instruction to be integrated seamlessly, as everything that happens in the writing
class contributes directly to the process of assembling the portfolio. Furthermore, portfolio
assessment allows students to demonstrate their mastery of different genres and registers, as well
as their mastery of different aspects of the writing process such as the ability to revise one’s
writing based on feedback and to edit one’s writing for sentence-level errors. For second
language writers, in particular, portfolio assessment has the advantage of affording extra time for
revision and editing to students who may not perform as well under timed conditions.
Despite these advantages, however, implementing a portfolio assessment program is not without
its difficulties. One potentially problematic aspect of portfolio assessment is reliability of scoring:
individual portfolios may contain writing samples that vary greatly in quality, which makes it
difficult to assign a single score or grade to the portfolio, and the content of portfolios assembled by
different students may vary considerably, making it difficult to score consistently across portfolios.
Another area of potential difficulty has to do with practicality: setting up and maintaining a
successful portfolio program requires a great deal of advance planning and investment of
time and effort on the part of teachers, administrators, and students. These difficulties are not
insurmountable, however, and many teachers who have successfully implemented portfolio
assessment will state unequivocally that the benefits of portfolios far outweigh the difficulties.

What teachers should know about externally mandated assessments

In addition to knowing about classroom assessments, writing teachers need to be aware of
many issues related to large-scale assessment. In many programs and institutions, teachers are
obligated to prepare their students for large-scale examinations, ranging from locally produced
exit examinations to professionally written tests such as the TOEFL. Teachers frequently have
one of two attitudes towards these tests. Some teachers feel mistrustful of standardized tests and
the companies that make and administer them. They believe – not completely without
justification – that many externally imposed tests are thrust upon them and their students for
political reasons, and that such tests are created by people who are out of touch with the world of
education. Others are all too willing to trust the judgments of the "experts"
rather than their own expertise. They tend to assume that a test is valid simply because it was
written by a professional test writer and do not take the time to examine the test closely or look at
the match between the test and their own goals and objectives in teaching.
As a teacher trainer, I find it important to explore both of these points of view and point out
some of the dangers and misconceptions involved in each. Several scholars have pointed out the
inherently political nature of large-scale assessment, as tests can be used as gatekeeping
mechanisms that allow or restrict access to educational resources and opportunities (see, for
example, Shohamy, 1998; White, Lutz, & Kamusikiri, 1996). Questions about whose agenda is
being served by large-scale tests, who has the right to determine what "good writing" means, and
what the intended and unintended effects of policy decisions about testing are on students,
teachers, and programs, need to be continually asked, particularly by teachers, who are among
those most affected by large-scale tests and most immediately aware of how these tests affect
their students. On the other hand, teachers may too easily adopt the view described by Scharton
(1996) of ‘‘right-minded teachers struggl[ing] against ruthless big-company test designers who
merely want to sell a test score to administrators interested in a quick fix’’ (p. 56) and may not
appreciate the professionalism behind the development of large-scale tests. My own experience
as a member of the TOEFL Committee of Examiners for three years helped disabuse me of that
particular perspective; I found that the people involved in developing the TOEFL were deeply
committed to creating high-quality, fair, and valid assessments and were just as concerned with
mitigating negative consequences to students as any teacher.
Of course, most teachers will not have the opportunity to get an up-close look at the inner
workings of testing companies, and teachers’ busy schedules make it difficult to devote time to
advocacy issues. However, one valuable assignment that I have used with teachers in training to
raise their awareness of some of these issues is to critique an existing test from the point of view
of reliability, validity, authenticity, practicality, and washback. Students have critiqued large-
scale tests and tests that are used in their institutions and are often surprised at what they find out.
For example, students have discovered that placement and exit tests used in local language
schools and community colleges frequently have no handbook or technical manual, no record of
how they were developed or validated, and often use writing prompts that are not pretested or
equated, so that there is no way to determine the effect of particular prompts or prompt types on
the scores given. At the same time, students come to appreciate that large testing organizations
such as Educational Testing Service, which many are used to thinking of in negative terms
because of their dominance in high-stakes tests, in fact take tremendous care in defining
constructs, designing valid and reliable assessments, and maintaining a program of research to
ensure that their tests are of high quality.
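One way to see what pretesting and equating guard against is to compare score distributions across supposedly parallel prompts: a large gap in means suggests that scores partly reflect prompt choice rather than writing ability. The sketch below is a hypothetical first check with invented scores, not a full equating procedure, which would involve pretesting and statistical adjustment.

# Sketch: a rough check of prompt equivalence. Comparing means and
# spreads, as here, only flags an obvious imbalance; the scores are
# invented for illustration.
from statistics import mean, stdev

prompt_a_scores = [3, 4, 4, 5, 3, 4, 2, 4, 5, 3]
prompt_b_scores = [2, 3, 3, 2, 4, 3, 2, 3, 3, 2]

for name, scores in [("Prompt A", prompt_a_scores),
                     ("Prompt B", prompt_b_scores)]:
    print(f"{name}: mean={mean(scores):.2f}, sd={stdev(scores):.2f}")

# A gap of nearly a full scale point between means suggests the
# prompts are not interchangeable without score adjustment.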
As test users and as advocates for their students, teachers have a responsibility to understand
the powerful role that tests play in their students’ lives, and where relevant, to challenge misuses
of tests, for example, the use of a single essay test to make high-stakes decisions such as exit or
admissions, or the practice of administering writing prompts that have not been validated or even
pre-tested. Huot (1996) proposes a set of principles for assessing writing that take into account
the needs and concerns of all stakeholders in assessment; these principles can be a useful starting
point in evaluating any mandated assessments that are in place at an institution. Huot argues
that writing assessment should be site-based, that is, developed in response to a need at a
specific site; locally controlled by the institution involved; context-sensitive, taking into
account the instructional goals as well as the cultural and social environment of the institution;
rhetorically based, adhering to "recognizable principles integral to the thoughtful expression
and reflective interpretation of text"; and accessible, so that procedures for creating and scoring
writing assessments are available to all stakeholders, including the test takers themselves.
Teachers can also be advocates for fair testing by insisting that those who are administering
and scoring tests adhere to a code of practice and ethics, such as that promulgated by the
International Language Testing Association (http://iltaonline.com/). The ILTA code of ethics
consists of nine principles that should guide the professional behavior of language testers, each
elaborated upon with a set of annotations. For example, Principle 1 states: "Language testers
shall have respect for the humanity and dignity of each of their test takers. They shall provide
them with the best possible professional consideration and shall respect all persons' needs, values
and cultures in the provision of their language testing service." The draft code of practice outlines
responsibilities and obligations for test writers, institutions, and users of test results with regard to
good testing practices. Teachers should not hesitate to ask questions about the reliability and
validity of the tests that their students are required to take and how test results will be used, and
teachers should be proactive in bringing issues of questionable testing practices to the attention of
administrators.

Conclusion

In this paper, I have briefly touched upon several issues related to assessment that writing
teachers should be aware of, both in terms of tests that teachers develop for their own courses and
in terms of large-scale tests. Because assessment is such an integral component of teaching, it is
regrettable that many graduate programs in composition and TESOL do not require an
assessment course, and thus many teachers enter the classroom without a thorough grounding in
assessment issues. Fortunately, teachers are not without resources: there are regional associations
of language testing specialists that hold annual conferences, an increasing number of
assessment-related sessions at major international conferences such as TESOL, and many
excellent volumes that discuss assessment issues in clear, understandable terms. A solid
understanding of assessment issues should be part of every teacher’s knowledge base, and
teachers should be encouraged to equip themselves with this knowledge as part of their ongoing
professional development.

Acknowledgments

I am indebted to Diane Belcher, Alan Hirvela, and two anonymous reviewers for their helpful
suggestions on earlier drafts of this manuscript.

References

Alderson, J. C., Clapham, C., & Wall, D. (1995). Language test construction and evaluation. Cambridge: Cambridge
University Press.
Allen, D. (2002). Getting things done: The art of stress-free productivity. London: Piatkus Books.
Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford: Oxford University Press.
Bachman, L., & Palmer, A. (1996). Language testing in practice. Oxford: Oxford University Press.
Brown, H. (2004). Language assessment: Principles and classroom practices. White Plains, NY: Pearson Education Inc.
Brown, J. D., & Hudson, T. (1998). The alternatives in language assessment. TESOL Quarterly, 32, 653–675.
Byrd, P., & Nelson, G. (1995). NNS performance on writing proficiency exams: Focus on students who failed. Journal of
Second Language Writing, 4, 273–285.
Carson, J. G., Chase, N. D., Gibson, S. U., & Hargrove, M. (1992). Literacy demands of the undergraduate curriculum.
Reading Research and Instruction, 31(4), 25–50.
Cohen, A. D. (1994). Assessing language ability in the classroom. Boston, MA: Heinle and Heinle.
Davidson, F., & Lynch, B. K. (2002). Testcraft: A teacher’s guide to writing and using language test specifications. New
Haven, CT: Yale University Press.
Ellis, R. (2005). Measuring implicit and explicit knowledge of a second language: A psychometric study. Studies in
Second Language Acquisition, 27, 141–172.
Ferris, D. R., & Hedgcock, J. S. (2005). Teaching ESL composition: Purpose, process, and practice. Mahwah, NJ:
Lawrence Erlbaum Associates.
Gronlund, N. (2004). Writing instructional objectives for teaching and assessment (7th ed.). Upper Saddle River, NJ:
Pearson Education.
Hale, G., Taylor, C., Bridgeman, B., Carson, J., Kroll, B., & Kantor, R. (1996). A study of writing tasks assigned in
academic degree programs (TOEFL Research Report No. 54). Princeton, NJ: Educational Testing Service.
Hamp-Lyons, L. (1991a). Assessing second language writing in academic contexts. Norwood, NJ: Ablex.
Hamp-Lyons, L. (1991b). Scoring procedures for ESL contexts. In L. Hamp-Lyons (Ed.), Assessing second language
writing in academic contexts (pp. 241–276). Norwood, NJ: Ablex.
Hamp-Lyons, L., & Condon, W. (2000). Assessing the portfolio: Principles for practice, theory, and research. Cresskill,
NJ: Hampton Press.
Hudson, T., & Brown, J. D. (2002). Criterion-referenced language testing. Cambridge: Cambridge University Press.
Hughes, A. (2002). Testing for language teachers (2nd ed.). Cambridge: Cambridge University Press.
Huot, B. (1996). Toward a new theory of writing assessment. College Composition and Communication, 47, 549–566.
International Language Testing Association. (2000). Code of ethics for ILTA. Retrieved October 3, 2007 from
http://www.iltaonline.com/code.pdf.
Jacobs, H. L., Zinkgraf, S. A., Wormuth, D. R., Hartfiel, V. F., & Hughey, J. B. (1981). Testing ESL composition: A
practical approach. Rowley, MA: Newbury House.
Johns, A. M. (1991). Interpreting an English competency examination: The frustrations of an ESL science student. Written
Communication, 8, 379–401.
Kroll, B., & Reid, J. (1994). Guidelines for designing writing prompts: Clarifications, caveats, and cautions. Journal of
Second Language Writing, 3, 231–255.
Mabry, L. (1999). Portfolios plus: A critical guide to alternative assessment. Thousand Oaks, CA: Corwin.
McNamara, T. F. (1996). Measuring second language performance. London: Longman.
Purves, A. C., Soter, A., Takala, S., & Vähäpässi, A. (1984). Towards a domain-referenced system for classifying
assignments. Research in the Teaching of English, 18(4), 385–416.
Raimes, A. (1991). Out of the woods: Emerging traditions in the teaching of writing. TESOL Quarterly, 25, 407–430.
Scharton, M. (1996). The politics of validity. In E. M. White, W. D. Lutz, & S. Kamusikiri (Eds.), Assessment of writing:
Politics, policies, practices. New York: The Modern Language Association of America.
Shohamy, E. (1998). Critical language testing and beyond. Studies in Educational Evaluation, 24, 331–345.
Sweedler-Brown, C. O. (1993). ESL essay evaluation: The influence of sentence-level and rhetorical features. Journal of
Second Language Writing, 2, 3–17.
Weigle, S. (2002). Assessing writing. Cambridge: Cambridge University Press.
Weigle, S. (2005). Second language writing expertise. In K. Johnson (Ed.), Expertise in language learning and teaching
(pp. 128–149). Hampshire, England: Palgrave Macmillan.
White, E. M., Lutz, W. D., & Kamusikiri, S. (Eds.). (1996). Assessment of writing: Politics, policies, practices. New York:
The Modern Language Association of America.
Wolcott, W., & Leggett, S. M. (1998). An overview of writing assessment: Theory, research and practice. Urbana, IL:
National Council of Teachers of English.

Appendix. Selected references on assessment

Alderson, J. C., Clapham, C., & Wall, D. (1995). Language test construction and evaluation.
Cambridge: Cambridge University Press.
Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford: Oxford
University Press.
Bachman, L., & Palmer, A. (1996). Language testing in practice. Oxford: Oxford University
Press.
Brown, H. (2004). Language assessment: Principles and classroom practices. White Plains,
NY: Pearson Education.
Brown, J. D., & Hudson, T. (1998). The alternatives in language assessment. TESOL
Quarterly, 32, 653–675.
Cohen, A. D. (1994). Assessing language ability in the classroom. Boston: Heinle and Heinle.
Davidson, F., & Lynch, B. K. (2002). Testcraft: A teacher's guide to writing and using language
test specifications. New Haven, CT: Yale University Press.
Hamp-Lyons, L. (1990). Second language writing: Assessment issues. In B. Kroll (Ed.),
Second language writing: Research insights for the classroom (pp. 69–87). Cambridge:
Cambridge University Press.
Hamp-Lyons, L. (1991). Assessing second language writing in academic contexts. Norwood,
NJ: Ablex.
Hamp-Lyons, L., & Kroll, B. (1997). TOEFL 2000 writing: Composition, community, and
assessment (TOEFL Monograph Series Report No. 5). Princeton, NJ: Educational Testing
Service.
Hudson, T., & Brown, J. D. (2002). Criterion-referenced language testing. Cambridge:
Cambridge University Press.
Huot, B. (1990). The literature of direct writing assessment: Major concerns and prevailing
trends. Review of Educational Research, 60, 237–263.
Shohamy, E. (2001). The power of tests: A critical perspective on the uses of language tests.
London: Longman/Pearson Education.
Weigle, S. (2002). Assessing writing. Cambridge: Cambridge University Press.
White, E. (1994). Teaching and assessing writing: Recent advances in understanding,
evaluating and improving student performance (2nd ed.). San Francisco: Jossey-Bass.
Wolcott, W., & Leggett, S. M. (1998). An overview of writing assessment: Theory, research
and practice. Urbana, IL: National Council of Teachers of English.
