
Classroom Assessment to Support Teaching and Learning

By Lorrie A. Shepard

Classroom assessment includes both formative assessment, used to adapt instruction and help students to improve, and summative assessment, used to assign grades. These two forms of assessment must be coherently linked through a well-articulated model of learning. Sociocultural theory is an encompassing grand theory that integrates motivation and cognitive development, and it enables the design of equitable learning environments. Learning progressions are examples of fine-grained models of learning, representing goals, intermediate stages, and instructional means for reaching those goals. A model for creating a productive classroom learning culture is proposed. Rather than seeking coherence with standardized tests, which undermines the learning orientation of formative assessment, I propose seeking coherence with ambitious teaching practices. The proposed model also offers ways to minimize the negative effects of grading on learning. Support for teachers to learn these new assessment practices is most likely to be successful in the context of professional development for new curriculum and standards.

Keywords: formative assessment; summative assessment; feedback; grading; learning theory; equity

Lorrie A. Shepard is a university distinguished professor in the School of Education at the University of Colorado Boulder. Her research focuses on psychometrics and the use and misuse of tests in education settings. Most cited are her contributions to validity theory, standard setting, bias detection, the effects of high-stakes accountability testing, and the integration of learning theory with classroom formative assessment.

NOTE: Special thanks to Michael Feuer and Jim Pellegrino for their thoughtful comments on an earlier draft of this article.

Correspondence: Lorrie.Shepard@Colorado.edu

DOI: 10.1177/0002716219843818

The use of assessment in classrooms to support teaching and learning is a fundamentally different undertaking from large-scale testing designed to monitor trends, hold schools accountable, evaluate teachers and programs, or inform selection and placement decisions. (These other purposes of testing motivate the organization of this volume.) Classroom assessment includes both formative assessment, used to adapt instruction and help students improve, and summative assessment, used to assign grades
or otherwise certify student achievement. Often formative strategies for eliciting
and responding to student thinking may be informal and can be engaged in con-
texts of real-world problem solving rather than requiring test-like formats.
Because classroom assessment is intended to aid directly in the learning process
(not merely to measure learning outcomes), it necessarily must be closely tied to
instructional practices and to relevant research: on learning (in subjects such as
mathematics, science, and literacy); on motivation, feedback, and self-regulation;
on cognitive and sociocultural aspects of learning and identity; on curriculum; on
adaptive teaching and teacher-learning; and on theories of formative assessment
and grading.
Historically, the measurement research literature did not distinguish between
the kinds of test formats needed for standardized tests and those that teachers
would need for classroom quizzes and tests (Shepard 2006). Because of beliefs
about learning a century ago, equating learning goals with test formats did not
seem unreasonable (Shepard 2000). (For historical context see, e.g., Vinovskis,
this volume; Reese 1999; for an earlier review, U.S. Congress, Office of Technology
Assessment 1992). This context for classrooms began to change in the 1990s, how-
ever, due to two major trends: (1) increased accountability pressure attached to
high-stakes tests leading to outcries about associated distortions of curriculum and
instruction, and (2) changing conceptions of subject-matter expertise and learning
processes following from the rise of cognitive psychology and constructivist
approaches to teaching and learning. Although beyond the scope of this article, it
is important to note that assessment reformers in the United States responded to
these challenges by developing new subject matter standards and new perfor-
mance assessments to represent and enact those standards. In the United Kingdom
and in other Organisation for Economic Co-operation and Development (OECD)
countries, assessment reforms focused on formative assessment as a remedy more
likely to raise student achievement (Black and Wiliam 1998a, 1998b; OECD
2005). The research literatures resulting from both of these strands of work make
important contributions to the work that I report here.
In this article, I consider how research on learning and motivation can be brought together with research on grading so that they will not so often be at cross-purposes. I propose a model for classroom assessment based on sociocultural theory. Rather than seeking alignment with standardized tests, which undermines classroom cultural practices that support deep learning, I draw the connections between productive formative assessment practices and ambitious, high-leverage teaching practices. My main concern is to help policy-makers avoid the mistake of thinking that tests designed primarily to serve purposes of monitoring, accountability, or selection can also be assumed to be useful to teachers making decisions about their students.

Integrating Learning and Teaching Research to Develop a Coherent Model for Classroom Assessment

In this section, I summarize relevant research and offer an argument for how these
findings and principles can be fit together into a coherent, research-informed model
for classroom assessment. Coherence is a crucial enabling feature of any model
because it supports the sense-making of individual participants and also makes it
more likely that separate reform efforts—those based on “culturally relevant peda-
gogy,” standards-based instruction, technology, and assessment—can be made
mutually supportive. The National Research Council (2001) report, Knowing What
Students Know (KWSK), introduced the idea of horizontal coherence at the level of
classrooms when curriculum, instruction, and assessment are built from a common
model of learning. This idea of building coherence from a shared model of learning
is fundamental. KWSK emphasized further that ideally, assessment would not just
be “aligned” with instruction but would be “integrated seamlessly into instruction so
that teachers and students are receiving frequent but unobtrusive feedback about
their progress” (p. 256). KWSK also argued for vertical coherence between class-
room level and large-scale assessments, recognizing that representation of learning
goals would have to be at very different grain sizes. I do not elaborate on vertical
coherence here, but in the final section of the article, I argue against vertical coher-
ence in those cases where impoverished representations of intended learning goals
are likely to undermine deep learning and equity goals.
To support learning, formative assessment must be grounded in a well-articulated model of learning (National Research Council 2001; Penuel and
Shepard 2016), based on empirical studies as well as curricular conceptualiza-
tions, which specify desired outcomes as well as pathways and critical junc-
tures on the way to competence. Shepard, Penuel, and Pellegrino (2018) have
further distinguished between “little” and “big” learning theories: the former are the discipline-specific, cognitive models featured in KWSK, and the
latter refer to grander schools of thought about how knowing develops in the
human mind and in communities. In comparison to differential, behaviorist, and
cognitive conceptions of learning, Shepard, Penuel, and Pellegrino argue that
sociocultural theory (discussed in the next section) is the encompassing theory
because it integrates motivation and cognitive development. This integration is
essential in a model for classroom assessment where formative assessment prac-
tices lead to and enable graded demonstrations of proficiency.

Contemporary research on learning goals


The first requirement for a model of classroom assessment is that it do a good
job of enacting and representing valued learning goals. Present-day research
findings—helping us to define what those goals should be—are consistent with
but also a considerable extension of the twentieth-century cognitive science
research summarized in How People Learn (National Research Council 2000).
Whereas with past reforms real-world applications supplied the contexts for
meaningful learning, more recent research has focused on the nature of
participation in “authentic practice” (Sawyer 2006, 5), itself a significant goal for
learning. Disciplinary reforms of the 1990s already recognized that learning with
understanding is principled, entails mental organizational structures, and enables
transfer across contexts. In recent decades, researchers in the learning sciences
have found that deep learning occurs when students “engage in activities that are
similar to the everyday activities of professionals who work in a discipline”
(Sawyer 2006, 4); and it follows necessarily that this type of learning is specific to
each discipline and community of expertise, rather than involving generic reason-
ing abilities (National Research Council 2012).
For example, the National Research Council (2007) Committee on Science
Learning, Kindergarten through Eighth Grade identified four interconnected
strands of learning needed to achieve scientific proficiency:

1. know, use, and interpret scientific explanations of the natural world (i.e.,
phenomena);
2. generate and evaluate scientific evidence and explanations;
3. understand the nature and development of scientific knowledge; and
4. participate productively in scientific practices and discourse (p. 334).

Traditional learning goals for K–8 science education account for only part of
strands 1 and 3; but even the idea of hypothesis testing in strand 3 now means
being able to conjecture and generate research questions, not just performing
cookbook versions of the “scientific method.”
At the turn of the twenty-first century, policy-makers, politicians, and business
leaders became keen on expanding the definition of learning goals, for different
reasons; but in general, they were concerned about international competitiveness
(as they had been in the 1990s) and the need for a workforce with technological
and analytical thinking skills (National Alliance of Business 2002). In 2009, when
the Common Core State Standards Initiative was launched by the National
Governors Association and the Council of Chief State School Officers (2010a,
2010b), “twenty-first-century skills” became the new buzz phrase (Mathews
2009). As disciplinary researchers became involved in the development of new
standards, a key feature in both English language arts and mathematics, as well
as the Next Generation Science Standards (National Research Council 2013),
was the integration of content strands with disciplinary practices. Also in 2009,
the National Research Council Committee on Defining Deeper Learning and
21st Century Skills was convened to examine research evidence for how such
skills are developed. The committee acknowledged that these types of goals for
learning are not new; skills and abilities such as critical thinking, reasoning and
argumentation, innovation, flexibility, initiative, self-reflection, collaboration, and
communication have always been valued in society, but what may be new is the
expectation that all students develop these abilities (National Research Council
2012). The committee identified three main categories of skills—cognitive,
intrapersonal, and interpersonal—that are inherently intertwined and cannot be
taken apart for separate didactic treatment. In this context, formative assessment
became increasingly valued as a tool to guide student progress.
Learning progressions are the most prominent example of fine-grained models of learning that represent goals, intermediate stages, and instructional means
for reaching those goals. Learning progressions are different from traditional
curricular scope and sequence charts because hypothetical trajectories are built
from existing research evidence plus expert judgments and then are refined
through field testing. Developing learning progressions is essentially a special
case of curriculum development where instructional activities and embedded
assessments plus teacher learning supports are jointly designed (Shepard,
Penuel, and Pellegrino 2018). The promise of learning progressions comes from
the substantive insights they provide to help in interpreting student thinking and
responding to well-known learning challenges (Shepard 2018). Substantively
developed continua—where distinct ways of thinking are identified for each
stage of development—are different from quantitative score scales anchored by
sample test items. Individual test items are difficult for many reasons and typi-
cally do not tell the story of the mental conceptions a student is holding or how
a student’s ideas might be engaged toward further learning.
A major limitation of learning progressions follows from one of their strengths.
Simply put, they are difficult to develop. As a result, there are few existence proofs of learning progressions, and they could not plausibly be developed for the entirety of K–12 curricula (Brookhart 2018). Instead, a few
well-designed learning progressions can help teachers attempting to take on
ambitious teaching practices (Shepard, Penuel, and Pellegrino 2018). For
broader-scale implementation, teachers can adopt a “learning progression
approach” and begin to think more explicitly about ways to recognize intermedi-
ate and partially formed student ideas and also to keep track of and share with
colleagues successful instructional interactions (Alonzo 2017; Shepard 2018).

Integration of knowing and “becoming”


Sociocultural and “situative” theories of learning have emerged as a more powerful and complete perspective on the social nature of learning than cognitive theories alone (see also Mislevy, this volume). Situated learning refers to more than just the location where learning occurs; it also encompasses the
processes by which the individual and community are changing or evolving
together through participation in communities of practice (Lave and Wenger
1991; Rogoff 1997). Learning then is defined in part as the transformation of an
individual’s ability to participate in valued social and cultural activities. The pro-
cesses by which an individual engages in practice involve emotional, motivational,
and relational aspects of self, not just knowing (Holland and Lave 2009).
Personal identity is, therefore, critically important because individuals are lit-
erally “authoring” themselves as they contribute to a community that is in turn
shaping them. Sociocultural theorists have grounded their work in ethnographic
field studies, discourse analysis, and other qualitative methodologies to provide
integrative accounts of participation and development. Their findings are consist-
ent with experimental research literatures in social psychology on self-efficacy,
self-determination, and self-regulation (Pintrich 2000; Yeager and Walton 2011).
Sociocultural theory enables an expanded view of goals for learning. And it reframes the ways that classroom activities might be organized to provide
extended “practice” with the ways of thinking and reasoning that characterize
mature participation in disciplines such as history, science, mathematics, or lit-
erature. Practice here does not mean repetitive drill; it means extended opportu-
nities for meaningful participation—in conversations about mathematical
problem solving, for example.
Sociocultural theory also has profound implications for the design of equitable
learning environments, because it emphasizes ambitious learning targets and the
role of identity. Acknowledging who a person is and envisioning who that person
might become are essential to meaningful participation. Cognitive research
taught us decades ago about the importance of prior knowledge for new learning,
but this was often taken to mean the prior knowledge taught in school. In con-
trast, sociocultural theory attends to all life experiences as relevant to learning,
including cultural practices in homes, neighborhoods, and places of worship and
intuitive understandings of the natural world. And because emotion and cogni-
tion are now understood to be inextricably linked, a student’s sense of safety and
respect are also essential to meaningful learning opportunities. Sociocultural
perspectives help to acknowledge who a person is as she joins a classroom com-
munity (e.g., Moll et al. 1992), but these theories also call for ongoing mediation
of participation across communities of practice, as teachers and students work
out what things are the same and what are differences between home and school
(Calabrese Barton, Tan, and Rivet 2008).
Among the myriad social interactions that contribute to learning and develop-
ment, feedback is a construct fundamental to formative assessment. Practice
without feedback does not further learning, and thus feedback, from self-reflec-
tion or from others, is essential for learning (National Research Council 2012).
Yet a striking finding from the literature is that in one-third of studies, the effect
of feedback is negative compared to controls with no feedback (Kluger and
DeNisi 1996; Hattie and Timperley 2007; Shute 2008). The explanation for this
result comes from the same or similar studies of motivation, considered next,
showing that feedback can be harmful when it focuses on the learner in compari-
son to others rather than on task performance.
Feedback has not typically been theorized from a sociocultural perspective
because the idea derives from a mechanical process, and historically most studies
were framed from a behaviorist or cognitive perspective. Shute (2008) acknowl-
edged a few studies that use Vygotskian notions of scaffolding to conceptualize
formative feedback and attend to the development of self-regulation capabilities
along with the furthering of cognitive goals (Graesser, McNamara, and Van Lehn
2005). To better understand the interplay of cognitive and intrapersonal or emo-
tional dimensions of learning, research on effective tutoring is relevant. Lepper
and Woolverton (2002), for example, found that expert tutors carefully tuned the
problems they selected to students’ interests and current state of knowledge,
actively considered students’ emotional states by giving them explicit choices
about tackling more difficult problems, rarely provided didactic lessons on how
to do problems, but did provide more metacognitive talk about learning
strategies and making connections between new problems and work already
mastered.
The research literature on motivation to learn is vast and difficult to distill
because there are numerous theories that overlap with similar (but differently
named) constructs. For example, according to self-determination theory (Deci
and Ryan 1985, 2008) individuals have basic needs for competence, autonomy,
and what they call relatedness. Autonomous or intrinsic motivation involves voli-
tion and choice and leads to productive and successful outcomes. By contrast,
controlled motivation comes from pressure and sources of control outside the
self. These ideas are quite similar to research on goal orientation whereby stu-
dents are said to adopt either mastery or performance goals (Dweck 1986).
Students working toward mastery or task-involved goals are willing to invest
effort in becoming more competent, to tackle more difficult problems, and to
willingly seek new strategies. By contrast, students with a performance or ego-involved orientation may engage in learning behaviors, but mostly to get good grades or to avoid being judged incompetent.
Closely related to students’ sense of autonomy versus control are beliefs about
the nature of intelligence. As outlined by Dweck (2002), children who believe
that intelligence can be developed are more willing to invest effort in learning,
are more self-confident, and exhibit “hardiness” in the face of setbacks. Currently
popular growth mindset (Dweck 2006) interventions are one example of short-
duration, psychological interventions shown to improve student achievement by
altering beliefs about intelligence, stereotype threat, or other self-conceptions
(Yeager and Walton 2011). These interventions are difficult to scale up, however,
because they can easily devolve to superficial imitations. A better approach is not
to conceive of separate motivational therapies but rather to develop a classroom
culture focused on learning (Shepard 2000), where trajectories for development
of academic, intrapersonal, and interpersonal capabilities are integrated and
mutually supportive (National Research Council 2012). This means, for example,
resisting grading practices and normative comparisons that by definition make
only some children winners.

Classroom culture: Formative assessment, grading, and equitable and ambitious teaching

Here I bring together the research summarized above—on learning goals and
learning processes—to envision a model for productive classroom assessment. I
borrow from models developed by Joan Herman (2010, 2013) and from a version
of Herman’s 2010 model adapted by the National Research Council’s (2012)
Committee on Defining Deeper Learning and 21st Century Skills. Although draw-
ing a picture to represent complex ideas is necessarily fraught with its own potential
for error and miscommunication, talking about what belongs in the picture and
why is nonetheless a useful device for summarizing and clarifying core ideas.
Figure 1. A Progression-Based Model of a Classroom Learning Culture Connecting Formative Assessment Practices with Ambitious Teaching

As shown by the central arrow in Figure 1, formative assessment practices must be coherently linked to learning goals by means of a research-based or curriculum-supported learning progression and should thereby enable development toward culminating or summative assessments. In Figure 1, however, I
have purposely omitted reference to either benchmark or accountability assess-
ments, which occur in other models (e.g., Herman 2010). Most present-day
examples of such assessments are not coherently aligned with ambitious content
goals and disciplinary practices. As a result, thoughtful classroom practices most
often need to manage the dissonance rather than presume coherence (see, e.g.,
Braaten et al. 2017).
Figure 1 is drawn to convey the idea that deep learning entails the concomitant development of cognitive, intrapersonal, and interpersonal competencies
(National Research Council 2012), represented by the three circles on the left
that are then entwined in the learning progression. Many of the specific compe-
tencies identified as contributing to these processes of knowing and becoming
are also salient in the formative assessment literature—these include critical
thinking, problem-solving, self-regulation, metacognitive skills, communication,
and collaboration. On the right side of the picture, I have made the model more
complex than a single summative assessment. Herman (2013) uses a student
photograph at the point of end-of-year assessments, which I have adopted to
symbolize the importance of identity as an integral contributor to and outcome for learning. I have separated cognitive summative assessments used for grading
from other learning outcomes, however, because showing everything entwined
here could lead to serious misunderstandings.
As is evident from the motivation literature, grading is a problem for learning.
Because grading practices elicit comparisons to classmates and imply a perma-
nent lack of ability when learning targets seem out of reach, grading require-
ments are an obstacle for every teacher hoping to develop a learning-focused
classroom culture (Shepard 2000). Point systems are especially problematic if
they are used as external rewards to “motivate” and control students. There is
ample experimental evidence in the formative assessment literature that students
learn more from written comments alone than from comments plus grades
(Black et al. 2003; Butler 1988).
The research on grading (e.g., Shepard, Penuel, and Pellegrino 2018) yields
recommendations about how best to lessen the distorting effects of grading: by
ensuring that grades are based on projects and rich representations of intended
learning goals, by establishing routines whereby students can invest effort and
improve based on feedback, and by allowing for later evidence of mastery to
replace early attempts that fell short. The requirement to keep parents informed
can better be addressed with substantive examples of progress instead of letter
grades or percentages that say nothing about actual learning. Although it might
be argued that high school students must be assigned grades as evidence for col-
lege admissions, the same cannot be said for young children, for whom parents
need evidence about developmental milestones and advice about what they can
do to support their children’s growth.
As much as grading may be unavoidable for academic achievement, it is nei-
ther necessary nor defensible to grade students on personality or social-emotional
dimensions. Affective measures may have sufficient reliability and validity for
aggregate school climate studies, but they are not sufficiently accurate at the
individual student level and are vulnerable to teacher bias. Measurement
research on grading practices consistently recommends against using “enabling
factors” (McMillan 2001) such as effort, work habits, and participation as ingre-
dients when determining grades. The primary technical reason for this argument
is that these other factors distort the meaning of grades as an indicator of achieve-
ment; this is the same reasoning behind arguments for standards-based grading
systems (Guskey 2009). A more compelling reason, however, when considering
classroom culture, is that giving points for effort and collaboration leads to the
commodification of these endeavors and invites a performance orientation, for
example, working to please the teacher, rather than supporting students to
develop a learning or mastery orientation. Factors that enable learning, such as
attention, organizational skills, and collaboration, are more appropriate as targets
for formative feedback than for grading.
The picture I present in Figure 1 differs from the models provided by Herman
(2010, 2013) and the National Research Council (2012) because I have located
the learning progression arrow and entwining of multiple competencies within
the context of classroom culture—and beyond that, within the context
of children's families and communities. But this pictorial difference in no way represents fundamental differences in our understandings of sociocultural theory
or research on learning and motivation. For example, in their list of research-
based teaching methods, the deeper learning committee included
priming student motivation by connecting topics to students’ personal lives and inter-
ests, engaging students in collaborative problem solving, and drawing attention to the
knowledge and skills students are developing, rather than grades or scores. (National
Research Council 2012, 9)

As stated previously, I question whether it is useful for teachers, parents, or policy-makers to continue to locate formative assessment on a continuum, as
part of an assessment “system,” shared with formal tests and accountability-
focused instruments. Elsewhere, my colleagues and I have supported and agreed
with the KWSK conception of vertical coherence. But here I want to argue that
such vertical coherence assigns the meanings and images of standardized tests to
formative assessment and thus undermines development of classroom cultural
practices that support deep learning. A better way to support student and teacher
learning is to align formative assessment practices with ambitious teaching,
depicted by two sets of arrows acting on learning in the middle of the figure.
The vocabulary and research literature on ambitious, high-leverage, and core teaching practices arose more than a decade ago primarily in the area of
teacher preparation (Ball and Forzani 2009; Lampert and Graziani 2009).
Teaching for deep understanding and flexible knowledge use requires teaching
practices that are interactive and discourse-based, allowing students extensive
opportunities to explain their reasoning and gain experience with specific aspects
of disciplinary practice.
Formative assessment is itself a high-leverage practice closely connected to
other core practices. For example,

to facilitate learning, teachers must know their students well—not only their personali-
ties and preferences, but also their ideas about subjects and their ways of thinking about
them, including their intellectual habits, misconceptions, and interests. They must
understand the ways in which students’ personal and cultural backgrounds bear on their
work in school and be able to respond with appropriate instructional activities. This
means skillfully eliciting, probing, and analyzing students’ thinking through verbal
interactions and written work. (Ball and Forzani 2011, 20, emphasis added)

Although being able to lead a discussion may seem like a generic skill, developing a repertoire of specific teaching strategies means that most high-leverage practices are subject-matter specific and, when possible, grade-level and community specific. Ball and Forzani (2011), for example, argue that
second grade teachers should be able to probe for understanding of complex
subtraction problems, and middle school English teachers should have strategies
not just for teaching academic English but for helping students to understand
“how and when to use academic English” (p. 21). An extensive team of research-
ers has documented the distinct reading, reasoning, and argumentation practices
that lead to discipline-specific learning goals in literature, science, and history
(Goldman et al. 2016). In science, Windschitl et al. (2012) created discourse-
based patterns of instruction that included (1) eliciting students’ ideas, which are
then analyzed and used to adapt instruction; (2) guiding student sense-making by
asking them to draw models, share interpretations with classmates, and make
revisions; and (3) pressing students for evidence-based explanations. These ideas
in the core practices literature, especially about eliciting and responding to stu-
dent thinking (McDonald, Kazemi, and Kavanagh 2013), are completely consist-
ent with the formative assessment and learning progression literature, especially
Alonzo’s (2017) ideas about informal learning progressions when formal progres-
sions have yet to be developed.
To reiterate, formative assessment is carried out during the instructional pro-
cess for the purpose of adapting instruction to improve learning (Penuel and
Shepard 2016, 788). It can be conceived of as a set of core practices that intersect
with those intended to support deep learning and participation in disciplinary
discourse practices. More importantly, however, like ambitious teaching, forma-
tive assessment implies an ethos or classroom culture focused on equitable and
collaborative learning. Explicitly focusing classroom activity on what is being
learned was one of the most significant contributions of the Assessment Reform
Group efforts in the United Kingdom in the 1990s. Formative assessment strate-
gies include explicit sharing of learning goals and criteria for judging quality work,
questioning and other classroom routines that make thinking visible, explicit feed-
back plus informal feedback through hearing other students’ ideas, and peer- and
self-assessment. These techniques are important for providing information and for
shifting the nature of classroom interactions, but they are insufficient by them-
selves if there is not a commensurate change in the social meaning of evaluation
(Shepard 2000). Deep learning can only be supported in a cultural context of trust
and respect where students are willing to reveal what they currently understand
with full confidence that talking about ideas will surely lead to new learning.
As elaborated elsewhere, a sociocultural approach to formative assessment
involves more attention being paid to codefining emergent learning goals with
students—goals that explicitly attend to identity and that support students in
navigating between “everyday and disciplinary forms of thinking, being, and
doing” (Penuel and Shepard 2016, 821). Curricular activities are designed to
build from students’ interests, experiences, and funds of knowledge and to focus
on endpoints that are meaningful to students. Assessment processes are embed-
ded in social practices so that they remain authentic to disciplinary expertise. For
example, Windschitl et  al. (2012) build students’ content knowledge through
classroom routines whereby students explain their ideas to others, construct and
compare theories, and justify claims, all of which are occasions for peer and self
assessment. Ambitious teaching practices have been shown to be possible and,
when enacted, produce rewarding results; but such practices are neither ordinary
nor routine. I turn now to research evidence on teacher learning to identify the
supports needed to make these kinds of transformations possible.
Research on Teacher Learning


Over the past three decades, research on teacher learning has shifted in parallel
to changes in our understandings of student learning derived from cognitive and
sociocultural findings. In contrast to traditional “staff development” efforts
focused on what new things teachers needed to know, professional learning today
focuses on knowledge use and the kinds of experiences grounded in practice that
teachers need to have to be able to teach in profoundly different ways. Teaching
that supports development of students’ deep understanding, twenty-first-century
skills, and proficiencies called for by contemporary standards is complex and not
easily mastered. It requires conceptual framing plus ongoing implementation,
reflection, and further adaptation to learn how to be this kind of teacher.
Recognizing the social and contextualized nature of learning, for example, helps
to explain why shared teacher discourse communities (now called professional
learning communities) are an essential resource to help teachers work out new
roles and try out and adapt new instructional routines (e.g., Putnam and Borko
2000).
Three persistent themes emerge from a thorough review of research on
teacher learning and explain why professional learning communities are now the
“new paradigm” to support teacher learning and in turn student learning
(Darling-Hammond and Richardson 2009). The review argued that effective
professional development

•• deepens teachers' knowledge of content and how to teach it to students;
•• helps teachers to understand how students learn specific content;
•• provides opportunities for active, hands-on learning;
•• enables teachers to acquire new knowledge, apply it to practice, and reflect on the results with colleagues;
•• is part of school reform efforts that link curriculum, assessment, and standards to professional learning;
•• is collaborative and collegial; and
•• is intensive and sustained over time.

Four of these findings refer to the general character of professional development opportunities and how they support teacher learning. For our purposes here, the other three bullets (deepening teachers' content knowledge, understanding how students learn specific content, and linking curriculum, assessment, and standards to professional learning) pertain to the substance of what teachers engage with in professional learning communities. Earlier in this article, I made an
argument for coherence to support teacher sense-making and for discipline-
specific models of student learning, which are at the heart of what teachers are
learning about. Specifically, as noted, teachers are deepening their own knowl-
edge of content and how it is that students come to understand specific
content.
If teaching in fundamentally different ways is a complex and daunting task,
then teachers should not be expected to do it alone or without resources and,
importantly, they should not have to make sense of multiple reforms separately.
Research on formative assessment demonstrates its extraordinary potential to
improve student learning, but not if it is implemented in a way that is at cross-
purposes with the underlying theory of learning. Nor does it make sense to
implement formative assessment as its own intervention. Greater coherence and
greater effectiveness are much more likely if formative assessment professional
development is integrated with ambitious teaching practices and with curricular
reforms in literacy, mathematics, social studies, and so forth; and if professional
learning communities work out explicitly how they will protect the intentions of
formative assessment practices from grading requirements.
In our analysis of formative assessment interventions (Penuel and Shepard
2016), we outlined theories of action as well as theories of learning for each
formative assessment approach. Our theories of action specifically considered
what material resources and what professional opportunities would be needed to
support teachers in taking on new roles. Nearly all sociocognitive interventions
are embedded in curriculum or learning trajectories, which require extensive
development and field testing. These resources are not scripts and require fur-
ther adaptation in local contexts; but because teachers do not have to start from
scratch to devise activities, they can focus more intently on trying out new
instructional strategies, analyzing students’ thinking, and sharing “breakthroughs”
with colleagues, that is, the critical questions, modeling, or peer interactions that
helped students to get past familiar stumbling blocks. For sociocultural interven-
tions, material resources can include effective strategies for eliciting students’
interests, experiences, and funds of knowledge.

Summary and Policy Implications


Classroom assessment is fundamentally different from large-scale assessment
(addressed in other articles of this volume) and is even different from individual
student testing used to make selection and placement decisions. Learning-
focused assessment does not need to look like a test and does not require stand-
ardization. Classroom assessment includes both formative assessment—used to
help students revise their thinking or for teachers to adapt instruction—and
summative assessment used primarily to assign grades. To support teaching and
learning, formative assessment practices need to be fully integrated with high-
leverage, ambitious teaching practices shown to support deep learning. Although
each of these constructs is supported by extensive, separate research literatures,
they are closely connected because they are each grounded in the same theories
and empirical findings from cognitive and sociocultural learning research. These
research literatures agree that deep learning for teachers as well as students
requires engagement with discipline-specific ways of knowing and doing. For
example, even very young children can learn scientific practices such as observ-
ing, measuring, and recording phenomena, while older children learn how to
model scientific phenomena and interpret representations. A different set of
thinking and reasoning skills is needed for historical inquiry where students must
be able to contextualize and interpret primary documents and then reconcile
accounts from multiple historical documents (Reisman 2012).
The research-informed model for classroom assessment proposed in this arti-
cle shows how discipline-specific learning progressions or local curricula can be
used to organize both instructional activities and formative assessment processes
so that they build toward intended learning goals represented in summative
assessments. Contemporary research on learning also requires that attention be
paid to the motivational consequences of student evaluation. This means that
educators and policy-makers must be better informed about the kinds of assess-
ment practices that attend to students’ assets from home and community and to
ways of enabling self-regulation versus practices that rely on extrinsic motivation.
It makes little sense for school districts to invest in popular social psychology
interventions, such as growth mindset (Dweck 2006), without recognizing that
district testing, grading, and score-posting requirements have far more pervasive
and long-term effects on students if they are made to see themselves as less than
capable learners.
A central argument made in this article and in prior work (Penuel and Shepard
2016) is that, to be effective, assessment interventions must be designed accord-
ing to a research-based theory of learning. For classroom purposes, both “big”
and “little” theories are required (Shepard, Penuel, and Pellegrino 2018).
Teachers need big-picture understandings of how cognitive competencies such as
critical thinking, reasoning, and problem solving are jointly developed along with
personal competencies such as self-regulation, collaboration, and communication
(National Research Council 2012). “Little” theories refer to fine-grained
­models—on the scale of classroom instructional units and lessons—that describe
how knowledge and skills are developed in a particular domain. Enactment of
these theories requires the support of empirically tested learning progressions or
specific curricula designed to reach the ambitious learning goals set by Common
Core State Standards (National Governors Association Center for Best Practices
2010a, 2010b) and Next Generation Science Standards (National Research
Council 2013). Research on teacher learning is clear: the kinds of changes in
teaching practice being called for by new standards are enormous, not just tiny
adjustments. Teachers need material resources and the support of professional
communities to make these changes.
An important point for policy-makers to understand, then, is that local control
of curriculum in the United States makes it unlikely that most states can design
the kinds of coherent, curricular activity systems (Roschelle, Knudsen, and
Hegedus 2010) needed to integrate curriculum, instruction, assessment, and
teacher learning at the classroom level. That’s why districts (or consortia of
smaller districts) would be the appropriate level of authority for the development
and implementation of such systems (Shepard, Penuel, and Pellegrino 2018).
Local control of curriculum also creates a particular problem for conceptualizing
state assessments because accountability tests at the state level must necessarily
be curriculum-neutral; that is, they cannot favor one district’s curriculum over
another. Vertical coherence between classroom and state-level assessments is an
ideal unlikely to be realized in practice (Shepard, Penuel, and Pellegrino 2018). In this article, I have taken a more radical position than in my previous work,
arguing that talking about coherence with state-level tests brings the wrong
meanings about standardized tests into classroom practices. I argue instead that
district policies and teacher professional learning opportunities would be more
productive if conceptual coherence could be sought between formative assess-
ment practices, standards implementation, and high-leverage teaching practices
and if the social meaning of learning-focused assessment and grading could be
better reconciled.
The best way for policy-makers to avoid the mistakes of the past is to recognize
that the same “test” cannot serve both accountability and instructional purposes.
Public outcry against state tests and demands that they be made more “instruc-
tionally relevant” actually make the problem worse (often resulting in longer tests
to make individual subtest scores more reliable). These public debates ignore the
fact that state tests must be curriculum-neutral and because of cost and time
constraints cannot provide deep insights into student thinking. (Many policy-
makers and members of the public incorrectly believe that state content stand-
ards are the same thing as curriculum. In fact, standards are much more general
and vague compared to curricula and are better thought of only as curriculum
frameworks.) Complaints about needing to get scores back sooner do not address
this problem. It would be better to recognize that once-per-year tests provide teachers with a way to evaluate the adequacy of their curriculum rather than a source of formative help with individual students. The type of information pro-
vided by state tests would actually be more useful if reviewed every summer to
identify programmatic strengths and weaknesses as part of planning for the next school year (Shepard 2008). For example, a grade-level team of teachers could
take action if their classroom scores on mathematical modeling were significantly
behind where their students had performed on other dimensions of the state
mathematics test. What state tests can do to make the most difference is to
ensure better representation of challenging learning goals, especially by includ-
ing open-ended assessment tasks that call for disciplinary practices such as argu-
mentation and modeling as well as content knowledge.
A key lesson for policy-makers interested in supporting more meaningful
classroom assessment practices is to attend to the important differences between
classroom and large-scale assessment. The research literature is replete with
findings documenting how deep learning improvement efforts in classrooms and
schools are sabotaged by accountability tests and associated rewards and punish-
ments. Especially when teachers are not yet familiar with curricula and instruc-
tional strategies to enact ambitious disciplinary standards, it is easy to lapse into
narrow teaching-the-test instructional routines and to see student “deficits”
rather than learning opportunities as the cause of poor performance (Bertrand
and Marsh 2015).
Helping teachers help students requires a vision for equitable and ambitious
teaching practices and a conceptual understanding of how formative assessment
and grading practices could be consistent with such a vision. In this vein, because
of the magnitude of the changes needed across multiple aspects of
science education, the National Research Council (2014) report on developing
assessments for the Next Generation Science Standards (NGSS) advised policy-
makers to focus on developing new assessment systems from the “bottom up,”
focusing first on valid enactments as close as possible to the point of instruction.
The same literature that shows many failed attempts at challenging standards-
based reforms also provides glimmers of real change when teachers have ade-
quate curricular and collegial support to learn about disciplinary content with an
emphasis on how student mastery develops. It also matters that teachers have
agency in pursuing disciplinary learning goals and that teachers as well as stu-
dents are able to learn in a climate of trust, mutual respect, and caring.

References
Alonzo, Alicia C. 2017. Models of student cognition as guides for formative assessment, Michigan
Formative Assessment Academy Webinars, Michigan Mathematics and Science Centers Network,
October 16, Ann Arbor, MI.
Ball, Deborah L., and Francesca M. Forzani. 2009. The work of teaching and the challenge of teacher
education. Journal of Teacher Education 60 (5): 497–511.
Ball, Deborah L., and Francesca M. Forzani. 2011. Building a common core for learning to teach and
connecting professional learning to practice. American Educator 35 (2): 17–21, 38–39.
Bertrand, Melanie, and Julie A. Marsh. 2015. Teachers’ sensemaking of data and implications for equity.
American Educational Research Journal 52 (5): 861–93.
Black, Paul, Christine Harrison, Clare Lee, Bethan Marshall, and Dylan Wiliam. 2003. Assessment for
learning: Putting it into practice. Maidenhead, UK: Open University Press.
Black, Paul, and Dylan Wiliam. 1998a. Assessment and classroom learning. Assessment in Education:
Principles, Policy, and Practice 5 (1): 7–74.
Black, Paul, and Dylan Wiliam. 1998b. Inside the black box: Raising standards through classroom assess-
ment. London: Department of Education & Professional Studies, King’s College London.
Braaten, Melissa, Chris Bradford, Kathryn Kirchgasler, and Sadie Fox Barocas. 2017. How data use for
accountability undermines equitable science education. Journal of Educational Administration 55 (4):
427–46.
Brookhart, Susan M. 2018. Learning is the primary source of coherence in assessment. Educational
Measurement: Issues and Practice 37 (1): 35–38.
Butler, Ruth. 1988. Enhancing and undermining intrinsic motivation: The effects of task-involving and
ego-involving evaluation on interest and performance. British Journal of Educational Psychology 58
(1): 1–14.
Calabrese Barton, Angela, Edna Tan, and Ann Rivet. 2008. Creating hybrid spaces for engaging school
science among urban middle school girls. American Educational Research Journal 45 (1): 68–103.
Darling-Hammond, Linda, and Nikole Richardson. 2009. Teacher learning: What matters? Educational
Leadership 66 (5): 46–53.
Deci, Edward L., and Richard M. Ryan. 1985. Intrinsic motivation and self-determination in human
behavior. New York, NY: Plenum Press.
Deci, Edward L., and Richard M. Ryan. 2008. Facilitating optimal motivation and psychological well-
being across life’s domains. Canadian Psychology 49 (1): 14–23.
Dweck, Carol S. 1986. Motivational processes affecting learning. American Psychologist 41 (10): 1040–48.
Dweck, Carol S. 2002. The development of ability concepts. In Development of achievement motivation,
eds. Allan Wigfield and Jacquelynne Eccles, 57–88. San Diego, CA: Academic Press.
Dweck, Carol S. 2006. Mindset: The new psychology of success. New York, NY: Random House.
Goldman, Susan R., M. Anne Britt, Willard Brown, Gayle Cribb, MariAnne George, Cynthia Greenleaf,
Carol D. Lee, Cynthia Shanahan, and Project READI. 2016. Disciplinary literacies and learning to
read for understanding: A conceptual framework for disciplinary literacy. Educational Psychologist 51
(2): 219–46.
Graesser, Arthur C., Danielle S. McNamara, and Kurt Van Lehn. 2005. Scaffolding deep comprehension
strategies through Point & Query, AutoTutor, and iSTART. Educational Psychologist 40 (4): 225–34.
Guskey, Thomas R. 2009. Practical solutions for serious problems in standards-based grading. Thousand
Oaks, CA: Corwin Press.
Hattie, John, and Helen Timperley. 2007. The power of feedback. Review of Educational Research 77 (1):
81–112.
Herman, Joan L. 2010. Coherence: Key to next generation assessment success. Los Angeles, CA: CRESST.
Herman, Joan. L. 2013. Formative assessment for next generation science standards: A proposed model.
Princeton, NJ: Educational Testing Service.
Holland, Dorothy, and Jean Lave. 2009. Social practice theory and the historical production of persons.
Action: An International Journal of Human Activity Theory 2:1–15.
Kluger, Avraham N., and Angelo DeNisi. 1996. The effects of feedback interventions on performance: A
historical review, a meta-analysis, and a preliminary feedback intervention theory. Psychological
Bulletin 119 (2): 254–84.
Lampert, Magdalene, and Filippo Graziani. 2009. Instructional activities as a tool for teachers’ and teacher
educators’ learning in and for practice. Elementary School Journal 109 (5): 491–509.
Lave, Jean, and Etienne Wenger. 1991. Situated learning: Legitimate peripheral participation. Cambridge:
Cambridge University Press.
Lepper, Mark R., and Maria Woolverton. 2002. The wisdom of practice: Lessons learned from the study
of highly effective tutors. In Improving academic achievement: Impact of psychological factors on
education, ed. Joshua Aronson, 135–58. San Diego, CA: Academic Press.
Mathews, Jay. 5 January 2009. The rush for “21st-century skills.” The Washington Post.
McDonald, Morva, Elham Kazemi, and Sarah S. Kavanagh. 2013. Core practices and pedagogies of
teacher education: A call for a common language and collective activity. Journal of Teacher Education
64 (5): 378–86.
McMillan, James H. 2001. Secondary teachers’ classroom assessment and grading practices. Educational
Measurement: Issues and Practice 20 (1): 20–32.
Mislevy, Robert J. 2019. Advances in measurement and cognition. The ANNALS of the American Academy
of Political and Social Science (this volume).
Moll, Luis C., Cathy Amanti, Deborah Neff, and Norma Gonzalez. 1992. Funds of knowledge for teach-
ing: Using a qualitative approach to connect homes and classrooms. Theory into Practice 31 (2): 132–
41.
National Alliance of Business. 2002. A nation of opportunity: Building America’s 21st century workforce.
Washington, DC: Department of Labor.
National Governors Association Center for Best Practices, and Council of Chief State School Officers.
2010a. Common Core State Standards for English Language Arts. Washington, DC: National
Governors Association Center for Best Practices, and Council of Chief State School Officers.
National Governors Association Center for Best Practices, and Council of Chief State School Officers.
2010b. Common Core State Standards for Mathematics. Washington, DC: National Governors
Association Center for Best Practices, and Council of Chief State School Officers.
National Research Council. 2000. How people learn: Brain, mind, experience, and school, eds. John D.
Bransford, Ann L. Brown, and Rodney R. Cocking. Committee on Developments in the Science of
Learning, Commission on Behavioral and Social Sciences and Education, National Research Council.
Washington, DC: National Academies Press.
National Research Council. 2001. Knowing what students know: The science and design of educational
assessment, eds. James W. Pellegrino, Naomi Chudowsky, and Robert Glaser. Committee on the
Foundations of Assessment. Board on Testing and Assessment, Center for Education. Division of
Behavioral and Social Sciences and Education. Washington, DC: National Academies Press.
National Research Council. 2007. Taking science to school: Learning and teaching science in grades K-8,
eds. Richard A. Duschl, Heidi A. Schweingruber, and Andrew W. Shouse. Committee on Science
Learning, Kindergarten through Eighth Grade. Board on Science Education, Center for Education.
Division of Behavioral and Social Sciences and Education. Washington, DC: National Academies
Press.
National Research Council. 2012. Education for life and work: Developing transferable knowledge and skills in the 21st century, eds. James W. Pellegrino and Margaret L. Hilton. Committee on Defining Deeper Learning and 21st Century Skills. Board on Testing and Assessment and Board on Science Education, Division of Behavioral and Social Sciences and Education. Washington, DC: National Academies Press.
National Research Council. 2013. Next Generation Science Standards: For states, by states. Washington,
DC: National Academies Press.
National Research Council. 2014. Developing assessments for the Next Generation Science Standards.
Washington DC: National Academies Press.
Organisation for Economic Co-operation and Development (OECD). 2005. Formative assessment:
Improving learning in secondary classrooms. Paris: OECD.
Penuel, William R., and Lorrie A. Shepard. 2016. Assessment and teaching. In Handbook of research on
teaching, 5th ed., eds. Drew H. Gitomer and Courtney A. Bell, 787–850. Washington, DC: American
Educational Research Association.
Pintrich, Paul R. 2000. Multiple goals, multiple pathways: The role of goal orientation in learning and
achievement. Journal of Educational Psychology 92 (3): 544–55.
Putnam, Ralph T., and Hilda Borko. 2000. What do new views of knowledge and thinking have to say about
research on teacher learning? Educational Researcher 29 (1): 4–15.
Reese, William J. 1999. Testing wars in the public schools: A forgotten history. Cambridge, MA: Harvard
University Press.
Reisman, Avishag. 2012. Reading like a historian: A document-based history curriculum intervention in
urban high schools. Cognition and Instruction 30 (1): 86–112.
Rogoff, Barbara. 1997. Evaluating development in the process of participation: Theory, methods, and
practice building on each other. In Change and development: Issues of theory, application, and method,
eds. Eric Amsel and Ann Renninger, 265–85. Hillsdale, NJ: Erlbaum.
Roschelle, Jeremy, Jennifer Knudsen, and Stephen Hegedus. 2010. From new technological infrastruc-
tures to curricular activity systems: Advanced designs for teaching and learning. In Designs for learning
environments of the future: International perspectives from the learning sciences, eds. Michael
Jacobson and Peter Reimann, 233–62. New York, NY: Springer.
Sawyer, R. Keith, ed. 2006. The Cambridge handbook of the learning sciences. New York, NY: Cambridge
University Press.
Shepard, Lorrie A. 2000. The role of assessment in a learning culture. Educational Researcher 29 (4): 4–14.
Shepard, Lorrie A. 2006. Classroom assessment. In Educational measurement, 4th ed., ed. Robert L.
Brennan, 623–46. Westport, CT: Praeger.
Shepard, Lorrie A. 2008. Formative assessment: Caveat emptor. In The future of assessment: Shaping
teaching and learning, ed. Carol Anne Dwyer, 279–303. New York, NY: Lawrence Erlbaum.
Shepard, Lorrie A. 2018. Learning progressions as tools for assessment and learning. Applied Measurement
in Education 31 (2): 165–74.
Shepard, Lorrie A., William R. Penuel, and James W. Pellegrino. 2018. Using learning and motivation
theories to coherently link formative assessment, grading practices, and large-scale assessment.
Educational Measurement: Issues and Practice 37 (1): 21–34.
Shute, Valerie J. 2008. Focus on formative feedback. Review of Educational Research 78 (1): 153–89.
U.S. Congress, Office of Technology Assessment. 1992. Testing in American schools: Asking the right ques-
tions. OTA-SET-519. Washington, DC: U.S. Government Printing Office.
Vinovskis, Maris A. 2019. History of testing in the United States: PK–12 education. The ANNALS of the
American Academy of Political and Social Science (this volume).
Windschitl, Mark, Jessica Thompson, Melissa Braaten, and David Stroupe. 2012. Proposing a core set of
instructional practices and tools for teachers of science. Science Education 96 (5): 878–903.
Yeager, David S., and Gregory M. Walton. 2011. Social-psychological interventions in education: They’re
not magic. Review of Educational Research 81 (2): 267–301.
