
International Journal of Educational Research 92 (2018) 158–172


The concept of an agile curriculum as applied to a middle school mathematics digital learning system (DLS)

Jere Confrey a,*, Alan P. Maloney b, Michael Belcher a, William McGowan a, Margaret Hennessey a, Meetal Shah a


a SUDDS (Scaling Up Digital Design Studies) Group, STEM Education Department, College of Education, North Carolina State University, 407 Gorman Street, Raleigh, NC 27607, USA
b The Math Door, 104 Duryer Ct., Cary, NC 27511, USA

ARTICLE INFO

Keywords: Learning trajectories; Classroom assessments; Curricular frameworks; Learning maps; Middle grades mathematics

ABSTRACT

Curricular theory must evolve to keep pace with the implications of the design, use, and effects of deploying and adapting digital curricular resources, especially when placed within digital learning systems (DLS) with rapid feedback and analytic capacity. We introduce an “agile curriculum” framework describing how to use classroom assessment data to regulate teachers’ practices of iteratively adapting curricula. Our DLS, called Math-Mapper 6–8, is introduced as an example with its diagnostic assessments of students’ progress along learning trajectories. An exploratory video study of middle school teachers reviewing, interpreting, and acting on its data, both during instruction (short-cycle feedback) and within professional learning communities (long-cycle feedback), illustrates how an agile curriculum framework supports data-driven adjustments based on student learning.

1. Introduction

Curricular theory must keep pace with the implications of easy access to and use of ubiquitous digital curricular resources, especially when they are incorporated into digital learning systems (DLS) with rapid feedback and analytic capacity. Concerns have been raised about how to maintain curricular coherence if and when teachers supplement their instruction haphazardly by adding a range of materials of varied quality from the web (Confrey, Gianopulos, McGowan, Shah, & Belcher, 2017; Larson, 2016). At the same time, more and more people are advocating for the customization of curricular offerings instead of insisting on strict fidelity of implementation (Pane, Steiner, Baird, & Hamilton, 2015). What is needed is a means to determine when adaptations to a curriculum are achieving their intended purpose, by providing relevant, valid, and timely data to teachers. In this paper, we describe one approach to providing such data, using diagnostic assessments built around an explicit theory, that of learning trajectories.
To ground our approach in current curricular theory, in Section 2 we trace the evolution of curricular frameworks spanning the design, implementation, and outcomes of curriculum, and we propose and describe a new framework named “the agile curriculum.” We define an agile curriculum as a means to support the ongoing revision and adaptation of teachers’ curricular practices based on immediate data about what one’s students are learning. In Section 3, after describing the theoretical components of an agile curriculum and its enacting framework, we briefly describe the features and affordances of a software application built by the team and designed to support the use of an agile curriculum. It includes a learning map with a foundation in learning trajectories and a set of related digitally-scored diagnostic assessments. In Section 4, we report on a study, conducted in middle schools that use the software in 1-1 enactment, of teachers’ practices in reviewing data from the diagnostic assessments on students’ progress along learning trajectories, both with their classes and with other teachers. Finally, we reflect on the conditions of support needed to implement and refine an agile curriculum.

* Corresponding author. E-mail address: jere_confrey@ncsu.edu (J. Confrey).

https://doi.org/10.1016/j.ijer.2018.09.017
Received 8 April 2018; Received in revised form 20 September 2018; Accepted 24 September 2018; Available online 06 October 2018.
0883-0355/ © 2018 Elsevier Ltd. All rights reserved.

Fig. 1. The three phases of curricula.
We view the agile curriculum as situated in the broader field of curriculum ergonomics but in a more restricted way. The field as a whole examines the full range of design, implementation, and revision of curricular materials in order to take users’ needs into account. In contrast, the agile curriculum is limited to studying how teachers use and revise existing curriculum materials, where curricular revisions are driven by data from learning trajectory-based classroom assessments.

2. Theoretical views of curricular frameworks

2.1. Curricular frameworks

Over the last two decades, frameworks for describing curriculum1 have changed from being oriented to production (input-output)
to ones that model use as evolving, iterative, and dynamic. A longstanding framework (Fig. 1) specified that the intended curriculum,
reflecting what authors and designers had in mind when the curriculum was originally built, would become the enacted (or im-
plemented) curriculum, encompassing the transformation that occurs as practitioners teach, which would subsequently produce the
achieved (or attained) curriculum, describing what students learned (Cuban, 1992; McKnight et al., 1987; Schmidt et al., 1996; Stein,
Remillard, & Smith, 2007). Researchers documented how transitions from one component to the next were influenced by teaching
conditions, students’ prior knowledge, teacher variables, and varied measures of student outcomes (Ball & Cohen, 1996; Gehrke,
Knapp, & Sirotnik, 1992; Remillard, 1999; Stein, Grover, & Henningsen, 1996; Tarr, Chávez, Reys, & Reys, 2006).
Remillard and Heck (2014) offered a different conceptualization for curriculum, policy and enactment, distinguishing “official”
from “operational” curriculum mediated by “instructional materials.” Within the operational curriculum, they identified teacher-
intended curriculum (planning), enacted curriculum, and student outcomes. By placing these three components within an instruc-
tional cycle, they conceptualized curriculum “enactment” as interactional and evolving based on a teacher’s interpretative activities
and reactions (Remillard & Heck, 2014, pp. 711–716).
The role and meaning of implementation fidelity is being revised in light of approaches that legitimize curricular adaptations.
Traditionally the degree of adherence to the authors’ intention (implementation fidelity) has been a critical element in evaluating the
effectiveness of a curriculum (Huntley, 2009). However, in the last five years, curriculum scholars have reconceptualized the use of
materials and human resources to include forms of “re-sourcing” (Pepin, Gueudet, & Trouche, 2013). Remillard (2005) recognized
“teachers as ‘active’ designers and users of the curriculum materials (and not as simple transmitters) and analyzed teachers’ usages of
resources and interpretation of and participation with the resources” (Pepin et al., 2013, p. 931). This led them to argue that
evaluation of a resource’s quality becomes “collective and dynamic,” in that the question is whether a resource is good “for a given
context, for a given community, at a given stage of its development,” rather than “of good quality per se” (p. 936).
Gueudet and Trouche (2009) use “document” and “documentational genesis” to describe teachers’ ways of adapting and modifying
resources. Documentational genesis contains the twin processes of instrumentation (how resources influence teaching) and in-
strumentalization (how teachers appropriate and reshape resources). These constructs view curriculum as a tool for accomplishing instruction and focus on curricular affordances and the process of appropriation in socio-cultural theory. Trouche (2004) introduced instrumental orchestration to analyze how teachers guide students’ instrumental genesis through interactions with a given piece of software.
Early experimentation with digital curricula has paved the way for such instrumental orchestrations. E-textbooks began as digital
replicas of printed textbooks, but evolved quickly (Chazan & Yerushalmy, 2014; Gueudet, Pepin, & Trouche, 2013). Pepin, Gueudet,
Yerushalmy, Trouche, and Chazan (2015) define an e-textbook as “an evolving structured set of digital resources, dedicated to teaching,
initially designed by different types of authors, but open for redesign by teachers, both individually and collectively,” (p. 644). Features of
these various tools have included: 1) easy revisions and additions, 2) use of a variety of media, and 3) interactivity. Digital affor-
dances can also support jointly-authored digital curricula, and repeated, asynchronous, and distributed revisions (Barquero,
Papadopoulos, Barajas, & Kynigos, 2016; Gueudet et al., 2013). These studies demonstrate both the potential and the challenges of
distributed joint authorship relative to quality and coherence.
Research on the use of open educational resources (OER) reinforces the need to create new frameworks to handle supplementation
and modification of curriculum. Since initiation of the Common Core State Standards in Mathematics (CCSS-M) and the severe economic
recession (2008–2010) in the U.S., districts and teachers, strapped for funding, turned to the internet for its plethora of (free) resources.

1 We define curriculum as “a plan for the experiences that learners will encounter, as well as the actual experiences they do encounter, that are designed to help them reach specified mathematics objectives” (Remillard & Heck, 2014, p. 707). This definition is broad enough to encompass the underlying learning theory, the curricular materials and classroom assessments themselves, and the modifications that teachers make (with or without intention) during instruction. Later in the paper we propose a framework for the agile curriculum that situates these components into an iterative cycle of improvement.


Research found that 60% of teachers report using the web to supplement instruction (Davis, Choppin, McDuffie, & Drake, 2013). But
teachers’ use of web-based content to assemble curriculum is often distressingly incoherent. Webel, Krupa, and McManus (2015)
examined how a group of 5th and 6th grade teachers combined internet materials with other curricular approaches. Asked to evaluate
examples of open educational resources (OERs), “the teachers tended to value activities they perceived students would enjoy (e.g.,
games, online activities, videos), resources with worked examples and opportunities for practicing procedures, activities that they
believed students could complete successfully, and, to some extent, resources with multiple representations” (p. 59).
Selecting materials for reasons unrelated to their cognitive intentions can undercut curricular coherence: “a collection of educational
resources is no more a curriculum than a pile of bricks is a home" (Wiliam, 2018, p. 42). Using standards to locate resources and
organize instruction fails on several counts: standards’ highly variable grain-sizes, short ramps to competency, and lack of attention to
learning (Confrey et al., 2017). They provide little or no insight into where students are in relation to instructional goals. Clearly, there
is a need for a means of monitoring and regulating curricular assembly and modification. In the next section, we suggest that an appropriate application of classroom assessment provides a missing piece for establishing, maintaining, and improving curricular quality.

2.2. Classroom assessment

Remillard and Heck (2014) situated high-stakes testing as consequential outcomes in the official curriculum, likely acknowledging its direct effects on the designated curriculum and its indirect and limited effects on the enacted curriculum (National Research Council [NRC], 2003). They acknowledge a bidirectional relationship between student outcomes and enacted curriculum, but attribute the bidirectionality to students learning from solving the outcome measures’ tasks. What is as yet underrepresented in the enacted curriculum literature is sufficient attention to “classroom assessments” and features related to them (NRC, 2003; Pellegrino, Chudowsky, & Glaser, 2001; Pellegrino, DiBello, & Goldman, 2016) as a means to drive instructional decision-making.
Classroom assessments2 are assessments for supporting students while they learn, by providing relevant, timely, detailed, and actionable feedback on their current progress; such feedback can guide instructional decision-making. This meaning of classroom assessment builds
on the foundation of formative assessment (sometimes called “assessment for learning;” Black, Harrison, Lee, Marshall, & Wiliam,
2003). In contrast to assessments that are used to rank, evaluate, or certify some aspect of performance (summative assessments),
formative assessment is a systematic process to gather evidence about learning as it is occurring, so as to adapt lessons to increase
learners’ likely success in achieving the goals of the lesson (Heritage, 2007). Digital tools are adding to the power and accessibility of
formative assessments; for example when student work can be shared in real time through applications such as the STEP formative
assessment platform (Olsher, Yerushalmy, & Chazan, 2016).
A critical element of formative assessment is to strengthen the role of the students as partners in assessment—to collaborate with
teachers to assess their own current levels of understanding, and to recognize what they know and what they need to know to
succeed, to reduce the “discrepancy between current and desired understanding” (Hattie & Timperley, 2007). Students learn to take
an active rather than passive role towards their own learning, strengthening self-regulation strategies, and adapting their learning
tactics to meet their own learning needs (Brookhart, 2018; Heritage, 2007).
Classroom assessment encompasses formative assessment practices and often involves the application of more formal measure-
ment properties (Confrey, Toutkoushian, & Shah, in press; Confrey & Toutkoushian, in press; Wilson, 2018). It requires an explicit
research-based theory of student learning (Pellegrino et al., 2001). Shepard, Penuel, and Pellegrino (2018) argue that classroom
assessment should be based on discipline-specific, detailed models of learning, such as developmental models, learning progressions,
facets of understanding (Minstrell, 2001), or local instructional theories (Gravemeijer, 1994). The National Research Council (2003)
argued that classroom assessments should be dynamic, administered and scored in a timely and immediate way, criterion-referenced,
and directly relevant to a student’s current state of learning. Under traditional models, an end-of-unit assessment is used to evaluate
students’ levels of achievement before moving to a new topic, whereas classroom assessment drives instructional changes within the
unit dynamically and iteratively. Repeatedly leveraging such change necessitates a new curricular framework. In the next section, we describe such a framework, the agile curriculum, which uses learning trajectory (LT)-aligned classroom assessment.

2.3. A framework for an agile curriculum

To synthesize the elements of the enacted curriculum with those of classroom assessment, we offer a revised framework (Fig. 2) of
curriculum use together with a set of four enactment principles (described below), and call the overall approach “the agile curri-
culum." The agile curriculum framework is proposed in order to describe a process of continuous revision and improvement based on
data gathered during curricular enactment. The term “agile” derives from the agile methods used in software engineering, which
focus on setting clear targets for design, creating rapid prototypes for building an application, sharing responsibility among teams for
identifying and achieving subgoals, and creating iterative enhancements based on opportunities and weaknesses identified through
gathering continuous feedback (Cohen, Lindvall, & Costa, 2003). Analogously, in curricular enactment the focus needs to be on
rapidly and flexibly meeting the needs of the students, responding to challenges to and opportunities for learning that arise during the
course of instruction. Proposed adaptations of materials and pedagogy need to be evaluated on the basis of feedback data in both a
prospective sense and in retrospective analysis.

2 This use of “classroom assessments” contrasts with practitioners’ informal use of the term for any quiz, graded homework, or unit test graded as a summary of students’ achievement.


Fig. 2. The agile curriculum framework, leveraging two-cycle feedback.

The framework positions the instructional core between the bookends of standards and high-stakes testing (as in Confrey and Maloney (2012)) and positions curricular materials as the mediators of curricular activity (as in Remillard and Heck (2014)). It
envisions the instructional core as a cycle and recognizes the role of classroom-based diagnostic assessment data as feedback, first
within the classroom modifying instruction and then outside the classroom, among a collective group of practitioners (professional
learning communities (PLCs)), modifying subsequent curriculum enactment. Thus, the former requirement for implementation fi-
delity is exchanged for one in which change is driven by high-quality evidence from classroom assessments. We emphasize that this
does not mean we endorse ad hoc curricular assembly—an agile curriculum assumes an initial adoption of carefully-designed ma-
terials as its basis but anticipates and welcomes rapid refinements based on evidence.
We identify two primary types of feedback cycles. “Short-cycle” feedback operates during instructional episodes such as a curricular unit, using assessment diagnostically to affect learning and instruction during active instruction or prospectively. “Long-cycle”
feedback involves retrospective evidence-based deliberations toward revisions of materials and/or sequencing and operates across
months and years. Both cycles can involve collective actions among teachers.
We further propose four principles of the agile curriculum that meld key features of curricular enactment and classroom assessment.

1.) Explicit, transparent learning theory guides the interpretation of data and enactment. Focusing on student learning is an essential
foundation of an agile approach to curriculum. This requires specification of developmentally-appropriate, fine-grained learning
theory. We use learning trajectories.
2.) Instructional adjustments and supplementation occur in response to short-cycle feedback during enactment. Curricular revisions occur in
response to long-cycle feedback. Both are based on interpretation of multiple sources of data relevant to the curricular aims. Agility
implies continuous interactions between curriculum enactment and data. Short term adjustments can be based on student
questions, review of student methods or ideas, ways to connect to prior learning, and a need for differentiation among student
groups (Pellegrino et al., 2016). Long-cycle adjustments include adoption of an approach or a change in sequencing of topics,
based on evidence connected to specific examples across teachers working in professional learning communities (PLCs).
3.) Students are recruited as partners in interpreting and acting on assessment data. Growing evidence acknowledges the importance of
strengthening students’ perception of efficacy with regard to their own learning (Heritage, 2010; Heritage, 2007). Therefore, an
agile curriculum should provide compelling, immediate, systematic, and actionable learning data, tied to specific curricular goals,
to help students identify gaps in their learning and ways forward, and should assist them in developing a “growth mindset”
(Dweck, 2006) with regard to their role in the learning process.
4.) Teachers’ roles in instrumental orchestration (Trouche, 2004) are strengthened: they become increasingly skilled in conducting student-
centered instruction while leveraging learning trajectory-based evidence to meet individual and group needs. We chose the term “con-
ducting” to emphasize that we are not requiring teachers to be the initial composers of curriculum, but rather recognizing their
critical role in refining the compositions, improvising, and adding supplemental elements based on evidence. The conductor is
often considered a solitary actor, but we conceptualize teacher orchestration as encompassing both individual and collective
action (Pepin et al., 2013).

3. Description of Math-Mapper 6–8

Math-Mapper 6–8 (M-M) was designed and used in this study to support enactment and investigation of the proposed agile
curriculum, and its four principles. M-M consists of a learning map, a linked diagnostic assessment system, and various tools for
scheduling and organizing instruction, assessments, materials and assignments3, all with learning trajectories as the underlying
learning theory (Confrey, 2015).

3 M-M contains a “sequencer” for scheduling the use of materials and assessments, and a “resourcer” that links to curated curricular resources. These were not relevant to the present study.


Fig. 3. “Finding Key Percent Relationships” relational learning cluster from the Math-Mapper 6–8 learning map. The cluster’s three constructs are
“Percent as Amount per 100," “Benchmark Percents,” and “Percents as Combination of Other Percents." The LT progress levels (L1-L5) are displayed
for the top construct, “Percents as Combination of Other Percents." The progress levels’ grade mapping, based on CCSS-M topic designations, is
shown at right.

3.1. Learning map

The learning map conveys all the typical middle school math content as the organized territory of what is to be learned, through a
hierarchical arrangement of nine big ideas, populated by 24 “relational learning clusters” (RLCs) that contain a total of 62 constructs,
each with a related learning trajectory consisting of at least 5 “progress levels.” It is designed intentionally to redirect teachers’
reliance on individual standards toward a focus on big ideas, concepts’ interrelationships, and the underlying learning trajectories4.
Learning trajectories (LTs), which form the learning theory underlying M-M, are defined as:
…researcher-conjectured, empirically supported description of the ordered network of constructs a student encounters through
instruction (i.e., activities, tasks, tools, forms of interaction, and methods of evaluation), in order to move from informal ideas,
through successive refinements of representation, articulation, and reflection, toward increasingly complex concepts over time
(Confrey, Maloney, Nguyen, Mojica, & Myers, 2009, p. 3).
Math-Mapper’s LTs are all based on new syntheses of empirical research on student learning. Expressed as sequences of progress
levels, they are, in essence, probabilistic conjectures that identify the landmarks and obstacles students are likely to encounter
through instruction, as their thinking develops over time from less sophisticated to more sophisticated (Clements & Sarama, 2004;
Confrey, Maloney, Nguyen, & Rupp, 2014; Nguyen & Confrey, 2014; Simon & Tzur, 2004). Understanding and leveraging the learning
trajectories towards big ideas is what helps to keep teachers focused on teaching for understanding, drawing on concepts, strategies, procedures, and generalizations.
Each RLC contains closely related constructs, arrayed vertically to suggest an instructional ordering of content, from more basic (lower) to more sophisticated (higher) constructs; student learning of the higher constructs is supported by understanding of the lower one(s). A single RLC, “Finding Key Percent Relationships,” comprising three constructs/LTs, is shown in
Fig. 3.
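For illustration only, the sketch below renders the hierarchy just described (big ideas containing RLCs, RLCs containing constructs, each construct carrying an ordered set of LT progress levels) as a small Python data model. The class and field names are our own and do not represent Math-Mapper's internal schema; generic level labels stand in for the actual LT descriptors.

```python
# Illustrative sketch only: a hypothetical rendering of the map hierarchy,
# not Math-Mapper's internal data model.
from dataclasses import dataclass

@dataclass
class Construct:
    name: str
    progress_levels: list[str]   # ordered least to most sophisticated; at least five per LT

@dataclass
class Cluster:                   # a relational learning cluster (RLC)
    name: str
    constructs: list[Construct]  # arrayed from more basic (lower) to more sophisticated (higher)

@dataclass
class BigIdea:
    name: str
    clusters: list[Cluster]

# The RLC shown in Fig. 3, with generic level labels standing in for the LT descriptors
finding_key_percents = Cluster(
    name="Finding Key Percent Relationships",
    constructs=[
        Construct("Percent as Amount per 100", ["L1", "L2", "L3", "L4", "L5"]),
        Construct("Benchmark Percents", ["L1", "L2", "L3", "L4", "L5"]),
        Construct("Percents as Combination of Other Percents", ["L1", "L2", "L3", "L4", "L5"]),
    ],
)
```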

3.2. Diagnostic assessment and reporting system

Assessments (9–11 items) are administered at the RLC level and require 20–30 min. Each item targets a single LT progress level, is
written conceptually to address the level, and is designed to promote classroom discussion. The assessments and their resulting
reports are based directly on the LTs. Also available are construct-level practice tests which allow students to select a desired level,
work items, and receive immediate feedback.

4 M-M bidirectionally displays relationships between each learning trajectory and the CCSS-M.


Fig. 4. A. Top half of a student report for diagnostic assessment for cluster “Finding Key Percent Relationships.” Color-coded dials (left side)
summarize overall percent correct responses for each construct: percent correct for assessment (black), additional percent correct due to revision of
incorrect items (turquoise). LT-specific color-coded learning ladder (right side) indicates relative levels of correctness for items at each level tested:
incorrect (orange); varying percentages correct (shades of blue); untested levels (white). B. Bottom half of student report for a diagnostic assessment
for cluster “Finding Key Percent Relationships” (“Item matrix”). Item responses are color-coded as in Fig. 4A, with additional information shown in figure legends at top and bottom of the figure. Clicking a construct displays the LT; clicking an item-response cell launches an item review panel, which displays the item itself and permits the student to revise response(s) or reveal correct response(s). (For interpretation of the references to color in this
figure legend, the reader is referred to the web version of this article).

Upon completion of an assessment, students and teachers receive immediate feedback. Student reports simultaneously provide
feedback at several different levels of detail (Fig. 4A, B). Autonomous access to their own assessment data from tests, practice tests, and retests is designed to help students reflect on their current understanding and their learning needs in relation to the learning targets of constructs and clusters, and to recruit them as partners in acting on their data.
Class reports provide teachers with “heatmap” displays that detail the entire class’s assessment responses—by student, item, LT
level, and construct (Fig. 5). The reports were designed to support teachers in making instructional adjustments based on data, and
support teachers in conducting student-centered instruction.
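As a rough sketch of the kind of aggregation such a display implies, the following hypothetical function builds a level-by-student grid for one construct from item-level responses. The input format and function name are assumptions made for illustration; they are not Math-Mapper's API.

```python
# Hypothetical sketch of assembling a class heatmap for one construct:
# rows = LT progress levels (highest at top), columns = students ordered
# weakest (left) to strongest (right) by overall score, cells = fraction correct.
from collections import defaultdict

def build_heatmap(responses):
    """responses: iterable of (student, level, correct) tuples, correct in {0, 1}."""
    totals = defaultdict(lambda: [0, 0])   # student -> [n_correct, n_attempted]
    cells = defaultdict(lambda: [0, 0])    # (level, student) -> [n_correct, n_attempted]
    for student, level, correct in responses:
        totals[student][0] += correct
        totals[student][1] += 1
        cells[(level, student)][0] += correct
        cells[(level, student)][1] += 1
    students = sorted(totals, key=lambda s: totals[s][0] / totals[s][1])  # weakest first
    levels = sorted({lvl for lvl, _ in cells}, reverse=True)              # highest level on top
    # None marks an untested cell (rendered white in the reports)
    grid = [[(cells[(lvl, s)][0] / cells[(lvl, s)][1]) if (lvl, s) in cells else None
             for s in students] for lvl in levels]
    return levels, students, grid
```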
M-M was designed to support an agile curriculum, with explicit features developed to align with the four principles of agile
curriculum. In the section that follows, we report on a study about the degree to which teachers and students engage agilely in
instructional adaptations based on the reports they receive on students’ progress along learning trajectories.


Fig. 5. Heatmap for the cluster “Finding Key Percent Relationships” (T2’s class). Vertical axis/rows: Progress levels for corresponding construct LT
(low to high). Horizontal axis/columns: students, ordered from weakest (left) to strongest (right) by overall cluster performance. Cells: student’s
performance on individual tested item, color-coded as in student report (Fig. 4A, right, and Fig. 4B). Clicking the left margin of any heatmap reveals
the specific item, along with an item analysis tool that (anonymously) specifies all student responses for that item.

4. An exploratory study using Math-Mapper 6–8

A central component of the agile curriculum is the use of data by teachers and students to influence and support learning.
Therefore, we chose to examine in detail a sample of classes to explore how teachers reviewed data with their students and used the
information from the diagnostic assessments. It requires considerable time for students and teachers to learn to use M-M and to invent relevant practices that integrate its use into instruction, and their practices continue to evolve. For this reason, we collected data from participating teachers at multiple points over time.
We chose to conduct an exploratory study to identify ways to measure salient instructional factors related to data review (and
subsequently link them to student outcomes). The study’s data collection examined:

1) Teachers’ proficiency in accessing and using the assessment data,
2) Teachers’ application of classroom assessment data to improve learning, and
3) Instructional episodes connected with data reviews for evidence of students’ active participation in the learning process and of teachers’ facilitation of such.

To include both short- and long-term cycles of feedback in the framework, we investigated both classroom data review sessions
and teachers’ collective (PLC) discussions. The research questions explore how the four principles of an agile curriculum are exercised
within the framework:

1 How do teachers review the data from the heatmaps and use student reports to make adjustments in their instruction?
2 What role, if any, do the learning trajectories appear to play in those processes?
3 How do students participate in the review process? Is there evidence that they become more active partners in the assessment
process?
4 How do the teachers collectively discuss, interpret, and use their data to adjust, or plan to adjust, instruction?

4.1. Context

Field-testing of M-M has been conducted at three partner schools in two districts. One district, listed as low-performing at the state level, has transitioned to digital resources more recently; the second district is recognized as high-performing and has used digital resources extensively (Table 1).
The data for this study consisted of videos of teachers reviewing results with their students in class and of monthly grade-level meetings of District 1 professional learning communities (Table 2). Seven class sessions were selected as contrasting cases, representing different grade levels, topics, districts, and styles of teaching. One teacher (T2) was selected for study across multiple classes over time, to examine the variation in her evolving practice.


Table 1
Research site student and teacher demographics.

                                             District 1    District 2
Population served                                   977          1163
African-American (%)                                 27             4
Asian (%)                                             1             9
Hispanic and Mixed (%)                               10             8
White (%)                                            53            79
Percent Free and Reduced Lunch                     56.9           9.9
Number of years implementing 1-1 computing            3             5
Number of teachers participating                     19            33
Number of tests taken                             9,197        21,696

Table 2
Classroom data-review sessions and PLC sessions analyzed.

Grade   District   Teacher/PLC   Cluster                      No. obs’v’ns   Year
6       1          T1            Ratio                        1              1
6       1          T2            Ratio, Percents              3              1-2
7       1          T3            Ratio, Percents              1              2
7       2          T4                                         1              2
7       2          T5            Area of circles              1              2
6       1          Gr. 6 PLC     Percents                     1              2
8       1          Gr. 8 PLC     Bivariate Data, Functions    1              2

Table 3
Categories and codes for video analysis.

Review Data
  Data Sources: Learning Map; Student reports; Heatmaps; Practice
  Norms of Data Interpretation: Dichotomous evaluative feedback (good/bad); Growth Mindset; Low Expectations of Students; Connection of feedback to LT; Student Self-Reflection; Student Self-Correction

Instructional Actions (decisions)
  Whole-Class: Teacher meta-talk; Test strategies of exploring the options; Leveraging connected knowledge; Establishing classroom norms around data use; Showing evidence of instructional insecurity; Characterizing interaction patterns by task
  Peer-to-Peer: Form groups; Assign to Practice; Provide assistance; Monitor Progress

4.2. Methods and analyses

Several class sessions and PLC meetings were video recorded. The recordings were transcribed and reviewed by research team
members. Inductive-contrastive analysis was undertaken (Derry et al., 2010, p. 10). Because this was an exploratory study, a single
team member watched all the videos and categorized episodes as 1) actions involving a review of the data or 2) instructional actions. Within the category of reviewing data, the episodes were further labeled as either discussing the meaning and features of specific data sources or as helping the class form norms of data use. The instructional actions fell into two
broad categories according to the organization of the class: whole-class or peer-to-peer. Across these four categories, 51 codes were
proposed and presented to the team. The team viewed a subset of videos and selected 20 codes as most representative of the frequent
and consequential examples of data use (see Table 3).
To prepare for coding by the research team, an initial standard was set by applying these codes to a set of 39 episodes. Episodes
could be coded with multiple codes. To measure inter-rater reliability, all researchers coded the identified episodes, and a pooled
kappa (De Vries, Elliott, Kanouse, & Teleki, 2008) was calculated to measure reviewers’ agreement with the standard. A pooled
kappa greater than 0.75 (criterion for excellent clinical significance per Cicchetti (1994)) was chosen as a minimum requirement
for agreement. After a training period, reviewers’ pooled kappa exceeded 0.75 (minimum of 0.79). With inter-rater reliability
established, transcripts were distributed, and two researchers independently coded each transcript using the Dedoose (2018) application. Across all transcripts, the two reviewers agreed on all codes applied to 87.8% of the episodes (567 of 646 episodes
coded). For the remaining episodes, agreement was established through team discussion. Illustrative examples were selected for
use in this paper.
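For readers unfamiliar with the statistic, the sketch below shows one way to compute a pooled kappa in the sense of De Vries et al. (2008), in which observed and chance agreement are averaged across codes before the kappa ratio is formed. It assumes binary code-by-episode matrices for a reviewer and the standard; it is an illustration under those assumptions, not the analysis script used in the study.

```python
# Sketch of a pooled kappa (De Vries et al., 2008) for one reviewer against the standard.
# Both inputs are binary matrices of shape (n_episodes, n_codes); 1 = code applied.
import numpy as np

def pooled_kappa(reviewer, standard):
    observed, expected = [], []
    for j in range(standard.shape[1]):
        r, s = reviewer[:, j], standard[:, j]
        observed.append(np.mean(r == s))                    # observed agreement for code j
        p_r, p_s = r.mean(), s.mean()                       # each rater's rate of applying code j
        expected.append(p_r * p_s + (1 - p_r) * (1 - p_s))  # chance agreement for code j
    p_o, p_e = np.mean(observed), np.mean(expected)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical check against the study's 0.75 acceptance threshold,
# with random data standing in for the 39 training episodes x 20 codes.
rng = np.random.default_rng(0)
standard = (rng.random((39, 20)) < 0.3).astype(int)
reviewer = np.where(rng.random((39, 20)) < 0.95, standard, 1 - standard)
print(pooled_kappa(reviewer, standard) > 0.75)
```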


4.3. Results

We present results relating to each research question, describing the relevant components of the theory of action for each
question. Specific examples are categorized according to the components of the theory of action (Table 3).

4.3.1. Research Question 1: how do teachers review the data from the heatmaps and use student reports to make adjustments in their
instruction?
4.3.1.1. Data Sources. Teachers differed in their facility with and proclivity towards use of M-M. Two teachers did not share the data
representations of student reports or a heatmap, and only reviewed the assessment items one after the other, mirroring the conduct of
a traditional test review.
This contrasted with other teachers who carefully demonstrated the features of the tool and communicated their purpose to
students before interpreting the data. For example, during year 1, when T2 introduced an anonymized version of her class’s heatmap
for the first time, she told the class how to read the data display, and discussed how to interpret the map’s color-coding:
T2: Now, the white spaces on my report mean you didn’t have that question, because we didn’t all have the same question. So, if
there’s a white space, it means you didn’t have a question there. And… each one of these columns is an individual student.
She asked which color they thought stood out the most. The students replied in unison, “Yellow.”5 She asked them if the “yellow popping out” was a “good thing,” and the students replied in the negative. This orientation towards data would be expected in a
“performance setting” where a teacher reports on how well the material was learned. However, she moved from evaluating to
interpreting the data, using “we” repeatedly to identify the whole class with the heatmap results.
In year 2, after the M-M Practice feature became available, she and her class discussed the heatmap data from their recent test on “Key Percent Relationships,” and she asked: “Which construct would you say would be the first one we need to practice?” The students replied, “Construct C,” which had more orange (Fig. 5). The teacher used the features of the heatmaps to guide her instruction.

4.3.1.2. Instructional actions (whole class). The majority of the teachers chose to review the data with the class as a whole, occasionally working with students individually or in pairs. In the first year of M-M use,6 all teachers reviewed assessment items with the class using a preview of the test, with little reference to the LTs. Most of the review was teacher-directed, repeating past methods rather than reteaching the topic. Students’ participation consisted of short responses to teacher questions, without requests for explanations. The excerpt below is typical of the interactions in T2’s first data review.
T2: Yes. Ok, that makes sense. Ok so now I’m using less because 5 is less than 35. 5 divided by 5?
S (students): 1
T2: One cup of mango. Is 1 less than 6?
S: Yes
T2: Ok, that makes sense 15 divided by 5.
S: 3
Some teacher explanations were clear, while others revealed evidence of instructional insecurity. Weaker teachers revealed a
tendency to work only part of the problem, failing to reach closure. T3 posted the table of values from the problem and then copied
two rows of the table to the right to make it look like a ratio box, one row with two values and the other with a missing value. She
solved it cursorily, then said:
T3: “So, somebody, be honest and tell me. Was it the table or was there something else that made you not understand this?”
S1: Mine was the wording of the questions
T3: the wording of the questions? What, can you be more specific?
S1: Like, on some of them, it didn’t make sense that, like, when I asked you what it was asking on the first one, I didn’t get it, like
what it was asking, but I knew how to do it.
Despite having solicited student opinion, the teacher’s lack of follow-up left the student with little benefit from the exchange.
In year 1, while reviewing M-M items, teachers’ references to similar problems not on the assessment were rare and yielded
unpredictable outcomes. T1 once abruptly asked, “…if 5 golden unicorns cost $40, how does it compare to buying one unicorn for $10?”, leaving the students confused. In contrast, T2 smoothly connected her discussion to a problem on ratio equivalence from the
original context:
T2: All right if Mary has only 6 cups of watermelon how many of each fruit does she need in order to make less tropical punch that
still taste the same. So we're back to Arturo and Monique. Remember, Monique wanted to make the same lemonade but she
wanted to make less than Arturo.
This appeared to help establish a connection between regular instruction and the assessment review process.

5 Mostly orange; hues can differ based on individual display monitor settings.
6 T1 and the first class of T2 occurred in year 1. T3 joined the program only that year, so she is also viewed as part of the year 1 group.


4.3.1.3. Instructional actions (peer-to-peer). T5 created small groups and facilitated student-directed discussion of data, based on their
performance on a test on the cluster “Measuring Characteristics of Circles.” T5 allowed groups to collaboratively decide where to
focus their efforts using the Practice feature to work items aligned to a selected LT level. He encouraged his students to act on their
data to gain competency:
T5: If you look at your stats, what I want you guys to do now is kind of discuss with one another because I’ve paired you guys
together because you guys have very similar questions right and wrong. Look at the stacks, see what level and what construct you
guys didn’t get right—you maybe misunderstood—and then you can do some practice within that construct. So why don’t you
guys take a look at there and see what level you want to use, talk to each other and then kind of work through the practice
problems together. And then you can revise and revisit your questions.
One group of students, recognizing they needed to work on levels 2 and 3 of “Area of Circles” (construct B), opted to begin their
practice with level 3.
S2: So basically when…I see construct A, I see pi and circumference it shows that L2, or meaning Level 2 with estimating pi, by
comparing the lengths of circumference to diameter as ratios. It looked like I got that wrong which means that I would need, like, I
think we would need more practice in that one.
S3: And then Level 1, level 6, level 5, level 4, Level 3 are all blue…
S1: Yeah, so we got those right, and then for the top one, construct B.
S3: Yeah, we need Level 2 and level 3.
By allowing students to direct their review process and work collaboratively to revisit and revise items, T5 empowered students as
partners in the assessment process—but not without accountability. Throughout the peer-to-peer session, T5 monitored students’
progress, probed for understanding, and intervened effectively and efficiently when necessary.

4.3.2. Research Question 2: what role, if any, do the learning trajectories appear to play in those processes?
4.3.2.1. Data sources. During the first year, we seldom observed teachers refer to LT levels; it took time for the teachers to recognize
their utility, let alone begin to share them with students. In year 2, however, more teachers were observed leveraging the LTs. For
example, T4 encouraged her students to view their own reports on their devices and drew their attention to the LTs. Student
responses indicated that they comprehended the hierarchy in the levels:
T4: Can someone tell me what you think they mean? You’re thinking of ladder. What do you think, [Student 1]?
S1: …L1 is the first level.
T4: I like that. So, if you think math levels what do you think that should be? What do you think, [S2]?
S2: Um, like how, advanced or how like…
T4: Ooh, I love that. Very nice. So how advanced, okay, so L1 compared to L5. In this, which one do you think is the easiest level,
which one do you think is the most challenging? [S3].
S3: L1 is the easiest and L5 is the hardest.
She deftly linked the student reports to practice activity:
T4: Orange, ok, so I would start focusing on whichever areas that you have on your reports that are orange, because that’s where
you struggled with, and I would do the practice problems in that area. Feeling good?
S1: Good, because then if they are going to do like a test, after this, then they know how to do it.
S2: So, like study more on that, to like…
T4: To study more. It identifies which ones you might still need a little bit of help on.
By restating Student 2’s response, she emphasized how to focus on levels with which they needed help and strengthened student
agency. Teachers also referred to the “revise or reveal” feature of student reports to increase student participation in assessment
routines.

4.3.2.2. Instructional actions (whole class). Teachers changed in the way they referred to the LTs, the constructs, and the levels. In
year one, T2 related each of the problems to the particular construct in the map by name, but she did not refer to the items’ LT levels.
T2: “Identifying Ratio Equivalence” was measured by question 1, 2, and 3. So we want to go back to questions 1, 2, and 3 and see
what we did wrong. “Finding Base Ratios” was questions 4 and 5. We can go back and figure out what happened.
A compelling example of the evolution of a teacher’s ability to make instructional adjustments based on data involved T2 dis-
cussing the construct “Combinations of Percents.” During an examination of a heatmap, she asked her class7 to identify where they
should focus their attention:
T2: Okay, so <student> said Construct C is our problem area and I agree with him…I blame myself for this, because we haven’t done a lot with construct C. But this helps me because we need to go back and do a little bit more, do some practice. So, percents as combinations of other percents. So, knowing that we can take 20% and 1% and put them together to make 21%. And knowing that we can take 100% of something and doubling it. We can have more than 100%. If you have more than what you planned on having, you have more than 100%, ‘cause you got extra. We didn’t do a lot of that, so I blame myself for that… Which one [level] do you think is the most problematic?
Class: Level 3

7 It is worth noting that, within the CCSS-M, it is easy to overlook percents greater than 100, as the standard does not specify different cases of percentages. 6.RP.A.3.c: “Find a percent of a quantity as a rate per 100 (e.g., 30% of a quantity means 30/100 times the quantity); solve problems involving finding the whole, given a part and the percent” (CCSS-I, 2011).
T2 realized that she had neglected to teach a Level 3 topic (“Extends benchmark of 100% to scale to, for example, 200% or 300% as 2 times or 3 times as large”), and admitted that to her class. Then, to address that oversight, she pulled up a level 3 item about a puppy that originally weighed 25 pounds and had grown to 50 pounds, and used it to teach the topic. The item asked what percent of the puppy’s original weight is his current weight. Below is the exchange with students:
T: Now in reading the question, well not question, statements. Every statement says “of its original weight.” What is the original
weight of the puppy?
S: 25
T: 25 percent, so 25 percent, er, 25 pounds is our 100 percent. So, the puppy is now more than 25 pounds, so the puppy is more
than 100 percent. You’ve doubled it, we’ve doubled the weight from 25 pounds to 50 pounds, so if I double 100%, I get 200%. So
the puppy’s current weight is 200% of its original weight. And again, it goes back to that piece where it says “of its original
weight.” We can be more than 100%. Yes?
In the class video, after teaching 200% using the puppy problem from level 3, T2 pulled up a problem from level 5 (“Combines known percents to multiply, divide, add, or subtract to find any percent”). It required the students to find 245% of a $130 water bill using combinations of percents. Throughout this exchange, the teacher leveraged the structure of the map to maintain coherence of ideas and to deliver student-centered instruction. T2 trusted the students to take the lead on the problem. She acted as a note taker, recording the percentages students calculated. She tracked the conversation without suggesting the students solve the problem a particular way. The students confidently proposed four distinct methods for solving the problem, delighting themselves and the teacher.
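For reference, 245% of $130 is $318.50. One decomposition consistent with the level 5 description, though not necessarily one of the four methods the students proposed, is:

```latex
\begin{align*}
245\% \text{ of } \$130 &= (200\% + 40\% + 5\%) \text{ of } \$130 \\
                        &= \$260 + \$52 + \$6.50 = \$318.50
\end{align*}
```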

4.3.3. Research Question 3: how do students participate in the review process; is there evidence that they become more active partners in the
assessment process?
4.3.3.1. Norms for data interpretation. The ways teachers talked to students about interpreting the assessment data fell along a
spectrum. On one end was an evaluative, performance-based interpretation, with data not used effectively to inform instruction. At
the other was a formative, growth-based orientation that focused on promoting learning (Heritage, 2007), the data being central to
identifying areas for growth and instructional decisions. We illustrate this with examples of use of M-M data in the classrooms of two
teachers.
T1 (6th grade) demonstrated a rigid approach to learning, frequently describing questions as “easy” or “hard,” and not as opportunities to learn. She did not use heatmap data. Several students attempted to describe methods they had used to solve a problem (Fig. 6). Throughout the exchange, T1 did not follow up on productive comments that would have engaged students as partners in assessment; she instead sought out students who were, like her, “confused.”

Fig. 6. A base ratio assessment item with multiple solution methods.
T2 (6th grade) demonstrated many instances of a growth-based orientation. She had created a Google form as well as a paper
reflection sheet for her students to use when reviewing their own reports:
We did the Google form before and then we’ve done this reflection sheet, and we’ve done without the reflection sheet. I’m just
trying to find…the perfect fit of what works best for us, as far as reflecting on the information that we gathered.


She instructed her students to iteratively cycle through observing data (viewing test results in reports), interpreting data
(identifying areas of weak understanding as displayed in the reports), and acting on data (revising missed questions, practicing weak
constructs, and eventually taking a retest).
On a separate occasion, she offered the following to assist her students in viewing their percent correct scores from a non-
evaluative perspective:
…I don’t want you to focus so much on the percentage you got on the test, but where you fall in the [proficiency] chart. Ok, so yes,
a 63 may not be what you typically think of as a good score. However, if a 63 is showing proficiency, in our chart, then we are
doing pretty well. Ok, are you at the top end, at 100? No. That’s ok. I don’t think I ever score 100 on anything. But that just means
we can learn from it. [italics added]
Later, referring to (anonymized) heatmap data, she said,
The great thing about this, it shows us what we need to learn. So we have some work to do. As a whole, as a group.
T2 repeatedly exhorted her class that finding out what you don’t know is a good thing, that learning gains result from hard work,
and that working together will yield the best outcomes for all.

4.3.4. Research Question 4: how do the teachers collectively discuss, interpret, and use their data to adjust, or plan to adjust, instruction?
4.3.4.1. Norms for data interpretation. Teachers were observed collectively reflecting on their instruction through the lens of their
diagnostic assessment data. Over time, we observed teachers adopting more of a growth orientation: toward their use of the tool, their
own teaching, and their students’ mathematical aptitude. “Last year, it felt like they were going to fail, and it was going to be terrible.
And this year I’ve approached it more as a learning tool and less as a be-all end-all test kind of thing,” said a sixth-grade teacher in a
PLC. T2, described above, is part of the same PLC and reflected on how her teaching had changed as a result of M-M:
R: What do you think makes the difference between last year and this year, in terms of the results from the students?
T2: As far as mine goes, my kids have a stronger math comprehension this year. Coming from elementary school. Last year, I had a lot of 1s, 2s and 3s [on the end-of-grade test]. This year I have mostly 3s and 4s. So my kids came in stronger, which I think leads
back to this. But I hope that I’m a little bit stronger, you know, using this [M-M] to guide my instruction and decide, ‘What was I
missing?’ Like what steps, what stones in the sidewalk did I just forget to put down for the kids? And I mean still, even still, like I
said construct C was our lowest, and we talked about that, and the fact that I still, this year missed that step of going past 100%.

4.3.4.2. Instructional actions (whole class). After her in-class discovery that she had overlooked teaching percents greater than 100%,
T2 discussed with the 6th grade PLC her oversight and recovery based on the analysis of the data in the heatmap:
T2: Yeah, I found with mine, and I discussed it with them today in the video, that was a “me” problem. Like, “I did not discuss with you percentages over 100.”
T6: Yeah, I hadn’t done it. I realized that too with mine. That I hadn’t taught it.
T2: We hadn’t taught it, and it was kind of a confusing situation, because they felt like “Well you said it was always 100 is like the total, so why are we going over 100 if you said 100 is the total.” So I have to find a way to reorganize that, like we talked about “ok, if your total is this, but you get more than that, you’re going over that total. You’re going over 100.”
She related to her colleagues the multiple ways the students solved the level 5 problem, and then compared it with a problem from
level 2 (“Builds up to other percents including using addition or multiplication of benchmark percent of 1%”):
T2: And then we looked at the questions for [levels] 3 and 5. We actually did question [level] 5 like four times, in four different
ways. That’s where my low kid had his “aha” moment….it’s definitely combining, not so much combining, but going above 100.
So they were OK talking about the one where you have to find 23% of 80 [level 2], and two students do it two different ways. One
finds 20%, finds 1% and adds. The other finds 25% and 1% and subtracts. And they understood that, they could conceptualize
that, for the most part, but going above 100% was just for the most part like, mind boggling to them. So we did spend a lot of time
on this one.
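Reconstructing the arithmetic of the two level 2 methods T2 recounts (the exact number of 1% steps is our inference from her description):

```latex
\begin{align*}
23\% \text{ of } 80 &= 20\% \text{ of } 80 + 3 \times (1\% \text{ of } 80) = 16 + 3(0.8) = 18.4 \\
23\% \text{ of } 80 &= 25\% \text{ of } 80 - 2 \times (1\% \text{ of } 80) = 20 - 2(0.8) = 18.4
\end{align*}
```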
The episode demonstrates the extent to which T2 learned to respond more agilely to student learning needs and to support student-directed learning, all facilitated by her use of M-M affordances. T2: 1) recognized her own mistake in having neglected to teach the
topic (percents greater than 100%), and was forthright enough to take responsibility with the students, 2) worked an item from the
neglected level, teaching the topic in real time, 3) allowed the students to direct their own approaches to a substantially harder
problem from level 5, while taking the role of orchestrator, and 4) reported back on her mistake and adaptation to other teachers,
learned they also had missed it, and explained to them how she leveraged the LT to address the issue in real time. The other 6th grade
teachers also recognized that they had neglected the topic of percents greater than 100 and made plans to address the topic prior to
the end of the unit.

4.4. Discussion

We developed a theory of action (Argyris & Schon, 1978; Elmore, 2006) by studying the positive and negative examples of how
teachers used the data to review instruction and refine their curricular use. By using the two observed major categories of “data review” and “instructional actions” from our coding organization, we were able to elaborate on the short cycle of the overall “agile curriculum” model (Fig. 7).

Fig. 7. Initial theory of action for class reviews of classroom M-M assessment data, constituting the short-cycle feedback loop.
The analysis revealed variations in how teachers used the tool, how they supported students in focusing on their learning and next steps, and how they adjusted their instructional practices to respond to and stimulate those changes. From these analyses, we obtained preliminary answers to our research questions.
The first question asked how teachers review the data from the heatmaps and student reports and use it to make adjustments in
instruction. Variations immediately were apparent in the extent to which teachers leveraged the tool’s affordances for displaying
data. Most of the teachers did share the heatmaps with students, connecting item review with a systematic approach to the data. Once
the teachers used the data displays, most used them to decide where to focus attention and how to sequence the items, making data
reviews more efficient and coherent. Most teachers also made significant use of the student reports, particularly by encouraging
students to revise their responses either on their own, with peers, or after engaging in practice.
Question 2 concerned the role and influence of the LTs on classroom practice. All teachers leveraged the LTs to some degree by
reviewing the items in order of the levels. At first, teachers hesitated to discuss an LT itself, fearing it was too difficult or abstruse for students—although they would use the labels. Over time, teachers and students were both observed to use the LTs to characterize
student knowledge and to describe what remained to be learned. In one case, weak performance on items from one level of an LT led
to teachers’ awareness that they had neglected an important topic—one missing explicitly from the Standards.
The 6th grade PLC discussion of percents greater than 100% therefore provides a concrete example of how the LT structure and corresponding diagnostic assessment data can be used to identify critical learning topics that may be overlooked in a standards-based approach, and to support teachers in developing curriculum-context knowledge (CCK) (Choppin, 2009). In this case, CCK is developed
not through successive enactments of a lesson, but rather through a professional conversation about student progress on LTs based on
an instructional sequence.
Recognizing how the LTs can guide instruction is a gradual process, and progress does not come “free.” Active instruction,
whether whole-class or through careful monitoring of group work, was a necessary element in achieving such progress. The items
provided students with opportunities to learn unmastered content, and the LTs provided guidance and direction. Also critical was the teacher’s ability to facilitate that learning using a variety of strategies and to develop those strategies into classroom norms that would establish an agile approach to curricular enactment. Teachers who leveraged the structure of the map were able to maintain coherence in assessment reviews, rather than treating each item as an independent skill.
Students were involved in the assessment process in various ways. At first, the teachers recognized a need to help students
interpret the lower scores, and worried about students becoming discouraged. Over time, as engagement with more conceptual problems generated interesting classroom exchanges, teachers and students both began to focus more on learning than on evaluation. Student involvement also increased as teachers gradually relied less on teacher-directed explanations and allowed more space for students’ choices of problem focus and for their proposals, ideas, and explanations. There also seemed to be subtle shifts in how strongly teachers insisted on particular choices of representations, in the involvement of students in rephrasing the problems, and in the solicitation of multiple methods. These shifts seemed to indicate that many teachers began to trust their students more, value their contributions and insights more, and give them a more active role in the assessment process.

5. Conclusions

The context of curriculum ergonomics invites one to propose methods of curricular revisioning based on data. Specifically, it provides opportunities to integrate disparate fields of scholarship in order to strengthen and guide that data-based revisioning process. To this end, we integrated two such fields: curricular theory and classroom assessment. Curricular theory has progressed toward recognizing continuous change by communities of practice, and classroom assessment relies on its systematic application to learning as it occurs within instructional settings. Both thrive within digitally supported environments that can support revisioning and related documentation as well as online assessment with immediate scoring and reporting to diverse users.
In this paper, specifically, we demonstrate a means to guide curricular revisions and adaptations based on data about students’ progress along learning trajectories. As articulated in our agile curriculum framework, we envision at least two distinct cycles for applying the feedback from diagnostic assessments: changes made during instruction (short-term feedback) and changes made in subsequent curricular enactments (long-term feedback) based on collective data discussions and related proposals. To help the reader understand the concept of an agile curriculum, we provided data on multiple teachers’ review practices and discussed how they related to our theory of action within the classroom. We further reported on how those data were used to guide discussions within professional learning communities that decided to make long-term curricular adjustments.
The examples from the study only begin to illustrate the evolution of curriculum envisioned in the agile concept, and they help to explain why the concept of agility is accompanied by four principles. An explicit theory of learning (learning trajectories in our example) was necessary to guide the development and analysis of the assessments. For our study, the examples illustrate how teachers leverage both the fine-grained delineation and the sequential character of the data from the learning trajectory levels. The study also reported on the ways teachers use relevant and timely feedback to guide immediate discussions and influence next steps. And with the example of reasoning about percents greater than 100%, one sees how the data affect short-term instructional decision-making and then collectively affect long-term curricular planning. The example also illustrates how the digital resource depends on teachers playing a critical role: how facilely and effectively they use the tool, how they envision the role of classroom assessment in targeting learning, and the degree to which they use the data to promote learner-centered instruction. Limitations of the study include the relatively small number of schools and teachers involved and its focus only on the data-review aspects of instruction. Further work is needed to see how various review practices are linked to changes in student outcomes and to understand how the changes in assessment approaches are viewed by students. Finally, we close by recognizing that secure research results on these complex digital learning systems will accrue slowly, and only as researchers, district personnel, and practitioners engage with DLSs that use varied instructional organizations.

Acknowledgement

This work was supported by the National Science Foundation (NSF) under Grant 1621254.

References

Argyris, C., & Schon, D. (1978). Organizational learning: A theory of action perspective. Reading, MA: Addison-Wesley Publishing Company.
Ball, D. L., & Cohen, D. K. (1996). Reform by the book: What is—or might be—the role of curriculum materials in teacher learning and instructional reform?
Educational Researcher, 25(9), 6–14.
Barquero, B., Papadopoulos, I., Barajas, M., & Kynigos, C. (2016). Cross-case design in using digital technologies: Two communities of interest designing a c-book unit.
Extended Paper Presented in TSG 36 Task Design, ICME 13.
Black, P., Harrison, C., Lee, C., Marshall, B., & Wiliam, D. (2003). Assessment for learning: Putting it into practice. Buckingham, UK: Open University Press.
Brookhart, S. M. (2018). Learning is the primary source of coherence in assessment. Educational Measurement Issues and Practice, 37(1), 35–38.
CCSS-I (2011). Mathematics standards. Accessed 17 July 2016 www.corestandards.org/Math.
Chazan, D., & Yerushalmy, M. (2014). The future of mathematics textbooks: Ramifications of technological change. In M. Stochetti (Ed.). Media and education in the
digital age: Concepts, assessment and subversions (pp. 63–76). New York: Peter Lang.
Choppin, J. (2009). Curriculum‐context knowledge: Teacher learning from successive enactments of a standards‐based mathematics curriculum. Curriculum Inquiry,
39(2), 287–320.
Cicchetti, D. V. (1994). Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychological
Assessment, 6(4), 284–290.
Clements, D. H., & Sarama, J. (2004). Learning trajectories in mathematics education. Mathematical Thinking and Learning, 6(2), 81–89.
Cohen, D., Lindvall, M., & Costa, P. (2003). Agile software development. Rome, NY: Data and Analysis Center for Software.
Confrey, J. (2015). Some possible implications of data-intensive research in education—The value of learning maps and evidence-centered design of assessment to
educational data mining. In C. Dede (Ed.). Data-intensive research in education: Current work and next steps. (pp. 79–87). Washington, DC: Computing Research
Association.
Confrey, J., & Maloney, A. (2012). A next generation digital classroom assessment based on learning trajectories. In C. Dede, & J. Richards (Eds.). Steps toward a digital
teaching platform (pp. 134–152). New York: Teachers College Press.
Confrey, J., Gianopulos, G., McGowan, W., Shah, M., & Belcher, M. (2017). Scaffolding learner-centered curricular coherence using learning maps and diagnostic
assessments designed around mathematics learning trajectories. ZDM, 49(5), 717–734.
Confrey, J., Maloney, A., Nguyen, K. H., Mojica, G., & Myers, M. (2009). Equipartitioning/splitting as a foundation of rational number reasoning using learning
trajectories. Paper presented at the 33rd Conference of the International Group for the Psychology of Mathematics Education.
Confrey, J., Maloney, A. P., Nguyen, K. H., & Rupp, A. A. (2014). Equipartitioning, a foundation for rational number reasoning: Elucidation of a learning trajectory. In
A. P. Maloney, J. Confrey, & K. H. Nguyen (Eds.). Learning Over time: Learning trajectories in mathematics education (pp. 61–96). Charlotte, NC: Information Age
Publishing.
Confrey, J., Toutkoushian, E., & Shah, M. (in press). A validation argument from soup to nuts: Assessing progress on learning trajectories for middle school mathematics. Applied Measurement in Education.
Confrey, J., & Toutkoushian, E. (in press). A validation approach to middle-grades learning trajectories within a digital learning system applied to the “measurement of
characteristics of circles.” In J. Bostic, E. E. Krupa, & J. Shih (Eds.), Quantitative measures of mathematical knowledge: Researching instruments and perspectives.
New York, NY: Routledge.
Cuban, L. (1992). Curriculum stability and change. In P. W. Jackson (Ed.). Handbook of research on curriculum (pp. 216–247). New York: Simon & Schuster Macmillan.
Davis, J., Choppin, J., McDuffie, A. R., & Drake, C. (2013). Common core state standards for mathematics: Middle school mathematics teachers’ perceptions. Rochester, NY:
The Warner Center for Professional Development and Education Reform.
De Vries, H., Elliott, M. N., Kanouse, D. E., & Teleki, S. S. (2008). Using pooled kappa to summarize interrater agreement across many items. Field Methods, 20(3),
272–282.
Dedoose (Version 8.0.35). Los Angeles, CA: SocioCultural Research Consultants, LLC. Retrieved from www.dedoose.com.
Derry, S. J., Pea, R. D., Barron, B., Engle, R. A., Erickson, F., Goldman, R., ... Sherin, B. L. (2010). Conducting video research in the learning sciences: Guidance on selection, analysis, technology, and ethics. Journal of the Learning Sciences, 19(1), 3–53.
Dweck, C. S. (2006). Mindset: The new psychology of success. New York, NY: Random House.
Elmore, R. F. (2006). International perspectives on school leadership for systemic improvement. Politics, (July), 1–28.
Gehrke, N. J., Knapp, M. S., & Sirotnik, K. A. (1992). In search of the school curriculum. Review of Research in Education, 18(1), 51–110.
Gravemeijer, K. (1994). Educational development and developmental research in mathematics education. Journal for Research in Mathematics Education, 25(5),
443–471.
Gueudet, G., & Trouche, L. (2009). Towards new documentation systems for mathematics teachers? Educational Studies in Mathematics, 71(3), 199–218.
Gueudet, G., Pepin, B., & Trouche, L. (2013). Textbooks design and digital resources. In C. Margolinas (Ed.). Task Design in Mathematics Education. Proceedings of ICMI
Study 22 (pp. 327–337). Accessed 21 July 2016 https://hal.archives-ouvertes.fr/hal-00834054v2.
Hattie, J., & Timperley, H. (2007). The power of feedback. Review of Educational Research, 77(1), 81–112.
Heritage, M. (2007). Formative assessment: What do teachers need to know and do? Phi Delta Kappan, 89(2), 140–145.
Heritage, M. (2010). Formative assessment and next-generation assessment systems: Are we losing an opportunity? Council of Chief State School Officers.
Huntley, M. (2009). Measuring curriculum implementation. Journal for Research in Mathematics Education, 40(4), 355–362.
Larson, M. (2016). Curricular coherence in the age of open educational resources. Accessed 13 August 2016 https://www.nctm.org/News-and-Calendar/Messages-from-
the-President/Archive/Matt-Larson/Curricular-Coherence-in-the-Age-of-Open-Educational-Resources/.
McKnight, C., Crosswhite, J., Dossey, J., Kifer, L., Swafford, J., Travers, K., ... Cooney, T. (1987). The underachieving curriculum: Assessing U.S. School mathematics from
an international perspective. A national report on the second international mathematics study. Champaign, IL: Stipes Publishing Co.
Minstrell, J. (2001). Facets of students’ thinking: Designing to cross the gap from research to standards-based practice. In K. Crowley, C. D. Schunn, & T. Okada (Eds.).
Designing for science: Implications from everyday, classroom, and professional settings. Mahwah, NJ: Lawrence Erlbaum Associates.
National Research Council (NRC) (2003). Assessment in support of instruction and learning: Bridging the gap between large-scale and classroom assessment. Workshop report.
Committee on assessment in support of instruction and learning. Board on testing and assessment, committee on science education K-12, mathematical sciences education
board. Center for education. Division of behavioral and social sciences and education. Washington, DC: The National Academies Press.
Nguyen, K. H., & Confrey, J. (2014). Exploring the relationship between learning trajectories and curriculum. In A. P. Maloney, J. Confrey, & K. H. Nguyen (Eds.).
Learning Over time: Learning trajectories in mathematics education (pp. 161–186). Charlotte, NC: Information Age Publishing.
Olsher, S., Yerushalmy, M., & Chazan, D. (2016). How might the use of technology in formative assessment support changes in mathematics teaching? For the Learning of Mathematics, 36(3), 11–18.
Pane, J. F., Steiner, E. D., Baird, M. D., & Hamilton, L. S. (2015). Continued progress: Promising evidence on personalized learning. Santa Monica, CA: RAND Corporation.
Pellegrino, J. W., Chudowsky, N., & Glaser, R. (Eds.). (2001). Knowing what students know: The science and design of educational assessment. Washington, DC: The National Academies Press. https://doi.org/10.17226/10019. Retrieved from https://www.nap.edu/catalog/10019/knowing-what-students-know-the-science-and-design-of-educational.
Pellegrino, J. W., DiBello, L. V., & Goldman, S. R. (2016). A framework for conceptualizing and evaluating the validity of instructionally relevant assessments.
Educational Psychologist, 51(1), 59–81.
Pepin, B., Gueudet, G., & Trouche, L. (2013). Re-sourcing teachers’ work and interactions: A collective perspective on resources, their use and transformation. ZDM,
45(7), 929–943.
Pepin, B., Gueudet, G., Yerushalmy, M., Trouche, L., & Chazan, D. (2015). E-textbooks in/for teaching and learning mathematics: A disruptive and potentially
transformative educational technology. In L. D. English, & D. Kirshner (Eds.). Handbook of international research in mathematics education (pp. 636–661). (3rd ed).
New York, NY: Routledge.
Remillard, J. T. (1999). Curriculum materials in mathematics education reform: A framework for examining teachers’ curriculum development. Curriculum Inquiry,
29(3), 315–342.
Remillard, J. T. (2005). Examining key concepts in research on teachers’ use of mathematics curricula. Review of Educational Research, 75(2), 211–246.
Remillard, J. T., & Heck, D. J. (2014). Conceptualizing the curriculum enactment process in mathematics education. ZDM, 45(5), 705–718.
Schmidt, W. H., Jorde, D., Cogan, L., Barrier, E., Gonzalo, I., Moser, U., ... Wolfe, R. G. (1996). Characterizing pedagogical flow: An investigation of mathematics and science
teaching in six countries. Dordrecht, The Netherlands: Kluwer.
Shepard, L. A., Penuel, W. R., & Pellegrino, J. W. (2018). Using learning and motivation theories to coherently link formative assessment, grading practices, and large-
scale assessment. Educational Measurement Issues and Practice, 37(1), 21–34.
Simon, M. A., & Tzur, R. (2004). Explicating the role of mathematical tasks in conceptual learning: An elaboration of the hypothetical learning trajectory. Mathematical
Thinking and Learning, 6(2), 91–104.
Stein, M. K., Grover, B. W., & Henningsen, M. (1996). Building student capacity for mathematical thinking and reasoning: An analysis of mathematical tasks used in
reform classrooms. American Educational Research Journal, 33(2), 455–488.
Stein, M. K., Remillard, J., & Smith, M. S. (2007). How curriculum influences student learning. In F. K. Lester (Ed.), Second handbook of research on mathematics teaching and learning (pp. 319–370). Charlotte, NC: Information Age Publishing.
Tarr, J. E., Chávez, Ó., Reys, R. E., & Reys, B. J. (2006). From the written to the enacted curricula: The intermediary role of middle school mathematics teachers in
shaping students’ opportunity to learn. School Science and Mathematics, 106(4), 191–201.
Trouche, L. (2004). Managing the complexity of human/machine interactions in computerized learning environments: Guiding students’ command process through
instrumental orchestrations. International Journal of Computers for Mathematical Learning, 9(3), 281–307.
Webel, C., Krupa, E. E., & McManus, J. (2015). Teachers’ evaluations and use of web-based curriculum resources to support their teaching of the Common Core State
Standards for Mathematics. Middle Grades Research Journal, 10(2), 49–64.
Wiliam, D. (2018). How can assessment support learning? A response to Wilson and Shepard, Penuel, and Pellegrino. Educational Measurement Issues and Practice,
37(1), 42–44.
Wilson, M. (2018). Making measurement important for education: The crucial role of classroom assessment. Educational Measurement Issues and Practice, 37(1), 5–20.
