
Mathematics Performance Assessment in the Classroom: Effects on Teacher Planning and Student Problem Solving
Author(s): Lynn S. Fuchs, Douglas Fuchs, Kathy Karns, Carol L. Hamlett and Michelle Katzaroff
Source: American Educational Research Journal, Vol. 36, No. 3 (Autumn, 1999), pp. 609-646
Published by: American Educational Research Association
Stable URL: http://www.jstor.org/stable/1163552
Accessed: 30/11/2012 21:47

This content downloaded by the authorized user from 192.168.52.72 on Fri, 30 Nov 2012 21:47:06 PM All use subject to JSTOR Terms and Conditions

American Educational Research Journal
Fall 1999, Vol. 36, No. 3, pp. 609-646

Mathematics Performance Assessment in the Classroom: Effects on Teacher Planning and Student Problem Solving


Lynn S. Fuchs, Douglas Fuchs, Kathy Karns, Carol L. Hamlett, and Michelle Katzaroff
Peabody College of Vanderbilt University

The purpose of this study was to examine effects of classroom-based performance-assessment (PA)-driven instruction. Sixteen teachers were randomly assigned to PA and no-PA conditions. PA teachers attended a workshop, administered 3 PAs over several months, and met with colleagues to score PAs and share ideas for providing student feedback and instruction. PA teachers' knowledge about PA increased; their curriculum shifted toward problem solving; and they reported relying on varied strategies to promote problem solving. Compared to no-PA students, above-grade PA students showed stronger problem solving on all measures, at-grade PA students, on 2 of 3 measures, below-grade students, on only 1 dimension of 1 measure. Professional development needs to promote mathematical problem solving among all students are discussed.

LYNN S. FUCHS is a Professor in the Department of Special Education, Peabody College of Vanderbilt University, Box 328 Peabody, Nashville, TN 37203. Her specializations are linking assessment to instruction and learning disabilities.

DOUGLAS FUCHS is a Professor in the Department of Special Education, Peabody College of Vanderbilt University. His specializations are reading disabilities and inclusion.

KATHY KARNS is a Doctoral Candidate in the Department of Special Education, Peabody College of Vanderbilt University. Her specialization is curriculum-based measurement.

CAROL L. HAMLETT is a Research Associate in the Department of Special Education, Peabody College of Vanderbilt University. Her specialization is computer applications to classroom-based assessment systems.

MICHELLE KATZAROFF was a Project Coordinator in the Department of Special Education, Peabody College of Vanderbilt University. Her specialization is peer-assisted learning strategies.


L. Fuchs, Fuchs, Karns, Hamlett, and Katzaroff

In this country, mathematics education is typified by shallow coverage of a large number of topics. This approach requires students to remember many isolated facts and to spend most of their instructional time practicing routine problem solutions (Stigler & Hiebert, 1997). Increasingly, mathematics educators, policymakers, and researchers are voicing concern about this "shallow" curriculum. Dissatisfaction stems from economic considerations and empirical findings about student learning. It also is theoretically grounded.

Increasing complexity along with mounting technological sophistication in the workplace have created greater demand for highly skilled workers who can apply knowledge in flexible ways to solve novel problems (Darling-Hammond, 1990; Goodman, 1995; Guthrie, 1991; Mory & Salisbury, 1992). At the same time, empirical evidence about student learning reveals that a shallow curriculum fails to meet this demand. For example, Larkin (1989) showed that children experience difficulty applying simple computational skills when problems change in minor ways; this reveals a lack of conceptual understanding necessary for knowledge application. Other work (Foxman, Ruddock, McCallum, & Schagen, 1991, cited in Boaler, 1993) illustrates how students fail to realize connections between mathematical problems presented in and out of context; in fact, students typically solve decontextualized problems more successfully. Consequently, students do not necessarily know how to use the isolated skills they acquire, and many children master the curriculum without achieving the capacity to apply knowledge flexibly to novel situations (cf. Brown, Campione, Webber, & McGilly, 1992).

Such findings about student learning are supported by theoretical frameworks for understanding the development of transfer. Most cognitive researchers have challenged the assumption on which a shallow mathematics curriculum is based: that of vertical transfer, whereby mastery of simple skills facilitates acquisition of more complex skills (Gagne, 1968; Resnick & Resnick, 1992). Instead, within some theoretical frameworks, the notion of vertical transfer has been replaced with the concept of lateral transfer, by which children recognize patterns across numerous experiences in order to abstract generalized problem-solving principles or schemas (Brown et al., 1992; Gick & Holyoak, 1983). When combined with awareness of analogous relations among problems, schemas can influence behavior in a broad set of situations of roughly comparable complexity and, thereby, affect breadth of learning (Brown et al., 1992; Cooper & Sweller, 1987). These organizational schemes, which children formatively test and modify over time and contexts, offer a basis for understanding the development of problem-solving capacity (Glaser, 1984). To facilitate this development, teachers must provide students with a wealth of authentic classroom activities so that knowledge application is addressed along with knowledge acquisition (Brown, Collins, & Duguid, 1989; Prawat, 1992). This type of classroom activity, of course, stands in stark contrast to the approach inherent in a shallow curriculum.


Mathematics Performance Assessments

Reforming Mathematics Education

The mismatch between a shallow curriculum and what seems desirable based on economic considerations, empirical findings about student learning, and current theoretical frameworks has prompted a movement to reform mathematics education in this country. Documents such as the National Council of Teachers of Mathematics' (NCTM) Curriculum and Evaluation Standards (NCTM, 1989) and the Mathematical Sciences Education Board's Everybody Counts (National Research Council, 1989) describe a new vision of mathematics education. This vision moves the field beyond factual and isolated skill competencies to emphasize development of conceptual understanding and problem-solving capacity. The challenge, of course, is realizing this vision in classrooms across the country. To date, most reform efforts represent one of two strategies (cf. Firestone, Mayrowetz, & Fairman, 1998). The first strategy for reshaping learning environments is to work closely with teachers to modify their beliefs about mathematics education (e.g., Prawat, 1989, 1992) and to extend their knowledge about pedagogy and content (e.g., McLaughlin, 1987; Schulman, 1987). Because this intensive approach is costly in personnel and time, however, a second strategy, sometimes referred to as "measurement-driven instruction" (Popham, 1987), has dominated reform efforts.

Driving Instruction With Traditional Tests

To prompt curricular reform, many education leaders and policymakers have fixed their attention on assessment (e.g., Archbald & Newmann, 1988; Linn, 1991, 1993; Shepard, 1991; Wiggins, 1989). High-stakes accountability testing programs influence what teachers teach and what students learn (Darling-Hammond, 1990; Linn, 1993; Torrance, 1993); the demonstrated power of tests "to direct educators' behavior... makes them potent tools for educational reform" (Resnick & Resnick, 1992, p. 56).

Unfortunately, in contrast to the reform movement's focus on authentic, integrated knowledge application, traditional achievement tests sample isolated items of factual and basic information with multiple-choice response formats (Linn, 1993; Smith, 1991a, 1991b; Wilson, 1992). The result is that traditional tests prompt teachers to emphasize basic, factual information and to provide few opportunities for students to learn how to apply knowledge (Darling-Hammond, 1990; Wilson, 1992).

An Alternative Brand of Measurement-Driven Instruction

In response to these problems associated with traditional tests, leaders of the reform movement (e.g., Resnick & Resnick, 1992; Rothman, 1995) have called for the development of an alternative brand of assessment, known as performance assessment (PA). PAs pose authentic problem-solving dilemmas and require students to develop solutions involving the application of multiple skills and strategies. One goal of this new form of assessment is to redirect teachers' instructional efforts to incorporate better integrated, more complex


learning activities, with greater generalizability to real-life dilemmas (see Archbald & Newmann, 1988; Cohen & Spillane, 1992; Nickerson, 1989; Wiggins, 1989). As described by Resnick and Resnick (1992), PA-driven instruction may remove pressures to teach isolated facts and skills while offering teachers incentive to provide extended thinking activities: "Performance assessments can become essential tools in educational reform" (p. 72).

Accordingly, as calls for revised assessment frameworks have been issued, states have designed and incorporated the use of PA: As of 1994, more than 40 states employed some form of PA in their annual testing programs (Thurlow, 1994). With states' increasing reliance on PA, research is beginning to emerge about how PA-driven instruction, within external accountability programs, influences teaching and learning.

In one early study, Torrance (1993) examined England's and Wales' National Assessment, as it was piloted with first-grade teachers and children. Findings revealed that teachers failed to plan their curriculum in response to the PA information they derived. Rather, teachers treated the assessments as a special activity, set apart from teaching. Torrance concluded that authentic approaches to assessment do not automatically have a positive impact on teaching and learning. He speculated that carefully designed professional development activities, used in conjunction with measurement-driven instruction, might be necessary to accomplish the intended curricular reform.

In fact, some states have begun to offer professional development opportunities to help teachers understand and be responsive to PA information. Koretz and colleagues (Koretz, Barron, Mitchell, & Stecher, 1996; Koretz, Mitchell, Barron, & Keith, 1996) surveyed teachers in Maryland and Kentucky, two states that incorporated PA into their annual testing programs and provided teachers with in-service training on and involvement in scoring PAs. In both states, elementary and middle school teachers reported that they modified instruction primarily by providing practice on tests that resembled statewide PAs, by using test-preparation materials, and by increasing students' familiarity with state PAs, rather than by revising instructional activities in more meaningful ways. Although these results raise questions about the potential instructional impact of PA, findings were not uniformly negative. With respect to the teaching of mathematics, for example, teachers reported decreasing curricular emphasis on number facts and computation while increasing focus on mathematical communication, problem solving using meaningful tasks, and application.
More recently, based on interviews as well as classroom observations, Firestone et al. (1998) provided similar evidence of mixed results as a function of PA use within testing programs in Maine and Maryland. On the one hand, classrooms incorporated more mathematics problems requiring several steps or activities organized around a common theme, problem situation, or concept. On the other hand, teachers emphasized test-preparation activities such as test simulations and direct coaching about acceptable response styles, provided limited opportunities for mathematical reasoning, and failed to incorporate teaching methods that differed substantially from typical mathematics education in this country. Firestone et al. concluded that estimates of the effects of PA-driven instruction, at least within state testing programs, may be unrealistically high.

In light of qualified findings about the use of PA within statewide assessment programs, Firestone et al. (1998) raised several interesting hypotheses about how to enhance the impact of PA-driven instruction. One suggestion was to provide teachers ongoing opportunities to work closely with their own students' PAs. The premise is that as teachers inspect their students' PA work samples, formulate judgments about the quality of that work, and provide students with feedback aligned to well-established performance standards, teachers' knowledge about the reform curriculum should increase and their instructional approach should change accordingly (e.g., Murnane & Levy, 1996). On a related note, Darling-Hammond and Falk (1997) raised the possibility that PA-driven instruction may be more successful in advancing important student outcomes if teachers evaluate student performance on reform standards frequently with PA in their classrooms.

Purpose of This Study

The hope inherent in these propositions is that, with routine use of classroom assessments that reflect challenging, authentic tasks, teachers' knowledge about and understanding of the reform curriculum will increase, their instructional plans will begin to reflect a new vision of mathematics education, and student capacity to engage in mathematical problem solving will improve. Unfortunately, at the present time, research to judge the viability of this hope is not available.
The purpose of this study, therefore, was to answer the following research questions about classroom-based PA-driven instruction in mathematics: (a) How does classroom-based PA-driven instruction affect teachers' knowledge about what PA is, teachers' knowledge about how PA might enhance instructional decisions, and teachers' reports of changes in their curricular focus? (b) What is the nature of teachers' mathematics instructional plans when they use classroom PA? and (c) Does teachers' use of classroom-based PA-driven instruction enhance students' mathematical problem solving?

To explore the third question, we examined student performance on a set of related problem-solving measures for learners who had been designated by their teachers as performing above, at, and below grade level in mathematics. We incorporated a set of related measures to examine problem solving; we separated outcomes by students' grade-level mathematics functioning to address the reform movement's focus on equity.
Problem Solving

By definition, mathematical problem solving requires students to apply knowledge, skills, and strategies within novel contexts. Relative to the contexts students already have experienced, some problems are more novel, or less similar, than others. As students solve problems with increasing novelty, they demonstrate greater problem-solving capacity because greater novelty requires recognition of patterns with fewer elements of connectedness (cf. Prawat, 1989). Novelty therefore may increase metacognitive demands for generating awareness of relations (cf. Brown et al., 1992; Prawat, 1989), require activation of different problem-solving schemas (cf. Brown et al., 1992; Cooper & Sweller, 1987), or involve different problem-solving operators (Cooper & Sweller, 1987).

Unfortunately, as revealed in theoretical frameworks for understanding transfer, standards for judging similarity are inherently personal; no taxonomy for indexing similarity exists (Cooper & Sweller, 1987; Goodman, 1978; Tomic, 1995). Therefore, the question of how to measure problem solving remains essentially arbitrary (Salomon & Perkins, 1987).

In light of the importance of problem solving, but given the arbitrariness inherent in formulating similarity judgments, we relied on the transfer literature to identify problem features that reflect decreasing similarity, or connectedness, with the PAs students had experienced in their classrooms. These elements were (a) similarity in format and scoring, which may serve to prompt metacognitive awareness of relations across problem situations (Brown et al., 1992; Prawat, 1989); (b) similarity of problem structure, which may determine the similarity of the problem-solving schema to be activated (Brown et al., 1992; Cooper & Sweller, 1987); and (c) similarity of necessary computation and application skills, which may determine the similarity of problem-solving operators (Cooper & Sweller, 1987).

Using these problem features to structure our similarity analyses, we identified three problem-solving measures with decreasing elements of similarity with the classroom PAs. We refer to the three measures as analogous, related, and novel. Although the selection of these terms is arbitrary, the labels reflect similarity in a relative sense for the following reasons.
The analogous problem-solving measure shared the greatest number of elements of similarity with the classroom PAs: (a) The format and scoring were identical to the classroom PAs, perhaps serving to prompt metacognitive awareness of relations; (b) the problem structure was identical, potentially permitting activation of the same problem-solving schema; and (c) the problem involved analogous computation and application skills, allowing use of the same problem-solving operators. The analogous measure differed from the classroom PAs because it featured different quantities, objects, and contexts (as did every classroom PA).

The related problem-solving measure had fewer elements of similarity. Although the format and scoring were identical to the classroom PAs, perhaps serving to prompt metacognitive awareness, the problem structure differed, demanding activation of different problem-solving schemas, and the problem involved different computation and application skills, requiring different problem-solving operators.

The novel problem-solving measure had the fewest elements of similarity: (a) The format and scoring differed from the classroom PA, perhaps decreasing metacognitive awareness of potential relations to the classroom PAs; (b) the problem structures varied, demanding activation of different schemas; and (c) the problems involved different computation and application skills, requiring use of different operators.
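The relation among the three measures can be summarized by recording which of the three similarity elements each one shares with the classroom PAs. The following Python sketch is only an illustrative encoding of that description; the class name, field names, and the simple count of shared elements are assumptions introduced here and are not part of the study's materials or analyses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measure:
    """Similarity of one problem-solving measure to the classroom PAs.

    The field names are hypothetical labels for elements (a)-(c) in the text.
    """
    name: str
    same_format_and_scoring: bool  # (a) format and scoring
    same_problem_structure: bool   # (b) problem structure
    analogous_skills: bool         # (c) computation and application skills

    def shared_elements(self) -> int:
        """Count how many of the three elements match the classroom PAs."""
        return sum([self.same_format_and_scoring,
                    self.same_problem_structure,
                    self.analogous_skills])


# Values follow the prose description of each measure.
MEASURES = [
    Measure("analogous", True, True, True),
    Measure("related", True, False, False),
    Measure("novel", False, False, False),
]

# Sorting by shared elements reproduces the decreasing-similarity ordering.
ordered = sorted(MEASURES, key=Measure.shared_elements, reverse=True)
print([(m.name, m.shared_elements()) for m in ordered])
```

The count is a convenience for ordering only; as the authors note, judgments of similarity are inherently arbitrary, so the numbers should not be read as a scale.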

Equity
Regardless of how one chooses to measure problem solving, it is clear that problem solving can be difficult to promote (cf. Bransford & Schwartz, 1999; Detterman & Sternberg, 1993; Resnick & Resnick, 1992). In fact, Cooper and Sweller (1987) characterized the development of "problem-solving skill...[as] an enigma" (p. 347). Moreover, research suggests that benefits from instructional environments designed to promote problem solving may vary as a function of students' prior learning accomplishments. For example, Cooper and Sweller (1987) found that students with previously low achievement levels required longer periods and more worked examples to acquire mathematical problem solving; Mayer (1998) demonstrated that when teachers used an approach consistent with the NCTM standards, students with higher incoming achievement levels benefited more; Woodward and Baxter (1997) showed that students with learning disabilities or who scored below the 34th percentile on a standardized achievement test profited less from a reformed mathematics curriculum than did average achievers.

These findings raise concern about the potential efficacy of a reformed mathematics curriculum for students already performing below grade level. This concern is especially problematic because the education reform movement calls explicitly for a dual focus on excellence and equity. The goal is to enrich school curricula with authentic problem-solving activities and achieve uniformly high performance standards for all students, regardless of their learning histories (McDonnell, McLaughlin, & Morison, 1997; McLaughlin, Shepard, & O'Day, 1995). Because prior research has suggested that uniformly high standards for problem-solving outcomes may be difficult to achieve, it was important in this study to examine outcomes separately for students designated by their teachers as above, at, and below grade level.
Study Limitations

Before outlining the study, we caution readers about some limitations associated with this research. First, we purposefully did not contribute to the PA teachers' discussions about how to strengthen their instructional programs in response to student performances on the classroom PAs. We limited our participation in this way because we did not want to confuse our findings about PA-driven instruction with the effects of "mathematics reform classrooms," where teacher support for modifying beliefs and for enhancing pedagogical knowledge is substantively rich. Given our choice, the current study does not provide a basis for determining the potential of reform classrooms.

Second, as already stated, our scheme for judging problem-solving similarity and for establishing terms for our problem-solving measures, although defensible in a relative sense, is necessarily arbitrary. Therefore, in interpreting findings across our problem-solving measures, readers should take care to apply their own similarity standards.


Third, and most important, when describing how teachers plan with classroom-based PA-driven instruction, we collected no data on our contrast group teachers, and we relied entirely on self-reports. Since we do not provide comparison data using the contrast group, we cannot attribute our instructional planning findings specifically to our treatment. Since we did not collect classroom observational data, our conclusions about what PA teachers did in their classrooms are only tentative. We remind readers of these two related limitations as we discuss findings.

Method
Overview

We randomly assigned 16 teachers to classroom-based PA-driven instruction (PA) and no classroom-based PA-driven instruction (no-PA) conditions. PA teachers attended an initial workshop, administered three PAs over several months, and, after each PA, met with colleagues to score PAs and share ideas for providing student feedback and instruction. We provided technical support to PA teachers and explicitly encouraged them to adapt their instruction to enhance the problem-solving performance of all students in their classrooms. We examined teachers' knowledge about PA and described PA teachers' instructional planning in response to their use of PA. We also assessed students' problem-solving performance on three measures with decreasing elements of similarity in regard to the classroom PAs. We separated these learning outcomes for children whom teachers had designated as above, at, and below grade level in mathematics (see Table 1 for a schedule of study activities). In this section, we describe study participants, the classroom PAs, the PA teachers' activities, the teacher and student outcome measures, data collection, and data analysis.

Participants

Teachers. From four schools in a southeastern urban school district, 16 general educators (all female; 4 at Grade 2, 6 at Grade 3, and 6 at Grade 4) volunteered to participate. Stratifying by grade level, we randomly assigned half of the teachers to a classroom-based PA-driven instruction condition. For PA teachers, mean years of teaching and class size, respectively, were 16.75 (SD = 6.59) and 22.50 (SD = 1.77); the corresponding means for no-PA teachers were 19.88 (SD = 8.58) and 24.88 (SD = 1.18). Two PA teachers were African American, 5 were European American, and 1 was Asian American; 4 had bachelor's degrees, and 4 had master's degrees. All no-PA teachers were European American; 3 had bachelor's degrees, and 5 had master's degrees.
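To make the stratified assignment concrete, the following Python sketch shuffles the teachers within each grade and places half in each condition. It is a minimal illustration under stated assumptions: the source reports only that assignment was random and stratified by grade, so the shuffling procedure, the function name, the seed, and the teacher identifiers are all hypothetical. Only the grade counts (4, 6, and 6) come from the text.

```python
import random


def assign_by_stratum(teachers_by_grade, seed=None):
    """Randomly assign half of each grade's teachers to PA, the rest to no-PA."""
    rng = random.Random(seed)
    assignment = {}
    for grade, teachers in teachers_by_grade.items():
        shuffled = list(teachers)      # copy so the input list is not mutated
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        for teacher in shuffled[:half]:
            assignment[teacher] = "PA"
        for teacher in shuffled[half:]:
            assignment[teacher] = "no-PA"
    return assignment


# Hypothetical teacher IDs; the number of teachers per grade is from the text.
teachers_by_grade = {
    2: [f"G2-T{i}" for i in range(1, 5)],
    3: [f"G3-T{i}" for i in range(1, 7)],
    4: [f"G4-T{i}" for i in range(1, 7)],
}

assignment = assign_by_stratum(teachers_by_grade, seed=0)
n_pa = sum(1 for condition in assignment.values() if condition == "PA")
print(n_pa, len(assignment) - n_pa)
```

Because every stratum has an even number of teachers, this procedure yields 8 PA and 8 no-PA teachers with each grade split evenly, which matches the design described above.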
Inferential statistics indicated that PA and no-PA teachers were comparable on these variables.

Students. From these classrooms, student participants were the children for whom we had complete information (i.e., who were present on every day


Table 1
Study Activities by Week and Treatment

Activity                                                       Week          PA
Pretests on analogous problem-solving measure administered     1-2           X
Initial teacher workshop occurred                              1             X
Classroom PA administered                                      10, 13, 23    X
Teacher release day occurred                                   10, 13, 23    X
PA feedback session provided to students                       10, 13, 23    X
Teachers completed instructional plan sheets                   10, 13, 23    X
Teachers implemented instructional plans                       10-30         X
Teachers completed questionnaires                              19-21         X
Posttest on analogous problem-solving measure administered     24            X
Posttest on novel problem-solving measure administered         25            X
Posttest on related problem-solving measure administered       30            X

No PA: X X X X

of pretesting and posttesting). Each teacher completed a demographic information form on which she reported the following information for each student in her class: her judgment of the student's mathematics grade-level status and reading grade-level status (above, at, or below grade), age, gender, race, reduced/free lunch status, special education status (yes/no), and classroom behavior status (acceptable, occasional problem, or frequent problem). To designate above-, at-, and below-grade status, we asked teachers to review standardized achievement tests and to consider their students' current classroom performance. At the beginning of the study, we administered the Mathematics Computation and Applications Test (MCAT; Fuchs et al., 1997), and we obtained, from the previous spring, results of the school district's administration of the mathematics portion of the Comprehensive Test of Basic Skills. We used mathematics grade-level status, as reported by the teachers, as a study factor. We used the other data to examine group comparability.

Information by treatment (PA vs. no-PA) and by mathematics grade-level status is shown in Tables 2 and 3 for continuous and categorical data, respectively. Two-way (mathematics grade-level status and treatment) analyses of variance (ANOVAs) on the continuous variables and chi-square analyses on the categorical variables indicated comparability with the following exceptions. As would be expected, mathematics test scores and reading grade-level status were associated with students' mathematics grade-level status.

To determine whether the 272 children in the complete data set were comparable to the remaining 90 pupils (who were absent on 1 or more days of pretesting or posttesting), we conducted additional inferential analyses on


Table 2 [continuous student data by treatment and mathematics grade-level status; table contents not recoverable from the source]


Table 3
Categorical Student Demographic Data by Treatment and Mathematics Grade-Level Status

Columns: mathematics grade-level status (Above, At, Below), each split into PA and No PA, each reporting n and %.

Variables: Gender (male); Race (African American, European American, Asian American); Reduced/free lunch; Sped (yes); Behavior (Acceptance, Occasional, Frequent); Reading (Above, At, Below)

12 41 3 23 3 4 0 10 79 10 14 0

50

38 44 32 37 46 53 8 9 27 31 2 2 61 71 19 22 6 7 13 15 63 73 10 12

53

54

45

10 48 8 38 13 62 0 0 12 57 1 5 12 57 7 33 2 9 0 0 6 29 16 71

2 11 14 77 2 11 3 17 0 0 14 78 2 11 2 11 14 77 3 17 1 6

38 38 53 54 7 7 28 29 0 0 75 77 17 17 6 6 10 10 79 81 9 9

7 35 12 60 1 5 14 70 2 10 12 60 6 30 2 10 0 0 4 20 15 80

26 90 0 0 3 10 26 90 3 10 0 0

age and on the variables shown in Tables 2 and 3. We found no significant differences.

Classroom PAs

Development. At each grade level, we had developed six parallel forms of a PA using the following procedure. First, we selected a "massed concepts" PA framework (Sammons, Kobett, Heiss, & Fennell, 1992) designed to assess students' application of a "core" set of skills considered essential for successful entry to the next grade. Having identified a basic conceptualization for our PAs, we developed one preliminary fourth-grade PA that we used for illustrative purposes in the development process. Next, we held a focus group meeting at which teachers individually completed the illustrative PA, learned (in the large group) to use a PA scoring rubric, and divided into grade-level teams to score five sample PAs and to make suggestions for modifying the scoring rubric and the basic structure of the illustrative PA. Then, in grade-level teams, teachers identified 10 core skills to incorporate into PAs at their grade level. They began by reviewing the statewide mathematics curriculum to select the 10 skills most essential for successful entry into the next grade and the 10 skills most essential for successful entry into the grade they teach. Then, in one large group, the teachers compared the skills they specified as important for entering the next


higher grade with the skills identified by teachers in that next higher grade as critical for successful entry. With this input, teachers returned to grade-level teams to finalize lists of core grade-level skills. Finally, in grade-level teams, teachers identified and rank ordered 20 themes that represented real-life situations students might face now or in the next few years, were interesting, could incorporate the 10 core skills, and were age appropriate.

Working with this input, we collectively developed three PAs, one at each grade level. We piloted each PA with three students who were entering and three who were exiting the target grade. On the basis of the range of performances and the students' input about what they liked, disliked, and found confusing, we modified these initial three PAs. We next constructed a framework for the remaining five parallel PA forms at each grade level and then developed those remaining PAs.

The following sequence then recurred three times. At each grade level, five experienced teachers completed each of the six PAs. These adults, who were unfamiliar with the development process, identified every skill they applied in solving the PA, described inconsistencies in difficulty level and the required skill applications across parallel forms, and noted potential sources of confusion within the narrative and questions. Based on this input, we revised the PAs. Finally, four students (two exiting and two entering the target grade level) completed all six PAs at each grade. Based on their responses and input, we made a final set of revisions. For the classroom PAs in this study, we used three alternate forms at each grade level. See the Appendix for one fourth-grade PA.

Format. Each 2- to 3-page PA began with a multiparagraph narrative describing the problem situation. Each dilemma also presented students with tabular and graphic information for potential application in the PA.
The problem included questions that provided students with opportunities to (a) apply the core set of skills, (b) discriminate relevant from irrelevant information in the narrative, (c) generate information not contained in the narrative, (d) explain their mathematical work, and (e) generate written communication related to the mathematics. At Grade 4, because one question required use of graph paper, every page of the PA was photocopied (i.e., superimposed) onto graph paper. Between each question, we left adequate space for a response.

Scoring. These PAs are scored according to a rubric adapted from the Kansas Quality Performance Assessment (Kansas State Board of Education, 1991). This rubric structured scoring along four dimensions (conceptual underpinnings, computational applications, problem-solving strategies, and communicative value). Each dimension was scored on a 6-point Likert-type scale (where 0 = no relevant response and where an anchor was provided for every odd number on the scale).

Reliability and validity. In accord with Lane, Liu, Ankenmann, and Stone (1996), we had investigated the technical features of these PAs using three criteria: intertask consistency, error due to raters, and relation with other measures (see Fuchs et al., 2000). On each criterion, the PAs were shown to be adequate (see Measures section for coefficients).
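As a rough illustration of the four-dimension rubric described under Scoring, the sketch below stores one scored PA as a small record. The class name `RubricScore`, the field names, and the 0 to 5 range (one reading of a 6-point scale on which 0 means no relevant response) are assumptions made for this example; they are not taken from the Kansas rubric itself.

```python
from dataclasses import dataclass, fields

# Assumption: a 6-point scale anchored at 0 is read here as the integers 0..5.
MIN_SCORE, MAX_SCORE = 0, 5


@dataclass(frozen=True)
class RubricScore:
    """One PA scored on the four rubric dimensions (hypothetical field names)."""
    conceptual_underpinnings: int
    computational_applications: int
    problem_solving_strategies: int
    communicative_value: int

    def __post_init__(self):
        # Reject any score that falls outside the assumed scale.
        for f in fields(self):
            value = getattr(self, f.name)
            if not isinstance(value, int) or not MIN_SCORE <= value <= MAX_SCORE:
                raise ValueError(
                    f"{f.name} must be an integer from {MIN_SCORE} to {MAX_SCORE}"
                )

    def as_dict(self):
        """Return the four dimension scores keyed by dimension name."""
        return {f.name: getattr(self, f.name) for f in fields(self)}
```

Keeping the four dimensions as separate fields, rather than a single total, mirrors the way the study reports each dimension on its own.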


PA Teacher Activities

Initial workshop. During Week 1 of the study, teachers participated in one full-day workshop in which they (a) learned about the reform emphasis on excellence and equity and about the purpose of PA, (b) completed one PA, (c) reviewed scoring criteria, (d) achieved scoring reliability of 80% on five PAs, and (e) discussed instructional methods that might help their students develop problem-solving capacity.

Administration of classroom PAs. Teachers administered the three classroom PAs on the Monday of Study Weeks 10, 13, and 23. Teachers read classroom PAs aloud to students and reread text whenever students requested. Teachers had no access to these PAs prior to the administration.

Release days following classroom PAs. The Tuesday after each PA administration, teachers came together for a release day. Each day, they participated in the same set of activities. First, they reviewed procedures for scoring the PAs according to the rubric adapted from the Kansas Quality Performance Assessment (Kansas State Board of Education, 1991). Second, teachers reviewed methods for providing written feedback to students. Third, teachers worked with fellow teachers at their grade to achieve at least 80% agreement on five protocols (preselected by research assistants to represent a range of responses) and to discuss their written comments to students. Fourth, teachers scored and wrote comments on their own students' PAs. Fifth, teachers reviewed and discussed a lesson we had prepared for teaching students how to interpret PA feedback. Sixth, teachers brainstormed about and discussed, in the large group, how to modify instruction to enhance the performance of all students in their classes, including below-grade-level pupils. As teachers participated in these discussions, we moderated but refrained from interjecting our own ideas. Therefore, the discussions were collegial, without external influence. Finally, teachers completed instructional plan sheets on which they described plans for modifying their instructional programs before the next PA administration and, for the second and third sessions, modified previous plans to reflect the activities they had actually implemented.

Providing student feedback. On the Wednesday or Thursday of each PA week, teachers distributed the scored PAs and provided feedback to students using a lesson we had structured as follows. First, students reviewed a blank copy of the PA they had taken on the Monday of that week, as the teacher reread the PA aloud. Second, the teacher worked the PA on an overhead, showing all work, labeling to show what she was doing and why, providing written explanations and communication about the work, and using harder skills whenever possible (e.g., multiplication rather than addition). Third, the teacher reviewed the four areas in which the PAs were scored and principles for scoring high in each area. For conceptual underpinnings, termed "making sense" for the children, the lesson instructed students to show that they understood what the problem was all about and to do work that made sense. For computational applications,


termed "computation," the lesson instructed students to show all work, even easy skills that they could do in their head, and to add, subtract, multiply, and so on without errors. For problem-solving strategies, termed "problem solving," the lesson instructed students to use only important pieces of information and to show their plan for solving the problem. For communicative value, termed "communication," the lesson instructed students to show all of their work and to tell what they did and why. Then the teacher reviewed the numbering system by which each dimension was scored and explained the nature of the comments she wrote while providing examples of positive and negative comments related to each scoring dimension. Finally, the teacher passed out the scored test, encouraged student discussion, and checked student understanding of the comments and scores by asking questions of individual students.

Measures

Teacher questionnaires. PA and no-PA teachers completed questionnaires that comprised two parts. On one part, teachers responded to two open-ended items designed to assess their knowledge about what PA is and how PA might enhance their instructional decisions: "Write a mathematics problem that might be categorized as an example of performance assessment" and "Explain the ways in which performance assessment might be helpful in making educational decisions in your school and classroom in the area of mathematics." For each question, separately, teacher responses were ordered randomly (so that PA and no-PA teachers were in no particular order) and typed without teacher names. Then two research assistants read each response and developed a coding scheme to map responses comprehensively. These coding values appear in Table 4. Finally, the same two research assistants coded each teacher's answer in terms of whether it represented each possible response code.
Agreement between the coders on 33% of the responses, equally distributed between treatments, was 95.4%. For each question, we entered into analysis the total number of codes each teacher's response represented (except that, on the second open-ended question, we did not include wrong/irrelevant responses in the total). On another part of the questionnaire, as a means of assessing changes in teachers' curricular focus, teachers distributed 100 points to indicate "how much instructional time they allocated to: basic math facts, computation, word problems, problem-solving activities, and other." They completed this distribution for the previous year, the current year, and the year to follow.

Instructional plan sheets. On each release day, every PA teacher completed an instructional plan sheet on which she described the activities she planned to incorporate into her instructional program before the next PA to enhance student performance on the types of problem-solving activities incorporated in the PA. Two research assistants studied these plans (from which dates and names had been removed) and developed a coding scheme to capture comprehensively the foci of the teachers' instructional plans (see Table 5 for a list of codes). Then two additional research assistants used this


scheme to code plans (from which dates and teacher names had been removed). Interscorer agreement on this coding, calculated on 25% of the plans equally distributed across the release days, was 92%. We treated these data descriptively.

Table 4
Teacher Questionnaire Data by Treatment

                                                                      Treatment
Open-ended question/response                                          PA   No PA

Write a math problem that might be categorized as an example of PA
  Contains 2 or more paragraphs                                        4      2
  Contains tables or graphs                                            6      2
  Has 2 or more questions                                              7      7
  Provides opportunities to apply 3 or more skills                     8      5
  Requires students to discriminate relevant/irrelevant information    3      0
  Requires students to generate information                            6      1
  Requires students to explain work                                    3      0
  Requires students to generate written communication                  4      2
Ways PA helpful in making educational decisions in mathematics
  Track student progress                                               4      1
  Develop higher order thinking                                        4      0
  Measure higher order thinking                                        2      3
  Develop application of skills                                        3      3
  Measure application of skills                                        3      1
  Help teachers plan better instruction                                6      4
  Identify students' strategies for problem solving                    2      1
  Develop students' strategies for problem solving                     1      0
  Wrong/irrelevant response                                            0      2

Note. Values represent number of teachers citing each response.

Student problem solving. We measured student problem solving with three types of measures: analogous, related, and novel with respect to the classroom PAs. The analogous problem-solving measure involved the three unused and unseen parallel PA forms we had developed at the students' assigned grade. Consequently, the analogous problem-solving measure incorporated the same problem structure and required the same computation and application skills as did the classroom PA. It also was several pages in length and began with a multiparagraph description of the problem situation, which included tabular and graphic information for potential application in the PA. The problem posed four questions that provided students with opportunities to apply the same core set of skills prompted by the three classroom PAs, to discriminate relevant from irrelevant information in the narrative, to generate information not contained in the narrative, to explain mathematical work, and to produce written communication related to the mathematics. The analogous problem-


solving measure differed from the classroom PAs in that it involved different quantities, objects, and contexts. At the classroom level, we counterbalanced the three forms of the analogous problem-solving measure across trials and treatments.

Research assistants used standard procedures for administering the analogous problem-solving measures (see Appendix for directions). As with classroom PAs, these PAs were read aloud, with parts reread as students requested. At pretesting, research assistants stopped the administration when two thirds of the class had finished or when 20 minutes had elapsed, whichever occurred sooner; at posttesting, they stopped the administration when two thirds of the class had finished or when 40 minutes had elapsed, whichever occurred sooner. The analogous problem-solving measures were scored according to the same rubric teachers used for classroom PAs. Two scorers unaware of the study condition in which PAs were completed independently scored the rubric for 20% of the PAs. For each protocol, we averaged agreement across the four dimensions. Across the 20% of protocols on which agreement was calculated, rates of agreement were 97.6% at pretest and 95.7% at posttest. As demonstrated in previous work (see Fuchs et al., 2000), alternate form/test-retest reliability coefficients for the conceptual underpinnings, computational applications, problem-solving strategies, and communication scoring dimensions, respectively, were .56, .66, .59, and .46. As demonstrated with 331 students, correlations with the total math score on the Comprehensive Test of Basic Skills ranged from .62 (for communication) to .68 (for problem-solving strategies).
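The agreement computation described above (agreement averaged across the four dimensions for each protocol, then summarized across the double-scored protocols) can be sketched as follows. This is only an illustration: treating agreement as an exact match between the two scorers is an assumption, and the function names, dimension labels, and example scores are made up for the sketch rather than taken from the study.

```python
# Hypothetical short labels for the four rubric dimensions.
DIMENSIONS = ("CU", "CA", "PS", "COM")


def protocol_agreement(scorer_a, scorer_b):
    """Proportion of the four dimensions on which two scorers gave the same score."""
    matches = sum(scorer_a[d] == scorer_b[d] for d in DIMENSIONS)
    return matches / len(DIMENSIONS)


def mean_agreement(pairs):
    """Average per-protocol agreement over (scorer_a, scorer_b) pairs, as a percentage."""
    if not pairs:
        raise ValueError("need at least one double-scored protocol")
    return 100.0 * sum(protocol_agreement(a, b) for a, b in pairs) / len(pairs)


# Made-up scores for two double-scored protocols (not data from the study).
example = [
    ({"CU": 3, "CA": 2, "PS": 3, "COM": 1}, {"CU": 3, "CA": 2, "PS": 3, "COM": 1}),
    ({"CU": 4, "CA": 3, "PS": 2, "COM": 2}, {"CU": 4, "CA": 3, "PS": 3, "COM": 2}),
]
print(mean_agreement(example))  # 87.5
```

In the made-up example, the first protocol agrees on all four dimensions and the second on three of four, so the mean is (100% + 75%) / 2 = 87.5%.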
The related problem-solving measure involved one unseen PA from one grade level below the students' assigned grade (i.e., for third graders, the related problem-solving measure was one second-grade classroom PA). This related problem-solving measure differed from the analogous problem-solving measure in that it incorporated a different problem structure and required application of a different, easier set of computation and application skills. Nevertheless, the analogous and related problem-solving measures shared three features: (a) Their appearance and format were similar (i.e., multipage, multiparagraph presentation of information, with tabular/graphic information and four questions); (b) they provided opportunities to discriminate relevant from irrelevant information, to generate information not contained in the narrative, to explain mathematical work, and to produce communication related to the mathematics; and (c) they relied on the same scoring rubric. Administration and scoring methods were identical to the methods used for the analogous problem-solving measure. Interscorer agreement assessed on 20% of the protocols was 96.1%.

The novel problem-solving measure was the Iowa Test of Basic Skills (Riverside, 1995): "Bears and Boats" at Grade 2 and "Dollars and Sense" at Grades 3 and 4. According to the test manual, reflecting the NCTM recommendations:

The focus of the [mathematics] assessments is on students' mathematical reasoning capabilities, as well as on their ability to communicate their understanding through mathematics symbols and words of their own choosing. Assessment of computational skills is limited to the context of problem solving. In the primary-grade assessments, the use of manipulatives helps students demonstrate their awareness of mathematical concepts. (pp. 4-5)

We used normal curve equivalents based on a national sample. As reported in the manual for "Bears and Boats" and "Dollars and Sense," respectively, alpha coefficients were .59 and .84; correlations with the Iowa Test of Basic Skills were .69 and .78. Two research assistants unaware of study conditions scored the protocols; based on 20% of the protocols, agreement was 98.6%. We considered the Iowa PA the novel problem-solving measure because its appearance, its format, and the nature of the items and problem structures differed greatly from those of the classroom PAs.

Teacher Data Collection

Research assistants delivered questionnaires to PA and no-PA teachers on Monday of Week 19 and collected them on Friday of Week 21. On each of the 3 release days, PA teachers completed instructional plan sheets, which we photocopied before teachers left for the day.

Student Data Collection

Analogous problem-solving measure. During Weeks 1 and 2, research assistants trained in standard procedures administered two alternate forms of the analogous problem-solving measure (i.e., two unused, unseen alternate forms of the grade-level classroom PAs) during the same session in a whole-class format (excluding students who were absent). Research assistants read all text aloud and reread text whenever requested. The average score (on each scoring dimension) across the two administrations was entered into analyses. Between the administrations, research assistants delivered a 45-minute lesson on how PAs are structured, strategies for approaching PAs, and scoring procedures.
This lesson, which was designed to increase student familiarity with PAs, incorporated examples of student work to illustrate the topics and required students to respond frequently to questions (see Fuchs et al., 2000, for additional information on effects of this familiarity training). During Week 24, the same research assistants administered the final form of the analogous problem-solving measure (i.e., a third, unused, unseen grade-level classroom PA) in a whole-class format (excluding absent students). Research assistants read the text aloud and reread as requested. In every class, prior to the posttest administration, research assistants reviewed the PA scoring procedures and tips for scoring well on PAs.

Related problem-solving measure. During Week 30, research assistants trained in standard procedures administered the related problem-solving measure (i.e., one unseen below-grade classroom PA). This was administered in a whole-class format to every PA and no-PA student (who was not absent) at Grades 3 and 4 (i.e., fourth graders took one third-grade classroom PA and third graders took one second-grade classroom PA; we had no available below-grade


PA for second graders, so they did not take the related problem-solving measure). At each grade, all students completed the same below-grade-level PA form. As with classroom PAs, all text was read aloud and reread as requested.

Novel problem-solving measure. During Week 25, research assistants trained in the standard administration procedures administered the novel problem-solving measure (i.e., the Iowa PA). This was administered in a whole-class format to all PA and no-PA students (who were not absent). As with classroom PAs, the Iowa PA was read aloud and reread as requested.

Data Analysis

Teacher data. For the items on the questionnaire that involved teachers' knowledge about what PA is and how PA might enhance instructional decisions, we summed all of the codes each teacher had mentioned in her response and conducted a between-subjects (treatment) ANOVA on these sums. To calculate effect sizes for these questions (which involved data collection at one point in time), we calculated the difference between the means, which we then divided by the pooled standard deviation (Hedges & Olkin, 1985).

For the question about changes in teachers' curricular focus, we conducted one between-subjects (treatment: PA vs. no PA) ANOVA and one within-subject (year: previous vs. current vs. following) ANOVA (see Table 6). To calculate effect sizes for this question about curricular focus (which involved changes between years), we used the following formula: difference between change scores divided by quantity (pooled standard deviation of the increase/square root of 2[1 - r]) (Glass, McGaw, & Smith, 1981). We assigned positive values to effect sizes that indicated a change in the direction of the reform movement. Therefore, for math facts and computation, positive effect sizes indicate greater decreases for PA than for no-PA teachers; for word problems and problem solving, positive effect sizes indicate greater increases for PA than for no-PA teachers. We treated teachers' instructional planning data descriptively (see Table 5).

Table 6
Teachers' Curricular Focus by Treatment

                               PA            No PA        ES[a]
Focus             Year      M       SD         M       L-T     T-N

Math facts        Last    15.71    5.35      17.50     0.62    0.09
                  This    13.75    7.44      20.00
                  Next    12.50    3.78      19.38
Computation       Last    25.00    9.13      20.63     0.88    0.21
                  This    18.75    4.43      21.88
                  Next    18.75    5.82      23.13
Word problems     Last    12.86    6.36      13.13     0.73   -0.19
                  This    16.88    3.72      11.88
                  Next    15.00    5.98      11.25
Problem solving   Last     6.43    4.76       8.13     1.51    0.16
                  This    15.00    5.98       6.88
                  Next    17.50    7.07       8.12

No PA SD and F (PA, Year, PA x Year) values, in the order extracted from the source: 5.76* 4.63 10.27** 0.64 5.35 4.17 5.63 0.10 2.15 7.98* 6.51 4.58 1.22 2.64 5.30 1.73 3.72 2.31 6.51 4.21 19.08** 21.76** 4.58 6.51

[a] ES is effect size for PA vs. no PA; L-T is last to this year, and T-N is this to next year. We assigned positive values to ESs indicating change in the direction of the reform movement. Therefore, for math facts and computation, positive ESs indicate greater decreases for PA than for no-PA teachers; for word problems and problem solving, positive ESs indicate greater increases for PA than for no-PA teachers.
*p < .01. **p < .001.

Student data. On each measure, we conducted two-way between-subjects (treatment: PA vs. no PA; math grade-level status: above vs. at vs. below) ANOVAs (see Tables 7 and 8). On the analogous problem-solving measure, we examined pretreatment differences using pretest data; we examined learning using growth scores from pretreatment to posttreatment. The posttreatment score was used for the related and the novel problem-solving measures. Because interactions supersede main effects, we do not discuss main effects when significant interactions are identified. The Fisher least significant difference (LSD) post hoc procedure (Seaman, Levin, & Serlin, 1991) was used in conducting follow-up tests involving more than two groups to evaluate pairwise comparisons. To calculate effect sizes for the analogous problem-solving measure (involving growth scores), we used the formula described earlier (Glass et al., 1981). To calculate effect sizes for the related and novel problem-solving measures (involving one point in time), we calculated the difference between the means, which we then divided by the pooled standard deviation (Hedges & Olkin, 1985). (See Table 9 for effect sizes.)
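The two effect-size calculations named in this section can be sketched in a few lines. This is a minimal illustration and not the authors' analysis code: the function names are ours, the group size of 8 teachers per condition is an assumption (16 teachers were randomly assigned, and the reported F tests have 14 error degrees of freedom), and the inputs to the change-score example are invented.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def es_one_time_point(mean_pa, sd_pa, n_pa, mean_no, sd_no, n_no):
    """Difference between group means divided by the pooled standard deviation."""
    return (mean_pa - mean_no) / pooled_sd(sd_pa, n_pa, sd_no, n_no)


def es_change_scores(change_pa, change_no, pooled_sd_change, r):
    """Difference between mean change scores divided by
    (pooled SD of the change scores / sqrt(2 * (1 - r))),
    where r is the correlation between the two time points."""
    if not -1.0 <= r < 1.0:
        raise ValueError("r must be in [-1, 1)")
    return (change_pa - change_no) / (pooled_sd_change / math.sqrt(2 * (1 - r)))


# Summary statistics reported in the Results for the first questionnaire item,
# with n = 8 per group assumed.
print(round(es_one_time_point(5.13, 2.17, 8, 2.25, 1.75, 8), 2))

# Invented inputs, only to show the call shape for the change-score formula.
print(es_change_scores(8.0, -1.0, 6.0, 0.5))
```

With the assumed group sizes, the first call gives roughly 1.46, close to the effect size of 1.47 reported for that item; the small gap is consistent with rounding in the published means and standard deviations. The sketch does not apply the sign convention described above (positive values for change in the direction of reform); a caller would flip the sign for math facts and computation.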
Results

Teachers' Knowledge About PA and Curricular Focus

Question 1 of this study asked, How does classroom-based PA-driven instruction affect teachers' knowledge about what PA is, teachers' knowledge about how PA might enhance instructional decisions, and teachers' reports about changes in their curricular focus? To address teachers' knowledge about what PA is, the questionnaire asked teachers to write a mathematics problem that might be categorized as an example of PA. PA teachers' problems represented an average of 5.13 (SD = 2.17) of the dimensions shown in Table 4; no-PA teachers' problems reflected an average of 2.25 dimensions (SD = 1.75), F(1, 14) = 8.51, p < .05, effect size = 1.47. Thus, results indicated that classroom-based PA-driven instruction did increase teachers' knowledge about what PA is.

To examine teachers' knowledge about how PA might improve instructional decisions, the questionnaire asked teachers to explain the ways in which PA might be helpful in making mathematics instructional decisions. On average, PA teachers' responses captured a total of 3.25 (SD = 0.71) of the codes shown in Table 4 (excluding incorrect/irrelevant responses from the totals); no-PA teachers' responses reflected an average of 1.75 (SD = 1.04) of the codes,

F(1, 14) = 11.45, p < .01, effect size = 1.70. Therefore, classroom-based PA-driven instruction did improve teachers' understanding of how PA might enhance their mathematics instructional decisions.

As a means of exploring changes in teachers' curricular focus, teachers distributed 100 points on the questionnaire to reflect the amount of instructional time they allocated to math facts, computation, word problems, and problem-solving activities (see Table 6) for the previous year, the current year, and the year to follow. For all but word problems, we found a significant interaction between treatment and year. Follow-up tests revealed the following. On math facts and computation, from the previous year to the current year, PA teachers significantly decreased their emphasis, whereas no-PA teachers allocated comparable emphasis; for both groups, emphasis was comparable from the current year to the following year. On problem-solving activities, from the previous year to the current year, PA teachers increased emphasis, whereas no-PA teachers allocated comparable emphasis; for both groups, emphasis was comparable from the current year to the following year (see Table 6 for effect sizes).

Thus, as reflected in comparisons between PA and no-PA teachers' curricular foci from the previous year to the current year, classroom-based PA-driven instruction did prompt teachers to shift their emphasis away from basic, isolated, routine content toward problem solving. And, as revealed by the effect sizes, curricular change in the desired direction was substantially larger than would otherwise be expected.

Table 9
Effect Sizes for Student Problem-Solving Data

                                                    Contrast
                                 PA vs. No PA
Measure                      Above    At     Below    PA vs.    Above     At vs.    Above vs.
                                                      No PA     vs. at    below     below

Analogous problem solving
  CU change                  1.34    1.15    0.14     1.13      0.70      0.45      1.08
  CA change                  1.20    0.76    0.26     1.07      0.79      0.26      1.01
  PS change                  1.16    1.08    0.55     1.23      0.83      0.45      0.99
  COM change                 1.35    1.09    0.60     1.26      0.67      0.34      1.04
Related problem solving
  CU                         1.07    1.00    0.14     0.88      0.56      0.75      1.43
  CA                         1.47    0.91    0.09     0.89      0.46      0.79      1.32
  PS                         1.18    1.05    0.21     0.94      0.49      0.81      1.41
  COM                        1.18    0.95    0.50     0.89      0.82      0.62      1.41
Novel problem solving
  NCE                        0.93    0.30   -0.28     0.36      0.95      0.83      1.73

Note. See Table 7 for acronym descriptions.

Teachers' Instructional Plans

Question 2 of this study asked, What is the nature of teachers' mathematics


instructional plans when they use classroom-based PA-driven instruction? To answer this question, teachers completed instructional plan sheets, indicating how they planned to address the curriculum reflected in the classroom PAs, at each of 3 professional release days. In Table 5, we display this information by grade level, teacher, and release day. Teachers committed themselves to between 9 and 17 methods for enhancing students' performance on the classroom PAs. Across teachers, 16% of methods were devoted to computation or application skills, 9% were devoted to word problem practice (one second-grade teacher was responsible for most [7 of 9] word problem practice), 27% were devoted to problem-solving activities, 22% were devoted to helping students show their work, and 26% were devoted to providing students with similar test practice or directing test-preparation activities. The problem-solving activities that teachers identified more than once were as follows: cooperative learning or peer tutoring to develop "higher level thinking" and problem-solving skills (n = 3), "problem of the day" with (n = 7) or without (n = 9) journals in which students explained their problem-solving methods, and problem demonstrations (n = 6). Consequently, approximately one of every four activities focused on developing student problem solving; about one of every two activities was devoted to helping students demonstrate their problem-solving capacity; and the remaining activities concerned the development of routine skills that might be necessary for application during problem solving.
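The three-way summary at the end of the preceding paragraph follows from the five reported percentages. A small sketch of that arithmetic is shown below; the dictionary keys and the grouping labels are ours, chosen to mirror the wording of the text.

```python
# Percentages of planned methods by category, as reported in the paragraph above.
method_share = {
    "computation or application skills": 16,
    "word problem practice": 9,
    "problem-solving activities": 27,
    "helping students show their work": 22,
    "test practice or test preparation": 26,
}

# Grouping labels are ours; they mirror the three-way summary in the text.
groups = {
    "developing problem solving": ["problem-solving activities"],
    "demonstrating problem-solving capacity": [
        "helping students show their work",
        "test practice or test preparation",
    ],
    "routine skills": ["computation or application skills", "word problem practice"],
}

# Sum the category percentages within each group.
group_share = {name: sum(method_share[k] for k in keys) for name, keys in groups.items()}
print(group_share)
```

The grouped shares are 27%, 48%, and 25%, which correspond to "approximately one of every four," "about one of every two," and the remaining quarter of the planned methods.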
Student Problem Solving

Question 3 of this study asked, Does teachers' use of classroom-based PA-driven instruction enhance students' mathematical problem solving? To answer this question, we examined student growth from pretreatment to posttreatment on the analogous problem-solving measure and explored student performance at the end of the treatment on a related and a novel problem-solving measure.

Analogous problem-solving measure. For the analogous problem-solving measure pretest scores (see Table 7), we conducted ANOVAs to compare the pretreatment performances of PA and no-PA students. On each score, we found significant effects for mathematics grade-level status: Across PA conditions, scores of above-grade students were higher than those of at-grade students, whose scores, in turn, were higher than those of below-grade students. However, there were no significant initial effects for treatment or for the Treatment x Mathematics Grade-Level Status interaction. By contrast, ANOVAs conducted on the analogous problem-solving measure pretest to posttest growth scores (see Table 7) did reveal significant interactions between treatment and mathematics grade-level status. Follow-up tests indicated that, on each score, growth for above- and at-grade students was greater in the PA than in the no-PA condition, whereas growth for below-grade students was comparable across PA and no-PA conditions. Consequently, with their teachers' use of classroom-based PA-driven instruction, above- and at-grade students increased their performance on a



problem-solving measure analogous to the classroom PAs more than would be expected as a matter of maturity. This was not true, however, for the below-grade students.

Related problem-solving measure. On the related problem-solving measure (posttest only at Grades 3 and 4; see Table 8), we also found significant interactions between treatment and mathematics grade-level status for three of four scores: conceptual underpinnings, computational applications, and problem-solving strategies. For each score, follow-up tests indicated that above- and at-grade student performance was stronger in the PA than in the no-PA condition; however, below-grade student performance was comparable across PA and no-PA conditions. Therefore, as with the analogous problem-solving measure, above- and at-grade-level students benefited differentially from their participation in PA-driven instruction classrooms, as reflected on a related problem-solving measure; by contrast, below-grade-level students did not. On the remaining score, communicative value, we found a main effect favoring PA over no-PA students (and a main effect favoring above-grade over at- and below-grade students and favoring at-grade over below-grade students). So, as reflected on the communication score on a related problem-solving measure, PA-driven instruction benefited students across all three grade-level designations.

Novel problem-solving measure. We found a significant interaction on the novel problem-solving measure (posttest only) (see Table 7). Follow-up tests revealed that, among above-grade students, scores were higher for PA than for no-PA students; by contrast, among at- and below-grade students, scores were comparable across PA and no-PA conditions. Consequently, as reflected on the novel problem-solving measure, only above-grade students benefited differentially from their participation in PA-driven instruction classrooms.
Discussion

Effects of Classroom-Based PA-Driven Instruction on Teachers

Results indicated that classroom-based PA-driven instruction can serve to increase teachers' understanding of what PA is and how PA might be used to improve mathematics instructional decisions. When asked to develop a PA, teachers in the classroom-based PA-driven condition constructed problems that incorporated more features associated with PA. When asked how PA might be used to formulate instructional decisions, PA teachers cited more strategies. Results were not only statistically significant but also large: Effect sizes comparing PA and no-PA teachers' knowledge exceeded 1 standard deviation unit. These findings are important because they reveal the potential for classroom-based PA-driven instruction to extend teachers' knowledge about assessment practices. These results are not, however, surprising. After all, one might expect PA knowledge to increase as teachers, over the course of a school year, repeatedly score students' PAs, provide students with feedback on their performances, and use the assessment information to develop classroom activities.


Perhaps more noteworthy is the finding that, as a function of participating in classroom-based PA-driven instruction, teachers reported curricular foci that were becoming better aligned with the mathematics reform movement. Relative to the contrast teacher group, PA teachers reported a differentially decreasing emphasis on math facts and computation along with a differentially increasing focus on problem solving. Effect sizes reflecting the magnitude of the difference between PA and no-PA teachers' changing curricular emphasis were substantial, ranging from 0.62 for math facts to 1.51 for problem solving. Clearly, classroom-based PA-driven instruction enhanced teachers' thinking about their mathematics curriculum. This finding echoes previous work on the effects of statewide accountability programs incorporating PA. For example, Koretz and colleagues (Koretz, Barron, et al., 1996; Koretz, Mitchell, et al., 1996) showed that as states rely on PAs and provide educators with information about and experience in scoring PAs, teachers increase curricular focus on mathematical communication, problem solving, and application.

Importantly, however, the current study extends Koretz et al.'s findings by showing how PA teachers' instructional plans may reflect reports of changing curricular emphasis. As the PA teachers, over three planning cycles, brainstormed collaboratively with colleagues to address the challenges associated with increasing student performance, they committed themselves to many (i.e., between 9 and 17) instructional ideas. These ideas represented at least three important instructional strategies. First, approximately one of every four ideas the teachers incorporated into their plans was designed to expand the mathematical problem-solving performance of their students.
And some of these activities, such as peer-mediated learning (e.g., King, 1991) and problem demonstrations (e.g., Cooper & Sweller, 1987), may resemble research-based practices with demonstrated efficacy for promoting mathematical problem solving. Second, as represented by their use of testlike practice, teachers incorporated extended mathematical problem-solving activities. This is noteworthy because extended problem-solving activities offer students the opportunity to discover the relations among knowledge elements and problem features, which are important to the development of problem solving (Prawat, 1992). Teachers' use of extended problem-solving activities also is noteworthy because prior work (Stigler & Hiebert, 1997) suggests that teachers typically avoid extended mathematics activities, allocating as much as 96% of students' time to practice on routine problems. The third strategy represented in the teachers' instructional plans was helping students demonstrate and communicate about the mathematical competence they already possessed: Approximately one of every four instructional ideas was designed to improve students' labels and explanations for their work. This instructional focus seems appropriate because, as reflected in the NCTM standards (1989), competence in communicating mathematical information is a valued goal within the mathematics reform curriculum.

Of course, some may construe two of the teachers' strategies (clarifying methods for showing and explaining work and providing testlike practice) as tantamount to "teaching the test" and therefore view the teachers' activities as cause for concern. After all, these two foci collectively represented approximately half of the teachers' instructional plans, or double the teachers' use of more guided methods for extending students' mathematical problem-solving competence (e.g., peer-mediated learning or problem demonstrations). Moreover, as shown in previous work examining the effects of statewide accountability programs that incorporate PA, teachers can structure these activities in ways that constitute "coaching" (Firestone et al., 1998). And coaching can lead to rapid initial gains on PAs that do not represent true learning (Hambleton et al., 1995).

In this regard, however, it is important to consider convincing counterarguments. Some have proposed that, when PA mirrors the reform curriculum, teaching to the test represents not only a legitimate but also an effective approach for aligning curriculum in desired directions (e.g., Archbald & Newmann, 1988; Resnick & Resnick, 1992; Wiggins, 1989). Providing students with guidance on how to label or explain their work and permitting students opportunities to engage in extended problem solving with testlike activities are two viable strategies for achieving alignment by teaching to the test.

Consequently, participation in classroom-based PA-driven instruction did appear to shape teachers' understanding of assessment, curriculum, and instruction in substantial and desirable ways. Of course, within this context, findings must be qualified because of two study limitations. First, although we described PA teachers' instructional plans, we have no similar data for our comparison group of no-PA teachers. Therefore, we cannot attribute the PA teachers' plans directly to classroom-based PA-driven instruction. The importance of this problem is reduced somewhat because we have corroborating evidence of changing curricular emphases that can be linked directly to the treatment via contrast group data. The second problem in interpreting the instructional planning data is that we relied entirely on teacher reports. This second problem may be mitigated in part by recent evidence on the effects of statewide PAs; Firestone et al. (1998) showed that classroom observations of mathematics programs corroborated teacher reports.

Nevertheless, the current study suffers in important ways from its failure to collect instructional planning data for the contrast group and to complement teacher reports with observational data. Additional research incorporating contrast teachers and relying on classroom observations is required to provide more rich, trustworthy descriptions of teachers' instruction, which can be clearly attributed to the use of classroom-based PA-driven instruction.

Effects of Classroom-Based PA-Driven Instruction on Students' Problem-Solving Performance

Despite important questions about our methods for describing teachers' instructional plans, our results do provide a solid basis for concluding that teachers' use of classroom-based PA-driven instruction led to enhanced problem solving among at least some of their students.

Clearly, superior problem solving, as a function of participation in classroom-based PA-driven instruction, was demonstrated by PA students whom teachers had judged to be performing above-grade level in mathematics. These above-grade-level students grew statistically and dramatically more than demographically comparable above-grade-level no-PA students on all three problem-solving measures: the measure that was analogous to the classroom PAs (effect sizes: 1.16 to 1.35), the measure that was less similar but was related to the classroom PAs (effect sizes: 1.07 to 1.47), and the measure that was more novel with respect to the classroom PAs (effect size: 0.93).

Moreover, on two of the three measures, students who had been designated as performing at grade level also demonstrated impressive learning as a function of their teachers' use of classroom-based PA-driven instruction. On the analogous and related problem-solving measures, effects favoring PA students over no-PA students were statistically significant and impressive (effect sizes were 0.76 to 1.15 for the analogous measure and 0.91 to 1.05 for the related measure). By contrast, however, effects on the novel measure were not statistically significant, and the effect size was a correspondingly modest 0.30. This failure to produce superior learning on the novel measure for the at-grade-level students is especially noteworthy because our research staff judged the novel measure, which had many more questions to guide and structure students' responses, to be easier than this study's analogous or related measures.

Although the available data do not provide the basis to determine why at-grade-level students failed to demonstrate comparable effects across the set of problem-solving measures, it seems instructive to speculate about possible explanations. To generate some hypotheses about the at-grade-level students' disappointing performance on the novel measure, we return to an analysis of the key similarities and distinctions among the problem-solving measures.

We had relied on the transfer literature to specify problem features that reflected decreasing similarity or connectedness with the classroom PAs and among the problem-solving measures. One feature was computation and application skills, which may determine the similarity of the operators students use to solve problems (Cooper & Sweller, 1987). Skills and operators, although somewhat different across the three problem-solving measures, were highly similar and of comparable difficulty for the analogous and novel measures. Consequently, the need for varying skills and operators is not a persuasive explanation for at-grade-level students' failure to demonstrate comparable effects on the analogous and novel problem-solving measures.

In a related way, problem structure, which may determine the similarity of the schema to be activated during problem solving (Brown et al., 1992; Cooper & Sweller, 1987), does not offer a convincing framework for understanding the at-grade-level students' failure. Although the problem structure of the analogous measure was identical to those contained in the classroom PAs, it differed from the problem structure of both the related measure and the novel measure. If a differing problem structure accounted for students' failure to solve problems, then we would expect at-grade-level students to experience similar difficulties with the related and novel problem-solving measures.

Instead, at-grade-level students performed well on the related measure but not on the novel measure.

The best candidate for explaining the pattern of findings may be the third element by which we formulated similarity comparisons: format and scoring, which may serve to prompt metacognitive awareness of relations across problem situations (Brown et al., 1992; Prawat, 1989). Format and scoring were identical for the analogous and related measures (as was the at-grade-level students' performance) but differed dramatically for the novel measure (as did the at-grade-level students' performance). This provides the basis for speculating that these students may have suffered from inadequate awareness of how the knowledge and strategies they had in fact garnered from the classroom activities (as evidenced by their superior performance on the analogous and the related problem-solving measures) might apply to the novel measure.

This argument is consistent with theoretical work emphasizing the importance of metacognitive awareness of problem relations (Brown et al., 1992; Prawat, 1989). It finds empirical support in research showing that a lack of awareness of problem relations may inhibit transfer (Gick & Holyoak, 1983; Reed, Ernst, & Banerji, 1974), and it seems consistent with work showing that experts (e.g., our above-grade-level students) identify deep, relatively invisible structural similarities among problem situations, whereas novices (e.g., our at-grade-level students) tend to rely on surface similarities (Chi, Feltovich, & Glaser, 1981). This argument also seems plausible in light of PA teachers' failure to incorporate instructional activities designed to help students analyze and articulate similarities and differences among problem contexts. Such pattern-finding activities, in which students analyze contrasting cases, have been shown to enhance problem solving (Schwartz & Bransford, 1998).

Consequently, our hypothesis suggests that metacognitive awareness of the relations across problem-solving situations may have mediated at-grade-level students' problem-solving capacity. This hypothesis, although clearly speculative, provides the basis for future research.

Meanwhile, results for the below-grade-level students were disappointing. We found one statistically significant effect on the related problem-solving measure. This effect was manifested on the communicative value scoring dimension, the dimension probably most sensitive to PA teachers' guidance about how to label and explain work. Moreover, even though relatively small to modest effect sizes favored PA students on the analogous (0.14 to 0.60) and related (0.09 to 0.50) problem-solving measures, below-grade-level PA students actually performed somewhat lower than did their no-PA counterparts on the novel PA (effect size = -0.28).³ Clearly, for below-grade-level students, results are discouraging.

Of course, these findings are consistent with previous work showing that students with initially low mathematics achievement experience difficulty with mathematical problem solving. For example, Cooper and Sweller (1987) found that students with previously low achievement levels required longer periods and more worked examples to acquire mathematical problem solving. Mayer (1998) demonstrated that when teachers used an approach consistent with the NCTM standards, students with higher incoming achievement levels benefited more. And, in a similar way, Woodward and Baxter (1997) showed that students who had learning disabilities or who scored below the 34th percentile on a standardized achievement test profited less from a reformed mathematics curriculum than did average achievers.

Consequently, our results raise questions about the education reform rhetoric asserting that all children can learn to high standards (McDonnell et al., 1997). Relative to most state initiatives, the professional development offered teachers in this study was ambitious. Teachers met for a daylong introduction to and discussion about PA. Then, following each of three PA administrations, teachers spent a full day studying their students' performance, thinking about how to maximize feedback sessions, and working collaboratively with colleagues to develop instructional ideas for promoting mathematical problem solving. In response to this collegial professional development experience, teachers' instructional plans incorporated many sound ideas that reflected the mathematics reform curriculum in important ways. Moreover, the methods reflected in the teachers' plans, along with students' actual participation in the authentic assessments, appeared to enhance the problem solving of the above-grade-level students and, to a great extent, the at-grade-level students.

Nevertheless, these activities were insufficient to promote learning for the remaining pupils in these classrooms. Our findings, along with previous work involving more simple forms of assessment (e.g., Fuchs, Fuchs, Hamlett, & Stecker, 1991), suggest that teachers may require more intensive forms of support to help them use assessment information to develop instructional environments that promote learning for low-achieving students. In fact, teachers may require the kinds of extensive and long-term professional development activities incorporated within mathematics reform classrooms (cf. Prawat, 1992). More extended professional development, in which teachers have the opportunity not only for collegial collaboration but also for interaction with individuals possessing substantial knowledge about innovative instructional methods, may provide teachers with the enhanced capacity, the time, and the material resources to address a relatively unfamiliar curriculum in effective ways (McLaughlin, 1987).

Evidence supporting the need for additional teacher support is found in the PA teachers' instructional plans. Although these plans reflected impressive attempts to incorporate innovative activities, some promising approaches for guiding students in the development of problem-solving capacity were noticeably absent. For example, Brown and colleagues (1992) developed a reciprocal teaching method by which teachers use a reflection board to scaffold students' capacity to extract relevant information from stories, keep track of important quantities, draw visual representations of problems, check arithmetic facts, and estimate approximate answers and engage in sense making. The Cognition and Technology Group at Vanderbilt (1997, in press) has demonstrated the capacity to promote problem solving by having students analyze the effects of systematic variations in problem parameters and by asking students to invent solutions to a broad class of problems rather than to solve single problems.

As shown in this study, for low-achieving and for some average-achieving students, these and other innovative activities, carefully formulated to provide students with explicit guidance in developing problem-solving schemas and metacognitive awareness, may be necessary. Teachers should have access to such innovations as they attempt to plan assessments that reflect a reformed curriculum. Consequently, although classroom-based PA-driven instruction may represent one important means for driving educational reform, schools may need to supplement such instruction with more extensive professional support systems.

APPENDIX
Directions for Administration: Analogous and Related Problem-Solving Measures

"I'm [research assistant name] from [project name]. We're interested in how students work on different types of math problems. Today, your class will take a math test for us. You won't get a grade for this test in your math class or on your report card. And, on this test, there may be things you haven't learned yet. So, we don't want you to worry if the test or part of the test seems hard. We don't expect you to know how to do everything on this test. But, your teacher and I do want you to do the very best work you can do. This will help us better understand what you know about math."

[Research assistant distributes tests.]

"Please put your first and last name, your teacher's name, and the date on the lines at the top of page 1. I'll write your teacher's name and the date on the board." [Research assistant points at appropriate lines.] "Now, put your first and last name on the other page(s). Before you begin, I'll read the story and the questions to you. Read along as I read. Please do not begin to work until I finish reading and tell you to begin."

[Research assistant reads the entire test, including graphs or charts.]

"Some parts of this test may seem hard to you. You may know the answer to some, but not all, parts of the questions. That's OK. Find the parts you know how to do or think you might know how to do, and try your best. If you need help rereading the story or questions as you work, raise your hand and your teacher or I will help you. You can do the questions in any order you like. There are [number of] pages. Pull the first page off so you can read the story easily as you answer the questions." [Demonstrates.] "Are there any questions? Begin."

[Research assistant and teacher monitor students as they work. Research assistant or teacher rereads as requested by students but does not provide any other help. At the end of the administration, the research assistant says "Stop" and collects tests, being sure to get all pages and that students' names appear on every page.]


APPENDIX A
Fourth-Grade Performance Assessment

Name ________  Teacher ________  Date ________  Grade ________  Class ________  PA # ________

Field Trip

Mrs. Smith is going with her class on a field trip. There are 29 people going, including Mrs. Smith and parent drivers. Up to 5 people, including the driver, can fit into each car. Mrs. Smith called ahead to reserve parking space. The size of a reserved parking lot is 32 feet long and 30 feet wide. Each car is 10 feet long and 6 feet wide. The ticket office is 50 yards from the parking lot. Teachers (but not other adults) get in at any of the field trips for 1/2 price if they bring a group of 10 students or more. The class earned $550.00 from a book sale to pay for all field trip expenses. Mrs. Smith reminded students that they would need to buy sandwiches on the field trip because bag lunches might spoil on a hot day.

[Pictograph: Field Trip Ticket Price for 1 Person (see key), with rows for Zoo, Aquarium, and Science Museum. Key: Each symbol means $2. Only teachers get in for 1/2 the price listed on the chart. The symbol counts are not recoverable from the scan.]

[Table: Other Field Trip Expenses, listing Parking Permit for Each Car and 1 Sandwich. The prices are not reliably recoverable from the scan; the legible fragments are "89" and "$4.00".]


Name ________  Date ________

(1) Where does the class decide to go? How much will the tickets cost for the whole group?

(2) How many cars will the class need to take? Show all your work.

(3) [Text only partly legible in the scan. The question asks whether there is enough money to cover all expenses: if yes, what other things could be bought on the field trip and how much they could cost; if no, how much more money is needed.]

(4) [Text only partly legible in the scan. The question asks students to write a notice that will be sent to the parents of every student in Mrs. Smith's class, telling parents everything they need to know.]


Name ________  Date ________

(5) Draw a picture showing how the cars can be arranged in the reserved parking lot. Leave at least 4 feet between each car so that the passengers can get in and out of the cars.

[Blank grid for the student's drawing]
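The arithmetic behind Questions 2 and 5 can be sketched as follows. This is only an illustration under stated assumptions, not the study's scoring rubric: the function names are hypothetical, cars are assumed to be parked in a uniform grid aligned with the lot, and the 4-foot gap is assumed to be required between neighboring cars in both directions but not between a car and the edge of the lot.

```python
import math


def cars_needed(people, seats_per_car):
    """Smallest number of cars that seats everyone, drivers included."""
    return math.ceil(people / seats_per_car)


def fit_along(side, car_side, gap):
    """Cars that fit in a row: n cars need n*car_side + (n-1)*gap feet."""
    return (side + gap) // (car_side + gap)


def grid_capacity(lot_long, lot_wide, car_long, car_wide, gap):
    """Best of the two axis-aligned orientations for a uniform grid."""
    lengthwise = fit_along(lot_long, car_long, gap) * fit_along(lot_wide, car_wide, gap)
    crosswise = fit_along(lot_long, car_wide, gap) * fit_along(lot_wide, car_long, gap)
    return max(lengthwise, crosswise)


cars = cars_needed(29, 5)                   # 29 people, up to 5 per car
capacity = grid_capacity(32, 30, 10, 6, 4)  # lot 32 x 30 ft, car 10 x 6 ft, 4 ft gap

print(cars)              # 6
print(capacity)          # 6
print(capacity >= cars)  # True: the six cars fit under these assumptions
```

Under these assumptions the lot holds exactly the six cars the class needs (two rows of three), which suggests why the item asks students to draw the arrangement rather than simply compare areas.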


Notes


This research was supported in part by Grant H324V980001 from the U.S. Department of Education, Office of Special Education, and Core Grant HD15052 from the National Institute of Child Health and Human Development to Vanderbilt University. Statements do not reflect the position or policy of these agencies, and no official endorsement by them should be inferred.

¹We opted to rely on teacher judgments for determining above-, at-, and below-grade students because, as summarized by Hoge and Coladarci (1989), teacher judgments about student achievement are sound. Moreover, teacher judgment seemed a better choice than MCAT or Tennessee Comprehensive Assessment Program (TCAP) scores because (a) the MCAT is not a norm-referenced test and, therefore, cannot be used to sort students into above, at, and below grade level, and (b) TCAP scores reflected achievement prior to the summer break during which regression, especially in mathematics, can occur (Allinder, Fuchs, Fuchs, & Hamlett, 1992). We did, however, rerun analyses using the previous spring's TCAP scores to sort students (of course, we had to make judgments about the appropriate cut points for above-, at-, and below-grade categories), and results were analogous to those obtained when teacher judgments were used to sort the students. We nevertheless caution readers that our use of the term mathematics grade-level status should be qualified with the phrase "as reported by teachers."

²Sample sizes are small for applying parametric statistics to the teacher data (n = 8 per condition). Therefore, readers may wish to attend to effect sizes.

³To address the possibility that time limit (or stopping rule) may have differentially penalized below-grade students, we conducted post hoc analyses on another database (Fuchs, Fuchs, Eaton, Hamlett, & Karns, 2000). In that study of test accommodations, 181 high-, medium-, and low-achieving fourth-grade students completed two PAs: one with a 20-minute time limit and one with a 40-minute time limit. Analyses indicated that the increases students experienced with extended time did not differ as a function of achievement category, with effect sizes between achievement categories ranging from 0.01 to 0.07.

References
Allinder, R. M., Fuchs, L. S., Fuchs, D., & Hamlett, C. L. (1992). Differential effect of summer break on student performance in math and spelling as a function of grade level. Elementary School Journal, 92, 451-460.
Archbald, D. A., & Newmann, F. M. (1988). Beyond standardized testing: Assessing academic achievement in the secondary school. Reston, VA: National Association of Secondary School Principals.
Boaler, J. (1993). Encouraging the transfer of "school" mathematics to the "real world" through the integration of process and content, context, and culture. Educational Studies in Mathematics, 25, 341-373.
Bransford, J. D., & Schwartz, D. L. (1999). Rethinking transfer: A simple proposal with multiple implications. In A. Iran-Nejad & P. D. Pearson (Eds.), Review of research in education (pp. 61-100). Washington, DC: AERA.
Brown, A. L., Campione, J. C., Webber, L. S., & McGilly, K. (1992). Interactive learning environments: A new look at assessment and instruction. In B. R. Gifford & M. C. O'Connor (Eds.), Changing assessments: Alternative views of aptitude, achievement, and instruction (pp. 37-75). Boston: Kluwer Academic.
Brown, J. S., Collins, A., & Duguid, P. (1989). Situated cognition and the culture of learning. Educational Researcher, 18(1), 32-42.
Chi, M. T. H., Feltovich, P. J., & Glaser, R. (1981). Categorization and representation of physics problems by experts and novices. Cognitive Science, 5, 121-152.
Cognition and Technology Group at Vanderbilt. (1997). The Jasper projects: Lessons in curriculum, instruction, assessment, and professional development. Mahwah, NJ: Erlbaum.



Cognition and Technology Group at Vanderbilt. (in press). The Jasper series: A design experiment in complex problem solving. In J. Hawkins & A. Collins (Eds.), Design experiments: Integrating technologies into schools. New York: Cambridge University Press.
Cohen, D. K., & Spillane, J. P. (1992). Policy and practice: The relations between governance and instruction. In G. Grant (Ed.), Review of research in education (Vol. 18, pp. 3-49). Washington, DC: American Educational Research Association.
Cooper, G., & Sweller, J. (1987). Effects of schema acquisition and rule automation on mathematical problem-solving transfer. Journal of Educational Psychology, 79, 347-362.
Darling-Hammond, L. (1990). Achieving our goals: Superficial or structural reforms? Phi Delta Kappan, 72, 286-295.
Darling-Hammond, L., & Falk, B. (1997). Using standards and assessments to support student learning. Phi Delta Kappan, 79, 190-199.
Detterman, D. K., & Sternberg, R. J. (Eds.). (1993). Transfer on trial: Intelligence, cognition, and instruction. Norwood, NJ: Ablex.
Firestone, W. A., Mayrowetz, D., & Fairman, J. (1998). Performance-based assessment and instructional change: The effects of testing in Maine and Maryland. Educational Evaluation and Policy Analysis, 20, 95-113.
Fuchs, L. S., Fuchs, D., Eaton, S., Hamlett, C. L., & Karns, K. (2000). Supplementing teacher judgment about test accommodations with objective data sources. School Psychology Review, 29, 65-85.
Fuchs, L. S., Fuchs, D., Hamlett, C. L., Phillips, N. B., Karns, K., & Dutka, S. (1997). Enhancing students' helping behavior during peer-mediated instruction with conceptual mathematical explanations. Elementary School Journal, 97, 223-250.
Fuchs, L. S., Fuchs, D., Hamlett, C. L., & Stecker, P. M. (1991). Effects of curriculum-based measurement and consultation on teacher planning and student achievement in mathematics operations. American Educational Research Journal, 28, 617-641.
Fuchs, L. S., Fuchs, D., Karns, K., Hamlett, C. L., Dutka, S., & Katzaroff, M. (2000). The importance of providing background information on the structure and scoring of performance assessments. Applied Measurement in Education, 13, 1-34.
Gagne, R. M. (1968). Contributions of learning to human development. Psychological Review, 75, 177-191.
Gick, M. L., & Holyoak, K. J. (1983). Schema induction and analogical transfer. Cognitive Psychology, 15, 1-38.
Glaser, R. (1984). Education and thinking: The role of knowledge. American Psychologist, 39, 93-104.
Glass, G. V., McGaw, B., & Smith, M. L. (1981). Meta-analysis in social research. Beverly Hills, CA: Sage.
Goodman, J. (1995). Change without difference: School restructuring in historical perspective. Harvard Educational Review, 65, 1-28.
Goodman, N. (1978). Ways of worldmaking. Indianapolis, IN: Hackett.
Guthrie, J. W. (1991). The world's new political economy is politicizing educational evaluation. Educational Evaluation and Policy Analysis, 13, 309-321.
Hambleton, R. K., Jaeger, R. M., Koretz, D., Linn, R. L., Millman, J., & Phillips, S. (1995). Review of the measurement quality of the Kentucky Instructional Results Information System, 1991-1994. Frankfort: Office of Education Accountability, Kentucky General Assembly.
Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. Orlando, FL: Academic Press.



Hoge, R. D., & Coladarci, T. (1989). Teacher-based judgments of academic achievement: A review of the literature. Review of Educational Research, 59, 297-314.
Kansas State Board of Education. (1991). Kansas Quality Performance Accreditation. Topeka, KS: Author.
King, A. (1991). Effects of training in strategic questioning on children's problem-solving performance. Journal of Educational Psychology, 83, 307-317.
Koretz, D., Barron, S., Mitchell, K., & Stecher, B. (1996). Perceived effects of the Kentucky Instructional Results Information System (KIRIS). Santa Monica, CA: RAND.
Koretz, D., Mitchell, K., Barron, S., & Keith, S. (1996). Final report: Perceived effects of the Maryland School Performance Assessment Program. Los Angeles: University of California.
Lane, S., Liu, M., Ankenmann, R. D., & Stone, C. A. (1996). Generalizability and validity of a mathematics performance assessment. Journal of Educational Measurement, 33, 71-92.
Larkin, J. H. (1989). What kind of knowledge transfers? In L. B. Resnick (Ed.), Knowing, learning, and instruction (pp. 283-305). Hillsdale, NJ: Erlbaum.
Linn, R. L. (1991). Dimensions in thinking: Implications for testing. In B. F. Jones & L. Idol (Eds.), Educational and cognitive instruction: Implications for reform (pp. 197-208). Hillsdale, NJ: Erlbaum.
Linn, R. L. (1993). Educational assessment: Expanded expectations and challenges. Educational Evaluation and Policy Analysis, 15, 1-16.
Mayer, D. P. (1998). Do new teaching standards undermine performance on old tests? Educational Evaluation and Policy Analysis, 20, 53-73.
McDonnell, L. M., McLaughlin, M. J., & Morison, P. (Eds.). (1997). Educating one and all: Students with disabilities and standards-based reform. Washington, DC: National Academy Press.
McLaughlin, M. W. (1987). Learning from experience: Lessons from policy implementation. Educational Evaluation and Policy Analysis, 9, 171-178.
McLaughlin, M. W., Shepard, L. A., & O'Day, J. A. (1995). Improving education through standards-based reform: A report by the National Academy of Education Panel on Standards-Based Reform. Stanford, CA: National Academy of Education.
Mory, E., & Salisbury, D. (1992, April). School restructuring: The critical element of total system design. Paper presented at the annual meeting of the American Educational Research Association, San Francisco.
Murnane, R. J., & Levy, F. (1996). Teaching to new standards. In S. H. Fuhrman & J. A. O'Day (Eds.), Rewards and reform: Creating educational incentives that work (pp. 257-292). San Francisco: Jossey-Bass.
National Council of Teachers of Mathematics. (1989). Curriculum and evaluation standards for school mathematics. Reston, VA: Author.
National Research Council. (1989). Everybody counts: A report to the nation on the future of mathematics education. Washington, DC: National Academy Press.
Nickerson, R. S. (1989). New directions in educational assessment. Educational Researcher, 18(9), 3-7.
Popham, W. J. (1987). The merits of measurement-driven instruction. Phi Delta Kappan, 68, 679-682.
Prawat, R. S. (1989). Promoting access to knowledge, strategy, and disposition in students: A research synthesis. Review of Educational Research, 59, 1-41.
Prawat, R. S. (1992). Teachers' beliefs about teaching and learning: A constructivist perspective. American Journal of Education, 100, 354-395.
Reed, S. K., Ernst, A., & Banerji, R. (1974). The role of analogous solutions for



solving algebra word problems. Journal of Experimental Psychology: Learning, Memory, and Cognition, 11, 106-125.
Resnick, L. B., & Resnick, D. P. (1992). Assessing the thinking curriculum: New tools for educational reform. In B. R. Gifford & M. C. O'Connor (Eds.), Changing assessments: Alternative views of aptitude, achievement, and instruction (pp. 37-75). Boston: Kluwer Academic.
Riverside. (1995). Performance assessments for the Iowa Test of Basic Skills. Chicago: Author.
Rothman, R. (1995). Measuring up: Standards, assessments, and school reform. San Francisco: Jossey-Bass.
Salomon, G., & Perkins, D. N. (1987). Rocky roads to transfer: Rethinking mechanisms of a neglected phenomenon. Educational Psychologist, 24, 113-142.
Sammons, K. B., Kobett, B., Heiss, J., & Fennell, F. S. (1992). Linking instruction and assessment in the mathematics classroom. Arithmetic Teacher, pp. 11-15.
Schulman, L. (1987). Knowledge and teaching: Foundations of the new reform. Harvard Educational Review, 57, 1-22.
Schwartz, D. L., & Bransford, J. D. (1998). A time for telling. Cognition and Instruction, 16, 475-522.
Seaman, M. A., Levin, J. R., & Serlin, R. C. (1991). New developments in pairwise multiple comparisons: Some powerful and practical problems. Psychological Bulletin, 110, 577-586.
Shepard, L. (1991). Will national tests improve student learning? Phi Delta Kappan, 71, 232-238.
Smith, M. L. (1991a). Put to the test: The effects of external testing on teachers. Educational Researcher, 20(5), 8-11.
Smith, M. L. (1991b). Unintended consequences of external testing in elementary schools. Educational Measurement: Issues and Practice, 10(4), 7-11.
Stigler, J. W., & Hiebert, J. (1997). Understanding and improving classroom mathematics instruction. Phi Delta Kappan, 79, 14-21.
Thurlow, M. L. (1994). National and state perspectives on performance assessment and students with disabilities. Reston, VA: Council for Exceptional Children.
Tomic, W. (1995). Training in inductive reasoning and problem solving. Contemporary Educational Psychology, 20, 483-490.
Torrance, H. (1993). Combining measurement-driven instruction with authentic assessment: Some initial observations of national assessment in England and Wales. Educational Evaluation and Policy Analysis, 15, 81-90.
Wiggins, G. (1989). A true test: Toward more authentic and equitable assessment. Phi Delta Kappan, 70, 703-713.
Wilson, M. (1992). Educational leverage from a political necessity: Implications of new perspectives on student assessment for Chapter 1 evaluation. Educational Evaluation and Policy Analysis, 14, 123-144.
Woodward, J., & Baxter, J. (1997). The effects of an innovative approach to mathematics on academically low-achieving students in inclusive settings. Exceptional Children, 63, 373-388.

Manuscript received December 1997
Revision received May 1998
Accepted December 1998
