
The Long-Run Effects of Teacher Cheating on Student Outcomes

A Report for the Atlanta Public Schools

Tim R. Sass, Distinguished University Professor, Georgia State University
Jarod Apperson, Ph.D. student, Georgia State University
Carycruz Bueno, Ph.D. student, Georgia State University

May 5, 2015

Executive Summary
The manipulation of student test scores by teachers and administrators in the Atlanta Public Schools (APS) has been carefully documented, and dozens of teachers and leaders have either accepted administrative sanctions, pleaded guilty to criminal charges, or been convicted of crimes. Little is known, however, about how the falsification of test scores by teachers and administrators ("teacher cheating") has impacted students. Using a panel of individual-level data on students and teachers from APS, we investigated the effects of teacher cheating on subsequent student achievement, attendance and student behavior. Key findings include:

Of the 11,553 students who were in classrooms flagged by the Governor's Office of Student Achievement (GOSA) for high levels of wrong-to-right erasures on the 2009 CRCT exam, 5,888 (51 percent) were still enrolled in APS in fall 2014.

Not all students in classrooms where cheating occurred had their test scores manipulated equally. For both reading and English Language Arts (ELA), over one-third of students in flagged classrooms had two or fewer wrong-to-right (WTR) erasures, and nearly one-fourth of students in flagged classrooms had two or fewer WTR erasures on their math CRCT exam.

There is strong evidence that teachers were selective in their manipulation of test scores. In particular, teachers were more likely to change answers for students of lower apparent ability.

Using the frequency of erasures in 2013 as a benchmark, of the 11,553 students in flagged classrooms in 2008/09 we estimate that 7,064 (61 percent) likely had their test answers manipulated in one or more subjects on the 2009 CRCT exam. Of these 7,064 students, 3,728 were still enrolled in APS in fall 2014.

There is relatively robust evidence that manipulation of students' test answers had negative consequences for later student performance in reading and English Language Arts (ELA). The estimated impacts are equivalent to a one-time reduction in achievement of roughly one-fourth to one-half of the average annual achievement gain for middle school students. Put differently, the estimated loss in achievement is one to two times the difference in achievement between having a rookie teacher rather than a teacher with five years of experience for a single year. In contrast, the effects of teacher cheating on subsequent achievement in math are mixed.

There is little or no evidence that teacher cheating had deleterious effects on subsequent student attendance or student behavior.

1. Introduction
Much attention has been paid to the teachers and administrators caught up in the Atlanta Public Schools (APS) cheating scandal. However, little is known about how the falsification of student test scores has impacted students. While illicit behavior by teachers and administrators (henceforth "teacher cheating") obviously boosted student test scores in the short term, it is unknown what effect this cheating had on subsequent student achievement and other outcomes such as attendance, discipline and high school completion. In this report we investigate the effects of test-score manipulation on student outcomes to learn the extent to which students may have been harmed in the long run. We focus on the relationship between teacher cheating and post-cheating student test scores, attendance and behavior. In addition to measuring the impact of teacher cheating on student outcomes, we seek to quantify the number of students potentially affected by teacher cheating and determine if they are still enrolled within APS.
Specifically, this report addresses the following questions:
1. How many students were potentially impacted by teacher cheating in APS?
2. Among potentially impacted students, what is the effect of teacher cheating on test score outcomes in the long run, and does it vary across students and across subjects?
3. Were students who received inflated test scores less likely to attend school after they learned that their prior scores were incorrect?
4. How has the disciplinary behavior of potentially impacted students changed since teacher cheating was uncovered?
5. How many students who were potentially impacted are still enrolled in APS?
6. What proportion of students who were potentially impacted transferred from traditional to charter schools or left the Atlanta Public Schools system?
2. Possible Mechanisms and Related Research
The existing research related to teacher cheating has focused on the identification of cheating (Jacob and Levitt, 2003a, 2003b; van der Linden and Jeon, 2012; Kingston and Clark, 2014). There is no prior research that specifically considers the possible impacts of altered test scores on subsequent student performance. However, there are four related strands of literature: student self-esteem, grade inflation, teacher incentives, and achievement-based interventions. Each provides some possible mechanisms by which teacher cheating could impact students in the long run.

A. Self-Esteem
If a student receives an unexpectedly high test score, it is conceivable that this could increase their own perception of their academic ability and consequently boost their self-esteem. Of course, if it is later learned that the prior test score was false, any increase in self-esteem could be eliminated or even reversed. A number of studies have shown that self-esteem may have positive effects on both academic and employment outcomes. For example, Waddell (2006) finds that high school graduates with low self-esteem attain fewer years of post-secondary education, are more likely to be unemployed and have lower earnings than others in their high school graduation cohort. de Araujo and Lagos (2013) consider the simultaneous determination of self-esteem, educational attainment and earnings and find that positive self-esteem boosts wages by increasing educational attainment, but has no direct effect on wages.
In the case of manipulated test scores, any impacts of inflated standardized test scores on self-esteem would likely be muted by signals of lower performance from course grades. If teacher cheating did in fact affect student self-esteem, we would expect that student outcomes would improve (beyond the direct effect of falsification on scores) until falsification became known, and then they would decline, perhaps to levels below what would be expected prior to the initiation of cheating.
B. Grade Inflation
False test scores that result from manipulation by teachers and administrators are akin to so-called "grade inflation," where students receive grades that are greater than what might otherwise be justified by their academic performance. Babcock (2010) finds that increases in expected grades lead students to study less. Using survey data from college course evaluations, he finds average study time is reduced by half in a class where the average expected grade is an "A," relative to a class in which the average expected grade is a "C." Thus it is possible that, at least in the short term, students could react to inflated test scores by devoting less effort in school, believing that they can perform well with little effort. If this is the case, one would expect even lower scores once test answers were no longer manipulated. Any such effects on student performance may be tempered, however, if the achievement exams were considered to be less consequential than course grades by students.[1]

[1] The impact may also be dampened if achievement test results are not known until after the end of the academic year.

C. Rewards, Motivation and Teacher Effort
It is alleged that a primary cause of teacher cheating in APS was external pressure to demonstrate positive school performance. Two ways to increase test scores are to improve instruction or to cheat by providing inappropriate assistance to students before or during the exam and/or correcting wrong answers ex post. The opportunity to boost scores via cheating should reduce the payoff from improved teacher effectiveness (since otherwise low test scores would be manipulated). This in turn could result in a decrease in teacher effort and a reduction in true student learning. A key assumption for this to occur is that teachers adjust their effort to changes in incentives related to test scores. Indirect evidence on the relationship between teacher effort and incentives can be found in the literature on the effects of performance pay. While early experimental studies found little evidence that performance pay boosted teacher productivity (e.g. Springer, et al. 2010; Springer, et al. 2012; Fryer, 2013), more recent analysis by Dee and Wyckoff (2013) of a district-wide scheme operated at scale (Washington DC's IMPACT teacher accountability system) provides strong evidence that existing teachers will in fact adjust their teaching performance in response to significant incentives. Thus another mechanism by which test-score manipulation could have affected student outcomes would be through a reduction in teacher effectiveness. Once teacher cheating stopped, instructional quality should have returned to pre-manipulation norms. However, if teaching quality has persistent effects, there could be negative consequences for later student outcomes.
D. Interventions Based on Student Achievement
One of the main concerns with teacher cheating is that, because of artificially inflated test scores, students are not identified to receive remedial services, such as intervention programs, summer school or retention. If remedial programs increase student achievement, denial of these services could potentially have lasting effects on the student. However, the evidence on the efficacy of retention and summer school placement for low-achieving students is somewhat mixed, with positive effects occurring primarily in elementary school. Most of the rigorous research studies conducted thus far exploit achievement-based rules for mandating retention and/or summer school and compare outcomes for students just below the intervention threshold (who receive services) with those whose scores are just above the threshold (and thus do not receive remediation). Jacob and Lefgren (2004) find that both summer school and retention have a positive (but small) impact on student achievement for third graders and no effect for sixth graders in Chicago. Mariano and Martorell (2012) analyze a two-tier cutoff system in New York City where students who score below the cutoff on their spring assessment are assigned to summer school; if these students fall below the summer evaluation cutoff they are retained. The authors find that summer school had a small positive impact on students' 6th-grade ELA achievement if they attended for missing the ELA cutoff, but not for students who scored low on the math test. Using a same-year comparison, they find that retention in fifth grade has a large, significant and positive effect on student performance for the following two years. Matsudaira (2008) studies a mandatory summer school policy and finds positive and quantitatively substantial average effects of summer school on subsequent achievement in both math and reading. However, the estimated impacts vary substantially across grade-level and subject combinations. Winters and Greene (2012) evaluate the impacts of Florida's third-grade retention policy. Not only were students who fell below the threshold retained, they also were assigned a high-quality teacher in the retention year and were required to attend summer school. This combination of remediation efforts had substantial positive effects on achievement in multiple subjects, though the effects diminish over time.

3. Data and Background
To answer the research questions outlined above, we constructed a longitudinal data set from APS administrative data. With the assistance of APS staff we were able to collect anonymous individual-level data on enrollment, attendance, discipline, program participation (e.g. special education, subsidized lunch, English as a second language), school types, student demographics and diploma receipt. APS also provided state test results from the Criterion-Referenced Competency Test (CRCT) in grades 1-8 and high school End-of-Course Tests (EOCTs).[2] For both the CRCT and EOCTs we computed normalized scores for each grade (or course for the EOCTs) and year based on statewide means and standard deviations. All of the data were assembled into a panel covering all students enrolled in APS from the 2004/05 school year through fall of 2014.
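To make the normalization step concrete, the following is a minimal sketch (not the authors' code) of how scale scores can be converted to z-scores using statewide means and standard deviations; the column names ("scale_score", "state_mean", "state_sd") are hypothetical, and the statewide moments are assumed to have been merged onto the student-level data from published state summaries.

```python
import pandas as pd

def normalize_scores(df: pd.DataFrame) -> pd.DataFrame:
    """Convert raw scale scores to z-scores within each subject-grade-year
    cell, using statewide means and standard deviations assumed to be
    merged onto the student-level frame (column names are hypothetical)."""
    out = df.copy()
    out["normed_score"] = (out["scale_score"] - out["state_mean"]) / out["state_sd"]
    return out
```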
Another key data source is the Georgia Bureau of Investigation (GBI) report on teacher cheating in APS (Office of the Governor, 2011). The GBI investigation included information from the Governor's Office of Student Achievement's (GOSA's) analysis of erasures on the spring 2009 CRCT exam. Results of the erasure analysis were used by the GBI to select schools for detailed investigation, which included interviews with school personnel. Just over 60 percent of the district's elementary and middle schools received a detailed investigation. In over half of these schools educators confessed to cheating, and investigators concluded that systemic misconduct occurred in over three-fourths of the schools that were investigated in detail. The investigation also revealed that cheating had been going on for some time, perhaps as far back as 2001 in some schools. Interview data from the report provided valuable contextual information about how teacher cheating was carried out.
The data provided by APS include individual-level erasure data for the CRCT in school years 2008/09 through 2012/13. The erasure data from 2008/09 and 2009/10 only cover schools that were investigated by the GBI, based on high levels of wrong-to-right (WTR) erasures in one or more classrooms and other evidence of testing improprieties (henceforth "investigated schools"). Thus we have erasure data for nearly two-thirds of the district's elementary and middle schools in 2008/09, about one-fifth of elementary and middle schools in 2009/10, and then complete district-wide erasure data in 2010/11-2012/13.[3] The erasure data provided to us contain the raw number of correct answers and the number of WTR erasures. The individual-level erasure data were combined with information on the criteria used by the GBI to flag classrooms suspected of cheating.[4] Using the erasure data and the GOSA methodology, we identified both the schools that were investigated by the GBI and the flagged classrooms within the investigated schools for school years 2008/09 and 2009/10.

[2] The criterion-referenced exam was administered in grades 1-8 through the 2009/10 school year. In later years, the test was administered only in grades 3-8.
[3] Our 2008/09 erasure data cover all schools initially targeted for investigation, including two schools that did not receive a detailed investigation because initial inquiries uncovered no evidence of improprieties.
[4] Classrooms were flagged when the number of WTR erasures was greater than three standard deviations above the state mean. An adjustment was made for class size by dividing the standard deviation by the square root of the class size. The GBI reports (Office of the Governor, 2011) refer to flagged "classrooms," though they were in fact groups of students who were administered a given test by a single proctor. The test administrator was not necessarily the classroom teacher for the tested subject.
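One plausible reading of the flagging rule in footnote 4 is sketched below; it is an illustration under assumed column names ("classroom_id", "wtr"), not the GOSA or GBI code, and it treats the rule as comparing a classroom's mean WTR count against the state mean plus three class-size-adjusted standard deviations.

```python
import numpy as np
import pandas as pd

def flag_classrooms(erasures: pd.DataFrame,
                    state_mean: float,
                    state_sd: float) -> pd.DataFrame:
    """Flag proctor groups ("classrooms") whose mean WTR erasure count
    exceeds the state mean by more than three standard deviations, with
    the standard deviation divided by the square root of class size, as
    described in footnote 4. One row per student in `erasures`; the
    column names are hypothetical."""
    by_class = erasures.groupby("classroom_id")["wtr"].agg(["mean", "size"])
    threshold = state_mean + 3.0 * state_sd / np.sqrt(by_class["size"])
    by_class["flagged"] = by_class["mean"] > threshold
    return by_class.reset_index()
```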
4. Research Design and Methods

A. Identifying Cheating
A key element in the analysis of the effects of teacher cheating is identifying which students had their test scores manipulated. There are three types of teacher cheating possible. First, teachers with advanced knowledge of the exam questions and answers could have used actual test questions in their lessons and communicated the answers prior to the exam. We refer to this type of score manipulation as "ex-ante cheating." Second, teachers could have guided students to the correct answer during the exam or given students the correct answers outright during the exam. We call this "contemporaneous cheating." Third, teachers could have corrected wrong answers after students turned in their exams. We dub this "ex-post cheating."
Interviews conducted during the GBI investigation uncovered evidence that all three types of cheating occurred in APS. Teachers and other school personnel admitted they had employed a variety of methods to manipulate test scores. These included reviewing the test questions prior to test administration and prepping student responses (ex-ante cheating); positioning low- and high-ability students next to each other and allowing students to copy answers from one another during the exam, plus signaling the correct answers to students during the test (contemporaneous cheating); and filling in empty answers with correct responses or changing students' answers from wrong to right after the exam (ex-post cheating).
There are three methods that have been used in prior research to identify teacher cheating. One method, developed by Jacob and Levitt (2003a, 2003b), is to look for unusual patterns of responses, either within a single student's answer sheet or across student answer sheets. We do not have access to the answer sheets of individual students, so this approach is not an option in the current analysis.
A second approach employed by Jacob and Levitt (2003b) is based on unusual inter-temporal changes in test scores. Teacher cheating of any sort should lead to increases in test scores. Correspondingly, once cheating stops, test scores should drop to reflect students' true achievement levels. Jacob and Levitt identified students as being cheated if they experienced large increases in test scores in one year followed by modest increases or even declines in the following year. Given that cheating allegedly occurred over several years in APS, the run-up in test scores associated with cheating will not be observed for students whose test scores were always manipulated prior to the elimination of cheating. For example, if cheating was pervasive in a school and it began before a student entered first grade, then only manipulated scores would be observed prior to the end of cheating. Consequently, identifying teacher cheating based on unusual increases in test scores is problematic in our context.
As illustrated in Figure 1, we do observe a drop in test scores after teacher cheating was uncovered in 2009. Students in flagged classrooms within investigated schools in 2009 had substantially higher normalized scores (on the order of 0.5 standard deviations) than did students in non-flagged classes in 2009 or students in formerly investigated schools in 2010. Using test score drops to identify cheating creates problems for our analysis of the impacts of cheating, however. As explained below, we rely on the first score a student receives after cheating ends to control for their true ability. Since test score drops are the difference between the last manipulated score and the first true score, including both a measure of test score drops (to identify having been cheated) and the first post-cheating test score (to measure student ability) in an analysis of future achievement would be akin to using manipulated scores to predict future (true) scores. Absent complete cheating, manipulated scores will be positively related to ability. Thus the relationship between being cheated (large test score drops) and future achievement would be biased upward. We confirmed this to be the case in preliminary analyses, where we found a quantitatively large and statistically significant positive relationship between large test score drops from spring 2009 to spring 2010 and post-2010 test scores (holding constant spring 2010 test scores). We therefore do not employ test score drops to identify teacher cheating.
Rather than unusual answer patterns or large year-to-year changes in test scores, we rely on the third method of identifying cheating by teachers, counts of wrong-to-right erasures, to determine which students had their scores manipulated. In the absence of cheating, erasures of any kind should be relatively infrequent. Also, if erasures are the result of student uncertainty between two possible answers, we would expect wrong-to-right and right-to-wrong erasures to be about equally likely. One advantage of erasure analysis is that, in contrast to inter-temporal changes in test scores, high levels of wrong-to-right erasures would not result from students who are becoming sick, having a bad day or other random events that are unrelated to cheating.[5] The major disadvantage of erasure analysis, however, is that it can only identify ex-post cheating. To the extent that ex-ante cheating or contemporaneous cheating occurred, it would tend to reduce the number of WTR erasures and lead to under-identification of cheating based on erasure counts.
One way to gauge the extent of ex-ante and contemporaneous cheating (and hence the potential under-identification of cheating when WTR erasures are used to identify cheating) is to observe changes over time in initial test answers (i.e. answers given prior to erasures). If ex-ante and contemporaneous cheating occurred during the 2009 exam, then when state monitors were present in APS schools during the 2010 CRCT administration we would expect a reduction in the number of initially correct answers (prior to any erasures), relative to 2009. We approximate the initial number of correct answers by subtracting the number of WTR erasures from the total number of correct answers. We refer to this measure as the "initial right."[6] The initial-right scores could also have risen in 2010 in the absence of ex-ante and contemporaneous cheating if the exam simply became easier. We can distinguish between these hypotheses by taking into account how the CRCTs were administered. In grades 1 and 2 the exam questions and possible answers were read to students, while in grades 3-8 the students read the questions and possible answers independently. Having teachers read the questions and answers in the lower grades would make it easier for teachers to engage in ex-ante cheating in a number of ways. One method noted by the GBI is simply changing the inflection of their voice when reading the correct answer.

[5] One notable exception is cases where a student initially marks their answers in the wrong column (e.g. bubbles answer B rather than A, C rather than B, etc.), realizes their mistake, and erases their answers to correct the mistake.
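As a minimal sketch of the initial-right proxy described above (assumed column names; not the authors' code), the measure is simply the total number of correct answers less the number of WTR erasures:

```python
import pandas as pd

def add_initial_right(df: pd.DataFrame) -> pd.DataFrame:
    """Approximate the number of answers that were correct before any
    erasures: total correct minus wrong-to-right erasures. Right-to-wrong
    erasures are unobserved in the 2009 data, so they are ignored here.
    Columns 'num_correct' and 'wtr' are hypothetical names."""
    out = df.copy()
    out["initial_right"] = out["num_correct"] - out["wtr"]
    return out

# Comparing the 2009 and 2010 distributions by grade, as in Figure 2,
# might then look like (assuming 'grade' and 'year' columns):
# add_initial_right(df).groupby(["grade", "year"])["initial_right"].describe()
```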
Figure 2 provides evidence that ex-ante cheating did occur and was concentrated in grades 1 and 2. The graphs show the distribution of initial-right scores by grade for both 2009 and 2010. The sample is limited to schools that had a significant proportion of their classrooms flagged for high WTR erasure counts in both 2009 and 2010 (since we have erasure data for only such schools in those years). Thus the sample includes slightly less than 20 percent of district schools. Consistent with ex-ante cheating being easier to implement in grades 1 and 2, we see that the number of initially correct answers fell sharply in 2010 for grades 1 and 2, whereas the test score distributions in 2009 and 2010 are roughly similar for grades 3 and above.[7] Thus, although we cannot reject the possibility that some ex-ante or contemporaneous cheating occurred in higher grades, we can be more confident that WTR erasure counts are a good measure of all forms of teacher cheating in grades 3 and above. Given the likelihood of more substantial ex-ante cheating in grades 1 and 2, when employing WTR erasure counts to identify cheating we conduct separate estimates of the impact of cheating on student outcomes for grades 3-8 and for grades 1-2.
We base our erasure-count measure of cheating on the distribution of WTR erasures on the spring 2013 exam (when all evidence suggests that cheating no longer existed). Table 1 shows the 90th, 95th and 99th percentiles of the distribution of WTR erasure counts in both the last year of pervasive cheating, 2009, and the last year of available erasure data, 2013. A student was designated as having been cheated in 2009 if the number of WTR erasures on a given exam exceeded the number of WTR erasures corresponding to the 95th percentile in 2013. Thus, for example, if a student's CRCT reading exam in 2009 had more than four WTR erasures, they were classified as having been cheated.
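A minimal sketch of this classification rule, under assumed column names, is given below; pandas' default quantile interpolation may differ slightly from the percentile convention used in the report, so the cutoffs should be checked against Table 1.

```python
import pandas as pd

def classify_cheated(erasures_2009: pd.DataFrame,
                     erasures_2013: pd.DataFrame,
                     subject: str) -> pd.Series:
    """Mark a 2009 exam as "cheated" when its WTR erasure count exceeds
    the 95th percentile of the 2013 WTR distribution for the same
    subject. Both frames hold one row per student-exam with columns
    'subject' and 'wtr' (hypothetical names)."""
    cutoff = (erasures_2013
              .loc[erasures_2013["subject"] == subject, "wtr"]
              .quantile(0.95))
    in_subject = erasures_2009["subject"] == subject
    return in_subject & (erasures_2009["wtr"] > cutoff)
```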

[6] The actual number of initial correct answers will equal the total right after erasures, minus WTR erasures, plus right-to-wrong erasures. Unfortunately, we do not possess information on the number of right-to-wrong erasures in 2009. We assume that the number of right-to-wrong erasures, while not mean zero, is randomly distributed across students. Thus using the number correct less the number of WTR erasures will serve as a reasonable proxy for the number of initially correct answers for our purposes.
[7] The middle school distributions are based on only a few schools and thus are less precise.

B. Differential Cheating and Controlling for Student Ability
Prior research on teacher cheating (e.g. Jacob and Levitt, 2003a, 2003b), as well as the GBI investigation, focused on identifying classrooms where cheating occurred in order to identify which teachers engaged in illicit behavior. In contrast, the focus of our analysis is on students rather than on teachers. The extent of teacher cheating can vary across students within classrooms, i.e. the impact of a cheating teacher will not be the same for all students in the class. For example, given the time costs involved, cheating teachers could choose to only correct answers for their weaker students, who would likely have more initial wrong answers and would thus produce the greatest gains in test scores when their answers are corrected. Similarly, even if a teacher reviewed the answers of all students, more able students would have fewer wrong answers to begin with, and thus any manipulation of answers ex post would have a smaller impact on the student's score. Evidence of such within-classroom variation is provided in Table 2, which shows the distribution of WTR erasures by subject within GBI-flagged classrooms in 2008/09. For both reading and ELA, over one-third of students in flagged classrooms had two or fewer WTR erasures, and over one-fourth of students had only one or no WTR erasures on their reading exam. In math, the proportion of students with few WTR erasures was smaller, but still substantial. Nearly one-fourth of students in flagged classrooms had two or fewer WTR erasures on their math CRCT exam.
Because our interest is in evaluating how cheating impacted long-run student outcomes, we must evaluate the extent to which student ability played a role in selective cheating. If the students selected by a teacher for cheating were weaker students to begin with, a long-run achievement analysis which failed to account for such differences would be biased downward. Indeed, we find strong evidence that teachers selected which students to cheat based on their ability. Figure 3A shows the distribution of WTR erasure patterns in a cheating environment (2009 flagged classrooms) by quintile of student ability in math, where student ability is measured by 2010 CRCT scores. Figure 3B shows the distribution of WTR erasures in a normal environment (2013), broken down by measured student ability in math, where student ability is characterized by performance on the 2013 CRCT exam. In the cheating environment, the fewest erasures are for the highest-ability students (Q1), with the distributions of WTR erasures progressively skewed to the right for students of lower apparent ability. In contrast, in the non-cheating environment, the quintile graphs are nearly coincident, save for slightly fewer erasures amongst the most able group (Q1). This is to be expected since higher-ability students are likely to be more confident in their initial answers. Taken together, these results clearly suggest that teachers were more likely to change student answers for students of lower ability.
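The quintile breakdown behind Figures 3A and 3B can be sketched as follows (assumed column names, illustrative only); Q1 denotes the highest-achievement quintile, matching the figures' labeling.

```python
import pandas as pd

def wtr_by_ability_quintile(df: pd.DataFrame) -> pd.DataFrame:
    """Summarize 2009 math WTR erasure counts by quintile of 2010 math
    achievement. Columns 'normed_math_2010' and 'wtr_math_2009' are
    hypothetical names."""
    out = df.copy()
    # Highest scores get the label Q1, as in the figures.
    out["quintile"] = pd.qcut(out["normed_math_2010"].rank(method="first"),
                              5, labels=["Q5", "Q4", "Q3", "Q2", "Q1"])
    return out.groupby("quintile", observed=True)["wtr_math_2009"].describe()
```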
The challenge imposed by selective cheating can be addressed by appropriately controlling for student ability. If cheating on the 2009 exam were an isolated incident, we could rely on test scores in prior years as a measure of student ability; however, as discussed above, APS experienced widespread, long-term cheating, which renders pre-2009 results unreliable. Two other alternatives exist. One is to rely on our 2009 estimate of initial right answers. In the case of ex-post cheating, this method has the appeal of measuring ability immediately prior to the treatment. However, because we are aware that some ex-ante and contemporaneous cheating existed, reliance on the initial-right scores could lead to bias in our estimates.[8] A second option is to rely on 2010 test scores as a measure of ability. The appeal of this option is that we have greater confidence in its accuracy, as the test observers led to a significantly lower incidence of cheating in 2010. However, the downside to this measure is that it limits the mechanisms we are able to evaluate. Our analysis can still measure the effects of students realizing they were cheated. However, we can no longer evaluate the effect of potential self-esteem boosts occurring in the 2009/10 school year as a result of 2008/09 cheating.
C. Choosing a Comparison Group
We are interested in the causal impact of teacher cheating on subsequent student outcomes. Put differently, how would a student have fared if their scores had not been manipulated? One possibility is to compare outcomes for a student before their scores were manipulated to their post-cheating outcomes. Such a within-student analysis would hold any time-invariant student characteristics constant. For outcomes that are observed in each period, such as attendance or disciplinary incidents, this could be accomplished by estimating models of student outcomes that include indicators for the cheated and post-cheated periods along with student fixed effects that create a separate baseline for each student. Unfortunately, as discussed above, in many instances teacher cheating occurred across multiple classrooms within a school over several years, meaning that for many students we never observe pre-cheating outcomes. Further, for one-time outcomes such as high school graduation, it is not possible to establish a within-student pre-cheating baseline, making this approach infeasible.
The other alternative is to compare outcomes for students who were cheated with those who were not cheated. This is potentially problematic, however, since non-cheated students may be different from cheated students in ways that are correlated with student outcomes, and thus we could falsely attribute to cheating differences in outcomes which might be caused by other factors. For example, many schools in APS were not investigated by the GBI because there were no classrooms flagged for having high levels of WTR erasures within the school, or there was only an isolated flagged classroom with no corroborating evidence of teacher cheating. These non-investigated schools tended to serve much lower proportions of students from low-income families than did investigated schools. While there were a number of non-investigated schools serving student populations with similar demographics to those of investigated schools, the fact that no cheating was uncovered in these schools may be indicative of differences in school leadership, school culture, or average teacher quality. Within-school comparisons can be made by estimating models of student outcomes that include school fixed effects. However, even within investigated schools, sorting of students across teachers (either due to parental preferences for teachers, teacher preferences for students or ability tracking of students) could result in significant differences in the underlying characteristics of students who attended classrooms where teacher cheating occurred vis-à-vis students attending non-cheating classrooms in the same school. It is also possible to compare cheated and non-cheated students in the same classroom by including classroom fixed effects, though non-cheated students are likely to be of higher ability than cheated students since there is less for a teacher to gain from manipulating already-high scores. Also, within-classroom comparisons would not capture any cheating behavior that affects all students within a classroom equally, such as a reduction in teacher effort. Further, making within-classroom comparisons reduces the number of students being directly compared and will therefore diminish the precision of any estimated effects.

[8] Another problem with using the initial right is that questions vary in difficulty, and thus the raw number of initial-right answers does not directly map into the true scale score for a student.
For the cross-student comparisons we take a number of steps to minimize any bias associated with the non-random exposure of students to teacher cheating. First, we control for a variety of observable student characteristics, including gender, race/ethnicity, free/reduced-price lunch status, gifted status, limited English proficiency and disability status. In addition, we control for unmeasured ability by including the first non-cheated test score. For students who had been continuously cheated, the first non-manipulated score would be the revelation of true achievement for a student. As such, the score would not be influenced by potential loss in self-esteem. However, the first non-cheating score could be a downwardly biased measure of true ability if past teacher cheating led to either reduced teacher or student effort and achievement has persistent effects over time.
D. Econometric Model
The decisions described above regarding the issues of identifying cheating, differential cheating by teachers, controlling for student ability and constructing a valid comparison group lead us to the following empirical model of student achievement:

A_{i,2009+t} = δ·Cheat_{i,2009} + λ·A_{i,2010} + X_{i,2009+t}β + ε_{i,2009+t},    (1)

where A_{i,2009+t} is the achievement level for student i, t years after 2009, Cheat_{i,2009} equals one if the individual was cheated in 2009 and zero otherwise, A_{i,2010} is the student's 2010 normalized test score (the ability control), X_{i,2009+t} is a vector of student- and family-specific characteristics such as race, gender, lunch status, gifted status, Limited English Proficiency and Special Education status, and ε_{i,2009+t} is the idiosyncratic error term. The main parameter of interest is δ. We estimate three variants of the model: one with no fixed effects (as specified above), a second with school fixed effects that produce within-school comparisons (i.e. with an additional term φ_k, where k indexes schools) and a third with classroom-level fixed effects that produce within-classroom comparisons (i.e. with an additional term θ_j, where j indexes classrooms).
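The following is a minimal sketch of how the three variants could be estimated with an off-the-shelf regression package; it is not the authors' code, the variable names are hypothetical, and school and classroom fixed effects are entered as dummy sets. Standard errors are clustered at the classroom level, as in Tables 8A-8C.

```python
import pandas as pd
import statsmodels.formula.api as smf

def estimate_cheating_effects(df: pd.DataFrame) -> dict:
    """Estimate equation (1) with no fixed effects, school fixed effects,
    and classroom fixed effects. `df` holds one row per student with
    hypothetical columns: normed_score, cheated (0/1), score_2010,
    female, race, frl, gifted, lep, sped, school_id, classroom_id."""
    controls = ("score_2010 + C(female) + C(race) + C(frl) + "
                "C(gifted) + C(lep) + C(sped)")
    specs = {
        "no_fe":        f"normed_score ~ cheated + {controls}",
        "school_fe":    f"normed_score ~ cheated + {controls} + C(school_id)",
        "classroom_fe": f"normed_score ~ cheated + {controls} + C(classroom_id)",
    }
    return {
        name: smf.ols(formula, data=df).fit(
            cov_type="cluster", cov_kwds={"groups": df["classroom_id"]})
        for name, formula in specs.items()
    }

# The coefficient of interest is then results[name].params["cheated"],
# with its clustered standard error in results[name].bse["cheated"].
```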


5. Results

A. Descriptive Statistics
In Table 3 we provide descriptive statistics for students, broken down by school type of enrollment in 2008/09. Investigated schools enrolled over one-fourth of the student population. Relative to non-investigated schools, investigated schools serve a higher proportion of minority students, fewer gifted students and a larger fraction of students from low-income families (as measured by free and reduced-price lunch eligibility).
In Table 4 we present a tabulation of the number of students who were enrolled in flagged classrooms in 2008/09 by their continued enrollment in APS in subsequent years, broken down by their grade level in 2008/09. It is clear that the vast majority of flagged classrooms were in elementary schools (grades 1-5). Further, while there was some attrition after the true scores were revealed in 2010, there was not a large exodus of students from the district. Few of the students who were in grades 7 and 8 in 2008/09 remain in APS in fall 2014, since their cohorts would have graduated (assuming normal progress and on-time graduation at the end of 12th grade). For students who were in grades 1-8 in 2008/09, about 50-60 percent of them remain in APS in fall 2014. The proportion remaining tends to be higher for younger cohorts (around 60 percent for the 2008/09 1st-grade cohort) than for older cohorts (slightly less than 50 percent for the 2008/09 6th-grade cohort), which is likely due to higher attrition as a result of dropouts in high school. A total of 5,888 students who were in classrooms flagged for extraordinarily high WTR erasures based on 2009 test results remain in APS in fall 2014.
Table 5 presents a breakdown of enrollment in APS traditional and charter schools by year for those students who were in a flagged classroom in 2008/09. There is some evidence that the cheating scandal led to a movement away from traditional public schools and into the charter sector. There are fairly large increases in enrollment in charter schools in the two years after the revelation of teacher cheating, 2009/10 and 2010/11. After the 2010/11 school year the absolute enrollment in charters stabilizes, though the relative proportion continues to increase.
Table 6 provides a tabulation of the fall 2014 grade level for students who were in a flagged classroom in 2008/09, by their 2008/09 grade level. For each 2008/09 cohort, the majority of students are on track, enrolled in a grade that is six levels above their enrollment grade in 2008/09. Of the 5,888 who were in classrooms flagged by the GBI in 2008/09 and are enrolled in an APS school in fall 2014, a bit less than half (2,477 of 5,888, or 42 percent) have not yet reached high school.
Table 7 presents a tabulation of enrollment in APS by year by the level of WTR erasures on the 2009 CRCT. The first row indicates enrollment by students defined as "cheated" based on having more WTR erasures on the 2009 CRCT in any of three subjects (reading, math or ELA) than the 95th percentile of WTR erasures in a non-cheating environment (the 2013 CRCT). Based on this definition, 7,064 students had their exam answers manipulated; 3,728 of these students were still attending APS schools in fall 2014. The second and third rows provide student enrollment counts for alternative benchmarks: 5 or more WTR erasures on any of the three subject exams, or 10 or more erasures on any of the three subject exams.
B. Estimated Effects of Teacher Cheating on Student Outcomes
Our primary results are presented in Tables 8A-8C. The tables report estimates of δ in equation (1), which represents the relationship between being a cheated student (i.e. having a number of WTR erasures in 2009 that exceeds the 95th percentile for WTR erasures in 2013) and subsequent normed test scores, holding constant observable student characteristics and the student's 2010 normalized test score. Each row represents estimates from a specification with varying levels of fixed effects. Each column represents the estimated effect on achievement in a different year. Results are presented for students in grades 1 and 2 in 2008/09 (where contemporaneous teacher cheating was likely more prevalent), grades 3-6 in 2008/09, and all grades 1-6 together.
In Table 8A, which reports results for math, we find consistent negative effects of being cheated on future test scores for students who were in grades 1 and 2 in 2008/09. The effects range from 0.06 to 0.10 standard deviations and are fairly constant over time. This is on par with the effect of having a rookie teacher rather than one with five years of experience (Clotfelter, et al., 2006). Alternatively, the effect is equivalent to 18 to 29 percent of the average annual learning gain in math for middle school students of 0.34 (Lipsey, et al., 2012). The model that includes classroom fixed effects, that is, the one that compares students within a classroom, produces the largest effects. However, the results do not change dramatically with school fixed effects or even with no fixed effects at all. Results for students who were in grades 3-6 in 2008/09 are quite different. Save for one positive and significant coefficient, the remaining 11 estimates are all insignificantly different from zero. When all students are grouped together, the point estimates are all negative, but only estimates for 2013/14 are significantly different from zero across all three specifications. For 2013/14 the negative effect of having been cheated in 2008/09 is estimated to be in the range of 0.04 to 0.07 standard deviations, or about 12 to 21 percent of annual achievement gains for middle school students in math.
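As an arithmetic check on the conversions in the preceding paragraph, dividing the estimated effects by the 0.34 standard deviation average annual math gain reported by Lipsey et al. (2012) gives:

```latex
\frac{0.06}{0.34} \approx 18\%, \qquad \frac{0.10}{0.34} \approx 29\%, \qquad
\frac{0.04}{0.34} \approx 12\%, \qquad \frac{0.07}{0.34} \approx 21\%
```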
In contrast to math, results for reading achievement, which are displayed in Table 8B, are almost uniformly negative and statistically significant. The estimated magnitudes of the effect of being cheated for students who were in grades 1 and 2 are substantially larger than the size of those for students who were in grades 3-6 in 2008/09. When all grades are combined, the estimated effect of being cheated on reading test scores in future years falls in the range of 0.06 to 0.14, which is equivalent to 22 to 52 percent of annual learning gains in reading for middle school students (which equal 0.27). The largest estimates once again come from the models which make within-classroom comparisons, i.e. the models with classroom fixed effects.
Estimates for ELA, presented in Table 8C, are similar to those for reading. Estimated impacts of being cheated are almost always negative and statistically significant. Results for grades 1 and 2 are larger in magnitude than for students who were in grades 3-6 during the last year of widespread teacher cheating, 2008/09. Combining all grades, the estimated impact of being cheated on student achievement is in the range of 0.06 to 0.12 standard deviations.
In addition to estimating effects on later student achievement, we also estimated versions of equation (1) that replaced future test scores with measures of attendance and behavior. For these alternative outcomes we define being cheated as having unusually high WTR erasures (greater than the 95th percentile of the 2013 distribution) in either math, reading or ELA. Similarly, rather than include only a single-subject 2010 test score as a control for student ability, we include 2010 scores in math, reading and ELA. We also include 2009 values of the outcome variable (attendance or behavior) as a control. Finally, in the equation estimating the number of disciplinary incidents we include grade-level indicators as controls, since disciplinary incidents tend to be much higher in middle and high school than in elementary school. Otherwise, the specifications are identical to those for the analysis of test scores.
Table 9 presents estimates of the impact of being cheated on the percent of days in attendance in each of the years 2010/11 through 2013/14. In 2010/11 (the first year following the revelation of true scores) there appears to be no effect; all of the estimated coefficients are positive and statistically insignificant. However, in later years the point estimates are almost always negative and often are significantly different from zero. It is important to remember that in the latter years students who were in tested grades (1-8) in 2008/09 are mostly all in middle and high school, where absenteeism may be more directly a matter of student choice. However, even if we focus on the results that are statistically significantly different from zero, the estimates are quite small in magnitude, on the order of 0.3 to 0.8 percentage points, or about 1 school day out of the typical 180-day school year.
Finally, in Table 10, we present estimates of the impact of being cheated in 2008/09 on the number of disciplinary incidents in years 2010/11 through 2013/14. There appears to be little in the way of effects of teacher cheating on subsequent student behavior. While point estimates are mostly positive, in only one case are they significantly different from zero at standard confidence levels.
6. Summary and Conclusions
Much effort has been devoted to identifying the teachers and administrators responsible for manipulating test scores in APS and bringing those responsible to justice. At the same time, little is known about the victims of the cheating scandal. This report represents the first attempt to rigorously analyze the impact of teacher cheating on the long-run outcomes of students.
In conducting the analysis we faced several challenges, including identifying which students were cheated, deriving a measure of the students' true ability and coming up with a reasonable counterfactual group for comparison. Given the available data and the history of test manipulation, we chose to identify cheated students as those who had high numbers of wrong-to-right erasures on the 2009 CRCT exam and used their 2010 CRCT scores as a measure of their true ability. Based on these choices, we estimated the effect of being cheated on later test scores, attendance and behavior of students. Comparisons were made between cheated students and non-cheated students generally, between cheated and non-cheated students who attended the same school in 2008/09, and between cheated and non-cheated students within the same classroom in 2008/09.
Our results indicate that being cheated had negative consequences for later student performance in both reading and ELA. The estimated impacts are in the range of one-fourth to one-half of the average annual achievement gain for a middle school student. This is equivalent to one to two times the difference between having a rookie teacher and one with 5 or more years of experience in a single year. In contrast, the effects of teacher cheating on subsequent achievement in math are mixed. There is little evidence that teacher cheating had any deleterious effects on subsequent student attendance or student behavior. Any impacts that may have occurred were very small.


References
Babcock, Philip (2010). "Real Costs of Nominal Grade Inflation? New Evidence from Student Course Evaluations," Economic Inquiry, 48(4): 983-996.
de Araujo, Pedro and Stephen Lagos (2013). "Self-Esteem, Education and Wages Revisited," Journal of Economic Psychology, 34: 120-132.
Clotfelter, Charles T., Helen F. Ladd, and Jacob L. Vigdor (2006). "Teacher-Student Matching and the Assessment of Teacher Effectiveness," The Journal of Human Resources, 41(4): 778-820.
Dee, Thomas and James Wyckoff (2013). "Incentives, Selection and Teacher Performance: Evidence from IMPACT." Washington DC: National Bureau of Economic Research, NBER Working Paper No. 19529.
Fryer, Roland G. (2013). "Teacher Incentives and Student Achievement: Evidence from New York City Public Schools," Journal of Labor Economics, 31(2): 373-407.
Jacob, Brian A. and Steven D. Levitt (2003a). "Catching Cheating Teachers: The Results of an Unusual Experiment in Implementing Theory," Brookings-Wharton Papers on Urban Affairs, 2003: 185-220.
Jacob, Brian A. and Steven D. Levitt (2003b). "Rotten Apples: An Investigation of the Prevalence and Predictors of Teacher Cheating," Quarterly Journal of Economics, 118(3): 843-877.
Jacob, Brian A. and Lars Lefgren (2004). "Remedial Education and Student Achievement: A Regression-Discontinuity Analysis," Review of Economics and Statistics, 86(1): 226-244.
Kingston, Neal M. and Amy K. Clark (eds.) (2014). Test Fraud: Statistical Detection and Methodology. New York: Routledge.
Lipsey, Mark W., Kelly Puzio, Cathy Yun, Michael A. Hebert, Kasia Steinka-Fry, Mikel W. Cole, Megan Roberts, Karen S. Anthony, and Matthew D. Busick (2012). Translating the Statistical Representation of the Effects of Education Interventions into More Readily Interpretable Forms. Washington, DC: National Center for Special Education Research, Institute of Education Sciences, U.S. Department of Education.
Mariano, Louis T. and Paco Martorell (2012). "The Academic Effects of Summer Instruction and Retention in New York City," Educational Evaluation and Policy Analysis, 35(1): 96-117.
Matsudaira, Jordan D. (2008). "Mandatory Summer School and Student Achievement," Journal of Econometrics, 142(2): 829-850.
Springer, Matthew, J.R. Lockwood, Dale Ballou, Daniel F. McCaffrey, Laura Hamilton, Matthew Pepper, Vi-Nhuan Le and Brian Stecher (2010). Teacher Pay for Performance: Experimental Evidence from the Project on Incentives in Teaching. Nashville, TN: National Center on Performance Incentives.
Springer, Matthew G., John F. Pane, Vi-Nhuan Le, Daniel F. McCaffrey, Susan Freeman Burns, Laura S. Hamilton and Brian Stecher (2012). "Team Pay for Performance: Experimental Evidence From the Round Rock Pilot Project on Team Incentives," Educational Evaluation and Policy Analysis, 34(4): 367-390.
van der Linden, Wim J. and Minjeong Jeon (2012). "Modeling Answer Changes on Test Items," Journal of Educational and Behavioral Statistics, 37(1): 180-199.
Waddell, Glen R. (2006). "Labor-Market Consequences of Poor Attitude and Low Self-Esteem in Youth," Economic Inquiry, 44(1): 69-97.
Winters, Marcus and Jay P. Greene (2012). "The Medium-Run Effects of Florida's Test-Based Promotion Policy," Education Finance and Policy, 7(3): 305-330.


Figure 1: Distribution of Normalized Scores by Subject for 2009 Investigated Schools
[Figure: density plots of normed Reading, Math and ELA scores, in standard-deviation units, comparing 2009 flagged classes, 2009 non-flagged classes and 2010 classes.]

Figure 2: Distributions of the Number of Initial Right Answers by Grade, 2009 and 2010
[Figure: density plots of the number of questions initially correct, by grade (grades 1-8), comparing 2009 with 2010, for schools investigated in both 2009 and 2010.]

Figure 3A: Distribution of 2009 Wrong-to-Right Erasures by 2010 Achievement Quintile in Math (Students in a flagged classroom in 2008/09 and not in a flagged classroom in 2009/10)
[Figure: distributions of wrong-to-right erasure counts for achievement quintiles Q1 through Q5.]

Figure 3B: Distribution of 2013 Wrong-to-Right Erasures by 2013 Achievement Quintile in Math (Students who attended a school that was flagged in 2008/09)
[Figure: distributions of wrong-to-right erasure counts for achievement quintiles Q1 through Q5.]

Table 1. 90th, 95th and 99th Percentiles of the Wrong-to-Right Erasure Count on 2009 and 2013 CRCT Exams in Reading, ELA and Math

                        2009 CRCT                2013 CRCT
                  Reading   ELA   Math     Reading   ELA   Math
90th Percentile      8       8     12         3       3      4
95th Percentile     12      12     16         4       4      5
99th Percentile     19      19     26         6       8      9

Table 2. Wrong-to-Right Erasure Count Frequencies for CRCT Reading, ELA and Math in Flagged Classrooms in Investigated Schools in 2008/09

No. of WTR        Reading             ELA                Math
Erasures       Number  Percent    Number  Percent    Number  Percent
0               1,038   12.23        865   10.50        659    7.28
1               1,113   13.11        945   11.47        791    8.73
2               1,061   12.50        978   11.87        800    8.83
3                 917   10.80        876   10.63        857    9.46
4                 741    8.73        796    9.66        757    8.36
5                 621    7.32        663    8.05        725    8.01
6                 523    6.16        583    7.08        577    6.37
7                 412    4.85        449    5.45        523    5.78
8                 349    4.11        374    4.54        492    5.43
9                 281    3.31        313    3.80        419    4.63
10                243    2.86        259    3.14        332    3.67
11                214    2.52        192    2.33        286    3.16
12                188    2.22        165    2.00        252    2.78
13                150    1.77        146    1.77        208    2.30
14                118    1.39        127    1.54        190    2.10
15                 99    1.17         99    1.20        167    1.84
More than 15      419    4.94        410    4.98      1,021   11.27
Total           8,487  100.00      8,240  100.00      9,056  100.00

Table 3. Summary Statistics for APS Schools by Investigated School Status in 2008/09

                                  All Schools     Investigated Schools   Non-Investigated Schools
                                 Mean     S.D.        Mean     S.D.           Mean     S.D.
Black                            0.83     0.37        0.95     0.22           0.77     0.42
White                            0.10     0.29        0.01     0.07           0.14     0.35
Hispanic                         0.05     0.21        0.03     0.18           0.05     0.23
Female                           0.50     0.50        0.50     0.50           0.50     0.50
Special Education                0.10     0.30        0.09     0.29           0.10     0.30
Gifted                           0.08     0.28        0.06     0.23           0.10     0.29
Gifted Served                    0.08     0.27        0.06     0.23           0.09     0.28
Early Intervention Program       0.15     0.35        0.25     0.43           0.10     0.30
Limited English Proficiency      0.00     0.07        0.00     0.07           0.00     0.07
Free and Reduced Lunch           0.58     0.49        0.77     0.42           0.49     0.50
Observations                   54,356               18,542                  35,814

Table 4. Enrollment in APS by Year by 2008/09 Grade Level for Students in One or More Flagged Classrooms in 2008/09

Grade Level                        Enrollment in APS by Year
in 2008/09   2008/09  2009/10  2010/11  2011/12  2012/13  2013/14  Fall 2014
1               2068     1901     1709     1581     1465     1310       1210
2               2018     1837     1647     1516     1332     1238       1154
3               2001     1829     1642     1401     1358     1303       1167
4               1822     1685     1476     1375     1307     1232       1084
5               1611     1383     1278     1197     1106     1046        895
6                706      642      582      535      486      412        338
7                588      538      465      423      370      297         29
8                739      664      603      525      454       46         11
Total          11553    10479     9402     8553     7878     6884       5888

Table 5. Enrollment in APS by Year and School Type for Students in One or More Flagged Classrooms in 2008/09

School                             Enrollment in APS by Year
Type         2008/09  2009/10  2010/11  2011/12  2012/13  2013/14  Fall 2014
Traditional    11285    10038     8796     7898     7187     6203       5274
Charter          268      441      606      655      691      681        614
Total          11553    10479     9402     8553     7878     6884       5888

Table 6. Enrollment in APS by Grade Level in Fall 2014 by 2008/09 Grade Level for Students in One or More Flagged Classrooms in 2008/09

Grade Level           Enrollment in APS in Fall 2014 by Grade Level
in 2008/09      5      6      7      8      9     10     11     12   Total
1              11    197   1000      1      1      0      0      0    1210
2               0      8    153    992      1      0      0      0    1154
3               0      0      5    101   1058      3      0      0    1167
4               0      0      1      7    491    580      5      0    1084
5               0      0      0      1    120    264    508      2     895
6               0      0      0      0     13     30     69    226     338
7               0      0      0      0      1      3      9     16      29
8               0      0      0      0      0      0      1     10      11
Total          11    205   1159   1102   1685    880    592    254    5888

Table 7. Enrollment in APS by Year by Erasure Counts in 2009 for Students in Flagged Classrooms in 2008/09

2009 WTR Erasure                      Enrollment in APS by Year
Measure                   2008/09  2009/10  2010/11  2011/12  2012/13  2013/14  Fall 2014
>95th Percentile of WTR
Erasures in 2013 in
Any Subject                  7064     6437     5810     5268     4872     4340       3728
5 or More WTR Erasures
in Any Subject               7125     6485     5852     5313     4901     4348       3728
10 or More WTR Erasures
in Any Subject               3473     3150     2877     2611     2421     2156       1825
All Students in Flagged
Classrooms                  11553    10479     9402     8553     7878     6884       5888

Table 8A: Estimated Effect of Being Cheated in 2008/09 on Normed Math Test Scores in 2010/11-2013/14 by 2008/09 Grade Level

Grades 1 & 2
Model                      2010/11     2011/12     2012/13     2013/14
No Fixed Effects          0.0686**    0.0680**    0.0657**    0.0830**
                          (0.0206)    (0.0218)    (0.0241)    (0.0233)
School Fixed Effects      0.0610**    0.0648**    0.0612*     0.0688**
                          (0.0195)    (0.0212)    (0.0243)    (0.0224)
Classroom Fixed Effects   0.0882**    0.0622*     0.0997**    0.0923**
                          (0.0234)    (0.0307)    (0.0296)    (0.0256)

Grades 3-6
Model                      2010/11     2011/12     2012/13     2013/14
No Fixed Effects          0.0053      0.0160      0.0290      0.0111
                          (0.0179)    (0.0168)    (0.0229)    (0.0341)
School Fixed Effects      0.0002      0.0204      0.0433*     0.0061
                          (0.0150)    (0.0153)    (0.0219)    (0.0328)
Classroom Fixed Effects   0.0088      0.0096      0.0056      0.0188
                          (0.0211)    (0.0209)    (0.0297)    (0.0446)

Grades 1-6
Model                      2010/11     2011/12     2012/13     2013/14
No Fixed Effects          0.0192      0.0209      0.0184      0.0555**
                          (0.0137)    (0.0134)    (0.0167)    (0.0195)
School Fixed Effects      0.0154      0.0146      0.0110      0.0405*
                          (0.0123)    (0.0125)    (0.0161)    (0.0184)
Classroom Fixed Effects   0.0405*     0.0364*     0.0351      0.0717**
                          (0.0159)    (0.0163)    (0.0215)    (0.0247)

Standard errors clustered at the classroom level in parentheses. * Significant at the 5% level, ** significant at the 1% level. Note: 2012, 2013, and 2014 results are restricted to grades 1-5, 1-4, and 1-3, respectively.

Table 8B: Estimated Effect of Being Cheated in 2008/09 on Normed Reading Test Scores in 2010/11-2013/14 by 2008/09 Grade Level

Grades 1 & 2
Model                      2010/11     2011/12     2012/13     2013/14
No Fixed Effects          0.1419**    0.1955**    0.1858**    0.1863**
                          (0.0269)    (0.0323)    (0.0371)    (0.0401)
School Fixed Effects      0.1449**    0.1919**    0.1775**    0.1705**
                          (0.0273)    (0.0315)    (0.0370)    (0.0399)
Classroom Fixed Effects   0.1888**    0.2195**    0.2536**    0.2164**
                          (0.0337)    (0.0377)    (0.0462)    (0.0515)

Grades 3-6
Model                      2010/11     2011/12     2012/13     2013/14
No Fixed Effects          0.0701**    0.0762**    0.0491*     0.0374
                          (0.0169)    (0.0196)    (0.0239)    (0.0373)
School Fixed Effects      0.0681**    0.0685**    0.0440      0.0355
                          (0.0166)    (0.0193)    (0.0235)    (0.0365)
Classroom Fixed Effects   0.0977**    0.1045**    0.0664*     0.0326
                          (0.0229)    (0.0266)    (0.0315)    (0.0527)

Grades 1-6
Model                      2010/11     2011/12     2012/13     2013/14
No Fixed Effects          0.0798**    0.0936**    0.0612**    0.1076**
                          (0.0142)    (0.0165)    (0.0205)    (0.0268)
School Fixed Effects      0.0782**    0.0978**    0.0633**    0.0932**
                          (0.0142)    (0.0164)    (0.0204)    (0.0268)
Classroom Fixed Effects   0.1263**    0.1409**    0.1386**    0.1334**
                          (0.0190)    (0.0217)    (0.0271)    (0.0370)

Standard errors clustered at the classroom level in parentheses. * Significant at the 5% level, ** significant at the 1% level. Note: 2012, 2013, and 2014 results are restricted to grades 1-5, 1-4, and 1-3, respectively.

Table 8C: Estimated Effect of Being Cheated in 2008/09 on Normed ELA Test Scores in 2010/11-2013/14 by 2008/09 Grade Level

Grades 1 & 2
Model                      2010/11     2011/12     2012/13     2013/14
No Fixed Effects          0.1119**    0.1053**    0.1078**    0.1326**
                          (0.0210)    (0.0252)    (0.0260)    (0.0258)
School Fixed Effects      0.1051**    0.1013**    0.0984**    0.1224**
                          (0.0205)    (0.0241)    (0.0250)    (0.0252)
Classroom Fixed Effects   0.1421**    0.1191**    0.1407**    0.1503**
                          (0.0268)    (0.0300)    (0.0351)    (0.0316)

Grades 3-6
Model                      2010/11     2011/12     2012/13     2013/14
No Fixed Effects          0.0380*     0.0707**    0.0365      0.0578
                          (0.0167)    (0.0186)    (0.0236)    (0.0338)
School Fixed Effects      0.0406**    0.0669**    0.0260      0.0644
                          (0.0159)    (0.0184)    (0.0233)    (0.0335)
Classroom Fixed Effects   0.0426*     0.0862**    0.0319      0.0567
                          (0.0210)    (0.0236)    (0.0309)    (0.0425)

Grades 1-6
Model                      2010/11     2011/12     2012/13     2013/14
No Fixed Effects          0.0668**    0.0887**    0.0664**    0.1145**
                          (0.0131)    (0.0150)    (0.0176)    (0.0204)
School Fixed Effects      0.0645**    0.0828**    0.0603**    0.1090**
                          (0.0127)    (0.0145)    (0.0173)    (0.0201)
Classroom Fixed Effects   0.0828**    0.1020**    0.0824**    0.1190**
                          (0.0165)    (0.0185)    (0.0237)    (0.0253)

Standard errors clustered at the classroom level in parentheses. * Significant at the 5% level, ** significant at the 1% level. Note: 2012, 2013, and 2014 results are restricted to grades 1-5, 1-4, and 1-3, respectively.

Table 9. Estimated Effect of Being Cheated in 2008/09 on Percent Attendance, 2010/11-2013/14

Model                      2010/11     2011/12     2012/13     2013/14
No Fixed Effects          0.0781      0.2632      0.3145      0.7764**
                          (0.1124)    (0.1835)    (0.2299)    (0.2627)
School Fixed Effects      0.0021      0.4200**    0.3878*     0.6880**
                          (0.0916)    (0.1482)    (0.1959)    (0.2468)
Classroom Fixed Effects   0.0200      0.0781      0.0603      0.2992
                          (0.1045)    (0.1816)    (0.2166)    (0.2850)

Standard errors clustered at the classroom level in parentheses. * Significant at the 5% level, ** significant at the 1% level.

Table 10. Estimated Effect of Being Cheated in 2008/09 on Discipline Incidents, 2010/11-2013/14

Model                      2010/11     2011/12     2012/13     2013/14
No Fixed Effects          0.0070      0.0098      0.1149*     0.0677
                          (0.0408)    (0.0634)    (0.0549)    (0.0539)
School Fixed Effects      0.0138      0.0847      0.0038      0.0441
                          (0.0382)    (0.0597)    (0.0561)    (0.0583)
Classroom Fixed Effects   0.0054      0.0629      0.0505      0.0232
                          (0.0463)    (0.0770)    (0.0733)    (0.0749)

Standard errors clustered at the classroom level in parentheses. * Significant at the 5% level, ** significant at the 1% level.

