


Feedback for Better Teaching
Nine Principles for Using Measures of Effective Teaching

ABOUT THIS DOCUMENT: This brief highlights a set of guiding principles from the Bill & Melinda Gates Foundation to inform the design and implementation of high-quality teacher support and evaluation systems based on three years of work by the Measures of Effective Teaching (MET) project, its partners, and other leading school systems and organizations. More in-depth discussion of the MET project’s analyses to date may be found in the project’s research reports and non-technical briefs at www.metproject.

ABOUT THE MET PROJECT: The MET project is a research partnership of academics, teachers, and education organizations committed to investigating better ways to identify and develop effective teaching. Funding is provided by the Bill & Melinda Gates Foundation.

The approximately 3,000 MET project teachers who volunteered to open up their classrooms for this work are from the following districts: the Charlotte-Mecklenburg Schools, the Dallas Independent Schools, the Denver Public Schools, the Hillsborough County Public Schools, the Memphis Public Schools, the New York City Schools, and the Pittsburgh Public Schools.

Partners include representatives of the following institutions and organizations: American Institutes for Research, Cambridge Education, University of Chicago, The Danielson, Dartmouth College, Educational Testing Service, Empirical Education, Harvard University, National Board for Professional Teaching Standards, National Math and Science Initiative, New Teacher Center, University of Michigan, RAND, Rutgers University, University of Southern California, Stanford University, Teachscape, University of Texas, University of Virginia, University of Washington, and Westat.

January 2013

Creating the Conditions for Success

Teachers want to succeed, but they typically lack the conditions for success. Teachers generally work in isolation. What guidance they get often is plagued by vague teaching standards, overly numerous and often trivial learning objectives, and testing systems that measure only some of the outcomes that educators value for students. They’re on their own to adjust practice to better serve students. Success itself remains ill defined.

The partners in the Measures of Effective Teaching (MET) project, a group of thoughtful individuals in districts, unions, schools, research institutions, and technical organizations, told us from the outset that current evaluation systems were not being used to improve teacher support. Nearly all of the teachers on the MET project’s advisory panel similarly expressed little faith that traditional evaluation measures and practices could provide usable information to guide more effective teaching. They decried evaluation as perfunctory, its measures as disconnected from what they valued about teaching and learning, and its observation practices as highly subjective.

These advisors nevertheless agreed that trustworthy measures could inform improvements in teaching practice in ways that traditional evaluation systems have not. Identifying and validating better measures has been the primary goal of the MET project and a core concern of the districts with which we work.

Measuring for Improvement

It is very hard to support effective teaching without good information about actual teaching practice. But good information is hard to produce. It requires the right measures, the right measurement processes, strong communications, and an awareness of how information can be distorted. When given the right type of attention, measures can help set expectations and align effort.

The MET project has sought to build and test measures of effective teaching so that school systems can clearly understand and then close the gap between their expectations for effective teaching and the actual teaching occurring in classrooms. Our prior reports tested, and ultimately supported, the claim that measures of teaching effectiveness could be valid and reliable.

To help states and districts navigate the work of implementing feedback and evaluation systems that support teachers, we offer nine guiding principles based on three years of study, observation, and collaboration with districts. These principles, explained on the following pages, fall into three overarching imperatives, as shown in Figure 1: Measure Effective Teaching, Ensure High-Quality Data, and Invest in Improvement. Note the cyclical presentation. Well-designed evaluation systems will continually improve over time. It will require care and attention for teacher evaluation measures to serve both professional development and accountability purposes.

Figure 1
A Framework for Improvement-Focused Teacher Evaluation Systems

MEASURE EFFECTIVE TEACHING
■ Set expectations
■ Use multiple measures
■ Balance weights

ENSURE HIGH-QUALITY DATA
■ Monitor validity
■ Ensure reliability
■ Assure accuracy

INVEST IN IMPROVEMENT
■ Make meaningful distinctions
■ Prioritize support and feedback
■ Use data for decisions at all levels

Guiding Principles for Improvement-Focused Teacher Evaluation Systems

Our district partners are beginning to build and implement systems for teacher feedback and evaluation. In each case, they have included multiple measures, they have upheld high standards for data quality, and they have emphasized the importance of investing in improvement. They see feedback as the path to better teaching. They understand that the measures, while focused on teaching, are able to provide feedback at all levels of the system (school leadership, coaching support, professional development, and even central office administration) to align efforts in support of more effective teaching and learning.

Measure Effective Teaching

■ Set Expectations. The first step in designing teacher evaluation systems is for stakeholders to agree on the teacher knowledge, skills, and behaviors that enable better student learning. This benefits the entire system by providing a shared language to talk about teaching, set priorities, and target support. In the MET project, we defined effective teaching as sensitivity to students’ academic and social needs, knowledge of subject-matter content and pedagogy, and the ability to put that knowledge into practice, all in the service of student success.

■ Use Multiple Measures. The choice of measures should reflect the multifaceted nature of effective teaching. What counts most gets the most attention. An unmeasured facet is likely to be neglected. It was important that we measured each facet of effective teaching. For instance, the MET project sought or developed measures to reflect all key aspects of its definition of effective teaching: student surveys to assess the supportiveness of the instructional environment, content tests to assess teachers’ knowledge of their subject and how to teach it, observation instruments to assess teachers’ classroom practice, and student assessments to measure the learning gains of a teacher’s students.

■ Balance Weights. When combining measures into a single index, we have found that approaches that allocate between 33 percent and 50 percent of the weight to student achievement measures are sufficient to indicate meaningful differences (continues on page 6)

Diagnosing Practice with Multiple Measures

These pages use MET project data to illustrate how multiple measures can provide teachers with rich, contextualized information on their practice for use in professional development. Displayed are results for a MET project teacher (the name is fictional), her school, and district on classroom observations, student perception surveys, and student achievement gains. The teacher can see her overall results and where her results sit within the systemwide distribution for each measure and individual teaching competency.

Ms. A | 6th grade | Valley View Middle School | XYZ School District

[Figure: four linked displays.
➊ Multiple Measures Bar (achievement gains, observation, student survey), with rows for the Equally Weighted Composite, State Math Test (achievement gains), Classroom Observation (FFT), and Student Survey (Tripod).
➋ and ➌ Box plots for Achievement Gains and Classroom Observations. Observation scores are shown on the Danielson Framework for Teaching (FFT) scale, 1.0 to 4.0, for each competency: Creating an environment of respect & rapport; Establishing a culture for learning; Managing classroom procedures; Managing student behavior; Communicating with students; Using questioning & discussion techniques; Engaging students in learning; Using assessment in instruction.
➍ Scatterplot, Middle School Math Scores: difference between actual and predicted achievement on the 2010 state math test (vertical axis) against student achievement on the 2009 state math test (horizontal axis), with an Actual = Predicted line and separate markers for students in Ms. A’s classroom, school, and district.]

➊ Multiple Measures Bar
This bar contains a score for every teacher on each measure within the Multiple Measures Composite (MMC). Each column represents a single teacher. The top row is the MMC and the rows below represent the achievement gains for the state math assessment and the average scores for the Framework for Teaching classroom observations and the Tripod student survey. MMC scores determine placement on the bar from the lowest MMC score on the left to the highest MMC score on the right. Scores for the MMC and its individual measures are color-coded to performance standards for each measure, with red representing low performance, yellow representing average performance, and green representing high performance. Note that the colors generally match across the four measures near each end of the bar, indicating a high level of agreement among them. In other words, teachers at the very high end tend to do well on all of the measures, and the opposite is true for those at the very low end.

Composite score: 228 of 500
Composite percentile: 40th
Tier: Satisfactory

➋ & ➌ Box Plots
The box plots at level ➋ depict scores for each measure. The box plots at level ➌ depict scores for each component within the student survey and the teacher observation measures.

Legend: The orange box represents the middle 50 percent of all teachers. The line within the box is the median (middle) teacher. Lines extend from each side of the box, on the left to the 5th percentile and on the right to the 95th percentile. Scores beyond these lines are considered outliers. The dark blue dot represents the teacher. The light blue dot represents the school average; the district average is also marked.

[Figure: Student Surveys box plots. Score on Tripod Survey and, for each component (Care, Challenge, Confer, Consolidate, Clarify, Control, Captivate), score on the Tripod scale from 1.0 to 5.0.]

➍ Achievement Gains Scatterplot
The scatterplot shows the gap between actual and predicted performance for all district 6th grade students on last year’s state math assessment. Predicted performance is the average performance for students with similar prior scores, after adjusting for English language learner and free and reduced-price lunch status. The center (dashed) line represents actual performance equal to predicted performance. Points above the line represent higher-than-predicted performance for students with similar characteristics. Points below the line represent lower-than-predicted performance. Distance from the line represents the gap between predicted and actual performance. A teacher’s value-added score is calculated by averaging each of his or her students’ performance against predictions. Above-predicted performance is credited as positive and below-predicted performance is debited as negative.
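The value-added calculation described in the scatterplot legend can be sketched in a few lines. This is a minimal illustration, not the MET project's actual model. The predicted scores are taken as given inputs (the legend says they come from students with similar prior scores, adjusted for English language learner and free and reduced-price lunch status), and the function name and sample numbers are hypothetical.

```python
from statistics import mean


def value_added(actual, predicted):
    """Average gap between actual and predicted scores for one teacher's students.

    Positive gaps (above prediction) add to the score; negative gaps subtract.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need one predicted score for every actual score")
    return mean(a - p for a, p in zip(actual, predicted))


# Hypothetical scores for five students in one classroom (not MET data).
actual = [0.5, -0.25, 1.0, 0.0, -0.5]
predicted = [0.25, 0.0, 0.5, 0.25, -0.75]

# Gaps are 0.25, -0.25, 0.5, -0.25, 0.25, so the average is 0.1.
print(value_added(actual, predicted))
```

How the predictions are produced is the hard part in practice and is deliberately left outside this sketch.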

(continued from page 3) among teachers. Moreover, balanced weights avoid the risks posed by too narrow a focus on one measure. Overweighting any single measure (whether student achievement, observations, or surveys) invites manipulation and detracts attention and effort away from improvement.

Ensure High-Quality Data

■ Monitor Validity. Teachers who demonstrate skills and score high on a measure should experience more success in helping students learn than teachers with lower scores. If a measure, as measured, does not lead to better student outcomes, then it is not the best place for teachers to focus their limited time and attention. We have tested the validity of all measures in the MET project and found that students learn better in the classrooms of teachers with better observation scores, better student survey results, and prior success raising student test scores (adjusted for students’ different starting points). One measure of content knowledge for teaching did not pass our validity test and was therefore omitted from our composite measure. The MET project invested considerable effort to randomly assign classrooms of students to teachers to determine if measures could identify effective teachers regardless of student assignment, and they did. School systems needn’t go to such lengths, but they should compare teachers’ performance on each measure with their student achievement gains. Validation is not a one-time exercise. If teachers begin to score higher on a measure, but higher scores are no longer associated with desired outcomes, then new measures are needed.

■ Ensure Reliability. Measurement of teaching should reflect the quality of teachers’ practice and not the idiosyncrasies of a particular lesson, observer, or group of students. Low reliability indicates measurement error, and this undermines trust in the system. The MET project has learned much about how to reliably measure teacher practice. Chief among these lessons is the need to observe more than one lesson and include more than one observer for each teacher. In addition, we learned that short observations to supplement full-lesson observations can increase the reliability of observation ratings. School systems can use a variety of combinations of observers and lessons observed to improve reliability. For example, we found that school systems could achieve reliability above 0.65 when a principal observes one full lesson and peers or other administrators observe three partial lessons. The above scenario is more efficient, yet it achieves the same reliability as when a principal observes two full lessons and a peer or another administrator observes two additional full lessons. For student survey measures and tests, reliability is a function of the content of the questions, the consistency of the data collection process, and, for survey questions, assurance of confidentiality.

■ Assure Accuracy. Reliability without accuracy amounts to being consistently wrong. Accuracy of observations requires rigorous training on how to differentiate performance across all competencies within an observation instrument. It also requires assessment of observers’ abilities to apply the instrument as intended before they are allowed to rate teachers’ practice. That two observers agree does not mean they are correct. Assuring accuracy of student test scores and survey responses means crediting them to the right teacher. Whenever the MET project collected student data from a classroom, it verified with the teacher the names of the students in the class. School systems should do the same.

Invest in Improvement

■ Make Meaningful Distinctions. Many traditional evaluation systems told almost all teachers they were satisfactory, and told very few, sometimes less than 1 percent, they were not. This does not reflect reality, but neither does a system that separates teachers into four equal-sized groups. MET project data suggest that teachers’ effectiveness is unlikely to be distributed equally among several performance categories. Indeed, we found this to be far from the case. MET project teachers’ classroom observation scores were bunched at the center of the distribution, where 50 percent of the teachers scored within 0.4 points of each other (on a four-point scale) using Charlotte Danielson’s Framework for Teaching. Teachers at the 25th and 75th percentiles scored less than one-quarter point different from the average. Only 7.5 percent of teachers scored below a two, and only 4.2 percent of teachers scored above a three. This would suggest a large middle category of effectiveness with two smaller ones at each end. Rather than trying to make fine distinctions among teachers in this vast middle, efforts would be better spent working to improve their practice.

■ Prioritize Support and Feedback. While some teachers’ low performance will require administrative action on behalf of students, it’s a waste of effort to use measures of teaching only for high-stakes decisions. Multiple measures provide rich information to help teachers improve their practice. Although we didn’t study the effectiveness of feedback, many of the teachers who participated in the MET project video study told us that seeing themselves teach was one of their most valuable professional development experiences.

■ Use Data for Decisions at All Levels. The responsibility for improving teaching shouldn’t rest with teachers alone. Measures of effective teaching enable school systems to better support teachers’ improvement needs. Sound measures help school systems know where to target professional development and whether the supports work. A number of our partner districts, including the Denver and Hillsborough County (Fla.) Public Schools, have shifted professional development resources to areas of teaching that classroom observation measures indicate need improvement most. This led Hillsborough County to focus its professional development support on rigorous instructional techniques, for which teachers showed the most room for improvement, rather than classroom management skills that most teachers had clearly mastered.
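The Balance Weights and Monitor Validity principles lend themselves to a small sketch. Everything below is illustrative. The function names, the assumption that each measure has already been placed on a common standardized scale, and all sample figures are hypothetical and are not MET project data. The only number taken from the brief is the 33 to 50 percent range for the weight on student achievement.

```python
from math import sqrt


def composite(scores, weights):
    """Weighted index over measures that share a common scale (an assumption)."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must name the same measures")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    # Range suggested in the brief for the achievement weight.
    if not 0.33 <= weights["achievement"] <= 0.50:
        raise ValueError("achievement weight should be between 0.33 and 0.50")
    return sum(scores[m] * weights[m] for m in scores)


def pearson(xs, ys):
    """Pearson correlation, used here as a simple validity check:
    do teachers who score higher on a measure also show larger gains?"""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical standardized scores for one teacher.
scores = {"achievement": 0.2, "observation": -0.4, "survey": 0.6}
weights = {"achievement": 0.5, "observation": 0.25, "survey": 0.25}
index = composite(scores, weights)  # 0.1 - 0.1 + 0.15

# Hypothetical observation scores and achievement gains for four teachers.
r = pearson([2.1, 2.5, 2.8, 3.2], [-0.2, 0.0, 0.1, 0.3])
print(round(index, 2), round(r, 2))
```

A system following the Monitor Validity principle would rerun a check like `pearson` periodically, since the brief notes that validation is not a one-time exercise.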

The Next Phase of Work

States and districts have learned a great deal in the last few years about how to create better teacher development and evaluation systems. But there’s still much to learn as these systems are implemented and improved over time and aligned to new expectations for students. One of the most exciting prospects is aligning teacher development and evaluation systems to the Common Core State Standards. As they move forward, states and districts should commit to measurement but hold lightly to the specific measures as the field continues to gain new knowledge.

Understanding how teachers are performing is an important first step. But the real work lies ahead: understanding how to use that data to help all teachers improve their practice and the outcomes for America’s young people.

Bill & Melinda Gates Foundation

Guided by the belief that every life has equal value, the Bill & Melinda Gates Foundation works to help all people lead healthy, productive lives. In developing countries, it focuses on improving people’s health and giving them the chance to lift themselves out of hunger and extreme poverty. In the United States, it seeks to ensure that all people, especially those with the fewest resources, have access to the opportunities they need to succeed in school and life. Based in Seattle, Washington, the foundation is led by CEO Jeff Raikes and Co-chair William H. Gates Sr., under the direction of Bill and Melinda Gates and Warren Buffett.

For more information on the U.S. Program, which works primarily to improve high school and postsecondary education, please visit .gatesfoundation.

©2013 Bill & Melinda Gates Foundation. All Rights Reserved. Bill & Melinda Gates Foundation is a registered trademark in the United States and other countries.