
Who's at the Top? It Depends . . .

Identification of Highest Performing Schools for


Newsweek's 2014 High School Rankings

Authors



Matthew Finster
Jackson Miller





2014









Who's at the Top? It Depends . . .

Identification of Highest Performing Schools for
Newsweek's 2014 High School Rankings





2014

Prepared for:
Newsweek LLC
7 Hanover Square, 5th Floor
New York, New York 10004




Prepared by:
Westat
An Employee-Owned Research Corporation


1600 Research Boulevard
Rockville, Maryland 20850-3129
(301) 251-1500



Executive Summary

This report describes the 2014 methodology and analysis for Newsweek's Top Public High School
rankings. The 2014 methodology is different from traditional high school ranking methods in that it
produced two sets of rankings. We created two lists, an absolute list and a relative list, to
demonstrate the consequence of accounting for non-school factors on school rankings. The
absolute list ranks schools based solely on the achievement and college readiness indicators. The
relative list ranks the highest performing schools after accounting for student poverty. These two
lists reveal how the rankings vary when non-school factors (poverty, in this case) are considered.

For both the absolute and relative rankings, we conducted multiple analyses to identify the top
schools. Using data from the National Center for Education Statistics (NCES), specifically data available
through EDFacts and the Common Core of Data (CCD), we conducted a threshold analysis to identify
schools with the highest levels of academic achievement based on average student proficiency as
measured by scores on state standardized assessments. The absolute list identified the top 20 percent
of schools in each state, that is, schools with performance above the 80th percentile. The relative list
identified schools that scored 0.5 standard deviation (SD) or more above the line of best fit after
controlling for the percentage of students eligible for free or reduced-price lunch. We surveyed
schools identified in the threshold analysis to collect data related to college readiness. The web-based
survey asked about basic demographic information, graduation rates, college enrollment rates, the
number of full-time equivalent (FTE) counselors, the number of students who take the SAT and/or
ACT, the average SAT and/or ACT score, the percentage of students taking at least one advanced
placement (AP)/international baccalaureate (IB)/advanced international certificate of education
(AICE) course, and the school's students' average AP/IB/AICE score. Using these data, we created a
college readiness index score based on six indicators, namely, counselor FTE (weighted at 10
percent), changes in 9th-grade to 12th-grade enrollment rates (10 percent), a composite SAT/ACT
score (17.5 percent), a composite AP/IB score (17.5 percent), high school graduation rates (20
percent), and college enrollment rates (25 percent). We conducted a second analysis to construct the
absolute ranking, which ranked schools based on their college readiness index score, and the relative
list, which ranked schools based on their distance from the line of best fit when controlling for
student poverty.

To highlight schools with similar achievement levels for all students, we noted schools on both lists
in which economically disadvantaged students scored at or above the state average in reading and
mathematics. This analysis also used NCES data obtained from EDFacts and the CCD.

We conducted several sensitivity tests to assess the extent to which the rankings are dependent on
the judgmental weights and to examine the degree to which interrelationships among the variables
may have influenced the final rankings. For this analysis, we produced rankings using a weighting
scheme that assigned equal importance to each of the six college readiness indicators. Next, we conducted
a Principal Component Analysis (PCA) to generate weights and control for intercorrelations among
the variables. Based on the results of our sensitivity tests, we concluded that school rankings were
less sensitive to changes in the weighting scheme than they were to controlling for student poverty
levels. However, these results are unique to these data. We encourage others to assess the sensitivity
of their rankings to their weighting schemes and to address issues stemming from multicollinearity
when ranking high schools.


Acknowledgments
This report was written by Westat for Newsweek's Top Public High School ranking list. The
authors, Matthew Finster and Jackson Miller, would like to acknowledge the following people who
contributed to this work.
First, to the Newsweek staff for their patience and understanding throughout the process, and
especially Maria Bromberg, for her dedication to the project.
We also thank our project director, Allison Henderson, and our Westat colleagues, Jill Feldman, and
Shen Lee for providing feedback and support. In particular, we would like to thank Anthony
Milanowski for his counsel throughout the process, especially regarding the sensitivity tests, and for
reviewing the report. Thanks also to Edward Mann and his team for conducting the web-
based survey.
We would like to acknowledge the contributions made by the advisory panel, Russell Rumberger at
University of California, Santa Barbara; David Stern at University of California, Berkeley; Brad Carl
at The Value-Add Research Center at the Wisconsin Center for Education Research; Nicholas
Morgan at The Strategic Data Project at the Harvard Center for Education Policy Research; and
Elisa Villanueva Beard at Teach for America, for their input.
Last, we want to thank the education staff who took the time to provide us with survey data.
Matthew Finster
Research Associate
Westat
1600 Research Blvd.
Rockville, Maryland 20850

Jackson Miller
Research Associate
Westat
1600 Research Blvd.
Rockville, Maryland 20850




Contents

Introduction ............................................................................................................................ 7
Review of Relevant Literature and Previous Methods...................................................... 9
Methodology for Identification of Top Schools for Newsweek's Top Public
High Schools Rankings .................................................................................. 12
Data Sources .................................................................................................... 15
Analysis Details ............................................................................................... 16
Threshold Analysis: Based on Academic Achievement
Indicators ........................................................................... 16
Absolute Ranking: Identification of high schools that performed the
highest on accountability assessments within state .................... 16
Relative Ranking: Identification of high schools that exceeded
expectations on accountability assessments based on levels
of student socioeconomic status, within state ............................ 16
Ranking Analysis: Based on College Readiness Indicators ......... 18
Absolute Ranking: Identification of high schools that had the
highest college readiness index scores ....................................... 18
Relative Ranking: Identification of high schools that performed
higher than the state average after controlling for students'
socioeconomic status ............................................................... 18
Equity Analysis: Identification of high schools that have
economically disadvantaged students that are performing
higher than the state average ........................................................... 19
Limitations ....................................................................................................... 20
Conclusion and Discussion ................................................................................................. 22
Appendix A: Sensitivity Tests ............................................................................................. 23
References ......................................................................................................................... 36



Figure 1: Scatterplot of Schools' Achievement Index Scores by Percentage of Economically
Disadvantaged Students in Connecticut ....................................................................................................... 17




Introduction
The purpose of this brief is to describe the methodology (hereafter termed 2014 methodology) used
to develop Newsweek's 2014 High School Rankings. The methodology produced rankings based on
two aspects of school performance: the academic achievement of all students on state assessments
in reading and mathematics and the extent to which high schools prepare their students for college.
We also incorporated an equity measure to highlight schools in which economically disadvantaged
students achieve at or above the statewide average level of achievement of all students.
One of the biggest challenges in ranking school performance is addressing whether, and if so how,
to incorporate the substantial influence of family background on students' success in school. From
the time the Coleman Report (Coleman et al., 1966) found that family background has a
considerable influence on student achievement, numerous studies have confirmed this finding and
concluded that students from more affluent families, whether measured by household income or
parents' occupation or level of education, experience substantial advantages over students from less
affluent families (Sirin, 2005; Stinebrickner & Stinebrickner, 2003; Sutton & Soderstrom, 1999).
Thus, it is generally agreed that schooling outcomes are influenced by a combination of factors
inside and outside of school. However, most high school lists ignore the influence of family
characteristics when ranking schools.1 When family background characteristics are not included, all
student performance is erroneously attributed to the school. Removing the influence of students'
socioeconomic status on school-level outcomes can drastically change the high school rankings (e.g.,
Toutkoushian & Curtis, 2005). To address the common criticism that school rankings are driven by
student background characteristics, we developed two lists: one referred to as the
"absolute" list and another referred to as the "relative" list. The absolute ranking is based
on the highest scores on the college readiness index and ranks schools without accounting for
student background characteristics. The relative ranking, on the other hand, accounts for students'
socioeconomic status. The relative list controls for student socioeconomic status to clarify the
association between influences occurring outside of school and student performance. These two lists
are different and answer different types of questions. If someone is interested in knowing which
schools perform the best in readying students for college, regardless of whether the performance is
attributable to the school or to family background, then one should refer to the absolute list. If, on
the other hand, an individual is interested in knowing which schools perform the best on the college
readiness index relative to their levels of student socioeconomic status, then one should refer to the relative list.
Furthermore, we developed an equity measure for the schools on both lists to assess economically
disadvantaged students' performance in schools. There are long-standing conceptions of equity and
adequacy in education finance that have evolved over time (e.g., Baker & Green, 2009; Berne &
Stiefel, 1984). Broadly, the focus of these debates has shifted from the equity of input measures,
such as school resources, to the adequacy of educational outcomes for all students. Concurrently,
national education policy has been squarely focused on the goal of educating all students to high
standards (e.g., the No Child Left Behind Act of 2001, the Race to the Top Fund). Since high achievement
and college readiness for all students are critical components of a successful high school, for both the
absolute and relative rankings, we note schools in which economically disadvantaged students
are achieving at or above the state average in both reading and mathematics. This approach provides

1 An exception to this is the U.S. News High School Ranking produced by the American Institutes for Research (AIR); for details, please see Duhon et al. (2013).

a rough estimate of equity and indicates whether economically disadvantaged students in a school
have average performance levels that are at least as high as state averages on standardized reading
and mathematics assessments. This measure provides another way to distinguish between schools,
based on the size of performance gaps between economically disadvantaged students in a school and
all students in the state.
The 2014 methodology extends Newsweek's 2013 ranking methodology in several ways. First, the
2013 methodology relied solely upon self-reported data provided by schools to create a college
readiness index. For 2014, we used secondary data collected by the National Center for Education
Statistics (NCES) to supplement self-reported information about college readiness provided by
schools. Student achievement data obtained from NCES were used to select a sample of schools
(for the absolute list, n=4,400 and for the relative list n=4,100) from which to collect college
readiness data. Additionally, using data from the CCD, the 2014 methodology includes a variable to
control for average student poverty levels, which are known to be associated with student
achievement and college readiness. We controlled for student poverty levels to address critiques of
past rankings that claim that school performance as measured by average student test scores is as
much or more a function of student background characteristics than of factors within a school's
control, and thus most rankings erroneously attribute variation in student performance to schools
alone. To illustrate the difference between a list that controls for student poverty and one that does
not, we constructed two different rankings, an absolute ranking and a relative ranking. To assess
how sensitive rankings were to the judgmental weights we applied, we conducted several
sensitivity tests.
The 2014 methodology consists of a similar multi-step analysis for both the absolute and relative
rankings. However, the relative ranking accounts for student socioeconomic status, and the absolute
list does not. The first step was to assess schools' performance within their respective states. For this
analysis, we constructed an academic achievement index (AI) based on school average proficiency
rates on state standardized tests. We conducted this threshold analysis to identify schools that
performed above a defined cut point. For the absolute list, we selected schools that were at the 80th
percentile or higher in each state's performance distribution (i.e., the top 20 percent). For the relative
list, we selected schools that performed 0.5 SD or more above the average school in the state with
similar proportions of students eligible for free or reduced-price lunch.
Schools that performed above the defined threshold proceeded to the next stepthe ranking
analysis based on the college readiness data. For this step, Westat surveyed the schools on both
lists to collect data about college readiness indicators (e.g., ratio of students to counselor full-time
equivalent (FTE), percentage of students who took advanced placement (AP) tests, and average SAT
scores). Using these indicators, we created a college readiness index to rank the schools. For the
absolute ranking, we ranked schools by their college readiness index score. For the relative ranking,
we ranked schools by their college readiness index score, controlling for socioeconomic status. We
conducted a separate analysis to assess the sensitivity of the rankings to the judgmental weighting
scheme used for the college readiness index. This analysis is mentioned below and described in detail
in appendix A.
The final step was to analyze whether economically disadvantaged students in each school
performed better than the state average in reading and mathematics. Schools in which economically
disadvantaged students performed at or above the state average in both reading
and mathematics were denoted as equitable schools.

The next section discusses relevant research, including common critiques, pertaining to school
ranking methodologies.
Review of Relevant Literature and Previous Methods
The purpose of the Newsweek high school rankings is to identify the top 500 public high schools in
the country. Given the diversity among public high schools across the nation, developing one system
to rank schools presents a challenge. The problem is developing one comprehensive ranking system
that adequately accounts for heterogeneity among schools. Gladwell (2011) describes this dilemma
in his article in The New Yorker in which he discusses the opposing forces of comprehensiveness vs.
heterogeneity. According to Gladwell, a ranking system can be comprehensive so long as it is not
trying to rank units (schools, in this case) that are heterogeneous. Similarly, a ranking system can
account for heterogeneity so long as it does not also strive for comprehensiveness. Ranking the
nations top public high schools is, by nature, comprehensive. Yet, there is great diversity among
public high schools, much of which cannot be accounted for. Schools vary with regard to their
settings (urban, suburban, and rural) and the variety of curriculum and programs they offer. Even
with a narrow focus on college preparation, the extent to which schools focus on AP, international
baccalaureate (IB), or dual enrollment varies considerably, and the extent to which high school
students take the SAT or ACT varies considerably by state. Schools also enroll students with
significantly different background and demographic characteristics, and studies have repeatedly
demonstrated that these differences influence achievement in a variety of ways (e.g., Sirin, 2005;
Stinebrickner & Stinebrickner, 2003; Sutton & Soderstrom, 1999). To account for at least some of
this heterogeneity, we developed two ranking systems.

The first ranking scheme identified and ranked schools by their performance on the academic
achievement and college readiness measures. In essence, this ranking scheme identifies the schools
with the highest absolute performance on these measures, regardless of influencing factors. A
problem with this approach is that it ignores the influence on achievement attributable to student
background characteristics. So while this ranking scheme identifies schools with the highest scores
using these metrics, estimates of the association between school factors and performance are
confounded with contributions made by students' background and family characteristics. That is, it
is not clear to what extent student performance is associated with school versus non-school factors.
In an effort to address this confound, we devised the relative ranking method to account for student
poverty levels.2 Using the relative method, rankings are based on a school's performance level relative
to the socioeconomic status of its student body. Since the relationship between poverty levels and
student performance is partially accounted for, schools that are ranked highly using this method are
not necessarily the same schools that have the highest absolute levels of performance.

Another important decision when constructing ranking schemes is which variables to include and
how much each variable will contribute to the final rankings (Bobko, Roth, & Buster, 2007;
Gladwell, 2011; Webster, 1999). To determine which variables to include, we referred to previously
used methodologies (e.g., Duhon, Kuriki, Chen, & Noel, 2013; Streib, 2013) and related research
(e.g., Grubb, 2009). There is a vast amount of research about the economics of education and
education finance that examines the production function of schools (e.g., Grubb, 2009; Harris, 2010)

2 This study is not causal and does not make any causal claims about the influence of schools on students. To truly separate the influence of the school from that of student background characteristics, a causal study would be required. In this case, we are parsing out the association of school performance with student backgrounds.

that provides defensible rationales for factors that should be included or accounted for in a ranking
scheme (e.g., financial resources, teacher experience, staff development, college pressure, school
problems, family background and student connectedness). Unfortunately, limited data are available
for all schools, and available data exclude many of the recommended factors. (This issue is addressed
again in the limitations section.) Hence, we relied on common measures that have been used in
previous rankings (e.g., Duhon et al., 2013; Streib, 2013). These variables include student
performance on standardized assessments, high school graduation rates, SAT and ACT scores,
performance metrics on AP/IB exams, and college acceptance and enrollment rates. (Specific items
are discussed in greater detail below.)

In addition to commonly used items, we included two variables that we derived from research about
school performance metrics. Students' engagement with counselors has been
shown to positively influence college attendance rates (e.g., Grubb, 2009) and, thus, could be
considered one suitable indicator of college preparation. However, due to data limitations, we used a
simple school resource measure, the pupil-to-counselor full-time equivalent (FTE) ratio, to reflect this
construct. Additionally, we added another variable, referred to as holding power (e.g., Balfanz &
Legters, 2004; Rumberger & Palardy, 2005), that indicates the dropout and transfer rate of 9th-grade
students and can be considered another appropriate indicator of school quality (e.g., Rumberger &
Palardy, 2005).

Typically, data for high school rankings are obtained from national or state-level data sets or from
self-reported survey data provided by the school or school district personnel. Both sources of
information have strengths and limitations. Data from national and state sources typically lag by two
or three school years. For this reason, rankings that rely on information from these sources use data
that are two years old (e.g., Duhon et al., 2013). For example, 2013 rankings would rely on data
collected for SY 2011-12. A primary benefit of using these types of datasets is that there are uniform
reporting procedures and data checks in place, resulting in data that are more reliable with regard to
consistency and accuracy; although these data are still self-reported by educational personnel. An
alternative is to survey schools directly and ask them to self-report the requested data. While this
option allows for more current data to be collected than can be obtained through state or
national data files, this method is more susceptible to participation bias and lacks standardized
reporting procedures. To address issues stemming from self-reported data, we established
procedures to assess the credibility and feasibility of some data. (These procedures are discussed in
the data section.)

An important note is that for the rankings, the unit of analysis is the school. We used means of
student performance indices to indicate school performance. Using average student performance as
an indicator of school performance is a common practice, and in fact, most states tie school funding
to these metrics (Cobb, 2002; Hall, 2001). This practice of using average student performance as a
measure of school performance in rankings like these has been criticized by some for not
distinguishing between student- vs. school-level performance (e.g., Di Carlo, 2013). However, student
means are commonly used as an indicator of school performance, for example, in state funding and
accountability systems. The only way to properly address this issue is to use student-level data
and conduct a multilevel (e.g., hierarchical) study; however, such student-level data are not
currently available for all public high schools.

In addition to determining which variables to include, we also had to determine how to combine
them. This process entailed determining how items could be used for comparative purposes and

how much to weight these items to generate a composite score. Methods can involve a single-stage
(e.g., Streib, 2013) or multi-stage process (e.g., Duhon et al., 2013). A benefit of a multi-stage process is
that it establishes a minimum performance standard and conducts additional analysis on just the
subsample of schools that meet the initial criteria. It also allows for the inclusion of
student achievement data that need to be assessed within state. Due to the lack of comparability
across state assessments with regard to their difficulty and content focus, we used relative
performance within states to determine the minimum threshold, as opposed to including this as part
of the ranking criteria. This approach accounts for the nested structure of standardized assessments
within states and identifies schools that met the baseline criteria. (See the methods section for
additional details.)

Determining an appropriate weighting scheme is a thorny issue. Ways of combining data elements
into composite scores have been debated by statisticians and researchers for decades (Bobko et al.,
2007). Three common procedures for combining items into a composite score are using expert
judgmental weights, equal (or unit) weighting, and regression weights. Judgmental weighting relies
on using theory and professional judgment to establish a weighting scheme. Equal weighting applies
an equal amount of importance to each item in the composite. Regression weighting uses a formula
based on the relationship between the measures and a criterion to establish weights, although it is
only possible to use this approach when there is a criterion measure, limiting its viability in this case.
For public high school rankings, previous methods have relied on judgmental weights (e.g., Duhon
et al., 2013; Streib, 2013); however, equal weighting is a viable option. Research that examines
judgmental versus equal weighting (e.g., Bobko et al., 2007) points out that in many cases, each
approach produces similar results. Yet, in some instances, it makes more sense to weight items based
on theorized substantive (or empirical) importance of the variables. Based on this information, we
used a judgmental weighting scheme to generate the composite scores and, subsequently, rank
schools. We also assessed the sensitivity of the rankings to the weighting scheme by generating
rankings using equal weights and comparing those results to the rankings produced by the
judgmental weighting schema. Furthermore, we conducted a principal component analysis to
generate rankings using weights derived from the empirical relationships between the variables. (For
a discussion of the equal and PCA weighting analysis, please refer to appendix A.)

One problem to acknowledge and potentially control for when using judgmental weights is the inter-
correlations between the items. In some cases, items can be so similar that they are measuring the
same thing. Previous ranking studies (e.g., Duhon et al., 2013; Streib, 2013) do not indicate whether
or how multicollinearity among the variables was accounted for. This is problematic because if the
items are moderately or highly correlated, the explicit weights may vary substantially from the actual
contributions of the items to the rankings (e.g., Webster, 1999). Furthermore, items with very high
correlations (e.g., r=+.9) should not all be included in a model. If two items are correlated at r=.9,
they are essentially measures of the same underlying construct. Given this, it is important to examine
the inter-correlations of indicators used in the ranking formula. If items are highly correlated, the
same results could be produced more efficiently by a simpler model that includes fewer indicators.

Some methods explicitly control for multicollinearity. Principal component analysis is a technique
that generates components and weights based on empirical data. PCA is one way to assess whether
the explicit weights and the actual contributions of the data to the ranking vary substantially. The
results of the PCA we conducted are discussed in appendix A.

The following section explains and elaborates on the 2014 methodology.


Methodology for Identification of Top Schools for Newsweek's Top
Public High Schools Rankings
For the 2014 methodology, we conducted a multi-step process consisting of a threshold, a ranking,
and an equity analysis. The threshold analysis assessed schools' performance as measured by
students' achievement levels on standardized assessments and identified (1) the schools that have the
highest levels of academic achievement (for the absolute ranking) and (2) the schools that have the
highest levels of academic achievement given the socioeconomic status of their students (for the
relative ranking). Schools that were identified in the threshold analysis were surveyed to obtain data
about college readiness indicators, and, pending completion of the survey, schools proceeded to the
second analysis. The second analysis ranked schools by their responses to several college readiness
indicators that identified (1) the schools with the highest levels of college readiness and (2) the
schools that have the highest levels of college readiness after accounting for the socioeconomic
status of their students. (Additional analyses were conducted to assess the sensitivity of the rankings
to the judgmental weights and the intercorrelations among the variables; see appendix A for
details.) For the last analysis, we compared the performance of each school's economically
disadvantaged student population to the average performance in reading and mathematics for all
students within each state.
The methods used to develop the rankings were designed to:
Identify high schools within each state that have the highest performance as
measured by academic achievement on state assessments in reading and mathematics
Assess the extent to which the top-performing schools have prepared their students
for college and to rank them accordingly
Recognize schools that have high levels of achievement among economically
disadvantaged students.
The procedures are similar for both the relative and absolute rankings; however, for the relative
ranking, we accounted for the socioeconomic status of the school's student body. Details related to
each step in the process follow:
Threshold Analysis: Create a high school achievement index based on performance
indicators (i.e., proficiency rates on state standardized assessments). For the absolute list, the index
was used to identify high schools that perform at or above the 80th percentile within each state. For
the relative list, the index was used to identify high schools that perform 0.5 SD or more above their
state's average when accounting for students' socioeconomic status.
Ranking Analysis: For the high schools on both lists identified in the threshold analysis, we
created a college readiness index based on the following six indicators: the ratio of students to
counselor FTE, changes in 9th- and 12th-grade student enrollment rates (referred to as holding
power), high school graduation rates, a weighted SAT/ACT composite score, a weighted AP/IB
composite score, and the percentage of students enrolling in college. The weighting scheme for the
index is below; an illustrative computational sketch follows the list:
o Holding Power: 10 percent
o Ratio of Counselor FTE to student enrollment: 10 percent
o Weighted SAT/ACT: 17.5 percent
o Weighted AP/IB composite: 17.5 percent
o Graduation Rate: 20 percent
o Enrollment Rate: 25 percent
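To make the weighting concrete, the following minimal sketch (in Python) shows how a judgmentally weighted composite of this kind could be computed. The indicator names, example values, and the choice to standardize (z-score) each indicator across schools before weighting are illustrative assumptions, not the report's exact procedure.

# Illustrative sketch of a judgmentally weighted college readiness index.
# Indicator names and values are hypothetical, not the actual survey data.
import statistics

WEIGHTS = {
    "holding_power": 0.10,            # 9th-to-12th-grade enrollment ratio
    "counselors_per_student": 0.10,   # counselor FTE relative to enrollment
    "sat_act_composite": 0.175,
    "ap_ib_composite": 0.175,
    "graduation_rate": 0.20,
    "college_enrollment_rate": 0.25,
}

def z_scores(values):
    """Standardize raw values to mean 0, SD 1 so weights are comparable across scales."""
    mean, sd = statistics.mean(values), statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def readiness_index(schools):
    """Return {school: weighted composite} given raw indicator values per school."""
    names = list(schools)
    index = {name: 0.0 for name in names}
    for indicator, weight in WEIGHTS.items():
        for name, z in zip(names, z_scores([schools[n][indicator] for n in names])):
            index[name] += weight * z
    return index

# Hypothetical example with three schools.
schools = {
    "School A": {"holding_power": 0.95, "counselors_per_student": 0.0040,
                 "sat_act_composite": 0.8, "ap_ib_composite": 0.5,
                 "graduation_rate": 0.97, "college_enrollment_rate": 0.90},
    "School B": {"holding_power": 0.88, "counselors_per_student": 0.0025,
                 "sat_act_composite": 0.1, "ap_ib_composite": 0.2,
                 "graduation_rate": 0.91, "college_enrollment_rate": 0.78},
    "School C": {"holding_power": 0.99, "counselors_per_student": 0.0056,
                 "sat_act_composite": 1.2, "ap_ib_composite": 0.9,
                 "graduation_rate": 0.99, "college_enrollment_rate": 0.95},
}
print(sorted(readiness_index(schools).items(), key=lambda kv: kv[1], reverse=True))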

For the absolute rankings, we rank ordered the schools by their college readiness index scores. For
the relative list, we ranked the schools based on how well the schools performed relative to schools
with similar proportions of students eligible for free or reduced-price lunch.
Equity Analysis: Of the top high schools identified in the ranking analysis, we then
identified schools in which economically disadvantaged students performed better than the state
average for all students in reading and mathematics. This part of the analysis did not affect the
rankings. Instead, we incorporated this step to recognize schools that have equitable academic
performance for economically disadvantaged students as indicated by their performance levels
relative to the state average for all students on both the reading and mathematics assessments. Table
1 provides an overview of the steps, analysis, outcomes, and data used for the 2014 methodology.
Table 1: Summary of Proposed Methodology for 2014

Threshold Analysis: Student Achievement
Identification of high-achieving schools. For the relative list: accounting for the socioeconomic status of students.
Analysis: Creation of academic achievement index (AI) scores for each school by state.* For the relative list: within each state, a scatterplot of schools' AI scores by a socioeconomic status indicator (percent free or reduced-price lunch).
Outcome: For the absolute list: schools within each state were rank ordered by their AI score; schools in the top 20 percent within their state proceed to step 2. For the relative list: schools with an AI score at least 0.5 SD above the line of best fit were selected to proceed to step 2.
Data (Achievement Data): Source: NCES EDFacts data. Indicators: academic achievement in mathematics, reading/language arts, and science.

Ranking Analysis: College Readiness
Identification of schools achieving high marks on college readiness. For the relative list: accounting for the socioeconomic status of students.
Analysis: Of the schools that proceed from the threshold analysis: creation of college readiness index scores for each school.* For the relative list: a scatterplot of schools' college readiness scores by a socioeconomic status indicator (percent free or reduced-price lunch).
Outcome: For the absolute list: schools were rank ordered by their scores on the college readiness index. For the relative list: schools were ranked based on their (standardized) distance from the line of best fit.
Data (College Readiness Data): Source: email/web-based school surveys, NCES data. Indicators: ratio of student enrollment to counselor FTE; change in student cohort enrollment rates from 9th to 12th grade (i.e., holding power); percentage of students taking the SAT/ACT; average SAT/ACT score; graduation rates; college enrollment rates; percentage of students taking at least one AP/International Baccalaureate (IB)/Advanced International Certificate of Education (AICE) course; ratio of AP/IB/AICE tests per student; average AP/IB/AICE score.

Equity Distinction
Identification of schools with economically disadvantaged students performing better than the state average on both the math and reading assessments.
Analysis: Of the top schools in tier 1: comparison of schools' economically disadvantaged students with state averages.
Outcome: Schools with higher achievement than the state average on both reading and math among economically disadvantaged students are recognized as equitable.
Data (Subgroup Achievement Data): Same data as the threshold analysis, by student subgroups.




Data Sources
The threshold and equity analyses are based on state standardized assessment data obtained from
NCES. Assessment data available on states' websites vary with regard to their completeness, limiting
comparability and, in some cases, excluding states from the analysis. Given this, we used publicly
available data from NCES (found at www.data.gov) about performance for all schools during the
2011-12 school year to determine which schools met the initial threshold criteria using systematically
collected data for a more complete set of schools. NCES suppresses publicly available data to
protect the identity of students in schools with small student sub-group populations. The suppressed
data placed schools into student proficiency ranges that varied from 3 to 50 percentage points. Of
the schools contained in the database, we selected regular public high schools, excluding vocational,
special education, and alternative schools, as defined by the NCES data manual. For the threshold
analysis, we have records and data for 14,454 public high schools. We used this sample in the
threshold analysis to construct both the absolute and relative lists. Of the 14,454 schools, the
threshold analysis identified approximately 4,400 schools for the absolute list and 4,100 for the
relative list. A total of 2,533 schools were on both lists.
We collected survey data from schools that appeared on either list to conduct the ranking analysis
for schools that met or exceeded the threshold criteria. A link to the web-based survey was sent to
the schools via regular mail and by email to collect basic demographic information, graduation rates,
college enrollment rates, the number of counselor FTE, the number of students taking the
SAT/ACT, average SAT/ACT scores, the percentage of students taking at least one AP/IB/AICE
course, and average AP/IB/AICE scores. We collected data from the 2011-12 school year to align
with the most recent year of publicly available achievement data. For the schools that received a
survey, the response rate was 38 percent (n=1551) for schools on the relative list and 35 percent (n=
1529) for the schools on the absolute list.
We used the survey data to create the college readiness index score for each of the schools in both
of our samples. We also added the holding power variable to the index, which was derived from
NCES data to determine the ratio of 12th-grade students in the 2011-12 school year to 9th-grade
students in the 2008-09 school year. However, given the limitations of the data in the CCD, we were
not able to account for students who transferred to a different school during this period.
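As a rough sketch of this calculation (with hypothetical enrollment counts rather than CCD values):

# Minimal sketch of holding power: the ratio of 12th-grade enrollment in
# 2011-12 to 9th-grade enrollment in 2008-09 for the same school.
def holding_power(grade12_2011_12, grade9_2008_09):
    """Ratio of the 12th-grade cohort to its 9th-grade size four years earlier."""
    if grade9_2008_09 <= 0:
        return None  # cannot compute without a valid 9th-grade count
    return grade12_2011_12 / grade9_2008_09

print(holding_power(grade12_2011_12=230, grade9_2008_09=260))  # ~0.88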
After collecting the data, we ran data checks to identify schools with data that were likely incorrect.
Schools that reported having more students taking the SAT or ACTor more students taking AP
or IB classesthan the total number of students enrolled were removed from the data. In addition,
we removed schools that had a counselor-to-enrollment ratio that was excessively large or small. We
defined excessive as any school with a standardized ratio (the z-score of the ratio) above 3 or below
-3. Finally, we capped the graduation rate at 100 percent for schools that reported having more high
school graduates than total 12th-grade students and capped the college enrollment rate at 100 percent
for schools that reported having more college enrollees than high school graduates. In some cases, it
is possible for schools to graduate more students, or have more students enroll in college, than the
number of 12th-grade students. However, capping these two variables at 100 percent eliminated any
advantage a school could gain by graduating students outside of the typical four-year window.
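The sketch below illustrates checks of this kind; the record layout and field names are hypothetical, while the plus or minus 3 z-score screen and the 100 percent caps mirror the rules described above.

# Illustrative data checks on self-reported survey records (hypothetical layout).
import statistics

def clean_records(records):
    """Apply the plausibility screens and caps described in the text."""
    # 1) Drop schools reporting more test takers or AP/IB takers than total enrollment.
    kept = [r for r in records
            if r["sat_act_takers"] <= r["enrollment"] and r["ap_ib_takers"] <= r["enrollment"]]

    # 2) Drop schools whose counselor-to-enrollment ratio is an extreme outlier (|z| > 3).
    ratios = [r["counselor_fte"] / r["enrollment"] for r in kept]
    mean, sd = statistics.mean(ratios), statistics.stdev(ratios)
    kept = [r for r, ratio in zip(kept, ratios) if abs((ratio - mean) / sd) <= 3]

    # 3) Cap graduation and college enrollment rates at 100 percent.
    for r in kept:
        r["grad_rate"] = min(r["grad_rate"], 1.0)
        r["college_rate"] = min(r["college_rate"], 1.0)
    return kept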

(For the sensitivity tests that we conducted, we further screened and cleaned the data to prepare
them for multivariate analysis; these procedures are discussed in appendix A.)

The next section provides additional analytic details.
Analysis Details
This section provides more detailed information about the procedures we used to conduct each
analysis and to construct both the absolute and relative rankings.
Threshold Analysis: Based on Academic Achievement Indicators
Absolute Ranking: Identification of high schools that performed the highest on
accountability assessments within state
Relative Ranking: Identification of high schools that exceeded expectations on
accountability assessments based on levels of student socioeconomic status,
within state
Substep 1: Calculate the academic achievement index
As part of the threshold analysis, we created an AI based on proficiency rates on states' reading/
language arts and mathematics assessments. We calculated the index scores by taking the weighted
average of the proficiency rates on the two assessments. The equation is:
Weighted avg. = ((# taking reading/language arts assessment * pct. proficient) + (# taking math assessment * pct. proficient)) / (# taking reading/language arts assessment + # taking math assessment)
For the relative ranking, we also calculated the percentage of economically disadvantaged students
for each high school:
The percentage of students living in poverty for each school was calculated using data from the
CCD, which includes the number of students eligible for free or reduced-price lunch and the total
number of students in each school. We calculated the percentage eligible for free or reduced-price
lunch by dividing the number eligible by the total number of students in the school.
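A minimal sketch of these two calculations, using hypothetical counts and proficiency rates:

# Sketch of the academic achievement index (AI) and the poverty percentage for one school.
def achievement_index(n_reading, pct_prof_reading, n_math, pct_prof_math):
    """Weighted average of reading/language arts and math proficiency rates,
    weighted by the number of students taking each assessment."""
    return ((n_reading * pct_prof_reading) + (n_math * pct_prof_math)) / (n_reading + n_math)

def frl_percentage(n_frl_eligible, total_enrollment):
    """Percentage of students eligible for free or reduced-price lunch."""
    return 100.0 * n_frl_eligible / total_enrollment

print(achievement_index(n_reading=310, pct_prof_reading=88.0, n_math=305, pct_prof_math=82.0))
print(frl_percentage(n_frl_eligible=120, total_enrollment=615))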
Substep 2: Identify top schools
Absolute Ranking: Identify schools that achieve within the top 20 percent on state English
language arts and mathematics assessments within their respective state
For this step, we ranked the schools by their AI composite score within state and identified the top
20 percent of schools. Schools in the top 20 percent proceeded to the ranking analysis.
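In code, this screen amounts to sorting schools by AI within each state and keeping the top 20 percent, as in this hypothetical sketch:

# Sketch of the within-state 80th-percentile (top 20 percent) screen; data are hypothetical.
from collections import defaultdict

schools = [
    ("CT", "School A", 91.2), ("CT", "School B", 84.7), ("CT", "School C", 78.3),
    ("CT", "School D", 88.9), ("CT", "School E", 73.0),
]

by_state = defaultdict(list)
for state, school, ai in schools:
    by_state[state].append((ai, school))

selected = []
for state, rows in by_state.items():
    rows.sort(reverse=True)                    # highest AI first
    n_keep = max(1, round(0.20 * len(rows)))   # top 20 percent within the state
    selected.extend((state, school) for ai, school in rows[:n_keep])

print(selected)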


Relative Ranking: Identify schools that have higher than expected achievement based on
the state-specific relationship between student achievement and percent
eligible for free or reduced- price lunch
We regressed the schools' AI scores on their percentage of economically disadvantaged students to
generate a line of best fit that was used to determine the state-specific relationship between the AI
and the percentage of economically disadvantaged students. For example, Figure 1 below shows the
relationship between the index and average student poverty for Connecticut. Figure 1 depicts the
type of analysis we performed for each state.
Figure 1: Scatterplot of Schools' Achievement Index Scores by Percentage of Economically
Disadvantaged Students in Connecticut

The extent of the difference between a high school's expected and observed achievement is captured
by the residual from the line of best fit. The residual is the vertical difference between the observed
value and the expected value indicated by the line of best fit. To identify schools that perform above
the average, based on the line of best fit generated by the AI scores and percent of economically
disadvantaged students, we established the cut-off point at +0.5 SD. That is, schools with residuals
that met or exceeded +0.5 SD proceeded to the ranking analysis.
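A minimal sketch of this screen, assuming an ordinary least squares line of best fit and interpreting the 0.5 SD cut as 0.5 standard deviations of the residuals (the report's exact standardization may differ); the data are hypothetical:

# Sketch of the relative threshold: schools at least 0.5 SD above the state's
# line of best fit relating AI to percent free or reduced-price lunch.
import statistics

frl_pct = [12.0, 35.0, 58.0, 20.0, 75.0, 44.0]   # x: percent eligible for free/reduced-price lunch
ai      = [90.0, 80.0, 73.0, 84.0, 66.0, 81.0]   # y: achievement index
names   = ["A", "B", "C", "D", "E", "F"]

# Ordinary least squares slope and intercept for a single predictor.
mean_x, mean_y = statistics.mean(frl_pct), statistics.mean(ai)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(frl_pct, ai)) / \
        sum((x - mean_x) ** 2 for x in frl_pct)
intercept = mean_y - slope * mean_x

residuals = [y - (intercept + slope * x) for x, y in zip(frl_pct, ai)]
sd_resid = statistics.stdev(residuals)

selected = [n for n, r in zip(names, residuals) if r >= 0.5 * sd_resid]
print(selected)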

Ranking Analysis: Based on College Readiness Indicators
Absolute Ranking: Identification of high schools that had the highest college readiness
index scores
Relative Ranking: Identification of high schools that performed higher than the state
average after controlling for students' socioeconomic status
For the ranking analysis, we repeated the procedures used in the threshold analysis, but we used the
college readiness indicators to produce a college readiness index score for each school. As in the
threshold analysis for the absolute rankings, we ranked schools on their college readiness index score
and, for the relative rankings, we ranked schools based on their residuals from the line of best fit
using the scatterplots of college readiness index scores by student socioeconomic levels.
Substep 1: Calculation of college readiness index scores for each high school that
progressed past the threshold analysis
For the high schools that proceeded beyond the threshold analysis, we created a college readiness
index score based on the indicators in the survey. Due to patterns of missing data and high
intercorrelations among some indicators, the college readiness index score is based on six indicators:
holding power, ratio of counselor FTE to student enrollment, graduation rates, a weighted
composite SAT/ACT score, a weighted composite AP/IB score and college enrollment rates.
Regarding high intercorrelations, college acceptance rates and enrollment rates were highly
correlated at r=.94. Due to the high level of this correlation, only one indicator, college enrollment,
was selected for inclusion in the index.
To account for instances in which students take both the SAT and ACT assessments, and to account
for instances where one or the other test is not typically taken, we created an average weighted
SAT/ACT composite. The formula for the weighted SAT/ACT is:
Weighted SAT/ACT = ((zSAT * # of students taking the SAT) + (zACT * # of students taking the ACT)) / (# of students taking the SAT + # of students taking the ACT)
Similarly, to account for schools that offer both AP and IB programs, and to adjust accordingly
when one program or the other is offered at a school, we created an average weighted AP/IB
composite. The formula for the weighted AP/IB composite is:
Weighted AP/IB = ((zAP * # of students taking AP) + (zIB * # of students taking IB)) / (# of students taking AP + # of students taking IB)
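The sketch below follows these two composite formulas; the z-scores and test-taker counts are hypothetical, and in practice the z-scores would be computed across all schools in the sample:

# Sketch of the weighted SAT/ACT and AP/IB composites for one school.
def weighted_composite(z_a, n_a, z_b, n_b):
    """Average of two standardized scores weighted by the number of test takers;
    reduces to whichever test is actually taken when the other count is zero."""
    if n_a + n_b == 0:
        return None
    return (z_a * n_a + z_b * n_b) / (n_a + n_b)

sat_act = weighted_composite(z_a=0.85, n_a=180, z_b=0.60, n_b=45)  # zSAT/zACT with taker counts
ap_ib   = weighted_composite(z_a=0.40, n_a=120, z_b=0.00, n_b=0)   # AP only, no IB program
print(round(sat_act, 3), round(ap_ib, 3))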
Substep 2: Determination of weighting scheme
To determine how to weight the six indicators in the college readiness index, we reviewed the
literature for previous weighting schemes and the rationales that were provided for using them (e.g.,
Duhon et al., 2013; Streib, 2013). We also assessed the empirical relationships among the variables
using a principal component analysis. Based on these theoretical and empirical sources, we created a
judgmental weighting scheme that we used to rank the schools. As previously discussed, there is
considerable debate regarding how to combine measures into composite scores, and this conundrum has
confronted statisticians and researchers for decades (Bobko et al., 2007). A common criticism is that
weighting schemes are somewhat arbitrarily chosen or that the weights are designed to reward
certain values (e.g., Gladwell, 2011). A more nuanced criticism is that, because of the
multicollinearity among variables, the explicit weights that are used can vary substantially from the
actual contributions of each variable when constructing the rankings (Webster, 1999). In other
words, because the items are interrelated, the weights often do not indicate how much an
item actually contributes to the ranking. To partially address these criticisms, we tested several
different weighting schemes to assess the extent to which the rankings changed based on which
weighting option we used. As previously mentioned, viable options for combining multiple pieces of
information into a composite score include judgmental weighting, equal weighting, and use of
regression weights (Bobko et al., 2007). To compare the effects of using different weighting
schemes, we used equal weighting, which is "a viable and useful option for forming composite
scores" (Bobko et al., 2007, p. 691), and we derived empirical components and respective weights
using a principal component analysis (PCA) to control for multicollinearity among the indicators.
The rankings produced by these alternative weighting schemes were highly correlated (r > .90) with
the rankings generated by the judgmental weighting scheme. In this case, the three ranking schemes
produced similar results; thus, while we cannot escape the fact that our weights are judgmental and
reward certain items more highly than others, our results indicate the rankings produced using
judgmental weights are similar to those produced by equal weighting and a weighting scheme based
on the empirical relationships in the data. These findings suggest our rankings are not particularly
sensitive to these various weighting schemes. (For more details about the sensitivity tests, see appendix
A.)
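The snippet below sketches a comparison of this kind: it ranks the same hypothetical schools under the judgmental and equal weighting schemes and reports the Spearman rank correlation between the two orderings. It is illustrative only and does not reproduce the PCA-based weights.

# Sketch: compare rankings under judgmental vs. equal weights (hypothetical data).
JUDGMENTAL = [0.10, 0.10, 0.175, 0.175, 0.20, 0.25]
EQUAL = [1 / 6] * 6

# Hypothetical standardized (z-scored) indicator values for five schools.
schools = {
    "A": [0.9, 0.2, 1.1, 0.8, 0.7, 1.0],
    "B": [0.1, 0.5, 0.3, 0.2, 0.4, 0.3],
    "C": [1.2, 1.0, 0.9, 1.3, 1.1, 0.9],
    "D": [-0.3, 0.1, -0.2, 0.0, 0.2, -0.1],
    "E": [0.5, 0.7, 0.4, 0.6, 0.5, 0.6],
}

def rank_order(weights):
    """Order schools from highest to lowest weighted composite score."""
    scores = {s: sum(w * z for w, z in zip(weights, zs)) for s, zs in schools.items()}
    return sorted(scores, key=scores.get, reverse=True)

def spearman(order_a, order_b):
    """Spearman rank correlation between two orderings of the same items (no ties)."""
    n = len(order_a)
    rank_a = {s: i for i, s in enumerate(order_a)}
    rank_b = {s: i for i, s in enumerate(order_b)}
    d2 = sum((rank_a[s] - rank_b[s]) ** 2 for s in order_a)
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

print(spearman(rank_order(JUDGMENTAL), rank_order(EQUAL)))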

Substep 3: Ranking of schools by their college readiness index scores
Similar to the process used in the threshold analysis, we used the college readiness scores to rank the
schools. For the absolute ranking, schools were rank ordered based on their college readiness score.
For the relative ranking, we ranked schools by their residuals expressed as the distance from the line
of best fit. To do this, we created a scatterplot based on each school's college readiness index
score and the percentage of economically disadvantaged students. We used these data to generate a
line of best fit and to determine the distance of a school's college readiness score from the line of best fit (i.e.,
the residual), which we then standardized and used to rank the schools.
Equity Analysis: Identification of high schools that have economically disadvantaged
students that are performing higher than the state average
For the equity analysis, we assessed the performance of each school's economically disadvantaged
students and compared that with the state's average performance. In addition to the fact that each
state uses its own standardized assessments, there is evidence suggesting that student achievement
gaps are nested by state (Reardon, 2014). As a result, each school's student achievement,
disaggregated by student subgroup, was compared to the respective state average.
Then, we noted any school in which the economically disadvantaged student population performed
better than the state average on both the reading and mathematics assessments. However, due to

data suppression issues, many schools did not have data that were needed to conduct this analysis.
Of the top 500 schools on the absolute list, 125 schools were not included. Of schools on the
relative list, 93 schools were not included in this analysis. To avoid confusion when interpreting these
results, we indicated whether a school met our standard, did not meet our standard, or whether
these data were not available.
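A minimal sketch of the equity flag, assuming school-level subgroup proficiency rates and state averages are available; the values are hypothetical, and suppressed data are represented as None:

# Sketch of the equity distinction: flag a school when its economically disadvantaged
# students meet or exceed the state average in both reading and mathematics.
def equity_flag(school_frl_reading, school_frl_math, state_avg_reading, state_avg_math):
    """Return 'equitable', 'not equitable', or 'data not available' (suppressed data)."""
    if school_frl_reading is None or school_frl_math is None:
        return "data not available"
    if school_frl_reading >= state_avg_reading and school_frl_math >= state_avg_math:
        return "equitable"
    return "not equitable"

print(equity_flag(86.0, 81.0, state_avg_reading=84.0, state_avg_math=79.0))  # equitable
print(equity_flag(None, 81.0, state_avg_reading=84.0, state_avg_math=79.0))  # data not available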

Limitations
There are several limitations associated with conducting this analysis, many of which stem from the
availability of data and their suitability for comparing schools in different states. For example, a wide
variety of family and school factors are associated with student academic achievement and college
readiness (Sirin, 2005; Stinebrickner & Stinebrickner, 2003; Sutton & Soderstrom, 1999). However,
other than student poverty, we have access to data for only a few variables and constructs that have
been shown to influence student performance. Additionally, we have no information about a range
of school factors that may influence school performance, such as fiscal resources, teacher quality
and effectiveness, school leadership, and school climate. These school factors could potentially
contribute to student achievement and college readiness, but data are unavailable for a variety of
reasons. For example, fiscal data (i.e., school fiscal resources) are not collected at the school level in
federal datasets. In addition, indicators of teacher quality, such as teaching experience and/or
credentials, are not reported in a comprehensive manner for all schools, and, even if they were, they
are only very loose proxies for teaching quality (Goldhaber, 2002). States, districts, and schools often
collect data about school leadership and climate, but variability in how these data are collected and
differences in the psychometric properties of surveys make it difficult to compare results across
states. The lack of quality data about all public high schools in the United States makes it difficult for
ranking methodologies to include the full variety of factors, both within and outside of school, that
are known to influence student achievement and college readiness.
In addition to being limited by the availability of national-level data to account for school and non-
school influences on student achievement and college readiness, our survey is also limited in
the extent to which it asks about all types of college preparation programs used across the country.
Our survey collected data about six factors that research indicates contribute to students' college
readiness. However, there are other types of college preparation programs that are not accounted for
in this study. For example, in lieu of AP or IB courses, some schools offer dual enrollment
programs that allow students to earn college credit while still in high school. Similarly, there are
many early college high school programs that partner with local colleges and universities to provide
advanced coursework. Students in these schools can elect to take college courses rather than AP or
IB courses. Including some of these factors in future analyses may enhance the methodology for
identifying and ranking top-performing schools, although some may argue that dual enrollment
programs are dependent on college resources and not necessarily a high school factor.
Another limitation of the study is the reliance on self-reported data to create the college readiness
index. There are a few potential issues with using self-reported data that may have introduced bias
into our analysis. First, the final rankings reflect only those schools that participated in the survey,
not the entire list of schools identified in the threshold analysis. The rankings are dependent on
schools that responded to the survey and indicate schools' standings within the sample of
respondents. In this sense, both the absolute and relative lists represent the relative ranking of the
schools that responded to the survey, not of the entire sample identified in the threshold analysis.3

Second, using self-reported data also introduces the potential for schools to game the rankings, or
simply to inadvertently submit incorrect data and improve their ranking. If a school submitted
erroneous data (intentionally or unintentionally) that appeared plausible, our data checks would not
have identified that school, and we would have under- or overestimated its standing relative to the
other schools on the list.
We did not control for any factor other than student socioeconomic status. Some schools have
selection processes that allow them to admit high-achieving students. Many of the top schools on
the relative list are magnet schools and may have an application process of this kind. In our
ranking methodology, these schools would have an advantage over schools that do not have an
application process.
In the analysis for the relative list, we used school-level student proficiency rates to select schools
that performed better than a pre-determined threshold. For example, the relative high school
rankings, which controlled for student poverty levels, could be biased against schools with high
levels of achievement and college readiness but low levels of student poverty. This bias is, at least
partially, the result of ceiling effects on state assessments across schools within a state. In cases
where average scores are near the top of the assessment scale, it is difficult for schools with low
levels of poverty to exceed the state average (i.e., line of best fit) by a relatively large distance (e.g., .5
SD), making it difficult for these schools to meet the threshold criteria.
Another limitation is that the school-level proficiency rates in EDFacts are a cross-sectional measure
and only represent a school's performance at a single point in time. As a result, these data cannot be
used to infer causality (i.e., we cannot say for certain that some action or actions taken by a school
resulted in a higher ranking than a school that did not take that same action or actions) or make
comparisons across states. Instead, we can simply compare school performance in the 2011-12
school year with other schools within the state (i.e., we could say that school A in state X had more
students performing proficiently in math than school B in state X). Our methodology would be
improved by using panel or longitudinal data to track student performance over time. Another
option for assessing school performance is the use of statistical approaches like value-added models
(VAMs) or student percentile growth models, which use longitudinal achievement data as well as
other variables (both school and non-school variables) to assess the school's effect on changes in

3 If the schools responding to the survey are systematically different from those that did not respond, the rankings are exposed to a potential source of bias. We assessed differences between respondents and non-respondents on several observable characteristics, such as student achievement and student poverty, and found slight but statistically significant differences in means between the two groups for both the absolute and relative lists. On average, schools that responded to the survey had higher student proficiency rates and lower percentages of students eligible for free or reduced-price lunch than non-respondents. For schools on the absolute list, respondents (m = 85.56, SD = 12.73) scored slightly higher than non-respondents (m = 84.06, SD = 13.48) on reading/language arts assessments (t(4305) = 3.62, p < .001). Non-respondents also had a lower mean proficiency rate in mathematics than respondents (m = 77.83, SD = 16.77 and m = 80.10, SD = 15.80, respectively; t(4305) = 4.38, p < .001) and a higher average percentage of students eligible for free or reduced-price lunch (m = 29.95, SD = 19.46 and m = 27.86, SD = 20.66, respectively; t(4305) = -3.34, p < .001). For schools on the relative list, we found a similar pattern: non-respondents, on average, had lower proficiency rates in reading/language arts and mathematics and a higher percentage of students eligible for free or reduced-price lunch than respondents. Specifically, non-respondents had a lower mean proficiency rate in reading/language arts than respondents (m = 79.62, SD = 15.97 and m = 82.39, SD = 14.58, respectively; t(4098) = 5.58, p < .001), a lower mean proficiency rate in mathematics (m = 73.10, SD = 18.51 and m = 76.49, SD = 17.27, respectively; t(4098) = 5.82, p < .001), and a higher percentage of students eligible for free or reduced-price lunch (m = 51.33, SD = 23.73 and m = 40.72, SD = 24.72, respectively; t(4098) = -13.66, p < .001).
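For readers who want to reproduce this kind of non-response check, the comparison reduces to a two-sample t-test on each observable characteristic. The sketch below is illustrative only: the file name and column names (responded, rla_prof, math_prof, frpl_pct) are placeholders rather than the actual variables in our data, and the pooled-variance option is used to mirror the tests reported above.

import pandas as pd
from scipy import stats

def compare_groups(df: pd.DataFrame, outcome: str) -> None:
    # Split schools into survey respondents and non-respondents.
    resp = df.loc[df["responded"] == 1, outcome].dropna()
    nonresp = df.loc[df["responded"] == 0, outcome].dropna()
    # Pooled-variance two-sample t-test (equal_var=False would give Welch's test).
    t, p = stats.ttest_ind(resp, nonresp, equal_var=True)
    print(f"{outcome}: respondents m={resp.mean():.2f}, "
          f"non-respondents m={nonresp.mean():.2f}, t={t:.2f}, p={p:.4f}")

# Illustrative usage (hypothetical file and column names):
# schools = pd.read_csv("absolute_list_schools.csv")
# for col in ["rla_prof", "math_prof", "frpl_pct"]:
#     compare_groups(schools, col)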

However, using VAMs or growth models to generate a school ranking would produce a list of top
schools in which students demonstrate the most growth on achievement measures, not the schools
with the highest absolute performance.
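To make the contrast concrete, the sketch below shows one minimal form a covariate-adjustment value-added model could take. It is illustrative only; our rankings do not estimate a VAM, and the variable names (score_2012, score_2011, frpl, school_id) are hypothetical.

import pandas as pd
import statsmodels.formula.api as smf

def simple_vam(students: pd.DataFrame) -> pd.Series:
    # Regress current-year scores on prior scores and poverty, with school fixed
    # effects; the school coefficients serve as crude value-added estimates
    # relative to the omitted reference school.
    model = smf.ols("score_2012 ~ score_2011 + frpl + C(school_id)",
                    data=students).fit()
    return model.params.filter(like="C(school_id)")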

As we have noted, there are many limitations associated with this analysis. Future school rankings
would be improved by including additional school and non-school factors that have been shown to
be associated with student achievement and college readiness when assessing school performance
and ranking schools.

Conclusion and Discussion
We used two approaches to rank high schools. The absolute list provides a ranking of schools with
the highest average performance on indicators for which we had data, using a judgmental weighting
scheme, but without regard for other factors. The relative list provides a ranking of schools that
perform the highest, using the same indicators and weights but after controlling for average student
poverty levels. We did this in an effort to identify the student performance attributable to the
school. While there are many ways to account for student background characteristics, accounting for
student poverty in this manner significantly influences which schools are identified as top schools.
Each method provides a different perspective regarding which high schools are achieving the
highest levels of performance based on indicators of academic achievement and college readiness.

The weighting scheme we used is less influential in ranking schools than controlling for the level of
student poverty. That is, adjusting the weights makes relatively minor changes to the rankings. Based
on the results of the rankings and the sensitivity analysis, we found that including student
background characteristics has a larger influence than using various weighting schemes to determine
a school's rank. However, these findings are limited to these data and analyses. Researchers
developing future school ranking methodologies are encouraged to examine indicators used for
multicollinearity, assess the sensitivity of their rankings to the respective weights assigned to
variables specified in their analytic model, and make an intentional decision regarding whether to
exclude or include confounding factors.


Appendix A: Sensitivity Tests
Introduction
The purpose of the sensitivity tests was to gauge the extent to which different item weights
influenced the final high school rankings. To bolster the analysis, we also assessed the variation in
the rankings produced using equal (unit) weighting and weights derived from a "purer"
measurement model, namely, a principal component analysis. A common critique of ranking systems
is that they are sensitive to the item weights, which often are (or at least appear to be) somewhat
arbitrarily chosen (Gladwell, 2011; Webster, 1999). Gladwell (2011) highlights this problem in his
article in The New Yorker on college rankings by describing in detail how the item weights in the
U.S. News & World Report methodology influence the final rankings. So, in some cases, while ranking
schemes may identify clear winners, the ranks may be highly dependent on the weights chosen for
the variables.
The conundrum of determining the appropriate weights for items to form composite scores has
been debated by statisticians and researchers for decades (Bobko et al., 2007). Options to determine
the weights for composite scores include using experts to determine judgmental weights, using equal
(unit) weights, and/or using empirical information to generate differential weights (Bobko et al.,
2007). As previously discussed, high school rankings typically rely on judgmental weights. In their
review of the literature on the utility of equal (unit) weighting in creating composite scores, Bobko
and colleagues (2007) affirmed the usefulness of equal weighting and recommended equal (unit)
weighting as "a useful and viable option when forming composite scores" (p. 691). Hence, for our
first sensitivity test, we assessed the extent to which our public high school rankings changed under
equal weighting vs. our judgmental weights.
Another more complicated option to generate composite scores is to use the empirical data to
generate differential weights using regression methods. A benefit of regression methods and, more
specifically, a principal component regression analysis, is that it can control for inter-correlations
between the indicators and identify the most significant components (i.e., the most significant
ranking criteria). Using the ranking criteria from the U.S. News & World Report tier rankings of
national universities, Webster (1999) conducted a principal component analysis (PCA) and found
that, due to extreme multicollinearity among the indicators, the actual contributions of the ranking
criteria were substantially different from the explicit weighting schemes.
Because multicollinearity among the indicators may mask the actual contributions of the items when
using the explicit judgmental and equal weighting schemes (e.g., Webster, 1999), we conducted a
PCA, among other reasons, to control for multicollinearity and to assess the most significant ranking
criteria. We used the PCA to (1) reduce the items to components that explain a majority of the
variance in the data, (2) produce component scores (using the regression method) for each school,
and (3) weight the components by their substantive importance as determined by their respective
eigenvalues. In a
sense, a PCA is a "purer" measurement model because it accounts for the inter-correlations among
the indicators and determines the substantive importance of the respective components based on the
variance in the empirical data. To test the sensitivity of the rankings to multicollinearity between the
items and to the ranking criteria (i.e., the weights across the items), we created rankings based on the
results of the PCA and compared them to the rankings produced by the judgmental and equal
weights.
Specifically, our questions were:
1) To what extent will the high school rankings produced from the judgmental and equal
weighting schemes correlate?
2) To what extent will the high school rankings that are produced using weights developed by
principal component analysis correlate with the rankings produced from the judgmental and
equal weighting schemes?

Methods
In this section, we discuss the methods we used for the equal weighting and, primarily, the PCA. For
the equal weighting analysis and PCA, we used the data obtained from the school survey discussed
previously. For the PCA, the component scores could only be produced from complete cases, which
reduced the n to 1,080 for the absolute list and 974 for the relative list. Nonetheless, according to
Comrey and Lee (1992), a sample size of 1,000 is "excellent" for factor analysis.
We used the six variables previously described (college enrollment rates, holding power, the ratio of
counselor FTE to student enrollment, the weighted SAT and ACT composite, the weighted AP and
IB composite, and high school graduation rates) to create the rankings. We examined the data for
normality, linearity, and outliers; however, since the principal component analysis is being used for
descriptive purposes, assumptions regarding the distribution of variables in the sample are not in
force (Tabachnick & Fidell, 2007). To address outliers in the data, we capped the graduation and
enrollment rates at 100 percent and deleted records that exceeded the standardized z-score value of
3 for all of the items.
Based on Bobko and colleagues' (2007) recommendations, we used equal (unit) weighting across the
items to generate the school rankings. For the equal weighting analysis, we assigned an equal weight
(i.e., .16666, or one-sixth each) to all six items in the formula and ranked the schools accordingly. To produce the
rankings for the absolute list, the schools' final scores were rank ordered. To produce the rankings
for the relative list, the schools' scores were plotted against student poverty levels, and the schools
were subsequently ranked by their residuals from the line of best fit.
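The sketch below illustrates these two steps, building the equal-weighted composite and then ranking schools by their residuals from the line of best fit. The column names (the six standardized items and frpl_pct for the percentage of students eligible for free or reduced-price lunch) are placeholders, not the actual variable names in the survey file.

import numpy as np
import pandas as pd

ITEMS = ["pctenroll", "fteenroll", "satact", "apib", "gradrate", "holdingpower"]

def rank_schools(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    # Equal (unit) weighting: each of the six standardized items contributes one-sixth.
    out["score"] = out[ITEMS].mean(axis=1)
    # Absolute list: rank schools directly by the composite score.
    out["abs_rank"] = out["score"].rank(ascending=False)
    # Relative list: fit a line of best fit of score on student poverty and rank
    # schools by their residuals from that line.
    slope, intercept = np.polyfit(out["frpl_pct"], out["score"], 1)
    out["residual"] = out["score"] - (intercept + slope * out["frpl_pct"])
    out["rel_rank"] = out["residual"].rank(ascending=False)
    return out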
For the second part of our analysis, we conducted a PCA. PCA is a statistical technique applied to a
set of variables to reduce them to a smaller set of components that are totally independent (i.e.,
orthogonal) of one another. We used PCA to examine whether the six variables could be reduced to a smaller
number and to provide weights that are based on components that are not correlated with each
other, so that any overweighting of the variables due to their intercorrelations is eliminated.

In PCA, linear combinations (the components) of the original variables are produced, and a small
number of these combinations typically accounts for the majority of the variability within the set of
intercorrelations among the original variables. Once the components have been identified, it is
possible to estimate an individual's (in this case, a school's) score on a component based on its scores
on the constituent variables. There are several techniques for calculating component scores that use
component score coefficients as weights in an equation. We used a technique referred to as the
regression method, in which the component loadings are adjusted to take account of the initial
correlations between variables, resulting in a purer measure of the unique relationship between
variables and components. The coefficients were used to produce component (also referred to as
factor) scores, which were then weighted by their respective eigenvalues, which indicate the
substantive importance of the associated component. This process produced an overall score by
which the schools were subsequently ranked. For the absolute list, schools were rank ordered by their
scores; for the relative list, schools were ranked by their residuals from the line of best fit in a
scatterplot of the school score by student poverty.
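A rough approximation of this scoring procedure is sketched below. It uses scikit-learn rather than SPSS, omits the varimax rotation, and therefore will not reproduce the regression-method scores exactly; the item list is a placeholder.

import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def pca_overall_score(df: pd.DataFrame, items: list, n_components: int = 4) -> pd.Series:
    # Standardize the items and extract the components.
    X = StandardScaler().fit_transform(df[items])
    pca = PCA(n_components=n_components)
    scores = pca.fit_transform(X)            # one row of component scores per school
    eigenvalues = pca.explained_variance_    # substantive importance of each component
    # Weight each component score by its eigenvalue and sum into an overall score,
    # which can then be rank ordered (or regressed on poverty for the relative list).
    overall = scores @ eigenvalues
    return pd.Series(overall, index=df.index, name="pca_score")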
There are several important characteristics of PCA. In PCA all variance is analyzed, and components
are simply aggregates of correlated variables. In this sense, the variables cause or produce the
component. There is no underlying theory about which variables should be associated with which
factors; they are simply empirically associated. Thus, any labels applied to derived components are
merely convenient descriptions of the combination of variables associated with them and do not
necessarily reflect some underlying process.
Results
Equal weighting vs. judgmental weights
For our first sensitivity test, we determined the correlation between the rankings produced by the
equal weighting scheme and the rankings produced by the judgmental weighting scheme for both
the absolute and relative lists. Under the equal weighting scheme, we weighted all six variables at
.16666. For the judgmental weighting scheme, we weighted each variable based on our research into
the factors that affect students' college readiness. Our judgmental weights were .25 for college
enrollment rate, .20 for graduation rate, .175 for the ACT/SAT composite, .175 for the AP/IB
composite, .10 for holding power, and .10 for the ratio of counselor FTE to 12th-grade students. For
the absolute list, the correlation between the lists derived from the judgmental weights and the equal
weights is r=0.993. This indicates that there is high correlation between the two methods and that
the two methods result in a very similar ranking of schools. For the relative list, the correlation
between the rankings derived from the judgmental weighting and equal weighting is r=0.990. Again,
this indicates that there is a high level of correlation between the rankings produced by the
judgmental and equal weights.
In this case, the high correlations are not surprising because sensitivity to the weighting scheme is
influenced by the ratio between the largest and smallest weights applied, and in the case of
the judgmental weighting scheme, the ratio is relatively small, with the largest indicator accounting
for only 25 percent and the smallest accounting for 10 percent.
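One way to reproduce this comparison is to build both composites and correlate the resulting rank orders, as in the sketch below. The weights follow the text; the column names are placeholders.

import pandas as pd
from scipy.stats import spearmanr

JUDGMENTAL = {"pctenroll": .25, "gradrate": .20, "satact": .175,
              "apib": .175, "holdingpower": .10, "fteenroll": .10}

def compare_weighting_schemes(df: pd.DataFrame) -> float:
    # Judgmental composite: weighted sum of the six standardized items.
    judgmental = sum(weight * df[item] for item, weight in JUDGMENTAL.items())
    # Equal (unit) weighting: simple mean of the same items.
    equal = df[list(JUDGMENTAL)].mean(axis=1)
    # Correlation between the two rank orders.
    rho, _ = spearmanr(judgmental.rank(ascending=False),
                       equal.rank(ascending=False))
    return rho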
Principal Component Analysis
Principal component analysis was performed on the relative and absolute school survey data using
the same six indicators. The results from the PCAs, in which we extracted four components using
both the absolute and relative school data, are discussed below. In addition to these results, we also
ran several variations of the PCAs using different extraction rules and rotation options, each of
which produced similar results. In several instances throughout the discussion we refer to those
variations.
Tables 1 and 2 show the univariate descriptive statistics for the absolute and relative school data
indicators:
Table 1: Descriptive Statistics of Absolute School Indicators

Indicators | Mean | Std. Deviation | Analysis N
z_abs_pctenroll | .114777 | .8591348 | 1080
z_abs_fteenroll | -.084126 | .2311849 | 1080
wt_abs_SATACT | -.031373 | .8124109 | 1080
wt_abs_apib | .020204 | .9932075 | 1080
z_abs_gradrate | .043956 | .6338433 | 1080
z_abs_holdingpower | -.046034 | .0474650 | 1080

Table 2: Descriptive Statistics of Relative School Indicators

Indicators | Mean | Std. Deviation | Analysis N
z_rel_pctenroll | .082 | .919 | 974
z_rel_fteenroll | -.044 | .168 | 974
wt_rel_SATACT | -.009 | .820 | 974
wt_rel_apib | .019 | .975 | 974
z_rel_gradrate | .061 | .623 | 974
z_rel_holdingpower | -.049 | .042 | 974

The correlation matrices for the absolute and relative school indicators are presented below in Tables
3 and 4. Notably, most of the variables are slightly to moderately correlated. In the absolute school
data, 10 of the 15 correlations are significant. The college enrollment item is slightly correlated with
the ratio of counselor FTE to student enrollment, the weighted SAT and ACT composite score, the
weighted AP and IB composite score, and high school graduation rates. The ratio of counselor FTE
to student enrollment is slightly correlated with the college enrollment and high school graduation
indicators. The weighted composite score for SAT and ACT is slightly correlated with college
enrollment, high school graduation rates, and holding power. The SAT/ACT composite score is also
moderately correlated with the weighted AP/IB composite score. The weighted AP/IB composite is
slightly correlated with college enrollment and very weakly (though significantly) with high school
graduation rates and holding power.
Table 3: Correlation Matrix of Absolute School Indicators

Indicators | z_abs_pctenroll | z_abs_fteenroll | wt_abs_SATACT | wt_abs_apib | z_abs_gradrate | z_abs_holdingpower
z_abs_pctenroll | 1 | - | - | - | - | -
z_abs_fteenroll | .141*** | 1 | - | - | - | -
wt_abs_SATACT | .368*** | .039 | 1 | - | - | -
wt_abs_apib | .247*** | -.034 | .502*** | 1 | - | -
z_abs_gradrate | .356*** | .127*** | .139*** | .091*** | 1 | -
z_abs_holdingpower | .039 | .015 | .096*** | .056* | .007 | 1
Note. * p<.05, ** p<.01, *** p<.001.
In the relative school data, 11 out of 15 correlations are significant. And, again, most of the variables
are slightly correlated. However, there are some different patterns among the correlations.
Table 4: Correlation Matrix of Indicators Using Relative School Data

Indicators | z_rel_pctenroll | z_rel_fteenroll | wt_rel_SATACT | wt_rel_apib | z_rel_gradrate | z_rel_holdingpower
z_rel_pctenroll | 1 | - | - | - | - | -
z_rel_fteenroll | .022 | 1 | - | - | - | -
wt_rel_SATACT | .381*** | -.054* | 1 | - | - | -
wt_rel_apib | .255*** | -.039 | .488*** | 1 | - | -
z_rel_gradrate | .347*** | .053* | .220*** | .107*** | 1 | -
z_rel_holdingpower | .055* | -.008 | .179*** | .080** | .024 | 1
Note. * p<.05, ** p<.01, *** p<.001.

In the PCA with four components extracted using the absolute school data, the Kaiser-Meyer-Olkin
(KMO) Measure of Sampling Adequacy is .612. This KMO statistic meets Kaiser's (1974)
recommendation of .5 as a bare minimum and is in the "mediocre" range based on Hutcheson and
Sofroniou's (1999) guidelines. For the relative school data, the KMO measure of sampling adequacy
is .635. This statistic also meets Kaiser's (1974) bare minimum criterion and is adequate based on
Hutcheson and Sofroniou's (1999) guidelines.
In the PCA using the absolute school data, the diagonal elements of the anti-image correlation
matrix are all above .5. Similarly, in the PCA using the relative school data, the diagonal elements of
the anti-image correlation matrix are also all above .5. For these values, Field (2005) recommends a
bare minimum of .5 for all variables and suggests that variables with lower values be
considered for exclusion. However, in this case, no values on the diagonal of the anti-image
correlation matrix are below .5, so none were considered for exclusion based on Field's (2005) criteria.
For both the absolute and relative school data, Bartlett's Test of Sphericity is significant (p<.001),
indicating that factor analysis is appropriate. Bartlett's measure tests the null hypothesis that the
original correlation matrix is an identity matrix. A significant value indicates that the R-matrix is not
an identity matrix and that there is some relationship between the variables that we capture in our
analysis (Field, 2005).
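For reference, both adequacy checks can be computed outside SPSS. The sketch below assumes the Python factor_analyzer package is available and that the six items are columns of a data frame; it is illustrative only, since the statistics reported here were produced in SPSS.

import pandas as pd
from factor_analyzer.factor_analyzer import calculate_kmo, calculate_bartlett_sphericity

def adequacy_checks(df: pd.DataFrame, items: list) -> None:
    X = df[items].dropna()
    # Kaiser-Meyer-Olkin measure of sampling adequacy (per item and overall).
    kmo_per_item, kmo_total = calculate_kmo(X)
    # Bartlett's test of sphericity: tests whether the R-matrix is an identity matrix.
    chi_square, p_value = calculate_bartlett_sphericity(X)
    print(f"KMO = {kmo_total:.3f}; Bartlett chi-square = {chi_square:.1f}, p = {p_value:.4f}")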
Another indication of the appropriateness of the model fit is the difference between the correlations
in the reproduced matrix and those in the R-matrix. For a good model, these differences (referred to
as residuals) will be less than .05; if 50 percent or more are greater than .05, there are grounds for
concern (Field, 2005). For the PCA using the absolute school data, there are six (40.0 percent)
nonredundant residuals with absolute values greater than 0.05, and for the PCA using the relative
school data, there are also six (40.0 percent) nonredundant residuals with absolute values greater than
0.05. Based on Field's (2005) guidelines, these values do not signal a concern.
It is important to note that since we are using PCA for the sole purpose of data reduction, there is
no assumption that there are a certain number of underlying factors that influence the six college
readiness measures. These tests are primarily related to whether the sample size and data are
sufficient to find such factors. Though that is not our purpose, we included these indices as
references to the adequacy of the PCA model.
The first part of the analysis is the extraction process, which determines the linear components
within the data set (referred to as eigenvectors). Technically, there can be as many eigenvectors as
indicators, yet most will be unimportant. To determine the importance of a component, the
eigenvalues can be assessed. The eigenvalues represent the variance explained by a particular linear
component, and this value can be presented as the percentage of variance explained. For both the
absolute and relative school data, four components were extracted based on Jolliffe's (1972, 1986)
recommended cutoff of .7. The scree plots indicate that either two or four components could be
justified. Two components would be extracted under Kaiser's rule; however, Kaiser's rule can be
overly strict (Jolliffe, 1972, 1986) and, in these cases, would retain only roughly half of the total
variance. In accordance with Jolliffe's (1972, 1986) criterion and the scree plot criterion, and to retain
more of the total variance, we chose .7 as the extraction cut-off point. (We also ran the model using
Kaiser's criterion and found similar results in the output of the PCA.)
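The extraction rule itself is simple to illustrate: compute the eigenvalues of the items' correlation matrix and count how many exceed the chosen cutoff. The sketch below assumes the six standardized items are columns of a data frame.

import numpy as np
import pandas as pd

def n_components_to_retain(df: pd.DataFrame, items: list, cutoff: float = 0.7) -> int:
    # Eigenvalues of the correlation matrix, sorted from largest to smallest.
    eigenvalues = np.linalg.eigvalsh(df[items].corr())[::-1]
    print("eigenvalues:", np.round(eigenvalues, 3))
    # Jolliffe's rule retains eigenvalues above .7; Kaiser's rule would use 1.0.
    return int((eigenvalues > cutoff).sum())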


Figure 1: Scree Plot of PCA Using the Absolute Data


Figure 2: Scree Plot of PCA Using the Relative Data


The four extracted components explain 82.14 percent and 81.85 percent of the cumulative variance
in the absolute and relative data, respectively. The components, eigenvalues, and percentage of
variance explained for the absolute and relative school data are presented below:


Table 5: Total Variance Explained by the Components Using the Absolute School Data

Component | Initial Eigenvalues (Total / % of Variance / Cumulative %) | Extraction Sums of Squared Loadings (Total / % of Variance / Cumulative %) | Rotation Sums of Squared Loadings (Total / % of Variance / Cumulative %)
1 | 1.909 / 31.81 / 31.81 | 1.91 / 31.81 / 31.81 | 1.59 / 26.57 / 26.57
2 | 1.166 / 19.43 / 51.23 | 1.17 / 19.43 / 51.23 | 1.32 / 22.06 / 48.63
3 | .990 / 16.50 / 67.73 | 0.99 / 16.50 / 67.73 | 1.01 / 16.82 / 65.45
4 | .864 / 14.41 / 82.14 | 0.86 / 14.41 / 82.14 | 1.00 / 16.69 / 82.14
5 | .602 / 10.03 / 92.17 | |
6 | .470 / 7.83 / 100.00 | |
Extraction Method: Principal Component Analysis.

Table 6: Total Variance Explained by the Six Indicators in the Relative School Data

Component | Initial Eigenvalues (Total / % of Variance / Cumulative %) | Extraction Sums of Squared Loadings (Total / % of Variance / Cumulative %) | Rotation Sums of Squared Loadings (Total / % of Variance / Cumulative %)
1 | 1.954 / 32.571 / 32.571 | 1.954 / 32.571 / 32.571 | 1.544 / 25.735 / 25.735
2 | 1.093 / 18.210 / 50.782 | 1.093 / 18.210 / 50.782 | 1.352 / 22.536 / 48.271
3 | .977 / 16.291 / 67.072 | .977 / 16.291 / 67.072 | 1.014 / 16.895 / 65.166
4 | .886 / 14.774 / 81.846 | .886 / 14.774 / 81.846 | 1.001 / 16.680 / 81.846
5 | .615 / 10.257 / 92.103 | |
6 | .474 / 7.897 / 100.000 | |
Extraction Method: Principal Component Analysis.

The next several tables depict the communalities after extraction. The communality for a variable is
the variance accounted for by the components (Tabachnick & Fidell, 2007), which indicates the
proportion of common variance present in the variable. Once the factors are extracted, the
communalities indicate how much of the variance is, in fact, common. The communalities after
extraction indicate the amount of variance in each variable that can be explained by the retained
components (Field, 2005). As such, a communality of 1 would indicate that the variable has no
specific variance, and a communality of 0 would indicate that the variable shares none of its variance
with any of the other variables (Field, 2005). For the PCAs using the absolute and relative school
data, the values are all above .6, and the average communality after extraction is .821 and .818,
respectively. Of note, the communalities for the counselor FTE to student enrollment and holding
power variables are very close to 1 (>.98), which indicates those variables have little specific variance.
The communalities are listed in Tables 7 and 8. The values listed in the column labeled "Initial" are
the communalities prior to extraction and, hence, are all 1, since PCA operates on the assumption
that all variance is common. The values in the "Extraction" column are the communalities after
extraction and depict the amount of variance in each variable that is explained by the retained
components.

Table 7: Communalities of PCA Using the Absolute School Data

Indicators | Initial | Extraction
z_abs_pctenroll | 1.000 | .640
z_abs_fteenroll | 1.000 | .990
wt_abs_SATACT | 1.000 | .739
wt_abs_apib | 1.000 | .739
z_abs_gradrate | 1.000 | .823
z_abs_holdingpower | 1.000 | .999
Extraction Method: Principal Component Analysis.

Table 8: Communalities of PCA Using Relative School Data

Indicators | Initial | Extraction
z_rel_pctenroll | 1.000 | .623
z_rel_fteenroll | 1.000 | .999
wt_rel_SATACT | 1.000 | .716
wt_rel_apib | 1.000 | .791
z_rel_gradrate | 1.000 | .791
z_rel_holdingpower | 1.000 | .989
Extraction Method: Principal Component Analysis.

Tables 9 and 10 are the component matrices. The component matrix shows the loadings of each
indicator onto each component. Since we used factor rotation to improve the interpretation, we
discuss the results from the rotated component matrix further below.

Table 9: Component Matrix Using Absolute School Data (4 components extracted)

Indicators (loadings on components 1-4, as reported)
z_abs_pctenroll: .731, .272, -.163
z_abs_fteenroll: .194, .654, .273, .671
wt_abs_SATACT: .773, -.322, .188
wt_abs_apib: .679, -.470, -.128, .201
z_abs_gradrate: .501, .548, -.514
z_abs_holdingpower: .162, -.197, .941, -.218
Extraction Method: Principal Component Analysis.

Table 10: Component Matrix Using the Relative School Data (4 components extracted)

Indicators (loadings on components 1-4, as reported)
z_rel_pctenroll: .708, .291, -.123, -.149
z_rel_fteenroll: .657, .593, .464
wt_rel_SATACT: .803, -.206, .170
wt_rel_apib: .682, -.290, .488
z_rel_gradrate: .524, .538, -.116, -.462
z_rel_holdingpower: .262, -.400, .770, -.411
Extraction Method: Principal Component Analysis.

As demonstrated in the component matrices, once factors are extracted it is possible to determine
which items load heavily onto which components. However, it is common for most variables to load
heavily onto the most important component and to load slightly on the less important components,
which makes interpretation difficult. A technique used to address this is factor rotation, which helps
discriminate between the components. Factor rotation adjusts the calculations so that each variable
loads maximally onto only one factor. To aid interpretation of the loadings and components, we
orthogonally rotated the factors using the varimax method. Orthogonal rotation rotates the factors
while keeping them independent. (Before the factors are rotated they are also independent; this
option merely ensures that they remain uncorrelated.) The varimax method attempts to maximize the
dispersion of loadings within factors, effectively loading a smaller number of variables onto each
factor (Field, 2005). Varimax is typically recommended when the underlying factors are not expected
to be theoretically related, that is, when the factors are independent of each other (Field, 2005). In
this case, we selected varimax to generate components that are independent of each other, so that
each component makes a unique contribution to the final score and inter-correlations between
variables do not influence the weight of any variable in determining the final score. We compared the
results from the PCA with the judgmental and equal-weight rankings, neither of which accounts for
the relationships between the items, to assess the differences in the rankings.
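For readers interested in the mechanics, the sketch below pairs PCA loadings (eigenvectors scaled by the square roots of their eigenvalues) with a textbook varimax rotation. It approximates, but will not exactly reproduce, the SPSS varimax solution with Kaiser normalization; the item list is a placeholder.

import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def varimax(loadings: np.ndarray, gamma: float = 1.0, max_iter: int = 100, tol: float = 1e-6) -> np.ndarray:
    # Orthogonally rotate a loadings matrix so each variable loads mainly on one component.
    p, k = loadings.shape
    rotation = np.eye(k)
    d = 0.0
    for _ in range(max_iter):
        d_old = d
        rotated = loadings @ rotation
        u, s, vt = np.linalg.svd(
            loadings.T @ (rotated ** 3 - (gamma / p) * rotated @ np.diag(np.diag(rotated.T @ rotated)))
        )
        rotation = u @ vt
        d = s.sum()
        if d_old != 0 and d / d_old < 1 + tol:
            break
    return loadings @ rotation

def rotated_loadings(df: pd.DataFrame, items: list, n_components: int = 4) -> pd.DataFrame:
    X = StandardScaler().fit_transform(df[items])
    pca = PCA(n_components=n_components).fit(X)
    loadings = pca.components_.T * np.sqrt(pca.explained_variance_)
    return pd.DataFrame(varimax(loadings), index=items,
                        columns=[f"component_{i+1}" for i in range(n_components)])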
The component loadings are a gauge of the substantive importance of a given variable to a
component and therefore can be used to assign variables to components.4 Stevens (1992)
recommends interpreting factor loadings with an absolute value greater than .4; in effect, values of .4
explain roughly 16 percent of the variance in the variable. Based on these recommendations, for the
absolute school data, the weighted SAT/ACT and AP/IB items substantively load onto component
one; the college enrollment and graduation rates substantively load onto component two; the
counselor FTE ratio substantively loads onto component three; and holding power substantively
loads onto component four. For the relative school data, there is a similar pattern; however, the items
for components three and four are switched (i.e., holding power substantively loads onto component
three, and the counselor FTE ratio substantively loads onto component four). Tables 11 and 12
below are the rotated component matrices.

4 It is possible to assess the statistical significance of the factor loadings; however, we did not directly test for significance for two reasons: (1) for our purpose of examining variation in the computed weights, statistical significance is irrelevant, since we have a fixed sub-population and are not making inferences about what factors exist in the population; and (2) SPSS does not produce significance tests of factor loadings. Furthermore, statistical significance gives little indication of the substantive importance of a variable to a component. Nonetheless, for those interested, Stevens (1992) produced a table of critical values that can be used to assess the significance of factor loadings. For a sample size of 600, loadings should be greater than .21, and for a sample of 1,000, greater than .162. Based on these criteria, almost all of the loadings are likely statistically significant.


Table 11: Rotated Component Matrix Using Absolute Data (4 components extracted; rotation converged in 5 iterations)

Indicators (loadings on components 1-4, as reported)
z_abs_pctenroll: .391, .684, .139
z_abs_fteenroll: .991
wt_abs_SATACT: .839, .165
wt_abs_apib: .857
z_abs_gradrate: .906
z_abs_holdingpower: .998
Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization.

Table 12: Rotated Component Matrix Using Relative Data (4 components extracted; rotation converged in 5 iterations)

Indicators (loadings on components 1-4, as reported)
z_rel_pctenroll: .372, .696
z_rel_fteenroll: .998
wt_rel_SATACT: .780, .276, .172
wt_rel_apib: .889
z_rel_gradrate: .889
z_rel_holdingpower: .992
Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization.

In effect, these loadings represent the extent to which the items contribute to the components, and
the component scores are derived from them. To calculate the component scores for schools, we
used the regression method option in SPSS 20. We then multiplied the component scores by the
respective rotated eigenvalues, which indicate each component's substantive importance, and used
the resulting overall scores to rank order the schools on both the relative and absolute lists.

For both the absolute and relative list rankings, the PCA based on four extracted components is
correlated with the judgmental weighting scheme at r=0.93 and with the equal weighting scheme at
r=.94. We also ran a PCA model in which we extracted six components to retain all possible
variance in the components. The correlations from these models with the judgmental and equal
weighting schemes are very similar, resulting in no change or only a one-hundredth change in the
correlation values.
Discussion and implications
These sensitivity tests address two major critiques of rank ordered lists such as these. First, the
rankings can be sensitive to their respective weights (e.g., Bobko et al., 2007; Gladwell, 2011).
Second, without accounting for the multicollinearity among the variables, the explicit weighting
scheme may vary vastly from the actual contribution of the variables (e.g., Webster, 1999).
The ranking results produced by the equal weighting scheme for both the absolute and relative lists
demonstrate that the judgmental and equal weighting schemes result in very similar rank orders of
schools (r=.99). In this case, using the judgmental weights did not result in a substantially different
ranking than the equal weighting scheme. However, the sensitivity of the judgmental weighting
depends on the relative size (i.e., the proportionality) of the weights. In other words, substantially
changing the weights on one or more items would likely result in fewer similarities between the
rankings produced by the judgmental and equal weighting schemes.
However, again, under the conditions that we compared these two composite weighting schemes,
the subsequent high school rankings are very similar. These results are unique to our data, including
the items and weights that we used for the analysis. But, these results are consistent with other
research demonstrating that in many cases there are high correlations between judgmental and equal
weighting schemes (Webster, 1999).
We now turn to the second critique regarding multicollinearity. In essence, this critique claims that if
variables are highly similar (determined by their variance explained and their correlations), then the
stated weights do not in actuality represent how the variables contribute to the composite scores.
Critics claim that misrepresentation of the weighting schemes can have serious implications for
educational policies (e.g., Webster, 1999). We conducted a PCA to (1) assess the empirical loadings
of the items on the components and (2) produce a ranking of schools based on the component scores
and the respective substantive importance of the components. We compared the
ranking list derived from the PCA to the judgmental and equal weighting schemes to assess the
consistency across the three different weighting options. (As an aside, we also used the findings to
inform our judgmental weights.)
The results indicate that the three different weighting options resulted in a very similar ranking of
public high schools for both the absolute and relative schools. There are high levels of correlation
across all three weighting schemes for both the absolute and relative schools, especially between the
judgmental and equal weighting schemes. Using a PCA, which accounts for the multicollinearity
between the items and derives the item loadings, components, and component weights empirically,
also resulted in a public high school ranking list similar to those from the judgmental and equal
weighting schemes (r=.93 and r=.94). Yet, despite these high levels of correlation, there is still some
variation among the rankings
of schools based on these different methods. For example, there are some changes in the top 10
schools using the different methods, and, in fact, the top two schools change. However, these
analyses do demonstrate that the public high school rankings, in this case, are not particularly
sensitive to the differences between our judgmental, equal and PCA weighting schemes.
It should be noted that these results are empirically derived from our data and unique to it. Future
researchers who construct composite scores to produce high school (or for that matter, college)
ranking lists should examine the sensitivity of their lists to their weighting schemes and should
examine issues arising from multicollinearity.

References
Baker, B.D., & Green, P.C. (2009). Conceptions, measurement, and application of educational
adequacy and equal educational opportunity. In G. Sykes, B. Schneider, & D. Plank (Eds.),
Handbook of education policy research. New York, NY: Routledge.
Balfanz, R., & Legters, N. (2004). How many central city high schools have a severe dropout
problem, where are they located and who attends them? Initial estimates using the Common
Core of Data. In G. Orfield (Ed.), Dropouts in America: Confronting the crisis. Cambridge, MA:
Harvard Educational Publishing.
Berne, R., & Stiefel, L. (1984). The measurement of equity in school finance. Baltimore, MD: Johns
Hopkins University Press.
Bobko, P., Roth, P., & Buster, M. (2007). The usefulness of unit weights in creating composite
scores: A literature review, application to content validity, and meta-analysis. Organizational
Research Methods, 10, 689-709.
Cobb, C. (2002). Performance-based accountability systems for public education. Unpublished manuscript, The
New Hampshire Center for Public Policy Studies.

Coleman, J.S., Campbell, E.Q., Hobson, C.J., McPartland, J.M., Mood, A.M., Weinfield, F.D., et al.
(1966). Equality of educational opportunity. Washington, DC: U.S. Government Printing Office.

Comrey, A.L., & Lee, H.B. (1992). A first course in factor analysis (2nd ed.). Hillsdale, NJ: Erlbaum.

Di Carlo, M. (May, 2013). A quick look at best high school rankings. Shanker Blog.
http://shankerblog.org/?p=8328

Duhon, C., Kuriki, A., Chen, J., & Noel, A. (2013). Identifying top-performing high schools for the best high
school rankings. Washington, DC: American Institutes for Research

Field, A. (2005). Discovering statistics using SPSS (2nd ed.). Thousand Oaks, CA: Sage Publications.

Gladwell, M. (February, 2011). The order of things: What college rankings really tell us. The New
Yorker. http://www.newyorker.com/magazine/2011/02/14/the-order-of-things

Goldhaber, D. (2002). The mystery of good teaching. Education Next, (Spring, 2002), 1-7.

Grubb, W.N. (2009). The money myth: School resources, outcomes and equity. New York, NY: Russell Sage
Foundation.

Hall, D. (2001). How states pursue public school accountability and measure performance of public schools, school
year 1998-1999. Unpublished manuscript, The New Hampshire Center for Public Policy
Studies.

Harris, D.N. (2010). Education production functions: Concepts. In D. Brewer & P. McEwan (Eds.),
Economics of education. San Diego, CA: Elsevier Ltd.
Hutcheson, G., & Sofroniou, N. (1999). The multivariate social scientist. London: Sage.
Jolliffe, I.T. (1972). Discarding variables in a principal component analysis, I: artificial data. Applied
Statistics, 21, 160-173.
Jolliffe, I.T. (1986). Principal component analysis. New York: Springer.
Kaiser, H. F. (1974). An index of factorial simplicity. Psychometrika, 39, 31-36.
Reardon, S. (2014). Education. In State of the Union: The poverty and inequality report (Pathways special
issue 2014). Stanford, CA: Stanford Center on Poverty and Inequality.
Rumberger, R., & Palardy, G. (2005). Test scores, dropout rates, and transfer rates as alternative
indicators of high school performance. American Education Research Journal, 42(1), 3-42.
Sirin, S.R. (2005). Socioeconomic status and academic achievement: A meta-analytic review of
research. Review of Educational Research, 75(3), 417-453. doi:10.3102/00346543075003417.
Stinebrickner, R., & Stinebrickner, T. (2003). Understanding educational outcomes of students from
low-income families. Journal of Human Resources, 38(3), 591-617.

Streib, L. (2013, May 6). Americas Best High Schools 2013: Behind the Rankings. Retrieved from
http://www.thedailybeast.com/articles/2013/05/06/america-s-best-high-schools-2013-
behind-the-rankings.html

Sutton, A., & Soderstrom, I. (1999). Predicting elementary and secondary school achievement with
school-related and demographic factors. The Journal of Educational Research, 92, 330-338.

Tabachnick, B., & Fidell, L. (2007). Using multivariate statistics (5th ed.). Boston: Allyn & Bacon.
Toutkoushian, R.K., & Curtis, T.C. (2005). Effects of socioeconomic factors on public high school
outcomes and rankings. The Journal of Educational Research, 98(5), 259-271.
Webster, T.J. (1999). A principal component analysis of the U.S. News & World Report tier rankings of
colleges and universities. Economics of Education Review, 20, 235-244.
