American Political Science Association is collaborating with JSTOR to digitize, preserve and extend access to
The American Political Science Review.
American Political Science Review Vol. 95, No. 3 September 2001
Measurement Validity: A Shared Standard for Qualitative and Quantitative Research
ROBERT ADCOCK and DAVID COLLIER University of California, Berkeley
Scholars routinely make claims that presuppose the validity of the observations and measurements that operationalize their concepts. Yet, despite recent advances in political science methods, surprisingly little attention has been devoted to measurement validity. We address this gap by exploring four themes. First, we seek to establish a shared framework that allows quantitative and qualitative scholars to assess more effectively, and communicate about, issues of valid measurement. Second, we underscore the need to draw a clear distinction between measurement issues and disputes about concepts. Third, we discuss the contextual specificity of measurement claims, exploring a variety of measurement strategies that seek to combine generality and validity by devoting greater attention to context. Fourth, we address the proliferation of terms for alternative measurement validation procedures and offer an account of the three main types of validation most relevant to political scientists.
Researchers routinely make complex choices about linking concepts to observations, that is, about connecting ideas with facts. These choices raise the basic question of measurement validity: Do the observations meaningfully capture the ideas contained in the concepts? We will explore the meaning of this question as well as procedures for answering it. In the process we seek to formulate a methodological standard that can be applied in both qualitative and quantitative research.

Measurement validity is specifically concerned with whether operationalization and the scoring of cases adequately reflect the concept the researcher seeks to measure. This is one aspect of the broader set of analytic tasks that King, Keohane, and Verba (1994, chap. 2) call "descriptive inference," which also encompasses, for example, inferences from samples to populations. Measurement validity is distinct from the validity of "causal inference" (chap. 3), which Cook and Campbell (1979) further differentiate into internal and external validity.1 Although measurement validity is interconnected with causal inference, it stands as an important methodological topic in its own right.

New attention to measurement validity is overdue in political science. While there has been an ongoing concern with applying various tools of measurement validation (Berry et al. 1998; Bollen 1993; Elkins 2000; Hill, Hanna, and Shafqat 1997; Schrodt and Gerner 1994), no major statement on this topic has appeared since Zeller and Carmines (1980) and Bollen (1989). Although King, Keohane, and Verba (1994, 25, 152-5) cover many topics with remarkable thoroughness, they devote only brief attention to measurement validity. New thinking about measurement, such as the idea of measurement as theory testing (Jacoby 1991, 1999), has not been framed in terms of validity.

Four important problems in political science research can be addressed through renewed attention to measurement validity. The first is the challenge of establishing shared standards for quantitative and qualitative scholars, a topic that has been widely discussed (King, Keohane, and Verba 1994; see also Brady and Collier 2001; George and Bennett n.d.). We believe the skepticism with which qualitative and quantitative researchers sometimes view each other's measurement tools does not arise from irreconcilable methodological differences. Indeed, substantial progress can be made in formulating shared standards for assessing measurement validity. The literature on this topic has focused almost entirely on quantitative research, however, rather than on integrating the two traditions. We propose a framework that yields standards for measurement validation and we illustrate how these apply to both approaches. Many of our quantitative and qualitative examples are drawn from recent comparative work on democracy, a literature in which both groups of researchers have addressed similar issues. This literature provides an opportunity to identify parallel concerns about validity as well as differences in specific practices.

A second problem concerns the relation between measurement validity and disputes about the meaning of concepts. The clarification and refinement of concepts is a fundamental task in political science, and carefully developed concepts are, in turn, a major prerequisite for meaningful discussions of measurement validity. Yet, we argue that disputes about concepts involve different issues from disputes about measurement validity. Our framework seeks to make this distinction clear, and we illustrate both types of disputes.

Robert Adcock (adcockr@uclink4.berkeley.edu) is a Ph.D. candidate, Department of Political Science, and David Collier (dcollier@socrates.berkeley.edu) is Professor of Political Science, University of California, Berkeley, CA 94720-1950.
Among the many colleagues who have provided helpful comments on this article, we especially thank Christopher Achen, Kenneth Bollen, Henry Brady, Edward Carmines, Rubette Cowan, Paul Dosh, Zachary Elkins, John Gerring, Kenneth Greene, Ernst Haas, Edward Haertel, Peter Houtzager, Diana Kapiszewski, Gary King, Marcus Kurtz, James Mahoney, Sebastian Mazzuca, Doug McAdam, Gerardo Munck, Charles Ragin, Sally Roever, Eric Schickler, Jason Seawright, Jeff Sluyter, Richard Snyder, Ruth Stanley, Laura Stoker, and three anonymous reviewers. The usual caveats apply. Robert Adcock's work on this project was supported by a National Science Foundation Graduate Fellowship.
1 These involve, respectively, the validity of causal inferences about the cases being studied, and the generalizability of causal inferences to a broader set of cases (Cook and Campbell 1979, 50-9, 70-80).

A third problem concerns the contextual specificity
of measurement validity, an issue that arises when a measure that is valid in one context is invalid in another. We explore several responses to this problem that seek a middle ground between a universalizing tendency, which is inattentive to contextual differences, and a particularizing approach, which is skeptical about the feasibility of constructing measures that transcend specific contexts. The responses we explore seek to incorporate sensitivity to context as a strategy for establishing equivalence across diverse settings.

A fourth problem concerns the frequently confusing language used to discuss alternative procedures for measurement validation. These procedures have often been framed in terms of different "types of validity," among which content, criterion, convergent, and construct validity are the best known. Numerous other labels for alternative types have also been coined, and we have found 37 different adjectives that have been attached to the noun "validity" by scholars wrestling with issues of conceptualization and measurement.2 The situation sometimes becomes further confused, given contrasting views on the interrelations among different types of validation. For example, in recent validation studies in political science, one valuable analysis (Hill, Hanna, and Shafqat 1997) treats "convergent" validation as providing evidence for "construct" validation, whereas another (Berry et al. 1998) treats these as distinct types. In the psychometrics tradition (i.e., in the literature on psychological and educational testing) such problems have spurred a theoretically productive reconceptualization. This literature has emphasized that the various procedures for assessing measurement validity must be seen, not as establishing multiple independent types of validity, but rather as providing different types of evidence for validity. In light of this reconceptualization, we differentiate between "validity" and "validation." We use validity to refer only to the overall idea of measurement validity, and we discuss alternative procedures for assessing validity as different "types of validation." In the final part of this article we offer an overview of three main types of validation, seeking to emphasize how procedures associated with each can be applied by both quantitative and qualitative researchers.

In the first section of this article we introduce a framework for discussing conceptualization, measurement, and validity. We then situate questions of validity in relation to broader concerns about the meaning of concepts. Next, we address contextual specificity and equivalence, followed by a review of the evolving discussion of types of validation. Finally, we focus on three specific types of validation that merit central attention in political science: content, convergent/discriminant, and nomological/construct validation.

OVERVIEW OF MEASUREMENT VALIDITY

Measurement validity should be understood in relation to issues that arise in moving between concepts and observations.

Levels and Tasks

We depict the relationship between concepts and observations in terms of four levels, as shown in Figure 1. At the broadest level is the background concept, which encompasses the constellation of potentially diverse meanings associated with a given concept. Next is the systematized concept, the specific formulation of a concept adopted by a particular researcher or group of researchers. It is usually formulated in terms of an explicit definition. At the third level are indicators, which are also routinely called measures. This level includes any systematic scoring procedure, ranging from simple measures to complex aggregated indexes. It encompasses not only quantitative indicators but also the classification procedures employed in qualitative research. At the fourth level are scores for cases, which include both numerical scores and the results of qualitative classification.

Downward and upward movement in Figure 1 can be understood as a series of research tasks. On the left-hand side, conceptualization is the movement from the background concept to the systematized concept. Operationalization moves from the systematized concept to indicators, and the scoring of cases applies indicators to produce scores. Moving up on the right-hand side, indicators may be refined in light of scores, and systematized concepts may be fine-tuned in light of knowledge about scores and indicators. Insights derived from these levels may lead to revisiting the background concept, which may include assessing alternative formulations of the theory in which a particular systematized concept is embedded. Finally, to define a key overarching term, "measurement" involves the interaction among levels 2 to 4.

Defining Measurement Validity

Valid measurement is achieved when scores (including the results of qualitative classification) meaningfully capture the ideas contained in the corresponding concept. This definition parallels that of Bollen (1989, 184), who treats validity as "concerned with whether a variable measures what it is supposed to measure." King, Keohane, and Verba (1994, 25) give essentially the same definition.

2 We have found the following adjectives attached to validity in discussions of conceptualization and measurement: a priori, apparent, assumption, common-sense, conceptual, concurrent, congruent, consensual, consequential, construct, content, convergent, criterion-related, curricular, definitional, differential, discriminant, empirical, face, factorial, incremental, instrumental, intrinsic, linguistic, logical, nomological, postdictive, practical, pragmatic, predictive, rational, response, sampling, status, substantive, theoretical, and trait. A parallel proliferation of adjectives, in relation to the concept of democracy, is discussed in Collier and Levitsky 1997.

If the idea of measurement validity is to do serious methodological work, however, its focus must be further specified, as emphasized by Bollen (1989, 197). Our specification involves both ends of the connection between concepts and scores shown in Figure 1. At the concept end, our basic point (explored in detail below) is that measurement validation should focus on the
[Figure 1: diagram of the levels and tasks, only partly recoverable from the extracted text. Surviving labels: Level 2 (text partly lost; "commonly involves an explicit definition"). Task: Operationalization. Developing, on the basis of a systematized concept, one or more indicators for scoring/classifying cases. Task: Modifying Systematized Concept. Fine-tuning the systematized concept, possibly extensively revising it, in light of insights about scores and indicators. Level 3. Indicators. Also referred to as "measures" and "operationalizations."]
relation between observations and the systematized concept; any potential disputes about the background concept should be set aside as an important but separate issue. With regard to scores, an obvious but crucial point must be stressed: Scores are never examined in isolation; rather, they are interpreted and given meaning in relation to the systematized concept.

In sum, measurement is valid when the scores (level 4 in Figure 1), derived from a given indicator (level 3), can meaningfully be interpreted in terms of the systematized concept (level 2) that the indicator seeks to operationalize. It would be cumbersome to refer repeatedly to all these elements, but the appropriate focus of measurement validation is on the conjunction of these components.

Measurement Error, Reliability, and Validity

Validity is often discussed in connection with measurement error and reliability. Measurement error may be systematic, in which case it is called bias, or random. Random error, which occurs when repeated applications of a given measurement procedure yield inconsistent results, is conventionally labeled a problem of reliability. Methodologists offer two accounts of the relation between reliability and validity. (1) Validity is sometimes understood as exclusively involving bias, that is, error that takes a consistent direction or form. From this perspective, validity involves systematic error, whereas reliability involves random error (Carmines and Zeller 1979, 14-5; see also Babbie 2001,
144-5). Therefore, unreliable scores may still be correct "on average" and in this sense valid. (2) Alternatively, some scholars hesitate to view scores as valid if they contain large amounts of random error. They believe validity requires the absence of both types of error. Therefore, they view reliability as a necessary but not sufficient condition of measurement validity (Kirk and Miller 1986, 20; Shively 1998, 45).

Our goal is not to adjudicate between these accounts but to state them clearly and to specify our own focus, namely, the systematic error that arises when the links among systematized concepts, indicators, and scores are poorly developed. This involves validity in the first sense stated above. Of course, the random error that routinely arises in scoring cases is also important, but it is not our primary concern.

A final point should be emphasized. Because error is a pervasive threat to measurement, it is essential to view the interpretations of scores in relation to systematized concepts as falsifiable claims (Messick 1989, 13-4). Scholars should treat these claims just as they would any causal hypothesis, that is, as tentative statements that require supporting evidence. Validity assessment is the search for this evidence.

MEASUREMENT VALIDITY AND CHOICES ABOUT CONCEPTS

A growing body of work considers the systematic analysis of concepts an important component of political science methodology.3 How should we understand the relation between issues of measurement validity and broader choices about concepts, which are a central focus of this literature?

Conceptual Choices: Forming the Systematized Concept

We view systematized concepts as the point of departure for assessing measurement validity. How do scholars form such concepts? Because background concepts routinely include a variety of meanings, the formation of systematized concepts often involves choosing among them. The number of feasible options varies greatly. At one extreme are concepts such as triangle, which are routinely understood in terms of a single conceptual systematization; at the other extreme are "contested concepts" (Gallie 1956), such as democracy. A careful examination of diverse meanings helps clarify the options, but ultimately choices must be made.

These choices are deeply intertwined with issues of theory, as emphasized in Kaplan's (1964, 53) paradox of conceptualization: "Proper concepts are needed to formulate a good theory, but we need a good theory to arrive at the proper concepts.... The paradox is resolved by a process of approximation: the better our concepts, the better the theory we can formulate with them, and in turn, the better the concepts available for the next, improved theory." Various examples of this intertwining are explored in recent analyses of important concepts, such as Laitin's (2000) treatment of language community and Kurtz's (2000) discussion of peasant. Fearon and Laitin's (2000) analysis of ethnic conflict, in which they begin with their hypothesis and ask what operationalization is needed to capture the conceptions of ethnic group and ethnic conflict entailed in this hypothesis, further illustrates the interaction of theory and concepts.

In dealing with the choices that arise in establishing the systematized concept, researchers must avoid three common traps. First, they should not misconstrue the flexibility inherent in these choices as suggesting that everything is up for grabs. This is rarely, if ever, the case. In any field of inquiry, scholars commonly associate a matrix of potential meanings with the background concept. This matrix limits the range of plausible options, and the researcher who strays outside it runs the risk of being dismissed or misunderstood. We do not mean to imply that the background concept is entirely fixed. It evolves over time, as new understandings are developed and old ones are revised or fall from use. At a given time, however, the background concept usually provides a relatively stable matrix. It is essential to recognize that a real choice is being made, but it is no less essential to recognize that this is a limited choice.

Second, scholars should avoid claiming too much in defending their choice of a given systematized concept. It is not productive to treat other options as self-evidently ruled out by the background concept. For example, in the controversy over whether democracy versus nondemocracy should be treated as a dichotomy or in terms of gradations, there is too much reliance on claims that the background concept of democracy inherently rules out one approach or the other (Collier and Adcock 1999, 546-50). It is more productive to recognize that scholars routinely emphasize different aspects of a background concept in developing systematized concepts, each of which is potentially plausible. Rather than make sweeping claims about what the background concept "really" means, scholars should present specific arguments, linked to the goals and context of their research, that justify their particular choices.

A third problem occurs when scholars stop short of providing a fleshed-out account of their systematized concepts. This requires not just a one-sentence definition, but a broader specification of the meaning and entailments of the systematized concept. Within the psychometrics literature, Shepard (1993, 417) summarizes what is required: "both an internal model of interrelated dimensions or subdomains" of the systematized concept, and "an external model depicting its relationship to other [concepts]." An example is Bollen's (1990, 9-12; see also Bollen 1980) treatment of political democracy, which distinguishes the two dimensions of "political rights" and "political liberties," clarifies these by contrasting them with the dimensions

3 Examples of earlier work in this tradition are Sartori 1970, 1984 and Sartori, Riggs, and Teune 1975. More recent studies include Collier and Levitsky 1997; Collier and Mahon 1993; Gerring 1997, 1999, 2001; Gould 1999; Kurtz 2000; Levitsky 1998; Schaffer 1998. Important work in political theory includes Bevir 1999; Freeden 1996; Gallie 1956; Pitkin 1967, 1987.
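
The two accounts of reliability and validity distinguished under Measurement Error, Reliability, and Validity can be made concrete with a small simulation. The sketch below is illustrative only and is not part of the original analysis: the function names, the error magnitudes, and the uniform distribution of true values are all assumptions chosen for the example. It generates scores from two hypothetical indicators, one affected only by random error and one affected mainly by systematic error (bias), and compares the mean and the spread of their errors.

```python
import random
import statistics


def simulate_errors(n_cases, bias, noise_sd, seed):
    """Return measurement errors (score minus true value) for n_cases.

    Hypothetical setup: each case has a true value on a 0-10 scale, and the
    indicator adds a constant bias plus normally distributed random noise.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(n_cases):
        true_value = rng.uniform(0.0, 10.0)
        score = true_value + bias + rng.gauss(0.0, noise_sd)
        errors.append(score - true_value)
    return errors


def summarize(errors):
    """Mean error approximates bias; error spread reflects unreliability."""
    return {
        "mean_error": statistics.fmean(errors),
        "error_sd": statistics.stdev(errors),
    }


# Indicator A (assumed): no bias, large random error (unreliable).
noisy = summarize(simulate_errors(10_000, bias=0.0, noise_sd=2.0, seed=1))
# Indicator B (assumed): constant bias of 1.5, very little random error.
biased = summarize(simulate_errors(10_000, bias=1.5, noise_sd=0.1, seed=2))

print("random error only:", noisy)
print("systematic error: ", biased)
```

Under these assumed settings, the first indicator is correct "on average" (its mean error is close to zero) despite being unreliable, which is the sense in which the first account would still treat its scores as valid. The second indicator is highly consistent yet systematically off target. Scholars who adopt the second account, which requires the absence of both types of error, would hesitate to call either set of scores valid.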
elections, some civilian governments in Central and South America to varying degrees lacked effective power to govern. A basic concern was the persistence of "reserved domains" of military power over which elected governments had little authority (Valenzuela 1992, 70). Because procedural definitions of democracy did not explicitly address this issue, measures based upon them could result in a high democracy score for these countries, but it appeared invalid to view them as democratic. Some scholars therefore amended their systematized concept of democracy to add the differentiating attribute that the elected government must to a reasonable degree have the power to rule (Karl 1990, 2; Loveman 1994, 108-13; Valenzuela 1992, 70). Debate persists over the scoring of specific cases (Rabkin 1992, 165), but this innovation is widely accepted among scholars in the procedural tradition (Huntington 1991, 10; Mainwaring, Brinks, and Pérez-Liñán 2001; Markoff 1996, 102-4). As a result of this friendly amendment, analysts did a better job of capturing, for these new cases, the underlying idea of procedural minimum democracy.

CONTEXTUAL SPECIFICITY, VALIDITY, AND EQUIVALENCE

Contextual specificity is a fundamental concern that arises when differences in context potentially threaten the validity of measurement. This is a central topic in psychometrics, the field that has produced the most innovative work on validity theory. This literature emphasizes that the same score on an indicator may have different meanings in different contexts (Moss 1992, 236-8; see also Messick 1989, 15). Hence, the validation of an interpretation of scores generated in one context does not imply that the same interpretation is valid for scores generated in another context. In political science, this concern with context can arise when scholars are making comparisons across different world regions or distinct historical periods. It can also arise in comparisons within a national (or other) unit, given that different subunits, regions, or subgroups may constitute very different political, social, or cultural contexts.

The potential difficulty that context poses for valid measurement, and the related task of establishing measurement equivalence across diverse units, deserve more attention in political science. In a period when the quest for generality is a powerful impulse in the social sciences, scholars such as Elster (1999, chap. 1) have strongly challenged the plausibility of seeking general, law-like explanations of political phenomena. A parallel constraint on the generality of findings may be imposed by the contextual specificity of measurement validity. We are not arguing that the quest for generality be abandoned. Rather, we believe greater sensitivity to context may help scholars develop measures that can be validly applied across diverse contexts. This goal requires concerted attention to the issue of equivalence.

Contextual Specificity in Political Research

Contextual specificity affects many areas of political science. It has long been a problem in cross-national survey research (Sicinski 1970; Verba 1971; Verba, Nie, and Kim 1978, 32-40; Verba et al. 1987, Appendix). An example concerning features of national context is Cain and Ferejohn's (1981) discussion of how the differing structure of party systems in the United States and Great Britain should be taken into account when comparing party identification. Context is also a concern for survey researchers working within a single nation, who wrestle with the dilemma of "inter-personally incomparable responses" (Brady 1985). For example, scholars debate whether a given survey item has the same meaning for different population subgroups, which could be defined, for example, by region, gender, class, or race. One specific concern is whether population subgroups differ systematically in their "response style" (also called "response sets"). Some groups may be more disposed to give extreme answers, and others may tend toward moderate answers (Greenleaf 1992). Bachman and O'Malley (1984) show that response style varies consistently with race. They argue that apparently important differences across racial groups may in part reflect only a different manner of answering questions. Contextual specificity also can be a problem in survey comparisons over time, as Baumgartner and Walker (1990) point out in discussing group membership in the United States.

The issue of contextual specificity of course also arises in macro-level research in international and comparative studies (Bollen, Entwisle, and Anderson 1993, 345). Examples from the field of comparative politics are discussed below. In international relations, attention to context, and particularly a concern with "historicizing the concept of structure," is central to "constructivism" (Ruggie 1998, 875). Constructivists argue that modern international relations rest upon "constitutive rules" that differ fundamentally from those of both medieval Christendom and the classical Greek world (p. 873). Although they recognize that sovereignty is an organizing principle applicable across diverse settings, the constructivists emphasize that the "meaning and behavioral implications of this principle vary from one historical context to another" (Reus-Smit 1997, 567). On the other side of this debate, neorealists such as Fischer (1993, 493) offer a general warning: If pushed to an extreme, the "claim to context dependency" threatens to "make impossible the collective pursuit of empirical knowledge." He also offers specific historical support for the basic neorealist position that the behavior of actors in international politics follows consistent patterns. Fischer (1992, 463, 465) concludes that "the structural logic of action under anarchy has the character of an objective law," which is grounded in "an unchanging essence of human nature."

The recurring tension in social research between particularizing and universalizing tendencies reflects in part contrasting degrees of concern with contextual specificity. The approaches to establishing equivalence
psychology. A recurring metaphor in that field characterized the three types as "something of a holy trinity representing three different roads to psychometric salvation" (Guion 1980, 386). These types may be briefly defined as follows.

* Content validity assesses the degree to which an indicator represents the universe of content entailed in the systematized concept being measured.
* Criterion validity assesses whether the scores produced by an indicator are empirically associated with scores for other variables, called criterion variables, which are considered direct measures of the phenomenon of concern.
* Construct validity has had a range of meanings. One central focus has been on assessing whether a given indicator is empirically associated with other indicators in a way that conforms to theoretical expectations about their interrelationship.

These labels remain very influential and are still the centerpiece in some discussions of measurement validity, as in the latest edition of Babbie's (2001, 143-4) widely used methods textbook for undergraduates.

The second phase grew out of increasing dissatisfaction with the "trinity" and led to a "unitarian" approach (Shultz, Riggs, and Kottke 1998, 269-71). A basic problem identified by Guion (1980, 386) and others was that the threefold typology was too often taken to mean that any one type was sufficient to establish validity (Angoff 1988, 25). Scholars increasingly argued that the different types should be subsumed under a single concept. Hence, to continue with the prior metaphor, the earlier trinity came to be seen "in a monotheistic mode as the three aspects of a unitary psychometric divinity" (p. 25).

Much of the second phase involved a reconceptualization of construct validity and its relation to content and criterion validity. A central argument was that the latter two may each be necessary to establish validity, but neither is sufficient. They should be understood as part of a larger process of validation that integrates "multiple sources of evidence" and requires the combination of "logical argument and empirical evidence" (Shepard 1993, 406). Alongside this development, a reconceptualization of construct validity led to "a more comprehensive and theory-based view that subsumed other more limited perspectives" (Shultz, Riggs, and Kottke 1998, 270). This broader understanding of construct validity as the overarching goal of a single, integrated process of measurement validation is widely endorsed by psychometricians. Moss (1995, 6) states "there is a close to universal consensus among validity theorists" that "content- and criterion-related evidence of validity are simply two of many types of evidence that support construct validity."

Thus, in the psychometric literature (e.g., Messick 1980, 1015), the term "construct validity" has become essentially a synonym for what we call measurement validity. We have adopted measurement validity as the name for the overall topic of this article, in part because in political science the label construct validity [...] the general idea of valid measurement. These specific procedures generally do not encompass content validation and have in common the practice of assessing measurement validity by taking as a point of reference established conceptual and/or theoretical relationships.

We find it helpful to group these procedures into two types according to the kind of theoretical or conceptual relationship that serves as the point of reference. Specifically, these types are based on the heuristic distinction between description and explanation.7 First, some procedures rely on "descriptive" expectations concerning whether given attributes are understood as facets of the same phenomenon. This is the focus of what we label "convergent/discriminant validation." Second, other procedures rely on relatively well-established "explanatory" causal relations as a baseline against which measurement validity is assessed. In labeling this second group of procedures we draw on Campbell's (1960, 547) helpful term, "nomological" validation, which evokes the idea of assessment in relation to well-established causal hypotheses or law-like relationships. This second type is often called construct validity in political research (Berry et al. 1998; Elkins 2000).8 Out of deference to this usage, in the headings and summary statements below we will refer to nomological/construct validation.

Types of Validation in Political Analysis

A baseline for the revised discussion of validation presented below is provided in work by Carmines and Zeller, and by Bollen. Carmines and Zeller (1979, 26; Zeller and Carmines 1980, 78-80) argue that content validation and criterion validation are of limited utility in fields such as political science. While recognizing that content validation is important in psychology and education, they argue that evaluating it "has proved to be exceedingly difficult with respect to measures of the more abstract phenomena that tend to characterize the social sciences" (Carmines and Zeller 1979, 22). For criterion validation, these authors emphasize that in many social sciences, few "criterion" variables are available that can serve as "real" measures of the phenomena under investigation, against which scholars can evaluate alternative measures (pp. 19-20). Hence, for many purposes it is simply not a relevant procedure. Although Carmines and Zeller call for the use of multiple sources of evidence, their emphasis on the limitations of the first two types of validation leads them to give a predominant role to nomological/construct validation.

In relation to Carmines and Zeller, Bollen (1989, 185-6, 190-4) adds convergent/discriminant validation

7 Description and explanation are of course intertwined, but we find this distinction invaluable for exploring contrasts among validation procedures. While these procedures do not always fit in sharply bounded categories, many do indeed focus on either descriptive or explanatory relations and hence are productively differentiated by our typology.
8 See also the main examples of construct validation presented in the major statements by Carmines and Zeller 1979, 23, and Bollen 1989,
commonlyrefers to specificproceduresratherthan to 189-90.
1955, 282). Without it, a well-focused validation question may rapidly become entangled in a broader
dispute over the concept. Such agreement can be provided if the systematized concept is taken as
given, so attention can be focused on whether a particular indicator adequately captures its content.

Examples of Content Validation. Within the psychometric tradition (Angoff 1988, 27-8; Shultz, Riggs,
and Kottke 1998, 267-8), content validation is understood as focusing on the relationship between the
indicator (level 3) and the systematized concept (level 2), without reference to the scores of
specific cases (level 4). We will first present examples from political science that adopt this
focus. We will then turn to a somewhat different, "case-oriented" procedure (Ragin 1987, chap. 3),
identified with qualitative research, in which the examination of scores for specific cases plays a
central role in content validation.

Two examples from political research illustrate, respectively, the problems of omission of key
elements from the indicator and inclusion of inappropriate elements. Paxton's (2000) article on
democracy focuses on the first problem. Her analysis is particularly salient for scholars in the
qualitative tradition, given its focus on choices about the dichotomous classification of cases.
Paxton contrasts the systematized concepts of democracy offered by several prominent scholars
(Bollen, Gurr, Huntington, Lipset, Muller, and Rueschemeyer, Stephens, and Stephens) with the actual
content of the indicators they propose. She takes their systematized concepts as given, which
establishes common conceptual ground. She observes that these scholars include universal suffrage in
what is in effect their systematized concept of democracy, but the indicators they employ in
operationalizing the concept consider only male suffrage. Paxton thus focuses on the problem that an
important component of the systematized concept is omitted from the indicator.

The debate on Vanhanen's (1979, 1990) quantitative indicator of democracy illustrates the alternative
problem that the indicator incorporates elements that correspond to a concept other than the
systematized concept of concern. Vanhanen seeks to capture the idea of political competition that is
part of his systematized concept of democracy by including, as a component of his scale, the
percentage of votes won by parties other than the largest party. Bollen (1990, 13, 15) and Coppedge
(1997, 6) both question this measure of democracy, arguing that it incorporates elements drawn from a
distinct concept, the structure of the party system.

Case-Oriented Content Validation. Researchers engaged in the qualitative classification of cases
routinely carry out a somewhat different procedure for content validation, based on the relation
between conceptual meaning and choices about scoring particular cases. In the vocabulary of Sartori
(1970, 1040-6), this concerns the relation between the "intension" (meaning) and "extension" (set of
positive cases) of the concept. For Sartori, an essential aspect of concept formation is the
procedure of adjusting this relation between cases and concept. In the framework of Figure 1, this
procedure involves revising the indicator (i.e., the scoring procedure) in order to sort cases in a
way that better fits conceptual expectations, and potentially fine-tuning the systematized concept to
better fit the cases. Ragin (1994, 98) terms this process of mutual adjustment "double fitting." This
procedure avoids conceptual stretching (Collier and Mahon 1993; Sartori 1970), that is, a mismatch
between a systematized concept and the scoring of cases, which is clearly an issue of validity.

An example of case-oriented content validation is found in O'Donnell's (1996) discussion of
democratic consolidation. Some scholars suggest that one indicator of consolidation is the capacity
of a democratic regime to withstand severe crises. O'Donnell argues that by this standard, some Latin
American democracies would be considered more consolidated than those in southern Europe. He finds
this an implausible classification because the standard leads to a "reductio ad absurdum" (p. 43).
This example shows how attention to specific cases can spur recognition of dilemmas in the adequacy
of content and can be a productive tool in content validation.

In sum, for case-oriented content validation, upward movement in Figure 1 is especially important. It
can lead to both refining the indicator in light of scores and fine-tuning the systematized concept.
In addition, although the systematized concept being measured is usually relatively stable, this form
of validation may lead to friendly amendments that modify the systematized concept by drawing ideas
from the background concept. To put this another way, in this form of validation both an "inductive"
component and conceptual innovation are especially important.

Limitations of Content Validation. Content validation makes an important contribution to the
assessment of measurement validity, but alone it is incomplete, for two reasons. First, although a
necessary condition, the findings of content validation are not a sufficient condition for
establishing validity (Shepard 1993, 414-5; Sireci 1998, 112). The key point is that an indicator
with valid content may still produce scores with low overall measurement validity, because further
threats to validity can be introduced in the coding of cases. A second reason concerns the trade-off
between parsimony and completeness that arises because indicators routinely fail to capture the full
content of a systematized concept. Capturing this content may require a complex indicator that is
hard to use and adds greatly to the time and cost of completing the research. It is a matter of
judgment for scholars to decide when efforts to further improve the adequacy of content may become
counterproductive.

It is useful to complement the conceptual criticism of indicators by examining whether particular
modifications in an indicator make a difference in the scoring of cases. To the extent that such
modifications have little influence on scores, their contribution to improving validity is more
modest. An example in which their contribution is shown to be substantial is provided by Paxton
(2000). She develops an alternative indicator of
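
The check described above, asking whether a modification to an indicator actually changes the scoring of cases, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the five cases, their attributes, and the two scoring rules (`score_male_only`, `score_universal`) are hypothetical and are not taken from Paxton's data or from any of the indicators she reviews. The sketch simply counts how many dichotomous classifications change when a suffrage criterion restricted to men is replaced by a universal one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    """A hypothetical country-period; all values below are invented."""
    name: str
    male_suffrage: bool          # adult men may vote
    female_suffrage: bool        # adult women may vote
    competitive_elections: bool


def score_male_only(case: Case) -> bool:
    """Indicator that omits women's suffrage (the omission problem)."""
    return case.competitive_elections and case.male_suffrage


def score_universal(case: Case) -> bool:
    """Modified indicator that requires suffrage for men and women."""
    return (case.competitive_elections
            and case.male_suffrage
            and case.female_suffrage)


def rescoring_report(cases, old_rule, new_rule):
    """Return the names of cases whose score changes, and their share."""
    changed = [c.name for c in cases if old_rule(c) != new_rule(c)]
    return changed, len(changed) / len(cases)


# Hypothetical cases, for illustration only.
CASES = [
    Case("Case A", True, True, True),
    Case("Case B", True, False, True),
    Case("Case C", True, False, True),
    Case("Case D", False, False, True),
    Case("Case E", True, True, False),
]

changed, share = rescoring_report(CASES, score_male_only, score_universal)
print("Cases rescored:", changed)
print("Share of cases rescored:", share)
```

If the share of rescored cases is small, the modification contributes only modestly to validity; if it is large, the conceptual criticism of the original indicator has substantial empirical consequences.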
tors and the concepts of political liberties and democratic rule, the standard of content validation
is clearly met, and Bollen continues to use these overarching labels.

Another concern arises over the interpretation of low correlations among indicators. Analysts who
lack a "true" measure against which to assess validity must base convergent validation on a set of
indicators, none of which may be a very good measure of the systematized concept. The result may be
low correlations among indicators, even though they have shared variance that measures the concept.
One possible solution is to focus on this shared variance, even though the overall correlations are
low. Standard statistical techniques may be used to tap this shared variance.

The opposite problem also is a concern: the limitations of inferring validity from a high correlation
among indicators. Such a correlation may reflect factors other than valid measurement. For example,
two indicators may be strongly correlated because they both measure some other concept; or they may
measure different concepts, one of which causes the other. A plausible response is to think through,
and seek to rule out, alternative reasons for the high correlation.10

Although framing these concerns in the language of high and low correlations appears to orient the
discussion toward quantitative researchers, qualitative researchers face parallel issues.
Specifically, these issues arise when qualitative researchers analyze the sorting of cases produced
by alternative classification procedures that represent different ways of operationalizing either a
given concept (i.e., convergent validation) or two or more concepts that are presumed to be distinct
(i.e., discriminant validation). Given that these scholars are probably working with a small N, they
may be able to draw on their knowledge of cases to assess alternative explanations for convergences
and divergences among the sorting of cases yielded by different classification procedures. In this
way, they can make valuable inferences about validity. Quantitative researchers, by contrast, have
other tools for making these inferences, to which we now turn.

Convergent Validation and Structural Equation Models with Latent Variables. In quantitative research,
an important means of responding to the limitations of simple correlational procedures for
convergent/discriminant validation is offered by structural equation models with latent variables
(also called LISREL-type models). Some treatments of such models, to the extent that they discuss
measurement error, focus their attention on random error, that is, on reliability (Hayduk 1987, e.g.,
118-24; 1996).11 However, Bollen has made systematic error, which is the concern of the present
article, a central focus in his major methodological statement on this approach (1989, 190-206). He
demonstrates, for example, its distinctive contribution for a scholar concerned with
convergent/discriminant validation who is dealing with a data set with high correlations among
alternative indicators. In this case, structural equations with latent variables can be used to
estimate the degree to which these high correlations derive from shared systematic bias, rather than
reflect the valid measurement of an underlying concept.12

This approach is illustrated by Bollen (1993) and Bollen and Paxton's (1998, 2000) evaluation of
eight indicators of democracy taken from data sets developed by Banks, Gastil, and Sussman.13 For
each indicator, Bollen and Paxton estimate the percent of total variance that validly measures
democracy, as opposed to reflecting systematic and random error. The sources of systematic error are
then explored. Bollen and Paxton conclude, for example, that Gastil's indicators have "conservative"
bias, giving higher scores to countries that are Catholic, that have traditional monarchies, and that
are not Marxist-Leninist (Bollen 1993, 1221; Bollen and Paxton 2000, 73). This line of research is an
outstanding example of the sophisticated use of convergent/discriminant validation to identify
potential problems of political bias.

In discussing Bollen's treatment and application of structural equation models we would like to note
both similarities, and a key contrast, in relation to the practice of qualitative researchers. Bollen
certainly shares the concern with careful attention to concepts, and with knowledge of cases, that we
have emphasized above, and that is characteristic of case-oriented content validation as practiced by
qualitative researchers. He insists that complex quantitative techniques cannot replace careful
conceptual and theoretical reasoning; rather they presuppose it. Furthermore, "structural equation
models are not very helpful if you have little idea about the subject matter" (Bollen 1989, vi; see
also 194). Qualitative researchers, carrying out a case-by-case assessment of the scores on different
indicators, could of course reach some of the same conclusions about validity and political bias
reached by Bollen. A structural equation approach, however, does offer a

10 On the appropriate size of the correlation, see Bollen and Lennox 1991, 305-7.
11 To take a political science application, Green and Palmquist's (1990) study also reflects this
focus on random error. By contrast, Green (1991) goes farther by considering both random and
systematic error. Like the work by Bollen discussed below, these articles are an impressive
demonstration of how LISREL-type models can incorporate a concern with measurement error into
conventional statistical analysis, and how this can in turn lead to a major reevaluation of
substantive findings, in this case concerning party identification (Green 1991, 67-71).
12 Two points about structural equation models with latent variables should be underscored. First, as
noted below, these models can also be used in nomological/construct validation, and hence should not
be associated exclusively with convergent/discriminant validation, which is the application discussed
here. Second, we have emphasized that convergent/discriminant validation focuses on "descriptive"
relations among concepts and their components. Within this framework, it merits emphasis that the
indicators that measure a given latent variable (i.e., concept) in these models are conventionally
interpreted as "effects" of this latent variable (Bollen 1989, 65; Bollen and Lennox 1991, 305-6).
These effects, however, do not involve causal interactions among distinct phenomena. Such
interactions, which in structural equation models involve causal relations among different latent
variables, are the centerpiece of the conventional understanding of "explanation." By contrast, the
links between one latent variable and its indicators are productively understood as involving a
"descriptive" relationship.
13 See, for example, Banks 1979; Gastil 1988; Sussman 1982.
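
The warning above about inferring validity from a high correlation among indicators can be illustrated with a small simulation. What follows is a minimal sketch under loudly stated assumptions, not a structural equation model and not the procedure used by Bollen or by Bollen and Paxton: the latent concept, the shared bias, the unit variances, and the names `ind_a`, `ind_b`, and `ind_c` are all invented for illustration. Indicators A and B share a systematic bias, while indicator C is unbiased but noisy. In real research the latent concept is unobserved; here it is simulated, so each indicator's correlation with it can be inspected directly.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=5000, seed=0):
    """Draw a latent concept, a shared bias, and three hypothetical indicators."""
    rng = random.Random(seed)
    concept = [rng.gauss(0, 1) for _ in range(n)]   # what we want to measure
    bias = [rng.gauss(0, 1) for _ in range(n)]      # shared systematic error
    # A and B both pick up the concept AND the same bias, plus random error.
    ind_a = [t + b + rng.gauss(0, 1) for t, b in zip(concept, bias)]
    ind_b = [t + b + rng.gauss(0, 1) for t, b in zip(concept, bias)]
    # C picks up only the concept, plus random error.
    ind_c = [t + rng.gauss(0, 1) for t in concept]
    return concept, ind_a, ind_b, ind_c


concept, ind_a, ind_b, ind_c = simulate()

# Under these assumptions the population correlations are
# corr(A, B) = 2/3 and corr(A, C) = 1/sqrt(6), about 0.41.
r_ab = pearson(ind_a, ind_b)
r_ac = pearson(ind_a, ind_c)

# Correlation of each indicator with the (normally unobservable) concept:
# population values are 1/sqrt(3) for A and 1/sqrt(2) for C.
validity_a = pearson(ind_a, concept)
validity_c = pearson(ind_c, concept)

print("corr(A, B):", round(r_ab, 2))
print("corr(A, C):", round(r_ac, 2))
print("corr(A, concept):", round(validity_a, 2))
print("corr(C, concept):", round(validity_c, 2))
```

In this setup the two biased indicators agree with each other more strongly than either agrees with the unbiased indicator, yet the unbiased indicator tracks the concept more closely. High convergence is therefore not by itself evidence of valid measurement, which is why a model that separates shared systematic error from valid variance is needed.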
models, this process must also incorporate the careful use of content validation, as Bollen
emphasizes.

Second, we have encouraged scholars to distinguish between issues of measurement validity and broader
conceptual disputes. Building on the contrast between the background concept and the systematized
concept (Figure 1), we have explored how validity issues and conceptual issues can be separated. We
believe that this separation is essential if scholars are to give a consistent focus to the idea of
measurement validity, and particularly to the practice of content validation.

Third, we examined alternative procedures for adapting operationalization to specific contexts:
context-specific domains of observation, context-specific indicators, and adjusted common indicators.
These procedures make it easier to take a middle position between universalizing and particularizing
tendencies. Yet, we also emphasize that the decision to pursue context-specific approaches should be
carefully considered and justified.

Fourth, we have presented an understanding of measurement validation that can plausibly be applied in
both quantitative and qualitative research. Although most discussions of validation focus on
quantitative research, we have formulated each type in terms of basic questions intended to clarify
the relevance to both quantitative and qualitative analysis. We have also given examples of how these
questions can be addressed by scholars from within both traditions. These examples also illustrate,
however, that while they may be addressing the same questions, quantitative and qualitative scholars
often employ different tools in finding answers.

Within this framework, qualitative and quantitative researchers can learn from these differences.
Qualitative researchers could benefit from self-consciously applying the validation procedures that
to some degree they may already be employing implicitly and, in particular, from developing and
comparing alternative indicators of a given systematized concept. They should also recognize that
nomological validation can be important in qualitative research, as illustrated by the Lijphart and
Anderson examples above. Quantitative researchers, in turn, could benefit from more frequently
supplementing other tools for validation by employing a case-oriented approach, using the close
examination of specific cases to identify threats to measurement validity.

REFERENCES

Alvarez, Michael, Jose Antonio Cheibub, Fernando Limongi, and Adam Przeworski. 1996. "Classifying Political Regimes." Studies in Comparative International Development 31 (Summer): 3-36.
American Psychological Association. 1954. "Technical Recommendations for Psychological Tests and Diagnostic Techniques." Psychological Bulletin 51 (2, Part 2): 201-38.
American Psychological Association. 1966. Standards for Educational and Psychological Tests and Manuals. Washington, DC: American Psychological Association.
Anderson, Barbara A., and Brian D. Silver. 1986. "Measurement and Mismeasurement of the Validity of the Self-Reported Vote." American Journal of Political Science 30 (November): 771-85.
Anderson, Perry. 1974. Lineages of the Absolutist State. London: Verso.
Angoff, William H. 1988. "Validity: An Evolving Concept." In Test Validity, ed. Howard Wainer and Henry I. Braun. Hillsdale, NJ: Lawrence Erlbaum. Pp. 19-32.
Babbie, Earl R. 2001. The Practice of Social Research, 9th ed. Belmont, CA: Wadsworth.
Bachman, Jerald G., and Patrick M. O'Malley. 1984. "Yea-Saying, Nay-Saying, and Going to Extremes: Black-White Differences in Response Styles." Public Opinion Quarterly 48 (Summer): 491-509.
Banks, Arthur S. 1979. Cross-National Time-Series Data Archive User's Manual. Binghamton: Center for Social Analysis, State University of New York at Binghamton.
Baumgartner, Frank R., and Jack L. Walker. 1990. "Response to Smith's 'Trends in Voluntary Group Membership: Comments on Baumgartner and Walker': Measurement Validity and the Continuity of Research in Survey Research." American Journal of Political Science 34 (August): 662-70.
Berry, William D., Evan J. Ringquist, Richard C. Fording, and Russell L. Hanson. 1998. "Measuring Citizen and Government Ideology in the American States, 1960-93." American Journal of Political Science 42 (January): 327-48.
Bevir, Mark. 1999. The Logic of the History of Ideas. Cambridge: Cambridge University Press.
Bollen, Kenneth A. 1980. "Issues in the Comparative Measurement of Political Democracy." American Sociological Review 45 (June): 370-90.
Bollen, Kenneth A. 1989. Structural Equations with Latent Variables. New York: Wiley.
Bollen, Kenneth A. 1990. "Political Democracy: Conceptual and Measurement Traps." Studies in Comparative International Development 25 (Spring): 7-24.
Bollen, Kenneth A. 1993. "Liberal Democracy: Validity and Method Factors in Cross-National Measures." American Journal of Political Science 37 (November): 1207-30.
Bollen, Kenneth A., Barbara Entwisle, and Arthur S. Alderson. 1993. "Macrocomparative Research Methods." Annual Review of Sociology 19: 321-51.
Bollen, Kenneth A., and Richard Lennox. 1991. "Conventional Wisdom on Measurement: A Structural Equation Perspective." Psychological Bulletin 110 (September): 305-14.
Bollen, Kenneth A., and Pamela Paxton. 1998. "Detection and Determinants of Bias in Subjective Measures." American Sociological Review 63 (June): 465-78.
Bollen, Kenneth A., and Pamela Paxton. 2000. "Subjective Measures of Liberal Democracy." Comparative Political Studies 33 (February): 58-86.
Brady, Henry. 1985. "The Perils of Survey Research: Inter-Personally Incomparable Responses." Political Methodology 11 (3-4): 269-91.
Brady, Henry, and David Collier, eds. 2001. Rethinking Social Inquiry: Diverse Tools, Shared Standards. Berkeley: Berkeley Public Policy Press, University of California, and Boulder, CO: Rowman & Littlefield.
Brewer, John, and Albert Hunter. 1989. Multimethod Research: A Synthesis of Styles. Newbury Park, CA: Sage.
Cain, Bruce E., and John Ferejohn. 1981. "Party Identification in the United States and Great Britain." Comparative Political Studies 14 (April): 31-47.
Campbell, Donald T. 1960. "Recommendations for APA Test Standards Regarding Construct, Trait, or Discriminant Validity." American Psychologist 15 (August): 546-53.
Campbell, Donald T. 1977/1988. "Descriptive Epistemology: Psychological, Sociological, and Evolutionary." Methodology and Epistemology for Social Science: Selected Papers. Chicago: University of Chicago Press.
Campbell, Donald T., and Donald W. Fiske. 1959. "Convergent and Discriminant Validation by the Multitrait-Multimethod Matrix." Psychological Bulletin 56 (March): 81-105.
Carmines, Edward G., and Richard A. Zeller. 1979. Reliability and Validity Assessment. Beverly Hills, CA: Sage.
Clausen, Aage. 1968. "Response Validity: Vote Report." Public Opinion Quarterly 41 (Winter): 588-606.
Collier, David. 1995. "Trajectory of a Concept: 'Corporatism' in the Study of Latin American Politics." In Latin America in Comparative Perspective: Issues and Methods, ed. Peter H. Smith. Boulder, CO: Westview. Pp. 135-62.
"Social Structure, and Political Participation: Developmental Relationships, Part I." American Political Science Review 63 (June): 361-78.
Niemi, Richard, Richard Katz, and David Newman. 1980. "Reconstructing Past Partisanship: The Failure of the Party Identification Recall Questions." American Journal of Political Science 24 (November): 633-51.
O'Donnell, Guillermo. 1993. "On the State, Democratization and Some Conceptual Problems." World Development 21 (8): 1355-69.
O'Donnell, Guillermo. 1996. "Illusions about Consolidation." Journal of Democracy 7 (April): 34-51.
Orum, Anthony M., Joe R. Feagin, and Gideon Sjoberg. 1991. "Introduction: The Nature of the Case Study." In A Case for the Case Study, ed. Joe R. Feagin, Anthony M. Orum, and Gideon Sjoberg. Chapel Hill: University of North Carolina Press. Pp. 1-26.
Paxton, Pamela. 2000. "Women in the Measurement of Democracy: Problems of Operationalization." Studies in Comparative International Development 35 (Fall): 92-111.
Pitkin, Hanna Fenichel. 1967. The Concept of Representation. Berkeley: University of California Press.
Pitkin, Hanna Fenichel. 1987. "Rethinking Reification." Theory and Society 16 (March): 263-93.
Przeworski, Adam, Michael Alvarez, Jose Antonio Cheibub, and Fernando Limongi. 1996. "What Makes Democracies Endure?" Journal of Democracy 7 (January): 39-55.
Przeworski, Adam, and Henry Teune. 1970. Logic of Comparative Social Inquiry. New York: John Wiley.
Rabkin, Rhoda. 1992. "The Aylwin Government and 'Tutelary' Democracy: A Concept in Search of a Case?" Journal of Interamerican Studies and World Affairs 34 (Winter): 119-94.
Ragin, Charles C. 1987. The Comparative Method. Berkeley: University of California Press.
Ragin, Charles C. 1994. Constructing Social Research. Thousand Oaks, CA: Pine Forge.
Reus-Smit, Christian. 1997. "The Constitutional Structure of International Society and the Nature of Fundamental Institutions." International Organization 51 (Autumn): 555-89.
Ruggie, John G. 1998. "What Makes the World Hang Together? Neo-Utilitarianism and the Social Constructivist Challenge." International Organization 52 (Autumn): 855-85.
Russett, Bruce. 1993. Grasping the Democratic Peace. Princeton, NJ: Princeton University Press.
Sartori, Giovanni. 1970. "Concept Misformation in Comparative Research." American Political Science Review 64 (December): 1033-53.
Sartori, Giovanni. 1975. "The Tower of Babel." In Tower of Babel, ed. Giovanni Sartori, Fred W. Riggs, and Henry Teune. International Studies Association, University of Pittsburgh. Pp. 7-37.
Sartori, Giovanni, ed. 1984. Social Science Concepts: A Systematic Analysis. Beverly Hills, CA: Sage.
Sartori, Giovanni, Fred W. Riggs, and Henry Teune. 1975. Tower of Babel: On the Definition and Analysis of Concepts in the Social Sciences. International Studies Association, University of Pittsburgh.
Schaffer, Frederic Charles. 1998. Democracy in Translation: Understanding Politics in an Unfamiliar Culture. Ithaca, NY: Cornell University Press.
Schmitter, Philippe C., and Terry Lynn Karl. 1992. "The Types of Democracy Emerging in Southern and Eastern Europe and South and Central America." In Bound to Change: Consolidating Democracy in East Central Europe, ed. Peter M. E. Volten. New York: Institute for EastWest Studies. Pp. 42-68.
Schrodt, Philip A., and Deborah J. Gerner. 1994. "Validity Assessment of a Machine-Coded Event Data Set for the Middle East, 1982-92." American Journal of Political Science 38 (August): 825-54.
Schumpeter, Joseph. 1947. Capitalism, Socialism and Democracy. New York: Harper.
Shepard, Lorrie. 1993. "Evaluating Test Validity." Review of Research in Education 19: 405-50.
Shively, W. Phillips. 1998. The Craft of Political Research. Upper Saddle River, NJ: Prentice Hall.
Shultz, Kenneth S., Matt L. Riggs, and Janet L. Kottke. 1998. "The Need for an Evolving Concept of Validity in Industrial and Personnel Psychology: Psychometric, Legal, and Emerging Issues." Current Psychology 17 (Winter): 265-86.
Sicinski, Andrzej. 1970. "'Don't Know' Answers in Cross-National Surveys." Public Opinion Quarterly 34 (Spring): 126-9.
Sireci, Stephen G. 1998. "The Construct of Content Validity." Social Indicators Research 45 (November): 83-117.
Skocpol, Theda. 1992. Protecting Soldiers and Mothers. Cambridge, MA: Harvard University Press.
Sussman, Leonard R. 1982. "The Continuing Struggle for Freedom of Information." In Freedom in the World, ed. Raymond D. Gastil. Westport, CT: Greenwood. Pp. 101-19.
Valenzuela, J. Samuel. 1992. "Democratic Consolidation in Post-Transitional Settings: Notion, Process, and Facilitating Conditions." In Issues in Democratic Consolidation, ed. Scott Mainwaring, Guillermo O'Donnell, and J. Samuel Valenzuela. Notre Dame, IN: University of Notre Dame Press. Pp. 57-104.
Vanhanen, Tatu. 1979. Power and the Means to Power. Ann Arbor, MI: University Microfilms International.
Vanhanen, Tatu. 1990. The Process of Democratization. New York: Crane Russak.
Verba, Sidney. 1967. "Some Dilemmas in Comparative Research." World Politics 20 (October): 111-27.
Verba, Sidney. 1971. "Cross-National Survey Research: The Problem of Credibility." In Comparative Methods in Sociology, ed. Ivan Vallier. Berkeley: University of California Press. Pp. 309-56.
Verba, Sidney, Steven Kelman, Gary R. Orren, Ichiro Miyake, Joji Watanuki, Ikuo Kabashima, and G. Donald Ferree, Jr. 1987. Elites and the Idea of Equality. Cambridge, MA: Harvard University Press.
Verba, Sidney, Norman Nie, and Jae-On Kim. 1978. Participation and Political Equality. Cambridge: Cambridge University Press.
Webb, Eugene J., Donald T. Campbell, Richard D. Schwartz, and Lee Sechrest. 1966. Unobtrusive Measures: Nonreactive Research in the Social Sciences. Chicago: Rand McNally.
Zeitsch, John, Denis Lawrence, and John Salernian. 1994. "Comparing Like with Like in Productivity Studies: Apples, Oranges and Electricity." Economic Record 70 (June): 162-70.
Zeller, Richard A., and Edward G. Carmines. 1980. Measurement in the Social Sciences: The Link between Theory and Data. Cambridge: Cambridge University Press.