Wiley and Royal Statistical Society are collaborating with JSTOR to digitize, preserve and extend access to
Journal of the Royal Statistical Society. Series A (General).
http://www.jstor.org
SUMMARY
The paper reviews the effects of the precise wording, form and placement of questions in a [...] response; question length; respondent instructions; feedback and commitment; the treatment of "don't knows"; open and closed questions; the use of balanced questions; acquiescence; the offer of a middle alternative; the order of alternatives; and question order and context effects.
Keywords: QUESTION WORDING; QUESTION FORM; QUESTION CONTEXT EFFECT; FACTUAL QUESTIONS; OPINION QUESTIONS; MEMORY ERRORS; SOCIAL DESIRABILITY BIAS
1. INTRODUCTION
THE survey literature abounds with examples demonstrating that survey responses may be sensitive to the precise wording, format and placement of the questions asked. A useful start to examining these effects is to classify questions according to the type of information sought. A widely-used distinction is that between factual and opinion questions. Questions like "What was your regular hourly rate of pay on this job as of September 30?" clearly fall in the former category, while questions like "As you know, many older people share a home with their grown children. Do you think this is generally a good idea or a bad idea?" clearly fall in the latter. However, not all survey questions can be classified as either factual or opinion ones: other types of question include questions testing respondents' knowledge, questions asking for reasons, hypothetical questions and preference questions.
One further type of question, widely used in survey practice, deserves special comment. These questions, which have a factual component overlaid with an evaluation, may be termed judgement or perceptual questions. Examples are: "Do you have street (highway) noise in this neighbourhood?" and "Would you say your health now is excellent, good, fair or poor?" In many cases the intent of such questions is to obtain factual information, but the approach adopted seeks respondents' evaluations of the facts rather than their measurement according to objective criteria. The use of perceptual questions for this purpose probably results from the questionnaire designer's decision that he could not ask sufficient questions or take the measurements necessary to determine the information objectively; hence he has respondents make the assessments for him. The often low levels of correlation found between perceptions and facts make this use of perceptual questions, although widespread, a dubious one. A different use of perceptual questions is indeed to obtain respondents' perceptions of their situations; in this case the questions are similar to opinion questions.
For present purposes, it will be sufficient to divide questions into factual and non-factual ones (including as factual questions those perceptual questions seeking to ascertain factual information). An important difference between these two types of question is that with factual questions there are individual true values for the information sought which can, at least in theory, be determined from some source other than respondents' reports, whereas with other questions this does not apply. While it is true that the responses to some factual questions cannot be validated against external sources (for instance, reports of past unrecorded private behaviour), the difference holds in general. As a consequence, validity studies are often conducted to examine how successful factual questions are in obtaining respondents' individual true values, whereas with non-factual questions such studies are not possible.
† This paper is a slightly revised version of a paper presented at the American Statistical Association meetings, Houston, August 1980 (Kalton and Schuman, 1980, with discussion by Rothwell and Turner).
Although numerous validity studies of responses to factual questions have been carried out in many subject areas, the majority of them have examined only the level of accuracy achieved by a given questioning procedure; they have not compared alternative procedures, as required for making an assessment of how aspects of a question may affect the accuracy of the responses obtained. Many of the comparative studies that have been conducted have avoided the need for data from an external validating source by making an assumption about the general direction of the response errors to be encountered, the assumption adopted being based on evidence from other validity studies. Thus, for instance, it is often assumed from past evidence that certain events such as purchases made or illnesses experienced in a given period will be underreported. Given this assumption, the best question form is then taken to be the one that produces the highest frequencies for the events. On the other hand, a socially desirable activity may be assumed to be generally overstated, in which case the best question form is the one that gives the lowest reported frequency for it. While the difficulties of obtaining validity data make this approach attractive, it does depend critically on the validity of the assumption about the direction of response errors.
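The decision rule just described can be written down in a few lines. The following is a minimal sketch only: the function name `best_form`, the form labels and the reported proportions are hypothetical and are not taken from any of the studies cited. It makes explicit that the choice of form follows entirely from the assumed direction of error.

```python
def best_form(reported, assumed_bias):
    """Pick the question form judged best under an ASSUMED error direction.

    reported:     dict mapping a form label to its reported frequency
                  (or proportion) for the event of interest.
    assumed_bias: "under" if the event is assumed to be underreported,
                  "over" if it is assumed to be overstated.
    """
    if assumed_bias == "under":
        # Underreporting assumed: the highest reported level is taken as best.
        return max(reported, key=reported.get)
    if assumed_bias == "over":
        # Overstatement assumed: the lowest reported level is taken as best.
        return min(reported, key=reported.get)
    raise ValueError("assumed_bias must be 'under' or 'over'")


# Hypothetical proportions reporting an event under two question forms.
example = {"form_a": 0.41, "form_b": 0.47}
print(best_form(example, "under"))  # form_b
print(best_form(example, "over"))   # form_a
```

The same data lead to opposite conclusions under the two assumptions, which is why the approach stands or falls with the assumption itself.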
With non-factual questions, validation is even more difficult and less certain. The accuracy of responses can often be examined only by means of construct validity, that is by determining whether the relationships of the responses with other variables conform to those predicted by theory. At the current stage of theory development in the social sciences, a failure of data to fit a theory is usually as likely to cast doubt on the theory as on the measuring instruments. Then, even if the observed relationships coincide with their theoretical predictions, this agreement is not a clear confirmation that the responses are valid; it may, for instance, instead be an artifact of the set of measuring instruments employed (a "methods effect").
In view of the difficulties of validating responses to non-factual questions, research on questioning effects with such questions has relied mainly on split-ballot experiments, in which alternative forms of a question are administered to comparable samples of respondents, with the responses to the different question forms being compared for consistency. This concern with consistency rather than validity means that the research usually fails to identify which is the best question form. It serves only to warn the researcher of the sensitivity of responses to question form if the responses differ markedly, or to increase his feelings of security in the results if they do not differ.
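One common way of comparing the two halves of a split-ballot experiment is a two-sample test for the difference between proportions. The sketch below is an illustration and not the procedure of any study cited here: the function name and the counts are hypothetical, and it assumes two independent simple random half-samples, which ignores any clustering in the sample design.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Two-sided z-test for a difference between two independent proportions.

    x1, x2: numbers giving the response of interest under forms 1 and 2.
    n1, n2: numbers of respondents receiving forms 1 and 2.
    Returns (z, p_value) using the pooled standard error.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided normal tail area: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical split ballot: 60 of 100 agree under form 1, 40 of 100 under form 2.
z, p = two_proportion_z(60, 100, 40, 100)
print(round(z, 3), round(p, 4))
```

A small p-value indicates only that the forms give inconsistent results; as noted above, it does not say which form is the better one.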
A second difference between factual and non-factual questions involves the features studied as possible influences on the responses obtained. Although many of the features potentially apply with both types of question, researchers have been more concerned about some of them with factual questions and others with non-factual ones. Thus research on factual questions has focused on problems of definition, comprehension, memory and social desirability response bias, while that on non-factual questions has concentrated on various question form effects, such as issues of balance, the offer of middle alternatives, and the order of presentation of alternatives. The features primarily studied in relation to factual questions are reviewed in the next section, and those studied in relation to non-factual questions are taken up in the following one. The effects of question order and context, which have received attention in relation to both types of question, are discussed in Section 4.
Before embarking on the discussion of question effects, we should note that we are primarily concerned with surveys in which interviewers ask the questions and record the answers. Responses to self-completion questionnaires may be affected by a number of other features, such as the physical location of a question on the questionnaire, the placement of instructions, the general layout, and the colours of print for questions and instructions. Reports of experiments on the effects of some of these features are given by Rothwell and Rustemeyer (1979) for the US Census of Population and Housing, and by Forsythe and Wilhite (1972) for the US Census of Agriculture.
KALTON AND SCHUMAN - Effect of the Question on Survey Responses [Part 1, 1982]
2. QUESTION EFFECTS WITH FACTUAL QUESTIONS
The starting point in constructing a factual question is a precise definition of the fact to be collected. It has been shown on many occasions that apparently marginal changes in definition can have profound effects on survey results. Definitions of unemployment and labour force raise a host of issues (e.g. Bancroft and Welch, 1946; Jaffe and Stewart, 1955), but even ostensibly simple facts like the number of rooms occupied by a household pose a range of definitional problems (for instance: Is a kitchen to be included if only used for cooking? Are bathrooms, toilets, closets, landings, halls to be included? Is a room partitioned by curtains or portable screens one or two rooms?).
Once the fact has been defined, the request for it has to be communicated to the respondent. A number of difficulties can arise in this process. In the first place, the need for a precise definition can lead to an unwieldy question which the respondent cannot or will not make the effort to absorb. In the quality check on the 1966 Sample Census of England and Wales, Gray and Gee (1966) found that 1 in 6 householders reported an inaccurate number of rooms in the household, which they ascribe mainly to the fact that householders know how many rooms they have according to their own definitions, and they therefore ignored the detailed census definition. To avoid this problem, some looseness is often accepted in survey questions (especially in perceptual questions), but this may well lead to inconsistent interpretations between respondents.
Another aspect of the communication process is to ensure that the respondent fully understands what he is being asked and what is an appropriate answer. At one level he needs to understand the concepts and frames of reference implied by the question (Cannell and Kahn, 1968). At a more basic level he needs to comprehend the question itself. Methodological research by Belson and Speak found that even some simple questions on television viewing were often not perceived as intended by a sizeable proportion of respondents. For instance, the questions "What proportion of your evening viewing time do you spend watching news programmes?" and "When the advertisements come on between two programmes on a weekday evening, do you usually watch them?" were misinterpreted by almost everybody who answered them. With the first question, very few respondents knew what "proportion" meant, and only 1 of the 246 respondents knew how to work it out. With the second, "weekday" was often misinterpreted as either "any day of the week" or "any day except Sunday" (Speak, 1967; Belson, 1968).
To give a correct answer to a factual question, a respondent needs to have the necessary information accessible. Accessibility first requires that he has had the information at some time and has understood it. Then, if the question asks about the past, he needs to be able to retrieve it from his memory. Ease of recall depends mainly on the length of the recall period and the salience to the respondent of the information being recalled (see, for example, Cannell and Kahn, 1968). His success in recalling the information depends on the ease of recall and the effort he is persuaded to make. Many survey questions ask about events occurring in a specified reference period (e.g. seeing a doctor in the last year), in which case the respondent also has to be able to place the events in time. A well-known placement distortion is the telescoping error of remembering an event as having occurred more recently than in fact is the case (see, for example, Sudman and Bradburn, 1973, 1974).
The effects of recall loss and telescoping work in opposite directions, recall loss causing underreporting and telescoping causing overreporting. The extent of these two sources of error depends on the length of the reference period: the longer the period, the greater is the recall loss, but the smaller is the telescoping effect. Thus, for short reference periods, the telescoping effect may outweigh the recall loss, while for long periods the reverse will apply; in between there will be a length of reference period at which the two effects counterbalance each other (Sudman and Bradburn, 1973). The meaning of "short" and "long" reference periods varies with the event under investigation, depending on the event's salience. The choice of an appropriate reference period needs to take into account the telescoping and recall loss effects, as well as the fact that longer periods provide estimates with smaller sampling errors. This choice has been examined in a number of different subject areas (see, for example, National Center for Health Statistics, 1972; Sudman, 1980).
A technique which aims at eliminating telescoping errors by repeated interviewing is known as bounded recall (Neter and Waksberg, 1965). Respondents are interviewed at the beginning and end of the reference period. The first interview serves to identify events which occurred prior to the start of the period so that they can be discounted if they are then reported again at the second interview.
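The discounting step in bounded recall amounts to removing from the second interview any event already recorded at the first. A minimal sketch follows; the function name, the representation of an event as a (description, amount) pair and the example events are all hypothetical, and it assumes that a repeated report can be matched exactly to the earlier one, which is not guaranteed in practice.

```python
def bounded_recall(first_interview, second_interview):
    """Keep only second-interview events not already reported at the first.

    Each interview is a list of hashable event records. Events reported at
    the bounding (first) interview occurred before the reference period, so
    a repeat report at the second interview is treated as telescoped and
    is discounted.
    """
    already_reported = set(first_interview)
    return [event for event in second_interview if event not in already_reported]


# Hypothetical expenditure reports as (description, amount) pairs.
first = [("car repair", 120), ("television set", 300)]
second = [("television set", 300), ("washing machine", 250)]
print(bounded_recall(first, second))  # [('washing machine', 250)]
```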
Three procedures are widely used in survey practice to attempt to minimize or avoid memory errors (the use of records, aided recall techniques and diaries) and each procedure has its own sizeable literature. Where records are available, say from bills or cheque book records, their use can reduce both recall loss and telescoping effects, as well as provide accurate details of the events. Aided recall techniques aim to reduce recall loss by providing the respondent with memory cues; these techniques are widely used in media research, where the respondent would be provided with, say, a list of newspapers or yesterday's television programmes from which he chooses the ones he looked at. In their summary of the effects of aided recall techniques, Sudman and Bradburn (1974) conclude that they do increase reported activity, but point out that this may at least in part represent an increase in telescoping errors. Where the events to be reported are numerous and relatively insignificant, there may be no way to help respondents remember them with sufficient accuracy. In such cases, as with household expenditures, food consumption and trips outside the house, memory problems may be avoided by having respondents complete diaries of the events as they take place. Diaries, however, have their disadvantages: they are expensive, it is harder to gain respondents' cooperation, the diary keeping may affect behaviour, it may be incomplete, and its quality usually deteriorates over time.
Another well-documented source of invalidity in responses to factual (and other) questions is a social desirability bias: respondents distort their answers towards ones they consider more favourable to them. Thus, for instance, it has been well established that a higher proportion of survey respondents report that they voted in an election than the voting returns indicate (for instance, Parry and Crossley, 1950; Traugott and Katosh, 1979). If an event is seen as embarrassing, sensitive or threatening, the respondent may repress its report, or he may distort his answer to one he considers more socially acceptable. There are a number of well-known techniques for eliciting sensitive information, including making responses more private by using a numbered card (often used for income) or a sealed ballot, and attempting to desensitize a particular response by making it appear to be a common or acceptable one. Barton (1958) has provided an amusing summary of these techniques.
A more recent development for asking sensitive questions is the randomized response technique, in which the respondent chooses which of two (or more) questions he answers by a random device; he answers the chosen question without the interviewer being aware which question is being answered. In this way the respondent's privacy is protected, and in consequence it is hoped that he gives a more truthful response. Since Warner (1965) introduced the technique, many articles have appeared developing it, extending its potential range of application, and examining its statistical properties. The main focus of this work has been, however, on theoretical issues, and comparatively little attention has been given to its practical utility. One common, but not obvious, finding from studies in which it has been applied is that it has generally been well received by respondents. In a small-scale experimental study by Locander et al. (1976), for instance, only 1 in 20 respondents said it was confusing, silly or [...]
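To illustrate how an estimate can be recovered from randomized responses, the sketch below uses the two-statement design usually attributed to Warner (1965): with known probability p the random device selects the statement "I belong to group A", and otherwise the statement "I do not belong to group A", and the respondent answers only "yes" or "no". The formulas are not given in the text above, so the design details, the function name and the figures are assumptions made for illustration.

```python
def warner_estimate(yes_count, n, p):
    """Estimate the proportion in the sensitive group from randomized responses.

    yes_count: number of "yes" answers among n respondents.
    p:         probability that the device selects the sensitive statement
               (must differ from 0.5, otherwise nothing can be recovered).
    Since P(yes) = p * pi + (1 - p) * (1 - pi), solving for pi gives
    pi_hat = (lam + p - 1) / (2p - 1), where lam is the observed "yes" share.
    Returns (pi_hat, estimated variance of pi_hat).
    """
    if p == 0.5:
        raise ValueError("p = 0.5 makes the proportion unidentifiable")
    lam = yes_count / n
    pi_hat = (lam + p - 1) / (2 * p - 1)
    variance = lam * (1 - lam) / (n * (2 * p - 1) ** 2)
    return pi_hat, variance


# Hypothetical survey: 40 "yes" answers from 100 respondents, with p = 0.7.
pi_hat, variance = warner_estimate(40, 100, 0.7)
print(round(pi_hat, 3), round(variance, 4))  # 0.25 0.015
```

The variance grows rapidly as p approaches 0.5, so the privacy gained by the randomization is paid for in precision.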
[...] complex than the short ones. The usual advice "keep questions short" is probably an inaccurate way of saying "keep questions simple"; in practice the difficulties from long questions probably derive from their complexity rather than their length per se.
Other techniques developed by Cannell and his colleagues to improve survey reporting include the use of respondent instructions, the use of feedback and the securing of respondent commitment.
The purpose of including respondent instructions in the questionnaire is to advise the respondent on how he should perform his task. Cannell et al. (1981) have experimented with providing general instructions at the start of the interview to ask the respondent to think carefully, search his memory, take his time and check records, and to tell him that accurate and complete answers are wanted. In addition, respondents can be given specific instructions on how to answer individual questions; these specific instructions have the added benefit of lengthening the questions, thus securing the advantages associated with longer questions.
The purpose of feedback is to inform the respondent on how well he is performing. The interviewers are provided with a selection of feed-back statements from which to choose, their choice being governed by the respondent's performance. Examples of positive and negative feed-back statements are "Thanks, we appreciate your frankness" and "Uh-huh. We are interested in details like these" on the one hand and "You answered that quickly" and "Sometimes it's easy to forget all the things you felt or noticed here. Could you think about it again?" on the other.
The theory behind the commitment technique is that if a respondent can be persuaded to enter into an agreement to respond conscientiously he will feel bound by the terms of the agreement. The technique can be applied with personal interviewing by asking respondents to sign an agreement promising to do their best to give accurate and complete answers. In practice Cannell and his colleagues have found that only about 5 per cent of respondents refuse to co-operate. With telephone interviewing, respondents may be asked to make a verbal commitment to respond accurately and completely: a study applying this procedure encountered no problems in securing respondents' co-operation.
The evidence from the various experiments conducted to examine the utility of these techniques suggests that each of them leads to an improvement in reporting, with a combination of all three giving the best results. A concern that high-education respondents might react negatively did not materialize. In a health study, the use of the three techniques together increased the average number of items supplied in answers to open questions by about one-fifth; substantially improved the precision of dates reported for doctor visits, medical events and activity curtailment; increased by about three-fold the checking of data from outside sources; and secured almost a third more reports of symptoms and conditions for the pelvic region (considered to be potentially embarrassing personal information). In a small-scale study of media use, comparing an experimental group interviewed with a combination of all three techniques with a control group interviewed with none of them, the experimental group reported a greater amount for activities likely to be underreported and a lesser amount for those likely to be overreported. Thus 86 per cent of the experimental group reported watching TV on the previous day compared with 66 per cent of the control group; the experimental group listened to the radio yesterday for 2[?] hours on average, compared with an average of 1[?] hours for the control group; 38 per cent of the experimental group reported reading the editorial page on the previous day compared with 55 per cent of the control group; and the experimental group reported an average of 2.9 books read in the last 3 months compared with 5.3 for the control group.
These experimental results suggest that the techniques hold considerable promise for improving survey reporting. However, at this stage of their development, it seems premature to advocate their general use in routine surveys. They involve significant alterations to questionnaires, interviewers need to be trained in their use, and interviews take longer to complete. Before they are widely adopted, further research is called for, to attempt to replicate [...]
[...] required item of information; there is an answer to the question, but the respondent cannot provide it. With opinion questions, however, DK has a different interpretation, for respondents may truly have no opinion on the issue under study.
The standard way of allowing for DK's with opinion questions is the same as that used with factual questions; the option to answer DK is not explicitly included in the question, and interviewers are instructed to use the DK response category only when the respondent offers it spontaneously. The danger with this procedure is that some respondents may feel pressured to give a specific answer even though DK is their proper response. This danger exists for both factual and opinion questions, but it is probably greater for the latter.
Two examples given by Schuman and Presser (1980) illustrate that many respondents will indeed choose one of the alternatives offered for an opinion question even though they do not know about the issue involved. In both examples, respondents were asked for their views about proposed legislation which few, if any, would be aware of, and yet 30 per cent expressed opinions. Bishop et al. (1980b) report similar findings about a wholly fictional issue.
As a way of screening out respondents without opinions, some type of filtering may be used. One possibility is to include an explicit "no opinion" option or filter in the response categories offered to respondents (a "quasi-filter"); in the Schuman and Presser experiment, this offer reduced the proportion of respondents expressing opinions on the two laws to 10 per cent or less. A more forceful possibility is a preliminary filter question "Do you have an opinion on . . .?" (a "full filter").
Schuman and Presser (1978) carried out several experiments to examine the effects of filtering. They found that the use of the full filter typically increased the percentage of DK's over those obtained from the standard form by around 20-25 per cent. Bishop et al. (1980a) also found in their experiments that the increases were generally in the range 20-25 per cent, but they report a much smaller increase for a very familiar topic and a much larger one for an unfamiliar topic.
In Schuman and Presser's experiments the effect of the variation in question form on substantive results was somewhat unexpected. In the first place, once the DK's had been eliminated (as would usually be done in analysis), the marginal distributions of responses turned out in most cases not to be significantly affected by question form; also the relations between the opinion responses and standard background variables were little affected. However, the associations between the opinion responses themselves did differ significantly in certain cases between question forms: in one case the association was stronger with the filtered form, in another it was weaker.
(b) Open or closed questions
When asked a survey question respondents may either be supplied with a list of alternative responses from which to choose or they may be left to make up their own responses. The major advantages of the former type of question (termed variously a closed, fixed-choice or precoded question) are standardization of response and economy of processing. Its major disadvantages, and hence arguments in favour of open questions, are that the alternatives imposed by the closed form may not be appropriate for these respondents, and that the alternatives offered may influence the responses selected.
The main context in which open questions are used extensively is when the potential responses are both nominal in nature and sizeable in number. These conditions occur often with motivation questions, asking for the principal or all reasons for an occurrence, and with questions asking for the choice of the most, or several most, important factors involved in an issue. In such cases the questionnaire designer faces a real choice between open and closed questions.
As part of their research on question form effects, Schuman and Presser (1979) carried out several experiments on open and closed questions, using items chosen for their utility in one form or the other in a major past survey. In all the experiments important differences occurred [...]
[...] finding failed to hold, however, in two of the three experiments with three-point scales reported by Kalton et al.
There is little evidence that this question form variation affects associations between opinion responses and other variables. In view of the substantial impact of the question form variation on marginal distributions, however, it seems dangerous to place uncritical reliance on the "form resistant correlation" assumption.
(f) Order of alternatives
The responses to closed questions may be affected by the order in which the alternatives are presented. In discussing this order effect, two modes of presentation may need to be distinguished: the alternatives can be presented in written form, as with self-completion questionnaires or when flashcards are used; or they can be presented orally, with the interviewer reading them to respondents, sometimes as a running prompt. When they are presented in written form, there appears to be a slight tendency for the first alternative to be favoured (e.g. Belson, 1966; Quinn and Belson, 1969). When they are presented orally, Rugg and Cantril (1944) provide examples where the last-mentioned alternative is favoured, but Payne also gives several examples where the order effect is negligible. Kalton et al. (1978) report the results of experiments on varying the order of orally presented alternatives with four simple questions. In all cases, the evidence suggested that, if anything, the first-mentioned alternative was favoured; the effects were, however, very small (around a 2 per cent increase), and only on the border of statistical significance.
4. GENERAL QUESTION EFFECTS
The preceding discussion has been divided into two parts, questioning issues relating to factual questions and those relating to non-factual (opinion) questions. This arbitrary division was made for convenience of exposition to reflect the differences in emphasis of question wording and format research between the two types of question. However, it should not be taken to imply that the effects noted for one type of question do not apply to the other. Thus, for instance, issues of sensitivity can clearly arise with opinion statements, as also would issues of memory if the survey was concerned about changes of opinion (an extremely difficult matter on which to collect accurate information by retrospective questioning). Equally, while many of the question form variations discussed above for non-factual questions are not applicable for factual questions, the latter may also be affected by variation in question form. Locander and Burton (1976), for instance, show how four versions of a question asking for family income, all designed for use with telephone interviewing, yielded markedly different income distributions. All the questions used an unfolding technique for presenting the set of response categories as a sequence of binary choices, but they employed different forms of the technique. Forms 1 and 4, for example, both asked whether the family income was "more than X" for X = $5000, $7500, $10 000, $15 000, $20 000 and $25 000; form 1 started with $5000 and took increasing values of X until a "no" answer was given, while form 4 started with $25 000 and took decreasing values until a "yes" answer was given. With form 1, 37.5 per cent of respondents reported family incomes of $15 000 or more; with form 4 the corresponding percentage was 63.7 per cent.
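The two unfolding sequences can be traced in a short sketch. The thresholds are those quoted above; the function names and the representation of an income bracket as a (lower, upper) pair are hypothetical, and the sketch assumes a respondent who answers every "more than X" question accurately.

```python
THRESHOLDS = [5000, 7500, 10000, 15000, 20000, 25000]


def unfold_ascending(income):
    """Form 1: ask "more than X?" for increasing X until a "no" is given."""
    lower = None
    for x in THRESHOLDS:
        if income > x:      # answer "yes": move on to the next, higher X
            lower = x
        else:               # answer "no": income lies in (lower, x]
            return (lower, x)
    return (lower, None)    # "yes" throughout: more than the top threshold


def unfold_descending(income):
    """Form 4: ask "more than X?" for decreasing X until a "yes" is given."""
    upper = None
    for x in reversed(THRESHOLDS):
        if income > x:      # answer "yes": income lies in (x, upper]
            return (x, upper)
        upper = x           # answer "no": move on to the next, lower X
    return (None, upper)    # "no" throughout: at most the bottom threshold


for income in (3000, 12000, 30000):
    print(income, unfold_ascending(income), unfold_descending(income))
```

Under this assumption both forms place every income in the same bracket, so the gap between 37.5 and 63.7 per cent cannot come from the logic of the sequences; it must arise from how respondents actually answer when led up or down the scale.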
A final, important, questioning effect to be discussed concerns the presence of other questions in the questionnaire, and the position of those questions in relation to the question under study. Question order and context effects may occur with both factual and non-factual questions, but they appear to operate in different ways.
A sizeable number of studies have been carried out to examine the effect of question order on responses to opinion questions. On many occasions no order effect has been discovered, even for questions closely related in subject matter. However, one type of question order effect has been found in two cases and seems worth further exploration. This effect occurs when one of the questions is a general one on an issue and the other is more specific on the same issue. Schuman et al. (1981) with two opinion questions on abortion, and Kalton et al. (1978) with two questions on driving standards, both found that the distributions of answers to the more specific questions were the same whether the specific question was asked before or after the general question, but that the distributions of answers to the general questions differed according to the questions' position. (However, Kalton et al. also report another such experiment with a contrary finding.) In the Kalton et al. experiment, respondents were asked about driving standards generally and about driving standards among younger drivers. When the general question was asked first, 34 per cent of respondents said that general driving standards were lower than they used to be; when that question followed the more specific question about younger drivers, the corresponding percentage fell by 7 percentage points to 27 per cent. Further analysis showed that the question order affected only respondents aged 45 or older, where the difference in the percentages was 12 percentage points. No definitive reason for this effect has been established, but it may possibly be explained as a subtraction effect: after answering the specific question, some respondents assume that the general question excludes the specific part (e.g. in the driving example, they assume that the general question excludes consideration of the driving standards of younger drivers).
With factual questions, one situation where other questions on a questionnaire may influence the answers to a particular question arises when respondents are asked to respond to a long list of similar items, as for instance in readership surveys where they are taken through a list of newspapers and periodicals to find out which ones they have looked at. Here levels of reporting sometimes tend to be lower when items are placed later in the list. For instance, in studying readership reporting in the UK National Readership Surveys, Belson (1962) conducted an experiment in which he varied the relative position of the different types of periodicals between different parts of the sample. The weekly publications were most affected by the presentation order: when they appeared last their reported level of readership was only three-quarters of what it was when they appeared first.
Another source of evidence on the disturbing influence of other questions comes from an examination by Gibson et al. (1978) of the effects of the inclusion of supplements on the results for core items in the National Crime Survey (NCS), Current Population Survey and Health Interview Survey.
In the NCS Cities Sample a lengthy series of attitude questions about topics such as neighbourhood safety, opinions regarding local police, crime trends and news coverage of crime was asked of a random half of the sample of adults in addition to the core NCS questions on crime victimization. Since it was thought that the responses to the attitude questions might be affected by the victimization questions if they were asked after the core items, the attitude questions were asked first. The effect of the prior inclusion of the attitude questions was, however, to substantially and significantly increase the reported victimization rates: on average the rate for personal crimes was around 20 per cent greater and that for property crimes was around 13 per cent greater for the half sample that answered the attitude supplement than for the half sample that did not. Possible explanations for this effect are that the attitude questions served to stimulate respondents' awareness or memory regarding victimization experiences, that they increased respondents' desire to produce what they perceived to be the desired answers (victimization experiences), or that a combination of both these causes operated.
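As a rough illustration of the split-sample comparison described above, the following sketch computes the relative increase in a reported rate between two half samples and a pooled two-proportion z statistic. The counts are hypothetical, chosen only so that the relative increase is about 20 per cent; they are not NCS figures. The test also assumes simple random half samples, which ignores the design effects of a real survey.

```python
import math


def relative_increase(rate_with, rate_without):
    """Proportional increase of rate_with over rate_without."""
    return (rate_with - rate_without) / rate_without


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic (assumes simple random samples)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Hypothetical counts, NOT taken from the National Crime Survey:
# respondents reporting at least one victimization in each half sample.
victims_with, n_with = 600, 5000        # half sample given the attitude supplement
victims_without, n_without = 500, 5000  # half sample given core items only

rel = relative_increase(victims_with / n_with, victims_without / n_without)
z = two_proportion_z(victims_with, n_with, victims_without, n_without)

print(f"relative increase: {rel:.0%}")
print(f"z statistic: {z:.2f}")
```

With half samples of this size a 20 per cent relative increase would be well beyond sampling error; with much smaller samples the same relative increase could easily fail to reach significance.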
From a further analysis of the NCS Cities Sample, Cowan et al. (1978) deduce that the effect of administering the attitude supplement was to increase reporting of the less serious victimizations (such as simple assault, those not reported to the police and those involving a loss of under $50) and to increase reporting among population subgroups experiencing high victimization rates (younger persons, males). They also found that the higher rates were spread throughout the 12-month reference period with no discernible pattern, a factor which argues against an increased telescoping effect stimulated by the attitude supplement. They conclude that the effect of the supplement is to produce better reporting in the reference period, but they suggest that it may be an oversimplification to attribute this effect to memory stimulation.
Effect of the Question on Survey Responses [Part 1, 1982]
Carolina. Demography, 7, 19-29.
BANCROFT, G. and WELCH, E. H. (1946). Recent experience with problems of labor force measurement. J. Amer. Statist. Ass., 41, 303-312.
3(4), 1-13.
Behavior, 2, 339-369.
political interest. Paper presented at the 36th Annual Conference of the American Association for Public Opinion Research, May 1981.
BISHOP, G. F., OLDENDICK, R. W., TUCHFARBER, A. J. and BENNETT, S. E. (1980b). Pseudo-opinions on public affairs. Public Opinion Quart., 44, 198-209.
BRADBURN, N. M., SUDMAN, S. and ASSOCIATES (1979). Improving Interview Method and Questionnaire Design. San Francisco: Jossey-Bass.
CAMPBELL, A., CONVERSE, P. E., MILLER, W. E. and STOKES, D. E. (1960). The American Voter. New York: Wiley.
CANNELL, C. F. (1977). A Summary of Studies of Interviewing Methodology. Vital and Health Statistics, Series 2, No. 69. Washington, DC: US Government Printing Office.
CANNELL, C. F. and KAHN, R. L. (1968). Interviewing. In The Handbook of Social Psychology. Volume Two: Research Methods (G. Lindzey and E. Aronson, eds), 2nd ed., Chapter 15. Reading, Mass.: Addison-Wesley.
CANNELL, C. F., MILLER, P. V. and OKSENBERG, L. (1981). Research on interviewing techniques. In Sociological Methodology, 1981 (S. Leinhardt, ed.), pp. 389-437. San Francisco: Jossey-Bass.
COWAN, C. D., MURPHY, L. R. and WIENER, J. (1978). Effects of supplemental questions on victimization estimates from the National Crime Survey. Proceedings of the Section on Survey Research Methods, American Statistical Association, 1978, pp. 277-282.
FORSYTHE, J. B. and WILHITE, O. (1972). Testing alternative versions of Agriculture Census questionnaires. Proceedings of the Business and Economic Statistics Section, American Statistical Association, 1972, pp. 206-215.
GIBSON, C. O., SHAPIRO, G. M., MURPHY, L. R. and STANKO, G. J. (1978). Interaction of survey questions as it relates to interviewer-respondent bias. Proceedings of the Section on Survey Research Methods, American Statistical Association, 1978, pp. 251-256.
GOODSTADT, M. S. and GRUSON, V. (1975). The randomized response technique: a test on drug use. J. Amer. Statist. Ass., 70, 814-818.
GRAY, P. and GEE, F. A. (1972). A Quality Check on the 1966 Ten Per Cent Sample Census of England and Wales. London: HMSO.
HEDGES, B. M. (1979). Question wording effects: presenting one or both sides of a case. Statistician, 28, 83-99.
JAFFE, J. A. and STEWART, C. D. (1955). The rationale of the current labor force measurement. In The Language of Social Research (P. F. Lazarsfeld and M. Rosenberg, eds), pp. 28-34. New York: The Free Press.
KALTON, G., COLLINS, M. and BROOK, L. (1978). Experiments in wording opinion questions. Appl. Statist., 27, 149-161.
KALTON, G., ROBERTS, J. and HOLT, D. (1980). The effects of offering a middle response option with opinion questions. Statistician, 29, 65-78.
KALTON, G. and SCHUMAN, H. (1980). The effect of the question on survey responses: a review. (With discussion by N. D. Rothwell and C. F. Turner). Proceedings of the Section on Survey Research Methods, American Statistical Association, 1980, pp. 30-45.
LOCANDER, W. B. and BURTON, J. P. (1976). The effect of question form on gathering income data by telephone. J. Marketing Res., 13, 189-192.
LOCANDER, W., SUDMAN, S. and BRADBURN, N. M. (1976). An investigation of interview method, threat and response distortion. J. Amer. Statist. Ass., 71, 269-275.
MADIGAN, F. C., ABERNATHY, J. R., HERRIN, A. N. and TAN, C. (1976). Purposive concealment of death in household surveys in Misamis Oriental Province. Population Studies, 30, 295-303.
NATIONAL CENTER FOR HEALTH STATISTICS (1972). Optimum Recall Period for Reporting Persons Injured in Motor Vehicle Accidents. Vital and Health Statistics, Series 2, No. 50. Washington, DC: US Government Printing Office.
NETER, J. and WAKSBERG, J. (1965). Response Errors in Collection of Expenditures Data by Household Interviews: An Experimental Study. Bureau of the Census Technical Paper No. 11. Washington, DC: US Government Printing Office.
PARRY, H. J. and CROSSLEY, H. M. (1950). Validity of responses to survey questions. Public Opinion Quart., 14, 61-80.
PAYNE, S. L. B. (1951). The Art of Asking Questions. Princeton: Princeton University Press.
PRESSER, S. and SCHUMAN, H. (1980). The measurement of a middle position in attitude surveys. Public Opinion Quart., 44, 70-85.
QUINN, S. B. and BELSON, W. A. (1969). The Effects of Reversing the Order of Presentation of Verbal Rating Scales in Survey Interviews. Survey Research Centre, London School of Economics.
ROTHWELL, N. D. and RUSTEMEYER, A. M. (1979). Studies of Census mail questionnaires. J. Marketing Res., 16, 401-409.
RUGG, D. and CANTRIL, H. (1944). The wording of questions. In Gauging Public Opinion (H. Cantril, ed.), Chapter 2. Princeton: Princeton University Press.
SCHUMAN, H. and PRESSER, S. (1977). Question wording as an independent variable in survey analysis. Sociol. Methods and Res., 6, 151-170.
SCHUMAN, H. and PRESSER, S. (1978). The assessment of "no opinion" in attitude surveys. In Sociological Methodology, 1979 (K. Schuessler, ed.), Chapter 10. San Francisco: Jossey-Bass.
SCHUMAN, H. and PRESSER, S. (1979). The open and closed question. Amer. Sociol. Rev., 44, 692-712.
SCHUMAN, H. and PRESSER, S. (1980). Public opinion and public ignorance: the fine line between attitudes and nonattitudes. Amer. J. Sociol., 85, 1214-1225.
SCHUMAN, H. and PRESSER, S. (1981). Questions and Answers on Attitude Surveys. Experiments in Question Form, Wording, and Context. New York: Academic Press.
SCHUMAN, H., PRESSER, S. and LUDWIG, J. (1981). Context effects on survey responses to questions about abortion. Public Opinion Quart., 45, 216-223.
SHIMIZU, I. M. and BONHAM, G. S. (1978). Randomized response technique in a national survey. J. Amer. Statist. Ass., 73, 35-39.
SPEAK, M. (1967). Communication failure in questioning: errors, misinterpretations and personal frames of reference. Occupational Psychology, 41, 169-181.
SUDMAN, S. (1980). Reducing response errors in surveys. Statistician, 29, 237-273.
SUDMAN, S. and BRADBURN, N. M. (1973). Effects of time and memory factors on response in surveys. J. Amer. Statist. Ass., 68, 805-815.
SUDMAN, S. and BRADBURN, N. M. (1974). Response Effects in Surveys. Chicago: Aldine.
TRAUGOTT, M. W. and KATOSH, J. P. (1979). Response validity in surveys of voting behavior. Public Opinion Quart., 43, 359-377.
TURNER, C. F. and KRAUSS, E. (1978). Fallible indicators of the subjective state of the nation. American Psychologist, 33, 456-470.
WARNER, S. L. (1965). Randomized response: a survey technique for eliminating evasive answer bias. J. Amer. Statist. Ass., 60, 63-69.
will need to stand the test of time and to explain the non-existence of effects in other situations. The topic of form resistant correlations is one in which no clear pattern has yet emerged and no theoretical justification exists. To illustrate my point, several times this evening we have heard about specific methods that "this often occurs but not always".

If this topic is a science rather than an art form, then we clearly need a much stronger framework in which to measure the effects which we observe and there are one or two places where the possibility of such a framework looks like a distant promise. In particular, the work of Cannell and others on the effects of question length and the completeness of responses is interesting. The authors give a number of suggestions as to why questions which have been padded out with redundancies may encourage a more complete response and it is not difficult for amateur social psychologists to leap in with both feet and suggest others. If a framework is to be developed for such topics, then we need a way of coding questions and, indeed, whole sections of questionnaires, probably in psychological terms, to yield a measure for the question. For example, in Cannell's work a measure of redundancy or fluency or time lag between the introduction of a concept and the need of a response, or whatever the relevant characteristics of the question design are thought to be. If such a measure is shown to be related to the completeness of the responses over a variety of questionnaires in form and content, then we would have the beginnings of a framework against which future questionnaire designs could be measured. It is this development to get beyond ex-post explanations of what has happened, to predict what effects will exist in future questionnaires, and to provide guidance to survey methodologists which is the most urgent need.

Finally, may I take issue with the authors on an assertion which they make in several places throughout the paper and particularly Section 5. Here I feel that there is tacit support for the view that, since marginal results are difficult to measure, the concentration on comparisons is acceptable. Of course comparisons are valid in their own right but I have recently returned from New Zealand, where press coverage has been dominated by the recent Springbok tour. A great deal of emphasis was placed on the survey finding that 54 per cent of the population was against the tour taking place. It seems to me that the desire to derive some measure of the attitude of the population (to a Springbok tour in this case) is perfectly valid and can influence the political and social life of a country. If the current techniques for measuring this are inadequate, then there is a need to derive valid methods to measure the proportion supporting a particular issue. The alternative is to demonstrate that a simple proportion is an inadequate measure of opinion in such a complex issue and to provide an alternative entirely new set of measuring instruments; otherwise survey methodologists will be cast into the role of saying that they cannot measure what is needed.

May I close by complimenting the authors on a well-written and a well-presented review. It gives me great pleasure to propose a vote of thanks on behalf of the Society.
Mr B. HEDGES (Social and Community Planning Research): I should begin, perhaps, by declaring a special interest, no doubt shared by many of the audience, in the subject of this paper. I spend my time primarily in conducting sample surveys or on matters connected with them. I therefore welcome this paper as an excellent review of what is known about the effect of question wording on responses, and it is one that I shall find very useful in my work.

Kalton and Schuman remind us that a great deal of work has been done on this topic, and that we have learnt a lot from it. It is not uncommon to meet people who approach the task of designing their first questionnaire with perfect confidence in their ability to perform what after all appears to be an ordinary, everyday business: the asking of questions. Everyone does that all the time; it is simple enough, then, to devise a string of questions that will produce good data ready for sophisticated analysis. However, this paper should make such people think twice. Designing questions is not easy, and is full of pitfalls.

Although a lot is known as a result of the work that has been done, what is not known is far greater. Anyone who tries to design a questionnaire, and in that process refers to the literature for guidance, is likely to be left with a large quantity of unsolved problems. I think that this will continue to be so for a long time. It is true that the rate of experimentation on, and interest in, question experiments is speeding up. In spite of that, and even if the integrated research the authors advocate is adopted, rather than the piecemeal research we have had in the past, I still think that systematic experiment will solve only a small proportion of the questionnaire designer's problems.

Reference to what is known from well-planned experiments is clearly the best basis for question design, but what about all the cases in which such knowledge is not available? My belief is that much could be learnt from much more open discussion and criticism of questions, even in the absence of specific experiments. There tends to be an implicit assumption in the literature that in the absence of hard
Limiting, longstanding disability: rates per 1000 population

Age group    1971    1972
0-4            31      17
5-14           55      36
15-44          89      62
45-54         250     190
65-74         412     329
75+           484     390
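The arithmetic behind a comparison of the two sets of rates can be sketched as follows. The sketch assumes that the first run of six figures in the table belongs to 1971 and the second to 1972, as the extraction order suggests; the discussion text that accompanied the table is not available here, so that assignment and any reading of the ratios are assumptions rather than the speaker's own analysis.

```python
# Rates per 1000 population for limiting, longstanding disability.
# Column assignment (first run = 1971, second run = 1972) is an assumption.
age_groups = ["0-4", "5-14", "15-44", "45-54", "65-74", "75+"]
rates_1971 = [31, 55, 89, 250, 412, 484]
rates_1972 = [17, 36, 62, 190, 329, 390]

# Ratio of the 1972 rate to the 1971 rate within each age group.
ratios = {
    age: r72 / r71
    for age, r71, r72 in zip(age_groups, rates_1971, rates_1972)
}

for age, ratio in ratios.items():
    print(f"{age:>6}  {ratio:.2f}")
```

Under that assumption the 1972 rate is lower than the 1971 rate in every age group, and the ratio rises from about 0.55 in the youngest group to about 0.81 in the oldest.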
any of my experience. We make the mistake of talking about questions and responses, whereas we should talk about stimuli and responses. All our questions, even on the most simple things, are stimuli. What we get back are responses. The only objectification of what we think that we are measuring lies in the wording of the question, an imposed structuring.

Mr Hedges has already referred to the gap between the concept that the researcher thinks he is asking about and how he actually writes the question. We then have the gap between that and the way the interviewer administers it, followed by another gap between what the interviewer administers and what the informant understands. If we go from the researcher to the informant there is an enormous distance, and the sample's answer to any question, I would hypothesize, must represent the net result of a range of interpretations of the stimulus. If we get differences when we change a question or ask the same question in a different way, probably what that represents is a difference in the distribution of interpretations, and we are looking at the net result of that.

This may sound fairly pessimistic, but it suggests what one or two other speakers have referred to, that people should be asked much more about what they understand by the questions: what they think we want to know. Certain methods of piloting never really tackle that; they tot up the answers to one wording or another wording, but never get at why results differ. Piloting can be useful or useless, depending on how it is done. In a method I use, which I call structured depth interviewing, where I want to quantify, I ask virtually the same question several times for about 15 or 20 minutes, very slightly differently worded. I have found that different people start to "give", start to "flow", at different points in this sequence. That ties in with the point about longer questions and what Dr Belson said: give people a chance to think about the topic and you get different replies. If we flash a light in somebody's eye, we do not expect that everyone will blink at exactly the same number of seconds after that flash; there is a distribution of responses to the flash, and likewise of interpretation of questions.

For really important issues I think we may have to face up to something which is ostensibly very non-scientific, perhaps asking people how they would like to talk to us about the subject in which we are interested. Standard questions are not necessarily the best way to obtain equally accurate information from all people. Is it more important to have identical stimuli which satisfy some shackle-like goal of "scientific" rectitude or identically-based answers? In that sense, I think that perhaps we should sometimes reappraise our targets when surveys are being designed: what it is that we are trying to do and for what purposes, i.e. "just ask questions" or guide problem-solving.

The following contributions were received in writing, after the meeting.

Dr AUBREY MCKENNELL (University of Southampton): The authors are to be congratulated in providing us with an account of the perplexing variety of question wording effects which is at once wide-ranging, succinct and insightful. Many people are going to want to refer to this paper for that reason alone. On its innovative side, two themes in the paper stand out for me. One is the emphasis, where non-factual or at least opinion questions are concerned, on multiple indicators; the other is the avowed aim to build and test theoretical structures. These two aspects are not only related but, in my view, need to be developed together. I am inclined to doubt whether satisfactory theories about the operation of single opinion items can be established otherwise.

It is now approaching forty years since the psychometrician Quinn McNemar (1946) published his classical review of experiments on the wording of opinion questions (an abundant literature even then) and came out firmly with the recommendation that single-item opinion gauging be discarded in favour of a multi-item scaling approach. Broadly speaking, that advice has been ignored by the fraternity of professional large-scale survey researchers, but has been followed and been the basis of many developments by academic researchers, notably psychologists. Attitude scaling methods have become increasingly sophisticated since McNemar's day. Techniques such as the semantic differential and smallest space analysis, for example, allow subtle variations in the connotations of verbal items to be precisely mapped. On the face of it, these and other developments in psychometrics would seem relevant to the study of issues in the wording of survey questions but it is surprisingly hard to trace any influence in the way such issues have customarily been investigated.

It could be that the traditional split ballot approach with its focus on one question variant at a time has obscured the possibility of contributions from the discipline of psychometrics. The psychometric focus is on series of items and on the item intercorrelations rather than their marginals, it is true, but the basic notions of true and error scores would seem transferable. It ought to be possible in principle to compare question wording variants for reliability, in the technical psychometric sense, or even for
because the frequency classification offered with the closed form suggested the norms for the activities. It would be useful to conduct additional experiments to further explore this point.

Dr Belson raises two issues of terminology. In the first place he is unhappy about our classification of opinion questions as non-factual questions. Since responses to opinion questions cannot be validated against an external source, it seems natural to classify them as non-factual questions according to our usage of the term. In this connection we might take the opportunity to emphasize that we used the factual/nonfactual distinction as a simple division of questions for purposes of organizing our material; we were not attempting to devise a comprehensive question classification which would be a substantial task in itself (see, for instance, the discussion of Rothwell to Kalton and Schuman, 1980). Dr Belson also considers it unfortunate that we use the term "long" to describe the questions lengthened by redundancies, as employed in the experiments by Cannell and his co-workers. It seems to us, however, that "long" is the appropriate adjective to describe such questions, whereas the kinds of question Dr Belson would describe as "long" should more accurately be described otherwise, perhaps as "complex" or "long and complex".

Mr Webb draws attention to the possibility of regional variation in the understanding of survey questions. Like him, we are unaware of research on this topic, and we agree that it warrants investigation. Some of the interest in "Black English" in the United States suggests similar problems along ethnic or social lines.

In order to bring out the fact that survey responses are affected by a range of factors in addition to the question asked, Mr Allt recommends that we talk about stimuli and responses rather than questions and responses. We acknowledge the importance of interviewer and other effects on responses, but we still consider it useful to study the question separately, as one component in the data collection process. Mr Allt goes on to pose the question of whether we should have identical stimuli or identically-based answers. While we can see that a case may be made for some limited variation in questions to obtain more valid factual responses from different classes of respondent (a procedure made more feasible by computer assisted telephone interviewing), we consider that the procedure would be used extremely cautiously and only after preparatory research has documented the comparability of responses. In statistical surveys, and particularly with opinions, it seems to us essential to insist on a high degree of standardization of stimuli so that responses can be aggregated in a meaningful way for quantitative analysis. We fail to see how identically-based answers could be obtained by survey interviews using different stimuli, and how to assess the comparability of responses obtained this way.

Professor Sudman usefully draws attention to one consoling and somewhat surprising finding in this difficult area; namely that for nonsensitive issues responses do not seem to differ significantly by mode of administration. This is an important point in view of the current interest in mixed modes of data collection.

In conclusion, we would like to provide Dr Belson with the assurance he requested that we have critically appraised the material that we presented in this paper. It was not possible to supply detailed accounts of all the studies cited, and for these the reader must turn to the original publications. We can, however, say that we relied mainly on large, well-conducted studies, and the findings have often been reinforced by one or more replications.
D. (1975). The determination of the optimum position on a ballot paper. Appl. Statist., 24,