Wiley
Royal Statistical Society and Wiley are collaborating with JSTOR to digitize, preserve and extend access to Journal of the
Royal Statistical Society. Series B (Methodological).
This content downloaded from 158.121.247.60 on Tue, 06 Oct 2015 17:38:04 UTC
All use subject to JSTOR Terms and Conditions
1955] 69
STATISTICAL METHODS AND SCIENTIFIC INDUCTION
By Sir RONALD FISHER
Department of Genetics, University of Cambridge
SUMMARY
THE attempt to reinterpret the common tests of significance used in scientific research as though they constituted some kind of acceptance procedure and led to "decisions" in Wald's sense, originated in several misapprehensions and has led, apparently, to several more.
The three phrases examined here, with a view to elucidating the fallacies they embody, are:
(i) "Repeated sampling from the same population",
(ii) Errors of the "second kind",
(iii) "Inductive behaviour".
Mathematicians without personal contact with the Natural Sciences have often been misled by such phrases. The errors to which they lead are not always only numerical.
1. Introduction
DURING the present century a good deal of progress seems to have been made in the business of interpreting observational data, so as to obtain a better understanding of the real world. The three aspects of principal importance for this progress have been, first, the use of better mathematics and more comprehensive ideas in mathematical statistics; leading to more correct or exact methods of calculation, applied to the given body of data (a unique sample in the language of W. S. Gosset, writing under the name of "Student") which comprehends all the numerical information available on the topic under discussion. Secondly, as methods of summarizing and drawing correct conclusions approached adequacy, the wide subject of experimental design was opened up, aimed at obtaining data more complete and precise, and at avoiding waste of effort in the accumulation of ill-planned, indecisive, or irrelevant observations. Thirdly, as a natural or even inevitable concomitant of the first two, a more complete understanding has been reached of the structure and peculiarities of inductive logic, that is, of reasoning from the sample to the population from which the sample was drawn, from consequences to causes, or in more logical terms, from the particular to the general.
Much that I have to say will not command universal assent. I know this for it is just because I find myself in disagreement with some of the modes of exposition of this new subject which have from time to time been adopted, that I have taken this opportunity of expressing a different point of view; different in particular from that expressed in numerous papers by Neyman, Pearson, Wald and Bartlett. There is no difference to matter in the field of mathematical analysis, though different numerical results are arrived at, but there is a clear difference in logical point of view, and I owe to Professor Barnard of The Imperial College the penetrating observation that this difference in point of view originated when Neyman, thinking that he was correcting and improving my own early work on tests of significance, as a means to the "improvement of natural knowledge", in fact reinterpreted them in terms of that technological and commercial apparatus which is known as an acceptance procedure.
Now, acceptance procedures are of great importance in the modern world. When a large concern like the Royal Navy receives material from an engineering firm it is, I suppose, subjected to sufficiently careful inspection and testing to reduce the frequency of the acceptance of faulty or defective consignments. The instructions to the Officers carrying out the tests must also, I conceive, be intended to keep low both the cost of testing and the frequency of the rejection of satisfactory lots. Much ingenuity and skill must be exercised in making the acceptance procedure a really effectual and economical one. I am casting no contempt on acceptance procedures, and
$$t = \frac{(b - \beta)\sqrt{A}}{s}$$
a + c    b + d    n
were proportional simply to
$$\frac{1}{a!\,b!\,c!\,d!}$$
$$\frac{(a+b)!\,(a+c)!\,(b+d)!\,(c+d)!}{n!}\cdot\frac{1}{a!\,b!\,c!\,d!}$$
where the new factor depends only on the margins and not on the contents.
In this case the margins of the table, which by themselves supply no information as to the proportionality of the contents, do, like the value A in the regression example, determine how much information the contents will contain. The reasonable principle that in testing the significance with a unique sample, we should compare it only with other possibilities in all relevant respects like that observed, will lead us to set aside the various possible tables having different margins, the relative frequencies of which must depend on unknown factors of the population sampled.
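As an illustration of the comparison restricted to tables with the observed margins, the following is a minimal sketch only. The function names and the example table are hypothetical and do not come from the paper. It enumerates every fourfold table sharing the margins of an observed table and assigns each a probability proportional to 1/(a! b! c! d!), which is the hypergeometric distribution of the top-left cell.

```python
from math import comb, factorial


def factorial_form(a, b, c, d):
    """Probability of one table, written as margins-factor times 1/(a! b! c! d!)."""
    n = a + b + c + d
    margins = factorial(a + b) * factorial(a + c) * factorial(b + d) * factorial(c + d)
    contents = factorial(a) * factorial(b) * factorial(c) * factorial(d)
    return margins / (factorial(n) * contents)


def fixed_margin_probabilities(a, b, c, d):
    """Probabilities of all tables with the same margins, keyed by the top-left cell."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    lowest = max(0, col1 - row2)   # smallest top-left cell the margins allow
    highest = min(row1, col1)      # largest top-left cell the margins allow
    return {
        x: comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)
        for x in range(lowest, highest + 1)
    }


# Hypothetical observed table (a, b, c, d), chosen only for illustration.
observed = (3, 1, 1, 3)
probabilities = fixed_margin_probabilities(*observed)
# One-sided level: tables at least as extreme as the one observed.
one_sided_level = sum(p for x, p in probabilities.items() if x >= observed[0])
print(probabilities)
print(one_sided_level)
```

Tables with other margins never enter the sum, so the result does not depend on the unknown factors of the population that govern how often each set of margins would occur.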
On two occasions in the intervening twenty years distinguished statisticians have attempted to bring into the account populations of fourfold tables not having fixed margins. In both cases, such is the reasonableness of human nature in favourable cases, the authors of these innovations withdrew them after some discussion, and expressed themselves as completely satisfied that the apparent advance they had made was illusory. The first was Professor E. B. Wilson of the Harvard School of Public Health, writing in Science in 1941, and later taking occasion to expound the method of Fisher and Yates in two papers in the Proceedings of the National Academy of Sciences in the following year. The second case was that of Professor Barnard, who started on the assumption that the method expounded by Neyman and Pearson could be relied on, and in the first flush of success reported a test using the language of that theory "much more powerful than Fisher's", but who also, after some discussion, had the generosity to go out of his way to explain that further meditation had led him to the conclusion that Fisher was right after all.
Professor Barnard has a keen and highly trained mathematical mind, and the fact that he was misled into much wasted effort and disappointment should be a warning that the theory of testing hypotheses set out by Neyman and Pearson has missed at least some of the essentials of the problem, and will mislead others who accept it uncritically. Indeed, in the matter of Behrens' test for the significance of the difference between the means of two small samples, objection was taken on exactly the ground that the significance level is not the same as the frequency found on repeated sampling.
The examples I have given from simpler problems show clearly that it should never have been put forward in the field of significance tests, though perhaps perfectly appropriate to acceptance sampling.
4. "Inductive Behaviour"
The erroneous insistence on the formula of "repeated sampling from the same population" and the misplaced emphasis on "errors of the second kind" seem both clearly enough to flow from the notion that the process by which experimenters learn from their experiments might be equated to some equivalent acceptance procedure. The same confusion evidently takes part in the curious preference expressed by J. Neyman for the phrase "inductive behaviour" to replace what he regards as the mistaken phrase "inductive reasoning".
Logicians, in introducing the terms "inductive reasoning" and "inductive inference" evidently imply that they are speaking of processes of the mind falling to some extent outside those of which a full account can be given in terms of the traditional deductive reasoning of formal logic. Deductive reasoning in particular supplies no essentially new knowledge, but merely reveals or unfolds the implications of the axiomatic basis adopted. Ideally, perhaps, it should be carried out mechanically. It is the function of inductive reasoning to be used, in conjunction with observational data, to add new elements to our theoretical knowledge. That such a process existed, and was possible to normal minds, has been understood for centuries; it is only with the recent development of statistical science that an analytic account can now be given, about as satisfying and complete, at least, as that given traditionally of the deductive processes.
When, therefore, Neyman denies the existence of inductive reasoning he is merely expressing a verbal preference. For him "reasoning" means what "deductive reasoning" means to others. He does not tell us what in his vocabulary stands for inductive reasoning, for he does not clearly understand what that is. What he tells us to call "inductive behaviour" is merely the practice of making some assertion of the form
$$T < \theta$$
in some circumstances, and refraining from this assertion in others. This is evidently an effort to assimilate a test of significance to an acceptance procedure. From a test of significance, however, we learn more than that the body of data at our disposal would have passed an acceptance test at some particular level; we may learn, if we wish to, and it is to this that we usually pay attention, at what level it would have been doubtful; doing this we have a genuine measure of the confidence with which any particular opinion may be held, in view of our particular data. From a strictly realistic viewpoint we have no expectation of an unending sequence of similar bodies of data, to each of which a mechanical "yes or no" response is to be given. What we look forward to in science is further data, probably of a somewhat different kind, which may confirm or elaborate the conclusions we have drawn; but perhaps of the same kind, which may then be added to what we have already, to form an enlarged basis for induction.
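The contrast between a mechanical "yes or no" response and the level at which the data become doubtful can be made concrete with a small sketch. It assumes, purely for illustration, a test statistic that is a standard normal deviate and a two-sided test; the function names and the 5 per cent. threshold are hypothetical choices, not taken from the paper.

```python
from math import erfc, sqrt


def two_sided_level(z):
    """Two-sided significance level of a standard normal deviate z."""
    return erfc(abs(z) / sqrt(2.0))


def acceptance_rule(z, alpha=0.05):
    """Mechanical response: True means 'reject' at the fixed level alpha."""
    return two_sided_level(z) < alpha


# Two hypothetical bodies of data that an acceptance rule treats identically.
for deviate in (2.0, 4.0):
    print(deviate, acceptance_rule(deviate), two_sided_level(deviate))
```

Both deviates receive the same "reject" response at the fixed level, yet the levels themselves differ by several orders of magnitude, and it is this quantity that measures how strongly the particular data contradict the hypothesis.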
Neyman reinforces his choice of language by arguments much less defensible. He seems to claim that the statement (a) "$\theta$ has a probability of 5 per cent. of exceeding $T$" is a different statement from (b) "$T$ has a probability of 5 per cent. of falling short of $\theta$". Since language is meant to be used I believe it is essential that such statements, whether expressed in words or symbols, should be recognized as equivalent, even when $\theta$ is a parameter, defined as an objective character of the real world, entering into the specification of our hypothetical population, whilst $T$ is directly calculable from the observations. To prevent the kind of confusion that Neyman has introduced we may point out that both statements are statements of the relationship in which $T$, or $\theta$, stands to the other. Also, since probability is specified, the statements have meaning only in relation to a sufficiently well-defined population of pairs of these values. The statements do not imply that in this population of pairs of values either $T$ or $\theta$ is constant, but also they do not exclude the possibility that one should be constant, and that variability should be confined to the other. Reference to the mode of calculating our limits in an ordinary test of significance will generally establish that in these calculations the parameter $\theta$ has been treated provisionally as constant, and variations calculated of $T$ for given $\theta$. The possible variation of $\theta$ is left arbitrary, and is irrelevant to the calculations, much as is the distribution of the independent variate in the regression problem.
5. Requirements of Inductive Inferences
(a) Since some inductive inferences are expressed in terms of probability (fiducial probability) the first requirement is a clear understanding that probability statements always have reference to some sufficiently defined population, and never to individuals, save as typical members of such a population. This understanding is needed for deductive inferences also, when statements of probability are made.
(b) A very important feature of inductive inference, unknown in the field of deductive inference, is the framing of the hypothesis in terms of which the data are to be interpreted. This hypothesis must fulfill several requirements: (i) it must be in accordance with the facts of nature as so far known; (ii) it must specify the frequency distribution of all observational facts included in the data, so that the data as a whole may be taken as a typical sample; (iii) it must incorporate as parameters all constants of nature which it is intended to estimate, in addition possibly to special, or ad hoc, parameters; (iv) it must not be contradicted, in any way judged relevant, by the data in hand. If it satisfies these conditions it is therefore a scientific construct of a fairly elaborate type. It is by no means obvious that different persons should not put forward different successful hypotheses, among which the data can supply little or no discrimination. The hypothesis is sometimes called a model, but I should suggest that the word model should only be used for aspects of the hypothesis between which the data cannot discriminate. As an act of construction the hypothesis is not altogether impersonal, for the scientist's personal capacity for theorizing comes into it; moreover, the criteria by which it is approved require a certain honesty, or integrity, in their application.
(c) In one respect inductive reasoning is more strict than is deductive reasoning, since in the latter any item of the data may be ignored, and valid inferences may be drawn from the rest; i.e. from any selected sub-set of the set of axioms used, whereas in inductive inference the whole of the data must be taken into account. This seems to be very difficult to be understood by workers trained in deductive methods only, though more easily understood by statisticians. The political principle that anything can be proved by statistics arises from the practice of presenting only a selected sub-set of the data available.
In some early results of my own I rely on the datum "There is no knowledge of probabilities a priori". They would not certainly have been legitimate without this datum, but they have been mistakenly described as a kind of greatest common factor of the inferences which could be drawn for different possible data giving probabilities a priori.
$$P = F_N(r, \rho)$$
such that the distribution of $r$ for given $\rho$ is given by the frequency element
$$\frac{\partial F}{\partial r}\,dr,$$
then the distribution
$$-\frac{\partial F}{\partial \rho}\,d\rho$$
values of x could be rejected at least at the 5 per cent. level of significance; but this gives only an inequality statement for the probability that x is less than any given value. Neyman seems to ignore this distinction, and to speak in both cases of confidence limits. Logically, however, the form of inference admissible is totally distinct.
Equally, statements of fiducial probability in continuous cases are only proper if the whole of the information is utilized, as it is by the use of sufficient estimates, whereas for any test of significance, however low in power, it may well be possible to point to the limits outside which parametric values are significantly contradicted by the data at a given level of significance. These also should be regarded as giving only rough statements for the fiducial probability.
There are other cases in the theory of estimation in which rather similar data yield information of remarkably different kinds. Consider, for example, the case in which $x$ and $y$ are two observables distributed in normal distributions with unit variance in each case, and independently, about hypothetical means $\xi$ and $\eta$. No situation could be simpler. Suppose, however, that the data contain a functional relationship connecting $\xi$ and $\eta$. Then different cases arise from different functional forms:
(i) If there is a simple linear connection between $\xi$ and $\eta$, so that $(\xi, \eta)$ represents a point on a given straight line, then the foot of the perpendicular from the observation point $(x, y)$ is a sufficient estimate, and the fiducial distribution of $(\xi, \eta)$ on the given line will be a normal distribution with unit variance about this estimate. All possible observations on the same perpendicular are equivalent.
(ii) If the given locus of $(\xi, \eta)$ is a circle, there is no sufficient estimate; the distance of $(x, y)$ from the centre of the given circle is, however, an ancillary statistic, which together with the maximum likelihood estimate makes the estimation exhaustive. For each possible distance an appropriately oriented fiducial distribution on the circle may be specified.
(iii) In general there is a well defined likelihood function, and therefore an estimated point of maximum likelihood. It is not obvious that any general substitute can be found for the ancillary statistic, save in an asymptotic sense, or that any statement of fiducial probability is possible in general. Thus three logically distinct types of inference arise from simple changes in the mathematical specification of the problem.
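The geometry of cases (i) and (ii) can be sketched numerically. The following is an illustrative sketch under stated assumptions, with hypothetical function names and example values that are not from the paper. For independent unit-variance normal observations the likelihood is greatest at the point of the locus nearest to the observation, so the linear case reduces to an orthogonal projection and the circular case to a radial projection, with the distance from the centre reported separately as the ancillary statistic.

```python
from math import atan2, cos, hypot, sin


def foot_of_perpendicular(x, y, point_on_line, direction):
    """Case (i): project the observation (x, y) orthogonally onto the given line."""
    dx, dy = direction
    length = hypot(dx, dy)
    ux, uy = dx / length, dy / length          # unit vector along the line
    px, py = point_on_line
    along = (x - px) * ux + (y - py) * uy      # signed distance along the line
    return (px + along * ux, py + along * uy)


def circle_estimate(x, y, centre, radius):
    """Case (ii): return (distance from centre, nearest point on the circle)."""
    rx, ry = x - centre[0], y - centre[1]
    distance = hypot(rx, ry)                   # the ancillary statistic
    angle = atan2(ry, rx)                      # direction of the observation
    nearest = (centre[0] + radius * cos(angle), centre[1] + radius * sin(angle))
    return distance, nearest


# Hypothetical observation and loci, chosen only for illustration.
print(foot_of_perpendicular(1.0, 3.0, (0.0, 0.0), (1.0, 1.0)))
print(circle_estimate(3.0, 4.0, (0.0, 0.0), 1.0))
```

In the linear case every observation on the same perpendicular gives the same projected point, which is the sense in which such observations are equivalent. In the circular case two observations in the same direction but at different distances share the nearest point and differ only in the ancillary distance.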
(e) Finally, in inductive inference we introduce no cost functions for faulty judgements, for it is recognized in scientific research that the attainment of, or failure to attain to, a particular scientific advance this year rather than later, has consequences, both to the research programme, and to advantageous applications of scientific knowledge, which cannot be foreseen. In fact, scientific research is not geared to maximize the profits of any particular organization, but is rather an attempt to improve public knowledge undertaken as an act of faith to the effect that, as more becomes known, or more surely known, the intelligent pursuit of a great variety of aims, by a great variety of men, and groups of men, will be facilitated. We make no attempt to evaluate these consequences, and do not assume that they are capable of evaluation in any sort of currency.
When decision is needed it is the business of inductive inference to evaluate the nature and extent of the uncertainty with which the decision is encumbered. Decision itself must properly be referred to a set of motives, the strength or weakness of which should have had no influence whatever on any estimate of probability. We aim, in fact, at methods of inference which should be equally convincing to all rational minds, irrespective of any intentions they may have in utilizing the knowledge inferred.
We have the duty of formulating, of summarising, and of communicating our conclusions, in intelligible form, in recognition of the right of other free minds to utilize them in making their own decisions.
References
BARNARD, G. A. (1945), "A new test for 2 x 2 tables", Nature, 156, No. 3954, 177.
--- (1946), "Sequential tests in industrial statistics", J. R. Statist. Soc., Supp., 8, 1-21.
--- (1947a), "Significance tests for 2 x 2 tables", Biometrika, 34, 123-138.
--- (1947b), "The meaning of a significance level", Biometrika, 34, 179-182.
--- (1947c), Review: Sequential Analysis. By Abraham Wald, J. Amer. Stat. Ass., 42, 658.
--- (1949), "Statistical inference", J. R. Statist. Soc., B, 11, 115-139.
COCHRAN, W. G., & COX, G. M. (1950), Experimental Designs. New York: Wiley. London: Chapman & Hall.
DAVIES, O. L. (ed.) (1954), The Design and Analysis of Industrial Experiments. London & Edinburgh: Oliver & Boyd.
FINNEY, D. J. (1952), Statistical Method in Biological Assay. London: Griffin.
FISHER, R. A. (1922), "The goodness of fit of regression formulae, and the distribution of regression coefficients", J. R. Statist. Soc., 85, 597-612.
--- (1930), "Inverse probability", Proc. Camb. Phil. Soc., 26, 528-535.
--- (1933), "The concepts of inverse probability and fiducial probability referring to unknown parameters", Proc. Roy. Soc., A, 139, 343-348.
--- (1934), Statistical Methods for Research Workers. (5th ed. and later.) London & Edinburgh: Oliver & Boyd.
--- (1941), "The interpretation of experimental fourfold tables", Science, 94, No. 2435, 210-211.
--- (1945), "A new test for 2 x 2 tables", Nature, 156, No. 3961, 388.
GOULDEN, C. H. (1939 and 1952), Methods of Statistical Analysis. New York: Wiley. London: Chapman & Hall.
NEYMAN, J. (1938), "L'estimation statistique traitée comme un problème classique de probabilité", Actualités Scientifiques et Industrielles, No. 739, 25-57.
--- & PEARSON, E. S. (1933a), "The testing of statistical hypotheses in relation to probabilities a priori", Proc. Camb. Phil. Soc., 29, 492-510.
--- (1933b), "On the problem of the most efficient tests of statistical hypotheses", Phil. Trans. Roy. Soc., A, 231, 289-337.
PEARSON, E. S. (1947), "The choice of statistical tests illustrated on the interpretation of data classed in a 2 x 2 table", Biometrika, 34, 139-167.
"STUDENT" (1908), "The probable error of a mean", Biometrika, 6, 1-25.
VENN, J. A. (1876), The Logic of Chance (2nd ed.). London: Macmillan.
WALD, A. (1950), Statistical Decision Functions. New York: Wiley. London: Chapman & Hall.
WILSON, E. B. (1941), "The controlled experiment and the fourfold table", Science, 93, No. 2424, 557-560.
--- (1942a), "On contingency tables", Proc. Nat. Acad. Sci., 28, No. 3, 94-100.
--- & WORCESTER, J. (1942b), "Contingency tables", Proc. Nat. Acad. Sci., 28, No. 9, 378-384.
YATES, F. (1949), Sampling Methods for Censuses and Surveys. London: Griffin.