PART ONE
Descriptive Statistics
CHAPTER 1
Introduction to Statistics
A. HISTORICAL BACKGROUND *
Statistics is a form of applied mathematics. It is a logical tool used in all the
sciences and employed by all modern cultures. It is especially a tool of the
biological and social sciences, a tool whose development has paralleled the
practical demands of man's needs in a diverse and complex world.
* Cf. H. M. Walker, Studies in the History of Statistical Method, Williams & Wilkins,
Baltimore, 1929.
* Cf. H. M. Walker, Studies in the History of Statistical Method, Williams & Wilkins,
Baltimore, 1929, p. 17.
† Cf. E. T. Bell, Men of Mathematics, Simon & Schuster, New York, 1937, chap. 11.
‡ E. H. Godfrey, Section on Canada, in John Koren (ed.), The History of Statistics:
Their Development and Progress in Many Countries, Macmillan, New York, 1918, pp. 179-
198.
* C. D. CU~I.efll,ldu, in the New York Herald Tribune, July 7, 1940; also direct
correspondence.
Correlation
The need for the technique of correlation is aptly illustrated by some of
Bowditch's problems which he was unable to answer adequately, as he himself
recognized. With the object of improving school application in growing
children, the Massachusetts Board of Health sponsored the study by Bowditch,
reported in 1877.* Descriptive statistics of nearly 25,000 children were
obtained, including not only bodily measurements and age, but also nationality,
place of birth, and occupation of parents. Bowditch wished to analyze
The abscissa (horizontal) axis represents the scale of measures or scores of a
variable attribute or trait. The ordinate (vertical) axis represents the frequencies
of the distribution. The higher the curve at any point, the greater the number of
frequencies or instances for the measures at that point. The point of greatest
concentration of frequencies is in the center of the distribution, at M, the mean.
group question. We know that some people are likely to be tall, some short,
some heavy, some light. Persons, even of the same age, thus vary in height
and weight. Height and weight are therefore called variables or variates.
Quetelet and others established the fact that there is a very real tendency
for a large random sample of persons of a given age to have weights or heights
which, when systematically organized into a series according to size, form a
distribution which is similar to that of the normal probability curve (see
Fig. 1:1).
The question of relationship between two variable attributes like height
and weight is whether individuals of average height are also of average weight;
whether very tall individuals are also very heavy; whether very short
individuals are also very light. In other words, the question is whether weights
and heights, when paired according to the persons from whom they are
obtained, vary together in any systematic way. This is the problem of
co-variation or correlation, and it is complicated by the fact that rarely, if ever,
do the measured attributes of biological or social phenomena exhibit perfect
or complete correlation. Persons who weigh, say, 160 pounds do not all have
the same height; rather, they vary in height. Similarly, persons of a given
height, say 6 feet, vary in weight. The statistical problem here is one of
determining the form and degree of any tendency for weight and height to
vary together. The details of the statistical technique of correlation demanded
by this kind of problem, the problem of possible co-variation, will be
considered in Chapter 9. Here we wish only to emphasize that the technique
discovered by Galton has been indispensable to the modern development of the
biological and social sciences.
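As a minimal sketch of what measuring the degree of co-variation can look like,
the following Python computes the product-moment correlation coefficient for
paired heights and weights. The choice of this particular coefficient, the
function name pearson_r, and the seven pairs of figures are assumptions made
for illustration only; they are not taken from the text, whose own treatment is
deferred to Chapter 9.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation between two series of paired measurements."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of the deviations from the two means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each series.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a series with no variation has no defined correlation")
    return sxy / sqrt(sxx * syy)


# Hypothetical heights (inches) and weights (pounds), paired by person.
heights = [62, 64, 66, 68, 70, 72, 74]
weights = [120, 136, 131, 152, 149, 171, 168]

r = pearson_r(heights, weights)
print(f"r = {r:.2f}")
```

A value near +1 indicates that the taller persons in this invented sample tend
also to be the heavier. Because the pairs do not lie exactly on a straight line,
the coefficient falls short of 1, in keeping with the remark that perfect
correlation is rarely if ever observed.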
It is by the statistical technique of correlation that we are today able by
comparatively simple methods to investigate relations between the attributes
of individuals, or of organisms generally, as well as relations between the
attributes of other kinds of natural and social phenomena. What is the nature
of the relation, if any, between the I.Q.'s and school grades of children,
between the tested achievements of parents and their offspring, between the
manual abilities of siblings? Is there any relation between temperature and
plant growth, between neighborhood status and delinquency, between the
protein content and proportion of vitreous kernels in wheat grains? Although
methods of investigating such questions as these are sometimes complicated,
the method of correlation itself remains a most powerful tool for the study
of possible relations among the variable attributes of natural and social
phenomena. It is again to be emphasized, however, that this method, as well
as statistics generally, is for the study of group phenomena, that is, of masses of
instances. Inferences which can be made legitimately from statistical results
are about the group, not about the individual instance. Descriptively, such
results give us information about the group as a whole. Analytically, such
results may often be used for predicting what may happen in the long run or
on the average, but not in the individual case.
chances are about even that a child to be born will be a boy or a girl. Again,
the metaphor is based (1) on our ignorance of the determining factors in the
given, individual instance, and (2) on the empirical facts of vital statistics
which have revealed for thousands of births that the ratio of boys to girls
is about 51 to 49.
These two examples should serve to illustrate the actuarial or group character
of statistics. What is true for the proportion of heads and tails in coin
tossing, and of the sex ratio of births in vital statistics, is also true for all
statistical inference, in that predictions are actuarial and not individual. It
is well established in psychological and educational measurement, for example,
that there exists a real correlative relationship between the academic
attainments and intelligence test achievement of the school population in our
culture. Given a particular I.Q. score, say 70, obtained under optimum
conditions of measurement, we can predict that school children with such an
I.Q. will, on the average, be below average in their academic attainments.
That this is an actuarial or group inference should be obvious; nevertheless,
such a prediction is sometimes made for the individual child who is, after
all, either below average or not, in his academic attainments. And what he
will continue to do in his school work can be effectively and logically
predicted with confidence only as a result of studying him as the psychological
individual that he is. In dealing with the individual child, the psychologist
finds it useful and valid to draw upon his fund of statistical or actuarial
experience and information so long as he continues to focus his analytical
attention on the unique totality of the particular child.*
That a child has an I.Q. of 70 is useful information so far as the psychologist
determines as precisely as possible what the actual intelligence test
performance means for that particular child. In fact, the competent
psychological investigator uses an intelligence test chiefly for such a purpose,
for the light which the child's performance may throw on his total personality.
In individual diagnosis and prognosis, the calculation of the I.Q. score itself
is incidental to this fundamental purpose.
We see, then, that the data and methods of statistics are for the study of
group or mass phenomena. And statistical inferences are actuarial in
character, i.e., they are inferences about what happens or may happen in the long
run, or on the average.
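The actuarial character of such inferences can be illustrated by a small
simulation, offered only as a sketch. It assumes that each birth is an
independent event with a fixed chance of 0.51 of being a boy, a simplification
of the ratio of about 51 to 49 quoted earlier; the function name
proportion_of_boys and the fixed seed are likewise hypothetical choices made
for the illustration.

```python
import random


def proportion_of_boys(n_births, p_boy=0.51, seed=0):
    """Simulate n_births independent births; return the observed share of boys."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    boys = sum(1 for _ in range(n_births) if rng.random() < p_boy)
    return boys / n_births


# Observed share of boys for small, moderate, and large numbers of births.
for n in (10, 1_000, 100_000):
    print(n, round(proportion_of_boys(n), 3))
```

With few births the observed share may stray well away from 0.51, whereas with
many it settles close to that figure; yet no run, however long, tells us the sex
of the next child to be born.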
word statistics is also used to denote the data or information about
populations, about biological and social phenomena, that can be measured or
enumerated. Although this latter use of the term has been suggested in the
preceding pages, we wish specifically to differentiate statistics as
information from statistics as method.
Statistics as information represents perhaps the most general use of the
concept. Today there are literally thousands of publications presenting
statistical information of various kinds: vital statistics, statistics of health
and medical care, statistics of education, of social security and of labor,
statistics of crime, statistics of governmental finance, statistics of agriculture,
manufactures, minerals, of housing and building construction, of wholesale
and retail trade, of public utilities, of money and banking, of security markets
and corporations, statistics of international trade, of business activity, of
commodity prices, of consumption, and of national income and wealth.
Although we are not directly concerned with statistics as information, the
student of psychology, anthropology, sociology, or education should be
familiar with sources of statistical information relevant to his field of research.
A short bibliography of source material is appended to serve this purpose
(see Appendix A).