
Stat 250 Gunderson Lecture Notes

6: Learning about the Difference in Population Proportions

Part 1: Distribution for a Difference in Sample Proportions

The Independent Samples Scenario
Two samples are said to be independent samples when the measurements in one sample are
not related to the measurements in the other sample. Independent samples are generated in a
variety of ways. Some common ways:

Random samples are taken separately from two populations and the same response
variable is recorded for each individual.

One random sample is taken and a variable is recorded for each individual, but then units
are categorized as belonging to one population or another, e.g. male/female.

Participants are randomly assigned to one of two treatment conditions, and the same
response variable, such as weight loss, is recorded for each individual unit.

If the response variable is categorical, a researcher might compare two independent groups by
looking at the difference between the two proportions.

There are usually two questions of interest about a difference in two population proportions.
First, we want to estimate the value of the difference. Second, often we want to test the
hypothesis that the difference is 0, which would indicate that the two proportions are equal. In
either case, we will need to know about the sampling distribution for the difference in two
sample proportions (from independent samples).

Sampling Distribution for the Difference in Two Sample Proportions

Example: Driving Safely
Question of interest: How much of a difference is there between men and women with regard
to the proportion who have driven a car when they had too much alcohol to drive safely?

Study: Time magazine reported the results of a poll of adult Americans. One question asked
was: Have you ever driven a car when you probably had too much alcohol to drive safely?

Let $p_1$ be the population proportion of men who would respond yes.

Let $p_2$ be the population proportion of women who would respond yes.

We want to learn about $p_1$ and $p_2$ and how they compare to each other. We could estimate the
difference $p_1 - p_2$ with the corresponding difference in the sample proportions $\hat{p}_1 - \hat{p}_2$.

Will it be a good estimate? How close can we expect the difference in sample proportions to be
to the true difference in population proportions (on average)?


Imagine repeating the study many times, each time taking two independent random samples of
sizes $n_1$ and $n_2$, and computing the value of $\hat{p}_1 - \hat{p}_2$. What kind of values could you get for
$\hat{p}_1 - \hat{p}_2$? What would the distribution of the possible $\hat{p}_1 - \hat{p}_2$ values look like? What can we say
about the distribution of the difference in two sample proportions?

Using results about how to work with differences of independent random variables and recalling
the form of the sampling distribution for a sample proportion, the sampling distribution of the
difference in two sample proportions $\hat{p}_1 - \hat{p}_2$ can be determined.

First recall that when working with the difference in two independent random variables:

the mean of the difference is just the difference in the two means

the variance of the difference is the sum of the variances

Next, remember that the standard deviation of a sample proportion is $\sqrt{\frac{p(1-p)}{n}}$.

So what would the variance of a single sample proportion be? $\frac{p(1-p)}{n}$

So let's apply these ideas to our newest parameter of interest, the difference in two sample
proportions $\hat{p}_1 - \hat{p}_2$.

Sampling Distribution of the Difference in Two (Independent) Sample Proportions

If the two sample proportions are based on independent random samples from two populations
and if all of the quantities $n_1\hat{p}_1$, $n_1(1-\hat{p}_1)$, $n_2\hat{p}_2$, and $n_2(1-\hat{p}_2)$ are at least 10,

Then the distribution for the possible $\hat{p}_1 - \hat{p}_2$ will be (approximately)

$$N\left(p_1 - p_2,\ \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}\right)$$
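
The approximate normal model for the difference in sample proportions can be checked by simulation. The Python sketch below is illustrative only: the population proportions, sample sizes, number of repetitions, seed, and the function name `simulate_differences` are all assumptions chosen for the demonstration and do not come from these notes.

```python
import math
import random
import statistics


def simulate_differences(p1, n1, p2, n2, reps=2000, seed=250):
    """Simulate many values of p1_hat - p2_hat from independent samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    diffs = []
    for _ in range(reps):
        # Count "yes" responses in each of the two independent samples.
        x1 = sum(rng.random() < p1 for _ in range(n1))
        x2 = sum(rng.random() < p2 for _ in range(n2))
        diffs.append(x1 / n1 - x2 / n2)
    return diffs


# Hypothetical populations and sample sizes (assumed for illustration).
p1, n1, p2, n2 = 0.30, 200, 0.20, 300
diffs = simulate_differences(p1, n1, p2, n2)

# Values predicted by the rules for differences of independent variables.
theory_mean = p1 - p2
theory_sd = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

print("simulated mean:", round(statistics.mean(diffs), 4), "theory:", round(theory_mean, 4))
print("simulated s.d.:", round(statistics.stdev(diffs), 4), "theory:", round(theory_sd, 4))
```

With these assumed inputs the simulated mean and standard deviation should land close to the theoretical values, though they will not match exactly because the simulation uses a finite number of repetitions.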


Since the population proportions $p_1$ and $p_2$ are not known, we will use the data to compute
the standard error of the difference in sample proportions.

Standard Error of the Difference in Sample Proportions

$$\text{s.e.}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$

The standard error of $\hat{p}_1 - \hat{p}_2$ estimates, roughly, the average distance of the possible
$\hat{p}_1 - \hat{p}_2$ values from $p_1 - p_2$. The possible $\hat{p}_1 - \hat{p}_2$ values result from considering all possible
independent random samples of the same sizes from the same two populations.
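
As a small sketch of how the standard error of the difference is computed from data, the Python function below takes two sample proportions and their sample sizes. The function name `se_diff` and the example inputs are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def se_diff(p1_hat, n1, p2_hat, n2):
    """Standard error of p1_hat - p2_hat for two independent samples."""
    return math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)


# Hypothetical sample results (assumed for illustration only):
# 0.5(0.5)/100 + 0.4(0.6)/100 = 0.0049, and sqrt(0.0049) = 0.07.
example_se = se_diff(0.5, 100, 0.4, 100)
print(round(example_se, 4))  # 0.07
```

Each sample contributes its own variance term, so the smaller sample tends to dominate the standard error.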

Moreover, we can use this standard error to produce a range of values that we can be quite
confident will contain the difference in the population proportions $p_1 - p_2$:

$(\hat{p}_1 - \hat{p}_2) \pm (\text{a few}) \times \text{s.e.}(\hat{p}_1 - \hat{p}_2)$.

This is the basis for the confidence interval for the difference in population proportions discussed
next in Part 2.

If we are interested in testing hypotheses about the difference in the population rates, we will
need to construct a null standard error of the difference in the sample proportions and use it to
compute a standardized test statistic. That test statistic will have the following basic form:

$$\frac{\text{Sample statistic} - \text{Null value}}{\text{(Null) standard error}}$$

This is the basis for the hypothesis testing about the difference in population proportions covered
in Part 3 of this section of notes.


Additional Notes
A place to jot down questions you may have and ask during office hours, take a few extra notes, write
out an extra problem or summary completed in lecture, create your own summary about these concepts.


Part 2: Confidence Interval for a Difference in Population Proportions

We have two populations from which independent samples are available (or one population for
which two groups are formed using a categorical variable). The response variable is also categorical and
we are interested in comparing the proportions for the two populations.

Let $p_1$ be the population proportion for the first population.
Let $p_2$ be the population proportion for the second population.

Parameter: the difference in the population proportions $p_1 - p_2$.

Sample estimate: the difference in the sample proportions $\hat{p}_1 - \hat{p}_2$.

Standard error: $\text{s.e.}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$

So we have our estimate of the difference in the two population proportions, namely $\hat{p}_1 - \hat{p}_2$,
and we have its standard error. To make our confidence interval, we need to know the multiplier.

Sample estimate $\pm$ Multiplier $\times$ Standard error

As in the case for estimating one population proportion, we assume the sample sizes are
sufficiently large so the multiplier will be a $z^*$ value found from using the standard normal
distribution.

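Two Independent Samples z Confidence Interval for $p_1 - p_2$

$$(\hat{p}_1 - \hat{p}_2) \pm z^{*}\,\text{s.e.}(\hat{p}_1 - \hat{p}_2)$$

where

$$\text{s.e.}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$

and $z^*$ is the appropriate multiplier from the N(0,1) distribution.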
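
The multiplier $z^*$ is the value that leaves the desired confidence level as the central area under the standard normal curve. One way to look it up is sketched below using Python's standard library; the helper name `z_star` is a hypothetical label for this illustration.

```python
from statistics import NormalDist


def z_star(confidence):
    """Multiplier z* with the given central area under the N(0, 1) curve."""
    tail = (1 - confidence) / 2          # area left in each tail
    return NormalDist().inv_cdf(1 - tail)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z* = {z_star(level):.3f}")
# 90% confidence: z* = 1.645
# 95% confidence: z* = 1.960
# 99% confidence: z* = 2.576
```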

This interval requires that the sample proportions are based on independent random samples
from the two populations.

Also, all of the quantities $n_1\hat{p}_1$, $n_1(1-\hat{p}_1)$, $n_2\hat{p}_2$, and $n_2(1-\hat{p}_2)$ should preferably be at least 10.


Try It! Do Older People Snore More than Younger?

Researchers at the National Sleep Foundation were interested in comparing the proportion of
people who snore for two age populations (1 = older adults defined as over 50 years old and
2 = younger adults defined as between 18 and 30 years old). The following data was obtained
from adults who participated in a sleep lab study.

Group                                               Snore? Yes   Snore? No   Total
1 = older adults (over 50 years old)                   168          312       480
2 = younger adults (between 18 and 30 years old)        45          135       180

Let $p_2$ represent the population proportion of all younger adults who snore. Provide an estimate
for this population proportion $p_2$. Include the appropriate symbol.

$\hat{p}_2 = 45/180 = 0.25$
We wish to provide a 90% confidence interval to estimate the difference in snoring rates for the
two population proportions of adults. One of the conditions for that confidence interval to be
valid involves having two independent random samples, which is reasonable from the design of
the study. Validate the remaining assumption.

We need to have at least 10 who do snore and at least 10 who do not snore in each of our two
samples. Here we have 168 and 312 for group 1 and 45 and 135 for group 2, so all four of these
counts are at least 10.

Provide the 90% confidence interval and give an interpretation of this interval in context.

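$\hat{p}_1 = 168/480 = 0.35$ and $\hat{p}_2 = 0.25$

$$\text{s.e.}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} = \sqrt{\frac{0.35(0.65)}{480} + \frac{0.25(0.75)}{180}} = 0.0389$$

$$(\hat{p}_1 - \hat{p}_2) \pm z^{*}\,\text{s.e.}(\hat{p}_1 - \hat{p}_2) \;\Rightarrow\; (0.35 - 0.25) \pm 1.645(0.0389) \;\Rightarrow\; 0.10 \pm 0.064$$

$(0.036, 0.164)$ or 3.6% to 16.4%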

Interpret this interval.
With 90% confidence we estimate the difference in snoring rates for the two population
proportions of adults to be somewhere between ___3.6%___ and ___16.4%___.

What value do you notice is not in this interval? ___0___
Does there appear to be a significant difference between
the population rates of snoring for older versus younger adults?
Yes
No
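
The snoring interval can be reproduced step by step. The Python sketch below uses the counts from the table in this example; the variable names are illustrative choices, and $z^*$ is taken from the standard normal distribution rather than a printed table.

```python
import math
from statistics import NormalDist

# Counts from the snoring table in this example.
x1, n1 = 168, 480   # older adults who snore, sample size
x2, n2 = 45, 180    # younger adults who snore, sample size

p1_hat, p2_hat = x1 / n1, x2 / n2

# Condition check: at least 10 "yes" and 10 "no" in each sample.
conditions_ok = min(x1, n1 - x1, x2, n2 - x2) >= 10

se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
z_star = NormalDist().inv_cdf(0.95)      # 90% confidence leaves 5% in each tail
margin = z_star * se
lower = (p1_hat - p2_hat) - margin
upper = (p1_hat - p2_hat) + margin

print(conditions_ok)                                   # True
print(round(se, 4), round(lower, 3), round(upper, 3))  # 0.0389 0.036 0.164
```

Because the whole interval lies above 0, the data point toward a higher snoring rate among older adults at this confidence level.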


Part 3: Testing about a Difference in Population Proportions

Testing Hypotheses about the Difference in Two Population Proportions

We have two populations from which independent samples are available (or one population for
which two groups can be formed using a categorical variable). The response variable is also
categorical and we are interested in comparing the proportions for the two populations.

Let $p_1$ be the population proportion for the first population.
Let $p_2$ be the population proportion for the second population.

Parameter: the difference in the population proportions $p_1 - p_2$.

Sample estimate: the difference in the sample proportions $\hat{p}_1 - \hat{p}_2$.

Standard deviation of $\hat{p}_1 - \hat{p}_2$: $\text{s.d.}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}$

Recall that the multiplier in the confidence interval was a $z^*$ value. So we will be computing a
Z test statistic for performing a significance test.

The standard error used in constructing the confidence interval for the difference between two
population proportions is not the same as that used for the standardized z test statistic.

We will need to construct the null standard error, the standard error for the statistic when the
null hypothesis is true. Let's start with what the hypotheses will look like.

Possible null and alternative hypotheses.

1. H0: $p_1 = p_2$ (or $p_1 - p_2 = 0$) versus Ha: $p_1 \neq p_2$

2. H0: $p_1 = p_2$ (or $p_1 - p_2 = 0$) versus Ha: $p_1 > p_2$

3. H0: $p_1 = p_2$ (or $p_1 - p_2 = 0$) versus Ha: $p_1 < p_2$


Next we need to determine the test statistic and understand the conditions required for the test
to be valid. The general form of the test statistic is:

$$\text{Test statistic} = \frac{\text{Sample statistic} - \text{Null value}}{\text{Standard error}}$$

In the case of two population proportions, if the null hypothesis is true, we have $p_1 - p_2 = 0$ or
that the two population proportions are the same, $p_1 = p_2 = p$. What is a reasonable way to
estimate the common population proportion $p$?

$$\hat{p} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$$

The general standard error for $\hat{p}_1 - \hat{p}_2$ is given by:

$$\text{s.e.}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$

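but if the null hypothesis is true, then $\hat{p}$ is the best estimate for each population proportion
and should be used in the standard error.
So, the null standard error for $\hat{p}_1 - \hat{p}_2$ is given by:

$$\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$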

And the corresponding test statistic is:

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

If the null hypothesis is true, this z statistic will have a ___N(0,1)___ distribution. This
distribution is used to find the p-value for the test.

Conditions: This test requires that the sample proportions are based on independent random
samples from the two populations. Also, all of the quantities $n_1\hat{p}$, $n_1(1-\hat{p})$, $n_2\hat{p}$, and $n_2(1-\hat{p})$
should preferably be at least 10. Note these are checked with the estimate of the common population
proportion $\hat{p}$.
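
The pieces of the test (pooled estimate, null standard error, z statistic, and p-value for each form of the alternative) can be put together in one short function. This is a minimal sketch, not a prescribed procedure from these notes: the function name `two_prop_z_test`, its `alternative` labels, and the example counts are all hypothetical.

```python
import math
from statistics import NormalDist


def two_prop_z_test(x1, n1, x2, n2, alternative="two-sided"):
    """Pooled z test of H0: p1 = p2 from "yes" counts x1 and x2.

    alternative is "two-sided" (p1 != p2), "greater" (p1 > p2),
    or "less" (p1 < p2). Returns (z, p_value).
    """
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # Pooled estimate; equal to (n1*p1_hat + n2*p2_hat) / (n1 + n2).
    p_pool = (x1 + x2) / (n1 + n2)
    null_se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / null_se

    std_normal = NormalDist()
    if alternative == "greater":
        p_value = 1 - std_normal.cdf(z)             # area to the right of z
    elif alternative == "less":
        p_value = std_normal.cdf(z)                 # area to the left of z
    elif alternative == "two-sided":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))  # both tails
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")
    return z, p_value


# Hypothetical counts (assumed for illustration only).
z, p_value = two_prop_z_test(60, 100, 50, 100, alternative="two-sided")
print(round(z, 2), round(p_value, 3))
```

The function does not check the sample-size conditions; those should be verified separately with the pooled estimate before relying on the p-value.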


Try It! Taking More Pictures with Cell
Cell phones can now be used for many purposes besides making calls. An initial study found that
more than 75% of young adults (defined as 18-25 years old) use their cell phones for taking
pictures at least 2 times per week. This study also suggested that the proportion of young women
in this age group who use their cell phone to take pictures is higher than that for young men in
this age group. A follow-up study was conducted to investigate this conjecture. The researchers
wish to use a 5% significance level.

State the hypotheses: H0: $p_1 - p_2 = 0$ versus Ha: $p_1 - p_2 > 0$, where
$p_1$ represents the population proportion of all young women 18-25 years old who report using
their cell phone to take pictures at least 2 times per week, and
$p_2$ represents the population proportion of all young men 18-25 years old who report using their
cell phone to take pictures at least 2 times per week.

Here are the results:

Age group = 18-25 year olds                                             Young Women   Young Men
Number who report using phone to take pictures at least 2 times/week        417          369
Sample size                                                                 521          492
Percent                                                                     80%          75%

We can assume these samples are independent random samples. Verify the remaining condition
necessary to conduct the Z test.
All of the quantities $n_1\hat{p}$, $n_1(1-\hat{p})$, $n_2\hat{p}$, and $n_2(1-\hat{p})$ should preferably be at least 10. Note we need
to find the estimate of the common population proportion $\hat{p}$ to do this check.

$$\hat{p} = \frac{417 + 369}{521 + 492} = 0.7759$$
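
To finish the condition check, the pooled estimate can be multiplied out for each sample. The short Python sketch below does this arithmetic with the counts from the results table; the variable names are illustrative.

```python
# Counts from the results table in this example.
x1, n1 = 417, 521   # young women: number who report taking pictures, sample size
x2, n2 = 369, 492   # young men: number who report taking pictures, sample size

p_pool = (x1 + x2) / (n1 + n2)           # estimate of the common proportion

# Expected "yes" and "no" counts in each sample if H0 is true.
expected = [n1 * p_pool, n1 * (1 - p_pool), n2 * p_pool, n2 * (1 - p_pool)]

print(round(p_pool, 4))
print([round(e, 1) for e in expected], all(e >= 10 for e in expected))
```

All four quantities come out well above 10 (roughly 404, 117, 382, and 110), so the condition is met.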
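Conduct the test.

$\hat{p}_1 = \frac{417}{521} = 0.8004$ and $\hat{p}_2 = \frac{369}{492} = 0.75$

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{0.8004 - 0.75}{\sqrt{0.7759(1 - 0.7759)\left(\frac{1}{521} + \frac{1}{492}\right)}} = \frac{0.0504}{0.0262} = 1.92$$

p-value = $P(Z \geq 1.92)$ under the N(0,1) distribution = 0.0273 (less than 0.05), so we reject H0.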
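
The test calculation can be reproduced directly from the counts. The Python sketch below carries unrounded intermediate values through to the end, which appears to be why the p-value comes out as 0.0273 rather than the 0.0274 a z-table gives for exactly 1.92; the variable names are illustrative.

```python
import math
from statistics import NormalDist

x1, n1 = 417, 521   # young women who report taking pictures, sample size
x2, n2 = 369, 492   # young men who report taking pictures, sample size

p1_hat, p2_hat = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)

null_se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1_hat - p2_hat) / null_se

# Ha: p1 - p2 > 0, so the p-value is the area to the right of z.
p_value = 1 - NormalDist().cdf(z)
reject_h0 = p_value < 0.05               # 5% significance level

print(round(p1_hat, 4), round(null_se, 4), round(z, 2))  # 0.8004 0.0262 1.92
print(round(p_value, 4), reject_h0)                      # 0.0273 True
```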

Using a 5% significance level, which is the appropriate conclusion?

There is sufficient evidence to demonstrate the population proportion of all young
women 18-25 years old who take pictures with their phone at least twice per week is
greater than that of the population of all young men 18-25 years old.

There is not sufficient evidence to demonstrate the population proportion of all young
women 18-25 years old who take pictures with their phone at least twice per week is
greater than that of the population of all young men 18-25 years old.


