
CHAPTER 3

Review of Statistics

power of a test
one-sided alternative hypothesis
confidence set
confidence level
confidence interval
coverage probability
test for the difference between two means
causal effect
treatment effect
scatterplot
sample covariance
sample correlation coefficient (sample correlation)

Review the Concepts

3.1 Explain the difference between the sample average \(\bar{Y}\) and the population mean.

3.2 Explain the difference between an estimator and an estimate. Provide an example of each.

3.3 A population distribution has a mean of 10 and a variance of 16. Determine the mean and variance of \(\bar{Y}\) from an i.i.d. sample from this population for (a) \(n = 10\); (b) \(n = 100\); and (c) \(n = 1000\). Relate your answers to the law of large numbers.
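For question 3.3, both moments follow from the standard i.i.d. results \(E(\bar{Y}) = \mu_Y\) and \(\mathrm{var}(\bar{Y}) = \sigma_Y^2 / n\). A minimal Python sketch of that arithmetic follows; the function name `sample_mean_moments` is purely illustrative and not from the text.

```python
def sample_mean_moments(mu, sigma2, n):
    """Return (E[Ybar], var(Ybar)) for an i.i.d. sample of size n.

    mu and sigma2 are the population mean and variance.
    """
    return mu, sigma2 / n


# The variance of the sample average shrinks as n grows,
# which is the pattern the law of large numbers relies on.
for n in (10, 100, 1000):
    mean, var = sample_mean_moments(10, 16, n)
    print(f"n={n}: E(Ybar)={mean}, var(Ybar)={var}")
```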

3.4 What role does the central limit theorem play in statistical hypothesis testing? In the construction of confidence intervals?

3.5 What is the difference between a null and alternative hypothesis? Among size, significance level, and power? Between a one-sided alternative hypothesis and a two-sided alternative hypothesis?

3.6 Why does a confidence interval contain more information than the result of a single hypothesis test?

3.7 Explain why the differences-of-means estimator, applied to data from a randomized controlled experiment, is an estimator of the treatment effect.

3.8 Sketch a hypothetical scatterplot for a sample of size 10 for two random variables with a population correlation of (a) 1.0; (b) -1.0; (c) 0.9; (d) -0.5; (e) 0.0.

Exercises

3.1 In a population, \(\mu_Y = 100\) and \(\sigma_Y^2 = 43\). Use the central limit theorem to answer the following questions:
a. In a random sample of size \(n = 100\), find \(\Pr(\bar{Y} < 101)\).
b. In a random sample of size \(n = 64\), find \(\Pr(101 < \bar{Y} < 103)\).
c. In a random sample of size \(n = 165\), find \(\Pr(\bar{Y} > 98)\).
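One way to set up the central limit theorem calculations in Exercise 3.1 is to treat \(\bar{Y}\) as approximately \(N(\mu_Y, \sigma_Y^2 / n)\). The sketch below assumes Python's standard-library `statistics.NormalDist`; the helper name `clt_prob` and its arguments are hypothetical, not from the text.

```python
from math import sqrt
from statistics import NormalDist


def clt_prob(mu, sigma2, n, lower=None, upper=None):
    """Approximate Pr(lower < Ybar < upper) with Ybar ~ N(mu, sigma2 / n).

    Leave lower or upper as None for a one-sided probability.
    """
    dist = NormalDist(mu, sqrt(sigma2 / n))
    below_upper = 1.0 if upper is None else dist.cdf(upper)
    below_lower = 0.0 if lower is None else dist.cdf(lower)
    return below_upper - below_lower
```

Each part of the exercise then becomes a single call with the sample size and bounds from that part, passing `upper` only, both bounds, or `lower` only as appropriate.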

3.2 Let \(Y\) be a Bernoulli random variable with success probability \(\Pr(Y = 1) = p\), and let \(Y_1, \ldots, Y_n\) be i.i.d. draws from this distribution. Let \(\hat{p}\) be the fraction of successes (1s) in this sample.
a. Show that \(\hat{p} = \bar{Y}\).
b. Show that \(\hat{p}\) is an unbiased estimator of \(p\).
c. Show that \(\mathrm{var}(\hat{p}) = p(1 - p)/n\).

3.3 In a survey of 400 likely voters, 215 responded that they would vote for the incumbent and 185 responded that they would vote for the challenger. Let \(p\) denote the fraction of all likely voters who preferred the incumbent at the time of the survey, and let \(\hat{p}\) be the fraction of survey respondents who preferred the incumbent.
a. Use the survey results to estimate \(p\).
b. Use the estimator of the variance of \(\hat{p}\), \(\hat{p}(1 - \hat{p})/n\), to calculate the standard error of your estimator.
c. What is the p-value for the test \(H_0: p = 0.5\) vs. \(H_1: p \neq 0.5\)?
d. What is the p-value for the test \(H_0: p = 0.5\) vs. \(H_1: p > 0.5\)?
e. Why do the results from (c) and (d) differ?
f. Did the survey contain statistically significant evidence that the incumbent was ahead of the challenger at the time of the survey? Explain.
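The quantities in Exercise 3.3 (the estimate, its standard error, and the one- and two-sided p-values) can be computed together. The following is a sketch under the large-sample normal approximation, using the variance estimator \(\hat{p}(1 - \hat{p})/n\) named in part (b); the function name `proportion_test` and its return layout are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def proportion_test(successes, n, p0=0.5):
    """Large-sample test of H0: p = p0 for a Bernoulli sample.

    Returns (p_hat, standard error, t-statistic,
             two-sided p-value, one-sided p-value for H1: p > p0).
    """
    p_hat = successes / n
    se = sqrt(p_hat * (1 - p_hat) / n)
    t = (p_hat - p0) / se
    z = NormalDist()  # standard normal
    p_two_sided = 2 * (1 - z.cdf(abs(t)))
    p_one_sided = 1 - z.cdf(t)
    return p_hat, se, t, p_two_sided, p_one_sided


p_hat, se, t, p_two, p_one = proportion_test(215, 400)
print(p_hat, se, t, p_two, p_one)
```

When the t-statistic is positive, the two-sided p-value is exactly twice the one-sided one, which is the comparison part (e) asks about.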

3.4 Using the data in Exercise 3.3:
a. Construct a 95% confidence interval for \(p\).
b. Construct a 99% confidence interval for \(p\).
c. Why is the interval in (b) wider than the interval in (a)?
d. Without doing any additional calculations, test the hypothesis \(H_0: p = 0.50\) vs. \(H_1: p \neq 0.50\) at the 5% significance level.

3.5 A survey of 1055 registered voters is conducted, and the voters are asked to choose between candidate A and candidate B. Let \(p\) denote the fraction of voters in the population who prefer candidate A, and let \(\hat{p}\) denote the fraction of voters in the sample who prefer candidate A.
a. You are interested in the competing hypotheses \(H_0: p = 0.5\) vs. \(H_1: p \neq 0.5\). Suppose that you decide to reject \(H_0\) if \(|\hat{p} - 0.5| > 0.02\).
   i. What is the size of this test?
   ii. Compute the power of this test if \(p = 0.53\).
b. In the survey, \(\hat{p} = 0.54\).
   i. Test \(H_0: p = 0.5\) vs. \(H_1: p \neq 0.5\) using a 5% significance level.
   ii. Test \(H_0: p = 0.5\) vs. \(H_1: p > 0.5\) using a 5% significance level.
   iii. Construct a 95% confidence interval for \(p\).
   iv. Construct a 99% confidence interval for \(p\).
   v. Construct a 50% confidence interval for \(p\).
c. Suppose that the survey is carried out 20 times, using independently selected voters in each survey. For each of these 20 surveys, a 95% confidence interval for \(p\) is constructed.
   i. What is the probability that the true value of \(p\) is contained in all 20 of these confidence intervals?
   ii. How many of these confidence intervals do you expect to contain the true value of \(p\)?
d. In survey jargon, the "margin of error" is \(1.96 \times SE(\hat{p})\); that is, it is half the length of the 95% confidence interval. Suppose you wanted to design a survey that had a margin of error of at most 1%. That is, you wanted \(\Pr(|\hat{p} - p| > 0.01) \leq 0.05\). How large should \(n\) be if the survey uses simple random sampling?
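For part (d) of Exercise 3.5, the margin of error \(1.96 \times SE(\hat{p})\) is largest when \(p = 0.5\), so a conservative design solves \(1.96\sqrt{p(1 - p)/n} \leq m\) for \(n\) at that value. The sketch below assumes this worst-case choice; the function name `required_sample_size` and its defaults are hypothetical.

```python
from math import ceil


def required_sample_size(margin, p=0.5, z=1.96):
    """Smallest n with z * sqrt(p * (1 - p) / n) <= margin.

    p = 0.5 is the worst case, since p * (1 - p) peaks there.
    """
    n_exact = z ** 2 * p * (1 - p) / margin ** 2
    # Round first to guard against floating-point noise around integers.
    return ceil(round(n_exact, 6))


print(required_sample_size(0.01))
```

If a tighter prior guess for \(p\) is available, passing it as `p` gives a smaller required sample, at the cost of relying on that guess.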

3.6 Let \(Y_1, \ldots, Y_n\) be i.i.d. draws from a distribution with mean \(\mu\). A test of \(H_0: \mu = 5\) versus \(H_1: \mu \neq 5\) using the usual t-statistic yields a p-value of 0.03.
a. Does the 95% confidence interval contain \(\mu = 5\)? Explain.
b. Can you determine if \(\mu = 6\) is contained in the 95% confidence interval? Explain.

3.7 In a given population, 11% of the likely voters are African American. A survey using a simple random sample of 600 landline telephone numbers finds 8% African Americans. Is there evidence that the survey is biased? Explain.

3.8 A new version of the SAT test is given to 1000 randomly selected high school seniors. The sample mean test score is 1110, and the sample standard deviation is 123. Construct a 95% confidence interval for the population mean test score for high school seniors.

3.9 Suppose that a lightbulb manufacturing plant produces bulbs with a mean life of 2000 hours and a standard deviation of 200 hours. An inventor claims to have developed an improved process that produces bulbs with a longer mean life and the same standard deviation. The plant manager randomly selects 100 bulbs produced by the process. She says that she will believe the inventor's claim if the sample mean life of the bulbs is greater than 2100 hours; otherwise, she will conclude that the new process is no better than the old process. Let \(\mu\) denote the mean of the new process. Consider the null and alternative hypothesis \(H_0: \mu = 2000\) vs. \(H_1: \mu > 2000\).
a. What is the size of the plant manager's testing procedure?
b. Suppose the new process is in fact better and has a mean bulb life of 2150 hours. What is the power of the plant manager's testing procedure?
c. What testing procedure should the plant manager use if she wants the size of her test to be 5%?
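Size and power in Exercise 3.9 are both rejection probabilities, evaluated under the null mean and under the alternative mean respectively. A sketch using the normal approximation for \(\bar{Y}\) follows; the helper `rejection_prob` is a hypothetical name, and `statistics.NormalDist` is from the Python standard library.

```python
from math import sqrt
from statistics import NormalDist


def rejection_prob(true_mean, sd, n, cutoff):
    """Pr(Ybar > cutoff) when Ybar is approximately N(true_mean, sd^2 / n)."""
    return 1 - NormalDist(true_mean, sd / sqrt(n)).cdf(cutoff)


# Size: probability of rejecting when the null mean (2000) is true.
size = rejection_prob(2000, 200, 100, 2100)
# Power: probability of rejecting when the true mean is 2150.
power = rejection_prob(2150, 200, 100, 2100)
# Cutoff that would give a one-sided test of size 5% under the null.
cutoff_5pct = NormalDist(2000, 200 / sqrt(100)).inv_cdf(0.95)

print(size, power, cutoff_5pct)
```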

3.10 Suppose a new standardized test is given to 100 randomly selected third-grade students in New Jersey. The sample average score \(\bar{Y}\) on the test is 58 points, and the sample standard deviation, \(s_Y\), is 8 points.
a. The authors plan to administer the test to all third-grade students in New Jersey. Construct a 95% confidence interval for the mean score of all New Jersey third graders.
b. Suppose the same test is given to 200 randomly selected third graders from Iowa, producing a sample average of 62 points and sample standard deviation of 11 points. Construct a 90% confidence interval for the difference in mean scores between Iowa and New Jersey.
c. Can you conclude with a high degree of confidence that the population means for Iowa and New Jersey students are different? (What is the standard error of the difference in the two sample means? What is the p-value of the test of no difference in means versus some difference?)
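For parts (b) and (c) of Exercise 3.10, the standard error of a difference between two independent sample means is \(\sqrt{s_a^2/n_a + s_b^2/n_b}\). The sketch below combines the confidence interval and the two-sided p-value under the large-sample normal approximation; the function name `diff_in_means` and its return layout are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist


def diff_in_means(mean_a, sd_a, n_a, mean_b, sd_b, n_b, level=0.95):
    """Compare two independent sample means.

    Returns (difference, standard error, confidence interval,
             two-sided p-value for H0: no difference).
    """
    diff = mean_a - mean_b
    se = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    z = NormalDist()  # standard normal
    crit = z.inv_cdf(1 - (1 - level) / 2)
    p_value = 2 * (1 - z.cdf(abs(diff / se)))
    return diff, se, (diff - crit * se, diff + crit * se), p_value


# Iowa (a) versus New Jersey (b), 90% interval as in part (b).
print(diff_in_means(62, 11, 200, 58, 8, 100, level=0.90))
```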
3.11 Consider the estimator \(\tilde{Y}\), defined in Equation (3.1). Show that (a) \(E(\tilde{Y}) = \mu_Y\) and (b) \(\mathrm{var}(\tilde{Y}) = 1.25\sigma_Y^2/n\).

3.12 To investigate possible gender discrimination in a firm, a sample of 100 men and 64 women with similar job descriptions are selected at random. A summary of the resulting monthly salaries follows:

         Average Salary (\(\bar{Y}\))   Standard Deviation (\(s_Y\))   n
Men      $3100                          $200                           100
Women    $2900                          $320                           64

a. What do these data suggest about wage differences in the firm? Do they represent statistically significant evidence that average wages of men and women are different? (To answer this question, first state the null and alternative hypothesis; second, compute the relevant t-statistic; third, compute the p-value associated with the t-statistic; and finally, use the p-value to answer the question.)
b. Do these data suggest that the firm is guilty of gender discrimination in its compensation policies? Explain.

3.13 Data on fifth-grade test scores (reading and mathematics) for 420 school districts in California yield \(\bar{Y} = 646.2\) and standard deviation \(s_Y = 19.5\).
a. Construct a 95% confidence interval for the mean test score in the population.
b. When the districts were divided into districts with small classes (< 20 students per teacher) and large classes (≥ 20 students per teacher), the following results were found:

Average Score (\(\bar{Y}\))   Standard Deviation (\(s_Y\))

Is there statistically significant evidence that the districts with smaller classes have higher average test scores? Explain.

3.14 Values of height in inches (\(X\)) and weight in pounds (\(Y\)) are recorded from a sample of 300 male college students. The resulting summary statistics are \(\bar{X} = 70.5\) in., \(\bar{Y} = 158\) lb, \(s_X = 1.8\) in., \(s_Y = 14.2\) lb, \(s_{XY} = 21.73\) in. × lb, and \(r_{XY} = 0.85\). Convert these statistics to the metric system (meters and kilograms).
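The conversion in Exercise 3.14 follows from how each statistic scales: means and standard deviations scale by the unit factor, the covariance scales by the product of the two factors, and the correlation is unit-free. The following is a sketch; the conversion constants are the standard definitions (1 in. = 0.0254 m, 1 lb = 0.45359237 kg), while the function name and dictionary keys are hypothetical.

```python
IN_TO_M = 0.0254        # meters per inch (exact by definition)
LB_TO_KG = 0.45359237   # kilograms per pound (exact by definition)


def to_metric(x_bar, y_bar, s_x, s_y, s_xy, r_xy):
    """Convert height (in.) and weight (lb) summary statistics to m and kg."""
    return {
        "x_bar_m": x_bar * IN_TO_M,
        "y_bar_kg": y_bar * LB_TO_KG,
        "s_x_m": s_x * IN_TO_M,
        "s_y_kg": s_y * LB_TO_KG,
        # Covariance carries both units, so both factors apply.
        "s_xy_m_kg": s_xy * IN_TO_M * LB_TO_KG,
        # Correlation is a ratio in which the unit factors cancel.
        "r_xy": r_xy,
    }


print(to_metric(70.5, 158, 1.8, 14.2, 21.73, 0.85))
```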
3.15 Let \(Y_a\) and \(Y_b\) denote Bernoulli random variables from two different populations, denoted \(a\) and \(b\). Suppose that \(E(Y_a) = p_a\) and \(E(Y_b) = p_b\). A random sample of size \(n_a\) is chosen from population \(a\), with sample average denoted \(\hat{p}_a\), and a random sample of size \(n_b\) is chosen from population \(b\), with sample average denoted \(\hat{p}_b\). Suppose the sample from population \(a\) is independent of the sample from population \(b\).
a. Show that \(E(\hat{p}_a) = p_a\) and \(\mathrm{var}(\hat{p}_a) = p_a(1 - p_a)/n_a\). Show that \(E(\hat{p}_b) = p_b\) and \(\mathrm{var}(\hat{p}_b) = p_b(1 - p_b)/n_b\).
b. Show that \(\mathrm{var}(\hat{p}_a - \hat{p}_b) = \frac{p_a(1 - p_a)}{n_a} + \frac{p_b(1 - p_b)}{n_b}\). (Hint: Remember that the samples are independent.)
c. Suppose that \(n_a\) and \(n_b\) are large. Show that a 95% confidence interval for \(p_a - p_b\) is given by
\[ (\hat{p}_a - \hat{p}_b) \pm 1.96 \sqrt{\frac{\hat{p}_a(1 - \hat{p}_a)}{n_a} + \frac{\hat{p}_b(1 - \hat{p}_b)}{n_b}}. \]
How would you construct a 90% confidence interval for \(p_a - p_b\)?
d. Read the box "A Novel Way to Boost Retirement Savings" in Section 3.5. Let population \(a\) denote the "opt-out" (treatment) group and population \(b\) denote the "opt-in" (control) group. Construct a 95% confidence interval for the treatment effect, \(p_a - p_b\).

3.16 Grades on a standardized test are known to have a mean of 1000 for students in the United States. The test is administered to 453 randomly selected students in Florida; in this sample, the mean is 1013 and the standard deviation (\(s\)) is 108.
a. Construct a 95% confidence interval for the average test score for Florida students.
b. Is there statistically significant evidence that Florida students perform differently than other students in the United States?
c. Another 503 students are selected at random from Florida. They are given a 3-hour preparation course before the test is administered. Their average test score is 1019 with a standard deviation of 95.
   i. Construct a 95% confidence interval for the change in average test score associated with the prep course.
   ii. Is there statistically significant evidence that the prep course helped?
d. The original 453 students are given the prep course and then are asked to take the test a second time. The average change in their test scores is 9 points, and the standard deviation of the change is 60 points.
   i. Construct a 95% confidence interval for the change in average test scores.
   ii. Is there statistically significant evidence that students will perform better on their second attempt after taking the prep course?
   iii. Students may have performed better in their second attempt because of the prep course or because they gained test-taking experience in their first attempt. Describe an experiment that would quantify these two effects.

3.17 Read the box "The Gender Gap of Earnings of College Graduates in the United States" in Section 3.5.
a. Construct a 95% confidence interval for the change in men's average hourly earnings between 1992 and 2008.
b. Construct a 95% confidence interval for the change in women's average hourly earnings between 1992 and 2008.
c. Construct a 95% confidence interval for the change in the gender gap in average hourly earnings between 1992 and 2008. (Hint: \(\bar{Y}_{m,1992} - \bar{Y}_{w,1992}\) is independent of \(\bar{Y}_{m,2008} - \bar{Y}_{w,2008}\).)

3.18 This exercise shows that the sample variance is an unbiased estimator of the population variance when \(Y_1, \ldots, Y_n\) are i.i.d. with mean \(\mu_Y\) and variance \(\sigma_Y^2\).
a. Use Equation (2.31) to show that \(E[(Y_i - \bar{Y})^2] = \mathrm{var}(Y_i) - 2\,\mathrm{cov}(Y_i, \bar{Y}) + \mathrm{var}(\bar{Y})\).
b. Use Equation (2.33) to show that \(\mathrm{cov}(\bar{Y}, Y_i) = \sigma_Y^2/n\).
c. Use the results in (a) and (b) to show that \(E(s_Y^2) = \sigma_Y^2\).

3.19 a. \(\bar{Y}\) is an unbiased estimator of \(\mu_Y\). Is \(\bar{Y}^2\) an unbiased estimator of \(\mu_Y^2\)?
b. \(\bar{Y}\) is a consistent estimator of \(\mu_Y\). Is \(\bar{Y}^2\) a consistent estimator of \(\mu_Y^2\)?

3.20 Suppose that \((X_i, Y_i)\) are i.i.d. with finite fourth moments. Prove that the sample covariance is a consistent estimator of the population covariance, that is, \(s_{XY} \xrightarrow{p} \sigma_{XY}\), where \(s_{XY}\) is defined in Equation (3.24). (Hint: Use the strategy of Appendix 3.3 and the Cauchy-Schwarz inequality.)

3.21 Show that the pooled standard error [\(SE_{pooled}(\bar{Y}_m - \bar{Y}_w)\)] given following Equation (3.23) equals the usual standard error for the difference in means in Equation (3.19) when the two group sizes are the same (\(n_m = n_w\)).

Empirical Exercise

E3.1 On the text Web site http://www.pearsonhighered.com/stock-watson/ you will find a data file CPS92_08 that contains an extended version of the dataset used in Table 3.1 of the text for the years 1992 and 2008. It contains data on full-time, full-year workers, age 25-34, with a high school diploma or B.A./B.S. as their highest degree. A detailed description is given in CPS92_08_Description, available on the Web site. Use these data to answer the following questions.
a. Compute the sample mean for average hourly earnings (AHE) in 1992 and in 2008. Construct a 95% confidence interval for the population means of AHE in 1992 and 2008 and the change between 1992 and 2008.
b. In 2008, the value of the Consumer Price Index (CPI) was 215.2. In 1992, the value of the CPI was 140.3. Repeat (a) but use AHE measured in real 2008 dollars ($2008); that is, adjust the 1992 data for the price inflation that occurred between 1992 and 2008.
c. If you were interested in the change in workers' purchasing power from 1992 to 2008, would you use the results from (a) or from (b)? Explain.
d. Use the 2008 data to construct a 95% confidence interval for the mean of AHE for high school graduates. Construct a 95% confidence interval for the mean of AHE for workers with a college degree. Construct a 95% confidence interval for the difference between the two means.
e. Repeat (d) using the 1992 data expressed in $2008.
f. Did real (inflation-adjusted) wages of high school graduates increase from 1992 to 2008? Explain. Did real wages of college graduates increase? Did the gap between earnings of college and high school graduates increase? Explain, using appropriate estimates, confidence intervals, and test statistics.
g. Table 3.1 presents information on the gender gap for college graduates. Prepare a similar table for high school graduates using the 1992 and 2008 data. Are there any notable differences between the results for high school and college graduates?
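The inflation adjustment in part (b) of E3.1 rescales each 1992 wage by the ratio of the two CPI values given there. The sketch below shows only that step; the function name and the sample wages are hypothetical, and nothing is assumed about the layout or column names of the CPS92_08 file.

```python
CPI_1992 = 140.3  # CPI value for 1992, as stated in part (b)
CPI_2008 = 215.2  # CPI value for 2008, as stated in part (b)


def to_2008_dollars(ahe_1992):
    """Express a 1992 hourly wage in real 2008 dollars."""
    return ahe_1992 * CPI_2008 / CPI_1992


# Hypothetical 1992 wages, for illustration only (not from the data file).
example_wages = [10.0, 12.5, 20.0]
print([round(to_2008_dollars(w), 2) for w in example_wages])
```

Because the adjustment multiplies every 1992 observation by the same constant, the 1992 sample mean and standard deviation scale by that constant as well, so the confidence intervals in (a) can be rescaled rather than recomputed from scratch.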
