Review of Statistics
confidence level
sample covariance
confidence interval
coverage probability
test for the difference between two means
Review the Concepts
3.1
3.2
3.3
3.4
What role does the central limit theorem play in statistical hypothesis testing? In the construction of confidence intervals?
3.5
3.6
3.7 Explain why the differences-of-means estimator, applied to data from a
randomized controlled experiment, is an estimator of the treatment effect.
3.8
Exercises
3.1 In a population, $\mu_Y = 100$ and $\sigma_Y^2 = 41$. Use the central limit theorem to
answer the following questions:
3.3 In a survey of 400 likely voters, 215 responded that they would vote for the
incumbent and 185 responded that they would vote for the challenger. Let
$p$ denote the fraction of all likely voters who preferred the incumbent at
the time of the survey, and let $\hat{p}$ be the fraction of survey respondents who
preferred the incumbent.
a. Use the survey results to estimate $p$.
b. Use the estimator of the variance of $\hat{p}$, $\hat{p}(1 - \hat{p})/n$, to calculate the
standard error of your estimator.
c. What is the p-value for the test $H_0: p = 0.5$ vs. $H_1: p \neq 0.5$?
d. What is the p-value for the test $H_0: p = 0.5$ vs. $H_1: p > 0.5$?
e. Why do the results from (c) and (d) differ?
f. Did the survey contain statistically significant evidence that the
incumbent was ahead of the challenger at the time of the survey?
Explain.
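The arithmetic behind parts (a) through (d) can be sketched as follows. This is only an illustration under the large-sample normal approximation for the t-statistic; the helper name `proportion_test` is hypothetical and does not come from the text.

```python
from math import sqrt
from statistics import NormalDist

def proportion_test(successes, n, p0=0.5):
    """Large-sample test of H0: p = p0 for a sample fraction (normal approximation)."""
    p_hat = successes / n                   # estimate of p
    se = sqrt(p_hat * (1 - p_hat) / n)      # standard error from p_hat(1 - p_hat)/n
    t = (p_hat - p0) / se                   # t-statistic
    phi = NormalDist().cdf
    p_two_sided = 2 * (1 - phi(abs(t)))     # alternative H1: p != p0
    p_one_sided = 1 - phi(t)                # alternative H1: p > p0
    return p_hat, se, t, p_two_sided, p_one_sided

p_hat, se, t, p2, p1 = proportion_test(215, 400)
print(f"p_hat={p_hat:.4f} SE={se:.4f} t={t:.2f} two-sided p={p2:.3f} one-sided p={p1:.3f}")
```

Because the t-statistic is positive here, the one-sided p-value is half the two-sided one, which is the comparison part (e) asks about.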
3.4 Using the data in Exercise 3.3:
a. Construct a 95% confidence interval for $p$.
b. Construct a 99% confidence interval for $p$.
c. Why is the interval in (b) wider than the interval in (a)?
d. Without doing any additional calculations, test the hypothesis
$H_0: p = 0.50$ vs. $H_1: p \neq 0.50$ at the 5% significance level.
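A minimal sketch of the interval calculations, assuming the usual large-sample interval $\hat{p} \pm z \cdot SE(\hat{p})$; the function name `proportion_ci` is an illustrative choice, not notation from the text.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(p_hat, n, level):
    """Two-sided large-sample confidence interval for a population fraction."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 when level = 0.95
    se = sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

p_hat, n = 215 / 400, 400
ci95 = proportion_ci(p_hat, n, 0.95)
ci99 = proportion_ci(p_hat, n, 0.99)
print(ci95, ci99)
# 0.50 lies inside the 95% interval exactly when the two-sided 5% test does not reject.
print(ci95[0] <= 0.50 <= ci95[1])
```

The last line is the shortcut that part (d) relies on: a value is rejected at the 5% level only if it falls outside the 95% interval.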
3.5 A survey of 1055 registered voters is conducted, and the voters are asked
to choose between candidate A and candidate B. Let $p$ denote the fraction
of voters in the population who prefer candidate A, and let $\hat{p}$ denote the
fraction of voters in the sample who prefer candidate A.
a. You are interested in the competing hypotheses $H_0: p = 0.5$ vs.
$H_1: p \neq 0.5$. Suppose that you decide to reject $H_0$ if $|\hat{p} - 0.5| > 0.02$.
i. What is the size of this test?
ii. Compute the power of this test if $p = 0.53$.
b. In the survey, $\hat{p} = 0.54$.
i. Test $H_0: p = 0.5$ vs. $H_1: p \neq 0.5$ using a 5% significance level.
ii. Test $H_0: p = 0.5$ vs. $H_1: p > 0.5$ using a 5% significance level.
iii. Construct a 95% confidence interval for $p$.
iv. Construct a 99% confidence interval for $p$.
v. Construct a 50% confidence interval for $p$.
c. Suppose that the survey is carried out 20 times, using independently
selected voters in each survey. For each of these 20 surveys, a 95%
confidence interval for $p$ is constructed.
i. What is the probability that the true value of $p$ is contained in all
20 of these confidence intervals?
ii. How many of these confidence intervals do you expect to contain
the true value of $p$?
d. In survey jargon, the "margin of error" is $1.96 \times SE(\hat{p})$; that is, it is
half the length of the 95% confidence interval. Suppose you wanted to
design a survey that had a margin of error of at most 1%. That is, you
wanted $\Pr(|\hat{p} - p| > 0.01) \leq 0.05$. How large should $n$ be if the survey
uses simple random sampling?
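The size, power, coverage, and sample-size questions in parts (a), (c), and (d) reduce to a few normal-approximation calculations, sketched below. The helper `reject_prob` is a hypothetical name. The sketch also assumes that each 95% interval covers with probability exactly 0.95 and that the sample size in (d) is chosen for the worst case $p = 0.5$, which is an assumption rather than something the exercise states.

```python
from math import sqrt, ceil
from statistics import NormalDist

phi = NormalDist().cdf
n = 1055

def reject_prob(p, n, cutoff=0.02, p0=0.5):
    """Pr(|p_hat - p0| > cutoff) when the true fraction is p (normal approximation)."""
    sd = sqrt(p * (1 - p) / n)
    upper = 1 - phi((p0 + cutoff - p) / sd)   # Pr(p_hat > p0 + cutoff)
    lower = phi((p0 - cutoff - p) / sd)       # Pr(p_hat < p0 - cutoff)
    return upper + lower

size = reject_prob(0.5, n)      # rejection probability when H0 is true
power = reject_prob(0.53, n)    # rejection probability when p = 0.53

# Part (c): 20 independent surveys, each interval covering with probability 0.95
all_cover = 0.95 ** 20
expected_cover = 0.95 * 20

# Part (d): smallest n with z * sqrt(p(1 - p)/n) <= 0.01 at the assumed worst case p = 0.5
z = NormalDist().inv_cdf(0.975)
n_needed = ceil((z / 0.01) ** 2 * 0.25)
print(size, power, all_cover, expected_cover, n_needed)
```

Using $p = 0.5$ in part (d) is conservative because $p(1 - p)$ is largest there, so the resulting $n$ is sufficient for any other value of $p$.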
3.6 Let $Y_1, \ldots, Y_n$ be i.i.d. draws from a distribution with mean $\mu$. A test of
$H_0: \mu = 5$ versus $H_1: \mu \neq 5$ using the usual t-statistic yields a p-value of 0.03.
a. Does the 95% confidence interval contain $\mu = 5$? Explain.
b. Can you determine if $\mu = 6$ is contained in the 95% confidence interval? Explain.
3.7 In a given population, 11% of the likely voters are African American. A survey using a simple random sample of 600 landline telephone numbers finds
8% African Americans. Is there evidence that the survey is biased? Explain.
3.8 A new version of the SAT test is given to 1000 randomly selected high
school seniors. The sample mean test score is 1110, and the sample standard
deviation is 123. Construct a 95% confidence interval for the population
mean test score for high school seniors.
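One way to sketch this interval, assuming the large-sample formula $\bar{Y} \pm 1.96 \cdot s_Y/\sqrt{n}$; the function name `mean_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci(ybar, s, n, level=0.95):
    """Large-sample confidence interval for a population mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    se = s / sqrt(n)                 # standard error of the sample mean
    return ybar - z * se, ybar + z * se

lo, hi = mean_ci(1110, 123, 1000)
print(f"95% CI: ({lo:.1f}, {hi:.1f})")
```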
3.9
3.10 Suppose a new standardized test is given to 100 randomly selected third-grade students in New Jersey. The sample average score $\bar{Y}$ on the test is 58
points, and the sample standard deviation, $s_Y$, is 8 points.
a. The authors plan to administer the test to all third-grade students in
New Jersey. Construct a 95% confidence interval for the mean score
of all New Jersey third graders.
b. Suppose the same test is given to 200 randomly selected third
graders from Iowa, producing a sample average of 62 points and
sample standard deviation of 11 points. Construct a 90% confidence
interval for the difference in mean scores between Iowa and New
Jersey.
c. Can you conclude with a high degree of confidence that the population means for Iowa and New Jersey students are different? (What is
the standard error of the difference in the two sample means? What
is the p-value of the test of no difference in means versus some
difference?)
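Parts (b) and (c) can be sketched as below, assuming independent samples and the large-sample normal approximation. The variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

norm = NormalDist()

# Summary statistics from the exercise: (sample mean, sample std. dev., sample size)
nj = (58, 8, 100)
iowa = (62, 11, 200)

diff = iowa[0] - nj[0]
# SE of a difference of independent sample means: sqrt(s1^2/n1 + s2^2/n2)
se_diff = sqrt(iowa[1] ** 2 / iowa[2] + nj[1] ** 2 / nj[2])

z90 = norm.inv_cdf(0.95)                     # about 1.645 for a 90% interval
ci90 = (diff - z90 * se_diff, diff + z90 * se_diff)

t = diff / se_diff
p_value = 2 * (1 - norm.cdf(abs(t)))         # two-sided test of no difference
print(se_diff, ci90, t, p_value)
```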
3.11 Consider the estimator $\tilde{Y}$, defined in Equation (3.1). Show that
(a) $E(\tilde{Y}) = \mu_Y$ and (b) $\mathrm{var}(\tilde{Y}) = 1.25\sigma_Y^2/n$.
3.12 To investigate possible gender discrimination in a firm, a sample of 100 men
and 64 women with similar job descriptions are selected at random. A summary of the resulting monthly salaries follows:
         Average Salary ($\bar{Y}$)   Standard Deviation ($s_Y$)   n
Men      $3100                        $200                         100
Women    $2900                        $320                         64
Is there statistically significant evidence that the districts with smaller
classes have higher average test scores? Explain.
3.14 Values of height in inches ($X$) and weight in pounds ($Y$) are recorded from
a sample of 300 male college students. The resulting summary statistics are
$\bar{X} = 70.5$ in., $\bar{Y} = 158$ lb, $s_X = 1.8$ in., $s_Y = 14.2$ lb, $s_{XY} = 21.73$ in. $\times$ lb, and
$r_{XY} = 0.85$. Convert these statistics to the metric system (meters and kilograms).
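The conversion only requires knowing how each statistic scales when the units change. The sketch below takes the summary statistics as $\bar{X} = 70.5$ in., $\bar{Y} = 158$ lb, $s_X = 1.8$ in., $s_Y = 14.2$ lb, $s_{XY} = 21.73$ in. $\times$ lb, and $r_{XY} = 0.85$, and uses the standard conversion factors; the variable names are illustrative.

```python
IN_TO_M = 0.0254          # meters per inch (exact by definition)
LB_TO_KG = 0.45359237     # kilograms per pound (exact by definition)

x_bar_in, y_bar_lb = 70.5, 158.0
s_x_in, s_y_lb = 1.8, 14.2
s_xy, r_xy = 21.73, 0.85

# Means and standard deviations scale by the conversion factor of their own variable.
x_bar_m = x_bar_in * IN_TO_M
y_bar_kg = y_bar_lb * LB_TO_KG
s_x_m = s_x_in * IN_TO_M
s_y_kg = s_y_lb * LB_TO_KG

# The covariance scales by the product of both factors.
s_xy_metric = s_xy * IN_TO_M * LB_TO_KG

# The correlation is unit free, so it is unchanged.
r_xy_metric = r_xy
print(x_bar_m, y_bar_kg, s_x_m, s_y_kg, s_xy_metric, r_xy_metric)
```

As a consistency check, $s_{XY}/(s_X s_Y)$ gives the same value in either unit system, since the conversion factors cancel.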
3.15 Let $Y_a$ and $Y_b$ denote Bernoulli random variables from two different populations, denoted $a$ and $b$. Suppose that $E(Y_a) = p_a$ and $E(Y_b) = p_b$. A random sample of size $n_a$ is chosen from population $a$, with sample average
denoted $\hat{p}_a$, and a random sample of size $n_b$ is chosen from population $b$,
with sample average denoted $\hat{p}_b$. Suppose the sample from population $a$ is
independent of the sample from population $b$.
a. Show that $E(\hat{p}_a) = p_a$ and $\mathrm{var}(\hat{p}_a) = p_a(1 - p_a)/n_a$. Show that
$E(\hat{p}_b) = p_b$ and $\mathrm{var}(\hat{p}_b) = p_b(1 - p_b)/n_b$.
b. Show that $\mathrm{var}(\hat{p}_a - \hat{p}_b) = \frac{p_a(1 - p_a)}{n_a} + \frac{p_b(1 - p_b)}{n_b}$. (Hint: Remember
that the samples are independent.)
c. Suppose that $n_a$ and $n_b$ are large. Show that a 95% confidence interval
for $p_a - p_b$ is given by
$(\hat{p}_a - \hat{p}_b) \pm 1.96\sqrt{\frac{\hat{p}_a(1 - \hat{p}_a)}{n_a} + \frac{\hat{p}_b(1 - \hat{p}_b)}{n_b}}$.
How would you construct a 90% confidence interval for $p_a - p_b$?
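The interval in part (c) can be sketched in a few lines. The function name `diff_proportion_ci` and the sample fractions and sizes passed to it are hypothetical values chosen purely for illustration; they do not come from the exercise.

```python
from math import sqrt
from statistics import NormalDist

def diff_proportion_ci(p_a, n_a, p_b, n_b, level=0.95):
    """Large-sample confidence interval for p_a - p_b from independent samples."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # critical value for the chosen level
    # Variances add because the two samples are independent.
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    d = p_a - p_b
    return d - z * se, d + z * se

# Hypothetical sample fractions and sizes, for illustration only
ci95 = diff_proportion_ci(0.60, 500, 0.50, 400, level=0.95)
ci90 = diff_proportion_ci(0.60, 500, 0.50, 400, level=0.90)
print(ci95, ci90)
```

Moving from 95% to 90% changes only the critical value, from about 1.96 to about 1.645, so the 90% interval is narrower and has the same center.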
ii. Is there statistically significant evidence that students will perform
better on their second attempt after taking the prep course?
iii. Students may have performed better in their second attempt
because of the prep course or because they gained test-taking
experience in their first attempt. Describe an experiment that
would quantify these two effects.
3.17 Read the box "The Gender Gap of Earnings of College Graduates in the
United States" in Section 3.5.
a. Construct a 95% confidence interval for the change in men's average
hourly earnings between 1992 and 2008.
b. Construct a 95% confidence interval for the change in women's average hourly earnings between 1992 and 2008.
c. Construct a 95% confidence interval for the change in the gender
gap in average hourly earnings between 1992 and 2008. (Hint:
$\bar{Y}_{m,1992} - \bar{Y}_{w,1992}$ is independent of $\bar{Y}_{m,2008} - \bar{Y}_{w,2008}$.)
3.18 This exercise shows that the sample variance is an unbiased estimator of
the population variance when $Y_1, \ldots, Y_n$ are i.i.d. with mean $\mu_Y$ and
variance $\sigma_Y^2$.
a. Use Equation (2.31) to show that
$E[(Y_i - \bar{Y})^2] = \mathrm{var}(Y_i) - 2\,\mathrm{cov}(Y_i, \bar{Y}) + \mathrm{var}(\bar{Y})$.
b. Use Equation (2.33) to show that $\mathrm{cov}(\bar{Y}, Y_i) = \sigma_Y^2/n$.
c. Use the results in (a) and (b) to show that $E(s_Y^2) = \sigma_Y^2$.
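A small Monte Carlo check can make the result concrete before working through the algebra. This is a simulation sketch, not the proof the exercise asks for. The normal distribution, the variance of 4, the sample size of 5, and the number of replications are all arbitrary choices made for illustration.

```python
import random
from statistics import variance, pvariance, fmean

random.seed(0)                  # fixed seed so the run is reproducible
sigma2, n, reps = 4.0, 5, 20000

s2_draws, biased_draws = [], []
for _ in range(reps):
    sample = [random.gauss(0.0, sigma2 ** 0.5) for _ in range(n)]
    s2_draws.append(variance(sample))        # sample variance, divides by n - 1
    biased_draws.append(pvariance(sample))   # divides by n instead

# The first average should be close to sigma2, the second close to sigma2 * (n - 1) / n.
print(fmean(s2_draws), fmean(biased_draws))
```

Dividing by $n$ rather than $n - 1$ shrinks the average by the factor $(n - 1)/n$, which is the bias that the degrees-of-freedom correction removes.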
3.19 a. $\bar{Y}$ is an unbiased estimator of $\mu_Y$. Is $\bar{Y}^2$ an unbiased estimator of $\mu_Y^2$?
b. $\bar{Y}$ is a consistent estimator of $\mu_Y$. Is $\bar{Y}^2$ a consistent estimator of $\mu_Y^2$?
3.20 Suppose that $(X_i, Y_i)$ are i.i.d. with finite fourth moments. Prove that the
sample covariance is a consistent estimator of the population covariance,
that is, $s_{XY} \xrightarrow{p} \sigma_{XY}$, where $s_{XY}$ is defined in Equation (3.24). (Hint: Use
the strategy of Appendix 3.3 and the Cauchy-Schwarz inequality.)
3.21 Show that the pooled standard error $[SE_{pooled}(\bar{Y}_m - \bar{Y}_w)]$ given following
Equation (3.23) equals the usual standard error for the difference in means
in Equation (3.19) when the two group sizes are the same ($n_m = n_w$).
Empirical Exercise
E3.1