
STAT 141 Confidence Intervals and Hypothesis Testing 10/26/04

Today (Chapter 7):

• CI with σ unknown, t-distribution

• CI for proportions

• Two sample CI with σ known or unknown

• Hypothesis Testing, z-test

Confidence Intervals with σ unknown


Last Time: Confidence Interval when σ is known:
A level C, or 100(1 − α)%, confidence interval for µ is

    [ X̄ − z_{α/2} σ/√n , X̄ + z_{α/2} σ/√n ]

But to return to reality, we don’t know σ. Thus we must estimate the standard deviation of X̄ with:
    SE_X̄ = s/√n

But s is just a function of our Xi's and thus is itself a random variable – it has a sampling distribution of its own.
Before, if we knew σ, we could say

    P( −z_{α/2} < (X̄ − µ)/(σ/√n) < z_{α/2} ) = 1 − α
which after algebra gave the confidence interval.
[Remember, for any s, z_s is defined so that 1 − 2s of the area falls in (−z_s, z_s). So z_s = qnorm(1 − s) = −qnorm(s) is the 1 − s quantile, i.e. z_s is the positive side.]
Now we want a similar setup, so that:

    P( ?? < (X̄ − µ)/SE_X̄ < ?? ) = 1 − α

We need to know the probability distribution of T = (X̄ − µ)/SE_X̄. T has the Student's t-distribution with n − 1 degrees of freedom. We write this as T ∼ t_{n−1}. The degrees of freedom ν is the only parameter of this distribution.
[the book uses t_s for T]

[Figure: four panels overlaying the Student t density (df = 1, 5, 10, and 50; the last panel's legend says df = 100) on the N(0,1) density. The t density has heavier tails, which shrink toward the normal as df grows.]

R code:

> par(mfrow=c(2,2))   # 2x2 grid of panels; figure saved as tdist1.pdf
> x <- seq(-6, 6, length=10000)
> plot(x, dnorm(x), type="l", lty=3, ylab="", xlab="", main="t-dist w/ df=1")
> lines(x, dt(x, df=1), type="l")
> legend(x=2, y=.4, lty=c(1,3), legend=c("t-dist, df=1", "N(0,1)"))
...                   # repeated for the other df values

Thus the t-distribution approaches the normal as ν increases, but for small n it gives wider intervals.
Why “degrees of freedom”??

Let y_i = x_i − x̄. We have

    s² = (1/(n−1)) Σ y_i²   and   Σ y_i = 0   (*)

Now (*) is one constraint on n numbers, hence the phrase "n − 1 degrees of freedom".
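A quick R illustration of this constraint (a sketch; the numbers are arbitrary):

# Sketch: the residuals y_i = x_i - xbar always satisfy sum(y) = 0,
# so only n-1 of them can vary freely
x <- c(0.7, -1.6, -0.2, -1.2, -0.1)   # any n numbers
y <- x - mean(x)
sum(y)                                # 0, up to rounding error
sum(y^2) / (length(x) - 1)            # same value as var(x)
var(x)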
Now that we know the distribution, we can find the "??" from above – these are just the α/2 and 1 − α/2 quantiles of the t-distribution. Let t_{n−1,s} be defined similarly to z_s; it equals qt(1 − s, df = n − 1) = −qt(s, df = n − 1). We then have:

    P( −t_{n−1,α/2} < (X̄ − µ)/SE_X̄ < t_{n−1,α/2} ) = 1 − α
This gives us a confidence interval like before, only we use the quantiles of the t-distribution rather than
the normal distribution.
Example. Taken from the original paper on the t-test by W.S. Gosset, 1908. [Gosset was employed by Guinness Breweries, Dublin. A chemist turned statistician; Guinness, fearing the results to be of commercial importance, forbade Gosset to publish under his own name. He chose the pseudonym "Student" out of modesty.]
Two drugs to induce sleep: A = "dextro", B = "laevo". Each of ten patients receives both drugs (presumably in random order). Issue: Is drug B better than drug A? Student's sleep data:
> data(sleep)
> sleep
   extra group
1    0.7     1
2   -1.6     1
3   -0.2     1
4   -1.2     1
5   -0.1     1
6    3.4     1
7    3.7     1
8    0.8     1
9    0.0     1
10   2.0     1
11   1.9     2
12   0.8     2
13   1.1     2
14   0.1     2
15  -0.1     2
16   4.4     2
17   5.5     2
18   1.6     2
19   4.6     2
20   3.4     2
> extra1=sleep[sleep[,2]==1,]
> extra2=sleep[sleep[,2]==2,]
> extradiff=extra2[,1]-extra1[,1]
> extradiff
 [1] 1.2 2.4 1.3 1.3 0.0 1.0 1.8 0.8 4.6 1.4
> mean(extradiff)
[1] 1.58
> sqrt(var(extradiff))
[1] 1.229995
> sqrt(var(extradiff)/10)
[1] 0.3889587
> 1.58/0.38896
[1] 4.062114
> qt(.975,9)
[1] 2.262157
> qt(.995,9)
[1] 3.249836
> qnorm(0.975)
[1] 1.959964
> qnorm(0.995)
[1] 2.575829
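Putting these pieces together gives the confidence interval itself; a sketch continuing the session above:

# 95% CI for the mean difference: X-bar +/- t_{9, .025} * SE
m  <- mean(extradiff)                 # 1.58
se <- sqrt(var(extradiff)/10)         # 0.389
m + c(-1, 1) * qt(.975, df=9) * se    # about [0.70, 2.46]
# t.test(extradiff) reports the same interval directly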

A level C conf. interval with σ unknown:

• exact if X Normal

• otherwise approx correct for large n

• Margin of error M in E ± M is

    M = t_{n−1,α/2} · s/√n = t_{n−1,α/2} SE_X̄

Remark: The large value 4.6 is a possible outlier, so there is some doubt about the normality assumption here.
What's different? Since we don't know σ, we pay a penalty with a (slightly) wider interval (e.g. t = 2.262 vs. z = 1.96 for a 95% confidence level).
For large sample sizes we can just use the normal distribution quantiles z_{α/2}, since the t-distribution quickly comes to look like the normal distribution.
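For example, in R:

# t quantiles approach the normal quantile as the degrees of freedom grow
qnorm(.975)                      # 1.959964
qt(.975, df=c(5, 10, 30, 100))   # 2.571, 2.228, 2.042, 1.984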

Proportions
We saw last time that p̂ is approximately distributed as N(p, p(1−p)/n). If we want a confidence interval for p, we can use this normality to get an approximate confidence interval, with margin of error

    M = z_{α/2} × SE_p̂ = z_{α/2} √( p̂(1−p̂)/n )

(since p is unknown, we plug in p̂ in the SE). The book offers a correction to this, using

    p̃ = (y + 0.5 z²_{α/2}) / (n + z²_{α/2})   and   SE_p̃ = √( p̃(1−p̃) / (n + z²_{α/2}) )

where y is the number of successes.
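As an R sketch (y and n here are made-up illustrative counts, not from a class data set):

# Sketch: approximate 95% confidence intervals for a proportion
y <- 60; n <- 100; z <- qnorm(.975)
phat <- y/n
phat + c(-1, 1) * z * sqrt(phat*(1-phat)/n)               # plain interval

ptilde <- (y + 0.5*z^2) / (n + z^2)                       # corrected estimate
ptilde + c(-1, 1) * z * sqrt(ptilde*(1-ptilde)/(n+z^2))   # corrected interval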

Two-samples
One of the most common statistical procedures. Is there a difference? Is it real? However, because of the preparatory work with one-sample problems, this should seem rather familiar – a case of déjà vu, but with slightly more complex formulas.
What do we mean by “two-samples”?

• Two groups
• Distinct populations [treatment/control, . . . , male/female . . . ]
• Grouping variable: categorical variable with 2 levels.
• Data is independent between groups

Example: (Dalgaard p 87) Energy expenditure: Two groups of women, lean and obese. Twenty four
hour energy expenditure in MJ.

> library(ISwR)     # Dalgaard's package, assumed source of the energy data
> data(energy)
> lean <- energy[energy$stature=='lean',1]
> obese <- energy[energy$stature=='obese',1]
> obese
[1]  9.21 11.51 12.79 11.85  9.97  8.79  9.69  9.68  9.19
> lean
 [1]  7.53  7.48  8.08  8.09 10.15  8.40 10.88  6.13  7.90  7.05  7.48
[12]  7.58  8.11
> plot(expend~stature, data=energy)

Beware: Some data sets that may look like two-sample problems are really better treated as paired data.
Example: The sleep drug data from above: 10 patients, drugs A and B. Since each patient received both A and B, the samples are not really independent (there is a common component of variation due to the patient) – it is better to look at differences, which turns this into a one-sample problem. (We will discuss pairing/blocking more later.)
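In R the paired analysis can be run directly; a sketch using the variables defined above:

# Paired t-test: equivalent to the one-sample analysis of extradiff
t.test(extra2[,1], extra1[,1], paired=TRUE)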

Notation:

                 Population                  SRS from Each Population
              Variable  Mean  SD            Sample Size  Sample Mean  Sample SD
Population 1     X1      µ1   σ1   Sample 1      n1          X̄1         s1
Population 2     X2      µ2   σ2   Sample 2      n2          X̄2         s2

Distribution of X̄1 − X̄2

Sample mean difference: X̄1 − X̄2 – everything depends on the variability and distribution of this difference!
Recall in general that if E(V ) = µ and E(W ) = ν then

E(V − W ) = µ − ν

and if V and W are independent then

var(V − W ) = var(V ) + var(W )


So if X̄1 ∼ (µ1, σ1²/n1) and X̄2 ∼ (µ2, σ2²/n2), we will have

    µ_{X̄1−X̄2} = E(X̄1 − X̄2) = µ1 − µ2

and for independent rvs X̄1 and X̄2:

    σ²_{X̄1−X̄2} = σ²_{X̄1} + σ²_{X̄2} = σ1²/n1 + σ2²/n2
We need estimates for µ1 − µ2 and σ²_{X̄1−X̄2}. Clearly X̄1 − X̄2 is the estimate for µ1 − µ2. Once we have an estimate of σ_{X̄1−X̄2}, we can use a similar method as in the one-sample case to get a confidence interval.

1. Unequal variances: If σ1² ≠ σ2², use

    SE²_{X̄1−X̄2} = s1²/n1 + s2²/n2

2. Equal variances: If σ1² = σ2² = σ² is unknown but assumed to be equal, we can use a pooled estimate of the variance:

    s²_pooled = ( (n1 − 1)s1² + (n2 − 1)s2² ) / (n1 + n2 − 2)

i.e. an average with weights equal to the respective degrees of freedom. Then our estimate of σ²_{X̄1−X̄2} is

    SE²_pooled = s²_pooled (1/n1 + 1/n2)

(a short R sketch follows the list below)
• Good method if the two SDs are close; if the sample sizes are also moderate to large, there won't be much difference from the unequal-variances method (above).
• If the two SDs are different, it is better to use the unequal-variances method.
• We will use this pooled estimate again when we study Analysis of Variance.
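A short R sketch of the pooled computation (the helper name is ours, not a standard function):

# Sketch: pooled variance and pooled SE for two samples x1, x2
pooled_se <- function(x1, x2) {
  n1 <- length(x1); n2 <- length(x2)
  s2p <- ((n1-1)*var(x1) + (n2-1)*var(x2)) / (n1 + n2 - 2)   # s^2_pooled
  sqrt(s2p * (1/n1 + 1/n2))                                  # SE_pooled
}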

As above, we need the distribution of:

    T = ( X̄1 − X̄2 − µ_{X̄1−X̄2} ) / ( SE of X̄1 − X̄2 )
If X1 ∼ N(µ1, σ1²) and X2 ∼ N(µ2, σ2²) then:

• Equal variances: If we have equal variances in the two populations, then SE of X̄1 − X̄2 = SE_pooled and T ∼ t_ν with ν = n1 + n2 − 2

• Unequal variances: Then SE of X̄1 − X̄2 = SE_{X̄1−X̄2} and T is approximately distributed as t_ν. We use one of two values for ν:

  1. ν = min(n1 − 1, n2 − 1) (the conservative choice)

  2. The more accurate formula (generally used by packages, and only on computers!):

         ν′ = ( s1²/n1 + s2²/n2 )² / [ (1/(n1−1)) (s1²/n1)² + (1/(n2−1)) (s2²/n2)² ]

     This is known as Welch's formula; it gives fractional degrees of freedom.

Either approximation can be used, but say which!


Note that one generally cannot go too far wrong, since it can be shown by algebra that

    min(n1 − 1, n2 − 1) ≤ ν′ ≤ n1 + n2 − 2
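A small R helper implementing ν′ (a sketch; welch_df is our name, not a built-in):

# Sketch: Welch's approximate (fractional) degrees of freedom
welch_df <- function(s1, s2, n1, n2) {
  v1 <- s1^2/n1; v2 <- s2^2/n2
  (v1 + v2)^2 / (v1^2/(n1-1) + v2^2/(n2-1))
}
# For the energy data below this gives about 15.9, which indeed lies
# between min(8, 12) = 8 and 9 + 13 - 2 = 20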

Summary: Two sample confidence intervals for µ1 − µ2 at the 100(1 − α)% level

    E ± M,   E = X̄1 − X̄2,   M = (z_{α/2} or t_{α/2,ν}) × (appropriate SE)

    σ known:                     M = z_{α/2} √( σ1²/n1 + σ2²/n2 )
    σ unknown, large samples:    M = z_{α/2} √( s1²/n1 + s2²/n2 )
    unknown, unequal variances:  M = t_{α/2,ν} √( s1²/n1 + s2²/n2 ),   ν = min(n1 − 1, n2 − 1) or ν′
    unknown, equal variances:    M = t_{α/2,ν} s_pooled √( 1/n1 + 1/n2 ),   ν = n1 + n2 − 2

where z_{α/2} and t_{α/2,ν} follow the same notation as in the one-sample case.
In the energy data above, we can construct a 95% confidence interval for the difference in the true means between obese and lean: n1 = 9, n2 = 13 and X̄1 − X̄2 = 2.23. We'll use the conservative estimate ν = min(9 − 1, 13 − 1) = 8, and SE_{X̄1−X̄2} = 0.58. So M = 2.306 × 0.579 ≈ 1.33, and a (conservative) 95% confidence interval is [0.90, 3.57]. Computer output for Welch's formula gives [1.00, 3.46].
> mean(obese)-mean(lean)
[1] 2.231624
> qt(.975,df=8)
[1] 2.306004
> sqrt(var(obese)/length(obese)+var(lean)/length(lean))
[1] 0.5788152
> t.test(obese, lean, conf.level=.95)

        Welch Two Sample t-test

data:  obese and lean
t = 3.8555, df = 15.919, p-value = 0.001411
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 1.004081 3.459167
sample estimates:
mean of x mean of y
10.297778  8.066154
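For comparison, the pooled (equal-variance) analysis would use ν = n1 + n2 − 2 = 20 degrees of freedom; in R:

# Pooled two-sample t-test (equal-variance assumption)
t.test(obese, lean, var.equal=TRUE)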

Hypothesis Tests
We will generally have some hypotheses about certain parameters of the population (or populations)
from which our data arose, and we will be interested in using our data to see whether these hypotheses
are consistent with what we have observed.
To do this, we have already calculated confidence intervals for them; now we will be conducting hypothesis tests about the population parameters of interest. Both statistical procedures are built on the idea that if some theory about the population parameters is true, the observed data should follow admittedly random, but generally predictable, patterns. Thus, if the data do not fall within the likely outcomes under our supposed ideas about the population, we will tend to disbelieve these ideas, as the data do not strongly support them.
We will initially be interested in using our data to make inferences about µ, the population mean. To do
this, we will use our estimate of location from the data; namely, the sample mean (average) (since it is
mathematically nicer than the median). We will do this in the framework of several different data
structures, starting with the most basic, the one-sample situation. How can we decide if a given set of
data, and in particular its sample mean, is close enough to a hypothesized value for µ for us to believe
that the data are consistent with this value? In order to answer such a question, we need to know how a
statistic like the sample average behaves, i.e. its distribution.
We have already studied the distributions of the sample average and the sample proportion: when the sample size is large enough, they follow Normal distributions, centered at the expected value and with a spread of the order of the relevant SE.

INFERENCE FOR A SINGLE SAMPLE: Z-DISTRIBUTION

Standard Error of the Sample Mean (σ known)


Example: Testing whether the birthweights of the Secher babies have an above-average mean.
Standard deviation of the original population: σ = 700 – known.
We would like to test whether µ = 2500, versus the alternative µ > 2500.
We have a sample of n = 107 observations; mean(bwt) gives X̄ = 2739, and we would like to use these data to test µ > 2500.
With a sample of size 107, X̄ will be approximately normal with variance σ²/n = 700²/107 = 490000/107.
If it is true that µ = 2500 (this is called the null hypothesis), then by the central limit theorem X̄ ∼ N(2500, 490000/107) = N(2500, 67.7²), and under the null hypothesis

    P( X̄ ≥ 2739 ) = P( (X̄ − µ)/(σ/√n) ≥ (2739 − 2500)/67.7 ) = P( Z ≥ 3.53 )

What is the probability that a standard normal Z score is as big as 3.53?


    P(Z > 3.53) = 1 − P(Z ≤ 3.53) = 1 − Φ(3.53) = 0.000207

using the R command pnorm(3.53), which returns [1] 0.9997922.
This is indeed very small – too small to be plausible if the null hypothesis were true. We reject the null hypothesis.
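The whole calculation in one place (a sketch; the plug-in numbers are those quoted above):

# Sketch: one-sided z-test of H0: mu = 2500 vs HA: mu > 2500
xbar <- 2739; mu0 <- 2500; sigma <- 700; n <- 107
z <- (xbar - mu0) / (sigma/sqrt(n))   # about 3.53
pnorm(z, lower.tail=FALSE)            # one-sided P-value, about 0.0002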
Let X1, . . . , Xn be a sample of n i.i.d. random variables from a distribution having unknown mean µ and known standard deviation σ. Assume n is large, say n > 30. Suppose interest centers on testing the hypothesis

    H0 : µ = µ0,

where µ0 is some fixed, pre-specified value. This will be our null hypothesis; notice that it is a simple one, i.e. it postulates a single hypothesized value for µ. The hypothesis against which the null hypothesis is to be compared, the alternative hypothesis, can take one of three basic forms:

1. HA : µ ≠ µ0

2. HA : µ > µ0

3. HA : µ < µ0

The idea, as we have said, is to assess whether the data support the null hypothesis (H0) or suggest the relevant alternative (HA).
To begin, we assert that the null hypothesis is true (i.e. that the true value of µ is actually µ0 ). Under this
assumption, the Central Limit Theorem implies that the test statistic

    Z = (X̄ − µ0) / (σ/√n),

has a standard normal (N(0,1)) distribution (notice that the test statistic is just the standardized version of X̄ under the assumption that the true mean is actually equal to µ0). The usual convention applies: if σ is unknown and n is large, the sample standard deviation s is used in place of σ in forming the test statistic. The null hypothesis is supported if the observed value of the test statistic is small (i.e. X̄ is close enough to µ0, the hypothesized value, that I would believe the true mean is µ0). On the other hand, if I observe a large value of the test statistic, this suggests that X̄ is far from µ0, which tends to discredit the null hypothesis in favor of the alternative hypothesis HA : µ ≠ µ0.
The real issue is "how large is large?" (or how small is small?).
For example, if I observe a Z value of 1, say, can we conclude in favor of H0 over HA, or should we prefer HA over H0? What about a Z value of −2? The answer to these questions lies in considering what
the test statistic actually measures. In words, the observed value of Z is just the number of standard
errors the observed sample mean is from the hypothesized population mean; i.e.

Zobs = number of standard errors X̄ is away from µ0

This is determined by how rare a "rare event" should be to make us think something other than H0 is going on. This determines what we call the significance level α; most often α is taken to be 5%, sometimes 10%, and sometimes even 0.1% (1/1000).
We compute the P-value, which is the probability of observing a value "as extreme as" the one observed. The P-value computation takes P(|Z| > |Zobs|), P(Z > Zobs), or P(Z < Zobs), depending on what the alternative HA was.
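In R, with zobs the observed value of the statistic, the three cases are (a sketch):

zobs <- 3.53                               # e.g. from the birthweight test above
2 * pnorm(abs(zobs), lower.tail=FALSE)     # HA: mu != mu0 (two-sided)
pnorm(zobs, lower.tail=FALSE)              # HA: mu > mu0
pnorm(zobs)                                # HA: mu < mu0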
