
AB1202

Statistics and Analysis


Lecture 5
Hypothesis Testing
Chin Chee Kai
cheekai@ntu.edu.sg
Nanyang Business School
Nanyang Technological University
NBS 2016S1 AB1202 CCK-STAT-018

Hypothesis Testing
• Concepts of Hypothesis Testing
• Formulating Hypotheses
• Significance Level α
• Test Statistic & p-value
• Concluding Test Properly
• Testing for Population Mean
• Testing for Population Proportion

Concepts of Hypothesis Testing


• There is a reality which we cannot “see”.
• We make an assumption, a hypothesis, about the
reality.
▫ Eg: Mean of all rulers produced 𝜇 = 10 cm
• We sample the reality, since this is all we could “see”
about the reality.
▫ Eg: Sample mean x̄ = 9.5 cm with s.d. 𝑠 = 2.1 cm
• Then we make a conclusion about the reality (which
we still cannot “see” thoroughly).
▫ Eg: “mean of all rulers produced 𝜇 = 10 cm indeed.”
• Keep in mind that all our hypothesis tests use results
from sampling distribution theory.

Formulating Hypotheses
• Hypothesis statements take the form:
▫ H0 : null hypothesis statement
▫ H1 : alternative hypothesis statement
▫ Eg: H0 : 𝜇 = 10 cm
▫ H1 : 𝜇 ≠ 10 cm
• Hypothesis statements should be mutually
exclusive, and exhaustive (covering all possible
outcomes)
• Hypothesis statements are not universal truths.
• Depending on who is performing the statistical
sampling and purpose, hypothesis statements
could be formed differently about the same reality.

Assume Appropriate Distribution


• An appropriate distribution matches the
population parameter being tested against.
▫ Eg: If we test against the population mean with
known 𝜎, we use the standard normal distribution
𝑍 ~ 𝑁(0, 1²).
▫ If we test against the population mean with unknown 𝜎,
we use the Student-t distribution with degrees of freedom
𝜈 = 𝑛 − 1 (𝑛 = sample size).
• When we test against the population variance 𝜎², we
will consider using the 𝜒² (Chi-square) or F-distributions.
More about these in later lectures.

Significance Level α
• The probability of outlying events occurring in
the samples when the null hypothesis is indeed
the reality.
• Area of 𝛼 covers the alternative hypothesis
outcomes.
• It’s two-tailed with 𝛼/2 in each tail if H1 has unequal
sign (“≠”).
• It’s left-tailed with 𝛼 if H1 describes left extremities
(with “<”).
• It’s right-tailed with 𝛼 if H1 describes right
extremities (with “>”).
• Notice that 𝛼 can be set even before sampling
activity.
[Figure: sampling distribution of X̄ centred at 𝜇 = 10 for
H0 : 𝜇 = 10 cm, H1 : 𝜇 ≠ 10 cm, with area 1 − 𝛼 in the
middle and area 𝛼/2 in each tail.]

Critical Value
• Critical value(s) 𝑧𝑐 depends on 𝛼; in fact, it is such
that 𝑃(𝑍 < 𝑧𝑐) = 𝛼/2 for two-tailed tests.
• Suppose we choose 𝛼 = 0.05.
• 𝑃(𝑍 < 𝑧𝑐) = 0.025 ⇒ 𝑧𝑐 = −1.96
• Being two-tailed, our critical values are ±1.96.
• If it’s left-tailed with 𝛼, 𝑧𝑐 = −1.645.
• If it’s right-tailed with 𝛼, 𝑧𝑐 = 1.645.
[Figure: sampling distribution of X̄ centred at 𝜇 = 10 (𝑧 = 0 on the
𝑍 axis) for H0 : 𝜇 = 10 cm, H1 : 𝜇 ≠ 10 cm, with area 1 − 𝛼 between
𝑧𝑐− = −1.96 and 𝑧𝑐+ = 1.96 and area 𝛼/2 in each tail.]
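The critical values quoted above can be checked numerically. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist` as a stand-in for the Z table; the helper name `critical_values` and its `tail` argument are hypothetical and not part of the lecture.

```python
from statistics import NormalDist


def critical_values(alpha, tail="two"):
    """Return the standard-normal critical value(s) for significance level alpha.

    `tail` is a hypothetical label: "two", "left" or "right".
    """
    std_normal = NormalDist()  # Z ~ N(0, 1)
    if tail == "two":
        lower = std_normal.inv_cdf(alpha / 2)    # P(Z < lower) = alpha / 2
        return (lower, -lower)                   # symmetric about z = 0
    if tail == "left":
        return (std_normal.inv_cdf(alpha),)      # P(Z < z_c) = alpha
    if tail == "right":
        return (std_normal.inv_cdf(1 - alpha),)  # P(Z > z_c) = alpha
    raise ValueError("tail must be 'two', 'left' or 'right'")


for tail in ("two", "left", "right"):
    print(tail, [round(v, 3) for v in critical_values(0.05, tail)])
# two [-1.96, 1.96]
# left [-1.645]
# right [1.645]
```

Because 𝛼 is fixed before sampling, these values can be computed before any data is collected.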

Test Statistic & p-value


• A test statistic is a value summarizing the sample, to be
compared against an appropriately assumed distribution
in order to reject or not reject the null hypothesis.
• Eg: Suppose Normal Distribution is appropriate in our
sampling activity.
▫ We measured samples and found the test
statistic 𝑧 = −2.53
▫ Half p-value = 𝑝/2 = 𝑃(𝑍 < 𝑧) = 0.0057
▫ p-value = 2 × 𝑃(𝑍 < 𝑧) = 0.0114
[Figure: sampling distribution for H0 : 𝜇 = 10 cm, H1 : 𝜇 ≠ 10 cm, with
rejection regions of area 𝛼/2 beyond 𝑧𝑐− = −1.96 and 𝑧𝑐+ = 1.96, area
1 − 𝛼 between them, and the test statistic 𝑧 = −2.53 cutting off area
𝑝/2 in the left tail.]
Observe that in this example, our test statistic landed in the
rejection region, where 𝑧 < 𝑧𝑐− and 𝑝 < 𝛼.
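As a rough check of the numbers in this example, the tail areas can be computed directly. This is a sketch assuming the standard-library `statistics.NormalDist` in place of table lookup; the variable names are illustrative only.

```python
from statistics import NormalDist

z = -2.53     # test statistic from the example
alpha = 0.05  # significance level of the two-tailed test

std_normal = NormalDist()               # Z ~ N(0, 1)
half_p = std_normal.cdf(-abs(z))        # area in one tail beyond |z|
p_value = 2 * half_p                    # two-tailed p-value
z_crit = std_normal.inv_cdf(alpha / 2)  # lower critical value, about -1.96

print(round(half_p, 4), round(p_value, 4))  # 0.0057 0.0114
print(z < z_crit, p_value < alpha)          # True True
```

Both comparisons agree, as they must: comparing 𝑧 with the critical value and comparing 𝑝 with 𝛼 are two views of the same decision.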

Concluding Test Properly


• Depending on where test statistic lands on the
distribution, we either reject or do not reject H0 .
▫ Note that it’s always about H0 – the very hypothesis we
are concerned with.
▫ We never say “reject H1 ”, “do not reject H1 ”.
▫ We also never say “accept H0 ”, “accept H1 ”.
• The conclusion follows a standard format (as below)
▫ Avoid over-concluding with implied inferences
  ▪ Eg: “As sample mean is only 8.5 cm, we have production
  problems”
▫ Avoid under-concluding with insufficient description
  ▪ Eg: “Sample mean leads us to reject H0 ”
• “Since 𝑧 = −2.53 < −1.96 (critical value), we reject
H0 and conclude that the mean length of all rulers
produced is not equal to 10 cm at 𝛼 = 5%.”

One-Tail Cases
Left-tail case
• If hypotheses are stated as:
▫ H0 : 𝜇 ≥ 10 cm
▫ H1 : 𝜇 < 10 cm
• Then we have a left-tail test.
p-value = 𝑃(𝑍 < 𝑧) = 0.0057
• Our test statistic 𝑧 = −2.53 landed in the rejection region,
where 𝑧 < 𝑧𝑐 = −1.645 and 𝑝 < 𝛼.

Right-tail case
• If hypotheses are stated as:
▫ H0 : 𝜇 ≤ 10 cm
▫ H1 : 𝜇 > 10 cm
• Then we have a right-tail test.
p-value = 𝑃(𝑍 > 𝑧) = 1 − 0.0057 = 0.9943
• Here, test statistic 𝑧 = −2.53 is in NON-rejection region,
where 𝑧 < 𝑧𝑐 = 1.645 and 𝑝 > 𝛼.

[Figure: two sampling distributions centred at 𝜇 = 10 (𝑧 = 0). Left
panel: rejection region of area 𝛼 below 𝑧𝑐 = −1.645, with 𝑧 = −2.53
inside it. Right panel: rejection region of area 𝛼 above 𝑧𝑐 = 1.645,
with 𝑧 = −2.53 outside it.]
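The two one-tail cases above use the same test statistic but different tail areas. The sketch below makes that explicit, assuming the standard-library `statistics.NormalDist`; the function name `p_value` and its `tail` labels are hypothetical.

```python
from statistics import NormalDist


def p_value(z, tail):
    """p-value of a Z test statistic for a 'left', 'right' or 'two' tailed test."""
    std_normal = NormalDist()  # Z ~ N(0, 1)
    if tail == "left":
        return std_normal.cdf(z)            # P(Z < z)
    if tail == "right":
        return 1 - std_normal.cdf(z)        # P(Z > z)
    if tail == "two":
        return 2 * std_normal.cdf(-abs(z))  # both tails beyond |z|
    raise ValueError("tail must be 'left', 'right' or 'two'")


alpha = 0.05
for tail in ("left", "right"):
    p = p_value(-2.53, tail)
    decision = "reject H0" if p < alpha else "do not reject H0"
    print(tail, round(p, 4), decision)
# left 0.0057 reject H0
# right 0.9943 do not reject H0
```

The same 𝑧 = −2.53 leads to opposite decisions because H1 points to opposite tails.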

So, What Really Is p-value?


• Suppose 𝛼 = 0.05 and we measured p-value
𝑝 = 0.02 (< 𝛼). The way to interpret this p-value is:
▫ Assuming null hypothesis H0 is true (ie. assume I
believe null hypothesis for the moment), what is the
theoretical probability of seeing the sample I’ve
observed, or one that contradicts H0 even more?
Or put simply, how rare is it to see what I’ve seen?
▫ This value, the p-value, tells us how rare the observed
sample is, assuming H0 is true.
▫ The degree of rarity that counts as “too rare” is agreed
upon beforehand as 𝛼.
• Now, since the p-value 𝑝 = 0.02 (< 𝛼) indicates the
occurrence of an event whose probability is even smaller
than 𝛼, it must be very rare to see such samples.

And Yet I Saw It!


• And yet I saw it (because the samples have been
observed)! So one of two conclusions can be drawn:
▫ H0 is still true, and I’m just “unlucky” to have obtained
an outlier sample. Just try several more times and
perhaps I should get larger p-values.
▫ H0 is simply false, which explains why a very rare
theoretical event (assuming H0 is true) can be
observed.
• Statistically, whenever a smaller-than-𝛼 p-value is
observed, we always conclude that H0 is to be
rejected.
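The first possibility, that H0 is true and the sample was simply "unlucky", can be illustrated with a small simulation. This is a sketch under stated assumptions, not part of the lecture: it draws normally distributed ruler lengths with 𝜇 = 10 and 𝜎 = 0.1 (so H0 really is true), uses a hypothetical sample size of 35, and counts how often a two-tailed Z test still gives 𝑝 < 𝛼. The seed and number of trials are arbitrary.

```python
import random
from statistics import NormalDist

random.seed(1202)  # arbitrary seed so the run is repeatable

MU0, SIGMA, N, ALPHA = 10.0, 0.1, 35, 0.05  # assumed settings; H0 is true here
TRIALS = 20_000
std_normal = NormalDist()  # Z ~ N(0, 1)

rejections = 0
for _ in range(TRIALS):
    # Draw one sample of N ruler lengths from the H0 population.
    sample = [random.gauss(MU0, SIGMA) for _ in range(N)]
    x_bar = sum(sample) / N
    z = (x_bar - MU0) / (SIGMA / N ** 0.5)  # test statistic with known sigma
    p = 2 * std_normal.cdf(-abs(z))         # two-tailed p-value
    if p < ALPHA:
        rejections += 1                     # an "unlucky" sample under a true H0

rejection_rate = rejections / TRIALS
print(rejection_rate)  # expected to be close to ALPHA
```

When H0 is true, a smaller-than-𝛼 p-value should turn up in roughly a fraction 𝛼 of samples, which is the risk accepted by always rejecting in that situation.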

Can p-value be nearly 1?


• Sure. If we observe a sample that gives p-value
𝑝 = 0.9943 (> 𝛼), we go by the same logic:
▫ Assuming null hypothesis H0 is true, the theoretical
probability of seeing the sample I’ve observed, or one
that contradicts H0 even more, is large (larger than 𝛼).
▫ It means that there’s high theoretical chance of
seeing what I’ve seen, which means it is not
surprising that I, indeed, observed the samples.
▫ Hence, there is nothing to contradict our
assumption.
• Statistically, whenever a larger-than-𝛼 p-value is
observed, we always conclude that H0 is NOT to
be rejected.

Testing for Population Mean


A factory has a 10-cm-ruler production machine
that usually produces rulers with mean 10 cm and
s.d. 0.1 cm.
Lately, Rex suspects that the machine might have a
misaligned offset that could lead to a mean that
deviates from 10 cm.
Rex randomly collected 35 rulers for precision
measurements and found the sample mean to be
9.8 cm and sample s.d. to be 0.36 cm.
Determine if Rex should conclude that the mean
has shifted at a significance level of 5%.

Testing for Population Mean


• H0 : 𝜇 = 10 cm
• H1 : 𝜇 ≠ 10 cm, 𝛼 = 0.05
• x̄ = 9.8, 𝑠 = 0.36, 𝜎 = 0.1, 𝑛 = 35
• Although parent population distribution is unknown,
by CLT, X̄ ~ 𝑁(𝜇_X̄, 𝜎_X̄²), where 𝜇_X̄ = 𝜇 = 10 and
𝜎_X̄ = 𝜎/√𝑛 = 0.1/√35 = 0.0169
• Critical values 𝑧𝑐 = ±1.96
• Test statistic 𝑧 = (x̄ − 𝜇)/(𝜎/√𝑛) = (9.8 − 10)/0.0169 = −11.83
• Since 𝑧 = −11.83 < −1.96, we reject H0 and
conclude that the population mean ruler length
differs from 10 cm at 5% significance level.
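The arithmetic of this worked example can be reproduced in a few lines. This is a minimal sketch assuming the standard-library `statistics.NormalDist` for the critical value; as in the solution, it uses the known population 𝜎 = 0.1 rather than the sample s.d., and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 10.0    # hypothesised mean under H0 (cm)
sigma = 0.1   # known population s.d. (cm)
n = 35        # sample size
x_bar = 9.8   # observed sample mean (cm)
alpha = 0.05  # significance level, two-tailed

std_error = sigma / sqrt(n)                   # s.d. of the sample mean
z = (x_bar - mu0) / std_error                 # test statistic
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # upper critical value
reject_h0 = abs(z) > z_crit                   # two-tailed decision rule

print(round(std_error, 4), round(z, 2), round(z_crit, 2), reject_h0)
# 0.0169 -11.83 1.96 True
```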

Testing for Population Proportion


After tuning and maintenance, the ruler machine is
healthily back on production. Rulers produced have
mean 10 cm and s.d. 0.1 cm.
Rex would now like to know whether the machine is
“balanced” in that the proportion of rulers longer than
10 cm is the same as the proportion of rulers shorter than
10 cm.
This time, Rex randomly collected 60 rulers for
precision measurements and found that 36 rulers are
longer than 10 cm.
Determine if Rex should conclude that the rulers
produced are balanced at a significance level of 1%.

Testing for Population Proportion


• H0 : 𝑝 = 0.5, where 𝑝 is the proportion of rulers
longer than 10 cm.
• H1 : 𝑝 ≠ 0.5, 𝛼 = 0.01
• p̂ = 36/60 = 0.6, 𝑛 = 60
• By CLT, X̄ ~ 𝑁(𝜇_X̄, 𝜎_X̄²), where 𝜇_X̄ = 𝑝 = 0.5 and
𝜎_X̄ = √(𝑝(1 − 𝑝)/𝑛) = √(0.5 × (1 − 0.5)/60) = 0.06455
• Critical values 𝑧𝑐 = ±2.576
• Test statistic 𝑧 = (p̂ − 𝑝)/√(𝑝(1 − 𝑝)/𝑛) = (0.6 − 0.5)/0.06455 = 1.5492
• Since 𝑧 = 1.5492 < 2.576, we do NOT reject H0 and
conclude that there is insufficient evidence that the
population proportion of rulers longer than 10 cm differs
from the proportion shorter than 10 cm at 1% significance
level.
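The proportion test can be checked the same way. This is a sketch assuming the standard-library `statistics.NormalDist`; the standard error uses the hypothesised 𝑝 = 0.5 as in the solution, and the variable names (`longer`, `p_hat`, and so on) are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.5      # hypothesised proportion under H0
n = 60        # sample size
longer = 36   # rulers in the sample longer than 10 cm
alpha = 0.01  # significance level, two-tailed

std_normal = NormalDist()                    # Z ~ N(0, 1)
p_hat = longer / n                           # sample proportion
std_error = sqrt(p0 * (1 - p0) / n)          # s.d. of the sample proportion under H0
z = (p_hat - p0) / std_error                 # test statistic
z_crit = std_normal.inv_cdf(1 - alpha / 2)   # upper critical value
p_value = 2 * std_normal.cdf(-abs(z))        # two-tailed p-value
reject_h0 = abs(z) > z_crit

print(round(std_error, 5), round(z, 4), round(z_crit, 3), reject_h0)
# 0.06455 1.5492 2.576 False
```

The two-tailed p-value comes out well above 𝛼 = 0.01, consistent with not rejecting H0.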
