Hypothesis Testing and ANOVA

Hypothesis Testing
Workshop #1
What is a Hypothesis?
• A hypothesis is a statement about the value of a population parameter developed for the purpose of testing.
Examples of hypotheses made about a population parameter are:
- The mean monthly income for systems analysts is $3,625.
- Twenty percent of all customers at Bovine’s Chop House return for another meal within a month.
What is Hypothesis Testing?
• Hypothesis testing is a procedure, based on sample evidence and probability theory, used to determine whether the hypothesis is a reasonable statement and should not be rejected, or is unreasonable and should be rejected.
Hypothesis Testing
Step 1: State null and alternate hypotheses
Step 2: Select a level of significance
Step 3: Identify the test statistic
Step 4: Formulate a decision rule
Step 5: Take a sample, arrive at a decision
Decision: do not reject null, or reject null and accept alternate
Hypothesis Testing: Definitions
Step 1: The null hypothesis, denoted H0, is a statement of the basic proposition being tested.
- generally represents the status quo
- not rejected unless there is convincing sample evidence that it is false
The alternative or research hypothesis, denoted Ha or H1, is an alternative to the null hypothesis statement.
- It will be accepted only if there is convincing sample evidence that it is true
Hypothesis Testing: Definitions (cont’d)
Step 2: Level of Significance: The probability of rejecting the
null hypothesis when it is actually true.
Type I Error: Rejecting H0 when it is true
Type II Error: Failing to reject H0 when it is false

                      State of Nature
Conclusion            H0 True             H0 False
Reject H0             Type I Error        Correct Decision
Do not Reject H0      Correct Decision    Type II Error
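To make the two error types concrete, the sketch below computes their probabilities for a two-tailed z test using Python's standard-library `statistics.NormalDist`. The function name `type_ii_error` and all numbers (hypothesized mean 16, assumed true mean 16.2, σ = 0.5, n = 36) are illustrative assumptions, not values prescribed by the slides.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(mu0, mu_true, sigma, n, alpha=0.05):
    """P(fail to reject H0: mu = mu0) when the true mean is mu_true (two-tailed z test)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)    # about 1.96 for alpha = 0.05
    shift = (mu_true - mu0) / (sigma / sqrt(n))   # mean of z when mu_true holds
    # H0 is not rejected whenever z lands between -z_crit and +z_crit
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)

alpha = 0.05   # Type I error probability, fixed in advance by the analyst
beta = type_ii_error(mu0=16, mu_true=16.2, sigma=0.5, n=36, alpha=alpha)
print(f"P(Type I error)  = {alpha:.2f}")
print(f"P(Type II error) = {beta:.2f}")
```

The Type I probability is chosen before sampling, while the Type II probability depends on how far the true mean is from the hypothesized one.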
Step 3: Test statistic: A value, determined from sample information, used to determine whether or not to reject the null hypothesis.
Step 4: Critical value: The dividing point between the region where
the null hypothesis is rejected and the region where it is not
rejected.
- Obtained from table for corresponding statistic
• A one-tailed test is when the alternate hypothesis, H1, states a direction, such as:
- H1: The mean yearly commissions earned by full-time realtors is more than $35,000. (µ > $35,000)
- H1: The mean speed of trucks traveling on I-95 in Georgia is less than 60 miles per hour. (µ < 60)
- H1: Less than 20 percent of the customers pay cash for their gasoline purchase. (p < 0.20)
• A two-tailed test is when no direction is specified in the alternate hypothesis H1, such as:
- H1: The mean amount spent by customers at the Wal-Mart in Georgetown is not equal to $25. (µ ≠ $25)
- H1: The mean price for a gallon of gasoline is not equal to $1.54. (µ ≠ $1.54)
Distribution of level of significance for one-tailed test
Ex. Z statistic for a One-Tailed Test at .05 Level of Significance
[Figure: standard normal density f(x), μ = 0, σ = 1. The region of non-rejection has probability .95; the region of rejection is one tail of area .05 to the right of the critical value z = 1.65.]
Distribution of level of significance for two-tailed test
Example: Z statistic at 0.05 Level of Significance
[Figure: standard normal density f(x), μ = 0, σ = 1. The region of non-rejection has probability .95; the regions of rejection are two tails of area .025 each, beyond the critical values z = ±1.96.]
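The critical values quoted for the one-tailed and two-tailed tests can be looked up in a table or computed. As a minimal sketch, the standard-library `statistics.NormalDist` inverse CDF reproduces them; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()   # mean 0, standard deviation 1
alpha = 0.05

z_one_tailed = std_normal.inv_cdf(1 - alpha)       # all of alpha in one tail
z_two_tailed = std_normal.inv_cdf(1 - alpha / 2)   # alpha split across two tails

print(f"one-tailed critical value: {z_one_tailed:.3f}")
print(f"two-tailed critical value: {z_two_tailed:.3f}")
```

The one-tailed value comes out near 1.645, which the slide rounds to 1.65.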
Testing for the Population Mean: Large Sample, Population Standard Deviation Known
• When testing for the population mean from a large sample (n > 30) and the population standard deviation is known, the test statistic is given by:
$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
EXAMPLE 1
• The processors of Fries’ Catsup indicate on the label that the bottle contains 16 ounces of catsup. The standard deviation of the process is 0.5 ounces. A sample of 36 bottles from last hour’s production revealed a mean weight of 16.12 ounces per bottle. At the .05 significance level, is the process out of control? That is, can we conclude that the mean amount per bottle is different from 16 ounces?
EXAMPLE 1
• Step 1: State the null and the alternative hypotheses:
H0: μ = 16; H1: μ ≠ 16
• Step 2: Select the level of significance.
In this case the significance level is given as 0.05.
• Step 3: Identify the test statistic.
Because we know the population standard deviation, the test statistic is z.
EXAMPLE 1 continued
• Step 4: State the decision rule:
Reject H0 if z > 1.96 or z < -1.96
• Step 5: Compute the value of the test statistic and arrive at a decision.
$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{16.12 - 16.00}{0.5 / \sqrt{36}} = 1.44$
Conclusion: Because 1.44 lies between -1.96 and 1.96, do not reject the null hypothesis. Based on the sample, at the 5% level of significance we cannot conclude the mean is different from 16 ounces.
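The five steps of Example 1 can be retraced in a few lines of Python. This is a sketch using only the figures stated in the example; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma = 16.0, 0.5    # label claim and process standard deviation (ounces)
n, x_bar = 36, 16.12      # sample size and sample mean
alpha = 0.05

z = (x_bar - mu0) / (sigma / sqrt(n))
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
reject_h0 = abs(z) > z_crit

print(f"z = {z:.2f}, critical value = ±{z_crit:.2f}, reject H0: {reject_h0}")
```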
The p-value
The p-value, or observed level of significance, is the probability, when H0 is true, of observing a value of the test statistic at least as extreme as the calculated value, in the direction specified by the alternative hypothesis.
• It measures the weight of the evidence against the null hypothesis and is also the smallest value of α for which we can reject H0.
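As a sketch of how a p-value is obtained for a z statistic, the function below returns the tail area that matches each form of alternative hypothesis. The function name `z_test_p_value` and the labels `'greater'`, `'less'` and `'two-sided'` are assumptions made for illustration.

```python
from statistics import NormalDist

def z_test_p_value(z, alternative):
    """p-value of a z statistic; alternative is 'greater', 'less', or 'two-sided'."""
    std_normal = NormalDist()
    if alternative == "greater":       # Ha: mu > mu0, area to the right of z
        return 1 - std_normal.cdf(z)
    if alternative == "less":          # Ha: mu < mu0, area to the left of z
        return std_normal.cdf(z)
    if alternative == "two-sided":     # Ha: mu != mu0, twice the area right of |z|
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError(f"unknown alternative: {alternative!r}")

# z = 1.44 from the catsup example, tested against H1: mu != 16
p_value = z_test_p_value(1.44, "two-sided")
print(f"p-value = {p_value:.4f}")
```

The resulting p-value is roughly 0.15, which exceeds α = 0.05, so it leads to the same decision as the rejection-point rule: H0 is not rejected.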
Large Sample Tests about Mean: p-Values
If the sampled population is normal or if n is large, we can reject H0: μ = μ0 at the α level of significance (probability of Type I error equal to α) if and only if the appropriate rejection point condition holds or, equivalently, if the corresponding p-value is less than α.

Alternative     Reject H0 if:                                          p-Value
Ha: μ > μ0      z > z_α                                                Area under std normal curve right of z
Ha: μ < μ0      z < -z_α                                               Area under std normal curve left of z
Ha: μ ≠ μ0      |z| > z_{α/2}, that is z > z_{α/2} or z < -z_{α/2}     Twice area under std normal curve right of |z|

Test Statistic
$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$
If σ is unknown and n is large, estimate σ by s.
8.5 Small Sample Tests about a Population Mean
If the sampled population is normal, we can reject H0: μ = μ0 at the α level of significance (probability of Type I error equal to α) if and only if the appropriate rejection point condition holds or, equivalently, if the corresponding p-value is less than α.

Alternative     Reject H0 if:                                          p-Value
Ha: μ > μ0      t > t_α                                                Area under t distribution right of t
Ha: μ < μ0      t < -t_α                                               Area under t distribution left of t
Ha: μ ≠ μ0      |t| > t_{α/2}, that is t > t_{α/2} or t < -t_{α/2}     Twice area under t distribution right of |t|

Test Statistic
$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$
t_α, t_{α/2} and p-values are based on n – 1 degrees of freedom.
8.5 Hypothesis Tests about a Population Proportion
If the sample size n is large, we can reject H0: p = p0 at the α level of significance (probability of Type I error equal to α) if and only if the appropriate rejection point condition holds or, equivalently, if the corresponding p-value is less than α.

Alternative     Reject H0 if:                                          p-Value
Ha: p > p0      z > z_α                                                Area under std normal curve right of z
Ha: p < p0      z < -z_α                                               Area under std normal curve left of z
Ha: p ≠ p0      |z| > z_{α/2}, that is z > z_{α/2} or z < -z_{α/2}     Twice area under std normal curve right of |z|

Test Statistic
$z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}}$
Example 2
A drug company is launching a new drug, ‘Phantol’, and would like to show that it is more effective than another drug called ‘Virol’. Virol has been shown to provide relief to 70% of patients suffering from upper respiratory infection. To do this, the company plans to randomly sample 300 patients having viral upper respiratory infections to attempt to reject the null hypothesis H0: p ≤ 0.70 in favor of the alternative hypothesis Ha: p > 0.70. Here p is the true proportion of all patients whose symptoms are relieved by Phantol. Suppose that Phantol provides relief for 231 of the 300 randomly selected patients. Test whether ‘Phantol’ is more effective than Virol.
Example: Hypothesis Tests about a Proportion
Testing H0: p ≤ 0.70 versus Ha: p > 0.70 using rejection points and p-value.
p: proportion of patients, using Phantol, with reduced severity and duration of viral infections.
$\hat{p} = 231 / 300 = 0.77$
$z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}} = \frac{0.77 - 0.70}{\sqrt{0.70 (1 - 0.70) / 300}} = 2.65$
z = 2.65 > z_.05 = 1.645, z = 2.65 > z_.01 = 2.33, z = 2.65 < z_.001 = 3.09
p-value = P(z > 2.65) = (0.5 - 0.4960) = 0.004
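The Phantol calculation can be checked with a short script. This is a sketch built from the figures given in the example (231 of 300 patients relieved, p0 = 0.70); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.70                 # relief rate under H0 (Virol's rate)
n, successes = 300, 231   # sample size and patients relieved by Phantol

p_hat = successes / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value = 1 - NormalDist().cdf(z)   # right-tail area for Ha: p > 0.70

print(f"p_hat = {p_hat:.2f}, z = {z:.2f}, p-value = {p_value:.4f}")
```

The script works with the unrounded z, so its p-value can differ slightly in the fourth decimal place from the table-based 0.004.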

Summary: Selecting an Appropriate Test Statistic for a
Test about a Population Mean
Statistical Inferences Based on Differences of
Two Samples
Sampling Distribution of $\bar{x}_1 - \bar{x}_2$
If independent random samples are taken from two populations, then the sampling distribution of the sample difference in means $\bar{x}_1 - \bar{x}_2$ is normal if each of the sampled populations is normal, and approximately normal if the sample sizes n1 and n2 are large.
Has mean: $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$
Has standard deviation: $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
Sampling Distribution of $\bar{x}_1 - \bar{x}_2$ (Continued)
Large Sample Confidence Interval, Difference in Mean
If two independent samples are from populations that are normal or each of the sample sizes is large, a 100(1 - α)% confidence interval for μ1 - μ2 is
$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
If σ1 and σ2 are unknown and each of the sample sizes is large (n1, n2 ≥ 30), estimate the population standard deviations by the sample standard deviations s1 and s2; a 100(1 - α)% confidence interval for μ1 - μ2 is then
$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
Large Sample Tests about Differences in Means
If sampled populations are normal or both samples are large, we can reject H0: μ1 - μ2 = D0 at the α level of significance if and only if the appropriate rejection point condition holds or, equivalently, if the corresponding p-value is less than α.

Alternative          Reject H0 if:                                          p-Value
Ha: μ1 - μ2 > D0     z > z_α                                                Area under std normal curve right of z
Ha: μ1 - μ2 < D0     z < -z_α                                               Area under std normal curve left of z
Ha: μ1 - μ2 ≠ D0     |z| > z_{α/2}, that is z > z_{α/2} or z < -z_{α/2}     Twice area under std normal curve right of |z|

Test Statistic
$z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
If the population variances are unknown and the sample sizes are large, substitute sample variances.
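The large-sample interval and test for a difference in means share the same standard error, so both fit in one small function. The sketch below is illustrative only: the function name `two_sample_z` and the summary statistics passed to it are hypothetical, not data from the slides.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(x1_bar, x2_bar, var1, var2, n1, n2, d0=0.0, alpha=0.05):
    """Return (z, two-sided p-value, confidence interval) for mu1 - mu2."""
    std_normal = NormalDist()
    se = sqrt(var1 / n1 + var2 / n2)      # standard deviation of x1_bar - x2_bar
    diff = x1_bar - x2_bar
    z = (diff - d0) / se
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    margin = std_normal.inv_cdf(1 - alpha / 2) * se
    return z, p_value, (diff - margin, diff + margin)

# Hypothetical summary statistics for two large samples
z, p_value, ci = two_sample_z(x1_bar=52.0, x2_bar=50.0,
                              var1=16.0, var2=25.0, n1=100, n2=100)
print(f"z = {z:.3f}, p-value = {p_value:.4f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

With large samples, the sample variances can be passed in place of the unknown population variances.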
Small Sample Confidence Interval, Difference in Mean When Variances are Equal
If two independent samples are from populations that are normal with equal variances, a 100(1 - α)% confidence interval for μ1 - μ2 is
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$
where $s_p^2$ is the pooled variance
$s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$
and t_{α/2} is based on (n1 + n2 – 2) degrees of freedom.
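The pooled-variance interval above can be sketched as follows. The function name `pooled_interval` and the summary statistics are hypothetical. Because the Python standard library has no t distribution, the critical value is passed in from a t table (2.086 is t_.025 for 20 degrees of freedom).

```python
from math import sqrt

def pooled_interval(x1_bar, x2_bar, var1, var2, n1, n2, t_crit):
    """Confidence interval for mu1 - mu2 assuming equal population variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    margin = t_crit * sqrt(pooled_var * (1 / n1 + 1 / n2))
    diff = x1_bar - x2_bar
    return pooled_var, df, (diff - margin, diff + margin)

# Hypothetical summary statistics; t_crit is the table value for df = 20
pooled_var, df, ci = pooled_interval(20.0, 17.0, var1=9.0, var2=7.0,
                                     n1=10, n2=12, t_crit=2.086)
print(f"pooled variance = {pooled_var:.2f}, df = {df}")
print(f"95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
```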
Small Sample Tests about Differences in Means When Variances are Equal
If sampled populations are both normal with equal variances, we can reject H0: μ1 - μ2 = D0 at the α level of significance if and only if the appropriate rejection point condition holds or, equivalently, if the p-value is less than α.

Alternative          Reject H0 if:                                          p-Value
Ha: μ1 - μ2 > D0     t > t_α                                                Area under t distribution right of t
Ha: μ1 - μ2 < D0     t < -t_α                                               Area under t distribution left of t
Ha: μ1 - μ2 ≠ D0     |t| > t_{α/2}, that is t > t_{α/2} or t < -t_{α/2}     Twice area under t distribution right of |t|

Test Statistic
$t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$
Pooled Variance
$s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$
t_α, t_{α/2} and p-values are based on (n1 + n2 – 2) df.
Small Sample Intervals and Tests about Differences in Means When Variances are Not Equal
If sampled populations are both normal, but sample sizes and variances differ substantially, small-sample estimation and testing can be based on the following “unequal variance” procedure.
Confidence Interval
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
Test Statistic
$t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
For both the interval and test, the degrees of freedom are equal to
$df = \frac{(s_1^2 / n_1 + s_2^2 / n_2)^2}{\frac{(s_1^2 / n_1)^2}{n_1 - 1} + \frac{(s_2^2 / n_2)^2}{n_2 - 1}}$
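The degrees-of-freedom formula is easy to get wrong by hand, so a small helper is useful. The sketch below implements the formula as stated; the function name `welch_df` and the sample variances and sizes are hypothetical.

```python
def welch_df(var1, var2, n1, n2):
    """Approximate degrees of freedom for the unequal-variance procedure."""
    a, b = var1 / n1, var2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

# Hypothetical sample variances and sizes
df = welch_df(var1=4.0, var2=36.0, n1=8, n2=15)
print(f"df = {df:.2f}")
```

The result is generally not an integer; a common convention is to round it down before consulting a t table.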
Paired Difference Interval for Difference in Mean
If the sampled population of differences is normally distributed with mean μ_d, then a 100(1 - α)% confidence interval for μ_d is
$\bar{d} \pm t_{\alpha/2} \frac{s_d}{\sqrt{n}}$
t_{α/2} is based on n – 1 degrees of freedom.

Paired Difference Test for Difference in Mean
If the population of differences is normal, we can reject H0: μ_d = D0 at the α level of significance (probability of Type I error equal to α) if and only if the appropriate rejection point condition holds or, equivalently, if the corresponding p-value is less than α.

Alternative      Reject H0 if:                                          p-Value
Ha: μ_d > D0     t > t_α                                                Area under t distribution right of t
Ha: μ_d < D0     t < -t_α                                               Area under t distribution left of t
Ha: μ_d ≠ D0     |t| > t_{α/2}, that is t > t_{α/2} or t < -t_{α/2}     Twice area under t distribution right of |t|

Test Statistic
$t = \frac{\bar{d} - D_0}{s_d / \sqrt{n}}$
t_α, t_{α/2} and p-values are based on n – 1 degrees of freedom.
Example: Paired Difference Interval and Test
Table 9.3
Car        Garage 1    Garage 2    Difference
Car 1      $ 7.10      $ 7.90      -0.8
Car 2        9.00       10.10      -1.1
Car 3       11.00       12.20      -1.2
Car 4        8.90        8.80       0.1
Car 5        9.90       10.40      -0.5
Car 6        9.10        9.80      -0.7
Car 7       10.30       11.70      -1.4
Mean                               -0.8
Std Dev                             0.5033
[Figure: Excel test output]
95% Confidence Interval
$\bar{d} \pm t_{\alpha/2} \frac{s_d}{\sqrt{n}} = -0.8 \pm 2.447 \left( \frac{0.5033}{\sqrt{7}} \right) = -0.8 \pm 0.4654 = [-1.2654, -0.3346]$
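The garage example can be reproduced from the raw estimates. This is a sketch using the seven paired values from the table and the slide's critical value 2.447 (t_.025 with 6 degrees of freedom). The test statistic at the end assumes H0: μ_d = 0, which the slide does not state explicitly.

```python
from math import sqrt
from statistics import mean, stdev

garage_1 = [7.10, 9.00, 11.00, 8.90, 9.90, 9.10, 10.30]
garage_2 = [7.90, 10.10, 12.20, 8.80, 10.40, 9.80, 11.70]

diffs = [g1 - g2 for g1, g2 in zip(garage_1, garage_2)]
n = len(diffs)
d_bar, s_d = mean(diffs), stdev(diffs)   # stdev uses the n - 1 divisor

t_crit = 2.447                           # table value used on the slide
margin = t_crit * s_d / sqrt(n)
ci = (d_bar - margin, d_bar + margin)
t = (d_bar - 0) / (s_d / sqrt(n))        # assumed H0: mu_d = 0

print(f"mean = {d_bar:.4f}, std dev = {s_d:.4f}")
print(f"95% CI = [{ci[0]:.4f}, {ci[1]:.4f}], t = {t:.3f}")
```

The computed interval matches the slide's up to rounding in the last decimal place. Both endpoints are negative, so the interval excludes zero.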
