
Week 6

Analysis of Variance
(ANOVA)
Week 6 - Learning Objectives
• Recognize situations in which to use Analysis of Variance (ANOVA)
• Understand different ANOVA designs
• Perform a single-factor hypothesis test and interpret the results
• Conduct and interpret post-hoc multiple comparison procedures

Week 6-2
Parametric Hypothesis Testing

So far we have discussed parametric tests for:

• One sample
• Two independent samples
• Two dependent samples or paired samples
• More than two independent samples ***

Week 6-3
Chapter Overview

Analysis of Variance (ANOVA)
• One-Way ANOVA
  • F-test
  • Tukey-Kramer test

Week 6-4
ANOVA

ANalysis Of VAriance
• One-Way ANOVA / Completely Randomized Design
• Two-Way ANOVA / ANOVA for Randomized Block Design (not discussed in this chapter)
Week 6-5
General ANOVA Setting
• Investigator controls one or more independent variables
  • Called factors (or treatment variables)
  • Each factor contains two or more levels
• Observe effects on the dependent variable
  • Response to levels of the independent variable
• Experimental design: the plan used to collect the data

Week 6-6
Completely Randomized Design

• Experimental units (subjects) are assigned randomly to treatments
  • Subjects are assumed homogeneous
• Only one factor or independent variable
  • With two or more treatment levels
• Analyzed by one-factor analysis of variance (one-way ANOVA)

Week 6-7
The Assumptions of One-Way
ANOVA Hypothesis Testing

Before we test, we must assume that:

• the populations are normally distributed;
• the samples are randomly and independently selected;
• the sample sizes are large enough;
• the populations have the same variance (each group has equal population variance).

Week 6-8
The Steps in One-Way ANOVA
Hypothesis Testing
• Step 1: State the null and alternative hypotheses.
• Step 2: State the significance level.
• Step 3: State the decision rule that determines the rejection region(s).
• Step 4: Find the critical value of the test from the statistical table.
• Step 5: Determine the appropriate statistical technique and the test statistic to use.
• Step 6: Compare the test statistic value with the critical value to make the statistical decision.
• Step 7: Draw the conclusion.

Week 6-9
The Steps in One-Way ANOVA
Hypothesis Testing (continued)

Additional steps (more information):

• Find the p-value from the statistical table, based on the test statistic value.

• Pairwise comparison: conduct the Tukey-Kramer procedure.

Week 6-10
Step 1: State the null and
alternative hypotheses
• H0: μ1 = μ2 = μ3 = … = μk
  • All population means are equal
  • There is no treatment effect (no variation in means among groups)

• H1: At least one μi is not equal; i = 1, 2, 3, …, k
  • Not all population means are equal
  • There is a treatment effect
  • This does not mean that all population means are different (some pairs may still be statistically the same)

Week 6-11
Step 1 (continued)

H0: μ1 = μ2 = μ3 = … = μk
All population means are equal:
the null hypothesis is true
(no treatment effect)

μ1 = μ2 = μ3
Week 6-12
Step 1 (continued)
H1: At least one μi is different; i = 1, 2, 3, …, k
At least one population mean is different:
the null hypothesis is NOT true
(a treatment effect is present)

μ1 = μ2 ≠ μ3   or   μ1 ≠ μ2 ≠ μ3
Week 6-13
Step 2: State the significance
level
What is the significance level?
• It is the criterion for rejecting the null hypothesis, fixed before the test is carried out.
• It is the probability of rejecting H0 given that H0 is true, set by the researcher at the desired level.
• Researchers normally set the significance level at 1%, 5% or 10%.

Significance level = α = P(Type I error)


Week 6-14
Step 3: State the decision rule

• In the decision rule, we reject the null hypothesis only when the test statistic value is greater than the upper critical value.

Decision Rule:
• Reject H0 if F > FU; otherwise do not reject H0.

[Figure: F distribution starting at 0, with the "do not reject H0" region below the upper critical value FU and the "reject H0" region (α = .05) above it]
Week 6-15
Step 4: Find critical value
• Each F value has two degrees of freedom:
  • v1 = df1 = k - 1 (numerator; typically smaller)
  • v2 = df2 = n - k (denominator; typically larger)

k = the number of independent groups
n = the sum of sample sizes from all populations = n1 + n2 + … + nk

The upper critical value is

$$F_U = F_{\alpha,\, k-1,\, n-k}$$
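As a rough illustration of Step 4, the sketch below computes the two degrees of freedom for three groups of five observations (the golf-club example used later in this deck). The critical-value helper is an assumption made for illustration: it relies on the closed form P(F > f) = (1 + 2f/df2)^(-df2/2), which holds only when df1 = 2 (that is, k = 3). For any other number of groups, read the value from the F table or use statistical software.

```python
# Group sizes from the golf-club example: three clubs, five shots each.
group_sizes = [5, 5, 5]

k = len(group_sizes)   # number of independent groups
n = sum(group_sizes)   # total number of observations
df1 = k - 1            # numerator (among-groups) degrees of freedom
df2 = n - k            # denominator (within-groups) degrees of freedom


def f_upper_critical_df1_is_2(alpha, df2):
    """Upper-tail critical value of F(2, df2).

    Hypothetical helper, valid ONLY for df1 = 2, where the tail
    probability (1 + 2f/df2) ** (-df2/2) can be inverted by hand.
    """
    return (df2 / 2.0) * (alpha ** (-2.0 / df2) - 1.0)


f_crit = f_upper_critical_df1_is_2(0.05, df2)
print(df1, df2, round(f_crit, 2))
```

The rounded result agrees with the table value FU = 3.89 used in the worked example.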
Week 6-16
Step 5: Compute the test statistic

• The F statistic is the ratio of the among-groups estimate of variance to the within-groups estimate of variance. (The ratio can never be negative: F ≥ 0. Values near 1 are expected when H0 is true.)

Before you compute the test statistic value, you need to develop the ANOVA table.

Week 6-17
Step 5 (continued)
One-Way ANOVA Table

Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square         | F statistic
Among Groups        | SSA            | k - 1              | MSA = SSA / (k - 1) | F = MSA / MSW
Within Groups       | SSW            | n - k              | MSW = SSW / (n - k) |
Total               | SST            | n - 1              |                     |

F = MSA / MSW is the test statistic.
Week 6-18
Step 5 (continued)
Partition of Total Variation

Total Variation (SST) = Variation Due to Factor (SSA) + Variation Due to Random Sampling (SSW)

SSA is commonly referred to as:
• Sum of Squares Between
• Sum of Squares Among
• Sum of Squares Explained
• Among Groups Variation

SSW is commonly referred to as:
• Sum of Squares Within
• Sum of Squares Error
• Sum of Squares Unexplained
• Within Groups Variation
Week 6-19
Step 5 (continued)

• Total variation can be split into two parts:

SST = SSA + SSW

(also written SST = SSR + SSE, with SSR = SSA and SSE = SSW)
SST = Total Sum of Squares
(Total variation)
SSA = Sum of Squares Among Groups
(Among-group variation)
SSW = Sum of Squares Within Groups
(Within-group variation)
Week 6-20
Step 5 (continued)

SST = SSA + SSW

Total Variation = the aggregate dispersion of the individual data values across the various factor levels (SST)

Among-Group Variation = dispersion between the factor sample means (SSA)

Within-Group Variation = dispersion that exists among the data values within a particular factor level (SSW)

Week 6-21
Step 5 (continued)

Total Sum of Squares / Total Variation

SST = SSA + SSW

$$SST = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2$$

Where:
SST = Total sum of squares
k = number of groups (levels or treatments)
nj = number of observations in group j
Xij = ith observation from group j
X̄ = grand mean (mean of all data values)
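A minimal sketch of the SST formula, applied to the golf-club distances from the worked example later in this deck. The variable names are illustrative only.

```python
# Distances from the golf-club example (five shots per club).
groups = {
    "Club 1": [254, 263, 241, 237, 251],
    "Club 2": [234, 218, 235, 227, 216],
    "Club 3": [200, 222, 197, 206, 204],
}

# Pool every observation, ignoring group membership.
all_values = [x for values in groups.values() for x in values]

# Grand mean: the mean of all data values.
grand_mean = sum(all_values) / len(all_values)

# SST: squared deviation of every observation from the grand mean.
sst = sum((x - grand_mean) ** 2 for x in all_values)

print(grand_mean, round(sst, 1))
```

The result matches the Total row (5836.0) of the Excel output shown for this example.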
Week 6-22
Step 5 (continued)

$$SST = (X_{11} - \bar{X})^2 + (X_{21} - \bar{X})^2 + \dots + (X_{n_k k} - \bar{X})^2$$

[Figure: Response, X plotted by group (Group 1, Group 2, Group 3)]

Week 6-23
Step 5 (continued)

Sum of Squares Among the Groups / Among-Group Variation

SST = SSA + SSW

$$SSA = \sum_{j=1}^{k} n_j (\bar{X}_j - \bar{X})^2$$

Where:
SSA = Sum of squares among groups
k = number of groups or populations
nj = sample size from group j
X̄j = sample mean from group j
X̄ = grand mean (mean of all data values)
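The SSA formula can be sketched in the same way, again with the golf-club distances from the worked example; names are illustrative.

```python
# Distances from the golf-club example (five shots per club).
groups = {
    "Club 1": [254, 263, 241, 237, 251],
    "Club 2": [234, 218, 235, 227, 216],
    "Club 3": [200, 222, 197, 206, 204],
}

n_total = sum(len(values) for values in groups.values())
grand_mean = sum(sum(values) for values in groups.values()) / n_total

# SSA: for each group, weight the squared distance between the group
# mean and the grand mean by the group's sample size n_j.
ssa = 0.0
for values in groups.values():
    group_mean = sum(values) / len(values)
    ssa += len(values) * (group_mean - grand_mean) ** 2

print(round(ssa, 1))
```

This reproduces SSA = 4716.4 from the example computations.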

Week 6-24
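Step 5 (continued)

$$SSA = n_1 (\bar{x}_1 - \bar{x})^2 + n_2 (\bar{x}_2 - \bar{x})^2 + \dots + n_k (\bar{x}_k - \bar{x})^2$$

[Figure: Response, X plotted by group (Group 1, Group 2, Group 3), marking the group means X̄1, X̄2, X̄3 and the grand mean X̄]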


Week 6-25
Step 5 (continued)

Sum of Squares Within Groups / Within-Group Variation

SST = SSA + SSW

$$SSW = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$$

Where:
SSW = Sum of squares within groups
k = number of groups
nj = sample size from group j
X̄j = sample mean from group j
Xij = ith observation in group j
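A short sketch of the SSW formula on the golf-club data from the worked example. Each observation is compared with the mean of its own group rather than with the grand mean.

```python
# Distances from the golf-club example (five shots per club).
groups = {
    "Club 1": [254, 263, 241, 237, 251],
    "Club 2": [234, 218, 235, 227, 216],
    "Club 3": [200, 222, 197, 206, 204],
}

# SSW: squared deviation of each observation from its own group mean,
# summed over all groups.
ssw = 0.0
for values in groups.values():
    group_mean = sum(values) / len(values)
    ssw += sum((x - group_mean) ** 2 for x in values)

print(round(ssw, 1))
```

This reproduces SSW = 1119.6. Adding the among-group value 4716.4 gives 5836.0, in line with the partition SST = SSA + SSW.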
Week 6-26
Step 5 (continued)

$$SSW = (X_{11} - \bar{X}_1)^2 + (X_{21} - \bar{X}_1)^2 + \dots + (X_{n_k k} - \bar{X}_k)^2$$

[Figure: Response, X plotted by group (Group 1, Group 2, Group 3), marking the group means X̄1, X̄2, X̄3]


Week 6-27
Step 5 (continued)
Obtaining the Mean Squares = Sum of Squares / Degrees of Freedom

Mean Square Among Groups

$$MSA = \frac{SSA}{k - 1}$$

= SSA / degrees of freedom for the sum of squares among groups

Mean Square Within Groups

$$MSW = \frac{SSW}{n - k}$$

= SSW / degrees of freedom for the sum of squares within groups


Week 6-28
Step 5 (continued)

• The test statistic for One-Way ANOVA is computed as

$$F = \frac{MSA}{MSW}$$

MSA is the mean square among groups
MSW is the mean square within groups
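Putting Step 5 together, the sketch below builds the entries of the one-way ANOVA table and the F statistic. The function name `one_way_anova` and the dictionary keys are hypothetical choices for this illustration, and the data are the golf-club distances from the worked example.

```python
def one_way_anova(groups):
    """Return the one-way ANOVA table entries for a list of samples."""
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total sample size
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    # Among-groups and within-groups sums of squares.
    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    # Mean squares: sum of squares divided by its degrees of freedom.
    msa = ssa / (k - 1)
    msw = ssw / (n - k)

    return {
        "SSA": ssa, "SSW": ssw, "SST": ssa + ssw,
        "df_among": k - 1, "df_within": n - k,
        "MSA": msa, "MSW": msw, "F": msa / msw,
    }


table = one_way_anova([
    [254, 263, 241, 237, 251],   # Club 1
    [234, 218, 235, 227, 216],   # Club 2
    [200, 222, 197, 206, 204],   # Club 3
])
print(round(table["MSA"], 1), round(table["MSW"], 1), round(table["F"], 3))
```

For the golf-club data this gives MSA = 2358.2, MSW = 93.3 and F of about 25.275, the values shown in the example slides.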

Week 6-29
Step 6: State the decision

• The decision is whether to reject or not reject H0, made by comparing the F test statistic value with the upper critical value (from the F table).

Week 6-30
Step 7: Draw the conclusion

State whether the population means of the different groups are significantly different or not, based on the decision made at the chosen significance level.

Week 6-31
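Additional step: Find p-value

The p-value is the probability, computed assuming H0 is true, of obtaining a test statistic at least as extreme as the value actually observed.

Equivalently, the p-value is the smallest significance level at which H0 would be rejected for the observed data. It is not the probability that H0 is true, nor the probability of having committed a Type I error after the conclusion is drawn.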
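To make the definition concrete, here is a hedged sketch that computes the upper-tail probability P(F > f). The helper is an assumption for illustration: it uses the closed form (1 + 2f/df2)^(-df2/2), which is exact only when df1 = 2 (three groups). For other designs, use the F table or software. The inputs F = 25.275 and df2 = 12 come from the golf-club example.

```python
def f_p_value_df1_is_2(f_stat, df2):
    """Upper-tail probability P(F > f_stat) for an F(2, df2) distribution.

    Hypothetical helper: this closed form holds ONLY for df1 = 2.
    """
    return (1.0 + 2.0 * f_stat / df2) ** (-df2 / 2.0)


alpha = 0.05
p_value = f_p_value_df1_is_2(25.275, 12)

# Reject H0 when the p-value falls below the significance level.
print(f"{p_value:.2e}", p_value < alpha)
```

The value is close to the 4.99E-05 that Excel reports for this example, far below α = 0.05, so H0 is rejected.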

Week 6-32
Additional step: Find p-value
(continued)
How to find the p-value?

• Use the F table → bounded p-value
• Use computer software (Microsoft Excel) → actual p-value

Week 6-33
Additional Step: Tukey-Kramer Test
This test is more meaningful when we have sufficient evidence to reject H0 in One-Way ANOVA hypothesis testing.

• Tells which population means are significantly different
  • e.g.: μ1 = μ2 ≠ μ3
• Allows pair-wise comparisons
  • Compare absolute mean differences with the critical range

[Figure: x axis showing μ1 = μ2 and a distinct μ3]

Week 6-34
The Steps in Tukey-Kramer
Hypothesis Testing
• Step 1: State the null and alternative hypotheses.
• Step 2: State the significance level.
• Step 3: State the decision rule that determines the rejection region(s).
• Step 4: Find the critical value of the test from the statistical table.
• Step 5: Determine the appropriate statistical technique and the test statistic to use.
• Step 6: Compare the test statistic value with the critical value to make the statistical decision.
• Step 7: Draw the conclusion.
Week 6-35
Additional Step: Tukey-Kramer
Test (continued)

1. H0: μi = μj
   H1: μi ≠ μj, where i ≠ j
2. Significance level: α
3. Decision rule: We reject H0 when the test statistic is greater than the critical value.

Week 6-36
Additional Step: Tukey-Kramer
Test (continued)
4. Tukey-Kramer Critical Range

$$\text{Critical Range} = Q_U \sqrt{\frac{MSW}{2}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}, \qquad Q_U = Q_{\alpha,\,k,\,n-k}$$

where:
QU = value from the Studentized Range Distribution with k and n - k degrees of freedom for the desired level of α (see appendix E.9 table)
MSW = Mean Square Within
ni and nj = sample sizes from groups i and j
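A small sketch of the critical-range formula. The function name is hypothetical, and QU is treated as an input read from the Studentized Range table rather than computed. The numbers (QU = 3.77, MSW = 93.3, five observations per group) are those of the golf-club example in this deck.

```python
import math


def tukey_kramer_critical_range(q_upper, msw, n_i, n_j):
    """Critical range for comparing the means of groups i and j.

    q_upper is the tabulated Studentized Range value Q(alpha, k, n - k);
    it must be looked up, this function does not compute it.
    """
    return q_upper * math.sqrt((msw / 2.0) * (1.0 / n_i + 1.0 / n_j))


critical_range = tukey_kramer_critical_range(3.77, 93.3, 5, 5)
print(round(critical_range, 3))
```

With equal group sizes the same critical range applies to every pair; with unequal sizes it has to be recomputed for each pair.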

Week 6-37
Additional Step: Tukey-Kramer
Test (continued)
5. Test Statistic

$$\text{Absolute difference of two sample means} = |\bar{X}_i - \bar{X}_j|$$

where:
X̄i = sample mean from group i
X̄j = sample mean from group j
Week 6-38
Additional Step: Tukey-Kramer
Test (continued)

6. Decision: If the test statistic value is greater than the critical range, the sample provides strong evidence to reject H0; otherwise, do not reject H0.

7. Conclusion: State which groups are significantly different at the chosen significance level.
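Steps 5 to 7 can be sketched as a loop over every pair of groups. The sample means, group sizes, MSW and QU below are taken from the golf-club example in this deck; the dictionary layout and variable names are illustrative assumptions.

```python
import math
from itertools import combinations

# Inputs from the golf-club example; Q_U is read from the Q table.
means = {"Club 1": 249.2, "Club 2": 226.0, "Club 3": 205.8}
sizes = {"Club 1": 5, "Club 2": 5, "Club 3": 5}
msw = 93.3
q_upper = 3.77

significant = {}
for a, b in combinations(means, 2):
    # Critical range for this particular pair of group sizes.
    critical_range = q_upper * math.sqrt(
        (msw / 2.0) * (1.0 / sizes[a] + 1.0 / sizes[b])
    )
    # Test statistic: absolute difference of the two sample means.
    difference = abs(means[a] - means[b])
    # Reject H0 (mu_a = mu_b) when the difference exceeds the range.
    significant[(a, b)] = difference > critical_range
    print(a, "vs", b, round(difference, 1), significant[(a, b)])
```

For these inputs every pair exceeds the critical range, so each pair of means is declared significantly different.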

Week 6-39
One-Factor ANOVA
F Test Example

You want to see if three different golf clubs yield different distances. You randomly select five measurements from trials on an automated driving machine for each club. At the 0.05 significance level, is there a difference in mean distance?

Club 1   Club 2   Club 3
254      234      200
263      218      222
241      235      197
237      227      206
251      216      204

Week 6-40
One-Factor ANOVA Example:
Scatter Diagram

[Figure: scatter diagram of Distance (axis from 190 to 270) against Club (1, 2, 3), marking the sample means X̄1, X̄2, X̄3 and the grand mean X̄]

x̄1 = 249.2   x̄2 = 226.0   x̄3 = 205.8
x̄ = 227.0
Week 6-41
One-Factor ANOVA Example
Computations

Club 1   Club 2   Club 3
254      234      200
263      218      222
241      235      197
237      227      206
251      216      204

X̄1 = 249.2   n1 = 5
X̄2 = 226.0   n2 = 5
X̄3 = 205.8   n3 = 5
X̄ = 227.0    n = 15, k = 3

SSA = 5(249.2 – 227)² + 5(226 – 227)² + 5(205.8 – 227)² = 4716.4
SSW = (254 – 249.2)² + (263 – 249.2)² + … + (204 – 205.8)² = 1119.6

MSA = 4716.4 / (3 - 1) = 2358.2
MSW = 1119.6 / (15 - 3) = 93.3

F = 2358.2 / 93.3 = 25.275

Week 6-42
One-Factor ANOVA Example
Solution

Step 1: H0: μ1 = μ2 = μ3
        H1: At least one μi is not equal, where i = 1, 2, 3
Step 2: α = 0.05
Step 3: Decision rule: Reject H0 if F > FU; otherwise do not reject H0
Step 4: Critical value: df1 = 2, df2 = 12, FU = 3.89
Step 5: Test statistic: F = MSA / MSW = 2358.2 / 93.3 = 25.275
Step 6: Decision: Reject H0 at α = 0.05, since F = 25.275 > FU = 3.89
Step 7: Conclusion: There is evidence that at least one mean distance μi differs from the rest at α = 0.05.

[Figure: F distribution with the rejection region (α = .05) to the right of FU = 3.89; the test statistic F = 25.275 falls in the rejection region]
Week 6-43
ANOVA -- Single Factor:
Excel Output
EXCEL: tools | data analysis | ANOVA: single factor

SUMMARY
Groups   Count   Sum    Average   Variance
Club 1   5       1246   249.2     108.2
Club 2   5       1130   226       77.5
Club 3   5       1029   205.8     94.2

ANOVA
Source of Variation   SS       df   MS       F        P-value    F crit
Between Groups        4716.4   2    2358.2   25.275   4.99E-05   3.89
Within Groups         1119.6   12   93.3
Total                 5836.0   14
Week 6-44
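The SUMMARY block of the Excel output can be reproduced with Python's standard `statistics` module, as a rough cross-check rather than a replacement for Excel. The sketch also shows that the pooled sample variance, weighted by each group's n - 1, equals the MSW in the ANOVA table. Variable names are illustrative.

```python
import statistics

clubs = {
    "Club 1": [254, 263, 241, 237, 251],
    "Club 2": [234, 218, 235, 227, 216],
    "Club 3": [200, 222, 197, 206, 204],
}

# Count, sum, average and sample variance (n - 1 divisor) per group.
summary = {
    name: (len(v), sum(v), statistics.mean(v), statistics.variance(v))
    for name, v in clubs.items()
}
for name, (count, total, average, variance) in summary.items():
    print(name, count, total, round(average, 1), round(variance, 1))

# Pooled variance: each group's variance weighted by its n - 1.
pooled_variance = sum(
    (len(v) - 1) * statistics.variance(v) for v in clubs.values()
) / sum(len(v) - 1 for v in clubs.values())
print(round(pooled_variance, 1))
```

The pooled variance comes out at 93.3, the Within Groups mean square in the table above.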
Question: Pairwise Comparison
• At the 0.05 significance level, determine which clubs differ in mean distance.

Week 6-45
The Tukey-Kramer Procedure:
Example

Step 1: H0: μi = μj
        H1: μi ≠ μj, where i ≠ j
Step 2: Significance level: α = 0.05
Step 3: Decision rule: We reject H0 if the test statistic is greater than the critical value.
Step 4: Find the QU value from the Studentized Range Q table with k = 3 and (n – k) = (15 – 3) = 12 degrees of freedom for the desired level of α (α = .05 used here):

QU = 3.77

Step 5: Test statistics. Compute the absolute mean differences:
|x̄1 – x̄2| = |249.2 – 226.0| = 23.2
|x̄1 – x̄3| = |249.2 – 205.8| = 43.4
|x̄2 – x̄3| = |226.0 – 205.8| = 20.2
Week 6-46
The Tukey-Kramer Procedure:
Example (continued)

Step 4 (continued): Compute the critical range:

$$\text{Critical Range} = Q_U \sqrt{\frac{MSW}{2}\left(\frac{1}{n_j} + \frac{1}{n_{j'}}\right)} = 3.77 \sqrt{\frac{93.3}{2}\left(\frac{1}{5} + \frac{1}{5}\right)} = 16.285$$

Step 5: Compare each absolute mean difference with the critical range:
|x̄1 – x̄2| = 23.2
|x̄1 – x̄3| = 43.4
|x̄2 – x̄3| = 20.2

Step 6: Reject H0 for each pair of means, because all of the absolute mean differences are greater than the critical range.
Step 7: Conclusion: There is therefore a significant difference between each pair of means at the 5% level of significance.

Week 6-47
Summary
• Described one-way analysis of variance
  • The logic of ANOVA
  • ANOVA assumptions
  • F test for difference in k means
  • The Tukey-Kramer procedure for multiple comparisons

Week 6-49
Week 6-50
Assignment

• Form groups of 5-6 students from the same tutorial group.

• The assignment must be submitted to your tutor in hard copy by 5 p.m. on Monday of Week 10 (31 July 2017) at the latest.

• The hard copy should not exceed 10 pages, excluding the cover page and appendix.

Week 6-51
• Select a combination of THREE (3) countries and ONE (1) variable.

• Refer to your respective tutor for the registration.

Week 6-52
CODE Countries

A Angola
B Botswana
C Chad
D Colombia
E El Salvador
F Germany
G Ireland
H Malaysia
I Moldova
J Slovenia
Week 6-53
Variables                                      | Tutors
1. Trade (% of GDP)                            | Dr Au Yong Hui Nee
2. Food production index (2004-2006 = 100)     | Mr Cheah Siew Pong
3. Consumer price index (2010 = 100)           | Ms Kalaivani
4. GDP growth (annual %)                       | Ms Tan Yan Teng
5. CO2 emissions (metric tons per capita)      | Mr Thurai Murugan
6. FDI net inflows (% of GDP)                  | Ms Vinothiny

Week 6-54
Part A

• Illustrate the background of the chosen variable in the 3 countries. Make use of graphical presentation, tables and descriptive statistics (mean, standard deviation) to enhance the content.
(8 marks)

Week 6-55
Part B

Week 6-56
Part C

• What conclusion can be reached? (8 marks)

Week 6-57
