An Introduction to Analysis of Variance

Analysis of Variance (ANOVA) can be used to test for the equality of three or more population means using data obtained from observational or experimental studies.
We want to use the sample results to test the following hypotheses.

H0: μ1 = μ2 = μ3 = ... = μk
Ha: Not all population means are equal
If H0 is rejected, we cannot conclude that all
population means are different.
Rejecting H0 means that at least two population
means have different values.
Slide 1
Assumptions for Analysis of Variance

For each population, the response variable is normally distributed.
The variance of the response variable, denoted σ², is the same for all of the populations.
The observations must be independent.

Slide 2
Analysis of Variance:
Testing for the Equality of K Population Means
Between-Samples Estimate of Population Variance
Within-Samples Estimate of Population Variance
Comparing the Variance Estimates: The F Test
The ANOVA Table

Slide 3
Between-Samples Estimate
of Population Variance
A between-samples estimate of σ² is called the mean square between (MSB).

$$\mathrm{MSB} = \frac{\sum_{j=1}^{k} n_j \left(\bar{x}_j - \bar{\bar{x}}\right)^2}{k - 1}$$
The numerator of MSB is called the sum of squares
between (SSB).
The denominator of MSB represents the degrees of
freedom associated with SSB.
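
As a minimal sketch of the MSB calculation, the Python function below works from per-sample sizes and sample means. The function name and the three-sample summary values are hypothetical, chosen only for illustration; the overall mean is taken as the size-weighted average of the sample means, which is an assumption that covers unequal sample sizes.

```python
def mean_square_between(sizes, means):
    """Return (SSB, MSB) from per-sample sizes n_j and sample means."""
    k = len(sizes)
    n_total = sum(sizes)
    # Overall mean: sample means weighted by their sample sizes.
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / n_total
    # Numerator of MSB: sum of squares between (SSB).
    ssb = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    # Denominator of MSB: k - 1 degrees of freedom.
    return ssb, ssb / (k - 1)


# Hypothetical summary statistics, not taken from the slides.
sizes = [2, 3, 5]
means = [10.0, 20.0, 14.0]
ssb, msb = mean_square_between(sizes, means)
print(ssb, msb)
```

With these made-up inputs the overall mean is 15, so SSB is 2(25) + 3(25) + 5(1) = 130 and MSB is 130/2 = 65.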

Slide 4
Within-Samples Estimate
of Population Variance
The estimate of σ² based on the variation of the sample observations within each sample is called the mean square within (MSW).

$$\mathrm{MSW} = \frac{\sum_{j=1}^{k} (n_j - 1)\, s_j^2}{n_T - k}$$
The numerator of MSW is called the sum of squares
within (SSW).
The denominator of MSW represents the degrees of
freedom associated with SSW.
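
The MSW calculation can be sketched the same way from per-sample sizes and sample variances. The function name and the input values below are hypothetical and serve only to show the arithmetic.

```python
def mean_square_within(sizes, variances):
    """Return (SSW, MSW) from per-sample sizes n_j and sample variances s_j^2."""
    k = len(sizes)
    n_total = sum(sizes)
    # Numerator of MSW: sum of squares within (SSW).
    ssw = sum((n - 1) * v for n, v in zip(sizes, variances))
    # Denominator of MSW: n_T - k degrees of freedom.
    return ssw, ssw / (n_total - k)


# Hypothetical summary statistics, not taken from the slides.
sizes = [3, 4, 6]
variances = [4.0, 2.0, 2.0]
ssw, msw = mean_square_within(sizes, variances)
print(ssw, msw)
```

Here SSW is 2(4) + 3(2) + 5(2) = 24 on 13 - 3 = 10 degrees of freedom, so MSW is 2.4.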

Slide 5
Comparing the Variance Estimates: The F Test

If the null hypothesis is true and the ANOVA assumptions are valid, the sampling distribution of MSB/MSW is an F distribution with MSB d.f. equal to k - 1 and MSW d.f. equal to nT - k.
If the means of the k populations are not equal, the value of MSB/MSW will be inflated because MSB overestimates σ².
Hence, we will reject H0 if the resulting value of MSB/MSW appears to be too large to have been selected at random from the appropriate F distribution.
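
One way to see this sampling distribution at work is a small simulation, sketched below under stated assumptions. It draws k = 3 samples of 5 observations from the same normal population, so H0 is true by construction, and counts how often MSB/MSW exceeds 3.89, the 5% critical value for 2 and 12 d.f. quoted in the Reed Manufacturing example. The population mean of 50, standard deviation of 5, seed, and number of replications are arbitrary illustrative choices.

```python
import random


def f_statistic(samples):
    """Return MSB / MSW for a list of samples (each a list of numbers)."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    means = [sum(s) / len(s) for s in samples]
    grand_mean = sum(sum(s) for s in samples) / n_total
    ssb = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    # Summing squared deviations from each sample mean equals sum((n_j - 1) * s_j^2).
    ssw = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    return (ssb / (k - 1)) / (ssw / (n_total - k))


random.seed(0)            # fixed seed so the run is repeatable
replications = 20000      # arbitrary number of simulated studies
critical_value = 3.89     # F.05 with 2 and 12 d.f., as quoted in the slides

exceed = 0
for _ in range(replications):
    # All three samples come from the same population, so H0 holds.
    samples = [[random.gauss(50, 5) for _ in range(5)] for _ in range(3)]
    if f_statistic(samples) > critical_value:
        exceed += 1

rejection_rate = exceed / replications
print(rejection_rate)
```

If the assumptions hold, the printed rejection rate should land close to 0.05, since a true H0 is rejected only when the ratio happens to fall in the upper 5% tail.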

Slide 6
Test for the Equality of k Population Means

Hypotheses
H0: μ1 = μ2 = μ3 = ... = μk
Ha: Not all population means are equal
Test Statistic
F = MSB/MSW
Rejection Rule
Reject H0 if F > Fα
where the value of Fα is based on an F distribution with k - 1 numerator degrees of freedom and nT - k denominator degrees of freedom.

Slide 7
Example: Reed Manufacturing

Analysis of Variance
J. R. Reed would like to know if the mean number of
hours worked per week is the same for the department
managers at her three manufacturing plants (Buffalo,
Pittsburgh, and Detroit).
A simple random sample of 5 managers from each of
the three plants was taken and the number of hours
worked by each manager for the previous week is
shown on the next slide.

Slide 8
Example: Reed Manufacturing

Analysis of Variance
Observation        Plant 1    Plant 2       Plant 3
                   Buffalo    Pittsburgh    Detroit
1                  48         73            51
2                  54         63            63
3                  57         66            61
4                  54         64            54
5                  62         74            56
Sample Mean        55         68            57
Sample Variance    26.0       26.5          24.5
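
The sample means and variances in the table can be checked from the raw observations. The sketch below uses Python's standard `statistics` module, where `variance` is the sample variance with an n - 1 divisor; the dictionary layout and variable names are illustrative choices.

```python
from statistics import mean, variance

# Hours worked, copied from the table above.
hours = {
    "Buffalo": [48, 54, 57, 54, 62],
    "Pittsburgh": [73, 63, 66, 64, 74],
    "Detroit": [51, 63, 61, 54, 56],
}

# (sample mean, sample variance) for each plant.
summary = {plant: (mean(x), variance(x)) for plant, x in hours.items()}

for plant, (m, v) in summary.items():
    print(f"{plant}: mean = {m:.1f}, variance = {v:.1f}")
```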

Slide 9
Example: Reed Manufacturing

Analysis of Variance
Hypotheses

H0: μ1 = μ2 = μ3
Ha: Not all the means are equal
where:
μ1 = mean number of hours worked per week by the managers at Plant 1
μ2 = mean number of hours worked per week by the managers at Plant 2
μ3 = mean number of hours worked per week by the managers at Plant 3

Slide 10
Example: Reed Manufacturing

Analysis of Variance
Mean Square Between
Since the sample sizes are all equal:
x̿ = (55 + 68 + 57)/3 = 60
SSB = 5(55 - 60)² + 5(68 - 60)² + 5(57 - 60)² = 490
MSB = 490/(3 - 1) = 245
Mean Square Within
SSW = 4(26.0) + 4(26.5) + 4(24.5) = 308
MSW = 308/(15 - 3) = 25.667

Slide 11
Example: Reed Manufacturing

Analysis of Variance
F - Test
If H0 is true, the ratio MSB/MSW should be near 1 since both MSB and MSW are estimating σ². If Ha is true, the ratio should be significantly larger than 1 since MSB tends to overestimate σ².
Rejection Rule
Assuming α = .05, F.05 = 3.89 (2 d.f. numerator, 12 d.f. denominator). Reject H0 if F > 3.89

Slide 12
Example: Reed Manufacturing

Analysis of Variance
Test Statistic
F = MSB/MSW = 245/25.667 = 9.55
Conclusion
F = 9.55 > F.05 = 3.89, so we reject H0. The mean
number of hours worked per week by department
managers is not the same at each plant.
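
The test can be reproduced numerically without statistical tables, as sketched below. The sketch relies on a closed form that holds only when the numerator has 2 degrees of freedom, as it does here: P(F > f) = (1 + 2f/d2)^(-d2/2), where d2 is the denominator degrees of freedom. The function names are hypothetical, and this shortcut does not apply to other numerator degrees of freedom.

```python
# Values from the Reed Manufacturing slides.
msb = 245.0
msw = 308 / 12          # 25.667
df_denominator = 12     # n_T - k = 15 - 3
alpha = 0.05


def upper_tail_prob(f, d2):
    """P(F > f) for an F distribution with 2 numerator and d2 denominator d.f."""
    return (1 + 2 * f / d2) ** (-d2 / 2)


def critical_value(alpha, d2):
    """Value f with P(F > f) = alpha, again for 2 numerator d.f. only."""
    return (d2 / 2) * (alpha ** (-2 / d2) - 1)


f_stat = msb / msw
f_crit = critical_value(alpha, df_denominator)
p_value = upper_tail_prob(f_stat, df_denominator)
reject = f_stat > f_crit

print(round(f_stat, 2), round(f_crit, 2), round(p_value, 4), reject)
```

The computed statistic and critical value should agree with the slide values of 9.55 and 3.89, and the p-value of roughly 0.003 leads to the same conclusion as the rejection rule.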

Slide 13
Example
Economic position of Household Reference Person
EFS: Total Alcoholic Beverages, Tobacco

Group                       Mean     St. Dev.
Self-employed               18.56    19.0
Fulltime employee           14.64    18.5
Pt employee                 12.39    15.0
Unempl.                     19.48    19.7
Ret unoc over min ni age     7.34    14.6
Unoc - under min ni age     11.99    19.1
TOTAL                       12.67    17.8

Are there significant differences across the means of these groups?
Or do the differences depend on the different levels of variability across the groups?

Statistics for Marketing & Consumer Research


Copyright 2008 - Mario Mazzocchi
Null and alternative hypotheses for ANOVA
Null hypothesis (H0): all the means are equal
Alternative hypothesis (H1): at least two
means are different

Arranging data for ANOVA
Economic position of Household Reference Person

Group (g)  Economic position          Observations        Number of observations (n)  Mean
1          Self-employed              x11, x21, x31, ...  n1                          x̄1
2          Fulltime employee          x12, x22, x32, ...  n2                          x̄2
3          Pt employee                x13, x23, x33, ...  n3                          x̄3
4          Unempl.                    x14, x24, x34, ...  n4                          x̄4
5          Ret unoc over min ni age   x15, x25, x35, ...  n5                          x̄5
6          Unoc - under min ni age    x16, x26, x36, ...  n6                          x̄6

Overall mean: x̄

The statistical distribution to carry out
ANOVA
1. Decompose the total variation (sum of
squares corrected for the mean)
2. Compute the F-test statistic
3. Choose the critical value
4. Interpret the result

ANOVA: data
Suppose that we have n observations within each group and g groups.

              Group (factor level)
Obs.          1      2      ...   j      ...   g
1             x11    x12    ...   x1j    ...   x1g
2             x21    x22    ...   x2j    ...   x2g
...
i             xi1    xi2    ...   xij    ...   xig
...
n             xn1    xn2    ...   xnj    ...   xng
Group mean    x̄1     x̄2     ...   x̄j     ...   x̄g

TOTAL MEAN: $\bar{x} = \frac{1}{g} \sum_{j=1}^{g} \bar{x}_j$
Measuring and decomposing the total
variation
SUM OF SQUARES (corrected for the mean)

  VARIATION BETWEEN THE GROUPS
+ VARIATION WITHIN EACH GROUP
________________________________
= TOTAL VARIATION
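
The decomposition can be verified numerically. As an illustration, the sketch below reuses the Reed Manufacturing hours from earlier in this document (that choice of data and the variable names are assumptions made for the example) and checks that the between-groups and within-groups sums of squares add up to the total sum of squares around the overall mean.

```python
# Reed Manufacturing hours, reused here purely as example data.
groups = {
    "Buffalo": [48, 54, 57, 54, 62],
    "Pittsburgh": [73, 63, 66, 64, 74],
    "Detroit": [51, 63, 61, 54, 56],
}

all_obs = [x for xs in groups.values() for x in xs]
grand_mean = sum(all_obs) / len(all_obs)

# Total variation: squared deviations of every observation from the overall mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_obs)

ss_between = 0.0
ss_within = 0.0
for xs in groups.values():
    group_mean = sum(xs) / len(xs)
    # Variation between the groups: group mean against the overall mean.
    ss_between += len(xs) * (group_mean - grand_mean) ** 2
    # Variation within each group: observations against their own group mean.
    ss_within += sum((x - group_mean) ** 2 for x in xs)

print(ss_between, ss_within, ss_total)
```

For these data the three quantities come out as 490, 308, and 798, so the between and within components account for all of the total variation.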

The test statistic

The test statistic is computed as:

$$F = \frac{s_B^2}{s_W^2} = \frac{\text{Variance between groups}}{\text{Variance within groups}}$$

This test statistic compares the weight of the variance explained by the factors to the weight of the variance not explained by the factors.
ANOVA in SPSS

[Screenshot: SPSS ANOVA dialog, with callouts marking the target variable and the factor]

SPSS output
ANOVA
EFS: Total Alcoholic Beverages, Tobacco

                  Sum of Squares   df     Mean Square   F       Sig.
Between Groups    6171.784         5      1234.357      4.024   .001
Within Groups     151535.3         494    306.752
Total             157707.1         499

Callouts on the output:
Sum of Squares: variation decomposition
df: degrees of freedom
Mean Square, Between Groups: variance between
Mean Square, Within Groups: variance within
Sig.: p-value < 0.05, so the null is rejected
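
The mean squares and the F value in this output follow directly from the sums of squares and degrees of freedom. The short sketch below redoes that arithmetic from the figures shown in the table; the variable names are illustrative, and the last digits can differ slightly because the printed sums of squares are rounded.

```python
# Figures copied from the SPSS ANOVA table.
ss_between, df_between = 6171.784, 5
ss_within, df_within = 151535.3, 494

# Mean square = sum of squares / degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# F = variance between groups / variance within groups.
f_stat = ms_between / ms_within

# The Total row is the sum of the two components.
ss_total = ss_between + ss_within
df_total = df_between + df_within

print(round(ms_between, 3), round(ms_within, 3), round(f_stat, 3))
print(round(ss_total, 1), df_total)
```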
