Chapter 9: Analysis of Variance: For Example

STAT3010: Lecture 4
CHAPTER 9: ANALYSIS OF VARIANCE

Analysis of Variance (ANOVA) is one of the most widely used
statistical techniques for testing the equality of population
means. ANOVA is used to test the equality of more than two
treatment means.
Recall from STAT 2020/2010: hypothesis testing concerning a
difference between two means. The two samples could be either
independent (with pooled or unpooled variances) or paired.
For Example: The Chapin Insight Test is a psychological test
designed to measure how accurately the subject appraises
other people. The possible scores on the test range from 0 to
41. During the development of the Chapin test, it was given to
several different groups of people. Here are the results for male
and female college students majoring in the liberal arts:
Group   Sex       n      Mean     s
1       Male      133    25.34    5.05
2       Female    162    24.94    5.44
Do these data support the contention that female and male
students differ in average social insight?
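As a reminder of the two-sample calculation, here is a minimal sketch in Python (not the SAS used elsewhere in these notes) that computes the t statistic from the summary statistics above. The helper name `two_sample_t` is hypothetical, and the choice between the pooled and unpooled (Welch) versions is an assumption, since the notes do not say which one applies here.

```python
import math

def two_sample_t(mean1, s1, n1, mean2, s2, n2, pooled=False):
    """Return (t, df) for H0: mu1 = mu2, computed from summary statistics."""
    if pooled:
        # Pooled variance: weighted average of the two sample variances.
        sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
        se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
        df = n1 + n2 - 2
    else:
        # Unpooled standard error with Welch-Satterthwaite degrees of freedom.
        v1, v2 = s1**2 / n1, s2**2 / n2
        se = math.sqrt(v1 + v2)
        df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return (mean1 - mean2) / se, df

# Chapin Insight Test: males (group 1) vs. females (group 2).
t_pooled, df_pooled = two_sample_t(25.34, 5.05, 133, 24.94, 5.44, 162, pooled=True)
t_welch, df_welch = two_sample_t(25.34, 5.05, 133, 24.94, 5.44, 162)

print(f"pooled:   t = {t_pooled:.3f}, df = {df_pooled}")
print(f"unpooled: t = {t_welch:.3f}, df = {df_welch:.1f}")
```

Both versions give a t statistic of about 0.65, well below the two-sided 5% critical value of roughly 1.97 for this many degrees of freedom, so these data give no evidence of a difference in mean scores.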
For the ANOVA test, we want to test the equality or difference
of more than two treatment means, so our hypotheses are:
$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$   vs.   $H_a$: at least two means are not equal
With assumptions:
1.
2.
3.
4.
Background Logic (Section 9.1, Page 408)

Example 9.1: Variation in Time to Relief of Symptoms Between
and Within Treatments
An experiment is conducted in which three treatments are
compared with respect to their effectiveness. For the purpose
of this example, effectiveness is evaluated in terms of time to
relief of symptoms, reported in minutes. We assume that the
distributions of time to relief are approximately normal. The test
of interest is as follows:
$H_0: \mu_1 = \mu_2 = \mu_3$   vs.   $H_a$: at least two means are not equal
Fifteen subjects are randomly selected to participate in the
investigation. Five subjects are randomly assigned to each
treatment, and each subject reports the time to relief of
symptoms, in minutes, following their assigned treatment.
Sample data and summary statistics follow:
Treatment 1   Treatment 2   Treatment 3
29.0          25.1          20.1
29.2          25.0          20.0
29.1          25.0          19.9
28.9          24.9          19.8
28.8          25.0          20.2
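Summary Statistics by Treatment
$\bar{x}_1 = 29$    $s_1^2 = 0.025$    $s_1 = 0.158$
$\bar{x}_2 = 25$    $s_2^2 = 0.005$    $s_2 = 0.071$
$\bar{x}_3 = 20$    $s_3^2 = 0.025$    $s_3 = 0.158$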
Summary statistics show:
So far, the summary statistics show a small amount of
within-group variability, since all of the individual standard
deviations are quite small. Now, suppose that the population
means are equal (i.e., $H_0: \mu_1 = \mu_2 = \mu_3$ is true). We
can then assume that the three samples are drawn from the same
population and pool all of the observations together (i.e., $N = 15$).
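* Page layout options for the listing output;
options LS=80 PS=60 nodate;

* Pool all 15 times to relief into a single variable x;
data relief;
  input x;
  cards;
29
29.2
29.1
28.9
28.8
25.1
25
25
24.9
25
20.1
20
19.9
19.8
20.2
;
run;

* List the pooled observations;
proc print data=relief;
run;

* Overall N, mean, standard deviation, minimum and maximum;
proc means data=relief;
  var x;
run;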
The SAS System
The MEANS Procedure
Analysis Variable : x
 N          Mean       Std Dev       Minimum       Maximum
15    24.6666667     3.8130728    19.8000000    29.2000000
The pooled standard deviation (3.81) is far larger than any of the
within-group standard deviations, which shows a large amount of
between-group variability.
Between-group variability is the difference between the mean
time to relief for each group and the overall (pooled) mean.
Our individual group means are quite different from the overall
mean, which suggests large between-group variability. If the
means in each group are very similar, then the between-group
variability is small.
In ANOVA, we compare the variation within samples (which is
small) to the variation between samples (which is large) to
assess the equality of the population means.
If the observations within a sample are similar in value (i.e., small
within-sample variation) and the means are different across
samples (large between-sample variation), then a real
difference is said to exist in the population means. (Reject $H_0$.)
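The contrast between within-sample and between-sample variation can be checked numerically. The following is a small sketch in Python (rather than SAS) using only the standard library; the dictionary layout and variable names are illustrative choices, not part of the course material.

```python
from statistics import mean, stdev

# Times to relief (minutes) from Example 9.1, keyed by treatment.
groups = {
    "Treatment 1": [29.0, 29.2, 29.1, 28.9, 28.8],
    "Treatment 2": [25.1, 25.0, 25.0, 24.9, 25.0],
    "Treatment 3": [20.1, 20.0, 19.9, 19.8, 20.2],
}

# Within-group view: each treatment's own mean and standard deviation.
group_mean = {name: mean(values) for name, values in groups.items()}
within_sd = {name: stdev(values) for name, values in groups.items()}

# Pooled view: ignore the treatment labels, as we would if H0 were true.
pooled = [x for values in groups.values() for x in values]
pooled_mean = mean(pooled)
pooled_sd = stdev(pooled)

for name in groups:
    print(f"{name}: mean = {group_mean[name]:.2f}, sd = {within_sd[name]:.3f}")
print(f"Pooled:      mean = {pooled_mean:.2f}, sd = {pooled_sd:.3f}")
```

The within-group standard deviations come out at roughly 0.07 to 0.16, while the pooled standard deviation is about 3.81, matching the PROC MEANS output.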
So what have we learned?
What causes variability?

Within groups: if the treatment has no effect, all participants'
times to relief will be more or less the same, thus giving low
variability.
Between groups: if one treatment has a massively better effect
than the others, then the mean of this group will be very
different from the others, thus increasing between-group
variability.
In ANOVA, we wish to test the following:
$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$   vs.   $H_a$: at least two means are not equal,
where $k$ = the number of populations under consideration.

To test $H_0$, we compute two estimates of the population
variance ($\sigma^2$).
First estimate:
Second estimate:
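Formula for Within Treatment Variation:
$s_w^2 = \frac{\sum_j \sum_i (X_{ij} - \bar{X}_{.j})^2}{N - k}$
Formula for Between Treatment Variation:
$s_b^2 = \frac{\sum_j n_j (\bar{X}_{.j} - \bar{X}_{..})^2}{k - 1}$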
The test statistic in ANOVA is based on the ratio of these two
estimates:
$F = s_b^2 / s_w^2$
The test statistic follows an F distribution (Table B.4A and B.4B):
Let's use our example to find these values:

Treatment 1   Treatment 2   Treatment 3
29.0          25.1          20.1
29.2          25.0          20.0
29.1          25.0          19.9
28.9          24.9          19.8
28.8          25.0          20.2
Summary Statistics by Treatment
$\bar{x}_1 = 29$    $s_1^2 = 0.025$    $s_1 = 0.158$
$\bar{x}_2 = 25$    $s_2^2 = 0.005$    $s_2 = 0.071$
$\bar{x}_3 = 20$    $s_3^2 = 0.025$    $s_3 = 0.158$
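The two variance estimates and their ratio can be worked out directly from these summary statistics. The sketch below is in Python (not SAS) and relies on the identity that the within-treatment sum of squares equals the sum of $(n_j - 1) s_j^2$ over the groups; the variable names are illustrative only.

```python
# Summary statistics for the three treatments in Example 9.1.
n = [5, 5, 5]
means = [29.0, 25.0, 20.0]
variances = [0.025, 0.005, 0.025]

k = len(n)   # number of treatments
N = sum(n)   # total number of observations

# Overall mean, weighting each treatment mean by its sample size.
grand_mean = sum(nj * xbar for nj, xbar in zip(n, means)) / N

# Within-treatment estimate: pooled sample variance.
ss_w = sum((nj - 1) * s2 for nj, s2 in zip(n, variances))
s2_w = ss_w / (N - k)

# Between-treatment estimate: spread of treatment means about the overall mean.
ss_b = sum(nj * (xbar - grand_mean) ** 2 for nj, xbar in zip(n, means))
s2_b = ss_b / (k - 1)

F = s2_b / s2_w

print(f"s_w^2 = {s2_w:.4f}, s_b^2 = {s2_b:.4f}, F = {F:.1f}")
```

Here the within estimate is about 0.018 and the between estimate about 101.7, so F is in the thousands, far from the value of 1 expected when the population means are equal.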
If the two estimates of $\sigma^2$ are close in value, then F will be
approximately equal to 1, which gives us no reason to reject
$H_0$. However, if the variation between samples ($s_b^2$) is large
and the variation within samples ($s_w^2$) is small, then F will be
large, and we would reject $H_0$.
These are not our only guidelines for a conclusion; we also need a
critical value from the F distribution. To get this critical value,
we need two degrees of freedom: the numerator degrees of freedom
($df_1 = k - 1$) and the denominator degrees of freedom
($df_2 = nk - k$ when every group has $n$ observations, or
$df_2 = N - k$ in general). We find the F critical value using
Table B.4A or B.4B.
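As a cross-check on the table lookup, the sketch below (in Python, not SAS) computes the critical value for the three-treatment example. It uses a closed form that holds only when the numerator degrees of freedom equal 2, where the upper-tail probability of F is $(1 + 2f/df_2)^{-df_2/2}$. The function names are hypothetical, and the significance level of 0.05 is an assumption, since the notes do not state one.

```python
def f_upper_tail_df1_2(f, df2):
    """P(F > f) for an F distribution with df1 = 2 and df2 denominator df."""
    return (1 + 2 * f / df2) ** (-df2 / 2)

def f_critical_df1_2(alpha, df2):
    """Critical value with upper-tail area alpha; valid only when df1 = 2."""
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)

k, N = 3, 15    # three treatments, fifteen subjects (Example 9.1)
df1 = k - 1     # numerator degrees of freedom
df2 = N - k     # denominator degrees of freedom
alpha = 0.05    # assumed significance level

assert df1 == 2, "the closed form above only applies when df1 = 2"
critical_value = f_critical_df1_2(alpha, df2)

print(f"df1 = {df1}, df2 = {df2}, critical F = {critical_value:.2f}")
```

For 2 and 12 degrees of freedom this gives a critical value of about 3.89; an observed F larger than this leads to rejecting $H_0$ at the assumed 5% level. For other numerator degrees of freedom, the tables (or statistical software) are still needed.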
Decision:
Notation and Examples (Section 9.2, Page 413)

To decide whether or not to reject the null hypothesis, we
organize the calculations in an ANOVA table. Here are the
formulas which make up the ANOVA table:
Analysis of Variance Table

Source of    Sums of Squares                                             Degrees of     Mean Squares                         F
Variation    (SS)                                                        Freedom (df)   (MS)
Between      $SS_b = \sum_j n_j (\bar{X}_{.j} - \bar{X}_{..})^2$         $k - 1$        $s_b^2 = MS_b = SS_b / (k - 1)$      $F = MS_b / MS_w$
Within       $SS_w = \sum_j \sum_i (X_{ij} - \bar{X}_{.j})^2$            $N - k$        $s_w^2 = MS_w = SS_w / (N - k)$
Total        $SS_{total} = \sum_j \sum_i (X_{ij} - \bar{X}_{..})^2$      $N - 1$
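To see the table filled in for the time-to-relief example, here is a short sketch in Python (not SAS) that applies the formulas above to the raw data. The helper `anova_table` and its row layout are hypothetical, written only for illustration.

```python
from statistics import mean

def anova_table(groups):
    """Return the one-way ANOVA table rows for a list of samples."""
    all_values = [x for g in groups for x in g]
    k, N = len(groups), len(all_values)
    grand_mean = mean(all_values)

    # Sums of squares, following the formulas in the table.
    ss_b = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_w = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)

    ms_b, ms_w = ss_b / (k - 1), ss_w / (N - k)
    return [
        {"source": "Between", "ss": ss_b, "df": k - 1, "ms": ms_b, "f": ms_b / ms_w},
        {"source": "Within", "ss": ss_w, "df": N - k, "ms": ms_w, "f": None},
        {"source": "Total", "ss": ss_total, "df": N - 1, "ms": None, "f": None},
    ]

# Times to relief (minutes) for Treatments 1, 2 and 3.
data = [
    [29.0, 29.2, 29.1, 28.9, 28.8],
    [25.1, 25.0, 25.0, 24.9, 25.0],
    [20.1, 20.0, 19.9, 19.8, 20.2],
]

table = anova_table(data)
print(f"{'Source':<8} {'SS':>10} {'df':>4} {'MS':>10} {'F':>10}")
for row in table:
    ms = "" if row["ms"] is None else f"{row['ms']:.4f}"
    f = "" if row["f"] is None else f"{row['f']:.2f}"
    print(f"{row['source']:<8} {row['ss']:>10.4f} {row['df']:>4} {ms:>10} {f:>10}")
```

The between and within sums of squares add up to the total sum of squares, and the degrees of freedom add up in the same way, which is a useful check on hand calculations.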
