Anova

General Linear Model 2
Intro to ANOVA
Questions
ANOVA makes assumptions about error for
significance tests. What are the
assumptions?
What might happen (why would it be a
problem) if the assumption of {normality,
equality of error, independence of error}
turned out to be false?
What is an expected mean square? Why is
it important?
Why do we use the F test to decide whether
means are equal in ANOVA?

Questions (2)
Correctly interpret ANOVA summary
tables.
Find correct values of critical F from tabled
values for a given test.
Suppose someone has worked out that a
one-way ANOVA with 6 levels has a power
of .80 for the overall F test. What does this
mean?
Describe (make up) a concrete example of a
one-way ANOVA where it makes sense to
use an overall F test. Explain why ANOVA
(not t, chi-square or something else) is the
best method for the analysis.

New Distributions
So far, we have met the normal (z) and its short, fat relative, the t distribution.
The normal has two children, chi-square ($\chi^2$) and F.
Chi-square is the sum of $\nu$ squared deviates from the unit normal. It essentially shows the sampling distribution of the variance.
F is the ratio of two independent chi-squares, each divided by its degrees of freedom.
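Written out as a sketch, assuming the $z_i$ are independent unit-normal deviates, the two chi-squares in the F ratio are independent, and $s^2$ is a sample variance from a normal population with variance $\sigma^2$:

```latex
\chi^2_{\nu} = \sum_{i=1}^{\nu} z_i^2,
\qquad
\frac{(n-1)\,s^2}{\sigma^2} \sim \chi^2_{n-1},
\qquad
F_{\nu_1,\nu_2} = \frac{\chi^2_{\nu_1}/\nu_1}{\chi^2_{\nu_2}/\nu_2}
```

The middle relation is the sense in which chi-square describes the sampling distribution of the variance.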
ANOVA Assumptions
Recall we can partition total SS into
between (treatment) and within (error)
SS. No assumptions needed.
To conduct tests about population
effects, have to make assumptions:
1. Within cells (treatments) error is normal.
2. Homogeneity of error variance.
3. Independent errors.
Assumptions
Normality: needed for the sampling distribution of means and variances; violations are not bad if N is large. E.g., reaction time.
Homogeneity: the error term is a pooled estimate of one population value. When we ask where the means differ, we assume equal error variance in each cell. E.g., ceiling effects in training.
Independence: affects the sampling distribution again. E.g., cheating on an exam, nesting (schools, labs).
Mean Square Between Groups
Mean square = SS/df = variance estimate.
MS between = $SS_{between} / (J - 1)$ (J treatments)
E(MS between) = $\sigma_e^2 + \frac{\sum_j n_j \tau_j^2}{J - 1}$
If there is no treatment effect, MS between estimates the error variance.
If there is a treatment effect, MS between tends to be bigger than the error variance.
Mean Square Within Groups
MS within = $SS_{within} / (N - J)$ (N is total sample size and J is number of groups.)
E(MS within) = $\sigma_e^2$
Expected mean square for error is $\sigma_e^2$.
Expected mean square for treatment is the same plus the treatment effect: $\sigma_e^2 + \frac{\sum_j n_j \tau_j^2}{J - 1}$.

When there is no treatment effect, between and within estimate the same thing.
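A small numeric sketch may make the two expected mean squares concrete. The population values below (error variance of 10, three groups of 5, treatment effects of 0, +5 and -5) are assumptions chosen for illustration, and the function names are hypothetical.

```python
def expected_ms_within(error_var):
    """E(MS within) is the population error variance."""
    return error_var


def expected_ms_between(error_var, effects, ns):
    """E(MS between) = error variance + sum(n_j * tau_j**2) / (J - 1)."""
    j = len(effects)
    treatment_part = sum(n * tau ** 2 for n, tau in zip(ns, effects)) / (j - 1)
    return error_var + treatment_part


# Hypothetical population values, for illustration only.
error_var = 10.0
ns = [5, 5, 5]

no_effect = expected_ms_between(error_var, [0.0, 0.0, 0.0], ns)
with_effect = expected_ms_between(error_var, [0.0, 5.0, -5.0], ns)

print(expected_ms_within(error_var))  # error variance only
print(no_effect)                      # same as E(MS within) when all tau_j = 0
print(with_effect)                    # error variance plus the treatment term
```

With no treatment effect both expectations equal the error variance, so their ratio is expected to be near 1; any nonzero effect can only push E(MS between) upward.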
Review
ANOVA makes assumptions about error for
significance tests. What are the assumptions?
What might happen (why would it be a
problem) if the assumption of {normality,
equality of error, independence of error}
turned out to be false?
What is an expected mean square? Why is it
important?

The F Test (1)
Suppose
$H_0: \tau_j = 0$ for all j
$H_1: \tau_j \neq 0$ for some j
The null is equivalent to: $\mu_1 = \mu_2 = \dots = \mu_J$
If the null is true, then
$\frac{MS_{between}}{MS_{within}} \sim F_{(J-1,\ N-J)}$
The ratio of the two variance estimates will be distributed as F with J-1 and N-J degrees of freedom.
The F Test (2)
$\frac{MS_{between}}{MS_{within}} \sim F_{(J-1,\ N-J)}$

This is a big deal because we can use variance estimates to test the hypothesis that any number of population means are equal. Testing equality of means is the same as testing for population treatment effect(s).
For a treatment effect to be detected, F must be larger than 1; how much larger depends on the two df. F is one-tailed in the tables, which show upper-tail values of F given the two df.
[Figure: p value (significance), 0.0 to 1.0, plotted against obtained F (2 and 10 df), 0 to 6; regions labeled "No Effect (n.s.)" and "Significant (Alpha = .05)".]
F Table Critical Values
Columns: numerator df (df_B). Rows: denominator df (df_W).

df_W   Alpha   1      2      3      4      5
5      5%      6.61   5.79   5.41   5.19   5.05
       1%      16.3   13.3   12.1   11.4   11.0
10     5%      4.96   4.10   3.71   3.48   3.33
       1%      10.0   7.56   6.55   5.99   5.64
12     5%      4.75   3.89   3.49   3.26   3.11
       1%      9.33   6.94   5.95   5.41   5.06
14     5%      4.60   3.74   3.34   3.11   2.96
       1%      8.86   6.51   5.56   5.04   4.70
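Reading the table can be sketched as a lookup keyed by the two df. The dictionary below holds only the tabled values from this slide; `F_TABLE` and `critical_f` are illustrative names, not part of any library, and the design in the usage line (5 groups of 3 people) is a made-up example.

```python
# Critical values copied from the slide's table:
# {df_within: {df_between: (5% value, 1% value)}}
F_TABLE = {
    5:  {1: (6.61, 16.3), 2: (5.79, 13.3), 3: (5.41, 12.1),
         4: (5.19, 11.4), 5: (5.05, 11.0)},
    10: {1: (4.96, 10.0), 2: (4.10, 7.56), 3: (3.71, 6.55),
         4: (3.48, 5.99), 5: (3.33, 5.64)},
    12: {1: (4.75, 9.33), 2: (3.89, 6.94), 3: (3.49, 5.95),
         4: (3.26, 5.41), 5: (3.11, 5.06)},
    14: {1: (4.60, 8.86), 2: (3.74, 6.51), 3: (3.34, 5.56),
         4: (3.11, 5.04), 5: (2.96, 4.70)},
}
ALPHAS = (0.05, 0.01)


def critical_f(df_between, df_within, alpha=0.05):
    """Return the tabled critical F; only the df and alphas above are covered."""
    if alpha not in ALPHAS:
        raise ValueError("only alpha = .05 and .01 are tabled here")
    return F_TABLE[df_within][df_between][ALPHAS.index(alpha)]


# Hypothetical design: J = 5 groups, n = 3 per group, so N = 15.
J, N = 5, 15
print(critical_f(J - 1, N - J))              # df = (4, 10) at alpha = .05
print(critical_f(J - 1, N - J, alpha=0.01))  # same df at alpha = .01
```

The numerator df is J - 1 and the denominator df is N - J, so the column is chosen by the number of groups and the row by the total sample size.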
Review
Why do we use the F test to decide
whether means are equal in ANOVA?
Suppose we have an ANOVA design
with 3 cells and 5 people per cell. What
is the critical value of F at alpha = .05?

Calculating F with One-Way ANOVA
Sums of squares (squared deviations from the mean) tell the story of variance. The simple ANOVA designs have 3 sums of squares.
$SS_{tot} = \sum (X_{ij} - \bar{X})^2$
The total sum of squares comes from the distance of all the scores from the grand mean. This is the total; it's all you have.
$SS_W = \sum (X_{ij} - \bar{X}_j)^2$
The within-group or within-cell sum of squares comes from the distance of the observations to the cell means. This indicates error.
$SS_B = \sum n_j (\bar{X}_j - \bar{X})^2$
The between-cells or between-groups sum of squares tells of the distance of the cell means from the grand mean. This indicates IV effects.
$SS_{tot} = SS_B + SS_W$
Computational Example: Caffeine on Test Scores

              G1: Control   G2: Mild   G3: Jolt
Test Scores   75            80         70
              77            82         72
              79            84         74
              81            86         76
              83            88         78
Means         79            84         74
SDs (N-1)     3.16          3.16       3.16
Total Sum of Squares
$SS_{tot} = \sum (X_{ij} - \bar{X})^2$

Group                          Score   Grand mean   Squared deviation
G1 Control (M=79, SD=3.16)     75      79           16
                               77      79           4
                               79      79           0
                               81      79           4
                               83      79           16
G2 (M=84, SD=3.16)             80      79           1
                               82      79           9
                               84      79           25
                               86      79           49
                               88      79           81
G3 (M=74, SD=3.16)             70      79           81
                               72      79           49
                               74      79           25
                               76      79           9
                               78      79           1
Sum                                                 370
Within Sum of Squares
$SS_W = \sum (X_{ij} - \bar{X}_j)^2$

Group                          Score   Cell mean   Squared deviation
G1 Control (M=79, SD=3.16)     75      79          16
                               77      79          4
                               79      79          0
                               81      79          4
                               83      79          16
G2 (M=84, SD=3.16)             80      84          16
                               82      84          4
                               84      84          0
                               86      84          4
                               88      84          16
G3 (M=74, SD=3.16)             70      74          16
                               72      74          4
                               74      74          0
                               76      74          4
                               78      74          16
Sum                                                120
Between Sum of Squares
$SS_B = \sum n_j (\bar{X}_j - \bar{X})^2$

Group                          Cell mean   Grand mean   Squared deviation
G1 Control (M=79, SD=3.16)     79          79           0
                               79          79           0
                               79          79           0
                               79          79           0
                               79          79           0
G2 (M=84, SD=3.16)             84          79           25
                               84          79           25
                               84          79           25
                               84          79           25
                               84          79           25
G3 (M=74, SD=3.16)             74          79           25
                               74          79           25
                               74          79           25
                               74          79           25
                               74          79           25
Sum                                                     250

ANOVA Source (Summary) Table

Source           SS    df                 MS                            F
Between Groups   250   J-1 = 3-1 = 2      SS/df = 250/2 = 125 = MS_B    MS_B/MS_W = 125/10 = 12.5
Within Groups    120   N-J = 15-3 = 12    120/12 = 10 = MS_W
Total            370   N-1 = 15-1 = 14

Critical value: $F_{(\alpha = .05;\ 2,\ 12)} = 3.89$
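The whole caffeine example can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative, and the critical value 3.89 is taken from the F table rather than computed.

```python
# Scores from the caffeine example, one list per group.
groups = {
    "Control": [75, 77, 79, 81, 83],
    "Mild": [80, 82, 84, 86, 88],
    "Jolt": [70, 72, 74, 76, 78],
}

scores = [x for g in groups.values() for x in g]
n_total = len(scores)            # N
n_groups = len(groups)           # J
grand_mean = sum(scores) / n_total

# Total SS: every score against the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in scores)

# Within SS: every score against its own cell mean.
ss_within = 0.0
# Between SS: every cell mean against the grand mean, weighted by cell size.
ss_between = 0.0
for g in groups.values():
    cell_mean = sum(g) / len(g)
    ss_within += sum((x - cell_mean) ** 2 for x in g)
    ss_between += len(g) * (cell_mean - grand_mean) ** 2

df_between = n_groups - 1
df_within = n_total - n_groups
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_obtained = ms_between / ms_within

critical_f = 3.89  # tabled value for alpha = .05 with 2 and 12 df
print(ss_total, ss_between, ss_within)
print(df_between, df_within, ms_between, ms_within, f_obtained)
print("significant" if f_obtained > critical_f else "n.s.")
```

Because the between and within sums of squares are computed separately here, the fact that they add up to the total serves as a check on the arithmetic.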
ANOVA Summary
Calculate SS (total, between, within).
Each SS has associated df to calculate MS.
F is the ratio of $MS_b$ to $MS_w$.
Compare obtained F (12.5) to critical value (3.89). Significant if obtained F is larger than critical.
A one-tailed test makes sense for F because a treatment effect can only make $MS_b$ larger relative to $MS_w$.
Review
Suppose we have 4 groups and 10 people per group. We find that $SS_B$ = 60 and $SS_W$ = 40. Construct an ANOVA summary table and test for significance of the overall effect.
ANOVA Descriptive Stats
Because $SS_{tot} = SS_b + SS_w$, we can figure the proportion of total variance due to treatment.
Proportion of total variance due to treatment is: $R^2 = SS_b / SS_{tot}$.
Varies from 0 (no effect) to 1 (no error).
Sample value is biased (too large).
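For the caffeine example, the proportion works out as follows. This is a minimal sketch that reuses the sums of squares from the summary table (250 between, 120 within); it reports the uncorrected sample value only.

```python
# Sums of squares from the caffeine example's summary table.
ss_between = 250.0
ss_within = 120.0
ss_total = ss_between + ss_within

# Proportion of total variance due to treatment (sample value, not bias-corrected).
r_squared = ss_between / ss_total
print(round(r_squared, 3))
```

About two thirds of the total sum of squares in that sample is associated with the caffeine groups; the population value is expected to be somewhat smaller because the sample value is biased upward.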
Estimating Power
Power for what? For one-way ANOVA, power usually means power for the overall F, i.e., for detecting that at least 1 group mean is different from the others.
Howell uses noncentral F for sample size calculation.
$\phi' = \sqrt{\frac{\sum_j (\mu_j - \mu)^2 / k}{\sigma_e^2}}$
$\phi = \phi' \sqrt{n}$
Where k is the number of treatment groups; n is sample size per group. Variance of error is MSE in the population (variance of DV within cells). $\mu_j$ are treatment means; $\mu$ is grand mean.
$n = \phi^2 / \phi'^2$
SAS Power calculation
SAS will compute sample size requirements for a given scenario. You input the expected means and a common (within-cell) standard deviation, along with alpha and desired power, and it will tell you the sample size you need.

SAS Input
**********************************************************
* Power computation example from Howell, 2010, p. 350.
* Note the standard deviation is the square root of the
* provided MSE: sqrt(240.35) is about 15.5.
**********************************************************;
proc power;
  onewayanova
    groupmeans = 34 | 50.8 | 60.33 | 48.5 | 38.1
    stddev = 15.5
    alpha = 0.05
    npergroup = .
    power = .8;
run;
SAS Output

The POWER Procedure
Overall F Test for One-Way ANOVA

Fixed Scenario Elements

Method Exact
Alpha 0.05
Group Means 34 50.8 60.33 48.5 38.1
Standard Deviation 15.5
Nominal Power 0.8

Computed N Per Group

Actual Power    N Per Group
0.831           8
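The effect-size quantities from the power slide can be computed for this scenario as a rough cross-check on the inputs. The sketch below assumes equal group sizes (so the grand mean is the unweighted mean of the group means) and uses the MSE of 240.35 quoted in the SAS comment; the function name `phi_prime` is illustrative. It does not compute power itself, which requires the noncentral F distribution.

```python
import math


def phi_prime(means, error_var):
    """Effect size: sqrt((sum of squared deviations of means / k) / error variance)."""
    k = len(means)
    grand_mean = sum(means) / k  # assumes equal n per group
    between_var = sum((m - grand_mean) ** 2 for m in means) / k
    return math.sqrt(between_var / error_var)


group_means = [34, 50.8, 60.33, 48.5, 38.1]
mse = 240.35       # error variance quoted in the SAS comment
n_per_group = 8    # sample size reported in the SAS output

effect = phi_prime(group_means, mse)
phi = effect * math.sqrt(n_per_group)

print(round(effect, 3))
print(round(phi, 3))
```

Larger values of phi correspond to higher power, which is why increasing n per group raises power for a fixed set of means and a fixed error variance.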
Review
Suppose someone has worked out that a
one-way ANOVA with 6 levels has a
power of .80 for the overall F test.
What does this mean?
Describe (make up) a concrete example
of a one-way ANOVA where it makes
sense to use an overall F test. Explain
why ANOVA (not t, chi-square or
something else) is the best method for
the analysis.
