10 - Hypothesis Testing With One-Way ANOVA PDF

Hypothesis Testing with One-Way ANOVA
Statistics
Arlo Clark-Foos
Conceptual Refresher
1. Standardized z distribution of scores and of means can be represented as percentile rankings.
2. t distribution of means, mean differences, and differences between means can all be standardized, allowing us to analyze differences between 2 means.
3. Numerator of test statistic is always some difference (between scores, means, mean differences, or differences between means).
4. Denominator represents some measure of variability (or form of standard deviation).
Calculating Refresher
• Test Statistics
  • Numerator = Differences between groups
    • Example: Men are taller than women
  • Denominator = Variability within groups
    • Example: Not all men/women are the same height
* There is overlap between these distributions.
$$z = \frac{M - \mu_M}{\sigma_M}$$
$$t = \frac{M - \mu_M}{s_M}$$
$$t = \frac{(M_X - M_Y) - (\mu_X - \mu_Y)}{s_{Difference}} = \frac{M_X - M_Y}{s_{Difference}}$$
Analysis of Variance (ANOVA)
• Hypothesis test typically used with one or more nominal IVs (with at least 3 groups overall) and an interval DV.
• t Test: Distance between two distributions
• F ratio: Uses two measures of variability
F Ratio (Sir Ronald Fisher)
$$F = \frac{\text{between-groups variance}}{\text{within-groups variance}}$$
• Between-Groups Variance: An estimate of the population variance based on the differences among the means of the samples
• Within-Groups Variance: An estimate of the population variance based on the differences within each of the three or more sample distributions
More than two groups
• Example: Speech rates in America, Japan, & Wales
[Figure: three distributions, with each pairwise comparison labeled "t test?"]
Two Sources of Variance: Between & Within
Problem of Too Many Tests
p(A AND B) = p(A) × p(B) (for independent events)
p(A OR B) = p(A) + p(B) (for mutually exclusive events)
• The probability of a Type I error (rejecting the null when the null is true) greatly increases with the number of comparisons.
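To see how quickly that risk grows, here is a minimal sketch. It assumes every comparison is an independent test run at the same alpha, in which case the chance of at least one Type I error across k tests is 1 − (1 − α)^k. The function name `familywise_error_rate` is illustrative and does not come from the slides.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one Type I error across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


# Three groups need three pairwise t tests; four groups need six.
for n_tests in (1, 3, 6, 10):
    rate = familywise_error_rate(0.05, n_tests)
    print(f"{n_tests:2d} tests at alpha = .05 -> familywise error = {rate:.3f}")
```

With three pairwise t tests the overall risk is already about .14 rather than .05, which is the motivation for running a single ANOVA first.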
Fishing Expedition
"If you torture the data long enough, the numbers will prove anything you want" (Bernstein, 1996)
Types of ANOVA
• Always preceded by two adjectives
  1. Number of Independent Variables
  2. Experimental Design
• One-Way ANOVA: Hypothesis test that includes one nominal IV with more than two levels and an interval DV.
• Within-Groups One-Way ANOVA: ANOVA where each sample is composed of the same participants (AKA repeated measures ANOVA).
• Between-Groups One-Way ANOVA: ANOVA where each sample is composed of different participants.
Assumptions of ANOVA
from 1st edition of textbook
Assumption of Homoscedasticity
• Homoscedastic populations have the same variance
• Heteroscedastic populations have different variances
to the Six Steps

• Research Question:
  • What influences foreign students to choose an American graduate program? In particular, how important are financial aspects to students in Arts & Sciences, Education, Law, & Business?
• Data Source:
  • Survey of 17 graduate students from foreign countries currently enrolled in universities in the U.S.
[Table: Importance Scores by program (Arts & Sciences, Education, Law, Business)]
1. Identify
• Populations: All foreign graduate students enrolled in __________ programs in the U.S.
• Comparison Distribution: F distribution
• Test: One-Way Between-Subjects ANOVA
• Assumptions:
  • Participants not randomly selected
    • Be careful generalizing results
  • Not clear if population dist. are normal. Data are not skewed.
  • Homoscedasticity
    • We will return to this later during calculations. Don't forget!
2. Hypotheses
• Null: Foreign graduate students in Arts & Sciences, Education, Law, and Business all rate financial factors the same, on average.
$$\mu_1 = \mu_2 = \mu_3 = \mu_4$$
• Research: Foreign graduate students in Arts & Sciences, Education, Law, and Business do not all rate financial factors the same, on average.
$$\mu_1 \neq \mu_2 \neq \mu_3 \neq \mu_4$$
3. Determine characteristics
• > 2 groups and interval DV: F distribution
• df for each sample: NSample − 1
  • Arts & Sciences: df1 = 5 − 1 = 4
  • Education: df2 = 4 − 1 = 3
  • Law: df3 = 4 − 1 = 3
  • Business: df4 = 4 − 1 = 3
• dfBetween: NGroups − 1 = 4 − 1 = 3
  • Numerator df
• dfWithin: df1 + df2 + df3 + df4 = 4 + 3 + 3 + 3 = 13
  • Denominator df
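The degrees-of-freedom bookkeeping can be sketched in a few lines. The group sizes are the ones given for this example; the variable names are illustrative only.

```python
# Sample sizes for the four programs in the survey of 17 students.
sample_sizes = {"Arts & Sciences": 5, "Education": 4, "Law": 4, "Business": 4}

# df for each sample is its size minus one.
df_per_sample = {name: n - 1 for name, n in sample_sizes.items()}

df_between = len(sample_sizes) - 1        # numerator df
df_within = sum(df_per_sample.values())   # denominator df
df_total = sum(sample_sizes.values()) - 1

print(df_per_sample)
print(f"df_between = {df_between}, df_within = {df_within}, df_total = {df_total}")
```

A quick sanity check is that dfBetween and dfWithin add up to NTotal − 1.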
4. Determine Critical Values
p = .05
dfBetween = 3
dfWithin = 13
FCritical = 3.41
5. Calculate the Test Statistic
• In order to do this, we need 2 measures of variance
  • Between-Groups Variance
  • Within-Groups Variance
• We will do this shortly
6. Make a Decision
• If our calculated test statistic exceeds our cutoff, we reject the null hypothesis and can say the following:
  Foreign graduate students studying in the U.S. rate financial factors differently depending on the type of program in which they are enrolled.
• ANOVA does not tell us where our differences are!
  • We just know that there is a difference somewhere.
Logic of ANOVA: Quantifying Overlap
$$F = \frac{\text{between-groups variance}}{\text{within-groups variance}}$$
• Whenever differences between sample means are large and differences between scores within each sample are small, the F statistic will be large.
• Remember that large test statistics indicate statistically significant results
Logic of ANOVA: Quantifying Overlap
a) Large within-groups variability & small between-groups variability
b) Large within-groups variability & large between-groups variability
c) Small within-groups variability & small between-groups variability
Less Overlap!
Logic of ANOVA: Quantifying Overlap
$$F = \frac{\text{between-groups variance}}{\text{within-groups variance}}$$
• If between-groups = within-groups, F = 1
• Null hypothesis predicts F = 1
  • No differences between groups
• Within-groups variance based on scores, between-groups variance based on means.
  • Need correction.
Calculating the F Statistic: The Source Table
• Source Table: Presents the important calculations and final results of an ANOVA in a consistent and easy-to-read format.
Calculating the F Statistic: The Source Table
Col. 1: The sources of variability
Col. 2: Sum of Squares
Col. 3: Degrees of freedom
Col. 4: Mean Square: arithmetic average of squared deviations
Col. 5: Value of test statistic, F ratio
$$MS_{Between} = \frac{SS_{Between}}{df_{Between}}$$
$$MS_{Within} = \frac{SS_{Within}}{df_{Within}}$$
$$F = \frac{MS_{Between}}{MS_{Within}}$$
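The last two columns of the source table follow mechanically from the first three. The sketch below uses hypothetical sums of squares chosen only so the arithmetic is easy to follow; the function name `f_ratio` is likewise an assumption, not something from the slides.

```python
def f_ratio(ss_between: float, df_between: int, ss_within: float, df_within: int):
    """Return (MS_between, MS_within, F) from sums of squares and their df."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Hypothetical values, not taken from either example in these slides.
ms_b, ms_w, f_value = f_ratio(ss_between=30.0, df_between=3, ss_within=65.0, df_within=13)
print(f"MS_between = {ms_b}, MS_within = {ms_w}, F = {f_value}")
```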
Sums of Squared Deviations
Put all of your scores in one column, with samples denoted in another column.
$$SS_{Total} = \sum (X - GM)^2$$
Grand Mean: Refers to the mean of all scores in a study, regardless of their sample.
$$GM = \frac{\sum X}{N_{Total}}$$
$$SS_{Within} = \sum (X - M)^2$$
Calculate the squared deviation of each score from its own particular sample mean.
$$SS_{Between} = \sum (M - GM)^2$$
Calculate the squared deviation of each sample mean from the grand mean.
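The three sums of squares can be sketched as follows, using a tiny made-up data set (the groups "a" and "b" are hypothetical). Note that the between-groups sum has one entry per score, so each squared deviation of a sample mean is counted once for every participant in that sample.

```python
from statistics import mean


def sums_of_squares(groups: dict) -> tuple:
    """Return (SS_total, SS_within, SS_between) for a dict of group -> scores."""
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = mean(all_scores)
    # Each score versus the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    # Each score versus its own sample mean.
    ss_within = sum((x - mean(s)) ** 2 for s in groups.values() for x in s)
    # Each sample mean versus the grand mean, once per score in that sample.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in groups.values())
    return ss_total, ss_within, ss_between


toy = {"a": [1, 2, 3], "b": [4, 5, 6]}  # made-up scores for illustration
ss_total, ss_within, ss_between = sums_of_squares(toy)
print(ss_total, ss_within, ss_between)
```

A useful check on hand calculations is that SSWithin and SSBetween add up to SSTotal.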
Source Table for our Example
What is our decision?
• Back to Step 1.
  • Homoscedasticity
    • Because the largest variance (.500) is not more than twice the smallest variance (.251), we have met this assumption. (The twice-the-smallest rule is the one used when sample sizes are unequal.)
What is our decision?
• Step 6. Make a decision
  F = 3.94 > Fcrit = 3.41
• We can reject the null hypothesis. There is a difference (or there are differences) somewhere.
• Where?
  • post-hoc test: Statistical procedure frequently carried out after we reject the null hypothesis in an ANOVA; it allows us to make multiple comparisons among several means.
  • post-hoc: Latin for "after this"
  • Examples: Tukey's HSD, Scheffé, Dunnett, Duncan, Bonferroni
Reporting ANOVA in APA Style
1. Italic letter F: F
2. Open parenthesis: F(
3. Between Groups df then comma: F(dfBetween,
4. Within Groups df: F(dfBetween, dfWithin
5. Close parentheses, equal sign: F(dfBetween, dfWithin) =
6. F Statistic then comma: F(dfBetween, dfWithin) = 1.23,
7. Lower case, italic letter p: F(dfBetween, dfWithin) = 1.23, p
8. Significant, less than .05: F(dfBetween, dfWithin) = 1.23, p < .05
   • OR non significant: F(dfBetween, dfWithin) = 1.23, p > .05
   • OR exact p value: F(dfBetween, dfWithin) = 1.23, p = .02
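The reporting pattern can be assembled mechanically. This is only a sketch: the helper name `format_apa_f` is hypothetical, plain text cannot carry the italics that APA style asks for, and the values plugged in are the ones from the foreign graduate student example.

```python
def format_apa_f(df_between: int, df_within: int, f_value: float, p_text: str) -> str:
    """Build an APA-style F report; p_text is e.g. '< .05', '> .05', or '= .02'."""
    return f"F({df_between}, {df_within}) = {f_value:.2f}, p {p_text}"


print(format_apa_f(3, 13, 3.94, "< .05"))
```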
Between-Subjects One Way ANOVA
Example: Memory for Emotional Stimuli
• Do you have differences in memory for emotional vs. neutral events?
• Do others have the same differences or is it something unique to you?
• Let's find out...
• Research Question: Will people asked to study pure lists of either positive, negative, or neutral pictures have differences in recall of those pure lists?
• Research Design: We asked 17 participants to study one single list of either 30 positive, 30 negative, or 30 neutral pictures (from IAPS). Following a brief delay, all participants were asked to recall as many of the 30 studied photos as they could. These data are on the following slide.

Memory for Emotional Stimuli
Already Stated: NTotal = 17, one IV with 3 levels (Emotion) is between-sub.
Below are the proportions of pictures on their studied lists that each participant successfully recalled (1.00 = perfect memory):
Negative   Neutral   Positive
0.69       0.59      0.64
0.84       0.64      0.73
0.93       0.62      0.51
0.91       0.71      0.68
0.89       0.50      0.61
0.90       0.60
M = .86    M = .61   M = .634

Memory for Emotional Stimuli
Already Stated/Calculated
NTotal = 17
NNeg = 6, NNeut = 6, NPos = 5
dfNeg = 5, dfNeut = 5, dfPos = 4
dfBetween = 2, dfWithin = 14
MNeg = .86, MNeut = .61, MPos = .634
• Six Steps to Hypothesis Testing...again!
1. Population: All memories for negative, neutral, and positive events.
   Comparison Distribution: F distribution
   Test: One-Way Between-Subjects ANOVA
   • Assumptions:
     • Participants were randomly selected from subject pool
     • Not clear if population dist. are normal. Data are not skewed.
     • Homoscedasticity

Memory for Emotional Stimuli
2. Hypotheses
Null: On average, memories for negative, neutral, and positive pictures will not differ.
$$\mu_{Neg} = \mu_{Neut} = \mu_{Pos}$$
Research: On average, memories for negative, neutral, and positive pictures will be different.
$$\mu_{Neg} \neq \mu_{Neut} \neq \mu_{Pos}$$

Memory for Emotional Stimuli
3. Determine characteristics
• > 2 groups and interval DV: F distribution
Negative      Neutral       Positive
0.69          0.59          0.64
0.84          0.64          0.73
0.93          0.62          0.51
0.91          0.71          0.68
0.89          0.50          0.61
0.90          0.60
M = .86       M = .61       M = .634
s² = .00784   s² = .00472   s² = .00683

Memory for Emotional Stimuli
Digression: Test for Homoscedasticity
Rule: If sample sizes differ across conditions, largest variance must not be more than twice (2x) the smallest variance.
s² (Negative) = .00784, s² (Neutral) = .00472, s² (Positive) = .00683
Largest variance: .00784
Twice the smallest variance: .00472 × 2 = .00944
.00784 < .00944, so this assumption is met.
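The same check can be run directly on the recall scores. This sketch uses the sample variance (N − 1 in the denominator), which reproduces the s² values shown above; the variable names are illustrative.

```python
from statistics import variance

recall = {
    "negative": [0.69, 0.84, 0.93, 0.91, 0.89, 0.90],
    "neutral": [0.59, 0.64, 0.62, 0.71, 0.50, 0.60],
    "positive": [0.64, 0.73, 0.51, 0.68, 0.61],
}

# statistics.variance is the sample variance (divides by N - 1).
variances = {name: variance(scores) for name, scores in recall.items()}
largest = max(variances.values())
smallest = min(variances.values())

# Rule from the slides for unequal sample sizes: largest <= 2 x smallest.
assumption_met = largest <= 2 * smallest

for name, s2 in variances.items():
    print(f"{name:8s} s2 = {s2:.5f}")
print("Homoscedasticity assumption met:", assumption_met)
```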

Memory for Emotional Stimuli
4. Determine critical values
NTotal = 17
NNeg = 6, NNeut = 6, NPos = 5
dfNeg = 5, dfNeut = 5, dfPos = 4
dfBetween = 2, dfWithin = 14
MNeg = .86, MNeut = .61, MPos = .634
s² = .00784, s² = .00472, s² = .00683
Fcrit = 3.74

Memory for Emotional Stimuli
5. Calculate a test statistic
Source    SS    df    MS    F
Between         2
Within          14
Total           16
$$GM = \frac{\sum X}{N_{Total}}$$
$$SS_{Total} = \sum (X - GM)^2$$
$$SS_{Within} = \sum (X - M)^2$$
$$SS_{Between} = \sum (M - GM)^2$$

Memory for Emotional Stimuli
$$SS_{Total} = \sum (X - GM)^2$$
$$GM = \frac{\sum X}{N_{Total}} = .7053$$
X      (X - GM)   (X - GM)^2
0.69   -0.02      0.0002
0.84   0.135      0.0181
0.93   0.225      0.0505
0.91   0.205      0.0419
0.89   0.185      0.0341
0.90   0.195      0.0379
0.59   -0.12      0.0133
0.64   -0.07      0.0043
0.62   -0.09      0.0073
0.71   0.005      0.0
0.50   -0.21      0.0421
0.60   -0.11      0.0111
0.64   -0.07      0.0043
0.73   0.025      0.0006
0.51   -0.2       0.0381
0.68   -0.03      0.0006
0.61   -0.1       0.0091
SSTotal = .3135

Memory for Emotional Stimuli
$$SS_{Within} = \sum (X - M)^2$$
MNeg = .86, MNeut = .61, MPos = .634
X      (X - M)   (X - M)^2
0.69   -0.17     0.0289
0.84   -0.02     0.0004
0.93   0.07      0.0049
0.91   0.05      0.0025
0.89   0.03      0.0009
0.90   0.04      0.0016
0.59   -0.02     0.0004
0.64   0.03      0.0009
0.62   0.01      0.0001
0.71   0.1       0.01
0.50   -0.11     0.0121
0.60   -0.01     0.0001
0.64   0.006     0
0.73   0.096     0.0092
0.51   -0.124    0.0154
0.68   0.046     0.0021
0.61   -0.024    0.0006
SSWithin = .0901

Memory for Emotional Stimuli
$$SS_{Between} = \sum (M - GM)^2$$
GM = .7053
X      M       (M - GM)   (M - GM)^2
0.69   0.86    0.155      0.024
0.84   0.86    0.155      0.024
0.93   0.86    0.155      0.024
0.91   0.86    0.155      0.024
0.89   0.86    0.155      0.024
0.90   0.86    0.155      0.024
0.59   0.61    -0.1       0.009
0.64   0.61    -0.1       0.009
0.62   0.61    -0.1       0.009
0.71   0.61    -0.1       0.009
0.50   0.61    -0.1       0.009
0.60   0.61    -0.1       0.009
0.64   0.634   -0.07      0.005
0.73   0.634   -0.07      0.005
0.51   0.634   -0.07      0.005
0.68   0.634   -0.07      0.005
0.61   0.634   -0.07      0.005
SSBetween = .223

Memory for Emotional Stimuli
Source    SS       df   MS      F
Between   .223     2    .1115   17.969
Within    .0901    14   .0064
Total     ~.3135   16
$$MS_{Between} = \frac{SS_{Between}}{df_{Between}}$$
$$MS_{Within} = \frac{SS_{Within}}{df_{Within}}$$
$$F = \frac{MS_{Between}}{MS_{Within}}$$
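As a cross-check on the hand calculations, the whole source table can be recomputed from the raw recall scores. This is a sketch with illustrative variable names, and `F_CRIT` is simply the tabled value quoted in these slides. Because it carries full precision through every step, the F it produces comes out somewhat below the table's 17.969; either way the statistic sits far above the critical value.

```python
from statistics import mean

recall = {
    "negative": [0.69, 0.84, 0.93, 0.91, 0.89, 0.90],
    "neutral": [0.59, 0.64, 0.62, 0.71, 0.50, 0.60],
    "positive": [0.64, 0.73, 0.51, 0.68, 0.61],
}
F_CRIT = 3.74  # critical F for df = (2, 14) at p = .05, as given in the slides

all_scores = [x for scores in recall.values() for x in scores]
grand_mean = mean(all_scores)

# Sums of squares: group means vs. grand mean, and scores vs. own group mean.
ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in recall.values())
ss_within = sum((x - mean(s)) ** 2 for s in recall.values() for x in s)
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

df_between = len(recall) - 1
df_within = len(all_scores) - len(recall)

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

print(f"GM = {grand_mean:.4f}")
print(f"Between: SS = {ss_between:.4f}, df = {df_between}, MS = {ms_between:.4f}")
print(f"Within:  SS = {ss_within:.4f}, df = {df_within}, MS = {ms_within:.4f}")
print(f"Total:   SS = {ss_total:.4f}")
print(f"F = {f_stat:.2f}; reject null: {f_stat > F_CRIT}")
```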

Memory for Emotional Stimuli
6. Make a decision
F = 17.97 > Fcrit = 3.74
Recall of negative, neutral, and positive pictures was different, F(2, 14) = 17.97, p < .05.
But which pictures were remembered best? Worst?
A Priori & Post-Hoc Tests
Hindsight is 20-20
• Although your data may suggest a new relationship, and thus new analyses...
• Theory should guide research, and thus comparisons should be decided on before you conduct your experiment.
Planned & A Priori Comparisons
• Based on literature review
• Theoretical
• Planned comparisons
  • A test that is conducted when there are multiple groups of scores, but specific comparisons have been specified prior to data collection.
• A Priori Comparisons
Planned & A Priori Comparisons
• If you have planned comparisons
  • Just run t tests
  • Subjective Decision about p value
    • p = .05?
    • p = .01?
    • Bonferroni Correction?
Post-Hoc: Tukey HSD
• Tukey Honestly Significant Difference
  • Determines differences between means in terms of standard error
  • "Honest" because we adjust for making multiple comparisons
  • The HSD is compared to a critical value
• Overview
  1. Calculate differences between a pair of means
  2. Divide this difference by the standard error
  * Basically, this is a variant of a t test *
  Oh no, that means the six steps again...sort of.
Tukey HSD
$$HSD = \frac{M_1 - M_2}{s_M}$$
$$t = \frac{M_1 - M_2}{s_{Difference}}$$
• For Tukey HSD, standard error is calculated differently depending on whether your sample sizes are equal or not.
Tukey HSD
• Equal Sample Sizes
$$s_M = \sqrt{\frac{MS_{Within}}{N}}$$
  N = Sample size within each group
• Unequal Sample Sizes
$$s_M = \sqrt{\frac{MS_{Within}}{N'}}, \qquad N' = \frac{N_{Groups}}{\sum \frac{1}{N}}$$
Tukey HSD
• Determine Critical Value from Table
• Make a Decision
• Let's go back to our memory for emotional pictures example
Tukey HSD: Example
• Memory for Emotional Pictures Example: Between-Subjects One Way ANOVA
• Decision: Recall of negative, neutral, and positive pictures was different, F(2, 14) = 17.97, p < .05.
• Where are our differences?
• Let's get our qcrit first
Tukey HSD: Example
NTotal = 17
NNeg = 6, NNeut = 6, NPos = 5
dfNeg = 5, dfNeut = 5, dfPos = 4
dfBetween = 2 (k = 3), dfWithin = 14
MNeg = .86, MNeut = .61, MPos = .634
qcrit = 3.70
Source    SS       df   MS      F
Between   .223     2    .1115   17.969
Within    .0901    14   .0064
Total     ~.3135   16
Tukey HSD: Example
• Standard Error: Unequal Sample Sizes
$$N' = \frac{N_{Groups}}{\sum \frac{1}{N}} = \frac{3}{\frac{1}{6} + \frac{1}{6} + \frac{1}{5}} = \frac{3}{.533} = 5.625$$
$$s_M = \sqrt{\frac{MS_{Within}}{N'}} = \sqrt{\frac{.0064}{5.625}} = \sqrt{.0011378} = 0.034$$
Tukey HSD: Example
• Negative (M = 0.86) vs. Neutral (M = 0.61)
$$HSD = \frac{M_1 - M_2}{s_M} = \frac{.86 - .61}{.034} = 7.35$$
• Negative (M = 0.86) vs. Positive (M = 0.634)
$$HSD = \frac{M_1 - M_2}{s_M} = \frac{.86 - .634}{.034} = 6.65$$
• Neutral (M = 0.61) vs. Positive (M = 0.634)
$$HSD = \frac{M_1 - M_2}{s_M} = \frac{.61 - .634}{.034} = -0.71$$
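The three comparisons can be reproduced in a short script. This is a sketch using the means, sample sizes, MSWithin, and qcrit quoted in the slides; the names `MS_WITHIN` and `Q_CRIT` are illustrative. It keeps the standard error unrounded, so the HSD values come out slightly larger in magnitude than the hand calculations that use .034, and it compares the absolute value of each HSD to qcrit.

```python
from itertools import combinations
from math import sqrt

means = {"negative": 0.86, "neutral": 0.61, "positive": 0.634}
sizes = {"negative": 6, "neutral": 6, "positive": 5}
MS_WITHIN = 0.0064  # from the source table
Q_CRIT = 3.70       # tabled critical q for k = 3, df_within = 14, as given in the slides

# Unequal sample sizes: harmonic-mean sample size N'.
n_prime = len(sizes) / sum(1 / n for n in sizes.values())
s_m = sqrt(MS_WITHIN / n_prime)

results = {}
for a, b in combinations(means, 2):
    hsd = (means[a] - means[b]) / s_m
    results[(a, b)] = (hsd, abs(hsd) > Q_CRIT)
    print(f"{a} vs. {b}: HSD = {hsd:.2f}, significant = {abs(hsd) > Q_CRIT}")
```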
Tukey HSD: Example
• Make a Decision
• Post hoc comparisons using the Tukey HSD test revealed that negative pictures were better remembered (M = .86) than either positive (M = .634) or neutral (M = .61) pictures, with no differences between the latter two.
Bonferroni Correction
An alternative post-hoc strategy
Bonferroni Correction
Fishing Expedition
• Remember the problem of too many tests?
  • Inflates the risk of a Type I error.
  • False positives
• Is there a way to address that without a new test?
  • We've hinted at it already...
Bonferroni Correction
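The slides do not spell out the arithmetic here, so the following is a sketch of the standard form of the correction rather than the author's own worked example: divide the overall alpha by the number of comparisons and use the result as the p level for each individual test. The function name `bonferroni_alpha` is hypothetical.

```python
def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Per-comparison alpha that keeps the overall Type I error rate near alpha."""
    return alpha / n_comparisons


# Three pairwise comparisons among negative, neutral, and positive pictures.
per_test = bonferroni_alpha(0.05, 3)
print(f"Each of 3 comparisons is tested at p = {per_test:.4f}")
```

The stricter cutoff makes each test more conservative, which is the price paid for controlling false positives without running a separate post-hoc procedure.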
Summary
• Between-Subjects One Way ANOVA
  • Two Sources of Variance
  • New Sums of Squares
  • New df
  • Homoscedasticity
• The problem of too many tests
• Source Table
• Post-Hoc tests
  • Tukey's HSD
  • Bonferroni
  • LSD
  • etc.
