
Goal of Analysis of Variance

The Formal ANOVA Model


Explanation by Example
Multiple Comparisons
Assumptions

Analysis of Variance and Contrasts

Ken Kelley’s Class Notes

Lesson Breakdown by Topic

1 Goal of Analysis of Variance
    A Conceptual Example Appropriate for ANOVA
    Example F-Test for Independent Variances
    Conceptual Underpinnings of ANOVA
    Mean Squares
2 The Formal ANOVA Model
    A Worked Example
3 Explanation by Example
    Example: Weight Loss Drink
    ANOVA Using SPSS
4 Multiple Comparisons
    Why Multiplicity Matters
    Error Rates
    Linear Combinations of Means
    Controlling the Type I Error
5 Assumptions
    Assumptions of the ANOVA
    What You Learned
    Notations

What You Will Learn from this Lesson

You will learn:


How to compare more than two independent means to assess if
there are any differences via an analysis of variance (ANOVA).
How the total sum of squares for the data can be decomposed
into a part that is due to the mean differences between groups
and a part that is due to within-group differences.
Why doing multiple t-tests is not the same thing as ANOVA.
Why doing multiple t-tests leads to a multiplicity issue, in that
as the number of tests increases, so too does the probability of
one or more errors.
How to correct for the multiplicity issue so that a set of
contrasts/comparisons has a Type I error rate for the collection
of tests at the desired (e.g., .05) level.
How to use SPSS and R to implement an ANOVA and
follow-up tests.
Motivation

When looking at different allergy medicines, there are numerous
options. So how can it be determined which brand will work best
when they all claim to do so?
Data could be collected on the outcomes from each product
among numerous individuals randomly assigned to different brands.
An ANOVA could be run to infer if there is a performance
difference between these different brands.
If there are no significant results, evidence would not exist to
suggest there are differences in performance among the brands.
If there are significant results, we would infer that the brands
do not perform the same, but further tests would have to be
conducted so as to infer where the differences are.

Goal of Analysis of Variance

The goal of ANOVA is to detect if mean differences exist
among m groups.
Recall that the independent-groups t-test is designed to detect
differences between two independent groups.
The t-test is a special case of ANOVA when m = 2 (t²_df equals
the F(1,df) from ANOVA for two groups).

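Though the lesson's software examples use SPSS and R, this equivalence is easy to check numerically. Below is a short Python sketch (the two groups' values are made up for illustration) showing that the squared t-statistic from an equal-variance independent-groups t-test matches the F-statistic from a one-way ANOVA on the same two groups:

```python
# Illustrative sketch (not from the notes): for m = 2 groups, t^2 == F.
from scipy import stats

group1 = [6, 4, 5, 7, 3, 6]   # hypothetical scores, group 1
group2 = [8, 9, 7, 10, 8, 9]  # hypothetical scores, group 2

t, _ = stats.ttest_ind(group1, group2)  # pooled-variance t-test
F, _ = stats.f_oneway(group1, group2)   # one-way ANOVA

print(round(t**2, 6), round(F, 6))      # the two values agree: t^2 == F
```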
Obtaining a statistically significant result for ANOVA conveys
that not all groups have the same population mean.
However, a statistically significant ANOVA with more than
two groups does not convey where those differences exist.
Follow-up tests (contrasts/comparisons) can be conducted to
help discern specifically where group means differ.

Consumer Preference

Consider the overall perception of how consumers regard
different companies.
An experiment was done in which 30 individuals were
randomly assigned to one of three groups.
All participants saw (almost) the same commercial advertising
a new Android smart phone.
The difference between the groups was that the commercial
attributed the phone to either (a) Nokia, (b) Samsung, or (c)
Motorola.
Of interest is whether consumers tend to rate the brands
differently, even for the “same” cell phone.
What are other examples in which ANOVA would be useful?

Consider the null hypothesis of equal variances:

H₀: σ₁² = σ₂².

The F-statistic is used to evaluate the above null hypothesis,
and is defined as the ratio of two independent variances:

F(df₁,df₂) = s₁² / s₂²,

where df₁ and df₂ are the degrees of freedom for s₁² and s₂²,
respectively.
Notice that F cannot be negative and is unbounded on the high
side.
The F-distribution is positively skewed.
Examples

We have previously asked questions about mean differences,
but the F-distribution allows us to ask questions about variability.
Is the variability of user satisfaction among Gmail users
different from that among Outlook.com users?
Does Mars have “better control” of M&M’s production
(i.e., smaller variance) than Wrigley has of Skittles?
For a given item, are Wal-Mart prices across the country more
stable than Kroger’s (for like items)?
Does a particular machine (or location/worker/shift) produce
more variable products than a counterpart?

The standard deviation of Gmail user satisfaction was 6.35,
based on a sample size of 55.
The standard deviation of Outlook.com user satisfaction was
8.90, based on a sample size of 42.
For an F-test of this sort addressing any difference in the
variances (e.g., is there more variability in user satisfaction in
one group), there are two critical values, one at the α/2 value
and one at the 1 − α/2 value.
The critical values are ____ and ____ for the .025 and
.975 quantiles (i.e., when α = .05).
The F-statistic for the test of the null hypothesis is

F = 6.35² / 8.90² = 40.3225 / 79.21 = .509.

The conclusion is: ____.

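As a sketch of how the values above can be computed (Python with scipy assumed available; the critical values come from the F quantile function, and the fill-in blanks in the slide are left for the reader):

```python
# Sketch of the variance-ratio F-test from the slide.
from scipy import stats

s1, n1 = 6.35, 55   # Gmail: SD and sample size
s2, n2 = 8.90, 42   # Outlook.com: SD and sample size
df1, df2 = n1 - 1, n2 - 1

F = s1**2 / s2**2                      # 40.3225 / 79.21 ≈ .509
lower = stats.f.ppf(0.025, df1, df2)   # .025-quantile critical value
upper = stats.f.ppf(0.975, df1, df2)   # .975-quantile critical value

print(round(F, 3), round(lower, 3), round(upper, 3))
# Reject H0 only if F falls outside [lower, upper].
```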

Thus far, we have talked only about the idea of comparing
two variances.
But what does this have to do with comparing means, which
is the question we are interested in addressing?

Analysis of variance (ANOVA) considers two variances:
one variance is the variance of the group means;
another variance is the (weighted) mean of the within-group
variances (recall s_p² from the two-group t-test).
We thus consider the variability of the group means to assess
if the population group means differ.

Conceptual Underpinnings of ANOVA

The null hypothesis in an ANOVA context is that all of the
group means are the same: µ₁ = µ₂ = … = µ_m = µ,
where m is the total number of groups.
When the null hypothesis is true, we can estimate the
variance of the scores with two methods, both of which are
independent of one another.

If the ratio of variances (i.e., the F-test) is so much larger than 1
that it seems unreasonable to have happened by chance alone,
then the null hypothesis can be rejected.
Of course, “so much larger than 1 that it seems unreasonable”
is defined in terms of the p-value (compared to α).
If the p-value is smaller than α, the null hypothesis of equal
population means is rejected.
The variance of the scores can be calculated from within each
group and then pooled across the groups (in exactly the same
manner as was done for the independent-groups t-test).

Mean Square Within

Recall that the best way to arrive at a pooled within-group
variance is to calculate a weighted mean of the variances:

s²_Pooled = [Σ_{j=1}^{m} (n_j − 1)s_j²] / [Σ_{j=1}^{m} n_j − m] = [Σ_{j=1}^{m} SS_j] / (N − m) = s²_Within = MS_Within,

where SS is sum of squares, MS is mean square (i.e., a
variance), m is the number of groups, n_j is the sample size in
the jth group (j = 1, …, m), and N is the total sample size
(N = Σ_{j=1}^{m} n_j).

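A minimal Python sketch of this pooling, using the within-group sums of squares from the phone-ratings worked example that appears later in the notes:

```python
# Minimal sketch of MS_Within as pooled within-group variation.
ss_j = [28.5, 82.0, 104.0]  # within-group sum of squares, SS_j, per group
ns = [10, 10, 10]           # group sample sizes, n_j
m, N = len(ns), sum(ns)     # number of groups and total sample size

ms_within = sum(ss_j) / (N - m)  # 214.5 / 27
print(round(ms_within, 2))       # 7.94
```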
In the special case where n₁ = n₂ = … = n_m = n, the
equation for the pooled variance reduces to:

[Σ_{j=1}^{m} s_j²] / m = s²_Within = MS_Within.

Notice that the degrees of freedom here are N − m.
The degrees of freedom are N − m because there are N
independent observations yet m sample means estimated.

Mean Square Between

Recall from the single-group situation that the variance of the
mean is equal to the variance of the scores divided by the
sample size (i.e., s²_Ȳj = s²_Yj / n_j).
That is, the variance of the sample means is the variance of
the scores divided by the sample size.

However, under the null hypothesis, we can calculate the
variance of the sample means directly by using the m means
as if they were individual scores.
Then, an estimate of the variance of the scores could be
obtained by multiplying the variance of the means by the
sample size (s²_Between = n·s²_Ȳ).
If the F-statistic is statistically significant, the conclusion is
that the variance of the means is larger than it should have
been if, in fact, the null hypothesis were true.
Notice that the degrees of freedom here are m − 1.

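A minimal Python sketch of this relationship, using the group means (4.50, 8.00, 7.00) and per-group n = 10 from the worked example later in the notes:

```python
# Sketch: MS_Between as n times the variance of the group means (equal n).
import statistics

means = [4.50, 8.00, 7.00]  # sample means of the m = 3 groups
n = 10                      # per-group sample size

ms_between = n * statistics.variance(means)  # n * s²_Ybar
print(round(ms_between, 2))                  # 32.5
```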
There are thus two variances that estimate the same value
under the null hypothesis.
One (σ²_Within) is calculated by pooling within-group variances.
The other (σ²_Between) is calculated from the variance of the means
multiplied by the within-group sample size.
If the null hypothesis is exactly true, σ²_Between / σ²_Within = 1.
If the null hypothesis is false and mean differences do exist,
s²_Between will be larger than would be expected under the null
hypothesis, and then s²_Between / s²_Within > 1.

If F = s²_Between / s²_Within (i.e., F = MS_Between / MS_Within) is
statistically significant, we will reject the null hypothesis and
conclude that H₀: µ₁ = µ₂ = … = µ_m = µ is false.

Thus, we are comparing means based on variances!

The ANOVA Model

The ANOVA model assumes that the score for the ith individual in
the jth group is a function of some overall mean, µ, some
effect of being in the jth group, τ_j, and some uniqueness, ε_ij.
Such a scenario implies that

Y_ij = µ + τ_j + ε_ij,

where

τ_j = µ_j − µ,

with τ_j being the treatment effect of the jth group.

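A hypothetical simulation of this model can make the pieces concrete; all numbers below are made up for illustration:

```python
# Hypothetical simulation of the ANOVA model Y_ij = mu + tau_j + eps_ij.
import random

random.seed(1)
mu = 6.5                 # overall mean
taus = [-2.0, 1.5, 0.5]  # treatment effects tau_j; note they sum to 0
n = 10_000               # large n so group means sit close to mu + tau_j

groups = [[mu + tau + random.gauss(0, 1) for _ in range(n)] for tau in taus]
means = [sum(g) / n for g in groups]
print([round(m, 1) for m in means])  # close to [4.5, 8.0, 7.0]
```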
When the null hypothesis is true, the sum of the squared τs
equals zero: Σ_{j=1}^{m} τ_j² = 0.
When the null hypothesis is false, the sum of the squared τs
equals some number larger than zero: Σ_{j=1}^{m} τ_j² > 0.

Thus, we can formally write the null and alternative
hypotheses for ANOVA as

H₀: Σ_{j=1}^{m} τ_j² = 0

and

Hₐ: Σ_{j=1}^{m} τ_j² > 0,

respectively.
Note that H₀: Σ_{j=1}^{m} τ_j² = 0 is equivalent to
H₀: µ₁ = µ₂ = … = µ_m = µ.

The null hypothesis can be evaluated by determining,
probabilistically, if the sum of the estimated τs squared is
greater than zero by more than what would be expected by
chance alone.
The “hard to believe” part is evaluated by the specified α
value.

The sums of squares are defined as follows:

SS_Between = SS_Treatment = SS_Among = Σ_{j=1}^{m} n_j (Ȳ_j − Ȳ..)²;

SS_Within = SS_Error = SS_Residual = Σ_{j=1}^{m} Σ_{i=1}^{n_j} (Y_ij − Ȳ_j)²;

SS_Total = Σ_{j=1}^{m} Σ_{i=1}^{n_j} (Y_ij − Ȳ..)².

SS_Total = SS_Between + SS_Within.

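The decomposition SS_Total = SS_Between + SS_Within can be verified numerically; here is a Python sketch on small made-up data:

```python
# Sketch verifying SS_Total = SS_Between + SS_Within (data are made up).
data = {"A": [6, 4, 5, 7, 3], "B": [8, 9, 7, 10, 8], "C": [5, 6, 4, 6, 5]}

all_scores = [y for g in data.values() for y in g]
grand = sum(all_scores) / len(all_scores)  # grand mean, Ybar..

ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2
                 for g in data.values())
ss_within = sum((y - sum(g) / len(g)) ** 2
                for g in data.values() for y in g)
ss_total = sum((y - grand) ** 2 for y in all_scores)

print(abs(ss_total - (ss_between + ss_within)) < 1e-9)  # True
```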
As usual, we divide the sums of squares by the appropriate
degrees of freedom in order to obtain a variance.
In the ANOVA context, a sum of squares divided by its
degrees of freedom is called a mean square: SS/df = MS.
“Mean squares” are so named because when a sum of
squares is divided by its degrees of freedom, the resultant value
is the mean of the squared deviations (i.e., the mean square).
Mean square simply means variance.

In general, the ANOVA source table is defined as:

Source    SS                        df      MS                   F                      p-value
Between   Σ n_j (Ȳ_j − Ȳ..)²        m − 1   SS_Between/(m − 1)   MS_Between/MS_Within   p
Within    Σ_j Σ_i (Y_ij − Ȳ_j)²     N − m   SS_Within/(N − m)
Total     Σ_j Σ_i (Y_ij − Ȳ..)²     N − 1
The ANOVA source table is (very) similar to that used in the
context of multiple regression, a widely applicable future topic.

It can also be shown that the expected values of the mean
squares are given as

E[MS_Between] = σ²_Within + [Σ_{j=1}^{m} n_j τ_j²] / (m − 1),

E[MS_Within] = σ²_Within.

When all of the population means are equal, the second
component of E[MS_Between] is zero, and the expectations of the
two mean squares are the same.
When any population mean difference exists,
E[MS_Between] > E[MS_Within].

Worked Example – Raw Data

Ratings by group, where e_ij = y_ij − ȳ_j and e_ij² = (y_ij − ȳ_j)²:

             Nokia                   Samsung                  Motorola
          y_i1  e_i1  e_i1²      y_i2  e_i2  e_i2²      y_i3  e_i3  e_i3²
            6    1.5   2.25       10    2      4         10    3      9
            6    1.5   2.25       10    2      4          6   −1      1
            2   −2.5   6.25        9    1      1         10    3      9
            3   −1.5   2.25        4   −4     16          5   −2      4
            4   −0.5   0.25        4   −4     16         10    3      9
            4   −0.5   0.25       10    2      4          5   −2      4
            6    1.5   2.25       10    2      4          2   −5     25
            2   −2.5   6.25       10    2      4         10    3      9
            5    0.5   0.25        3   −5     25          2   −5     25
            7    2.5   6.25       10    2      4         10    3      9
Σ        45.00  0.00  28.50    80.00  0.00  82.00     70.00  0.00  104.00
Mean      4.50  0.00            8.00  0.00             7.00  0.00
SD        1.78        1.78      3.02        3.02       3.40         3.40
Variance  3.17        3.17      9.11        9.11      11.56        11.56

Grand mean (ȳ..) = (4.50·10 + 8.00·10 + 7.00·10)/30 = 6.50
The grand mean is the (weighted) mean of the sample means (here it is simply equal to the mean of the means due to equal group sample sizes).

Sums of Squares
Between Sum of Squares = 10·(4.50 − 6.50)² + 10·(8.00 − 6.50)² + 10·(7.00 − 6.50)² = 65.00 = SS_Between
This is the weighted (because each score in a group has the same sample mean, of course) sum of squared deviations between the group means and the grand mean.

Within Sum of Squares = 9·3.17 + 9·9.11 + 9·11.56 = 28.5 + 82 + 104 = 214.50 = SS_Within
This is the sum of each of the within-group sums of squares.

Mean Squares
Now, to obtain the mean squares, divide the sums of squares by their appropriate degrees of freedom:
Mean Square Between = 65.00/(3 − 1) = 32.50 = MS_Between
Mean Square Within = 214.50/27 = 7.94 = MS_Within

Inference
Now, to obtain the F-statistic, divide the Mean Square Between by the Mean Square Within:
F = 32.50/7.94 = 4.091

To obtain the p-value, use the "F.DIST.RT" formula for finding the area in the right tail that exceeds the F value of 4.091:
p = 0.028061704

Now, because the p-value is less than α (.05 being typical), we reject the null hypothesis. We infer that the population group means are not all equal.
Thus, the same phone commercial, as attributed to different brands, had an effect on the ratings of the phone.
The conclusion is that there is an effect of brand on consumer sentiment: consumers rate the same thing differently depending on the brand attribution.

The data are available here: nd.edu/~kkelley/Teaching/Data/Phone_Commercial_Preference.sav.

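The worked example can be checked with scipy's one-way ANOVA routine; this Python sketch (an illustration, not part of the original notes) uses the ratings from the table above:

```python
# Sketch reproducing the worked example's F and p with scipy.
from scipy import stats

nokia    = [6, 6, 2, 3, 4, 4, 6, 2, 5, 7]
samsung  = [10, 10, 9, 4, 4, 10, 10, 10, 3, 10]
motorola = [10, 6, 10, 5, 10, 5, 2, 10, 2, 10]

F, p = stats.f_oneway(nokia, samsung, motorola)
print(round(F, 3), round(p, 4))  # ≈ 4.091 and ≈ 0.0281, as in the slide
```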

Worked Example – Summary Statistics

Summary Statistics from the Phone Evaluation
                           Nokia   Samsung   Motorola
Mean (ȳ_j)                  4.50      8.00       7.00
Standard deviation (s_j)    1.78      3.02       3.40
Sample size (n_j)             10        10         10

Grand mean (ȳ..) = (4.50·10 + 8.00·10 + 7.00·10)/30 = 6.50

Rather than using the full data set, only the summary statistics are actually needed. The reason is that we can determine the sums of squares from the summary data. The within sum of squares is literally the sum of the degrees of freedom multiplied by the variance from each group.

Sums of Squares
Between Sum of Squares = 10·(4.50 − 6.50)² + 10·(8.00 − 6.50)² + 10·(7.00 − 6.50)² = 65.00 = SS_Between
This is the weighted (because each score in a group has the same sample mean, of course) sum of squared deviations between the group means and the grand mean.

Within Sum of Squares = 1.78²·(10 − 1) + 3.02²·(10 − 1) + 3.40²·(10 − 1) = 214.50, which in terms of variances (instead of standard deviations) can be written as:
= 3.17·(10 − 1) + 9.11·(10 − 1) + 11.56·(10 − 1) = 214.50 = SS_Within
Recall that a sum of squares divided by its degrees of freedom is a variance. Correspondingly, a variance multiplied by its degrees of freedom is a sum of squares. Thus, we are able to find the sums of squares by multiplying the variances by their degrees of freedom.

Mean Squares
Now, to obtain the mean squares, divide the sums of squares by their appropriate degrees of freedom:
Mean Square Between = 65.00/(3 − 1) = 32.50 = MS_Between
Mean Square Within = 214.50/27 = 7.94 = MS_Within

Inference
Now, to obtain the F-statistic, divide the Mean Square Between by the Mean Square Within:
F = 32.50/7.94 = 4.091

To obtain the p-value, use the "F.DIST.RT" formula for finding the area in the right tail that exceeds the F value of 4.091:
p = 0.0281

Now, because the p-value is less than α (.05 being typical), we reject the null hypothesis. We infer that the population group means are not all equal.
Thus, the same phone commercial, as attributed to different brands, had an effect on the ratings of the phone.
The conclusion is that there is an effect of brand on consumer sentiment: consumers rate the same thing differently depending on the brand attribution.

The data are available here: nd.edu/~kkelley/Teaching/Data/Phone_Commercial_Preference.sav.



Product Effectiveness: Weight Loss Drinks

Over a two-month period in the early spring, 99 participants from
the Midwest were randomly assigned to one of three groups (33
each) to assess the effectiveness of a meal replacement weight loss
drink.
The study was conducted and analyzed by an independent firm.
The three groups were (a) a control group, (b) SF, and (c) TL.
All participants were encouraged to exercise and were given running
shoes, a workout outfit, and a pedometer.

The summary statistics for weight change in pounds (measured
before breakfast) are given as:

     Control    SF      TL      Total
Ȳ     -1.61    -3.06   -7.29    -3.78
s      1.83     2.12    1.79     3.00
n     26       29      22       77

As can be seen, 22 participants did not complete the study.
Implications?

The following table is the ANOVA source table:


Source SS df MS F p
Between 408.28 2 204.14 54.567 < .001
Within 276.84 74 3.74
Total 685.12 76

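This source table can be rebuilt from the summary statistics alone; here is an illustrative Python sketch (small rounding differences from the table are expected, since the slide's values come from the unrounded data):

```python
# Sketch rebuilding the source table from the summary statistics above.
means = [-1.61, -3.06, -7.29]  # group means: Control, SF, TL
sds = [1.83, 2.12, 1.79]       # group standard deviations
ns = [26, 29, 22]              # group sample sizes
m, N = len(ns), sum(ns)

grand = sum(n * yb for n, yb in zip(ns, means)) / N
ss_between = sum(n * (yb - grand) ** 2 for n, yb in zip(ns, means))
ss_within = sum((n - 1) * s ** 2 for n, s in zip(ns, sds))

F = (ss_between / (m - 1)) / (ss_within / (N - m))
print(round(ss_between, 1), round(ss_within, 1), round(F, 1))
```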
The critical F-value at the .05 level for 2 and 74 degrees of
freedom is F(2,74;.95) = 3.12.

So, given the information, the decision is to ____.

The one-sentence interpretation of the results is:

Performing an ANOVA in SPSS

ANOVA Output from SPSS
Suggestions when Performing ANOVA in SPSS

Start with Analyze → Descriptives → Explore.
Use Analyze → Compare Means → One-Way ANOVA for the
ANOVA procedure.
In the One-Way ANOVA specification, request a Means Plot
(via Options).
Consider using Analyze → General Linear Model → Univariate
for a more general approach.

Omnibus Versus Targeted Tests

Procedures such as the t-test are targeted, and thus test
specific hypotheses.
For example, the independent-groups t-test evaluates the
hypothesis that µ₁ = µ₂.
Thus, after an ANOVA is performed, oftentimes we want to
know where the mean differences exist.
However, part of the rationale for ANOVA was to avoid
performing many separate significance tests.

An Analogy

Consider an airline scheduling system at the gate of departure.
This system requires all five processes to simultaneously
function:
1 live feed from the corporate server;
2 live feed to the corporate server;
3 live feed to the departing airport;
4 live feed to the arrival airport;
5 computer terminal functioning properly (e.g., no software
glitches, no power loss).
Suppose that the “uptime” or “reliability” of each of these
independent processes is .95, meaning that at any given time
there is a 95% chance each process is working.

What is the probability that the system can be used when


needed (i.e., that all five systems working properly)?
Recalling the rule of independent events, the probability that
the system can be used is
.95 × .95 × .95 × .95 × .95 = .955 = .7738.
Thus, even though each piece of the system has a 95%
chance of working properly, there is only a 77.38% chance
that the system itself can be used.
The implication here is that the probability of an error occurring somewhere in
the set of processes (1 − .7738 = .2262) is much higher than for
any given system (1 − .95 = .05).
Note that the error rate for the system as a whole is .2262/.05 = 4.524
times that of a single process!
This is the multiplicity issue – an error somewhere among a
set of “tests” is more likely than an error on any given test.
55 / 104
Why Multiplicity Matters – Multiple Testing

The probability of making at least one Type I error out of C
independent comparisons is given as

p(At least one Type I error) = 1 − p(No Type I errors) = 1 − (1 − α)^C,

where C is the number of independent comparisons to be performed
(based on the rules of probability).
If C = 5, then p(At least one Type I error) = .2262!

Note that this is also the probability that one or more of 5
confidence intervals, each computed at the 95% level, fail to
bracket the population quantity.
The scenario here is analogous to the airline scheduling system.

56 / 104
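The formula above is easy to verify numerically; the sketch below (plain Python, nothing beyond the standard library) computes p(At least one Type I error) for several values of C:

```python
# Probability of at least one Type I error among C independent
# tests, each conducted at per-comparison level alpha:
#   p = 1 - (1 - alpha)**C
def familywise_error(alpha, C):
    return 1 - (1 - alpha) ** C

for C in (1, 5, 10, 20):
    print(C, round(familywise_error(0.05, C), 4))
```

With α = .05, C = 5 reproduces the .2262 of the airline scheduling example, and C = 20 reproduces the .6415 of the jelly-bean comic shown later.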
Types of Error Rates

There are three types of error rates that can be considered:


1 Per comparison error rate (αPC ).
Analogous to the per-process failure rate (5%) in the
airline system example.

2 Familywise error rate (αFW ).


Analogous to the system failure rate (22.62%) of the airline
system example above.
3 Experimentwise error rate (αEW ).
Analogous to the multiple systems being required to fly the
airplane (e.g., not only the scheduling system, but also that
the plane functions properly, the flight team arrives on time,
etc.), which can be much higher than αFW (if there are
multiple families).

57 / 104
Per Comparison Error Rate

αPC : the probability that a particular test (i.e., a comparison)


will reject a true null hypothesis.

This is the Type I error rate we have always used
(as we have only worked with a single test at a time).

58 / 104
Familywise Error Rate (αFW )

αFW : the probability that one or more tests will reject a true
null hypothesis somewhere in the “family.”

Defining exactly what a family is can be difficult and open to


interpretation.
As an aside, there are many statistical issues “open to
interpretation.”

Reasonable people can disagree on how to handle various


issues.

Openness about the method, its assumptions, and its limitations
is key.

59 / 104
Experimentwise Error Rate (αEW )

αEW : the probability that one or more tests will reject a true
null hypothesis somewhere in the “experiment” (or study
more generally).

Modifying the significance criterion so that αFW is the
probability of a Type I error out of the set of C significance tests
is the same as forming C simultaneous confidence intervals.

We do not focus on the experimentwise error rate, as we will
assume a single family for our set of tests.

60 / 104
[Several slides show panels of the XKCD comic “Significant” (http://xkcd.com/882/), in which 20 jelly-bean tests yield one “significant” result. Slide titles: “A Hypothesis, A Result”; “Tests, tests, tests, . . .”; “. . . and more tests. . .”; “. . . and yet more tests. . .”; “A Type I Error (It Seems)”; “After Many Tests, A ‘Finding’”; “Error Rate”]

The probability of at least one Type I error for 20 independent tests,
which the jelly bean comparisons were, is

1 − (1 − .05)^20 = 1 − .3584859 = .6415141.

Thus, there is a 64% chance of at least one Type I error in such a case!


A Summary. . .

Multiplicity Matters!
Linear Comparisons: Specifying Contrasts of Interest

Suppose a question of interest is the contrast of group 1 and


group 3 in a three group design.
That is, we are interested in the following effect: Ȳ1 − Ȳ3 .

The above is equivalent to: (1) × Ȳ1 + (0) × Ȳ2 + (−1) × Ȳ3 .

70 / 104
Suppose a question of interest is comparing the mean of groups 1
and 2 (i.e., the average of the two group means) with group 3
in a three-group design.
That is, we are interested in the following effect: (Ȳ1 + Ȳ2)/2 − Ȳ3.

The above is equivalent to: (Ȳ1 + Ȳ2)/2 + (−1) × Ȳ3.

The above is equivalent to: (1/2) × Ȳ1 + (1/2) × Ȳ2 + (−1) × Ȳ3.

We could also write the above as:
(.5) × Ȳ1 + (.5) × Ȳ2 + (−1) × Ȳ3.

71 / 104
Consider a situation in which group 1 receives one type of


allergy medication, group 2 receives another type of allergy
medication, and group 3 receives a placebo (i.e., no
medication).
The question here is “does taking medication have an effect,
relative to not taking medication, on self-reported allergy symptoms?”

72 / 104
Forming Linear Comparisons

In the population, the value of any contrast of interest is
given as

Ψ = c1 µ1 + c2 µ2 + c3 µ3 + . . . + cm µm = Σ_{j=1}^{m} cj µj,

where cj is the comparison weight for the jth group and Ψ is
the population value of a particular linear combination of
means.
An estimated linear comparison is of the form

Ψ̂ = c1 Ȳ1 + c2 Ȳ2 + c3 Ȳ3 + . . . + cm Ȳm = Σ_{j=1}^{m} cj Ȳj,

where cj is the comparison weight for the jth group and Ψ̂ is
the estimated value of the particular linear combination of means.
73 / 104
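An estimated contrast is just a weighted sum of sample means, so it takes only a few lines to compute; the sketch below uses illustrative group means (not values from the lecture's data):

```python
# Estimated contrast: psi_hat = sum_j c_j * Ybar_j, where the
# c-weights must sum to 0.
def contrast(c_weights, means):
    assert abs(sum(c_weights)) < 1e-12, "c-weights must sum to 0"
    return sum(c * ybar for c, ybar in zip(c_weights, means))

means = [10.0, 12.0, 15.0]  # illustrative group means

print(contrast([1, 0, -1], means))      # pairwise: group 1 vs. group 3 -> -5.0
print(contrast([0.5, 0.5, -1], means))  # complex: (10 + 12)/2 - 15 -> -4.0
```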
Forming Linear Comparisons

The first example from above compared the mean of
group 1 versus group 3.
In c-weight form the c-weights are [1, 0, −1]:
(1) × Ȳ1 + (0) × Ȳ2 + (−1) × Ȳ3.

Comparing one mean to another (i.e., using a 1 and a −1
c-weight with the rest 0s) is called a pairwise comparison (as
the comparison involves only a pair).

74 / 104
Forming Linear Comparisons

The second example compared the mean of groups 1 and
2 versus group 3.
In c-weight form the c-weights are [.5, .5, −1]:
(.5) × Ȳ1 + (.5) × Ȳ2 + (−1) × Ȳ3.

Comparing a weighting of two or more groups to one or more
other groups is called a complex comparison. That is, if the
c-weights are anything other than a 1 and a −1 with the rest 0s,
it is a complex comparison.

75 / 104
It is required that Σ_{j=1}^{m} cj = 0.

For example, setting c1 to 1 and c2 to −1 yields the pair-wise


comparison of Group 1 and Group 2:

Ψ̂ = (1 × Ȳ1 ) + (−1 × Ȳ2 ) = Ȳ1 − Ȳ2 .

76 / 104
Rules for c-Weights

The positive c-weights for a comparison should sum to 1.
The negative c-weights for a comparison should sum to −1.
By implication of the two rules above, all
c-weights for a comparison should sum to 0 (i.e., Σ cj = 0).

Otherwise, the corresponding confidence interval is not as
intuitive.
However, any rescaling of such c-weights produces the same
t-test.
The confidence interval will then have a different interpretation than
usual, as the effect will be for a specific linear combination
(e.g., Ψ̂ = 2Ȳ1 − Ȳ2 − Ȳ3).
77 / 104
Thus, for the mean of Groups 1 and 2 compared to the mean


of Group 3, the contrast is

Ψ̂ = (.5 × Ȳ1) + (.5 × Ȳ2) + (−1 × Ȳ3) = (Ȳ1 + Ȳ2)/2 − Ȳ3.

78 / 104
Consider a situation in which one wants to weight the groups


based on the relative size of an outside factor, such as
market share, profit per segment, number of users, et cetera.

Suppose that interest is in comparing teens versus a weighted


average of 20 year olds and 30 year olds in an online
community, where among the 20 and 30 year olds the
proportion of users is 70 percent and 30 percent, respectively.

Ψ̂ = 1 × ȲTeens + (−.70 × Ȳ20s ) + (−.30 × Ȳ30s ).

79 / 104
There are technically an infinite number of comparisons that


can be formed, but only a few will likely be of interest.

The comparisons are formed so that targeted research


questions about population mean differences can be
addressed.

But recall that, in general, the positive c-weights should sum
to 1 and the negative c-weights should sum to −1, so as to
have a more interpretable confidence interval.

80 / 104
A More Powerful t-Test

The t-test corresponding to a particular contrast is given as

t = ( Σ cj Ȳj ) / √( MSWithin Σ_{j=1}^{m} (cj²/nj) ) = Ψ̂ / SE(Ψ̂),

where the MSWithin is from the ANOVA and is the best


estimate of the population variance.
Importantly, this t-test has N − m degrees of freedom (i.e.,
the MSWithin degrees of freedom).
Note that the denominator is simply the standard error of the
contrast, which is used for the corresponding confidence
interval.
81 / 104
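The contrast t-test above can be sketched in a few lines of Python; the means, group sizes, and MSWithin value below are illustrative assumptions, not values from the lecture's example:

```python
import math

# t-test for a contrast, using MS_Within from the ANOVA as the
# pooled variance estimate; df = N - m (the MS_Within df).
def contrast_t(c_weights, means, ns, ms_within):
    psi_hat = sum(c * ybar for c, ybar in zip(c_weights, means))
    se = math.sqrt(ms_within * sum(c ** 2 / n for c, n in zip(c_weights, ns)))
    df = sum(ns) - len(ns)  # N - m
    return psi_hat / se, se, df

# Complex comparison (groups 1 and 2 vs. group 3), illustrative numbers:
t, se, df = contrast_t([0.5, 0.5, -1], [10.0, 12.0, 15.0], [20, 20, 20], 9.0)
print(t, se, df)
```

The denominator `se` is the standard error of the contrast, so the same quantity feeds directly into the corresponding confidence interval.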
Recall that when the homogeneity of variance assumption holds,
there are m different estimates of σ².

For homogeneous variances, the best estimate of the population


variance for any group is the mean square error (MSWithin ), which
uses information from all groups.

Thus, the independent groups t-test can be given as

t = (Ȳj − Ȳk) / √( MSWithin (1/nj + 1/nk) ),

with degrees of freedom based on the mean square within (N − m),


which provides more power.

82 / 104
The above two-group t-test still addresses the question
“does the population mean of Group 1 differ from the
population mean of Group 2?”

However, more information is used because the error
term is based on N − m degrees of freedom instead of
n1 + n2 − 2.

83 / 104
The MSWithin – Even for a Single Group


Even if we are interested in testing or forming a confidence
interval for a single group, the mean square within can (and
usually should) be used — again, due to having a better
estimate of σ²:

t = (Ȳj − µ0) / √( MSWithin (1/nj) ).

The two-sided confidence interval is thus:

Ȳj ± √( MSWithin (1/nj) ) × t(1−α/2, N−m).

The degrees of freedom for the above test and confidence
interval are, because MSWithin is used as the estimate of σ²,
N − m.
84 / 104
Thus, using MSWithin is one way to have more power to test


the null hypothesis concerning a single group or two groups,
even when more than two groups are available.

Additionally, precision is increased because the confidence


interval will be narrower (due to the smaller standard error
and smaller critical value).

85 / 104
The Bonferroni Procedure

The Bonferroni Procedure is also called Dunn’s procedure.

Good for a few pre-planned targeted tests, but doing too


many leads to conservative critical values.
Conservative critical values are those that are bigger (i.e.,
harder to achieve significance) than would be the case ideally.

Liberal critical values are those that are smaller (i.e., easier to
achieve significance) than would be the case ideally.

86 / 104
It can be shown that αPC ≤ αFW ≤ C αPC , where C is the


number of comparisons.

The per comparison error rate can be manipulated by dividing
the desired familywise (or experimentwise) Type I error rate by
C, the number of comparisons: αPC = αFW /C.

The standard t-test formula is used, but the obtained t value
is compared to a critical value based on α/C: t(1−(α/C)/2, df).
The observed p-values (e.g., from SPSS) can be corrected for
multiplicity by multiplying the C p-values by C .
If the corrected p value is less than αFW , then the test is
statistically significant in the context of a correct familywise
Type I error rate.

87 / 104
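Both forms of the Bonferroni adjustment (dividing α by C, or multiplying the observed p-values by C) can be sketched as follows; the example p-values are hypothetical:

```python
# Bonferroni: to hold the familywise Type I error rate at alpha_FW
# across C comparisons, test each comparison at alpha_FW / C.
def bonferroni_alpha(alpha_fw, C):
    return alpha_fw / C

# Equivalently, multiply each observed p-value by C (capped at 1)
# and compare the adjusted p-values to alpha_FW directly.
def bonferroni_adjust(p_values):
    C = len(p_values)
    return [min(1.0, p * C) for p in p_values]

print(bonferroni_alpha(0.05, 4))  # 0.0125
print(bonferroni_adjust([0.010, 0.040, 0.300]))
```

The two routes are equivalent: a raw p-value below αFW /C is the same event as an adjusted p-value below αFW.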
The critical value is what changes in the context of a


Bonferroni test, not the way in which the t-test and/or
confidence intervals are calculated.

Incorporating MSWithin into the denominator of the t-test is
not really a change, as MSWithin is just a pooled variance
based on m (rather than 2) groups.
Recall this is just an extension of s²Pooled when information on
more than two groups is available.

88 / 104
Tukey’s Test

Tukey’s test is used when all (or several) pairwise comparisons


are to be tested.

For comparing all possible pair-wise comparisons, Tukey’s test


provides the most powerful multiple comparison procedure.
There is a Tukey-b in SPSS — I recommend “Tukey.”
For the Tukey procedure, the p-values and confidence intervals
given by SPSS are already “corrected” for multiplicity.

89 / 104
Pairwise comparisons compare the means of two groups (i.e.,
a pair; e.g., µ1 − µ3), without any complex
comparisons (e.g., (Ȳ1 + Ȳ2)/2 − Ȳ3).

The observed test statistic is compared to the tabled values of


the Studentized range distribution.
This is the distribution that the Tukey procedure uses to
obtain confidence intervals and p-values.

90 / 104
The Scheffé Test

For any number of post hoc tests with any linear combination
of means, the Scheffé Test is generally optimal.

Although the Scheffé Test is conservative for a small number


of comparisons, any number of comparisons can be conducted
while controlling the Type I error rate.

91 / 104
We compute the F -value (just a t-value squared) in accord


with some linear combination of means, and a critical value is
determined for the specific context.

The Scheffé critical F -value (take the square root for the
critical t-value) is given as

(m − 1)F(m−1, N−m; α),

which is m − 1 times the critical ANOVA F -value.

92 / 104
The Scheffé procedure should not be done for all pairwise


comparisons (it is not as powerful as Tukey’s Test for pairwise
comparisons).

If many complex and other (e.g., pairwise) are to be done,


usually the Scheffé procedure is optimal.

93 / 104
Flowchart for Multiple Comparisons

Begin:

1 Testing all pairwise and no complex comparisons (either
planned or post hoc), or choosing to test only some pairwise
comparisons post hoc?
Yes → use Tukey’s method.
No → go to 2.

2 Are all comparisons planned?
No → use Scheffé’s method.
Yes → go to 3.

3 Is the Bonferroni critical value less than the Scheffé critical value?
Yes → use Bonferroni’s method.
No → use Scheffé’s method (or, prior to collecting the data,
reduce the number of contrasts to be tested).
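The flowchart's decision logic can be sketched as a small helper function (a hypothetical implementation; the function and argument names are mine, not from the slides):

```python
# Decision helper following the flowchart: Tukey for pairwise-only
# testing, Scheffe for unplanned (post hoc) complex comparisons, and
# Bonferroni for planned comparisons when its critical value is
# smaller than Scheffe's.
def choose_procedure(pairwise_only, all_planned,
                     bonferroni_crit=None, scheffe_crit=None):
    if pairwise_only:
        return "Tukey"
    if not all_planned:
        return "Scheffe"
    if (bonferroni_crit is not None and scheffe_crit is not None
            and bonferroni_crit < scheffe_crit):
        return "Bonferroni"
    # Otherwise use Scheffe (or reduce the number of planned contrasts
    # before collecting the data).
    return "Scheffe"

print(choose_procedure(pairwise_only=True, all_planned=False))  # Tukey
```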
SPSS does not make it easy to get the appropriate p-values


and confidence intervals for complex comparisons.

The Bonferroni and Scheffé procedures in SPSS are for


pair-wise comparisons, which are not of interest because
Tukey is almost always preferred for pair-wise.

For the specified contrasts, SPSS reports only the standard


output (i.e., not controlling the Type I error rate).

Thus, users need to be careful that they are appropriately
controlling the Type I error rate.

95 / 104
Goal of Analysis of Variance
The Formal ANOVA Model Assumptions of the ANOVA
Explanation by Example What You Learned
Multiple Comparisons Notations
Assumptions

Assumptions of the ANOVA

The assumptions of the ANOVA are the same as for the


two-group t-test.
1 The population from which each group’s scores were sampled is
normally distributed.

2 The variance for each of the m groups is the same.

3 The observations are independent.

Recall that multiple regression assumes homoscedasticity,


which is just an extension of homogeneity of variance.

96 / 104
Also like the independent groups t-test, the first two


assumptions become less important as sample size increases.
This is especially true when the per-group sample sizes are equal or
nearly so.

Thus, the larger the sample size, the more robust the model to
these two assumption violations.

97 / 104
Again, like the t-test, the ANOVA is very sensitive (i.e., it is


not robust) to violations of the assumption of independence.
Observations that are not independent can make the empirical
α rate much different than the nominal α rate.

98 / 104
Analysis of variance procedures test an omnibus (i.e., an


overall) hypothesis.
More specifically, ANOVA models test the hypothesis that
µ1 = µ2 = . . . = µm .

In many situations, primary interest concerns targeted null


hypotheses (not just the omnibus hypothesis).

Thus, additional analyses may be necessary.

99 / 104
A Summary from Designing Experiments and Analyzing Data
This discussion “focuses on special methods that are needed when the
goal is to control αFW instead of to control αPC . Once a decision has
been made to control αFW , further consideration is required to choose an
appropriate method of achieving this control for the specific
circumstance. One consideration is whether all comparisons of interest
have been planned in advance of collecting the data. If so, the Bonferroni
adjustment is usually most appropriate, unless the number of planned
comparisons is quite large. Statisticians have devoted a great deal of
attention to methods of controlling αFW for conducting all pairwise
comparisons, because researchers often want to know which groups differ
from other groups. We generally recommend Tukey’s method for
conducting all pairwise comparisons. Neither Bonferroni nor Tukey is
appropriate when interest includes complex comparisons chosen after
having collected the data, in which case Scheffé’s method is generally
100 / 104
What You Learned from this Lesson


You learned:
How to compare more than two independent means to assess if
there are any differences via analysis of variance (ANOVA).
How the total sums of squares for the data can be decomposed
to a part that is due to the mean differences between groups
and to a part that is due to within group differences.
Why doing multiple t-tests is not the same thing as ANOVA.
Why doing multiple t-tests leads to a multiplicity issue, in that
as the number of tests increases, so too does the probability of
one or more errors.
How to correct for the multiplicity issue so that a set of
contrasts/comparisons has a Type I error rate for the collection
of tests at the desired (e.g., .05) level.
How to use SPSS to implement an ANOVA and follow-up
tests.
101 / 104
Notations

H0: σ₁² = σ₂² - The null hypothesis of equal variances

F(df1, df2) - The F -statistic with df1 and df2 as the degrees of
freedom
s₁² and s₂² - The variances for group 1 and group 2, respectively
s²Pooled - Pooled within-group variance
m - Number of groups
nj - Sample size in the jth group (j = 1, . . . , m)
N - Total sample size: N = Σ_{j=1}^{m} nj

102 / 104
Notations Continued

SS - Sum of squares
This can be for the Between, Treatment, Among, Within,
Error, or Total Sum of Squares
MS - Mean square (i.e., a variance)
MSWithin is the mean square within a group
Yij - The score for the ith individual in the jth group

τj - The treatment effect of the jth group

εij - Some uniqueness for the ith individual in the jth group

E[MSWithin ] - The expected value of the mean squares within


a group

103 / 104
Notations Continued

C - The number of independent comparisons to be performed

αPC - Per comparison error rate

αFW - Familywise error rate

αEW - Experimentwise error rate

Ψ̂ - The particular linear combination of means

cj - Comparison weight for the jth group

104 / 104
