
Week 11
Parametric technique and its non-parametric alternative:
One-way between-groups ANOVA: Kruskal-Wallis Test
One-way repeated measures ANOVA: Friedman Test
Two-way between-groups ANOVA: none
Mixed between-within groups ANOVA: none

Analysis of variance (ANOVA) is used to compare means when you have two or more groups or time points.
Paired-sample/ repeated measures/ within-group techniques are used when you test the same people on more than one occasion, or when you have matched pairs.
Independent/ between-group techniques are used when the participants in each group are different people (independent of one another).
One-way ANOVA: one categorical independent variable (e.g., gender) and one continuous dependent variable (e.g., scores).
Two-way ANOVA: two categorical independent variables (e.g., gender, age group) and one continuous dependent variable (e.g., scores).

Examples for types of ANOVA:


A manager wants to raise the productivity at his company by increasing the speed at which his
employees can use a particular spreadsheet program. As he does not have the skills in-house, he
employs an external agency which provides training in this spreadsheet program. They offer 3
courses: a beginner, intermediate and advanced course. He is unsure which course is needed for the
type of work they do at his company, so he sends 10 employees on the beginner course, 10 on the
intermediate and 10 on the advanced course. When they all return from the training, he gives them
a problem to solve using the spreadsheet program, and times how long it takes them to complete
the problem. He then compares the three courses (beginner, intermediate, advanced) to see if there
are any differences in the average time it took to complete the problem.
Appropriate test: one-way between-group ANOVA

Heart disease is one of the largest causes of premature death and it is now known that chronic, low-level inflammation is a cause of heart disease. Exercise is known to have many benefits, including
protection against heart disease. A researcher wants to know whether this protection against heart
disease might be due to exercise reducing inflammation. The researcher was also curious as to
whether this protection might be gained over a short period of time or whether it took longer. In
order to investigate this idea, the researcher recruited 20 participants who underwent a 6-month
exercise training program. In order to determine whether inflammation had been reduced, the
researcher measured the inflammatory marker called CRP at pre-training, 2 weeks into training and
after 6 months of training.
Appropriate test: one-way repeated measures ANOVA/ one-way within-group ANOVA

Assumptions for one-way between-group/within-group ANOVA:
Before running any parametric test, we always need to make sure that the data we want to analyse
can actually be analysed using a one-way ANOVA.
Assumption #1 (between-group and within-group): The dependent variable should be measured at the interval or ratio level (i.e., on a continuous scale rather than a discrete scale).
For example: revision time (measured in hours), intelligence (measured using IQ score), exam performance (measured from 0 to 100), weight (measured in kg).
Assumption #2 (between-group): The independent variable should consist of two or more categorical, independent groups. When you have only two groups (e.g., gender: male and female), an independent-samples t-test is commonly used, although a one-way ANOVA will lead to the same result.
For example: ethnicity (e.g., 3 groups: Caucasian, African American and Hispanic), physical activity level (e.g., 4 groups: sedentary, low, moderate and high), profession (e.g., 5 groups: surgeon, doctor, nurse, dentist, therapist).

Assumption #2 (within-group): The independent variable should consist of at least two categorical "related groups" or "matched pairs". "Related groups" indicates that the same subjects are present in every group. It is possible to have the same subjects in each group because each subject has been measured on two or more occasions on the same dependent variable.
For example: individuals' performance in a spelling test (the dependent variable) before and after they underwent a new form of computerized teaching method to improve spelling. The repeated measures ANOVA can also be used to compare different subjects (matched pairs), but this does not happen very often.

Assumption #3: You should have independence of observations, which means that there is no
relationship between the observations in each group or between the groups themselves. For
example, when using between-group techniques, there must be different participants in each
group with no participant being in more than one group. This is more of a study design issue than
something you can test for, but it is an important assumption of the one-way ANOVA. If your
study fails this assumption, you will need to use another statistical test instead of the one-way
ANOVA (e.g., a repeated measures ANOVA).
This assumption does not apply to within-group techniques, where the same participants are measured in every group.
Assumption #4: There should be no significant outliers. Outliers are simply single data points
within your data that do not follow the usual pattern (e.g., in a study of 100 students' IQ scores,
where the mean score was 108 with only a small variation between students, one student had a
score of 156, which is very unusual, and may even put her in the top 1% of IQ scores globally). The
problem with outliers is that they can have a negative effect on the one-way ANOVA, reducing the
validity and accuracy of your results.
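One way to screen for outliers before running the ANOVA is the boxplot rule, which flags values lying more than 1.5 interquartile ranges beyond the quartiles. The sketch below assumes that convention (the notes do not name a specific rule), and the IQ scores are hypothetical values invented for illustration, not data from the notes.

```python
import statistics

def flag_outliers(scores, k=1.5):
    """Return the values lying more than k * IQR beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(scores, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in scores if x < low or x > high]

# Hypothetical IQ scores: tightly clustered around 108, plus one extreme value.
iq_scores = [104, 106, 107, 108, 108, 109, 110, 111, 112, 156]
outliers = flag_outliers(iq_scores)
print(outliers)
```

Any value flagged this way should be inspected rather than removed automatically, since the cut-off is only a rule of thumb.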
Assumption #5: Your dependent variable should be approximately normally distributed for each category of the independent variable. The one-way ANOVA only requires approximately normal data because it is quite "robust" to violations of normality, meaning that the assumption can be violated to some degree and the test will still provide valid results, provided that the design is balanced (equal group sizes). You can test for normality using the Shapiro-Wilk test of normality, which is available in SPSS Statistics.

Assumption #6 (between-group): There needs to be homogeneity of variances. You can test this assumption in SPSS Statistics using Levene's test for homogeneity of variances. If your data fail this assumption, you will need to carry out a Welch ANOVA instead of a one-way ANOVA, which you can do using SPSS Statistics, and also use a different post-hoc test.

Assumption #6 (within-group): Known as sphericity, the variances of the differences between all combinations of related groups must be equal. Unfortunately, repeated measures ANOVAs are particularly susceptible to violating the assumption of sphericity, which causes the test to become too liberal (i.e., leads to an increase in the Type I error rate; that is, the likelihood of detecting a statistically significant result when there isn't one). Fortunately, SPSS Statistics makes it easy to test whether your data have met or failed this assumption.

Example:
Research question: Is there a statistically significant difference in undergraduate students' grade points for a Statistics class based on the type of lecture medium (online conference class, traditional lecture, and traditional lecture supplemented by online conference class)?
$H_0$: There is no statistically significant difference in undergraduate students' grade points for a Statistics class based on the type of lecture (online conference class, traditional lecture, and traditional lecture supplemented by online conference class).
Null hypothesis: No difference between population means ($\mu_1 = \mu_2 = \mu_3$).
Research hypothesis: Population means are different (at least one of $\mu_1$, $\mu_2$, $\mu_3$ differs from the others).

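[Figure: an independent-samples t-test compares an outcome (dependent variable) between a control group and one treatment group; a one-way ANOVA compares an outcome (dependent variable) across a control group and two or more treatment groups (Treatment 1, Treatment 2).]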

Example #1 (post hoc):
We have three teaching programs (online conference class, traditional lecture, and traditional lecture supplemented by online conference class) and we are interested in the effectiveness of each program in increasing undergraduate students' grade points for a Statistics class. Below is the data for analysis.

Online conference class: 12, 15, 9 (mean = 12.00)
Traditional lecture: 20, 19, 23 (mean = 20.67)
Traditional lecture supplemented by online conference class: 40, 35, 42 (mean = 39.00)

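Table Summary:
Between: SS(B) = $n \sum_j (\bar{X}_j - \bar{X})^2$; df = $k - 1$; MS(B) = SS(B) / $(k - 1)$; F-ratio = MS(B) / MS(W)
Within: SS(W) = $\sum_j \sum_i (X_{ij} - \bar{X}_j)^2$; df = $N - k$; MS(W) = SS(W) / $(N - k)$
Total: SS(T) = $\sum_j \sum_i (X_{ij} - \bar{X})^2$; df = $N - 1$

i refers to the individual cell.
j refers to the specific group.
k refers to the number of conditions/ treatments/ groups.
n is the number of observations in each group (level of factor A).
N refers to the total number of participants for the entire study.
$\bar{X}_j$ is the mean of group j and $\bar{X}$ is the grand mean.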

Computation of ANOVA:
Sum of squares between-groups examines the differences among the group means by calculating the variation of each group mean ($\bar{X}_j$) around the grand mean ($\bar{X}$). This is variation in scores that is due to the treatment (or independent variable).
$SS_A = n \sum_j (\bar{X}_j - \bar{X})^2$

$\bar{X} = \frac{12.00 + 20.67 + 39.00}{3} = \frac{12 + 15 + 9 + 20 + 19 + 23 + 40 + 35 + 42}{9} = 23.89$

$SS_A = 3 [(12.00 - 23.89)^2 + (20.67 - 23.89)^2 + (39.00 - 23.89)^2] = 1140.22$ (computed with unrounded means)

Sum of squares within-group examines error variation or variation of individual scores around each group mean. This is variation in scores that is not due to the treatment (or independent variable) but due to variation in individuals.
$SS_{S/A} = \sum_j \sum_i (X_{ij} - \bar{X}_j)^2$

$SS_{Online} = (12 - 12)^2 + (15 - 12)^2 + (9 - 12)^2 = 0 + 9 + 9 = 18.00$
$SS_{Traditional} = (20 - 20.67)^2 + (19 - 20.67)^2 + (23 - 20.67)^2 = 0.44 + 2.78 + 5.44 = 8.67$
$SS_{Online\ and\ traditional\ combination} = (40 - 39)^2 + (35 - 39)^2 + (42 - 39)^2 = 1 + 16 + 9 = 26.00$
$SS_{S/A} = 18.00 + 8.67 + 26.00 = 52.67$

Total sum of squares can be computed by adding $SS_A$ and $SS_{S/A}$, or by subtracting the grand mean from each score, squaring, and then summing across all cases.
$SS_T = SS_A + SS_{S/A}$
$SS_T = \sum_j \sum_i (X_{ij} - \bar{X})^2$

$SS_T = (12 - 23.89)^2 + (15 - 23.89)^2 + (9 - 23.89)^2 + (20 - 23.89)^2 + (19 - 23.89)^2 + (23 - 23.89)^2 + (40 - 23.89)^2 + (35 - 23.89)^2 + (42 - 23.89)^2$
$= 141.35 + 79.01 + 221.68 + 15.12 + 23.90 + 0.79 + 259.57 + 123.46 + 328.01$
$= 1192.89$

ANOVA summary:
Between: SS = 1140.22; df = 3 - 1 = 2; MS = 570.11; F-ratio = 64.95
Within: SS = 52.67; df = 9 - 3 = 6; MS = 8.778
Total: SS = 1192.89; df = 9 - 1 = 8

F(2, 6) = 64.95, p < .01.
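The hand calculation above can be checked with a short script. This is a minimal sketch using only the Python standard library; the variable names are illustrative, and it reproduces the sums of squares, mean squares and F-ratio but not the p-value (which needs the F-distribution).

```python
from statistics import mean

# Grade points from Example #1, one list per teaching program.
groups = {
    "online": [12, 15, 9],
    "traditional": [20, 19, 23],
    "combined": [40, 35, 42],
}

all_scores = [x for g in groups.values() for x in g]
grand_mean = mean(all_scores)
k = len(groups)        # number of groups
N = len(all_scores)    # total number of participants

# Between-groups: variation of each group mean around the grand mean.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
# Within-groups: variation of each score around its own group mean.
ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)
# Total: variation of each score around the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

ms_between = ss_between / (k - 1)
ms_within = ss_within / (N - k)
f_ratio = ms_between / ms_within

print(f"SS between = {ss_between:.2f}, SS within = {ss_within:.2f}, SS total = {ss_total:.2f}")
print(f"F({k - 1}, {N - k}) = {f_ratio:.2f}")
```

Because the script works with unrounded means, its between-groups sum of squares can differ in the second decimal from a hand calculation that rounds the means first.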

Test procedures in SPSS Statistics
1. Click Analyze > Compare Means > One-way ANOVA
2. Dependent List: Dependent Variable
3. Factor: Independent Variable
4. Post hoc: Tukey
5. Options: Descriptive, Homogeneity of variance test, Means Plot
6. Missing Values: Exclude cases analysis by analysis

Data Outcome:
Levene's test checks the homogeneity of variance (HOV), i.e., whether the variances of the groups are the same.
***If Levene's test is significant (i.e., the p-value is less than .05), the variances are significantly different and the assumption of homogeneity has been violated. We want the p-value for Levene's test to be more than .05, so that the HOV assumption is not violated.
When the HOV assumption has been violated, refer to the table Robust Tests of Equality of Means.
Solution: Adjust the F-test to correct the problem using the Brown-Forsythe (1974) F-ratio or Welch's F.
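To make the idea behind Levene's test concrete, the sketch below computes its classic mean-based statistic for the Example #1 data: a one-way ANOVA F-ratio calculated on each score's absolute deviation from its own group mean. The function name is illustrative, and the p-value (which SPSS reports and which requires the F-distribution) is deliberately left out.

```python
from statistics import mean

def levene_statistic(groups):
    """Mean-based Levene statistic: an ANOVA F-ratio on absolute deviations."""
    # Absolute deviation of every score from its own group mean.
    deviations = [[abs(x - mean(g)) for x in g] for g in groups]
    k = len(deviations)
    n_total = sum(len(z) for z in deviations)
    grand = mean(v for z in deviations for v in z)
    ss_between = sum(len(z) * (mean(z) - grand) ** 2 for z in deviations)
    ss_within = sum((v - mean(z)) ** 2 for z in deviations for v in z)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

groups = [[12, 15, 9], [20, 19, 23], [40, 35, 42]]
w = levene_statistic(groups)
print(f"Levene statistic on ({len(groups) - 1}, 6) df: {w:.3f}")
```

The statistic is compared against the F-distribution with k - 1 and N - k degrees of freedom. For these data it falls well below the .05 critical value of about 5.14 for (2, 6) degrees of freedom, so the homogeneity assumption would not be rejected.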
Effect size:
One can estimate the magnitude of the effect of the independent variable by computing $\eta^2$ or $\omega^2$.
$\eta^2 = \frac{SS_A}{SS_T} = \frac{1140.222}{1192.9} = 0.956$, or about 96%

$\omega^2 = \frac{SS_A - (k - 1)(MS_{S/A})}{SS_T + MS_{S/A}} = \frac{1140.222 - (3 - 1)(8.778)}{1192.9 + 8.778} = 0.934$, or about 93%
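The two effect-size formulas can be evaluated directly from the ANOVA summary values. The sketch below is illustrative only: the function names are made up, k is the number of groups (3 in this example), and the inputs are the sums of squares and mean square from the worked example, so the printed values may differ in the last digit from figures rounded by hand.

```python
def eta_squared(ss_a, ss_t):
    """Proportion of total variation attributable to the treatment."""
    return ss_a / ss_t

def omega_squared(ss_a, ss_t, ms_error, k):
    """Less biased estimate; k is the number of groups."""
    return (ss_a - (k - 1) * ms_error) / (ss_t + ms_error)

# Summary values from the worked example.
ss_a, ss_t, ms_error, k = 1140.222, 1192.889, 8.778, 3

eta2 = eta_squared(ss_a, ss_t)
omega2 = omega_squared(ss_a, ss_t, ms_error, k)
print(f"eta squared = {eta2:.3f}, omega squared = {omega2:.3f}")
```

Omega squared is always a little smaller than eta squared because it subtracts the variation expected from error alone.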

Reporting of results:
There was a statistically significant difference between groups as determined by one-way ANOVA (F(2, 6) = 64.95, p < .001). A Tukey HSD post-hoc test revealed that the effect of the treatment was significant for all treatments: online class (M = 12.00, SD = 3.00), traditional class (M = 20.67, SD = 2.08) and traditional class with online class (M = 39.00, SD = 3.61). The traditional class with online class differed significantly from both of the other treatments at p < .001. The proportion of variance in undergraduates' grades accounted for by the type of teaching program was approximately 96% ($\eta^2$ = 0.96).

Pairwise comparison
(i) Planned comparison: Planned at the beginning of the study.

Planned comparisons are more sensitive in detecting differences. However, they do not control for the increased risk of Type 1 errors (rejecting the null hypothesis when it is true). Post hoc tests, in contrast, set more stringent significance levels to reduce the risk of a Type 1 error, given the larger number of comparisons performed.
One way to control for Type 1 error is to apply a Bonferroni adjustment to the alpha level that you will use to judge statistical significance. This involves setting a more stringent alpha level for each comparison, to keep the alpha across all the tests at a reasonable level.
To achieve this, divide the alpha level (usually .05) by the number of comparisons that you intend to make.
Test procedures in SPSS Statistics
Follow the same procedure as for the one-way ANOVA, but click the Contrasts button instead of the Post Hoc button.
Coefficients: Enter the pre-determined coefficients.
Make sure that the coefficients sum to 0.
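The Bonferroni adjustment described above is simple arithmetic. As a minimal sketch, assuming all pairwise comparisons among three groups are of interest (the function name is illustrative):

```python
from math import comb

def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison alpha that keeps the overall alpha at the given level."""
    if n_comparisons < 1:
        raise ValueError("need at least one comparison")
    return alpha / n_comparisons

n_groups = 3
n_pairs = comb(n_groups, 2)          # number of distinct pairs of groups
adjusted = bonferroni_alpha(0.05, n_pairs)
print(f"{n_pairs} comparisons, judge each at alpha = {adjusted:.4f}")
```

With three groups there are three pairwise comparisons, so each one is judged against roughly .0167 rather than .05.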

(ii) Post-hoc comparison: Conducted if the F-ratio is significant; these comparisons are exploratory. The common ones are Fisher's LSD, Tukey's and Scheffé tests.
Post hoc comparisons are designed to guard against the possibility of an increased Type 1 error due to the large number of different comparisons being made. This is done by setting more stringent criteria for significance, and therefore it is often harder to achieve significance. With small samples, this can be a problem, as it can be very hard to find a significant result even when the apparent difference in scores between the groups is quite large.

Test procedures in SPSS Statistics
Follow the same procedure as for the one-way ANOVA and click the Post Hoc button.

** It is not appropriate to try both and see which results you prefer!

Example #1 (Planned comparison):
We have three teaching programs (online conference class, traditional lecture, and traditional lecture supplemented by online conference class) and we are interested in the effectiveness of each program in increasing undergraduate students' grade points for a Statistics class. Below is the data for analysis.
Online conference class: 12, 15, 9 (mean = 12.00)
Traditional lecture: 20, 19, 23 (mean = 20.67)
Traditional lecture supplemented by online conference class: 40, 35, 42 (mean = 39.00)
Research question 1: Is the combination of traditional lecture and online conference class superior to the online conference class and the traditional lecture on their own?
$H_0$: A traditional lecture supplemented by online conference class is NOT superior to online conference class and traditional lecture alone.
$H_0$: No difference between population means for online class vs. combination of lecture and online class ($\mu_1 = \mu_3$) and traditional lecture vs. combination of lecture and online class ($\mu_2 = \mu_3$).
$H_1$: A traditional lecture supplemented by online conference class is superior to online conference class and traditional lecture alone.
$H_1$: There is a difference between population means for online class vs. combination of lecture and online class ($\mu_1 < \mu_3$) and traditional lecture vs. combination of lecture and online class ($\mu_2 < \mu_3$).
Contrast coefficients by type of lecture:
Online conference class: coded as A; coefficient = -1
Traditional lecture: coded as B; coefficient = -1
Traditional lecture supplemented by online conference class: coded as C; coefficient = 2

Research question 2: Is a traditional lecture supplemented by online conference class superior to online conference class alone (ignoring the traditional lecture)?
$H_0$: No difference between population means for online class vs. combination of lecture and online class ($\mu_1 = \mu_3$).
$H_1$: There is a difference between population means for online class vs. combination of lecture and online class ($\mu_1 < \mu_3$).
Contrast coefficients by type of lecture:
Online conference class: coded as A; coefficient = -1
Traditional lecture: coded as B; coefficient = 0
Traditional lecture supplemented by online conference class: coded as C; coefficient = 1
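To show what the contrast coefficients do, the sketch below evaluates both planned comparisons for the teaching-program data. It assumes the standard textbook contrast test, which the notes do not spell out: the contrast value is the coefficient-weighted sum of group means, and its standard error uses the pooled within-group mean square. Function names are illustrative.

```python
from math import sqrt
from statistics import mean

groups = [[12, 15, 9], [20, 19, 23], [40, 35, 42]]  # coded A, B, C

def ms_within(groups):
    """Pooled within-group mean square (the ANOVA error term)."""
    ss = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df = sum(len(g) for g in groups) - len(groups)
    return ss / df

def contrast_t(groups, coefficients):
    """Return (contrast value, t statistic) for one planned comparison."""
    if sum(coefficients) != 0:
        raise ValueError("contrast coefficients must sum to 0")
    psi = sum(c * mean(g) for c, g in zip(coefficients, groups))
    se = sqrt(ms_within(groups) * sum(c ** 2 / len(g) for c, g in zip(coefficients, groups)))
    return psi, psi / se

psi1, t1 = contrast_t(groups, [-1, -1, 2])   # C vs. A and B together
psi2, t2 = contrast_t(groups, [-1, 0, 1])    # C vs. A, ignoring B
print(f"Contrast 1: value = {psi1:.2f}, t = {t1:.2f}")
print(f"Contrast 2: value = {psi2:.2f}, t = {t2:.2f}")
```

Each t statistic is evaluated on N - k = 6 degrees of freedom. The check on the coefficient sum mirrors the requirement that the coefficients total 0.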

Example #2 (Planned comparison):
A researcher wants to test the effectiveness of Drug X in preventing seasonal allergy, and she administered the drug to the patients in her research clinic. She randomly assigned them to 3 conditions: placebo (sugar pill), low dose and high dose. The dependent variable is an objective measure of the effectiveness of the drug.

Placebo: 3, 2, 1, 1, 4 (mean = 2.20, s = 1.30, $s^2$ = 1.70)
Low Dose: 5, 2, 4, 2, 3 (mean = 3.20, s = 1.30, $s^2$ = 1.70)
High Dose: 7, 4, 5, 3, 6 (mean = 5.00, s = 1.58, $s^2$ = 2.50)
Grand Mean = 3.467
Grand SD = 1.767
Grand Variance = 3.124
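The descriptive statistics in this table can be reproduced with Python's statistics module, which uses the sample (n - 1) formulas for the standard deviation and variance. This is only a sketch for checking the table; the dictionary keys are illustrative labels.

```python
from statistics import mean, stdev, variance

groups = {
    "placebo": [3, 2, 1, 1, 4],
    "low dose": [5, 2, 4, 2, 3],
    "high dose": [7, 4, 5, 3, 6],
}

# Per-group mean, sample standard deviation (s) and sample variance (s squared).
summary = {name: (mean(v), stdev(v), variance(v)) for name, v in groups.items()}
for name, (m, s, s2) in summary.items():
    print(f"{name}: mean = {m:.2f}, s = {s:.2f}, s^2 = {s2:.2f}")

# Grand statistics ignore group membership and pool all 15 scores.
all_scores = [x for v in groups.values() for x in v]
grand_mean = mean(all_scores)
grand_sd = stdev(all_scores)
grand_var = variance(all_scores)
print(f"grand mean = {grand_mean:.3f}, grand SD = {grand_sd:.3f}, grand variance = {grand_var:.3f}")
```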

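One-Way ANOVA
$H_0$: Means for the three groups are the same ($\mu_{Placebo} = \mu_{Low} = \mu_{High}$).
$H_1$: Means for the three groups are different (at least one mean differs from the others).
Planned comparisons
Research question 1: Is Drug X superior to placebo? Is Drug X effective in preventing seasonal allergy?
Contrast coefficients by condition:
Placebo: coded as A; coefficient = -2
Low dose: coded as B; coefficient = 1
High dose: coded as C; coefficient = 1

Research question 2: What is the amount of dose that is needed to prevent seasonal allergy?
Contrast coefficients by condition:
Placebo: coded as A; coefficient = 0
Low dose: coded as B; coefficient = -1
High dose: coded as C; coefficient = 1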

Figure 1: Overview of the general procedure for one-way ANOVA
1. Explore data: check for outliers, normality, homogeneity, etc., using boxplots, histograms and descriptive statistics.
2. Correct outliers/normality problems.
3. Run the ANOVA. If Levene's test is significant, use the Welch or Brown-Forsythe F.
4. Follow-up tests: use planned comparisons if you have specific hypotheses, or post-hoc tests if you have no hypotheses.
5. Calculate effect size.

One-way Repeated Measures ANOVA/ Within-subjects ANOVA/ ANOVA for correlated samples
It is the equivalent of the one-way ANOVA, but for related rather than independent groups. You can also think of it as an extension of the dependent t-test.
There is one categorical (e.g., nominal or ordinal) independent variable and one continuous (e.g., interval or ratio) dependent variable.
We use a repeated measures ANOVA when:
(1) It is a study that investigates changes in mean scores over three or more time points.
For example, you might be investigating the effect of a 6-month exercise training
programme on blood pressure and want to measure blood pressure at 3 separate time
points (pre-, midway and post-exercise intervention), which would allow you to develop a
time-course for any exercise effect.
In repeated measures ANOVA, the independent variable has categories
called levels or related groups. Where measurements are repeated over time, such as when
measuring changes in blood pressure due to an exercise-training programme, the
independent variable is time. Each level (or related group) is a specific time point. Hence,
for the exercise-training study, there would be three time points and each time-point is a
level of the independent variable (a schematic of a time-course repeated measures design is
shown below):

(2) It is a study that investigates differences in mean scores under three or more different conditions.
For example, you might get the same subjects to eat different types of cake (chocolate, caramel and lemon) and rate each one for taste, rather than having different people taste each different cake.
Where measurements are made under different conditions, the conditions are the levels (or
related groups) of the independent variable (e.g., type of cake is the independent variable
with chocolate, caramel, and lemon cake as the levels of the independent variable). A
schematic of a different-conditions repeated measures design is shown below. It should be
noted that often the levels of the independent variable are not referred to as conditions,
but treatments. Which one you want to use is up to you. There is no right or wrong naming
convention. You will also see the independent variable more commonly referred to as
the within-subjects factor.

***It is important to note that for these two studies mentioned above, the same people are being
measured more than once on the same dependent variable. This is also why it is called repeated
measures design.
Hypothesis for Repeated Measures ANOVA
The repeated measures ANOVA tests for whether there are any differences between related
population means. The null hypothesis (H0) states that the means are equal:

$H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_k$
where $\mu$ = population mean and k = number of related groups. The alternative hypothesis ($H_A$) states that the related population means are not equal (at least one mean is different to another mean):

$H_A$: at least two means are significantly different

F-Ratio:
One-way ANOVA: $F = MS_{conditions} / MS_{w}$
Repeated measures ANOVA: $F = MS_{conditions} / MS_{error}$

In one-way ANOVA, we partition the total variability into variability attributable to the differences between groups ($SS_{conditions}$) and variability within groups ($SS_w$).
However, with a repeated measures ANOVA, as we are using the same subjects in each group, we can remove the variability due to the individual differences between subjects, referred to as $SS_{subjects}$, from the within-groups variability ($SS_w$). Each subject becomes a level of a factor called subjects. Subtracting $SS_{subjects}$ leaves us with a smaller $SS_{error}$ term.
Now that the between-subjects variability has been removed, our new $SS_{error}$ only reflects individual variability in response to each condition. You might recognise this as the interaction effect of subject by conditions; that is, how subjects react to the different conditions.

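Example #3
You are interested in investigating the effect of a 6-month exercise training programme on blood pressure and want to measure blood pressure at 3 separate time points (pre-, midway and post-exercise intervention), which would allow you to develop a time-course for any exercise effect.

Blood pressure by subject and time point of the exercise intervention:
Subject 1: Pre = 45; 3 months = 50; 6 months = 55; subject mean = 50
Subject 2: Pre = 42; 3 months = 42; 6 months = 45; subject mean = 43
Subject 3: Pre = 36; 3 months = 41; 6 months = 43; subject mean = 40
Subject 4: Pre = 39; 3 months = 35; 6 months = 40; subject mean = 38
Subject 5: Pre = 51; 3 months = 55; 6 months = 59; subject mean = 55
Subject 6: Pre = 44; 3 months = 49; 6 months = 56; subject mean = 49.67
Mean ($\bar{X}$): Pre = 42.83; 3 months = 45.33; 6 months = 49.67

Grand Mean = 45.94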


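Table Summary:
Between (Treatments): SS(B) = $n \sum_j (\bar{X}_j - \bar{X})^2$; df = $k - 1$; MS(B) = SS(B) / $(k - 1)$; F-ratio = MS(B) / MS(e)
Within: SS(W) = $\sum_j \sum_i (X_{ij} - \bar{X}_j)^2$; df = $N - k$; MS(W) = SS(W) / $(N - k)$; SS(W) = SS(Subjects) + SS(e)
Subjects: SS(Subjects) = $k \sum_i (\bar{X}_i - \bar{X})^2$; df = $n - 1$
Error: SS(e) = SS(W) - SS(Subjects); df = $(k - 1)(n - 1)$; MS(e) = SS(e) / $(k - 1)(n - 1)$
Total: SS(T) = $\sum_j \sum_i (X_{ij} - \bar{X})^2$; df = $N - 1$

i refers to the individual cell (here, the subject).
j refers to the specific group.
k refers to the number of levels in a factor.
n is the number of subjects in each group (level of factor A).
N is the total number of observations in the whole study (N = nk).
$\bar{X}_j$ is the mean of level j, $\bar{X}_i$ is the mean of subject i, and $\bar{X}$ is the grand mean.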

Computation of ANOVA:
Sum of squares between-groups examines the differences between related group means by calculating the variation of each group mean ($\bar{X}_j$) around the grand mean ($\bar{X}$). This is variation in scores that is due to the treatment (or independent variable).
$SS_{Between} = n \sum_j (\bar{X}_j - \bar{X})^2$

$SS_{Between} = 6 [(42.83 - 45.94)^2 + (45.33 - 45.94)^2 + (49.67 - 45.94)^2]$
$= 6 [9.679 + 0.373 + 13.855]$ (using unrounded means)
$= 143.44$

Sum of squares within-group examines error variation or variation of individual scores around each group mean. This is variation in scores that is not due to the treatment (or independent variable) but due to variation caused by other factors.
$SS_{within} = \sum_j \sum_i (X_{ij} - \bar{X}_j)^2$

$SS_{pre} = (45 - 42.8)^2 + (42 - 42.8)^2 + (36 - 42.8)^2 + (39 - 42.8)^2 + (51 - 42.8)^2 + (44 - 42.8)^2 = 134.83$

$SS_{3months} = (50 - 45.3)^2 + (42 - 45.3)^2 + (41 - 45.3)^2 + (35 - 45.3)^2 + (55 - 45.3)^2 + (49 - 45.3)^2 = 265.33$

$SS_{6months} = (55 - 49.7)^2 + (45 - 49.7)^2 + (43 - 49.7)^2 + (40 - 49.7)^2 + (59 - 49.7)^2 + (56 - 49.7)^2 = 315.33$

$SS_{within} = 134.83 + 265.33 + 315.33 = 715.5$

Sum of squares subjects examines the variation of each subject's mean around the grand mean. This is variation in scores that is due to individual differences between subjects rather than to the treatment (or independent variable).
$SS_{Subjects} = k \sum_i (\bar{X}_i - \bar{X})^2$
$SS_{Subjects} = 3 [(50 - 45.9)^2 + (43 - 45.9)^2 + (40 - 45.9)^2 + (38 - 45.9)^2 + (55 - 45.9)^2 + (49.7 - 45.9)^2] = 658.3$

Sum of squares error is what remains of the within-group variation once the variation between subjects has been removed.
$SS_{within} = SS_{Subjects} + SS_{error}$
$SS_{error} = SS_{within} - SS_{Subjects}$
$SS_{error} = 715.5 - 658.3 = 57.2$

Total sum of squares can be computed by adding $SS_{Between}$ and $SS_{within}$, or by subtracting the grand mean from each score, squaring, and then summing across all cases.
$SS_T = SS_{Between} + SS_{within} = SS_{Between} + SS_{Subjects} + SS_{error}$
$SS_T = \sum_j \sum_i (X_{ij} - \bar{X})^2$

$SS_T = (45 - 45.94)^2 + (42 - 45.94)^2 + \dots + (56 - 45.94)^2$ (summed over all 18 scores)
$= 143.44 + 658.28 + 57.22$
$= 858.94$

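ANOVA summary:
Between (time): SS = 143.44; df = 3 - 1 = 2; MS = 71.72; F-ratio = 12.53
Subjects: SS = 658.28; df = 6 - 1 = 5; MS = 131.66
Error: SS = 57.22; df = (3 - 1)(6 - 1) = 10; MS = 5.72
Total: SS = 858.94; df = 18 - 1 = 17

The within-groups sum of squares (715.5, with 18 - 3 = 15 degrees of freedom) is the sum of the Subjects and Error rows.

There was a statistically significant effect of time on blood pressure, F(2, 10) = 12.53, p = .002.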
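The partition used in this example can be verified with a few lines of Python. The sketch below is a plain standard-library calculation with illustrative variable names; it reproduces the sums of squares and the F-ratio but does not compute the p-value.

```python
from statistics import mean

# Rows are subjects 1-6; columns are pre, 3 months and 6 months.
scores = [
    [45, 50, 55],
    [42, 42, 45],
    [36, 41, 43],
    [39, 35, 40],
    [51, 55, 59],
    [44, 49, 56],
]
n = len(scores)       # subjects
k = len(scores[0])    # time points (levels)

grand_mean = mean(x for row in scores for x in row)
time_means = [mean(row[j] for row in scores) for j in range(k)]
subject_means = [mean(row) for row in scores]

ss_between = n * sum((m - grand_mean) ** 2 for m in time_means)
ss_within = sum((row[j] - time_means[j]) ** 2 for row in scores for j in range(k))
ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
ss_error = ss_within - ss_subjects   # what is left after removing subject differences

df_between = k - 1
df_error = (k - 1) * (n - 1)
f_ratio = (ss_between / df_between) / (ss_error / df_error)

print(f"SS between = {ss_between:.2f}, SS subjects = {ss_subjects:.2f}, SS error = {ss_error:.2f}")
print(f"F({df_between}, {df_error}) = {f_ratio:.2f}")
```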

Partial eta-squared:
$\eta_p^2 = \frac{SS_{Between}}{SS_{Between} + SS_{error}} = \frac{143.44}{143.44 + 57.22} = .715$

Test procedures in SPSS Statistics


1. Click Analyze > General Linear Model > Repeated Measures
2. Within Subject Factor Name: Put in meaningful name for your Independent Variable (e.g.:
Time or Condition)
3. Number of Levels: No. of levels in the factor
4. Measure Name: Put in meaningful name for your Dependent Variable
5. Define
6. Within-subjects variables: Drag the related levels for IV into this box in order (e.g.: Time 1,
Time 2, Time 3).
7. Plots: Move factors into Horizontal Axis, then Add and Continue
8. Options: Transfer IV from Factor(s) and Factor Interaction to the Display Means for.
9. Tick Compare Main Effects
10. Select Bonferroni for Confidence interval adjustment.
11. Display: Descriptive statistics, Estimates of effect size and Homogeneity tests.
12. Continue and OK.

Increased Power in a Repeated Measures ANOVA
The major advantage of running a repeated measures ANOVA over an independent ANOVA is that the test is generally much more powerful. This advantage is achieved by the reduction in $MS_{error}$ (the denominator of the F-statistic) that comes from partitioning out the variability due to differences between subjects ($SS_{subjects}$) from the original error term in an independent ANOVA ($SS_w$): i.e., $SS_{error} = SS_w - SS_{subjects}$.
We achieved a result of F(2, 10) = 12.53, p = .002, for our example repeated measures ANOVA. How does this compare with running an independent ANOVA instead? If we ran through the calculations, we would end up with a result of F(2, 15) = 1.504, p = .254, for the independent ANOVA. We can clearly see the advantage of using the same subjects in a repeated measures ANOVA as opposed to different subjects.
For our exercise-training example, after taking away $SS_{subjects}$ from $SS_w$ we are left with an error term ($SS_{error}$) that is only 8% as large as the independent ANOVA error term.
This does not lead to an automatic increase in the F-statistic, as there are a greater number of degrees of freedom for $SS_w$ than for $SS_{error}$. However, it is usual for $SS_{subjects}$ to account for such a large percentage of the within-groups variability that the reduction in the error term is large enough to more than compensate for the loss in the degrees of freedom (as used in selecting an F-distribution).
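The comparison between the two designs comes down to which error term sits in the denominator of the F-ratio. The short sketch below redoes that arithmetic from the summary values of the exercise example; the variable names are illustrative.

```python
# Summary values from the exercise-training example.
ss_between, df_between = 143.44, 2
ss_within, df_within = 715.5, 15      # error term of an independent ANOVA
ss_error, df_error = 57.22, 10        # error term after removing SS subjects

ms_between = ss_between / df_between
f_independent = ms_between / (ss_within / df_within)
f_repeated = ms_between / (ss_error / df_error)
error_share = ss_error / ss_within    # how much of the original error term remains

print(f"Independent ANOVA: F({df_between}, {df_within}) = {f_independent:.3f}")
print(f"Repeated measures: F({df_between}, {df_error}) = {f_repeated:.2f}")
print(f"Remaining error term: {error_share:.0%} of the original")
```

The numerator is identical in both cases; only the error term and its degrees of freedom change.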
Underlying Assumptions: Sphericity
ANOVAs with repeated measures (within-subject factors) are particularly susceptible to the
violation of the assumption of sphericity. Sphericity is the condition where the variances of the
differences between all combinations of related groups (levels) are equal. Violation of sphericity is
when the variances of the differences between all combinations of related groups are not equal.
Sphericity can be likened to homogeneity of variances in a between-subjects ANOVA.
The violation of sphericity is serious for the repeated measures ANOVA, with violation
causing the test to become too liberal (i.e., an increase in the Type I error rate). Therefore,
determining whether sphericity has been violated is very important. Luckily, if violations of sphericity
do occur, corrections have been developed to produce a more valid critical F-value (i.e., reduce the
increase in Type I error rate). This is achieved by estimating the degree to which sphericity has been
violated and applying a correction factor to the degrees of freedom of the F-distribution.
Testing for sphericity is an option in SPSS using Mauchly's Test for Sphericity as part of the
GLM Repeated Measures procedure. Mauchly's Test of Sphericity tests the null hypothesis that the
variances of the differences are equal. Thus, if Mauchly's Test of Sphericity is statistically significant
(p < .05), we can reject the null hypothesis and accept the alternative hypothesis that the variances
of the differences are not equal (i.e., sphericity has been violated).
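To see what sphericity refers to, one can compute the variance of the difference scores for every pair of time points. The sketch below does this for the blood pressure data from Example #3. It is only an informal look at the quantities involved, not Mauchly's test itself, and the variable names are illustrative.

```python
from itertools import combinations
from statistics import variance

scores = {
    "pre": [45, 42, 36, 39, 51, 44],
    "3 months": [50, 42, 41, 35, 55, 49],
    "6 months": [55, 45, 43, 40, 59, 56],
}

# Variance of the subject-by-subject differences for each pair of levels.
difference_variances = {}
for first, second in combinations(scores, 2):
    diffs = [b - a for a, b in zip(scores[first], scores[second])]
    difference_variances[(first, second)] = variance(diffs)

for pair, var in difference_variances.items():
    print(f"{pair[0]} vs. {pair[1]}: variance of differences = {var:.2f}")
```

Sphericity holds when these variances are equal in the population. With only six subjects the sample values can differ considerably by chance, which is why a formal test such as Mauchly's is used.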

Mauchly's Test of Sphericity (a)
Measure: CBR
Within Subjects Effect: time
Mauchly's W = .434; Approx. Chi-Square = 3.343; df = 2; Sig. = .188
Epsilon (b): Greenhouse-Geisser = .638; Huynh-Feldt = .760; Lower-bound = .500

Tests the null hypothesis that the error covariance matrix of the orthonormalized transformed dependent variables is proportional to an identity matrix.
a. Design: Intercept. Within Subjects Design: time
b. May be used to adjust the degrees of freedom for the averaged tests of significance. Corrected tests are displayed in the Tests of Within-Subjects Effects table.

Reporting on Mauchly's Test of Sphericity

Mauchly's Test of Sphericity indicated that the assumption of sphericity had not been violated, $\chi^2(2)$ = 3.343, p = .188.

When it is violated, you can report it as:

Mauchly's Test of Sphericity indicated that the assumption of sphericity had been violated, $\chi^2(2)$ = 22.115, p < .0005, and therefore a Greenhouse-Geisser correction was used. There was a significant effect of time on the DV, F(1.171, 38) = XXX, p < .0005.

Effect size can be interpreted according to Cohen's (1988) guidelines for eta squared:

Small: 0.01
Medium: 0.059
Large: 0.138

So if you end up with $\eta^2$ = 0.45, you can treat the effect size as very large. It also means that 45% of the variance in the DV can be accounted for by the IV.
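As a small illustration, partial eta squared for the exercise example can be computed from the effect and error sums of squares and then labelled using the thresholds listed above. The function names and the "negligible" label for values below 0.01 are assumptions made for this sketch.

```python
def partial_eta_squared(ss_effect, ss_error):
    """Effect variation as a share of effect plus error variation."""
    return ss_effect / (ss_effect + ss_error)

def classify_effect(eta_sq):
    """Label an effect size using the small/medium/large thresholds above."""
    if eta_sq >= 0.138:
        return "large"
    if eta_sq >= 0.059:
        return "medium"
    if eta_sq >= 0.01:
        return "small"
    return "negligible"  # label assumed here; the guidelines stop at "small"

eta_p2 = partial_eta_squared(143.44, 57.22)
label = classify_effect(eta_p2)
print(f"partial eta squared = {eta_p2:.3f} ({label})")
```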

Results:

Table 97
Descriptive statistics for the effect of a 6-month exercise training programme at 3 time points: pre, mid and post exercise.
Pre: n = 6; Mean = 42.83; Standard deviation (SD) = 5.19
3 months: n = 6; Mean = 45.33; Standard deviation (SD) = 7.29
6 months: n = 6; Mean = 49.67; Standard deviation (SD) = 7.94

Table 98
Analysis of variance (ANOVA) summary
Time: SS = 143.44; df = 2; MS = 71.72; F = 12.53; p = .002; partial $\eta^2$ = .715
Error (Time): SS = 57.22; df = 10; MS = 5.72

Table 99
Bonferroni comparison for time: pre, 3 months, and 6 months.
Pre vs. 6 months: Mean Difference = 6.83*; Std. Error = 1.70; 95% CI Lower Bound = .82; Upper Bound = 12.85
3 months vs. 6 months: Mean Difference = 4.33*; Std. Error = .72; 95% CI Lower Bound = 1.81; Upper Bound = 6.86

* p < 0.05
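The mean differences and standard errors in the pairwise comparison table can be reproduced from the raw scores of Example #3. The sketch below computes, for each pair of time points, the mean of the subject-by-subject differences (later minus earlier) and its standard error. It leaves out the Bonferroni-adjusted confidence intervals, which require the t-distribution; variable names are illustrative.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, stdev

scores = {
    "pre": [45, 42, 36, 39, 51, 44],
    "3 months": [50, 42, 41, 35, 55, 49],
    "6 months": [55, 45, 43, 40, 59, 56],
}

pairwise = {}
for earlier, later in combinations(scores, 2):
    diffs = [b - a for a, b in zip(scores[earlier], scores[later])]
    mean_diff = mean(diffs)
    std_error = stdev(diffs) / sqrt(len(diffs))   # standard error of the mean difference
    pairwise[(earlier, later)] = (mean_diff, std_error)

for (earlier, later), (md, se) in pairwise.items():
    print(f"{earlier} vs. {later}: mean difference = {md:.2f}, std. error = {se:.2f}")
```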
Reporting the result:
A repeated measures ANOVA was conducted to investigate the effect of a 6-month exercise training programme on blood pressure at 3 separate time points (pre-, midway and post-exercise intervention). The means and standard deviations of blood pressure are presented in Table 97. Mauchly's Test of Sphericity indicated that the assumption of sphericity had not been violated, $\chi^2(2)$ = 3.343, p = .188.
The repeated measures ANOVA determined that mean blood pressure differed statistically significantly between time points (F(2, 10) = 12.53, p = .002). Partial eta squared is reported at .715 (large). Post hoc tests using the Bonferroni correction revealed that the mean differences in blood pressure between pre and 6-month (MD = 6.83, SE = 1.70, 95% CI = .82 to 12.85) and between 3-month and 6-month (MD = 4.33, SE = .72, 95% CI = 1.81 to 6.86) were statistically significant. However, there was no significant difference in blood pressure between pre and 3-month (MD = 2.50, SE = 1.52, 95% CI = -2.88 to 7.88). Therefore, we can conclude that a long-term exercise training programme (6 months) elicits a significant change in blood pressure, but that 3 months of training does not.

Example # 4
Research conducted by: Pearson et al. (2003)
Case study prepared by: David Lane and Emily Zitek
Overview: This study investigated the cognitive effects of stimulant medication in children with
mental retardation and Attention-Deficit/Hyperactivity Disorder. This case study shows the data for
the Delay of Gratification (DOG) task. Children were given various dosages of a drug,
methylphenidate (MPH) and then completed this task as part of a larger battery of tests. The order
of doses was counterbalanced so that each dose appeared equally often in each position. For
example, six children received the lowest dose first, six received it second, etc. The children were on
each dose one week before testing.
This task, adapted from the preschool delay task of the Gordon Diagnostic System (Gordon, 1983),
measures the ability to suppress or delay impulsive behavioral responses. Children were told that a
star would appear on the computer screen if they waited long enough to press a response key. If a
child responded less than four seconds after their previous response, they did not earn a
star, and the 4-second counter restarted. The DOG differentiates children with and without ADHD of
normal intelligence (e.g., Mayes et al., 2001), and is sensitive to MPH treatment in these children
(Hall & Kataria, 1992).
Questions to Answer
Does higher dosage lead to higher cognitive performance (measured by the number of correct
responses to the DOG task)?
Design Issues
This is a repeated-measures design because each participant performed the task after each dosage.
Descriptions of Variables
d0: Number of correct responses after taking a placebo
d15: Number of correct responses after taking .15 mg/kg of the drug
d30: Number of correct responses after taking .30 mg/kg of the drug
d60: Number of correct responses after taking .60 mg/kg of the drug

References:
Pearson, D.A., Santos, C.W., Jerger, S.W., Casat, C.D., Roache, J., Loveland, K.A., Lane, D.M., Lachar,
D., Faria, L.P., & Getchell, C. (2003) Treatment effects of methylphenidate on cognitive
functioning in children with mental retardation and ADHD. Journal of the American Academy
of Child and Adolescent Psychiatry, 43, 677-685.