Ch. 12 The Analysis of Variance: Example

• Example.
◦ A quality characteristic of electric motors is motor vibration.
◦ Does the mean amount of vibration depend on the type of bearing?
◦ An experiment:
◦◦ 5 brands of bearings
◦◦ 6 motors tested per brand
◦◦ ⇒ 5 × 6 vibration measurements recorded

◦ This is an example of a completely randomized single-factor experiment.
◦ The single factor is bearing brand.
◦ There are 5 factor levels. Each brand is a level. Levels are sometimes called treatments.
◦ Each treatment has 6 observations, or replicates.

[Figure: side-by-side boxplots of motor vibration (microns) for bearing brands V1 to V5]
◦ A side-by-side boxplot indicates that there is variability
1. within each treatment
2. between different treatments

◦ The Analysis of Variance (ANOVA) is used to
determine whether the amount of variation between
different treatments is larger than the amount of
variation within each treatment.
◦ Alternative Viewpoint: ANOVA analyzes a signal-to-noise ratio.
• An Ideal World: No Noise
◦ Without noise or error, the signal can be discerned without the use of statistics.
◦ e.g. 1: In a perfectly controlled environment, 4 measurements at each of 3 different factor levels were recorded:

Level A   Level B   Level C
13        10        13
13        10        13
13        10        13
13        10        13
◦ Without noise, variation between groups can be clearly seen (where it exists).
• The Real World: A Noisy Place
◦ With noise or error, the signal cannot be discerned without the use of statistics.
◦ e.g. 2: In a realistic environment, 4 measurements at each of 3 different factor levels were recorded:

Level A   Level B   Level C
14        12        15
12        10        14
15        11        12
13        8         11
◦ With noise, variation between groups (where it exists) is obscured (partially drowned out) by the noise.
◦ noise ⇐⇒ variability within each level
◦ $\sigma^2$ = variance within each level; estimate with MSE
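The contrast between e.g. 1 and e.g. 2 can be checked numerically. The following is a minimal sketch using only Python's standard library; the variable and function names are illustrative and do not come from the slides. It computes the sample mean and sample variance within each level for both data sets.

```python
from statistics import mean, variance

# Measurements from e.g. 1 (no noise) and e.g. 2 (with noise), one list per level.
no_noise = {"A": [13, 13, 13, 13], "B": [10, 10, 10, 10], "C": [13, 13, 13, 13]}
noisy = {"A": [14, 12, 15, 13], "B": [12, 10, 11, 8], "C": [15, 14, 12, 11]}

def within_level_summary(data):
    """Return {level: (sample mean, sample variance)} for each factor level."""
    return {level: (mean(xs), variance(xs)) for level, xs in data.items()}

for name, data in [("e.g. 1", no_noise), ("e.g. 2", noisy)]:
    print(name)
    for level, (m, s2) in within_level_summary(data).items():
        print(f"  Level {level}: mean = {m:.2f}, variance = {s2:.3f}")
```

Every within-level variance is zero for e.g. 1, while all three are positive for e.g. 2; that within-level spread is the noise the slides refer to.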

• The Completely Randomized Design (Balanced Case)
◦ $I$ independent random samples of measurements are taken:
$$
\begin{array}{l}
X_{11}, X_{12}, \ldots, X_{1J} \\
X_{21}, X_{22}, \ldots, X_{2J} \\
\qquad \vdots \\
X_{I1}, X_{I2}, \ldots, X_{IJ}
\end{array}
$$
◦ There are $I$ factor levels or treatments.
◦ There are $J$ replicates per treatment (balanced design).

◦ The jth measurement from the ith treatment can be modelled as
$$X_{ij} = \mu_i + \varepsilon_{ij}$$
where
1. $\mu_i$ is the expected value of all measurements in the ith treatment group
2. $\varepsilon_{ij}$ is the amount by which the jth replicated measurement differs from its expected value. This is called a random error, and we assume
$$E[\varepsilon_{ij}] = 0 \quad \text{and} \quad V(\varepsilon_{ij}) = \sigma^2$$
◦ Since the measurements are independent of each other, the random errors $\varepsilon_{ij}$ are independent.
◦ For e.g. 1, $\sigma^2 = 0$. For e.g. 2, $\sigma^2 > 0$ (estimate this using MSE).
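To make the model concrete, here is a small simulation sketch. It assumes normally distributed errors, which the slides do not state at this point; the function name `simulate_crd`, the seed, and the value `sigma=1.5` are hypothetical choices for illustration only.

```python
import random

def simulate_crd(mus, J, sigma, seed=0):
    """Draw J replicates per treatment from X_ij = mu_i + eps_ij.

    Assumption (not stated in the slides): the errors are normal, N(0, sigma^2).
    """
    rng = random.Random(seed)
    return [[mu + rng.gauss(0.0, sigma) for _ in range(J)] for mu in mus]

# sigma = 0 reproduces the noise-free layout of e.g. 1 exactly.
ideal = simulate_crd(mus=[13, 10, 13], J=4, sigma=0.0)

# sigma > 0 gives a noisy data set in the spirit of e.g. 2 (sigma is made up here).
noisy = simulate_crd(mus=[13, 10, 13], J=4, sigma=1.5)

for row in noisy:
    print([round(x, 1) for x in row])
```

Fixing the seed keeps the sketch reproducible; changing it gives a different realization of the same model.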
• Summarizing and Estimating Within Treatment Variability
◦ An estimate of $\sigma^2$ is based on the error sum of squares
$$SSE = \sum_{i=1}^{I} \sum_{j=1}^{J} (X_{ij} - \bar{X}_{i\cdot})^2$$
where $\bar{X}_{i\cdot}$ denotes the ith treatment sample average, for $i = 1, 2, \ldots, I$.
◦ Dividing by the degrees of freedom remaining after estimating $I$ means, an unbiased estimate of $\sigma^2$ is the mean squared error
$$MSE = \frac{SSE}{I(J - 1)}$$
◦ Note that the MSE is the average of the $I$ sample variances (this only works in the balanced case).
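The error sum of squares and the mean squared error can be sketched in a few lines. The helper names `sse` and `mse` below are illustrative, and the data are the e.g. 2 measurements from earlier in these notes. The last line checks the remark that, in the balanced case, the MSE equals the average of the sample variances.

```python
from statistics import mean, variance

def sse(groups):
    """Error sum of squares: squared deviations from each group's own mean."""
    return sum((x - mean(g)) ** 2 for g in groups for x in g)

def mse(groups):
    """Mean squared error for a balanced design: SSE / (I * (J - 1))."""
    I, J = len(groups), len(groups[0])
    return sse(groups) / (I * (J - 1))

# Data from e.g. 2: levels A, B, C with J = 4 replicates each.
groups = [[14, 12, 15, 13], [12, 10, 11, 8], [15, 14, 12, 11]]

print("SSE =", sse(groups))
print("MSE =", mse(groups))
print("average of sample variances =", mean([variance(g) for g in groups]))
```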
• Summarizing and Estimating Between Treatment Variability
◦ The expected value of $X_{ij}$ is
$$E[X_{ij}] = E[\mu_i] + E[\varepsilon_{ij}] = \mu_i$$
◦ Differences between the $I$ different treatment groups are reflected in the variability in $\mu_1, \mu_2, \ldots, \mu_I$.
◦ Estimates of these expected values can be obtained by computing the $I$ sample averages: $\bar{X}_{1\cdot}, \bar{X}_{2\cdot}, \ldots, \bar{X}_{I\cdot}$.
◦ The variability in these averages is related to the sum of squares of the differences between the sample averages and the grand average:
$$SSTr = J \sum_{i=1}^{I} (\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot})^2, \qquad \bar{X}_{\cdot\cdot} = \frac{1}{I} \sum_{i=1}^{I} \bar{X}_{i\cdot}$$
◦ We can show that, if all expected values are equal (and the random errors are normally distributed), then
$$SSTr/\sigma^2 = J \sum_{i=1}^{I} (\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot})^2 / \sigma^2$$
has a $\chi^2$ distribution on $I - 1$ degrees of freedom.
◦ Therefore, if all expected values are equal, then
$$E[MSTr] = E[SSTr/(I - 1)] = \sigma^2$$
◦ If the expected values differ, then
$$E[MSTr] = \sigma^2 + \text{extra variation}$$
◦ Comparing MSTr with an unbiased estimate of $\sigma^2$ will tell us whether there is extra variation among the expected values.
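The treatment sum of squares and its mean square can be sketched as follows. The function names `sstr` and `mstr` are illustrative; the data are e.g. 1 and e.g. 2 from earlier in these notes, and the code assumes a balanced design.

```python
from statistics import mean

def sstr(groups):
    """Treatment sum of squares for a balanced design with J replicates per group."""
    J = len(groups[0])
    group_means = [mean(g) for g in groups]
    grand_mean = mean(group_means)  # equals the overall mean when the design is balanced
    return J * sum((m - grand_mean) ** 2 for m in group_means)

def mstr(groups):
    """Treatment mean square: SSTr / (I - 1)."""
    return sstr(groups) / (len(groups) - 1)

no_noise = [[13] * 4, [10] * 4, [13] * 4]                      # e.g. 1
noisy = [[14, 12, 15, 13], [12, 10, 11, 8], [15, 14, 12, 11]]  # e.g. 2

print("e.g. 1: SSTr =", sstr(no_noise), "MSTr =", mstr(no_noise))
print("e.g. 2: SSTr =", sstr(noisy), "MSTr =", mstr(noisy))
```

Both data sets give a positive MSTr. Whether that is more than noise alone would produce can only be judged by comparing it with an estimate of the within-treatment variance.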
• Comparing Between and Within Variability
◦ If all treatment expected values are equal, then both MSTr and MSE are unbiased estimates of $\sigma^2$.
◦ Then the F ratio
$$f = \frac{MSTr}{MSE}$$
will tend to be near 1.
◦ If the treatment group expected values vary, then $f$ will tend to be larger than 1, since
◦◦ MSE still estimates $\sigma^2$
◦◦ MSTr tends to be larger than $\sigma^2$
◦ To decide whether $f$ is significantly larger than 1, we consult the F table, i.e.
$$H_0: \mu_1 = \mu_2 = \cdots = \mu_I$$
$$H_a: \text{at least one mean differs}$$
Test Statistic:
$$f = \frac{MSTr}{MSE}$$
p-value:
$$P(F > f)$$
◦ numerator degrees of freedom: $I - 1$
◦ denominator degrees of freedom: $I(J - 1)$
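The test statistic and its degrees of freedom can be assembled into one helper. This is a sketch for the balanced case only; the name `one_way_anova` is illustrative, and the p-value is deliberately left to an F table, as in the slides.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (f, df_numerator, df_denominator) for a balanced single-factor design."""
    I, J = len(groups), len(groups[0])
    if any(len(g) != J for g in groups):
        raise ValueError("this sketch only handles the balanced case")
    group_means = [mean(g) for g in groups]
    grand_mean = mean(group_means)
    ms_tr = J * sum((m - grand_mean) ** 2 for m in group_means) / (I - 1)
    ms_e = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g) / (I * (J - 1))
    if ms_e == 0:
        raise ZeroDivisionError("MSE is zero (no noise), so f is undefined")
    return ms_tr / ms_e, I - 1, I * (J - 1)

# Data from e.g. 2.
f, df1, df2 = one_way_anova([[14, 12, 15, 13], [12, 10, 11, 8], [15, 14, 12, 11]])
print(f"f = {f:.2f} on ({df1}, {df2}) degrees of freedom")
```

The explicit error for a zero MSE covers the noise-free situation of e.g. 1, where no statistical test is needed to see the signal.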

• Motor-vibration Example
◦ The 5 treatment averages and variances are
1. $\bar{X}_{1\cdot} = 13.68$, $S_1^2 = 1.43$
2. $\bar{X}_{2\cdot} = 15.95$, $S_2^2 = 1.36$
3. $\bar{X}_{3\cdot} = 13.67$, $S_3^2 = 0.67$
4. $\bar{X}_{4\cdot} = 14.73$, $S_4^2 = 0.88$
5. $\bar{X}_{5\cdot} = 13.08$, $S_5^2 = 0.23$

$I = 5$ and $J = 6$.
$MSE = 0.913$; $SSTr = 30.85$

so
$$MSTr = \frac{SSTr}{I - 1} = \frac{30.85}{4} = 7.71$$
$$f = \frac{MSTr}{MSE} = \frac{7.71}{0.913} = 8.45$$
degrees of freedom: $I - 1 = 4$ and $I(J - 1) = 25$
From the F-table with 4, 25 degrees of freedom:

upper-tail area   critical value
.100              2.18
.050              2.76
.010              4.18
.001              6.49

$$P(F > f) = P(F > 8.45) < .001$$
◦ We conclude that mean vibration really does differ among the motor bearing brands.
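The arithmetic for the motor-vibration example can be retraced from the summary statistics alone. The sketch below uses the rounded averages and variances printed above, so its MSE and SSTr differ slightly in the last digit from the slide values 0.913 and 30.85, which were presumably computed from the unrounded data. Variable names are illustrative.

```python
from statistics import mean

# Summary statistics as printed on the slide (rounded to two decimals).
means = [13.68, 15.95, 13.67, 14.73, 13.08]
variances = [1.43, 1.36, 0.67, 0.88, 0.23]
I, J = 5, 6

mse = mean(variances)  # balanced case: average of the sample variances
grand_mean = mean(means)
sstr = J * sum((m - grand_mean) ** 2 for m in means)
mstr = sstr / (I - 1)
f = mstr / mse

# Critical values quoted in the slides for 4 and 25 degrees of freedom.
f_table = {0.100: 2.18, 0.050: 2.76, 0.010: 4.18, 0.001: 6.49}
smallest_alpha = min((a for a, crit in f_table.items() if f > crit), default=None)

print(f"MSE = {mse:.3f}, SSTr = {sstr:.2f}, MSTr = {mstr:.2f}, f = {f:.2f}")
print(f"f exceeds the tabled critical value at upper-tail area {smallest_alpha}")
```

Comparing f against each tabled critical value is the programmatic equivalent of reading the F table: the p-value lies below the smallest upper-tail area whose critical value f exceeds.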
• Another Viewpoint: Analyzing or Breaking Down Variation
◦ Example.
◦◦ Metal plate-connected trusses used for roof support.
◦◦ Plate Lengths (in inches): 4, 6, 8, 10, 12
◦◦ Response Measurements: Axial Stiffness Index (ASI, KIPS/in)
◦◦ 7 independent measurements per plate length ⇒ $J = 7$ replicates
◦ This is an example of a balanced CRD with $I = 5$ factor levels (plate lengths).
◦ Does variation in plate length have any effect on true mean axial stiffness?
[Figure: scatterplot of the plate-connected trusses data, axial stiffness index versus plate length]
◦ We will analyze the variation in the ASI measurements:
variation in ASI = variation due to possible differences in plate length + variation due to noise (error)
i.e.
$$SST = SSTr + SSE$$
◦ Model:
$$X_{ij} = \mu_i + \varepsilon_{ij}$$
where
◦◦ $\mu_i$ is the expected ASI for the ith plate length group (treatment group) and
◦◦ $\varepsilon_{ij}$ is the random disturbance associated with the jth measurement in the ith treatment group.
◦ Estimates of the expected values for each of the $I = 5$ treatment groups are

Plate length (in)         4     6     8     10    12
$\bar{x}_{i\cdot}$        333   368   375   407   437

◦ From this, and the boxplot (or scatterplot), there appears to be a difference among the expected values.
◦ Is this difference real, or is it due to noise?
◦ ANOVA calculations: Test for a difference among the means.
◦◦ SST = total sum of squares = 75621.27
(Recall: this is the summary of all variability in the data set.)
◦◦ $SSTr = J \sum_{i=1}^{I} (\bar{x}_{i\cdot} - \bar{x}_{\cdot\cdot})^2 = 43932$
(this is the sum of squares attributable to variation between treatment groups)
$$MSTr = \frac{SSTr}{I - 1} = \frac{43932}{4} = 10983$$
$$SSE = SST - SSTr = 31689.27$$
$$MSE = \frac{SSE}{I(J - 1)} = \frac{31689.27}{30} = 1056.309$$
$$f = \frac{MSTr}{MSE} = \frac{10983}{1056.309} = 10.4$$
p-value (from the F table):
$$P(F > f) = P(F > 10.4) < .001$$
◦ We conclude that there is strong evidence of a difference among the expected values of the ASI measurements at the 5% level.
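The truss calculations can be retraced from the quantities given in the slides: the five sample means, J = 7, and the total sum of squares. The raw measurements are not reproduced in these notes, so SST is taken as given; variable names are illustrative.

```python
# Values taken from the slides.
plate_means = [333, 368, 375, 407, 437]  # sample mean ASI for lengths 4, 6, 8, 10, 12
I, J = 5, 7
sst = 75621.27                           # total sum of squares, as given

grand_mean = sum(plate_means) / I
sstr = J * sum((m - grand_mean) ** 2 for m in plate_means)
sse = sst - sstr                         # uses the decomposition SST = SSTr + SSE
mstr = sstr / (I - 1)
mse = sse / (I * (J - 1))
f = mstr / mse

print(f"grand mean = {grand_mean:.0f}")
print(f"SSTr = {sstr:.0f}, SSE = {sse:.2f}")
print(f"MSTr = {mstr:.0f}, MSE = {mse:.3f}, f = {f:.1f}")
```

Obtaining SSE by subtraction, as the slides do, avoids needing the individual measurements once SST is known.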
• A Summary: the ANOVA table

Variation Source   d.f.     SS     MS     f
Treatments         I-1      SSTr   MSTr   MSTr/MSE
Error              I(J-1)   SSE    MSE
Total              IJ-1     SST
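As an illustration of how the table is filled in, the sketch below builds the three rows for the e.g. 2 data from earlier in these notes. The function name `anova_table` and the dictionary keys are illustrative choices. SST is computed directly from the data, so the identity SST = SSTr + SSE can be checked rather than assumed.

```python
from statistics import mean

def anova_table(groups):
    """Return the rows of a balanced one-way ANOVA table as a list of dicts."""
    I, J = len(groups), len(groups[0])
    group_means = [mean(g) for g in groups]
    grand_mean = mean([x for g in groups for x in g])
    ss_tr = J * sum((m - grand_mean) ** 2 for m in group_means)
    ss_e = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    ss_t = sum((x - grand_mean) ** 2 for g in groups for x in g)
    ms_tr, ms_e = ss_tr / (I - 1), ss_e / (I * (J - 1))
    return [
        {"source": "Treatments", "df": I - 1, "SS": ss_tr, "MS": ms_tr, "f": ms_tr / ms_e},
        {"source": "Error", "df": I * (J - 1), "SS": ss_e, "MS": ms_e},
        {"source": "Total", "df": I * J - 1, "SS": ss_t},
    ]

table = anova_table([[14, 12, 15, 13], [12, 10, 11, 8], [15, 14, 12, 11]])
for row in table:
    print(row)
```

Both the degrees of freedom and the sums of squares in the first two rows add up to the Total row, which is a quick consistency check on any ANOVA table.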
Exercise. Suppose 8 observations were taken on 3
different levels of a factor giving an MSE of 30 and an
SST of 850. Is there evidence that the three factor level
means differ?
