Introduction
In a previous chapter, the calculation of the t statistic was described. In this chapter, the
variance (the square of the standard deviation) and the F statistic will be used as an
alternative method for deciding whether two samples differ.
1 First published in the American Paint and Coatings Journal, May 27, 1996; revised publication in Paint and Coatings
Industry, “deSigns of the Times: Or, when F is a passing grade,” submitted for publication August, 2000.
Example 1: A researcher measured the acid value of two polyester lots, running five replicates
of each: Polyester 1: 3.9, 3.6, 3.5, 3.4, 3.6; and Polyester 2: 3.8, 3.9, 3.6, 3.8, 3.7. The average acid value for Polyester 1
was 3.6 and for Polyester 2 was 3.76; and the standard deviation for the acid value of
Polyester 1 was 0.19 and for Polyester 2 was 0.11. The calculated t value was 1.63. A
published t Table (found in all statistical references) tells the researcher that he would be
wrong fifteen times in a hundred if he said these polyesters had different acid numbers.
These were not acceptable odds, so the researcher concluded that the polyesters had the
same acid numbers. A statistician would rather state that “the two polyesters could not be
proven to be different.”
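As a check on Example 1, the t statistic can be reproduced with nothing but the standard library. This is a sketch of the textbook pooled-variance formula, not code from the original article:

```python
from math import sqrt
from statistics import mean, variance

# Example 1 data: five replicate acid values per resin
p1 = [3.9, 3.6, 3.5, 3.4, 3.6]
p2 = [3.8, 3.9, 3.6, 3.8, 3.7]

def pooled_t(a, b):
    """Two-sample t statistic using a pooled variance estimate."""
    na, nb = len(a), len(b)
    # Pool the two sample variances, weighting each by its degrees of freedom
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return abs(mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))

print(round(pooled_t(p1, p2), 2))  # 1.63
```

The value 1.63 matches the calculated t quoted in the text.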
Example 2: The researcher of Example 1 was still not satisfied so he decided to run five
more replicates the next day: Polyester 1, 3.6, 3.4, 3.6, 3.4, 3.5; and Polyester 2, 3.8, 3.6,
3.7, 3.8, 3.8. Combining the first results with the second results, the average acid value for
Polyester 1 was 3.55 and for Polyester 2 was 3.75, while the standard deviation for the acid
value of Polyester 1 was 0.15 and for Polyester 2 was 0.10. With twenty tests the researcher
has increased his power to make a decision. The calculated t value was 3.52. This time the t
Table tells that the researcher would be wrong fewer than one time in one hundred if he said
these polyesters had different acid numbers. The researcher and the statistician would both
conclude that, with the additional data, the polyesters have different acid numbers.
What is Variance?
In a previous chapter, “What’s a Standard Deviation Good for Anyway?”, standard
deviation, s, was described as the square root of the variance (Equation 2).
s = √Variance    (2)
The variance is calculated by subtracting the average from each data point to obtain the
deviation, d (Equation 3); squaring each deviation; summing the squared deviations
(Equation 4); and dividing the sum-of-the-squares by n − 1, one less than the number of data
points (Equation 5). The standard deviation is obtained by finally taking the square root
(Equation 6).
di = Xi − X̄    (3)

Sum of Squares = Σ di²    (4)

Variance = s² = Σ di² / (n − 1)    (5)
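Equations 3 through 5, and the final square root, can be traced in a few lines of Python, using the Polyester 1 data from Example 1:

```python
from math import sqrt

# Acid values for Polyester 1 (Example 1)
data = [3.9, 3.6, 3.5, 3.4, 3.6]

xbar = sum(data) / len(data)            # the average
devs = [x - xbar for x in data]         # Equation 3: deviations from the average
ss = sum(d * d for d in devs)           # Equation 4: sum of the squared deviations
var = ss / (len(data) - 1)              # Equation 5: variance, n - 1 in the denominator
s = sqrt(var)                           # square root gives the standard deviation

print(round(s, 2))  # 0.19
```

The result, s = 0.19, is the Polyester 1 standard deviation quoted in Example 1.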
What is ANOVA?
Analysis of Variance (ANOVA) separates the variation in a set of measurements into the
effects produced when the levels of the variables are changed. In the example above, a
variance can be calculated for the difference between Polyester 1 and Polyester 2, and a
variance can be calculated for the experimental error. The corresponding sums of squares
are additive.
How is an ANOVA table constructed?
Example 3: For simplicity, all the data in Examples 1 and 2 will be used.
Polyester 1, 3.9, 3.6, 3.5, 3.4, 3.6, 3.6, 3.4, 3.6, 3.4, 3.5
Polyester 2, 3.8, 3.9, 3.6, 3.8, 3.7, 3.8, 3.6, 3.7, 3.8, 3.8
The average for each polyester is calculated: 3.55 and 3.75.
The grand average is calculated: 3.65.
The grand sum of the deviation squares (often abbreviated to “grand sum of the squares” or
simply, Total SS) is calculated using twenty data points and the grand average (see for
example, Equation 4): 0.49.
The sum of the squares between the two resins is calculated using each resin average, the
grand average, and the number of replicates.
The sum of the squares within the two resins is calculated using the averages of the two
resins and the individual data points.
The Total Sum of Squares is seen to be the sum of the Between-Resins and Within-Resins
Sums of Squares. This means that only two of the three SSs need to be calculated long-hand,
and the third can be calculated from the others.
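The three sums of squares described above can be computed directly from the Example 3 data, and the Between and Within pieces do add up to the Total:

```python
p1 = [3.9, 3.6, 3.5, 3.4, 3.6, 3.6, 3.4, 3.6, 3.4, 3.5]
p2 = [3.8, 3.9, 3.6, 3.8, 3.7, 3.8, 3.6, 3.7, 3.8, 3.8]

m1 = sum(p1) / len(p1)          # resin average, 3.55
m2 = sum(p2) / len(p2)          # resin average, 3.75
grand = sum(p1 + p2) / 20       # grand average, 3.65

# Total SS: every data point against the grand average
total_ss = sum((x - grand) ** 2 for x in p1 + p2)
# Between-resins SS: each resin average against the grand average, times replicates
between_ss = len(p1) * (m1 - grand) ** 2 + len(p2) * (m2 - grand) ** 2
# Within-resins SS: each data point against its own resin average
within_ss = sum((x - m1) ** 2 for x in p1) + sum((x - m2) ** 2 for x in p2)

print(round(total_ss, 2), round(between_ss, 2), round(within_ss, 2))  # 0.49 0.2 0.29
```

These are the values that appear in Table 1, and 0.20 + 0.29 = 0.49 as claimed.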
The ANOVA table is completed by including the degrees of freedom, df; and calculating the
variance by dividing the SS by the df. The variance in an ANOVA table is usually called the
Mean Square.
In this example, two resins were compared, so the between-resins degrees-of-freedom is
one; there are ten analyses for each resin, or nine degrees-of-freedom for each resin, so the
experimental error or within-resins degrees-of-freedom is twice nine, or eighteen; and the
grand degrees-of-freedom is nineteen (the grand degrees-of-freedom is equal to the sum of
the between-resins degrees-of-freedom and the within-resins degrees-of-freedom; or, is
equal to the total number of experiments minus one). This is summarized in Table 1.²
Table 1: ANOVA

Source of Variation    Sum of Squares    df    Mean Square
Between Resins              0.200          1       0.200
Within Resins               0.290         18       0.016
Total                       0.490         19
2 I know I’m waving my hands here, but I would refer you to any good text in applied statistics to find the algebraic
derivation.
In this example the within-resins Mean Square is the error variance. The standard deviation
for this acid number test is the square root of the Within-Resins Mean Square (s² = 0.016);
therefore, s = 0.127 with 18 degrees of freedom.
The above demonstrates calculations to isolate the effects between and within different lots
of polyester, that is, the variance between the lots and the variance of the experimental error.
But how does ANOVA tell whether Resin 1 is different from Resin 2? This requires the
calculation of the F statistic and a comparison in an “F” test.
What is an F statistic?
The challenge is to describe the F statistic and how to use it to judge differences in samples
or treatments without going into its derivation (see Footnote 2). An Fsample is calculated by
dividing the variance between the samples, s₁², by the error variance, s₂² (Equation 8).

Fsample = s₁² / s₂²    (8)
This calculated Fsample is then compared to values in the F table. F tables are published in
most statistics references. An excerpt of the table is given in Table 2.
A sample variance, s², is only an estimate of the population variance, σ². If two
different samples are taken, two variances can be calculated, and each s² is an independent
estimate of σ². The ratio of the two sample variances could vary by as much as F and still be
due to experimental variation. If the Fsample is larger than the F from the table, something
more than just experimental variation is happening. This additional variation is attributed to
the difference in the treatment.
Using the data from Example 3, the following holds. Since the between-groups
variance = polyester variance (with 1 degree of freedom) and the within groups
variance = experimental error (with 18 degrees of freedom), the F Value is calculated by
dividing the between-groups variance by the within-groups variance (Equation 7).
F1,18 = Polyester Variance / Experimental Error = 0.200 / 0.016 = 12.5    (7)
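Equation 7 can be checked in a line or two from the Table 1 entries. Computing from the unrounded Within-Resins Mean Square (0.290/18) gives 12.4; the 12.5 in the text comes from first rounding that Mean Square to 0.016:

```python
between_ss, between_df = 0.200, 1    # Table 1, Between Resins
within_ss, within_df = 0.290, 18     # Table 1, Within Resins

f_sample = (between_ss / between_df) / (within_ss / within_df)
print(round(f_sample, 1))  # 12.4 (12.5 when the Mean Square is first rounded to 0.016)
```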
The next step is to look up the F value in an F Table. Since the Between Groups had 1 degree
of freedom, the first column is used; and since the within groups had 18 degrees of freedom,
the eighteenth row is used. The F table tells that at 90% confidence, F1,18,90 = 3.01; at 95%
confidence, F1,18,95= 4.41; and at 99% confidence, F1,18,99 = 8.29.
The F1,18,95 = 4.41 tells that, at the 95% confidence level, the Between Resins Variance can
be as small as 0.016 or as large as 4.41 × 0.016 = 0.070 and still be due only to s²error.
Since the Between Resins Variance of Polyesters 1 and 2 is 0.200, it is larger than
can be explained simply by experimental error. An experimenter would be wrong only 1
time in 20 if he said the lots were different. Moreover, at 99% confidence, F1,18,99 = 8.29,
and the Between Resins Variance can be as small as 0.016 or as large as
8.29 × 0.016 = 0.133 and still be due only to experimental error. Even at the 99% confidence
level, the Between Resins Variance of 0.200 is larger than can be explained by experimental
error, and an experimenter would be wrong only 1 time in 100 if he said the lots were
different.
An easier way to use the F table is to go to the ANOVA table and compare the calculated
Fsample to the F1,18 from the F Table at each confidence level, Table 3.
A comparison of Fsample = 12.5 to the values in the table shows that there is less than 1
chance in 100 that these lots are not different.
The result of the ANOVA is identical to the result of comparing the means in the t-test.
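That equivalence can be checked numerically with the Example 3 data: for two groups, the pooled t statistic squared equals the ANOVA F ratio.

```python
from math import sqrt

p1 = [3.9, 3.6, 3.5, 3.4, 3.6, 3.6, 3.4, 3.6, 3.4, 3.5]
p2 = [3.8, 3.9, 3.6, 3.8, 3.7, 3.8, 3.6, 3.7, 3.8, 3.8]
n = len(p1)

m1, m2 = sum(p1) / n, sum(p2) / n
grand = (m1 + m2) / 2

# Two-group ANOVA quantities
between_ss = n * (m1 - grand) ** 2 + n * (m2 - grand) ** 2
within_ss = sum((x - m1) ** 2 for x in p1) + sum((x - m2) ** 2 for x in p2)
f = between_ss / (within_ss / (2 * n - 2))   # 1 between-groups df, 18 error df

# Pooled two-sample t statistic on the same data
sp2 = within_ss / (2 * n - 2)
t = abs(m1 - m2) / sqrt(sp2 * (2 / n))

print(abs(t * t - f) < 1e-9)  # True: for two groups, t squared equals F
```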
Why construct a complicated ANOVA table when the t-test is easier?
A t-test can only compare two means at a time. When there are more than
two groups in a treatment, use of the t-test would require that all pairs be compared. For
example, if there were five groups, then ten comparisons would have to be made.
ANOVA can compare all five in one step.
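The number of pairwise comparisons grows as k(k − 1)/2 with the number of groups k, which is why the one-step ANOVA becomes attractive. A quick check:

```python
from math import comb

# Number of pairwise t-tests needed to compare k group means
for k in (2, 3, 5):
    print(k, "groups:", comb(k, 2), "pairwise t-tests")
```

For five groups this prints 10, the ten comparisons mentioned above, while a single one-way ANOVA covers all five at once.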
Example 4: The acid numbers of five polyesters were to be compared to see if any
were different. Six replicate determinations of acid number were made for each resin. The
data is given in Table 4 and the Analysis of Variance is given in Table 5.
Table 4 / Example 4 data
Polyester Acid Numbers Average
3 3.4, 3.4, 3.5, 3.3, 3.5, 3.3 3.43
4 3.5, 3.5, 3.6, 3.4, 3.6, 3.5 3.52
5 3.2, 3.3, 3.2, 3.3, 3.3, 3.3 3.27
6 3.5, 3.6, 3.4, 3.4, 3.6, 3.5 3.50
7 3.8, 3.9, 3.8, 3.8, 3.9, 3.9 3.85
The F ratio of 52.2 tells that there is less than 1 chance in 10,000 of making a mistake
if the experimenter says that these five polyesters are not all the same. Figure 1 shows a plot
of the data for each polyester with comparison circles, which allow the experimenter to
judge which resins are the same and which are different.
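The five-group comparison can be reproduced with a short one-way ANOVA routine. Using the acid numbers exactly as tabulated above gives an F ratio of about 51, in close agreement with the reported 52.2 (the small difference traces to rounding in the tabulated data):

```python
# Acid numbers from Table 4, keyed by polyester number
groups = {
    3: [3.4, 3.4, 3.5, 3.3, 3.5, 3.3],
    4: [3.5, 3.5, 3.6, 3.4, 3.6, 3.5],
    5: [3.2, 3.3, 3.2, 3.3, 3.3, 3.3],
    6: [3.5, 3.6, 3.4, 3.4, 3.6, 3.5],
    7: [3.8, 3.9, 3.8, 3.8, 3.9, 3.9],
}

def one_way_anova(groups):
    """Return the F ratio for a one-way (single-variable) ANOVA."""
    all_pts = [x for g in groups.values() for x in g]
    grand = sum(all_pts) / len(all_pts)
    means = {k: sum(g) / len(g) for k, g in groups.items()}
    # Between-groups SS: group means against the grand mean, times replicates
    between_ss = sum(len(g) * (means[k] - grand) ** 2 for k, g in groups.items())
    # Within-groups (error) SS: each point against its own group mean
    within_ss = sum((x - means[k]) ** 2 for k, g in groups.items() for x in g)
    between_df = len(groups) - 1          # 4
    within_df = len(all_pts) - len(groups)  # 25
    return (between_ss / between_df) / (within_ss / within_df)

print(round(one_way_anova(groups), 1))
```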
Figure 1 / Data plot with comparison circles (acid value, 3.2 to 4.0, plotted against
polyester, with a comparison circle for each resin)
Overlapping comparison circles mean that the ANOVA cannot tell whether the resins are
different. A comparison circle that is isolated indicates that the data disprove the hypothesis
that the sample is the same as the others. Figure 1 shows one polyester with a high acid
number, one with a low acid number, and three polyesters whose acid numbers cannot be
told apart. For these three resins, since one of the comparison circles does not overlap the
other two exactly, additional experimentation on these three might show a difference.
This was an example of a one-way Analysis of Variance, because only one variable
was present.
These results led to the conclusion that there did indeed seem to be differences in the lots. A
review of the production tests of these lots led to no definitive conclusions. At face value it
seemed that lot D was slow to dry, lot C was fast to dry, and lots A and B were in the middle.
A statistically designed experiment was conducted to increase the number of degrees of
freedom in order to increase the statistical “power.” The dry times of these lots were
evaluated over several days, Table 7.
Table 7: Gardner Dry Times
Isocyanate Samples

Day    Lot A    Lot B    Lot C    Lot D
These results seemed to reveal a pattern in the performance of the lots and something
about the experimental error. Table 8 shows, for each day, the mean and standard deviation
of the tests run on the four lots; for each lot, the mean and standard deviation of the tests run
on the different days; and the grand mean and grand standard deviation for the test.
Table 8: Gardner Dry Times
Isocyanate Samples

Day    Lot A    Lot B    Lot C    Lot D    Day Mean    Day Standard Deviation
From these data two hypotheses can be stated: ① There are no differences between the lots;
and ② There are no day-to-day differences in dry time. The data seem to show that the lot-
to-lot averages fall into two groups: two resins have higher dry times than the other two; and
the day-to-day averages don’t seem to fall into any pattern. To analyze these data an
ANOVA Table is constructed, Table 9, to test these hypotheses.
squaring this difference and taking the sum of the squares for the sixteen runs. Finally, the
Error Sum-of-Squares was calculated by subtracting the Between Lots Sum-of-Squares and
the Between Days Sum-of-Squares from the Total Sum-of-Squares (see Footnote 2).
Since there were four days there were three Between Days degrees-of-freedom; and there
were four lots so there were three Between Lots degrees-of-freedom. There were sixteen
experiments so there were fifteen Total degrees-of-freedom. The Error degrees-of-freedom,
9, were calculated by subtracting the Days and the Lots degrees-of-freedom from the Total
degrees-of-freedom.
Each Mean Square was calculated by dividing each Sum of Squares by its respective
degrees of freedom.
Each Fsample was calculated by dividing the Lot or Day Mean Square by the Error Mean
Square, respectively. The Fsample was then compared to the Fs from the F table with 3 and 9
degrees of freedom.
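The two-way (lots by days) decomposition described above can be sketched as follows. The grid of numbers here is purely illustrative, not the article's Table 7 data; it is built so that all of the variation comes from the lots, none from the days. For a 4 × 4 grid the degrees of freedom come out 3 (days), 3 (lots), 9 (error), and 15 (total), as in the text.

```python
def two_way_anova(grid):
    """Sum-of-squares decomposition, one observation per (day, lot) cell.

    Rows are days, columns are lots; the Error SS is what remains after
    the day and lot effects are subtracted from the Total SS.
    """
    r, c = len(grid), len(grid[0])
    grand = sum(sum(row) for row in grid) / (r * c)
    row_means = [sum(row) / c for row in grid]
    col_means = [sum(grid[i][j] for i in range(r)) / r for j in range(c)]

    total_ss = sum((x - grand) ** 2 for row in grid for x in row)
    rows_ss = c * sum((m - grand) ** 2 for m in row_means)   # between days
    cols_ss = r * sum((m - grand) ** 2 for m in col_means)   # between lots
    error_ss = total_ss - rows_ss - cols_ss                  # by subtraction
    return total_ss, rows_ss, cols_ss, error_ss

# Illustrative grid: every day repeats the same lot pattern, so the
# between-days and error SS are both zero.
grid = [[1.0, 2.0, 3.0, 4.0] for _ in range(4)]
print(two_way_anova(grid))  # (20.0, 0.0, 20.0, 0.0)
```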
If an experimenter said that there is a difference between each day’s testing, he would
be wrong 77 out of 100 times, so the second hypothesis held true. However, if he said that
there is a difference between the lots, he would be wrong only 4 times in 100, and the
first hypothesis was false.
The standard deviation for the experimental error was calculated by taking the square root of
the Error Mean Square. One surprise was that the experimental error was so large: a standard
deviation of 82 minutes. This means that if an experimenter determined only one dry time per
sample, he would have to see a difference of ~160 minutes before he could confidently say
the samples had different dry times. In this case, differences between lots could be seen
because of the increased power in the experiment due to the large number of replicates.