In basic statistics, the F-distribution is used in: (1) making inferences about two population variances
i.e., homogeneity of variance test, and (2) analysis of variance (ANOVA). In this class, we will cover
only the ANOVA test.
Fisher's F-distribution
If $\sigma_1^2 = \sigma_2^2$ and $s_1^2$ and $s_2^2$ are sample variances from independent simple random samples of size $n_1$ and $n_2$, respectively, drawn from normal populations, then
$$F = \frac{s_1^2}{s_2^2}$$
follows the F-distribution with $n_1 - 1$ degrees of freedom in the numerator and $n_2 - 1$ degrees of freedom in the denominator.
E.g., if samples are drawn of size n1=8 from Population 1 and size n2=5 from Population 2, then F has df
= 7, 4 (i.e., (8-1), (5-1)).
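$$s_1^2 = \frac{\sum (x_{1i} - \bar{x}_1)^2}{8 - 1}, \qquad n_1 = 8, \quad df = 7$$
$$s_2^2 = \frac{\sum (x_{2i} - \bar{x}_2)^2}{5 - 1}, \qquad n_2 = 5, \quad df = 4$$
$$F = \frac{s_1^2}{s_2^2} \text{ has } df = 7 \text{ and } 4$$
Note that df for F is always stated with the numerator df first and then the denominator df.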
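The ratio and its degrees of freedom can be sketched in a few lines of Python. This is only an illustration: the two samples below are hypothetical values invented to match the sample sizes 8 and 5, and the helper name `f_ratio` is not from the text.

```python
from statistics import variance


def f_ratio(sample1, sample2):
    """Return (F, numerator df, denominator df) for F = s1^2 / s2^2."""
    s1_sq = variance(sample1)  # sample variance, divides by n1 - 1
    s2_sq = variance(sample2)  # sample variance, divides by n2 - 1
    return s1_sq / s2_sq, len(sample1) - 1, len(sample2) - 1


# Hypothetical data, chosen only so that n1 = 8 and n2 = 5.
population_1 = [12.1, 9.8, 11.4, 10.7, 13.0, 9.5, 10.2, 11.9]
population_2 = [10.4, 11.1, 9.9, 10.8, 10.3]

f_value, df_num, df_den = f_ratio(population_1, population_2)
print(f"F = {f_value:.3f} with df = {df_num}, {df_den}")
```

The degrees of freedom come out as 7 and 4 regardless of the data values, because they depend only on the sample sizes.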
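Find the critical F-value for a right-tailed test with $\alpha = 0.05$, degrees of freedom in the numerator = 10 and degrees of freedom in the denominator = 6.
The critical value is written $F_{0.05,\,10,\,6}$, where the first number after $\alpha$ is the df of the numerator (10) and the second is the df of the denominator (6).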
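In class the critical value is read from an F table. As a cross-check, the sketch below approximates it with the Python standard library only. The approach is an assumption of this sketch, not the method of the text: it integrates the incomplete beta form of the F-distribution with Simpson's rule and finds the critical value by bisection. The function names `f_cdf` and `f_critical` are illustrative.

```python
import math


def f_cdf(x, d1, d2, steps=2000):
    """Approximate P(F <= x) for an F-distribution with d1 and d2 df.

    Uses P(F <= x) = I_z(d1/2, d2/2) with z = d1*x / (d1*x + d2) and
    evaluates the incomplete beta integral with Simpson's rule.
    Assumes d1 >= 2 (integrand finite at 0) and an even number of steps.
    """
    if x <= 0:
        return 0.0
    a, b = d1 / 2, d2 / 2
    z = d1 * x / (d1 * x + d2)
    h = z / steps

    def g(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    total = g(0.0) + g(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    integral = total * h / 3
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return integral / math.exp(log_beta)


def f_critical(alpha, d1, d2, upper=1000.0):
    """Right-tailed critical value: the x with P(F > x) = alpha (bisection)."""
    lo, hi = 0.0, upper
    for _ in range(60):
        mid = (lo + hi) / 2
        if f_cdf(mid, d1, d2) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


crit = f_critical(0.05, 10, 6)
print(f"F(0.05, 10, 6) is approximately {crit:.2f}")
```

The result should agree with the tabled value of about 4.06 for 10 numerator and 6 denominator degrees of freedom.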
Analysis of Variance (ANOVA) is an inferential method used to test the equality of three or more population means. ANOVA is an extension of the t-test for independent samples (Section 10.2).
H0: $\mu_1 = \mu_2 = \cdots = \mu_k$
H1: not all means are equal
For example, for k = 3 the null and alternative hypotheses are:
H0: $\mu_1 = \mu_2 = \mu_3$
H1: $\mu_1 = \mu_2 \ne \mu_3$, or
    $\mu_1 \ne \mu_2 = \mu_3$, or
    $\mu_1 = \mu_3 \ne \mu_2$, or
    $\mu_1 \ne \mu_2 \ne \mu_3$
[Figure: distributions of Population 1, Population 2, and Population 3]
ANOVA Test Using the F-distribution: Hypothesis Test Regarding Three or More Means with Unknown $\sigma$
Assumptions:
k simple random samples from k populations.
Step 1: A claim is made regarding the means of three or more populations. The null and alternative hypotheses are written as:
H0: $\mu_1 = \mu_2 = \cdots = \mu_k$
H1: not all means are equal
Step 2: Select a level of significance, $\alpha$, and find the right-tailed critical value for the F-distribution with df = $(k-1)$, $(n_1 + n_2 + \cdots + n_k - k)$. The rejection region (or critical region) is the set of all values of the test statistic to the right of the critical F-value, $F_{\alpha,\,(k-1),\,(n_1+n_2+\cdots+n_k-k)}$.
Step 3:
a. Calculate the grand mean of the combined data set, $\bar{x}$, by adding up all the observations and dividing by the number of observations.
b. Find the sample mean for each population or treatment ($\bar{x}_1$ = sample mean from population 1; $\bar{x}_2$ = sample mean from population 2; and so on).
c. Find the sample variance for each population ($s_1^2$ = sample variance from population 1; $s_2^2$ = sample variance from population 2; and so on).
d. Calculate the mean square due to treatment. (Another name for mean square is variance.)
$$MST = \frac{n_1(\bar{x}_1 - \bar{x})^2 + n_2(\bar{x}_2 - \bar{x})^2 + \cdots + n_k(\bar{x}_k - \bar{x})^2}{k - 1},$$
where $n_1$ is the sample size from population 1;
$n_2$ is the sample size from population 2; and so on;
$k$ is the number of populations, or treatment levels.
e. Calculate the mean square due to error:
$$MSE = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_k - 1)s_k^2}{n_1 + n_2 + \cdots + n_k - k}$$
f. Compute the F-test statistic:
$$F = \frac{MST}{MSE}$$
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square                  F-Statistic
Treatment             SST              k-1                  MST = SST/(k-1)              F = MST/MSE
Error                 SSE              n1+n2+…+nk-k         MSE = SSE/(n1+n2+…+nk-k)
Total                 SS               n1+n2+…+nk-1
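The calculation of MST, MSE, and F can be sketched as one small function. This is an illustrative sketch rather than the course's own tool: the function name `one_way_anova` and the tiny data set are hypothetical, with the numbers chosen so the arithmetic is easy to check by hand.

```python
from statistics import mean, variance


def one_way_anova(groups):
    """Return (MST, MSE, F, df_treatment, df_error) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total   # grand mean
    group_means = [mean(g) for g in groups]              # sample means
    group_vars = [variance(g) for g in groups]           # sample variances
    # Sum of squares due to treatment, then its mean square.
    sst = sum(len(g) * (m - grand_mean) ** 2
              for g, m in zip(groups, group_means))
    mst = sst / (k - 1)
    # Sum of squares due to error (pooled variances), then its mean square.
    sse = sum((len(g) - 1) * v for g, v in zip(groups, group_vars))
    mse = sse / (n_total - k)
    return mst, mse, mst / mse, k - 1, n_total - k


# Hypothetical data: three treatments with three observations each.
samples = [[4, 5, 6], [7, 8, 9], [10, 11, 12]]
mst, mse, f_stat, df_num, df_den = one_way_anova(samples)
print(f"MST = {mst}, MSE = {mse}, F = {f_stat} with df = {df_num}, {df_den}")
```

For this toy data the group means are 5, 8, and 11, the grand mean is 8, and every sample variance is 1, so MST = 54/2 = 27, MSE = 6/6 = 1, and F = 27 with df = 2, 6.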
Control   Fenugreek   Garlic   Onion
288.1     229.1       177.4    299.7
296.8     240.7       202.2    258.3
267.8     239.4       163.1    286.8
256.7     207.7       184.7    244.0
292.1     225.7       197.9    267.1
282.9     230.8       164.6    297.1
260.3     206.6       193.9    249.9
283.8     213.3       158.1    265.1
Step 1: A claim is made regarding the means of the four populations. The null and alternative hypotheses are written as:
H0: $\mu_1 = \mu_2 = \mu_3 = \mu_4$
H1: not all means are equal
Step 2: Select $\alpha = 0.05$ and find the right-tailed critical value for the F-distribution with df = $(k-1)$, $(n_1+n_2+n_3+n_4-k)$, or df = 3, 28.
$F_{0.05,\,3,\,28} = 2.95$
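Step 3:
a. Calculate the grand mean of the entire data set:
$$\bar{x} = \frac{288.1 + 296.8 + \cdots + 249.9 + 265.1}{32} = \frac{7631.7}{32} = 238.4906$$
b. Find the sample mean of each population, where Control = Population 1, Fenugreek = Population 2, Garlic = Population 3 and Onion = Population 4.
$$\bar{x}_1 = \frac{288.1 + 296.8 + \cdots + 283.8}{8} = 278.5625$$
$$\bar{x}_2 = \frac{229.1 + 240.7 + \cdots + 213.3}{8} = 224.1625$$
$$\bar{x}_3 = \frac{177.4 + 202.2 + \cdots + 158.1}{8} = 180.2375$$
$$\bar{x}_4 = \frac{299.7 + 258.3 + \cdots + 265.1}{8} = 271.0000$$
c. Find the sample variance of each population:
$$s_1^2 = \frac{(288.1 - 278.5625)^2 + (296.8 - 278.5625)^2 + \cdots + (283.8 - 278.5625)^2}{8 - 1} = 225.7713$$
$$s_2^2 = \frac{(229.1 - 224.1625)^2 + (240.7 - 224.1625)^2 + \cdots + (213.3 - 224.1625)^2}{8 - 1} = 181.9884$$
$$s_3^2 = \frac{(177.4 - 180.2375)^2 + (202.2 - 180.2375)^2 + \cdots + (158.1 - 180.2375)^2}{8 - 1} = 291.0341$$
$$s_4^2 = \frac{(299.7 - 271)^2 + (258.3 - 271)^2 + \cdots + (265.1 - 271)^2}{8 - 1} = 448.5800$$
d. Compute MST:
$$MST = \frac{8(278.5625 - 238.4906)^2 + 8(224.1625 - 238.4906)^2 + 8(180.2375 - 238.4906)^2 + 8(271 - 238.4906)^2}{4 - 1} = \frac{50{,}090.69}{3} = 16{,}696.90$$
e. Compute MSE:
$$MSE = \frac{7(225.7713) + 7(181.9884) + 7(291.0341) + 7(448.5800)}{32 - 4} = \frac{8{,}031.62}{28} = 286.84$$
f. Compute the F-test statistic:
$$F = \frac{MST}{MSE} = \frac{16{,}696.90}{286.84} = 58.21$$
ANOVA Table:
Source of Variation   Sum of Squares   Degrees of Freedom    Mean Square       F-Test Statistic
Between               50,090.69        k-1 = 4-1 = 3         MST = 16,696.90   calc F = 58.21
Within                 8,031.62        n1+n2+n3+n4-k = 28    MSE = 286.84
Total                 58,122.31        n1+n2+n3+n4-1 = 31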
Step 4: Conclusion. Because the calculated F-statistic = 58.21 is greater than the critical F = 2.95, reject H0 at the 0.05 significance level. At least one of the population means is different from the others.
Perform the calculations using Excel.
Step 3: With the cursor in the Input Range: box, highlight the data. Click OK.
A         B           C        D
Control   Fenugreek   Garlic   Onion
288.1     229.1       177.4    299.7
296.8     240.7       202.2    258.3
267.8     239.4       163.1    286.8
256.7     207.7       184.7    244.0
292.1     225.7       197.9    267.1
282.9     230.8       164.6    297.1
260.3     206.6       193.9    249.9
283.8     213.3       158.1    265.1

Groups      Count   Sum      Average    Variance
Control     8       2228.5   278.5625   225.7713
Fenugreek   8       1793.3   224.1625   181.9884
Garlic      8       1441.9   180.2375   291.0341
Onion       8       2168.0   271        448.58

ANOVA
Source of Variation    SS         df   MS         F         P-value    F crit
Between Groups (SST)   50090.69   3    16696.9    58.2091   3.74E-12   2.946685
Within Groups (SSE)    8031.616   28   286.8434
Total                  58122.31   31
F is the F-statistic (or calculated F); F crit is the critical F.
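The Excel output can be cross-checked directly from the raw data. The sketch below recomputes the three sums of squares from their definitions (rather than from the group variances) using only the Python standard library. The variable names are illustrative, and the critical value is copied from the Excel output instead of being computed.

```python
data = {
    "Control":   [288.1, 296.8, 267.8, 256.7, 292.1, 282.9, 260.3, 283.8],
    "Fenugreek": [229.1, 240.7, 239.4, 207.7, 225.7, 230.8, 206.6, 213.3],
    "Garlic":    [177.4, 202.2, 163.1, 184.7, 197.9, 164.6, 193.9, 158.1],
    "Onion":     [299.7, 258.3, 286.8, 244.0, 267.1, 297.1, 249.9, 265.1],
}

all_values = [x for group in data.values() for x in group]
n_total, k = len(all_values), len(data)
grand_mean = sum(all_values) / n_total

# Between-groups sum of squares (SST): group size times the squared
# distance of each group mean from the grand mean.
sst = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in data.values())
# Within-groups sum of squares (SSE): squared distance of each
# observation from its own group mean.
sse = sum((x - sum(g) / len(g)) ** 2 for g in data.values() for x in g)
# Total sum of squares: squared distance of each observation from the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

mst = sst / (k - 1)
mse = sse / (n_total - k)
f_stat = mst / mse
f_crit = 2.946685  # copied from the Excel output, not computed here

print(f"grand mean = {grand_mean:.4f}")
print(f"SST = {sst:.2f}, SSE = {sse:.3f}, Total = {ss_total:.2f}")
print(f"MST = {mst:.1f}, MSE = {mse:.4f}, F = {f_stat:.4f}")
print("Reject H0" if f_stat > f_crit else "Do not reject H0")
```

The between and within sums of squares add up to the total, which is a quick consistency check on any ANOVA table.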
[Figure: sample distributions relative to the grand mean $\bar{x}$ when H0 is true and when H0 is false]
Tukey's Test Using the Studentized Range Distribution: Hypothesis Test Comparing Two Means (see Section 13.2, available under Course Compass).
Assumptions:
Step 1: A claim is made regarding the two population means ($\mu_i$ and $\mu_j$).
Two-Tailed Test
H0: $\mu_i = \mu_j$
H1: $\mu_i \ne \mu_j$ ($\mu_i < \mu_j$ or $\mu_i > \mu_j$)
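$$q = \frac{(\bar{x}_i - \bar{x}_j) - (\mu_i - \mu_j)}{\sqrt{\dfrac{s^2}{2}\left(\dfrac{1}{n_i} + \dfrac{1}{n_j}\right)}}$$
Note that $s^2$ is the mean square due to error, MSE, from the ANOVA table; $n_i$ is the sample size from population i; and $n_j$ is the sample size from population j.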
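As a worked illustration, assume equal sample sizes $n_i = n_j = n$ (an assumption of this sketch; it holds for the example data, where every group has 8 observations). Under H0 the term $\mu_i - \mu_j$ is zero, so the standard error and the statistic simplify to:

```latex
\sqrt{\frac{s^2}{2}\left(\frac{1}{n} + \frac{1}{n}\right)}
  = \sqrt{\frac{s^2}{2}\cdot\frac{2}{n}}
  = \sqrt{\frac{s^2}{n}},
\qquad
q = \frac{\bar{x}_i - \bar{x}_j}{\sqrt{MSE / n}}
```

With equal sample sizes every pairwise comparison shares the same denominator, so only the differences in sample means change from one comparison to the next.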
Compare the calculated q (or q statistic) to the critical value, $q_{\alpha,\,(n_1+n_2+\cdots+n_k-k),\,k}$, and state whether or not H0 is rejected at the specified $\alpha$.
If $q \ge q_{\alpha,\,(n_1+n_2+\cdots+n_k-k),\,k}$, reject H0; otherwise do not reject H0.
Interpret the conclusion in the context of the problem.
Compare all pairwise differences to identify which population means are considered equal.
       Control   Fenugreek   Garlic   Onion
       288.1     229.1       177.4    299.7
       296.8     240.7       202.2    258.3
       267.8     239.4       163.1    286.8
       256.7     207.7       184.7    244.0
       292.1     225.7       197.9    267.1
       282.9     230.8       164.6    297.1
       260.3     206.6       193.9    249.9
       283.8     213.3       158.1    265.1
Mean   278.56    224.16      180.24   271.00
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F-Statistic
Between               50,090.69        3                    16,696.90     58.21
Within                 8,031.62        28                   286.84
Total                 58,122.31        31
$$q = \frac{(\bar{x}_i - \bar{x}_j) - (\mu_i - \mu_j)}{\sqrt{\dfrac{s^2}{2}\left(\dfrac{1}{n_i} + \dfrac{1}{n_j}\right)}}$$
Step 4: Conclusion. Provide a conclusion and the statistical justification for the conclusion, and interpret your conclusion in the context of the problem.
Comparison, H0 and H1    Difference, $\bar{x}_i - \bar{x}_j$    Test Statistic, q    Critical Value    Conclusion
Summary of Tukey's Test (arrange sample means from highest to lowest and draw a line under means that are not significantly different, p. 695):
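The pairwise comparisons can be sketched as a short loop. Several things here are assumptions of the sketch rather than part of the text: the sample means and MSE are taken from the Excel output, the critical value 3.861 for $q_{0.05,\,28,\,4}$ is read from a studentized range table and should be confirmed against the table used in the course, the absolute difference is used so that q is non-negative, and the helper name `q_statistic` is illustrative.

```python
import math
from itertools import combinations

# Sample means and MSE taken from the Excel output for the example data.
means = {"Control": 278.5625, "Fenugreek": 224.1625,
         "Garlic": 180.2375, "Onion": 271.0}
n = 8            # observations per group
mse = 286.8434   # mean square due to error, 28 degrees of freedom

# ASSUMPTION: q(0.05, 28, 4) read from a studentized range table;
# verify against the course table before relying on it.
q_crit = 3.861


def q_statistic(mean_i, mean_j, n_i, n_j, s_sq):
    """Tukey's q under H0: mu_i = mu_j, so the (mu_i - mu_j) term is zero."""
    standard_error = math.sqrt(s_sq / 2 * (1 / n_i + 1 / n_j))
    return abs(mean_i - mean_j) / standard_error


results = {}
for (name_i, mean_i), (name_j, mean_j) in combinations(means.items(), 2):
    q = q_statistic(mean_i, mean_j, n, n, mse)
    results[(name_i, name_j)] = q
    decision = "reject H0" if q >= q_crit else "do not reject H0"
    print(f"{name_i} vs {name_j}: difference = {mean_i - mean_j:9.4f}, "
          f"q = {q:6.2f}, {decision}")

# Sample means arranged from highest to lowest, as the summary asks.
ranked = sorted(means, key=means.get, reverse=True)
print(" > ".join(ranked))
```

Under these assumptions, only the Control versus Onion comparison falls below the critical value, so those two means would share an underline in the summary.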