
Comparing More than Two Population Means Simultaneously

Analysis of Variance Technique (ANOVA)

STAT-600
Analysis of Variance (ANOVA)
Analysis of Variance is a procedure that partitions the total variability in the data into distinct components.
Each component represents the variation due to a recognized source of variation. In addition, one component represents the variation due to uncontrolled factors and random errors associated with the response measurements.

Total Variation = Explained Variation + Unexplained Variation
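For one recognized source of variation, the partition can be written out as an identity between sums of squares. The notation below is an assumption introduced for illustration and is not taken from the slides: $y_{ij}$ is observation $j$ in group $i$, $\bar{y}_{i.}$ is the mean of group $i$, $\bar{y}_{..}$ is the grand mean, $k$ is the number of groups and $n_i$ the size of group $i$.

```latex
\underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i} \left(y_{ij} - \bar{y}_{..}\right)^2}_{\text{Total}}
  = \underbrace{\sum_{i=1}^{k} n_i \left(\bar{y}_{i.} - \bar{y}_{..}\right)^2}_{\text{Explained (between groups)}}
  + \underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i} \left(y_{ij} - \bar{y}_{i.}\right)^2}_{\text{Unexplained (within groups)}}
```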
Example: The milk butterfat percentage of 4 breeds of cows is to be compared. A random sample of 6 mature cows was taken from each of the 4 breeds, and the following data were recorded.

         Breed 1   Breed 2   Breed 3   Breed 4
         3.6       4.6       3.7       5.8
         4.1       4.9       3.6       5.0
         4.0       5.7       3.8       5.3
         3.9       5.9       3.2       5.2
         3.2       4.3       3.9       4.9
         4.3       5.1       3.2       5.8
Total    23.1      30.5      21.4      32.0      Grand total = 107.0

Test the hypothesis that the average milk butterfat percentage is the same for the four breeds.

Explained variation: Between Breed
Unexplained variation: Within Breed
Graphical View of the data
[Figure: dot plot and boxplot of butterfat percentage (vertical axis, 3.0 to 6.0) by breed (horizontal axis, Breed 1 to Breed 4)]

The average milk butterfat percentages of Breeds 2 and 4 are almost similar, as are those of Breeds 1 and 3. Although Breed 2 has the largest variability in the data, the variability is taken to be about the same across the four breeds.
Statistical Analysis by One Way ANOVA
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$ (the average milk butterfat percentage is the same for the 4 breeds)
$H_1$: At least two means are different

Test statistic: $F = S_b^2 / S_w^2$

Source of Variation (S.O.V)   Degrees of Freedom (DF)   Sum of Squares (SS)   Mean Sum of Squares (MSS = SS/DF)   Fcal
Between Breed                 4-1 = 3                   13.919                4.640 ($S_b^2$)                     23.67*
Within Breed                  23-3 = 20                 3.920                 0.196 ($S_w^2$, MSE)
Total                         24-1 = 23                 17.8383
Computation of the sums of squares (breed totals 23.1, 30.5, 21.4 and 32.0; grand total G.T = 107.0; 24 observations)

Correction Factor (CF)
$CF = (G.T)^2 / \text{Obs} = (107)^2 / 24 = 477.04$

Total SS
$(3.6)^2 + (4.1)^2 + \dots + (5.8)^2 - CF = 494.88 - 477.04 = 17.8383$

Between Breed SS
$\frac{(23.1)^2}{6} + \frac{(30.5)^2}{6} + \frac{(21.4)^2}{6} + \frac{(32.0)^2}{6} - CF = 13.919$

Within Breed SS
Total SS - Between Breed SS = 17.8383 - 13.919 = 3.92
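The correction-factor arithmetic above can be checked with a short script. This is a minimal sketch using only the Python standard library. The function name `one_way_sums_of_squares` and the dictionary layout are illustrative choices, not part of the slides. Evaluated without intermediate rounding, the between-breed sum of squares comes out near 13.93 rather than the 13.919 quoted, so the later figures differ slightly in the second decimal.

```python
# Butterfat percentages from the example, one list per breed.
breeds = {
    "Breed 1": [3.6, 4.1, 4.0, 3.9, 3.2, 4.3],
    "Breed 2": [4.6, 4.9, 5.7, 5.9, 4.3, 5.1],
    "Breed 3": [3.7, 3.6, 3.8, 3.2, 3.9, 3.2],
    "Breed 4": [5.8, 5.0, 5.3, 5.2, 4.9, 5.8],
}


def one_way_sums_of_squares(groups):
    """Return (CF, total SS, between SS, within SS) by the correction-factor method."""
    values = [y for group in groups.values() for y in group]
    grand_total = sum(values)
    cf = grand_total ** 2 / len(values)            # (G.T)^2 / Obs
    total_ss = sum(y * y for y in values) - cf     # sum of squared observations - CF
    between_ss = sum(sum(g) ** 2 / len(g) for g in groups.values()) - cf
    within_ss = total_ss - between_ss              # Total - Between
    return cf, total_ss, between_ss, within_ss


cf, total_ss, between_ss, within_ss = one_way_sums_of_squares(breeds)
print(f"CF = {cf:.2f}")
print(f"Total SS = {total_ss:.4f}")
print(f"Between Breed SS = {between_ss:.3f}")
print(f"Within Breed SS = {within_ss:.3f}")
```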
S.O.V           DF   SS        MSS = SS/DF              Fcal
Between Breed   3    13.919    4.640 ($S_b^2$)          23.67*
Within Breed    20   3.920     0.196 ($S_w^2$, MSE)
Total           23   17.8383

Decision Rule: Reject $H_0$ if $F_{cal} \ge F_{0.05}(3, 20)$

Result: As $F_{cal} = 23.67 > F_{0.05}(3, 20) = 3.10$, reject $H_0$ and conclude that the mean butterfat percentage differs among the 4 breeds of cows.
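The step from the sums of squares to the decision can be sketched as follows. This is an illustration only: the helper name `f_ratio` is hypothetical, the inputs are the SS and DF values from the ANOVA table, and the critical value 3.10 is the tabulated figure quoted in the result rather than something the script computes.

```python
def f_ratio(ss_between, df_between, ss_within, df_within):
    """Return (mean square between, mean square within, F) for a one-way ANOVA."""
    ms_between = ss_between / df_between   # S_b^2
    ms_within = ss_within / df_within      # S_w^2 (MSE)
    return ms_between, ms_within, ms_between / ms_within


# SS and DF taken from the ANOVA table for the butterfat example.
ms_b, ms_w, f_cal = f_ratio(13.919, 3, 3.920, 20)

F_TAB = 3.10  # tabulated F at the 5% level with (3, 20) DF, as quoted in the example
reject_h0 = f_cal >= F_TAB

print(f"S2b = {ms_b:.3f}, S2w = {ms_w:.3f}, Fcal = {f_cal:.2f}")
print("Reject Ho" if reject_h0 else "Do not reject Ho")
```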
Breed     Mean
Breed 1   3.850
Breed 2   5.083
Breed 3   3.567
Breed 4   5.333

[Figure: mean plot for butterfat percentage, showing the mean (vertical axis, 3.5 to 5.5) for Breed 1 to Breed 4 (horizontal axis)]
TWO WAY ANOVA

The effective life (in hours) of batteries is compared across three material types (TYPE-I: Nickel-Cadmium, TYPE-II: Nickel-Metal Hydride and TYPE-III: Lithium-Ion) and three operating temperatures: Low (-10˚C), Medium (20˚C) and High (45˚C).

Batteries are randomly selected from each material type and are then randomly allocated to each temperature level. The resulting life of all batteries is shown below:

         Type I   Type II   Type III
Low      180      188       160
Medium   215      210       190
High     82       90        80

Test the hypotheses that
• the mean life of the batteries is the same for the different material types
• the mean life of the batteries is the same at the different operating temperatures
Response variable: Life (in hours) of batteries
Type: Nickel-Cadmium, Nickel-Metal Hydride, Lithium-Ion
Temperature: Low (-10˚C), Medium (20˚C), High (45˚C)

Material type
$H_0: \mu_1 = \mu_2 = \mu_3$ (the average life of the batteries is the same for the three material types)
$H_1$: At least two means are different

Operating temperature
$H_0: \mu_1 = \mu_2 = \mu_3$ (the average life of the batteries is the same at the three operating temperatures)
$H_1$: At least two means are different

Explained variation:
• Due to material type
• Due to different Temp
Unexplained variation: Error
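With one observation per temperature and material combination, the split into explained and unexplained parts can be written as an identity. The symbols are assumptions introduced for illustration and do not appear on the slides: $y_{ij}$ is the life at temperature $i$ and material $j$, $r$ is the number of temperature levels, $c$ is the number of material types, and bars with dots denote row, column and grand means.

```latex
\underbrace{\sum_{i=1}^{r}\sum_{j=1}^{c} \left(y_{ij} - \bar{y}_{..}\right)^2}_{\text{Total}}
  = \underbrace{c \sum_{i=1}^{r} \left(\bar{y}_{i.} - \bar{y}_{..}\right)^2}_{\text{Temp}}
  + \underbrace{r \sum_{j=1}^{c} \left(\bar{y}_{.j} - \bar{y}_{..}\right)^2}_{\text{Material}}
  + \underbrace{\sum_{i=1}^{r}\sum_{j=1}^{c} \left(y_{ij} - \bar{y}_{i.} - \bar{y}_{.j} + \bar{y}_{..}\right)^2}_{\text{Error}}
```

The degrees of freedom split the same way: $rc - 1 = (r - 1) + (c - 1) + (r - 1)(c - 1)$, which for $r = c = 3$ gives $8 = 2 + 2 + 4$.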
         Type I   Type II   Type III   Total
Low      180      188       160        528
Medium   215      210       190        615
High     82       90        80         252
Total    477      488       430        1395

Correction Factor (CF)
$CF = (G.T)^2 / \text{Obs} = (1395)^2 / 9 = 216225$

Total SS
$(180)^2 + (215)^2 + \dots + (80)^2 - CF = 240993 - 216225 = 24768$

Between Material SS
$\frac{(477)^2}{3} + \frac{(488)^2}{3} + \frac{(430)^2}{3} - CF = 632.67$

Between Temp SS
$\frac{(528)^2}{3} + \frac{(615)^2}{3} + \frac{(252)^2}{3} - CF = 23946$

Error SS
Total SS - Material SS - Temp SS = 24768 - 632.67 - 23946 = 189.33
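The same sums of squares can be reproduced from the raw battery data. This is a minimal sketch in plain Python. The nested dictionary `life` and the variable names are illustrative choices, and the script assumes one observation per cell, as in the table.

```python
# Battery life (hours): rows are temperature levels, columns are material types.
life = {
    "Low":    {"Type I": 180, "Type II": 188, "Type III": 160},
    "Medium": {"Type I": 215, "Type II": 210, "Type III": 190},
    "High":   {"Type I": 82,  "Type II": 90,  "Type III": 80},
}

types = list(next(iter(life.values())))
temp_totals = {temp: sum(row.values()) for temp, row in life.items()}
type_totals = {t: sum(row[t] for row in life.values()) for t in types}

n_obs = len(life) * len(types)
grand_total = sum(temp_totals.values())
cf = grand_total ** 2 / n_obs                                      # (G.T)^2 / Obs

total_ss = sum(y * y for row in life.values() for y in row.values()) - cf
# Each material total is a sum over the temperature levels, and vice versa.
material_ss = sum(t ** 2 / len(life) for t in type_totals.values()) - cf
temp_ss = sum(t ** 2 / len(types) for t in temp_totals.values()) - cf
error_ss = total_ss - material_ss - temp_ss                        # Total - Material - Temp

print(f"CF = {cf:.0f}, Total SS = {total_ss:.0f}")
print(f"Material SS = {material_ss:.2f}, Temp SS = {temp_ss:.0f}, Error SS = {error_ss:.2f}")
```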
S.O.V      DF   SS       MSS = SS/DF         Fcal       Ftab
Material   2    632.67   316.33 ($S_M^2$)    6.68 ns    $F_{0.05}(2,4) = 6.94$
Temp       2    23946    11973 ($S_T^2$)     252.95*    $F_{0.05}(2,4) = 6.94$
Error      4    189.33   47.33 (MSE)
Total      8    24768

*  significant result, i.e. at least two means are different
ns not-significant result, i.e. all means may be the same

Result: There is no significant difference in the average life of batteries due to material type, but the average life differs across operating temperatures.
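Both F ratios share the same error mean square, which a small helper makes explicit. This is a sketch under stated assumptions: `two_way_f_ratios` is a hypothetical name, the inputs are the SS and DF values from the ANOVA table, and 6.94 is the tabulated critical value quoted there rather than a computed quantity.

```python
def two_way_f_ratios(ss_a, df_a, ss_b, df_b, ss_error, df_error):
    """Return (F for factor A, F for factor B, MSE) for a two-way ANOVA without replication."""
    mse = ss_error / df_error
    return (ss_a / df_a) / mse, (ss_b / df_b) / mse, mse


# SS and DF taken from the ANOVA table for the battery example.
f_material, f_temp, mse = two_way_f_ratios(632.67, 2, 23946, 2, 189.33, 4)

F_TAB = 6.94  # tabulated F at the 5% level with (2, 4) DF, as quoted in the table
for name, f_cal in (("Material", f_material), ("Temp", f_temp)):
    verdict = "significant" if f_cal >= F_TAB else "not significant"
    print(f"{name}: Fcal = {f_cal:.2f} ({verdict})")
```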
Temperature   Low   Medium   High
Mean life     176   205      84

[Figure: mean plot of average life (vertical axis, 100 to 200) by temperature (horizontal axis: Low, Medium, High)]
Chi-Square Goodness of Fit Test

At times research is undertaken to determine whether some observed pattern of frequencies conforms to an “expected” pattern. In the goodness-of-fit technique, the researcher tests whether a significant difference exists between the observed number of responses in each category and the expected number for each category.
Example: Genetic theory suggests that the ratio of different types of flowers of a certain species should be 9:3:3:1. An experiment of this nature gave 110 yellow flowers with a green stigma, 40 yellow flowers with a red stigma, 30 white flowers with a green stigma and 15 white flowers with a red stigma. Can we conclude that the data support the theory at the 5% level of significance?

Categories                 Observed
Yellow with green stigma   110
Yellow with red stigma     40
White with green stigma    30
White with red stigma      15
$H_0$: The data support the theory, i.e. the ratio is 9:3:3:1
$H_1$: The data do not support the theory

Test statistic: $\chi^2 = \sum \frac{(O - E)^2}{E}$

O: Observed frequency
E: Expected frequency, i.e. the frequency assuming $H_0$ is true
The ratio is 9:3:3:1

Categories                 O     E
Yellow with green stigma   110   (9/16) × 195 = 109.688
Yellow with red stigma     40    (3/16) × 195 = 36.563
White with green stigma    30    (3/16) × 195 = 36.563
White with red stigma      15    (1/16) × 195 = 12.188
Total                      195   195

$\chi^2 = \sum \frac{(O - E)^2}{E} = 2.15$

Categories                 O     E         $(O-E)$   $(O-E)^2$   $(O-E)^2/E$
Yellow with green stigma   110   109.688   0.312     0.097       0.000887
Yellow with red stigma     40    36.563    3.437     11.813      0.323085
White with green stigma    30    36.563    -6.563    43.073      1.178048
White with red stigma      15    12.188    2.812     7.907       0.648781
Total                      195   195       0                     2.15
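The expected frequencies and the test statistic for the flower example can be recomputed directly from the observed counts and the 9:3:3:1 ratio. This is a minimal sketch in plain Python with illustrative variable names. It works with unrounded expected frequencies, so individual terms can differ from tabulated values in the later decimals while the total still rounds to 2.15.

```python
observed = [110, 40, 30, 15]   # counts for the four flower categories
ratio = [9, 3, 3, 1]           # ratio claimed by the genetic theory

n = sum(observed)
# Expected frequency of each category if Ho is true: n * (ratio part / ratio sum).
expected = [n * part / sum(ratio) for part in ratio]

# Chi-square statistic: sum over categories of (O - E)^2 / E.
terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_sq = sum(terms)

for o, e, term in zip(observed, expected, terms):
    print(f"O = {o:3d}  E = {e:8.4f}  (O-E)^2/E = {term:.6f}")
print(f"Chi-square = {chi_sq:.2f}")
```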
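• If the O values are equal to the corresponding E values, the $\chi^2$ value will be zero: exact fit.
• If the O values are close to the corresponding E values, the $\chi^2$ value will be greater than zero but not large: good fit.
• If the O values differ considerably from the corresponding E values, the $\chi^2$ value will be large: worse fit.

On the scale from 0 to infinity, the fit goes from exact through good and bad to worse as $\chi^2$ increases.

Decision rule: Reject $H_0$ if $\chi^2_{cal} \ge \chi^2_{0.05}(k-1)$, where $k$ = number of categories.

$\chi^2_{cal} = 2.15$ and $\chi^2_{0.05}(3) = 7.81$

Result: As $\chi^2_{cal} = 2.15 < 7.81$, do not reject $H_0$.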
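The decision step can be expressed as a small function. This is a sketch only: the name `goodness_of_fit_decision` is hypothetical, and the lookup table holds just the two 5% critical values that this document quotes (5.99 for 2 DF and 7.81 for 3 DF) instead of computing them.

```python
# 5% critical values of chi-square, limited to those quoted in this document.
CHI_SQ_05 = {2: 5.99, 3: 7.81}


def goodness_of_fit_decision(chi_sq_cal, n_categories, critical_values=CHI_SQ_05):
    """Return (df, tabulated value, reject Ho?) for a goodness-of-fit test."""
    df = n_categories - 1                  # k - 1 degrees of freedom
    chi_sq_tab = critical_values[df]
    return df, chi_sq_tab, chi_sq_cal >= chi_sq_tab


df, chi_sq_tab, reject_h0 = goodness_of_fit_decision(2.15, n_categories=4)
print(f"df = {df}, tabulated value = {chi_sq_tab}")
print("Reject Ho" if reject_h0 else "Don't reject Ho")
```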
TEST OF INDEPENDENCE
BETWEEN QUALITATIVE VARIABLES

The Chi-Square test can also be used to test whether two qualitative variables (attributes) in a contingency table are associated or not. In this situation the test is called the test of independence of attributes.
A certain drug is claimed to be effective in curing colds. In an experiment on 164 people with colds, some were given the drug and some were given sugar pills (control). The patients' reactions to the treatment are recorded in the following table. Test the hypothesis that the two attributes are independent.

Category    Helped   Harmed   No Effect   Sub Total
Drug        92       10       10          112
Sugar       30       12       10          52
Sub Total   122      22       20          164

$H_0$: The two variables (treatment and reaction) are independent
$H_1$: The two variables (treatment and reaction) are not independent
Expected Frequency = (Row Total × Column Total) / Total Observations

Category    Helped                   Harmed                  No Effect               Sub Total
Drug        (112×122)/164 = 83.32    (112×22)/164 = 15.02    (112×20)/164 = 13.66    112
Sugar       (52×122)/164 = 38.68     (52×22)/164 = 6.98      (52×20)/164 = 6.34      52
Sub Total   122                      22                      20                      164

O       E       $(O-E)$   $(O-E)^2$   $(O-E)^2/E$
92      83.32   8.68      75.342      0.9043
10      15.02   -5.02     25.2        1.6778
10      13.66   -3.66     13.396      0.9806
30      38.68   -8.68     75.342      1.9478
12      6.98    5.02      25.2        3.6104
10      6.34    3.66      13.396      2.1129
Total
164     164                           11.23

Decision rule: Reject $H_0$ if $\chi^2_{cal} \ge \chi^2_{0.05}(df)$, where $df = (\text{rows} - 1)(\text{columns} - 1)$.

$\chi^2_{cal} = 11.23$ and $\chi^2_{0.05}(2) = 5.99$

Result: As $\chi^2_{cal} = 11.23 > 5.99$, reject $H_0$ and conclude that treatment and reaction are not independent.
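The expected frequencies and the test statistic for the drug example can be checked with a few lines of plain Python. This is a sketch with illustrative variable names. Because it keeps the expected frequencies unrounded, the statistic comes out near 11.24 instead of the 11.23 obtained from two-decimal expected values. The critical value 5.99 is the tabulated figure quoted above, not a computed one.

```python
# Observed counts: rows are Drug and Sugar, columns are Helped, Harmed, No Effect.
observed = [
    [92, 10, 10],
    [30, 12, 10],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency under independence: (row total * column total) / total observations.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

CHI_SQ_TAB = 5.99  # tabulated chi-square at the 5% level with 2 DF, as quoted above
print(f"Chi-square = {chi_sq:.2f}, df = {df}")
print("Reject Ho" if chi_sq >= CHI_SQ_TAB else "Don't reject Ho")
```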
