
04122013

ANalysis Of VAriance
Basic concepts
One-way analysis of variance to test for differences among the means of several groups
Two-way analysis of variance and interpretation of the interaction
QAM II by Gaurav Garg (IIM Lucknow)
The difference between two means can be examined using a t-test or a Z-test.
If we have more than 2 samples, we wish to test the hypothesis that all the samples are drawn from populations having the same mean,
or, equivalently, that all population means are the same.
For this we use ANOVA.
Example:
There are 5 varieties of a fertilizer.
Each variety is applied to some plots of wheat.
The yield of wheat on each of the plots is recorded.
We wish to test if the effects of these varieties of fertilizer on yield are the same, given that all other conditions are the same.
This is tested by ANOVA.
Thus, the basic purpose of ANOVA is to test the homogeneity of several means.
Example:
From time to time, unknown to its employees, the research department at a multinational bank observes various employees for work productivity.
Recently, this department wanted to check if the 4 tellers at a branch serve, on average, the same number of customers per hour.
The research manager observed each teller for a certain number of hours.
The following table gives the number of customers served by the 4 tellers during each of the observed hours:
Teller A: 19 21 26 14 18
Teller B: 14 16 14 13 17 13
Teller C: 11 14 21 13 16 18
Teller D: 24 19 21 26 20
The average numbers of customers served per hour by these 4 tellers are:
A: 19.6, B: 14.5, C: 15.5, D: 22
Can you conclude that the average number of customers served per hour is the same for all 4 tellers?
Or do the averages differ significantly?
ANOVA is essentially a procedure for testing the difference among various groups of data for homogeneity.
At its simplest, ANOVA tests the following hypotheses:
H_0: The means of all the groups are equal.
H_1: Not all the means are equal.
  H_1 does not say how or which ones differ.
  We can follow up with multiple comparisons.
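[Figure: sketches of the population distributions, contrasting the case μ_1 = μ_2 = μ_3 with cases in which not all the means are equal.]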
Assumptions of ANOVA
Samples are randomly and independently drawn.
Each population is approximately normal.
  This may be checked by looking at histograms or normal Q-Q plots.
Standard deviations of the populations are approximately equal.
  Rule of thumb: the ratio of the largest to the smallest sample std. dev. must be less than 2:1.
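The rule of thumb for equal standard deviations is easy to check by hand or in code. Below is a minimal sketch in Python using the teller data from the earlier example; the helper name sd_ratio is hypothetical and not part of the slides.

```python
from statistics import stdev

# Teller data from the example above (customers served per observed hour).
tellers = {
    "A": [19, 21, 26, 14, 18],
    "B": [14, 16, 14, 13, 17, 13],
    "C": [11, 14, 21, 13, 16, 18],
    "D": [24, 19, 21, 26, 20],
}

def sd_ratio(groups):
    """Return (ratio, sds): largest sample std. dev. divided by the smallest, and all std. devs."""
    sds = {name: stdev(values) for name, values in groups.items()}
    return max(sds.values()) / min(sds.values()), sds

ratio, sds = sd_ratio(tellers)
for name, s in sds.items():
    print(f"Teller {name}: s = {s:.3f}")
print(f"largest / smallest = {ratio:.2f} (rule of thumb asks for less than 2)")
```

For the teller data the ratio comes out above 2, so by this rule of thumb the equal-spread assumption deserves a second look for that example. The rule is only a rough screen, not a formal test.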
One-Way Classification
Let X be a random variable.
The values of X are affected by different levels of one factor.
These different levels may be termed treatments.
Let there be k such treatments.
Let n observations be collected on X.
These n observations are grouped on some basis into k groups (treatments) of sizes n_1, n_2, ..., n_k, respectively.
n = n_1 + n_2 + ... + n_k
H_0: All means are the same
H_1: All means are not the same (at least one mean is different from the others)
Treatment   Observations                   Total   Mean
1           x_11  x_12  ...  x_1n_1        T_1     x̄_1
2           x_21  x_22  ...  x_2n_2        T_2     x̄_2
...
k           x_k1  x_k2  ...  x_kn_k        T_k     x̄_k
Grand total                                G
Overall variation in the data is represented by the Total Sum of Squares (TSS).
TSS is partitioned into two parts:
  Between-Groups Variation, or Sum of Squares due to Treatments (SST)
  Within-Groups Variation, or Sum of Squares due to Error (SSE)
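\[
TSS = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x})^2, \qquad
\bar{x} = \frac{1}{n} \sum_{i=1}^{k} \sum_{j=1}^{n_i} x_{ij}, \qquad
n = n_1 + n_2 + \dots + n_k
\]
\[
SST = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})^2, \qquad
\bar{x}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} x_{ij}, \quad i = 1, 2, \dots, k
\]
\[
SSE = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2
\]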
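Why TSS splits into exactly these two parts can be seen in one step. The following is a sketch that uses only the definitions of TSS, SST and SSE: add and subtract the group mean inside the square.

\[
\begin{aligned}
TSS &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} \big[ (x_{ij} - \bar{x}_i) + (\bar{x}_i - \bar{x}) \big]^2 \\
    &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2
     + \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})^2
     + 2 \sum_{i=1}^{k} (\bar{x}_i - \bar{x}) \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i) \\
    &= SSE + SST
\end{aligned}
\]

The cross term vanishes because the deviations of a group about its own mean sum to zero: \(\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i) = 0\) for each i.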
These formulae can be simplified as below:
\[
\begin{aligned}
\text{Correction Factor:}\quad & CF = \frac{G^2}{n} \\
\text{Total Sum of Squares:}\quad & TSS = \sum_{i} \sum_{j} x_{ij}^2 - CF \\
\text{Sum of Squares due to Treatments:}\quad & SST = \sum_{i} \frac{T_i^2}{n_i} - CF \\
\text{Sum of Squares due to Error:}\quad & SSE = TSS - SST
\end{aligned}
\]
Then, we do the analysis of the partitioned or separated variations.
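To see that the simplified formulae give the same numbers as the definitions, here is a minimal Python sketch that computes TSS, SST and SSE both ways on the teller data. The function names are hypothetical, chosen only for this illustration.

```python
def sums_by_definition(groups):
    """TSS, SST, SSE from deviations about the grand mean and the group means."""
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    tss = sum((x - grand_mean) ** 2 for g in groups for x in g)
    sst = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return tss, sst, sse

def sums_by_shortcut(groups):
    """Same quantities from totals: CF = G^2/n, TSS = sum x^2 - CF, SST = sum T_i^2/n_i - CF."""
    n = sum(len(g) for g in groups)
    grand_total = sum(sum(g) for g in groups)
    cf = grand_total ** 2 / n
    tss = sum(x ** 2 for g in groups for x in g) - cf
    sst = sum(sum(g) ** 2 / len(g) for g in groups) - cf
    return tss, sst, tss - sst

# Teller data from the earlier example.
groups = [
    [19, 21, 26, 14, 18],
    [14, 16, 14, 13, 17, 13],
    [11, 14, 21, 13, 16, 18],
    [24, 19, 21, 26, 20],
]
by_def = sums_by_definition(groups)
by_cut = sums_by_shortcut(groups)
for label, d, c in zip(("TSS", "SST", "SSE"), by_def, by_cut):
    print(f"{label}: definition = {d:.3f}, shortcut = {c:.3f}")
```

The shortcut form needs only the totals and the sum of squared observations, which is why it suits hand calculation.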
ANOVA Table

Source of Variation            Sum of Squares   Degrees of Freedom    Mean Sum of Squares   Variance Ratio
Treatments (Between Groups)    SST              k-1                   MST = SST/(k-1)       F_c = MST/MSE
Error (Within Groups)          SSE              (n-1)-(k-1) = n-k     MSE = SSE/(n-k)
Total                          TSS              n-1

The test statistic is given by F_c in the ANOVA table.
F_c ~ F(k-1, n-k)
We reject H_0 at the α x 100% level of significance if
F_c > F_{k-1, n-k; α}
Example: Consider the 4 tellers example.
Teller A: 19 21 26 14 18
Teller B: 14 16 14 13 17 13
Teller C: 11 14 21 13 16 18
Teller D: 24 19 21 26 20
k = 4, n = 22, n_1 = 5, n_2 = 6, n_3 = 6, n_4 = 5
T_1 = 98, T_2 = 87, T_3 = 93, T_4 = 110, G = 388
CF = G^2/n = 6842.909
TSS = 391.091
SST = 200.891
SSE = 190.200
ANOVA Table

Source of Variation            Sum of Squares   Degrees of Freedom   Mean Sum of Squares   Variance Ratio
Treatments (Between Groups)    200.891          3                    66.964                F_c = 6.337
Error (Within Groups)          190.200          18                   10.567
Total                          391.091          21

Distribution of the test statistic: F_c ~ F(3, 18)
For the 5% level of significance, the critical value is F_{3, 18; 0.05} = 3.1599.
Since F_c > F_{3, 18; 0.05}, we reject H_0 at the 5% level of significance.
We conclude that the average numbers of customers served per hour by these 4 tellers differ significantly.
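The teller ANOVA table can be reproduced with a few lines of Python. This is a minimal sketch, not a library routine: the function one_way_anova is a hypothetical helper, and the critical value 3.1599 is taken from the slides rather than computed.

```python
def one_way_anova(groups):
    """Return the entries of the one-way ANOVA table for a list of samples."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    cf = sum(sum(g) for g in groups) ** 2 / n          # correction factor G^2/n
    tss = sum(x * x for g in groups for x in g) - cf   # total sum of squares
    sst = sum(sum(g) ** 2 / len(g) for g in groups) - cf
    sse = tss - sst
    mst, mse = sst / (k - 1), sse / (n - k)
    return {"SST": sst, "SSE": sse, "TSS": tss,
            "df_treatments": k - 1, "df_error": n - k,
            "MST": mst, "MSE": mse, "F": mst / mse}

tellers = [
    [19, 21, 26, 14, 18],        # Teller A
    [14, 16, 14, 13, 17, 13],    # Teller B
    [11, 14, 21, 13, 16, 18],    # Teller C
    [24, 19, 21, 26, 20],        # Teller D
]
F_CRITICAL = 3.1599   # F(3, 18) at the 5% level, as quoted in the slides

table = one_way_anova(tellers)
reject_h0 = table["F"] > F_CRITICAL
print(f"MST = {table['MST']:.3f}, MSE = {table['MSE']:.3f}, F_c = {table['F']:.3f}")
print("Reject H_0" if reject_h0 else "Do not reject H_0")
```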
Example: The following table shows the lives in hours of four batches of electric lamps:
Batch 1: 1600 1610 1650 1680 1700 1720 1800
Batch 2: 1580 1640 1640 1700 1750
Batch 3: 1460 1550 1600 1620 1640 1660 1740 1820
Batch 4: 1510 1520 1530 1570 1600 1680
Perform an analysis of variance of these data and show that a significance test does not reject their homogeneity.
If the observations are large, you can shift their origin and change their scale.
This will not change the result: the variance ratio F_c stays the same.
Shifting the origin means adding or subtracting some constant.
Changing the scale means multiplying or dividing by some constant.
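The invariance under a change of origin and scale can be checked numerically. Below is a minimal Python sketch on the lamp data; the coding u = (x - 1600) / 10 is an arbitrary choice made for this illustration, and f_ratio is a hypothetical helper.

```python
def f_ratio(groups):
    """Variance ratio F_c = MST/MSE for a one-way classification."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    sst = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (sst / (k - 1)) / (sse / (n - k))

lamps = [
    [1600, 1610, 1650, 1680, 1700, 1720, 1800],
    [1580, 1640, 1640, 1700, 1750],
    [1460, 1550, 1600, 1620, 1640, 1660, 1740, 1820],
    [1510, 1520, 1530, 1570, 1600, 1680],
]
# Assumed coding for illustration: subtract 1600 (origin), then divide by 10 (scale).
coded = [[(x - 1600) / 10 for x in batch] for batch in lamps]

f_raw, f_coded = f_ratio(lamps), f_ratio(coded)
print(f"F_c on raw data   = {f_raw:.4f}")
print(f"F_c on coded data = {f_coded:.4f}")
```

Both sums of squares are multiplied by the same factor when the scale changes, so the factor cancels in the ratio. The value obtained here is the one to compare with the F(3, 22) critical value in the lamp exercise.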
Critical Difference
If, on the basis of ANOVA, we reject H_0, i.e., if there is a significant difference among the various treatment means, then we would be interested to find out which pair(s) of treatments differ significantly.
We use Scheffé's test for this.
The i-th and j-th treatment means differ significantly if
\[
|\bar{x}_i - \bar{x}_j| > \text{Critical Difference (C.D.)}
\]
The Critical Difference for the i-th and j-th treatments is given by
\[
C.D. = \left[ MSE \left( \frac{1}{n_i} + \frac{1}{n_j} \right) (k-1) \, F_{k-1,\, n-k;\, \alpha} \right]^{1/2}
\]
Example: Consider the 4 tellers example.

Teller   Mean   Sample size
A        19.6   5
B        14.5   6
C        15.5   6
D        22     5

For α = 0.05, the critical value F_(0.05) at (3, 18) d.f. = 3.1599
We obtained MSE = 10.567
C.D. for tellers A and B = 6.0605
Absolute difference of the sample means of tellers A and B = 5.1 < C.D.
So, tellers A and B do not differ significantly.
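The same comparison can be carried out for every pair of tellers. The following Python sketch applies the critical-difference formula to the means, sample sizes, MSE and critical value quoted in the example; the function and variable names are hypothetical.

```python
from itertools import combinations
from math import sqrt

means = {"A": 19.6, "B": 14.5, "C": 15.5, "D": 22.0}
sizes = {"A": 5, "B": 6, "C": 6, "D": 5}
MSE = 10.567       # from the teller ANOVA table
F_CRIT = 3.1599    # F(3, 18) at alpha = 0.05, as quoted in the example
k = len(means)

def critical_difference(n_i, n_j):
    """Scheffe critical difference for two treatments with sample sizes n_i and n_j."""
    return sqrt(MSE * (1 / n_i + 1 / n_j) * (k - 1) * F_CRIT)

results = {}
for i, j in combinations(means, 2):
    cd = critical_difference(sizes[i], sizes[j])
    diff = abs(means[i] - means[j])
    results[(i, j)] = (diff, cd, diff > cd)
    verdict = "differ significantly" if diff > cd else "do not differ significantly"
    print(f"{i} vs {j}: |difference| = {diff:.1f}, C.D. = {cd:.4f} -> {verdict}")
```

Under these inputs only the pairs involving teller D together with B or C exceed their critical differences.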
Example:
You want to see if three different golf clubs yield different distances.
You randomly select five measurements from trials on an automated driving machine for each club.
At the 0.05 significance level, is there a difference in mean distance?

Club 1   Club 2   Club 3
254      234      200
263      218      222
241      235      197
237      227      206
251      216      204
ANOVA Table

Source of Variation            Sum of Squares   Degrees of Freedom   Mean Sum of Squares   Variance Ratio
Treatments (Between Groups)    4716.4           2                    2358.2                25.275
Error (Within Groups)          1119.6           12                   93.3
Total                          5836.0           14

Distribution of the test statistic: F_c ~ F(2, 12)
For the 5% level of significance, the critical value is F_(α) = 3.89.
Since F_c > F_(α), we reject H_0 at the 5% level of significance.
Which pair(s) of clubs differ significantly?
Critical Difference?
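[Figure: F(2, 12) density with the rejection region of area α = 0.05 to the right of F_α = 3.89; the computed F_c = 25.275 lies in the rejection region.]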
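One way to approach the critical-difference question for the golf clubs is sketched below in Python. It recomputes the club means and MSE from the raw data and uses the critical value 3.89 quoted in the slides; all names are hypothetical.

```python
from itertools import combinations
from math import sqrt

clubs = {
    "Club 1": [254, 263, 241, 237, 251],
    "Club 2": [234, 218, 235, 227, 216],
    "Club 3": [200, 222, 197, 206, 204],
}
F_CRIT = 3.89   # F(2, 12) at the 5% level, as quoted in the slides

k = len(clubs)
n = sum(len(v) for v in clubs.values())
means = {name: sum(v) / len(v) for name, v in clubs.items()}
# Within-groups sum of squares and mean square (MSE = SSE / (n - k)).
sse = sum((x - means[name]) ** 2 for name, v in clubs.items() for x in v)
mse = sse / (n - k)

significant = []
for i, j in combinations(clubs, 2):
    cd = sqrt(mse * (1 / len(clubs[i]) + 1 / len(clubs[j])) * (k - 1) * F_CRIT)
    diff = abs(means[i] - means[j])
    if diff > cd:
        significant.append((i, j))
    print(f"{i} vs {j}: |difference| = {diff:.1f}, C.D. = {cd:.2f}")
```

Because all three clubs have five measurements, the critical difference is the same for every pair.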
Example:
The data in the table (in thousands of dollars) were extracted from Business Week's 1986 Executive Compensation Scoreboard.
Assume that the data represent independent samples of 1986 total cash compensations for eight corporate executives in each of the three industries: banks, utilities, and office equipment/computers.

Banks   Utilities   Office Equipment/Computers
755     520         438
712     295         828
845     553         622
985     950         453
1300    930         562
1143    428         348
733     510         405
1189    864         938
A partial ANOVA table is given here. Complete the table.
Is there evidence of a difference among the means of 1986 total cash compensations for the three groups of corporate executives? Test at the 1% level of significance.
Find the industry (industries) for which mean compensation is (are) different from the others at the 1% level.

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Sum of Squares   Variance Ratio
Industry
Error                 1,115,232.50
Total                 1,800,361.83
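The missing entries follow from the two sums of squares that are given. Here is a minimal Python sketch of the arithmetic; the critical value 5.78 for F(2, 21) at the 1% level is an assumption read from a standard F table, since the slides do not quote it.

```python
TSS = 1_800_361.83   # given in the partial table
SSE = 1_115_232.50   # given in the partial table
k, n = 3, 24         # three industries, eight executives each

ss_industry = TSS - SSE                      # SST = TSS - SSE
df_industry, df_error, df_total = k - 1, n - k, n - 1
ms_industry = ss_industry / df_industry
ms_error = SSE / df_error
f_c = ms_industry / ms_error

F_CRIT_1PCT = 5.78   # assumed: F(2, 21) at the 1% level, from a standard F table

print(f"Industry: SS = {ss_industry:,.2f}, df = {df_industry}, MS = {ms_industry:,.2f}")
print(f"Error:    SS = {SSE:,.2f}, df = {df_error}, MS = {ms_error:,.2f}")
print(f"Total:    SS = {TSS:,.2f}, df = {df_total}")
print(f"F_c = {f_c:.3f}, reject H_0 at 1%: {f_c > F_CRIT_1PCT}")
```

Finding which industries differ would then use the critical difference with this MSE and the same assumed critical value.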
Two-Way Classification
Example:
A chef was experiencing difficulty in getting types of pasta to be al dente.
She conducts an experiment with two types of pasta: American and Italian.
150 grams of pasta of both types were used.
Samples were cooked for either 4 minutes or 8 minutes.
Because cooking pasta enables it to absorb water, the weights of the cooked pasta were measured.
The results for two replicates for each type and cooking time are as follows:

             COOKING TIME
TYPE         4 Minutes    8 Minutes
American     265, 270     310, 320
Italian      250, 245     300, 305

Is there an effect on the cooked pasta (in terms of weight of cooked pasta)
  due to type of pasta?
  due to cooking time?
Is there an interaction effect between type of pasta and cooking time?
Is any particular combination of pasta type and cooking time significantly different?
Two-way analysis of variance is an extension of one-way analysis of variance.
The variation is controlled by two factors.
The values of the random variable X are affected by different levels of two factors.
Assumptions
  The populations are normally distributed.
  The samples are independent.
  The variances of the populations are equal.
H_A0: All levels of Factor A have the same effect
H_A1: All levels of Factor A don't have the same effect
H_B0: All levels of Factor B have the same effect
H_B1: All levels of Factor B don't have the same effect
H_AB0: There is no interaction effect
H_AB1: There is an interaction effect
a = number of levels of Factor A
b = number of levels of Factor B
m = number of observations (repetitions) per cell
n = abm = total number of observations
x_ijk = k-th observation of the cell receiving the i-th level of Factor A and the j-th level of Factor B
G = Grand total
T_Ai = Sum of observations receiving the i-th level of Factor A
T_Bj = Sum of observations receiving the j-th level of Factor B
T_ij = Sum of observations receiving the i-th level of Factor A as well as the j-th level of Factor B
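\[
\begin{aligned}
\text{Correction Factor:}\quad & CF = \frac{G^2}{abm} \\
\text{Total Sum of Squares:}\quad & TSS = \sum_{i} \sum_{j} \sum_{k} x_{ijk}^2 - CF \\
\text{Sum of Squares due to Factor A:}\quad & SSA = \frac{1}{mb} \sum_{i} T_{Ai}^2 - CF \\
\text{Sum of Squares due to Factor B:}\quad & SSB = \frac{1}{ma} \sum_{j} T_{Bj}^2 - CF \\
\text{Sum of Squares due to Interaction:}\quad & SSAB = \frac{1}{m} \sum_{i} \sum_{j} T_{ij}^2 - SSA - SSB - CF \\
\text{Sum of Squares due to Error:}\quad & SSE = TSS - SSA - SSB - SSAB
\end{aligned}
\]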
ANOVA Table

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Sum of Squares          Variance Ratio
Factor A              SSA              a-1                  MSA = SSA/(a-1)              F_Ac = MSA/MSE
Factor B              SSB              b-1                  MSB = SSB/(b-1)              F_Bc = MSB/MSE
Interaction           SSAB             (a-1)(b-1)           MSAB = SSAB/[(a-1)(b-1)]     F_ABc = MSAB/MSE
Error                 SSE              ab(m-1)              MSE = SSE/[ab(m-1)]
Total                 TSS              abm-1

F_Ac ~ F(a-1, ab(m-1))
F_Bc ~ F(b-1, ab(m-1))
F_ABc ~ F((a-1)(b-1), ab(m-1))
We reject H_0 at the α x 100% level of significance if
computed F > F_(α), the critical value with the corresponding degrees of freedom.
                      COOKING TIME (Factor B)
TYPE (Factor A)   4 Minutes                 8 Minutes                 Total
American          265, 270 (T_11 = 535)     310, 320 (T_12 = 630)     T_A1 = 1165
Italian           250, 245 (T_21 = 495)     300, 305 (T_22 = 605)     T_A2 = 1100
Total             T_B1 = 1030               T_B2 = 1235               G = 2265

CF = 641278.125
TSS = 647175 - 641278.125 = 5896.875
SSA = 641806.25 - 641278.125 = 528.125
SSB = 646531.25 - 641278.125 = 5253.125
SSAB = 647087.5 - 528.125 - 5253.125 - 641278.125 = 28.125
SSE = 87.5
In the pasta example:
ANOVA Table

Source of Variation   SS         df   MS         Variance Ratio   Critical F
Factor A              528.125    1    528.125    24.14286         7.70865
Factor B              5253.125   1    5253.125   240.1429         7.70865
Interaction           28.125     1    28.125     1.285714         7.70865
Error                 87.5       4    21.875
Total                 5896.875   7

F_Ac ~ F(1, 4)
F_Bc ~ F(1, 4)
F_ABc ~ F(1, 4)
Critical value at the 5% level of significance = 7.70865
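The pasta table can be reproduced from the raw weights. Below is a minimal Python sketch of the two-way formulae with m observations per cell; two_way_anova is a hypothetical helper, and the critical value 7.70865 is the one quoted in the slides.

```python
def two_way_anova(cells, m):
    """cells[i][j] holds the m observations for level i of Factor A and level j of Factor B."""
    a, b = len(cells), len(cells[0])
    grand_total = sum(x for row in cells for cell in row for x in cell)
    cf = grand_total ** 2 / (a * b * m)
    tss = sum(x * x for row in cells for cell in row for x in cell) - cf
    t_a = [sum(sum(cell) for cell in row) for row in cells]              # T_Ai
    t_b = [sum(sum(cells[i][j]) for i in range(a)) for j in range(b)]    # T_Bj
    ssa = sum(t * t for t in t_a) / (m * b) - cf
    ssb = sum(t * t for t in t_b) / (m * a) - cf
    ssab = sum(sum(cell) ** 2 for row in cells for cell in row) / m - ssa - ssb - cf
    sse = tss - ssa - ssb - ssab
    mse = sse / (a * b * (m - 1))
    return {
        "SSA": ssa, "SSB": ssb, "SSAB": ssab, "SSE": sse, "TSS": tss, "MSE": mse,
        "F_A": ssa / (a - 1) / mse,
        "F_B": ssb / (b - 1) / mse,
        "F_AB": ssab / ((a - 1) * (b - 1)) / mse,
    }

pasta = [
    [[265, 270], [310, 320]],   # American: 4 minutes, 8 minutes
    [[250, 245], [300, 305]],   # Italian:  4 minutes, 8 minutes
]
F_CRIT = 7.70865   # F(1, 4) at the 5% level, as quoted in the slides

result = two_way_anova(pasta, m=2)
for name in ("F_A", "F_B", "F_AB"):
    verdict = "significant" if result[name] > F_CRIT else "not significant"
    print(f"{name} = {result[name]:.4f} -> {verdict}")
```

With these numbers both main effects exceed the critical value while the interaction does not.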
[Figure: two plots of Mean Response against Factor A Levels, each with one line per Factor B level (Levels 1, 2 and 3). Panel titles: "No Significant Interaction" and "Significant Interaction".]
Example:
A company stamps gaskets out of sheets of rubber, plastic and cork.
The manufacturer wants to determine whether
  one machine is more productive than the other;
  one machine is more productive in producing rubber gaskets while the other is more productive in producing plastic or cork gaskets.
The manufacturer decides to conduct an experiment using 3 types of gasket material.
Each machine is operated for 3 one-hour time periods for each of the gasket materials, with the 18 one-hour time periods assigned to the 6 machine-material combinations in random order.
The purpose of randomization is to eliminate the possibility that uncontrolled environmental factors might bias the results.
The data (number of gaskets, in thousands) are as follows:

                    Gasket Material
Machine   Cork                Rubber              Plastic             Total
I         4.31, 4.27, 4.40    3.36, 3.42, 3.48    4.01, 3.94, 3.89    35.08
II        3.94, 3.81, 3.99    3.91, 3.80, 3.85    3.48, 3.53, 3.42    33.73
Total     24.72               21.82               22.27               68.81

Help the manufacturer.
Use the 5% level of significance.
When the interaction effects are significant:
  The hypothesis testing of main effects becomes complicated.
  We cannot directly conclude that the main effects are not significant.
  Which combination is the best can be judged from the plot.
When the interaction effects are not significant but the main effects are significant:
  We can determine the particular levels of the factors that are significant.
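The plot referred to here is built from the cell means. As a minimal sketch, the Python below computes the six machine-material cell means for the gasket data; these are the points that would be drawn on an interaction plot. The variable names are hypothetical.

```python
# Gasket data from the example above (thousands of gaskets per one-hour period).
gaskets = {
    ("I", "Cork"): [4.31, 4.27, 4.40],
    ("I", "Rubber"): [3.36, 3.42, 3.48],
    ("I", "Plastic"): [4.01, 3.94, 3.89],
    ("II", "Cork"): [3.94, 3.81, 3.99],
    ("II", "Rubber"): [3.91, 3.80, 3.85],
    ("II", "Plastic"): [3.48, 3.53, 3.42],
}
materials = ["Cork", "Rubber", "Plastic"]

# One mean per machine-material cell.
cell_means = {key: sum(values) / len(values) for key, values in gaskets.items()}

# Gap between the two machines within each material; a change of sign
# across materials means the two lines on the plot would cross.
gaps = {mat: cell_means[("I", mat)] - cell_means[("II", mat)] for mat in materials}

for mat in materials:
    print(f"{mat:8s} I: {cell_means[('I', mat)]:.3f}  II: {cell_means[('II', mat)]:.3f}  "
          f"I - II: {gaps[mat]:+.3f}")
```

Crossing lines only suggest an interaction; whether it is significant still has to be decided by the F-test for interaction.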
The method is the same as that used in one-way classification.
For the levels of Factor A:
  Obtain the means of all levels of Factor A:
    mean of the i-th level of Factor A = T_Ai / bm
  The i-th and j-th levels differ significantly if
    | T_Ai - T_Aj | / bm > CD
  where CD is given by
\[
CD = \left[ MSE \left( \frac{1}{bm} + \frac{1}{bm} \right) (a-1) \, F_{a-1,\, ab(m-1);\, \alpha} \right]^{1/2}
\]
The method for the levels of Factor B is similar.
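As a worked illustration, the formula can be applied to the pasta example, where the interaction was not significant but both main effects were. This is a sketch using the values from the pasta ANOVA table (a = 2, b = 2, m = 2, MSE = 21.875, critical F = 7.70865):

\[
CD = \left[ 21.875 \times \left( \frac{1}{4} + \frac{1}{4} \right) \times 1 \times 7.70865 \right]^{1/2}
   = \sqrt{84.313} \approx 9.18
\]
\[
\frac{|T_{A1} - T_{A2}|}{bm} = \frac{|1165 - 1100|}{4} = 16.25 > 9.18
\]

So, under these values, the mean cooked weights of American and Italian pasta differ significantly. With only two levels per factor this repeats the conclusion of the F-test; the comparison becomes informative when a factor has three or more levels.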
Example:
Suppose you want to determine whether the brand of laundry detergent used and the temperature affect the amount of dirt removed from your laundry.
You buy two different brands of detergent: Super and Best.
You choose three different temperature levels: Cold, Warm, Hot.
The amount of dirt removed is given in the following table.
At the 5% level of significance, test if
  the brands of detergent have a significant effect on dirt removed;
  the differences in water temperature have a significant effect on dirt removed.

         Cold   Warm   Hot
Super    5      9      10
Best     5      13     12
Example:
Three varieties of coal were analyzed by four chemists and the ash content in the varieties was found to be as under:

             Chemists
Varieties    I    II   III   IV
A            8    5    5     7
B            7    6    4     4
C            3    6    5     4

Do the varieties of coal differ significantly in their ash content?
Do the chemists differ significantly in their analysis?
Summary
One-way analysis of variance
  One factor at various levels
  F-test for difference in more than two means
  Scheffé's procedure for multiple comparisons
Two-way analysis of variance
  Effects of two factors
  Interaction between two factors
  Multiple comparisons
