
Analysis of Variance
Introduction
• Analysis of variance compares two
or more populations of interval
data.
• Specifically, we are interested in
determining whether differences
exist between the population
means.
• The procedure works by analyzing
the sample variance.
One Way Analysis of
Variance

• The analysis of variance is a procedure that tests whether differences exist between two or more population means.

• To do this, the technique analyzes the sample variances.
One Way Analysis of
Variance
• Example
– An apple juice manufacturer is
planning to develop a new product: a
liquid concentrate.
– The marketing manager has to decide
how to market the new product.
– Three strategies are considered
• Emphasize convenience of using the
product.
• Emphasize the quality of the product.
• Emphasize the product’s low price.
One Way Analysis of
Variance
• Example continued
– An experiment was conducted as
follows:
• In three cities an advertising
campaign was launched.
• In each city only one of the three
characteristics (convenience,
quality, and price) was
emphasized.
One Way Analysis of
Variance
Convenience Quality Price
529 804 672
658 630 531
793 774 443
514 717 596
663 679 602
719 604 502
711 620 659
606 697 689
461 706 675
529 615 512
498 492 691
663 719 733
604 787 698
495 699 776
485 572 561
557 523 572
353 584 469
557 634 581
542 580 679
614 624 532
One Way Analysis of
Variance

• Solution
– The data are interval
– The problem objective is to
compare sales in three cities.
– We hypothesize that the three
population means are equal
Defining the Hypotheses

• Solution
H0: µ1 = µ2 = µ3
H1: At least two means differ
To build the statistic needed to test the hypotheses, use the following notation:
Notation
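Independent samples are drawn from k populations (treatments).

                     Treatment 1   Treatment 2   ...   Treatment k
First observation    x_11          x_12          ...   x_1k
Second observation   x_21          x_22          ...   x_2k
...                  ...           ...           ...   ...
Last observation     x_{n1,1}      x_{n2,2}      ...   x_{nk,k}
Sample size          n_1           n_2           ...   n_k
Sample mean          x̄_1           x̄_2           ...   x̄_k

For example, x_11 is the first observation of the first sample, and x_22 is the second observation of the second sample.
X is the “response variable”.
The variable’s values are called “responses”.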
Terminology

• In the context of this problem…
• Response variable – weekly sales
• Responses – actual sale values
• Experimental unit – weeks in the three cities when we record sales figures.
• Factor – the criterion by which we classify the populations (the treatments). In this problem the factor is the marketing strategy.
• Factor levels – the population (treatment) names. In this problem the factor levels are the marketing strategies.
The rationale of the test statistic
Two types of variability are
employed when testing for the
equality of the population means
Graphical demonstration: Employing two types of variability
[Figure: two panels, each plotting the observations of Treatments 1, 2 and 3 around the same sample means, x̄1 = 10, x̄2 = 15 and x̄3 = 20.]
Left panel: A small variability within the samples makes it easier to draw a conclusion about the population means.
Right panel: The sample means are the same as before, but the larger within-sample variability makes it harder to draw a conclusion about the population means.
The rationale behind the
test statistic – I
• If the null hypothesis is true, we
would expect all the sample
means to be close to one another
(and as a result, close to the
grand mean).
• If the alternative hypothesis is
true, at least some of the sample
means would differ.
• Thus, we measure variability
Variability between sample
means
• The variability between the sample means is measured as the sum of squared distances between each mean and the grand mean.

This sum is called the Sum of Squares for Treatments (SST).

In our example treatments are represented by the different advertising strategies.
Sum of squares for
treatments (SST)
$$SST = \sum_{j=1}^{k} n_j (\bar{x}_j - \bar{\bar{x}})^2$$

Here there are k treatments, $n_j$ is the size of sample j, $\bar{x}_j$ is the mean of sample j, and $\bar{\bar{x}}$ is the grand mean.

Note: When the sample means are close to one another, their distance from the grand mean is small, leading to a small SST. Thus, a large SST indicates large variation between sample means, which supports H1.
Sum of squares for
treatments (SST)
• Solution – continued
Calculate SST
$\bar{x}_1 = 577.55$, $\bar{x}_2 = 653.00$, $\bar{x}_3 = 608.65$

The grand mean is calculated by
$$\bar{\bar{x}} = \frac{n_1\bar{x}_1 + n_2\bar{x}_2 + \dots + n_k\bar{x}_k}{n_1 + n_2 + \dots + n_k} = 613.07$$

$$SST = \sum_{j=1}^{k} n_j(\bar{x}_j - \bar{\bar{x}})^2 = 20(577.55 - 613.07)^2 + 20(653.00 - 613.07)^2 + 20(608.65 - 613.07)^2 = 57{,}512.23$$
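The SST calculation can be reproduced from the raw data. The following is a minimal sketch, assuming the sales figures are typed in from the data table of the example; the function and variable names are illustrative only and do not come from the slides.

```python
# Weekly sales per marketing strategy, copied from the data table of the example.
convenience = [529, 658, 793, 514, 663, 719, 711, 606, 461, 529,
               498, 663, 604, 495, 485, 557, 353, 557, 542, 614]
quality = [804, 630, 774, 717, 679, 604, 620, 697, 706, 615,
           492, 719, 787, 699, 572, 523, 584, 634, 580, 624]
price = [672, 531, 443, 596, 602, 502, 659, 689, 675, 512,
         691, 733, 698, 776, 561, 572, 469, 581, 679, 532]


def sum_of_squares_treatments(samples):
    """Return (grand mean, SST) for a list of samples."""
    n_total = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n_total
    # Each sample contributes n_j * (sample mean - grand mean)^2.
    sst = sum(len(s) * (sum(s) / len(s) - grand_mean) ** 2 for s in samples)
    return grand_mean, sst


grand_mean, sst = sum_of_squares_treatments([convenience, quality, price])
print(f"grand mean = {grand_mean:.2f}, SST = {sst:.2f}")
```

The sample means are weighted by the sample sizes, so the same function also applies when the samples are not all of size 20.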
The rationale behind test
statistic – II
• Large variability within the
samples weakens the “ability” of
the sample means to represent
their corresponding population
means.
• Therefore, even though sample
means may markedly differ from
one another, SST must be judged
relative to the “within samples
variability”.
Within samples variability
• The variability within samples is measured by adding all the squared distances between observations and their sample means.

• This sum is called the Sum of Squares for Error (SSE).

• In our example this is the sum of all squared differences between sales in city j and the sample mean of city j (over all three cities).
Sum of squares for errors
(SSE)
• Solution – continued
Calculate SSE
$s_1^2 = 10{,}775.00$, $s_2^2 = 7{,}238.11$, $s_3^2 = 8{,}670.24$

$$SSE = \sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ij} - \bar{x}_j)^2 = (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + (n_3 - 1)s_3^2$$

$$= (20 - 1)(10{,}775.00) + (20 - 1)(7{,}238.11) + (20 - 1)(8{,}670.24) \approx 506{,}983.50$$
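The two ways of writing SSE (squared distances from each sample mean, or sample variances weighted by n_j - 1) can be checked against each other. This is a sketch under the assumption that the raw sales figures are those of the data table; the helper names are made up for illustration.

```python
from statistics import variance

# Weekly sales per marketing strategy, copied from the data table of the example.
convenience = [529, 658, 793, 514, 663, 719, 711, 606, 461, 529,
               498, 663, 604, 495, 485, 557, 353, 557, 542, 614]
quality = [804, 630, 774, 717, 679, 604, 620, 697, 706, 615,
           492, 719, 787, 699, 572, 523, 584, 634, 580, 624]
price = [672, 531, 443, 596, 602, 502, 659, 689, 675, 512,
         691, 733, 698, 776, 561, 572, 469, 581, 679, 532]
samples = [convenience, quality, price]


def sse_direct(samples):
    """Sum of squared distances between observations and their sample mean."""
    total = 0.0
    for s in samples:
        sample_mean = sum(s) / len(s)
        total += sum((x - sample_mean) ** 2 for x in s)
    return total


def sse_from_variances(samples):
    """Same quantity via (n_j - 1) * s_j^2, with s_j^2 the sample variance."""
    return sum((len(s) - 1) * variance(s) for s in samples)


sse_a = sse_direct(samples)
sse_b = sse_from_variances(samples)
print(f"SSE (direct) = {sse_a:.2f}, SSE (from variances) = {sse_b:.2f}")
```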
The mean sum of squares
To perform the test we need to calculate the mean squares as follows:

Calculation of MST (Mean Square for Treatments)
$$MST = \frac{SST}{k - 1} = \frac{57{,}512.23}{3 - 1} = 28{,}756.12$$

Calculation of MSE (Mean Square for Error)
$$MSE = \frac{SSE}{n - k} = \frac{506{,}983.50}{60 - 3} = 8{,}894.45$$
Calculation of the test statistic
$$F = \frac{MST}{MSE} = \frac{28{,}756.12}{8{,}894.45} = 3.23$$

Required Conditions:
1. The populations tested are normally distributed.
2. The variances of all the populations tested are equal.

When these conditions hold and H0 is true, F follows an F distribution with the following degrees of freedom: $\nu_1 = k - 1$ and $\nu_2 = n - k$.
The F test rejection region
And finally the hypothesis test:

H0: µ1 = µ2 = … = µk
H1: At least two means differ

Test statistic: $F = MST / MSE$
R.R.: $F > F_{\alpha,\,k-1,\,n-k}$
The F test
H0: µ1 = µ2 = µ3
H1: At least two means differ

Test statistic: $F = MST / MSE = 28{,}756.12 / 8{,}894.45 = 3.23$
R.R.: $F > F_{\alpha,\,k-1,\,n-k} = F_{0.05,\,3-1,\,60-3} \approx 3.16$

Since 3.23 > 3.16, there is sufficient evidence to reject H0 in favor of H1, and to argue that at least one of the mean sales differs from the others.
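The test can be sketched in a few lines of Python. The code relies on an assumption that is not part of the slides: when the numerator has exactly 2 degrees of freedom (k = 3, as here), the upper tail of the F distribution has the closed form (1 + 2F/ν2)^(-ν2/2), so no statistics library is needed. For any other k this shortcut does not apply. The function name is illustrative.

```python
def one_way_f_test(sst, sse, n, k, alpha=0.05):
    """One-way ANOVA F test; the closed forms below require k - 1 == 2."""
    if k - 1 != 2:
        raise ValueError("closed-form tail used here needs numerator df = 2")
    df_error = n - k
    mst = sst / (k - 1)           # mean square for treatments
    mse = sse / df_error          # mean square for error
    f_stat = mst / mse
    # Upper-tail probability of an F(2, df_error) variable at f_stat.
    p_value = (1 + 2 * f_stat / df_error) ** (-df_error / 2)
    # Value whose upper-tail probability equals alpha (inverse of the line above).
    f_crit = (df_error / 2) * (alpha ** (-2 / df_error) - 1)
    return f_stat, p_value, f_crit


# SST, SSE, n and k as computed in the example.
f_stat, p_value, f_crit = one_way_f_test(sst=57512.23, sse=506983.50, n=60, k=3)
print(f"F = {f_stat:.2f}, p-value = {p_value:.4f}, F crit = {f_crit:.2f}")
print("reject H0" if f_stat > f_crit else "do not reject H0")
```

Comparing F with the critical value and comparing the p-value with α are equivalent decision rules.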
Single factor ANOVA
Anova: Single Factor

SUMMARY
Groups        Count   Sum     Average   Variance
Convenience   20      11551   577.55    10775.00
Quality       20      13060   653.00    7238.11
Price         20      12173   608.65    8670.24

ANOVA
Source of Variation   SS       df   MS      F      P-value   F crit
Between Groups        57512    2    28756   3.23   0.0468    3.16
Within Groups         506984   57   8894

Total                 564496   59

SS(Total) = SST + SSE
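Written out with the figures of the ANOVA table, the decomposition reads as follows. This is only a worked restatement of the identity, using $\bar{\bar{x}}$ for the grand mean; the degrees of freedom split in the same way.

```latex
\begin{align*}
SS(\text{Total}) &= \sum_{j=1}^{k}\sum_{i=1}^{n_j} (x_{ij} - \bar{\bar{x}})^2 = SST + SSE \\
564{,}496 &= 57{,}512 + 506{,}984 \\
n - 1 &= (k - 1) + (n - k), \qquad 59 = 2 + 57
\end{align*}
```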


Models of Fixed and Random
Effects
• Fixed effects
– If all possible levels of a factor are
included in our analysis we have a
fixed effect ANOVA.
– The conclusion of a fixed effect ANOVA
applies only to the levels studied.
• Random effects
– If the levels included in our analysis
represent a random sample of all the
possible levels, we have a random-
effect ANOVA.
– The conclusion of the random-effect
ANOVA applies to all the levels (not
Models of Fixed and Random
Effects.
• In some ANOVA models the test statistic
of the fixed effects case may differ
from the test statistic of the random
effect case.
• Fixed and random effects - examples
– Fixed effects - The advertisement
example. All the levels of the
marketing strategies were included.
– Random effects - To determine if there is
a difference in the production rate of
50 machines, four machines are
randomly selected and there
Two Way Analysis of Variance
[Figure: One-way ANOVA (single factor): the response is shown for Treatments 1, 2 and 3, the three levels of the factor. Two-way ANOVA (two factors): the response is shown for each combination of Factor A (levels 1 to 3) and Factor B (levels 1 and 2).]
Two-Factor Analysis of
Variance -
• Example
– Suppose in the Example, two factors
are to be examined:
• The effects of the marketing strategy
on sales.
– Emphasis on convenience
– Emphasis on quality
– Emphasis on price
• The effects of the selected media on
sales.
– Advertise on TV
– Advertise in newspapers
Attempting one-way ANOVA

• Solution
– We may attempt to analyze
combinations of levels, one from
each factor using one-way ANOVA.
– The treatments will be:
• Treatment 1: Emphasize convenience
and advertise on TV
• Treatment 2: Emphasize convenience
and advertise in newspapers
• …………………………………………………
………………….
• Treatment 6: Emphasize price and
Attempting one-way ANOVA

• Solution
– The hypotheses tested are:
• H0: µ 1= µ 2= µ 3= µ 4= µ 5= µ 6
• H1: At least two means differ.
Attempting one-way ANOVA
• Solution
– In each one of six cities sales are recorded for ten weeks.
– In each city a different combination of marketing emphasis and media usage is employed.

City 1        City 2        City 3    City 4    City 5   City 6
Convenience   Convenience   Quality   Quality   Price    Price
TV            Paper         TV        Paper     TV       Paper
Attempting one-way ANOVA

• Solution

City 1        City 2        City 3    City 4    City 5   City 6
Convenience   Convenience   Quality   Quality   Price    Price
TV            Paper         TV        Paper     TV       Paper

The p-value = .0452.
We conclude that there is evidence that differences exist in the mean weekly sales among the six cities.
Interesting questions – no
answers

• These results raise some questions:
– Are the differences in sales caused
by the different marketing
strategies?
– Are the differences in sales caused
by the different media used for
advertising?
– Are there combinations of
marketing strategy and media
Two-way ANOVA (two factors)

• The current experimental design cannot provide answers to these questions.
• A new experimental design is
needed.
Two-way ANOVA (two
factors)
                              Factor A: Marketing strategy
Factor B: Advertising media   Convenience    Quality        Price
TV                            City 1 sales   City 3 sales   City 5 sales
Newspapers                    City 2 sales   City 4 sales   City 6 sales

Are there differences in the mean sales caused by different marketing strategies?
Two-way ANOVA (two
factors)

• Test whether mean sales of “Convenience”, “Quality”, and “Price” significantly differ from one another.
• H0: µConv. = µQuality = µPrice
• H1: At least two means differ
• Calculations are based on the sum of squares for factor A, SS(A).

Two-way ANOVA (two
factors)
                              Factor A: Marketing strategy
Factor B: Advertising media   Convenience    Quality        Price
TV                            City 1 sales   City 3 sales   City 5 sales
Newspapers                    City 2 sales   City 4 sales   City 6 sales

Are there differences in the mean sales caused by different advertising media?
Two-way ANOVA (two
factors)

• Test whether mean sales of “TV” and “Newspapers” significantly differ from one another.
• H0: µTV = µNewspapers
• H1: The means differ
• Calculations are based on the sum of squares for factor B, SS(B).
Two-way ANOVA (two
factors)
                              Factor A: Marketing strategy
Factor B: Advertising media   Convenience    Quality        Price
TV                            City 1 sales   City 3 sales   City 5 sales
Newspapers                    City 2 sales   City 4 sales   City 6 sales

Are there differences in the mean sales caused by interaction between marketing strategy and advertising medium?
Two-way ANOVA (two
factors)

• Test whether mean sales of certain cells are different from the level expected.

• Calculations are based on the sum of squares for interaction, SS(AB).
Sums of squares
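With a levels of factor A, b levels of factor B, r observations per cell, and $\bar{\bar{x}}$ the grand mean:

$$SS(A) = rb\sum_{i=1}^{a}(\bar{x}[A]_i - \bar{\bar{x}})^2 = (10)(2)\{(\bar{x}_{conv.} - \bar{\bar{x}})^2 + (\bar{x}_{quality} - \bar{\bar{x}})^2 + (\bar{x}_{price} - \bar{\bar{x}})^2\}$$

$$SS(B) = ra\sum_{j=1}^{b}(\bar{x}[B]_j - \bar{\bar{x}})^2 = (10)(3)\{(\bar{x}_{TV} - \bar{\bar{x}})^2 + (\bar{x}_{Newspaper} - \bar{\bar{x}})^2\}$$

$$SS(AB) = r\sum_{i=1}^{a}\sum_{j=1}^{b}(\bar{x}[AB]_{ij} - \bar{x}[A]_i - \bar{x}[B]_j + \bar{\bar{x}})^2$$

$$SSE = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{r}(x_{ijk} - \bar{x}[AB]_{ij})^2$$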
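The four sums of squares can be computed directly from the definitions. The sketch below assumes the sales figures of the two-way example (the TV/Newspaper data table that appears later in the deck); the dictionary layout and the variable names are choices made for this illustration, not part of the slides.

```python
# Ten weekly sales figures per (media, strategy) cell, from the two-way data table.
sales = {
    ("TV", "Convenience"): [491, 712, 558, 447, 479, 624, 546, 444, 582, 672],
    ("TV", "Quality"): [677, 627, 590, 632, 683, 760, 690, 548, 579, 644],
    ("TV", "Price"): [575, 614, 706, 484, 478, 650, 583, 536, 579, 795],
    ("Newspaper", "Convenience"): [464, 559, 759, 557, 528, 670, 534, 657, 557, 474],
    ("Newspaper", "Quality"): [689, 650, 704, 652, 576, 836, 628, 798, 497, 841],
    ("Newspaper", "Price"): [803, 584, 525, 498, 812, 565, 708, 546, 616, 587],
}
strategies = ["Convenience", "Quality", "Price"]   # factor A, a = 3 levels
media = ["TV", "Newspaper"]                        # factor B, b = 2 levels
a, b, r = len(strategies), len(media), 10          # r observations per cell


def mean(values):
    return sum(values) / len(values)


grand = mean([x for cell in sales.values() for x in cell])
mean_a = {s: mean([x for m in media for x in sales[(m, s)]]) for s in strategies}
mean_b = {m: mean([x for s in strategies for x in sales[(m, s)]]) for m in media}
mean_ab = {key: mean(cell) for key, cell in sales.items()}

ss_a = r * b * sum((mean_a[s] - grand) ** 2 for s in strategies)
ss_b = r * a * sum((mean_b[m] - grand) ** 2 for m in media)
ss_ab = r * sum((mean_ab[(m, s)] - mean_a[s] - mean_b[m] + grand) ** 2
                for m in media for s in strategies)
sse = sum((x - mean_ab[key]) ** 2 for key, cell in sales.items() for x in cell)

print(f"SS(A) = {ss_a:.1f}, SS(B) = {ss_b:.1f}, "
      f"SS(AB) = {ss_ab:.1f}, SSE = {sse:.1f}")
```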
F tests for the Two-way ANOVA

• Test for the difference between the levels of the main factors A and B

$$F = \frac{MS(A)}{MSE}, \quad MS(A) = \frac{SS(A)}{a - 1} \qquad\qquad F = \frac{MS(B)}{MSE}, \quad MS(B) = \frac{SS(B)}{b - 1}$$

where $MSE = SSE/(n - ab)$.

Rejection region: $F > F_{\alpha,\,a-1,\,n-ab}$ for factor A; $F > F_{\alpha,\,b-1,\,n-ab}$ for factor B.

• Test for interaction between factors A and B

$$F = \frac{MS(AB)}{MSE}, \quad MS(AB) = \frac{SS(AB)}{(a - 1)(b - 1)}$$

Rejection region: $F > F_{\alpha,\,(a-1)(b-1),\,n-ab}$
Required conditions:

1. The response distribution is normal.
2. The treatment variances are equal.
3. The samples are independent.
F tests for the Two-way ANOVA
Convenience Quality Price
TV 491 677 575
TV 712 627 614
TV 558 590 706
TV 447 632 484
TV 479 683 478
TV 624 760 650
TV 546 690 583
TV 444 548 536
TV 582 579 579
TV 672 644 795
Newspaper 464 689 803
Newspaper 559 650 584
Newspaper 759 704 525
Newspaper 557 652 498
Newspaper 528 576 812
Newspaper 670 836 565
Newspaper 534 628 708
Newspaper 657 798 546
Newspaper 557 497 616
Newspaper 474 841 587
F tests for the Two-way ANOVA

• Example – continued
– Test of the difference in mean sales between the
three marketing strategies
• H0: µ conv. =µ quality =µ price
• H1: At least two mean sales are different

ANOVA
Source of Variation SS df MS F P-value F crit
Sample 13172.0 1 13172.0 1.42 0.2387 4.02
Columns 98838.6 2 49419.3 5.33 0.0077 3.17
Interaction 1609.6 2 804.8 0.09 0.9171 3.17
Within 501136.7 54 9280.3

Total 614757.0 59

Factor A: Marketing strategies (the “Columns” row of the output)
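The F column of the ANOVA table follows from the SS column and the degrees of freedom. Below is a minimal sketch, assuming the SS values and critical values are read off the table above (Columns = factor A, Sample = factor B); the function name and dictionary keys are illustrative.

```python
def two_way_f_statistics(ss_a, ss_b, ss_ab, sse, a, b, n):
    """Return the F statistics for factor A, factor B and the interaction."""
    mse = sse / (n - a * b)                      # mean square for error
    return {
        "A": (ss_a / (a - 1)) / mse,             # MS(A) / MSE
        "B": (ss_b / (b - 1)) / mse,             # MS(B) / MSE
        "AB": (ss_ab / ((a - 1) * (b - 1))) / mse,  # MS(AB) / MSE
    }


# SS values taken from the ANOVA table of the example.
f_stats = two_way_f_statistics(ss_a=98838.6, ss_b=13172.0, ss_ab=1609.6,
                               sse=501136.7, a=3, b=2, n=60)

# Critical values at the 5% level, copied from the "F crit" column.
f_crit = {"A": 3.17, "B": 4.02, "AB": 3.17}
reject = {name: f_stats[name] > f_crit[name] for name in f_stats}

for name in ("A", "B", "AB"):
    print(f"{name}: F = {f_stats[name]:.2f}, F crit = {f_crit[name]}, "
          f"reject H0: {reject[name]}")
```

Only the marketing strategy (factor A) exceeds its critical value.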


F tests for the Two-way ANOVA
• Example – continued
– Test of the difference in mean sales
between the three marketing strategies
• H0: µconv. = µquality = µprice
• H1: At least two mean sales are different

F = MS(A)/MSE = MS(Marketing strategy)/MSE = 5.33

• Fcritical = Fα,a-1,n-ab = F.05,3-1,60-(3)(2) = 3.17 (p-value = .0077)

– At 5% significance level there is evidence
to infer that differences in weekly sales
F tests for the Two-way ANOVA

• Example - continued
– Test of the difference in mean sales
between the two advertising media
• H0: µTV = µNewspaper
• H1: The two mean sales differ

ANOVA
Source of Variation SS df MS F P-value F crit
Sample 13172.0 1 13172.0 1.42 0.2387 4.02
Columns 98838.6 2 49419.3 5.33 0.0077 3.17
Interaction 1609.6 2 804.8 0.09 0.9171 3.17
Within 501136.7 54 9280.3

Total 614757.0 59

Factor B: Advertising media (the “Sample” row of the output)
F tests for the Two-way ANOVA

• Example - continued
– Test of the difference in mean sales
between the two advertising media
• H0: µTV = µNewspaper
• H1: The two mean sales differ

F = MS(B)/MSE = MS(Media)/MSE = 1.42
• Fcritical = Fα,b-1,n-ab = F.05,2-1,60-(3)(2) = 4.02 (p-value = .2387)

– At 5% significance level there is
insufficient evidence to infer that
differences in weekly sales exist
F tests for the Two-way ANOVA

• Example - continued
– Test for interaction between factors A
and B
• H0: µTV*conv. = µTV*quality = … = µnewsp.*price
• H1: At least two means differ
ANOVA
Source of Variation SS df MS F P-value F crit
Sample 13172.0 1 13172.0 1.42 0.2387 4.02
Columns 98838.6 2 49419.3 5.33 0.0077 3.17
Interaction 1609.6 2 804.8 0.09 0.9171 3.17
Within 501136.7 54 9280.3

Total 614757.0 59

Interaction AB = Marketing*Media


F tests for the Two-way ANOVA

• Example - continued
– Test for interaction between factor A and
B
• H0: µTV*conv. = µTV*quality = … = µnewsp.*price
• H1: At least two means differ

F = MS(AB)/MSE = MS(Marketing*Media)/MSE = .09

• Fcritical = Fα,(a-1)(b-1),n-ab = F.05,(3-1)(2-1),60-(3)(2) = 3.17 (p-value = .9171)

– At 5% significance level there is
insufficient evidence to infer that the
Jyothimon C
M.Tech Technology Management
University of Kerala
Send your feedback and queries to
jyothimonc@yahoo.com
