
Application of ANOVA

What is ANOVA?

An ANOVA is an analysis of the variation present
in an experiment. It is a test of the hypothesis
that differences exist among several population
means.
Application of ANOVA
ANOVA is designed to detect differences
among means from populations subject to
different treatments

ANOVA is a joint test
• The equality of several population means is
tested simultaneously or jointly.

ANOVA tests for the equality of several
population means by looking at two
estimators of the population variance (hence,
analysis of variance).
The Hypothesis Test of Analysis of Variance
• In an analysis of variance:
We have r independent random samples, each one
corresponding to a population subject to a different
treatment.
We have:
 n = n1+ n2+ n3+ ...+nr total observations.
 r sample means: x1, x2 , x3 , ... , xr
 These r sample means can be used to calculate an
estimator of the population variance. If the population
means are equal, we expect the variance among the
sample means to be small.
 r sample variances: s1², s2², s3², ..., sr²
 These sample variances can be used to find a pooled
estimator of the population variance.
The Hypothesis Test of Analysis of
Variance (continued): Assumptions
• We assume independent random sampling from
each of the r populations
• We assume that the r populations under study:
– are normally distributed,
– with means μi that may or may not be equal,
– but with equal variances, σ².

μ1 μ2 μ3
Population 1 Population 2 Population 3
Testing the Hypothesis
The hypothesis test of analysis of variance:

H0: μ1 = μ2 = μ3 = μ4 = ... = μr
H1: Not all μi (i = 1, ..., r) are equal

• The test statistic of analysis of variance:

F(r-1, n-r) = (estimate of variance based on the means from the r samples) / (estimate of variance based on all sample observations)

• That is, the test statistic in an analysis of variance is based on the ratio
of two estimators of a population variance, and is therefore based on
the F distribution, with (r - 1) degrees of freedom in the numerator and
(n - r) degrees of freedom in the denominator.
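The two variance estimates in this ratio can be computed directly. Below is a minimal sketch in Python with numpy; the three treatment samples are invented purely for illustration and do not come from the text:

```python
import numpy as np

# Hypothetical data: r = 3 treatment samples (values are illustrative only)
samples = [
    np.array([4.0, 5.0, 6.0, 5.0]),
    np.array([7.0, 8.0, 6.0, 7.0]),
    np.array([5.0, 4.0, 4.0, 3.0]),
]

r = len(samples)                      # number of treatments
n = sum(len(s) for s in samples)      # total observations
grand_mean = np.concatenate(samples).mean()

# Numerator: estimate of variance based on the r sample means
sstr = sum(len(s) * (s.mean() - grand_mean) ** 2 for s in samples)
mstr = sstr / (r - 1)

# Denominator: pooled estimate based on all sample observations
sse = sum(((s - s.mean()) ** 2).sum() for s in samples)
mse = sse / (n - r)

F = mstr / mse  # compare with the F(r-1, n-r) critical value
```

A large F arises when the variation among sample means is large relative to the pooled within-sample variation, which is exactly the evidence against H0.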
Extension of ANOVA to Three Factors

Source of         Sum of    Degrees of
Variation         Squares   Freedom            Mean Square                          F Ratio
Factor A          SSA       a-1                MSA = SSA/(a-1)                      F = MSA/MSE
Factor B          SSB       b-1                MSB = SSB/(b-1)                      F = MSB/MSE
Factor C          SSC       c-1                MSC = SSC/(c-1)                      F = MSC/MSE
Interaction (AB)  SS(AB)    (a-1)(b-1)         MS(AB) = SS(AB)/[(a-1)(b-1)]         F = MS(AB)/MSE
Interaction (AC)  SS(AC)    (a-1)(c-1)         MS(AC) = SS(AC)/[(a-1)(c-1)]         F = MS(AC)/MSE
Interaction (BC)  SS(BC)    (b-1)(c-1)         MS(BC) = SS(BC)/[(b-1)(c-1)]         F = MS(BC)/MSE
Interaction (ABC) SS(ABC)   (a-1)(b-1)(c-1)    MS(ABC) = SS(ABC)/[(a-1)(b-1)(c-1)]  F = MS(ABC)/MSE
Error             SSE       abc(n-1)           MSE = SSE/[abc(n-1)]
Total             SST       abcn-1
Application of ANOVA

• We can manipulate certain variables (like promotion, ad
copy, or display at the point of purchase) and observe
changes in other variables (like sales, or consumer
preferences, behavior, or attitude). The application areas for
experiments are wide.

• Whenever a marketing-mix variable (independent variable)
is changed, we can determine its effect. Such variables
include price, a specific promotion or type of distribution, or
specific elements like shelf space, color of packaging, etc.

• An experiment can be done with just one
independent variable (factor) or with multiple
independent variables.

• The key to success in an ANOVA survey is the degree
of control over the various independent variables
(factors) that are being manipulated during the
experiment.
N-way Analysis of Variance
In business research, one is often concerned with the effect
of more than one factor simultaneously. For example:

• How do advertising levels (high, medium, and low) interact
with price levels (high, medium, and low) to influence a
brand's sales?

• Do educational levels (less than high school, high school
graduate, some college, and college graduate) and age
(less than 35, 35-55, more than 55) affect consumption of a
brand?

• What is the effect of consumers' familiarity with a
department store (high, medium, and low) and store image
(positive, neutral, and negative) on preference for the store?
N-way Analysis of Variance
Consider the simple case of two factors X1 and X2 having c1 and c2
categories, respectively. The total variation in this case is partitioned as
follows:

SStotal = SS due to X1 + SS due to X2 + SS due to interaction of X1 and X2 + SSwithin

or

SSy = SSx1 + SSx2 + SSx1x2 + SSerror

The strength of the joint effect of two factors, called the overall effect, or
multiple η², is measured as follows:

multiple η² = (SSx1 + SSx2 + SSx1x2)/SSy


N-way Analysis of Variance
The significance of the overall effect may be tested by an F test, as
follows:

F = [(SSx1 + SSx2 + SSx1x2)/dfn] / [SSerror/dfd]
  = (SSx1,x2,x1x2/dfn) / (SSerror/dfd)
  = MSx1,x2,x1x2 / MSerror

where

dfn = degrees of freedom for the numerator
    = (c1 - 1) + (c2 - 1) + (c1 - 1)(c2 - 1)
    = c1c2 - 1
dfd = degrees of freedom for the denominator
    = N - c1c2
MS  = mean square
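These formulas can be checked numerically. The following small sketch uses hypothetical sum-of-squares values (all numbers are invented for illustration, for a design with c1 = c2 = 2 levels and N = 12 observations):

```python
# Hypothetical sums of squares from a balanced two-way design (illustrative only)
ss_x1, ss_x2, ss_x1x2, ss_error = 27.0, 108.0, 12.0, 8.0
ss_y = ss_x1 + ss_x2 + ss_x1x2 + ss_error   # SSy = SSx1 + SSx2 + SSx1x2 + SSerror
c1, c2, N = 2, 2, 12

# Overall (joint) effect of the two factors
multiple_eta_sq = (ss_x1 + ss_x2 + ss_x1x2) / ss_y

# F test for the overall effect
dfn = c1 * c2 - 1        # equals (c1-1) + (c2-1) + (c1-1)(c2-1)
dfd = N - c1 * c2
F_overall = ((ss_x1 + ss_x2 + ss_x1x2) / dfn) / (ss_error / dfd)
```

Note that dfn collapses algebraically: (c1 - 1) + (c2 - 1) + (c1 - 1)(c2 - 1) = c1c2 - 1.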
N-way Analysis of Variance
If the overall effect is significant, the next step is to examine the
significance of the interaction effect. Under the null hypothesis of
no interaction, the appropriate F test is:

F = (SSx1x2/dfn) / (SSerror/dfd)
  = MSx1x2 / MSerror

where

dfn = (c1 - 1)(c2 - 1)
dfd = N - c1c2
N-way Analysis of Variance
The significance of the main effect of each factor may be tested
as follows for X1:

F = (SSx1/dfn) / (SSerror/dfd) = MSx1 / MSerror

where

dfn = c1 - 1
dfd = N - c1c2
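For a balanced two-factor design, the sums of squares behind these interaction and main-effect tests can be computed by hand. A sketch with numpy on hypothetical data (the layout and all values are invented for illustration):

```python
import numpy as np

# Hypothetical balanced layout: c1 = 2 levels of X1, c2 = 2 levels of X2,
# m = 3 replicates per cell (values illustrative only)
y = np.array([
    [[10., 11., 12.], [14., 15., 16.]],   # X1 level 1
    [[11., 12., 13.], [19., 20., 21.]],   # X1 level 2
])
c1, c2, m = y.shape
N = c1 * c2 * m
grand = y.mean()

row_means = y.mean(axis=(1, 2))   # means for the levels of X1
col_means = y.mean(axis=(0, 2))   # means for the levels of X2
cell_means = y.mean(axis=2)

ss_x1 = c2 * m * ((row_means - grand) ** 2).sum()
ss_x2 = c1 * m * ((col_means - grand) ** 2).sum()
ss_cells = m * ((cell_means - grand) ** 2).sum()
ss_x1x2 = ss_cells - ss_x1 - ss_x2                     # interaction SS
ss_error = ((y - cell_means[:, :, None]) ** 2).sum()   # within-cell SS

ms_error = ss_error / (N - c1 * c2)
F_interaction = (ss_x1x2 / ((c1 - 1) * (c2 - 1))) / ms_error
F_x1 = (ss_x1 / (c1 - 1)) / ms_error                   # main effect of X1
```

Each F would then be referred to the F distribution with the corresponding dfn and dfd = N - c1c2 degrees of freedom.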
Issues in Interpretation
• The most commonly used measure in ANOVA is omega squared,
ω². This measure indicates what proportion of the variation in the
dependent variable is related to a particular independent variable or
factor. The relative contribution of a factor X is calculated as
follows:

ω²x = [SSx - (dfx × MSerror)] / (SStotal + MSerror)

• Normally, ω² is interpreted only for statistically significant effects.
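The formula is simple enough to wrap in a small helper. The numbers passed in below are hypothetical ANOVA-table values chosen only to illustrate the calculation:

```python
def omega_squared(ss_x, df_x, ms_error, ss_total):
    """Estimated proportion of variation in the dependent variable
    attributable to factor X: (SSx - dfx*MSerror) / (SStotal + MSerror)."""
    return (ss_x - df_x * ms_error) / (ss_total + ms_error)

# Hypothetical ANOVA-table values (illustrative only)
w2 = omega_squared(ss_x=180.0, df_x=2, ms_error=5.0, ss_total=400.0)
```

Subtracting dfx × MSerror in the numerator corrects for the variation a factor would appear to explain by chance alone, so ω² is smaller than the raw ratio SSx/SStotal.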
Repeated Measures ANOVA
In the case of a single factor with repeated measures, the
total variation, with nc - 1 degrees of freedom, may be
split into between-people variation and within-people
variation.

SStotal = SSbetween people + SSwithin people

The between-people variation, which is related to the
differences between the means of people, has n - 1
degrees of freedom. The within-people variation has
n (c - 1) degrees of freedom. The within-people variation
may, in turn, be divided into two different sources of
variation. One source is related to the differences
between treatment means, and the second consists of
residual or error variation. The degrees of freedom
corresponding to the treatment variation are c - 1, and
those corresponding to residual variation are
(c - 1) (n -1).
Repeated Measures ANOVA
Thus,

SSwithin people = SSx + SSerror

A test of the null hypothesis of equal means may now be
constructed in the usual way:

F = [SSx/(c - 1)] / [SSerror/((n - 1)(c - 1))] = MSx / MSerror

So far we have assumed that the dependent variable is
measured on an interval or ratio scale. If the dependent
variable is nonmetric, however, a different procedure
should be used.
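The partition described above can be carried out directly on an n × c matrix of scores. A sketch in numpy with invented data (n = 4 people, c = 3 treatments; all values are illustrative):

```python
import numpy as np

# Hypothetical repeated measures: each row is one person, each column one treatment
y = np.array([
    [5., 7., 9.],
    [4., 7., 8.],
    [6., 8., 10.],
    [5., 6., 9.],
])
n, c = y.shape
grand = y.mean()

ss_total = ((y - grand) ** 2).sum()                            # nc - 1 df
ss_between_people = c * ((y.mean(axis=1) - grand) ** 2).sum()  # n - 1 df
ss_within_people = ss_total - ss_between_people                # n(c - 1) df
ss_x = n * ((y.mean(axis=0) - grand) ** 2).sum()               # treatments, c - 1 df
ss_error = ss_within_people - ss_x                             # (c - 1)(n - 1) df

F = (ss_x / (c - 1)) / (ss_error / ((c - 1) * (n - 1)))
```

Because between-people differences are removed before forming the error term, the repeated-measures F can detect treatment effects that a between-subjects one-way ANOVA on the same numbers would miss.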
Nonmetric Analysis of Variance
• Nonmetric analysis of variance examines the
difference in the central tendencies of more than two
groups when the dependent variable is measured
on an ordinal scale.
• One such procedure is the k-sample median test.
As its name implies, this is an extension of the
median test for two groups, which was considered in
Chapter 15.
Nonmetric Analysis of Variance
• A more powerful test is the Kruskal-Wallis one-way analysis
of variance. This is an extension of the Mann-Whitney test.
This test also examines the difference in medians. All cases
from the k groups are ordered in a single ranking. If the k
populations are the same, the groups should be similar in
terms of ranks within each group. The rank sum is calculated
for each group. From these, the Kruskal-Wallis H statistic,
which has a chi-square distribution, is computed.
• The Kruskal-Wallis test is more powerful than the k-sample
median test as it uses the rank value of each case, not merely
its location relative to the median. However, if there are a
large number of tied rankings in the data, the k-sample
median test may be a better choice.
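The H statistic can be computed directly from the rank sums. A minimal sketch in plain Python with invented data, chosen so that there are no tied values (with ties, a correction factor would also be applied):

```python
# Kruskal-Wallis H from rank sums (k = 3 groups; values illustrative only)
groups = [
    [27, 2, 4, 18, 7],
    [20, 8, 14, 36, 21],
    [34, 31, 3, 23, 30],
]

# Rank all cases from the k groups in a single ordering (rank 1 = smallest)
all_vals = sorted(v for g in groups for v in g)
rank = {v: i + 1 for i, v in enumerate(all_vals)}
N = len(all_vals)

# H = 12/(N(N+1)) * sum(Ri^2 / ni) - 3(N+1), where Ri is each group's rank sum
H = 12.0 / (N * (N + 1)) * sum(
    sum(rank[v] for v in g) ** 2 / len(g) for g in groups
) - 3 * (N + 1)
# H is referred to a chi-square distribution with k - 1 degrees of freedom
```

With k = 3 groups, H would be compared against a chi-square critical value with 2 degrees of freedom.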
