It may be noted that we have already described how to develop the likelihood ratio tests for testing the hypothesis of equality of more than two means from normal distributions. Now we concentrate on deriving the same tests through the least squares principle under the setup of the linear regression model. The design matrix is assumed to be not necessarily of full rank and consists of 0's and 1's only.

Consider the model
y_ij = μ + α_i + ε_ij, i = 1, 2,..., p; j = 1, 2,..., n_i,
where
- μ is the general mean effect, which
  - is fixed, and
  - gives an idea about the general conditions of the experimental units and treatments;
- α_i is the effect of the ith level of the factor, which
  - can be fixed or random.
Example: Consider a medicine experiment in which there are three different dosages of a medicine, 2 mg., 5 mg. and 10 mg., which are given to patients for controlling fever. These are the 3 levels of the medicine, so denote α_1 = 2 mg., α_2 = 5 mg., α_3 = 10 mg. Let Y denote the time taken by the medicine to reduce the body temperature from high to normal. Suppose two patients have been given the 2 mg. dosage, so Y_11 and Y_12 will denote their responses. So we can write that when α_1 = 2 mg. is given to the two patients, then
E(Y_1j) = μ + α_1, j = 1, 2.
Similarly, if α_2 = 5 mg. and α_3 = 10 mg. dosages are given to 4 and 7 patients respectively, then the responses follow the model
E(Y_2j) = μ + α_2, j = 1, 2, 3, 4,
E(Y_3j) = μ + α_3, j = 1, 2, 3, 4, 5, 6, 7.
Here μ denotes the general mean effect, which may be thought of as follows: the human body has a tendency to fight against fever, so the time taken by the medicine to bring down the temperature depends on many factors like body weight, height, general health condition, etc. of the patient. So μ denotes the general effect of all these factors, which is present in all the observations.
In the terminology of the linear regression model, μ denotes the intercept term, which is the value of the response variable when all the independent variables are set to zero. In experimental designs, models with an intercept term are more commonly used, so we generally consider these types of models.
ε_ij indicates the variation due to uncontrolled causes which can influence the observations. We assume that the ε_ij's are identically and independently distributed as N(0, σ²), with E(ε_ij) = 0 and Var(ε_ij) = σ².
Note that the general linear model considered is
E(Y) = Xβ with E(Y_ij) = β_i.
When all the entries in X are 0's or 1's, this model can also be re-expressed in the form
E(Y_ij) = μ + α_i,
i.e.,
E(Y) = X*β*, Cov(Y) = σ² I,
where X* = [1, X] (the design matrix X with a column of ones prefixed) is an n × (p + 1) matrix, β* = (μ, α_1,..., α_p)', and X denotes the earlier defined design matrix in which
- the first n_1 rows are (1, 0, 0,..., 0),
- the second n_2 rows are (0, 1, 0,..., 0),
- ..., and
- the last n_p rows are (0, 0, 0,..., 1).
Analysis of Variance | Chapter 3 | Experimental Design Models | Shalabh, IIT Kanpur
We earlier assumed that rank(X) = p, but can we also say that rank(X*) is (p + 1) in the present case? Since the first column of X* is the vector sum of all its remaining p columns, we have
rank(X*) = p.
It is thus apparent that not all the linear parametric functions of μ, α_1, α_2,..., α_p are estimable. The question now arises as to what kind of linear parametric functions are estimable.
Consider the linear parametric function L = Σ_{i=1}^p Σ_{j=1}^{n_i} a_ij Y_ij with
C_i = Σ_{j=1}^{n_i} a_ij.
Now
E(L) = Σ_i Σ_j a_ij E(Y_ij)
     = Σ_i Σ_j a_ij (μ + α_i)
     = (Σ_{i=1}^p C_i) μ + Σ_{i=1}^p C_i α_i.
Thus Σ_{i=1}^p C_i α_i is estimable if and only if
Σ_{i=1}^p C_i = 0,
i.e., Σ_{i=1}^p C_i α_i is a contrast. Thus the contrasts of the α_i's are estimable.
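As a quick numerical illustration of why only contrasts are estimable: the non-identifiability lets us shift μ → μ + c and α_i → α_i − c without changing the model, and a linear function Σ C_i α_i is unaffected by such a shift exactly when Σ C_i = 0. A minimal sketch (all numbers hypothetical):

```python
# Non-identifiability in y_ij = mu + alpha_i + e_ij: shifting mu -> mu + c,
# alpha_i -> alpha_i - c changes nothing observable. A contrast (coefficients
# summing to zero) is invariant under this shift, hence estimable.
# All numbers here are hypothetical.
alpha = [3.0, -1.0, 0.5]
C = [1.0, -1.0, 0.0]      # sum(C) == 0: the contrast alpha_1 - alpha_2
c = 2.7                   # arbitrary shift absorbed into mu

before = sum(ci * ai for ci, ai in zip(C, alpha))
after = sum(ci * (ai - c) for ci, ai in zip(C, alpha))
print(round(before, 10), round(after, 10))
```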
This effect and outcome can also be seen from the following explanation based on the estimation of the parameters μ, α_1, α_2,..., α_p. Minimizing S = Σ_i Σ_j (y_ij − μ − α_i)², the normal equations are
(a) ∂S/∂μ = 0 ⇒ Σ_{i=1}^p Σ_{j=1}^{n_i} (y_ij − μ − α_i) = 0,
(b) ∂S/∂α_i = 0 ⇒ Σ_{j=1}^{n_i} (y_ij − μ − α_i) = 0, i = 1, 2,..., p.
Note that (a) can be obtained by summing (b) over i, so (a) and (b) are linearly dependent, in the sense that there are (p + 1) unknowns but only p linearly independent equations. Consequently μ̂, α̂_1,..., α̂_p do not have a unique solution. The same applies to the maximum likelihood estimation of μ, α_1,..., α_p.
A solution can be obtained by imposing the side condition
Σ_{i=1}^p n_i α̂_i = 0 (or Σ_{i=1}^p n_i α_i = 0),
under which
μ̂ = ȳ_oo, α̂_i = ȳ_io − ȳ_oo,
where n = Σ_{i=1}^p n_i. In case all the sample sizes are the same, the condition Σ_{i=1}^p n_i α̂_i = 0 (or Σ_{i=1}^p n_i α_i = 0) reduces to
Σ_{i=1}^p α̂_i = 0 (or Σ_{i=1}^p α_i = 0).
So the model y_ij = μ + α_i + ε_ij needs to be rewritten so that all the parameters can be uniquely estimated. Thus
Y_ij = μ + α_i + ε_ij
     = (μ + ᾱ) + (α_i − ᾱ) + ε_ij
     = μ* + α_i* + ε_ij
where
μ* = μ + ᾱ,
α_i* = α_i − ᾱ,
ᾱ = (1/p) Σ_{i=1}^p α_i,
and
Σ_{i=1}^p α_i* = 0.
Thus, in a linear model, when X is not of full rank the parameters do not have unique estimates. In such conditions a restriction
Σ_{i=1}^p α_i = 0 (or equivalently Σ_{i=1}^p n_i α_i = 0 when the n_i's are not all the same)
can be added, and then the least squares (or maximum likelihood) estimators obtained are unique. The model
E(Y_ij) = μ* + α_i*, with Σ_{i=1}^p α_i* = 0,
is then of full rank.
Let us now consider the analysis of variance with the additional constraint. Let
y_ij = μ_i + ε_ij, i = 1, 2,..., p; j = 1, 2,..., n_i,
     = μ + (μ_i − μ) + ε_ij
     = μ + α_i + ε_ij
with
μ = (1/n) Σ_{i=1}^p n_i μ_i, α_i = μ_i − μ,
Σ_{i=1}^p n_i α_i = 0, n = Σ_{i=1}^p n_i,
and the ε_ij's are identically and independently distributed with mean 0 and variance σ².
The null hypothesis is
H_0: α_1 = α_2 = ... = α_p = 0.
This model is a one-way layout in the sense that the observations y_ij's are assumed to be affected by only one treatment effect α_i. So the null hypothesis is equivalent to testing the equality of the p population means, or equivalently the equality of the p treatment effects.
Minimizing the error sum of squares
E = Σ_{i=1}^p Σ_{j=1}^{n_i} (y_ij − μ − α_i)²,
the normal equations are
∂E/∂μ = 0 ⇒ −2 Σ_i Σ_j (y_ij − μ − α_i) = 0, or n μ + Σ_{i=1}^p n_i α_i = Σ_i Σ_j y_ij,   (1)
∂E/∂α_i = 0 ⇒ −2 Σ_j (y_ij − μ − α_i) = 0, or n_i μ + n_i α_i = Σ_{j=1}^{n_i} y_ij, i = 1, 2,..., p.   (2)
Using Σ_{i=1}^p n_i α_i = 0 in (1) gives
μ̂ = (1/n) Σ_i Σ_j y_ij = G/n = ȳ_oo,
where G = Σ_i Σ_j y_ij is the grand total of all the observations.
Substituting μ̂ in (2) gives
α̂_i = (1/n_i) Σ_{j=1}^{n_i} y_ij − μ̂ = T_i/n_i − μ̂ = ȳ_io − ȳ_oo,
where T_i = Σ_{j=1}^{n_i} y_ij is the treatment total due to the ith effect α_i, i.e., the total of all the observations receiving the ith treatment, and ȳ_io = (1/n_i) Σ_{j=1}^{n_i} y_ij.
Now the fitted model is y_ij = μ̂ + α̂_i, and the error sum of squares after substituting μ̂ and α̂_i in E becomes
E = Σ_i Σ_j (y_ij − μ̂ − α̂_i)²
  = Σ_i Σ_j [(y_ij − ȳ_oo) − (ȳ_io − ȳ_oo)]²
  = Σ_i Σ_j (y_ij − ȳ_oo)² − Σ_i n_i (ȳ_io − ȳ_oo)²
  = [Σ_i Σ_j y_ij² − G²/n] − [Σ_i T_i²/n_i − G²/n],
where G²/n is called the correction factor (CF).
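The estimators and the two equivalent forms of the error sum of squares can be checked numerically. The sketch below uses hypothetical responses arranged like the dosage example (group sizes 2, 4 and 7) and verifies both the side condition Σ n_i α̂_i = 0 and the computational form E = Σ Σ y_ij² − Σ T_i²/n_i:

```python
# One-way layout: least squares estimates and the error sum of squares,
# computed two equivalent ways. Hypothetical responses, group sizes 2, 4, 7.
groups = {
    1: [10.0, 12.0],
    2: [8.0, 9.0, 7.0, 8.0],
    3: [5.0, 6.0, 5.0, 7.0, 6.0, 5.0, 6.0],
}

n = sum(len(y) for y in groups.values())      # total number of observations
G = sum(sum(y) for y in groups.values())      # grand total
mu_hat = G / n                                # mu_hat = y_bar_oo
alpha_hat = {i: sum(y) / len(y) - mu_hat for i, y in groups.items()}

# The side condition sum_i n_i * alpha_hat_i = 0 holds automatically:
check = abs(sum(len(groups[i]) * alpha_hat[i] for i in groups))
print(round(check, 10))

# SSE as a sum of squared residuals, and via E = sum y^2 - sum T_i^2/n_i:
E_direct = sum((yij - mu_hat - alpha_hat[i]) ** 2
               for i, y in groups.items() for yij in y)
E_formula = (sum(yij ** 2 for y in groups.values() for yij in y)
             - sum(sum(y) ** 2 / len(y) for y in groups.values()))
print(round(E_direct, 6), round(E_formula, 6))
```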
Under H_0: α_1 = α_2 = ... = α_p = 0, minimizing the error sum of squares
E_1 = Σ_i Σ_j (y_ij − μ)²
gives
∂E_1/∂μ = 0 ⇒ −2 Σ_i Σ_j (y_ij − μ) = 0,
or
μ̂ = ȳ_oo = G/n.
Note that
E_1: contains variation due to both treatment and error,
E: contains variation due to error only,
so E_1 − E contains variation due to treatment only.
These sums of squares form the basis for the development of the tools in the analysis of variance. We can write
TSS = SSTr + SSE.
The total sum of squares TSS = Σ_i Σ_j (y_ij − ȳ_oo)² is based on n quantities subject to the constraint Σ_i Σ_j (y_ij − ȳ_oo) = 0, so TSS carries (n − 1) degrees of freedom.
The sum of squares due to the treatments SSTr = Σ_i n_i (ȳ_io − ȳ_oo)² is based on p quantities subject to the constraint Σ_{i=1}^p n_i (ȳ_io − ȳ_oo) = 0, so SSTr has (p − 1) degrees of freedom.
The sum of squares due to error SSE = Σ_i Σ_j (y_ij − ȳ_io)² is based on n quantities subject to the p constraints Σ_{j=1}^{n_i} (y_ij − ȳ_io) = 0, i = 1, 2,..., p, so SSE has (n − p) degrees of freedom.
Thus the TSS has been divided into two orthogonal components, SSTr and SSE, and all of TSS, SSTr and SSE can be expressed in quadratic forms. Since the ε_ij are assumed to be identically and independently distributed, the y_ij are independently distributed as N(μ + α_i, σ²).
Now, using Theorems 7 and 8 with q_1 = SSTr and q_2 = SSE, we have under H_0
SSTr/σ² ~ χ²(p − 1)
and
SSE/σ² ~ χ²(n − p).
The mean square is defined as the sum of squares divided by its degrees of freedom. So the mean square due to treatment is
MSTr = SSTr/(p − 1)
and the mean square due to error is
MSE = SSE/(n − p).
Thus, under H_0,
F = (MSTr/σ²)/(MSE/σ²) = MSTr/MSE ~ F(p − 1, n − p).
The decision rule is to reject H_0 if
F > F_{1−α}(p − 1, n − p)
at the α% level of significance.
Under the alternative,
MSTr/MSE ~ noncentral F(p − 1, n − p, δ),
where
δ = Σ_{i=1}^p n_i α_i²/σ²
is the noncentrality parameter.
Note that the test statistic MSTr/MSE can also be obtained from the likelihood ratio test. If H_0 is rejected, then we go for multiple comparison tests and try to divide the population into several groups having the same effects.
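A minimal sketch of the full F computation for hypothetical data with p = 3 groups (the critical value F_{1−α}(p − 1, n − p) would come from tables or a statistics library and is omitted here):

```python
# One-way ANOVA: F = MSTr/MSE with (p - 1, n - p) degrees of freedom.
# Hypothetical data for p = 3 treatment groups.
groups = [
    [10.0, 12.0],
    [8.0, 9.0, 7.0, 8.0],
    [5.0, 6.0, 5.0, 7.0, 6.0, 5.0, 6.0],
]
p = len(groups)
n = sum(len(g) for g in groups)
grand_mean = sum(sum(g) for g in groups) / n

SSTr = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
SSE = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)

MSTr = SSTr / (p - 1)   # mean square due to treatment
MSE = SSE / (n - p)     # mean square due to error
F = MSTr / MSE
print(p - 1, n - p, round(F, 4))
```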
The analysis of variance table is as follows:

Source       d.f.     SS     MS                    F
Treatment    p − 1    SSTr   MSTr = SSTr/(p − 1)   MSTr/MSE
Error        n − p    SSE    MSE = SSE/(n − p)
Total        n − 1    TSS
Next we find E(SSTr):
E(SSTr) = E[Σ_i n_i (ȳ_io − ȳ_oo)²]
        = E[Σ_i n_i ((μ + α_i + ε̄_io) − (μ + ε̄_oo))²]
        = E[Σ_i n_i (α_i + ε̄_io − ε̄_oo)²],
where
ε̄_io = (1/n_i) Σ_j ε_ij, ε̄_oo = (1/n) Σ_i Σ_j ε_ij,
and Σ_i n_i α_i/n = 0 has been used. Thus
E(SSTr) = Σ_i n_i α_i² + Σ_i n_i E(ε̄_io − ε̄_oo)² + 0,
the cross-product term vanishing since E(ε̄_io − ε̄_oo) = 0. Since E(ε̄_io − ε̄_oo)² = σ²(1/n_i − 1/n) (derived below),
E(SSTr) = Σ_i n_i α_i² + σ² Σ_i n_i (1/n_i − 1/n)
        = Σ_i n_i α_i² + (p − 1) σ²,
or
E[SSTr/(p − 1)] = σ² + Σ_i n_i α_i²/(p − 1),
or
E(MSTr) = σ² + Σ_{i=1}^p n_i α_i²/(p − 1).
Now
E(ε̄_io²) = Var(ε̄_io) = Var[(1/n_i) Σ_j ε_ij] = (1/n_i²) n_i σ² = σ²/n_i,
E(ε̄_oo²) = Var(ε̄_oo) = Var[(1/n) Σ_i Σ_j ε_ij] = (1/n²) n σ² = σ²/n,
E(ε̄_io ε̄_oo) = Cov(ε̄_io, ε̄_oo) = [1/(n_i n)] Cov(Σ_j ε_ij, Σ_i Σ_j ε_ij) = n_i σ²/(n_i n) = σ²/n,
so that E(ε̄_io − ε̄_oo)² = σ²/n_i + σ²/n − 2σ²/n = σ²(1/n_i − 1/n).
Next,
E(SSE) = E[Σ_i Σ_j (y_ij − ȳ_io)²]
       = E[Σ_i Σ_j ((μ + α_i + ε_ij) − (μ + α_i + ε̄_io))²]
       = E[Σ_i Σ_j (ε_ij − ε̄_io)²]
       = Σ_i Σ_j E(ε_ij² + ε̄_io² − 2 ε_ij ε̄_io)
       = Σ_i Σ_j (σ² + σ²/n_i − 2σ²/n_i)
       = σ² Σ_i n_i (n_i − 1)/n_i
       = σ² Σ_i (n_i − 1)
       = (n − p) σ²,
or
E[SSE/(n − p)] = σ²,
or
E(MSE) = σ².
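The result E(MSE) = σ² can also be checked by simulation: repeatedly generate data from the one-way model and average the resulting MSE values. The settings below (μ, the α_i, σ, group sizes, number of replicates) are arbitrary illustrative choices:

```python
# Empirical check of E(MSE) = sigma^2 in the one-way model.
# All settings below are arbitrary illustrative choices.
import random

random.seed(42)
mu, sigma = 5.0, 1.0
alpha = [1.0, 0.0, -1.0]
sizes = [4, 5, 6]
n, p = sum(sizes), len(sizes)

reps = 500
total = 0.0
for _ in range(reps):
    data = [[mu + a + random.gauss(0.0, sigma) for _ in range(m)]
            for a, m in zip(alpha, sizes)]
    sse = sum((y - sum(g) / len(g)) ** 2 for g in data for y in g)
    total += sse / (n - p)          # MSE of this replicate

print(round(total / reps, 3))       # settles near sigma^2 = 1
```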
Two-way classification under the fixed effects model
Suppose the response of an outcome is affected by two factors, A and B. For example, suppose I varieties of mangoes are grown on I different plots of the same size in each of J different locations. All the plots are given the same treatment, such as an equal amount of water, an equal amount of fertilizer, etc. So there are two factors in the experiment which affect the yield of mangoes:
- Location (A)
- Variety of mangoes (B)
Such an experiment is called a two-factor experiment. The different locations correspond to the different levels of factor A, and the different varieties correspond to the different levels of factor B. The observations are collected on a per-plot basis.
The combined effect of the two factors (A and B in our case) is called the interaction effect (of A and B). Mathematically, let a and b be the levels of factors A and B respectively; then a function f(a, b) is called a function of no interaction if and only if there exist functions g(a) and h(b) such that f(a, b) = g(a) + h(b).
We consider here two cases:
1. One observation per plot, in which case the interaction effect is zero.
2. More than one observation per plot, in which case the interaction effect is present.

Consider first the case of one observation per plot, and let an o subscript denote averaging over the corresponding factor, say B in μ_io. Assume the Y_ij are independently distributed as N(μ_ij, σ²), i = 1, 2,..., I, j = 1, 2,..., J, and decompose
μ_ij = μ + α_i + β_j + γ_ij
where
μ = μ_oo,
α_i = μ_io − μ_oo,
β_j = μ_oj − μ_oo,
γ_ij = μ_ij − μ_io − μ_oj + μ_oo,
with
Σ_{i=1}^I α_i = Σ_{i=1}^I (μ_io − μ_oo) = 0,
Σ_{j=1}^J β_j = Σ_{j=1}^J (μ_oj − μ_oo) = 0.
With one observation per plot, the interaction γ_ij is taken to be zero.
Here
α_i: effect of the ith level of factor A, i.e., the excess of the mean of the ith level of A over the general mean;
β_j: effect of the jth level of factor B, i.e., the excess of the mean of the jth level of B over the general mean.
We also assume that the model E(Y_ij) = μ_ij is a full-rank model, so that μ_ij and all linear parametric functions of μ_ij are estimable.
The total number of observations is IJ, which can be arranged in a two-way classified I × J table where the rows correspond to the different levels of A and the columns correspond to the different levels of B.
The observations on Y and the design matrix X in this case are as follows:

Y       μ    α_1  α_2 ... α_I    β_1  β_2 ... β_J
y_11    1    1    0   ...  0     1    0   ...  0
y_12    1    1    0   ...  0     0    1   ...  0
...
y_1J    1    1    0   ...  0     0    0   ...  1
...
y_I1    1    0    0   ...  1     1    0   ...  0
y_I2    1    0    0   ...  1     0    1   ...  0
...
y_IJ    1    0    0   ...  1     0    0   ...  1
If the design matrix is not of full rank, then the model can be reparameterized. In such a case, we can start the analysis by assuming that the model E(Y_ij) = μ + α_i + β_j is obtained after reparameterization.
The null hypothesis of interest is
H_0: α_1 = α_2 = ... = α_I = 0
against
H_1: at least one α_i (i = 1, 2,..., I) is different from the others.
Now we derive the least squares estimators (or equivalently the maximum likelihood estimators) of μ, α_i and β_j, i = 1, 2,..., I, j = 1, 2,..., J, by minimizing the error sum of squares
E = Σ_{i=1}^I Σ_{j=1}^J (y_ij − μ − α_i − β_j)².
The normal equations are obtained as
∂E/∂μ = 0 ⇒ −2 Σ_i Σ_j (y_ij − μ − α_i − β_j) = 0,
∂E/∂α_i = 0 ⇒ −2 Σ_j (y_ij − μ − α_i − β_j) = 0, i = 1, 2,..., I,
∂E/∂β_j = 0 ⇒ −2 Σ_i (y_ij − μ − α_i − β_j) = 0, j = 1, 2,..., J.
Solving the normal equations and using Σ_{i=1}^I α_i = 0 and Σ_{j=1}^J β_j = 0, the least squares estimators are obtained as
μ̂ = (1/IJ) Σ_i Σ_j y_ij = G/IJ = ȳ_oo,
α̂_i = (1/J) Σ_j y_ij − ȳ_oo = T_i/J − ȳ_oo = ȳ_io − ȳ_oo, i = 1, 2,..., I,
β̂_j = (1/I) Σ_i y_ij − ȳ_oo = B_j/I − ȳ_oo = ȳ_oj − ȳ_oo, j = 1, 2,..., J,
where
T_i: treatment total due to the ith effect, i.e., the sum of all the observations receiving the ith treatment;
B_j: block total due to the jth effect, i.e., the sum of all the observations in the jth block.
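A small numerical sketch of these estimators for a hypothetical 3 × 4 table (rows = levels of A, columns = levels of B), checking the side conditions Σ α̂_i = 0 and Σ β̂_j = 0:

```python
# Two-way layout, one observation per cell: least squares estimates.
# Hypothetical 3 x 4 table (rows = levels of A, columns = levels of B).
y = [
    [12.0, 14.0, 11.0, 13.0],
    [10.0, 11.0,  9.0, 10.0],
    [15.0, 16.0, 14.0, 15.0],
]
I, J = len(y), len(y[0])

mu_hat = sum(sum(row) for row in y) / (I * J)                 # y_bar_oo
alpha_hat = [sum(row) / J - mu_hat for row in y]              # T_i/J - y_bar_oo
beta_hat = [sum(y[i][j] for i in range(I)) / I - mu_hat       # B_j/I - y_bar_oo
            for j in range(J)]

# Side conditions sum(alpha_hat) = 0 and sum(beta_hat) = 0:
print(round(abs(sum(alpha_hat)), 10), round(abs(sum(beta_hat)), 10))
```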
The error sum of squares is
SSE = Σ_i Σ_j [(y_ij − ȳ_oo) − (ȳ_io − ȳ_oo) − (ȳ_oj − ȳ_oo)]² = Σ_i Σ_j (y_ij − ȳ_io − ȳ_oj + ȳ_oo)²,
and the total sum of squares partitions as
Σ_i Σ_j (y_ij − ȳ_oo)² = J Σ_i (ȳ_io − ȳ_oo)² + I Σ_j (ȳ_oj − ȳ_oo)² + Σ_i Σ_j (y_ij − ȳ_io − ȳ_oj + ȳ_oo)².
The error sum of squares carries
IJ − (I − 1) − (J − 1) − 1 = (I − 1)(J − 1)
degrees of freedom.
Next we consider the estimation of μ and β_j under the null hypothesis H_0: α_1 = α_2 = ... = α_I = 0, i.e., minimizing E_1 = Σ_i Σ_j (y_ij − μ − β_j)². This gives μ̂ = ȳ_oo and β̂_j = ȳ_oj − ȳ_oo, and the minimum value of E_1 decomposes as
Min E_1 = Σ_i Σ_j (y_ij − ȳ_oj)²
        = J Σ_i (ȳ_io − ȳ_oo)² + Σ_i Σ_j (y_ij − ȳ_io − ȳ_oj + ȳ_oo)²
        = sum of squares due to factor A + error sum of squares.
Thus the sum of squares due to deviation from H_0 (or the sum of squares due to rows, or the sum of squares due to factor A) is
SSA = J Σ_{i=1}^I (ȳ_io − ȳ_oo)² = J Σ_{i=1}^I ȳ_io² − IJ ȳ_oo²
and carries (I − 1) degrees of freedom.
For testing H_0: β_1 = β_2 = ... = β_J = 0, minimize E_2 = Σ_i Σ_j (y_ij − μ − α_i)². The normal equations are
∂E_2/∂μ = 0 and ∂E_2/∂α_i = 0, i = 1, 2,..., I,
which on solving give the estimators
μ̂ = ȳ_oo,
α̂_i = ȳ_io − ȳ_oo.
The minimum value of the error sum of squares is obtained as
Min E_2 = Σ_i Σ_j (y_ij − μ̂ − α̂_i)² = Σ_i Σ_j (y_ij − ȳ_io)²
        = I Σ_j (ȳ_oj − ȳ_oo)² + Σ_i Σ_j (y_ij − ȳ_io − ȳ_oj + ȳ_oo)²
        = sum of squares due to factor B + error sum of squares.
The sum of squares due to deviation from H_0 (or the sum of squares due to columns, or the sum of squares due to factor B) is
SSB = I Σ_{j=1}^J (ȳ_oj − ȳ_oo)²
with (J − 1) degrees of freedom, and the total sum of squares partitions as
Σ_i Σ_j (y_ij − ȳ_oo)² = J Σ_i (ȳ_io − ȳ_oo)² + I Σ_j (ȳ_oj − ȳ_oo)² + Σ_i Σ_j (y_ij − ȳ_io − ȳ_oj + ȳ_oo)²
                       = SSA + SSB + SSE.
Note that SSA, SSB and SSE are mutually orthogonal, and that is why the degrees of freedom can be partitioned in this way.
Now, using the theory explained while discussing the likelihood ratio test, or assuming the y_ij's to be independently normally distributed, we have under H_0
SSE/σ² ~ χ²((I − 1)(J − 1))
and
F_1 = [SSA/σ² / (I − 1)] / [SSE/σ² / ((I − 1)(J − 1))]
    = [(I − 1)(J − 1)/(I − 1)] · SSA/SSE
    = MSA/MSE ~ F((I − 1), (I − 1)(J − 1)) under H_0,
where
MSA = SSA/(I − 1),
MSE = SSE/((I − 1)(J − 1)).
The same statistic is also obtained using the likelihood ratio test for H_0.
Similarly, the test statistic for H_0: β_1 = β_2 = ... = β_J = 0 is obtained as
F_2 = [SSB/σ² / (J − 1)] / [SSE/σ² / ((I − 1)(J − 1))]
    = [(I − 1)(J − 1)/(J − 1)] · SSB/SSE
    = MSB/MSE ~ F((J − 1), (I − 1)(J − 1)) under H_0,
where MSB = SSB/(J − 1).
The decision rule is to reject H_0 if F_2 > F_{1−α}((J − 1), (I − 1)(J − 1)).
The same test statistic can also be obtained from the likelihood ratio test.
The analysis of variance table is as follows:

Source                   d.f.            SS    MS    F
Factor A (rows)          I − 1           SSA   MSA   F_1 = MSA/MSE
Factor B (columns)       J − 1           SSB   MSB   F_2 = MSB/MSE
Error (by subtraction)   (I − 1)(J − 1)  SSE   MSE
Total                    IJ − 1          TSS
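The sums of squares and both F statistics can be computed directly from a table; the 3 × 4 data below are hypothetical, and the exactness of the partition TSS = SSA + SSB + SSE is checked:

```python
# Two-way ANOVA, one observation per cell: TSS = SSA + SSB + SSE and the
# statistics F1 = MSA/MSE, F2 = MSB/MSE. Hypothetical 3 x 4 table.
y = [
    [12.0, 14.0, 11.0, 13.0],
    [10.0, 11.0,  9.0, 10.0],
    [15.0, 16.0, 14.0, 15.0],
]
I, J = len(y), len(y[0])
g = sum(sum(r) for r in y) / (I * J)                          # y_bar_oo
ri = [sum(r) / J for r in y]                                  # y_bar_io
cj = [sum(y[i][j] for i in range(I)) / I for j in range(J)]   # y_bar_oj

TSS = sum((y[i][j] - g) ** 2 for i in range(I) for j in range(J))
SSA = J * sum((m - g) ** 2 for m in ri)
SSB = I * sum((m - g) ** 2 for m in cj)
SSE = sum((y[i][j] - ri[i] - cj[j] + g) ** 2
          for i in range(I) for j in range(J))

MSE = SSE / ((I - 1) * (J - 1))
F1 = (SSA / (I - 1)) / MSE
F2 = (SSB / (J - 1)) / MSE
print(round(abs(TSS - (SSA + SSB + SSE)), 10))   # the partition is exact
```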
It can be found, on similar lines as in the case of one-way classification, that
E(MSA) = σ² + J/(I − 1) Σ_{i=1}^I α_i²,
E(MSB) = σ² + I/(J − 1) Σ_{j=1}^J β_j²,
E(MSE) = σ².
If the null hypothesis is rejected, then we use multiple comparison tests to divide the α_i's (or β_j's) into groups such that the α_i's (or β_j's) belonging to the same group are equal and those belonging to different groups are different. Generally, in practice, the experimenter's interest lies more in using the multiple comparison tests for treatment effects than for block effects, so these tests are generally used for the treatment effects only.
Now consider the case of more than one observation per plot, say K observations per cell. The y_ijk are independently drawn from N(μ_ij, σ²), so that the linear model under consideration is
y_ijk = μ_ij + ε_ijk, i = 1, 2,..., I; j = 1, 2,..., J; k = 1, 2,..., K,
where the ε_ijk are identically and independently distributed following N(0, σ²). Thus
E(y_ijk) = μ_ij
         = μ_oo + (μ_io − μ_oo) + (μ_oj − μ_oo) + (μ_ij − μ_io − μ_oj + μ_oo)
         = μ + α_i + β_j + γ_ij
where
μ = μ_oo,
α_i = μ_io − μ_oo,
β_j = μ_oj − μ_oo,
γ_ij = μ_ij − μ_io − μ_oj + μ_oo (the interaction effect),
with
Σ_{i=1}^I α_i = 0, Σ_{j=1}^J β_j = 0, Σ_{i=1}^I γ_ij = 0 (for all j), Σ_{j=1}^J γ_ij = 0 (for all i).
Assume that the design matrix X is of full rank, so that all the parametric functions of μ_ij are estimable.
The null hypotheses of interest are
H_0: α_1 = α_2 = ... = α_I = 0,
H_0: β_1 = β_2 = ... = β_J = 0,
H_0: γ_ij = 0 for all i, j,
and the corresponding alternative hypotheses are
H_1: at least one α_i ≠ α_j for i ≠ j,
H_1: at least one β_i ≠ β_j for i ≠ j,
H_1: at least one γ_ij ≠ γ_ik for j ≠ k.
The least squares estimators of the main effects are
α̂_i = ȳ_ioo − ȳ_ooo, with ȳ_ioo = (1/JK) Σ_j Σ_k y_ijk,
β̂_j = ȳ_ojo − ȳ_ooo, with ȳ_ojo = (1/IK) Σ_i Σ_k y_ijk,
where ȳ_ooo = (1/IJK) Σ_i Σ_j Σ_k y_ijk. The minimum error sum of squares is
SSE = Σ_i Σ_j Σ_k (y_ijk − μ̂ − α̂_i − β̂_j − γ̂_ij)² = Σ_i Σ_j Σ_k (y_ijk − ȳ_ijo)²
with SSE/σ² ~ χ²(IJ(K − 1)).
Now minimizing the error sum of squares under H_0: α_1 = α_2 = ... = α_I = 0, i.e., minimizing
E_1 = Σ_i Σ_j Σ_k (y_ijk − μ − β_j − γ_ij)²,
gives
Min E_1 = Σ_i Σ_j Σ_k (y_ijk − μ̂ − β̂_j − γ̂_ij)²
        = Σ_i Σ_j Σ_k (y_ijk − ȳ_ijo)² + JK Σ_i (ȳ_ioo − ȳ_ooo)²
        = SSE + JK Σ_{i=1}^I (ȳ_ioo − ȳ_ooo)².
Thus the sum of squares due to deviation from H_0, or the sum of squares due to effect A, is
SSA = (sum of squares due to H_0) − SSE = JK Σ_{i=1}^I (ȳ_ioo − ȳ_ooo)²
with SSA/σ² ~ χ²(I − 1).
Similarly, minimizing the error sum of squares under H_0: β_1 = β_2 = ... = β_J = 0 gives
μ̂ = ȳ_ooo,
α̂_i = ȳ_ioo − ȳ_ooo,
γ̂_ij = ȳ_ijo − ȳ_ioo − ȳ_ojo + ȳ_ooo,
and the minimum error sum of squares is
Σ_i Σ_j Σ_k (y_ijk − μ̂ − α̂_i − γ̂_ij)² = SSE + IK Σ_j (ȳ_ojo − ȳ_ooo)².
The sum of squares due to deviation from H_0, or the sum of squares due to effect B, is
SSB = (sum of squares due to H_0) − SSE = IK Σ_{j=1}^J (ȳ_ojo − ȳ_ooo)²
with SSB/σ² ~ χ²(J − 1).
Next, minimizing the error sum of squares under H_0: γ_ij = 0 for all i, j, i.e., minimizing
E_3 = Σ_i Σ_j Σ_k (y_ijk − μ − α_i − β_j)²,
gives
Min E_3 = Σ_i Σ_j Σ_k (y_ijk − μ̂ − α̂_i − β̂_j)²
        = SSE + K Σ_i Σ_j (ȳ_ijo − ȳ_ioo − ȳ_ojo + ȳ_ooo)².
Thus the sum of squares due to deviation from H_0, or the sum of squares due to the interaction effect AB, is
SSAB = (sum of squares due to H_0) − SSE = K Σ_i Σ_j (ȳ_ijo − ȳ_ioo − ȳ_ojo + ȳ_ooo)²
with SSAB/σ² ~ χ²((I − 1)(J − 1)).
Using the independence of SSA, SSB, SSAB and SSE as well as their respective distributions, or using the likelihood ratio test approach, the decision rules for the null hypotheses at the α level of significance are based on F statistics as follows:
F_1 = [IJ(K − 1)/(I − 1)] · SSA/SSE ~ F(I − 1, IJ(K − 1)) under H_0: all α_i = 0,
F_2 = [IJ(K − 1)/(J − 1)] · SSB/SSE ~ F(J − 1, IJ(K − 1)) under H_0: all β_j = 0,
and
F_3 = [IJ(K − 1)/((I − 1)(J − 1))] · SSAB/SSE ~ F((I − 1)(J − 1), IJ(K − 1)) under H_0: all γ_ij = 0.
So
reject H_0: all α_i = 0 if F_1 > F_{1−α}(I − 1, IJ(K − 1)),
reject H_0: all β_j = 0 if F_2 > F_{1−α}(J − 1, IJ(K − 1)),
reject H_0: all γ_ij = 0 if F_3 > F_{1−α}((I − 1)(J − 1), IJ(K − 1)).
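A compact sketch of all four sums of squares and the three F statistics, for hypothetical data with I = 2, J = 2 and K = 3 observations per cell:

```python
# Two-way ANOVA with K observations per cell: SSA, SSB, SSAB, SSE and the
# F statistics. Hypothetical data with I = 2, J = 2, K = 3.
y = [
    [[10.0, 11.0, 12.0], [14.0, 15.0, 13.0]],
    [[20.0, 19.0, 21.0], [17.0, 18.0, 16.0]],
]
I, J, K = 2, 2, 3
cell = [[sum(y[i][j]) / K for j in range(J)] for i in range(I)]   # y_bar_ijo
gi = [sum(cell[i]) / J for i in range(I)]                         # y_bar_ioo
gj = [sum(cell[i][j] for i in range(I)) / I for j in range(J)]    # y_bar_ojo
g = sum(gi) / I                                                   # y_bar_ooo

SSA = J * K * sum((m - g) ** 2 for m in gi)
SSB = I * K * sum((m - g) ** 2 for m in gj)
SSAB = K * sum((cell[i][j] - gi[i] - gj[j] + g) ** 2
               for i in range(I) for j in range(J))
SSE = sum((obs - cell[i][j]) ** 2
          for i in range(I) for j in range(J) for obs in y[i][j])

MSE = SSE / (I * J * (K - 1))
F1 = (SSA / (I - 1)) / MSE
F2 = (SSB / (J - 1)) / MSE
F3 = (SSAB / ((I - 1) * (J - 1))) / MSE
print(SSA, SSB, SSAB, SSE)
```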
If the null hypothesis on the α_i's or the β_j's is rejected, one can use the t-test or a multiple comparison test to find which pairs of effects are different.
It can also be shown that
E(MSA) = σ² + JK/(I − 1) Σ_{i=1}^I α_i²,
E(MSB) = σ² + IK/(J − 1) Σ_{j=1}^J β_j²,
E(MSAB) = σ² + K/((I − 1)(J − 1)) Σ_{i=1}^I Σ_{j=1}^J γ_ij²,
E(MSE) = σ².
The analysis of variance table is as follows:

Source           d.f.            SS     MS                            F
Factor A         I − 1           SSA    MSA = SSA/(I − 1)             F_1 = MSA/MSE
Factor B         J − 1           SSB    MSB = SSB/(J − 1)             F_2 = MSB/MSE
Interaction AB   (I − 1)(J − 1)  SSAB   MSAB = SSAB/((I − 1)(J − 1))  F_3 = MSAB/MSE
Error            IJ(K − 1)       SSE    MSE = SSE/(IJ(K − 1))
Total            IJK − 1         TSS
Tukey's test for nonadditivity:
Consider the setup of the two-way classification with one observation per cell and interaction as
y_ij = μ + α_i + β_j + γ_ij + ε_ij, i = 1, 2,..., I; j = 1, 2,..., J, with Σ_{i=1}^I α_i = 0, Σ_{j=1}^J β_j = 0.
The degrees of freedom then partition as

Factor A         I − 1
Factor B         J − 1
Interaction AB   (I − 1)(J − 1)
Error            0
Total            IJ − 1
There is no degree of freedom for error. The problem is that the two-factor interaction effect and the random error component are subsumed together and cannot be separated out, and there is no estimate of σ².
If no interaction exists, then H_0: γ_ij = 0 for all i, j is accepted, and the additive model
y_ij = μ + α_i + β_j + ε_ij
suffices for testing the hypotheses H_0: α_i = 0 and H_0: β_j = 0, with the error carrying (I − 1)(J − 1) degrees of freedom.
If interaction exists, then H_0: γ_ij = 0 is rejected. In such a case, if we assume that the interaction effect is proportional to the product of the individual effects, i.e.,
γ_ij = λ α_i β_j,
then a test of H_0: λ = 0 can be constructed. Such a test serves as a test for nonadditivity: it helps in knowing the effect of the presence of the interaction effect and whether the interaction enters the model additively. Such a test is given by Tukey's test for nonadditivity, which requires one degree of freedom, leaving (I − 1)(J − 1) − 1 degrees of freedom for error.
Let us assume that the departure from additivity can be specified by introducing a product term and writing the model as
E(y_ij) = μ + α_i + β_j + λ α_i β_j, i = 1, 2,..., I; j = 1, 2,..., J, with Σ_{i=1}^I α_i = 0, Σ_{j=1}^J β_j = 0.
When λ ≠ 0, the model becomes a nonlinear model, and the least squares theory for linear models is not applicable.
Note that, using Σ_{i=1}^I α_i = 0 and Σ_{j=1}^J β_j = 0, we have
ȳ_oo = (1/IJ) Σ_i Σ_j y_ij
     = (1/IJ) Σ_i Σ_j (μ + α_i + β_j + λ α_i β_j + ε_ij)
     = μ + (1/I) Σ_i α_i + (1/J) Σ_j β_j + (λ/IJ)(Σ_i α_i)(Σ_j β_j) + ε̄_oo
     = μ + ε̄_oo,
so
E(ȳ_oo) = μ and μ̂ = ȳ_oo.
Next,
ȳ_io = (1/J) Σ_j y_ij
     = (1/J) Σ_j (μ + α_i + β_j + λ α_i β_j + ε_ij)
     = μ + α_i + (1/J) Σ_j β_j + (λ α_i/J) Σ_j β_j + ε̄_io
     = μ + α_i + ε̄_io,
so E(ȳ_io) = μ + α_i and
α̂_i = ȳ_io − μ̂ = ȳ_io − ȳ_oo.
Similarly,
E(ȳ_oj) = μ + β_j and β̂_j = ȳ_oj − μ̂ = ȳ_oj − ȳ_oo.
Thus μ̂, α̂_i and β̂_j remain unbiased estimators of μ, α_i and β_j, respectively, irrespective of whether λ = 0 or not.
Also,
E(y_ij − ȳ_io − ȳ_oj + ȳ_oo) = λ α_i β_j,
or equivalently
E[(y_ij − ȳ_oo) − (ȳ_io − ȳ_oo) − (ȳ_oj − ȳ_oo)] = λ α_i β_j.
Consider the sum of squares
S = Σ_i Σ_j (y_ij − μ − α_i − β_j − λ α_i β_j)² = Σ_i Σ_j S_ij²,
where S_ij = y_ij − μ − α_i − β_j − λ α_i β_j. The normal equations are
∂S/∂μ = 0 ⇒ Σ_i Σ_j S_ij = 0 (which yields μ̂ = ȳ_oo),
∂S/∂α_i = 0 ⇒ Σ_{j=1}^J (1 + λ β_j) S_ij = 0, i = 1, 2,..., I,
∂S/∂β_j = 0 ⇒ Σ_{i=1}^I (1 + λ α_i) S_ij = 0, j = 1, 2,..., J,
∂S/∂λ = 0 ⇒ Σ_i Σ_j α_i β_j S_ij = 0.
The last equation gives
Σ_i Σ_j α_i β_j (y_ij − μ − α_i − β_j − λ α_i β_j) = 0,
and since Σ_i α_i = Σ_j β_j = 0, this reduces to
λ̂ = Σ_i Σ_j α_i β_j y_ij / (Σ_i α_i² Σ_j β_j²), say,
which can be computed provided α_i and β_j are assumed to be known. Since α_i and β_j can be estimated by α̂_i = ȳ_io − ȳ_oo and β̂_j = ȳ_oj − ȳ_oo irrespective of whether λ = 0 or not, these estimates can be substituted for α_i and β_j.
This substitution gives
λ̂ = Σ_i Σ_j α̂_i β̂_j y_ij / (Σ_i α̂_i² Σ_j β̂_j²)
   = IJ Σ_i Σ_j (ȳ_io − ȳ_oo)(ȳ_oj − ȳ_oo) y_ij / (S_A S_B),
where
S_A = J Σ_{i=1}^I α̂_i² = J Σ_{i=1}^I (ȳ_io − ȳ_oo)²,
S_B = I Σ_{j=1}^J β̂_j² = I Σ_{j=1}^J (ȳ_oj − ȳ_oo)².
Treating α̂_i and β̂_j as fixed, the variance of λ̂ is
Var(λ̂) = σ² / (Σ_i α̂_i² Σ_j β̂_j²) = IJ σ² / (S_A S_B).
Note that if λ = 0, then, conditionally on the estimated effects,
E(λ̂ | α̂_i, β̂_j for all i, j) = Σ_i Σ_j α̂_i β̂_j E(y_ij) / (Σ_i α̂_i² Σ_j β̂_j²)
 = Σ_i Σ_j α̂_i β̂_j (μ + α_i + β_j + 0·α_i β_j) / (Σ_i α̂_i² Σ_j β̂_j²)
 = 0,
since the terms in μ, α_i and β_j vanish when summed against Σ_i α̂_i = 0 and Σ_j β̂_j = 0. Also,
λ̂²/Var(λ̂) = λ̂² S_A S_B / (IJ σ²)
 = IJ [Σ_i Σ_j (ȳ_io − ȳ_oo)(ȳ_oj − ȳ_oo) y_ij]² / (σ² S_A S_B)
 = IJ [Σ_i Σ_j (ȳ_io − ȳ_oo)(ȳ_oj − ȳ_oo)(y_ij − ȳ_io − ȳ_oj + ȳ_oo)]² / (σ² S_A S_B)
 = S_N/σ²,
where y_ij may be replaced by the residual y_ij − ȳ_io − ȳ_oj + ȳ_oo because the omitted terms vanish when summed against the contrasts (ȳ_io − ȳ_oo) and (ȳ_oj − ȳ_oo). Under H_0: λ = 0, S_N/σ² follows a χ² distribution with one degree of freedom, where
S_N = IJ [Σ_i Σ_j (ȳ_io − ȳ_oo)(ȳ_oj − ȳ_oo)(y_ij − ȳ_io − ȳ_oj + ȳ_oo)]² / (S_A S_B)
is the sum of squares due to nonadditivity.
Note that
S_AB = Σ_i Σ_j (y_ij − ȳ_io − ȳ_oj + ȳ_oo)²
with S_AB/σ² following χ²((I − 1)(J − 1)), so
SSE/σ² = (S_AB − S_N)/σ²
is nonnegative and follows χ²((I − 1)(J − 1) − 1).
Moreover, S_N (the SS due to nonadditivity) and SSE are orthogonal. Thus the F test for nonadditivity is
F = [S_N/σ² / 1] / [SSE/σ² / ((I − 1)(J − 1) − 1)]
  = ((I − 1)(J − 1) − 1) · S_N/SSE
  ~ F(1, (I − 1)(J − 1) − 1) under H_0.
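The whole construction can be sketched numerically for a hypothetical 3 × 3 table: compute α̂_i, β̂_j, S_A, S_B, the interaction residuals, S_N, SSE = S_AB − S_N, and Tukey's F with 1 and (I − 1)(J − 1) − 1 degrees of freedom:

```python
# Tukey's one-degree-of-freedom test for nonadditivity in a two-way table
# with one observation per cell. Hypothetical 3 x 3 data.
y = [
    [8.0, 10.0, 12.0],
    [9.0, 12.0, 16.0],
    [10.0, 14.0, 20.0],
]
I, J = 3, 3
g = sum(sum(r) for r in y) / (I * J)
a = [sum(r) / J - g for r in y]                                   # alpha_hat_i
b = [sum(y[i][j] for i in range(I)) / I - g for j in range(J)]    # beta_hat_j

SA = J * sum(ai ** 2 for ai in a)
SB = I * sum(bj ** 2 for bj in b)
# interaction residuals y_ij - y_bar_io - y_bar_oj + y_bar_oo
r = [[y[i][j] - (g + a[i] + b[j]) for j in range(J)] for i in range(I)]
SAB = sum(r[i][j] ** 2 for i in range(I) for j in range(J))

num = sum(a[i] * b[j] * r[i][j] for i in range(I) for j in range(J))
SN = I * J * num ** 2 / (SA * SB)     # SS due to nonadditivity, 1 d.f.
SSE = SAB - SN                        # (I-1)(J-1) - 1 d.f.
F = SN / (SSE / ((I - 1) * (J - 1) - 1))
print(round(SN, 4), round(SSE, 4), round(F, 2))
```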
So the decision rule is to reject H_0: λ = 0 whenever
F > F_{1−α}(1, (I − 1)(J − 1) − 1)
at the α level of significance.
The analysis of variance table for the model including a term for nonadditivity is as follows:

Source                   d.f.                SS    MS                              F
Factor A                 I − 1               S_A   MS_A = S_A/(I − 1)
Factor B                 J − 1               S_B   MS_B = S_B/(J − 1)
Nonadditivity            1                   S_N   MS_N = S_N                      MS_N/MSE
Error (by subtraction)   (I − 1)(J − 1) − 1  SSE   MSE = SSE/((I − 1)(J − 1) − 1)
Total                    IJ − 1              TSS
Comparison of Variances
One of the basic assumptions in the analysis of variance is that the samples are drawn from different normal populations with possibly different means but the same variance. So before carrying out the analysis of variance, the hypothesis of equality of variances needs to be tested. We discuss the test of equality of two variances and of more than two variances.
Case 1: Equality of two variances
H_0: σ_1² = σ_2² = σ².
Suppose there are two independent random samples
A: x_1, x_2,..., x_{n_1}; x_i ~ N(μ_A, σ_A²),
B: y_1, y_2,..., y_{n_2}; y_i ~ N(μ_B, σ_B²).
Then
(n_1 − 1) s_x²/σ² ~ χ²(n_1 − 1) and (n_2 − 1) s_y²/σ² ~ χ²(n_2 − 1).
Moreover, the sample variances s_x² and s_y² are independent. So
F = [(n_1 − 1) s_x²/σ² / (n_1 − 1)] / [(n_2 − 1) s_y²/σ² / (n_2 − 1)] = s_x²/s_y² ~ F(n_1 − 1, n_2 − 1).
So, for testing H_0: σ_1² = σ_2² versus H_1: σ_1² ≠ σ_2², the null hypothesis H_0 is rejected if
F > F_{1−α/2; n_1−1, n_2−1}
or
F < F_{α/2; n_1−1, n_2−1},
where
F_{α/2; n_1−1, n_2−1} = 1 / F_{1−α/2; n_2−1, n_1−1}.
If the null hypothesis H_0: σ_1² = σ_2² is rejected, then the problem of testing the equality of the means is termed the Behrens-Fisher problem. Solutions are available for this problem.
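A minimal sketch of the two-sample variance-ratio statistic (hypothetical samples; statistics.variance uses the unbiased divisor n − 1):

```python
# Variance-ratio test of H0: sigma_1^2 = sigma_2^2 for two independent
# samples. Hypothetical data; statistics.variance uses divisor n - 1.
import statistics

x = [4.2, 5.1, 3.8, 4.9, 5.3, 4.4]     # sample A, n1 = 6
y = [7.0, 9.5, 6.1, 8.8, 7.6]          # sample B, n2 = 5

sx2 = statistics.variance(x)
sy2 = statistics.variance(y)
F = sx2 / sy2                          # ~ F(n1 - 1, n2 - 1) under H0
print(round(sx2, 4), round(sy2, 4), round(F, 4))
```

The observed F would then be compared with the two-sided critical values F_{α/2} and F_{1−α/2} with (5, 4) degrees of freedom from tables.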
Case 2: Equality of more than two variances: Bartlett's test
H_0: σ_1² = σ_2² = ... = σ_k² and H_1: σ_i² ≠ σ_j² for at least one pair i ≠ j; i, j = 1, 2,..., k.
Let there be k independent samples of sizes n_i from normal populations N(μ_i, σ_i²), i = 1, 2,..., k. Let s_1², s_2²,..., s_k² be the k independent unbiased estimators of the population variances σ_1², σ_2²,..., σ_k² respectively, with γ_1, γ_2,..., γ_k degrees of freedom. Under H_0, all the variances are the same, σ² say, and an unbiased estimate of σ² is
s² = Σ_{i=1}^k γ_i s_i² / γ, where γ_i = n_i − 1 and γ = Σ_{i=1}^k γ_i.
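The pooled estimator above can be sketched directly (the sample variances and sizes below are hypothetical):

```python
# Pooled unbiased estimate of the common variance under H0 in Bartlett's
# test: s^2 = sum(gamma_i * s_i^2) / gamma, with gamma_i = n_i - 1 and
# gamma = sum(gamma_i). Sample variances and sizes are hypothetical.
s2 = [2.1, 1.7, 2.6]      # s_i^2 for k = 3 samples
n = [8, 10, 7]            # sample sizes n_i

gamma = [ni - 1 for ni in n]
s2_pooled = sum(gi * si for gi, si in zip(gamma, s2)) / sum(gamma)
print(round(s2_pooled, 4))
```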