1. All students of this batch are requested to keep a copy of the main class notes in order to check for any discrepancy.
2. If you are from outside the batch and find a mistake anywhere in these notes, then before taking screenshots and making fun of it, report it to me directly. Always remember: you are reading this because your own notes are not worthwhile.
3. Do you believe in God? God could not care less! So point number 2 is invalid.
1. What is meant by design of experiment? Explain the terms
Experiment, Treatment, Experimental unit, and experimental
error.
• Experimental error: A fundamental phenomenon in a replicated experiment is that the outcomes from experimental units vary even when the same treatment is applied. This variation is caused by a combination of a systematic effect and a random effect. The random part of the variation is called experimental error.
• The three basic principles of design of experiments are:
(i) Randomisation (ii) Replication (iii) Local control.
• Randomisation: Whenever treatments are allotted to several experimental units, randomisation is applied in order to avoid favouring or disfavouring any particular treatment. Randomisation ensures that each and every treatment has an equal probability of being allocated to any experimental unit; thus randomisation leads to an unbiased estimation of the treatment differences. In design of experiments, mostly agrarian fields are used as experimental units, and such plots are in general correlated in terms of soil fertility. Since the statistical analysis of a designed experiment is carried out by analysis of variance, correlated responses would violate its assumptions. Randomisation is also a way to get rid of this correlation.
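The allocation step described above can be sketched in Python. This is a minimal illustration, not from the notes; the function name and the example treatments/replication numbers are my own.

```python
import random

def randomise(treatments, replications, seed=None):
    """Randomly allot treatments to plots: treatment i appears
    replications[i] times, and every arrangement of the n plots
    is equally likely (the randomisation principle)."""
    plots = [t for t, r in zip(treatments, replications) for _ in range(r)]
    rng = random.Random(seed)
    rng.shuffle(plots)  # uniform over all orderings, hence over arrangements
    return plots

# e.g. 3 treatments replicated 3, 3 and 2 times over n = 8 plots
layout = randomise(["A", "B", "C"], [3, 3, 2], seed=1)
print(layout)
```

Shuffling the full list of plot labels gives every treatment the same chance of landing on any given plot, which is exactly the property the text appeals to.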
• Replication: The second essential feature of an experiment is replication, i.e., applying each treatment to more than one experimental unit. Replication leads to a more reliable estimate of the treatment effects.
Error can be further reduced by the usage of confounded designs where the number of treatment combinations is very large. The use of auxiliary information through analysis of covariance is also useful in reducing error.
3. How do the size and shape of plots and blocks affect the result of a field experiment?
or,
Starting from the Fairfield–Smith variance law, discuss why, for a fixed experimental area, it is better to have smaller plots rather than large ones.
Fairfield–Smith showed empirically that for an experimental design, V_x = V_1/x^b, where V_x is the variance per unit area for plots of area x and b is a soil characteristic measuring the correlation among plots. Here b = 1 means adjacent plots are uncorrelated and b = 0 means they are perfectly correlated. For b = 1, V_x = V_1/x, i.e., as x increases the precision of the experiment increases. On the other hand, for b = 0, V_x = V_1, i.e., V_x is constant irrespective of the plot size. Taking logarithms,
log V_x = log V_1 − b log x.
Obviously, for 0 < b < 1, as x increases log V_x decreases, i.e., the per-plot precision increases. Thus an enhanced plot size ensures higher per-plot precision.
If, however, the total experimental area A remains fixed, then an increase in the plot size decreases the number of plots n = A/x, and the variance of a treatment mean is proportional to V_x/n = V_1 x^{1−b}/A, which increases with x whenever b < 1. Hence, for a fixed experimental area, smaller plots give higher overall precision. The size and shape of a block will ordinarily be determined by the size and shape of the plots and the number of plots in a block.
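The two sides of the argument, per-plot precision versus overall precision for a fixed area, can be checked numerically. A small sketch (the numbers V_1 = 10, b = 0.5 and area 64 are illustrative assumptions):

```python
def variance_per_unit_area(V1, x, b):
    """Fairfield-Smith law: V_x = V_1 / x**b, i.e. log V_x = log V_1 - b*log x."""
    return V1 / x ** b

def variance_of_treatment_mean(V1, x, b, area):
    """With total area fixed, n = area/x plots of size x give a
    treatment-mean variance proportional to V_x / n = V_1 * x**(1-b) / area."""
    n = area / x
    return variance_per_unit_area(V1, x, b) / n

# b = 1 (uncorrelated plots): V_x = V_1 / x
assert variance_per_unit_area(10.0, 4.0, 1.0) == 2.5
# b = 0 (perfectly correlated): V_x constant in plot size
assert variance_per_unit_area(10.0, 4.0, 0.0) == 10.0
# per-plot variance falls as plots grow ...
assert variance_per_unit_area(10.0, 4.0, 0.5) < variance_per_unit_area(10.0, 1.0, 0.5)
# ... but for 0 < b < 1 the fixed-area treatment-mean variance is
# smaller with SMALLER plots
assert variance_of_treatment_mean(10.0, 1.0, 0.5, area=64) < \
       variance_of_treatment_mean(10.0, 4.0, 0.5, area=64)
```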
5. Define treatment contrast and elementary treatment contrast. Show that every treatment contrast can be written as a linear combination of elementary treatment contrasts.
Let t1, t2, . . . , tn be n treatments.
(i) A linear combination Σ_{i=1}^n d_i t_i of t1, t2, . . . , tn is said to be a treatment contrast if Σ_{i=1}^n d_i = 0.
(ii) A contrast of the form t_i − t_j, i ≠ j, is called an elementary contrast.
Next, let L = Σ_{i=1}^n d_i t_i be any treatment contrast and define the partial sums C_i = d_1 + d_2 + · · · + d_i, i = 1, . . . , n, with C_0 = 0. Since Σ d_i = 0 we have C_n = 0, and d_i = C_i − C_{i−1}. Then
L = Σ_{i=1}^n d_i t_i = Σ_{i=1}^{n−1} C_i (t_i − t_{i+1}),
because the coefficient of t_i on the right-hand side is C_i − C_{i−1} = d_i. Thus every treatment contrast is a linear combination of the elementary contrasts t_i − t_{i+1}.
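The telescoping identity above can be verified numerically. A minimal sketch, with an arbitrary contrast d and arbitrary treatment effects t chosen for illustration:

```python
from itertools import accumulate

def elementary_decomposition(d):
    """Coefficients C_1..C_{n-1} such that
    sum(d_i t_i) = sum(C_i * (t_i - t_{i+1})), for any contrast d."""
    assert abs(sum(d)) < 1e-12, "a contrast needs sum(d) = 0"
    return list(accumulate(d))[:-1]  # C_n = 0 is dropped

d = [3, -1, -1, -1]              # a contrast: coefficients sum to zero
t = [2.0, 5.0, -1.0, 4.0]        # arbitrary treatment effects
C = elementary_decomposition(d)   # partial sums [3, 2, 1]
lhs = sum(di * ti for di, ti in zip(d, t))
rhs = sum(Ci * (t[i] - t[i + 1]) for i, Ci in enumerate(C))
assert abs(lhs - rhs) < 1e-12    # both equal the same contrast value
```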
6. Consider a two-way layout fixed effects model. Find the correlation coefficient between (α̂1 − α̂2) and (β̂1 − 2β̂2 + β̂3).
→ Consider the two-way fixed effects model
y_ij = μ + α_i + β_j + e_ij, i = 1(1)p, j = 1(1)q, e_ij iid ∼ N(0, σ²).
We know
α̂_i = ȳi0 − ȳ00, β̂_j = ȳ0j − ȳ00.
Now
cov(α̂1 − α̂2, β̂1 − 2β̂2 + β̂3)
= cov(ȳ10 − ȳ20, ȳ01 − 2ȳ02 + ȳ03)
= cov(α1 − α2 + ē10 − ē20, β1 − 2β2 + β3 + ē01 − 2ē02 + ē03)
= cov(ē10, ē01) − 2 cov(ē10, ē02) + cov(ē10, ē03) − cov(ē20, ē01) + 2 cov(ē20, ē02) − cov(ē20, ē03).
Since ēi0 = (1/q) Σ_j e_ij and ē0j = (1/p) Σ_i e_ij share exactly one error term e_ij, each of these covariances equals σ²/(pq). Hence the sum is
(σ²/pq)(1 − 2 + 1 − 1 + 2 − 1) = 0.
The covariance being zero, the correlation coefficient between (α̂1 − α̂2) and (β̂1 − 2β̂2 + β̂3) is 0.
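The zero covariance can be checked deterministically: write each estimator as a linear function of the pq iid errors and take the dot product of the coefficient vectors (cov = σ² · a·b). A sketch with hypothetical p = 4, q = 5:

```python
import numpy as np

p, q = 4, 5  # hypothetical numbers of levels of the two factors

def coeffs_row_contrast(weights):
    """Coefficient of each e_ij in sum_i w_i * (row mean i); since
    sum(weights) = 0, the grand-mean part of alpha_hat cancels."""
    c = np.zeros((p, q))
    for i, w in enumerate(weights):
        c[i, :] += w / q
    return c.ravel()

def coeffs_col_contrast(weights):
    c = np.zeros((p, q))
    for j, w in enumerate(weights):
        c[:, j] += w / p
    return c.ravel()

a = coeffs_row_contrast([1, -1] + [0] * (p - 2))      # alpha1_hat - alpha2_hat
b = coeffs_col_contrast([1, -2, 1] + [0] * (q - 3))   # beta1_hat - 2 beta2_hat + beta3_hat
# for iid errors, cov = sigma^2 * (a . b); the dot product vanishes
print(a @ b)  # 0.0 up to rounding
```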
7. Define orthogonal contrasts and give an example.
Two observational contrasts, say
L1 = Σ_{i=1}^u c_i y_i = c′y and L2 = Σ_{i=1}^u d_i y_i = d′y,
are said to be orthogonal if
c′d = Σ_{i=1}^u c_i d_i = 0.
When there are more than two contrasts, they are said to be mutually orthogonal if they are orthogonal pairwise.
e.g.: For four observations y1, y2, y3, y4 we can write the following mutually orthogonal contrasts:
(i) y1 + y2 − y3 − y4 (ii) y1 − y2 − y3 + y4
Here the coefficient vectors (1, 1, −1, −1) and (1, −1, −1, 1) satisfy 1 − 1 + 1 − 1 = 0.
7. Obtain the maximum number of mutually orthogonal contrasts.
or,
Given a set of n values y1, y2, . . . , yn, show that the maximum number of mutually orthogonal contrasts among them is (n − 1).
Let L1, L2, . . . , Lm be mutually orthogonal contrasts with coefficient vectors c_1, c_2, . . . , c_m ∈ R^n. Orthogonality gives c_i′c_j = 0 for all i ≠ j, and each L_i being a contrast gives 1′c_i = 0, where 1 = (1, 1, . . . , 1)′. Thus c_1, c_2, . . . , c_m, 1 are m + 1 nonzero, mutually orthogonal vectors in R^n, and mutually orthogonal nonzero vectors are linearly independent. Hence m + 1 ≤ n, i.e., m ≤ n − 1, so at most (n − 1) of L1, L2, . . . , Lm can be mutually orthogonal. The bound is attained (e.g., by the Helmert contrasts), so the maximum number of mutually orthogonal contrasts is (n − 1).
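That the bound (n − 1) is attained can be shown constructively. The sketch below builds Helmert-style contrasts (my choice of construction; row i compares treatment i + 1 against the mean of the first i treatments) and verifies both the contrast and orthogonality conditions:

```python
import numpy as np

def helmert_contrasts(n):
    """(n-1) mutually orthogonal contrast vectors in R^n."""
    C = np.zeros((n - 1, n))
    for i in range(1, n):
        C[i - 1, :i] = 1.0   # +1 on the first i coordinates
        C[i - 1, i] = -i     # -i on coordinate i+1, so the row sums to zero
    return C

C = helmert_contrasts(5)
assert np.allclose(C.sum(axis=1), 0)        # each row is a contrast
G = C @ C.T
assert np.allclose(G, np.diag(np.diag(G)))  # off-diagonal dot products vanish
```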
Q. What is a completely randomised design? Give the layout and complete analysis of CRD.
In a completely randomised design (CRD), the notions of randomisation and replication are employed.
Suppose there are t treatments and the ith treatment is replicated r_i times. This can be looked upon as a setup similar to a one-way ANOVA fixed effects model, where a single factor has t levels and the ith level consists of r_i observations. Let Σ_{i=1}^t r_i = n, i.e., there are in total n plots (experimental units).
Layout: The term layout refers to the placement of treatments on the experimental units according to the conditions of the design.
Here the n plots are randomly subdivided into t parts, where the jth part contains r_j plots. The first treatment is allotted to the first set of r1 plots, the second treatment to the next r2 plots, and so on. The arrangement of the n plots into groups of r1, r2, . . . , rt plots is carried out in such a way that all the n!/(r1! · · · rt!) possible arrangements have an equal chance of occurrence. The technique of local control is not used here.
Analysis: Here we consider the model
y_ij = μ + α_i + e_ij, i = 1(1)t, j = 1(1)r_i,
where
y_ij = observation corresponding to the jth replication of the ith treatment
μ = general effect
α_i = additional effect due to the ith treatment
e_ij = random error
We assume (i) e_ij iid ∼ N(0, σ_e²) and (ii) Σ_{i=1}^t r_i α_i = 0.
The least squares estimates are
μ̂ = ȳ00
α̂_i = ȳi0 − ȳ00
It can be shown that, with MST = mean square due to treatments and MSE = mean square due to error,
E(MST) = σ_e² + (1/(t − 1)) Σ_{i=1}^t r_i α_i²
E(MSE) = σ_e²
ANOVA table:
Source      df      SS    MS     F
Treatment   t − 1   SST   MST    F = MST/MSE
Error       n − t   SSE   MSE
Total       n − 1   TSS
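The ANOVA computation above can be sketched directly. A minimal one-way ANOVA for a CRD (the simulated data and function name are illustrative assumptions, not from the notes):

```python
import numpy as np

def crd_anova(groups):
    """One-way ANOVA for a CRD; groups is a list of response arrays,
    one per treatment (replication numbers r_i may differ)."""
    y = np.concatenate(groups)
    n, t = y.size, len(groups)
    grand = y.mean()
    sst = sum(g.size * (g.mean() - grand) ** 2 for g in groups)  # treatment SS
    sse = sum(((g - g.mean()) ** 2).sum() for g in groups)       # error SS
    mst, mse = sst / (t - 1), sse / (n - t)
    return mst / mse, (t - 1, n - t)

rng = np.random.default_rng(0)
# three treatments, 6 replications each; the third has a shifted mean
groups = [rng.normal(mu, 1.0, size=6) for mu in (0.0, 0.0, 2.0)]
F, df = crd_anova(groups)
print(F, df)  # compare F against the F_{alpha; t-1, n-t} quantile
```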
Inference: If the observed F > F_{α; t−1, n−t}, we reject the hypothesis of equal treatment effects at level α and conclude that the α_i's are not all equal. In such cases it becomes necessary to estimate and test individual treatment contrasts.
An unbiased estimate of the contrast Σ_{i=1}^t L_i α_i (where Σ_{i=1}^t L_i = 0) is
Σ_{i=1}^t L_i α̂_i = Σ_{i=1}^t L_i (ȳi0 − ȳ00) = Σ_{i=1}^t L_i ȳi0.
Note that
var(Σ_i L_i α̂_i) = var(Σ_i L_i (ȳi0 − ȳ00)) = var(Σ_i L_i ȳi0) = σ_e² Σ_{i=1}^t L_i²/r_i.
In fact,
Σ_{i=1}^t L_i α̂_i ∼ N(Σ_{i=1}^t L_i α_i, σ_e² Σ_{i=1}^t L_i²/r_i)
and
SSE/σ_e² = Σ_{i=1}^t Σ_{j=1}^{r_i} (y_ij − ȳi0)² / σ_e² ∼ χ²_{n−t}, independently.
Hence
(Σ_i L_i ȳi0 − Σ_i L_i α_i) / √(MSE Σ_i L_i²/r_i) ∼ t_{n−t}.
We reject H0 : Σ_{i=1}^t L_i α_i = c at level α if
|Σ_i L_i ȳi0 − c| / √(MSE Σ_i L_i²/r_i) > t_{α/2; n−t}.
A pairwise comparison H0 : α_i = α_{i′} can be done by the statistic
(ȳi0 − ȳi′0) / √(MSE (1/r_i + 1/r_{i′})) ∼ t_{n−t} under H0.
A 100(1 − α)% confidence interval for α_i − α_{i′} is
[ ȳi0 − ȳi′0 − t_{α/2; n−t} √(MSE (1/r_i + 1/r_{i′})),  ȳi0 − ȳi′0 + t_{α/2; n−t} √(MSE (1/r_i + 1/r_{i′})) ].
The common critical difference (CCD) for testing H0 : α_i = α_{i′} through the difference of estimates α̂_i − α̂_{i′} = ȳi0 − ȳi′0 is
t_{α/2; n−t} √(MSE (1/r_i + 1/r_{i′})).
Advantages and disadvantages:
The main advantages of CRD are
(i) Complete flexibility: any number of treatments and any number of replications per treatment may be used.
(ii) The design provides the maximum number of degrees of freedom for the error sum of squares (SSE), and as a result the estimate σ̂_e² = MSE is reliable and the design sensitive.
The chief disadvantage of the design is that it is usually suited only to a small number of treatments and to homogeneous experimental material. When a large number of treatments is included, a large amount of experimental material is needed, which generally increases the variation among the experimental units.
Q. What is a randomised block design? Give the layout and complete analysis of RBD.
One problem with the completely randomised design is that it does not take into account factors like soil fertility. Thus, when the experimental units are not homogeneous, CRD should not be used. The simplest design which enables us to take care of variability among the units is the randomised block design (RBD).
Suppose there are t treatments and each treatment is replicated an equal number of times, say r, i.e., there are n = rt plots (experimental units), and the fertility of the plots is not homogeneous.
In an RBD the plots are first divided into r blocks, so that the variation within a block is controlled. Each block consists of the same number of plots, equal to the total number of treatments. The treatments are allocated within the blocks in such a way that each treatment appears exactly once in each block.
Layout: Let there be 5 treatments A, B, C, D, E, each replicated four times. The whole experimental area is divided into four homogeneous strata (blocks), and the treatments are then allocated at random to the plots within each block. A particular layout may be as follows:
Block
I    A E B D C
II   E D C B A
III  C B A E D
IV   A D E C B
Analysis: The data collected from an experiment with a randomised block design are essentially two-way classified data with two factors (blocks and treatments). There are rt cells in the two-way table, with one observation per cell. Let y_ij be the observation on the ith treatment in the jth block.
The model is given by
y_ij = μ + γ_i + β_j + e_ij, i = 1(1)t, j = 1(1)r,
where
μ = general effect
γ_i = additional effect due to the ith treatment
β_j = additional effect due to the jth block
e_ij = random error.
Assumptions:
(i) Σ_{i=1}^t γ_i = 0 (ii) Σ_{j=1}^r β_j = 0 (iii) e_ij iid ∼ N(0, σ_e²).
Here we want to test
H01 : γ_i = 0 ∀ i vs H11 : at least one inequality in H01.
We do not usually test for block effects: the blocks are constructed in such a way that within a block no appreciable variation is present while between blocks substantial variation is present, so by construction the blocks have a significant impact on the response.
The least squares estimates are given by
μ̂ = ȳ00
γ̂_i = ȳi0 − ȳ00
β̂_j = ȳ0j − ȳ00
With MST = mean square due to treatments, MSB = mean square due to blocks and MSE = mean square due to error, it can be shown that
E(MST) = σ_e² + (r/(t − 1)) Σ_{i=1}^t γ_i²
E(MSB) = σ_e² + (t/(r − 1)) Σ_{j=1}^r β_j²
E(MSE) = σ_e²
Under H01, MST/MSE ∼ F_{t−1, (r−1)(t−1)}.
Under H02 (no block effects), MSB/MSE ∼ F_{r−1, (r−1)(t−1)}.
ANOVA table:
Source      df               SS    MS    F
Treatment   t − 1            SST   MST   F = MST/MSE
Block       r − 1            SSB   MSB   F = MSB/MSE
Error       (t − 1)(r − 1)   SSE   MSE
Total       rt − 1           TSS
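The RBD decomposition can be sketched directly from the cell means. A minimal two-way ANOVA (the 3 × 3 data matrix is an illustrative assumption):

```python
import numpy as np

def rbd_anova(y):
    """Two-way ANOVA for an RBD; y[i, j] = response of treatment i in block j."""
    t, r = y.shape
    grand = y.mean()
    sst = r * ((y.mean(axis=1) - grand) ** 2).sum()  # treatments
    ssb = t * ((y.mean(axis=0) - grand) ** 2).sum()  # blocks
    tss = ((y - grand) ** 2).sum()
    sse = tss - sst - ssb                            # TSS = SST + SSB + SSE
    mse = sse / ((t - 1) * (r - 1))
    return sst / (t - 1) / mse, ssb / (r - 1) / mse  # F_treatment, F_block

y = np.array([[10.0, 12.0, 11.0],
              [13.0, 15.0, 13.0],
              [ 9.0, 11.0, 10.0]])
Ft, Fb = rbd_anova(y)
print(Ft, Fb)  # 97.0 and 28.0 for this data, up to rounding
```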
For pairwise comparisons, H0ij : γ_i = γ_j vs H1ij : γ_i ≠ γ_j, we have t(t − 1)/2 such hypotheses to test.
Note that
((γ̂_i − γ̂_j) − (γ_i − γ_j)) / √(2σ_e²/r) ∼ N(0, 1).
Since σ_e² is unknown, it is estimated by
σ̂_e² = mean square error = SSE/df, where df = (r − 1)(t − 1).
Now, under H0ij,
(γ̂_i − γ̂_j)/√(2σ_e²/r) ∼ N(0, 1) and SSE/σ_e² ∼ χ²_{(r−1)(t−1)}, independently,
so that
(γ̂_i − γ̂_j) / √(2 MSE/r) ∼ t_{(r−1)(t−1)}.
We reject H0ij at level α if
|γ̂_i − γ̂_j| > t_{α/2; (t−1)(r−1)} √(2 MSE/r).
This is the common critical difference.
A 100(1 − α)% confidence interval for the difference γ_i − γ_j of two treatment effects is
[ γ̂_i − γ̂_j − t_{α/2; (t−1)(r−1)} √(2 MSE/r),  γ̂_i − γ̂_j + t_{α/2; (t−1)(r−1)} √(2 MSE/r) ].
Q. Show that in an RBD a block contrast is orthogonal to a treatment contrast.
The RBD model is given by y_ij = μ + α_i + β_j + e_ij, i = 1(1)t, j = 1(1)r, where
α_i = additional effect due to the ith treatment
β_j = additional effect due to the jth block.
A treatment contrast is given by Σ_{i=1}^t c_i α_i with Σ_{i=1}^t c_i = 0, and a block contrast by Σ_{j=1}^r d_j β_j with Σ_{j=1}^r d_j = 0.
Now
cov(Σ_i c_i α̂_i, Σ_j d_j β̂_j)
= cov(Σ_i c_i (ȳi0 − ȳ00), Σ_j d_j (ȳ0j − ȳ00))
= cov(Σ_i c_i ȳi0 − ȳ00 Σ_i c_i, Σ_j d_j ȳ0j − ȳ00 Σ_j d_j)   [the ȳ00 terms vanish since Σ c_i = Σ d_j = 0]
= cov(Σ_i c_i ȳi0, Σ_j d_j ȳ0j) = Σ_i Σ_j c_i d_j cov(ȳi0, ȳ0j)
= Σ_i Σ_j c_i d_j cov(α_i + ēi0, β_j + ē0j) = Σ_i Σ_j c_i d_j cov(ēi0, ē0j)
= Σ_i Σ_j c_i d_j (σ²/rt) = (σ²/rt)(Σ_i c_i)(Σ_j d_j) = 0,
since ēi0 = (1/r) Σ_j e_ij and ē0j = (1/t) Σ_i e_ij share exactly one error term, so cov(ēi0, ē0j) = σ²/rt. Hence the estimated contrasts are uncorrelated, i.e., orthogonal.
Q. Obtain the efficiency of RBD over CRD.
Consider an experimental design with t treatments, each replicated r times, so that the total number of plots is rt. Let the error mean square of the RBD be S_E² and the mean square due to blocks be S_B².
If a uniformity trial is applied to the RBD, all rt plots receive the same treatment, so there is no variation due to treatments and the treatment df gets added to the error df; i.e., the error df under the uniformity trial is (r − 1)(t − 1) + (t − 1) = (t − 1)r, and the error sum of squares under the uniformity trial is r(t − 1)S_E².
If the same uniformity trial is analysed as a CRD on the same rt plots, there is no separation of block effects either, so the error df becomes r(t − 1) + (r − 1) = rt − 1, and the error mean square of the CRD is
S_E′² = (r(t − 1)S_E² + (r − 1)S_B²) / (rt − 1).
Hence the efficiency of RBD relative to CRD is
E = S_E′²/S_E² = (r(t − 1)S_E² + (r − 1)S_B²) / ((rt − 1)S_E²).
Note that E ≥ 1 iff S_B² ≥ S_E², i.e., RBD exhibits an improvement over CRD only if the block variation is larger than the error variation.
Q. What is a Latin square design? Why is it called an incomplete design/layout? Give a complete analysis of LSD.
In a randomised block design the whole experimental area is divided into relatively homogeneous groups. It may happen, however, that the field exhibits fertility in strips, i.e., alternately high- and low-fertility zones arise. RBD will be effective if the blocks run parallel to the strips, but otherwise RBD results in inefficiency. As a remedy, grouping can be done in two ways, both row-wise and column-wise, since variation can be present in both directions. Such a design is called a Latin square design (LSD).
In this design the number of treatments is equal to the number of replications of each treatment, say m. These m² units are arranged in m rows and m columns, and the m treatments are allocated to the m² plots at random in such a manner that each treatment occurs once and only once in each row and in each column.
An example of a 4 × 4 LSD is given by
D C B A
C B A D
B A D C
A D C B
LSD is essentially an incomplete three-way layout. Here variation is controlled through rows, columns and treatments, and each of these factors has m levels, so a complete three-way layout would require m³ observations. Instead of m³ observations, however, the LSD contains only m² observations. Hence the LSD is regarded as an incomplete layout.
Analysis of LSD:
Let y_ijk be the response from the ith row and jth column receiving the kth treatment. The model is
y_ijk = μ + α_i + β_j + γ_k + e_ijk, (i, j, k) ∈ S,
where S is the set of the m² triplets (i, j, k) that actually occur in the layout, α_i, β_j, γ_k are the additional effects due to the ith row, jth column and kth treatment respectively, Σ α_i = Σ β_j = Σ γ_k = 0, and e_ijk iid ∼ N(0, σ²).
The residual sum of squares is given by
E = Σ_{(i,j,k)∈S} (y_ijk − μ − α_i − β_j − γ_k)².
Setting ∂E/∂μ = 0 gives Σ_{(i,j,k)∈S} (y_ijk − μ − α_i − β_j − γ_k) = 0, whence
μ̂ = (1/m²) Σ_{(i,j,k)∈S} y_ijk = ȳ000.
Similarly, ∂E/∂α_i = 0 gives α̂_i = ȳi00 − ȳ000, and in the same way β̂_j = ȳ0j0 − ȳ000 and γ̂_k = ȳ00k − ȳ000.
The total sum of squares then decomposes as
Σ_{(i,j,k)∈S} (y_ijk − ȳ000)²
= m Σ_i (ȳi00 − ȳ000)² + m Σ_j (ȳ0j0 − ȳ000)² + m Σ_k (ȳ00k − ȳ000)² + Σ_{(i,j,k)∈S} (y_ijk − ȳi00 − ȳ0j0 − ȳ00k + 2ȳ000)²,
i.e., TSS = SSR + SSC + SST + SSE.
Here we want to test whether there is any effect of rows, columns or treatments on the yield (observations), i.e., we test
H01 : α_i = 0 ∀ i vs H11 : at least one inequality
H02 : β_j = 0 ∀ j vs H12 : at least one inequality
H03 : γ_k = 0 ∀ k vs H13 : at least one inequality.
The df of TSS is (m² − 1), the df of each of SSR, SSC, SST is (m − 1), and the df of SSE is (m² − 1) − 3(m − 1) = (m − 1)(m − 2).
Now it can be shown that
E(MSR) = σ² + (m/(m − 1)) Σ_i α_i²
E(MSC) = σ² + (m/(m − 1)) Σ_j β_j²
E(MST) = σ² + (m/(m − 1)) Σ_k γ_k²
E(MSE) = σ²
Under H01, MSR/MSE ∼ F_{(m−1), (m−1)(m−2)}
Under H02, MSC/MSE ∼ F_{(m−1), (m−1)(m−2)}
Under H03, MST/MSE ∼ F_{(m−1), (m−1)(m−2)}.
ANOVA table:
Source      df               SS    MS    F
Row         m − 1            SSR   MSR   MSR/MSE
Column      m − 1            SSC   MSC   MSC/MSE
Treatment   m − 1            SST   MST   MST/MSE
Error       (m − 1)(m − 2)   SSE   MSE
Total       m² − 1           TSS
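The defining property of a Latin square — each treatment once per row and once per column — is easy to construct and verify by cyclic shifts. A minimal sketch (the cyclic construction is one standard way, chosen here for illustration):

```python
def cyclic_latin_square(m):
    """m x m Latin square by cyclic shifts: cell (i, j) gets treatment (i + j) mod m."""
    return [[(i + j) % m for j in range(m)] for i in range(m)]

def is_latin(square):
    """Check that every row and every column contains each symbol exactly once."""
    m = len(square)
    symbols = set(range(m))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(m)} == symbols for j in range(m))
    return rows_ok and cols_ok

sq = cyclic_latin_square(5)
assert is_latin(sq)
# only m^2 of the m^3 possible (row, column, treatment) cells are observed,
# which is why the LSD is called an incomplete three-way layout
```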
An m × m Latin square design with m treatments A, B, . . . is said to be in its standard form if the letters in the first row and the first column are arranged in natural order. For m = 4,
A B C D
B C D A
C D A B
D A B C
is an example of a standard Latin square design. From a standard m × m LSD we can obtain m! × (m − 1)! different LSDs by permuting all m columns and all the rows except the first.
Two m × m Latin squares are said to be orthogonal if, when they are superimposed, each of the m² ordered pairs of letters occurs once and only once. Examples of orthogonal Latin squares:
I:        II:       I on II:
A B C     A B C     AA BB CC
B C A     C A B     BC CA AB
C A B     B C A     CB AC BA

III:      IV:       III on IV:
C B A     B C A     CB BC AA
B A C     A B C     BA AB CC
A C B     C A B     AC CA BB
Here I and II are mutually orthogonal Latin squares; III and IV are also an example of mutually orthogonal Latin squares.
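The superposition criterion can be checked mechanically: collect the ordered pairs from corresponding cells and count the distinct ones. A small sketch using the squares I and II from the text:

```python
def is_orthogonal(sq1, sq2):
    """Two m x m Latin squares are orthogonal iff superimposing them
    yields every ordered pair of symbols exactly once (m*m distinct pairs)."""
    m = len(sq1)
    pairs = {(sq1[i][j], sq2[i][j]) for i in range(m) for j in range(m)}
    return len(pairs) == m * m

I  = [list("ABC"), list("BCA"), list("CAB")]
II = [list("ABC"), list("CAB"), list("BCA")]
assert is_orthogonal(I, II)
assert not is_orthogonal(I, I)  # a square is never orthogonal to itself (m > 1)
```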
Q. Obtain the efficiency of LSD relative to RBD.
Let S_E² be the error mean square of the LSD, and let S_R² and S_C² be the mean squares due to rows and columns respectively. The efficiency of LSD relative to RBD is given by E1 = S_E′²/S_E², where S_E′² is the error mean square the corresponding RBD would have had.
We apply a uniformity trial to the LSD, which consists in using the same treatment on all m² units. There is then no treatment variation, so the treatment degrees of freedom are added to the error df:
error df = (m − 1)(m − 2) + (m − 1) = (m − 1)².
(i) If the rows are regarded as the blocks of an RBD, the column classification is abandoned and the column variation also merges into the error. So the error df will be (m − 1)² + (m − 1) = m(m − 1), and
Error SS = (m − 1)S_C² + (m − 1)²S_E².
The error mean square of the RBD (with rows as blocks) is therefore
S_E′² = ((m − 1)S_C² + (m − 1)²S_E²) / (m(m − 1)) = (S_C² + (m − 1)S_E²)/m,
so that E1 = (S_C² + (m − 1)S_E²)/(m S_E²), and E1 ≥ 1 iff S_C² ≥ S_E².
(ii) Similarly, with columns as blocks,
S_E′²/S_E² = ((m − 1)S_E² + S_R²) / (m S_E²).
For a CRD both the row and the column classifications are abandoned, so the error SS of the CRD under the uniformity trial is
(m − 1)²S_E² + (m − 1)S_C² + (m − 1)S_R², with df m² − 1,
and hence
S_E0² = ((m − 1)²S_E² + (m − 1)S_C² + (m − 1)S_R²) / (m² − 1) = ((m − 1)S_E² + S_C² + S_R²) / (m + 1).
The efficiency of LSD relative to CRD is therefore given by
S_E0²/S_E² = ((m − 1)S_E² + S_C² + S_R²) / ((m + 1)S_E²),
where S_R² = mean square due to the rows of the LSD and S_C² = mean square due to the columns.
Q. Show that in an LSD a treatment contrast is orthogonal to a row contrast.
A treatment contrast is given by Σ_{k=1}^m c_k γ_k with Σ_{k=1}^m c_k = 0, and a row contrast by Σ_{i=1}^m d_i α_i with Σ_{i=1}^m d_i = 0. With γ̂_k = ȳ00k − ȳ000 and α̂_i = ȳi00 − ȳ000,
cov(Σ_k c_k γ̂_k, Σ_i d_i α̂_i)
= cov(Σ_k c_k ȳ00k − ȳ000 Σ_k c_k, Σ_i d_i ȳi00 − ȳ000 Σ_i d_i)
= cov(Σ_k c_k ȳ00k, Σ_i d_i ȳi00)
= cov(Σ_k c_k (μ + γ_k + ē00k), Σ_i d_i (μ + α_i + ēi00))
= Σ_k Σ_i c_k d_i cov(ē00k, ēi00).
Each treatment occurs exactly once in each row, so ē00k (the mean of the m errors of the cells receiving treatment k) and ēi00 (the mean of the m errors of row i) share exactly one error term, whence cov(ē00k, ēi00) = σ²/m². Therefore
cov(Σ_k c_k γ̂_k, Σ_i d_i α̂_i) = (σ²/m²)(Σ_k c_k)(Σ_i d_i) = 0.
The same argument applies to any pair of the three classifications (rows, columns, treatments).
Example: a 6 × 6 cyclic Latin square.
A B C D E F
B C D E F A
C D E F A B
D E F A B C
E F A B C D
F A B C D E
Q. What is the average variance of elementary treatment contrasts? For a completely randomised design with v treatments and replication numbers r1, r2, . . . , rv, find the average variance of all the elementary treatment contrasts. Find r1, r2, . . . , rv so that the average variance is minimum subject to Σ_{i=1}^v r_i = n.
• Let t1, t2, . . . , tv be the v treatment effects and t̂_i the estimated effect of t_i. A difference (t_i − t_j) is called an elementary treatment contrast, and there are v(v − 1)/2 such elementary contrasts. The variance of a particular elementary contrast is V(t̂_i − t̂_j), so the average variance is given by
avg. var = (1/(v(v − 1)/2)) Σ_{i<j} V(t̂_i − t̂_j).
In a CRD, V(t̂_i − t̂_j) = σ²(1/r_i + 1/r_j). Since each index i occurs in (v − 1) of the pairs,
Σ_{i<j} (1/r_i + 1/r_j) = (v − 1) Σ_{i=1}^v 1/r_i,
and hence
avg. var = (2σ²/(v(v − 1))) (v − 1) Σ_{i=1}^v 1/r_i = (2σ²/v) Σ_{i=1}^v 1/r_i.
Next we need to minimise (2σ²/v) Σ_{i=1}^v 1/r_i subject to the restriction Σ_{i=1}^v r_i = n. Define the Lagrangian
g(r1, r2, . . . , rv, λ) = (2σ²/v) Σ_{i=1}^v 1/r_i + λ(Σ_{i=1}^v r_i − n).
Now
∂g/∂r_i = 0 ⇒ −2σ²/(v r_i²) + λ = 0 ⇒ r_i = √(2σ²/(vλ)),
the same value for every i. Substituting in Σ_{i=1}^v r_i = n gives
r_i = n/v ∀ i,
i.e., the average variance is minimised when each treatment receives an equal number of replications.
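The conclusion can be checked numerically by comparing an equal allocation with an unequal one of the same total size. A sketch (σ² = 1 and the allocations are illustrative):

```python
def average_variance(r, sigma2=1.0):
    """Average variance of all elementary contrasts in a CRD:
    (2*sigma^2 / v) * sum(1/r_i), with v = len(r)."""
    v = len(r)
    return 2.0 * sigma2 / v * sum(1.0 / ri for ri in r)

n, v = 20, 4
equal = [n // v] * v      # [5, 5, 5, 5]
unequal = [2, 4, 6, 8]    # same total n = 20
assert sum(equal) == sum(unequal) == n
assert average_variance(equal) < average_variance(unequal)
print(average_variance(equal), average_variance(unequal))  # 0.4 vs about 0.52
```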
Average variance in RBD: The RBD model is given by
y_ij = μ + γ_i + β_j + e_ij, i = 1(1)t, j = 1(1)r,
with γ̂_i = ȳi0 − ȳ00. Therefore
V(γ̂_i − γ̂_j) = V((γ_i − γ_j) + (ēi0 − ēj0)) = V(ēi0 − ēj0)
= V(ēi0) + V(ēj0)   [cov(ēi0, ēj0) = 0]
= 2σ²/r.
Every elementary contrast has the same variance, so the average variance is
(1/(t(t − 1)/2)) Σ_{i<j} V(γ̂_i − γ̂_j) = 2σ²/r.
Average variance in LSD: For an LSD with m treatments, k = 1(1)m,
γ_k = additional effect due to the kth treatment, γ̂_k = ȳ00k − ȳ000,
and we assume Σ_{k=1}^m γ_k = 0. Therefore
V(γ̂_k − γ̂_k′) = V((γ_k − γ_k′) + (ē00k − ē00k′)) = V(ē00k) + V(ē00k′) = 2σ²/m,
and the average variance is likewise 2σ²/m.
Q. Analyse an RBD with one missing observation. Also obtain the corresponding average variance.
Suppose the observation y_ij = x on the ith treatment in the jth block is missing. Define:
T_i′0 = total of the observations on treatment i′, for the treatments not containing the missing observation, i′ = 1, 2, . . . , i − 1, i + 1, . . . , t
T_0j′ = total of the observations in block j′, for the blocks not containing the missing observation, j′ = 1, 2, . . . , j − 1, j + 1, . . . , r
T_i0 = total of the (r − 1) known observations on the ith treatment
T_0j = total of the (t − 1) known observations in the jth block
T_00 = total of all known observations, so that the grand total is G = T_00 + x.
Estimation of the missing value:
Total SS = Σ_{(i′,j′)≠(i,j)} y²_{i′j′} + x² − (T_00 + x)²/(tr),
where the correction factor is CF = (T_00 + x)²/(tr). The error SS, as a function of x, is
SSE(x) = Σ_{(i′,j′)≠(i,j)} y²_{i′j′} + x² − (1/r)[Σ_{i′≠i} T²_{i′0} + (T_i0 + x)²] − (1/t)[Σ_{j′≠j} T²_{0j′} + (T_0j + x)²] + (T_00 + x)²/(rt).
We obtain x by minimising SSE(x):
d SSE(x)/dx = 0
⇒ 2x − (2/r)(T_i0 + x) − (2/t)(T_0j + x) + (2/rt)(T_00 + x) = 0
⇒ xrt − t T_i0 − xt − r T_0j − xr + T_00 + x = 0
⇒ x(rt − t − r + 1) = t T_i0 + r T_0j − T_00
⇒ x̂ = (t T_i0 + r T_0j − T_00) / ((t − 1)(r − 1)).
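The closed-form estimate is a one-liner to apply. A sketch (the totals in the example call are hypothetical numbers, not from the notes):

```python
def rbd_missing_value(T_i0, T_0j, T_00, t, r):
    """Estimate of a single missing observation in an RBD:
    x_hat = (t*T_i0 + r*T_0j - T_00) / ((t-1)*(r-1)),
    where T_i0, T_0j, T_00 are totals over the KNOWN observations only."""
    return (t * T_i0 + r * T_0j - T_00) / ((t - 1) * (r - 1))

# hypothetical layout: t = 4 treatments, r = 3 blocks, one cell missing
x_hat = rbd_missing_value(T_i0=21.0, T_0j=30.0, T_00=110.0, t=4, r=3)
print(x_hat)  # (4*21 + 3*30 - 110) / 6
```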
We then plug the estimated missing value x̂ into the data, re-analyse by the usual RBD technique, and calculate the sum of squares due to treatments (SST) and due to error (SSE). One degree of freedom is lost from the error:
df due to error = (t − 1)(r − 1) − 1.
Under H0 (equal treatment effects),
F = MST/MSE
follows approximately F_{t−1, (t−1)(r−1)−1}, and we reject H0 at level α if F > F_{α; (t−1), (t−1)(r−1)−1}.
ANOVA table:
Source      df                   SS    MS    F
Treatment   t − 1                SST   MST   F = MST/MSE
Block       r − 1                SSB   MSB
Error       (t − 1)(r − 1) − 1   SSE   MSE
Total       rt − 2               TSS
• The estimate of the mean of the ith treatment (the one with the missing value) is τ̂_i = (T_i0 + x̂)/r.
Now, as
var(T_i0) = (r − 1)σ²
var(T_0j) = (t − 1)σ²
cov(T_i0, T_0j) = 0
cov(T_i0, T_00) = (r − 1)σ²
cov(T_0j, T_00) = (t − 1)σ²,
it can be shown that, for all i ≠ i′,
var(τ̂_i − τ̂_i′) = (σ²/r)(1 + t/((t − 1)(r − 1))) + σ²/r = (σ²/r)(2 + t/((t − 1)(r − 1))).
There are (t − 1) elementary contrasts involving the treatment with the missing observation, each with this variance, and (t − 1)(t − 2)/2 contrasts among the remaining treatments, each with variance 2σ²/r. Hence
avg. var = [ (t − 1)(σ²/r)(2 + t/((t − 1)(r − 1))) + ((t − 1)(t − 2)/2)(2σ²/r) ] / [ (t − 1) + (t − 1)(t − 2)/2 ].
Q. Analyse an LSD with one missing observation. Also obtain the corresponding average variance.
We conduct a Latin square design with m treatments on m² plots, and suppose the yield from the ith row and jth column, receiving the kth treatment, is missing. The LSD model is given by
y_ijk = μ + α_i + β_j + τ_k + e_ijk, (i, j, k) ∈ S,
where S contains the m² observed triplets, i, j, k = 1(1)m, and τ_k = additional effect due to the kth treatment.
We define the following:
T_i′00 = total of the observations in the i′th row, for rows not containing the missing observation, i′ = 1, 2, . . . , i − 1, i + 1, . . . , m
T_0j′0 = total of the observations in the j′th column, for columns not containing the missing observation, j′ = 1, 2, . . . , j − 1, j + 1, . . . , m
T_00k′ = total of the observations on the k′th treatment, for treatments not containing the missing observation, k′ = 1, 2, . . . , k − 1, k + 1, . . . , m
T_000 = total of all (m² − 1) known observations
T_i00 = total of the (m − 1) known observations of the row containing the missing observation
T_0j0 = total of the (m − 1) known observations of the column containing the missing observation
T_00k = total of the (m − 1) known observations of the treatment containing the missing observation
y_ijk = x = missing observation.
Now the total sum of squares is
TSS = Σ_{(i′,j′,k′)≠(i,j,k)} y²_{i′j′k′} + x² − (T_000 + x)²/m².
Minimising the error SS with respect to x:
∂SSE(x)/∂x = 0
⇒ 2x + 4(T_000 + x)/m² − 2(T_i00 + x)/m − 2(T_0j0 + x)/m − 2(T_00k + x)/m = 0
⇒ x(m² − 3m + 2) = m(T_i00 + T_0j0 + T_00k) − 2T_000
⇒ x̂ = (m(T_i00 + T_0j0 + T_00k) − 2T_000) / ((m − 1)(m − 2)).
Thus we have obtained the missing observation as x̂. Now we want to test whether the treatment effects are significant, i.e., H0 : τ_k = 0 ∀ k vs H1 : at least one inequality in H0. For this purpose we plug the estimated missing value x̂ into the data and calculate the treatment sum of squares SST and the error sum of squares SSE (the error df being reduced by one to (m − 1)(m − 2) − 1). We reject H0 at level α if
F = MST/MSE > F_{α; (m−1), (m−1)(m−2)−1}.
The estimate of the missing treatment mean is
τ̂_k = (T_00k + x̂)/m
= (1/m)[T_00k + (m(T_i00 + T_0j0 + T_00k) − 2T_000)/((m − 1)(m − 2))]
= (1/(m(m − 1)(m − 2)))[m T_i00 + m T_0j0 + ((m − 1)(m − 2) + m) T_00k − 2T_000]
= (m T_i00 + m T_0j0 + (m² − 2m + 2) T_00k − 2T_000) / (m(m − 1)(m − 2)).
Therefore
V(τ̂_k) = (1/(m²(m − 1)²(m − 2)²)) var(m T_i00 + m T_0j0 + (m² − 2m + 2) T_00k − 2T_000).
Note that
V(T_i00) = V(T_0j0) = V(T_00k) = (m − 1)σ²
cov(T_i00, T_0j0) = cov(T_0j0, T_00k) = cov(T_i00, T_00k) = 0
cov(T_i00, T_000) = cov(T_0j0, T_000) = cov(T_00k, T_000) = (m − 1)σ².
Therefore,
V(τ̂_k) = (1/(m²(m − 1)²(m − 2)²)) [m²(m − 1)σ² + m²(m − 1)σ² + (m² − 2m + 2)²(m − 1)σ² + 4(m² − 1)σ² − 4m(m − 1)σ² − 4m(m − 1)σ² − 4(m² − 2m + 2)(m − 1)σ²]
= σ² m{(m − 1)(m − 2) + m} / (m²(m − 1)(m − 2)) = (σ²/m²)(m + m²/((m − 1)(m − 2))).
Now, with τ̂_k′ = ȳ00k′ − ȳ000 unaffected by the missing value,
V(τ̂_k − τ̂_k′) = V(τ̂_k) + V(τ̂_k′)
= (σ²/m²)(m + m²/((m − 1)(m − 2))) + σ²/m
= σ²(1/m + 1/((m − 1)(m − 2))) + σ²/m
= σ²(2/m + 1/((m − 1)(m − 2))),
while
V(τ̂_k′ − τ̂_j) = 2σ²/m, k′, j ≠ k, the same as in a simple LSD.
There are (m − 1)(m − 2)/2 treatment differences among the treatments having no missing response, and (m − 1) differences involving the treatment with the missing response. Therefore the average variance is given by
[ (m − 1)σ²(2/m + 1/((m − 1)(m − 2))) + ((m − 1)(m − 2)/2)(2σ²/m) ] / [ (m − 1) + (m − 1)(m − 2)/2 ].
Factorial design
What is a factorial experiment? In what respect does it differ from a single-factor experiment?
Experiments in which the effects of more than one factor, each at 2 or more levels, are considered together are called factorial experiments. On the other hand, an experiment carried out with one factor at varying levels is called a simple experiment or single-factor experiment.
A factorial experiment is not a scheme of design like CRD, RBD or LSD; rather, a factorial experiment can be carried out in any of these designs.
We illustrate the scenario with an example. Suppose a disease has 2 specified treatments, say A and B, where treatment A has 2 different doses and treatment B also has 2 different doses. The experimenter may be interested in the effect of the individual treatment levels, as well as of the combinations of different dose levels. A factorial experiment takes such cases into account, whereas the simple experiment only brings out the effect of the treatment as a whole.
33
2. a: A in higher level and factor B at lower level
We denote [a] for total of all observation receiving the treatment combi-
nation ’a’ and (a) represent the corresponding mean.
A and B denotes respectively main effect due to factor A and B. The main
effect due to A is derived as a contrast of the four treatment combination as
A
follows,
A = 12 {(ab) − (b) + (a) − 1} = 21 (a − 1)(b + 1)
G
1
B= 2 {(ab) − (a) + (b) − 1} = 12 (b − 1)(a + 1)
AB + − − + 2
[X] = Total effect due to the factorial effect X=A,B,AB
[x] = Total yield due to a treatment combination x=(1),a,b,ab
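The three contrasts can be computed mechanically from the four combination means. A sketch (the numeric means are illustrative assumptions):

```python
def effects_2x2(m1, ma, mb, mab):
    """Main and interaction effects of a 2^2 experiment from the
    treatment-combination means (1), (a), (b), (ab)."""
    A  = 0.5 * (mab - mb + ma - m1)   # (1/2)(a - 1)(b + 1)
    B  = 0.5 * (mab - ma + mb - m1)   # (1/2)(b - 1)(a + 1)
    AB = 0.5 * (mab - ma - mb + m1)   # (1/2)(a - 1)(b - 1)
    return A, B, AB

A, B, AB = effects_2x2(10.0, 14.0, 11.0, 18.0)
print(A, B, AB)  # 5.5 2.5 1.5
```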
Define the terms main effect and interaction effect in relation to a 2³ experiment. Give a detailed analysis of a 2³ experiment conducted in RBD.
Suppose an experiment has 3 factors A, B, C, each at 2 levels, a higher and a lower level (say). The presence of a lower-case letter (a, b, c) in a treatment combination indicates the higher level of the corresponding factor A, B, C; its absence indicates the lower level. We define the following:
Combination   Higher level   Lower level
(1)           —              A, B, C
a             A              B, C
b             B              A, C
c             C              A, B
ab            A, B           C
bc            B, C           A
ac            A, C           B
abc           A, B, C        —
[x] and (x) respectively denote the total and the mean response obtained under the treatment combination x. A, B, C denote the main effects based on the eight treatment combinations, and AB, BC, AC, ABC denote the interaction effects based on the different treatment combinations.
We define the following sign table:
       (1)  (a)  (b)  (c)  (ab) (bc) (ca) (abc)   Divisor
M       +    +    +    +    +    +    +    +        8
A       −    +    −    −    +    −    +    +        4
B       −    −    +    −    +    +    −    +        4
C       −    −    −    +    −    +    +    +        4
AB      +    −    −    +    +    −    −    +        4
BC      +    +    −    −    −    +    −    +        4
AC      +    −    +    −    −    −    +    +        4
ABC     −    +    +    +    −    −    −    +        4
For example, the main effect due to A is given by
A = ¼{(abc) + (ab) − (bc) + (ca) + (a) − (b) − (c) − (1)} = ¼(a − 1)(b + 1)(c + 1)
AB = ¼{(abc) + (ab) − (bc) − (ac) − (a) − (b) + (c) + (1)} = ¼(a − 1)(b − 1)(c + 1)
Analysis: We conduct this factorial experiment in an RBD with r blocks, i.e., each of the 8 treatment combinations is replicated r times.
The SS due to a factorial effect X is given by SS_X = [X]²/(8r).
Here the total df = 8r − 1. Each of the factorial effects A, B, C, AB, BC, AC, ABC carries df = 1, and the df due to blocks is (r − 1), so the error df will be (8r − 1) − (r − 1) − 7 = 7r − 7 = 7(r − 1).
Here we test 7 hypotheses regarding the 7 different factorial effects. The test statistic for testing the significance of any factorial effect X is given by
F_X = MS_X/MSE ∼ F_{1, 7(r−1)} under the corresponding null hypothesis.
ANOVA table:
Source   df         SS           MS           F
Block    r − 1      SS(Blocks)   MS(Blocks)   −
A        1          [A]²/8r      MSA          MSA/MSE
B        1          [B]²/8r      MSB          MSB/MSE
. . .
ABC      1          [ABC]²/8r    MS(ABC)      MS(ABC)/MSE
Error    7(r − 1)   SSE          MSE
Total    8r − 1     TSS
Q. Describe Yates' algorithm for a 2³ experiment.
We can consider a 2³ experiment with factors A, B, C, each at 2 levels; the presence of a lower-case letter indicates the higher level of the corresponding factor and its absence the lower level, giving 2³ = 8 treatment combinations.
Yates' algorithm for determining the factorial effect totals is as follows:
(i) In the first column write down the treatment combinations in standard order, i.e., (1), a, b, ab, c, ac, bc, abc.
(ii) In the second column write down the corresponding treatment totals over all replicates.
(iii) The third column splits into two halves: the first half is obtained by writing down, in order, the pairwise sums of the entries of column 2, and the second half by writing, in the same order, the pairwise differences (lower entry minus upper entry).
(iv) Column 4 is obtained from column 3 by the same process.
(v) Column 5 is obtained by the same process applied to column 4. This column yields the factorial effect totals of the corresponding rows.
Treatment     Total yield    Column 3           Column 4        Column 5 (effect total)
(1)           [1] = u1       [1] + [a] = v1     v1 + v2 = w1    w1 + w2 = G
a             [a] = u2       [b] + [ab] = v2    v3 + v4 = w2    w3 + w4 = [A]
b             [b] = u3       [c] + [ac] = v3    v5 + v6 = w3    w5 + w6 = [B]
ab            [ab] = u4      [bc] + [abc] = v4  v7 + v8 = w4    w7 + w8 = [AB]
c             [c] = u5       [a] − [1] = v5     v2 − v1 = w5    w2 − w1 = [C]
ac            [ac] = u6      [ab] − [b] = v6    v4 − v3 = w6    w4 − w3 = [AC]
bc            [bc] = u7      [ac] − [c] = v7    v6 − v5 = w7    w6 − w5 = [BC]
abc           [abc] = u8     [abc] − [bc] = v8  v8 − v7 = w8    w8 − w7 = [ABC]
Once the factorial effect total [X] is obtained, the sum of squares due to X follows as SS_X = [X]²/(2³r).
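The steps above can be sketched as a short routine that works for any 2^k experiment; the 2² example at the end checks it against the direct contrast formulas.

```python
def yates(totals):
    """Yates' algorithm for a 2^k factorial; totals are in standard order
    (1), a, b, ab, c, ac, bc, abc, ...  Returns G, [A], [B], [AB], [C], ..."""
    col = list(totals)
    k = len(col).bit_length() - 1          # number of factors
    for _ in range(k):
        half = len(col) // 2
        sums  = [col[2 * i] + col[2 * i + 1] for i in range(half)]
        diffs = [col[2 * i + 1] - col[2 * i] for i in range(half)]
        col = sums + diffs                 # one pass: pairwise sums, then diffs
    return col

# 2^2 check with totals [1], [a], [b], [ab]
G, A, B, AB = yates([10.0, 14.0, 11.0, 18.0])
assert G == 10 + 14 + 11 + 18            # grand total, 53
assert A == 14 - 10 + 18 - 11            # [a]-[1]+[ab]-[b] = 11
assert B == 11 - 10 + 18 - 14            # 5
assert AB == 10 - 14 - 11 + 18           # 3
```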
What is meant by confounding in factorial experiment? Why
is confounding used even at the cost of loss of information on the
RA
confounded effects? Explain the term “complete confounding” and
“partial confounding”.
38
The interaction ABC is given by

ABC = (1/4)[(abc) − (bc) − (ac) + (c) − (ab) + (b) + (a) − (1)]
    = (1/4)[(abc) + (a) + (b) + (c) − (ab) − (bc) − (ac) − (1)].

To confound ABC with blocks, the first block contains the treatments carrying a + sign in the above expression and the second block contains the treatments carrying a − sign. Note that the interaction ABC can then be written as the difference between the two blocks stated below.
Block I    Block II
a          (1)
b          ab
c          ac
abc        bc
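The block assignment above can be generated mechanically from the contrast signs. This is a sketch (helper names are my own, not from the notes) that places each treatment combination in a block according to its sign in the chosen interaction.

```python
# Split the 2^3 treatment combinations into two blocks so that a chosen
# interaction (here ABC) is confounded with blocks.

TREATMENTS = ["1", "a", "b", "ab", "c", "ac", "bc", "abc"]

def sign(treatment, effect):
    """Sign of `treatment` in the contrast of `effect`: each factor of the
    effect at its higher level contributes +1, at its lower level -1."""
    s = 1
    for letter in effect:
        s *= 1 if letter in treatment else -1
    return s

def blocks(effect):
    plus = [t for t in TREATMENTS if sign(t, effect) == +1]
    minus = [t for t in TREATMENTS if sign(t, effect) == -1]
    return plus, minus

block1, block2 = blocks("abc")
# block1 = ['a', 'b', 'c', 'abc'], block2 = ['1', 'ab', 'ac', 'bc']
```

The same helper reproduces the blocks for any other confounded interaction, e.g. `blocks("ab")` for the AB-confounded replication shown later.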
Since ABC is then estimable only as the block difference, the within-block information on ABC is lost. If the same interaction is confounded in every replication, it is called complete confounding; if different interactions are confounded in different replications, it is called partial confounding.
Replication 1:   Block I    Block II
                 (1)        a
                 ab         b          AB is confounded
                 c          bc
                 abc        ac

Replication 2:   Block I    Block II
                 a          (1)
                 b          ab         ABC is confounded
                 c          bc
                 abc        ca

Since different interactions are confounded in the two replications, this is partial confounding.
What is meant by a balanced partially confounded design?
Let AB, BC, AC, the first-order (two-factor) interactions, be confounded in 6 replications as follows.

Replication:    I    II    III   IV    V    VI
Confounded:     AB   BC    BC    AB    AC   AC

Note that each of the first-order interactions is confounded exactly twice. Hence this design is a balanced partially confounded design.
Problems on confounding
Hey hey, what are you expecting? Is this a textbook? Go and study through the class notes. And outsiders, sadly you need to ask your friends to get these problems.
Or: for a 2³ experiment, identify the orthogonal effects.

A   = (1/4)[(a) + (ab) + (ac) + (abc) − (1) − (b) − (c) − (bc)]
B   = (1/4)[(b) + (ab) + (bc) + (abc) − (1) − (a) − (c) − (ac)]
C   = (1/4)[(c) + (bc) + (ac) + (abc) − (1) − (b) − (a) − (ab)]
AB  = (1/4)[(1) + (c) + (ab) + (abc) − (a) − (b) − (bc) − (ac)]
AC  = (1/4)[(1) + (b) + (ac) + (abc) − (a) − (c) − (ab) − (bc)]
BC  = (1/4)[(1) + (a) + (bc) + (abc) − (b) − (c) − (ab) − (ac)]
ABC = (1/4)[(a) + (b) + (c) + (abc) − (1) − (ab) − (bc) − (ac)]
The signs of the treatment combinations in each effect are summarised below.

Effect   (1)   a    b    c    ab   bc   ac   abc   Divisor
M        +     +    +    +    +    +    +    +     8
A        −     +    −    −    +    −    +    +     4
B        −     −    +    −    +    +    −    +     4
C        −     −    −    +    −    +    +    +     4
AB       +     −    −    +    +    −    −    +     4
BC       +     +    −    −    −    +    −    +     4
CA       +     −    +    −    −    −    +    +     4
ABC      −     +    +    +    −    −    −    +     4

Any two distinct rows have zero dot product, so the effects are mutually orthogonal.
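The claimed orthogonality can be verified directly in code: the sign vector of each effect sums to zero and any two distinct sign vectors have zero dot product. Names below are illustrative, not from the notes.

```python
# Verify that the seven 2^3 factorial contrasts are mutually orthogonal.

TREATMENTS = ["1", "a", "b", "c", "ab", "bc", "ac", "abc"]
EFFECTS = ["a", "b", "c", "ab", "bc", "ac", "abc"]

def sign_vector(effect):
    """+1/-1 sign of each treatment combination in the given contrast."""
    vec = []
    for t in TREATMENTS:
        s = 1
        for letter in effect:
            s *= 1 if letter in t else -1
        vec.append(s)
    return vec

# Each contrast sums to zero (orthogonal to the mean row M of all +1's):
assert all(sum(sign_vector(e)) == 0 for e in EFFECTS)

# ...and every pair of distinct contrasts is orthogonal:
for i, e1 in enumerate(EFFECTS):
    for e2 in EFFECTS[i + 1:]:
        assert sum(x * y for x, y in zip(sign_vector(e1), sign_vector(e2))) == 0
```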
Then the defining equations are those of the confounded interactions AB and AC. From the defining equations we find that BC is also confounded, where BC = AC × AB (using A² = B² = C² = I) is the generalised interaction of AB and AC.
Q. How are sums of squares calculated from a factorial experiment? Show that in a 2² factorial experiment, the sum of squares due to treatments can be written as the sum of squares due to the individual treatment effects, each with 1 degree of freedom.
Let us consider k treatments t1, t2, …, tk, each replicated r times. A treatment contrast L is given by

L = Σ_{i=1}^{k} ci ti, where Σ_{i=1}^{k} ci = 0.

Note that, with ti denoting a treatment total so that V(ti) = rσ²,

E(L²) = Σ_{i=1}^{k} ci² V(ti) = rσ² Σ_{i=1}^{k} ci²  ⇒  E[ L² / (r Σi ci²) ] = σ².

Then the sum of squares due to the contrast L is L² / (r Σi ci²), with 1 degree of freedom.

In a 2² experiment we have the three treatment effects A, B, AB, and (1) is the control treatment. Here

[A] = [ab] − [b] + [a] − [1],

i.e. [A] is a contrast in the treatment totals with coefficients (1, −1, 1, −1). The sum of squares due to this contrast is

SSA = [A]² / (r[1² + (−1)² + 1² + (−1)²]) = [A]² / (r·2²), with 1 d.f.
Similarly, the SS due to B and AB are

SSB = [B]² / (r·2²) with 1 d.f.;   SS(AB) = [AB]² / (r·2²) with 1 d.f.
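A quick numerical check of this decomposition, with made-up totals (all numbers below are illustrative): the three 1-d.f. sums of squares add up to the treatment SS computed directly.

```python
# Check that in a 2^2 factorial the treatment SS splits into
# SSA + SSB + SSAB, each carrying 1 degree of freedom.

r = 3  # replicates per treatment combination
# Totals over r replicates for (1), a, b, ab (illustrative numbers):
T1, Ta, Tb, Tab = 30.0, 42.0, 36.0, 54.0

A_tot  = Tab - Tb + Ta - T1        # [A]
B_tot  = Tab + Tb - Ta - T1        # [B]
AB_tot = Tab - Tb - Ta + T1        # [AB]

SSA  = A_tot**2  / (r * 2**2)
SSB  = B_tot**2  / (r * 2**2)
SSAB = AB_tot**2 / (r * 2**2)

# Treatment SS computed directly from the four totals:
G = T1 + Ta + Tb + Tab
SStr = (T1**2 + Ta**2 + Tb**2 + Tab**2) / r - G**2 / (4 * r)
assert abs(SStr - (SSA + SSB + SSAB)) < 1e-9
```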
Analysis of covariance

Discuss the ANCOVA technique for an RBD. Also derive the average variance of the difference between two estimated treatment effects.

Model: yij = µ + τi + θj + β(xij − x̄00) + eij,  i = 1, …, t;  j = 1, …, r,

where yij is the response of the i-th treatment in the j-th block and xij is the corresponding concomitant variable.

Assumption: Σ_{i=1}^{t} τi = 0, Σ_{j=1}^{r} θj = 0, eij iid ∼ N(0, σ²).
Estimation of the model parameters: minimise

E = Σ_{i=1}^{t} Σ_{j=1}^{r} eij² = Σi Σj [yij − µ − τi − θj − β(xij − x̄00)]².

∂E/∂µ = 0 ⇒ Σi Σj (yij − µ − τi − θj − β(xij − x̄00)) = 0
⇒ Σi Σj yij − nµ − r Σi τi − t Σj θj − β Σi Σj (xij − x̄00) = 0
⇒ µ̂ = (1/n) Σi Σj yij = ȳ00,  where n = rt.

∂E/∂τi = 0 ⇒ Σ_{j=1}^{r} (yij − µ − τi − θj − β(xij − x̄00)) = 0
⇒ Σj yij = rµ + rτi + Σj θj + β Σj (xij − x̄00)
⇒ τ̂i = (1/r) Σj yij − ȳ00 − β[(1/r) Σj xij − x̄00]
      = (ȳi0 − ȳ00) − β(x̄i0 − x̄00).

Similarly, θ̂j = (ȳ0j − ȳ00) − β(x̄0j − x̄00).

Now set ∂E/∂β = 0. Substituting the above estimates,

E = Σi Σj (yij − µ̂ − τ̂i − θ̂j − β(xij − x̄00))²
  = Σi Σj [(yij − ȳ00) − {(ȳi0 − ȳ00) − β(x̄i0 − x̄00)} − {(ȳ0j − ȳ00) − β(x̄0j − x̄00)} − β(xij − x̄00)]²
  = Σi Σj [(yij − ȳi0 − ȳ0j + ȳ00) − β(xij − x̄i0 − x̄0j + x̄00)]².
Therefore, minimising over β,

β̂ = [Σi Σj (yij − ȳi0 − ȳ0j + ȳ00)(xij − x̄i0 − x̄0j + x̄00)] / [Σi Σj (xij − x̄i0 − x̄0j + x̄00)²] = Exy / Exx.
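The slope estimate β̂ = Exy/Exx can be computed directly from the error-line residuals; this is a sketch with toy data (all names and numbers below are mine, not from the notes).

```python
import numpy as np

# ANCOVA slope estimate for an RBD with t treatments (rows) and
# r blocks (columns); variable names and toy data are illustrative.

rng = np.random.default_rng(0)
t, r = 4, 5
x = rng.normal(size=(t, r))              # concomitant variable x_ij
y = 2.0 * x + rng.normal(size=(t, r))    # toy responses y_ij

def error_residual(z):
    """z_ij - zbar_i. - zbar_.j + zbar_.. (the error-line residuals)."""
    return (z - z.mean(axis=1, keepdims=True)
              - z.mean(axis=0, keepdims=True) + z.mean())

Exx = (error_residual(x) ** 2).sum()
Exy = (error_residual(x) * error_residual(y)).sum()
Eyy = (error_residual(y) ** 2).sum()

beta_hat = Exy / Exx
SSE = Eyy - Exy**2 / Exx   # adjusted error SS, df = (t-1)(r-1) - 1
```

By the Cauchy–Schwarz inequality the adjusted SSE never exceeds Eyy, which is the error-control point made below.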
The minimum error sum of squares is

SSE = Σi Σj {(yij − ȳi0 − ȳ0j + ȳ00) − β̂(xij − x̄i0 − x̄0j + x̄00)}²
    = Σi Σj (yij − ȳi0 − ȳ0j + ȳ00)² − 2β̂ Σi Σj (yij − ȳi0 − ȳ0j + ȳ00)(xij − x̄i0 − x̄0j + x̄00) + β̂² Σi Σj (xij − x̄i0 − x̄0j + x̄00)²
    = Eyy − 2(Exy/Exx)Exy + (Exy/Exx)² Exx
    = Eyy − Exy²/Exx
    = SSE(RBD) − Exy²/Exx,

where Eyy = Σi Σj (yij − ȳi0 − ȳ0j + ȳ00)² is the error SS of the usual RBD.

Further,

β̂ ∼ N(β, σ²/Exx),  so  (β̂ − β)√Exx / σ ∼ N(0, 1).
treatment df = (t − 1),  block df = (r − 1),  total df = (rt − 1),
error df = (rt − 1) − (r − 1) − (t − 1) − 1 = (r − 1)(t − 1) − 1
(one degree of freedom is lost in estimating β). Then

SSE/σ² ∼ χ²_{(r−1)(t−1)−1},  E[ SSE / {(r − 1)(t − 1) − 1} ] = σ²,

⇒ E(MSE) = σ², i.e. σ² can be estimated by MSE.

To test H0 : β = 0 against H1 : β ≠ 0, note that

(β̂ − β)√Exx / √( SSE / [(r − 1)(t − 1) − 1] ) ∼ t_{(r−1)(t−1)−1}

⇒ T = β̂√Exx / √MSE ∼ t_{(r−1)(t−1)−1}, under H0.

We reject H0 at level α if |T| > t_{α/2; (r−1)(t−1)−1}.
Now, to test the equality of treatment effects, H0 : τ1 = τ2 = ⋯ = τt = 0, the estimates under the reduced model are

µ̂H0 = ȳ00,
θ̂jH0 = (ȳ0j − ȳ00) − β̂H0(x̄0j − x̄00),

β̂H0 = [Σi Σj (yij − ȳ0j)(xij − x̄0j)] / [Σi Σj (xij − x̄0j)²] = ExyH0 / ExxH0.
i.e.,

SSEH0 = Σi Σj (yij − µ̂H0 − θ̂jH0 − β̂H0(xij − x̄00))²
      = Σi Σj ((yij − ȳ00) − (ȳ0j − ȳ00) + β̂H0(x̄0j − x̄00) − β̂H0(xij − x̄00))²
      = Σi Σj [(yij − ȳ0j) − β̂H0(xij − x̄0j)]²
      = Σi Σj (yij − ȳ0j)² − β̂H0² Σi Σj (xij − x̄0j)²
      = Σi Σj (yij − ȳ0j)² − (ExyH0)²/ExxH0
      = EyyH0 − (ExyH0)²/ExxH0.

Therefore, under H0,

(SSEH0 − SSE)/σ² ∼ χ²_{r(t−1)−1−[(r−1)(t−1)−1]} = χ²_{t−1},

and hence

F = [(SSEH0 − SSE)/(t − 1)] / [SSE/{(r − 1)(t − 1) − 1}] ∼ F_{(t−1), (r−1)(t−1)−1}.
We know τ̂i = (ȳi0 − ȳ00) − β̂(x̄i0 − x̄00) and τ̂j = (ȳj0 − ȳ00) − β̂(x̄j0 − x̄00). Hence

V(τ̂i − τ̂j) = V[(ȳi0 − ȳj0) − β̂(x̄i0 − x̄j0)]
            = V(ȳi0 − ȳj0) + (x̄i0 − x̄j0)² V(β̂)
            = V[(τi − τj) + (ēi0 − ēj0)] + (x̄i0 − x̄j0)² σ²/Exx
            = 2σ²/r + (x̄i0 − x̄j0)² σ²/Exx.

Averaging over the t(t − 1)/2 pairs,

Avg var = [2/{t(t − 1)}] Σ_{i<j} [2σ²/r + (x̄i0 − x̄j0)² σ²/Exx]
        = 2σ²/r + (σ²/Exx) [1/{t(t − 1)}] Σ_{i≠j} (x̄i0 − x̄j0)².

Now,

Σ_{i≠j} (x̄i0 − x̄j0)² = Σi Σj ((x̄i0 − x̄00) − (x̄j0 − x̄00))²
                       = t Σi (x̄i0 − x̄00)² + t Σj (x̄j0 − x̄00)²   (the cross term vanishes)
                       = 2t Σi (x̄i0 − x̄00)² = (2t/r) Txx,

where Txx = r Σi (x̄i0 − x̄00)². Therefore

Avg var = 2σ²/r + (σ²/Exx)(1/{t(t − 1)})(2t/r) Txx = (2σ²/r)[1 + Txx/{(t − 1)Exx}].
ANCOVA for an LSD. For an m × m Latin square design the model is

yijk = µ + αi + βj + τk + γ(xijk − x̄000) + eijk,

where yijk is the observation corresponding to the i-th row, j-th column and k-th treatment, and xijk is the concomitant variable corresponding to yijk.

Assumptions: (i) Σi αi = 0, (ii) Σj βj = 0, (iii) Σk τk = 0, (iv) eijk iid ∼ N(0, σ²).
Estimation of the model parameters: minimise

SSE = Σ_{(i,j,k)∈S} (yijk − µ − αi − βj − τk − γ(xijk − x̄000))²,

where S is the set of triples (i, j, k) occurring in the Latin square.

∂(SSE)/∂µ = 0 ⇒ Σ_{(i,j,k)∈S} (yijk − µ − αi − βj − τk − γ(xijk − x̄000)) = 0 ⇒ µ̂ = ȳ000.

Similarly,

α̂i = (ȳi00 − ȳ000) − γ̂(x̄i00 − x̄000),
β̂j = (ȳ0j0 − ȳ000) − γ̂(x̄0j0 − x̄000),
τ̂k = (ȳ00k − ȳ000) − γ̂(x̄00k − x̄000),

and, minimising over γ,

γ̂ = [Σ_{(i,j,k)∈S} (yijk − ȳi00 − ȳ0j0 − ȳ00k + 2ȳ000)(xijk − x̄i00 − x̄0j0 − x̄00k + 2x̄000)] / [Σ_{(i,j,k)∈S} (xijk − x̄i00 − x̄0j0 − x̄00k + 2x̄000)²] = Exy/Exx.
Under the ANCOVA model the minimum SSE is given by

SSE = Σ_{(i,j,k)∈S} [yijk − µ̂ − α̂i − β̂j − τ̂k − γ̂(xijk − x̄000)]²
    = Σ [(yijk − ȳi00 − ȳ0j0 − ȳ00k + 2ȳ000) − γ̂(xijk − x̄i00 − x̄0j0 − x̄00k + 2x̄000)]²
    = Σ (yijk − ȳi00 − ȳ0j0 − ȳ00k + 2ȳ000)² − γ̂² Σ (xijk − x̄i00 − x̄0j0 − x̄00k + 2x̄000)²
    = Eyy − Exy²/Exx.

Note that Eyy is the error SS of the usual LSD, so introducing the concomitant variable is useful for error control.

The degrees of freedom for error are (m² − 1) − 3(m − 1) − 1 = (m − 1)(m − 2) − 1, one degree of freedom being lost in estimating γ.
Testing: here we want to test whether all the treatment effects are the same or not, i.e.,

H0 : τ1 = τ2 = ⋯ = τm = 0 vs H1 : at least one inequality in H0.

Under H0 the model reduces to yijk = µ + αi + βj + γ(xijk − x̄000) + eijk. The estimates are obtained by minimising

Σ_{(i,j,k)∈S} (yijk − µ − αi − βj − γ(xijk − x̄000))².
µ̂H0 = ȳ000,
α̂iH0 = (ȳi00 − ȳ000) − γ̂H0(x̄i00 − x̄000),
β̂jH0 = (ȳ0j0 − ȳ000) − γ̂H0(x̄0j0 − x̄000),

γ̂H0 = [Σ (yijk − ȳi00 − ȳ0j0 + ȳ000)(xijk − x̄i00 − x̄0j0 + x̄000)] / [Σ (xijk − x̄i00 − x̄0j0 + x̄000)²] = Exy(H0)/Exx(H0).

Then

SSEH0 = Σ [(yijk − ȳ000) − {(ȳi00 − ȳ000) − γ̂H0(x̄i00 − x̄000)} − {(ȳ0j0 − ȳ000) − γ̂H0(x̄0j0 − x̄000)} − γ̂H0(xijk − x̄000)]²
      = Σ [(yijk − ȳi00 − ȳ0j0 + ȳ000) − γ̂H0(xijk − x̄i00 − x̄0j0 + x̄000)]²
      = Σ (yijk − ȳi00 − ȳ0j0 + ȳ000)² − γ̂H0² Σ (xijk − x̄i00 − x̄0j0 + x̄000)²
      = Eyy(H0) − Exy(H0)²/Exx(H0),

with (m − 1)² − 1 degrees of freedom, so that

df(SSEH0) − df(SSE) = (m − 1)² − 1 − [(m − 1)(m − 2) − 1] = (m − 1)(m − 1 − m + 2) = (m − 1).
Under H0,

(SSEH0 − SSE)/σ² ∼ χ²_{m−1}  and  SSE/σ² ∼ χ²_{(m−1)(m−2)−1}.

Also, (SSEH0 − SSE) is independent of SSE. Hence

F = [(SSEH0 − SSE)/(m − 1)] / [SSE/{(m − 1)(m − 2) − 1}] ∼ F_{(m−1), (m−1)(m−2)−1}.
We reject H0 at level α if

F > F_{α; (m−1), (m−1)(m−2)−1}.

To test H0 : γ = 0 against H1 : γ ≠ 0, note that

γ̂ ∼ N(γ, σ²/Exx),  therefore  (γ̂ − γ)√Exx / σ ∼ N(0, 1).

Here σ² can be estimated by MSE, where

MSE = SSE / [(m − 1)(m − 2) − 1],

so that, under H0, T = γ̂√Exx / √MSE ∼ t_{(m−1)(m−2)−1}, and we reject H0 at level α if |T| > t_{α/2; (m−1)(m−2)−1}.
Average variance
Here, τ̂k = (ȳ00k − ȳ000) − γ̂(x̄00k − x̄000)
and τ̂k′ = (ȳ00k′ − ȳ000) − γ̂(x̄00k′ − x̄000).
Now,
V(τ̂k − τ̂k′) = 2σ²/m + (x̄00k − x̄00k′)² σ²/Exx,

since each treatment occurs m times in the square. Hence

Average variance = [1/{m(m − 1)}] Σ_{k≠k′} [2σ²/m + (x̄00k − x̄00k′)² σ²/Exx]
                 = 2σ²/m + (σ²/Exx) Σ_{k≠k′} (x̄00k − x̄00k′)² / {m(m − 1)}.

Now,

Σ_{k≠k′} (x̄00k − x̄00k′)² = Σ_{k≠k′} ((x̄00k − x̄000) + (x̄000 − x̄00k′))²
                           = m Σk (x̄00k − x̄000)² + m Σk′ (x̄00k′ − x̄000)²   (the cross term vanishes)
                           = 2m Σk (x̄00k − x̄000)² = 2Txx,

where Txx = m Σk (x̄00k − x̄000)². Therefore

Avg variance = 2σ²/m + (σ²/Exx) · 2Txx/{m(m − 1)} = (2σ²/m)[1 + Txx/{(m − 1)Exx}].