BOLGATANGA POLYTECHNIC
DEPARTMENT OF STATISTICS HND 3
MULTIVARIATE DATA ANALYSIS
END OF FIRST SEMESTER EXAMINATION 2011/2012 (TIME 3 HRS)
SECTION A. ANSWER ALL QUESTIONS (50 MARKS)
Q1) a. Briefly explain the following:
(i). Symmetric matrix (2 marks)
(ii). Trace of a matrix (1 mark)
(iii). Orthogonal matrix (1 mark)
(iv). Unit matrix (1 mark)
(v). Transpose of a matrix (2 marks)
b. Let the random vector y have covariance matrix
$$\Sigma = \begin{bmatrix} 25 & 2 & 4 \\ 2 & 4 & 1 \\ 4 & 1 & 9 \end{bmatrix}$$
(i). Calculate the correlation matrix of y. (8 marks)
(ii). Given that $a' = [\,2 \;\; 0 \;\; 1\,]$, calculate the variance of $a'y$. (4 marks)
(iii). Calculate the trace of the matrix above. (2 marks)
Q2) a. From the data given below, calculate:
(i). the sample means, $\bar{X}$ (2 marks)
(ii). the sample variances and covariances, $S$ (6 marks)
(iii). the sample correlations, $R$. (4 marks)
Variable 1 ($X_1$): 5 4 6 2 2 6 3
Variable 2 ($X_2$): 5 5 4 7 9 5 7
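Q2b). Given that
$$\bar{X}_1 = \begin{bmatrix} -0.0065 \\ -0.0390 \end{bmatrix}, \quad \bar{X}_2 = \begin{bmatrix} -0.2483 \\ -0.0262 \end{bmatrix}, \quad S_P^{-1} = [\quad], \quad X_0 = \begin{bmatrix} 2.210 \\ 0.044 \end{bmatrix}$$
? (7 marks)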
Q3. As part of a study of AIDS prevention undertaken by a social counselor, some questions were designed and the respondents were categorized into males (population 1) and females (population 2). A sample of 30 males and 30 females was considered. The sample mean vectors were
Males:
$$\bar{X}_1 = \begin{bmatrix} 6.833 \\ 7.033 \\ 3.967 \\ 4.700 \end{bmatrix}$$
Females:
$$\bar{X}_2 = \begin{bmatrix} 6.633 \\ 7.000 \\ 4.000 \\ 4.533 \end{bmatrix}$$
and the pooled variance-covariance is
1
1
1
1
]
1
pp
).
b. Using the matrix
$$X = \begin{bmatrix} 1 & 3 & 0 \\ 3 & -1 & -2 \\ 0 & -2 & 2 \end{bmatrix},$$
which of the following are independent? Explain.
(i) $(x_1,\; x_2)$
(ii) $[(x_2 + x_3),\; x_3]$
(iii) $\left[\tfrac{1}{2}(x_1 + x_2 - x_3),\; \tfrac{1}{2}(x_1 - x_3)\right]$
(iv) $\left[\dfrac{X_1 + X_2}{2},\; X_3\right]$
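Q2. Consider the following independent samples from three levels. The observation vectors $\begin{bmatrix} X_1 \\ X_2 \end{bmatrix}$ are:
100 Level: $\begin{bmatrix} 14 \\ 8 \end{bmatrix}, \begin{bmatrix} 6 \\ 2 \end{bmatrix}, \begin{bmatrix} 8 \\ 2 \end{bmatrix}, \begin{bmatrix} 16 \\ -4 \end{bmatrix}$
200 Level: $\begin{bmatrix} 1 \\ 6 \end{bmatrix}, \begin{bmatrix} 5 \\ 12 \end{bmatrix}, \begin{bmatrix} 0 \\ 15 \end{bmatrix}, \begin{bmatrix} 2 \\ 7 \end{bmatrix}$
300 Level: $\begin{bmatrix} 3 \\ -2 \end{bmatrix}, \begin{bmatrix} -2 \\ 7 \end{bmatrix}, \begin{bmatrix} -11 \\ 1 \end{bmatrix}, \begin{bmatrix} -6 \\ 6 \end{bmatrix}$
a. Break up the observations into mean, treatment, and residual components and construct the corresponding arrays for each variable.
b. Construct the one-way MANOVA table using the information in (a).
c. Calculate Wilks' lambda. (Use $\alpha = 5\%$.)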
Q.3) Perspiration from 20 healthy females was analysed. Three components, $x_1$ = sweat rate, $x_2$ = sodium content and $x_3$ = potassium content, were measured and the results are presented below.
Summary in a computer analysis as follows.
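i. A symmetric matrix is a square matrix that is equal to its own transpose ($A = A^T$), e.g.
$$A = \begin{bmatrix} 1 & 2 & 5 \\ 2 & 8 & 9 \\ 5 & 9 & 4 \end{bmatrix}$$
ii. The trace of a matrix is the sum of its diagonal elements.
If $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$, then $\mathrm{trace}(A) = \sum_i a_{ii} = a_{11} + a_{22}$.
iii. An orthogonal matrix Q is a square matrix such that $QQ^T = Q^TQ = I$, e.g.
$$Q = \begin{bmatrix} \tfrac{1}{\sqrt{2}} & -\tfrac{1}{\sqrt{2}} \\ \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \end{bmatrix}$$
iv. A unit matrix is a diagonal matrix in which the elements on the leading diagonal are all 1, e.g.
$$I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
v. The transpose of a matrix is the matrix obtained when its rows and columns are interchanged: the first row becomes the first column, the second row becomes the second column, the third row becomes the third column, and so on. The new matrix so formed is called the transpose of the original matrix. E.g. if
$$A = \begin{bmatrix} 4 & 6 \\ 7 & 9 \\ 2 & 5 \end{bmatrix}, \quad \text{then} \quad A^T = \begin{bmatrix} 4 & 7 & 2 \\ 6 & 9 & 5 \end{bmatrix}$$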
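The definitions in (i) to (v) can be checked numerically. The following is a minimal sketch in plain Python; the helper names (`transpose`, `matmul`, `trace`, `is_symmetric`) are illustrative and not part of the marking scheme, and the orthogonal example Q is an assumed 45-degree rotation matrix.

```python
def transpose(m):
    """Interchange rows and columns."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def trace(m):
    """Sum of the leading-diagonal elements of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))

def is_symmetric(m):
    """A matrix is symmetric when it equals its transpose."""
    return m == transpose(m)

A = [[4, 6], [7, 9], [2, 5]]           # 3 x 2 example from (v)
S = [[1, 2, 5], [2, 8, 9], [5, 9, 4]]  # symmetric example from (i)
c = 2 ** -0.5                          # 1 / sqrt(2)
Q = [[c, -c], [c, c]]                  # assumed orthogonal example (rotation)

A_t = transpose(A)
QtQ = matmul(transpose(Q), Q)          # should be the 2 x 2 unit matrix
print(A_t, is_symmetric(S), trace(S))
```

Because Q contains floating-point values, the product `QtQ` should be compared with the unit matrix using a small tolerance rather than exact equality.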
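b. Given that y has covariance matrix
$$\Sigma = \begin{bmatrix} 25 & 2 & 4 \\ 2 & 4 & 1 \\ 4 & 1 & 9 \end{bmatrix},$$
let $V$ be the diagonal matrix of variances and let $R$ be the correlation matrix of y.
But $\Sigma = V^{1/2} R V^{1/2}$, so
$$R = V^{-1/2}\, \Sigma\, V^{-1/2}$$
$$V^{-1} = \begin{bmatrix} \tfrac{1}{25} & 0 & 0 \\ 0 & \tfrac{1}{4} & 0 \\ 0 & 0 & \tfrac{1}{9} \end{bmatrix}, \qquad V^{-1/2} = \begin{bmatrix} \tfrac{1}{5} & 0 & 0 \\ 0 & \tfrac{1}{2} & 0 \\ 0 & 0 & \tfrac{1}{3} \end{bmatrix}$$
$$R = \begin{bmatrix} \tfrac{1}{5} & 0 & 0 \\ 0 & \tfrac{1}{2} & 0 \\ 0 & 0 & \tfrac{1}{3} \end{bmatrix} \begin{bmatrix} 25 & 2 & 4 \\ 2 & 4 & 1 \\ 4 & 1 & 9 \end{bmatrix} \begin{bmatrix} \tfrac{1}{5} & 0 & 0 \\ 0 & \tfrac{1}{2} & 0 \\ 0 & 0 & \tfrac{1}{3} \end{bmatrix}$$
Therefore, the correlation matrix R is:
$$R = \begin{bmatrix} 1 & \tfrac{1}{5} & \tfrac{4}{15} \\ \tfrac{1}{5} & 1 & \tfrac{1}{6} \\ \tfrac{4}{15} & \tfrac{1}{6} & 1 \end{bmatrix}$$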
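The scaling $R = V^{-1/2} \Sigma V^{-1/2}$ amounts to dividing each covariance by the product of the two standard deviations. A minimal sketch, assuming the covariance matrix of Q1b has rows (25, 2, 4), (2, 4, 1), (4, 1, 9); the function name `correlation_from_covariance` is illustrative only.

```python
import math

# Covariance matrix of y, taken as given in Q1b (assumed row order).
sigma = [
    [25.0, 2.0, 4.0],
    [2.0, 4.0, 1.0],
    [4.0, 1.0, 9.0],
]

def correlation_from_covariance(cov):
    """Return R with r_ij = s_ij / sqrt(s_ii * s_jj)."""
    sd = [math.sqrt(cov[i][i]) for i in range(len(cov))]
    return [
        [cov[i][j] / (sd[i] * sd[j]) for j in range(len(cov))]
        for i in range(len(cov))
    ]

R = correlation_from_covariance(sigma)
for row in R:
    print([round(value, 4) for value in row])
```

The off-diagonal entries come out as 1/5, 4/15 and 1/6, with ones on the diagonal.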
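ii. Since $\mathrm{Var}[a'y] = a' \Sigma a$,
$$\mathrm{Var}[a'y] = [\,2 \;\; 0 \;\; 1\,] \begin{bmatrix} 25 & 2 & 4 \\ 2 & 4 & 1 \\ 4 & 1 & 9 \end{bmatrix} \begin{bmatrix} 2 \\ 0 \\ 1 \end{bmatrix}$$
$$\mathrm{Var}[a'y] = 125$$
iii. Let
$$A = \begin{bmatrix} 25 & 2 & 4 \\ 2 & 4 & 1 \\ 4 & 1 & 9 \end{bmatrix}$$
Since $\mathrm{trace}(A) = \sum_{i=1}^{n} a_{ii}$, $\mathrm{trace}(A) = 25 + 4 + 9 = 38$.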
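Both results can be confirmed with a few lines of integer arithmetic. This is a sketch only; `quadratic_form` and `trace` are illustrative helper names, and the vector a is taken as (2, 0, 1).

```python
sigma = [[25, 2, 4], [2, 4, 1], [4, 1, 9]]  # covariance matrix from Q1b
a = [2, 0, 1]                               # coefficient vector a (assumed order)

def quadratic_form(vec, mat):
    """Compute vec' * mat * vec, which equals Var(a'y) when mat = Cov(y)."""
    n = len(vec)
    return sum(vec[i] * mat[i][j] * vec[j] for i in range(n) for j in range(n))

def trace(mat):
    """Sum of the diagonal elements."""
    return sum(mat[i][i] for i in range(len(mat)))

var_ay = quadratic_form(a, sigma)
tr_sigma = trace(sigma)
print(var_ay, tr_sigma)  # 125 38
```

Expanding the quadratic form by hand gives 4(25) + 2(2)(1)(4) + 1(9) = 125, which agrees with the value above.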
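Q2) a. i. Since $\bar{X} = \dfrac{1}{n} \sum X_i$,
$$\bar{X}_1 = \frac{5 + 4 + 6 + 2 + 2 + 6 + 3}{7} = 4$$
$$\bar{X}_2 = \frac{5 + 5 + 4 + 7 + 9 + 5 + 7}{7} = 6$$
Hence the sample mean vector is $\bar{X} = \begin{bmatrix} 4 \\ 6 \end{bmatrix}$.
ii. For the sample variances and covariances, $S_n$:
$$S_n = \begin{bmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{bmatrix}, \qquad S_{jk} = \frac{1}{n-1} \sum_{i=1}^{n} (x_{ij} - \bar{X}_j)(x_{ik} - \bar{X}_k)$$
$$S_{11} = \tfrac{1}{6}\left[(5-4)^2 + (4-4)^2 + (6-4)^2 + (2-4)^2 + (2-4)^2 + (6-4)^2 + (3-4)^2\right] = 3$$
$$S_{22} = \tfrac{1}{6}\left[(5-6)^2 + (5-6)^2 + (4-6)^2 + (7-6)^2 + (9-6)^2 + (5-6)^2 + (7-6)^2\right] = 3$$
$$S_{12} = S_{21} = \tfrac{1}{6}\left[(5-4)(5-6) + (4-4)(5-6) + (6-4)(4-6) + (2-4)(7-6) + \cdots + (3-4)(7-6)\right] = -2.67$$
$$S_n = \begin{bmatrix} 3 & -2.67 \\ -2.67 & 3 \end{bmatrix}$$
iii. For the sample correlation, R: since
$$R = \begin{bmatrix} r_{11} & r_{12} \\ r_{21} & r_{22} \end{bmatrix}, \qquad r_{ij} = \frac{S_{ij}}{\sqrt{S_{ii}\, S_{jj}}},$$
$$r_{11} = \frac{3}{\sqrt{3 \times 3}} = \frac{3}{\sqrt{9}} = \frac{3}{3} = 1, \qquad r_{22} = \frac{3}{\sqrt{3 \times 3}} = \frac{3}{\sqrt{9}} = \frac{3}{3} = 1$$
$$r_{12} = r_{21} = \frac{-2.67}{\sqrt{3 \times 3}} = \frac{-2.67}{\sqrt{9}} = \frac{-2.67}{3} = -0.89$$
$$R = \begin{bmatrix} 1 & -0.89 \\ -0.89 & 1 \end{bmatrix}$$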
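The sample statistics of Q2a can be reproduced directly from the seven pairs of observations. A minimal sketch, assuming the last X2 value is 7 (the value the sums in the working use) and the divisor n - 1; the function names are illustrative.

```python
import math

# Assumption: X2 ends in 7, the value used in the marking scheme's sums.
x1 = [5, 4, 6, 2, 2, 6, 3]
x2 = [5, 5, 4, 7, 9, 5, 7]

def mean(values):
    return sum(values) / len(values)

def sample_cov(u, v):
    """Sample covariance with divisor n - 1 (variance when u is v)."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)

xbar = [mean(x1), mean(x2)]
S = [
    [sample_cov(x1, x1), sample_cov(x1, x2)],
    [sample_cov(x2, x1), sample_cov(x2, x2)],
]
r12 = S[0][1] / math.sqrt(S[0][0] * S[1][1])
print(xbar, [[round(v, 2) for v in row] for row in S], round(r12, 2))
```

The covariance and the correlation are negative: large values of X1 tend to go with small values of X2 in this data set.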
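b. Using Fisher's discriminant function, given that
$$\bar{X}_1 = \begin{bmatrix} -0.0065 \\ -0.0390 \end{bmatrix}, \quad \bar{X}_2 = \begin{bmatrix} -0.2483 \\ -0.0262 \end{bmatrix}, \quad S_P^{-1} = [\quad], \quad X_0 = \begin{bmatrix} 2.210 \\ 0.044 \end{bmatrix}$$
since $y = (\bar{x}_1 - \bar{x}_2)'\, S_p^{-1}\, X_0$,
$$\bar{x}_1 - \bar{x}_2 = \begin{bmatrix} -0.0065 + 0.2483 \\ -0.0390 + 0.0262 \end{bmatrix} = \begin{bmatrix} 0.2418 \\ -0.0128 \end{bmatrix}$$
$$y = (\bar{x}_1 - \bar{x}_2)'\, S_p^{-1}\, X_0 = [\,0.2418 \;\; -0.0128\,]\; S_p^{-1} \begin{bmatrix} 2.210 \\ 0.044 \end{bmatrix}$$
therefore y = -5.8801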
Also since
D
2
( ) ( )
2 1
1
2 1
x x Sp x x +
( ) +
2 1
x x +
1
]
1
00390 . 0
0065 . 0
,
_
1
]
1
0301 . 0
3133 . 0
0262 . 0
2483 . 0
D
2
(0.2418 -0.0128)
1
]
1
,
_
0301 . 0
3133 . 0
D
2
4299 . 3
2
8598 . 6
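Q3) Given the sample mean vectors
$$\bar{X}_1 = \begin{bmatrix} 6.833 \\ 7.033 \\ 3.967 \\ 4.700 \end{bmatrix}, \qquad \bar{X}_2 = \begin{bmatrix} 6.633 \\ 7.000 \\ 4.000 \\ 4.533 \end{bmatrix}$$
and the contrast matrix
$$C = \begin{bmatrix} -1 & 1 & 0 & 0 \\ 0 & -1 & 1 & 0 \\ 0 & 0 & -1 & 1 \end{bmatrix},$$
parallel profiles are rejected if
$$T^2 = (\bar{x}_1 - \bar{x}_2)' C' \left[ \left( \frac{1}{n_1} + \frac{1}{n_2} \right) C S_p C' \right]^{-1} C (\bar{x}_1 - \bar{x}_2) > c^2$$
$$\bar{x}_1 - \bar{x}_2 = \begin{bmatrix} 6.833 - 6.633 \\ 7.033 - 7.000 \\ 3.967 - 4.000 \\ 4.700 - 4.533 \end{bmatrix} = \begin{bmatrix} 0.200 \\ 0.033 \\ -0.033 \\ 0.167 \end{bmatrix}$$
$$C(\bar{x}_1 - \bar{x}_2) = \begin{bmatrix} -1 & 1 & 0 & 0 \\ 0 & -1 & 1 & 0 \\ 0 & 0 & -1 & 1 \end{bmatrix} \begin{bmatrix} 0.200 \\ 0.033 \\ -0.033 \\ 0.167 \end{bmatrix} = \begin{bmatrix} -0.167 \\ -0.066 \\ 0.200 \end{bmatrix}$$
$$C S_p C' = \begin{bmatrix} 0.719 & -0.268 & -0.125 \\ -0.268 & 1.101 & -0.751 \\ -0.125 & -0.751 & 1.058 \end{bmatrix}$$
But $\dfrac{1}{n_1} + \dfrac{1}{n_2} = \dfrac{1}{30} + \dfrac{1}{30} = 0.067$, so
$$0.067\, C S_p C' = \begin{bmatrix} 0.04795 & -0.01787 & -0.008337 \\ -0.01787 & 0.0734 & -0.05009 \\ -0.008337 & -0.05009 & 0.070568 \end{bmatrix}$$
$$T^2 = [\,-0.167 \;\; -0.066 \;\; 0.200\,] \left[ 0.067\, C S_p C' \right]^{-1} \begin{bmatrix} -0.167 \\ -0.066 \\ 0.200 \end{bmatrix} = 1.005$$
The critical value is
$$c^2 = \frac{(n_1 + n_2 - 2)(p - 1)}{n_1 + n_2 - p} F_{p-1,\, n_1 + n_2 - p}(0.05) = \frac{58(3)}{56} F_{3,56}(0.05) \approx 3.107\,(2.77) \approx 8.6$$
$T^2 < c^2$, i.e. 1.005 < 8.6.
We fail to reject $H_0$ and conclude that the two profiles are parallel. Given parallelism, the test for coincident profiles can then be carried out.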
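The parallel-profile statistic can be recomputed from the quantities quoted in the working. The sketch below assumes the signed entries of $C S_p C'$ are 0.719, -0.268, -0.125, 1.101, -0.751, 1.058 and uses an approximate table value $F_{3,56}(0.05) \approx 2.77$; the small Gaussian-elimination helper `solve` is illustrative, not part of the marking scheme.

```python
n1, n2, p = 30, 30, 4
xbar1 = [6.833, 7.033, 3.967, 4.700]   # males
xbar2 = [6.633, 7.000, 4.000, 4.533]   # females
# C * Sp * C' as quoted in the working (signs are an assumption).
csc = [
    [0.719, -0.268, -0.125],
    [-0.268, 1.101, -0.751],
    [-0.125, -0.751, 1.058],
]
F_CRIT = 2.77  # assumed approximate table value of F(3, 56) at the 5% level

def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

diff = [a - b for a, b in zip(xbar1, xbar2)]
# Rows of C are (-1, 1, 0, 0), (0, -1, 1, 0), (0, 0, -1, 1): successive differences.
contrast = [diff[k + 1] - diff[k] for k in range(p - 1)]
scale = 1 / n1 + 1 / n2
scaled = [[scale * value for value in row] for row in csc]
t2 = sum(c * w for c, w in zip(contrast, solve(scaled, contrast)))
c2 = (n1 + n2 - 2) * (p - 1) / (n1 + n2 - p) * F_CRIT
print(round(t2, 3), round(c2, 2), t2 > c2)
```

With these inputs the statistic is close to 1.0 and far below the critical value, so parallelism is not rejected.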
a. Since
Sp =
1
1
1
1
]
1
6390 . 0
2695 . 0
3038 . 0
2738 . 0
$$\bar{X}_1 = \begin{bmatrix} 6.833 \\ 7.033 \\ 3.967 \\ 4.700 \end{bmatrix}, \qquad \bar{X}_2 = \begin{bmatrix} 6.633 \\ 7.000 \\ 4.000 \\ 4.533 \end{bmatrix}$$
BOLGATANGA POLYTECHNIC
DEPARTMENT OF STATISTICS HND 3
MULTI-VARIATE DATA ANALYSIS (STA 311)
END OF FIRST SEMESTER EXAMINATION 2011 / 2012 (TIME: 3 HRS)
MARKING SCHEME
SECTION B (50MARKS)
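Q1) a. Given that $m' = [\,0 \;\; 0 \;\; 0 \;\; \cdots \;\; 1\,]$, to show that $z_p \sim N(\mu_p, \sigma_{pp})$.
Since $m'Z \sim N(m'\mu,\; m'\Sigma m)$, let
$$\mu = \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_p \end{bmatrix}, \qquad \Sigma = \begin{bmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p} \\ \vdots & \vdots & & \vdots \\ \sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp} \end{bmatrix}$$
$$m'\mu = [\,0 \;\; 0 \;\; 0 \;\; \cdots \;\; 1\,] \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_p \end{bmatrix} = 0 + 0 + 0 + \cdots + \mu_p = \mu_p$$
$$m'\Sigma m = [\,0 \;\; 0 \;\; \cdots \;\; 1\,] \begin{bmatrix} \sigma_{11} & \sigma_{12} & \sigma_{13} & \cdots \\ \sigma_{21} & \sigma_{22} & \sigma_{23} & \cdots \\ \vdots & & & \\ \sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp} \end{bmatrix} \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} = [\,\sigma_{p1} \;\; \sigma_{p2} \;\; \cdots \;\; \sigma_{pp}\,] \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} = \sigma_{pp}$$
Hence $z_p \sim N(\mu_p, \sigma_{pp})$.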
b. Given the matrix
$$\begin{bmatrix} 1 & 3 & 0 \\ 3 & -1 & -2 \\ 0 & -2 & 2 \end{bmatrix}$$
i. $\mathrm{Cov}(X_1, X_2) = 3$.
Since their covariance is not zero, $X_1$ and $X_2$ are not independent.
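ii. $[(x_2 + x_3),\; x_3]$
Adding the rows for $X_2$ and $X_3$ element by element:
$X_2 + X_3 = 3 + 0 = 3$
$X_2 + X_3 = -1 - 2 = -3$
$X_2 + X_3 = -2 + 2 = 0$
$$[(x_2 + x_3),\; x_3]: \quad \begin{bmatrix} 3 & 0 \\ -3 & -2 \\ 0 & 2 \end{bmatrix}$$
Since $\mathrm{Cov}(X_2 + X_3,\; X_3) = 0$, $(X_2 + X_3)$ and $X_3$ are independent.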
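iii. $\left[\tfrac{1}{2}(x_1 + x_2 - x_3),\; \tfrac{1}{2}(x_1 - x_3)\right]$
Combining the rows of the matrix element by element:
$\tfrac{1}{2}(x_1 + x_2 - x_3)$: $\tfrac{1}{2}(1 + 3 - 0) = 2$, $\tfrac{1}{2}(3 - 1 + 2) = 2$, $\tfrac{1}{2}(0 - 2 - 2) = -2$
$\tfrac{1}{2}(x_1 - x_3)$: $\tfrac{1}{2}(1 - 0) = \tfrac{1}{2}$, $\tfrac{1}{2}(3 + 2) = \tfrac{5}{2}$, $\tfrac{1}{2}(0 - 2) = -1$
$$\begin{bmatrix} 2 & \tfrac{1}{2} \\ 2 & \tfrac{5}{2} \\ -2 & -1 \end{bmatrix}$$
$$\mathrm{Cov}\left[\tfrac{1}{2}(x_1 + x_2 - x_3),\; \tfrac{1}{2}(x_1 - x_3)\right] = \tfrac{1}{2}\left(\tfrac{1}{2}\right) + \tfrac{1}{2}\left(\tfrac{5}{2}\right) - \tfrac{1}{2}(-1) = 2$$
Therefore they are not independent, since their covariance is not equal to zero.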
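iv. Given the matrix
$$\begin{bmatrix} 1 & 3 & 0 \\ 3 & -1 & -2 \\ 0 & -2 & 2 \end{bmatrix},$$
to find $\left[\dfrac{X_1 + X_2}{2},\; X_3\right]$:
$\dfrac{x_1 + x_2}{2} = \tfrac{1}{2}(1 + 3) = 2$
$\dfrac{x_1 + x_2}{2} = \tfrac{1}{2}(3 - 1) = 1$
$\dfrac{x_1 + x_2}{2} = \tfrac{1}{2}(0 - 2) = -1$
Therefore
$$\left[\frac{x_1 + x_2}{2},\; x_3\right]: \quad \begin{bmatrix} 2 & 0 \\ 1 & -2 \\ -1 & 2 \end{bmatrix}$$
The covariance of $\dfrac{x_1 + x_2}{2}$ with $x_3$ is the entry in the $x_3$ row of the first column, $\tfrac{1}{2}(0 - 2) = -1$. Since this covariance is not zero, $\dfrac{X_1 + X_2}{2}$ and $X_3$ are not independent.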
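For jointly normal variables, two linear combinations $a'X$ and $b'X$ are independent exactly when $\mathrm{Cov}(a'X, b'X) = a' \Sigma b = 0$. The sketch below evaluates that quantity for the four pairs, assuming the matrix has rows (1, 3, 0), (3, -1, -2), (0, -2, 2) with the signs implied by the row arithmetic of part (ii), and assuming the combination in (iii) is $\tfrac{1}{2}(x_1 + x_2 - x_3)$ and $\tfrac{1}{2}(x_1 - x_3)$. The name `cov_linear` is illustrative.

```python
from fractions import Fraction

# Assumed signed matrix, as implied by the row sums worked in part (ii).
sigma = [[1, 3, 0], [3, -1, -2], [0, -2, 2]]

def cov_linear(a, b, cov):
    """Covariance of a'X and b'X, i.e. a' * cov * b."""
    n = len(cov)
    return sum(a[i] * cov[i][j] * b[j] for i in range(n) for j in range(n))

half = Fraction(1, 2)
pairs = {
    "(i)": ([1, 0, 0], [0, 1, 0]),                  # x1 and x2
    "(ii)": ([0, 1, 1], [0, 0, 1]),                 # x2 + x3 and x3
    "(iii)": ([half, half, -half], [half, 0, -half]),
    "(iv)": ([half, half, 0], [0, 0, 1]),           # (x1 + x2)/2 and x3
}
covs = {label: cov_linear(a, b, sigma) for label, (a, b) in pairs.items()}
for label, value in covs.items():
    print(label, value, "independent" if value == 0 else "not independent")
```

Exact fractions are used so that a covariance of zero is detected without floating-point rounding.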
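Q2. a. Hypotheses
$H_0$: $\tau_1 = \tau_2 = \cdots = \tau_g = 0$
$H_1$: not all of $\tau_1, \tau_2, \ldots, \tau_g$ are zero
Sample means for each level:
100 Level: $\bar{X}_{11} = \dfrac{14 + 6 + 8 + 16}{4} = 11$, $\bar{X}_{12} = \dfrac{8 + 2 + 2 - 4}{4} = 2$, so $\bar{X}_1 = \begin{bmatrix} 11 \\ 2 \end{bmatrix}$
200 Level: $\bar{X}_{21} = \dfrac{1 + 5 + 0 + 2}{4} = 2$, $\bar{X}_{22} = \dfrac{6 + 12 + 15 + 7}{4} = 10$, so $\bar{X}_2 = \begin{bmatrix} 2 \\ 10 \end{bmatrix}$
300 Level: $\bar{X}_{31} = \dfrac{3 - 2 - 11 - 6}{4} = -4$, $\bar{X}_{32} = \dfrac{-2 + 7 + 1 + 6}{4} = 3$, so $\bar{X}_3 = \begin{bmatrix} -4 \\ 3 \end{bmatrix}$
For the overall mean, since
$$\bar{X} = \begin{bmatrix} \dfrac{n_1 \bar{x}_{11} + n_2 \bar{x}_{21} + n_3 \bar{x}_{31}}{\sum n_i} \\ \dfrac{n_1 \bar{x}_{12} + n_2 \bar{x}_{22} + n_3 \bar{x}_{32}}{\sum n_i} \end{bmatrix} = \begin{bmatrix} \dfrac{4(11) + 4(2) + 4(-4)}{12} \\ \dfrac{4(2) + 4(10) + 4(3)}{12} \end{bmatrix} = \begin{bmatrix} 3 \\ 5 \end{bmatrix}$$
Since $X_{ij} = \bar{X} + (\bar{X}_i - \bar{X}) + (X_{ij} - \bar{X}_i)$, i.e. OBS = Mean + Trt + Res:
Variable 1:
$$\begin{bmatrix} 14 & 6 & 8 & 16 \\ 1 & 5 & 0 & 2 \\ 3 & -2 & -11 & -6 \end{bmatrix} = \begin{bmatrix} 3 & 3 & 3 & 3 \\ 3 & 3 & 3 & 3 \\ 3 & 3 & 3 & 3 \end{bmatrix} + \begin{bmatrix} 8 & 8 & 8 & 8 \\ -1 & -1 & -1 & -1 \\ -7 & -7 & -7 & -7 \end{bmatrix} + \begin{bmatrix} 3 & -5 & -3 & 5 \\ -1 & 3 & -2 & 0 \\ 7 & 2 & -7 & -2 \end{bmatrix}$$
$SS_{OBS} = SS_{MEAN} + SS_{TRT} + SS_{ERROR}$
752 = 108 + 456 + 188
Total sum of squares corrected = $SS_{OBS} - SS_{MEAN}$
TSS = 752 - 108 = 644
Variable 2:
$$\begin{bmatrix} 8 & 2 & 2 & -4 \\ 6 & 12 & 15 & 7 \\ -2 & 7 & 1 & 6 \end{bmatrix} = \begin{bmatrix} 5 & 5 & 5 & 5 \\ 5 & 5 & 5 & 5 \\ 5 & 5 & 5 & 5 \end{bmatrix} + \begin{bmatrix} -3 & -3 & -3 & -3 \\ 5 & 5 & 5 & 5 \\ -2 & -2 & -2 & -2 \end{bmatrix} + \begin{bmatrix} 6 & 0 & 0 & -6 \\ -4 & 2 & 5 & -3 \\ -5 & 4 & -2 & 3 \end{bmatrix}$$
$SS_{OBS} = SS_{MEAN} + SS_{TRT} + SS_{ERROR}$
632 = 300 + 152 + 180
Total sum of squares corrected = $SS_{OBS} - SS_{MEAN}$
TSS = 632 - 300 = 332
Cross products:
Mean = (3)(5) + (3)(5) + ... + (3)(5) = 12(3)(5) = 180
Treatment = 4(8)(-3) + 4(-1)(5) + 4(-7)(-2) = -60
Residual = (3)(6) + (-5)(0) + (-3)(0) + (5)(-6) + ... = -31
Total cross product = (14)(8) + (6)(2) + (8)(2) + (16)(-4) + ... = 89
Total corrected cross product = Total cross product - Mean cross product
= 89 - 180 = -91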
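b.
MANOVA TABLE
Source of Variation      Matrix of SS and Cross Products                                   D.F.
Treatment                $B = \begin{bmatrix} 456 & -60 \\ -60 & 152 \end{bmatrix}$        $g - 1 = 2$
Residual                 $W = \begin{bmatrix} 188 & -31 \\ -31 & 180 \end{bmatrix}$        $n - g = 9$
Total (corrected)        $B + W = \begin{bmatrix} 644 & -91 \\ -91 & 332 \end{bmatrix}$    $n - 1 = 11$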
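The entries of the MANOVA table can be rebuilt from the raw observation vectors. This is a sketch under a stated assumption: the signs of the observations (for example 16 paired with -4 in the 100 Level, and -2, -11, -6 for the first variable in the 300 Level) are the ones implied by the group means and sums of squares in part (a). The helper names are illustrative.

```python
# Observation vectors (X1, X2); signs are an assumption inferred from part (a).
groups = {
    "100 Level": [(14, 8), (6, 2), (8, 2), (16, -4)],
    "200 Level": [(1, 6), (5, 12), (0, 15), (2, 7)],
    "300 Level": [(3, -2), (-2, 7), (-11, 1), (-6, 6)],
}

def mean_vector(rows):
    n = len(rows)
    return [sum(r[k] for r in rows) / n for k in range(len(rows[0]))]

def add_outer(target, u, weight=1.0):
    """Accumulate weight * u u' into the square matrix target."""
    for i in range(len(u)):
        for j in range(len(u)):
            target[i][j] += weight * u[i] * u[j]

all_rows = [r for rows in groups.values() for r in rows]
p = len(all_rows[0])
grand = mean_vector(all_rows)

B = [[0.0] * p for _ in range(p)]  # treatment (between) SS and cross products
W = [[0.0] * p for _ in range(p)]  # residual (within) SS and cross products
for rows in groups.values():
    m = mean_vector(rows)
    add_outer(B, [m[k] - grand[k] for k in range(p)], weight=len(rows))
    for r in rows:
        add_outer(W, [r[k] - m[k] for k in range(p)])

T = [[B[i][j] + W[i][j] for j in range(p)] for i in range(p)]  # corrected total
df = (len(groups) - 1, len(all_rows) - len(groups), len(all_rows) - 1)
print(grand, B, W, T, df)
```

The degrees of freedom follow the usual one-way layout: g - 1 for treatment, n - g for residual and n - 1 for the corrected total.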
c. For Wilks' lambda,
$$\Lambda^* = \frac{|W|}{|B + W|} = \frac{\begin{vmatrix} 188 & -31 \\ -31 & 180 \end{vmatrix}}{\begin{vmatrix} 644 & -91 \\ -91 & 332 \end{vmatrix}} = \frac{33840 - 961}{213808 - 8281} = \frac{32879}{205527} = 0.16$$
The test statistic is
$$L = -\left(n - 1 - \frac{p + g}{2}\right) \ln \Lambda^*$$
With $n = 12$ observations, $p = 2$ variables and $g = 3$ levels,
$$L = -\left(12 - 1 - \frac{2 + 3}{2}\right) \ln(0.16) = 15.58$$
For the degrees of freedom, $p(g - 1) = 2(3 - 1) = 4$, and $\chi^2_4(0.05) = 9.488$.
DECISION
Since L = 15.58 > $\chi^2_4(0.05) = 9.488$, we reject $H_0$ at $\alpha = 0.05$.
CONCLUSION
We conclude that the three levels are not of the same effect.
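Wilks' lambda and the large-sample chi-square statistic can be recomputed from the treatment and residual matrices. A minimal sketch, assuming the signed matrices B and W with off-diagonal entries -60 and -31, p = 2 response variables, g = 3 levels and n = 12 observations; the critical value 9.488 is the tabulated 5% point of chi-square with 4 degrees of freedom, hard-coded here rather than computed.

```python
import math

B = [[456, -60], [-60, 152]]   # treatment SS and cross products (assumed signs)
W = [[188, -31], [-31, 180]]   # residual SS and cross products (assumed signs)
n, p, g = 12, 2, 3             # observations, variables, groups
CHI2_CRIT = 9.488              # table value: chi-square, 4 d.f., alpha = 0.05

def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

total = [[B[i][j] + W[i][j] for j in range(2)] for i in range(2)]
wilks = det2(W) / det2(total)

# Bartlett's approximation: -(n - 1 - (p + g)/2) ln(Lambda) ~ chi-square, p(g - 1) d.f.
statistic = -(n - 1 - (p + g) / 2) * math.log(wilks)
dof = p * (g - 1)
reject = statistic > CHI2_CRIT
print(round(wilks, 4), round(statistic, 2), dof, reject)
```

Keeping p, g and n as named values makes it easy to see how the multiplier and the degrees of freedom change with the design.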
Q3)
$$T^2 = n(\bar{x} - \mu_0)'\, S^{-1}\, (\bar{x} - \mu_0)$$
$$T^2 = 20\, [\,4.640 - 4, \;\; 45.400 - 50, \;\; 9.965 - 10\,] \begin{bmatrix} 0.586 & -0.022 & 0.258 \\ -0.022 & 0.006 & -0.002 \\ 0.258 & -0.002 & 0.402 \end{bmatrix} \begin{bmatrix} 4.640 - 4 \\ 45.400 - 50 \\ 9.965 - 10 \end{bmatrix}$$
$$T^2 = 20\, [\,0.640, \;\; -4.600, \;\; -0.035\,] \begin{bmatrix} 0.467 \\ -0.042 \\ 0.160 \end{bmatrix}$$
$T^2 = 9.74$ (10 marks)
By comparing with the critical value
$$\frac{(n-1)p}{n-p} F_{p,\, n-p}(0.10) = \frac{19(3)}{17} F_{3,17}(0.10) = 3.353\,(2.44) = 8.18,$$
i.e. $T^2 = 9.74 > \dfrac{(n-1)p}{n-p} F_{p,\, n-p}(0.10) = 8.18$, so 9.74 > 8.18.
(5 marks)
Therefore we reject $H_0$ at the 10% level of significance.
(2 marks)
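The one-sample Hotelling $T^2$ calculation can be sketched in a few lines. The values below are assumptions read from the working: sample mean (4.640, 45.400, 9.965), hypothesised mean (4, 50, 10), the rounded inverse covariance matrix with entries 0.586, -0.022, 0.258, 0.006, -0.002, 0.402, and the table value $F_{3,17}(0.10) = 2.44$. Because the inverse is rounded to three decimals, the computed statistic differs slightly from the 9.74 quoted.

```python
n, p = 20, 3
xbar = [4.640, 45.400, 9.965]   # sweat rate, sodium, potassium
mu0 = [4.0, 50.0, 10.0]         # hypothesised mean vector (assumed)
s_inv = [                       # rounded inverse covariance matrix (assumed)
    [0.586, -0.022, 0.258],
    [-0.022, 0.006, -0.002],
    [0.258, -0.002, 0.402],
]
F_CRIT = 2.44                   # table value of F(3, 17) at the 10% level

d = [x - m for x, m in zip(xbar, mu0)]
s_inv_d = [sum(row[j] * d[j] for j in range(p)) for row in s_inv]
t2 = n * sum(di * vi for di, vi in zip(d, s_inv_d))

critical = (n - 1) * p / (n - p) * F_CRIT
reject = t2 > critical
print([round(v, 3) for v in s_inv_d], round(critical, 2), reject)
```

The intermediate vector `s_inv_d` corresponds to the column (0.467, -0.042, 0.160) in the working, and the decision to reject at the 10% level does not depend on the rounding.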