Q1. In the figure below, samples falling inside one closed curve belong to one class, and samples falling inside the other closed curve belong to the other class. Is PCA a good choice to reduce the dimension of this data? Why or why not?
Answer: PCA may not be a good choice. PCA is meant to find the best representation of the whole data in fewer dimensions, but the problem stated here is a classification problem. Moreover, the figure shows that the direction of maximum variability of the data is almost orthogonal to the direction along which the class labels vary.
Thus, projecting the data onto the direction of maximum variability will lose the information necessary for classification.
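This geometry can be reproduced on synthetic data (a sketch with assumed, illustrative numbers, not the data from the figure): the classes differ only along a low-variance axis, so the first principal component follows the high-variance, label-free direction and discards the class-separating one.

```python
import numpy as np

# Illustrative sketch (assumed data): class labels vary along the
# low-variance y axis, while x carries most of the variance but no
# class information.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(0.0, 5.0, size=2 * n)                 # high variance, label-free
y = np.concatenate([rng.normal(+1.0, 0.2, size=n),   # class 0 offset
                    rng.normal(-1.0, 0.2, size=n)])  # class 1 offset
data = np.column_stack([x, y])
labels = np.array([0] * n + [1] * n)

# PCA: first principal component = eigenvector of the covariance
# matrix with the largest eigenvalue.
centered = data - data.mean(axis=0)
cov = centered.T @ centered / (len(data) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)               # eigenvalues ascending
pc1 = eigvecs[:, -1]                                 # direction of max variance

# Class separation before and after projecting onto pc1.
proj = centered @ pc1
gap_pca = abs(proj[labels == 0].mean() - proj[labels == 1].mean())
gap_orig = abs(y[:n].mean() - y[n:].mean())          # separation along y
print(abs(pc1[0]), gap_pca, gap_orig)
```

Here pc1 points almost entirely along x, so after projection the two class means nearly coincide, while along the discarded y axis they are about 2 apart.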
Q2. Use Fisher's Linear Discriminant method to find the discriminant function, and predict whether the last row belongs to class 1 or class 2.

class    X1      X2
1        2.95    6.63
1        2.53    7.79
1        3.57    5.65
1        3.16    5.47
2        2.58    4.46
2        2.16    6.22
2        3.27    3.52
?        2.81    5.46    (prediction)
Answer: (following the worked example at http://people.revoledu.com/kardi/tutorial/LDA/index.html)

Sample counts and priors:
    n1 = 4, n2 = 3; P(1) = 4/7 ≈ 0.571, P(2) = 3/7 ≈ 0.429

Class means and overall mean:
    m1 ≈ (3.05, 6.38), m2 ≈ (2.67, 4.73), m ≈ (2.88, 5.65)

Mean-adjusted samples (each sample minus its class mean):
    class 1: (-0.10, 0.24), (-0.52, 1.40), (0.52, -0.73), (0.11, -0.92)
    class 2: (-0.09, -0.28), (-0.51, 1.49), (0.60, -1.21)

Within-class scatter matrix (sum of outer products of the mean-adjusted samples) and its inverse:
    Sw ≈ [ 1.19  -2.70 ]        Sw^-1 ≈ [ 5.70  2.15 ]
         [-2.70   7.16 ]                [ 2.15  0.95 ]

Discriminant direction and threshold (using rounded intermediate values):
    w = Sw^-1 (m1 - m2) ≈ (5.714, 2.386)
    c = w . m ≈ 29.925

Discriminant function: g(x) = w . x - c; assign class 1 if g(x) > 0, otherwise class 2.

Prediction for the last row x' = (2.81, 5.46):
    w . x' ≈ 5.714(2.81) + 2.386(5.46) ≈ 29.09 < 29.925
so g(x') < 0 and x' is predicted to belong to class 2.
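The worked solution above can be checked with a short script (a sketch recomputing the same steps at full precision, so the last decimals differ slightly from the rounded sheet values):

```python
import numpy as np

# Fisher LDA on the Q2 data.
C1 = np.array([[2.95, 6.63], [2.53, 7.79], [3.57, 5.65], [3.16, 5.47]])  # class 1
C2 = np.array([[2.58, 4.46], [2.16, 6.22], [3.27, 3.52]])                # class 2

m1, m2 = C1.mean(axis=0), C2.mean(axis=0)

# Within-class scatter: sum of outer products of mean-adjusted samples.
Sw = (C1 - m1).T @ (C1 - m1) + (C2 - m2).T @ (C2 - m2)

w = np.linalg.inv(Sw) @ (m1 - m2)         # discriminant direction
m_all = np.vstack([C1, C2]).mean(axis=0)  # overall mean, as in the worksheet
c = w @ m_all                             # threshold

x_new = np.array([2.81, 5.46])
pred = 1 if w @ x_new > c else 2
print(w, c, pred)                          # pred is 2
```

Using the midpoint of the two class means instead of the overall mean shifts the threshold slightly but gives the same prediction for this point.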
Q3. Which classification problem from the figures below will give a better discriminant function from the Least Squares method, and why? (5 min)
Answer: Least squares weights every training sample equally when estimating the discriminant function (classifier). A classifier trained on the samples in the second figure would be almost equally good at classifying the samples in the first figure. However, training on the additional samples in the bottom-right corner of the first figure tilts the least-squares classifier toward them, even though they already lie well inside the correct region, and this tilt risks misclassifying samples that are close to the decision boundary. Hence the second problem yields the better discriminant function.
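The effect can be demonstrated in one dimension (a minimal sketch with assumed numbers, not the data from the figures): adding far-away but correctly classified samples to one class moves the least-squares boundary enough to flip a borderline point.

```python
import numpy as np

def ls_classifier(x, t):
    """Least-squares fit of t ~ w0 + w1*x; sign(w0 + w1*x) is the class."""
    A = np.column_stack([np.ones_like(x), x])
    w, *_ = np.linalg.lstsq(A, t, rcond=None)
    return w

# Balanced base data: class +1 on the right, class -1 on the left.
x_base = np.array([2.0, 3.0, 4.0, -2.0, -3.0, -4.0])
t_base = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])

# Extra class-(+1) samples far to the right: correctly classified,
# yet they drag the boundary because least squares weights them equally.
x_extra = np.concatenate([x_base, [20.0, 25.0]])
t_extra = np.concatenate([t_base, [1.0, 1.0]])

w_before = ls_classifier(x_base, t_base)
w_after = ls_classifier(x_extra, t_extra)

x_test = 1.0                                          # class +1, near boundary
pred_before = np.sign(w_before[0] + w_before[1] * x_test)  # +1: correct
pred_after = np.sign(w_after[0] + w_after[1] * x_test)     # -1: misclassified
print(pred_before, pred_after)
```

The boundary starts at x = 0 and moves to about x = 1.6 after the extra points are added, misclassifying the borderline sample at x = 1.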
Q4.
PCA and SVD are both dimension-reduction techniques.
PCA can be derived from SVD.
PCA projects the data points onto the eigenvectors of the covariance matrix corresponding to the largest eigenvalues.
SVD is more general: it can be computed even for singular or rectangular matrices.
SVD decomposes a matrix as A = U Σ V'.
U is an orthonormal matrix whose columns are eigenvectors of AA', spanning the column space of A.
V is an orthonormal matrix whose columns are eigenvectors of A'A, spanning the row space of A.
Σ is a diagonal matrix holding the singular values (square roots of the eigenvalues of A'A) on its diagonal.
In both methods, the directions associated with the larger eigenvalues (or singular values) capture the greater variability in the data.
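The PCA-from-SVD relationship can be verified numerically (a sketch on random data): for centered data X, the right singular vectors of X match the eigenvectors of its covariance matrix, and the squared singular values divided by n - 1 match the eigenvalues.

```python
import numpy as np

# Sketch of the PCA/SVD relationship on random data (n samples x d features).
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
Xc = X - X.mean(axis=0)                       # center the data first

# PCA route: eigendecomposition of the covariance matrix.
cov = Xc.T @ Xc / (len(Xc) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)        # ascending order
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

# SVD route: works even for rectangular or rank-deficient matrices.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Eigenvalues of the covariance matrix = squared singular values / (n - 1).
vals_match = np.allclose(eigvals, S**2 / (len(Xc) - 1))
# Eigenvectors match the right singular vectors up to sign.
vecs_match = np.allclose(np.abs(Vt), np.abs(eigvecs.T))
print(vals_match, vecs_match)
```

This is why PCA is usually implemented via an SVD of the centered data matrix rather than by forming the covariance matrix explicitly.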
sample    X1    X2    Y
S_1       0     5      1
S_2       6     1      1
S_3       5     1     -1
S_4       2     2     -1

X = [S_1;S_2;...]