Matlab Code
Andrew I. Hanna
November 3, 2006
Let's start with the basics: what are we trying to do with Gaussian Mixture
Models (GMMs)? As I see it, given some input data X and a number of
mixtures M (which we assume we know a priori), we would like to fit the
data as well as we can using M Gaussian distributions. Without further ado,
let's introduce our friend the Gaussian/normal distribution.
$$
p(x|i) = \frac{1}{(2\pi)^{D/2} |\Sigma_i|^{1/2}} \exp\left( -\frac{1}{2} (x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i) \right) \tag{1}
$$
So what we are asking here is: what is the probability (density) of some input x
under a particular mixture i that has covariance matrix $\Sigma_i$ and mean $\mu_i$?
Here D is the dimension of x. To continue our discussion we must define another
quantity. We would like to be able to tell how much a particular data point x is
represented by a particular mixture. This value is given by
$$
p(i|x) = \frac{p(x|i)\,p(i)}{\sum_{j=1}^{M} p(x|j)\,p(j)}. \tag{2}
$$
At the start of the algorithm the values of p(j) are all set to 1/M, where M
is the number of mixtures. In words, this means that each mixture has equal
probability, i.e. there is no biasing.
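To make (1) and (2) concrete, here is a minimal sketch of how the values p(i|x) might be computed for a handful of points. The toy data, the means and covariances, and the variable names (`prior`, `Q`, `d2`) are illustrative assumptions and are not taken from the original listings; the data is assumed to be stored with one sample per column.

```matlab
% Toy data: D-by-N matrix, one sample per column (assumed layout).
X = [0 0.1 5 5.2; 0 -0.1 5 4.9];
[D, N] = size(X);
M = 2;                          % number of mixtures
mu = {[0; 0], [5; 5]};          % illustrative means (assumption)
sigma = {eye(2), eye(2)};       % illustrative covariances (assumption)
prior = ones(1, M) / M;         % p(j) = 1/M, i.e. no biasing

Q = zeros(N, M);                % Q(d, i) will hold p(i|x_d)
for i = 1:M
    V = X - repmat(mu{i}, 1, N);                 % centred data
    d2 = sum(V .* (sigma{i} \ V), 1);            % (x - mu)' * inv(Sigma) * (x - mu)
    lik = exp(-0.5 * d2) / ((2*pi)^(D/2) * sqrt(det(sigma{i})));  % eq. (1)
    Q(:, i) = (lik * prior(i))';                 % numerator of eq. (2)
end
Q = Q ./ repmat(sum(Q, 2), 1, M);                % divide by the sum over j
```

Each row of `Q` sums to one, and a point close to one mean and far from the other gets almost all of its weight from the nearby mixture.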
Let's get on with the equations for the algorithm. Initially we will have
assigned values to the mean and covariance matrix of each mixture. Maybe
they were random, maybe they were informed; it really depends on your data. But we
have some values for $\mu_i$ and $\Sigma_i$ for i = 1, . . . , M. We can now update those
means and covariances, using (2), with the following equations:
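$$
\mu_i = \frac{\sum_{d=1}^{N} p(i|x_d)\, x_d}{\sum_{d=1}^{N} p(i|x_d)} \tag{3}
$$

$$
\Sigma_i = \frac{\sum_{d=1}^{N} p(i|x_d)\,(x_d - \mu_i)(x_d - \mu_i)^T}{\sum_{d=1}^{N} p(i|x_d)} \tag{4}
$$

$$
p(i) = \frac{1}{N} \sum_{d=1}^{N} p(i|x_d) \tag{5}
$$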
We had better have a function for calculating the output from a Normal
distribution for a particular mean and covariance.
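One way such a function might look is sketched below. This is not the original listing: the handle names `centre` and `gauss_pdf` are hypothetical, and the data is assumed to be a D-by-N matrix with one sample per column. It evaluates (1) for every column of X at once.

```matlab
% Hypothetical helper: subtract the mean (D-by-1) from every column of X (D-by-N).
centre = @(X, mu) X - repmat(mu, 1, size(X, 2));

% Hypothetical helper: evaluate eq. (1) for each column of X, returning a 1-by-N row.
% sigma \ V solves sigma * Y = V, which avoids forming inv(sigma) explicitly.
gauss_pdf = @(X, mu, sigma) ...
    exp(-0.5 * sum(centre(X, mu) .* (sigma \ centre(X, mu)), 1)) ...
    / ((2*pi)^(size(X, 1)/2) * sqrt(det(sigma)));

% Usage: density of a 2-D standard normal at its mean and at the point (1, 0).
p = gauss_pdf([0 1; 0 0], [0; 0], eye(2));
```

For a 2-D standard normal the density at the mean is 1/(2*pi), which gives a quick sanity check on the normalising constant.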
end
% Normalise each row of Q to sum to one, giving p(i|x_d) as in (2)
su = sum(Q, 2);
Q = Q./repmat(su, [1 M]);
for i=1:M
    gm = mixtures(i);
    % sum over d of p(i|x_d)
    sum_p_i_x = sum(Q(:,i));
    % eq. (5): p(i) = (1/N) * sum over d of p(i|x_d)
    p_i(i) = (1/size(Q,1))*sum_p_i_x;
    % eq. (3): updated mean
    gm.mu = X*Q(:,i)./sum_p_i_x;
    % eq. (4): updated covariance
    sigma = zeros(dim);
    for j=1:N
        V = X(:,j) - gm.mu;
        sigma = sigma + Q(j, i)*(V*V');
    end
    oldsigma = gm.sigma;
    gm.sigma = sigma/sum_p_i_x;
    % keep the previous covariance if the new one has an eigenvalue below the threshold
    [P, L] = eig(gm.sigma);
    if any(diag(L)<minEigenvalue)
        gm.sigma = oldsigma;
    end
    mixtures(i) = gm;
end
end
return;
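The loop over j that accumulates the covariance in (4) can also be written as a weighted matrix product. The sketch below, on made-up numbers, computes the update both ways and is only meant to illustrate that the two forms agree; the names `q`, `sigmaVec` and `sigmaLoop` are illustrative and do not come from the original code.

```matlab
X = [0 1 2 3; 0 1 0 1];         % D-by-N toy data, one sample per column
q = [0.9; 0.8; 0.2; 0.1];       % made-up values of p(i|x_d) for one mixture i
[D, N] = size(X);

s = sum(q);                     % sum over d of p(i|x_d)
mu = X * q / s;                 % eq. (3)
V = X - repmat(mu, 1, N);       % centred data

% Eq. (4) as a single weighted product: scale column d of V by q(d), then multiply by V'.
sigmaVec = (V .* repmat(q', D, 1)) * V' / s;

% Eq. (4) with an explicit loop over the data points.
sigmaLoop = zeros(D);
for j = 1:N
    sigmaLoop = sigmaLoop + q(j) * (V(:, j) * V(:, j)');
end
sigmaLoop = sigmaLoop / s;
```

The vectorised form avoids N small outer products, which tends to matter once N is large.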
Once we have this we can go ahead and use our Matlab skills to write a very
simple script like the one shown below.
x1 = randn(300, 2);
x2 = randn(300, 2)*5 + 4;
X = [x1; x2]';
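For orientation, here is a minimal, self-contained sketch of how the update equations (2) to (5) could be iterated on data generated this way. It is not the original script: the initialisation, the fixed iteration count `nIter`, the value of `minEigenvalue`, and the use of cell arrays `mu` and `sigma` in place of the `mixtures` structure are all assumptions made for illustration.

```matlab
% Same synthetic data as above: two 2-D clusters, one sample per column.
x1 = randn(300, 2);
x2 = randn(300, 2)*5 + 4;
X = [x1; x2]';
[D, N] = size(X);

M = 2;                           % number of mixtures (assumed)
nIter = 50;                      % fixed number of iterations (assumed)
minEigenvalue = 1e-6;            % illustrative threshold (assumed)

% Initialisation (an assumption): means at two data points, shared data covariance.
mu = {X(:, 1), X(:, N)};
sigma = {cov(X'), cov(X')};
prior = ones(1, M) / M;          % p(j) = 1/M

for it = 1:nIter
    % Compute p(i|x_d) for every point and mixture, eq. (2).
    Q = zeros(N, M);
    for i = 1:M
        V = X - repmat(mu{i}, 1, N);
        d2 = sum(V .* (sigma{i} \ V), 1);
        Q(:, i) = prior(i) * exp(-0.5 * d2') / ((2*pi)^(D/2) * sqrt(det(sigma{i})));
    end
    Q = Q ./ repmat(sum(Q, 2), 1, M);

    % Update priors, means and covariances, eqs. (5), (3) and (4).
    for i = 1:M
        s = sum(Q(:, i));
        prior(i) = s / N;
        mu{i} = X * Q(:, i) / s;
        V = X - repmat(mu{i}, 1, N);
        newSigma = (V .* repmat(Q(:, i)', D, 1)) * V' / s;
        newSigma = (newSigma + newSigma') / 2;   % remove round-off asymmetry
        if all(eig(newSigma) >= minEigenvalue)   % otherwise keep the old covariance
            sigma{i} = newSigma;
        end
    end
end
```

After the loop, `prior` holds the estimated p(i), and `mu{i}` and `sigma{i}` hold the fitted mean and covariance of each mixture. Which mixture ends up on which cluster depends on the initialisation.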