Sie sind auf Seite 1von 2

# Probabilistic Graphical Models Problem Set 1 Daniel Foreman-Mackey

T
J
B
M
E
N A T
J
B
M
E
N
Figure 3: Left: the original graph. Right: the graph marginalized over A.
11 Problem 11 Exercise 3.15
The set of independences implied by graph (a) are D A, C|B and A C. There are no other
I-equivalent graphs. The independences implied by the Bayesian network (b) are A C, D|B and
C D|B. The four Bayesian networks (including (b) from the exercise) in Figure 4 all imply this
same independence structure. Therefore, they are all I-equivalent.
A
B
C D
C
B
A D
D
B
A C
D
B
A
C
Figure 4: Four I-equivalent Bayesian networks.
12 Problem 12 Exercise 3.2
(a) The assumption given in Equation (3.6) is that each feature X
i
is independent of the other
features {X
i
} conditioned on the class C. This can be written as X
i
{X
i
}|C. Any joint
distribution p(C, X
1
, . . . , X
n
) can be factored (using the chain rule) into p(C) p(X
1
, . . . , X
n
|C).
Then, for a particular feature X
i
, the conditional independence assumption above implies that
p(X
i
, {X
i
}|C) = p(X
i
|C) p( {X
i
}|C). (26)
7
Probabilistic Graphical Models Problem Set 1 Daniel Foreman-Mackey
Applying the conditional independence assumption again, we nd
p(X
i
|C) p(X
j
, {X
i
, X
j
}|C) = p(X
i
|C) p(X
j
|C) p( {X
i
, X
j
}|C). (27)
We can iterate this procedure for all values of i = 1, . . . , n to nd that
p(X
1
, . . . , X
n
|C) =
n
Y
i=1
p(X
i
|C). (28)
Therefore, the conditional independence assumption from Equation (3.6) implies that the join
distribution can be factored
p(C, X
1
, . . . , X
n
) = p(C)
n
Y
i=1
p(X
i
|C) (29)
which is exactly the result from Equation (3.7).
(b) Using the chain rule, we can rewrite the joint probability above as
p(C, X) = p(X) p(C|X). (30)
Therefore, the ratio of joint probabilities can be written (for the observed feature vector x)
p(c
1
, x)
p(c
2
, x)
=
p(c
1
|x)
p(c
2
|x)
. (31)
Then, using Equation (29), this ratio can also be written
p(c
1
, x)
p(c
2
, x)
=
p(c
1
)
p(c
2
)
n
Y
i=1
p(x
i
|c
1
)
p(x
i
|c
2
)
. (32)
Equating these two expressions, we nd the expected Equation (3.8):
p(c
1
|x)
p(c
2
|x)
=
p(c
1
)
p(c
2
)
n
Y
i=1
p(x
i
|c
1
)
p(x
i
|c
2
)
. (33)
(c) Taking the logarithm of Equation (33), we nd
log

p(c
1
|x)
p(c
2
|x)

= log

p(c
1
)
p(c
2
)

+
n
X
i=1
[log p(x
i
|c
1
) log p(x
i
|c
2
)] (34)
13 Problem 13
(a) A recursive equation for the marginalized probability p(X
i
= 1) is given by
p(X
i
= 1) = p(X
i1
= 1) [p(X
i
= 1|X
i1
= 1) p(X
i
= 1|X
i1
= 0)] +p(X
i
= 1|X
i1
= 0). (35)
Starting with
p(X
2
= 1) = p(X
1
= 1) p(X
2
= 1|X
1
= 1) +p(X
1
= 0) p(X
2
= 1|X
1
= 0), (36)
where everything is known, we can iterate using Equation (35) to nd p(X
i
= 1) for each i = 1, . . . , n
in linear time. The pseudocode for this algorithm is shown here:
8