Latent variable models
[Figure: discrete latent variables Z represent cluster assignments]
Hidden Markov models
[Figure: graphical model with discrete latent variables Z (cluster assignments), parameters Φ, and observed data points X]
Hidden Markov models
vs mixture models
Example:
Occasionally dishonest casino
Constructing graphical model
plate diagrams from pseudocode
[Figure: plate diagram with a K = 1:K plate; two separate plates is also fine]
Constructing graphical model
plate diagrams from pseudocode
[Figure: HMM plate diagram with hidden chain Z1 → Z2 → Z3 → … → ZT, observations X1, X2, X3, X4, and K = 1:K parameter plates]
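The pseudocode behind such a plate diagram can be sketched as a sampler for the HMM's generative process. This is a minimal sketch, assuming discrete emissions and my own parameter names (`pi`, `A`, `phi`), not the exact pseudocode from the slides:

```python
import numpy as np

def sample_hmm(pi, A, phi, T, rng=None):
    """Sample a state sequence z and observations x from an HMM.

    pi:  (K,) initial state distribution
    A:   (K, K) transition matrix, A[j, k] = p(z_t = k | z_{t-1} = j)
    phi: (K, V) emission matrix, phi[k, v] = p(x_t = v | z_t = k)
    """
    if rng is None:
        rng = np.random.default_rng()
    K, V = phi.shape
    z = np.empty(T, dtype=int)
    x = np.empty(T, dtype=int)
    z[0] = rng.choice(K, p=pi)            # initial state
    x[0] = rng.choice(V, p=phi[z[0]])     # first emission
    for t in range(1, T):
        z[t] = rng.choice(K, p=A[z[t - 1]])   # Markov transition
        x[t] = rng.choice(V, p=phi[z[t]])     # emission from current state
    return z, x
```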
Joint distributions
from plate diagrams
• If there is a plate, include a product over the
plate’s factors
[Figure: HMM plate diagram with hidden chain Z1 → Z2 → Z3 → … → ZT, observations X1, X2, X3, X4, and K = 1:K parameter plates]
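Applying this rule to the HMM plate diagram, the joint distribution factorizes as follows (my notation, with transition rows $A_k$ and emission parameters $\phi_k$ drawn inside the $k = 1{:}K$ plates):

```latex
p(x_{1:T}, z_{1:T}, A, \phi)
= \left[\prod_{k=1}^{K} p(A_k)\, p(\phi_k)\right]
p(z_1) \prod_{t=2}^{T} p(z_t \mid z_{t-1}, A)
\prod_{t=1}^{T} p(x_t \mid z_t, \phi)
```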
Learning outcomes
By the end of the lesson, you should be able to:
Example application:
Part of speech tagging
The quick brown fox jumps over the sly lazy dogs
[Figure: each word annotated with its part-of-speech tag, e.g. "dogs" → plural noun]
Collapsed Gibbs sampling
(mixture model example)
Collapsed Gibbs sampler
(Pointwise, collapsed)
• Marginalize out transition probabilities
(need Dirichlet prior for this)
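With symmetric Dirichlet($\alpha$) priors on each row of the transition matrix, the rows can be integrated out analytically, giving Polya-urn-style conditionals. A sketch of the resulting pointwise conditional (my notation, ignoring the count corrections needed when the transitions into and out of position $t$ share counts):

```latex
p(z_t = k \mid z_{\neg t}, x)
\;\propto\;
\frac{n^{\neg t}_{z_{t-1},k} + \alpha}{n^{\neg t}_{z_{t-1},\cdot} + K\alpha}
\cdot
\frac{n^{\neg t}_{k,z_{t+1}} + \alpha}{n^{\neg t}_{k,\cdot} + K\alpha}
\cdot
p(x_t \mid z_t = k)
```

where $n^{\neg t}_{jk}$ counts transitions $j \to k$ in the current state sequence, excluding those involving position $t$.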
Blocked Gibbs sampler
(Blocked, explicit)
• Until convergence:
– Sample transition probabilities
Blocked Gibbs sampler
(Blocked, explicit)
• Sample all z’s in each HMM chain at once
Forwards filtering
• For each timestep t
– Use Bayes’ rule to recursively compute probability
of current state
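In symbols (my reconstruction, writing $\alpha_t(k) = p(z_t = k \mid x_{1:t})$ for the filtered distribution, with transition matrix $A$ and emission likelihoods $p(x_t \mid z_t)$):

```latex
\alpha_1(k) \;\propto\; \pi_k \, p(x_1 \mid z_1 = k),
\qquad
\alpha_t(k) \;\propto\; p(x_t \mid z_t = k) \sum_{j} \alpha_{t-1}(j)\, A_{jk}
```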
Backwards sampling
• We can re-write the joint posterior from
right to left:
Backwards sampling
• Compute the sampling distribution recursively
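The two passes fit together as the forward filtering, backward sampling (FFBS) algorithm: filter left to right, then draw $z_T$ from the last filtered distribution and sample each earlier $z_t$ conditioned on the $z_{t+1}$ just drawn. A minimal sketch, assuming discrete states and my own argument names (`pi`, `A`, `lik`):

```python
import numpy as np

def ffbs(pi, A, lik, rng=None):
    """Forward filtering, backward sampling for one HMM chain.

    pi:  (K,) initial state distribution
    A:   (K, K) transition matrix
    lik: (T, K) emission likelihoods, lik[t, k] = p(x_t | z_t = k)
    Returns one sample of z_{1:T} from the posterior p(z | x, pi, A).
    """
    if rng is None:
        rng = np.random.default_rng()
    T, K = lik.shape
    # Forward filtering: alpha[t, k] = p(z_t = k | x_{1:t})
    alpha = np.empty((T, K))
    alpha[0] = pi * lik[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = lik[t] * (alpha[t - 1] @ A)
        alpha[t] /= alpha[t].sum()
    # Backward sampling: z_T ~ alpha[T-1], then right to left
    z = np.empty(T, dtype=int)
    z[-1] = rng.choice(K, p=alpha[-1])
    for t in range(T - 2, -1, -1):
        w = alpha[t] * A[:, z[t + 1]]   # ∝ p(z_t = j | x_{1:t}, z_{t+1})
        z[t] = rng.choice(K, p=w / w.sum())
    return z
```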
Collapsed blocked Gibbs sampler
(blocked, collapsed)
• Collapsing and blocking can both be beneficial
– However, the forward filtering, backward sampling algorithm for
blocked Gibbs assumes parameters are available.
– How to resolve this dilemma, and get best of both worlds?
• Solution:
– Use a Metropolis-Hastings-within-Gibbs scheme
• Temporarily re-instantiate parameters to good values
• Use these values to generate a proposal state sequence, via the FFBS algorithm
• Choose to accept or reject the proposal via a Metropolis-Hastings decision
Johnson, M., Griffiths, T. L., & Goldwater, S. (2007). Bayesian Inference for PCFGs via Markov Chain Monte Carlo. In HLT-NAACL (pp. 139–146).
How to re-instantiate parameters?
• The standard Rao-Blackwellized estimator is to
plug in the posterior predictive probability,
from the Polya urn model
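In symbols (my notation, with transition counts $n_{jk}$ from the current state sequence and Dirichlet parameters $\alpha_k$), each re-instantiated transition probability is plugged in at its posterior mean:

```latex
\hat{A}_{jk} \;=\; \mathbb{E}\!\left[A_{jk} \mid z, \alpha\right]
\;=\; \frac{n_{jk} + \alpha_k}{n_{j\cdot} + \sum_{k'} \alpha_{k'}}
```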
Hidden semi-Markov models
• According to the generative process of an HMM, the probability that we
stay in state i for a duration of d steps is geometrically distributed:
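The formula itself did not survive extraction; a reconstruction, writing $A_{ii}$ for the self-transition probability of state $i$ (my notation), is:

```latex
p(d) \;=\; A_{ii}^{\,d-1}\,(1 - A_{ii}), \qquad d = 1, 2, \ldots
```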
Input-Output HMMs
• Condition on inputs at each timestep, which can
affect state transitions and/or outputs
[Figure: input-output HMM example; textual recipe steps are the inputs, the hidden state tracks the current recipe step with a background switch, and the outputs are speech transcription (words)]
Factorial HMMs
• Multiple hidden chains each encode aspects of latent state.
• Each chain evolves independently, but observations are generated
based on all chains
Example: Dynamic relational infinite
feature model (DRIFT)
• Model social networks over time
• Each actor has a vector of latent features
(e.g. interests), each with Markov dynamics
[Figure: actor-by-feature binary matrices for Feature 1, Feature 2, and Feature 3, evolving over time]
J. R. Foulds, A. Asuncion, C. DuBois, C. T. Butts, P. Smyth. A dynamic relational infinite feature model for longitudinal social networks. Proceedings of the 14th International Conference on AI and Statistics (AISTATS), April 2011.
Miller, Griffiths, Jordan (2009)
Latent Feature Relational Model
[Figure: social network of Alice, Bob, and Claire, each with latent features such as cycling, fishing, running, tango, salsa, and waltz]
Think-pair-share:
Tennis match video action recognition
• You are a data analyst hired to help analyze the playing
style of professional tennis players, in order to help them
improve their performance.