
Learning Scripts with HMMs

Walker Orr

February 24, 2014


Scripts

A script is a structure that describes appropriate sequences of
events in a particular context (Schank & Abelson 1977).

Examples: going to a restaurant, a football game, or a Presidential
election

Referencing Scripts

John went to a restaurant. He asked the waitress for coq au vin.
He paid the check and left.

John was the quarterback. As time ran down, he threw a 60-yard
pass into the end zone. His team won the game.

Examples taken from (Schank & Abelson 1977)


Script Uses

- Resolving references
- Inferring causal/temporal relationships
- Filling in gaps in stories
- Answering questions:
  - What happened between John going to the restaurant and ordering?
  - What kind of restaurant did John go to?
  - Where did John eat?

Representations and Learning of Scripts

- Originally hand-coded (Schank & Abelson 1977)
- Narrative chains (Chambers & Jurafsky 2008)
- Sequence alignment (Regneri et al. 2010)
- Clusters of events (Chambers 2013)
- Probabilistic frames (Cheung et al. 2013)

HMMs

What are they?
- A set of states, Q
- A transition matrix, T
- A per-state distribution over observations

What can they do?
- Answer queries of the form P(event | evidence); a small sketch
  follows below

[Figure: an example HMM whose states emit the events "hear", "walk",
"ask", and "open", between start (<) and end (>) states, with
transition probabilities labeling the edges.]

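To make these components concrete, here is a tiny HMM in Python with
invented states, events, and probabilities (not the model from the
slide's figure), answering a likelihood query with the forward
algorithm:

import numpy as np

# A tiny HMM with invented values (not the model from the figure):
# three states, three possible events, row-stochastic matrices.
states = ["s0", "s1", "s2"]
events = ["hear", "walk", "ask"]
start = np.array([1.0, 0.0, 0.0])      # initial state distribution
T = np.array([[0.0, 0.6, 0.4],         # T[i, j] = P(next = j | now = i)
              [0.0, 0.2, 0.8],
              [0.0, 0.0, 1.0]])
E = np.array([[1.0, 0.0, 0.0],         # E[i, k] = P(event k | state i)
              [0.0, 0.7, 0.3],
              [0.0, 0.0, 1.0]])

def likelihood(obs):
    """Forward algorithm: P(observed event sequence | model)."""
    alpha = start * E[:, events.index(obs[0])]
    for o in obs[1:]:
        alpha = (alpha @ T) * E[:, events.index(o)]
    return alpha.sum()

print(likelihood(["hear", "walk", "ask"]))   # 0.3612
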
HMMs and Scripts

A script is a structure that describes appropriate sequences of
events in a particular context (Schank & Abelson 1977).

Script               HMMs
-------------------  -----------------------------------------
Events               States & observations
Sequence structure   States and transitions
Question answering   Probabilistic queries, P(event | history)
Story                A single path through the HMM

Problems

- HMMs assume no observations (events) are missing
- In reality there are omissions, due to extraction error or to
  writing conventions
- The result is erroneous transitions between states
- Example: if B is missing from the sequence A, B, C, it will appear
  as though C follows A

HMM Learning

Problem: given a set of sequences of events, learn the HMM that
produced them.

- Learn structure: states and transitions
- Learn parameters: the probabilities of transitions and observations


HMMs with ε-Transitions

- The null observation ε is included in each state's distribution
  over observations
- The result is an additional term in the likelihood calculation,
  allowing transitions from state to state without an observation;
  a sketch of the modified forward pass follows below
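
A minimal sketch of a forward pass with null observations, assuming
numpy arrays start, T, and E as in the earlier example and bounding
the number of consecutive null steps for simplicity; this illustrates
the idea rather than the authors' implementation:

import numpy as np

def forward_with_null(obs, start, T, E, null_idx, max_null_steps=3):
    """Forward pass for an HMM whose states may emit a null
    observation (column null_idx of E) instead of a real event, so
    the chain can move between states without consuming input.
    obs is a list of event column indices. This sketch bounds the
    number of consecutive null steps; an exact treatment sums the
    unbounded (geometric) series."""
    def null_closure(alpha):
        # Add the mass reachable through null-emitting transitions.
        total = alpha.copy()
        for _ in range(max_null_steps):
            alpha = (alpha @ T) * E[:, null_idx]
            total += alpha
        return total

    alpha = null_closure(start * E[:, obs[0]])
    for o in obs[1:]:
        alpha = null_closure((alpha @ T) * E[:, o])
    return alpha.sum()
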
Example
Structural EM for HMMs

procedure Learn-HMM(Documents E, Integer r)
    m = ∅
    D = {}
    for each r-sized batch b of E do
        D = D ∪ b
        m′ = join(m, pta(b))
        while improvement do
            m′ = argmax over m″ ∈ successors(m′) of P(m″ | D)
            m′ = EM(m′, D)
            m = m′
        end while
    end for
    return m
end procedure
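
Assuming pta(b) denotes a prefix-tree acceptor over the batch, as in
Bayesian model merging, a minimal sketch of that step might look like
the following; the representation and names are invented for
illustration:

def pta(sequences):
    """Build a prefix-tree acceptor: one state per distinct prefix of
    the batch, each state emitting the observation that extends the
    prefix; transitions hold raw counts. Hypothetical sketch, not the
    authors' code."""
    transitions = {0: {}}    # state id -> {successor id: count}
    emissions = {}           # state id -> observation emitted there
    next_id = 1
    for seq in sequences:
        state = 0
        for obs in seq:
            # Reuse the child that already emits this observation.
            child = next((s for s in transitions[state]
                          if emissions[s] == obs), None)
            if child is None:
                child, next_id = next_id, next_id + 1
                transitions[child] = {}
                emissions[child] = obs
            transitions[state][child] = transitions[state].get(child, 0) + 1
            state = child
    return transitions, emissions

print(pta([["hear", "walk", "ask"], ["hear", "ask"]]))
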
Merge Example
Edge Deletion Example
Bayesian Scoring

- The objective: P(M|D) ∝ P(D|M) P(M)
- P(D|M) is the likelihood of the data under the model
- The model prior: P(M) = (1/Z) exp(-(c1|Q| + c2|T| + c3|C|))
- |C| is the number of constraint violations by the model
- The model prior penalizes large, complex models (see the sketch
  below)
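
In log space the score is a simple sum, which is how it would
typically be computed; a minimal sketch, treating the weights c1-c3
as free constants and dropping the constant log Z:

def log_posterior(log_likelihood, n_states, n_transitions, n_violations,
                  c1=1.0, c2=1.0, c3=1.0):
    """Unnormalized log P(M|D) = log P(D|M) + log P(M). The weights
    c1..c3 are illustrative placeholders; log Z is dropped because it
    is constant across candidate models."""
    log_prior = -(c1 * n_states + c2 * n_transitions + c3 * n_violations)
    return log_likelihood + log_prior
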
Issue of Speed
Parameter Estimation

- Use Bayesian Model Merging and Structural EM assumptions (BSEM)
- Transition and observation distribution parameters are computed
  from sufficient statistics (as in the sketch below)
- The sufficient statistics are updated by EM after each batch to
  correct for violations of the assumptions
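
As a hypothetical illustration of estimating the distributions from
sufficient statistics, expected counts can be normalized row-wise
(the smoothing constant is an invented detail):

import numpy as np

def estimate_params(trans_counts, emit_counts, smoothing=1e-3):
    """Turn sufficient statistics (expected transition and emission
    counts accumulated by EM) into probabilities by row-wise
    normalization. The smoothing constant is an invented detail."""
    T = trans_counts + smoothing
    T /= T.sum(axis=1, keepdims=True)
    E = emit_counts + smoothing
    E /= E.sum(axis=1, keepdims=True)
    return T, E
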
Approximate Posterior

- Computing P(M|D) takes O(n) time, and it is computed for O(n²)
  candidate models
- Approximate P(M|D) by updating the parent's score for the
  structural change, using sufficient statistics (sketched below)
- The approximation is O(1), which reduces the overall complexity
  from O(n³) to O(n²)
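
The slide does not show the exact bookkeeping, but an O(1) update of
this kind could look like the following sketch, reusing the additive
log score from the Bayesian scoring sketch; the form is an assumption:

def approx_candidate_score(parent_score, delta_log_likelihood,
                           d_states, d_transitions, d_violations,
                           c1=1.0, c2=1.0, c3=1.0):
    """Score a candidate model in O(1): start from the parent's log
    posterior and add the prior change implied by the structural edit
    plus a likelihood change estimated from sufficient statistics.
    The weights and the additive form are assumptions."""
    d_log_prior = -(c1 * d_states + c2 * d_transitions + c3 * d_violations)
    return parent_score + delta_log_likelihood + d_log_prior
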
Evaluation

- Evaluate accuracy on filling in the gaps in a sequence of events
- Predictions are made with the Forward-Backward algorithm by
  computing argmax_o P(o_i = o | O_-i) (sketched below)
- Example: in the following sequence, "walk" is withheld and the
  model is asked to fill in the gap:
  hear, debate, [walk], ask, debate, unlock, open
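
A minimal sketch of this gap-filling prediction for a single withheld
event, assuming numpy arrays start, T, and E as in the earlier HMM
sketch; an illustration, not the authors' implementation:

import numpy as np

def fill_gap(obs_before, obs_after, start, T, E):
    """Predict the withheld event between obs_before and obs_after:
    a forward pass over the prefix, a backward pass over the suffix,
    then an argmax over candidate events at the gap. Assumes exactly
    one missing event; observations are event column indices."""
    # Forward: P(prefix, state) after emitting obs_before.
    alpha = start * E[:, obs_before[0]]
    for o in obs_before[1:]:
        alpha = (alpha @ T) * E[:, o]

    # Backward: beta[i] = P(suffix | state i at the gap).
    beta = np.ones(T.shape[0])
    for o in reversed(obs_after):
        beta = T @ (E[:, o] * beta)

    # Score each candidate event for the gap position.
    gap_state = alpha @ T
    scores = [(gap_state * E[:, o] * beta).sum() for o in range(E.shape[1])]
    return int(np.argmax(scores))
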
OMICS Domain

- Simple step-by-step instructions for commonplace activities,
  covering 175 different domains
- 14,000 sets of instructions
- Selected the 84 most complex domains for evaluation
- Used WordNet, a linear model, and agglomerative clustering to
  learn conceptual events
- Sentence similarity is computed as
  f(s1, s2) = w1 * sim(v1, v2) + w2 * sim(o1, o2)
  (see the sketch below)
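
A hypothetical rendering of f with NLTK's WordNet interface, assuming
v1/v2 and o1/o2 are the verbs and objects of the two sentences, that
sim is WordNet path similarity, and that the weights are placeholders:

from nltk.corpus import wordnet as wn

def wordnet_sim(word1, word2, pos):
    """Path similarity of the first WordNet synsets of two words;
    taking the first synset is a simplification for this sketch."""
    s1, s2 = wn.synsets(word1, pos=pos), wn.synsets(word2, pos=pos)
    if not s1 or not s2:
        return 0.0
    return s1[0].path_similarity(s2[0]) or 0.0

def sentence_sim(v1, o1, v2, o2, w1=0.5, w2=0.5):
    """f(s1, s2) = w1*sim(v1, v2) + w2*sim(o1, o2), comparing the
    verb and object of each sentence. Weights are placeholders."""
    return (w1 * wordnet_sim(v1, v2, wn.VERB)
            + w2 * wordnet_sim(o1, o2, wn.NOUN))

print(sentence_sim("hear", "bell", "hear", "doorbell"))
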
OMICS Example

Sequence 1:
- hear the bell
- debate whether to get up
- walk over to the door
- ask who it is
- debate again to open
- unlock the door
- open the door

Sequence 2:
- hear the doorbell
- walk to the door
- open the door
- allow the people in
- close the door

Results

Batch size r          2        5        10
SEM-HMM               41.9%    45.1%    46.0%
SEM-HMM w/ Approx.    43.0%
BMM+EM                41.1%    41.2%    42.1%
BMM                   41.0%    39.5%    39.1%
Simple                27.3%

Summary & Future Work

Summary
- Represented scripts as HMMs
- Provided an HMM learning algorithm that handles missing
  observations
- Created a novel, fast parameter-estimation method based on EM
- Created an evaluation task for scripts

Future work
- Combine event extraction with HMM learning
- Include actors with events
