
Learning Scripts with HMMs

Walker Orr

February 24, 2014


Scripts

A script is a structure that describes appropriate sequences of
events in a particular context (Schank & Abelson 1977).

Examples: going to a restaurant, a football game, or a Presidential
election

Referencing Scripts

John went to a restaurant. He asked the waitress for coq au vin.
He paid the check and left.

John was the quarterback. As time ran down, he threw a 60-yard
pass into the end zone. His team won the game.

Examples taken from (Schank & Abelson 1977)


Script Uses

- Resolving references
- Inferring causal/temporal relationships
- Filling in gaps in stories
- Answering questions:
  - What happened between John going to the restaurant and ordering?
  - What kind of restaurant did John go to?
  - Where did John eat?

Representations and Learning of Scripts

- Originally hand-coded (Schank & Abelson 1977)
- Narrative chains (Chambers & Jurafsky 2008)
- Sequence alignment (Regneri et al. 2010)
- Clusters of events (Chambers 2013)
- Probabilistic frames (Cheung et al. 2013)

HMMs

What are they?
- A set of states, Q
- A transition matrix, T
- A per-state distribution over observations

What can they do?
- Answer queries of the form P(event | evidence); a small sketch
  follows below

[Figure: an example HMM whose states emit the events "hear", "walk",
"ask", and "open", between start (<) and end (>) states, with
transition probabilities labeling the edges.]

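To make these components concrete, here is a tiny HMM in Python with
invented states, events, and probabilities (not the model from the
slide's figure), answering a likelihood query with the forward
algorithm:

import numpy as np

# A tiny HMM with invented values (not the model from the figure):
# three states, three possible events, row-stochastic matrices.
states = ["s0", "s1", "s2"]
events = ["hear", "walk", "ask"]
start = np.array([1.0, 0.0, 0.0])      # initial state distribution
T = np.array([[0.0, 0.6, 0.4],         # T[i, j] = P(next = j | now = i)
              [0.0, 0.2, 0.8],
              [0.0, 0.0, 1.0]])
E = np.array([[1.0, 0.0, 0.0],         # E[i, k] = P(event k | state i)
              [0.0, 0.7, 0.3],
              [0.0, 0.0, 1.0]])

def likelihood(obs):
    """Forward algorithm: P(observed event sequence | model)."""
    alpha = start * E[:, events.index(obs[0])]
    for o in obs[1:]:
        alpha = (alpha @ T) * E[:, events.index(o)]
    return alpha.sum()

print(likelihood(["hear", "walk", "ask"]))   # 0.3612
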
HMMs and Scripts

A script is a structure that describes appropriate sequences of
events in a particular context (Schank & Abelson 1977).

Script               HMMs
-------------------  -----------------------------------------
Events               States & observations
Sequence structure   States and transitions
Question answering   Probabilistic queries, P(event | history)
Story                A single path through the HMM

Problems

- HMMs assume no observations (events) are missing
- In reality there are omissions, due to extraction error or to
  writing conventions
- The result is erroneous transitions between states
- Example: if B is missing from the sequence A, B, C, it will appear
  as though C follows A

HMM Learning

Problem: given a set of sequences of events, learn the HMM that
produced them.

- Learn structure: states and transitions
- Learn parameters: the probabilities of transitions and observations


HMMs with ε-Transitions

- The null observation ε is included in each state's distribution
  over observations
- The result is an additional term in the likelihood calculation,
  allowing transitions from state to state without an observation;
  a sketch of the modified forward pass follows below
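
A minimal sketch of a forward pass with null observations, assuming
numpy arrays start, T, and E as in the earlier example and bounding
the number of consecutive null steps for simplicity; this illustrates
the idea rather than the authors' implementation:

import numpy as np

def forward_with_null(obs, start, T, E, null_idx, max_null_steps=3):
    """Forward pass for an HMM whose states may emit a null
    observation (column null_idx of E) instead of a real event, so
    the chain can move between states without consuming input.
    obs is a list of event column indices. This sketch bounds the
    number of consecutive null steps; an exact treatment sums the
    unbounded (geometric) series."""
    def null_closure(alpha):
        # Add the mass reachable through null-emitting transitions.
        total = alpha.copy()
        for _ in range(max_null_steps):
            alpha = (alpha @ T) * E[:, null_idx]
            total += alpha
        return total

    alpha = null_closure(start * E[:, obs[0]])
    for o in obs[1:]:
        alpha = null_closure((alpha @ T) * E[:, o])
    return alpha.sum()
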
Example
Structural EM for HMMs

procedure Learn-HMM(Documents E, Integer r)
    m = ∅
    D = {}
    for each r-sized batch b of E do
        D = D ∪ b
        m′ = join(m, pta(b))
        while improvement do
            m′ = argmax over m″ ∈ successors(m′) of P(m″ | D)
            m′ = EM(m′, D)
            m = m′
        end while
    end for
    return m
end procedure
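
Assuming pta(b) denotes a prefix-tree acceptor over the batch, as in
Bayesian model merging, a minimal sketch of that step might look like
the following; the representation and names are invented for
illustration:

def pta(sequences):
    """Build a prefix-tree acceptor: one state per distinct prefix of
    the batch, each state emitting the observation that extends the
    prefix; transitions hold raw counts. Hypothetical sketch, not the
    authors' code."""
    transitions = {0: {}}    # state id -> {successor id: count}
    emissions = {}           # state id -> observation emitted there
    next_id = 1
    for seq in sequences:
        state = 0
        for obs in seq:
            # Reuse the child that already emits this observation.
            child = next((s for s in transitions[state]
                          if emissions[s] == obs), None)
            if child is None:
                child, next_id = next_id, next_id + 1
                transitions[child] = {}
                emissions[child] = obs
            transitions[state][child] = transitions[state].get(child, 0) + 1
            state = child
    return transitions, emissions

print(pta([["hear", "walk", "ask"], ["hear", "ask"]]))
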
Merge Example
Edge Deletion Example
Bayesian Scoring

- The objective: P(M|D) ∝ P(D|M) P(M)
- P(D|M) is the likelihood of the data under the model
- The model prior: P(M) = (1/Z) exp(-(c1|Q| + c2|T| + c3|C|))
- |C| is the number of constraint violations by the model
- The model prior penalizes large, complex models (see the sketch
  below)
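
In log space the score is a simple sum, which is how it would
typically be computed; a minimal sketch, treating the weights c1-c3
as free constants and dropping the constant log Z:

def log_posterior(log_likelihood, n_states, n_transitions, n_violations,
                  c1=1.0, c2=1.0, c3=1.0):
    """Unnormalized log P(M|D) = log P(D|M) + log P(M). The weights
    c1..c3 are illustrative placeholders; log Z is dropped because it
    is constant across candidate models."""
    log_prior = -(c1 * n_states + c2 * n_transitions + c3 * n_violations)
    return log_likelihood + log_prior
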
Issue of Speed
Parameter Estimation

- Use Bayesian Model Merging and Structural EM assumptions (BSEM)
- Transition and observation distribution parameters are computed
  from sufficient statistics (as in the sketch below)
- The sufficient statistics are updated by EM after each batch to
  correct for violations of the assumptions
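
As a hypothetical illustration of estimating the distributions from
sufficient statistics, expected counts can be normalized row-wise
(the smoothing constant is an invented detail):

import numpy as np

def estimate_params(trans_counts, emit_counts, smoothing=1e-3):
    """Turn sufficient statistics (expected transition and emission
    counts accumulated by EM) into probabilities by row-wise
    normalization. The smoothing constant is an invented detail."""
    T = trans_counts + smoothing
    T /= T.sum(axis=1, keepdims=True)
    E = emit_counts + smoothing
    E /= E.sum(axis=1, keepdims=True)
    return T, E
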
Approximate Posterior

- Computing P(M|D) takes O(n) time, and it is computed for O(n²)
  candidate models
- Approximate P(M|D) by updating the parent's score for the
  structural change, using sufficient statistics (sketched below)
- The approximation is O(1), which reduces the overall complexity
  from O(n³) to O(n²)
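
The slide does not show the exact bookkeeping, but an O(1) update of
this kind could look like the following sketch, reusing the additive
log score from the Bayesian scoring sketch; the form is an assumption:

def approx_candidate_score(parent_score, delta_log_likelihood,
                           d_states, d_transitions, d_violations,
                           c1=1.0, c2=1.0, c3=1.0):
    """Score a candidate model in O(1): start from the parent's log
    posterior and add the prior change implied by the structural edit
    plus a likelihood change estimated from sufficient statistics.
    The weights and the additive form are assumptions."""
    d_log_prior = -(c1 * d_states + c2 * d_transitions + c3 * d_violations)
    return parent_score + delta_log_likelihood + d_log_prior
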
Evaluation

- Evaluate accuracy on filling in the gaps in a sequence of events
- Predictions are made with the Forward-Backward algorithm by
  computing argmax_o P(o_i = o | O_-i) (sketched below)
- Example: in the following sequence, "walk" is withheld and the
  model is asked to fill in the gap:
  hear, debate, [walk], ask, debate, unlock, open
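
A minimal sketch of this gap-filling prediction for a single withheld
event, assuming numpy arrays start, T, and E as in the earlier HMM
sketch; an illustration, not the authors' implementation:

import numpy as np

def fill_gap(obs_before, obs_after, start, T, E):
    """Predict the withheld event between obs_before and obs_after:
    a forward pass over the prefix, a backward pass over the suffix,
    then an argmax over candidate events at the gap. Assumes exactly
    one missing event; observations are event column indices."""
    # Forward: P(prefix, state) after emitting obs_before.
    alpha = start * E[:, obs_before[0]]
    for o in obs_before[1:]:
        alpha = (alpha @ T) * E[:, o]

    # Backward: beta[i] = P(suffix | state i at the gap).
    beta = np.ones(T.shape[0])
    for o in reversed(obs_after):
        beta = T @ (E[:, o] * beta)

    # Score each candidate event for the gap position.
    gap_state = alpha @ T
    scores = [(gap_state * E[:, o] * beta).sum() for o in range(E.shape[1])]
    return int(np.argmax(scores))
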
OMICS Domain

- Simple step-by-step instructions for commonplace activities,
  covering 175 different domains
- 14,000 sets of instructions
- Selected the 84 most complex domains for evaluation
- Used WordNet, a linear model, and agglomerative clustering to
  learn conceptual events
- Sentence similarity is computed as
  f(s1, s2) = w1 * sim(v1, v2) + w2 * sim(o1, o2)
  (see the sketch below)
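
A hypothetical rendering of f with NLTK's WordNet interface, assuming
v1/v2 and o1/o2 are the verbs and objects of the two sentences, that
sim is WordNet path similarity, and that the weights are placeholders:

from nltk.corpus import wordnet as wn

def wordnet_sim(word1, word2, pos):
    """Path similarity of the first WordNet synsets of two words;
    taking the first synset is a simplification for this sketch."""
    s1, s2 = wn.synsets(word1, pos=pos), wn.synsets(word2, pos=pos)
    if not s1 or not s2:
        return 0.0
    return s1[0].path_similarity(s2[0]) or 0.0

def sentence_sim(v1, o1, v2, o2, w1=0.5, w2=0.5):
    """f(s1, s2) = w1*sim(v1, v2) + w2*sim(o1, o2), comparing the
    verb and object of each sentence. Weights are placeholders."""
    return (w1 * wordnet_sim(v1, v2, wn.VERB)
            + w2 * wordnet_sim(o1, o2, wn.NOUN))

print(sentence_sim("hear", "bell", "hear", "doorbell"))
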
OMICS Example

Sequence 1:
- hear the bell
- debate whether to get up
- walk over to the door
- ask who it is
- debate again to open
- unlock the door
- open the door

Sequence 2:
- hear the doorbell
- walk to the door
- open the door
- allow the people in
- close the door

Results

Batch size r          2        5        10
SEM-HMM               41.9%    45.1%    46.0%
SEM-HMM w/ Approx.    43.0%
BMM+EM                41.1%    41.2%    42.1%
BMM                   41.0%    39.5%    39.1%
Simple                27.3%

Summary & Future Work

Summary
- Represented scripts as HMMs
- Provided an HMM learning algorithm that handles missing
  observations
- Created a novel, fast parameter-estimation method based on EM
- Created an evaluation task for scripts

Future work
- Combine event extraction with HMM learning
- Include actors with events
