Coreference Resolution
Hinrich Schütze and Desislava Zhekova
CIS, LMU
desi@cis.uni-muenchen.de
Outline
1 Anaphora Resolution
   The Task of Anaphora Resolution
   Types of Anaphora
2 Coreference Resolution
   Rule-based approaches to CR
   Machine Learning approaches to CR
   Subtasks of CR
Reference Resolution
#   Referring Expression       Referent
1   Mary, she                  Mary
2   vanilla slice, the cake    the vanilla slice cake
3   the birthday party         the birthday party
4   the oven                   the oven
Why is this helpful to NLP?
Let us ask the natural language question answering system START
(http://start.csail.mit.edu) some questions that rely on reference:
What does the question sequence "Who is the Queen of England? What is her age?" return?
What does the question sequence "Who is James Bond? Who is the Queen of England? What is his age?" return?
What does the question sequence "Who is James Bond? Who is the Queen of England? How old is this person?" return?
First mention    Reference
Mary             she
vanilla slice    the cake
Anaphora Resolution
Types of Anaphora
Coreference Resolution
Larry King:
caller_3:
Mike Wallace:
Larry King:
Mike Wallace:
Hands On
Reference Resolution
What about mentions such as [the birthday party]3 and [the oven]6?
These entities are introduced once but never referred to again!
Singletons - mentions that refer to an entity that no other mention in
the text refers to.
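The definition can be made concrete with a minimal sketch. It assumes, purely for illustration, that mentions are available as (mention text, entity number) pairs; this input format and the function name are hypothetical, and the entity numbers follow the referring-expression table from the cake example.

```python
from collections import defaultdict

def split_singletons(mentions):
    """Group mentions by entity and separate singleton entities.

    mentions: list of (mention_text, entity_id) pairs (hypothetical format).
    Returns (chains, singletons): entities with two or more mentions,
    and entities with exactly one mention.
    """
    by_entity = defaultdict(list)
    for text, entity_id in mentions:
        by_entity[entity_id].append(text)
    chains = {e: m for e, m in by_entity.items() if len(m) > 1}
    singletons = {e: m for e, m in by_entity.items() if len(m) == 1}
    return chains, singletons

# Entity numbers taken from the referring-expression table (assumed input).
mentions = [
    ("Mary", 1), ("she", 1),
    ("vanilla slice", 2), ("the cake", 2),
    ("the birthday party", 3),
    ("the oven", 4),
]
chains, singletons = split_singletons(mentions)
```

Under these assumptions, "the birthday party" and "the oven" end up as singletons, while the mentions of Mary and of the cake form coreference chains.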
Rule-based CR
Coreference Models
Evaluation
Baselines
              MUC                  CEAF                 B3                   BLANC
              R      P      F1     R      P      F1     R      P      F1     R      P      BLANC
SINGLETONS    0.0    0.0    0.0    71.2   71.2   71.2   71.2   100    83.2   50.0   49.2   49.6
ALL-IN-ONE    100    29.2   45.2   10.5   10.5   10.5   100    3.5    6.7    50.0   0.8    1.6
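To see why the two baselines score so differently, the following sketch computes MUC and B3 for both on a toy gold standard built from the cake example. It is an illustration only, not the official scorer: the function names are made up here, the gold clusters are an assumption, and B3 is computed in its simple form where gold and system output cover the same mentions.

```python
def f1(precision, recall):
    total = precision + recall
    return 2 * precision * recall / total if total > 0 else 0.0

def muc_recall(key, response):
    """Link-based MUC recall; swap the arguments to obtain precision."""
    cluster_of = {m: i for i, cluster in enumerate(response) for m in cluster}
    numerator = denominator = 0
    for chain in key:
        # Parts of this chain induced by the other partition; a mention
        # missing from that partition counts as a part of its own.
        parts = {cluster_of.get(m, ("unaligned", m)) for m in chain}
        numerator += len(chain) - len(parts)
        denominator += len(chain) - 1
    return numerator / denominator if denominator else 0.0

def b_cubed(key, response):
    """Mention-based B3 (recall, precision); both sides cover the same mentions."""
    key_of = {m: chain for chain in key for m in chain}
    resp_of = {m: cluster for cluster in response for m in cluster}
    n = len(key_of)
    recall = sum(len(key_of[m] & resp_of[m]) / len(key_of[m]) for m in key_of) / n
    precision = sum(len(key_of[m] & resp_of[m]) / len(resp_of[m]) for m in key_of) / n
    return recall, precision

# Toy gold standard (assumed): two chains and two singletons.
key = [{"Mary", "she"}, {"vanilla slice", "the cake"},
       {"the birthday party"}, {"the oven"}]
mentions = set().union(*key)
baselines = {
    "SINGLETONS": [{m} for m in mentions],  # every mention is its own entity
    "ALL-IN-ONE": [mentions],               # all mentions in a single entity
}
scores = {}
for name, response in baselines.items():
    muc_r, muc_p = muc_recall(key, response), muc_recall(response, key)
    b3_r, b3_p = b_cubed(key, response)
    scores[name] = {"MUC": (muc_r, muc_p, f1(muc_p, muc_r)),
                    "B3": (b3_r, b3_p, f1(b3_p, b3_r))}
```

The toy numbers differ from the table, which was computed on real evaluation data, but the pattern is the same: SINGLETONS proposes no links and therefore gets 0 on MUC while B3 still rewards it, and ALL-IN-ONE reaches full recall at the cost of low precision.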
Evaluation Settings
gold-closed - systems must use the gold-standard linguistic annotations;
no external tools or resources are allowed for additional preprocessing.
auto-closed - systems must use the automatically predicted linguistic
annotations; no external tools or resources are allowed for additional
preprocessing.
gold-open - systems must use the gold-standard linguistic annotations;
external tools and resources are allowed for additional preprocessing.
auto-open - systems must use the automatically predicted linguistic
annotations; external tools and resources are allowed for additional
preprocessing.
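The four settings are the cross product of two independent choices: which annotations are used (gold or auto) and whether external resources are permitted (closed or open). A small sketch makes this explicit; the class and field names are invented for illustration and do not come from any shared-task tooling.

```python
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class EvaluationSetting:
    gold_annotations: bool    # True: gold annotations; False: automatically predicted
    external_resources: bool  # True: "open"; False: "closed"

    @property
    def name(self):
        annotations = "gold" if self.gold_annotations else "auto"
        resources = "open" if self.external_resources else "closed"
        return f"{annotations}-{resources}"

# Enumerate all combinations of the two binary choices.
SETTINGS = [EvaluationSetting(g, e) for g, e in product((True, False), repeat=2)]
names = [s.name for s in SETTINGS]
```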
Data
Word#  Word       POS  ParseBit      SA         NE             PredArgs  PredArgs
0      It         PRP  (TOP(S(NP*)   Speaker#1  *              *         (ARG1*)
1      is         VBZ  (VP*          Speaker#1  *              (V*)      *
2      composed   VBN  (VP*          Speaker#1  *              *         (V*)
3      of         IN   (PP*          Speaker#1  *              *         (ARG2*
4      a          DT   (NP(NP*       Speaker#1  *              *         *
5      primary    JJ   *             Speaker#1  *              *         *
6      stele      NN   *)            Speaker#1  *              *         *
7      ,          ,    *             Speaker#1  *              *         *
8      secondary  JJ   (NP*          Speaker#1  *              *         *
9      steles     NNS  *)            Speaker#1  *              *         *
10     ,          ,    *             Speaker#1  *              *         *
11     a          DT   (NP*          Speaker#1  *              *         *
12     huge       JJ   *             Speaker#1  *              *         *
13     round      NN   *             Speaker#1  *              *         *
14     sculpture  NN   (NML(NML*)    Speaker#1  *              *         *
15     and        CC   *             Speaker#1  *              *         *
16     beacon     NN   (NML*         Speaker#1  *              *         *
17     tower      NN   *)))          Speaker#1  *              *         *
18     ,          ,    *             Speaker#1  *              *         *
19     and        CC   *             Speaker#1  *              *         *
20     the        DT   (NP*          Speaker#1  (WORK_OF_ART*  *         *
21     Great      NNP  *             Speaker#1  *              *         *
22     Wall       NNP  *)            Speaker#1  *)             *         *
23     ,          ,    *             Speaker#1  *              *         *
24     among      IN   (PP*          Speaker#1  *              *         *
25     other      JJ   (NP*          Speaker#1  *              *         *
26     things     NNS  *))))))       Speaker#1  *              *         *
27     .          .    *))           Speaker#1  *              *         *

Remaining columns, with their values in slide order:
PredLemma: -
PFID: 03, 01, -
WS: 2, -
Coref: (22), (24, 24), (13, 13), -
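The NE and PredArgs columns use a bracket notation in which "(LABEL*" opens a span, "*)" closes it, and "*" marks a token that neither opens nor closes one. A minimal sketch of decoding such a column into token spans follows; the function name is hypothetical, and the input lists simply repeat the NE and second PredArgs columns of the example sentence.

```python
def bracket_spans(column):
    """Decode a bracketed per-token column into (label, start, end) spans.

    start and end are inclusive token indices within the sentence.
    """
    spans, open_stack = [], []
    for i, cell in enumerate(column):
        # Every "(" in the cell opens a new labelled span at token i.
        for part in cell.split("(")[1:]:
            open_stack.append((part.rstrip("*)"), i))
        # Every ")" closes the most recently opened span at token i.
        for _ in range(cell.count(")")):
            label, start = open_stack.pop()
            spans.append((label, start, i))
    return spans

words = ("It is composed of a primary stele , secondary steles , a huge round "
         "sculpture and beacon tower , and the Great Wall , among other things .").split()
ne_column = ["*"] * 20 + ["(WORK_OF_ART*", "*", "*)"] + ["*"] * 5
pred_args = ["(ARG1*)", "*", "(V*)", "(ARG2*"] + ["*"] * 22 + ["*)", "*"]

ne_spans = bracket_spans(ne_column)
arg_spans = bracket_spans(pred_args)
ne_text = [" ".join(words[s:e + 1]) for _, s, e in ne_spans]
```

Here the only named entity is the WORK_OF_ART span over tokens 20 to 22, "the Great Wall", and the predicate "composed" has "It" as ARG1 and tokens 3 to 26 as ARG2.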
#begin document <document ID>
<sentence>
<sentence>
...
<sentence>
#end document <document ID>
...
#begin document <document ID>
<sentence>
<sentence>
...
<sentence>
#end document <document ID>
<token#1 column#1> <token#1 column#2>
<token#2 column#1> <token#2 column#2>
<token#3 column#1> <token#3 column#2>
...
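A reader for this layout can be sketched in a few lines. The sketch assumes that columns are separated by whitespace and that sentences are separated by blank lines; neither detail is spelled out in the schematic above, so both should be checked against the actual data. The function name and the sample lines are made up for illustration.

```python
def read_documents(lines):
    """Yield (document_id, sentences) pairs.

    Each sentence is a list of token rows; each row is a list of column strings.
    Assumes whitespace-separated columns and blank lines between sentences.
    """
    doc_id, sentences, sentence = None, [], []
    for line in lines:
        line = line.strip()
        if line.startswith("#begin document"):
            doc_id = line[len("#begin document"):].strip()
            sentences, sentence = [], []
        elif line.startswith("#end document"):
            if sentence:  # flush the last sentence of the document
                sentences.append(sentence)
            yield doc_id, sentences
            doc_id, sentences, sentence = None, [], []
        elif not line:
            if sentence:  # a blank line ends the current sentence
                sentences.append(sentence)
                sentence = []
        else:
            sentence.append(line.split())

# Tiny made-up sample with only three columns per token.
sample = """#begin document doc_a
0 It PRP
1 is VBZ

0 Yes UH
#end document doc_a
""".splitlines()
docs = list(read_documents(sample))
```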
The CR pipeline
Mention Detection
Using NEs
Example:
idea Mary
She idea
She Mary
John She
John idea
John Mary
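The pairs listed above are what one obtains by pairing each mention with every mention that precedes it, nearest candidate first. A minimal sketch of this pair generation follows; the textual order Mary, idea, She, John is inferred from the pair list rather than stated in the text, and the function name is hypothetical.

```python
def candidate_pairs(mentions):
    """Pair every mention with each preceding mention, nearest antecedent first.

    mentions: mention strings in textual order.
    Returns a list of (mention, candidate_antecedent) pairs.
    """
    pairs = []
    for j, mention in enumerate(mentions):
        for i in range(j - 1, -1, -1):  # walk backwards through the text
            pairs.append((mention, mentions[i]))
    return pairs

mentions = ["Mary", "idea", "She", "John"]  # assumed textual order
pairs = candidate_pairs(mentions)
```

In a mention-pair setup, each such pair would then be labelled as coreferent or not, either for training a classifier or for classification at test time.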
Test instances:
Hands On
How would you employ semantic similarity for the task of coreference
resolution?
Thank you!