
Entities in Natural Language

Anaphora Resolution
Coreference Resolution
References

Coreference Resolution
Hinrich Schütze and Desislava Zhekova
CIS, LMU

desi@cis.uni-muenchen.de

June 21, 2013


Outline

1. Entities in Natural Language
   Understanding Natural Language
   The use of Entities in Natural Language
   Reference Resolution

2. Anaphora Resolution
   The Task of Anaphora Resolution
   Types of Anaphora

3. Coreference Resolution
   Rule-based approaches to CR
   Machine Learning approaches to CR
   Subtasks of CR


Understanding Natural Language

John: Mary baked a vanilla slice for the birthday party.


Bob: Really?


The use of Entities in Natural Language

John: [Mary]1 baked [a vanilla slice]2 for [the birthday party]3.


Bob: Really?


Reference Resolution

John: [Mary]1 baked [a vanilla slice]2 for [the birthday party]3.
Unfortunately, [she]4 forgot [the cake]5 in [the oven]6.
Bob: Really?


Reference Resolution

#   Referring Expression       Referent
1   Mary, she                  Mary
2   vanilla slice, the cake    the vanilla slice cake
3   the birthday party         the birthday party
4   the oven                   the oven


Reference Resolution
Why is this helpful to NLP?
Let us ask the Natural Language Question Answering System START
some questions using reference:
What does the question sequence (Who is the Queen of
England? What is her age?) return?
What does the question sequence (Who is James Bond? Who is
the Queen of England? What is his age?) return?
What does the question sequence (Who is James Bond? Who is
the Queen of England? How old is this person?) return?

http://start.csail.mit.edu

Reference Resolution

Humans often resolve the ambiguity of referring expressions by asking clarification questions:
Did you mean James Bond?
Did you mean the Queen?
How old is who?
Who did you mean?


Reference Resolution

John: [Mary]1 baked [a vanilla slice]2 for [the birthday party]3.
Unfortunately, [she]4 forgot [the cake]5 in [the oven]6.
Bob: Really?

#   First mention    Reference
1   Mary             she
2   vanilla slice    the cake


Reference Resolution

antecedent - the expression that appears before a referring expression and refers to the same discourse entity
anaphor - a referring expression that points to an entity already introduced into the discourse
anaphoric relation - the relation that binds the antecedent and the anaphor


Reference Resolution

Anaphora Resolution
Coreference Resolution


Anaphora Resolution

Anaphora Resolution (AR) is the task of identifying the antecedent of a target word or phrase, where the antecedent has previously been introduced into the discourse [Mitkov, 2002].


find the correct antecedent for each anaphor
once one antecedent is found, the task is complete: AR does not detect all antecedents in the given discourse


Types of Anaphora

The various types of anaphora may be distinguished:


according to the form of the anaphor (e.g. pronominal anaphora,
lexical noun phrase anaphora, verb/adverb anaphora, zero
anaphora)
according to the locations of the anaphor and the antecedent
(e.g. intrasentential anaphora, intersentential anaphora,
interdocument anaphora)
other (e.g. identity-of-reference anaphora, identity-of-sense
anaphora, cataphora)


Coreference Resolution

Coreference resolution (CR) is the process of identifying the referring expressions in a discourse that are associated with the same entity and grouping them into equivalence classes.
mention/markable - a potentially anaphoric phrase
coreference chain - an equivalence class, i.e. a set of mentions referring to the same entity

Larry King:    Hello hello Jay Georgia hello.
caller_3:      Ah thank (you) (Larry). And (Mike) (I) loved ((your) book). (It) was great. And toward the end of the (book) (you) said Secretary (Putin of Russia) had asked (you) to come over and (interview) (him). Had (you) done (that)? Uh and (I)'d like to know about (it). Thank (you) so much.
Mike Wallace:  Yeah.
Larry King:    (I) did interview (Putin) yes.
Mike Wallace:  on the sixtieth anniversary of the uh end of World War Two (he) asked (me) to come on over and (interview) (him). And (it) was carried uh in a lot of places. But (I) tell you something. (Putin) to (my) way of thinking who calls (himself) a democrat - (He)'s not our kind of democrat.


Hands On

How many coreference chains do these mentions form?


Which are the chains?


Coreference Resolution
Larry King:    Hello hello Jay Georgia hello.
caller_3:      Ah thank (you1) (Larry1). And (Mike2) (I3) loved ((your2) book4). (It4) was great. And toward the end of the (book4) (you2) said Secretary (Putin of Russia5) had asked (you2) to come over and (interview6) (him5). Had (you2) done (that6)? Uh and (I3)'d like to know about (it6). Thank (you2) so much.
Mike Wallace:  Yeah.
Larry King:    (I1) did interview (Putin5) yes.
Mike Wallace:  on the sixtieth anniversary of the uh end of World War Two (he5) asked (me2) to come on over and (interview7) (him5). And (it7) was carried uh in a lot of places. But (I2) tell you something. (Putin5) to (my2) way of thinking who calls (himself5) a democrat - (He5)'s not our kind of democrat.


Reference Resolution

John: [Mary]1 baked [a vanilla slice]2 for [the birthday party]3.
Unfortunately, [she]4 forgot [the cake]5 in [the oven]6.
Bob: Really?

What about mentions such as [the birthday party]3 and [the oven]6? These entities are introduced once but never referred to again.


Coreference Resolution
singletons - mentions that refer to an entity in the text that no other
mention refers to


So, how do we identify coreferential relations? As with the WSD task that we previously discussed, there are two different approaches:
rule-based approaches
machine learning approaches


Rule-based CR

Rule-based approaches rely on:


the availability of lexical and encyclopedic knowledge
manually handcrafted rules
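
A minimal sketch of what one handcrafted rule could look like, using the Mary example from the earlier slides. The tiny `GENDER` lexicon, the pronoun list and the function name `resolve_pronouns` are hypothetical stand-ins for real lexical knowledge and are not taken from any particular system.

```python
# Hypothetical toy lexicon standing in for lexical/encyclopedic knowledge.
GENDER = {"Mary": "f", "John": "m", "she": "f", "he": "m", "her": "f", "him": "m"}
PRONOUNS = {"she", "he", "her", "him"}


def resolve_pronouns(mentions):
    """Link each pronoun to the nearest preceding non-pronoun of the same gender.

    `mentions` is a list of mention strings in textual order.
    Returns a dict {pronoun_index: antecedent_index}.
    """
    links = {}
    for i, mention in enumerate(mentions):
        if mention.lower() not in PRONOUNS:
            continue
        gender = GENDER[mention.lower()]
        # Handcrafted rule: scan backwards, stop at the first compatible candidate.
        for j in range(i - 1, -1, -1):
            candidate = mentions[j]
            if candidate.lower() in PRONOUNS:
                continue
            if GENDER.get(candidate) == gender:
                links[i] = j
                break
    return links


mentions = ["Mary", "a vanilla slice", "the birthday party",
            "she", "the cake", "the oven"]
links = resolve_pronouns(mentions)
print(links)  # {3: 0}, i.e. "she" is linked to "Mary"
```

The rule says nothing about [the cake], which would need additional knowledge (for instance, that a vanilla slice is a kind of cake). This illustrates why rule-based systems depend on the availability of such resources.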


Machine Learning for CR

Machine Learning for CR tries to address the drawbacks of rule-based approaches:
the cost for manually developing rules
the cost for maintaining and extending the rules


Coreference Models

Coreference Resolution is often represented as a binary classification task, and there are several CR models that can be used for this purpose [Rahman and Ng, 2011]:
mention-pair model
mention-ranking model
entity-mention model
cluster-ranking model


Evaluation

The most widely used evaluation metrics are MUC, B³, both CEAF variants (CEAFe and CEAFm) and BLANC. None of them, however, provides an optimal evaluation. This is well demonstrated by the two baselines (singletons and all-in-one).
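
To see why the two baselines expose weaknesses of the metrics, here is a small sketch of the standard link-based MUC score and the mention-based B³ score, applied to the four gold chains of the Mary example. The function names and the toy data are assumptions made for illustration; this is not the official scorer, and the numbers are not comparable to corpus-level results.

```python
def muc_recall(key, response):
    """MUC recall of `response` against `key`; swap the arguments for precision."""
    numerator = denominator = 0
    for cluster in key:
        # Count the pieces this cluster is split into by the other partition.
        # A mention missing from the other partition counts as its own piece.
        pieces = set()
        for mention in cluster:
            owner = next((i for i, r in enumerate(response) if mention in r), mention)
            pieces.add(owner)
        numerator += len(cluster) - len(pieces)
        denominator += len(cluster) - 1
    return numerator / denominator if denominator else 0.0


def b_cubed(key, response):
    """B3 (recall, precision), averaged over all mentions in `key`."""
    key_of = {m: c for c in key for m in c}
    resp_of = {m: c for c in response for m in c}
    recall = precision = 0.0
    for mention, key_cluster in key_of.items():
        resp_cluster = resp_of.get(mention, {mention})
        overlap = len(key_cluster & resp_cluster)
        recall += overlap / len(key_cluster)
        precision += overlap / len(resp_cluster)
    n = len(key_of)
    return recall / n, precision / n


# Gold chains of the Mary example (two real chains, two singletons).
key = [{"Mary", "she"}, {"a vanilla slice", "the cake"},
       {"the birthday party"}, {"the oven"}]
mentions = [m for chain in key for m in chain]
singletons = [{m} for m in mentions]   # baseline 1: every mention alone
all_in_one = [set(mentions)]           # baseline 2: one big chain

for name, response in [("SINGLETONS", singletons), ("ALL-IN-ONE", all_in_one)]:
    muc_r, muc_p = muc_recall(key, response), muc_recall(response, key)
    b3_r, b3_p = b_cubed(key, response)
    print(f"{name:11} MUC R={muc_r:.2f} P={muc_p:.2f}  B3 R={b3_r:.2f} P={b3_p:.2f}")
```

On this toy input the singletons baseline gets a MUC score of zero but perfect B³ precision, while the all-in-one baseline gets perfect recall under both metrics. This mirrors the qualitative pattern the baselines are meant to demonstrate.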


Baselines

             MUC                  CEAF                 B³                   BLANC
             R      P      F1     R      P      F1     R      P      F1     R      P      BLANC
SINGLETONS   0.0    0.0    0.0    71.2   71.2   71.2   71.2   100    83.2   50.0   49.2   49.6
ALL-IN-ONE   100    29.2   45.2   10.5   10.5   10.5   100    3.5    6.7    50.0   0.8    1.6


Evaluation Settings
gold-closed - gold linguistic annotations must be used by the systems
and no external tools and resources are allowed for additional
preprocessing.
auto-closed - auto linguistic annotations must be used by the systems
and no external tools and resources are allowed for additional
preprocessing.
gold-open - gold linguistic annotations must be used by the systems
and external tools and resources are allowed for additional
preprocessing.
auto-open - auto linguistic annotations must be used by the systems
and external tools and resources are allowed for additional
preprocessing.


Data
Word#  Word       POS   ParseBit      SA         NE             PredArgs  PredArgs
0      It         PRP   (TOP(S(NP*)   Speaker#1  *              *         (ARG1*)
1      is         VBZ   (VP*          Speaker#1  *              (V*)      *
2      composed   VBN   (VP*          Speaker#1  *              *         (V*)
3      of         IN    (PP*          Speaker#1  *              *         (ARG2*
4      a          DT    (NP(NP*       Speaker#1  *              *         *
5      primary    JJ    *             Speaker#1  *              *         *
6      stele      NN    *)            Speaker#1  *              *         *
7      ,          ,     *             Speaker#1  *              *         *
8      secondary  JJ    (NP*          Speaker#1  *              *         *
9      steles     NNS   *)            Speaker#1  *              *         *
10     ,          ,     *             Speaker#1  *              *         *
11     a          DT    (NP*          Speaker#1  *              *         *
12     huge       JJ    *             Speaker#1  *              *         *
13     round      NN    *             Speaker#1  *              *         *
14     sculpture  NN    (NML(NML*)    Speaker#1  *              *         *
15     and        CC    *             Speaker#1  *              *         *
16     beacon     NN    (NML*         Speaker#1  *              *         *
17     tower      NN    *)))          Speaker#1  *              *         *
18     ,          ,     *             Speaker#1  *              *         *
19     and        CC    *             Speaker#1  *              *         *
20     the        DT    (NP*          Speaker#1  (WORK_OF_ART*  *         *
21     Great      NNP   *             Speaker#1  *              *         *
22     Wall       NNP   *)            Speaker#1  *)             *         *
23     ,          ,     *             Speaker#1  *              *         *
24     among      IN    (PP*          Speaker#1  *              *         *
25     other      JJ    (NP*          Speaker#1  *              *         *
26     things     NNS   *))))))       Speaker#1  *              *         *)
27     .          .     *))           Speaker#1  *              *         *

Sparse columns (row alignment not recoverable): PredLemma: -; PFID: 03, 01, -; WS: 2, -; Coref: (22), (24 ... 24), (13 ... 13), -

#begin document <document ID>
<sentence>
<sentence>
...
<sentence>
#end document <document ID>
...
#begin document <document ID>
<sentence>
<sentence>
...
<sentence>
#end document <document ID>


<token#1 column#1> <token#1 column#2> <token#1 column#3> ...
<token#2 column#1> <token#2 column#2> <token#2 column#3> ...
<token#3 column#1> <token#3 column#2> <token#3 column#3> ...
...
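
The document and sentence layout described above can be read with a few lines of code. The sketch below assumes that columns are separated by whitespace and sentences by blank lines. The slides do not state the separators, so treat both as assumptions, and `read_conll` as a hypothetical helper name.

```python
def read_conll(lines):
    """Yield (document_id, sentences) pairs.

    Each sentence is a list of token rows; each row is a list of column strings.
    Assumes whitespace-separated columns and blank lines between sentences.
    """
    doc_id, sentences, sentence = None, [], []
    for raw in lines:
        line = raw.strip()
        if line.startswith("#begin document"):
            doc_id = line[len("#begin document"):].strip()
            sentences, sentence = [], []
        elif line.startswith("#end document"):
            if sentence:                 # flush the last sentence of the document
                sentences.append(sentence)
            yield doc_id, sentences
            doc_id, sentences, sentence = None, [], []
        elif not line:                   # blank line closes the current sentence
            if sentence:
                sentences.append(sentence)
                sentence = []
        else:
            sentence.append(line.split())


sample = """#begin document demo
0 It PRP
1 is VBZ

0 Really UH
#end document demo
""".splitlines()

docs = list(read_conll(sample))
for doc_id, sentences in docs:
    print(doc_id, len(sentences), "sentences")
```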


The CR pipeline


Mention Detection

Identification of mentions using:
Heuristic approaches: POS, NEs
Rule-based approaches: syntactic annotation
Machine learning approaches: can use all types of provided annotations
Hybrid approaches: combination of implemented approaches


Mention Detection Methods



Using NEs
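
One way to use the NE column for mention detection is to turn its bracket notation into token spans. The sketch below does this for the example sentence from the Data slide; `bracket_spans` is a hypothetical helper name, and the approach assumes the brackets in the column are well nested.

```python
import re

WORDS = ("It is composed of a primary stele , secondary steles , a huge round "
         "sculpture and beacon tower , and the Great Wall , among other things .").split()
# NE column of the example sentence: one cell per token.
NE = ["*"] * 20 + ["(WORK_OF_ART*", "*", "*)"] + ["*"] * 5


def bracket_spans(column):
    """Convert a bracketed column into (label, start, end) spans, end inclusive."""
    spans, stack = [], []
    for i, cell in enumerate(column):
        # Every "(LABEL" opens a span at this token.
        for label in re.findall(r"\(([^()*]+)", cell):
            stack.append((label, i))
        # Every ")" closes the most recently opened span.
        for _ in range(cell.count(")")):
            label, start = stack.pop()
            spans.append((label, start, i))
    return spans


ne_mentions = [(label, " ".join(WORDS[start:end + 1]))
               for label, start, end in bracket_spans(NE)]
print(ne_mentions)  # [('WORK_OF_ART', 'the Great Wall')]
```

Named entities alone recover only one mention in this sentence, so in practice they would be combined with other cues.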


Using POS-based chunking


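
A POS-based chunker can be sketched as a single left-to-right scan over the POS column of the example sentence. The pattern used here (optional determiner, any adjectives, one or more nouns, or a lone pronoun) is an assumed heuristic chosen for illustration and is not the rule set of any specific system.

```python
WORDS = ("It is composed of a primary stele , secondary steles , a huge round "
         "sculpture and beacon tower , and the Great Wall , among other things .").split()
TAGS = ("PRP VBZ VBN IN DT JJ NN , JJ NNS , DT JJ NN NN CC NN NN , CC "
        "DT NNP NNP , IN JJ NNS .").split()
NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}


def pos_chunks(tags):
    """Return (start, end) spans, end inclusive, matching DT? JJ* NOUN+ or PRP."""
    spans, i, n = [], 0, len(tags)
    while i < n:
        if tags[i] == "PRP":              # a pronoun is a mention on its own
            spans.append((i, i))
            i += 1
            continue
        j = i
        if tags[j] == "DT":               # optional determiner
            j += 1
        while j < n and tags[j] == "JJ":  # any number of adjectives
            j += 1
        k = j
        while k < n and tags[k] in NOUN_TAGS:  # one or more nouns
            k += 1
        if k > j:
            spans.append((i, k - 1))
            i = k
        else:
            i += 1
    return spans


chunks = [" ".join(WORDS[s:e + 1]) for s, e in pos_chunks(TAGS)]
print(chunks)
```

Note that the scan splits "a huge round sculpture" and "beacon tower" into two chunks, because a flat POS pattern cannot see the coordination that the syntactic annotation groups into a single NP.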


Using the syntactic annotation


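
With the syntactic annotation, candidate mentions can be read off the ParseBit column by collecting every NP constituent. The following sketch assumes well-nested brackets and uses the ParseBit values of the example sentence; `np_spans` is a hypothetical helper name.

```python
import re

WORDS = ("It is composed of a primary stele , secondary steles , a huge round "
         "sculpture and beacon tower , and the Great Wall , among other things .").split()
# ParseBit column of the example sentence: one cell per token.
PARSE = ["(TOP(S(NP*)", "(VP*", "(VP*", "(PP*", "(NP(NP*", "*", "*)", "*",
         "(NP*", "*)", "*", "(NP*", "*", "*", "(NML(NML*)", "*", "(NML*",
         "*)))", "*", "*", "(NP*", "*", "*)", "*", "(PP*", "(NP*",
         "*))))))", "*))"]


def np_spans(parse_bits):
    """Return (start, end) token spans, end inclusive, of all NP constituents."""
    spans, stack = [], []
    for i, cell in enumerate(parse_bits):
        for label in re.findall(r"\(([^()*]+)", cell):  # opened constituents
            stack.append((label, i))
        for _ in range(cell.count(")")):                 # closed constituents
            label, start = stack.pop()
            if label == "NP":
                spans.append((start, i))
    return spans


nps = [" ".join(WORDS[s:e + 1]) for s, e in np_spans(PARSE)]
for np_text in nps:
    print(np_text)
```

Unlike the NE column, the parse yields nested candidates, including the large NP that spans the whole coordination from "a primary stele" to "other things". A later filtering step would have to decide which of these to keep.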


Toy Example following the Mention-Pair Model

[Mary]1 had [a good idea]2. [She]3 wanted to tell [John]4.

[a good idea] [Mary]
[She] [a good idea]
[She] [Mary]
[John] [She]
[John] [a good idea]
[John] [Mary]
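
The pair list above can be generated mechanically: each mention is paired with every mention that precedes it, nearest first. A minimal sketch (the function name `candidate_pairs` is a hypothetical choice):

```python
def candidate_pairs(mentions):
    """Pair every mention with each preceding mention, closest antecedent first."""
    pairs = []
    for i in range(1, len(mentions)):
        for j in range(i - 1, -1, -1):
            pairs.append((mentions[i], mentions[j]))
    return pairs


mentions = ["Mary", "a good idea", "She", "John"]
pairs = candidate_pairs(mentions)
for anaphor, antecedent in pairs:
    print(f"[{anaphor}] [{antecedent}]")
```

For n mentions this yields n(n-1)/2 pairs, six in the toy sentence.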


Mention Head Detection


Mention Head Detection is generally realized via:


Heuristics
Rule-based methods
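
As an illustration of the heuristic route, the sketch below picks the rightmost noun or pronoun of a mention as its head. This is an assumed, deliberately simple heuristic, and `head` is a hypothetical function name; it is not the method of a particular system.

```python
def head(tokens, tags):
    """Return the rightmost noun or pronoun; fall back to the last token."""
    for token, tag in zip(reversed(tokens), reversed(tags)):
        if tag.startswith(("NN", "PRP")):
            return token
    return tokens[-1]


print(head(["a", "good", "idea"], ["DT", "JJ", "NN"]))       # idea
print(head(["She"], ["PRP"]))                                # She
print(head(["Putin", "of", "Russia"], ["NNP", "IN", "NNP"])) # Russia
```

The last call shows the limit of the heuristic: for post-modified mentions such as "Putin of Russia" it returns the wrong word, which is where rules over the syntactic annotation become useful.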


Toy Example following the Mention-Pair Model

[Mary]1 had [a good idea]2. [She]3 wanted to tell [John]4.

idea Mary
She idea
She Mary
John She
John idea
John Mary


Example of Features Used by the Mention-Pair Model


Toy Example following the Mention-Pair Model

[Mary]1 had [a good idea]2. [She]3 wanted to tell [John]4.

idea Mary NN NNP
She idea PRP NN
She Mary PRP NNP
John She NNP PRP
John idea NNP NN
John Mary NNP NNP


Toy Example following the Mention-Pair Model

training: [Mary]1 had [a good idea]2. [She]1 wanted to tell [John]4.
test: [Mary]1 had [a good idea]2. [She]3 wanted to tell [John]4.

Training instances:
idea Mary NN NNP F
She idea PRP NN F
She Mary PRP NNP T
John She NNP PRP F
John idea NNP NN F
John Mary NNP NNP F

Test instances:
idea Mary NN NNP
She idea PRP NN
She Mary PRP NNP
John She NNP PRP
John idea NNP NN
John Mary NNP NNP
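
The full mention-pair loop (learn from labelled pairs, classify test pairs, merge positive pairs into chains) can be sketched on the toy instances. The "classifier" here is an assumed stand-in: it simply memorises the majority label per POS-pair feature vector, where a real system would train a proper learner on many more features. All function names are hypothetical, and the POS tag of "She" is taken to be PRP.

```python
from collections import Counter, defaultdict

# (anaphor head, antecedent head, POS of anaphor, POS of antecedent, label)
train = [
    ("idea", "Mary", "NN", "NNP", False),
    ("She", "idea", "PRP", "NN", False),
    ("She", "Mary", "PRP", "NNP", True),
    ("John", "She", "NNP", "PRP", False),
    ("John", "idea", "NNP", "NN", False),
    ("John", "Mary", "NNP", "NNP", False),
]
test = [row[:4] for row in train]  # same pairs, labels withheld


def fit(instances):
    """Memorise the majority label for each (POS, POS) feature vector."""
    votes = defaultdict(Counter)
    for _, _, pos_a, pos_b, label in instances:
        votes[(pos_a, pos_b)][label] += 1
    return {feats: counts.most_common(1)[0][0] for feats, counts in votes.items()}


def predict(model, pos_a, pos_b):
    """Unseen feature vectors default to 'not coreferent'."""
    return model.get((pos_a, pos_b), False)


def cluster(mentions, positive_pairs):
    """Merge positively classified pairs into chains (transitive closure)."""
    parent = {m: m for m in mentions}

    def find(m):
        while parent[m] != m:
            m = parent[m]
        return m

    for a, b in positive_pairs:
        parent[find(a)] = find(b)
    chains = defaultdict(list)
    for m in mentions:
        chains[find(m)].append(m)
    return list(chains.values())


model = fit(train)
positive = [(a, b) for a, b, pos_a, pos_b in test if predict(model, pos_a, pos_b)]
chains = cluster(["Mary", "idea", "She", "John"], positive)
print(chains)  # [['Mary', 'She'], ['idea'], ['John']]
```

With POS tags as the only features, the model can do no more than learn "a pronoun following a proper noun is coreferent", which is why richer features (agreement, distance, semantic similarity) matter in practice.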


Hands On

How would you employ semantic similarity for the task of coreference
resolution?


Thank you!


Bibliography

Ruslan Mitkov. Anaphora Resolution. Studies in Language and Linguistics. Longman, 2002.
Altaf Rahman and Vincent Ng. Narrowing the Modeling Gap: A Cluster-Ranking Approach to Coreference Resolution. J. Artif. Int. Res., 40(1):469–521, January 2011.
