
1 (plus a few notes on coreference in advance for Friday)
CS 224U / LINGUIST 288/188: Natural Language Understanding
Jurafsky and Manning
Lecture 5: Coherence (plus a few notes on coreference in advance for Friday)
Oct 9, 2007
Dan Jurafsky
Slides from Massimo Poesio, Eleni Miltsakaki, Julia Hirschberg, Daniel Marcu

2 What makes a text/dialogue coherent?
“Consider, for example, the difference between passages (18.71) and (18.72). Almost certainly not. The reason is that these utterances, when juxtaposed, will not exhibit coherence. Do you have a discourse? Assume that you have collected an arbitrary set of well-formed and independently interpretable utterances, for instance, by randomly selecting one sentence from each of the previous chapters of this book.” vs….

3 What makes a text/dialogue coherent?
“Assume that you have collected an arbitrary set of well-formed and independently interpretable utterances, for instance, by randomly selecting one sentence from each of the previous chapters of this book. Do you have a discourse? Almost certainly not. The reason is that these utterances, when juxtaposed, will not exhibit coherence. Consider, for example, the difference between passages (18.71) and (18.72).” (J&M:695)

4 What makes a text coherent?
Discourse structure: in a coherent text the parts of the discourse exhibit a sensible ordering and hierarchical relationship
Rhetorical structure: the elements in a coherent text are related via meaningful relations (“coherence relations”)
Discourse entities / entity structure (“focus”): a coherent text is about some entity or entities, and the entity/entities is/are referred to in a structured way throughout the text

5 Part I: Discourse Structure
Conventional structures for different genres:
Academic articles: Abstract, Introduction, Methodology, Results, Conclusion
Newspaper story: inverted pyramid structure (lead followed by expansion)
Spoken patient reports (SOAP): Subjective, Objective, Assessment, Plan

6 Discourse Segmentation
A simpler task: discourse segmentation, i.e. separating a document into a linear sequence of subtopics
Why?
Information retrieval: automatically segmenting a TV news broadcast or a long news story into a sequence of stories
Text summarization: summarize different segments of a document
Information extraction: extract info from inside a single discourse segment

7 Unsupervised Discourse Segmentation
Hearst (1997): a 21-paragraph science news article called “Stargazers”
Goal: produce the following subtopic segments:

8 Key intuition: cohesion
Halliday and Hasan (1976): “The use of certain linguistic devices to link or tie together textual units”
Lexical cohesion: indicated by relations between words in the two units (identical word, synonym, hypernym)
Before winter I built a chimney, and shingled the sides of my house. I thus have a tight shingled and plastered house.
Peel, core and slice the pears and the apples. Add the fruit to the skillet.
Non-lexical cohesion: anaphora
The Woodhouses were first in consequence there. All looked up to them.
Cohesion chain: Peel, core and slice the pears and the apples. Add the fruit to the skillet. When they are soft…

9 Intuition of cohesion-based segmentation
Sentences or paragraphs in a subtopic are cohesive with each other, but not with paragraphs in a neighboring subtopic. Thus if we measure the cohesion between every pair of neighboring sentences, we might expect a ‘dip’ in cohesion at subtopic boundaries.

10 TextTiling (Hearst 1997)
Three steps: Tokenization, Lexical Score Determination, Boundary Identification
Tokenization: take each space-delimited word, convert it to lower case, throw out stop-list words, stem the rest, and group the remaining stems into pseudo-sentences of length w=20
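A minimal sketch of this tokenization step in Python, using NLTK's Porter stemmer and English stop list as stand-ins for Hearst's original preprocessing (the function name and regular expression are mine; only the w=20 grouping comes from the slide):

```python
# Sketch of TextTiling's tokenization step (illustrative, not Hearst's exact code).
# Requires NLTK; run nltk.download("stopwords") once before using the stop list.
import re
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer

def tokenize_into_pseudosentences(text, w=20):
    stemmer = PorterStemmer()
    stops = set(stopwords.words("english"))
    # lower-case the space-delimited words, throw out stop words, stem the rest
    words = [stemmer.stem(tok) for tok in re.findall(r"[a-z']+", text.lower())
             if tok not in stops]
    # group the remaining stems into pseudo-sentences of fixed length w
    return [words[i:i + w] for i in range(0, len(words), w)]
```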

11 TextTiling algorithm

12 Cosine
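The cosine figure on this slide did not survive the transcript, but the lexical score for each gap is just the cosine similarity between the word-count vectors of the blocks of pseudo-sentences on either side of that gap. A small sketch (the function and its interface are illustrative):

```python
# Cosine similarity between two bags of words, as used for TextTiling's lexical scores.
# block_a and block_b are lists of (stemmed) words from the blocks around a gap.
import math
from collections import Counter

def cosine(block_a, block_b):
    va, vb = Counter(block_a), Counter(block_b)
    dot = sum(va[w] * vb[w] for w in va if w in vb)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0
```

Boundary identification then looks for gaps whose lexical score is a local minimum (a ‘dip’) relative to the peaks on either side.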

13 Supervised Discourse segmentation
Discourse markers or cue words
Broadcast news: “Good evening, I’m <PERSON>”, “…coming up….”
Science articles: “First,….”, “The next topic….”

14 Supervised discourse segmentation
Supervised machine learning:
Label segment boundaries in training and test sets
Extract features in training
Learn a classifier
In testing, apply the features to predict boundaries
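A hedged sketch of this setup with scikit-learn; the cue-word features and names below are illustrative placeholders rather than the feature set of any published segmenter:

```python
# Sketch of supervised segmentation: classify each sentence gap as boundary / non-boundary.
# The features (cue word starting the next sentence, lengths) are illustrative only.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

CUES = {"first", "next", "finally", "now", "coming"}   # toy cue-word list

def gap_features(prev_sent, next_sent):
    toks = next_sent.split()
    first_word = toks[0].lower().strip(",.") if toks else ""
    return {"first_word=" + first_word: 1.0,
            "starts_with_cue": 1.0 if first_word in CUES else 0.0,
            "prev_len": float(len(prev_sent.split()))}

def train_segmenter(sentence_pairs, labels):
    """sentence_pairs: list of (prev, next) sentences; labels: 1 = boundary, 0 = not."""
    vec = DictVectorizer()
    X = vec.fit_transform([gap_features(p, n) for p, n in sentence_pairs])
    clf = LogisticRegression(max_iter=1000).fit(X, labels)
    return vec, clf
```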

15 Supervised discourse segmentation
Evaluation: WindowDiff (Pevzner and Hearst 2002)
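WindowDiff slides a window of length k over both segmentations and counts the positions where the reference and the hypothesis disagree on the number of boundaries inside the window. A minimal sketch, assuming each segmentation is given as a 0/1 list marking whether a boundary follows each position:

```python
# Minimal WindowDiff sketch. ref and hyp are equal-length 0/1 boundary indicator lists;
# k is typically half the average true segment length. Lower scores are better.
def window_diff(ref, hyp, k):
    assert len(ref) == len(hyp) and 0 < k < len(ref)
    n = len(ref)
    disagreements = sum(1 for i in range(n - k)
                        if sum(ref[i:i + k]) != sum(hyp[i:i + k]))
    return disagreements / (n - k)
```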

16 Part II: Coherence Relations

17 What is RST? A descriptive theory of discourse organization, characterizing text mostly in terms of relations that hold between parts of text. Slide from Eleni Miltsakaki

18 History
RST was developed as part of a project on computer-based generation of text by Bill Mann, Sandy Thompson and Christian Matthiessen
RST is based on studies of carefully written texts from a variety of sources
RST is intended to describe texts (not processes of producing or understanding them)
RST gives an account of coherence in text
Slide from Eleni Miltsakaki

19 Elements of RST Relations Schemas Schema applications Structures
Slide from Eleni Miltsakaki

20 Relations Relations hold between two non-overlapping text spans
Nucleus-Satellite relations (denoted by N and S)
Multi-nuclear relations
Slide from Eleni Miltsakaki

21 Example Slide from Eleni Miltsakaki

22 RST tree Slide from Eleni Miltsakaki

23 Definition of relations
Constraints on nucleus
Constraints on satellite
Constraints on the combination of nucleus and satellite
The effect
Slide from Eleni Miltsakaki

24 RST schemas Schemas define the structural constituency arrangement of text. Slide from Eleni Miltsakaki

25 RST schema applications
Unordered spans: the schemas do not constrain the order of nucleus or satellites in the text span in which the schema is applied
Optional relations: for multi-relation schemas, all individual relations are optional, but at least one of the relations must hold
Repeated relations: a relation that is part of a schema can be applied any number of times in the application of that schema
Slide from Eleni Miltsakaki

26 Basic RST relations Slide from Eleni Miltsakaki

27 Evidence
Relation name: EVIDENCE
Constraints on N: R(eader) might not believe N to a degree satisfactory to W(riter)
Constraints on S: the reader believes S or will find it credible
Constraints on the N+S combination: R’s comprehending S increases R’s belief of N
The effect: R’s belief of N is increased
Locus of the effect: N
Slide from Eleni Miltsakaki

28 Example
1. The program as published for calendar year 1980 really works.
2. In only a few minutes, I entered all the figures from my 1980 tax return
3. and got a result which agreed with my hand calculations to the penny.
2-3 EVIDENCE for 1
Slide from Eleni Miltsakaki

29 Justify
Relation name: JUSTIFY
Constraints on N: none
Constraints on S: none
Constraints on N+S combination: R’s comprehending S increases R’s readiness to accept W’s right to present N
The effect: R’s readiness to accept W’s right to present N is increased
Locus of the effect: N
Slide from Eleni Miltsakaki

30 Antithesis
Relation name: ANTITHESIS
Constraints on N: W has positive regard for the situation presented in N
Constraints on S: none
Constraints on N+S combination: the situations presented in N and S are in contrast. Because of the incompatibility that arises from the contrast, one cannot have positive regard for both situations presented in N and S; comprehending S and the incompatibility between the situations presented in N and S increases R’s positive regard for the situation presented in N
The effect: R’s positive regard for N is increased
Locus of effect: N
Slide from Eleni Miltsakaki

31 Concession
Relation name: CONCESSION
Constraints on N: W has positive regard for the situation presented in N
Constraints on S: W is not claiming that the situation presented in S doesn’t hold
Constraints on the N+S combination: W acknowledges a potential or apparent incompatibility between the situations presented in N and S; recognizing the incompatibility increases R’s positive regard for the situation presented in N
The effect: R’s positive regard for the situation presented in N is increased
Locus of effect: N and S
Slide from Eleni Miltsakaki

32 Example
1. Concern that this material is harmful to health or the environment may be misplaced.
2. Although it is toxic to certain animals,
3. evidence is lacking that it has any serious long-term effect on human beings.
2 CONCESSION to 3
2-3 ELABORATION to 1
Slide from Eleni Miltsakaki

33 Span order Slide from Eleni Miltsakaki

34 Distinctions among relations
Subject matter (semantic): two parts of the text are understood as causally related in the subject matter, e.g. VOLITIONAL CAUSE
Presentational (pragmatic): facilitate the presentation process, e.g. JUSTIFY
Slide from Eleni Miltsakaki

35 What is nuclearity? Relations are mostly asymmetric
E.g. if A is evidence for B, then B is not evidence for A
Diagnostics for nuclearity:
One member is independent of the other, but not vice versa
One member is more suitable for substitution than the other; an EVIDENCE satellite can be replaced by entirely different evidence
One member is more essential to the writer’s purpose than the other
Slide from Eleni Miltsakaki

36 Some Problems with RST (cf. Moore & Pollack ‘92)
How many Rhetorical Relations are there?
How can we use RST in dialogue as well as monologue?
How do we incorporate speaker intentions into RST?
RST does not allow for multiple relations holding between parts of a discourse
RST does not model the overall structure of the discourse
Slide from Julia Hirschberg

37 Part III: Entity-Based Coherence

38 Centering Theory (Grosz, Joshi and Weinstein, 1995)
The component of Grosz and Sidner’s theory of attention and coherence in discourse concerned with LOCAL effects (within discourse segments) Its aim is to make cross-linguistically valid claims about which discourses are easier to process, abstracting away from specific algorithms Slide by Massimo Poesio

39 Claims of the theory, I: Entity coherence
Discourses without a clear ‘central entity’ feel less coherent:
(1) a. John went to his favorite music store to buy a piano.
b. He had frequented the store for many years.
c. He was excited that he could finally buy a piano.
d. He arrived just as the store was closing for the day.
(This example is a bit peculiar but is the standard one.)
(2) a. John went to his favorite music store to buy a piano.
b. It was a store John had frequented for many years.
c. He was excited that he could finally buy a piano.
d. It was closing just as John arrived.
Slide by Massimo Poesio

40 Concepts and definitions, I
Every UTTERANCE U in a discourse (segment) DS updates the LOCAL FOCUS, a PARTIALLY RANKED set of discourse entities, or FORWARD-LOOKING CENTERS (CFs)
An utterance U in discourse segment DS updates the existing CF set by replacing it with the set of CFs REALIZED in U, CF(U,DS) (usually simplified to CF(U))
The most highly ranked CF realized in utterance U is CP(U)
(1) u1. Susan gave James a pet hamster. CF(u1) = [Susan, James, pet hamster]. CP(u1) = Susan
(2) u2. She gave Peter a nice scarf. CF(u2) = [Susan, Peter, nice scarf]. CP(u2) = Susan
Slide by Massimo Poesio

41 Concepts and Definitions,II: The CB
Constraint 3: The BACKWARD-LOOKING CENTER of utterance Ui, CB(Ui), is the highest-ranked element of CF(Ui-1) that is realized in Ui
The hypotheses of entity coherence and uniqueness are formalized in terms of the CB, CT’s equivalent of ‘topic’
Slide by Massimo Poesio

42 The CB: Examples
(1) u1. Susan gave James a pet hamster. CF(u1) = [Susan, James, pet hamster]. CB = undefined. CP = Susan
(2) u2. She gave Peter a nice scarf. CF(u2) = [Susan, Peter, nice scarf]. CB = Susan. CP = Susan
NB: the CB is not always the most highly ranked entity of the PREVIOUS utterance
(2’) u2. He loves hamsters. CF(u2) = [James]. CB = James. CP = James
… or the most highly ranked entity of the CURRENT one
(2’’) u2. Peter gave her a nice scarf. CF(u2) = [Peter, Susan, nice scarf]. CB = Susan. CP = Peter
Slide by Massimo Poesio

43 Constraint 1
CONSTRAINT 1 (STRONG): All utterances of a segment except for the first have exactly one CB
CB UNIQUENESS: Utterances have at most one CB
ENTITY CONTINUITY: For all utterances of a segment except for the first, CF(Ui) ∩ CF(Ui-1) ≠ Ø
CONSTRAINT 1 (WEAK): All utterances of a segment except for the first have AT MOST ONE CB
Slide by Massimo Poesio

44 Claims of the theory, II: Local salience and pronominalization
Grosz et al (1995): the CB is also the most salient entity. Texts in which other entities (but not the CB) are pronominalized are less felicitous:
(1) a. Something must be wrong with John.
b. He has been acting quite odd.
c. He called up Mike yesterday.
d. John wanted to meet him quite urgently.
(2) a. Something must be wrong with John.
b. He has been acting quite odd.
c. He called up Mike yesterday.
d. He wanted to meet him quite urgently.
Slide by Massimo Poesio

45 Rule 1 RULE 1: if any CF is pronominalized, the CB is.
Slide by Massimo Poesio

46 Claims of the theory, III: Preserving the ranking
Discourses without a clear ‘central entity’ feel less coherent:
(1) a. John went to his favorite music store to buy a piano.
b. He had frequented the store for many years.
c. He was excited that he could finally buy a piano.
d. He arrived just as the store was closing for the day.
(2) a. John went to his favorite music store to buy a piano.
b. It was a store John had frequented for many years.
c. He was excited that he could finally buy a piano.
d. It was closing just as John arrived.
Slide by Massimo Poesio

47 Transitions
Grosz et al proposed that the load involved in processing an utterance depends on whether that utterance preserves the CB of the previous utterance or not, and on whether CB(U) is also CP(U). They introduce the following classification:
CENTER CONTINUATION: Ui is a continuation if CB(Ui) = CB(Ui-1), and CB(Ui) = CP(Ui)
CENTER RETAIN: Ui is a retain if CB(Ui) = CB(Ui-1), but CB(Ui) is different from CP(Ui)
CENTER SHIFT: Ui is a shift if CB(Ui) ≠ CB(Ui-1)
Slide by Massimo Poesio
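These definitions translate directly into a little code. A sketch, assuming each utterance is represented by its ranked CF list (most highly ranked entity first); the smooth/rough shift refinement of Brennan et al. is left out, and the treatment of an undefined CB(Ui-1) is my own convention:

```python
# Sketch: compute CB(Ui) and the Centering transition from ranked CF lists.
def compute_cb(cf_prev, cf_curr):
    """CB(Ui): the highest-ranked element of CF(Ui-1) that is realized in Ui."""
    for entity in cf_prev:
        if entity in cf_curr:
            return entity
    return None

def classify_transition(cf_prev, cb_prev, cf_curr):
    cb = compute_cb(cf_prev, cf_curr)
    cp = cf_curr[0] if cf_curr else None       # CP(Ui): most highly ranked CF of Ui
    # an undefined CB(Ui-1) is treated as compatible with any CB(Ui) (assumption)
    if cb is not None and (cb_prev is None or cb == cb_prev):
        return "CONTINUE" if cb == cp else "RETAIN"
    return "SHIFT"

# Slide examples: u1 "Susan gave James a pet hamster", then u2 "She gave Peter a nice scarf"
# classify_transition(["Susan", "James", "hamster"], "Susan",
#                     ["Susan", "Peter", "scarf"])   -> "CONTINUE"
# ... versus u2' "He loves hamsters":
# classify_transition(["Susan", "James", "hamster"], "Susan",
#                     ["James"])                     -> "SHIFT"
```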

48 Utterance classification
(0) u0. Susan is a generous person. CF(u0) = [Susan] CB = undefined CP = Susan. (1) u1. She gave James a pet hamster. CF(u1) = [Susan,James,pet hamster]. CB = Susan CP=Susan CONTINUE: (2) u2. She gave Peter a nice scarf. CF(u2) = [Susan,Peter,nice scarf]. CB=Susan CP=Susan CONTINUE Slide by Massimo Poesio

49 Utterance classification, II
(0) u0. Susan is a generous person. CF(u0) = [Susan] CB = undefined CP = Susan. (1) u1. She gave James a pet hamster. CF(u1) = [Susan,James,pet hamster]. CB = Susan CP=Susan SHIFT: (2’) u2. He loves hamsters. CF(u2) = [James]. CB=James. CP=James SHIFT Slide by Massimo Poesio

50 Utterance classification, III
(0) u0. Susan is a generous person. CF(u0) = [Susan] CB = undefined CP = Susan. (1) u1. She gave James a pet hamster. CF(u1) = [Susan,James,pet hamster]. CB = Susan CP=Susan RETAIN: (2’’) u2. Peter gave her a nice scarf. CF(u2) = [Peter,Susan, nice scarf]. CB=Susan CP=Peter RETAIN Slide by Massimo Poesio

51 Rule 2 RULE 2: (Sequences of) continuations are preferred over (sequences of) retains, which are preferred over (sequences of) shifts. Slide by Massimo Poesio

52 Summary of the claims CONSTRAINT 1: All utterances of a segment except for the first have exactly one CB RULE 1: if any CF is pronominalized, the CB is. RULE 2: (Sequences of) continuations are preferred over (sequences of) retains, which are preferred over (sequences of) shifts. Slide by Massimo Poesio

53 A parametric theory
Grosz et al do not provide algorithms for computing any of the notions used in the basic definitions: UTTERANCE, PREVIOUS UTTERANCE, REALIZATION, RANKING
What counts as a ‘PRONOUN’ for the purposes of Rule 1? (Only personal pronouns? Or demonstrative pronouns as well? What about second person pronouns?)
One of the reasons for the success of the theory is that it provides plenty of scope for theorizing …
Slide by Massimo Poesio

54 Utterance and Previous Utterance
Originally, utterances were implicitly identified with sentences. Kameyama (1998) and others suggested identifying utterances with finite clauses.
If utterances are identified with sentences, the previous utterance is generally easy to identify. But if utterances are identified with finite clauses, there are various ways of dealing with cases like:
(u1) John wanted to leave home (u2) before Bill came home. (u3) He would be drunk as usual.
KAMEYAMA: PREV(u3) = u2. SURI and MCCOY: PREV(u3) = u1
Slide by Massimo Poesio

55 Realization
A basic question is whether entities can be ‘indirectly’ realized in utterances by an associate (as in Sidner’s algorithm):
(u1) John walked towards the house. (u2) THE DOOR was open.
A second question is whether first and second person entities are realized:
(u1) Before you buy this medicine, (u2) you should contact your doctor.
Realization greatly affects Constraint 1.
Slide by Massimo Poesio

56 Ranking
GRAMMATICAL FUNCTION (Kameyama 1986, Grosz Joshi and Weinstein 1986, Brennan et al 1987, Hudson et al 1986, Gordon et al 1993): SUBJ < OBJ < OTHERS
A student was here to see John today: A STUDENT < JOHN
INFORMATION STATUS (Strube and Hahn, 1999): HEARER-OLD < MEDIATED < HEARER-NEW
A student was here to see John today: JOHN < A STUDENT
THEMATIC ROLES (Cote, 1998)
FIRST MENTION / LINEAR ORDER (Rambow, 1993; Gordon et al, 1993): In Lisa’s opinion, John shouldn’t have done that
Slide by Massimo Poesio

57 Echihabi and Marcu
Compile a small set of cue phrases for each relation, e.g.
Cause: [Because X, Y], [X. As a consequence, Y], etc.
Contrast: [X. However, Y], [X even though Y], etc.
Baseline: choose random non-contiguous sentences from a document
Mine a large amount of (noisy) data: if we find a sentence “Because [x1 x2 … xn], [y1 y2 … ym].”, note down that the pairs (x1, y1) … (xn, ym) were observed in a causal setting
So if we find “Because [of poaching, smuggling and related treacheries], [tigers, rhinos and civets are endangered species].”, our belief that the pair (poaching, endangered) indicates a causal relationship is increased
Construct a naïve Bayes classifier such that for two text spans W1 and W2, the probability of relation rk is estimated from these word-pair counts
Slide from Sasha Blair-Goldensohn
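The probability formula on the original slide was an image that did not come through, so the sketch below is only a reconstruction: a word-pair naive Bayes classifier in the spirit of the description above, scoring a relation by its prior times the product of P((wi, wj) | relation) over the word pairs in the cross product of the two spans. The add-one smoothing is my own assumption:

```python
# Hedged sketch of a word-pair naive Bayes classifier for discourse relations.
import math
from collections import defaultdict

class WordPairNB:
    def __init__(self):
        self.pair_counts = defaultdict(lambda: defaultdict(int))  # relation -> (wi, wj) -> count
        self.rel_counts = defaultdict(int)                        # relation -> #training examples

    def train(self, examples):
        """examples: iterable of (relation, span1_words, span2_words) mined via cue phrases."""
        for rel, w1, w2 in examples:
            self.rel_counts[rel] += 1
            for a in w1:
                for b in w2:
                    self.pair_counts[rel][(a, b)] += 1

    def classify(self, w1, w2):
        total = sum(self.rel_counts.values())
        best, best_score = None, float("-inf")
        for rel in self.rel_counts:
            counts = self.pair_counts[rel]
            denom = sum(counts.values()) + len(counts) + 1     # add-one smoothing (assumption)
            score = math.log(self.rel_counts[rel] / total)     # relation prior
            for a in w1:
                for b in w2:
                    score += math.log((counts.get((a, b), 0) + 1) / denom)
            if score > best_score:
                best, best_score = rel, score
        return best
```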

58 Final projects

59 Automatic extraction of semantic relations
Automatic Extraction of Semantic Relations like RESULT or CONTRAST/OPPOSITION (Sebastian Pado and Marie-Catherine de Marneffe)
OPPOSITION: learning that verbs like WALK and DRIVE are in opposition (so “A walked to the store” probably implies that “A didn’t drive to the store”), or learning that “state board member” and “county board member” are in contrast
RESULT: “A MAILED B a book” implies “B HAS a book”, i.e. “X MAIL Y to Z” implies “Z has Y”

60 ACQUISITION OF DIATHESIS ALTERNATIONS
(Sebastian Pado and Marie-Catherine de Marneffe)
Goal: learn mappings like “Peter gave the book to Paul” -> “Peter gave Paul the book”, from a large parsed corpus of German that already has the subcategorization frames for each verb

61 Citation Analysis: classifying citations
Classify citations as:
“BACKGROUND FOR CURRENT WORK”
“CONTRAST TO CURRENT WORK”
“BASIS OF CURRENT WORK”
EXAMPLE: “PREVIOUS WORK IS WEAK:

62 Linking Events to Times
Build a system that links events to time expressions (or the document timestamp) with one of the relations: before, after, overlaps. The already tagged Timebank corpus could be used for development.

63 Lexical semantics: exclusive
(Bill MacCartney) Figure out how to learn categories of predicates (verbs, adjectives, multi-word phrases) which do or do not have exclusive semantics w.r.t. their arguments. For example, the predicate "is the president of" is exclusive with respect to its subject, since "X is the president of Y" can be true for only one X (for a given Y).

64 President Hennessy project: Discovering change in research topics
Discovering how research topics change By reading people’s papers And measuring the change in vocabulary (lexical chains, etc) over the years.

65 Improving thesaurus induction
Learning hyponyms or other semantic relations so as to add new words to a thesaurus.

66 Coreference Resolution
Implementing a state-of-the-art coreference resolution system.

67 Nominalization Unwrapping
Learning to relate nominal and verbal forms
Map “the destruction of the city by the Romans” to “the Romans destroyed the city”

68 Noun Noun compound disambiguation
Is “chocolate box” a box made of chocolate, or a box filled with chocolates?

69 Event coreference Take multiple stories with events in them
Link up coreferring events

70 STAIR Dialogue agents for Andrew Ng’s “STAIR: Stanford AI Robot”

71 Build a basic end-to-end system
Summarization Multi-document summarization Robust Textual Entailment Semantic Role Labeling

72 Part III: Coreference
Victoria Chen, Chief Financial Officer of Megabucks Banking Corp since 2004, saw her pay jump 20%, to $1.3 million, as the 37-year-old also became the Denver-based financial-service company’s president. It has been ten years since she came to Megabucks from rival Lotsabucks.
The Tin Woodman went to the Emerald City to see the Wizard of Oz and ask for a heart. After he asked for it, the Woodman waited for the Wizard’s response.

73 Why reference resolution?
Information Extraction: First Union Corp. is continuing to wrestle with severe problems unleashed by a botched merger and a troubled business strategy. According to industry insiders at Paine Webber, their president, John R. Georgius, is planning to retire by the end of the year.

74 Some terminology
Reference: the process by which speakers use words (“Victoria Chen”, “she”) to denote a particular person
Referring expression: “Victoria Chen”, “she”
Referent: the actual entity (but as a shorthand we might call “Victoria Chen” the referent)
John and he “corefer”
Antecedent: “Victoria Chen”
Anaphor: “she”

75 Tasks Pronominal anaphora resolution Coreference resolution
Pronominal anaphora resolution: given a pronoun, find its antecedent
Coreference resolution: find the coreference relations among all referring expressions; each set of referring expressions is a coreference chain
What are the chains in our story?
{Victoria Chen, Chief Financial Officer of Megabucks Banking Corp, her, the 37-year-old, the Denver-based financial-services company’s president, she}
{Megabucks Banking Corp., the Denver-based financial-services company, Megabucks}
{her pay}
{Lotsabucks}

76 Many types of reference
(after Webber 91) According to Doug, Sue just bought a 1962 Ford Falcon
But that turned out to be a lie (a speech act)
But that was false (proposition)
That struck me as a funny way to describe the situation (manner of description)
That caused Sue to become rather poor (event)

77 4 types of referring expressions
Indefinite noun phrases: new to hearer
Mrs. Martin was so very kind as to send Mrs. Goddard a beautiful goose
He had gone round one day to bring her some walnuts.
I am going to the butcher’s to buy a goose (specific/non-specific): I hope they still have it / I hope they still have one.
Definite noun phrases: identifiable to hearer because
Mentioned: It concerns a white stallion which I have sold to an officer. But the pedigree of the white stallion was not fully established.
Identifiable from beliefs or unique: I read about it in The New York Times
Inherently unique: The fastest car in …

78 Reference Phenomena: 3. Pronouns
Emma smiled and chatted as cheerfully as she could
Compared to definite noun phrases, pronouns require more referent salience.
John went to Bob’s party, and parked next to a classic Ford Falcon. He went inside and talked to Bob for more than an hour. Bob told him that he recently got engaged.
??He also said that he bought it yesterday.
OK: He also said that he bought the Falcon yesterday.

79 More on Pronouns Cataphora: pronoun appears before referent:
Even before she saw it, Dorothy had been thinking about the Emerald City every day.

80 4. Names Miss Woodhouse certainly had not done him justice
International Business Machines sought patent compensation from Amazon. In fact, IBM had previously sued a number of other companies.

81 Complications: Inferrables and Generics
Inferrables (“bridging inferences”) I almost bought a 1962 Ford Falcon today, but a door had a dent and the engine seemed noisy. Generics: I’m interested in buying a Mac laptop. They are very stylish In March in Boulder you have to wear a jacket.

82 Information Structure
Givenness Hierarchy (Gundel et al 1993) Accessibility Scale (Ariel 2001)

83 Features for pronominal anaphora resolution

84 Features for pronominal anaphora resolution
Number agreement:
John has a Ford Falcon. It is red.
*John has three Ford Falcons. It is red.
But note: IBM is announcing a new machine translation product. They have been working on it for 20 years.
Gender agreement:
John has an Acura. He/it/she is attractive.
Syntactic constraints (“Binding Theory”):
John bought himself a new Ford (himself = John)
John bought him a new Ford (him = not John)
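As a toy illustration of how number and gender agreement can be used as filters on candidate antecedents, here is a sketch; the pronoun lexicon and feature values are stand-ins, not a real resource:

```python
# Toy agreement filter for pronominal anaphora (lexicon and values are illustrative).
PRONOUN_FEATURES = {
    "he": ("sg", "male"), "him": ("sg", "male"),
    "she": ("sg", "female"), "her": ("sg", "female"),
    "it": ("sg", "neuter"), "they": ("pl", "any"), "them": ("pl", "any"),
}

def agrees(pronoun, candidate_number, candidate_gender):
    num, gen = PRONOUN_FEATURES.get(pronoun.lower(), ("any", "any"))
    number_ok = num == "any" or candidate_number == num
    gender_ok = gen == "any" or candidate_gender in (gen, "any")
    # caveat from the slide: organizations ("IBM ... they") can violate strict number
    # agreement, so in practice this is better used as a soft feature than a hard constraint
    return number_ok and gender_ok

# agrees("he", "sg", "male")    -> True
# agrees("it", "sg", "female")  -> False
```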

85 Pronoun Interpretation Features
Selectional restrictions: John parked his Ford in the garage. He had driven it around for hours.
Recency: The doctor found an old map in the captain’s chest. Jim found an even older map hidden on the shelf. It described an island full of redwood trees and sandy beaches.

86 Pronoun Interpretation Preferences
Grammatical role: subject preference
Billy Bones went to the bar with Jim Hawkins. He called for a glass of rum. [he = Billy]
Jim Hawkins went to the bar with Billy Bones. He called for a glass of rum. [he = Jim]

87 Repeated Mention preference
Billy Bones had been thinking about a glass of rum ever since the pirate ship docked. He hobbled over to the Old Parrot bar. Jim Hawkins went with him. He called for a glass of rum. [he=Billy]

88 Parallelism Preference
Long John Silver went with Jim to the Old Parrot Billy Bones went with him to the Old Anchor Inn [him=Jim].

89 Verb Semantics Preferences
John telephoned Bill. He lost the laptop.
John criticized Bill. He lost the laptop.
Implicit causality: the implicit cause of criticizing is the object; the implicit cause of telephoning is the subject.

90 Two algorithms for pronominal anaphora resolution
The Hobbs Algorithm A Log-Linear Model

91 Hobbs algorithm
1. Begin at the NP immediately dominating the pronoun.
2. Go up the tree to the first NP or S. Call this X, and the path p.
3. Traverse all branches below X to the left of p, left-to-right, breadth-first. Propose as antecedent any NP that has an NP or S between it and X.
4. If X is the highest S in the sentence, traverse the parse trees of the previous sentences in order of recency, left-to-right, breadth-first; when an NP is encountered, propose it as antecedent. If X is not the highest node, go to step 5.

92 Hobbs algorithm, continued
5. From node X, go up the tree to the first NP or S. Call it X, and the path p.
6. If X is an NP and the path to X did not pass through the nominal that X dominates, propose X as antecedent.
7. Traverse all branches below X to the left of the path, in a left-to-right, breadth-first manner. Propose any NP encountered as the antecedent.
8. If X is an S node, traverse all branches of X to the right of the path, but do not go below any NP or S encountered. Propose any NP as the antecedent.
9. Go to step 4.

93 A loglinear model Supervised machine learning
Train on a corpus in which each pronoun is labeled with the correct antecedent
In order to train, we need to extract:
Positive examples of referent-pronoun pairs
Negative examples of referent-pronoun pairs
Features for each one
Then we train the model to predict 1 for the true antecedent and 0 for wrong antecedents
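A minimal sketch of this training setup, using scikit-learn's logistic regression as the log-linear model; extract_features is a placeholder for the feature set on the next slide:

```python
# Sketch of training a log-linear model for pronominal anaphora resolution.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

def build_training_data(labeled_pronouns, extract_features):
    """labeled_pronouns: iterable of (pronoun, true_antecedent, other_candidates)."""
    feats, labels = [], []
    for pronoun, gold, others in labeled_pronouns:
        feats.append(extract_features(pronoun, gold))      # positive pair
        labels.append(1)
        for cand in others:                                 # negative pairs
            feats.append(extract_features(pronoun, cand))
            labels.append(0)
    return feats, labels

def train_anaphora_model(labeled_pronouns, extract_features):
    feats, labels = build_training_data(labeled_pronouns, extract_features)
    vec = DictVectorizer()
    clf = LogisticRegression(max_iter=1000).fit(vec.fit_transform(feats), labels)
    return vec, clf
```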

94 Features

95 Example
John saw a beautiful 1961 Ford Falcon at the used car dealership (u1). He showed it to Bob (u2). He bought it (u3).

96 Coreference resolution
Victoria Chen, Chief Financial Officer of Megabucks Banking Corp since 2004, saw her pay jump 20%, to $1.3 million, as the 37-year-old also became the Denver-based financial-service company’s president. It has been ten years since she came to Megabucks from rival Lotsabucks.
{Victoria Chen, Chief Financial Officer of Megabucks Banking Corp, her, the 37-year-old, the Denver-based financial-services company’s president, she}
{Megabucks Banking Corp., the Denver-based financial-services company, Megabucks}
{her pay}
{Lotsabucks}

97 Coreference resolution
Victoria Chen, Chief Financial Officer of Megabucks Banking Corp since 2004, saw her pay jump 20%, to $1.3 million, as the 37-year-old also became the Denver-based financial-service company’s president. It has been ten years since she came to Megabucks from rival Lotsabucks.
Have to deal with: names, non-referential pronouns, definite NPs

98 Algorithm for coreference resolution
Based on a binary classifier that, given an anaphor and a potential antecedent, returns true or false
Process the document from left to right
For each NPj we encounter:
Search backwards through the document’s NPs
For each such potential antecedent NPi, run our classifier
If it returns true, coindex NPi and NPj and return
Terminate when we reach the beginning of the document
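The loop above as a short sketch; classifier(anaphor, candidate) stands in for the trained binary model, and mentions is assumed to be the document's NPs in textual order:

```python
# Sketch of the closest-first coreference linking loop described on this slide.
def resolve(mentions, classifier):
    clusters = []        # each cluster (coreference chain) is a list of mention indices
    cluster_of = {}      # mention index -> index into clusters
    for j in range(len(mentions)):
        for i in range(j - 1, -1, -1):                   # search backwards through the document
            if classifier(mentions[j], mentions[i]):
                cid = cluster_of[i]
                clusters[cid].append(j)                  # coindex NPj with NPi's chain
                cluster_of[j] = cid
                break                                    # stop at the closest positive antecedent
        else:                                            # reached the start: begin a new chain
            cluster_of[j] = len(clusters)
            clusters.append([j])
    return clusters
```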

99 Features for coreference classifier

100 More features

101 Evaluation: Vilain et al 1995
Model-based evaluation
Suppose A, B, and C are coreferent. We could represent this as A-B, B-C, or as A-C, A-B, or as A-C, B-C.
Call any of these sets of correct links the reference set. The output of the coreference algorithm is the set of hypothesis links.
Our goal: compute precision and recall from the hypothesis links to the reference set of links.
A clever algorithm deals with the fact that there are multiple possible sets of reference links. Suppose A, B, C, D are coreferent and (A-B, B-C, C-D) is the reference set, and the algorithm returns A-B, C-D.
Precision should be 1, recall should be 2/3 (since we need 3 links to make 4 things coreferent, and we got 2 of them).
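A sketch of this link-based score, with chains represented as sets of mentions. Recall counts, for each reference chain, how many of its required links survive once the chain is partitioned by the system output; precision is the same computation with reference and response swapped:

```python
# Sketch of the MUC link-based coreference score (Vilain et al., 1995).
def muc_recall(reference_chains, response_chains):
    num, den = 0, 0
    for S in reference_chains:
        # partitions of S induced by the response; unlinked mentions count as singleton parts
        parts = set()
        for m in S:
            owner = next((i for i, R in enumerate(response_chains) if m in R), ("singleton", m))
            parts.add(owner)
        num += len(S) - len(parts)      # links recovered for this chain
        den += len(S) - 1               # links needed to make the chain coreferent
    return num / den if den else 1.0

def muc_scores(reference_chains, response_chains):
    r = muc_recall(reference_chains, response_chains)
    p = muc_recall(response_chains, reference_chains)    # precision by symmetry
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

# Slide example: A, B, C, D coreferent; the system returns the links A-B and C-D:
# muc_scores([{"A", "B", "C", "D"}], [{"A", "B"}, {"C", "D"}])  -> (1.0, 2/3, 0.8)
```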

102 Coreference: further difficulties
Lots of other algorithms and other constraints
Hobbs: reference resolution as a by-product of general reasoning
The city council denied the demonstrators a permit because
they feared violence
they advocated violence
An axiom: for all X, Y, Z, W: fear(X,Z) & advocate(Y,Z) & enable_to_cause(W,Y,Z) -> deny(X,Y,W)
Hence deny(city_council, demonstrators, permit)

