Coreference Resolution

Coreference Resolution Seminar by Satpreet Arora (07D05003)

What is Coreference?
In linguistics, coreference occurs when multiple expressions in a sentence or document refer to the same entity.
Example: Aditya went to Videorec to buy a DVD for himself. He had frequented the store for many years now.
Here, Aditya, himself, and He are coreferent; Videorec and the store are also coreferent.

Coreference Resolution
Determine which entities in a document/discourse have the same referent. In NLP, we usually deal with coreference resolution of NPs: the coreference system has to form equivalence classes of NPs that have the same real-world entity as their referent. The coreference relation is both transitive and symmetric.

A Coreference System
Input of a coreference system:
John Simon, Chief Financial Officer of Prime Corp. since 1986, saw his pay jump 20%, to $1.3 million, as the 37-year-old also became the financial-service company’s president.
Output of a coreference system:
[JS John Simon], [JS Chief Financial Officer] of [PC Prime Corp.] since 1986, saw [JS his] pay jump 20%, to $1.3 million, as [JS the 37-year-old] also became the [PC financial-service company]’s [JS president].
Equivalence classes:
JS: {John Simon, Chief Financial Officer, his, the 37-year-old, president}
PC: {Prime Corp., financial-service company}

Anaphora Resolution
Anaphora refers to the linguistic phenomenon of having a noun phrase refer to a previously mentioned entity in a text for its semantic interpretation. In other words, a pair of NPs <npi, npj> constitutes an anaphoric relationship if i < j and npj depends on npi for its interpretation, where npk denotes the kth NP in a document. For instance, in the sentence "Queen Elizabeth set about transforming her husband", the NP pair <Queen Elizabeth, her> forms an anaphoric relationship.

Coreference Resolution - an Important Problem
Applications include:
Question answering and information retrieval: consider the query "Where was Mozart born?". A question answering system may first retrieve the sentence "He was born in Salzburg" from a document about Mozart. The system will return the correct answer only if it can determine that the pronoun He is coreferent with Mozart.
Machine translation: anaphora resolution comes into play when discrepancies exist between the two languages with respect to anaphor selection. For example, a pronoun in the Malay language is often translated directly by its antecedent (Mitkov, 1999).

Coreference Resolution - an Important Problem (cont’d)
Text summarization: summarization tools using coreference resolution not only include in the summary those sentences that contain a term appearing in the query; they also incorporate sentences containing a noun phrase that is coreferent with a term occurring in a sentence already selected by the system.
Cross-document coreference is particularly useful for text summarization systems that need to identify and merge the same piece of information about an entity mentioned in different documents, in order to avoid repetition.

Coreference Resolution - a Hard Problem
The difficulty of the problem lies in its dependence on sophisticated semantic and world knowledge:
The policemen refused the women a permit for the demonstration because they feared violence.
The policemen refused the women a permit for the demonstration because they advocated violence.
Observe how they refers to two different entities in the two sentences depending on the context. This is easy for humans but difficult for machines.

Coreference Resolution - a Hard Problem (cont’d)
Many sources of information play a role:
Lexical information such as a head noun match (as in Lord Spencer and Mr. Spencer) is an indicator of coreference, although not an absolute one (e.g. Lord Spencer and Diana Spencer are not coreferent).
Knowledge sources such as gender and number, semantic class, discourse focus, and world knowledge also play a role in determining whether two discourse entities are coreferent.

Coreference Resolution - a Hard Problem (cont’d)
No single source of knowledge is a completely reliable indicator:
Two semantically compatible NPs are potentially coreferent (e.g. Diana Spencer and the princess), but whether the NPs are actually coreferent depends on other factors (such as contextual information).
Linguistic constraints indicating (non-)coreference, such as number (dis)agreement, are not absolutely hard (e.g. the singular NP assassination (of her bodyguards) can be coreferent with the plural NP these murders).

Coreference Resolution - a Hard Problem (cont’d)
Coreference strategies differ depending on the type of NP:
Definite NPs are more likely to be anaphoric than their non-definite counterparts (e.g. the article immediately preceding photographer in the sentence "Diana saw the/a photographer following her secretly" determines whether the NP has an existential or definite reading).
Pronoun resolution is difficult because resolution strategies differ for each type of pronoun (e.g. reflexives versus possessives) and also because some pronouns, such as pleonastic pronouns, are semantically empty (e.g. the pronoun it in the sentence "Camilla went outside and it was raining" is pleonastic).

The Algorithm
Many different algorithms use different approaches to solve the problem. However, the non-machine-learning-based algorithms all share some basic components:
Step 1: Identification of discourse entities (NPs)
Step 2: Representation of NPs (as a set of features)
Step 3: Calculation of distances between NPs using a distance metric
Step 4: Creation of equivalence classes using a clustering algorithm or other classification tools

Identification of Discourse Entities
For coreference resolution algorithms, the first task is to identify all of the noun phrases in the text. These textual elements include definite noun phrases, demonstrative noun phrases, proper names, appositives, sub-noun phrases that act as modifiers, pronouns, and so on. The basic structure of the identification is as follows:

Identification of Discourse Entities (cont’d)
Tokenization and morphological processing: splitting the text into sentences and stemming words to their root form.
POS tagging: a Hidden Markov Model based statistical POS tagger.
Noun phrase identification: determines noun phrase boundaries based on the POS tags.
Named entity recognition: may also be HMM-based, learning from a tagged corpus of named entities. If there are overlaps, boundaries are adjusted.
Nested noun phrase extraction: accepts noun phrases and determines the nested phrases (if any).
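As a rough illustration of these stages, here is a minimal sketch using spaCy as a stand-in; spaCy's components are statistical but not HMM-based, and the model name en_core_web_sm (the standard small English model, downloaded separately) is a choice made for this example.

```python
import spacy

# Stand-in pipeline: spaCy exposes the same stages the slide describes
# (tokenization, lemmatization, POS tagging, NP chunking, NER), though its
# components are not HMM-based.
# Requires: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")
doc = nlp("John Simon, Chief Financial Officer of Prime Corp., saw his pay jump 20%.")

for chunk in doc.noun_chunks:              # noun phrase boundaries
    print("NP:", chunk.text, "| head:", chunk.root.text)

for ent in doc.ents:                       # named entities
    print("Entity:", ent.text, ent.label_)

for tok in doc:                            # tokens with POS tags and lemmas
    print(tok.text, tok.pos_, tok.lemma_)
```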

Representation of NPs
Each NP is represented as a set of features used to construct the feature vector:
- individual words in the NP
- head noun: last word of the NP
- position in the document
- pronoun type: nominative, accusative, possessive, ambiguous
- article: indefinite, definite, none
- appositive: based on heuristics (commas, etc.)
- number: plural, singular
- proper name: based on heuristics (capitalization, etc.)
- semantic class: based on WordNet
- gender: masculine, feminine, either, neuter
- animacy: based on semantic class
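As a minimal sketch, this feature bundle could be written as the following Python structure; the field names and the sample values are illustrative choices, not a fixed standard.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class NPFeatures:
    """Illustrative feature bundle for one noun phrase."""
    words: list                  # individual words in the NP
    head: str                    # head noun: last word of the NP
    position: int                # position in the document
    pronoun_type: Optional[str]  # nominative / accusative / possessive / ambiguous
    article: str                 # indefinite / definite / none
    appositive: bool             # heuristic: commas, etc.
    number: str                  # singular / plural
    proper_name: bool            # heuristic: capitalization, etc.
    semantic_class: str          # from WordNet
    gender: str                  # masculine / feminine / either / neuter
    animate: bool                # derived from the semantic class

her = NPFeatures(words=["her"], head="her", position=12,
                 pronoun_type="possessive", article="none", appositive=False,
                 number="singular", proper_name=False, semantic_class="person",
                 gender="feminine", animate=True)
```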

Calculating distances between NPs
Different algorithms use different distance metrics. We present here the one from Cardie and Wagstaff (1999), "Noun phrase coreference as clustering", and the corresponding clustering algorithm. The distance between noun phrases NP1 and NP2 is defined as:

dist(NP1, NP2) = Σ_{f ∈ F} w_f × incompatibility_f(NP1, NP2)

where:
F: the set of features
w_f: the weight of feature f
incompatibility_f: the degree of incompatibility between NP1 and NP2 with respect to feature f
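A minimal sketch of this metric in Python, with only two features filled in; the weights are illustrative, not the values used by Cardie and Wagstaff.

```python
import math

INCOMPATIBLE = math.inf  # hard constraints carry infinite weight

def gender_incompat(np1, np2):
    """1 if the NPs clash in gender ('either' matches anything), else 0."""
    if "either" in (np1["gender"], np2["gender"]):
        return 0
    return 0 if np1["gender"] == np2["gender"] else 1

def number_incompat(np1, np2):
    """1 if the NPs disagree in number, else 0."""
    return 0 if np1["number"] == np2["number"] else 1

# (weight, incompatibility function) pairs -- illustrative weights only
FEATURES = [
    (INCOMPATIBLE, gender_incompat),   # a gender clash rules out coreference
    (0.5, number_incompat),
]

def dist(np1, np2):
    """dist(NP1, NP2) = sum over f in F of w_f * incompatibility_f(NP1, NP2)."""
    total = 0.0
    for w, f in FEATURES:
        inc = f(np1, np2)
        if inc:                # skip zero terms (avoids inf * 0 = nan)
            total += w * inc
    return total
```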

Clustering
Properties of the clustering algorithm:
- start from the end of the document and work backwards;
- if the distance between two NPs is less than r, then their equivalence classes are considered for merging;
- classes can be merged unless they contain incompatible NPs;
- the algorithm automatically computes the transitive closure of the coreference relation;
- two NPs can be coreferent even if dist(NP1, NP2) > r, as long as dist(NP1, NP2) ≠ ∞;
- r is a free parameter of the algorithm.
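A sketch of this procedure under the assumption that NPs are the same feature dicts as above; it reuses the dist function from the previous sketch.

```python
import math  # for math.inf, the incompatibility sentinel used by dist

def cluster(nps, r):
    """Greedily cluster NPs (in document order) into coreference classes."""
    classes = {i: {i} for i in range(len(nps))}   # each NP starts alone
    for j in reversed(range(len(nps))):           # end of document first
        for i in reversed(range(j)):              # nearest preceding NP first
            if classes[i] is classes[j]:
                continue                          # already in the same class
            if dist(nps[i], nps[j]) < r:
                # merge only if no pair across the two classes is incompatible
                if all(dist(nps[a], nps[b]) != math.inf
                       for a in classes[i] for b in classes[j]):
                    merged = classes[i] | classes[j]
                    for k in merged:              # shared set = transitive closure
                        classes[k] = merged
    return {frozenset(c) for c in classes.values()}
```

Note how two NPs whose pairwise distance exceeds r can still end up coreferent: each belongs to a class that was merged through some intermediate NP.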

Machine Learning Algorithms
In the algorithm we just saw, the feature weights are fixed manually. Machine learning approaches to coreference resolution instead automatically induce, from annotated data, a model that determines the probability that two NPs are coreferent. They can be characterized in terms of the knowledge sources employed, the method of training data creation, and the learning and clustering algorithms chosen.

Machine Learning Algorithms
Training data creation:
Positive instances - two different methods are used (Aone et al., 1995):
1) Transitive: an instance is formed between an NP and each of its preceding NPs in the same anaphoric chain.
2) Non-transitive: an instance is formed between an NP and its closest preceding NP in the same anaphoric chain.
Negative instances:
1) Negative instances are generated by pairing an NP with each preceding NP that does not have an anaphoric relationship with it (Aone et al., 1995).
2) To reduce the ratio of negative to positive instances, a negative instance is created by pairing an anaphoric NP npj with each NP appearing between npj and its closest preceding antecedent (Soon et al., 2001).
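A sketch of the Soon et al. (2001) scheme; the chain_of representation (one chain id per NP, None for non-anaphoric NPs) is an assumption made for this example.

```python
def soon_training_instances(chain_of):
    """Training pairs in the style of Soon et al. (2001).

    `chain_of[j]` is the coreference chain id of the j-th NP in document
    order, or None for non-anaphoric NPs. For each anaphoric NP j with
    closest antecedent a: (a, j) is a positive instance, and (k, j) for
    every NP k strictly between a and j is a negative instance."""
    instances = []
    for j in range(len(chain_of)):
        if chain_of[j] is None:
            continue
        antecedents = [i for i in range(j) if chain_of[i] == chain_of[j]]
        if not antecedents:
            continue                      # j is the first mention of its chain
        a = max(antecedents)              # closest preceding antecedent
        instances.append((a, j, 1))       # positive instance
        instances.extend((k, j, 0) for k in range(a + 1, j))  # negatives
    return instances

# chain ids for 6 NPs: NPs 0, 2 and 5 corefer; the rest are non-anaphoric
print(soon_training_instances([0, None, 0, None, None, 0]))
# -> [(0, 2, 1), (1, 2, 0), (2, 5, 1), (3, 5, 0), (4, 5, 0)]
```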

Learning Algorithm
Much recent research involving machine learning techniques uses decision trees for classifying NP pairs. Soon et al. used the C5 tree classifier.
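C5 itself is proprietary and not available in common Python libraries, so the sketch below substitutes scikit-learn's CART-based DecisionTreeClassifier; the toy pairwise feature vectors are invented for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier  # CART stand-in for C5

# Toy pairwise features: [head_match, number_agree, gender_agree, distance]
X_train = np.array([[1, 1, 1, 1],
                    [0, 1, 0, 5],
                    [1, 1, 1, 2],
                    [0, 0, 1, 8]])
y_train = np.array([1, 0, 1, 0])   # 1 = the NP pair is coreferent

clf = DecisionTreeClassifier()
clf.fit(X_train, y_train)

# certainty that a new NP pair is coreferent
p_coref = clf.predict_proba([[1, 1, 0, 3]])[0, 1]
print(p_coref)
```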

Learning Algorithm (cont’d)
Each pair of NPs is presented to the classifier, which returns a probability or certainty value of the pair being coreferent. All pairs which receive a probability greater than a threshold value are considered coreferent. The algorithm then constructs the transitive closure of all the pairs and thus partitions the NPs into coreference classes.
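A minimal sketch of this thresholding and transitive-closure step, using union-find; the pair_probs input (a dict from NP index pairs to classifier probabilities) and the threshold value are assumptions made for the example.

```python
def link_pairs(n_nps, pair_probs, threshold=0.5):
    """Link every NP pair whose classifier probability exceeds `threshold`,
    then take the transitive closure with union-find."""
    parent = list(range(n_nps))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path compression
            x = parent[x]
        return x

    for (i, j), p in pair_probs.items():
        if p > threshold:
            parent[find(i)] = find(j)       # union: merge the two classes

    classes = {}
    for x in range(n_nps):
        classes.setdefault(find(x), []).append(x)
    return list(classes.values())

probs = {(0, 1): 0.9, (1, 2): 0.8, (3, 4): 0.2}
print(link_pairs(5, probs))   # -> [[0, 1, 2], [3], [4]]
```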

Clustering algorithms are also used in machine learning approaches. The relative importance of all the factors discussed previously is learnt from the training corpus instead of being fixed manually, which allows a larger number of factors to be considered. In principle, learning-based systems are more robust than knowledge-based systems in the face of noisy input (that is, input with exceptions to the rules). Machine learning algorithms also adapt more easily to different topics.

Conclusion
Machine learning approaches to coreference resolution have been shown to be a promising way to build robust coreference resolution systems. Despite the successful application of machine learning techniques, the problem is far from solved. Linguistics combined with machine learning techniques can prove effective in solving the coreference problem. Coreference resolution is one of the most difficult problems in language understanding. Given that NLP is often said to be "AI-complete", we might conclude that coreference resolution is among the hardest of the hard problems in artificial intelligence.

References
- Vincent Ng (2002). Machine Learning for Coreference Resolution: Recent Successes and Future Challenges.
- Cardie, Claire and Kiri Wagstaff (1999). Noun Phrase Coreference as Clustering.
- Byron, D. and J. Tetreault (1999). A Flexible Architecture for Reference Resolution. In Proc. of the 9th EACL.
- Vincent Ng (2008). Unsupervised Models for Coreference Resolution.
- Wee Meng Soon, Daniel Chung Yong Lim, and Hwee Tou Ng (2001). A Machine Learning Approach to Coreference Resolution of Noun Phrases. DSO National Laboratories.
- http://www.cs.tau.ac.il/~nachumd/NLP/2010/Anaphora.pdf
- http://www.inf.ed.ac.uk/teaching/courses/nlu/lectures/nlu_l16.pdf
- http://www.dfki.de/~loeckelt/ss2010/presentations/coreference_resolution.pdf
- Wikipedia

Questions