Using Semantic Relations to Improve Passage Retrieval for Question Answering Tom Morton.


Introduction
Paragraph retrieval for natural-language questions.
– The correctness of answers to natural-language questions can be determined automatically and accurately.
– Paragraph retrieval is a standard precursor to the TREC question answering task.
Which NLP technologies might help with this task, and are they robust enough?

NLP Technologies
Question Analysis:
– Questions tend to specify the semantic type of their answer. This component tries to identify that type.
Named-Entity Detection:
– Named-entity detection determines the semantic type of proper nouns and numeric amounts in text.

How do these technologies help?
Question Analysis:
– The predicted category is appended to the question.
Named-Entity Detection:
– The NE categories found in the text are included as new terms.
This approach still requires additional question terms to appear in the paragraph.
Example:
Question: What party is John Major in? (ORGANIZATION)
Paragraph: It probably won't be clear for some time whether the Conservative Party has chosen in John Major a truly worthy successor to Margaret Thatcher, who has been a giant on the world stage.
Terms added to the paragraph: +ORGANIZATION +PERSON
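The term expansion on this slide can be sketched roughly as follows. This is a minimal illustration, not the system's implementation: the function names (`tokenize`, `expand_question`, `expand_paragraph`), the regex tokenizer, and the convention of writing category terms with a leading "+" are assumptions made for the example, and the answer type and NE categories are passed in by hand rather than predicted.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def expand_question(question, answer_type):
    """Append the predicted answer category to the question terms."""
    return tokenize(question) + ["+" + answer_type]

def expand_paragraph(paragraph, ne_categories):
    """Append each distinct NE category found in the paragraph as a new term."""
    return tokenize(paragraph) + ["+" + c for c in sorted(set(ne_categories))]

# Inputs taken from the slide; the categories are supplied by hand here.
question_terms = expand_question("What party is John Major in?", "ORGANIZATION")
paragraph_terms = expand_paragraph(
    "It probably won't be clear for some time whether the Conservative Party "
    "has chosen in John Major a truly worthy successor to Margaret Thatcher, "
    "who has been a giant on the world stage.",
    ["ORGANIZATION", "PERSON", "PERSON"],
)

# Terms shared by the expanded question and the expanded paragraph.
overlap = set(question_terms) & set(paragraph_terms)
```

Because category terms are uppercase and prefixed, they cannot collide with ordinary lowercased words. The category term only adds evidence on top of ordinary term matches such as "john" and "major".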

NLP Technologies
Coreference Relations:
– Interpretation of a paragraph may depend on the context in which it occurs.
Syntactically-based Categorical Relation Extraction:
– Appositive and predicate nominative constructions provide descriptive terms about entities.

How do these technologies help?
Coreference:
– Use coreference relationships to introduce new terms that are referred to but not present in the paragraph's text.
Example:
Question: How long was Margaret Thatcher the prime minister? (DURATION)
Paragraph: The truth, which has been added to over each of her 11 1/2 years in power, is that they don't make many like her anymore.
Terms added to the paragraph: +MARGARET +THATCHER +PRIME +MINISTER +DURATION
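As a rough sketch of the coreference-based expansion, the function below adds the words of an entity's other mentions when they are missing from the paragraph. The function name `coreference_terms` and the idea of handing it the referent names as plain strings are assumptions for illustration. A real system would obtain those names from a coreference resolver, which is not modeled here.

```python
def coreference_terms(paragraph_tokens, referent_names):
    """Return '+TERM' entries for referent words absent from the paragraph.

    referent_names stands in for the output of a coreference resolver:
    the names of entities that mentions in the paragraph refer to.
    """
    present = {token.upper() for token in paragraph_tokens}
    added = []
    for name in referent_names:
        for word in name.upper().split():
            if word not in present:
                present.add(word)  # avoid adding the same term twice
                added.append("+" + word)
    return added

paragraph = ("The truth, which has been added to over each of her 11 1/2 years "
             "in power, is that they don't make many like her anymore.")

# Assumed resolver output: "her" refers to an entity also mentioned as
# "Margaret Thatcher" and "prime minister" elsewhere in the document.
new_terms = coreference_terms(paragraph.split(), ["Margaret Thatcher", "prime minister"])
```

The +DURATION term on the slide would come from named-entity detection on "11 1/2 years", not from this step.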

How do these technologies help?
Categorical Relation Extraction:
– Identifies the DESCRIPTION category.
– Allows descriptive terms to be used in term expansion.
Example:
Text: Famed architect Frank Lloyd Wright…
Terms added: +DESCRIPTION
Text: Buildings he designed include the Guggenheim Museum in New York and Robie House in Chicago.
Terms added: +FRANK +LLOYD +WRIGHT +FAMED +ARCHITECT
Questions:
– Who is Frank Lloyd Wright? (DESCRIPTION)
– What architect designed Robie House? (PERSON)
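To make the idea of pulling descriptive terms out of appositive and predicate nominative constructions concrete, here is a deliberately simplified sketch. It swaps in surface regular expressions for the syntactic analysis the slides describe, so it only handles very regular sentences. The patterns, the function name `describe`, and the two test sentences are invented for illustration.

```python
import re

# A "name" here is two or more capitalized words; this is a crude assumption.
NAME = r"(?P<name>[A-Z][a-z]+(?: [A-Z][a-z]+)+)"

# Appositive: "Frank Lloyd Wright, a famed architect, ..."
APPOSITIVE = re.compile(NAME + r", (?:a|an|the) (?P<desc>[a-z ]+?),")

# Predicate nominative: "Frank Lloyd Wright was an architect."
PREDICATE_NOMINATIVE = re.compile(NAME + r" (?:is|was) (?:a|an|the) (?P<desc>[a-z ]+?)[.,]")

def describe(sentence):
    """Return (entity name, descriptive terms) or None if no pattern matches."""
    for pattern in (APPOSITIVE, PREDICATE_NOMINATIVE):
        match = pattern.search(sentence)
        if match:
            return match.group("name"), match.group("desc").upper().split()
    return None

# Hypothetical sentences, not taken from the slides.
appositive_result = describe("Frank Lloyd Wright, a famed architect, designed Robie House.")
predicate_result = describe("Frank Lloyd Wright was an architect.")
```

The descriptive terms returned for an entity could then be attached to other paragraphs that mention the same entity, in the same way as the coreference terms.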

How does it work?
Coreference:
– Use the approach described in ACL (Morton 2000).
– Divide referring expressions into three classes and create a separate resolution approach for each:
    Singular third-person pronouns: statistical
    Proper nouns: rule-based
    Definite noun phrases: rule-based
– Apply the resolution approaches to the text in an interleaved fashion.
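The control flow described above (one resolver per class of referring expression, applied in text order against a shared set of entities) might look roughly like this. Everything inside the resolvers is a stand-in: the classification rules, the last-word matching, and especially the pronoun resolver, which picks the most-mentioned entity as a placeholder for the statistical model. None of it is taken from Morton (2000).

```python
from enum import Enum

class MentionType(Enum):
    PRONOUN = "pronoun"
    PROPER_NOUN = "proper noun"
    DEFINITE_NP = "definite noun phrase"

SINGULAR_THIRD_PERSON = {"he", "she", "him", "her", "his", "hers", "it", "its"}

def classify(mention):
    """Crude surface classification; a stand-in for real linguistic analysis."""
    lowered = mention.lower()
    if lowered in SINGULAR_THIRD_PERSON:
        return MentionType.PRONOUN
    if lowered.startswith("the "):
        return MentionType.DEFINITE_NP
    return MentionType.PROPER_NOUN if mention[0].isupper() else MentionType.DEFINITE_NP

def match_last_word(mention, entities):
    """Rule-based stand-in: link to the first entity sharing the mention's last word."""
    head = mention.lower().split()[-1]
    for index, entity in enumerate(entities):
        if any(m.lower().split()[-1] == head for m in entity):
            return index
    return None

def resolve_pronoun(mention, entities):
    """Placeholder for the statistical model: pick the most-mentioned entity."""
    if not entities:
        return None
    return max(range(len(entities)), key=lambda i: len(entities[i]))

RESOLVERS = {
    MentionType.PRONOUN: resolve_pronoun,
    MentionType.PROPER_NOUN: match_last_word,
    MentionType.DEFINITE_NP: match_last_word,
}

def resolve_all(mentions):
    """Process mentions in text order, interleaving the class-specific resolvers."""
    entities = []  # each entity is a list of the mentions that refer to it
    for mention in mentions:
        target = RESOLVERS[classify(mention)](mention, entities)
        if target is None:
            entities.append([mention])
        else:
            entities[target].append(mention)
    return entities

entities = resolve_all(["Margaret Thatcher", "The Conservative Party", "Thatcher", "she"])
```

Because all resolvers read and update the same entity list, a link made by one resolver (here "Thatcher" joining "Margaret Thatcher") is visible to the next one, which is the point of interleaving them.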

Coreference
Entities:
1. John Major, a truly worthy…
2. Margaret Thatcher, her, …
3. The Conservative Party
4. the undoubted exception
5. Winston Churchill
6. …
Pronoun to resolve: she ?
[Figure: scores shown on the slide: 20%, 70%, 10%, 5%, 10%]
The pronoun is resolved to an entity rather than to the most recent extent.

Paragraph Retrieval Results

Conclusion
Developed and evaluated new techniques in:
– Coreference resolution.
– Categorical relation extraction.
– Question analysis.
Integrated these techniques with existing NLP components:
– NE detection, POS tagging, sentence detection, etc.
Demonstrated that these techniques can be used to improve performance in an information retrieval task:
– Paragraph retrieval for natural-language questions.

Porting this approach to ACE
A rapidly developed IE system
– Built using the same approach
Pipelined architecture
– Easy to construct from existing components
– Easy to plug in new components
Statistical components
– Maximum entropy
– Require less hand-tuning
– Easy to improve with new training data or better machine learning algorithms

Input File → Tokenizing/Preprocessing → NE Tagging → Parsing → Nominal Tagging → Coreference → Relation Extraction → Output File
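A pipelined architecture of this kind can be sketched as a list of stage functions applied in order, which is what makes it easy to plug in or swap a component. The stage names below come from the diagram; the `make_stub` helper, the dictionary-based document, and the "trace" field are assumptions for illustration, and the stubs do no actual NLP.

```python
from typing import Any, Callable, Dict, List

Document = Dict[str, Any]
Stage = Callable[[Document], Document]

def make_stub(name: str) -> Stage:
    """Build a placeholder stage that only records that it ran."""
    def stage(doc: Document) -> Document:
        updated = dict(doc)  # copy so earlier results are not mutated
        updated["trace"] = doc.get("trace", []) + [name]
        return updated
    return stage

STAGE_NAMES = [
    "Tokenizing/Preprocessing",
    "NE Tagging",
    "Parsing",
    "Nominal Tagging",
    "Coreference",
    "Relation Extraction",
]

def run_pipeline(doc: Document, stages: List[Stage]) -> Document:
    """Feed the output of each stage into the next one."""
    for stage in stages:
        doc = stage(doc)
    return doc

stages = [make_stub(name) for name in STAGE_NAMES]
result = run_pipeline({"text": "John Major leads the Conservative Party."}, stages)
```

Replacing one component, for example the NE tagger, then amounts to replacing one entry in `stages`, as long as the new function accepts and returns the same document structure.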

Integrating CRF: Results
The CRF tagger significantly improves NE detection, giving a higher entity score.
Better NE detection allows the system to find more relations, giving a higher relation score.
[Chart panels: Entity Scores, Relation Scores. Series labels: Maxent, CRF, +BBN.]