A Novel Pattern Learning Method for Open Domain Question Answering. IJCNLP 2004. Yongping Du, Xuanjing Huang, Xin Li, Lide Wu.

Presentation transcript:

A Novel Pattern Learning Method for Open Domain Question Answering IJCNLP 2004 Yongping Du, Xuanjing Huang, Xin Li, Lide Wu

Abstract
- They develop a novel pattern-based method for implementing answer extraction in QA.
- For each type of question, the corresponding answer patterns can be learned from the Web automatically.
- Given a new question, these answer patterns can be applied to find the answer.
- They give a performance analysis of this approach using the TREC-11 question set.

Introduction
- There are three main components in a QA system.
- Many other question answering systems have used a pattern-based method, but their patterns are not learned automatically, and they can handle only one question term in the candidate answer sentence.

Introduction (Cont.)
- Each answer pattern consists of three parts: <Q_Tags> + [ConstString] + <Answer> (see the sketch after this list)
  - <Q_Tags>: the key phrases of the question
  - <Answer>: the answer; any string holding this position will be extracted as the answer
  - [ConstString]: a sequence of words
- Pipeline: question identification -> learning answer patterns -> answer extraction -> performance analysis
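Read as a data structure, the three-part pattern is easy to picture. Below is a minimal Python sketch of that representation; the class name AnswerPattern, the to_regex helper, and the use of a regex capturing group for the <Answer> slot are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of the three-part answer pattern described above.
# All names here are illustrative, not from the paper.
import re
from dataclasses import dataclass

@dataclass
class AnswerPattern:
    template: str          # e.g. "<Answer> Q_BeVerb Q_Focus in Q_LCN"
    weight: float = 0.0    # learned later, during pattern evaluation

    def to_regex(self, bindings: dict) -> str:
        """Bind each Q_Tag to the question's actual phrase and turn the
        <Answer> slot into a capturing group."""
        pattern = re.escape(self.template)
        for tag, phrase in bindings.items():
            pattern = pattern.replace(re.escape(tag), re.escape(phrase))
        return pattern.replace(re.escape("<Answer>"), r"(.+?)")

# AnswerPattern("<Answer> Q_BeVerb Q_Focus in Q_LCN").to_regex(
#     {"Q_BeVerb": "is", "Q_Focus": "the largest city", "Q_LCN": "Germany"})
# yields a regex that captures "Berlin" in "Berlin is the largest city in Germany".
```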

Question Analysis
- A set of symbols (Q_Tags) is defined to represent questions.
- Q_Focus: the head word of the noun phrase that is bound to the interrogative, in question forms such as:
  - Interrogative + be verb + NP…
  - How + Adj + be verb…

Question Analysis
- They select 182 questions from TREC; these questions contain all the Q_Tag symbols.
- Every element of a question is replaced with its corresponding Q_Tag (see the sketch after this list).
- The classification of questions is built based on the Q_Pattern and the answer type; there are six answer types.
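To make the Q_Tag substitution concrete, here is a hedged Python sketch. The tiny Q_TAG_LEXICON and the to_q_pattern helper are invented for illustration; the paper defines its own symbol set and relies on tools such as a named-entity tagger for tags like Q_LCN.

```python
# Hedged sketch of Q_Pattern generation: the focus noun phrase and any
# token with a defined symbol are replaced by Q_Tags. The lexicon below
# is a toy stand-in for the paper's symbol definitions.
Q_TAG_LEXICON = {
    "is": "Q_BeVerb", "are": "Q_BeVerb", "was": "Q_BeVerb", "were": "Q_BeVerb",
    "germany": "Q_LCN",   # stand-in for a real location (named-entity) tagger
}

def to_q_pattern(question: str, focus_phrase: str) -> str:
    """Replace the focus noun phrase and tagged tokens with Q_Tag symbols."""
    q = question.rstrip("?").replace(focus_phrase, "Q_Focus")
    tokens = [Q_TAG_LEXICON.get(t.lower(), t) for t in q.split()]
    return " ".join(tokens) + "?"

# to_q_pattern("What is the largest city in Germany?", "the largest city")
# -> "What Q_BeVerb Q_Focus in Q_LCN?", matching the example on the next slide.
```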

Example
- Question: "What is the largest city in Germany?"
- Question type: [LCN]
- Q_Pattern: What Q_BeVerb Q_Focus in Q_LCN?
- Doc 1: "…, Berlin is the largest city in Germany and is develop…"
- Answer pattern: ", <Answer> Q_BeVerb Q_Focus in Q_LCN"

Pattern Learning
1. Constructing the query: "Q_Tag + Answer", for example, "the largest city" + "Germany" + "Berlin".
2. Searching: submit the query to Google and download the top 100 documents.
3. Snippet selection: extract the snippets containing 10 words around the answer.
4. Answer pattern extraction: replace the question terms in each snippet with the corresponding Q_Tags, and the answer term with the <Answer> tag. The shortest string containing the Q_Tags and the <Answer> tag is extracted as the answer pattern.

Pattern Learning (Cont.)
5. Computing the weight of each answer pattern.
They learn answer patterns for each question type in this way; a sketch of the extraction step follows.
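Below is a simplified sketch of steps 3-4 (snippet tagging and shortest-string extraction). It assumes snippets arrive as plain strings and question terms as a phrase-to-Q_Tag mapping; naive str.replace stands in for real tokenization, so in general it can over-replace substrings inside words.

```python
# Simplified pattern extraction: tag the snippet, then keep the shortest
# token span covering every Q_Tag and the <Answer> slot.
def extract_pattern(snippet: str, q_terms: dict, answer: str) -> str:
    s = snippet
    for phrase, tag in q_terms.items():   # e.g. {"the largest city": "Q_Focus", ...}
        s = s.replace(phrase, tag)
    s = s.replace(answer, "<Answer>")
    tokens = s.split()
    hits = [i for i, t in enumerate(tokens)
            if t.startswith("Q_") or "<Answer>" in t]
    return " ".join(tokens[min(hits): max(hits) + 1])

# extract_pattern("Berlin is the largest city in Germany and more",
#                 {"the largest city": "Q_Focus", "Germany": "Q_LCN", "is": "Q_BeVerb"},
#                 "Berlin")
# -> "<Answer> Q_BeVerb Q_Focus in Q_LCN"
```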

Pattern Evaluation
- The answer pattern "<Answer> Q_BeVerb Q_Focus" may extract the wrong candidate answer "Portland" from the snippet "Portland is the largest city in Oregon…", so they provide an approach to evaluate the answer patterns.
1. A query is formed for each answer pattern of the question and submitted to Google, and the top 100 snippets are downloaded for evaluation. The query consists of three parts: [Head] + [Tail] + [Q_Focus + Q_NameEntity]. For the answer pattern "<Answer> Q_BeVerb Q_Focus", the query is "is the largest city" + "Germany".

Pattern Evaluation (Cont.)
2. The confidence of each answer pattern is calculated by a formula.
3. Finally, the score of each answer pattern is computed by a second formula. (Neither formula survives in the transcript; a hedged stand-in is sketched below.)
- The major advantage over other pattern-based QA systems is that more than one question term can be included in the answer pattern.
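Since neither formula is reproduced in the transcript, the sketch below is only a plausible stand-in, not the paper's definition: a precision-style confidence (correct extractions over all extractions) of the kind common in pattern-based QA, computed over the downloaded evaluation snippets and assuming a regex from the earlier sketch.

```python
# Plausible stand-in for pattern confidence; NOT the paper's formula.
import re

def pattern_confidence(pattern_regex: str, snippets: list, gold_answer: str) -> float:
    """Fraction of pattern matches whose captured answer is correct."""
    correct = total = 0
    for snip in snippets:
        m = re.search(pattern_regex, snip)
        if m:
            total += 1
            if m.group(1).strip() == gold_answer:
                correct += 1
    return correct / total if total else 0.0
```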

Sample Answer patterns

Answer Extraction
They use Google as the search engine and get the top 100 snippets for answer extraction.
1. Identify the Q_Tags of the new question and generate its Q_Pattern: "What is the most populous city in the United States?" -> What Q_BeVerb Q_Focus in Q_LCN?
2. Determine the question type of the question; the corresponding answer patterns of this question type are also selected.
3. Replace the Q_Tag symbols of each answer pattern with the corresponding question terms: ", <Answer> Q_BeVerb Q_Focus in Q_LCN" -> ", <Answer> is the most populous city in the United States".

Answer Extraction (Cont.)
4. Select the words matching the <Answer> tag as candidate answers.
5. Discard the candidate answers that do not satisfy the answer type of the question, using a named entity tagger.
6. Sort the remaining candidate answers by their answer patterns' scores and their frequencies; the highest-scoring candidate is selected as the final answer. (These steps are sketched below.)
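Putting steps 3-6 together, here is a minimal end-to-end sketch. The function extract_answer, the (template, weight) pattern tuples, and the entity_type callback standing in for the named entity tagger are all assumptions made for illustration.

```python
# End-to-end sketch of steps 3-6: instantiate each answer pattern with
# the question's terms, collect typed candidates, and rank by summed
# pattern weight (which also rewards frequent candidates).
import re
from collections import defaultdict

def extract_answer(patterns, bindings, snippets, expected_type, entity_type):
    scores = defaultdict(float)
    for template, weight in patterns:   # e.g. ("<Answer> Q_BeVerb Q_Focus in Q_LCN", 0.9)
        regex = re.escape(template)
        for tag, phrase in bindings.items():        # step 3: bind Q_Tags
            regex = regex.replace(re.escape(tag), re.escape(phrase))
        regex = regex.replace(re.escape("<Answer>"), r"(.+?)")
        for snip in snippets:
            for m in re.finditer(regex, snip):      # step 4: match candidates
                cand = m.group(1).strip()
                if entity_type(cand) != expected_type:  # step 5: type filter
                    continue
                scores[cand] += weight              # step 6: weight x frequency
    return max(scores, key=scores.get) if scores else None
```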

Performance Analysis
- The TREC-9 and TREC-10 questions are used as training examples to learn answer patterns.
- The 500 questions of TREC-11 are used in the experiments.

Conclusion
- They took part in TREC-12 that year, and their result is above the median score of all runs submitted.
- Some answer patterns they learned are too specific and are useless for answering new questions: for "What is a shaman?" the learned pattern "Q_Focus was the priest, the <Answer> and" is too specific to generalize.