Automatic Question Answering Beyond the Factoid
Radu Soricut, Information Sciences Institute, University of Southern California
Eric Brill, Microsoft Research
NAACL 2004

Abstract
A QA system that goes beyond answering factoid questions, focusing on FAQ-like questions and answers
The system is built around a noisy-channel architecture that exploits both
–a language model for answers, and
–a transformation model for answer/question terms,
trained on a corpus of 1 million question/answer pairs collected from the Web

Beyond Factoid QA
A question-answer pair training corpus
–built by mining FAQ pages from the Web
A statistical chunker (instead of sentence parsing)
–to transform a question into a phrase-based query
A search engine
–returns the N most relevant documents from the Web
An answer is found by computing
–an answer language model probability (indicating how similar the proposed answer is to answers seen in the training corpus), and
–an answer/question translation model probability (indicating how similar the proposed answer/question pair is to pairs seen in the training corpus)
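A compact, runnable toy of this pipeline may help make the module boundaries concrete. Everything below (the tiny in-memory document collection, the word-overlap scoring, the function names) is an illustrative stand-in, not the authors' implementation; the real modules are described on the following slides.

```python
import re

# Toy in-memory "web" standing in for the search engine's document collection.
DOCUMENTS = [
    "RSI stands for repetitive strain injury. It is caused by repeated motion. Rest helps recovery.",
    "A modem connects your computer to the phone line. It modulates and demodulates signals.",
]

def question2query(question):
    # Real module: statistical chunker producing a phrase-based query.
    # Toy version: lowercase word tokens.
    return re.findall(r"\w+", question.lower())

def search(query, docs):
    # Real module: a web search engine returning the N most relevant documents.
    # Toy version: rank documents by query-term overlap.
    return sorted(docs, key=lambda d: -sum(w in d.lower() for w in query))

def filter_candidates(docs, window=3):
    # Real module: keep the first N hits, tokenize, and segment into sentences.
    # Toy version: every 3-sentence window is a potential answer.
    candidates = []
    for d in docs:
        sents = re.split(r"(?<=[.?!])\s+", d)
        candidates += [" ".join(sents[i:i + window]) for i in range(len(sents))]
    return candidates

def extract_answer(question, candidates):
    # Real module: answer LM probability x answer/question translation probability.
    # Toy version: plain word overlap with the question.
    q = set(question2query(question))
    return max(candidates, key=lambda c: len(q & set(re.findall(r"\w+", c.lower()))))

question = "What causes RSI?"
print(extract_answer(question, filter_candidates(search(question2query(question), DOCUMENTS))))
```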

A QA Corpus for FAQs
Query “FAQ” against an existing search engine
Roughly 2.3 million FAQ URLs used for collecting question/answer pairs
Two-step approach:
–a first, recall-oriented pass based on universal indicators such as punctuation and lexical cues retrieved most of the question/answer pairs, along with other noisy data;
–a second, precision-oriented pass used several filters, such as language identification, length constraints, and lexical cues, to reduce the level of noise in the question/answer pair corpus
Roughly 1 million question/answer pairs collected
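A minimal sketch of the two-pass idea, assuming hand-made cue patterns and length thresholds; the actual indicators, filters, and language identification used by the authors are not reproduced here.

```python
import re

# Hypothetical question cue: an explicit "Q:" / numbering prefix or a wh-/aux- word.
QUESTION_CUE = re.compile(r"^(Q[:.)]|\d+[.)]\s|What|How|Why|Where|When|Who|Can|Is|Are|Do|Does)", re.I)

def recall_pass(lines):
    """First, recall-oriented pass: split an FAQ page into candidate
    (question, answer) pairs using punctuation and lexical cues."""
    pairs, question, answer = [], None, []
    for line in lines:
        line = line.strip()
        if QUESTION_CUE.match(line) and line.endswith("?"):
            if question and answer:
                pairs.append((question, " ".join(answer)))
            question, answer = line, []
        elif question and line:
            answer.append(line)
    if question and answer:
        pairs.append((question, " ".join(answer)))
    return pairs

def precision_pass(pairs, min_len=5, max_len=400):
    """Second, precision-oriented pass: drop noisy pairs using simple length
    constraints (language identification and further lexical cues omitted)."""
    return [(q, a) for q, a in pairs
            if min_len <= len(a.split()) <= max_len and len(q.split()) >= 3]

page = ["Q: What is a FAQ?", "A FAQ is a list of frequently asked questions and their answers.",
        "Q: Why keep one?", "It saves everyone from answering the same question repeatedly."]
print(precision_pass(recall_pass(page)))
```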

A QA System Architecture

The Question2Query Module
A statistical chunker
–uses a dynamic programming algorithm to chunk the question into phrases
–is trained on the answer side of the training corpus to learn 2- and 3-word collocations, defined using the likelihood ratio of Dunning (1993)
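A simplified sketch of such a chunker: the dynamic program below maximizes the summed collocation score of the chosen phrases. The collocation scores are invented stand-ins; the paper derives them from Dunning's likelihood ratio over the answer side of the corpus.

```python
def chunk_question(tokens, colloc_score):
    """Dynamic-programming chunking: split tokens into phrases of length 1-3,
    maximizing the total collocation score of the chosen phrases."""
    n = len(tokens)
    best = [0.0] * (n + 1)          # best[i] = best score for tokens[:i]
    back = [0] * (n + 1)            # back[i] = start index of the last phrase
    for i in range(1, n + 1):
        best[i], back[i] = float("-inf"), i - 1
        for k in (1, 2, 3):          # phrase lengths considered
            if i - k < 0:
                continue
            phrase = " ".join(tokens[i - k:i])
            score = best[i - k] + colloc_score.get(phrase, 0.0)
            if score > best[i]:
                best[i], back[i] = score, i - k
    chunks, i = [], n
    while i > 0:
        chunks.append(" ".join(tokens[back[i]:i]))
        i = back[i]
    return list(reversed(chunks))

# Example (scores are invented): high-scoring collocations stay together as phrases.
scores = {"operating system": 9.3, "hard disk": 7.1}
print(chunk_question("how do I install an operating system".split(), scores))
# -> ['how', 'do', 'I', 'install', 'an', 'operating system']
```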

The SearchEngine Module & Filter Module Search Engine: MSN & Google Filtering Steps: –First N hits –Tokenization and segmentation –access to the reference answers for the test questions, and ensured that, if the reference answer matched a string in some retrieved page, that page was discarded (only for evaluation purpose)

The AnswerExtraction Module
The need to “bridge the lexical chasm” between the question terms and the answer terms
Two different algorithms:
–one that does NOT bridge the lexical chasm, based on N-gram co-occurrences between the question terms and the answer terms
–one that attempts to bridge the lexical chasm using techniques inspired by Statistical Machine Translation (Brown et al., 1993)

N-gram Co-Occurrence Statistics for Answer Extraction
Uses the BLEU score of Papineni et al. (2002) as a means of assessing the overlap between the question and the proposed answers
The best-scoring potential answer is presented as the system output
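A bare-bones sketch of this idea: score each candidate answer by its n-gram overlap with the question, in the spirit of BLEU (clipped counts, geometric mean of uni- to 4-gram precisions, brevity penalty). This is a simplified stand-in, not the exact scorer used in the paper.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu_like(candidate, reference, max_n=4):
    cand, ref = candidate.lower().split(), reference.lower().split()
    precisions = []
    for n in range(1, max_n + 1):
        c, r = ngrams(cand, n), ngrams(ref, n)
        overlap = sum(min(c[g], r[g]) for g in c)     # clipped n-gram matches
        total = max(sum(c.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)  # smooth zero precisions
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

# Pick the best-scoring potential answer against the question (as in NG-AE);
# the oracle variant ONG-AE instead compares against a held-out reference answer.
def best_answer(question, candidates):
    return max(candidates, key=lambda a: bleu_like(a, question))
```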

Statistical Translation for Answer Extraction
Berger et al. (2000): the lexical gap between questions and answers can be bridged by a statistical translation model between answer terms and question terms

An answer generation model proposes an answer A according to an answer generation probability distribution
Answer A is further transformed into question Q by an answer/question translation model, according to a question-given-answer conditional probability distribution
Let the task T be defined as “find a 3-sentence answer for a given question”. The algorithm then finds the a-posteriori most likely answer given the question and the task, i.e. it maximizes p(a|q,T)
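Spelled out, the noisy-channel decomposition this slide refers to looks as follows; the denominator is dropped because it does not depend on the answer, and the translation step is assumed not to depend on the task T (consistent with the slides):

```latex
\hat{a} \;=\; \arg\max_{a} \, p(a \mid q, T)
        \;=\; \arg\max_{a} \, \frac{p(q \mid a, T)\, p(a \mid T)}{p(q \mid T)}
        \;\approx\; \arg\max_{a} \, \underbrace{p(q \mid a)}_{\text{translation model}}\;
                     \underbrace{p(a \mid T)}_{\text{answer language model}}
```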

Because task T fits the characteristics of the question-answer pair corpus described in Section 3, we can use the answer side of this corpus to compute the prior probability p(a|T). The role of the prior is to help downgrade answers that are too long or too short, or are otherwise not well-formed. A standard trigram language model is used to compute the probability distribution p(·|T)
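A minimal sketch of such a trigram prior, trained on the answer side of the corpus; the add-one smoothing below is an illustrative choice, not necessarily what the authors used.

```python
import math
from collections import Counter

class TrigramLM:
    """Toy trigram language model for the answer prior p(a|T)."""

    def __init__(self, answers):
        self.tri, self.bi, self.vocab = Counter(), Counter(), set()
        for a in answers:
            toks = ["<s>", "<s>"] + a.lower().split() + ["</s>"]
            self.vocab.update(toks)
            for i in range(2, len(toks)):
                self.tri[tuple(toks[i - 2:i + 1])] += 1
                self.bi[tuple(toks[i - 2:i])] += 1

    def logprob(self, answer):
        toks = ["<s>", "<s>"] + answer.lower().split() + ["</s>"]
        v = len(self.vocab)
        return sum(math.log((self.tri[tuple(toks[i - 2:i + 1])] + 1) /
                            (self.bi[tuple(toks[i - 2:i])] + v))
                   for i in range(2, len(toks)))

lm = TrigramLM(["rest and stretching help prevent injury", "a modem connects to the phone line"])
print(lm.logprob("rest and stretching help"))   # corpus-like, well-formed answers score higher
```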

The mapping of answer terms to question terms is modeled using Brown et al.’s (1993) simplest model, IBM Model 1
A question q is generated from an answer a of length n according to the following steps:
–first, a length m is chosen for the question, according to the distribution ψ(m|n) (assumed to be uniform);
–then, for each position j in q, a position i in a is chosen, from which q_j is generated according to the distribution t(·|a_i)
The answer is assumed to include a NULL word, whose purpose is to generate the content-free words in the question (such as in “Can you please tell me…?”)

p(q|a) is computed as the sum over all possible alignments; under Model 1 this reduces (up to the length term) to
p(q|a) ∝ ∏_{j=1..m} Σ_{a_i ∈ a} t(q_j|a_i) · c(a_i|a)
where t(q_j|a_i) are the probabilities of “translating” answer terms into question terms and c(a_i|a) are the relative counts of the answer terms in a
The parallel corpus of questions and answers can be used to estimate the translation table t(q_j|a_i) with the EM algorithm, as described by Brown et al. (1993)
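Given an already-estimated translation table t(question term | answer term), the Model 1 score can be computed as below; training the table with EM on the question/answer corpus is not shown, and the example table entries are invented.

```python
import math
from collections import Counter

NULL = "<NULL>"

def model1_logprob(question, answer, t, floor=1e-9):
    """log p(q|a) under IBM Model 1 (uniform alignments): for each question term,
    sum the translation probability over all answer terms, weighted by their
    relative counts c(a_i|a); the NULL word absorbs content-free question words."""
    a_toks = answer.lower().split() + [NULL]
    counts, n = Counter(a_toks), len(a_toks)
    lp = 0.0
    for qj in question.lower().split():
        lp += math.log(sum((counts[ai] / n) * t.get((qj, ai), floor) for ai in counts))
    return lp

# Invented translation probabilities t(question term | answer term).
t = {("herbs", "basil"): 0.3, ("grow", "plant"): 0.2,
     ("how", NULL): 0.4, ("do", NULL): 0.3, ("i", NULL): 0.3}
print(model1_logprob("how do i grow herbs", "plant basil in moist soil", t))
```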

Following Berger and Lafferty (2000), an even simpler model than Model 1 can be devised by skewing the translation distribution t(·|a_i) such that all the probability mass goes to the term a_i. This simpler model is called Model 0
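With the scoring sketch above, Model 0 amounts to an identity translation table; the helper below is a hypothetical illustration of that skewing.

```python
# Model 0: all translation mass on the identity mapping, so a question term can
# only be "translated" from an identical answer term (plus the NULL word).
def model0_table(vocab):
    return {(w, w): 1.0 for w in vocab}
```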

Evaluations and Discussions
The evaluation was done by a human judge on a set of 115 test questions, which contained a large variety of non-factoid questions
Each answer was rated as either correct (C), somehow related (S), wrong (W), or cannot tell (N)
The performance of the system was estimated using the formula (|C| + 0.5|S|) / (|C| + |S| + |W|)
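In code, with invented judgment counts:

```python
def system_score(correct, somehow_related, wrong):
    # "Cannot tell" (N) answers are excluded from both numerator and denominator.
    return (correct + 0.5 * somehow_related) / (correct + somehow_related + wrong)

print(system_score(30, 20, 50))  # invented counts -> 0.4
```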

Question2Query Module Evaluation
Kept fixed: MSNSearch & top 10 hits
AnswerExtraction module:
–N-gram co-occurrence based algorithm (NG-AE)
–Model 1 based algorithm (M1e-AE)

SearchEngine Module Evaluation
Kept fixed: segmented question & top 10 hits
AnswerExtraction modules:
–NG-AE, M1e-AE, and
–ONG-AE: exactly like NG-AE, but with the potential answers compared against a reference answer available to an Oracle, rather than against the question
The performance obtained using this oracle algorithm can be thought of as indicative of the performance ceiling

Filter Module Evaluation
Assessed the trade-off between computation time and accuracy of the overall system:
–the size of the set of potential answers directly influences the accuracy of the system, while also increasing the computation time of the AnswerExtraction module

AnswerExtraction Module Evaluation
Kept fixed: segmented question, MSNSearch, and top 10 hits
Based on the BLEU score:
–NG-AE and its Oracle-informed variant ONG-AE (with scores of 0.23 and 0.46, respectively) do not depend on the amount of training data
Based on the noisy-channel architecture:
–performance increases with the amount of available training data
–reaching as high as 0.38

Why did Model 1 (M1-AE) perform worse than Model 0 (M0-AE)?
–the probability distribution of question terms given answer terms learnt by Model 1 is well informed (many mappings are allowed) but badly distributed
–steep learning curve of Model 1: its performance gets increasingly better as the translation probabilities of the various answer terms become more informed (more mappings are learnt)
–gentle learning curve of Model 0: its performance increases only slightly as more words become known to the system as self-translations

M1e-AE
–obtained when Model 1 was trained on both the question/answer parallel corpus and an artificially created parallel corpus in which each question had itself as its “translation”
–this allows the model to assign high probabilities to identity mappings (better distributed), while also distributing some probability mass to other question/answer term pairs (and therefore remain well informed)
–top score of 0.38
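The training-data trick behind M1e-AE can be sketched as a one-liner; the simple 1:1 mix of real and synthetic pairs shown here is an illustrative assumption.

```python
def build_m1e_training_corpus(qa_pairs):
    """Augment the question/answer parallel corpus with a synthetic corpus in
    which each question is paired with itself, so that EM learns strong identity
    mappings while still spreading some mass to real question/answer term pairs."""
    return list(qa_pairs) + [(q, q) for q, _ in qa_pairs]
```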

Performance Issues
We demonstrated that a statistical model can capitalize on large amounts of readily available training data to achieve reasonable performance on answering non-factoid questions

Reasons why some questions were not answered correctly:
–the answer was not in the retrieved pages (see the 46% performance ceiling given by the Oracle)
–the answer was of the wrong “type” (e.g., an answer for “how-to” instead of “what-is”)
–the answer pointed to where an answer might be found instead of answering the question

Reasons why some questions were not answered correctly (continued):
–the translation model outweighed the answer language model (too good a “translation”, too bad an answer)
–the system did not pick up the key content word (in the example below, eggs)