Question Answering: From Zero to Hero
Elena Eneva
11 Oct 2001, Advanced IR Seminar

Sources
[V] E. Voorhees. "Overview of the TREC-9 Question Answering Track." In Proceedings of TREC-9.
[P] J. Prager, E. Brown, A. Coden and D. Radev. "Question Answering by Predictive Annotation." SIGIR '00.
[C] C.L.A. Clarke, G.V. Cormack and T.R. Lynam. "Exploiting Redundancy in Question Answering." In Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '01).
(Cited below as [V], [P] and [C].)

Question Answering
- IR: successful on large-scale text search problems, but retrieves full documents.
- IE: successful at extracting very precise answers from text, but works only on pre-specified domains.
- Question answering combines the strengths of both.

QA Track in TREC
- Collection of unstructured documents (Table 1 in [V]).
- Short factual questions in English, e.g. "Why can't ostriches fly?", "Where did Bill Gates go to college?" (see also Figure 1 in [V]).
- Systems return the answer as a ranked list of 5 document fragments, in 2 length categories: 50 and 250 bytes.

Evaluation
- Judged by human assessors.
- Per-question score: the reciprocal rank of the first correct answer in the top 5, or 0 if none is correct; systems are also compared by the percentage of questions for which an answer was found.
- Strict and lenient scores (depending on whether the answer must be supported by the document it was drawn from).
- Short (50-byte) and long (250-byte) versions.
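
To make the mean reciprocal rank (MRR) metric concrete, here is a minimal Python sketch (not from the talk); the function names and the toy judgments are illustrative assumptions.

def reciprocal_rank(ranked_answers, is_correct, max_rank=5):
    # 1/rank of the first correct answer within the top 5, else 0
    for rank, answer in enumerate(ranked_answers[:max_rank], start=1):
        if is_correct(answer):
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(questions):
    # questions: list of (ranked_answers, is_correct) pairs, one per question
    return sum(reciprocal_rank(a, j) for a, j in questions) / len(questions)

# Toy run: correct at rank 1, correct at rank 3, not found
runs = [(["Harvard", "Yale"], lambda a: a == "Harvard"),
        (["Yale", "MIT", "Harvard"], lambda a: a == "Harvard"),
        (["Yale", "MIT"], lambda a: a == "Harvard")]
print(mean_reciprocal_rank(runs))  # (1 + 1/3 + 0) / 3 ≈ 0.44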

Two QA Systems from TREC
- Question Answering by Predictive Annotation: Prager, Brown, Coden (IBM) and Radev (U. of Michigan) [P].
- Exploiting Redundancy in Question Answering: Clarke, Cormack, Lynam (U. of Waterloo) [C].
- Rankings: Table 2 in [V].

Exploiting Redundancy in Question Answering
- The question is turned into (1) a query for submission to a passage retrieval component and (2) a set of selection rules (the answer category) that guide the extraction of answers from the passages.
- Retrieve a list of k passages.
- Identify possible answers in those passages.
- Rank the possible answers.
- Overall pipeline (Figure 1 in [C]): question analysis → IR → IE.
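
As a rough illustration of that three-stage pipeline, here is a deliberately naive, self-contained Python sketch; the stopword list, the overlap-based retrieval, and the capitalized-token extraction rule are toy assumptions, not the system described in [C].

import re

STOPWORDS = {"the", "a", "an", "did", "do", "to", "of", "in", "go",
             "where", "who", "when", "what", "why", "how"}

def analyze_question(question):
    # Question analysis: bag-of-words query plus a crude answer category
    words = re.findall(r"[a-z]+", question.lower())
    category = words[0] if words and words[0] in {"who", "where", "when"} else "any"
    return {w for w in words if w not in STOPWORDS}, category

def retrieve_passages(query, corpus, k):
    # IR stage: rank passages by query-term overlap (stand-in for real retrieval)
    return sorted(corpus, key=lambda p: -len(query & set(p.lower().split())))[:k]

def extract_candidates(passages, category):
    # IE stage: one toy selection rule -- capitalized tokens as candidates
    return [tok for p in passages for tok in re.findall(r"[A-Z][a-z]+", p)]

def answer_question(question, corpus, k=3, n=5):
    query, category = analyze_question(question)
    passages = retrieve_passages(query, corpus, k)
    return extract_candidates(passages, category)[:n]

corpus = ["Bill Gates attended Harvard before founding Microsoft.",
          "Ostriches cannot fly because their wings are too small."]
print(answer_question("Where did Bill Gates go to college?", corpus))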

Three features with the greatest contribution:
- Flexibility of the parser.
- The passage retrieval technique (high-quality passages).
- Redundancy in the answer selection component: evidence from multiple passages contributes to identifying the most likely answer.

Passage Retrieval Technique
- Each document D is an ordered sequence of terms: D = d1 d2 d3 … dm.
- A query Q = {q1, q2, q3, …} is generated from the question.
- An extent (u, v) is a cover for a term set T ⊆ Q if it contains every term of T and no shorter extent inside it does (minimality).
- Compute a score for each extent (u, v) that is a cover for some T ⊆ Q; higher scores go to passages whose probability of occurring by chance is lower (short extents containing many rare terms).
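
Below is a minimal Python sketch of cover finding and scoring. Two simplifications to flag: it finds covers only for the full query term set rather than for every subset T ⊆ Q, and the scoring (rare terms raise the score, longer extents lower it) is an illustrative reading of "less probable extents score higher", not the exact formula from [C]; the corpus statistics are invented.

import math

def minimal_covers(doc, terms):
    # Extents (u, v) that contain every term in `terms` and no shorter
    # such extent inside them ("covers" over the full term set)
    last, cands = {}, []
    for v, tok in enumerate(doc):
        if tok in terms:
            last[tok] = v
            if len(last) == len(terms):
                cands.append((min(last.values()), v))
    # drop any candidate that strictly contains another candidate
    return [c for c in cands
            if not any(o != c and c[0] <= o[0] and o[1] <= c[1] for o in cands)]

def cover_score(cover, terms, freq, N):
    # Heuristic: sum of log(N / f_t) over query terms, penalized by extent length
    u, v = cover
    return sum(math.log(N / freq[t]) for t in terms) - len(terms) * math.log(v - u + 1)

doc = "the college years of bill gates gates went to harvard college".split()
Q = {"gates", "harvard", "college"}
freq = {"gates": 5000, "harvard": 2000, "college": 30000}  # invented corpus counts
for c in minimal_covers(doc, Q):
    print(c, round(cover_score(c, Q, freq, N=10**9), 2))  # shorter cover scores higher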

Redundancy
- Each candidate term t is assigned a weight that takes into account the number of distinct passages in which the term appears, as well as the relative frequency of the term in the database: w_t = c_t · log(N / f_t), where c_t is the number of distinct retrieved passages in which t appears, f_t is the frequency of t in the database, and N is the database size.
- A candidate answer is scored by summing the weights of all its terms.
- Select the top candidate, reduce the weights of its terms to 0, and repeat until 5 answers have been selected (Figure 2 in [C]).
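
A direct Python sketch of that selection loop follows; the candidate strings, passage set, and frequency table are invented, and whitespace tokenization is a simplification.

import math

def select_answers(candidates, passages, corpus_freq, N, n=5):
    # w_t = c_t * log(N / f_t): c_t = number of distinct retrieved passages
    # containing t, f_t = frequency of t in a database of N terms
    terms = {t for cand in candidates for t in cand.split()}
    weights = {}
    for t in terms:
        c_t = sum(1 for p in passages if t in p.split())
        weights[t] = c_t * math.log(N / corpus_freq.get(t, 1))
    chosen, pool = [], list(candidates)
    while pool and len(chosen) < n:
        # score an answer by summing its term weights, take the best...
        best = max(pool, key=lambda cand: sum(weights[t] for t in cand.split()))
        chosen.append(best)
        pool.remove(best)
        for t in best.split():
            weights[t] = 0.0  # ...then zero its terms so the next pick needs new evidence
    return chosen

passages = ["bill gates attended harvard", "gates studied at harvard",
            "microsoft founder bill gates"]
candidates = ["harvard", "microsoft", "bill gates"]
freq = {"harvard": 2000, "microsoft": 5000, "bill": 9000, "gates": 7000}
print(select_answers(candidates, passages, freq, N=10**9))
# ['bill gates', 'harvard', 'microsoft']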

Exploiting Redundancy: Experiments
- "Who" questions over a 100 GB corpus.
- Performance measured as a function of retrieval depth K and width W (Figure 2 in [C]).

"Who Wants to Be a Millionaire?"
- A real-life example: about 70% of the questions answered correctly overall (Figure 5 in [C]).

Question Answering by Predictive Annotation
- The IBM system; relies on shallow NLP.
- System structure: Figure 1 in [P].
- Key idea: annotation at indexing time. Text spans are labeled with answer-type classes during indexing, so the index can be searched by expected answer type as well as by keywords.
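
To illustrate the idea, here is a toy Python sketch. Assumptions to flag: the tiny gazetteer, token names like PLACE$, and the wh-word mapping are simplified stand-ins for the larger set of annotation classes and the pattern-based shallow NLP described in [P].

GAZETTEER = {"harvard": "PLACE$", "seattle": "PLACE$",
             "gates": "PERSON$", "1975": "YEAR$"}  # toy lookup table

def annotate(text):
    # Indexing-time pass: insert an answer-type token after each recognized
    # span, so answer types become searchable like ordinary index terms
    tokens = []
    for word in text.lower().split():
        tokens.append(word)
        if word in GAZETTEER:
            tokens.append(GAZETTEER[word])
    return tokens

def question_to_query(question):
    # Question analysis: replace the wh-word with the expected answer-type token
    wh = {"where": "PLACE$", "who": "PERSON$", "when": "YEAR$"}
    words = question.lower().rstrip("?").split()
    return [wh.get(w, w) for w in words if w not in {"did", "do", "to", "go"}]

print(annotate("Bill Gates went to Harvard"))
# ['bill', 'gates', 'PERSON$', 'went', 'to', 'harvard', 'PLACE$']
print(question_to_query("Where did Bill Gates go to college?"))
# ['PLACE$', 'bill', 'gates', 'college']

Retrieval then favors passages where a PLACE$ annotation occurs near the question's keywords.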