
1 A Technical Seminar on Question Answering SHRI RAMDEOBABA COLLEGE OF ENGINEERING & MANAGEMENT Presented By: Rohini Kamdi Guided By: Dr. A.J.Agrawal

2 Contents:
Introduction
Why Question Answering?
The Architecture of a Generic QA System
Issues with Traditional QA Systems
The Web Solution: AskMSR
Current Research Work
Conclusion

3 Introduction
Question answering (QA), in information retrieval, is the task of automatically answering a question posed in natural language (NL), using either a pre-structured database or a collection of natural-language documents.
Goal: to retrieve answers to questions rather than full documents or best-matching passages.
QA = Information Retrieval + Information Extraction, used to find short answers to fact-based questions.

4 Why Question Answering?
Google: query-driven search; the answers to a query are documents.
Question Answering: answer-driven search; the answers to a query are phrases.

5 The Architecture of a Generic QA System
question -> Question Processing -> query -> Document Retrieval -> Passage Retrieval -> Answer Extraction -> answers

6 Question Processing
Captures the semantics of the question. Tasks:
Determine the question type
Determine the answer type
Extract keywords from the question and formulate a query
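A minimal sketch of the answer-type step described above, assuming a hand-written table of question-word patterns; the patterns and type names here are illustrative, not taken from any particular system:

```python
import re

# Hypothetical mapping from question words to expected answer types.
# A real question-processing module uses far richer rules than this.
ANSWER_TYPES = [
    (r"^who\b", "PERSON"),
    (r"^where\b", "LOCATION"),
    (r"^when\b", "DATE"),
    (r"^how (much|many)\b", "QUANTITY"),
]

def classify(question):
    """Return the expected answer type for a natural-language question."""
    q = question.lower().strip()
    for pattern, answer_type in ANSWER_TYPES:
        if re.search(pattern, q):
            return answer_type
    return "UNKNOWN"
```

The answer type determined here drives the data-type filters used later in answer extraction.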

7 Question Types
Class 1. A: single datum or list of items. C: who, when, where, how (old, much, large)
Class 2. A: multi-sentence. C: extract from multiple sentences
Class 3. A: across several texts. C: comparative/contrastive
Class 4. A: an analysis of retrieved information. C: synthesized coherently from several retrieved fragments
Class 5. A: result of reasoning. C: word/domain knowledge and common-sense reasoning

8 Types of QA
Closed-domain QA systems are built for a very specific domain and exploit expert knowledge of it.
+ Very high accuracy
- Require extensive language processing and are limited to one domain
Open-domain QA systems can answer questions over any document collection.
+ Can potentially answer any question
- Very low accuracy

9 Keyword Selection
A list of keywords extracted from the question helps find relevant texts.
Some systems expand the keywords with lexical/semantic alternations for better matching:
inventor -> invent
have been sold -> sell
dog -> animal
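The basic keyword-selection step can be sketched as a stop-word filter; the stop-word list below is illustrative only:

```python
import string

# Illustrative stop-word list; real systems use larger lists plus
# part-of-speech information to pick content words.
STOP_WORDS = {"who", "what", "when", "where", "how", "is", "are",
              "the", "a", "an", "of", "in", "did", "do", "was"}

def keywords(question):
    """Strip punctuation and stop words, keeping content words as the query."""
    tokens = question.lower().translate(
        str.maketrans("", "", string.punctuation)).split()
    return [t for t in tokens if t not in STOP_WORDS]
```

The resulting keyword list is what gets expanded with alternations and passed to retrieval.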

10 Passage Retrieval
Extracts passages that contain all selected keywords.
Passage quality is assessed iteratively:
In the first iteration, use the first 6 keyword-selection heuristics.
If the number of passages < a threshold, the query is too strict: drop a keyword.
If the number of passages > a threshold, the query is too relaxed: add a keyword.
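The drop/add-keyword loop above can be sketched as follows; `search` stands for a hypothetical function returning the passages that contain all the given keywords, and the thresholds and iteration cap are placeholders:

```python
def retrieve_passages(search, ranked_keywords, lo=10, hi=500, start=6,
                      max_iters=10):
    """Iteratively tighten or relax the keyword query until the number
    of retrieved passages falls between the two thresholds."""
    n = min(start, len(ranked_keywords))   # begin with the first `start` keywords
    passages = search(ranked_keywords[:n])
    for _ in range(max_iters):
        if len(passages) < lo and n > 1:
            n -= 1                         # query too strict: drop a keyword
        elif len(passages) > hi and n < len(ranked_keywords):
            n += 1                         # query too relaxed: add a keyword
        else:
            break
        passages = search(ranked_keywords[:n])
    return passages
```

Keywords are assumed to be ranked by importance, so dropping from the end removes the least important one first.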

11 Answer Extraction
Pattern matching between the question and the representation of candidate answer-bearing texts
A set of candidate answers is produced
Candidates are ranked according to likelihood of correctness

12 Example of Answer Processing
QA System | Output
AnswerBus | Sentences
AskJeeves (ask.com) | Documents/direct answers
IONAUT | Passages
LCC | Sentences
Mulder | Extracted answers
QuASM | Document blocks
START | Mixture
Webclopedia | Sentences

13 Issues with Traditional QA Systems
Retrieval is performed against a small set of documents.
Extensive use of linguistic resources: POS tagging, named-entity tagging, WordNet, etc.
It is difficult to recognize answers that do not match the question's syntax, e.g.:
Q: Who shot President Abraham Lincoln?
A: John Wilkes Booth is perhaps America's most infamous assassin, having fired the bullet that killed Abraham Lincoln.

14 The Web Can Help
The Web is a gigantic data repository with extensive data redundancy.
Factoids are likely to be expressed in hundreds of different ways; at least a few will match the way the question was asked, e.g.:
Q: Who shot President Abraham Lincoln?
A: John Wilkes Booth shot President Abraham Lincoln.

15 AskMSR: Details

16 Step 1: Rewrite Queries
Intuition: the user's question is often syntactically quite close to sentences that contain the answer.
E.g. Q: Where is the Louvre Museum located?
A: The Louvre Museum is located in Paris.
Classify the question into specific categories
Apply category-specific transformation rules
Determine the expected answer data type (e.g. Date, Person, Location, ...)
Step 2: Query a Search Engine
Send all rewrites to a Web search engine and retrieve the top N results.
For speed, the search engine's "snippets" are used instead of the full text of the actual documents.
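Step 1 can be sketched with a small table of rewrite rules; the patterns, templates, and reliability weights below are illustrative, not AskMSR's actual rule set:

```python
import re

# Each rule maps a question pattern to a declarative rewrite plus a
# reliability weight (higher = more likely to precede the answer).
# These three rules are made up for illustration.
REWRITE_RULES = [
    (r"where is (.+?) located\??$", r"\1 is located in", 5),
    (r"who (\w+) (.+?)\??$",        r"\2 was \1 by", 3),
    (r"when was (.+?) born\??$",    r"\1 was born in", 5),
]

def rewrite(question):
    """Return (rewrite_string, weight) pairs for a question, most
    specific first, with the raw question as a low-weight fallback."""
    q = question.lower().strip()
    rewrites = []
    for pattern, template, weight in REWRITE_RULES:
        m = re.match(pattern, q)
        if m:
            rewrites.append((m.expand(template), weight))
    rewrites.append((q, 1))   # bag-of-words fallback
    return rewrites
```

Each rewrite is sent to the search engine as a separate query in Step 2, and its weight is reused when scoring n-grams in Step 3.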

17 Step 3: Mining N-Grams
Enumerate all N-grams (N = 1, 2, 3, say) in all retrieved snippets; a hash table makes this efficient.
Weight of an n-gram: its occurrence count, with each occurrence weighted by the "reliability" (weight) of the rewrite that fetched the document.
Step 4: Filtering N-Grams
Each question type (When..., Where..., What..., Who...) is associated with one or more "data-type filters" = regular expressions for the expected answer type (e.g. When -> Date, Where -> Location, Who -> Person).
Boost the score of n-grams that match the regular expression; lower the score of n-grams that do not.
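Steps 3 and 4 can be sketched as follows, assuming snippets arrive as (text, rewrite-weight) pairs; the year regex is a deliberately crude stand-in for a real Date filter:

```python
import re
from collections import Counter

def mine_ngrams(snippets, max_n=3):
    """Count uni-/bi-/trigrams across snippets, weighting each
    occurrence by the reliability of the rewrite that fetched it."""
    scores = Counter()                       # the hash table on the slide
    for text, weight in snippets:
        words = re.findall(r"[a-z0-9']+", text.lower())
        for n in range(1, max_n + 1):
            for i in range(len(words) - n + 1):
                scores[" ".join(words[i:i + n])] += weight
    return scores

# Crude year-matching filter standing in for a real Date data type.
DATE = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")

def apply_filter(scores, pattern, boost=2.0, penalty=0.5):
    """Boost n-grams matching the expected answer type, damp the rest."""
    return {g: s * (boost if pattern.search(g) else penalty)
            for g, s in scores.items()}
```

The boost/penalty factors are arbitrary here; the point is that the data-type filter reorders candidates without discarding any outright.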

18 Step 5: Tiling the Answers
Example: "Who created the character of Scrooge?"
Candidate n-grams and scores: Dickens (20), Charles Dickens (15), Mr Charles (10)
Tile the highest-scoring n-gram with overlapping n-grams; the merged n-gram gets the summed score and the old n-grams are discarded: Mr Charles Dickens (45)
Repeat until no more overlap.
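A sketch of the tiling step: repeatedly merge the best-scoring n-gram with any overlapping candidate, summing scores, until no overlaps remain.

```python
def overlap(a, b):
    """Length (in words) of the longest suffix of a that is a prefix of b."""
    aw, bw = a.split(), b.split()
    for k in range(min(len(aw), len(bw)), 0, -1):
        if aw[-k:] == bw[:k]:
            return k
    return 0

def tile(candidates):
    """candidates: {ngram: score}. Returns the final (answer, score)."""
    cands = dict(candidates)
    while True:
        best = max(cands, key=cands.get)
        merged = False
        for other in list(cands):
            if other == best:
                continue
            k = overlap(best, other)   # best's tail overlaps other's head
            j = overlap(other, best)   # other's tail overlaps best's head
            if k or j:
                if k >= j:
                    new = " ".join(best.split() + other.split()[k:])
                else:
                    new = " ".join(other.split() + best.split()[j:])
                # Merged n-gram inherits the summed score; discard the old ones.
                score = cands.pop(best) + cands.pop(other)
                cands[new] = cands.get(new, 0) + score
                merged = True
                break
        if not merged:
            best = max(cands, key=cands.get)
            return best, cands[best]
```

On the Scrooge example this merges Dickens (20) with Charles Dickens (15), then with Mr Charles (10), yielding Mr Charles Dickens (45), matching the slide.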

19 Current Research Work
Human Question Answering Performance Using an Interactive Document Retrieval System
Document retrieval + QA system: users answer the questions on their own using an interactive document retrieval system, and their results are compared with and evaluated against those of QA systems.
Towards Automatic Question Answering over Social Media by Learning Question Equivalence Patterns
Collaborative Question Answering (CQA) systems operate over an existing archive in which users answer each other's questions; many questions to be asked have already been asked and answered, and equivalence patterns are generated by grouping syntactically similar questions.

20 An Automatic Answering System with Template Matching for Natural Language Questions
A closed-domain system in which template matching is applied to provide a service for cell phones over SMS; Frequently Asked Questions (FAQ) are used as sample data.
Preprocessing
Question Template Matching
Answering
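A toy version of the FAQ template-matching idea above, assuming templates are stored as (question, answer) pairs and using simple word overlap (Jaccard similarity) as the match score; the 0.3 threshold and the matching scheme are illustrative, not the paper's actual method:

```python
import re

def tokens(s):
    """Preprocessing: lowercase and split into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", s.lower()))

def similarity(a, b):
    """Jaccard word overlap between two questions."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def answer(question, faq):
    """Question template matching + answering: return the answer of the
    best-matching FAQ template, or None if nothing matches well enough."""
    best_q, best_a = max(faq, key=lambda qa: similarity(question, qa[0]))
    return best_a if similarity(question, best_q) > 0.3 else None
```

The three functions mirror the three stages on the slide: preprocessing, question template matching, and answering.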

21 Conclusion
Question Answering requires more complex NLP techniques than other forms of Information Retrieval.
Complex automatic QA systems could eventually replace simple web search, but automatic QA remains a non-trivial research field: like document and information retrieval, it is a huge area with many different approaches, not all of which are fully developed.

22 References
An Analysis of the AskMSR Question-Answering System, Eric Brill et al., Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics, Philadelphia, July 2002, pp. 257-264.
New Trends in Automatic Question Answering, Univ.-Doz. Dr.techn. Christian Gutl.

23 Thank You..

