TextMap: An Intelligent Question-Answering Assistant. Project Members: Ulf Hermjakob, Eduard Hovy, Chin-Yew Lin, Kevin Knight, Daniel Marcu, Deepak Ravichandran.



State-of-the-art Q&A capabilities [Webclopedia-2001]
Question 110: Who killed Lee Harvey Oswald?
Qtargets: I-EN-PROPER-PERSON & S-PROPER-NAME, I-EN-PROPER-ORGANIZATION
Answer: “Belli’s clients have included Jack Ruby, who killed John F. Kennedy assassin Lee Harvey Oswald, and Jim and Tammy Bakker.”
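The Qtarget idea above (mapping a question to its expected answer type before searching) can be illustrated with a minimal rule-based sketch. Only the PERSON/ORGANIZATION labels come from the slide; the other labels and the rules themselves are illustrative stand-ins for Webclopedia's much larger typology.

```python
# Minimal sketch of Qtarget-style answer typing. The rule set is a toy;
# only the "who" labels appear in the slide, the rest are hypothetical.
import re

QTARGET_RULES = [
    (r"^who\b", ["I-EN-PROPER-PERSON & S-PROPER-NAME", "I-EN-PROPER-ORGANIZATION"]),
    (r"^where\b", ["I-EN-LOCATION"]),
    (r"^when\b", ["I-EN-DATE"]),
    (r"^how (much|many)\b", ["I-EN-QUANTITY"]),
    (r"^what type of\b", ["I-EN-TYPE"]),
]

def qtargets(question: str) -> list:
    """Return candidate answer types (Qtargets) for a question."""
    q = question.strip().lower()
    for pattern, types in QTARGET_RULES:
        if re.match(pattern, q):
            return types
    return ["I-EN-UNKNOWN"]
```

Knowing the Qtarget lets the system filter candidate answers: for Question 110, only PERSON or ORGANIZATION names in retrieved passages are considered.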

What can current Q&A systems do well?
Answer factoid questions:
–What was the name of the first Russian astronaut to do a spacewalk?
–Where is Belize located?
–How much folic acid should an expectant mother get daily?
–What type of bridge is the Golden Gate Bridge?
Best system performance (TREC-10): 66%.

What can current systems not do well?
Answer complex questions:
–What do you know about Bill Clinton?
Answer rhetorical questions:
–What were the causes of the war in Yugoslavia?
Find answers in foreign-language documents.
Assist users in:
–exploring large textual collections;
–aggregating the information they mine to enable subsequent analysis.
Adapt to users’ preferences and knowledge.

The TextMap Approach
Put the user in the driver’s seat:
–let the user decide how complex questions should be decomposed into simple questions and how answers to simple questions should be aggregated;
–log all steps to enable automatic learning of complex question decomposition and answer matching.
Pre-annotate!
–Syntax, shallow semantics (named entities), ontologies, discourse.

The TextMap Intelligent Q&A Assistant

TextMap Scenarios
Scenario 1:
–Start with simple questions:
When was Mullah Mohammad Rabbani born?
Where did Mullah Mohammad Rabbani get his education?
What is the highest position Mullah Mohammad Rabbani had in the Afghan government?
–Use answers to search “adjacent” information spaces:
What are the political views of Mullah Mohammad Rabbani?
–Aggregate answers according to user-defined criteria, to form a coherent answer.

TextMap Scenarios
Scenario 2:
–Start with complex questions:
Construct the biography of Mullah Mohammad Rabbani.
–Automatically decompose complex questions into simple ones:
When was Mullah Mohammad Rabbani born?
Where did Mullah Mohammad Rabbani get his education?
What is the highest position Mullah Mohammad Rabbani had in the Afghan government?
–Automatically aggregate answers, using previously observed / learned patterns.
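The decomposition step in Scenario 2 can be sketched as template instantiation. The "biography" template below simply reuses the simple questions listed in the scenario; the template name, slot name, and lookup mechanism are illustrative assumptions, not TextMap's actual (learned) decomposition machinery.

```python
# Hedged sketch of template-based complex-question decomposition.
# The questions are taken from the Rabbani scenario; the template
# structure itself is a hypothetical stand-in for learned patterns.
DECOMPOSITION_TEMPLATES = {
    "biography": [
        "When was {person} born?",
        "Where did {person} get his education?",
        "What is the highest position {person} had in the Afghan government?",
    ],
}

def decompose(template: str, **slots) -> list:
    """Expand a complex-question template into simple questions."""
    return [q.format(**slots) for q in DECOMPOSITION_TEMPLATES[template]]
```

Each simple question would then be answered independently, with the answers aggregated into a coherent biography.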

Resources at ISI (1)
Webclopedia—question-answering system:
Software:
–CONTEX, a syntactic/semantic parser [Hermjakob, 2001]
–Query-formation module, which includes stemming, query expansion, and other preprocessing routines
–MG, an Information Retrieval engine (Sydney University)
–Text segmenters and text rankers to determine the likelihood that segments contain answers
–IdentiFinder (BBN’s Named Entity recognizer)
–Answer modules to find and present the answers
Additional resources:
–Typology of Question/Answer types
–18,000+ questions from answers.com
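The Webclopedia components listed above form a pipeline: query formation, retrieval, segment ranking, answer extraction. A toy skeleton of that flow is sketched below; every function is a stand-in (naive term overlap in place of MG and the real rankers, a hand-written stopword list in place of the query-formation module), not the actual module APIs.

```python
# Rough, illustrative sketch of the Webclopedia-style answer pipeline.
# All stage implementations here are toy stand-ins for the real modules.
from dataclasses import dataclass

@dataclass
class Candidate:
    segment: str
    score: float

def form_query(question: str) -> list:
    # The real query-formation module does stemming and expansion;
    # here we just drop a few stopwords.
    stopwords = {"who", "what", "where", "when", "the", "is", "was", "did"}
    words = [w.strip("?.,").lower() for w in question.split()]
    return [w for w in words if w not in stopwords]

def retrieve(query: list, corpus: list) -> list:
    # Stand-in for the MG retrieval engine: keep docs sharing any term.
    return [d for d in corpus if any(t in d.lower() for t in query)]

def rank_segments(query: list, segments: list) -> list:
    # Likelihood a segment contains the answer, approximated by overlap.
    cands = [Candidate(s, sum(t in s.lower() for t in query)) for s in segments]
    return sorted(cands, key=lambda c: c.score, reverse=True)

def answer(question: str, corpus: list) -> str:
    query = form_query(question)
    return rank_segments(query, retrieve(query, corpus))[0].segment
```

In the real system, named-entity recognition (IdentiFinder) and Qtarget filtering would further restrict which phrases inside the top segments count as answers.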

Resources at ISI (2)
Summarization:
–SUMMARIST and NeATS (single- and multi-document summarizers).
–SEE (Summarization Evaluation Interface).
Discourse processing:
–Discourse parser and discourse-based summarizer.
–Corpus of discourse trees.
Machine Translation:
–ReWrite: statistical machine translation system (learner + decoder).
–Parallel and comparable corpora.

Development plan – Year 1
–Build TextMap Interface and integrate Webclopedia capabilities into it.
–Annotate massive amounts of texts with syntactic, semantic, discourse tags.
–Develop rhetorical question-answering capabilities (focus initially on answering causal questions).
–Develop complex question-answering capabilities (focus initially on answering event descriptions and biographical questions).
–Query expansion for foreign names, covering spelling variants.
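The last Year-1 item, query expansion over spelling variants of foreign names, can be sketched with a small substitution table. The rules below are a toy set for illustration (e.g., Mohammad/Muhammad/Mohammed), not the project's actual transliteration resources.

```python
# Illustrative sketch of spelling-variant expansion for foreign names.
# The rule table is hypothetical; real systems use transliteration models.
VARIANT_RULES = [
    ("mohammad", "muhammad"),
    ("mohammad", "mohammed"),
    ("ullah", "ulla"),
]

def name_variants(name: str) -> set:
    """Return the lowercased name plus one-step spelling variants."""
    variants = {name.lower()}
    for src, dst in VARIANT_RULES:
        for v in list(variants):  # snapshot so we can add while iterating
            if src in v:
                variants.add(v.replace(src, dst))
    return variants
```

At query time, every variant would be OR-ed into the retrieval query so that documents using any spelling are found.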

Development plan – Year 2
–Improve TextMap Interface to learn from user feedback.
–Extend simple, rhetorical, and complex question-answering capabilities.
–Exploit system logs in order to learn question-answering decompositions and question-answering patterns.
–Translation of names, locations, etc., to provide English indices to foreign-language documents.

Main problem
We don’t know how to evaluate!
–Want to automate if possible, but have to figure out how to remove the user from the task.