
1 TextMap: An Intelligent Question-Answering Assistant
Project Members: Abdessamad Echihabi, Ulf Hermjakob, Eduard Hovy, Kevin Knight, Daniel Marcu, Deepak Ravichandran

2 Research Foci and Accomplishments
Increase Q&A performance on simple, factoid-type questions
– Learned surface text patterns for Q&A
– Incorporated pattern-based answering into TextMap
Develop capability for answering cause/evidence and opinion questions
– Learned to recognize causal/evidence relations in arbitrary texts
Develop capability for answering complex questions
– Answer "who is" questions as mini-biographies
Design Q&A interface and system architecture
– Multiple Q&A engines run in parallel
– Dynamically ranked answers are presented to analysts as soon as they become available

3 Learning surface text patterns for Q&A [Ravichandran and Hovy; ACL-2002]
Motivation:
– Surface text patterns can be used to answer certain factoid questions ("When was NAME born?"):
  <NAME> was born in <ANSWER>
  <NAME> ( <ANSWER> --
Hypothesis:
– Surface text patterns can be automatically learned from the Web

4 Approach
Start with a small set/"seed" of known answers to a given question type
– "Gandhi 1869", "Newton 1642"
Download documents from the Web that contain the answers in a single sentence
Use a suffix tree to find common sub-strings
– "The great composer Mozart (1756-1791) achieved fame at a young age"
– "Mozart (1756-1791) was a genius"
– "The whole world would always be indebted to the great music of Mozart (1756-1791)"
Replace the known name and answer in common sub-strings with tags
– <NAME> ( <ANSWER> --
– <NAME> was born in <ANSWER>
Discard low-frequency patterns and measure precision (a code sketch follows)
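A minimal sketch of this learning loop in Python. The brute-force common_substrings function stands in for the suffix tree, and all names here (tag_sentence, learn_patterns, the min_count threshold) are illustrative assumptions, not the project's actual code:

```python
from itertools import combinations
from collections import Counter

def tag_sentence(sent, name, answer):
    """Replace the known question term and answer with placeholder tags."""
    return sent.replace(name, "<NAME>").replace(answer, "<ANSWER>")

def common_substrings(a, b, min_len=10):
    """Substrings shared by two strings (brute-force stand-in for a suffix tree)."""
    shared = set()
    for i in range(len(a)):
        for j in range(i + min_len, len(a) + 1):
            if a[i:j] in b:
                shared.add(a[i:j])
    return shared

def learn_patterns(tagged_sentences, min_count=2):
    """Keep substrings that contain both tags and recur across sentence pairs."""
    counts = Counter()
    for a, b in combinations(tagged_sentences, 2):
        for sub in common_substrings(a, b):
            if "<NAME>" in sub and "<ANSWER>" in sub:
                counts[sub] += 1
    return {p for p, c in counts.items() if c >= min_count}

# Seed pair and the three Mozart sentences from the slide:
seed_name, seed_answer = "Mozart", "1756"
raw = [
    "The great composer Mozart (1756-1791) achieved fame at a young age",
    "Mozart (1756-1791) was a genius",
    "The whole world would always be indebted to the great music of Mozart (1756-1791)",
]
tagged = [tag_sentence(s, seed_name, seed_answer) for s in raw]
print(learn_patterns(tagged))  # includes "<NAME> (<ANSWER>-1791)" among the candidates
```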

5 Initial results – TREC-10 questions (scored by MRR; computation sketched below)

Answers found in the TREC corpus:
Question type    #questions    MRR
Birthyear          8           0.4787
Inventors          6           0.1666
Discoverers        4           0.1250
Definitions        102         0.3445
Why-famous         3           0.6666
Locations          16          0.75

Answers found on the Web:
Question type    #questions    MRR
Birthyear          8           0.6875
Inventors          6           0.5833
Discoverers        4           0.8750
Definitions        102         0.3857
Why-famous         3           0.0
Locations          16          0.8643
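For reference, MRR (mean reciprocal rank) scores each question by the reciprocal of the rank at which the first correct answer appears, then averages over questions. A sketch, assuming the standard TREC convention of scoring only the top five answers (0 if none is correct):

```python
def mean_reciprocal_rank(ranked_answer_lists, gold_answers, cutoff=5):
    """ranked_answer_lists[i] is the system's ranked answers for question i;
    gold_answers[i] is the set of acceptable answers for that question."""
    total = 0.0
    for answers, gold in zip(ranked_answer_lists, gold_answers):
        for rank, ans in enumerate(answers[:cutoff], start=1):
            if ans in gold:
                total += 1.0 / rank  # score is 1/rank of first correct answer
                break                # later correct answers do not add score
    return total / len(ranked_answer_lists)

# e.g. mean_reciprocal_rank([["1948", "1869"]], [{"1869"}]) == 0.5
```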

6 Future work
Incorporate semantic filters to block answers of the wrong type (a sketch follows this slide):
– "Mozart was born in Salzburg" matches <NAME> was born in <ANSWER>, but Salzburg is a location, not a year
Learn to deal with long-distance dependencies and answers of some expected length
– <NAME> lies on <ANSWER>
– "London, which has one of the busiest airports in the world, lies on the banks of the river Thames"
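For the birthyear case, such a filter could be as simple as a type check on the extracted answer; a hypothetical sketch (the accepted year range is an arbitrary assumption):

```python
import re

def is_plausible_birthyear(answer: str) -> bool:
    """Reject pattern matches whose answer is not a plausible year."""
    return re.fullmatch(r"1\d{3}|20\d{2}", answer) is not None

print(is_plausible_birthyear("1756"))      # True  -> keep the match
print(is_plausible_birthyear("Salzburg"))  # False -> block the wrong-type match
```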

7 Answering cause/evidence questions
Motivation:
– Question: Why did people die in Burundi?
  "In Burundi, 179 people died. The flood that hit the capital was the largest ever recorded."
– Answer: "because of the flood"
– To produce this answer, we need to identify that a cause/evidence relation holds between the sentences above.
– The answer is implicit.

8 Recognizing discourse relations in texts [Marcu and Echihabi; ACL-2002]
"Such standards would preclude arms sales to states like Libya, which is also currently subject to U.N. embargo." ??? "States like Rwanda before its present crisis would still be able to legally buy arms."
The missing connective is BUT, signaling a Contrast relation:
– ¬Can_buy_arms_legally(Libya)
– Can_buy_arms_legally(Rwanda)
– Similar(Libya, Rwanda)
P(BUT/Contrast | the sentence pair) is high

9 Approach
Collect a corpus of 1 billion words of English (41M sentences)
Use simple pattern matching to automatically extract MANY examples of contrast, cause, elaboration, … relations (a sketch of this extraction follows the table):
– [BOS … EOS] [BOS But … EOS]
– [BOS … ] [but … EOS]
– [BOS Although …] [, … EOS]

Relation                       # of examples
Contrast                       3,881,588
Cause-Explanation-Evidence       889,946
Condition                      1,203,813
Elaboration                    1,836,227
No-Relation-Same-Text          1,000,000
No-Relation-Different-Texts    1,000,000
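A hedged sketch of this extraction step in Python; the exact patterns and relation inventory of the real system are not on the slide, so these three regexes are illustrative stand-ins:

```python
import re

# Each pattern captures the two text spans the relation is assumed to hold between.
PATTERNS = [
    # Sentence, then a sentence starting with "But" -> Contrast
    (re.compile(r"([A-Z][^.!?]*[.!?])\s+But\s+([^.!?]*[.!?])"), "Contrast"),
    # Clause + "because" clause -> Cause-Explanation-Evidence
    (re.compile(r"([A-Z][^.!?,]*),?\s+because\s+([^.!?]*[.!?])"), "Cause-Explanation-Evidence"),
    # "Although" clause, then main clause -> Contrast
    (re.compile(r"Although\s+([^,]+),\s+([^.!?]*[.!?])"), "Contrast"),
]

def extract_examples(text):
    """Harvest (span1, span2, relation) training triples from raw text."""
    examples = []
    for regex, relation in PATTERNS:
        for m in regex.finditer(text):
            # The cue phrase itself is not captured, so the trained model
            # must rely on content words rather than the connective.
            examples.append((m.group(1), m.group(2), relation))
    return examples

print(extract_examples("John is tall. But he is not fast."))
# [('John is tall.', 'he is not fast.', 'Contrast')]
```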

10 Approach (cont'd)
Train a simple Bayesian model that explains how the data can be generated:
– Source: a relation r, drawn with probability P(r)
– Channel: r emits pairs of words (w1, w2) with probability P((w1, w2) | r)
– Decoder: given two bags of words W1, W2 (sentences, clauses), find the most likely relation

r* = argmax_r P(r) · P(W1, W2 | r) ≈ argmax_r P(r) · ∏_{(w1, w2) ∈ W1 × W2} P((w1, w2) | r)
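Under that reading the decoder reduces to naive Bayes over word pairs. A minimal sketch; the add-one smoothing scheme is an assumption, and the real system trained on the millions of auto-extracted examples above:

```python
import math
from collections import Counter, defaultdict

class WordPairNaiveBayes:
    def __init__(self):
        self.relation_counts = Counter()          # C(r)
        self.pair_counts = defaultdict(Counter)   # C((w1, w2), r)

    def train(self, examples):
        """examples: iterable of (span1_words, span2_words, relation)."""
        for w1s, w2s, r in examples:
            self.relation_counts[r] += 1
            for w1 in w1s:
                for w2 in w2s:
                    self.pair_counts[r][(w1, w2)] += 1

    def classify(self, w1s, w2s):
        """r* = argmax_r P(r) * prod_{(w1,w2) in W1 x W2} P((w1,w2) | r)."""
        total = sum(self.relation_counts.values())
        best, best_score = None, float("-inf")
        for r, c in self.relation_counts.items():
            score = math.log(c / total)           # log P(r)
            pairs_total = sum(self.pair_counts[r].values())
            vocab = len(self.pair_counts[r]) + 1
            for w1 in w1s:
                for w2 in w2s:
                    # Add-one smoothing over observed pair types (an assumption).
                    count = self.pair_counts[r][(w1, w2)] + 1
                    score += math.log(count / (pairs_total + vocab))
            if score > best_score:
                best, best_score = r, score
        return best
```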

11 Results (in all cases, the baseline is 50%)

Binary classification accuracy (%) for each pair of relations:
Relation            CEV    Cond   Elab   No-Rel-Same   No-Rel-Diff
Contrast             87     74     82        64
Cause-Evidence              76     93        75             74
Condition                          89        69             71
Elaboration                                  76             75
No-Rel-Same-Text                                            64

12 Future work
Learn to recognize cause/evidence questions
– Develop typology for cause/evidence Q&A types
Develop algorithms for cause/evidence answer extraction/production
Incorporate cause/evidence Q&A capability into TextMap

13 Answering complex questions [Hermjakob, Hovy, Ticrea, Cha]
Some complex answers have stereotypical content and structure
"Mini-bio"
– Defined prototypical biography structure
– Defined initial typology of biography/person types
– Implemented prototype system
"Natural disasters"
– Defined prototypical structure
– Defined initial typology of disaster types

14 Example
Question: "Who is Clarence Thomas?"
Old (factoid) answer: "judge"
New answer: "Clarence Thomas, born 1947/48; judge for the U.S. Court of Appeals for the District of Columbia; nominated to the Supreme Court in 1991 by President Bush; confirmed by the Senate by a narrow 52 to 48 vote."
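A purely illustrative sketch of how a "prototypical biography structure" might look as a slot-based data type; the slot names and render logic are assumptions, not the project's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class MiniBiography:
    name: str
    birth: str = ""                                  # e.g. "born 1947/48"
    roles: list = field(default_factory=list)        # positions held
    key_events: list = field(default_factory=list)   # nominations, votes, ...

    def render(self) -> str:
        """Assemble the filled slots into a one-paragraph answer."""
        parts = [self.name]
        if self.birth:
            parts.append(self.birth)
        return "; ".join(parts + self.roles + self.key_events) + "."

bio = MiniBiography(
    name="Clarence Thomas",
    birth="born 1947/48",
    roles=["judge for the U.S. Court of Appeals for the District of Columbia"],
    key_events=[
        "nominated to the Supreme Court in 1991 by President Bush",
        "confirmed by the Senate by a narrow 52 to 48 vote",
    ],
)
print(bio.render())
```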

15 Future work
Large-scale tests and refinements of complex answer types
Creation of biographies and descriptions of natural disasters
– Possibly only partly coherent
– Possibly incomplete

16 Interface and system architecture [Graehl, Knight, and Marcu]
Three Q&A systems run in parallel:
– IR-based
– Surface text pattern-based
– Syntax/semantics-based [Webclopedia]
Answers are presented to the analyst as soon as they become available, as a dynamically ranked list (sketched below)
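A sketch of this fan-out/streaming pattern; the engine names and the (answer, score) interface are assumptions for illustration:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def ask_all(question, engines):
    """engines: mapping of name -> callable(question) -> list of (answer, score).
    Yields a re-ranked snapshot of all answers each time an engine finishes."""
    shown = []
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        futures = {pool.submit(fn, question): name for name, fn in engines.items()}
        for future in as_completed(futures):
            for answer, score in future.result():
                shown.append((score, answer, futures[future]))
            shown.sort(reverse=True)   # dynamic re-ranking on each arrival
            yield list(shown)          # snapshot presented to the analyst

# Hypothetical usage, with the three engines from the slide:
# engines = {"ir": ir_engine, "patterns": pattern_engine, "webclopedia": syntax_engine}
# for ranking in ask_all("When was Mozart born?", engines):
#     display(ranking)
```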

17 Future work
Learn to choose between answers produced by different systems
Log analyst actions for data mining

18 TextMap: An Intelligent Question-Answering Assistant (AQUAINT)
PIs: Daniel Marcu, Eduard Hovy, Kevin Knight, USC/ISI. Project COTR: Kellcy Allwein, DIA. Date prepared: Dec 2001.
http://www.isi.edu/natural-language/textmap.html

The Novel Ideas:
– An adaptable, flexible QA system that learns from user interactions
– Advanced rhetorical-, semantics-, and statistics-based question understanding, answering, and indexing
– Advanced representations of the structure of complex multi-part answers
– Answers integrated from multiple sources

Impact: a high-performance question-answering system capable of answering
– Complex questions (biographical and event-related)
– Causal questions, using rhetorical parsing
– Multilingual questions, via robust named-entity translation

Milestones/Dates/Status            Scheduled    Actual
Architecture
– Initial interface                JUN 2002     JUN 2002
Question types
– Factoids                         JUN 2002     JUN 2002
– Structured questions             DEC 2002
– Causal questions                 JUN 2003
User profiling
– Initial profiling                DEC 2002
– Learning preferences             DEC 2003
Named-entity translation           DEC 2003

