1 Just-in-Time Interactive Question Answering
Language Computer Corporation
Sanda Harabagiu, PI
John Lehmann
John Williams
Paul Aarseth

2 Overview
- Project Introduction
- Preparation for “Wizard of Oz” pilot
- Performance in WOZ pilot
- Challenges encountered in WOZ pilot
- Current work and future plans

3 Research Project Objective
Address the interactive aspect of QA systems by designing and implementing a dialog shell that can be used with any QA system

4 Tasks in JITIQA

5 Predicted Challenges in WOZ
We imagined the following about the assessor:
- Asks complex questions, compared to TREC
  - Sample showed < 1/3 with “known” answer types
- Wants fast responses
- Assumes dialogue context (pronouns, ellipses)
- Has no knowledge of question formulation
- Assumes the QA system’s collection contains answer

6 Preparation for WOZ
Extend work in factual QA with two approaches:
- Information/knowledge-centric
  - Create a Question/Answer Database (QADB)
  - Develop a question similarity metric
  - Build higher quality domain-specific document collections
- User-centric
  - Reformulate questions to resolve references and incorporate context
  - Decompose complex questions into simpler ones

7 Question/Answer Database
- Because domain is closed, we may be able to predict questions and collect answers
- How well can we cover the range of possible questions?
Process:
1. Split up topics between developers
2. Generate question and answer records
3. Rotate topics among developers

8 QADB Population
- For 10 domains, collected 334 question records, each with answers from multiple sources
- Perform retrieval of answers by computing question similarity based on concepts

Domain | # records
Afghanistan | 24
Africa | 35
Black Sea | 31
Colombia | 40
Indonesia | 57
Ivory Coast | 17
Japan | 33
Microsoft | 27
Sanchez | 37
Surgery | 33

9 Question Concepts
Q: “Why does so much opium production take place in Afghanistan?”
- Concept 1: cause
- Concept 2: popularity
- Concept 3: produced
Other questions satisfying 100% of the concepts:
- Why is so much opium produced in Afghanistan?
- Why is poppy farming popular in Afghanistan?
- For what reasons is growing poppy common in Afghanistan?
- What causes poppy farming to be so popular in Afghanistan?
- What makes opium farms so commonplace in Afghanistan?
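The concept-matching idea on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the slides do not say how concepts are detected, so the word-to-concept lexicon `CONCEPT_LEXICON` and the functions `extract_concepts`, `concept_coverage`, and `best_match` are hypothetical names invented here, and a real metric would presumably also compare topic terms such as "opium" and "Afghanistan".

```python
import re

# Hypothetical lexicon mapping surface words to concept labels. The slides do
# not describe how concepts are actually detected; this is an assumption.
CONCEPT_LEXICON = {
    "why": "cause", "reasons": "cause", "causes": "cause", "makes": "cause",
    "much": "popularity", "popular": "popularity", "common": "popularity",
    "commonplace": "popularity",
    "production": "produced", "produced": "produced", "farming": "produced",
    "growing": "produced", "farms": "produced",
}


def extract_concepts(question):
    """Return the set of concept labels triggered by words in the question."""
    words = re.findall(r"[a-z]+", question.lower())
    return {CONCEPT_LEXICON[w] for w in words if w in CONCEPT_LEXICON}


def concept_coverage(stored_question, new_question):
    """Fraction of the stored question's concepts also found in the new one."""
    stored = extract_concepts(stored_question)
    if not stored:
        return 0.0
    return len(stored & extract_concepts(new_question)) / len(stored)


def best_match(qadb_questions, new_question, threshold=1.0):
    """Return the stored question with the highest coverage, or None if the
    best coverage falls below the threshold (or the database is empty)."""
    if not qadb_questions:
        return None
    best = max(qadb_questions, key=lambda q: concept_coverage(q, new_question))
    return best if concept_coverage(best, new_question) >= threshold else None


STORED_QUESTION = "Why does so much opium production take place in Afghanistan?"
PARAPHRASES = [
    "Why is so much opium produced in Afghanistan?",
    "Why is poppy farming popular in Afghanistan?",
    "For what reasons is growing poppy common in Afghanistan?",
    "What causes poppy farming to be so popular in Afghanistan?",
    "What makes opium farms so commonplace in Afghanistan?",
]

for paraphrase in PARAPHRASES:
    # Each paraphrase from the slide covers all three concepts.
    print(concept_coverage(STORED_QUESTION, paraphrase), paraphrase)
```

With this toy lexicon, all five paraphrases from the slide score 1.0 against the stored question, while a question such as "When did opium production begin in Afghanistan?" matches only the "produced" concept and would not be retrieved at the default threshold.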

10 Document Collection
Reasons for document collection:
- Alternative to slow Internet searches
- Pre-filtering documents for domain relevance
- Internet information is of low quality
- Keeps experiment repeatable

11 Documents Collected

Domain | News | Web
Opium/Afghanistan | 1,796 | 2,306
AIDS/Africa | 10,395 | 7,189
Black Sea Pollution | 1,867 | 4,131
FARC/Colombia | 4,080 | 5,161
Indonesian Economy | 16,665 | 14,704
Cell Phones/Ivory Coast | 1,264 | 3,164
Joint Ventures/Japan | 1,944 | 3,556
Microsoft/Viruses | 3,507 | 7,553
Elizardo Sánchez | 508 | 307
Robotic Surgery | 5,448 | 4,908

News source collections
- Documents from major newspapers with dates
- Collected with one general query per domain to catch all possibly relevant documents
- Used in pilot
Web source collections
- Generally poor quality documents
- Multiple specific queries used per domain, saving top 500 documents each time
- Not used in pilot

12 Performance in Pilot
- Assessors in pilots 1 and 2 graded our dialog based on several performance measures for each domain
- Scale: 1-7 with 7 representing “completely satisfied”

Performance Measure | P1 | P2
Final Answer | 6 | 5.7
Time | 5.5 | 5.8
Dialog | 6.2 | 5.6
Clarifications | 7 | 6
System Clarity | 6.3 | 6.4
Overall | 6.2 | 5.9

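13 QADB Performance
Number of questions QADB answered fully, partially, or not at all, for both pilot experiments combined

Opium/Afghanistan: Full 3, Part 2, None 1, Total 6
AIDS/Africa: Full 3, Part 2, None 3, Total 8
Black Sea Pollution: Full 5, Part or None 2, Total 7
FARC/Colombia: Full 2, Part or None 3, Total 5
Indonesian Economy: Full 8, Part 3, None 2, Total 13
Cell Phones/Ivory Coast: Full 3, Part 6, None 2, Total 11
Joint Ventures/Japan: Full 2, Part or None 4, Total 6
Microsoft/Viruses: Full 3, Part or None 1, Total 4
Elizardo Sánchez: Full 5, Part or None 1, Total 6
Robotic Surgery: Full 5, Part or None 4, Total 9
Total: Full 39, Part 16, None 20, Total 75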
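As a quick worked example, the totals row of this slide (39 fully answered, 16 partially answered, 20 not answered, 75 questions overall) can be turned into coverage percentages. The variable names below are illustrative only and do not come from the presentation.

```python
# Totals row from the QADB Performance slide.
totals = {"full": 39, "part": 16, "none": 20}

# The three categories should add up to the reported total of 75 questions.
total_questions = sum(totals.values())

# Share of all pilot questions in each category, as a percentage.
shares = {kind: round(100 * count / total_questions, 1)
          for kind, count in totals.items()}

print(total_questions)  # 75
print(shares)           # {'full': 52.0, 'part': 21.3, 'none': 26.7}
```

So roughly half of the pilot questions were answered fully from the QADB, and about a quarter were not covered at all.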

14 Complex Questions
Complex questions require mapping into simpler questions
“Biographical information needed on Elizardo Sanchez, Cuban dissident”
- When and where was Elizardo Sanchez born?
- Where did Elizardo Sanchez go to school?
- Who is in Elizardo Sanchez’s family?
“Give some information on uses of robotic surgery in US and foreign countries?”
- What kinds of surgeries do robots perform?
- What laws govern robotic surgery in the US?
- What are the benefits of robotic surgery?

15 Complex Answer Types
System only recognizes simple answer types
- Money: How much money can be made from opium smuggling?
- Locations: What countries are involved in fighting Afghan opium production?
- Date: When did opium production begin in Afghanistan?
Most questions sought complex answers
- Cause: Why does so much opium production take place in Afghanistan?
- Action: What is being done to fight opium production?
- Effects: How have recent events affected opium production?
- Problems: What problems do counter narcotics face in Afghanistan?
- Policy: What is the United States’ financial commitment to drug control efforts?

16 Other Challenging Questions
- Ambiguous questions
  - “During the years 1998-2001 what was Indonesia’s currency exchange rate?”
  - What measure is desired? An average? Each year?
- Follow-up questions
  - A: “Kim currently works three days a week at WIN-TECH, a three-company joint venture...”
  - Q: “who is kim?”
- Misleading questions
  - Misspellings, slang, capitalization, statements

17 Current Work
- Automatic complex question breakdown
  - Specification of general terms
    - “What elements cause Black Sea pollution?”
    - Expand elements into companies, countries, chemicals
  - Decomposition into members
    - “What about Sherron Watkins’ family?”
    - Decompose family into parents, children, spouse
- Dialog context understanding
  - Coreference of anaphors
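The two breakdown strategies on this slide, expanding a general term and decomposing a group into its members, can both be sketched as a lookup followed by substitution. This is a minimal sketch, not the project's implementation: the `EXPANSIONS` table and the `decompose` function are hypothetical, and only the two expansions listed on the slide are included.

```python
import re

# Hypothetical expansion table; the entries are the two examples on the slide.
EXPANSIONS = {
    "elements": ["companies", "countries", "chemicals"],
    "family": ["parents", "children", "spouse"],
}


def decompose(question):
    """Return one simpler question per expansion of the first general term
    found in the question, or the question unchanged if no term matches."""
    for term, members in EXPANSIONS.items():
        pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
        if pattern.search(question):
            # Substitute only the first occurrence of the general term.
            return [pattern.sub(member, question, count=1) for member in members]
    return [question]


for simple in decompose("What elements cause Black Sea pollution?"):
    print(simple)
# What companies cause Black Sea pollution?
# What countries cause Black Sea pollution?
# What chemicals cause Black Sea pollution?

for simple in decompose("What about Sherron Watkins' family?"):
    print(simple)
# What about Sherron Watkins' parents?
# What about Sherron Watkins' children?
# What about Sherron Watkins' spouse?
```

Plain substitution keeps the sketch short but does not repair grammar or resolve what "What about ...?" refers to; that would fall to the dialog context work listed on the same slide.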

18 Future Plans
- Enhanced transformations of complex questions into simple ones
- Enhanced incorporation of context
  - Ellipsis resolution
  - Recognition of intentions
- Automatic and interactive generation of topic knowledge and QADB population

