Reading Report on Hybrid Question Answering System

Presentation transcript:

Reading Report on Hybrid Question Answering System
Systematic reading
孙亚伟 (Sun Yawei)

Articles
ISOFT at QALD-5: Hybrid question answering system over linked data and text data
Seonyeong Park, Pohang University of Science and Technology
In CLEF 2015 Working Notes Papers

Outline
  Introduction
  System Description
    System Architecture
    Basic Analysis
    Query Generation
    Semantic Answer Type
    Multi-information Tagged Text Database
    SPARQL query template generator
  Experiment
  Conclusion

Introduction
Complex Questions
  Where was the "Father of Singapore" born?
  Which Secretary of State was significantly involved in the United States' dominance of the Caribbean?
  Who is the architect of the tallest building in Japan?
  Which German mathematicians were members of the von Braun rocket group?
  How old was Steve Jobs' sister when she first met him?
Answering these questions requires information from both structured and unstructured data.

Introduction
QA over linked data and text data
Method: combine KBQA and IRQA
  Extract clues using IRQA
  If no clues are found, generate a SPARQL query

System Description
  System Architecture
  Basic Analysis
  Query Generation
  Semantic Answer Type
  Multi-information Tagged Text Database
  SPARQL query template generator

System Architecture

Example

Basic Analysis
Method: statistical and rule-based
Ordinary NLP techniques
  Tokenization, POS tagging, dependency parsing
  Keyword, term, NE
QA-system-oriented techniques
  Question to statement (Q2S)
  Lexical answer type (LAT)
  Phrase extraction
    Predicate phrase
    Prepositional phrase

Example

Question  Sequential Phrases Queries Query Generation Question  Sequential Phrases Queries First query: concatenating the two rightmost phrases Next query: concatenation of the answer of the first query and the next rightmost phrase Repeat this process to find the answer Note: If fail to find clues, generate SPARQL query

Semantic Answer Type
SAT Classification
  Type set: from DBpedia
  Features: keywords, LAT, type of each NE, interrogative wh-word, predicate and its arguments
  Accuracy: 71.42% (3-level type ontology), 84.62% (2-level type ontology)
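One way to picture the feature set is as a sparse dictionary that a classifier such as an SVM could consume. The sketch below is illustrative only: the `name=value` key encoding, the function name `sat_features` and the example inputs are assumptions, not details taken from the paper.

```python
from typing import Dict, List


def sat_features(
    keywords: List[str],
    lat: str,
    ne_types: List[str],
    wh_word: str,
    predicate: str,
    arguments: List[str],
) -> Dict[str, int]:
    """Build a sparse binary feature dict (the key encoding is an assumption)."""
    feats: Dict[str, int] = {}
    for kw in keywords:
        feats[f"kw={kw.lower()}"] = 1
    feats[f"lat={lat.lower()}"] = 1
    for ne_type in ne_types:
        feats[f"ne_type={ne_type}"] = 1
    feats[f"wh={wh_word.lower()}"] = 1
    feats[f"pred={predicate.lower()}"] = 1
    for arg in arguments:
        feats[f"arg={arg.lower()}"] = 1
    return feats


# Illustrative inputs for "In which city were Charlie Chaplin's half brothers born?"
example = sat_features(
    keywords=["city", "Charlie Chaplin", "half brothers", "born"],
    lat="city",
    ne_types=["Person"],
    wh_word="which",
    predicate="born",
    arguments=["Charlie Chaplin half brothers", "city"],
)
```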

Multi-information Tagged Text Database
Stored for each sentence in Wikipedia:
  Plain text
  Tagged text with co-reference resolution and disambiguation information
  Title of the Wikipedia page the sentence comes from
  POS tagging, dependency parsing, and SRL results
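A record in such a database might look like the following. This is a hypothetical layout: the class name, the field names, the bracket notation for resolved entities and the example sentence are all assumptions made for illustration, since the slide lists only which kinds of information are stored.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TaggedSentence:
    """One sentence with the four kinds of information listed on the slide."""
    plain_text: str
    tagged_text: str   # co-references resolved, entities disambiguated
    page_title: str    # Wikipedia page the sentence comes from
    pos_tags: List[Tuple[str, str]] = field(default_factory=list)
    dependencies: List[Tuple[str, str, str]] = field(default_factory=list)  # (head, relation, dependent)
    srl: List[Dict[str, str]] = field(default_factory=list)                 # one dict per predicate


# Illustrative record; the annotation values are written by hand, not tool output.
record = TaggedSentence(
    plain_text="She was born in London.",
    tagged_text="[Ada_Lovelace] was born in [London].",
    page_title="Ada Lovelace",
    pos_tags=[("She", "PRP"), ("was", "VBD"), ("born", "VBN"),
              ("in", "IN"), ("London", "NNP"), (".", ".")],
    srl=[{"predicate": "born", "ARG1": "She", "ARGM-LOC": "in London"}],
)
```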

SPARQL query template generator
Detect words in each question to select the SPARQL template
  Questions including arithmetic information
    Comparative words such as 'deeper', superlative words such as 'deepest'
  Yes/No questions
    If neither a 'wh-question' nor a 'list-question', then a 'yes/no question'
  Simple questions
    Using lexical matching and semantic similarity, map the predicate to properties in the KB
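The word-detection step can be sketched as a small rule-based profiler. This is a crude illustration and not the generator described in the paper: the word lists are tiny hand-picked assumptions, and checking only the first two tokens for a wh-word is a simplification.

```python
import re
from typing import NamedTuple

# Small illustrative lexicons (assumptions, far from complete).
WH_WORDS = {"who", "whom", "whose", "what", "which", "where", "when", "why", "how"}
LIST_STARTERS = {"give", "list", "show", "name"}
COMPARATIVES = {"deeper", "taller", "higher", "longer", "larger", "older", "more", "less"}
SUPERLATIVES = {"deepest", "tallest", "highest", "longest", "largest", "oldest", "most", "least"}


class QuestionProfile(NamedTuple):
    form: str         # "wh", "list" or "yes/no"
    arithmetic: bool  # True if a comparative or superlative word is present


def profile_question(question: str) -> QuestionProfile:
    tokens = re.findall(r"[a-z]+", question.lower())
    # A wh-word in the first two tokens covers "Which ..." and "In which ...".
    if any(t in WH_WORDS for t in tokens[:2]):
        form = "wh"
    elif tokens and tokens[0] in LIST_STARTERS:
        form = "list"
    else:
        # Neither a wh-question nor a list question: treat as yes/no.
        form = "yes/no"
    arithmetic = any(t in COMPARATIVES or t in SUPERLATIVES for t in tokens)
    return QuestionProfile(form, arithmetic)
```

A real system would more likely rely on POS tags for comparative and superlative adjectives than on a fixed word list.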

Experiment
  Test: QALD-5 hybrid question test dataset
  Data: Wikipedia, DBpedia 3.10

Success
Columns: Question / Query generation / Process

[53] Who is the architect of the tallest building in Japan? [Building]
  Query generation: in Japan / tallest building / architect
  Process:
    [1] Find that the tallest building in Japan is "Tokyo_Skytree". [using IR]
    [2] Map "architect" to "architect" -> Nikken_Sekkei. [using SPARQL]

[55] In which city were Charlie Chaplin's half brothers born? [City]
  Query generation: Charlie Chaplin half brother / born
  Process:
    [1] Find the half brother of Charlie Chaplin. [using IR]
    [2] Select ?uri {Sydney_Chaplin birthplace ?uri} [using SPARQL] -> "England", "United Kingdom" and "London"
    [3] Using the semantic answer type, filter to "London". [Filter]

[58] Are there man-made lakes in Australia that are deeper than 100 meters? [Place]
  Query generation: man-made lake Australia / deeper 100
  Process:
    [1] Extract answer clues for "man-made lake Australia". [using IR]
    [2] Compare the length of each named entity and check whether more than one river is deeper than 100 meters. [using SPARQL]
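Steps [2] and [3] of question 55 can be illustrated with a short sketch that builds the SELECT query and then filters candidates by semantic answer type. Assumptions: the DBpedia prefixes and the `dbo:birthPlace` property are used in place of the slide's shorthand `birthplace`, the helper names are hypothetical, and the candidate type table is hand-written rather than fetched from DBpedia.

```python
from typing import Dict, List, Set


def build_select(entity: str, prop: str) -> str:
    """Return a one-triple SELECT query over DBpedia resources."""
    return (
        "PREFIX dbr: <http://dbpedia.org/resource/>\n"
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        f"SELECT ?uri WHERE {{ dbr:{entity} dbo:{prop} ?uri }}"
    )


def filter_by_type(
    candidates: List[str],
    types: Dict[str, Set[str]],
    answer_type: str,
) -> List[str]:
    """Keep only candidates whose known types include the semantic answer type."""
    return [c for c in candidates if answer_type in types.get(c, set())]


query = build_select("Sydney_Chaplin", "birthPlace")

# Hand-written type table standing in for a KB lookup (assumption).
candidate_types = {
    "England": {"Country"},
    "United Kingdom": {"Country"},
    "London": {"City"},
}
answers = filter_by_type(["England", "United Kingdom", "London"], candidate_types, "City")
```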

Error: no answer clue found
Columns: Question / Query generation

[51] Where was the "Father of Singapore" born? [Place]
  Query generation: "Father of Singapore" born / Where
[52] Which Secretary of State was significantly involved in the United States' dominance of the Caribbean? [ADMINISTRATIVE REGION]
  Query generation: United States' dominance of the Caribbean / involve / Secretary
[56] Which German mathematicians were members of the von Braun rocket group? [Person]
  Query generation: member von Braun rocket group / German mathematician
[57] Which writers converted to Islam? [Person]
  Query generation: writer converted Islam
[59] Which movie by the Coen brothers stars John Turturro in the role of a New York City playwright? [Place]
  Query generation: role of a New York City playwright / Coen brother stars
[60] Which of the volcanoes that erupted in 1550 is still active? [Place]
  Query generation: volcanoes / erupted in 1550 / active

Error
Columns: Question / Query generation / Error

[54] What is the name of the Viennese newspaper founded by the creator of the croissant? [Person]
  Query generation: creator of croissant / Viennese newspaper founded
  Error: cannot map "founded" to a predicate in DBpedia
[60] Which of the volcanoes that erupted in 1550 is still active? [Place]
  Query generation: volcanoes / erupted in 1550 / active
  Error: cannot generate an appropriate query to extract the answer clue

Conclusion
Hybrid QA System
  Uses both the KBQA approach and the IRQA approach
  First search the text data
  If the results are not appropriate, or the question involves arithmetic, generate a SPARQL query
Future
  Extend the query to find relevant answer clues
  Predicate Mapping
  Semantic Parsing
  Information Extraction

Tools
  Stanford co-reference tool: co-reference resolution
  DBpedia Spotlight: map NEs in the question to entities in DBpedia
  Predicate mapping: lexical matching; semantic similarity based on explicit semantic analysis (ESA)
  ClearNLP: tokenization, POS tagging and dependency parsing
  WordNet: term extraction
  Lucene: Wikipedia index
  libSVM: SAT classifier

Acknowledgements
Questions from teachers and classmates are welcome!

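Discussion: ISOFT
  Decompose the question into phrases
  Starting from the right, run text retrieval
  If the text retrieval results are irrelevant, or the question involves arithmetic, enter the SPARQL module
  Finally, filter the candidates by answer type
  What are ISOFT's strengths?
  What are ISOFT's weaknesses?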

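Discussion: What is the key to a Hybrid QA System?
  Question analysis? Predicate-argument structure?
  Splitting the question into sub-clauses? How is each sub-clause solved? What are the atomic operations?
  Phrase retrieval? Text analysis of the underlying data?
  Entity Linking? Predicate Mapping? Semantic Matching?
  SPARQL Construction?
  Candidate Answers Ranking? Filters?
  Textual Entailment? Paraphrase?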