Natural Language Interfaces to Ontologies
LarKC PhD Symposium, Beijing, 14 November 2010
Danica Damljanović, University of Sheffield

Introduction
What are Natural Language Interfaces?
What are Conceptual Models?
Who are the users?
– application developers
– end users

What are Natural Language Interfaces to Ontologies?

Customisation
[Diagram labels: Ontology editing (e.g. using Protégé); Domain lexicon; NLI for querying; …; Domain knowledge; WordNet; Domain expert; Ontology engineer; NLI for ontology authoring]

The Objective
Increase usability of Natural Language Interfaces to ontologies
– For end users: increase precision and recall
– For application developers: decrease the time for customisation

Challenges: Natural Language
– Ambiguity
– Expressiveness

Challenges: Portability
– Natural language vs. formal language
– Knowledge structure

Portability vs. Performance
[Chart labels: moderate precision; lowest precision; highest precision; high precision; large datasets (several domains); small datasets (narrow domain); simple factual/CNL questions; complex/free-text questions]
Damljanovic, D., Bontcheva, K.: Towards Enhanced Usability of Natural Language Interfaces to Knowledge Bases. In Devedzic, V. and Gasevic, D. (Eds.), Special issue on Semantic Web and Web 2.0, Annals of Information Systems, Springer-Verlag, 2009.

Controlled Natural Language
A limited subset of a natural language which is translated into a formal language
– supports a certain vocabulary and grammar
– balances ambiguity and expressiveness
Portability: the vocabulary (lexicon) is built automatically from the ontology structure
Remains formal and needs to be learnt by end users
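As a rough illustration of building a lexicon automatically from the ontology structure, the sketch below derives a lexicalisation from each identifier by splitting camelCase names. The functions `lexicalise` and `build_lexicon` are hypothetical names introduced here, and the splitting heuristic is an assumption, not necessarily what any of the systems in this talk do.

```python
import re


def lexicalise(identifier: str) -> str:
    """Turn an identifier such as 'geo:cityPopulation' into 'city population'."""
    # Drop the namespace prefix, if any.
    local_name = identifier.split(":", 1)[-1]
    # Insert a space at every lower-case/digit to upper-case boundary (camelCase).
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", local_name)
    # Normalise underscores, hyphens and repeated whitespace, then lower-case.
    return re.sub(r"[_\-\s]+", " ", spaced).strip().lower()


def build_lexicon(identifiers):
    """Map each derived lexicalisation to the identifiers that produce it."""
    lexicon = {}
    for identifier in identifiers:
        lexicon.setdefault(lexicalise(identifier), []).append(identifier)
    return lexicon


lexicon = build_lexicon(
    ["geo:City", "geo:State", "geo:cityPopulation", "geo:stateArea", "geo:LoPoint"]
)
```

A lexicon derived this way only contains the words the ontology engineer happened to use in identifiers, which is why it can miss the words end users actually type.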

Previous Work: QuestIO compare

But...
Ontologies are not perfect:
– ontology lexicalisations are often missing or too numerous
– ranking based on ontology structure might be misleading
Encouraging users to use keywords might be misleading
User evaluation:
– defined tasks: user satisfaction reaching 90%
– undefined tasks: user satisfaction low (~44%)

The user’s vocabulary
The lexicalisations in the ontology often do not match those used in the users’ questions
– vocabulary can be extended by using tools such as WordNet for synonyms
– still, especially for very specific domains, WordNet would not find usable lexicalisations
But what about the user’s vocabulary?

FREyA - Feedback, Refinement, Extended Vocabulary Aggregator
Feedback: showing the user the system's interpretation of the query
Refinement:
– resolving ambiguity: generating a dialog whenever one term refers to more than one concept in the ontology (precision)
Extended Vocabulary:
– expressiveness: generating a dialog whenever an “unknown” term appears in the question (recall)
– portability: no need for customisation from application developers
The dialog:
– generated by combining the syntactic parsing and ontology-based lookup
– learns from the user’s selections
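The two dialog triggers described on this slide can be sketched as a simple lookup: no match means an extended-vocabulary dialog, several matches mean a refinement dialog. The function name `dialog_kind`, the return labels and the toy lexicon are all assumptions made for illustration; this is not FREyA's actual code.

```python
def dialog_kind(term, lexicon):
    """Decide which kind of clarification dialog, if any, a question term needs.

    lexicon maps a lower-cased term to the list of ontology concepts it may denote.
    """
    candidates = lexicon.get(term.lower(), [])
    if not candidates:
        # Unknown term: ask the user what it means (helps recall).
        return "extended vocabulary"
    if len(candidates) > 1:
        # Term denotes several concepts: ask the user to pick one (helps precision).
        return "refinement"
    # Exactly one concept: no dialog needed.
    return None


# Toy lexicon, for illustration only.
toy_lexicon = {
    "new york": ["geo:State", "geo:City"],
    "state": ["geo:State"],
}
```

With this toy lexicon, "new york" would trigger a refinement dialog and "area" an extended-vocabulary dialog, while "state" would be mapped silently.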

FREyA Workflow
[Diagram: NL query → POCs → OCs → triples → SPARQL → answer, with "learn" and "Identify the Answer Type" / "Answer Type" steps. POC = Potential Ontology Concept; OC = Ontology Concept.]

Mapping POCs to OCs
Extract all ontology lexicalisations (lemmas)
Perform lexicon-based lookup
Analyse grammar to find Potential Ontology Concepts (POCs)
Generate the dialog
Add the POC to the lexicon
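The steps above can be sketched as a small loop in which a callback stands in for the user dialog. Everything here is a simplification under stated assumptions: `map_pocs` and `choose` are hypothetical names, the POCs are assumed to have been found already by the syntactic analysis, and the learned entry ignores context and scores.

```python
def map_pocs(pocs, lexicon, candidate_ocs, choose):
    """Map each Potential Ontology Concept (POC) to an Ontology Concept (OC).

    pocs          -- terms found by syntactic analysis (assumed given here)
    lexicon       -- dict: term -> list of known OCs; updated in place
    candidate_ocs -- OCs offered to the user when a term is unknown
    choose        -- callback standing in for the dialog: (poc, options) -> selected OC
    """
    mapping = {}
    for poc in pocs:
        known = lexicon.get(poc, [])
        if len(known) == 1:
            # Unambiguous lexicon hit: no dialog needed.
            mapping[poc] = known[0]
            continue
        # Ambiguous (several OCs) or unknown (none): generate the dialog.
        options = known if known else candidate_ocs
        selected = choose(poc, options)
        # Add the POC to the lexicon so the answer is remembered.
        lexicon[poc] = [selected]
        mapping[poc] = selected
    return mapping


# Illustration: a "user" who always picks the first suggestion.
example_lexicon = {"state": ["geo:State"]}
example_mapping = map_pocs(
    ["state", "area"],
    example_lexicon,
    ["geo:stateArea", "geo:statePopulation"],
    lambda poc, options: options[0],
)
```

After the call, "area" is stored in the lexicon, so asking about it again would not need a dialog.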

Find Potential Ontology Concepts
CNL 2010, Marettimo, Sicily

Finding Ontology Concepts

Mapping POC to OCs: Ambiguities
[Diagram labels: new york; population; POC; geo:City; geo:State; geo:cityPopulation]

New York is a city

New York is a state

Ambiguous Lexicon
IF (POC, OC context) THEN (candidate OC, function)
POC | OC (context) | candidate OC | function
new york | | geo:State | -
new york | | geo:City | -
population | geo:State | geo:statePopulation | -
population | geo:City | geo:cityPopulation | -
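One way to picture the IF/THEN lexicon is as a table keyed by the term and its context. The sketch below assumes that the "new york" rows have no context and that a missing function is represented by `None`; the names `RULES` and `resolve` are hypothetical.

```python
# (POC, OC context) -> list of (candidate OC, function) pairs.
# The row layout is an assumed reading of the slide's table.
RULES = {
    ("new york", None): [("geo:State", None), ("geo:City", None)],
    ("population", "geo:State"): [("geo:statePopulation", None)],
    ("population", "geo:City"): [("geo:cityPopulation", None)],
}


def resolve(poc, context=None):
    """Return the candidate (OC, function) pairs for a term in a given context."""
    return RULES.get((poc, context), [])


# Once the user settles "new york" as a city, "population" is no longer ambiguous.
population_for_city = resolve("population", "geo:City")
```

The point of the context column is visible here: "population" alone has two readings, but each (term, context) pair has exactly one.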

The User Controls the Output
[Diagram labels: POC; state; area; point; geo:State; geo:stateArea; geo:isLowestPointOf; geo:LoPoint; geo:loElevation; max; min]

What is the lowest point of the state with the largest area?
TRIPLES:
?firstJoker – geo:isLowestPointOf – geo:State
geo:State – (max) geo:stateArea – ?lastJoker
SPARQL:
prefix rdf:
prefix xsd:
select ?firstJoker ?p0 ?c1 ?p2 ?lastJoker
where {
  { { ?c1 ?p0 ?firstJoker} UNION { ?firstJoker ?p0 ?c1}.
    filter (?p0= ). }
  ?c1 rdf:type.
  ?c1 ?p2 ?lastJoker.
  filter (?p2= ).
}
ORDER BY DESC(xsd:double(?lastJoker))

What is the lowest point of the state with the largest area?
TRIPLES:
?firstJoker – (min) geo:loElevation – geo:LoPoint
geo:LoPoint – ?joker3 – geo:State
geo:State – (max) geo:stateArea – ?lastJoker
SPARQL:
prefix rdf:
prefix xsd:
select ?firstJoker ?p0 ?c1 ?joker3 ?c2 ?p3 ?lastJoker
where {
  ?c1 ?p0 ?firstJoker.
  filter (?p0= ).
  ?c1 rdf:type.
  {{ ?c2 ?joker3 ?c1 } UNION { ?c1 ?joker3 ?c2 }}
  ?c2 rdf:type.
  ?c2 ?p3 ?lastJoker.
  filter (?p3= ).
}
ORDER BY ASC(xsd:double(?firstJoker)) DESC(xsd:double(?lastJoker))
The answer for both is Death Valley
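In both queries the (min) and (max) annotations on the triples show up as ASC and DESC sort keys in the ORDER BY clause. A minimal sketch of that step, assuming the annotated variables are passed in as (variable, function) pairs; `order_by_clause` is a hypothetical helper and only covers this part of query generation.

```python
# Sort direction for each function: the minimum sorts first when ascending,
# the maximum sorts first when descending.
DIRECTION = {"min": "ASC", "max": "DESC"}


def order_by_clause(annotated_vars):
    """Build a SPARQL ORDER BY clause from (variable, function) pairs."""
    keys = [
        f"{DIRECTION[function]}(xsd:double({variable}))"
        for variable, function in annotated_vars
    ]
    return "ORDER BY " + " ".join(keys) if keys else ""


# The two annotated variables from the second interpretation.
clause = order_by_clause([("?firstJoker", "min"), ("?lastJoker", "max")])
```

Casting to `xsd:double` mirrors the slides and makes the sort numeric rather than lexical.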

New Lexicon
IF (POC, OC context) THEN (candidate OC, function)
POC | OC (context) | candidate OC | function
area | geo:State | geo:stateArea | -
largest | geo:stateArea | | max
point | geo:State | geo:LoPoint | -
lowest | geo:LoPoint | geo:loElevation | min
lowest | geo:isLowestPointOf | - | -

JSON
Key: largest
"identifier": " ", "function": "max", "score": "0.89"
Key: area
"identifier": " ", "function": " ", "score": "0.89"

Learning

FREyA: a Natural Language Interface to Ontologies
03 June 2010, ESWC

Answer Type Identification

No Answer Type Identifier (WH-phrases) Found?
Often means we could derive the answer type without engaging the user

Evaluation: correctness
– Mooney GeoQuery dataset, 250 questions
– 34 required no dialog, 14 failed to be answered
– Precision = recall = 94.4%
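The reported figure is consistent with simple counting, assuming that every question that did not fail was answered correctly and that every question received some answer (so precision and recall coincide). The variable names below are introduced only for this check.

```python
total_questions = 250
failed = 14
no_dialog = 34

# Questions answered correctly, under the assumption stated above.
answered_correctly = total_questions - failed

# With one answer attempt per question, precision and recall are the same ratio.
precision = recall = answered_correctly / total_questions

# Correctly answered questions that needed at least one dialog.
needing_dialog = answered_correctly - no_dialog

print(f"precision = recall = {precision:.1%}")
print(f"questions needing dialog: {needing_dialog}")
```

The second number matches the 202 questions used in the learning evaluation on the next slide.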

Evaluation: Learning
– 10-fold cross-validation, 202 Mooney GeoQuery questions that could be correctly mapped into SPARQL and required dialog, from 0.25 to 0.48
– Errors: ambiguity and sparseness

Evaluation: Ranking
– Mean Reciprocal Rank: 0.76 (default ranking based on string similarity and synonym detection)
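For reference, Mean Reciprocal Rank averages 1/rank of the correct suggestion over all dialogs, so a value of 1.0 would mean the correct option is always listed first. The ranks in the sketch below are made up for illustration and are not data from the evaluation.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank, where rank is the 1-based position of the correct option."""
    if not ranks:
        raise ValueError("at least one rank is required")
    return sum(1.0 / rank for rank in ranks) / len(ranks)


# Illustrative ranks only: the correct suggestion was 1st, 2nd, 1st and 4th.
example_mrr = mean_reciprocal_rank([1, 2, 1, 4])
```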

Learning the Correct Ranking
– Randomly selected 103 dialogs from 202 questions (343 dialogs)
– MRR increased by 6% (absolute), from 0.72 to 0.78

Evaluation: Answer Type

Evaluation: Customisation
Small empirical evaluation with 1 subject who is not familiar with ontologies and NLP
No training, short introduction into the domain
17 questions asked in total; 3 were cancelled by the user during one of the dialogs
78.57% correctly answered (11 of the remaining 14)
21.43% failed or incorrectly answered (3 of 14)

Evaluation: customisation (continued)

Conclusion
Combining syntactic parsing with ontology-based lookup through user interaction can increase the precision and recall of NLIs to ontologies, while reducing the time for customisation by shifting it from application developers to end users.

Next steps
Improvement of the learning model to avoid errors due to ambiguities
– point → geo:HiPoint or geo:LoPoint
Using lexicon to improve other systems

More information...
D. Damljanovic, M. Agatonovic, H. Cunningham: Natural Language Interfaces to Ontologies: Combining Syntactic Analysis and Ontology-based Lookup through the User Interaction. In Proceedings of the 7th Extended Semantic Web Conference (ESWC 2010), Springer Verlag, Heraklion, Greece, May 31-June 3.
D. Damljanovic, M. Agatonovic, H. Cunningham: Identification of the Question Focus: Combining Syntactic Analysis and Ontology-based Lookup through the User Interaction. In Proceedings of the 7th Language Resources and Evaluation Conference (LREC 2010), ELRA 2010, La Valletta, Malta, May 17-23.
D. Damljanovic: Towards Portable Controlled Natural Languages for Querying Ontologies. In Rosner, M., Fuchs, N. (Eds.): Proceedings of the 2nd Workshop on Controlled Natural Language. Lecture Notes in Computer Science. Springer Berlin/Heidelberg, Marettimo Island, Sicily (September 2010).