Natural Language Interfaces to Ontologies LarKc PhD symposium, Beijing, 14 November 2010 Danica Damljanović University of Sheffield

1 Natural Language Interfaces to Ontologies LarKc PhD symposium, Beijing, 14 November 2010 Danica Damljanović University of Sheffield danica@dcs.shef.ac.uk

2 Introduction What are Natural Language Interfaces? What are Conceptual Models? Who are the users? – application developers – end users

3 What are Natural Language Interfaces to Ontologies?

4 Customisation [Diagram labels: Ontology editing (e.g. using Protege); Domain lexicon; NLI for querying; …; Domain knowledge; WordNet; Domain expert; Ontology engineer; NLI for Ontology authoring]

5 The Objective Increase usability of Natural Language Interfaces to ontologies – For end users: increase precision and recall – For application developers: decrease the time for customisation

6 Challenges: Natural Language – Ambiguity – Expressiveness

7 Challenges: Portability – Natural Language vs. formal language – Knowledge structure

8 Portability vs. Performance [Figure: grid of dataset size (large datasets, several domains; small datasets, narrow domain) against question type (simple factual/CNL questions; complex/free-text questions), with cells labelled moderate precision, lowest precision, highest precision and high precision.] Damljanovic, D., Bontcheva, K.: Towards Enhanced Usability of Natural Language Interfaces to Knowledge Bases. In Devedzic V. and Gasevic D. (Eds.), Special issue on Semantic Web and Web 2.0, Annals of Information Systems, Springer-Verlag, 2009.

9 Controlled Natural Language: a limited subset of a natural language that is translated into a formal language – supports a restricted vocabulary and grammar – balances ambiguity and expressiveness – portability: the vocabulary (lexicon) is built automatically from the ontology structure – remains formal and needs to be learnt by end users

10 Previous Work: QuestIO 1.15 1.19 compare

11 But... Ontologies are not perfect: – ontology lexicalisations often missing or too many – ranking based on ontology structure might be misleading Encouraging users to use keywords might be misleading User evaluation: – defined tasks: user satisfaction reaching 90% – undefined tasks: user satisfaction low (~44%)

12 The user’s vocabulary The lexicalisations in the ontology often do not match those used in the users’ questions – vocabulary can be extended by using tools such as WordNet for synonyms – still, especially for very specific domains, WordNet would not find usable lexicalisations – but what about the user’s vocabulary?

13 FREyA - Feedback, Refinement, Extended Vocabulary Aggregator
Feedback: showing the user the system's interpretation of the query
Refinement: – resolving ambiguity: generating a dialog whenever one term refers to more than one concept in the ontology (precision)
Extended Vocabulary: – expressiveness: generating a dialog whenever an “unknown” term appears in the question (recall) – portability: no need for customisation from application developers
The dialog: – generated by combining syntactic parsing and ontology-based lookup – learns from the user’s selections

14 FREyA Workflow [Diagram labels: NL query; POCs; OCs; triples; SPARQL; answer; learn; Identify the Answer Type; Answer Type. Legend: Potential Ontology Concept (POC), Ontology Concept (OC)]

15 Mapping POCs to OCs – Extract all ontology lexicalisations (lemmas) – Perform lexicon-based lookup – Analyse grammar to find Potential Ontology Concepts (POCs) – Generate the dialog – Add the POC to the lexicon
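The steps on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not FREyA's actual implementation: the `Lexicon` type, the `map_poc_to_oc` function and the `ask_user` callback are hypothetical names, and syntactic parsing is assumed to have already produced the POC string.

```python
from typing import Callable, Dict, List

# Hypothetical lexicon type: lower-cased term -> candidate ontology concepts (OCs).
Lexicon = Dict[str, List[str]]
# Hypothetical dialog callback: (term, options) -> the option the user picked.
AskUser = Callable[[str, List[str]], str]


def map_poc_to_oc(poc: str, lexicon: Lexicon, suggestions: List[str],
                  ask_user: AskUser) -> str:
    """Map one Potential Ontology Concept to an Ontology Concept."""
    key = poc.lower()
    known = lexicon.get(key, [])
    if len(known) == 1:
        return known[0]               # unambiguous lexicon hit: no dialog needed
    if known:
        return ask_user(poc, known)   # ambiguous term: let the user disambiguate
    # Unknown term: ask the user, then add the POC to the lexicon (learning step).
    choice = ask_user(poc, suggestions)
    lexicon[key] = [choice]
    return choice


lexicon: Lexicon = {"state": ["geo:State"], "new york": ["geo:State", "geo:City"]}
pick_first: AskUser = lambda term, options: options[0]  # stands in for a real user
print(map_poc_to_oc("area", lexicon, ["geo:stateArea", "geo:statePopulation"], pick_first))
```

After the call, "area" is stored in the lexicon, so the same question would not trigger a dialog again in this sketch.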

16 Find Potential Ontology Concepts CNL 2010, Marettimo, Sicily

17 Finding Ontology Concepts

18 Mapping POC to OCs: Ambiguities [Diagram labels: POC; new york; population; geo:City; geo:State; geo:cityPopulation]

19 New York is a city

20 New York is a state

21 Ambiguous Lexicon
Columns: IF POC, OC (context); THEN candidate OC, function
new york; geo:State; -
new york; geo:City; -
population; geo:State; geo:statePopulation; -
population; geo:City; geo:cityPopulation; -
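One way to picture the IF/THEN lexicon is as a mapping from (POC, context) pairs to candidate concepts. The sketch below is an illustration only: `RULES` and `candidates` are hypothetical names, and reading the two "new york" rows as having no context is an assumption about the slide's table.

```python
from typing import Dict, List, Optional, Tuple

Condition = Tuple[str, Optional[str]]   # IF: (POC, OC used as context or None)
Outcome = Tuple[str, Optional[str]]     # THEN: (candidate OC, function or None)

# Rows taken from the slide; the None contexts for "new york" are an assumption.
RULES: Dict[Condition, List[Outcome]] = {
    ("new york", None): [("geo:State", None), ("geo:City", None)],
    ("population", "geo:State"): [("geo:statePopulation", None)],
    ("population", "geo:City"): [("geo:cityPopulation", None)],
}


def candidates(poc: str, context: Optional[str] = None) -> List[Outcome]:
    """Return every THEN-side outcome whose IF side matches the POC and context."""
    return RULES.get((poc.lower(), context), [])


# "new york" alone stays ambiguous (two candidates), so a dialog would be needed;
# once the context is known, "population" resolves to a single property.
print(candidates("New York"))
print(candidates("population", "geo:City"))
```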

22 The User Controls the Output [Diagram labels: POC; state; area; point; geo:State; geo:stateArea; geo:isLowestPointOf; geo:LoPoint; geo:loElevation; max; min]

23 What is the lowest point of the state with the largest area?
TRIPLES:
?firstJoker – geo:isLowestPointOf – geo:State
geo:State – (max) geo:stateArea – ?lastJoker
SPARQL:
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
select ?firstJoker ?p0 ?c1 ?p2 ?lastJoker
where {
  { { ?c1 ?p0 ?firstJoker } UNION { ?firstJoker ?p0 ?c1 } .
    filter (?p0 = <http://www.mooney.net/geo#isLowestPointOf>) . }
  ?c1 rdf:type <http://www.mooney.net/geo#State> .
  ?c1 ?p2 ?lastJoker .
  filter (?p2 = <http://www.mooney.net/geo#stateArea>) .
}
ORDER BY DESC(xsd:double(?lastJoker))

24 What is the lowest point of the state with the largest area?
TRIPLES:
?firstJoker – (min) geo:loElevation – geo:LoPoint
geo:LoPoint – ?joker3 – geo:State
geo:State – (max) geo:stateArea – ?lastJoker
SPARQL:
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
select ?firstJoker ?p0 ?c1 ?joker3 ?c2 ?p3 ?lastJoker
where {
  ?c1 ?p0 ?firstJoker .
  filter (?p0 = <http://www.mooney.net/geo#loElevation>) .
  ?c1 rdf:type <http://www.mooney.net/geo#LoPoint> .
  { { ?c2 ?joker3 ?c1 } UNION { ?c1 ?joker3 ?c2 } }
  ?c2 rdf:type <http://www.mooney.net/geo#State> .
  ?c2 ?p3 ?lastJoker .
  filter (?p3 = <http://www.mooney.net/geo#stateArea>) .
}
ORDER BY ASC(xsd:double(?firstJoker)) DESC(xsd:double(?lastJoker))
The answer for both queries is Death Valley.

25 New Lexicon
Columns: IF POC, OC (context); THEN candidate OC, function
area; geo:State; geo:stateArea; -
largest; geo:stateArea; max
point; geo:State; geo:LoPoint; -
lowest; geo:LoPoint; geo:loElevation; min
lowest; geo:isLowestPointOf; -; -

26 JSON
"Key: largest http://www.mooney.net/geo#State", "identifier": "http://www.mooney.net/geo#stateArea", "function": "max", "score": "0.89"
"Key: area http://www.mooney.net/geo#State", "identifier": "http://www.mooney.net/geo#stateArea", "function": "", "score": "0.89"

27 Learning

28 FREyA: a Natural Language Interface to Ontologies 03 June 2010, ESWC 2010 http://gate.ac.uk/freya

29 Answer Type Identification

30 No Answer Type Identifier (WH-phrases) Found? Often means we could derive the answer type without engaging the user

31 Evaluation: correctness – Mooney GeoQuery dataset, 250 questions – 34 required no dialog, 14 failed to be answered – Precision = recall = 94.4%
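The 94.4% figure is consistent with treating every question that did not fail as correctly answered. The short check below makes that arithmetic explicit; the interpretation (236 correct out of 250, with precision and recall sharing the same denominator) is an assumption drawn from the slide's numbers, and the variable names are illustrative.

```python
total_questions = 250
failed = 14

# Assumption: every question that did not fail was answered correctly.
correct = total_questions - failed

# With one answer attempted per question, precision and recall coincide.
score_percent = round(100 * correct / total_questions, 1)
print(correct, score_percent)
```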

32 Evaluation: Learning – 10-fold cross-validation, 202 Mooney GeoQuery questions that could be correctly mapped into SPARQL and required dialog, from 0.25 to 0.48 – Errors: ambiguity and sparseness

33 Evaluation: Ranking – Mean Reciprocal Rank: 0.76 (default ranking based on string similarity and synonym detection)
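Mean Reciprocal Rank averages 1/rank of the correct suggestion over all dialogs, so a score of 1.0 would mean the correct option was always listed first. The sketch below shows the standard definition; the function name and the sample ranks are made up for illustration and are not data from the evaluation.

```python
from typing import List, Optional


def mean_reciprocal_rank(ranks: List[Optional[int]]) -> float:
    """Average of 1/rank; a dialog whose correct option is missing (None) scores 0."""
    if not ranks:
        raise ValueError("at least one dialog is required")
    return sum(1.0 / r if r else 0.0 for r in ranks) / len(ranks)


# Hypothetical positions (1-based) of the correct option in four dialogs.
sample_ranks = [1, 2, 1, 4]
print(mean_reciprocal_rank(sample_ranks))
```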

34 Learning the Correct Ranking – Randomly selected 103 dialogs from 202 questions (343 dialogs) – MRR increased by 0.06 (6 percentage points), from 0.72 to 0.78

35 Evaluation: Answer Type

36 Evaluation: Customisation – Small empirical evaluation with 1 subject who is not familiar with ontologies or NLP – No training, short introduction to the domain – 17 questions asked in total; 3 were cancelled by the user during one of the dialogs – Of the remaining 14: 78.57% correctly answered, 21.43% failed or incorrectly answered
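The two percentages match 11 and 3 questions out of the 14 that were not cancelled. The counts of 11 and 3 are inferred from the percentages rather than stated on the slide, so the check below treats them as assumptions.

```python
asked = 17
cancelled = 3
completed = asked - cancelled

# Assumed counts, inferred from the reported percentages.
correct = 11
failed = completed - correct

correct_percent = round(100 * correct / completed, 2)
failed_percent = round(100 * failed / completed, 2)
print(completed, correct_percent, failed_percent)
```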

37 Evaluation: customisation (continued)

38 Conclusion Combining syntactic parsing with ontology-based lookup through user interaction can increase the precision and recall of NLIs to ontologies, while reducing the time for customisation by shifting it from application developers to end users.

39 Next steps – Improvement of the learning model to avoid errors due to ambiguities, e.g. point: geo:HiPoint or geo:LoPoint – Using the lexicon to improve other systems

40 More information...
D. Damljanovic, M. Agatonovic, H. Cunningham: Natural Language Interfaces to Ontologies: Combining Syntactic Analysis and Ontology-based Lookup through the User Interaction. In Proceedings of the 7th Extended Semantic Web Conference (ESWC 2010), Springer Verlag, Heraklion, Greece, May 31-June 3, 2010.
D. Damljanovic, M. Agatonovic, H. Cunningham: Identification of the Question Focus: Combining Syntactic Analysis and Ontology-based Lookup through the User Interaction. In Proceedings of the 7th Language Resources and Evaluation Conference (LREC 2010), ELRA 2010, La Valletta, Malta, May 17-23, 2010.
D. Damljanovic: Towards portable controlled natural languages for querying ontologies. In Rosner, M., Fuchs, N., eds.: Proceedings of the 2nd Workshop on Controlled Natural Language. Lecture Notes in Computer Science. Springer Berlin/Heidelberg, Marettimo Island, Sicily (September 2010).

