Porting Natural Language Interfaces between Domains - An Experimental User Study with the ORAKEL System - Philipp Cimiano, Peter Haase, Jörg Heizmann Institute.

1 Porting Natural Language Interfaces between Domains - An Experimental User Study with the ORAKEL System
Philipp Cimiano, Peter Haase, Jörg Heizmann
Institute AIFB, University of Karlsruhe (TH)
Intelligent User Interfaces (IUI), January 28-31, 2007, Hawaii

2 Agenda
- Motivation
- Natural Language Interfaces
- The ORAKEL System
- Adaptation Methodology
- Experiments and Results
- Conclusion and Outlook

3 Motivation
- Electronic devices get smaller and smaller:
  - limited I/O functionality
  - need for intuitive ways of interacting with devices
  - natural language might be an interesting option for querying knowledge
- Problems of using natural language:
  - ambiguity at all levels of interpretation
  - large coverage (grammar)
  - robustness and precision
  - adaptability

4 Natural Language Interfaces (NLIs)
- Definition: a tool allowing users to query or update a database or knowledge base using (restricted or unrestricted) natural language
- Fashionable research topic in the 70s and 80s
  - Problem too complex?
  - No business models?
- Renewed interest in the new millennium:
  - Database technology (mature)
  - Semantic Web
  - More and more data
  - Electronic devices get smaller …

5 Is this a complex task?
- Intuitively easier than natural language understanding (as a whole):
  - Very focused on a particular domain and KB
  - Relatively short sentences (compared to newspaper text)
  - No discourse phenomena (no anaphora, no ellipsis)
- More complex the more you move to 'real' dialogue …

6 Challenge & Research Question
- Challenge: domain-specific interpretation of a question
- Research question: can we develop a model allowing non-NLP experts to easily port the system across domains?
- Example question: Which river flows through more cities than the Rhein?
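To make the example question concrete, here is a minimal sketch of what answering it involves once the question has been interpreted against a flowThrough(river, city) relation. The facts and the function name are hypothetical placeholders, not taken from the actual knowledge base or from ORAKEL.

```python
from collections import defaultdict

# Hypothetical flowThrough(river, city) facts; placeholder names only.
FLOW_THROUGH = [
    ("Rhein", "CityA"), ("Rhein", "CityB"),
    ("RiverX", "CityA"), ("RiverX", "CityC"), ("RiverX", "CityD"),
    ("RiverY", "CityE"),
]

def rivers_with_more_cities_than(reference, facts):
    """Return the rivers that flow through more cities than `reference`."""
    cities = defaultdict(set)
    for river, city in facts:
        cities[river].add(city)
    threshold = len(cities.get(reference, set()))
    return sorted(r for r, cs in cities.items() if len(cs) > threshold)

result = rivers_with_more_cities_than("Rhein", FLOW_THROUGH)
print(result)
```

The domain-specific part is knowing that "flows through" refers to flowThrough and that "more cities than" compares counts over its second argument; the counting itself is generic.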

7 The ORAKEL System
[Architecture diagram. Components shown: KB, General Lexicon, Domain Lexicon, FrameMapper, Query Interpreter, Query Converter, Answer Generation]

8 Query Interpreter - Compositional Semantics -
- Standard compositional semantics approach, i.e. the meaning of a question is composed of the meanings of the words and the way they are connected
- Parse tree is used to guide the incremental semantics composition
- Meaning is captured through lambda expressions
- Three composition operators:
  - Functional application (beta reduction)
  - Renaming of bound variables (alpha reduction)
  - Marked substitution
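The compositional idea can be sketched with ordinary Python lambdas. This is only an illustration under assumptions: the toy facts, the word meanings and the variable names are invented here, and only functional application is shown (renaming of bound variables and marked substitution are not modelled).

```python
# Toy flowThrough(river, city) facts; illustrative, not the real KB.
FLOW_THROUGH = {("Rhein", "Karlsruhe"), ("Rhein", "Köln"), ("Donau", "Ulm")}
RIVERS = {"Rhein", "Donau", "Neckar"}

# Lexical meanings as lambda expressions (assumed for this sketch).
karlsruhe = "Karlsruhe"  # a proper name denotes an individual
# "flows through": takes the PP object first, then the subject.
flows_through = lambda city: lambda river: (river, city) in FLOW_THROUGH
# "which river": takes a property and returns the rivers satisfying it.
which_river = lambda prop: sorted(r for r in RIVERS if prop(r))

# Composition follows the parse tree bottom-up; each step is one
# functional application (beta reduction).
vp = flows_through(karlsruhe)  # VP meaning: lambda river. flowThrough(river, Karlsruhe)
answer = which_river(vp)       # S meaning: apply the wh-phrase to the VP meaning
print(answer)  # ['Rhein']
```

Because each word contributes a function, the same lexical entries can be recombined for other questions without writing question-specific code.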

9 Query Interpreter (4) - Meaning Construction -
[Parse tree figure for a question built from "Which river", "flows", "through" and "Karlsruhe"; node labels: S, DP, VP, PP, V, P]

10 Lexicon (1-5)
- Tripartite structure of the lexicon:
  - Domain-specific lexicon (created by user)
  - Domain-independent lexicon (pre-encoded)
  - Ontology lexicon (generated from ontology)
- All lexicons are actually lexicalized grammars (consisting of trees) used for parsing and construction of the query.

11 Ontology Lexicon
- Contains lexical representations of instances and concepts
- Generated automatically from the ontology, relying on its labels
- Lexicon used to generate inflected variants, e.g. plural forms
- No manual work needed by user (!)
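A minimal sketch of this kind of generation, assuming English labels and a very naive pluralisation rule. The function names and the rule are assumptions made for illustration; the slides do not say how ORAKEL produces inflected forms beyond using a lexicon.

```python
def naive_plural(noun: str) -> str:
    """Rough English pluralisation; an assumption for illustration only."""
    if noun.endswith(("s", "x", "z", "ch", "sh")):
        return noun + "es"
    if noun.endswith("y") and len(noun) > 1 and noun[-2] not in "aeiou":
        return noun[:-1] + "ies"
    return noun + "s"

def build_ontology_lexicon(concept_labels: dict) -> dict:
    """Map each surface form (singular and plural) to its concept identifier."""
    lexicon = {}
    for concept_id, label in concept_labels.items():
        singular = label.lower()
        lexicon[singular] = concept_id
        lexicon[naive_plural(singular)] = concept_id
    return lexicon

# Hypothetical concept identifiers and labels.
labels = {"River": "river", "City": "city", "Highway": "highway"}
lexicon = build_ontology_lexicon(labels)
print(lexicon["cities"])  # City
```

Since the only input is the ontology's own labels, this step needs no work from the person porting the system.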

12 Domain-independent Lexicon
- Contains closed-class words with constant meaning across domains:
  - Determiners: every, most, the most, a, the only, the, all, no, …
  - Prepositions: after, before, in (spatial), in (temporal), …
  - Question pronouns: who, what, which, when, where, …
- Meaning is captured with respect to foundational categories, e.g. as provided by DOLCE
- (No manual work by user!)

13 Domain-specific Lexicon: Adaptation Mechanism
- Subcategorization frames: linguistic predicate-argument structures, e.g. flow(subj,pcomp(through))
- Relations in the knowledge base: flowThrough(river,city)
- Basic idea: the user performs a mapping between the arguments of a subcategorization frame and a relation in the knowledge base
- The domain-specific lexicon is generated in the background as a byproduct of the mappings performed by a lexicon engineer
- Research question: Can naive users (in the sense of being unfamiliar with computational linguistics) customize the system to work with a specific knowledge base?
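The mapping described on this slide can be pictured as a small data structure. The sketch below is hypothetical: the class and function names are invented here and do not reflect FrameMapper's internals, and it covers only the simple case where frame and relation have the same arity.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SubcatFrame:
    predicate: str         # lemma, e.g. "flow"
    arguments: tuple       # syntactic slots, e.g. ("subj", "pcomp(through)")

@dataclass(frozen=True)
class Relation:
    name: str              # KB relation, e.g. "flowThrough"
    argument_types: tuple  # e.g. ("river", "city")

@dataclass(frozen=True)
class LexiconEntry:
    frame: SubcatFrame
    relation: Relation
    slot_to_position: tuple  # pairs (syntactic slot, relation argument index)

def map_frame(frame, relation, slot_to_position):
    """Record which syntactic slot fills which argument position of the relation."""
    if set(slot_to_position) != set(frame.arguments):
        raise ValueError("every slot of the frame must be mapped exactly once")
    expected = list(range(len(relation.argument_types)))
    if sorted(slot_to_position.values()) != expected:
        raise ValueError("every relation argument must be filled exactly once")
    return LexiconEntry(frame, relation, tuple(sorted(slot_to_position.items())))

flow = SubcatFrame("flow", ("subj", "pcomp(through)"))
flow_through = Relation("flowThrough", ("river", "city"))
entry = map_frame(flow, flow_through, {"subj": 0, "pcomp(through)": 1})
```

The user only supplies the slot-to-argument pairing; everything a parser would need (the frame's syntactic shape and the relation it denotes) is then available in one entry.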

14 FrameMapper GUI

15 Type hierarchies
[Diagram of two type hierarchies, each organised by arity (2, 3, 4)]
- Subcategorization frames: Transitive, Intransitive+PP, Noun+PP, Transitive+PP, Noun+PP+PP
- Relations: Binary Relation, 2x2 Join, Ternary Relation, 2x2 Join´, 3x2 Join

16 FrameMapper GUI

17 Adaptation Methodology
[Cycle diagram. Elements shown: FrameMapper, Lexicon, ORAKEL, Questions, Failed Questions]

18 Evaluation: Goals
- First claim: Users not trained in NLP are able to create domain-specific lexica comparable to those created by NLP experts.
- Second claim: The coverage of the lexicon will improve proportionally to the number of iterations performed.

19 Evaluation: Measures
- Claims:
  - Precision / Recall should be comparable for different users (NLP and non-NLP experts)
  - Recall should increase over iterations
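The transcript does not preserve how the two measures were defined. A minimal sketch, assuming a common convention for question answering (precision = correct answers over questions the system answered, recall = correct answers over all questions asked); the function name and outcome labels are hypothetical:

```python
def precision_recall(outcomes):
    """outcomes: one of 'correct', 'wrong' or 'unanswered' per question."""
    total = len(outcomes)
    answered = sum(1 for o in outcomes if o != "unanswered")
    correct = sum(1 for o in outcomes if o == "correct")
    precision = correct / answered if answered else 0.0
    recall = correct / total if total else 0.0
    return precision, recall

# Made-up session: 10 questions, 6 answered, 5 of those correctly.
outcomes = ["correct"] * 5 + ["wrong"] + ["unanswered"] * 4
p, r = precision_recall(outcomes)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.83 recall=0.50
```

Under this convention, unanswered questions lower recall but not precision, which is why recall is the measure expected to rise as failed questions are fed back into the lexicon.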

20 Experimental Settings
- Lexicon engineers:
  - NLP expert (no training needed)
  - Master's student (self-training, no NLP knowledge)
  - Other users:
    - short explanation of types supported by ORAKEL (10 min.)
    - short training on FrameMapper (10 min.)
- End users:
  - Academic (researchers and students), industrial
  - Received a handout describing the task, the knowledge base and some restrictions on the allowed questions
  - Were supposed to ask at least 10 questions to the system
  - Were asked to confirm whether the answer provided by the system was correct (yes/no)
- Lexicon engineers developed the lexicon in several iterations, refining the lexica after being presented with the end users' questions that the system had not answered

21 German Geography Knowledge Base
- Created by students at our department in 2003.
- Contains information about cities (+ the states where they are located), rivers (+ the cities they pass), highways (+ the cities they pass), states, capitals of states etc.
- KB represented in F-Logic
Type        #
Cities      106
States      16
Rivers      18
Highways    108
Countries   9
Seas        2

22 British Telecom's digital library
- A digital library created and maintained by British Telecom
- Used as a case study within the SEKT Project
- Metadata stored in a database which was mapped to the Proton ontology (and thus accessible through KAON2) for querying
- OWL/SPARQL instead of F-Logic
Type        #
Authors     67,015
Documents   33,501
Topics      17,174

23 Geography Knowledge Base
- Goal: compare lexica engineered by NLP and non-NLP experts with respect to ORAKEL performance in terms of precision and recall
- Setting:
  - NLP expert (A) constructed lexicon from scratch
  - two non-NLP experts (B + C) over two rounds (30 min training, 2 x 30 min)
  - 24 end users (8 + 2*4 + 2*4), asking at least 10 questions
- Conclusions:
  - Comparable results for A as well as B and C (after 2 iterations)
  - Results (in terms of recall) clearly improve after lexicon modification
Lexicon    #Users   Rec. (avg)   Prec. (avg)
A          8        53.67%       84.23%
B (1st)    4        44.39%       74.53%
B (2nd)    4        45.15%       80.95%
C (1st)    4        35.41%       82.25%
C (2nd)    4        47.66%       80.60%

24 BT's digital library
- Master's student as lexicon engineer constructed the lexicon in three iterations (6 h + 2 x 30 min)
- 12 users (three querying rounds with 4 users each)
- Conclusions:
  - Average recall shows clear improvement over the three rounds
  - ORAKEL can in principle scale to much larger knowledge bases
Iteration   Rec. (avg.)   Prec. (avg.)
1           42%           52%
2           49%           71%
3           61%           73%

25 Related Work
- No domain adaptation needed: exploit lexical matches
  - PRECISE [Popescu et al. 2003]
  - Aqualog [Lopez et al. 2004]
  - Principled limits: relation modeled as authorOf(x,y), question asked as "Who wrote what?"
- Engineering expertise required:
  - Quetal [Frank et al. 2006]
  - ACE [Fuchs et al. 2006]

26 Conclusion
- The adaptation mechanism and iterative methodology seem suitable for end users not familiar with natural language processing
- This has been corroborated by experimental validation showing that:
  - Precision/Recall are comparable for NLP and non-NLP experts
  - Recall improves proportionally to the iterations
  - Precision is quite reasonable (73-82%)

27 The longer-term vision
- Benefits:
  - Lexica are reused and minor changes performed
  - Not everybody has to develop a lexicon
[Diagram. Elements shown: portal.owl, portal.lex.owl, ORAKEL]

