Ontology-Based Free-Form Query Processing for the Semantic Web Mark Vickers Brigham Young University MS Thesis Defense Supported by:

1 Ontology-Based Free-Form Query Processing for the Semantic Web Mark Vickers Brigham Young University MS Thesis Defense Supported by:

2 Presentation Overview
  Web Queries
  Explanation of AskOntos
  Demo
  Evaluation
  Future Work and Conclusion

3 Web Queries: Challenges
  Example: Searching for a car
  Cannot specify constraints
  Documents returned (usually too many)
  Takes time to read through documents
  Determine relevance
  Find information (price, year, etc.)

4 Web Queries: Opportunities
  Semantic web
  Proposed ontology-based framework for making information machine-readable
  Uses markup languages to identify information
  “[A] search program can look for only those pages that refer to a precise concept…” -Tim Berners-Lee
  How should semantic web be searched?

5 Solution: AskOntos – a Query System for the Semantic Web
  Allows free-form queries over semantically annotated pages
  Processes queries using information extraction
  Returns tables of extracted values

6 AskOntos Overview

7 Extraction Ontologies
  Object sets
  Relationship sets
  Participation constraints
  Lexical
  Non-lexical
  Primary object set
  Aggregation
  Generalization/Specialization

8 Extraction Ontologies
  Data Frame:
    Internal Representation: float
    Value Phrase
      Value Expression: \s*[$]\s*(\d{1,3})*(\.\d{2})?
      Left Context: $
    Key Word Phrase
      Key Word Expression: ([Pp]rice)|([Cc]ost)| …
    Operation Phrase
      Operator: >
      Expression: (more\s*than)|(more\s*costly)|…
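As a rough illustration of how the parts of this data frame could be applied to a query, the Python sketch below compiles the slide's expressions as written (the alternatives elided with … are left out) and reports what each one recognizes. The helper names `to_float` and `recognize_price` and the sample query are assumptions made for this example; they are not taken from AskOntos.

```python
import re

# Expressions copied from the slide; alternatives elided there ("…") are omitted.
VALUE_EXPR = re.compile(r"\s*[$]\s*(\d{1,3})*(\.\d{2})?")
KEYWORD_EXPR = re.compile(r"([Pp]rice)|([Cc]ost)")
MORE_THAN_EXPR = re.compile(r"(more\s*than)|(more\s*costly)")


def to_float(matched_text):
    """Convert a matched value phrase to the internal representation (float)."""
    digits = matched_text.replace("$", "").strip()
    return float(digits) if digits else None


def recognize_price(query):
    """Hypothetical helper: report the keyword, operator, and value found in a query."""
    value_match = VALUE_EXPR.search(query)
    return {
        "keyword": bool(KEYWORD_EXPR.search(query)),
        "operator": ">" if MORE_THAN_EXPR.search(query) else None,
        "value": to_float(value_match.group(0)) if value_match else None,
    }


print(recognize_price("Show me cars that cost more than $5000"))
# {'keyword': True, 'operator': '>', 'value': 5000.0}
```

Note that the value expression as written has no provision for thousands separators, so a price such as $5,000 would only be matched up to the comma.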

9 Annotating Web Pages

10 Annotating Web Pages

11 Step 1. Parse Query
  “Find me the price and mileage of all red Nissans – I want a 1996 or newer”
  Matched words: price, mileage, red, Nissan, 1996, or newer
  “or newer”: >= Operator

12 Step 2. Find Related Ontology
  “Find me the price and mileage of all red Nissans – I want a 1996 or newer”
  Similarity value: 5
  Similarity value: 2
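The slide does not say how the similarity value is computed. The Python sketch below assumes the simplest reading: each ontology scores one point for every object set whose recognizer matches something in the query, and the highest-scoring ontology is selected. The two ontologies, their object-set names, and their patterns are hypothetical and chosen only so that the example query scores 5 and 2.

```python
import re

# Hypothetical recognizers: ontology name -> {object set name: pattern}.
ONTOLOGIES = {
    "car": {
        "Price": r"\bprice\b",
        "Mileage": r"\bmileage\b",
        "Color": r"\b(red|blue|green)\b",
        "Make": r"\b(nissan|ford|toyota)",
        "Year": r"\b(19|20)\d{2}\b",
    },
    "house": {
        "Price": r"\bprice\b",
        "Bedrooms": r"\bbedrooms?\b",
        "YearBuilt": r"\b(19|20)\d{2}\b",
    },
}


def similarity(query, recognizers):
    """Count the object sets whose pattern matches somewhere in the query."""
    text = query.lower()
    return sum(1 for pattern in recognizers.values() if re.search(pattern, text))


def best_ontology(query):
    """Return the highest-scoring ontology name together with all scores."""
    scores = {name: similarity(query, recs) for name, recs in ONTOLOGIES.items()}
    return max(scores, key=scores.get), scores


query = "Find me the price and mileage of all red Nissans - I want a 1996 or newer"
print(best_ontology(query))
# ('car', {'car': 5, 'house': 2})
```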

13 Step 3. Formulate XQuery Expression
  Conjunctive and aggregate queries run over selected ontology’s extracted values
  Value-phrase-matching words determine conditions
  Conditions:
    Color = “red”
    Make = “Nissan”
    Year >= 1996 (>= Operator)
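One way to picture how value-phrase matches turn into conditions is the Python sketch below. It is an assumption-laden simplification: the patterns in `VALUE_PATTERNS`, the single operator phrase "or newer", and the rule that an operator phrase must directly follow the matched value are all invented for this example and are not taken from AskOntos.

```python
import re

# Hypothetical value patterns, one per object set of a car ontology.
VALUE_PATTERNS = {
    "Color": r"\b(red|blue|green|black|white)\b",
    "Make": r"\b(Nissan|Ford|Toyota)",
    "Year": r"\b((?:19|20)\d{2})\b",
}

# Hypothetical operator phrases, tested against the text right after a value.
OPERATOR_PATTERNS = [(r"\s*or\s+newer\b", ">=")]


def extract_conditions(query):
    """Return (object set, operator, value) triples; the operator defaults to '='."""
    conditions = []
    for name, pattern in VALUE_PATTERNS.items():
        match = re.search(pattern, query, re.IGNORECASE)
        if match is None:
            continue
        operator = "="
        rest = query[match.end():]
        for op_pattern, op in OPERATOR_PATTERNS:
            if re.match(op_pattern, rest, re.IGNORECASE):
                operator = op
        conditions.append((name, operator, match.group(1)))
    return conditions


query = "Find me the price and mileage of all red Nissans - I want a 1996 or newer"
print(extract_conditions(query))
# [('Color', '=', 'red'), ('Make', '=', 'Nissan'), ('Year', '>=', '1996')]
```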

14 Step 3. Formulate XQuery Expression
  For
  Let
  Where
  Return
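To make the Where and Return parts concrete, the Python sketch below assembles those two clauses as strings from a list of conditions and a list of requested names. It follows the shape of the expression shown on the Query Translation Metrics slide, where each comparison is paired with `empty(...)` and values are quoted. The For and Let clauses are not generated because the transcript elides them. The function names are assumptions for this example.

```python
def where_clause(conditions):
    """Build a Where clause from (name, operator, value) triples."""
    parts = [
        f'(${name}{op}"{value}" or empty(${name}))'
        for name, op, value in conditions
    ]
    return "where " + " and ".join(parts)


def return_clause(names):
    """Build a Return clause that lists one enclosed variable per name."""
    return "return " + " ".join("{$" + name + "}" for name in names)


conditions = [("Color", "=", "red"), ("Make", "=", "Nissan"), ("Year", ">=", "1996")]
print(where_clause(conditions))
print(return_clause(["Price", "Mileage", "Color", "Make", "Year"]))
```

Pairing each comparison with `empty(...)` means a record that lacks a value for some object set is not excluded by the condition on that object set.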

15 Step 4. Run XQuery Expression Over Ontology’s Extracted Data
  Uses Qexo 1.7, GNU’s XQuery engine for Java
  Orders results according to number of values
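The ordering step can be sketched in a few lines of Python. The sketch assumes that "number of values" means the number of requested fields for which a record has a non-empty extracted value, and that records with more values come first; neither detail is stated on the slide. The sample records are made up.

```python
def order_by_value_count(records):
    """Sort records so that those with the most non-empty values come first."""
    def value_count(record):
        return sum(1 for value in record.values() if value not in (None, ""))
    # sorted() is stable, so records with equal counts keep their original order.
    return sorted(records, key=value_count, reverse=True)


records = [
    {"Price": "$4,500", "Mileage": ""},
    {"Price": "$6,000", "Mileage": "80,000"},
    {"Price": None, "Mileage": None},
]
for record in order_by_value_count(records):
    print(record)
```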

16 Demo

17 Evaluation of AskOntos
  Success Measure: ability to translate free-form queries into formal queries
  Extraction ontologies: car ads, house ads, countries, movies, and diamond ads
  3 rounds of testing
  50 queries each (gathered from other CS students)
  1st round discarded due to queries
  Minor improvements on system between rounds

18 Query Translation Metrics
  “Find me the price and mileage of all red Nissans – I want a 1996 or newer.”

  Automated conversion:
    for $doc in document("file:///.../Car.OWL")/rdf:RDF
    for $Record in $doc/owl:Thing
    …
    where($Color="red" or empty($Color)) and ($Make="Nissan" or empty($Make)) and ($Year="1996" or empty($Year))
    return {$Price} {$Color} {$Make} {$Year}

    Return-Clause Names: {Price, Color, Make, Year}
    Conditions: {(Color,=,“red”), (Make,=,“Nissan”), (Year,=,“1996”)}

  Human conversion:
    Return-Clause Names: {Price, Mileage, Color, Make, Year}
    Conditions: {(Color,=,“red”), (Make,=,“Nissan”), (Year,>=,“1996”)}

                         Precision  Recall
  Return-Clause Names    100%       80%
  Conditions             66%        66%
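The figures on this slide can be reproduced by treating the return-clause names and the conditions of each conversion as sets. The Python sketch below takes the set containing Mileage and the >= condition as the human conversion, which is the only assignment consistent with 100% precision and 80% recall for the names. The function name `set_precision_recall` is an assumption for this example.

```python
def set_precision_recall(automated, human):
    """Precision and recall of the automated set measured against the human set."""
    correct = len(automated & human)
    return correct / len(automated), correct / len(human)


auto_names = {"Price", "Color", "Make", "Year"}
human_names = {"Price", "Mileage", "Color", "Make", "Year"}

auto_conditions = {("Color", "=", "red"), ("Make", "=", "Nissan"), ("Year", "=", "1996")}
human_conditions = {("Color", "=", "red"), ("Make", "=", "Nissan"), ("Year", ">=", "1996")}

print(set_precision_recall(auto_names, human_names))            # (1.0, 0.8)
print(set_precision_recall(auto_conditions, human_conditions))  # 2/3 each, shown as 66% on the slide
```

The automated conversion misses Mileage, which costs recall on the names, and it gets the Year operator wrong, which costs both precision and recall on the conditions.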

19 Results

20 Result Analysis
  Common reasons for errors:
  1. Word not in lexicon: “5 Bedrooms, 3 Bath, study, game room, 2 car garage, and < $250,000”

21 Result Analysis
  2. Mistakes in regular expressions: “Which countries use the euro?”

22 Result Analysis
  3. Not enough context: “What are the models from 2005”

23 Conclusion/Contributions
  AskOntos
    Is a free-form query system for the semantic web
    Applies information extraction for query processing
    Answers questions with extracted data values
  Contributions
    Web queries that use semantic annotations
    Web queries returning answers from extracted data
    Processing free-form queries using ontologies

24 Future Work
  Disjunction and negation
  Fuzzy queries
  Spellchecker

25

26 TREC 2004 QA Question Topics

27 Related Research (Similarities, Differences)
  QUEST (1999): Uses ontologies; Graphic-based interface; Returns generated documents and graphs
  SHOE (2000): Returns tables of data; Form-based interface
  AQUA (2004): Natural language interface; Uses ontology as part of query translation process; For single domain environment; Part-of-speech recognition; Uses ontology for term replacement; Returns passages

28 Related Research (Similarities, Differences)
  Bernstein et al. (2005): Natural language interface; Allows only subset of English (Attempto Controlled English) queries
  SWSE (2005): Natural language interface; Returns semantically annotated data; No part-of-speech recognition; Query context found by matching RDF labels, comments and literals; Uses WordNet
  NaLIX (2006): Converts natural language query to same XML query language; Limited to parsing ability of MINIPAR; For XML database; Query terms expanded with WordNet

29 Simple Multiple-Record Documents
  Genealogy Domain – from Troy Walker’s thesis
  Separators compared: VSM Separator, Highest-Fanout Separator

  Document  records  returned  correct  precision  recall
  simple1   19       20        19       95.00%     100.00%
  simple2   19       17        17       100.00%    89.47%
  simple3   11       11        11       100.00%    100.00%
  simple4   9        9         9        100.00%    100.00%
  simple5   12       13        11       84.62%     91.67%
  simple6   12       11        10       90.91%     83.33%
  simple7   14       10        10       100.00%    71.43%
  simple8   5        7         5        71.43%     100.00%
  simple9   14       14        14       100.00%    100.00%
  simple10  15       15        15       100.00%    100.00%
  Total     130      127       121      95.28%     93.08%

  Document  records  returned  correct  precision  recall
  simple1   19       22        19       86.36%     100.00%
  simple2   19       20        0        0.00%      0.00%
  simple3   11       14        11       78.57%     100.00%
  simple4   9        10        9        90.00%     100.00%
  simple5   12       16        12       75.00%     100.00%
  simple6   12       23        9        39.13%     75.00%
  simple7   14       22        13       59.09%     92.86%
  simple8   5        10        0        0.00%      0.00%
  simple9   14       16        14       87.50%     100.00%
  simple10  15       16        0        0.00%      0.00%
  Total     130      169       87       51.48%     66.92%
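The percentages in these tables are consistent with precision = correct / returned and recall = correct / records. The slide does not spell out these formulas, so they are stated here as an assumption that happens to reproduce the numbers. The short Python check below recomputes the two Total rows; the function name is made up for this example.

```python
def count_precision_recall(records, returned, correct):
    """Precision and recall as percentages, rounded to two decimals."""
    precision = round(100 * correct / returned, 2)
    recall = round(100 * correct / records, 2)
    return precision, recall


print(count_precision_recall(130, 127, 121))  # Total row of the first table: (95.28, 93.08)
print(count_precision_recall(130, 169, 87))   # Total row of the second table: (51.48, 66.92)
```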

30 Complex Multiple-Record Documents

  Document   records  returned  missed  extra  correct  precision  recall
  complex1   10       10        0       0      10       100.00%    100.00%
  complex2   15       15        0       0      15       100.00%    100.00%
  complex3   12       12        0       0      12       100.00%    100.00%
  complex4   7        9         1       3      6        66.67%     85.71%
  complex5   16       15        1       0      15       100.00%    93.75%
  complex6   15       16        2       3      13       81.25%     86.67%
  complex7   13       12        1       0      12       100.00%    92.31%
  complex8   10       10        0       0      10       100.00%    100.00%
  complex9   19       20        1       2      18       90.00%     94.74%
  complex10  10       10        1       1      9        90.00%     90.00%
  complex11  15       11        4       0      11       100.00%    73.33%
  complex12  15       15        0       0      15       100.00%    100.00%
  complex13  11       11        0       0      11       100.00%    100.00%
  complex14  16       18        1       3      15       83.33%     93.75%
  complex15  8        8         2       2      6        75.00%     75.00%
  complex16  8        9         0       1      8        88.89%     100.00%
  complex17  10       11        0       0      11       100.00%    110.00%
  complex18  4        1         3       0      1        100.00%    25.00%
  complex19  8        11        0       3      8        72.73%     100.00%
  complex20  16       13        4       1      12       92.31%     75.00%
  Total      238      237       21      19     218      91.98%     91.60%

31 Scaling to the Web
  Ontologies crawl and harvest web pages
  Ontologies extract values from pages
  Ontologies indexed
  Queries extracted by relevant ontologies
  Rely on Google-like technology

