HyKSS: Hybrid Keyword and Semantic Search
Andrew Zitzelberger


1 HyKSS: Hybrid Keyword and Semantic Search Andrew Zitzelberger

2 Keyword Search

3 Form Based Search

4 What about?
– over 8,000 meters in elevation
– less than 100K miles
– faster than 100 mph

5 5

6 HyKSS: Hybrid Keyword and Semantic Search
Semantics
– extracted annotations
– Multiple ontologies
Keywords
– text

7 Thesis Statement
HyKSS (hybrid search):
– Outperforms keyword and semantic search
– Dynamic query weighting outperforms various other hybrid search approaches
– Allows queries over multiple ontologies
– Allows pay-as-you-go improvement

8 Extraction Ontologies

9 Data Frames

10 Indexing Architecture
[Diagram components: Document Collection, Keyword Indexer, Semantic Indexer, Keyword Index, Semantic Index]

11 Indexing Architecture Implementation
[Diagram components: Document Collection, Keyword Indexer, Semantic Indexer, Keyword Index, Semantic Index, OntoES, Ontology Library, Sesame, Lucene]

12 Query Processing
[Diagram: Free Form Query; Keyword Processing (Pre-Process Query, Execute Query, Post-Process Query); Semantic Processing (Pre-Process Query, Execute Query, Post-Process Query); Combine Results]

13 Keyword Query Pre-Processing
– Remove Lucene special characters (except quotes)
– Remove (inequality) comparison constraints
– Remove non-phrase stopwords
Example: hondas in "excellent condition" in orem for under 12 grand
becomes: hondas "excellent condition" orem
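The three pre-processing steps on this slide can be sketched roughly as follows. This is a minimal illustration, not the HyKSS implementation: the stopword list, the `COMPARISON` pattern, and the function name `preprocess_keyword_query` are all assumptions made for the example, and special characters are replaced with spaces rather than deleted outright.

```python
import re

# Characters with special meaning in Lucene's classic query syntax.
# The double quote is deliberately left out so that phrases survive.
LUCENE_SPECIAL = set('+-&|!(){}[]^~*?:\\/')

# Hypothetical stopword list and comparison pattern, for illustration only.
STOPWORDS = {"in", "for", "the", "a", "an", "of", "and", "or", "to"}
COMPARISON = re.compile(
    r"\b(?:under|over|less than|more than|at least|at most)"
    r"\s+\d[\d,.]*(?:\s*(?:k|grand)\b)?",
    re.IGNORECASE,
)


def preprocess_keyword_query(query: str) -> str:
    # 1. Replace Lucene special characters (except quotes) with spaces.
    cleaned = "".join(" " if ch in LUCENE_SPECIAL else ch for ch in query)
    # 2. Remove inequality comparison constraints such as "under 12 grand".
    cleaned = COMPARISON.sub(" ", cleaned)
    # 3. Drop stopwords unless they sit inside a quoted phrase.
    tokens = re.findall(r'"[^"]*"|\S+', cleaned)
    kept = [t for t in tokens if t.startswith('"') or t.lower() not in STOPWORDS]
    return " ".join(kept)


print(preprocess_keyword_query(
    'hondas in "excellent condition" in orem for under 12 grand'
))
# prints: hondas "excellent condition" orem
```

The quoted phrase is tokenized as a single unit, which is why any stopwords inside it would be kept.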

14 Keyword Query Execution and Post-Processing
– Executed by Lucene
– Empty post-processing step

15 Semantic Query Pre-Processing: Individual Ontology Scoring
Example query: hondas in "excellent condition" in orem for under 12 grand

16 Semantic Query Pre-Processing: Ontology Set Creation
For each ontology, sorted by score:
– For each remaining ontology:
  – Add a point for each new or subsuming match
  – If added points > 0, add the ontology
Completely subsumed ontologies are removed during query generation
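One possible reading of the set-creation procedure is sketched below. It simplifies the nested loop on the slide into a single greedy pass over the score-ranked ontologies, so it should be taken as an interpretation rather than the HyKSS algorithm. Matches are modeled as character spans in the query; that representation, the helper names, and the sample offsets are all assumptions.

```python
from typing import Dict, List, Set, Tuple

Span = Tuple[int, int]  # (start, end) offsets of a match in the query text


def is_new_or_subsuming(span: Span, covered: Set[Span]) -> bool:
    """True if the span overlaps nothing covered so far (new match),
    or strictly contains an already covered span (subsuming match)."""
    start, end = span
    overlapping = [c for c in covered if c[0] < end and start < c[1]]
    if not overlapping:
        return True
    return any(start <= c[0] and c[1] <= end and c != span for c in overlapping)


def create_ontology_set(
    matches: Dict[str, Set[Span]], scores: Dict[str, float]
) -> List[str]:
    """Greedily keep each ontology that contributes a new or subsuming match."""
    ranked = sorted(matches, key=lambda name: scores[name], reverse=True)
    selected: List[str] = []
    covered: Set[Span] = set()
    for name in ranked:
        added_points = sum(
            1 for span in matches[name] if is_new_or_subsuming(span, covered)
        )
        if added_points > 0:
            selected.append(name)
            covered |= matches[name]
    return selected


# Made-up spans: Vehicle matches a make and a price constraint, Location
# matches a city, ContractualServices matches only the same price constraint.
matches = {
    "Vehicle": {(0, 6), (50, 64)},
    "Location": {(40, 44)},
    "ContractualServices": {(50, 64)},
}
scores = {"Vehicle": 2, "Location": 1, "ContractualServices": 1}
print(create_ontology_set(matches, scores))
# prints: ['Vehicle', 'Location']
```

In this example ContractualServices is left out because its only match is already covered by Vehicle and does not subsume it.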

17 Semantic Query Pre-Processing: Ontology Set Creation
[Diagram: ontologies Vehicle, Location, and ContractualServices with the matched constraints Price < 12000 and US_City="orem"; score annotations Vehicle_Score, Vehicle_Score + 1, and ContractualServices_Score + 1]

18 Semantic Query Pre-Processing: Structured Query Generation
– Open world assumption
– SPARQL query
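As a rough illustration of how an open world assumption might show up in a generated SPARQL query, the sketch below wraps every requested concept in an OPTIONAL block, so a document missing one of the concepts is not excluded outright. The slide does not show the actual query, so the `ex:` namespace, the `ex:Document` class, the predicate names, and the `build_sparql` helper are all hypothetical.

```python
from typing import List, Tuple

PREFIX = "PREFIX ex: <http://example.org/ontology#>"  # hypothetical namespace


def build_sparql(
    projections: List[str], selections: List[Tuple[str, str, str]]
) -> str:
    """Build a SELECT query in which every requested concept is OPTIONAL.

    projections: concept names to return, e.g. ["Make", "Price"]
    selections:  (concept, operator, literal) constraints
    """
    filters = {name: (op, value) for name, op, value in selections}
    variables = " ".join(f"?{name}" for name in projections)
    lines = [PREFIX, f"SELECT ?doc {variables}", "WHERE {", "  ?doc a ex:Document ."]
    for name in projections:
        pattern = f"?doc ex:{name} ?{name} ."
        if name in filters:
            op, value = filters[name]
            pattern += f" FILTER (?{name} {op} {value})"
        lines.append(f"  OPTIONAL {{ {pattern} }}")
    lines.append("}")
    return "\n".join(lines)


query = build_sparql(
    ["Make", "Price", "US_City"],
    [("Price", "<", "12000"), ("US_City", "=", '"orem"')],
)
print(query)
```

A document with no extracted price still appears in the results under this scheme, only with `?Price` unbound, which is what lets a later ranking step reward the documents that satisfy more of the request.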

19 Semantic Query Execution and Post-Processing
Sesame query execution
Semantic ranking:
– 1 point for each requested projection satisfied
– Normalized by # of projections requested
Example: hondas in "excellent condition" in orem for under 12 grand
– Projections on Make, Price and US_City
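The semantic ranking rule on this slide amounts to the fraction of requested projections that a result satisfies. A minimal sketch, assuming each result is a mapping from projection name to a bound value (or `None` when unbound); the function name and the sample values are illustrative only:

```python
from typing import Dict, Iterable, Optional


def semantic_score(
    requested: Iterable[str], result: Dict[str, Optional[object]]
) -> float:
    """One point per requested projection the result satisfies,
    normalized by the number of projections requested."""
    requested = list(requested)
    if not requested:
        return 0.0  # assumption: no projections requested means no semantic score
    satisfied = sum(1 for name in requested if result.get(name) is not None)
    return satisfied / len(requested)


# Projections from the slide's example query; the bound values are made up.
projections = ["Make", "Price", "US_City"]
result = {"Make": "Honda", "Price": 9500, "US_City": None}
print(round(semantic_score(projections, result), 3))
# prints: 0.667
```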

20 Hybrid Query Processing
Linear interpolation:
– (kw_weight * kw_score) + (sm_weight * sm_score)
Dynamic solution:
– # keywords remaining (#kw)
– concept match score (cms) = ½ * (selections + projections)
– kw_weight = #kw / (#kw + cms)
– sm_weight = cms / (#kw + cms)
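The dynamic weighting formulas translate directly into code. The sketch below follows the slide's definitions; the equal-weight fallback for a query with no keywords and no concept matches is an assumption, since the slide does not cover that case, and the input counts in the example are made up.

```python
from typing import Tuple


def dynamic_weights(
    num_keywords: int, selections: int, projections: int
) -> Tuple[float, float]:
    """Return (kw_weight, sm_weight) from the slide's dynamic solution."""
    cms = 0.5 * (selections + projections)  # concept match score
    total = num_keywords + cms
    if total == 0:
        return 0.5, 0.5  # assumption: fall back to equal weights
    return num_keywords / total, cms / total


def hybrid_score(
    kw_score: float, sm_score: float, kw_weight: float, sm_weight: float
) -> float:
    """Linear interpolation of the keyword and semantic scores."""
    return kw_weight * kw_score + sm_weight * sm_score


# Hypothetical counts: 3 keywords remaining, 1 selection, 3 projections.
kw_weight, sm_weight = dynamic_weights(3, 1, 3)
print(kw_weight, sm_weight)
# prints: 0.6 0.4
```

With these counts, cms = ½ * (1 + 3) = 2, so the keyword side gets 3/5 of the weight and the semantic side 2/5. The two weights always sum to 1, so the hybrid score stays in the same range as its inputs.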

21 Basic Search

22 Results Display

23 Form Based Search

24 Results Display

25 Experimental Setup – Ontology Libraries
5 ontology levels:
– Number
– Generic Units
– Vehicle Units
– Vehicle
– Vehicle+

26 Experimental Setup – Query Sets
– 113 syntactically unique queries from database students
– 60 syntactically unique queries from linguistics students

27 Experimental Setup – Document Collection
– 250 vehicle advertisements (Craigslist): 100 training, 50 validation, 100 test
– 318 mountain pages (Wikipedia)
– 66 roller coaster pages (Wikipedia)
– 88 video game advertisements (Craigslist)

28 Experiments
1) Training queries over test vehicle documents
2) Test queries over test vehicle documents
3) Training queries over test vehicle documents + additional noise
4) Test queries over test vehicle documents + additional noise
5) 5 queries over noisy data (Generic Units only)

29 Experiments: Metric
– Mean Average Precision
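For reference, mean average precision in its standard form can be computed as sketched below. Whether the experiments used exactly this variant (for example, how ties or unretrieved relevant documents were handled) is not stated on the slide, so treat this as the textbook definition; the document IDs are made up.

```python
from typing import List, Sequence, Set, Tuple


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average of precision@k over the ranks k at which a relevant document
    appears, divided by the total number of relevant documents."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(runs: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """Mean of average precision over (ranking, relevant set) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


runs = [
    (["d1", "d2", "d3", "d4"], {"d1", "d3"}),  # AP = (1/1 + 2/3) / 2
    (["d2", "d1"], {"d1"}),                    # AP = 1/2
]
print(round(mean_average_precision(runs), 4))
# prints: 0.6667
```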

30 Experimental Results

31 Experimental Results

32 Experimental Results

33 Conclusions
– Hybrid search outperforms keyword and semantic search
– HyKSS's dynamic query weighting approach outperforms various other weighting techniques
– Using multiple ontologies does not outperform selecting and using a single ontology

34 External Image Citations
– Slide 2, Google search screenshot: http://www.google.com (07/30/11)
– Slide 3, partial car search form screenshots: http://autotrader.com/fyc (07/30/11)
– Slide 4, mountain image: http://en.wikipedia.org/wiki/Lhotse (04/26/11)
– Slide 4, car image: http://en.wikipedia.org/wiki/Honda (04/26/11)
– Slide 4, roller coaster image: http://en.wikipedia.org/wiki/Kingda_Ka (04/26/11)
– Slide 4, Wikipedia logo: http://en.wikipedia.org/wiki/Main_Page (04/26/11)
– Slide 4, craigslist logo: http://provo.craigslist.org/ (04/26/11)

