iRDQL – Imprecise RDQL Queries Using Similarity Joins

Similarity Measurement – Status Quo

Similarity between
feature vectors [Lee et al. 98]
  – features of objects like name, degree, ...
  – Cosine of angle
strings or sequences of strings [Levenshtein 66]
  – textual description of objects
  – Levenshtein, TFIDF
trees and graphs [Shasha et al. 02]
  – tree/graph comparison
  – Isomorphisms, Tree-edit distance
objects [Resnik 95]
  – amount of information contained in objects

[Figure: two example ontologies.
 CS Dept US: UnderGrad Courses, Grad Courses, People, Faculty, Staff, Associate Prof, Prof; instance with name: Mike Meyers, granting-institution: NYU; description: "Computer Science Department at the University of NY."
 CS Dept Swiss: Courses, Staff, Academic Staff, Technical Staff, Administration Staff, Lecturer, Professor; instance with first-name: Abraham, last-name: Bernstein, degree: Prof., Ph.D.; description: "This is the Department of Informatics at the University of Zurich."
 p = 0.0345]
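Two of the measures named on this slide can be sketched in a few lines. The following is a minimal illustration, not the iRDQL implementation: the function names are hypothetical, the cosine measure uses raw term counts rather than TFIDF weights, and the normalization of the Levenshtein distance to a similarity in [0, 1] is one common choice among several.

```python
import math
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of single-character
    insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer string."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine of the angle between bag-of-words term-count vectors.
    Tokenization is a plain lowercase whitespace split (an assumption)."""
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm = (math.sqrt(sum(c * c for c in vec_a.values()))
            * math.sqrt(sum(c * c for c in vec_b.values())))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # The two department descriptions shown in the slide's figure.
    us = "Computer Science Department at the University of NY."
    ch = "This is the Department of Informatics at the University of Zurich."
    print(cosine_similarity(us, ch))
    print(levenshtein_similarity("Prof", "Professor"))
```

The printed scores are not expected to reproduce the value p = 0.0345 shown in the figure, since the slide does not say which measure or weighting produced it.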

Evaluation Approach

Quantitative evaluation using an OWL-S service retrieval test collection [Klusch 05]
  – OWL-S Service-based
Precision, Recall and F-Measure as performance measures

[Formulas: Precision / Recall / F-Measure]
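The formulas themselves did not survive in the slide text. As a hedged sketch, the standard IR definitions are: precision = |retrieved ∩ relevant| / |retrieved|, recall = |retrieved ∩ relevant| / |relevant|, and F-Measure = 2 · precision · recall / (precision + recall). The balanced (F1) form is an assumption here, as are the function name and the service names in the example.

```python
def precision_recall_f(retrieved, relevant):
    """Return (precision, recall, F1) for one query.

    retrieved: services returned by the system for the query
    relevant:  services in the query's relevance set
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)

    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


if __name__ == "__main__":
    # Hypothetical service names, for illustration only.
    returned = ["a.owls", "b.owls", "c.owls", "d.owls"]
    relevance_set = ["a.owls", "b.owls", "e.owls"]
    print(precision_recall_f(returned, relevance_set))
```

In this example two of the four returned services are relevant and two of the three relevant services are found, so precision is 1/2, recall is 2/3 and the F-Measure is 4/7.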

OWL-S Service Retrieval Test Collection

406 OWL-S services of 6 different domains
9 queries together with their correct answers

Query:
  hotel reservation booking service
  "Provide the best hotel reservation system in a given city."

Query Relevance Set:
  city_broker_service.owls
  city_broker_service2.owls
  city_financial_agent_service.owls
  city_financial_agent_service1.owls
  urbanarea_financial_agent_service.owls
  city_organization_service.owls
  ...

http://www.w3.org/Submission/OWL-S/

iRDQL – Performance (iRDQL vs. OWLS-M4)

iRDQL slightly outperformed by specialized algorithm
OWLS-M4: Matchmaking algorithm of OWLS-MX [Klusch et al. 05]

[Figure: Precision, Recall and F-Measure of iRDQL vs. OWLS-M4]

Conclusions and Outlook

Combination of RDQL and similarity measures
Generic IR-based approach only slightly outperformed by specialized algorithm
Inspired by Cohen's work [Cohen 00]
  – no flat tables
  – aggregated (ontologized) objects
Performance improvements
Switch to SPARQL

Thank You...