An Efficient Method for Computing Alignment Diagnoses

Presentation transcript:

An Efficient Method for Computing Alignment Diagnoses
Christian Meilicke, Heiner Stuckenschmidt
University of Mannheim, Lehrstuhl für Künstliche Intelligenz
{christian, heiner}@informatik.uni-mannheim.de

Problem Statement
Automatically and manually (!) generated ontology alignments are often incoherent (see the OAEI-2008 results of the conference track).
Incoherent alignments are a problem in many application scenarios*: instance migration results in inconsistent ontologies, and query translation results in 'a priori' empty result sets.
Goal: find a way to repair incoherent alignments automatically and very efficiently, because 'agents on the web' require coherent alignments on the fly and large ontologies require efficient algorithms.
* C. Meilicke and H. Stuckenschmidt. Incoherence as a Basis for Measuring the Quality of Ontology Mappings. OM-08.

Outline
Alignment Semantics: incoherence of an alignment, MIPS alignments
Alignment Diagnosis: diagnosis, minimal hitting set, local optimal diagnosis
Computing a Local Optimal Diagnosis (LOD): brute-force LOD and efficient LOD
Experimental Results: runtime, quality of the diagnosis

"Natural" Semantics O1 ∪A O2 O2 O1 Merged Ontology <1#Person, 2#Person, =, 0.98> <1#hasName, 2#name, =, 0.87> <1#writtenBy, 2#docWrittenBy, = 0.7> <1#authorOf, 2#hasWritten, =, 0.56> <1#firstAuthor, 2#Author, ⊑ , 0.56> O1 ∪A O2 Correspondences An alignment A and two ontologies O1 and O2 O2 O1 1#firstAuthor ⊑ 2#Author Axioms 1#Person ≣ 2#Person … Computing a local optimal diagnosis

Incoherence of an Alignment
Definition: Incoherence of an Alignment. An alignment A between ontologies O1 and O2 is incoherent iff there exists a concept i#C or property i#R, i ∈ {1, 2}, that is satisfiable in Oi but unsatisfiable in O1 ∪A O2. Property satisfiability can be reduced to the satisfiability of the concept ∃i#R.⊤.
Definition: MIPS Alignment (minimal conflict set). Given an incoherent alignment A between ontologies O1 and O2, a subalignment M ⊆ A is a MIPS alignment (minimal incoherence preserving subalignment) iff M is incoherent and there exists no M' ⊂ M such that M' is incoherent.
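A small sketch of how this definition could be checked, assuming a DL reasoner is hidden behind a hypothetical helper is_satisfiable(merged_ontology, concept_expression); none of these names come from the slides.

def is_incoherent(merged, concepts, properties, is_satisfiable):
    # `concepts` and `properties` are the named concepts/properties of O1 and O2
    # that are satisfiable in their own ontology; `merged` stands for O1 ∪A O2.
    for concept in concepts:
        if not is_satisfiable(merged, concept):
            return True                      # a concept became unsatisfiable
    for prop in properties:
        # property satisfiability reduced to satisfiability of ∃R.⊤ (see above)
        if not is_satisfiable(merged, f"∃{prop}.⊤"):
            return True
    return False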

"Terminology"
[Figure: an alignment and its correspondences; the alignment with its MIPS shown as subsets; the alignment as a sequence ordered by confidences, with MIPS depicted by red-dotted links.]

Alignment Diagnosis
Definition: Alignment Diagnosis. An alignment ∆ ⊆ A is an alignment diagnosis for O1 and O2 iff A \ ∆ is coherent with respect to O1 and O2 and for each ∆' ⊂ ∆ the alignment A \ ∆' is incoherent with respect to O1 and O2.
Proposition: Alignment Diagnosis and Minimal Hitting Sets. An alignment ∆ ⊆ A is an alignment diagnosis for O1 and O2 iff ∆ is a minimal hitting set over all MIPS in A.
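To illustrate the proposition, here is a tiny self-contained check of the minimal-hitting-set property over a toy set of MIPS (the MIPS are given explicitly here only for illustration; the algorithms presented later never compute them).

def is_hitting_set(delta, mips_list):
    # delta hits every MIPS, i.e. shares at least one correspondence with each
    return all(delta & mips for mips in mips_list)

def is_minimal_hitting_set(delta, mips_list):
    if not is_hitting_set(delta, mips_list):
        return False
    # minimality: no proper subset is still a hitting set
    return all(not is_hitting_set(delta - {c}, mips_list) for c in delta)

mips_list = [frozenset({"c2", "c5"}), frozenset({"c5", "c7"})]  # toy MIPS
print(is_minimal_hitting_set({"c5"}, mips_list))        # True -> a diagnosis
print(is_minimal_hitting_set({"c2", "c5"}, mips_list))  # False -> not minimal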

Local Optimal Diagnosis (LOD)
[Figure: the alignment as a sequence ordered from high confidence to low confidence.]
Definition: Accused Correspondence. A correspondence c ∈ A is accused by A iff there exists a MIPS M in A with c ∈ M such that for all c' ≠ c in M it holds that (1) conf(c') > conf(c) and (2) c' is not accused by A.
Definition: Local Optimal Diagnosis (LOD). The set of all accused correspondences is referred to as the local optimal diagnosis (LOD).
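A toy illustration (not from the slides): take again the MIPS {c2, c5} and {c5, c7} with conf(c2) > conf(c5) > conf(c7). In the first MIPS, c5 is accused because c2 has higher confidence and is itself not accused; in the second MIPS, c7 is not accused because c5 is already accused. The LOD is therefore {c5}, which is indeed a minimal hitting set over both MIPS.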

Algorithm 1 (walkthrough)
[Figure: the alignment as a sequence of correspondences 1 to 10, ordered by descending confidence; the correspondences are added one at a time and the coherence of the current subalignment is checked after each step.]
Coherent? YES! … Coherent? YES! … Coherent? NO! The correspondence that was just added is accused and removed. Coherent? Now it is!
Coherent? YES! … Coherent? YES! … Coherent? NO! Again the last added correspondence is accused and removed. Coherent? Now it is! … and the algorithm continues in the same way.

Algorithm 1: Result
… and after a few more steps we end up with the repaired alignment. [Figure: the final sequence of correspondences 1 to 10, with the accused correspondences removed.]
Note: 10 coherence checks suffice to construct a local optimal diagnosis, which is a minimal hitting set over all MIPS, and we have not computed a single MIPS alignment!
First sketch: Meilicke, Völker, Stuckenschmidt. Learning Disjointness for Debugging Mappings between Lightweight Ontologies (EKAW-08). With a focus on the relation to belief revision, discussed in: Qi, Ji, Haase. A Conflict-based Operator for Mapping Revision (ISWC-09).
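A minimal Python sketch of this brute-force procedure, reconstructed from the walkthrough above (not the authors' original pseudocode), assuming the alignment is sorted by descending confidence and that a complete coherence test is available as a black box.

def brute_force_lod(o1, o2, alignment, is_coherent):
    # alignment: list of correspondences sorted by descending confidence
    # is_coherent(o1, o2, subalignment): complete coherence test (e.g. a DL reasoner)
    accepted = []    # coherent part of the alignment built so far
    diagnosis = []   # accused correspondences (the local optimal diagnosis)
    for c in alignment:
        if is_coherent(o1, o2, accepted + [c]):
            accepted.append(c)       # keeping c preserves coherence
        else:
            diagnosis.append(c)      # adding c caused incoherence: c is accused
    return diagnosis                 # exactly len(alignment) coherence checks in total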

"Pattern-based" Reasoning
Idea: use an incomplete method for incoherence detection in A' ⊆ A.
Classify O1 and O2 once, then check for each pair of correspondences in A' whether a certain pattern occurs (see the sketch below; a concrete pattern is shown on the back-up slide).
If the pattern occurs for some pair of an alignment A', then A' is incoherent.
If no pattern occurs, A' can nevertheless be incoherent!
[Figure: schematic of the pattern between two ontologies Oi and Oj.]
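A sketch of this incomplete test. The helper names hierarchy1, hierarchy2 and pattern_applies are placeholders of my own; the concrete pattern itself is illustrated on the back-up slide.

from itertools import combinations

def pattern_incoherent(subalignment, hierarchy1, hierarchy2, pattern_applies):
    # hierarchy1 / hierarchy2: the classified subsumption and disjointness
    # hierarchies of O1 and O2, computed once up front.
    for c1, c2 in combinations(subalignment, 2):
        if pattern_applies(c1, c2, hierarchy1, hierarchy2):
            return True    # pattern found: the subalignment is certainly incoherent
    return False           # no pattern found: the subalignment may still be incoherent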

That Doesn't Work …
Use the efficient coherence test instead of complete reasoning in the algorithm described above: reasoning about A' ⊆ A no longer requires reasoning in O1 ∪A' O2, but is replaced by iterating over all pairs in A'.
However: the resulting alignment might still be incoherent, and ∆ is not a LOD.
Missing one MIPS might result in a chain of incorrect follow-up decisions! Thus, removing the missed MIPS afterwards does not work!
How can we exploit the efficient method while still constructing a LOD?

Algorithm 2: Example
[Figure: the alignment as a sequence of correspondences 1 to 10; legend: conflicts detectable by the efficient method, conflicts only detectable by the complete method, conflicts resolved due to the removal of a correspondence.]

Algorithm 2: Example
Run the brute-force algorithm with efficient (pattern-based) reasoning. Still incoherent?
Verification step: use binary search to detect the correspondence k such that A[0 … k-1] is coherent and A[0 … k] is incoherent.
A[0 … k-1] is the safe part: efficient reasoning did not fail up to k. Everything after k is the incorrect part and has to be recomputed.
[Figure: the sequence of correspondences 1 to 10 with k = 8; legend as on the previous slide.]

Algorithm 2: Example
Run the main algorithm again with efficient reasoning for A[k+1 … n], where ∆_{1…k} ∪ {A[k]} (the diagnosis computed for A[1 … k]) is a fixed part of the resulting diagnosis.
Still incoherent? If yes, we have k_new > k_old and we repeat the same verification step.
[Figure: the sequence split into the fixed part A[1 … k] and the recomputed part A[k+1 … n]; legend as before.]

Algorithm 2: Example
The final result is a LOD. A Python-style sketch of the complete procedure follows below.
[Figure: the final sequence of correspondences 1 to 10; legend as before.]
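Putting the pieces together, a sketch of Algorithm 2 as reconstructed from this example (hypothetical helper names: coherent_pattern is the cheap, incomplete test sketched earlier, coherent_complete wraps a full DL reasoner; this is not the authors' original pseudocode).

def efficient_lod(o1, o2, alignment, coherent_complete, coherent_pattern):
    # alignment: list of correspondences sorted by descending confidence
    n = len(alignment)
    accused = set()        # indices of accused correspondences (the diagnosis)
    start = 0              # decisions for positions < start are verified and fixed
    while True:
        # greedy pass as in Algorithm 1, but with the cheap pattern-based test only
        for i in range(start, n):
            kept = [alignment[j] for j in range(i + 1) if j not in accused]
            if not coherent_pattern(kept):
                accused.add(i)
        repaired = [alignment[j] for j in range(n) if j not in accused]
        # verification step with the complete reasoner
        if coherent_complete(o1, o2, repaired):
            return [alignment[j] for j in sorted(accused)]   # a verified LOD
        # binary search for the smallest k such that the repaired prefix
        # A[0..k-1] is coherent while A[0..k] is incoherent
        lo, hi = start, n - 1
        while lo < hi:
            mid = (lo + hi) // 2
            prefix = [alignment[j] for j in range(mid + 1) if j not in accused]
            if coherent_complete(o1, o2, prefix):
                lo = mid + 1
            else:
                hi = mid
        k = lo
        # decisions up to k-1 are safe; A[k] is accused; everything after k is redone
        accused = {j for j in accused if j < k} | {k}
        start = k + 1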

Runtime Considerations (Theory)
Let n be the size of the alignment A and m the number of times the binary search is applied.
The "more complete" the pattern-based reasoning is, the fewer verification steps / iterations are necessary.
The runtime of the pattern-based reasoning itself hardly matters for the overall runtime!
Runtime comparison (in complete coherence checks): brute-force LOD O(n), efficient LOD O(log(n) * m).
Do we have m << n?
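For illustration (numbers invented only to make the point): with n = 1000 correspondences and m = 3 verification iterations, the brute-force variant needs roughly 1000 calls to the complete reasoner, while the efficient variant needs roughly 3 * log2(1000) ≈ 30 such calls, plus the cheap pattern-based checks.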

Results: Runtime
Based on experiments with the OAEI conference ontologies and submissions from 2007/08.
Expressivity: SHIN(D), ELI(D), SIF(D), ALCIF(D). Four different state-of-the-art matching systems.
[Table: runtimes together with the corresponding values of n and m.]
Better results for the benchmark datasets: 5 to 10 times faster.

Results: Quality of the Diagnosis
Removing the LOD results in an alignment with increased precision and slightly decreased recall, i.e. a slightly increased f-measure.
For alignments with low precision the positive effects are very strong.
In rare cases an incorrect correspondence annotated with high confidence has negative effects.

Summary
Algorithm 1: an algorithm for computing a LOD, without computing MIPS or MUPS!
Algorithm 2: a general approach for improving algorithms of type 1, shown here for the natural interpretation of correspondences as axioms and a specific type of incomplete reasoning. In principle it is applicable to every semantics for which we can find a similarly efficient reasoning approach!
Good results for the natural interpretation + pattern-based reasoning: between 2 and 10 times faster!

Thanks for your attention. Questions?

Back-Up Slides

Property Pattern Example
O1 axioms: ∃reviewOfPaper.⊤ ⊑ Review, Review ⊑ Document
O2 axioms: ∃readPaper.⊤ ⊑ Reviewer, Reviewer ⊑ Person, Document ⊑ ¬Person
Correspondences: 1#reviewOfPaper ≡ 2#readPaper and 1#Document ≡ 2#Document
[Figure: the property correspondence links ∃reviewOfPaper.⊤ and ∃readPaper.⊤ by an equivalence; the 'disjoint' edges mark the resulting conflict.]
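Reading the diagram as a derivation (reconstructed from the axioms above): the property correspondence 1#reviewOfPaper ≡ 2#readPaper makes ∃2#readPaper.⊤ equivalent to ∃1#reviewOfPaper.⊤ in the merged ontology. From the O1 axioms we get ∃2#readPaper.⊤ ⊑ 1#Review ⊑ 1#Document, and via the correspondence 1#Document ≡ 2#Document together with 2#Document ⊑ ¬2#Person also ∃2#readPaper.⊤ ⊑ ¬2#Person. Since O2 entails ∃2#readPaper.⊤ ⊑ 2#Reviewer ⊑ 2#Person as well, the concept ∃2#readPaper.⊤ (and hence the property 2#readPaper) becomes unsatisfiable, so the two correspondences form a conflict that the pattern can detect from the classified hierarchies alone.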