A Classification of Schema-based Matching Approaches
Pavel Shvaiko
Meaning Coordination and Negotiation Workshop, ISWC, 8th November 2004, Hiroshima, Japan

1 A Classification of Schema-based Matching Approaches
Pavel Shvaiko
Meaning Coordination and Negotiation Workshop, ISWC, 8th November 2004, Hiroshima, Japan

2 Outline
MCN workshop, ISWC, 8th November 2004, Hiroshima, Japan
- Introduction
- Classification of schema-based matching approaches
- Matching systems
- Conclusions
- Future work

3 Introduction

4 Semantic Web and the Match operator
- Information sources (e.g., database schemas, taxonomies or ontologies) can be viewed as graph-like structures containing terms and their inter-relationships
- Match is one of the key operators for enabling the Semantic Web, since it takes two graph-like structures and produces a mapping between the nodes of the graphs that “correspond” semantically to each other

5 Example: Two XML schemas
[Figure: two XML schemas, labelled HT and FT]

6 Schema matching vs Ontology alignment
Differences:
- Database schemas often do not provide explicit semantics for their data
- Ontologies are logical systems that themselves incorporate semantics (intuitive or formal)
  - E.g., ontology definitions as a set of logical axioms
- Ontology data models are richer (the number of primitives is higher, and they are more complex) than schema data models
  - E.g., OWL allows defining new classes as unions or intersections of other classes
Commonalities:
- Ontologies can be viewed as schemas for knowledge bases
- Techniques developed for both problems are of mutual benefit

7 Matching
[Diagram: the Match operator, with schemas S1 and S2, mappings {M} and {M'}, parameters (e.g., weights, thresholds) and auxiliary information (e.g., lexicons, thesauri)]
A mapping element M is a 5-tuple ⟨id, e, e', n, R⟩
- n = {x ∈ [0,1]}
- R = {=, ⊑, ⊒, ⊥, ⊓}
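The 5-tuple on this slide can be written down as a small record type. The sketch below is only an illustration: the field names follow the reading "identifier, two matched entities, confidence n in [0,1], relation R", the relation names are chosen for readability, and the entity paths in the usage line are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    """Semantic relations a mapping element may carry (names are illustrative)."""
    EQUIVALENCE = "="
    LESS_GENERAL = "⊑"
    MORE_GENERAL = "⊒"
    MISMATCH = "⊥"
    OVERLAPPING = "⊓"


@dataclass(frozen=True)
class MappingElement:
    """One correspondence between an entity of S1 and an entity of S2."""
    id: str             # unique identifier of the mapping element
    e: str              # entity of the first schema/ontology
    e_prime: str        # entity of the second schema/ontology
    n: float            # confidence measure, expected to lie in [0, 1]
    R: Relation         # relation holding between e and e_prime

    def __post_init__(self) -> None:
        # Reject confidence values outside the [0, 1] range.
        if not 0.0 <= self.n <= 1.0:
            raise ValueError(f"confidence must be in [0, 1], got {self.n}")


# Hypothetical usage: the node paths are made up for illustration.
m = MappingElement("m1", "S1/Digital_Cameras", "S2/Photo_and_Cameras",
                   1.0, Relation.LESS_GENERAL)
```

A matcher in this style would return a collection of such elements as its mapping.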

8 Classification of Schema-based Matching Approaches

9 Schema matching approaches
Taxonomy from [E. Rahm, P. Bernstein, 2001]
- Individual matchers
  - Schema-based
    - Element-level
      - Linguistic: Names, Descriptions
      - Constraint-based: Types, Keys
    - Structure-level
      - Constraint-based: Graph matching
  - Instance-based
    - Linguistic: IR (word frequencies, key terms)
    - Constraint-based: Value pattern and ranges
- Combined matchers
  - Hybrid
  - Composite
    - Manual composition
    - Automatic composition

10 Semantic view on matching
What is missing in the taxonomy of schema matching approaches we have just seen? Two new criteria:
Heuristic vs formal:
- Heuristic techniques try to guess relations which may hold between similar labels or graph structures
- Formal techniques have model-theoretic semantics, which is used to justify their results
Implicit vs explicit:
- Implicit techniques are syntax-driven techniques
  - E.g., techniques which consider labels as strings, or analyze data types, or the soundex of schema/ontology elements
- Explicit techniques exploit the semantics of labels
  - E.g., thesauri, ontologies
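An implicit, syntax-driven technique that treats labels purely as strings might look like the sketch below. It uses the `difflib.SequenceMatcher` ratio from the Python standard library as a stand-in similarity measure; the slides do not say which string measure any particular system uses, and the normalization step (lower-casing, underscores to spaces) is an assumption.

```python
from difflib import SequenceMatcher


def normalize(label: str) -> str:
    """Lower-case a label and treat underscores as spaces (an assumed convention)."""
    return label.lower().replace("_", " ").strip()


def label_similarity(a: str, b: str) -> float:
    """Return a string similarity in [0, 1] between two element labels.

    Only the characters of the labels are compared; no meaning is consulted,
    which is what makes this an implicit technique.
    """
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


# Labels taken from the slides' camera example.
print(label_similarity("FJFLM", "FujiFilm"))
print(label_similarity("FJFLM", "Nikon"))
```

A heuristic matcher would typically compare such a score against a threshold parameter to decide whether to propose a correspondence.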

11 Schema Matching Approaches
- Individual matchers
  - Schema-based
    - Element-level
      - Linguistic: Names, Descriptions
      - Constraint-based: Types, Keys
    - Structure-level
      - Constraint-based: Graph matching
New criteria: Heuristic vs Formal, Implicit vs Explicit

12 Schema-based Matching Approaches
[Diagram: classification tree; the levels and leaves are listed in the order they appear]
Heuristic Techniques / Formal Techniques
Element-level / Structure-level
Implicit / Explicit
String-based, Constraint-based, Auxiliary Information, Ontology-based, Reasoner-based
- Names
- Descriptions
- Type similarity
- Key properties
- Precompiled dictionary
- Lexicons
- Graph matching
- Children
- Leaves
- Taxonomic structure
- OWL properties
- Propositional SAT
- Modal SAT

13 Heuristic Techniques: Example
Element-level explicit techniques
- Precompiled dictionary (Cupid, COMA)
  - E.g., syn key - "NKN:Nikon = syn"
- Lexicons (S-Match, CTXmatch)
  - E.g., WordNet: Camera is a hypernym of Digital Camera, therefore Digital_Cameras ⊑ Photo_and_Cameras
Structure-level explicit techniques
- Taxonomic structure (Anchor-Prompt, NOM)
  - E.g., given that Digital_Cameras ⊑ Photo_and_Cameras, FJFLM and FujiFilm can be found as an appropriate match
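The two element-level explicit techniques on this slide can be sketched as lookups in external knowledge. The tables below are tiny hand-written stand-ins for a precompiled dictionary and for a lexicon such as WordNet; they are not the data or the API of Cupid, COMA, S-Match or CTXmatch, and the function names are hypothetical.

```python
from typing import Optional

# Toy precompiled dictionary: pairs of labels declared synonymous.
SYNONYMS = {frozenset({"nkn", "nikon"})}

# Toy lexicon: each term maps to its direct hypernym (more general term).
HYPERNYM_OF = {"digital camera": "camera"}


def normalize(label: str) -> str:
    """Lower-case a label and treat underscores as spaces (an assumed convention)."""
    return label.lower().replace("_", " ").strip()


def is_hypernym(general: str, specific: str) -> bool:
    """True if `general` is reachable from `specific` by following hypernym links."""
    seen = set()
    current = specific
    while current in HYPERNYM_OF and current not in seen:
        seen.add(current)
        current = HYPERNYM_OF[current]
        if current == general:
            return True
    return False


def lookup_relation(a: str, b: str) -> Optional[str]:
    """Return '=', '⊑' or '⊒' for two labels, or None if the tables say nothing."""
    a, b = normalize(a), normalize(b)
    if a == b or frozenset({a, b}) in SYNONYMS:
        return "="
    if is_hypernym(b, a):   # b is more general than a
        return "⊑"
    if is_hypernym(a, b):   # a is more general than b
        return "⊒"
    return None


print(lookup_relation("NKN", "Nikon"))              # =
print(lookup_relation("Digital_Camera", "Camera"))  # ⊑
```

Unlike a string comparison, the answer here comes from the meaning recorded in the auxiliary resource, which is why "NKN" and "Nikon" can be related even though the strings differ.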

14 Formal Techniques: Example
Element-level explicit techniques
- OWL properties (NOM)
  - E.g., the sameClassAs constructor explicitly states that one class is equivalent to the other
  - Digital_Cameras = Camera DigitalPhoto_Producer
Structure-level explicit techniques
- Propositional satisfiability (SAT) (S-Match, CTXmatch)
  - The approach is to translate the matching problem, namely the two graphs (trees) and mapping queries, into a propositional formula and then to check it for validity
- Modal SAT (S-Match)
  - The idea is to enhance propositional logic with modal logic (or ALC DL) operators. The matching problem is therefore translated into a modal logic formula, which is then checked for validity using sound and complete satisfiability search procedures.
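The propositional approach can be illustrated on a toy query. In the sketch below, node labels become propositional variables, background knowledge becomes an axiom, and the question "is node 1 less general than node 2?" becomes a validity check of `axioms → (C1 → C2)`. The encoding and the variable names are assumptions made for illustration, and exhaustive truth-table enumeration stands in for the SAT solvers that real systems use.

```python
from itertools import product
from typing import Callable, Dict, List

Assignment = Dict[str, bool]
Formula = Callable[[Assignment], bool]


def is_valid(formula: Formula, variables: List[str]) -> bool:
    """A formula is valid if it is true under every truth assignment."""
    return all(
        formula(dict(zip(variables, values)))
        for values in product([False, True], repeat=len(variables))
    )


VARIABLES = ["digital_camera", "camera", "photo"]


def axioms(v: Assignment) -> bool:
    # Background knowledge: every digital camera is a camera.
    return (not v["digital_camera"]) or v["camera"]


def concept_1(v: Assignment) -> bool:
    # Hypothetical concept of a node labelled Digital_Cameras.
    return v["digital_camera"]


def concept_2(v: Assignment) -> bool:
    # Hypothetical concept of a node labelled Photo_and_Cameras.
    return v["photo"] or v["camera"]


def less_general(v: Assignment) -> bool:
    # axioms -> (C1 -> C2)
    return (not axioms(v)) or (not concept_1(v)) or concept_2(v)


def more_general(v: Assignment) -> bool:
    # axioms -> (C2 -> C1)
    return (not axioms(v)) or (not concept_2(v)) or concept_1(v)


print(is_valid(less_general, VARIABLES))  # True
print(is_valid(more_general, VARIABLES))  # False
```

Checking validity of a formula is equivalent to checking that its negation is unsatisfiable, which is how a SAT procedure would be applied to the same query.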

15 Matching Systems

16 Characteristics of state of the art matchers

Conclusions

17 Uses of Classification
- The proposed classification provides a common conceptual basis, and hence can be used for comparing (analytically) different existing schema/ontology matching systems
- It can help in designing a new matching system, or an elementary matcher, taking advantage of state-of-the-art solutions

18 Future Work
- Provide a more detailed view on the general properties of matching algorithms
- Add language-based techniques to the classification, e.g., tokenization, lemmatization, elimination
- Extend the classification by taking into account DL-based matchmaking solutions
- Extend the classification by adding newly appearing matching techniques and the systems implementing them, e.g., OLA, QOM
- Compare matching systems experimentally as well, with the help of benchmarks

19 References
- Knowledge Web project: http://knowledgeweb.semanticweb.org/
- Project website at DIT - ACCORD: http://www.dit.unitn.it/~accord/
- P. Shvaiko: A classification of schema-based matching approaches. Technical Report, DIT-04-93, University of Trento, 2004.
- E. Rahm, P. Bernstein: A survey of approaches to automatic schema matching. In Very Large Databases Journal, 10(4):334-350, 2001.
- F. Giunchiglia, P. Shvaiko: Semantic matching. In The Knowledge Engineering Review Journal, 18(3):265-280, 2003.
- P. Bouquet, L. Serafini, S. Zanobini: Semantic coordination: a new approach and an application. In Proceedings of ISWC, 130-145, 2003.

20 Thank you!

