
1 University of the Aegean AI – LAB ESWC 2008
From Conceptual to Instance Matching
George A. Vouros
AI Lab, Department of Information and Communication Systems Eng., University of the Aegean
83200 Karlovassi, Samos, Greece
georgev@aegean.gr

2 Ontology Matching at the Conceptual Level
Given two ontologies (S1, A1, I1) and (S2, A2, I2), find a mapping (i.e. equivalences) between their signatures such that the translation of A1 with respect to this mapping is satisfied by A2.


4 Instance Matching
Given two ontologies (S1, A1, I1) and (S2, A2, I2), find a mapping between their
- signatures (i.e. equivalences) and
- instances (i.e. “same as” assertions)
such that the assertions in I2, together with the “translated” assertions in I1, are consistent with A2 and the translated axioms in A1.

5 OAEI 09 Instance Matching Track
The Instance Matching contest was composed of two tracks:
- The ISLab Instance Matching Benchmark (IIMB), a benchmark generated automatically from a single data source that is automatically modified according to various criteria.
- The AKT-Rexa-DBLP test case, which aims at testing the capability of the tools to match individuals. All three datasets were structured using the same schema. The challenges for the matchers included ambiguous labels (person names and paper titles) and noisy data (some sources contained incorrect information).

6 Issues
(a) Scalability
(b) Different methods exploit different information concerning instances, or different facets of the same type of information
(c) Assumptions concerning the structure of the “search space”

7 Scalability
Our first approach:
- Computing clusters of “same as” instances, where each cluster is represented by a “model” of the cluster.
- Clusters and models are stored in files on disk.
- New instances are compared with each cluster by exploiting the “models”.
- The highest similarity above a specific threshold indicates the cluster of the new instance.
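
The assignment step on this slide can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the similarity function (Python's `difflib.SequenceMatcher`), the choice of the first member as the cluster "model", the threshold value of 0.8, and the decision to open a new cluster when no similarity exceeds the threshold are all assumptions. Clusters are also kept in memory here rather than in disk files.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Assumed stand-in similarity: character-level ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def assign(instance: str, clusters: list, threshold: float = 0.8) -> int:
    """Place `instance` in the cluster whose model is most similar to it.

    Each cluster is a dict {"model": str, "members": [str]}.
    Returns the index of the cluster that received the instance.
    """
    best_index, best_score = None, threshold
    for index, cluster in enumerate(clusters):
        score = similarity(instance, cluster["model"])
        # Only similarities strictly above the threshold (and above the
        # best score seen so far) can claim the instance.
        if score > best_score:
            best_index, best_score = index, score
    if best_index is None:
        # Assumption: an instance that matches no cluster starts a new one
        # and becomes that cluster's model.
        clusters.append({"model": instance, "members": [instance]})
        return len(clusters) - 1
    clusters[best_index]["members"].append(instance)
    return best_index
```

Because each new instance is compared only with one model per cluster, the number of comparisons grows with the number of clusters rather than with the number of instances seen so far.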

8 Methods
- COCLU: Aims at discovering typographic similarities between sequences of characters over an alphabet (ASCII or UTF character set), in order to reveal the similarity of class instances' lexicalizations during ontology population. It is a partition-based clustering algorithm that divides data into clusters and searches the space of possible clusters using a greedy heuristic.

9 Methods
- Vector Space Model (VSM) based method: It computes the matching of two pseudo documents. In our case each such document corresponds to an instance and is produced from the words in the vicinity of that instance. The “vicinity” includes all words occurring (i) in the local name, label and comments of this concept, (ii) in any of its properties (exploiting the properties' local names, labels and comments), and (iii) in any of its related concepts or instances. Each document is represented by a vector of n weighted index words, where the weight of a word is the frequency of its appearance in the document. The similarity between two vectors is computed by means of the cosine similarity measure.
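
A minimal sketch of the weighting and similarity computation described on this slide, assuming the vicinity texts of an instance have already been collected as plain strings. The function names and the simple lowercase alphanumeric tokenizer are illustrative assumptions, not part of the presented system.

```python
import math
import re
from collections import Counter


def pseudo_document(vicinity_texts):
    """Build a term-frequency vector from the texts around an instance.

    `vicinity_texts` is an iterable of strings (local names, labels,
    comments, ...). Tokenization here is an assumed simplification.
    """
    words = []
    for text in vicinity_texts:
        words.extend(re.findall(r"[a-z0-9]+", text.lower()))
    return Counter(words)  # weight of a word = its frequency


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)
```

For example, `cosine_similarity(pseudo_document(["ontology matching"]), pseudo_document(["instance matching"]))` compares two two-word documents that share one word.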

10 Synthesis of different methods
- Simple (e.g. the union/intersection of clusters with at least one common member).
- Biased: The clusters of one method may be used as input by another.
- Model based: Set the constraints that must be satisfied according to the axioms of the schemas and run a generic method (e.g. the max-sum algorithm or a DCOP method) that reconciles the preferences and conflicts among the individual methods.
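
As an illustration of the "simple" option only, the sketch below takes the clusters produced by several methods and unions every group of clusters that share at least one member. The function name and the representation of clusters as Python sets of instance identifiers are assumptions; the intersection variant and the biased and model-based options are not covered.

```python
def merge_overlapping(clusters):
    """Union all clusters that share at least one common member.

    `clusters` is an iterable of sets of instance identifiers, e.g. the
    concatenated output of two matching methods. Returns a list of
    pairwise disjoint sets.
    """
    merged = []
    for cluster in clusters:
        current = set(cluster)
        remaining = []
        for existing in merged:
            if existing & current:
                # Shared member found: absorb the existing cluster.
                current |= existing
            else:
                remaining.append(existing)
        remaining.append(current)
        merged = remaining
    return merged
```

Merging is transitive in this sketch: if one method groups a with b and another groups b with d, all three end up in the same cluster, which may or may not be desirable when the individual methods are noisy.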

11 To be done
- Get results from the individual methods.
- Implement the synthesis of different methods.
- Investigate the interaction between conceptual mapping and instance mapping for a sophisticated but scalable synthesis method.

