University of the Aegean AI – LAB ESWC 2008 From Conceptual to Instance Matching George A. Vouros AI Lab Department of Information and Communication Systems.

Slides:



Advertisements
Similar presentations
Dr. Leo Obrst MITRE Information Semantics Information Discovery & Understanding Command & Control Center February 6, 2014February 6, 2014February 6, 2014.
Advertisements

CS 267: Automated Verification Lecture 8: Automata Theoretic Model Checking Instructor: Tevfik Bultan.
Chapter 5: Introduction to Information Retrieval
Sandro Spina, John Abela Department of CS & AI, University of Malta. Mutually compatible and incompatible merges for the search of the smallest consistent.
Formal Modelling of Reactive Agents as an aggregation of Simple Behaviours P.Kefalas Dept. of Computer Science 13 Tsimiski Str Thessaloniki Greece.
Database Systems: Design, Implementation, and Management Tenth Edition
MODULE 4 File and Folder Management. Creating file and folder A computer file is a resource for storing information, which is available to a computer.
An Extensible System for Merging Two Models Rachel Pottinger University of Washington Supervisors: Phil Bernstein and Alon Halevy.
Unsupervised Feature Selection for Multi-Cluster Data Deng Cai et al, KDD 2010 Presenter: Yunchao Gong Dept. Computer Science, UNC Chapel Hill.
Date:2011/06/08 吳昕澧 BOA: The Bayesian Optimization Algorithm.
Interactive Generation of Integrated Schemas Laura Chiticariu et al. Presented by: Meher Talat Shaikh.
Chapter 2 Data Models Database Systems: Design, Implementation, and Management, Seventh Edition, Rob and Coronel.
Gimme’ The Context: Context- driven Automatic Semantic Annotation with CPANKOW Philipp Cimiano et al.
1 Efficient Discovery of Conserved Patterns Using a Pattern Graph Inge Jonassen Pattern Discovery Arwa Zabian 13/07/2015.
Chapter 17 Methodology – Physical Database Design for Relational Databases Transparencies © Pearson Education Limited 1995, 2005.
Aparna Kulkarni Nachal Ramasamy Rashmi Havaldar N-grams to Process Hindi Queries.
Learning Table Extraction from Examples Ashwin Tengli, Yiming Yang and Nian Li Ma School of Computer Science Carnegie Mellon University Coling 04.
L. Padmasree Vamshi Ambati J. Anand Chandulal J. Anand Chandulal M. Sreenivasa Rao M. Sreenivasa Rao Signature Based Duplicate Detection in Digital Libraries.
Semantic Interoperability Jérôme Euzenat INRIA & LIG France Natasha Noy Stanford University USA.
Modeling (Chap. 2) Modern Information Retrieval Spring 2000.
Semantic Matching Pavel Shvaiko Stanford University, October 31, 2003 Paper with Fausto Giunchiglia Research group (alphabetically ordered): Fausto Giunchiglia,
CONTI’2008, 5-6 June 2008, TIMISOARA 1 Towards a digital content management system Gheorghe Sebestyen-Pal, Tünde Bálint, Bogdan Moscaliuc, Agnes Sebestyen-Pal.
Artificial Intelligence Lecture No. 28 Dr. Asad Ali Safi ​ Assistant Professor, Department of Computer Science, COMSATS Institute of Information Technology.
An Integrated Approach to Extracting Ontological Structures from Folksonomies Huairen Lin, Joseph Davis, Ying Zhou ESWC 2009 Hyewon Lim October 9 th, 2009.
Exploiting Wikipedia as External Knowledge for Document Clustering Sakyasingha Dasgupta, Pradeep Ghosh Data Mining and Exploration-Presentation School.
1 On the Criteria To Be Used in Decomposing Systems into Modules by D.L.Parnas Dec presented by Yuanhua Qu for spring 2003 CS5391.
BACKGROUND KNOWLEDGE IN ONTOLOGY MATCHING Pavel Shvaiko joint work with Fausto Giunchiglia and Mikalai Yatskevich INFINT 2007 Bertinoro Workshop on Information.
Automatic Lexical Annotation Applied to the SCARLET Ontology Matcher Laura Po and Sonia Bergamaschi DII, University of Modena and Reggio Emilia, Italy.
Machine Learning Approach for Ontology Mapping using Multiple Concept Similarity Measures IEEE/ACIS International Conference on Computer and Information.
Lecture 9 Methodology – Physical Database Design for Relational Databases.
A Z Approach in Validating ORA-SS Data Models Scott Uk-Jin Lee Jing Sun Gillian Dobbie Yuan Fang Li.
Semantic Matching Fausto Giunchiglia work in collaboration with Pavel Shvaiko The Italian-Israeli Forum on Computer Science, Haifa, June 17-18, 2003.
CSE 6331 © Leonidas Fegaras Information Retrieval 1 Information Retrieval and Web Search Engines Leonidas Fegaras.
A Scalable Self-organizing Map Algorithm for Textual Classification: A Neural Network Approach to Thesaurus Generation Dmitri G. Roussinov Department of.
Document retrieval Similarity –Vector space model –Multi dimension Search –Range query –KNN query Query processing example.
1 University of Palestine Topics In CIS ITBS 3202 Ms. Eman Alajrami 2 nd Semester
Minor Thesis A scalable schema matching framework for relational databases Student: Ahmed Saimon Adam ID: Award: MSc (Computer & Information.
Datasets on the GRID David Adams PPDG All Hands Meeting Catalogs and Datasets session June 11, 2003 BNL.
10/10/2012ISC239 Isabelle Bichindaritz1 Physical Database Design.
Unit-1 Introduction Prepared by: Prof. Harish I Rathod
University of Malta CSA3080: Lecture 6 © Chris Staff 1 of 20 CSA3080: Adaptive Hypertext Systems I Dr. Christopher Staff Department.
A Classification of Schema-based Matching Approaches Pavel Shvaiko Meaning Coordination and Negotiation Workshop, ISWC 8 th November 2004, Hiroshima, Japan.
Wikipedia as Sense Inventory to Improve Diversity in Web Search Results Celina SantamariaJulio GonzaloJavier Artiles nlp.uned.es UNED,c/Juan del Rosal,
GIS Data Models GEOG 370 Christine Erlien, Instructor.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Externally growing self-organizing maps and its application to database visualization and exploration.
SemiBoost : Boosting for Semi-supervised Learning Pavan Kumar Mallapragada, Student Member, IEEE, Rong Jin, Member, IEEE, Anil K. Jain, Fellow, IEEE, and.
Ontology Quality by Detection of Conflicts in Metadata Budak I. Arpinar Karthikeyan Giriloganathan Boanerges Aleman-Meza LSDIS lab Computer Science University.
LogTree: A Framework for Generating System Events from Raw Textual Logs Liang Tang and Tao Li School of Computing and Information Sciences Florida International.
An Ontological Approach to Financial Analysis and Monitoring.
2 1 Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel Data Models Why data models are important About the basic data-modeling.
Overview of Compilation Prepared by Manuel E. Bermúdez, Ph.D. Associate Professor University of Florida Programming Language Principles Lecture 2.
Instance Discovery and Schema Matching With Applications to Biological Deep Web Data Integration Tantan Liu, Fan Wang, Gagan Agrawal {liut, wangfa,
Semantic Interoperability in GIS N. L. Sarda Suman Somavarapu.
CS791 - Technologies of Google Spring A Web­based Kernel Function for Measuring the Similarity of Short Text Snippets By Mehran Sahami, Timothy.
Unsupervised Classification
Of 24 lecture 11: ontology – mediation, merging & aligning.
A Self-organizing Semantic Map for Information Retrieval Xia Lin, Dagobert Soergel, Gary Marchionini presented by Yi-Ting.
Information Retrieval Inverted Files.. Document Vectors as Points on a Surface Normalize all document vectors to be of length 1 Define d' = Then the ends.
Introduction toData structures and Algorithms
WP4 Models and Contents Quality Assessment
Introduction to Algorithms
An Efficient Method for Computing Alignment Diagnoses
Designing Cross-Language Information Retrieval System using various Techniques of Query Expansion and Indexing for Improved Performance  Hello everyone,
Using Partial Reference Alignments to Align Ontologies
Methodology – Physical Database Design for Relational Databases
Representation of documents and queries
Property consolidation for entity browsing
[jws13] Evaluation of instance matching tools: The experience of OAEI
Introduction to Algorithms
Integrating Taxonomies
Presentation transcript:

University of the Aegean AI – LAB ESWC 2008 From Conceptual to Instance Matching George A. Vouros AI Lab Department of Information and Communication Systems Eng. University of the Aegean Karlovassi, Samos, Greece

University of the Aegean AI – LAB ESWC 2008 Given two Ontologies (S 1,A 1, I 1 ), (S 2,A 2,I 2 ) find a mapping (i.e. equivalences) between their signatures so that The translation of A 1 with respect to this mapping is satisfied by A 2. Ontology Matching at the Conceptual Level 2

University of the Aegean AI – LAB ESWC 2008

University of the Aegean AI – LAB ESWC 2008 Given two Ontologies (S 1,A 1, I 1 ), (S 2,A 2,I 2 ), find a mapping between their - Signatures (i.e. equivalences) & - Instances (i.e. “same as” assertions) such that the assertions in I 2, together with the “translated” assertions in I 1 are consistent with A 2 and the translated axioms in A 1 Instance Matching 4

University of the Aegean AI – LAB ESWC 2008 The Instance Matching contest was composed by two tracks The ISLab Instance Matching Benchmark (IIMB) is a benchmark automatically generated starting from one data source that is automatically modified according to various criteria. The AKT-Rexa-DBLP test case aims at testing the capability of the tools to match individuals. All three datasets were structured using the same schema. The challenges for the matchers included ambiguous labels (person names and paper titles) and noisy data (some sources contained incorrect information). OAEI 09 Instance Matching Track 5

University of the Aegean AI – LAB ESWC 2008 (a) Scalability (b) Different methods exploit different information concerning instances, or different facets of the same type of information (c) Assumptions concerning the structure of the “search space” Issues 6

University of the Aegean AI – LAB ESWC 2008 Our first approach: - Computing clusters of “same as” instances where each cluster is represented by a “model” of the cluster. - Clusters and models are stored on disk files - New instances are compared with each cluster by exploiting the “models” - The highest similarity above a specific threshold indicates the cluster of the new instance Scalability 7

University of the Aegean AI – LAB ESWC 2008  COCLU: Aims at discovering typographic similarities between sequences of characters over an alphabet (ASCII or UTF character set), aiming to reveal the similarity of classes instances’ lexicalizations during ontology population. It is a partition-based clustering algorithm which divides data into clusters and searches the space of possible clusters using a greedy heuristic. Methods 8

University of the Aegean AI – LAB ESWC 2008  Vector Space Model – based (VSM) method: It computes the matching of two pseudo documents. In our case each such document corresponds to an instance and it is produced by the words in the vicinity of that instance. The “vicinity” includes all words occurring (i) to the local name, label and comments of this concept, (ii) to any of its properties (exploiting the properties’ local names, labels and comments), as well as (iii) to any of its related concepts or instances. Each document is represented by a vector of n weighted index words, where the weight of a word is the frequency of its appearance in the document. The similarity between two vectors is computed by means of the cosine similarity measure. Methods 9

University of the Aegean AI – LAB ESWC 2008  Simple (e.g. the union/intersection of clusters with at least one common member)  Biased : The clusters of one method may be used as input by the another.  Model based: Set the constraints that must be satisfied according to the axioms of the schemas and run a generic method (e.g. the max-sum algorithm or a DCOP method) that reconciles the prefernces and conflicts among the individual methods. Synthesis of different methods 10

University of the Aegean AI – LAB ESWC 2008  Get results from the individual methods….  Implement the synthesis of different methods  Investigate the interaction between conceptual mapping and instance mapping for a sophisticated but scalable synthesis method. To be done 11