Semantic integration of data in database systems and ontologies

Presentation transcript:

Semantic integration of data in database systems and ontologies. Technical University of Liberec, Faculty of Mechatronics. Ing. Petra Šeflová

Integration of data: merging a set of given schemas into a global schema.
Semantic integration: the part of data integration that focuses on data exchange between applications in light of the data's meaning, content and required business rules.

Integration of data: example
[Figure: A data integration system in the real estate domain. The query "Find houses with four bathrooms and price under $500,000" is posed against a mediated schema; wrappers connect the mediated schema to the source schemas of homeseekers.com, realestate.com and greathomes.com.]
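The mediated-schema idea in the figure can be sketched in a few lines of Python. This is only an illustration: the three site names and the query come from the slide, while the row data, the per-source field names (`baths`, `cost`, `price_usd`, and so on) and the helper names `make_wrapper` and `find_houses` are assumptions made up for the example.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]

# Hypothetical rows, each in its source's own vocabulary (illustrative data only).
HOMESEEKERS_ROWS = [{"baths": 4, "cost": 450_000}, {"baths": 2, "cost": 300_000}]
REALESTATE_ROWS = [{"bathrooms": 4, "price_usd": 520_000}]
GREATHOMES_ROWS = [{"num_bath": 4, "amount": 480_000}, {"num_bath": 3, "amount": 410_000}]


def make_wrapper(rows: List[Row], bath_key: str, price_key: str,
                 source: str) -> Callable[[], List[Row]]:
    """Build a wrapper that translates one source's rows into the mediated schema."""
    def wrapper() -> List[Row]:
        return [
            {"bathrooms": r[bath_key], "price": r[price_key], "source": source}
            for r in rows
        ]
    return wrapper


# One wrapper per source; the mediated schema is (bathrooms, price, source).
WRAPPERS = [
    make_wrapper(HOMESEEKERS_ROWS, "baths", "cost", "homeseekers.com"),
    make_wrapper(REALESTATE_ROWS, "bathrooms", "price_usd", "realestate.com"),
    make_wrapper(GREATHOMES_ROWS, "num_bath", "amount", "greathomes.com"),
]


def find_houses(bathrooms: int, max_price: int) -> List[Row]:
    """Answer a query posed against the mediated schema by asking every wrapper."""
    results = []
    for wrapper in WRAPPERS:
        for row in wrapper():
            if row["bathrooms"] == bathrooms and row["price"] < max_price:
                results.append(row)
    return results
```

Calling `find_houses(4, 500_000)` expresses the slide's query once, against the mediated schema, without the caller knowing how each site names its fields.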

Applications: catalog integration in B2B applications, e-commerce, bioinformatics, P2P databases, agent communication, web services integration.

Key commonalities of semantic integration applications
They use structured representations (e.g. relational schemas and XML DTDs).
They must resolve heterogeneities with respect to the schemas and their data.
They enable manipulation of the schemas: merging the schemas and computing differences.
They enable translation of data and queries across the schemas/ontologies.

Database schema vs. ontology
Database schema: presents the definition of the physical layout of a system (database).
Ontology: a system of knowledge about the world. It makes no claim to coherence (there are many partial ontologies) and is frequently a specifically created artefact.
Gruber's definition: an ontology is a formal, explicit specification of a shared conceptualization.

Problems of semantic integration
The semantics of elements can be inferred from only a few information sources: the creators of the data, documentation, and the associated schema and data.
Schema elements are typically matched based on clues in the schema and data.
Schema and data clues are often incomplete.
Matching is often subjective, depending on the application.

Matching process: takes as input two schemas/ontologies, each consisting of a set of discrete entities, and determines as output the relationships holding between these entities.
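The input/output shape of a matcher can be written down as a small sketch. The `Correspondence` type, the `normalize` helper and the name-equality test are assumptions chosen to keep the example short; real matchers use far richer evidence than normalized names.

```python
from typing import NamedTuple, Set


class Correspondence(NamedTuple):
    """One output relationship between an entity of each input."""
    left: str
    right: str
    relation: str  # "=" stands for equivalence in this sketch


def normalize(name: str) -> str:
    """Lower-case a name and drop everything except letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def match(schema_a: Set[str], schema_b: Set[str]) -> Set[Correspondence]:
    """Input: two sets of discrete entities. Output: relationships between them."""
    return {
        Correspondence(a, b, "=")
        for a in schema_a
        for b in schema_b
        if normalize(a) == normalize(b)
    }
```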

Example: the schemas of two relational databases S and T on house listings, and the semantic correspondences between them.

Schema S

Houses
Location      Price ($)   Agent-id
Atlanta, GA   360,000     32
Raleigh, NC   430,000     15

Agents
Id   Name         City      State
32   Mike Brown   Atlanta   GA
15   Jean Laup    Raleigh   NC

Schema T

Area          List-price   Agent-address   Agent-name
Denver, CO    550,000      Boulder, CO     Laura Smith
Atlanta, GA   370,800      Athens,         Mike Brown
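Once correspondences are known, they can be used to translate data from one schema to the other. The sketch below assumes a plausible set of correspondences for this example (Location to Area, Price to List-price, Agents.Name to Agent-name via Agent-id = Id, and City plus State to Agent-address); the slide does not spell them out, so they are assumptions, as is the function name `to_schema_t`. The rows follow the example tables with spelling normalized.

```python
from typing import Any, Dict, List

# Rows of schema S, following the example tables above.
HOUSES = [
    {"location": "Atlanta, GA", "price": 360_000, "agent_id": 32},
    {"location": "Raleigh, NC", "price": 430_000, "agent_id": 15},
]
AGENTS = {
    32: {"name": "Mike Brown", "city": "Atlanta", "state": "GA"},
    15: {"name": "Jean Laup", "city": "Raleigh", "state": "NC"},
}


def to_schema_t(houses: List[Dict[str, Any]],
                agents: Dict[int, Dict[str, str]]) -> List[Dict[str, Any]]:
    """Translate S rows into T's layout using the assumed correspondences."""
    rows = []
    for house in houses:
        agent = agents[house["agent_id"]]  # join Houses.Agent-id = Agents.Id
        rows.append({
            "area": house["location"],                              # Location = Area
            "list-price": house["price"],                           # Price = List-price
            "agent-address": f'{agent["city"]}, {agent["state"]}',  # City + State
            "agent-name": agent["name"],                            # Name = Agent-name
        })
    return rows
```

Note that the last two correspondences are not one-to-one: Agent-address is built from two source columns, and Agent-name needs a join across two tables.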

Matching techniques: two groups, rule-based and learning-based.

Rule-based solutions
Many of the early as well as current matching solutions employ hand-crafted rules.
They exploit schema information: element names, data types, structures and integrity constraints.
Rules can provide a quick and concise method to capture valuable user knowledge about the domain.

Rule-based solutions: benefits and drawbacks
Benefits: they are "relatively inexpensive", do not require training, and operate only on the schema.
Drawbacks: they cannot exploit data instances effectively, and they cannot exploit previous matching efforts.
Examples: TranScm, DIKE, MOMIS, CUPID.

TranScm, DIKE, MOMIS, CUPID
TranScm: employs rules such as "two elements match if they have the same name (allowing synonyms) and the same number of subelements".
DIKE: computes the similarity between two schema elements based on the similarity of the characteristics of the elements and the similarity of related elements.
MOMIS: computes the similarity of schema elements as a weighted sum of the similarities of name, data type and substructure.
CUPID: employs rules that categorize elements based on names, data types and domains.
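Two of these rule styles can be illustrated with a short sketch: a name-plus-subelement-count rule in the spirit of the TranScm description, and a weighted sum in the spirit of the MOMIS description. This is not the code of either system. The `Element` type, the `SYNONYMS` table, the weights and the way substructure similarity is measured are all assumptions for the example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Element:
    name: str
    data_type: str = "string"
    children: List["Element"] = field(default_factory=list)


# Hypothetical synonym table; a real system would use a thesaurus.
SYNONYMS = {frozenset({"price", "cost"}), frozenset({"location", "area"})}


def same_name(a: str, b: str) -> bool:
    """Names are equal, or listed as synonyms."""
    x, y = a.lower(), b.lower()
    return x == y or frozenset({x, y}) in SYNONYMS


def name_rule_match(a: Element, b: Element) -> bool:
    """Rule: same name (allowing synonyms) and same number of subelements."""
    return same_name(a.name, b.name) and len(a.children) == len(b.children)


def substructure_similarity(a: Element, b: Element) -> float:
    """Share of a's children that have a same-named child in b (assumed measure)."""
    if not a.children and not b.children:
        return 1.0
    matched = sum(
        1 for ca in a.children if any(same_name(ca.name, cb.name) for cb in b.children)
    )
    return matched / max(len(a.children), len(b.children))


def weighted_similarity(a: Element, b: Element,
                        weights: Tuple[float, float, float] = (0.5, 0.2, 0.3)) -> float:
    """Weighted sum of name, data type and substructure similarity."""
    name_sim = 1.0 if same_name(a.name, b.name) else 0.0
    type_sim = 1.0 if a.data_type == b.data_type else 0.0
    w_name, w_type, w_struct = weights
    return w_name * name_sim + w_type * type_sim + w_struct * substructure_similarity(a, b)
```

Both functions look only at the schema, which is why the approach needs no training data and also why it cannot use evidence from data instances.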

Learning-based solutions
They exploit both schema and data information.
They do exploit previous matching efforts.
Examples: the SemInt system, the LSD system, the iMAP system, Autoplex, Automatch.

SemInt, LSD, iMAP, Autoplex and Automatch
SemInt: uses a neural-network learning approach. It matches schema elements based on attribute specifications and statistics of the data content.
LSD: employs Naive Bayes over data instances, and develops a novel learning solution that exploits the hierarchical nature of XML data.
iMAP: matches the schemas of two sources by analyzing the descriptions of objects that are found in both sources.
Autoplex and Automatch: use a Naive Bayes learning approach that exploits data instances to match elements.
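The idea of Naive Bayes over data instances can be sketched with the standard library alone. This is a generic multinomial Naive Bayes with Laplace smoothing, not the implementation of LSD, Autoplex or Automatch; the class name `NaiveBayesMatcher`, the tokenizer and the training values are assumptions for the example.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Optional, Set


def tokenize(value: str) -> List[str]:
    """Split a data value into lower-cased words; every number becomes '<num>'."""
    return ["<num>" if t.isdigit() else t.lower()
            for t in re.findall(r"[A-Za-z]+|\d+", value)]


class NaiveBayesMatcher:
    """Learns which mediated-schema element a column's values look like."""

    def __init__(self) -> None:
        self.token_counts: Dict[str, Counter] = {}
        self.instance_counts: Counter = Counter()
        self.vocabulary: Set[str] = set()

    def train(self, label: str, values: List[str]) -> None:
        """Add example values known (from earlier matches) to belong to `label`."""
        counts = self.token_counts.setdefault(label, Counter())
        for value in values:
            self.instance_counts[label] += 1
            tokens = tokenize(value)
            counts.update(tokens)
            self.vocabulary.update(tokens)

    def predict(self, values: List[str]) -> Optional[str]:
        """Return the most probable label for an unmatched column of values."""
        tokens = [t for value in values for t in tokenize(value)]
        total_instances = sum(self.instance_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, counts in self.token_counts.items():
            # log prior + sum of log likelihoods, with add-one smoothing
            score = math.log(self.instance_counts[label] / total_instances)
            n_tokens = sum(counts.values())
            for token in tokens:
                score += math.log((counts[token] + 1) / (n_tokens + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Training on values from already-matched sources is what lets a learning-based matcher reuse previous matching efforts, which the rule-based approaches above cannot do.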

The matching dimensions: input dimensions, process dimensions, output dimensions.

Input dimensions
These concern the kind of input on which the algorithms operate.
First dimension: algorithms can be classified by the data/conceptual model in which the ontologies or schemas are expressed.
Second dimension: algorithms can be classified by the kind of data they exploit. Different approaches exploit different information in the input data/conceptual models: schema-level information, instance data, or both.

Process dimensions
The matching process can be classified by its general properties.
One distinction is the approximate or exact nature of the computation: exact algorithms compute the absolute solution to a problem, while approximate algorithms sacrifice exactness for performance.
Another gives three large classes, depending on whether the process relies on intrinsic input, external resources or some semantic theory: syntactic, external and semantic.

Output dimensions
These concern the form of the result the algorithms produce.
Form of the alignment: is it a one-to-one correspondence? Is any relation suitable? Does it have to be a final mapping element?
Graded or all-or-nothing answer: a system may deliver a graded answer (e.g. the correspondence holds with 98% confidence or with 4/5 probability) or an all-or-nothing answer. Some approaches determine correspondences using distance measures.
Kind of relations between entities that a system can provide: equivalence, subsumption, incompatibility.
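The difference between a graded answer, an all-or-nothing answer and a one-to-one alignment can be shown with a small sketch. The scores below are invented for illustration, and the greedy selection in `one_to_one` is just one simple way to enforce one-to-one output, assumed here for brevity.

```python
from typing import Dict, List, Tuple

Scores = Dict[Tuple[str, str], float]

# Hypothetical graded output of a matcher: (entity in S, entity in T) -> confidence.
GRADED: Scores = {
    ("Location", "Area"): 0.98,
    ("Price", "List-price"): 0.8,
    ("Location", "Agent-address"): 0.6,
    ("Agent-id", "Agent-name"): 0.3,
}


def all_or_nothing(scores: Scores, cutoff: float) -> List[Tuple[str, str]]:
    """Turn graded answers into yes/no answers by applying a confidence cutoff."""
    return sorted(pair for pair, confidence in scores.items() if confidence >= cutoff)


def one_to_one(scores: Scores) -> List[Tuple[str, str, float]]:
    """Greedily keep the best-scoring pairs so that no entity is used twice."""
    chosen: List[Tuple[str, str, float]] = []
    used_left, used_right = set(), set()
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    for (left, right), confidence in ranked:
        if left not in used_left and right not in used_right:
            chosen.append((left, right, confidence))
            used_left.add(left)
            used_right.add(right)
    return chosen
```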

Classification of elementary schema-based matching approaches
[Figure: classification tree of schema-based matching techniques. Granularity/input interpretation layer: element-level (syntactic, external) and structure-level (syntactic, external, semantic). Basic techniques layer: string-based, language-based, linguistic resources, constraint-based, alignment reuse, upper-level formal ontologies, graph-based, taxonomy-based, repository of structures, model-based. The remaining labels in the figure are: terminological, linguistic, structural, internal, relational, semantic.]

Element-level vs. structure-level
Element-level matching techniques compute mapping elements by analyzing entities in isolation, ignoring their relations with other entities.
Structure-level techniques compute mapping elements by analyzing how entities appear together in a structure.
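A minimal sketch of the two granularities follows. The element-level score compares two names in isolation using `difflib.SequenceMatcher` from the Python standard library; the structure-level score compares the attributes that surround each entity, here with a Jaccard overlap. Both measures and the `Agents`/`Brokers` tables are assumptions picked for illustration.

```python
from difflib import SequenceMatcher
from typing import Set


def element_level(name_a: str, name_b: str) -> float:
    """Compare two entities in isolation: string similarity of their names."""
    return SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio()


def structure_level(children_a: Set[str], children_b: Set[str]) -> float:
    """Compare how the entities appear in a structure: Jaccard overlap of children."""
    if not children_a and not children_b:
        return 1.0
    return len(children_a & children_b) / len(children_a | children_b)


# Hypothetical tables: the names differ, but most attributes are shared.
AGENTS_COLUMNS = {"id", "name", "city", "state"}
BROKERS_COLUMNS = {"id", "name", "city", "phone"}
```

For these two tables the names `Agents` and `Brokers` share few characters, so the element-level score is low, while three of the five distinct columns are shared, so the structure-level score is higher.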

Internal vs. external techniques
Internal: exploit only the information that comes with the input schemas/ontologies, through a syntactic or a semantic interpretation of the input.
External: exploit auxiliary (external) resources of the domain to interpret the input. Such resources include human input or a thesaurus expressing the relationships between terms.

Schema matching vs. ontology matching: differences
Database schemas often do not provide explicit semantics for their data. Semantics is usually specified explicitly at design time, but frequently does not become part of the database specification.
Schema matching is therefore usually performed with the help of techniques that try to guess the meaning encoded in the schemas.
Ontologies are logical systems that themselves obey some formal semantics.
Ontology matching systems primarily try to exploit the knowledge explicitly encoded in the ontologies.

Schema matching vs. ontology matching: commonalities
Ontologies and schemas are similar in the sense that both provide a vocabulary of terms that describes a domain of interest, and both constrain the meaning of the terms used in the vocabulary.
Schemas and ontologies are found in environments such as the Semantic Web.

Sources:
Natalya F. Noy: Semantic Integration: A Survey of Ontology-Based Approaches
AnHai Doan, Alon Y. Halevy: Semantic Integration Research in the Database Community: A Brief Survey
P. Shvaiko, J. Euzenat: A Survey of Schema-Based Matching Approaches
G. Antoniou, F. van Harmelen: A Semantic Web Primer
R. Araújo, H. Sofia Pinto: Toward Semantics-Based Ontology Similarity
H. Wache, T. Vögele, U. Visser, H. Stuckenschmidt, G. Schuster, H. Neumann and S. Hübner: Ontology-Based Integration of Information - A Survey of Existing Approaches
E. Rahm, P. A. Bernstein: A Survey of Approaches to Automatic Schema Matching