Prototype of a query manager for handling global queries

Presentation transcript:

MOMIS Query Manager
Prototype of a query manager for handling global queries

D. Beneventano, S. Bergamaschi, F. Mandreoli
Università degli Studi di Modena e Reggio Emilia

D2I: Integration, Warehousing and Mining of heterogeneous sources
Theme 1: Integration of data coming from heterogeneous sources
Rome, 11 October 2002

Example

Local classes (relational):
  L1(firstn, lastn, year, e_mail)
  L2(name, e_mail, dept_code, s_code)

Integration yields the global class G, with the following mapping table:

        Name               E_mail   Section   Year   Dept
  L1    firstn and lastn   e_mail   null      year   null
  L2    name               e_mail   s_code    null   dept_code

Global class schema:
  S(G) = (Name, E_mail, Year, Dept, Section)

Local class schemata w.r.t. the global class:
  S(L1) = (Name, E_mail, Year)
  S(L2) = (Name, E_mail, Dept, Section)

Data cleaning and reconciliation

Integration at the extensional level: the data returned by the various sources need to be converted and reconciled, that is, interpreted and merged.
  Schema translation (example: firstn and lastn to Name)
  Data conversion (example: 'Rita' + 'Verde' to 'Rita Verde')

Local data:

  L1    firstn   lastn   e_mail    year
        Rita     Verde   PV@i.it   2
        Ada      Rossi   RA@i.it   1

  L2    name        e_mail    dept_code   s_code
        Rossi_Ada   RA@i.it   Dept1       413245
        Po_Ugo      UP@i.it               2314

After translation and conversion:

  L1    Name         E_mail    Year
        Rita Verde   PV@i.it   2
        Ada Rossi    RA@i.it   1

  L2    Name        E_mail    Dept    Section
        Ada Rossi   RA@i.it   Dept1   413245
        Ugo Po      UP@i.it           2314
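The two steps above can be sketched in a few lines of Python. This is only an illustration, not the MOMIS implementation: the function names are hypothetical, and the rule that L2 stores names as lastname_firstname is an assumption read off the sample rows.

```python
def l1_to_global(row):
    """Translate an L1 tuple onto S(L1) = (Name, E_mail, Year)."""
    return {
        # Data conversion: 'Rita' + 'Verde' -> 'Rita Verde'
        "Name": f"{row['firstn']} {row['lastn']}",
        "E_mail": row["e_mail"],
        "Year": row["year"],
    }


def l2_to_global(row):
    """Translate an L2 tuple onto S(L2) = (Name, E_mail, Dept, Section)."""
    # Assumption: L2 names look like 'Rossi_Ada' (last name, underscore, first name).
    lastn, firstn = row["name"].split("_", 1)
    return {
        "Name": f"{firstn} {lastn}",
        "E_mail": row["e_mail"],
        "Dept": row["dept_code"],
        "Section": row["s_code"],
    }


rita = l1_to_global({"firstn": "Rita", "lastn": "Verde", "year": 2, "e_mail": "PV@i.it"})
ugo = l2_to_global({"name": "Po_Ugo", "e_mail": "UP@i.it", "dept_code": None, "s_code": 2314})
```

Keeping one translation function per local class mirrors the mapping table: each function is the executable form of one row of that table.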

Redundancy and Reconciliation Hypothesis

Instances of the same object in different local classes must have the same value for a common attribute.

Objects: O1 is only in L1, O is in both L1 and L2, O2 is only in L2.

  L1    Object   Name         E_mail    Year
        O1       Rita Verde   PV@i.it   2
        O        Ada Rossi    RA@i.it   1

  L2    Object   Name        E_mail    Dept    Section
        O        Ada Rossi   RA@i.it   Dept1   413245
        O2       Ugo Po      UP@i.it           2314

Object fusion

To identify instances of the same object and fuse them: JoinMap, the join criteria among classes.

  L1    Object   Name         E_mail    Year
        O1       Rita Verde   PV@i.it   2
        O        Ada Rossi    RA@i.it   1

  L2    Object   Name        E_mail    Dept    Section
        O        Ada Rossi   RA@i.it   Dept1   413245
        O2       Ugo Po      UP@i.it           2314

JoinMap JM(L1,L2): L1.Name = L2.Name

  Name
  Ada Rossi
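A minimal sketch of the fusion step, assuming tuples are already expressed in the global attribute names. The helper names (fuse, join_map_name) are hypothetical. The conflict check encodes the reconciliation hypothesis: two instances of the same object may not disagree on a common attribute.

```python
def fuse(t1, t2):
    """Fuse two instances of the same object into one tuple.

    Under the reconciliation hypothesis, attributes present in both
    tuples must carry the same value; a disagreement is reported.
    """
    fused = dict(t1)
    for attr, value in t2.items():
        if attr in fused and fused[attr] != value:
            raise ValueError(f"conflict on {attr!r}: {fused[attr]!r} vs {value!r}")
        fused[attr] = value
    return fused


def join_map_name(t1, t2):
    """JM(L1,L2): L1.Name = L2.Name."""
    return t1["Name"] == t2["Name"]


L1 = [
    {"Name": "Rita Verde", "E_mail": "PV@i.it", "Year": 2},
    {"Name": "Ada Rossi", "E_mail": "RA@i.it", "Year": 1},
]
L2 = [
    {"Name": "Ada Rossi", "E_mail": "RA@i.it", "Dept": "Dept1", "Section": 413245},
    {"Name": "Ugo Po", "E_mail": "UP@i.it", "Dept": None, "Section": 2314},
]

# Only the pair selected by the join map is fused.
fused = [fuse(a, b) for a in L1 for b in L2 if join_map_name(a, b)]
```

With the sample data, the only fused object is Ada Rossi, carrying Year from L1 and Dept and Section from L2.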

Object fusion: indirect map

  L1    Object   Id    Name         E_mail    Year
        O1       123   Rita Verde   PV@i.it   2
        O2       243   Ada Rossi    RA@i.it   1

  L2    Object   E_mail    Dept    SN
        O2       RA@i.it   Dept1   XY413245
        O3       UP@i.it           XZ2314

JoinMap JM(CS.S, UNI.RS):

  Matr   SN
  243    XY413245
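When no attribute can be compared directly, the join map is itself a table of identifier pairs. The sketch below shows one way to use it; it assumes that Matr in the join map corresponds to L1.Id (suggested by the value 243), and the names JOIN_MAP and same_object are hypothetical.

```python
L1 = [
    {"Id": 123, "Name": "Rita Verde", "E_mail": "PV@i.it", "Year": 2},
    {"Id": 243, "Name": "Ada Rossi", "E_mail": "RA@i.it", "Year": 1},
]
L2 = [
    {"E_mail": "RA@i.it", "Dept": "Dept1", "SN": "XY413245"},
    {"E_mail": "UP@i.it", "Dept": None, "SN": "XZ2314"},
]

# The indirect join map: explicit (Matr, SN) pairs naming the same object.
JOIN_MAP = {(243, "XY413245")}


def same_object(t1, t2):
    """True when the join-map table links the L1 tuple to the L2 tuple."""
    return (t1["Id"], t2["SN"]) in JOIN_MAP


matches = [(t1["Name"], t2["SN"]) for t1 in L1 for t2 in L2 if same_object(t1, t2)]
```

Storing the pairs in a set makes each lookup constant time, so the cost is dominated by the enumeration of candidate pairs.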

Global Class Instance G

GAV with the "single database property" (Lenzerini, Data Integration: A Theoretical Perspective, PODS 2002).

The computation is based on FULL DISJUNCTION (Rajaraman, Ullman, Integrating Information by Outerjoins and Full Disjunctions, PODS 1996): "Computing the natural outerjoin of many relations in a way that preserves all possible connections among facts".

  L1    Name         E_mail    Year
        Rita Verde   PV@i.it   2
        Ada Rossi    RA@i.it   1

  L2    Name        E_mail    Dept    Section
        Ada Rossi   RA@i.it   Dept1   413245
        Ugo Po      UP@i.it           2314

G: select S(G) from L1 outer join L2 on JM(L1,L2)

  G     Name         E_mail    Year   Dept    Section
        Ada Rossi    RA@i.it   1      Dept1   413245
        Rita Verde   PV@i.it   2
        Ugo Po       UP@i.it                  2314
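For two local classes the full disjunction reduces to a full outer join on the join map. The following sketch reproduces the instance of G from the sample data; full_outer_join is a hypothetical helper written for this illustration, with missing attributes padded with None.

```python
def full_outer_join(left, right, join_map, schema):
    """Full outer join of two lists of dicts under a join predicate.

    Matching pairs are merged; unmatched tuples from either side are
    kept and padded with None for the attributes they lack.
    """
    result, matched_right = [], set()
    for l in left:
        hit = False
        for i, r in enumerate(right):
            if join_map(l, r):
                hit = True
                matched_right.add(i)
                merged = {**r, **l}  # common attributes agree by hypothesis
                result.append({a: merged.get(a) for a in schema})
        if not hit:
            result.append({a: l.get(a) for a in schema})
    for i, r in enumerate(right):
        if i not in matched_right:
            result.append({a: r.get(a) for a in schema})
    return result


S_G = ("Name", "E_mail", "Year", "Dept", "Section")
L1 = [
    {"Name": "Rita Verde", "E_mail": "PV@i.it", "Year": 2},
    {"Name": "Ada Rossi", "E_mail": "RA@i.it", "Year": 1},
]
L2 = [
    {"Name": "Ada Rossi", "E_mail": "RA@i.it", "Dept": "Dept1", "Section": 413245},
    {"Name": "Ugo Po", "E_mail": "UP@i.it", "Dept": None, "Section": 2314},
]

G = full_outer_join(L1, L2, lambda a, b: a["Name"] == b["Name"], S_G)
```

The result holds three tuples: one fused tuple for Ada Rossi and one padded tuple each for Rita Verde and Ugo Po. Row order depends on the iteration order and carries no meaning.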

FULL DISJUNCTION COMPUTATION

Question: when can a full disjunction be computed by some sequence of natural outerjoins?

Answer: there is a natural outerjoin sequence producing the full disjunction if and only if the set of relation schemes forms a connected, γ-acyclic hypergraph (Fagin, 1983).

A global class with n local classes, n > 2, gives a γ-cyclic hypergraph, so a new method is needed.

Diagram: L1, L2 and L3 are pairwise connected by JM(L1,L2), JM(L1,L3) and JM(L2,L3).

Example, n = 3:

  G: select S(G)
     from (L1 outer join L2 on JM(L1,L2))
          outer join
          (L1 outer join L3 on JM(L1,L3))
          on JM(L2,L3)
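The slide does not spell out the new method, so the following is not the authors' algorithm. It is a sketch of one simple alternative that works under a strong assumption: every join map identifies instances of the same real-world object, so matches are transitive and each object has at most one tuple per class. Under that assumption the tuples can be grouped into connected components and each component fused into one global tuple. The class L3, its Office attribute, and its rows are invented for the illustration.

```python
def fuse_by_object(classes, join_maps):
    """Group tuples linked by any join map and fuse each group.

    classes:   dict class name -> list of tuples (dicts)
    join_maps: dict (class_a, class_b) -> predicate(tuple_a, tuple_b)
    """
    nodes = [(name, i) for name, rows in classes.items() for i in range(len(rows))]
    parent = {n: n for n in nodes}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for (a, b), jm in join_maps.items():
        for i, ta in enumerate(classes[a]):
            for j, tb in enumerate(classes[b]):
                if jm(ta, tb):
                    parent[find((a, i))] = find((b, j))

    groups = {}
    for name, i in nodes:
        # Common attributes are assumed to agree, so later values
        # simply overwrite equal earlier ones.
        groups.setdefault(find((name, i)), {}).update(classes[name][i])
    return list(groups.values())


def by_name(t1, t2):
    return t1["Name"] == t2["Name"]


classes = {
    "L1": [{"Name": "Rita Verde", "Year": 2}, {"Name": "Ada Rossi", "Year": 1}],
    "L2": [{"Name": "Ada Rossi", "Dept": "Dept1"}, {"Name": "Ugo Po", "Section": 2314}],
    "L3": [{"Name": "Ugo Po", "Office": "B12"}, {"Name": "Lia Neri", "Office": "C3"}],  # hypothetical
}
join_maps = {("L1", "L2"): by_name, ("L1", "L3"): by_name, ("L2", "L3"): by_name}

objects = fuse_by_object(classes, join_maps)
```

Because the grouping does not depend on the order in which the join maps are applied, it sidesteps the question of finding a valid outerjoin sequence. It gives no guarantee when the assumption fails, for example when one tuple matches two different tuples of the same class.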

Query rewriting method

Global query (in DNF): Q1

Local query for the class L: Q1_L
  where-condition of Q1_L: all factors of the DNF which can be solved in L
  residual factors of Q1: factors not included in all local where-conditions
  select-list of Q1_L: attributes of the select-list of Q1 + residual factors + JoinMap

Global query reformulation: full disjunction based on the JoinMap + residual factors

Query rewriting example

Global query Q1:
  select E_mail
  from G
  where (E_mail like '*.it' and Dept='Dept1')
     or (E_mail like '*.it' and Year=2)

Local queries:
  Q1_L1: select Name, Year, E_mail from L1 where (E_mail like '*.it' or Year=2)
  Q1_L2: select Name, Dept, E_mail from L2 where (E_mail like '*.it' or Dept='Dept1')

Global query reformulation:
  Q1: select E_mail
      from Q1_L1 outer join Q1_L2 on JM
      where (Dept='Dept1' or Year=2)      (residual factor)
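The bookkeeping part of the rewriting method can be sketched as follows: which factors each local class can solve, which factors are residual, and which attributes each local select-list needs. The representation of factors as (attribute, operator, value) triples and all function names are assumptions made for this illustration. The sketch does not build the local where-conditions themselves.

```python
Q1_SELECT = ["E_mail"]
# Global query in DNF: a list of conjunctions of factors.
Q1_DNF = [
    [("E_mail", "like", "*.it"), ("Dept", "=", "Dept1")],
    [("E_mail", "like", "*.it"), ("Year", "=", 2)],
]
LOCAL_SCHEMATA = {
    "L1": {"Name", "E_mail", "Year"},
    "L2": {"Name", "E_mail", "Dept", "Section"},
}
JOIN_ATTRS = ["Name"]  # attributes used by JM(L1,L2)


def distinct_factors(dnf):
    """All factors of the DNF, without duplicates, in order of appearance."""
    seen = []
    for conjunction in dnf:
        for factor in conjunction:
            if factor not in seen:
                seen.append(factor)
    return seen


def solvable(factors, schema):
    """Factors whose attribute belongs to the local class schema."""
    return [f for f in factors if f[0] in schema]


def residual(factors, schemata):
    """Factors that at least one local class cannot solve."""
    return [f for f in factors if not all(f[0] in s for s in schemata.values())]


def local_select_list(select, residual_factors, join_attrs, schema):
    """JoinMap attributes + residual-factor attributes + global select-list,
    restricted to the attributes the local class actually has."""
    wanted = join_attrs + [f[0] for f in residual_factors] + select
    out = []
    for attr in wanted:
        if attr in schema and attr not in out:
            out.append(attr)
    return out


factors = distinct_factors(Q1_DNF)
res = residual(factors, LOCAL_SCHEMATA)
select_l1 = local_select_list(Q1_SELECT, res, JOIN_ATTRS, LOCAL_SCHEMATA["L1"])
select_l2 = local_select_list(Q1_SELECT, res, JOIN_ATTRS, LOCAL_SCHEMATA["L2"])
```

On the example, the residual factors are the ones on Dept and Year, and the computed select-lists are Name, Year, E_mail for L1 and Name, Dept, E_mail for L2, matching the local queries Q1_L1 and Q1_L2.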