Information Integration: Mediators, Warehousing, Answering Queries Using Views

Presentation transcript:

1 Information Integration: Mediators, Warehousing, Answering Queries Using Views

2 Example Applications
1. Enterprise Information Integration: making separate DB's, all owned by one company, work together.
2. Scientific DB's, e.g., genome DB's.
3. Catalog integration: combining product information from all your suppliers.

3 Challenges
1. Legacy databases: a DB gets used for many applications.
  • You can't change its structure for the sake of one application, because that will cause the others to break.
2. Incompatibilities: two supposedly similar databases will mismatch in many ways.

4 Examples: Incompatibilities
• Lexical: addr in one DB is address in another.
• Value mismatches: is a "red" car the same color in each DB? Is 20 degrees Fahrenheit or Centigrade?
• Semantic: are "employees" in each database the same? What about consultants? Retirees? Contractors?

5 What Do You Do About It?
• Grubby, handwritten translation at each interface.
  – Some research on automatic inference of relationships.
• Wrapper (aka "adapter") translates incoming queries and outgoing answers.
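A minimal Python sketch of what a wrapper does, offered as an illustration rather than as the slides' own design. The local rows, the field names `addr` and `temp_f`, the global names `address` and `temp_c`, and the function names are all assumptions made up for this example; they echo the lexical and value mismatches from slide 4.

```python
# Hypothetical local source: stores `addr` and Fahrenheit readings.
# The (equally hypothetical) global schema uses `address` and Celsius.
LOCAL_ROWS = [
    {"addr": "12 Elm St", "temp_f": 68.0},
    {"addr": "9 Oak Ave", "temp_f": 32.0},
]

# Lexical mapping from global field names to local field names.
GLOBAL_TO_LOCAL = {"address": "addr"}


def f_to_c(temp_f):
    """Value translation: Fahrenheit to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def wrapper_query(address):
    """Answer a global-schema lookup by address against the local source."""
    # Translate the incoming query: global field name -> local field name.
    local_field = GLOBAL_TO_LOCAL["address"]
    hits = [row for row in LOCAL_ROWS if row[local_field] == address]
    # Translate the outgoing answers: local rows -> global-schema rows.
    return [{"address": row["addr"], "temp_c": f_to_c(row["temp_f"])}
            for row in hits]
```

A caller that knows only the global schema would write `wrapper_query("9 Oak Ave")` and never see `addr` or Fahrenheit values.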

6 Integration Architectures
1. Federation: everybody talks directly to everyone else.
2. Warehouse: sources are translated from their local schema to a global schema and copied to a central DB.
3. Mediator: a virtual warehouse that turns a user query into a sequence of source queries.

7 Federations
[Diagram; surviving label: Wrapper]

8 Warehouse Diagram
[Diagram; labels: Warehouse, Wrapper, Source 1, Source 2]

9 A Mediator
[Diagram; labels: Mediator, Wrapper, Source 1, Source 2, User query, Query, Result]

10 Two Mediation Approaches
1. Global as View: the mediator processes queries into steps executed at sources.
2. Local as View: sources are defined in terms of global relations; the mediator finds all ways to build the query from views.

11 Example: Catalog Integration
• Suppose Dell wants to buy a bus and a disk that share the same protocol.
• Global schema:
  Buses(manf,model,protocol)
  Disks(manf,model,protocol)
• Local schemas: each bus or disk manufacturer has a (model,protocol) relation; manf is implied.

12 Example: Global-as-View
• Mediator might start by querying each bus manufacturer for model-protocol pairs.
  – The wrapper would turn them into triples by adding the manf component.
• Then, for each protocol returned, mediator queries disk manufacturers for disks with that protocol.
  – Again, wrapper adds manf component.
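The two-step plan on this slide can be sketched in Python as follows. This is only an illustration under stated assumptions: the in-memory dictionaries stand in for remote sources, the manufacturer names `AcmeBus` and `DiskCo` and all model names are invented, and the function names are hypothetical.

```python
# Hypothetical local sources: each manufacturer exposes (model, protocol)
# pairs; the manufacturer name is implied by which source is queried.
BUS_SOURCES = {
    "AcmeBus": [("b100", "SCSI"), ("b200", "IDE")],
}
DISK_SOURCES = {
    "Quantum": [("q7", "SCSI")],
    "DiskCo": [("d1", "IDE"), ("d2", "USB")],
}


def wrap(manf, pairs):
    """Wrapper: turn local (model, protocol) pairs into global triples."""
    return [(manf, model, protocol) for model, protocol in pairs]


def disks_with_protocol(protocol):
    """Step 2: ask every disk source for disks with the given protocol."""
    found = []
    for manf, pairs in DISK_SOURCES.items():
        found.extend(t for t in wrap(manf, pairs) if t[2] == protocol)
    return found


def compatible_pairs():
    """Mediator plan: query bus sources first, then disk sources per protocol."""
    result = []
    for manf, pairs in BUS_SOURCES.items():
        for bus in wrap(manf, pairs):          # Step 1: bus triples
            for disk in disks_with_protocol(bus[2]):
                result.append((bus, disk))
    return result
```

The order of the steps is fixed inside `compatible_pairs`, which is the sense in which the mediator's behavior is spelled out by hand in the global-as-view style.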

13 Example: Local-as-View
• Sources' capabilities are defined in terms of the global predicates.
  – E.g., Quantum's disk database could be defined by QuantumView(M,P) = Disks('Quantum',M,P).
• Mediator discovers all combinations of a bus and disk "view," equijoined on the protocol components.
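A rough Python sketch of the local-as-view idea for this catalog, assuming each source is registered only by saying which global relation it is a view of and which manf constant it fixes. The registry layout, the views other than QuantumView, and all function names are hypothetical.

```python
from itertools import product

# Hypothetical view registry. Each entry records the global relation the
# source is a view of, the manf constant it fixes, and its local rows.
VIEWS = {
    "AcmeBusView": {"relation": "Buses", "manf": "AcmeBus",
                    "rows": [("b100", "SCSI"), ("b200", "IDE")]},
    "QuantumView": {"relation": "Disks", "manf": "Quantum",
                    "rows": [("q7", "SCSI")]},
    "DiskCoView": {"relation": "Disks", "manf": "DiskCo",
                   "rows": [("d1", "IDE"), ("d2", "USB")]},
}


def plans():
    """Enumerate every (bus view, disk view) combination."""
    buses = sorted(n for n, v in VIEWS.items() if v["relation"] == "Buses")
    disks = sorted(n for n, v in VIEWS.items() if v["relation"] == "Disks")
    return list(product(buses, disks))


def run_plan(bus_view, disk_view):
    """Equijoin one bus view with one disk view on the protocol component."""
    b, d = VIEWS[bus_view], VIEWS[disk_view]
    return [((b["manf"], bm, bp), (d["manf"], dm, dp))
            for bm, bp in b["rows"]
            for dm, dp in d["rows"]
            if bp == dp]


def answer():
    """Union of the results of all discovered plans."""
    out = []
    for bus_view, disk_view in plans():
        out.extend(run_plan(bus_view, disk_view))
    return out
```

In this sketch, adding a new supplier means adding one entry to `VIEWS`; `plans` picks it up without any change to the mediator logic.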

14 A Harder LAV Case
• The mediator supports a par(c,p) relation (which doesn't really exist, but can be queried).
• Sources can support views that are complex expressions of par.
• A logic is needed to work with queries and view definitions.
  – Datalog is a good choice.

15 Example: Some Local Views
• Source 1 provides some parent facts.
  V1(c,p) <- par(c,p)
• Source 2, run by the "Society of Grandparents," supports only grandparent facts.
  V2(c,g) <- par(c,p) AND par(p,g)
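To make the two view definitions concrete, here is a small Python sketch that evaluates them over a made-up set of par facts. The names in `PAR` are invented, and the sketch returns every fact each definition allows; a real source may hold only some of those facts, as the slide says of Source 1.

```python
# Hypothetical par facts as (child, parent) pairs. In the LAV setting this
# relation is virtual; it is materialized here only for illustration.
PAR = {("ann", "bob"), ("bob", "cat"), ("cat", "dan")}


def v1(par):
    """V1(c,p) <- par(c,p): parent facts."""
    return set(par)


def v2(par):
    """V2(c,g) <- par(c,p) AND par(p,g): grandparent facts."""
    return {(c, g) for (c, p) in par for (p2, g) in par if p == p2}
```

For the facts above, `v2(PAR)` contains `("ann", "cat")` and `("bob", "dan")`: a source exposing only V2 reveals grandparent pairs without revealing the parent in between.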

16 Example – (2)
• Query (great-grandparents):
  ggp(c,x) <- par(c,u) AND par(u,v) AND par(v,x)
• How can the views be combined into solutions that yield all available answers?

17 Example – (3)
Sol1(c,x) <- V1(c,u) AND V1(u,v) AND V1(v,x)
Sol2(c,x) <- V1(c,u) AND V2(u,x)
Sol3(c,x) <- V2(c,v) AND V1(v,x)
• No other queries involving the views can provide more ggp facts.
• Deep theory needed to explain.
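The three solutions can be checked on a small instance with the Python sketch below. Sol1 is written as the three-step chain V1(c,u), V1(u,v), V1(v,x), which is what the great-grandparent query requires. The facts, the choice of which parent facts Source 1 happens to hold, and the helper `join` are assumptions for illustration; the sketch demonstrates that the solutions return only true ggp facts on this instance, not the completeness claim, which the slide attributes to deeper theory.

```python
def join(r, s):
    """Compose binary relations: {(a, c) | (a, b) in r and (b, c) in s}."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}


# Hypothetical ground truth (never visible to the mediator directly).
PAR = {("ann", "bob"), ("bob", "cat"), ("cat", "dan"), ("dan", "eve")}
GGP = join(join(PAR, PAR), PAR)

# Hypothetical source contents: Source 1 holds only some parent facts,
# Source 2 holds all grandparent facts.
V1 = {("ann", "bob"), ("dan", "eve")}
V2 = join(PAR, PAR)

sol1 = join(join(V1, V1), V1)   # V1(c,u) AND V1(u,v) AND V1(v,x)
sol2 = join(V1, V2)             # V1(c,u) AND V2(u,x)
sol3 = join(V2, V1)             # V2(c,v) AND V1(v,x)

answers = sol1 | sol2 | sol3
```

Here Sol1 finds nothing because Source 1 is missing two links of every chain, while Sol2 and Sol3 each recover one great-grandparent fact by pairing a single parent fact with a grandparent fact.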

18 Comparison: LAV vs. GAV
• GAV is simpler to implement.
  – Lets you control what the mediator does.
• LAV is more extensible.
  – Add a new source simply by defining what it contributes as a view of the global schema.
  – Can get some use from grandparent information, even if par(c,p) is the only mediator relation.

19 Course Plug
• In Spring 07-08, Alon Halevy (Google) is teaching CS345C Information Integration.
• It will cover this technology and many others.