Information Integration. Modes of Information Integration Applications involved more than one database source Three different modes –Federated Databases.

Slides:



Advertisements
Similar presentations
Supervisor : Prof . Abbdolahzadeh
Advertisements

Managing data Resources: An information system provides users with timely, accurate, and relevant information. The information is stored in computer files.
Multidimensional Database Structure
Components and Architecture CS 543 – Data Warehousing.
1 Lecture 13: Database Heterogeneity Debriefing Project Phase 2.
Data Warehousing Data Warehousing: A Definition “A data warehouse is a single integrated store of data which provides the infrastructural basis for informational.
MIS DATABASE SYSTEMS, DATA WAREHOUSES, AND DATA MARTS CHAPTER 3
Distributed Database Management Systems. Reading Textbook: Ch. 4 Textbook: Ch. 4 FarkasCSCE Spring
Ch3 Data Warehouse part2 Dr. Bernard Chen Ph.D. University of Central Arkansas Fall 2009.
An Overview of Data Warehousing and OLTP Technology Presenter: Parminder Jeet Kaur Discussion Lead: Kailang.
Data Conversion to a Data warehouse Presented By Sanjay Gunasekaran.
ETL By Dr. Gabriel.
BUSINESS INTELLIGENCE/DATA INTEGRATION/ETL/INTEGRATION AN INTRODUCTION Presented by: Gautam Sinha.
Basic Concepts of Datawarehousing An Overview Prasanth Gurram.
L/O/G/O Metadata Business Intelligence Erwin Moeyaert.
Intro to MIS – MGS351 Databases and Data Warehouses Chapter 3.
Ihr Logo Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization Turban, Aronson, and Liang.
1 Introduction An organization's survival relies on decisions made by management An organization's survival relies on decisions made by management To make.
DW-1: Introduction to Data Warehousing. Overview What is Database What Is Data Warehousing Data Marts and Data Warehouses The Data Warehousing Process.
Data Warehouse Overview September 28, 2012 presented by Terry Bilskie.
1 Adapted from Pearson Prentice Hall Adapted form James A. Senn’s Information Technology, 3 rd Edition Chapter 7 Enterprise Databases and Data Warehouses.
AN OVERVIEW OF DATA WAREHOUSING
© 2005 Prentice Hall, Decision Support Systems and Intelligent Systems, 7th Edition, Turban, Aronson, and Liang 5-1 Chapter 5 Business Intelligence: Data.
MIS DATABASE SYSTEMS, DATA WAREHOUSES, AND DATA MARTS CHAPTER 3
Data Warehouse Development Methodology
OLAP & DSS SUPPORT IN DATA WAREHOUSE By - Pooja Sinha Kaushalya Bakde.
Chapter 3 DECISION SUPPORT SYSTEMS CONCEPTS, METHODOLOGIES, AND TECHNOLOGIES: AN OVERVIEW Study sub-sections: , 3.12(p )
Database A database is a collection of data organized to meet users’ needs. In this section: Database Structure Database Tools Industrial Databases Concepts.
1 Reviewing Data Warehouse Basics. Lessons 1.Reviewing Data Warehouse Basics 2.Defining the Business and Logical Models 3.Creating the Dimensional Model.
Announcements. Data Management Chapter 12 Traditional File Approach  Structure Field  Record  File  Fixed All records have common fields, and a field.
5-1 McGraw-Hill/Irwin Copyright © 2007 by The McGraw-Hill Companies, Inc. All rights reserved.
5 - 1 Copyright © 2006, The McGraw-Hill Companies, Inc. All rights reserved.
MANAGING DATA RESOURCES ~ pertemuan 7 ~ Oleh: Ir. Abdul Hayat, MTI.
13 1 Chapter 13 The Data Warehouse Database Systems: Design, Implementation, and Management, Seventh Edition, Rob and Coronel.
Prepared By Aakanksha Agrawal & Richa Pandey Mtech CSE 3 rd SEM.
Chapter 8 Data and Knowledge Management. 2 Learning Objectives When you finish this chapter, you will  Know the difference between traditional file organization.
Data Staging Data Loading and Cleaning Marakas pg. 25 BCIS 4660 Spring 2012.
Data resource management
DATABASES AND DATA WAREHOUSES
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide
Chapter 5 DATA WAREHOUSING Study Sections 5.2, 5.3, 5.5, Pages: & Snowflake schema.
Management Information Systems, 4 th Edition 1 Chapter 8 Data and Knowledge Management.
Creating a Data Warehouse Data Acquisition: Extract, Transform, Load Extraction Process of identifying and retrieving a set of data from the operational.
Two-Tier DW Architecture. Three-Tier DW Architecture.
Advanced Database Concepts
CS 157B: Database Management Systems II April 10 Class Meeting Department of Computer Science San Jose State University Spring 2013 Instructor: Ron Mak.
Database Systems: Design, Implementation, and Management Eighth Edition Chapter 1 Database Systems.
1 Copyright © Oracle Corporation, All rights reserved. Business Intelligence and Data Warehousing.
An Overview of Data Warehousing and OLAP Technology
1 Management Information Systems M Agung Ali Fikri, SE. MM.
2 Copyright © 2006, Oracle. All rights reserved. Defining Data Warehouse Concepts and Terminology.
Data Mining and Data Warehousing: Concepts and Techniques What is a Data Warehouse? Data Warehouse vs. other systems, OLTP vs. OLAP Conceptual Modeling.
Supervisor : Prof . Abbdolahzadeh
Introduction to DBMS Purpose of Database Systems View of Data
Databases and DBMSs Todd S. Bacastow January 2005.
Intro to MIS – MGS351 Databases and Data Warehouses
Chapter 13 The Data Warehouse
Data Warehouse.
Databases and Data Warehouses Chapter 3
Business Intelligence
Data Warehouse and OLAP
Chapter 5 Data Resource Management.
Managing data Resources:
Data Warehouse Overview September 28, 2012 presented by Terry Bilskie
An Introduction to Data Warehousing
MANAGING DATA RESOURCES
Introduction to DBMS Purpose of Database Systems View of Data
Metadata The metadata contains
Data Warehousing Concepts
Data Warehouse and OLAP
Presentation transcript:

Information Integration

Modes of Information Integration Applications involved more than one database source Three different modes –Federated Databases –Data Warehouses –Mediation

Three Different Modes Federated Databases –Similar to multiple databases system – DBs are connected to each other but no global schema – Everything depends on application programmer Data Warehouses –Global schema for different sources – application programmers can develop/optimize the execution of queries issued to the data warehouse Mediator –A software program that provides users with a virtual global database – does not store data like data warehouse but interaction between users and data sources is similar to the data warehouse

Data Warehouse

DB1DB2DBn EXTRACTION & INTEGRATION DATA WAREHOUSE AGGREGATION & CUSTOMIZATION END USER Schema & Data DW Designer Source Admin

Overview of Data Warehouse Components Designing issues Loading data from sources into DW Extracting data from DW Physical structure of DW Metadata management DW project management

Components Local data sources (schema & data) Global database (global schema & data) Users’ requests

Designing Issues Top-down (design a global schema, then integrate data) vs bottom-up (develop smaller, specialized data marts) Changes in local schema need to be reflected in the global schema Integration issues –Syntactic difference Data type Value –Semantic difference –Missing value

Getting Data into DW Requirement: access to different information source Need: wrappers, loaders, mediators (programs that load data of the information sources into DW) –Wrappers & loaders: loading, transforming, cleaning, and updating the data from the source to the data warehouse –Mediators: integrate the data into the warehouse by resolving inconsistencies and conflict between different information souces

Getting Data into DW Wrappers, loaders, mediators: called Extract- Transform-Load Tools (ETL Tools) try to automate/support the following tasks: –Extraction –Cleaning –Transformation –Loading –Replication –Analyzing –High-speed data transfer –Checking for data quality –Analyzing metadata

Extracting data from DW Basic OLAP operations –Roll up –Drill down –Slice and dice –Pivot Others –Report and query tools –GIS –Data mining –Decision support system –Executive information system –Statistics

Physical structure of DW Centralized (single database, one location) Federated (single virtual database, several locations) Tiered (combined both centralized & federated)

DB1DB2DBn DATA WAREHOUSE END USER LDB1LDB2LDBn LOGICAL DATA WAREHOUSE END USER END USER DM1DM2 CENTRALIZED FEDERATED TIERED LDB1LDB2LDBn DATA WAREHOUSE END USER END USER SDM1SDM2 SDM : Summarize data

Metadata management Provide the information necessary for accessing the data warehouse efficiently May include –Data dictionary (definitions of the dbs being maintained and the relationship between data elements) –Data flow (direction and frequency of the data feed) –Data transformation (transformations required when data is moved) –Version control (changes to the metadata are stored) –Data usage statistics (profile of data in the warehouse) –Alias information (alias names for a field) –Security (who is allowed to access the data)

DW project management Important issues –Design (no standard methodology as of now) –Technical (Hardware, Software) –Procedural (deployment: training end users!)

DW Research: Issues

Issues Data extraction and reconciliation (difficult to automate) Data aggregation and customization: requires a richer language for representation of hierarchies Query optimization Update propagation Modeling and measuring data warehouse quality (what can be used to measure the quality of a data warehouse?) –Accessibility (more accessible to users) –Interpretability (help users understand the data they get) –Usefulness (help users in their work) –Believability (trust of users in the data, makes data from less reliable source becomes more trustworthy) –Validation (how to ensure that all of the above quality issues have actually been adequately addressed)

Mediators A software program that provides users with a virtual global database – does not store data like data warehouse but interaction between users and data sources is similar to the data warehouse

DB1DBn END USER MEDIATOR Query Result WRAPPER Query Result Query Result WRAPPER