Data Integration: Combining data from different sources, providing a unified view of the data

Presentation transcript:

1 Data Integration
- Combining data from different sources, providing a unified view of the data
- A data warehouse is a repository that results from some types of data integration processes

2 Techniques for Data Integration
- Consolidation (ETL): Extract/Transform/Load. Consolidating all data into a centralized database (like a data warehouse)
- Data federation (EII): Enterprise Information Integration. Provides a virtual view of data without actually creating one centralized database
- Data propagation (EAI): Enterprise Application Integration. Duplicates data across databases, with near-real-time delay
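
The three techniques can be contrasted with a toy sketch. The two in-memory "databases" (`crm`, `billing`) and every function name here are hypothetical stand-ins for illustration, not part of the original slides; real EII and EAI products work at a much larger scale.

```python
# Two hypothetical source systems, modeled as dictionaries keyed by customer id.
crm = {1: {"name": "Alice"}, 2: {"name": "Bob"}}
billing = {1: {"balance": 40.0}, 2: {"balance": 0.0}}

def consolidate():
    """Consolidation (ETL): copy and merge everything into one central store."""
    return {cid: {**crm[cid], **billing.get(cid, {})} for cid in crm}

def federated_view(customer_id):
    """Federation (EII): build the unified record on demand; nothing is stored centrally."""
    return {**crm[customer_id], **billing.get(customer_id, {})}

def propagate_name_change(customer_id, new_name):
    """Propagation (EAI): apply a change in one database and duplicate it to another."""
    crm[customer_id]["name"] = new_name
    billing[customer_id]["name"] = new_name

warehouse = consolidate()            # snapshot taken now
propagate_name_change(2, "Robert")   # sources change afterwards
live = federated_view(2)             # federation reads the current source data
```

In this sketch the consolidated `warehouse` still holds the old name until the next load, while the federated view reflects the change immediately.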

3 The ETL Process
ETL = Extract, Transform, and Load
- Capture/Extract
- Scrub (data cleansing)
- Transform
- Load and Index
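
The four steps can be chained as a small pipeline. This is a minimal sketch under assumed names: the `sales` table, the `amount_cents` field, and the cleansing rule are illustrative choices, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

def extract(source_rows):
    """Capture/Extract: take a snapshot of the chosen source rows."""
    return [dict(row) for row in source_rows]

def scrub(rows):
    """Scrub: trim whitespace and drop rows that are missing a name."""
    cleaned = []
    for row in rows:
        name = (row.get("name") or "").strip()
        if name:
            cleaned.append({**row, "name": name})
    return cleaned

def transform(rows):
    """Transform: convert source format (cents) to warehouse format (dollars)."""
    return [(r["id"], r["name"], r["amount_cents"] / 100) for r in rows]

def load(conn, records):
    """Load and Index: write the records, then create an index on the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
    conn.execute("CREATE INDEX IF NOT EXISTS idx_sales_name ON sales (name)")
    conn.commit()

source = [
    {"id": 1, "name": " Alice ", "amount_cents": 1250},
    {"id": 2, "name": None, "amount_cents": 300},   # dropped by scrub()
    {"id": 3, "name": "Bob", "amount_cents": 900},
]
warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(scrub(extract(source))))
loaded = warehouse.execute("SELECT id, name, amount FROM sales ORDER BY id").fetchall()
```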

4 Capture/Extract: obtaining a snapshot of a chosen subset of the source data for loading into the data warehouse
- Static extract = capturing a snapshot of the source data at a point in time
- Incremental extract = capturing changes that have occurred since the last static extract
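
The difference between the two extract types can be sketched as follows. The sketch assumes each source row carries an `updated_at` timestamp, which the slides do not state; production systems may instead rely on transaction logs or other change-capture mechanisms.

```python
from datetime import datetime

def static_extract(source_rows):
    """Static extract: copy every chosen row as it exists right now."""
    return [dict(r) for r in source_rows]

def incremental_extract(source_rows, last_extract_time):
    """Incremental extract: keep only rows changed since the previous extract."""
    return [dict(r) for r in source_rows if r["updated_at"] > last_extract_time]

rows = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 5)},
    {"id": 3, "updated_at": datetime(2024, 1, 9)},
]
last_extract_time = datetime(2024, 1, 4)

snapshot = static_extract(rows)
changes = incremental_extract(rows, last_extract_time)
changed_ids = [r["id"] for r in changes]
```

A timestamp comparison like this cannot see rows deleted from the source, which is one reason log-based capture is sometimes preferred.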

5 Scrub/Cleanse: uses pattern recognition and AI techniques to upgrade data quality
- Fixing errors: misspellings, erroneous dates, incorrect field usage, mismatched addresses, missing data, duplicate data, inconsistencies
- Also: decoding, reformatting, time stamping, conversion, key generation, merging, error detection/logging, locating missing data
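
A few of these checks can be illustrated with simple rules. Note that this sketch substitutes plain rule-based checks for the pattern-recognition and AI techniques the slide mentions; the field names (`name`, `order_date`) and the error labels are assumptions.

```python
import re
from datetime import datetime

def parse_date(text):
    """Return an ISO date string, or None when the date is missing or erroneous."""
    try:
        return datetime.strptime(text, "%Y-%m-%d").date().isoformat()
    except (TypeError, ValueError):
        return None

def scrub(rows):
    """Reformat names, reject bad dates and missing data, and log duplicates."""
    cleaned, errors, seen = [], [], set()
    for row in rows:
        # Reformatting: collapse repeated spaces and normalize capitalization.
        name = re.sub(r"\s+", " ", row.get("name") or "").strip().title()
        date = parse_date(row.get("order_date"))
        if not name:
            errors.append((row.get("id"), "missing name"))
        elif date is None:
            errors.append((row.get("id"), "erroneous date"))
        elif (name, date) in seen:
            errors.append((row.get("id"), "duplicate"))
        else:
            seen.add((name, date))
            cleaned.append({"id": row["id"], "name": name, "order_date": date})
    return cleaned, errors

raw = [
    {"id": 1, "name": "  alice   SMITH ", "order_date": "2024-02-30"},  # no such date
    {"id": 2, "name": "bob jones", "order_date": "2024-03-01"},
    {"id": 3, "name": "Bob  Jones", "order_date": "2024-03-01"},        # duplicate of id 2
    {"id": 4, "name": "", "order_date": "2024-03-02"},                  # missing data
]
cleaned, errors = scrub(raw)
```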

6 Transform = convert data from the format of the operational system to the format of the data warehouse
- Record-level:
  - Selection: data partitioning
  - Joining: data combining
  - Aggregation: data summarization
- Field-level:
  - Single-field: from one field to one field
  - Multi-field: from many fields to one, or one field to many
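
The three record-level functions can be sketched on plain Python data. The `orders` and `customers` collections and their fields are made up for illustration.

```python
from collections import defaultdict

orders = [
    {"order_id": 1, "customer_id": 10, "region": "East", "amount": 120.0},
    {"order_id": 2, "customer_id": 11, "region": "West", "amount": 80.0},
    {"order_id": 3, "customer_id": 10, "region": "East", "amount": 50.0},
]
customers = {10: "Alice", 11: "Bob"}

# Selection (data partitioning): keep only the records that match a criterion.
east_orders = [o for o in orders if o["region"] == "East"]

# Joining (data combining): attach the customer name from a second source.
joined = [{**o, "customer_name": customers[o["customer_id"]]} for o in orders]

# Aggregation (data summarization): total order amount per customer.
totals = defaultdict(float)
for o in joined:
    totals[o["customer_name"]] += o["amount"]
totals = dict(totals)
```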

7 Load/Index = place transformed data into the warehouse and create indexes
- Refresh mode: bulk rewriting of target data at periodic intervals
- Update mode: only changes in source data are written to the data warehouse
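
The two load modes can be sketched against an in-memory SQLite table. The `product` table and its columns are assumptions, and `INSERT OR REPLACE` is just one way to write changed rows in SQLite.

```python
import sqlite3

def refresh_load(conn, rows):
    """Refresh mode: bulk-rewrite the whole target table."""
    conn.execute("DELETE FROM product")
    conn.executemany("INSERT INTO product (id, price) VALUES (?, ?)", rows)
    conn.commit()

def update_load(conn, changed_rows):
    """Update mode: write only the rows that changed in the source."""
    conn.executemany(
        "INSERT OR REPLACE INTO product (id, price) VALUES (?, ?)", changed_rows
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, price REAL)")
conn.execute("CREATE INDEX idx_product_price ON product (price)")  # the "Index" step

refresh_load(conn, [(1, 9.5), (2, 20.0)])   # periodic full rewrite
update_load(conn, [(2, 22.0), (3, 5.0)])    # later: one changed row, one new row
final_rows = conn.execute("SELECT id, price FROM product ORDER BY id").fetchall()
```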

8 Data Transformation Functions
- Record-level: transformation that involves obtaining the set of records you want from the data source (selection, joining, aggregation)
- Field-level: transformation that converts data from fields of a source record to field(s) of a target record (single-field vs. multi-field transformations)

9 Single-field transformation
- In general, some transformation function translates data from old form to new form
- Algorithmic transformation uses a formula or logical expression
- Table lookup, another approach, uses a separate table keyed by source record code
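
Both approaches can be shown in a few lines. The temperature conversion and the state-code table are illustrative examples chosen here, not taken from the slide text.

```python
def fahrenheit_to_celsius(temp_f):
    """Algorithmic transformation: a formula maps the old value to the new one."""
    return (temp_f - 32) * 5 / 9

# Table lookup: a separate table keyed by the source record code.
STATE_NAMES = {"UT": "Utah", "WA": "Washington"}

def state_name(code):
    """Translate a state code to its name; unknown codes get a placeholder."""
    return STATE_NAMES.get(code, "UNKNOWN")

celsius = fahrenheit_to_celsius(212)
name = state_name("UT")
missing = state_name("ZZ")
```

A lookup table is easier to maintain than a formula when the mapping is arbitrary, since new codes only require a new table row.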

10 Multi-field transformation
- M:1: from many source fields to one target field
- 1:M: from one source field to many target fields
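
A minimal sketch of each direction follows. The name fields and the `BRAND-PRODUCT` code layout are hypothetical examples.

```python
def combine_name(first_name, last_name):
    """M:1: many source fields (first and last name) become one target field."""
    return f"{last_name}, {first_name}"

def split_product_code(code):
    """1:M: one source field becomes several target fields."""
    brand, product = code.split("-", 1)
    return {"brand": brand, "product": product}

full_name = combine_name("Ada", "Lovelace")
parts = split_product_code("ACME-Widget")
```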