Prepared by Aakanksha Agrawal & Richa Pandey, M.Tech CSE, 3rd Sem.

Presentation transcript:


EXTRACTION, CLEANING AND TRANSFORMATION TOOLS

• Data Extraction: gathering data from multiple heterogeneous sources.
• Data Cleaning: finding and correcting errors in the data.
• Data Transformation: converting the data from legacy format to warehouse format.

• Data extraction takes data from the source systems.
• Data load takes the extracted data and loads it into the data warehouse.

Note: Before the data is loaded into the data warehouse, the information extracted from the external sources must be reconstructed.

• Controlling the process involves determining when to start data extraction and when to run the consistency checks on the data. The controlling process ensures that the tools, logic modules, and programs are executed in the correct sequence and at the correct time.
• Data needs to be in a consistent state when it is extracted, i.e., the data warehouse should present a single, consistent version of the information to the user.
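The control process described above can be sketched roughly as follows. This is a minimal illustration, not a real scheduler: the function `run_pipeline`, the `sources_ready` callback, and the step names are all hypothetical, and an actual warehouse would normally rely on a job scheduler or an ETL tool for this.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None]]


def run_pipeline(steps: List[Step], sources_ready: Callable[[], bool]) -> List[str]:
    """Run the steps strictly in the given order, but only if the source
    systems report a consistent state (hypothetical readiness check)."""
    executed: List[str] = []
    if not sources_ready():
        # Extraction is postponed; no step runs out of sequence.
        return executed
    for name, step in steps:
        step()
        executed.append(name)
    return executed


log: List[str] = []
steps: List[Step] = [
    ("extract", lambda: log.append("extract")),
    ("clean", lambda: log.append("clean")),
    ("transform", lambda: log.append("transform")),
    ("load", lambda: log.append("load")),
]

order = run_pipeline(steps, sources_ready=lambda: True)
skipped = run_pipeline(steps, sources_ready=lambda: False)
```

Here `order` records the sequence in which the steps ran, and `skipped` stays empty because the sources were reported as not ready.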

• After the data is extracted, it is loaded into a temporary data store, where it is cleaned up and made consistent.

Note: Consistency checks are executed only when all the data sources have been loaded into the temporary data store.
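One way to picture the temporary data store and the rule in the note is the sketch below. It assumes two made-up sources, `orders` and `customers`, and a single cross-source check (every order must refer to a known customer); the class `StagingStore` and its methods are illustrative names only.

```python
from typing import Dict, Iterable, List

EXPECTED_SOURCES = {"orders", "customers"}  # hypothetical source names


class StagingStore:
    """In-memory stand-in for the temporary data store."""

    def __init__(self, expected: Iterable[str]) -> None:
        self.expected = set(expected)
        self.tables: Dict[str, List[dict]] = {}

    def load(self, source: str, rows: Iterable[dict]) -> None:
        self.tables[source] = list(rows)

    def all_loaded(self) -> bool:
        return self.expected <= set(self.tables)

    def check_consistency(self) -> List[dict]:
        """Return orders whose customer is unknown. Refuses to run
        until every expected source has been staged."""
        if not self.all_loaded():
            raise RuntimeError("not all sources are staged yet")
        known = {c["customer_id"] for c in self.tables["customers"]}
        return [o for o in self.tables["orders"] if o["customer_id"] not in known]


staging = StagingStore(EXPECTED_SOURCES)
staging.load("orders", [{"order_id": 1, "customer_id": 10},
                        {"order_id": 2, "customer_id": 99}])
try:
    staging.check_consistency()
    premature_check_blocked = False
except RuntimeError:
    premature_check_blocked = True

staging.load("customers", [{"customer_id": 10}])
orphans = staging.check_consistency()
```

The first call fails because only one source is staged; once both are loaded, the check reports the order that points to a missing customer.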

• Once the data is extracted and loaded into the temporary data store, it is time to perform cleaning and transforming.
• Steps involved in cleaning and transforming:
  A) Clean and transform the loaded data into a structure
  B) Partition the data
  C) Aggregation

Cleaning and transforming the loaded data helps speed up queries. This is done by making the data consistent:
• within itself
• with other data within the same data source
• with the data in other source systems
• with the existing data present in the warehouse
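As a rough illustration of two of these consistency levels, the sketch below normalizes a field so that rows agree with each other, and skips rows that duplicate keys already present in the warehouse. The alias table, the field names, and the decision to drop unresolved rows are assumptions made for the example; a real pipeline would more likely route such rows to a reject table for review.

```python
from typing import Dict, List, Set

# Hypothetical alias table mapping free-form values to one standard code.
COUNTRY_ALIASES: Dict[str, str] = {
    "usa": "US", "u.s.": "US", "us": "US",
    "india": "IN", "in": "IN",
}


def clean_rows(rows: List[dict], warehouse_ids: Set[int]) -> List[dict]:
    """Make rows consistent within the batch and with the warehouse."""
    cleaned: List[dict] = []
    seen: Set[int] = set()
    for row in rows:
        cid = row["customer_id"]
        if cid in seen or cid in warehouse_ids:
            continue  # duplicate within the batch or already in the warehouse
        country = COUNTRY_ALIASES.get(row["country"].strip().lower())
        if country is None:
            continue  # value cannot be resolved; dropped in this sketch
        seen.add(cid)
        cleaned.append({"customer_id": cid, "country": country})
    return cleaned


raw = [
    {"customer_id": 1, "country": " usa "},
    {"customer_id": 1, "country": "US"},
    {"customer_id": 2, "country": "India"},
    {"customer_id": 3, "country": "??"},
    {"customer_id": 4, "country": "in"},
]
cleaned = clean_rows(raw, warehouse_ids={4})
```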

• Transforming involves converting the source data into a structure.
• Structuring the data increases query performance and decreases operational cost.
• The data contained in a data warehouse must be transformed to support performance requirements and control the ongoing operational costs.
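A small sketch of such a transformation follows. It assumes a made-up legacy layout, a pipe-delimited record with a `DDMMYYYY` date and a comma-grouped amount, and converts it into typed fields; the layout and field names are hypothetical and not taken from any particular source system.

```python
from datetime import date, datetime
from decimal import Decimal


def transform(record: str) -> dict:
    """Convert one legacy record (assumed layout: DDMMYYYY|customer|amount)
    into a structured row with typed values."""
    raw_date, customer, raw_amount = record.split("|")
    return {
        "sale_date": datetime.strptime(raw_date.strip(), "%d%m%Y").date(),
        "customer_id": customer.strip().upper(),
        "amount": Decimal(raw_amount.strip().replace(",", "")),
    }


row = transform("05012024| c001 |1,250.50")
```

Storing dates and amounts as typed values rather than free-form strings is what lets later queries filter and sum them directly.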

• Partitioning the data optimizes hardware performance and simplifies the management of the data warehouse. Here each fact table is partitioned into multiple separate partitions.
• Aggregation is required to speed up common queries.
• Aggregation relies on the fact that most common queries will analyze a subset or an aggregation of the detailed data.
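Partitioning and aggregation can be illustrated together with a toy fact table. The sketch assumes rows with an ISO-formatted `sale_date` string and an integer `amount`, and partitions by month; the partition key and both function names are choices made for this example only.

```python
from collections import defaultdict
from typing import Dict, List


def partition_by_month(facts: List[dict]) -> Dict[str, List[dict]]:
    """Split a fact table into one partition per 'YYYY-MM' month."""
    partitions: Dict[str, List[dict]] = defaultdict(list)
    for row in facts:
        partitions[row["sale_date"][:7]].append(row)
    return dict(partitions)


def aggregate_sales(partitions: Dict[str, List[dict]]) -> Dict[str, int]:
    """Precompute total sales per partition so that common
    monthly queries need not scan the detailed rows."""
    return {month: sum(r["amount"] for r in rows)
            for month, rows in partitions.items()}


facts = [
    {"sale_date": "2024-01-05", "amount": 100},
    {"sale_date": "2024-01-20", "amount": 50},
    {"sale_date": "2024-02-02", "amount": 70},
]
partitions = partition_by_month(facts)
monthly_totals = aggregate_sales(partitions)
```

A query for one month's total then reads a single precomputed value, and a query on one month's detail touches only that month's partition.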

• The tasks of capturing data from source systems, cleansing and transforming it, and loading the results into a target system can be carried out either by separate products or by a single integrated solution.
• Integrated solutions fall into one of the categories below:
  • Code Generators
  • Database Data Replication Tools
  • Dynamic Transformation Engines

Thank you