Data Warehouse Tools and Technologies - ETL


Data Warehouse Tools and Technologies - ETL
By: Issarachevawat, Raynoo; Romieh, Christian; Wongkamolchun, Siri; Zhang, Ying

What Is ETL?
- Extract: the process of reading data from an external source database.
- Transform: the process of converting the extracted data into a form usable by the target database, by applying rules or lookup tables or by combining the data with other data.
- Load: the process of writing the data into the target database.
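The three steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source rows, the `COUNTRY_NAMES` lookup table, the `customer` table, and all function names are hypothetical, and an in-memory SQLite database stands in for the target database.

```python
import sqlite3

# Hypothetical source rows; a real job would read these from an external database.
SOURCE_ROWS = [
    {"id": 1, "name": " alice ", "country": "US"},
    {"id": 2, "name": "Bob", "country": "DE"},
    {"id": 3, "name": "carol", "country": "??"},
]

# Lookup table used by the transform step (assumed for illustration).
COUNTRY_NAMES = {"US": "United States", "DE": "Germany"}


def extract(rows):
    """Extract: read records from the source."""
    yield from rows


def transform(records):
    """Transform: apply a cleanup rule and a lookup table to each record."""
    for rec in records:
        yield (
            rec["id"],
            rec["name"].strip().title(),                   # rule: trim and capitalize
            COUNTRY_NAMES.get(rec["country"], "Unknown"),  # lookup table
        )


def load(conn, records):
    """Load: write the transformed records into the target database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer "
        "(id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
    )
    conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl():
    """Chain the three steps and return what ended up in the target table."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract(SOURCE_ROWS)))
    return conn.execute(
        "SELECT id, name, country FROM customer ORDER BY id"
    ).fetchall()
```

Writing each step as a generator keeps the records streaming from source to target instead of holding the full data set in memory.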

What does ETL do?
- Extracts data from multiple data sources.
- Migrates data from one database to another.
- Converts a database from one format or type to another.
- Transforms the data to make it accessible for business analysis.
- Forms data marts and data warehouses.
- Enables loading of multiple target databases.
- Performs at least three specific functions: it reads data from an input source; passes the stream of information through either an ETL-engine-based or code-based process to modify, enhance, or eliminate data elements according to the job's instructions; and writes the resulting data set back out to a flat file, relational table, etc.
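The last item, a code-based process that modifies, enhances, or eliminates data elements and writes a flat file, might look roughly like the sketch below. The input columns (`order_id`, `amount`, `currency`, `internal_note`) and the job rules are assumptions made up for this example, and `io.StringIO` stands in for real input and output files.

```python
import csv
import io

# Hypothetical input; a real job would read from a file or a database cursor.
RAW_CSV = """order_id,amount,currency,internal_note
1,10.50,usd,rush
2,,usd,
3,7.25,eur,gift
"""


def run_job(source_text):
    """Read a CSV stream, apply the job's rules, and return a flat-file string."""
    reader = csv.DictReader(io.StringIO(source_text))  # read from the input source
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=["order_id", "amount", "currency", "amount_cents"],
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        if not row["amount"]:
            continue  # eliminate: drop records with no amount
        writer.writerow({
            "order_id": row["order_id"],
            "amount": row["amount"],
            "currency": row["currency"].upper(),                # modify
            "amount_cents": round(float(row["amount"]) * 100),  # enhance
        })  # the internal_note column is not written (eliminated element)
    return out.getvalue()  # the resulting flat file
```

Calling `run_job(RAW_CSV)` returns the output file contents as a string; swapping the `StringIO` objects for open file handles would turn this into a file-to-file job.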

What can ETL be used for?
- To acquire a temporary subset of data (like a VIEW) for reports or other purposes.
- To acquire a more permanent data set for other purposes, such as populating a data mart or data warehouse.

Question: Since ETL provides a mini data warehouse component that looks remarkably like a data mart and performs all the data extraction, filtering, integration, classification, and aggregation functions that the data warehouse normally provides, why do we need a separate data warehouse that duplicates it?

Answer: In fact, when properly implemented, the data warehouse performs all data preparation functions instead of leaving those chores to ETL, so there is no duplication of function. Better yet, the data warehouse handles the data component much more efficiently than ETL does, so a central data warehouse can serve as the large enterprise decision support database. Moreover, to improve performance, ETL merges the data warehouse and data mart approaches by storing small extracts of the data warehouse at end-user workstations.
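The difference between a temporary subset (like a VIEW) and a more permanent extract can be illustrated with SQLite. The table and view names (`sales`, `east_sales`, `mart_sales_by_region`) are hypothetical; the sketch only shows that a view is recomputed on every query, while a stored extract is a snapshot that stays fixed until the ETL job refreshes it.

```python
import sqlite3


def build_demo():
    """Contrast a temporary view with a persisted data-mart style extract."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("east", 100.0), ("west", 50.0), ("east", 25.0)],
    )

    # Temporary subset: a view, evaluated against the source on every query.
    conn.execute(
        "CREATE TEMP VIEW east_sales AS "
        "SELECT amount FROM sales WHERE region = 'east'"
    )

    # More permanent data set: an aggregated extract stored as its own table.
    conn.execute(
        "CREATE TABLE mart_sales_by_region AS "
        "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"
    )

    # New operational data arrives after the extract was taken.
    conn.execute("INSERT INTO sales VALUES ('east', 10.0)")

    view_rows = conn.execute("SELECT COUNT(*) FROM east_sales").fetchone()[0]
    mart_total = conn.execute(
        "SELECT total FROM mart_sales_by_region WHERE region = 'east'"
    ).fetchone()[0]
    return view_rows, mart_total
```

Here the view sees the newly inserted row, while the stored extract still holds the total computed at load time; reading the precomputed extract also avoids re-scanning the source table.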

[Figure: ETL system. Diagram labels: Outer Sources (different vendor, different format), Operational Data, ETL Engine (Extract, Filter, Transform, Load), Data Warehouse, Local Data Marts, OLAP, End Users. Note on the diagram: data extracted from the data warehouse provides faster processing.]

Issues that are key to an effective ETL tool
- Scheduling and job dependencies: particularly reliant on a graphical environment.
- Session nesting: when developing an ETL session for a particular part of the system, nesting eliminates duplicate development.
- Robust SQL support: increases speed compared with using code to read from and write to a database.
- Version management: enables quick rollback rather than manually making code changes. In many cases, the database's version control may not work on the ETL.
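To make the scheduling and job-dependency issue concrete, the sketch below orders a handful of ETL jobs so that each runs only after the jobs it depends on. The job names are hypothetical, and the standard-library `graphlib.TopologicalSorter` (Python 3.9+) stands in for the scheduler a commercial ETL tool would expose through its graphical environment.

```python
from graphlib import TopologicalSorter

# Hypothetical jobs mapped to the set of jobs each one depends on.
JOBS = {
    "extract_orders": set(),
    "extract_customers": set(),
    "transform_orders": {"extract_orders", "extract_customers"},
    "load_warehouse": {"transform_orders"},
    "refresh_data_mart": {"load_warehouse"},
}


def execution_order(jobs):
    """Return one valid run order in which every job follows its dependencies."""
    return list(TopologicalSorter(jobs).static_order())
```

The two extract jobs have no dependency on each other, so their relative order is not fixed and a scheduler could run them in parallel. A circular dependency makes `static_order()` raise `graphlib.CycleError`.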

Key Issues (cont'd)
- Debugging functionality: very useful for developer support.
- Security: the ETL tool should rely on the underlying database's security.
- Transformation capabilities vs. cleansing capabilities: tools are seldom very strong in both.
- Metadata support: must work with the overall metadata strategy.

Current ETL Market Share
Total market: $667 million

ETL Evaluation
Throughout the following sections, each vendor and its ETL product is evaluated, focusing on the primary differences between the products.

Ascential Software
- Formed in July 2001.
- Focuses on improving, developing, and perfecting its ETL and “back-end” tools.
- Has no current plans to enter the BI tool market.
- The Ascential DataStage product family:
  - is a highly scalable ETL solution;
  - uses end-to-end metadata management and data quality assurance functions;
  - can create and manage scalable, complex data integration for enterprise applications such as CRM, ERP, SCM, BI/analytics, e-business, and data warehouses.

Cognos Corporation
- Founded in 1969.
- Prefers that all components of the enterprise data warehouse be Cognos products.
- DecisionStream integrates easily with Cognos BI tools but has difficulty integrating with other vendors' products.
- DecisionStream is powerful ETL software:
  - allows users to extract and unite data from disparate sources and deliver coordinated business intelligence across the organization;
  - includes advanced data merging, aggregation, and transformation capabilities that let users unite data from different sources and transform it into information using best-practice dimensional design.

Informatica PowerConnect
- An extension to Informatica PowerCenter and PowerCenterRT data integration software.
- Eliminates the need for customers to manually code data extraction programs for their enterprise applications.
- Ensures that mission-critical operational data can be used effectively to inform key business decisions across the enterprise.
- Allows companies to directly source ERP, CRM, real-time message queue, mainframe, AS/400, remote data, and metadata, integrate it with other enterprise data, and deliver it to data warehouses, operational data stores, business intelligence tools, and packaged analytic applications.

Conclusion
- Issues analyzed: development environments, version control, security, metadata exchange standards, and cost.
- Cognos could not compete:
  - based on the relative youth and limitations of its ETL tool;
  - unable to show support for version or revision control or for security provided by the underlying database;
  - the comparison favors non-Cognos products.
- The ETL tools presented by Ascential and Informatica are comparable in numerous ways.
- It would be best to select Informatica as an ETL vendor:
  - more mature and stable as a company;
  - more comprehensive ETL at an efficient price.

Questions?
For copies of the paper, please email Christian Romieh, cromieh@hotmail.com