Demonstration 10 EDW Implementation Strategy and Process 1/10/2012 www.InstantBI.com.

Commercial in Confidence. Copyright 2012 – Instant Business Intelligence.

Agenda
• Standard Implementation Strategy
• Standard Implementation Process
• Prototyping
• Setting up the Staging Area

Standard Implementation Strategy

Prototyping
• “Plan to build it twice. You will anyway”
  • Frederick P. Brooks – The Mythical Man-Month
  • The BEST IT development book ever written
• DWs have become FAR more sophisticated
  • Early 90s (90-94)
    • […] dimension tables and 5-10 fact tables
    • All about ‘sales’ and campaigns
    • These were the high value applications
    • 300+ work days to build the ETL for these in COBOL
  • Today with models like BI4ALL or even custom development
    • 100+ dimension tables and 30+ fact tables (easily)
    • […] work days to build the ETL
    • Much more of the ‘ETL time’ is ‘understanding data’

Prototyping
• Prototyping allows you to
  • Find data understanding errors before writing ETL
  • Quickly tune database model for performance
  • Start developing reports earlier
    • Report development tools have become more ‘complex’
  • Ensure data integrity earlier
    • Data integrity is the #1 killer of ETL productivity
    • We used to only find errors when we wrote the ETL
  • Find ‘assumptions’ that do not hold up for this EDW
  • More easily communicate the ‘end result’ to business users
• If you can’t build the prototype, you can’t build the real thing
• Bottom Line: Strongly recommended

[Architecture diagram. Labels: Source Systems (System 1 to System 10); Extracts; EDW Staging Area; Validate and Clean; Transform and Load; EDW; ODS; Data Marts (Data Mart 1 to Data Mart 6); Direct Connect; Commercial Specific Apps (App 1 to App 14); BI Apps 1 to 3; Reporting Systems]

Standard Implementation Strategy
• Data is ‘somehow’ extracted from source systems
  • Must be careful to detect deletes
  • Can be a very difficult problem to solve
• Data, as raw as possible, is sent to the Staging Area
  • Usually as files, sometimes as ODBC links
  • The OLTP system usually controls the extraction schedule
  • If you extract one field, extract the whole table
• Data is profiled to determine the real data type
  • Staging area tables use ‘real’ data types
• Start developing an understanding of the raw data
  • Understanding data is usually a huge and difficult job
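The profiling step can be illustrated with a minimal sketch. It is not the profiling tool used in this demonstration: the function name `profile_column`, the list of accepted date formats and the returned type labels are all assumptions made for illustration. It scans the raw string values of one extracted column and suggests a ‘real’ type for the staging table.

```python
import re
from datetime import datetime

# Assumed date layouts; a real source system will have its own.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def _is_date(value):
    """True if the value parses under any of the assumed date formats."""
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            pass
    return False


def profile_column(values):
    """Suggest a staging type for one column of raw string values."""
    # Blank and missing values say nothing about the type, so skip them.
    non_blank = [v.strip() for v in values if v is not None and v.strip()]
    if not non_blank:
        return "UNKNOWN"
    if all(re.fullmatch(r"[+-]?\d+", v) for v in non_blank):
        return "INTEGER"
    if all(re.fullmatch(r"[+-]?\d+(\.\d+)?", v) for v in non_blank):
        return "DECIMAL"
    if all(_is_date(v) for v in non_blank):
        return "DATE"
    # Fall back to a string wide enough for the longest value seen.
    return f"VARCHAR({max(len(v) for v in non_blank)})"
```

Note that a numeric date such as `20121001` profiles as INTEGER here. Recognising it as a date still needs a person who understands the data, which is the point made on the slide.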

Standard Implementation Strategy
• Data is loaded into the ‘staging area’
  • Today we can afford RDBMS staging areas (they used to be files)
• As data goes into the staging area, fields are converted from possible native types to RDBMS types
  • Numeric/string dates, codes, flags, numerics in chars
  • Conversion errors must be caught, so both the source and target columns are kept and the source value stays visible
• Calculations within a row are performed
  • Durations/elapsed times, ages etc.
• Three flags are set
  • Row deleted from source – set if the row was deleted
  • Row valid – valid by default – set to ‘N’ if found invalid
  • Row sent to EDW – set to ‘N’ because it has not been sent yet
• It is then possible to run ‘cross table validations’ on the data BEFORE sending it into the EDW
  • Always beware of sending invalid data into a DW
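As a rough sketch of the conversion step, the function below converts two string timestamps, keeps the source values next to the converted ones, computes a duration within the row and sets the three flags. The field names, the `%Y%m%d%H%M%S` layout and the function `stage_row` are hypothetical and are not taken from the actual staging loader.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y%m%d%H%M%S"  # assumed layout of the source strings


def stage_row(raw):
    """Convert one raw record into a staging row, keeping source values."""
    row = {
        "start_src": raw["start"],   # source values stay visible
        "end_src": raw["end"],
        "start_dt": None,            # converted (target) values
        "end_dt": None,
        "duration_secs": None,       # calculation within the row
        "row_del_frm_src_ind": "N",
        "row_valid_ind": "Y",        # valid by default
        "row_sent_to_edw": "N",      # not yet sent to the EDW
    }
    try:
        row["start_dt"] = datetime.strptime(raw["start"], TIMESTAMP_FORMAT)
        row["end_dt"] = datetime.strptime(raw["end"], TIMESTAMP_FORMAT)
        elapsed = row["end_dt"] - row["start_dt"]
        row["duration_secs"] = elapsed.total_seconds()
    except ValueError:
        # Conversion failed: flag the row instead of dropping it.
        row["row_valid_ind"] = "N"
    return row
```

A row that fails conversion is still loaded. Its target columns stay empty, its source columns show what arrived, and the valid flag keeps it out of the EDW.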

Example Staging Table

    CREATE TABLE UNBILLED_CALLS (
        CUM_START_DATE_TIME   DATE               NOT NULL,
        CALL_DIALLED_DIGITS   VARCHAR2(18 BYTE),
        CALL_DURATION         NUMBER(10,2)       NOT NULL,
        CALL_RETAIL_PRICE     NUMBER(14,3)       NOT NULL,
        CALL_BREAKDOWN_CODE   VARCHAR2(5 BYTE)   NOT NULL,
        CALL_DISCOUNT         NUMBER(4,1)        NOT NULL,
        CALL_UNITS            NUMBER(4)          NOT NULL,
        CALL_PP_ALLOWANCE     NUMBER(5,1)        NOT NULL,
        CALL_CLASS            VARCHAR2(5 BYTE),
        CALL_CATEGORY         VARCHAR2(5 BYTE),
        CALL_ORIGINATION      VARCHAR2(2 BYTE),
        SERVICE_CODE          VARCHAR2(2 BYTE)   NOT NULL,
        CUM_CUSTOMER          NUMBER(10)         NOT NULL,
        CUM_SUBSCRIBER        VARCHAR2(18 BYTE)  NOT NULL,
        CALL_DIRECTION        VARCHAR2(2 BYTE),
        CALL_LOCATION         VARCHAR2(13 BYTE),
        CALL_DESTINATION      VARCHAR2(5 BYTE),
        CALL_RECORD_TYPE      VARCHAR2(3 BYTE)   NOT NULL,
        SERVICE_TYPE          VARCHAR2(2 BYTE)   NOT NULL,
        BUCKET_TYPE           VARCHAR2(5 BYTE),
        -- The three staging flags: deleted from source, valid, sent on
        ROW_DEL_FRM_SRC_IND   VARCHAR2(1 BYTE)   DEFAULT 'N' NOT NULL,
        ROW_VALID_IND         VARCHAR2(1 BYTE)   DEFAULT 'Y' NOT NULL,
        ROW_SENT_TO_IWS       VARCHAR2(1 BYTE)   DEFAULT 'N' NOT NULL
    )

Getting Data into Staging Area
• Get data into a Staging Area asap
  • Helps in learning to understand the data
  • Can query/browse much more easily than in files
• We now use (free) utilities to load the staging area prototype
  • Using utilities is faster and less costly than using Informatica for prototype development
  • Read a wide variety of files and load the data ‘as is’
  • Defaults the flags
• Get ALL data for this release into the staging area before starting mapping
  • Or at least as much as possible
  • Late arriving data can only confuse the issue
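To give a feel for what loading a file ‘as is’ with defaulted flags can look like, here is a small sketch. It is not one of the utilities referred to on the slide: SQLite stands in for the staging RDBMS, every data column is created as TEXT, and the function `load_as_is` and the flag column names are assumptions.

```python
import csv
import io
import sqlite3


def load_as_is(conn, table, csv_text):
    """Create a staging table from a CSV header and load every row unchanged."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    data_cols = ", ".join(f'"{name}" TEXT' for name in header)
    # The three flags take their defaults; the loader never sets them.
    conn.execute(
        f'CREATE TABLE "{table}" ({data_cols}, '
        "row_del_frm_src_ind TEXT NOT NULL DEFAULT 'N', "
        "row_valid_ind TEXT NOT NULL DEFAULT 'Y', "
        "row_sent_to_edw TEXT NOT NULL DEFAULT 'N')"
    )
    col_list = ", ".join(f'"{name}"' for name in header)
    marks = ", ".join("?" for _ in header)
    rows = list(reader)
    conn.executemany(
        f'INSERT INTO "{table}" ({col_list}) VALUES ({marks})', rows
    )
    return len(rows)
```

Because nothing is converted on the way in, odd values such as `00090` or `abc` arrive untouched and can be browsed with ordinary queries.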

Cross Table Validations
• Cross table validations might include
  • Checking customer/account exists for a sales record
  • Checking address is a valid address
  • Checking details provided by retailer match other systems such as sell through capture
  • Checking codes entered on tables exist
• The list of ‘possible’ things to check is endless
• EA must decide what validations will stop data from flowing into the EDW
  • These are more likely to be business than technology based
  • We have only built the capability to do so, not the rules themselves
• Can be implemented in Informatica or Stored Procedures
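The first validation on the slide, checking that a customer exists for a sales record, might be sketched as follows. SQLite stands in for the staging database, and the table names `stg_sales` and `stg_customer`, their columns and both functions are hypothetical. The actual rules remain a business decision.

```python
import sqlite3


def flag_orphan_sales(conn):
    """Mark staged sales rows whose customer is missing; return rows flagged."""
    cur = conn.execute(
        """
        UPDATE stg_sales
           SET row_valid_ind = 'N'
         WHERE row_valid_ind = 'Y'
           AND NOT EXISTS (SELECT 1
                             FROM stg_customer c
                            WHERE c.customer_id = stg_sales.customer_id)
        """
    )
    return cur.rowcount


def build_demo_staging():
    """Build a tiny in-memory staging area with one orphaned sale."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE stg_customer (customer_id INTEGER PRIMARY KEY);
        CREATE TABLE stg_sales (
            sale_id       INTEGER PRIMARY KEY,
            customer_id   INTEGER NOT NULL,
            row_valid_ind TEXT NOT NULL DEFAULT 'Y'
        );
        INSERT INTO stg_customer (customer_id) VALUES (10), (11);
        INSERT INTO stg_sales (sale_id, customer_id)
        VALUES (1, 10), (2, 99), (3, 11);
        """
    )
    return conn
```

The invalid row stays in the staging area with its flag set to ‘N’, so it can be inspected and corrected rather than silently lost.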

Data Quality
• Data Quality Measures/Correction
  • Can be implemented in ETL tools if acquired or ‘home grown’
  • Can send data back to source systems if needed
  • Is a whole ‘other topic’
  • But it will be done at some point in the life of the EDW

The ‘Mapping Spreadsheet’
• So, the Staging Area has data in it. ‘What to do next?’
  • Build the left hand side of the ‘mapping spreadsheet’
  • Source to Target Mapping
  • The right hand side starts blank!!
• Once you have developed your understanding of the data and built/loaded the staging area with all the data you want in the ‘current release’, you are ready to perform ‘data mapping’ and Data Modelling.
• To do this, you need to understand current models and have some ideas about the BI4ALL model…
• Now begins the modeling portion of the training…

Data Mapping
• Filling in the right hand side of the mapping spreadsheet (SS)
  • Defining ‘real keys’ from source data
  • Defining tables to be joined/split
  • Defining how to present a ‘view’ of the staging area such that it can be sent into the EDW
  • Defining changes to the EDW model
    • Two columns in SS: Table Exists/Column Exists
    • EDW Modeller sets them for the DBA
    • We will see the ‘in progress’ version later
• Each source field required in the EDW is mapped
  • Notes and comments are included…
  • Database level transformations are included in SS
• Key role in any EDW implementation

Prototype Building
• DBA implements changes to the ODW/EDW Physical Model
  • Tables, indexes, naming standards
  • Key role is to ‘catch mistakes’ by the modeller
  • With 000’s of fields to map, the modeller will make mistakes
• From the mapping spreadsheet (SS) we can now generate prototype ETL
  • Tools available from IBI
  • One of which was just written (SeETL)
  • Includes all elements such as
    • Generated views
    • Updating control tables
    • Running and testing on Windows
  • All prior to building Informatica/DataStage ETL
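The idea of generating views from the mapping spreadsheet can be shown with a toy generator. This is not how SeETL works internally. The function `generate_view`, the view name and the target column names are invented for illustration, and the flag filter assumes the staging flags from the example table.

```python
def generate_view(view_name, source_table, mappings):
    """Build CREATE VIEW text from (source expression, target column) pairs."""
    # One select-list entry per mapped spreadsheet row.
    select_list = ",\n".join(
        f"    {source} AS {target}" for source, target in mappings
    )
    return (
        f"CREATE VIEW {view_name} AS\n"
        f"SELECT\n{select_list}\n"
        f"FROM {source_table}\n"
        # Only valid rows that have not yet been sent are presented.
        "WHERE ROW_VALID_IND = 'Y' AND ROW_SENT_TO_IWS = 'N'"
    )


# Hypothetical mapping rows: left hand side (source), right hand side (target).
example_sql = generate_view(
    "V_CALL_FACT",
    "UNBILLED_CALLS",
    [("CUM_CUSTOMER", "CUSTOMER_KEY"), ("CALL_DURATION", "DURATION_SECS")],
)
```

Because the view text is derived from the spreadsheet, a mapping mistake is corrected in the spreadsheet and the view is regenerated, which is what makes the prototype cheap to change.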

Prototype Building
• Once the Prototype is ‘relatively stable’
  • Sizable amounts of test data can be loaded
  • Prototype can be moved to the deployment platform
  • Presentation Views can be created
  • Cognos Catalogs/Business Objects started
  • Early reports started
  • More learning about real data volumes can happen
• When errors are found (and they will be found)
  • Prototype can be changed easily
  • Generated ETL can be changed easily
• When everyone is happy with the prototype
  • EDW is ‘made real’
  • Real database, ETL, reports

Conclusion
• We have talked about the overall process of EDW Implementation
• So, once you have developed your understanding of the data and built/loaded the staging area with all the data you want in the ‘current release’, you are ready to perform ‘data mapping’ and BI4ALL Modelling.
• To do this, you need to understand the EDW model…
• Now begins the modeling portion of the training…

Thank You for Your Time