Enterprise Data Warehouse A Technical Perspective Tony Dalwood Information Architecture & Management University of South Australia.

Presentation transcript:

Enterprise Data Warehouse: A Technical Perspective
Tony Dalwood
Information Architecture & Management
University of South Australia

IT Structure
- ISTS – Information Strategy & Technology Services
  - Information Strategy
  - Corporate Information Systems
  - E-Business
  - Information Architecture & Management
  - Technical Services
  - Customer Services
  - Network Services
  - Systems Infrastructure

Information Architecture & Management (IAM)
- Merger of DBA team & Information Integration team in Feb 2006
- IAM manages
  - Corporate System Databases (3 DBAs)
  - Operational Data Store Management
  - Middle Tier Apps
  - Student Portal (myUniSA)
  - Staff “Portal” (UniSAinfo)
  - UniSAinfo Reporting
  - EDW

Project Governance
- Steering Group
  - Includes Directors of ISTS, Planning and Assurance Services (PAS), Student & Academic Services (SAS), Finance
- Sponsors Group
  - Director of Planning & Assurance Services
  - Dep. Director Information Strategy
  - Business Project Manager
  - Technical Project Manager
- Reference Group
  - Senior Officers from PAS, HR, Research, SAS, Finance

Project Governance
- Project Team
  - Business Project Manager (PAS)
  - Technical Project Manager (ISTS)
  - Design Architect/Dev Team Leader (ISTS)
  - Business Analyst (x1.5) (PAS)
  - Data Quality Manager (0.5) (PAS)
  - Developers (x3 variant) (ISTS)

EDW Project Milestones
- Aug Business Case submitted by Planning & Assurance Services (PAS) and ISTS to extend current reporting environment to an EDW ($150K)
- Feb 2005 – Project Commenced
- Feb-July 2005 – Data Gathering Workshops
- Sep-Dec 2005 – Technical Research & Proof of Concept (0.5 IT Resource)
- Jan-Feb 2006 – External Consultancy (1 IT Resource)
- May 2006 – First Star Schema complete (Research Publications) (4 IT Resources)
- July 2006 – Three more Star Schemas complete (Research Income, AVCC Data, Research Staff Supervision) (4 IT Resources)
- August 2006 – First “Soft” Production Release (2.5 IT Resources)
- Beyond – Student Data & Finance Data (min 2 IT Resources)
NB: IT Resource figures do not include the part-time Tech Project Manager

Project Goals
Business | Technical
‘One’ Source of the truth | Conformed Dimensions, Consolidated Facts
Performance | Transformed schema design
External Data | Flexible data sources
Simplicity | Pre-calculated measures
Historical Capability | Versioning, Snapshot
Data Quality | Verification, Validation, Audit Trail

By-Products of an EDW Project
- Data Discovery
  - What data do we have
  - How data is used and maintained
  - What is the quality of the data
  - How data can be utilised by more of the organisation
- Enhanced Collaboration
  - Intra and Inter communication between business units, system owners and IT

Technical Project Plan
- “Warehousing” Research
- Proof of Concept exercise
- External Assistance
- Implementation of an Architecture
- Development Standards & Procedures
- Build & Implementation of Stage 1
- Review

Proof of Concept
- Validate Warehouse research findings
- Proof of Concept covered the following topics:
  - Project methodology
  - Technical architecture
  - Design methodology
  - ETL methodology
  - MetaData options
  - Data Quality approach
  - Security implementation options

Project Methodology

Technical Architecture
- Inputs into Architecture
  - Business Goals
  - Existing Reporting Environments
  - Technology
  - Time
  - $$
  - Resources/Skills

Data Flow Architecture

Design Methodology
- Dimensional Modelling chosen as the design philosophy
  - Star Schemas/Snowflakes
  - Facts
  - Dimensions
  - Measures
  - Bridges
- History Retention for Slowly Changing Dimensions
  - Warehouse records are versioned, i.e. never deleted or overwritten.
  - Views to identify “current” records
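
The versioning and “current” view idea can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module rather than Oracle. The table `dim_staff`, its columns, the view name `dim_staff_current` and the sample rows are all hypothetical; the slides do not show the actual UniSA schema. The sketch assumes a NULL end date marks the current version of a record.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_staff (
    staff_key  INTEGER PRIMARY KEY,  -- surrogate key, one per version
    staff_id   TEXT NOT NULL,        -- business key from the source system
    version    INTEGER NOT NULL,
    school     TEXT,
    valid_from TEXT NOT NULL,
    valid_to   TEXT                  -- NULL marks the current version (assumption)
);
-- The view hides history and exposes only the current version of each record.
CREATE VIEW dim_staff_current AS
    SELECT staff_id, version, school, valid_from
    FROM dim_staff
    WHERE valid_to IS NULL;
""")

# Hypothetical sample data: S001 changed school, so it has two versions.
rows = [
    ("S001", 1, "Computing", "2005-02-01", "2006-03-01"),
    ("S001", 2, "Engineering", "2006-03-01", None),
    ("S002", 1, "Nursing", "2005-07-01", None),
]
conn.executemany(
    "INSERT INTO dim_staff (staff_id, version, school, valid_from, valid_to) "
    "VALUES (?, ?, ?, ?, ?)",
    rows,
)

current = conn.execute(
    "SELECT staff_id, version, school FROM dim_staff_current ORDER BY staff_id"
).fetchall()
history_count = conn.execute("SELECT COUNT(*) FROM dim_staff").fetchone()[0]
```

Queries against the view see one row per business key, while the base table keeps every version for historical reporting.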

Transformation of Design - Source

Transformation of Design - Target

ETL Methodology
- Scripts Vs Tool decision
- Tool chosen for following reasons:
  - Already licensed for Oracle Internet Developer Suite that includes Oracle Warehouse Builder
  - Oracle Database environment
  - Oracle technical skills
  - Visibility of Development Environment
  - Auto technical Meta Data generation
  - Auto and accessible code generation using PL/SQL
  - Ability to include custom code
  - Integration with Oracle database and related Oracle technology
  - Framework for Beginners
  - Difficult to evaluate other products without expertise
- Smarts & Effort into Modelling and Design – ETL should be a “no brainer”

MetaData
- Data about Data
- Oracle Warehouse Builder provides technical metadata
- Business MetaData facility currently restricted to documentation and Cognos catalogs
- Evaluation of MetaData methods to be reviewed at the completion of Stage 1 development

Data Quality
- Pre-ETL
  - Technical profile to ensure physical design has mapped appropriate data elements
  - Business profile of source data to identify data attributes, e.g. data type, patterns, nulls, min, max, outliers
- ETL
  - Transform to conformed data sets
  - Foreign Key checks
  - Reporting of anomalies
- Post ETL
  - Final Business profile to validate transformations of data
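
A business profile of a source column along the lines listed above (data type, patterns, nulls, min, max, outliers) might look like the sketch below. It is a generic Python illustration, not the Datiris profiler or any OWB feature. The function names, the 9/A pattern notation and the z-score outlier rule with a threshold of 3 are assumptions made for the example.

```python
import re
from statistics import mean, pstdev


def pattern_of(value):
    """Reduce a string to its shape: every digit becomes 9, every letter A."""
    return re.sub(r"[A-Za-z]", "A", re.sub(r"\d", "9", value))


def profile_column(values, z_threshold=3.0):
    """Profile one source column held as a list of Python values."""
    non_null = [v for v in values if v is not None]
    types = sorted({type(v).__name__ for v in non_null})
    numeric = bool(non_null) and set(types) <= {"int", "float"}
    profile = {
        "count": len(values),
        "nulls": len(values) - len(non_null),
        "types": types,
    }
    # min/max are only meaningful when the values are mutually comparable.
    if len(types) == 1 or numeric:
        profile["min"], profile["max"] = min(non_null), max(non_null)
    if types == ["str"]:
        profile["patterns"] = sorted({pattern_of(v) for v in non_null})
    if numeric:
        mu, sigma = mean(non_null), pstdev(non_null)
        # Assumed outlier rule: more than z_threshold standard deviations from the mean.
        profile["outliers"] = [
            v for v in non_null if sigma and abs(v - mu) / sigma > z_threshold
        ]
    return profile


# Hypothetical sample columns.
id_profile = profile_column(["AB123", "CD456", None, "E7"])
amount_profile = profile_column([10] * 20 + [1000])
```

Running the same profile before and after the ETL gives a simple way to compare source and target columns.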

Security
- Security options implemented are:
  - Database Layer
    - Oracle roles to grant or deny access to database objects based on Business rules
    - Oracle views for granular data security where appropriate
  - User Layer
    - Access to end user Cognos catalogues/cubes controlled via Cognos security mechanisms and filesystem access

Development Lifecycle
- Business Requirements
- Design Process
  - Logical Design
  - Physical Design
  - Data Mapping
  - Data Profiling

Development Lifecycle
- Design & Build ETL Objects & Processes
  - Extraction routines
  - ‘Diff’ routines
    - Tag records as Inserts, Updates or Deletes
  - Build Staging tables
  - Build Target warehouse tables
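
A ‘Diff’ routine of the kind described above can be sketched in a few lines of Python. The slides do not show the actual routine (OWB generates PL/SQL), so the function name `diff_extract`, the dictionary-based extracts and the tuple layout are assumptions for illustration. The idea is only to compare the previous extract with the current one and tag each difference as I, U or D.

```python
def diff_extract(previous, current):
    """Compare two extracts keyed by business key and return tagged diff rows.

    Each result is a tuple (diff_type, key, row) where diff_type is
    "I" (insert), "U" (update) or "D" (delete).
    """
    diff = []
    for key, row in current.items():
        if key not in previous:
            diff.append(("I", key, row))
        elif previous[key] != row:
            diff.append(("U", key, row))
    for key, row in previous.items():
        if key not in current:
            diff.append(("D", key, row))
    return diff


# Hypothetical extracts of a publications source table.
previous = {1: {"title": "Paper A"}, 2: {"title": "Paper B"}, 3: {"title": "Paper C"}}
current = {1: {"title": "Paper A"}, 2: {"title": "Paper B (revised)"}, 4: {"title": "Paper D"}}
diff_rows = diff_extract(previous, current)
```

Unchanged rows produce no diff entry, so only real changes flow on to the staging area.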

Standard ETL Process
- Scheduled Extract/Diff process runs to populate a Diff table in the Staging Area
- ETL process then performs a standard set of steps
  - Load Staging from Diff table
  - Stamp Staging record according to Diff type (U, D or I)
    - Updated Record – Tag staging record as new ‘version’ of core record
    - Deleted Record – Tag staging record as ‘Retired’ record in warehouse
    - Inserted Record – Tag staging record to be new record (version 1)
  - Update Core – End date existing “current” record
  - Load new Core – New “current” record from Staging
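
The standard steps above can be sketched as two small Python functions: one that stamps staging records from tagged diff rows, and one that end-dates the existing current record and loads the new one. All names and the record layout are hypothetical. One point is an assumption not confirmed by the slides: a deleted record is loaded as a new version flagged as retired, so the history is kept and nothing is physically removed.

```python
from datetime import date


def stage(diff, core):
    """Load Staging from diff rows and stamp each record by diff type."""
    latest = {r["key"]: r["version"] for r in core if r["valid_to"] is None}
    staged = []
    for diff_type, key, data in diff:
        staged.append({
            "key": key,
            "data": data,
            # Inserts start at version 1; updates and deletes follow the current version.
            "version": 1 if diff_type == "I" else latest[key] + 1,
            # Assumption: a delete becomes a new version flagged as retired.
            "retired": diff_type == "D",
        })
    return staged


def load_core(core, staged, load_date):
    """Update Core (end-date current records), then Load new Core from Staging."""
    staged_keys = {s["key"] for s in staged}
    for rec in core:
        if rec["key"] in staged_keys and rec["valid_to"] is None:
            rec["valid_to"] = load_date
    for s in staged:
        core.append({
            "key": s["key"], "version": s["version"], "data": s["data"],
            "retired": s["retired"], "valid_from": load_date, "valid_to": None,
        })
    return core


# Hypothetical core table with two current records, and one diff of each type.
core = [
    {"key": "P1", "version": 1, "data": {"title": "Old"}, "retired": False,
     "valid_from": date(2006, 5, 1), "valid_to": None},
    {"key": "P2", "version": 1, "data": {"title": "Gone"}, "retired": False,
     "valid_from": date(2006, 5, 1), "valid_to": None},
]
diff = [("U", "P1", {"title": "New"}), ("D", "P2", {"title": "Gone"}), ("I", "P3", {"title": "Fresh"})]
load_core(core, stage(diff, core), date(2006, 8, 1))
current_records = {r["key"]: r for r in core if r["valid_to"] is None}
```

The sketch does not handle edge cases such as an update for a key that has no current record; a real process would report those as data exceptions.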

Development Lifecycle
- Post ETL
  - Measures
  - Summary data
  - Process Flows to execute ETL
  - Security views
  - End User Layer, e.g. Catalogues

ETL Auditing
- When did a process last run
- How long did it run for
- Did it Succeed, Fail or produce Warnings
- How many records did it alter or insert
- What were the data exceptions
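
The audit questions above map naturally onto one audit record per process run. The sketch below is a generic Python illustration, not OWB's runtime audit facility. The `audited` context manager, the in-memory `audit_log` list and the field names are assumptions. It treats a run that completes with recorded data exceptions as a warning.

```python
import time
from contextlib import contextmanager
from datetime import datetime, timezone

audit_log = []  # stand-in for an audit table


@contextmanager
def audited(process_name):
    """Record start time, duration, status, row count and exceptions for one run."""
    entry = {
        "process": process_name,
        "started_at": datetime.now(timezone.utc),
        "duration_seconds": None,
        "status": None,
        "rows_affected": 0,
        "exceptions": [],
    }
    start = time.perf_counter()
    try:
        yield entry
        entry["status"] = "WARNING" if entry["exceptions"] else "SUCCESS"
    except Exception as exc:
        entry["status"] = "FAILED"
        entry["exceptions"].append(repr(exc))
        raise
    finally:
        entry["duration_seconds"] = time.perf_counter() - start
        audit_log.append(entry)


# Hypothetical load: one good row and one row with a data exception.
with audited("load_publications") as run:
    for row in [{"id": 1, "year": 2005}, {"id": 2, "year": None}]:
        if row["year"] is None:
            run["exceptions"].append(f"row {row['id']}: missing year")
            continue
        run["rows_affected"] += 1
```

Each entry answers the five questions in order: `started_at`, `duration_seconds`, `status`, `rows_affected` and `exceptions`.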

UniSA EDW Toolset
- Oracle Database
- Oracle Warehouse Builder
- Oracle Workflow
- Oracle Enterprise Manager
- Datiris Data profiler
- Cognos Impromptu/Powerplay
- Whiteboard and lots of A3 Paper!!!

Oracle Database
- Options assisting Warehouse implementation
  - External tables
  - Materialised Views
  - Query Rewrite
  - Bitmap indexes
  - Partitioning
  - Star Query optimizer options

Oracle Warehouse Builder
- Provides the design and development environment and framework for the build and deployment of Warehouse objects and transformation processes
- Consists of Design Repository and Runtime components

Oracle Workflow
- Optionally used for job execution with “dependency management”
- Exists as an optional install with RDBMS
- Run as Client/Server or HTTP browser based application
- Workflow engine is a service on the warehouse database server administered by a workflow schema
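
“Dependency management” here means that a job runs only after the jobs it depends on have succeeded. The sketch below shows that idea in plain Python using the standard library `graphlib` module (Python 3.9+). It does not use or imitate the Oracle Workflow API. The job names, the `run_jobs` function and the skip-on-failure rule are assumptions for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical ETL jobs mapped to the jobs they depend on.
dependencies = {
    "load_staging": {"extract_diff"},
    "update_core": {"load_staging"},
    "load_core": {"update_core"},
    "build_measures": {"load_core"},
    "security_views": {"load_core"},
}


def run_jobs(jobs, dependencies):
    """Run each job after its predecessors; skip jobs whose predecessors did not succeed."""
    status = {}
    for name in TopologicalSorter(dependencies).static_order():
        if any(status.get(dep) != "SUCCESS" for dep in dependencies.get(name, ())):
            status[name] = "SKIPPED"
            continue
        try:
            jobs[name]()
            status[name] = "SUCCESS"
        except Exception:
            status[name] = "FAILED"
    return status


order = list(TopologicalSorter(dependencies).static_order())
```

`static_order()` yields every job after all of its predecessors, so a failure in `update_core` would leave `load_core` and everything downstream of it skipped rather than run against incomplete data.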

Oracle Enterprise Manager
- Optionally used as the scheduling option for submitting and monitoring Warehouse Builder processes or workflows
- Base OEM comes with RDBMS
- Optionally run as standalone install or Management Server mode using a web console

Cognos 7.3 Reporting Suite
- Catalogues
  - Report Developer access layer
- Impromptu
  - Reporting capability
- Powerplay
  - Multi-dimensional analysis
- Upfront
  - Web interface

Oracle Warehouse Builder Demonstration

OWB 10g Release 2 - Paris
New Features:
- Design Tool
- Graphic Interface Improvements
- Built in Slowly Changing Dimension property
- Data Profiling/Quality utilities
- Better Integrated Workflow Engine
- Job Scheduling within OWB via OEM

Project Review
- Sanity Check on whole process, architecture, methodology
  - Business & Technical
- Evaluate ROI
- Quantify metrics on time to deliver
- Proposed Future phases
- Usage Statistics
- Hardware adequacy & capacity

Useful Technical References
- Links
  - Oracle Business Intelligence & Technical Sites
  - Rittman Blog
  - Kimball Tips
- Texts
  - Oracle 9iRel2 Data Warehousing - Hobbs
  - Kimball Texts
    - The Data Warehouse Lifecycle Toolkit
    - The Data Warehouse ETL Toolkit

Questions?