Enterprise Data Warehouse: A Technical Perspective. Tony Dalwood, Information Architecture & Management, University of South Australia.

1 Enterprise Data Warehouse: A Technical Perspective
Tony Dalwood, Information Architecture & Management, University of South Australia

2 IT Structure
- ISTS – Information Strategy & Technology Services
  - Information Strategy
  - Corporate Information Systems
  - E-Business
  - Information Architecture & Management
  - Technical Services
  - Customer Services
  - Network Services
  - Systems Infrastructure

3 Information Architecture & Management (IAM)
- Merger of DBA team & Information Integration team in Feb 2006
- IAM manages
  - Corporate System Databases (3 DBAs)
  - Operational Data Store Management
  - Middle Tier Apps
  - Student Portal (myUniSA)
  - Staff “Portal” (UniSAinfo)
  - UniSAinfo Reporting
  - EDW

4 Project Governance
- Steering Group
  - Includes Directors of ISTS, Planning and Assurance Services (PAS), Student & Academic Services (SAS), Finance
- Sponsors Group
  - Director of Planning & Assurance Services
  - Dep. Director Information Strategy
  - Business Project Manager
  - Technical Project Manager
- Reference Group
  - Senior Officers from PAS, HR, Research, SAS, Finance

5 Project Governance
- Project Team
  - Business Project Manager (PAS)
  - Technical Project Manager (ISTS)
  - Design Architect/Dev Team Leader (ISTS)
  - Business Analyst (x1.5) (PAS)
  - Data Quality Manager (0.5) (PAS)
  - Developers (x3 variant) (ISTS)

6 EDW Project Milestones
- Aug 2004 – Business Case submitted by Planning & Assurance Services (PAS) and ISTS to extend current reporting environment to an EDW ($150K)
- Feb 2005 – Project Commenced
- Feb-July 2005 – Data Gathering Workshops
- Sep-Dec 2005 – Technical Research & Proof of Concept (0.5 IT Resource)
- Jan-Feb 2006 – External Consultancy (1 IT Resource)
- May 2006 – First Star Schema complete (Research Publications) (4 IT Resources)
- July 2006 – Three more Star Schemas complete (Research Income, AVCC Data, Research Staff Supervision) (4 IT Resources)
- August 2006 – First “Soft” Production Release (2.5 IT Resources)
- Beyond – Student Data & Finance Data (min 2 IT Resources)
NB: IT Resource figures do not include the part-time Tech Project Manager

7 Project Goals
Business | Technical
‘One’ Source of the truth | Conformed Dimensions, Consolidated Facts
Performance | Transformed schema design
External Data | Flexible data sources
Simplicity | Pre-calculated measures
Historical Capability | Versioning, Snapshot
Data Quality | Verification, Validation, Audit Trail

8 By-Products of an EDW Project
- Data Discovery
  - What data do we have
  - How data is used and maintained
  - What is the quality of the data
  - How data can be utilised by more of the organisation
- Enhanced Collaboration
  - Intra and Inter communication between business units, system owners and IT

9 Technical Project Plan
- “Warehousing” Research
- Proof of Concept exercise
- External Assistance
- Implementation of an Architecture
- Development Standards & Procedures
- Build & Implementation of Stage 1
- Review

10 Proof of Concept
- Validate Warehouse research findings
- Proof of Concept covered the following topics:
  - Project methodology
  - Technical architecture
  - Design methodology
  - ETL methodology
  - MetaData options
  - Data Quality approach
  - Security implementation options

11 Project Methodology

12 Technical Architecture
- Inputs into Architecture
  - Business Goals
  - Existing Reporting Environments
  - Technology
  - Time
  - $$
  - Resources/Skills

14 Data Flow Architecture

15 Design Methodology
- Dimensional Modelling chosen as the design philosophy
  - Star Schemas/Snowflakes
  - Facts
  - Dimensions
  - Measures
  - Bridges
- History Retention for Slowly Changing Dimensions
  - Warehouse records are versioned, i.e. never deleted or overwritten.
  - Views to identify “current” records
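The history-retention idea on this slide can be illustrated with a small sketch. It is not the UniSA implementation (the slides do not show one): the record layout, the field names `valid_from`/`valid_to`, and the helper names are all assumptions made for illustration. A new version end-dates the previous one instead of overwriting it, and a "current" view simply filters on the open-ended rows.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DimVersion:
    """One version of a dimension record (hypothetical layout)."""
    business_key: str
    version: int
    attributes: dict
    valid_from: date
    valid_to: Optional[date] = None  # None marks the "current" version


def add_version(history, business_key, attributes, effective):
    """Append a new version; earlier versions are end-dated, never removed."""
    same_key = [r for r in history if r.business_key == business_key]
    for row in same_key:
        if row.valid_to is None:
            row.valid_to = effective  # close off the previous current row
    next_version = max((r.version for r in same_key), default=0) + 1
    history.append(DimVersion(business_key, next_version, attributes, effective))


def current_view(history):
    """Equivalent of a database view that exposes only current records."""
    return [r for r in history if r.valid_to is None]
```

In a database the same filter would typically live in a view over the versioned table, so that report writers never have to reason about end dates themselves.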

16 Transformation of Design - Source

17 Transformation of Design - Target

18 ETL Methodology
- Scripts vs Tool decision
- Tool chosen for following reasons:
  - Already licensed for Oracle Internet Developer Suite, which includes Oracle Warehouse Builder
  - Oracle Database environment
  - Oracle technical skills
  - Visibility of Development Environment
  - Auto technical Meta Data generation
  - Auto and accessible code generation using PL/SQL
  - Ability to include custom code
  - Integration with Oracle database and related Oracle technology
  - Framework for Beginners
  - Difficult to evaluate other products without expertise
- Smarts & Effort into Modelling and Design – ETL should be a “no brainer”

19 MetaData
- Data about Data
- Oracle Warehouse Builder provides technical metadata
- Business MetaData facility currently restricted to documentation and Cognos catalogs
- Evaluation of MetaData methods to be reviewed at the completion of Stage 1 development

20 Data Quality
- Pre-ETL
  - Technical profile to ensure physical design has mapped appropriate data elements
  - Business profile of source data to identify data attributes, e.g. data type, patterns, nulls, min, max, outliers
- ETL
  - Transform to conformed data sets
  - Foreign Key checks
  - Reporting of anomalies
- Post ETL
  - Final Business profile to validate transformations of data
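As a rough illustration of the pre-ETL business profile listed on this slide (data type, patterns, nulls, min, max, outliers), the sketch below profiles a single column held as a Python list. It is not the Datiris profiler named later in the deck; the function names, the letter/digit pattern notation, and the two-standard-deviation outlier rule are assumptions chosen to keep the example small.

```python
import re
from statistics import mean, pstdev


def value_pattern(text):
    """Reduce a string to its shape: letters become 'A', digits become '9'."""
    return re.sub(r"[0-9]", "9", re.sub(r"[A-Za-z]", "A", text))


def profile_column(values):
    """Summarise one source column before it is mapped into the warehouse."""
    non_null = [v for v in values if v is not None]
    numeric = [v for v in non_null
               if isinstance(v, (int, float)) and not isinstance(v, bool)]
    profile = {
        "count": len(values),
        "nulls": len(values) - len(non_null),
        "types": sorted({type(v).__name__ for v in non_null}),
        "min": min(numeric) if numeric else None,
        "max": max(numeric) if numeric else None,
        "patterns": sorted({value_pattern(v) for v in non_null
                            if isinstance(v, str)}),
        "outliers": [],
    }
    if len(numeric) >= 2:
        mu, sigma = mean(numeric), pstdev(numeric)
        if sigma > 0:
            # Assumed rule: flag values more than two standard deviations out.
            profile["outliers"] = [v for v in numeric if abs(v - mu) > 2 * sigma]
    return profile
```

Running the same profile after the ETL step and comparing the two summaries is one simple way to approach the "final business profile" check.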

21 Security
- Security options implemented are:
  - Database Layer
    - Oracle roles to grant or deny access to database objects based on Business rules
    - Oracle views for granular data security where appropriate
  - User Layer
    - Access to end user Cognos catalogues/cubes controlled via Cognos security mechanisms and filesystem access

22 Development Lifecycle
- Business Requirements
- Design Process
  - Logical Design
  - Physical Design
  - Data Mapping
  - Data Profiling

23 Development Lifecycle
- Design & Build ETL Objects & Processes
  - Extraction routines
  - ‘Diff’ routines
    - Tag records as Inserts, Updates or Deletes
  - Build Staging tables
  - Build Target warehouse tables
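A ‘Diff’ routine of the kind named on this slide can be sketched as a comparison of two extracts keyed by business key. The slides do not describe how the UniSA routines work internally, so the dictionary-based extracts and the `diff_extract` name below are purely illustrative assumptions; only the I/U/D tagging comes from the deck.

```python
def diff_extract(previous, current):
    """Compare the last extract with the new one and tag each change.

    Both arguments map a business key to that record's attribute values.
    Returns a list of rows tagged 'I' (insert), 'U' (update) or 'D' (delete).
    """
    diffs = []
    for key, row in current.items():
        if key not in previous:
            diffs.append({"key": key, "diff_type": "I", "row": row})
        elif previous[key] != row:
            diffs.append({"key": key, "diff_type": "U", "row": row})
    for key, row in previous.items():
        if key not in current:
            # Keep the last known values so the retirement can be audited.
            diffs.append({"key": key, "diff_type": "D", "row": row})
    return diffs
```

Unchanged records produce no diff row, which keeps the staging load proportional to the amount of change rather than the size of the source.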

24 Standard ETL Process
- Scheduled Extract/Diff process runs to populate a Diff table in the Staging Area
- ETL process then performs a standard set of steps
  - Load Staging from Diff table
  - Stamp Staging record according to Diff type (U, D or I)
    - Updated Record – Tag staging record as new ‘version’ of core record
    - Deleted Record – Tag staging record as a ‘Retired’ record in the warehouse
    - Inserted Record – Tag staging record to be new record (version 1)
  - Update Core – End date existing “current” record
  - Load new Core – New “current” record from Staging
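The standard steps above can be sketched in a few lines. This is an interpretation, not the generated Oracle Warehouse Builder code: the list-of-dicts "core table", the field names, and in particular the handling of deletes (the current record is end-dated and marked ‘Retired’ without loading a further row) are assumptions, since the slide does not spell that detail out.

```python
from datetime import date


def apply_diff(core, diff_rows, load_date):
    """Apply tagged diff rows (I, U, D) to a versioned core table."""
    for diff in diff_rows:
        key, diff_type = diff["key"], diff["diff_type"]
        versions = [r for r in core if r["key"] == key]
        current = next((r for r in versions if r["end_date"] is None), None)

        # Update Core: end-date the existing "current" record, if any.
        if current is not None:
            current["end_date"] = load_date

        if diff_type == "D":
            # Assumed behaviour: a delete retires the record, nothing is removed.
            if current is not None:
                current["status"] = "Retired"
            continue

        # Load new Core: inserts start at version 1, updates get the next version.
        next_version = max((r["version"] for r in versions), default=0) + 1
        core.append({
            "key": key,
            "version": next_version,
            "row": diff["row"],
            "start_date": load_date,
            "end_date": None,
            "status": "Current",
        })
    return core
```

Because every step only appends rows or closes off dates, rerunning a report "as at" an earlier date remains possible by filtering on `start_date` and `end_date`.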

25 Development Lifecycle
- Post ETL
  - Measures
  - Summary data
- Process Flows to execute ETL
- Security views
- End User Layer e.g. Catalogues

26 ETL Auditing
- When did a process last run
- How long did it run for
- Did it Succeed, Fail or produce Warnings
- How many records did it alter or insert
- What were the data exceptions
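The audit questions on this slide map naturally onto one audit record per process run. The sketch below is a generic illustration, not the Oracle Warehouse Builder runtime audit: the in-memory `audit_log` list, the `audited` helper, and the field names are all hypothetical, and a real warehouse would write these rows to an audit table.

```python
import time
from contextlib import contextmanager
from datetime import datetime

audit_log = []  # stand-in for an audit table (assumption)


@contextmanager
def audited(process_name):
    """Record when a process ran, for how long, its outcome and row counts."""
    entry = {
        "process": process_name,
        "started": datetime.now(),
        "status": "Succeeded",
        "rows_inserted": 0,
        "rows_updated": 0,
        "exceptions": [],
    }
    start = time.perf_counter()
    try:
        yield entry
        if entry["exceptions"]:
            # Data exceptions were logged but the run itself completed.
            entry["status"] = "Warnings"
    except Exception as exc:
        entry["status"] = "Failed"
        entry["exceptions"].append(repr(exc))
        raise
    finally:
        entry["duration_seconds"] = time.perf_counter() - start
        audit_log.append(entry)
```

An ETL step would wrap its work in `with audited("load_publications") as run:` and update `run["rows_inserted"]` or append to `run["exceptions"]` as it goes; the process name here is made up for the example.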

27 UniSA EDW Toolset
- Oracle Database
- Oracle Warehouse Builder
- Oracle Workflow
- Oracle Enterprise Manager
- Datiris Data profiler
- Cognos Impromptu/Powerplay
- Whiteboard and lots of A3 Paper!!!

28 Oracle Database
- Options assisting Warehouse implementation
  - External tables
  - Materialised Views
  - Query Rewrite
  - Bitmap indexes
  - Partitioning
  - Star Query optimizer options

29 Oracle Warehouse Builder
- Provides the design and development environment and framework for the build and deployment of Warehouse objects and transformation processes
- Consists of Design Repository and Runtime components

31 Oracle Workflow
- Optionally used for job execution with “dependency management”
- Exists as an optional install with RDBMS
- Run as Client/Server or HTTP browser based application
- Workflow engine is a service on the warehouse database server administered by a workflow schema
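“Dependency management” here means that a job runs only after the jobs it depends on have finished. The sketch below shows that idea with Python's standard `graphlib` module (Python 3.9+); it does not use or imitate the Oracle Workflow API, and the job names in the usage note are invented for illustration.

```python
from graphlib import TopologicalSorter


def execution_order(dependencies):
    """Return job names in an order that respects their dependencies.

    `dependencies` maps each job to the set of jobs that must finish first.
    Raises graphlib.CycleError if the jobs depend on each other in a loop.
    """
    return list(TopologicalSorter(dependencies).static_order())
```

For example, passing `{"load_staging": {"extract_diff"}, "load_core": {"load_staging"}}` yields an order in which the extract runs before staging and staging before core.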

32 Oracle Enterprise Manager
- Optionally used as the scheduling option for submitting and monitoring Warehouse Builder processes or workflows
- Base OEM comes with RDBMS
- Optionally run as standalone install or Management Server mode using a web console

33 Cognos 7.3 Reporting Suite
- Catalogues
  - Report Developer access layer
- Impromptu
  - Reporting capability
- Powerplay
  - Multi-dimensional analysis
- Upfront
  - Web interface

34 Oracle Warehouse Builder Demonstration

81 OWB 10g Release 2 - Paris
New Features:
- Design Tool
  - Graphic Interface Improvements
  - Built in Slowly Changing Dimension property
  - Data Profiling/Quality utilities
- Better Integrated Workflow Engine
- Job Scheduling within OWB via OEM

82 Project Review
- Sanity Check on whole process, architecture, methodology
  - Business & Technical
- Evaluate ROI
- Quantify metrics on time to deliver
- Proposed Future phases
- Usage Statistics
- Hardware adequacy & capacity

83 Useful Technical References
- Links
  - Oracle Business Intelligence & Technical Sites
    - http://www.oracle.com/solutions/business_intelligence/index.html
    - http://www.oracle.com/technology/tech/bi/index.html
  - Rittman Blog
    - http://www.rittman.net/
  - Kimball Tips
    - http://www.kimballgroup.com/html/designtips.html
- Texts
  - Oracle 9iRel2 Data Warehousing - Hobbs
  - Kimball Texts
    - The Data Warehouse Lifecycle Toolkit
    - The Data Warehouse ETL Toolkit

84 Questions?

