Building a Data Warehouse...Bring in the Sheaves January 13, 2004 EDUCAUSE Mid-Atlantic Conference Baltimore, Maryland Ella Smith U.S. Department of Agriculture.


1 Building a Data Warehouse...Bring in the Sheaves
January 13, 2004
EDUCAUSE Mid-Atlantic Conference, Baltimore, Maryland
Ella Smith, U.S. Department of Agriculture
Alan Harmon, U.S. Naval Academy

2 Copyright Ella Smith and Alan Harmon, 2004. This work is the intellectual property of the authors. Permission is granted for this material to be shared for non-commercial, educational purposes, provided that this copyright statement appears on the reproduced materials and notice is given that the copying is by permission of the authors. To disseminate otherwise or to republish requires written permission from the authors.

3 Agenda for this Session
• Definition
• Overview of the Initial Process (Proof-of-Concept)
• Organizational Ownership
• Data Warehouse Architecture
• Project Team Composition
• The Process
• Wrap Up
• Questions

4 What is a Data Warehouse?
• Definition: a repository of data derived from operational systems or external sources; NOT an archive
• Purpose: collect and report data in a consistent, centralized manner; a mechanism for conducting longitudinal analysis
• Strategy: target key applications (Admissions, Registrar, Frozen Files), then clean and load the data

5 Benefits of a Data Warehouse
• Cost Savings by reducing the amount of manual time and effort required to compile, organize, and report the data
• Data Consistency among the different areas, since the data will be synchronized upon entry into the data warehouse
• Access to the information will be faster, since the process will be automated and available online (versus paper reports)

6 Loading and Cleaning Data: Opportunity to Integrate, Correct, and Validate Data
[Diagram: data sources feed a data extraction and cleaning step, which loads the Data Warehouse]
Data sources shown:
• Flat files
• Data in databases
• Live data sources
• Applications (SAP, PeopleSoft, Oracle Apps)
Data Extraction and Cleaning (can be very complex):
• Integrate multiple data sources
• Correct data problems (cleanse)
• Validate data
• Summarize and roll up data
• Update metadata
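
The extraction and cleaning steps on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two source extracts, their field names, the sample values, and the validation thresholds are all hypothetical and do not come from the presentation.

```python
from collections import defaultdict

# Hypothetical extracts from two source systems (all values are made up).
admissions = [
    {"ssn": "111-11-1111", "class_year": "2007", "sat_math": "640"},
    {"ssn": "222-22-2222", "class_year": "2007", "sat_math": "n/a"},
]
registrar = [
    {"ssn": "111111111", "gpa": "3.4", "major": "math "},
    {"ssn": "222222222", "gpa": "2.9", "major": "HIST"},
]


def clean_key(ssn):
    """Normalize the shared key so records from both sources line up."""
    return ssn.replace("-", "").strip()


def to_number(text, cast):
    """Cleanse: turn unparseable values into None instead of failing."""
    try:
        return cast(text)
    except ValueError:
        return None


def integrate(admission_rows, registrar_rows):
    """Integrate both sources into one record per student."""
    merged = defaultdict(dict)
    for row in admission_rows:
        merged[clean_key(row["ssn"])].update(
            class_year=int(row["class_year"]),
            sat_math=to_number(row["sat_math"], int),
        )
    for row in registrar_rows:
        merged[clean_key(row["ssn"])].update(
            gpa=to_number(row["gpa"], float),
            major=row["major"].strip().upper(),
        )
    return dict(merged)


def validate(record):
    """Validate: list the problems found in one merged record."""
    problems = []
    if record.get("sat_math") is None:
        problems.append("missing sat_math")
    gpa = record.get("gpa")
    if gpa is None or not 0.0 <= gpa <= 4.0:
        problems.append("gpa out of range")
    return problems


def summarize(students):
    """Roll up: average GPA per class year."""
    by_year = defaultdict(list)
    for rec in students.values():
        if rec.get("gpa") is not None and rec.get("class_year") is not None:
            by_year[rec["class_year"]].append(rec["gpa"])
    return {year: sum(gpas) / len(gpas) for year, gpas in by_year.items()}


students = integrate(admissions, registrar)
issues = {key: validate(rec) for key, rec in students.items() if validate(rec)}
summary = summarize(students)
```

Here `issues` collects the records that need attention before loading, and `summary` holds the rolled-up values. A real load would also record what was changed, which is where the "Update metadata" step comes in.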

7 Online Analytical Processing: Fast and Selective Access to Summarized Data
[Diagram: summarized data cubes accessed through a Registrar view, a Financial view, an Admissions view, and an Ad Hoc view. Dimension labels shown: PROD, MARKET, TIME; MAJORS, CLASS YEAR, TIME, STUDENTS]
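
A minimal sketch of what "fast and selective access to summarized data" means in practice: count students along whichever dimensions a view needs, optionally slicing on another dimension first. The fact rows and the `roll_up` helper are hypothetical; a real OLAP tool would precompute and index these summaries.

```python
from collections import Counter

# Hypothetical fact rows: one per student, tagged with dimension values.
facts = [
    {"major": "MATH", "class_year": 2006, "acyear": 2004},
    {"major": "MATH", "class_year": 2007, "acyear": 2004},
    {"major": "HIST", "class_year": 2007, "acyear": 2004},
    {"major": "HIST", "class_year": 2007, "acyear": 2003},
]


def roll_up(rows, dimensions, **filters):
    """Count rows grouped by `dimensions`, keeping only rows matching `filters`."""
    counts = Counter()
    for row in rows:
        if all(row[name] == value for name, value in filters.items()):
            counts[tuple(row[dim] for dim in dimensions)] += 1
    return dict(counts)


# Students per major across all years.
by_major = roll_up(facts, ["major"])

# Students per major and class year, sliced to academic year 2004.
by_major_2004 = roll_up(facts, ["major", "class_year"], acyear=2004)
```

Each view on the slide corresponds to a different choice of dimensions and filters over the same underlying facts.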

8 DW Development Strategy
• Think & Plan Big
  – Build in small steps; don’t build a BARN! Not an archive system
• Identify your audience
• Use DW to address new areas, add new capabilities, and fix existing problems
• Retain existing transactional systems
• Iterative development approach
  – Address key needs
  – Rapidly deliver capability to users
  – Lower risk

9 Strategy (continued)
• Evolve system in manageable phases
  – Identify questions you need to answer, OR
  – Look at data to determine questions you can answer
• Strategy
  – Develop an overall plan
  – Develop common metadata standards
  – Implement needed pieces mindful of integration and expansion

10 Initial Considerations
• Vision
• Proof-of-Concept / Phased Approach
• Benefits
• Strategy
• Timeline
• Cost
• Issues
  – Data
  – Political constraints
  – Organizational factors

11 1st Step: Proof-of-Concept
• Develop a stand-alone Proof-of-Concept
• Develop model to demonstrate use of new tools to end users
• Provide benchmarks for future planning
• Low-cost way to “test the waters”
• Exposes YOUR data and ability to deal with it
• Define number of tasks and deliverables

12 Proof-of-Concept Timeline
• 6-8 weeks for each increment
  – Requirements: gather and document
  – Data: identify source, construct model, extract data, cleanse data, transport data to database
  – Data Access: user interface, security, training, documentation

13 Proof-of-Concept Timeline

14 Proof-of-Concept Logical Data Model
[Diagram: STUDENT_FACT at the center, linked to the surrounding tables by SSN and ACYR]
• ADMISSION DATA: SSN, CLASS YEAR
• DEMOGRAPHIC DATA: SSN, CLASS YEAR, ETHNICITY, GENDER, HIGH SCHOOL REGION
• REGISTRAR DATA: SSN, CLASS YEAR, GPA, MAJOR
• TIME: #ACYEAR, CLASS YEAR
• STUDENT_FACT: SATVHI, SATMHI, H.S. RANK, H.S. CLASS SIZE

15 Proof-of-Concept Physical Data Model
[Diagram: STUDENT_FACT at the center, linked to the surrounding tables by SSN and ACYR; # marks key columns]
• ADMISSION DATA: #ADMISSION_SSN, SCORE_CLASS
• DEMOGRAPHIC DATA: #DEMO_SSN, DEMO_CLASS, ETHNICITY, GENDER, HIGH SCHOOL REGION
• REGISTRAR DATA: #REGISTRAR_SSN, CLASS, GPA, MAJOR
• TIME: #ACYEAR, CLASS_YEAR
• STUDENT_FACT: #ADMISSION_SSN, #DEMO_SSN, #REGISTRAR_SSN, #ACYEAR, SATVHI, SATMHI, H.S. RANK, H.S. CLASS SIZE
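
The physical model above maps naturally onto a star schema. The sketch below builds it in an in-memory SQLite database and runs one star-join query. Table and column names follow the slide where possible; the column types, the `time_dim` table name, the reading of HIGH SCHOOL REGION as a single column, and every sample row are assumptions made for illustration only.

```python
import sqlite3

# Column types and sample rows are assumptions; the slide lists names only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE admission (
    admission_ssn TEXT PRIMARY KEY,
    score_class   INTEGER
);
CREATE TABLE demographic (
    demo_ssn           TEXT PRIMARY KEY,
    demo_class         INTEGER,
    ethnicity          TEXT,
    gender             TEXT,
    high_school_region TEXT   -- assumed to be a single column
);
CREATE TABLE registrar (
    registrar_ssn TEXT PRIMARY KEY,
    class         INTEGER,
    gpa           REAL,
    major         TEXT
);
CREATE TABLE time_dim (       -- TIME on the slide; renamed here for clarity
    acyear     INTEGER PRIMARY KEY,
    class_year INTEGER
);
CREATE TABLE student_fact (
    admission_ssn TEXT    REFERENCES admission(admission_ssn),
    demo_ssn      TEXT    REFERENCES demographic(demo_ssn),
    registrar_ssn TEXT    REFERENCES registrar(registrar_ssn),
    acyear        INTEGER REFERENCES time_dim(acyear),
    satvhi        INTEGER,
    satmhi        INTEGER,
    hs_rank       INTEGER,
    hs_class_size INTEGER,
    PRIMARY KEY (admission_ssn, demo_ssn, registrar_ssn, acyear)
);
""")

# Made-up rows; "S1" and "S2" stand in for real identifiers.
conn.executemany("INSERT INTO admission VALUES (?, ?)",
                 [("S1", 2007), ("S2", 2007)])
conn.executemany("INSERT INTO demographic VALUES (?, ?, ?, ?, ?)",
                 [("S1", 2007, "unspecified", "F", "Northeast"),
                  ("S2", 2007, "unspecified", "M", "South")])
conn.executemany("INSERT INTO registrar VALUES (?, ?, ?, ?)",
                 [("S1", 2007, 3.4, "MATH"), ("S2", 2007, 2.9, "HIST")])
conn.execute("INSERT INTO time_dim VALUES (?, ?)", (2004, 2007))
conn.executemany("INSERT INTO student_fact VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
                 [("S1", "S1", "S1", 2004, 650, 640, 12, 300),
                  ("S2", "S2", "S2", 2004, 600, 610, 45, 280)])

# Star join: the fact table in the middle, dimensions joined on their keys.
query = """
SELECT t.acyear, d.gender, AVG(r.gpa) AS avg_gpa, COUNT(*) AS students
FROM student_fact AS f
JOIN time_dim    AS t ON t.acyear        = f.acyear
JOIN demographic AS d ON d.demo_ssn      = f.demo_ssn
JOIN registrar   AS r ON r.registrar_ssn = f.registrar_ssn
GROUP BY t.acyear, d.gender
ORDER BY t.acyear, d.gender
"""
rows = conn.execute(query).fetchall()
```

Every analytical question becomes a join from `student_fact` out to whichever dimension tables supply the grouping columns, which is what keeps star-schema queries uniform.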

16 Post-PoC: DW Architecture
• Many types of architecture
  – Star schema, Snowflake, Hybrid
• Depends on:
  – Types of queries
  – Size of database
  – Capability of hardware and software
• Basic Components:
  – Logical Model
  – Physical Model

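17 Physical Data Warehouse Topology
[Network diagram; connections not recoverable from the transcript]
• Offices shown: Admissions, Academic Affairs, Dean of Students, Finance Office, HR, President, Institutional Research, Public Affairs
• Servers shown: Web Server, Database Server, Server for remote connectivity
• Remote users shown: Remote Laptop 1, Remote Laptop 2, General Public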

18 MetaData
Definition: Information about your data
• Centralized description of business rules
  – Describes data and transformations within DW
  – Captures changes in business rules over time to provide a level playing field for comparing data
• Audit trail for data authentication
• Bottom line
  – Increased trust in DW-based analysis results
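
One way to picture "captures changes in business rules over time" is a small registry of rule versions with effective dates, so an analyst can ask which definition applied when a given record was created. The `RuleVersion` class, the `rule_for` helper, and the example rule with its dates are all hypothetical and are not taken from the presentation.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class RuleVersion:
    """One version of a business rule, with the period it was in force."""
    element: str
    definition: str
    source_system: str
    effective_from: date
    effective_to: Optional[date] = None  # None means still in force

    def in_force(self, on: date) -> bool:
        started = self.effective_from <= on
        not_ended = self.effective_to is None or on < self.effective_to
        return started and not_ended


# Hypothetical rule history for a single data element.
RULES = [
    RuleVersion("full_time_student", "12 or more credit hours", "Registrar",
                date(1995, 8, 1), date(2002, 8, 1)),
    RuleVersion("full_time_student", "15 or more credit hours", "Registrar",
                date(2002, 8, 1)),
]


def rule_for(element: str, on: date) -> RuleVersion:
    """Return the single rule version in force for `element` on a date."""
    matches = [r for r in RULES if r.element == element and r.in_force(on)]
    if len(matches) != 1:
        raise LookupError(
            f"expected one rule for {element} on {on}, found {len(matches)}"
        )
    return matches[0]


old_rule = rule_for("full_time_student", date(2001, 10, 1))
new_rule = rule_for("full_time_student", date(2003, 10, 1))
```

Keeping the superseded version instead of overwriting it is what lets counts from different years be compared on the same footing, and the stored history doubles as an audit trail.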

19 Project Team Composition
• Types of Personnel and Level of Skill
  – Analysis & Design (HIGH)
  – Implementation (MED)
  – Test & Quality Assurance (LOW)
• Skill = $$$
• Vary skill by task to control cost

20 The Project Model: “Roles and Responsibilities”
[Organization chart; reporting lines not recoverable from the transcript]
• Roles shown: Steering Committee, Project Manager, Quality Assurance, Programmer, Modeler, DBA, Tool Programmers, End User Liaison, Documentation
• Annotations shown: Planning, Reporting, Certification; Joint Client and Consultant; Test and Map to Requirements

21 The Project Model: “Roles and Responsibilities”
[Matrix of roles by project phase; cell layout not recoverable from the transcript]
• Roles: Programmer, Modeler, DBA, Tool Programmers, End User Liaison, Documentation
• Phases: Analysis Phase, Architecture Phase, Implementation Phase, Transition Phase
• Activities named in the cells: Scoping, Infrastructure, Modeling, Cleaning, Capacity Planning, Prototyping, ETL, Implementation, Building, QA, QA / Training, Documentation, Training

22 The Harvest!
• Review requirements and results periodically
  – At end of each phase
  – Annually, taken as a whole
• Optimize data warehouse
  – Response based on queries and load
  – Bring in-line with operational systems
• Review and adjust the DW mission as institutional mandates change

23 Cost Control
• Start small and develop in phases
• Bring in skill sets as needed
  – Remember: $$$ = (Skills) x (period of time)
• Institutional staff should know the data
• Organizational issues need to be resolved by the Project Manager and Steering Committee
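
The rule of thumb $$$ = (Skills) x (period of time) can be worked through with a small calculation. The weekly cost units and the week counts below are hypothetical placeholders, chosen only to show the arithmetic; the skill levels per task follow the Project Team Composition slide.

```python
# Relative weekly cost per skill level (hypothetical units, not real rates).
WEEKLY_COST = {"HIGH": 5, "MED": 3, "LOW": 1}

# Hypothetical 8-week increment: (task, skill level, weeks).
PLAN = [
    ("Analysis & Design", "HIGH", 3),
    ("Implementation", "MED", 4),
    ("Test & Quality Assurance", "LOW", 1),
]


def plan_cost(plan, weekly_cost):
    """Cost when each task is staffed at its own skill level."""
    return sum(weekly_cost[skill] * weeks for _task, skill, weeks in plan)


def flat_cost(plan, weekly_cost, skill):
    """Cost when every task is staffed at one skill level for comparison."""
    total_weeks = sum(weeks for _task, _skill, weeks in plan)
    return total_weeks * weekly_cost[skill]


varied = plan_cost(PLAN, WEEKLY_COST)            # 3*5 + 4*3 + 1*1 = 28
all_high = flat_cost(PLAN, WEEKLY_COST, "HIGH")  # 8 * 5 = 40
```

With these made-up numbers, varying skill by task costs 28 units against 40 for staffing the whole increment at the highest skill level.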

24 Accountability
• MUST show results (standard or ad hoc reports)
• Ensure complete documentation to maintain responsibility and association of data to departments
• Establish a Return-on-Investment (ROI), whether tangible (number of reports) or intangible (executive support/decision making)

25 Issues
• Security
• Performance
• Managing the metadata
• Managing the data warehouse
• Hardware/software configuration
• Resources
• Staying in the loop!

26 Building a Data Warehouse

