Data Warehouse Overview September 28, 2012 presented by Terry Bilskie
Presentation Objectives: Data Warehouse Overview
• Definition
• Benefits & Considerations
• Terminology
• Architecture
• Information Access Maturity
• Roadmap to a more Data Driven Institution
Data Warehouse, is it clear to you?
Data Warehouse Definition
A data warehouse is a subject-oriented, integrated, time-variant, nonvolatile collection of data in support of management’s decision-making process.
Data Warehouse is not:
• A single physical piece of hardware or a software product
• A single project with an end
• A single solution or product
Data Warehouse is:
• A necessary component for achieving higher-end reporting and analysis capability with respect to historical data, current trends, and future projections
• A data source
• A combination of software and hardware
Subject-oriented
• A data warehouse is organized around subjects such as sales, product, and customer.
• It focuses on modeling and analysis of data for decision makers.
• It excludes data that is not useful in the decision support process.
Integration
• A data warehouse is constructed by integrating multiple heterogeneous sources.
• Data preprocessing is applied to ensure consistency.
[Diagram: RDBMS, legacy system, and flat file sources feed the data warehouse through data processing and data transformation]
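As a rough illustration of the integration step, the sketch below merges records from two hypothetical sources, a relational extract and a flat file, into one consistent format. The field names, the gender codes, the sample rows, and the `normalize_*` helpers are all assumptions made for this example; none of them come from the presentation.

```python
import csv
import io

# Hypothetical extract from a relational source: gender coded as M/F,
# dates already in ISO format.
rdbms_rows = [
    {"student_id": 101, "gender": "M", "enrolled": "2012-08-20"},
    {"student_id": 102, "gender": "F", "enrolled": "2012-08-21"},
]

# Hypothetical flat-file extract: gender coded as 1/0, MM/DD/YYYY dates.
flat_file = io.StringIO("id,sex,start\n103,1,08/22/2012\n104,0,08/23/2012\n")

GENDER_FROM_FLAT = {"1": "M", "0": "F"}


def normalize_rdbms(row):
    """Relational rows already use the target format; copy the fields across."""
    return {
        "student_id": int(row["student_id"]),
        "gender": row["gender"],
        "enrolled": row["enrolled"],
    }


def normalize_flat(row):
    """Map flat-file codes and MM/DD/YYYY dates onto the target format."""
    month, day, year = row["start"].split("/")
    return {
        "student_id": int(row["id"]),
        "gender": GENDER_FROM_FLAT[row["sex"]],
        "enrolled": f"{year}-{month}-{day}",
    }


def integrate(rdbms, flat):
    """Return one list of records in a single, consistent format."""
    merged = [normalize_rdbms(r) for r in rdbms]
    merged += [normalize_flat(r) for r in csv.DictReader(flat)]
    return merged


warehouse_rows = integrate(rdbms_rows, flat_file)
```

After the call, every record in `warehouse_rows` uses the same keys, the same M/F coding, and the same date format, regardless of which source it came from.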
Time-variant
• Provides information from a historical perspective, e.g. the past 5-10 years
• Every key structure contains, either implicitly or explicitly, an element of time
Nonvolatile
• Data, once recorded, cannot be updated.
• A data warehouse requires only two data operations: the initial loading of data and the access of data.
[Diagram: load, access]
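The time-variant and nonvolatile properties can be sketched together as an append-only store with just a load operation and access operations. The `SnapshotStore` class, its method names, and the sample records are hypothetical and only illustrate the idea; a real warehouse would do this in a database.

```python
from datetime import date


class SnapshotStore:
    """Append-only store: rows are loaded with a snapshot date and never updated."""

    def __init__(self):
        self._rows = []  # list of (snapshot_date, record) tuples

    def load(self, snapshot_date, records):
        """Load operation: append copies of the records under a snapshot date."""
        for record in records:
            self._rows.append((snapshot_date, dict(record)))

    def as_of(self, snapshot_date):
        """Access operation: return the records loaded for one snapshot date."""
        return [rec for day, rec in self._rows if day == snapshot_date]

    def history(self, student_id):
        """Access operation: (date, status) pairs for one student, oldest first."""
        matches = [
            (day, rec["status"])
            for day, rec in self._rows
            if rec["student_id"] == student_id
        ]
        return sorted(matches)


store = SnapshotStore()
# Made-up sample data: the same student captured in two yearly snapshots.
store.load(date(2011, 9, 1), [{"student_id": 101, "status": "First Time"}])
store.load(date(2012, 9, 1), [{"student_id": 101, "status": "Returning"}])
```

Because the second load appends rather than overwrites, `store.history(101)` still shows the 2011 status alongside the 2012 one, which is what makes point-in-time analysis possible.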
Data Warehouse Benefits
• Speed up reporting
• Reduce reporting load on transactional systems
• Make institutional data more user-friendly and accessible
• Integrate data from different source systems
• Enable ‘point-in-time’ analysis and trending over time
• Help identify and resolve data integrity issues, either in the warehouse itself or in the source systems that collect the data
Data Warehouse Benefits
• Has a subject area orientation
• Integrates data from multiple, diverse sources
• Allows for analysis of data over time
• Adds ad hoc reporting and enquiry
• Provides analysis capabilities to decision makers
• Relieves the development burden on IT
Data Warehouse Benefits
• Provides improved performance for complex analytical queries
• Relieves processing burden on transaction-oriented databases
• Allows for a continuous planning process
• Converts corporate data into strategic information
Data Warehouse Considerations
• High-level support
• Identification of reporting needs by subject area and organizational role
• Bridging the gap between reporting needs and technical specifications
• Partnerships with central and campus administrative areas
• Customer support and training
Data Warehouse Terminology
• Data Warehouse: A copy of transaction data specifically structured for querying and reporting
• Data Mart: A logical subset of the complete data warehouse
• OLAP (On-Line Analytic Processing): The activity of querying and presenting text and number data, usually with underlying multidimensional ‘cubes’ of data
• Dimensional Modeling: A specific discipline for modeling data that is an alternative to entity-relationship (E/R) modeling; usually employed in data warehouses and OLAP systems
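To make the terms above more concrete, here is a minimal sketch of a dimensional (star schema) model using Python's built-in `sqlite3` module: one fact table surrounded by dimension tables, plus a view standing in for a data mart. The table names, column names, and headcount figures are invented for illustration and do not describe any actual institutional schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Star schema: two dimension tables and one fact table that references them.
conn.executescript("""
CREATE TABLE dim_campus (campus_key INTEGER PRIMARY KEY, campus_name TEXT);
CREATE TABLE dim_term (term_key INTEGER PRIMARY KEY, term_name TEXT);
CREATE TABLE fact_enrollment (
    campus_key INTEGER REFERENCES dim_campus(campus_key),
    term_key INTEGER REFERENCES dim_term(term_key),
    headcount INTEGER
);
""")

# Invented sample data.
conn.executemany("INSERT INTO dim_campus VALUES (?, ?)",
                 [(1, "Vincennes"), (2, "Jasper")])
conn.executemany("INSERT INTO dim_term VALUES (?, ?)",
                 [(1, "Fall 2011"), (2, "Fall 2012")])
conn.executemany("INSERT INTO fact_enrollment VALUES (?, ?, ?)",
                 [(1, 1, 100), (1, 2, 120), (2, 1, 30), (2, 2, 35)])

# A data mart sketched as a logical subset of the warehouse: one campus only.
conn.execute("""
CREATE VIEW mart_jasper AS
SELECT term_key, headcount FROM fact_enrollment WHERE campus_key = 2
""")

# A typical reporting query: join the fact table to a dimension and aggregate.
totals = conn.execute("""
    SELECT c.campus_name, SUM(f.headcount)
    FROM fact_enrollment AS f
    JOIN dim_campus AS c ON c.campus_key = f.campus_key
    GROUP BY c.campus_name
    ORDER BY c.campus_name
""").fetchall()

mart_rows = conn.execute(
    "SELECT term_key, headcount FROM mart_jasper ORDER BY term_key"
).fetchall()
```

With the sample rows above, `totals` holds one summed headcount per campus and `mart_rows` holds only the Jasper rows. The fact table stays narrow (keys and measures), while descriptive attributes live in the dimensions, which is the main structural difference from an E/R model.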
Data Warehouse Architecture
What makes up a Data Warehouse?
• Concepts
• Characteristics
• Logical & Physical Components
A Data Warehouse Is A Component
[Diagram: data flows from source OLTP systems, through extract, scrub, and transform, into the data warehouse (central repository), with load, index, and aggregation steps, on to architected data marts, and through replication and data set distribution to end user workstations for access & analysis. Data characteristics by stage: raw detail with no/minimal history; integrated, scrubbed, history, summaries; targeted, specialized (OLAP). Supporting functions: design, mapping, resource scheduling & distribution, meta data, system monitoring.]
Tiered Architecture
[Diagram: data sources (operational databases, external sources) are extracted, transformed, loaded, and refreshed into data storage (data warehouse, data marts), which serves an OLAP engine (OLAP server) that feeds front-end tools (analysis, query/reports, data mining).]
• Tier 1: Data Warehouse Server
• Tier 2: OLAP Server
• Tier 3: Clients
Data Warehouse Architecture
• Data warehouse server: almost always a relational DBMS, rarely flat files
• OLAP servers: support and operate on multi-dimensional data structures
• Clients: query and reporting tools, analysis tools, data mining tools
Data Warehouse from a logical perspective
Another look from a logical perspective
How it fits into Business Intelligence Viewpoint
Data Warehouse from a conceptual perspective
A data warehouse is based on a multidimensional data model, which views data in the form of a data cube.
Conceptual Model
[Diagram: a data cube of student profile data. Labels recovered from the figure: type of student (First Time, Returning, Transfer, At Risk); Vincennes, Jasper Campus, Indianapolis, Out of State; a third axis numbered 1 to 4; "sum" and "ALL" totals along the edges.]
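The data cube idea can be sketched in a few lines of Python: every fact is counted once in each cell it belongs to, including the cells where one or more dimensions are rolled up to ALL. The dimension names loosely follow the student profile figure, but the fact rows, the headcounts, and the `build_cube` helper are made up for this example.

```python
from collections import defaultdict
from itertools import product

# Hypothetical fact rows: (student_type, campus, period, headcount).
facts = [
    ("First Time", "Vincennes", 1, 40),
    ("First Time", "Jasper", 1, 10),
    ("Returning", "Vincennes", 1, 60),
    ("Returning", "Jasper", 2, 20),
]

ALL = "ALL"


def build_cube(rows):
    """Aggregate headcount for every combination of dimension value or ALL."""
    cube = defaultdict(int)
    for student_type, campus, period, count in rows:
        # Each fact contributes to 2**3 cells: every dimension is either
        # kept at its own value or rolled up to ALL.
        for key in product((student_type, ALL), (campus, ALL), (period, ALL)):
            cube[key] += count
    return dict(cube)


cube = build_cube(facts)
```

Looking up `cube[("First Time", ALL, ALL)]` gives the total for first-time students across all campuses and periods, and `cube[(ALL, ALL, ALL)]` gives the grand total. Precomputing every cell like this is only practical for small examples; OLAP servers use more economical storage and aggregation strategies.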
Data to Knowledge Process
How a Data Warehouse fits within our overall Data Governance
Current Strategy / Approach
Current Data Access Delivery Mechanisms & Tools
• Ad-hoc Reporting Access
• Scheduled and On-Demand Report Generation
• Using tools such as e~Print, Discoverer, MS Access and Excel, jobsub, Population Selection, Argos, etc.
Data Driven Framework Pillars of Success
Data Warehouse Concepts: Questions and Answers