Competing on Analytics II

Presentation transcript:

Competing on Analytics II: BI Tools and Techniques. Robert Monroe, April 1, 2008.

Goals: Introduce dimensional models (aka star schemas) and explain why they are useful in data warehousing. Explore some common business processes, both inward-facing and outward-facing, that companies choose as a basis for analytic competition. Understand the type of data management and warehousing infrastructure needed to support those analytic capabilities.

ETL Extended Dance Remix: Dimensional Modeling

Quick Review: The Three-Layer Data Architecture. Data goes through three common stages during ETL. Operational data: transactional data stored in individual systems of record throughout the organization. Reconciled data: detailed, current data intended to be the single, authoritative source for all decision support applications. Derived data: data that have been selected, formatted, and aggregated for end-user decision support applications.

Quick Review: The ETL Process Diagram Source: Hoffer, Prescott, McFadden, Modern Database Management, 7th ed.

Quick Review: Typical Data Warehouse Structure Reconcile Data Derive Data Diagram Source: Hoffer, Prescott, McFadden, Modern Database Management, 7th ed.

Derived Data. Although reconciled data provides a consistent, high-quality collection of enterprise data, it is not necessarily in an efficient form for use by BI tools. Derived data objectives: ease of use for decision support applications; fast response to predefined user queries; customized data for particular target audiences; ad-hoc query support; data mining capabilities. Characteristics: detailed (mostly periodic) data; aggregated (for summary); processed; distributed (to data marts).

Dimensional Modeling: Facts and Dimensions. Dimensional modeling is a simple database design in which dimensional data are separated from fact or event data. Dimensional models are also sometimes called star schemas. They are a common way to represent derived data for informational data stores: well suited to ad-hoc queries and OLAP, poorly suited for transaction processing, and commonly used as the storage model for data warehouses and data marts.

Dimensional Modeling. Dimensional modeling is a simple database design pattern in which dimensional data are separated from fact or event data. Dimensional models are also sometimes called star schemas. They are a common way to represent derived data for analytic data stores: well suited to ad-hoc queries and OLAP, poorly suited for transaction processing.

Star Schema Structure. Fact tables contain factual or quantitative data. Dimension tables contain descriptions about the subjects of the business. Dimension tables are denormalized to maximize performance. There is a 1:N relationship between dimension tables and fact tables. Diagram Source: Hoffer, Prescott, McFadden, Modern Database Management, 7th ed.

Star Schema Example. Dimension tables provide details on stores, products, and time periods. The fact table provides statistics for sales broken down by the product, period, and store dimensions. Diagram Source: Hoffer, Prescott, McFadden, Modern Database Management, 7th ed.

Star Schema Example With Data. Tables shown: Product, Period, Store, Sales. Diagram Source: Hoffer, Prescott, McFadden, Modern Database Management, 7th ed.
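
The sales example above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not the textbook's schema: the column names (product_code, units_sold, dollars_sold, and so on), the sample rows, and the helper dollars_by_store_and_quarter are all assumptions made for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes (column names are assumed).
CREATE TABLE product (product_code INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE period  (period_code INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
CREATE TABLE store   (store_code INTEGER PRIMARY KEY, store_name TEXT, city TEXT);
-- Fact table: one row per product/period/store, holding the measures.
CREATE TABLE sales (
    product_code INTEGER REFERENCES product(product_code),
    period_code  INTEGER REFERENCES period(period_code),
    store_code   INTEGER REFERENCES store(store_code),
    units_sold   INTEGER,
    dollars_sold REAL,
    PRIMARY KEY (product_code, period_code, store_code)
);
""")

# Made-up sample data.
conn.executemany("INSERT INTO product VALUES (?, ?)", [(1, "Widget"), (2, "Gadget")])
conn.executemany("INSERT INTO period VALUES (?, ?, ?)", [(10, 2008, 1), (11, 2008, 2)])
conn.executemany("INSERT INTO store VALUES (?, ?, ?)",
                 [(100, "Downtown", "Pittsburgh"), (101, "Airport", "Pittsburgh")])
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?)", [
    (1, 10, 100, 5, 50.0),
    (2, 10, 100, 2, 80.0),
    (1, 11, 100, 7, 70.0),
    (1, 10, 101, 3, 30.0),
])

def dollars_by_store_and_quarter(conn):
    """Join the fact table to two dimensions and total one measure."""
    return conn.execute("""
        SELECT s.store_name, p.year, p.quarter, SUM(f.dollars_sold)
        FROM sales f
        JOIN store s  ON s.store_code = f.store_code
        JOIN period p ON p.period_code = f.period_code
        GROUP BY s.store_name, p.year, p.quarter
        ORDER BY s.store_name, p.year, p.quarter
    """).fetchall()

rows = dollars_by_store_and_quarter(conn)
for row in rows:
    print(row)
```

The query shape (fact table in the middle, one join per dimension, then GROUP BY on dimension attributes) is what most ad-hoc star schema queries look like.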

Dimensional Model Benefits. Simple and predictable framework. Well suited to ad-hoc analytical queries. Relatively straightforward mapping from most transactional systems. Dimensional independence: query performance is somewhat independent of the dimensions used in the query. Straightforward model extensions support evolution.

Dimension Hierarchies. Dimension tables can capture hierarchies. Dimensions use levels to represent hierarchies. Each sub-level subdivides the parent level with finer granularity. Examples: Dimension: Time Period; Levels: Year :: Quarter :: Month :: Week :: Day. Dimension: Organization; Levels: Company :: Division :: Department :: Employee. Exercise: Define the levels for a Geography dimension for customer locations.
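
Rolling up along a hierarchy amounts to grouping by the dimension's level columns down to the chosen level. The sketch below assumes a hypothetical period dimension with one column per level and a hypothetical roll_up helper. It leaves out the Week level, because weeks can span month boundaries and so do not nest cleanly under Month.

```python
import sqlite3

# Hierarchy levels from coarse to fine (Week omitted, see the note above).
LEVELS = ["year", "quarter", "month", "day"]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE period (
    period_key INTEGER PRIMARY KEY,
    year INTEGER, quarter INTEGER, month INTEGER, day INTEGER
);
CREATE TABLE sales (
    period_key INTEGER REFERENCES period(period_key),
    units_sold INTEGER
);
""")
conn.executemany("INSERT INTO period VALUES (?, ?, ?, ?, ?)", [
    (1, 2008, 1, 1, 15),
    (2, 2008, 1, 2, 3),
    (3, 2008, 2, 4, 1),
])
conn.executemany("INSERT INTO sales VALUES (?, ?)", [(1, 10), (1, 5), (2, 20), (3, 7)])

def roll_up(conn, level):
    """Total units_sold at the given hierarchy level.

    Column names come only from the fixed LEVELS list, so building the
    SQL text with an f-string is safe here; an unknown level raises
    ValueError from LEVELS.index.
    """
    cols = LEVELS[: LEVELS.index(level) + 1]
    col_list = ", ".join("p." + c for c in cols)
    sql = (
        f"SELECT {col_list}, SUM(f.units_sold) "
        "FROM sales f JOIN period p ON p.period_key = f.period_key "
        f"GROUP BY {col_list} ORDER BY {col_list}"
    )
    return conn.execute(sql).fetchall()

print(roll_up(conn, "year"))
print(roll_up(conn, "quarter"))
print(roll_up(conn, "month"))
```

Each finer level adds one more column to the GROUP BY, which is the "sub-level subdivides the parent level" idea expressed as a query.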

Facts and Measures. Measures represent the interesting data at the intersection of different dimensions. There is a space for a measure at every intersection of every level of every dimension. Base facts are stored at the intersections of the lowest-level dimensions (either simple or calculated measures). Aggregate or computed values are stored at the intersections where not all of the dimensions are at the lowest level (aggregate values must be calculated measures).

Modeling Hierarchies. Dimension tables frequently model hierarchies. Example: a Customers dimension stores data about your customers. You may sell to several divisions of a single company. You want to be able to analyze sales to the individual divisions and also capture “rolled-up” values for the parent company. Diagram: Divisions of ABC Automotive. Diagram Source: Hoffer, Prescott, McFadden, Modern Database Management, 7th ed.

Modeling Hierarchies With Denormalized Tables. Hierarchical dimensions are frequently represented with denormalized tables. This approach simplifies and speeds queries, at the expense of introducing anomalies. Customer_Dimension columns: Parent_Company, Customer_Key, Name, Address, Type. Rows: <null> C000001 ABC Automotive 100 1st St. Dealer; C000002 ABC Auto Sales 110 1st St. Sales; C000003 ABC Repair 130 1st St. Service; C000004 ABC Auto New Sales; C000005 ABC Auto Used Sales; C000006 Bubba’s House O’ Cars 5432 Maple Ln
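
A rough sketch of how a Parent_Company column supports rolled-up totals. The parent assignments and sales figures below are assumptions made for illustration, since the flattened table above does not preserve them, and the hierarchy is limited to one level. A deeper hierarchy would need a recursive query or an extra helper table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_dimension (
    customer_key   TEXT PRIMARY KEY,
    parent_company TEXT REFERENCES customer_dimension(customer_key),
    name           TEXT
);
CREATE TABLE sales (
    customer_key TEXT REFERENCES customer_dimension(customer_key),
    dollars_sold REAL
);
""")

# Assumed hierarchy: two divisions report to ABC Automotive.
conn.executemany("INSERT INTO customer_dimension VALUES (?, ?, ?)", [
    ("C000001", None, "ABC Automotive"),
    ("C000002", "C000001", "ABC Auto Sales"),
    ("C000003", "C000001", "ABC Repair"),
    ("C000006", None, "Bubba's House O' Cars"),
])
# Made-up sales amounts.
conn.executemany("INSERT INTO sales VALUES (?, ?)", [
    ("C000002", 100.0),
    ("C000003", 40.0),
    ("C000001", 10.0),
    ("C000006", 25.0),
])

def sales_by_parent_company(conn):
    """Roll each customer's sales up to its parent (or itself if it has none)."""
    return conn.execute("""
        SELECT root.name, SUM(f.dollars_sold)
        FROM sales f
        JOIN customer_dimension c ON c.customer_key = f.customer_key
        JOIN customer_dimension root
          ON root.customer_key = COALESCE(c.parent_company, c.customer_key)
        GROUP BY root.customer_key, root.name
        ORDER BY root.name
    """).fetchall()

rolled_up = sales_by_parent_company(conn)
print(rolled_up)
```

Division-level analysis uses the same tables without the second join, which is why the single denormalized table is convenient for both views.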

In-Class Exercise: Star Schema. Form teams of 2-3 people. Complete exercise 2, question #1 on the handout: build a star schema to store grades at Millenium College.

Modeling Dates and Time. Dates, times, and time periods are almost always included in BI systems, but they can be tricky to model. Decide on the granularity of dates carefully (it has a big effect on size). A common approach to modeling dates as a dimension is shown in the diagram. Diagram Source: Hoffer, Prescott, McFadden, Modern Database Management, 7th ed.
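
To make the granularity point concrete, here is a small sketch that generates a day-level date dimension in plain Python. The function name build_date_dimension, the YYYYMMDD integer key, and the attribute names are assumptions for illustration, not the layout from the textbook diagram.

```python
from datetime import date, timedelta

def build_date_dimension(start, end):
    """Return one dimension row (a dict) per calendar day from start to end inclusive."""
    rows = []
    current = start
    while current <= end:
        rows.append({
            # Surrogate-style key in YYYYMMDD form (an assumed convention).
            "date_key": current.year * 10000 + current.month * 100 + current.day,
            "year": current.year,
            "quarter": (current.month - 1) // 3 + 1,
            "month": current.month,
            "day": current.day,
            # ISO weekday: Monday is 1, Sunday is 7.
            "weekday": current.isoweekday(),
        })
        current += timedelta(days=1)
    return rows

days_2008 = build_date_dimension(date(2008, 1, 1), date(2008, 12, 31))
print(len(days_2008), days_2008[0], days_2008[-1])
```

Day granularity costs a few hundred rows per year. Moving to hour or minute granularity multiplies that by 24 or 1,440, which is the size effect the slide warns about.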

Multiple Fact Tables. It is frequently useful to store more than one type of fact in a single multidimensional database (star schema). This can be handled by using multiple fact tables that share dimensions. Example: modeling products sold and products purchased. Diagram Source: Hoffer, Prescott, McFadden, Modern Database Management, 7th ed.

Conformed Dimensions. When dimensions are shared across multiple fact tables, they must be conformed dimensions. Conformed dimensions: one or more dimension tables associated with two or more fact tables, for which the dimension tables have the same business meaning and primary key with each fact table. Conformed dimensions allow users to query across multiple fact tables, and they improve the consistency of meaning and structure for derived and retrieved information.
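
A minimal sketch of querying across two fact tables through a conformed dimension, following the products sold and products purchased example. The table names (sales_fact, purchases_fact), column names, and data are hypothetical. Each fact table is aggregated on its own before the join, so rows from one fact table cannot multiply rows from the other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One product dimension shared by both fact tables, with the same key.
CREATE TABLE product (product_key INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE sales_fact (
    product_key INTEGER REFERENCES product(product_key),
    units_sold  INTEGER
);
CREATE TABLE purchases_fact (
    product_key     INTEGER REFERENCES product(product_key),
    units_purchased INTEGER
);
""")
conn.executemany("INSERT INTO product VALUES (?, ?)", [(1, "Widget"), (2, "Gadget")])
conn.executemany("INSERT INTO sales_fact VALUES (?, ?)", [(1, 5), (1, 3), (2, 4)])
conn.executemany("INSERT INTO purchases_fact VALUES (?, ?)", [(1, 10), (2, 6), (2, 6)])

def sold_vs_purchased(conn):
    """Compare measures from two fact tables, lined up on the shared dimension."""
    return conn.execute("""
        SELECT p.description,
               COALESCE(s.units, 0) AS units_sold,
               COALESCE(b.units, 0) AS units_purchased
        FROM product p
        LEFT JOIN (SELECT product_key, SUM(units_sold) AS units
                   FROM sales_fact GROUP BY product_key) s
               ON s.product_key = p.product_key
        LEFT JOIN (SELECT product_key, SUM(units_purchased) AS units
                   FROM purchases_fact GROUP BY product_key) b
               ON b.product_key = p.product_key
        ORDER BY p.description
    """).fetchall()

comparison = sold_vs_purchased(conn)
print(comparison)
```

The comparison only makes sense because both fact tables use the same product keys with the same meaning, which is the "conformed" requirement.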

Factless Fact Tables – Tracking Events. “Factless” fact tables store only foreign keys, no facts. Factless fact tables allow the tracking of what types of events happened, and under what circumstances they happened. Diagram Source: Hoffer, Prescott, McFadden, Modern Database Management, 7th ed.
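
As a sketch of the idea, the hypothetical attendance_fact table below holds nothing but foreign keys: each row records that a student attended a course on a given day. The scenario, table names, and data are assumptions for illustration and are not taken from the textbook diagram. Analysis comes from counting rows rather than summing a measure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE course  (course_key INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE period  (period_key INTEGER PRIMARY KEY, day TEXT);
-- Factless fact table: foreign keys only, no measure columns.
CREATE TABLE attendance_fact (
    student_key INTEGER REFERENCES student(student_key),
    course_key  INTEGER REFERENCES course(course_key),
    period_key  INTEGER REFERENCES period(period_key),
    PRIMARY KEY (student_key, course_key, period_key)
);
""")
conn.executemany("INSERT INTO student VALUES (?, ?)", [(1, "Ana"), (2, "Ben")])
conn.executemany("INSERT INTO course VALUES (?, ?)", [(1, "Databases"), (2, "Statistics")])
conn.executemany("INSERT INTO period VALUES (?, ?)", [(1, "2008-04-01"), (2, "2008-04-03")])
conn.executemany("INSERT INTO attendance_fact VALUES (?, ?, ?)", [
    (1, 1, 1),
    (2, 1, 1),
    (1, 1, 2),
    (2, 2, 1),
])

def attendance_by_course(conn):
    """Count attendance events per course; the row count is the only 'measure'."""
    return conn.execute("""
        SELECT c.title, COUNT(*)
        FROM attendance_fact a
        JOIN course c ON c.course_key = a.course_key
        GROUP BY c.title
        ORDER BY c.title
    """).fetchall()

counts = attendance_by_course(conn)
print(counts)
```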

In-Class Exercise 2: Extending The Star Schema. Form 2-3 person teams. Add the ability to store new facts for FCE data. Specific facts to store: the aggregate average course eval rating (1-5) for each course section offered, and the aggregate average instructor eval rating (1-5) for each course section offered.

ETL Tools – Microsoft SSIS Demo. Image Source: Conchango Consulting