CMPE 226 Database Systems
October 21 Class Meeting
Department of Computer Engineering, San Jose State University
Fall 2015
Instructor: Ron Mak
www.cs.sjsu.edu/~mak



Computer Engineering Dept. Fall 2015: October 21 CMPE 226: Database Systems © R. Mak

Detailed vs. Aggregated Fact Tables
• In a detailed fact table, each row contains data about a single fact.
• In an aggregated fact table, each row contains a summary of multiple facts, such as the sum (aggregation) of all sales of a product in a particular store during a single day.
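
The difference between the two kinds of fact table can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names (sales_detail, sales_daily), the columns, and the sample rows are invented here and do not come from the textbook figures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Detailed fact table: one row per sales fact (hypothetical columns).
    CREATE TABLE sales_detail (
        sale_day TEXT, product TEXT, store TEXT, dollars REAL, units INTEGER
    );
    INSERT INTO sales_detail VALUES
        ('2015-10-20', 'tent', 'S1', 100.0, 1),
        ('2015-10-20', 'tent', 'S1', 200.0, 2),
        ('2015-10-20', 'lamp', 'S1',  30.0, 3),
        ('2015-10-21', 'tent', 'S2', 100.0, 1);

    -- Aggregated fact table: one row per (day, product, store).
    CREATE TABLE sales_daily AS
        SELECT sale_day, product, store,
               SUM(dollars) AS total_dollars, SUM(units) AS total_units
        FROM sales_detail
        GROUP BY sale_day, product, store;
""")

daily_rows = conn.execute(
    "SELECT * FROM sales_daily ORDER BY sale_day, product, store").fetchall()
for row in daily_rows:
    print(row)
```

The two tent sales at store S1 on October 20 collapse into a single summary row, which is what makes the aggregated table smaller and less detailed.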

Detailed Fact Table Example
[Figures: detailed fact table example. Source: Database Systems by Jukić, Vrbsky, & Nestorov, Pearson 2014.]
• Each fact table record contains data about one sales fact.

Line-Item Detailed Fact Table
• Each row is a single line item of a particular transaction.
[Figure: Jukić, Vrbsky, & Nestorov.]

Transaction-Level Detailed Fact Table
• Each row is a single transaction.
[Figure: Jukić, Vrbsky, & Nestorov.]

Aggregated Fact Table Example
• Aggregated fact table DPCS: Total amount sold in dollars and units on a particular day for a particular product for a particular customer for a particular store.
• Aggregated fact table DCS: Total amount sold in dollars and units on a particular day for a particular customer for a particular store for all products.
[Figures: Jukić, Vrbsky, & Nestorov.]

Granularity of the Fact Table
• Fine level of granularity: Detailed fact table.
• Coarser level of granularity: Aggregated fact table.
• Finer granularity: More analysis power.
• Coarser granularity: Faster queries.

Granularity of the Fact Table, cont’d
• Granularity can also depend on how the data is collected and loaded into the fact table.
• Example: Load only daily sales. But then you lose the ability to analyze sales by the hour.
• A solution: Keep both fine-grained and aggregated tables and have them share the dimension tables.

Granularity of the Fact Table, cont’d
• Good DW design involves deciding which aggregates are worth storing as tables.
• The base fact tables contain data at the finest level of granularity required for analysis.
• Facts can be pre-summarized in aggregate tables at granularity levels that are determined to be optimal for certain analysis procedures.
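
A minimal sketch of a base fact table and an aggregate table sharing one dimension table, using Python's sqlite3 module. The schema (store_dim, sales_hourly, sales_daily) and the data are assumptions made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE store_dim (store_key INTEGER PRIMARY KEY, city TEXT);
    INSERT INTO store_dim VALUES (1, 'San Jose'), (2, 'Oakland');

    -- Base fact table at the finest granularity: one row per store per hour.
    CREATE TABLE sales_hourly (
        store_key INTEGER REFERENCES store_dim, sale_day TEXT,
        sale_hour INTEGER, dollars REAL
    );
    INSERT INTO sales_hourly VALUES
        (1, '2015-10-21',  9, 50.0), (1, '2015-10-21', 14, 70.0),
        (2, '2015-10-21',  9, 20.0), (2, '2015-10-21', 17, 10.0);

    -- Aggregate table at daily granularity, sharing store_dim.
    CREATE TABLE sales_daily AS
        SELECT store_key, sale_day, SUM(dollars) AS dollars
        FROM sales_hourly GROUP BY store_key, sale_day;
""")

# The same per-city question can be answered from either fact table.
city_sql = """
    SELECT d.city, SUM(f.dollars) FROM {fact} f
    JOIN store_dim d ON d.store_key = f.store_key
    GROUP BY d.city ORDER BY d.city
"""
from_base = conn.execute(city_sql.format(fact="sales_hourly")).fetchall()
from_aggregate = conn.execute(city_sql.format(fact="sales_daily")).fetchall()
print(from_base == from_aggregate)

# Only the base table can still answer an hourly question.
by_hour = conn.execute("""
    SELECT sale_hour, SUM(dollars) FROM sales_hourly
    GROUP BY sale_hour ORDER BY sale_hour
""").fetchall()
print(by_hour)
```

The aggregate table reads fewer rows for daily questions, while the hourly breakdown is available only from the base table.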

Granularity of the Fact Table, cont’d
[Figure: Jukić, Vrbsky, & Nestorov.]

Slowly Changing Dimensions
• In a typical dimension of a star schema, either:
  - The values of a dimension’s attributes do not change or change extremely rarely. Examples: store address, customer gender
  - OR: The values of a dimension’s attributes change occasionally and sporadically over time. Example: customer address
• Three approaches to handling slowly changing dimensions: Type 1, Type 2, and Type 3.

Slowly Changing Dimensions: Type 1
• Simply change the value in the dimension table’s record.
  - Often used to correct errors.
[Figure: Jukić, Vrbsky, & Nestorov.]
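
A Type 1 change is a plain overwrite. The sketch below uses Python's sqlite3 module. The customer_dim table, its columns, and the values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_dim "
             "(customer_key INTEGER PRIMARY KEY, name TEXT, zip TEXT)")
conn.execute("INSERT INTO customer_dim VALUES (1, 'Tina', '95129')")

# Type 1: overwrite the value in place; the old value is not kept anywhere.
conn.execute("UPDATE customer_dim SET zip = '95192' WHERE customer_key = 1")

type1_rows = conn.execute("SELECT * FROM customer_dim").fetchall()
print(type1_rows)
```

After the update there is still one row per customer and no record that the zip code was ever different.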

Slowly Changing Dimensions: Type 2
• Preserve history by creating an additional row with the new value.
  - Often used with timestamps.
[Figure: Jukić, Vrbsky, & Nestorov.]
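
One common way to implement a Type 2 change is sketched below with Python's sqlite3 module. The column names (customer_id, valid_from, valid_to), the use of NULL to mark the current row, and the helper change_zip are all assumptions for this illustration, not the textbook's design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_dim (
        customer_key INTEGER PRIMARY KEY,   -- surrogate key, one per version
        customer_id  TEXT,                  -- business key, stable across versions
        zip          TEXT,
        valid_from   TEXT,
        valid_to     TEXT                   -- NULL marks the current version
    );
    INSERT INTO customer_dim VALUES (1, 'C1', '95129', '2014-01-01', NULL);
""")

def change_zip(conn, customer_id, new_zip, change_date):
    """Close the current row, then add a new row holding the new value."""
    conn.execute(
        "UPDATE customer_dim SET valid_to = ? "
        "WHERE customer_id = ? AND valid_to IS NULL",
        (change_date, customer_id))
    conn.execute(
        "INSERT INTO customer_dim (customer_id, zip, valid_from, valid_to) "
        "VALUES (?, ?, ?, NULL)",
        (customer_id, new_zip, change_date))

change_zip(conn, "C1", "95192", "2015-10-21")

type2_rows = conn.execute(
    "SELECT * FROM customer_dim ORDER BY customer_key").fetchall()
for row in type2_rows:
    print(row)
```

Fact rows recorded before the change keep pointing at surrogate key 1, so historical analysis still sees the old zip code.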

Slowly Changing Dimensions: Type 3
• Create “previous” and “current” columns.
  - Only a fixed number of changes is possible.
  - Record only a limited history.
[Figure: Jukić, Vrbsky, & Nestorov.]
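
A Type 3 change can be sketched as follows with Python's sqlite3 module. The columns current_zip and previous_zip and the helper change_zip are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_dim (customer_key INTEGER PRIMARY KEY, "
             "current_zip TEXT, previous_zip TEXT)")
conn.execute("INSERT INTO customer_dim VALUES (1, '95129', NULL)")

def change_zip(conn, customer_key, new_zip):
    # Shift the current value into the "previous" column, then overwrite it.
    # Both assignments read the row's values from before the update.
    conn.execute(
        "UPDATE customer_dim SET previous_zip = current_zip, current_zip = ? "
        "WHERE customer_key = ?", (new_zip, customer_key))

change_zip(conn, 1, "95192")
change_zip(conn, 1, "94088")   # the original '95129' is now lost

type3_rows = conn.execute("SELECT * FROM customer_dim").fetchall()
print(type3_rows)
```

With one "previous" column only one earlier value survives, which is the limited history the slide refers to.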

Snowflakes
• Dimension tables can be unnormalized.
• Fewer tables = faster joins = faster queries.
• Update anomalies are not a concern.
  - Dimension tables are slowly changing.
  - Updates happen rarely if at all.
• An undesirable snowflake results from unnecessarily normalizing dimension tables.
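
The extra join that a snowflake adds can be seen in a small sketch using Python's sqlite3 module. Both versions of the store dimension below (store_dim, and the split store_sf plus region_sf) are invented for this comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_fact (store_key INTEGER, dollars REAL);
    INSERT INTO sales_fact VALUES (1, 100.0), (2, 40.0), (3, 60.0);

    -- Unnormalized (star) dimension: the region name repeats in each store row.
    CREATE TABLE store_dim (store_key INTEGER PRIMARY KEY, zip TEXT, region TEXT);
    INSERT INTO store_dim VALUES
        (1, '95129', 'West'), (2, '60611', 'Midwest'), (3, '94088', 'West');

    -- Snowflaked version of the same dimension: region split into its own table.
    CREATE TABLE region_sf (region_key INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE store_sf (store_key INTEGER PRIMARY KEY, zip TEXT,
                           region_key INTEGER);
    INSERT INTO region_sf VALUES (1, 'West'), (2, 'Midwest');
    INSERT INTO store_sf VALUES (1, '95129', 1), (2, '60611', 2), (3, '94088', 1);
""")

# Star schema: one join from the fact table to the dimension.
star = conn.execute("""
    SELECT d.region, SUM(f.dollars) FROM sales_fact f
    JOIN store_dim d ON d.store_key = f.store_key
    GROUP BY d.region ORDER BY d.region
""").fetchall()

# Snowflake: the same question needs a second join.
snowflake = conn.execute("""
    SELECT r.region, SUM(f.dollars) FROM sales_fact f
    JOIN store_sf s ON s.store_key = f.store_key
    JOIN region_sf r ON r.region_key = s.region_key
    GROUP BY r.region ORDER BY r.region
""").fetchall()

print(star, star == snowflake)
```

Both queries return the same totals. The snowflaked form only adds a join, which is the cost the slide warns about.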

Snowflakes, cont’d
[Figure: unnormalized dimension tables. Jukić, Vrbsky, & Nestorov.]
[Figure: a snowflake, crossed out. Jukić, Vrbsky, & Nestorov.]
• Avoid creating a snowflake!

DW Architecture: Bill Inmon Approach
• Normalized data warehouse
[Figure: Jukić, Vrbsky, & Nestorov.]

DW Architecture: Ralph Kimball Approach
• Dimensionally modeled data warehouse
[Figure: Jukić, Vrbsky, & Nestorov.]

DW Architecture: Independent Data Marts
• Inferior. Do not use!
[Figure: Jukić, Vrbsky, & Nestorov.]

Online Transaction Processing (OLTP)
• Online: The computer responds immediately or very quickly.
  - online ≠ Internet
• OLTP operational database = OLTP system
• Update and query operational data.
• Present data
  - Generate reports

Online Analytical Processing (OLAP)
• Query data from data warehouses and/or data marts to analyze and present data.
• OLAP tools support decision making.
• OLAP tools are read only.
• OLAP operations
  - drill up and drill down
  - slice and dice
  - pivot

OLAP Drill Up and Drill Down
• Drill through dimension hierarchies. Examples:
  - Location: country → state → region → city → store
  - Time: year → quarter → month → week → day → hour
• Drill up AKA roll up
  - Make the data granularity coarser.
  - Aggregate the data.
• Drill down
  - Make the data granularity finer.
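
Drilling up and down along a time hierarchy amounts to changing the GROUP BY level. The sketch below uses Python's sqlite3 module. The calendar_dim and sales_fact tables, the sample rows, and the helper totals_by are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE calendar_dim (
        day_key INTEGER PRIMARY KEY, year INTEGER, quarter TEXT, month TEXT);
    INSERT INTO calendar_dim VALUES
        (1, 2015, 'Q3', '2015-08'), (2, 2015, 'Q3', '2015-09'),
        (3, 2015, 'Q4', '2015-10'), (4, 2015, 'Q4', '2015-10');
    CREATE TABLE sales_fact (day_key INTEGER, dollars REAL);
    INSERT INTO sales_fact VALUES (1, 10.0), (2, 20.0), (3, 30.0), (4, 40.0);
""")

def totals_by(level):
    """Total sales grouped at one level of the time hierarchy."""
    # `level` is checked against a fixed whitelist before it is put into SQL.
    # The sample data covers one year, so grouping by quarter alone is enough.
    assert level in ("year", "quarter", "month")
    return conn.execute(f"""
        SELECT c.{level}, SUM(f.dollars) FROM sales_fact f
        JOIN calendar_dim c ON c.day_key = f.day_key
        GROUP BY c.{level} ORDER BY c.{level}
    """).fetchall()

by_quarter = totals_by("quarter")   # starting point
by_month = totals_by("month")       # drill down: finer granularity
by_year = totals_by("year")         # drill up (roll up): coarser granularity

print(by_quarter)
print(by_month)
print(by_year)
```

Each drill up sums the rows of the level below it, so the grand total is the same at every level.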


Slice and Dice
• Slice: Select one value of a dimension attribute.
• Dice: Select attribute values from two or more dimensions.
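
In SQL terms, slice and dice are both WHERE restrictions. A minimal sketch with Python's sqlite3 module follows. The flattened sales_cube table and its rows are hypothetical stand-ins for a fact table joined to its dimensions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_cube (year INTEGER, region TEXT, category TEXT,
                             dollars REAL);
    INSERT INTO sales_cube VALUES
        (2014, 'West', 'Camping', 10.0), (2014, 'East', 'Camping', 20.0),
        (2015, 'West', 'Camping', 30.0), (2015, 'West', 'Footwear', 40.0),
        (2015, 'East', 'Footwear', 50.0);
""")

# Slice: fix one value of one dimension attribute (year = 2015).
slice_rows = conn.execute("""
    SELECT region, category, dollars FROM sales_cube
    WHERE year = 2015 ORDER BY region, category
""").fetchall()

# Dice: restrict attribute values on two or more dimensions at once.
dice_rows = conn.execute("""
    SELECT year, region, category, dollars FROM sales_cube
    WHERE region = 'West' AND category = 'Camping'
    ORDER BY year
""").fetchall()

print(slice_rows)
print(dice_rows)
```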

Pivot
• Pivot: Reorganize query results by rotation.
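
SQLite has no PIVOT operator, so this sketch rotates a result with conditional aggregation (SUM over CASE), run through Python's sqlite3 module. That technique is a substitute chosen for the example, and the sales_summary table and its values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_summary (region TEXT, year INTEGER, dollars REAL);
    INSERT INTO sales_summary VALUES
        ('East', 2014, 20.0), ('East', 2015, 50.0),
        ('West', 2014, 10.0), ('West', 2015, 70.0);
""")

# Before the pivot: regions as rows, years as columns.
region_by_year = conn.execute("""
    SELECT region,
           SUM(CASE WHEN year = 2014 THEN dollars ELSE 0 END) AS y2014,
           SUM(CASE WHEN year = 2015 THEN dollars ELSE 0 END) AS y2015
    FROM sales_summary GROUP BY region ORDER BY region
""").fetchall()

# After the pivot: years as rows, regions as columns.
year_by_region = conn.execute("""
    SELECT year,
           SUM(CASE WHEN region = 'East' THEN dollars ELSE 0 END) AS east,
           SUM(CASE WHEN region = 'West' THEN dollars ELSE 0 END) AS west
    FROM sales_summary GROUP BY year ORDER BY year
""").fetchall()

print(region_by_year)
print(year_by_region)
```

The two results hold the same four numbers; only the row and column axes have been swapped.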

Conformed Dimensions
• When multiple star schemas share a common set of dimensions, the dimensions are called conformed dimensions.
• Conformed dimensions enable analyses to span multiple star schemas, where the schemas share a common view of the world.
  - For example, all the schemas must share a common view of what a customer is.
  - Drill across: An OLAP operation that spans multiple star schemas.
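
A drill-across query can be sketched with Python's sqlite3 module as below. The two fact tables (sales_fact, returns_fact), the conformed customer_dim, and the data are all assumptions for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conformed dimension shared by both star schemas.
    CREATE TABLE customer_dim (customer_key INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO customer_dim VALUES (1, 'Tina'), (2, 'Tony');

    CREATE TABLE sales_fact (customer_key INTEGER, dollars REAL);
    INSERT INTO sales_fact VALUES (1, 100.0), (1, 50.0), (2, 30.0);

    CREATE TABLE returns_fact (customer_key INTEGER, dollars REAL);
    INSERT INTO returns_fact VALUES (1, 20.0);
""")

# Aggregate each fact table separately, then combine on the shared dimension.
# Joining the two fact tables directly would multiply rows and inflate sums.
drill_across = conn.execute("""
    SELECT c.name,
           COALESCE(s.total, 0) AS sold,
           COALESCE(r.total, 0) AS returned
    FROM customer_dim c
    LEFT JOIN (SELECT customer_key, SUM(dollars) AS total
               FROM sales_fact GROUP BY customer_key) s
           ON s.customer_key = c.customer_key
    LEFT JOIN (SELECT customer_key, SUM(dollars) AS total
               FROM returns_fact GROUP BY customer_key) r
           ON r.customer_key = c.customer_key
    ORDER BY c.name
""").fetchall()

print(drill_across)
```

The query works only because both fact tables use the same customer keys, which is the shared view of a customer that the slide describes.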

OLAP/BI Tools
• Users can query fact and dimension tables by using simple point-and-click query-building applications.
• Based on user actions, the tool generates and executes the SQL code on the data warehouse or data mart.
  - SQL code to drill up or down, slice or dice, or pivot.
• Example OLAP/BI tools
  - IBM Cognos, Oracle BI, TIBCO Spotfire, Tableau

OLAP/BI Tools, cont’d
[Figure: typical OLAP tool layout. Jukić, Vrbsky, & Nestorov.]

OLAP/BI Tool Demos
• RadarSoft: http://olaponline.radar-soft.com/Demos/HtmlOLAPGrid.aspx
• Telerik: http://demos.telerik.com/aspnet-ajax/pivotgrid/examples/olap/defaultcs.aspx
• See also: more

Assignment #7
• Perform OLAP operations on your dimensional model from Assignment #6.
• For each of the following operations, write and execute SQL queries using your sample data:
  - drill up and drill down
  - slice and dice
  - pivot

Assignment #7, cont’d
• For each OLAP operation, show the query, and “before” and “after” query output.
  - Example: Show quarterly results from a query. Then for a drill down, show monthly results. For a drill up, aggregate and show yearly results.
• Turn in a zip file containing:
  - Dump of your dimensional model.
  - SQL queries and text files containing the results.
• Due Wednesday, October

Next Wednesday, October 28
• Guest speaker: Dr. Patricia Selinger
  - See https://www-03.ibm.com/ibm/history/witexhibit/wit_fellows_selinger.html
  - Please come to class on time!
• (Hopefully!) An introduction to the Cisco Information Server for data virtualization.
  - Each team will have a CIS account.
  - See: services/information-server/