Copyright © 1995-1996 Archer Decision Sciences, Inc. Our Model Store DimensionProduct Dimension District Region Total Brand Manufacturer Total StoresProducts.

Slides:



Advertisements
Similar presentations
Dimensional Modeling.
Advertisements

CHAPTER OBJECTIVE: NORMALIZATION THE SNOWFLAKE SCHEMA.
BY LECTURER/ AISHA DAWOOD DW Lab # 2. LAB EXERCISE #1 Oracle Data Warehousing Goal: Develop an application to implement defining subject area, design.
Alternative Database topology: The star schema
Copyright © Starsoft Inc, Data Warehouse Architecture By Slavko Stemberger.
Data Warehousing M R BRAHMAM.
Jennifer Widom On-Line Analytical Processing (OLAP) Introduction.
Dimensional Modeling Business Intelligence Solutions.
Prof. Navneet Goyal Computer Science Department BITS, Pilani
Dimensional Modeling CS 543 – Data Warehousing. CS Data Warehousing (Sp ) - Asim LUMS2 From Requirements to Data Models.
Business Intelligence/ Decision Models Week 2 IT Infrastructure & Marketing Database Design and Implementation.
Dimensional Modeling – Part 2
Exploiting the DW data DW is a platform for creating a wide array of reports It solves data feed problems, but does not lead to specific decision support.
Data Warehouse Models and OLAP Operations
Data Warehousing Design Transparencies
Data Warehousing - 3 ISYS 650. Snowflake Schema one or more dimension tables do not join directly to the fact table but must join through other dimension.
Chapter 4 Relational Databases Copyright © 2012 Pearson Education, Inc. publishing as Prentice Hall 4-1.
By N.Gopinath AP/CSE. Two common multi-dimensional schemas are 1. Star schema: Consists of a fact table with a single table for each dimension 2. Snowflake.
CSE6011 Warehouse Models & Operators  Data Models  relations  stars & snowflakes  cubes  Operators  slice & dice  roll-up, drill down  pivoting.
Data Warehousing DSCI 4103 Dr. Mennecke Introduction and Chapter 1.
Data Warehousing: Data Models and OLAP operations By Kishore Jaladi
Chapter 4 Relational Databases Copyright © 2012 Pearson Education 4-1.
Principles of Dimensional Modeling
DWH – Dimesional Modeling PDT Genči. 2 Outline Requirement gathering Fact and Dimension table Star schema Inside dimension table Inside fact table STAR.
Data Warehousing.
Multi-Dimensional Databases & Online Analytical Processing This presentation uses some materials from: “ An Introduction to Multidimensional Database Technology,
Program Pelatihan Tenaga Infromasi dan Informatika Sistem Informasi Kesehatan Ari Cahyono.
Data Models for Warehouse Session-12/13 Data Management for Decision Support.
Data Warehousing Concepts, by Dr. Khalil 1 Data Warehousing Design Dr. Awad Khalil Computer Science Department AUC.
Chapter 1 Adamson & Venerable Spring Dimensional Modeling Dimensional Model Basics Fact & Dimension Tables Star Schema Granularity Facts and Measures.
Lecture2: Database Environment Prepared by L. Nouf Almujally & Aisha AlArfaj 1 Ref. Chapter2 College of Computer and Information Sciences - Information.
Data Warehouse. Design DataWarehouse Key Design Considerations it is important to consider the intended purpose of the data warehouse or business intelligence.
Object Persistence (Data Base) Design Chapter 13.
BI Terminologies.
Grade 11 Computer Science. Relational Databases  Using the link below, answer questions in your notebooks  Look at Kites.accdb database to refresh your.
Ch3 Data Warehouse Dr. Bernard Chen Ph.D. University of Central Arkansas Fall 2009.
Basic Model: Retail Grocery Store
Designing a Data Warehousing System. Overview Business Analysis Process Data Warehousing System Modeling a Data Warehouse Choosing the Grain Establishing.
UNIT-II Principles of dimensional modeling
In this session, you will learn to: Describe data redundancy Describe the first, second, and third normal forms Describe the Boyce-Codd Normal Form Appreciate.
Shilpa Seth.  Multidimensional Data Model Concepts Multidimensional Data Model Concepts  Data Cube Data Cube  Data warehouse Schemas Data warehouse.
Creating the Dimensional Model
Data Warehousing Multidimensional Analysis
1 Agenda – 04/02/2013 Discuss class schedule and deliverables. Discuss project. Design due on 04/18. Discuss data mart design. Use class exercise to design.
Chapter 4 Logical & Physical Database Design
June 08, 2011 How to design a DATA WAREHOUSE Linh Nguyen (Elly)
Copyright© 2014, Sira Yongchareon Department of Computing, Faculty of Creative Industries and Business Lecturer : Dr. Sira Yongchareon ISCG 6425 Data Warehousing.
Data Warehousing.
Copyright© 2014, Sira Yongchareon Department of Computing, Faculty of Creative Industries and Business Lecturer : Dr. Sira Yongchareon ISCG 6425 Data Warehousing.
Description and exemplification use of a Data Dictionary. A data dictionary is a catalogue of all data items in a system. The data dictionary stores details.
1 Copyright © 2009, Oracle. All rights reserved. Oracle Business Intelligence Enterprise Edition: Overview.
Denormalization - Causes redundancy, but fast performance & no referential integrity - Denormalize when specific queries occur frequently, a strict performance.
5 Copyright © 2008, Oracle. All rights reserved. Testing and Validating a Repository.
Last Updated : 26th may 2003 Center of Excellence Data Warehousing Introductionto Data Modeling.
COMP 430 Intro. to Database Systems Denormalization & Dimensional Modeling.
1 Database Systems, 8 th Edition Star Schema Data modeling technique –Maps multidimensional decision support data into relational database Creates.
3 Copyright © 2006, Oracle. All rights reserved. Business, Logical, and Dimensional Modeling.
Data Warehousing, Business Intelligence & Data Visualization for Smarter Decision Making Phyllis Chasser, PhD. Adjunct Faculty, College of Engineering.
Multi-Dimensional Databases & Online Analytical Processing This presentation uses some materials from: “An Introduction to Multidimensional Database Technology,”
CSE6011 Implementing a Warehouse  Monitoring: Sending data from sources  Integrating: Loading, cleansing,...  Processing: Query processing, indexing,...
Defining Data Warehouse Structures Data Warehouse Data Access End User Data Access Data Sources Staging Area Data Marts Data Extract, Transform, and Load.
Using Partitions and Fragments
Chapter 4 Relational Databases
Data Warehousing: Data Models and OLAP operations
Dr. Bernard Chen Ph.D. University of Central Arkansas Fall 2009
Retail Sales is used to illustrate a first dimensional model
Aggregate Improvement and Lost, shrunken, and collapsed
Retail Sales is used to illustrate a first dimensional model
Dimensional Model January 16, 2003
DWH – Dimesional Modeling
Presentation transcript:

Copyright © Archer Decision Sciences, Inc. Our Model Store DimensionProduct Dimension District Region Total Brand Manufacturer Total StoresProducts

Copyright © Archer Decision Sciences, Inc. Another View Year Quarter Month Date Region District Store Manufacturer Brand Product Color Region Mgr. Size StateCity Current Flag Sequence Day of Week

Copyright © Archer Decision Sciences, Inc. The “Classic” Star Schema PERIOD KEY Store Dimension Time Dimension Product Dimension STORE KEY PRODUCT KEY PERIOD KEY Dollars Units Price Period Desc Year Quarter Month Day Fact Table PRODUCT KEY Store Description City State District ID District Desc. Region_ID Region Desc. Regional Mgr. Product Desc. Brand Color Size Manufacturer STORE KEY

Copyright © Archer Decision Sciences, Inc. A Word About Indexing... Compound Keys LEVELSTORE DESCRIPSTOREDISTRICTREGION Compound Keys PL43A7 AR43A6 PO12B3 LA12A6 NULL TEXA EPEN TEXA EPEN NULL SOUTH NORTH SOUTH NORTH PLANO#3 ARLINGTON#2 POTTSTOWN LANSDALE NULL STORE DISTRICT REGION

Copyright © Archer Decision Sciences, Inc. A Word About Indexing... Concatenated Keys LEVELSTORE DESCRIPCONCATENATED STORE KEY Concatenated Keys PL43A7 AR43A6 PO12B3 LA12A6 NULL TEXA EPEN TEXA EPEN NULL SOUTH NORTH SOUTH NORTH PLANO#3 ARLINGTON#2 POTTSTOWN LANSDALE NULL STORE DISTRICT REGION

Copyright © Archer Decision Sciences, Inc. A Word About Indexing... Generated Keys LEVELSTORE DESCRIPSTOREDISTRICTREGION Generated Keys PL43A7 AR43A6 PO12B3 LA12A6 NULL TEXA EPEN TEXA EPEN NULL SOUTH NORTH SOUTH NORTH PLANO#3 ARLINGTON#2 POTTSTOWN LANSDALE NULL STORE DISTRICT REGION STORE KEY

Copyright © Archer Decision Sciences, Inc. The “Classic” Star Schema  A single fact table, with detail and summary data  Fact table primary key has only one key column per dimension  Each key is generated  Each dimension is a single table, highly denormalized Benefits: Easy to understand, easy to define hierarchies, reduces # of physical joins, low maintenance, very simple metadata Drawbacks: Summary data in the fact table yields poorer performance for summary levels, huge dimension tables a problem

Copyright © Archer Decision Sciences, Inc. The “Classic” Star Schema The biggest drawback: dimension tables must carry a “level” indicator for every record and every query must use it. In the example below, without the level constraint, keys for all stores in the NORTH region, including aggregates for region and district will be pulled from the fact table, resulting in error. Example: Select A.STORE_KEY, A.PERIOD_KEY, A.dollars from Fact_Table A where A.STORE_KEY in (select STORE_KEY from Store_Dimension B where region = “North” and Level = 2) and etc... Level is needed whenever aggregates are stored with detail facts.

Copyright © Archer Decision Sciences, Inc. The “Level” Problem Level is a problem because because it causes potential for error. If the query builder, human or program, forgets about it, perfectly reasonable looking WRONG answers can occur. One alternative: the FACT CONSTELLATION model...

Copyright © Archer Decision Sciences, Inc. The “Fact Constellation” Schema Dollars Units Price District Fact Table District_ID PRODUCT_KEY PERIOD_KEY Dollars Units Price Region Fact Table Region_ID PRODUCT_KEY PERIOD_KEY

Copyright © Archer Decision Sciences, Inc. The “Fact Constellation” Schema In the Fact Constellations, aggregate tables are created separately from the detail, therefor it is impossible to pick up, for example, Store detail when querying the District Fact Table. Major Advantage: No need for the “Level” indicator in the dimension tables, since no aggregated data is stored with lower-level detail Disadvantage: Dimension tables are still very large in some cases, which can slow performance; front-end must be able to detect existence of aggregate facts, which requires more extensive metadata

Copyright © Archer Decision Sciences, Inc. Another Alternative to “Level” Fact Constellation is a good alternative to the Star, but when dimensions have very high cardinality, the sub-selects in the dimension tables can be a source of delay. An alternative is to normalize the dimension tables by attribute level, with each smaller dimension table pointing to an appropriate aggregated fact table, the “Snowflake Schema”...

Copyright © Archer Decision Sciences, Inc. The “Snowflake” Schema STORE KEY Store Dimension Store Description City State District ID District Desc. Region_ID Region Desc. Regional Mgr. District_ID District Desc. Region_ID Region Desc. Regional Mgr. STORE KEY PRODUCT KEY PERIOD KEY Dollars Units Price Store Fact Table Dollars Units Price District Fact Table District_ID PRODUCT_KEY PERIOD_KEY Dollars Units Price RegionFact Table Region_ID PRODUCT_KEY PERIOD_KEY

Copyright © Archer Decision Sciences, Inc. The “Snowflake” Schema No LEVEL in dimension tables Dimension tables are normalized by decomposing at the attribute level Each dimension table has one key for each level of the dimension’s hierarchy The lowest level key joins the dimension table to both the fact table and the lower level attribute table How does it work? The best way is for the query to be built by understanding which summary levels exist, and finding the proper snowflaked attribute tables, constraining there for keys, then select’ing from the fact table.

Copyright © Archer Decision Sciences, Inc. The “Snowflake” Schema Additional features: The original Store Dimension table, completely de- normalized, is kept intact, since certain queries can benefit by its all- encompassing content. In practice, start with a Star Schema and create the “snowflakes” with queries. This eliminates the need to create separate extracts for each table, and referential integrity is inherited from the dimension table. Advantage: Best performance when queries involve aggregation Disadvantage: Complicated maintenance and metadata, explosion in the number of tables in the database