Tanvi Madgavkar CSE 7330 FALL 2009. Ralph Kimball states that a data warehouse is a copy of transaction data specifically structured for query and analysis.

Presentation transcript:

Tanvi Madgavkar CSE 7330 FALL 2009

Ralph Kimball states that a data warehouse is "a copy of transaction data specifically structured for query and analysis."

Bill Inmon states that a warehouse is "a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision making process."

A data warehouse provides a common data model for all data of interest regardless of the data's source. Prior to loading data into the data warehouse, inconsistencies are identified and resolved. The information in the warehouse can be stored safely for extended periods of time.

OLTP is short for On-Line Transaction Processing. It refers to a class of systems that facilitate and manage transaction-oriented applications, typically for data entry and information retrieval. It is characterized by a large number of short on-line transactions. The main emphasis in OLTP systems is on very fast query processing in multi-access environments.

OLAP is short for On-Line Analytical Processing. It is an approach to quickly answering multi-dimensional analytical queries. The term OLAP was coined as a slight modification of the traditional database term OLTP. OLAP is characterized by a relatively low volume of transactions.

In general, OLTP systems provide source data to data warehouses, whereas OLAP systems help to analyze it.

Aspect | OLTP | OLAP
Source of data | OLTP systems are the original source of data | Data comes from various OLTP databases
Purpose of data | To run fundamental transaction-related tasks | To help with planning and decision support
Queries | Standardized and simple queries | Complex queries involving aggregation
Processing speed | Very fast | Depends on the amount of data involved
Space requirements | Relatively small | Larger due to the existence of historical data
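The contrast between the two query styles can be sketched with Python's built-in sqlite3 module. The `orders` table, its columns, and its rows are hypothetical and chosen only for illustration; a real OLTP source and a real warehouse would be separate systems.

```python
import sqlite3

# Hypothetical order table standing in for an OLTP source system.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "acme", 2008, 100.0),
        (2, "acme", 2009, 250.0),
        (3, "globex", 2009, 75.0),
        (4, "globex", 2009, 25.0),
    ],
)

# OLTP-style access: a short, standardized lookup of a single row by key.
oltp_row = conn.execute(
    "SELECT customer, amount FROM orders WHERE order_id = ?", (2,)
).fetchone()

# OLAP-style access: an aggregation over the accumulated history.
olap_rows = conn.execute(
    "SELECT year, SUM(amount) FROM orders GROUP BY year ORDER BY year"
).fetchall()

print(oltp_row)
print(olap_rows)
```

The first query touches one row through its key, while the second has to read every row, which is why its speed depends on the amount of data involved.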

Multidimensional OLAP - MOLAP
This is the more traditional way of OLAP analysis. In MOLAP, data is not stored in the relational database but in a multidimensional cube.

Relational OLAP - ROLAP
ROLAP works directly with relational databases. The base data is stored as relational tables, and new tables are created to hold the aggregated information.

Hybrid OLAP - HOLAP
HOLAP attempts to combine the advantages of MOLAP and ROLAP. Here, a database divides data between relational storage, which holds the larger quantities of detailed data, and specialized storage, which holds the smaller quantities of less-detailed data.
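As a rough illustration of the ROLAP idea, the sketch below keeps the base data in an ordinary relational table and creates a new table to hold the aggregated information. It uses Python's sqlite3 module; the table names `sales` and `sales_by_branch_quarter` and the sample rows are assumptions made for this example, not part of any OLAP product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Base data stays in an ordinary relational table (hypothetical names).
conn.execute("CREATE TABLE sales (branch TEXT, quarter TEXT, units INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("north", "Q1", 10), ("north", "Q1", 5), ("north", "Q2", 7), ("south", "Q1", 3)],
)

# ROLAP-style: a new table is created to hold the aggregated information.
conn.execute(
    "CREATE TABLE sales_by_branch_quarter AS "
    "SELECT branch, quarter, SUM(units) AS total_units "
    "FROM sales GROUP BY branch, quarter"
)

summary = conn.execute(
    "SELECT branch, quarter, total_units "
    "FROM sales_by_branch_quarter ORDER BY branch, quarter"
).fetchall()
print(summary)
```

Analytical queries can then read the small summary table instead of rescanning the detailed rows.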

Steps in OLAP creation process:

OLAP systems are designed to give an overview analysis of what happened, so the data storage has to be set up differently. An OLAP cube, also called a multidimensional cube or a hypercube, is created from a data model. OLAP cubes are not strictly cuboids; the name refers to the linking of data across the different dimensions.

There can be a number of cubes, each developed along a group of dimensions, or one giant cube can be formed with all the dimensions. The OLAP cube is at the core of any OLAP system and consists of a number of tables arranged in a particular schema. The cube metadata is typically created from either a star schema or a snowflake schema of tables in a relational database.

The most common design is called the star schema because its diagram resembles a star. The star schema, also known as the star join schema, is the simplest style of data warehouse schema. It consists of one or a few fact tables (often only one) that reference a number of dimension tables; this central table with surrounding dimensions is what gives the schema its name.
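To make the idea of a cube concrete, here is a minimal in-memory sketch in Python. It is only a toy model: the dimensions (year, item, branch), the `units` measure, and the helper `aggregate_over` are hypothetical names for this illustration, and real OLAP engines store cubes very differently.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, item, branch, units).
facts = [
    (2008, "pen", "north", 4),
    (2008, "pen", "south", 6),
    (2009, "pen", "north", 5),
    (2009, "ink", "north", 2),
]

# The cube maps one coordinate per dimension to an aggregated measure.
cube = defaultdict(int)
for year, item, branch, units in facts:
    cube[(year, item, branch)] += units


def aggregate_over(cube, keep):
    """Sum the measure over every dimension whose index is not in keep."""
    result = defaultdict(int)
    for coords, value in cube.items():
        result[tuple(coords[i] for i in keep)] += value
    return dict(result)


# A smaller cube built along a single dimension (year) ...
units_by_year = aggregate_over(cube, keep=[0])
# ... and another one along two dimensions (year and branch).
units_by_year_branch = aggregate_over(cube, keep=[0, 2])

print(units_by_year)
print(units_by_year_branch)
```

The full dictionary plays the role of the giant cube with all dimensions, while the aggregated dictionaries correspond to smaller cubes developed along fewer dimensions.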

-- Dimension tables are created first so that the fact table can reference them.
CREATE TABLE TIME (
    time_key INTEGER PRIMARY KEY,
    day VARCHAR(10),
    month VARCHAR(10),
    year VARCHAR(10),
    day_of_work VARCHAR(10),
    quarter VARCHAR(10)
);

CREATE TABLE BRANCH (
    branch_key INTEGER PRIMARY KEY,
    branch_name VARCHAR(10),
    branch_type VARCHAR(10)
);

-- The fact table holds one foreign key per dimension.
CREATE TABLE FACT1 (
    time_key INTEGER,
    item_key INTEGER,
    branch_key INTEGER,
    location_key INTEGER,
    PRIMARY KEY (time_key, item_key, branch_key, location_key),
    FOREIGN KEY (time_key) REFERENCES TIME (time_key),
    FOREIGN KEY (branch_key) REFERENCES BRANCH (branch_key)
);

Advantages:
• Simplest data warehouse schema.
• Easy to understand.
• Easy to navigate between the tables because fewer joins are needed.
• Most suitable for query processing.

Disadvantages:
• Occupies more space.
• Highly denormalized.
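A typical query against a star schema joins the fact table directly to each dimension it needs. The sketch below shows this with Python's sqlite3 module under simplifying assumptions: the tables are cut-down, renamed versions of FACT1, TIME and BRANCH, and `units_sold` is a hypothetical measure added so there is something to aggregate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim (time_key INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
CREATE TABLE branch_dim (branch_key INTEGER PRIMARY KEY, branch_name TEXT);
-- units_sold is a hypothetical measure added for this example.
CREATE TABLE fact1 (
    time_key INTEGER REFERENCES time_dim (time_key),
    branch_key INTEGER REFERENCES branch_dim (branch_key),
    units_sold INTEGER
);
INSERT INTO time_dim VALUES (1, 2009, 'Q1'), (2, 2009, 'Q2');
INSERT INTO branch_dim VALUES (10, 'north'), (20, 'south');
INSERT INTO fact1 VALUES (1, 10, 5), (1, 20, 3), (2, 10, 7), (2, 10, 1);
""")

# Star join: one join per dimension, each going straight to the fact table.
rows = conn.execute("""
    SELECT t.quarter, b.branch_name, SUM(f.units_sold)
    FROM fact1 AS f
    JOIN time_dim AS t ON t.time_key = f.time_key
    JOIN branch_dim AS b ON b.branch_key = f.branch_key
    GROUP BY t.quarter, b.branch_name
    ORDER BY t.quarter, b.branch_name
""").fetchall()
print(rows)
```

Every dimension is exactly one join away from the fact table, which is what keeps star-schema queries short.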

A snowflake schema is a logical arrangement of tables in a multidimensional database such that the entity relationship diagram resembles a snowflake in shape. It is closely related to the star schema, of which it is a variation. The only difference is that in a snowflake schema the dimensions are normalized into multiple related tables, whereas in a star schema the dimensions are denormalized, with each dimension represented by a single table.

-- Snowflaked ITEM dimension: supplier attributes move to their own table.
CREATE TABLE SUPPLIER (
    supplier_key INTEGER PRIMARY KEY,
    supplier_type VARCHAR(10)
);

CREATE TABLE ITEM (
    item_key INTEGER PRIMARY KEY,
    item_name VARCHAR(10),
    brand VARCHAR(10),
    type VARCHAR(10),
    supplier_key INTEGER,
    FOREIGN KEY (supplier_key) REFERENCES SUPPLIER (supplier_key)
);

CREATE TABLE FACT1 (
    time_key INTEGER,
    item_key INTEGER,
    branch_key INTEGER,
    location_key INTEGER,
    PRIMARY KEY (time_key, item_key, branch_key, location_key),
    FOREIGN KEY (item_key) REFERENCES ITEM (item_key)
);

-- Snowflaked LOCATION dimension: city attributes move to their own table.
CREATE TABLE CITY (
    city_key INTEGER PRIMARY KEY,
    city VARCHAR(10),
    state VARCHAR(10),
    country VARCHAR(10)
);

CREATE TABLE LOCATION (
    location_key INTEGER PRIMARY KEY,
    street VARCHAR(10),
    city_key INTEGER,
    FOREIGN KEY (city_key) REFERENCES CITY (city_key)
);

CREATE TABLE FACT1 (
    time_key INTEGER,
    item_key INTEGER,
    branch_key INTEGER,
    location_key INTEGER,
    PRIMARY KEY (time_key, item_key, branch_key, location_key),
    FOREIGN KEY (location_key) REFERENCES LOCATION (location_key)
);

Advantages:
• The tables are easier to maintain.
• Saves storage space.

Disadvantages:
• Complex to navigate because of the large number of joins.
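The extra joins can be seen in a small sketch using Python's sqlite3 module. It assumes a normalized layout in which LOCATION carries a `city_key` pointing to CITY; the lower-case table names, the sample rows, and the `units_sold` measure are hypothetical and exist only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE city (city_key INTEGER PRIMARY KEY, city TEXT, country TEXT);
CREATE TABLE location (
    location_key INTEGER PRIMARY KEY,
    street TEXT,
    city_key INTEGER REFERENCES city (city_key)
);
-- units_sold is a hypothetical measure added for this example.
CREATE TABLE fact1 (
    location_key INTEGER REFERENCES location (location_key),
    units_sold INTEGER
);
INSERT INTO city VALUES (1, 'Dallas', 'USA'), (2, 'Pune', 'India');
INSERT INTO location VALUES (100, 'Main St', 1), (101, 'Elm St', 1), (102, 'MG Road', 2);
INSERT INTO fact1 VALUES (100, 4), (101, 6), (102, 9);
""")

# Reaching a city attribute takes two joins: fact -> location -> city.
rows = conn.execute("""
    SELECT c.city, SUM(f.units_sold)
    FROM fact1 AS f
    JOIN location AS l ON l.location_key = f.location_key
    JOIN city AS c ON c.city_key = l.city_key
    GROUP BY c.city
    ORDER BY c.city
""").fetchall()
print(rows)
```

In a star schema the city name would sit directly in the location dimension, so the same report would need only one join. In exchange, the snowflake layout stores each city's attributes once instead of repeating them for every street.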

The star schema is the better option from the user's point of view. It exposes users to the underlying table structures, and its queries are simpler. It is more likely to be used when the data warehouse is large. Snowflake schemas are often better suited to more sophisticated query tools and smaller data warehouses. Even though a snowflake schema is relatively easy to maintain, it is aimed at environments with numerous queries that have complex criteria, and its additional joins lead to longer query execution times.

• W.H. Inmon. “What is a Data Warehouse?”, Prism, Volume 1, Number 1, 1995.
• Ralph Kimball. “The Data Warehouse Toolkit: Practical Techniques for Building Dimensional Data Warehouses”.
• Jun Yang. “WareHouse Information Prototype at Stanford”.
• C. Caldeira. “Data Warehousing – Concepts and Models”.
• RainMaker DataWarehousing. “OLAP_vs_OLTP.pdf”, works.com/pdfdocs/OLTP_vs_OLAP.pdf
• “Data Warehousing: A look at Business Intelligence and Data Warehouse”, housing/ molap-rolap.html
• Hari Mailvaganam. “Data Warehousing Review – Introduction to OLAP”, /Introduction_OLAP.html
• Mri Sonam. “What is the difference between star schema and snow flake schema?”.
• Wikipedia, The Free Encyclopedia. “Data Warehouse, OLAP, OLTP, Star Schema, Snowflake Schema”.