1 An Analysis of the Publication "An Overview of Data Warehousing and OLAP Technology" by Surajit Chaudhuri and Umeshwar Dayal
Michael Goshey
University of Minnesota, Fall 2006
CSci 8701: Overview of Database Research

2 Outline (Michael Goshey: 9/19/2006)
1. Introduction
2. Problem Addressed
3. Major Contributions
4. Key Concepts
5. Validation Methodology
6. Assumptions
7. 2006 Rewrite

3 Introduction
- Selected paper
  - S. Chaudhuri and U. Dayal, An Overview of Data Warehousing and OLAP Technology, SIGMOD Record 26(1): 65-74 (1997).
- Motivation
- Personal Interest

5 Problem Addressed
- Problem Statement
  - Survey: organizing the data warehousing space
  - Differing requirements between OLTP and OLAP
- Significance
  - Growth area
  - Reference work establishing consensus on terms, architectures and issues

7 Major Contributions
- Bridging the gulf between industry and academia
- OLTP vs. OLAP: clarifying the differences
- Concise survey of relevant issues, architectures and tools
- Concrete list of data warehouse design and build steps

9 Key Concepts
- Data warehouses and data marts
- OLTP, OLAP (ROLAP vs. MOLAP)
- Relational and dimensional data models
- Bitmap index
- ETL
- Metadata
- Managed query vs. ad hoc environments
- Materialized views
- SQL extensions (cube, rollup, rank, percentile, etc.)
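
The cube and rollup extensions named in the Key Concepts list can be illustrated with a minimal Python sketch. The SALES rows, the region and product dimensions, and the function names are all hypothetical and do not come from the slides or the paper; None stands in for the NULL that SQL uses to mark an "all values" subtotal row.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, product, sales). Made-up data for illustration.
SALES = [
    ("East", "Widget", 10),
    ("East", "Gadget", 5),
    ("West", "Widget", 7),
    ("West", "Gadget", 3),
]


def rollup(rows):
    """Emulate SUM(sales) grouped by ROLLUP(region, product).

    ROLLUP aggregates along a hierarchy: (region, product), then region,
    then the grand total. None marks a rolled-up dimension.
    """
    totals = defaultdict(int)
    for region, product, sales in rows:
        totals[(region, product)] += sales  # finest level
        totals[(region, None)] += sales     # subtotal per region
        totals[(None, None)] += sales       # grand total
    return dict(totals)


def cube(rows):
    """Emulate SUM(sales) grouped by CUBE(region, product).

    CUBE aggregates over every subset of the dimensions, so it also
    produces per-product subtotals that ROLLUP omits.
    """
    totals = defaultdict(int)
    for region, product, sales in rows:
        for r in (region, None):
            for p in (product, None):
                totals[(r, p)] += sales
    return dict(totals)
```

With two dimensions of two values each, the rollup yields 7 groups (4 detail, 2 regional subtotals, 1 grand total), while the cube yields 9 because it adds the 2 per-product subtotals.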

10 Data Warehouse, Data Mart

11 Relational or Dimensional?

12 Relational or Dimensional? (image from http://www.laynetworks.com)

13 Bitmap Indices

customer  age 0-10  age 11-20  age 21-30  age 31-40
Mary      1         0          0          0
John      0         1          0          0
Steve     0         0          1          0
Tom       0         0          0          1
Lisa      0         0          1          0

- Cardinality: unique values / total rows
- B-Tree vs. bitmap: 1% rule, uniqueness
- Boolean algebra directly on indices
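
The bitmap index on this slide can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than how any particular database stores its indices: each bitmap is held as a plain Python integer, and the function names (build_bitmap_index, rows_matching, cardinality_ratio) are hypothetical. The customer data is taken from the slide's table.

```python
# Rows of the customer table from the slide: (name, age band).
CUSTOMERS = [
    ("Mary", "0-10"),
    ("John", "11-20"),
    ("Steve", "21-30"),
    ("Tom", "31-40"),
    ("Lisa", "21-30"),
]


def build_bitmap_index(rows):
    """Return {value: bitmap}; bit i of a bitmap is set when row i holds that value."""
    index = {}
    for i, (_, band) in enumerate(rows):
        index[band] = index.get(band, 0) | (1 << i)
    return index


def rows_matching(bitmap, rows):
    """Decode a bitmap back into the customer names it selects."""
    return [name for i, (name, _) in enumerate(rows) if (bitmap >> i) & 1]


def cardinality_ratio(rows):
    """Unique values divided by total rows, as defined on the slide."""
    return len({band for _, band in rows}) / len(rows)


index = build_bitmap_index(CUSTOMERS)

# "age 21-30 OR age 31-40" is a single bitwise OR of two bitmaps;
# the base table is only consulted to decode the result.
over_20 = index["21-30"] | index["31-40"]
print(rows_matching(over_20, CUSTOMERS))  # ['Steve', 'Tom', 'Lisa']
```

The point of "Boolean algebra directly on indices" shows up in the last lines: AND, OR and NOT predicates become bitwise operations on the bitmaps. The five-row table is toy-sized, so its cardinality ratio (4 unique bands over 5 rows) is not representative of the low-cardinality columns where bitmap indices are normally preferred.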

15 Validation Methodology
- Survey paper goals
- Academic and industry citations
- Referencing tools, vendors
- Case studies

17 Assumptions
- Read-only environments
- Shortcomings
  - (occasional) transactional commitments
  - the data revision problem

19 2006 Rewrite
- Changes in terminology, tools, vendors
  - Fact constellations -> conformed dimensions
  - Decision support -> BI
  - Vendors and tools in BI, ETL, OLAP
- Multiple user constituencies
- Data history difficulties
  - petabyte databases -> very large warehouses common
  - data expiry challenges
  - slowly changing dimensions

20 Slowly Changing Dimensions

Before
CustomerID  Name          Status
001         Mary Johnson  Gold

After: Type 1
CustomerID  Name          Status
001         Mary Johnson  Platinum

After: Type 2
CustomerID  Name          Status
001         Mary Johnson  Gold
001         Mary Johnson  Platinum

After: Type 3
CustomerID  Name          Original Status  Current Status  Effective Date
001         Mary Johnson  Gold             Platinum        10/1/2006
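
The three update strategies on this slide can be sketched as small Python functions operating on a list of row dictionaries. The column names and the Mary Johnson example come from the slide; the function names and the dictionary representation are assumptions made for illustration. Production Type 2 designs usually also add a surrogate key or validity dates to tell row versions apart, which the slide omits and this sketch does too.

```python
from copy import deepcopy

# The "Before" table from the slide.
BEFORE = [{"CustomerID": "001", "Name": "Mary Johnson", "Status": "Gold"}]


def scd_type1(rows, customer_id, new_status):
    """Type 1: overwrite the attribute in place; the old value is lost."""
    out = deepcopy(rows)
    for row in out:
        if row["CustomerID"] == customer_id:
            row["Status"] = new_status
    return out


def scd_type2(rows, customer_id, new_status):
    """Type 2: keep the old row and append a new row carrying the new value."""
    out = deepcopy(rows)
    matches = [r for r in out if r["CustomerID"] == customer_id]
    if matches:
        # Copy the most recent version of the row and change only the status.
        out.append({**matches[-1], "Status": new_status})
    return out


def scd_type3(rows, customer_id, new_status, effective_date):
    """Type 3: keep one row, with original and current values in separate columns."""
    out = []
    for row in rows:
        if row["CustomerID"] == customer_id:
            out.append({
                "CustomerID": row["CustomerID"],
                "Name": row["Name"],
                "Original Status": row["Status"],
                "Current Status": new_status,
                "Effective Date": effective_date,
            })
        else:
            out.append(dict(row))
    return out
```

Calling each function with customer "001" and the new status "Platinum" (plus "10/1/2006" for Type 3) reproduces the three "After" tables: Type 1 keeps no history, Type 2 keeps full history as extra rows, and Type 3 keeps only the original and current values.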

21 Questions?

