
1 Data Warehousing and OLAP. Outline: models & operations; implementing a warehouse; future directions.

2 Hector Garcia Molina: Data Warehousing and OLAP. Warehouse Models & Operators. Data models: relations; stars & snowflakes; cubes. Operators: slice & dice; roll-up, drill-down; pivoting; other.

3 Star (diagram).

4 Star Schema: sale(orderId, date, custId, prodId, storeId, qty, amt).

5 Terms: fact table; dimension tables; measures.
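
The star schema terms above can be made concrete with a small sketch. It uses Python's built-in sqlite3 module; the sale columns come from the Star Schema slide, while the dimension tables' columns (name, city) and every row value are hypothetical, chosen only for illustration. The fact table is sale, the dimension tables are product, customer and store, and the measures are qty and amt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: column names other than the keys are hypothetical.
CREATE TABLE product  (prodId  TEXT PRIMARY KEY, name TEXT);
CREATE TABLE customer (custId  TEXT PRIMARY KEY, name TEXT);
CREATE TABLE store    (storeId TEXT PRIMARY KEY, city TEXT);

-- Fact table: columns as listed on the Star Schema slide.
CREATE TABLE sale (
    orderId TEXT,
    date    INTEGER,
    custId  TEXT REFERENCES customer(custId),
    prodId  TEXT REFERENCES product(prodId),
    storeId TEXT REFERENCES store(storeId),
    qty     INTEGER,   -- measure
    amt     INTEGER    -- measure
);

-- Hypothetical sample rows.
INSERT INTO product  VALUES ('p1', 'bolt'), ('p2', 'nut');
INSERT INTO customer VALUES ('c1', 'Joe');
INSERT INTO store    VALUES ('s1', 'A'), ('s2', 'B');
INSERT INTO sale VALUES
    ('o1', 1, 'c1', 'p1', 's1', 1, 12),
    ('o2', 1, 'c1', 'p2', 's1', 2, 11),
    ('o3', 2, 'c1', 'p1', 's2', 1, 44);
""")

# A typical star query: join the fact table to its dimensions, then aggregate.
star_rows = conn.execute("""
    SELECT store.city, product.name, SUM(sale.amt)
    FROM sale
    JOIN store   ON sale.storeId = store.storeId
    JOIN product ON sale.prodId  = product.prodId
    GROUP BY store.city, product.name
    ORDER BY store.city, product.name
""").fetchall()
print(star_rows)
```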

6 Dimension Hierarchies: store, sType, city, region (diagram). ⇒ snowflake schema; ⇒ constellations.

7 Cube. Fact table view versus multi-dimensional cube view; dimensions = 2.

8 3-D Cube. Fact table view versus multi-dimensional cube view, with one layer per day (day 1, day 2); dimensions = 3.

9 ROLAP vs. MOLAP. ROLAP: Relational On-Line Analytical Processing. MOLAP: Multi-Dimensional On-Line Analytical Processing.

10 Aggregates. Add up amounts for day 1. In SQL: SELECT sum(amt) FROM SALE WHERE date = 1. Result: 81.

11 Aggregates. Add up amounts by day. In SQL: SELECT date, sum(amt) FROM SALE GROUP BY date.

12 Another Example. Add up amounts by day, product. In SQL: SELECT date, prodId, sum(amt) FROM SALE GROUP BY date, prodId. Roll-up moves from this result to the coarser per-day totals; drill-down moves the other way.
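
The three aggregate queries from the slides above can be run as a quick check. This is a minimal sketch using Python's sqlite3 module. The slide figures with the actual SALE rows did not survive in this transcript, so the rows below are assumed values, picked only so that day 1 sums to the 81 shown on the slide; the column name custId is likewise an assumption.

```python
import sqlite3

# Assumed sample facts: (prodId, custId, date, amt). Not taken from the slides.
ROWS = [
    ("p1", "c1", 1, 12), ("p2", "c1", 1, 11), ("p1", "c3", 1, 50),
    ("p2", "c2", 1, 8),  ("p1", "c1", 2, 44), ("p1", "c2", 2, 4),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (prodId TEXT, custId TEXT, date INTEGER, amt INTEGER)")
conn.executemany("INSERT INTO sale VALUES (?, ?, ?, ?)", ROWS)

# Add up amounts for day 1.
day1_total = conn.execute("SELECT sum(amt) FROM sale WHERE date = 1").fetchone()[0]

# Add up amounts by day (the rolled-up result).
by_day = conn.execute(
    "SELECT date, sum(amt) FROM sale GROUP BY date ORDER BY date"
).fetchall()

# Add up amounts by day and product (the drilled-down result).
by_day_product = conn.execute(
    "SELECT date, prodId, sum(amt) FROM sale "
    "GROUP BY date, prodId ORDER BY date, prodId"
).fetchall()

print(day1_total, by_day, by_day_product)
```

Summing the by-day-and-product rows for one day gives that day's entry in the by-day result, which is the roll-up relationship between the two queries.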

13 Aggregates. Operators: sum, count, max, min, median, avg. “Having” clause. Using dimension hierarchy: average by region (within store); maximum by month (within date).
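
A short sketch of several aggregate operators combined with a HAVING clause, again with sqlite3. The rows and the custId column are assumed sample data, not the slides' own, and the threshold 50 is arbitrary. SQLite has no built-in median, so only sum, count, max and min are shown.

```python
import sqlite3

# Assumed sample facts: (prodId, custId, date, amt).
ROWS = [
    ("p1", "c1", 1, 12), ("p2", "c1", 1, 11), ("p1", "c3", 1, 50),
    ("p2", "c2", 1, 8),  ("p1", "c1", 2, 44), ("p1", "c2", 2, 4),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (prodId TEXT, custId TEXT, date INTEGER, amt INTEGER)")
conn.executemany("INSERT INTO sale VALUES (?, ?, ?, ?)", ROWS)

# WHERE filters rows before grouping; HAVING filters groups after aggregation.
big_customers = conn.execute("""
    SELECT custId, sum(amt), count(*), max(amt), min(amt)
    FROM sale
    GROUP BY custId
    HAVING sum(amt) > 50
    ORDER BY custId
""").fetchall()
print(big_customers)
```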

14 Cube Aggregation. Example: computing sums over the cube layers for day 1 and day 2 (total 129, ...). Roll-up moves to coarser sums; drill-down moves back to finer ones.

15 Cube Operators. Cells of the aggregated cube (day 1, day 2; total 129, ...): sale(c1,*,*), sale(*,*,*), sale(c2,p2,*).

16 Extended Cube. The cube (day 1, day 2) extended with a * value on each dimension; example cell: sale(*,p2,*).
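
The sale(c1,*,*) notation on the cube slides can be read as "sum over every dimension marked *". A minimal sketch, assuming the dimension order (customer, product, date) and hypothetical cell values chosen so that the grand total matches the 129 on the slides:

```python
# Assumed cube cells: (custId, prodId, date) -> amt. Values are illustrative.
FACTS = {
    ("c1", "p1", 1): 12, ("c1", "p2", 1): 11, ("c3", "p1", 1): 50,
    ("c2", "p2", 1): 8,  ("c1", "p1", 2): 44, ("c2", "p1", 2): 4,
}

def sale(cust, prod, date):
    """Sum all cells that match the given coordinates; '*' matches anything."""
    wanted = (cust, prod, date)
    return sum(
        amt
        for key, amt in FACTS.items()
        if all(w == "*" or w == k for w, k in zip(wanted, key))
    )

print(sale("c1", "*", "*"))   # all sales to customer c1
print(sale("*", "*", "*"))    # grand total
print(sale("c2", "p2", "*"))  # product p2 sold to customer c2, all days
print(sale("*", "p2", "*"))   # all sales of product p2
```

An extended cube stores such * cells alongside the base cells instead of computing them on demand, as this function does.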

17 Aggregation Using Hierarchies. Cube layers for day 1 and day 2; hierarchy: customer → region → country (customer c1 in Region A; customers c2, c3 in Region B).
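
Rolling up along the customer → region hierarchy can be sketched as a re-keying of the cube cells. The customer-to-region mapping is the one stated on the slide; the cell values are assumed sample data.

```python
# Assumed cube cells: (custId, prodId, date) -> amt.
FACTS = {
    ("c1", "p1", 1): 12, ("c1", "p2", 1): 11, ("c3", "p1", 1): 50,
    ("c2", "p2", 1): 8,  ("c1", "p1", 2): 44, ("c2", "p1", 2): 4,
}

# From the slide: c1 is in Region A; c2 and c3 are in Region B.
REGION = {"c1": "A", "c2": "B", "c3": "B"}

def roll_up(facts, parent_of):
    """Aggregate amounts from the customer level to the parent level."""
    totals = {}
    for (cust, _prod, _date), amt in facts.items():
        parent = parent_of[cust]
        totals[parent] = totals.get(parent, 0) + amt
    return totals

by_region = roll_up(FACTS, REGION)
print(by_region)
```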

18 Pivoting. Fact table view versus multi-dimensional cube view (day 1, day 2).
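
Pivoting turns the fact table view (one row per fact) into the cube view (one axis per dimension). A minimal sketch for a single day, using assumed rows and a hypothetical helper named pivot; it assumes at most one fact per (product, customer) pair.

```python
# Assumed fact rows for one day: (prodId, custId, amt).
ROWS = [
    ("p1", "c1", 12), ("p2", "c1", 11), ("p1", "c3", 50), ("p2", "c2", 8),
]

def pivot(rows):
    """Return a grid with products as rows and customers as columns."""
    prods = sorted({r[0] for r in rows})
    custs = sorted({r[1] for r in rows})
    cells = {(p, c): amt for p, c, amt in rows}
    header = ["prodId"] + custs
    # Missing combinations become None (an empty cube cell).
    body = [[p] + [cells.get((p, c)) for c in custs] for p in prods]
    return [header] + body

grid = pivot(ROWS)
for line in grid:
    print(line)
```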

19 Integration: data cleaning; data loading; derived data. Architecture diagram labels: Client, Query & Analysis, Warehouse, Metadata, Integration, Source.

20 Data Cleaning. Migration (e.g., yen → dollars). Scrubbing: use domain-specific knowledge (e.g., social security numbers). Fusion (e.g., mail list, customer merging). Auditing: discover rules & relationships (like data mining). Diagram: customer1(Joe) in the billing DB and customer2(Joe) in the service DB become merged_customer(Joe).
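
The fusion step in the diagram (two source records for Joe becoming one merged customer) might look like the sketch below. Matching on a normalized name is an assumption made here for brevity; real fusion usually needs more evidence than a name. The record fields and the fuse helper are hypothetical.

```python
# Hypothetical source records; only the ids and the name "Joe" echo the slide.
billing = [{"id": "customer1", "name": "Joe "}]
service = [{"id": "customer2", "name": "joe"}]

def fuse(*sources):
    """Merge records whose normalized names are equal."""
    merged = {}
    for records in sources:
        for rec in records:
            key = rec["name"].strip().lower()  # naive matching key (assumption)
            entry = merged.setdefault(
                key, {"name": rec["name"].strip().title(), "source_ids": []}
            )
            entry["source_ids"].append(rec["id"])
    return list(merged.values())

merged_customers = fuse(billing, service)
print(merged_customers)
```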

21 Loading Data. Incremental vs. refresh. Off-line vs. on-line. Frequency of loading: at night, once a week/month, continuously. Parallel/partitioned load.

22 Derived Data. Derived warehouse data: indexes; aggregates; materialized views (next slide). When to update derived data? Incremental vs. refresh.

23 Materialized Views. Define new warehouse relations using SQL expressions. The resulting relation does not exist at any source.
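
The difference between refreshing a materialized view and maintaining it incrementally can be sketched with sqlite3. SQLite has no materialized views, so an ordinary table named daily_total (a hypothetical name) stands in for one; the rows and the helpers refresh and load_incremental are likewise assumptions for illustration.

```python
import sqlite3

VIEW_SQL = "SELECT date, sum(amt) AS total FROM sale GROUP BY date"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (prodId TEXT, custId TEXT, date INTEGER, amt INTEGER)")
conn.executemany(
    "INSERT INTO sale VALUES (?, ?, ?, ?)",
    [("p1", "c1", 1, 12), ("p2", "c1", 1, 11), ("p1", "c3", 1, 50), ("p2", "c2", 1, 8)],
)

# Materialize: store the query result as a warehouse relation.
conn.execute("CREATE TABLE daily_total AS " + VIEW_SQL)

def refresh(conn):
    """Refresh: throw the stored result away and recompute it from scratch."""
    conn.execute("DELETE FROM daily_total")
    conn.execute("INSERT INTO daily_total " + VIEW_SQL)

def load_incremental(conn, new_rows):
    """Incremental: load new facts and adjust only the affected totals."""
    conn.executemany("INSERT INTO sale VALUES (?, ?, ?, ?)", new_rows)
    for _prod, _cust, day, amt in new_rows:
        cur = conn.execute(
            "UPDATE daily_total SET total = total + ? WHERE date = ?", (amt, day)
        )
        if cur.rowcount == 0:  # first fact for this day
            conn.execute("INSERT INTO daily_total VALUES (?, ?)", (day, amt))

load_incremental(conn, [("p1", "c1", 2, 44), ("p1", "c2", 2, 4)])
incremental = conn.execute("SELECT date, total FROM daily_total ORDER BY date").fetchall()

refresh(conn)
refreshed = conn.execute("SELECT date, total FROM daily_total ORDER BY date").fetchall()
print(incremental, refreshed)
```

Both paths should end with the same stored totals; the incremental path touches only the days that received new facts.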

24 Processing: ROLAP servers vs. MOLAP servers; index structures; what to materialize?; algorithms. Architecture diagram labels: Client, Query & Analysis, Warehouse, Metadata, Integration, Source.

25 ROLAP Server. Relational OLAP server: a ROLAP server with tools and utilities on top of a relational DBMS. Special indices, tuning; schema is “denormalized”.

26 MOLAP Server. Multi-dimensional OLAP server: M.D. tools and utilities on top of a multi-dimensional server, which could also sit on a relational DBMS. Example Sales cube: Product (milk, soda, eggs, soap) × City (A, B) × Date (1, 2, 3, 4).

27 Join. “Combine” SALE, PRODUCT relations. In SQL: SELECT * FROM SALE, PRODUCT, plus a WHERE condition matching SALE.prodId to the PRODUCT key (as written, with no condition, the query returns the full cross product).
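
A sketch of the join, with sqlite3, showing why the join condition matters. The PRODUCT columns (id, name, price) and all row values are assumptions; only the relation names come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (prodId TEXT, custId TEXT, date INTEGER, amt INTEGER)")
conn.execute("CREATE TABLE product (id TEXT PRIMARY KEY, name TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO sale VALUES (?, ?, ?, ?)",
    [("p1", "c1", 1, 12), ("p2", "c1", 1, 11), ("p1", "c3", 1, 50),
     ("p2", "c2", 1, 8),  ("p1", "c1", 2, 44), ("p1", "c2", 2, 4)],
)
conn.executemany("INSERT INTO product VALUES (?, ?, ?)", [("p1", "bolt", 10), ("p2", "nut", 5)])

# Without a condition every sale is paired with every product.
cross_count = conn.execute("SELECT count(*) FROM sale, product").fetchone()[0]

# With the condition each sale is paired only with its own product.
joined = conn.execute("""
    SELECT sale.custId, product.name, sale.amt
    FROM sale, product
    WHERE sale.prodId = product.id
    ORDER BY sale.date, sale.custId, product.name
""").fetchall()
print(cross_count, len(joined))
```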

28 Join Indexes. Diagram: join index.
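
The idea behind a join index is to precompute, for each dimension value, which fact rows join with it, so the join does not have to scan the fact table. A minimal in-memory sketch; the row layout, the use of list positions as row ids, and the helper names are assumptions.

```python
# Assumed fact rows: (prodId, custId, date, amt). List position acts as row id.
SALE = [
    ("p1", "c1", 1, 12), ("p2", "c1", 1, 11), ("p1", "c3", 1, 50),
    ("p2", "c2", 1, 8),  ("p1", "c1", 2, 44), ("p1", "c2", 2, 4),
]

def build_join_index(sale_rows):
    """Map each product id to the row ids of the sales that reference it."""
    index = {}
    for rid, row in enumerate(sale_rows):
        index.setdefault(row[0], []).append(rid)
    return index

join_index = build_join_index(SALE)

def sales_for(prod_id):
    """Fetch the matching fact rows through the index, without a scan."""
    return [SALE[rid] for rid in join_index.get(prod_id, [])]

print(join_index)
print(sum(row[3] for row in sales_for("p2")))
```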

29 What to Materialize? Store in the warehouse results useful for common queries. Example: materialize total sales over the cube (day 1, day 2; total 129, ...).

30 Cube Aggregates Lattice. Nodes: (city, product, date); (city, product), (city, date), (product, date); city, product, date; all. Use a greedy algorithm to decide what to materialize.
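
The slide does not say which greedy algorithm it has in mind. One common formulation, sketched here under that caveat, repeatedly materializes the view with the largest benefit, where a view's benefit is the total reduction in rows scanned across all views it can answer. The view sizes below are hypothetical, and greedy_select is a name made up for this sketch.

```python
def fs(*dims):
    return frozenset(dims)

# Hypothetical row counts for each node of the lattice.
SIZES = {
    fs("city", "product", "date"): 100,
    fs("city", "product"): 50,
    fs("city", "date"): 95,
    fs("product", "date"): 60,
    fs("city"): 20,
    fs("product"): 10,
    fs("date"): 30,
    fs(): 1,  # the "all" node
}

def greedy_select(sizes, k):
    """Pick up to k views to materialize, in addition to the top view."""
    top = max(sizes, key=len)  # the view with every dimension is always kept
    chosen = [top]

    def cost(view, materialized):
        # A view is answered from its smallest materialized ancestor.
        return min(sizes[m] for m in materialized if view <= m)

    def benefit(candidate, materialized):
        # Savings over every view the candidate can answer (its subsets).
        return sum(
            max(0, cost(v, materialized) - sizes[candidate])
            for v in sizes
            if v <= candidate
        )

    for _ in range(k):
        candidates = [v for v in sizes if v not in chosen]
        best = max(candidates, key=lambda v: benefit(v, chosen))
        if benefit(best, chosen) <= 0:
            break
        chosen.append(best)
    return chosen

picked = greedy_select(SIZES, 2)
print([sorted(v) for v in picked])
```

With these sizes the first pick is (city, product) and the second is date: once (city, product) is stored, the remaining gain from a view like product shrinks, so the choice shifts to a part of the lattice that is still expensive to answer.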

31 Dimension Hierarchies: city → state → all.

32 Dimension Hierarchies. Lattice nodes: (city, product, date); (city, product), (city, date), (product, date); city, product, date; all; plus (state, product, date), (state, product), (state, date), state. Not all arcs shown.

33 Interesting Hierarchy. Conceptual dimension table with levels days, weeks, months, quarters, years, all.
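
A likely reason this hierarchy is called interesting is that it is not a single chain: days group into weeks and also into months, but a week can straddle two months, so weeks do not roll up into months. The sketch below illustrates this with Python's datetime module; the use of ISO week numbering and the helper name levels are assumptions of this example, not something the slide specifies.

```python
from datetime import date

def levels(d):
    """Return the value of each hierarchy level for one day."""
    iso_year, iso_week, _weekday = d.isocalendar()
    return {
        "day": d.isoformat(),
        "week": (iso_year, iso_week),
        "month": (d.year, d.month),
        "quarter": (d.year, (d.month - 1) // 3 + 1),
        "year": d.year,
    }

a = levels(date(2024, 1, 31))
b = levels(date(2024, 2, 1))

same_week = a["week"] == b["week"]     # both days fall in the same ISO week
same_month = a["month"] == b["month"]  # but in different months
print(same_week, same_month)
```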

