Presentation is loading. Please wait.

Presentation is loading. Please wait.

20.5 Data Cubes Instructor : Dr. T.Y. Lin Chandrika Satyavolu 222.

Similar presentations


Presentation on theme: "20.5 Data Cubes Instructor : Dr. T.Y. Lin Chandrika Satyavolu 222."— Presentation transcript:

1 20.5 Data Cubes Instructor : Dr. T.Y. Lin Chandrika Satyavolu 222

2 Outline Introduction The Cube Operator Aggregation Cube implementation by Materialized Views Materialized Views Queries using materialized view The Lattice of Views Lattice of a materialized view

3 Introduction A formal data cube precomputes all possible aggregates in a systematic way. Amount of extra storage required is tolerable. If the warehoused data does not change, there is no penalty incurred to keep the aggregates up-to-date.

4 The Cube Operator Given : a fact table F, an augmented table would be CUBE(F) and it adds a new value ‘*’, to each dimension. * means any. It aggregates dimension in which it appears. Border represents only a small addition to the volume of the CUBE. A tuple of CUBE(F) that has ‘*’ in one or more dimension will have sum of values of that dependent attribute in all tuples in that dimension.

5 Example of aggregation : Given fact table: –Sales (serialNo, date, dealer, price) –serialNo is a key, so aggregating over all dates, dealers or prices will still give just one result with that serialNo –Replace serialNo by two attributes : model and color (both non-key) –It could give total sales price for a given model a given color a given dealer a given date –To get average sales, add a new dimension cnt and replace price with val i.e. sum of prices over a category –Relation Sales now becomes: Sales (model, color, date, dealer, val, cnt) Aggregation

6 Example of a tuple in both Sales and CUBE(Sales) would be: (‘Gobi’, ‘red’, ‘2001-05-21’, ‘Friendly-Fred’, 45000, 2) Aggregating over all colors: (‘Gobi’, *, ‘2001-05-21’, ‘Friendly-Fred’, 152000, 7) Aggregating over all colors and dealers: (‘Gobi’, *, ‘2001-05-21’, *, 2348000, 100) Aggregating over all colors, times and dealers: (‘Gobi’, *, *, *, 1339800000, 58000) Aggregating over all colors, times, dealers and models: (*, *, *, *, 3521727000, 198000) Aggregating over all dimensions tells us that the total sales of all models in all colors, over all time at all dealers is 198000 cars for a total price of $3521727000

7 Query and Tuple: –If the query specifies a value v for attribute a, then the tuple t has v in its component for a. –If the query groups by an attribute a, then t has a non-* value in its component for a. –If the query neither groups by an attribute a nor specifies a value a, then t has a * value in its component for a. Example: –Query: SELECT color, AVG(price) FROM Sales WHERE model = ‘Gobi’ GROUP BY color; –Tuple: (‘Gobi’, c, *, *, v, n)

8 Cube implementation by Materialized Views Until now, we assumed that queries choose to aggregate completely in a dimension or not to aggregate at all. Some dimensions can have many degrees of granularity. Example: –days can be aggregated by weeks, months, quarters or years As number of choices of grouping increases, cost of storing the results increases too. Answer to this is : chose some materialized views of the data cube.

9 Materialized Views A materialized view is the result of some query, which we choose to store in the database, rather than reconstructing it as needed in response to queries. Coarser the partition, the less space materialized view takes. To use a view to answer a query, the view must not partition any more coarser than the query does. Example: Group dates by months and dealers by city –INSERT INTO SalesV1 SELECT model, color, month, city, SUM(val) AS val, SUM(cnt) AS cnt FROM Sales JOIN Dealers on dealer = name GROUP BY model, color, month, city; Example: Group dates by weeks and dealers by state and over all colors –INSERT INTO SalesV2 SELECT model, week, state, SUM(val) AS val, SUM(cnt) AS cnt FROM Sales JOIN Dealers on dealer = name GROUP BY model, week, state;

10 Queries using materialized view Q1 : SELECT model, SUM(val) FROM Sales GROUP BY model; can be answered by: SELECT model, SUM(val) FROM SalesV1 GROUP BY model; or SELECT model, SUM(val) FROM SalesV2 GROUP BY model; Q2 : SELECT model, year, state, SUM(val) FROM Sales JOIN Dealers ON dealer = name GROUP BY model, year, state; can only be answered by: SELECT model, year, state, SUM(val) FROM SalesV1 GROUP BY model, year, state; Q3 : SELECT model, color, date, SUM(val) FROM Sales GROUP BY model, color, date; can be answered by neither SalesV1 nor SalesV2.

11 The Lattice of Views Lattice arranges groupings of similar dimensions in a hierarchical order. Points of lattice are the ways we can partition a dimension. If partition P 1 is below partition P 2 (P 1 ≤P 2 ), if each group of P 1 is contained within some group of P 2. Examples of lattice: Lattice of partitions on time interval Lattice of partitions on dealers

12 Lattice of a materialized view If V 1 and V 2 are two different views, then V 1 ≤ V 2 means in each dimension, the partition P 1 from view V 1 is at least as fine as partition P 2 from view V 2 ; that is P 1 ≤P 2. General rule: We can answer a query Q using view V if and only if V ≤ Q. Example of view lattice:

13 Thank You.


Download ppt "20.5 Data Cubes Instructor : Dr. T.Y. Lin Chandrika Satyavolu 222."

Similar presentations


Ads by Google