Hierarchical Dwarfs for the Rollup-Cube. Yannis Sismanis, Antonios Deligiannakis, Yannis Kotidis, Nick Roussopoulos.

1 Hierarchical Dwarfs for the Rollup-Cube. Yannis Sismanis, Antonios Deligiannakis, Yannis Kotidis, Nick Roussopoulos

2 Yannis Sismanis, DOLAP 2003. Motivation. Dimensional values annotated with hierarchies. Examples: Time: sec → min → hour; Store: code → retailer. Flat/Lattice. Profound effect on cube complexity: with L hierarchy levels per dimension and d dimensions, the number of views grows from 2^d to (L+1)^d. Traditionally handled externally: hierarchical queries are mapped into queries on the raw data, which requires further aggregation of the results.

3 Importance. On-Line Analytical Processing (OLAP). Decision Support Systems (DSS). Data Mining. Queries: Rollup/Drilldown, Ad-hoc. It's not just about fast queries: Computation/storage/indexing/…

4 Related Work. View materialization: NP-complete; even greedy algorithms are not practical for high-dimensional/hierarchical cubes. Compute the cube: various techniques that suffer from the dimensionality curse. Store/Index: ROLAP/MOLAP/… Compressed cubes: Condensed, Dwarf, Quotient.

5 Our Contribution. Extend the Dwarf architecture [SDRK02]. Implemented two approaches. Partial view-covering: breaks the problem into sub-problems and solves each one separately. Hierarchical: treats the problem as a whole. Maximize the effects of compression: important for all aspects of cube management. Address partial/full materialization. Extensive experimentation with real OLAP data.

6 Dwarf Overview. Complete system (100% accuracy): Compute/Store/Index/Query/Update [SDRK02]. Structural redundancies: prefix elimination (very high on dense areas); suffix coalescing (!) (orders of magnitude more important on sparse areas). Partial materialization: minimum granularity; resembles iceberg cubes. Optimizations: clustering.
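
The redundancy that suffix coalescing targets in sparse areas can be illustrated with a short sketch. It is not the Dwarf construction algorithm: it only counts how many cube cells of a tiny, illustrative fact table aggregate exactly the same set of fact rows. The table contents, the `ALL` marker `*`, and the function name `cube_cells` are assumptions made for this example.

```python
from itertools import product

# Illustrative three-row fact table: (store, code, name, sales).
facts = [
    ("S1", "C2", "N1", 10),
    ("S2", "C3", "N2", 30),
    ("S3", "C1", "N1", 60),
]
ALL = "*"  # marker for a dimension that has been aggregated away


def cube_cells(rows):
    """Map every non-empty cube cell to the set of fact-row indexes it aggregates."""
    cells = {}
    for idx, (*dims, _measure) in enumerate(rows):
        # Each dimension is either kept or rolled up to ALL: 2^d cells per row.
        for mask in product((False, True), repeat=len(dims)):
            cell = tuple(ALL if rolled else value for value, rolled in zip(dims, mask))
            cells.setdefault(cell, set()).add(idx)
    return cells


cells = cube_cells(facts)
# Cells that aggregate the same rows carry the same aggregate value.
distinct_groups = {frozenset(rows) for rows in cells.values()}

print(len(cells), "non-empty cells,", len(distinct_groups), "distinct row sets")
```

On this sparse input the full cube has 21 non-empty cells but only 5 distinct sets of contributing rows, so most cells repeat an aggregate that is already stored elsewhere.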

7 View-covering Partial Dwarfs. Use a forest of Dwarfs that encapsulates all views in the hierarchical cube: the “base dwarf” contains the lowest hierarchical views; “partial dwarfs” cover the higher hierarchical views. Partial Dwarf: does not store every possible combination of views; avoids duplication in the final forest of partial dwarfs. Fast view-covering enumeration process: single traversal over the view space; keeps track of just the last enumerated partial dwarf.

8 Partial Dwarfs (example). Store: StoreId → Retailer → ALL. Product: Code → Group → ALL. Customer: Name → ALL.
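
The view space of this example can be enumerated directly: a view picks one level per dimension. The sketch below does only that. It does not reproduce the view-covering enumeration of the partial-dwarf approach, and the names `hierarchies` and `enumerate_views` are hypothetical.

```python
from itertools import product

# Levels per dimension, finest first, as listed in the example.
hierarchies = {
    "Store": ["StoreId", "Retailer", "ALL"],
    "Product": ["Code", "Group", "ALL"],
    "Customer": ["Name", "ALL"],
}


def enumerate_views(hier):
    """Return every view as a {dimension: level} mapping (one level per dimension)."""
    names = list(hier)
    return [dict(zip(names, levels)) for levels in product(*(hier[n] for n in names))]


views = enumerate_views(hierarchies)
print(len(views))   # 3 * 3 * 2 combinations
print(views[0])     # finest view: StoreId, Code, Name
print(views[-1])    # coarsest view: ALL, ALL, ALL
```

With these three hierarchies the cube has 18 views, compared with 2^3 = 8 views when each dimension offers only its base level and ALL.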

9 Hierarchical Dwarf. Extend the Dwarf model: incorporate hierarchies inside the Dwarf DAG. Even higher-level aggregates can be reached through a path from the root. Nature of prefix redundancies changes: common prefixes between partial dwarfs. Most importantly, suffix redundancies are now exploited in a “global” way.

10 Hierarchical Dwarf (example)

Store  Code  Name  Sales
S1     C2    N1    $10
S2     C3    N2    $30
S3     C1    N1    $60

Store  Retailer      Code  Group
S1     R1            C1    G2
S2     R1            C2    G1
S3     R2            C3    G2
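
The roll-up aggregates implied by this example can be worked out with plain aggregation over the fact table. The sketch assumes the second table maps Store to Retailer (S1 → R1, S2 → R1, S3 → R2) and Code to Group (C1 → G2, C2 → G1, C3 → G2). It shows the external "map to raw data, then aggregate" evaluation, not the Dwarf DAG itself, and the helper names are hypothetical.

```python
from collections import defaultdict

# Fact table from the example: (store, code, name, sales in dollars).
facts = [
    ("S1", "C2", "N1", 10),
    ("S2", "C3", "N2", 30),
    ("S3", "C1", "N1", 60),
]
# Assumed hierarchy mappings read off the second table.
store_to_retailer = {"S1": "R1", "S2": "R1", "S3": "R2"}
code_to_group = {"C1": "G2", "C2": "G1", "C3": "G2"}


def to_all(_value):
    """Roll a dimension all the way up to ALL."""
    return "ALL"


def rollup(rows, store_level, code_level):
    """Sum sales after mapping Store and Code to the requested levels."""
    totals = defaultdict(int)
    for store, code, _name, sales in rows:
        totals[(store_level(store), code_level(code))] += sales
    return dict(totals)


by_retailer = rollup(facts, store_to_retailer.get, to_all)
by_group = rollup(facts, to_all, code_to_group.get)
by_retailer_group = rollup(facts, store_to_retailer.get, code_to_group.get)

print(by_retailer)        # sales per retailer
print(by_group)           # sales per product group
print(by_retailer_group)  # sales per (retailer, group)
```

Under the assumed mappings, retailer R1 totals $40 and R2 totals $60, while group G1 totals $10 and G2 totals $90.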

11 Non-linear Hierarchies

12 Experiments. Real-world data: 8 dimensions (7458, 2765, 3857, 3247, 213, 660, 4, 4); 4 hierarchies (1x6, 2x4, 1x3); 256 views vs. 11,200 views. Comparison with the base Dwarf, i.e., all hierarchical queries are mapped to the raw data and then further aggregated. Full uncompressed cube (BSF).
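
Both view counts follow from the dimension and hierarchy figures. The sketch below assumes that "1x6, 2x4, 1x3" means one hierarchy with 6 levels, two with 4 levels, and one with 3 levels, and that the remaining four dimensions are flat (base level plus ALL). The function names are hypothetical.

```python
from math import prod

NUM_DIMENSIONS = 8
# Assumed reading of "1x6, 2x4, 1x3": levels of the four hierarchical dimensions.
hierarchy_levels = [6, 4, 4, 3]


def flat_views(d):
    """Views when every dimension is either present or aggregated to ALL."""
    return 2 ** d


def hierarchical_views(d, levels):
    """Views when a dimension with L levels offers L + 1 choices (its levels plus ALL)."""
    flat_dims = d - len(levels)
    return prod(level_count + 1 for level_count in levels) * 2 ** flat_dims


print(flat_views(NUM_DIMENSIONS))                            # 2^8
print(hierarchical_views(NUM_DIMENSIONS, hierarchy_levels))  # 7 * 5 * 5 * 4 * 2^4
```

Under that reading the counts come out to 256 and 11,200, matching the figures on the slide.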

13 Computation

14 Storage

15 Full Cube Statistics

16 Query Evaluation. Simulated queries: Point/Range; Children. Effect of Rollup/Drill-Down.

17 Queries – Gmin=1

18 Queries – Gmin=1000

19 Conclusions. Presented two extensions to the Dwarf architecture: decompose the problem into simpler sub-problems; embed hierarchies in the structure. Suffix redundancies are more apparent in hierarchical cubes. Compression ratio of more than 70 times. Query response performance increase of ~10 times. Sparsity exploitation: a minimum granularity of 1,000 reduces computation time (by about 3 times) and increases performance.

20 Questions?

