Compressing Relations And Indexes


1 Compressing Relations And Indexes
Jonathan Goldstein, Raghu Ramakrishnan, Uri Shaft. Department of Computer Sciences, University of Wisconsin-Madison. June 18, 1997

2 Agenda
Introduction. Compressing a Relation. Compression Applied to Rectangle-Based Indexes. Performance Evaluation. Questions and Remarks.

3 Introduction
Page-level compression. Performance study. Application to B-trees and R-trees. Multidimensional bulk-loading algorithm.

4 Introduction

5 Introduction

6 Compressing A Relation
Frames of reference. Non-numeric attributes. File-level compression.

7 Frames of Reference
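
As an illustration of the frame-of-reference idea named on this slide, here is a minimal sketch. It assumes integer attributes and a frame defined by the per-attribute minimum and maximum of the tuples on one page; each value is then stored as an offset from the minimum, using only as many bits as the page's value range requires. The function names (`bits_needed`, `compress_page`, `decompress_tuple`) are hypothetical and the actual bit packing is omitted.

```python
def bits_needed(lo, hi):
    """Bits required to encode any integer offset in [0, hi - lo]."""
    return (hi - lo).bit_length()

def compress_page(tuples):
    """Encode integer tuples as offsets from the per-attribute minimum of this page.

    Returns the frame minimums, the bit width per attribute, and the offsets.
    (Assumption: a real page would bit-pack the offsets; here they stay as ints.)
    """
    dims = range(len(tuples[0]))
    lows = [min(t[d] for t in tuples) for d in dims]
    highs = [max(t[d] for t in tuples) for d in dims]
    widths = [bits_needed(lo, hi) for lo, hi in zip(lows, highs)]
    offsets = [tuple(t[d] - lows[d] for d in dims) for t in tuples]
    return lows, widths, offsets

def decompress_tuple(lows, offsets, i):
    """Recover tuple i alone, without touching the other tuples on the page."""
    return tuple(lo + off for lo, off in zip(lows, offsets[i]))

page = [(1000, 20), (1003, 25), (1001, 20)]
lows, widths, offsets = compress_page(page)
```

Because each tuple is decoded from the frame and its own offsets only, a single tuple can be decompressed independently of the rest of the page.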

8 Point approximation in lossy compression

9 Compressing an indexing structure
Compressing a B-tree. Compressing a rectangle-based indexing structure. Compression-oriented bulk loading.

10 Rectangle Based indexing qualities

11 Changing the frame of reference

12 Bulk-Loading Algorithm
Input. A set of points in some n-dimensional space. Output. A partition of the input into subsets. Requirements. The partition should group points that are close to each other into the same subset as much as possible.
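
The input/output contract above can be sketched with a deliberately simple stand-in: recursively split the points at the median of the widest dimension until every group fits a fixed capacity. This is not the GB-Pack algorithm from the talk, only an assumed baseline that satisfies the stated requirement of keeping nearby points together; the function name `partition` and the `capacity` parameter are hypothetical.

```python
def partition(points, capacity):
    """Split points into groups of at most `capacity`, keeping nearby points together.

    Assumption: a plain median split on the widest dimension, used here only
    to illustrate the contract (set of points in, partition out).
    """
    if len(points) <= capacity:
        return [points]
    dims = range(len(points[0]))
    # Choose the dimension whose values are spread over the widest range.
    d = max(dims, key=lambda k: max(p[k] for p in points) - min(p[k] for p in points))
    ordered = sorted(points, key=lambda p: p[d])
    mid = len(ordered) // 2
    return partition(ordered[:mid], capacity) + partition(ordered[mid:], capacity)

groups = partition([(0, 0), (100, 100), (1, 1), (101, 99)], capacity=2)
```

Every input point lands in exactly one group, and points that are close along the chosen dimension end up in the same group.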

13 GB-Pack compression oriented bulk loading

14 GB-Pack compression oriented bulk loading
Qualities: Trades off some tree quality for increased compression. The number of entries per page is data-dependent. Cuts a dimension at a value boundary in the data.
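
To see why the number of entries per page becomes data-dependent once pages are compressed, consider the following sketch. It is an assumed greedy packer, not GB-Pack itself: it walks over points in the given order and closes a page when the compressed size of the group would exceed a bit budget. The names `tuple_bits`, `greedy_pack`, and `page_bits` are hypothetical, and the cost of storing the frame header is ignored.

```python
def tuple_bits(group):
    """Bits per tuple when each attribute is stored as an offset from the group minimum."""
    dims = range(len(group[0]))
    return sum(
        (max(p[d] for p in group) - min(p[d] for p in group)).bit_length()
        for d in dims
    )

def greedy_pack(points, page_bits):
    """Fill pages in input order; a page closes when adding a point would overflow it."""
    pages, current = [], []
    for p in points:
        candidate = current + [p]
        # Adding a far-away point widens the frame and raises the cost of every tuple.
        if current and tuple_bits(candidate) * len(candidate) > page_bits:
            pages.append(current)
            current = [p]
        else:
            current = candidate
    if current:
        pages.append(current)
    return pages

pages = greedy_pack([(0,), (1,), (2,), (3,), (1000,), (1001,)], page_bits=8)
```

Tightly clustered values need few bits each, so more of them fit on a page; an outlier forces a new page instead of widening the frame for everyone.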

15 GB-Pack compression oriented bulk loading

16 GB-Pack compression oriented bulk loading

17 GB-Pack compression oriented bulk loading

18 Performance Evaluation
Relational Compression Experiments. CPU vs. I/O Costs. Comparison With Techniques in commercial systems. Importance of Tuple-Level Decompression. R-tree Compression Experiments.

19 Synthetic Data Sets
Size: The number of tuples in the relation. Dimensionality: The number of attributes of the relation. Range: The range of values for the attributes. Distribution: uniform (worst case) / exponential. Partition strategy. Page size.

20 Sales Data Set
Sales data set. Compression achieved versus dimensionality.

21 CPU vs. I/O Costs

22 R-tree Compression Experiments
Testing the quality of R-trees on Sales Data Set.

23 Questions And Remarks

