On the analysis of indexing schemes


Presentation on theme: "On the analysis of indexing schemes"— Presentation transcript:

1 On the analysis of indexing schemes
Written by: Joseph M. Hellerstein, Elias Koutsoupias, Christos H. Papadimitriou. Presented by: Tali Kaufman

2 Presentation layout. Problem definition - define a framework to measure the efficiency of an index. Performance factors - access overhead and storage redundancy. Range queries - access overhead upper bound; access overhead lower bound (r = 1); access overhead lower bound (r >= 1). Set queries - worst-case access overhead. Conclusions. Open problems.

3 The problem. Problem - define a framework for measuring the efficiency of an indexing scheme for a workload, based on two performance factors: storage redundancy and access overhead. Workload - a data set together with a set of potential queries. Indexing scheme - a collection of blocks that stores an actual data set instance.
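The two performance factors can be made concrete with a small sketch. The definition slides (4, 6 and 7) did not survive in this transcript, so the formulas below are assumptions based on the usual statement of the framework: average redundancy is total block capacity divided by instance size, and the access overhead of a query is the minimum number of blocks needed to cover it divided by the ideal number ceil(|Q| / B). All function and variable names are hypothetical, and the minimum cover is found by brute force, which is only feasible for toy instances.

```python
from itertools import combinations
from math import ceil


def average_redundancy(instance, blocks, block_size):
    # Assumed definition: total block capacity over instance size.
    return block_size * len(blocks) / len(instance)


def min_cover_size(query, blocks):
    # Brute-force minimum set cover; fine for toy examples only.
    useful = [b for b in blocks if b & query]
    for k in range(1, len(useful) + 1):
        for combo in combinations(useful, k):
            if query <= frozenset().union(*combo):
                return k
    raise ValueError("blocks do not cover the query")


def access_overhead(query, blocks, block_size):
    # Assumed definition: blocks actually needed over the ideal count.
    ideal = ceil(len(query) / block_size)
    return min_cover_size(query, blocks) / ideal


# Toy workload: a 4 x 4 grid instance with all axis-aligned range queries.
n = 4
instance = frozenset((x, y) for x in range(n) for y in range(n))
queries = [
    frozenset((x, y) for x in range(x1, x2 + 1) for y in range(y1, y2 + 1))
    for x1 in range(n) for x2 in range(x1, n)
    for y1 in range(n) for y2 in range(y1, n)
]

# Toy indexing scheme: four disjoint 2 x 2 tiles, so block size B = 4.
B = 4
blocks = [
    frozenset((x + dx, y + dy) for dx in range(2) for dy in range(2))
    for x in range(0, n, 2) for y in range(0, n, 2)
]

r = average_redundancy(instance, blocks, B)
A = max(access_overhead(q, blocks, B) for q in queries)
print(f"redundancy r = {r}, worst-case access overhead A = {A}")
```

In this toy scheme nothing is stored twice, so r = 1, yet the 2 x 2 query in the middle of the grid touches all four tiles although it would fit in a single block, so its overhead is 4. This is the kind of tension between the two factors that the rest of the presentation bounds.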

4 Workload definition

5 Example - a workload with two-dimensional range queries

6 Indexing scheme definition

7 Performance factors definition

8 Access overhead upper bound for two-dimensional range queries

9 Access overhead lower bound (redundancy = 1)

10 Access overhead lower bound (redundancy = 1) [cont]

11 Access overhead lower bound (redundancy >= 1)

12 Access overhead lower bound (redundancy >= 1) [cont]

13 Access overhead lower bound (redundancy >= 1) [cont]

14 Access overhead lower bound (redundancy >= 1) [cont]

15 Access overhead lower bound (redundancy >= 1) [cont]

16 Access overhead lower bound (redundancy >= 1) [cont]

17 Example - Set inclusion workloads

18 Set inclusion workloads worst-case access overhead

19 Conclusions. Theory of indexability - the article presents a framework for studying indexability. Workload and indexing scheme in indexability theory are analogous to language and algorithm in complexity theory. The framework emphasizes the secondary-storage nature of indexing schemes: it examines storage utilization (redundancy) and disk accesses (access overhead). It considers range queries and set queries, focusing on lower bounds and on the trade-off between redundancy and access overhead. The trade-off is worse for workloads with a large number of queries (set queries - exponential, range queries - polynomial). Algorithms for finding the best access method (search algorithms) and the best partition into blocks are not considered. The size of the instance does not affect the results.

20 Open problems

