
1 Index Interactions in Physical Design Tuning: Modeling, Analysis, and Applications
Karl Schnaitter, UC Santa Cruz
Neoklis Polyzotis, UC Santa Cruz
Lise Getoor, Univ. of Maryland
VLDB 2009, Lyon, France

2 Index Selection
Index selection problem:
– Given a query workload
– Choose indices that improve workload performance
Does index benefit depend on other indices?
– If so, this is called index interaction
Index “benefit” is a key concept
– Informally, for an index i, [benefit of i] = [exec cost without i] − [exec cost with i]
University of California, Santa Cruz
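The informal benefit definition can be made concrete with a small sketch. The cost table below is hypothetical (invented numbers, not from the talk), and the names `TOY_COSTS`, `cost`, and `benefit` are illustrative assumptions.

```python
# Hypothetical cost table for a single query: each index configuration maps to
# an estimated execution cost. All numbers are invented for illustration.
TOY_COSTS = {
    frozenset(): 100.0,
    frozenset({"a"}): 60.0,
    frozenset({"b"}): 70.0,
    frozenset({"a", "b"}): 50.0,
}


def cost(config):
    """Execution cost of the query when exactly `config` is materialized."""
    return TOY_COSTS[frozenset(config)]


def benefit(new_indices, base):
    """Cost without the new indices minus cost with them, on top of `base`."""
    base = frozenset(base)
    return cost(base) - cost(base | frozenset(new_indices))


print(benefit({"a"}, set()))   # benefit of a on its own
print(benefit({"a"}, {"b"}))   # benefit of a once b already exists
```

In this toy table the benefit of a drops from 40 to 20 once b is present, which is the kind of dependence the slide calls index interaction.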

3 Related Work
Interactions are a key concern in physical tuning
– [Whang et al. 1981] make assumptions implying that indices on different tables do not interact
– [Finkelstein et al. 1988] assume that indices do not interact if they are relevant to separate queries
– [Bruno and Chaudhuri 2007] explicitly account for some interactions in on-line index selection
– Many more…
These studies treat interactions as a secondary issue, and often rely on ad hoc assumptions

4 Index Interactions
Let S be a set of indices relevant to a query Q
[Diagram: cost(X), cost(X ∪ {a}), cost(X ∪ {b}), and cost(X ∪ {a,b}), comparing benefit({a}, X) with benefit({a}, X ∪ {b})]
Indices a,b are independent with respect to X

5 Index Interactions
Let S be a set of indices relevant to a query Q
[Diagram: cost(X), cost(X ∪ {a}), cost(X ∪ {b}), and cost(X ∪ {a,b}), comparing benefit({a}, X) with benefit({a}, X ∪ {b})]
Indices a,b positively interact with respect to X

6 Index Interactions
Let S be a set of indices relevant to a query Q
[Diagram: cost(X), cost(X ∪ {a}), cost(X ∪ {b}), and cost(X ∪ {a,b}), comparing benefit({a}, X) with benefit({a}, X ∪ {b})]
Indices a,b negatively interact with respect to X
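The three cases can be sketched as a comparison of benefit({a}, X) with benefit({a}, X ∪ {b}). The original diagrams are not preserved, so the sign convention here is an assumption: "positive" is taken to mean that the benefit of a grows once b is present. The function name, the `eps` tolerance, and the cost tables are hypothetical.

```python
def classify_interaction(cost, a, b, X, eps=1e-9):
    """Classify how a and b interact with respect to X.

    `cost` maps a frozenset of indices to an execution cost.
    Assumed convention: "positive" means b increases the benefit of a.
    """
    X = frozenset(X)
    benefit_without_b = cost(X) - cost(X | {a})
    benefit_with_b = cost(X | {b}) - cost(X | {a, b})
    diff = benefit_with_b - benefit_without_b
    if abs(diff) <= eps:
        return "independent"
    return "positive" if diff > 0 else "negative"


def table_cost(no_index, only_a, only_b, both):
    """Build a toy cost function over {a, b} from four invented numbers."""
    table = {
        frozenset(): no_index,
        frozenset({"a"}): only_a,
        frozenset({"b"}): only_b,
        frozenset({"a", "b"}): both,
    }
    return lambda config: table[frozenset(config)]


# Savings simply add up: a saves 20 with or without b.
print(classify_interaction(table_cost(100, 80, 70, 50), "a", "b", set()))
# Neither index helps alone, but together they halve the cost.
print(classify_interaction(table_cost(100, 100, 100, 50), "a", "b", set()))
# b already captures part of what a offers.
print(classify_interaction(table_cost(100, 60, 70, 50), "a", "b", set()))
```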

7 Degree of Interaction
doi(a,b,X) = degree of interaction between a,b with respect to X [formula not preserved in transcript]
doi(a,b) = degree of interaction between a,b [formula not preserved in transcript]
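The formulas on this slide did not survive extraction. One reconstruction that is consistent with the benefit notation on the preceding slides is given below. The normalization by cost(X ∪ {a,b}) and the maximization over X are assumptions based on the definition in the accompanying VLDB 2009 paper, so treat this as a sketch rather than the slide's exact content.

```latex
\[
  \mathrm{doi}(a,b,X) \;=\;
  \frac{\bigl|\,\mathrm{benefit}(\{a\},X) - \mathrm{benefit}(\{a\},X\cup\{b\})\,\bigr|}
       {\mathrm{cost}(X\cup\{a,b\})},
  \qquad
  \mathrm{doi}(a,b) \;=\; \max_{X \subseteq S\setminus\{a,b\}} \mathrm{doi}(a,b,X)
\]
```

Expanding both benefits shows that the numerator equals |cost(X) − cost(X ∪ {a}) − cost(X ∪ {b}) + cost(X ∪ {a,b})|, so under this reading the measure is symmetric in a and b.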

8 Problem Statement
Which indices in S interact?
How strong are the interactions?
The Degree of Interaction Problem: [statement not preserved in transcript]

9 Outline
Properties of Query Optimization
Degree of Interaction Algorithm
Applying Interaction Information


11 Query Optimization
Computing doi(a,b) is not practical if the optimizer is totally arbitrary
– Need to compute [formula not preserved in transcript]
In practice, query optimization is not arbitrary
– E.g., we expect [formula not preserved in transcript]
We put mild assumptions on query optimization:
– Plans are selected from some fixed space P
– Optimizer chooses the cheapest feasible plan from P
– Ties are broken consistently

12 Index Benefit Graph
An Index Benefit Graph (IBG) encodes the selection of optimal plans for a query
– Introduced by [Frank, Omiecinski, and Navathe 1992]
Example IBG when S = {a,b,c,d}
[Figure: example IBG; each node is labeled with the indices used in the optimal plan and the cost of that plan; node costs shown include 20, 45, 50, 65, and 80]
– There are 16 subsets of S
– IBG has 8 nodes
– But IBG can compute [remainder not preserved in transcript]
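A minimal sketch of how an IBG can be built and then queried, assuming the optimizer model from the Query Optimization slide (fixed plan space, cheapest feasible plan, consistent tie-breaking). The plan space `PLANS` and the names `optimize`, `build_ibg`, and `ibg_cost` are hypothetical stand-ins, and the numbers do not reproduce the slide's figure.

```python
# Hypothetical plan space P: (indices the plan requires, plan cost).
PLANS = [
    (frozenset(), 80.0),
    (frozenset({"a"}), 50.0),
    (frozenset({"b", "c"}), 45.0),
    (frozenset({"a", "b"}), 20.0),
]


def optimize(config):
    """Toy optimizer: cheapest plan whose required indices are all in config.

    Ties are broken consistently by the sorted list of required indices.
    Returns (cost, indices used by the chosen plan).
    """
    config = frozenset(config)
    feasible = [(c, sorted(used), used) for used, c in PLANS if used <= config]
    c, _, used = min(feasible, key=lambda t: (t[0], t[1]))
    return c, used


def build_ibg(S):
    """Build the IBG: the root is S; node Y has a child Y - {a} per used index a."""
    root = frozenset(S)
    nodes = {}
    stack = [root]
    while stack:
        Y = stack.pop()
        if Y in nodes:
            continue
        c, used = optimize(Y)
        nodes[Y] = (c, used)
        for a in used:
            stack.append(Y - {a})
    return root, nodes


def ibg_cost(root, nodes, X):
    """Look up cost(X) for any X within S without calling the optimizer again."""
    X = frozenset(X)
    Y = root
    while True:
        c, used = nodes[Y]
        missing = used - X
        if not missing:
            return c  # the plan chosen at Y is also feasible, hence optimal, for X
        Y = Y - {min(missing)}  # drop a used index that X does not contain


root, nodes = build_ibg({"a", "b", "c", "d"})
print(len(nodes), "IBG nodes for 16 subsets")
print(ibg_cost(root, nodes, {"b", "c"}))
```

The lookup walks down from the root, always staying at a superset of X, until it reaches a node whose plan uses only indices in X. That is why far fewer optimizer calls than subsets are needed in this sketch.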

13 Outline
Properties of Query Optimization
Degree of Interaction Algorithm
Applying Interaction Information

14 Naive Algorithm
Recall that we want the degree of interaction between all pairs of indices in S
Each doi(a,b) may be computed directly
Upon termination, T[a,b] = doi(a,b) for all a,b
Can save time by using an IBG as a cache of the cost function
Downside: iteration over all subsets of S
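The naive approach can be sketched as follows. The slide's pseudocode is not preserved, so this is an illustration rather than the authors' listing. It assumes doi(a,b) is the maximum over X of |benefit({a}, X) − benefit({a}, X ∪ {b})| / cost(X ∪ {a,b}), and the cost function is a hypothetical toy in which only a and b interact.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = sorted(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def naive_doi(S, cost):
    """Fill T[a, b] with doi(a, b) by brute force over all X not containing a, b.

    Assumed definition: max over X of
    |benefit({a}, X) - benefit({a}, X | {b})| / cost(X | {a, b}).
    """
    S = frozenset(S)
    T = {}
    for a, b in combinations(sorted(S), 2):
        best = 0.0
        for X in subsets(S - {a, b}):
            benefit_a = cost(X) - cost(X | {a})
            benefit_a_given_b = cost(X | {b}) - cost(X | {a, b})
            best = max(best, abs(benefit_a - benefit_a_given_b) / cost(X | {a, b}))
        T[(a, b)] = best
    return T


def toy_cost(config):
    """Invented cost model: a and b overlap in usefulness, c is independent."""
    c = 100.0
    if "a" in config:
        c -= 40.0
    if "b" in config:
        c -= 30.0
    if "a" in config and "b" in config:
        c += 20.0
    if "c" in config:
        c -= 10.0
    return c


T = naive_doi({"a", "b", "c"}, toy_cost)
for pair in sorted(T):
    print(pair, T[pair])
```

The inner loop visits every subset of S without a and b, which is the exponential cost the slide names as the downside. A cost function backed by an IBG could be passed in place of `toy_cost`.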

15 The QINTERACT Algorithm
Naive Algorithm (condensed) [pseudocode not preserved in transcript]
We should avoid evaluating doi(a,b,X) for all [remainder not preserved in transcript]
QINTERACT algorithm processes two index sets per IBG node
QINTERACT Algorithm [pseudocode not preserved in transcript]

16 QINTERACT Example
Let’s calculate doi(a,b) on the graph below
What happens on iteration Y = {u}?
[Figure: IBG with node labels a b u v = 20, a u v = 30, b u v = 30, a u = 40, u v = 40, v = 50, u = 50, b v = 40; the node Y is highlighted]

17 Interleaved IBG Processing
In QINTERACT, the IBG is built, then analyzed
– I.e., IBG construction and analysis are serial
We can discover interactions in a partial IBG
IBG construction and analysis may be interleaved
– Improves accuracy of doi over time
[Figure: partially constructed IBG over {a,b,c,d}]

18 Outline
Properties of Query Optimization
Degree of Interaction Algorithm
Applying Interaction Information
– Visualizing Index Interactions
– Scheduling Index Creation


20 Visualizing Index Interactions
We can visualize the doi function as a graph
– Nodes correspond to indices
– Edge between a and b has weight doi(a,b)
[Figure: interaction graph for TPC-H Query 7; nodes O(CK,OK), C(CK,NK), C(NK,CK), LI(SK,SD,D,EP,OK), LI(SD,D), LI(SD,Q), S(NK,N,SK), S(NK,SK), S(SK,NK); edge weights range from 0.01 to 0.09]

21 Interaction Graph
The connected components have special meaning
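Connected components of the interaction graph can be extracted with a simple traversal. This sketch assumes the doi values are already available as a dictionary keyed by index pairs. The `threshold` parameter (ignore edges at or below a cutoff) and the sample weights are illustrative assumptions.

```python
def interaction_components(indices, doi, threshold=0.0):
    """Group indices into connected components of the interaction graph.

    An edge joins a and b when doi[(a, b)] exceeds `threshold`.
    """
    adjacency = {i: set() for i in indices}
    for (a, b), weight in doi.items():
        if weight > threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen = set()
    components = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:  # depth-first traversal from `start`
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(adjacency[node] - seen)
        components.append(frozenset(component))
    return components


# Invented doi values for four hypothetical indices.
doi = {("a", "b"): 0.5, ("c", "d"): 0.05, ("a", "c"): 0.0}
print(interaction_components(["a", "b", "c", "d"], doi, threshold=0.1))
```

With the 0.1 cutoff, only a and b remain connected, and c and d each form a singleton component.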

22 Outline
Properties of Query Optimization
Degree of Interaction Algorithm
Applying Interaction Information
– Visualizing Index Interactions
– Scheduling Index Creation

23 Scheduling Index Creation
Suppose we want to materialize new indices
In what order should they be created?
Choose the first schedule to maximize benefit over time (shaded area)
[Figure: benefit versus materialized indices for three schedules: a,b,c (a; a,b; a,b,c), b,a,c (b; a,b; a,b,c), and c,a,b (c; a,c; a,b,c)]

24 Scheduling Index Creation
We define an optimization problem
– M = preexisting indices
– {a_1, …, a_n} = new indices to create
– Permute new indices as t_1, …, t_n to maximize [objective not preserved in transcript]
This problem is computationally hard
– There is a connection to the Set Cover problem, since each new index “covers” more benefit
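The objective on this slide is not preserved. As a rough sketch, assume every index takes one time unit to build, so that "benefit over time" is the sum of the benefit of each schedule prefix relative to M. Under that assumption an exhaustive search over permutations looks like the following. The benefit table is invented, and exhaustive search is only feasible for very small n.

```python
from itertools import permutations

# Hypothetical benefit of each set of new indices relative to the preexisting
# set M. Invented numbers: a and b are mostly valuable together.
TOY_BENEFIT = {
    frozenset(): 0.0,
    frozenset({"a"}): 10.0,
    frozenset({"b"}): 10.0,
    frozenset({"c"}): 30.0,
    frozenset({"a", "b"}): 100.0,
    frozenset({"a", "c"}): 40.0,
    frozenset({"b", "c"}): 40.0,
    frozenset({"a", "b", "c"}): 110.0,
}


def benefit(built):
    return TOY_BENEFIT[frozenset(built)]


def schedule_area(schedule, benefit):
    """Sum of prefix benefits, assuming unit build time per index."""
    built = frozenset()
    total = 0.0
    for index in schedule:
        built = built | {index}
        total += benefit(built)
    return total


def best_schedule(new_indices, benefit):
    """Exhaustive search over all n! orderings."""
    return max(permutations(sorted(new_indices)),
               key=lambda s: schedule_area(s, benefit))


best = best_schedule({"a", "b", "c"}, benefit)
print(best, schedule_area(best, benefit))
```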

25 Greedy Scheduling
We are tempted to use a greedy heuristic
This results in the third schedule
Greedy schedule can be suboptimal by a factor of about (n − 1)
[Figure: benefit versus materialized indices for three schedules: a,b,c (a; a,b; a,b,c), b,a,c (b; a,b; a,b,c), and c,a,b (c; a,c; a,b,c)]
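A greedy heuristic can be sketched as "always build the index that gives the largest benefit right now". The slide does not spell out the rule, so this reading is an assumption, as are the unit build times and the invented benefit table, which is chosen so that c looks best in isolation while a and b pay off together.

```python
# Hypothetical benefit of each set of new indices relative to M (invented).
TOY_BENEFIT = {
    frozenset(): 0.0,
    frozenset({"a"}): 10.0,
    frozenset({"b"}): 10.0,
    frozenset({"c"}): 30.0,
    frozenset({"a", "b"}): 100.0,
    frozenset({"a", "c"}): 40.0,
    frozenset({"b", "c"}): 40.0,
    frozenset({"a", "b", "c"}): 110.0,
}


def benefit(built):
    return TOY_BENEFIT[frozenset(built)]


def schedule_area(schedule, benefit):
    """Sum of prefix benefits, assuming unit build time per index."""
    built = frozenset()
    total = 0.0
    for index in schedule:
        built = built | {index}
        total += benefit(built)
    return total


def greedy_schedule(new_indices, benefit):
    """Repeatedly pick the index with the highest immediate benefit."""
    built = frozenset()
    remaining = set(new_indices)
    schedule = []
    while remaining:
        # sorted() makes tie-breaking deterministic (alphabetical).
        nxt = max(sorted(remaining), key=lambda i: benefit(built | {i}))
        schedule.append(nxt)
        built = built | {nxt}
        remaining.remove(nxt)
    return schedule


greedy = greedy_schedule({"a", "b", "c"}, benefit)
print(greedy, schedule_area(greedy, benefit))
print(["a", "b", "c"], schedule_area(["a", "b", "c"], benefit))
```

In this toy instance greedy commits to c first and ends up with a smaller area than the schedule a,b,c, which illustrates how the heuristic can miss a pair of indices that interact positively.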

26 Interaction-Aware Scheduling
Scheduling can use the interaction graph
Idea: First find optimal sub-schedules for each C_i, then choose the best interleaving of sub-schedules
This heuristic avoids the pitfalls of greedy scheduling
We can also show stronger performance guarantees

27 Conclusions
Index interactions provide useful insights for physical design tuning
The doi metric is an effective characterization of interaction relationships
We can analyze interactions efficiently when the Index Benefit Graph has limited size
Future work?

28 Thank You

29 Performance Evaluation
QINTERACT implementation in Java
– Uses JDBC to connect to IBM DB2 database
Experiments use 22 TPC-H benchmark queries
We generate indices based on the DB2 advisor
– S_ALL = all indices recommended by DB2
– S_1C = indices in S_ALL with first column only
We monitor the progress of the “serial” and “interleaved” approaches over time

30 Experimental Results
[Figure: two result plots, one for the S_ALL index set and one for the S_1C index set, each with a 0.1 threshold]

31 Applications
QINTERACT returns doi(a,b) for all a,b
We propose two applications of this information
– Visualizing index interactions
  Illustrates the global interactions as a graph
  Useful when manually tuning the index set
– Scheduling index construction
  Want to choose when new indices will be created
  Goal is to increase performance as quickly as possible
  Knowledge of index interactions can help

32 Problem Statement
Which indices in S interact?
How strong are the interactions?
The Degree of Interaction Problem: [statement not preserved in transcript]
It may be useful to ignore “minor” interactions
A threshold-based variant: [statement not preserved in transcript]

33 Index Selection
Index selection problem: [statement not preserved in transcript]
Does benefit(a, X) depend on X?
– If so, this is called index interaction
We can quantify the benefit of an index: [formula not preserved in transcript]

34 Future Work
Expand our support for updates
Implementation of visualization tool
Experiments with materialization scheduling
Incremental updates to doi function
Exploring stronger assumptions on query optimization
– Efficient upper bounds on doi function?

