Magic Decorrelation: An Optimization Technique for SQL queries CS525 Lecture WPI.


1 Magic Decorrelation: An Optimization Technique for SQL queries CS525 Lecture WPI

2 DSRG, WPI 2 Query Graph Model (QGM)
- Internal representation of an SQL query used within IBM DB2
- One box for each “query block” and for each aggregation
- Each box has a “head” and a “body”
  - The head says what is returned by the box
  - The body describes the computation performed
- DISTINCT is specified so that QGM can work with bags
- Head specifies DISTINCT = TRUE/FALSE
  - If DISTINCT = TRUE, there are no duplicates in the result returned by the box
  - If DISTINCT = FALSE, duplicates may be present in the result of the box

3 QGM (Contd…)
- Body specifies DISTINCT = ENFORCE/PRESERVE/PERMIT
  - If DISTINCT = ENFORCE, the body explicitly removes duplicates
  - If DISTINCT = PRESERVE, the body neither adds nor removes duplicates obtained from the operation
  - If DISTINCT = PERMIT, the body may add or remove duplicates
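To make the head/body split concrete, here is a minimal Python sketch of a box record. The class and field names (`Box`, `BodyDistinct`, `head_distinct`, `body_distinct`) are assumptions made for illustration; they are not DB2's actual data structures.

```python
from dataclasses import dataclass, field
from enum import Enum


class BodyDistinct(Enum):
    ENFORCE = "ENFORCE"    # body removes duplicates itself
    PRESERVE = "PRESERVE"  # body neither adds nor removes duplicates
    PERMIT = "PERMIT"      # body may add or remove duplicates


@dataclass
class Box:
    """Hypothetical QGM box: the head describes the output, the body the computation."""
    name: str
    head_columns: list            # what the box returns
    head_distinct: bool           # head DISTINCT = TRUE/FALSE
    body_distinct: BodyDistinct   # body DISTINCT = ENFORCE/PRESERVE/PERMIT
    inputs: list = field(default_factory=list)  # tables or boxes read by the body


def describe(box: Box) -> str:
    """Render the two DISTINCT attributes of a box as text."""
    head = "TRUE" if box.head_distinct else "FALSE"
    return f"{box.name}: head DISTINCT={head}, body DISTINCT={box.body_distinct.value}"


outer = Box("outer", ["PartNo"], True, BodyDistinct.ENFORCE, inputs=["inventory"])
print(describe(outer))  # outer: head DISTINCT=TRUE, body DISTINCT=ENFORCE
```

Keeping the two attributes separate mirrors the slides: the head states a promise about the output, while the body states how the computation treats duplicates.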

4 QGM Example 1
Consider the following SQL query:
    SELECT DISTINCT q1.PartNo, q1.Desc, q2.SuppNo
    FROM inventory q1, quotations q2
    WHERE q1.PartNo = q2.PartNo
      AND q1.Desc = 'engine'
      AND q2.price <= ALL (SELECT q3.Price
                           FROM quotations q3
                           WHERE q2.PartNo = q3.PartNo)

5 QGM Example 1 (Contd…)
Note the following:
- Box 3 corresponds to the outer block
  - Box 3 has DISTINCT = TRUE for the head and DISTINCT = ENFORCE for the body
  - Box 3 takes 3 inputs: inventory, quotations and the output of box 4
  - Each input has a quantifier: q1(F) denotes that q1 comes from the FROM clause, and q4(A) denotes that the specified condition must be true for all the rows in q4
- Box 4 corresponds to the inner block
  - DISTINCT = PERMIT for the body of box 4
  - The input from box 3 to box 4 denotes the correlation
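The roles of the F and A quantifiers can be sketched by evaluating Example 1 by hand in Python. The sample rows and the function names `box3` and `box4` are made up for illustration, and the sketch ignores SQL NULL semantics.

```python
# Hypothetical sample data for the two base tables.
inventory = [
    {"PartNo": 1, "Desc": "engine"},
    {"PartNo": 2, "Desc": "wheel"},
]
quotations = [
    {"PartNo": 1, "SuppNo": "S1", "Price": 100},
    {"PartNo": 1, "SuppNo": "S2", "Price": 90},
    {"PartNo": 2, "SuppNo": "S1", "Price": 20},
]


def box4(q2):
    """Inner block: prices quoted for the same part as the outer row q2.

    The parameter q2 is the correlation: box 4 needs a row from box 3 to run.
    """
    return [q3["Price"] for q3 in quotations if q3["PartNo"] == q2["PartNo"]]


def box3():
    """Outer block: q1 and q2 act as F quantifiers, q4 as an A (for-all) quantifier."""
    rows = set()  # body DISTINCT = ENFORCE: the set removes duplicates
    for q1 in inventory:          # q1(F)
        for q2 in quotations:     # q2 is also listed in the FROM clause
            if q1["PartNo"] != q2["PartNo"] or q1["Desc"] != "engine":
                continue
            # q4(A): the predicate must hold for every row produced by box 4
            if all(q2["Price"] <= q4 for q4 in box4(q2)):
                rows.add((q1["PartNo"], q1["Desc"], q2["SuppNo"]))
    return sorted(rows)


print(box3())  # [(1, 'engine', 'S2')]
```

Calling `box4(q2)` once per outer row is exactly the nested evaluation that decorrelation tries to avoid.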

6 QGM Example 2
Consider the following correlated query with aggregation:
    SELECT d.Name
    FROM dept d
    WHERE d.Budget […] (SELECT COUNT(*)
                        FROM emp e
                        WHERE d.Building = e.Building)

7 QGM Example 2 (Contd…)
[Figure: QGM for Example 2, showing boxes (1), (2) and (3)]

8 Some terminology
- Box (3) is directly correlated to box (1) because it uses an input from (1)
- Boxes (3) and (2) are both said to be correlated to box (1): a box is correlated to (1) if it, or at least one of its descendants, is directly correlated to (1)
- q1.Building is the correlation column
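The two definitions can be checked mechanically. The sketch below assumes the box layout suggested by the slides, (1) above (2) above (3), and uses a hypothetical `uses` set to record which boxes a body references through correlation.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    """Hypothetical QGM box; `uses` names the boxes referenced via correlation."""
    name: str
    children: list = field(default_factory=list)
    uses: set = field(default_factory=set)


def descendants(box):
    """Yield every box below `box` in the graph."""
    for child in box.children:
        yield child
        yield from descendants(child)


def directly_correlated(box, target):
    """The box itself references a column of `target`."""
    return target.name in box.uses


def correlated(box, target):
    """The box, or at least one of its descendants, is directly correlated to `target`."""
    return directly_correlated(box, target) or any(
        directly_correlated(d, target) for d in descendants(box)
    )


box3 = Box("3", uses={"1"})        # references q1.Building from box (1)
box2 = Box("2", children=[box3])   # aggregate box over box (3)
box1 = Box("1", children=[box2])   # outer block

print(directly_correlated(box3, box1))  # True
print(directly_correlated(box2, box1))  # False
print(correlated(box2, box1))           # True
```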

9 Removing Correlation
- Traverse the QGM in depth-first order
  - For our example, visit the boxes in the order (1), (2), (3)
- For each box A:
  - Check whether a descendant box B is correlated to A or to an ancestor of A. If so, feed the correlation to A's child (if any)
  - If A is itself correlated to an ancestor box, absorb the correlation for this box
- Note: absorb differs depending on whether the box is an aggregate box or an SPJ (select-project-join) box
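The traversal can be sketched as a skeleton that only records where feed and absorb would fire, without rewriting the graph. The `Box` class, the `uses` sets and the log format are assumptions for illustration; the real algorithm also creates new boxes as it goes, which this sketch leaves out.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    name: str
    kind: str                                  # "SPJ" or "AGG"
    children: list = field(default_factory=list)
    uses: set = field(default_factory=set)     # boxes referenced via correlation


def subtree(box):
    """Yield the box and everything below it."""
    yield box
    for child in box.children:
        yield from subtree(child)


def decorrelate(box, ancestors=frozenset(), log=None):
    """Depth-first walk recording the feed/absorb decisions for each box."""
    log = [] if log is None else log
    scope = ancestors | {box.name}
    below = [d for child in box.children for d in subtree(child)]
    # Feed: a descendant is correlated to this box or to one of its ancestors.
    if any(d.uses & scope for d in below):
        log.append(("feed", box.name))
    # Absorb: this box (directly or through a descendant) is correlated to an ancestor.
    if any(d.uses & ancestors for d in [box] + below):
        log.append((f"absorb[{box.kind}]", box.name))
    for child in box.children:
        decorrelate(child, scope, log)
    return log


box3 = Box("3", "SPJ", uses={"1"})
box2 = Box("2", "AGG", children=[box3])
box1 = Box("1", "SPJ", children=[box2])

print(decorrelate(box1))
# [('feed', '1'), ('feed', '2'), ('absorb[AGG]', '2'), ('absorb[SPJ]', '3')]
```

The log tags each absorb with the box kind because the absorb step differs for aggregate and SPJ boxes.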

10 Removing Correlation (Contd…)
- We first visit box (1)
  - It has a descendant box that is correlated to it, so perform the feed
  - Box (1) is not correlated to an ancestor box, so there is no absorb
- Let us see how the feed for box (1) is performed

11 Feed for (1)
- Check whether there is any condition on the “correlation” column in box (1)
  - If yes, push the selection condition before box (1): see figure (b)
- Create another box that removes duplicate values of the correlation column: the magic box in figure (c)
- Create 2 boxes as in figure (d)
  - The DCO box takes the above values as input; box (3) now depends on this box
  - The C1 box takes the output of the DCO box, is correlated to box (1) and performs the equi-join
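The first two feed steps can be approximated in plain SQL, run here through Python's `sqlite3` module. The sample rows, the view names `supp` and `magic`, and the filter `building <> 'C'` are all hypothetical; the filter merely stands in for whatever local selection box (1) carries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (name TEXT, building TEXT);
    INSERT INTO dept VALUES ('db', 'A'), ('os', 'A'), ('ai', 'B'), ('hr', 'C');

    -- Step 1: a (hypothetical) local selection pushed below box (1).
    CREATE VIEW supp AS SELECT name, building FROM dept WHERE building <> 'C';

    -- Step 2: the magic box keeps each correlation value exactly once.
    CREATE VIEW magic AS SELECT DISTINCT building FROM supp;
""")

magic = [r[0] for r in conn.execute("SELECT building FROM magic ORDER BY building")]

# Step 3 (simplified): joining the bindings back on the correlation column,
# as the C1 box does, keeps every row that survived the selection.
joined = conn.execute("""
    SELECT s.name, m.building
    FROM supp s JOIN magic m ON s.building = m.building
    ORDER BY s.name
""").fetchall()

print(magic)   # ['A', 'B']
print(joined)  # [('ai', 'B'), ('db', 'A'), ('os', 'A')]
```

Because `magic` holds each building once, any subquery evaluated from it runs once per distinct correlation value rather than once per department row.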

12 Pushing the Selection Condition
[Figure: boxes (1) and (2)]

13 Removing Duplicates
[Figure: boxes (1) and (2)]

14 Removing the correlation between (1) and (3)
[Figure: boxes (1) and (2)]

15 Decorrelating Box (2)
- Box (3) is correlated to the parent DCO box of (2), so perform the feed: fig (b)
  - Push select conditions: none here
  - Remove duplicates: none here
  - Create a DCO box and a C1 box as before
- Box (2) is correlated to its parent DCO box, so perform the absorb: fig (c)
  - For an aggregate operator, absorb consists of a GROUP BY followed by a left outer join (LOJ)
- In this case we end up with an unnecessary C1 box. Remove it: fig (d)
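Why the absorb for an aggregate box needs a left outer join can be seen on a small example, again using `sqlite3`. The tables, rows and CTE names (`magic`, `counts`, `dco`) are assumptions for illustration, and the query simply returns the per-department count rather than reproducing the Example 2 predicate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (name TEXT, building TEXT);
    CREATE TABLE emp  (name TEXT, building TEXT);
    INSERT INTO dept VALUES ('db', 'A'), ('os', 'A'), ('hr', 'C');
    INSERT INTO emp  VALUES ('ann', 'A'), ('bob', 'A'), ('cy', 'B');
""")

# Correlated form: the COUNT subquery runs once per dept row.
correlated = """
    SELECT d.name, (SELECT COUNT(*) FROM emp e WHERE e.building = d.building)
    FROM dept d ORDER BY d.name
"""

# Decorrelated form: group by the correlation column, then LOJ with the bindings
# so that buildings without employees still get a count of 0.
decorrelated = """
    WITH magic AS (SELECT DISTINCT building FROM dept),
         counts AS (SELECT e.building, COUNT(*) AS n
                    FROM emp e JOIN magic m ON e.building = m.building
                    GROUP BY e.building),
         dco AS (SELECT m.building, COALESCE(c.n, 0) AS n
                 FROM magic m LEFT OUTER JOIN counts c ON m.building = c.building)
    SELECT d.name, dco.n
    FROM dept d JOIN dco ON d.building = dco.building
    ORDER BY d.name
"""

expected = conn.execute(correlated).fetchall()
rewritten = conn.execute(decorrelated).fetchall()
# Replacing the LOJ with an inner join silently drops the empty building.
lossy = conn.execute(decorrelated.replace("LEFT OUTER JOIN", "JOIN")).fetchall()

print(expected)   # [('db', 2), ('hr', 0), ('os', 2)]
print(rewritten)  # [('db', 2), ('hr', 0), ('os', 2)]
print(lossy)      # [('db', 2), ('os', 2)]
```

The `lossy` variant shows what goes wrong with a plain join: department `hr` disappears because building `C` has no employees and therefore no group.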

16 Starting point for box (2)
[Figure: boxes (2) and (3)]

17 Feed for Box (2)
[Figure: boxes (2) and (3)]

18 Absorb for box (2)
[Figure: boxes (2) and (3)]

19 Remove unnecessary C1 box
[Figure: boxes (2) and (3)]

20 Decorrelating Box (3)
- No descendant box is correlated to box (3) or to its ancestors, so there is no feed
- Box (3) is correlated to its parent DCO box, so perform the absorb: fig (b)
  - Absorb for an SPJ box means simply removing the correlation and feeding the box directly as an input to the SPJ box
- Remove the unnecessary Q8 input to the DCO box: fig (c)
- Remove the unnecessary DCO box: fig (d)
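The SPJ absorb can be illustrated the same way: the table of correlation bindings becomes an ordinary input of the SPJ box, and the correlation predicate becomes a join predicate. The tables, rows and the alias `m` are hypothetical, and the sketch only compares the two evaluation strategies on sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (name TEXT, building TEXT);
    CREATE TABLE emp  (name TEXT, building TEXT);
    INSERT INTO dept VALUES ('db', 'A'), ('os', 'A'), ('hr', 'C');
    INSERT INTO emp  VALUES ('ann', 'A'), ('bob', 'A'), ('cy', 'B');
""")

# Correlated form: the SPJ box is re-evaluated once per binding.
bindings = [r[0] for r in conn.execute(
    "SELECT DISTINCT building FROM dept ORDER BY building")]
per_binding = []
for b in bindings:
    per_binding.extend(conn.execute(
        "SELECT ?, e.name FROM emp e WHERE e.building = ? ORDER BY e.name",
        (b, b),
    ).fetchall())

# After absorb: the bindings are joined in directly, with no correlation left.
absorbed = conn.execute("""
    SELECT m.building, e.name
    FROM (SELECT DISTINCT building FROM dept) AS m
    JOIN emp e ON e.building = m.building
    ORDER BY m.building, e.name
""").fetchall()

print(per_binding)  # [('A', 'ann'), ('A', 'bob')]
print(absorbed)     # [('A', 'ann'), ('A', 'bob')]
```

Both forms return the same rows, but the absorbed form is a single set-oriented join that the optimizer can reorder freely.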

21 Starting point for box (3)
[Figure: box (3)]

22 Absorb for box (3)
[Figure: box (3)]

23 Remove unnecessary Q8 input to DCO box
[Figure: box (3)]

24 Remove unnecessary DCO box
[Figure: box (3)]

