CS4432: Database Systems II, Logical Plan Rewriting

1 CS4432: Database Systems II, Logical Plan Rewriting

2 Query processing
SQL query → parse → parse tree → convert → logical query plan → apply laws → “improved” l.q.p. → estimate result sizes (using statistics) → l.q.p. + sizes → consider physical plans → {P1, P2, …} → estimate costs → {(P1,C1), (P2,C2), …} → pick best → Pi → execute → answer

3 Query in SQL → Query plan in algebra (logical) → Other query plan in algebra (logical)

4 Query plan 1 (in relational algebra)
π_{B,D} [ σ_{R.A=“c” ∧ S.E=2 ∧ R.C=S.C} (R × S) ]

5 Query plan 2 (in relational algebra)
π_{B,D} [ σ_{R.A=“c”}(R) ⋈ σ_{S.E=2}(S) ], with the natural join on R.C = S.C
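The two plans can be compared on a toy instance. The sketch below assumes schemas R(A,B,C) and S(C,D,E) and a few made-up tuples (the slides give neither). It computes the same answer both ways and records how many intermediate tuples each plan builds.

```python
# Hypothetical sample data: the slides do not give relation instances.
R = [{"A": "c", "B": 1, "C": 10},
     {"A": "d", "B": 2, "C": 10},
     {"A": "c", "B": 3, "C": 20}]
S = [{"C": 10, "D": "x", "E": 2},
     {"C": 20, "D": "y", "E": 3},
     {"C": 10, "D": "z", "E": 2}]

def plan1(R, S):
    # Plan 1: cross product first, then one big selection, then projection.
    product = [(r, s) for r in R for s in S]
    selected = [(r, s) for r, s in product
                if r["A"] == "c" and s["E"] == 2 and r["C"] == s["C"]]
    return sorted((r["B"], s["D"]) for r, s in selected), len(product)

def plan2(R, S):
    # Plan 2: select on each input first, then join on C, then project.
    r_sel = [r for r in R if r["A"] == "c"]
    s_sel = [s for s in S if s["E"] == 2]
    joined = [(r, s) for r in r_sel for s in s_sel if r["C"] == s["C"]]
    return sorted((r["B"], s["D"]) for r, s in joined), len(joined)

answer1, size1 = plan1(R, S)   # size1 = tuples in the cross product
answer2, size2 = plan2(R, S)   # size2 = tuples in the join result
print(answer1 == answer2, size1, size2)
```

On this data both plans return the same (B, D) pairs, but plan 1 materializes all 9 pairs of the cross product while plan 2 only ever builds the 2 joined tuples.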

6 Relational algebra optimization
What are transformation rules?
- They preserve equivalence
What are good transformations?
- They reduce query execution costs

7 Rules: Natural join rewriting
R ⋈ S = S ⋈ R
(R ⋈ S) ⋈ T = R ⋈ (S ⋈ T)
Can also write as trees, e.g.: [figure: join trees for (R ⋈ S) ⋈ T and R ⋈ (S ⋈ T)]

8 Rules: Other binary operators?
R ⋈ S = S ⋈ R
(R ⋈ S) ⋈ T = R ⋈ (S ⋈ T)
What about: Cross product? Condition join? Union? Intersection? Difference?

9 Note: [figure: two join trees over R, S, and T]

10 Rules: Natural joins & cross products & union
R ⋈ S = S ⋈ R
(R ⋈ S) ⋈ T = R ⋈ (S ⋈ T)
R × S = S × R
(R × S) × T = R × (S × T)
R ∪ S = S ∪ R
R ∪ (S ∪ T) = (R ∪ S) ∪ T
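A quick way to see the join laws at work is to check them on small data. The sketch below assumes relations are lists of dicts keyed by attribute name. The relations and the helper names `natural_join` and `as_set` are made up for illustration.

```python
def natural_join(r, s):
    """Natural join of two relations given as lists of dicts."""
    out = []
    for a in r:
        for b in s:
            shared = a.keys() & b.keys()
            if all(a[k] == b[k] for k in shared):
                out.append({**a, **b})
    return out

def as_set(rel):
    # Each tuple becomes a frozenset of (attribute, value) pairs,
    # so the comparison ignores column order and tuple order.
    return {frozenset(t.items()) for t in rel}

# Hypothetical relations R(A,B), S(B,C), T(C,D).
R = [{"A": 1, "B": 2}, {"A": 3, "B": 4}]
S = [{"B": 2, "C": 5}, {"B": 4, "C": 6}]
T = [{"C": 5, "D": 7}, {"C": 9, "D": 8}]

commutes = as_set(natural_join(R, S)) == as_set(natural_join(S, R))
associates = (as_set(natural_join(natural_join(R, S), T))
              == as_set(natural_join(R, natural_join(S, T))))
print(commutes, associates)
```

A check on one instance is of course not a proof. It only illustrates what "equivalent plans" means operationally.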

11 Rules: Selects
σ_{p1 ∧ p2}(R) = σ_{p1}[σ_{p2}(R)]
σ_{p1 ∨ p2}(R) = [σ_{p1}(R)] ∪ [σ_{p2}(R)]

12 Bags vs. Sets
R = {a,a,b,b,b,c}
S = {b,b,c,c,d}
What about union, R ∪ S = ?
Option 1 (SUM): R ∪ S = {a,a,b,b,b,b,b,c,c,c,d}
Option 2 (MAX): R ∪ S = {a,a,b,b,b,c,c,d}
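The two union options map directly onto Python's `collections.Counter`, where `+` adds multiplicities and `|` takes the maximum. This is just one convenient way to model bags, not something the slides prescribe.

```python
from collections import Counter

# Bags from the slide, modeled as element -> multiplicity.
R = Counter("aabbbc")
S = Counter("bbccd")

sum_union = R + S   # Option 1: multiplicities are added
max_union = R | S   # Option 2: multiplicities take the maximum

print(sorted(sum_union.elements()))
print(sorted(max_union.elements()))
```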

13 Which option makes this rule work? σ_{p1 ∨ p2}(R) = σ_{p1}(R) ∪ σ_{p2}(R)
Example: R = {a,a,b,b,b,c}; P1 satisfied by a, b; P2 satisfied by b, c
Let us try MAX():
σ_{p1 ∨ p2}(R) = {a,a,b,b,b,c}
σ_{p1}(R) = {a,a,b,b,b}
σ_{p2}(R) = {b,b,b,c}
σ_{p1}(R) ∪ σ_{p2}(R) = {a,a,b,b,b,c}

14 Which option makes this rule work? σ_{p1 ∨ p2}(R) = σ_{p1}(R) ∪ σ_{p2}(R)
Example: R = {a,a,b,b,b,c}; P1 satisfied by a, b; P2 satisfied by b, c
What about SUM()?
σ_{p1 ∨ p2}(R) = {a,a,b,b,b,c}
σ_{p1}(R) = {a,a,b,b,b}
σ_{p2}(R) = {b,b,b,c}
σ_{p1}(R) ∪ σ_{p2}(R) = {a,a,b,b,b,b,b,b,c}

15 Which option makes this rule work? σ_{p1 ∧ p2}(R) = σ_{p1}[σ_{p2}(R)]
Example: R = {a,a,b,b,b,c}; P1 satisfied by a, b; P2 satisfied by b, c
What about MAX versus SUM?

16 Option 2 (MAX) makes this rule work: σ_{p1 ∨ p2}(R) = σ_{p1}(R) ∪ σ_{p2}(R)
Example: R = {a,a,b,b,b,c}; P1 satisfied by a, b; P2 satisfied by b, c
σ_{p1 ∨ p2}(R) = {a,a,b,b,b,c}
σ_{p1}(R) = {a,a,b,b,b}
σ_{p2}(R) = {b,b,b,c}
σ_{p1}(R) ∪ σ_{p2}(R) = {a,a,b,b,b,c}
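The select rules can be replayed on the example bag. In this sketch `select` is a hypothetical helper for bag selection, and P1 and P2 are encoded as membership tests on the element letters.

```python
from collections import Counter

R = Counter("aabbbc")

def select(bag, pred):
    # Bag selection: keep each qualifying element with its multiplicity.
    return Counter({x: n for x, n in bag.items() if pred(x)})

p1 = lambda x: x in "ab"   # P1 satisfied by a, b
p2 = lambda x: x in "bc"   # P2 satisfied by b, c

lhs_or = select(R, lambda x: p1(x) or p2(x))
rhs_or_max = select(R, p1) | select(R, p2)   # MAX union
rhs_or_sum = select(R, p1) + select(R, p2)   # SUM union

lhs_and = select(R, lambda x: p1(x) and p2(x))
rhs_and = select(select(R, p2), p1)          # no union involved

print(lhs_or == rhs_or_max, lhs_or == rhs_or_sum, lhs_and == rhs_and)
```

The disjunction rule holds with MAX but not with SUM, because b is counted by both selections. The conjunction rule involves no union at all, so it is unaffected by the choice.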

17 Yet another example! Senators(…), Reps(…)
T1 = π_{yr,state}(Senators); T2 = π_{yr,state}(Reps)
T1: Yr  State      T2: Yr  State
    97  CA             99  CA
    99  CA             99  CA
    98  AZ             98  CA
Union? The “SUM” option makes more sense!
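The same `Counter` model (an illustrative choice, not the slides') shows why SUM is the natural reading here: each row stands for one legislator, and MAX silently drops one of them.

```python
from collections import Counter

# Rows of T1 and T2 as (yr, state) pairs, taken from the slide.
T1 = Counter([(97, "CA"), (99, "CA"), (98, "AZ")])
T2 = Counter([(99, "CA"), (99, "CA"), (98, "CA")])

sum_union = T1 + T2
max_union = T1 | T2

# Total number of rows under each option.
print(sum(sum_union.values()), sum(max_union.values()))
```

With SUM all six rows survive, including three copies of (99, CA). With MAX only five remain.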

18 Executive decision
-> Use the “SUM” option for bag unions
-> CAREFUL! Some rules cannot be used for bags

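19 Rules: Project
Let X = set of attributes, Y = set of attributes, XY = X ∪ Y
π_{xy}(R) = π_x[π_y(R)]
Caution: the right side is defined only when X ⊆ Y, and then it equals π_x(R), so the rule as stated holds only for X = Y. The cascade that holds in general is π_x(R) = π_x[π_{xy}(R)].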

20 Rules: σ and ⋈ combined
Let p = predicate with only R attributes, q = predicate with only S attributes, m = predicate with both R and S attributes
σ_p(R ⋈ S) = [σ_p(R)] ⋈ S
σ_q(R ⋈ S) = R ⋈ [σ_q(S)]

21 Rules: σ and ⋈ combined
σ_{p ∧ q}(R ⋈ S) = ?
Rule can be derived!

22 Derivation for rule:
σ_{p ∧ q}(R ⋈ S) = σ_p[σ_q(R ⋈ S)]
= σ_p[R ⋈ σ_q(S)]
= [σ_p(R)] ⋈ [σ_q(S)]
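Each step of the derivation can be evaluated on sample data to confirm that the intermediate expressions agree. The relations R(A,C) and S(C,E), the predicates, and the helper names below are assumptions made for the sketch.

```python
def natural_join(r, s):
    # Pair up tuples that agree on all shared attributes.
    return [{**a, **b} for a in r for b in s
            if all(a[k] == b[k] for k in a.keys() & b.keys())]

def select(rel, pred):
    return [t for t in rel if pred(t)]

def as_set(rel):
    return {frozenset(t.items()) for t in rel}

# Hypothetical data: R(A, C) and S(C, E).
R = [{"A": 1, "C": 10}, {"A": 2, "C": 10}, {"A": 1, "C": 20}]
S = [{"C": 10, "E": 2}, {"C": 20, "E": 2}, {"C": 10, "E": 3}]

p = lambda t: t["A"] == 1   # uses only R attributes
q = lambda t: t["E"] == 2   # uses only S attributes

step0 = select(natural_join(R, S), lambda t: p(t) and q(t))
step1 = select(select(natural_join(R, S), q), p)      # split the conjunction
step2 = select(natural_join(R, select(S, q)), p)      # push q down to S
step3 = natural_join(select(R, p), select(S, q))      # push p down to R

all_equal = as_set(step0) == as_set(step1) == as_set(step2) == as_set(step3)
print(all_equal, len(step3))
```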

23 Rules: σ and ⋈ combined (continued)
More rules can be derived:
σ_{p ∧ q}(R ⋈ S) =
σ_{p ∧ q ∧ m}(R ⋈ S) =
σ_{p ∨ q}(R ⋈ S) =

24 We did one, do the others on your own:
σ_{p ∧ q}(R ⋈ S) = [σ_p(R)] ⋈ [σ_q(S)]
σ_{p ∧ q ∧ m}(R ⋈ S) = σ_m[(σ_p R) ⋈ (σ_q S)]
σ_{p ∨ q}(R ⋈ S) = [(σ_p R) ⋈ S] ∪ [R ⋈ (σ_q S)]
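The remaining two rules can be spot-checked the same way. This sketch uses set semantics throughout. Under the SUM bag union the disjunction rule would double-count any tuple that satisfies both p and q. The data and predicates are made up.

```python
def natural_join(r, s):
    return [{**a, **b} for a in r for b in s
            if all(a[k] == b[k] for k in a.keys() & b.keys())]

def select(rel, pred):
    return [t for t in rel if pred(t)]

def as_set(rel):
    return {frozenset(t.items()) for t in rel}

# Hypothetical data: R(A, C) and S(C, E).
R = [{"A": 1, "C": 10}, {"A": 2, "C": 10}, {"A": 5, "C": 20}]
S = [{"C": 10, "E": 2}, {"C": 20, "E": 2}, {"C": 10, "E": 3}]

p = lambda t: t["A"] <= 2        # only R attributes
q = lambda t: t["E"] == 2        # only S attributes
m = lambda t: t["A"] < t["E"]    # attributes of both R and S

joined = natural_join(R, S)

# sigma_{p AND q AND m}: push p and q down, apply m after the join.
rule_pqm = (as_set(select(joined, lambda t: p(t) and q(t) and m(t)))
            == as_set(select(natural_join(select(R, p), select(S, q)), m)))

# sigma_{p OR q}: union (as sets) of the two one-sided pushdowns.
rule_or = (as_set(select(joined, lambda t: p(t) or q(t)))
           == (as_set(natural_join(select(R, p), S))
               | as_set(natural_join(R, select(S, q)))))

print(rule_pqm, rule_or)
```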

25 Rules: π and σ combined
Let x = subset of R attributes, z = attributes in predicate P (subset of R attributes)
π_x[σ_p(R)] = π_x{σ_p[π_{xz}(R)]}

26 Rules: π and ⋈ combined
Let x = subset of R attributes, y = subset of S attributes, z = intersection of R, S attributes
π_{xy}(R ⋈ S) = π_{xy}{[π_{xz}(R)] ⋈ [π_{yz}(S)]}

27 π_{xy}{σ_p(R ⋈ S)} = π_{xy}{σ_p[π_{xz'}(R) ⋈ π_{yz'}(S)]}
z' = z ∪ {attributes used in P}
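Pushing projections below a join only works if the join attributes, and any attributes the predicate needs, are kept. The sketch below checks both forms on hypothetical relations R(A,B,C) and S(C,D,E) with x = {A}, y = {D}, z = {C}. The helper `project` keeps only the listed attributes a tuple actually has, which is an assumption about how π_{yz'}(S) should be read when z' contains R-only attributes.

```python
def natural_join(r, s):
    return [{**a, **b} for a in r for b in s
            if all(a[k] == b[k] for k in a.keys() & b.keys())]

def select(rel, pred):
    return [t for t in rel if pred(t)]

def project(rel, attrs):
    # Keep only the listed attributes that the tuple actually has.
    return [{k: t[k] for k in attrs if k in t} for t in rel]

def as_set(rel):
    return {frozenset(t.items()) for t in rel}

# Hypothetical data.
R = [{"A": 1, "B": "cat", "C": 10},
     {"A": 2, "B": "dog", "C": 10},
     {"A": 3, "B": "cat", "C": 20}]
S = [{"C": 10, "D": "x", "E": 2}, {"C": 20, "D": "y", "E": 3}]

x, y, z = ["A"], ["D"], ["C"]

# Projection through a join: keep x (or y) plus the join attributes z.
lhs = project(natural_join(R, S), x + y)
rhs = project(natural_join(project(R, x + z), project(S, y + z)), x + y)

# With a selection on B in between, B must be kept as well: z' = z + {B}.
p = lambda t: t["B"] == "cat"
z2 = z + ["B"]
lhs_sel = project(select(natural_join(R, S), p), x + y)
rhs_sel = project(
    select(natural_join(project(R, x + z2), project(S, y + z2)), p), x + y)

print(as_set(lhs) == as_set(rhs), as_set(lhs_sel) == as_set(rhs_sel))
```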

28 Rules: σ combined with ∪ and −
σ_p(R ∪ S) = σ_p(R) ∪ σ_p(S)
σ_p(R − S) = σ_p(R) − S = σ_p(R) − σ_p(S)
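These rules can be tried on the earlier example bags. The sketch again models bags with `collections.Counter`, using `+` for the SUM union and `-` for bag difference (counts are subtracted and never go below zero). The predicate is made up.

```python
from collections import Counter

R = Counter("aabbbc")
S = Counter("bbccd")

def select(bag, pred):
    return Counter({x: n for x, n in bag.items() if pred(x)})

p = lambda x: x in "ab"   # hypothetical predicate

# Selection distributes over (SUM) union.
union_ok = select(R + S, p) == select(R, p) + select(S, p)

# Selection over difference: all three forms agree.
diff_a = select(R - S, p)
diff_b = select(R, p) - S
diff_c = select(R, p) - select(S, p)
diff_ok = diff_a == diff_b == diff_c

print(union_ok, diff_ok, sorted(diff_a.elements()))
```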

29 Which are “good” transformations?

30 Conventional wisdom: do projects early
Example: relation R(A,B,C,D,E), predicate P: (A=3) ∧ (B=“cat”)
π_E{σ_p(R)} vs. π_E{σ_p{π_{ABE}(R)}}

31 What if we have A, B indexes?
Look up B = “cat” in the B index and A = 3 in the A index, then intersect the pointers to get pointers to matching tuples!
But then it is better to do the projection later!
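The index idea can be sketched with in-memory dictionaries standing in for the two indexes. The tuples and the names `index_A` and `index_B` are hypothetical, and row positions play the role of pointers.

```python
from collections import defaultdict

# Hypothetical instance of R(A,B,C,D,E).
R = [{"A": 3, "B": "cat", "C": 0, "D": 0, "E": 10},
     {"A": 3, "B": "dog", "C": 0, "D": 0, "E": 20},
     {"A": 4, "B": "cat", "C": 0, "D": 0, "E": 30},
     {"A": 3, "B": "cat", "C": 0, "D": 0, "E": 40}]

# Stand-ins for the indexes on A and B: value -> set of row positions.
index_A = defaultdict(set)
index_B = defaultdict(set)
for pos, t in enumerate(R):
    index_A[t["A"]].add(pos)
    index_B[t["B"]].add(pos)

# Intersect the pointer sets, fetch only matching tuples, project last.
matching = index_A[3] & index_B["cat"]
result = sorted(R[pos]["E"] for pos in matching)
print(sorted(matching), result)
```

Projecting onto A, B, E first would mean scanning all of R before any index could help, which is why the projection is left for last here.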

32 Which are “good” transformations?
σ_{p1 ∧ p2}(R) → σ_{p1}[σ_{p2}(R)]
σ_p(R ⋈ S) → [σ_p(R)] ⋈ S
R ⋈ S → S ⋈ R
π_x[σ_p(R)] → π_x{σ_p[π_{xz}(R)]}

33 Bottom line:
Some heuristics:
- Early selection is usually good
No transformation is always good
Rule application defines a search space
- Need cost criteria to make a decision

34 In textbook: more transformations
Chapter 16.2: more rewrite rules
Other operations, such as duplicate elimination, etc.

