Query Compilation: Parsing, Logical Query Plan
Source: our textbook, slides by Hector Garcia-Molina.

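2. [Figure: query compilation pipeline, reconstructed from the diagram labels]
SQL query → parse → parse tree → convert → logical query plan → apply laws → "improved" l.q.p. → estimate result sizes (input: statistics) → l.q.p. + sizes → consider physical plans → {P1, P2, ...} → estimate costs → {(P1,C1), (P2,C2), ...} → pick best → Pi → execute → answer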

3. Outline
• Convert SQL query to a parse tree
  - semantic checking: attributes, relation names, types
• Convert to a logical query plan (relational algebra expression)
  - deal with subqueries
• Improve the logical query plan
  - use algebraic transformations
  - group together certain operators
  - evaluate logical plan based on estimated size of relations
• Convert to a physical query plan
  - search the space of physical plans
  - choose order of operations
  - complete the physical query plan

4. Parsing
• Goal is to convert a text string containing a query into a parse tree data structure:
  - leaves form the text string (broken into lexical elements)
  - internal nodes are syntactic categories
• Uses standard algorithmic techniques from compilers
  - given a grammar for the language (e.g., SQL), process the string and build the tree
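The first step, breaking the text string into lexical elements, can be sketched as follows. This is only an illustration: the token pattern `TOKEN_RE` and the function `tokenize` are hypothetical names covering a tiny SQL subset, not the grammar assumed by the slides.

```python
import re

# Hypothetical token pattern for a tiny SQL subset:
# quoted strings, identifiers/keywords, and a few punctuation/comparison symbols.
TOKEN_RE = re.compile(r"\s*('[^']*'|[A-Za-z_][A-Za-z0-9_]*|<=|>=|<>|[(),;=<>*])")

def tokenize(sql: str) -> list[str]:
    """Break a query string into lexical elements (the leaves of the parse tree)."""
    sql = sql.rstrip()
    tokens, pos = [], 0
    while pos < len(sql):
        m = TOKEN_RE.match(sql, pos)
        if m is None:
            raise ValueError(f"unexpected character at position {pos}: {sql[pos]!r}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

query = ("SELECT title FROM StarsIn WHERE starName IN "
         "(SELECT name FROM MovieStar WHERE birthdate LIKE '%1960');")
tokens = tokenize(query)
print(tokens)
```

A parser would then consume this token list according to the grammar and build the internal nodes (syntactic categories) above these leaves.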

5. Example: SQL query
    SELECT title
    FROM StarsIn
    WHERE starName IN (
        SELECT name
        FROM MovieStar
        WHERE birthdate LIKE '%1960'
    );
(Find the movies with stars born in 1960.)
Assume we have a simplified grammar for SQL.

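6. Example: Parse Tree
[Figure: parse tree for the query on slide 5. Leaves, left to right: SELECT title FROM StarsIn WHERE starName IN ( SELECT name FROM MovieStar WHERE birthDate LIKE '%1960' ). The internal nodes (syntactic categories) are not recoverable from the transcript.]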

7. The Preprocessor
• replaces each reference to a view with a parse (sub)tree that describes the view (i.e., a query)
• does semantic checking:
  - are relations and views mentioned in the schema?
  - are attributes mentioned in the current scope?
  - are attribute types correct?

9. Convert Parse Tree to Relational Algebra
• The complete algorithm depends on the specific grammar, which determines the forms of the parse trees
• Here we give a flavor of the approach

10. Conversion
• Suppose there are no subqueries.
• SELECT att-list FROM rel-list WHERE cond
  is converted into
  PROJ_{att-list}(SELECT_{cond}(PRODUCT(rel-list))), or
  $\pi_{\text{att-list}}(\sigma_{\text{cond}}(\times(\text{rel-list})))$
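The conversion rule can be sketched in a few lines. The node classes (`Rel`, `Product`, `Select`, `Project`) and the helpers `sfw_to_algebra` and `show` are hypothetical names chosen for this illustration, and the condition is kept as an uninterpreted string.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rel:
    name: str

@dataclass(frozen=True)
class Product:
    left: object
    right: object

@dataclass(frozen=True)
class Select:
    cond: str
    child: object

@dataclass(frozen=True)
class Project:
    attrs: tuple
    child: object

def sfw_to_algebra(att_list, rel_list, cond):
    """Build PROJ_att-list(SELECT_cond(PRODUCT(rel-list))) for a block without subqueries."""
    plan = Rel(rel_list[0])
    for name in rel_list[1:]:
        plan = Product(plan, Rel(name))  # left-deep product over the FROM list
    return Project(tuple(att_list), Select(cond, plan))

def show(node):
    """Render an expression tree as a one-line string."""
    if isinstance(node, Rel):
        return node.name
    if isinstance(node, Product):
        return f"({show(node.left)} x {show(node.right)})"
    if isinstance(node, Select):
        return f"SELECT[{node.cond}]({show(node.child)})"
    return f"PROJ[{', '.join(node.attrs)}]({show(node.child)})"

plan = sfw_to_algebra(["movieTitle"], ["StarsIn", "MovieStar"],
                      "starName = name AND birthdate LIKE '%1960'")
print(show(plan))
```

The left-deep product is just one way to combine the FROM list; since product is associative and commutative, any grouping gives an equivalent plan.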

11. Example query without subqueries
    SELECT movieTitle
    FROM StarsIn, MovieStar
    WHERE starName = name AND birthdate LIKE '%1960';
[Figure: parse tree for this query. Leaves, left to right: SELECT movieTitle FROM StarsIn , MovieStar WHERE starName = name AND birthdate LIKE '%1960'.]

12. Equivalent Algebraic Expression Tree
    $\pi_{\text{movieTitle}}$
      $\sigma_{\text{starName = name AND birthdate LIKE '\%1960'}}$
        $\times$
          StarsIn
          MovieStar

13. Handling Subqueries
• Recall the (equivalent) query:
    SELECT title
    FROM StarsIn
    WHERE starName IN (
        SELECT name
        FROM MovieStar
        WHERE birthdate LIKE '%1960'
    );
• Use an intermediate format called two-argument selection

14. Example: Two-Argument Selection
    $\pi_{\text{title}}$
      $\sigma$ (two-argument selection)
        StarsIn
        condition: starName IN
          $\pi_{\text{name}}$
            $\sigma_{\text{birthdate LIKE '\%1960'}}$
              MovieStar

15. Converting Two-Argument Selection
• To continue the conversion, we need rules for replacing two-argument selection with a relational algebra expression
• Different rules apply depending on the nature of the subquery
• Here we show an example for the IN operator and an uncorrelated subquery (the subquery computes a relation independent of the tuple being tested)

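16. Rules for IN
• The two-argument selection $\sigma$ with arguments $R$ and the condition "$t$ IN $S$" is replaced by
    $\sigma_C(R \times \delta(S))$
• $C$ is the condition that equates attributes in $t$ with corresponding attributes in $S$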
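A small check of the IN rule on toy data may help show why the duplicate elimination is there. The relation contents are made up for illustration, and the rewritten plan keeps only the attributes of R (in the full plan, the projection above the selection does this).

```python
from collections import Counter

# Toy data, made up for illustration.
stars_in = [("Movie A", "Alice"), ("Movie B", "Bob"), ("Movie C", "Alice")]  # (title, starName)
movie_star = [("Alice", "5/1/1960"), ("Alice", "5/1/1960"), ("Bob", "3/2/1971")]  # (name, birthdate), duplicate on purpose

# Subquery S: names of stars whose birthdate ends in 1960 (a bag, so "Alice" occurs twice).
s = [name for (name, birthdate) in movie_star if birthdate.endswith("1960")]

# Two-argument selection: keep each tuple t of R whose starName is IN S.
two_arg = [t for t in stars_in if t[1] in s]

# Rule: sigma_C(R x delta(S)) with C: starName = name.
delta_s = list(dict.fromkeys(s))  # duplicate elimination, preserving order
rewritten = [t for t in stars_in for name in delta_s if t[1] == name]

# Without delta, duplicates in S multiply the matching tuples of R.
no_delta = [t for t in stars_in for name in s if t[1] == name]

print(Counter(two_arg) == Counter(rewritten), len(two_arg), len(no_delta))
```

With the duplicate in S, the plan without $\delta$ returns each matching tuple twice, which is not what IN means.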

17. Example: Logical Query Plan
    $\pi_{\text{title}}$
      $\sigma_{\text{starName = name}}$
        $\times$
          StarsIn
          $\delta$
            $\pi_{\text{name}}$
              $\sigma_{\text{birthdate LIKE '\%1960'}}$
                MovieStar

18. What if Subquery is Correlated?
• Example is when the subquery refers to the current tuple of the outer scope that is being tested
• More complicated to deal with, since the subquery cannot be translated in isolation
• Need to incorporate external attributes in the translation
• Some details are in the textbook

20. Improving the Logical Query Plan
• There are numerous algebraic laws concerning relational algebra operations
• By applying them to a logical query plan judiciously, we can get an equivalent query plan that can be executed more efficiently
• Next we'll survey some of these laws

21. Associative and Commutative Operations
• product
• natural join
• set and bag union
• set and bag intersection
• associative: (A op B) op C = A op (B op C)
• commutative: A op B = B op A

22. Laws Involving Selection
• Selections usually reduce the size of the relation
• Usually good to do selections early, i.e., "push them down the tree"
• Also can be helpful to break up a complex selection into parts

23. Selection Splitting
• $\sigma_{C_1 \text{ AND } C_2}(R) = \sigma_{C_1}(\sigma_{C_2}(R))$
• $\sigma_{C_1 \text{ OR } C_2}(R) = \sigma_{C_1}(R) \cup_{\text{set}} \sigma_{C_2}(R)$, if R is a set
• $\sigma_{C_1}(\sigma_{C_2}(R)) = \sigma_{C_2}(\sigma_{C_1}(R))$
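These laws can be checked on small examples. The sketch below models a bag as a `collections.Counter` mapping each tuple to its multiplicity. The helper names and data are assumptions made for this illustration.

```python
from collections import Counter

def select(pred, bag):
    """Bag selection: keep each tuple that satisfies pred, with its multiplicity."""
    return Counter({t: n for t, n in bag.items() if pred(t)})

def union_set(a, b):
    """Set union: every tuple present in either input appears exactly once."""
    return Counter({t: 1 for t in set(a) | set(b)})

c1 = lambda t: t[0] > 1
c2 = lambda t: t[1] == "x"
both = lambda t: c1(t) and c2(t)
either = lambda t: c1(t) or c2(t)

r_set = Counter({(1, "x"): 1, (2, "x"): 1, (2, "y"): 1, (0, "y"): 1})  # no duplicates
r_bag = Counter({(2, "x"): 2, (0, "y"): 1})                            # (2, "x") occurs twice

and_law = select(both, r_bag) == select(c1, select(c2, r_bag))
commute_law = select(c1, select(c2, r_bag)) == select(c2, select(c1, r_bag))
or_law_on_set = select(either, r_set) == union_set(select(c1, r_set), select(c2, r_set))
or_law_on_bag = select(either, r_bag) == union_set(select(c1, r_bag), select(c2, r_bag))
print(and_law, commute_law, or_law_on_set, or_law_on_bag)  # True True True False
```

The last result shows why the OR law carries the condition "if R is a set": on a bag, the set union collapses the two copies of `(2, "x")` into one.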

24. Selection and Binary Operators
• Must push selection to both arguments:
  - $\sigma_C(R \cup S) = \sigma_C(R) \cup \sigma_C(S)$
• Must push to first arg, optional for 2nd:
  - $\sigma_C(R - S) = \sigma_C(R) - S$
  - $\sigma_C(R - S) = \sigma_C(R) - \sigma_C(S)$
• Push to at least one arg with all attributes mentioned in C:
  - product, natural join, theta join, intersection
  - e.g., $\sigma_C(R \times S) = \sigma_C(R) \times S$, if R has all the atts in C
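A quick check of the union and difference laws, again with bags modeled as `Counter` objects (`+` is bag union, `-` is bag difference). The data and helper are made up for illustration. The last line shows what goes wrong if the selection is pushed only to the second argument of a difference.

```python
from collections import Counter

def select(pred, bag):
    """Bag selection: keep each tuple that satisfies pred, with its multiplicity."""
    return Counter({t: n for t, n in bag.items() if pred(t)})

r = Counter({1: 2, 2: 1, 5: 3})
s = Counter({1: 1, 5: 1, 7: 2})
c = lambda t: t > 1

# Union: the selection must go to both arguments.
union_ok = select(c, r + s) == select(c, r) + select(c, s)

# Difference: push to the first argument, optionally also to the second.
diff_first = select(c, r - s) == select(c, r) - s
diff_both = select(c, r - s) == select(c, r) - select(c, s)

# Not a law: pushing only to the second argument changes the result.
diff_second_only = select(c, r - s) == r - select(c, s)
print(union_ok, diff_first, diff_both, diff_second_only)  # True True True False
```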

25. Pushing Selection Up the Tree
• Suppose we have relations
  - StarsIn(title, year, starName)
  - Movie(title, year, len, inColor, studioName)
• and a view
    CREATE VIEW MoviesOf1996 AS
        SELECT * FROM Movie WHERE year = 1996;
• and the query
    SELECT starName, studioName
    FROM MoviesOf1996 NATURAL JOIN StarsIn;

26. The Straightforward Tree
    $\pi_{\text{starName, studioName}}$
      $\bowtie$
        $\sigma_{\text{year}=1996}$
          Movie
        StarsIn
Remember the rule $\sigma_C(R \bowtie S) = \sigma_C(R) \bowtie S$?

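27. The Improved Logical Query Plan
• Straightforward plan: $\pi_{\text{starName, studioName}}(\sigma_{\text{year}=1996}(\text{Movie}) \bowtie \text{StarsIn})$
• Push selection up tree: $\pi_{\text{starName, studioName}}(\sigma_{\text{year}=1996}(\text{Movie} \bowtie \text{StarsIn}))$
• Push selection down tree (to both arguments, since both have the attribute year): $\pi_{\text{starName, studioName}}(\sigma_{\text{year}=1996}(\text{Movie}) \bowtie \sigma_{\text{year}=1996}(\text{StarsIn}))$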
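The three plans can be compared on toy data. In this sketch, Movie is simplified to (title, year, studioName), the rows are made up, and the helper names are hypothetical. The point is only that all three plans return the same answer, while the last one filters StarsIn before the join.

```python
# Toy data, made up for illustration. Movie is simplified to (title, year, studioName).
movie = [("Movie A", 1996, "Studio X"), ("Movie B", 1996, "Studio Y"), ("Movie C", 1995, "Studio X")]
stars_in = [("Movie A", 1996, "Alice"), ("Movie B", 1996, "Bob"), ("Movie C", 1995, "Carol")]

def select_1996(rows):
    """sigma_{year=1996}; year is the second attribute in every relation here."""
    return [row for row in rows if row[1] == 1996]

def natural_join(movies, stars):
    """Join on the shared attributes (title, year) -> (title, year, studioName, starName)."""
    return [(t, y, studio, star)
            for (t, y, studio) in movies
            for (t2, y2, star) in stars
            if (t, y) == (t2, y2)]

def project(rows):
    """pi_{starName, studioName}, sorted so the plans can be compared."""
    return sorted((star, studio) for (_, _, studio, star) in rows)

straightforward = project(natural_join(select_1996(movie), stars_in))
pushed_up = project(select_1996(natural_join(movie, stars_in)))
pushed_down = project(natural_join(select_1996(movie), select_1996(stars_in)))
print(straightforward == pushed_up == pushed_down)  # True
```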

28. Laws Involving Projections
• Consider adding in additional projections
• Adding a projection lower in the tree can improve performance, since often tuple size is reduced
  - Usually not as helpful as pushing selections down
• If a projection is inserted in the tree, then none of the eliminated attributes can appear above this point in the tree
  - Ex: $\pi_L(R \times S) = \pi_L(\pi_M(R) \times \pi_N(S))$, where M (resp. N) is all attributes of R (resp. S) that are used in L
• Another example: $\pi_L(R \cup_{\text{bag}} S) = \pi_L(R) \cup_{\text{bag}} \pi_L(S)$
  - But watch out for set union!
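The warning about set union can be made concrete with a two-tuple counterexample. Bags are modeled as `Counter` objects and projection is the bag projection (it keeps duplicates). The helpers and data are assumptions for this sketch.

```python
from collections import Counter

def project(idxs, bag):
    """Bag projection onto the attribute positions in idxs (duplicates are kept)."""
    out = Counter()
    for t, n in bag.items():
        out[tuple(t[i] for i in idxs)] += n
    return out

def union_set(a, b):
    """Set union: every tuple present in either input appears exactly once."""
    return Counter({t: 1 for t in set(a) | set(b)})

r = Counter({(1, 2): 1})
s = Counter({(1, 3): 1})
L = (0,)  # project onto the first attribute

# Bag union: projection can be pushed below the union.
bag_law = project(L, r + s) == project(L, r) + project(L, s)

# Set union: (1, 2) and (1, 3) are distinct before projection, identical after it.
set_lhs = project(L, union_set(r, s))              # (1,) with multiplicity 2
set_rhs = union_set(project(L, r), project(L, s))  # (1,) with multiplicity 1
print(bag_law, set_lhs == set_rhs)  # True False
```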

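29. Push Projection Below Selection?
• Rule: $\pi_L(\sigma_C(R)) = \pi_L(\sigma_C(\pi_M(R)))$, where M is all attributes used by L or C
• But is it a good idea?
    SELECT starName FROM StarsIn WHERE movieYear = 1996;
• Plan without the extra projection: $\pi_{\text{starName}}(\sigma_{\text{movieYear}=1996}(\text{StarsIn}))$
• Plan with the extra projection: $\pi_{\text{starName}}(\sigma_{\text{movieYear}=1996}(\pi_{\text{starName, movieYear}}(\text{StarsIn})))$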

30. Joins and Products
• Recall from the definitions of relational algebra:
  - $R \bowtie_C S = \sigma_C(R \times S)$ (theta join)
  - $R \bowtie S = \pi_L(\sigma_C(R \times S))$ (natural join), where C equates same-name attributes in R and S, and L includes all attributes of R and S, dropping duplicates
• To improve a logical query plan, replace a product followed by a selection with a join
  - Join algorithms are usually faster than doing product followed by selection
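To illustrate why a join can beat product followed by selection, the sketch below computes the same equijoin two ways. The hash join shown here is one common join algorithm, chosen for this example and not taken from the slides. The data and function names are made up.

```python
from collections import defaultdict
from itertools import product

# Toy data, made up for illustration.
stars_in = [("Movie A", "Alice"), ("Movie B", "Bob"), ("Movie C", "Alice")]  # (title, starName)
movie_star = [("Alice", "5/1/1960"), ("Bob", "3/2/1971"), ("Carol", "7/7/1980")]  # (name, birthdate)

def product_then_select(r, s, cond):
    """sigma_C(R x S): builds all |R| * |S| pairs, then filters them."""
    return [a + b for a, b in product(r, s) if cond(a, b)]

def hash_join(r, s, r_key, s_key):
    """Equijoin on r[r_key] = s[s_key]: index S once, then probe it for each tuple of R."""
    index = defaultdict(list)
    for b in s:
        index[b[s_key]].append(b)
    return [a + b for a in r for b in index.get(a[r_key], [])]

slow = product_then_select(stars_in, movie_star, lambda a, b: a[1] == b[0])
fast = hash_join(stars_in, movie_star, r_key=1, s_key=0)
print(sorted(slow) == sorted(fast))  # True
```

Both functions return the same tuples, but the first examines every pair while the second touches each input tuple once plus the matching pairs.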

31. Duplicate Elimination
• Moving $\delta$ down the tree is potentially beneficial as it can reduce the size of intermediate relations
• $\delta$ can be eliminated if its argument has no duplicates:
  - a relation with a primary key
  - a relation resulting from a grouping operator
• Legal to push $\delta$ through product, join, selection, and bag intersection
  - Ex: $\delta(R \times S) = \delta(R) \times \delta(S)$
• Cannot push $\delta$ through bag union, bag difference or projection
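The legal and illegal cases can be checked on small bags, modeled here as `Counter` objects. The helper names and data are assumptions for this sketch.

```python
from collections import Counter

def delta(bag):
    """Duplicate elimination: every tuple keeps multiplicity 1."""
    return Counter({t: 1 for t in bag})

def bag_product(a, b):
    """Bag product: concatenate tuples, multiply multiplicities."""
    return Counter({ta + tb: na * nb for ta, na in a.items() for tb, nb in b.items()})

def project(idxs, bag):
    """Bag projection onto the attribute positions in idxs (duplicates are kept)."""
    out = Counter()
    for t, n in bag.items():
        out[tuple(t[i] for i in idxs)] += n
    return out

r = Counter({(1,): 2, (2,): 1})
s = Counter({(1,): 1, (3,): 3})
q = Counter({(1, 2): 1, (1, 3): 1})

through_product = delta(bag_product(r, s)) == bag_product(delta(r), delta(s))
through_bag_union = delta(r + s) == delta(r) + delta(s)                 # (1,) ends up twice on the right
through_projection = delta(project((0,), q)) == project((0,), delta(q))  # right side keeps two copies of (1,)
print(through_product, through_bag_union, through_projection)  # True False False
```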

32. Grouping and Aggregation
• Since $\gamma$ produces no duplicates: $\delta(\gamma_L(R)) = \gamma_L(R)$
• Get rid of useless attributes: $\gamma_L(R) = \gamma_L(\pi_M(R))$, where M contains all attributes in L
• If L contains only MIN and MAX: $\gamma_L(R) = \gamma_L(\delta(R))$
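The MIN/MAX law can be checked with a tiny grouping helper. In this sketch, rows are (year, birthdate) pairs with the birthdate simplified to a birth year, and `group_by` is a hypothetical helper. SUM is included only as a contrast, to show an aggregate that is sensitive to duplicates.

```python
def group_by(rows, agg):
    """gamma_{year, agg(birthdate)}: group (year, birthdate) rows by year and aggregate."""
    groups = {}
    for year, birthdate in rows:
        groups.setdefault(year, []).append(birthdate)
    return {year: agg(vals) for year, vals in groups.items()}

# Toy rows, made up for illustration; (1996, 1960) occurs twice.
rows = [(1996, 1960), (1996, 1960), (1996, 1975), (1997, 1950)]
distinct = list(dict.fromkeys(rows))  # delta(R)

max_same = group_by(rows, max) == group_by(distinct, max)
min_same = group_by(rows, min) == group_by(distinct, min)
sum_same = group_by(rows, sum) == group_by(distinct, sum)
print(max_same, min_same, sum_same)  # True True False
```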

33. Example
• Suppose we have the relations
  - MovieStar(name, addr, gender, birthdate)
  - StarsIn(title, year, starName)
• and we want to find the youngest star to appear in a movie for each year:
    SELECT year, MAX(birthdate)
    FROM MovieStar, StarsIn
    WHERE name = starName
    GROUP BY year;
• Initial logical query plan:
    $\gamma_{\text{year, MAX(birthdate)}}$
      $\sigma_{\text{name = starName}}$
        $\times$
          MovieStar
          StarsIn

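34. Example cont'd
[Figure: three successive plans, reconstructed from the diagram labels; the exact placement of $\delta$ is inferred from the laws on slides 31 and 32.]
• Plan 1: $\gamma_{\text{year, MAX(birthdate)}}(\sigma_{\text{name = starName}}(\text{MovieStar} \times \text{StarsIn}))$
• Plan 2: $\gamma_{\text{year, MAX(birthdate)}}(\pi_{\text{year, birthdate}}(\delta(\text{MovieStar} \bowtie_{\text{name = starName}} \text{StarsIn})))$
• Plan 3: $\gamma_{\text{year, MAX(birthdate)}}(\pi_{\text{year, birthdate}}(\delta(\pi_{\text{birthdate, name}}(\text{MovieStar})) \bowtie_{\text{name = starName}} \delta(\pi_{\text{year, starName}}(\text{StarsIn}))))$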

35. Summary of LQP Improvements
• Selections:
  - push down tree as far as possible
  - if condition is an AND, split and push separately
  - sometimes need to push up before pushing down
• Projections:
  - can be pushed down
  - new ones can be added (but be careful)
• Duplicate elimination:
  - sometimes can be removed
• Selection/product combinations:
  - can sometimes be replaced with join

37. Grouping Assoc/Comm Operators
• Group together adjacent joins, adjacent unions, and adjacent intersections as siblings in the tree
• Sets up the logical QP for future optimization when the physical QP is constructed: determine best order for doing a sequence of joins (or unions or intersections)
[Figure: a tree of nested binary unions over A, B, C, D, E, F is redrawn as a single union node with children A, B, C, D, E, F.]
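Grouping adjacent operators of the same kind amounts to flattening the tree. A minimal sketch, assuming a hypothetical `Op` node with a `kind` and a tuple of children, with leaves given as relation names:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Op:
    kind: str        # e.g. "union", "join", "intersect"
    children: tuple  # Op nodes or relation names (str)

def flatten(node):
    """Make nested operators of the same kind siblings under one node."""
    if isinstance(node, str):
        return node
    kids = []
    for child in map(flatten, node.children):
        if isinstance(child, Op) and child.kind == node.kind:
            kids.extend(child.children)  # absorb the child's operands
        else:
            kids.append(child)
    return Op(node.kind, tuple(kids))

# The exact shape of the nested tree here is an assumption for the example.
nested = Op("union", (Op("union", (Op("union", ("A", "B")), "C")), "D", "E", "F"))
flat = flatten(nested)

mixed = flatten(Op("join", (Op("union", ("A", "B")), Op("join", ("C", "D")))))
print(flat)
print(mixed)
```

Only operators of the same kind are merged: in `mixed`, the inner join is absorbed into the outer join, while the union stays a separate child.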

38. Evaluating Logical Query Plans
• The transformations discussed so far intuitively seem like good ideas
• But how can we evaluate them more scientifically?
• Estimate size of relations, also helpful in evaluating physical query plans
• Coming up next…
