Chapters 15 and 16: Query Processing (Slides by Hector Garcia-Molina)

Presentation transcript:

Chapters 15-16a1 Chapters 15 and 16: Query Processing (Slides by Hector Garcia-Molina)

Chapters 15-16a2 Query Processing: Q → Query Plan. Focus: Relational System

Chapters 15-16a3 Example
Select B,D
From R,S
Where R.A = “c” AND S.E = 2 AND R.C = S.C

Chapters 15-16a4
R:  A  B  C        S:  C   D  E
    a  1  10           10  x  2
    b  1  20           20  y  2
    c  2  10           30  z  2
    d  2  35           40  x  1
    e  3  45           50  y  3
Answer:  B  D
         2  x

Chapters 15-16a5 How do we execute query? One idea:
- Do Cartesian product
- Select tuples
- Do projection

Chapters 15-16a6 R X S:
R.A  R.B  R.C  S.C  S.D  S.E
a    1    10   10   x    2
a    1    10   20   y    2
...
c    2    10   10   x    2    Bingo! Got one...
...

Chapters 15-16a7 Relational Algebra - can be used to describe plans...
Ex: Plan I
    π_{B,D}
       |
    σ_{R.A=“c” ∧ S.E=2 ∧ R.C=S.C}
       |
       X
      / \
     R   S
OR: π_{B,D} [ σ_{R.A=“c” ∧ S.E=2 ∧ R.C=S.C} (R X S) ]
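A minimal sketch of Plan I in Python, assuming the example relations are small enough to hold in memory as lists of tuples. The names `R`, `S`, and `plan_one` are illustrative and do not come from the slides; the data is the example instance of R(A,B,C) and S(C,D,E).

```python
from itertools import product

# Example relations from the slides: R(A, B, C) and S(C, D, E).
R = [("a", 1, 10), ("b", 1, 20), ("c", 2, 10), ("d", 2, 35), ("e", 3, 45)]
S = [(10, "x", 2), (20, "y", 2), (30, "z", 2), (40, "x", 1), (50, "y", 3)]

def plan_one(r_rows, s_rows):
    """Plan I: Cartesian product, then selection, then projection."""
    result = []
    for (a, b, rc), (sc, d, e) in product(r_rows, s_rows):  # R x S
        if a == "c" and e == 2 and rc == sc:                # selection
            result.append((b, d))                           # projection on B, D
    return result

print(plan_one(R, S))  # prints [(2, 'x')]
```

Every one of the 5 × 5 = 25 pairs is examined before the selection discards all but one, which is why this plan is usually the most expensive.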

Chapters 15-16a8 Another idea: Plan II
         π_{B,D}
            |
            ⋈   (natural join)
          /   \
σ_{R.A=“c”}   σ_{S.E=2}
     |            |
     R            S

Chapters 15-16a9
R: A  B  C     σ(R): A  B  C     σ(S): C   D  E     S: C   D  E
   a  1  10          c  2  10          10  x  2        10  x  2
   b  1  20                            20  y  2        20  y  2
   c  2  10                            30  z  2        30  z  2
   d  2  35                                            40  x  1
   e  3  45                                            50  y  3
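Plan II can be sketched the same way, again assuming in-memory lists of tuples; `plan_two` is a hypothetical name used only for illustration. The selections run first, so the join only sees one R tuple and three S tuples.

```python
# Example relations from the slides: R(A, B, C) and S(C, D, E).
R = [("a", 1, 10), ("b", 1, 20), ("c", 2, 10), ("d", 2, 35), ("e", 3, 45)]
S = [(10, "x", 2), (20, "y", 2), (30, "z", 2), (40, "x", 1), (50, "y", 3)]

def plan_two(r_rows, s_rows):
    """Plan II: select on each input, natural join on C, then project B, D."""
    r_sel = [t for t in r_rows if t[0] == "c"]   # sigma_{R.A = "c"}(R)
    s_sel = [t for t in s_rows if t[2] == 2]     # sigma_{S.E = 2}(S)
    joined = [
        (a, b, c, d, e)
        for (a, b, c) in r_sel
        for (sc, d, e) in s_sel
        if c == sc                               # natural join on C
    ]
    return [(b, d) for (_a, b, _c, d, _e) in joined]

print(plan_two(R, S))  # prints [(2, 'x')]
```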

Chapters 15-16a10 Plan III: Use R.A and S.C Indexes
(1) Use R.A index to select R tuples with R.A = “c”
(2) For each R.C value found, use S.C index to find matching tuples
(3) Eliminate S tuples with S.E ≠ 2
(4) Join matching R,S tuples, project B,D attributes and place in result
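The four steps of Plan III might look as follows in Python, with a dictionary standing in for each index. This is a sketch under that assumption; a real system would use B-trees or hash indexes on disk, and `build_index` and `plan_three` are hypothetical helper names.

```python
from collections import defaultdict

# Example relations from the slides: R(A, B, C) and S(C, D, E).
R = [("a", 1, 10), ("b", 1, 20), ("c", 2, 10), ("d", 2, 35), ("e", 3, 45)]
S = [(10, "x", 2), (20, "y", 2), (30, "z", 2), (40, "x", 1), (50, "y", 3)]

def build_index(rows, pos):
    """Map each value of the attribute at position `pos` to the rows holding it."""
    index = defaultdict(list)
    for row in rows:
        index[row[pos]].append(row)
    return index

def plan_three(r_index_a, s_index_c):
    result = []
    for (_a, b, c) in r_index_a.get("c", []):      # (1) R.A index lookup
        for (_c, d, e) in s_index_c.get(c, []):    # (2) S.C index lookup
            if e != 2:                             # (3) eliminate S.E != 2
                continue
            result.append((b, d))                  # (4) join and project B, D
    return result

I1 = build_index(R, 0)  # index on R.A
I2 = build_index(S, 0)  # index on S.C
print(plan_three(I1, I2))  # prints [(2, 'x')]
```

Only the tuples reachable through the two indexes are touched, so neither relation is scanned in full.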

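Chapters 15-16a11 [Figure: Plan III on R and S. Index I1 on R.A is probed with =“c”; index I2 on S.C is probed with the R.C value found; each matching S tuple is checked for E = 2 before output; then the next tuple is fetched.]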

Chapters 15-16a12 Overview of Query Optimization

Chapters 15-16a13
SQL query
  → parse → parse tree
  → convert → logical query plan
  → apply laws → “improved” l.q.p.
  → estimate result sizes (using statistics) → l.q.p. + sizes
  → consider physical plans → {P1,P2,…..}
  → estimate costs → {(P1,C1),(P2,C2)...}
  → pick best → Pi
  → execute → answer

Chapters 15-16a14 Example: SQL query
SELECT title
FROM StarsIn
WHERE starName IN (
    SELECT name
    FROM MovieStar
    WHERE birthdate LIKE ‘%1960’
);
(Find the movies with stars born in 1960)

Chapters 15-16a15 Example: Parse Tree
[Figure: parse tree for the query SELECT title FROM StarsIn WHERE starName IN ( SELECT name FROM MovieStar WHERE birthDate LIKE ‘%1960’ )]

Chapters 15-16a16 Example: Generating Relational Algebra
π_title ( σ ( StarsIn, starName IN π_name ( σ_{birthdate LIKE ‘%1960’} (MovieStar) ) ) )
An expression using a two-argument σ, midway between a parse tree and relational algebra

Chapters 15-16a17 Example: Logical Query Plan
π_title ( σ_{starName=name} ( StarsIn X π_name ( σ_{birthdate LIKE ‘%1960’} (MovieStar) ) ) )
Applying the rule for IN conditions

Chapters 15-16a18 Example: Improved Logical Query Plan
π_title ( StarsIn ⋈_{starName=name} π_name ( σ_{birthdate LIKE ‘%1960’} (MovieStar) ) )
Fig. 7.20: An improvement on fig
Question: Push project to StarsIn?

Chapters 15-16a19 Example: Estimate Result Sizes
Need expected size
[Figure: the logical plan tree over StarsIn and MovieStar]

Chapters 15-16a20 Example: One Physical Plan
        Hash join             (Parameters: join order, memory size, ...)
       /         \
  SEQ scan     index scan     (Parameters: Select Condition, ...)
     |             |
  StarsIn      MovieStar

Chapters 15-16a21 Example: Estimate costs
L.Q.P → physical plans P1, P2, …, Pn with costs C1, C2, …, Cn
Pick best!

Chapters 15-16a22 Textbook outline
Chapter 5: Algebra for queries [bags vs sets]
- Select, project, join, …. [project list a,a+b->x,…]
- Duplicate elimination, grouping, sorting
Chapter 15:
- Physical operators
  - Scan, sort, …
- Implementing operators and estimating their cost

Chapters 15-16a23 Chapter 16:
- Parsing
- Algebraic laws
- Parse tree -> logical query plan
- Estimating result sizes
- Cost based optimization

Chapters 15-16a24 Query Optimization
- Relational algebra level
- Detailed query plan level
  - Estimate costs
    - without indexes
    - with indexes
  - Generate and compare plans

Chapters 15-16a25 Relational algebra optimization Transformation rules (preserve equivalence) What are good transformations?

Chapters 15-16a26 Rules: Natural joins & cross products & union
R ⋈ S = S ⋈ R
(R ⋈ S) ⋈ T = R ⋈ (S ⋈ T)
R x S = S x R
(R x S) x T = R x (S x T)
R U S = S U R
R U (S U T) = (R U S) U T

Chapters 15-16a27 Note: Can also write as trees, e.g.:
      ⋈                ⋈
     / \              / \
    ⋈   T     =      R   ⋈
   / \                  / \
  R   S                S   T

Chapters 15-16a28 Rules: Selects
σ_{p1∧p2} (R) = σ_{p1} [ σ_{p2} (R) ]
σ_{p1∨p2} (R) = [ σ_{p1} (R) ] U [ σ_{p2} (R) ]
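Both selection rules can be checked on the example relation R(A,B,C) with a short Python sketch. The predicates `p1` and `p2` are arbitrary choices made here for illustration, and the union rule is checked under set semantics; with bags, a tuple satisfying both predicates would be counted twice by a plain bag union.

```python
# Example relation from the slides: R(A, B, C).
R = [("a", 1, 10), ("b", 1, 20), ("c", 2, 10), ("d", 2, 35), ("e", 3, 45)]

def select(pred, rows):
    """sigma_pred(rows): keep the rows that satisfy the predicate."""
    return [row for row in rows if pred(row)]

def p1(row):
    return row[1] == 1   # illustrative predicate: B = 1

def p2(row):
    return row[2] == 10  # illustrative predicate: C = 10

# sigma_{p1 AND p2}(R) = sigma_{p1}[sigma_{p2}(R)]
conj_direct = select(lambda row: p1(row) and p2(row), R)
conj_cascade = select(p1, select(p2, R))

# sigma_{p1 OR p2}(R) = sigma_{p1}(R) U sigma_{p2}(R), compared as sets
disj_direct = set(select(lambda row: p1(row) or p2(row), R))
disj_union = set(select(p1, R)) | set(select(p2, R))

print(conj_direct == conj_cascade, disj_direct == disj_union)  # prints True True
```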

Chapters 15-16a29 Rules: σ + ⋈ combined
Let p = predicate with only R attribs
    q = predicate with only S attribs
σ_p (R ⋈ S) = [ σ_p (R) ] ⋈ S
σ_q (R ⋈ S) = R ⋈ [ σ_q (S) ]
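The first combined rule can be sketched in Python by representing tuples as dictionaries keyed by attribute name. `natural_join` is a hypothetical helper written for this illustration, and the predicate `p` (A = "c") is an arbitrary predicate over R attributes only.

```python
# Example relations from the slides, as dicts keyed by attribute name.
R = [{"A": a, "B": b, "C": c} for a, b, c in
     [("a", 1, 10), ("b", 1, 20), ("c", 2, 10), ("d", 2, 35), ("e", 3, 45)]]
S = [{"C": c, "D": d, "E": e} for c, d, e in
     [(10, "x", 2), (20, "y", 2), (30, "z", 2), (40, "x", 1), (50, "y", 3)]]

def natural_join(r_rows, s_rows):
    """Join two lists of dicts on every attribute name they share."""
    out = []
    for r in r_rows:
        for s in s_rows:
            common = r.keys() & s.keys()
            if all(r[k] == s[k] for k in common):
                out.append({**r, **s})
    return out

def p(row):
    return row["A"] == "c"  # predicate over R attributes only

late = [t for t in natural_join(R, S) if p(t)]        # sigma_p(R join S)
early = natural_join([t for t in R if p(t)], S)       # [sigma_p(R)] join S

print(late == early)  # prints True
```

Both sides give the same result, but the second form joins one R tuple instead of five.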

Chapters 15-16a30 Which are “good” transformations?
σ_{p1∧p2} (R) → σ_{p1} [ σ_{p2} (R) ]
σ_p (R ⋈ S) → [ σ_p (R) ] ⋈ S
R ⋈ S → S ⋈ R

Chapters 15-16a31 Conventional wisdom: do projects early
Example: R(A,B,C,D,E)  x = {E}  P: (A=3) ∧ (B=“cat”)
π_x { σ_p (R) }  vs.  π_E { σ_p { π_{ABE} (R) } }

Chapters 15-16a32 But: What if we have A, B indexes?
Use the B = “cat” index and the A = 3 index, then intersect pointers to get pointers to matching tuples
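A rough Python sketch of the pointer-intersection idea follows. The rows of R(A,B,C,D,E) are hypothetical (the slides give no instance for this relation), dictionary keys stand in for tuple pointers, and `build_pointer_index` and `select_e` are illustrative names.

```python
from collections import defaultdict

# Hypothetical rows of R(A, B, C, D, E); each dict key acts as a tuple pointer.
ROWS = {
    0: (3, "cat", 7, 1.5, "e0"),
    1: (3, "dog", 8, 2.5, "e1"),
    2: (4, "cat", 9, 3.5, "e2"),
    3: (3, "cat", 6, 4.5, "e3"),
}

def build_pointer_index(rows, pos):
    """Map each value of the attribute at position `pos` to a set of pointers."""
    index = defaultdict(set)
    for pointer, row in rows.items():
        index[row[pos]].add(pointer)
    return index

A_INDEX = build_pointer_index(ROWS, 0)
B_INDEX = build_pointer_index(ROWS, 1)

def select_e(a_value, b_value):
    """Answer pi_E(sigma_{A=a AND B=b}(R)) by intersecting index pointers."""
    pointers = A_INDEX.get(a_value, set()) & B_INDEX.get(b_value, set())
    return [ROWS[p][4] for p in sorted(pointers)]  # fetch only matching tuples

print(select_e(3, "cat"))  # prints ['e0', 'e3']
```

If the early projection π_{ABE}(R) were applied first, the indexes on the stored relation could no longer be used this way, which is the point of the slide's objection.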

Chapters 15-16a33 Bottom line: No transformation is always good. Usually good: early selections.

Chapters 15-16a34 In textbook: more transformations
- Eliminate common sub-expressions
- Other operations: duplicate elimination

Chapters 15-16a35 Outline - Query Processing
- Relational algebra level
  - transformations
  - good transformations
- Detailed query plan level
  - estimate costs
  - generate and compare plans

Chapters 15-16a36 Estimating cost of query plan
(1) Estimating size of results
(2) Estimating # of IOs

Chapters 15-16a37 Estimating result size
Keep statistics for relation R:
- T(R): # tuples in R
- S(R): # of bytes in each R tuple
- B(R): # of blocks to hold all R tuples
- V(R, A): # distinct values in R for attribute A

Chapters 15-16a38 Example
R: A: 20 byte string, B: 4 byte integer, C: 8 byte date, D: 5 byte string
A    B  C   D
cat  1  10  a
cat  1  20  b
dog  1  30  a
dog  1  40  c
bat  1  50  d
T(R) = 5  S(R) = 37
V(R,A) = 3  V(R,B) = 1  V(R,C) = 5  V(R,D) = 4
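The statistics in this example can be recomputed with a few lines of Python. This is only a sketch over an in-memory copy of the example relation; the helper names are made up here, and B(R) is left out because it depends on a block size the slide does not give.

```python
# Example relation R(A, B, C, D) and its attribute widths in bytes, from the slide.
ATTRS = ("A", "B", "C", "D")
WIDTHS = {"A": 20, "B": 4, "C": 8, "D": 5}
R = [
    ("cat", 1, 10, "a"),
    ("cat", 1, 20, "b"),
    ("dog", 1, 30, "a"),
    ("dog", 1, 40, "c"),
    ("bat", 1, 50, "d"),
]

def tuple_count(rows):
    """T(R): number of tuples."""
    return len(rows)

def tuple_size(widths):
    """S(R): bytes per tuple, the sum of the attribute widths."""
    return sum(widths.values())

def distinct_values(rows, attr):
    """V(R, attr): number of distinct values of one attribute."""
    pos = ATTRS.index(attr)
    return len({row[pos] for row in rows})

print(tuple_count(R), tuple_size(WIDTHS))          # prints 5 37
print([distinct_values(R, a) for a in ATTRS])      # prints [3, 1, 5, 4]
```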

Chapters 15-16a39 Size estimates for W = R1 x R2
T(W) = T(R1) × T(R2)
S(W) = S(R1) + S(R2)

Chapters 15-16a40 Selection cardinality
SC(R,A) = average # records that satisfy equality condition on R.A
SC(R,A) = T(R) / V(R,A)  or  SC(R,A) = T(R) / DOM(R,A)

Chapters 15-16a41 What about W = σ_{z ≥ val} (R) ? T(W) = ?
Solution # 1: T(W) = T(R)/2
Solution # 2: T(W) = T(R)/3
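The selection estimates above reduce to simple divisions, sketched here in Python. The function names are hypothetical, and the reading of DOM(R,A) as the number of values in the domain of A is an assumption about the slide's notation. The sample numbers reuse T(R) = 5 and V(R,A) = 3 from the earlier example relation.

```python
def equality_estimate(t_r, v_r_a):
    """SC(R,A) = T(R) / V(R,A): values assumed uniform over those present in R."""
    return t_r / v_r_a

def equality_estimate_dom(t_r, dom_r_a):
    """SC(R,A) = T(R) / DOM(R,A): values assumed uniform over the whole domain."""
    return t_r / dom_r_a

def range_estimate(t_r, divisor=3):
    """T(W) for a range selection: T(R)/2 (Solution #1) or T(R)/3 (Solution #2)."""
    return t_r / divisor

print(equality_estimate(5, 3))   # about 1.67 tuples per A value
print(range_estimate(5, 2))      # prints 2.5
```

The range estimates are rules of thumb rather than derived quantities, so the divisor is left as a parameter.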

Chapters 15-16a42 Size estimate for W = R1 ⋈ R2
Let X = attributes of R1
    Y = attributes of R2
Case 1: X ∩ Y = ∅
Same as R1 x R2

Chapters 15-16a43 Case 2: W = R1 ⋈ R2, X ∩ Y = A
R1(A, B, C)   R2(A, D)
Assumption (“containment of value sets”):
V(R1,A) ≤ V(R2,A) ⇒ Every A value in R1 is in R2
V(R2,A) ≤ V(R1,A) ⇒ Every A value in R2 is in R1

Chapters 15-16a44 [A is common attribute]
If V(R1,A) ≤ V(R2,A): T(W) = T(R2) T(R1) / V(R2,A)
If V(R2,A) ≤ V(R1,A): T(W) = T(R2) T(R1) / V(R1,A)

Chapters 15-16a45 In general, for W = R1 ⋈ R2:
T(W) = T(R2) T(R1) / max{ V(R1,A), V(R2,A) }
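The general join estimate can be written as one small Python function. `estimate_join_size` is an illustrative name; passing `None` for the distinct-value counts is a convention chosen here to represent Case 1 (no common attribute).

```python
def estimate_join_size(t1, t2, v1=None, v2=None):
    """Estimate T(W) for W = R1 join R2.

    t1, t2: T(R1), T(R2).
    v1, v2: V(R1,A), V(R2,A) for the common attribute A,
            or None when the relations share no attribute (Case 1).
    """
    if v1 is None or v2 is None:
        return t1 * t2                 # Case 1: same as R1 x R2
    return t1 * t2 / max(v1, v2)       # Case 2: containment of value sets

print(estimate_join_size(1000, 2000, 100, 200))  # prints 10000.0
print(estimate_join_size(5, 5))                  # prints 25
```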

Chapters 15-16a46 Using similar ideas, we can estimate sizes of:
- π_{AB} (R) …..
- σ_{A=a ∧ B=b} (R) ….
- R ⋈ S with common attribs. A,B,C
- Union, intersection, diff, ….

Chapters 15-16a47 Example: Z = R1(A,B) ⋈ R2(B,C) ⋈ R3(C,D)
R1: T(R1) = 1000  V(R1,A) = 50   V(R1,B) = 100
R2: T(R2) = 2000  V(R2,B) = 200  V(R2,C) = 300
R3: T(R3) = 3000  V(R3,C) = 90   V(R3,D) = 500

Chapters 15-16a48 Partial Result: U = R1 ⋈ R2
T(U) = 1000 × 2000 / 200
V(U,A) = 50
V(U,B) = 100
V(U,C) = 300

Chapters 15-16a49 Z = U ⋈ R3
T(Z) = 1000 × 2000 × 3000 / (200 × 300)
V(Z,A) = 50
V(Z,B) = 100
V(Z,C) = 90
V(Z,D) = 500
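The three-way example can be replayed in Python by carrying T and the V values through each join. This is a sketch under two assumptions that match the numbers on the slides: the join attribute keeps the smaller V (containment of value sets), and every other attribute keeps its V unchanged. `join_stats` is a hypothetical helper, not a textbook routine.

```python
def join_stats(left, right):
    """Estimate statistics of a natural join.

    Each argument is {"T": tuple count, "V": {attribute: distinct values}}.
    """
    common = left["V"].keys() & right["V"].keys()
    t = left["T"] * right["T"]
    for attr in common:
        t = t / max(left["V"][attr], right["V"][attr])
    v = {}
    for attr in sorted(left["V"].keys() | right["V"].keys()):
        if attr in common:
            v[attr] = min(left["V"][attr], right["V"][attr])  # assumed: smaller set survives
        elif attr in left["V"]:
            v[attr] = left["V"][attr]                         # assumed: values preserved
        else:
            v[attr] = right["V"][attr]
    return {"T": t, "V": v}

R1 = {"T": 1000, "V": {"A": 50, "B": 100}}
R2 = {"T": 2000, "V": {"B": 200, "C": 300}}
R3 = {"T": 3000, "V": {"C": 90, "D": 500}}

U = join_stats(R1, R2)
Z = join_stats(U, R3)
print(U["T"], Z["T"])  # prints 10000.0 100000.0
print(Z["V"])          # prints {'A': 50, 'B': 100, 'C': 90, 'D': 500}
```

The estimate T(Z) = 100,000 follows from dividing by max{100, 200} = 200 for the join on B and by max{300, 90} = 300 for the join on C.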

Chapters 15-16a50 Summary
- Estimating size of results is an “art”
- Don’t forget: Statistics must be kept up to date… (cost?)

Chapters 15-16a51 Outline
- Estimating cost of query plan
  - Estimating size of results (done!)
  - Estimating # of IOs
- Generate and compare plans