CS 4432 logical query rewriting - lecture 15. CS4432: Database Systems II, Lecture #15: Logical Query Rewriting. Professor Elke A. Rundensteiner.

Presentation transcript:

Slide 1: CS4432: Database Systems II, Lecture #15: Logical Query Rewriting. Professor Elke A. Rundensteiner

Slide 2: Query in SQL → Query Plan in Algebra (logical) → Other Query Plan in Algebra (logical)

Slide 3: Query plan 1 (in relational algebra)
π_{B,D} [ σ_{R.A=“c” ∧ S.E=2 ∧ R.C=S.C} (R × S) ]

Slide 4: Query plan 2 (in relational algebra)
π_{B,D} [ σ_{R.A=“c”}(R) ⋈ σ_{S.E=2}(S) ], natural join on R.C = S.C
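
The two plans on slides 3 and 4 can be checked against each other on a small instance. The following is a minimal sketch, assuming schemas R(A, B, C) and S(C, D, E) and made-up sample tuples (neither is given on the slides); results are compared as bags of (B, D) pairs.

```python
from collections import Counter

# Hypothetical sample data; schemas R(A, B, C) and S(C, D, E) are assumed.
R = [{"A": "c", "B": 1, "C": 10},
     {"A": "x", "B": 2, "C": 10},
     {"A": "c", "B": 3, "C": 20}]
S = [{"C": 10, "D": "d1", "E": 2},
     {"C": 20, "D": "d2", "E": 5},
     {"C": 10, "D": "d3", "E": 2}]

def plan1(R, S):
    # Plan 1: cross product first, then one big selection, then projection.
    product = [(r, s) for r in R for s in S]
    selected = [(r, s) for r, s in product
                if r["A"] == "c" and s["E"] == 2 and r["C"] == s["C"]]
    return Counter((r["B"], s["D"]) for r, s in selected)

def plan2(R, S):
    # Plan 2: select on each input first, then join on C, then projection.
    r_sel = [r for r in R if r["A"] == "c"]
    s_sel = [s for s in S if s["E"] == 2]
    joined = [(r, s) for r in r_sel for s in s_sel if r["C"] == s["C"]]
    return Counter((r["B"], s["D"]) for r, s in joined)

result1 = plan1(R, S)
result2 = plan2(R, S)
print(result1 == result2)
```

Plan 1 materializes all nine pairs of the cross product before filtering, while plan 2 only pairs the tuples that survive the selections, which is the intuition behind preferring it.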

Slide 5: Relational algebra optimization
What are transformation rules?
– preserve equivalence
What are good transformations?
– reduce query execution costs

Slide 6: Rules: Natural join rewriting
R ⋈ S = S ⋈ R
(R ⋈ S) ⋈ T = R ⋈ (S ⋈ T)
Can also write as trees, e.g.: [figure: equivalent join trees over R, S, T]

Slide 7: Rules: Other binary operators?
R ⋈ S = S ⋈ R
(R ⋈ S) ⋈ T = R ⋈ (S ⋈ T)
What about: Cross product? Condition join? Union? Intersection? Difference?

Slide 8: Note: Carry attribute names in results, so order is not important. [figure: join trees over R, S, T]

Slide 9: Rules: Natural joins & cross products & union
R ⋈ S = S ⋈ R
(R ⋈ S) ⋈ T = R ⋈ (S ⋈ T)
R × S = S × R
(R × S) × T = R × (S × T)
R ∪ S = S ∪ R
R ∪ (S ∪ T) = (R ∪ S) ∪ T
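
Commutativity and associativity of the natural join can be spot-checked on a small example. This is a sketch only: the helper names `bag` and `natural_join` and the sample relations are illustrative assumptions, and tuples are modeled as dicts so that attribute names travel with the values (which is why column order does not matter).

```python
from collections import Counter

def bag(rows):
    # Order-insensitive view of a relation: a multiset of attribute/value sets.
    return Counter(frozenset(row.items()) for row in rows)

def natural_join(r, s):
    # Pair tuples that agree on every attribute name they share.
    out = []
    for t in r:
        for u in s:
            shared = t.keys() & u.keys()
            if all(t[a] == u[a] for a in shared):
                out.append({**t, **u})
    return out

# Hypothetical relations R(A, B), S(B, C), T(C, D).
R = [{"A": 1, "B": 2}, {"A": 3, "B": 4}]
S = [{"B": 2, "C": 5}, {"B": 2, "C": 6}]
T = [{"C": 5, "D": 7}, {"C": 6, "D": 8}]

commutative = bag(natural_join(R, S)) == bag(natural_join(S, R))
associative = (bag(natural_join(natural_join(R, S), T))
               == bag(natural_join(R, natural_join(S, T))))
print(commutative, associative)
```

A passing check on one instance is not a proof of the rules; it only illustrates what "equivalent" means for two differently shaped join trees.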

Slide 10: Rules: Selects
σ_{p1∧p2}(R) = σ_{p1}[σ_{p2}(R)]
σ_{p1∨p2}(R) = [σ_{p1}(R)] ∪ [σ_{p2}(R)]

Slide 11: Bags vs. Sets
R = {a,a,b,b,b,c}
S = {b,b,c,c,d}
What about union R ∪ S = ?
Option 1 (SUM): R ∪ S = {a,a,b,b,b,b,b,c,c,c,d}
Option 2 (MAX): R ∪ S = {a,a,b,b,b,c,c,d}

Slide 12: Which option makes this rule work? σ_{p1∨p2}(R) = σ_{p1}(R) ∪ σ_{p2}(R)
Let us try MAX():
Example: R = {a,a,b,b,b,c}; p1 satisfied by a,b; p2 satisfied by b,c
σ_{p1∨p2}(R) = {a,a,b,b,b,c}
σ_{p1}(R) = {a,a,b,b,b}
σ_{p2}(R) = {b,b,b,c}
σ_{p1}(R) ∪ σ_{p2}(R) = {a,a,b,b,b,c}

Slide 13: Which option makes this rule work? σ_{p1∨p2}(R) = σ_{p1}(R) ∪ σ_{p2}(R)
What about SUM()?
Example: R = {a,a,b,b,b,c}; p1 satisfied by a,b; p2 satisfied by b,c
σ_{p1∨p2}(R) = {a,a,b,b,b,c}
σ_{p1}(R) = {a,a,b,b,b}
σ_{p2}(R) = {b,b,b,c}
σ_{p1}(R) ∪ σ_{p2}(R) = {a,a,b,b,b,b,b,b,c}

Slide 14: Which option makes this rule work? σ_{p1∧p2}(R) = σ_{p1}[σ_{p2}(R)]
Example: R = {a,a,b,b,b,c}; p1 satisfied by a,b; p2 satisfied by b,c
MAX or SUM?

Slide 15: Option 2 (MAX) makes this rule work: σ_{p1∨p2}(R) = σ_{p1}(R) ∪ σ_{p2}(R)
Example: R = {a,a,b,b,b,c}; p1 satisfied by a,b; p2 satisfied by b,c
σ_{p1∨p2}(R) = {a,a,b,b,b,c}
σ_{p1}(R) = {a,a,b,b,b}
σ_{p2}(R) = {b,b,b,c}
σ_{p1}(R) ∪ σ_{p2}(R) = {a,a,b,b,b,c}
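
The bag examples from slides 11 to 15 can be replayed with Python's `collections.Counter`, whose `+` operator adds multiplicities (the SUM option) and whose `|` operator takes the maximum multiplicity (the MAX option). This is a sketch; the helper `select` is an illustrative name, not something from the lecture.

```python
from collections import Counter

R = Counter("aabbbc")   # {a,a,b,b,b,c}
S = Counter("bbccd")    # {b,b,c,c,d}

# Slide 11: the two candidate definitions of bag union.
r_union_s_sum = sorted((R + S).elements())
r_union_s_max = sorted((R | S).elements())

def select(pred, bag):
    # Bag selection: keep each qualifying element with its full multiplicity.
    return Counter({x: n for x, n in bag.items() if pred(x)})

def p1(x):
    return x in "ab"    # p1 satisfied by a, b

def p2(x):
    return x in "bc"    # p2 satisfied by b, c

# Slides 12 to 15: sigma_{p1 v p2}(R) versus sigma_p1(R) U sigma_p2(R).
lhs = select(lambda x: p1(x) or p2(x), R)
rhs_max = select(p1, R) | select(p2, R)
rhs_sum = select(p1, R) + select(p2, R)

print(lhs == rhs_max)   # rule holds under MAX
print(lhs == rhs_sum)   # rule fails under SUM: b appears six times
```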

Slide 16: Yet another example!
Senators (……)   Reps (……)
T1 = π_{yr,state} Senators; T2 = π_{yr,state} Reps

T1: Yr  State      T2: Yr  State
    97  CA             99  CA
    99  CA             99  CA
    98  AZ             98  CA

Union? “Sum” option makes more sense!
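
A short sketch of why SUM is the more natural choice here, assuming the flattened table on slide 16 reads as T1 = {(97, CA), (99, CA), (98, AZ)} and T2 = {(99, CA), (99, CA), (98, CA)}. Under SUM every projected tuple from either relation is still counted; under MAX some of them silently disappear.

```python
from collections import Counter

# Tuples as read from the slide's table (the row pairing is an assumption).
T1 = Counter([(97, "CA"), (99, "CA"), (98, "AZ")])
T2 = Counter([(99, "CA"), (99, "CA"), (98, "CA")])

union_sum = T1 + T2   # SUM option: multiplicities add up
union_max = T1 | T2   # MAX option: keep the larger multiplicity

count_sum = union_sum[(99, "CA")]
count_max = union_max[(99, "CA")]
total_sum = sum(union_sum.values())
total_max = sum(union_max.values())
print(count_sum, count_max, total_sum, total_max)
```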

Slide 17: Decision
-> In summary, we tend to use the “SUM” option for bag union.
-> Thus great care must be taken, as some rules cannot be used for bags!

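Slide 18: Rules: Project
Let: X = set of attributes, Y = set of attributes, XY = X ∪ Y
π_{XY}(R) = π_X[π_Y(R)] ?
Not valid in general: π_X[π_Y(R)] is only defined when X ⊆ Y, and then it equals π_X(R), not π_{XY}(R). The valid cascade is π_X(R) = π_X[π_{XY}(R)].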

Slide 19: Rules: σ + ⋈ combined
Let p = predicate with only R attributes
    q = predicate with only S attributes
    m = predicate with both R and S attributes
σ_p(R ⋈ S) = [σ_p(R)] ⋈ S
σ_q(R ⋈ S) = R ⋈ [σ_q(S)]

Slide 20: Rules: σ + ⋈ combined
σ_{p∧q}(R ⋈ S) = ?
Rule can be derived!

Slide 21: Derivation for rule:
σ_{p∧q}(R ⋈ S) = σ_p[σ_q(R ⋈ S)]
               = σ_p[R ⋈ σ_q(S)]
               = [σ_p(R)] ⋈ [σ_q(S)]

Slide 22: Rules: σ + ⋈ combined (continued)
More rules can be derived:
σ_{p∧q}(R ⋈ S) =
σ_{p∧q∧m}(R ⋈ S) =
σ_{p∨q}(R ⋈ S) =

Slide 23: We did one, do others on your own:
σ_{p∧q}(R ⋈ S) = [σ_p(R)] ⋈ [σ_q(S)]
σ_{p∧q∧m}(R ⋈ S) = σ_m[(σ_p R) ⋈ (σ_q S)]
σ_{p∨q}(R ⋈ S) = [(σ_p R) ⋈ S] ∪ [R ⋈ (σ_q S)]
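
The three derived rules on slide 23 can be spot-checked on a small instance. This is a sketch under assumptions: the relations R(A, K) and S(K, B), the predicates, and the helper names are all made up for illustration. The first two rules are compared as bags; the ∨ rule is compared as sets, since under SUM-style bag union a tuple satisfying both p and q would be counted twice.

```python
from collections import Counter

def natural_join(r, s):
    out = []
    for t in r:
        for u in s:
            shared = t.keys() & u.keys()
            if all(t[a] == u[a] for a in shared):
                out.append({**t, **u})
    return out

def select(pred, rows):
    return [row for row in rows if pred(row)]

def as_bag(rows):
    return Counter(frozenset(row.items()) for row in rows)

def as_set(rows):
    return set(as_bag(rows))

# Hypothetical relations R(A, K) and S(K, B).
R = [{"A": 1, "K": 1}, {"A": 5, "K": 2}, {"A": 7, "K": 3}]
S = [{"K": 1, "B": 10}, {"K": 2, "B": 20}, {"K": 3, "B": 30}]

p = lambda t: t["A"] > 2               # only R attributes
q = lambda t: t["B"] < 30              # only S attributes
m = lambda t: t["A"] + t["B"] > 20     # both R and S attributes

rs = natural_join(R, S)

and_ok = (as_bag(select(lambda t: p(t) and q(t), rs))
          == as_bag(natural_join(select(p, R), select(q, S))))

and_m_ok = (as_bag(select(lambda t: p(t) and q(t) and m(t), rs))
            == as_bag(select(m, natural_join(select(p, R), select(q, S)))))

left_branch = natural_join(select(p, R), S)
right_branch = natural_join(R, select(q, S))
or_ok_sets = (as_set(select(lambda t: p(t) or q(t), rs))
              == as_set(left_branch) | as_set(right_branch))

# With SUM-style bag union the two branches overlap, so the bag sizes differ.
sum_union_size = len(left_branch) + len(right_branch)
print(and_ok, and_m_ok, or_ok_sets, sum_union_size)
```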

Slide 24: Rules: π, σ combined
Let x = subset of R attributes
    z = attributes in predicate p (subset of R attributes)
π_x[σ_p(R)] = π_x{σ_p[π_{xz}(R)]}

Slide 25: Rules: π, ⋈ combined
Let x = subset of R attributes
    y = subset of S attributes
    z = intersection of R, S attributes
π_{xy}(R ⋈ S) = π_{xy}{[π_{xz}(R)] ⋈ [π_{yz}(S)]}
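
A small sketch of the two projection rules from slides 24 and 25, assuming hypothetical relations R(A, P, K) and S(K, B, Q) and a duplicate-preserving (bag) projection. The names `pred_attrs` and `join_attrs` stand for the two different roles that z plays on the two slides.

```python
from collections import Counter

def as_bag(rows):
    return Counter(frozenset(row.items()) for row in rows)

def project(attrs, rows):
    # Bag projection: keeps duplicates, keeps only the listed attributes.
    return [{a: row[a] for a in attrs if a in row} for row in rows]

def select(pred, rows):
    return [row for row in rows if pred(row)]

def natural_join(r, s):
    out = []
    for t in r:
        for u in s:
            shared = t.keys() & u.keys()
            if all(t[a] == u[a] for a in shared):
                out.append({**t, **u})
    return out

# Hypothetical relations R(A, P, K) and S(K, B, Q).
R = [{"A": 1, "P": 9, "K": 1}, {"A": 1, "P": 3, "K": 1}, {"A": 2, "P": 9, "K": 2}]
S = [{"K": 1, "B": 10, "Q": 0}, {"K": 2, "B": 20, "Q": 0}, {"K": 2, "B": 20, "Q": 1}]

x, y = ["A"], ["B"]
pred_attrs = ["P"]                 # z on slide 24: attributes used by p
join_attrs = ["K"]                 # z on slide 25: attributes shared by R and S
p = lambda row: row["P"] > 5

# Slide 24: pi_x[sigma_p(R)] = pi_x{ sigma_p[ pi_xz(R) ] }
lhs_select = project(x, select(p, R))
rhs_select = project(x, select(p, project(x + pred_attrs, R)))

# Slide 25: pi_xy(R join S) = pi_xy{ [pi_xz(R)] join [pi_yz(S)] }
lhs_join = project(x + y, natural_join(R, S))
rhs_join = project(x + y, natural_join(project(x + join_attrs, R),
                                       project(y + join_attrs, S)))

print(as_bag(lhs_select) == as_bag(rhs_select))
print(as_bag(lhs_join) == as_bag(rhs_join))
```

In both rewrites the inner projection must keep the attributes that a later operator still needs (P for the selection, K for the join); only the outermost projection may drop them.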

Slide 26: In textbook: more transformations
More rewrite rules
Other operations, such as duplicate elimination, etc.
Eliminate common sub-expressions
Identify contradictions

Slide 27: π_{xy}{σ_p(R ⋈ S)} = π_{xy}{σ_p[π_{xz’}(R) ⋈ π_{yz’}(S)]}
z’ = z ∪ {attributes used in p}
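
The role of z’ on slide 27 can be seen on a small instance. In this sketch (hypothetical relations R(A, K, U, W) and S(K, B, V, Y), made-up predicate), p compares U from R with V from S, so both must survive the pushed-down projections even though neither is in the final output.

```python
from collections import Counter

def as_bag(rows):
    return Counter(frozenset(row.items()) for row in rows)

def project(attrs, rows):
    # Keeps only the listed attributes that the relation actually has.
    return [{a: row[a] for a in attrs if a in row} for row in rows]

def select(pred, rows):
    return [row for row in rows if pred(row)]

def natural_join(r, s):
    out = []
    for t in r:
        for u in s:
            shared = t.keys() & u.keys()
            if all(t[a] == u[a] for a in shared):
                out.append({**t, **u})
    return out

# Hypothetical relations R(A, K, U, W) and S(K, B, V, Y).
R = [{"A": 1, "K": 1, "U": 5, "W": 0},
     {"A": 2, "K": 1, "U": 50, "W": 0},
     {"A": 3, "K": 2, "U": 7, "W": 0}]
S = [{"K": 1, "B": 10, "V": 20, "Y": 0},
     {"K": 2, "B": 30, "V": 3, "Y": 0}]

x, y, z = ["A"], ["B"], ["K"]
p = lambda row: row["U"] < row["V"]      # uses U (from R) and V (from S)
z_prime = z + ["U", "V"]                 # z' = z plus attributes used in p

lhs = project(x + y, select(p, natural_join(R, S)))
rhs = project(x + y, select(p, natural_join(project(x + z_prime, R),
                                            project(y + z_prime, S))))

# Projecting onto xz and yz only would drop U and V before p is evaluated.
too_narrow = natural_join(project(x + z, R), project(y + z, S))
print(as_bag(lhs) == as_bag(rhs), "U" in too_narrow[0])
```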

Slide 28: Rules for σ, π combined with × are similar... e.g., σ_p(R × S) = ?

Slide 29: Rules: σ, ∪, − combined
σ_p(R ∪ S) = σ_p(R) ∪ σ_p(S)
σ_p(R − S) = σ_p(R) − S = σ_p(R) − σ_p(S)
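
The rules on slide 29 can be checked on bags with `collections.Counter`, using `+` as SUM-style bag union and `-` as bag difference (multiplicities are subtracted and never drop below zero). The bags and the predicate are illustrative assumptions.

```python
from collections import Counter

R = Counter("aabbbc")
S = Counter("bbccd")

def select(pred, bag):
    # Bag selection: keep qualifying elements with their multiplicities.
    return Counter({x: n for x, n in bag.items() if pred(x)})

def p(x):
    return x in "ab"

# sigma_p(R U S) = sigma_p(R) U sigma_p(S)
union_lhs = select(p, R + S)
union_rhs = select(p, R) + select(p, S)

# sigma_p(R - S) = sigma_p(R) - S = sigma_p(R) - sigma_p(S)
diff_lhs = select(p, R - S)
diff_mid = select(p, R) - S
diff_rhs = select(p, R) - select(p, S)

print(union_lhs == union_rhs, diff_lhs == diff_mid == diff_rhs)
```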

Slide 30: Which are “good” transformations?

Slide 31: Conventional wisdom: do projects early
Example: relation R(A,B,C,D,E), predicate p: (A=3) ∧ (B=“cat”)
π_E{σ_p(R)} vs. π_E{σ_p{π_{ABE}(R)}}

Slide 32: What if we have A, B indexes?
Look up B = “cat” in the B index and A = 3 in the A index, then intersect the pointers to get pointers to the matching tuples!
But then it is better to do the projection later!
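
The index case from slide 32 can be sketched as follows, assuming in-memory dictionaries stand in for the A and B indexes and list positions stand in for tuple pointers (both are simplifications, as is the sample data). With the indexes, the matching tuples are located without scanning R, so an early projection π_{ABE}(R) over the whole relation would only add work.

```python
from collections import defaultdict

# Hypothetical instance of R(A, B, C, D, E).
rows = [
    {"A": 3, "B": "cat", "C": 0, "D": 0, "E": "e1"},
    {"A": 3, "B": "dog", "C": 0, "D": 0, "E": "e2"},
    {"A": 4, "B": "cat", "C": 0, "D": 0, "E": "e3"},
    {"A": 3, "B": "cat", "C": 0, "D": 0, "E": "e4"},
]

def build_index(rows, attr):
    # Maps each attribute value to the set of row positions ("pointers").
    index = defaultdict(set)
    for rid, row in enumerate(rows):
        index[row[attr]].add(rid)
    return index

index_a = build_index(rows, "A")
index_b = build_index(rows, "B")

# Index plan: intersect pointer sets, fetch only matching tuples, project last.
matching = index_a.get(3, set()) & index_b.get("cat", set())
via_index = sorted(rows[rid]["E"] for rid in matching)

# Scan plan with early projection: pi_E{ sigma_p{ pi_ABE(R) } }.
narrowed = [{"A": r["A"], "B": r["B"], "E": r["E"]} for r in rows]
via_scan = sorted(t["E"] for t in narrowed if t["A"] == 3 and t["B"] == "cat")

print(via_index == via_scan)
```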

Slide 33: Which are “good” transformations?
σ_{p1∧p2}(R) → σ_{p1}[σ_{p2}(R)]
σ_p(R ⋈ S) → [σ_p(R)] ⋈ S
R ⋈ S → S ⋈ R
π_x[σ_p(R)] → π_x{σ_p[π_{xz}(R)]}

Slide 34: Bottom line:
Some heuristics:
– Early selection is usually good
No transformation is always good
Rule application defines a search space
– Need cost criteria to make decision
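
To make the last point concrete, here is a toy sketch of choosing between two equivalent plans by cost. The cost measure (total number of intermediate tuples produced) is an assumption chosen for simplicity, not the cost model of any real optimizer, and the relations are synthetic.

```python
from collections import Counter

def as_bag(rows):
    return Counter(frozenset(row.items()) for row in rows)

def natural_join(r, s):
    out = []
    for t in r:
        for u in s:
            shared = t.keys() & u.keys()
            if all(t[a] == u[a] for a in shared):
                out.append({**t, **u})
    return out

# Synthetic relations R(A, K) and S(K, B) with 100 tuples each.
R = [{"A": i % 5, "K": i} for i in range(100)]
S = [{"K": i, "B": i} for i in range(100)]
p = lambda t: t["A"] == 0        # predicate over R attributes only

def plan_late_selection():
    # sigma_p(R join S): join everything, filter afterwards.
    joined = natural_join(R, S)
    result = [t for t in joined if p(t)]
    return result, len(joined) + len(result)

def plan_early_selection():
    # [sigma_p(R)] join S: filter R first, then join.
    r_sel = [t for t in R if p(t)]
    joined = natural_join(r_sel, S)
    return joined, len(r_sel) + len(joined)

result_late, cost_late = plan_late_selection()
result_early, cost_early = plan_early_selection()

costs = {"late selection": cost_late, "early selection": cost_early}
best = min(costs, key=costs.get)
print(costs, best)
```

Both plans return the same bag of tuples; only the amount of intermediate work differs, and a different predicate, data distribution, or available index could change which plan wins.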