Query Processing and Optimization, and Database Tuning


1 Query Processing and Optimization, and Database Tuning
From: Fundamentals of Database Systems, Sixth Edition, Chapter 19, by Elmasri & Navathe
Prepared by: David Marshburn, IST-734, Fall 2010, CSU, Dr. Matos

2 Query Processing
- SQL is the most common high-level query language
- Scan: identifies the tokens (SQL keywords, attribute names, relation names)
- Parse: checks that the query has valid syntax
- Validate: checks that the attributes and relations are valid and semantically meaningful
Processing pipeline: Query in high-level language → Scanning, parsing, and validating → Intermediate form of query → Query optimizer → Execution plan → Query code generator → Code to execute the query → Runtime database processor
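
The scanning step can be illustrated with a small tokenizer. This is only a minimal sketch: the keyword set, the token categories, and the function name `scan` are assumptions made for illustration, and a real DBMS scanner recognizes far more of the SQL grammar.

```python
import re

# Hypothetical, deliberately tiny keyword list (not a full SQL grammar).
KEYWORDS = {"SELECT", "FROM", "WHERE", "AND"}

# One token per match: a name, a quoted string, or a single-character operator.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<name>[A-Za-z_]\w*)|(?P<string>'[^']*')|(?P<op>[=<>,]))"
)

def scan(sql):
    """Split a query string into (kind, text) tokens."""
    tokens, pos = [], 0
    sql = sql.rstrip()
    while pos < len(sql):
        match = TOKEN_RE.match(sql, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        pos = match.end()
        if match.group("name"):
            word = match.group("name")
            kind = "KEYWORD" if word.upper() in KEYWORDS else "IDENTIFIER"
            tokens.append((kind, word))
        elif match.group("string"):
            tokens.append(("STRING", match.group("string")))
        else:
            tokens.append(("OPERATOR", match.group("op")))
    return tokens

for token in scan("SELECT lname FROM project WHERE pname = 'Aquarius'"):
    print(token)
```

The scanner only classifies tokens. Checking that the token sequence is valid SQL (parsing) and that `lname` and `project` exist in the catalog (validation) are separate steps.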

3 Optimization
- Really means finding a "reasonably efficient" plan; complete optimization is too time consuming
- Based on what is known in the database catalog
- Used in relational databases, where the query describes the results, not how to get the results
- Not used in navigational databases
- Used in OO databases (but not discussed here)
Pipeline stage: Intermediate form of query → Query optimizer → Execution plan

4 Optimization (cont’d)
- Reduces the amount of data being processed: both the number of records and the size of each record
- Uses relational algebra, represented by query trees
- Operations are executed in sequence from the leaf nodes to the root node
- Can use heuristic rules
- Calculus and query graphs are not typically used for optimization, as they do not indicate a sequence of operations
Query tree figure: the intermediate form of the query drawn as a tree with a root node, internal nodes, and leaf nodes

5 Heuristic Optimization of Query Tree
- The intermediate form of the query is a relational algebra depiction of the higher-level language (SQL) query
- Very inefficient: if the employee record count is 500, the works_on record count is 150, and the project record count is 100, then 500 x 150 x 100 = 7,500,000 records are returned to the SELECT operation
- Multiply by the size of each record to get the volume of data
SQL query: SELECT lname FROM employee, works_on, project WHERE pname = ‘Aquarius’ AND pnumber = pno AND essn = ssn AND bdate > ‘ ’
Query tree (root first, children indented):
πlname
  σpname = ‘Aquarius’ AND pnumber = pno AND essn = ssn AND bdate > ‘ ’
    x
      x
        Employee
        Works_on
      Project
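
The size of the unoptimized intermediate result follows directly from multiplying the three record counts quoted on this slide. The sketch below does that arithmetic; the record size `RECORD_BYTES` is a hypothetical figure chosen only to show the "multiply by the size of each record" step.

```python
# Record counts quoted on the slide.
employee_count = 500
works_on_count = 150
project_count = 100

# A CARTESIAN PRODUCT pairs every record of one relation with every record
# of the other, so the counts multiply.
product_rows = employee_count * works_on_count * project_count
print(product_rows)  # 7500000

# Hypothetical combined record size in bytes (an assumption, not from the slides).
RECORD_BYTES = 200
product_bytes = product_rows * RECORD_BYTES
print(product_bytes)  # 1500000000
```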

6 Heuristic Optimization of Query Tree (cont’d)
- Move SELECT operations down the query tree
- Only projects where pname = ‘Aquarius’ are needed
- Only employees whose bdate > ‘ ’ are needed
- This reduces the number of records passed up the tree
Query tree (root first, children indented):
πlname
  σpnumber = pno
    x
      σessn = ssn
        x
          σbdate > ‘ ’
            Employee
          Works_on
      σpname = ‘Aquarius’
        Project
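
The effect of moving SELECT operations down the tree can be checked on toy data. The following is a minimal sketch: the three small relations, the helper names `cross` and `select`, and the cutoff date `CUTOFF` are all assumptions for illustration (the date literal in the slide's query is blank, so the value here is invented).

```python
from itertools import product

# Toy relations (hypothetical data, not from the slides).
employee = [
    {"ssn": 1, "lname": "Smith", "bdate": "1965-01-09"},
    {"ssn": 2, "lname": "Wong", "bdate": "1955-12-08"},
    {"ssn": 3, "lname": "Zelaya", "bdate": "1968-07-19"},
]
works_on = [
    {"essn": 1, "pno": 10},
    {"essn": 2, "pno": 10},
    {"essn": 3, "pno": 20},
]
project = [
    {"pnumber": 10, "pname": "Aquarius"},
    {"pnumber": 20, "pname": "ProductX"},
]
CUTOFF = "1960-01-01"  # hypothetical date; the slide's value is not shown

def cross(r, s):
    """CARTESIAN PRODUCT of two relations stored as lists of dicts."""
    return [{**a, **b} for a, b in product(r, s)]

def select(pred, r):
    """SELECT: keep the tuples that satisfy the predicate."""
    return [t for t in r if pred(t)]

# Initial tree: one big SELECT on top of two CARTESIAN PRODUCTs.
big = cross(cross(employee, works_on), project)
naive = select(
    lambda t: t["pname"] == "Aquarius"
    and t["pnumber"] == t["pno"]
    and t["essn"] == t["ssn"]
    and t["bdate"] > CUTOFF,
    big,
)

# Tree with the SELECTs pushed down toward the leaves.
emp_f = select(lambda t: t["bdate"] > CUTOFF, employee)
proj_f = select(lambda t: t["pname"] == "Aquarius", project)
first_product = cross(emp_f, works_on)
step1 = select(lambda t: t["essn"] == t["ssn"], first_product)
pushed = select(lambda t: t["pnumber"] == t["pno"], cross(step1, proj_f))

names_naive = [t["lname"] for t in naive]
names_pushed = [t["lname"] for t in pushed]
print(names_naive, names_pushed)      # same answer from both trees
print(len(big), len(first_product))   # largest intermediate result of each tree
```

Both trees return the same last names, but the largest intermediate result shrinks from 18 tuples to 6 on this toy data.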

7 Heuristic Optimization of Query Tree (cont’d)
- Apply the more restrictive SELECT operations first
- Switch the positions of employee and project to retrieve fewer records
- (The database catalog shows that pname is a key for project; therefore a SELECT on pname will return only 1 record)
Query tree (root first, children indented):
πlname
  σessn = ssn
    x
      σpnumber = pno
        x
          σpname = ‘Aquarius’
            Project
          Works_on
      σbdate > ‘ ’
        Employee

8 Heuristic Optimization of Query Tree (cont’d)
- Replace CARTESIAN PRODUCTs and SELECTs with JOINs
Query tree (root first, children indented):
πlname
  ⨝essn = ssn
    ⨝pnumber = pno
      σpname = ‘Aquarius’
        Project
      Works_on
    σbdate > ‘ ’
      Employee

9 Heuristic Optimization of Query Tree (cont’d)
- Move PROJECT operations down the tree
- PROJECT reduces the number of columns returned
- This makes each record smaller
Query tree (root first, children indented):
πlname
  ⨝essn = ssn
    πessn
      ⨝pnumber = pno
        πpnumber
          σpname = ‘Aquarius’
            Project
        πessn, pno
          Works_on
    σbdate > ‘ ’
      Employee

10 Heuristic Optimization of Query Tree (cont’d)
- Transforming a query tree into a more efficient query tree is a step-by-step process
- Each step must preserve the equivalence of the query
- Use the Transformation Rules for Relational Algebra to develop a Heuristic Algebraic Optimization Algorithm

11 Transformation Rules for Relational Algebra
Rule 1, Cascade of SELECT: a conjunctive selection condition can be broken up into a cascade (sequence) of individual SELECT operations
  σc1 AND c2 AND . . . AND cn (R) ≡ σc1 (σc2 (. . . (σcn (R)) . . .))
Rule 2, Commutativity of SELECT: the SELECT operation is commutative
  σc1 (σc2 (R)) ≡ σc2 (σc1 (R))
Rule 3, Cascade of PROJECT: in a cascade (sequence) of PROJECT operations, all but the last one can be ignored
  πList1 (πList2 (. . . (πListn (R)) . . .)) ≡ πList1 (R)
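
The cascade and commutativity rules for SELECT and the cascade rule for PROJECT can be spot-checked on a small relation. This is a minimal sketch under stated assumptions: the relation `R`, the conditions `c1` and `c2`, and the helpers `select` and `project` are all invented for illustration, and a check on one relation is a sanity test, not a proof.

```python
def select(pred, r):
    """SELECT: keep the tuples that satisfy the predicate."""
    return [t for t in r if pred(t)]

def project(attrs, r):
    """PROJECT: keep only the listed attributes and drop duplicate tuples."""
    result = []
    for t in r:
        row = {a: t[a] for a in attrs}
        if row not in result:
            result.append(row)
    return result

# Hypothetical relation and conditions.
R = [
    {"id": 1, "dept": "A", "salary": 50},
    {"id": 2, "dept": "A", "salary": 70},
    {"id": 3, "dept": "B", "salary": 70},
    {"id": 4, "dept": "B", "salary": 40},
]
c1 = lambda t: t["dept"] == "A"
c2 = lambda t: t["salary"] > 60

# Cascade of SELECT: one conjunctive condition vs. a sequence of SELECTs.
conjunctive = select(lambda t: c1(t) and c2(t), R)
cascade = select(c1, select(c2, R))

# Commutativity of SELECT: the order of the two SELECTs does not matter.
swapped = select(c2, select(c1, R))

# Cascade of PROJECT: only the outermost list matters
# (the outer list must be a subset of the inner one).
nested = project(["dept"], project(["dept", "salary"], R))
direct = project(["dept"], R)

print(conjunctive == cascade == swapped)  # True
print(nested == direct)                   # True
```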

12 Transformation Rules for Relational Algebra (cont’d)
Rule 4, Commuting SELECT with PROJECT: if the selection condition c involves only those attributes A1, . . ., An in the projection list, the two operations can be commuted
  πA1, A2, . . ., An (σc (R)) ≡ σc (πA1, A2, . . ., An (R))
Rule 5, Commutativity of JOIN: the JOIN operation is commutative, as is the CARTESIAN PRODUCT
  R ⨝ S ≡ S ⨝ R
  R x S ≡ S x R

13 Transformation Rules for Relational Algebra (cont’d)
Rule 6, Commuting σ with ⨝ (or x): if the selection condition c involves only the attributes of one of the relations being joined (say, R), the two operations can be commuted as follows:
  σc (R ⨝ S) ≡ (σc (R)) ⨝ S
Alternatively, if the selection condition c can be written as (c1 AND c2), where condition c1 involves only the attributes of R and condition c2 involves only the attributes of S, the operations commute as follows:
  σc (R ⨝ S) ≡ (σc1 (R)) ⨝ (σc2 (S))
The same rules apply if the ⨝ is replaced by a x operation
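
Commuting σ with ⨝ can be checked the same way: filtering before the join must give the same tuples as filtering after it. The sketch below uses invented relations `R` and `S`, an invented join condition, and a simple nested-loop `join` helper; all names are assumptions for illustration.

```python
def select(pred, r):
    """SELECT: keep the tuples that satisfy the predicate."""
    return [t for t in r if pred(t)]

def join(r, s, cond):
    """Nested-loop JOIN: combine every pair of tuples that satisfies cond."""
    return [{**a, **b} for a in r for b in s if cond({**a, **b})]

# Hypothetical relations: R holds employees, S holds departments.
R = [
    {"name": "Smith", "dno": 5, "salary": 70},
    {"name": "Wong", "dno": 5, "salary": 40},
    {"name": "Zelaya", "dno": 4, "salary": 80},
]
S = [
    {"dnumber": 5, "location": "Houston"},
    {"dnumber": 4, "location": "Stafford"},
]
on = lambda t: t["dno"] == t["dnumber"]        # join condition
c1 = lambda t: t["salary"] > 60                # uses attributes of R only
c2 = lambda t: t["location"] == "Houston"      # uses attributes of S only

# Condition on one relation: SELECT after the join vs. before it.
after_join = select(c1, join(R, S, on))
before_join = join(select(c1, R), S, on)

# Condition c1 AND c2: split it and push each part to its own relation.
both_after = select(lambda t: c1(t) and c2(t), join(R, S, on))
both_before = join(select(c1, R), select(c2, S), on)

print(after_join == before_join)  # True
print(both_after == both_before)  # True
```

Pushing the SELECT below the join means the join sees fewer input tuples, which is the reason the heuristic algorithm applies this rule.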

14 Transformation Rules for Relational Algebra (cont’d)
Rule 7, Commuting π with ⨝ (or x): suppose that the projection list is L = {A1, . . ., An, B1, . . ., Bm}, where A1, . . ., An are attributes of R and B1, . . ., Bm are attributes of S. If the join condition c involves only attributes in L, the two operations can be commuted as follows:
  πL (R ⨝c S) ≡ (πA1, . . ., An (R)) ⨝c (πB1, . . ., Bm (S))

15 Transformation Rules for Relational Algebra (cont’d)
Rule 7 (cont’d): if the join condition c contains additional attributes not in L, these must be added to the projection list, and a final π operation is needed. For example, if attributes An+1, . . ., An+k of R and Bm+1, . . ., Bm+p of S are involved in the join condition c but are not in the projection list L, the operations commute as follows:
  πL (R ⨝c S) ≡ πL ((πA1, . . ., An, An+1, . . ., An+k (R)) ⨝c (πB1, . . ., Bm, Bm+1, . . ., Bm+p (S)))
For x, there is no condition c, so the first transformation rule always applies by replacing ⨝c with x
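
The case where the join condition uses attributes outside the projection list can be illustrated as follows. In this sketch the relations, attribute names, and helpers are assumptions made for illustration: the join attributes `dept_id` and `dno` are kept in the inner projections so the join can still be evaluated, and a final projection onto L removes them.

```python
def project(attrs, r):
    """PROJECT: keep only the listed attributes and drop duplicate tuples."""
    result = []
    for t in r:
        row = {a: t[a] for a in attrs}
        if row not in result:
            result.append(row)
    return result

def join(r, s, cond):
    """Nested-loop JOIN: combine every pair of tuples that satisfies cond."""
    return [{**a, **b} for a in r for b in s if cond({**a, **b})]

def as_set(r):
    """Order-independent view of a relation, for comparing results."""
    return {frozenset(t.items()) for t in r}

# Hypothetical relations.
R = [
    {"name": "Smith", "dept_id": 5, "salary": 30},
    {"name": "Wong", "dept_id": 5, "salary": 40},
    {"name": "Zelaya", "dept_id": 4, "salary": 25},
]
S = [
    {"dno": 5, "dname": "Research", "budget": 100},
    {"dno": 4, "dname": "Administration", "budget": 80},
]
L = ["name", "dname"]                          # projection list
cond = lambda t: t["dept_id"] == t["dno"]      # join attributes are NOT in L

# Left-hand side: project after the join.
lhs = project(L, join(R, S, cond))

# Right-hand side: project early, keeping the join attributes,
# then apply the final projection onto L.
rhs = project(
    L,
    join(project(["name", "dept_id"], R), project(["dname", "dno"], S), cond),
)

print(as_set(lhs) == as_set(rhs))  # True
```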

16 Transformation Rules for Relational Algebra (cont’d)
Rule 8, Commutativity of set operations: the set operations ∪ and ∩ are commutative, but – is not
Rule 9, Associativity of ⨝, x, ∪, and ∩: these four operations are individually associative; that is, if θ stands for any one of these four operations (throughout the expression), we have:
  (R θ S) θ T ≡ R θ (S θ T)
Rule 10, Commuting σ with set operations: the σ operation commutes with ∪, ∩, and –. If θ stands for any one of these three operations (throughout the expression), we have:
  σc (R θ S) ≡ (σc (R)) θ (σc (S))
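
Because Python's built-in `set` type supports union, intersection, and difference, the set-operation rules can be spot-checked directly. The sets `R`, `S`, `T` and the condition `c` below are hypothetical values chosen for illustration.

```python
# Hypothetical relations, reduced to sets of single values.
R = {1, 2, 3, 4}
S = {3, 4, 5, 6}
T = {4, 6, 8}
c = lambda x: x % 2 == 0  # selection condition: keep even values

def select(pred, r):
    """SELECT on a set-valued relation."""
    return {x for x in r if pred(x)}

# Union and intersection are commutative; difference is not.
print(R | S == S | R, R & S == S & R)  # True True
print(R - S == S - R)                  # False

# Union and intersection are associative.
print((R | S) | T == R | (S | T), (R & S) & T == R & (S & T))  # True True

# SELECT commutes with union, intersection, and difference.
commutes = (
    select(c, R | S) == select(c, R) | select(c, S)
    and select(c, R & S) == select(c, R) & select(c, S)
    and select(c, R - S) == select(c, R) - select(c, S)
)
print(commutes)  # True
```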

17 Transformation Rules for Relational Algebra (cont’d)
Rule 11: the π operation commutes with ∪
  πL (R ∪ S) ≡ (πL (R)) ∪ (πL (S))
Rule 12, Converting a (σ, x) sequence into ⨝: if the condition c of a σ that follows a x corresponds to a join condition, convert the (σ, x) sequence into a ⨝ as follows:
  σc (R x S) ≡ (R ⨝c S)
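
One reason the (σ, x) to ⨝ conversion matters is that a join can be executed without materializing the full product. The sketch below compares a SELECT over a CARTESIAN PRODUCT with a simple hash join on the same equality condition. The hash join is an assumed example of a join algorithm, not something the slides prescribe, and the relations and function names are hypothetical.

```python
from collections import defaultdict
from itertools import product

def select_over_product(r, s, r_key, s_key):
    """Build every pair of R x S, then keep the pairs where the keys match."""
    return [{**a, **b} for a, b in product(r, s) if a[r_key] == b[s_key]]

def hash_join(r, s, r_key, s_key):
    """Equality JOIN that only looks at S tuples with a matching key."""
    buckets = defaultdict(list)
    for b in s:
        buckets[b[s_key]].append(b)
    return [{**a, **b} for a in r for b in buckets.get(a[r_key], [])]

# Hypothetical relations.
works_on = [
    {"essn": 1, "pno": 10},
    {"essn": 2, "pno": 10},
    {"essn": 3, "pno": 20},
]
project = [
    {"pnumber": 10, "pname": "Aquarius"},
    {"pnumber": 20, "pname": "ProductX"},
]

via_product = select_over_product(works_on, project, "pno", "pnumber")
via_join = hash_join(works_on, project, "pno", "pnumber")
print(via_product == via_join)  # True
```

Both functions return the same tuples. The first examines all 3 x 2 pairs, while the second examines only the pairs that share a key value.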

18 Outline of a Heuristic Algebraic Optimization Algorithm
Using Rule 1, break up any SELECT operations with conjunctive conditions into a cascade of SELECT operations. This permits a greater degree of freedom in moving SELECT operations down different branches of the tree

19 Outline of a Heuristic Algebraic Optimization Algorithm (cont’d)
Using Rules 2, 4, 6, and 10 concerning commutativity of SELECT with other operations, move each SELECT operation as far down the query tree as is permitted by the attributes involved in the select condition. If the condition involves attributes from only one table, which means that it represents a selection condition, the operation is moved all the way to the leaf node that represents this table. If the condition involves attributes from two tables, which means that it represents a join condition, the condition is moved to a location down the tree after the two tables are combined.

20 Outline of a Heuristic Algebraic Optimization Algorithm (cont’d)
- Using Rules 5 and 9 concerning commutativity and associativity of binary operations, rearrange the leaf nodes of the tree using the following criteria:
  - Position the leaf node relations with the most restrictive SELECT operations so they are executed first
  - Make sure that the ordering of leaf nodes does not cause CARTESIAN PRODUCT operations
- Using Rule 12, combine a CARTESIAN PRODUCT operation with a subsequent SELECT operation in the tree into a JOIN operation

21 Outline of a Heuristic Algebraic Optimization Algorithm (cont’d)
- Using Rules 3, 4, 7, and 11 concerning the cascading of PROJECT and the commuting of PROJECT with other operations, break down and move lists of projection attributes down the tree as far as possible by creating new PROJECT operations as needed
- Identify subtrees that represent groups of operations that can be executed by a single algorithm

22 Cost-Based Query Optimization
Multiple execution strategies and algorithms are generated and compared based on their costs:
- Access cost to secondary storage: transferring data blocks between secondary storage and main memory
- Disk storage cost: storing on disk any intermediate files generated by an execution strategy
- Computation cost: in-memory operations performed during query execution
- Memory usage cost: memory buffers needed during query execution
- Communication cost: shipping the query and its results between the database and the terminal

23 Database Catalog Information
- Size of each file; for a file whose records are all the same type:
  - Number of records
  - Average record size
  - Number of file blocks
  - Blocking factor
- Primary file organization:
  - Ordered or unordered on an attribute, with a primary or clustering index
  - Hashed on a key
- Number of levels (for a multilevel index)
- Number of first-level index blocks
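
Several of these catalog values are related to one another. As a minimal sketch, assuming fixed-length records that do not span blocks, the blocking factor is the number of whole records that fit in one block, and the number of file blocks follows from it. The block size, record size, and record count below are hypothetical figures, and the function names are made up for this example.

```python
import math

def blocking_factor(block_size, record_size):
    """Whole records per block (fixed-length, unspanned records assumed)."""
    return block_size // record_size

def file_blocks(num_records, bfr):
    """Blocks needed to hold all records at the given blocking factor."""
    return math.ceil(num_records / bfr)

# Hypothetical values, not taken from the slides.
BLOCK_SIZE = 512     # bytes per block
RECORD_SIZE = 100    # bytes per record
NUM_RECORDS = 500    # records in the file

bfr = blocking_factor(BLOCK_SIZE, RECORD_SIZE)
blocks = file_blocks(NUM_RECORDS, bfr)
print(bfr, blocks)  # 5 100
```

An optimizer that estimates access cost in block transfers would use the block count, not the record count, which is why the catalog keeps both.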

