# Query Optimization Dr. Karen C. Davis Professor School of Electronic and Computing Systems School of Computing Sciences and Informatics.


Outline

- overview of relational query optimization
- logical optimization
  - algebraic equivalences
  - transformation of trees
- physical optimization
  - selection algorithms
  - join algorithms
- cost-based optimization
- research example using relational algebra

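Relational Query Optimization

[Figure: SQL query → query optimizer (logical stage, then physical stage) → access plan (executable). The relational algebra query tree is the form handed from the logical stage to the physical stage.]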

Learning Outcomes

- translate basic SQL to an RA query tree
- perform heuristic optimizations on the tree
- use cost-based optimization to select algorithms for tree operators to generate an execution plan

SQL is declarative

- describes what data, not how to retrieve it
- select distinct … from … where …
- helpful for users, not necessarily good for efficient execution

Relational Algebra is procedural

- specifies operators and the order of evaluation

Steps for query evaluation:

1. translate SQL to RA operators (query tree)
2. perform heuristic optimizations:
   - (a) push RA select operators down the tree
   - (b) convert select and cross product to join
   - (c) others based on algebraic transformations

Relational Algebra Operators

| name | symbolically | evaluation |
| --- | --- | --- |
| select | σ_c R | applies condition c to R |
| project | π_l R | keeps a list (l) of attributes of R |
| cross product | R × S | all possible combinations of tuples of R appended with tuples from S |
| join | R ⋈_c S | π_l(σ_c(R × S)), where l is a list of attributes of R and S with duplicate columns removed and c is a join condition |
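The four operators can be sketched over in-memory relations to make the definitions concrete. This is a minimal illustration, not how a database engine implements them: relations are assumed to be lists of dictionaries, the sample relations `R` and `S` and their attribute names are made up, and the join omits the duplicate-column removal mentioned in its definition.

```python
from itertools import product

def select(cond, rel):
    """σ_c(R): keep the tuples of rel that satisfy cond."""
    return [t for t in rel if cond(t)]

def project(attrs, rel):
    """π_l(R): keep only the listed attributes, dropping duplicate rows."""
    seen, out = set(), []
    for t in rel:
        row = tuple(t[a] for a in attrs)
        if row not in seen:
            seen.add(row)
            out.append(dict(zip(attrs, row)))
    return out

def cross(r, s):
    """R × S: every tuple of r combined with every tuple of s."""
    return [{**t, **u} for t, u in product(r, s)]

def join(cond, r, s):
    """R ⋈_c S, written here as a select over the cross product."""
    return select(cond, cross(r, s))

# Hypothetical sample relations (attribute names are assumptions).
R = [{"R.id": 1, "R.v": "a"}, {"R.id": 2, "R.v": "b"}]
S = [{"S.id": 1, "S.w": "x"}, {"S.id": 3, "S.w": "y"}]

joined = join(lambda t: t["R.id"] == t["S.id"], R, S)
result = project(["R.v", "S.w"], joined)
```

Writing the join as a select over a cross product mirrors the definition in the table; the join algorithms of physical optimization exist to avoid building that cross product.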

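SQL to RA

- select distinct … → π_l
- from … → ×
- where … → σ_c

Query trees have π_l at the root, σ_c below it, and cross products at the bottom, nested to the left as relations are added:

- two relations: π_l(σ_c(R × S))
- three relations: π_l(σ_c((R × T) × S))
- four relations: π_l(σ_c(((R × U) × T) × S))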

SQL to RA Tree Example

SQL: select A.x, A.y, B.z from A, B where A.a = B.z and A.x > 10

RA: π_{A.x, A.y, B.z}(σ_{A.a = B.z and A.x > 10}(A × B))

The tree is evaluated bottom-up, left to right; intermediate values are passed up the tree to the next operator.
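Bottom-up evaluation can be sketched with a small recursive evaluator. The tree encoding (nested tuples), the `evaluate` function, and the sample contents of A and B are assumptions made for illustration; the condition uses A.x > 10, as in the SQL text of the example.

```python
from itertools import product

# Hypothetical contents for relations A and B.
A = [{"A.a": 1, "A.x": 5, "A.y": "p"},
     {"A.a": 2, "A.x": 20, "A.y": "q"},
     {"A.a": 3, "A.x": 30, "A.y": "r"}]
B = [{"B.z": 2}, {"B.z": 3}, {"B.z": 4}]

# Query tree as nested tuples: (operator, argument, child[, child]).
tree = ("project", ["A.x", "A.y", "B.z"],
        ("select", lambda t: t["A.a"] == t["B.z"] and t["A.x"] > 10,
         ("cross", None, ("rel", A), ("rel", B))))

def evaluate(node):
    """Evaluate children first, then apply the operator to their results."""
    op = node[0]
    if op == "rel":                      # leaf: a stored relation
        return node[1]
    if op == "cross":                    # binary operator: left child, then right
        left, right = evaluate(node[2]), evaluate(node[3])
        return [{**l, **r} for l, r in product(left, right)]
    child = evaluate(node[2])            # unary operators consume one result
    if op == "select":
        return [t for t in child if node[1](t)]
    if op == "project":                  # no duplicate removal (no "distinct")
        return [{a: t[a] for a in node[1]} for t in child]
    raise ValueError(f"unknown operator: {op}")

rows = evaluate(tree)
```

With this sample data the cross product holds nine tuples, of which two survive the select and are passed up to the project.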

SQL to RA Tree Example

SQL: select lname from employee, works_on, projects where pname = 'Aquarius' and pnumber = pno and essn = ssn and bdate = '1985-12-03'

RA: π_lname(σ_{pname = 'Aquarius' and pnumber = pno and essn = ssn and bdate = '1985-12-03'}((employee × works_on) × projects))

Simple Heuristic Optimization

1. Cascade selects (split them up): π_l(σ_{c1 and c2 and c3}(R × S)) becomes π_l(σ_c1(σ_c2(σ_c3(R × S)))).

2. Push any single-attribute selects down the tree to be just above their relation: π_l(σ_c1(σ_c2(σ_c3(R × S)))) becomes π_l(σ_c2(σ_c1(R) × σ_c3(S))), where c1 applies to R and c3 applies to S.

3. Convert 2-attribute select and cross product to join: π_l(σ_c2(σ_c1(R) × σ_c3(S))) becomes π_l(σ_c1(R) ⋈_c2 σ_c3(S)). The benefits are smaller intermediate results and the availability of efficient join algorithms.
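The effect of the three steps on intermediate result size can be checked with a small experiment. The relations, attribute names, and conditions below are invented for illustration; c1 and c3 each touch one relation and c2 compares an attribute of R with an attribute of S, matching the roles they play in the steps above.

```python
from itertools import product

# Hypothetical relations: 100 tuples in R, 200 tuples in S.
R = [{"r_id": i, "r_val": i % 10} for i in range(100)]
S = [{"s_id": i, "s_ref": i % 100, "s_flag": i % 5} for i in range(200)]

c1 = lambda t: t["r_val"] == 0           # condition on R only
c3 = lambda t: t["s_flag"] == 0          # condition on S only
c2 = lambda t: t["r_id"] == t["s_ref"]   # condition relating R and S

def naive():
    """σ_{c1 and c2 and c3}(R × S): build the full cross product first."""
    cp = [{**r, **s} for r, s in product(R, S)]
    return [t for t in cp if c1(t) and c2(t) and c3(t)], len(cp)

def optimized():
    """σ_c1(R) ⋈_c2 σ_c3(S): filter each relation before combining."""
    r_small = [r for r in R if c1(r)]
    s_small = [s for s in S if c3(s)]
    pairs = [{**r, **s} for r, s in product(r_small, s_small)]
    return [t for t in pairs if c2(t)], len(pairs)

naive_rows, naive_size = naive()
opt_rows, opt_size = optimized()
```

Both plans return the same tuples, but the naive plan materializes 100 × 200 combinations while the optimized plan combines only the 10 and 40 tuples that survive the pushed-down selects. A real join algorithm would go further and avoid enumerating even those pairs.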

Practice problem: optimize RA tree

SQL: select P.pnumber, P.dnum, E.lname, E.bdate from projects P, department D, employee E where D.dnumber = P.dnum and D.mgrssn = E.ssn and P.plocation = 'Stafford';

The conditions are labeled c1: D.dnumber = P.dnum, c2: D.mgrssn = E.ssn, and c3: P.plocation = 'Stafford'.

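RA tree to RA expression

The tree with π_l at the root, ⋈_c2 below it, and σ_c1 over R and σ_c3 over S as the join's children is written inside-out: start with σ_c1(R) and σ_c3(S), combine them with ⋈_c2, and wrap the result in π_l( ), giving π_l(σ_c1(R) ⋈_c2 σ_c3(S)).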

Other Operators in Relational Algebra

SQL: (select pnumber from projects, department, employee where dnum = dnumber and mgrssn = ssn and lname = 'Smith') union (select pnumber from projects, works_on, employee where pnumber = pno and essn = ssn and lname = 'Smith');

RA: π_pnumber(σ_{lname = 'Smith'}(employee) ⋈_{ssn = mgrssn} department ⋈_{dnumber = dnum} projects) ∪ π_pnumber(σ_{lname = 'Smith'}(employee) ⋈_{ssn = essn} works_on ⋈_{pnumber = pno} projects)

Selection Algorithms

- linear search
- binary search
- primary index or hash for point query
- primary index for range query
- clustering index
- secondary index
- conjunctives
  - individual index
  - composite index or hash
  - intersection of record pointers for multiple indexes
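A few of these access paths can be sketched over an in-memory file to show how they differ. This is a simplified illustration: the `records` list, its `ssn` key, and the function names are assumptions, a sorted Python list stands in for an ordered file with a primary index, and a dictionary stands in for a hash structure.

```python
from bisect import bisect_left, bisect_right

# Hypothetical file of records, stored in order of the key attribute ssn.
records = [{"ssn": k, "name": f"emp{k}"} for k in range(0, 100, 5)]
keys = [r["ssn"] for r in records]          # ordered keys, as in a sorted file
hash_index = {r["ssn"]: r for r in records}  # stand-in for a hash on ssn

def linear_search(recs, key):
    """Scan every record; works on any file, ordered or not."""
    return [r for r in recs if r["ssn"] == key]

def binary_search(recs, sorted_keys, key):
    """Point query on an ordered file: halve the search space each step."""
    i = bisect_left(sorted_keys, key)
    if i < len(sorted_keys) and sorted_keys[i] == key:
        return [recs[i]]
    return []

def range_search(recs, sorted_keys, lo, hi):
    """Range query lo <= ssn <= hi: locate the start, then read in order."""
    return recs[bisect_left(sorted_keys, lo):bisect_right(sorted_keys, hi)]

def hash_lookup(index, key):
    """Point query through a hash structure; no help for range queries."""
    rec = index.get(key)
    return [rec] if rec is not None else []
```

The ordered structure serves both point and range queries, while the hash serves only point queries, which is why the list above pairs "primary index or hash" with point queries and "primary index" alone with range queries.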

Join Algorithms

- nested loop join
- single-scan join
- sort-merge join
- hash join

Example execution plan, sort-merge using indexes: http://docs.oracle.com/cd/E13085_01/doc/timesten.1121/e14261/query.htm
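Three of these algorithms can be sketched for an equi-join over in-memory relations. The function names, the sample `employee` and `works_on` tuples, and the in-memory setting are assumptions for illustration; real implementations work block by block on disk pages, and the single-scan join is not shown.

```python
from collections import defaultdict

def nested_loop_join(r, s, rk, sk):
    """Compare every tuple of r with every tuple of s."""
    return [(a, b) for a in r for b in s if a[rk] == b[sk]]

def hash_join(r, s, rk, sk):
    """Build a hash table on r, then probe it once per tuple of s."""
    buckets = defaultdict(list)
    for a in r:
        buckets[a[rk]].append(a)
    return [(a, b) for b in s for a in buckets.get(b[sk], [])]

def sort_merge_join(r, s, rk, sk):
    """Sort both inputs on the join key, then merge them in one pass."""
    r = sorted(r, key=lambda t: t[rk])
    s = sorted(s, key=lambda t: t[sk])
    out, i, j = [], 0, 0
    while i < len(r) and j < len(s):
        if r[i][rk] < s[j][sk]:
            i += 1
        elif r[i][rk] > s[j][sk]:
            j += 1
        else:
            # Collect the run of equal keys on each side and pair them up.
            key, i_end, j_end = r[i][rk], i, j
            while i_end < len(r) and r[i_end][rk] == key:
                i_end += 1
            while j_end < len(s) and s[j_end][sk] == key:
                j_end += 1
            out.extend((a, b) for a in r[i:i_end] for b in s[j:j_end])
            i, j = i_end, j_end
    return out

# Hypothetical sample data for the join condition ssn = essn.
employee = [{"ssn": 1, "lname": "Smith"}, {"ssn": 2, "lname": "Wong"},
            {"ssn": 3, "lname": "Zelaya"}]
works_on = [{"essn": 1, "pno": 10}, {"essn": 1, "pno": 20},
            {"essn": 3, "pno": 10}, {"essn": 4, "pno": 30}]

def canon(pairs):
    """Order-independent form of a join result, for comparing algorithms."""
    return sorted((a["ssn"], b["pno"]) for a, b in pairs)
```

All three produce the same set of matching pairs and differ only in how much work they do to find them, which is what cost-based optimization weighs when it picks an algorithm for each join operator in the tree.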

Multiple View Processing Plan (MVPP)

- view chromosome: 101100010100001
- index chromosome: 1100110
- fitness: sum of query processing costs of individual queries using the views and indexes selected

[Figure: MVPP graph for queries Q1, Q2, and Q3 over the base relations Customer (C), Orders (O), Lineitem (L), Nation (N), and Part (P). Nodes:]

- v1: π name, address, phone, acctbal, nationkey, custkey, mktsegment
- v2: π orderkey, orderdate, custkey, shippriority
- v3: π partkey, orderkey, shipdate, extendedprice
- v4: π nationkey, name
- v5: π partkey, type
- v6: ⋈ custkey
- v7: ⋈ orderkey
- v8: σ C.mktsegment = "building" and L.shipdate = "1995-03-15"
- v9: π O.orderkey, O.shippriority
- v10: ⋈ nationkey
- v11: σ O.orderdate = "1994-10-01"
- v12: π C.custkey, C.name, C.acctbal, N.name, C.address, C.phone
- v13: ⋈ partkey
- v14: σ L.shipdate = "1995-09-01"
- v15: π P.type, L.extendedprice

Source: thesis defense of Sirisha Machiraju, Space Allocation for Materialized Views and Indexes Using Genetic Algorithms, June 2002.
