Examples of Physical Query Plan Alternatives

Slides:



Advertisements
Similar presentations
Overview of Query Evaluation (contd.) Chapter 12 Ramakrishnan and Gehrke (Sections )
Advertisements

Query Optimization The Next-to-last Frontier
Query Optimization Reserves Sailors sid=sid bid=100 rating > 5 sname (Simple Nested Loops) Imperative query execution plan: SELECT S.sname FROM Reserves.
Database Management Systems, R. Ramakrishnan and J. Gehrke1 Evaluation of Relational Operations Chapter 12, Part A.
Query Optimization CS634 Lecture 12, Mar 12, 2014 Slides based on “Database Management Systems” 3 rd ed, Ramakrishnan and Gehrke.
Evaluation of Relational Operators CS634 Lecture 11, Mar Slides based on “Database Management Systems” 3 rd ed, Ramakrishnan and Gehrke.
Query Optimization Goal: Declarative SQL query
1 Overview of Query Evaluation Chapter Objectives  Preliminaries:  Core query processing techniques  Catalog  Access paths to data  Index matching.
Query Evaluation. An SQL query and its RA equiv. Employees (sin INT, ename VARCHAR(20), rating INT, age REAL) Maintenances (sin INT, planeId INT, day.
Query Evaluation. SQL to ERA SQL queries are translated into extended relational algebra. Query evaluation plans are represented as trees of relational.
1 Relational Query Optimization Module 5, Lecture 2.
Relational Query Optimization 198:541. Overview of Query Optimization  Plan: Tree of R.A. ops, with choice of alg for each op. Each operator typically.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Overview of Query Evaluation Chapter 12.
1 Implementation of Relational Operations Module 5, Lecture 1.
SPRING 2004CENG 3521 Join Algorithms Chapter 14. SPRING 2004CENG 3522 Schema for Examples Similar to old schema; rname added for variations. Reserves:
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Overview of Query Evaluation Chapter 12.
Overview of Query Evaluation R&G Chapter 12 Lecture 13.
Query Optimization: Transformations May 29 th, 2002.
Query Optimization II R&G, Chapters 12, 13, 14 Lecture 9.
CS186 Final Review Query Optimization.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Relational Query Optimization Chapter 15.
Query Optimization Overview Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2004 Some slide content derived.
Evaluation of Relational Operations. Relational Operations v We will consider how to implement: – Selection ( ) Selects a subset of rows from relation.
Overview of Query Optimization v Plan : Tree of R.A. ops, with choice of alg for each op. –Each operator typically implemented using a `pull’ interface:
Query Optimization R&G, Chapter 15 Lecture 16. Administrivia Homework 3 available today –Written exercise; will be posted on class website –Due date:
1 Implementation of Relational Operations: Joins.
Query Optimization, part 2 CS634 Lecture 13, Mar Slides based on “Database Management Systems” 3 rd ed, Ramakrishnan and Gehrke.
Overview of Implementing Relational Operators and Query Evaluation
Introduction to Database Systems1 Relational Query Optimization Query Processing: Topic 2.
Database Management Systems, R. Ramakrishnan and J. Gehrke1 Query Evaluation Chapter 12: Overview.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Overview of Implementing Relational Operators and Query Evaluation Chapter 12.
Query Optimization. overview Histograms A histogram is a data structure maintained by a DBMS to approximate a data distribution Equiwidth vs equidepth.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Overview of Query Evaluation Chapter 12.
1 Overview of Query Evaluation Chapter Overview of Query Evaluation  Plan : Tree of R.A. ops, with choice of alg for each op.  Each operator typically.
Database systems/COMP4910/Melikyan1 Relational Query Optimization How are SQL queries are translated into relational algebra? How does the optimizer estimates.
Advanced Databases: Lecture 8 Query Optimization (III) 1 Query Optimization Advanced Databases By Dr. Akhtar Ali.
Database Systems/comp4910/spring20031 Evaluation of Relational Operations Why does a DBMS implements several algorithms for each algebra operation? What.
Implementing Natural Joins, R. Ramakrishnan and J. Gehrke with corrections by Christoph F. Eick 1 Implementing Natural Joins.
1 Database Systems ( 資料庫系統 ) December 3, 2008 Lecture #10.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Overview of Implementing Relational Operators and Query Evaluation Chapter 12.
Introduction to Query Optimization, R. Ramakrishnan and J. Gehrke 1 Introduction to Query Optimization Chapter 13.
Database Management Systems, R. Ramakrishnan and J. Gehrke1 Introduction to Query Optimization Chapter 13.
Relational Operator Evaluation. Overview Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g.,
Implementation of Database Systems, Jarek Gryz1 Evaluation of Relational Operations Chapter 12, Part A.
Implementation of Database Systems, Jarek Gryz1 Relational Query Optimization Chapters 12.
Alon Levy 1 Relational Operations v We will consider how to implement: – Selection ( ) Selects a subset of rows from relation. – Projection ( ) Deletes.
1 Database Systems ( 資料庫系統 ) Chapter 12 Overview of Query Evaluation November 22, 2004 By Hao-hua Chu ( 朱浩華 )
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Evaluation of Relational Operations Chapter 14, Part A (Joins)
Database Management Systems, R. Ramakrishnan and J. Gehrke1 Introduction To Query Optimization and Examples Chpt
Database Applications (15-415) DBMS Internals- Part IX Lecture 20, March 31, 2016 Mohammad Hammoud.
Running Example – Airline
CS222P: Principles of Data Management Lecture #15 Query Optimization (System-R) Instructor: Chen Li.
Introduction to Query Optimization
Evaluation of Relational Operations
Overview of Query Optimization
Introduction to Database Systems
Examples of Physical Query Plan Alternatives
Relational Operations
Database Applications (15-415) DBMS Internals- Part IX Lecture 21, April 1, 2018 Mohammad Hammoud.
Relational Query Optimization
Overview of Query Evaluation
Overview of Query Evaluation
Implementation of Relational Operations
Relational Query Optimization
CS222P: Principles of Data Management Notes #13 Set operations, Aggregation, Query Plans Instructor: Chen Li.
Database Systems (資料庫系統)
Overview of Query Evaluation
CS222: Principles of Data Management Lecture #15 Query Optimization (System-R) Instructor: Chen Li.
Relational Algebra Chpt 4a Xintao Wu Raghu Ramakrishnan
Presentation transcript:

Examples of Physical Query Plan Alternatives Selections from Chapters 12, 14, 15 The slides for this text are organized into chapters. This lecture covers Chapter 12, providing an overview of query optimization and execution. This chapter is the first of a sequence (Chapters 12, 13, 14, 15) on query evaluation that might be covered in full in a course with a systems emphasis. It can also be used stand-alone, as a self-contained overview of these issues, in a course with an application emphasis. It covers the essential concepts in sufficient detail to support a discussion of physical database design and tuning in Chapter 20. 1

Query Optimization NOTE: Relational query languages provide a wide variety of ways in which a user can express. HENCE: system has many options for evaluating a query. Optimizer is important for query performance. Generates alternative plans Choose plan with least estimated cost. Ideally, find best plan. Realistically, consistently find a quite good one.

A Query (Evaluation) Plan An extended relational algebra tree Annotations at each node indicate: access methods to use for each table. implementation methods used for each relational operator. Reserves Sailors sid=sid bid=100 rating > 5 sname (Simple Nested Loops) (On-the-fly) Reserves Sailors sid=sid bid=100 rating > 5 sname

Query Optimization Multi-operator Queries: Pipelined Evaluation C B A On-the-fly: The result of one operator is pipelined to another operator without creating a temporary table to hold intermediate result, called on-the-fly. Materialized : Otherwise, intermediate results must be materialized. C B A

Alternative Plans: Schema Examples Sailors (sid: integer, sname: string, rating: integer, age: real) Reserves (sid: integer, bid: integer, day: dates, rname: string) Reserves: Each tuple is 40 bytes long, 100 tuples per page, 1000 pages. Sailors: Each tuple is 50 bytes long, 80 tuples per page, 500 pages. 3

Alternative Plans: Motivating Example SELECT S.sname FROM Reserves R, Sailors S WHERE R.sid=S.sid AND R.bid=100 AND S.rating>5 RA Tree: Reserves Sailors sid=sid bid=100 rating > 5 sname 4

RA Tree: R.bid=100 AND S.rating>5 Plan: SELECT S.sname Reserves Sailors sid=sid bid=100 rating > 5 sname SELECT S.sname FROM Reserves R, Sailors S WHERE R.sid=S.sid AND R.bid=100 AND S.rating>5 Costs : 1. Scan Sailors : For each page of Sailors, scan Reserves 500+500*1000 I/Os Or, 2. Scan Reserves For each page of Reserves, scan Sailors 1000+1000 * 500 I/Os Reserves Sailors sid=sid bid=100 rating > 5 sname (Simple Nested Loops) (On-the-fly) Plan: 4

Alternative Plans: Motivating Example RA Tree: Reserves Sailors sid=sid bid=100 rating > 5 sname SELECT S.sname FROM Reserves R, Sailors S WHERE R.sid=S.sid AND R.bid=100 AND S.rating>5 Cost: 500+500*1000 I/Os Almost the worst plan! Reasons : selections could be `pushed’ earlier, no use made of indexes Goal of optimization: To find more efficient plans that compute the same answer. Reserves Sailors sid=sid bid=100 rating > 5 sname (Simple Nested Loops) (On-the-fly) Plan: 4

Alternative Plans 1 (No Indexes) Reserves Sailors sid=sid bid=100 sname (On-the-fly) rating > 5 (Scan; write to temp T1) temp T2) (Sort-Merge Join) Alternative Plans 1 (No Indexes) Main difference: push selects. Reduce size of table to be joined With 5 buffers, cost of plan: Scan Reserves (1000) + write temp T1 (10 pages, if we have 100 boats, uniform distribution). Scan Sailors (500) + write temp T2 (250 pages, if we have 10 ratings). Sort T1 (2*2*10), sort T2 (2*4*250), merge (10+250) Total: 4060 page I/Os. Optimization1: block nested loops join: join cost = 10+4*250, total cost = 2770. Optimization2: `push’ projections: T1 has only sid, T2 only sid and sname: T1 fits in 3 pages, cost of BNL drops to under 250 pages, total < 2000. 5

Alternative Plan : Using Index ? Push Selections Down ? What indices help here? Index on Reserves.bid? Index on Sailors.sid? Index on Sailors.rating? Reserves Sailors sid=sid bid=100 sname rating > 5 6

Example Plan : With Index With index on Reserves.bid : Assume 100 different bid values. Assume 100,000 tuples. Assume 100 tuples/disk page We get 100,000/100 = 1000 tuples On 1000/100 = 10 disk pages. If index clustered, Cost = 10 I/Os. Reserves Sailors sid=sid bid=100 sname (On-the-fly) rating > 5 (Use hash index; do not write result to temp) (Index Nested Loops, with pipelining ) 6

Example Plan : Use Another Index Index on Sailors.bid? Selection on bid reduces number of tuples considered in join. INL with pipelining : Outer is not materialized Projecting out unnecessary fields from outer doesn’t help. Reserves Sailors sid=sid bid=100 sname (On-the-fly) rating > 5 (Use hash index; do not write result to temp) (Index Nested Loop Join, with pipelining ) 6

Example Plan Continued Index on Sailors.sid : - Join column sid is key for Sailors. - At most one matching tuple, unclustered on sid OK. Cost? - For each Reserves tuples (1000): get matching Sailors tuple (1.2 I/O); so total 1210 I/Os. Reserves Sailors sid=sid bid=100 sname (On-the-fly) rating > 5 (Use hash index; do not write result to temp) (Index Nested Loops, with pipelining ) 6

Alternative Plan : With Second Index Selection Pushing down? Push (rating>5) before join ? Answer: No, because of availability of sid index on Sailors. Reason : No index on selection result. Then selection requires scan Sailors. Reserves Sailors sid=sid bid=100 sname (On-the-fly) rating > 5 (Use hash index; do not write result to temp) (Index Nested Loops, with pipelining ) 6

Summary A query is evaluated by converting it to a tree of operators and evaluating the operators in the tree. There are several alternative evaluation algorithms for each relational operator. Query evaluation must compare alternative plans based on their estimated costs Must understand query optimization in order to fully understand the performance impact of a given database design on a query workload 19