CS 257, Spring ’08. Presented by: Gayatri Gopalakrishnan, ID: 201.



Agenda
- Query Compilation
- Physical-Query-Plan Operators
- Scanning Tables
- Sorting While Scanning
- Parameters for Measuring Cost
- I/O Cost for Scan Operators
- Iterators for Implementation of Physical Operators

Query Processing
High-level SQL query → Query processor → Low-level data manipulation steps

Query Compilation
Query compilation consists of three steps:
- Parsing
  - A parse tree is constructed from the query and its structure.
- Query rewrite
  - The parse tree is converted into the initial query plan.
  - The initial query plan is an algebraic representation of the query.
  - The initial query plan is transformed into an equivalent plan that is expected to take less time to execute.
  - The result is also called the logical query plan.
- Physical plan generation
  - The physical query plan is generated.
  - An algorithm is selected to implement each operator in the logical query plan.
  - Like the parse tree and the logical query plan, the physical plan is represented by an expression tree.
  - It contains details such as how the relations are accessed and when they need to be sorted.

Query Compilation (stages)
SQL Query → Parse Query → Select Logical Query Plan → Select Physical Query Plan → Execute Plan

Example of Transforming a Query
Find the last names of employees born after 1957 who work on a project named ‘Aquarius’.

    SELECT LNAME
    FROM EMPLOYEE, WORKS_ON, PROJECT
    WHERE PNAME='Aquarius' AND PNUMBER=PNO
      AND ESSN=SSN AND BDATE > ' ';

Relations:
- EMPLOYEE (LNAME, SSN, BDATE, ...)
- WORKS_ON (ESSN, PNO, HOURS)
- PROJECT (PNUMBER, PLOCATION, PNAME, DNUM)

Initial Tree (the next step is to push the selections down)
The query translates directly into a projection over one selection over the Cartesian product of the three relations:

    π LNAME
      σ PNAME='Aquarius' AND PNUMBER=PNO AND ESSN=SSN AND BDATE > ' '
        ×
          ×
            EMPLOYEE (LNAME, SSN, BDATE, ...)
            WORKS_ON (ESSN, PNO, HOURS)
          PROJECT (PNUMBER, PLOCATION, PNAME, DNUM)

Tree after pushing down selections and projections

    π LNAME
      ⋈ ESSN=SSN
        π ESSN
          ⋈ PNUMBER=PNO
            π PNUMBER
              σ PNAME='Aquarius'
                PROJECT
            π ESSN, PNO
              WORKS_ON
        π SSN, LNAME
          σ BDATE > ' '
            EMPLOYEE
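The effect of the transformation can be checked on toy data. The following is a minimal sketch, not the presenter's code: the rows are made up for illustration, and the cutoff date `CUTOFF` is an assumed value because the transcript elides the actual date. It evaluates the initial plan (Cartesian product, then one selection) and the pushed-down plan, and both return the same last names while the second touches far fewer row combinations.

```python
from itertools import product

# Toy relations as lists of dicts; every row is made-up illustration data.
EMPLOYEE = [
    {"LNAME": "Smith", "SSN": 1, "BDATE": "1965-01-09"},
    {"LNAME": "Wong", "SSN": 2, "BDATE": "1955-12-08"},
    {"LNAME": "Zelaya", "SSN": 3, "BDATE": "1968-07-19"},
]
WORKS_ON = [
    {"ESSN": 1, "PNO": 10, "HOURS": 32.5},
    {"ESSN": 2, "PNO": 10, "HOURS": 40.0},
    {"ESSN": 3, "PNO": 20, "HOURS": 20.0},
]
PROJECT = [
    {"PNUMBER": 10, "PNAME": "Aquarius", "PLOCATION": "Houston", "DNUM": 5},
    {"PNUMBER": 20, "PNAME": "Gemini", "PLOCATION": "Stafford", "DNUM": 4},
]
CUTOFF = "1957-12-31"  # assumption: the source elides the actual date


def initial_plan():
    """Projection over one selection over the full Cartesian product."""
    rows_examined = 0
    result = []
    for e, w, p in product(EMPLOYEE, WORKS_ON, PROJECT):
        rows_examined += 1
        if (p["PNAME"] == "Aquarius" and p["PNUMBER"] == w["PNO"]
                and w["ESSN"] == e["SSN"] and e["BDATE"] > CUTOFF):
            result.append(e["LNAME"])
    return sorted(result), rows_examined


def pushed_down_plan():
    """Apply selections and projections first, then join the small inputs."""
    # sigma PNAME='Aquarius' then pi PNUMBER on PROJECT
    pnumbers = {p["PNUMBER"] for p in PROJECT if p["PNAME"] == "Aquarius"}
    # join with pi ESSN,PNO (WORKS_ON), keeping only ESSN
    essns = {w["ESSN"] for w in WORKS_ON if w["PNO"] in pnumbers}
    # sigma BDATE > cutoff then pi SSN,LNAME on EMPLOYEE
    young = [(e["SSN"], e["LNAME"]) for e in EMPLOYEE if e["BDATE"] > CUTOFF]
    # final join on ESSN=SSN, then pi LNAME
    return sorted(lname for ssn, lname in young if ssn in essns)
```

The sets in `pushed_down_plan` are a simplification that is safe here only because the toy data has no duplicate (ESSN, PNO) pairs.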

Introduction to Physical Query Plan Operators
- Physical query plans are built from physical operators.
- Physical operators are implementations of relational algebra operators.
- There are also physical operators for other tasks, such as “scan table”, which brings the contents of a relation into main memory.
- An iterator is a method by which the operators of a physical plan pass tuples among themselves.

Scanning Tables
- Scanning is a basic operation in most physical query plans: join queries, union queries, queries with a predicate.
- Two basic approaches to locating the tuples of a relation:
  - Table scan
  - Index scan

Scanning Tables
- Table scan
  - R is stored on secondary storage.
  - The tuples of R are arranged in blocks.
  - Fetch the blocks one by one.
- Index scan
  - There is an index on an attribute of R.
  - A sparse index on R can lead us to the blocks holding R.
  - Fetch the tuples based on the index.
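The difference between the two approaches can be sketched by counting block reads. This is an illustrative sketch under stated assumptions: the relation is a hypothetical in-memory list of blocks, the index is a simple value-to-block-numbers map (not the sparse index mentioned on the slide), and the cost of reading the index itself is ignored.

```python
from collections import defaultdict

# R stored as a list of blocks; each block holds a few (id, city) tuples.
# The data and layout are made up for illustration.
BLOCKS = [
    [(1, "Oslo"), (2, "Rome")],
    [(3, "Lima"), (4, "Rome")],
    [(5, "Oslo"), (6, "Cairo")],
]


def table_scan(blocks, predicate):
    """Fetch every block one by one and filter its tuples."""
    reads = 0
    out = []
    for block in blocks:
        reads += 1  # one block read per block of R
        out.extend(t for t in block if predicate(t))
    return out, reads


def build_index(blocks, attr_pos):
    """Map each value of the indexed attribute to the blocks that hold it."""
    index = defaultdict(set)
    for block_no, block in enumerate(blocks):
        for t in block:
            index[t[attr_pos]].add(block_no)
    return index


def index_scan(blocks, index, attr_pos, value):
    """Fetch only the blocks the index points to for the given value."""
    reads = 0
    out = []
    for block_no in sorted(index.get(value, ())):
        reads += 1
        out.extend(t for t in blocks[block_no] if t[attr_pos] == value)
    return out, reads
```

For a selective predicate such as city = "Cairo", the index scan reads one block where the table scan reads all three.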

Sorting While Scanning Tables
- An ORDER BY clause requires a relation R to be sorted.
- Some relational algebra operators require one or both of their arguments to be sorted relations.
- The physical query plan operator “SORT-SCAN” takes a relation R and a specification of the attributes on which the sort is to be made.

Implementing the SORT-SCAN Operator
- If R is to be sorted by attribute a
  - and R is stored as an indexed sequential file, or there is a B-tree index on attribute a,
  - then a scan of the index produces R in sorted order.
- If R is small enough
  - Retrieve all tuples into main memory by a table scan or index scan.
  - Use an efficient main-memory sorting algorithm.
- If R is too large
  - Use the multiway merge approach.
  - Produce one sorted block at a time in main memory, as tuples are needed.
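The last two cases can be sketched as follows. This is a simplified illustration, not a real two-phase multiway merge sort: the function name `sort_scan` and the parameter `memory_blocks` are hypothetical, and the sorted sublists are kept in Python lists, whereas a real system would write them to disk and read them back one block at a time.

```python
import heapq


def sort_scan(blocks, key, memory_blocks):
    """Yield the tuples of R in sorted order.

    blocks: R as a list of blocks (lists of tuples).
    memory_blocks: assumed number of blocks that fit in main memory.
    """
    if len(blocks) <= memory_blocks:
        # R is small enough: load everything and sort in main memory.
        tuples = [t for block in blocks for t in block]
        yield from sorted(tuples, key=key)
        return

    # Phase 1: build sorted sublists, each from one memory load of blocks.
    runs = []
    for start in range(0, len(blocks), memory_blocks):
        chunk = [t for b in blocks[start:start + memory_blocks] for t in b]
        runs.append(sorted(chunk, key=key))

    # Phase 2: merge the sublists lazily, so tuples come out as needed.
    yield from heapq.merge(*runs, key=key)
```

Because the function is a generator, the consumer pulls sorted tuples on demand, which mirrors the "as tuples are needed" behaviour described on the slide.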

Parameters for Measuring Costs
- Memory required to store the arguments and intermediate results of the operator
  - M is the number of main-memory buffers available for executing an operator, as estimated by the query optimizer.
  - M may be decided during execution; if the actual availability is less than M, the query takes longer than predicted.
- Cost of accessing the argument relations
  - The size and distribution of the data in a relation are computed periodically to help the optimizer.
  - B(R): the number of blocks on which relation R is stored
  - T(R): the number of tuples in relation R
  - V(R, a): the number of distinct values that appear in column a of R
- Based on the memory and data-access parameters, the query optimizer chooses the best physical operator.
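As a small worked example, the three statistics can be computed for a toy relation. The sketch below assumes R is clustered (its tuples are packed into as few blocks as possible) and that a fixed number of tuples fits in each block; the function name `relation_stats` and the sample data are hypothetical.

```python
import math


def relation_stats(tuples, tuples_per_block, column):
    """Return (B, T, V) for a relation held as a list of tuples.

    B: blocks needed if the tuples are packed tightly (clustered).
    T: number of tuples.
    V: number of distinct values in the given column position.
    """
    T = len(tuples)
    B = math.ceil(T / tuples_per_block)
    V = len({t[column] for t in tuples})
    return B, T, V


# Made-up relation R(id, city) with two tuples per block.
R = [(1, "Oslo"), (2, "Rome"), (3, "Lima"), (4, "Rome"), (5, "Oslo")]
B, T, V = relation_stats(R, tuples_per_block=2, column=1)
```

Here five tuples at two per block need three blocks, and the city column holds three distinct values.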

I/O Costs for Scan Operators
- Cost for table scan with sorting (sort-scan)
  - Assumption for the example: R is too large to fit in main memory, so a two-phase multiway merge sort is required.
  - If R is clustered, the number of I/Os is 3B:
    - B for reading R to form the sorted sublists
    - B for writing out the sublists
    - B for re-reading the sublists
  - If R is not clustered, the number of I/Os can be as much as T + 2B:
    - T for reading R to form the sorted sublists
    - B for writing out the sublists
    - B for re-reading the sublists
  - For comparison, a plain table scan without sorting costs B if R is clustered and up to T if it is not.
- Cost for index scan: the cost of examining the index plus the cost of reading the relation.
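The cost rules can be written as a small function. This is a sketch of the arithmetic only; the name `sort_scan_io` is hypothetical, and the `fits_in_memory` branch (a single read of R, with no sublists written) is an assumption added here to cover the case where no two-phase sort is needed.

```python
def sort_scan_io(B, T, clustered, fits_in_memory):
    """Estimate the number of disk I/Os for a sort-scan of relation R.

    B: number of blocks of R; T: number of tuples of R.
    clustered: True if the tuples of R are packed into B blocks.
    fits_in_memory: True if R can be sorted entirely in main memory.
    """
    # Reading R once costs B if clustered, up to T otherwise
    # (worst case: one I/O per tuple).
    read_once = B if clustered else T
    if fits_in_memory:
        return read_once
    # Two-phase multiway merge sort: write the sublists (B), re-read them (B).
    return read_once + 2 * B
```

For example, with B = 1,000 and T = 20,000, a clustered relation that does not fit in memory costs 3,000 I/Os, while the unclustered case can cost up to 22,000.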

Iterators for Implementation of Physical Operators
- An iterator is a group of three functions.
- An iterator lets the consumer of a physical operator's result get that result one tuple at a time.
- The three functions in an iterator:
  - Open
    - Initializes the data structures needed for the operation.
  - GetNext
    - Returns the next tuple in the result.
    - Adjusts the data structures so that subsequent tuples can be obtained.
    - If no more tuples are present, returns NotFound.
  - Close
    - Ends the iteration after all the tuples have been obtained.

Iterator for Table Scan

    Open() {
        b := the first block of R;
        t := the first tuple of block b;
    }

    GetNext() {
        IF (t is past the last tuple on block b) {
            increment b to the next block;
            IF (there is no next block)
                RETURN NotFound;
            ELSE /* b is a new block */
                t := first tuple on block b;
        }
        /* now we are ready to return t and increment */
        oldt := t;
        increment t to the next tuple of b;
        RETURN oldt;
    }

    Close() {
    }
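The pseudocode can be turned into a runnable sketch. The version below is one possible Python rendering, not the presenter's implementation: the relation is a hypothetical in-memory list of blocks, `NOT_FOUND` is a sentinel standing in for NotFound, and the `Select` operator and `run` helper are added only to show how one operator pulls tuples from another through the same three functions.

```python
NOT_FOUND = object()  # sentinel standing in for the slide's NotFound


class TableScan:
    """Iterator over a relation stored as a list of blocks (lists of tuples)."""

    def __init__(self, blocks):
        self.blocks = blocks

    def open(self):
        self.b = 0  # index of the current block
        self.t = 0  # index of the current tuple within that block

    def get_next(self):
        # Move to the next block while the current one is exhausted;
        # the loop also skips empty blocks.
        while self.b < len(self.blocks) and self.t >= len(self.blocks[self.b]):
            self.b += 1
            self.t = 0
        if self.b >= len(self.blocks):
            return NOT_FOUND
        old_t = self.blocks[self.b][self.t]
        self.t += 1
        return old_t

    def close(self):
        self.b = self.t = None


class Select:
    """Filtering operator that pulls tuples from a child iterator."""

    def __init__(self, child, predicate):
        self.child = child
        self.predicate = predicate

    def open(self):
        self.child.open()

    def get_next(self):
        while True:
            t = self.child.get_next()
            if t is NOT_FOUND or self.predicate(t):
                return t

    def close(self):
        self.child.close()


def run(op):
    """Drive an operator: Open, call GetNext until NotFound, then Close."""
    op.open()
    out = []
    while (t := op.get_next()) is not NOT_FOUND:
        out.append(t)
    op.close()
    return out
```

A plan such as `run(Select(TableScan(blocks), predicate))` never materializes the whole scan result; each call to `get_next` on the selection pulls only as many tuples from the scan as it needs.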

Thank You

References
- Database Systems: The Complete Book, by Hector Garcia-Molina, Jeffrey D. Ullman, and Jennifer D. Widom
- www.sky.fit.qut.edu.au