CS 257, Spring’08. Presented by: Gayatri Gopalakrishnan, ID: 201.

1 CS 257, Spring’08
Presented by: Gayatri Gopalakrishnan
ID: 201

2 Agenda
- Query Compilation
- Physical-Query-Plan Operators
- Scanning Tables
- Sorting While Scanning
- Parameters for Measuring Cost
- I/O Cost for Scan Operators
- Iterators for Implementation of Physical Operators

3 Query Processing
High-level SQL query -> Query processor -> Low-level data manipulation steps

4 Query Compilation
Query compilation consists of three steps:
- Parsing
  - A parse tree is constructed from the query and its structure.
- Query rewrite
  - The parse tree is converted into the initial query plan, an algebraic representation of the query.
  - The initial query plan is transformed into an equivalent plan that is expected to take less time to execute.
  - The result is also called the logical query plan.
- Physical plan generation
  - A physical query plan is generated.
  - An algorithm is selected to implement each operator in the logical query plan.
  - Like the parse tree and the logical query plan, the physical plan is represented by an expression tree.
  - The physical plan contains details such as how the relations are accessed and when they need to be sorted.

5 Query Compilation
SQL query -> Parse query -> Select logical query plan -> Select physical query plan -> Execute plan

6 Example of Transforming a Query
Find the last names of employees born after 1957 who work on a project named 'Aquarius'.
SELECT LNAME
FROM EMPLOYEE, WORKS_ON, PROJECT
WHERE PNAME='Aquarius' AND PNUMBER=PNO
AND ESSN=SSN AND BDATE > '1957-12-31';
Relations:
EMPLOYEE(LNAME, SSN, BDATE, ...)
WORKS_ON(ESSN, PNO, HOURS)
PROJECT(PNUMBER, PLOCATION, PNAME, DNUM)

7 Initial Tree (next step: push down select)
Select LNAME From EMPLOYEE, WORKS_ON, PROJECT Where PNUMBER=PNO and ESSN=SSN and BDATE > '1957-12-31' and PNAME='Aquarius'
π LNAME
  σ PNAME='Aquarius' AND PNUMBER=PNO AND ESSN=SSN AND BDATE > '1957-12-31'
    ×
      ×
        EMPLOYEE(LNAME, SSN, BDATE, ...)
        WORKS_ON(ESSN, PNO, HOURS)
      PROJECT(PNUMBER, PLOCATION, PNAME, DNUM)

8 Transformed tree, with selections and projections pushed down
π LNAME
  ⋈ ESSN=SSN
    π ESSN
      ⋈ PNUMBER=PNO
        π PNUMBER
          σ PNAME='Aquarius'
            PROJECT
        π ESSN, PNO
          WORKS_ON
    π SSN, LNAME
      σ BDATE > '1957-12-31'
        EMPLOYEE
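
The effect of pushing selections down can be checked on toy data. The following Python sketch is an illustration only and is not part of the original slides: the sample rows are invented, and `initial_plan` and `pushed_down_plan` are hypothetical names. Both functions should return the same last names, while the second one examines far fewer tuple combinations.

```python
from itertools import product

# Invented sample rows; column names follow the schema on slide 6.
EMPLOYEE = [
    {"LNAME": "Smith", "SSN": 1, "BDATE": "1965-01-09"},
    {"LNAME": "Wong", "SSN": 2, "BDATE": "1955-12-08"},
    {"LNAME": "Zelaya", "SSN": 3, "BDATE": "1968-07-19"},
]
WORKS_ON = [
    {"ESSN": 1, "PNO": 10, "HOURS": 32.5},
    {"ESSN": 2, "PNO": 10, "HOURS": 10.0},
    {"ESSN": 3, "PNO": 20, "HOURS": 40.0},
]
PROJECT = [
    {"PNUMBER": 10, "PNAME": "Aquarius", "PLOCATION": "Houston", "DNUM": 5},
    {"PNUMBER": 20, "PNAME": "Gemini", "PLOCATION": "Bellaire", "DNUM": 4},
]


def initial_plan():
    """Cross product of all three relations, then one big selection."""
    examined = 0
    result = []
    for e, w, p in product(EMPLOYEE, WORKS_ON, PROJECT):
        examined += 1
        if (p["PNAME"] == "Aquarius" and p["PNUMBER"] == w["PNO"]
                and w["ESSN"] == e["SSN"] and e["BDATE"] > "1957-12-31"):
            result.append(e["LNAME"])
    return sorted(result), examined


def pushed_down_plan():
    """Apply the single-relation selections first, then join the survivors."""
    projects = [p for p in PROJECT if p["PNAME"] == "Aquarius"]
    employees = [e for e in EMPLOYEE if e["BDATE"] > "1957-12-31"]
    examined = 0
    essns = set()
    for p, w in product(projects, WORKS_ON):
        examined += 1
        if p["PNUMBER"] == w["PNO"]:
            essns.add(w["ESSN"])
    result = []
    for e in employees:
        examined += 1
        if e["SSN"] in essns:
            result.append(e["LNAME"])
    return sorted(result), examined


if __name__ == "__main__":
    print(initial_plan())
    print(pushed_down_plan())
```

Dates are compared as ISO-formatted strings, which order the same way as the dates themselves.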

9 Introduction to Physical Query Plan Operators
- Physical query plans are built from physical operators.
- Most physical operators are implementations of relational algebra operators.
- Other physical operators perform tasks outside relational algebra, such as "scan table", which brings the contents of a relation into main memory.
- An iterator is a mechanism by which the operators of a physical plan pass tuples among themselves.

10 Scanning Tables
- Scanning is a basic step in most physical query plans, for example join queries, union queries, and queries with a predicate.
- Two basic approaches to locating the tuples of a relation:
  - Table scan
  - Index scan

11 Scanning Tables
Table scan
- R is stored on secondary storage.
- The tuples of R are arranged in blocks.
- The blocks are fetched one by one.
Index scan
- There is an index on some attribute of R.
- A sparse index on R can lead us to the blocks holding R.
- The tuples are fetched by following the index.
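
The two approaches can be contrasted in a few lines of Python. This is a minimal sketch under stated assumptions, not a storage engine: the relation is an in-memory list of blocks, the rows are invented, and `build_index` creates a dense dictionary index (one entry per tuple) as a stand-in for the sparse index the slide mentions. All function names are hypothetical.

```python
# A relation stored as a list of blocks; each block holds a few tuples.
BLOCKS = [
    [("Smith", 30), ("Wong", 40)],
    [("Zelaya", 25), ("Jabbar", 40)],
    [("Borg", 55)],
]


def table_scan(blocks):
    """Fetch the blocks one by one and yield every tuple."""
    for block in blocks:
        for tup in block:
            yield tup


def build_index(blocks, column):
    """Map each value of `column` to the (block, slot) positions holding it."""
    index = {}
    for b, block in enumerate(blocks):
        for s, tup in enumerate(block):
            index.setdefault(tup[column], []).append((b, s))
    return index


def index_scan(blocks, index, value):
    """Yield only the tuples the index points to for `value`."""
    for b, s in index.get(value, []):
        yield blocks[b][s]


if __name__ == "__main__":
    idx = build_index(BLOCKS, column=1)
    print(list(table_scan(BLOCKS)))
    print(list(index_scan(BLOCKS, idx, 40)))
```

The table scan touches every block, while the index scan touches only the blocks that hold matching tuples.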

12 Sorting While Scanning Tables
- An ORDER BY clause requires a relation R to be sorted.
- Some relational algebra operators require one or both of their arguments to be sorted relations.
- The physical query plan operator "SORT-SCAN" takes a relation R and a specification of the attributes on which the sort is to be made.

13 Implementing the SORT-SCAN Operator
- If R must be sorted by attribute a, and R is stored as an indexed sequential file or there is a B-tree index on a:
  - A scan of the index produces R in sorted order.
- If R is small enough to fit in main memory:
  - Retrieve all tuples into main memory by a table scan or an index scan.
  - Use an efficient main-memory sorting algorithm.
- If R is too large to fit in main memory:
  - Use the multiway merge approach.
  - Produce the sorted output one block at a time in main memory, as tuples are needed.
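
The last two cases can be sketched in Python. This is an illustration under simplifying assumptions, not the implementation from the slides: the relation is a list of in-memory blocks, the sorted sublists are kept in Python lists rather than written to disk, and the sketch does not check that the number of sublists fits in the available buffers. The name `sort_scan` and its parameters are hypothetical.

```python
import heapq


def sort_scan(blocks, key, memory_blocks):
    """Yield the tuples of `blocks` in order of `key`.

    `memory_blocks` plays the role of M, the number of buffers available.
    """
    if len(blocks) <= memory_blocks:
        # R fits in memory: load everything and sort it there.
        yield from sorted((t for b in blocks for t in b), key=key)
        return
    # Phase 1: sort M blocks at a time into sorted sublists.
    # (A real system would write each sublist back to disk.)
    sublists = []
    for i in range(0, len(blocks), memory_blocks):
        chunk = [t for b in blocks[i:i + memory_blocks] for t in b]
        sublists.append(sorted(chunk, key=key))
    # Phase 2: merge the sorted sublists, producing tuples lazily.
    yield from heapq.merge(*sublists, key=key)


if __name__ == "__main__":
    data = [[5, 1], [4, 2], [3, 0]]
    print(list(sort_scan(data, key=lambda x: x, memory_blocks=2)))
```

Because `sort_scan` is a generator, a consumer receives sorted tuples as it asks for them, in the spirit of producing output only as tuples are needed.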

14 Parameters for Measuring Costs
Memory required to store the arguments and intermediate results of the operator
- M is the number of main-memory buffers available for execution of an operator, as estimated by the query optimizer.
- M may be decided during execution; if fewer than M buffers are actually available, the query takes longer than predicted.
Cost of accessing the argument relations
- The size and distribution of the data in a relation are computed periodically to help the optimizer.
- B(R): the number of blocks on which relation R is stored.
- T(R): the number of tuples in R.
- V(R, a): the number of distinct values that appear in column a of R.
Based on the memory and data-access parameters, the query optimizer chooses the best physical operator.
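
As a small illustration of the three statistics, the Python sketch below computes them for a toy relation. The rows are invented, `stats` is a hypothetical helper, and the block count assumes the tuples are packed into blocks of a fixed capacity with no wasted space.

```python
import math


def stats(tuples, tuples_per_block, column):
    """Return (B, T, V) for a relation held as a list of tuples."""
    t = len(tuples)
    b = math.ceil(t / tuples_per_block)  # assumes R is packed into blocks
    v = len({row[column] for row in tuples})
    return b, t, v


# Invented rows: (last name, department number).
R = [("Smith", 5), ("Wong", 5), ("Zelaya", 4), ("Jabbar", 4), ("Borg", 1)]

if __name__ == "__main__":
    print(stats(R, tuples_per_block=2, column=1))
```

With two tuples per block, the five tuples occupy three blocks, and the department column holds three distinct values.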

15 I/O Costs for Scan Operators
Cost of a sort-scan
- Assumptions for the example:
  - Relation R does not fit in main memory.
  - A two-phase multiway merge sort is required.
- If R is clustered, the number of I/Os is 3B:
  - B for reading R to form the sorted sublists
  - B for writing out the sublists
  - B for re-reading the sublists
- If R is not clustered, the number of I/Os can be as much as T + 2B:
  - T for reading R to form the sorted sublists
  - B for writing out the sublists
  - B for re-reading the sublists
- For comparison, a plain table scan with no sorting reads R once: B I/Os if R is clustered, and up to T if it is not.
Cost of an index scan: the cost of examining the index plus the cost of reading the relation.
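
The 3B and T + 2B counts above can be written as a small cost function. This Python sketch is only a worked example: the function name and the sample values of B and T are assumptions chosen for illustration.

```python
def two_phase_sort_io(b, t, clustered):
    """Disk I/Os to scan R in sorted order with a two-phase multiway merge sort.

    b: B(R), the number of blocks holding R
    t: T(R), the number of tuples in R
    """
    # Reading R costs B if its tuples are packed together (clustered);
    # otherwise each tuple may sit on a different block, so up to T.
    first_read = b if clustered else t
    write_sublists = b
    reread_sublists = b
    return first_read + write_sublists + reread_sublists


if __name__ == "__main__":
    # Hypothetical relation: 1,000 blocks holding 20,000 tuples.
    print(two_phase_sort_io(1000, 20000, clustered=True))
    print(two_phase_sort_io(1000, 20000, clustered=False))
```

For this hypothetical relation the clustered case costs 3,000 I/Os and the unclustered case up to 22,000, which shows how strongly clustering affects the estimate.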

16 Iterators for Implementation of Physical Operators
- An iterator is a group of three functions.
- It allows the consumer of a physical operator's result to obtain that result one tuple at a time.
- The three functions of an iterator:
  - Open
    - Initializes the data structures needed for the operation.
  - GetNext
    - Returns the next tuple in the result.
    - Adjusts the data structures so that subsequent tuples can be obtained.
    - Returns NotFound if there are no more tuples.
  - Close
    - Ends the iteration after all the tuples have been obtained.
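
To show how operators pass tuples to one another through these three functions, here is a minimal Python sketch. It is an illustration, not code from the slides: the class names `ListScan` and `Select`, the helper `run`, and the `NOT_FOUND` sentinel are all hypothetical, and the leaf operator reads from an in-memory list rather than from disk.

```python
NOT_FOUND = object()  # sentinel standing in for the slides' NotFound


class ListScan:
    """Leaf operator: hands out the tuples of an in-memory list."""

    def __init__(self, rows):
        self.rows = rows

    def open(self):
        self.pos = 0

    def get_next(self):
        if self.pos >= len(self.rows):
            return NOT_FOUND
        row = self.rows[self.pos]
        self.pos += 1
        return row

    def close(self):
        pass  # nothing to release in this sketch


class Select:
    """Filter operator: pulls tuples from its child one at a time."""

    def __init__(self, child, predicate):
        self.child = child
        self.predicate = predicate

    def open(self):
        self.child.open()

    def get_next(self):
        while True:
            row = self.child.get_next()
            if row is NOT_FOUND or self.predicate(row):
                return row

    def close(self):
        self.child.close()


def run(plan):
    """Drive a plan to completion and collect its output."""
    plan.open()
    out = []
    while (row := plan.get_next()) is not NOT_FOUND:
        out.append(row)
    plan.close()
    return out


if __name__ == "__main__":
    print(run(Select(ListScan([1, 2, 3, 4]), lambda x: x % 2 == 0)))
```

The `Select` operator never materializes its input: each call to its `get_next` pulls only as many tuples from the child as it needs to produce one result.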

17 Iterator for Table Scan
Open() {
    b := the first block of R;
    t := the first tuple of block b;
}
GetNext() {
    IF (t is past the last tuple on block b) {
        increment b to the next block;
        IF (there is no next block)
            RETURN NotFound;
        ELSE /* b is a new block */
            t := first tuple on block b;
    }
    /* now we are ready to return t and increment */
    oldt := t;
    increment t to the next tuple of b;
    RETURN oldt;
}
Close() {
}
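
The pseudocode above can be turned into a runnable Python sketch. The assumptions are that the relation is an in-memory list of blocks and that a sentinel object stands in for NotFound; the names `TableScan` and `NOT_FOUND` are hypothetical. One deliberate difference from the pseudocode: the block-advance step is a loop rather than a single IF, so that empty blocks are skipped.

```python
NOT_FOUND = object()  # sentinel standing in for NotFound


class TableScan:
    """Iterator over a relation stored as a list of blocks of tuples."""

    def __init__(self, blocks):
        self.blocks = blocks

    def open(self):
        self.b = 0  # index of the current block
        self.t = 0  # index of the current tuple within that block

    def get_next(self):
        # Advance while t is past the last tuple of block b.
        while self.b < len(self.blocks) and self.t >= len(self.blocks[self.b]):
            self.b += 1
            self.t = 0
        if self.b >= len(self.blocks):
            return NOT_FOUND
        # Now we are ready to return the current tuple and move on.
        old_t = self.blocks[self.b][self.t]
        self.t += 1
        return old_t

    def close(self):
        pass  # nothing to release in this sketch


if __name__ == "__main__":
    scan = TableScan([[1, 2], [], [3]])
    scan.open()
    while (tup := scan.get_next()) is not NOT_FOUND:
        print(tup)
    scan.close()
```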

18 Thank You
References
- Database Systems: The Complete Book, by Hector Garcia-Molina, Jeffrey D. Ullman, and Jennifer D. Widom
- www.sky.fit.qut.edu.au

