Database System Concepts, 5th Ed. ©Silberschatz, Korth and Sudarshan. See www.db-book.com for conditions on re-use. Chapter 13: Query Processing


1 Chapter 13: Query Processing

2 ©Silberschatz, Korth and Sudarshan, Database System Concepts - 5th Edition, Aug 27, 2005. Basic Steps in Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation

3 Basic Steps in Query Processing (Cont.)
- Parsing and translation
  - Translate the query into its internal form, which is then translated into relational algebra.
  - The parser checks syntax and verifies relations.
- Evaluation
  - The query-execution engine takes a query-evaluation plan, executes that plan, and returns the answers to the query.

4 Basic Steps in Query Processing: Optimization
- A relational algebra expression may have many equivalent expressions
  - E.g., $\sigma_{balance<2500}(\Pi_{balance}(account))$ is equivalent to $\Pi_{balance}(\sigma_{balance<2500}(account))$
- Each relational algebra operation can be evaluated using one of several different algorithms
  - Correspondingly, a relational-algebra expression can be evaluated in many ways.
- An annotated expression specifying a detailed evaluation strategy is called an evaluation plan.
  - E.g., can use an index on balance to find accounts with balance < 2500,
  - or can perform a complete relation scan and discard accounts with balance $\geq$ 2500
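The equivalence on this slide can be checked on a toy relation. The following is a minimal sketch, not part of the original slides: the sample `account` rows and the helper names `select` and `project` are hypothetical, and projection is given set semantics (duplicates removed) as in relational algebra.

```python
# Hypothetical sample data; the rows are illustrative only.
account = [
    {"account_number": "A-101", "balance": 500},
    {"account_number": "A-215", "balance": 700},
    {"account_number": "A-102", "balance": 4000},
]


def select(rows, predicate):
    """Relational selection: keep the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def project(rows, attributes):
    """Relational projection with duplicate elimination (set semantics)."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


def low_balance(row):
    return row["balance"] < 2500


# Selection applied after projection ...
plan_a = select(project(account, ["balance"]), low_balance)
# ... and projection applied after selection.
plan_b = project(select(account, low_balance), ["balance"])
```

Both plans produce the same tuples; they differ only in how much intermediate data each step has to handle, which is what the optimizer weighs.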

5 Basic Steps: Optimization (Cont.)
- Query optimization: amongst all equivalent evaluation plans, choose the one with the lowest cost.
  - Cost is estimated using statistical information from the database catalog
    - e.g. number of tuples in each relation, size of tuples, etc.
- In this chapter we study
  - How to measure query costs
  - Algorithms for evaluating relational algebra operations
  - How to combine algorithms for individual operations in order to evaluate a complete expression

6 Measures of Query Cost
- Cost is generally measured as the total elapsed time for answering a query.
  - Many factors contribute to time cost: disk accesses, CPU, or even network communication.
- Typically disk access is the predominant cost, and is also relatively easy to estimate. It is measured by taking into account:
  - Number of seeks * average-seek-cost
  - Number of blocks read * average-block-read-cost
  - Number of blocks written * average-block-write-cost
    - The cost to write a block is greater than the cost to read a block, because data is read back after being written to ensure that the write was successful.

7 Measures of Query Cost (Cont.)
- For simplicity we just use the number of block transfers from disk and the number of seeks as the cost measures.
  - $t_T$: time to transfer one block
  - $t_S$: time for one seek
  - Cost for $b$ block transfers plus $S$ seeks: $b * t_T + S * t_S$
- We ignore CPU costs for simplicity.
  - Real systems do take CPU cost into account.
- We do not include the cost of writing output to disk in our cost formulae.
- Several algorithms can reduce disk I/O by using extra buffer space.
  - The amount of real memory available for buffering depends on other concurrent queries and OS processes, and is known only during execution.
    - We often use worst-case estimates, assuming only the minimum amount of memory needed for the operation is available.
- Required data may already be buffer resident, avoiding disk I/O.
  - But this is hard to take into account for cost estimation.
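The cost measure from this slide is simple enough to write down directly. This is a rough sketch only: the function name `io_cost_ms` and the default values `t_T = 0.1` ms and `t_S = 4.0` ms are illustrative assumptions, not figures given in the slides.

```python
def io_cost_ms(block_transfers, seeks, t_T=0.1, t_S=4.0):
    """Estimated disk time in milliseconds: b * t_T + S * t_S.

    t_T and t_S are assumed, illustrative values for the time to
    transfer one block and the time for one seek.
    """
    return block_transfers * t_T + seeks * t_S


# Reading 100 contiguous blocks: 100 transfers but only 1 seek.
scan_cost = io_cost_ms(block_transfers=100, seeks=1)

# Reading 100 scattered blocks: 100 transfers and 100 seeks.
random_cost = io_cost_ms(block_transfers=100, seeks=100)
```

With these assumed timings the seek term dominates as soon as blocks are not contiguous, which is why the algorithms that follow count seeks separately from transfers.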

8 Selection Operation
- File scan: search algorithms that locate and retrieve records that fulfill a selection condition.
- Algorithm A1 (linear search). Scan each file block and test all records to see whether they satisfy the selection condition.
  - Cost estimate = $b_r$ block transfers + 1 seek
    - $b_r$ denotes the number of blocks containing records from relation $r$
  - If the selection is on a key attribute, the scan can stop on finding the record
    - average cost = ($b_r$/2) block transfers + 1 seek
  - Linear search can be applied regardless of
    - the selection condition,
    - the ordering of records in the file, or
    - the availability of indices
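A small simulation can make the block counts for A1 concrete. In this sketch a file is modelled as a list of blocks, each block a list of records; the function name `linear_search`, the `stop_at_first` flag, and the sample data are all hypothetical.

```python
def linear_search(blocks, predicate, stop_at_first=False):
    """Scan blocks in order and return (matching_records, blocks_read).

    stop_at_first models a selection on a key attribute, where at most
    one record can match and the scan may end as soon as it is found.
    """
    matches = []
    blocks_read = 0
    for block in blocks:
        blocks_read += 1  # one block transfer per block scanned
        for record in block:
            if predicate(record):
                matches.append(record)
                if stop_at_first:
                    return matches, blocks_read
    return matches, blocks_read


# Hypothetical file with four blocks of two records each.
blocks = [[1, 2], [3, 4], [5, 6], [7, 8]]

# Key lookup: stops after reading the second block.
key_result = linear_search(blocks, lambda r: r == 3, stop_at_first=True)

# General condition: every block must be read.
even_result = linear_search(blocks, lambda r: r % 2 == 0)
```

The key lookup stops wherever the record happens to sit, so the $b_r$/2 figure is an average over possible positions, not a guarantee for any single search.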

9 Selection Operation (Cont.)
- A2 (binary search). Applicable if the selection is an equality comparison on the attribute on which the file is ordered.
  - Assume that the blocks of a relation are stored contiguously
  - Cost estimate (number of disk blocks to be scanned):
    - cost of locating the first tuple by a binary search on the blocks
      - $\lceil \log_2(b_r) \rceil * (t_T + t_S)$
    - If there are multiple records satisfying the selection
      - Add the transfer cost of the number of blocks containing records that satisfy the selection condition
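The A2 estimate can be sketched as a short function. This is one possible reading of the slide, not a definitive formula: the parameter `extra_matching_blocks` (blocks read beyond the one located by the binary search) is an assumption about how the additional transfer cost is counted, and the timing defaults are illustrative.

```python
import math


def binary_search_cost_ms(b_r, extra_matching_blocks=0, t_T=0.1, t_S=4.0):
    """Estimated cost of A2 in milliseconds.

    Each binary-search probe costs one seek plus one block transfer.
    extra_matching_blocks is an assumed parameter: the number of further
    contiguous blocks that hold matching records, charged at t_T each.
    """
    if b_r < 1:
        raise ValueError("b_r must be at least 1")
    probes = math.ceil(math.log2(b_r))
    return probes * (t_T + t_S) + extra_matching_blocks * t_T


# 1024 blocks, single matching record: 10 probes.
single = binary_search_cost_ms(1024)

# Same file, matches spilling over 3 further blocks.
multiple = binary_search_cost_ms(1024, extra_matching_blocks=3)
```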

10 Selections Using Indices
- Index scan: search algorithms that use an index
  - the selection condition must be on the search-key of the index.
- A3 (primary index on candidate key, equality). Retrieve a single record that satisfies the corresponding equality condition
  - Cost = $(h_i + 1) * (t_T + t_S)$
- A4 (primary index on nonkey, equality). Retrieve multiple records.
  - Records will be on consecutive blocks
    - Let $b$ = number of blocks containing matching records
  - Cost = $h_i * (t_T + t_S) + t_S + t_T * b$
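The two formulas on this slide can be compared side by side. In this sketch $h_i$ is taken to be the height of the index (the number of index blocks visited on the way to a leaf), which the slide does not define; the function names and timing defaults are likewise assumptions.

```python
def a3_cost_ms(h_i, t_T=0.1, t_S=4.0):
    """A3: traverse h_i index levels, then fetch one record block.

    Every access is one seek plus one block transfer.
    """
    return (h_i + 1) * (t_T + t_S)


def a4_cost_ms(h_i, b, t_T=0.1, t_S=4.0):
    """A4: traverse h_i index levels, then one seek and b contiguous transfers."""
    return h_i * (t_T + t_S) + t_S + t_T * b


# Hypothetical index of height 3.
key_lookup = a3_cost_ms(h_i=3)

# Non-key lookup whose matches span 5 consecutive blocks.
nonkey_lookup = a4_cost_ms(h_i=3, b=5)
```

When $b = 1$ the A4 formula reduces to the A3 formula, a quick consistency check on the two expressions.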

11 Selections Using Indices
- A5 (equality on search-key of secondary index).
  - Retrieve a single record if the search-key is a candidate key
    - Cost = $(h_i + 1) * (t_T + t_S)$
  - Retrieve multiple records if the search-key is not a candidate key
    - each of $n$ matching records may be on a different block
    - Cost = $(h_i + n) * (t_T + t_S)$
      - Can be very expensive!
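To see why the non-key case "can be very expensive", one can compare it against a plain linear scan. The numbers here (index height 3, 1000 matching records, a 400-block relation, and the timing defaults) are made-up illustrative values, and the function names are hypothetical.

```python
def a5_cost_ms(h_i, n=1, t_T=0.1, t_S=4.0):
    """A5: traverse h_i index levels, then fetch n records.

    In the worst case each record is on a different block, so each
    fetch costs one seek plus one block transfer.
    """
    return (h_i + n) * (t_T + t_S)


def linear_scan_cost_ms(b_r, t_T=0.1, t_S=4.0):
    """A1 for comparison: one seek, then b_r contiguous block transfers."""
    return b_r * t_T + t_S


# Secondary index, non-key attribute, 1000 matching records.
index_cost = a5_cost_ms(h_i=3, n=1000)

# Scanning the whole 400-block relation instead.
scan_cost = linear_scan_cost_ms(b_r=400)
```

Under these assumptions the secondary-index plan costs roughly a hundred times more than the full scan, because it pays a seek for every matching record.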

12 End of Chapter

