Chapter 13: Query Processing



Chapter 13: Query Processing
Overview
Measures of Query Cost
Join Operation

Basic Steps in Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation

Basic Steps in Query Processing (Cont.)
Parsing and translation
- Translate the query into its internal form, which is then translated into relational algebra.
- The parser checks syntax and verifies that the referenced relations exist.
Evaluation
- The query-execution engine takes a query-evaluation plan, executes that plan, and returns the answers to the query.

Basic Steps in Query Processing: Optimization
A relational-algebra expression may have many equivalent expressions.
E.g., $\sigma_{balance<2500}(\Pi_{balance}(account))$ is equivalent to $\Pi_{balance}(\sigma_{balance<2500}(account))$.
Each relational-algebra operation can be evaluated using one of several different algorithms. Correspondingly, a relational-algebra expression can be evaluated in many ways.
An annotated expression specifying a detailed evaluation strategy is called an evaluation plan.
E.g., can use an index on balance to find accounts with balance < 2500, or can perform a complete relation scan and discard accounts with balance ≥ 2500.
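The equivalence of the two expressions can be checked on a small example. The following is a minimal sketch, assuming a relation is held as a Python list of dictionaries; the sample rows and the helper names `select_balance_below` and `project_balance` are made up for illustration and do not come from the slides.

```python
# Hypothetical sample data (not from the slides).
account = [
    {"account_number": "A-101", "balance": 500},
    {"account_number": "A-102", "balance": 2500},
    {"account_number": "A-103", "balance": 900},
    {"account_number": "A-104", "balance": 500},
]

def select_balance_below(rows, limit):
    """Selection: keep only rows whose balance is below the limit."""
    return [row for row in rows if row["balance"] < limit]

def project_balance(rows):
    """Projection onto balance, with duplicates removed as in relational algebra."""
    distinct = {row["balance"] for row in rows}
    return [{"balance": b} for b in sorted(distinct)]

# Plan A: project first, then select.
plan_a = select_balance_below(project_balance(account), 2500)
# Plan B: select first, then project.
plan_b = project_balance(select_balance_below(account, 2500))

print(plan_a == plan_b)
```

Both plans return the same tuples, but they touch different amounts of data on the way, which is the reason an optimizer compares them by cost.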

Evaluation Plan
An evaluation plan defines exactly what algorithm is used for each operation, and how the execution of the operations is coordinated.

Basic Steps: Optimization (Cont.)
Query optimization: amongst all equivalent evaluation plans, choose the one with the lowest cost.
Cost is estimated using statistical information from the database catalog, e.g. the number of tuples in each relation, the size of tuples, etc.

Measures of Query Cost
Cost is generally measured as the total elapsed time for answering a query.
Many factors contribute to time cost: disk accesses, CPU, or even network communication.
Typically disk access is the predominant cost, and is also relatively easy to estimate. It is measured by taking into account:
- Number of seeks * average-seek-cost
- Number of blocks read * average-block-read-cost
- Number of blocks written * average-block-write-cost

Measures of Query Cost (Cont.)
For simplicity we just use the number of block transfers from disk as the cost measure.
- t_T: time to transfer one block
- Cost for b block transfers: b * t_T
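Both cost measures are simple arithmetic and can be sketched as two small functions. The timing values used below (in microseconds) are invented for illustration only; the slides give no concrete device timings.

```python
def disk_cost(seeks, blocks_read, blocks_written,
              seek_cost, read_cost, write_cost):
    """Detailed measure: seeks, block reads and block writes, each weighted by its average cost."""
    return (seeks * seek_cost
            + blocks_read * read_cost
            + blocks_written * write_cost)

def transfer_cost(b, t_T):
    """Simplified measure: b block transfers at t_T time units per block."""
    return b * t_T

# Hypothetical timings in microseconds (assumptions, not from the slides).
SEEK_US = 4000
TRANSFER_US = 100

detailed = disk_cost(seeks=10, blocks_read=500, blocks_written=0,
                     seek_cost=SEEK_US, read_cost=TRANSFER_US, write_cost=TRANSFER_US)
simplified = transfer_cost(b=510, t_T=TRANSFER_US)
print(detailed, simplified)
```

The simplified measure ignores seeks and writes, so it is best read as a way to rank plans against each other rather than as a prediction of elapsed time.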

Join Operation
Several different algorithms to implement joins:
1. Nested-loop join
2. Block nested-loop join
3. Indexed nested-loop join
4. Merge-join
5. Hash-join
Choice based on cost estimate.

Join Operation: Join Strategies
Consider the relations
deposit(branch-name, account-#, customer-name, balance)
customer(customer-name, c-st, c-city)
Consider deposit ⋈ customer, where n_deposit = 10,000 and n_customer = 200.
Simple Iteration (Nested-loop join): Assume no indices. It must examine 10,000 * 200 = 2,000,000 pairs of tuples. This is expensive, since it examines every pair of tuples in the two relations.

1. Nested-Loop Join (case 1)
for each tuple d in deposit do begin
    for each tuple c in customer do
        test pair (d, c) to see if a tuple should be added to the result
end
If the tuples of deposit are stored together physically (assume 20 tuples fit in one block), reading deposit requires 10,000/20 = 500 block accesses (cf. in the worst case, 10,000 block accesses).
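The nested-loop pseudocode translates almost line for line into Python. This is a minimal in-memory sketch, assuming tuples are dictionaries and the join is a natural join on customer-name; the field names (`customer_name`, `account_number`, `c_city`) and the sample rows are illustrative assumptions.

```python
def nested_loop_join(deposit, customer):
    """Test every (d, c) pair and keep the pairs that agree on customer_name."""
    result = []
    for d in deposit:            # outer relation
        for c in customer:       # inner relation, rescanned for every outer tuple
            if d["customer_name"] == c["customer_name"]:
                result.append({**d, **c})
    return result

# Hypothetical sample data (not from the slides).
deposit = [
    {"account_number": "A-101", "customer_name": "Hayes", "balance": 500},
    {"account_number": "A-215", "customer_name": "Turner", "balance": 700},
    {"account_number": "A-305", "customer_name": "Hayes", "balance": 350},
]
customer = [
    {"customer_name": "Hayes", "c_city": "Harrison"},
    {"customer_name": "Adams", "c_city": "Pittsfield"},
]

joined = nested_loop_join(deposit, customer)
print([row["account_number"] for row in joined])
```

The sketch examines len(deposit) * len(customer) pairs regardless of how many of them match, which mirrors the 2,000,000 pairs counted on the slide.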

1. Nested-Loop Join (Cont.)
As for customer, if it is stored together physically, reading it takes 200/20 = 10 block accesses for each tuple of deposit. Thus 10 * 10,000 = 100,000 block accesses to customer are needed to process the query.
∴ total: 100,500 block accesses

1. Nested-Loop Join (Cont.) (case 2)
Assume that customer is in the outer loop and deposit is in the inner loop.
100,000 accesses to deposit (200 * (10,000/20) = 100,000) + 10 accesses to read customer (200/20 = 10)
∴ total: 100,010 block accesses
Thus the choice of inner and outer loop relations can have a dramatic effect on the cost of evaluating queries.
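The block-access totals for the two cases follow one formula: read the outer relation once, and read the whole inner relation once per outer tuple. A small sketch of that arithmetic, assuming 20 tuples per block as on the slides (the function names are hypothetical):

```python
def blocks_needed(n_tuples, tuples_per_block=20):
    """Blocks occupied by a relation whose tuples are stored together (rounded up)."""
    return (n_tuples + tuples_per_block - 1) // tuples_per_block

def nested_loop_block_accesses(n_outer, n_inner, tuples_per_block=20):
    """Outer relation read once; inner relation read once per outer tuple."""
    outer_blocks = blocks_needed(n_outer, tuples_per_block)
    inner_blocks = blocks_needed(n_inner, tuples_per_block)
    return outer_blocks + n_outer * inner_blocks

N_DEPOSIT = 10_000
N_CUSTOMER = 200

case_1 = nested_loop_block_accesses(N_DEPOSIT, N_CUSTOMER)   # deposit in the outer loop
case_2 = nested_loop_block_accesses(N_CUSTOMER, N_DEPOSIT)   # customer in the outer loop
print(case_1, case_2)
```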

2. Block Nested-Loop Join
Variant of nested-loop join in which every block of the inner relation is paired with every block of the outer relation.
Block-Oriented Iteration:
for each block Bd of deposit do begin
    for each block Bc of customer do begin
        for each tuple d in Bd do
            for each tuple c in Bc do
                test pair (d, c) to see if a tuple should be added to the result
    end
end
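A sketch of block-oriented iteration in Python, assuming a "block" is simply a fixed-size slice of an in-memory list. The helper `blocks`, the field names and the sample rows are assumptions for illustration; a real system would read blocks from disk.

```python
def blocks(relation, tuples_per_block):
    """Yield consecutive fixed-size slices of the relation, standing in for disk blocks."""
    for start in range(0, len(relation), tuples_per_block):
        yield relation[start:start + tuples_per_block]

def block_nested_loop_join(deposit, customer, tuples_per_block=20):
    """Pair every block of deposit with every block of customer, then compare tuples."""
    result = []
    for block_d in blocks(deposit, tuples_per_block):
        for block_c in blocks(customer, tuples_per_block):
            for d in block_d:
                for c in block_c:
                    if d["customer_name"] == c["customer_name"]:
                        result.append({**d, **c})
    return result

# Hypothetical sample data (not from the slides).
deposit = [
    {"account_number": "A-101", "customer_name": "Hayes"},
    {"account_number": "A-215", "customer_name": "Turner"},
    {"account_number": "A-305", "customer_name": "Hayes"},
]
customer = [
    {"customer_name": "Hayes", "c_city": "Harrison"},
    {"customer_name": "Turner", "c_city": "Stamford"},
    {"customer_name": "Adams", "c_city": "Pittsfield"},
]

joined = block_nested_loop_join(deposit, customer, tuples_per_block=2)
print(len(joined))
```

The set of joined tuples is the same as for the tuple-at-a-time version; only the order of comparisons, and therefore the number of block reads, changes.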

2. Block Nested-Loop Join (Cont.)
Working on a per-block basis (not a per-tuple basis) → saving in block accesses.
Assume deposit and customer are each stored together physically.
Instead of reading the customer relation once for each tuple of deposit, we read the customer relation once for each block of deposit.
5,500 accesses = 5,000 (= 500 * (200/20)) accesses to customer blocks + 500 (= 10,000/20) accesses to deposit blocks

2. Block Nested-Loop Join (Cont.)
Now take customer as the outer loop and deposit as the inner loop.
5,000 accesses to deposit (10 * (10,000/20) = 10 * 500 = 5,000) + 10 accesses to customer (200/20 = 10) = 5,000 + 10 = 5,010 accesses.
A major advantage of using the smaller relation (customer) in the inner loop is that it may be possible to store the entire relation in main memory temporarily.
If customer fits in main memory: 500 block accesses to read deposit + 10 block accesses to read customer → 510 accesses
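The three block nested-loop totals (5,500, 5,010 and 510) can be reproduced with one formula. This sketch assumes 20 tuples per block as on the slides; the function and parameter names are hypothetical.

```python
def blocks_needed(n_tuples, tuples_per_block=20):
    """Blocks occupied by a relation whose tuples are stored together (rounded up)."""
    return (n_tuples + tuples_per_block - 1) // tuples_per_block

def block_nested_loop_accesses(n_outer, n_inner, tuples_per_block=20,
                               inner_fits_in_memory=False):
    """Outer relation read once; inner relation read once per outer block,
    or just once in total if it can be kept in main memory."""
    outer_blocks = blocks_needed(n_outer, tuples_per_block)
    inner_blocks = blocks_needed(n_inner, tuples_per_block)
    if inner_fits_in_memory:
        return outer_blocks + inner_blocks
    return outer_blocks + outer_blocks * inner_blocks

deposit_outer = block_nested_loop_accesses(10_000, 200)
customer_outer = block_nested_loop_accesses(200, 10_000)
customer_in_memory = block_nested_loop_accesses(10_000, 200, inner_fits_in_memory=True)
print(deposit_outer, customer_outer, customer_in_memory)
```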

3. Merge-Join
Assume that both relations are in sorted order on the join attributes and are stored together physically.
deposit ⋈ customer → 510 block accesses
Merge-join allows us to compute the join by reading each block exactly once:
500 block accesses to read deposit (10,000/20 = 500) + 10 block accesses to read customer (200/20 = 10) = 510 block accesses

3. Merge-Join (Cont.)
Algorithm:
- A group of tuples of one relation with the same value on the join attributes is read.
- The corresponding tuples of the other relation are read.
- Since the relations are in sorted order, tuples with the same value on the join attributes are in consecutive order. This allows us to read each tuple only once.
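The steps above can be sketched as a two-pointer scan over two lists that are already sorted on the join attribute. This is an in-memory illustration under that assumption, not the block-level algorithm; the field names and sample rows are made up.

```python
def merge_join(deposit, customer, key="customer_name"):
    """Join two relations that are both sorted on `key`, scanning each one once."""
    result = []
    i = j = 0
    while i < len(deposit) and j < len(customer):
        d_key, c_key = deposit[i][key], customer[j][key]
        if d_key < c_key:
            i += 1                      # deposit tuple has no partner yet, advance
        elif d_key > c_key:
            j += 1                      # customer tuple has no partner yet, advance
        else:
            # Find the group of consecutive tuples with this key in each relation.
            i_end = i
            while i_end < len(deposit) and deposit[i_end][key] == d_key:
                i_end += 1
            j_end = j
            while j_end < len(customer) and customer[j_end][key] == d_key:
                j_end += 1
            # Pair every tuple of one group with every tuple of the other.
            for d in deposit[i:i_end]:
                for c in customer[j:j_end]:
                    result.append({**d, **c})
            i, j = i_end, j_end
    return result

# Hypothetical sample data, already sorted on customer_name (not from the slides).
deposit = [
    {"account_number": "A-1", "customer_name": "Adams"},
    {"account_number": "A-2", "customer_name": "Adams"},
    {"account_number": "A-3", "customer_name": "Hayes"},
    {"account_number": "A-4", "customer_name": "Turner"},
]
customer = [
    {"customer_name": "Adams", "c_city": "Pittsfield"},
    {"customer_name": "Brooks", "c_city": "Brooklyn"},
    {"customer_name": "Hayes", "c_city": "Harrison"},
]

joined = merge_join(deposit, customer)
print([row["account_number"] for row in joined])
```

Neither index ever moves backwards, which is the in-memory counterpart of reading each block exactly once.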


4. Indexed nested-loop join
Simple iteration (nested-loop join): deposit ⋈ customer → 10,000 * 200 = 2,000,000 block accesses (no physical clustering of tuples).
Merge-join requires sorted order.
Block-oriented iteration requires that tuples of each relation be stored physically together.
But there are no restrictions on simple iteration (nested-loop join).

4. Indexed nested-loop join
If an index exists on customer for customer-name, then:
10,000 block accesses to read deposit + 10,000 * 3 block accesses to customer (2 for index blocks, 1 to read the customer tuple itself) = 40,000 block accesses
Given a tuple d in deposit, it is no longer necessary to read the entire customer relation. Instead, the index is used to look up tuples in customer for which the customer-name value is d[customer-name].
Only one tuple in customer satisfies d[customer-name] = c[customer-name], since customer-name is a primary key for customer.
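A minimal sketch of the indexed variant, using a Python dictionary as a stand-in for the index on customer-name. That substitution is an assumption made for illustration (a disk-based index would typically be a tree with several levels), as are the function names and sample rows.

```python
def build_index(customer):
    """Map each customer_name to its tuple; valid because customer_name is a primary key."""
    return {c["customer_name"]: c for c in customer}

def indexed_nested_loop_join(deposit, customer_index):
    """For each deposit tuple, look up its single matching customer via the index."""
    result = []
    for d in deposit:
        c = customer_index.get(d["customer_name"])
        if c is not None:
            result.append({**d, **c})
    return result

def indexed_join_block_accesses(n_outer, index_block_reads=2, tuple_block_reads=1):
    """One access per outer tuple (no clustering), plus index and tuple reads per lookup."""
    return n_outer + n_outer * (index_block_reads + tuple_block_reads)

# Hypothetical sample data (not from the slides).
deposit = [
    {"account_number": "A-101", "customer_name": "Hayes"},
    {"account_number": "A-215", "customer_name": "Turner"},
]
customer = [
    {"customer_name": "Hayes", "c_city": "Harrison"},
    {"customer_name": "Adams", "c_city": "Pittsfield"},
]

joined = indexed_nested_loop_join(deposit, build_index(customer))
print(len(joined), indexed_join_block_accesses(10_000))
```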

5. Hash-Join
A hash function h is used to hash tuples of both relations on the basis of the join attributes.
Let d be a tuple in deposit and c be a tuple in customer.
If h(c) ≠ h(d), then c and d must have different values for customer-name.
If h(c) = h(d), check whether c and d actually agree on customer-name.

5. Hash-Join (Cont.)
h: customer-name → {0, 1, 2, ..., Max}
One set of buckets holds pointers to customer tuples; another set holds pointers to deposit tuples.
rd: the set of deposit tuples that hash to bucket i.
rc: the set of customer tuples that hash to bucket i.
For each bucket i, compute rd ⋈ rc.
Total: 510 (for hashing) + 510 (to perform rd ⋈ rc) = 1,020 block accesses.
Assume that deposit tuples are stored together physically, and likewise customer tuples.
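The partition-then-join idea can be sketched in memory as follows. The toy hash function `h`, the bucket count and the sample rows are all assumptions for illustration; the slides do not specify a particular hash function.

```python
def h(customer_name, bucket_count):
    """Toy deterministic hash (an assumption): sum of the name's bytes modulo bucket_count."""
    return sum(customer_name.encode("utf-8")) % bucket_count

def hash_join(deposit, customer, bucket_count=4):
    """Partition both relations with h, then join only buckets with the same number."""
    deposit_buckets = [[] for _ in range(bucket_count)]
    customer_buckets = [[] for _ in range(bucket_count)]
    for d in deposit:
        deposit_buckets[h(d["customer_name"], bucket_count)].append(d)
    for c in customer:
        customer_buckets[h(c["customer_name"], bucket_count)].append(c)

    result = []
    for r_d, r_c in zip(deposit_buckets, customer_buckets):   # bucket i of each relation
        for d in r_d:
            for c in r_c:
                # Equal hash values do not guarantee equal names, so check explicitly.
                if d["customer_name"] == c["customer_name"]:
                    result.append({**d, **c})
    return result

# Hypothetical sample data (not from the slides).
deposit = [
    {"account_number": "A-101", "customer_name": "Hayes"},
    {"account_number": "A-215", "customer_name": "Turner"},
    {"account_number": "A-305", "customer_name": "Hayes"},
]
customer = [
    {"customer_name": "Hayes", "c_city": "Harrison"},
    {"customer_name": "Turner", "c_city": "Stamford"},
    {"customer_name": "Adams", "c_city": "Pittsfield"},
]

joined = hash_join(deposit, customer)
print(sorted(row["account_number"] for row in joined))
```

Tuples that land in different buckets are never compared, which is where the saving over a plain nested-loop join comes from.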


Three-Way Join
Consider
branch(branch-name, assets, b-city)
deposit(branch-name, account-#, customer-name, balance)
customer(customer-name, c-st, c-city)
branch ⋈ deposit ⋈ customer, where n_deposit = 10,000, n_customer = 200, n_branch = 50
Consider a choice of which join to compute first.

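Three-Way Join
Join is associative: (r1 ⋈ r2) ⋈ r3 = r1 ⋈ (r2 ⋈ r3)
Estimation of the size of a natural join: let r1(R1) and r2(R2) be relations.
① If R1 ∩ R2 = ∅, then r1 ⋈ r2 is the same as r1 × r2.
② If R1 ∩ R2 is a key for R1, then the number of tuples in r1 ⋈ r2 ≤ the number of tuples in r2 (a tuple of r2 will join with at most one tuple from r1).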

Three-Way Join
Strategy 1.
① Compute deposit ⋈ customer first. Since c-name is a key for customer, the result has at most 10,000 tuples.
② Build an index on branch for b-name, then compute branch ⋈ (deposit ⋈ customer). For each t ∈ deposit ⋈ customer, look up the tuple in branch with a branch-name value of t[branch-name]. Since b-name is a key for branch, examine only one branch tuple for each of the 10,000 tuples in (deposit ⋈ customer).
※ If R1 ∩ R2 is a key for R1, the # of tuples in r1 ⋈ r2 ≤ the # of tuples in r2.

Three-Way Join
Strategy 2. Without constructing any indices: 50 * 10,000 * 200 possibilities.
Strategy 3. Build two indices: on branch for b-name, and on customer for c-name. Consider each t ∈ deposit, and look up the corresponding tuple in customer and the corresponding tuple in branch. Thus, we examine each tuple of deposit exactly once.
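Strategy 3 can be sketched with two dictionaries standing in for the two indices. As before, the dictionaries, field names and sample rows are assumptions for illustration; the last line only restates the Strategy 2 count for comparison.

```python
def three_way_join(branch, deposit, customer):
    """Strategy 3: index branch and customer on their keys, then scan deposit once."""
    branch_index = {b["branch_name"]: b for b in branch}
    customer_index = {c["customer_name"]: c for c in customer}
    result = []
    for t in deposit:                                   # each deposit tuple examined once
        b = branch_index.get(t["branch_name"])
        c = customer_index.get(t["customer_name"])
        if b is not None and c is not None:
            result.append({**b, **t, **c})
    return result

# Hypothetical sample data (not from the slides).
branch = [
    {"branch_name": "Downtown", "assets": 9_000_000, "b_city": "Brooklyn"},
    {"branch_name": "Perryridge", "assets": 1_700_000, "b_city": "Horseneck"},
]
deposit = [
    {"branch_name": "Downtown", "account_number": "A-101", "customer_name": "Hayes"},
    {"branch_name": "Perryridge", "account_number": "A-215", "customer_name": "Turner"},
    {"branch_name": "Redwood", "account_number": "A-305", "customer_name": "Hayes"},
]
customer = [
    {"customer_name": "Hayes", "c_city": "Harrison"},
    {"customer_name": "Turner", "c_city": "Stamford"},
]

joined = three_way_join(branch, deposit, customer)
strategy_2_combinations = 50 * 10_000 * 200   # triples considered without any index
print(len(joined), strategy_2_combinations)
```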

End of Chapter 13