

Query Processing

Steps in Query Processing
Validate and translate the query
  –Good syntax.
  –All referenced relations exist.
  –Translate the SQL to relational algebra.
Optimize
  –Make it run faster.
Evaluate

Translation Example
Possible SQL query: SELECT balance FROM account WHERE balance < 2500
Possible relational algebra query: Π_balance(σ_balance<2500(account))
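The translation can be mimicked in a few lines of Python as a rough sketch: a selection followed by a projection over an in-memory list. The rows and the `account_number` column are invented for illustration, and `select` and `project` are hypothetical helper names, not part of any database API.

```python
# Hypothetical in-memory "account" relation; the rows are made up for illustration.
account = [
    {"account_number": "A-101", "balance": 500},
    {"account_number": "A-215", "balance": 2500},
    {"account_number": "A-305", "balance": 1800},
]


def select(rows, predicate):
    """Selection: keep only the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def project(rows, attributes):
    """Projection: keep only the named attributes (duplicates are kept, as in SQL)."""
    return [{a: row[a] for a in attributes} for row in rows]


# Projection on balance applied to the selection balance < 2500 over account.
result = project(select(account, lambda r: r["balance"] < 2500), ["balance"])
print(result)
```

The nesting of the two calls mirrors the nesting of the algebra expression: the selection runs first and its output feeds the projection.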

Tree Representation of Relational Algebra
Expression: Π_balance(σ_balance<2500(account))
Tree, root to leaf: Π_balance → σ_balance<2500 → account

Making An Evaluation Plan
Annotate the query tree with evaluation instructions. The query can now be executed by the query execution engine.
Annotated tree, root to leaf: Π_balance → σ_balance<2500 (use index 1) → account

Before Optimizing the Query
Must predict the cost of execution plans.
  –Measured by CPU time, number of disk block reads, and network communication (in distributed DBs),
  –where C(CPU) < C(Disk) < C(Network).
  –A major factor is the buffer space available.
  –Use statistics found in the catalog to help predict the work required to evaluate a query.

Disk Cost
Positioning time = seek time (arm movement) + rotational latency.
Transfer time = time to read the data.
Typically, positioning time is orders of magnitude greater than the time to transfer one block.
Disk cost is assumed to dominate CPU cost, so it can be used to approximate total cost.

Reading Data, No Indices
Linear scan
  –Cost is proportional to the number of blocks in the file.
Binary search on ordering attribute
  –Cost is about log2 of the number of blocks in the file.
  –Requires the file to be sorted on that attribute.
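To get a feel for the gap between the two access methods, the block-read estimates can be computed directly. This is a minimal sketch: the function names are hypothetical, the 1024-block file is an invented example, and the binary search figure uses the common ceil(log2(b)) approximation.

```python
import math


def linear_scan_cost(num_blocks):
    """Worst case for a linear scan: every block of the file is read once."""
    return num_blocks


def binary_search_cost(num_blocks):
    """Approximate block reads to locate one value in a file sorted on the search attribute."""
    if num_blocks <= 1:
        return 1
    return math.ceil(math.log2(num_blocks))


# Hypothetical file of 1024 blocks.
print(linear_scan_cost(1024), binary_search_cost(1024))
```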

Reading Data with Indices
Primary index: index on the sort key.
  –Can be dense or sparse.
Secondary index: index on a non-sort key.
Queries can be point queries or range queries.
  –Point queries return the record (or records) matching a single key value.
  –Range queries return a sequence of records with consecutive key values.

Point Queries
Point queries
  –Cost = index cost + block read cost.
Range queries (c1 <= key <= c2)
  –Primary index: Cost = index cost + scan of the matching blocks.
  –Secondary index: Cost = #blocks × (index cost + scan of block).

More on Range Queries
Range query on sort key
  –key >= c1: Use the index to find c1, then scan linearly from there.
  –key <= c1: Scan linearly from the start of the file until the key exceeds c1; no index lookup is needed.
Range query using secondary index
  –Scan through the index blocks. Each matching record requires following a pointer from the index, possibly to a different block.

More Complex Selections
Conditions on multiple attributes
Negations
Disjunctions
Grouping pointers when the selection is on multiple attributes:
  –Find the set of matching record pointers for each condition.
  –Compute the intersection of the sets for a conjunction, or their union for a disjunction.
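The pointer-grouping idea maps naturally onto set operations. In the sketch below, list positions stand in for record pointers, and the rows, the `branch` column, and the helper name `matching_pointers` are all assumptions made for illustration.

```python
# Hypothetical rows; list positions play the role of record pointers.
rows = [
    {"balance": 500, "branch": "Perryridge"},
    {"balance": 2500, "branch": "Perryridge"},
    {"balance": 1800, "branch": "Downtown"},
]


def matching_pointers(rows, predicate):
    """Return the set of record pointers whose rows satisfy one condition."""
    return {rid for rid, row in enumerate(rows) if predicate(row)}


low_balance = matching_pointers(rows, lambda r: r["balance"] < 2500)
perryridge = matching_pointers(rows, lambda r: r["branch"] == "Perryridge")

conjunction = low_balance & perryridge  # both conditions hold: intersection
disjunction = low_balance | perryridge  # either condition holds: union
print(sorted(conjunction), sorted(disjunction))
```

In a real system each pointer set would come from an index lookup, so only the records in the final set need to be fetched from disk.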

Sorting
Sorted relations are easier to scan.
Sorting a relation and then querying it can cost less than querying the unsorted relation.
Two types of sorts:
  –In memory
  –Out of memory (a.k.a. external sorting)

External Merge Sort
Use this when you cannot fit the relation in memory.
Assume there are M memory buffers.
Two phases:
  –Create sorted runs.
  –Merge sorted runs.

External Merge Sort, Phase 1
Fill the M memory buffers with the next M blocks of the relation.
Sort the M blocks.
Write the sorted blocks to disk as one run.
Repeat until the whole relation has been read.

External Merge Sort, Phase 2
Assume there are at most M-1 runs.
Read the first block of each run into memory; the remaining buffer holds output.
At each iteration, find the lowest record among the M-1 runs.
Move it into the output buffer, writing the buffer to disk when it fills.
If the buffered block of any run is exhausted, read that run's next block.
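The two phases can be sketched in Python with in-memory lists standing in for disk blocks and runs. This is a toy illustration, not a real external sort: nothing is written to disk, the function names `create_runs` and `merge_runs` are hypothetical, and the standard library's `heapq.merge` plays the role of the "pick the lowest record" step.

```python
import heapq


def create_runs(blocks, m):
    """Phase 1: take M blocks at a time, sort their records, and emit one run per group."""
    runs = []
    for start in range(0, len(blocks), m):
        records = [rec for block in blocks[start:start + m] for rec in block]
        runs.append(sorted(records))
    return runs


def merge_runs(runs):
    """Phase 2: repeatedly take the smallest remaining record across all runs."""
    return list(heapq.merge(*runs))


# Hypothetical relation of five 2-record blocks, with M = 2 buffers for run creation.
blocks = [[9, 3], [7, 1], [8, 2], [6, 4], [5, 0]]
runs = create_runs(blocks, m=2)
merged = merge_runs(runs)
print(runs)
print(merged)
```

`heapq.merge` consumes its inputs lazily, which loosely matches holding only one block per run in memory at a time.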

External Merge Sort Notes
Can be extended to an arbitrarily large relation using multiple merge passes.
Cost in block transfers is:
  –Br × (2 × ceil(log_(M-1)(Br / M)) + 1)
  –Br is the number of blocks for the relation.
  –M is the number of memory buffers.
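The cost expression can be evaluated without floating-point logarithms by counting merge passes directly. The sketch below assumes each pass merges up to M-1 runs; the function name and the sample sizes (1000 blocks, 11 buffers) are invented for illustration.

```python
def external_sort_cost(br, m):
    """Block transfers for external merge sort: br * (2 * merge_passes + 1)."""
    if m < 3:
        raise ValueError("need at least 3 buffers to merge 2 runs at a time")
    runs = -(-br // m)          # ceil(br / m) initial runs after phase 1
    passes = 0
    while runs > 1:             # each pass merges groups of up to m - 1 runs
        runs = -(-runs // (m - 1))
        passes += 1
    return br * (2 * passes + 1)


# Hypothetical relation of 1000 blocks sorted with 11 buffers:
# 91 initial runs -> 10 runs -> 1 run, so 2 merge passes.
print(external_sort_cost(1000, 11))
```

The loop counts the same quantity as the ceiling of the base M-1 logarithm of the number of initial runs, so it agrees with the formula while avoiding rounding trouble near exact powers.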

Nested Loop Join
No indices (for now).
Nested Loop
  –R join S
  –R is the outer relation.
  –S is the inner relation.
  –Read a block of R; then, for each tuple in it, read each block of S and compare the tuples using the join condition.
  –Write any matching pairs of tuples to an output block.

Nested Loop Join Cost
If you read tuple by tuple, the cost is:
  –#tuples in R * #blocks in S + #blocks in R.
Question: Which should be the inner relation, and which should be the outer?
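One way to explore the question is to plug numbers into the tuple-at-a-time cost formula for both orderings. The relation sizes below are hypothetical and the function name is an assumption.

```python
def nested_loop_cost(outer_tuples, outer_blocks, inner_blocks):
    """Block reads for a tuple-at-a-time nested loop join."""
    return outer_tuples * inner_blocks + outer_blocks


# Hypothetical sizes, chosen only for illustration.
r_tuples, r_blocks = 5000, 100
s_tuples, s_blocks = 10000, 400

cost_r_outer = nested_loop_cost(r_tuples, r_blocks, s_blocks)
cost_s_outer = nested_loop_cost(s_tuples, s_blocks, r_blocks)
print(cost_r_outer, cost_s_outer)
```

With these made-up sizes the two orderings differ by roughly a factor of two, which shows that the choice of outer relation is not cosmetic.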

Block Nested Loop
Nested Loop Join, but block by block instead.
Cost for R join S, where R is outer, S is inner:
  –#blocks in R * #blocks in S + #blocks in R
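A block-by-block join might look like the following sketch, where each inner list stands in for one disk block. The function name and the sample data are hypothetical; a real engine would read blocks from disk rather than iterate over Python lists.

```python
def block_nested_loop_join(outer_blocks, inner_blocks, condition):
    """Join two relations block by block; returns the matching (r, s) pairs."""
    output = []
    for outer_block in outer_blocks:        # each outer block is read once
        for inner_block in inner_blocks:    # full pass over the inner relation
            for r in outer_block:           # compare the two blocks in memory
                for s in inner_block:
                    if condition(r, s):
                        output.append((r, s))
    return output


# Hypothetical relations of single-attribute tuples, two blocks each.
r_blocks = [[1, 2], [3]]
s_blocks = [[2, 3], [4]]
pairs = block_nested_loop_join(r_blocks, s_blocks, lambda r, s: r == s)
print(pairs)
```

The inner relation is scanned once per outer block rather than once per outer tuple, which is where the saving over the tuple-at-a-time version comes from.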

Block Nested Loop Improvements
Sorted relations?
More memory?

Indexed Nested Loop Join
Assume we have an index on a join attribute of one of the relations, R or S.
Questions:
  –Which relation should the index be on?
  –Or, if both have indices, which should be the outer one?

Indexed Nested Loop Join Cost
Cost with R as the outer relation and an index on S:
  –#blocks in R + #rows in R * Ls
  –Ls is the cost of looking up a record in S using the index.
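As a rough illustration, a Python dictionary can stand in for the index on S: each outer row triggers one lookup instead of a scan. The helper names, the columns `a`, `b`, `x`, `y`, and the rows are all assumptions for the example.

```python
from collections import defaultdict


def build_index(rows, key):
    """Build a simple in-memory index: key value -> list of rows with that value."""
    index = defaultdict(list)
    for row in rows:
        index[row[key]].append(row)
    return index


def indexed_nested_loop_join(outer_rows, inner_index, outer_key):
    """For each outer row, probe the index instead of scanning the inner relation."""
    return [
        (r, s)
        for r in outer_rows
        for s in inner_index.get(r[outer_key], [])
    ]


# Hypothetical relations R(a, x) and S(b, y), joined on R.a = S.b.
r_rows = [{"a": 1, "x": "p"}, {"a": 2, "x": "q"}, {"a": 5, "x": "r"}]
s_rows = [{"b": 2, "y": "u"}, {"b": 5, "y": "v"}, {"b": 9, "y": "w"}]

s_index = build_index(s_rows, "b")
joined = indexed_nested_loop_join(r_rows, s_index, "a")
print(joined)
```

Here the per-row lookup cost Ls is a dictionary probe; on disk it would be the cost of traversing the index plus fetching the matching record.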

More Joins
Merge join
  –Sort R and S on the join attribute, and then merge them.
Hash join
  –Hash R and S into buckets on the join attribute, and compare the contents of matching buckets.
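The hash join idea can be sketched as follows: both relations are partitioned with the same hash function, so matching tuples can only meet inside buckets with the same number. The function name, the bucket count of 4, and the sample rows are assumptions for illustration.

```python
def hash_join(r_rows, s_rows, r_key, s_key, num_buckets=4):
    """Partition both inputs by hash of the join attribute, then join bucket by bucket."""
    r_buckets = [[] for _ in range(num_buckets)]
    s_buckets = [[] for _ in range(num_buckets)]
    for r in r_rows:
        r_buckets[hash(r[r_key]) % num_buckets].append(r)
    for s in s_rows:
        s_buckets[hash(s[s_key]) % num_buckets].append(s)

    output = []
    for r_bucket, s_bucket in zip(r_buckets, s_buckets):
        # Only tuples that landed in the same bucket number are compared.
        for r in r_bucket:
            for s in s_bucket:
                if r[r_key] == s[s_key]:
                    output.append((r, s))
    return output


# Hypothetical relations R(a, x) and S(b, y), joined on R.a = S.b.
r_rows = [{"a": 1, "x": "p"}, {"a": 2, "x": "q"}, {"a": 5, "x": "r"}]
s_rows = [{"b": 2, "y": "u"}, {"b": 5, "y": "v"}, {"b": 9, "y": "w"}]
joined = hash_join(r_rows, s_rows, "a", "b")
print(sorted((r["a"], s["b"]) for r, s in joined))
```

The equality check inside a bucket is still needed, because different key values can hash to the same bucket.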

Evaluation
Materialization: Build intermediate tables as evaluation proceeds up the tree.
Here, one intermediate table is created for the select and becomes the input of the project.
Tree, root to leaf: Π_balance → σ_balance<2500 → account

Materialization Cost
Cost of writing out intermediate results to disk.

Pipelining
Compute several operations simultaneously.
As soon as a tuple is produced by one operation, send it to the next.
Here, send selected tuples straight to the projection.
Tree, root to leaf: Π_balance → σ_balance<2500 → account

Implementation of Pipelining
Requires buffers for each operation.
Can be:
  –Demand driven: an operator must be asked to generate a tuple.
  –Producer driven: an operator generates tuples whether it's asked for them or not.
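Python generators give a compact way to picture a demand-driven pipeline: no operator does any work until the consumer asks for the next tuple. This is only a sketch of the idea; the operator names and the sample `account` rows are invented for the example.

```python
def scan(rows):
    """Leaf operator: hand out stored rows one at a time."""
    for row in rows:
        yield row


def select(source, predicate):
    """Pull rows from the child and pass along only those that match."""
    for row in source:
        if predicate(row):
            yield row


def project(source, attributes):
    """Pull rows from the child and keep only the named attributes."""
    for row in source:
        yield {a: row[a] for a in attributes}


# Hypothetical account relation.
account = [
    {"account_number": "A-101", "balance": 500},
    {"account_number": "A-215", "balance": 2500},
    {"account_number": "A-305", "balance": 1800},
]

plan = project(select(scan(account), lambda r: r["balance"] < 2500), ["balance"])
first = next(plan)   # asking for one tuple pulls just enough rows through the pipeline
rest = list(plan)    # draining the plan pulls the remaining rows
print(first, rest)
```

No intermediate table is ever built: each selected tuple flows straight into the projection, which is the contrast with materialization.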

Query Optimization

Some Actions of Query Optimization
Reordering joins.
Changing the positions of projections and selections.
Changing the access structures used to read data.

Catalog Info
Number of tuples in r.
Number of blocks for r.
Size of a tuple of r.
Blocking factor of r: the number of r tuples that fit in a block.
The number of distinct values of each attribute of r.
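These statistics feed simple estimates. The sketch below shows two common ones under stated assumptions: the block count assumes the tuples of r are stored together in one file, and the equality estimate assumes values are uniformly distributed. The function names and the sample numbers are hypothetical.

```python
import math


def estimated_blocks(num_tuples, blocking_factor):
    """Blocks needed for r, assuming its tuples are stored together in one file."""
    return math.ceil(num_tuples / blocking_factor)


def estimated_equality_matches(num_tuples, distinct_values):
    """Expected tuples matching attribute = value, assuming a uniform distribution."""
    return num_tuples / distinct_values


# Hypothetical catalog entries: 10000 tuples, 25 tuples per block,
# and 50 distinct values for the attribute being tested.
print(estimated_blocks(10000, 25))
print(estimated_equality_matches(10000, 50))
```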