Query Optimization CS 157B Ch. 14 Mien Siao. Outline Introduction Steps in Cost-based query optimization- Query Flow Projection Example Query Interaction.


1 Query Optimization CS 157B Ch. 14 Mien Siao

2 Outline - Introduction - Steps in cost-based query optimization - Query Flow - Projection Example - Query Interaction in DBMS - Cost-based query optimization: Algebraic Expressions

3 Introduction What is Query Optimization? Suppose you were given a chance to visit 15 different pre-selected cities in Europe. The only constraint is ‘Time’. -> Would you plan your route, or would you visit the cities in any order?

4 Europe

5 Plan: -> Place the 15 cities in groups based on their proximity to each other. -> Start with one group and move on to the next group. The important point is that you visit the cities in a more organized manner, and so deal with the ‘Time’ constraint efficiently.

6 Query Optimization works in a similar way: There can be many different ways to compute the answer to a given query. The result is the same in all of them. A DBMS strives to process the query in the most efficient way (in terms of ‘Time’) to produce the answer. Cost = Time needed to get all answers

7 Starting with System-R, most commercial DBMSs use cost-based optimizers. Cost estimation should be accurate and easy to compute. The estimates also need to be logically consistent, so that the plan with the least estimated cost is consistently the cheapest one to run.

8 Steps in a Cost-based query optimization 1. Parsing 2. Transformation 3. Implementation 4. Plan selection based on cost estimates

9 Query Flow SQL -> Parser -> Optimizer -> Code Generator/Interpreter -> Processor

10 Query Parser: Verifies the validity of the SQL statement and translates the query into an internal structure using relational calculus. Query Optimizer: Finds the best expression among the various equivalent algebraic expressions. The criterion used is ‘Cheapness’. Code Generator/Interpreter: Turns the optimizer's output into calls to the Query Processor. Query Processor: Executes the calls obtained from the code generator.
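The four stages above can be sketched as a small pipeline. This is only an illustration, not how any particular DBMS is built: the function names (parse, optimize, generate_calls, process, run_query) and the Plan type are hypothetical, the parser does nothing more than normalize whitespace, and the candidate plans with their cost estimates are supplied by hand rather than derived from the query.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Plan:
    description: str          # hypothetical label for a physical plan
    estimated_cost_ms: float  # cost estimate, assumed to be given


def parse(sql: str) -> str:
    """Toy parser: normalize whitespace and accept only SELECT statements."""
    text = " ".join(sql.split())
    if not text.upper().startswith("SELECT"):
        raise ValueError("this sketch only handles SELECT statements")
    return text


def optimize(parsed: str, candidates: List[Plan]) -> Plan:
    """Pick the cheapest candidate plan (the 'cheapness' criterion)."""
    return min(candidates, key=lambda p: p.estimated_cost_ms)


def generate_calls(plan: Plan) -> List[str]:
    """Turn the chosen plan into calls for the query processor."""
    return [f"EXECUTE {plan.description}"]


def process(calls: List[str]) -> List[str]:
    """Toy query processor: pretend to execute each call."""
    return [f"done: {call}" for call in calls]


def run_query(sql: str, candidates: List[Plan]) -> List[str]:
    # SQL -> Parser -> Optimizer -> Code Generator -> Processor
    parsed = parse(sql)
    best = optimize(parsed, candidates)
    return process(generate_calls(best))


if __name__ == "__main__":
    plans = [Plan("plan 1", 100.0), Plan("plan 2", 50.0)]
    print(run_query("SELECT  pname FROM Patients", plans))
```

A real optimizer would generate the candidate plans from the parsed query itself; passing them in keeps the sketch short.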

11 The cost of a physical plan includes processor time and communication time. The most important factor to consider is disk I/O, because it is the most time-consuming action. Some other costs are associated with: - Operations (joins, unions, intersections). - The order of operations. Why?

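12 Joins, unions, and intersections are associative and commutative, so they can be reordered without changing the result, and different orders can have different costs. - Managing the storage of arguments and passing them between operations. The factors mentioned above should be minimized when creating the best physical plan.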
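A small sketch of why the order of operations matters. Every number here is hypothetical: three relations R, S, T with made-up sizes, and made-up join selectivities (the fraction of tuple pairs that match). The cost measure, the size of the first intermediate result, is also a simplifying assumption and not the slides' cost model. Because joins are associative and commutative, every order yields the same final result, but the intermediate result can differ widely.

```python
from itertools import combinations

# Hypothetical relation sizes (number of tuples).
sizes = {"R": 1000, "S": 100, "T": 10}

# Hypothetical selectivities; R and T share no join predicate,
# so joining them first is a full cross product (selectivity 1.0).
selectivity = {
    frozenset(("R", "S")): 0.01,
    frozenset(("S", "T")): 0.1,
    frozenset(("R", "T")): 1.0,
}


def intermediate_size(first_pair):
    """Estimated size of the result of joining the first two relations."""
    a, b = first_pair
    return sizes[a] * sizes[b] * selectivity[frozenset((a, b))]


# Estimate the intermediate result for each choice of first join.
costs = {pair: intermediate_size(pair) for pair in combinations(sorted(sizes), 2)}
best = min(costs, key=costs.get)

if __name__ == "__main__":
    for pair, size in costs.items():
        print(f"join {pair[0]} and {pair[1]} first: about {size:.0f} intermediate tuples")
    print("cheapest first join:", best)
```

With these assumed numbers, joining S and T first keeps the intermediate result smallest, while starting with R and T produces the largest one.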

13 Projection Example: A projection produces one result tuple for every argument tuple. What changes? The number of tuples stays the same; the output size changes only because the length of each tuple changes. Let’s take a relation ‘R’. Relation (20,000 tuples): R(a, b, c) Each Tuple (190 bytes): header = 24 bytes, a = 8 bytes, b = 8 bytes, c = 150 bytes Each Block (1024 bytes): header = 24 bytes

14 We can fit 5 tuples into 1 block - 5 tuples * 190 bytes/tuple = 950 bytes, which fits into the 1,000 bytes left in a block after its 24-byte header - For 20,000 tuples, we would require 4,000 blocks (20,000 tuples / 5 tuples per block = 4,000 blocks) With a projection that eliminates column c (150 bytes), we could estimate that each tuple would shrink to 40 bytes (190 - 150 bytes)

15 Now, the new estimate is 25 tuples in 1 block. - 25 tuples * 40 bytes/tuple = 1,000 bytes, which just fits into 1 block - With 20,000 tuples, the new estimate is 800 blocks (20,000 tuples / 25 tuples per block = 800 blocks) The result is a reduction by a factor of 5 (4,000 blocks -> 800 blocks)
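The arithmetic of the projection example can be checked with a few lines. The function name blocks_needed is made up for this sketch; the sizes are the ones given in the example (1,024-byte blocks with a 24-byte header, 190-byte tuples before the projection and 40-byte tuples after it), and the sketch assumes tuples are never split across blocks.

```python
def blocks_needed(num_tuples, tuple_bytes, block_bytes=1024, block_header=24):
    """Return (tuples per block, blocks required), assuming whole tuples per block."""
    usable = block_bytes - block_header          # 1000 bytes of payload per block
    tuples_per_block = usable // tuple_bytes     # whole tuples only
    blocks = -(-num_tuples // tuples_per_block)  # ceiling division
    return tuples_per_block, blocks


if __name__ == "__main__":
    before = blocks_needed(20_000, 24 + 8 + 8 + 150)  # header + a + b + c
    after = blocks_needed(20_000, 24 + 8 + 8)         # column c projected away
    print("before projection:", before)  # (5, 4000)
    print("after projection:", after)    # (25, 800)
    print("reduction factor:", before[1] // after[1])  # 5
```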

16 Query interaction in DBMS How does a query interact with a DBMS? - Interactive users - Embedded queries in programs written in C, C++, etc. What is the difference between these two?

17 Interactive Users: - When there is an interactive user query, the query goes through the Query Parser, Query Optimizer, Code Generator, and Query Processor each time.

18 Embedded Query: - When there is an embedded query, the query does not have to go through the Query Parser, Query Optimizer, Code Generator, and Query Processor each time.

19 - In an embedded query, the calls generated by the code generator are stored in the database. Each time the query is reached within the program at run-time, the Query Processor invokes the stored calls in the database. - Optimization is therefore independent of execution in embedded queries: it is not repeated each time the query runs.
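The difference between the two modes can be sketched with a counter. This is a toy model, not a real DBMS interface: the class QueryEngine and its methods are hypothetical, a Python dict stands in for the calls stored in the database, and for simplicity the embedded query is compiled on first use rather than ahead of time.

```python
class QueryEngine:
    """Toy engine that counts how often a query is parsed, optimized and compiled."""

    def __init__(self):
        self.compile_count = 0
        self.stored_calls = {}  # stands in for calls stored in the database

    def _compile(self, sql):
        # Parser + Optimizer + Code Generator, collapsed into one step.
        self.compile_count += 1
        return "calls for: " + " ".join(sql.split())

    def _execute(self, calls):
        # Query Processor: run the generated calls.
        return "executed " + calls

    def run_interactive(self, sql):
        # Interactive query: the full pipeline runs every time.
        return self._execute(self._compile(sql))

    def run_embedded(self, sql):
        # Embedded query: compile once, then reuse the stored calls.
        if sql not in self.stored_calls:
            self.stored_calls[sql] = self._compile(sql)
        return self._execute(self.stored_calls[sql])


if __name__ == "__main__":
    sql = "SELECT pname FROM Patients"

    interactive = QueryEngine()
    for _ in range(3):
        interactive.run_interactive(sql)
    print("interactive compilations:", interactive.compile_count)  # 3

    embedded = QueryEngine()
    for _ in range(3):
        embedded.run_embedded(sql)
    print("embedded compilations:", embedded.compile_count)  # 1
```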

20 Cost-based query Optimization: Algebraic Expressions Consider the following query: SELECT p.pname, d.dname FROM Patients p, Doctors d WHERE p.doctor = d.dname AND d.dgender = 'M'

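21 Initial expression tree (top to bottom): projection -> filter -> join -> Scan(Patients), Scan(Doctors). Written as a nested expression: projection(filter(join(Scan(Patients), Scan(Doctors))))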

22 Cost-based query Optimization: Transformation Two equivalent expression trees, written as nested expressions. Tree 1: projection(filter(join(Scan(Patients), Scan(Doctors)))) Tree 2: projection(join(Scan(Patients), filter(Scan(Doctors))))
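The transformation moves the filter on Doctors below the join. A minimal sketch on toy data suggests why this is safe and why it can be cheaper. The rows below are invented for illustration, the join is a plain nested loop, and counting tuple comparisons is only a rough stand-in for cost.

```python
# Hypothetical sample data: Patients(pname, doctor), Doctors(dname, dgender).
patients = [("Ann", "Lee"), ("Bob", "Kim"), ("Cho", "Lee"), ("Dee", "Roy")]
doctors = [("Lee", "M"), ("Kim", "F"), ("Roy", "M")]


def filter_after_join():
    """Join first, then keep only rows whose doctor has dgender = 'M'."""
    comparisons = len(patients) * len(doctors)  # nested-loop join work
    joined = [(p, d) for p in patients for d in doctors if p[1] == d[0]]
    kept = [(p, d) for p, d in joined if d[1] == "M"]
    return sorted((p[0], d[0]) for p, d in kept), comparisons


def filter_before_join():
    """Filter Doctors first, then join the smaller input."""
    male = [d for d in doctors if d[1] == "M"]
    comparisons = len(patients) * len(male)     # fewer pairs to compare
    joined = [(p, d) for p in patients for d in male if p[1] == d[0]]
    return sorted((p[0], d[0]) for p, d in joined), comparisons


if __name__ == "__main__":
    rows_a, work_a = filter_after_join()
    rows_b, work_b = filter_before_join()
    print(rows_a == rows_b)  # True: both trees give the same answer
    print(work_a, work_b)    # 12 8
```

Both functions project the same (pname, dname) pairs; only the amount of join work differs.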

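23 Cost-based query Optimization: Implementation Each logical join is assigned a physical join operator. Plan 1: projection(filter(natural join(Scan(Patients), Scan(Doctors)))) Plan 2: projection(hash join(Scan(Patients), filter(Scan(Doctors))))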

24 Cost-based query Optimization: Plan selection based on costs Plan 1: projection(filter(natural join(Scan(Patients), Scan(Doctors)))), Estimated Cost = 100 ms Plan 2: projection(hash join(Scan(Patients), filter(Scan(Doctors)))), Estimated Cost = 50 ms

