Mostafa Elhemali Leo Giakoumakis. Problem definition QRel system overview Case Study Conclusion 2.

Presentation transcript:

Mostafa Elhemali, Leo Giakoumakis

Slide 2:
- Problem definition
- QRel system overview
- Case Study
- Conclusion

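Slide 3: Query processing in SQL Server
[Diagram: a SQL query goes through Parse and Optimize; the remaining steps are labeled "?" and "Profit".]
SQL query:
    SELECT T.a, R.z FROM T JOIN R ON T.b = R.y WHERE T.c < 5
Logical tree: π T.a,R.z ( σ T.c<5 (T) ⋈ T.b=R.y R )
Physical plan: π T.a,R.z ( Merge ⋈ T.b=R.y ( σ T.c<5 (Scan Index T.Ib), Sort on y (Scan Table R) ) )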

Slide 4: Transformation rules
[Diagram: the same logical tree and physical plan as on the previous slide, connected by transformation rules. The rules shown include one over σ T.c<5 and ⋈, one from ⋈ to Merge ⋈, and one from ⋈ to Hash ⋈.]

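Slide 5: What we want to test
System input:
    SELECT T.a, R.z FROM T JOIN R ON T.b = R.y WHERE T.c < 5
System output: π T.a,R.z ( Merge ⋈ T.b=R.y ( σ T.c<5 (Scan Index T.Ib), Sort on y (Scan Table R) ) )
What we want to test: the transformation rules applied in between (the diagram shows one over σ T.c<5 and ⋈).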

Slide 6: Rule to test: Push Aggregates Below Join
[Diagram: G over ⋈ is rewritten to ⋈ over G, where G is the aggregation operator.]

Slide 7: Manual Approach – Construct equivalent SQL
- Manually construct equivalent SQL constructs
- Think of which tables to use
- Think of exact join predicate, grouping columns, aggregates, etc.
[Diagram: G over ⋈ is rewritten to ⋈ over G.]
    SELECT T.a, SUM(T.c) FROM T JOIN R ON T.a = R.y GROUP BY T.a
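To make the rule concrete, here is a small Python sketch that runs the slide's query next to one hand-written form with the aggregate pushed below the join and checks that both return the same rows. It uses SQLite rather than SQL Server, the sample rows are made up, and the rewritten query is only one common shape of this rewrite, not necessarily what the optimizer rule produces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T (a INTEGER, b INTEGER, c INTEGER);
    CREATE TABLE R (y INTEGER, z INTEGER);
    INSERT INTO T VALUES (1, 10, 2), (1, 11, 3), (2, 12, 7), (3, 13, 1);
    INSERT INTO R VALUES (1, 100), (1, 101), (2, 200);
""")

# Aggregate above the join: the query from the slide.
agg_above = """
    SELECT T.a, SUM(T.c) FROM T JOIN R ON T.a = R.y
    GROUP BY T.a ORDER BY T.a
"""

# Aggregate below the join (assumed shape): pre-aggregate T, join, then
# combine the partial sums, which repeat once per matching row of R.
agg_below = """
    SELECT P.a, SUM(P.s)
    FROM (SELECT a, SUM(c) AS s FROM T GROUP BY a) AS P
    JOIN R ON P.a = R.y
    GROUP BY P.a ORDER BY P.a
"""

rows_above = conn.execute(agg_above).fetchall()
rows_below = conn.execute(agg_below).fetchall()
print(rows_above)
print(rows_below)
```

Writing even this one pair by hand requires choosing tables, columns, and data, which is the effort the slide is pointing at.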

Slide 8: Manual Approach – Variation on basic case
- Project between Aggregation and Join
[Diagram: the same rule, with a projection π between G and ⋈.]
    SELECT M.Tb, SUM(M.p)
    FROM (SELECT T.b Tb, (T.a + T.c) p FROM T JOIN R ON T.b = R.y) M
    GROUP BY M.Tb

Slide 9:
- Manual construction of SQL test cases is hard
- Manual variation of these cases is harder
- SQL test cases are over-specified
- Harder to maintain
    SELECT M.Tb, SUM(M.p)
    FROM (SELECT T.b Tb, (T.a + T.c) p FROM T JOIN R ON T.b = R.y) M
    GROUP BY M.Tb
- Is the goal to test Aggregation over a sum?
- Is the choice of T & R significant?

Slide 10:
Allow testers to write test cases in abstract relational trees:
    ⋈ σ T.c<5
Yet present SQL Server with concrete SQL queries:
    SELECT T.a, R.z FROM T JOIN R ON T.b = R.y WHERE T.c < 5

Slide 11:
[Diagram: the three QRel components and sample artifacts.
 RelOps: OpFilter, OpJoin, …, Expression, Predicate
 RelGen: a relational tree with σ T.x=2 and ⋈ T.a=T2.b over T and T2
 SqlGen: SELECT … FROM T JOIN T2 ON T.a = T2.b WHERE T.x = 2]

Slide 12:
Allow testers to write their relational tree test cases
    ⋈ σ T.c<5
Represented as: OpFilter, PredCompare(LT) over ExpColumn(T.c) and ExpScalar(5), OpInnerJoin
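The object representation can be pictured with a short Python sketch. The class names echo the ones on the slide, but these dataclasses are stand-ins written for illustration, not the RelOps .NET classes; `OpGet`'s `table` field, the `SYMBOLS` map, and `describe` are assumptions. The tree built here is the fully specified one for the running example query.

```python
from dataclasses import dataclass


@dataclass
class ExpColumn:
    name: str


@dataclass
class ExpScalar:
    value: int


@dataclass
class PredCompare:
    op: str          # "LT" or "EQ" in this sketch
    left: object
    right: object


@dataclass
class OpGet:
    table: str


@dataclass
class OpFilter:
    child: object
    predicate: PredCompare


@dataclass
class OpInnerJoin:
    left: object
    right: object
    predicate: PredCompare


SYMBOLS = {"LT": "<", "EQ": "="}


def describe(node):
    """Render a tree as a one-line relational algebra string."""
    if isinstance(node, ExpColumn):
        return node.name
    if isinstance(node, ExpScalar):
        return str(node.value)
    if isinstance(node, PredCompare):
        return f"{describe(node.left)} {SYMBOLS[node.op]} {describe(node.right)}"
    if isinstance(node, OpGet):
        return node.table
    if isinstance(node, OpFilter):
        return f"σ[{describe(node.predicate)}]({describe(node.child)})"
    if isinstance(node, OpInnerJoin):
        return (f"({describe(node.left)} "
                f"⋈[{describe(node.predicate)}] {describe(node.right)})")
    raise TypeError(f"unknown node: {node!r}")


tree = OpInnerJoin(
    left=OpFilter(OpGet("T"), PredCompare("LT", ExpColumn("T.c"), ExpScalar(5))),
    right=OpGet("R"),
    predicate=PredCompare("EQ", ExpColumn("T.b"), ExpColumn("R.y")),
)
print(describe(tree))
```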

Slide 13:
- .NET classes for all relational and scalar operators expressible in SQL
  - Relational: Join, Selection, Sort, …
  - Scalar: Predicates, Expressions
- Does not represent any operator not expressible in SQL, e.g. semi-join
- Metadata extraction
- Basic property derivation
  - Output columns for relational operators
  - Data type for scalar expressions
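Property derivation of this kind can be sketched in a few lines of Python. The tuple encoding of operators, the `SCHEMA` catalog with its column types, and both function names are assumptions made for this example; the slides do not show how RelOps implements it.

```python
# Hypothetical catalog: table -> {column: data type}.
SCHEMA = {
    "T": {"a": "int", "b": "int", "c": "int"},
    "R": {"y": "int", "z": "varchar"},
}


def output_columns(op):
    """Derive the output columns of a relational operator from its inputs."""
    kind = op[0]
    if kind == "get":                      # ("get", table)
        return [f"{op[1]}.{col}" for col in SCHEMA[op[1]]]
    if kind == "filter":                   # ("filter", child, predicate)
        return output_columns(op[1])
    if kind == "join":                     # ("join", left, right, predicate)
        return output_columns(op[1]) + output_columns(op[2])
    if kind == "project":                  # ("project", child, columns)
        return list(op[2])
    raise ValueError(f"unknown operator: {kind}")


def data_type(expr):
    """Derive the data type of a scalar expression."""
    kind = expr[0]
    if kind == "col":                      # ("col", "T.c")
        table, column = expr[1].split(".")
        return SCHEMA[table][column]
    if kind == "const":                    # ("const", 5)
        return "int" if isinstance(expr[1], int) else "varchar"
    if kind == "cmp":                      # ("cmp", "<", left, right)
        return "bool"
    raise ValueError(f"unknown expression: {kind}")


predicate = ("cmp", "<", ("col", "T.c"), ("const", 5))
join = ("join", ("filter", ("get", "T"), predicate), ("get", "R"),
        ("cmp", "=", ("col", "T.b"), ("col", "R.y")))

print(output_columns(join))
print(data_type(predicate), data_type(("col", "R.z")))
```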

Slide 14:
Allow testers to write only the essential parts of the relational tree, and automatically fill out the rest
[Diagram: the partial tree ⋈ σ T.c<5 is filled out to π T.a,R.z ( σ T.c<5 (T) ⋈ T.b=R.y R ).]

Slide 15:
Top-down generation of relational tree skeleton, followed by bottom-up filling out phase
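The two phases can be illustrated with a minimal Python sketch. Everything here is hypothetical (the `TABLES` catalog, the dictionary encoding, the function names, the three-operator vocabulary); it only shows the control flow: shapes are chosen from the root down, and tables and predicates are filled in from the leaves up, once the columns available at each node are known.

```python
import itertools
import random

# Hypothetical catalog: table name -> column names.
TABLES = {"T": ["a", "b", "c"], "R": ["y", "z"]}


def skeleton(depth, rng):
    """Top-down phase: choose operator shapes only; no tables or predicates."""
    op = "get" if depth == 0 else rng.choice(["get", "filter", "join"])
    if op == "filter":
        return {"op": "filter", "child": skeleton(depth - 1, rng)}
    if op == "join":
        return {"op": "join",
                "left": skeleton(depth - 1, rng),
                "right": skeleton(depth - 1, rng)}
    return {"op": "get"}


def fill(node, rng, ids):
    """Bottom-up phase: fill children first; return this node's output columns."""
    if node["op"] == "get":
        node["table"] = rng.choice(sorted(TABLES))
        node["alias"] = f"{node['table']}{next(ids)}"  # unique, so self-joins work
        return [f"{node['alias']}.{c}" for c in TABLES[node["table"]]]
    if node["op"] == "filter":
        columns = fill(node["child"], rng, ids)
        node["predicate"] = f"{rng.choice(columns)} < {rng.randint(0, 9)}"
        return columns
    left = fill(node["left"], rng, ids)
    right = fill(node["right"], rng, ids)
    node["predicate"] = f"{rng.choice(left)} = {rng.choice(right)}"
    return left + right


def is_complete(node):
    """True when every leaf has a table and every other operator a predicate."""
    if node["op"] == "get":
        return "table" in node
    children = [node[k] for k in ("child", "left", "right") if k in node]
    return "predicate" in node and all(is_complete(c) for c in children)


def generate(seed, depth=3):
    rng = random.Random(seed)
    tree = skeleton(depth, rng)
    columns = fill(tree, rng, itertools.count(1))
    return tree, columns


tree, columns = generate(seed=7)
print(tree)
print(columns)
```

Seeding the generator keeps a randomly generated test case reproducible, which matters once a generated query exposes a bug.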

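Slide 16:
[Diagram: RelGen inputs and the tree being generated.
 Possible operators: π σ ⋈ → G
 Available relations: T, R, …
 Predicate generators: CompareColumnAndScalar, CompareTwoColumns, InSubquery, …
 Tree being generated: π T.a,R.z, ⋈ T.b=R.y, and σ T.c<5 over T and R, with column lists a,b,c and x,y]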

Slide 17:
- Tree generation is highly customizable for targeted testing
- Can start from a partially filled out tree
- Customizable probability distributions for any random parts, e.g. choice of relational operators, choice of predicates
- Use of constraints to influence the tree generation
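A customizable distribution over operators might look like the following Python sketch. The operator names, the weights, and `pick_operators` are invented for illustration; the slides do not describe how RelGen exposes this setting.

```python
import random
from collections import Counter


def pick_operators(weights, count, seed):
    """Draw `count` operator names according to tester-supplied weights."""
    rng = random.Random(seed)
    names = sorted(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=count)


# A uniform mix versus a join-heavy mix for a test that targets join rules.
# A weight of 0 removes an operator from generation entirely.
default_mix = {"filter": 1, "join": 1, "aggregate": 1, "project": 1}
join_heavy = {"filter": 1, "join": 8, "aggregate": 0, "project": 1}

print(Counter(pick_operators(default_mix, 1000, seed=1)))
print(Counter(pick_operators(join_heavy, 1000, seed=1)))
```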

Slide 18:
- Trees should be free of logical errors
  - E.g. no Aggregation over XML columns
- Trees (and subtrees) should not be trivially optimized away
  - Avoid contradictory predicates
- Operators should yield (many) rows if executed
  - Reasoning: makes them more expensive, lures the optimizer into deeper optimizations. Also, we use QRel to test all of QP (query processing) at times
  - Use real data in predicates, e.g. Country = 'England'
  - Use PK-FK columns in joins
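One way to avoid contradictory predicates is to check each conjunction of column-versus-constant comparisons before accepting it. The Python sketch below tracks a lower and an upper bound per column; the triple encoding and the function name are assumptions, and the check treats values as a dense ordered domain, so it would not notice that `c > 4 AND c < 5` is empty for integers.

```python
import math


def is_contradictory(conjuncts):
    """conjuncts: list of (column, op, constant) triples, all ANDed together."""
    bounds = {}  # column -> (low, low_inclusive, high, high_inclusive)
    for column, op, value in conjuncts:
        low, low_inc, high, high_inc = bounds.get(
            column, (-math.inf, False, math.inf, False))
        if op in (">", ">=", "="):
            inclusive = op != ">"
            # Tighten the lower bound if the new one is stricter.
            if value > low or (value == low and not inclusive):
                low, low_inc = value, inclusive
        if op in ("<", "<=", "="):
            inclusive = op != "<"
            # Tighten the upper bound if the new one is stricter.
            if value < high or (value == high and not inclusive):
                high, high_inc = value, inclusive
        bounds[column] = (low, low_inc, high, high_inc)
        if low > high or (low == high and not (low_inc and high_inc)):
            return True
    return False


print(is_contradictory([("T.c", "<", 5), ("T.c", ">", 10)]))
print(is_contradictory([("T.c", "<", 5), ("T.c", ">=", 0)]))
```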

Slide 19:
Presents the server with proper SQL queries from the relational tree test cases
[Diagram: the tree π T.a,R.z ( σ T.c<5 (T) ⋈ T.b=R.y R ) becomes the query below.]
    SELECT T.a, R.z FROM T JOIN R ON T.b = R.y WHERE T.c < 5

Slide 20:
- Subquery and derived table formation
- SQL clause formation
- Table and column aliasing (inverse of binding)

Slide 21:
[Diagram: relational tree with π SUM(T.a),R.z over ⋈ T.b=R.y, whose inputs are σ T.b<5 over G T.b,SUM(T.a) over T, and R.]
    SELECT SUM(T1_1.C2) AS C1, T1_2.z AS C2
    FROM (
        SELECT T2_1.b AS C1, SUM(T2_1.a) AS C2
        FROM T AS T2_1
        GROUP BY T2_1.b
        HAVING T2_1.b < 5
    ) AS T1_1 JOIN R AS T1_2 ON T1_1.C1 = T1_2.y
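The clause formation and aliasing steps can be sketched for the simplest case, a join of optionally filtered base tables. This Python sketch is an assumption-laden illustration: the tuple encoding and `to_sql` are hypothetical, the `T1_n` alias pattern merely imitates the generated SQL on the slide, and derived tables, subqueries, and self-joins are not handled.

```python
def to_sql(tree, columns):
    """Turn a small relational tree into a single SELECT statement."""
    aliases = {}   # table name -> generated alias
    filters = []   # (column, op, constant) triples collected for WHERE

    def ref(column):
        # Inverse of binding: rewrite "T.c" to use the generated alias.
        table, name = column.split(".")
        return f"{aliases[table]}.{name}"

    def visit(node):
        kind = node[0]
        if kind == "get":        # ("get", table)
            aliases[node[1]] = f"T1_{len(aliases) + 1}"
            return f"[{node[1]}] AS {aliases[node[1]]}"
        if kind == "filter":     # ("filter", child, (column, op, constant))
            source = visit(node[1])
            filters.append(node[2])
            return source
        if kind == "join":       # ("join", left, right, (lcol, op, rcol))
            left, right = visit(node[1]), visit(node[2])
            lcol, op, rcol = node[3]
            return f"{left} JOIN {right} ON {ref(lcol)} {op} {ref(rcol)}"
        raise ValueError(f"unknown operator: {kind}")

    from_clause = visit(tree)
    sql = f"SELECT {', '.join(ref(c) for c in columns)} FROM {from_clause}"
    if filters:
        sql += " WHERE " + " AND ".join(
            f"{ref(c)} {op} {v}" for c, op, v in filters)
    return sql


tree = ("join",
        ("filter", ("get", "T"), ("T.c", "<", 5)),
        ("get", "R"),
        ("T.b", "=", "R.y"))
query = to_sql(tree, ["T.a", "R.z"])
print(query)
```

Moving the filter into WHERE is only safe here because the join is an inner join; a fuller generator would wrap subtrees in derived tables, as the slide's example does for the aggregation.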

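Slide 22: Rule to test: ApplySJ
Legend: ApplySJ = Apply-Semi-Join, UA = Union All, U = Union
[Diagram: three trees, with the labels "Preliminary rule", "Target rule", and "Cannot be represented".
 - σ over R with the predicate EXISTS σ1 OR EXISTS σ2
 - ApplySJ over R and UA of σ1 and σ2
 - U of ApplySJ over R and σ1, and ApplySJ over R and σ2]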

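Slide 23:
RelOps code:
    // Two EXISTS subqueries; the empty OpGet() and the null predicate are left unspecified
    PredExists a = new PredExists(new OpFilter(new OpGet(), null));
    PredExists b = new PredExists(new OpFilter(new OpGet(), null));
    // Combine the two subqueries with OR and use the result as the main filter's predicate
    PredBinaryOp p = new PredBinaryOp(a, b, PredBinaryOp.LogicOp.Or);
    OpFilter mainFilter = new OpFilter(new OpGet(), p);
… And that's the basic test case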

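Slide 24: Goes into RelGen (using TPC-H database):
[Diagram: the tree σ with predicate EXISTS σ OR EXISTS σ is filled in by RelGen.
 Available relations: ORDERS (O_SHIPPRIORITY, O_COMMENT, …), SUPPLIER (S_NATIONKEY, S_ADDRESS, …), …
 Predicate generators: CorrelationPredicate
 Filled-in tree: outer σ over ORDERS; first EXISTS is σ S_NATIONKEY >= O_SHIPPRIORITY over SUPPLIER; second EXISTS is σ S_ADDRESS < O_COMMENT over SUPPLIER]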

Slide 25: Goes into SqlGen:
[Diagram: the filled-in tree, with σ over ORDERS and two EXISTS subqueries over SUPPLIER joined by OR.]
    SELECT *
    FROM [ORDERS] AS T1_1
    WHERE (EXISTS (
            SELECT 1 AS C1
            FROM [SUPPLIER] AS T2_1
            WHERE T2_1.S_NATIONKEY >= T1_1.O_SHIPPRIORITY))
       OR (EXISTS (
            SELECT 1 AS C1
            FROM [SUPPLIER] AS T2_1
            WHERE T2_1.S_ADDRESS < T1_1.O_COMMENT))

Slide 26:
- More random exploration
  - Embed the basic tree pattern into completely random trees
- Systematic exploration of various dimensions
  - Number of subquery disjunctives
  - Add scalar disjunctives alongside the relational ones
  - Add more operators in the relational disjunctives
  - …
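Systematic exploration along one dimension, the number of subquery disjuncts, could be scripted roughly as below. This Python sketch builds SQL text directly instead of going through relational trees, and it simply cycles through the two correlation predicates shown in the case study; the function name and that cycling scheme are assumptions.

```python
# The two correlation predicates from the case study, reused in rotation.
CORRELATIONS = [
    "T2_1.S_NATIONKEY >= T1_1.O_SHIPPRIORITY",
    "T2_1.S_ADDRESS < T1_1.O_COMMENT",
]


def exists_disjunction_query(n):
    """Build the test query with n EXISTS subqueries combined by OR (n >= 1)."""
    if n < 1:
        raise ValueError("need at least one disjunct")
    disjuncts = [
        "(EXISTS (SELECT 1 AS C1 FROM [SUPPLIER] AS T2_1 "
        f"WHERE {CORRELATIONS[i % len(CORRELATIONS)]}))"
        for i in range(n)
    ]
    return "SELECT * FROM [ORDERS] AS T1_1 WHERE " + " OR ".join(disjuncts)


# One test query per point along the "number of disjuncts" dimension.
queries = [exists_disjunction_query(n) for n in range(1, 5)]
for query in queries:
    print(query)
```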

Slide 27:
How do we verify the correct behavior for semi-random queries? Be creative!

Slide 28:
- Rule modeling
  - Do the same transformation in the input query (usually very easy in RelOps)
  - Present the two queries to the optimizer
  - Should get the same plan
  - Only works when the rule output is expressible in SQL
- Rules on & off
  - Turn the rule under test off
  - Should get a different plan, but same results
  - Only works for non-essential exploration rules
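The "same results" half of these checks can be illustrated with a small Python sketch that compares two result sets as multisets. It runs on SQLite with made-up rows, so it cannot switch optimizer rules on or off or compare plans; the two queries stand in for a query and a hand-transformed equivalent (OR of two EXISTS versus a UNION of two semi-join style queries), which agree here because ORDERS rows are unique. The table definitions and `same_results` are assumptions for the example.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ORDERS (O_ORDERKEY INTEGER PRIMARY KEY,
                         O_SHIPPRIORITY INTEGER, O_COMMENT TEXT);
    CREATE TABLE SUPPLIER (S_SUPPKEY INTEGER PRIMARY KEY,
                           S_NATIONKEY INTEGER, S_ADDRESS TEXT);
    INSERT INTO ORDERS VALUES (1, 0, 'm'), (2, 50, 'a'), (3, 50, 'z');
    INSERT INTO SUPPLIER VALUES (1, 10, 'k'), (2, 20, 'p');
""")

EXISTS_1 = ("EXISTS (SELECT 1 FROM SUPPLIER AS T2_1 "
            "WHERE T2_1.S_NATIONKEY >= T1_1.O_SHIPPRIORITY)")
EXISTS_2 = ("EXISTS (SELECT 1 FROM SUPPLIER AS T2_1 "
            "WHERE T2_1.S_ADDRESS < T1_1.O_COMMENT)")

# The query under test and a manually transformed equivalent.
original = f"SELECT * FROM ORDERS AS T1_1 WHERE ({EXISTS_1}) OR ({EXISTS_2})"
rewritten = (f"SELECT * FROM ORDERS AS T1_1 WHERE {EXISTS_1} "
             f"UNION SELECT * FROM ORDERS AS T1_1 WHERE {EXISTS_2}")


def same_results(connection, query_a, query_b):
    """Compare two result sets as multisets, ignoring row order."""
    rows_a = Counter(connection.execute(query_a).fetchall())
    rows_b = Counter(connection.execute(query_b).fetchall())
    return rows_a == rows_b


matching_keys = sorted(row[0] for row in conn.execute(original))
print(matching_keys, same_results(conn, original, rewritten))
```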

Slide 29:
- Testing the Query Optimizer transformation rules using abstract relational trees
- Utilizing QRel to go from abstract relational trees to concrete SQL queries
- Future directions
  - Libraries of abstract relational trees
  - More advanced customizations of tree generation
  - Combinatorial techniques for systematic exploration of various trees