Algebraic Transformations Page 1 © 2013 Hortonworks HIVE-784: Sub Query in Where or Having clause HIVE-5555: Alt. Join Syntax; Join conditions in the Where Clause

Presentation transcript:

Algebraic Transformations
–HIVE-784: Sub Query in Where or Having clause
–HIVE-5555: Alt. Join Syntax; Join conditions in the Where Clause

Sub Query transformation

Support for In, Not In, Exists, Not Exists in Where or Having clause
But lots of restrictions:
–Sub Query predicate must be a top level conjunct
–Only 1 Sub Query predicate
–No Sub Query nesting
–Correlation conditions must be valid join conditions
–And many more: see the spec on HIVE-784; 17 restrictions so far.
Transformations at a high level are:
–In/Exists => Left Outer Join
–Not In/Not Exists => Left Outer Join + null check + null count for Not In
–Correlation converted to Gby in Sub Query
In spite of the long list of restrictions, possibly useful:
–See HIVE-784 for TPCH Queries Q4, Q15, Q16, Q18 written with SQs
–TPCDS Query 45
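The high-level transformations listed above can be sketched as plain SQL rewrites and checked for equivalence. This is only an illustration under stated assumptions: SQLite (through Python's sqlite3 module) stands in for Hive, the tables `a` and `b` and their rows are hypothetical, and the rewritten queries show the idea of the transformation, not the plan Hive actually generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table b (key text, value text);  -- hypothetical outer table
    create table a (key text, value text);  -- hypothetical sub query table
    insert into b values ('91','x'), ('92','y'), ('10','x'), (null,'x');
    insert into a values ('91','x'), ('91','x'), ('92','z'), ('93','x');
""")

def rows(sql):
    # Sort by repr so that rows containing NULL (None) stay comparable.
    return sorted(conn.execute(sql).fetchall(), key=repr)

# In with a correlation condition (b.value = a.value).
IN_SQ = """
    select b.key, b.value from b
    where b.key in (select a.key from a
                    where b.value = a.value and a.key > '9')
"""
# Sketch of the rewrite: the correlation column moves into a Gby in the
# sub query (which also removes duplicates), followed by a Left Outer Join
# and a null check on the joined side.
IN_JOIN = """
    select b.key, b.value from b
    left outer join (select a.key as k, a.value as v from a
                     where a.key > '9'
                     group by a.key, a.value) sq
      on b.key = sq.k and b.value = sq.v
    where sq.k is not null
"""

NOT_IN_SQ = "select b.key, b.value from b where b.key not in (select a.key from a)"
# Sketch of the rewrite: Left Outer Join + null check + null count.
# Assumption: the sub query is non-empty; an empty sub query combined with
# a NULL outer key would need extra handling that is left out here.
NOT_IN_JOIN = """
    select b.key, b.value from b
    left outer join (select a.key as k from a group by a.key) sq
      on b.key = sq.k
    cross join (select count(*) as c from a where a.key is null) nc
    where sq.k is null and b.key is not null and nc.c = 0
"""

in_direct, in_joined = rows(IN_SQ), rows(IN_JOIN)
not_in_direct, not_in_joined = rows(NOT_IN_SQ), rows(NOT_IN_JOIN)

# A single NULL key in the sub query makes Not In reject every row,
# which is the case the null count guards against.
conn.execute("insert into a values (null, 'q')")
not_in_direct_null, not_in_joined_null = rows(NOT_IN_SQ), rows(NOT_IN_JOIN)

print(in_direct, in_joined)
print(not_in_direct, not_in_joined)
print(not_in_direct_null, not_in_joined_null)
```

Without the group by in the rewritten In query, the duplicate ('91','x') row in `a` would produce the outer row twice, which is one reason the sub query side is grouped before joining.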

Some Examples

-- non agg, corr
select *
from src b
where b.key in (select a.key
                from src a
                where b.value = a.value and a.key > '9');

-- non agg, non corr
select key, count(*)
from src
group by key
having count(*) in (select count(*)
                    from src s1
                    where s1.key > '9'
                    group by s1.key);

-- tpch Q4
select o_orderpriority, count(*) as order_count
from orders o
where unix_timestamp(o_orderdate, 'yyyy-MM-dd') >= unix_timestamp(' ', 'yyyy-MM-dd')
  and unix_timestamp(o_orderdate, 'yyyy-MM-dd') < unix_timestamp(' ', 'yyyy-MM-dd')
  and exists (select *
              from lineitem
              where l_orderkey = o.o_orderkey and l_commitdate < l_receiptdate)
group by o_orderpriority
order by o_orderpriority;

Join syntax

I want to use old-style Join syntax: join conditions in Where clause
Sub problems:
–HIVE-5556: Push Join conditions up Join tree.
  –So A join B join C on A.x = B.x and A.y = C.y should be handled as:
  –A join B on A.x = B.x join C on A.y = C.y
  –Fix holes in handling of Join Tree merging
–HIVE-5557: Push ‘qualifying’ predicates from Where Clause up Join Tree
–HIVE-5558: support alternate syntax for cross product (allow use of comma).
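The three spellings discussed on this slide should return the same rows once the join conditions are attached to the right joins. A small check of that equivalence follows, again as a sketch under assumptions: SQLite (through Python's sqlite3 module) stands in for Hive, and the tables A, B, C and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table A (x integer, y integer);  -- hypothetical tables
    create table B (x integer);
    create table C (y integer);
    insert into A values (1, 10), (2, 20), (3, 30);
    insert into B values (1), (2), (4);
    insert into C values (10), (30);
""")

def rows(sql):
    return sorted(conn.execute(sql).fetchall())

# Target form: each join carries its own condition.
explicit = rows("""
    select A.x, A.y
    from A join B on A.x = B.x
           join C on A.y = C.y
""")

# All conditions attached to the last join (the HIVE-5556 input shape).
merged = rows("""
    select A.x, A.y
    from A join B join C on A.x = B.x and A.y = C.y
""")

# Old-style syntax: comma for the cross product, join conditions in the
# Where clause (the HIVE-5557 and HIVE-5558 input shape).
comma = rows("""
    select A.x, A.y
    from A, B, C
    where A.x = B.x and A.y = C.y
""")

print(explicit, merged, comma)
```

Only the row (1, 10) has a match in both B and C, so all three queries return it. The point of pushing conditions up the join tree is that the second and third forms need not be evaluated as a full cross product first.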