Early Hash Join: A Configurable Algorithm for the Efficient and Early Production of Join Results Ramon Lawrence University of Iowa


Page 2 The University of Iowa. Copyright © 2005 Ramon Lawrence - Early Hash Join: A Configurable Algorithm for the Efficient and Early Production of Join Results

Introduction

Interactive user querying requires that the DBMS produce the first few query answers quickly as well as minimize the total query execution time.

Queries that produce many results using large hash joins have a slow response time because the smaller input must be completely partitioned before any output can be generated.

It is desirable to have a hash-based join algorithm for centralized databases that:
• Has a rapid response time to produce the first few results
• Has an overall execution time comparable to hybrid hash join
• Can be dynamically configured by the optimizer

Previous Work

Hash joins:
• hybrid hash join [DeWitt84] - standard join used in most DBMSs
• dynamic hash join [DeWitt95, Nakayama88] - dynamic partitioning
• symmetric hash join [Hong93, Wilschut91] - dual hash table
• ripple join [Haas99, Luo02] - online aggregation, reading policies
• MJoin [Ding03] - purges join state using stream punctuation

Mediator-based joins:
• Improve overall execution time by executing during delays instead of plan re-ordering/query scrambling [Raman99, Urhan98].
• double pipelined hash join [Ives99] - Tukwila system
• XJoin [Urhan00] - probe in-memory partitions when blocked
• hash-merge join [Mokbel04] - sort-merge partitions when blocked
• progressive merge join [Dittrich02] - dual sort-based join

Motivation

Interactive users of a centralized DBMS can benefit from the fast response time inherent in dual-hash table joins. The challenge is to ensure that overall performance is not significantly sacrificed for this fast response time.

A dual-hash table join has other benefits, as the operator is more easily pipelined (since it is symmetric). This is valuable for federated joins when one or more of the inputs may not be local to the database engine.

Reading Strategy

A reading strategy is the set of rules an algorithm uses to decide how to read from the two inputs when both inputs have tuples available.
• Reading strategies do NOT apply to streaming (push-based) inputs.
• They are useful when the inputs are on a local hard drive or a fast network source (pull-based).

Reading strategies have been used before for processing top-k queries and in ripple joins.

The reading strategy for hybrid hash join is to read the entire smaller input and then the larger input. Another strategy is to read alternately from the two inputs.
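The two strategies named on this slide can be sketched in Python as follows. This is an illustrative sketch only: the function names, the `("R", tuple)` tagging, and the reading of an `a:b` policy as "up to `a` tuples from R, then up to `b` tuples from S" are assumptions made for the example, not details taken from the slides.

```python
from itertools import islice

def read_smaller_first(r_input, s_input):
    """Hybrid-hash-join style: read all of R (smaller input), then all of S."""
    for tup in r_input:
        yield ("R", tup)
    for tup in s_input:
        yield ("S", tup)

def read_with_ratio(r_input, s_input, a=1, b=1):
    """Hypothetical a:b policy: read up to `a` tuples from R, then up to `b`
    tuples from S, repeating until both inputs are exhausted."""
    if a < 1 or b < 1:
        raise ValueError("a and b must be at least 1 in this sketch")
    r_iter, s_iter = iter(r_input), iter(s_input)
    r_done = s_done = False
    while not (r_done and s_done):
        r_batch = list(islice(r_iter, a))
        s_batch = list(islice(s_iter, b))
        # A short batch means the underlying input has run out.
        r_done = len(r_batch) < a
        s_done = len(s_batch) < b
        for tup in r_batch:
            yield ("R", tup)
        for tup in s_batch:
            yield ("S", tup)
```

With `a=1, b=1` this reduces to strict alternation; once one input runs out, the remaining tuples of the other input are read on their own.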

Flushing Policy

The flushing policy determines which tuples in memory are written to disk when memory must be released to accept new input.

Previous flushing policies:
• Flush the largest single partition (XJoin)
• Coordinated flushing of a partition pair (hash-merge join)

The flushing policy affects the duplicate detection strategy of the join algorithm. It also affects performance in two ways:
1) Join output rate - The number of results generated as input is being received. This depends on the tuples in memory.
2) Overall execution time - The total time may change depending on the cost of flushing and post-join cleanup.

Early Hash Join (EHJ) Algorithm

The Early Hash Join (EHJ) algorithm uses a dual hash table approach. It is specifically designed for a centralized DBMS where overall execution time is dictated by the flushing and partitioning speed and not by the input arrival rates.

EHJ uses:
• a variable reading strategy that changes when memory is full
• a biased flushing policy to favor the smaller input
• optimizations to flush join memory state for 1:* joins
• simplified duplicate detection that requires no timestamps for 1:* joins and only one timestamp for *:* joins
• a background process when used for mediator joins or with slow network-based inputs

Early Hash Join (EHJ) Algorithm

[Flowchart of the EHJ algorithm. Main loop: read a tuple from R or S according to the reading policy; a tuple of R is inserted into the R table and probes the S table, a tuple of S is inserted into the S table and probes the R table, and results are output; when memory is full, a biased flush is performed. When no input is left, the first cleanup phase and then the second cleanup phase process the on-disk partitions (load R to memory, read S tuples from the partition file, probe the R table with a timestamp check, output results) until the join is complete.]
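The in-memory part of this loop (insert into the tuple's own table, probe the other table, output results) can be sketched in Python as below. This is a minimal sketch that assumes everything fits in memory; flushing, freezing, and the cleanup phases are deliberately left out, and the function name, the `("R", tuple)` input tagging, and the key-extractor parameters are hypothetical.

```python
from collections import defaultdict

def symmetric_hash_join(tagged_tuples, key_r, key_s):
    """Dual hash table join, in-memory phase only.

    `tagged_tuples` yields ("R", tuple) or ("S", tuple) in arrival order.
    Each match is yielded as soon as its second tuple arrives.
    """
    r_table = defaultdict(list)  # join key -> R tuples seen so far
    s_table = defaultdict(list)  # join key -> S tuples seen so far
    for side, tup in tagged_tuples:
        if side == "R":
            key = key_r(tup)
            r_table[key].append(tup)             # insert in R table
            for s_tup in s_table.get(key, ()):   # probe S table
                yield (tup, s_tup)
        else:
            key = key_s(tup)
            s_table[key].append(tup)             # insert in S table
            for r_tup in r_table.get(key, ()):   # probe R table
                yield (r_tup, tup)
```

Because the function is a generator, a consumer can stop after the first few results without either input having been read completely, which is the early-result behavior the slides describe.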

Biased Flushing Policy

The biased flushing policy is designed to keep as much of the smaller input in memory as possible (similar to hybrid hash join).

Biased flushing policy:
• Flush the largest non-frozen partition of S (larger input).
• If no such partition of S exists, flush the smallest non-frozen partition of R (smaller input).

The idea of freezing a partition is from dynamic hash join [DeWitt95]. A frozen partition does not accept input once it has been flushed and is not probed.
• XJoin and HMJ do not freeze partitions.
• Freezing partitions and using biased flushing simplifies the duplicate detection strategy.
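The victim-selection rule above can be sketched in Python as follows. The `Partition` class and both function names are hypothetical, and skipping partitions that currently hold no tuples in memory is an added assumption (flushing them would free nothing); the slides do not specify that detail.

```python
from dataclasses import dataclass

@dataclass
class Partition:
    side: str                # "R" (smaller input) or "S" (larger input)
    index: int
    tuples_in_memory: int
    frozen: bool = False

def choose_flush_victim(partitions):
    """Return the partition to flush next, or None if nothing can be flushed."""
    def candidates(side):
        # Assumption: empty partitions are skipped because flushing them frees no memory.
        return [p for p in partitions
                if p.side == side and not p.frozen and p.tuples_in_memory > 0]

    s_candidates = candidates("S")
    if s_candidates:
        # Prefer the largest non-frozen partition of the larger input S.
        return max(s_candidates, key=lambda p: p.tuples_in_memory)
    r_candidates = candidates("R")
    if r_candidates:
        # Otherwise fall back to the smallest non-frozen partition of R.
        return min(r_candidates, key=lambda p: p.tuples_in_memory)
    return None

def flush(partition):
    """Mark the partition as flushed: it is frozen and holds nothing in memory."""
    partition.frozen = True
    partition.tuples_in_memory = 0
```

A caller would invoke `choose_flush_victim` whenever memory is full and then `flush` the result; writing the tuples to disk is omitted here.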

Duplicate Detection

Duplicate detection is required so that join results are not regenerated during the cleanup pass.

For common 1:* joins, no timestamps are needed:
• A *-side probe tuple is discarded if it finds a match.
• For a 1-side probe tuple, any matching tuples on the *-side are deleted from the hash table.

For *:* joins, a single timestamp representing the tuple's arrival order is kept. In the cleanup pass, a result tuple (T_R, T_S) passes the timestamp check (and is output) if one of these is true:
1) T_S arrived before its partition of S was flushed and T_R arrived after its corresponding partition of S was flushed.
2) T_S arrived after its partition of S was flushed but before the matching partition of R was flushed, and T_R arrived after T_S.
3) T_S arrived after the partition of R was flushed.
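The three timestamp conditions for *:* joins can be transcribed directly into a predicate. The sketch below is an assumption-laden illustration: the function and parameter names are hypothetical, timestamps and flush times are assumed to come from one logical clock with no ties, and an R partition that was never flushed is represented by an infinite flush time.

```python
import math

def passes_timestamp_check(ts_r, ts_s, s_flush_time, r_flush_time=math.inf):
    """Return True if the pair (T_R, T_S) should be output in the cleanup pass.

    ts_r, ts_s    -- arrival timestamps of T_R and T_S
    s_flush_time  -- time at which T_S's partition of S was flushed
    r_flush_time  -- time at which the matching partition of R was flushed
                     (math.inf if it was never flushed; an assumption)
    """
    # 1) T_S was in memory, but T_R arrived after the S partition was frozen.
    cond1 = ts_s < s_flush_time and ts_r > s_flush_time
    # 2) T_S arrived between the two flushes and T_R arrived after T_S.
    cond2 = s_flush_time < ts_s < r_flush_time and ts_r > ts_s
    # 3) T_S arrived after the R partition was flushed.
    cond3 = ts_s > r_flush_time
    return cond1 or cond2 or cond3
```

For example, with the S partition flushed at time 10 and the R partition at time 20, a pair whose tuples both arrived before time 10 fails the check, which matches the intent that results already produced in memory are not regenerated.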

Performance Analysis

Parameters:
• Two input relations R and S with |R| ≤ |S|.
• Join memory M where M ≤ |R|. Let f = M / |R|.
• Reading policy before memory is full is A_1:B_1. Let q_1 = A_1 / (A_1 + B_1).
• Reading policy after memory is full is A_2:B_2. Let q_2 = A_2 / (A_2 + B_2).

Number of I/O operations (not counting reading inputs):
[The I/O cost formula and the definition that follows "where" are not preserved in this transcript.]
• Note that for hybrid hash join, leftS = |S|.
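As a small worked example of the parameters f and q, the sketch below plugs in the memory sizes reported on the experiment slides (300,000 of 800,000 tuples, and 75,000 of 150,000 tuples). The helper names are hypothetical, and the slide does not state which input A and B refer to, so only the ratio itself is computed.

```python
from fractions import Fraction

def memory_fraction(m, r_size):
    """f = M / |R|: the fraction of the smaller input that fits in join memory."""
    return Fraction(m, r_size)

def read_fraction(a, b):
    """q = A / (A + B) for a reading policy written as A:B."""
    return Fraction(a, a + b)

f_many_to_many = memory_fraction(300_000, 800_000)  # 3/8, i.e. 37.5%
f_one_to_many = memory_fraction(75_000, 150_000)    # 1/2, i.e. 50%
q_alternating = read_fraction(1, 1)                 # 1/2 for a 1:1 policy
```

Exact fractions are used so that the values can be compared without floating-point rounding.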

Background Process

A background process can be used when the inputs are from sources other than the hard drive used for flushing.
• This includes mediator and federated joins.
• As shown in previous work, it is most valuable for slow or bursty networks and not as useful for high-speed networks.

Similar to XJoin, it uses an on-disk partition of S to probe the matching partition of R currently in memory.

It is designed as a background process that runs concurrently with the main join process. This can boost the join output rate, but care must be taken not to tie up the CPU needlessly when the background process may generate only a few results.
• Duplicate detection is slightly modified when using the background (BG) process.

Experimental Evaluation

The performance of early hash join was compared with dynamic hash join, XJoin, and hash-merge join.
• All algorithms were implemented in Java and tested on a TPC-H 1 GB data set (raw text files). All dual hash table algorithms used the same table structure.

Summary of results:
• EHJ is 10-35% faster than HMJ/XJoin for many-to-many joins and 25-75% faster for one-to-many joins.
• EHJ is faster over all memory sizes except for very small memory (less than 10% of the smaller relation's size).
• EHJ performs better when the difference in the relative sizes of the relations is large.
• EHJ is within 10% of the overall time of DHJ, but with a response time that is an order of magnitude faster. Intelligent buffering may be able to further reduce this difference.

Many-to-Many Join Experiment

Query:
SELECT * FROM PartSupp P1, PartSupp P2
WHERE P1.p_partkey = P2.p_partkey

• P1 and P2 were randomly permuted, as the source data is sorted on p_partkey.
• Memory size = 300,000 tuples (37.5% of 800,000 tuples)

One-to-Many Join Experiment

Query:
SELECT * FROM Customer C, Orders O
WHERE C.c_custkey = O.o_custkey

• Memory size = 75,000 tuples (50% of 150,000 tuples)

Multi-Join Experiment

Query:
SELECT c_custkey, c_name, c_address, o_orderkey, o_custkey, o_totalprice, o_orderdate, l_orderkey, l_partkey, l_suppkey, l_quantity, l_extendedprice
FROM Customer C, Orders O, LineItem LI
WHERE C.c_custkey = O.o_custkey AND O.o_orderkey = LI.l_orderkey

• Memory size = 90,000 tuples (60% of 150,000 tuples) (C+O)
• Memory size = 450,000 tuples (30% of 1,500,000 tuples) (C/O + LI)

Mediator Experimental Evaluation

The performance of early hash join was compared with dynamic hash join, XJoin, and hash-merge join for mediator joins.
• All algorithms were implemented in Java and tested on a TPC-H 100 MB data set with queries processed by SQL Server.
• DHJ downloaded from both inputs in parallel.

Summary of results:
• Both overall execution time and join output rate are dictated by the speed of the inputs. There is little variation in execution time among the algorithms.
• All early algorithms have a response time an order of magnitude faster than DHJ, especially when the left input is slow.
• For 1:* joins, EHJ is 5%-15% faster overall than HMJ/XJoin and equivalent to DHJ. It also has a slightly faster join output rate.
• For *:* joins, EHJ was only marginally faster in overall time, with a very similar join output rate.

Applications

The primary application is interactive querying on a centralized database. EHJ has a response time an order of magnitude faster than hybrid hash join with little execution overhead.

EHJ is also more suitable for pipelining within and outside the DBMS, as it is a symmetric operator that tolerates source delays. This may be especially valuable for federated queries.

EHJ can be used with LIMIT queries to produce the first few results without the overhead of partitioning the smaller input.

However, any query with blocking operators such as ordering and grouping cannot benefit from its fast response time. Further, it is not order preserving without additional modifications.

Future Work and Conclusions

EHJ is a useful algorithm for interactive querying and a good candidate for inclusion in the set of join algorithms for a centralized DBMS.

EHJ is dynamically configurable using a reading policy and can adapt to slow input arrival. In a centralized environment, it significantly outperforms previous early join algorithms.

Future work:
• Implement and test the performance of EHJ in PostgreSQL.
• Expand the algorithm to an N-way join.
• Investigate the possibility of making it order preserving and optimized for distributed/mediator joins.

Early Hash Join: A Configurable Algorithm for the Efficient and Early Production of Join Results Ramon Lawrence University of Iowa Thank You!