Published by Marcus Thomas. Modified over 9 years ago.
1
Generalized Hash Teams for Join and Group-By
Alfons Kemper, Donald Kossmann, Christian Wiesner
Universität Passau, Germany
2
A. Kemper, D. Kossmann, C. Wiesner: Generalized Hash Teams, VLDB '99
Outline
o Motivating Example
o Standard Hash Teams
o Generalized Hash Teams for Joins
o Generalized Hash Teams for Joins/Grouping
o False Drops Analysis
o Application Examples (TPC-D)
o Performance Evaluation
3
Traditional Join Plan
[Diagram: joins of R, S, and T on attribute A, producing Result]
4
Traditional Hash Team Join Plan [Graefe, Bunker, Cooper: VLDB 98]
[Diagram: R, S, and T are each partitioned on A (R.A, S.A, T.A) and joined as a team on A, producing Result]
5
Generalized Hash Teams
[Diagram: R, S, and T with join attributes A and B]
6
Generalized Hash Teams
[Diagram: R, S, and T with join attributes A and B, shown before and after partitioning]
Partition on B: odd values yellow, even values green
6 mod 5 = 1
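The indirect partitioning idea on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes S has the schema (A, B) and R the schema (A, payload), two partitions on B (even and odd), and a 5-bit bitmap per partition indexed by A mod 5, which is one reading of the slide's "6 mod 5 = 1". All function names and sample tuples are hypothetical. T is omitted because it would simply be partitioned on B directly.

```python
from collections import defaultdict

N_PARTITIONS = 2  # partition on B: even values -> 0, odd values -> 1
BITMAP_BITS = 5   # assumed bitmap size, so that A = 6 sets bit 6 mod 5 = 1


def partition_s(s_tuples):
    """Partition S(A, B) on B; set bit (A mod BITMAP_BITS) in that partition's bitmap."""
    parts = defaultdict(list)
    bitmaps = [[False] * BITMAP_BITS for _ in range(N_PARTITIONS)]
    for a, b in s_tuples:
        p = b % N_PARTITIONS
        parts[p].append((a, b))
        bitmaps[p][a % BITMAP_BITS] = True
    return parts, bitmaps


def partition_r(r_tuples, bitmaps):
    """Send each R(A, payload) tuple to every partition whose bitmap has its A bit set."""
    parts = defaultdict(list)
    for a, payload in r_tuples:
        for p, bitmap in enumerate(bitmaps):
            if bitmap[a % BITMAP_BITS]:
                parts[p].append((a, payload))
    return parts


def join_partition(r_part, s_part):
    """Join one partition pair on A; misplaced R tuples simply find no partner."""
    return [(payload, a, b) for a, payload in r_part for a2, b in s_part if a == a2]


# Hypothetical sample data
S = [(6, 1), (3, 2), (7, 3), (1, 4)]
R = [(6, "r1"), (3, "r2"), (1, "r3"), (8, "r4")]

s_parts, bitmaps = partition_s(S)
r_parts = partition_r(R, bitmaps)
result = sorted(
    row for p in range(N_PARTITIONS) for row in join_partition(r_parts[p], s_parts[p])
)
```

In this sample, A = 6 and A = 1 map to the same bit, so r1 and r3 are copied into both partitions. Each copy joins only where its real partner lives, so the extra copies cost work but do not change the result.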
7
Generalized Hash Team for Grouping/Aggregation
o select c.City, sum(o.Value)
  from Customer c, Order o
  where c.C# = o.C#
  group by c.City
[Diagram: two plans. Traditional plan: Order and Customer partitioned on C# (Ptn on C#), then partitioned on City (Ptn on City) for Agg. Join and grouping team: Customer partitioned on City (Ptn on City) generating bitmaps (BM), Order partitioned on the bitmaps (Ptn on BM), then Agg.]
8
Group by City (Customer join Order on C#)
o Customer: partition on City and generate bitmaps for C#
o Order: partition with bitmaps for C#
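A small Python sketch of how the join and the grouping could share one partitioning step, following the two labels on this slide. It is an illustration under assumptions, not the paper's operator. It assumes Customer is (C#, City) with C# as key and Order is (C#, Value). Bitmaps are plain integers, the City hash is `zlib.crc32`, and the function name, partition count, and bitmap size are hypothetical.

```python
import zlib
from collections import defaultdict


def team_join_group(customers, orders, n_partitions=2, bitmap_bits=8):
    """Compute sum(Value) per City over Customer join Order with one partitioning pass."""
    # Customer: partition on City and generate bitmaps for C#.
    cust_parts = [[] for _ in range(n_partitions)]
    bitmaps = [0] * n_partitions
    for c_no, city in customers:
        p = zlib.crc32(city.encode("utf-8")) % n_partitions
        cust_parts[p].append((c_no, city))
        bitmaps[p] |= 1 << (c_no % bitmap_bits)

    # Order: partition with the bitmaps for C# (a tuple may land in several partitions).
    order_parts = [[] for _ in range(n_partitions)]
    for c_no, value in orders:
        for p in range(n_partitions):
            if (bitmaps[p] >> (c_no % bitmap_bits)) & 1:
                order_parts[p].append((c_no, value))

    # Per partition: join on C#, then aggregate by City.
    result = {}
    for p in range(n_partitions):
        city_of = dict(cust_parts[p])      # build side: C# -> City
        sums = defaultdict(int)
        for c_no, value in order_parts[p]:
            city = city_of.get(c_no)       # a false drop finds no partner here
            if city is not None:
                sums[city] += value
        result.update(sums)                # each City lives in exactly one partition
    return result


# Hypothetical sample data; C# 1 and C# 9 share a bitmap position (mod 8).
customers = [(1, "Passau"), (2, "Munich"), (3, "Passau"), (9, "Berlin")]
orders = [(1, 10), (2, 5), (3, 7), (1, 3), (9, 4)]
totals = team_join_group(customers, orders)
```

Because Customer is partitioned on City, every group is complete within one partition, so the per-partition aggregates can be emitted without a second partitioning step on City.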
9
Group by City (Customer join Order on C# join Lineitem on O#)
o Customer: partition on City and generate bitmaps for C#
o Order: partition with bitmaps for C# and generate bitmaps for O#
o Lineitem: partition with bitmaps for O#
10
False Drops
[Diagram: R, S, and T with join attributes A and B, shown before and after partitioning]
11
Overlapping Partitions
(Customer join Order on C# join Lineitem on O#)
o T / Customer
o S / Order: partition on B and generate bitmaps for A (partition on C# and generate bitmaps for O#)
o R / Lineitem: partition based on the bitmaps for A (partition with bitmaps)
12
Applicability of Generalized Hash Teams
o for partitioning hierarchical structures [Diagram: A, B]: partition on B, partition on bitmaps for A
o but it is also correct for non-strict hierarchies [Diagram: A, B] (but performance deteriorates)
13
Non-strict hierarchy
[Diagram: A and B in a non-strict hierarchy; R, S, and T with join attributes A and B, shown before and after partitioning]
14
False Drops Estimation
o b: cardinality of the bitmaps
o n: number of partitions
o probability that some s sets a bit leading to a false drop of an r into a particular partition: [formula not preserved]
o total number of false drops: [formula not preserved]
o conservative approximation: [formula not preserved]
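The slide's formulas did not survive in this transcript, so the following Python sketch does not reproduce them. It only illustrates how false drops depend on b and n by counting them in a small simulation. It also compares the count with a simple uniform-hashing estimate, which is a modelling assumption made here and not the authors' derivation. The sketch assumes every s tuple lands in a uniformly random partition and bitmap position, and that there is exactly one r per s. All names are hypothetical.

```python
import random


def simulate_false_drops(num_s, n, b, seed=0):
    """Count false drops when each s gets a random partition and a random bitmap position."""
    rng = random.Random(seed)
    s_tuples = [(rng.randrange(n), rng.randrange(b)) for _ in range(num_s)]
    bitmaps = [set() for _ in range(n)]
    for home, bit in s_tuples:
        bitmaps[home].add(bit)
    # One r per s: it belongs in `home`; every other partition with the bit set is a false drop.
    false_drops = 0
    for home, bit in s_tuples:
        false_drops += sum(1 for p in range(n) if p != home and bit in bitmaps[p])
    return false_drops


def model_false_drops(num_r, num_s, n, b):
    """Uniform-hashing estimate (an assumption of this sketch, not the slide's formula)."""
    # Chance that at least one of the other s tuples sets a given bit in a given partition.
    p_set = 1 - (1 - 1 / (n * b)) ** (num_s - 1)
    return num_r * (n - 1) * p_set


measured = simulate_false_drops(num_s=2000, n=8, b=512)
predicted = model_false_drops(num_r=2000, num_s=2000, n=8, b=512)
```

Larger bitmaps reduce collisions, so the count falls as b grows. With very small bitmaps nearly every r is copied into every partition.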
15
Implementation Details: Fine Tuning the Partitioning Bitmaps
Bloom-Filter [Bratbergsengen] [Valduriez]
16
Implementation Details: Teaming up Join and Grouping
Group by City (Customer join Order on C#)
o Customer: partition on City and generate bitmaps for C#
o Order: partition with bitmaps for C#
17
Teaming Up Join and Grouping: Build Phase
[Diagram labels: 5, PA, M, 13, 25, 23]
18
Teaming Up Join and Grouping: Probe Phase
[Diagram labels: 5, PA, M, 13, 25, 23, 10, 1]
19
Performance Comparison: Group by City (Customer join Order on C#)
[Plot; axis label: Memory [MB]]
20
False Drops Estimation and Measurement
21
Performance Comparison: Group by City (Customer join Order on C# join Lineitem on O#)
[Plot; axis label: Memory [MB]]
22
False Drops Estimation and Measurement
23
Conclusion and Future Work
o Look-Ahead Partitioning for Joins and Grouping
o Applicable for hierarchical data structures
  o correctness does not depend on strict hierarchies
o Applicable for several TPC-D (TPC-H and TPC-R) queries: e.g., Q5, Q10, Q18
o Combining Generalized Hash Teams and Order Preserving Hash Joins (OHJ)
24
TPC-D Q5
SELECT N_NAME, SUM(L_EXTENDEDPRICE * (1 - L_DISCOUNT)) AS REVENUE
FROM CUSTOMER, ORDER, LINEITEM, SUPPLIER, NATION, REGION
WHERE C_CUSTKEY = O_CUSTKEY
  AND O_ORDERKEY = L_ORDERKEY
  AND L_SUPPKEY = S_SUPPKEY
  AND C_NATIONKEY = S_NATIONKEY
  AND S_NATIONKEY = N_NATIONKEY
  AND N_REGIONKEY = R_REGIONKEY
  AND R_NAME = '[region]'
  AND O_ORDERDATE >= DATE '[date]'
  AND O_ORDERDATE < DATE '[date]' + INTERVAL 1 YEAR
GROUP BY N_NAME
ORDER BY REVENUE DESC;
25
TPC-D Q10
SELECT C_CUSTKEY, C_NAME, SUM(L_EXTENDEDPRICE * (1 - L_DISCOUNT)) AS REVENUE,
       C_ACCTBAL, N_NAME, C_ADDRESS, C_PHONE, C_COMMENT
FROM CUSTOMER, ORDER, LINEITEM, NATION
WHERE C_CUSTKEY = O_CUSTKEY
  AND L_ORDERKEY = O_ORDERKEY
  AND O_ORDERDATE >= DATE '[date]'
  AND O_ORDERDATE < DATE '[date]' + INTERVAL 3 MONTH
  AND L_RETURNFLAG = 'R'
  AND C_NATIONKEY = N_NATIONKEY
GROUP BY C_CUSTKEY, C_NAME, C_ACCTBAL, C_PHONE, N_NAME, C_ADDRESS, C_COMMENT
ORDER BY REVENUE DESC;
26
Indirectly Partitioning a Hierarchical Structure
[Diagram: Lineitem, Order, and Customer linked by O# and C#, with City on Customer; tuples are assigned to Partition 1, Partition 2, and Partition 3]