VLDB 2005: Revisiting Pipelined Parallelism in Multi-Join Query Processing. Bin Liu and Elke A. Rundensteiner, Worcester Polytechnic Institute

Presentation transcript:

1 Revisiting Pipelined Parallelism in Multi-Join Query Processing (VLDB 2005)
Bin Liu and Elke A. Rundensteiner, Worcester Polytechnic Institute
(binliu|rundenst)@cs.wpi.edu
http://www.davis.wpi.edu/dsrg

2 Multi-Join Queries
Data integration over distributed data sources
- i.e., Extract-Transform-Load (ETL) services
[Figure: data sources … feed the ETL services, which load data warehouses …; intermediate data may go to persistent storage]
Persistent storage:
(1) High I/O costs given large intermediate results
(2) Disk access is undesirable since this is a one-time process

3 Applying Parallelism
Process in the main memory of a PC cluster
- Make use of aggregated resources (main memory, CPU)
[Figure: clusters of machines connected by a network]

4 Three Types of Parallelism
Pipelined: operators are composed into a producer-consumer relationship
Independent: independent operators run simultaneously on distinct machines
Partitioned: a single operator is replicated and run on multiple machines

5 Basics of Hash Join
Two-phase hash join [SD89, LTS90]
- Demonstrated high performance
- Potential for a high degree of parallelism
(1) Build hash tables (key, value) of Orders based on ID
(2) Probe the hash tables with LineItems and output results
Orders:
    ID    Date  …
    0011  5/13  …
    0012  5/14  …
    …     …     …
LineItems:
    OID   Item  …
    0011  IPC   …
    0012  HPS   …
    …     …     …
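
The two phases on this slide can be sketched in a few lines of Python. This is only an illustrative single-machine version, not the authors' implementation; the function name `hash_join` and the dictionary-based row format are assumptions, and the sample rows are the ones shown in the slide's tables.

```python
from collections import defaultdict

def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Two-phase in-memory hash join (hypothetical helper, for illustration)."""
    # Phase 1 (build): hash every tuple of the building relation on its join key.
    table = defaultdict(list)
    for row in build_rows:
        table[row[build_key]].append(row)
    # Phase 2 (probe): look up each probing tuple and emit the matching pairs.
    results = []
    for row in probe_rows:
        for match in table.get(row[probe_key], []):
            results.append({**match, **row})
    return results

# Sample rows taken from the Orders and LineItems tables on the slide.
orders = [{"ID": "0011", "Date": "5/13"}, {"ID": "0012", "Date": "5/14"}]
line_items = [{"OID": "0011", "Item": "IPC"}, {"OID": "0012", "Item": "HPS"}]
joined = hash_join(orders, line_items, build_key="ID", probe_key="OID")
```

Here `joined` holds one combined row per matching (Orders, LineItems) pair.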

6 Partitioned Hash Join
- Partition the (input) hash tables across processors
- Have each processing node run in parallel
(1) Build hash tables of Orders based on ID; a Split operator distributes the tuples over one (key, value) hash table per processor
(2) Probe the hash tables with LineItems and output results
[Figure: the Orders and LineItems tables from the previous slide, with Orders split across three hash tables]
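
A minimal sketch of the partitioning idea, assuming hash partitioning on the join key and `k` processing nodes. The names `partition`, `local_join`, and `partitioned_hash_join` are hypothetical, and the sequential loop over partitions merely stands in for nodes that would run in parallel on a real cluster.

```python
from collections import defaultdict

def partition(rows, key, k):
    """Split rows into k partitions by hashing the join key."""
    parts = [[] for _ in range(k)]
    for row in rows:
        parts[hash(row[key]) % k].append(row)
    return parts

def local_join(build_part, probe_part, build_key, probe_key):
    """Ordinary build/probe hash join executed by one processing node."""
    table = defaultdict(list)
    for row in build_part:
        table[row[build_key]].append(row)
    return [{**match, **row}
            for row in probe_part
            for match in table.get(row[probe_key], [])]

def partitioned_hash_join(build_rows, probe_rows, build_key, probe_key, k):
    # Both inputs use the same hash function, so matching keys meet on one node.
    build_parts = partition(build_rows, build_key, k)
    probe_parts = partition(probe_rows, probe_key, k)
    results = []
    for build_part, probe_part in zip(build_parts, probe_parts):
        # Each iteration models the work of one node.
        results.extend(local_join(build_part, probe_part, build_key, probe_key))
    return results

orders = [{"ID": "0011", "Date": "5/13"}, {"ID": "0012", "Date": "5/14"}]
line_items = [{"OID": "0011", "Item": "IPC"}, {"OID": "0012", "Item": "HPS"}]
joined = partitioned_hash_join(orders, line_items, "ID", "OID", k=3)
```

The output order depends on which partition each key hashes to, so callers should not rely on it.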

7 Left-Deep Tree [SD90]
[Figure: example join graph over R1 … R9, and the left-deep query tree over R1, R2, R3, …, R8, R9 with build/probe pairs B1/P1, B2/P2, …, B7/P7, B8/P8]
Steps:
(1) Scan R1 – Build B1
(2) Scan R2 – Probe P1 – Build B2
(3) Scan R3 – Probe P2 – Build B3
…
(8) Scan R8 – Probe P7 – Build B8
(9) Scan R9 – Probe P8 – Output
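
The step sequence above can be sketched as a loop in which each join's result is built into the hash table used by the next join. This is an illustrative sketch under simplifying assumptions: rows are dictionaries, `keys[i]` names the attribute used by the i-th join, at least two relations are given, and the helper names are hypothetical.

```python
from collections import defaultdict

def build(rows, key):
    """Build a hash table mapping join-key values to the rows that carry them."""
    table = defaultdict(list)
    for row in rows:
        table[row[key]].append(row)
    return table

def left_deep_join(relations, keys):
    """Join two or more relations step by step; keys[i] is the attribute of join i."""
    table = build(relations[0], keys[0])  # step (1): scan R1, build B1
    for i in range(1, len(relations)):
        # Scan the next relation and probe the table built in the previous step.
        joined = [{**match, **row}
                  for row in relations[i]
                  for match in table.get(row[keys[i - 1]], [])]
        if i == len(relations) - 1:
            return joined  # last step: output instead of building
        # Build the intermediate result into the next hash table; the previous
        # table is no longer referenced after this assignment.
        table = build(joined, keys[i])

r1 = [{"a": 1}, {"a": 2}]
r2 = [{"a": 1, "b": 10}, {"a": 2, "b": 20}]
r3 = [{"b": 10, "c": "x"}]
result = left_deep_join([r1, r2, r3], keys=["a", "b"])
```

Because each step replaces the previous hash table, memory use stays low, but a join cannot start probing until the step before it has finished building.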

8 Right-Deep Tree [SD90]
[Figure: example join graph over R1 … R9, and the right-deep query tree over R1, R2, R3, …, R8, R9 with build/probe pairs B1/P1, B2/P2, …, B7/P7, B8/P8]
Steps:
(1) Scan R2 – Build B1, Scan R3 – Build B2, …, Scan R9 – Build B8
(2) Scan R1, Probe P1, Probe P2, …, Probe P8
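
A sketch of the two right-deep phases, using Python generators to model the probe pipeline. It assumes dictionary rows and that `keys[i]` names the attribute probed against the i-th building relation; the function names are hypothetical.

```python
from collections import defaultdict

def build(rows, key):
    """Build a hash table mapping join-key values to the rows that carry them."""
    table = defaultdict(list)
    for row in rows:
        table[row[key]].append(row)
    return table

def probe(stream, table, key):
    """Probe operator: consumes tuples from upstream and yields matches downstream."""
    for row in stream:
        for match in table.get(row[key], []):
            yield {**row, **match}

def right_deep_join(probing, building, keys):
    # Phase (1): build every hash table up front; all of them stay in memory.
    tables = [build(rows, key) for rows, key in zip(building, keys)]
    # Phase (2): chain the probe operators so each tuple of the probing
    # relation flows through P1, P2, ... without being materialized.
    stream = iter(probing)
    for table, key in zip(tables, keys):
        stream = probe(stream, table, key)
    return stream

r1 = [{"a": 1}, {"a": 2}]
r2 = [{"a": 1, "b": 10}, {"a": 2, "b": 20}]
r3 = [{"b": 10, "c": "x"}]
result = list(right_deep_join(r1, [r2, r3], keys=["a", "b"]))
```

The intermediate results exist only as a stream between generators, at the price of holding every hash table in memory at once.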

9 Tradeoffs Between Left-Deep and Right-Deep Trees
Right-Deep
- Good potential for pipelined parallelism
- Intermediate results exist only as a stream
- Size of building relations can be predicted accurately
- Large memory consumption
Left-Deep
- Less memory consumption
- Less pipelined parallelism

10 State-of-the-Art Solutions
Implicit assumption: prefer maximal pipelined parallelism!
[Figure: right-deep query tree over R1, R2, R3, R4, R5, …, R8, R9 with build/probe pairs B1/P1, B2/P2, B3/P3, B4/P4, …, B7/P7, B8/P8]

11 State-of-the-Art Solutions
What if the environment is memory constrained?
Strategy: break the tree into several pieces and process one piece at a time (as a pipeline)
Examples: Static Right-Deep [SD90], ZigZag [ZZBS93], Segmented Right-Deep [CLYY92]
[Figure: the right-deep query tree from the previous slide, shown whole and then cut into pieces that are each run as a pipeline]

12 Pipelined Execution
Optimal degree of parallelism?
- For example, it may not be necessary to partition R2 over a large number of machines if it has only 1000 tuples
Redirection cost: the intermediate results generated may need to be repartitioned to a different machine
[Figure: relations R1 … R4 partitioned across the computation machines (Partition, Building, Probing), with per-machine probe instances and a tuple t redirected from one machine to another]

13 Pipelined Cost Model
Compute an n-way join over k machines
Probing relation R0; building relations R1, R2, …, Rn
I_i represents the intermediate results after joining with R_i
Total work (W_b + W_p) and total processing time (T_b + T_p)

14 Break Pipelined Parallelism
To break a long pipeline and introduce independent parallelism
- Large number of small pipelines
- High interdependence between pipelined segments, i.e., P1 > P2, P3 > P4, P2 > P4
[Figure: a long pipeline over R0 … R9 broken into small pipelines P1, P2, P3, P4]

15 Segmented Bushy Tree
Basic idea: balance pipelined and independent parallelism
- Compose large pipelined segments
- Run pipelined segments independently
- Compose a bushy tree with minimal interdependency
[Figure: join graph over R0 … R9 and the corresponding segmented bushy tree with pipelined segments P1, P2, P3 and intermediate results I1, I2]

16 Cost-Based Heuristics: Composing the Segmented Tree
Input: a connected join graph G with n nodes; a number m that specifies the maximum number of nodes in each subgraph.
Output: a segmented bushy tree with at least n/m subtrees.
completed = false;
WHILE (!completed) {
    Choose, as the probing relation, the node V with the largest cardinality that has not yet been grouped;
    Enumerate all subgraphs starting from V with at most m nodes;
    Choose the best subgraph and mark the nodes in this group as selected in the original join graph;
    IF (no connected subgraph K of G exists that consists of unselected nodes and has K.size() >= 2) {
        completed = true;
    }
}
Compose the segmented bushy tree from all groups;
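
The grouping loop can be sketched in Python as follows. This is a rough illustration, not the authors' algorithm in full: the slide does not say how the "best" subgraph is scored, so the sketch substitutes a placeholder rule (prefer the largest group, then the smallest total building cardinality) where the cost model would go. The function names, the adjacency-dictionary graph format, and the sample cardinalities are all assumptions, and the final tree-composition step is left out.

```python
def connected_subgraphs(graph, start, m, allowed):
    """Enumerate connected node sets that contain `start`, have at most m nodes,
    and use only nodes from `allowed`."""
    seen, frontier = set(), [frozenset([start])]
    while frontier:
        group = frontier.pop()
        if group in seen:
            continue
        seen.add(group)
        if len(group) < m:
            for node in group:
                for nb in graph[node]:
                    if nb in allowed and nb not in group:
                        frontier.append(group | {nb})
    return seen

def segment(graph, card, m):
    """Group the nodes of a non-empty join graph into pipelined segments."""
    unselected, groups = set(graph), []
    completed = False
    while not completed:
        # Probing relation: the ungrouped node with the largest cardinality.
        v = max(unselected, key=lambda n: (card[n], n))
        options = connected_subgraphs(graph, v, m, unselected)
        # Placeholder scoring (an assumption, standing in for the cost model).
        best = min(options, key=lambda g: (-len(g),
                                           sum(card[n] for n in g if n != v),
                                           sorted(g)))
        groups.append((v, best))
        unselected -= best
        # Stop once no two unselected nodes are still connected to each other.
        completed = not any(nb in unselected
                            for n in unselected for nb in graph[n])
    return groups, unselected

# Hypothetical chain-shaped join graph R0 - R1 - ... - R5 with made-up sizes.
graph = {"R0": ["R1"], "R1": ["R0", "R2"], "R2": ["R1", "R3"],
         "R3": ["R2", "R4"], "R4": ["R3", "R5"], "R5": ["R4"]}
card = {"R0": 100, "R1": 10, "R2": 20, "R3": 30, "R4": 40, "R5": 90}
groups, leftover = segment(graph, card, m=3)
```

Enumerating subgraphs is exponential in general; bounding the group size by m is what keeps it tractable in this sketch.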

17 Example
[Figure: join graph over R0 … R9, shown three times as groups G1 and then G2 are selected]
Candidate subgraphs starting from R7: (1) R7, R8, R9, R6 (2) R7, R9, R6, R8 (3) R7, R4, R8, R5 ...
Candidate subgraphs starting from R1: (1) R1, R0, R2, R3 (2) R1, R2, R0, R3 (3) R1, R2, R3, R4 ...

18 Example: Segmented Bushy Tree
[Figure: join graph over R0 … R9 divided into groups G1, G2, G3, and the resulting segmented bushy tree with intermediate results I1 and I2]

19 Machine Allocation
Based on the building relation sizes of each segment
- N_b: total amount of building work
- k_i: number of machines allocated to pipeline i
N_b = [formula missing in transcript]
[Figure: segmented bushy tree over R0 … R9 with intermediate results I1, I2 and machine counts k1, k2, k3 assigned to its pipelines]
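
The formula for N_b is missing from the transcript, so the following is only one plausible reading: N_b is taken to be the sum of the building relation sizes over all segments, and each pipeline receives machines in proportion to its share of that work. The function name, the rule that every pipeline gets at least one machine, and the sample sizes are assumptions.

```python
def allocate_machines(build_sizes, k):
    """Split k machines across pipelines in proportion to their building work.

    build_sizes[i] is the total size of the building relations of pipeline i.
    Assumes k >= len(build_sizes) so every pipeline gets at least one machine.
    """
    p = len(build_sizes)
    if k < p:
        raise ValueError("need at least one machine per pipeline")
    n_b = sum(build_sizes)  # assumed definition of the total building work N_b
    spare = k - p           # machines left after giving each pipeline one
    alloc, remainders = [], []
    for size in build_sizes:
        whole, rest = divmod(spare * size, n_b)
        alloc.append(1 + whole)
        remainders.append(rest)
    # Hand out the machines lost to rounding, largest remainder first.
    left = k - sum(alloc)
    order = sorted(range(p), key=lambda i: remainders[i], reverse=True)
    for i in order[:left]:
        alloc[i] += 1
    return alloc

# Made-up building sizes for three pipelines on a 10-machine cluster.
allocation = allocate_machines([600, 300, 100], k=10)
```

With these sample numbers the three pipelines receive 5, 3, and 2 machines.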

20 Insufficient Main Memory
Break the query based on main memory availability
Compose a segmented bushy tree for each part
[Figure: join graph over R0 … R19 broken into two parts, R0 … R9 and R10 … R19]

21 Experimental Setup
10-machine cluster
- Each machine has two 2.4 GHz Xeon CPUs and 2 GB of memory
- Connected by a gigabit Ethernet switch
[Figure: Oracle 8i, Controller, and Application nodes attached to the 10-machine cluster; node labels: PIII 800 MHz PC with 256 MB memory (twice), and two PIII 1 GHz CPUs with 1 GB memory]

22 Experimental Setup (cont.)
Generated data set with integer join values
- Around 40 bytes per tuple
Randomly generated join queries
- Acyclic join graph with 8, 12, or 16 nodes
- Each node represents one join relation
- Each edge represents one join condition
- Average join ratio is 1
- Cardinality of each relation ranges from 1K to 100K
- Up to 600 MB per query

23 Pipelined vs. Segmented (I)

24 Pipelined vs. Segmented (II)

25 Insufficient Main Memory

26 Related Work
[SD90] Tradeoffs in processing complex join queries via hashing in multiprocessor database machines. VLDB 1990.
[CLYY92] Using segmented right deep trees for execution of pipelined hash joins. VLDB 1992.
[MLD94] Parallel hash based join algorithms for a shared everything environment. TKDE 1994.
[MD97] Data placement in shared nothing parallel database systems. VLDB 1997.
[WFA95] Parallel evaluation of multi-join queries. SIGMOD 1995.
[HCY94] On parallel execution of multiple pipelined hash joins. SIGMOD 1994.
[DNSS92] Practical skew handling in parallel joins. VLDB 1992.
[SHCF03] Flux: an adaptive partitioning operator for continuous query systems. ICDE 2003.

27 Conclusions
Observation: maximal pipelined hash join processing
- Redirection costs? Optimal degree of parallelism?
Hypothesis: it is worthwhile to incorporate independent parallelism into processing
- Use both, i.e., run several shorter pipelines in parallel
Solution: segmented bushy tree processing
- Heuristics and a cost-driven algorithm developed
Validation: extensive experimental studies
- Achieved around 50% improvement over pure pipelined processing

