Handling Data Skew in Parallel Joins in Shared-Nothing Systems Yu Xu, Pekka Kostamaa, Xin Zhou (Teradata) Liang Chen (University of California) SIGMOD’08.

Presented by Kisung Kim

Introduction
• Parallel processing continues to be important in large data warehouses
• Shared-nothing architecture
  – Multiple nodes communicate via a high-speed interconnect network
  – Each node has its own private memory and disks
• Parallel Unit (PU)
  – Virtual processors that perform the scans, joins, locking, transaction management, …
• Relations are horizontally partitioned across all PUs
  – Hash partitioning is commonly used

Introduction
• Partitioning column
  – R: x
  – S: y
• Hash function
  – h(i) = i mod 3 + 1
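
The hash function on this slide can be tried out directly. The following is a minimal sketch, assuming three PUs numbered 1 to 3 and a relation R(x, a) partitioned on x; the sample rows and the helper name hash_partition are hypothetical and only serve as illustration.

```python
from collections import defaultdict

NUM_PUS = 3  # the slide's example hash function targets three PUs


def h(i: int) -> int:
    """Hash function from the slide: maps a column value to a PU in 1..3."""
    return i % NUM_PUS + 1


def hash_partition(rows, key_index):
    """Place each row on the PU selected by hashing its partitioning column."""
    pus = defaultdict(list)
    for row in rows:
        pus[h(row[key_index])].append(row)
    return dict(pus)


# Hypothetical rows of R as (x, a); R is partitioned on x (index 0).
R = [(1, 10), (2, 20), (3, 30), (4, 40)]
partitions = hash_partition(R, key_index=0)
```

Rows with x = 1 and x = 4 land on the same PU because both hash to 2.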

Two Join Geographies
• Redistribution plan
  – Redistribute the tables on the join attributes if they are not already partitioned by them
  – The join is then performed on each PU in parallel
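
A minimal sketch of a redistribution plan, assuming three PUs, the hash function h(i) = i mod n + 1 from the earlier slide, and relations R(x, a) and S(y, b) joined on R.a = S.b. The function names and sample rows are hypothetical; the loop over PUs stands in for work that a real system would run in parallel.

```python
from collections import defaultdict


def h(value, n):
    """Map a join value to a PU number in 1..n."""
    return value % n + 1


def redistribute(partitions, key_index, n):
    """Send every row to the PU selected by hashing its join attribute."""
    out = {pu: [] for pu in range(1, n + 1)}
    for rows in partitions.values():
        for row in rows:
            out[h(row[key_index], n)].append(row)
    return out


def local_join(r_rows, s_rows, r_key, s_key):
    """Hash join of the rows that sit on one PU."""
    index = defaultdict(list)
    for s in s_rows:
        index[s[s_key]].append(s)
    return [r + s for r in r_rows for s in index.get(r[r_key], [])]


def redistribution_join(r_parts, s_parts, r_key, s_key, n):
    r_new = redistribute(r_parts, r_key, n)
    s_new = redistribute(s_parts, s_key, n)
    result = []
    for pu in range(1, n + 1):  # sequential stand-in for parallel PUs
        result.extend(local_join(r_new[pu], s_new[pu], r_key, s_key))
    return result


# R(x, a) is partitioned on x and S(y, b) on y; the join is on R.a = S.b.
r_parts = {1: [(3, 1)], 2: [(1, 2), (4, 1)], 3: [(2, 3)]}
s_parts = {1: [(3, 2)], 2: [(1, 1)], 3: [(2, 3)]}
joined = redistribution_join(r_parts, s_parts, r_key=1, s_key=1, n=3)
```

After redistribution, matching rows of R and S are guaranteed to sit on the same PU, so each PU can join its share independently.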

Two Join Geographies
• Duplication plan
  – Duplicate the tuples of the smaller relation from each PU to all PUs
  – The join is then performed on each PU in parallel
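
A minimal sketch of a duplication plan, assuming S is the smaller relation and that R(x, a) and S(y, b) are joined on R.a = S.b. The function name and sample rows are hypothetical, and the nested-loop join is chosen only for brevity.

```python
def duplication_join(large_parts, small_parts, large_key, small_key):
    """Broadcast the smaller relation to every PU, then join locally."""
    # Every PU receives a complete copy of the smaller relation.
    full_copy = [row for rows in small_parts.values() for row in rows]
    result = []
    for pu, large_rows in large_parts.items():
        # The rows of the larger relation never leave their PU.
        for big in large_rows:
            for small in full_copy:
                if big[large_key] == small[small_key]:
                    result.append(big + small)
    return result


# R(x, a) is the larger relation, S(y, b) the smaller one.
r_parts = {1: [(3, 1)], 2: [(1, 2), (4, 1)], 3: [(2, 3)]}
s_parts = {1: [(3, 2)], 2: [(1, 1)], 3: [(2, 3)]}
joined = duplication_join(r_parts, s_parts, large_key=1, small_key=1)
```

The cost is that each PU stores and scans all of S, which is why this plan suits only a fairly small relation.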

Redistribution Skew
• Hot PU
  – After redistribution, some PUs hold many more tuples than others
  – A hot PU becomes the performance bottleneck of the whole system
  – Caused by relations with many rows that share the same value in the join attributes
• Adding more nodes does not solve the skew problem
• Examples
  – In the travel booking industry, a big customer often makes a large number of reservations on behalf of its end users
  – In online e-commerce, a few professionals make millions of transactions a year
  – …
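
The hot PU effect is easy to reproduce with a toy column. This is a minimal sketch, assuming the hash function h(v) = v mod n + 1 and a made-up column of 1000 rows in which half share the value 7; none of these numbers come from the paper.

```python
from collections import Counter


def rows_per_pu(join_values, n):
    """Count how many rows each PU receives when rows are hashed on the join value."""
    counts = Counter(v % n + 1 for v in join_values)
    return [counts.get(pu, 0) for pu in range(1, n + 1)]


# Hypothetical join column: 500 rows carry the value 7, the rest are distinct.
values = [7] * 500 + list(range(500))

load_4 = rows_per_pu(values, 4)  # [125, 125, 125, 625]
load_8 = rows_per_pu(values, 8)  # the last PU still holds 562 rows
```

Doubling the number of PUs shrinks the share of the non-skewed rows, but all 500 skewed rows still hash to one PU, so the hot PU barely gets lighter.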

Redistribution Skew
• Relations in these applications are almost evenly partitioned
• When the join attribute is not the partitioning column, severe redistribution skew occurs
• The duplication plan is a solution only when one join relation is fairly small
• Our solution
  – Partial Redistribution & Partial Duplication (PRPD) join

PRPD Join
• Assumptions
  – DBAs evenly partition their data for efficient parallel processing
  – Skewed rows tend to be evenly partitioned across the PUs
  – The system knows the set of skewed values
• Intuition
  – Deal with the skewed rows and non-skewed rows of R differently

PRPD
• L1: set of skewed values in R.a
• L2: set of skewed values in S.b
• Step 1
  – Scan R_i and split its rows into three sets
    • R_i^{2-loc}: all skewed rows of R_i (kept locally)
    • R_i^{2-dup}: every row of R_i whose R.a value matches any value in L2 (duplicated to all PUs)
    • R_i^{2-redis}: all other rows of R_i (hash redistributed on R.a)
  – Three spools on each PU i
    • R_i^{loc}: all rows from R_i^{2-loc}
    • R_i^{dup}: all rows of R duplicated to PU i
    • R_i^{redis}: all rows of R redistributed to PU i
  – S is handled similarly
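
The three-way split of Step 1 can be sketched as a single scan over one PU's rows. This is an illustration only, assuming L1 and L2 are disjoint sets and rows are tuples; the function name split_rows and the sample rows are hypothetical.

```python
def split_rows(rows, key_index, own_skewed, other_skewed):
    """Split one PU's rows into the three Step 1 sets.

    own_skewed:   skewed values of this relation (L1 when splitting R)
    other_skewed: skewed values of the other relation (L2 when splitting R)
    """
    keep_local, duplicate, redistribute = [], [], []
    for row in rows:
        value = row[key_index]
        if value in own_skewed:
            keep_local.append(row)    # skewed in this relation: stays on this PU
        elif value in other_skewed:
            duplicate.append(row)     # skewed in the other relation: sent to all PUs
        else:
            redistribute.append(row)  # everything else: hash redistributed
    return keep_local, duplicate, redistribute


L1, L2 = {1}, {2}
R_i = [(10, 1), (11, 1), (12, 2), (13, 3)]  # hypothetical rows (x, a) on PU i
r_2loc, r_2dup, r_2redis = split_rows(R_i, key_index=1, own_skewed=L1, other_skewed=L2)
```

Splitting S uses the same function with the roles of L1 and L2 swapped, that is, own_skewed=L2 and other_skewed=L1.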

PRPD: Example
• L1 = {1}, L2 = {2}

PRPD Step 1
[Figure: on each of PU 1, PU 2 and PU 3, R_i is split into R_i^{2-loc}, R_i^{2-dup} and R_i^{2-redis}, which feed the spools R_i^{loc}, R_i^{dup} and R_i^{redis}]
• R_i^{2-loc}: store locally
• R_i^{2-dup}: duplicate
• R_i^{2-redis}: redistribute

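PRPD Step 2
• On each PU i, join the spools built in Step 1 and union the results
  – R_i^{redis} with S_i^{redis}
  – R_i^{loc} with S_i^{dup}
  – R_i^{dup} with S_i^{loc}
[Figure: PU 1 holding the spools R_1^{redis}, R_1^{dup}, R_1^{loc} and S_1^{redis}, S_1^{dup}, S_1^{loc}]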
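
Both PRPD steps can be simulated end to end on a toy data set. The sketch below is not the authors' implementation: it assumes disjoint L1 and L2, the hash function h(v) = v mod n + 1, and that each PU pairs its spools as redis with redis, loc with dup, and dup with loc, which is what the spool definitions of Step 1 imply. All names and rows are hypothetical.

```python
from collections import defaultdict


def h(value, n):
    return value % n + 1


def build_spools(parts, key, own_skewed, other_skewed, n):
    """Step 1: fill the loc, dup and redis spools of every PU for one relation."""
    loc = {pu: [] for pu in range(1, n + 1)}
    dup = {pu: [] for pu in range(1, n + 1)}
    redis = {pu: [] for pu in range(1, n + 1)}
    for pu, rows in parts.items():
        for row in rows:
            value = row[key]
            if value in own_skewed:
                loc[pu].append(row)              # kept locally
            elif value in other_skewed:
                for target in dup:               # duplicated to all PUs
                    dup[target].append(row)
            else:
                redis[h(value, n)].append(row)   # hash redistributed
    return loc, dup, redis


def join(r_rows, s_rows, r_key, s_key):
    index = defaultdict(list)
    for s in s_rows:
        index[s[s_key]].append(s)
    return [r + s for r in r_rows for s in index.get(r[r_key], [])]


def prpd_join(r_parts, s_parts, r_key, s_key, l1, l2, n):
    r_loc, r_dup, r_redis = build_spools(r_parts, r_key, l1, l2, n)
    s_loc, s_dup, s_redis = build_spools(s_parts, s_key, l2, l1, n)
    result = []
    for pu in range(1, n + 1):  # Step 2, sequential stand-in for parallel PUs
        result += join(r_redis[pu], s_redis[pu], r_key, s_key)
        result += join(r_loc[pu], s_dup[pu], r_key, s_key)
        result += join(r_dup[pu], s_loc[pu], r_key, s_key)
    return result


# R(x, a) is skewed on a = 1 and S(y, b) on b = 2.
r_parts = {1: [(1, 1), (2, 1), (3, 3)], 2: [(4, 1), (5, 2)], 3: [(6, 1), (7, 4)]}
s_parts = {1: [(1, 2), (2, 1)], 2: [(3, 2), (4, 3)], 3: [(5, 2), (6, 5)]}
result = prpd_join(r_parts, s_parts, r_key=1, s_key=1, l1={1}, l2={2}, n=3)
```

On this data the three partial joins together yield the same rows as a plain join of R and S on R.a = S.b, and no PU ever receives all rows that carry a skewed value.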

PRPD
• All sub-steps in each step can run in parallel
• Overlapping skewed values
  – For a value that is skewed in both R.a and S.b: R_i^{2-loc} or R_i^{2-dup}?
  – The system includes an overlapping skewed value in only one of L1 and L2
  – It calculates the size of the matching rows in each relation and chooses the smaller one

Comparison with Redistribution Plan
• Uses more total spool space than the redistribution plan
  – PRPD duplicates some rows
• Lower network cost
  – Skewed rows are kept locally
• PRPD does not send all skewed rows to a single PU
• Per set
  – R_i^{2-loc}: kept locally, lower network cost
  – R_i^{2-dup}: duplicated, more spool space
  – R_i^{2-redis}: same as the redistribution plan

Comparison with Duplication Plan
• Less spool space than the duplication plan
  – Partial duplication
• More network cost
  – When data skew is not significant
  – The PRPD plan needs to redistribute a large relation
• Lower join cost
  – The duplication plan always joins a complete copy of the duplicated relation

PRPD: Hybrid of Two Plans
• L1 = Ø, L2 = Ø
  – Same as the redistribution plan
• L1 = Uniq(R.a) ⊃ Uniq(S.b)
  – Same as the duplication plan (duplicate S)

PRPD: Hybrid of Two Plans
• n: the number of PUs
• x: percentage of skewed rows in a relation R
• Number of rows of R on a PU after redistribution in the redistribution plan
  – Hot PU:
  – Non-hot PU:
• Number of rows of R on a PU after redistribution in PRPD
  – Hot PU:
• Ratio of the number of rows of R on the hot PU in the redistribution plan to the number of rows of R on a PU in PRPD
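
The formulas for this slide did not survive in the transcript. What follows is a reconstruction from the slide's own definitions, not a quotation, and it rests on three assumptions: there is a single skewed value (so exactly one hot PU), non-skewed rows spread evenly over the n PUs after hashing, and skewed rows are evenly partitioned before the join. Under these assumptions the hot PU in the redistribution plan holds x|R| + (1 - x)|R|/n rows, a non-hot PU holds (1 - x)|R|/n rows, every PU in PRPD holds x|R|/n + (1 - x)|R|/n = |R|/n rows, and the ratio is 1 + (n - 1)x. The function names below are hypothetical.

```python
def hot_pu_rows_redistribution(total_rows, n, x):
    """Hot PU: all skewed rows plus an even share of the non-skewed rows."""
    return x * total_rows + (1 - x) * total_rows / n


def non_hot_pu_rows_redistribution(total_rows, n, x):
    """Non-hot PU: only an even share of the non-skewed rows."""
    return (1 - x) * total_rows / n


def pu_rows_prpd(total_rows, n, x):
    """PRPD: skewed rows stay where they are, non-skewed rows are hashed."""
    return x * total_rows / n + (1 - x) * total_rows / n


def hot_pu_ratio(n, x):
    """Rows on the hot PU under redistribution divided by rows per PU under PRPD."""
    return 1 + (n - 1) * x


# Example with 80 PUs, 5% skewed rows and one million rows in R.
hot = hot_pu_rows_redistribution(1_000_000, 80, 0.05)
prpd = pu_rows_prpd(1_000_000, 80, 0.05)
ratio = hot / prpd
```

With these inputs the hot PU holds roughly five times as many rows under plain redistribution as any PU does under PRPD, and the ratio grows with both n and x.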

Experimental Evaluation
• Compare PRPD with the redistribution plan
  – The redistribution plan is more widely used than the duplication plan
• Schema & test query

Generating Skewed Data
• TPC-H originally has 25 unique nations
• We increased the number of unique nations to 1000
• 5% skewness
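
The slide does not show how the skewed column was produced. The sketch below is one possible way to build such a column, assuming that "5% skewness" means 5% of the rows share a single nation key and that the remaining rows are spread over the other keys. The function name, the choice of key 0 as the skewed value, and the deterministic round-robin fill are all assumptions for illustration, not the authors' generator.

```python
def nation_keys(num_rows, num_nations=1000, skew=0.05, skewed_key=0):
    """Return a nation-key column in which a `skew` fraction of rows share `skewed_key`."""
    num_skewed = int(round(num_rows * skew))
    others = [k for k in range(num_nations) if k != skewed_key]
    column = [skewed_key] * num_skewed
    # Spread the remaining rows round-robin over the other keys.
    column += [others[i % len(others)] for i in range(num_rows - num_skewed)]
    return column


keys = nation_keys(10_000)
skewed_share = keys.count(0) / len(keys)
```

A real data generator would more likely assign keys at random; the round-robin fill keeps this example reproducible.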

Query Execution Time
• 10 nodes, 80 PUs
• Node
  – Pentium IV 3.6 GHz CPUs, 4 GB memory, 8 PUs
• 1 million rows in the Supplier relation
• 1 million rows in the Customer relation
• The query result is around 1 billion rows

Query Execution Time
• 1 hot PU

Query Execution Time
• 2 hot PUs

Different Number of PUs
• Speedup ratio of PRPD over the redistribution plan
• As the skewness increases, the speedup ratio increases
• The larger the system, the larger the speedup

Conclusions
• Effectively handling data skew in joins
  – An important challenge in parallel DBMSs
• We propose the PRPD join
  – A hybrid of the redistribution and duplication plans
  – PRPD can also be used in multiple joins

Thank you
