
1 Supporting On-Demand Elasticity in Distributed Graph Processing Mayank Pundir*, Manoj Kumar, Luke M. Leslie, Indranil Gupta, Roy H. Campbell University of Illinois at Urbana-Champaign *Facebook (work done at UIUC)

2 Synchronous Gather-Apply-Scatter. [Figure: the graph is partitioned, then the gather, apply, and scatter phases repeat over iterations.]

3 Background: Existing Systems. Systems are typically configured with a static set of servers: PowerGraph [Gonzalez et al., OSDI 2012], Giraph [Ching et al., VLDB 2015], Pregel [Malewicz et al., SIGMOD 2010], LFGraph [Hoque et al., TRIOS 2013]. Consequently, these systems lack the flexibility to scale out/in during computation.

4 Background: Graph Partitioning. Current mechanisms partition the entire graph across a fixed set of servers, and partitioning occurs once at the start of computation. Supporting elasticity requires an incremental approach to partitioning: we must repartition during computation as servers leave and join.

5 Background: Graph Partitioning. We assume hash-based vertex partitioning using consistent hashing: a vertex v is assigned to the server with ID hash(v) % N. Recent studies [Hoque et al., TRIOS 2013] have shown that hash-based vertex partitioning involves the least overhead and performs well.
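As a minimal sketch of this assignment (the helper names are ours, and crc32 stands in for whatever hash function the system actually uses):

```python
import zlib

def owner(vertex_id: str, num_servers: int) -> int:
    # Hash the vertex ID and map it onto one of N servers.
    return zlib.crc32(vertex_id.encode()) % num_servers

# Distribute 100 toy vertices across 4 servers.
buckets = {s: 0 for s in range(4)}
for i in range(100):
    buckets[owner(f"v{i}", 4)] += 1
print(buckets)  # bucket sizes come out roughly balanced
```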

6 Our Contribution. We present and analyze two techniques to achieve scale-out/in in distributed graph processing systems: 1. Contiguous Vertex Repartitioning (CVR). 2. Ring-based Vertex Repartitioning (RVR). We have implemented our techniques in LFGraph. Experiments show performance within 9% and 21% of optimum for scale-out and scale-in operations, respectively.

7 Key Questions. 1. How (and what) to migrate? How should vertices be migrated to minimize network overhead? What vertex data must be migrated? 2. When to migrate? At what point during computation should migration start/end?

8 How (and What) to Migrate? Assumption: the hashed vertex space is divided into equi-sized partitions. Key problem: upon scale-out/in, how do we assign the new equi-sized partitions to servers? Goal: minimize network overhead.

9 How (and What) to Migrate? Example: scale-out from 4 to 5 servers, with the new server S5 simply appended at the end. Before (4 servers): S1 [V1,V25], S2 [V26,V50], S3 [V51,V75], S4 [V76,V100]. After (5 servers): S1 [V1,V20], S2 [V21,V40], S3 [V41,V60], S4 [V61,V80], S5 [V81,V100]. Transfers: [V21,V25] (5), [V41,V50] (10), [V61,V75] (15), [V81,V100] (20). Total transfer: 5 + 10 + 15 + 20 = 50 vertices.

10 How (and What) to Migrate? The same scale-out, but with the new server S5 assigned the middle partition. Before (4 servers): S1 [V1,V25], S2 [V26,V50], S3 [V51,V75], S4 [V76,V100]. After (5 servers): S1 [V1,V20], S2 [V21,V40], S5 [V41,V60], S3 [V61,V80], S4 [V81,V100]. Transfers: [V21,V25] (5), [V41,V50] (10), [V51,V60] (10), [V76,V80] (5). Total transfer: 5 + 10 + 10 + 5 = 30 vertices.
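These transfer counts can be checked mechanically. The sketch below is our own illustration, not code from the system; it counts how many vertices each old server must ship out under each assignment, using inclusive (lo, hi) ranges:

```python
def transferred(old, new):
    # Sum, over the old servers, the vertices that fall outside their new range.
    total = 0
    for server, (olo, ohi) in old.items():
        nlo, nhi = new.get(server, (1, 0))  # (1, 0) encodes an empty range
        kept = max(0, min(ohi, nhi) - max(olo, nlo) + 1)
        total += (ohi - olo + 1) - kept
    return total

old    = {"S1": (1, 25), "S2": (26, 50), "S3": (51, 75), "S4": (76, 100)}
naive  = {"S1": (1, 20), "S2": (21, 40), "S3": (41, 60), "S4": (61, 80)}   # S5 appended at the end
better = {"S1": (1, 20), "S2": (21, 40), "S3": (61, 80), "S4": (81, 100)}  # S5 takes the middle
print(transferred(old, naive))   # 50
print(transferred(old, better))  # 30
```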

11 How (and What) to Migrate? CVR. CVR: Contiguous Vertex Repartitioning. Intuition: reduce the assignment to a minimum-cost bipartite matching problem and use an efficient heuristic. A greedy O(N) algorithm can be used due to the contiguity of partitions.

12 How (and What) to Migrate? CVR. Assume a scale-out from N to N + k servers (k new). 1. Repartition the vertex sequence into N + k equi-sized partitions. 2. Create a complete bipartite graph between the N + k servers and the N + k partitions, where the edge cost is the number of vertices that must be transferred if that partition is assigned to that server. [Figure: complete bipartite graph between servers (N + k) and partitions (N + k).]

13 How (and What) to Migrate? CVR. 3. For each old server, greedily iterate through the partitions with non-zero overlap and choose the one with the largest set of overlapping vertices. The number of partitions with non-zero overlap per server is O(1) due to contiguity. [Figure: the same bipartite graph with the greedy choices highlighted.]

14 How (and What) to Migrate? CVR. Example with old servers (4) S1 [V1,V25], S2 [V26,V50], S3 [V51,V75], S4 [V76,V100]; new server (1) S5 []; and new partitions (5) P1 [V1,V20], P2 [V21,V40], P3 [V41,V60], P4 [V61,V80], P5 [V81,V100]. Each old server considers only the partitions with which it has non-zero overlap.
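A short sketch of the greedy assignment as we read it from steps 1-3 (the helper names are ours, and for clarity it scans all partitions rather than exploiting the O(1) candidate set that contiguity allows):

```python
def overlap(a, b):
    # Number of vertices shared by two inclusive ranges.
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)

def greedy_cvr(old_ranges, partitions):
    taken, assignment = set(), {}
    for server, rng in old_ranges.items():
        # Pick the unclaimed partition with the largest overlap.
        best = max((p for p in range(len(partitions)) if p not in taken),
                   key=lambda p: overlap(rng, partitions[p]))
        taken.add(best)
        assignment[server] = partitions[best]
    return assignment  # unclaimed partitions go to the joining servers

old = {"S1": (1, 25), "S2": (26, 50), "S3": (51, 75), "S4": (76, 100)}
parts = [(1, 20), (21, 40), (41, 60), (61, 80), (81, 100)]
print(greedy_cvr(old, parts))
# S1->(1,20), S2->(21,40), S3->(61,80), S4->(81,100); (41,60) is left for S5
```

On the slide's example this reproduces the 30-vertex assignment: the middle partition is the one left over for the joining server.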

15 How (and What) to Migrate? CVR. Given a load-balanced system, we prove that to minimize network traffic and preserve load balance: Scale-out: the joining server is placed in the middle of the server list, i.e., inserted after S_{N/2}. Scale-in: the leaving server must be the middle server in the list, i.e., S_{(N+1)/2} is removed.

16 Can we do better? Problems: with CVR, we experience (1) imbalanced load across servers when transferring vertices, and (2) many servers involved in each operation. In the previous example, servers close to the middle transferred twice as many vertices. Question: can we find a way to provide better load balance and minimize the number of affected servers?

17 How (and What) to Migrate? RVR. RVR: Ring-based Vertex Repartitioning. Intuition: use Chord-style consistent hashing. To maintain load balance, assign each server an equi-sized ring segment; the server with ID n_i is responsible for the vertices hashed to (n_{i-1}, n_i]. [Figure: hash ring with servers s_1, s_5, s_9, s_13, s_17, s_21, s_25, s_29 and a vertex v_22.]

18 How (and What) to Migrate? RVR. General process: Scale-out: the joining server splits its successor's segment, i.e., n_i takes (n_{i-1}, n_i] from n_{i+1}. Scale-in: the leaving server gives its segment to its successor, i.e., n_i gives (n_{i-1}, n_i] to n_{i+1}. A scale-out/in operation with k servers affects at most k other servers.
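A toy ring makes the segment hand-off concrete. This is an illustration only (assuming MD5 hashes and no replication), not LFGraph's implementation:

```python
import bisect
import hashlib

def h(key, space=2**32):
    # Position of a server or vertex on the ring.
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % space

class Ring:
    def __init__(self, servers):
        self.ids = sorted(h(s) for s in servers)

    def owner(self, vertex):
        # The first server clockwise from the vertex's hash owns it.
        i = bisect.bisect_left(self.ids, h(vertex)) % len(self.ids)
        return self.ids[i]

    def join(self, server):
        # Joining n_i implicitly takes (n_{i-1}, n_i] from its successor.
        bisect.insort(self.ids, h(server))

    def leave(self, server):
        # The departing server's segment falls to its successor.
        self.ids.remove(h(server))
```

Only the successor's ownership changes on a join or leave, which is why k joining/leaving servers affect at most k others.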

19 How (and What) to Migrate? RVR. Given a load-balanced system... Scale-out: spread the affected portions over the ring; for ⌈k/N⌉ rounds, assign N servers each to a disjoint old partition. If ≤ N servers are being added, V/(2N) vertices are transferred per addition. Otherwise, the maximum transfer is minimized only if m − 1 < k/N ≤ m, where m is the maximum number of new servers added to any one old partition.

20 How (and What) to Migrate? RVR. Given a load-balanced system... Scale-in: remove alternating servers. If ≤ N/2 servers are being removed, V/N vertices are transferred per removal. Otherwise, the maximum transfer is minimized only if (m − 1)/m < k/N ≤ m/(m + 1), where m is the maximum number of servers removed from any one old partition.
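As a back-of-the-envelope check of the two bounds above (our own reading of the slides, with V vertices, N old servers, and k joining/leaving servers):

```python
def scale_out_transfer(V, N, k):
    # k <= N joiners spread over the ring each split one segment of size V/N
    # in half, so each addition moves about V/(2N) vertices.
    assert k <= N
    return V / (2 * N)

def scale_in_transfer(V, N, k):
    # Removing k <= N/2 alternating servers hands each departing segment
    # (size V/N) to a distinct surviving successor.
    assert 2 * k <= N
    return V / N

print(scale_out_transfer(100, 4, 1))  # 12.5 vertices per joining server
print(scale_in_transfer(100, 4, 2))   # 25.0 vertices per leaving server
```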

21 LFGraph: a brief overview. The graph is partitioned (equi-sized) among servers; partitions are further divided among worker threads into vertex groups (one per thread). A centralized barrier server handles synchronization. Communication occurs via a publish-subscribe mechanism: servers subscribe to the in-neighbors of their vertices.

22 When to Migrate? We must decide when to migrate vertices to minimize interference with the ongoing computation. Migration includes static data and dynamic data. Static data: vertex IDs, neighbor IDs, edge values. Dynamic data: vertex states.

23 When to Migrate? Possible solution: migrate everything during the synchronization interval between iterations. Problem: migrating both static and dynamic data during this interval is very wasteful. Migration might involve only a few servers, and static data doesn't change, so it can be migrated at any point.

24 When to Migrate? Solution: migrate static data in the background, and dynamic data during synchronization. Migration is merged with the scatter phase to further reduce overhead.
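A hedged sketch of this schedule (the function names and threading model are our assumptions, not LFGraph code): static data streams in a background thread while iterations run, and the dynamic vertex states move only at a barrier.

```python
import threading

def run_with_migration(run_iteration, send_static, send_states, num_iters):
    bg = threading.Thread(target=send_static)
    bg.start()                    # static data is immutable: ship it any time
    states_sent = False
    for i in range(num_iters):
        run_iteration(i)          # gather -> apply -> scatter
        # At the barrier: once the static data has landed, piggyback the
        # dynamic vertex states on the scatter/synchronization step.
        if not states_sent and not bg.is_alive():
            send_states(i)
            states_sent = True
    bg.join()
```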

25 LFGraph Optimizations. Parallel migration: if two servers are running the same number of threads, there is a one-to-one correspondence between ring segments (and thus vertex groups), so data can be transferred directly and in parallel between corresponding threads.

26 LFGraph Optimizations. RVR optimizations: 1. A modified scatter phase transfers vertex values from servers to their successors in parallel with reconfiguration. 2. During scale-in, servers quickly rebuild subscription lists by appending the leaving server's list to its successor's.

27 LFGraph Optimizations. Pre-building subscription lists: servers receive information in the background from the barrier server about joining/leaving servers, and hence can start building subscription lists before the cluster reconfiguration.

28 Experimental Setup. Up to 30 servers, each with 16 GB RAM and 8 CPUs. Twitter graph: 41.65 M vertices, 1.47 B edges. Algorithms: PageRank, SSSP, MSSP, Connected Components, K-means. Infinite Bandwidth (IB) baseline: repartitioning under infinite network bandwidth, assuming the cluster converges immediately to the new size. Overhead is measured as the increase in iteration time.

29 Evaluation. Scale-out/in starts at iteration 1 and ends at iteration 3. [Figure: per-iteration execution time, with the overhead region annotated.]

30 Evaluation. Operation overhead falls as cluster size increases. [Figure: overhead for cluster-size transitions 5 → 10, 10 → 20, 15 → 30 (scale-out) and 10 → 5, 20 → 10, 30 → 15 (scale-in).]

31 Evaluation. Operation overhead is insensitive to the starting iteration. [Figure: overhead when the operation starts at iteration 1 vs. iteration 6.]

32 Evaluation. For PageRank: RVR overhead vs. optimal is <6% for scale-out and <8% for scale-in; CVR overhead vs. optimal is <6% for scale-out and <11% for scale-in.

33 Evaluation. Similar results hold for the other applications. [Figure: overhead across applications; lower execution time implies higher relative overhead.]

34 Evaluation. For the other algorithms: RVR overhead vs. optimal is <8% for scale-out and <16% for scale-in; CVR overhead vs. optimal is <9% for scale-out and <21% for scale-in.

35 Conclusions. We have proposed two techniques to enable on-demand elasticity in distributed graph processing systems: CVR and RVR. We have integrated CVR and RVR into LFGraph. Experiments show that our approaches incur <9% overhead for scale-out and <21% for scale-in. http://dprg.cs.uiuc.edu http://srg.cs.illinois.edu

