Presented By HaeJoon Lee Yanyan Shen, Beng Chin Ooi, Bogdan Marius Tudor National University of Singapore Wei Lu Renmin University Cang Chen Zhejiang University.

Presented by HaeJoon Lee
Big Data Final Seminar
Authors: Yanyan Shen, Beng Chin Ooi, Bogdan Marius Tudor (National University of Singapore); Wei Lu (Renmin University); Cang Chen (Zhejiang University); H.V. Jagadish (University of Michigan)
VLDB 2014

Outline
- Background
- Motivation
- Partition Based Recovery
- Implementation
- Evaluation
- Conclusion

1 Background
Distributed Graph Processing System (DGPS)
- The set of vertices and edges is divided into partitions.
- The partitions are distributed among compute nodes.
Bulk Synchronous Parallel (BSP) computation model in DGPS
- Each worker first executes an input phase.
- The workers then process their partitions iteratively, in supersteps separated by a global barrier.
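The BSP loop described on this slide can be sketched roughly as follows. This is a toy, single-process illustration and not Giraph's API: the names run_bsp and max_compute, the dict-based partition layout, and the ring example are all assumptions made for this sketch.

```python
from collections import defaultdict


def run_bsp(partitions, compute, num_supersteps):
    """Run a toy BSP job (hypothetical helper, not a Giraph call).

    partitions: dict worker_id -> dict vertex_id -> value
    compute: function(vertex, value, incoming_msgs, superstep)
             returning (new_value, [(dst_vertex, msg), ...])
    """
    inbox = defaultdict(list)
    for superstep in range(num_supersteps):
        outbox = defaultdict(list)
        # Each worker processes only the vertices of its own partition.
        for vertices in partitions.values():
            for v in list(vertices):
                new_value, outgoing = compute(v, vertices[v], inbox[v], superstep)
                vertices[v] = new_value
                for dst, msg in outgoing:
                    outbox[dst].append(msg)
        # Global barrier: messages sent in this superstep become
        # visible to their destinations only in the next superstep.
        inbox = outbox
    return partitions


def max_compute(v, value, msgs, superstep):
    """Example vertex program: propagate the maximum value around a 4-vertex ring."""
    new_value = max([value] + msgs)
    return new_value, [((v + 1) % 4, new_value)]


result = run_bsp({"w0": {0: 5, 1: 1}, "w1": {2: 9, 3: 2}}, max_compute, 4)
print(result)
```

In a real system each worker would run on its own node and the barrier would be a distributed synchronization point; here the barrier is simply the moment the outbox replaces the inbox.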

1 Background
Scaling the number of nodes has two effects:
- It increases the number of failed nodes during job execution.
- System progress stops during recovery, so a number of nodes could become idle.
For these reasons, we need an efficient failure recovery system.
Why do you think so?

2 Motivation
Checkpoint Based Recovery (CBR) flow
- Requires nodes to write their status to storage as a checkpoint.
- Uses healthy nodes to load the status from the last checkpoint.
- Re-executes all the missing workloads.
However, CBR causes high recovery latency.
- It re-executes the missing workloads over the whole graph, on the failed nodes and even on the healthy nodes.
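To make the cost of CBR concrete, here is a minimal sketch of checkpoint-and-rollback. It is a simulation under assumptions, not the Giraph implementation: the class CheckpointedJob, the checkpoint interval, and the "add 1 per superstep" stand-in for vertex computation are all hypothetical.

```python
import copy


class CheckpointedJob:
    """Toy job that checkpoints every `interval` supersteps (hypothetical)."""

    def __init__(self, state, interval):
        self.state = state          # dict node -> dict partition -> value
        self.interval = interval
        self.superstep = 0
        self.checkpoint = (0, copy.deepcopy(state))

    def step(self):
        # Write a checkpoint at the beginning of every interval-th superstep.
        if self.superstep % self.interval == 0:
            self.checkpoint = (self.superstep, copy.deepcopy(self.state))
        for partitions in self.state.values():
            for p in partitions:
                partitions[p] += 1  # stand-in for real vertex computation
        self.superstep += 1

    def recover(self):
        """Roll back ALL nodes to the last checkpoint and redo the lost supersteps.

        Returns the number of (partition, superstep) units recomputed.
        """
        failed_at = self.superstep
        ckpt_step, snapshot = self.checkpoint
        self.state = copy.deepcopy(snapshot)
        self.superstep = ckpt_step
        redone = 0
        while self.superstep < failed_at:
            self.step()
            redone += sum(len(parts) for parts in self.state.values())
        return redone


job = CheckpointedJob({"N1": {"A": 0, "B": 0, "C": 0},
                       "N2": {"D": 0, "E": 0, "F": 0}}, interval=5)
for _ in range(8):
    job.step()
print(job.recover())  # units of work redone across both nodes
```

Even if only N1 had failed in this toy run, the rollback recomputes the partitions on N2 as well, which is the recovery latency problem the slide points to.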

2 Motivation
Problem with cascading failures
- Def.: a failure can occur at any time during normal execution.
- Frequent checkpointing incurs a long execution time.
The paper proposes fast failure recovery: Partition Based Recovery (PBR).


3 Partition Based Recovery
Execution flow
- Restricts recovery to the subgraphs on the failed nodes only, using logged messages.
- Divides the subgraphs on the failed nodes into partitions.
- Distributes these partitions among compute nodes.
- Reloads these partitions from the last checkpoint and rebalances them.
What is a locally logged message in PBR?
- PBR requires every node to log its outgoing messages locally at the end of each superstep.
- Every healthy node forwards the logged messages to the vertices in the failed partitions.
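The local message log can be pictured with a small sketch like the one below. The class NodeLog and its methods record and replay are hypothetical names for illustration; the slides do not describe the actual data structure or storage format.

```python
class NodeLog:
    """Per-node local log of outgoing messages, keyed by superstep (hypothetical)."""

    def __init__(self):
        self._log = {}

    def record(self, superstep, dst_vertex, msg):
        # Called for each outgoing message; kept on the node's local storage.
        self._log.setdefault(superstep, []).append((dst_vertex, msg))

    def replay(self, superstep, failed_vertices):
        # During recovery, a healthy node forwards only the logged messages
        # whose destination vertex lives in a failed partition.
        return [(dst, msg)
                for dst, msg in self._log.get(superstep, [])
                if dst in failed_vertices]


log = NodeLog()
log.record(11, "a1", 0.5)   # destination in a failed partition
log.record(11, "e1", 0.2)   # destination in a healthy partition
log.record(12, "b2", 0.7)
print(log.replay(11, {"a1", "b2"}))
```

Because healthy nodes only replay messages instead of recomputing their own vertices, the recomputation stays confined to the failed partitions.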

3 CBR vs PBR
[Figure: Checkpoint Based Recovery on nodes N1 and N2, with partitions A to F and their checkpointed copies A' to F'.]
- Each node's storage holds a checkpoint.
- CBR incurs high computation cost and communication cost.

3 CBR vs PBR
[Figure: Partition Based Recovery on nodes N1 and N2, with partitions A to F.]

3 Details of PBR
1. Partition reassignment
- Randomly assign the partitions to nodes.
- In each iteration, calculate the cost of the generated assignment.
- Check for the minimal cost.
- Choose the assignment with the minimal cost as the optimal one.
[Figure: partitions A to F on nodes N1 and N2; the optimal assignment is chosen after checking the generated assignments.]
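The random search over assignments can be sketched as below. The cost function here (load of the most heavily loaded node) is only a stand-in chosen for this example; the slides do not spell out the cost model, and the names assignment_cost, random_reassignment, and partition_sizes are hypothetical.

```python
import random


def assignment_cost(assignment, partition_sizes):
    """Stand-in cost (an assumption): load of the most heavily loaded node."""
    load = {}
    for partition, node in assignment.items():
        load[node] = load.get(node, 0) + partition_sizes[partition]
    return max(load.values())


def random_reassignment(failed_partitions, nodes, partition_sizes,
                        iterations=200, seed=0):
    """Generate random assignments and keep the one with the minimal cost."""
    rng = random.Random(seed)
    best, best_cost = None, float("inf")
    for _ in range(iterations):
        # Randomly assign every failed partition to some node.
        candidate = {p: rng.choice(nodes) for p in failed_partitions}
        cost = assignment_cost(candidate, partition_sizes)
        # Remember the cheapest assignment seen so far.
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost


sizes = {"A": 4, "B": 3, "C": 2, "D": 1}
plan, cost = random_reassignment(["A", "B", "C", "D"], ["N1", "N2"], sizes)
print(plan, cost)
```

A random search like this does not guarantee the true optimum; it returns the best assignment among those it happened to generate, so more iterations trade planning time for a better plan.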

3 Details of PBR
2. Recomputation of the missing workload
Assumption: the latest checkpoint was taken at superstep 11.
- Failed partitions (A,B), (C,D) load the checkpoint from superstep 11.
- Healthy partition D forwards its locally logged messages to the vertices in the failed partitions.
[Figure: partitions A to F at superstep 11 and superstep 12, marked as failed or healthy. Legend: locally logged message; compute vertices from checkpoint.]

3 Details of PBR
3. Rebalance the configuration if it differs across nodes.
How to handle cascading failures?
- Unlike CBR, PBR treats a cascading failure as a normal failure and handles it by executing the same 3 steps.
- In practice, failures do not occur very frequently.

4 PBR Architecture on Giraph
Master
- Produces the partition assignment ('Assign Partitions') as the recovery plan and saves it to ZooKeeper.
ZooKeeper
- A centralized service for maintaining configuration information, naming, and providing distributed synchronization.
Slaves
- Fetch the partitions from ZooKeeper.
- If a slave is at a checkpoint step, it writes a checkpoint and then performs computation.
- Else, if a slave has restarted after a failure, it loads its partitions and then performs computation.
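The master and slave roles on this slide can be sketched as follows. This is an illustration under assumptions: FakeCoordinator is an in-memory stand-in and not the ZooKeeper client API, and the function names, the "recovery_plan" key, and the action tuples are hypothetical.

```python
class FakeCoordinator:
    """In-memory stand-in for ZooKeeper (NOT the real client API)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]


def master_publish_plan(coordinator, plan):
    # Master: save the partition assignment as the recovery plan.
    coordinator.put("recovery_plan", plan)


def slave_superstep(coordinator, slave_id, superstep, checkpoint_steps, restarted):
    """Return the ordered actions one slave takes in one superstep."""
    plan = coordinator.get("recovery_plan")
    my_partitions = plan.get(slave_id, [])
    actions = []
    if superstep in checkpoint_steps:
        # Checkpoint step: write a checkpoint before computing.
        actions.append(("checkpoint", my_partitions))
    elif restarted:
        # Restarted after a failure: load the assigned partitions first.
        actions.append(("load", my_partitions))
    actions.append(("compute", my_partitions))
    return actions


zk = FakeCoordinator()
master_publish_plan(zk, {"s1": ["A", "B"], "s2": ["C"]})
print(slave_superstep(zk, "s1", 11, {11}, restarted=False))
print(slave_superstep(zk, "s2", 12, {11}, restarted=True))
```

Keeping the plan in a shared coordination service means every slave reads the same assignment, so no slave needs to talk to the master directly during recovery.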


5 Experimental Setup (CBR vs PBR)
Benchmark
- K-means, Semi-clustering, and PageRank.
- Runs all the tasks for 20 supersteps.
- Performs a checkpoint at the beginning of superstep 11.
Cluster
- 72 compute nodes: Intel X3430 2.4 GHz, 8 GB memory, 2 x 500 GB HDD.
- Giraph with PBR runs as a MapReduce job on Hadoop.
Dataset
[Table: datasets]

5 Evaluation: K-means (CBR vs PBR)
PBR outperforms CBR by 12.4 to 25.7.
The recovery time of both methods increases linearly.
PBR takes almost the same time as CBR.
- There are no outgoing messages among different vertices in K-means.
- The checkpointing time is negligible compared to computing the new cluster memberships.

5 Evaluation: K-means (CBR vs PBR)
These experiments verify the effectiveness of PBR, which parallelizes computation and eliminates unnecessary recovery cost.
PBR outperforms CBR by 6.8 to 23.9.
- In CBR, the recovery time is the same no matter how many nodes fail, because all nodes have to reload the checkpoint and redo all the computation.
PBR can reduce recovery time by 23.8 to 26.8 compared with CBR.

5 Evaluation: PageRank (CBR vs PBR)
PBR takes slightly more time than CBR.
- The links in the Friendster dataset follow a power law.
- Each superstep involves forwarding a number of logged messages via disk I/O.
[Figure: Check Pointing]

5 Evaluation: PageRank (CBR vs PBR)
These experiments verify the effectiveness of PBR, which parallelizes computation and eliminates unnecessary recovery cost.

6 Conclusion
- Partition Based Recovery is proposed as a novel recovery system that parallelizes failure recovery processing.
- The system distributes the recovery tasks to multiple compute nodes so that the recovery processing can be executed concurrently.
- It is implemented on the widely used Giraph system and is observed to outperform the existing checkpoint-based recovery system by up to 30 times.

Thanks

6 Backup: Semi-Clustering

6 PBR Architecture on Giraph
Master
- Produces the partition assignment ('Assign Partitions') as the recovery plan and saves it to ZooKeeper.
Slaves fetch the partitions from ZooKeeper.
- If they are at a checkpoint step, they write a checkpoint and perform computation.
- If they have restarted after a failure, they load their partitions and perform computation.

6 Backup: Communication Cost of PR
