Is Your Graph Algorithm Eligible for Nondeterministic Execution? Zhiyuan Shao, Lin Hou, Yan Ai, Yu Zhang and Hai Jin Services Computing Technology and.


1 Is Your Graph Algorithm Eligible for Nondeterministic Execution? Zhiyuan Shao, Lin Hou, Yan Ai, Yu Zhang and Hai Jin. Services Computing Technology and System Lab, Cluster and Grid Computing Lab, Huazhong University of Science and Technology. ICPP’15

2 Outline: Motivation; System model; Algorithm Convergence; Evaluation; Conclusion

3 Motivation. The “big data” era. Loosely coupled data: key-value pairs; Hadoop, Spark, many others. Tightly coupled data: graph data; Pregel, GraphLab, GraphChi, X-Stream, many others. Graph computing. Execution model: synchronous model (BSP) or asynchronous model. Execution manner: deterministic executions or nondeterministic executions.

4 Motivation (Cont’d). Deterministic execution: widely and extensively studied (architecture, OS, scheduling); examples: set/chromatic scheduler (GraphLab), DIG (Galois), external determinism (GraphChi). Pros: a deterministic execution path (always) leads to deterministic results. Cons: high overhead to order the tasks (consider a billion-node graph!). Nondeterministic execution: poorly studied. Pros: high parallelism, high performance! Cons: need to prevent (at least) data races; undocumented.

5 Motivation (Cont’d). Example of two execution manners (figure taken from the GraphLab paper). Problem: high overhead for defining the execution sequence! Question: what if all these tasks are executed nondeterministically? A1: Obviously, the ordering overhead is avoided and parallelism improves! A2: Data races on edges! But what if we eliminate the data races?

6 Motivation (Cont’d). Objective of this research: study the nondeterministic execution of graph algorithms. Wait... why study that? Graph algorithms are special cases of parallel computing: iterative computing; associative law: a+(b+c) = (a+b)+c; idempotent law: f(f(x)) = f(x). Potential for higher performance! Questions: Will an algorithm converge under nondeterministic execution? Will the executions lead to deterministic results (i.e., be externally deterministic)?

7 Outline: Motivation; System model; Algorithm Convergence; Evaluation; Conclusion

8 System model. Shared-memory computer: number of processors >= 1; graphs loaded in memory; COTS components, nothing special for hardware and OS. Synchronous implementation of the asynchronous model: computing is organized into multiple iterations; barriers are enforced between two consecutive iterations; updates are applied “immediately”; examples: GraphChi, GRACE. Vertex-centric computing: “Think Like A Vertex”; data dependences happen on edges.
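To make the difference between the synchronous (BSP) model and the "updates applied immediately" model concrete, here is a minimal single-threaded sketch. It is not GraphChi or GRACE code; the function `propagate`, the `immediate` flag, and the path graph are all hypothetical names chosen for illustration. Each `while` pass plays the role of one iteration, with the loop boundary standing in for the barrier.

```python
def propagate(labels, neighbors, immediate):
    """Run min-label passes until nothing changes.

    Returns the final labels and the number of passes that changed a label.
    immediate=True  : a vertex sees updates made earlier in the same pass.
    immediate=False : a vertex only sees values from the previous pass (BSP).
    """
    labels = list(labels)
    productive_passes = 0
    while True:
        # BSP reads from a snapshot; the "immediate" variant reads live data.
        source = labels if immediate else list(labels)
        changed = False
        for v in range(len(labels)):
            best = min([source[v]] + [source[n] for n in neighbors[v]])
            if best < labels[v]:
                labels[v] = best
                changed = True
        if not changed:
            return labels, productive_passes
        productive_passes += 1  # end of pass = barrier between iterations


# Path graph 0 - 1 - 2 - 3 - 4 (illustrative input).
PATH = [[1], [0, 2], [1, 3], [2, 4], [3]]
sync_labels, sync_passes = propagate(range(5), PATH, immediate=False)
async_labels, async_passes = propagate(range(5), PATH, immediate=True)
```

Both variants reach the same labels, but with immediate updates the smallest label can travel several hops within a single pass, so fewer passes are needed on this input.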

9 System model (Cont’d). Race-free: Method 1: architecture support; Method 2: compiler support; Method 3: explicit lock/unlock. These convert data races into “conflicts”. Scheduling: general methods, for example static, dynamic, or other methods in OpenMP. Assumption on scheduling: [Figure: vertices v and u with data D_v and D_u, edge data D_e, and a call to add_schedule(u)]
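Method 3 (explicit lock/unlock) can be sketched as follows. This is an illustration only: Python's `threading.Lock` stands in for whatever locking primitive a real implementation would use, and the `Edge` class, `write_edge`, and `run_once` are hypothetical names. The point is that the lock does not fix the order of the two writers; it only guarantees that the edge ends up holding one complete write, which is what turning a data race into a "conflict" means here.

```python
import threading


class Edge:
    """Edge record with two fields that must stay consistent with each other."""

    def __init__(self):
        self.lock = threading.Lock()
        self.writer = None
        self.value = None


def write_edge(edge, writer, value):
    # The lock makes the two-field update atomic with respect to other writers.
    with edge.lock:
        edge.writer = writer
        edge.value = value


def run_once():
    edge = Edge()
    threads = [
        threading.Thread(target=write_edge, args=(edge, "u", 10)),
        threading.Thread(target=write_edge, args=(edge, "v", 20)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return (edge.writer, edge.value)


# Which writer wins is not determined, but the result is never a mix of both.
outcomes = {run_once() for _ in range(50)}
```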

10 Outline: Motivation; System model; Algorithm Convergence; Evaluation; Conclusion

11 Algorithm Convergence. Methodology: classify the “conflicts” on edges. Read-write conflicts: Case 1: read-after-write → read new value → converge; Case 2: write-after-read → read old value → converge? Write-write conflicts: Case 1: (correct) write after (wrong) write → correct edge values → converge; Case 2: (wrong) write after (correct) write → corrupted edge values → converge?

12 Algorithm Convergence (Cont’d). Read-write conflict. Case 1: read-after-write: converge. Case 2: write-after-read: read old value; next iteration: converge. [Figures: vertices v and u with data D_v and D_u and edge data D_e, one diagram per step]

13 Algorithm Convergence (Cont’d). Sufficient condition 1 for convergence: a chain-to-converge exists. Deduction 1: if algorithm A on graph G converges under synchronous-model execution, A will converge under nondeterministic execution. Deduction 2: if algorithm A on graph G converges under a deterministic scheduler of the asynchronous model, A will converge under nondeterministic execution. Example algorithms that converge: PageRank; many other fixed-point iterative algorithms.
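As a rough illustration of why a fixed-point algorithm such as PageRank tolerates an arbitrary update order, the sketch below applies in-place PageRank updates in a randomly shuffled vertex order. It is a single-threaded simulation, not the authors' GraphChi implementation: the random shuffle merely stands in for scheduler nondeterminism, and `async_pagerank`, `damping`, `eps`, `max_passes`, and the four-vertex graph are assumptions made for this example.

```python
import random


def async_pagerank(out_edges, seed, damping=0.85, eps=1e-10, max_passes=10_000):
    """In-place PageRank where each pass visits vertices in a random order."""
    rng = random.Random(seed)
    n = len(out_edges)
    in_edges = [[] for _ in range(n)]
    for u, targets in enumerate(out_edges):
        for v in targets:
            in_edges[v].append(u)
    rank = [1.0] * n
    for _ in range(max_passes):
        order = list(range(n))
        rng.shuffle(order)  # stand-in for a nondeterministic schedule
        largest_change = 0.0
        for v in order:
            incoming = sum(rank[u] / len(out_edges[u]) for u in in_edges[v])
            new_rank = (1.0 - damping) + damping * incoming
            largest_change = max(largest_change, abs(new_rank - rank[v]))
            rank[v] = new_rank  # applied immediately, visible to later vertices
        if largest_change < eps:  # converged to within the chosen precision
            break
    return rank


# out_edges[u] lists the targets of u; every vertex has at least one out-edge.
GRAPH = [[1, 2], [2], [0], [2]]
runs = [async_pagerank(GRAPH, seed) for seed in range(10)]
max_spread = max(abs(a - b) for run in runs for a, b in zip(run, runs[0]))
```

Every schedule ends near the same fixed point, but the ranks agree only up to a tolerance governed by `eps`, not bit for bit.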

14 Algorithm Convergence (Cont’d). Write-write conflicts. Case 1: (correct) write after (wrong) write: converge. Case 2: (wrong) write after (correct) write: corrupted edge value; next iteration: falsely converge; correcting edge value; next iteration: converge. [Figures: vertices v and u with data D_v and D_u and edge data D_e, one diagram per step]

15 Algorithm Convergence (Cont’d). Sufficient condition 2 for convergence, in order to correct the corrupted edge value: algorithm A on graph G converges under deterministic asynchronous-model execution; algorithm A satisfies the monotonicity property. (falsely converge) Algorithms that converge: WCC (Weakly Connected Components) by MLP (Minimal Label Propagation); BFS (Breadth First Search); many other graph traversal algorithms. Algorithms that do not converge: BP (Belief Propagation).
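The monotonicity property can be illustrated with WCC by minimal label propagation. The sketch below is a single-threaded simulation under stated assumptions: a random visiting order stands in for nondeterministic scheduling, labels live on vertices rather than on edges, and `wcc_min_label` and the sample graph are hypothetical. Because a label is only ever replaced by a smaller one, every visiting order ends at the same labeling.

```python
import random


def wcc_min_label(num_vertices, edges, seed):
    """Minimal label propagation with a randomly shuffled update order."""
    rng = random.Random(seed)
    neighbors = [[] for _ in range(num_vertices)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    label = list(range(num_vertices))  # each vertex starts with its own id
    changed = True
    while changed:  # one pass per iteration
        changed = False
        order = list(range(num_vertices))
        rng.shuffle(order)
        for v in order:
            best = min([label[v]] + [label[n] for n in neighbors[v]])
            if best < label[v]:
                label[v] = best  # labels only decrease: monotonic
                changed = True
    return label


# Components: {0, 1, 2}, {3, 4}, and the isolated vertex 5.
EDGES = [(0, 1), (1, 2), (3, 4)]
results = {tuple(wcc_min_label(6, EDGES, seed)) for seed in range(20)}
```

All 20 schedules produce the single labeling `(0, 0, 0, 3, 3, 5)`, in contrast to PageRank, where different schedules agree only up to a tolerance.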

16 Outline: Motivation; System model; Algorithm Convergence; Evaluation; Conclusion

17 Evaluation. Experiment setup: 2 × 2.6-GHz Intel Xeon E5-2670 processors (8 cores); 64 GB of RAM; GCC version 4.8.3. Real-world graph data sets: web-BerkStan, web-Google, soc-LiveJournal1, cage15. Platform: GraphChi (C++ version 0.2). Algorithms: PageRank, SSSP, WCC, BFS. Available at: https://github.com/mrshawcode/GraphChi_nondeter_algorithm

18 Evaluation (Cont’d). Using architecture support achieves the best performance (execution time reduction can be up to 70%). Using explicit locking/unlocking does not achieve the best performance, but it still scales well and sometimes even outperforms deterministic executions.

19 Evaluation (Cont’d). How about the results produced by PageRank? Measure the difference: Result1: {1, 2, 3, 5, 7}; Result2: {1, 2, 3, 7, 5}; suffix: 0, 1, 2, 3, 4; the difference degree is 3 (the two results first differ at suffix 3). Results are not deterministic (i.e., not externally deterministic). With increased precision (smaller ε), variations in results move to less important pages.
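The slide does not define the difference degree formally. Reading its example as "the first suffix at which two ranked results disagree" gives the sketch below; that reading is an inference from the example, and `difference_degree` is a hypothetical helper, not code from the authors' repository.

```python
def difference_degree(ranking_a, ranking_b):
    """Return the first position at which two ranked result lists differ.

    If one list is a prefix of the other (or they are identical), the
    length of the shorter list is returned.
    """
    for position, (a, b) in enumerate(zip(ranking_a, ranking_b)):
        if a != b:
            return position
    return min(len(ranking_a), len(ranking_b))


result1 = [1, 2, 3, 5, 7]
result2 = [1, 2, 3, 7, 5]
degree = difference_degree(result1, result2)  # 3, matching the slide's example
```

Under this reading, a larger difference degree means the two runs agree on more of the top-ranked pages.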

20 Outline: Motivation; System model; Algorithm Convergence; Evaluation; Conclusion

21 Conclusion. Graph algorithms are special cases of parallel computing: they do not necessarily need high-overhead deterministic executions! Most of the algorithms can be executed nondeterministically; examples include PageRank, WCC, BFS and many others. Not all nondeterministic executions produce deterministic results! Open problems: more discussion of sufficient conditions for algorithm convergence under nondeterministic execution; more discussion of the variations (nondeterminacy) in results produced by nondeterministic executions (e.g., PageRank); theoretical analysis of the speed of convergence; extending the system model to purely asynchronous computing.

22 Thank you! Q&A

