
Pregel: A System for Large-Scale Graph Processing (presented by Dylan Davis)


1 Pregel: A System for Large-Scale Graph Processing. Presented by Dylan Davis. Authors: Grzegorz Malewicz, Matthew H. Austern, Aart J.C. Bik, James C. Dehnert, Ilan Horn, Naty Leiser, Grzegorz Czajkowski (Google, Inc.)

2 Overview: What is a graph?; Graph Problems; The Purpose of Pregel; Model of Computation; C++ API; Implementation; Applications; Experiments

3 What is a graph? G = (V, E): a set of vertices V and a set of edges E connecting them. Example: a binary tree.

4 Graph Problems: network routing; social network connections.

5 The Purpose of Pregel Google wanted to run internet-related graph algorithms, such as PageRank, efficiently, and designed Pregel for that purpose. Pregel is a scalable, general-purpose system for implementing graph algorithms in a distributed environment. Its focus is on “thinking like a vertex” and on parallelism.

6 Model of Computation

7 Model of Computation (Vertex) [Diagram of a vertex and its edges; labels: Vertex ID, Vertex Value, Edge Value]

8 Model of Computation (Superstep) [Diagram: Compute() runs in Superstep 0, Superstep 1, and Superstep 2 along an execution-time axis]

9 Model of Computation (Vertex Actions) A vertex can: modify its values; receive messages from the previous superstep; send messages; request topology changes. [Diagram labels: Vertex ID, Vertex Value]
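The vertex actions above can be illustrated with a small single-process sketch. This is not the Pregel C++ API: the names `Vertex`, `compute`, `send`, and `run_supersteps` are hypothetical, and the example algorithm (spreading the largest value through the graph) is chosen only because it is short. Each vertex reads the messages sent to it in the previous superstep, may modify its value, may send messages, and votes to halt when it has nothing new to report; a halted vertex is reactivated when a message arrives.

```python
from collections import defaultdict

class Vertex:
    """Hypothetical vertex: an ID, a mutable value, and outgoing edge targets."""
    def __init__(self, vertex_id, value, out_edges):
        self.id = vertex_id
        self.value = value
        self.out_edges = out_edges  # list of target vertex IDs
        self.active = True

    def compute(self, messages, superstep, send):
        # Modify the value: adopt the largest value received so far.
        changed = False
        for m in messages:
            if m > self.value:
                self.value = m
                changed = True
        if superstep == 0 or changed:
            # Send messages: they are delivered in the next superstep.
            for target in self.out_edges:
                send(target, self.value)
        else:
            self.active = False  # vote to halt

def run_supersteps(vertices):
    """Run supersteps until every vertex has halted and no messages remain."""
    inbox = {}
    superstep = 0
    while True:
        outbox = defaultdict(list)

        def send(target, value):
            outbox[target].append(value)

        for v in vertices.values():
            messages = inbox.get(v.id, [])
            if messages:
                v.active = True  # an incoming message reactivates the vertex
            if v.active:
                v.compute(messages, superstep, send)
        inbox = outbox
        superstep += 1
        if not inbox and not any(v.active for v in vertices.values()):
            return superstep

# Hypothetical four-vertex directed graph with initial values 3, 6, 2, 1.
graph = {
    0: Vertex(0, 3, [1]),
    1: Vertex(1, 6, [0, 3]),
    2: Vertex(2, 2, [1, 3]),
    3: Vertex(3, 1, [2]),
}
run_supersteps(graph)
print({vid: v.value for vid, v in graph.items()})
```

Because every vertex in this example graph is reachable from the vertex holding the largest value, all vertices end with that value. Topology changes are left out of the sketch.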

10 Model of Computation (State Machine)

11

12 C++ API

13

14 C++ API (Message Passing) [Diagram: messages, each with a destination vertex ID and a message value, held in a message buffer]

15 C++ API (Combiners & Aggregators) [Diagram labels: Combiner, Aggregator]
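As a rough illustration of the two ideas named on this slide: a combiner merges several messages bound for the same vertex into one before delivery, and an aggregator reduces values contributed by all vertices in a superstep into a single global value. The sketch below is not the Pregel C++ API; the function names `combine_min` and `aggregate_sum`, and the choice of min and sum as the reduction operators, are assumptions made for the example.

```python
def combine_min(outgoing):
    """Merge (target_id, value) messages so each target receives only the minimum.

    Hypothetical helper: min is a suitable combiner when the receiving
    vertex only cares about the smallest incoming value.
    """
    combined = {}
    for target, value in outgoing:
        if target not in combined or value < combined[target]:
            combined[target] = value
    return combined

def aggregate_sum(contributions):
    """Reduce one contribution per vertex to a single global value (here, a sum)."""
    total = 0
    for value in contributions:
        total += value
    return total

# Three messages, two of them for vertex 1: only the smaller one survives.
messages = [(1, 5), (2, 7), (1, 3)]
print(combine_min(messages))

# Each vertex contributes its out-degree; the aggregate is the total edge count.
out_degrees = [2, 1, 3]
print(aggregate_sum(out_degrees))
```

A combiner like this only makes sense when the operation is commutative and associative, since messages may be merged in any order.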

16 C++ API (Topology Mutations) [Diagram labels: V, Superstep]

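17 C++ API (Input and Output) [Adjacency matrix shown on the slide; rows and columns are vertices 0 to 4]
    0 1 2 3 4
0   0 0 1 1 0
1   0 0 0 1 1
2   1 1 0 1 1
3   0 1 1 0 1
4   1 1 1 0 0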

18 Implementation

19 Implementation (Basic Architecture)

20 Implementation (Program Execution) Flow:
1. Copy the user program: one master copy and several worker copies.
2. The master assigns graph partitions to the workers.
3. The master takes the user input data and assigns it to the workers, which load the vertex data.
4. Run the supersteps (Compute() and send messages).
5. Save the output.
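Step 2 of the flow above can be sketched as follows. The Pregel paper describes the default partitioning as hash(ID) mod N, where N is the number of partitions; the function name `assign_partitions` and the use of small integer vertex IDs are assumptions made for this illustration, not part of the presented API.

```python
def assign_partitions(vertex_ids, num_partitions):
    """Group vertex IDs into partitions using hash(ID) mod N."""
    partitions = {p: [] for p in range(num_partitions)}
    for vid in vertex_ids:
        partitions[hash(vid) % num_partitions].append(vid)
    return partitions

# Ten hypothetical integer vertex IDs spread over three partitions.
parts = assign_partitions(range(10), 3)
for partition_id, members in parts.items():
    print(partition_id, members)
```

Since the mapping depends only on the vertex ID, any worker can work out which partition owns a vertex without asking the master.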

21 Implementation (Fault Tolerance) Checkpoint: each worker saves its state (Save()). Recover: when a worker fails (X), workers recompute from the checkpoint (Recompute()).
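A toy, single-process sketch of the checkpoint-and-recompute idea, under strong simplifying assumptions: the whole state is rolled back rather than a single worker's partitions, each superstep is deterministic, and one failure is injected at a chosen superstep. The names `step` and `run_with_checkpoints` are hypothetical.

```python
import copy

def step(state):
    """One deterministic toy superstep: every vertex value grows by 1."""
    return {vid: value + 1 for vid, value in state.items()}

def run_with_checkpoints(state, total_supersteps, checkpoint_every, fail_at=None):
    """Run supersteps, saving a checkpoint periodically and recovering once."""
    checkpoint = (0, copy.deepcopy(state))
    superstep = 0
    failed = False
    while superstep < total_supersteps:
        if superstep % checkpoint_every == 0:
            # Save(): persist the state at the start of this superstep.
            checkpoint = (superstep, copy.deepcopy(state))
        if superstep == fail_at and not failed:
            # Simulated failure: roll back to the last checkpoint and
            # recompute the supersteps that were lost.
            failed = True
            superstep, state = checkpoint[0], copy.deepcopy(checkpoint[1])
            continue
        state = step(state)
        superstep += 1
    return state

initial = {"a": 0, "b": 10}
clean = run_with_checkpoints(initial, total_supersteps=5, checkpoint_every=2)
recovered = run_with_checkpoints(initial, total_supersteps=5, checkpoint_every=2, fail_at=3)
print(clean == recovered)
```

The comparison at the end checks the property that matters: a run with a failure and recovery ends in the same state as a failure-free run, at the cost of repeating the supersteps since the last checkpoint.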

22 Implementation (Worker) [Diagram: Worker]

23 Implementation (Master) [Diagram labels: Master, List of Workers, Partitions]

24 Applications

25 Applications (Shortest Path) [Diagram: example graph labeled with the values 2, 1, 5, 3]
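A minimal vertex-centric sketch of single-source shortest paths, written in the superstep style rather than with the Pregel C++ API. The function name `shortest_paths`, the dictionary-based graph, and the example edge weights are assumptions for illustration and are not taken from the slide's diagram. In each superstep, a vertex that receives a shorter distance than the one it holds updates its value and sends candidate distances along its outgoing edges; a vertex with no incoming messages stays idle.

```python
import math
from collections import defaultdict

def shortest_paths(edges, source):
    """edges maps every vertex ID to a list of (target_id, weight) pairs.

    Assumes non-negative edge weights. Returns the shortest distance
    from source to each vertex (math.inf if unreachable).
    """
    dist = {v: math.inf for v in edges}
    inbox = {source: [0]}  # seed: the source is at distance 0 from itself
    while inbox:  # one loop iteration corresponds to one superstep
        outbox = defaultdict(list)
        for v, messages in inbox.items():
            candidate = min(messages)
            if candidate < dist[v]:
                dist[v] = candidate
                for target, weight in edges[v]:
                    outbox[target].append(candidate + weight)
            # Otherwise the vertex has nothing new to report and stays halted.
        inbox = outbox
    return dist

# Hypothetical weighted directed graph.
graph = {
    "a": [("b", 2), ("c", 5)],
    "b": [("c", 1), ("d", 5)],
    "c": [("d", 3)],
    "d": [],
}
print(shortest_paths(graph, "a"))
```

The computation ends when a superstep produces no messages, which matches the halting condition of the model: all vertices inactive and nothing in transit.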

26 Experiments

27 Experiments (Description) Test the execution times of Pregel running the Single-Source Shortest Path algorithm. Use a cluster of 300 multicore commodity PCs. Run Pregel on binary tree graphs and on a more realistic, randomly distributed graph. Results do not include initialization, graph generation, or result verification times. Failure recovery is not included, which reduces overhead.

28

29

30

31 Conclusion Pregel is a model suitable for large-scale graph computing, with a production-quality, scalable, and fault-tolerant implementation. Programs are expressed as a sequence of iterations; in each one, a vertex can receive messages sent in the previous iteration, send messages to other vertices, and modify its own state and that of its outgoing edges. This implementation is flexible enough to express a broad set of algorithms.

