Pregel: A System for Large-Scale Graph Processing

Pregel: A System for Large-Scale Graph Processing Grzegorz Malewicz, Matthew H. Austern, Aart J. C. Bik, James C. Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkowski Miao Xie

Outline Introduction Model of Computation Implementation Applications Experiments

Introduction Many practical computing problems today concern large graphs, and the scale of these graphs poses challenges to their efficient processing. To address distributed processing of large-scale graphs, the authors built a scalable and fault-tolerant platform with an API that is sufficiently flexible to express arbitrary graph algorithms. The resulting system is called Pregel.

Model of Computation Input: The input to a Pregel computation is a directed graph in which each vertex is uniquely identified by a string vertex identifier. Each vertex is associated with a modifiable, user-defined value. The directed edges are associated with their source vertices, and each edge consists of a modifiable, user-defined value and a target vertex identifier.

Model of Computation Superstep: Pregel computations consist of a sequence of iterations, called supersteps. During a superstep the framework invokes a user-defined function for each vertex, conceptually in parallel. The function specifies behavior at a single vertex V and a single superstep S. It can read messages sent to V in superstep S − 1, send messages to other vertices that will be received at superstep S + 1, and modify the state of V and its outgoing edges.
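To make the per-vertex programming model concrete, here is a minimal Python sketch of the vertex-centric API (the names compute, send_message_to, and vote_to_halt are illustrative; the paper's actual interface is a C++ Vertex template class):

```python
from abc import ABC, abstractmethod

class Vertex(ABC):
    """Sketch of the vertex-centric API; field and method names are illustrative."""

    def __init__(self, vertex_id, value, out_edges):
        self.id = vertex_id          # unique string vertex identifier
        self.value = value           # modifiable, user-defined value
        self.out_edges = out_edges   # {target_id: edge_value}
        self.active = True           # vertices start out active
        self.superstep = 0           # current superstep, set by the framework
        self.outbox = []             # messages produced in this superstep

    @abstractmethod
    def compute(self, messages):
        """Invoked once per superstep S with the messages sent to this vertex
        in superstep S - 1; may read and modify self.value and self.out_edges."""

    def send_message_to(self, target_id, message):
        # The framework delivers this message to target_id in superstep S + 1.
        self.outbox.append((target_id, message))

    def vote_to_halt(self):
        self.active = False
```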

Model of Computation In superstep 0, every vertex is in the active state; all active vertices participate in the computation of any given superstep. Algorithm termination is based on every vertex voting to halt: a vertex deactivates itself by voting to halt, meaning it has no further work to do unless triggered externally. If it is reactivated by an incoming message, a vertex must explicitly vote to halt again to become inactive.

Model of Computation Vertex state machine: a vertex moves from the active to the inactive state by voting to halt, and back to active when it receives a message. The algorithm as a whole terminates when all vertices are simultaneously inactive and there are no messages in transit.
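A short sketch of the two state-machine transitions and the global termination test, building on the Vertex sketch above (function names are hypothetical):

```python
def reactivate_on_message(vertex, incoming_messages):
    # Inactive -> active: receiving a message reactivates a halted vertex.
    if incoming_messages:
        vertex.active = True

def computation_finished(vertices, messages_in_transit):
    # Active -> inactive happens via vote_to_halt(); the whole computation
    # ends only when every vertex is inactive and no messages are in flight.
    return all(not v.active for v in vertices.values()) and not messages_in_transit
```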

Model of Computation Output: The output of a Pregel program is the set of values explicitly output by the vertices. Example: Given a strongly connected graph in which each vertex contains a value, the algorithm propagates the largest value to every vertex. In each superstep, any vertex that has learned a larger value from its messages sends it to all its neighbors. When no vertex changes in a superstep, the algorithm terminates.
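A sketch of this maximum-value example written against the Vertex sketch shown earlier (not the paper's C++ code, but the same logic):

```python
class MaxValueVertex(Vertex):
    """Propagates the largest value in the graph to every vertex."""

    def compute(self, messages):
        changed = False
        for m in messages:
            if m > self.value:
                self.value = m
                changed = True
        if changed or self.superstep == 0:
            # Superstep 0 has no incoming messages, so every vertex announces
            # its initial value; later, only vertices that learned a larger
            # value keep propagating it.
            for target in self.out_edges:
                self.send_message_to(target, self.value)
        else:
            self.vote_to_halt()
```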

Model of Computation (figure: the maximum-value example traced superstep by superstep)

Implementation Pregel was designed for the Google cluster architecture. Each cluster consists of thousands of commodity PCs organized into racks with high intra-rack bandwidth. Clusters are interconnected but distributed geographically. Pregel applications typically execute on a cluster management system that schedules jobs to optimize resource allocation, sometimes killing instances or moving them to different machines.

Implementation Basic architecture: The Pregel library divides a graph into partitions, each consisting of a set of vertices and all of those vertices’ outgoing edges. Assignment of a vertex to a partition depends solely on the vertex ID, which implies it is possible to know which partition a given vertex belongs to even if the vertex is owned by a different machine, or even if the vertex does not yet exist.
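The paper's default partitioning is simply hash(ID) mod N, where N is the number of partitions; a small sketch (using a deterministic hash so that every machine computes the same answer):

```python
import zlib

def partition_of(vertex_id: str, num_partitions: int) -> int:
    # hash(ID) mod N: depends only on the vertex ID, so any machine can
    # determine the owning partition without consulting anyone else,
    # even for vertices it does not hold (or that do not yet exist).
    return zlib.crc32(vertex_id.encode("utf-8")) % num_partitions
```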

Implementation Typically the execution of a Pregel program consists of several stages:
1. Many copies of the user program begin executing on a cluster of machines. One of these copies acts as the master.
2. The master determines how many partitions the graph will have and assigns one or more partitions to each worker machine.
3. The master assigns a portion of the user's input to each worker.
4. The master instructs each worker to perform a superstep. The worker loops through its active vertices, using one thread for each partition. This step is repeated as long as any vertices are active or any messages are in transit.
5. After the computation halts, the master may instruct each worker to save its portion of the graph.

Implementation Fault tolerance: Fault tolerance is achieved through checkpointing. At the beginning of a superstep, the master instructs the workers to save the state of their partitions to persistent storage. Worker failures are detected using regular "ping" messages that the master issues to workers: if a worker does not receive a ping within a specified interval, the worker process terminates, and if the master does not hear back from a worker, it marks that worker process as failed. When one or more workers fail, the current state of the partitions assigned to those workers is lost. The master reassigns the graph partitions to the currently available set of workers, and they all reload their partition state from the most recent available checkpoint, taken at the beginning of some superstep S.
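A rough sketch of checkpointing at a superstep boundary and of the basic recovery path (the file layout, pickling, and worker/partition objects are all assumptions, not the paper's implementation):

```python
import pickle

def checkpoint(partitions, superstep, path_prefix):
    # At the start of superstep S, persist each partition's vertex state
    # (and pending incoming messages) so the job can restart from S.
    for pid, partition in partitions.items():
        with open(f"{path_prefix}.s{superstep}.p{pid}", "wb") as f:
            pickle.dump(partition, f)

def recover(all_partition_ids, checkpoint_superstep, path_prefix, live_workers):
    # Basic recovery: spread every partition over the surviving workers and
    # reload its state from the latest checkpoint, then resume from there.
    for i, pid in enumerate(all_partition_ids):
        worker = live_workers[i % len(live_workers)]
        with open(f"{path_prefix}.s{checkpoint_superstep}.p{pid}", "rb") as f:
            worker.partitions[pid] = pickle.load(f)
```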

Implementation Worker implementation: A worker machine maintains the state of its portion of the graph in memory. This can be thought of as a map from vertex ID to vertex state, where the state of each vertex includes its current value. When the worker performs a superstep, it loops through all vertices and invokes the user-defined compute function for each one, passing it the vertex's current value, an iterator to the incoming messages, and an iterator to the outgoing edges.
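A sketch of that per-superstep loop for one in-memory partition, again using the Vertex sketch from earlier (a real worker would also batch and route the outgoing messages to their owning workers):

```python
def run_superstep(vertices, inbox, superstep):
    """vertices: {vertex_id: Vertex}; inbox: {vertex_id: [messages from S-1]}."""
    outgoing = []
    for vid, vertex in vertices.items():
        messages = inbox.get(vid, [])
        if messages:                    # an incoming message reactivates the vertex
            vertex.active = True
        if not vertex.active:
            continue
        vertex.superstep = superstep
        vertex.compute(iter(messages))  # current value + message iterator + edges
        outgoing.extend(vertex.outbox)  # to be routed to the owning workers
        vertex.outbox = []
    return outgoing
```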

Implementation Master Implementation: The master is primarily responsible for coordinating the activities of workers. Each worker is assigned a unique identifier at the time of its registration. The master maintains a list of all workers currently known to be alive, including the worker’s unique identifier, its addressing information, and which portion of the graph it has been assigned.

Applications PageRank Shortest Paths Bipartite Matching Semi-Clustering

Experiments The authors conducted various experiments with the single-source shortest paths implementation on a cluster of 300 multicore commodity PCs. The single-source shortest paths problem requires finding a shortest path between a single source vertex and every other vertex in the graph.
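For reference, a sketch of single-source shortest paths in this vertex-centric style (distances start at infinity, the source starts at 0, and improved distances are forwarded along weighted out-edges); this follows the algorithm described in the paper but reuses the illustrative Vertex sketch from earlier:

```python
import math

class ShortestPathVertex(Vertex):
    """value holds the best-known distance from the source (initially infinity)."""

    def __init__(self, vertex_id, out_edges, is_source=False):
        super().__init__(vertex_id, math.inf, out_edges)
        self.is_source = is_source

    def compute(self, messages):
        # The source starts the computation at distance 0; every other vertex
        # considers the best distance offered by its incoming messages.
        candidate = 0 if (self.superstep == 0 and self.is_source) else math.inf
        for dist in messages:
            candidate = min(candidate, dist)
        if candidate < self.value:
            self.value = candidate
            for target, weight in self.out_edges.items():
                self.send_message_to(target, candidate + weight)
        self.vote_to_halt()   # reactivated automatically if a better distance arrives
```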

Experiments As an indication of how Pregel scales with worker tasks, Figure 7 shows shortest paths runtimes for a binary tree with a billion vertices when the number of Pregel workers varies from 50 to 800. The drop from 174 to 17.3 seconds using 16 times as many workers represents a speedup of about 10.
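As a back-of-the-envelope check of the numbers quoted above:

\[ \text{speedup} = \frac{174\ \text{s}}{17.3\ \text{s}} \approx 10.1, \qquad \text{parallel efficiency} \approx \frac{10.1}{16} \approx 63\%. \]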

Experiments To show how Pregel scales with graph size, Figure 8 presents shortest paths runtimes for binary trees varying in size from a billion to 50 billion vertices, now using a fixed number of 800 worker tasks scheduled on 300 multicore machines. Here the increase from 17.3 to 702 seconds demonstrates that for graphs with a low average outdegree the runtime increases linearly in the graph size.
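Again checking the quoted numbers: a 50 times larger tree takes

\[ \frac{702\ \text{s}}{17.3\ \text{s}} \approx 40.6 \]

times as long, i.e. the runtime grows roughly linearly (here slightly sublinearly) with the number of vertices.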

Experiments The authors also conducted experiments with random graphs that use a log-normal distribution of outdegrees. Figure 9 shows shortest paths runtimes for such graphs varying in size from 10 million to a billion vertices, again with 800 worker tasks scheduled on 300 multicore machines.

Experiments (figures: shortest paths runtime plots for the binary-tree and log-normal random graph experiments)

Thank you