Distributed Graph Processing
Abhishek Verma
CS425

Guess the Graph?

The Internet
The Human Brain
The LinkedIn Social Network

Graph Problems
  Finding shortest paths – routing Internet traffic and UPS trucks
  Finding minimum spanning trees – Google Fiber
  Finding max flow – scheduling
  Bipartite matching – dating websites
  Identifying special nodes and communities – spread of diseases, terrorists

Outline
  Single Source Shortest Path (SSSP)
  SSSP MapReduce Implementation
  Pregel
    – Computation Model
    – SSSP Example
    – Writing a Pregel program
    – System Design
  MapReduce vs Pregel for graph processing

Single Source Shortest Path (SSSP)
  Problem – find the shortest path from a source node to all target nodes
  Solution for a single-processor machine – Dijkstra’s algorithm
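
For reference before the distributed versions, a minimal single-machine Dijkstra sketch (illustrative code, not taken from the slides): it relaxes edges out of a min-priority queue of (distance, vertex) pairs.

```cpp
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// adj[u] holds (v, w) pairs for each edge u -> v of weight w.
// Returns the shortest distance from src to every vertex (INF if unreachable).
std::vector<long long> dijkstra(const std::vector<std::vector<std::pair<int, int>>>& adj, int src) {
    const long long INF = std::numeric_limits<long long>::max();
    std::vector<long long> dist(adj.size(), INF);
    // Min-heap ordered by tentative distance.
    std::priority_queue<std::pair<long long, int>,
                        std::vector<std::pair<long long, int>>,
                        std::greater<>> pq;
    dist[src] = 0;
    pq.push({0, src});
    while (!pq.empty()) {
        auto [d, u] = pq.top();
        pq.pop();
        if (d > dist[u]) continue;            // stale queue entry, already improved
        for (auto [v, w] : adj[u]) {          // relax every outgoing edge
            if (d + w < dist[v]) {
                dist[v] = d + w;
                pq.push({dist[v], v});
            }
        }
    }
    return dist;
}
```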

Example: SSSP – Dijkstra’s Algorithm
  [Figure: step-by-step run of Dijkstra’s algorithm on a small weighted graph with vertices A, B, C, D, E; the source starts at distance 0, every other vertex at ∞, and each step finalizes the closest unvisited vertex and relaxes its outgoing edges.]

Example 2

Single Source Shortest Path (SSSP)
  Problem – find the shortest path from a source node to all target nodes
  Solution on a single-processor machine – Dijkstra’s algorithm
  Distributed solution: parallel breadth-first search
    – MapReduce
    – Pregel

MapReduce Execution Overview
  [Figure: the MapReduce execution-overview diagram.]

Example: SSSP – Parallel BFS in MapReduce

Adjacency list:
  A: (B, 10), (D, 5)
  B: (C, 1), (D, 2)
  C: (E, 4)
  D: (B, 3), (C, 9), (E, 2)
  E: (A, 7), (C, 6)

Adjacency matrix (edge weights, rows are source vertices):

      A   B   C   D   E
  A   -  10   -   5   -
  B   -   -   1   2   -
  C   -   -   -   -   4
  D   -   3   9   -   2
  E   7   -   6   -   -

  [Figure: the example graph, with source A at distance 0 and every other vertex at ∞.]
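
The same graph as a small C++ adjacency-list value, which the MapReduce and Pregel walkthroughs below operate on conceptually (the type alias and function name are illustrative):

```cpp
#include <map>
#include <utility>
#include <vector>

// Outgoing edges per vertex: (destination, weight), matching the table above.
using AdjList = std::map<char, std::vector<std::pair<char, int>>>;

AdjList example_graph() {
    return {
        {'A', {{'B', 10}, {'D', 5}}},
        {'B', {{'C', 1},  {'D', 2}}},
        {'C', {{'E', 4}}},
        {'D', {{'B', 3},  {'C', 9}, {'E', 2}}},
        {'E', {{'A', 7},  {'C', 6}}},
    };
}
```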

Example: SSSP – Parallel BFS in MapReduce
  Each record is keyed by node ID, with value = (current distance from the source, adjacency list); initially <A, <0, ((B,10), (D,5))>> for the source and, e.g., <B, <∞, ((C,1), (D,2))>> for every other node.
  Iteration 1
    – Map input: one such record per node.
    – Map output: each mapper re-emits the node’s own record (so the adjacency lists survive the iteration) and, for every outgoing edge of a node with a known distance, a candidate distance for the neighbor; A (distance 0) emits <B, 10> and <D, 5>. Map output is flushed to local disk.
    – Reduce input: each node’s own record plus any candidate distances addressed to it.
    – Reduce output (= map input for the next iteration): each node keeps the minimum of its current distance and its candidates, so B becomes 10 and D becomes 5 while C and E stay at ∞. Reduce output is flushed to DFS.
  Iteration 2
    – The same map/reduce pair runs on the updated records: B (10) and D (5) now emit candidates for their neighbors, and after the reduce B improves to 8 (via D), C becomes 11, E becomes 7, and no vertex is left at ∞. … the rest omitted … As before, map output goes to local disk and reduce output to DFS.
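
A minimal sketch of the per-iteration map and reduce logic as plain C++ functions, not tied to any real MapReduce framework; the struct and function names are illustrative.

```cpp
#include <climits>
#include <map>
#include <utility>
#include <vector>

// Per-node record: current best distance plus the node's adjacency list.
struct NodeRecord {
    int dist = INT_MAX;                          // INT_MAX stands in for "infinity"
    std::vector<std::pair<char, int>> edges;     // (neighbor, edge weight)
};

// Map: re-emit the node itself (to preserve the graph structure) and emit a
// candidate distance for every neighbor reachable through this node.
void map_node(char id, const NodeRecord& rec, std::multimap<char, NodeRecord>& out) {
    out.insert({id, rec});
    if (rec.dist == INT_MAX) return;             // nothing useful to propagate yet
    for (auto [nbr, w] : rec.edges) {
        NodeRecord candidate;
        candidate.dist = rec.dist + w;           // distance to nbr via this node
        out.insert({nbr, candidate});
    }
}

// Reduce: keep the minimum candidate distance and recover the adjacency list
// from the structural record that map_node re-emitted.
NodeRecord reduce_node(const std::vector<NodeRecord>& values) {
    NodeRecord result;
    for (const auto& v : values) {
        if (!v.edges.empty()) result.edges = v.edges;
        if (v.dist < result.dist) result.dist = v.dist;
    }
    return result;
}
```

A driver groups the emitted pairs by key, feeds each group to reduce_node, and re-runs the whole job until no distance changes; this is exactly the chained-iteration pattern the next slides criticize.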

Problems with MapReduce
  MapReduce cannot express iterative algorithms efficiently.
  [Figure: each iteration passes the data through CPU 1, CPU 2, and CPU 3 with a barrier between iterations; a single slow processor holds up every barrier.]

MapAbuse: Iterative MapReduce
  Only a subset of the data needs computation in each iteration, yet every iteration re-processes all of it.
  [Figure: the same iteration/barrier diagram.]

MapAbuse: Iterative MapReduce
  The system is not optimized for iteration: every iteration pays a disk penalty (state is written out and re-read) and a startup penalty (jobs are re-launched).
  [Figure: the iteration diagram annotated with the disk and startup penalties.]

Pregel Model
  Bulk Synchronous Parallel model: compute, then communicate, with a barrier between supersteps.

Computation Model
  Input → supersteps (a sequence of iterations) → Output

Computation Model
  “Think like a vertex.”
  Inspired by Valiant’s Bulk Synchronous Parallel model (1990).

Computation Model
  Superstep: the vertices compute in parallel. Each vertex
    – Receives messages sent in the previous superstep
    – Executes the same user-defined function
    – Modifies its value or that of its outgoing edges
    – Sends messages to other vertices (to be received in the next superstep)
    – Mutates the topology of the graph
    – Votes to halt if it has no further work to do
  Termination condition
    – All vertices are simultaneously inactive
    – There are no messages in transit
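
A rough worker-side sketch of one superstep under this model; the types, fields, and the compute callback here are illustrative stand-ins, not Pregel's actual internals.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical per-vertex state held by a worker.
struct VertexState {
    double value = 0.0;
    std::vector<std::string> out_edges;
    std::vector<double> inbox;     // messages delivered in the previous superstep
    bool active = true;            // cleared when the vertex votes to halt
};

// One superstep over this worker's partition: run the user-defined compute
// function on every vertex that is active or has pending messages, and collect
// the messages it sends for delivery in the next superstep (after the barrier).
template <typename ComputeFn>
std::unordered_map<std::string, std::vector<double>>
run_superstep(std::unordered_map<std::string, VertexState>& partition, ComputeFn compute) {
    std::unordered_map<std::string, std::vector<double>> outbox;
    for (auto& [id, v] : partition) {
        if (!v.active && v.inbox.empty()) continue;  // halted and nothing arrived: skip
        v.active = true;                             // an incoming message reactivates it
        compute(id, v, outbox);                      // may update value, send messages, vote to halt
        v.inbox.clear();                             // messages are consumed by this superstep
    }
    return outbox;                                   // routed to owning workers before the next superstep
}
```

The returned outbox would be shipped to the workers that own the destination vertices and delivered as their inboxes once the global barrier completes.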

Intermission

Example: SSSP – Parallel BFS in Pregel
  [Figure: the same five-vertex graph processed superstep by superstep. In superstep 0 the source holds 0 and every other vertex ∞; in each superstep, vertices whose value improved send (value + edge weight) along their out-edges, receiving vertices keep the minimum, and the computation stops once no value changes and no messages are in transit.]

C++ API
  Writing a Pregel program – subclassing the predefined Vertex class
  [Figure: the Vertex class template, annotated with “Override this!” at the Compute() method, “in msgs” at its message-iterator argument, and “out msg” at the message-sending call.]

Example: Vertex Class for SSSP
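
The code shown on this slide is not in the transcript. Below is a self-contained sketch modeled on the ShortestPathVertex example in the Pregel paper; the simplified Vertex base class is a stand-in for Pregel's real C++ API (value, edge, and message types are fixed to int, and the source vertex "A" is hard-coded for this example).

```cpp
#include <algorithm>
#include <climits>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for Pregel's Vertex base class; vertex value, edge value,
// and message type are all int here to keep the sketch short.
class Vertex {
 public:
  Vertex(std::string id, std::vector<std::pair<std::string, int>> out_edges)
      : id_(std::move(id)), out_edges_(std::move(out_edges)) {}
  virtual ~Vertex() = default;

  // User code overrides this; msgs holds the messages from the previous superstep.
  virtual void Compute(const std::vector<int>& msgs) = 0;

  const std::string& vertex_id() const { return id_; }
  int GetValue() const { return value_; }
  int* MutableValue() { return &value_; }
  const std::vector<std::pair<std::string, int>>& GetOutEdges() const { return out_edges_; }

  void SendMessageTo(const std::string& dest, int msg) { outbox_.push_back({dest, msg}); }
  void VoteToHalt() { halted_ = true; }

  std::vector<std::pair<std::string, int>> outbox_;  // messages for the next superstep
  bool halted_ = false;                              // set by VoteToHalt()

 private:
  std::string id_;
  int value_ = INT_MAX;                              // INT_MAX stands in for "infinity"
  std::vector<std::pair<std::string, int>> out_edges_;
};

// SSSP vertex program, following the shape of the paper's example: take the
// minimum incoming candidate distance and, if it improves the current value,
// propagate (value + edge weight) along every out-edge.
class ShortestPathVertex : public Vertex {
 public:
  using Vertex::Vertex;
  void Compute(const std::vector<int>& msgs) override {
    int mindist = (vertex_id() == "A") ? 0 : INT_MAX;  // "A" is the source in this example
    for (int m : msgs) mindist = std::min(mindist, m);
    if (mindist < GetValue()) {
      *MutableValue() = mindist;
      for (const auto& [target, weight] : GetOutEdges())
        SendMessageTo(target, mindist + weight);
    }
    VoteToHalt();                                      // a later message will reactivate the vertex
  }
};
```

A toy runtime would repeatedly deliver each vertex's outbox, call Compute() on vertices that are active or received messages, and stop when every vertex has voted to halt with no messages in flight, matching the termination condition from the computation-model slide.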

System Design
  The Pregel system also uses the master/worker model.
  Master
    – Maintains the workers
    – Recovers from worker faults
    – Provides a Web-UI monitoring tool for job progress
  Worker
    – Processes its task
    – Communicates with the other workers
  Persistent data is stored as files on a distributed storage system (such as GFS or BigTable).
  Temporary data is stored on local disk.

Execution of a Pregel Program
  1. Many copies of the program begin executing on a cluster.
  2. The master assigns a partition of the input to each worker.
     – Each worker loads its vertices and marks them as active.
  3. The master instructs each worker to perform a superstep.
     – Each worker loops through its active vertices and computes for each vertex.
     – Messages are sent asynchronously, but are delivered before the end of the superstep.
     – This step is repeated as long as any vertices are active or any messages are in transit.
  4. After the computation halts, the master may instruct each worker to save its portion of the graph.
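
A minimal sketch of the master-side driver loop implied by steps 3 and 4; the WorkerHandle type and its methods are hypothetical stand-ins for the RPCs a real master would issue.

```cpp
#include <vector>

// Hypothetical handle to one worker. The stub bodies keep the sketch
// self-contained; in a real system each call would be an RPC.
struct WorkerHandle {
    void RunSuperstep() {}                     // run the compute function on active vertices, send messages
    long long ActiveVertices() { return 0; }   // vertices that have not voted to halt
    long long PendingMessages() { return 0; }  // messages queued for the next superstep
    void SaveOutput() {}                       // persist this worker's portion of the graph
};

// Drive supersteps until every vertex is inactive and no messages are in transit,
// then ask the workers to save their output.
void run_job(std::vector<WorkerHandle>& workers) {
    while (true) {
        for (auto& w : workers) w.RunSuperstep();   // the loop acts as the global barrier

        long long active = 0, pending = 0;
        for (auto& w : workers) {
            active += w.ActiveVertices();
            pending += w.PendingMessages();
        }
        if (active == 0 && pending == 0) break;     // global termination condition
    }
    for (auto& w : workers) w.SaveOutput();
}
```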

Fault Tolerance
  Checkpointing
    – The master periodically instructs the workers to save the state of their partitions to persistent storage (e.g., vertex values, edge values, incoming messages).
  Failure detection
    – Using regular “ping” messages.
  Recovery
    – The master reassigns graph partitions to the currently available workers.
    – All workers reload their partition state from the most recent available checkpoint.
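
To make the recovery path concrete, a sketch of how checkpointing could be folded into a driver loop like the one above; the checkpoint interval, the health check, and every name here are illustrative, not Pregel's actual implementation.

```cpp
#include <vector>

// Stub worker handle, as in the driver sketch above.
struct Worker {
    void RunSuperstep() {}
    void SaveCheckpoint(long long superstep) {}   // vertex values, edge values, incoming messages
    void LoadCheckpoint(long long superstep) {}
    bool Healthy() { return true; }               // in reality: has it answered recent "ping" messages?
    bool Done() { return true; }                  // no active vertices and no messages in transit
};

void run_with_checkpoints(std::vector<Worker>& workers, long long checkpoint_every) {
    long long superstep = 0;
    long long last_checkpoint = 0;
    while (true) {
        // Periodically persist partition state before running the superstep.
        if (superstep % checkpoint_every == 0) {
            for (auto& w : workers) w.SaveCheckpoint(superstep);
            last_checkpoint = superstep;
        }

        for (auto& w : workers) w.RunSuperstep();

        // On a detected failure, reassign the lost partitions (not shown) and roll
        // every worker back to the most recent checkpoint, repeating the lost supersteps.
        bool any_failed = false;
        for (auto& w : workers) any_failed = any_failed || !w.Healthy();
        if (any_failed) {
            for (auto& w : workers) w.LoadCheckpoint(last_checkpoint);
            superstep = last_checkpoint;
            continue;
        }

        bool all_done = true;
        for (auto& w : workers) all_done = all_done && w.Done();
        if (all_done) break;
        ++superstep;
    }
}
```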

Experiments
  Environment
    – H/W: a cluster of 300 multicore commodity PCs
    – Data: binary trees and log-normal random graphs (general graphs)
  Naïve SSSP implementation
    – The weight of all edges = 1
    – No checkpointing

Experiments
  SSSP – 1 billion vertex binary tree: varying the number of worker tasks

Experiments
  SSSP – binary trees: varying graph sizes on 800 worker tasks

Experiments
  SSSP – random graphs: varying graph sizes on 800 worker tasks

Differences from MapReduce
  Graph algorithms can be written as a series of chained MapReduce invocations, but:
  Pregel
    – Keeps vertices and edges on the machine that performs the computation
    – Uses network transfers only for messages
  MapReduce
    – Passes the entire state of the graph from one stage to the next
    – Needs to coordinate the steps of a chained MapReduce

MapReduce vs Graph Processing
  When to use MapReduce vs graph-processing frameworks?
    – The two models can be reduced to each other
    – MR on Pregel: every vertex sends to every other vertex
    – Pregel on MR: similar to PageRank on MR
  Use a graph framework for sparse graphs and matrices
    – Communication is limited to a small neighborhood (partitioning)
    – Faster than the shuffle/sort/reduce of MR
    – Local state is kept between iterations (side effects)

MapReduce vs Graph Processing (2)
  Use MR for all-to-all problems
    – Distributed joins
    – When there is little or no state between tasks (otherwise it has to be written to disk)
  Graph processing is a middle ground between message passing (MPI) and data-intensive computing.

Backup Slides

Pregel vs GraphLab
  Distributed system models
    – Asynchronous model and synchronous model
    – Well-known tradeoff between the two models:
        Synchronous: concurrency control and failure handling are easy; performance is poor
        Asynchronous: concurrency control and failure handling are hard; performance is good
  Pregel is a synchronous system
    – No concurrency control, no worries about consistency
    – Fault tolerance: checkpoint at each barrier
  GraphLab is an asynchronous system
    – Consistency of updates is harder (sequential, vertex consistency)
    – Fault tolerance is harder (needs a consistent snapshot)

Pregel vs GraphLab (2)
  The synchronous model requires waiting for every node
    – Good for problems that require it
    – Bad when waiting on stragglers or load imbalance
  The asynchronous model can make faster progress
    – Some problems work in the presence of drift
    – Fast, but harder to reason about the computation
    – Can load balance in scheduling to deal with load skew