# Apache Giraph on YARN

Chuan Lei and Mohammad Islam


## Fast Scalable Graph Processing

- What is Apache Giraph
- Why do I need it
- Giraph + MapReduce
- Giraph + YARN

## What is Apache Giraph

- Giraph is a framework for performing offline batch processing of semi-structured graph data at massive scale
- Giraph is loosely based on Google's Pregel graph processing framework
- Giraph performs iterative calculations on top of an existing Hadoop cluster
- Giraph uses Apache ZooKeeper to enforce atomic barrier waits and to perform leader election (a minimal barrier sketch follows this list)
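
The barrier-wait idea is easy to see in code. Below is a minimal sketch using the standard ZooKeeper Java client: every worker registers an ephemeral znode for the current superstep and then waits until all workers have checked in. The path layout, polling loop, and class name are illustrative assumptions, not Giraph's actual coordination code (which also uses watches, handles failures, and elects the master).

```java
import java.util.List;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

/**
 * Simplified superstep barrier in the spirit of what Giraph does with
 * ZooKeeper: each worker announces completion of a superstep and then
 * waits until every other worker has done the same.
 */
public class SuperstepBarrier {
  private final ZooKeeper zk;
  private final int numWorkers;

  public SuperstepBarrier(ZooKeeper zk, int numWorkers) {
    this.zk = zk;
    this.numWorkers = numWorkers;
  }

  public void await(long superstep, String workerId)
      throws KeeperException, InterruptedException {
    String barrierPath = "/barrier-" + superstep;  // hypothetical path layout
    // Create the (persistent) barrier node for this superstep if needed.
    try {
      zk.create(barrierPath, new byte[0],
          ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
    } catch (KeeperException.NodeExistsException alreadyThere) {
      // Another worker created it first; that is fine.
    }
    // Announce that this worker has finished the superstep.
    zk.create(barrierPath + "/" + workerId, new byte[0],
        ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
    // Wait until all workers have announced themselves.
    while (true) {
      List<String> done = zk.getChildren(barrierPath, false);
      if (done.size() >= numWorkers) {
        return;
      }
      Thread.sleep(100);
    }
  }
}
```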

## Why do I need it?

- Giraph makes graph algorithms easy to reason about and implement by following the Bulk Synchronous Parallel (BSP) programming model
- In BSP, an algorithm is written from the point of view of a single vertex in the input graph performing a single iteration (a superstep) of the computation (see the sketch after this list)
- Giraph makes iterative data processing more practical for Hadoop users
- Giraph can avoid the costly disk and network operations that are mandatory in MapReduce
- MapReduce has no concept of message passing
- Each cycle of an iterative calculation on Hadoop means running a full MapReduce job
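
As an illustration of the vertex-centric model, here is a minimal, hypothetical computation written against the Giraph 1.1-era BasicComputation API (earlier releases such as the 1.0 line used a slightly different Vertex-based API). Each vertex propagates the largest value it has seen to its neighbors and votes to halt when nothing changes.

```java
import java.io.IOException;
import org.apache.giraph.graph.BasicComputation;
import org.apache.giraph.graph.Vertex;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;

/**
 * "Think like a vertex": one call to compute() is one vertex in one superstep.
 * The framework delivers incoming messages, and the vertex updates its value
 * and sends messages that will arrive in the next superstep.
 */
public class MaxValueComputation
    extends BasicComputation<LongWritable, DoubleWritable, NullWritable, DoubleWritable> {

  @Override
  public void compute(Vertex<LongWritable, DoubleWritable, NullWritable> vertex,
                      Iterable<DoubleWritable> messages) throws IOException {
    double max = vertex.getValue().get();
    for (DoubleWritable msg : messages) {
      max = Math.max(max, msg.get());
    }
    // Only keep going (and wake the neighbors) if something changed,
    // or if this is the very first superstep.
    if (getSuperstep() == 0 || max > vertex.getValue().get()) {
      vertex.setValue(new DoubleWritable(max));
      sendMessageToAllEdges(vertex, vertex.getValue());
    }
    vertex.voteToHalt();
  }
}
```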

## PageRank example

PageRank measures the relative importance of a document within a set of documents. The slides illustrate it on a three-vertex graph; a Giraph version of the same computation is sketched after the steps.

1. All vertices start with the same PageRank (1.0 in the example).
2. Each vertex distributes an equal portion of its PageRank to all of its neighbors (e.g., 0.5 per neighbor for a vertex with two out-edges).
3. Each vertex sums the incoming values, multiplies by a weight factor, and adds a small adjustment of 1/(number of vertices in the graph); in the example the weight factor is 1, giving 1.5 + 1/3, 1.0 + 1/3, and 0.5 + 1/3.
4. This value becomes the vertex's PageRank for the next iteration: 1.83, 1.33, and 0.83 respectively.
5. Repeat until convergence: the change in PageRank per iteration falls below some epsilon.
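
A sketch of how these five steps look as a Giraph computation, again assuming the 1.1-era BasicComputation API. The fixed superstep limit stands in for the epsilon-based convergence test, and the damping-free update matches the slides rather than the classic damped PageRank formula.

```java
import java.io.IOException;
import org.apache.giraph.graph.BasicComputation;
import org.apache.giraph.graph.Vertex;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;

/** PageRank written vertex-centrically, mirroring steps 1-5 above. */
public class SimplePageRankComputation
    extends BasicComputation<LongWritable, DoubleWritable, NullWritable, DoubleWritable> {
  private static final int MAX_SUPERSTEPS = 30;   // stand-in for a convergence check

  @Override
  public void compute(Vertex<LongWritable, DoubleWritable, NullWritable> vertex,
                      Iterable<DoubleWritable> messages) throws IOException {
    if (getSuperstep() == 0) {
      vertex.setValue(new DoubleWritable(1.0));            // step 1: everyone starts at 1.0
    } else {
      double sum = 0;
      for (DoubleWritable msg : messages) {
        sum += msg.get();                                   // step 3: sum incoming shares...
      }
      double adjustment = 1.0 / getTotalNumVertices();      // ...and add 1/(# vertices)
      vertex.setValue(new DoubleWritable(sum + adjustment)); // step 4: the new PageRank
    }
    if (getSuperstep() < MAX_SUPERSTEPS) {
      // Step 2: send an equal share of this vertex's PageRank to every neighbor.
      if (vertex.getNumEdges() > 0) {
        sendMessageToAllEdges(vertex,
            new DoubleWritable(vertex.getValue().get() / vertex.getNumEdges()));
      }
    } else {
      vertex.voteToHalt();                                  // step 5: stop when done
    }
  }
}
```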

## PageRank on MapReduce

1. Load the complete input graph from disk as [K = vertex ID, V = out-edges and PageRank].
2. Emit all input records (the full graph state), plus [K = edge target, V = share of PageRank] for every out-edge.
3. Sort and shuffle this entire mess.
4. Sum the incoming PageRank shares for each vertex and update the PageRank values in the graph-state records.
5. Emit the full graph state to disk...
6. ...and start over!

The result is awkward to reason about and I/O bound, despite the simple core business logic. A sketch of one such iteration follows.
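
For concreteness, here is a rough sketch of a single PageRank iteration as a vanilla MapReduce job. The tab-separated line format, record tags, and class names are assumptions made for the example; the point is that every iteration re-reads, re-shuffles, and re-writes the whole graph, and the 1/(# vertices) adjustment is omitted for brevity.

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

/**
 * One PageRank iteration as a plain MapReduce job. Assumed input lines:
 * "vertexId<TAB>pageRank<TAB>neighbor1,neighbor2,...".
 */
public class PageRankIteration {

  public static class PRMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable offset, Text line, Context ctx)
        throws IOException, InterruptedException {
      String[] parts = line.toString().split("\t");
      String vertexId = parts[0];
      double pageRank = Double.parseDouble(parts[1]);
      String[] neighbors = parts.length > 2 ? parts[2].split(",") : new String[0];
      // Re-emit the graph structure so the reducer can rebuild the state record.
      ctx.write(new Text(vertexId),
          new Text("GRAPH\t" + (parts.length > 2 ? parts[2] : "")));
      // Emit an equal share of this vertex's PageRank to every neighbor.
      for (String n : neighbors) {
        ctx.write(new Text(n), new Text("SHARE\t" + (pageRank / neighbors.length)));
      }
    }
  }

  public static class PRReducer extends Reducer<Text, Text, Text, Text> {
    @Override
    protected void reduce(Text vertexId, Iterable<Text> values, Context ctx)
        throws IOException, InterruptedException {
      double sum = 0;
      String edges = "";
      for (Text v : values) {
        String[] parts = v.toString().split("\t", 2);
        if (parts[0].equals("GRAPH")) {
          edges = parts[1];                       // the vertex's out-edges
        } else {
          sum += Double.parseDouble(parts[1]);    // an incoming PageRank share
        }
      }
      // Write the updated full graph state back to disk for the next job.
      ctx.write(vertexId, new Text(sum + "\t" + edges));
    }
  }
}
```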

## PageRank on Giraph

1. Hadoop mappers are "hijacked" to host the Giraph master and worker tasks.
2. The input graph is loaded once, maintaining code-data locality when possible.
3. All iterations are performed on data held in memory, optionally spilled to disk; disk access is linear/scan-based.
4. Output is written from the mappers hosting the calculation, and the job run ends.

This is all well and good, but must we manipulate Hadoop this way?

- Heap and other resources are set once, globally, for all mappers in the computation
- No control over which cluster nodes host which tasks
- No control over how mappers are scheduled
- The mapper and reducer slot abstraction is meaningless for Giraph

## Overview of YARN

- YARN (Yet Another Resource Negotiator) is Hadoop's next-generation resource management platform
- A general-purpose framework that is not tied to the MapReduce paradigm
- Offers fine-grained control over each task's resource allocation

## Giraph on YARN

It's a natural fit!

Architecture (diagram): the client, the YARN ResourceManager, an ApplicationMaster, NodeManagers hosting the Giraph master and worker tasks, and ZooKeeper.

## Overview of Giraph

- A distributed graph processing framework
- Master/slave architecture
- In-memory computation
- Vertex-centric, high-level programming model
- Based on Bulk Synchronous Parallel (BSP)

## Giraph Architecture

- Master and workers (diagram)
- ZooKeeper

## Giraph Computation

(diagram)

## Metrics

- Performance: processing time
- Scalability: graph size (number of vertices and number of edges)

## Optimization Factors

- JVM: GC control (parallel GC, concurrent GC, young-generation memory size)
- Giraph: number of workers, combiner, out-of-core
- Application: object reuse

A hedged example of where these knobs are set is sketched below.
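
The following sketch shows where such knobs typically live when configuring a Giraph job in Java. The heap/GC options string and the giraph.* property names are assumptions based on Giraph 1.0/1.1-era options (and mapred.child.java.opts is the older-style Hadoop property for mapper JVM flags); treat this as an illustration to adapt, not a verified tuning recipe.

```java
import org.apache.giraph.conf.GiraphConfiguration;

/** Illustrative placement of the JVM-, Giraph-, and job-level tuning knobs. */
public class TuningExample {
  public static GiraphConfiguration configure() {
    GiraphConfiguration conf = new GiraphConfiguration();

    // JVM level: collector choice and young-generation sizing for the
    // JVMs that host the Giraph workers.
    conf.set("mapred.child.java.opts",
        "-Xmx20g -XX:+UseConcMarkSweepGC -XX:NewRatio=4");

    // Giraph level: number of workers (min, max, percent that must respond).
    conf.setWorkerConfiguration(30, 30, 100.0f);

    // Giraph level: spill partitions to disk when memory is tight
    // (out-of-core; costs performance, as measured later in the deck).
    conf.setBoolean("giraph.useOutOfCoreGraph", true);
    conf.setInt("giraph.maxPartitionsInMemory", 10);

    return conf;
  }
}
```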

## Experimental Settings

- Cluster: 43 nodes, ~800 GB of memory
- Hadoop 2.0.3-alpha (non-secure)
- Giraph 1.0.0 release
- Data: LinkedIn social network graph, approx. 205 million vertices and approx. 11 billion edges
- Application: PageRank algorithm

## Baseline Result

- Compared 10 GB vs. 20 GB of memory per worker, with a maximum of 800 GB overall
- Processing time: 10 GB per worker gives better performance
- Scalability: 20 GB per worker gives higher scalability

(Chart: configurations from 5 to 40 workers, spanning roughly 400 GB to 800 GB of total memory.)

## Heap Dump without Concurrent GC

- Heap dumps (in GB) at iteration 3 and iteration 27
- A big portion of the unreachable objects are messages created at each superstep

## Concurrent GC

- Significantly improves scalability, by 3x (20 GB per worker)
- Suffers a performance degradation of 16%

## Using Combiner

- Scales up 2x without any other optimizations
- Speeds up performance by 50% (20 GB per worker)

A sketch of a summing message combiner follows.
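
A combiner collapses many messages bound for the same vertex into one before they hit the network and the heap. Below is a minimal sketch for PageRank-style double messages, written against the MessageCombiner interface of later Giraph releases (the Giraph 1.0 Combiner class used in these experiments exposes the same two methods); the class name is hypothetical, and Giraph itself ships combiners much like this one.

```java
import org.apache.giraph.combiner.MessageCombiner;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;

/**
 * Sums PageRank shares addressed to the same vertex as they are buffered,
 * so each destination receives one message instead of one per in-edge.
 */
public class PageRankShareCombiner
    implements MessageCombiner<LongWritable, DoubleWritable> {

  @Override
  public void combine(LongWritable vertexId, DoubleWritable original,
                      DoubleWritable toCombine) {
    // Fold the new message into the one already buffered for this vertex.
    original.set(original.get() + toCombine.get());
  }

  @Override
  public DoubleWritable createInitialMessage() {
    return new DoubleWritable(0);
  }
}
```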

## Memory Distribution

- More workers achieve better performance
- A larger memory size per worker provides higher scalability

## Application: Object Reuse

- Improves scalability by 5x
- Improves performance by 4x (20 GB per worker; the full run fits in 650 GB and takes 29 minutes)
- Requires skill from application developers (see the sketch below)
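
"Object reuse" here means not allocating a fresh Writable for every value and message in every superstep. The sketch below (hypothetical class, same assumed 1.1-era API as the earlier examples) mutates the vertex's existing value object and reuses a single outgoing-message holder; this is safe only because Giraph serializes a message before the caller mutates the holder again, which is exactly the kind of detail that demands skill from application developers.

```java
import java.io.IOException;
import org.apache.giraph.graph.BasicComputation;
import org.apache.giraph.graph.Vertex;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;

/**
 * PageRank body rewritten to avoid per-edge and per-superstep allocations:
 * fewer short-lived objects means far less garbage-collection pressure.
 * (Initialization and halting logic are omitted; see the earlier sketch.)
 */
public class ReusingPageRankComputation
    extends BasicComputation<LongWritable, DoubleWritable, NullWritable, DoubleWritable> {

  // One reusable outgoing-message holder per computation instance.
  private final DoubleWritable outgoing = new DoubleWritable();

  @Override
  public void compute(Vertex<LongWritable, DoubleWritable, NullWritable> vertex,
                      Iterable<DoubleWritable> messages) throws IOException {
    double sum = 0;
    for (DoubleWritable msg : messages) {
      sum += msg.get();
    }
    // Mutate the vertex's existing value object instead of allocating a new one.
    DoubleWritable value = vertex.getValue();
    value.set(sum + 1.0 / getTotalNumVertices());
    vertex.setValue(value);

    if (vertex.getNumEdges() > 0) {
      // Reuse the same holder for every superstep's outgoing messages.
      outgoing.set(value.get() / vertex.getNumEdges());
      sendMessageToAllEdges(vertex, outgoing);
    }
  }
}
```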

## Problems of Giraph on YARN

- Many knobs have to be tuned to make Giraph applications run efficiently
- Success depends heavily on skilled application developers
- Performance penalties are incurred when scaling up

## Future Direction

- C++ provides direct control over memory management
- No need to rewrite the whole of Giraph: only the master and the workers would be written in C++

## Conclusion

- LinkedIn is the first adopter of Giraph on YARN
- Contributed improvements and bug fixes as patches to Apache Giraph
- Made the full LinkedIn graph run on a 40-node cluster with 650 GB of memory
- Evaluated various performance and scalability options

## Thank You, LinkedIn!

- Great experience
- State-of-the-art technology
- Intern activities and the food truck!

## Misc.

- AvroVertexInputFormat
- BinaryJsonVertexInputFormat
- Synthetic vertex generator
- Graph sampler
- Bug fixes

## Parallel GC

- ParallelGCThreads=8
- Observation: parallel GC improves Giraph's performance by 15%, but brings no improvement in scalability

## Out-of-Core

- Spills to disk in order to reduce memory pressure
- Significantly degrades Giraph's performance

## Heap Dump with Concurrent GC

- Heap dumps at iteration 3 and iteration 27
- The amount of unreachable data is reduced significantly
- The performance degradation remains

## Unreachable Analysis

- The unreachable objects are messages generated at each superstep
- They are left in the tenured generation