ScaleGraph: A High-Performance Library for Billion-Scale Graph Analytics Miraj Kheni Authors: Toyotaro Suzumura, Koji Ueno
ScaleGraph: Motivations Growing interest in the field of large-scale graph mining Achieving high performance and low software complexity at the same time Harnessing performance and productivity of X10
ScaleGraph: Related Work Parallel Boost Graph Library (PBGL) Google’s Pregel Apache Giraph GraphLab
ScaleGraph: X10 Overview A type-safe, imperative, concurrent, object-oriented language Asynchronous partitioned global address space (APGAS) programming model Key concepts: place: analogous to a process; activity: analogous to a lightweight thread An activity can access remote data using the at keyword or the x10.util.Team class Seamless migration of an activity to a different place Supports Cilk-style fork-join programming: async, finish
ScaleGraph: Improved X10 Team Class X10 Team: contains routines for collective communication ScaleGraph proposes a new X10 Team class that optimizes collective communication by avoiding X10RT emulation Implements both blocking and non-blocking collective communication routines that directly use MPI collective communication routines
ScaleGraph: New Memory Management System X10 back-end uses the Boehm-Demers-Weiser conservative garbage collector (BDWGC) BDWGC does not differentiate between small and large memory allocation requests The authors propose EMM (Explicitly Managed Memory) Small allocations: use BDWGC; large allocations: explicitly use malloc and free
ScaleGraph 2: Overview Implemented new readers and writers for parallel IO Implemented new X10 Team Implemented XPregel framework Three main components: XPregel framework, Basic Linear Algebra Subprograms (BLAS), File IO Provides a number of graph algorithms
ScaleGraph: Graph Representation Distributed Graph Object: distributed mutable lists of edges, vertex attributes and edge attributes; does not support query operations on the graph Distributed Sparse Matrix: efficient way of storing an immutable, sparse adjacency matrix; supports querying vertex neighbors, vertex attributes and edge attributes
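To make the neighbor-query point concrete, here is a minimal single-process sketch of a sparse adjacency matrix in compressed sparse row (CSR) form. CSR is one common layout for immutable sparse matrices; whether ScaleGraph's distributed sparse matrix uses exactly this layout is an assumption, and the names `build_csr` and `neighbors` are hypothetical.

```python
from typing import List, Tuple


def build_csr(num_vertices: int, edges: List[Tuple[int, int]]) -> Tuple[List[int], List[int]]:
    """Build CSR arrays (offsets, targets) from a directed edge list."""
    # Count the out-degree of every source vertex, shifted by one slot.
    offsets = [0] * (num_vertices + 1)
    for src, _ in edges:
        offsets[src + 1] += 1
    # Prefix sum turns the counts into row start positions.
    for v in range(num_vertices):
        offsets[v + 1] += offsets[v]
    # Scatter each edge target into its row.
    targets = [0] * len(edges)
    cursor = offsets[:-1].copy()  # next free slot in each row
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets


def neighbors(offsets: List[int], targets: List[int], v: int) -> List[int]:
    """Return the out-neighbors of vertex v as one contiguous slice."""
    return targets[offsets[v]:offsets[v + 1]]
```

Once built, the structure is read-only: a neighbor query is a single slice, while adding an edge would require rebuilding the arrays. That trade-off matches the split on the slide between a mutable edge list and an immutable, queryable matrix.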
ScaleGraph: Graph Representation Distributions of a distributed sparse matrix on four places: 1D row-wise (R=4, C=1), 1D column-wise (R=1, C=4), 2D block (R=2, C=2)
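The three distributions differ only in the choice of R and C. The sketch below maps a matrix entry to the place that would own it. It assumes contiguous, equally sized blocks and row-major place numbering; both are illustrative assumptions rather than details taken from ScaleGraph, and `owner_place` is a hypothetical helper.

```python
def owner_place(row: int, col: int, n: int, R: int, C: int) -> int:
    """Place id owning entry (row, col) of an n x n matrix split into R x C blocks."""
    rows_per_block = -(-n // R)  # ceiling division
    cols_per_block = -(-n // C)
    block_row = row // rows_per_block
    block_col = col // cols_per_block
    # Assumed numbering: places are laid out row-major over the block grid.
    return block_row * C + block_col


if __name__ == "__main__":
    n = 8
    for label, (R, C) in {"1D row-wise": (4, 1),
                          "1D column-wise": (1, 4),
                          "2D block": (2, 2)}.items():
        print(label, owner_place(5, 7, n, R, C))
```

With R=4, C=1 the owner depends only on the row; with R=1, C=4 only on the column; with R=2, C=2 on both.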
ScaleGraph: XPregel Framework Pregel Computation Model Supersteps: during each superstep, a user-defined function is invoked for each vertex Three main interfaces: Message Passing, Combiner and Aggregator XPregel interface:
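As a generic illustration of the Pregel computation model (this is not the XPregel API, and every name in it is hypothetical), the following single-process Python sketch propagates the maximum vertex value through a graph. It shows supersteps, message passing, and a max combiner; aggregators are left out for brevity. It assumes every vertex, including those with no outgoing edges, appears as a key in `graph`.

```python
from collections import defaultdict
from typing import Dict, List


def pregel_max(graph: Dict[int, List[int]], values: Dict[int, int]) -> Dict[int, int]:
    """Propagate the maximum vertex value along edges in Pregel-style supersteps."""
    values = dict(values)
    inbox: Dict[int, List[int]] = defaultdict(list)
    active = set(graph)  # every vertex is active in superstep 0
    superstep = 0
    while active:
        outbox: Dict[int, List[int]] = defaultdict(list)
        for v in active:
            # Per-vertex compute step: fold incoming messages into the value.
            new_value = max([values[v]] + inbox.get(v, []))
            changed = new_value > values[v]
            values[v] = new_value
            if superstep == 0 or changed:
                for dst in graph[v]:
                    outbox[dst].append(new_value)
            # Otherwise the vertex sends nothing, which acts as a vote to halt.
        # Combiner: keep only the maximum message per destination vertex.
        inbox = defaultdict(list, {dst: [max(msgs)] for dst, msgs in outbox.items()})
        # A vertex is reactivated only when a message arrives for it.
        active = set(inbox)
        superstep += 1
    return values
```

Messages produced in one superstep are only visible in the next, so the order in which vertices are processed within a superstep does not affect the result.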
ScaleGraph: More Optimizations SendAll Optimization Technique Aims at reducing the number of messages when a vertex sends the same message to all of its neighbors If enabled, the vertex sends only one message to each destination place, which in turn duplicates the message before passing it to the respective vertices BLAS for Sparse Matrix For graph problems that can be expressed using repeated matrix-vector multiplication
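The saving from SendAll can be sketched by counting messages that cross place boundaries for a single vertex. This is a simplified counting model rather than ScaleGraph code: it assumes neighbors on the sender's own place cost no network message, and `count_messages` and `place_of` are hypothetical names.

```python
from typing import Dict, List


def count_messages(neighbors: List[int], place_of: Dict[int, int],
                   src_place: int, send_all: bool) -> int:
    """Count inter-place messages when one vertex sends the same value to all neighbors."""
    # Destination place of every neighbor that lives on another place.
    remote = [place_of[n] for n in neighbors if place_of[n] != src_place]
    if send_all:
        # One message per distinct remote place; the receiver fans it out locally.
        return len(set(remote))
    # One message per remote neighbor.
    return len(remote)


if __name__ == "__main__":
    place_of = {1: 0, 2: 1, 3: 1, 4: 2, 5: 2}
    nbrs = [1, 2, 3, 4, 5]
    print(count_messages(nbrs, place_of, src_place=0, send_all=False))
    print(count_messages(nbrs, place_of, src_place=0, send_all=True))
```

Under this model the benefit grows with vertex degree: a high-degree vertex sends at most one message per remote place regardless of how many neighbors it has there.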
ScaleGraph: Evaluation Updated X10 Team:
ScaleGraph: Evaluation Weak Scaling: Strong Scaling:
ScaleGraph: Evaluation Strong Scaling: ScaleGraph vs Giraph vs PBGL
Conclusion Use of X10: helps both productivity and performance Use of the optimized X10 Team: optimizes collective communication Introduces EMM: better memory management XPregel's SendAll: reduces duplication of messages These optimizations help ScaleGraph achieve high performance, scalability and productivity.
Thank You