Distributed Breadth-First Search with 2-D Partitioning Edmond Chow, Keith Henderson, Andy Yoo Lawrence Livermore National Laboratory LLNL Technical report.

Distributed Breadth-First Search with 2-D Partitioning Edmond Chow, Keith Henderson, Andy Yoo Lawrence Livermore National Laboratory LLNL Technical report UCRL-CONF Presented by K. Sheldon March 2011

Abstract This paper studies the implementation of a level-synchronized distributed breadth-first search (BFS) algorithm applied to large graphs and evaluates its performance under two partitioning strategies. The authors compare 1-D (vertex) partitioning with 2-D (edge) partitioning using Poisson random graphs. The experimental findings show that, for Poisson random graphs, which partitioning method performs better depends on the average degree of the graph.

Definitions
Poisson random graph – a graph in which the probability of an edge connecting any two vertices is constant
P – number of processors
n – number of vertices in a Poisson random graph
k – the average degree
level – the graph distance of a vertex from the start vertex
frontier – the set of vertices on the current level of the BFS algorithm
neighbor – a vertex that shares an edge with another vertex
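To make the definitions concrete, here is a minimal sketch of generating a small Poisson random graph. It assumes an undirected graph without self-loops in which each pair of vertices is connected independently with probability p = k/(n-1), so that the expected degree is k. The function name `poisson_random_graph` and the `seed` parameter are illustrative and do not come from the paper.

```python
import random

def poisson_random_graph(n, k, seed=0):
    """Return a list of neighbor sets for an undirected random graph.

    Assumption: each of the n*(n-1)/2 vertex pairs is joined independently
    with the same probability p = k / (n - 1), giving average degree ~k.
    """
    rng = random.Random(seed)
    p = k / (n - 1)
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)   # store the edge in both edge lists
                adj[v].add(u)
    return adj
```

This quadratic loop is only suitable for small illustrative graphs, not for the graph sizes studied in the paper.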

Breadth-first Search (BFS) BFS traverses a graph by exploring all vertices adjacent to the start vertex before moving deeper, so it visits the vertices level by level. Level-synchronized BFS algorithms (used in this experiment) proceed level by level on all processors. The Question: How do the two partitioning strategies, 1-D and 2-D, compare with regard to communication and overall time for BFS?
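As a point of reference before the distributed versions, the level-by-level structure can be sketched serially. This is a generic single-process BFS, not the authors' code; the name `bfs_levels` and the dictionary-based graph representation are assumptions made for illustration.

```python
def bfs_levels(adj, start):
    """Return {vertex: level} for every vertex reachable from start.

    adj maps each vertex to an iterable of its neighbors.
    """
    level = {start: 0}
    frontier = {start}
    depth = 0
    while frontier:
        depth += 1
        # Merge the edge lists of the whole frontier before going deeper.
        neighbors = set()
        for u in frontier:
            neighbors.update(adj[u])
        # Only vertices not yet labeled form the next frontier.
        frontier = {v for v in neighbors if v not in level}
        for v in frontier:
            level[v] = depth
    return level
```

Each pass of the `while` loop handles exactly one level, which is the unit at which the distributed algorithms synchronize.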

1-D (vertex) Partitioning It is simpler than 2-D partitioning. The vertices of the graph, which correspond to the rows of the adjacency matrix A, are divided among the processors. The edges emanating from a vertex form its edge list, which is the list of vertex indices in row v of A, and they are owned by the same processor as the vertex. The slide's figure shows a 1-D P-way partition of adjacency matrix A, symmetrically reordered so that vertices owned by the same processor are contiguous.
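A minimal sketch of the ownership rule for a 1-D partition follows. It assumes 0-based vertex numbers and equal-sized contiguous blocks (the last block may be shorter); the helper names `owner_1d` and `local_vertices_1d` are hypothetical.

```python
def owner_1d(v, n, P):
    """Processor (0-based) that owns vertex v, and therefore its edge list."""
    block = -(-n // P)          # ceil(n / P) vertices per processor
    return v // block

def local_vertices_1d(p, n, P):
    """Contiguous range of vertices owned by processor p."""
    block = -(-n // P)
    return range(min(p * block, n), min((p + 1) * block, n))
```

For example, with n = 10 and P = 4 the blocks hold 3, 3, 3 and 1 vertices.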

1-D partitioning (cont.) The disadvantage for parallel processing is that the edge lists of a processor's vertices can point to vertices on every other processor, so communication between processors is all-to-all. This introduces a great deal of overhead.

2-D (edge) Partitioning It is more complex than 1-D partitioning. Edge partitioning divides the edges, rather than the vertices, among the processors, so each processor owns a partial adjacency matrix. The advantage is that any edge in the graph can be followed by moving along a row or a column. The vertices are also partitioned, so that each vertex is owned by exactly one processor. A processor owns edges incident on its vertices and also some edges that are not. The slide's figure shows a 2-D (checkerboard) partition of adjacency matrix A, again symmetrically reordered so that vertices owned by the same processor are contiguous. The partitioning is for P = R·C processors.

2-D Edge Partitioning Layout

2-D partitioning (cont.) The adjacency matrix is divided into R·C block rows and C block columns. $A_{i,j}^{(*)}$ denotes a block owned by processor (i,j). Each processor owns C blocks. For vertices, processor (i,j) owns the vertices in block row (j−1)R + i. For comparison, the 1-D partition is the special case of 2-D with R = 1 or C = 1 (P = R·C). In the 2-D layout, the edge list for a vertex is a column of the adjacency matrix A, so each block holds partial edge lists.
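The index arithmetic can be sketched as follows, using the slide's 1-based indices. The first two functions follow directly from the block-row formula (j−1)R + i. The third, `edge_block_owner`, rests on an assumption about the figure that is not reproduced here: that block rows are assigned to processor rows cyclically and that block column j holds the columns of the vertices owned by processor column j. All function names are hypothetical.

```python
def vertex_block_row(i, j, R):
    """Block row of the vertices owned by processor (i, j), 1-based."""
    return (j - 1) * R + i

def vertex_block_owner(b, R):
    """Inverse mapping: processor (i, j) that owns vertex block row b."""
    return ((b - 1) % R + 1, (b - 1) // R + 1)

def edge_block_owner(bu, bv, R):
    """Processor holding matrix entries with row in block bu, column in block bv.

    ASSUMPTION (layout not shown in the transcript): the processor row is
    determined by the block row, and the processor column by which
    processor column owns the vertices of block bv.
    """
    return ((bu - 1) % R + 1, (bv - 1) // R + 1)
```

With R = 1 the mapping gives processor (1, j) block row j, which matches the remark that 1-D partitioning is a special case of the 2-D scheme.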

BFS with 1-D Partitioning Algorithm Highlights:
1. Start with v_s and initialize L (the levels, or graph distance from v_s).
2. Create set F, the frontier vertices owned by the processor.
3. Merge the edge lists of the vertices in F to form set N, the neighboring vertices.
4. Send messages to the processors that own vertices in set N, so that they can potentially add these vertices to their next F.
5. Receive the set N vertices sent by other processors.
6. Merge to form the final set N (removing duplicates).
7. Update L.
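The steps above can be sketched as a single-process simulation in which the P processors are entries in a list and message passing is replaced by an in-memory `outbox`. This is an illustration of the control flow under those assumptions, not the authors' MPI implementation; `bfs_1d` and its arguments are hypothetical names, and vertices are numbered from 0 and block-partitioned.

```python
def bfs_1d(adj, n, P, start):
    """Simulate level-synchronized BFS with 1-D partitioning.

    adj[v] is the edge list of vertex v (0-based); vertex v is owned by
    processor v // ceil(n / P).
    """
    block = -(-n // P)
    owner = lambda v: v // block
    # Each simulated processor stores levels only for the vertices it owns.
    level = [dict() for _ in range(P)]
    level[owner(start)][start] = 0
    depth = 0
    while True:
        # F: frontier vertices owned by each processor.
        F = [{v for v, l in level[p].items() if l == depth} for p in range(P)]
        if not any(F):              # stands in for a global termination check
            break
        # outbox[p][q]: neighbors found by p that are owned by q.
        outbox = [[set() for _ in range(P)] for _ in range(P)]
        for p in range(P):
            for u in F[p]:
                for v in adj[u]:
                    outbox[p][owner(v)].add(v)
        depth += 1
        for q in range(P):
            # Receive from all P processors and merge (sets drop duplicates).
            N = set().union(*(outbox[p][q] for p in range(P)))
            for v in N:
                level[q].setdefault(v, depth)   # label only unvisited vertices
    merged = {}
    for part in level:
        merged.update(part)
    return merged
```

Note that the receive step loops over all P senders, which is the all-to-all pattern described for 1-D partitioning.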

BFS with 2-D Partitioning Algorithm Highlights:
Same as 1-D:
1. Start with v_s and initialize L (the levels, or graph distance from v_s).
2. Create set F, the frontier vertices owned by the processor.
Different:
3. Send F to the processors in the processor column, since the edge list of a vertex v might be on another processor.
4. Receive the F sets from the other processors in the column.
5. Merge to form the complete set F.
6. Merge the edge lists (on this processor) of the vertices in F to form set N, the neighboring vertices.
Almost the same as 1-D:
7. Send messages to the processors (in the processor row) that own vertices in set N, so that they can potentially add these vertices to their next F.
8. Receive the set N vertices sent by other processors (in the processor row).
9. Merge to form the final set N (removing duplicates).
10. Update L.
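The column and row phases can be sketched in the same single-process style. The sketch assumes 0-based indices, a block distribution of vertices in which vertex block b belongs to processor (b mod R, b div R), and partial edge lists placed so that processor (i, j) holds, for each vertex owned by column j, the neighbors owned by processor row i. These layout details and the name `bfs_2d` are assumptions for illustration, not taken from the authors' code.

```python
def bfs_2d(adj, n, R, C, start):
    """Simulate level-synchronized BFS with 2-D partitioning on an R x C grid."""
    block = -(-n // (R * C))

    def owner(v):
        b = v // block              # 0-based vertex block row
        return (b % R, b // R)      # (processor row, processor column)

    # part[(i, j)][v]: partial edge list of v stored on processor (i, j).
    part = {(i, j): {} for i in range(R) for j in range(C)}
    for v in range(n):
        j = owner(v)[1]
        for u in adj[v]:
            part[(owner(u)[0], j)].setdefault(v, set()).add(u)

    level = {proc: {} for proc in part}
    level[owner(start)][start] = 0
    depth = 0
    while True:
        F = {proc: {v for v, l in lv.items() if l == depth}
             for proc, lv in level.items()}
        if not any(F.values()):
            break
        # Column phase: every processor in column j receives the merged F.
        Fcol = [set().union(*(F[(i, j)] for i in range(R))) for j in range(C)]
        depth += 1
        inbox = {proc: set() for proc in part}
        for (i, j), edges in part.items():
            for v in Fcol[j]:
                for u in edges.get(v, ()):
                    dest = owner(u)
                    assert dest[0] == i     # row phase stays in processor row i
                    inbox[dest].add(u)
        for proc, N in inbox.items():
            for u in N:
                level[proc].setdefault(u, depth)
    merged = {}
    for lv in level.values():
        merged.update(lv)
    return merged
```

The `assert` inside the loop checks the property the slide relies on: under this layout, every neighbor a processor discovers is owned by a processor in its own row.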

Advantage of 2-D Partitioning The advantage of 2-D over 1-D is that communication is confined to a processor column or a processor row, which involve only R and C processors respectively. In 1-D partitioning all P processors are involved. Note that in 2-D partitioning each processor must store information about the edge lists of the other processors in its column.
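A small arithmetic sketch shows how the number of possible communication partners differs. It only counts the other processors a given processor may exchange messages with in each phase; it says nothing about message sizes. The function name `partner_counts` is hypothetical.

```python
def partner_counts(R, C):
    """Return (1-D partners, 2-D column partners, 2-D row partners).

    Counts the *other* processors one processor may talk to per level.
    """
    P = R * C
    return (P - 1, R - 1, C - 1)

# With P = 1024 arranged as a 32 x 32 grid:
one_d, col, row = partner_counts(32, 32)
print(one_d, col, row)   # 1023 31 31
```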

Experimental Results Message length and timing results are based on:
Distributed BFS
Poisson random graphs
Load-balanced 1-D and 2-D partitioning
2-D: R = C = √P
1-D: R = 1, C = P
100 pairs of randomly generated start and target vertices
Timings are the average of the final 99 trials.
The code was run on two different computer systems at LLNL: MCR and BlueGene/L.

Message Lengths

Weak Scaling, k = 100, 2-D has better performance

Weak Scaling Test, k = 10, 1-D has better performance

Strong Scaling Test, k = 10

Conclusions Project Accomplishments: Demonstrated distributed BFS on large-scale Poisson random graphs and compared performance with 1-D and 2-D partitioning. 2-D partitioning is a useful strategy when the average degree k of the graph is large. Future Work: Investigate different partitioning methods with other, more structured graphs, such as graphs with a large clustering coefficient or scale-free graphs, which have a few vertices of very large degree.

Authors Andy Yoo, Edmond Chow, Keith Henderson, William McLendon, Bruce Hendrickson, and Umit Catalyurek. A scalable distributed parallel breadth-first search algorithm on BlueGene/L. In SC ’05: Proceedings of the 2005 ACM/IEEE conference on Supercomputing, DOI= /SC