1 Parallel Sparse Operations in Matlab: Exploring Large Graphs
John R. Gilbert (University of California at Santa Barbara), Aydin Buluc (UCSB), Brad McRae (NCEAS), Steve Reinhardt (Interactive Supercomputing), Viral Shah (ISC & UCSB)
With thanks to Alan Edelman (MIT & ISC) and Jeremy Kepner (MIT-LL)
Support: DOE, NSF, DARPA, SGI, ISC

2 3D Spectral Coordinates

3 2D Histogram: RMAT Graph

4 Strongly Connected Components

5 Social Network Analysis in Matlab: Co-author graph from the 1993 Householder symposium

6 Combinatorial Scientific Computing
Emerging large-scale, high-performance applications:
– Web search and information retrieval
– Knowledge discovery
– Computational biology
– Dynamical systems
– Machine learning
– Bioinformatics
– Sparse matrix methods
– Geometric modeling
– ...
How will combinatorial methods be used by nonexperts?

7 Outline
– Infrastructure: Array-based sparse graph computation
– An application: Computational ecology
– Some nuts and bolts: Sparse matrix multiplication

8 Matlab*P
% Power iteration for the dominant eigenvector; the *p in the dimensions
% is what makes A and x distributed Star-P arrays -- the rest is ordinary Matlab.
A = rand(4000*p, 4000*p);
x = randn(4000*p, 1);
y = zeros(size(x));
while norm(x-y) / norm(x) > 1e-11
    y = x;
    x = A*x;
    x = x / norm(x);
end;

9 MATLAB® Star-P Architecture
[Architecture diagram: ordinary Matlab variables live in the Matlab client; the Star-P client connects to a server manager, package manager, and matrix manager spread across processors #0 ... #n-1; server-side components include ScaLAPACK, FFTW, an FPGA interface, sort, dense/sparse kernels, and UPC or MPI user code operating on distributed matrices.]

10 Distributed Sparse Array Structure
Each processor P0 ... Pn stores its local vertices and edges in a compressed row structure.
Has been scaled to >10^8 vertices, >10^9 edges in an interactive session.

11 Sparse Array and Matrix Operations
– dsparse layout; same semantics as ordinary full & sparse
– Matrix arithmetic: +, max, sum, etc.
– matrix * matrix and matrix * vector
– Matrix indexing and concatenation: A(1:3, [4 5 2]) = [ B(:, J) C ];
– Linear solvers: x = A \ b; using SuperLU (MPI)
– Eigensolvers: [V, D] = eigs(A); using PARPACK (MPI)
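For concreteness, here is a minimal sketch of these operations using ordinary Matlab sparse matrices; with Star-P the same statements would run on distributed dsparse arrays (the sizes and densities below are made up for illustration):

% Illustrative sparse operations (ordinary Matlab; sizes are arbitrary).
n = 1000;
A = speye(n) + sprand(n, n, 0.01);   % sparse test matrix
B = sprand(3, 2, 0.5);
C = sprand(3, 1, 0.5);
b = rand(n, 1);
J = [2 1];

s = sum(A, 2);                       % matrix arithmetic: row sums
M = max(A, A');                      % elementwise max
y = A * b;                           % matrix * vector
x = A \ b;                           % sparse linear solve
[V, D] = eigs(A, 6);                 % a few eigenvalues and eigenvectors
A(1:3, [4 5 2]) = [B(:, J) C];       % indexing and concatenation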

12 Large-Scale Graph Algorithms
Graph theory, algorithms, and data structures are ubiquitous in sparse matrix computation. Time to turn the relationship around!
– Represent a graph as a sparse adjacency matrix.
– A sparse matrix language is a good start on primitives for computing with graphs.
– Leverage the mature techniques and tools of high-performance numerical computation.

13 Sparse Adjacency Matrix and Graph
– Adjacency matrix: sparse array with nonzeros for graph edges
– Storage-efficient implementation from sparse data structures
[Figure: a directed graph alongside its sparse adjacency matrix A^T, with vectors x and A^T x.]
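As a small illustration (not from the talk's code), a sparse adjacency matrix can be built directly from a list of edge triples, which is also the input format of the HPCS benchmark discussed below:

% Build a sparse adjacency matrix from edge triples (tail, head, weight).
nverts  = 6;
tails   = [1 1 2 3 4 5];
heads   = [2 3 3 4 5 6];
weights = [1 1 1 1 1 1];
A = sparse(tails, heads, weights, nverts, nverts);   % A(i,j) ~= 0 iff edge i -> j
spy(A);                                              % nonzero pattern = graph edges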

14 Breadth-First Search: sparse mat * vec
– Multiply by adjacency matrix → step to neighbor vertices
– Work-efficient implementation from sparse data structures
[Figure: frontier vector x and the product A^T x highlighting the newly reached vertices.]

15 Breadth-First Search: sparse mat * vec
– Multiply by adjacency matrix → step to neighbor vertices
– Work-efficient implementation from sparse data structures
[Figure: the next frontier reached by another multiplication with A^T.]

16 Breadth-First Search: sparse mat * vec
– Multiply by adjacency matrix → step to neighbor vertices
– Work-efficient implementation from sparse data structures
[Figure: two multiplications reach the second frontier: x, A^T x, (A^T)^2 x.]
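A minimal sketch of this level-synchronous BFS in ordinary Matlab (the function name and details are illustrative, not the Star-P implementation):

function levels = bfs_spmv(A, s)
  % Breadth-first search from vertex s by repeated sparse mat-vec.
  % A is the n-by-n sparse adjacency matrix with A(i,j) ~= 0 for edge i -> j,
  % so multiplying by A' steps from the current frontier to its out-neighbors.
  n = size(A, 1);
  levels = -ones(n, 1);                       % -1 marks unvisited vertices
  x = sparse(s, 1, 1, n, 1);                  % frontier indicator vector
  lev = 0;
  while nnz(x) > 0
    levels(x ~= 0) = lev;                     % record the current frontier's level
    y = A' * x;                               % one step to neighbor vertices
    x = double((y ~= 0) & (levels == -1));    % keep only newly discovered vertices
    lev = lev + 1;
  end
end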

17 HPCS Graph Clustering Benchmark
– Many tight clusters, loosely interconnected
– Input data is edge triples
– Vertices and edges permuted randomly
– Fine-grained, irregular data access
– Searching and clustering

18 Clustering by Breadth-First Search
– Grow local clusters from many seeds in parallel
– Breadth-first search by sparse matrix * matrix
– Cluster vertices connected by many short paths

% Grow each seed to vertices reached by
% at least k paths of length 1 or 2
C = sparse(seeds, 1:ns, 1, n, ns);
C = A * C;
C = C + A * C;
C = C >= k;

19 Toolbox for Graph Analysis and Pattern Discovery
Layer 1: Graph Theoretic Tools
– Graph operations
– Global structure of graphs
– Graph partitioning and clustering
– Graph generators
– Visualization and graphics
– Scan and combining operations
– Utilities

20 Typical Application Stack
– Distributed Sparse Matrices: arithmetic, matrix multiplication, indexing, solvers (\, eigs)
– Graph Analysis & PD Toolbox: graph querying & manipulation, connectivity, spanning trees, geometric partitioning, nested dissection, NNMF, ...
– Preconditioned Iterative Methods: CG, BiCGStab, etc. + combinatorial preconditioners (AMG, Vaidya)
– Applications: computational ecology, CFD, data exploration
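As a small illustration of the iterative-methods layer (ordinary Matlab, using an incomplete Cholesky factor as a stand-in for the combinatorial AMG/Vaidya preconditioners mentioned above):

% Solve a 2-D model Poisson problem with preconditioned conjugate gradients.
k = 100;
A = gallery('poisson', k);          % k^2-by-k^2 sparse SPD model matrix
b = ones(size(A, 1), 1);
L = ichol(A);                       % incomplete Cholesky preconditioner factor
[x, flag, relres, iter] = pcg(A, b, 1e-8, 500, L, L');
fprintf('flag %d, %d iterations, relative residual %.2e\n', flag, iter, relres);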

21 Landscape Connectivity Modeling
– Landscape type and features facilitate or impede movement of members of a species
– Different species have different criteria, scales, etc.
– Habitat quality, gene flow, population stability
– Corridor identification, conservation planning

22 Pumas in Southern California
[Map: habitat quality model for the region around L.A., Palm Springs, and Joshua Tree N.P.]

23 Predicting Gene Flow with Resistive Networks
[Figures: circuit model predictions, and genetic vs. geographic distance.]

24 Early Experience with Real Genetic Data
– Good results with wolverines, mahogany, pumas
– Matlab implementation
– Needed: finer resolution, larger landscapes, faster interaction
[Map at 5 km resolution (too coarse).]

25 Circuitscape: Combinatorics and Numerics
– Model the landscape as a grid (ideally at 100 m resolution for pumas); the initial grid connects each cell to 4 or 8 neighbors.
– Partition the landscape into connected components via GAPDT.
– Use GAPDT to contract habitats into single graph nodes.
– Compute effective resistance for pairs of habitats.
– Direct methods are too slow for the largest problems; use iterative solvers via Star-P: Hypre (PCG + AMG).
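A serial sketch of the resistance computation at the heart of this pipeline (illustrative names; the parallel version solves the same kind of systems with Hypre's PCG+AMG through Star-P):

function R = effective_resistance(G, s, t)
  % Effective resistance between nodes s and t of a conductance graph G
  % (sparse, symmetric, nonnegative weights; s and t must be connected).
  % With one unit of current injected at s and removed at t, the resistance
  % equals the resulting potential difference v(s) - v(t).
  n = size(G, 1);
  L = spdiags(full(sum(G, 2)), 0, n, n) - G;   % weighted graph Laplacian
  b = zeros(n, 1);  b(s) = 1;  b(t) = -1;      % unit current from s to t
  keep = setdiff(1:n, t);                      % ground node t to fix the potential
  v = zeros(n, 1);
  v(keep) = pcg(L(keep, keep), b(keep), 1e-8, 1000);
  R = v(s) - v(t);
end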

26 Parallel Circuitscape Results
Pumas in southern California:
– 12 million nodes
– Under 1 hour (16 processors)
– Original code took 3 days at coarser resolution
Targeting much larger problems:
– Yellowstone-to-Yukon corridor
Figures courtesy of Brad McRae, NCEAS

27 Sparse Matrix times Sparse Matrix
A primitive in many array-based graph algorithms:
– Parallel breadth-first search
– Shortest paths
– Graph contraction (sketched below)
– Subgraph / submatrix indexing
– Etc.
Graphs are often not mesh-like, i.e. they lack geometric locality and good separators.
Often we do not want to optimize for one repeated operation, as in matvec for iterative methods.
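For example, a short sketch (illustrative, not the toolbox code) of graph contraction expressed as sparse matrix * sparse matrix:

% Contract a graph given a clustering: Ac = S' * A * S, where S is the
% sparse vertex-to-cluster indicator matrix.  Entry Ac(i,j) sums the
% weights of all edges running between cluster i and cluster j.
n = 1000;
A = sprand(n, n, 0.01);  A = A + A';        % random symmetric test graph
labels = randi(50, n, 1);                   % made-up 50-way clustering of the vertices
nc = max(labels);
S = sparse(1:n, labels, 1, n, nc);          % S(v, c) = 1 iff vertex v is in cluster c
Ac = S' * A * S;                            % adjacency matrix of the contracted graph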

28 Sparse Matrix times Sparse Matrix
Current work:
– Parallel algorithms with 2D data layout
– Sequential and parallel hypersparse algorithms
– Matrices over semirings

29 ParSpGEMM
C(I,J) += A(I,K) * B(K,J)
– Based on SUMMA
– Simple for non-square matrices, etc.
[Figure: block row I of A, block column J of B, and the block C(I,J) they update as K sweeps across the processor grid.]
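A serial sketch of the blocked computation (cell arrays of sparse blocks stand in for the 2-D processor grid; in the parallel algorithm each step of the K loop corresponds to broadcasting A(I,K) along processor rows and B(K,J) along processor columns):

function C = summa_spgemm_sketch(A, B)
  % A and B are 2-D cell arrays of conforming sparse blocks.
  % C{I,J} accumulates the sum over K of A{I,K} * B{K,J}, as in SUMMA.
  [pr, pc] = size(A);
  qc = size(B, 2);
  C = cell(pr, qc);
  for I = 1:pr
    for J = 1:qc
      C{I, J} = sparse(size(A{I, 1}, 1), size(B{1, J}, 2));
      for K = 1:pc                        % the broadcast stage in parallel SUMMA
        C{I, J} = C{I, J} + A{I, K} * B{K, J};
      end
    end
  end
end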

30 How Sparse? Hypersparse!
The whole matrix averages nnz(j) = c nonzeros per column, but after 2-D blocking over a sqrt(p)-by-sqrt(p) processor grid each local block averages only about c/sqrt(p) nonzeros per column, which goes to zero as p grows. A block can easily have fewer nonzeros than columns, so any local data structure that depends on the local submatrix dimension n (such as CSR or CSC) is too wasteful.

31 SparseDComp Data Structure
– "Doubly compressed" data structure
– Maintains both DCSC and DCSR
– C = A*B needs only A.DCSC and B.DCSR
– 4*nnz values communicated for A*B in the worst case (though we usually get away with much less)
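As a rough illustration of the idea (field names and layout here are hypothetical, not the exact SparseDComp definition), a doubly compressed column structure stores column pointers only for the columns that actually contain nonzeros:

function S = to_dcsc_sketch(A)
  % Hypothetical DCSC-like layout for a (hyper)sparse matrix A:
  %   jc  - indices of the nonempty columns                 (length nzc)
  %   cp  - pointers into ir/num delimiting those columns   (length nzc+1)
  %   ir  - row indices of the nonzeros
  %   num - numerical values of the nonzeros
  % Assumes A has at least one nonzero.
  [i, j, v] = find(A);
  [j, order] = sort(j);               % group the nonzeros by column
  i = i(order);
  v = v(order);
  cols = unique(j);                   % only nonempty columns get an entry
  counts = accumarray(j, 1);          % nonzeros per column (zeros for empty columns)
  S.jc = cols(:)';
  S.cp = [1, 1 + cumsum(counts(cols))'];
  S.ir = i;
  S.num = v;
end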

32 Sequential Operation Counts
– Matlab: O(n + nnz(B) + f)
– New SpGEMM: O(nzc(A) + nzr(B) + f*log k)
Here f is the number of required nonzero operations (flops), nzc(A) is the number of columns of A containing at least one nonzero, and nzr(B) is the analogous row count for B.
[Plot: the break-even point between the two algorithms.]

33 Parallel Timings
– 16-processor Opteron, HyperTransport, 64 GB memory
– R-MAT * R-MAT, n = 2^20, nnz = {8, 4, 2, 1, 0.5} * 2^20
[Plot: time vs. n/nnz, log-log scale.]

34 Matrices over Semirings
Matrix multiplication C = AB (or matrix/vector): C(i,j) = A(i,1)*B(1,j) + A(i,2)*B(2,j) + ... + A(i,n)*B(n,j)
Replace the scalar operations * and + by ⊗ and ⊕:
– ⊗: associative, distributes over ⊕, identity 1
– ⊕: associative, commutative, identity 0, which annihilates under ⊗
Then C(i,j) = A(i,1)⊗B(1,j) ⊕ A(i,2)⊗B(2,j) ⊕ ... ⊕ A(i,n)⊗B(n,j)
Examples: (*, +); (and, or); (+, min); ...
Same data reference pattern and control flow.
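For instance, a minimal sketch of a "mat-vec" over the semiring where ⊗ is + and ⊕ is min, which performs one relaxation step of single-source shortest paths (plain loops for clarity; not the distributed code):

function y = minplus_matvec(W, d)
  % One semiring "mat-vec" with scalar * replaced by + and scalar + by min:
  %   y(i) = min over j of ( W(i,j) + d(j) ),  also keeping y(i) <= d(i).
  % W(i,j) is the weight of edge j -> i (Inf where there is no edge) and d
  % holds current distance estimates; iterating to a fixed point gives
  % single-source shortest paths (Bellman-Ford style).
  n = length(d);
  y = d;                                      % allow "staying put"
  for i = 1:n
    for j = 1:n
      y(i) = min(y(i), W(i, j) + d(j));
    end
  end
end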

35 Remarks
– Tools for combinatorial methods built on parallel sparse matrix infrastructure
– Easy-to-use interactive programming environment
  – Rapid prototyping tool for algorithm development
  – Interactive exploration and visualization of data
– Sparse matrix * sparse matrix is a key primitive
– Matrices over semirings like (min,+) as well as (+,*)