1 Parallel Sparse Operations in Matlab: Exploring Large Graphs John R. Gilbert University of California at Santa Barbara Aydin Buluc (UCSB) Brad McRae (NCEAS) Steve Reinhardt (Interactive Supercomputing) Viral Shah (ISC & UCSB) with thanks to Alan Edelman (MIT & ISC) and Jeremy Kepner (MIT-LL) Support: DOE, NSF, DARPA, SGI, ISC
5 Social Network Analysis in Matlab: 1993 Co-author graph from 1993 Householder symposium
6 Combinatorial Scientific Computing Emerging large scale, high-performance applications: Web search and information retrieval Knowledge discovery Computational biology Dynamical systems Machine learning Bioinformatics Sparse matrix methods Geometric modeling... How will combinatorial methods be used by nonexperts?
7 Outline Infrastructure: Array-based sparse graph computation An application: Computational ecology Some nuts and bolts: Sparse matrix multiplication
8 Matlab*P
    A = rand(4000*p, 4000*p);
    x = randn(4000*p, 1);
    y = zeros(size(x));
    while norm(x-y) / norm(x) > 1e-11
        y = x;
        x = A*x;
        x = x / norm(x);
    end;
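The loop on this slide is a power iteration. A minimal serial sketch in ordinary Matlab follows, assuming a small dimension n in place of 4000*p; the iteration cap, the Rayleigh-quotient estimate, and the residual check are additions for illustration and are not on the slide.

```matlab
% Serial sketch of the slide's power iteration (assumption: small n, no Star-P).
n = 200;                     % stand-in for 4000*p
A = rand(n, n);              % entrywise positive, so a dominant eigenvalue exists
x = randn(n, 1);
x = x / norm(x);
y = zeros(size(x));
iters = 0;                   % iteration cap added as a safeguard (not on the slide)
while norm(x - y) / norm(x) > 1e-11 && iters < 1000
    y = x;
    x = A * x;
    x = x / norm(x);
    iters = iters + 1;
end
lambda = x' * A * x;         % Rayleigh quotient estimate of the dominant eigenvalue
residual = norm(A * x - lambda * x) / abs(lambda);
```

In the Matlab*P version the only change is the `*p` tag on the dimensions, which marks them as distributed; the loop body is the same.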
10 Distributed Sparse Array Structure Each processor (P0, P1, P2, ..., Pn) stores local vertices & edges in a compressed row structure. Has been scaled to $>10^8$ vertices, $>10^9$ edges in interactive session. [Figure: rows of the sparse array distributed across processors]
11 Sparse Array and Matrix Operations dsparse layout, same semantics as ordinary full & sparse Matrix arithmetic: +, max, sum, etc. matrix * matrix and matrix * vector Matrix indexing and concatenation A (1:3, [4 5 2]) = [ B(:, J) C ] ; Linear solvers: x = A \ b; using SuperLU (MPI) Eigensolvers: [V, D] = eigs(A); using PARPACK (MPI)
12 Large-Scale Graph Algorithms Graph theory, algorithms, and data structures are ubiquitous in sparse matrix computation. Time to turn the relationship around! Represent a graph as a sparse adjacency matrix. A sparse matrix language is a good start on primitives for computing with graphs. Leverage the mature techniques and tools of high-performance numerical computation.
13 Sparse Adjacency Matrix and Graph Adjacency matrix: sparse array w/ nonzeros for graph edges. Storage-efficient implementation from sparse data structures. [Figure: 7-vertex graph, its transposed adjacency matrix $A^T$, a vector $x$, and the product $A^T x$]
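As a small illustration of the adjacency-matrix representation, the sketch below builds a sparse matrix from an edge list and reads off the neighbors of one vertex with a transposed product. The 7-vertex edge list is hypothetical, not the graph drawn on the slide.

```matlab
% Hypothetical directed graph given as an edge list (not the slide's graph).
src = [1 1 2 3 4 5 6];
dst = [2 4 3 6 5 7 7];
n = 7;
A = sparse(src, dst, 1, n, n);   % A(i,j) = 1 for each edge i -> j
x = sparse(1, 1, 1, n, 1);       % indicator vector of vertex 1
y = A' * x;                      % nonzeros of y mark the out-neighbors of vertex 1
neighbors = find(y)';
```

Storage is proportional to the number of edges rather than to n^2, which is the storage efficiency the slide refers to.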
14 Breadth-First Search: sparse mat * vec Multiply by adjacency matrix: step to neighbor vertices. Work-efficient implementation from sparse data structures. [Figure: 7-vertex graph, $A^T$, start vector $x$, and $A^T x$]
16 Breadth-First Search: sparse mat * vec Multiply by adjacency matrix: step to neighbor vertices. Work-efficient implementation from sparse data structures. [Figure: 7-vertex graph, $A^T$, start vector $x$, $A^T x$, and $(A^T)^2 x$]
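The idea of stepping to neighbors by multiplying with the transposed adjacency matrix can be sketched as a level-by-level BFS. This is an illustrative serial version on a hypothetical 7-vertex graph; the variable names (`level`, `frontier`, `reached`, `fresh`) are assumptions, not names from the slides.

```matlab
% BFS levels by repeated sparse matrix-vector products (illustrative sketch).
src = [1 1 2 3 4 5 6];
dst = [2 4 3 6 5 7 7];
n = 7;
A = sparse(src, dst, 1, n, n);        % A(i,j) = 1 for each edge i -> j
level = -ones(n, 1);                  % -1 marks "not yet reached"
frontier = sparse(1, 1, 1, n, 1);     % start the search at vertex 1
level(1) = 0;
d = 0;
while nnz(frontier) > 0
    d = d + 1;
    reached = full(A' * frontier) ~= 0;   % vertices one step from the frontier
    fresh = reached & (level == -1);      % keep only unvisited vertices
    level(fresh) = d;
    frontier = sparse(double(fresh));     % next frontier, stored sparsely
end
```

Each product touches only the edges leaving the current frontier, which is where the work efficiency comes from.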
17 HPCS Graph Clustering Benchmark Fine-grained, irregular data access. Searching and clustering. Many tight clusters, loosely interconnected. Input data is edge triples. Vertices and edges permuted randomly.
18 Clustering by Breadth-First Search Grow local clusters from many seeds in parallel. Breadth-first search by sparse matrix * matrix. Cluster vertices connected by many short paths.
    % Grow each seed to vertices
    % reached by at least k
    % paths of length 1 or 2
    C = sparse(seeds, 1:ns, 1, n, ns);
    C = A * C;
    C = C + A * C;
    C = C >= k;
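To see what the four lines of clustering code do, here is a runnable toy instance. The graph (two triangles joined by a single edge), the seed choice, and k = 2 are assumptions made for illustration; the four statements that compute C are the ones from the slide.

```matlab
% Toy graph: triangles {1,2,3} and {4,5,6} joined by the edge 3-4 (hypothetical data).
edges = [1 2; 1 3; 2 3; 4 5; 4 6; 5 6; 3 4];
n = 6;
A = sparse(edges(:,1), edges(:,2), 1, n, n);
A = A + A';                           % undirected graph: symmetric adjacency matrix
seeds = [1 5];                        % one seed in each triangle
ns = numel(seeds);
k = 2;                                % require at least k short paths from the seed
C = sparse(seeds, 1:ns, 1, n, ns);    % column s is the indicator of seed s
C = A * C;                            % number of paths of length 1 from each seed
C = C + A * C;                        % plus number of paths of length 2
C = C >= k;                           % keep vertices reached by at least k paths
cluster1 = find(C(:,1))';
cluster2 = find(C(:,2))';
```

Vertex 4 is reached from seed 1 by only one path of length 2 (through vertex 3), so it stays out of the first cluster; each triangle is recovered as its own cluster.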
19 Toolbox for Graph Analysis and Pattern Discovery Layer 1: Graph Theoretic Tools Graph operations Global structure of graphs Graph partitioning and clustering Graph generators Visualization and graphics Scan and combining operations Utilities
21 Landscape Connectivity Modeling Landscape type and features facilitate or impede movement of members of a species Different species have different criteria, scales, etc. Habitat quality, gene flow, population stability Corridor identification, conservation planning
22 Pumas in Southern California Habitat quality model. [Map labels: L.A., Palm Springs, Joshua Tree N.P.]
23 Predicting Gene Flow with Resistive Networks Circuit model predictions: Genetic vs. geographic distance:
24 Early Experience with Real Genetic Data Good results with wolverines, mahogany, pumas Matlab implementation Needed: –Finer resolution –Larger landscapes –Faster interaction 5 km resolution (too coarse)
25 Circuitscape: Combinatorics and Numerics Model landscape (ideally at 100m resolution for pumas). Initial grid models connections to 4 or 8 neighbors. Partition landscape into connected components via GAPDT. Use GAPDT to contract habitats into single graph nodes. Compute resistance for pairs of habitats. Direct methods are too slow for largest problems. Use iterative solvers via Star-P: Hypre (PCG+AMG).
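The "compute resistance for pairs of habitats" step can be illustrated on a toy circuit. The sketch below is not the Circuitscape code: it assumes a 2-by-2 grid with 4-neighbor connections and unit conductance on every edge, forms the graph Laplacian, grounds one node, and solves for voltages with a direct solve.

```matlab
% Toy resistive network: 2-by-2 grid, unit conductances (assumption for illustration).
edges = [1 2; 3 4; 1 3; 2 4];                % cells numbered column-major
n = 4;
G = sparse(edges(:,1), edges(:,2), 1, n, n);
G = G + G';                                  % symmetric conductance matrix
L = spdiags(full(sum(G, 2)), 0, n, n) - G;   % graph Laplacian
s = 1; t = 4;                                % source cell and grounded cell
b = zeros(n, 1);
b(s) = 1;                                    % inject one unit of current at s
keep = setdiff(1:n, t);                      % ground t by dropping its row and column
v = zeros(n, 1);
v(keep) = L(keep, keep) \ b(keep);           % node voltages
R = v(s) - v(t);                             % effective resistance between s and t
```

Cells 1 and 4 are opposite corners of a 4-cycle, so the two parallel paths of resistance 2 give R = 1. The backslash solve is a direct method; for the large landscapes on the slide it would be replaced by a preconditioned iterative solver.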
26 Parallel Circuitscape Results Pumas in southern California: –12 million nodes –Under 1 hour (16 processors) –Original code took 3 days at coarser resolution Targeting much larger problems: –Yellowstone-to-Yukon corridor Figures courtesy of Brad McRae, NCEAS
27 Sparse Matrix times Sparse Matrix A primitive in many array-based graph algorithms: –Parallel breadth-first search –Shortest paths –Graph contraction –Subgraph / submatrix indexing –Etc. Graphs are often not mesh-like, i.e. they lack geometric locality and good separators. Often do not want to optimize for one repeated operation, as in matvec for iterative methods.
28 Sparse Matrix times Sparse Matrix Current work: –Parallel algorithms with 2D data layout –Sequential and parallel hypersparse algorithms –Matrices over semirings
29 ParSpGEMM C(I,J) += A(I,K)*B(K,J) Based on SUMMA. Simple for non-square matrices, etc. [Figure: block product C(I,J) = A(I,K) * B(K,J), with block row I, block column J, and inner block index K]
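The block update C(I,J) += A(I,K)*B(K,J) can be simulated serially to check that the blocked product agrees with the ordinary one. This is only a sketch of the arithmetic: the sizes and the index-range names (`ri`, `rj`, `rk`) are assumptions, and there is no communication or 2D processor grid here.

```matlab
% Serial simulation of the blocked update C(I,J) += A(I,K)*B(K,J) (toy sizes).
n = 8; bs = 4;                        % 2-by-2 grid of 4-by-4 blocks
nb = n / bs;
A = sprand(n, n, 0.3);
B = sprand(n, n, 0.3);
C = sparse(n, n);
for K = 1:nb                          % outer loop over the inner block index
    rk = (K-1)*bs + (1:bs);
    for I = 1:nb
        ri = (I-1)*bs + (1:bs);
        for J = 1:nb
            rj = (J-1)*bs + (1:bs);
            C(ri, rj) = C(ri, rj) + A(ri, rk) * B(rk, rj);
        end
    end
end
err = full(max(max(abs(C - A*B))));   % largest entrywise difference from A*B
```

In a parallel setting the (I,J) pairs would correspond to processors and each stage K to a broadcast of one block column of A and one block row of B.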
30 How Sparse? HyperSparse! Any local data structure that depends on local submatrix dimension n (such as CSR or CSC) is too wasteful. [Figure: a matrix with nnz(j) = c split into blocks; the per-block value of nnz(j) is missing from the transcript]
31 SparseDComp Data Structure “Doubly compressed” data structure Maintains both DCSC and DCSR C = A*B needs only A.DCSC and B.DCSR 4*nnz values communicated for A*B in the worst case (though we usually get away with much less)
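One way to picture a doubly compressed column layout is to store column pointers only for columns that contain a nonzero. The sketch below is an assumption about what such a layout could look like, not the SparseDComp implementation; the array names `jc`, `cp`, `ir`, and `num` are hypothetical.

```matlab
% Sketch of a DCSC-style layout: pointers only for nonempty columns (assumed design).
n = 1000;
A = sparse([3 7 3], [10 10 900], [1.5 2.5 3.5], n, n);  % 3 nonzeros in 2 columns
[ir, jcol, num] = find(A);             % row ids, column ids, values (column-major order)
[jc, first] = unique(jcol, 'first');   % ids of nonempty columns, start of each in ir/num
cp = [first; numel(ir) + 1];           % column pointers, one per nonempty column plus end
words_dcsc = numel(jc) + numel(cp) + 2 * numel(ir);
words_csc  = (n + 1) + 2 * numel(ir);  % ordinary CSC needs n+1 column pointers
```

Here the doubly compressed arrays take 11 words against 1007 for CSC, which is the kind of gap that matters when local blocks have far fewer nonzeros than columns.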
32 Sequential Operation Counts Matlab: O(n + nnz(B) + f). SpGEMM: O(nzc(A) + nzr(B) + f*log k). Here f is the number of required nonzero operations (flops) and nzc(A) is the number of columns of A containing at least one nonzero. [Plot annotation: break-even point]
34 Matrices over Semirings Matrix multiplication $C = AB$ (or matrix/vector): $C_{i,j} = A_{i,1} B_{1,j} + A_{i,2} B_{2,j} + \cdots + A_{i,n} B_{n,j}$. Replace scalar operations $\times$ and $+$ by $\otimes$: associative, distributes over $\oplus$, identity 1; $\oplus$: associative, commutative, identity 0; 0 annihilates under $\otimes$. Then $C_{i,j} = A_{i,1} \otimes B_{1,j} \oplus A_{i,2} \otimes B_{2,j} \oplus \cdots \oplus A_{i,n} \otimes B_{n,j}$. Examples: $(\times, +)$; (and, or); $(+, \min)$; ... Same data reference pattern and control flow.
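The (+, min) example can be made concrete with a small dense sketch: scalar multiplication becomes +, scalar addition becomes min, and Inf plays the role of the semiring zero. The 3-vertex weight matrix is hypothetical, and a sparse implementation would store only the finite entries.

```matlab
% Min-plus "matrix multiplication" on a hypothetical 3-vertex weighted graph.
% D(i,j) is the edge weight i -> j, Inf where there is no edge, 0 on the diagonal.
D = [0   3   Inf;
     Inf 0   1;
     2   Inf 0];
n = size(D, 1);
D2 = Inf(n, n);
for i = 1:n
    for j = 1:n
        % Same loop structure as ordinary matmul: "product" is +, "sum" is min.
        D2(i, j) = min(D(i, :) + D(:, j)');
    end
end
```

Entry D2(i,j) is the length of the shortest path from i to j that uses at most two edges; for instance D2(1,3) = 4 via vertex 2, although there is no direct edge from 1 to 3.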
35 Remarks Tools for combinatorial methods built on parallel sparse matrix infrastructure Easy-to-use interactive programming environment –Rapid prototyping tool for algorithm development –Interactive exploration and visualization of data Sparse matrix * sparse matrix is a key primitive Matrices over semirings like (min,+) as well as (+,*)