CS267 L14 Graph Partitioning I.1 Demmel Sp 1999 CS 267 Applications of Parallel Computers Lecture 14: Graph Partitioning - I James Demmel

Slides:



Advertisements
Similar presentations
Algorithms (and Datastructures) Lecture 3 MAS 714 part 2 Hartmut Klauck.
Advertisements

Great Theoretical Ideas in Computer Science
Planar graphs Algorithms and Networks. Planar graphs2 Can be drawn on the plane without crossings Plane graph: planar graph, given together with an embedding.
Graph Isomorphism Algorithms and networks. Graph Isomorphism 2 Today Graph isomorphism: definition Complexity: isomorphism completeness The refinement.
Searching on Multi-Dimensional Data
Graphs Graphs are the most general data structures we will study in this course. A graph is a more general version of connected nodes than the tree. Both.
Data Structure and Algorithms (BCS 1223) GRAPH. Introduction of Graph A graph G consists of two things: 1.A set V of elements called nodes(or points or.
Online Social Networks and Media. Graph partitioning The general problem – Input: a graph G=(V,E) edge (u,v) denotes similarity between u and v weighted.
Junction Trees: Motivation Standard algorithms (e.g., variable elimination) are inefficient if the undirected graph underlying the Bayes Net contains cycles.
CS267 L17 Graph Partitioning III.1 Demmel Sp 1999 CS 267 Applications of Parallel Computers Lecture 17: Graph Partitioning - III James Demmel
Easy Optimization Problems, Relaxation, Local Processing for a single variable.
© University of Minnesota Data Mining for the Discovery of Ocean Climate Indices 1 CSci 8980: Data Mining (Fall 2002) Vipin Kumar Army High Performance.
Nyhoff, ADTs, Data Structures and Problem Solving with C++, Second Edition, © 2005 Pearson Education, Inc. All rights reserved Graphs.
SubSea: An Efficient Heuristic Algorithm for Subgraph Isomorphism Vladimir Lipets Ben-Gurion University of the Negev Joint work with Prof. Ehud Gudes.
CS267 L15 Graph Partitioning II.1 Demmel Sp 1999 CS 267 Applications of Parallel Computers Lecture 15: Graph Partitioning - II James Demmel
CS267 L15 Graph Partitioning II.1 Demmel Sp 1999 CS 267 Applications of Parallel Computers Lecture 15: Graph Partitioning - II James Demmel
CISC220 Fall 2009 James Atlas Nov 13: Graphs, Line Intersections.
CSE 589 Applied Algorithms Course Introduction. CSE Lecture 1 - Spring Instructors Instructor –Richard Ladner –206.
Module #1 - Logic 1 Based on Rosen, Discrete Mathematics & Its Applications. Prepared by (c) , Michael P. Frank and Modified By Mingwu Chen Trees.
02/23/2005CS267 Lecture 101 CS 267: Applications of Parallel Computers Graph Partitioning James Demmel
15-853Page :Algorithms in the Real World Separators – Introduction – Applications.
03/01/2011CS267 Lecture 131 CS 267: Applications of Parallel Computers Graph Partitioning James Demmel and Kathy Yelick
Domain decomposition in parallel computing Ashok Srinivasan Florida State University COT 5410 – Spring 2004.
Graph partition in PCB and VLSI physical synthesis Lin Zhong ELEC424, Fall 2010.
Abhiram Ranade IIT Bombay
GRAPH Learning Outcomes Students should be able to:
CS 450: Computer Graphics REVIEW: OVERVIEW OF POLYGONS
296.3:Algorithms in the Real World
Graph Partitioning Problem Kernighan and Lin Algorithm
CSE554AlignmentSlide 1 CSE 554 Lecture 5: Alignment Fall 2011.
Graph Algorithms. Definitions and Representation An undirected graph G is a pair (V,E), where V is a finite set of points called vertices and E is a finite.
15-853Page1 Separatosr in the Real World Practice Slides combined from
Nyhoff, ADTs, Data Structures and Problem Solving with C++, Second Edition, © 2005 Pearson Education, Inc. All rights reserved Graphs.
CS 584. Load Balancing Goal: All processors working all the time Efficiency of 1 Distribute the load (work) to meet the goal Two types of load balancing.
Week 11 - Monday.  What did we talk about last time?  Binomial theorem and Pascal's triangle  Conditional probability  Bayes’ theorem.
CSE 589 Part VI. Reading Skiena, Sections 5.5 and 6.8 CLR, chapter 37.
Unit – V Graph theory. Representation of Graphs Graph G (V, E,  ) V Set of vertices ESet of edges  Function that assigns vertices {v, w} to each edge.
CS 484 Load Balancing. Goal: All processors working all the time Efficiency of 1 Distribute the load (work) to meet the goal Two types of load balancing.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley. Ver Chapter 13: Graphs Data Abstraction & Problem Solving with C++
Domain decomposition in parallel computing Ashok Srinivasan Florida State University.
Spectral Partitioning: One way to slice a problem in half C B A.
Graphs, Vectors, and Matrices Daniel A. Spielman Yale University AMS Josiah Willard Gibbs Lecture January 6, 2016.
CSE 421 Algorithms Richard Anderson Winter 2009 Lecture 5.
Community structure in graphs Santo Fortunato. More links “inside” than “outside” Graphs are “sparse” “Communities”
Chapter 20: Graphs. Objectives In this chapter, you will: – Learn about graphs – Become familiar with the basic terminology of graph theory – Discover.
Network Partition –Finding modules of the network. Graph Clustering –Partition graphs according to the connectivity. –Nodes within a cluster is highly.
CSE 421 Algorithms Richard Anderson Autumn 2015 Lecture 5.
CS 140: Sparse Matrix-Vector Multiplication and Graph Partitioning
High Performance Computing Seminar
Algorithms and Networks
Great Theoretical Ideas In Computer Science
CS 290H Administrivia: April 16, 2008
Graph theory Definitions Trees, cycles, directed graphs.
Haim Kaplan and Uri Zwick
CS 267: Applications of Parallel Computers Graph Partitioning
Mean Shift Segmentation
CS 240A: Graph and hypergraph partitioning
CS 267: Applications of Parallel Computers Graph Partitioning
I206: Lecture 15: Graphs Marti Hearst Spring 2012.
Randomized Algorithms CS648
CS 267: Applications of Parallel Computers Graph Partitioning
Haim Kaplan and Uri Zwick
James Demmel 11/30/2018 Graph Partitioning James Demmel 04/30/2010 CS267,
CS 267: Applications of Parallel Computers Graph Partitioning
Graphs Part 2 Adjacency Matrix
CS 267: Applications of Parallel Computers Graph Partitioning
Read GLN sections 6.1 through 6.4.
CS 267: Applications of Parallel Computers Graph Partitioning
Richard Anderson Winter 2019 Lecture 5
James Demmel CS 267 Applications of Parallel Computers Lecture 14: Graph Partitioning - I James Demmel.
Presentation transcript:

CS267 L14 Graph Partitioning I.1 Demmel Sp 1999 CS 267 Applications of Parallel Computers Lecture 14: Graph Partitioning - I James Demmel

CS267 L14 Graph Partitioning I.2 Demmel Sp 1999 Outline of Graph Partitioning Lectures °Review definition of Graph Partitioning problem °Outline of Heuristics °Partitioning with Nodal Coordinates °Partitioning without Nodal Coordinates °Multilevel Acceleration BIG IDEA, will appear often in course °Available Software good sequential and parallel software availble °Comparison of Methods °Applications

CS267 L14 Graph Partitioning I.3 Demmel Sp 1999 Definition of Graph Partitioning °Given a graph G = (N, E, W N, W E ) N = nodes (or vertices), E = edges W N = node weights, W E = edge weights °Ex: N = {tasks}, W N = {task costs}, edge (j,k) in E means task j sends W E (j,k) words to task k °Choose a partition N = N 1 U N 2 U … U N P such that The sum of the node weights in each N j is “about the same” The sum of all edge weights of edges connecting all different pairs N j and N k is minimized °Ex: balance the work load, while minimizing communication °Special case of N = N 1 U N 2 : Graph Bisection

CS267 L14 Graph Partitioning I.4 Demmel Sp 1999 Applications °Load Balancing while Minimizing Communication °Sparse Matrix time Vector Multiplication Solving PDEs N = {1,…,n}, (j,k) in E if A(j,k) nonzero, W N (j) = #nonzeros in row j, W E (j,k) = 1 °VLSI Layout N = {units on chip}, E = {wires}, W E (j,k) = wire length °Telephone network design Original application, algorithm due to Kernighan °Sparse Gaussian Elimination Used to reorder rows and columns to increase parallelism, decrease “fill-in” °Physical Mapping of DNA °Evaluating Bayesian Networks?

CS267 L14 Graph Partitioning I.5 Demmel Sp 1999 Sparse Matrix Vector Multiplication

CS267 L14 Graph Partitioning I.6 Demmel Sp 1999 Cost of Graph Partitioning °Many possible partitionings to search: °n choose n/2 ~ sqrt(2n/pi)*2 n bisection possibilities °Choosing optimal partitioning is NP-complete Only known exact algorithms have cost = exponential(n) °We need good heuristics

CS267 L14 Graph Partitioning I.7 Demmel Sp 1999 First Heuristic: Repeated Graph Bisection °To partition N into 2k parts, bisect graph recursively k times Henceforth discuss mostly graph bisection

CS267 L14 Graph Partitioning I.8 Demmel Sp 1999 Overview of Partitioning Heuristics for Bisection °Partitioning with Nodal Coordinates Each node has x,y,z coordinates Partition nodes by partitioning space °Partitioning without Nodal Coordinates Sparse matrix of Web: A(j,k) = # times keyword j appears in URL k °Multilevel acceleration (BIG IDEA) Approximate problem by “coarse graph”, do so recursively

CS267 L14 Graph Partitioning I.9 Demmel Sp 1999 Edge Separators vs. Vertex Separators of G(N,E) °Edge Separator: E s (subset of E) separates G if removing E s from E leaves two ~equal-sized, disconnected components of N: N 1 and N 2 °Vertex Separator: N s (subset of N) separates G if removing N s and all incident edges leaves two ~equal-sized, disconnected components of N: N 1 and N 2 °Making an N s from an E s : pick one endpoint of each edge in E s How big can |N s | be, compared to |E s | ? °Making an E s from an N s : pick all edges incident on N s How big can |E s | be, compared to |N s | ? °We will find Edge or Vertex Separators, as convenient E s = green edges or blue edges N s = red vertices

CS267 L14 Graph Partitioning I.10 Demmel Sp 1999 Graphs with Nodal Coordinates - Planar graphs °Planar graph can be drawn in plane without edge crossings °Ex: m x m grid of m 2 nodes:  vertex separator  N s with |N s | = m = sqrt(|N|) (see last slide for m=5 ) °Theorem (Tarjan, Lipton, 1979): If G is planar,  N s such that N = N 1 U N s U N 2 is a partition, |N 1 | <= 2/3 |N| and |N 2 | <= 2/3 |N| |N s | <= sqrt(8 * |N|) °Theorem motivates intuition of following algorithms

CS267 L14 Graph Partitioning I.11 Demmel Sp 1999 Graphs with Nodal Coordinates: Inertial Partitioning °For a graph in 2D, choose line with half the nodes on one side and half on the other In 3D, choose a plane, but consider 2D for simplicity °Choose a line L, and then choose an L  perpendicular to it, with half the nodes on either side °Remains to choose L 1) L given by a*(x-xbar)+b*(y-ybar)=0, with a 2 +b 2 =1; (a,b) is unit vector  to L 2) For each nj = (xj,yj), compute coordinate S j = -b*(x j -xbar) + a*(y j -ybar) along L 3) Let Sbar = median(S 1,…,S n ) 4) Let nodes with S j < Sbar be in N 1, rest in N 2

CS267 L14 Graph Partitioning I.12 Demmel Sp 1999 Inertial Partitioning: Choosing L °Clearly prefer L on left below °Mathematically, choose L to be a total least squares fit of the nodes Minimize sum of squares of distances to L (green lines on last slide) Equivalent to choosing L as axis of rotation that minimizes the moment of inertia of nodes (unit weights) - source of name

CS267 L14 Graph Partitioning I.13 Demmel Sp 1999 Inertial Partitioning: choosing L (continued)  j (length of j-th green line) 2 =  j [ (x j - xbar) 2 + (y j - ybar) 2 - (-b*(x j - xhar) + a*(y j - ybar)) 2 ] … Pythagorean Theorem = a 2 *  j (x j - xbar) 2 + 2*a*b*  j (x j - xbar)*(x j - ybar) + b 2  j (y j - ybar) 2 = a 2 * X1 + 2*a*b* X2 + b 2 * X3 = [a b] * X1 X2 * a X2 X3 b Minimized by choosing (xbar, ybar) = (  j x j,  j y j ) / N = center of mass (a,b) = eigenvector of smallest eigenvalue of X1 X2 X2 X3 (a,b) is unit vector perpendicular to L

CS267 L14 Graph Partitioning I.14 Demmel Sp 1999 Graphs with Nodal Coordinates: Random Spheres °Emulate Lipton/Tarjan in higher dimensions than planar (2D) °Take an n by n by n mesh of |N| = n 3 nodes Edges to 6 nearest neighbors Partition by taking plane parallel to 2 axes Cuts n 2 =|N| 2/3 = O(|E| 2/3 ) edges

CS267 L14 Graph Partitioning I.15 Demmel Sp 1999 Random Spheres: Well Shaped Graphs °Need Notion of “well shaped” graphs in 3D, higher D Any graph fits in 3D without edge crossings! °Approach due to Miller, Teng, Thurston, Vavasis °Def: A k-ply neighborhood system in d dimensions is a set {D 1,…,D n } of closed disks in R d such that no point in R d is strictly interior to more than k disks °Def: An ( ,k) overlap graph is a graph defined in terms of  >= 1 and a k-ply neighborhood system {D 1,…,D n }: There is a node for each D j, and an edge from j to k if expanding the radius of the smaller of D j and D k by >  causes the two disks to overlap Ex: n-by-n mesh is a (1,1) overlap graph Ex: Any planar graph is ( ,k) overlap for some ,k

CS267 L14 Graph Partitioning I.16 Demmel Sp 1999 Generalizing Lipton/Tarjan to higher dimensions °Theorem (Miller, Teng, Thurston, Vavasis, 1993): Let G=(N,E) be an ( ,k) overlap graph in d dimensions with n=|N|. Then there is a vertex separator N s such that N = N 1 U N s U N 2 and N 1 and N 2 each has at most n*(d+1)/(d+2) nodes N s has at most O(  * k 1/d * n (d-1)/d ) nodes °When d=2, same as Lipton/Tarjan °Algorithm: Choose a sphere S in R d Edges that S “cuts” form edge separator E s Build N s from E s Choose “randomly”, so that it satisfies Theorem with high probability

CS267 L14 Graph Partitioning I.17 Demmel Sp 1999 Stereographic Projection °Stereographic projection from plane to sphere In d=2, draw line from p to North Pole, projection p’ of p is where the line and sphere intersect Similar in higher dimensions

CS267 L14 Graph Partitioning I.18 Demmel Sp 1999 Choosing a Random Sphere °Do stereographic projection from R d to sphere in R d+1 °Find centerpoint of projected points Any plane through centerpoint divides points ~evenly There is a linear programming algorithm, cheaper heuristics °Conformally map points on sphere Rotate points around origin so centerpoint at (0,…0,r) for some r Dilate points (unproject, multiply by sqrt((1-r)/(1+r)), project) -this maps centerpoint to origin (0,…,0) °Pick a random plane through origin Intersection of plane and sphere is circle °Unproject circle yields desired circle C in R d °Create N s : j belongs to N s if  *D j intersects C

CS267 L14 Graph Partitioning I.19 Demmel Sp 1999 Example of Random Sphere Algorithm (Gilbert)

CS267 L14 Graph Partitioning I.20 Demmel Sp 1999 Partitioning with Nodal Coordinates - Summary °Other variations on these algorithms °Algorithms are efficient °Rely on graphs having nodes connected (mostly) to “nearest neighbors” in space algorithm does not depend on where actual edges are! °Common when graph arises from physical model °Can be used as good starting guess for subsequent partitioners, which do examine edges °Can do poorly if graph less connected: °Details at (tech reports and SW) www-sal.cs.uiuc.edu/~steng

CS267 L14 Graph Partitioning I.21 Demmel Sp 1999 Partitioning without Nodal Coordinates- Breadth First Search (BFS) °Given G(N,E) and a root node r in N, BFS produces A subgraph T of G (same nodes, subset of edges) T is a tree rooted at r Each node assigned a level = distance from r

CS267 L14 Graph Partitioning I.22 Demmel Sp 1999 Breadth First Search °Queue (First In First Out, or FIFO) Enqueue(x,Q) adds x to back of Q x = Dequeue(Q) removes x from front of Q °Compute Tree T(N T,E T ) N T = {(r,0)}, E T = empty set … Initially T = root r, which is at level 0 Enqueue((r,0),Q) … Put root on initially empty Queue Q Mark r … Mark root as having been processed While Q not empty … While nodes remain to be processed (n,level) = Dequeue(Q) … Get a node to process For all unmarked children c of n N T = N T U (c,level+1) … Add child c to N T E T = E T U (n,c) … Add edge (n,c) to E T Enqueue((c,level+1),Q)) … Add child c to Q for processing Mark c … Mark c as processed Endfor Endwhile

CS267 L14 Graph Partitioning I.23 Demmel Sp 1999 Partitioning via Breadth First Search °BFS identifies 3 kinds of edges Tree Edges - part of T Horizontal Edges - connect nodes at same level Interlevel Edges - connect nodes at adjacent levels °No edges connect nodes in levels differing by more than 1 (why?) °BFS partioning heuristic N = N 1 U N 2, where -N 1 = {nodes at level <= L}, -N 2 = {nodes at level > L} Choose L so |N 1 | close to |N 2 |

CS267 L14 Graph Partitioning I.24 Demmel Sp 1999 Partitioning without nodal coordinates - Kernighan/Lin °Take a initial partition and iteratively improve it Kernighan/Lin (1970), cost = O(|N| 3 ) but easy to understand -What else did Kernighan invent? Fiduccia/Mattheyses (1982), cost = O(|E|), much better, but more complicated °Given G = (N,E,W E ) and a partitioning N = A U B, where |A| = |B| T = cost(A,B) =  {W(e) where e connects nodes in A and B} Find subsets X of A and Y of B with |X| = |Y| Swapping X and Y should decrease cost: -newA = A - X U Y and newB = B - Y U X -newT = cost(newA, newB) < cost(A,B) °Need to compute newT efficiently for many possible X and Y, choose smallest

CS267 L14 Graph Partitioning I.25 Demmel Sp 1999 Kernighan/Lin - Preliminary Definitions °T = cost(A, B), newT = cost(newA, newB) °Need an efficient formula for newT; will use E(a) = external cost of a in A = S {W(a,b) for b in B} I(a) = internal cost of a in A = S {W(a,a’) for other a’ in A} D(a) = cost of a in A = E(a) - I(a) E(b), I(b) and D(b) defined analogously for b in B °Consider swapping X = {a} and Y = {b} newA = A - {a} U {b}, newB = B - {b} U {a} °newT = T - ( D(a) + D(b) - 2*w(a,b) ) = T - gain(a,b) gain(a,b) measures improvement gotten by swapping a and b °Update formulas newD(a’) = D(a’) + 2*w(a’,a) - 2*w(a’,b) for a’ in A, a’ != a newD(b’) = D(b’) + 2*w(b’,b) - 2*w(b’,a) for b’ in B, b’ != b

CS267 L14 Graph Partitioning I.26 Demmel Sp 1999 Kernighan/Lin Algorithm Compute T = cost(A,B) for initial A, B … cost = O(|N| 2 ) Repeat Compute costs D(n) for all n in N … cost = O(|N| 2 ) Unmark all nodes in N … cost = O(|N|) While there are unmarked nodes … |N|/2 iterations Find an unmarked pair (a,b) maximizing gain(a,b) … cost = O(|N| 2 ) Mark a and b (but do not swap them) … cost = O(1) Update D(n) for all unmarked n, as though a and b had been swapped … cost = O(|N|) Endwhile … At this point we have computed a sequence of pairs … (a1,b1), …, (ak,bk) and gains gain(1),…., gain(k) … for k = |N|/2, ordered by the order in which we marked them Pick j maximizing Gain =  k=1 to j gain(k) … cost = O(|N|) … Gain is reduction in cost from swapping (a1,b1) through (aj,bj) If Gain > 0 then … it is worth swapping Update newA = A - { a1,…,ak } U { b1,…,bk } … cost = O(|N|) Update newB = B - { b1,…,bk } U { a1,…,ak } … cost = O(|N|) Update T = T - Gain … cost = O(1) endif Until Gain <= 0 One pass greedily computes |N|/2 possible X and Y to swap, picks best