
Graph Algorithms

Overview

Graphs are very general data structures – data structures such as dense and sparse matrices, sets, multisets, etc. can be viewed as representations of graphs.

Algorithms on matrices/sets/etc. can usually be interpreted as graph algorithms
– but it may or may not be useful to do this
– sparse matrix algorithms can be usefully viewed as graph algorithms

Some graph algorithms can be interpreted as matrix algorithms
– but it may or may not be useful to do this
– may be useful if the graph structure is fixed, as in graph analytics applications: many of these applications can be formulated as sparse matrix-vector products

Graph-matrix duality

Graph (V,E) as a matrix:
– Choose an ordering of the vertices and number them sequentially
– Fill in a |V| x |V| matrix, called the "adjacency matrix" of the graph

Observations:
– Diagonal entries: weights on self-loops
– Symmetric matrix ↔ undirected graph
– Lower triangular matrix ↔ no edges from lower-numbered nodes to higher-numbered nodes
– Dense matrix ↔ clique (edge between every pair of nodes)

[Figure: example graph with edge weights a–g and its adjacency matrix]
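The construction described above can be sketched in a few lines of Python. This is an illustrative sketch, not code from the slides; the vertex names, weights, and the `adjacency_matrix` helper are invented for the example.

```python
def adjacency_matrix(vertices, edges):
    """vertices: list of vertex names; edges: dict mapping (u, v) -> weight."""
    index = {v: i for i, v in enumerate(vertices)}   # choose an ordering
    n = len(vertices)
    A = [[0] * n for _ in range(n)]                  # |V| x |V| matrix
    for (u, v), w in edges.items():
        A[index[u]][index[v]] = w                    # row u, column v
    return A

# Hypothetical example: a small weighted digraph.
V = ["a", "b", "c"]
E = {("a", "b"): 2, ("b", "c"): 3, ("a", "c"): 5}
A = adjacency_matrix(V, E)
# A[0][1] == 2: the weight on edge (a, b)
```

For an undirected graph, each edge would be stored in both directions, which is exactly the symmetric-matrix case noted above.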

Matrix-vector multiplication

Matrix computation: y = Ax

Graph interpretation:
– Each node i has two values (labels), x(i) and y(i)
– Each node i updates its label y using the x value from each of its neighbors j, scaled by the weight on edge (i,j)

Observation:
– The graph perspective shows that dense MVM is just a special case of sparse MVM

[Figure: example graph, its adjacency matrix A, and the vector x]
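The graph-style view of y = Ax can be sketched as follows: iterate only over stored (nonzero) edges, each contributing one scaled neighbor value. This is an illustrative sketch; the `graph_mvm` name and the example edges are invented.

```python
def graph_mvm(n, edges, x):
    """Compute y = Ax where edges holds only the nonzeros of A,
    as a dict mapping (i, j) -> A[i][j]."""
    y = [0.0] * n
    for (i, j), w in edges.items():
        y[i] += w * x[j]          # node i reads neighbor j's x label
    return y

# A 3x3 matrix with only two nonzeros: A[0][1]=2, A[1][2]=3.
edges = {(0, 1): 2.0, (1, 2): 3.0}
y = graph_mvm(3, edges, [1.0, 1.0, 1.0])
```

A dense matrix is just the case where every (i, j) pair appears in `edges` – the same loop covers both, which is the observation on the slide.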

Graph set/multiset duality

A set/multiset is isomorphic to a graph with labeled nodes and no edges – the "opposite" of a clique.

Algorithms on sets/multisets can be viewed as graph algorithms. There is usually no particular advantage to doing this, but it shows the generality of graph algorithms.

[Figure: the set {a,c,f,e,b} drawn as a graph of five isolated labeled nodes]

Graph algorithm examples

Problem: single-source shortest-path (SSSP) computation

Formulation:
– Given an undirected graph with positive weights on edges, and a node called the source
– Compute the shortest distance from the source to every other node

Variations:
– Negative edge weights, but no negative-weight cycles
– All-pairs shortest paths

Applications:
– GPS devices for driving directions
– Social network analyses: centrality metrics

[Figure: example graph with nodes A–H; node A is the source]

SSSP Problem

Many algorithms:
– Dijkstra (1959)
– Bellman-Ford (1957)
– Chaotic relaxation (1969)
– Delta-stepping (1998)

Common structure:
– Each node has a label d containing the shortest known distance to that node from the source, initialized to 0 for the source and infinity for all other nodes
– Key operations:
  relax-edge(u,v): if d[v] > d[u]+w(u,v) then d[v] ← d[u]+w(u,v)
  relax-node(u): relax all edges connected to u

[Figure: example graph; the source is labeled 0 and all other nodes ∞]
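The two key operations above can be sketched directly. This is an illustrative sketch, assuming a graph stored as an adjacency dict and edge weights in a separate dict; all names here are invented for the example.

```python
import math

def relax_edge(d, w, u, v):
    """If the path through u improves v's label, lower it."""
    if d[v] > d[u] + w[(u, v)]:
        d[v] = d[u] + w[(u, v)]
        return True               # label changed: v becomes active
    return False

def relax_node(graph, d, w, u):
    """Relax all edges connected to u."""
    for v in graph[u]:
        relax_edge(d, w, u, v)

# Tiny example: one edge A-B with weight 5.
graph = {"A": ["B"], "B": ["A"]}
w = {("A", "B"): 5, ("B", "A"): 5}
d = {"A": 0, "B": math.inf}       # source A gets 0, all others infinity
relax_node(graph, d, w, "A")
```

All four SSSP algorithms on this slide apply exactly these operations; they differ only in the order in which nodes and edges are chosen.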

SSSP algorithms (I)

Dijkstra's algorithm (1959):
– Priority queue of nodes, ordered by the shortest distance known to each node
– Iterate over nodes in priority order
– Each node is relaxed just once
– Work-efficient: O(|E|*lg(|V|))

Active nodes:
– The nodes in the priority queue: their label has been lowered, but they have not yet been relaxed

Key data structures:
– Graph
– Work set/multiset: ordered (priority queue)

Parallelism in the algorithm:
– The edges connected to a node can be relaxed in parallel
– It is difficult to relax multiple nodes from the priority queue in parallel
– Little parallelism for sparse graphs

An ordered algorithm.

[Figure: example graph with nodes A–H and the priority queue of active nodes]
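A sequential sketch of Dijkstra's algorithm using Python's binary-heap module; the graph layout is invented for the example. Since `heapq` has no decrease-key, the sketch pushes duplicate entries and skips stale ones, which preserves the "each node relaxed just once" property.

```python
import heapq
import math

def dijkstra(graph, source):
    """graph: dict mapping node -> list of (neighbor, weight)."""
    d = {v: math.inf for v in graph}
    d[source] = 0
    pq = [(0, source)]                # priority queue ordered by distance
    while pq:
        du, u = heapq.heappop(pq)     # active node with smallest label
        if du > d[u]:
            continue                  # stale entry: u was already relaxed
        for v, w in graph[u]:         # relax-node(u)
            if d[v] > du + w:
                d[v] = du + w
                heapq.heappush(pq, (d[v], v))
    return d

g = {"A": [("B", 1), ("C", 4)], "B": [("C", 2)], "C": []}
d = dijkstra(g, "A")                  # shortest A-to-C path goes through B
```

Note how the priority queue serializes node relaxations: that single ordered pop is exactly where the slide says parallelism is hard to extract.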

SSSP algorithms (II)

Chaotic relaxation (1969):
– Use a set to track active nodes
– Iterate over the nodes in any order
– Nodes can be relaxed many times, so it may do more work than Dijkstra

Key data structures:
– Graph
– Work set/multiset: unordered

Parallelization:
– Process multiple work-set nodes at once
– Requires concurrent data structures:
  concurrent set/multiset: elements are added/removed correctly
  concurrent graph: simultaneous updates to a node happen correctly

An unordered algorithm.

[Figure: example graph with nodes A–H and the unordered set of active nodes]
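A sequential sketch of chaotic relaxation with an unordered worklist; names and the example graph are invented. `set.pop()` removes an arbitrary element, which models the "any order" property, and a node re-enters the worklist every time its label drops.

```python
import math

def chaotic_sssp(graph, source):
    """graph: dict mapping node -> list of (neighbor, weight)."""
    d = {v: math.inf for v in graph}
    d[source] = 0
    work = {source}                   # unordered set of active nodes
    while work:
        u = work.pop()                # any order gives a correct answer
        for v, w in graph[u]:
            if d[v] > d[u] + w:
                d[v] = d[u] + w
                work.add(v)           # v's label changed: reactivate it
    return d

g = {"A": [("B", 1), ("C", 4)], "B": [("C", 2)], "C": []}
d = chaotic_sssp(g, "A")
```

Unlike Dijkstra, a node here can be popped and relaxed several times (C may first be set to 4 and later lowered to 3), which is the extra work the slide mentions; in exchange, many worklist nodes could in principle be processed concurrently.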

SSSP algorithms (II contd.)

Need for synchronization at graph nodes:
– Suppose nodes B and C are relaxed simultaneously; both relaxations may update the value at D
– The value at D is infinity. The relax-C operation reads this value and wants to update it to 3; at the same time, the relax-B operation reads D's value and wants to update it to 12
– If the two updates are not sequenced properly, the final value at D after both relaxations may be 12, which is incorrect
– One solution: ensure that the "read-modify-write" in edge relaxation is atomic – no other thread can read or write that location while the read-modify-write is happening

Synchronization is also needed at the node being relaxed, to ensure that its value is not changed by some other core while the node relaxation is going on.

[Figure: nodes B and C active in the set; both have edges to D, whose label is ∞]
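The race described above can be sketched with one lock per node, making the read-modify-write atomic. This is an illustrative sketch, not the Galois implementation; a real system would more likely use a hardware compare-and-swap, and the weights 12 and 3 mirror the slide's scenario.

```python
import math
import threading

def relax_edge_atomic(d, locks, w, u, v):
    """Atomic read-modify-write of d[v]: the compare and the update
    happen under v's lock, so no concurrent relaxation can interleave."""
    with locks[v]:
        if d[v] > d[u] + w[(u, v)]:
            d[v] = d[u] + w[(u, v)]

d = {"B": 4, "C": 0, "D": math.inf}   # pretend B and C are already labeled
locks = {v: threading.Lock() for v in d}
w = {("B", "D"): 8, ("C", "D"): 3}    # B offers D a path of length 4+8=12

t1 = threading.Thread(target=relax_edge_atomic, args=(d, locks, w, "B", "D"))
t2 = threading.Thread(target=relax_edge_atomic, args=(d, locks, w, "C", "D"))
t1.start(); t2.start()
t1.join(); t2.join()
# d["D"] ends up 3 regardless of which thread runs first.
```

Without the lock, both threads could read ∞, and whichever wrote last would win – possibly leaving D at 12, exactly the lost-update bug described above.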

SSSP algorithms (III)

Delta-stepping (1998):
– A variation of chaotic relaxation
– Active nodes currently closer to the source are more likely to be chosen for processing from the set

Work-set/multiset:
– Parameter: Δ
– A sequence of sets: nodes whose current distance is between nΔ and (n+1)Δ are put in the n-th set
– Nodes in each set are processed in parallel
– Nodes in set n are completed before processing of nodes in set (n+1) starts
– Δ = 1: Dijkstra; Δ = ∞: chaotic relaxation

Picking an optimal Δ:
– Depends on the graph and the machine
– Determined experimentally

[Figure: example graph with nodes A–H and the sequence of Δ-buckets]
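A sequential sketch of the bucket structure; all names are invented, and the real algorithm's light/heavy edge split is omitted for brevity. Bucket n holds nodes with current distance in [nΔ, (n+1)Δ); with positive weights a relaxation never sends a node to a bucket below the one being drained, so draining buckets in increasing order is safe.

```python
import math
from collections import defaultdict

def delta_stepping(graph, source, delta):
    """graph: dict mapping node -> list of (neighbor, weight)."""
    d = {v: math.inf for v in graph}
    d[source] = 0.0
    buckets = defaultdict(set)        # bucket n: distances in [n*delta, (n+1)*delta)
    buckets[0].add(source)
    n = 0
    while any(buckets.values()):
        while buckets[n]:             # finish bucket n before bucket n+1
            u = buckets[n].pop()      # nodes in one bucket: parallel in principle
            for v, w in graph[u]:
                nd = d[u] + w
                if nd < d[v]:
                    if d[v] < math.inf:
                        buckets[int(d[v] // delta)].discard(v)  # move v
                    d[v] = nd
                    buckets[int(nd // delta)].add(v)
        n += 1
    return d

g = {"A": [("B", 1), ("C", 4)], "B": [("C", 2)], "C": []}
d = delta_stepping(g, "A", delta=2)
```

With Δ = 1 each bucket holds one distance value, approximating Dijkstra's strict order; with Δ = ∞ everything lands in one bucket, giving chaotic relaxation – the two extremes named on the slide.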

SSSP algorithms (IV)

Bellman-Ford (1957):
– Iterate over all edges of the graph in any order, relaxing each edge
– Do this |V| times
– O(|E|*|V|)

Parallelization:
– Iterate over the set of edges
– Inspector-executor: once the input graph is given, use graph matching to generate a conflict-free schedule of edge relaxations
– Edges in a matching do not have nodes in common, so they can be relaxed without synchronization
– Barrier synchronization between successive stages in the schedule

Conflict-free schedule for the example graph (nodes A–H):
1. {(A,B),(C,D),(E,H)}
2. {(A,C),(B,D),(E,G),(F,H)}
3. {(D,E),(G,H)}
4. {(D,F)}
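The sequential core of Bellman-Ford can be sketched as follows; the edge list and names are invented for the example. Each pass relaxes every edge, and an early exit when a pass changes nothing is a common optimization (the parallel inspector-executor version would instead run each matching stage concurrently with barriers between stages).

```python
import math

def bellman_ford(vertices, edges, source):
    """edges: list of (u, v, weight) triples for a directed graph."""
    d = {v: math.inf for v in vertices}
    d[source] = 0
    for _ in range(len(vertices) - 1):   # |V|-1 passes always suffice
        changed = False
        for u, v, w in edges:            # any edge order works
            if d[u] + w < d[v]:
                d[v] = d[u] + w
                changed = True
        if not changed:
            break                        # converged early
    return d

edges = [("A", "B", 1), ("B", "C", 2), ("A", "C", 4)]
d = bellman_ford(["A", "B", "C"], edges, "A")
```

Because the inner loop touches every edge regardless of which labels changed, the algorithm is topology-driven rather than data-driven – the distinction the TAO slide draws later.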

Matching

Given a graph G = (V,E), a matching is a subset of edges such that no two edges in the subset have a node in common.
– e.g. {(A,B),(C,D),(E,H)}
– Not a matching: {(A,B),(A,C)}

Maximal matching: a matching to which no new edge can be added without destroying the matching property.
– e.g. {(A,B),(C,D),(E,H)}
– e.g. {(A,C),(B,D),(E,G),(F,H)}
– Can be computed in O(|E|) time using a simple greedy algorithm

Maximum matching: a matching that contains the largest number of edges.
– e.g. {(A,C),(B,D),(E,G),(F,H)}
– Can be computed in O(sqrt(|V|)*|E|) time

Conflict-free schedule for the example graph (nodes A–H):
1. {(A,B),(C,D),(E,H)}
2. {(A,C),(B,D),(E,G),(F,H)}
3. {(D,E),(G,H)}
4. {(D,F)}
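The O(|E|) greedy algorithm mentioned above can be sketched in a few lines: scan the edges once, keeping an edge whenever neither endpoint is already matched. The example edges are invented; the result depends on the scan order, so this yields a maximal (not necessarily maximum) matching.

```python
def maximal_matching(edges):
    """Greedy maximal matching over a list of (u, v) edges."""
    matched = set()
    matching = []
    for u, v in edges:
        if u not in matched and v not in matched:
            matching.append((u, v))   # keep the edge
            matched.add(u)
            matched.add(v)            # both endpoints now unavailable
    return matching

# (A,C) shares node A with the already-kept (A,B), so it is skipped.
m = maximal_matching([("A", "B"), ("A", "C"), ("C", "D")])
# m == [("A", "B"), ("C", "D")]
```

Repeatedly extracting a maximal matching and removing its edges is one way to build the conflict-free schedule of relaxation stages shown above.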

Summary of SSSP Algorithms

Ordered algorithms: use priority queues
– Dijkstra's algorithm: work-efficient, but difficult to extract parallelism

Unordered algorithms: use sets/multisets
– Chaotic relaxation: parallelism, but the amount of work depends on the schedule; fine-grain synchronization

Ordered outer / unordered inner:
– Delta-stepping: controlled chaotic relaxation; the parameter Δ permits a trade-off between parallelism and extra work; both fine-grain and coarse-grain synchronization
– Bellman-Ford: the inspector uses matching to find a contention-free schedule for the inner loop; the executor performs relaxations using barriers; only coarse-grain synchronization

Delaunay Mesh Refinement

Iterative refinement to remove badly shaped triangles:

  while there are bad triangles do {
    pick a bad triangle;
    find its cavity;
    retriangulate the cavity;  // may create new bad triangles
  }

Don't-care non-determinism:
– The final mesh depends on the order in which bad triangles are processed
– Applications do not care which mesh is produced

Data structure:
– A graph in which nodes represent triangles and edges represent triangle adjacencies

Parallelism:
– Bad triangles with cavities that do not overlap can be processed in parallel
– The parallelism is dependent on runtime values, so compilers cannot find it
– (Miller et al.) at runtime, repeatedly build the interference graph and find maximal independent sets for parallel execution

  Mesh m = /* read in mesh */
  WorkList wl;
  wl.add(m.badTriangles());
  while (true) {
    if (wl.empty()) break;
    Element e = wl.get();
    if (e no longer in mesh) continue;
    Cavity c = new Cavity(e);   // determine new cavity
    c.expand();
    c.retriangulate();
    m.update(c);                // update mesh
    wl.add(c.badTriangles());
  }

Stencil computation: Jacobi iteration

Finite-difference method for solving PDEs:
– Discrete representation of the domain: a grid
– Values at interior points are updated using values at neighbors; values at boundary points are fixed

Data structure:
– Dense arrays

Parallelism:
– Values at the next time step can be computed simultaneously
– The parallelism is not dependent on runtime values, so a compiler can find it: the spatial loops are DO-ALL loops

  // Jacobi iteration with 5-point stencil
  // initialize array A
  for time = 1, nsteps
    for (i,j) in [2,n-1]x[2,n-1]
      temp(i,j) = 0.25*(A(i-1,j)+A(i+1,j)+A(i,j-1)+A(i,j+1))
    for (i,j) in [2,n-1]x[2,n-1]
      A(i,j) = temp(i,j)

[Figure: the grid at time step t and at time step t+1]
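A runnable sketch of one step of the pseudocode above, using plain nested lists (0-based indexing) with the boundary rows and columns held fixed; the grid values are invented for the example.

```python
def jacobi_step(A):
    """One Jacobi time step with the 5-point stencil on an n x n grid.
    Boundary points are copied unchanged; interior points are averaged."""
    n = len(A)
    temp = [row[:] for row in A]          # boundary values stay fixed
    for i in range(1, n - 1):             # interior points only
        for j in range(1, n - 1):
            temp[i][j] = 0.25 * (A[i-1][j] + A[i+1][j] +
                                 A[i][j-1] + A[i][j+1])
    return temp                           # the grid at time step t+1

A = [[1.0, 1.0, 1.0],
     [1.0, 0.0, 1.0],
     [1.0, 1.0, 1.0]]
A = jacobi_step(A)    # the single interior point becomes the average, 1.0
```

Writing into `temp` rather than `A` is what makes the two spatial loops DO-ALL: every interior point reads only time-step-t values, so all of them can be updated simultaneously.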

Page Rank

Used to determine the relative importance of webpages by examining the links between pages.

Abstraction:
– A graph in which nodes are webpages and edges are links
– Nodes have weights in [0,1], initialized to 1/N (N is the number of nodes); when the algorithm terminates, the weight is a heuristic measure of importance

Core algorithm: an iterative step, repeated a few times
– Each node u contributes an equal fraction of its own current weight to its immediate neighbors in the graph:

  w_{i+1}(v) = Σ_{u → v} w_i(u) / outdegree(u)
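One iteration of this (undamped) update rule can be sketched as follows; the graph and names are invented, and nodes with no outgoing edges are assumed absent (the next slide's damping factor is the usual fix for them).

```python
def pagerank_step(graph, weights):
    """One power-iteration step: each node gives an equal share of its
    current weight to each of its out-neighbors.
    graph: dict mapping node -> list of out-neighbors (all non-empty)."""
    new = {v: 0.0 for v in graph}
    for u, nbrs in graph.items():
        share = weights[u] / len(nbrs)    # equal fraction per out-edge
        for v in nbrs:
            new[v] += share
    return new

g = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
w = {v: 1 / 3 for v in g}                 # uniform initialization 1/N
w = pagerank_step(g, w)                   # c gathers from both a and b
```

Each step is a sparse matrix-vector product with the (column-normalized) adjacency matrix, which is why the Overview slide lists PageRank among the graph-analytics applications that reduce to sparse MVM.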

Page Rank (contd.)

Intuition behind PageRank: if you do a random walk on the graph, how likely are you to end up at the various nodes in the limit?

A small twist is needed to handle nodes with no outgoing edges.

Damping factor d:
– A constant, typically 0.85
– Assume that at each node, you may also take a random jump to any other node, with probability (1 - d)

Questions

We have seen several algorithms from a number of problem domains:
– Networks: Dijkstra SSSP, chaotic-relaxation SSSP, delta-stepping SSSP, Bellman-Ford
– Graphics: Delaunay mesh refinement
– Finite differences: stencil computations
– Big data: PageRank

What are the right abstractions for seeing the commonalities and differences between these algorithms?

Abstraction of algorithms

Operator formulation:
– Active elements: nodes or edges where there is work to be done
– Operator: the computation at an active element
  Activity: an application of the operator to an active element
  Neighborhood: the graph elements read or written by an activity
– Ordering: the order in which active elements must appear to have been processed
  Unordered algorithms: any order is fine (e.g. chaotic relaxation, Jacobi, PageRank)
  Ordered algorithms: an algorithm-specific order (e.g. Dijkstra)

[Figure: a graph with an active node and its neighborhood highlighted]

TAO analysis: algorithm abstraction

– Dijkstra SSSP: general graph, data-driven, ordered, local computation
– Chaotic relaxation SSSP: general graph, data-driven, unordered, local computation
– Bellman-Ford SSSP: general graph, topology-driven, unordered, local computation
– Delta-stepping SSSP: general graph, data-driven, ordered, local computation
– Delaunay mesh refinement: general graph, data-driven, unordered, morph
– Jacobi: grid, topology-driven, unordered, local computation

[Figure: a graph with an active node and its neighborhood highlighted]

Infrastructure for graph algorithms

Concurrent data structures:
– Concurrent graph data structure
– Concurrent set/bag, priority queue
– Can be very complex to implement

One software architecture:
– Exploit Wirth's equation:
  Program = Algorithm + Data Structure
  Parallel program = Parallel algorithm + Parallel data structure
                   = Operator + Schedule + Parallel data structure
– Provide a library of concurrent data structures
– The programmer specifies the operator and the schedule for applying the operator at different active elements

This is the approach we use in the Galois project.

Two-level infrastructure: Galois

A small number of expert programmers must support a large number of application programmers (cf. SQL).

Galois project:
– Program = Algorithm + Data structure (Wirth)
– A library of concurrent data structures and a runtime system, written by expert programmers
– Application programmers code in sequential C++; all concurrency control is in the data structure library and the runtime system

Parallel program = Operator + Schedule + Parallel data structures
– Joe: Operator + Schedule
– Stephanie: Parallel data structures

Summary

Graph algorithms can be very complex:
– Work may be created dynamically
– Different orders of doing work may result in different amounts of work
– Parallelism may not be known until runtime
– The underlying graph structure may change dynamically

The SSSP algorithms illustrate most of this complexity, so the SSSP problem is a good model problem for the study of parallel graph algorithms.

The operator formulation and TAO analysis are useful abstractions for understanding parallelism in algorithms.

Galois project: a software architecture based on these ideas
– A library of concurrent data structures written by expert programmers
– The Joe programmer writes C++ code to specify the operator and schedule