CPSC 311, Fall CPSC 311 Analysis of Algorithms More Graph Algorithms Prof. Jennifer Welch Fall 2009
CPSC 311, Fall Minimum Spanning Tree find subset of edges that span all the nodes, create no cycle, and minimize sum of weights
CPSC 311, Fall Facts About MSTs There can be many spanning trees of a graph In fact, there can be many minimum spanning trees of a graph But if every edge has a unique weight, then there is a unique MST
CPSC 311, Fall Uniqueness of MST Suppose in contradiction there are 2 MSTs, M 1 and M 2. Let e be edge with minimum weight that is one but not the other (say it is in M 1 ). If e is added to M 2, a cycle is formed. Let e' be an edge in the cycle that is not in M 1
CPSC 311, Fall Uniqueness of MST e: in M 1 but not M 2 e': in M 2 but not M 1 ; wt is less than wt of e M2:M2: Replacing e with e' creates a new MST M 3 whose weight is less than that of M 2
CPSC 311, Fall Generic MST Algorithm input: weighted undirected graph G = (V,E,w) T := empty set while T is not yet a spanning tree of G find an edge e in E s.t. T U {e} is a subgraph of some MST of G add e to T return T (as MST of G)
CPSC 311, Fall Kruskal's MST algorithm consider the edges in increasing order of weight, add in an edge iff it does not cause a cycle
CPSC 311, Fall Kruskal's Algorithm as a Special Case of Generic Alg. Consider edges in increasing order of weight Add the next edge iff it doesn't cause a cycle At any point, T is a forest (set of trees); eventually T is a single tree
CPSC 311, Fall Kruskal's Algorithm input: G = (V,E,w) // weighted graph T := Ø // subset of E sort E by increasing weights for each (u,v) in E in sorted order do if T U {u,v} has no cycle then T := T U {(u,v)} return (V,T)
CPSC 311, Fall Why is Kruskal's Greedy? Algorithm manages a set of edges s.t. these edges are a subset of some MST At each iteration: choose an edge so that the MST-subset property remains true subproblem left is to do the same with the remaining edges Always try to add cheapest available edge that will not violate the tree property locally optimal choice
CPSC 311, Fall Correctness of Kruskal's Alg. Let e 1, e 2, …, e n-1 be sequence of edges chosen Clearly they form a spanning tree Suppose it is not minimum weight Let e i be the edge where the algorithm goes wrong {e 1,…,e i-1 } is part of some MST M but {e 1,…,e i } is not part of any MST
CPSC 311, Fall Correctness of Kruskal's Alg. white edges are part of MST M, which contains e 1 to e i-1, but not e i M:e i, forms a cycle in M e* : min wt. edge in cycle not in e 1 to e i-1 replacing e* w/ e i forms a spanning tree with smaller weight than M, contradiction! wt(e*) > wt(e i )
CPSC 311, Fall Note on Correctness Proof Argument on previous slide works for case when every edge has a unique weight Algorithm also works when edge weights are not necessarily correct Modify proof on previous slide: contradiction is reached to assumption that e i is not part of any MST
CPSC 311, Fall Implementing Kruskal's Alg. Sort edges by weight efficient algorithms known How to test quickly if adding in the next edge would cause a cycle? use Disjoint Sets data structure…
CPSC 311, Fall Disjoint Sets in Kruskal's Alg. Use Disjoint Sets data structure to test whether adding an edge to a set of edges would cause a cycle Each set represents nodes that are connected using edges chosen so far Initially put each node in a set by itself using Make-Set When testing edge (u,v), call Find-Set to check if u and v are in the same set if so, then a cycle would result if (u,v) is chosen When edge (u,v) is chosen, call Union(u,v)
CPSC 311, Fall Kruskal's Algorithm #2 input: G = (V,E,w) // weighted graph T := Ø // subset of E sort E by increasing weights for each v in V do MakeSet(v) for each (u,v) in E in sorted order do a := FindSet(u) b := FindSet(v) if a ≠ b then T := T U {(u,v)} Union(a,b) return (V,T)
CPSC 311, Fall Running Time of Kruskal's MST Algorithm Sorting the edges takes O(E log E) time The rest of the time is proportional to the total time taken by all the calls to MakeSet, FindSet, and Union. Number of calls to MakeSet is |V| If we unravel the for loop, we can see: total number of calls to FindSet is O(E) total number of calls to Union is O(V) So number of operations is O(V+E), with |V| MakeSet ops. If the Disjoint Sets data structure is implemented using union by rank and path compression, the time for all the Disjoint Sets operations is O((V+E) log*V) Total time is O(E log E), since graph is connected, so V ≤ E + 1, and this equals O(E log V), since E is O(V 2 ) and log V 2 = 2log V
CPSC 311, Fall Another Greedy MST Alg. Kruskal's algorithm maintains a forest that grows until it forms a spanning tree Alternative idea is keep just one tree and grow it until it spans all the nodes Prim's algorithm At each iteration, choose the minimum weight outgoing edge to add greedy!
CPSC 311, Fall Idea of Prim's Algorithm Instead of growing the MST as possibly multiple trees that eventually all merge, grow the MST from a single node, so that there is only one tree at any point. Also a special case of the generic algorithm: at each step, add the minimum weight edge that goes out from the tree constructed so far.
CPSC 311, Fall Prim's Algorithm input: weighted undirected graph G = (V,e,w) T := empty set S := {any node in V} while |T| < |V| - 1 do let (u,v) be a min wt. outgoing edge (u in S, v not in S) add (u,v) to T add v to S return (S,T) (as MST of G)
CPSC 311, Fall Prim's Algorithm Example a b h i c g d f e
CPSC 311, Fall Correctness of Prim's Algorithm Let T i be the tree represented by (S,T) at the end of iteration i. Show by induction on i that T i is a subtree of some MST of G. Basis: i = 0 (before first iteration). T 0 contains just a single node, and thus is a subtree of every MST of G.
CPSC 311, Fall Correctness of Prim's Algorithm Induction: Assume T i is a subtree of some MST M. We must show T i+1 is a subtree of some MST. Let (u,v) be the edge added in iteration i+1. u v TiTi T i+1 Case 1: (u,v) is in M. Then T i+1 is also a subtree of M.
CPSC 311, Fall Correctness of Prim's Algorithm Case 2: (u,v) is not in M. There is a path P in M from u to v, since M spans G. Let (x,y) be the first edge in P with one endpoint in T i and the other not in T i. u v TiTi x y P
CPSC 311, Fall Correctness of Prim's Algorithm Let M' = M - {(x,y)} U {(u,v)} M' is also a spanning tree of G. w(M') = w(M) - w(x,y) + w(u,v) ≤ w(M) since (u,v) is min wt outgoing edge So M' is also an MST and T i+1 is subtree M' u v TiTi x y T i+1
CPSC 311, Fall Implementing Prim's Algorithm How do we find minimum weight outgoing edge? First cut: scan all adjacency lists at each iteration. Results in O(VE) time. Try to do better.
CPSC 311, Fall Implementing Prim's Algorithm Idea: have each node not yet in the tree keep track of its best (cheapest) edge to the tree constructed so far. To find min wt. outgoing edge, find minimum among these values use a priority queue to store the best edge info (insert and extract-min operations)
CPSC 311, Fall Implementing Prim's Algorithm When a node v is added to T, some other nodes might have their best edges affected, but only neighbors of v add decrease-key operation to the priority queue u v TiTi T i+1 w w's best edge to T i check if this edge is cheaper for w x x's best edge to T i v's best edge to T i
CPSC 311, Fall Details on Prim's Algorithm Associate with each node v two fields: best-wt[v] : if v is not yet in the tree, then it holds the min. wt. of all edges from v to a node in the tree. Initially infinity. best-node[v] : if v is not yet in the tree, then it holds the name of the node u in the tree s.t. w(v,u) is v's best-wt. Initially nil.
CPSC 311, Fall Details on Prim's Algorithm input: G = (V,E,w) // initialization initialize priority queue Q to contain all nodes, using best-wt values as keys let v 0 be any node in V decrease-key(Q,v 0,0) // last line means change best-wt[v 0 ] to 0 and adjust Q accordingly
CPSC 311, Fall Details on Prim's Algorithm while Q is not empty do u := extract-min(Q) // node w/ smallest best-wt if u is not v 0 then add (u,best-node[u]) to T for each neighbor v of u do if v is in Q and w(u,v) < best-wt[v] then best-node[v] := u decrease-key(Q,v,w(u,v)) return (V,T) // as MST of G
CPSC 311, Fall Running Time of Prim's Algorithm Depends on priority queue implementation. Let T ins be time for insert T dec be time for decrease-key T ex be time for extract-min Then we have |V| inserts and one decrease-key in the initialization: O(V T ins +T dec ) |V| iterations of while one extract-min per iteration: O(V T ex ) total
CPSC 311, Fall Running Time of Prim's Algorithm Each iteration of while includes a for loop. Number of iterations of for loop varies, depending on how many neighbors the current node has Total number of iterations of for loop is O(E). Each iteration of for loop: one decrease key, so O(E T dec ) total
CPSC 311, Fall Running Time of Prim's Algorithm O(V(T ins + T ex ) + E T dec ) If priority queue is implemented with a binary heap, then T ins = T ex = T dec = O(log V) total time is O(E log V) (Think about how to implement decrease- key in O(log V) time.)
CPSC 311, Fall Shortest Paths in a Graph We already saw the Floyd-Warshall algorithm to compute all pairs of shortest paths Let's review two important single-source shortest path algorithms: Dijkstra's algorithm Bellman-Ford algorithm
CPSC 311, Fall Single Source Shortest Path Problem Given: directed or undirected graph G = (V,E,w) and source node s in V Find: For each t in V, a path in G from s to t with minimum weight Warning! Negative weights are a problem: s t 4 55
CPSC 311, Fall Shortest Path Tree Result of a SSSP algorithm can be viewed as a tree rooted at the source Why not use breadth-first search? Works fine if all weights are the same: weight of each path is (a multiple of) the number of edges in the path Doesn't work when weights are different
CPSC 311, Fall Dijkstra's SSSP Algorithm Assumes all edge weights are nonnegative Similar to Prim's MST algorithm Start with source node s and iteratively construct a tree rooted at s Each node keeps track of tree node that provides cheapest path from s (not just cheapest path from any tree node) At each iteration, include the node whose cheapest path from s is the overall cheapest
CPSC 311, Fall Prim's vs. Dijkstra's s Prim's MST s Dijkstra's SSSP
CPSC 311, Fall Implementing Dijkstra's Alg. How can each node u keep track of its best path from s? Keep an estimate, d[u], of shortest path distance from s to u Use d as a key in a priority queue When u is added to the tree, check each of u's neighbors v to see if u provides v with a cheaper path from s: compare d[v] to d[u] + w(u,v)
CPSC 311, Fall Dijkstra's Algorithm input: G = (V,E,w) and source node s // initialization d[s] := 0 d[v] := infinity for all other nodes v initialize priority queue Q to contain all nodes using d values as keys
CPSC 311, Fall Dijkstra's Algorithm while Q is not empty do u := extract-min(Q) for each neighbor v of u do if d[u] + w(u,v) < d[v] then d[v] := d[u] + w(u,v) decrease-key(Q,v,d[v]) parent(v) := u
CPSC 311, Fall Dijkstra's Algorithm Example ab de c source is node a QabcdebcdecdedededØ d[a] d[b] ∞ d[c] ∞ 1210 d[d] ∞∞∞ 1613 d[e] ∞∞ 11 iteration
CPSC 311, Fall Correctness of Dijkstra's Alg. Let T i be the tree constructed after i-th iteration of while loop: nodes not in Q edges indicated by parent variables Show by induction on i that the path in T i from s to u is a shortest path and has distance d[u], for all u in T i. Basis: i = 1. s is the only node in T 1 and d[s] = 0.
CPSC 311, Fall Correctness of Dijkstra's Alg. Induction: Assume T i-1 is a correct shortest path tree and show for T i. Let u be the node added in iteration i. Let x = parent(u). s x T i-1 u TiTi Need to show path in T i from s to u is a shortest path, and has distance d[u]
CPSC 311, Fall Correctness of Dijkstra's Alg s x T i-1 u TiTi P, path in T i from s to u (a,b) is first edge in P' that leaves T i-1 (i.e., a is in T i-1 but b is not) a b P', another path from s to u
CPSC 311, Fall Let P 1 be part of P' before (a,b). Let P 2 be part of P' after (a,b). w(P') = w(P 1 ) + w(a,b) + w(P 2 ) ≥ w(P 1 ) + w(a,b) (nonneg wts) ≥ (wt of path in T i-1 from s to a) + w(a,b) (inductive hypothesis) ≥ (wt of path in T i-1 from s to x) + w(x,u) (alg chose u in iteration i and d-values are accurate, by inductive hyp.) = w(P). So P is a shortest path, and d[u] is accurate after iteration i. Correctness of Dijkstra's Alg s x T i-1 u TiTi a b P' P
CPSC 311, Fall Running Time of Dijstra's Alg. initialization: insert each node once O(V T ins ) O(V) iterations of while loop one extract-min per iteration => O(V T ex ) for loop inside while loop has variable number of iterations… For loop has O(E) iterations total one decrease-key per iteration => O(E T dec ) Total is O(V (T ins + T ex ) + E T dec )
CPSC 311, Fall Using Fancier Heap Implementations O(V(T ins + T ex ) + E T dec ) If priority queue is implemented with a binary heap, then T ins = T ex = T dec = O(log V) total time is O(E log V) There are fancier implementations of the priority queue, such as Fibonacci heap: T ins = O(1), T ex = O(log V), T dec = O(1) (amortized) total time is O(V log V + E)
CPSC 311, Fall Using Simpler Heap Implementations O(V(T ins + T ex ) + E T dec ) If graph is dense, so that |E| = (V 2 ), then it doesn't help to make T ins and T ex to be at most O(V). Instead, focus on making T dec be small, say constant. Implement priority queue with an unsorted array: T ins = O(1), T ex = O(V), T dec = O(1) total is O(V 2 )
CPSC 311, Fall What About Negative Edge Weights? Dijkstra's SSSP algorithm requires all edge weights to be nonnegative even more restrictive than outlawing negative weight cycles Bellman-Ford SSSP algorithm can handle negative edge weights even "handles" negative weight cycles by reporting they exist
CPSC 311, Fall Bellman-Ford Idea Consder each edge (u,v) and see if u offers v a cheaper path from s compare d[v] to d[u] + w(u,v) Repeat this process |V| - 1 times to ensure that accurate information propagates from s, no matter what order the edges are considered in
CPSC 311, Fall Bellman-Ford SSSP Algorithm input: directed or undirected graph G = (V,E,w) //initialization initialize d[v] to infinity and parent[v] to nil for all v in V other than the source initialize d[s] to 0 and parent[s] to s // main body for i := 1 to |V| - 1 do for each (u,v) in E do // consider in arbitrary order if d[u] + w(u,v) < d[v] then d[v] := d[u] + w(u,v) parent[v] := u // continued on next slide
CPSC 311, Fall Bellman-Ford SSSP Algorithm // check for negative weight cycles for each (u,v) in E do if d[u] + w(u,v) < d[v] then output "negative weight cycle exists"
CPSC 311, Fall Running Time of Bellman-Ford O(V) iterations of outer for loop O(E) iterations of inner for loop O(VE) time total
CPSC 311, Fall Bellman-Ford Example s c a b 3 — process edges in order (c,b) (a,b) (c,a) (s,a) (s,c)
CPSC 311, Fall Correctness of Bellman-Ford Assume no negative-weight cycles. Lemma: d[v] is never an underestimate of the actual shortest path distance from s to v. Lemma: If there is a shortest s-to-v path containing at most i edges, then after iteration i of the outer for loop, d[v] is at most the actual shortest path distance from s to v. Theorem: Bellman-Ford is correct. Proof: Follows from these 2 lemmas and fact that every shortest path has at most |V| - 1 edges.
CPSC 311, Fall Correctness of Bellman-Ford Suppose there is a negative weight cycle. Then the distance will decrease even after iteration |V| - 1 shortest path distance is negative infinity This is what the last part of the code checks for.
CPSC 311, Fall Speeding Up Bellman-Ford The previous example would have converged faster if we had considered the edges in a different order in the for loop move outward from s If the graph is a DAG (no cycles), we can fully exploit this idea to speed up the running time
CPSC 311, Fall DAG Shortest Path Algorithm input: directed graph G = (V,E,w) and source node s in V topologically sort G d[v] := infinity for all v in V d[s] := 0 for each u in V in topological sort order do for each neighbor v of u do d[v] := min{d[v],d[u] + w(u,v)}
CPSC 311, Fall DAG Shortest Path Algorithm Running Time is O(V + E). Example: