 # Minimum Spanning Tree Algorithms

## Presentation on theme: "Minimum Spanning Tree Algorithms"— Presentation transcript:

Minimum Spanning Tree Algorithms
Prim’s Algorithm Kruskal’s Algorithm Runtime Analysis Correctness Proofs

What is A Spanning Tree? A spanning tree for an undirected graph G=(V,E) is a subgraph of G that is a tree and contains all the vertices of G Can a graph have more than one spanning tree? Can an unconnected graph have a spanning tree? a b u e c v f d Copyright 1999 by Cutler/Head

 Minimal Spanning Tree.
The weight of a subgraph is the sum of the weights of it edges. A minimum spanning tree for a weighted graph is a spanning tree with minimum weight. Can a graph have more then one minimum spanning tree? a 4 4 9 3 b u e 14 2 10 15 c v f 3 8 d Mst T: w( T )= (u,v)  T w(u,v ) is minimized Copyright 1999 by Cutler/Head

Example of a Problem that Translates into a MST
The Problem Several pins of an electronic circuit must be connected using the least amount of wire. Modeling the Problem The graph is a complete, undirected graph G = ( V, E ,W ), where V is the set of pins, E is the set of all possible interconnections between the pairs of pins and w(e) is the length of the wire needed to connect the pair of vertices. Find a minimum spanning tree. Copyright 1999 by Cutler/Head

Greedy Choice We will show two ways to build a minimum spanning tree.
A MST can be grown from the current spanning tree by adding the nearest vertex and the edge connecting the nearest vertex to the MST. (Prim's algorithm) A MST can be grown from a forest of spanning trees by adding the smallest edge connecting two spanning trees. (Kruskal's algorithm) Copyright 1999 by Cutler/Head

Notation Prim’s Selection rule
Tree-vertices: in the tree constructed so far Non-tree vertices: rest of vertices Prim’s Selection rule Select the minimum weight edge between a tree-node and a non-tree node and add to the tree Copyright 1999 by Cutler/Head

The Prim algorithm Main Idea
Select a vertex to be a tree-node while (there are non-tree vertices) { if there is no edge connecting a tree node with a non-tree node return “no spanning tree” select an edge of minimum weight between a tree node and a non-tree node add the selected edge and its new vertex to the tree } return tree 6 4 5 2 15 8 10 14 3 u v b a c d f Copyright 1999 by Cutler/Head

Implementation Issues
How is the graph implemented? Assume that we just added node u to the tree. The distance of the nodes adjacent to u to the tree may now be decreased. There must be fast access to all the adjacent vertices. So using adjacency lists seems better How should the set of non-tree vertices be represented? The operations are: build set delete node closest to tree decrease the distance of a non-tree node from the tree check whether a node is a non- tree node Copyright 1999 by Cutler/Head

Implementation Issues
How should the set of non-tree vertices be represented? A priority queue PQ may be used with the priority D[v] equal to the minimum distance of each non-tree vertex v to the tree. Each item in PQ contains: D[v], the vertex v, and the shortest distance edge (v, u) where u is a tree node This means: build a PQ of non-tree nodes with initial values - fast build heap O (V ) building an unsorted list O(V) building a sorted list O(V) (special case) Copyright 1999 by Cutler/Head

Implementation Issues
delete node closest to tree (extractMin) O(lg V ) if heap and O(V) if unsorted list O(1) sorted list decrease the distance of a non-tree node to the tree We need to find the location i of node v in the priority queue and then execute (decreasePriorityValue(i, p)) where p is the new priority decreasePriorityValue(i, p) O(lg V) for heap, O(1) for unsorted list O(V ) for sorted list (too slow) Copyright 1999 by Cutler/Head

Implementation Issues
What is the location i of node v in a priority queue? Find in Heaps, and sorted lists O(n) Unsorted – if the nodes are numbered 1 to n and we use an array where node v is the v item in the array O(1) Extended heap We will use extended heaps that contain a “handle” to the location of each node in the heap. When a node is not in PQ the “handle” will indicate that this is the case This means that we can access a node in the extended heap in O(1), and check v  PQ in O(1) Note that the “handle” must be updated whenever a heap operation is applied Copyright 1999 by Cutler/Head

Implementation Issues
2. Unsorted list Array implementation where node v can be accesses as PQ[v] in O(1), and the value of PQ[v] indicates when the node is not in PQ. Copyright 1999 by Cutler/Head

Prim’s Algorithm 1. for each u V 2. do D [u ]   3. D[ r ]  0
4. PQ make-heap(D,V, {})//No edges 5. T   while PQ   do 8. (u,e )  PQ.extractMin() add (u,e) to T for each v  Adjacent (u ) // execute relaxation do if v PQ && w( u, v ) < D [ v ] then D [ v ]  w (u, v) PQ.decreasePriorityValue ( D[v], v, (u,v )) 14. return T // T is a mst. Lines 1-5 initialize the priority queue PQ to contain all Vertices. Ds for all vertices except r, are set to infinity. r is the starting vertex of the T The T so far is empty Add closest vertex and edge to current T Get all adjacent vertices v of u, update D of each non-tree vertex adjacent to u Store the current minimum weight edge, and updated distance in the priority queue Copyright 1999 by Cutler/Head

Prim’s Algorithm Initialization
Prim (G ) 1. for each u V do D [u ]   3. D[ r ]  0 4. PQ make-heap(D,V, {})//No edges 5. T   Copyright 1999 by Cutler/Head

Building the MST (D[v], v, (u,v) ) 14. return T // solution check
7. while PQ   do //Selection and feasibility (u,e )  PQ.extractMin() // T contains the solution so far add (u,e) to T for each v  Adjacent (u ) do if v  PQ && w( u, v ) < D [ v ] then D [ v ]  w (u, v) PQ.decreasePriorityValue (D[v], v, (u,v) ) 14. return T Copyright 1999 by Cutler/Head

Using Extended Heap implementation
Time Analysis Using Extended Heap implementation Lines 1 -6 run in O (V ) Max Size of PQ is | V | Count7 =O (V ) Count7(8) = O (V )  O( lg V ) Count7(10) = O(deg(u ) ) = O( E ) Count7(10(11)) = O(1)O( E ) Count7(10(11(12))) = O(1) O( E ) Count7(10(13)) = O( lg V)  O( E ) Decrease- Key operation on the extended heap can be implemented in O( lg V) So total time for Prim's Algorithm is O ( V lg V + E lg V ) What is O(E ) ? Sparse Graph, E =O(V) , O (E lg V)=O(V lg V ) Dense Graph, E=O(V2), O (E lg V)=O(V2 lg V) 1. for each u V 2. do D [u ]   3. D[ r ]  0 4. PQ make-PQ(D,V, {})//No edges 5. T   6. 7. while PQ   do 8. (u,e )  PQ.extractMin() add (u,e) to T for each v  Adjacent (u ) do if v PQ && w( u, v ) < D [ v ] then D [ v ]  w (u, v) PQ.decreasePriorityValue (D[v], v, (u,v)) 15. return T // T is a mst. Assume a node in PQ can be accessed in O(1) ** Decrease key for v requires O(lgV ) provided the node in heap with v’s data can be accessed in O(1) In order to achieve a O(lg V ) cost for decrease key you can use an additional data structure like a Hashtable to store the location in the PQ. Copyright 1999 by Cutler/Head

Time Analysis Using unsorted PQ
1. for each u V 2. do D [u ]   3. D[ r ]  0 4. PQ make-PQ(D,V, {})//No edges 5. T   while PQ   do 8. (u,e )  PQ.extractMin() add (u,e) to T for each v  Adjacent (u ) do if v PQ && w( u, v ) < D [ v ] then D [ v ]  w (u, v) PQ.decreasePriorityValue (D[v], v, (u,v)) 15. return T // T is a mst. Using unsorted PQ Lines run in O (V ) Max Size of PQ is | V | Count7 = O (V ) Count7(8) = O (V )  O(V ) Count7(10) = O(deg(u ) ) = O( E ) Count7(10(11)) = O(1)O( E ) Count7(10(11(12))) = O(1) O( E ) Count7(10(13)) =O( 1)  O( E ) So total time for Prim's Algorithm is O (V + V2 + E ) = O (V2 ) For Sparse/Dense graph : O( V2 ) Note growth rate unchanged for adjacency matrix graph representation The Loop begining in line 10 is executed O(deg(u ) ) = O( E ). It is based only on the number of edges in the graph not vertices. We assume the PQ is implemented as an array (sequence). Therefore, removeMinElement( ) , costs O(V) because you have to find the minimum element and you will have to shift the elements to fill in the whole. If we know the position of the key to be decrease then the cost of change the value is O(1). Copyright 1999 by Cutler/Head

Prim - extended Heap After Initialization
PQ T A handle A B C 1 2 3 1 0, (A, {}) 2 6 B 5 C G , (B, {}) , (C, {}) 2 3 Prim (G, r) 1. for each u V 2. do D [u ]   3. D[ r ]  0 4. PQ make-heap(D,V, { }) 5. T   G A B C B 2 C 6 A 2 C 5 Copyright 1999 by Cutler/Head A 6 B 5

Prim - extended Heap Build tree - after PQ.extractMin
handle A B C Null 2 1 T (A, {}) PQ 2 1 , (C, {}) 6 B 5 C G , (B, {}) 7. while PQ   do (u,e)  PQ.extractMin() add (u,e) to T 2 Copyright 1999 by Cutler/Head

Update B adjacent to A handle A B C Null 1 2 T (A, {}) PQ A 1
2, (B, {A, B}) 2 6 B 5 C G , (C, {}) 2 10. for each v  Adjacent (u ) // relaxation operation // relaxation 11. do if v PQ && w( u, v ) < D [ v ] then D [ v ]  w (u, v) PQ.decreasePriorityValue ( D[v], v, (u,v)) Copyright 1999 by Cutler/Head

Update C adjacent to A handle A B C Null 1 2 T (A, {}) PQ 1
2, (B, {A, B}) 6, (C, {A, C}) A 2 2 6 B 5 C G Copyright 1999 by Cutler/Head

Build tree - after PQ.extractMin
handle A B C Null 1 T (A, {}) (B, {A, B}) PQ 6, (C, {A, C}) 1 A 7. while PQ   do (u,e)  PQ.extractMin() add (u,e) to T 2 6 B 5 C G Copyright 1999 by Cutler/Head

Update C adjacent to B handle A B C Null 1 T (A, {}) (B, {A, B}) T
PQ 5, (C, {B, C}) 1 A 10. for each v  Adjacent (u ) // relaxation operation 2 6 B 5 C G Copyright 1999 by Cutler/Head

Build tree - after PQ.extractMin
handle A B C Null T (A, {}) (B, {A, B}) (C, {B, C}) T (A, {}) PQ A 7. while PQ   do (u,e)  PQ.extractMin() add (u,e) to T 2 6 B 5 C G Copyright 1999 by Cutler/Head

Prim - unsorted list After Initialization
PQ A A B C 12 4 B 0, (A, {}) , (B, {}) , (C, {}) 5 C G G A B C B 12 C 4 Prim (G, r) 1. for each u V 2. do D [u ]   3. D[ r ]  0 4. PQ make-PQ(D,V, { }) 5. T   A 12 C 5 A 4 B 5 Copyright 1999 by Cutler/Head

Build tree - after PQ.extractMin
12 4 B Null , (B, {}) , (C, {}) 5 C G 7. while PQ   do (u,e)  PQ.extractMin() add (u,e) to T Copyright 1999 by Cutler/Head

Update B, C adjacent to A T (A, {}) PQ A A B C 12 4 B Null
12, (B, {A, B}) 4, (C, {A, C}) 5 C G 10. for each v  Adjacent (u ) // relaxation operation Copyright 1999 by Cutler/Head

Build tree - after PQ.extractMin
(C, {A, C}) A 12 PQ 4 B 5 A B C C Null 12, (B, {A, B}) Null G 7. while PQ   do (u,e)  PQ.extractMin() add (u,e) to T Copyright 1999 by Cutler/Head

Update B adjacent to C T (A, {}) (C, {A, C}) T (A, {}) PQ A 12 4 B 5 A
Null 5, (B, {C, B}) Null G 10. for each v  Adjacent (u ) // relaxation operation Copyright 1999 by Cutler/Head

Build tree - after PQ.extractMin
(C, {A, C}) (B, {C, B}) T (A, {}) PQ A 12 4 B 5 A B C C Null Null Null G 7. while PQ   do (u,e)  PQ.extractMin() add (u,e) to T Copyright 1999 by Cutler/Head

PQ = {( 0,(a,)), (,(b,?)), ...(,(h,?))}
Prim (G) 1. for each u V 2. do D [u ]   3. D[ r ]  0 4. PQ make-heap(D,V, { }) 5. T   D = [ 0, , …,  ] PQ = {( 0,(a,)), (,(b,?)), ...(,(h,?))} T = { } r G = a 4 6 9 5 b e g 14 2 10 15 c f h 3 8 d Copyright 1999 by Cutler/Head

// relaxation 11. do if v PQ && w( u, v ) < D [ v ] then D [ v ]  w (u, v) PQ.decreasePriorityValue ( D[v], v, (u,v)) 7. while PQ   do (u,e)  PQ.extractMin() add (u,e) to T 10. for each v  Adjacent (u ) // relaxation operation 15. return T D = [ 0, r G = a T = { 4 6 9 5 b e g PQ = { 14 2 10 15 c f h 3 8 d Copyright 1999 by Cutler/Head

Lemma 1 Let G = ( V, E) be a connected, weighted undirected graph. Let T be a promising subset of E. Let Y be the set of vertices connected by the edges in T. If e is a minimum weight edge that connects a vertex in Y to a vertex in V - Y, then T  { e } is promising. Note: A feasible set is promising if it can be extended to produce not only a solution , but an optimal solution. In this algorithm: A feasible set of edges is promising if it is a subset of a Minimum Spanning Tree for the connected graph. This is the same proof for Kruskal’s algorithm. Copyright 1999 by Cutler/Head

Outline of Proof of Correctness of Lemma 1
T is the promising subset, and e the minimum cost edge of Lemma 1 Let T ' be the MST such that T  T ' We will show that if e  T' then there must be another MST T" such that T {e}  T". Proof has 4 stages: 1. Adding e to T', closes a cycle in T' {e}. 2. Cycle contains another edge e' T' but e'  T 3. T"=T'  {e} - {e’ } is a spanning tree 4. T" is a MST Copyright 1999 by Cutler/Head

The Promising Set of Edges Selected by Prim
Lemma 1 The Promising Set of Edges Selected by Prim T X e  Y  V -Y MST T' but e  T' X Copyright 1999 by Cutler/Head

Lemma 1 Since T' is a spanning tree, it is connected. Adding the edge, e, creates a cycle. X T'  {e} Stage 1 u v X  Y  V -Y e In T' there is a path from uY to v  V -Y. Therefore the path must include another edge e' with one vertex in Y and the other in V-Y. X Stage 2 e' Copyright 1999 by Cutler/Head

Lemma 1 If we remove e’ from T’  { e } the cycle disappears.
T”=T’  { e } - {e’} is connected. T’ is connected. Every pair of vertices connected by a path that does not include e’ is still connected in T”. Every pair of vertices connected by a path that included e’ , is still connected in T” because there is a path in T" =T’  { e } - { e’ } connecting the vertices of e’. X e T " X  Y  V -Y b Stage 3 w( e )  w( e' ) By the way Prim picks the next edge Copyright 1999 by Cutler/Head

Lemma 1 Conclusion T  { e } is promising
w( e )  w( e' ) by the way Prim picks the next edge. The weight of T", w(T") = w (T' ) + w( e ) - w( e' )  w(T'). But w (T' )  w(T") because T' is a MST. So w (T' ) = w(T") and T" is a MST Stage 4 X e T" X  Y  V -Y Conclusion T  { e } is promising Copyright 1999 by Cutler/Head

Theorem : Prim's Algorithm always produces a minimum spanning tree.
Proof by induction on the set T of promising edges. Base case: Initially, T =  is promising. Induction hypothesis: The current set of edges T selected by Prim is promising. Induction step: After Prim adds the edge e, T  { e } is promising. Proof: T  { e } is promising by Lemma 1. Conclusion: When G is connected, Prim terminates with |T | = |V | -1, and T is a MST. Copyright 1999 by Cutler/Head

Kruskal’s Algorithm

Kruskal's Algorithm: Main Idea
6 4 5 2 15 8 10 14 3 u v b a c d f solution = { } while ( more edges in E) do // Selection select minimum weight edge remove edge from E // Feasibility if (edge closes a cycle with solution so far) then reject edge else add edge to solution // Solution check if |solution| = |V | - 1 return solution return null // when does this happen? Copyright 1999 by Cutler/Head

Kruskal's Algorithm: 1. Sort the edges E in non-decreasing weight 2. T   3. For each v V create a set. 4. repeat 5. Select next {u,v}  E, in order 6. ucomp  find (u) 7. vcomp  find (v) if ucomp  vcomp then add edge (u,v) to T union ( ucomp,vcomp ) 10.until T contains |V | - 1 edges 11. return tree T 6 4 5 2 9 15 8 10 14 3 e f b a c d g h C = { {a}, {b}, {c}, {d}, {e}, {f}, {g}, {h} } C is a forest of trees. Copyright 1999 by Cutler/Head

Kruskal - Disjoint set After Initialization
Sorted edges T 2 A B 2 6 B B C 5 5 A C 6 C 7 G B D 7 D 1. Sort the edges E in non-decreasing weight 2. T   3. For each v V create a set. A B C D Disjoint data set for G Copyright 1999 by Cutler/Head

Kruskal - add minimum weight edge if feasible
Sorted edges 2 A B 2 (A, B) 6 B B C 5 5 7 A C 6 C G D B D 7 5. for each {u,v}  in ordered E ucomp  find (u) vcomp  find (v) if ucomp  vcomp then add edge (v,u) to T union( ucomp,vcomp ) Find(A) Find(B) A B C D Disjoint data set for G A C D B After merge(A, B) Copyright 1999 by Cutler/Head

Kruskal - add minimum weight edge if feasible
Sorted edges 2 A B 2 (A, B) (B, C) 6 B B C 5 5 7 A C 6 C G D B D 7 5. for each {u,v}  in ordered E ucomp  find (u) vcomp  find (v) if ucomp  vcomp then add edge (v,u) to T union ( ucomp,vcomp ) Find(B) Find(C) A C D B After merge(A, C) A B C D Copyright 1999 by Cutler/Head

Kruskal - add minimum weight edge if feasible
Sorted edges 2 A B 2 (A, B) (B, C) 6 B B C 5 5 7 A C 6 C G D B D 7 5. for each {u,v}  in ordered E ucomp  find (u) vcomp  find (v) if ucomp  vcomp then add edge (v,u) to T union ( ucomp,vcomp ) Find(A) Find(C) A D B C A and C in same set Copyright 1999 by Cutler/Head

Kruskal - add minimum weight edge if feasible
Sorted edges 2 A B 2 (A, B) (B, C) (B, D) 6 B B C 5 5 7 A C 6 C G D B D 7 5. for each {u,v}  in ordered E ucomp  find (u) vcomp  find (v) if ucomp  vcomp then add edge (v,u) to T union ( ucomp,vcomp ) Find(B) Find(D) A D B C A After merge B C D Copyright 1999 by Cutler/Head

Kruskal's Algorithm: Time Analysis
Kruskal ( G ) 1. Sort the edges E in non-decreasing weight 2. T   3. For each v V create a set. 4. repeat 5. {u,v}  E, in order 6. ucomp  find (u) 7. vcomp  find (v) if ucomp  vcomp then add edge (v,u) to T union ( ucomp,vcomp ) 11.until T contains |V | - 1 edges 12. return tree T Count1 = ( E lg E ) Count2= (1) Count3= ( V ) Count4 = O( E ) Using Disjoint set-height and path compression Count4(6+7+10)= O((E +V) (V)) Sorting dominates the runtime. We get T( E,V ) = ( E lg E), so for a sparse graph we get ( V lg V) for a dense graph we get ( V2 lg V2) = ( V2 lg V) Note: Could have used a PQ. However unlike Prim’s we do not need to change the PQ. Sorting the Edges is slightly faster. Copyright 1999 by Cutler/Head