1 Chapter 5-3 Greedy Algorithms Slides by Kevin Wayne. Copyright © 2005 Pearson-Addison Wesley. All rights reserved.


4.5 Minimum Spanning Tree

3 Minimum Spanning Tree
Minimum spanning tree. Given a connected graph G = (V, E) with real-valued edge weights c_e, an MST is a subset of the edges T ⊆ E such that T is a spanning tree whose sum of edge weights Σ_{e∈T} c_e is minimized.
Cayley's theorem. There are n^(n-2) spanning trees of K_n, so we can't solve MST by brute force.

4 Applications
MST is a fundamental problem with diverse applications.
- Network design: telephone, electrical, hydraulic, TV cable, computer, road networks.
- Approximation algorithms for NP-hard problems: traveling salesperson problem, Steiner tree.
- Indirect applications: max bottleneck paths; LDPC codes for error correction; image registration with Renyi entropy; learning salient features for real-time face verification; reducing data storage in sequencing amino acids in a protein; modeling locality of particle interactions in turbulent fluid flows; autoconfiguration protocol for Ethernet bridging to avoid cycles in a network.
- Cluster analysis.

5 Greedy Algorithms
Kruskal's algorithm. Start with T = ∅. Consider edges in ascending order of cost. Insert edge e into T unless doing so would create a cycle.
Reverse-Delete algorithm. Start with T = E. Consider edges in descending order of cost. Delete edge e from T unless doing so would disconnect T.
Prim's algorithm. Start with some root node s and greedily grow a tree T from s outward. At each step, add to T the cheapest edge e that has exactly one endpoint in T.
Remark. All three algorithms produce an MST.

6 Greedy Algorithms
Simplifying assumption. All edge costs c_e are distinct.
Cut property. Let S be any subset of nodes, and let e be the min cost edge with exactly one endpoint in S. Then the MST contains e.
Cycle property. Let C be any cycle, and let f be the max cost edge belonging to C. Then the MST does not contain f.

7 Cycles and Cuts
Cycle. Set of edges of the form a-b, b-c, c-d, ..., y-z, z-a.
Cutset. A cut is a subset of nodes S. The corresponding cutset D is the subset of edges with exactly one endpoint in S.
Example. Cycle C = 1-2, 2-3, 3-4, 4-5, 5-6, 6-1. Cut S = {4, 5, 8}; cutset D = 3-4, 3-5, 5-6, 5-7, 7-8.

8 Cycle-Cut Intersection
Claim. A cycle and a cutset intersect in an even number of edges.
Pf. (by picture) Each time the cycle leaves S it must eventually re-enter S, so it crosses the cutset an even number of times.
Example. Cycle C = 1-2, 2-3, 3-4, 4-5, 5-6, 6-1; cutset D = 3-4, 3-5, 5-6, 5-7, 7-8; intersection = 3-4, 5-6.
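The even-intersection claim can be checked directly on the slide's example. A minimal Python sketch (the `cutset` helper is ours, not from the slides):

```python
def cutset(edges, S):
    """Edges with exactly one endpoint inside the cut S."""
    return {(u, v) for (u, v) in edges if (u in S) != (v in S)}

# The example from the Cycles and Cuts slide.
cycle = {(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1)}
other = {(3, 5), (5, 7), (7, 8)}          # remaining edges of the graph
S = {4, 5, 8}

D = cutset(cycle | other, S)
print(sorted(D))           # [(3, 4), (3, 5), (5, 6), (5, 7), (7, 8)]
print(sorted(cycle & D))   # [(3, 4), (5, 6)] -- an even number of edges
```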

9 Greedy Algorithms
Simplifying assumption. All edge costs c_e are distinct.
Cut property. Let S be any subset of nodes, and let e be the min cost edge with exactly one endpoint in S. Then the MST T* contains e.
Pf. (exchange argument)
- Suppose e does not belong to T*, and let's see what happens.
- Adding e to T* creates a cycle C in T*.
- Edge e is both in the cycle C and in the cutset D corresponding to S ⇒ there exists another edge, say f, that is in both C and D.
- T' = T* ∪ {e} − {f} is also a spanning tree.
- Since c_e < c_f, cost(T') < cost(T*).
- This is a contradiction. ▪

10 Greedy Algorithms
Simplifying assumption. All edge costs c_e are distinct.
Cycle property. Let C be any cycle in G, and let f be the max cost edge belonging to C. Then the MST T* does not contain f.
Pf. (exchange argument)
- Suppose f belongs to T*, and let's see what happens.
- Deleting f from T* disconnects T*; let S be the cut corresponding to one of the two components.
- Edge f is both in the cycle C and in the cutset D corresponding to S ⇒ there exists another edge, say e, that is in both C and D.
- T' = T* ∪ {e} − {f} is also a spanning tree.
- Since c_e < c_f, cost(T') < cost(T*).
- This is a contradiction. ▪

11 Prim's Algorithm: Proof of Correctness
Prim's algorithm. [Jarník 1930, Dijkstra 1957, Prim 1959]
- Initialize S = {any node}.
- Apply the cut property to S.
- Add the min cost edge in the cutset corresponding to S to T, and add one new explored node u to S.

12 Implementation: Prim's Algorithm
Implementation. Use a priority queue à la Dijkstra.
- Maintain set of explored nodes S.
- For each unexplored node v, maintain attachment cost a[v] = cost of cheapest edge from v to a node in S.
- O(n^2) with an array; O(m log n) with a binary heap.

Prim(G, c) {
   foreach (v ∈ V) a[v] ← ∞
   Initialize an empty priority queue Q
   foreach (v ∈ V) insert v onto Q
   Initialize set of explored nodes S ← ∅
   while (Q is not empty) {
      u ← delete min element from Q
      S ← S ∪ {u}
      foreach (edge e = (u, v) incident to u)
         if ((v ∉ S) and (c_e < a[v]))
            decrease priority a[v] to c_e
   }
}
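The pseudocode above decreases priorities in place; a runnable Python sketch is below. It assumes an adjacency-list input format of our choosing and uses the simpler "lazy" heap variant (re-insert instead of decrease-key, skipping stale entries), which runs in O(m log m) rather than O(m log n):

```python
import heapq

def prim_mst(adj):
    """adj: {node: [(cost, neighbor), ...]} for a connected undirected graph.
    Returns the total cost of an MST, grown outward from an arbitrary root."""
    root = next(iter(adj))
    explored = {root}                 # the set S of explored nodes
    pq = list(adj[root])              # candidate edges leaving S
    heapq.heapify(pq)
    total = 0
    while pq and len(explored) < len(adj):
        cost, v = heapq.heappop(pq)   # cheapest candidate edge...
        if v in explored:
            continue                  # ...unless stale (both endpoints now in S)
        explored.add(v)
        total += cost
        for c, w in adj[v]:
            if w not in explored:
                heapq.heappush(pq, (c, w))
    return total

adj = {1: [(1, 2), (4, 4), (5, 3)], 2: [(1, 1), (2, 3)],
       3: [(2, 2), (3, 4), (5, 1)], 4: [(3, 3), (4, 1)]}
print(prim_mst(adj))   # 6 (edges 1-2, 2-3, 3-4)
```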

13 Kruskal's Algorithm: Proof of Correctness
Kruskal's algorithm. [Kruskal, 1956]
- Consider edges in ascending order of weight.
- Case 1: If adding e to T creates a cycle, discard e according to the cycle property.
- Case 2: Otherwise, insert e = (u, v) into T according to the cut property, where S = set of nodes in u's connected component.

14 Implementation: Kruskal's Algorithm
Implementation. Use the union-find data structure.
- Build set T of edges in the MST.
- Maintain a set for each connected component.
- O(m log n) for sorting and O(m α(m, n)) for union-find. (Since m ≤ n^2, log m is O(log n); α(m, n) is essentially a constant.)

Kruskal(G, c) {
   Sort edge weights so that c_1 ≤ c_2 ≤ ... ≤ c_m
   T ← ∅
   foreach (u ∈ V) make a set containing singleton u
   for i = 1 to m {
      (u, v) = e_i
      if (u and v are in different sets) {   // different connected components?
         T ← T ∪ {e_i}
         merge the sets containing u and v   // merge two components
      }
   }
   return T
}
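A runnable Python sketch of the same plan, with a minimal union-find (path halving, no union by rank; function names are ours):

```python
def kruskal_mst(n, edges):
    """n nodes labeled 0..n-1; edges: list of (cost, u, v).
    Returns the MST edges in the order Kruskal adds them."""
    parent = list(range(n))           # one singleton set per node initially

    def find(x):                      # root of x's component, with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    T = []
    for c, u, v in sorted(edges):     # ascending order of cost
        ru, rv = find(u), find(v)
        if ru != rv:                  # different components: safe by cut property
            parent[ru] = rv           # merge the two components
            T.append((c, u, v))
    return T

edges = [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 3, 0), (5, 0, 2)]
T = kruskal_mst(4, edges)
print(T)                              # [(1, 0, 1), (2, 1, 2), (3, 2, 3)]
print(sum(c for c, u, v in T))        # 6
```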

15 Lexicographic Tiebreaking
To remove the assumption that all edge costs are distinct: perturb all edge costs by tiny amounts to break any ties.
Impact. Kruskal and Prim only interact with costs via pairwise comparisons. If perturbations are sufficiently small, the MST with perturbed costs is the MST with the original costs. (E.g., if all edge costs are integers, perturb the cost of edge e_i by i / n^2.)
Implementation. Can handle arbitrarily small perturbations implicitly by breaking ties lexicographically, according to index.

boolean less(i, j) {
   if      (cost(e_i) < cost(e_j)) return true
   else if (cost(e_i) > cost(e_j)) return false
   else if (i < j)                 return true
   else                            return false
}
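In a language with tuple comparison, the same comparator collapses to a sort key. A small sketch (the function name is ours):

```python
def mst_edge_order(costs):
    """Indices of edges sorted by (cost, index): ties broken lexicographically
    by index, exactly as boolean less(i, j) above decides them."""
    return sorted(range(len(costs)), key=lambda i: (costs[i], i))

# The two cost-3 edges keep their index order.
print(mst_edge_order([3, 1, 3, 2]))   # [1, 3, 0, 2]
```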

4.7 Clustering
Outbreak of cholera deaths in London in the 1850s.
Reference: Nina Mishra, HP Labs

17 Clustering
Clustering. Given a set U of n objects labeled p_1, ..., p_n (e.g., photos, documents, micro-organisms), classify them into coherent groups.
Distance function. Numeric value specifying "closeness" of two objects (e.g., for images, the number of corresponding pixels whose intensities differ by some threshold).
Fundamental problem. Divide into clusters so that points in different clusters are far apart.
- Routing in mobile ad hoc networks.
- Identify patterns in gene expression.
- Document categorization for web search.
- Similarity searching in medical image databases.
- Skycat: cluster 10^9 sky objects into stars, quasars, galaxies.

18 Clustering of Maximum Spacing
k-clustering. Divide objects into k non-empty groups.
Distance function. Assume it satisfies several natural properties:
- d(p_i, p_j) = 0 iff p_i = p_j (identity of indiscernibles)
- d(p_i, p_j) ≥ 0 (nonnegativity)
- d(p_i, p_j) = d(p_j, p_i) (symmetry)
Spacing. Min distance between any pair of points in different clusters.
Clustering of maximum spacing. Given an integer k, find a k-clustering of maximum spacing.

19 Greedy Clustering Algorithm
Single-link k-clustering algorithm.
- Form a graph on the vertex set U, corresponding to n clusters.
- Find the closest pair of objects such that each object is in a different cluster, and add an edge between them.
- Repeat n−k times, until there are exactly k clusters.
Key observation. This procedure is precisely Kruskal's algorithm (except we stop when there are k connected components).
Remark. Equivalent to finding an MST and deleting the k−1 most expensive edges.
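The key observation translates into code directly: run Kruskal on all pairwise distances and stop at k components. A runnable sketch (union-find and names are ours; `dist` is any function with the properties from the previous slide):

```python
def single_link_clusters(points, k, dist):
    """Merge the closest cross-cluster pair until exactly k clusters remain."""
    n = len(points)
    parent = list(range(n))

    def find(x):                       # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pairs = sorted((dist(points[i], points[j]), i, j)
                   for i in range(n) for j in range(i + 1, n))
    clusters = n
    for d, i, j in pairs:
        if clusters == k:
            break                      # stop at k connected components
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            clusters -= 1
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# 1-d example: two well-separated pairs split into k = 2 clusters.
print(single_link_clusters([0.0, 1.0, 10.0, 11.0], 2,
                           dist=lambda a, b: abs(a - b)))   # [[0, 1], [2, 3]]
```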

20 Greedy Clustering Algorithm: Analysis
Theorem. Let C* denote the clustering C*_1, ..., C*_k formed by deleting the k−1 most expensive edges of an MST. C* is a k-clustering of max spacing.
Pf. Let C denote some other clustering C_1, ..., C_k.
- The spacing of C* is the length d* of the (k−1)st most expensive edge.
- Let p_i, p_j be in the same cluster in C*, say C*_r, but different clusters in C, say C_s and C_t.
- Some edge (p, q) on the p_i–p_j path in C*_r spans two different clusters in C.
- All edges on the p_i–p_j path have length ≤ d*, since Kruskal chose them.
- Spacing of C is ≤ d*, since p and q are in different clusters. ▪

4.9 Minimum-Cost Arborescences

22 Minimum-Cost Arborescences: Definitions
Given a directed graph G = (V, E) and a node r ∈ V as root, an arborescence w.r.t. r is essentially a directed spanning tree rooted at r. It is a subgraph T = (V, F) such that T is a spanning tree of G if we ignore the direction of edges, and there is a path in T from r to each other node v ∈ V if we take the direction of edges into account.

23 Example Arborescences

24 Characterizing Arborescences
Claim 1: A subgraph T = (V, F) of G is an arborescence w.r.t. root r iff T has no cycles, and for each node v ≠ r, there is exactly one edge in F that enters v.
Proof: (Only if): If T is an arborescence with root r, then by definition (it is a spanning tree) it has no cycles, and for each node v ≠ r there is exactly one edge in F entering it (the last edge on the unique r–v path).
(If): If T has no cycles and each node v ≠ r has exactly one entering edge, then we need to show that there is a directed path from r to each other node. Take any node v ≠ r and repeatedly follow edges in the backward direction. Since there are no cycles, the process must terminate. But r is the only node without an incoming edge, so it terminates at r. Thus the sequence of nodes visited, reversed, forms a path from r to v.
Claim 2: A directed graph G has an arborescence rooted at r iff there is a directed path from r to each other node.
Proof: Perform a BFS from r, constructing the BFS tree rooted at r; it is an arborescence.
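Claims 1 and 2 translate into a small checker. A sketch (function name ours): with n−1 edges and the in-degree condition, a directed cycle could still hide among non-root nodes, so the reachability test is what rules it out.

```python
from collections import defaultdict

def is_arborescence(n, root, edges):
    """Check Claim 1 plus reachability: n-1 edges, in-degree 1 for each
    v != root, in-degree 0 for root, and every node reachable from root
    by a directed path (Claim 2)."""
    if len(edges) != n - 1:
        return False
    indeg = [0] * n
    adj = defaultdict(list)
    for u, v in edges:
        indeg[v] += 1
        adj[u].append(v)
    if indeg[root] != 0 or any(indeg[v] != 1 for v in range(n) if v != root):
        return False
    seen = {root}                      # DFS from root
    stack = [root]
    while stack:
        for v in adj[stack.pop()]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == n              # rules out cycles avoiding the root

print(is_arborescence(3, 0, [(0, 1), (1, 2)]))   # True
print(is_arborescence(3, 0, [(1, 2), (2, 1)]))   # False: 2-cycle missing root
```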

25 Minimum-Cost Arborescence
Definition: Given a directed graph G = (V, E) with a distinguished root node r and a non-negative cost c_e ≥ 0 on each edge, compute an arborescence rooted at r of minimum total cost.
Note: simply choosing the cheapest entering edges, as in MST algorithms, can fail; our myopic rule needs to be a bit more involved in this case.

26 Minimum-Cost Arborescence
Initial strategy: For all v ∈ V − {r}, select the cheapest edge entering v, and let F* be this set of n−1 edges.
Claim 3: If (V, F*) is an arborescence, then it is minimum-cost.
What if (V, F*) is not an arborescence? By Claim 1, it must contain a cycle, which does not include the root r. (Why? Because F* contains no edge entering r.)
Observation: Every arborescence contains exactly one edge entering each node v ≠ r. So, if we pick some node v and subtract a uniform quantity from the cost of every edge entering v, then the total cost of every arborescence changes by exactly the same amount. This means, essentially, that the actual cost of the cheapest edge entering v is not important; what matters is the cost of the other edges entering v relative to the cheapest.
Let y_v = the cost of the cheapest edge entering v, and define the reduced cost c_e' = c_e − y_v ≥ 0 for all edges e entering v.

27 Minimum-Cost Arborescence
Recall: y_v = cost of the cheapest edge entering v, and c_e' = c_e − y_v for all edges e entering v.
Claim 4: T is an optimal arborescence in G subject to costs {c_e} iff it is an optimal arborescence subject to {c_e'}.
Proof: Consider an arbitrary arborescence T. The difference between its cost with edge costs {c_e} and with {c_e'} is exactly Σ_{e∈T} c_e − Σ_{e∈T} c_e' = Σ_{v≠r} y_v, which is independent of T.
Now consider the problem in terms of {c_e'}. All the edges in F* have cost 0. In particular, if there is a cycle C in (V, F*), all edges in C have cost 0. This suggests that we can use as many edges from C as we want, since no increase in cost is ever introduced.

28 Minimum-Cost Arborescence Algorithm
Contract C into a single supernode, obtaining G' = (V', E'), where V' = (V − C) ∪ {c*} and E' is obtained by transforming each edge e in E into e' by replacing each end of e that belongs to C with c*. Thus G' can have parallel edges and self-loops; delete the self-loops. Then, recursively, find an optimal arborescence in G' subject to {c_e'}. Given the solution for G' returned by the recursive call, modify it to obtain a solution for G by including all but one edge of C.
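The whole scheme — cheapest entering edges, reduced costs, contract a 0-cost cycle, recurse — is the Chu-Liu/Edmonds algorithm. A compact cost-only sketch in Python (iterative contraction instead of explicit recursion; names are ours, and it returns only the optimum cost, not the edge set):

```python
def min_arborescence_cost(n, root, edges):
    """edges: list of (u, v, w). Returns the min total cost of an arborescence
    rooted at root, or None if some node is unreachable from root."""
    INF = float('inf')
    total = 0
    while True:
        inw = [INF] * n               # y_v: cost of cheapest edge entering v
        pre = [0] * n
        for u, v, w in edges:
            if u != v and w < inw[v]:
                inw[v], pre[v] = w, u
        if any(v != root and inw[v] == INF for v in range(n)):
            return None
        inw[root] = 0
        ident = [-1] * n              # supernode id after contraction
        vis = [-1] * n
        cnt = 0
        for v in range(n):
            total += inw[v]           # pay y_v for every node (Claim 4)
            u = v                     # walk back along chosen edges
            while vis[u] != v and ident[u] == -1 and u != root:
                vis[u] = v
                u = pre[u]
            if u != root and ident[u] == -1:   # found a cycle of chosen edges
                ident[u] = cnt
                x = pre[u]
                while x != u:
                    ident[x] = cnt
                    x = pre[x]
                cnt += 1
        if cnt == 0:
            return total              # F* is an arborescence: done (Claim 3)
        for v in range(n):
            if ident[v] == -1:
                ident[v] = cnt
                cnt += 1
        # Contract cycles, keep reduced costs c_e' = c_e - y_v, drop self-loops.
        edges = [(ident[u], ident[v], w - inw[v])
                 for u, v, w in edges if ident[u] != ident[v]]
        n, root = cnt, ident[root]

# Nodes 1 and 2 form a cheap 2-cycle that must be broken via the contraction.
print(min_arborescence_cost(4, 0,
      [(0, 1, 3), (1, 2, 1), (2, 1, 1), (2, 3, 1), (0, 2, 2)]))   # 4
```

The optimum here is the arborescence 0→2, 2→1, 2→3 of cost 2 + 1 + 1 = 4; greedily taking the cheapest edge into each node would pick the 1–2 cycle instead.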

29 Analyze the Algorithm - I
To prove that the algorithm finds an optimum, we must prove that G has an optimal arborescence with exactly one edge entering C.
Claim 5: Let C be a cycle in G consisting of edges of cost 0, such that r is not in C. Then there is an optimal arborescence rooted at r that has exactly one edge entering C.
Proof: Consider an optimal arborescence T in G. Since r has a path in T to every node, there is at least one edge of T that enters C. If T enters C exactly once, then we are done. Otherwise, suppose that T enters C more than once. Consider how we can modify it into an arborescence of no greater cost that enters C exactly once:
- Let e = (a, b) be an edge entering C on as short a path as possible from r (so no edge on the r–a path enters C).
- Delete all edges of T that enter C, except for e.
- Add all edges of C, except the one entering b.
- Let T' denote the resulting subgraph of G.

30 Analyze the Algorithm - II
Proof (Claim 5) continued: We claim that T' is also an arborescence. Note first that cost(T') ≤ cost(T), since the only edges added have cost 0. T' has exactly one edge entering each v ≠ r, and no edge entering r.
So T' has exactly n−1 edges; hence if we can show there is an r–v path in T' for each v, then T' must be connected in an undirected sense, and hence a tree. Consider any v ≠ r. There are two cases:
i. If v ∈ C, then follow the r–a path, take e = (a, b), and follow edges of C from b around to v.
ii. If v is not in C, then let P denote the r–v path in T. There are two subcases:
   a. If P did not touch C, then it still exists in T'.
   b. If P touches C, let w be the last node in P ∩ C and let P' be the subpath of P from w to v. All the edges in P' still exist in T', and w is reachable from r by (i).

31 Minimum-Cost Arborescence Optimality
Claim 6: The algorithm finds an optimal arborescence rooted at r in G.
Proof: By induction on the number of nodes in G.
Base: Clearly true for any G having one node.
Hypothesis: Assume true for every G with |V| ≤ n.
Induction: For n+1 nodes, apply the algorithm.
- If F* forms an arborescence, we are done.
- Otherwise, consider the problem with costs {c_e'}. After contracting a 0-cost cycle C to obtain a smaller graph G', the algorithm produces an optimal solution for G' by the inductive hypothesis.
- By Claim 5, there is an optimal arborescence in G corresponding to the optimal one computed for G'. ▪

Extra Slides

33 MST Algorithms: Theory
Deterministic comparison-based algorithms.
- O(m log n) [Jarník, Prim, Dijkstra, Kruskal, Boruvka]
- O(m log log n) [Cheriton-Tarjan 1976, Yao 1975]
- O(m β(m, n)) [Fredman-Tarjan 1987]
- O(m log β(m, n)) [Gabow-Galil-Spencer-Tarjan 1986]
- O(m α(m, n)) [Chazelle 2000]
Holy grail. O(m).
Notable.
- O(m) randomized. [Karger-Klein-Tarjan 1995]
- O(m) verification. [Dixon-Rauch-Tarjan 1992]
Euclidean.
- 2-d: O(n log n). (compute MST of the edges in the Delaunay triangulation)
- k-d: O(k n^2). (dense Prim)

34 Dendrogram
Dendrogram. Scientific visualization of a hypothetical sequence of evolutionary events.
- Leaves = genes.
- Internal nodes = hypothetical ancestors.

35 Dendrogram of Cancers in Humans
Tumors in similar tissues cluster together.
Reference: Botstein & Brown group.
(Figure: expression matrix over Gene 1 through Gene n, each marked gene expressed / gene not expressed.)

HOMEWORK (Chapter 5): 1, 2-a